Commun. Math. Phys. 197, 33 – 74 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
The Quantum Dynamics of the Compactified Trigonometric Ruijsenaars–Schneider Model
J. F. van Diejen, L. Vinet
Centre de Recherches Mathématiques, Université de Montréal, C.P. 6128, Succursale Centre-Ville, Montréal (Québec), H3C 3J7 Canada
Received: 22 September 1997 / Accepted: 10 February 1998
Abstract: We quantize a compactified version of the trigonometric Ruijsenaars–Schneider particle model with a phase space that is symplectomorphic to the complex projective space $\mathbb{CP}^N$. The quantum Hamiltonian is realized as a discrete difference operator acting in a finite-dimensional Hilbert space of complex functions with support in a finite uniform lattice over a convex polytope (viz., a restricted Weyl alcove with walls having a thickness proportional to the coupling parameter). We solve the corresponding finite-dimensional (bispectral) eigenvalue problem in terms of discretized Macdonald polynomials with $q$ (and $t$) on the unit circle. The normalization of the wave functions is determined using a terminating version of a recent summation formula due to Aomoto, Ito and Macdonald. The resulting eigenfunction transform determines a discrete Fourier-type involution in the Hilbert space of lattice functions. This is in correspondence with Ruijsenaars’ observation that – at the classical level – the action-angle transformation defines an (anti)symplectic involution of $\mathbb{CP}^N$. From the perspective of algebraic combinatorics, our results give rise to a novel system of bilinear summation identities for the Macdonald symmetric functions.

Contents
1 Introduction
2 The Classical System
  2.1 Some notational preliminaries
  2.2 Integrability
  2.3 The reduced phase space for the relative particle motion: $\mathbb{CP}^N$
3 Quantization
  3.1 Ruijsenaars difference operators
  3.2 The finite-dimensional Hilbert space
4 Wave Functions
  4.1 A factorized eigenfunction
  4.2 The complete eigenbasis
5 Miscellanea
  5.1 The eigenfunction transform
  5.2 The degeneration $g \downarrow 0$: Discrete Fourier analysis on a Weyl alcove
  5.3 Ground-state vs maximal energy wave function
  5.4 The two-particle solution
  5.5 Geometric quantization
  5.6 Connections to integrable field theories
  5.7 Bispectrality
Appendix
  A Truncated and Terminating Aomoto–Ito–Macdonald sums
  B Bilinear summation identities for Macdonald’s symmetric functions
References
1. Introduction

In a recently published work [R3], Ruijsenaars presented a detailed study of the dynamics of the classical Sutherland–Moser particle model [Mo] and its “relativistic” deformation, the trigonometric Ruijsenaars–Schneider model [RS, R1]. In addition, he also considered a closely related integrable system characterized by an $(N+1)$-particle Hamiltonian of the form

$$H = \sum_{1\le j\le N+1} \cos(\beta p_j) \prod_{\substack{1\le k\le N+1\\ k\ne j}} \left( 1 - \frac{\sin^2\bigl(\frac{\alpha\beta g}{2}\bigr)}{\sin^2 \frac{\alpha}{2}(x_j - x_k)} \right)^{1/2}. \tag{1.1}$$

The Hamiltonian in (1.1) differs from the standard trigonometric Ruijsenaars–Schneider Hamiltonian by the substitution $\beta \to i\beta$ (where $i = \sqrt{-1}$). Even though over the complex field both systems are equivalent, it turns out that their real (i.e. physical) dynamics are quite distinct. (Throughout we are assuming that our variables $x_j$, $p_j$ as well as the scale factors $\alpha$, $\beta$ and the coupling parameter $g$ are real-valued.) The main point is that $H$ (1.1) is periodic not only in the $x$ but also in the $p$ variables. This periodicity naturally prompts one to employ a phase space which – upon restricting attention to the relative motion in the center-of-mass frame – is bounded (in fact compact after a suitable completion). This is in contrast to the situation for the standard trigonometric Ruijsenaars–Schneider model (with $\cos(\beta p_j) \to \cosh(\beta p_j)$ and $-\sin^2(\alpha\beta g/2) \to +\sinh^2(\alpha\beta g/2)$ substituted in (1.1)), where the phase space is given by the (manifestly noncompact) cotangent bundle over the configuration space. From now on the system determined by the Hamiltonian $H$ (1.1) will be referred to as the compactified trigonometric Ruijsenaars–Schneider model. It is the purpose of the present paper to investigate the corresponding quantum system. We will see that, in accordance with physical intuition, the Hilbert space for the quantum model becomes finite-dimensional.
In essence, the Hilbert space in question consists of the space of all complex functions with support in a finite uniform lattice (grid) over classical configuration space. This configuration space has the geometry of a convex polytope consisting of a restricted Weyl alcove with walls that have a thickness determined by the value of the coupling parameter g. Matching the lattice so as to let it fit precisely over the configuration space, including the vertices (corner points) of the polytope, produces a quantization condition on g that relates the coupling parameter to the size of the lattice. The quantum Hamiltonian is in turn given by a discrete difference operator with a step size that is equal to the distance between neighboring lattice points. Mathematically,
our quantization condition on $g$ translates into vanishing conditions for the coefficients at the boundary lattice points, thereby guaranteeing that the discrete difference operator Hamiltonian is well-defined and self-adjoint as an operator in the Hilbert space of complex functions over the finite lattice. For the quantum version of the standard trigonometric Ruijsenaars–Schneider model it is well-known (see e.g. the introduction of [D2] or Sect. 7.6.2 of [R4]) that the eigenfunctions may be expressed as a product of a factorized (ground-state) wave function and Macdonald polynomials (with $0 < q < 1$) [M2, M3, M4]. Here also, in the case of the compactified Ruijsenaars–Schneider model, the eigenfunctions turn out to be similarly expressible in terms of Macdonald polynomials. In contrast to the standard situation, however, in the compactified/discrete context of the present paper the parameters $q$ and $t$ ($= q^g$) lie on the unit circle and the diagonalization of the model involves only a finite number of Macdonald polynomials (viz., precisely as many as the number of lattice points, which equals the dimension of the Hilbert space). The orthogonality of the wave functions translates into a finite system of discrete orthogonality relations for the Macdonald polynomials. In the rank one case of two particles ($N = 1$), the Macdonald polynomials become $q$-ultraspherical polynomials; the corresponding discrete orthogonality relations amount in that case to a specialization of well-known orthogonality relations for the $q$-Racah polynomials [AW, GR, KS]. The symmetry relations for the Macdonald polynomials [Ko1, M4, EK] have as a consequence that the discrete kernel for the finite-dimensional eigenfunction transform is symmetric. This reflects the fact that we are actually dealing with a multivariate finite-dimensional doubly discrete bispectral problem in the sense of Duistermaat and Grünbaum [DG, W, G].
More concretely, the discrete eigenfunction kernel satisfies the same discrete difference equations in the “spectral” variables as it does in the “spatial” variables. Combined with the unitarity, the symmetry of the kernel furthermore implies that the eigenfunction transform determines a discrete Fourier-type involution in the Hilbert space of lattice functions. This is the quantum counterpart of the corresponding property of (the closure of) the action-angle transformation for the classical compactified Ruijsenaars–Schneider model, which turns out to define an involutive (anti)symplectomorphism of the classical phase space ($\cong \mathbb{CP}^N$) [R3]. When the coupling parameter $g$ tends to zero, our eigenfunction transform reduces to the standard Discrete Fourier Transformation for lattice functions over the dominant weights lying in a Weyl alcove.

The paper is organized as follows. We first recall in Sect. 2 some properties of the classical compactified Ruijsenaars–Schneider system taken from [R3]. Specifically, we discuss the commuting integrals, the configuration space (viz. the restricted Weyl alcove with walls of thickness proportional to $g$) and also the phase space of the model. A rather remarkable property of the dynamical system under consideration is that the phase space for the relative particle motion in the center-of-mass frame becomes, after a suitable compactification, isomorphic to the complex projective space $\mathbb{CP}^N$. In particular, globally the compactified phase space does not have a topology of product form (it is not topologically equivalent to the direct product of the configuration space times a (real) $N$-dimensional torus). Section 3 goes on to demonstrate how canonical quantization ($p_j \to \partial/i\partial x_j$) gives rise to discrete difference operator Hamiltonians acting in a finite-dimensional Hilbert space of lattice functions over the classical configuration space. In Sect. 4, the spectrum of the quantum model is determined in explicit form and the corresponding wave functions are expressed in terms of Macdonald polynomials with $q$, $t$ on the unit circle. In order to normalize our wave functions such that their $L^2$-norms are equal to one, it is necessary to evaluate a terminating version of a
recently found summation formula due to Aomoto, Ito and Macdonald [Ao, I1, I2, M5]. The details explaining how to truncate the Aomoto–Ito–Macdonald sum so as to arrive at its terminating version are relegated to the first of two appendices at the end of the paper (Appendix A). In a second appendix (viz. Appendix B), some useful properties of the Macdonald symmetric functions taken from [M2, M4] have been collected. These properties are needed in Sect. 4 for the diagonalization of the quantum model. We have also taken the opportunity to reformulate there some of our results from the viewpoint of algebraic combinatorics. This leads us, in particular, to a new system of bilinear summation identities for the Macdonald symmetric functions (cf. Proposition B.2). The paper closes in Sect. 5 with some miscellaneous results and remarks. Among other things, it is pointed out that: (i) the results on the eigenfunctions give rise to a Discrete Fourier-type Transform for lattice functions over the restricted Weyl alcove, (ii) both the maximal and the minimal energy of the compactified Ruijsenaars–Schneider model are at the quantum level the same as at the classical level (the quantization discretizes the energy levels but does not shift the spectrum) and (iii) the dimension of our Hilbert space is in agreement with the dimension predicted by the Riemann–Roch–Hirzebruch formula for $\mathbb{CP}^N$, in the framework of geometric quantization [HK, Si, Hu].

Note. By means of a symplectic (with respect to the standard symplectic form $\sum_j dx_j \wedge dp_j$) rescaling $(x, p) \to (\beta x, \beta^{-1} p)$ one absorbs the scale parameter $\beta$ in $\alpha$ (cf. (1.1)). In this paper we will from now on pick $\beta = 1$ without loss of generality. At the quantum level this means that we have scaled our variables such that the step size of the discrete difference operator Hamiltonians becomes equal to one (or to $\sqrt{N/(N+1)}$ after projection onto the center-of-mass hyperplane), cf. Sect. 3.1 and the remark at the end of Sect. 3.
2. The Classical System

This section serves to summarize some of the basic properties of the classical compactified Ruijsenaars–Schneider model that are discussed in more detail in [R3]. Since we are primarily interested in the relative particle motion in the center-of-mass frame, it will be convenient to employ root system notation. For our purposes it suffices to restrict attention to the root system of type $A_N$. Some relevant preliminaries have been collected in the first subsection. For further information regarding root systems the reader is referred e.g. to (the Planches in) Bourbaki [B].

2.1. Some notational preliminaries. Below the vectors $e_1, \ldots, e_{N+1}$ always represent the unit vectors constituting the standard basis of $\mathbb{R}^{N+1}$ and $\langle\cdot,\cdot\rangle$ denotes the (usual) inner product with respect to which this standard basis becomes an orthonormal basis (i.e. $\langle e_j, e_k\rangle = \delta_{j,k}$). Let $E$ be the center-of-mass hyperplane

$$E = \{x \in \mathbb{R}^{N+1} \mid x_1 + \cdots + x_{N+1} = 0\}. \tag{2.1}$$

A natural basis $\{a_1, \ldots, a_N\}$ for $E$ is given by the simple roots

$$a_j = e_j - e_{j+1}, \qquad j = 1, \ldots, N. \tag{2.2}$$

The associated dual basis $\{\omega_1, \ldots, \omega_N\}$ – determined uniquely by the property that $\langle\omega_j, a_k\rangle = \delta_{j,k}$ – is realized explicitly by the fundamental weights

$$\omega_j = (e_1 + \cdots + e_j) - \frac{j}{N+1}\,(e_1 + \cdots + e_{N+1}), \qquad j = 1, \ldots, N. \tag{2.3}$$

To these two bases of $E$ (2.1) one can associate the root lattice

$$Q = \mathbb{Z}\text{-Span}\{a_1, \ldots, a_N\} \tag{2.4}$$

and the weight lattice

$$\Lambda = \mathbb{Z}\text{-Span}\{\omega_1, \ldots, \omega_N\}, \tag{2.5}$$

as well as the corresponding positive semi-lattices (or integral cones)

$$Q^+ = \mathbb{N}\text{-Span}\{a_1, \ldots, a_N\} \tag{2.6}$$

and

$$\Lambda^+ = \mathbb{N}\text{-Span}\{\omega_1, \ldots, \omega_N\}, \tag{2.7}$$

respectively (where in our conventions the set of natural numbers $\mathbb{N}$ does include the number zero). The semi-lattice $\Lambda^+$ (2.7) is usually referred to as the cone of dominant weights. This cone is partially ordered by the dominance order, which is defined for $\lambda, \mu \in \Lambda^+$ by

$$\mu \preceq \lambda \quad\text{iff}\quad \lambda - \mu \in Q^+ \tag{2.8}$$

(and $\mu \prec \lambda$ iff $\mu \preceq \lambda$ and $\mu \ne \lambda$). The Weyl group generated by the reflections in planes orthogonal to the simple roots $a_1, \ldots, a_N$ (2.2) is realized explicitly as the group of permutations $\sigma \in S_{N+1}$ acting on the vectors $e_1, \ldots, e_{N+1}$ by

$$\sigma(e_j) := e_{\sigma(j)}. \tag{2.9}$$

The (unique) orbit of the basis vectors $a_j$ with respect to the $S_{N+1}$-action consists of the roots

$$A_N = \{e_j - e_k \mid 1 \le j \ne k \le N+1\}. \tag{2.10}$$

For future reference we also need to identify the positive roots

$$A_N^+ = A_N \cap Q^+ = \{e_j - e_k \mid 1 \le j < k \le N+1\}, \tag{2.11}$$

the maximal root

$$a_{\max} = \sum_{1\le j\le N} a_j = e_1 - e_{N+1} = \omega_1 + \omega_N \tag{2.12}$$

(this root is maximal in $A_N$ (2.10) with respect to the partial dominance order in (2.8)), and the weighted half sum over the positive roots

$$\rho = \frac{g}{2} \sum_{a\in A_N^+} a = \frac{g}{2} \sum_{1\le j<k\le N+1} (e_j - e_k) = g\,(\omega_1 + \cdots + \omega_N). \tag{2.13}$$
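The identities collected in (2.2), (2.3), (2.12) and (2.13) can be checked by direct computation. The following minimal sketch does so in exact rational arithmetic; the function names (`simple_roots`, `fundamental_weights`, `inner`, `vec_sum`, `rho`) are illustrative choices made here and are not notation from the paper.

```python
from fractions import Fraction


def simple_roots(N):
    """Simple roots a_j = e_j - e_{j+1} of (2.2), as vectors in R^{N+1}."""
    roots = []
    for j in range(N):
        a = [Fraction(0)] * (N + 1)
        a[j], a[j + 1] = Fraction(1), Fraction(-1)
        roots.append(a)
    return roots


def fundamental_weights(N):
    """Fundamental weights omega_j of (2.3), as vectors in R^{N+1}."""
    return [
        [Fraction(1 if i < j else 0) - Fraction(j, N + 1) for i in range(N + 1)]
        for j in range(1, N + 1)
    ]


def inner(u, v):
    """Standard inner product on R^{N+1}."""
    return sum(a * b for a, b in zip(u, v))


def vec_sum(vectors, dim):
    """Componentwise sum of a list of vectors of length dim."""
    total = [Fraction(0)] * dim
    for v in vectors:
        total = [s + c for s, c in zip(total, v)]
    return total


def rho(N, g):
    """Weighted half sum (g/2) * sum_{j<k} (e_j - e_k) of (2.13)."""
    positive = []
    for j in range(N + 1):
        for k in range(j + 1, N + 1):
            a = [Fraction(0)] * (N + 1)
            a[j], a[k] = Fraction(1), Fraction(-1)
            positive.append(a)
    return [Fraction(g) / 2 * c for c in vec_sum(positive, N + 1)]


if __name__ == "__main__":
    N, g = 3, Fraction(1, 4)
    a, w = simple_roots(N), fundamental_weights(N)
    # Duality <omega_j, a_k> = delta_{j,k}
    print(all(inner(w[j], a[k]) == (j == k) for j in range(N) for k in range(N)))
    # Maximal root (2.12): a_1 + ... + a_N = omega_1 + omega_N
    print(vec_sum(a, N + 1) == vec_sum([w[0], w[-1]], N + 1))
    # Weighted half sum (2.13): rho = g (omega_1 + ... + omega_N)
    print(rho(N, g) == [g * c for c in vec_sum(w, N + 1)])
```

Exact fractions are used instead of floating point so that the equalities hold literally rather than up to rounding.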
2.2. Integrability. The Hamiltonian $H$ (1.1) is known to be integrable: a complete set of integrals in involution is given explicitly by [RS, R1]

$$H_r = \sum_{\substack{J\subset\{1,\ldots,N+1\}\\ |J|=r}} \cos\Bigl(\sum_{j\in J} p_j\Bigr) \prod_{\substack{j\in J\\ k\notin J}} \left( 1 - \frac{\sin^2\bigl(\frac{\alpha g}{2}\bigr)}{\sin^2 \frac{\alpha}{2}(x_j - x_k)} \right)^{1/2}, \tag{2.14}$$

$r = 1, \ldots, N+1$. (Recall that we have rescaled the variables such that $\beta = 1$, cf. the note at the end of the introduction.) Observe that $H_r$ (2.14) specializes for $r = 1$ to the Hamiltonian $H$ (1.1) and that $H_{N+1} = \cos(p_1 + \cdots + p_{N+1})$, reflecting the translational invariance of the model. The projection of the $H_r$-flow onto the center-of-mass hyperplane $x_1 + \cdots + x_{N+1} = 0$ is governed by the reduced Hamiltonian $H_r$, written conveniently in root system notation as

$$H_r = \sum_{\nu\in S_{N+1}(\omega_r)} \cos(\langle\nu, p\rangle) \prod_{\substack{a\in A_N\\ \langle a,\nu\rangle=1}} \left( 1 - \frac{\sin^2\bigl(\frac{\alpha g}{2}\bigr)}{\sin^2 \frac{\alpha}{2}\langle a, x\rangle} \right)^{1/2}, \tag{2.15}$$

$r = 1, \ldots, N$ (for $H_{N+1}$ the reduced flow in the center-of-mass plane is of course trivially stationary). Here $x := (x_1, \ldots, x_{N+1})$, $p := (p_1, \ldots, p_{N+1})$ and the sum in (2.15) is over all weights $\nu \in \Lambda$ (2.5) that lie in the $S_{N+1}$-orbit (recall the action (2.9)) of the $r$th fundamental weight vector $\omega_r$ (2.3).

2.3. The reduced phase space for the relative particle motion: $\mathbb{CP}^N$. Let us from now on assume that the scale factor $\alpha$ is positive and that the parameter $g$ lies in the interval

$$0 < g < \frac{2\pi}{(N+1)\alpha}. \tag{2.16}$$

In order to arrive at real-valued Hamiltonians $H_r$ (2.15), one is led to employ a configuration space in which the particle distances $|x_j - x_k|$ are bounded from below by $g$ ($> 0$) and from above by $2\pi/\alpha - g$ ($> 0$). This is realized by picking as configuration space the submanifold $\Sigma_g$ of the center-of-mass plane consisting of the points $x \in E$ (2.1) satisfying the conditions (i) $\langle a_j, x\rangle > g$ for $j = 1, \ldots, N$; (ii) $\langle a_{\max}, x\rangle < 2\pi/\alpha - g$ (where the vectors $a_1, \ldots, a_N$ denote the simple roots (2.2) and $a_{\max}$ is the maximal root (2.12)). The parameter restriction (2.16) ensures that the submanifold $\Sigma_g \subset E$ determined by (i), (ii) is nonempty (add the $N$ inequalities from (i) and use (2.12) to compare with (ii)). Furthermore, $\Sigma_g$ has the geometry of an open convex polytope consisting of an alcove with walls of thickness $g/\sqrt{2}$ inside the Weyl alcove $\Sigma_0$ (which corresponds to the limit $g \downarrow 0$). The open convex polytope (or open simplex) $\Sigma_g$ is completely determined by the $N+1$ vertices (corner points) $\rho$, $\rho + M\omega_r$ ($r = 1, \ldots, N$) with $M = \frac{2\pi}{\alpha} - (N+1)g > 0$. See Fig. 1.

[Fig. 1. The restricted Weyl alcove with walls of thickness $g/\sqrt{2}$ for $N = 2$. The region of the inner alcove corresponds to the configuration spaces $\Sigma_g$ (without boundary) and $\overline{\Sigma}_g$ (with boundary) of the three-particle system in the center-of-mass hyperplane $x_1 + x_2 + x_3 = 0$. The vertices and boundary segments with/without the shifts between square brackets refer to the inner/outer triangle, respectively. Labels in the figure: vertices $0\ [+\rho]$, $\frac{2\pi}{\alpha}\omega_1\ [+\rho - (N+1)g\omega_1]$, $\frac{2\pi}{\alpha}\omega_2\ [+\rho - (N+1)g\omega_2]$; walls $x_1 - x_2 = 0\ [+g]$, $x_2 - x_3 = 0\ [+g]$, $x_1 - x_3 = 2\pi/\alpha\ [-g]$.]

An obvious candidate for the phase space would now of course be the cotangent bundle over the configuration space: $T^*(\Sigma_g) \cong \Sigma_g \times E$. However, in view of the periodicity of the Hamiltonians $H_r$ (2.15) with respect to translations in $p$ over vectors in the dilated root lattice $2\pi Q$ (cf. (2.4)), it is natural to restrict to a smaller phase space of the form $\Sigma_g \times T$, where $T$ is the $N$-dimensional torus $E/(2\pi Q)$. This torus can be coordinatized explicitly as

$$T = \{p \in E \mid -\pi < \langle\omega_r, p\rangle \le \pi,\ r = 1, \ldots, N\}, \tag{2.17}$$

where the components $\langle\omega_r, p\rangle$ of the vector $p$ with respect to the basis of fundamental weights $\{\omega_1, \ldots, \omega_N\}$ should be read modulo $2\pi$. Unfortunately, it turns out that the $H_r$-flows are not complete on the bounded phase space $\Sigma_g \times T$ [R3]. To remedy this incompleteness, one needs to compactify the phase space in a suitable manner. For this purpose the key observation from [R3] is that it turns out to be possible to embed the noncomplete phase space $\Sigma_g \times T$ densely and symplectically in $\mathbb{CP}^N$. Here the complex projective space is to be viewed as a $2N$-dimensional real manifold with symplectic form proportional to the standard symplectic form inherited from the Fubini-Study Kähler metric on $\mathbb{CP}^N$. The Hamiltonians $H_1, \ldots, H_N$ (2.15) lift under this embedding to smooth (Poisson commuting) Hamiltonians on $\mathbb{CP}^N$ [R3] and the completeness of the corresponding Hamiltonian flows is thus immediate from the compactness of the extended phase space $\mathbb{CP}^N$. The relevant embedding of the noncomplete phase space $\Sigma_g \times T$ into $\mathbb{CP}^N$ used by Ruijsenaars is given explicitly by $(x, p) \mapsto [1 : z_1 : z_2 : \cdots : z_N] \in \mathbb{CP}^N$ with

$$z_j = e^{i\langle\omega_j, p\rangle} \left( \frac{\langle a_j, x\rangle - g}{2\pi/\alpha - g - \langle a_{\max}, x\rangle} \right)^{1/2}, \qquad j = 1, \ldots, N. \tag{2.18}$$

The inverse mapping $[z_0 : z_1 : z_2 : \cdots : z_N] \mapsto (x, p) \in \Sigma_g \times T$, defined on an open dense patch $\{[z_0 : \cdots : z_N] \mid z_j \ne 0\ (j = 0, \ldots, N)\}$ of $\mathbb{CP}^N$, reads

$$\langle a_j, x\rangle = \bigl(2\pi/\alpha - (N+1)g\bigr)\, \frac{|z_j|^2}{|z_0|^2 + \cdots + |z_N|^2} + g, \tag{2.19a}$$

$$e^{i\langle\omega_j, p\rangle} = \frac{z_j}{z_0}\, \frac{|z_0|}{|z_j|}, \tag{2.19b}$$

$j = 1, \ldots, N$. (This gives the components of $x$ and $p$ with respect to the bases $\{a_1, \ldots, a_N\}$ and $\{\omega_1, \ldots, \omega_N\}$, respectively.) The above mappings are symplectic when $\Sigma_g \times T$ is equipped with the standard symplectic form induced by $\sum_{j=1}^{N+1} dx_j \wedge dp_j$ and $\mathbb{CP}^N$ is endowed with the renormalized Fubini-Study symplectic form

$$\omega_R = \frac{2iR^2}{\sum_{j=0}^N |z_j|^2} \left( \sum_{j=0}^N dz_j \wedge d\bar z_j - \frac{1}{\sum_{j=0}^N |z_j|^2} \sum_{j=0}^N \bar z_j\, dz_j \wedge \sum_{j=0}^N z_j\, d\bar z_j \right), \tag{2.20}$$

where the normalization is such that the integral of $\omega_R$ over a complex projective line equals $4\pi R^2$ with

$$2R^2 = \frac{2\pi}{\alpha} - (N+1)g. \tag{2.21}$$
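That (2.19a), (2.19b) indeed invert (2.18) can be confirmed numerically. In the sketch below a point of $\Sigma_g \times T$ is represented by its components $c_j = \langle a_j, x\rangle$ and $\theta_j = \langle\omega_j, p\rangle$, so that $\langle a_{\max}, x\rangle = c_1 + \cdots + c_N$ by (2.12); the function names `embed` and `invert` and the sample parameter values are assumptions made for illustration only.

```python
import cmath
import math


def embed(c, theta, alpha, g):
    """Map (2.18): returns [z_1, ..., z_N] for the representative with z_0 = 1.

    c[j] = <a_j, x> and theta[j] = <omega_j, p>; <a_max, x> equals sum(c).
    """
    denom = 2 * math.pi / alpha - g - sum(c)
    return [cmath.exp(1j * t) * math.sqrt((cj - g) / denom)
            for cj, t in zip(c, theta)]


def invert(z, alpha, g):
    """Inverse map (2.19a), (2.19b) for z = [z_0, ..., z_N] with all z_j nonzero."""
    N = len(z) - 1
    norm2 = sum(abs(w) ** 2 for w in z)
    M = 2 * math.pi / alpha - (N + 1) * g
    c = [M * abs(z[j]) ** 2 / norm2 + g for j in range(1, N + 1)]
    theta = [cmath.phase((z[j] / z[0]) * abs(z[0]) / abs(z[j]))
             for j in range(1, N + 1)]
    return c, theta


if __name__ == "__main__":
    alpha, g = 1.0, 0.5                   # satisfies (2.16) for N = 2
    c, theta = [1.0, 2.0], [0.3, -1.2]    # a point of Sigma_g x T
    z = [1.0 + 0j] + embed(c, theta, alpha, g)
    print(invert(z, alpha, g))            # reproduces (c, theta) up to rounding
```

Because (2.19a) and (2.19b) only involve ratios of the homogeneous coordinates, `invert` returns the same point when all $z_j$ are multiplied by a common nonzero complex factor.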
The coordinate functions in (2.19a) clearly extend to smooth functions on the whole of $\mathbb{CP}^N$. The image of the extension of the coordinate map to the completed phase space $\mathbb{CP}^N$ is therefore given by the compactification $\overline{\Sigma}_g$ of $\Sigma_g$ in $E$,

$$\overline{\Sigma}_g = \Bigl\{x \in E \Bigm| \langle a_j, x\rangle \ge g\ (j = 1, \ldots, N);\ \langle a_{\max}, x\rangle \le \frac{2\pi}{\alpha} - g \Bigr\}. \tag{2.22}$$

It is natural to interpret the simplex $\overline{\Sigma}_g$ as the configuration space for the compactified trigonometric Ruijsenaars–Schneider model, even though globally the completed phase space $\mathbb{CP}^N$ does not have a topology of product form. (In particular $\mathbb{CP}^N \not\cong \overline{\Sigma}_g \times T$.) Notice in this connection that the coordinate functions in (2.19b) for the momentum-like variables do not extend continuously to the boundary hyperplanes $z_j = 0$, $j = 0, \ldots, N$ (as the limiting value of (2.19b) for $z_j \to 0$ along a radius in the complex plane depends on the value of the argument).

[Fig. 2. The compactification of the phase space for $N = 1$ turning the cylinder $\Sigma_g \times T$ into a two-sphere ($\cong \mathbb{CP}^1$) by adding two points. The arrows indicate the canonical projections $\Pi$, $\overline{\Pi}$ onto the configuration spaces $\Sigma_g$ (open line segment) and $\overline{\Sigma}_g$ (closed line segment), respectively.]

It is quite instructive to view how the compactification works topologically in the situation of two particles ($N = 1$). In this special case the reduced phase space $\Sigma_g \times T$ before completion has the structure of an open line segment ($\Sigma_g = \{(x/2, -x/2) \mid x \in\ ]g, 2\pi/\alpha - g[\ \}$) times a real one-dimensional torus $T_1$. Topologically this is a cylinder without the two boundary circles or, equivalently, a two-sphere with two distinct points
extracted. The compactification adds the two extracted points (pinching) thus resulting in a compact phase space with the topology of a two-sphere $S^2$ ($\cong \mathbb{CP}^1$). The canonical projection $\Pi : \Sigma_g \times T \to \Sigma_g$ clearly extends uniquely to a continuous projection $\overline{\Pi}$ of $S^2$ onto the closed line segment $\overline{\Sigma}_g$ ($= \{(x/2, -x/2) \mid x \in [g, 2\pi/\alpha - g]\}$); however, the fiber $\overline{\Pi}^{-1}(m)$ with $m \in \overline{\Sigma}_g$ reduces to a point when $m$ lies on the boundary $\overline{\Sigma}_g \setminus \Sigma_g$, whereas it is isomorphic to a real one-torus $T_1$ for $m$ in the interior $\Sigma_g$. In particular, we do not have that $S^2$ is isomorphic to $\overline{\Sigma}_g \times T_1$. See Fig. 2.

3. Quantization

In this section we quantize the compactified Ruijsenaars–Schneider model of the previous section. The quantum versions of the Hamiltonians $H_1, \ldots, H_N$ (2.15) for the relative particle motion in the center-of-mass frame will be given by commuting discrete difference operators acting in a finite-dimensional Hilbert space of functions with support on a finite uniform lattice over the classical compactified configuration space $\overline{\Sigma}_g$ (2.22).

3.1. Ruijsenaars difference operators. In [R1] Ruijsenaars showed how the Poisson-commuting Hamiltonians $H_r$ (2.14) may be quantized formally (i.e., without specifying a Hilbert space) by means of canonical quantization in such a way that integrability is preserved. For the reduced integrals $H_r$ (2.15) the procedure leads to difference operators of the form

$$\hat H_r = (\hat H_r^+ + \hat H_r^-)/2, \qquad r = 1, \ldots, N, \tag{3.1}$$

with

$$\hat H_r^+ = \sum_{\nu\in S_{N+1}(\omega_r)} V_\nu^{1/2}(x)\, T_\nu\, V_\nu^{1/2}(-x), \tag{3.2a}$$

$$\hat H_r^- = \sum_{\nu\in S_{N+1}(\omega_r)} V_\nu^{1/2}(-x)\, T_\nu^{-1}\, V_\nu^{1/2}(x), \tag{3.2b}$$

where

$$T_\nu = e^{\langle\nu, \frac{\partial}{\partial x}\rangle} \tag{3.3}$$

denotes the operator acting on functions $f : E \to \mathbb{C}$ by a translation over the weight vector $\nu$, i.e. $(T_\nu f)(x) = f(x + \nu)$, and the coefficients are determined by

$$V_\nu(x) = \prod_{\substack{a\in A_N\\ \langle a,\nu\rangle=1}} \frac{\sin\frac{\alpha}{2}(g + \langle a, x\rangle)}{\sin\frac{\alpha}{2}\langle a, x\rangle}. \tag{3.4}$$

The commutativity of the above difference operators is by no means evident from their explicit expressions, but it does follow immediately from Ruijsenaars’ results in [R1]. To this end it is helpful to observe that $\hat H_r^- = \hat H_{N+1-r}^+$ because $-\omega_r$ is in the $S_{N+1}$-orbit of $\omega_{N+1-r}$ and $V_\nu(-x) = V_{-\nu}(x)$. Hence, the commutativity already follows from the commutativity of $\hat H_1^+, \ldots, \hat H_N^+$, which can be traced back to [R1] by noticing that $\hat H_r^+$ corresponds in Ruijsenaars’ notation to the operator $\hat S_r \hat S_{N+1}^{-r/(N+1)}$ (with the number of particles of course being equal to $N + 1$).
Proposition 3.1 (Quantum integrability [R1]). The difference operators $\hat H_r^+$, $\hat H_r^-$ and $\hat H_r$, $r = 1, \ldots, N$ mutually commute.

To see that the classical version of $\hat H_r$ indeed amounts to $H_r$ (2.15), one observes that after substituting $T_\nu = \exp(i\langle\nu, p\rangle)$ (which is the classical analog of (3.3)) in $\hat H_r$ one arrives at $H_r$ (2.15) by using the identity

$$V_\nu(x)\, V_\nu(-x) = \prod_{\substack{a\in A_N\\ \langle a,\nu\rangle=1}} \left( 1 - \frac{\sin^2\bigl(\frac{\alpha g}{2}\bigr)}{\sin^2 \frac{\alpha}{2}\langle a, x\rangle} \right).$$
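The product identity above lends itself to a quick numerical check. The sketch below relies on an elementary observation that is not spelled out in the text: a weight $\nu$ in the $S_{N+1}$-orbit of $\omega_r$ has the form $\sum_{j\in J} e_j - \frac{r}{N+1}(e_1 + \cdots + e_{N+1})$ for a subset $J$ with $|J| = r$, and a root $a = e_j - e_k$ satisfies $\langle a, \nu\rangle = 1$ precisely when $j \in J$ and $k \notin J$. The names `V` and `classical_factor` and the sample values are illustrative assumptions.

```python
import math
from itertools import combinations


def V(J, x, alpha, g):
    """Coefficient V_nu(x) of (3.4) for the weight nu labelled by the index set J.

    The roots a = e_j - e_k with <a, nu> = 1 are those with j in J, k not in J.
    """
    value = 1.0
    for j in J:
        for k in range(len(x)):
            if k not in J:
                d = x[j] - x[k]
                value *= math.sin(alpha / 2 * (g + d)) / math.sin(alpha / 2 * d)
    return value


def classical_factor(J, x, alpha, g):
    """Product over the same roots of 1 - sin^2(alpha g/2) / sin^2(alpha <a,x>/2)."""
    value = 1.0
    for j in J:
        for k in range(len(x)):
            if k not in J:
                d = x[j] - x[k]
                value *= 1 - math.sin(alpha * g / 2) ** 2 / math.sin(alpha / 2 * d) ** 2
    return value


if __name__ == "__main__":
    alpha, g = 1.0, 0.3
    x = [0.9, 0.1, -1.0]                      # three particles, N = 2
    minus_x = [-t for t in x]
    for r in (1, 2):
        for J in combinations(range(len(x)), r):
            lhs = V(J, x, alpha, g) * V(J, minus_x, alpha, g)
            print(J, math.isclose(lhs, classical_factor(J, x, alpha, g), rel_tol=1e-9))
```

Each factor of the identity reduces to $\sin(u+v)\sin(u-v) = \sin^2 u - \sin^2 v$ with $u = \frac{\alpha}{2}\langle a, x\rangle$ and $v = \frac{\alpha g}{2}$, which is what the comparison confirms term by term.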
3.2. The finite-dimensional Hilbert space. The formal difference operators $\hat H_r$ (3.1) and $\hat H_r^\pm$ (3.2a), (3.2b) shift function arguments over vectors in the weight lattice $\Lambda$ (2.5). We will now assign a precise meaning to these difference operators as operators in a Hilbert space of functions with support in a uniform lattice over the classical compactified configuration space $\overline{\Sigma}_g$ (2.22). The point $\rho$ (2.13) denotes the “minimal” vertex of the simplex $\overline{\Sigma}_g$ determined (uniquely) by the property that the functionals $\langle a_j, \cdot\rangle$, $j = 1, \ldots, N$ simultaneously assume their minimum value $g$. By shifting from $\rho$ over vectors in the weight lattice $\Lambda$ (2.5), one generates a uniform lattice in $\overline{\Sigma}_g$ consisting of the points $\rho + \mu$, $\mu \in \Lambda^+$ (2.7) with $\langle a_{\max}, \rho + \mu\rangle = Ng + \langle a_{\max}, \mu\rangle \le \frac{2\pi}{\alpha} - g$. When the (positive) coupling constant $g$ and scale factor $\alpha$ are related by

$$\frac{2\pi}{\alpha} - (N+1)g = M \in \mathbb{N} \setminus \{0\}, \tag{3.5}$$

then the maximum value $\frac{2\pi}{\alpha} - g$ of the functional $\langle a_{\max}, \cdot\rangle$ is assumed on the lattice. More to the point, it means that in this situation apart from the “minimal vertex” $\rho$ also the $N$ other vertices of the simplex $\overline{\Sigma}_g$ (2.22) (viz. the “maximal vertices” $\rho + M\omega_r$, $r = 1, \ldots, N$) lie on the lattice and, hence, that the lattice fits precisely over the classical configuration space including its boundary. See Fig. 3. From now on we will assume that the condition in (3.5) is satisfied. (Notice that the condition is compatible with the parameter restriction in (2.16).) Let $\Lambda_M^+$ be the alcove of dominant weights in $\Lambda^+$ (2.7) given by

$$\Lambda_M^+ = \{\lambda \in \Lambda^+ \mid \langle a_{\max}, \lambda\rangle \le M\}, \tag{3.6}$$

and let $L^2(\rho + \Lambda_M^+)$ be the finite-dimensional Hilbert space of complex functions over the lattice $\rho + \Lambda_M^+ := \{\rho + \mu \mid \mu \in \Lambda_M^+\}$ endowed with the (standard) sesquilinear inner product

$$(f, h) = \sum_{\mu\in\Lambda_M^+} f(\rho + \mu)\, \overline{h(\rho + \mu)} \tag{3.7}$$

($f, h \in L^2(\rho + \Lambda_M^+)$). Notice that our conventions are such that the inner product is linear in the first slot and antilinear in the second slot. The dimension of the Hilbert space $L^2(\rho + \Lambda_M^+)$ amounts to the number of points in $\Lambda_M^+$, i.e., the number of vectors of the form $n_1\omega_1 + \cdots + n_N\omega_N$ with $n_j \in \mathbb{N}$ ($j = 1, \ldots, N$) and $\langle a_{\max}, n_1\omega_1 + \cdots + n_N\omega_N\rangle = n_1 + \cdots + n_N \le M$:

$$\dim\bigl(L^2(\rho + \Lambda_M^+)\bigr) = \frac{(N+M)!}{N!\, M!}. \tag{3.8}$$
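The dimension count (3.8) and the quantization condition (3.5) can be illustrated by enumerating the alcove $\Lambda_M^+$ directly. In the sketch below a weight $n_1\omega_1 + \cdots + n_N\omega_N$ is stored as the tuple $(n_1, \ldots, n_N)$; the helper names `alcove_weights` and `coupling_from_level` are hypothetical and introduced only for this illustration.

```python
import math
from itertools import product


def alcove_weights(N, M):
    """Coordinates (n_1, ..., n_N) of the weights in Lambda_M^+ (3.6).

    The condition <a_max, lambda> <= M reads n_1 + ... + n_N <= M.
    """
    return [n for n in product(range(M + 1), repeat=N) if sum(n) <= M]


def coupling_from_level(alpha, N, M):
    """Coupling g solving the quantization condition (3.5) for given alpha, N, M."""
    return (2 * math.pi / alpha - M) / (N + 1)


if __name__ == "__main__":
    N, M = 2, 5
    lattice = alcove_weights(N, M)
    print(len(lattice), math.comb(N + M, N))  # both equal 21, cf. (3.8)

    alpha = 1.0
    g = coupling_from_level(alpha, N, M)
    # g must be positive and obey the upper bound (2.16)
    print(0 < g < 2 * math.pi / ((N + 1) * alpha))
```

For $N = 2$ and $M = 5$ the enumeration yields the 21 lattice points of the three-particle example in Fig. 3.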
[Fig. 3. The lattice $\rho + \Lambda_M^+$ over the configuration space $\overline{\Sigma}_g$ supporting the quantum wave functions for $N = 2$ and $M = 5$. The dimension of the Hilbert space $L^2(\rho + \Lambda_M^+)$ (i.e. the number of points in the lattice) amounts in this case to $\binom{2+5}{2} = 21$.]
In order to see that the difference operators $\hat H_r^{(\pm)}$ (3.1), (3.2a), (3.2b) are well-defined as operators in $L^2(\rho + \Lambda_M^+)$ we need the following lemma.

Lemma 3.2 (Regularity, positivity and vanishing boundary conditions). For positive parameters $\alpha$, $g$ subject to the condition (3.5), $\mu \in \Lambda_M^+$ (3.6) and $\nu$ in the $S_{N+1}$-orbit of a fundamental weight vector $\omega_r$ (2.3), one has that

$0 < V_\nu(\rho + \mu) < \infty$, $0 < V_\nu(-\rho - \mu - \nu) < \infty$ if $\mu + \nu \in \Lambda_M^+$,
$V_\nu(\rho + \mu) = 0$ if $\mu + \nu \notin \Lambda_M^+$,

and, if in addition $g \notin \{\frac{1}{N}, \frac{1}{N-1}, \frac{1}{N-2}, \ldots, \frac{1}{2}, 1\}$, that

$-\infty < V_\nu(-\rho - \mu - \nu) < \infty$ if $\mu + \nu \notin \Lambda_M^+$

(where $\rho$ and $V_\nu(x)$ are given by (2.13) and (3.4), respectively).

Proof. Let us write

$$V_\nu(\rho + \mu) = \prod_{\substack{a\in A_N^+\\ \langle a,\nu\rangle=1}} \frac{\sin\frac{\alpha}{2}(\langle a, \rho+\mu\rangle + g)}{\sin\frac{\alpha}{2}\langle a, \rho+\mu\rangle} \prod_{\substack{a\in A_N^+\\ \langle a,\nu\rangle=-1}} \frac{\sin\frac{\alpha}{2}(\langle a, \rho+\mu\rangle - g)}{\sin\frac{\alpha}{2}\langle a, \rho+\mu\rangle}.$$

From the inequality

$$0 < g \le \langle a, \rho + \mu\rangle \le \frac{2\pi}{\alpha} - g < \frac{2\pi}{\alpha} \tag{$*$}$$

for $a \in A_N^+$ and $\mu \in \Lambda_M^+$, it is seen that all factors in the denominator of the above formula for $V_\nu(\rho + \mu)$ are positive and that all factors in the numerator are nonnegative. Zeros in the numerator appear when $\langle a, \rho + \mu\rangle + g$ becomes equal to $2\pi/\alpha$ or when $\langle a, \rho + \mu\rangle - g$ becomes equal to 0. The first situation occurs if and only if $\langle a_{\max}, \nu\rangle = 1$ and $\langle a_{\max}, \mu\rangle = M$, i.e., iff $\langle a_{\max}, \mu + \nu\rangle > M$. The second situation occurs if and only if $\langle a_j, \nu\rangle = -1$ and $\langle a_j, \mu\rangle = 0$ for a certain simple root $a_j$ (2.2), i.e., iff $\langle a_j, \mu + \nu\rangle < 0$ for a certain simple root $a_j$. In a similar way one derives from the formula

$$V_\nu(-\rho - \mu - \nu) = \prod_{\substack{a\in A_N^+\\ \langle a,\nu\rangle=1}} \frac{\sin\frac{\alpha}{2}(\langle a, \rho+\mu\rangle + 1 - g)}{\sin\frac{\alpha}{2}(\langle a, \rho+\mu\rangle + 1)} \prod_{\substack{a\in A_N^+\\ \langle a,\nu\rangle=-1}} \frac{\sin\frac{\alpha}{2}(\langle a, \rho+\mu\rangle - 1 + g)}{\sin\frac{\alpha}{2}(\langle a, \rho+\mu\rangle - 1)},$$

combined with the inequality ($*$), that $V_\nu(-\rho - \mu - \nu)$ is positive and finite for $\mu \in \Lambda_M^+$ with $\mu + \nu \in \Lambda_M^+$. The denominators become zero when $\langle a, \rho + \mu\rangle + 1 = 2\pi/\alpha$ or when $\langle a, \rho + \mu\rangle - 1 = 0$, which can happen only if $\mu + \nu \notin \Lambda_M^+$ and $g \in \{\frac{1}{N}, \frac{1}{N-1}, \frac{1}{N-2}, \ldots, \frac{1}{2}, 1\}$.

We learn from Lemma 3.2 that for parameters subject to (3.5) the coefficient functions $V_\nu(x)$ (3.4) are regular and positive on the lattice points $\rho + \mu$, $\mu \in \Lambda_M^+$ with $\mu + \nu \in \Lambda_M^+$ and zero on the boundary lattice points $\rho + \mu$, $\mu \in \Lambda_M^+$ with $\mu + \nu \notin \Lambda_M^+$. Similarly, the coefficient function $V_\nu(-x - \nu)$ is regular and positive on the lattice points $\rho + \mu$ with $\mu, \mu + \nu \in \Lambda_M^+$ and generically (i.e. for $g \ne 1/j$, $j = 1, \ldots, N$) regular on the boundary lattice points $\rho + \mu$ with $\mu \in \Lambda_M^+$, $\mu + \nu \notin \Lambda_M^+$. (Shifts of the type $\mp x \to \mp x - \nu$ in the functions governing the coefficients of $\hat H_r^\pm$ (3.2a), (3.2b) originate from commuting the coefficients on the right of the translation operator $T_\nu^{\pm 1}$ to the left.) This entails that for parameters subject to (3.5) (and $g \notin \{\frac{1}{N}, \frac{1}{N-1}, \ldots, 1\}$) the difference operators $\hat H_1^\pm, \ldots, \hat H_N^\pm$ (3.2a), (3.2b) (and hence $\hat H_1, \ldots, \hat H_N$ (3.1)) admit a well-defined restriction to functions in $L^2(\rho + \Lambda_M^+)$ which maps the Hilbert space into itself. The explicit action on functions $f \in L^2(\rho + \Lambda_M^+)$ is given by

$$(\hat H_r^\pm f)(\rho + \mu) = \sum_{\nu\in S_{N+1}(\omega_r)} W_\nu^\pm(\rho + \mu)\, f(\rho + \mu \pm \nu) \tag{3.9}$$

($\mu \in \Lambda_M^+$) with

$$W_\nu^+(\rho + \mu) = \begin{cases} V_\nu^{1/2}(\rho + \mu)\, V_\nu^{1/2}(-\rho - \mu - \nu) > 0 & \text{for } \mu + \nu \in \Lambda_M^+, \\ 0 & \text{for } \mu + \nu \notin \Lambda_M^+, \end{cases}$$

$$W_\nu^-(\rho + \mu) = \begin{cases} V_\nu^{1/2}(-\rho - \mu)\, V_\nu^{1/2}(\rho + \mu - \nu) > 0 & \text{for } \mu - \nu \in \Lambda_M^+, \\ 0 & \text{for } \mu - \nu \notin \Lambda_M^+ \end{cases}$$

(recall to this end also that $V_\nu(-x) = V_{-\nu}(x)$). The vanishing boundary conditions for the coefficients $W_\nu^\pm(\rho + \mu)$ guarantee that $(\hat H_r^\pm f)(\rho + \mu)$, $\mu \in \Lambda_M^+$ depends only on the values of $f(\cdot)$ in the points of the lattice $\rho + \Lambda_M^+$, i.e., we have that $\hat H_r^\pm$ is well-defined as an operator in $L^2(\rho + \Lambda_M^+)$. For $g \in \{\frac{1}{N}, \frac{1}{N-1}, \ldots, 1\}$ ambiguities in the value of the products $V_\nu(\rho + \mu) V_\nu(-\rho - \mu - \nu)$ and $V_\nu(-\rho - \mu) V_\nu(\rho + \mu - \nu)$ at the boundary points $\mu \in \Lambda_M^+$ with $\mu + \nu \notin \Lambda_M^+$ may arise as zeros deriving from $V_\nu(\rho + \mu)$ and $V_\nu(-\rho - \mu)$ can meet with possible singularities deriving from $V_\nu(-\rho - \mu - \nu)$ and $V_\nu(\rho + \mu - \nu)$, respectively. For instance, when $g = 1$ we have for generic $x \in \mathbb{R}^N$ that $V_\nu(x) V_\nu(-x - \nu) \equiv 1$. Here we will resolve such ambiguities in the value of coefficients at the boundary lattice points for $g$ in the exceptional set $\{\frac{1}{N}, \frac{1}{N-1}, \ldots, 1\}$ by requiring
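continuity under small variations in $g$. Specifically, this means that for all parameter values subject to the condition (3.5) we will pick the action of $\hat H_r^{\pm}$ with vanishing boundary conditions in accordance with (3.9).

Proposition 3.3 (Self-adjointness). Let us assume (positive) parameters subject to the condition (3.5) and let the action of $\hat H_r^{\pm}: L^2(\rho+\Lambda^+_M)\to L^2(\rho+\Lambda^+_M)$ be given by (3.9). Then the difference operators $\hat H_r=(\hat H_r^{+}+\hat H_r^{-})/2$, $r=1,\dots,N$ are self-adjoint in $L^2(\rho+\Lambda^+_M)$.

Proof. It suffices to demonstrate that the operators $\hat H_r^{+}$ and $\hat H_r^{-}$ are each other's adjoints in $L^2(\rho+\Lambda^+_M)$. Some elementary manipulations produce:
\[
\begin{aligned}
(\hat H_r^{+}f,h) &= \sum_{\mu\in\Lambda^+_M}(\hat H_r^{+}f)(\rho+\mu)\,\overline{h(\rho+\mu)}\\
&\overset{(3.9)}{=}\sum_{\nu\in S_{N+1}(\omega_r)}\ \sum_{\mu\in\Lambda^+_M} W_\nu^{+}(\rho+\mu)\,f(\rho+\mu+\nu)\,\overline{h(\rho+\mu)}\\
&\overset{(i)}{=}\sum_{\nu\in S_{N+1}(\omega_r)}\ \sum_{\substack{\mu\in\Lambda^+_M\\ \mu+\nu\in\Lambda^+_M}} W_\nu^{+}(\rho+\mu)\,f(\rho+\mu+\nu)\,\overline{h(\rho+\mu)}\\
&\overset{(ii)}{=}\sum_{\nu\in S_{N+1}(\omega_r)}\ \sum_{\substack{\tilde\mu\in\Lambda^+_M\\ \tilde\mu-\nu\in\Lambda^+_M}} W_\nu^{+}(\rho+\tilde\mu-\nu)\,f(\rho+\tilde\mu)\,\overline{h(\rho+\tilde\mu-\nu)}\\
&\overset{(iii)}{=}\sum_{\nu\in S_{N+1}(\omega_r)}\ \sum_{\substack{\tilde\mu\in\Lambda^+_M\\ \tilde\mu-\nu\in\Lambda^+_M}} f(\rho+\tilde\mu)\,W_\nu^{-}(\rho+\tilde\mu)\,\overline{h(\rho+\tilde\mu-\nu)}\\
&\overset{(i),(3.9)}{=}\sum_{\tilde\mu\in\Lambda^+_M} f(\rho+\tilde\mu)\,\overline{(\hat H_r^{-}h)(\rho+\tilde\mu)}=(f,\hat H_r^{-}h),
\end{aligned}
\]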
where we have used (i) the vanishing boundary conditions for the coefficients $W_\nu^{\pm}$, (ii) the substitution $\mu=\tilde\mu-\nu$ and (iii) that $W_\nu^{+}(\rho+\tilde\mu-\nu)=W_\nu^{-}(\rho+\tilde\mu)\ge 0$.

Remark. For a given trigonometric period $\frac{2\pi}{\alpha}>1$, the parameter restriction in (3.5) determines a quantization condition on the coupling parameter $g$ (permitting only a finite number of values for $g$, labeled by $M\in\{1,\dots,[\frac{2\pi}{\alpha}]\}$). However, it is also possible (and probably somewhat more natural) to instead interpret the restriction in (3.5) as a quantization condition on a step size parameter. Recall to this end that in the present paper we have scaled our variables such that the scale parameter $\beta$ appearing in the classical (compactified) Ruijsenaars–Schneider Hamiltonian $H$ (1.1) has the value 1 (see the note at the end of the introduction). By substituting $x_j\to\beta^{-1}x_j$ and $\alpha\to\alpha\beta$ ($\beta>0$) in the difference operators of Sect. 3.1, we reintroduce the scale parameter $\beta$ in our quantum Hamiltonians. Specifically, the operators $\hat H_r^{(\pm)}$ (3.1), (3.2a), (3.2b) then pass over to discrete difference operators of the form given in (3.1)-(3.4) with the coupling parameter $g$ and the translation operator $T_\nu$ (3.3) replaced by $\beta g$ and $T_\nu=\exp(\beta\langle\nu,\frac{\partial}{\partial x}\rangle)$, respectively. In other words, at the quantum level the scale parameter $\beta$ enters as the step size parameter of the discrete difference operators [R1]. As a consequence of this rescaling, the lattice supporting the wave functions is scaled by $\beta$, resulting in a lattice of the form $\beta(\rho+\Lambda^+_M)$, and the parameter restriction in (3.5) passes over to the condition $\frac{2\pi}{\alpha\beta}-(N+1)g=M\in\mathbb{N}\setminus\{0\}$. For a given trigonometric period $\frac{2\pi}{\alpha}$ and (positive) coupling parameter $g$, the latter parameter restriction may be interpreted as a quantization condition on the step size parameter $\beta$ (permitting an infinite series of values for $\beta$, labeled by $M\in\mathbb{N}\setminus\{0\}$). The quantization condition at issue adjusts the step size such that the lattice $\beta(\rho+\Lambda^+_M)$ fits precisely over the classical configuration space $\Sigma_{\beta g}$ (cf. (2.22)) including the corner points (vertices), where the number of lattice points on a ribbon is $M+1$.
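4. Wave Functions

In this section an orthonormal basis for $L^2(\rho+\Lambda^+_M)$ is presented consisting of joint eigenfunctions of $\hat H_1,\dots,\hat H_N$.

4.1. A factorized eigenfunction. Let
\[
\Delta(\mu)=\frac{1}{C_+(\mu)\,C_-(\mu)},\qquad \mu\in\Lambda^+ \tag{4.1}
\]
with
\[
C_+(\mu)=\prod_{a\in A_N^+}\frac{(\langle a,\rho\rangle:\sin_\alpha)_{\langle a,\mu\rangle}}{(g+\langle a,\rho\rangle:\sin_\alpha)_{\langle a,\mu\rangle}}, \tag{4.2a}
\]
\[
C_-(\mu)=\prod_{a\in A_N^+}\frac{(1-g+\langle a,\rho\rangle:\sin_\alpha)_{\langle a,\mu\rangle}}{(1+\langle a,\rho\rangle:\sin_\alpha)_{\langle a,\mu\rangle}}. \tag{4.2b}
\]
Here we have introduced "trigonometric Pochhammer symbols" defined by
\[
(z:\sin_\alpha)_m:=\begin{cases}1 & m=0,\\ \prod_{k=0}^{m-1}\sin\frac{\alpha}{2}(z+k) & m=1,2,3,\dots\end{cases} \tag{4.3}
\]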
We need two preparatory lemmas. The first states that, for positive parameters subject to (3.5), the value of $C_\pm(\mu)$ (and hence that of $\Delta(\mu)$) is positive and finite for $\mu\in\Lambda^+_M$ (3.6); the second lemma describes a functional (recurrence) relation between $\Delta$ (4.1) and the coefficient functions $V_\nu(x)$ (3.4), $\nu\in S_{N+1}(\omega_r)$.

Lemma 4.1 (Regularity and positivity). For positive parameters $\alpha$, $g$ subject to the condition (3.5), one has that
\[
C_\pm(\mu)>0 \quad\text{for}\quad \mu\in\Lambda^+_M
\]
(with $\Lambda^+_M$ given by (3.6)).

Proof. Using inequality (*) in the proof of Lemma 3.2, it is not difficult to infer that the arguments of the sine factors in $C_\pm(\mu)$ (4.2a), (4.2b) lie between $0$ and $\pi$.

Lemma 4.2 (Recurrence relation). Let $\nu$ be in the $S_{N+1}$-orbit of a fundamental weight vector $\omega_r$ (2.3) and let $\mu,\mu+\nu\in\Lambda^+$ (2.7). Then
\[
\Delta(\mu+\nu)\,V_\nu(-\rho-\mu-\nu)=\Delta(\mu)\,V_\nu(\rho+\mu).
\]
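Proof. We have that
\[
\begin{aligned}
\Delta(\mu+\nu)&=\prod_{a\in A_N^+}\frac{(g+\langle a,\rho\rangle:\sin_\alpha)_{\langle a,\mu+\nu\rangle}\,(1+\langle a,\rho\rangle:\sin_\alpha)_{\langle a,\mu+\nu\rangle}}{(1-g+\langle a,\rho\rangle:\sin_\alpha)_{\langle a,\mu+\nu\rangle}\,(\langle a,\rho\rangle:\sin_\alpha)_{\langle a,\mu+\nu\rangle}}\\
&=\prod_{a\in A_N^+}\frac{(g+\langle a,\rho\rangle:\sin_\alpha)_{\langle a,\mu\rangle}\,(1+\langle a,\rho\rangle:\sin_\alpha)_{\langle a,\mu\rangle}}{(1-g+\langle a,\rho\rangle:\sin_\alpha)_{\langle a,\mu\rangle}\,(\langle a,\rho\rangle:\sin_\alpha)_{\langle a,\mu\rangle}}\\
&\quad\times\prod_{\substack{a\in A_N^+\\ \langle a,\nu\rangle=1}}\frac{\sin\frac{\alpha}{2}(\langle a,\rho+\mu\rangle+g)\,\sin\frac{\alpha}{2}(\langle a,\rho+\mu\rangle+1)}{\sin\frac{\alpha}{2}(\langle a,\rho+\mu\rangle+1-g)\,\sin\frac{\alpha}{2}\langle a,\rho+\mu\rangle}\\
&\quad\times\prod_{\substack{a\in A_N^+\\ \langle a,\nu\rangle=-1}}\frac{\sin\frac{\alpha}{2}(\langle a,\rho+\mu\rangle-g)\,\sin\frac{\alpha}{2}(\langle a,\rho+\mu\rangle-1)}{\sin\frac{\alpha}{2}(\langle a,\rho+\mu\rangle-1+g)\,\sin\frac{\alpha}{2}\langle a,\rho+\mu\rangle}.
\end{aligned}
\]
Multiplication by
\[
V_\nu(-\rho-\mu-\nu)=\prod_{\substack{a\in A_N^+\\ \langle a,\nu\rangle=1}}\frac{\sin\frac{\alpha}{2}(\langle a,\rho+\mu\rangle+1-g)}{\sin\frac{\alpha}{2}(\langle a,\rho+\mu\rangle+1)}\ \prod_{\substack{a\in A_N^+\\ \langle a,\nu\rangle=-1}}\frac{\sin\frac{\alpha}{2}(\langle a,\rho+\mu\rangle-1+g)}{\sin\frac{\alpha}{2}(\langle a,\rho+\mu\rangle-1)}
\]
leads, after cancellation of common terms in the numerator and the denominator, to an expression of the form
\[
\begin{aligned}
&\prod_{a\in A_N^+}\frac{(g+\langle a,\rho\rangle:\sin_\alpha)_{\langle a,\mu\rangle}\,(1+\langle a,\rho\rangle:\sin_\alpha)_{\langle a,\mu\rangle}}{(1-g+\langle a,\rho\rangle:\sin_\alpha)_{\langle a,\mu\rangle}\,(\langle a,\rho\rangle:\sin_\alpha)_{\langle a,\mu\rangle}}\\
&\quad\times\prod_{\substack{a\in A_N^+\\ \langle a,\nu\rangle=1}}\frac{\sin\frac{\alpha}{2}(\langle a,\rho+\mu\rangle+g)}{\sin\frac{\alpha}{2}\langle a,\rho+\mu\rangle}\ \prod_{\substack{a\in A_N^+\\ \langle a,\nu\rangle=-1}}\frac{\sin\frac{\alpha}{2}(\langle a,\rho+\mu\rangle-g)}{\sin\frac{\alpha}{2}\langle a,\rho+\mu\rangle}\\
&=\Delta(\mu)\,V_\nu(\rho+\mu).
\end{aligned}
\]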
After these preliminaries we are now in the position to introduce a factorized joint eigenfunction of $\hat H_1,\dots,\hat H_N$. Let $\Psi_0:(\rho+\Lambda^+_M)\to\mathbb{R}$ be the lattice function defined by
\[
\Psi_0(\rho+\mu)=\frac{1}{N_0^{1/2}}\,\Delta^{1/2}(\mu),\qquad \mu\in\Lambda^+_M, \tag{4.4}
\]
where the normalization constant $N_0$ is chosen such that $(\Psi_0,\Psi_0)=1$ (recall the inner product (3.7)). Notice that $\Psi_0$ is well-defined and positive at the lattice points $\rho+\mu$, $\mu\in\Lambda^+_M$ (3.6) because of Lemma 4.1. (The positivity and regularity of $\Psi_0$ can also be deduced from the recurrence relation for $\Delta(\mu)$ in Lemma 4.2, using the positivity of the coefficients according to Lemma 3.2 and the initial value $\Delta(0)=1$.)
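Proposition 4.3 (Factorized eigenfunction). For positive parameters $\alpha$, $g$ subject to condition (3.5), the function $\Psi_0$ (4.4) is a joint eigenfunction of the difference operators $\hat H_r: L^2(\rho+\Lambda^+_M)\to L^2(\rho+\Lambda^+_M)$ (3.1), (3.9),
\[
\hat H_r\Psi_0=\Bigl(\sum_{\nu\in S_{N+1}(\omega_r)}\cos\alpha\langle\nu,\rho\rangle\Bigr)\Psi_0,\qquad r=1,\dots,N
\]
(where $\rho$ is given by (2.13)).

Proof. One has that
\[
\begin{aligned}
(\hat H_r\Psi_0)(\rho+\mu)&\overset{(3.1),(3.9)}{=}\frac{1}{2N_0^{1/2}}\sum_{\nu\in S_{N+1}(\omega_r)}\Bigl(W_\nu^{+}(\rho+\mu)\,\Delta^{1/2}(\mu+\nu)+W_\nu^{-}(\rho+\mu)\,\Delta^{1/2}(\mu-\nu)\Bigr)\\
&\overset{(i)}{=}\frac{1}{2N_0^{1/2}}\sum_{\nu\in S_{N+1}(\omega_r)}\Bigl(V_\nu(\rho+\mu)+V_\nu(-\rho-\mu)\Bigr)\Delta^{1/2}(\mu)\\
&\overset{(ii)}{=}\Bigl(\sum_{\nu\in S_{N+1}(\omega_r)}\cos\alpha\langle\nu,\rho\rangle\Bigr)\Psi_0(\rho+\mu),
\end{aligned}
\]
where we have used: (i) the recurrence relation of Lemma 4.2 combined with the fact that $W_\nu^{\pm}(\rho+\mu)=V_{\pm\nu}(\rho+\mu)=0$ for $\mu\pm\nu\notin\Lambda^+_M$ and (ii) the Macdonald identity [M1, Theorem (2.8)] (see also Appendix B)
\[
\sum_{\nu\in S_{N+1}(\omega_r)}V_\nu(x)=\sum_{\nu\in S_{N+1}(\omega_r)}\cos\alpha\langle\nu,\rho\rangle.
\]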
The following proposition gives a compact product formula for the proportionality constant $N_0$, which normalizes the wave function (4.4) such that its $L^2$-norm is equal to 1.

Proposition 4.4 (Normalization). The value of $N_0$ is given by
\[
N_0=\sum_{\mu\in\Lambda^+_M}\Delta(\mu)=2^{N(M-1)}(N+1)\prod_{1\le n\le N}(1+ng:\sin_\alpha)_{M-1},
\]
where it is assumed that the parameters satisfy condition (3.5).

Proof. Clearly $N_0=\sum_{\mu\in\Lambda^+_M}\Delta(\mu)$ normalizes $\Psi_0$ (4.4) such that $(\Psi_0,\Psi_0)=1$. The evaluation of the sum leading to the product formula on the r.h.s. hinges on a terminating version of a recent summation formula due to Aomoto, Ito, and Macdonald [Ao, I1, I2, M5]. The details of the summation are relegated to Appendix A (see (A.4)).

By specializing the Macdonald identity in the last line of the proof of Proposition 4.3 to $x=\rho$ and recalling Lemma 3.2, one arrives at a simple q-binomial product formula for the eigenvalues:
\[
\sum_{\nu\in S_{N+1}(\omega_r)}\cos\alpha\langle\nu,\rho\rangle=\sum_{\nu\in S_{N+1}(\omega_r)}V_\nu(\rho)\overset{\text{Lemma 3.2}}{=}V_{\omega_r}(\rho)=\frac{\prod_{j=1}^{N+1}\sin(\frac{j\alpha g}{2})}{\prod_{j=1}^{r}\sin(\frac{j\alpha g}{2})\,\prod_{j=1}^{N+1-r}\sin(\frac{j\alpha g}{2})}. \tag{4.5}
\]
For $r=1$ this product formula specializes to the well-known geometric progression
\[
\sum_{j=1}^{N+1}\cos(\alpha\rho_j)=\frac{\sin(\frac{\alpha g}{2}(N+1))}{\sin(\frac{\alpha g}{2})}. \tag{4.6}
\]
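The product formula (4.5), and its special case (4.6), can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes $\rho_j=g(N/2+1-j)$, as suggested by the expression for $\tilde\rho$ in Sect. 5.2, and identifies each $\nu$ in the $S_{N+1}$-orbit of $\omega_r$ with an $r$-element subset $J\subset\{1,\dots,N+1\}$ so that $\langle\nu,\rho\rangle=\sum_{j\in J}\rho_j$. The function names are hypothetical.

```python
import math
from itertools import combinations

def rho(N, g):
    # Assumed components rho_j = g*(N/2 + 1 - j), j = 1..N+1 (they sum to zero).
    return [g * (N / 2 + 1 - j) for j in range(1, N + 2)]

def orbit_cosine_sum(N, r, alpha, g):
    # Left-hand side of (4.5): sum of cos(alpha*<nu, rho>) over the orbit of omega_r.
    # Each nu is labeled by an r-subset J, with <nu, rho> = sum of rho_j over J.
    rh = rho(N, g)
    return sum(math.cos(alpha * sum(rh[j] for j in J))
               for J in combinations(range(N + 1), r))

def sine_binomial(N, r, alpha, g):
    # Right-hand side of (4.5): a ratio of products of sines.
    def s(n):
        return math.prod(math.sin(j * alpha * g / 2) for j in range(1, n + 1))
    return s(N + 1) / (s(r) * s(N + 1 - r))

if __name__ == "__main__":
    N, M, g = 3, 2, 0.3
    alpha = 2 * math.pi / (M + (N + 1) * g)  # condition (3.5)
    for r in range(1, N + 1):
        print(r, orbit_cosine_sum(N, r, alpha, g), sine_binomial(N, r, alpha, g))
```

For each $r$ the two printed columns should agree up to rounding error.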
4.2. The complete eigenbasis. We will now extend the factorized wave function $\Psi_0$ (4.4) to an orthonormal basis of $L^2(\rho+\Lambda^+_M)$ consisting of joint eigenfunctions of the commuting operators $\hat H_1,\dots,\hat H_N$. The eigenbasis will be expressed in terms of Macdonald polynomials with $|q|=1$. To describe these we need notation for the elementary symmetric functions,
\[
E_r^{\pm}(x)=\sum_{\nu\in S_{N+1}(\omega_r)}e^{\pm i\alpha\langle\nu,x\rangle},\qquad r=1,\dots,N \tag{4.7}
\]
together with their real parts
\[
E_r(x)=\sum_{\nu\in S_{N+1}(\omega_r)}\cos\alpha\langle\nu,x\rangle,\qquad r=1,\dots,N \tag{4.8}
\]
and also for the monomial symmetric functions
\[
m_\lambda(x)=\sum_{\mu\in S_{N+1}(\lambda)}e^{i\alpha\langle\mu,x\rangle},\qquad \lambda\in\Lambda^+. \tag{4.9}
\]
Notice that $E_r^{-}(x)=\overline{E_r^{+}(x)}=E_{N+1-r}^{+}(x)$ (since $-\omega_r$ lies in the $S_{N+1}$-orbit of $\omega_{N+1-r}$). The Macdonald polynomials $p_\lambda(x)$, $\lambda\in\Lambda^+$ are now defined as the unique trigonometric polynomials of the form
\[
p_\lambda(x)=m_\lambda(x)+\sum_{\substack{\mu\in\Lambda^+\\ \mu\prec\lambda}}c_{\lambda,\mu}\,m_\mu(x),\qquad c_{\lambda,\mu}\in\mathbb{R} \tag{4.10a}
\]
satisfying the difference equations
\[
\sum_{\nu\in S_{N+1}(\omega_r)}V_\nu(x)\,p_\lambda(x+\nu)=E_r^{+}(\rho+\lambda)\,p_\lambda(x),\qquad r=1,\dots,N \tag{4.10b}
\]
(with the coefficients $V_\nu(x)$ given by (3.4)). For generic parameters the existence of polynomials $p_\lambda$ of this form follows from the work of Macdonald in [M2, M3, M4] (see Appendix B for a brief summary of the results most relevant to us here). Our parameters $\alpha$ and $g$ are related to the parameters $q$ and $t$ employed by Macdonald via $t=q^{g}$ and $q=e^{i\alpha}$ (cf. Appendix B). So, in particular, we have that $|q|=1$ (and $|t|=1$) for real $\alpha$ (and $g$). An important property of the Macdonald polynomials is that after renormalizing in the following way:
\[
P_\lambda(x)=C_+(\lambda)\,p_\lambda(x),\qquad \lambda\in\Lambda^+ \tag{4.11}
\]
(where $C_+(\lambda)$ is given by (4.2a)), they satisfy the symmetry relations [Ko1, M4, EK] (cf. also Appendix B)
\[
P_\lambda(\rho+\mu)=P_\mu(\rho+\lambda),\qquad \lambda,\mu\in\Lambda^+. \tag{4.12}
\]
The polynomials in (4.11) are normalized such that $P_\lambda(\rho)=1$, as is clear from the symmetry relation (4.12) specialized to $\mu=0$ (since $P_0(\cdot)\equiv 1$). Even though for generic parameters the existence of the Macdonald polynomials of the form (4.10a), (4.10b) is guaranteed by Macdonald's work, it is a priori not entirely obvious that it is possible to specialize them to positive parameter values for $\alpha$, $g$ subject to the constraint in (3.5). The point is that for certain special values of the parameters the eigenvalues $E_r^{+}(\rho+\lambda)$ on the r.h.s. of (4.10b) may not be semisimple. This manifests itself through possible singularities in the expansion coefficients $c_{\lambda,\mu}$ of (4.10a) at such special parameter values. The next lemma ensures that for $\lambda\in\Lambda^+_M$ the eigenvalue $E_r^{+}(\rho+\lambda)$ is in fact semisimple for positive parameters $\alpha$, $g$ subject to condition (3.5) and, hence, that the Macdonald polynomials $p_\lambda(\cdot)$, $\lambda\in\Lambda^+_M$ indeed admit a well-defined specialization to these parameter values (without any singularities in the expansion coefficients being hit).

Lemma 4.5 (Semisimple spectrum). Let $\alpha$, $g$ be positive and subject to the condition (3.5). Then the elementary symmetric functions $E_1^{+}(x),\dots,E_N^{+}(x)$ (4.8) separate the points of the lattice $\rho+\Lambda^+_M$.

Proof. For $x,y\in E$ (2.1), we have that $E_r^{+}(x)=E_r^{+}(y)$ for $r=1,\dots,N$ if and only if
\[
x=\sigma(y)\quad \mathrm{mod}\ \frac{2\pi}{\alpha}Q,
\]
with $\sigma\in S_{N+1}$. If both $x$ and $y$ lie in the Weyl alcove $\Sigma_0$ (i.e. the open simplex characterized by the conditions (i) and (ii) of Sect. 2.3 with $g\downarrow 0$), then the only way in which this is possible is if $\sigma=\mathrm{id}$ and $x=y$. (The Weyl alcove $\Sigma_0$ determines a fundamental domain for the action of the affine Weyl group $S_{N+1}\ltimes(\frac{2\pi}{\alpha}Q)$ on $E$.) The lemma then follows because the conditions on the parameters guarantee that $\rho+\Lambda^+_M\subset\Sigma_g\subset\Sigma_0$.

After these preliminaries let us now introduce the wave function $\Psi_\lambda:(\rho+\Lambda^+_M)\to\mathbb{C}$ given by
\[
\Psi_\lambda(\rho+\mu)=\frac{1}{N_0^{1/2}}\,\Delta^{1/2}(\lambda)\,\Delta^{1/2}(\mu)\,P_\lambda(\rho+\mu),\qquad \lambda,\mu\in\Lambda^+_M. \tag{4.13}
\]
Notice that for $\lambda=0$ this wave function reduces to the factorized wave function $\Psi_0$ in (4.4). The next symmetry property is an immediate consequence of the symmetry relations (4.12) for the renormalized Macdonald polynomials $P_\lambda(x)$ (4.11).

Proposition 4.6 (Symmetry). One has that
\[
\Psi_\lambda(\rho+\mu)=\Psi_\mu(\rho+\lambda)\quad\text{for}\quad \lambda,\mu\in\Lambda^+_M.
\]
The function $\Psi_\lambda$ (4.13) turns out to be a joint eigenfunction of the operators $\hat H_1,\dots,\hat H_N$.

Proposition 4.7 (Diagonalization). Let us assume positive parameters $\alpha$, $g$ subject to the constraint (3.5) and let $\lambda\in\Lambda^+_M$. Then
\[
\hat H_r\Psi_\lambda=E_r(\rho+\lambda)\,\Psi_\lambda,\qquad r=1,\dots,N
\]
(where $\hat H_r$ and $E_r(\cdot)$ are given by (3.1), (3.9) and (4.8), respectively, and $\rho$ is taken from (2.13)).
Proof. One has that
\[
\begin{aligned}
(\hat H_r^{+}\Psi_\lambda)(\rho+\mu)&\overset{(3.9)}{=}\sum_{\nu\in S_{N+1}(\omega_r)}W_\nu^{+}(\rho+\mu)\,\Psi_\lambda(\rho+\mu+\nu)\\
&\overset{(i)}{=}\frac{1}{N_0^{1/2}}\,\Delta^{1/2}(\lambda)\,\Delta^{1/2}(\mu)\sum_{\nu\in S_{N+1}(\omega_r)}V_\nu(\rho+\mu)\,P_\lambda(\rho+\mu+\nu)\\
&\overset{(ii)}{=}E_r^{+}(\rho+\lambda)\,\Psi_\lambda(\rho+\mu),
\end{aligned}
\]
where we have used (i) Lemma 4.2 combined with the vanishing properties of the coefficients $W_\nu^{+}$ and $V_\nu$ at the boundary (cf. Lemma 3.2), and (ii) the defining difference equations for the Macdonald polynomials in (4.10b). The proposition now follows from the observation that $\hat H_r=(\hat H_r^{+}+\hat H_r^{-})/2$ and that $E_r(\cdot)=(E_r^{+}(\cdot)+E_r^{-}(\cdot))/2$ with $\hat H_r^{-}=\hat H_{N+1-r}^{+}$ and $E_r^{-}(\cdot)=E_{N+1-r}^{+}(\cdot)$.

It is clear from the proof of the above proposition that the functions $\Psi_\lambda$ (4.13) in fact diagonalize the operators $\hat H_r^{\pm}$ (3.9) individually:
\[
\hat H_r^{\pm}\Psi_\lambda=E_r^{\pm}(\rho+\lambda)\,\Psi_\lambda,\qquad \lambda\in\Lambda^+_M \tag{4.14}
\]
(with the parameters satisfying (3.5)). It is a priori not obvious that the functions $\Psi_\lambda$, $\lambda\in\Lambda^+_M$ actually span the Hilbert space $L^2(\rho+\Lambda^+_M)$, since in principle linear dependencies might arise between the Macdonald polynomials $p_\lambda(\cdot)$, $\lambda\in\Lambda^+_M$ upon restriction to the lattice $\rho+\Lambda^+_M$. However, the following result states that the functions $\Psi_\lambda$, $\lambda\in\Lambda^+_M$ in fact form an orthonormal basis of $L^2(\rho+\Lambda^+_M)$, thereby excluding the possibility of such linear dependencies. Phrased alternatively: the lattice evaluation homomorphism from the polynomial subspace $\mathrm{Span}\{m_\lambda\}_{\lambda\in\Lambda^+_M}$ to the space of complex functions over the lattice $\rho+\Lambda^+_M$, defined by the assignment $m_\lambda(x)\mapsto m_\lambda(\rho+\mu)$, is an isomorphism.

Proposition 4.8 (Orthonormality). For positive parameters $\alpha$, $g$ subject to condition (3.5), the functions $\Psi_\lambda:(\rho+\Lambda^+_M)\to\mathbb{C}$ in (4.13) form an orthonormal basis of $L^2(\rho+\Lambda^+_M)$, i.e.
\[
(\Psi_\lambda,\Psi_\mu)=\begin{cases}0 & \text{if } \lambda\neq\mu,\\ 1 & \text{if } \lambda=\mu\end{cases}
\]
($\lambda,\mu\in\Lambda^+_M$).

Proof. The orthogonality follows by applying the eigenvalue equation (4.14) to the equality $(\hat H_r^{+}\Psi_\lambda,\Psi_\mu)=(\Psi_\lambda,\hat H_r^{-}\Psi_\mu)$ (cf. the proof of Proposition 3.3) and using that the elementary symmetric functions $E_1^{+}(\cdot),\dots,E_N^{+}(\cdot)$ separate the points of $\rho+\Lambda^+_M$ (cf. Lemma 4.5). To see that the normalization of the wave functions is such that their $L^2$-norms are equal to 1, we apply the symmetry relations (Proposition 4.6) to the eigenvalue equations of Proposition 4.7. This leads to a system of difference equations for the wave functions in the spectral variable of the form
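\[
E_r(\rho+\lambda)\,\Psi_\mu(\rho+\lambda)=\frac{1}{2}\sum_{\nu\in S_{N+1}(\omega_r)}\Bigl(W_\nu^{+}(\rho+\mu)\,\Psi_{\mu+\nu}(\rho+\lambda)+W_\nu^{-}(\rho+\mu)\,\Psi_{\mu-\nu}(\rho+\lambda)\Bigr),
\]
$r=1,\dots,N$ (where $W_\nu^{\pm}$ is taken from (3.9), and the factor $\frac12$ stems from $\hat H_r=(\hat H_r^{+}+\hat H_r^{-})/2$). Applying the expansion on the r.h.s. to both sides of the equality $(E_r\Psi_\mu,\Psi_{\mu+\omega_r})=(\Psi_\mu,E_r\Psi_{\mu+\omega_r})$ and exploiting the orthogonality of the wave functions produces the relation
\[
W_{\omega_r}^{+}(\rho+\mu)\,(\Psi_{\mu+\omega_r},\Psi_{\mu+\omega_r})=W_{\omega_r}^{-}(\rho+\mu+\omega_r)\,(\Psi_\mu,\Psi_\mu).
\]
But then, since $W_{\omega_r}^{+}(\rho+\mu)=W_{\omega_r}^{-}(\rho+\mu+\omega_r)$ ($>0$) for $\mu,\mu+\omega_r\in\Lambda^+_M$ (see (3.9)), it is immediate that $(\Psi_\mu,\Psi_\mu)$ is independent of $\mu\in\Lambda^+_M$. The orthonormality now follows because for $\mu=0$ we have that $(\Psi_0,\Psi_0)=1$ in view of Proposition 4.4.

Remarks. i. The orthonormality of the wave functions $\Psi_\lambda$ (4.13) described by Proposition 4.8 can be rewritten in terms of discrete orthogonality relations for the Macdonald polynomials $p_\lambda(x)$ (in the monic normalization) or $P_\lambda(x)$ (in the symmetric normalization with $P_\lambda(\rho)=1$, cf. (4.11), (4.12)). The discrete orthogonality measure is supported on the lattice $\rho+\Lambda^+_M$ with positive weights given by $\Delta$ (4.1). Specifically, we conclude from Proposition 4.7 that for positive parameters $\alpha$, $g$ subject to the condition (3.5) one has that
\[
\sum_{\nu\in\Lambda^+_M}p_\lambda(\rho+\nu)\,\overline{p_\mu(\rho+\nu)}\,\Delta(\nu)=\begin{cases}0 & \text{if } \mu\neq\lambda,\\ N_0\,\dfrac{C_-(\lambda)}{C_+(\lambda)} & \text{if } \mu=\lambda,\end{cases} \tag{4.15a}
\]
or equivalently
\[
\sum_{\nu\in\Lambda^+_M}P_\lambda(\rho+\nu)\,\overline{P_\mu(\rho+\nu)}\,\Delta(\nu)=\begin{cases}0 & \text{if } \mu\neq\lambda,\\ \dfrac{N_0}{\Delta(\lambda)} & \text{if } \mu=\lambda,\end{cases} \tag{4.15b}
\]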
for $\lambda,\mu\in\Lambda^+_M$ (with $C_\pm$ and $N_0$ given by (4.2a), (4.2b) and Proposition 4.4, respectively).

ii. If we associate to each dominant weight vector $\lambda$ given by
\[
\lambda=\sum_{j=1}^{N}l_j\,\omega_j,\qquad l_j\in\mathbb{N} \tag{4.16a}
\]
a contragredient dominant weight $\lambda^*$ of the form
\[
\lambda^*=\sum_{j=1}^{N}l_{N+1-j}\,\omega_j, \tag{4.16b}
\]
then the mapping $\lambda\mapsto\lambda^*$ defines an involution of the cone of dominant weights $\Lambda^+$ (2.7). The Macdonald polynomials labeled by $\lambda$ and $\lambda^*$ are (for $\alpha$, $g$ real) related by complex conjugation:
\[
\overline{p_\lambda(x)}=p_{\lambda^*}(x),\qquad \lambda\in\Lambda^+. \tag{4.17}
\]
Indeed, the weight $-\lambda$ lies in the $S_{N+1}$-orbit of $\lambda^*$ and the vector $-\rho$ lies in the $S_{N+1}$-orbit of $\rho$, from which it is concluded that
\[
\overline{m_\lambda(x)}=m_\lambda(-x)=m_{\lambda^*}(x),\qquad \overline{E_r^{+}(\rho+\lambda)}=E_r^{-}(\rho+\lambda)=E_r^{+}(\rho+\lambda^*).
\]
Combining this with the observation that $\mu^*\prec\lambda^*$ if $\mu\prec\lambda$ then entails that the complex conjugate polynomial $\overline{p_\lambda(x)}$ satisfies the same conditions of the type in (4.10a), (4.10b) as the Macdonald polynomial $p_{\lambda^*}(x)$, whence the equality in (4.17) follows by the uniqueness of the Macdonald polynomials. The upshot is that by passing to linear combinations of the form
\[
\Psi^{C}_\lambda=\frac{\Psi_\lambda+\overline{\Psi_\lambda}}{\sqrt{2}},\qquad \lambda\in\Lambda^+_M, \tag{4.18a}
\]
\[
\Psi^{S}_\lambda=\frac{\Psi_\lambda-\overline{\Psi_\lambda}}{i\sqrt{2}},\qquad \lambda\in\Lambda^+_M \tag{4.18b}
\]
(thus in essence selecting the real and imaginary parts of the wave functions $\Psi_\lambda$ (4.13)), we arrive at real-valued eigenfunctions for the discrete difference operators $\hat H_1,\dots,\hat H_N$ (3.2a), (3.9):
\[
\hat H_r\Psi^{C/S}_\lambda=E_r(\rho+\lambda)\,\Psi^{C/S}_\lambda,\qquad \lambda\in\Lambda^+_M \tag{4.19}
\]
($r=1,\dots,N$). Notice, however, that, in contrast to the complex wave functions $\Psi_\lambda$ (4.13), the functions in (4.18a), (4.18b) do not diagonalize the operators $\hat H_1^{\pm},\dots,\hat H_N^{\pm}$ (3.9) individually. When $\lambda$ runs through the alcove $\Lambda^+_M$ (3.6), we of course count each eigenfunction $\Psi^{C}_\lambda$ and $\Psi^{S}_\lambda$ twice because
\[
\Psi^{C}_{\lambda^*}=\Psi^{C}_\lambda,\qquad \Psi^{S}_{\lambda^*}=-\Psi^{S}_\lambda,\qquad \lambda\in\Lambda^+_M. \tag{4.20}
\]
Eliminating this redundancy yields a real-valued orthonormal basis for the Hilbert space $L^2(\rho+\Lambda^+_M)$, consisting of the wave functions $\Psi^{C}_\lambda$, $\Psi^{S}_\lambda$ with $\lambda\in\Lambda^+_M$ mod $*$ (i.e., we pick the weights from a fundamental domain in $\Lambda^+_M$ with respect to the action of the involution $*$). The complex eigenbasis is recovered from the real eigenbasis by forming the combinations
\[
\Psi_\lambda=\frac{\Psi^{C}_\lambda+i\,\Psi^{S}_\lambda}{\sqrt{2}},\qquad \lambda\in\Lambda^+_M. \tag{4.21}
\]
5. Miscellanea

5.1. The eigenfunction transform. The weight alcove $\Lambda^+_M$ (3.6) labels the eigenbasis $\Psi_\lambda$ (4.13). If we identify this weight alcove with the lattice $\rho+\Lambda^+_M$ then we are led to a Discrete Fourier-type Transformation in the Hilbert space $L^2(\rho+\Lambda^+_M)$, the kernel of which is determined by the eigenfunctions. Let $F: L^2(\rho+\Lambda^+_M)\to L^2(\rho+\Lambda^+_M)$ be the discrete integral transformation
\[
(Ff)(\rho+\lambda)=\sum_{\mu\in\Lambda^+_M}F(\rho+\lambda,\rho+\mu)\,f(\rho+\mu) \tag{5.1a}
\]
with a kernel of the form
\[
F(\rho+\lambda,\rho+\mu):=\Psi_\lambda(\rho+\mu), \tag{5.1b}
\]
where $\Psi_\lambda(\rho+\mu)$ is taken from (4.13). Furthermore, let $E_r: L^2(\rho+\Lambda^+_M)\to L^2(\rho+\Lambda^+_M)$ denote the multiplication operator
\[
(E_rf)(\rho+\mu)=E_r(\rho+\mu)\,f(\rho+\mu)\qquad (r=1,\dots,N) \tag{5.2}
\]
with $E_r(\cdot)$ representing the real elementary symmetric function of (4.8). The main results of this paper may be conveniently summarized in the following three properties of the discrete integral transformation $F$ (5.1a), (5.1b):
\[
{}^{t}F=F,\qquad F^{*}=F^{-1} \tag{5.3a}
\]
and
\[
F\hat H_r=E_rF,\qquad r=1,\dots,N, \tag{5.3b}
\]
where it is assumed that the parameters satisfy condition (3.5). The first property states that the transpose ${}^{t}F$ of $F$ is equal to $F$, or in other words, that (the kernel of) the operator $F$ is symmetric. This is a consequence of the symmetry relation in Proposition 4.6. The second property states that the adjoint $F^{*}$ of $F$ in $L^2(\rho+\Lambda^+_M)$ equals the inverse of $F$, or in other words, that the operator $F$ is unitary. This is a consequence of the orthonormality relations for the kernel $\Psi_\lambda(\rho+\mu)$ in Proposition 4.8. Finally, the third property states that $F$ simultaneously diagonalizes the discrete difference operators $\hat H_1,\dots,\hat H_N$ (3.1), (3.9) in $L^2(\rho+\Lambda^+_M)$. This is seen by checking that both sides of (5.3b) act the same on the orthonormal eigenbasis $\Psi_\lambda$, $\lambda\in\Lambda^+_M$ (cf. Proposition 4.7 and also Remark ii at the end of Sect. 4).

The map $f\mapsto\hat f:=F^{*}f$ determines a Discrete Fourier-type Transformation in $L^2(\rho+\Lambda^+_M)$ of the form
\[
\hat f(\rho+\lambda)=(f,\Psi_\lambda),\qquad \lambda\in\Lambda^+_M \tag{5.4a}
\]
with the inversion formula given by
\[
f(\rho+\mu)=(\hat f,\overline{\Psi_\mu}),\qquad \mu\in\Lambda^+_M. \tag{5.4b}
\]
The coefficients $\hat f_\lambda:=\hat f(\rho+\lambda)$, $\lambda\in\Lambda^+_M$ solve the linear interpolation problem of decomposing an arbitrary lattice function $f\in L^2(\rho+\Lambda^+_M)$ in terms of the wave functions $\Psi_\lambda$, $\lambda\in\Lambda^+_M$:
\[
f=\sum_{\lambda\in\Lambda^+_M}\hat f_\lambda\,\Psi_\lambda,\qquad \hat f_\lambda=(f,\Psi_\lambda). \tag{5.5}
\]
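By passing to the real eigenbasis from Remark ii at the end of Sect. 4, which is given by $\Psi^{C}_\lambda$, $\Psi^{S}_\lambda$ (4.18a), (4.18b) with $\lambda\in\Lambda^+_M$ mod $*$, we arrive at analogs of the (discrete) Fourier cosine transform $F^{c}$,
\[
\hat f^{c}(\rho+\lambda)=(f,\Psi^{C}_\lambda),\qquad \lambda\in\Lambda^+_M\ \mathrm{mod}\ *, \tag{5.6a}
\]
\[
f(\rho+\mu)=(\hat f^{c},\Psi^{C}_\mu),\qquad \mu\in\Lambda^+_M\ \mathrm{mod}\ *, \tag{5.6b}
\]
and Fourier sine transform $F^{s}$,
\[
\hat f^{s}(\rho+\lambda)=(f,\Psi^{S}_\lambda),\qquad \lambda\in\Lambda^+_M\ \mathrm{mod}\ *, \tag{5.7a}
\]
\[
f(\rho+\mu)=(\hat f^{s},\Psi^{S}_\mu),\qquad \mu\in\Lambda^+_M\ \mathrm{mod}\ *, \tag{5.7b}
\]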
respectively. Here it is assumed that $f\in L^2(\rho+\Lambda^+_M)$ is a lattice function that is symmetric (for the cosine transform) or antisymmetric (for the sine transform) with respect to the action of the involution $*$ on the lattice $\rho+\Lambda^+_M$. (The involution $*$ acts on the lattice $\rho+\Lambda^+_M$ by $(\rho+\lambda)\mapsto(\rho+\lambda^*)$, cf. Remark ii at the end of Sect. 4.)

The integral transformation $F$ (5.1a), (5.1b) conjugates (cf. (5)) the quantum Hamiltonians $\hat H_1,\dots,\hat H_N$ (3.1), (3.9) to the multiplication operators $E_1,\dots,E_N$ (5.2). The spectrum of $\hat H_r$ in $L^2(\rho+\Lambda^+_M)$ is (thus) given by the range of the real elementary symmetric function $E_r(\cdot)=\sum_{\nu\in S_{N+1}(\omega_r)}\cos\alpha\langle\nu,\cdot\rangle$ on the lattice $\rho+\Lambda^+_M$. The eigenfunction transform $F$ amounts to the quantum counterpart of the action-angle transformation $\phi$ for the classical system, which was found in explicit form by Ruijsenaars [R3]. The action-angle transform at issue constitutes an (anti)symplectomorphism $z\overset{\phi}{\to}\breve z$ of the classical phase space $(\mathbb{CP}^N,\omega_R)$, giving rise to new canonical coordinates $(\breve x,\breve p)$ of the form (cf. (2.19a), (2.19b))
\[
\langle a_j,\breve p\rangle=(2\pi/\alpha-(N+1)g)\,\frac{|\breve z_j|^2}{|\breve z_0|^2+\cdots+|\breve z_N|^2}+g, \tag{5.8a}
\]
\[
e^{i\langle\omega_j,\breve x\rangle}=\frac{\breve z_j\,|\breve z_0|}{\breve z_0\,|\breve z_j|} \tag{5.8b}
\]
($j=1,\dots,N$) on an open dense patch $\{[\breve z_0:\cdots:\breve z_N]\mid \breve z_j\neq 0,\ j=0,\dots,N\}$ of $\mathbb{CP}^N$. In these new coordinates the classical Hamiltonians $H_r(x,p)$ (2.15) take the form [R3]
\[
\breve H_r(\breve x,\breve p)=\sum_{\nu\in S_{N+1}(\omega_r)}\cos\alpha\langle\nu,\breve p\rangle,\qquad r=1,\dots,N. \tag{5.9}
\]
(This is to be compared with the $F$-transformation of the quantum Hamiltonian $\hat H_r$ to the multiplication operator $E_r$ in (5.3b).) That is, in the new coordinates the classical Hamiltonians depend only on the action variables $\breve p\in\Sigma_g$ and are independent of the angle variables $\breve x\in T=E/(2\pi Q)$ (cf. Sect. 2.3). Thus, the "spectrum" (i.e. the range) of the classical Hamiltonian $H_r$ (2.15) on the compactified phase space $\mathbb{CP}^N$ is given by the range of $\breve H_r$ (5.9) on the convex polytope $\{\breve p\mid\breve p\in\Sigma_g\}$ (cf. (2.22)). (Notice that we may again continue the action variables $\breve p$ (5.8a) smoothly to the whole of $\mathbb{CP}^N$, unlike the angle variables $\breve x$ (5.8b), cf. Sect. 2.3.) At this point it is worthwhile to mention that the convexity of the range of the action variables for our model is in agreement with the general convexity results for Hamiltonian systems on compact symplectic manifolds due to Atiyah [At] and Guillemin–Sternberg [GS1]. We see that, roughly speaking, the quantization of the model discretizes the spectrum of the Hamiltonians as if the action variables $\breve p$ get localized on the lattice $\rho+\Lambda^+_M\subset\Sigma_g$. It was furthermore shown by Ruijsenaars [R3] that the vertices $\breve p=\rho,\ \rho+M\omega_r$ ($r=1,\dots,N$) of the convex polytope $\{\breve p\mid\breve p\in\Sigma_g\}$ correspond to the equilibrium points of the flows generated by the classical Hamiltonians $H_r$ (2.15) (cf. also Sect. 5.3 below). This state of affairs is again in agreement with the general picture presented by Atiyah and Guillemin–Sternberg [At, GS1].

A remarkable property of the action-angle transform $\phi$ is that it in fact defines an (anti)symplectic involution on $(\mathbb{CP}^N,\omega_R)$ (i.e. the model is "self-dual" in the terminology of Ruijsenaars) [R3]. The analogous property of the eigenfunction transform $F$ (5.1a), (5.1b) states that the corresponding discrete integral transform defines a (discrete)
Fourier-type involution (i.e. an involution up to complex conjugation) in the Hilbert space $L^2(\rho+\Lambda^+_M)$:
\[
{}^{t}F^{*}\,F=\mathrm{Id}, \tag{5.10}
\]
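cf. (5). (The kernels of ${}^{t}F^{*}$ and $F$ are the same up to complex conjugation.)

5.2. The degeneration $g\downarrow 0$: Discrete Fourier analysis on a Weyl alcove. For $g\downarrow 0$, the eigenfunction transform $F$ (5.1a), (5.1b) degenerates to a discrete integral transform $\tilde F: L^2(\Lambda^+_M)\to L^2(\Lambda^+_M)$ characterized by a kernel of the form
\[
\tilde F(\lambda,\mu)=\frac{1}{\tilde N_0^{1/2}}\,\tilde\Delta^{1/2}(\lambda)\,\tilde\Delta^{1/2}(\mu)\,\tilde P_\lambda(\mu),\qquad \lambda,\mu\in\Lambda^+_M, \tag{5.11}
\]
where $\tilde N_0=(N+1)M^{N}$,
\[
\tilde\Delta(\mu)=\frac{1}{\tilde C_+(\mu)\,\tilde C_-(\mu)},\qquad \mu\in\Lambda^+_M,
\]
with
\[
\tilde C_+(\mu)=\prod_{\substack{a\in A_N^+\\ \langle a,\mu\rangle>0}}\frac{\langle a,\tilde\rho\rangle}{1+\langle a,\tilde\rho\rangle},\qquad
\tilde C_-(\mu)=\prod_{\substack{a\in A_N^+\\ \langle a,\mu\rangle=M}}\frac{N+2-\langle a,\tilde\rho\rangle}{N+1-\langle a,\tilde\rho\rangle}
\]
and $\tilde\rho=\sum_{a\in A_N^+}a/2=\sum_{1\le j\le N+1}(N/2+1-j)\,e_j=\omega_1+\cdots+\omega_N$. The corresponding degeneration of the Macdonald polynomials $\tilde P_\lambda(x)$ is given by the renormalized monomial symmetric functions (cf. (4.11))
\[
\tilde P_\lambda(x)=\tilde C_+(\lambda)\,m_\lambda(x),\qquad \lambda\in\Lambda^+,
\]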
where the trigonometric period is fixed such that $\alpha=2\pi/M$ (cf. Condition (3.5)). The discrete integral operator $\tilde F: L^2(\Lambda^+_M)\to L^2(\Lambda^+_M)$ inherits from $F$ (5.1a), (5.1b) the symmetry and unitarity properties (cf. (5.3a))
\[
{}^{t}\tilde F=\tilde F,\qquad \tilde F^{*}=\tilde F^{-1} \tag{5.12}
\]
(where the transpose and adjoint are now meant in $L^2(\Lambda^+_M)$). The corresponding Fourier pairing $f\to\hat f:=\tilde F^{*}f$ in $L^2(\Lambda^+_M)$ takes the form (cf. (5.4a), (5.4b))
\[
\hat f(\lambda)=(f,\tilde\Psi_\lambda),\qquad \lambda\in\Lambda^+_M, \tag{5.13a}
\]
with the inversion formula
\[
f(\mu)=(\hat f,\overline{\tilde\Psi_\mu}),\qquad \mu\in\Lambda^+_M, \tag{5.13b}
\]
where the function $\tilde\Psi_\lambda:\Lambda^+_M\to\mathbb{C}$ is defined by $\tilde\Psi_\lambda(\mu):=\tilde F(\lambda,\mu)$, $\mu\in\Lambda^+_M$. (In (5.13a), (5.13b) the inner product is of course meant to be the standard inner product of $L^2(\Lambda^+_M)$ given by $(f,h)=\sum_{\mu\in\Lambda^+_M}f(\mu)\overline{h(\mu)}$, cf. (3.7).)

The normalizing c-function $\tilde C_+(\lambda)$ as well as the weights of the Haar measure $\tilde\Delta(\mu)$ and Plancherel measure $\tilde\Delta(\lambda)$ admit in this degenerate situation a very simple combinatorial interpretation. (The fact that the Haar measure and Plancherel measure coincide is a reflection of the symmetry property ${}^{t}\tilde F=\tilde F$ enjoyed by the Discrete Fourier Transform $\tilde F$.) Specifically, the quantity $1/\tilde C_+(\lambda)$ amounts to a positive integer counting the number of points in the weight lattice $\Lambda$ that lie on the $S_{N+1}$-orbit of the dominant weight vector $\lambda\in\Lambda^+$. With this interpretation in mind, it is actually seen right away from the definition that the renormalized monomial symmetric function $\tilde P_\lambda(x)$ assumes the value 1 in $x=0$ and, moreover, that $\tilde P_\lambda(\mu)$ is symmetric with respect to an interchange of $\lambda$ and $\mu$ in $\Lambda^+$ (cf. (4.12)). Similarly, the quantity $\tilde\Delta(\mu)$ counts the number of points of the $S_{N+1}$-orbit through $\mu\in\Lambda^+_M$ in the finite lattice $\Lambda/(M\cdot Q)$ (i.e. the weight lattice modulo the root lattice multiplied by $M$). In other words, here one identifies the weights if they differ by a vector in the dilated root lattice $M\cdot Q$. The $g\downarrow 0$ degeneration of the normalization formula in Proposition 4.4 gives rise to the identity $\sum_{\mu\in\Lambda^+_M}\tilde\Delta(\mu)=\tilde N_0$, or more explicitly:
\[
\sum_{\mu\in\Lambda^+_M}\ \prod_{\substack{a\in A_N^+\\ \langle a,\mu\rangle>0}}\frac{1+\langle a,\tilde\rho\rangle}{\langle a,\tilde\rho\rangle}\ \prod_{\substack{a\in A_N^+\\ \langle a,\mu\rangle=M}}\frac{N+1-\langle a,\tilde\rho\rangle}{N+2-\langle a,\tilde\rho\rangle}=(N+1)M^{N}. \tag{5.14}
\]
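The counting identity (5.14) lends itself to a small exact check. The following is a minimal sketch under explicit assumptions, not part of the original derivation: the positive roots are taken to be $a=e_i-e_j$ ($i<j$), so that $\langle a,\tilde\rho\rangle=j-i$ and $\langle a,\mu\rangle=l_i+\cdots+l_{j-1}$ for $\mu=\sum_j l_j\omega_j$, and the alcove $\Lambda^+_M$ is taken to be $\{l_j\ge 0,\ l_1+\cdots+l_N\le M\}$. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def delta_tilde(l, M):
    # Weight Delta~(mu) = 1/(C~_+(mu) C~_-(mu)) for mu = sum_j l[j-1]*omega_j.
    N = len(l)
    w = Fraction(1)
    for i in range(1, N + 1):
        for j in range(i + 1, N + 2):
            height = j - i                  # <a, rho~> for a = e_i - e_j (assumed)
            pairing = sum(l[i - 1:j - 1])   # <a, mu> = l_i + ... + l_{j-1}
            if pairing > 0:
                w *= Fraction(1 + height, height)
            if pairing == M:
                w *= Fraction(N + 1 - height, N + 2 - height)
    return w

def alcove(N, M):
    # Assumed description of the weight alcove: l_j >= 0 and l_1 + ... + l_N <= M.
    return [l for l in product(range(M + 1), repeat=N) if sum(l) <= M]

def total_weight(N, M):
    # Left-hand side of (5.14), computed exactly with rational arithmetic.
    return sum(delta_tilde(l, M) for l in alcove(N, M))

if __name__ == "__main__":
    for N, M in [(1, 5), (2, 1), (2, 2), (3, 1)]:
        print(N, M, total_weight(N, M), (N + 1) * M ** N)
```

For $N=1$ the weights come out as $1,2,\dots,2,1$, the orbit sizes of $m\mapsto -m$ in $\mathbb{Z}/2M$, which matches the combinatorial interpretation given in the text.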
This is the $g\downarrow 0$ degeneration of the terminating Aomoto–Ito–Macdonald summation formula of Appendix A (cf. Eq. (A.4)). The l.h.s. of (5.14) describes a count of the $(N+1)M^{N}$ points in the lattice $\Lambda/(M\cdot Q)$, by summing the orders of the $S_{N+1}$-orbits through the weights in the fundamental domain $\Lambda^+_M$.

The mapping $\tilde F: L^2(\Lambda^+_M)\to L^2(\Lambda^+_M)$ defines a Discrete Fourier Transform on the uniform lattice $\Lambda^+_M$ over (the closure of) the dilated Weyl alcove $\Sigma_0$ (cf. (2.22) with $g=0$). This Fourier transform may be seen as a reduction to the subspace of the $S_{N+1}$-invariant functions of the standard Discrete Fourier Transform on the lattice $\Lambda/(M\cdot Q)$, the kernel of which is given by $\tilde N_0^{-1/2}\exp(\frac{2\pi i}{M}\langle\lambda,\mu\rangle)$, $\lambda,\mu\in\Lambda/(M\cdot Q)$. We thus conclude that the eigenfunction transform $F$ (5.1a), (5.1b) constitutes a one-parameter deformation of (the permutation-invariant part of) this standard Discrete Fourier Transformation, with the coupling parameter $g$ playing the role of the deformation parameter.

5.3. Ground-state vs maximal energy wave function. Let
\[
E(x)=\sum_{j=1}^{N+1}\cos(\alpha x_j). \tag{5.15}
\]
Notice that on the hyperplane $x_1+\cdots+x_{N+1}=0$ this function coincides with the first (real) elementary symmetric function $E_1(x)=\sum_{\nu\in S_{N+1}(\omega_1)}\cos\alpha\langle\nu,x\rangle$ (cf. (4.8)). It is not very difficult to infer that the critical points of $E(x)$ on the simplex $\Sigma_g$ (2.22) are located at the $N+1$ vertices
\[
\rho\qquad\text{and}\qquad \rho+M\omega_r,\quad r=1,\dots,N, \tag{5.16}
\]
where $M=\frac{2\pi}{\alpha}-(N+1)g$. The (critical) values of $E(x)$ evaluated at these vertices read (cf. (4.6))
\[
E(\rho)=\frac{\sin\frac{\alpha g}{2}(N+1)}{\sin(\frac{\alpha g}{2})}, \tag{5.17a}
\]
\[
E(\rho+M\omega_r)=\cos\Bigl(\frac{2\pi r}{N+1}\Bigr)E(\rho),\qquad r=1,\dots,N. \tag{5.17b}
\]
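The vertex values (5.17a), (5.17b) can be confirmed numerically. The sketch below is an illustration only and assumes the coordinate realization $\rho_j=g(N/2+1-j)$ and $\omega_r=e_1+\cdots+e_r-\frac{r}{N+1}(e_1+\cdots+e_{N+1})$, which are the standard choices but are not spelled out in this section. The function names are hypothetical.

```python
import math

def E(x, alpha):
    # The function E(x) of (5.15).
    return sum(math.cos(alpha * xj) for xj in x)

def vertex(N, r, alpha, g):
    # Coordinates of rho + M*omega_r (r = 0 gives rho itself), with
    # M = 2*pi/alpha - (N+1)*g, assuming rho_j = g*(N/2 + 1 - j) and
    # omega_r = e_1 + ... + e_r - r/(N+1) * (e_1 + ... + e_{N+1}).
    M = 2 * math.pi / alpha - (N + 1) * g
    return [g * (N / 2 + 1 - j) + M * ((1 if j <= r else 0) - r / (N + 1))
            for j in range(1, N + 2)]

if __name__ == "__main__":
    N, M, g = 4, 3, 0.25
    alpha = 2 * math.pi / (M + (N + 1) * g)
    e_rho = E(vertex(N, 0, alpha, g), alpha)
    print("E(rho):", e_rho, math.sin(alpha * g * (N + 1) / 2) / math.sin(alpha * g / 2))
    for r in range(1, N + 1):
        print(r, E(vertex(N, r, alpha, g), alpha),
              math.cos(2 * math.pi * r / (N + 1)) * e_rho)
```

In each printed row the two values should coincide up to rounding error.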
We thus conclude (cf. Proposition 4.7) that the maximal eigenvalue of the Hamiltonian $\hat H_1$ (3.1), (3.9) in the Hilbert space $L^2(\rho+\Lambda^+_M)$ has the value $E_1(\rho)=\sin\frac{\alpha}{2}(N+1)g/\sin(\frac{\alpha g}{2})$. The corresponding eigenfunction is given by the factorized wave function $\Psi_0$ in (4.4). For $N$ odd the minimal eigenvalue reads $E_1(\rho+M\omega_{(N+1)/2})=-E_1(\rho)$, whereas for $N$ even the minimal eigenvalue is twofold degenerate and given by $E_1(\rho+M\omega_{N/2})=E_1(\rho+M\omega_{N/2+1})=\cos(\frac{\pi N}{N+1})E_1(\rho)$.

The "critical" or "vertex" eigenvalues of the Hamiltonian $\hat H_1$ in (5.17a) and (5.17b) coincide with the equilibrium values of the corresponding classical Hamiltonian $H_1$ (2.15) at the stationary points computed by Ruijsenaars [R3, Sect. 5.3] (cf. Sect. 5.1 above). In particular, the global minimal and maximal energies of the Hamiltonian $\hat H_1$/$H_1$ read the same at the quantum level as they do at the classical level. In other words, the energy levels get discretized at the quantum level (cf. Sect. 5.1 above) but there is no shift of the energy spectrum due to the quantization (as e.g. in the case of a harmonic oscillator).

From a physical point of view it is often somewhat more natural to work with a nonnegative Hamiltonian. This can be achieved by passing to difference operators of the form
\[
\begin{aligned}
\tilde H_r&=E_r(\rho)-\hat H_r\\
&=\frac{1}{2}\sum_{\nu\in S_{N+1}(\omega_r)}\Bigl(V_\nu(x)+V_\nu(-x)-V_\nu^{1/2}(x)\,T_\nu\,V_\nu^{1/2}(-x)-V_\nu^{1/2}(-x)\,T_\nu^{-1}\,V_\nu^{1/2}(x)\Bigr),
\end{aligned} \tag{5.18}
\]
$r=1,\dots,N$ (where we have again employed the Macdonald identity from the proof of Proposition 4.3 to pass from the first to the second formula on the r.h.s.). The factorized eigenfunction $\Psi_0$ (4.4) amounts to the ground-state wave function for the Hamiltonian $\tilde H_1$ (5.18), with the corresponding eigenvalue being equal to zero.

5.4. The two-particle solution. In the case of two particles, i.e. for $N=1$, the quantum version of the compactified Ruijsenaars–Schneider model was introduced and solved already several years ago by Ruijsenaars in the survey paper [R2] (see Sect. 3C2). It is instructive to view how, in this special situation, our results reproduce those previously obtained by Ruijsenaars. The difference operator $\hat H_1$ (3.1) ($=\hat H_1^{+}$ (3.2a) $=\hat H_1^{-}$ (3.2b)) reduces to
\[
\hat H=\Bigl(\frac{\sin\frac{\alpha}{2}(x+g)}{\sin(\frac{\alpha x}{2})}\Bigr)^{1/2}e^{\frac{d}{dx}}\Bigl(\frac{\sin\frac{\alpha}{2}(x-g)}{\sin(\frac{\alpha x}{2})}\Bigr)^{1/2}+\Bigl(\frac{\sin\frac{\alpha}{2}(x-g)}{\sin(\frac{\alpha x}{2})}\Bigr)^{1/2}e^{-\frac{d}{dx}}\Bigl(\frac{\sin\frac{\alpha}{2}(x+g)}{\sin(\frac{\alpha x}{2})}\Bigr)^{1/2} \tag{5.19}
\]
with $x=x_1-x_2$. For $\alpha,g>0$ and
\[
\frac{2\pi}{\alpha}-2g=M\in\mathbb{N}\setminus\{0\}, \tag{5.20}
\]
the operator $\hat H$ (5.19) becomes self-adjoint upon restriction to the Hilbert space $L^2(g+\mathbb{Z}_{M+1})$ over the lattice
\[
g+\mathbb{Z}_{M+1}=\{g,\,g+1,\,g+2,\,\dots,\,g+M\} \tag{5.21}
\]
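(cf. Sect. 3.2). An orthonormal basis of eigenfunctions $\Psi_l : (g+\mathbb{Z}_{M+1}) \to \mathbb{R}$ for $\hat H$ (5.19) is given by (cf. Sect. 4, in particular Eqs. (4.13), (4.1) and Proposition 4.4)
$$\Psi_l(g+m) = \frac{1}{\mathcal{N}_0^{1/2}}\,\Delta^{1/2}(l)\,\Delta^{1/2}(m)\,P_l\bigl(\cos\tfrac{\alpha}{2}(g+m)\bigr), \tag{5.22}$$
$l, m = 0,\dots,M$, with
$$\Delta(m) = \frac{\sin\frac{\alpha}{2}(m+g)}{\sin(\frac{\alpha g}{2})}\,\frac{(2g:\sin_\alpha)_m}{(1:\sin_\alpha)_m} = \frac{\sin\frac{\alpha}{2}(m+g)}{\sin(\frac{\alpha g}{2})}\prod_{j=1}^{m}\frac{\sin\frac{\alpha}{2}(2g+j-1)}{\sin(\frac{\alpha j}{2})},$$
$$P_l\bigl(\cos\tfrac{\alpha}{2}(x)\bigr) = q^{l(x-g)/2}\;{}_3\phi_2\!\left(\begin{matrix} q^{-l},\ q^{g},\ q^{g-x}\\ q^{2g},\ 0\end{matrix}\,;\,q,\,q\right),\qquad q = e^{i\alpha},$$
($\alpha = 2\pi/(2g+M)$) and
$$\mathcal{N}_0 = 2^M (1+g:\sin_\alpha)_{M-1} = 2^M\prod_{k=1}^{M-1}\sin\tfrac{\alpha}{2}(k+g).$$
Specifically, one has that (cf. Proposition 4.7)
$$\hat H\,\Psi_l = 2\cos\bigl(\tfrac{\alpha}{2}(g+l)\bigr)\,\Psi_l,\qquad l = 0,\dots,M, \tag{5.23}$$
and that (cf. Proposition 4.8)
$$\sum_{m=0}^{M}\Psi_l(g+m)\,\Psi_k(g+m) = \delta_{l,k},\qquad l,k\in\{0,\dots,M\}. \tag{5.24}$$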
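The two-particle formulas (5.19)–(5.24) lend themselves to a direct numerical sanity check. The following is a minimal sketch, not part of the original derivation: it assumes that on the lattice point $x = g+m$ the operator (5.19) acts as a symmetric tridiagonal matrix with vanishing diagonal and hopping coefficients $V^{1/2}(x)V^{1/2}(-(x+1))$, where $V(x) = \sin\frac{\alpha}{2}(x+g)/\sin(\frac{\alpha x}{2})$, and that the positive square roots are meant. The function names and the sample values $g = 0.7$, $M = 5$ are illustrative choices.

```python
import cmath
import math

def two_particle_basis(g, M):
    """Return (alpha, Psi) with Psi[l][m] = Psi_l(g + m) as in (5.22)."""
    alpha = 2 * math.pi / (2 * g + M)            # quantization condition (5.20)
    qpow = lambda e: cmath.exp(1j * alpha * e)   # q**e for q = exp(i*alpha)
    s = lambda y: math.sin(alpha * y / 2)

    def poch(e, k):
        # (q**e; q)_k = prod_{j<k} (1 - q**(e + j))
        out = 1.0 + 0j
        for j in range(k):
            out *= 1 - qpow(e + j)
        return out

    def delta(m):
        out = s(m + g) / s(g)
        for j in range(1, m + 1):
            out *= s(2 * g + j - 1) / s(j)
        return out

    def P(l, m):
        # terminating 3phi2 at x = g + m; the lower parameter 0 gives (0; q)_k = 1
        total = 0j
        for k in range(min(l, m) + 1):
            total += (poch(-l, k) * poch(g, k) * poch(-m, k)
                      / (poch(1, k) * poch(2 * g, k))) * qpow(k)
        return (qpow(l * m / 2) * total).real

    n0 = 2.0 ** M
    for k in range(1, M):
        n0 *= s(k + g)
    psi = [[math.sqrt(delta(l) * delta(m) / n0) * P(l, m) for m in range(M + 1)]
           for l in range(M + 1)]
    return alpha, psi

def apply_H(g, M, alpha, vec):
    """Apply the lattice restriction of (5.19) (assumed tridiagonal form)."""
    s = lambda y: math.sin(alpha * y / 2)
    # c[m] couples the points g+m and g+m+1; the coupling beyond m = M vanishes
    c = [math.sqrt(s(2 * g + m) * s(m + 1) / (s(g + m) * s(g + m + 1)))
         for m in range(M)]
    out = []
    for m in range(M + 1):
        up = c[m] * vec[m + 1] if m < M else 0.0
        down = c[m - 1] * vec[m - 1] if m > 0 else 0.0
        out.append(up + down)
    return out

g, M = 0.7, 5
alpha, Psi = two_particle_basis(g, M)
# largest violation of the eigenvalue equation (5.23)
residual = max(abs(h - 2 * math.cos(alpha * (g + l) / 2) * p)
               for l in range(M + 1)
               for h, p in zip(apply_H(g, M, alpha, Psi[l]), Psi[l]))
# largest violation of the orthonormality relations (5.24)
gram_error = max(abs(sum(Psi[l][m] * Psi[k][m] for m in range(M + 1)) - (l == k))
                 for l in range(M + 1) for k in range(M + 1))
```

Both `residual` and `gram_error` should be at the level of floating-point rounding if the formulas have been transcribed correctly.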
Clearly, the maximal and minimal eigenvalues of the Hamiltonian $\hat H$ are given by $2\cos(\frac{\alpha g}{2})$ and $2\cos\frac{\alpha}{2}(g+M) = -2\cos(\frac{\alpha g}{2})$, respectively (cf. Sect. 5.3). The ($A_1$-type Macdonald) polynomials $P_l(\cos\frac{\alpha}{2}(x))$ coincide up to a normalization factor with the $q$-ultraspherical polynomials and can thus be explicitly written (in various ways) in terms of terminating basic hypergeometric series [GR, KS]. For our purposes it is convenient to employ the above representation in terms of a terminating ${}_3\phi_2$ series, as this manifestly demonstrates the symmetry of the wave function $\Psi_l(g+m)$ (5.22) with respect to an interchange of $l$ and $m$ (cf. Proposition 4.6). The orthogonality relations (5.24) for the basis $\Psi_l$, $l = 0,\dots,M$ boil down to reductions to special parameter values of well-known discrete orthogonality relations for the $q$-Racah polynomials due to Askey and Wilson [AW, GR, KS]. To see this, one substitutes $a = -q^{2g-1}$, $b = -1$, $c = d = q^{g-1/2}$ with $q = \exp(\frac{\pi i}{2g+M})$ in the ${}_4\phi_3$ representation for the $q$-Racah polynomial $P_l^{qR}(x_m)$ [GR, Eq. (7.2.11)]. (Notice that the $q$ substituted in the $q$-Racah polynomial is the square root of the $q = e^{i\alpha}$ employed in the above ${}_3\phi_2$ formula.) With the help of Singh’s quadratic transformation [GR, Eq. (3.10.13)], one arrives at the usual ${}_4\phi_3$ representation for the $q$-ultraspherical polynomial $P_l(\cos\frac{\alpha}{2}(g+m))$ normalized such that $P_l(\cos\frac{\alpha}{2}(g)) = 1$:
$$P_l\bigl(\cos\tfrac{\alpha}{2}(g+m)\bigr) = {}_4\phi_3\!\left(\begin{matrix} q^{-l},\ q^{2g+l},\ q^{-m/2},\ q^{g+m/2}\\ -q^{g},\ q^{g+1/2},\ -q^{g+1/2}\end{matrix}\,;\,q,\,q\right)$$
with $q = e^{i\alpha}$, $\alpha = \pi/(g+M/2)$ (cf. [GR, Eq. (7.4.14)] with $\beta = q^g = e^{i\alpha g}$ and $\theta = -\alpha(g+m)/2$). Passing to the corresponding ${}_3\phi_2$ representation for the $q$-ultraspherical polynomials (see [GR, Ex. 1.29]) reproduces the above ${}_3\phi_2$ formula for $P_l(\cos\frac{\alpha}{2}(g+m))$. Moreover, the orthogonality relations for the $q$-Racah polynomials with truncation condition of the type $aq = q^{-M}$ (see Eqs. (7.1.8), (7.2.2), (7.2.4) and (7.2.15) of [GR]) pass with these substitutions over into the orthogonality relations
$$\sum_{m\in\mathbb{Z}_{M+1}} P_l\bigl(\cos\tfrac{\alpha}{2}(g+m)\bigr)\,P_k\bigl(\cos\tfrac{\alpha}{2}(g+m)\bigr)\,\Delta(m) = \frac{\mathcal{N}_0}{\Delta(l)}\,\delta_{l,k},\qquad l,k\in\mathbb{Z}_{M+1},$$
which are compatible with (5.24).
Notice that the rank-one case $N = 1$ is very special in the sense that the reflection $\lambda\mapsto-\lambda$ lies in the Weyl group (it amounts to the Weyl-permutation $(x_1,x_2)\mapsto(x_2,x_1)$ restricted to the hyperplane $E = \{(x_1,x_2)\in\mathbb{R}^2 \mid x_1+x_2 = 0\}$). As a consequence, the involution $*$ of Remark ii at the end of Sect. 4 reduces in this special case to the identity. Indeed, we have for $N = 1$ that the wave function $\Psi_\lambda$ (4.13) is real-valued, whence $\Psi_\lambda = \Psi_\lambda^C$ (4.18a) and $\Psi_\lambda^S = 0$ (4.18b). In other words, the generalized discrete Fourier transform $F = F^*$ (5.4a), (5.4b) and the generalized discrete Fourier cosine transform $F^c$ (5.6a), (5.6b) coincide in the rank-one case and the generalized discrete Fourier sine transform $F^s$ (5.7a), (5.7b) collapses. For $g\downarrow 0$, the eigenfunctions $\Psi_l$ (5.22) reduce to an orthonormal basis of $L^2(\mathbb{Z}_{M+1})$ of the form
$$\tilde\Psi_l(m) = \frac{1}{\sqrt{2M}}\,\tilde\Delta^{1/2}(l)\,\tilde\Delta^{1/2}(m)\,\cos\Bigl(\frac{\pi l m}{M}\Bigr)\qquad (l,m\in\mathbb{Z}_{M+1}), \tag{5.25}$$
with
$$\tilde\Delta(m) = \begin{cases} 1 & \text{for } m = 0, M,\\ 2 & \text{for } m = 1,\dots,M-1.\end{cases}$$
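The orthonormality of the limiting basis (5.25) is easy to confirm numerically. The sketch below (with illustrative function names and the arbitrary choice $M = 6$) builds the kernel $\tilde\Psi_l(m)$ and measures the deviation of its Gram matrix from the identity; this is the familiar orthogonality of a type-I discrete cosine transform with end-point weights.

```python
import math

def limiting_basis(M):
    """Kernel Psi~_l(m) of (5.25) as an (M+1) x (M+1) nested list."""
    w = [1.0 if m in (0, M) else 2.0 for m in range(M + 1)]   # Delta~(m)
    return [[math.sqrt(w[l] * w[m] / (2 * M)) * math.cos(math.pi * l * m / M)
             for m in range(M + 1)] for l in range(M + 1)]

def gram_defect(F):
    """Largest entry of |F F^T - identity|."""
    n = len(F)
    return max(abs(sum(F[l][m] * F[k][m] for m in range(n)) - (l == k))
               for l in range(n) for k in range(n))

F = limiting_basis(6)
defect = gram_defect(F)
```

Since the kernel is symmetric in $l$ and $m$, the same computation also shows that the transform is an involution.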
The corresponding Discrete Fourier Transformation $\tilde F$ in $L^2(\mathbb{Z}_{M+1})$, which has a kernel given by $\tilde F(l,m) = \tilde\Psi_l(m)$ (5.25), can be seen as a reduction to the even subspace of the standard Discrete Fourier Transformation in $L^2(\mathbb{Z}_{2M})$ determined by the kernel $\exp(i\pi lm/M)/\sqrt{2M}$ (with $l,m\in\mathbb{Z}_{2M}$).

5.5. Geometric quantization. In the light of the fact that the phase space for the classical compactified Ruijsenaars–Schneider model is given by the complex projective space $\mathbb{CP}^N$ equipped with the renormalized Fubini-Study symplectic form $\omega_R$ (2.20), it is natural to ask to what extent our results may be recovered within the realm of geometric quantization. In this formalism a Hilbert space is associated to the classical phase space $(\mathbb{CP}^N,\omega_R)$ in two steps (see e.g. [Si, Hu]). In the first step (prequantization), the question is to construct a Hermitian line bundle $L$ with connection $\nabla$ over $\mathbb{CP}^N$ such that the curvature of $\nabla$ equals $\omega_R$. Such a line bundle exists provided $\omega_R$ satisfies the integrality condition
$$\oint\omega_R = 2\pi M\quad\text{with } M\in\mathbb{N}\setminus\{0\}, \tag{5.26}$$
where the integration is over a complex projective line in $\mathbb{CP}^N$ (or more generally an integral two-cycle in the homology basis). (Geometrically, this condition means that $\omega_R$ belongs to an integer cohomology class: $[\omega_R]\in H^2(\mathbb{CP}^N,\mathbb{Z})$.) The prequantum Hilbert space now consists of the space of $L^2$ sections of the line bundle $L$ (where the measure of integration is taken to be the Liouville volume form associated to $\omega_R$). After recalling that the normalization of $\omega_R$ is such that $\oint\omega_R = 4\pi R^2$ (2.21), it is seen that the integrality condition in (5.26) amounts precisely to our quantization condition (3.5). (This observation was already made by Ruijsenaars in [R3, Sect. 1.3].)
Unfortunately, the (prequantum) Hilbert space thus obtained is too big. Roughly speaking, it corresponds to an “$L^2$ space over the phase space” whereas from a physical point of view one is interested rather in the analog of an “$L^2$ space over the configuration space”. In the second step of the quantization procedure the prequantum Hilbert space has to be downsized so as to produce the physical Hilbert space. To this end one exploits the fact that $(\mathbb{CP}^N,\omega_R)$ is a Kähler manifold and, as such, carries a natural Kähler polarization. Specifically, as the physical Hilbert space one picks the subspace of the prequantum Hilbert space consisting of all $L^2$ sections of the line bundle $L$ that are covariantly constant (with respect to the connection $\nabla$) along the leaves of the (standard) Kähler polarization on $\mathbb{CP}^N$. The result is a physical Hilbert space $\mathcal{H}_{hol}$ consisting of the holomorphic sections of $L$. The dimension of the space of holomorphic sections $\mathcal{H}_{hol}$ follows from a classical result (viz. a Riemann-Roch-type formula) due to Hirzebruch and Kodaira [HK]
$$\dim(\mathcal{H}_{hol}) = \frac{(N+M)!}{N!\,M!}, \tag{5.27}$$
which corresponds nicely to the dimension of our Hilbert space $L^2(\rho+\Lambda_M^+)$ in (3.8). It is possible to realize the Hilbert space $\mathcal{H}_{hol}$ more explicitly, as the space of holomorphic sections may be identified with the space of functions of the form [HK, Si, Hu]
$$\frac{p(z_1,\dots,z_N)}{(1+|z_1|^2+\cdots+|z_N|^2)^{M}}, \tag{5.28}$$
where $p(z_1,\dots,z_N)$ denotes an arbitrary polynomial of degree at most $M$ in the affine $\mathbb{CP}^N$ coordinates $(z_1,\dots,z_N)$. In this representation the integration of the $L^2$ inner product is with respect to the volume form
$$(1+|z_1|^2+\cdots+|z_N|^2)^{-(N+1)}\,dz_1\wedge d\bar z_1\wedge\cdots\wedge dz_N\wedge d\bar z_N. \tag{5.29}$$
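The dimension count in (5.27) can be compared with a direct enumeration of lattice points. The sketch below assumes that the points of the alcove lattice $\Lambda_M^+$ are labelled by the integer vectors $m$ with $M \ge m_1 \ge \cdots \ge m_{N+1} = 0$, which is the summation range used in Proposition B.2 of Appendix B; the helper name `alcove_points` is illustrative.

```python
from itertools import product
from math import factorial

def alcove_points(N, M):
    """All (m_1, ..., m_{N+1}) with M >= m_1 >= ... >= m_N >= m_{N+1} = 0."""
    return [m + (0,)
            for m in product(range(M + 1), repeat=N)
            if all(m[i] >= m[i + 1] for i in range(N - 1))]

def hol_dimension(N, M):
    """Right-hand side of (5.27)."""
    return factorial(N + M) // (factorial(N) * factorial(M))

# number of lattice points for a few small ranks N and levels M
counts = {(N, M): len(alcove_points(N, M))
          for N in range(1, 5) for M in range(1, 6)}
```

For every pair in `counts` the brute-force count should agree with `hol_dimension(N, M)`; for instance $N = 1$ gives the $M+1$ points of the lattice (5.21).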
It would be very interesting to extend the analysis further so as to include a description of the quantum Hamiltonians and their eigenfunctions within the framework of geometric quantization and to compare the results with the approach taken in the present paper. In this connection it is expected that our lattice $\rho+\Lambda_M^+$ may be recovered geometrically as the so-called Bohr–Sommerfeld set – see Guillemin–Sternberg [GS2] – associated to the symplectic embedding of $\Sigma_g\times T$ into $\mathbb{CP}^N$ given by (2.18). Furthermore, the embedding in question induces a “real polarization with singularities” on $\mathbb{CP}^N$ (cf. [GS2]). (This real polarization becomes singular at the boundary hyperplanes $z_j = 0$ of the open dense patch $\{[z_0:\cdots:z_N] \mid z_j\neq 0,\ j = 0,\dots,N\}\subset\mathbb{CP}^N$.) The above-mentioned correspondence between (the dimensions of) the Hilbert space $L^2(\rho+\Lambda_M^+)$ and the Hilbert space of holomorphic sections of the line bundle $L$ (viz. $\mathcal{H}_{hol}$) suggests that (the dimension of) the Hilbert space associated to $(\mathbb{CP}^N,\omega_R)$ via geometric quantization does not depend on the choice of polarization to be either the standard Kähler polarization or the “real polarization with singularities” stemming from the embedding $\Sigma_g\times T\hookrightarrow\mathbb{CP}^N$. This is in correspondence with the more general “invariance of polarization” results for the geometric quantization of complex flag manifolds due to Guillemin and Sternberg [GS2].

5.6. Connections to integrable field theories. The compactified Ruijsenaars–Schneider model is related in various ways to well-known infinite-dimensional integrable systems. For instance, in [R5] it was shown that at the classical level the ($\tau$-functions of) single-solitons for the Kadomtsev–Petviashvili and 2D Toda hierarchy (cf. [DKJM, JM, Ho1, Ho2, ZC]) may be described in terms of the equilibrium behavior of the classical compactified Ruijsenaars–Schneider molecule with an appropriate center-of-mass motion. Multi-solitons arise in this picture – in a nutshell – by passing to composite integrable Ruijsenaars–Schneider-type $(m_1+\cdots+m_n)$-particle systems that are built of $n$ (the number of solitons) interacting compactified trigonometric Ruijsenaars–Schneider molecules in their ground state [R5]. In [GN] it was furthermore argued that formally the (quantum) compactified Ruijsenaars–Schneider model can be obtained by Hamiltonian reduction from an infinite-dimensional system on the cotangent bundle over a central extension of the loop group $\widehat{SU}(N+1)$. The latter paper also indicates some intriguing relations between, on the one hand, the quantum compactified Ruijsenaars–Schneider model and, on the other hand, a gauged $SU(N+1)/SU(N+1)$ Wess–Zumino–Witten topological quantum field theory on a cylinder and a Chern–Simons theory with gauge group $SU(N+1)$ on a three-fold that is the product of an interval and a real two-torus.

5.7. Bispectrality. The symmetry of the wave function $\Psi_\lambda(\rho+\mu)$ (4.13) with respect to an interchange of $\lambda$ and $\mu$ (cf. Proposition 4.6) has as a consequence that it satisfies the same discrete difference eigenvalue equations in the “spectral” variable $\lambda$ as it does in the “spatial” variable $\mu$ (cf. Proposition 4.7 and the proof of Proposition 4.8). In other words, we are dealing with a multivariate doubly discrete finite-dimensional bispectral problem in the sense of Duistermaat and Grünbaum [DG, W, G].

A. Truncated and Terminating Aomoto–Ito–Macdonald sums

In this appendix a finitely truncated version of a recent summation formula due to Aomoto, Ito, and Macdonald [Ao, I1, I2, M5] (see also [Ka]) is derived. In Sect. 4 we used this finite Aomoto–Ito–Macdonald-type sum to arrive at a compact product formula for the normalization constant of our factorized wave function $\Psi_0$ (4.4) (cf. Proposition 4.4). When formulating the summation formulas in question it is convenient to employ the $q$-shifted factorial defined by (see e.g. [GR])
$$(a;q)_m = \begin{cases} (1-a)(1-aq)(1-aq^2)\cdots(1-aq^{m-1}), & m = 1,2,3,\dots\\ 1, & m = 0\\ \dfrac{1}{(1-aq^{-1})(1-aq^{-2})\cdots(1-aq^{m})}, & m = -1,-2,-3,\dots\end{cases}$$
(where for negative $m$ it is assumed that $a, q\in\mathbb{C}$ are such that the denominators do not vanish),
$$(a;q)_\infty = \prod_{n=0}^{\infty}(1-aq^n),\qquad (0<|q|<1)$$
and
$$(a_1,\dots,a_k;q)_m = (a_1;q)_m\cdots(a_k;q)_m,\qquad m\in\mathbb{Z}\cup\{\infty\}.$$
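A small computational sketch of these definitions may help fix the conventions, in particular for negative $m$. The function names and the truncation length `terms` are illustrative choices, and the infinite product is simply cut off, which is adequate only for $|q|$ well below 1. The checks use the elementary relation $(a;q)_m = (a;q)_\infty/(aq^m;q)_\infty$, valid for integers $m$ of either sign, and the reflection formula $(a;q)_{-m} = (qa^{-1};q)_m^{-1}(-a)^{-m}q^{m(m+1)/2}$ quoted in Remark iii of this appendix.

```python
def qpoch(a, q, m):
    """q-shifted factorial (a; q)_m for an integer m of either sign."""
    out = 1.0
    if m >= 0:
        for j in range(m):
            out *= 1 - a * q ** j
        return out
    for j in range(1, -m + 1):
        out *= 1 - a * q ** (-j)
    return 1 / out

def qpoch_inf(a, q, terms=400):
    """Truncated product approximating (a; q)_infinity for 0 < |q| < 1."""
    out = 1.0
    for n in range(terms):
        out *= 1 - a * q ** n
    return out

def qpoch_multi(params, q, m):
    """(a_1, ..., a_k; q)_m as a product of single q-shifted factorials."""
    out = 1.0
    for a in params:
        out *= qpoch(a, q, m)
    return out

a, q = 0.3 + 0.2j, 0.5
# (a; q)_m versus (a; q)_inf / (a q^m; q)_inf for m = -3, ..., 4
ratio_defect = max(abs(qpoch(a, q, m) - qpoch_inf(a, q) / qpoch_inf(a * q ** m, q))
                   for m in range(-3, 5))
# reflection formula for (a; q)_{-m}, m = 1, ..., 4
reflection_defect = max(abs(qpoch(a, q, -m)
                            - (-a) ** (-m) * q ** (m * (m + 1) / 2) / qpoch(q / a, q, m))
                        for m in range(1, 5))
```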
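Proposition A.1 (The Aomoto–Ito–Macdonald sum [Ao, I1, I2, M5]). For $0<q<1$, $\mathrm{Re}(g)<0$ and $z\in\mathbb{C}^{N+1}$ with $g+\langle a,z\rangle\notin\mathbb{Z}+\frac{2\pi}{i\log q}\mathbb{Z}$ for all $a\in A_N$, one has that
$$\sum_{\mu\in\Lambda} q^{-2\langle\rho,z+\mu\rangle}\prod_{a\in A_N^+}\bigl(1-q^{\langle a,z+\mu\rangle}\bigr)\frac{(q^{1-g+\langle a,z+\mu\rangle};q)_\infty}{(q^{g+\langle a,z+\mu\rangle};q)_\infty} = \gamma\,\Theta(z)$$
(where the sum is taken over all weights in $\Lambda$ (2.5) and the vector $\rho$ is given by (2.13)), with
$$\Theta(z) = q^{-2\langle\rho,z\rangle}\prod_{a\in A_N^+}\frac{\theta(q^{\langle a,z\rangle})}{\theta(q^{g+\langle a,z\rangle})},\qquad \theta(\zeta) = (q,\zeta,q\zeta^{-1};q)_\infty$$
and
$$\gamma = (N+1)\prod_{a\in A_N^+}\frac{(q^{1-g-\langle a,\rho\rangle},\,q^{\delta_a+g-\langle a,\rho\rangle};q)_\infty}{(q^{1-\langle a,\rho\rangle},\,q^{-\langle a,\rho\rangle};q)_\infty},$$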
where $\delta_a := 1$ if $a$ is simple (cf. (2.2)) and $\delta_a := 0$ otherwise. Furthermore, the series on the l.h.s. converges in absolute value.
The conditions on $q$, $g$ ensure that the series on the l.h.s. converges absolutely (cf. Remark iii below) and the genericity restrictions on $z$ guarantee that all denominators are nonzero. The sum of Proposition A.1 was first considered by Aomoto [Ao], who showed that it can be evaluated as a product of the quasi-periodic factor $\Theta(z)$ and a $z$-independent constant. The value of this constant (viz. $\gamma$) was subsequently determined by Ito [I1, I2]. Soon thereafter, Macdonald [M5] found an alternative derivation of the above sum by linking a constant term identity of Cherednik [C1] to a generalized Poincaré series type formula for affine Weyl groups due to Matsumoto [Ma].
After division of both sides of the Aomoto–Ito–Macdonald summation formula by the term on the l.h.s. corresponding to $\mu = 0$, one arrives at the identity
$$\sum_{\mu\in\Lambda} q^{-2\langle\rho,\mu\rangle}\prod_{a\in A_N^+}\frac{1-q^{\langle a,z+\mu\rangle}}{1-q^{\langle a,z\rangle}}\,\frac{(q^{g+\langle a,z\rangle};q)_{\langle a,\mu\rangle}}{(q^{1-g+\langle a,z\rangle};q)_{\langle a,\mu\rangle}} = \gamma\prod_{a\in A_N^+}\frac{(q^{1+\langle a,z\rangle},\,q^{1-\langle a,z\rangle};q)_\infty}{(q^{1-g+\langle a,z\rangle},\,q^{1-g-\langle a,z\rangle};q)_\infty}, \tag{A.1}$$
where it is assumed that $0<q<1$, $\mathrm{Re}(g)<0$ and that $z\in\mathbb{C}^{N+1}$ satisfies the genericity conditions $\langle a,z\rangle\notin\frac{2\pi}{i\log(q)}\mathbb{Z}$ and $g+\langle a,z\rangle-1\notin\mathbb{N}+\frac{2\pi}{i\log(q)}\mathbb{Z}$ for all $a\in A_N$ (to ensure that there is no division by zero). (In other words, in Eq. (A.1) we have normalized the terms of the series on the l.h.s. such that for $\mu = 0$ the term is equal to 1.) It is instructive to observe that the proportionality constant $\gamma$ on the r.h.s. (see above) may be rewritten in a somewhat more compact (but less elegant) form by canceling common factors in the numerator and denominator:
$$\gamma = (N+1)\,\frac{(q;q)_\infty^N}{(q^{1-g};q)_\infty^N}\prod_{1\le n\le N}\frac{(q^{1-(n+1)g};q)_\infty}{(q^{-ng};q)_\infty}. \tag{A.2}$$
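The cancellation leading from the root-system product for $\gamma$ in Proposition A.1 to the compact form (A.2) can be checked numerically. The sketch below is hedged on one assumption not restated in this appendix: the positive roots of $A_N$ are taken to be $e_j - e_k$ ($1\le j<k\le N+1$) with $\langle a,\rho\rangle = g\,(k-j)$, the simple roots being those with $k-j = 1$. For simplicity $g$ is taken real and negative, and the infinite products are truncated; all names and parameter values are illustrative.

```python
def qinf(a, q, terms=200):
    """Truncated product approximating (a; q)_infinity for 0 < q < 1."""
    out = 1.0
    for n in range(terms):
        out *= 1 - a * q ** n
    return out

def gamma_roots(N, g, q):
    """gamma as the product over positive roots in Proposition A.1."""
    out = float(N + 1)
    for j in range(1, N + 2):
        for k in range(j + 1, N + 2):
            h = k - j                  # assumption: <a, rho> = g * h
            d = 1 if h == 1 else 0     # delta_a = 1 for simple roots only
            out *= (qinf(q ** (1 - g - g * h), q) * qinf(q ** (d + g - g * h), q)
                    / (qinf(q ** (1 - g * h), q) * qinf(q ** (-g * h), q)))
    return out

def gamma_compact(N, g, q):
    """gamma in the compact form (A.2)."""
    out = (N + 1) * (qinf(q, q) / qinf(q ** (1 - g), q)) ** N
    for n in range(1, N + 1):
        out *= qinf(q ** (1 - (n + 1) * g), q) / qinf(q ** (-n * g), q)
    return out

g, q = -0.37, 0.6
# relative difference between the two expressions for N = 1, ..., 4
rel_diff = [abs(gamma_roots(N, g, q) / gamma_compact(N, g, q) - 1)
            for N in range(1, 5)]
```

The entries of `rel_diff` should vanish up to rounding, reflecting the telescoping of the factors belonging to roots of neighbouring heights.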
We will now show that by specializing the vector $z$ to the value $\rho$ (2.13), the sum in (A.1) over the weight lattice $\Lambda$ (2.5) truncates to a sum over the dominant cone $\Lambda^+$ (2.7).
Proposition A.2 (A Truncated Aomoto–Ito–Macdonald sum). Let $0<q<1$ and $\mathrm{Re}(g)<0$ such that $g-\langle a,\rho\rangle$ is not a positive integer modulo $\frac{2\pi}{i\log(q)}$ for all $a\in A_N^+$. Then
$$\sum_{\mu\in\Lambda^+} q^{-2\langle\rho,\mu\rangle}\prod_{a\in A_N^+}\frac{1-q^{\langle a,\rho+\mu\rangle}}{1-q^{\langle a,\rho\rangle}}\,\frac{(q^{g+\langle a,\rho\rangle};q)_{\langle a,\mu\rangle}}{(q^{1-g+\langle a,\rho\rangle};q)_{\langle a,\mu\rangle}} = (N+1)\prod_{a\in A_N^+}\frac{(q^{1+\langle a,\rho\rangle},\,q^{\delta_a+g-\langle a,\rho\rangle};q)_\infty}{(q^{-\langle a,\rho\rangle},\,q^{1-g+\langle a,\rho\rangle};q)_\infty} = (N+1)\prod_{1\le n\le N}\frac{(q^{1+ng};q)_\infty}{(q^{-ng};q)_\infty},$$
and the series on the l.h.s. converges in absolute value.
Proof. The conditions on $g$ (and $q$) ensure that after substituting $z = \rho$ (2.13) in (A.1) all terms remain finite and the series converges in absolute value (as a consequence of Proposition A.1). The resulting series on the l.h.s. now truncates because the terms become zero for $\mu\in\Lambda\setminus\Lambda^+$. This is because for $\mu\in\Lambda\setminus\Lambda^+$ there exists a simple root $a_j$ (2.2) for which $\langle a_j,\mu\rangle$ is a negative integer and hence we pick up a zero from the factor
$$\frac{1}{(q^{1-g+\langle a_j,\rho\rangle};q)_{\langle a_j,\mu\rangle}} = \frac{1}{(q;q)_{\langle a_j,\mu\rangle}}$$
(which is zero for $\langle a_j,\mu\rangle<0$). The expressions for the r.h.s. are obtained from that of (A.1) (with $z = \rho$) by canceling common factors in numerator and denominator.
A further reduction arises when we specialize the parameters $q$ and $g$ in such a way that $q^{g(N+1)+M} = 1$ for some positive integer $M$. The sum of Proposition A.2 then terminates to a sum over the integral alcove $\Lambda_M^+$ (3.6).
Proposition A.3 (A terminating Aomoto–Ito–Macdonald sum). Let
$$q = \exp\Bigl(\frac{2\pi i}{g(N+1)+M}\Bigr)$$
with $g>0$ and $M$ a positive integer. Then
$$\sum_{\mu\in\Lambda_M^+} q^{-2\langle\rho,\mu\rangle}\prod_{a\in A_N^+}\frac{1-q^{\langle a,\rho+\mu\rangle}}{1-q^{\langle a,\rho\rangle}}\,\frac{(q^{g+\langle a,\rho\rangle};q)_{\langle a,\mu\rangle}}{(q^{1-g+\langle a,\rho\rangle};q)_{\langle a,\mu\rangle}} = (N+1)\prod_{1\le n\le N}(q^{1+ng};q)_{M-1}.$$
Proof. Let us first substitute
$$g = -\frac{M}{N+1} + \frac{2\pi i}{(N+1)\log(q)} \tag{A.3}$$
with $0<q<1$ and $M$ a positive integer, in the summation formula of Proposition A.2. Notice that this value of $g$ satisfies both the convergence criterion $\mathrm{Re}(g)<0$ as well as the regularity condition that $g-\langle a,\rho\rangle-1\notin\mathbb{N}+\frac{2\pi}{i\log(q)}\mathbb{Z}$ for all $a\in A_N^+$. The sum over the dominant cone $\Lambda^+$ (2.7) then terminates to a sum over the integral alcove $\Lambda_M^+$ (3.6) because all terms become zero for $\mu\in\Lambda^+\setminus\Lambda_M^+$. Indeed, for the above value of $g$ we have that $q^{(N+1)g+M} = 1$, so we pick up a zero from the factor $(q^{g+\langle a_{max},\rho\rangle};q)_{\langle a_{max},\mu\rangle} = (q^{(N+1)g};q)_{\mu_1-\mu_{N+1}}$ when $\langle a_{max},\mu\rangle = \mu_1-\mu_{N+1}>M$. To arrive at the expression for the r.h.s. one uses that
$$\prod_{1\le n\le N}\frac{(q^{1+ng};q)_\infty}{(q^{-ng};q)_\infty} = \prod_{1\le n\le N}\frac{(q^{1+ng};q)_\infty}{(q^{(N+1-n)g+M};q)_\infty} = \prod_{1\le n\le N}(q^{1+ng};q)_{M-1}.$$
We thus see that the terminating sum of the proposition holds for $q$ given by $\exp(\frac{2\pi i}{(N+1)g+M})$ with $\mathrm{Re}(g) = -M/(N+1)$ and $\mathrm{Im}(g)<0$ (solve (A.3) for $q$). By exploiting the analyticity in $g$ it is possible to extend the terminating sum to generic complex $g$. The restriction to positive real values of $g$ ensures that all numerators and denominators are nonzero.
In trigonometric notation with $q = e^{i\alpha}$, the summation formula of Proposition A.3 becomes
$$\sum_{\mu\in\Lambda_M^+}\ \prod_{a\in A_N^+}\frac{\sin\bigl(\frac{\alpha}{2}\langle a,\rho+\mu\rangle\bigr)\prod_{m=1}^{\langle a,\mu\rangle}\sin\frac{\alpha}{2}\bigl(\langle a,\rho\rangle+g+m-1\bigr)}{\sin\bigl(\frac{\alpha}{2}\langle a,\rho\rangle\bigr)\prod_{m=1}^{\langle a,\mu\rangle}\sin\frac{\alpha}{2}\bigl(\langle a,\rho\rangle-g+m\bigr)} = 2^{N(M-1)}(N+1)\prod_{\substack{1\le m\le M-1\\ 1\le n\le N}}\sin\tfrac{\alpha}{2}(m+ng) \tag{A.4}$$
for $\alpha = 2\pi/(g(N+1)+M)$ with $M$ a positive integer and $g>0$. (Here we have used the convention that empty products are equal to 1.) This is precisely the summation formula of Proposition 4.4.
Remarks. i. For $N = 1$ the Aomoto–Ito–Macdonald-type sums of Eq. (A.1), Proposition A.2 and Proposition A.3 reduce to
$$\sum_{m=-\infty}^{\infty} q^{-gm}\,\frac{1-q^{z+m}}{1-q^{z}}\,\frac{(q^{g+z};q)_m}{(q^{1-g+z};q)_m} = 2\,\frac{(q^{1+z},\,q^{1-z};q)_\infty}{(q^{1-g+z},\,q^{1-g-z};q)_\infty}\,\frac{(q,\,q^{1-2g};q)_\infty}{(q^{1-g},\,q^{-g};q)_\infty} \tag{A.5}$$
(with $0<q<1$, $\mathrm{Re}(g)<0$ and $z\notin\frac{2\pi}{i\log(q)}\mathbb{Z}$, $g\pm z-1\notin\mathbb{N}+\frac{2\pi}{i\log(q)}\mathbb{Z}$),
$$\sum_{m=0}^{\infty} q^{-gm}\,\frac{1-q^{g+m}}{1-q^{g}}\,\frac{(q^{2g};q)_m}{(q;q)_m} = 2\,\frac{(q^{1+g};q)_\infty}{(q^{-g};q)_\infty} \tag{A.6}$$
(with $0<q<1$, $\mathrm{Re}(g)<0$) and
$$\sum_{m=0}^{M} q^{-gm}\,\frac{1-q^{g+m}}{1-q^{g}}\,\frac{(q^{2g};q)_m}{(q;q)_m} = 2\,(q^{1+g};q)_{M-1} \tag{A.7}$$
(with $q = \exp(\frac{\pi i}{g+M/2})$ and $g>0$), respectively. The sums in (A.5), (A.6) and (A.7) are well-poised ${}_2\psi_2$, ${}_2\phi_1$ and terminating ${}_2\phi_1$ sums that arise as reductions of Bailey’s ${}_6\psi_6$, Rogers’ ${}_6\phi_5$ and Rogers’ terminating ${}_6\phi_5$ very-well-poised sums, respectively [GR]. For instance, the substitution $a = q^{2z}$, $b = -q^{z}$, $c = q^{z+1/2}$, $d = -q^{z+1/2}$, $e = q^{z+g}$ in the ${}_6\psi_6$ Bailey sum [GR, Eq. (5.3.1)] yields (A.5); the substitution $a = q^{2g}$, $b = -q^{g}$, $c = q^{g+1/2}$, $d = -q^{g+1/2}$ in the ${}_6\phi_5$ Rogers sum [GR, Eq. (2.7.1)] yields (A.6); and the substitution $a = q^{2g}$, $b = q^{g+1/2}$, $c = -q^{g+1/2}$, $q = \exp(\frac{\pi i}{2g+M})$ and $n = M$ in the terminating ${}_6\phi_5$ Rogers sum [GR, Eq. (2.4.2)] yields (A.7). (Observe that in the last case the parameter $q$ in the terminating Rogers sum and in Eq. (A.7) are related by a square.)
ii. The summation formula of Aomoto, Ito and Macdonald in [Ao, I1, I2, M5] is in fact more general than as stated in Proposition A.1. The point is that in those papers sums associated to an arbitrary reduced integral root system are considered. The formulation of Proposition A.1 then corresponds to the restriction of [Ao, I1, I2, M5] to the case of the $A_N$ series. Whereas Aomoto’s methods for determining the quasi-periodic factor $\Theta(z)$ actually hold for an arbitrary reduced root system, Ito’s computations of the proportionality constant $\gamma$ in fact cover only the cases of the $A_N$ series and the integral root systems of rank two. For an arbitrary reduced root system, Ito [I1, I2] moreover presents a conjectural formula for $\gamma$. The final formulation of the generalization of Proposition A.1 to an arbitrary reduced root system is due to Macdonald [M5] (in whose approach all reduced root systems are treated at once). In particular, Macdonald [M5] provides a proof for Ito’s conjecture from [I1, I2] regarding the value of the proportionality constant $\gamma$ in the case of a general reduced root system.
In [D3] a further extension of the Aomoto–Ito–Macdonald sum for the nonreduced BC-type root systems was studied, together with corresponding truncated and terminating variants.
iii. At first glance, one might be inclined to think that the series on the l.h.s. of Proposition A.1 is actually divergent. This is however not true, as was demonstrated in [Ao, I1, M5]. It is not so difficult to establish the (absolute) convergence of the series if we restrict the sum to be over $\mu$ in the dominant cone $\Lambda^+$ (2.7) rather than over the whole weight lattice $\Lambda$ (2.5). Indeed, the sum $\sum_{\mu\in\Lambda^+} q^{-2\langle\rho,z+\mu\rangle}$ clearly converges (absolutely) for the parameter regime indicated and the remaining factors of the form $(1-q^{\langle a,z+\mu\rangle})$ and $(q^{1-g+\langle a,z+\mu\rangle};q)_\infty/(q^{g+\langle a,z+\mu\rangle};q)_\infty$ remain bounded on $\Lambda^+$ (since singularities are avoided and $\lim_{x\to+\infty}(aq^x;q)_\infty/(bq^x;q)_\infty = 1$). The key observation is now that the knowledge of the convergence of the sum over $\mu\in\Lambda^+$ is already sufficient to conclude the convergence of the sum over $\mu\in\Lambda$. This is most easily seen from the representation in (A.1), which differs from the series in Proposition A.1 by an overall factor of the form
$$q^{-2\langle\rho,z\rangle}\prod_{a\in A_N^+}\bigl(1-q^{\langle a,z\rangle}\bigr)\frac{(q^{1-g+\langle a,z\rangle};q)_\infty}{(q^{g+\langle a,z\rangle};q)_\infty}.$$
Indeed, it is not very difficult to see that the substitution $\mu\to\sigma(\mu)$, with $\sigma\in S_{N+1}$, in the terms on the l.h.s. of (A.1) is equivalent to replacing $z$ by $\sigma^{-1}(z)$. Thus, summing the terms of (A.1) over the weights in (the closure of) any Weyl chamber simply amounts to summing over the dominant weights $\Lambda^+$ in (the closure of) the positive Weyl chamber with a correspondingly Weyl-permuted value for $z$. Hence, the (absolute) convergence of the sum (A.1) over the whole weight lattice follows from the (absolute) convergence of the sum over the dominant cone. Finally, to verify the claim that the substitution $\mu\to\sigma(\mu)$ in the terms on the l.h.s. of (A.1) indeed amounts to the replacement $z\to\sigma^{-1}(z)$, one first observes that the substitution $\mu\to\sigma(\mu)$ is equivalent to the replacement
$$z\to\sigma^{-1}(z),\qquad a\to\sigma^{-1}(a),\qquad \rho\to\frac{g}{2}\sum_{a\in A_N^+}\sigma^{-1}(a) \tag{A.8}$$
(this is immediate from the orthogonality of the reflections $\sigma\in S_{N+1}$ with respect to the inner product $\langle\cdot,\cdot\rangle$). Let us now split, in the terms on the l.h.s. of (A.1) with the replacement (A.8), the product over the positive roots $a\in A_N^+$ in a product for which $\sigma^{-1}(a)\in A_N^+$ and a product for which $-\sigma^{-1}(a)\in A_N^+$, and let us apply the same partitioning to the terms of the sum $\frac{g}{2}\sum_{a\in A_N^+}\sigma^{-1}(a)$. Then a reflection with respect to the origin of the roots for which $-\sigma^{-1}(a)\in A_N^+$ with the aid of the reflection formula $(a;q)_{-m} = (qa^{-1};q)_m^{-1}(-a)^{-m}q^{m(m+1)/2}$, brings us back – after some straightforward manipulations – to the formula (A.1) with $z$ replaced by $\sigma^{-1}(z)$ as advertised.

B. Bilinear summation identities for Macdonald’s symmetric functions

The purpose of this appendix is twofold. Firstly, it serves to collect some basic facts on the Macdonald symmetric functions that were needed in Sect. 4. For a more complete treatment of this material and proofs the reader is referred to [M4, Ch. 6]. Secondly, the appendix allows us to reformulate some of our results from the perspective of algebraic combinatorics. This gives rise to a novel system of bilinear summation identities for the Macdonald symmetric functions (cf. Proposition B.2 below).
Let
$$D_r = t^{r(r-1)/2}\sum_{\substack{J\subset\{1,\dots,N+1\}\\ |J|=r}}\ \prod_{\substack{j\in J\\ k\notin J}}\frac{tz_j-z_k}{z_j-z_k}\ T_{J,q},\qquad r = 1,\dots,N+1, \tag{B.1}$$
where $|J|$ denotes the cardinality of the index set $J\subset\{1,\dots,N+1\}$ and $T_{J,q} = \prod_{j\in J}T_{j,q}$ with $(T_{j,q}f)(z_1,\dots,z_{N+1}) = f(z_1,\dots,z_{j-1},qz_j,z_{j+1},\dots,z_{N+1})$. Let $n = (n_1,n_2,\dots,n_{N+1})\in\mathbb{N}^{N+1}$ be a partition, i.e., let the components (or parts) be ordered as $n_1\ge n_2\ge\cdots\ge n_{N+1}\ge 0$. The monomial symmetric function $m_n(z)$ associated to $n$ is then defined as
$$m_n(z) = \sum_{m\in S_{N+1}(n)} z_1^{m_1}\cdots z_{N+1}^{m_{N+1}}, \tag{B.2}$$
where the sum is over the orbit of $n$ under the action of the permutation group $S_{N+1}$ on the components. The basis of monomial symmetric functions $\{m_n\}$ inherits a partial order from the dominance partial order of the partitions defined by
$$m\preceq n\quad\text{iff}\quad |m| = |n|\ \text{ and }\ m_1+\cdots+m_k\le n_1+\cdots+n_k \tag{B.3}$$
for $k = 1,\dots,N$, where $|n| := n_1+\cdots+n_{N+1}$ denotes the weight of the partition.
Proposition B.1 (Triangularity [M4]). The $q$-difference operators $D_1,\dots,D_{N+1}$ commute and are triangular with respect to the basis of monomial symmetric functions
$$D_r\,m_n = \sum_{m\preceq n}[D_r]_{n,m}\,m_m,\qquad [D_r]_{n,m}\in\mathbb{Q}(q,t).$$
Furthermore, the diagonal matrix elements (eigenvalues) $[D_r]_{n,n}$ are given by
$$[D_r]_{n,n} = E_{n,r}(q,t) = \sum_{\substack{J\subset\{1,\dots,N+1\}\\ |J|=r}}\ \prod_{j\in J} t^{N+1-j}q^{n_j}.$$
For a quick proof of the polynomiality of $(D_r m_n)(z)$ in $z$ one may use that this rational expression is regular as a function of $z\in\mathbb{C}^{N+1}$ due to the permutation symmetry. The triangular form of the monomial expansion for $(D_r m_n)(z)$ and the diagonal matrix elements $[D_r]_{n,n}$ then follow from the asymptotics for $z$ to infinity. In the simplest situation, i.e. for $n = 0$ (so $m_n = 1$), the monomial expansion of Proposition B.1 reduces to the ($A_N$-type) Macdonald identity (cf. [M1, Theorem (2.8)])
$$t^{r(r-1)/2}\sum_{\substack{J\subset\{1,\dots,N+1\}\\ |J|=r}}\ \prod_{\substack{j\in J\\ k\notin J}}\frac{tz_j-z_k}{z_j-z_k} = \sum_{\substack{J\subset\{1,\dots,N+1\}\\ |J|=r}}\ \prod_{j\in J} t^{N+1-j}, \tag{B.4}$$
with $r = 1,\dots,N+1$. Substitution of $t = e^{i\alpha g}$ and $z_j = e^{i\alpha x_j}$, $j = 1,\dots,N+1$ produces upon division by $t^{rN/2}$ the Macdonald identity used in the proof of Proposition 4.3 for $r = 1,\dots,N$ (the $(N+1)$-th identity in (B.4) is in fact trivial, as in that case the product on the l.h.s. becomes empty and one ends up with the equality $t^{N(N+1)/2} = t^N t^{N-1}\cdots t\cdot 1$).
The Macdonald symmetric functions are now defined as the (joint) eigenfunctions of the commuting operators $D_1,\dots,D_{N+1}$. Such a definition makes sense because the eigenvalues $E_{n,r}(q,t)$ are nondegenerate (and hence semisimple) over the field $\mathbb{Q}(q,t)$.
Definition (Macdonald symmetric functions). For a partition $n$ in $\mathbb{N}^{N+1}$, the Macdonald symmetric function $p_n(z)$ is the symmetric polynomial of the form
(i) $p_n(z) = m_n(z) + \sum_{m\prec n} c_{n,m}(q,t)\,m_m(z)$ with $c_{n,m}(q,t)\in\mathbb{Q}(q,t)$ such that
(ii) $D_r\,p_n = E_{n,r}(q,t)\,p_n$, $r = 1,\dots,N+1$.
Two important properties of the Macdonald symmetric functions are the evaluation formula (also referred to as the specialization formula) [M4, Ch. 6: Eqs. (6.11), (6.11′)]
$$p_n(\tau) = t^{\sum_{j=1}^{N+1}(j-1)n_j}\prod_{1\le j<k\le N+1}\frac{(t^{1+k-j};q)_{n_j-n_k}}{(t^{k-j};q)_{n_j-n_k}} \tag{B.5a}$$
with $\tau = (t^N,t^{N-1},\dots,t,1)$ and the symmetry relation [M4, Ch. 6: Eq. (6.6)]
$$P_m(\tau q^n) = P_n(\tau q^m) \tag{B.5b}$$
with
$$P_n(z) = p_n(z)/p_n(\tau) \tag{B.6}$$
and $\tau q^n = (t^N q^{n_1},t^{N-1}q^{n_2},\dots,tq^{n_N},q^{n_{N+1}})$.
The Macdonald symmetric function $p_n(z)$ is homogeneous of degree $|n|$ in $z$ and $p_{n+e_1+\cdots+e_{N+1}}(z) = (z_1\cdots z_{N+1})\,p_n(z)$. Projection to a homogeneous function of degree zero in $z$
$$p_\lambda(z) = (z_1\cdots z_{N+1})^{-|n|/(N+1)}\,p_n(z), \tag{B.7a}$$
$$\lambda = n - \frac{|n|}{N+1}(e_1+\cdots+e_{N+1}) \tag{B.7b}$$
gives rise to the Macdonald polynomials $p_\lambda$, $\lambda\in\Lambda^+$ (2.7) associated to the root system $A_N$ [M2, M3]. The functions $p_\lambda(z)$ (B.7a), (B.7b) are related to the trigonometric Macdonald polynomials $p_\lambda(x)$ of Sect. 4.2 via the trigonometric substitution
$$z_j = e^{i\alpha x_j},\qquad j = 1,\dots,N+1, \tag{B.8a}$$
$$t = q^g,\qquad q = e^{i\alpha}. \tag{B.8b}$$
More specifically, by substituting (B.8a), (B.8b) the functions $p_\lambda(z)$ (B.7a), (B.7b) pass over to trigonometric polynomials $p_\lambda(x)$ of the form in (4.10a), (4.10b). (The difference equations in (4.10b) are equivalent to the $q$-difference equations for $p_\lambda(z)$, originating from the $q$-difference equations $D_r\,p_n = E_{n,r}(q,t)\,p_n$ in the above definition of the Macdonald symmetric function $p_n(z)$, upon substitution of (B.8a), (B.8b).) The symmetry relations in (4.12) for the renormalized trigonometric Macdonald polynomials $P_\lambda(x)$ (4.11) are an immediate consequence of the evaluation formula (B.5a) and the symmetry relation (B.5b). The real-valuedness of the expansion coefficients $c_{\lambda,\mu}$ in (4.10a) follows from the fact that the Macdonald symmetric functions $p_n(z)$ are invariant with respect to the parameter inversion $(q,t)\to(q^{-1},t^{-1})$, see [M4, Ch. 6: Eq. (4.14) (iv)].
Translating back the orthogonality relations from Sect. 4.2 leads us to the following system of bilinear summation identities for the Macdonald symmetric functions $p_n(z)$.
Proposition B.2 (Bilinear summation identities). Let $n$, $k$ be partitions in $\mathbb{N}^{N+1}$ with $n_1-n_{N+1},\ k_1-k_{N+1}\le M\in\mathbb{N}\setminus\{0\}$. Then we have that
$$\sum_{\substack{M\ge m_1\ge\cdots\ge m_{N+1}=0\\ m\in\mathbb{N}^{N+1}}} q^{-\frac{|m|(|n|-|k|)}{N+1}}\,p_n(\tau q^m)\,p_k(\tau^{-1}q^{-m})\,\Delta(m) = \begin{cases} 0 & \text{if } k\neq n \bmod \mathbb{Z}(e_1+\cdots+e_{N+1})\\ t^{\frac{N(|n|-|k|)}{2}}\,\mathcal{N}_0\,\mathcal{N}(n) & \text{if } k = n \bmod \mathbb{Z}(e_1+\cdots+e_{N+1})\end{cases}$$
as rational identity in $q^{\frac{1}{N+1}}$ and $t$ subject to relation $t^{N+1}q^M = 1$, where
$$\tau^{\pm1}q^{\pm m} = (t^{\pm N}q^{\pm m_1},\,t^{\pm(N-1)}q^{\pm m_2},\,\dots,\,t^{\pm1}q^{\pm m_N},\,q^{\pm m_{N+1}})$$
and
$$\Delta(m) = t^{-\sum_{j=1}^{N+1}(N+2-2j)m_j}\prod_{1\le j<k\le N+1}\frac{1-t^{k-j}q^{m_j-m_k}}{1-t^{k-j}}\,\frac{(t^{1+k-j};q)_{m_j-m_k}}{(qt^{k-j-1};q)_{m_j-m_k}},$$
$$\mathcal{N}(n) = \prod_{1\le j<k\le N+1}\frac{(t^{1+k-j},\,qt^{k-j-1};q)_{n_j-n_k}}{(t^{k-j},\,qt^{k-j};q)_{n_j-n_k}},$$
$$\mathcal{N}_0 = (N+1)\prod_{1\le j\le N}(qt^{j};q)_{M-1}.$$
(For $k = n \bmod \mathbb{Z}(e_1+\cdots+e_{N+1})$ the relation is actually rational in $q$ itself.)
Proof. First the projection formulas in (B.7a), (B.7b) are used to rewrite the stated bilinear summation identities in terms of the $A_N$-type Macdonald polynomials $p_\lambda$, $\lambda\in\Lambda_M^+$. The trigonometric substitution in (B.8a), (B.8b) then leads us back to the discrete orthogonality relations (4.15a) for the trigonometric Macdonald polynomials $p_\lambda(x)$, $\lambda\in\Lambda_M^+$ of Sect. 4.2. This proves our bilinear summation formulas for the Macdonald symmetric functions $p_n(z)$ for $t = q^g$, $q = \exp(\frac{2\pi i}{(N+1)g+M})$ with $g>0$. Analytic continuation in $g$ entails the formulation of the proposition.
If we specialize the formula of Proposition B.2 to the case that $n = k = 0$, then we arrive at the following rational identity in $q$, $t$ subject to the relation $t^{N+1}q^M = 1$:
$$\sum_{\substack{M\ge m_1\ge\cdots\ge m_{N+1}=0\\ m\in\mathbb{N}^{N+1}}}\Delta(m) = \mathcal{N}_0, \tag{B.9}$$
which amounts to the terminating Aomoto–Ito–Macdonald sum of Proposition A.3.
The complex conjugation relation in (4.17), for the trigonometric Macdonald polynomials $p_\lambda(x)$ of Sect. 4.2, translates to a corresponding relation for the Macdonald symmetric functions $p_n(z)$ defined above. Specifically, if we associate to a partition $n\in\mathbb{N}^{N+1}$ a contragredient (or complementary) partition $n^*\in\mathbb{N}^{N+1}$ with parts given by
$$n_j^* = n_1 - n_{N+2-j},\qquad j = 1,\dots,N+1 \tag{B.10}$$
(see Fig. 4), then we have that
$$p_{n^*}(z) = (z_1\cdots z_{N+1})^{\frac{|n|+|n^*|}{N+1}}\,p_n(z^{-1}),\qquad \frac{|n|+|n^*|}{N+1} = n_1, \tag{B.11}$$
where $z^{-1} := (z_1^{-1},\dots,z_{N+1}^{-1})$. Observe that the mapping $n\mapsto n^*$ is involutive modulo $\mathbb{Z}(e_1+\cdots+e_{N+1})$, i.e., the contragredient partition of $n^*$ is equal to $n$ up to a possible integer multiple of the vector $e_1+\cdots+e_{N+1}$. (This multiple is nonzero if $n_{N+1}>0$.) The verification of (B.11) goes along lines very similar to the proof of Proposition B.2. First one uses the projective relation between the Macdonald symmetric functions $p_n$ and the $A_N$-type Macdonald polynomials $p_\lambda$ in (B.7a), (B.7b), to conclude that after the trigonometric substitution (B.8a), (B.8b) the relation in (B.11) reduces to (4.17). Notice to this end that if $\lambda$ is the projection of a partition $n\in\mathbb{N}^{N+1}\subset\mathbb{R}^{N+1}$ on the hyperplane $E$ (2.1) (cf. (B.7b)), then $\lambda^*$ (4.16b) amounts to the projection of the contragredient partition $n^*$ (B.10) onto $E$. This proves Eq. (B.11) for $t = q^g$ with $g>0$ and $q$, $z_j$ on the unit circle. Analytic continuation then entails that Eq. (B.11) holds identically as an equality that is polynomial in $z$ and rational in $q$, $t$.
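The terminating sum (B.9), i.e. the case $n = k = 0$ of Proposition B.2, can be tested numerically for small rank. The following sketch evaluates both sides for $t = q^g$ and $q = \exp(2\pi i/((N+1)g+M))$, so that $t^{N+1}q^M = 1$; powers of $q$ and $t$ are computed from their exponents to avoid branch ambiguities. The function name, the value $g = 0.6$ and the list of $(N, M)$ pairs are illustrative choices rather than anything prescribed by the text.

```python
import cmath
from itertools import product

def terminating_sum_defect(N, M, g):
    """Return |sum_m Delta(m) - N_0| for t = q**g, q = exp(2*pi*i/((N+1)*g + M))."""
    alpha = 2 * cmath.pi / ((N + 1) * g + M)

    def qp(e):
        # q**e for q = exp(i*alpha); t**c is written as q**(g*c)
        return cmath.exp(1j * alpha * e)

    def poch(e, n):
        # (q**e; q)_n for n >= 0
        out = 1.0 + 0j
        for i in range(n):
            out *= 1 - qp(e + i)
        return out

    def delta(m):
        # weight function Delta(m) of Proposition B.2, m = (m_1, ..., m_{N+1})
        expo = sum((N + 2 - 2 * j) * m[j - 1] for j in range(1, N + 2))
        out = qp(-g * expo)
        for j in range(1, N + 2):
            for k in range(j + 1, N + 2):
                d = m[j - 1] - m[k - 1]
                out *= (1 - qp(g * (k - j) + d)) / (1 - qp(g * (k - j)))
                out *= poch(g * (1 + k - j), d) / poch(1 + g * (k - j - 1), d)
        return out

    lhs = sum(delta(m + (0,))
              for m in product(range(M + 1), repeat=N)
              if all(m[i] >= m[i + 1] for i in range(N - 1)))
    rhs = (N + 1) + 0j
    for j in range(1, N + 1):
        rhs *= poch(1 + g * j, M - 1)
    return abs(lhs - rhs)

defects = [terminating_sum_defect(N, M, 0.6)
           for N, M in [(1, 3), (2, 1), (2, 3), (3, 2)]]
```

For $N = 1$ the same routine reproduces the terminating sum (A.7) of Appendix A.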
Fig. 4. The contragredient (or complementary) partition $n^*$ of a partition $n \in \mathbb{N}^{N+1}$
For Schur functions ($t = q$) the formula in (B.11) is well-known, see e.g. Stanley [St, Eq. (11)]; it expresses an equivalence between (the characters of) the irreducible representation of $SL(N+1, \mathbb{C})$ associated to the partition $n^*$ and the representation contragredient to the one associated to $n$. We have not been able to locate a reference for the property (B.11) applying to the general $(q, t)$-Macdonald symmetric functions, but most likely it was known already for this case too.

With the aid of (B.11), one rewrites the equality of Proposition B.2 in the form
\[
\sum_{\substack{m \in \mathbb{N}^{N+1} \\ M \ge m_1 \ge \cdots \ge m_{N+1} = 0}} q^{-\frac{|m|(|n|+|k|)}{N+1}}\, p_n(\tau q^m)\, p_k(\tau q^m)\, \Delta(m)
= \begin{cases}
0 & \text{if } k \ne n^* \bmod \mathbb{Z}(e_1 + \cdots + e_{N+1}), \\
t^{\frac{N(|n|+|k|)}{2}}\, N_0\, N(n) & \text{if } k = n^* \bmod \mathbb{Z}(e_1 + \cdots + e_{N+1}),
\end{cases} \tag{B.12a}
\]
or equivalently in the normalization of (B.6) (cf. also (4.15b))
\[
\sum_{\substack{m \in \mathbb{N}^{N+1} \\ M \ge m_1 \ge \cdots \ge m_{N+1} = 0}} q^{-\frac{|m|(|n|+|k|)}{N+1}}\, P_n(\tau q^m)\, P_k(\tau q^m)\, \Delta(m)
= \begin{cases}
0 & \text{if } k \ne n^* \bmod \mathbb{Z}(e_1 + \cdots + e_{N+1}), \\
\dfrac{N_0}{\Delta(n)} & \text{if } k = n^* \bmod \mathbb{Z}(e_1 + \cdots + e_{N+1}),
\end{cases} \tag{B.12b}
\]
as a rational identity in $t$ and $q^{\frac{1}{N+1}}$ (or even $q$ if $k^* = n \bmod \mathbb{Z}(e_1 + \cdots + e_{N+1})$) subject to the relation $t^{N+1} q^M = 1$, where $n$ and $k$ are partitions in $\mathbb{N}^{N+1}$ with $n_1 - n_{N+1},\, k_1 - k_{N+1} \le M$. (To derive (B.12b) from (B.12a) one divides by $p_n(\tau) p_k(\tau)$ and uses that $t^{-N|n|/2} p_n(\tau) = t^{-N|n^*|/2} p_{n^*}(\tau)$ and that $t^{N|n|} N(n)/(p_n(\tau))^2 = 1/\Delta(n)$.)
Remarks. i. In full generality, Macdonald defined his symmetric polynomials for an arbitrary integral root system [M3]. For the nonreduced BC root systems, Koornwinder [Ko2] subsequently found a further generalization leading to a class of Askey–Wilson polynomials [GR, KS] in several variables that contains all Macdonald polynomials
associated to the classical root systems as special cases (cf. [D1, Sect. 5]). In [DS] finite-dimensional discrete orthogonality properties of a type analogous to those described by Proposition B.2 were studied for Koornwinder’s generalized BC Askey–Wilson–Macdonald polynomials. In the case of the discrete orthogonality structure, the BC polynomials in question may be viewed as a multivariate generalization of the well-known q-Racah polynomials introduced by Askey and Wilson [AW, GR, KS].

ii. For $t = q^g$ with $g$ a nonnegative integer, the condition $t^{N+1} q^M = 1$ implies that $q$ is a root of unity. In this special case the properties of the Macdonald polynomials were studied by Kirillov, Jr. and Cherednik [Ki, C2]. In particular, Kirillov, Jr. connects the Macdonald polynomials at issue with the representation theory of the quantum group (quantized enveloping algebra) $U_q(\mathfrak{sl}_{N+1})$ for $q$ a root of unity.

Acknowledgement. We would like to express our gratitude to Amine El Gradechi for some very illuminating explanations concerning the geometric quantization of complex (Kähler) manifolds and to Igor Lutsenko for helping with the figures. Thanks are furthermore due to François Ziegler and Anatol N. Kirillov for drawing our attention to Refs. [GS2] and [St], respectively. The help of a referee in eliminating some inaccuracies in the formulas is also very much appreciated. Most of the results reported in this paper were obtained while the authors were visiting the Mathematical Sciences Research Institute (MSRI) in Berkeley, taking part in the Combinatorics Program on Representation Theory and Symmetric Functions (Spring, 1997). It is a pleasure to thank the organizers for their invitation and the institute for its hospitality. The research was supported in part by the Natural Sciences and Engineering Research Council (NSERC) of Canada, le Fonds pour la Formation de Chercheurs et l’Aide à la Recherche (FCAR) du Québec, and at the MSRI in part by NSF grant #DMS 9022140.
References

[Ao] Aomoto, K.: On product formulae for Jackson integrals associated with root systems. Preprint 1994
[AW] Askey, R., Wilson, J.: A set of orthogonal polynomials that generalize the Racah coefficients or 6-j symbols. SIAM J. Math. Anal. 10, 1008–1016 (1979)
[At] Atiyah, M.F.: Convexity and commuting Hamiltonians. Bull. London Math. Soc. 14, 1–15 (1982)
[B] Bourbaki, N.: Groupes et algèbres de Lie, Chapitres 4–6. Paris: Hermann, 1968
[C1] Cherednik, I.: Double affine Hecke algebras and Macdonald’s conjectures. Ann. Math. 141, 191–216 (1995)
[C2] Cherednik, I.: Macdonald’s evaluation conjectures and difference Fourier transform. Invent. Math. 122, 119–145 (1995)
[DKJM] Date, E., Kashiwara, M., Jimbo, M., Miwa, T.: Transformation groups for soliton equations. In: Jimbo, M., Miwa, T. (eds.) Nonlinear integrable systems – Classical theory and quantum theory, Singapore: World Scientific, 1983, pp. 39–119
[D1] van Diejen, J.F.: Commuting difference operators with polynomial eigenfunctions. Compositio Math. 95, 183–233 (1995)
[D2] van Diejen, J.F.: On the diagonalization of difference Calogero-Sutherland models. In: Levi, D., Vinet, L., Winternitz, P. (eds.) Symmetries and integrability of difference equations, CRM Proceedings and Lecture Notes, vol. 9, Providence, R.I.: Am. Math. Soc., 1996, pp. 79–89
[D3] van Diejen, J.F.: On certain multiple Bailey, Rogers and Dougall type summation formulas. Publ. Res. Inst. Math. Sci. 33, 483–508 (1997)
[DS] van Diejen, J.F., Stokman, J.V.: Multivariable q-Racah polynomials. Duke Math. J. 91, 89–136 (1998)
[DG] Duistermaat, J., Grünbaum, F.A.: Differential equations in the spectral parameter. Commun. Math. Phys. 103, 177–240 (1986)
[EK] Etingof, P.I., Kirillov, Jr., A.A.: Representation-theoretic proof of the inner product and symmetry identities for Macdonald’s polynomials. Compositio Math. 102, 179–202 (1996)
[GR] Gasper, G., Rahman, M.: Basic hypergeometric series. Cambridge: Cambridge University Press, 1990
[GN] Gorsky, A., Nekrasov, N.: Relativistic Calogero–Moser model as gauged WZW theory. Nucl. Phys. B 436, 582–608 (1995)
[G] Grünbaum, F.A.: Some bispectral musings. In: Harnad, J., Kasman, A. (eds.) The bispectral problem, CRM Proceedings and Lecture Notes, vol. 14, Providence, R.I.: Am. Math. Soc., 1998, pp. 31–46
[GS1] Guillemin, V., Sternberg, S.: Convexity properties of the moment mapping. Invent. Math. 67, 491–513 (1982)
[GS2] Guillemin, V., Sternberg, S.: The Gelfand-Cetlin system and quantization of the complex flag manifolds. J. Funct. Anal. 52, 106–128 (1983)
[HK] Hirzebruch, F., Kodaira, K.: On the complex projective spaces. J. Math. Pures Appl. (Neuvième Série) 36, 201–216 (1957)
[Ho1] Hollowood, T.: Solitons in affine Toda field theories. Nucl. Phys. B 384, 523–540 (1992)
[Ho2] Hollowood, T.: Quantizing the sl(n) solitons and the Hecke algebra. Internat. J. Modern Phys. A 8, 947–981 (1993)
[Hu] Hurt, N.E.: Geometric quantization in action. Dordrecht: D. Reidel Publishing Co., 1983
[I1] Ito, M.: On a theta product formula for Jackson integrals associated with root systems of rank two. J. Math. Anal. Appl. 216, 122–163 (1997)
[I2] Ito, M.: A theta product formula for Jackson integrals associated with root systems. Proc. Japan Acad. Ser. A: Math. Sci. 73, 60–61 (1997)
[JM] Jimbo, M., Miwa, T.: Solitons and infinite-dimensional Lie algebras. Publ. Res. Inst. Math. Sci. 19, 943–1001 (1983)
[Ka] Kaneko, J.: q-Selberg integrals and Macdonald polynomials. Ann. Sci. École Norm. Sup. (4) 29, 583–637 (1996)
[Ki] Kirillov, Jr., A.A.: On an inner product in modular tensor categories. J. Am. Math. Soc. 9, 1135–1169 (1996)
[Ko1] Koornwinder, T.H.: Self-duality for q-ultraspherical polynomials associated with the root system $A_n$. Unpublished Manuscript 1988
[Ko2] Koornwinder, T.H.: Askey-Wilson polynomials for root systems of type BC. In: Richards, D. St. P. (ed.) Hypergeometric functions on domains of positivity, Jack polynomials, and applications, Contemp. Math., vol. 138, Providence, R.I.: Am. Math. Soc., 1992, pp. 189–204
[KS] Koekoek, R., Swarttouw, R.F.: The Askey-scheme of hypergeometric orthogonal polynomials and its q-analogue. Math. report Delft University of Technology 94-05, 1994
[Ma] Matsumoto, H.: Analyse harmonique dans les systèmes de Tits bornologiques de type affine. Lecture Notes in Math., vol. 590, Berlin: Springer-Verlag, 1977
[M1] Macdonald, I.G.: The Poincaré series of a Coxeter group. Math. Ann. 199, 151–174 (1972)
[M2] Macdonald, I.G.: Orthogonal polynomials associated with root systems. Unpublished Manuscript 1988
[M3] Macdonald, I.G.: Orthogonal polynomials associated with root systems. In: Nevai, P. (ed.) Orthogonal polynomials: Theory and practice, NATO ASI Series C, vol. 294, Dordrecht: Kluwer Academic Publishers, 1990, pp. 311–318
[M4] Macdonald, I.G.: Symmetric functions and Hall polynomials (2nd edition). Oxford: Clarendon Press, 1995
[M5] Macdonald, I.G.: A formal identity for affine root systems. Preprint 1996
[Mo] Moser, J.: Three integrable Hamiltonian systems connected with isospectral deformations. Adv. Math. 16, 197–220 (1975)
[R1] Ruijsenaars, S.N.M.: Complete integrability of relativistic Calogero–Moser systems and elliptic function identities. Commun. Math. Phys. 110, 191–213 (1987)
[R2] Ruijsenaars, S.N.M.: Finite-dimensional soliton systems. In: Kupershmidt, B. (ed.) Integrable and superintegrable systems, Singapore: World Scientific, 1990, pp. 165–206
[R3] Ruijsenaars, S.N.M.: Action-angle maps and scattering theory for some finite-dimensional integrable systems III. Sutherland type systems and their duals. Publ. Res. Inst. Math. Sci. 31, 247–353 (1995)
[R4] Ruijsenaars, S.N.M.: Systems of Calogero–Moser type. In: Semenoff, G., Vinet, L. (eds.) Proceedings of the 1994 Banff summer school Particles and Fields (to appear)
[R5] Ruijsenaars, S.N.M.: Integrable particle systems vs solutions to the KP and 2D Toda equations. Ann. Phys. (N.Y.) 256, 226–301 (1997)
[RS] Ruijsenaars, S.N.M., Schneider, H.: A new class of integrable systems and its relations to solitons. Ann. Phys. (N.Y.) 170, 370–405 (1986)
[Si] Simms, D.J.: Geometric quantisation of the harmonic oscillator with diagonalised Hamiltonian. In: Janner, A., Janssen, T. (eds.) Proceedings of the 2nd international colloquium on group theoretical methods in physics, vol. 1, Nijmegen: Catholic University of Nijmegen, 1973, pp. A168–A181
[St] Stanley, R.P.: GL(n, C) for combinatorialists. In: Lloyd, E.K. (ed.) Surveys in combinatorics, London Mathematical Society Lecture Note Series, vol. 82, Cambridge: Cambridge University Press, 1983, pp. 187–199
[W] Wilson, G.: Bispectral commutative ordinary differential operators. J. Reine Angew. Math. 442, 177–204 (1993)
[ZC] Zhu, Z., Caldi, D.G.: Multi-soliton solutions for affine Toda models. Nucl. Phys. B 436, 659–678 (1995)
Communicated by T. Miwa
Commun. Math. Phys. 197, 75 – 107 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Conformal Invariance of Voronoi Percolation
Itai Benjamini, Oded Schramm
Mathematics Department, The Weizmann Institute of Science, Rehovot 76100, Israel
Received: 16 January 1998 / Accepted: 13 February 1998
Abstract: It is proved that in the Voronoi model for percolation in dimensions 2 and 3, the crossing probabilities are asymptotically invariant under conformal change of metric. To define Voronoi percolation on a manifold $M$, you need a measure $\mu$, and a Riemannian metric $ds$. Points are scattered according to a Poisson point process on $(M, \mu)$, with some density $\lambda$. Each cell in the Voronoi tessellation determined by the chosen points is declared open with some fixed probability $p$, and closed with probability $1 - p$, independently of the other cells. The above conformal invariance statement means that under certain conditions, the probability for an open crossing between two sets is asymptotically unchanged, as $\lambda \to \infty$, if the metric $ds$ is replaced by any (smoothly) conformal metric $ds'$. Additionally, it is conjectured that if $\mu$ and $\mu'$ are two measures comparable to the Riemannian volume measure, then replacing $\mu$ by $\mu'$ does not affect the limiting crossing probabilities.
1. Introduction

Let $\gamma$ be a simple closed curve in $\mathbb{C} = \mathbb{R}^2$, and let $D$ be the closed topological disk which it bounds. Pick two disjoint arcs $\gamma_1, \gamma_2 \subset \gamma$. Let $\epsilon > 0$ be small, and let $\epsilon\mathbb{Z}^2$ denote the square grid rescaled by $\epsilon$. Fix some $p \in [0, 1]$ and declare each edge in $\epsilon\mathbb{Z}^2$ to be open with probability $p$, and closed with probability $1 - p$, independently of the other edges. This is just the standard bond percolation model on the square grid; for background and history, see [9]. Let $PC_{\epsilon,p}(D, \gamma_1, \gamma_2)$ be the probability that there is a path of open edges in the subgraph of $\epsilon\mathbb{Z}^2$ lying in $D$ that connects a vertex which has an edge crossing $\gamma_1$ to a vertex which has an edge crossing $\gamma_2$. This is called the crossing probability for $(D, \gamma_1, \gamma_2)$ in the bond percolation model with parameters $\epsilon$, $p$. The main interest is in the limit as $\epsilon \to 0$. H. Kesten [10] proved that the critical probability $p_c$ (the least $p$ above which there is an infinite open connected component with probability 1) for bond percolation on the square lattice is 1/2, and that
\[
0 < \liminf_{\epsilon \to 0} PC_{\epsilon,p_c}(D, \gamma_1, \gamma_2) \le \limsup_{\epsilon \to 0} PC_{\epsilon,p_c}(D, \gamma_1, \gamma_2) < 1.
\]
Although not proved yet, it is widely believed that the limit
\[
PC_{0,p}(D, \gamma_1, \gamma_2) = \lim_{\epsilon \to 0} PC_{\epsilon,p}(D, \gamma_1, \gamma_2)
\]
exists for $p = p_c$. It is known that for $p \ne p_c$ the limits exist, and $PC_{0,p}(D, \gamma_1, \gamma_2)$ is 0 if $p < p_c$ and 1 if $p > p_c$. Aizenman, Langlands, Pouliot and Saint-Aubin have conjectured that the limits $PC_{0,p}(D, \gamma_1, \gamma_2)$ are conformally invariant. More precisely,

Conjecture 1.1 ([12]). Let $f : D \to \hat{D}$ be a homeomorphism of $D$ onto another topological disk $\hat{D} \subset \mathbb{C}$, and suppose that $f$ is conformal in the interior of $D$. Then
\[
PC_{0,p_c}(D, \gamma_1, \gamma_2) = PC_{0,p_c}\bigl(f(D), f(\gamma_1), f(\gamma_2)\bigr).
\]

This conjecture motivated the current work. In [11], numerical data from computer simulations were collected, estimating the crossing probabilities of rectangles. The discussion of these results led to the above conjecture. Subsequently, J. L. Cardy [6] found a heuristic argument supporting this conjecture, and derived (using arguments outside the scope of mathematics) a formula for the limiting crossing probabilities, in terms of the cross ratio of the images of the endpoints of $\gamma_1$, $\gamma_2$ under the conformal map from $D$ to the unit disk. Cardy’s formula matched the numerical data quite well. Later, Langlands et al. [12] obtained more precise numerical data, giving further support to the conjecture and to Cardy’s formula.

Although the current work does not settle the conjecture, it does prove a related conformal invariance property, which, in our view, is not less important. In order to discuss it, the Voronoi percolation model must be introduced. The precise definitions are given in Sect. 2, but a loose description will be given here. Let $M$ be a smooth manifold, and let $ds$ be a Riemannian metric on $M$. Let $\mu$ be a measure on $M$ that is comparable to vol, the Riemannian volume measure on $M$. The most interesting case is $\mu = \mathrm{vol}$. Take some parameters $p \in [0, 1]$, $\lambda > 0$. Now let $\omega$ be a Poisson point process on $(M, \mu)$, with density $\lambda$. Each cell in the Voronoi tiling with nuclei $\omega$ is declared open with probability $p$, and closed otherwise. Then one looks at crossing probabilities inside the union of all open tiles. The measure $\mu$ plays a role in the choice of the nuclei $\omega$, and the metric $ds$ is instrumental in defining the Voronoi tessellation.

Our main result is that, in dimension $d = 2$ or 3, asymptotically, the crossing probabilities are unchanged if the metric $ds$ is replaced by any other smoothly conformal metric. Note that the effect of a mapping $f$ is to change both the measure $\mu$ and the metric $ds$. The main advantage of the Voronoi percolation model is that it permits a separate treatment of the effects of the change in $\mu$ and the change in $ds$. We conjecture that in two dimensions $\mu$ may be changed to any comparable measure, without affecting the limiting crossing probabilities. It is shown that this Density Invariance Conjecture and our main result imply the analog of Conjecture 1.1 in the Voronoi model. Although this seems almost tautological at first sight, there is some work involved in dealing with some sticky boundary issues. Some numerical evidence supporting the Density Invariance Conjecture in dimension two is presented here. The simulations also suggest that the limiting crossing probabilities for Voronoi percolation in dimension 2 are the same as in the $\mathbb{Z}^2$ model.
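The loose description above can be turned into a small Monte Carlo sketch. It is only an illustration under simplifying assumptions that are ours, not the paper's: $M$ is the unit square with the Euclidean metric and Lebesgue measure, the crossing is from the left side to the right side, and the union of open tiles is approximated by rasterizing onto a pixel grid with 4-neighbour connectivity. All function names are hypothetical, and this is not the simulation code behind the numerical evidence reported in the paper.

```python
import random
from collections import deque


def sample_colored_nuclei(lam, p, rng):
    """Poisson(lam) many uniform nuclei in the unit square, each open with probability p."""
    # Count unit-rate exponential arrivals before time lam: this count is Poisson(lam).
    count, clock = 0, rng.expovariate(1.0)
    while clock < lam:
        count += 1
        clock += rng.expovariate(1.0)
    return [(rng.random(), rng.random(), rng.random() < p) for _ in range(count)]


def open_pixel_grid(nuclei, grid):
    """cells[i][j] is True when the centre of pixel (i, j) lies in an open Voronoi tile."""
    cells = []
    for i in range(grid):
        row = []
        for j in range(grid):
            x, y = (i + 0.5) / grid, (j + 0.5) / grid
            nearest = min(nuclei, key=lambda z: (z[0] - x) ** 2 + (z[1] - y) ** 2)
            row.append(nearest[2])
        cells.append(row)
    return cells


def has_left_right_crossing(cells):
    """Breadth-first search through open pixels from the column i = 0 to the column i = grid - 1."""
    grid = len(cells)
    queue = deque((0, j) for j in range(grid) if cells[0][j])
    seen = set(queue)
    while queue:
        i, j = queue.popleft()
        if i == grid - 1:
            return True
        for a, b in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if 0 <= a < grid and 0 <= b < grid and cells[a][b] and (a, b) not in seen:
                seen.add((a, b))
                queue.append((a, b))
    return False


def crossing_frequency(lam, p, grid=40, trials=50, seed=0):
    """Fraction of sampled configurations with an (approximate) open left-right crossing."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        nuclei = sample_colored_nuclei(lam, p, rng)
        if nuclei and has_left_right_crossing(open_pixel_grid(nuclei, grid)):
            hits += 1
    return hits / trials
```

For instance, `crossing_frequency(200.0, 0.5)` gives a rough empirical crossing frequency for the unit square at $p = 1/2$; the pixel approximation and the small sample size make this a qualitative illustration only.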
The impression that one might get from Conjecture 1.1 is that the conformal invariance has something to do with analyticity, since conformal maps are analytic. In fact, as the physics literature suggests [2], this impression is erroneous. Our main result shows that the conformal invariance is much more general, and holds outside the realm of analytic maps and dimension 2.

The Voronoi percolation model has been introduced into the mathematical literature by M. Q. Vahidi-Asl and J. C. Wierman [15], in the context of first passage percolation. Here are some useful properties of this model:

• Rotation invariance.
• Duality: in dimension 2 and $p = 1/2$, the union of open tiles has the same stochastic behavior as the set of closed tiles. Based on this, A. Zvavitch [16] has shown that there is no unbounded open cluster (component) for Voronoi percolation with $p = 1/2$ in $\mathbb{R}^2$.
• Generality: the model makes sense in the setting of Riemannian manifolds. In particular, the theory of Voronoi percolation in the hyperbolic plane is interesting [4].
• Separation of measure and metric, as discussed above.
• Gradual refinement: one may pass from a configuration to a denser configuration by inserting new random points one by one. In contrast, when refining a grid, it is necessary to make drastic changes.

The reader may wish to look into the work of M. Aizenman [1], who constructs a continuous limit of percolation models using Voronoi percolation.

The plan of the paper is as follows. Sect. 2 gives precise definitions, and the statement of the main results. A brief outline of the proof is sketched in Sect. 3, while Sects. 4 through 9 provide the details. Of these, Sects. 4 through 6 are geometric in nature, and Sects. 7 and 8 are probabilistic. Sect. 9 assembles the pieces together and completes the proof. Finally, Sect. 10 introduces the Density Invariance Conjecture, presents numerical evidence for it in dimension two, and shows that it implies the analog of Conjecture 1.1 in the Voronoi percolation setting.

2. The Voronoi Percolation Model and Statement of the Main Result

Throughout the paper, $M$ will be a smooth Riemannian manifold, $d$ will be the dimension of $M$, and $ds$ will denote the Riemannian metric on $M$. Let $d_0(\cdot, \cdot)$ be the distance function associated with $(M, ds)$. Also associated with $ds$ is the natural volume measure, vol. Let $\mu$ be a measure on $M$ comparable to vol, which means that there is a constant $c > 0$ such that $c^{-1}\,\mathrm{vol}(A) \le \mu(A) \le c\,\mathrm{vol}(A)$ for every measurable $A \subset M$.

Given parameters $\lambda > 0$, $p \in [0, 1]$, one defines the Voronoi percolation process on $(M, ds, \mu, \lambda, p)$, as follows. Let $\Omega$ be the space of all subsets $\omega$ of $M$ such that the intersection of $\omega$ with any compact subset of $M$ is finite. There is a (Borel) probability measure $P_\lambda$ on $\Omega$ given by the Poisson point process on $(M, \mu)$ with density $\lambda$. The measure $P_\lambda$ is characterized by the formula
\[
P_\lambda\bigl(|\omega \cap A| = k\bigr) = \frac{\bigl(\lambda\mu(A)\bigr)^k}{k!}\,\exp\bigl(-\lambda\mu(A)\bigr), \tag{2.1}
\]
for every measurable $A$ (with finite measure) and every integer $k \ge 0$, and by the requirement that $|\omega \cap A_1|, \ldots, |\omega \cap A_n|$ are independent random variables when $A_1, \ldots, A_n$ are disjoint measurable sets. Here, and below, for any set $X$, the cardinality of $X$ will be denoted $|X|$.
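As a quick sanity check of formula (2.1), the following sketch evaluates the counting distribution for a set $A$ of given measure and confirms numerically that the probabilities sum to 1 and that the expected number of points in $A$ is $\lambda\mu(A)$. The function name and the sample values of $\lambda$ and $\mu(A)$ are our own illustrative choices.

```python
import math


def poisson_count_pmf(k, lam, mu_A):
    """P_lambda(|omega ∩ A| = k) from Eq. (2.1), for a set A with mu(A) = mu_A."""
    mean = lam * mu_A
    return mean ** k * math.exp(-mean) / math.factorial(k)


lam, mu_A = 4.0, 0.5            # hypothetical density and measure of A
pmf = [poisson_count_pmf(k, lam, mu_A) for k in range(60)]

total = sum(pmf)                                   # close to 1
mean_count = sum(k * q for k, q in enumerate(pmf))  # close to lam * mu_A
void_probability = pmf[0]                          # equals exp(-lam * mu_A)
```

The value `pmf[0]` is the probability that $A$ contains no nucleus at all, which is the quantity that controls how large a Voronoi tile can be.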
The elements $\omega \in \Omega$ are called configurations. Let $\omega$ be some configuration. Given any $z \in \omega$, its Voronoi tile $T(z) = T(\omega, ds, z)$ is the set of all points $w \in M$ such that $d_0(w, z) \le d_0(w, z')$ for all $z' \in \omega$. The collection of all Voronoi tiles is the Voronoi tiling of $\omega$, and will be denoted $T(\omega, ds)$. It is indeed a tiling of $M$, except for the trivial case (which will henceforth be ignored) where $\omega = \emptyset$.

In Voronoi percolation, each tile of $T(\omega, ds)$ is declared open with probability $p$, and closed with probability $1 - p$, independently, and one studies the connected components of the union of all open tiles. We now make an equivalent, slightly different and more precise, formulation. Let $\hat\Omega = \Omega \times \Omega$. Then $P_{\lambda,p}$ is defined to be the product measure $P_{p\lambda} \times P_{(1-p)\lambda}$ on $\hat\Omega$. Given $\hat\omega = (\omega_o, \omega_c) \in \hat\Omega$, the set $\omega_o$ will be called the set of open nuclei, and $\omega_c$ is the set of closed nuclei. The projection map $\pi : \hat\Omega \to \Omega$ is defined by $\pi(\omega_o, \omega_c) = \omega_o \cup \omega_c$. If $\pi\hat\omega = \omega$, then $\hat\omega$ will be called a coloring of $\omega$. The elements of $\hat\Omega$ are called colored configurations.

Let $\tau \in \Omega$ be distributed according to $P_\lambda$, and let $\tau_o$ be a random subset of $\tau$, chosen so that for any $x \in \tau$ the probability for $x \in \tau_o$ is $p$, and for different $x, x' \in \tau$ the events $x \in \tau_o$, $x' \in \tau_o$ are independent. Then it is not hard to verify that $(\tau_o, \tau - \tau_o)$ is distributed according to $P_{\lambda,p}$. This means that a legitimate way of generating a $P_{\lambda,p}$-random $\hat\omega$ is by first selecting a $P_\lambda$-random $\omega$ and then selecting an appropriate random coloring of it. We shall make use of these two distinct ways of generating a $P_{\lambda,p}$-random colored configuration.

Given a colored configuration $\hat\omega \in \hat\Omega$, the tiles in $T(\hat\omega, ds) = T(\pi\hat\omega, ds)$ which belong to open nuclei are called open tiles, and the other tiles are closed tiles. We soon define the crossing events and the crossing probabilities.
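The two ways of generating a $P_{\lambda,p}$-random colored configuration can be sketched as follows. This is a minimal illustration under assumptions of our own: $M$ is the unit square and $\mu$ is Lebesgue measure, so that a Poisson process of density $\lambda$ has a Poisson($\lambda$) number of uniformly placed points. The function names are hypothetical.

```python
import random


def poisson_points(rate, rng):
    """Poisson point process of intensity `rate` on the unit square (mu = Lebesgue measure)."""
    # The number of unit-rate exponential arrivals before time `rate` is Poisson(rate).
    count, clock = 0, rng.expovariate(1.0)
    while clock < rate:
        count += 1
        clock += rng.expovariate(1.0)
    return [(rng.random(), rng.random()) for _ in range(count)]


def colored_by_product(lam, p, rng):
    """First way: independent open and closed nuclei with densities p*lam and (1-p)*lam."""
    return poisson_points(p * lam, rng), poisson_points((1.0 - p) * lam, rng)


def colored_by_thinning(lam, p, rng):
    """Second way: sample nuclei of density lam, then color each one open with probability p."""
    opened, closed = [], []
    for z in poisson_points(lam, rng):
        (opened if rng.random() < p else closed).append(z)
    return opened, closed


rng = random.Random(0)
open_nuclei, closed_nuclei = colored_by_thinning(50.0, 0.3, rng)
```

Both functions return a pair (open nuclei, closed nuclei). Averaged over many samples, the number of open nuclei is close to $p\lambda$ in either construction, which is consistent with the equivalence asserted in the text but of course does not prove it.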
Perhaps the cleanest situation to deal with is one in which there is no boundary: $M$ is compact (and boundaryless), and one is looking for percolation in homotopy classes; that is, the “crossing” event is the event that there is a closed curve, contained in the union of open tiles, which is in a prescribed homotopy class. However, this is not the situation prevalent in the literature. The definitions below are not the most natural ones, with respect to the way the boundary is dealt with. They have been adopted because they make the proofs easier (that is, possible), and since we feel that it is better to leave the boundary issues to future investigations.

Let $M'$ be a compact $d$-dimensional set in $M$, which has smooth boundary, and let $S_1, S_2 \subset M'$ be two open disjoint sets, with smooth boundary. Given $\hat\omega = (\omega_o, \omega_c) \in \hat\Omega$, let $T'_O(\hat\omega, ds)$ be the union of open tiles of $T(\hat\omega, ds)$ which have nuclei in $M'$, and let $C = C(M, M', S_1, S_2, ds) \subset \hat\Omega$ be the event that there is a connected component of $T'_O(\hat\omega, ds)$ which intersects both $\omega_o \cap S_1$ and $\omega_o \cap S_2$. If $\hat\omega \in C$, we say that there is a crossing from $S_1$ to $S_2$ in $(M, M', \hat\omega, ds)$.

Now suppose that $u : M \to \mathbb{R}$ is a smooth function, and consider the metric $e^u ds$, which is conformal to our original metric $ds$.

2.1. Conformal invariance theorem for percolation. Suppose that $d = \dim(M) = 2$ or 3. Let $I \subset (0, 1)$ be a compact interval. Then
\[
\lim_{\lambda \to \infty} P_{\lambda,p}\Bigl( C\bigl(M, M', S_1, S_2, ds\bigr) - C\bigl(M, M', S_1, S_2, e^u ds\bigr) \Bigr) = 0,
\]
uniformly for $p \in I$.

This means that the set of configurations $\hat\omega \in \hat\Omega$ for which there is a crossing with respect to the metric $ds$, but not with respect to the conformal metric $e^u ds$, has measure tending to 0 as $\lambda \to \infty$, and the convergence is uniform in $p$ as long as $p$ is kept away
from 0 and 1. In particular, when $\lambda$ is large, the probability of $C\bigl(M, M', S_1, S_2, ds\bigr)$ is approximately the same as the probability of $C\bigl(M, M', S_1, S_2, e^u ds\bigr)$, and the same is true for intersections of such events.

Actually, the theorem is true even with $I = [0, 1]$. To prove this one needs to show that for some constant $\delta > 0$, we have
\[
\lim_{\lambda \to \infty} P_{\lambda,p}\Bigl( C\bigl(M, M', S_1, S_2, ds\bigr) \Bigr) = 0,
\]
uniformly for $p \in [0, \delta]$, and
\[
\lim_{\lambda \to \infty} P_{\lambda,p}\Bigl( C\bigl(M, M', S_1, S_2, ds\bigr) \Bigr) = 1,
\]
uniformly for $p \in [1 - \delta, 1]$. These facts, which are actually valid in greater generality, are not hard. (The analogous statements in the discrete setting are certainly well known.) But because the methods involved are almost disjoint from those of this paper, and for the sake of keeping the size of the article reasonable, the proof will be delayed to some future work.

The point about the limit in Theorem 2.1 being uniform is that one may let $p$ depend on $\lambda$ and tend to $p_c$ as $\lambda \to \infty$, and still the theorem applies. Any value can be prescribed for $\lim_{\lambda \to \infty} P_{\lambda,p}\bigl( C(M, M', S_1, S_2, ds) \bigr)$, if $p$ is an appropriate function of $\lambda$. This issue is even more important in dimension 3, since it has not been proved in any model that the limit $\lim_{\lambda \to \infty} P_{\lambda,p_c}\bigl( C(M, M', S_1, S_2, ds) \bigr)$ is not always 1 or 0.

We now discuss a variant of the theorem involving percolation in homotopy classes. Let $\alpha$ be a collection of homotopy classes of $M'$ and let $C(M, M', \alpha, ds) \subset \hat\Omega$ denote the event that there is a path in $T'_O(\hat\omega, ds)$ which realizes a homotopy class in $\alpha$.

2.2. Conformal invariance theorem for percolation in homotopy classes. Suppose that $d = \dim(M) = 2$ or 3. Let $I \subset (0, 1)$ be a compact interval. Then
\[
\lim_{\lambda \to \infty} P_{\lambda,p}\Bigl( C\bigl(M, M', \alpha, ds\bigr) - C\bigl(M, M', \alpha, e^u ds\bigr) \Bigr) = 0,
\]
uniformly for $p \in I$.

The same proof applies to both theorems.

3. Brief Outline of the Proof of Theorems 2.1 and 2.2

Consider a configuration $\omega \in \Omega$, and the Voronoi tilings $T_0$, $T_1$ produced by using the two metrics $ds$ and $e^u ds$. A situation where there are two neighboring tiles in $T_0$ and the corresponding tiles in $T_1$ do not neighbor is called a defect. The first step is to analyse the geometry of configurations that are defect prone. We shall find that for compact sets in dimension $d$, in configurations with approximately $n^d$ cells, the typical number of defects is in the order of $n^{d-2}$. In particular, for $d = 2$, the expected number of defects is finite.

It turns out that the best way to deal with the defects is to think of a typical configuration as a defect-free configuration, with defects added on top of it by an independent (sort of) Poisson process, which has small density. In practice, much effort is required to make this philosophy work.
In dimensions 2 and 3, defects turn out to be rare enough so that they do not affect percolation. The effect of the defects added on top of a defect-free configuration is majorized by changing the status of all tiles intersecting sufficiently large spherical shells about the location of the defect, from open to closed, say. We shall need quite delicate tail estimates for the number of tiles intersecting such spherical shells. Using these estimates, and a second moment argument, it will follow that (for $d = 2, 3$), with high probability, these haphazard defects will not destroy percolation.

Almost all of the proof does not assume $d = 2, 3$; only at the very end shall we apply this restriction. Perhaps this might be valuable in the future, in extending the results to higher dimensions. From time to time, remarks will be made, hinting how the proof may be simplified if one restricts to the case $M = \mathbb{R}^2$.

4. The Geometry of Defects

We consider some fixed configuration $\omega \in \Omega$. Recall that $d_0(\cdot, \cdot)$ denotes the metric on $M$ corresponding to the Riemannian metric $ds$, and let $d_1(\cdot, \cdot)$ be the metric corresponding to the conformal Riemannian metric $e^u ds$. Let $T_0(\omega)$ be the Voronoi tessellation for $\omega$ with respect to $d_0(\cdot, \cdot)$, and let $T_1(\omega)$ be the Voronoi tessellation obtained by using the metric $d_1(\cdot, \cdot)$. A defect is a pair of points $p_1, p_2 \in \omega$ such that the Voronoi tiles $T_0(p_1)$, $T_0(p_2)$ are adjacent, but the corresponding tiles $T_1(p_1)$, $T_1(p_2)$ are not. That is, $T_0(p_1) \cap T_0(p_2) \ne \emptyset = T_1(p_1) \cap T_1(p_2)$.

Lemma 4.1. Let $K$ be a compact subset of $M$. There is a constant $C = C(u, K) > 0$ with the following property. Suppose that $q, p_1, p_2 \in K$ and $r \in (0, C^{-1})$ satisfy $d_0(q, p_1) = d_0(q, p_2) = r$. Then there is a $q_1 \in M$ satisfying
(1) $d_0(q_1, q) < C r^2$,
(2) $d_1(q_1, p_1) = d_1(q_1, p_2)$, and
(3) $|d_1(q_1, p_1) - d_1(q_1, p)| < C r^3$, for any $p \in M$ satisfying $d_0(q, p) = r$.
One fact the lemma tells us is that a small ball in one metric is very close to a ball in a conformal metric. In general, the two balls should be allowed to have different centers, in order to obtain the correct order of approximation.

In the particular situation where $(M, ds)$ is a domain in the plane with the Euclidean metric and the metric $e^u ds$ is the pullback to $M$ of the Euclidean metric under a conformal map $f : M \to \mathbb{C}$, the lemma is significantly easier. One may take $q_1$ as the center of the circle which is the image of the circle $|z - q| = r$ under a Möbius transformation which agrees with $f$ at $q, p_1, p_2$. Then $C$ is bounded by a constant times the maximum modulus of the Schwarzian derivative of $f$ near $q$.

Proof. Since the restriction of $u$ to $K$ is bounded, and the lemma is not affected if we add a bounded constant to $u$, we assume without loss of generality that $u(q) = 0$. Set $u_t = tu$, and let $d_t(\cdot, \cdot)$ be the metric induced by the Riemannian metric $e^{u_t} ds$. Then $d_t$ is a one parameter family of metrics, interpolating between $d_0$ and $d_1$. We shall first solve the differential problem; that is, a tangent vector $v$ will be found such that for a path $q(t)$ in $M$ satisfying $q(0) = q$, $q'(0) = v$, the conditions
(1’) $|v| < C r^2$,
(2’) $\frac{d}{dt} d_t\bigl(q(t), p_1\bigr) = \frac{d}{dt} d_t\bigl(q(t), p_2\bigr)$ at $t = 0$, and
(3’) $\bigl| \frac{d}{dt} d_t\bigl(q(t), p_1\bigr) - \frac{d}{dt} d_t\bigl(q(t), p\bigr) \bigr| < C r^3$ at $t = 0$, for every $p \in M$ satisfying $d_0(q, p) = r$,
are satisfied. It will then be quite easy to get the original statement from the differential statement.

Our first goal is to estimate the derivative $\frac{\partial}{\partial t} d_t(q, p_1)$ at $t = 0$. Since $r$ is as small as we wish, we may assume that any geodesic segment joining two points whose distance is at most $2r$ is unique, in any of the metrics $d_t$. Let $\gamma_t$ be the geodesic segment for the metric $d_t$ joining $q$ and $p_1$, and suppose that each $\gamma_t$ is parameterized according to arc-length. Set $g(x, y) = \mathrm{length}_{d_x}(\gamma_y)$, the length of $\gamma_y$ in the metric $d_x$. Then $g$ is smooth (since geodesics can be obtained by solving an ODE on the tangent bundle), and
\[
d_t(q, p_1) = g(t, t). \tag{4.1}
\]
Because the curve $\gamma_0$ is length minimizing in the metric $d_0$, the equation
\[
\frac{\partial}{\partial y} g(0, 0) = 0 \tag{4.2}
\]
holds. Therefore, (4.1) implies
\[
\frac{\partial}{\partial t}\Big|_{t=0} d_t(q, p_1) = \frac{\partial}{\partial x} g(0, 0) = \frac{\partial}{\partial x}\Big|_{x=0} \mathrm{length}_{d_x}(\gamma_0). \tag{4.3}
\]
Because $\gamma_0$ is parameterized according to arc-length, we have
\[
\mathrm{length}_{d_x}(\gamma_0) = \int_0^r e^{x u(\gamma_0(s))}\, ds.
\]
Together with (4.3), this gives
\[
\frac{\partial}{\partial t}\Big|_{t=0} d_t(q, p_1) = \int_0^r u\bigl(\gamma_0(s)\bigr)\, ds. \tag{4.4}
\]
Using local coordinates, and $u(q) = 0$, the following estimates are obtained,
\[
\gamma_0(s) = q + s\gamma_0'(0) + O(s^2), \qquad u\bigl(\gamma_0(s)\bigr) = s\,\nabla u(q) \cdot \gamma_0'(0) + O(s^2).
\]
Substituting this into (4.4) yields
\[
\frac{\partial}{\partial t}\Big|_{t=0} d_t(q, p_1) = \frac{1}{2} r^2\, \nabla u(q) \cdot \gamma_0'(0) + O(r^3). \tag{4.5}
\]
If $v$ is any tangent vector at $q$, and $q(t)$ is a path in $M$ with $q(0) = q$ and $q'(0) = v$, then we have
\[
\frac{d}{dt}\Big|_{t=0} d_0\bigl(q(t), p_1\bigr) = -v \cdot \gamma_0'(0),
\]
because $-\gamma_0'(0)$ is the gradient of the $d_0$-distance from $p_1$ at $q$. Hence, it follows from (4.5) that
\[
\frac{d}{dt}\Big|_{t=0} d_t\bigl(q(t), p_1\bigr) = \frac{d}{dt}\Big|_{t=0} d_0\bigl(q(t), p_1\bigr) + \frac{d}{dt}\Big|_{t=0} d_t(q, p_1) = -v \cdot \gamma_0'(0) + \frac{1}{2} r^2\, \nabla u(q) \cdot \gamma_0'(0) + O(r^3). \tag{4.6}
\]
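The passage from (4.4) to (4.5) can be illustrated numerically in the simplest setting. The sketch below assumes, purely for illustration, that $M = \mathbb{R}^2$ with the Euclidean metric, $q = (0, 0)$, and a specific polynomial $u$ with $u(q) = 0$ chosen by us; then $\gamma_0$ is a straight segment, the integral in (4.4) can be evaluated by the midpoint rule, and the difference from the leading term of (4.5) should scale like $r^3$.

```python
import math


def u(x, y):
    # Hypothetical smooth conformal factor with u(q) = 0 at q = (0, 0) and grad u(q) = (0.7, -0.4).
    return 0.7 * x - 0.4 * y + 1.5 * x * y + 2.0 * x * x


GRAD_U_AT_Q = (0.7, -0.4)


def first_variation(r, theta, steps=2000):
    """Midpoint-rule value of the integral in (4.4) along gamma_0(s) = s (cos theta, sin theta)."""
    h = r / steps
    c, s = math.cos(theta), math.sin(theta)
    return h * sum(u((k + 0.5) * h * c, (k + 0.5) * h * s) for k in range(steps))


def leading_term(r, theta):
    """The r^2 term of (4.5): (1/2) r^2 grad u(q) . gamma_0'(0)."""
    c, s = math.cos(theta), math.sin(theta)
    return 0.5 * r * r * (GRAD_U_AT_Q[0] * c + GRAD_U_AT_Q[1] * s)


# The ratio gap / r^3 stays bounded as r shrinks, consistent with the O(r^3) remainder in (4.5).
ratios = []
for r in (0.2, 0.1, 0.05):
    gap = first_variation(r, 1.0) - leading_term(r, 1.0)
    ratios.append(gap / r ** 3)
```

For this particular quadratic $u$ the ratio is essentially independent of $r$, because the remainder is exactly cubic; for a general smooth $u$ one would only expect it to remain bounded.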
Suppose that v has the form v=
1 2 r ∇u(q) + O(r3 ). 2
(4.7)
Then we get $\frac{d}{dt}\big|_{t=0} d_t\bigl(q(t), p_1\bigr) = O(r^3)$. The same would be equally true if $p_1$ is replaced by any $p$ such that $d_0(q, p) = r$, because the above expression for $v$ does not depend on $p_1$. Consequently, (1') and (3') would be satisfied for an appropriately chosen $C = C(u, K)$. So all that remains for the solution of the differential problem is to find the $O(r^3)$ term in the expression for $v$, which would guarantee (2').

Let $\beta_t$ be the geodesic segment joining $q$ and $p_2$ in the metric $d_t$, parametrized according to arc-length. We use the expression (4.4) and the corresponding expression with $\beta$ and $p_2$ replacing $\gamma$ and $p_1$, to get
\[ \frac{\partial}{\partial t}\Big|_{t=0} \bigl( d_t(q, p_1) - d_t(q, p_2) \bigr) = \int_0^r \Bigl( u\bigl(\gamma_0(s)\bigr) - u\bigl(\beta_0(s)\bigr) \Bigr)\, ds = \int_0^r \nabla u\bigl(\beta_0(s)\bigr) \cdot \bigl( \gamma_0(s) - \beta_0(s) \bigr)\, ds + O\Bigl( \int_0^r |\gamma_0(s) - \beta_0(s)|^2\, ds \Bigr). \tag{4.8} \]
Let $\alpha_w(s)$ denote the geodesic starting at $\alpha_w(0) = q$ with initial direction $\alpha_w'(0) = w$. Then $\alpha_w(s)$, $\alpha_w'(s)$ and $\alpha_w''(s)$ are smooth functions of $w$ and $s$. Consequently, for $s \in [0, r]$,
\[ \gamma_0(s) - \beta_0(s) = O\bigl( r |\gamma_0'(0) - \beta_0'(0)| \bigr), \tag{4.9} \]
\[ \gamma_0''(s) - \beta_0''(s) = O\bigl( |\gamma_0'(0) - \beta_0'(0)| \bigr). \tag{4.10} \]
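Using this in (4.8) gives
\begin{align*}
\frac{\partial}{\partial t}\Big|_{t=0} \bigl( d_t(q, p_1) - d_t(q, p_2) \bigr)
&= \int_0^r \nabla u(q) \cdot \bigl( s\gamma_0'(0) - s\beta_0'(0) \bigr)\, ds \\
&\quad + \int_0^r \nabla u(q) \cdot \bigl( \gamma_0(s) - s\gamma_0'(0) - \beta_0(s) + s\beta_0'(0) \bigr)\, ds \\
&\quad + \int_0^r \bigl( \nabla u(\beta_0(s)) - \nabla u(q) \bigr) \cdot \bigl( \gamma_0(s) - \beta_0(s) \bigr)\, ds + O\bigl( r^3 |\gamma_0'(0) - \beta_0'(0)|^2 \bigr) \\
&= \frac{1}{2} r^2\, \nabla u(q) \cdot \bigl( \gamma_0'(0) - \beta_0'(0) \bigr) \\
&\quad + \int_0^r \nabla u(q) \cdot \bigl( \gamma_0(s) - s\gamma_0'(0) - \beta_0(s) + s\beta_0'(0) \bigr)\, ds + O\bigl( r^3 |\gamma_0'(0) - \beta_0'(0)| \bigr). \tag{4.11}
\end{align*}
Set
\[ h(s) = \gamma_0(s) - s\gamma_0'(0) - \beta_0(s) + s\beta_0'(0). \]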
Conformal Invariance of Voronoi Percolation
83
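Note that $h(0) = h'(0) = 0$, and $h''(s) = \gamma_0''(s) - \beta_0''(s)$, which is $O\bigl( |\gamma_0'(0) - \beta_0'(0)| \bigr)$, according to (4.10). Therefore,
\[ h(s) = O\bigl( r^2 |\gamma_0'(0) - \beta_0'(0)| \bigr) \tag{4.12} \]
for $s \in [0, r]$. Now use this in (4.11),
\begin{align*}
\frac{\partial}{\partial t}\Big|_{t=0} \bigl( d_t(q, p_1) - d_t(q, p_2) \bigr)
&= \frac{1}{2} r^2\, \nabla u(q) \cdot \bigl( \gamma_0'(0) - \beta_0'(0) \bigr) + \int_0^r \nabla u(q) \cdot h(s)\, ds + O\bigl( r^3 |\gamma_0'(0) - \beta_0'(0)| \bigr) \\
&= \frac{1}{2} r^2\, \nabla u(q) \cdot \bigl( \gamma_0'(0) - \beta_0'(0) \bigr) + O\bigl( r^3 |\gamma_0'(0) - \beta_0'(0)| \bigr). \tag{4.13}
\end{align*}
Let $A$ be the $O\bigl( r^3 |\gamma_0'(0) - \beta_0'(0)| \bigr)$ term, that is,
\[ A = \frac{\partial}{\partial t}\Big|_{t=0} \bigl( d_t(q, p_1) - d_t(q, p_2) \bigr) - \frac{1}{2} r^2\, \nabla u(q) \cdot \bigl( \gamma_0'(0) - \beta_0'(0) \bigr). \tag{4.14} \]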
Choose
\[ v = \frac{1}{2} r^2\, \nabla u(q) + A\, \frac{\gamma_0'(0) - \beta_0'(0)}{|\gamma_0'(0) - \beta_0'(0)|^2}, \tag{4.15} \]
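and, as before, let $q(s)$ be a path satisfying $q(0) = q$ and $q'(0) = v$. Then
\begin{align*}
\frac{\partial}{\partial s}\Big|_{s=0} \bigl( d_0(q(s), p_1) - d_0(q(s), p_2) \bigr)
&= -v \cdot \gamma_0'(0) + v \cdot \beta_0'(0) \\
&= -\frac{1}{2} r^2\, \nabla u(q) \cdot \bigl( \gamma_0'(0) - \beta_0'(0) \bigr) - A \\
&= -\frac{\partial}{\partial t}\Big|_{t=0} \bigl( d_t(q, p_1) - d_t(q, p_2) \bigr),
\end{align*}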
by (4.14). Consequently,
\[ \frac{d}{dx}\Big|_{x=0} \bigl( d_x(q(x), p_1) - d_x(q(x), p_2) \bigr) = \frac{\partial}{\partial s}\Big|_{s=0} \bigl( d_0(q(s), p_1) - d_0(q(s), p_2) \bigr) + \frac{\partial}{\partial t}\Big|_{t=0} \bigl( d_t(q, p_1) - d_t(q, p_2) \bigr) = 0, \]
which shows that (2') holds. Since $A$ is $O\bigl( r^3 |\gamma_0'(0) - \beta_0'(0)| \bigr)$, the definition (4.15) of $v$ satisfies (4.7). Hence (1') and (3') are still satisfied, as we have seen above. This completes the solution of the differential problem.

To solve the original problem, for every point $q^*$ and every $t \in [0, 1]$, define $v(q^*, t)$ as in (4.15), but with the metric $d_t$ replacing $d_0$ and the point $q^*$ replacing $q$. Let $q(t)$ be the solution of the initial value problem $q(0) = q$, $q'(t) = v\bigl(q(t), t\bigr)$, and set $q_1 = q(1)$. (Because $v\bigl(q(t), t\bigr) = O(r^2)$, $r < C^{-1}$, and $q \in K$, by an appropriate choice of the constant $C$ it is guaranteed that this initial value problem has a solution in the interval $[0, 1]$. The essential point here is that $q(t)$ stays in a compact subset of $M$.) Then it is easy to see that (1) and (2) hold. Verifying (3) is just slightly harder, because (3') was obtained only for points $p$ satisfying $d_0(q, p) = d_0(q, p_1)$, and these are generally not the
points satisfying $d_t\bigl(q(t), p\bigr) = d_t\bigl(q(t), p_1\bigr)$. To deal with that, start with any $p$ satisfying $d_0(q, p) = r$. At every point $z$ let $w(z, t)$ be the direction at $z$ of the geodesic for the metric $d_t$ that goes from $z$ to $q(t)$. Let $p(t)$ satisfy $p(0) = p$ and
\[ p'(t) = \frac{\partial}{\partial s}\Big|_{s=t} \Bigl( d_s\bigl(q(s), p_1\bigr) - d_s\bigl(q(s), p(t)\bigr) \Bigr)\, w\bigl(p(t), t\bigr). \]
Then $d_t\bigl(q(t), p(t)\bigr) = d_t\bigl(q(t), p_1\bigr)$ for all $t \in [0, 1]$. Hence, $d_1\bigl(q_1, p(1)\bigr) = d_1(q_1, p_1)$. By the equivalent of (3') at $t$, $|p'(t)| = O(r^3)$. So $d_0\bigl(p(1), p\bigr) = O(r^3)$, which gives
\[ d_1(q_1, p) - d_1(q_1, p_1) = d_1(q_1, p) - d_1\bigl(q_1, p(1)\bigr) = O(r^3). \]
This implies (3), and completes the proof.
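The role of the correction term in (4.15) can be illustrated with plain vector arithmetic. The sketch below assumes hypothetical numerical values for $r$, $\nabla u(q)$, the two unit directions and $A$ (none of them come from the paper) and checks that the spatial derivative $-v \cdot \gamma_0'(0) + v \cdot \beta_0'(0)$ cancels the time derivative $\frac{1}{2} r^2 \nabla u(q) \cdot (\gamma_0'(0) - \beta_0'(0)) + A$ from (4.14).

```python
def dot(a, b):
    """Euclidean inner product of two vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))

def choose_v(r, grad_u, gamma_dir, beta_dir, A):
    """The vector v of (4.15): (1/2) r^2 grad u + A (gamma' - beta') / |gamma' - beta'|^2."""
    diff = [g - b for g, b in zip(gamma_dir, beta_dir)]
    scale = A / dot(diff, diff)
    return [0.5 * r * r * gu + scale * d for gu, d in zip(grad_u, diff)]

def spatial_derivative(v, gamma_dir, beta_dir):
    """Derivative at s = 0 of d_0(q(s), p_1) - d_0(q(s), p_2) when q'(0) = v."""
    return -dot(v, gamma_dir) + dot(v, beta_dir)

def time_derivative(r, grad_u, gamma_dir, beta_dir, A):
    """Derivative at t = 0 of d_t(q, p_1) - d_t(q, p_2), rewritten using (4.14)."""
    diff = [g - b for g, b in zip(gamma_dir, beta_dir)]
    return 0.5 * r * r * dot(grad_u, diff) + A

if __name__ == "__main__":
    # Hypothetical sample data (assumptions, for illustration only).
    r, grad_u, A = 0.1, (1.0, 2.0), 0.003
    gamma_dir, beta_dir = (1.0, 0.0), (0.0, 1.0)
    v = choose_v(r, grad_u, gamma_dir, beta_dir, A)
    print(spatial_derivative(v, gamma_dir, beta_dir)
          + time_derivative(r, grad_u, gamma_dir, beta_dir, A))
```

The cancellation is exact because the added term is the component of the correction along $\gamma_0'(0) - \beta_0'(0)$, normalized so that its inner product with that vector is $A$.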
Notation. Suppose that $q, z$ are points in $M$. If there is a unique shortest geodesic segment from $q$ to $z$, in the metric $d_0$, then the direction at $q$ of that geodesic will be denoted $N_q(z)$. When working in local coordinates, $N_q(z)$ can be thought of as a unit vector in $\mathbb{R}^d$. We may also think of $N_q(z)$ as a unit vector in $T_q M$, the tangent space to $M$ at $q$. Note that for any compact $K \subset M$ there is an $\epsilon > 0$ such that $N_q(z)$ is well defined when $q \in K$ and $d_0(q, z) < \epsilon$.

The following lemma will help us prove that defects are rare.

Lemma 4.2. Let $K$ be a compact subset of $M$. There is a constant $C = C(M, ds, u, K) > 0$ such that the following holds. Let $\omega \in \Omega$, and consider the two Voronoi tessellations, $T_0 = T_0(\omega)$, $T_1 = T_1(\omega)$, obtained by using the metrics $d_0$ and $d_1$. Suppose that $p_1, p_2 \in K \cap \omega$ form a defect (that is, $T_0(p_1) \cap T_0(p_2) \neq \emptyset = T_1(p_1) \cap T_1(p_2)$) and assume that $T_0(p_1) \cap T_0(p_2) \subset K$. Let $q$ be the point in $T_0(p_1) \cap T_0(p_2)$ which maximizes $d_0\bigl(q, \omega - \{p_1, p_2\}\bigr) - d_0(q, p_1)$, and set $r = d_0(q, p_1)$, $r' = d_0\bigl(q, \omega - \{p_1, p_2\}\bigr)$. Let $Z = \{z_1, \dots, z_k\}$ be the set of points $z \in \omega - \{p_1, p_2\}$ such that $d_0(q, z) = r'$. If $r < C^{-1}$, then
(1) $r' \le r + C r^3$, and
(2) the vectors $\{N_q(p_1), N_q(p_2), N_q(z_1), \dots, N_q(z_k)\}$ are affinely dependent.

Proof. Take $C$ to be larger than the constant in Lemma 4.1, and assume $r < C^{-1}$. Let $q_1$ be the point described in that lemma, and set $r_1 = d_1(q_1, p_1) = d_1(q_1, p_2)$. Since $q_1 \notin \emptyset = T_1(p_1) \cap T_1(p_2)$, there is a point $z_0 \in \omega - \{p_1, p_2\}$ with $d_1(q_1, z_0) < r_1$. We know that $d_0(q, z_0) \ge r$ and $d_0(q, q_1) = O(r^2)$. Hence, there is a point $z_0'$ on the $d_1$-geodesic segment from $z_0$ to $q_1$ that satisfies $d_0(q, z_0') = r$. Then, according to 4.1 (3), $d_1(q_1, z_0') + O(r^3) \ge d_1(q_1, p_1) > d_1(q_1, z_0)$. But since $d_1(q_1, z_0) = d_1(q_1, z_0') + d_1(z_0', z_0)$, it follows that $d_1(z_0, z_0') = O(r^3)$, which implies $d_0(z_0, z_0') = O(r^3)$. Consequently, $d_0(q, z_0) = r + O(r^3)$.
By construction, among all the points in $\omega - \{p_1, p_2\}$ the points in $Z$ are closest to $q$. Therefore, $r' = d_0(q, z) \le d_0(q, z_0) = r + O(r^3)$ for $z \in Z$, and (1) is established.

Let $L$ be the set of points $p$ in $M$ such that $d_0(p, p_1) = d_0(p, p_2)$. If $q(t)$ is a smooth path in $M$ which satisfies $q(0) = q$, then
\[ \frac{d}{dt}\Big|_{t=0} \Bigl( d_0\bigl(q(t), p_1\bigr) - d_0\bigl(q(t), p_2\bigr) \Bigr) = q'(0) \cdot \bigl( N_q(p_2) - N_q(p_1) \bigr). \]
Because $N_q(p_1) \neq N_q(p_2)$, it follows by the implicit function theorem that $L \cap W$ is a smooth $(d-1)$-manifold, for some open $W \subset M$ which contains $q$.
Let $w \in T_q M$ be any tangent vector at $q$ which is orthogonal to $N_q(p_2) - N_q(p_1)$. Then there is a smooth path $q(t)$ in $L$ such that $q(0) = q$ and $q'(0) = w$. Recall that $q$ maximizes
\[ d_0\bigl(p, \omega - \{p_1, p_2\}\bigr) - d_0(p, p_1), \tag{4.16} \]
among $p$ in $T_0(p_1) \cap T_0(p_2)$. Since (4.16) is negative when $p \in L - T_0(p_1) \cap T_0(p_2)$, it follows that $q$ maximizes (4.16) among $p$ in $L \cap W$. Therefore, there must be some $z \in Z$ such that
\[ 0 \ge \frac{d}{dt}\Big|_{t=0} \Bigl( d_0\bigl(q(t), z\bigr) - d_0\bigl(q(t), p_1\bigr) \Bigr) = w \cdot \bigl( N_q(p_1) - N_q(z) \bigr). \]
This means that for every vector $w$ tangent to $L$, $w \in T_q L$, there is some $j \in \{1, \dots, k\}$ with $w \cdot v_j \le 0$, where $v_j$ is the orthogonal projection of $N_q(p_1) - N_q(z_j)$ onto $T_q L$. Therefore, $0$ is in the convex hull of $\{v_1, \dots, v_k\}$ (see Eggleston [7, Ch. 1, §7]), and consequently $\{v_1, \dots, v_k\}$ is linearly dependent. Hence, the linear span of $\{v_1, \dots, v_k\}$ is contained in a $(k-1)$-dimensional subspace of $T_q L$. Because each $N_q(p_1) - N_q(z_j)$ is a linear combination of $v_j$ and $N_q(p_1) - N_q(p_2)$, it follows that the set $\{N_q(p_1) - N_q(p_2), N_q(p_1) - N_q(z_1), \dots, N_q(p_1) - N_q(z_k)\}$ is contained in a $k$-dimensional subspace of $T_q M$. This proves (2), and establishes the lemma.

Recall that $M'$ is a compact subset of $M$ in which the crossing is considered. Let $M^* \subset M$ be some compact set that contains $M'$ in its interior. We now define a potential defect to be a situation where some of the necessary conditions for a defect of Lemma 4.2 are satisfied.
Definition. Let $C$ be the constant in Lemma 4.2, with $K$ taken to be $M^*$. Consider some configuration $\omega \in \Omega$. A potential defect is a situation where there is an integer $k \ge 1$, a point $q \in M$, nuclei $p_1, p_2, z_1, \dots, z_k \in \omega$, and numbers $r, r' > 0$ such that
(1) $r < C^{-1}$,
(2) $r \le r' \le r + C r^3$,
(3) $r = d_0(q, p_1) = d_0(q, p_2)$,
(4) $r' = d_0(q, z_1) = \dots = d_0(q, z_k)$, and
(5) the vectors $\{N_q(p_1), N_q(p_2), N_q(z_1), \dots, N_q(z_k)\}$ are affinely dependent.
The number r is called the span of the potential defect, and the point q is the navel of the potential defect.
5. Defects are Rare

This section will provide an estimate for the probability of having a defect or potential defect in a given region. The argument will be based on the necessary condition for defects given in Lemma 4.2. We start with the following almost obvious lemma.

Lemma 5.1. Let $m \ge 3$ be some integer, and let $X_m$ be the set of points $x = (z_1, \dots, z_m) \in (\mathbb{R}^d)^m$ such that $\{z_1, \dots, z_m\} \subset \mathbb{R}^d$ is affinely dependent and $|z_j| = 1$ for each $j = 1, \dots, m$. Then $X_m$ has finite $(md - d - 2)$-dimensional measure.
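The exponent $md - d - 2$ can be checked by a parameter count. The sketch below is our own bookkeeping, not part of the original argument. It assumes $m \le d + 1$ and the parametrization used in the proof that follows (an $(m-2)$-dimensional subspace $L$, a unit normal $w$, unit vectors $y_j \in L$ and an angle $\theta$), together with the standard fact that the Grassmannian of $j$-planes in $\mathbb{R}^d$ has dimension $j(d - j)$.

```latex
\begin{align*}
\dim\{L\} &= (m-2)(d-m+2), &
\dim\{w\} &= d-m+1, \\
\dim\{(y_1,\dots,y_m)\} &= m(m-3), &
\dim\{\theta\} &= 1, \\
\text{total} &= (m-2)(d-m+2) + (d-m+1) + m(m-3) + 1
  \;=\; md - d - 2 .
\end{align*}
```

For comparison, $m$ unconstrained unit vectors have $m(d-1)$ degrees of freedom, so when $m \le d + 1$ affine dependence costs $d + 2 - m \ge 1$ dimensions.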
Proof. Let $Y$ be the space of tuples $y = (L, w, y_1, \dots, y_m, \theta)$, where $L \subset \mathbb{R}^d$ is an $(m-2)$-dimensional linear subspace, $w$ is a unit vector orthogonal to $L$, $y_1, \dots, y_m$ are unit vectors in $L$, and $\theta \in [0, \pi/2]$. Then, clearly, $Y$ is a compact $(md - d - 2)$-dimensional smooth manifold with boundary, and therefore has finite $(md - d - 2)$-dimensional measure. The map
\[ \Phi(L, w, y_1, \dots, y_m, \theta) = (\cos\theta\, w + \sin\theta\, y_1, \dots, \cos\theta\, w + \sin\theta\, y_m) \]
takes $Y$ onto $X_m$, and is a Lipschitz map. Therefore $X_m$ has finite $(md - d - 2)$-dimensional measure.

Here is another nearly trivial lemma.

Lemma 5.2. Let $W \subset \mathbb{R}^d$ be open, let $\nu$ be Lebesgue measure on $W$, let $\lambda > 0$, and let $\omega \subset W$ be a Poisson point process on $(W, \nu)$ with density $\lambda$. Let $m > 1$ be an integer, and let $\hat\omega_m \subset W^m$ be the set of points $(v_1, \dots, v_m) \in \omega^m$ such that $v_j \neq v_k$ for $j \neq k$. Let $\nu_m = \nu \times \dots \times \nu$ be the product measure in $W^m$, and let $S \subset W^m$ be measurable. Then the probability that $\hat\omega_m$ will intersect $S$ is at most $\lambda^m \nu_m(S)$.

Proof. One first proves the lemma in the case that $S \subset W^m$ is a box disjoint from the diagonals $w_j = w_k$. The general case follows. Details are left to the reader.

For an interval $I \subset \mathbb{R}$ and a set $W \subset M$, let $\mathrm{PD}(W, I)$ be the event that in $W$ there is a navel of a potential defect whose span is in the interval $I$.

Proposition 5.3. Let $K$ be a compact subset of $M$, and let $W \subset K$ be open. Then
\[ P_\lambda\bigl[ \mathrm{PD}(W, [0, \epsilon]) \bigr] \le C\, \mathrm{vol}(W)\, \lambda^{d+2}\, \epsilon^{d^2+d+2}, \]
for every $\epsilon \in [\lambda^{-1/d}, C^{-1}]$, where $C > 0$ is a constant which may depend on $M, \mu, ds, u, K$, but not on $W, \lambda, \epsilon$.

Lemma 5.4. Let the situation be as in the proposition. There is a constant $C_0 = C_0(M, ds, K) > 0$, such that the following holds. Let $\epsilon \in [0, C_0]$, $\delta \in (0, 1)$, and let $k$ be in the range $1, 2, \dots, d$. Let $S$ be the set of all tuples $(p_1, p_2, z_1, \dots, z_k) \in M^{k+2}$ such that for some $q \in W$, $r \in [0, \epsilon]$, and $r' \in [r, r + \epsilon\delta)$, we have $r = d_0(q, p_1) = d_0(q, p_2)$, $r' = d_0(q, z_1) = \dots = d_0(q, z_k)$, and the vectors $\{N_q(p_1), N_q(p_2), N_q(z_1), \dots, N_q(z_k)\}$ are affinely dependent. Then the $(k+2)d$-dimensional measure of $S$ is $O(1)\, \mathrm{vol}(W)\, \delta\, \epsilon^{(k+1)d}$.

Proof. Recall the sets $X_m$ of Lemma 5.1. Let $Y = [0, 1] \times [0, \delta) \times X_{k+2}$. From that lemma it follows that the $(k+1)d$-dimensional measure of $Y$ is $O(\delta)$. Let $Y' = \epsilon Y$; that is, the set $Y$ scaled by $\epsilon$. Then the $(k+1)d$-dimensional measure of $Y'$ is $O(1)\, \delta\, \epsilon^{(k+1)d}$. Consider the map $\Psi_0 : Y' \to (\mathbb{R}^d)^{k+2}$ defined by
\[ \Psi_0(r, \alpha, x_1, \dots, x_{k+2}) = \bigl( r\epsilon^{-1} x_1,\ r\epsilon^{-1} x_2,\ (r + \alpha)\epsilon^{-1} x_3,\ \dots,\ (r + \alpha)\epsilon^{-1} x_{k+2} \bigr). \]
Differentiation shows that $\Psi_0$ is Lipschitz in $Y'$ with a Lipschitz constant which depends only on $d$. (Because $r$, $r + \alpha$ and the $x_j$'s are $O(\epsilon)$.) Consequently, the $(k+1)d$-dimensional measure of $\Psi_0(Y')$ is also $O(1)\, \delta\, \epsilon^{(k+1)d}$.

We assume, with no loss of generality, that $W$ is contained in a coordinate chart of $M$. This allows us to identify the tangent space $T_z M$ with $\mathbb{R}^d$, for $z$ in a neighborhood of $W$. Given a point $z \in W$ and a vector $v \in \mathbb{R}^d$ with $|v| \le 15$, let $\exp_z(v)$ denote the point $x \in M$ such that $d_0(z, x) = |v|$ and the tangent at $z$ of the $d_0$-geodesic segment from $z$ to $x$ is $v/|v|$. (This is usually called the exponential map.) Since geodesics can be obtained by solving an ODE on the tangent bundle, the map $\exp_z(v)$ is smooth in $z$ and $v$ (that is, the geodesic flow is smooth). For each $z \in W$ and each $(v_1, v_2, \dots, v_{k+2}) \in \Psi_0(Y')$, set
\[ \Psi_1(z, v_1, v_2, \dots, v_{k+2}) = \bigl( \exp_z(v_1), \dots, \exp_z(v_{k+2}) \bigr). \]
Since $\Psi_1$ is smooth, we find that the $(k+2)d$-dimensional measure of $\Psi_1\bigl( W \times \Psi_0(Y') \bigr)$ is $O(1)\, \mathrm{vol}(W)\, \delta\, \epsilon^{(k+1)d}$. The lemma follows, because $S = \Psi_1\bigl( W \times \Psi_0(Y') \bigr)$.
Proof of 5.3. Let $C_1$ denote the constant of Lemma 4.2. Fix some small $\epsilon \ge \lambda^{-1/d}$. Let $A_k$, $k = 1, 2, \dots$ be the event that there is a point $q \in W$, an $r \in [0, \epsilon]$, an $r' \in [r, r + C_1 r^3)$, and distinct points $p_1, p_2, z_1, \dots, z_k \in \omega$ such that $d_0(q, p_1) = d_0(q, p_2) = r$, $d_0(q, z_j) = r'$, $j = 1, \dots, k$, and the unit vectors $\{N_q(p_1), N_q(p_2), N_q(z_1), \dots, N_q(z_k)\}$ are affinely dependent. By definition, $\mathrm{PD}(W, [0, \epsilon]) \subset A_1 \cup A_2 \cup \dots$. Set $A = A_1 \cup \dots \cup A_d$, and note that $A_k \subset A_d$ for $k > d$, because any subset of $\mathbb{R}^d$ whose cardinality is $d + 2$ is affinely dependent. Therefore,
\[ \mathrm{PD}(W, [0, \epsilon]) \subset A. \tag{5.1} \]
Now fix some $k = 1, \dots, d$, and consider $A_k$. To estimate $P_\lambda(A_k)$, apply Lemma 5.4 with $\delta = O(\epsilon^2)$, and then use Lemma 5.2. (Here the measure $\mu$ is only comparable, not equal to Lebesgue measure, but that is enough.) The combination of these two lemmas gives
\[ P_\lambda(A_k) \le O(1)\, \mathrm{vol}(W)\, \lambda^{k+2}\, \epsilon^{(k+1)d+2}. \tag{5.2} \]
Since we take $\epsilon \ge \lambda^{-1/d}$, this is largest when $k = d$, and hence,
\[ P_\lambda\bigl[ \mathrm{PD}(W, [0, \epsilon]) \bigr] \le P_\lambda(A) \le O(1)\, \mathrm{vol}(W)\, \lambda^{d+2}\, \epsilon^{(d+1)d+2}, \]
which proves the proposition.
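As a sanity check on the exponents, the following worked substitution (our own bookkeeping, not part of the original argument) evaluates the bound of Proposition 5.3 at the smallest admissible scale $\epsilon = \lambda^{-1/d}$, and records why $k = d$ dominates in (5.2).

```latex
\begin{align*}
\lambda^{d+2}\,\epsilon^{d^2+d+2}\Big|_{\epsilon=\lambda^{-1/d}}
  &= \lambda^{d+2}\,\lambda^{-(d^2+d+2)/d}
   = \lambda^{(d+2)-(d+1)-2/d}
   = \lambda^{(d-2)/d}, \\
\frac{\lambda^{(k+1)+2}\,\epsilon^{(k+2)d+2}}{\lambda^{k+2}\,\epsilon^{(k+1)d+2}}
  &= \lambda\,\epsilon^{d} \;\ge\; 1
  \qquad\text{whenever } \epsilon \ge \lambda^{-1/d}.
\end{align*}
```

The first line produces the power of $\lambda$ that appears in Theorem 5.6; the second shows that the right-hand side of (5.2) is nondecreasing in $k$, so the term with $k = d$ controls the union.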
Set
\[ L = L(\lambda) = \lambda^{-1/d} (\log \lambda)^{(d+1)/d^2}. \tag{5.3} \]
This quantity $L$ will be an important length scale in the following sections. The two essential features of the choice of $L$ are that it tends to zero faster than $\lambda^{-1/d} (\log \lambda)^{1/(d-1)}$, but slower than $\lambda^{-1/d} (\log \lambda)^{1/d}$.

Lemma 5.5 (no giant tiles). Let $K \subset M$ be compact, and let $c > 0$ be some constant. Then the $P_\lambda$-probability that there will be a tile in $T(\omega, ds)$ which intersects $K$ and has diameter greater than $cL$ tends to zero as $\lambda \to \infty$.
Proof. Let $U \subset M$ be an open set which contains $K$ and has compact closure. Suppose that $\lambda$ is sufficiently large so that the distance from $K$ to $M - U$ is greater than $4cL$. Let $X$ be a maximal subset of $U$ with the property that any distinct elements of it have distance at least $cL/9$. The cardinality of $X$ satisfies
\[ |X| = O(L^{-d}). \tag{5.4} \]
For any $x \in X$, let $E_x \subset \Omega$ be the event that the ball $B_0(x, cL/9)$ does not intersect $\omega$. Then we have
\[ P_\lambda[E_x] = e^{-\lambda L^d / O(1)}. \tag{5.5} \]
Let $z \in \omega$, and let $T(z)$ be the tile with nucleus $z$ in $T(\omega, ds)$. Suppose that $T(z)$ intersects $K$, and its diameter is greater than $cL$. Then there is some $y \in T(z)$ such that $d_0(y, z) \ge cL/2$ and $d_0(y, K) \le cL$. Consequently, the ball $B_0(y, cL/3)$ is disjoint from $\omega$ and contained in $U$. There will be some $x \in X$ with $d_0(x, y) < cL/6$. For that $x$, we shall have $\omega \in E_x$. This shows that the event that there is some tile which intersects $K$ and has diameter $\ge cL$ is contained in $\cup_{x \in X} E_x$. Hence, we get from (5.4) and (5.5) that the probability of that event is at most
\[ O(L^{-d})\, e^{-\lambda L^d / O(1)}, \]
which tends to zero as $\lambda \to \infty$, by (5.3).
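To get a feel for the scale $L$ of (5.3) and for the union bound at the end of the proof, here is a small numerical sketch. It is only an illustration: every unspecified $O(1)$ constant is replaced by a single hypothetical parameter `c` (an assumption), so the numbers indicate the trend rather than actual probabilities.

```python
import math

def length_scale(lam, d):
    """L(lambda) = lambda^(-1/d) * (log lambda)^((d+1)/d^2), as in (5.3)."""
    return lam ** (-1.0 / d) * math.log(lam) ** ((d + 1.0) / d ** 2)

def giant_tile_bound(lam, d, c=1.0):
    """The quantity L^(-d) * exp(-lambda * L^d / c) from the proof of Lemma 5.5.

    The constant c stands in for the unspecified O(1) constants (an assumption).
    """
    L = length_scale(lam, d)
    return L ** (-d) * math.exp(-lam * L ** d / c)

if __name__ == "__main__":
    d = 2
    for lam in (1e3, 1e6, 1e9):
        L = length_scale(lam, d)
        # lambda * L^d equals (log lambda)^(1 + 1/d), which is what beats the factor L^(-d).
        print(lam, L, lam * L ** d, giant_tile_bound(lam, d))
```

Since $\lambda L^d = (\log \lambda)^{1 + 1/d}$, the exponential factor decays faster than any power of $\lambda$, while $L^{-d}$ grows only like $\lambda$ up to logarithms. With a large constant `c` the decay sets in only once $(\log \lambda)^{1/d}$ exceeds `c`.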
Note that the number of tiles of $T(\omega, ds)$ that are expected to intersect $K$ is of the order of $\lambda$.

Theorem 5.6. Suppose $K \subset M$ is compact, and has positive volume. Then the expected number of defects involving tiles in $K$ is $O(1) \lambda^{(d-2)/d}$, when $\lambda$ is large.

The theorem will not be needed in the following, because we will need information about potential defects more than about actual defects. It is presented only for completeness.

Proof. Let $W \subset M$ be a set whose diameter is smaller than $\lambda^{-1}$, say, and let $w_0$ be some point in $W$. For any interval $[a, b]$, let $h_W(a, b)$ be the probability that there will be in $W$ a navel of a defect with span in the range $[a, b]$. By Proposition 5.3,
\[ h_W\bigl(0, \lambda^{-1/d}\bigr) \le O(1)\, \mathrm{vol}(W)\, \lambda^{(d-2)/d}. \]
Now consider some $\epsilon \ge \lambda^{-1/d}$. If there is for a configuration $\omega$ a navel in $W$ of a defect with span in the range $[\epsilon, 2\epsilon]$, then the ball $B_0(w_0, \epsilon/2)$ does not contain any point in $\omega$. This latter event is independent of $\mathrm{PD}(W, [\epsilon, 2\epsilon])$, and consequently,
\[ h_W(\epsilon, 2\epsilon) \le P_\lambda\bigl[ B_0(w_0, \epsilon/2) \cap \omega = \emptyset \bigr]\, P_\lambda\bigl[ \mathrm{PD}(W, [\epsilon, 2\epsilon]) \bigr] \le O(1)\, e^{-\lambda \epsilon^d / O(1)}\, \mathrm{vol}(W)\, \lambda^{d+2}\, \epsilon^{d^2+d+2}, \]
again, using Proposition 5.3. Consider a tiling of K by sets {Wj } with very minute diameters. Let n(ω) be the number of tiles in the tiling which contain a navel of a defect. Then
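\begin{align*}
E\, n(\omega) &\le \sum_j h_{W_j}\bigl(0, \lambda^{-1/d}\bigr) + \sum_j \sum_{k=0}^{\infty} h_{W_j}\bigl(\lambda^{-1/d} 2^k, \lambda^{-1/d} 2^{k+1}\bigr) \\
&\le O(1)\, \mathrm{vol}(K)\, \lambda^{(d-2)/d} \sum_{k=0}^{\infty} e^{-2^{kd}/O(1)}\, 2^{k(d^2+d+2)} = O(1)\, \lambda^{(d-2)/d}.
\end{align*}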
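The last equality uses that the series over $k$ converges to a constant depending only on $d$. A quick numerical sketch makes this plausible; the parameter `c` below stands in for the unspecified $O(1)$ constant in the exponent and is an assumption, as is the truncation length.

```python
import math

def defect_series(d, c=1.0, terms=60):
    """Partial sum of sum_k exp(-2^(k d) / c) * 2^(k (d^2 + d + 2)).

    Each term is evaluated through its logarithm so that the doubly exponential
    decay simply underflows to 0.0. The value c replaces the O(1) constant
    (an assumption); keep terms * d below 1024 to avoid float overflow in 2.0 ** (k * d).
    """
    total = 0.0
    for k in range(terms):
        log_term = -(2.0 ** (k * d)) / c + k * (d * d + d + 2) * math.log(2.0)
        total += math.exp(log_term)
    return total

if __name__ == "__main__":
    for d in (2, 3):
        print(d, defect_series(d, terms=10), defect_series(d, terms=60))
```

The partial sums stabilize after a handful of terms because $2^{kd}$ in the exponent eventually overwhelms the polynomial-in-$2^k$ prefactor.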
Since every defect has a navel which is not the navel of any other defect, for any specific configuration $\omega$, the number of tiles $W_j$ which meet a navel tends to a number at least as large as the number of defects of $\omega$ as the tiling $W_j$ becomes very fine. Consequently, by the monotone convergence theorem, the expected number of defects is bounded by the limsup of $E\, n(\omega)$, as the tiling $W_j$ becomes finer. The theorem follows.

Remark. In fact, the estimate in Theorem 5.6 is sharp.

6. The Size of Spherical Shells

This section is devoted to proving a tail estimate for the number of Voronoi cells in a random Voronoi tiling $T(\omega, ds)$ which meet a union of spheres. (It is possible to do without this section if one is interested only in the case $M = \mathbb{R}^2$.) The precise statement which we shall need is as follows.

Proposition 6.1. Let $M_0 \subset M$ be compact, let $K \subset M_0$ be a finite set, and let $a, R, \lambda > 0$, with $\lambda$ large and $R^2 \le \lambda^{-1/d}$. For each $x \in K$, let $S(x)$ be the sphere of radius $R$ about $x$. Given $\omega \in \Omega$, let $n(\omega) = n(K, R, \lambda, \omega)$ be the number of Voronoi tiles in the tiling $T(\omega, ds)$ that have diameter $\le R$ and intersect $\cup_{x \in K} S(x)$. Then there is a constant $C = C(a, M, M_0, ds, \mu) > 0$ such that
\[ E_\lambda \exp\bigl( a\, n(\omega) \bigr) \le \exp\bigl( C R^{d-1} \lambda^{(d-1)/d} |K| \bigr), \]
where $E_\lambda$ denotes the expectation operator of $(\Omega, P_\lambda)$.

6.2 Lemma of Ball Unions. Let $0 < C < \infty$, and let $A \subset \mathbb{R}^d$ be a union of open balls with centers on the unit sphere $S^{d-1} \subset \mathbb{R}^d$, and with radii bounded by $C$. Then the $(d-1)$-dimensional measure of $\partial A$ is bounded by a constant which depends only on $C$ and $d$.

The proof is motivated by hyperbolic geometry, but does not use it.

Proof. Suppose first that $A$ is a finite union of such balls, $A = \cup_{j=1}^n B(q_j, r_j)$. Let $X$ be the set of points $x \in \partial A$ such that $x$ is on the boundary of exactly one of the balls $B(q_j, r_j)$. Then $X$ has full $(d-1)$-measure in $\partial A$. We now define a map $f : X \to S^{d-1}$. Let $x \in X$, and suppose that $j$ is the index such that $x \in \partial B(q_j, r_j)$. If $x \in X \cap S^{d-1}$, set $f(x) = x$. Otherwise, let $B_x$ be the largest open ball which is contained in $B(q_j, r_j)$, is internally tangent to $B(q_j, r_j)$ at $x$, and is disjoint from $S^{d-1}$. See Fig. 1. Clearly, $B_x$ is well defined, and there is precisely one intersection point of $\partial B_x$ and $S^{d-1}$. Let $f(x)$ be that intersection point. Note that $f : X \to S^{d-1}$ is a continuous map. Suppose that $y$ is a point in $A \cap S^{d-1}$. Let $B^y$ be the maximal open ball which is externally tangent to $S^{d-1}$ at $y$ and is contained
[Fig. 1. The definition of the map $f(x)$. The figure labels the point $x$, the ball $B_x$ and the image point $f(x)$.]
in $A$. If $x$ is some point in $X \cap \partial B^y$, then the ball $B^y$ is strictly contained in the ball $B(q_j, r_j)$ with $x \in \partial B(q_j, r_j)$. Consequently, $x$ is the only point in $\partial A \cap B^y$, and $B_x = B^y$. It follows that for every $y \in S^{d-1}$ there is at most a unique $x \in X$ outside the unit ball $B(0, 1)$ such that $f(x) = y$. The same argument applies to the points $x$ inside the unit ball. Therefore, the map $f$ is at most 2 to 1.

Consider one of the balls, $B(q_j, r_j)$, in the union making up $A$. It is enough to show that locally the map $f$ does not contract distances too much in $X \cap \partial B(q_j, r_j)$. This can be done by inspecting the extreme cases where either $r_j$ is small or at the points where $\partial B(q_j, r_j)$ is close to $S^{d-1}$. Alternatively, observe that the restriction of $f$ to a component of $\partial B(q_j, r_j) \cap X - S^{d-1}$ is equal to an inversion in some $(d-1)$-dimensional sphere $Z$ and that the center of $Z$ cannot be too close to $S^{d-1}$.

The case where $A$ is a union of infinitely many balls follows by a limiting argument. The details are left to the reader.

Remark. In the lemma, one may replace the assumption that the centers of the balls making up $A$ are on $S^{d-1}$ by the assumption that the interior angle of the intersection of these balls with the unit ball be bounded away from $0$.

For book-keeping, we introduce yet another tiling of $M$, $T_B$, which will be nonrandom. The only important feature of $T_B$ is that every one of its tiles has diameter $O(\lambda^{-1/d})$ and volume at least $C\lambda^{-1}$ for some constant $C = C(ds) > 0$. For example, $T_B$ may be constructed as follows. Take a set of points $B \subset M$ such that the distance between any two points in $B$ is at least $\lambda^{-1/d}$, and $B$ is maximal with this property, and let $T_B = T(B, ds)$ denote the corresponding Voronoi tiling.

Proof of 6.1. For $x \in K$, let $A(x)$ be the union of all open balls with radius at most $R$ and center in $S(x)$ that do not intersect any tile of $T_B$ which intersects $\omega$.
Let U (x) = U (x, ω) denote the set of points in ω which are nuclei of tiles with diameter ≤ R that intersect S(x), and suppose that q ∈ U (x). Then there must be a point z ∈ S(x) such that q is the closest point to z which is in ω. Hence q is on the
boundary of an open ball with center in $S(x)$ which is disjoint from $\omega$ and has diameter $\le R$. Recall that every tile of $T_B$ has diameter $\le C_1 \lambda^{-1/d}$, where $C_1$ is some constant. It follows that $q$ has distance at most $C_1 \lambda^{-1/d}$ from $A(x) \cup S(x)$. Let $H(x, \omega)$ be the set of all the tiles of $T_B$ which are at distance at most $C_1 \lambda^{-1/d}$ from $A(x) \cup S(x)$, but do not intersect $A(x)$. We may conclude that $U(x) \subset \cup H(x, \omega)$. (If $Q$ is a set of tiles, then $\cup Q$ denotes the union of the tiles in $Q$.) Set
\[ H = \bigcup_{x \in K} H(x, \omega), \qquad n^* = |(\cup H) \cap \omega|. \]
Then $n^* = n^*(\omega) \ge n(\omega)$. In order to bound the tail of $n^*$, let us estimate from above the size of $H$.

Assume first that the metric $ds$ is a flat (Euclidean) metric. For each $z \in S(x)$ let $r(z)$ be the maximal $r \ge 0$ such that the open ball of radius $r$ about $z$ is disjoint from tiles of $T_B$ which intersect $\omega$ ($r(z) = 0$ if $z$ is in a tile of $T_B$ which intersects $\omega$). Set $r^*(z) = \min\{r(z), R\}$ and for $t \ge 0$,
\[ A(x, t) = \bigcup_{z \in S(x)} B\bigl(z, r^*(z) + t\bigr). \]
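Then $A(x) = A(x, 0)$ and each tile in $H(x, \omega)$ is contained in $A\bigl(x, 2C_1 \lambda^{-1/d}\bigr) - A(x)$. In order to bound the cardinality of $H(x, \omega)$, we estimate the volume of $A\bigl(x, 2C_1 \lambda^{-1/d}\bigr) - A(x)$,
\[ \mathrm{vol}\Bigl( A\bigl(x, 2C_1 \lambda^{-1/d}\bigr) - A(x) \Bigr) = \int_0^{2C_1 \lambda^{-1/d}} \frac{d}{dt} \mathrm{vol}\bigl( A(x, t) \bigr)\, dt = \int_0^{2C_1 \lambda^{-1/d}} \mathrm{vol}_{d-1}\bigl( \partial A(x, t) \bigr)\, dt, \tag{6.1} \]
where $\mathrm{vol}_{d-1}$ denotes the $(d-1)$-dimensional measure. By the Lemma of Ball Unions (6.2), appropriately rescaled, we know that $\mathrm{vol}_{d-1}\bigl( \partial A(x, t) \bigr) \le C_2 R^{d-1}$ for some constant $C_2$, and all $t \le R$. It follows then from (6.1) that
\[ \mathrm{vol}(\cup H) \le \sum_{x \in K} \mathrm{vol}\Bigl( A\bigl(x, 2C_1 \lambda^{-1/d}\bigr) - A(x) \Bigr) \le 2 C_1 C_2 |K| R^{d-1} \lambda^{-1/d}. \tag{6.2} \]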
We set
\[ \beta = C_3 |K| R^{d-1} \lambda^{(d-1)/d}, \tag{6.3} \]
with $C_3$ a large constant. Since $\mu$ is comparable to the measure induced by $ds$, we get from (6.2),
\[ \mu(\cup H) \le \beta \lambda^{-1}, \tag{6.4} \]
provided $C_3$ is large enough. Because the measure of a tile in $T_B$ is at least $O(1)^{-1} \lambda^{-1}$, we also get
\[ |H| \le \beta, \tag{6.5} \]
if $C_3$ is large enough.
To remove the assumption that $ds$ is the Euclidean metric, observe that for $x \in M$ one may choose a Euclidean metric for a neighborhood of $x$ such that for points at distance $O(R)$ from $x$ distances are distorted by not more than an additive constant of $O(R^2)$. Since we have the assumption $R^2 \le \lambda^{-1/d}$, it is easy to verify that the distortion will not influence the validity of the argument above, but may only change the constants.

It is true that the collection of tiles $H$ depends on $\omega$. Hence we cannot naively use the standard formula for the probability that $\omega \cap (\cup H)$ has a given cardinality in terms of $\lambda$ and $\mu(\cup H)$. But note that $H$ only depends on which tiles of $T_B$ contain a point of $\omega$, and does not depend on the number of points in each such tile. Consider some tile $T$, and suppose that $g$ is the number of points in $\omega \cap T$. Then the distribution of $g + 1$ dominates the distribution of $g$ conditioned on $g \ge 1$. This can be thought of as a continuous instance of the BK inequality [5], but may also be verified directly. We conclude from this argument and the inequalities (6.4), (6.5) that for each $m$,
\[ P_\lambda\bigl[ n^* \ge m + \beta \bigr] \le \sum_{j=m}^{\infty} \frac{\beta^j}{j!}\, e^{-\beta}. \]
Consequently,
\[ E \exp\bigl( a\, n(\omega) \bigr) \le E \exp( a n^* ) \le e^{a\beta} \sum_{j=0}^{\infty} \frac{\beta^j}{j!}\, e^{aj} e^{-\beta} = \exp\bigl( a\beta - \beta + e^a \beta \bigr), \]
and the proposition follows.
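The final equality is the moment generating function of a Poisson variable with mean $\beta$, namely $\sum_j e^{aj} \beta^j e^{-\beta} / j! = \exp(\beta(e^a - 1))$. The sketch below checks this identity numerically; the sample values of `a` and `beta` and the truncation length are assumptions chosen only for illustration.

```python
import math

def poisson_mgf_series(a, beta, terms=200):
    """Truncated sum over j of exp(a j) * beta^j / j! * exp(-beta)."""
    total = 0.0
    term = math.exp(-beta)  # the j = 0 term
    for j in range(terms):
        total += term
        # Pass from the j-th term to the (j+1)-th term.
        term *= math.exp(a) * beta / (j + 1)
    return total

def poisson_mgf_closed_form(a, beta):
    """Closed form exp(beta * (e^a - 1)) of the same sum."""
    return math.exp(beta * (math.exp(a) - 1.0))

def shell_bound(a, beta):
    """Right-hand side exp(a*beta - beta + e^a * beta) of the final estimate."""
    return math.exp(a * beta) * poisson_mgf_closed_form(a, beta)

if __name__ == "__main__":
    a, beta = 0.5, 3.0  # hypothetical sample values
    print(poisson_mgf_series(a, beta), poisson_mgf_closed_form(a, beta), shell_bound(a, beta))
```

With $\beta$ as in (6.3), the exponent $a\beta - \beta + e^a \beta$ is a constant multiple of $|K| R^{d-1} \lambda^{(d-1)/d}$, which is the form of the bound claimed in Proposition 6.1.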
7. Clean Configurations

A local potential defect is a potential defect whose span is less than $1.1L$, where $L = L(\lambda) = \lambda^{-1/d} (\log \lambda)^{(d+1)/d^2}$ as in (5.3). This section will study the statistical properties of configurations that have no local potential defects. These will be called clean configurations.

We shall continue to use the book-keeping tiling $T_B$, which was introduced in Sect. 6. In the following, we assume that $\lambda$ is sufficiently large, so that the diameter of any tile in $T_B$ is less than $L/100$.

Let $\omega \in \Omega$ be some configuration. Its local potential defect zone $Z(\omega)$ is defined as follows. Let $Z_0(\omega)$ be the set of all navels of local potential defects and let $Z(\omega)$ be the set of all tiles of $T_B$ which contain a point in $Z_0(\omega)$. Let $Q$ be any set of tiles of $T_B$. Denote by $D(Q)$ the event that $Z(\omega) = Q$, let $F(Q)$ be the event that $Z(\omega) \supset Q$, and let $N(Q)$ be the event that $Z(\omega) \cap Q = \emptyset$. A clean configuration is just a configuration in $D(\emptyset)$.

We would like to discuss the distribution of clean configurations, that is, to condition on $D(\emptyset)$. Hence it would be useful to have $P_\lambda\bigl[ D(\emptyset) \bigr] > 0$. If $M$ has finite volume, this is clear, since with positive, but very small, probability the configuration $\omega$ will contain only a single point, and then no potential defects are possible. (It will be shown below that the clean configurations are typically not so sparse.) Hence, we shall for simplicity now assume that $M$ has finite volume. There are obvious and simple methods to extend the discussion to the infinite volume case.

Suppose that $A \subset \hat\Omega$ is some event, and $X$ is some subset of $M$. We say that $A$ is independent of $X$, if whenever $\omega \in A$ and $\omega' \in \hat\Omega$ differ only in points which are in
$X$, then also $\omega' \in A$. We shall say that $A$ depends only on $X$, if $A$ is independent of $M - X$.

The next two lemmas relate the properties of random clean configurations to the properties of ordinary configurations.

7.1. First lemma of clean configurations. Let $Q$ be any set of tiles of $T_B$, and let $A \subset \hat\Omega$ be some event which depends only on $\cup Q$, the union of tiles in $Q$. Let $Q_{2L}$ be the set of tiles of $T_B$ with distance at most $2L$ to $\cup Q$. Then
\[ P_\lambda\bigl[ A \mid D(\emptyset) \bigr] \le \frac{P_\lambda[A]}{P_\lambda\bigl[ N(Q_{2L}) \bigr]}. \]
In the proof, we shall need the FKG [8] inequality for Poisson point processes. An event $\mathcal{X} \subset \Omega$ is increasing, if $\omega' \in \mathcal{X}$ whenever $\omega \in \mathcal{X}$ and $\omega \subset \omega' \in \Omega$. A random variable $f : \Omega \to \mathbb{R}$ is increasing if $f(\omega') \ge f(\omega)$ whenever $\omega' \supset \omega$. Decreasing events and random variables are defined similarly. The FKG inequality for events states that $P_\lambda(\mathcal{X} \cap \mathcal{Y}) \ge P_\lambda(\mathcal{X}) P_\lambda(\mathcal{Y})$ if either $\mathcal{X}, \mathcal{Y}$ are both increasing events, or both decreasing events. The FKG inequality for random variables states that $E(fg) \ge Ef\, Eg$, if $f, g$ are both increasing random variables, or both are decreasing random variables. The proof of the FKG inequality for events in Poisson point processes may be found in the paper by R. Roy [14]. Although the setting there is a bit different, the proof is easily adapted to our situation. The FKG inequality for random variables can be obtained as a corollary of the inequality for events.

Proof. Let $Y$ be the set of tiles of $T_B$ which are not in $Q_{2L}$. Observe that $N(Y)$ is independent of $A$. Also note that $N(Y)$ and $N(Q_{2L})$ are both decreasing events, and therefore they are positively correlated, by the FKG inequality. These are the facts that enter into the following estimate:
\begin{align*}
P_\lambda\bigl[ A \mid D(\emptyset) \bigr]
&= \frac{P_\lambda\bigl[ A \cap N(Q_{2L}) \cap N(Y) \bigr]}{P_\lambda\bigl[ N(Q_{2L}) \cap N(Y) \bigr]}
\le \frac{P_\lambda\bigl[ A \cap N(Y) \bigr]}{P_\lambda\bigl[ N(Q_{2L}) \cap N(Y) \bigr]}
= \frac{P_\lambda[A]\, P_\lambda\bigl[ N(Y) \bigr]}{P_\lambda\bigl[ N(Q_{2L}) \cap N(Y) \bigr]} \\
&\le \frac{P_\lambda[A]\, P_\lambda\bigl[ N(Y) \bigr]}{P_\lambda\bigl[ N(Q_{2L}) \bigr]\, P_\lambda\bigl[ N(Y) \bigr]}
= \frac{P_\lambda[A]}{P_\lambda\bigl[ N(Q_{2L}) \bigr]}.
\end{align*}
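The positive correlation of decreasing events is easy to verify by brute force in a discrete stand-in for the Poisson process. The sketch below is only an illustration under a simplifying assumption: space is replaced by `n` cells, each occupied independently with probability `p`, which is the classical Harris setting rather than the continuum setting of [14]. The two sample events are hypothetical.

```python
from itertools import product

def probability(event, n, p):
    """Exact probability of an event under the product Bernoulli(p) measure on {0,1}^n."""
    total = 0.0
    for config in product((0, 1), repeat=n):
        if event(config):
            occupied = sum(config)
            total += p ** occupied * (1.0 - p) ** (n - occupied)
    return total

def fkg_gap(event_a, event_b, n, p):
    """P(A and B) - P(A) P(B); nonnegative when A and B are both decreasing."""
    both = probability(lambda c: event_a(c) and event_b(c), n, p)
    return both - probability(event_a, n, p) * probability(event_b, n, p)

# Two hypothetical decreasing events: adding occupied cells can only destroy them.
def few_points_left(config):
    return config[0] + config[1] <= 1

def few_points_right(config):
    return config[1] + config[2] + config[3] <= 1

if __name__ == "__main__":
    print(fkg_gap(few_points_left, few_points_right, n=4, p=0.3))
```

Both events share cell 1, which is what makes them correlated; for events depending on disjoint sets of cells the gap is zero, matching the independence step used for $A$ and $N(Y)$.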
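In order to effectively apply Lemma 7.1, we shall need an estimate for $P_\lambda\bigl[ N(Q) \bigr]$ when $Q$ is a set of tiles in $T_B$. Proposition 5.3 gives
\[ P_\lambda\bigl[ N(Q) \bigr] \ge 1 - O(1)\, \mathrm{vol}(\cup Q)\, \lambda^{d+2} L^{d^2+d+2} = 1 - O(1)\, \mathrm{vol}(\cup Q)\, \lambda^{(d-2)/d} (\log \lambda)^{O(1)}. \tag{7.1} \]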
We shall need a different estimate for the case that |Q|, the number of tiles in Q, is large. For any set of tiles Q ⊂ TB , the event N (Q) is monotone decreasing. Therefore, the FKG inequality and (7.1) give,
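\[ P_\lambda\bigl[ N(Q) \bigr] \ge \prod_{T \in Q} P_\lambda\bigl[ N(T) \bigr] \ge \Bigl( 1 - O(1)\, \lambda^{-2/d} (\log \lambda)^{O(1)} \Bigr)^{|Q|} \ge \exp\Bigl( -O(1)\, \lambda^{-2/d} (\log \lambda)^{O(1)}\, |Q| \Bigr), \tag{7.2} \]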
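because $1 - \epsilon \ge e^{-2\epsilon}$ when $\epsilon > 0$ is small.

7.2. Second lemma of clean configurations. Let $Q$ be a set of tiles of $T_B$, and let $Q_{6L}$ be the set of all tiles of $T_B$ with distance at most $6L$ from $\cup Q$. Let $A \subset \hat\Omega$ be some event which is independent of $\cup Q_{6L}$. Then,
\[ P_{\lambda,p}\bigl[ A \mid D(Q) \bigr] \le \frac{P_{\lambda,p}\bigl[ A \mid D(\emptyset) \bigr]}{P_\lambda\bigl[ N(Q_{6L}) \bigr]}. \]

Proof. For $j = 1, 2$ let $B_j$ be the set of all tiles $T$ of $T_B - Q$ such that the distance from $T$ to $\cup Q$ is in the range $[3(j-1)L, 3jL)$. Also let $B_3$ be all the tiles of $T_B$ which are not in $Q \cup B_1 \cup B_2$. Then
\begin{align*}
P_{\lambda,p}\bigl[ A \mid D(Q) \bigr]
&= \frac{P_{\lambda,p}\bigl[ A \cap F(Q) \cap N(B_1) \cap N(B_2) \cap N(B_3) \bigr]}{P_\lambda\bigl[ F(Q) \cap N(B_1) \cap N(B_2) \cap N(B_3) \bigr]} \\
&\le \frac{P_{\lambda,p}\bigl[ F(Q) \cap N(B_1) \cap N(B_3) \cap A \bigr]}{P_\lambda\bigl[ F(Q) \cap N(B_1) \cap N(B_2) \cap N(B_3) \bigr]}. \tag{7.3}
\end{align*}
Since the distance between $\cup B_3$ and $\cup(B_1 \cup Q)$ is greater than $2.2L$, the events $N(B_3) \cap A$ and $F(Q) \cap N(B_1)$ are independent; that is,
\[ P_{\lambda,p}\bigl[ F(Q) \cap N(B_1) \cap N(B_3) \cap A \bigr] = P_\lambda\bigl[ F(Q) \cap N(B_1) \bigr]\, P_{\lambda,p}\bigl[ N(B_3) \cap A \bigr]. \tag{7.4} \]
Let $A_1$ be the set of points in $M$ with distance at most $L$ from $(\cup B_1) \cap (\cup B_2)$, let $A_0$ be the points in connected components of $M - A_1$ that intersect $\cup Q$, and let $A_2 = M - A_0 - A_1$. We want to show that the events $F(Q) \cap N(B_1)$ and $N(B_2) \cap N(B_3)$ are positively correlated. For this, the FKG inequality can be used, but not immediately.

Any $\omega \in \hat\Omega$ can be decomposed into $(\omega_0, \omega_1, \omega_2)$, where $\omega_j = A_j \cap \omega$. This induces a decomposition $\Omega = \Omega_0 \times \Omega_1 \times \Omega_2$ of $\Omega$. Note that $F(Q) \cap N(B_1)$ is an event that is independent of $\omega_2$ and is monotone decreasing in $\pi\omega_1$. Similarly, $N(B_2) \cap N(B_3)$ is independent of $\omega_0$ and is monotone decreasing in $\pi\omega_1$. Given any $\omega_1 \in \Omega_1$, let $f(\omega_1)$ be the probability that $(\omega_0, \omega_1, \omega_2) \in F(Q) \cap N(B_1)$, and let $g(\omega_1)$ be the probability that $(\omega_0, \omega_1, \omega_2) \in N(B_2) \cap N(B_3)$, where $\omega_0 \in \Omega_0$ and $\omega_2 \in \Omega_2$ are random. Then $f$ and $g$ are monotone decreasing random variables on $\Omega_1$. Hence, the FKG inequality for random variables, $E(fg) \ge Ef\, Eg$, gives
\[ P_\lambda\bigl[ F(Q) \cap N(B_1) \cap N(B_2) \cap N(B_3) \bigr] \ge P_\lambda\bigl[ F(Q) \cap N(B_1) \bigr]\, P_\lambda\bigl[ N(B_2) \cap N(B_3) \bigr]. \tag{7.5} \]
A similar argument shows that
\begin{align*}
P_{\lambda,p}\bigl[ D(\emptyset) \cap A \bigr]
&\ge P_{\lambda,p}\bigl[ N(B_3) \cap A \bigr]\, P_\lambda\bigl[ N(Q) \cap N(B_1) \cap N(B_2) \bigr] \\
&= P_{\lambda,p}\bigl[ N(B_3) \cap A \bigr]\, P_\lambda\bigl[ N(Q_{6L}) \bigr]. \tag{7.6}
\end{align*}
Now combine (7.3), (7.4), (7.5) and (7.6), to obtain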
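\[ P_{\lambda,p}\bigl[ A \mid D(Q) \bigr] \le \frac{P_{\lambda,p}\bigl[ N(B_3) \cap A \bigr]}{P_\lambda\bigl[ N(B_2) \cap N(B_3) \bigr]} \le \frac{P_{\lambda,p}\bigl[ A \cap D(\emptyset) \bigr]}{P_\lambda\bigl[ N(Q_{6L}) \bigr]\, P_\lambda\bigl[ N(B_2) \cap N(B_3) \bigr]} \le \frac{P_{\lambda,p}\bigl[ A \mid D(\emptyset) \bigr]}{P_\lambda\bigl[ N(Q_{6L}) \bigr]}. \tag{7.7} \]
This proves the lemma.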
7.3. Lemma (clean configurations have no giant tiles). Let $K$ be a compact subset of $M$, and let $S$ be the event that all tiles in $T(\omega, ds)$ which meet $K$ have diameter smaller than $L$. Then
\[ \lim_{\lambda \to \infty} P_\lambda\bigl[ S \mid D(\emptyset) \bigr] = 1. \]
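Proof. Let $U$ be an open set in $M$ whose closure is compact and which contains $K$. Let $X$ be a maximal subset of $U$ such that the distance between any two elements of $X$ is at least $L/9$. For $x \in X$ let $E_x$ be the event that the ball $B_0(x, L/9)$ is disjoint from $\omega$. Let $Q(x)$ be the set of tiles in $T_B$ whose distance from $x$ is at most $3L$. Since $E_x$ depends only on the intersection of $\omega$ with $B_0(x, L/9)$, the First Lemma of Clean Configurations 7.1 gives
\[ P_\lambda\bigl[ E_x \mid D(\emptyset) \bigr] \le \frac{P_\lambda(E_x)}{P_\lambda\bigl[ N(Q(x)) \bigr]} = \frac{e^{-L^d \lambda / O(1)}}{P_\lambda\bigl[ N(Q(x)) \bigr]}. \]
Since
\[ \mathrm{vol}\bigl( Q(x) \bigr) = O(1) L^d = O(1)\, \lambda^{-1} (\log \lambda)^{O(1)}, \]
the inequality (7.1) implies that $P_\lambda\bigl[ N(Q(x)) \bigr] \to 1$ as $\lambda \to \infty$. Therefore,
\[ P_\lambda\Bigl[ \bigcup_{x \in X} E_x \Bigm| D(\emptyset) \Bigr] \le O(1)\, |X|\, e^{-L^d \lambda / O(1)} \le O(1)\, \lambda \exp\bigl( -(\log \lambda)^{1 + \frac{1}{d}} / O(1) \bigr) \to 0 \quad \text{as } \lambda \to \infty. \]
The proof is now completed as the proof of Lemma 5.5.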
8. Insensitivity

This section can be avoided if one is only interested in the case $M = \mathbb{R}^2$.

Let $X$ be some finite set. We denote by $2^X$ the set of functions from $X$ to $\{0, 1\}$, and make the usual identification of $2^X$ with the collection of subsets of $X$. Given an element $a \in 2^X$, we denote by $|a|$ the cardinality of $a$, thought of as a set, which is the same as the $L^1$ norm of $a$, thought of as a function. If $\nu_1, \nu_2$ are two measures on $2^X$, we let $\nu_1 \cup \nu_2$ denote the image of the measure $\nu_1 \times \nu_2$ under the map $\cup : 2^X \times 2^X \to 2^X$. (In other words, $\nu_1 \cup \nu_2$ is the distribution of $a \cup b$, if $a$ and $b$ are independent random elements of $(2^X, \nu_1)$ and $(2^X, \nu_2)$.) Similarly, the measure $\nu_1 \cap \nu_2$ is defined. Fix some $p \in [0, 1]$, and let $\eta$ denote the product measure on $2^X$ with $\eta\{a : x \in a\} = p$ for each $x \in X$.

8.1. Insensitivity Lemma. Let $\nu$ be a measure on $2^X$. Then the following estimate holds for the measure norm of the difference $\eta \cup \nu - \eta$,
\[ \|\eta \cup \nu - \eta\| \le \sqrt{E_{\nu \cap \nu}\, p^{-|a|} - 1}. \]
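The expression $E_{\nu\cap\nu}\, p^{-|a|}$ means the expectation of $p^{-|a|}$ when $a$ is distributed according to $\nu \cap \nu$. The lemma was partly motivated by the concept of influence of a boolean variable on a function, introduced by Ben-Or and Linial [3].

Proof. What can one say? Cauchy–Schwarz!
\[
\begin{aligned}
\|\eta \cup \nu - \eta\|^2 &= \Bigl(\sum_{a\in 2^X} \bigl|\eta\cup\nu(a) - \eta(a)\bigr|\Bigr)^2 \\
&\le \sum_{a\in 2^X} \eta(a) \sum_{a\in 2^X} \eta(a)^{-1}\bigl(\eta\cup\nu(a) - \eta(a)\bigr)^2 \\
&= \sum_{a\in 2^X} \eta(a)^{-1}\bigl(\eta\cup\nu(a)^2 - 2\,\eta\cup\nu(a)\,\eta(a) + \eta(a)^2\bigr) \\
&= \sum_{a\in 2^X} \eta(a)^{-1}\,\eta\cup\nu(a)^2 - 1.
\end{aligned}
\tag{8.1}
\]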
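Observe that $\eta(a) = p^{|a|}(1 - p)^{n-|a|}$, where $n = |X|$. We may write an equality of the form $b \cup c = a$ as $a - c \subset b \subset a$. Hence,
\[ \eta\cup\nu(a) = \sum_{c\subset a} \nu(c)\,p^{|a|-|c|}(1 - p)^{n-|a|} = \eta(a)\sum_{c\subset a} \nu(c)\,p^{-|c|}. \]
We use these expressions to simplify (8.1),
\[
\begin{aligned}
\|\eta \cup \nu - \eta\|^2 &\le \sum_{a\in 2^X} \eta(a)\Bigl(\sum_{c\subset a} \nu(c)\,p^{-|c|}\Bigr)^2 - 1 \\
&= \sum_{a\in 2^X}\sum_{b\subset a}\sum_{c\subset a} \eta(a)\,\nu(b)\,\nu(c)\,p^{-|b|-|c|} - 1 \\
&= \sum_{b\in 2^X}\sum_{c\in 2^X} \nu(b)\,\nu(c)\,p^{-|b|-|c|}\sum_{a\supset b\cup c} \eta(a) - 1 \\
&= \sum_{b\in 2^X}\sum_{c\in 2^X} \nu(b)\,\nu(c)\,p^{-|b|-|c|}\,p^{|b\cup c|} - 1 \\
&= \sum_{b\in 2^X}\sum_{c\in 2^X} \nu(b)\,\nu(c)\,p^{-|b\cap c|} - 1 = E_{\nu\cap\nu}\, p^{-|a|} - 1.
\end{aligned}
\]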
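The estimate of the Insensitivity Lemma can be checked numerically on a small set $X$. The following is a minimal sketch, not part of the original argument: subsets of $X$ are encoded as bitmasks, $\nu$ is taken to be a randomly generated probability measure, and the norm is taken to be the sum of absolute differences, as in the first line of (8.1). All function names are hypothetical.

```python
import math
import random


def popcount(mask: int) -> int:
    """Cardinality |a| of the subset encoded by the bitmask."""
    return bin(mask).count("1")


def product_measure(n: int, p: float) -> list:
    """eta(a) = p^|a| (1-p)^(n-|a|) for every subset a of an n-point set."""
    return [p ** popcount(a) * (1 - p) ** (n - popcount(a)) for a in range(1 << n)]


def random_measure(n: int, rng: random.Random) -> list:
    """An arbitrary probability measure nu on the 2^n subsets (illustrative)."""
    weights = [rng.random() for _ in range(1 << n)]
    total = sum(weights)
    return [w / total for w in weights]


def union_measure(mu1: list, mu2: list) -> list:
    """Distribution of b | c for independent b ~ mu1 and c ~ mu2."""
    out = [0.0] * len(mu1)
    for b, wb in enumerate(mu1):
        for c, wc in enumerate(mu2):
            out[b | c] += wb * wc
    return out


def insensitivity_sides(n: int, p: float, nu: list) -> tuple:
    """Return (||eta U nu - eta||, sqrt(E_{nu^nu} p^{-|a|} - 1))."""
    eta = product_measure(n, p)
    eta_union_nu = union_measure(eta, nu)
    lhs = sum(abs(x - y) for x, y in zip(eta_union_nu, eta))
    expectation = sum(
        wb * wc * p ** (-popcount(b & c))
        for b, wb in enumerate(nu)
        for c, wc in enumerate(nu)
    )
    # Guard against tiny negative values caused by rounding.
    rhs = math.sqrt(max(expectation - 1.0, 0.0))
    return lhs, rhs
```

Calling `insensitivity_sides(4, 0.3, random_measure(4, random.Random(1)))` returns the two sides of the inequality for one random choice of $\nu$; the left value should not exceed the right one.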
8.2 Corollary. Let $\nu^c$ denote the image of $\nu$ under the map $a \to X - a$ from $2^X$ to $2^X$. Then
\[ \|\eta \cap \nu^c - \eta\| \le \sqrt{E_{\nu\cap\nu}\,(1 - p)^{-|a|} - 1}. \]
Proof. Use $\eta \cap \nu^c = (\eta^c \cup \nu)^c$, and apply the lemma.
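The identity used in the proof of the corollary, and the resulting bound, can also be checked by brute force. This is a hedged sketch under the same conventions as before (bitmask subsets, sum of absolute differences as the norm, a randomly chosen probability measure $\nu$); the helper names are hypothetical. It uses the fact that the image of the product measure with parameter $p$ under complementation is the product measure with parameter $1 - p$.

```python
import math
import random


def popcount(mask: int) -> int:
    return bin(mask).count("1")


def product_measure(n: int, p: float) -> list:
    """Product measure on subsets of an n-point set, each point present w.p. p."""
    return [p ** popcount(a) * (1 - p) ** (n - popcount(a)) for a in range(1 << n)]


def complement_image(mu: list, n: int) -> list:
    """Image of mu under a -> X - a."""
    full = (1 << n) - 1
    out = [0.0] * len(mu)
    for a, w in enumerate(mu):
        out[full ^ a] += w
    return out


def combine(mu1: list, mu2: list, op) -> list:
    """Distribution of op(b, c) for independent b ~ mu1, c ~ mu2."""
    out = [0.0] * len(mu1)
    for b, wb in enumerate(mu1):
        for c, wc in enumerate(mu2):
            out[op(b, c)] += wb * wc
    return out


def corollary_check(n: int, p: float, nu: list) -> tuple:
    """Return (lhs, rhs, max deviation from the identity in the proof)."""
    eta = product_measure(n, p)
    direct = combine(eta, complement_image(nu, n), lambda b, c: b & c)
    # (eta^c U nu)^c, where eta^c is the product measure with parameter 1 - p.
    eta_c = product_measure(n, 1 - p)
    via_union = complement_image(combine(eta_c, nu, lambda b, c: b | c), n)
    identity_gap = max(abs(x - y) for x, y in zip(direct, via_union))
    lhs = sum(abs(x - y) for x, y in zip(direct, eta))
    expectation = sum(
        wb * wc * (1 - p) ** (-popcount(b & c))
        for b, wb in enumerate(nu)
        for c, wc in enumerate(nu)
    )
    rhs = math.sqrt(max(expectation - 1.0, 0.0))
    return lhs, rhs, identity_gap
```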
9. Assembly

Proof of Theorems 2.1 and 2.2. In the proof, we shall assume that $M$ is compact. This is basically for convenience of notation, and it is easy to modify the arguments to apply in general.

Let $C$ denote the event of crossing, that is $C = C(M, M', S_1, S_2, ds) \subset \hat\Omega$, in the situation of Theorem 2.1 and $C = C(M, M', \alpha, ds)$, in the situation of Theorem 2.2. Similarly, let $C_u$ denote the crossing, but with respect to the conformal metric $e^u ds$. Given any set $Q$ of tiles in $T_B$, let $\mathcal{P}(Q)$ be the event that $\cup Q$ is pivotal for $C$; that is, $\mathcal{P}(Q)$ is the set of all $\omega$ such that there is an $\omega'$ which equals $\omega$ outside of $Q$, and one of them is in $C$ while the other is not.

Let $\Delta C$ be the event that there is a crossing with respect to the metric $ds$, but not with respect to the metric $e^u ds$. In other words, $\Delta C = C - C_u$. We need to estimate $P_{\lambda,p}(\Delta C)$ when $\lambda$ is large.

We shall continue to use the book-keeping tiling $T_B$ from Sect. 6. For each set $Q$ of tiles in $T_B$, and for $a > 0$, let $Q_a$ denote the set of tiles in $T_B$ with distance at most $a$ to $\cup Q$. Recall the definition 5 of $L$. We assume that $\lambda$ is so large that in the scale of $L$ the sets $M'$, $S_1$, $S_2$ are 'very smooth'. Let $S$ be the event that all tiles in $T(\omega, ds)$ have diameter at most $L$. Recall that for any set $Q$ of tiles in $T_B$, $D(Q)$ denotes the event that $Q$ is the set of tiles containing navels of local potential defects. Since when $\omega \in S$, defects can occur only at local potential defects, and because defects affect the connectivity only for the tiles close by, we have,
\[ \Delta C \cap S \cap D(Q) \subset \mathcal{P}(Q_{6L}). \tag{9.1} \]
We shall now estimate $P_{\lambda,p}(\Delta C)$. Our first goal is to have an estimate on $P_{\lambda,p}(\Delta C)$ in terms of a random clean configuration with defects and an independent collection of defects added on top of it. (While this is not a precise mathematical statement, we hope it aids the intuition of the reader.) First write,
\[ P_{\lambda,p}(\Delta C) \le 1 - P_\lambda(S) + P_{\lambda,p}(\Delta C \cap S). \tag{9.2} \]
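Now estimate the last summand, using (9.1) and Lemma 7,
\[
\begin{aligned}
P_{\lambda,p}(\Delta C \cap S) &= \sum_Q P_{\lambda,p}\bigl(\Delta C \cap S \mid D(Q)\bigr)\,P_\lambda\bigl(D(Q)\bigr) \\
&\le \sum_Q P_{\lambda,p}\bigl(\mathcal{P}(Q_{6L}) \mid D(Q)\bigr)\,P_\lambda\bigl(D(Q)\bigr) \\
&\le \sum_Q \min\Bigl\{1,\; P_{\lambda,p}\bigl(\mathcal{P}(Q_{6L}) \mid D(\emptyset)\bigr)\,P_\lambda\bigl(N(Q_{6L})\bigr)^{-1}\Bigr\}\,P_\lambda\bigl(D(Q)\bigr) \\
&\le 2\sum_Q P_{\lambda,p}\bigl(\mathcal{P}(Q_{6L}) \mid D(\emptyset)\bigr)\,P_\lambda\bigl(D(Q)\bigr) + \sum_{P_\lambda(N(Q_{6L})) < 1/2} P_\lambda\bigl(D(Q)\bigr).
\end{aligned}
\tag{9.3}
\]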
Our first goal of reducing to the situation where there is a clean configuration with defects added on top can now be considered as accomplished. (This is the meaning of the left summand, which is the more significant one.) We now estimate the left summand.
Let $X$ be a maximal set of points in $M$ with the property that the distance between any two points in $X$ is at least $L$, and for each $x \in X$ let $S(x)$ denote the sphere of radius $15L$ about $x$. For each set $Q$ of tiles of $T_B$, we let $X(Q)$ denote the intersection of $X$ with $\cup Q_L$. It follows that the balls of radius $L$ and centers in $X(Q)$ cover $\cup Q$.

Fix for a moment some $\omega = (\omega_o, \omega_c) \in \hat\Omega$ and some $Q$. Let $W(\omega, Q)$ denote the nuclei of tiles in $T(\omega, ds)$ that intersect $\cup_{x\in X(Q)} S(x)$. Set
\[ \omega_Q^+ = \bigl(\omega_o \cup W(\omega, Q),\; \omega_c - W(\omega, Q)\bigr), \qquad \omega_Q^- = \bigl(\omega_o - W(\omega, Q),\; \omega_c \cup W(\omega, Q)\bigr). \]
In other words, $\omega_Q^+$ is obtained from $\omega$ by opening all the nuclei of tiles which intersect $\cup_{x\in X(Q)} S(x)$, and $\omega_Q^-$ is obtained from $\omega$ by closing them. Let $K(Q)$ denote the event that there is a crossing for $\omega_Q^+$, but not for $\omega_Q^-$. Observe that $S \cap \mathcal{P}(Q_{6L}) \subset K(Q)$, which gives,
\[ P_{\lambda,p}\bigl(\mathcal{P}(Q_{6L}) \mid D(\emptyset)\bigr) \le P_{\lambda,p}\bigl(K(Q) \mid D(\emptyset)\bigr) + 1 - P_\lambda\bigl(S \mid D(\emptyset)\bigr). \]
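Now, (9.3) implies,
\[ P_{\lambda,p}(\Delta C \cap S) \le 2\sum_Q P_{\lambda,p}\bigl(K(Q) \mid D(\emptyset)\bigr)\,P_\lambda\bigl(D(Q)\bigr) + 2 - 2P_\lambda\bigl(S \mid D(\emptyset)\bigr) + \sum_{P_\lambda(N(Q_{6L})) < 1/2} P_\lambda\bigl(D(Q)\bigr). \tag{9.4} \]
We now estimate the sum
\[ \sum_Q P_{\lambda,p}\bigl(K(Q) \mid D(\emptyset)\bigr)\,P_\lambda\bigl(D(Q)\bigr). \tag{9.5} \]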
Fix some clean $\omega \in D(\emptyset)$, and let $\omega' \in \Omega$ be arbitrary. Recall that $Z(\omega')$ denotes the set of tiles in $T_B$ that contain navels of local potential defects of $\omega'$. This means, $P_\lambda\bigl(Z(\omega') = Q\bigr) = P_\lambda\bigl(D(Q)\bigr)$. So (9.5) can be written as
\[ P\bigl(\omega \in K(Z(\omega')) \bigm| \omega \in D(\emptyset)\bigr), \tag{9.6} \]
where the probability is with respect to the joint distribution of $\omega$ and $\omega'$. Set $\tau = \pi\omega$ (recall that this means that $\tau$ is the same as $\omega$, except that it is not specified which nuclei of $\tau$ are open and which are closed). We may think of $\omega$ as a random coloring of $\tau$, and rewrite (9.6) as,
\[ E_\tau\Bigl[P\bigl(K(Z(\omega'))\bigr) \Bigm| \tau \in D(\emptyset)\Bigr]. \tag{9.7} \]
Here the probability is with respect to the coloring of $\tau$ and with respect to the choice of $\omega'$.

Let us fix $\tau$ for a moment, and consider $\omega'$ and the coloring of $\tau$ as random. On $2^\tau$, the collection of subsets of $\tau$, let $\eta$ be the $(p, 1 - p)$ product measure. In other words, $\eta$ is the distribution of $\omega_o$. Let $\nu$ be the measure on subsets of $\tau$ given by $\nu(A) = P_{\omega'}\bigl(W(\tau, Z(\omega')) \in A\bigr)$; that is, $\nu$ is the image of the measure $P_\lambda$ under the map $\omega' \to W(\tau, Z(\omega'))$. Note that with the notations of the Insensitivity Lemma 8.1 and its Corollary 8.2, the open nuclei in $\omega^+_{Z(\omega')}$ are distributed according to $\eta \cup \nu$, and the open nuclei in $\omega^-_{Z(\omega')}$ are distributed according to $\eta \cap \nu^c$. So,
\[ P\bigl(K(Z(\omega'))\bigr) = P_{\eta\cup\nu}(C) - P_{\eta\cap\nu^c}(C) \le \|\eta\cup\nu - \eta\cap\nu^c\| \le \|\eta\cup\nu - \eta\| + \|\eta\cap\nu^c - \eta\|. \tag{9.8} \]
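Let $a$ be distributed according to the measure $\nu \cap \nu$, and set $\beta = \max\{-\log p, -\log(1 - p)\}$. With the help of the Insensitivity Lemma and its corollary, (9.8) gives the following estimate,
\[ P\bigl(K(Z(\omega'))\bigr) \le 2\sqrt{E_{\nu\cap\nu}\, e^{\beta|a|} - 1}. \tag{9.9} \]
Let $\omega''$ be another random element of $\Omega$, and set $m = m(\tau, \omega', \omega'') = \bigl|W(\tau, Z(\omega')) \cap W(\tau, Z(\omega''))\bigr|$. Since (9.5) is equal to (9.7), the inequality (9.9) allows us to make the following estimate,
\[
\begin{aligned}
\sum_Q P_{\lambda,p}\bigl(K(Q) \mid D(\emptyset)\bigr)\,P_\lambda\bigl(D(Q)\bigr)
&\le E_\tau\Bigl[\min\Bigl\{1,\; 2\sqrt{E_{\omega',\omega''}\, e^{\beta m} - 1}\Bigr\} \Bigm| \tau \in D(\emptyset)\Bigr] \\
&\le 1 - P_\lambda\bigl(S \mid D(\emptyset)\bigr) + 2E_\tau\Bigl[\sqrt{E_{\omega',\omega''}\, e^{\beta m} - 1} \Bigm| \tau \in D(\emptyset) \cap S\Bigr]\,P_\lambda\bigl(S \mid D(\emptyset)\bigr) \\
&\le 1 - P_\lambda\bigl(S \mid D(\emptyset)\bigr) + 2\sqrt{E_{\tau,\omega',\omega''}\bigl[e^{\beta m} - 1 \bigm| \tau \in D(\emptyset) \cap S\bigr]} \\
&= 1 - P_\lambda\bigl(S \mid D(\emptyset)\bigr) + 2\sqrt{E_{\tau,\omega',\omega''}\bigl[e^{\beta m} \bigm| \tau \in D(\emptyset) \cap S\bigr] - 1}.
\end{aligned}
\tag{9.10}
\]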
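Let $F$ be the (random) set of points $x \in X$ such that $Z(\omega')$ and $Z(\omega'')$ both intersect the ball of radius $50L$ about $x$, and let $n_x$ be the number of tiles in $T(\tau, ds)$ that intersect $S(x)$ and have diameter at most $L$. Set $n = n(F, \tau) = \sum_{x\in F} n_x$. Then for $\tau \in S$ we have $n \ge m$. Consequently,
\[
\begin{aligned}
E_{\tau,\omega',\omega''}\bigl[e^{\beta m} \bigm| \tau \in D(\emptyset) \cap S\bigr]
&\le E_{\tau,\omega',\omega''}\bigl[e^{\beta n(F,\tau)} \bigm| \tau \in D(\emptyset) \cap S\bigr] \\
&\le \frac{E_{\tau,\omega',\omega''}\bigl[e^{\beta n(F,\tau)} \bigm| \tau \in D(\emptyset)\bigr]}{P_\lambda\bigl(S \mid D(\emptyset)\bigr)} \\
&= P_\lambda\bigl(S \mid D(\emptyset)\bigr)^{-1}\sum_{K\subset X} P(F = K)\,E_\tau\bigl[e^{\beta n(K,\tau)} \bigm| \tau \in D(\emptyset)\bigr].
\end{aligned}
\tag{9.11}
\]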
Let Qs (K) denote the set of tiles in TB that are within distance s of K, and note that n(K, τ ) depends only on the intersection of τ with Q20L (K). We use the First Lemma of Clean Configurations (7.1) to estimate the above conditional expectation by an unconditional expectation, and then apply Proposition 6.1, as follows,
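\[
E_\tau\bigl[e^{\beta n(K,\tau)} \bigm| \tau \in D(\emptyset)\bigr] \le \frac{E_\tau\bigl[e^{\beta n(K,\tau)}\bigr]}{P_\lambda\bigl(N(Q_{30L}(K))\bigr)} \le \frac{\exp\bigl(O(1)|K|L^{d-1}\lambda^{(d-1)/d}\bigr)}{P_\lambda\bigl(N(Q_{30L}(K))\bigr)} = \frac{\exp\bigl(O(1)|K|(\log\lambda)^{1-d^{-2}}\bigr)}{P_\lambda\bigl(N(Q_{30L}(K))\bigr)}.
\tag{9.12}
\]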
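The number of tiles of $T_B$ in $Q_{30L}(K)$ is $O(1)|K|(\log\lambda)^{O(1)}$. Consequently, by (7.2), $P_\lambda\bigl(N(Q_{30L}(K))\bigr) \ge \exp\bigl(-O(1)|K|\lambda^{-2/d}(\log\lambda)^{O(1)}\bigr)$. Hence, (9.12) may be improved to
\[ E_\tau\bigl[e^{\beta n(K,\tau)} \bigm| \tau \in D(\emptyset)\bigr] \le \exp\bigl(O(1)|K|(\log\lambda)^{1-d^{-2}}\bigr). \tag{9.13} \]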
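In order to get a good estimate for the right hand side of (9.11), we now study the distribution of $F$. For any $x \in X$, the inequality (7.1) provides the following estimate for the probability that $x \in F$.
\[ P_\lambda(x \in F) \le O(1)\bigl(L^d\lambda^{(d-2)/d}(\log\lambda)^{O(1)}\bigr)^2 = \lambda^{-4/d}(\log\lambda)^{O(1)}. \tag{9.14} \]
Let $X = X_1 \cup \cdots \cup X_N$ be a partition of $X$ into disjoint sets $X_j$ with the property that for each $j$ the distance between any two elements of $X_j$ is at least $150L$. We take $N$ to be bounded by a constant, which depends only on $d$. This is possible, since there is a bound on the number of points of $X$ in a ball of radius $150L$. Note that if $x, x' \in X_j$, then the events $x \in F$ and $x' \in F$ are independent. Using (9.14) and $|X| = o(1)\lambda$, this gives,
\[ P\bigl(|F \cap X_j| = k\bigr) \le \lambda^k\lambda^{-4k/d}(\log\lambda)^{O(1)k} = \lambda^{(d-4)k/d}(\log\lambda)^{O(1)k}. \]
If $|F| = k$, we must have $k \ge |F \cap X_j| \ge k/N$, for some $j$. Consequently,
\[ P\bigl(|F| = k\bigr) \le N(k + 1)\lambda^{(d-4)k/(Nd)}(\log\lambda)^{O(1)k}. \]
Together with (9.13) and (9.11), this gives,
\[
\begin{aligned}
E_{\tau,\omega',\omega''}\bigl[e^{\beta m} \bigm| \tau \in D(\emptyset) \cap S\bigr]
&\le P_\lambda\bigl(S \mid D(\emptyset)\bigr)^{-1}\sum_{K\subset X} P(F = K)\,E_\tau\bigl[e^{\beta n(K,\tau)} \bigm| \tau \in D(\emptyset)\bigr] \\
&\le P_\lambda\bigl(S \mid D(\emptyset)\bigr)^{-1} + P_\lambda\bigl(S \mid D(\emptyset)\bigr)^{-1}\sum_{k=1}^{\infty} N(k + 1)\lambda^{(d-4)k/(Nd)}(\log\lambda)^{O(1)k}\exp\bigl(O(1)k(\log\lambda)^{1-d^{-2}}\bigr) \\
&= P_\lambda\bigl(S \mid D(\emptyset)\bigr)^{-1}\bigl(1 + o(1)\bigr),
\end{aligned}
\tag{9.15}
\]
as $\lambda \to \infty$, because $d < 4$. Recall that Lemma 7 says that
\[ \lim_{\lambda\to\infty} P_\lambda\bigl(S \mid D(\emptyset)\bigr) = 1. \tag{9.16} \]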
With (9.10) and (9.15), this gives,
\[ \sum_Q P_{\lambda,p}\bigl(K(Q) \mid D(\emptyset)\bigr)\,P_\lambda\bigl(D(Q)\bigr) \longrightarrow 0 \quad (\lambda\to\infty). \]
From this, (9.16) and (9.4), we get,
\[ P_{\lambda,p}(\Delta C \cap S) = o(1) + \sum_{P_\lambda(N(Q_{6L})) < 1/2} P_\lambda\bigl(D(Q)\bigr). \tag{9.17} \]
For any given tile in $T_B$ the probability that it is in $Z(\omega)$ is bounded by $O(1)\lambda^{-2/d}(\log\lambda)^{O(1)}$, by (7.1). Because the total number of tiles in $T_B$ is $O(\lambda)$ the expected number of tiles in $Z(\omega)$ satisfies,
\[ E|Z(\omega)| \le O(1)\lambda^{(d-2)/d}(\log\lambda)^{O(1)}. \tag{9.18} \]
On the other hand, (7.1) also implies that the number of tiles in $Q$ must be at least $\lambda^{2/d}(\log\lambda)^{-O(1)}$, if $P_\lambda\bigl(N(Q_{6L})\bigr) < 1/2$. This gives the inequality,
\[ E|Z(\omega)| \ge \lambda^{2/d}(\log\lambda)^{-O(1)}\sum_{P_\lambda(N(Q_{6L})) < 1/2} P_\lambda\bigl(D(Q)\bigr). \tag{9.19} \]
The combination of (9.18) and (9.19) implies,
\[ \sum_{P_\lambda(N(Q_{6L})) < 1/2} P_\lambda\bigl(D(Q)\bigr) \le O(1)\lambda^{(d-4)/d}(\log\lambda)^{O(1)} \longrightarrow 0 \quad (\lambda\to\infty), \]
because $d < 4$. Now from (9.17), (9.2) and Lemma 5.5, it follows that $P_{\lambda,p}(\Delta C) \to 0$ as $\lambda \to \infty$, which completes the proof of the theorem.
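The role of the restriction $d < 4$ in the last step is pure exponent bookkeeping: dividing the upper bound (9.18) by the lower bound (9.19) leaves a power $\lambda^{(d-2)/d - 2/d} = \lambda^{(d-4)/d}$, up to logarithmic factors. The following small sketch (function names are hypothetical) records this with exact fractions. The second function is an aside that assumes, as the proof of Lemma 7.3 suggests, that $L^d\lambda$ is of order $(\log\lambda)^{1+1/d}$, and recovers the exponent $1 - d^{-2}$ appearing in (9.12).

```python
from fractions import Fraction


def defect_sum_exponent(d: int) -> Fraction:
    """Power of lambda left after dividing (9.18) by (9.19), ignoring log factors."""
    upper = Fraction(d - 2, d)  # exponent of lambda in (9.18)
    lower = Fraction(2, d)      # exponent of lambda in (9.19)
    return upper - lower        # equals (d - 4) / d


def log_exponent_in_9_12(d: int) -> Fraction:
    """Power of log(lambda) in L^(d-1) * lambda^((d-1)/d).

    Assumption (not stated in this section): L^d * lambda ~ (log lambda)^(1 + 1/d),
    so L ~ lambda^(-1/d) * (log lambda)^((d + 1) / d^2).
    """
    return Fraction(d + 1, d * d) * (d - 1)
```

For $d = 2, 3$ the first exponent is negative, so the sum tends to zero; at $d = 4$ it vanishes and the argument gives no decay.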
10. The Density Invariance Conjecture

The following conjecture is probably true only in dimension $d = 2$.

10.1. Density invariance conjecture. Let $C(M, M', S_1, S_2, ds)$ be the crossings event, as in Theorem 2.1, let $\mu$ be a measure on $M$, comparable to vol, and let $P^\mu_{\lambda,p}$ denote the resulting measure on $\hat\Omega$, where we have stressed the dependence on $\mu$. Then the limit crossing probability
\[ P\bigl(C(M, M', S_1, S_2, ds, \mu, p)\bigr) = \lim_{\lambda\to\infty} P^\mu_{\lambda,p}\bigl(C(M, M', S_1, S_2, ds)\bigr) \]
exists, and does not depend on $\mu$. A similar statement holds for the percolation in homotopy classes of Theorem 2.2.
One may consider a weaker version of the conjecture, where the claim is only that the difference in the probabilities corresponding to two measures $\mu$, $\mu^*$ tends to zero as $\lambda \to \infty$, instead of claiming that the limit exists. A stronger version of the conjecture would state that the convergence is uniform in $\mu$, as long as the constant $c > 0$ such that $c^{-1}\,\mathrm{vol} \le \mu \le c\,\mathrm{vol}$ is held fixed. At least in the plane, the numerical evidence below also suggests that the limiting crossing probabilities are the same as for the bond percolation model.

The requirement that $\mu$ be comparable to vol is probably stronger than needed. On the other hand, assuming only that its support is $M$ would not be sufficient. Consider the following example. Let $\{A_j\}$ be a sequence of vertical lines whose union is dense in the plane, and let $\mu_j$ be the length measure on $A_j$. Let $\{a_j\}$ be a sequence of positive numbers that tends to zero very fast, and let $\mu = \sum_j a_j\mu_j$. Then it is not hard to see that when $\lambda \to \infty$ the probability for crossing a horizontal rectangle from left to right tends to 1.

Numerical evidence. Following is some numerical evidence which supports the conjecture in the plane. We have tested five different measures $\mu_1, \ldots, \mu_5$. Their densities $f_1(x, y), \ldots, f_5(x, y)$, respectively, all depend only on the $x$ variable, and are given in Table 1. Figure 2 shows a Voronoi tiling for a configuration obtained with the measure $\mu_4$.

Table 1. The densities of the measures tested
\[
\begin{aligned}
f_1(x, y) &= 1, \\
f_2(x, y) &= \begin{cases} 2/5, & 1/3 < x < 2/3, \\ 1, & \text{otherwise,} \end{cases} \\
f_3(x, y) &= \begin{cases} 1, & x < 1/3, \\ \sqrt{2/5}, & 1/3 \le x \le 2/3, \\ 2/5, & 2/3 < x, \end{cases} \\
f_4(x, y) &= \begin{cases} 1, & x < 1/2, \\ 2/5, & 1/2 \le x, \end{cases} \\
f_5(x, y) &= \begin{cases} 1, & x < 1/3, \\ (8 - 9x)/5, & 1/3 \le x \le 2/3, \\ 2/5, & 2/3 < x. \end{cases}
\end{aligned}
\]
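Point configurations with these densities can be generated by rejection sampling, since every density in Table 1 is bounded by 1. The sketch below is illustrative only and is not the authors' code: the function names are hypothetical, the rectangle $[0, 1.2] \times [0, 1]$ is taken from the experiment described next, and the middle value of $f_3$ is read as $\sqrt{2/5}$.

```python
import math
import random


def f1(x: float) -> float:
    return 1.0


def f2(x: float) -> float:
    return 0.4 if 1 / 3 < x < 2 / 3 else 1.0


def f3(x: float) -> float:
    if x < 1 / 3:
        return 1.0
    if x <= 2 / 3:
        return math.sqrt(0.4)  # middle value read as sqrt(2/5)
    return 0.4


def f4(x: float) -> float:
    return 1.0 if x < 0.5 else 0.4


def f5(x: float) -> float:
    if x < 1 / 3:
        return 1.0
    if x <= 2 / 3:
        return (8 - 9 * x) / 5
    return 0.4


def sample_points(density, count, rng, width=1.2, height=1.0, bound=1.0):
    """Rejection sampling: propose uniformly in the rectangle and accept a
    proposal with probability density(x) / bound."""
    points = []
    while len(points) < count:
        x = rng.uniform(0.0, width)
        y = rng.uniform(0.0, height)
        if rng.random() * bound <= density(x):
            points.append((x, y))
    return points
```

For example, `sample_points(f4, 1000, random.Random(0))` returns 1000 points, roughly 64% of which should have $x < 1/2$, since the mass of $f_4$ to the left of $1/2$ is $0.5$ out of a total of $0.78$.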
With each of these measures, we ran the following experiment 200 times. Set $R = [a, b] \times [c, d] = [0, 1.2] \times [0, 1]$ and $R' = [a', b'] \times [c', d'] = [.08, 1.12] \times [.08, .92]$. Then the rectangle $R'$ fits in $R$ with a margin of 0.08. In the rectangle $R$, 100,000 points were distributed independently, according to the given measure. The Voronoi tiling was then computed. Following that, 1,000 times, random colorings of the resulting tilings were computed; in each coloring the probability for a tile to be open was taken to be 1/2, independently.¹ Then the algorithm determined the largest $r_0 \in \bigl[0, (b' - a')/(d' - c')\bigr]$ such that some connected component of the intersection of the union of open tiles with the rectangle $R'$ intersects both lines $\{(a', y) : y \in \mathbb{R}\}$, $\{(a' + r_0(d' - c'), y) : y \in \mathbb{R}\}$. After all these runs, for any $r$ in the left hand column of Table 2, the proportion of the runs for which $r \le r_0$ was computed, which is a statistical estimate for the probability for left to right crossing of the rectangle $[0, r] \times [0, 1]$. The resulting figures, denoted $P_{\mu_j}(r)$, are listed in Table 2, together with the values obtained from Cardy's formula [6], and the numerical values given in Langlands et al. [12], for the $\mathbb{Z}^2$ percolation model. We wish to stress that different entries in the column corresponding to any $\mu_j$ were obtained using the same trials, and are therefore dependent. On the other hand, entries in different columns may be considered independent. Note that the largest deviation between an entry in the middle columns and Cardy's value is in the order of 0.005. This is roughly comparable to an error in $r$ that is equal to the typical size of a Voronoi cell in these Voronoi tilings.

¹ Actually, with the objective of saving computing time, the complete coloring was not computed, only the colors of the tiles that the algorithm queried were determined, but the result is the same.

Fig. 2. A Voronoi tiling for a random configuration obtained with $\mu = \mu_4$

Qhull, a program created at the Geometry Center in Minnesota, was used to compute the Voronoi tilings. We wish to thank the authors of qhull, C. Bradford Barber and Hannu Huhdanpaa, and the Geometry Center, for making it available.

Invariance under conformal mappings. Following is the conjecture from Langlands et al. [12], adapted to the Voronoi model.

Conjecture 10.2. Let $J$ be a closed topological disk in the plane $\mathbb{R}^2 = \mathbb{C}$, and let $\gamma_1, \gamma_2 \subset \partial J$ be two disjoint arcs. Let $\omega \in \hat\Omega$ be a random colored Poisson point process in the plane, with respect to ordinary area measure, with density $\lambda$ and $p = 1/2$, and consider the resulting Voronoi tiling $T$. Let $P\bigl(C_\lambda(J, \gamma_1, \gamma_2)\bigr)$ be the probability that there is some path in $J$ that connects $\gamma_1$ and $\gamma_2$, and is contained in the union of open tiles of $T$. Suppose that $f : J \to \mathbb{R}^2$ is a continuous injective mapping, which is conformal in the interior of $J$. Then
\[ \lim_{\lambda\to\infty} P\bigl(C_\lambda(J, \gamma_1, \gamma_2)\bigr) = \lim_{\lambda\to\infty} P\bigl(C_\lambda(f(J), f(\gamma_1), f(\gamma_2))\bigr). \]
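Returning to the numerical experiment described earlier in this section, its basic ingredient, testing a colored Voronoi tiling for a left to right open crossing, can be imitated without a Voronoi library. The following is a rough sketch and not the procedure used for Table 2: it replaces the exact tiling (computed with qhull in the experiment) by a pixel approximation in which each pixel centre inherits the color of its nearest nucleus, uses the uniform density, draws one coloring per tiling, works in the full rectangle without a margin, and tests only a fixed rectangle rather than computing $r_0$. All names are hypothetical.

```python
import random
from collections import deque


def nearest_site(x, y, sites):
    """Index of the nucleus closest to (x, y); brute force, for small inputs."""
    best_index, best_dist = 0, float("inf")
    for index, (sx, sy) in enumerate(sites):
        dist = (x - sx) ** 2 + (y - sy) ** 2
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def rasterize(sites, is_open, width, height, nx, ny):
    """Boolean grid: True where the pixel centre lies in an open Voronoi tile."""
    grid = []
    for j in range(ny):
        cy = (j + 0.5) * height / ny
        row = []
        for i in range(nx):
            cx = (i + 0.5) * width / nx
            row.append(is_open[nearest_site(cx, cy, sites)])
        grid.append(row)
    return grid


def has_left_right_crossing(grid):
    """Breadth-first search through open pixels (4-neighbour adjacency)
    from the first column to the last column."""
    ny, nx = len(grid), len(grid[0])
    queue = deque((j, 0) for j in range(ny) if grid[j][0])
    seen = set(queue)
    while queue:
        j, i = queue.popleft()
        if i == nx - 1:
            return True
        for dj, di in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nj, ni = j + dj, i + di
            if 0 <= nj < ny and 0 <= ni < nx and grid[nj][ni] and (nj, ni) not in seen:
                seen.add((nj, ni))
                queue.append((nj, ni))
    return False


def estimate_crossing(num_sites, p, trials, rng, width=1.0, height=1.0, nx=40, ny=40):
    """Fraction of trials with an open left-right crossing of the rectangle."""
    hits = 0
    for _ in range(trials):
        sites = [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(num_sites)]
        is_open = [rng.random() < p for _ in range(num_sites)]
        hits += has_left_right_crossing(rasterize(sites, is_open, width, height, nx, ny))
    return hits / trials
```

The brute-force nearest-nucleus search costs time proportional to the number of pixels times the number of nuclei, so this sketch is only practical for a few hundred nuclei; the pixel adjacency also introduces a discretization error that the exact tiling does not have.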
Table 2.

r      Pµ1(r)  Pµ2(r)  Pµ3(r)  Pµ4(r)  Pµ5(r)  Lang. et al.  Cardy's value
.5000  .8214   .8229   .8234   .8229   .8200   −             .8244
.5235  .8037   .8063   .8070   .8046   .8028   .8065         .8070
.5481  .7854   .7883   .7888   .7867   .7847   .7783         .7889
.5779  .7636   .7673   .7673   .7646   .7626   .7666         .7671
.6070  .7428   .7462   .7456   .7438   .7418   .7453         .7459
.6400  .7197   .7220   .7222   .7204   .7188   .7217         .7223
.6667  .7011   .7028   .7036   .7015   .7001   −             .7035
.6721  .6974   .6992   .6998   .6978   .6966   .6994         .6997
.7059  .6742   .6770   .6762   .6747   .6737   .6762         .6765
.7414  .6508   .6532   .6525   .6510   .6498   .6522         .6527
.7500  .6451   .6475   .6469   .6456   .6441   −             .6470
.7753  .6287   .6313   .6303   .6298   .6279   .6301         .6306
.8190  .6011   .6037   .6029   .6022   .6007   .6026         .6030
.8611  .5758   .5782   .5772   .5775   .5755   .5768         .5774
.9048  .5508   .5534   .5516   .5522   .5507   .5516         .5519
.9512  .5254   .5271   .5263   .5271   .5251   .5257         .5260
1.000  .4994   .5010   .5004   .5012   .4997   .4999         .5000
1.051  .4730   .4750   .4750   .4757   .4742   .4743         .4741
1.105  .4475   .4492   .4495   .4506   .4490   .4484         .4482
1.161  .4222   .4238   .4244   .4254   .4238   .4230         .4227
1.221  .3965   .3974   .3989   .3997   .3978   .3974         .3970

Duality in the plane. Observe that the probability that there is some point that belongs to more than 3 Voronoi tiles is zero. A configuration in which 4 Voronoi tiles have a nonempty intersection will be called degenerate. It follows that the boundary of any union of tiles of a nondegenerate configuration is a disjoint collection of paths in the plane. In the situation of the conjecture, let $\hat\gamma_1$ and $\hat\gamma_2$ be the two arcs in $\partial J - (\gamma_1 \cup \gamma_2)$, and let $\omega \in \hat\Omega$ be a nondegenerate configuration. Let $A$ be the set of all points in $J$ that are either on $\gamma_1$ or may be joined to $\gamma_1$ by a path in $J$ contained in open tiles. Then either $A$ intersects $\gamma_2$, or there is a boundary component of $A \cap J$ that connects $\hat\gamma_1$ and $\hat\gamma_2$. In the latter case, it follows that there is a path in $J$ from $\hat\gamma_1$ to $\hat\gamma_2$ that is contained entirely in closed tiles. On the other hand, if there is such a path connecting $\hat\gamma_1$ and $\hat\gamma_2$, then there cannot be an open path in $J$ connecting $\gamma_1$ and $\gamma_2$. We conclude that either there is in $J$ an open crossing from $\gamma_1$ to $\gamma_2$, or there is a closed crossing from $\hat\gamma_1$ to $\hat\gamma_2$, and these cases are mutually exclusive. Since the probability for an open crossing is the same as the probability for a closed crossing, we get,
\[ P\bigl(C_\lambda(J, \gamma_1, \gamma_2)\bigr) + P\bigl(C_\lambda(J, \hat\gamma_1, \hat\gamma_2)\bigr) = 1. \tag{10.1} \]
Proposition 10.3. Conjecture 10.1 implies Conjecture 10.2. The proof uses Theorem 2.1, and monotonicity and continuity properties of crossing probabilities. If one assumes that Conjecture 10.1 is valid also for intersections of crossing events, then the proof below can be used to show that 10.2 is valid for intersections of crossing events, as discussed in [12]. Proof. Let the situation be as in Conjecture 10.2. Since for any such J there is a continuous injective mapping taking J to the unit disk, which is conformal in J, we assume, without loss of generality, that f (J) is the closed unit disk U .
We start with a one-sided estimate. Let $\alpha_1$ be a closed arc on the unit circle which is contained in the relative interior of the arc $f(\gamma_1)$, and let $\alpha_2$ be a closed arc on the unit circle which is contained in the relative interior of $f(\gamma_2)$. We shall show that
\[ \liminf_{\lambda\to\infty}\Bigl[P\bigl(C_\lambda(J, \gamma_1, \gamma_2)\bigr) - P\bigl(C_\lambda(U, \alpha_1, \alpha_2)\bigr)\Bigr] \ge 0. \tag{10.2} \]
Let β be an analytic simple closed curve which approximates ∂J, and has the pattern of intersection with ∂J as indicated in Figure 3, and let J 0 be the closed disk bounded by β. Let S1 be a smooth open topological disk in J 0 − J, such that ∂S1 ∩ β is an arc approximating γ1 , and let S2 be a smooth open topological disk in J 0 − J, such that ∂S2 ∩ β is an arc approximating γ2 . Let g be the Riemann map from J 0 to the unit disk, and assume that g is normalized so that g(f −1 (0)) = 0 and the derivative of g ◦ f −1 at 0 is real. Because J 0 is an approximation of J, g −1 : U → J 0 is an approximation of f −1 : U → J. We assume that β has been chosen sufficiently close to ∂J so that the arc g(∂S1 ∩ β) contains α1 in its interior, and the arc g(∂S2 ∩ β) contains α2 in its interior. Hence for some r > 1, the images of α1 and α2 under the map z → r−1 z are contained in S1 and S2 , respectively. Fix such an r, and let G(z) = rg(z).
Fig. 3. The approximation $\beta$ of $\partial J$
Since $\beta$ is analytic, $g$ extends to a conformal homeomorphism from a neighborhood $W$ of $J$ to a neighborhood of the closed unit disk. Let $M \supset J$ be a bounded open set whose closure is contained in $W$. By Theorem 2.1, when $\lambda$ is large, the probability of $C\bigl(M, J', S_1, S_2, |dz|\bigr)$ is approximately the same as the probability of $C\bigl(M, J', S_1, S_2, |G'(z)dz|\bigr)$. By Conjecture 10, we may also change the measure from ordinary volume measure to the measure induced by the map $G$. But $M$ with metric $|G'(z)dz|$ and measure induced by $G$ is isomorphic to $G(M)$ with ordinary Euclidean metric and measure. Thus, as $\lambda \to \infty$, the probability of $C\bigl(M, J', S_1, S_2, |dz|\bigr)$ tends to the probability of $C\bigl(G(M), G(J'), G(S_1), G(S_2), |dz|\bigr)$. When $\lambda$ is large, we may assume, with high probability, that all tiles near $M$ and near $U$ are very small. For such configurations, a crossing from $\alpha_1$ to $\alpha_2$ in $U$ implies $\omega \in C\bigl(G(M), G(J'), G(S_1), G(S_2), |dz|\bigr)$, and $\omega \in C\bigl(M, J', S_1, S_2, |dz|\bigr)$ implies a crossing from $\gamma_1$ to $\gamma_2$ in $J$. This proves (10.2).
On the other hand, if $\alpha_1'$ and $\alpha_2'$ are arcs on $\partial U$ which contain $f(\gamma_1)$ and $f(\gamma_2)$ in their interiors, respectively, then
\[ \limsup_{\lambda\to\infty}\Bigl[P\bigl(C_\lambda(J, \gamma_1, \gamma_2)\bigr) - P\bigl(C_\lambda(U, \alpha_1', \alpha_2')\bigr)\Bigr] \le 0. \tag{10.3} \]
This can be proved in the same way as (10.2), or deduced from (10.2), using duality. Conjecture 10.2 will follow from (10.2) and (10.3), once we prove that $P\bigl(C_\lambda(U, \alpha_1, \alpha_2)\bigr)$ is continuous in $\alpha_1$ and $\alpha_2$, with a modulus of continuity that is independent of $\lambda$. Therefore, the next lemma completes the proof.

10.4 Continuity Lemma. Let $\alpha_1$ and $\alpha_2'$ be two disjoint arcs on $\partial U$, and let $\alpha_2 \subset \alpha_2'$ be an arc which has an endpoint $a$ in common with $\alpha_2'$. Let $b$ be the other endpoint of $\alpha_2$, let $c$ be the other endpoint of $\alpha_2'$ and let $d$ be the endpoint of $\alpha_1$ that is separated in $\partial U$ from $a$ by the relative interior of $\alpha_1 \cup \alpha_2$. Set
\[ \rho = \frac{(a - c)(b - d)}{(a - d)(b - c)}, \]
the cross ratio of $a, b, c, d$. Assuming Conjecture 10.1, for all $\lambda$ sufficiently large,
\[ P\bigl(C_\lambda(U, \alpha_1, \alpha_2')\bigr) - P\bigl(C_\lambda(U, \alpha_1, \alpha_2)\bigr) \le \frac{O(1)}{\sqrt{\rho}}. \]
Proof. Let $\beta$ be the component of $\partial U - \alpha_1 \cup \alpha_2'$ that has $a$ as an endpoint, and let $h : U \to U$ be a conformal homeomorphism of the unit disk that takes $\alpha_2' - \alpha_2$ and $\alpha_1 \cup \beta$ into arcs of the same length, with centers on the real axis. Set $\gamma_1 = h(\alpha_1)$, $\delta = h(\beta)$, let $\gamma_2$ be an arc that is slightly shorter than $h(\alpha_2)$, is contained in $h(\alpha_2)$, and has $h(a)$ as one of its endpoints, and let $\gamma_2'$ be an arc that is slightly longer than $h(\alpha_2')$, contains $h(\alpha_2')$ and has $a$ as one of its endpoints. Note that there is a conformal automorphism $h_1$ of $U$, close to the identity, that takes $\gamma_1$ into an arc that contains $\gamma_1$ in its interior, and takes $\gamma_2'$ into an arc that contains $h(\alpha_2')$ in its interior. (Recall that conformal automorphisms of $U$ are determined by the images of three points on $\partial U$. One only needs to appropriately choose the images of the endpoints of $\gamma_1$ and the endpoint of $\gamma_2'$ distinct from $a$.) By (10.3) with $U$ replacing $J$, $h_1 \circ h$ replacing $f$, and arcs appropriately chosen, we have
\[ P\bigl(C_\lambda(U, \alpha_1, \alpha_2')\bigr) \le P\bigl(C_\lambda(U, \gamma_1, \gamma_2')\bigr) + o(1), \tag{10.4} \]
as $\lambda \to \infty$. Similarly, by (10.3),
\[ P\bigl(C_\lambda(U, \alpha_1, \alpha_2)\bigr) \ge P\bigl(C_\lambda(U, \gamma_1, \gamma_2)\bigr) + o(1). \tag{10.5} \]
Let $A \subset \hat\Omega$ be the event that there is a crossing in $U$ from $\gamma_1$ to $\gamma_2'$ in open tiles, but there is no such crossing from $\gamma_1$ to $\gamma_2$, and consider some nondegenerate configuration $\omega \in A$. There must be an open crossing from $\gamma_2' - \gamma_2$ to $\gamma_1$. Because $\gamma_2$ does not connect in open tiles to this crossing, by duality, there must be a crossing in closed tiles from $\gamma_2' - \gamma_2$ to $\gamma_1 \cup \delta$. Let $B$ be the event that there is an open crossing from $\gamma_2' - \gamma_2$ to $\gamma_1 \cup \delta$, and there is also a closed crossing between these arcs. Then,
\[ P\bigl(C_\lambda(U, \gamma_1, \gamma_2')\bigr) - P\bigl(C_\lambda(U, \gamma_1, \gamma_2)\bigr) = P_\lambda(A) \le P_\lambda(B). \tag{10.6} \]
Let $n$ be the largest integer such that the length of the arc $\gamma_1 \cup \delta = h(\alpha_1 \cup \beta)$ is less than $\pi/n$. Since the cross ratio is invariant under conformal automorphisms of $U$, it is easy to verify, using the definition of $\rho$, that
\[ \rho = O(n^2). \tag{10.7} \]
Recall that by the choice of $h$, the arc $h(\alpha_2' - \alpha_2)$ has the same length as $\gamma_1 \cup \delta$. We also assume, with no loss of generality, that the length of $\gamma_2' - \gamma_2$ is less than $\pi/n$. For any integer $k$, let $B_k$ be the rotation of $B$ by $k\pi/n$; that is, the set of all $\omega \in \hat\Omega$ such that the rotation of $\omega$ about 0 by $k\pi/n$ is in $B$. Observe that if $\omega \in B_j$ is nondegenerate and $k$ is not divisible by $n$, then $\omega \notin B_{j+k}$, because any crossing from $\gamma_2' - \gamma_2$ to $\gamma_1 \cup \delta$ separates the rotation by $k\pi/n$ of $\gamma_2' - \gamma_2$ and the rotation by $k\pi/n$ of $\gamma_1 \cup \delta$. The events $B_j$, $j = 0, \ldots, n - 1$ are $n$ events with the same probability, and the intersection of any two of them has zero probability. Therefore,
\[ P_\lambda(B) \le 1/n. \tag{10.8} \]
From (10.4), (10.5), (10.6), and (10.8), we get
\[ P\bigl(C_\lambda(U, \alpha_1, \alpha_2')\bigr) - P\bigl(C_\lambda(U, \alpha_1, \alpha_2)\bigr) \le 1/n + o(1). \tag{10.9} \]
Therefore, (10.7) completes the proof.
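The invariance of the cross ratio under conformal automorphisms of the unit disk, which is used to obtain (10.7), is easy to check numerically. The following is a small sketch with hypothetical function names: the automorphisms are written in the standard form $z \mapsto e^{i\theta}(z - w)/(1 - \bar{w}z)$ with $|w| < 1$, and the cross ratio follows the formula in the statement of the Continuity Lemma.

```python
import cmath


def cross_ratio(a: complex, b: complex, c: complex, d: complex) -> complex:
    """rho = (a - c)(b - d) / ((a - d)(b - c)), as in the Continuity Lemma."""
    return (a - c) * (b - d) / ((a - d) * (b - c))


def disk_automorphism(w: complex, theta: float):
    """Conformal automorphism z -> e^{i theta} (z - w) / (1 - conj(w) z), |w| < 1."""
    rotation = cmath.exp(1j * theta)

    def phi(z: complex) -> complex:
        return rotation * (z - w) / (1 - w.conjugate() * z)

    return phi


def on_circle(angle: float) -> complex:
    """Point of the unit circle with the given argument."""
    return cmath.exp(1j * angle)
```

For four distinct points `a, b, c, d` on the unit circle and any `phi = disk_automorphism(w, theta)`, the values `cross_ratio(a, b, c, d)` and `cross_ratio(phi(a), phi(b), phi(c), phi(d))` should agree up to rounding, and the images should again lie on the unit circle.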
Acknowledgement. The authors are pleased to express their thanks to Lennart Carleson for comments on an earlier version of this paper.
References
1. Aizenman, M.: Scaling limit for the incipient spanning clusters. Mathematics of Materials: Percolation and Composites, K.M. Golden, G.R. Grimmett, R.D. James, G.W. Milton and P.N. Sen, eds., The IMA Volumes in Mathematics and its Applications, Springer-Verlag, to appear
2. Belavin, A.A., Polyakov, A.M. and Zamolodchikov, A.B.: Infinite conformal symmetry in two-dimensional quantum field theory. Nucl. Phys. B 241, 333–380 (1984)
3. Ben-Or, M. and Linial, N.: Collective coin flipping. Randomness and Computation, S. Micali, ed., New York: Academic Press, 1990, pp. 91–115
4. Benjamini, I. and Schramm, O.: Percolation in the hyperbolic plane. Extended abstract, Proceedings of the Annual Conference of the Israel Mathematical Union, 1996
5. Van den Berg, J. and Kesten, H.: Inequalities with applications to percolation and reliability. J. Appl. Probab. 22, 556–569 (1985)
6. Cardy, J.L.: Critical percolation in finite geometries. J. Phys. A 25, L201–L206 (1992)
7. Eggleston, H.G.: Convexity. Cambridge, Great Britain: Cambridge University Press, 1958, pp. 141
8. Fortuin, C.M., Kasteleyn, P.W. and Ginibre, J.: Correlation inequalities on some partially ordered sets. Commun. Math. Phys. 22, 89–103 (1971)
9. Grimmett, G.R.: Percolation. New York: Springer-Verlag, 1989, pp. 296
10. Kesten, H.: Percolation theory for mathematicians. Boston: Birkhäuser, 1982, p. 423
11. Langlands, R.P., Pichet, C., Pouliot, P. and Saint-Aubin, Y.: On the universality of crossing probabilities in two-dimensional percolation. J. Stat. Phys. 67, 553–574 (1992)
12. Langlands, R.P., Pichet, C., Pouliot, P. and Saint-Aubin, Y.: Conformal invariance in two-dimensional percolation. Bull. Am. Math. Soc. (N.S.) 30, 1–61 (1994)
13. Møller, J.: Lectures on Random Voronoi Tessellations. Lecture Notes in Statistics, Vol 87, Berlin–Heidelberg–New York: Springer-Verlag, 1994, p. 134
14. Roy, R.: Percolation of Poisson sticks on the plane. Probab. Th. Rel. Fields 89, 503–517 (1991)
15. Vahidi-Asl, M.Q. and Wierman, J.C.: First-passage percolation on the Voronoi tessellation and Delaunay triangulation. Random graphs '87, Poznań, 1987, Chichester: Wiley, 1990, pp. 341–359
16. Zvavitch, A.: The critical probability for Voronoi percolation. MSc. thesis, Weizmann Institute of Science (1996)

Communicated by A. Jaffe
Commun. Math. Phys. 197, 109 – 129 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Misiurewicz Maps are Rare
Duncan Sands
Department of Mathematics, State University of New York at Stony Brook, Stony Brook, NY 11794-3651, USA
Received: 8 September 1997 / Accepted: 13 February 1998
Abstract: In any real-analytic family of S-unimodal maps with non-constant combinatorics, the set of parameters corresponding to maps with Misiurewicz (non-recurrent) dynamics has Lebesgue measure zero. In particular, this is the case for the logistic family, answering an old question.
1. Introduction

An S-unimodal map with a non-recurrent critical point and without periodic attractors is known as a Misiurewicz map, in honor of Michał Misiurewicz who studied such maps in his article "Absolutely Continuous Measures for Certain Maps of an Interval" [7]. In that paper, he asked whether in one-parameter families like the logistic family $x \mapsto ax(1 - x)$, $0 < a \le 4$, the set of Misiurewicz parameter values has Lebesgue measure zero or not. Our main result is that the set of Misiurewicz parameters has Lebesgue measure zero in any real-analytic family of S-unimodal maps with non-constant combinatorics, in particular in the logistic family:

Theorem A. Suppose $(f_a)_{a\in U}$ is a real-analytic family of S-unimodal maps where $U \subseteq \mathbb{R}$ is connected. Then either $\mathcal{M} \equiv \{a \in U : f_a \text{ is Misiurewicz}\}$ has Lebesgue measure zero or $\mathcal{M} = U$.
2. Definitions By [a, b] we mean the same interval as [b, a]: the smallest closed interval containing a and b; similarly for the open interval ]a, b[. The Lebesgue measure of a set A will be denoted by |A|.
A map $f : A \to \mathbb{R}$ defined on some non-open subset $A$ of $\mathbb{R}^n$ is said to be real-analytic (or $C^\omega$) if it is the restriction of a real-analytic map defined on an open subset of $\mathbb{R}^n$ containing $A$. Similarly for differentiable, etc. A fixed point of a map $f$ is attracting if its immediate basin of attraction has nonempty interior; a periodic point of period $n > 0$ is attracting if it is attracting as a fixed point of $f^n$, cf. [5, p. 155]. A periodic attractor is a periodic orbit some point of which is attracting (and therefore all points of which are attracting).

Definition 2.1. A map $f : [0, 1] \to [0, 1]$ is unimodal if $f$ is differentiable, $f(0) = f(1) = 0$ and there exists a unique $c = c(f) \in [0, 1]$ such that $Df(c) = 0$. The point $c$ is called the critical point of $f$.

It follows from the uniqueness of $c$ that $c \in\, ]0, 1[$, $Df > 0$ on $[0, c[$ and $Df < 0$ on $]c, 1]$. We define an involution $\tau_f : [0, 1] \to [0, 1]$ by requiring that $f(\tau_f(x)) = f(x)$, and that $\tau_f(x) \ne x$ if $x \in [0, 1] \setminus \{c\}$ (clearly $\tau_f(c) = c$).

Definition 2.2. An S-unimodal map is a $C^3$ unimodal map $f : [0, 1] \to [0, 1]$ such that
1. $f$ has negative Schwarzian derivative:
\[ \frac{D^3 f(x)}{Df(x)} - \frac{3}{2}\left(\frac{D^2 f(x)}{Df(x)}\right)^2 < 0 \]
for all $x \in [0, 1] \setminus \{c\}$.
2. The critical point of $f$ is non-flat, see [5, p. 267] for the definition.
3. Either $Df(0) > 1$, or 0 is an attracting fixed point for $f$.

Remark. The critical point of a real-analytic unimodal map is automatically non-flat.

Remark. The critical point $c$ is said to be non-degenerate if $D^2 f(c) \ne 0$. A non-degenerate critical point is non-flat.

Remark. The assumptions of negative Schwarzian derivative and non-flat critical point are not essential for all of our results, some of which go through for $C^2$ unimodal families (cf. [5, p. 257]). A number of useful properties of S-unimodal maps are described in the appendix.

Definition 2.3. Let $f$ be S-unimodal. We say that $f$ is $\varepsilon$-Misiurewicz if $f$ has no periodic attractors and
\[ |f^n(c) - c| \ge \varepsilon \quad \text{for all } n > 0. \tag{2.1} \]
We say that f is Misiurewicz if f is ε-Misiurewicz for some ε > 0. Misiurewicz maps have many properties reminiscent of uniformly expanding maps, see [5, p. 257]. Definition 2.4. Let U be a subset of R. A map f : U × [0, 1] → [0, 1] is said to be a family of S-unimodal maps, denoted by (fa )a∈U , if 1. the map fa : [0, 1] → [0, 1], x → f (a, x) is S-unimodal for every a ∈ U
(2.2)
Misiurewicz Maps are Rare
111
and 2. the map a 7→ c(fa ) is constant. Remark. Requiring a 7→ c(fa ) to be constant simplifies the proofs and involves no real loss of generality: if a 7→ c(fa ) is smooth (real-analytic), then a 7→ c(fa ) can be made constant by a smooth (real-analytic) change of coordinates preserving the S-unimodality of the maps fa . If D2 fa (c(fa )) 6= 0 for all a ∈ U then the map a 7→ c(fa ) is as smooth as the family (fa )a∈U . The family (fa )a∈U is said to be real-analytic (or C ω ) if the map f : U × [0, 1] → [0, 1] is real-analytic. Similarly for smooth, etc. We will only consider families that are at least C 2 . Definition 2.5. Let f : [0, 1] → [0, 1] be unimodal. The itinerary ıf (x) of x ∈ [0, 1] is the symbol sequence ıf (x) = θ0 θ1 · · · ,
(2.3)
where
\[
\theta_i = \begin{cases} 0 & \text{if } f^i(x) < c, \\ c & \text{if } f^i(x) = c, \\ 1 & \text{if } f^i(x) > c. \end{cases} \tag{2.4}
\]
The kneading invariant $\kappa(f)$ is the itinerary of the critical value $c_1 \equiv f(c)$: $\kappa(f) = \iota_f(c_1)$.
(2.5)
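As a concrete illustration of Definition 2.5, the following minimal sketch computes finite prefixes of itineraries and of the kneading invariant. It assumes the quadratic family $f_a(x) = ax(1-x)$ that appears in Fig. 1, with critical point $c = 1/2$; the function names are hypothetical, and the exact comparison with $c$ is only reliable when the orbit hits $c$ exactly in floating point.

```python
def logistic(a, x):
    """Quadratic map f_a(x) = a*x*(1-x); its critical point is c = 1/2."""
    return a * x * (1.0 - x)


def itinerary(a, x, length, c=0.5):
    """First `length` symbols (0, c or 1) of the itinerary of x under f_a."""
    symbols = []
    for _ in range(length):
        if x < c:
            symbols.append("0")
        elif x > c:
            symbols.append("1")
        else:
            symbols.append("c")
        x = logistic(a, x)
    return "".join(symbols)


def kneading_invariant(a, length, c=0.5):
    """Itinerary prefix of the critical value c_1 = f_a(c)."""
    return itinerary(a, logistic(a, c), length, c)


# For a = 4 the critical orbit is c -> 1 -> 0 -> 0 -> ...
print(kneading_invariant(4.0, 6))  # prints 100000
```

For $a = 2$ the critical point is fixed, so the same call returns a string of the symbol c only.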
We will sometimes write xi for f i (x). In particular, ci ≡ f i (c). Standing assumptions. Throughout, (fa )a∈U will be a C 2 family of S-unimodal maps, referred to as (fa ), with U connected. We shall suppose that U is open, since this involves no loss of generality in Theorem A (because of Lemma 4.1). The following notation will be used: M = {a ∈ U : fa is Misiurewicz},
(2.6)
Mε = {a ∈ U : fa is ε-Misiurewicz},
(2.7)
\[
M_{\varepsilon,n} = \{a \in U : |f_a^i(c) - c| \ge \varepsilon \text{ for all } 0 < i < n\}. \tag{2.8}
\]
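The finite-time condition (2.8) is easy to test numerically. The sketch below does so for the quadratic family $f_a(x) = ax(1-x)$ of Fig. 1 (an assumption of this example, as are the function names). It checks only the inequality on the critical orbit; it does not verify the absence of periodic attractors required in Definition 2.3, and rounding errors limit its reliability for large n.

```python
def critical_orbit(a, n, c=0.5):
    """Return [c_1(a), ..., c_n(a)] for f_a(x) = a*x*(1-x)."""
    orbit, x = [], c
    for _ in range(n):
        x = a * x * (1.0 - x)
        orbit.append(x)
    return orbit


def in_M_eps_n(a, eps, n, c=0.5):
    """True if |c_i(a) - c| >= eps for all 0 < i < n, cf. (2.8)."""
    return all(abs(x - c) >= eps for x in critical_orbit(a, n - 1, c))


print(in_M_eps_n(4.0, 0.4, 50))      # critical orbit 1, 0, 0, ... stays 1/2 away from c
print(in_M_eps_n(2.0, 0.1, 5))       # critical point is fixed, so the test fails
print(in_M_eps_n(3.67857, 0.1, 7))   # the parameter a_* of Fig. 1 with eps = 0.1
```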
We will denote by Mε,n (a) the connected component of Mε,n containing a, or the empty set if a is not in Mε,n . We show dependence on fa by writing e.g. τa rather than the more cumbersome τfa . The symbol a∗ (denoting some fixed parameter) is often dropped in proofs. It is retained in statements of results.
D. Sands
3. Continuation of Hyperbolic Sets

If $f_{a_*}$ is S-unimodal, without periodic attractors, and $\varepsilon > 0$, then
\[
X_\varepsilon = \{x \in [0,1] : f_{a_*}^n(x) \notin\, ]c-\varepsilon, c+\varepsilon[ \text{ for any } n \ge 0\} \tag{3.1}
\]
is compact, forward invariant and hyperbolic. Such a set may be continued to nearby parameters. If $f_{a_*}$ is ε-Misiurewicz then $c_1(a_*) \in X_\varepsilon$, so $c_1(a_*)$ can be continued as a member of $X_\varepsilon$; we denote its continuation to the parameter a by $\mu_{1,a_*}(a)$. The main result of this section is Proposition 3.2, which describes the properties of $\mu_{1,a_*}$. Another continuation of $c_1(a_*)$ is $a \mapsto c_1(a)$. In general $\mu_{1,a_*}(a)$ and $c_1(a)$ are distinct: the curve $\mu_{1,a_*}$ has unchanging combinatorics.

Lemma 3.1. Suppose f is S-unimodal and without periodic attractors. For every $\theta \in \{-1, 0, +1\}^{\mathbb{N}}$ there exists a $C^1$-neighborhood V of f such that there is at most one $x \in [0,1]$ with $\iota_g(x) = \theta$ for every S-unimodal $g \in V$.

Proof. Since f is S-unimodal and without periodic attractors, every periodic orbit of f is hyperbolic repelling, see Lemma A.1. If θ is not eventually periodic then let V be any $C^1$-neighborhood of f. If θ has eventual period p > 0 then take V small enough that no S-unimodal $g \in V$ has a periodic attractor of period ≤ 2p. If $g \in V$ is S-unimodal, $x \neq y \in [0,1]$ and $\iota_g(x) = \iota_g(y) = \theta$, then $\iota_g(z) = \theta$ for every $z \in I \equiv [x,y]$. Clearly $g^n|_I$ is monotone for all n ≥ 0, in other words I is a homterval for g. Since g has no wandering intervals, $g^n(I)$ must converge to a periodic attractor P of g as n → ∞, see [5, Lemma 5.2, p. 250], in particular θ must be eventually periodic. This already gives a contradiction if θ is not eventually periodic. If θ is eventually periodic of eventual period p, then P has period ≤ 2p (see [3, Lemma II.3.2]), contradicting the choice of V.

Proposition 3.2 (Continuation). Suppose $(f_a)$ is $C^r$, $r \in \{1, 2, \ldots, \infty, \omega\}$, and $a_* \in M$. Then there exist a neighborhood U of $a_*$, a $C^r$ function $\mu_{1,a_*} : U \to [0,1]$ and K > 0 such that the following hold:
1. $\mu_{1,a_*}(a_*) = c_1(a_*)$.
2. If $a \in U$ and $x \in [0,1]$, then $\iota_a(x) = \kappa(a_*)$ if and only if $x = \mu_{1,a_*}(a)$.
Define $\mu_{n,a_*} : U \to [0,1]$ for $a \in U$ and n > 1 by $\mu_{n,a_*}(a) = f_a^{n-1}(\mu_{1,a_*}(a))$. Then
3.
$|D\mu_{n,a_*}(a)| \le K$ for every $a \in U$ and n > 0.

Proof. We will prove this result using the implicit function theorem on Banach spaces [6]. The Banach spaces that we use are $E = \mathbb{R}$ and $F = \ell^\infty(\mathbb{N})$ with the usual norms. We start by defining an open subset A of E × F and a $C^r$ map $g : A \to F$. Because $f_{a_*}$ is Misiurewicz, there exists ε > 0 such that $f_{a_*}^i(c_1(a_*)) \le c_1(a_*) - \varepsilon$ for all i > 0. Define
\[
g_0 : (a, x) \mapsto \bigl(f_a|_{[0,c]}\bigr)^{-1}(x) \tag{3.2}
\]
and
\[
g_1 : (a, x) \mapsto \bigl(f_a|_{[c,1]}\bigr)^{-1}(x). \tag{3.3}
\]
These maps are well-defined and $C^r$ for, say, $x \in [0, c_1(a_*) - \varepsilon]$ and a in a neighbourhood of $a_*$. This uses the fact that $Df_a(x) \neq 0$ for $(a, x) \in U \times ([0,1] \setminus \{c\})$. By definition, $f : U \times [0,1] \to [0,1]$ extends to a $C^r$ map defined on an open neighbourhood of $U \times [0,1]$. There therefore exists δ > 0 such that $g_0$ and $g_1$ extend to $C^r$ maps
\[
g_0, g_1 : [a_* - \delta, a_* + \delta] \times [-\delta, c_1(a_*) - \varepsilon + \delta] \to \mathbb{R}. \tag{3.4}
\]
Define
\[
A = \,]a_* - \delta, a_* + \delta[\; \times \bigl\{ x = (x_0, x_1, \ldots) \in F : \{x_1, x_2, \ldots\} \subseteq\, ]-\delta, c_1(a_*) - \varepsilon + \delta[ \bigr\}. \tag{3.5}
\]
This is an open subset of E × F. Note that $x_0$ is unrestricted. Define $g : A \to F$ by
\[
g(a, (x_0, x_1, \ldots)) = (g_{\theta_0}(a, x_1), g_{\theta_1}(a, x_2), \ldots), \tag{3.6}
\]
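where $\theta_0 \theta_1 \cdots = \kappa(a_*)$ is the kneading invariant of $f_{a_*}$. Then g is $C^r$ with derivative at $(a, x) \in A$ given by
\[
Dg(a, x) : E \times F \to F, \qquad (\alpha, v) \mapsto v' = (v'_0, v'_1, \ldots), \tag{3.7}
\]
where
\[
v'_i = \frac{\frac{\partial f_a}{\partial a}\bigl(g_{\theta_i}(a, x_{i+1})\bigr)\,\alpha + v_{i+1}}{Df_a\bigl(g_{\theta_i}(a, x_{i+1})\bigr)}. \tag{3.8}
\]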
If we put $x_{a_*} = (c_1(a_*), c_2(a_*), \ldots)$, then $(a_*, x_{a_*}) \in A$ and $g(a_*, x_{a_*}) = x_{a_*}$. Let us show that $x_{a_*}$ is a hyperbolic fixed point of $g_{a_*}$, in other words that
\[
1_F - D_2 g(a_*, x_{a_*}) : F \to F \tag{3.9}
\]
is a linear homeomorphism onto F, where $1_F$ indicates the identity map on F. Indeed, by Lemma A.2 there exist K > 0 and λ > 1 such that
\[
\bigl| Df_{a_*}^i(c_j(a_*)) \bigr| \ge K\lambda^i \quad \text{for all } i \ge 0,\ j \ge 1. \tag{3.10}
\]
A calculation shows that
\[
\bigl(1_F - D_2 g(a_*, x_{a_*})\bigr)^{-1} : F \to F, \qquad v \mapsto v' = (v'_0, v'_1, \ldots), \tag{3.11}
\]
where
\[
v'_i = \sum_{k=0}^{\infty} \frac{v_{i+k}}{Df_{a_*}^k(c_{1+i}(a_*))}, \tag{3.12}
\]
so $\bigl(1_F - D_2 g(a_*, x_{a_*})\bigr)^{-1}$ is well-defined and bounded because of (3.10). This shows that $x_{a_*}$ is a hyperbolic fixed point of $g_{a_*}$. By the implicit function theorem [6, (10.2.1), p. 270] there is an open connected neighbourhood U of $a_*$ and a $C^r$ function $\mu_{a_*} : U \to F$ such that
\[
\mu_{a_*}(a_*) = x_{a_*}, \tag{3.13}
\]
\[
\mu_{a_*}(a) \in [-\delta, c_1(a_*) - \varepsilon + \delta], \tag{3.14}
\]
and
\[
g(a, \mu_{a_*}(a)) = \mu_{a_*}(a) \tag{3.15}
\]
for all a ∈ U . We define µn,a∗ : U → R to be the nth component of µa∗ (a): µa∗ (a) = (µ1,a∗ (a), µ2,a∗ (a), · · · ).
(3.16)
Let us prove that $\mu_{1,a_*}$ has the stated properties. Clearly $\mu_{1,a_*}$ is $C^r$ if $r \in \{1, 2, \ldots, \infty\}$; if r = ω then to see that $\mu_{1,a_*}$ is $C^\omega$ extend $g_0$ and $g_1$ to analytic functions of (a, x) for an appropriate domain of $\mathbb{C}^2$. The implicit function theorem shows that $\mu_{1,a_*}$ is a $C^1$ function of the complex variable a, that is to say, $\mu_{1,a_*}$ is a $C^\omega$ function of the real variable a.

We must show that the range of $\mu_{1,a_*}$ is [0, 1]. If $c_1(a_*) \in\, ]0,1[$ then
\[
\{c_1(a_*), c_2(a_*), \ldots\} \subseteq\, ]0,1[, \tag{3.17}
\]
so $\mu_{1,a_*} : U \to [0,1]$ if U is small enough, which we may assume. The other possibility is $c_1(a_*) = 1$, in which case $\mu_{1,a_*}(a) = 1$ for all $a \in U$ because $f_a : 1 \mapsto 0 \mapsto 0$ for all a.

Clearly $\mu_{1,a_*}(a_*) = c_1(a_*)$. Since $g(a, \mu_{a_*}(a)) = \mu_{a_*}(a)$, the definition of g implies both $\mu_{n,a_*}(a) = f_a^{n-1}(\mu_{1,a_*}(a))$ and $\iota_a(\mu_{1,a_*}(a)) = \kappa(a_*)$. By Lemma 3.1, shrinking U if necessary, $\mu_{1,a_*}(a)$ is the unique $x \in [0,1]$ such that $\iota_a(x) = \kappa(a_*)$.

Since $\mu_{a_*}$ is $C^1$ there exists K > 0 such that $\|D\mu_{a_*}(a)\|_\infty \le K$ for all $a \in U$ (it may be necessary to shrink U). This means that $|D\mu_{n,a_*}(a)| \le K$ for all $a \in U$ and n > 0.

Corollary 3.3. Suppose $a_* \in M$. Then for every ε > 0 there exists a neighborhood U of $a_*$ such that
\[
|\mu_{n,a_*}(a) - c_n(a_*)| = |\mu_{n,a_*}(a) - \mu_{n,a_*}(a_*)| < \varepsilon \tag{3.18}
\]
for all n > 0 and $a \in U$.

Proof. This follows from Proposition 3.2(3).
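For the quadratic family $f_a(x) = ax(1-x)$ the continuation can be made explicit by pulling a point back along the inverse branches $g_0$, $g_1$ of (3.2) and (3.3). The sketch below is only an illustration under stated assumptions: it takes $a_* \approx 3.67857$ (the parameter of Fig. 1) and assumes that the kneading sequence there is $1\,0\,1\,1\,1\cdots$, which is what one finds numerically, since the third critical iterate then agrees with the fixed point $1 - 1/a_*$ to the displayed precision. The helper names are hypothetical.

```python
import math


def inverse_branch(a, y, symbol):
    """Preimage of y under f_a(x) = a*x*(1-x): left branch for '0', right branch for '1'."""
    root = math.sqrt(1.0 - 4.0 * y / a)
    return 0.5 * (1.0 - root) if symbol == "0" else 0.5 * (1.0 + root)


def mu1(a):
    """Point whose itinerary under f_a is 1 0 1 1 1 ... (assumed kneading sequence), for a > 2."""
    x = 1.0 - 1.0 / a               # fixed point of f_a, itinerary 1 1 1 ...
    x = inverse_branch(a, x, "0")   # mu_2(a): left preimage of the fixed point
    x = inverse_branch(a, x, "1")   # mu_1(a): right preimage of mu_2(a)
    return x


a_star = 3.67857
print(mu1(a_star), a_star / 4.0)    # continuation and critical value agree at a_*
print(mu1(3.7), 3.7 / 4.0)          # and separate for a nearby parameter
```

In this example the two pull-backs can be done by hand, giving $\mu_2(a) = 1/a$ and $\mu_1(a) = \tfrac12\bigl(1 + \sqrt{1 - 4/a^2}\bigr)$, so the curve $\mu_1$ is visibly smooth in a, while $c_1(a) = a/4$ crosses it at $a_*$.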
4. Transversality

The main result of this section is Proposition 4.2, which states that if the family $(f_a)$ is real-analytic, then it either satisfies a transversality condition (“finite order contact with M”, defined below) or has constant combinatorics. The real-analyticity of $(f_a)$ plays an essential role.

The curves $a \mapsto c_1(a)$ and $a \mapsto \mu_{1,a_*}(a)$ intersect at $a = a_*$; if their intersection is transversal, i.e. $Dc_1(a_*) - D\mu_{1,a_*}(a_*) \neq 0$, then the family $(f_a)$ can be said to cross the combinatorial class of $f_{a_*}$ with non-zero speed. In general, transversal intersection is too much to hope for. Note that
\[
Dc_1(a_*) - D\mu_{1,a_*}(a_*) = -\lim_{n \to \infty} \frac{\left.\frac{\partial f_a^n(c_1(a_*))}{\partial a}\right|_{a=a_*}}{Df_{a_*}^n(c_1(a_*))} = -\sum_{i=1}^{\infty} \frac{\left.\frac{\partial f_a(c_i(a_*))}{\partial a}\right|_{a=a_*}}{Df_{a_*}^i(c_1(a_*))},
\]
cf. [1], as follows from the proof of Proposition 3.2.
Definition 4.1. We say that $(f_a)$ has contact of order m > 0 with M at $a_* \in M$ if the family $(f_a)$ is $C^m$,
\[
D^i c_1(a_*) = D^i \mu_{1,a_*}(a_*) \quad \text{for } 0 < i < m, \tag{4.1}
\]
and
\[
D^m c_1(a_*) \neq D^m \mu_{1,a_*}(a_*). \tag{4.2}
\]
If $(f_a)$ has contact of order m with M at $a_*$ for some m > 0, then we say that $(f_a)$ has finite order contact with M at $a_*$. The contact order m will be denoted by $m(a_*)$. If $(f_a)$ does not have finite order contact with M at $a_*$, then we say that $(f_a)$ has infinite order contact with M at $a_*$.

Lemma 4.1. Suppose $\theta \in \{-1, 0, +1\}^{\mathbb{N}}$ is not periodic. Then
\[
\{a \in U : \kappa(a) = \theta\} \tag{4.3}
\]
is closed in U.

Proof. This is well-known. If $a_* \in U$ is a limit point of $A = \{a \in U : \kappa(a) = \theta\}$, but $\kappa(a_*) \neq \theta$, then it is easy to see that the critical point of $f_{a_*}$ must be periodic. This implies that $f_a$ has a periodic attractor for all a in a neighborhood U of $a_*$. Since we are dealing with S-unimodal maps, this means that κ(a) is periodic for all $a \in U$, which shows that $U \cap A = \emptyset$, a contradiction.

Proposition 4.2 (Transversality). Suppose $(f_a)$ is real-analytic. If $(f_a)$ has infinite order contact with M at $a_* \in M$, then $f_a$ is Misiurewicz with $\kappa(a) = \kappa(a_*)$ for every $a \in U$. In particular, M = U.

Proof. Suppose $(f_a)$ has infinite order contact with M at $a_* \in M$. Let J be the connected component of $\{a \in U : \kappa(a) = \kappa(a_*)\}$ containing $a_*$. Note that $J \subseteq M$ by Lemma A.3. We will prove that J = U.

Since $\mu_{1,a_*}$ and $c_1$ are real-analytic on a neighborhood of $a_*$, and by hypothesis have their derivatives of all orders equal at $a_*$, they are equal, $\mu_{1,a_*} \equiv c_1$, on a neighborhood of $a_*$. In particular, $\kappa(a) = \iota_a(\mu_{1,a_*}(a)) = \kappa(a_*)$ on a neighborhood of $a_*$ by Proposition 3.2(2). This shows that J has non-empty interior.

Now suppose that $J \neq U$. Then there exists $a' \in U$ such that $a' \in \partial J$. By Lemma 4.1, $\kappa(a') = \kappa(a_*)$, and thus $a' \in J$, in particular $a' \in M$. We may therefore consider $\mu_{1,a'}$. Now $\kappa(a) = \kappa(a_*)$ for all $a \in J$, so by Proposition 3.2(2) we must have $c_1(a) = \mu_{1,a'}(a)$ for all $a \in J$ in a neighborhood of $a'$, in other words, since J has non-empty interior, for all a in a one-sided neighborhood of $a'$. Since $\mu_{1,a'}$ and $c_1$ are real-analytic, this implies that $\mu_{1,a'} = c_1$ in a (two-sided) neighborhood of $a'$. In the same way as above we deduce that $a' \in \operatorname{int} J$. This contradiction with $a' \in \partial J$ proves that in fact J = U = M.

5. Distortion

The main result of this section is Proposition 5.1 below. Suppose $(f_a)$ has finite order contact with M at $a_* \in M$. This means that $c_1(a) \sim \mu_{1,a_*}(a) + \rho_1 \cdot (a - a_*)^m$ near $a_*$, where $\rho_1 \neq 0$ is a constant. The point of the proposition is that
\[
c_n(a) \sim \mu_{n,a_*}(a) + \rho_n \cdot (a - a_*)^m \tag{5.1}
\]
in an appropriate sense for a in a precisely specified neighbourhood of $a_*$, where $\rho_n \neq 0$ is a constant depending on n. Corollary 5.3 shows that $|\rho_n| \to \infty$ as n → ∞.
Definition 5.1. Let $(f_a)$ have contact of order $m = m(a_*) > 0$ with M at $a_* \in M$. Consider $\mu_{n,a_*}$, which is defined for all n > 0 on some neighborhood U of $a_*$ by Proposition 3.2. We define $\rho_{n,a_*} : U \to \mathbb{R}$ by
\[
\rho_{n,a_*}(a) = \frac{Dc_n(a) - D\mu_{n,a_*}(a)}{m\,(a - a_*)^{m-1}}, \quad \text{if } a \neq a_*, \tag{5.2}
\]
and put
\[
\rho_{n,a_*}(a_*) = \lim_{a \to a_*} \rho_{n,a_*}(a). \tag{5.3}
\]
Clearly
\[
\rho_{1,a_*}(a_*) = \frac{D^m c_1(a_*) - D^m \mu_{1,a_*}(a_*)}{m!} \neq 0. \tag{5.4}
\]
It is easy to show, and in any case follows from the proof of Lemma 5.2, that the limit (5.3) exists for all n > 0.
Proposition 5.1 (Distortion). Suppose $(f_a)$ has finite order contact with M at $a_* \in M$. Then for every ε > 0 there exists a neighborhood U of $a_*$ and K > 0 such that
\[
\frac{1}{K} \le \left| \frac{\rho_{n,a_*}(a)}{\rho_{n,a_*}(a_*)} \right| \le K \tag{5.5}
\]
for every n > 0 and $a \in M_{\varepsilon,n}(a_*) \cap U$.

The proof of Proposition 5.1 occupies this section. It simplifies significantly when contact with M is transversal (order 1).

Lemma 5.2. Suppose $(f_a)$ has finite order contact with M at $a_* \in M$. Then
\[
\rho_{n,a_*}(a_*) = \rho_{1,a_*}(a_*)\, Df_{a_*}^{n-1}(c_1(a_*)) \tag{5.6}
\]
for all n > 0. In particular, $\rho_{n,a_*}(a_*) \neq 0$ for all n > 0.

Proof. It is enough to show that $\rho_{n+1}(a_*) = Df_{a_*}(c_n(a_*))\, \rho_n(a_*)$ for all n > 0, the result then following by induction. Let $m = m(a_*) > 0$ be the order of contact of $(f_a)$ with M at $a_*$. By definition
\[
\rho_{n+1}(a_*) = \lim_{a \to a_*} \frac{Dc_{n+1}(a) - D\mu_{n+1}(a)}{m\,(a - a_*)^{m-1}}, \tag{5.7}
\]
and according to L'Hôpital's rule
\[
\rho_{n+1}(a_*) = \lim_{a \to a_*} \frac{c_{n+1}(a) - \mu_{n+1}(a)}{(a - a_*)^m}, \tag{5.8}
\]
the existence of one of the limits (5.7), (5.8) implying the existence of the other (recall that $c_{n+1}$ and $\mu_{n+1}$ are $C^m$). The family $(f_a)$ is smooth, so
\[
c_{n+1}(a) - \mu_{n+1}(a) = f_a(c_n(a)) - f_a(\mu_n(a)) = \bigl(Df_a(\mu_n(a)) + o_a(c_n(a), \mu_n(a))\bigr) \cdot (c_n(a) - \mu_n(a)),
\]
where $o_a$ is a function such that $o_a(s, t) \to 0$ as s → t, uniformly for a near $a_*$. Thus
\begin{align}
\rho_{n+1}(a_*) &= \lim_{a \to a_*} \bigl(Df_a(\mu_n(a)) + o_a(c_n(a), \mu_n(a))\bigr) \cdot \frac{c_n(a) - \mu_n(a)}{(a - a_*)^m} \tag{5.9} \\
&= Df_{a_*}(c_n(a_*))\, \rho_n(a_*), \tag{5.10}
\end{align}
since $\mu_n(a) \to c_n(a_*)$ as $a \to a_*$.
Corollary 5.3. Suppose $(f_a)$ has finite order contact with M at $a_* \in M$. Then there exist K > 0 and λ > 1 such that $|\rho_{n,a_*}(a_*)| \ge K\lambda^n$ for all n > 0.
Proof. This follows from Lemmas 5.2 and A.2.
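A small numerical check of Lemma 5.2 and Corollary 5.3 can be made in the simplest case. The sketch assumes the quadratic family $f_a(x) = ax(1-x)$ at $a_* = 4$, where the critical orbit is $c \mapsto 1 \mapsto 0 \mapsto 0$; there $\mu_1 \equiv 1$ and $\mu_n \equiv 0$ for n ≥ 2 (compare the case $c_1(a_*) = 1$ in the proof of Proposition 3.2), the contact order is m = 1, and hence $\rho_n(a_*) = Dc_n(a_*)$ for n ≥ 2. It ignores the fact that a = 4 is an endpoint of the usual parameter range, and the function names are hypothetical.

```python
def critical_value_and_slope(a, n, c=0.5):
    """(c_n(a), Dc_n(a)) for f_a(x) = a*x*(1-x), obtained by differentiating c_{k+1} = f_a(c_k)."""
    x, dx = c, 0.0                      # c_0 = c does not depend on a
    for _ in range(n):
        # chain rule: Dc_{k+1} = (df/da)(c_k) + Df_a(c_k) * Dc_k
        x, dx = a * x * (1.0 - x), x * (1.0 - x) + a * (1.0 - 2.0 * x) * dx
    return x, dx


def multiplier(a, x, k):
    """Df_a^k(x), the derivative of the k-th iterate of f_a at x."""
    prod = 1.0
    for _ in range(k):
        prod *= a * (1.0 - 2.0 * x)
        x = a * x * (1.0 - x)
    return prod


a_star = 4.0
rho1 = critical_value_and_slope(a_star, 1)[1]        # Dc_1 - Dmu_1 = 1/4 - 0
for n in range(2, 8):
    rho_n = critical_value_and_slope(a_star, n)[1]   # Dmu_n = 0 for n >= 2
    print(n, rho_n, rho1 * multiplier(a_star, 1.0, n - 1))
```

Both columns coincide and grow in modulus like $4^{n-2}$, which is the exponential growth asserted by Corollary 5.3 in this example.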
Lemma 5.4. Suppose $(f_a)$ has contact of order $m = m(a_*) > 0$ with M at $a_* \in M$. Then there exists a neighborhood U of $a_*$ such that
\[
|a - a_*|^m \inf_{x \in [a, a_*]} |\rho_{n,a_*}(x)| \le |c_n(a) - \mu_{n,a_*}(a)| \le |a - a_*|^m \sup_{x \in [a, a_*]} |\rho_{n,a_*}(x)| \tag{5.11}
\]
for every $a \in U$ and n > 0.

Proof. The role of U is to ensure that $\mu_n$ and $\rho_n$ are defined, for example take U as in Proposition 3.2. To be strictly correct we should also take U convex. The proof amounts to integrating $Dc_n(x) - D\mu_n(x) = m\, \rho_n(x)\, (x - a_*)^{m-1}$ over $[a, a_*]$. We remind the reader that $c_n(a_*) = \mu_n(a_*)$. The proof of the upper bound goes like this:
\begin{align}
|c_n(a) - \mu_n(a)| &\le \int_{[a, a_*]} |Dc_n(x) - D\mu_n(x)|\, dx \tag{5.12} \\
&= m \int_{[a, a_*]} |\rho_n(x)| \cdot |x - a_*|^{m-1}\, dx \tag{5.13} \\
&\le |a - a_*|^m \sup_{x \in [a, a_*]} |\rho_n(x)|. \tag{5.14}
\end{align}
If $Dc_n(x) - D\mu_n(x)$ is of constant sign on $]a, a_*[$ then
\begin{align}
|c_n(a) - \mu_n(a)| &= \int_{[a, a_*]} |Dc_n(x) - D\mu_n(x)|\, dx \tag{5.15} \\
&= m \int_{[a, a_*]} |\rho_n(x)| \cdot |x - a_*|^{m-1}\, dx \tag{5.16} \\
&\ge |a - a_*|^m \inf_{x \in [a, a_*]} |\rho_n(x)|, \tag{5.17}
\end{align}
proving the lower bound in this case. If $Dc_n(x) - D\mu_n(x)$ changes sign on $]a, a_*[$ then $Dc_n(x_0) - D\mu_n(x_0) = 0$ for some $x_0 \in\, ]a, a_*[$, giving $\rho_n(x_0) = 0$. The lower bound then holds trivially.

Lemma 5.5. Suppose $(f_a)$ has finite order contact with M at $a_* \in M$. Then there exists a neighborhood U of $a_*$ and K > 0 such that
\[
\left| \frac{\rho_{n+1,a_*}(a)}{\rho_{n+1,a_*}(a_*)} - \frac{Df_a(c_n(a))}{Df_{a_*}(c_n(a_*))} \cdot \frac{\rho_{n,a_*}(a)}{\rho_{n,a_*}(a_*)} \right| \le K |a - a_*| \frac{\sup_{x \in [a, a_*]} |\rho_{n,a_*}(x)|}{|\rho_{n,a_*}(a_*)|} \tag{5.18}
\]
for every n > 0 and $a \in U$.

Proof. If $a = a_*$ then this is trivial. If $a \neq a_*$ then by definition
\[
\rho_{n+1}(a) = \frac{Dc_{n+1}(a) - D\mu_{n+1}(a)}{m\,(a - a_*)^{m-1}}. \tag{5.19}
\]
A computation shows that
\[
Dc_{n+1}(a) - D\mu_{n+1}(a) = A_n(a) + B_n(a)\, D\mu_n(a) + Df_a(c_n(a))\, \bigl(Dc_n(a) - D\mu_n(a)\bigr), \tag{5.20}
\]
where
\[
A_n(a) = \frac{\partial f_a}{\partial a}(c_n(a)) - \frac{\partial f_a}{\partial a}(\mu_n(a))
\]
and
\[
B_n(a) = Df_a(c_n(a)) - Df_a(\mu_n(a)).
\]
Thus
\[
\rho_{n+1}(a) = \frac{A_n(a) + B_n(a)\, D\mu_n(a)}{m\,(a - a_*)^{m-1}} + Df_a(c_n(a))\, \rho_n(a). \tag{5.21}
\]
Recall from Lemma 5.2 that $\rho_{n+1}(a_*) = Df_{a_*}(c_n(a_*))\, \rho_n(a_*)$. It follows that
\[
\frac{\rho_{n+1}(a)}{\rho_{n+1}(a_*)} = \frac{A_n(a) + B_n(a)\, D\mu_n(a)}{m\,(a - a_*)^{m-1}\, \rho_n(a_*)\, Df_{a_*}(c_n(a_*))} + \frac{Df_a(c_n(a))\, \rho_n(a)}{Df_{a_*}(c_n(a_*))\, \rho_n(a_*)}. \tag{5.22}
\]
Thus
\[
\left| \frac{\rho_{n+1}(a)}{\rho_{n+1}(a_*)} - \frac{Df_a(c_n(a))}{Df_{a_*}(c_n(a_*))} \cdot \frac{\rho_n(a)}{\rho_n(a_*)} \right| \le \frac{|A_n(a)| + |B_n(a)|\, |D\mu_n(a)|}{m\, |a - a_*|^{m-1}\, |\rho_n(a_*)|\, |Df_{a_*}(c_n(a_*))|}. \tag{5.23}
\]
Let us show that the right-hand side of (5.23) admits an upper bound as in (5.18). The term $|Df_{a_*}(c_n(a_*))|$ is bounded away from zero independently of n because $f_{a_*}$ is Misiurewicz; $|D\mu_n(a)|$ is bounded above independently of n and a on a neighborhood of $a_*$ by Proposition 3.2(3). The smoothness of the family $(f_a)$ implies that $|A_n(a)|, |B_n(a)| \le \mathrm{Const}\, |c_n(a) - \mu_n(a)|$ for all n > 0 and a in a neighborhood of $a_*$. In short, there exists a neighborhood U of $a_*$ and K > 0 such that
\[
\left| \frac{\rho_{n+1}(a)}{\rho_{n+1}(a_*)} - \frac{Df_a(c_n(a))}{Df_{a_*}(c_n(a_*))} \cdot \frac{\rho_n(a)}{\rho_n(a_*)} \right| \le K \frac{|c_n(a) - \mu_n(a)|}{|a - a_*|^{m-1}\, |\rho_n(a_*)|} \tag{5.24}
\]
for all n > 0 and $a \in U \setminus \{a_*\}$.

We have $|c_n(a) - \mu_n(a)| \le |a - a_*|^m \sup_{x \in [a, a_*]} |\rho_n(x)|$ by Lemma 5.4. Substituting into (5.24) yields
\[
\left| \frac{\rho_{n+1}(a)}{\rho_{n+1}(a_*)} - \frac{Df_a(c_n(a))}{Df_{a_*}(c_n(a_*))} \cdot \frac{\rho_n(a)}{\rho_n(a_*)} \right| \le K |a - a_*| \frac{\sup_{x \in [a, a_*]} |\rho_n(x)|}{|\rho_n(a_*)|}. \tag{5.25}
\]
The lemma is proved.
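The computation behind (5.20) is the chain rule applied to $c_{n+1}(a) = f_a(c_n(a))$ and $\mu_{n+1}(a) = f_a(\mu_n(a))$. Written out (this is our expansion of a step the proof leaves to the reader, using only the definitions of $A_n$ and $B_n$ from the proof):

```latex
\begin{align*}
Dc_{n+1}(a)   &= \frac{\partial f_a}{\partial a}(c_n(a)) + Df_a(c_n(a))\,Dc_n(a), \\
D\mu_{n+1}(a) &= \frac{\partial f_a}{\partial a}(\mu_n(a)) + Df_a(\mu_n(a))\,D\mu_n(a), \\
Dc_{n+1}(a) - D\mu_{n+1}(a)
  &= A_n(a) + Df_a(c_n(a))\,Dc_n(a) - Df_a(\mu_n(a))\,D\mu_n(a) \\
  &= A_n(a) + \bigl(Df_a(c_n(a)) - Df_a(\mu_n(a))\bigr)\,D\mu_n(a)
     + Df_a(c_n(a))\,\bigl(Dc_n(a) - D\mu_n(a)\bigr) \\
  &= A_n(a) + B_n(a)\,D\mu_n(a) + Df_a(c_n(a))\,\bigl(Dc_n(a) - D\mu_n(a)\bigr).
\end{align*}
```

The second-to-last line adds and subtracts $Df_a(c_n(a))\,D\mu_n(a)$; dividing by $m\,(a - a_*)^{m-1}$ then gives (5.21).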
Lemma 5.6. Suppose f is S-unimodal and without periodic attractors. Let U be a neighborhood of the critical point of f. Then there exists a neighborhood V of f in the $C^2$ topology and K > 0 such that
\[
\frac{1}{K} \le \left| \frac{Dg^n(x)}{Dg^n(y)} \right| \le K \tag{5.26}
\]
for every $x, y \in [0,1]$, n ≥ 0 and $g \in V$ such that $g : [0,1] \to [0,1]$, $g^n|_{[x,y]}$ is monotone and $g^i(x) \notin U$, $g^i(y) \notin U$ for all 0 ≤ i < n.
Proof. We will derive this standard result from [5, Corollary 3, p. 249], a form of Mañé's lemma which states that there is a neighborhood $V_0$ of f in the $C^1$ topology, $K_0 > 0$ and λ > 1 such that if $g \in V_0$, $g : [0,1] \to [0,1]$, $x \in [0,1]$, n ≥ 0 and $g^i(x) \notin U$ for all 0 ≤ i < n, then $|Dg^n(x)| \ge K_0 \lambda^n$. It will turn out that we may take $V = V_0 \cap V_1$ and $K = e^{K_1/(K_0(\lambda - 1))}$, where the neighborhood $V_1$ of f in the $C^2$ topology and $K_1 > 0$ are chosen so that
\[
\left| \frac{Dg(x)}{Dg(y)} - 1 \right| \le K_1 |x - y| \tag{5.27}
\]
for every $g \in V_1$ and $x, y \in [0,1] \setminus U$. Indeed, suppose $x, y \in [0,1]$, n ≥ 0 and $g \in V$ are as in the statement of the lemma. Putting $J_i = [g^i(x), g^i(y)]$, we obtain from Mañé's lemma that $K_0 \lambda^{n-i} |J_i| \le |g^{n-i}(J_i)| \le 1$ for all 0 ≤ i < n. Equation (5.27) yields
\[
\left| \frac{Dg(g^i(x))}{Dg(g^i(y))} \right| \le 1 + K_1 \bigl| g^i(x) - g^i(y) \bigr| = 1 + K_1 |J_i|. \tag{5.28}
\]
Thus
\begin{align}
\left| \frac{Dg^n(x)}{Dg^n(y)} \right| &\le \prod_{i=0}^{n-1} (1 + K_1 |J_i|) \le e^{K_1 \sum_{i=0}^{n-1} |J_i|} \tag{5.29} \\
&\le e^{K_1 \sum_{i=1}^{n} \lambda^{-i}/K_0} \le e^{K_1/(K_0(\lambda - 1))} \equiv K. \tag{5.30}
\end{align}
The lower bound follows by interchanging x and y.

Lemma 5.7. For every $a_* \in M$ there exists a neighborhood U of $a_*$ and K > 0 such that
\[
1 - K |a - a_*| \le \left| \frac{Df_a(\mu_{n,a_*}(a))}{Df_{a_*}(\mu_{n,a_*}(a_*))} \right| \le 1 + K |a - a_*|
\]
for all $a \in U$ and n > 0.

Proof. This is Proposition 3.2(3) combined with the smoothness of $(f_a)$. The denominator $|Df_{a_*}(\mu_n(a_*))| = |Df_{a_*}(c_n(a_*))|$ is bounded away from zero because $f_{a_*}$ is Misiurewicz. Thus it suffices to show that $|Df_a(\mu_n(a)) - Df_{a_*}(\mu_n(a_*))| \le K_1 |a - a_*|$.¹ The family $(f_a)$ is smooth, so $|Df_a(\mu_n(a)) - Df_{a_*}(\mu_n(a_*))| \le K_2 \max\{|a - a_*|, |\mu_n(a) - \mu_n(a_*)|\}$. Now apply Proposition 3.2(3), which states that $|\mu_n(a) - \mu_n(a_*)| \le K_3 |a - a_*|$.

¹ In the proof, $K_1, K_2, K_3 > 0$ are independent of n and a. The inequalities hold for all n > 0 and a in some neighbourhood of $a_*$ not depending on n.

Lemma 5.8. For every $a_* \in M$ and ε > 0 there exists a neighborhood U of $a_*$ and K > 0 such that
\[
\frac{(1 - K |a - a_*|)^{n-i}}{K} \le \left| \frac{Df_a^{n-i}(c_i(a))}{Df_{a_*}^{n-i}(c_i(a_*))} \right| \le K (1 + K |a - a_*|)^{n-i} \tag{5.31}
\]
for every n > 0, 0 < i ≤ n and $a \in U \cap M_{\varepsilon,n}(a_*)$.
Proof. This amounts to combining Lemmas 5.6 and 5.7. Lemma 5.7 gives a neighborhood $U_0$ of $a_*$ and $K_0 > 0$ such that
\[
(1 - K_0 |a - a_*|)^{n-i} \le \left| \frac{Df_a^{n-i}(\mu_i(a))}{Df_{a_*}^{n-i}(c_i(a_*))} \right| \le (1 + K_0 |a - a_*|)^{n-i} \tag{5.32}
\]
for all n > 0, 0 < i ≤ n and $a \in U_0$ (recall that $\mu_i(a_*) = c_i(a_*)$). Proving (5.31) therefore reduces to bounding the quotient
\[
\left| \frac{Df_a^{n-i}(\mu_i(a))}{Df_a^{n-i}(c_i(a))} \right| \tag{5.33}
\]
from above and below. We do this using Lemma 5.6 stated in the following way: there is a neighborhood $U_1$ of $a_*$ and $K_1 > 0$ such that
\[
\frac{1}{K_1} \le \left| \frac{Df_a^j(x)}{Df_a^j(y)} \right| \le K_1 \tag{5.34}
\]
for every $x, y \in [0,1]$, j ≥ 0 and $a \in U_1$ such that $f_a^j|_{[x,y]}$ is monotone and $|f_a^k(x) - c| \ge \varepsilon/2$, $|f_a^k(y) - c| \ge \varepsilon/2$ for all 0 ≤ k < j.

We apply the lemma as follows. Taking $U_1$ smaller if necessary, we may suppose, by Corollary 3.3, that $|\mu_k(a) - c| \ge \varepsilon/2$ for all k > 0 and $a \in U_1$. We take j = n − i, $x = c_i(a)$, $y = \mu_i(a)$ and $a \in U_1 \cap M_{\varepsilon,n}(a_*)$. Let us check the hypotheses for (5.34). If 0 ≤ k < j then $|f_a^k(x) - c| = |c_{i+k}(a) - c| \ge \varepsilon$ since $a \in M_{\varepsilon,n}(a_*)$. We have $|f_a^k(y) - c| = |\mu_{i+k}(a) - c| \ge \varepsilon/2$ by the choice of $U_1$. The set $M_{\varepsilon,n}(a_*)$ is connected and contains $a_*$; this implies that $f_a^n|_{[x,y]}$ is monotone. Lemma 5.6, that is to say Eq. (5.34), now yields
\[
\frac{1}{K_1} \le \left| \frac{Df_a^{n-i}(c_i(a))}{Df_a^{n-i}(\mu_i(a))} \right| \le K_1. \tag{5.35}
\]
Combining (5.32) and (5.35), we see that the result holds with $K = \max\{K_0, K_1\}$ and $U = U_0 \cap U_1$.

Lemma 5.9. Suppose $(f_a)$ has finite order contact with M at $a_* \in M$. For every ε > 0 there exists a neighborhood U of $a_*$ and K > 0 such that
\[
\frac{(1 - K|a - a_*|)^n}{K} - nK (1 + K|a - a_*|)^n |a - a_*|\, D_n(a) \le \left| \frac{\rho_n(a)}{\rho_n(a_*)} \right| \le K (1 + K|a - a_*|)^n \bigl(1 + n|a - a_*|\, D_n(a)\bigr) \tag{5.36}
\]
for every n > 0 and a ∈ U ∩ Mε,n (a∗ ). By Dn (a) we mean the quantity Dn (a) = sup{|ρi (x)|/|ρi (a∗ )| : 1 ≤ i < n and x ∈ [a, a∗ ]}.
(5.37)
Remark. These estimates have the property that the influence of the inductive term Dn (a) can be overcome by taking |a − a∗ | small enough. This is important for the proof of Proposition 5.1.
Proof. The proof consists of using Lemmas 5.8 and 5.5 repeatedly. We will prove the upper bound only, since the proof of the lower bound goes similarly. By Lemma 5.5, there exists a neighborhood $U_0$ of $a_*$ and $K_0 > 1$ such that
\[
\left| \frac{\rho_n(a)}{\rho_n(a_*)} - \frac{Df_a(c_{n-1}(a))}{Df_{a_*}(c_{n-1}(a_*))} \cdot \frac{\rho_{n-1}(a)}{\rho_{n-1}(a_*)} \right| \le K_0 |a - a_*| \frac{\sup_{x \in [a, a_*]} |\rho_{n-1}(x)|}{|\rho_{n-1}(a_*)|} \tag{5.38}
\]
for all n > 1 and $a \in U_0$. Of course this implies in particular that
\[
\left| \frac{\rho_n(a)}{\rho_n(a_*)} \right| \le \left| \frac{Df_a(c_{n-1}(a))}{Df_{a_*}(c_{n-1}(a_*))} \right| \left| \frac{\rho_{n-1}(a)}{\rho_{n-1}(a_*)} \right| + K_0 |a - a_*| \frac{\sup_{x \in [a, a_*]} |\rho_{n-1}(x)|}{|\rho_{n-1}(a_*)|}, \tag{5.39}
\]
which when used repeatedly leads to
\[
\left| \frac{\rho_n(a)}{\rho_n(a_*)} \right| \le \left| \frac{\rho_1(a)}{\rho_1(a_*)} \right| \left| \frac{Df_a^{n-1}(c_1(a))}{Df_{a_*}^{n-1}(c_1(a_*))} \right| + K_0 |a - a_*| \sum_{i=2}^{n} \frac{\sup_{x \in [a, a_*]} |\rho_{i-1}(x)|}{|\rho_{i-1}(a_*)|} \left| \frac{Df_a^{n-i}(c_i(a))}{Df_{a_*}^{n-i}(c_i(a_*))} \right|. \tag{5.40}
\]
This last holds for every n > 0 and $a \in U_0$. We interpret the empty sum that occurs when n = 1 as being zero.

We now use Lemma 5.8. This lemma states that there is a neighborhood $U_1$ of $a_*$ and $K_1 > 1$ such that
\[
\frac{(1 - K_1 |a - a_*|)^{n-i}}{K_1} \le \left| \frac{Df_a^{n-i}(c_i(a))}{Df_{a_*}^{n-i}(c_i(a_*))} \right| \le K_1 (1 + K_1 |a - a_*|)^{n-i} \tag{5.41}
\]
for every n > 0, 0 < i ≤ n and $a \in U_1 \cap M_{\varepsilon,n}(a_*)$. Substituting into (5.40) we obtain
\begin{align}
\left| \frac{\rho_n(a)}{\rho_n(a_*)} \right| &\le K_1 (1 + K_1 |a - a_*|)^{n-1} \left| \frac{\rho_1(a)}{\rho_1(a_*)} \right| + K_1 K_0 |a - a_*|\, D_n(a) \sum_{i=2}^{n} (1 + K_1 |a - a_*|)^{n-i} \tag{5.42} \\
&\le K_1 K_0 (1 + K_1 |a - a_*|)^n \left( \left| \frac{\rho_1(a)}{\rho_1(a_*)} \right| + n |a - a_*|\, D_n(a) \right) \tag{5.43}
\end{align}
for every n > 0 and $a \in U_0 \cap U_1 \cap M_{\varepsilon,n}(a_*)$. Let $U_2$ be a neighborhood of $a_*$ small enough that $1/2 \le |\rho_1(a)/\rho_1(a_*)| \le 2$ for all $a \in U_2$. Putting $K = 2K_1K_0$ and $U = U_0 \cap U_1 \cap U_2$, we see from (5.43) that
\[
\left| \frac{\rho_n(a)}{\rho_n(a_*)} \right| \le K (1 + K |a - a_*|)^n \bigl(1 + n |a - a_*|\, D_n(a)\bigr) \tag{5.44}
\]
for every n > 0 and $a \in U \cap M_{\varepsilon,n}(a_*)$. This proves the upper bound. As mentioned above, the proof of the lower bound is similar.
Proposition 5.1 (Distortion). Suppose $(f_a)$ has finite order contact with M at $a_* \in M$. Then for every ε > 0 there exists a neighborhood U of $a_*$ and K > 0 such that
\[
\frac{1}{K} \le \left| \frac{\rho_{n,a_*}(a)}{\rho_{n,a_*}(a_*)} \right| \le K \tag{5.45}
\]
for every n > 0 and $a \in M_{\varepsilon,n}(a_*) \cap U$.

Proof. The proof is by induction. First we specify some constants. Let $m = m(a_*) > 0$ denote the contact order of $(f_a)$ with M at $a_*$. Take $K_0 > 0$, $\lambda_0 > 1$ such that $|\rho_n(a_*)/\rho_1(a_*)| = |Df_{a_*}^{n-1}(c_1(a_*))| \ge K_0 \lambda_0^{n-1}$ for all n > 0. Lemma 5.9 gives a neighborhood $U_1$ of $a_*$ and $K_1 > 1$ such that
\[
\frac{(1 - K_1|a - a_*|)^n}{K_1} - nK_1 (1 + K_1|a - a_*|)^n |a - a_*|\, D_n(a) \le \left| \frac{\rho_n(a)}{\rho_n(a_*)} \right| \le K_1 (1 + K_1|a - a_*|)^n \bigl(1 + n|a - a_*|\, D_n(a)\bigr) \tag{5.46}
\]
for every n > 0 and $a \in U_1 \cap M_{\varepsilon,n}(a_*)$. Take N > 0 large enough that the following hold for all n ≥ N:
\begin{align}
\left( \frac{4K_1 \lambda_0^2}{|\rho_1(a_*)| K_0}\, \lambda_0^{-n} \right)^{1/m} &\le \lambda_0^{-n/(2m)}, \tag{5.47} \\
\bigl(1 + K_1 \lambda_0^{-n/(2m)}\bigr)^n &\le 2, \tag{5.48} \\
4K_1 n \lambda_0^{-n/(2m)} &\le 1, \tag{5.49} \\
\bigl(1 - K_1 \lambda_0^{-n/(2m)}\bigr)^n &\ge \frac{1}{2}, \tag{5.50} \\
4nK_1^2 \lambda_0^{-n/(2m)} &\le \frac{1}{8K_1}, \tag{5.51}
\end{align}
and let $U_2$ be a neighborhood of $a_*$ small enough that $1/(4K_1) \le |\rho_n(a)|/|\rho_n(a_*)| \le 4K_1$ for all $a \in U_2$ and 0 < n ≤ N. Put $K = 4K_1$ and $U = U_1 \cap U_2$. We shall prove by induction that the statement

H(n): $\dfrac{1}{K} \le \left| \dfrac{\rho_n(a)}{\rho_n(a_*)} \right| \le K$ for all $a \in U \cap M_{\varepsilon,n}(a_*)$

holds for every n > 0. If 0 < n ≤ N then H(n) holds by the choice of $U_2$, so suppose that n > N and that H(m) holds for all 0 < m < n. Let us prove H(n). Take $a \in U \cap M_{\varepsilon,n}(a_*)$ and note that $D_n(a) \le 4K_1$ by the inductive hypothesis. From (5.46) we obtain
\[
\left| \frac{\rho_n(a)}{\rho_n(a_*)} \right| \le K_1 (1 + K_1 |a - a_*|)^n (1 + 4K_1 n |a - a_*|). \tag{5.52}
\]
Since $|c_{n-1}(a) - \mu_{n-1}(a)| \le 1$ and $\inf_{x \in [a, a_*]} |\rho_{n-1}(x)| \ge |\rho_{n-1}(a_*)|/(4K_1)$, we deduce that
\[
|a - a_*| \le \left( \frac{4K_1 |c_{n-1}(a) - \mu_{n-1}(a)|}{|\rho_{n-1}(a_*)|} \right)^{1/m} \le \left( \frac{4K_1 \lambda_0^2}{|\rho_1(a_*)| K_0}\, \lambda_0^{-n} \right)^{1/m} \le \lambda_0^{-n/(2m)} \tag{5.53}
\]
from Lemma 5.4 and the choice of N (recall that n > N). Substituting into (5.52) we thus obtain
\[
\left| \frac{\rho_n(a)}{\rho_n(a_*)} \right| \le K_1 \bigl(1 + K_1 \lambda_0^{-n/(2m)}\bigr)^n \bigl(1 + 4K_1 n \lambda_0^{-n/(2m)}\bigr) \le 4K_1, \tag{5.54}
\]
again by the choice of N. Similarly,
\begin{align}
\left| \frac{\rho_n(a)}{\rho_n(a_*)} \right| &\ge \frac{(1 - K_1 |a - a_*|)^n}{K_1} - 4nK_1^2 (1 + K_1 |a - a_*|)^n |a - a_*| \notag \\
&\ge \frac{\bigl(1 - K_1 \lambda_0^{-n/(2m)}\bigr)^n}{K_1} - 4nK_1^2 \bigl(1 + K_1 \lambda_0^{-n/(2m)}\bigr)^n \lambda_0^{-n/(2m)} \tag{5.55} \\
&\ge \frac{1}{4K_1} \tag{5.56}
\end{align}
by the choice of N. This proves H(n), and therefore, by induction, the proposition.

Corollary 5.10. Suppose $(f_a)$ has finite order contact with M at $a_* \in M$. For every ε > 0, $|M_{\varepsilon,n}(a_*)| \to 0$ as n → ∞.

Proof. Take U small enough that Proposition 5.1 and Lemma 5.4 hold. Since $|\rho_n(a_*)| \to \infty$ as n → ∞, $\inf_{x \in U \cap M_{\varepsilon,n}} |\rho_n(x)| \to \infty$ as n → ∞ by Proposition 5.1. Thus $|U \cap M_{\varepsilon,n}(a_*)| \to 0$ as n → ∞. This implies the result.

6. Lebesgue Density

Lebesgue almost every point of a measurable subset $A \subseteq \mathbb{R}^n$ is a density point of A, see [8, p. 141]. Therefore if A has no Lebesgue density points, then A has Lebesgue measure zero. The main result of this section is:

Proposition 6.1 (Density). Suppose $(f_a)$ has finite order contact with M at $a_* \in M$. Then $a_*$ is not a Lebesgue density point of $M_\varepsilon$ for any ε > 0.

The proof occupies this section. A similar argument has been used in the tent-map family [4, 2]. Most of the reasoning turns on the fact that
\[
c_n(a) \sim c_n(a_*) + \rho_n \cdot (a - a_*)^m \tag{6.1}
\]
near $a_*$, where m > 0 is the order of contact of $a_* \in M$ with M, and $|\rho_n| \to \infty$ as n → ∞, as shown in Sect. 5.

Lemma 6.2. Suppose $(f_a)$ has finite order contact with M at $a_* \in M$. Then for every ε, δ, L > 0 there exists a neighborhood U of $a_*$ such that $|Dc_n(a)| \ge L$ for every n > 0 and $a \in U \cap M_{\varepsilon,n}(a_*)$ such that $|c_n(a) - c_n(a_*)| \ge \delta$.
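[Figure 1: plot of $c_7$ against the parameter a (horizontal axis from 3.5 to 3.8, vertical axis from 0 to 1), with the interval $M_{\varepsilon,7}(a_*)$, the level c = 0.5, the band ε = 0.1 on either side of it, and the parameters $a_*$, $b_i$, $a_i$ marked.]

Fig. 1. Graph of $a \mapsto c_7(a)$ for the family $f_a : x \mapsto ax(1 - x)$ showing the points $a_i$ and $b_i$ of Proposition 6.1. Here $c_3(a_*)$ is a repelling point of period 3, $a_* = 3.67857$, $a_i = 3.71133$, $b_i = 3.69229$ and $M_{\varepsilon,7}(a_*) = [3.65231, 3.71560]$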
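The parameter value quoted in the caption can be reproduced numerically. The sketch below is an illustration only: it assumes that $a_*$ is the solution in [3.6, 3.7] of $c_3(a) = 1 - 1/a$, that is, the parameter at which the third critical iterate lands on the fixed point of $f_a$. That characterisation is an assumption of this example (it is checked here only numerically), and the bisection helper and function names are hypothetical.

```python
def c_n(a, n, c=0.5):
    """n-th iterate c_n(a) of the critical point under f_a(x) = a*x*(1-x)."""
    x = c
    for _ in range(n):
        x = a * x * (1.0 - x)
    return x


def h(a):
    """Signed distance between c_3(a) and the fixed point 1 - 1/a of f_a."""
    return c_n(a, 3) - (1.0 - 1.0 / a)


def bisect(func, lo, hi, tol=1e-12):
    """Plain bisection; assumes func changes sign exactly once on [lo, hi]."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


a_star = bisect(h, 3.6, 3.7)
print(a_star)                 # compare with a_* = 3.67857 in the caption
print(abs(2.0 - a_star))      # |Df_a| at the fixed point 1 - 1/a; > 1 means repelling
print(c_n(a_star, 7))         # the value c_7(a_*) on the curve of Fig. 1
```

The same function `c_n` evaluated on a grid of parameters in [3.5, 3.8] gives the curve $a \mapsto c_7(a)$ shown in the figure.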
Proof. If U is small enough then $|\mu_n(a) - \mu_n(a_*)| \le \delta/2$ for all $a \in U$ and n > 0 by Corollary 3.3. The hypothesis $|c_n(a) - c_n(a_*)| \ge \delta$ then implies that $|c_n(a) - \mu_n(a)| \ge |c_n(a) - c_n(a_*)| - |\mu_n(a) - \mu_n(a_*)| \ge \delta/2$. It is this that we actually use. Shrinking U if necessary, take K > 0 such that Proposition 3.2(3), Lemma 5.4, and Proposition 5.1 hold for this U and K. Let $m = m(a_*)$ be the order of contact of $(f_a)$ with M at $a_* \in M$. We have $|c_n(a) - \mu_n(a)| \le K |\rho_n(a_*)|\, |a - a_*|^m$ from Proposition 5.1 and Lemma 5.4. Since $|c_n(a) - \mu_n(a)| \ge \delta/2$, see above, it follows that
\[
|a - a_*|^m \ge \frac{\delta}{2K |\rho_n(a_*)|}. \tag{6.2}
\]
By definition $|Dc_n(a) - D\mu_n(a)| = m\, |\rho_n(a)|\, |a - a_*|^m / |a - a_*|$. Substituting for $|a - a_*|^m$ using (6.2) yields
\[
|Dc_n(a) - D\mu_n(a)| \ge \frac{\delta m}{2K^2 |a - a_*|}, \tag{6.3}
\]
where we have used Proposition 5.1. Now $|D\mu_n(a)| \le K$ implies $|Dc_n(a)| \ge m\delta/(2K^2 |a - a_*|) - K$. Thus we can ensure that $|Dc_n(a)|$ is as large as we like, in particular larger than L, by taking U small enough, since $a, a_* \in U$.
Lemma 6.3. Suppose $(f_a)$ has finite order contact with M at $a_* \in M$. Then there exist ε > 0 and positive integers $n_i \to \infty$ such that $[c - \varepsilon, c + \varepsilon] \subseteq c_{n_i}(M_{\varepsilon,n_i}(a_*))$ for all i.

Proof. The idea is to choose neighborhoods $V_a$ of c that have nice boundary points varying smoothly in a. The term “nice” is defined below. It will follow from Lemma 6.2 that the graph of $c_n$ hits the graph of $\partial V_a$ transversally and entirely crosses $V_a$ for infinitely many n. We take ε > 0 small enough that $[c - \varepsilon, c + \varepsilon]$ is always inside $V_a$. Let us now construct the neighborhoods $V_a$.

Let V be a convex symmetric open neighborhood of c small enough that $c_i(a_*) \notin V$ for all i > 0. By V being symmetric we mean $\tau_{a_*}(V) = V$. Let $p \in V$ be a periodic point of $f_{a_*}$. Such periodic points always exist. We may suppose that p is the point in its $f_{a_*}$-orbit closest to c, in other words that
\[
f_{a_*}^i(p) \notin\, ]p, \tau_{a_*}(p)[ \quad \text{for all } i > 0. \tag{6.4}
\]
A point $x \in [0,1]$ such that $f_{a_*}^i(x) \notin\, ]x, \tau_{a_*}(x)[$ for all i > 0 is said to be nice, or $f_{a_*}$-nice if the map is not clear. A subset of [0, 1] is said to be nice if its boundary points are nice. In short, our periodic point p is nice, as is the symmetric neighborhood $]p, \tau_{a_*}(p)[$ of c. Now, p is hyperbolic repelling by Lemma A.1 because $f_{a_*}$ has no periodic attractors. Continue p using the implicit function theorem to a periodic point $p_a$ of $f_a$ for a near $a_*$ and put $V_a =\, ]p_a, \tau_a(p_a)[$. Each $V_a$ is a nice, convex, symmetric neighborhood of c compactly contained in V, at least for a in a sufficiently small neighborhood U of $a_*$. We will take U to be compact in what follows.

We now specify some constants. Shrinking U if necessary, take ε > 0 and K > 0 such that $[c - \varepsilon, c + \varepsilon] \subseteq V_a$, $|\partial p(a)/\partial a| < K$ and $|\partial \tau_a(p_a)/\partial a| < K$ for all $a \in U$.

Let $\widehat{M}_n$ be the connected component of $\{a \in U : c_i(a) \notin V_a \text{ for all } 0 < i < n\}$ containing $a_*$. Clearly $a_* \in \operatorname{int} \widehat{M}_n$ and $\widehat{M}_n \subseteq M_{\varepsilon,n}(a_*)$ for all n > 0.

By Lemma 6.2, taking U smaller if necessary, we may suppose that $|Dc_n(a)| \ge 2K$ for every $a \in \widehat{M}_n$ and n > 0 such that $c_n(a) \in V_a$ (the δ > 0 required by Lemma 6.2 comes from having $c_n(a_*) \notin V$ and the $V_a$ being compactly contained in V). This has the following consequence: if $x \in U$ and $c_n(x) \in \partial V_x$, then the graph of $c_n(a)$ intersects that of $\partial V_a$ transversally at a = x. Indeed, $|Dc_n(x)| \ge 2K$ while $|\partial p(a)/\partial a| < K$ and $|\partial \tau_a(p_a)/\partial a| < K$.

Since $|\widehat{M}_n| \to 0$ as n → ∞ by Corollary 5.10, there exists N > 0 such that $\widehat{M}_n \subseteq \operatorname{int} U$ for all n ≥ N; for n ≥ N, $\widehat{M}_n$ is thus an open set. It also follows from $|\widehat{M}_n| \to 0$ that there exist integers $n_i \to \infty$ larger than N such that $\widehat{M}_{1+n_i} \neq \widehat{M}_{n_i}$ for each i. We claim that $c_{n_i}(\widehat{M}_{n_i}) \supseteq [c - \varepsilon, c + \varepsilon]$ for each i, which is what we shall now show.

Recall that $\widehat{M}_{1+n_i} \subseteq \widehat{M}_{n_i}$. Take $b \in \widehat{M}_{n_i}$ such that $b \in \partial \widehat{M}_{1+n_i}$. Clearly $c_{n_i}(b) \in \partial V_b$, so as remarked above, the graph of $c_{n_i}(a)$ intersects that of $\partial V_a$ transversally at a = b.

Take $b' \in \partial \widehat{M}_{n_i}$ such that $b \in\, ]b', a_*[$. Because $\widehat{M}_{n_i} \subseteq U$, we must have $c_j(b') \in \partial V_{b'}$ for some $0 < j < n_i$. The niceness of $\partial V_{b'}$ then implies that $c_{n_i}(b') \notin V_{b'}$. We thus have $c_{n_i}(a) \notin V_a$ for $a \in\, ]b, a_*[$; $c_{n_i}(b) \in \partial V_b$, crossing transversally; and $c_{n_i}(b') \notin V_{b'}$. Moreover, $|Dc_{n_i}(a)| \ge 2K$ as long as $c_{n_i}(a) \in V_a$ and the boundary points of $V_a$ move with derivative < K. It follows that the graph of $c_{n_i}(a)$ actually crosses that of $V_a$, and thus $c_{n_i}(\widehat{M}_{n_i}) \supseteq c_{n_i}(]b', a_*[) \supseteq [c - \varepsilon, c + \varepsilon]$ as claimed. This implies that $[c - \varepsilon, c + \varepsilon] \subseteq c_{n_i}(M_{\varepsilon,n_i}(a_*))$.
Remark. The parameters mapping into $]c - \varepsilon, c + \varepsilon[$ are not ε-Misiurewicz.

Lemma 6.4. Suppose $(f_a)$ has contact of order $m = m(a_*) > 0$ with M at $a_* \in M$. Then for every ε > 0 there exists K > 0 such that for every δ > 0 there exists a neighborhood U of $a_*$ such that
\[
-\delta + \frac{|\rho_{n,a_*}(a_*)|}{K} \cdot \bigl|\, |a - a_*|^m - |a' - a_*|^m \bigr| \le |c_n(a) - c_n(a')| \le \delta + K |\rho_{n,a_*}(a_*)| \cdot \bigl|\, |a - a_*|^m - |a' - a_*|^m \bigr| \tag{6.5}
\]
for all n > 0 and $a, a' \in M_{\varepsilon,n}(a_*) \cap U$ for which $a_* \notin\, ]a, a'[$.

Proof. The proof is similar to that of Lemma 5.4. Let K > 0 and U be such that Proposition 5.1 holds for this ε. We shall suppose that U is convex. The upper bound is calculated like this:
\[
|c_n(a) - c_n(a')| - |\mu_n(a) - \mu_n(a')| \le \bigl| c_n(a) - \mu_n(a) - (c_n(a') - \mu_n(a')) \bigr| = \int_{[a, a']} |Dc_n(x) - D\mu_n(x)|\, dx, \tag{6.6}
\]
the equality being valid because, as follows from Proposition 5.1 and the definition of $\rho_n$, the quantity $Dc_n(x) - D\mu_n(x)$ does not change sign on $[a, a']$. Equality is needed for the proof of the lower bound. Continuing the calculation,
\begin{align}
(6.6) &= m \int_{[a, a']} |\rho_n(x)| \cdot |x - a_*|^{m-1}\, dx \tag{6.7} \\
&\le Km\, |\rho_n(a_*)| \int_{[a, a']} |x - a_*|^{m-1}\, dx \tag{6.8} \\
&= K |\rho_n(a_*)| \cdot \bigl|\, |a - a_*|^m - |a' - a_*|^m \bigr|, \tag{6.9}
\end{align}
using Proposition 5.1. Taking U smaller if necessary, we may assume (Corollary 3.3) that $|\mu_n(a) - \mu_n(a')| \le \delta$ for all $a, a' \in U$ and n > 0. This and the above computation yield the upper bound $|c_n(a) - c_n(a')| \le \delta + K |\rho_n(a_*)| \cdot \bigl|\, |a - a_*|^m - |a' - a_*|^m \bigr|$. The proof of the lower bound is similar.

Proposition 6.1 (Density). Suppose $(f_a)$ has finite order contact with M at $a_* \in M$. Then $a_*$ is not a Lebesgue density point of $M_\varepsilon$ for any ε > 0.

Proof. We will bound the lower density of $M_\varepsilon$ at $a_*$ away from 1. This works because the curve $c_n(a)$ has the shape given by (6.1). Denote by $m = m(a_*) > 0$ the order of contact of $(f_a)$ with M at $a_*$. By Lemma 6.3 there exists $\varepsilon_0 > 0$ and positive integers $n_i \to \infty$ such that $[c - \varepsilon_0, c + \varepsilon_0] \subseteq c_{n_i}(M_{\varepsilon_0,n_i}(a_*))$ for all i. Since taking ε smaller increases $M_\varepsilon$, we may suppose that $\varepsilon < \varepsilon_0$. It follows that $[c - \varepsilon, c + \varepsilon] \subseteq c_{n_i}(M_{\varepsilon,n_i}(a_*))$ for all i and also that ε ≤ 1. For the same reason we may assume that $a_* \in M_\varepsilon$ (just take ε smaller).

Put δ = ε and take K > 0 and U as given by Lemma 6.4 for this δ, ε. Corollary 5.10 says that $|M_{\varepsilon,n_i}(a_*)| \to 0$ as i → ∞, so $M_{\varepsilon,n_i}(a_*) \subseteq U$ for all i large enough. Take any such i. Since $[c - \varepsilon, c + \varepsilon] \subseteq c_{n_i}(M_{\varepsilon,n_i}(a_*))$, there exist points $a_i, b_i \in M_{\varepsilon,n_i}(a_*)$ such that
\[
c_{n_i}(]a_i, b_i[) = \,]c - \varepsilon, c + \varepsilon[. \tag{6.10}
\]
Clearly
\[
]a_i, b_i[ \,\subseteq U \setminus M_\varepsilon. \tag{6.11}
\]
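In particular, $a_* \notin \,]a_i, b_i[$. Without loss of generality $b_i \in [a_i, a_*]$. Applying Lemma 6.4 with $a = a_i$ and $a' = a_*$ gives
\[
|a_i - a_*|^m \le K\,\frac{|c_{n_i}(a_i) - c_{n_i}(a_*)| + \varepsilon}{|\rho_{n_i}(a_*)|} \le \frac{2K}{|\rho_{n_i}(a_*)|}. \tag{6.12}
\]
The second inequality comes from $\varepsilon \le 1$ and $|c_{n_i}(a_i) - c_{n_i}(a_*)| \le 1$, this last being true because $c_{n_i}(a_i), c_{n_i}(a_*) \in [0, 1]$. Using Lemma 6.4 again with $a = a_i$ and $a' = b_i$ gives
\[
\bigl|\,|a_i - a_*|^m - |b_i - a_*|^m\bigr| \ge \frac{1}{K}\,\frac{|c_{n_i}(a_i) - c_{n_i}(b_i)| - \varepsilon}{|\rho_{n_i}(a_*)|} = \frac{\varepsilon}{K\,|\rho_{n_i}(a_*)|}, \tag{6.13}
\]
since $|c_{n_i}(a_i) - c_{n_i}(b_i)| = 2\varepsilon$. Combining (6.12) and (6.13) yields
\[
\left|\frac{b_i - a_*}{a_i - a_*}\right|^m \le 1 - \frac{\varepsilon}{2K^2}, \tag{6.14}
\]
and therefore
\[
\left|\frac{b_i - a_i}{a_i - a_*}\right| \ge 1 - \left(1 - \frac{\varepsilon}{2K^2}\right)^{1/m}. \tag{6.15}
\]
Now $]a_i, b_i[ \,\subseteq U \setminus M_\varepsilon$, so this implies that
\[
\frac{|M_\varepsilon \cap [a_i, a_*]|}{|a_i - a_*|} \le \left(1 - \frac{\varepsilon}{2K^2}\right)^{1/m}. \tag{6.16}
\]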
Since ai → a∗ as i → ∞, Eq. 6.16 shows that the lower density of Mε at a∗ is bounded away from 1. In particular, a∗ is not a Lebesgue density point of Mε .
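To get a feeling for how far from 1 the bound in (6.16) sits, the right-hand side can be evaluated numerically. The following is only a sketch: the function name and the sample values of ε, K and m are hypothetical and are not taken from the paper, and the guard ε ≤ 2K² is an added assumption that keeps the base of the power non-negative.

```python
def density_upper_bound(eps: float, K: float, m: int) -> float:
    """Evaluate the right-hand side of (6.16): (1 - eps / (2 K^2)) ** (1 / m)."""
    if m < 1 or K <= 0 or not (0 < eps <= 2 * K * K):
        raise ValueError("need m >= 1, K > 0 and 0 < eps <= 2*K^2")
    return (1.0 - eps / (2.0 * K * K)) ** (1.0 / m)


# Hypothetical sample values (eps <= 1, as in the proof).
for eps, K, m in [(0.5, 1.0, 2), (0.1, 3.0, 4), (1.0, 10.0, 1)]:
    print(eps, K, m, density_upper_bound(eps, K, m))
```

In each case the value is strictly below 1, which is all the density argument uses; a larger K or a higher order of contact m pushes the bound closer to 1.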
7. Proof of Theorem A

Theorem A. Suppose $(f_a)_{a \in U}$ is a real-analytic family of S-unimodal maps where $U \subseteq \mathbb{R}$ is connected. Then either $M$ has Lebesgue measure zero or $M = U$.

Proof. If $M \ne U$ then, by Proposition 4.2, $(f_a)$ has finite order contact with $M$ at every $a_* \in M$. Proposition 6.1 shows that $M_\varepsilon$ contains no Lebesgue density points and thus has Lebesgue measure zero for every $\varepsilon > 0$. Therefore $M = \bigcup_{n>0} M_{1/n}$ has Lebesgue measure zero.
A. Miscellaneous Lemmas

Lemma A.1 (Singer). Suppose $f$ is S-unimodal and $x \in [0, 1]$ is periodic of period $n > 0$. Then $x$ is attracting if and only if
\[
|Df^n(x)| \le 1. \tag{A.1}
\]
Proof. See [10] or [5, Theorem 6.1, p. 155].
Lemma A.2 (Misiurewicz). Suppose $f$ is S-unimodal and without periodic attractors. Let $U$ be a neighborhood of the critical point of $f$. Then there exist $K > 0$ and $\lambda > 1$ such that if $x \in [0, 1]$, $n > 0$ and
\[
x, f(x), \ldots, f^{n-1}(x) \notin U, \tag{A.2}
\]
then
\[
|Df^n(x)| \ge K\lambda^n. \tag{A.3}
\]
Proof. See [7] or [5, Theorem 3.2, p. 231].
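The growth in (A.3) can be seen in a simple case. The sketch below uses the full logistic map $f(x) = 4x(1 - x)$ as an assumed example of an S-unimodal map without periodic attractors; the map and the function names are illustrative choices and do not come from the paper. It computes $|Df^n(x)|$ by the chain rule along an orbit that stays away from the critical point $1/2$.

```python
def f(x: float) -> float:
    """Full logistic map on [0, 1]; its critical point is 1/2."""
    return 4.0 * x * (1.0 - x)


def df(x: float) -> float:
    """Derivative of the full logistic map."""
    return 4.0 - 8.0 * x


def orbit_derivative(x: float, n: int) -> float:
    """|Df^n(x)|: product of |Df| along x, f(x), ..., f^{n-1}(x) (chain rule)."""
    prod = 1.0
    for _ in range(n):
        prod *= df(x)
        x = f(x)
    return abs(prod)


# The fixed point 3/4 never enters a small neighbourhood of the critical point.
for n in range(1, 6):
    print(n, orbit_derivative(0.75, n))
```

For the fixed point $3/4$ the derivative is $-2$, so $|Df^n(3/4)| = 2^n$, which matches (A.3) with $K = 1$ and $\lambda = 2$ along this particular orbit; by Lemma A.1 the point is not attracting.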
Remark. If $f$ is Misiurewicz, then it follows that there exist $K > 0$ and $\lambda > 1$ such that $|Df^i(c_j)| \ge K\lambda^i$ for all $i \ge 0$ and $j \ge 1$.

Lemma A.3. Suppose $f$ and $g$ are S-unimodal and $\kappa(f) = \kappa(g)$. Then $f$ is Misiurewicz if and only if $g$ is Misiurewicz.

That is to say, the property of being Misiurewicz depends only on the kneading invariant. A characterization of the Misiurewicz property in terms of kneading invariants can be found in [9].

Proof. This is well known. Note that an S-unimodal map $f$ has a periodic attractor if and only if $\kappa(f)$ is periodic [3, p. 69]. Suppose $f$ is Misiurewicz. Then $f$ has no periodic attractors, so $g$ has no periodic attractors because $\kappa(g) = \kappa(f)$. As $f$ and $g$ have the same kneading invariant, no wandering intervals [5, Theorem A, p. 267], no intervals of periodic points [5, Theorem 6.1, p. 155] and no periodic attractors, $f$ and $g$ are topologically conjugate [5, Theorem 3.2, p. 104]. Therefore $g$ is Misiurewicz.

Acknowledgement. I would like to thank Gregorz Świątek for encouraging me to write this article and Laurence Stephen for typing it for me with her usual skill and efficiency. Henk Bruin kindly commented on an early version. Sebastian van Strien had the idea of considering real-analytic families.
References

1. Benedicks, M. and Carleson, L.: The dynamics of the Hénon map. Ann. of Math. 133, 73–169 (1991)
2. Brucks, K. and Misiurewicz, M.: Trajectory of the turning point is dense for almost all tent maps. Ergodic Theory Dynamical Systems 16, 1173–1183 (1996)
3. Collet, P. and Eckmann, J.-P.: Iterated Maps on the Interval as Dynamical Systems. Boston, Basel, Stuttgart: Birkhäuser, 1980
4. Coven, E.M., Kan, I. and Yorke, J.A.: Pseudo-orbit shadowing in the family of tent maps. Trans. Am. Math. Soc. 308 (1), 227–241 (Jul 1988)
5. de Melo, W. and van Strien, S.: One-dimensional Dynamics. Berlin–Heidelberg–New York: Springer-Verlag, 1993
6. Dieudonné, J.: Foundations of Modern Analysis. New York: Academic Press, 1969
7. Misiurewicz, M.: Absolutely continuous measures for certain maps of an interval. Inst. Hautes Études Sci. Math. Pub. 53, 17–51 (1981)
8. Rudin, W.: Real and complex analysis. New York: McGraw-Hill, 1987
9. Sands, D.: Topological conditions for positive Lyapunov exponent in unimodal maps. Preprint 95-59, Université de Paris-XI (Orsay), 1995
10. Singer, D.: Stable orbits and bifurcation of maps of the interval. SIAM J. Appl. Math. 35 (2), 260–267 (1978)

Communicated by Ya. G. Sinai
Commun. Math. Phys. 197, 131 – 166 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
The Lattice of Weak∗-Closed Inner Ideals in a W∗-Algebra⋆

C. M. Edwards¹, G. T. Rüttimann²

¹ The Queen’s College, Oxford OX1 4AW, UK
² Universität Bern, Sidlerstrasse 5, CH-3012 Bern, Switzerland
Received: 29 September 1997 / Accepted: 24 February 1998
Abstract: The family CP(A) of centrally equivalent pairs of projections in a W∗ -algebra A forms a complete ∗ -lattice that is ∗ -order isomorphic to the complete ∗ -lattice I(A) of weak∗ -closed inner ideals in A and to the complete ∗ -lattice S(A) of structural projections on A. Furthermore, the set of symmetric elements of CP(A) is order isomorphic to the complete orthomodular lattice P(A) of projections in A. Although not itself, in general, orthomodular, CP(A) possesses a complementation that allows for definitions of orthogonality, centre, and central orthogonality to be given. A less familiar notion in lattice theory, that is well-known in the theory of Jordan algebras and Jordan triple systems, is that of rigid collinearity of a pair (e1 , f1 ) and (e2 , f2 ) of elements of CP(A). This is defined and characterized in terms of properties of P(A). A W∗ -algebra A is sometimes thought of as providing a model for a statistical physical system. In this case CP(A) may be considered to represent the set consisting of a particular kind of sub-system, and the central orthogonality and rigid collinearity of pairs of elements of CP(A) may be regarded as representing two different types of disjointness of the corresponding sub-systems. It is therefore natural to consider bounded measures m on CP(A) that are additive on centrally orthogonal and rigidly collinear pairs of elements. Using results of J.D.M. Wright, it is shown that, provided that A has no weak∗ -closed ideal of Type I2 , such measures are precisely those that are the restrictions of bounded centrally symmetric sesquilinear functionals φm on A × A. Furthermore, m is an hermitian measure on the complete ∗ -lattice CP(A) if and only if the sesquilinear functional φm is hermitian and m is a normal measure if and only if φm is separately weak∗ -continuous. These results can be regarded as Gleason-type theorems.
⋆ Research supported in part by grants from Schweizerischer Nationalfonds/Fonds national suisse and the Royal Society.
1. Introduction

This paper is concerned with the structure of an arbitrary W∗-algebra. It has long been recognized that the Jordan structure of a W∗-algebra plays an intrinsic part in its description. In the fifties, Kadison [22] observed that linear isometries from a W∗-algebra to itself that preserved the unit were not, in general, ∗-isomorphisms, but were always Jordan ∗-isomorphisms. A second way in which the Jordan structure of a W∗-algebra arises naturally was discussed by Effros and Størmer [14] who showed that the range of a positive weak∗-continuous unital projection on a W∗-algebra is not necessarily a W∗-algebra but is always a Jordan W∗-algebra [7], or JW∗-algebra [35]. The most general result along these lines is due to Kaup [24], who showed that the range of a weak∗-continuous contractive projection on a W∗-algebra is not necessarily a Jordan W∗-algebra but always possesses a Jordan triple structure with respect to which it is a JBW∗-triple. For the general theory of Jordan W∗-algebras the reader is referred to [7,17] and [31], and for the general theory of JBW∗-triples to [1,9,15,18] and [23].

In the study of the Jordan triple structure of a W∗-algebra A, the weak∗-closed subspaces J that are of an immediate interest are the inner ideals, which are defined by the property that, for each element a in J and each element b in A, the element ab∗a lies in J. The authors showed in [10] that every weak∗-closed inner ideal J in a W∗-algebra A is of the form eAf, for a pair (e, f) of projections in A. A linear projection P on A is said to be structural if, for elements a, b in A, the elements (Pa)b∗(Pa) and P(a(Pb)∗a) of A coincide. It was shown in [8,10] and [11] that every such projection is weak∗-continuous and contractive and is of the form a ↦ eaf for a pair (e, f) of projections in A. A pair (e, f) of projections is said to be centrally equivalent if e and f have the same central support projection.
It is a consequence of the results referred to above that the sets CP(A) of centrally equivalent pairs of projections in A and S(A) of structural projections on A, with appropriate partial orderings, form complete lattices which are order isomorphic to the complete lattice I(A) of weak∗ -closed inner ideals in A. A W∗ -algebra A is often thought of as a model for a statistical quantum-mechanical system, the bounded observables of which are represented by self-adjoint elements of A, the propositions of which are represented by elements of the complete orthomodular lattice P(A) of projections in A and the states of which are represented, either by bounded linear functionals on A, or by ortho-additive measures on P(A). It has recently been shown by Bunce and Wright([3,4] and [5]) that, at least when A has no direct summand of Type I2 , these two different descriptions are equivalent. In fact, they show in [6] that this is certainly not the case for Type I2 W∗ -algebras, even for continuous measures. Weak∗ -continuous contractive projections on A represent filtering processes on the physical system and their ranges represent sub-systems. It follows that the most general such sub-systems are represented not by W∗ -algebras but by JBW∗ -triples. These sub-systems are not only non-classical mechanical but also are non-quantum mechanical. In particular, the sub-systems corresponding to structural projections are represented by weak∗ -closed inner ideals in A. In this paper the Jordan theoretic properties of the complete lattice CP(A) of centrally equivalent pairs of projections in the W∗ -algebra A are examined in some detail. It is shown that CP(A) possesses a very rich structure involving notions of compatibility, orthogonality, central orthogonality and rigid collinearity, all of which have physical interpretations. 
The central orthogonality and rigid collinearity of a pair (e1 , f1 ), (e2 , f2 ) of elements of CP(A) correspond to two different kinds of disjointness of the corresponding sub-systems. Consequently, it is natural to study those bounded measures m
on CP(A) which have the property that, for each pair (e1, f1), (e2, f2) of either centrally orthogonal or rigidly collinear elements of CP(A), m((e1, f1) ∨ (e2, f2)) = m((e1, f1)) + m((e2, f2)). Using results of Wright [32], it is shown that, provided that A has no direct summand of Type I2, such measures are the restrictions of a particular class of bounded sesquilinear functionals on A. These are precisely the decoherence functionals discussed in [19,20,21] and [32]. Furthermore, measures that are completely additive on centrally orthogonal and rigidly collinear families of elements of CP(A) are the restrictions of separately weak∗-continuous bounded sesquilinear functionals. It follows that states of the system corresponding to A may alternatively be thought of as measures on CP(A), with normal states corresponding to completely additive measures. These results can be considered as Gleason-type theorems ([3],[16]). A discussion of the boundedness of decoherence functionals and of the countable additivity of quantum measures may be found in [33].

The paper is organized as follows. In Sect. 2 definitions are given, notation is established and certain preliminary results are described. In Sect. 3 the notion of compatibility in the complete lattice CP(A) is introduced, and its consequences are examined. Although the complete lattice CP(A) does have a natural notion of orthogonality, it does not, in general, form an orthomodular lattice. Nevertheless, it is possible to define the centre and the concept of central orthogonality for CP(A), and this is achieved in Sect. 4. A more unfamiliar notion, that of rigid collinearity, borrowed from the theory of Jordan ∗-triples, is introduced in Sect. 5. Whilst the structure of CP(A) and its physical interpretation are of independent interest, the main result of the paper is the characterization of bounded measures on CP(A) as the restrictions of certain bounded sesquilinear forms. This is done in Sect. 6.

2. Preliminaries

Recall that a partially ordered set P is said to be a lattice if, for e and f in P, the supremum e ∨ f and the infimum e ∧ f exist. The partially ordered set P is said to be a complete lattice if, for any subset $\{e_j : j \in \Lambda\}$ of P, the supremum $\bigvee_{j \in \Lambda} e_j$ and the infimum $\bigwedge_{j \in \Lambda} e_j$ exist. A complete lattice has a greatest element, denoted by 1 and a least element, denoted by 0. A complete lattice P equipped with a non-trivial order automorphism of order two $e \mapsto e^*$ is said to be a complete ∗-lattice. A complete lattice P together with an anti-order automorphism $e \mapsto e'$ of P such that, for all elements e in P, $e \vee e' = 1$, $e'' = e$, and, for all elements e and f in P with e ≤ f, $f = e \vee (f \wedge e')$, is said to be orthomodular and the mapping $e \mapsto e'$ is said to be an orthocomplementation of P. Elements e and f in the complete orthomodular lattice P are said to be orthogonal, denoted e ⊥ f, if $e \le f'$. An element z in P is said to be central if, for all elements e in P, $e = (z \wedge e) \vee (z' \wedge e)$. Moreover, z is central if and only if, for all elements e in P, $z = (z \wedge e) \vee (z \wedge e')$. The set ZP of central elements of the complete orthomodular lattice P contains 0 and 1, and if z is contained in ZP then so also is $z'$. The centre ZP of P forms a sub-complete orthomodular lattice of P which, with the restricted order and orthocomplementation, is Boolean. The central support c(e) of an element e in P is the infimum of the set of elements in ZP which dominate e. Observe that, for elements e and f in P,
\[
c(e \wedge c(f)) = c(e) \wedge c(f) \tag{2.1}
\]
and, for each subset $\{e_j : j \in \Lambda\}$ of P,
\[
c\Bigl(\bigvee_{j \in \Lambda} e_j\Bigr) = \bigvee_{j \in \Lambda} c(e_j). \tag{2.2}
\]
When endowed with the product ordering the Cartesian product P × P of the complete orthomodular lattice P with itself forms a complete lattice. An element (e, f) in P × P is said to be centrally equivalent if the central supports c(e) of e and c(f) of f coincide and, in this case, the common central support is denoted by c(e, f). Let CP be the collection of centrally equivalent elements of P × P. It follows from (2.2) that, with the ordering inherited from P × P, CP is a complete lattice with least element (0, 0) and greatest element (1, 1), and the supremum of a subset of CP coincides with its supremum when regarded as a subset of P × P. In general, this is not the case for the infimum. However, for any element (e, f) in CP,
\[
(e, f) = (c(f), f) \wedge (e, c(e)) = (c(f), f) \wedge_{P \times P} (e, c(e)).
\]
For details the reader is referred to [29]. For each element (e, f) of CP, let $(e, f)^* = (f, e)$. Then CP equipped with the mapping $(e, f) \mapsto (e, f)^*$ is a complete ∗-lattice. Furthermore, the map $e \mapsto (e, e)$ is an order isomorphism from P onto the sub-complete lattice of CP consisting of ∗-invariant elements of CP.

Let A be a W∗-algebra and let P(A) be the family of self-adjoint idempotents, or projections in A. For e and f in P(A), write e ≤ f if ef = e and let $e' = 1 - e$, where 1 is the unit in A. These define a partial ordering and an orthocomplementation on P(A), with respect to which P(A) is a complete orthomodular lattice. Observe that, for orthogonal elements e and f in P(A), e ∨ f = e + f and, by (2.2), c(e + f) = c(e) ∨ c(f). Furthermore, for each increasing net $(e_j)_{j \in \Lambda}$ in P(A), the supremum $\bigvee_{j \in \Lambda} e_j$ coincides with the limit of the net $(e_j)_{j \in \Lambda}$ in the weak∗ topology. The centre ZP(A) of P(A) coincides with the complete Boolean lattice of projections in the algebraic centre Z(A) of A. Observe that, for z in ZP(A) and e in P(A),
\[
z \wedge e = ze, \qquad z \vee e = e + z - ez = z + z'e. \tag{2.3}
\]
Some properties of central supports of elements of P(A) are listed below for future reference. Their proofs are straightforward.

Lemma 2.1. Let P(A) be the complete orthomodular lattice of projections in the W∗-algebra A with centre ZP(A), let e lie in P(A), with central support c(e), and let z lie in ZP(A). Then:
(i) $c(e')' \le c(e)$, $c(e)' \le c(e')$, $c(e)'c(e')' = 0$;
(ii) $c(e)c(e') + c(e')' + c(e)' = 1$ and $c(e)c(e')$, $c(e')'$ and $c(e)'$ are pairwise orthogonal central projections;
(iii) $c(z \wedge e) = c(ze) = zc(e)$, $c(z \vee e) = z \vee c(e)$.

A subspace J of the W∗-algebra A is said to be a left ideal if AJ ⊆ J, is said to be a right ideal if JA ⊆ J, and is said to be an ideal if it is both a left and a right ideal. For each weak∗-closed left ideal J in A there exists a unique projection e such that J coincides with Ae. The left ideal Ae is an ideal if and only if e is central. For each element a in A, the unique projection e(a) such that the left ideal {b ∈ A : ba = 0} coincides with $Ae(a)'$ is said to be the left support of a. It is the least element of P(A) for which e(a)a = a. The right support f(a) is similarly defined. Clearly, $e(a) = f(a^*)$ and,
therefore, the left and right supports of a self-adjoint element a coincide. This element, denoted by s(a), is the unit in the sub-W∗-algebra of A generated by a and is said to be the support projection of a. An element u in A is said to be a partial isometry if $uu^*u = u$ or, equivalently, if either $uu^*$ or $u^*u$ is a projection. For any subset B of A, the set of partial isometries in B is denoted by U(B). For each partial isometry u in A, $e(u) = uu^*$, $f(u) = u^*u$ and the central supports c(e(u)) and c(f(u)) coincide. For each element a in A, there exists a unique partial isometry r(a) in A such that $a = r(a)|a|$ and $f(r(a)) = s(|a|)$, where $|a| = (a^*a)^{1/2}$. Moreover,
\[
r(a)^* = r(a^*), \qquad f(a) = f(r(a)), \qquad e(a) = e(r(a)), \qquad a = r(a)a^*r(a). \tag{2.4}
\]
The partial isometry r(a) is said to be the support of a. For details of these and other results on W∗-algebras the reader is referred to [27] and [28].

The Jordan triple product {a b c} of three elements a, b and c in A is defined by
\[
\{a\ b\ c\} = \tfrac{1}{2}(ab^*c + cb^*a).
\]
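The triple product is easy to experiment with in the smallest non-commutative case. The sketch below takes A to be the 2×2 complex matrices, an assumption made purely for illustration, with hypothetical helper names. It computes $\{a\ b\ a\} = ab^*a$ for an element a of eAf and checks that the result again lies in eAf, which is the inner ideal property recalled in the Introduction.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def adjoint(a):
    """Conjugate transpose a*."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def triple(a, b, c):
    """Jordan triple product {a b c} = (a b* c + c b* a) / 2."""
    bs = adjoint(b)
    x = matmul(matmul(a, bs), c)
    y = matmul(matmul(c, bs), a)
    n = len(a)
    return [[(x[i][j] + y[i][j]) / 2 for j in range(n)] for i in range(n)]


# Projections e, f in the 2x2 matrices and an element a = e x f of eAf.
e = [[1, 0], [0, 0]]
f = [[0, 0], [0, 1]]
x = [[1, 3], [2, 5]]
a = matmul(matmul(e, x), f)      # only the (1, 2) entry survives
b = [[1, 2], [5 + 1j, 7]]        # an arbitrary element of A

t = triple(a, b, a)              # equals a b* a
in_ideal = matmul(matmul(e, t), f) == t
print(t, in_ideal)
```

Small integer entries are used so that the floating-point arithmetic is exact and the membership test can be a plain equality.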
A subspace J of A is said to be a subtriple of A if the subspace {J J J} is contained in J and is said to be an inner ideal if the subspace {J A J} is contained in J. Observe that, for each pair a, b of elements of A the subspace aAb is an inner ideal in A. Since the intersection of a family of weak∗-closed inner ideals in A is a weak∗-closed inner ideal in A, the set I(A) of weak∗-closed inner ideals in A, when ordered by set inclusion, forms a complete lattice. Moreover, it can be seen that the adjoint $J^*$ of an element J in I(A), is also a weak∗-closed inner ideal and that the mapping $J \mapsto J^*$ is an involution on I(A). Observe that, for elements a and b in A, $(aAb)^* = b^*Aa^*$. Combining this remark with Theorem 4.1 of [10] yields the following important result.

Lemma 2.2. Let A be a W∗-algebra, let P(A) be the complete orthomodular lattice of projections in A, and let CP(A) be the complete ∗-lattice of centrally equivalent pairs of elements of P(A). Then, the mapping $(e, f) \mapsto eAf$ is a ∗-order isomorphism from CP(A) onto the complete ∗-lattice I(A) of weak∗-closed inner ideals in A, with inverse $J \mapsto (e_J, f_J)$, where
\[
e_J = \bigvee \{e(u) : u \in U(J)\}, \qquad f_J = \bigvee \{f(u) : u \in U(J)\}.
\]
The next two lemmas describe some further basic properties of the complete ∗-lattice CP(A). The first follows from (2.2).

Lemma 2.3. Let A be a W∗-algebra, let P(A) be the complete orthomodular lattice of projections in A, let CP(A) be the complete ∗-lattice of centrally equivalent pairs of elements of P(A), and let $((e_j, f_j))_{j \in \Lambda}$ be an increasing net in CP(A) with supremum (e, f). Then, $(e_j)_{j \in \Lambda}$ and $(f_j)_{j \in \Lambda}$ are increasing nets in P(A) which converge in the weak∗ topology to e and f respectively.

For projections e and f in the W∗-algebra A, the subspace eAf is a weak∗-closed inner ideal in A. Therefore, by Lemma 2.2, there exists a unique element $(\tilde e, \tilde f)$ in CP(A) such that eAf coincides with $\tilde e A \tilde f$. The next lemma, the proof of which is straightforward, lists the consequences of this remark.
Lemma 2.4. Let A be a W∗-algebra, let P(A) be the complete orthomodular lattice of projections in A, let CP(A) be the complete ∗-lattice of centrally equivalent pairs of elements of P(A), and, for each pair (e, f) of elements of P(A), let $(\tilde e, \tilde f)$ be the element of CP(A) for which eAf coincides with $\tilde e A \tilde f$. Let e and f be elements of P(A). Then $\tilde e = c(f)e$, $\tilde f = c(e)f$ and
\[
c(\tilde e, \tilde f) = c(e)c(f) = \bigwedge \{w \in ZP(A) : c(e)f = wf,\ c(f)e = we\} = \bigwedge \{w \in ZP(A) : eAf \subseteq wA\}.
\]
This result has the following corollary.

Corollary 2.5. Let $(e_1, f_1)$ and $(e_2, f_2)$ be elements of CP(A). Then,
\[
(e_1, f_1) \wedge (e_2, f_2) = \bigl(c(f_1 \wedge f_2)(e_1 \wedge e_2),\ c(e_1 \wedge e_2)(f_1 \wedge f_2)\bigr).
\]
Proof. Notice that
\[
e_1Af_1 \cap e_2Af_2 = (e_1 \wedge e_2)A(f_1 \wedge f_2).
\]
The assertion follows from Lemma 2.2 and Lemma 2.4.

3. Compatibility in the Complete ∗-Lattice CP(A)
Let CP(A) be the complete ∗-lattice of centrally equivalent pairs of projections in a W∗-algebra A. In this section the notion of compatibility of pairs of elements of CP(A), borrowed from the theory of Jordan algebras and Jordan ∗-triples ([12,26]), is considered. For each element (e, f) in CP(A) and each element a in A, let
\[
P_2(e, f)a = eaf, \qquad P_1(e, f)a = eaf' + e'af, \qquad P_0(e, f)a = e'af'. \tag{3.1}
\]
Then, $P_0(e, f)$, $P_1(e, f)$ and $P_2(e, f)$ are weak∗-continuous norm non-increasing pairwise orthogonal linear projections on A with sum I, the identity operator. The ranges $A_0(e, f)$ and $A_2(e, f)$ of $P_0(e, f)$ and $P_2(e, f)$ are weak∗-closed inner ideals in A and the range $A_1(e, f)$ of $P_1(e, f)$ is a weak∗-closed subtriple of A. The decomposition $A = A_0(e, f) \oplus A_1(e, f) \oplus A_2(e, f)$ of A is said to be the generalized Peirce decomposition of A relative to (e, f). Let D(e, f) be the weak∗-continuous bounded linear operator on A defined, for each element a in A, by
\[
D(e, f)a = \bigl(\tfrac{1}{2}P_1(e, f) + P_2(e, f)\bigr)a = \tfrac{1}{2}(ea + af). \tag{3.2}
\]
Recall that a bounded linear operator T on a complex Banach space X is said to be hermitian if the numerical range of T in the Banach algebra B(X) of bounded linear operators on X is contained in $\mathbb{R}$, or, equivalently, if, for all real numbers t, the bounded linear operator $e^{itT}$ is an isometry (see [2], Lemma 5.2). The proof of the following result can be extracted from the results of [12], but, for completeness, a direct proof will be given.

Lemma 3.1. Let A be a W∗-algebra, let P(A) be the complete orthomodular lattice of projections in A, let CP(A) be the complete ∗-lattice of centrally equivalent pairs of elements of P(A), let (e, f) be an element of CP(A) and, for j equal to 0, 1, or 2, let the operators $P_j(e, f)$ and D(e, f) be defined by (3.1) and (3.2). Then:
(i) for each complex number λ of unit modulus, the weak∗-continuous linear operator S(e, f)(λ) on A, defined by
\[
S(e, f)(\lambda) = P_0(e, f) + \lambda P_1(e, f) + \lambda^2 P_2(e, f),
\]
is a Jordan triple automorphism of A and an isometry from A onto A such that, for each real number t,
\[
S(e, f)(e^{it}) = \exp(2itD(e, f));
\]
(ii) the weak∗-continuous linear operator D(e, f) is hermitian.

Proof. Since S(e, f)(λ) is linear, in order to show that it is a triple homomorphism, it is enough to prove that, for each element a in A,
\[
S(e, f)(\lambda)(aa^*a) = (S(e, f)(\lambda)a)(S(e, f)(\lambda)a)^*(S(e, f)(\lambda)a).
\]
For j equal to 0, 1 or 2, let $a_j$ denote $P_j(e, f)a$. Then, using (3.1) it can be seen that, for j, k and l equal to 0, 1 or 2, the Jordan triple product $\{a_j\ a_k\ a_l\}$ is equal to zero if j + l − k is not equal to 0, 1 or 2 and lies in $A_{j-k+l}(e, f)$ otherwise. Hence
\begin{align*}
S(e, f)(\lambda)(aa^*a) &= S(e, f)(\lambda) \sum_{j,k,l=0}^{2} \{a_j\ a_k\ a_l\} \\
&= \sum_{j,k,l=0}^{2} \lambda^{j-k+l} \{a_j\ a_k\ a_l\} \\
&= \Bigl\{ \sum_{j=0}^{2} \lambda^j a_j \ \ \sum_{k=0}^{2} \lambda^k a_k \ \ \sum_{l=0}^{2} \lambda^l a_l \Bigr\} \\
&= (S(e, f)(\lambda)a)(S(e, f)(\lambda)a)^*(S(e, f)(\lambda)a)
\end{align*}
as required. Furthermore,
\[
S(e, f)(\lambda)S(e, f)(\bar\lambda) = S(e, f)(\bar\lambda)S(e, f)(\lambda) = I,
\]
and it follows that S(e, f)(λ) is a Jordan triple automorphism of A. The results of [25] and [30], 20.10, show that the triple automorphisms of A are precisely the linear isometries from A onto A. Observe that
\[
\frac{d}{dt} S(e, f)(e^{it})\Big|_{t=0} = 2iD(e, f),
\]
and it follows that, for each real number t,
\[
S(e, f)(e^{it}) = \exp(2itD(e, f))
\]
as required. Since, for all real numbers t, the linear operator exp(itD(e, f)) is an isometry, the bounded linear operator D(e, f) is hermitian.

Following the definition in [26], two elements $(e_1, f_1)$ and $(e_2, f_2)$ of CP(A) are said to be compatible if, for j and k equal to 0, 1 or 2, the commutator $[P_j(e_1, f_1), P_k(e_2, f_2)]$ is equal to zero. The next lemma describes other conditions equivalent to that of compatibility.
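The identities behind Lemma 3.1 can be checked numerically in a small case. The sketch below again takes A to be the 2×2 complex matrices, with sample projections e, f and helper names that are illustrative assumptions rather than the paper's notation. It checks that the Peirce projections of (3.1) sum to the identity, that D(e, f) of (3.2) equals ½P₁ + P₂, and that S(e, f)(λ) preserves the cube aa∗a for λ = i.

```python
def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(*ms):
    n = len(ms[0])
    return [[sum(m[i][j] for m in ms) for j in range(n)] for i in range(n)]


def scale(s, a):
    return [[s * v for v in row] for row in a]


def adj(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def comp(p):
    """Orthocomplement p' = 1 - p of a projection."""
    n = len(p)
    return [[(1 if i == j else 0) - p[i][j] for j in range(n)] for i in range(n)]


def peirce(e, f, a):
    """Return (P0 a, P1 a, P2 a) as in (3.1)."""
    ep, fp = comp(e), comp(f)
    p2 = mul(mul(e, a), f)
    p1 = add(mul(mul(e, a), fp), mul(mul(ep, a), f))
    p0 = mul(mul(ep, a), fp)
    return p0, p1, p2


def D(e, f, a):
    """D(e, f)a = (ea + af) / 2 as in (3.2)."""
    return scale(0.5, add(mul(e, a), mul(a, f)))


def S(e, f, lam, a):
    """S(e, f)(lam)a = P0 a + lam P1 a + lam^2 P2 a."""
    p0, p1, p2 = peirce(e, f, a)
    return add(p0, scale(lam, p1), scale(lam * lam, p2))


def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def cube(x):
    """The triple cube x x* x."""
    return mul(mul(x, adj(x)), x)


e = [[1, 0], [0, 0]]
f = [[0, 0], [0, 1]]
a = [[1, 2 + 1j], [3, 4]]
lam = 1j  # a complex number of unit modulus

p0, p1, p2 = peirce(e, f, a)
sums_to_identity = close(add(p0, p1, p2), a)
d_matches = close(D(e, f, a), add(scale(0.5, p1), p2))
s_is_triple_hom = close(S(e, f, lam, cube(a)), cube(S(e, f, lam, a)))
print(sums_to_identity, d_matches, s_is_triple_hom)
```

The last check works because S(e, f)(λ)a = uav with the unitaries u = e′ + λe and v = f′ + λf, so the cube of S(e, f)(λ)a is u(aa∗a)v.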
Lemma 3.2. Let A be a W∗-algebra, let P(A) be the complete orthomodular lattice of projections in A, let CP(A) be the complete ∗-lattice of centrally equivalent pairs of elements of P(A), and, for an element (e, f) of CP(A), let the operator D(e, f) be defined by (3.2). Then, for elements $(e_1, f_1)$ and $(e_2, f_2)$ of CP(A), the following conditions are equivalent:
(i) $(e_1, f_1)$ and $(e_2, f_2)$ are compatible;
(ii) $[D(e_1, f_1), D(e_2, f_2)] = 0$;
(iii) $D(e_1, f_1)(e_2Af_2) \subseteq e_2Af_2$ and $D(e_1, f_1)(e_2'Af_2') \subseteq e_2'Af_2'$;
(iv) $D(e_2, f_2)(e_1Af_1) \subseteq e_1Af_1$ and $D(e_2, f_2)(e_1'Af_1') \subseteq e_1'Af_1'$;
(v) $e_1e_2 = e_2e_1$ and $f_1f_2 = f_2f_1$.
Proof. That (i) implies (ii) follows from the definition of D(e, f). Observe that, from Lemma 3.1, if (ii) holds, then, for each real number t,
\[
[P_0(e_1, f_1) + e^{it}P_1(e_1, f_1) + e^{2it}P_2(e_1, f_1), D(e_2, f_2)] = [\exp(2itD(e_1, f_1)), D(e_2, f_2)] = 0,
\]
and, by choosing t appropriately, it follows that, for j equal to 0, 1 and 2, $[P_j(e_1, f_1), D(e_2, f_2)]$ is equal to zero. Similarly, for each real number s,
\[
[P_j(e_1, f_1), P_0(e_2, f_2) + e^{is}P_1(e_2, f_2) + e^{2is}P_2(e_2, f_2)] = [P_j(e_1, f_1), \exp(2isD(e_2, f_2))] = 0,
\]
and, by appropriately choosing s, it can be seen that, for j and k equal to 0, 1 and 2, $[P_j(e_1, f_1), P_k(e_2, f_2)]$ is equal to zero. Therefore, (i) and (ii) are equivalent conditions. If (i) and (ii) hold then, for j equal to 0, 1 or 2,
\begin{align*}
P_j(e_2, f_2)D(e_1, f_1)(e_2Af_2) &= D(e_1, f_1)P_j(e_2, f_2)(e_2Af_2) \\
&= \delta_{j2}\,D(e_1, f_1)P_2(e_2, f_2)A \\
&= \delta_{j2}\,P_2(e_2, f_2)D(e_1, f_1)A \subseteq e_2Af_2.
\end{align*}
Therefore,
\[
D(e_1, f_1)(e_2Af_2) = \sum_{j=0}^{2} P_j(e_2, f_2)D(e_1, f_1)(e_2Af_2) \subseteq e_2Af_2.
\]
Similarly, $D(e_1, f_1)(e_2'Af_2') \subseteq e_2'Af_2'$. Therefore (iii) and, by symmetry, (iv) hold. Now suppose that (iii) holds. Then, using (3.2), for all elements a in A,
\[
e_2\bigl(e_1(e_2af_2) + (e_2af_2)f_1\bigr)f_2 = e_1(e_2af_2) + (e_2af_2)f_1, \tag{3.3}
\]
\[
e_2'\bigl(e_1(e_2'af_2') + (e_2'af_2')f_1\bigr)f_2' = e_1(e_2'af_2') + (e_2'af_2')f_1. \tag{3.4}
\]
Multiplying (3.3) on the left by $e_2'$ and (3.4) on the left by $e_2$ gives
\[
e_2'e_1e_2af_2 = 0, \tag{3.5}
\]
\[
e_2e_1e_2'af_2' = 0. \tag{3.6}
\]
Choosing a equal to $e_2e_1e_2'$ in (3.5) and a equal to 1 in (3.6) yields
\[
(e_2'e_1e_2)(e_2e_1e_2')f_2 = 0, \tag{3.7}
\]
\[
e_2e_1e_2' = (e_2e_1e_2')f_2. \tag{3.8}
\]
Substituting from (3.8) in (3.7) gives
\[
(e_2e_1e_2')^*(e_2e_1e_2') = 0,
\]
which implies that $e_2e_1e_2'$ is equal to zero. Therefore,
\[
e_2e_1 = e_2e_1e_2 = (e_2e_1e_2)^* = e_1e_2.
\]
A similar argument shows that $f_1f_2 = f_2f_1$ and (v) holds. Conversely, if (v) holds, then
\[
e_1'e_2 = e_2e_1', \qquad e_1e_2' = e_2'e_1, \qquad e_1'e_2' = e_2'e_1',
\]
\[
f_1'f_2 = f_2f_1', \qquad f_1f_2' = f_2'f_1, \qquad f_1'f_2' = f_2'f_1',
\]
and simple calculations show that (i) and (ii) hold.
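The equivalence of (i) and (v) in Lemma 3.2 can be observed numerically. The following sketch, again for the 2×2 matrices, treats each Peirce projection as a linear operator and tests whether all nine commutators $[P_j(e_1, f_1), P_k(e_2, f_2)]$ vanish on the matrix units. The helper names and the sample projections are hypothetical choices made for this illustration.

```python
def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def comp(p):
    """Orthocomplement p' = 1 - p."""
    n = len(p)
    return [[(1.0 if i == j else 0.0) - p[i][j] for j in range(n)]
            for i in range(n)]


def peirce_ops(e, f):
    """The operators P0, P1, P2 of (3.1) as Python callables."""
    ep, fp = comp(e), comp(f)
    return [
        lambda a: mul(mul(ep, a), fp),
        lambda a: add(mul(mul(e, a), fp), mul(mul(ep, a), f)),
        lambda a: mul(mul(e, a), f),
    ]


def unit(i, j, n=2):
    """Matrix unit with a single 1 in position (i, j)."""
    return [[1.0 if (r, c) == (i, j) else 0.0 for c in range(n)]
            for r in range(n)]


def compatible(pair1, pair2, tol=1e-12):
    """True if every [Pj(pair1), Pk(pair2)] vanishes on all matrix units."""
    basis = [unit(i, j) for i in range(2) for j in range(2)]
    for P in peirce_ops(*pair1):
        for Q in peirce_ops(*pair2):
            for x in basis:
                d = sub(P(Q(x)), Q(P(x)))
                if any(abs(v) > tol for row in d for v in row):
                    return False
    return True


one = [[1.0, 0.0], [0.0, 1.0]]
p = [[1.0, 0.0], [0.0, 0.0]]
q = [[0.5, 0.5], [0.5, 0.5]]   # a projection that does not commute with p

print(compatible((p, one), (comp(p), p)))   # components commute pairwise
print(compatible((p, one), (q, one)))       # e-components do not commute
```

Since the operators are linear, testing on the matrix units is enough. In the 2×2 matrices every non-zero projection has central support 1, so all the pairs used here are centrally equivalent.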
4. Orthogonality in the Complete ∗-Lattice CP(A)

For an element (e, f) in the complete ∗-lattice CP(A) of centrally equivalent pairs of elements of the complete orthomodular lattice P(A), it is not necessarily true that the projections $e'$ and $f'$ are centrally equivalent. Consequently, the product orthocomplementation on P(A) × P(A) does not, in general, restrict to an orthocomplementation on CP(A). However, a notion of complement on CP(A) is available. Lemma 2.4 shows that, for each element (e, f) in CP(A), the element $(c(f')e', c(e')f')$ lies in CP(A). Therefore define
\[
(e, f)' = (c(f')e', c(e')f'). \tag{4.1}
\]
Before studying the properties of the mapping $(e, f) \mapsto (e, f)'$, the following lemma is needed.

Lemma 4.1. Let A be a W∗-algebra, let CP(A) be the complete ∗-lattice of centrally equivalent pairs of projections in A and, for (e, f) in CP(A), let $(e, f)'$ be defined by (4.1). Then,
\[
(e, f)'' = \bigl((c(f')e')', (c(e')f')'\bigr), \tag{4.2}
\]
and $c((e, f)'')$ is equal to c(e, f).

Proof. Since $(c(f')e')'$ is equal to $c(f')' \vee e$, it follows from (2.2) that
\[
c((c(f')e')') = c(c(f')' \vee e) = c(f')' \vee c(e) = c(f')' \vee c(f) = c(e, f),
\]
since, by Lemma 2.1, $c(f')' \le c(f)$. Similarly, $c((c(e')f')')$ is equal to c(e, f), and the proof is complete.

The properties of the mapping $(e, f) \mapsto (e, f)'$ are listed in the following lemma.

Lemma 4.2. Let A be a W∗-algebra, let CP(A) be the complete ∗-lattice of centrally equivalent pairs of projections in A and, for (e, f) in CP(A), let $(e, f)'$ be defined by (4.1). Then:
(i) the mapping $(e, f) \mapsto (e, f)'$ is order reversing;
(ii) for each element (e, f) in CP(A), $(e, f) \le (e, f)''$.
Proof. Let $(e_1, f_1)$ and $(e_2, f_2)$ be elements of CP(A) such that $(e_1, f_1) \le (e_2, f_2)$. Then, $e_1 \le e_2 \le c(e_2, f_2)$ and $c(e_1, f_1) \le c(e_2, f_2)$. Furthermore, since $e_2' \le e_1'$ and $f_2' \le f_1'$, it can be seen that
\[
c(f_2')e_2' \le c(f_1')e_1', \qquad c(e_2')f_2' \le c(e_1')f_1'.
\]
It follows that $(e_2, f_2)' \le (e_1, f_1)'$ as required, and the proof of (i) is complete. Using (2.2), for each element (e, f) in CP(A),
\[
e \le e + c(f')'e' = c(f')' + c(f')e = (c(f')e')',
\]
and, similarly, $f \le (c(e')f')'$. It follows from Lemma 4.1 that $(e, f) \le (e, f)''$.

It is very easy to give an example of an element (e, f) in CP(A) for which $(e, f)''$ is not equal to (e, f). Let A be the orthogonal sum $B(H_1) \oplus B(H_2) \oplus B(H_3)$ where, for j equal to 1, 2 or 3, $B(H_j)$ is the W∗-algebra of bounded linear operators on the complex Hilbert space $H_j$. For each element a in A, let $a = a_1 + a_2 + a_3$ be its orthogonal decomposition. If 1 denotes the unit in A, then the centre Z(A) of A is the three-dimensional space $\mathbb{C}1_1 + \mathbb{C}1_2 + \mathbb{C}1_3$. Let
\[
e = e_1 + 1_2, \qquad f = 1_1 + f_2,
\]
where $e_1$ is a projection in $B(H_1)$, not equal to $1_1$, and $f_2$ is a projection in $B(H_2)$, not equal to $1_2$. Then,
\[
e' = e_1' + 1_3, \qquad f' = f_2' + 1_3, \qquad c(e') = 1_1 + 1_3, \qquad c(f') = 1_2 + 1_3.
\]
It follows that
\[
c(f')e' = 1_3, \qquad c(e')f' = 1_3
\]
and, therefore,
\[
(e, f)'' = (1_1 + 1_2, 1_1 + 1_2) \ne (e, f).
\]
It follows that, in general, the complete ∗-lattice CP(A) is not orthomodular. An element $(e_1,f_1)$ in CP(A) is said to be orthogonal to an element $(e_2,f_2)$ in CP(A), written $(e_1,f_1) \perp (e_2,f_2)$, if $(e_2,f_2) \le (e_1,f_1)'$.

Lemma 4.3. Let A be a W∗-algebra, let P(A) be the complete orthomodular lattice of projections in A, and let CP(A) be the complete ∗-lattice of centrally equivalent pairs of elements of P(A). Then, for elements $(e_1,f_1)$ and $(e_2,f_2)$ in CP(A), the following conditions are equivalent:
(i) $(e_1,f_1) \perp (e_2,f_2)$;
(ii) $(e_2,f_2) \perp (e_1,f_1)$;
(iii) $e_1 \perp e_2$ and $f_1 \perp f_2$.

Proof. If (i) holds then
\[ e_2 \le c(f_1')e_1' \le e_1', \qquad f_2 \le c(e_1')f_1' \le f_1', \]
and (iii) holds. Conversely, if (iii) holds then $e_2 \le e_1'$ and $f_2 \le f_1'$ and, therefore, $e_2Af_2$ is contained in $e_1'Af_1'$. Applying Lemma 2.2 and Lemma 2.4, it follows that $(e_2,f_2) \le (c(f_1')e_1', c(e_1')f_1')$, and (i) holds. Therefore (i) and (iii) are equivalent and, similarly, (ii) and (iii) are equivalent.
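The equivalence in Lemma 4.3 can be sanity-checked by brute force in a toy model. The sketch below assumes A is the direct sum of two copies of $M_2(\mathbb{C})$, enumerates diagonal projections only, and assumes that the order on CP(A) is the componentwise order on pairs; all helper names are hypothetical. It is a finite check, not a proof.

```python
from itertools import combinations, product

BLOCKS, DIM = 2, 2  # assumption: A = M_2(C) + M_2(C), diagonal projections only
FULL = frozenset(range(DIM))
SUBSETS = [frozenset(c) for r in range(DIM + 1) for c in combinations(range(DIM), r)]
PROJECTIONS = list(product(SUBSETS, repeat=BLOCKS))

def comp(p):
    """Orthocomplement p', block by block."""
    return tuple(FULL - block for block in p)

def central_support(p):
    """Set of block indices on which p is non-zero; this models c(p)."""
    return frozenset(j for j, block in enumerate(p) if block)

def cut(z, p):
    """Product of the central projection z with p."""
    return tuple(block if j in z else frozenset() for j, block in enumerate(p))

def leq(p, q):
    """Order p <= q on diagonal projections."""
    return all(a <= b for a, b in zip(p, q))

def prime(pair):
    """(e, f)' = (c(f') e', c(e') f')."""
    e, f = pair
    ec, fc = comp(e), comp(f)
    return (cut(central_support(fc), ec), cut(central_support(ec), fc))

def perp(x, y):
    """x is orthogonal to y when y <= x' (componentwise order, an assumption)."""
    xp = prime(x)
    return leq(y[0], xp[0]) and leq(y[1], xp[1])

def componentwise_perp(x, y):
    """Condition (iii): e1 is orthogonal to e2 and f1 is orthogonal to f2."""
    return leq(y[0], comp(x[0])) and leq(y[1], comp(x[1]))

# CP(A): pairs of projections with equal central support
CP = [(e, f) for e in PROJECTIONS for f in PROJECTIONS
      if central_support(e) == central_support(f)]
equivalent = all(perp(x, y) == perp(y, x) == componentwise_perp(x, y)
                 for x in CP for y in CP)
print(len(CP), equivalent)
```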
Lemma 3.2 and Lemma 4.3 lead to the following result.

Corollary 4.4. Let $(e_1,f_1)$ and $(e_2,f_2)$ be orthogonal elements of CP(A). Then $(e_1,f_1)$ and $(e_2,f_2)$ are compatible.

Although the complete ∗-lattice CP(A) is not, in general, orthomodular, it is possible to give a definition of its centre. An element $(g,h)$ in CP(A) is said to be central if, for each element $(e,f)$ of CP(A),
\[ (e,f) = ((g,h) \wedge (e,f)) \vee ((g,h)' \wedge (e,f)). \tag{4.3} \]
The next result describes the central elements of CP(A).

Theorem 4.5. Let A be a W∗-algebra, let P(A) be the complete orthomodular lattice of projections in A, with centre ZP(A), and let CP(A) be the complete ∗-lattice of centrally equivalent pairs of elements of P(A). Then, for an element $(g,h)$ in CP(A), the following are equivalent:
(i) $(g,h)$ is central;
(ii) g is equal to h and lies in ZP(A);
(iii) for each element $(e,f)$ of CP(A),
\[ (g,h) = ((g,h) \wedge (e,f)') \vee ((g,h) \wedge (e,f)''). \tag{4.4} \]
Proof. (i) ⇒ (ii) For each element e in P(A), $(e,e)$ lies in CP(A). Using Corollary 2.5,
\begin{align*}
(e,e) &= ((g,h) \wedge (e,e)) \vee ((g,h)' \wedge (e,e)) \\
&= \bigl(c(h \wedge e)(g \wedge e),\ c(g \wedge e)(h \wedge e)\bigr) \vee \bigl(c(c(g')(h' \wedge e))(c(h')(g' \wedge e)),\ c(c(h')(g' \wedge e))(c(g')(h' \wedge e))\bigr).
\end{align*}
From Lemma 2.1 and Lemma 2.2, it can be seen that
\[ e = c(h \wedge e)(g \wedge e) \vee c(g')c(h')c(h' \wedge e)(g' \wedge e) \le (g \wedge e) \vee (g' \wedge e) \le e. \]
It follows that
\[ e = (g \wedge e) \vee (g' \wedge e), \]
which implies that g lies in ZP(A). Similarly, h lies in ZP(A) and, since g and h have the same central support, they are equal.

(ii) ⇒ (i) Let g lie in ZP(A) and let $(e,f)$ be an arbitrary element of CP(A). Then, by Corollary 2.5 and Lemma 2.1,
\[ (g,g) \wedge (e,f) = (ge, gf), \qquad (g,g)' \wedge (e,f) = (g'e, g'f), \]
and
\[ ((g,g) \wedge (e,f)) \vee ((g,g)' \wedge (e,f)) = (ge, gf) \vee (g'e, g'f) = (ge \vee g'e,\ gf \vee g'f) = (e,f). \]
It follows that $(g,g)$ is central.

(ii) ⇒ (iii) Let g be an element of ZP(A) and let $(e,f)$ lie in CP(A). By Lemma 2.4, Corollary 2.5 and Lemma 4.3,
\begin{align*}
(g,g) \wedge (e,f)' &= (g,g) \wedge (c(f')e', c(e')f') \\
&= \bigl(c(e')c(g \wedge f')c(f')(g \wedge e'),\ c(f')c(g \wedge e')c(e')(g \wedge f')\bigr) \\
&= \bigl(c(e')c(f')ge',\ c(e')c(f')gf'\bigr), \\
(g,g) \wedge (e,f)'' &= (g,g) \wedge \bigl((c(f')e')', (c(e')f')'\bigr) \\
&= \bigl(c(g \wedge (c(e')f')')(g \wedge (c(f')e')'),\ c(g \wedge (c(f')e')')(g \wedge (c(e')f')')\bigr) \\
&= \bigl(c((c(e')f')')\,g\,(c(f')e')',\ c((c(f')e')')\,g\,(c(e')f')'\bigr) \\
&= \bigl(g(c(f')e')',\ g(c(e')f')'\bigr).
\end{align*}
Since P(A) is a complete orthomodular lattice with g in its centre, it follows that, for each element h in P(A), $g = (g \wedge h) \vee (g \wedge h')$. Hence,
\begin{align*}
((g,g) \wedge (e,f)') \vee ((g,g) \wedge (e,f)'') &= \bigl((c(e')c(f')ge') \vee (g(c(f')e')'),\ (c(e')c(f')gf') \vee (g(c(e')f')')\bigr) \\
&= \bigl((g \wedge (c(f')e')) \vee (g \wedge (c(f')e')'),\ (g \wedge (c(e')f')) \vee (g \wedge (c(e')f')')\bigr) \\
&= (g,g).
\end{align*}
Therefore, (iii) holds.

(iii) ⇒ (i) Suppose that $(g,h)$ satisfies (4.4), and let e be an element of P(A). Then, $(e,e)$ lies in CP(A), $(e,e)' = (e',e')$ and $(e,e)'' = (e,e)$. Therefore, by Corollary 2.5,
\begin{align*}
(g,h) \wedge (e,e)' &= \bigl(c(h \wedge e')(g \wedge e'),\ c(g \wedge e')(h \wedge e')\bigr), \\
(g,h) \wedge (e,e)'' &= \bigl(c(h \wedge e)(g \wedge e),\ c(g \wedge e)(h \wedge e)\bigr).
\end{align*}
Since $(g,h)$ satisfies (4.4), it follows that
\[ g = (c(h \wedge e')(g \wedge e')) \vee (c(h \wedge e)(g \wedge e)) \le (g \wedge e') \vee (g \wedge e) \le g. \]
Therefore,
\[ g = (g \wedge e') \vee (g \wedge e) \]
and, since P(A) is a complete orthomodular lattice, g lies in ZP(A). Similarly, h also lies in ZP(A) and, being centrally equivalent, g and h are equal.

This result shows that the centre ZCP(A) of the complete ∗-lattice CP(A) is the image of the centre ZP(A) of the complete orthomodular lattice P(A) under the canonical ortho-order isomorphism $e \mapsto (e,e)$ from P(A) onto the ′-invariant sub-complete lattice of ∗-invariant elements of CP(A). Therefore, the centre ZCP(A) of CP(A) is a sub-complete Boolean lattice of CP(A). Two elements $(e_1,f_1)$ and $(e_2,f_2)$ of CP(A) are said to be centrally orthogonal if and only if there exists an element $(z,z)$ in ZCP(A) such that $(e_1,f_1) \le (z,z)$ and $(e_2,f_2) \le (z,z)'$. Observe that centrally orthogonal elements are orthogonal, and, therefore, compatible.

Theorem 4.6. Let A be a W∗-algebra, let P(A) be the complete orthomodular lattice of projections in A, with centre ZP(A), and let CP(A) be the complete ∗-lattice of centrally equivalent pairs of elements of P(A). Then:
(i) the elements $(e_1,f_1)$ and $(e_2,f_2)$ of CP(A) are centrally orthogonal if and only if $c(e_1,f_1)$ and $c(e_2,f_2)$ are orthogonal in ZP(A) and, in this case, $(e_1,f_1) \vee (e_2,f_2) = (e_1 + e_2, f_1 + f_2)$;
(ii) for each element $(e,f)$ in CP(A) and each element $(z,z)$ in ZCP(A), the pairs $(ze,zf)$ and $(z'e,z'f)$ are centrally orthogonal elements of CP(A) such that $(ze,zf) \vee (z'e,z'f) = (e,f)$.

Proof. (i) Suppose that $(e_1,f_1)$ and $(e_2,f_2)$ are centrally orthogonal. Then, there exists an element z in ZP(A) such that $(e_1,f_1) \le (z,z)$ and $(e_2,f_2) \le (z,z)'$. Hence $e_1 \le z$ and $e_2 \le z'$, and it follows that $c(e_1,f_1) \le z$ and $c(e_2,f_2) \le z'$ as required. Conversely, if $c(e_1,f_1)$ and $c(e_2,f_2)$ are orthogonal, then $(e_1,f_1) \le (c(e_1,f_1), c(e_1,f_1))$ and
\[ (e_2,f_2) \le (c(e_2,f_2), c(e_2,f_2)) \le (c(e_1,f_1)', c(e_1,f_1)') = (c(e_1,f_1), c(e_1,f_1))', \]
and it follows that $(e_1,f_1)$ and $(e_2,f_2)$ are centrally orthogonal. Furthermore,
\[ (e_1,f_1) \vee (e_2,f_2) = (e_1 \vee e_2, f_1 \vee f_2) = (e_1 + e_2, f_1 + f_2), \]
since $e_1 \perp e_2$ and $f_1 \perp f_2$.

(ii) Observe that, by Lemma 2.1, $c(ze) = zc(e) = zc(f) = c(zf)$. Therefore, $(ze,zf)$ and, similarly, $(z'e,z'f)$ lie in CP(A). Moreover,
\[ (ze,zf) \le (z,z), \qquad (z'e,z'f) \le (z,z)', \]
and so $(ze,zf)$ and $(z'e,z'f)$ are centrally orthogonal. The final result follows from (i).

5. Rigid Collinearity in the Complete ∗-Lattice CP(A)

A second notion borrowed from Jordan theory, that of rigid collinearity, is explored in this section. In order to do this, it is often necessary to decompose the W∗-algebra A centrally relative to an element of CP(A). The next lemma shows how this may be achieved.

Lemma 5.1. Let A be a W∗-algebra, let P(A) be the complete orthomodular lattice of projections in A, with centre ZP(A), let CP(A) be the complete ∗-lattice of centrally equivalent pairs of elements of P(A), and let $(e,f)$ be an element of CP(A). Then:
(i) $c(e,f)c(e')c(f')$, $c(e,f)c(e')c(f')'$, $c(e,f)c(e')'c(f')$, $c(e')'c(f')'$ and $c(e,f)'$ are pairwise orthogonal elements of ZP(A) with sum 1;
(ii)
\begin{align*}
c(e,f)c(e')c(f')A &= c(e')c(f')eAf + c(f')e'Af + c(e')eAf' + c(e,f)e'Af'; \\
c(e,f)c(e')c(f')'A &= c(e')c(f')'Af; \\
c(e,f)c(e')'c(f')A &= c(e')'c(f')eA; \\
c(e')'c(f')'A &= c(e')'c(f')'eAf; \\
c(e,f)'A &= c(e,f)'e'Af'.
\end{align*}
Proof. (i) This is a straightforward application of Lemma 2.1(ii) applied to the projections e and f.

(ii) Since $A = eAf + e'Af + eAf' + e'Af'$, the assertion is easily proved using (i). For example,
\begin{align*}
c(e,f)c(e')c(f')'A &= c(e,f)c(e')c(f')'(eAf + e'Af) + c(e,f)c(e')c(f')'(eAf' + e'Af') \\
&= c(e,f)c(e')c(f')'Af + 0,
\end{align*}
by Lemma 2.1(i).
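Part (i) of Lemma 5.1 can be illustrated by an exhaustive check in a small model. The sketch below assumes A is a direct sum of three copies of $M_2(\mathbb{C})$ and enumerates only diagonal projections, so that central projections become sets of block indices and "pairwise orthogonal with sum 1" becomes "partition of the set of blocks". The function names are hypothetical.

```python
from itertools import combinations, product

BLOCKS, DIM = 3, 2  # assumption: A = M_2(C) + M_2(C) + M_2(C), diagonal projections only
FULL = frozenset(range(DIM))
ALL_BLOCKS = frozenset(range(BLOCKS))
SUBSETS = [frozenset(c) for r in range(DIM + 1) for c in combinations(range(DIM), r)]
PROJECTIONS = list(product(SUBSETS, repeat=BLOCKS))

def comp(p):
    """Orthocomplement p', block by block."""
    return tuple(FULL - block for block in p)

def central_support(p):
    """Set of block indices on which p is non-zero; this models c(p)."""
    return frozenset(j for j, block in enumerate(p) if block)

def five_central_projections(e, f):
    """The five central projections of Lemma 5.1(i), as sets of block indices."""
    c = central_support(e)  # equals central_support(f) for pairs in CP(A)
    ce, cf = central_support(comp(e)), central_support(comp(f))
    not_ce, not_cf = ALL_BLOCKS - ce, ALL_BLOCKS - cf
    return [c & ce & cf, c & ce & not_cf, c & not_ce & cf,
            not_ce & not_cf, ALL_BLOCKS - c]

def is_partition(parts):
    """Pairwise disjoint with union all blocks (pairwise orthogonal, sum 1)."""
    total = sum(len(p) for p in parts)
    return total == BLOCKS and frozenset().union(*parts) == ALL_BLOCKS

# CP(A): pairs of projections with equal central support
CP = [(e, f) for e in PROJECTIONS for f in PROJECTIONS
      if central_support(e) == central_support(f)]
lemma_holds = all(is_partition(five_central_projections(e, f)) for e, f in CP)
print(len(CP), lemma_holds)
```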
The next result is the first step along the path leading to the definition of rigid collinearity for a pair $(e_1,f_1)$ and $(e_2,f_2)$ of elements of CP(A).

Lemma 5.2. Let A be a W∗-algebra, let P(A) be the complete orthomodular lattice of projections in A, with centre ZP(A), let CP(A) be the complete ∗-lattice of centrally equivalent pairs of elements of P(A), and let $(e_1,f_1)$ and $(e_2,f_2)$ be elements of CP(A) for which the weak∗-closed inner ideal $A_2(e_1,f_1) = e_1Af_1$ is contained in the weak∗-closed subtriple $A_1(e_2,f_2) = e_2Af_2' + e_2'Af_2$. Then:
(i) $(e_1,f_1)$ and $(e_2,f_2)$ are compatible;
(ii) there exist pairwise orthogonal elements $z_1$, $z_2$ and $z_3$ in ZP(A) of sum 1 such that
\[ z_1e_1 \le z_1e_2, \quad z_1f_1 \le z_1f_2', \quad z_2e_1 \le z_2e_2', \quad z_2f_1 \le z_2f_2, \quad z_3e_1 = z_3f_1 = 0, \]
and
\[ z_1e_1Af_1 \subseteq z_1e_2Af_2', \quad z_2e_1Af_1 \subseteq z_2e_2'Af_2, \quad z_3e_1Af_1 = \{0\}; \]
(iii) there exist pairwise orthogonal elements $z_{31}$, $z_{32}$, $z_{33}$ and $z_{34}$ of sum $z_3$ such that
\begin{align*}
z_{31}(e_2Af_2' + e_2'Af_2) &= \{0\}, \\
z_{32}(e_2Af_2' + e_2'Af_2) &= z_{32}e_2Af_2', \\
z_{33}(e_2Af_2' + e_2'Af_2) &= z_{33}e_2'Af_2.
\end{align*}

Proof. This result is proved by decomposing A centrally using Lemma 5.1 applied to the two elements $(e_1,f_1)$ and $(e_2,f_2)$. This leads to a decomposition of A into twenty-five weak∗-closed ideals each of which can be considered separately. In order to simplify the notation, for $(e,f)$ in CP(A), let the pairwise orthogonal central projections $c_1(e,f), c_2(e,f), \ldots, c_5(e,f)$ of sum 1 be defined by
\begin{align*}
c_1(e,f) &= c(e,f)c(e')c(f'), & c_2(e,f) &= c(e,f)c(e')c(f')', & c_3(e,f) &= c(e,f)c(e')'c(f'), \\
c_4(e,f) &= c(e')'c(f')', & c_5(e,f) &= c(e,f)'.
\end{align*}
For j and k equal to 1, 2, . . . , 5, let
\[ c_{jk} = c_j(e_1,f_1)c_k(e_2,f_2). \]
Using Lemma 5.1(ii) observe that, for k equal to 1, 2, . . . , 5, $c_{5k}e_1Af_1 = \{0\}$, and, therefore, from Lemma 2.2,
\[ c_{5k}e_1 = c_{5k}f_1 = 0. \tag{5.1} \]
Furthermore, again using Lemma 5.1(ii), for j equal to 1, 2, . . . , 5,
\[ c_{j4}(e_2Af_2' + e_2'Af_2) = c_{j5}(e_2Af_2' + e_2'Af_2) = \{0\}. \]
Consequently,
\[ c_{j4}e_1Af_1 = c_{j5}e_1Af_1 = \{0\}, \]
and, by Lemma 2.2,
\[ c_{j4}e_1 = c_{j4}f_1 = c_{j5}e_1 = c_{j5}f_1 = 0. \tag{5.2} \]
It follows that, for j equal to 1, 2, . . . , 5 and k equal to 4 or 5,
\[ c_{jk}[e_1,e_2] = c_{jk}[f_1,f_2] = 0, \tag{5.3} \]
and, for k equal to 1, 2 or 3,
\[ c_{5k}[e_1,e_2] = c_{5k}[f_1,f_2] = 0. \tag{5.4} \]
Now observe that, for j equal to 1, 2, 3 or 4, using Lemma 5.1(ii),
\[ c_{j2}e_1Af_1 \subseteq c_{j2}e_2'Af_2, \qquad c_{j3}e_1Af_1 \subseteq c_{j3}e_2Af_2'. \]
Again using Lemma 2.2, it follows that
\[ c_{j2}e_1 \le c_{j2}e_2', \quad c_{j2}f_1 \le c_{j2}f_2, \quad c_{j3}e_1 \le c_{j3}e_2, \quad c_{j3}f_1 \le c_{j3}f_2'. \tag{5.5} \]
Therefore, for j equal to 1, 2, 3 or 4, and k equal to 2 or 3,
\[ c_{jk}[e_1,e_2] = c_{jk}[f_1,f_2] = 0. \tag{5.6} \]
To complete the proof of (i), it remains to consider the four cases associated with the central projections $c_{11}$, $c_{21}$, $c_{31}$ and $c_{41}$. Observe that, since, for j equal to 1, 2, 3 or 4,
\[ c_{j1}e_1Af_1 \subseteq c_{j1}e_2Af_2' + c_{j1}e_2'Af_2, \]
multiplying on the left by $e_2$ and $e_2'$ in turn,
\[ c_{j1}e_2e_1Af_1 \subseteq c_{j1}e_2Af_2', \qquad c_{j1}e_2'e_1Af_1 \subseteq c_{j1}e_2'Af_2. \]
By [10], Theorem 3.4, the weak∗-closure of the inner ideal $c_{j1}e_2e_1Af_1$ is equal to $c_{j1}e(e_2e_1)Af_1$, where $e(e_2e_1)$ is the left support of the element $e_2e_1$ of A. It follows from Lemma 2.2 and Lemma 2.4 that
\begin{align}
c_{j1}e(e_2e_1) &\le c_{j1}e_2, & c_{j1}e(e_2'e_1) &\le c_{j1}e_2', \tag{5.7} \\
c_{j1}c(e(e_2e_1))f_1 &\le c_{j1}f_2', & c_{j1}c(e(e_2'e_1))f_1 &\le c_{j1}f_2. \tag{5.8}
\end{align}
Hence,
\[ c_{j1}c(e(e_2e_1))[f_1,f_2] = c_{j1}c(e(e_2'e_1))[f_1,f_2] = 0. \tag{5.9} \]
Observe that
\[ c_{j1}e_2e_1Af_1 + c_{j1}e_2'e_1Af_1 = c_{j1}e_1Af_1, \]
and, taking the weak∗-closures of the two inner ideals summed in this equation, since $c_{j1}e_1Af_1$ is weak∗-closed,
\[ c_{j1}e(e_2e_1)Af_1 + c_{j1}e(e_2'e_1)Af_1 = c_{j1}e_1Af_1. \]
Being projections onto the ranges of the elements $e_2e_1$ and $e_2'e_1$ of A, the projections $e(e_2e_1)$ and $e(e_2'e_1)$ satisfy
\[ e(e_2e_1)e(e_2'e_1) = e(e_2'e_1)e(e_2e_1) = 0. \tag{5.10} \]
It follows from Lemma 2.2 and (5.10) that
\[ c_{j1}\bigl(c(e(e_2e_1)) \vee c(e(e_2'e_1))\bigr)f_1 = c_{j1}f_1, \tag{5.11} \]
and, taking central supports,
\[ c_{j1}\bigl(c(e(e_2e_1)) \vee c(e(e_2'e_1))\bigr) = c_{j1}. \tag{5.12} \]
Therefore, from (5.9),
\[ c_{j1}[f_1,f_2] = c_{j1}\bigl(c(e(e_2e_1)) + c(e(e_2'e_1)) - c(e(e_2e_1))c(e(e_2'e_1))\bigr)[f_1,f_2] = 0. \tag{5.13} \]
Similarly,
\[ c_{j1}e_1Af_1f_2 \subseteq c_{j1}e_2'Af_2, \qquad c_{j1}e_1Af_1f_2' \subseteq c_{j1}e_2Af_2', \]
and the same argument shows that
\[ c_{j1}[e_1,e_2] = 0. \tag{5.14} \]
Combining (5.3), (5.4), (5.6), (5.13) and (5.14) shows that
\[ [e_1,e_2] = \sum_{j,k=1}^{5} c_{jk}[e_1,e_2] = 0, \qquad [f_1,f_2] = \sum_{j,k=1}^{5} c_{jk}[f_1,f_2] = 0, \]
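and the proof of (i) is complete. To prove (ii) first observe that, by (5.1), (5.2) and (5.5),
\begin{align}
&\sum_{k=1}^{5} c_{5k}e_1 = \sum_{k=1}^{5} c_{5k}f_1 = \sum_{j=1}^{4} c_{j4}e_1 = \sum_{j=1}^{4} c_{j5}e_1 = \sum_{j=1}^{4} c_{j4}f_1 = \sum_{j=1}^{4} c_{j5}f_1 = 0, \notag \\
&\sum_{j=1}^{4} c_{j2}e_1 \le \sum_{j=1}^{4} c_{j2}e_2', \qquad \sum_{j=1}^{4} c_{j2}f_1 \le \sum_{j=1}^{4} c_{j2}f_2, \tag{5.15} \\
&\sum_{j=1}^{4} c_{j3}e_1 \le \sum_{j=1}^{4} c_{j3}e_2, \qquad \sum_{j=1}^{4} c_{j3}f_1 \le \sum_{j=1}^{4} c_{j3}f_2'. \notag
\end{align}
Since $e_1$ and $e_2$ commute, from (5.12), for j equal to 1, 2, 3 or 4,
\[ c_{j1}\bigl(c(e_2e_1) \vee c(e_2'e_1)\bigr) = c_{j1}, \tag{5.16} \]
and, from (5.7),
\[ c_{j1}c(e_2e_1)f_1 \le c_{j1}f_2', \qquad c_{j1}c(e_2'e_1)f_1 \le c_{j1}f_2. \tag{5.17} \]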
Similarly,
\[ c_{j1}\bigl(c(f_1f_2) \vee c(f_1f_2')\bigr) = c_{j1}, \]
and
\[ c_{j1}c(f_1f_2)e_1 \le c_{j1}e_2', \qquad c_{j1}c(f_1f_2')e_1 \le c_{j1}e_2. \]
From (5.16), it can be seen that
\[ c_{j1}c(e_2e_1)c(e_2'e_1)' + c_{j1}c(e_2'e_1)c(e_2e_1)' + c_{j1}c(e_2e_1)c(e_2'e_1) = c_{j1}. \]
To abbreviate the notation, let
\[ d_{j101} = c_{j1}c(e_2e_1)c(e_2'e_1), \quad d_{j102} = c_{j1}c(e_2e_1)c(e_2'e_1)', \quad d_{j103} = c_{j1}c(e_2'e_1)c(e_2e_1)', \]
observing that
\[ \sum_{l=1}^{3} d_{j10l} = c_{j1}. \]
Then, using (5.11) and (5.17),
\begin{align}
d_{j101}f_1 &= c_{j1}c(e_2e_1)c(e_2'e_1)f_1 = 0, \tag{5.18} \\
d_{j102}f_1 &= c_{j1}c(e_2e_1)c(e_2'e_1)'f_1 \le c_{j1}c(e_2e_1)c(e_2'e_1)'f_2', \tag{5.19} \\
d_{j103}f_1 &= c_{j1}c(e_2'e_1)c(e_2e_1)'f_1 \le c_{j1}c(e_2'e_1)c(e_2e_1)'f_2. \tag{5.20}
\end{align}
Similarly, defining
\[ d_{j110} = c_{j1}c(f_2f_1)c(f_2'f_1), \quad d_{j120} = c_{j1}c(f_2f_1)c(f_2'f_1)', \quad d_{j130} = c_{j1}c(f_2'f_1)c(f_2f_1)', \]
it can be seen that
\begin{align}
d_{j110}e_1 &= c_{j1}c(f_2f_1)c(f_2'f_1)e_1 = 0, \tag{5.21} \\
d_{j120}e_1 &= c_{j1}c(f_2f_1)c(f_2'f_1)'e_1 \le c_{j1}c(f_2f_1)c(f_2'f_1)'e_2', \tag{5.22} \\
d_{j130}e_1 &= c_{j1}c(f_2'f_1)c(f_2f_1)'e_1 \le c_{j1}c(f_2'f_1)c(f_2f_1)'e_2. \tag{5.23}
\end{align}
For k and l equal to 1, 2 or 3, define
\[ d_{j1kl} = d_{j1k0}d_{j10l}, \tag{5.24} \]
noticing that
\[ \sum_{k,l=1}^{3} d_{j1kl} = c_{j1}. \tag{5.25} \]
Therefore the W∗-algebra $c_{j1}A$ can be decomposed centrally into nine weak∗-closed ideals, each of which can be considered separately. First observe that, for k equal to 1, 2 or 3, by (5.18) and (5.24), $d_{j1k1}f_1 = 0$ and, for l equal to 1, 2 or 3, by (5.21) and (5.24), $d_{j11l}e_1 = 0$. Hence,
\[ d_{j1k1} = d_{j1k1}c(e_1,f_1) = 0, \qquad d_{j11l} = d_{j11l}c(e_1,f_1) = 0. \tag{5.26} \]
Moreover, by (5.19), (5.22) and (5.24), $d_{j122}f_1 \le d_{j122}f_2'$ and $d_{j122}e_1 \le d_{j122}e_2'$, and it follows that
\[ d_{j122}e_1Af_1 \subseteq (d_{j122}e_2'Af_2') \cap (d_{j122}(e_2Af_2' + e_2'Af_2)) = \{0\}. \]
Therefore, by Lemma 2.2,
\[ d_{j122} = 0, \tag{5.27} \]
and, similarly,
\[ d_{j133} = 0. \tag{5.28} \]
From (5.25), (5.26), (5.27) and (5.28), it can be seen that
\[ c_{j1} = d_{j123} + d_{j132}, \tag{5.29} \]
where
\[ d_{j123}e_1 \le d_{j123}e_2', \qquad d_{j123}f_1 \le d_{j123}f_2, \tag{5.30} \]
and
\[ d_{j132}e_1 \le d_{j132}e_2, \qquad d_{j132}f_1 \le d_{j132}f_2'. \tag{5.31} \]
Combining (5.15) with (5.30) and (5.31), and choosing
\[ z_1 = \sum_{j=1}^{4} c_{j3} + \sum_{j=1}^{4} d_{j132}, \qquad z_2 = \sum_{j=1}^{4} c_{j2} + \sum_{j=1}^{4} d_{j123}, \]
and $z_3 = 1 - z_1 - z_2$, it can be seen that
\[ z_3e_1 = z_3f_1 = 0, \quad z_1e_1 \le z_1e_2, \quad z_1f_1 \le z_1f_2', \quad z_2e_1 \le z_2e_2', \quad z_2f_1 \le z_2f_2, \]
as required. Finally, observe that
\[ z_3 = \sum_{k=1}^{5} c_{5k} + \sum_{j=1}^{4} (c_{j4} + c_{j5}), \]
and, for j equal to 1, 2, . . . , 5,
\begin{align*}
c_{j4}(e_2Af_2' + e_2'Af_2) &= c_{j5}(e_2Af_2' + e_2'Af_2) = \{0\}, \\
c_{52}(e_2Af_2' + e_2'Af_2) &= c_{52}e_2'Af_2, \\
c_{53}(e_2Af_2' + e_2'Af_2) &= c_{53}e_2Af_2'.
\end{align*}
Choosing
\[ z_{31} = \sum_{j=1}^{5} (c_{j4} + c_{j5}), \qquad z_{32} = c_{53}, \qquad z_{33} = c_{52}, \]
completes the proof of (iii).
Armed with this result it is now possible to state the first main theorem of this section.
Theorem 5.3. Let A be a W∗-algebra, let P(A) be the complete orthomodular lattice of projections in A, with centre ZP(A), let CP(A) be the complete ∗-lattice of centrally equivalent pairs of elements of P(A), and let $(e_1,f_1)$ and $(e_2,f_2)$ be elements of CP(A). Then
\[ A_2(e_1,f_1) \subseteq A_1(e_2,f_2), \qquad A_2(e_2,f_2) \subseteq A_1(e_1,f_1), \]
if and only if there exist pairwise orthogonal elements $w_1$, $w_2$ and $w_3$ of ZP(A) of sum 1 such that
\begin{align*}
w_1e_1 &= w_1e_2, & w_1f_1 &\perp w_1f_2, \\
w_2e_1 &\perp w_2e_2, & w_2f_1 &= w_2f_2, \\
w_3e_1 &= w_3e_2 = w_3f_1 = w_3f_2 = 0,
\end{align*}
and in this case
\[ c(e_1,f_1) = c(e_2,f_2) \le w_1 + w_2. \]
Furthermore, a unique triple $w_1$, $w_2$ and $w_3$ of elements of ZP(A) satisfying the conditions of the theorem and $w_1 \le c(e_1,f_1)$ and $w_2 \le c(e_1,f_1)$ exists. In this case:
(i) $c(e_1,f_1) = c(e_2,f_2) = w_1 + w_2$;
(ii) writing
\[ w_1e_1 = w_1e_2 = e_0, \quad w_2f_1 = w_2f_2 = f_0, \quad w_2e_1 + w_2e_2 = e, \quad w_1f_1 + w_1f_2 = f, \]
$c(e_0) = w_1$ and $c(f_0) = w_2$, and $(e_0,f)$ and $(e,f_0)$ are centrally orthogonal elements of CP(A) such that
\[ (e_1,f_1) \vee (e_2,f_2) = (e_0,f) \vee (e,f_0) = (e_0 + e, f + f_0). \]

Proof. Let $(e_1,f_1)$ and $(e_2,f_2)$ be elements of CP(A) for which
\[ e_1Af_1 \subseteq e_2Af_2' + e_2'Af_2, \qquad e_2Af_2 \subseteq e_1Af_1' + e_1'Af_1. \]
The notation used in the proof of Lemma 5.2 will be retained. Observe that, using the results of Lemma 5.2, for j equal to 4 or 5 and k equal to 1, 2, . . . , 5,
\[ c_{jk}e_1 = c_{jk}e_2 = c_{jk}f_1 = c_{jk}f_2 = 0 \tag{5.32} \]
and, by symmetry, the same result holds for k equal to 4 or 5 and j equal to 1, 2, . . . , 5. There remain nine weak∗-closed ideals in A to be considered. Since
\[ c_{33}e_1Af_1 \subseteq c_{33}e_2Af_2', \qquad c_{33}e_2Af_2 \subseteq c_{33}e_1Af_1', \]
it follows from Lemma 2.2 that
\[ c_{33}e_1 = c_{33}e_2, \qquad c_{33}f_1 \le c_{33}f_2' \tag{5.33} \]
and, similarly, that
\[ c_{22}e_1 \le c_{22}e_2', \qquad c_{22}f_1 = c_{22}f_2. \tag{5.34} \]
Furthermore, since
\[ c_{23}e_1Af_1 \subseteq c_{23}e_2Af_2', \qquad c_{23}e_2Af_2 \subseteq c_{23}e_1'Af_1, \]
it can be seen that
\[ c_{23}e_1 = c_{23}e_2 = c_{23}f_1 = c_{23}f_2 = 0, \tag{5.35} \]
and, similarly, that
\[ c_{32}e_1 = c_{32}e_2 = c_{32}f_1 = c_{32}f_2 = 0. \tag{5.36} \]
Again using the same notation as that used in the proof of Lemma 5.2,
\begin{align*}
d_{2123}e_1Af_1 &\subseteq d_{2123}e_2'Af_2, & d_{2123}e_2Af_2 &\subseteq d_{2123}e_1'Af_1, \\
d_{2132}e_1Af_1 &\subseteq d_{2132}e_2Af_2', & d_{2132}e_2Af_2 &\subseteq d_{2132}e_1'Af_1.
\end{align*}
Hence,
\begin{align}
&d_{2123}e_1 \le d_{2123}e_2', \qquad d_{2123}f_1 = d_{2123}f_2, \tag{5.37} \\
&d_{2132}e_1 = d_{2132}f_1 = d_{2132}e_2 = d_{2132}f_2 = 0. \tag{5.38}
\end{align}
Similarly,
\begin{align}
&d_{3123}e_1 = d_{3123}e_2 = d_{3123}f_1 = d_{3123}f_2 = 0, \tag{5.39} \\
&d_{3132}e_1 = d_{3132}e_2, \qquad d_{3132}f_1 \le d_{3132}f_2'. \tag{5.40}
\end{align}
For j equal to 1, 2 or 3, define
\begin{align*}
\hat d_{1j01} &= c_{1j}c(e_1e_2)c(e_1'e_2), & \hat d_{1j02} &= c_{1j}c(e_1e_2)c(e_1'e_2)', & \hat d_{1j03} &= c_{1j}c(e_1'e_2)c(e_1e_2)', \\
\hat d_{1j10} &= c_{1j}c(f_1f_2)c(f_1'f_2), & \hat d_{1j20} &= c_{1j}c(f_1f_2)c(f_1'f_2)', & \hat d_{1j30} &= c_{1j}c(f_1'f_2)c(f_1f_2)',
\end{align*}
and, for k and l equal to 1, 2 or 3, let
\[ \hat d_{1jkl} = \hat d_{1jk0}\hat d_{1j0l}. \]
As in the proof of Lemma 5.2,
\[ c_{1j} = \hat d_{1j23} + \hat d_{1j32}, \]
with all other elements $\hat d_{1jkl}$ equal to zero. Moreover, as before,
\begin{align}
&\hat d_{1223}f_1 = \hat d_{1223}f_2, \qquad \hat d_{1223}e_1 \le \hat d_{1223}e_2', \tag{5.41} \\
&\hat d_{1232}e_1 = \hat d_{1232}f_1 = \hat d_{1232}e_2 = \hat d_{1232}f_2 = 0, \tag{5.42} \\
&\hat d_{1323}e_1 = \hat d_{1323}f_1 = \hat d_{1323}e_2 = \hat d_{1323}f_2 = 0, \tag{5.43} \\
&\hat d_{1332}e_1 = \hat d_{1332}e_2, \qquad \hat d_{1332}f_1 \le \hat d_{1332}f_2'. \tag{5.44}
\end{align}
It remains to consider the weak∗-closed ideal $c_{11}A$. First, observe that
\begin{align*}
d_{1123} &= c_{11}c(e_2'e_1)c(e_2e_1)'c(f_2f_1)c(f_2'f_1)', \\
d_{1132} &= c_{11}c(f_2'f_1)c(f_2f_1)'c(e_2e_1)c(e_2'e_1)', \\
\hat d_{1123} &= c_{11}c(f_1f_2)c(f_1'f_2)'c(e_1'e_2)c(e_1e_2)', \\
\hat d_{1132} &= c_{11}c(e_1e_2)c(e_1'e_2)'c(f_1'f_2)c(f_1f_2)',
\end{align*}
from which it follows that
\[ d_{1132}\hat d_{1123} = d_{1123}\hat d_{1132} = 0. \]
Moreover,
\[ d_{1132}\hat d_{1132}e_1Af_1 \subseteq d_{1132}\hat d_{1132}e_2Af_2', \qquad d_{1132}\hat d_{1132}e_2Af_2 \subseteq d_{1132}\hat d_{1132}e_1Af_1', \]
and it follows that
\[ d_{1132}\hat d_{1132}e_1 = d_{1132}\hat d_{1132}e_2, \qquad d_{1132}\hat d_{1132}f_1 \le d_{1132}\hat d_{1132}f_2', \tag{5.45} \]
and, similarly, that
\[ d_{1123}\hat d_{1123}e_1 \le d_{1123}\hat d_{1123}e_2', \qquad d_{1123}\hat d_{1123}f_1 = d_{1123}\hat d_{1123}f_2. \tag{5.46} \]
Using Eqs. (5.37–5.46) it can be seen that the pairwise orthogonal elements
\begin{align*}
w_1 &= c_{33} + d_{3132} + \hat d_{1332} + d_{1132}\hat d_{1132}, \\
w_2 &= c_{22} + d_{2123} + \hat d_{1223} + d_{1123}\hat d_{1123}, \\
w_3 &= 1 - w_1 - w_2,
\end{align*}
of ZP(A) satisfy the required conditions.

Conversely, suppose that the pairwise orthogonal elements $w_1$, $w_2$ and $w_3$ of ZP(A) exist and satisfy the conditions of the theorem. Then,
\begin{align*}
A_2(e_1,f_1) = e_1Af_1 &= w_1e_1Af_1 + w_2e_1Af_1 = w_1e_2Af_1 + w_2e_1Af_2 \\
&\subseteq w_1e_2Af_2' + w_2e_2'Af_2 \subseteq (w_1 + w_2)(e_2Af_2' + e_2'Af_2) \\
&\subseteq e_2Af_2' + e_2'Af_2 = A_1(e_2,f_2).
\end{align*}
Similarly, $A_2(e_2,f_2)$ is contained in $A_1(e_1,f_1)$. Observe that, taking central supports and using Lemma 2.1,
\begin{align}
&w_1c(e_1,f_1) = c(w_1e_1) = c(w_1e_2) = w_1c(e_2,f_2), \tag{5.47} \\
&w_2c(e_1,f_1) = c(w_2f_1) = c(w_2f_2) = w_2c(e_2,f_2), \tag{5.48} \\
&w_3c(e_1,f_1) = c(w_3e_1) = 0, \qquad w_3c(e_2,f_2) = c(w_3e_2) = 0. \tag{5.49}
\end{align}
Adding (5.47), (5.48) and (5.49) it can be seen that $c(e_1,f_1)$ and $c(e_2,f_2)$ coincide, and it follows from (5.49) that
\[ c(e_1,f_1) \le w_3' = w_1 + w_2. \]
Observe that the central projection $w_1$ constructed in the first part of the proof is given by
\[ w_1 = c_{33} + d_{3132} + \hat d_{1332} + d_{1132}\hat d_{1132}, \]
and that each element of this sum is dominated by $c(e_1,f_1)$. Hence $w_1$ and, similarly, $w_2$ is dominated by $c(e_1,f_1)$. It follows from above that
\[ c(e_1,f_1) = c(e_2,f_2) = w_1 + w_2. \tag{5.50} \]
To show that this defines $w_1$, $w_2$ and $w_3$ uniquely, suppose that $v_1$, $v_2$ and $v_3$ is another such set of elements of ZP(A). Then, from (5.50),
\[ w_3 = v_3, \qquad w_1 + w_2 = v_1 + v_2. \tag{5.51} \]
Observe that
\[ w_1v_2e_1 = w_1v_2e_2, \qquad w_1v_2e_1 \perp w_1v_2e_2, \]
and, therefore w1 v2 c(e1 , f1 ) = 0. By (5.50), it follows that w1 v2 ≤ w3 and, since w1 v2 ≤ w1 ≤ w30 , it can be seen that w1 v2 = 0. Therefore, from (5.51), v2 = v2 (v1 + v2 ) = v2 (w1 + w2 ) = v2 w2 , and v2 ≤ w2 . By symmetry w2 ≤ v2 and, therefore, v2 = w2 and, hence, v1 = w1 . This completes the proof of the uniqueness. Finally, by Lemma 2.1, c(e0 ) = c(w1 e1 ) = w1 c(e1 , f1 ) = w1 (w1 + w2 ) = w1 , and, similarly c(f0 ) = w2 . Furthermore, by Theorem 4.6(ii), c(e) = c(w2 e1 ) ∨ c(w2 e2 ) = w2 c(e1 , f1 ) ∨ w2 c(e2 , f2 ) = w2 , and, similarly, c(f ) = w1 . Hence (e0 , f ) and (e, f0 ) are centrally orthogonal elements of CP(A) such that (e1 , f1 ) ∨ (e2 , f2 ) = (e1 ∨ e2 , f1 ∨ f2 ) = (w1 e1 ∨ w1 e2 + w2 e1 ∨ w2 e2 , w1 f1 ∨ w1 f2 + w2 f1 ∨ w2 f2 ) = (e0 + e, f + f0 ), as required. This completes the proof of the theorem.
This theorem motivates the following definition. A pair $(e_1,f_1)$ and $(e_2,f_2)$ of elements of the complete ∗-lattice CP(A) of centrally equivalent pairs of projections in the W∗-algebra A is said to be rigidly collinear if there exist pairwise orthogonal central projections $w_1$, $w_2$ and $w_3$ of sum 1 satisfying the conditions of Theorem 5.3. Consequently, the main result of the theorem can be restated in the following manner.

Corollary 5.4. Let $(e_1,f_1)$ and $(e_2,f_2)$ be elements of CP(A). Then $(e_1,f_1)$ and $(e_2,f_2)$ are rigidly collinear if and only if
\[ A_2(e_1,f_1) \subseteq A_1(e_2,f_2), \qquad A_2(e_2,f_2) \subseteq A_1(e_1,f_1). \]

It follows from Lemma 5.2(i) that a pair of rigidly collinear elements in CP(A) is compatible. The notion of rigid collinearity can be extended to any family $((e_j,f_j))_{j \in \Lambda}$ of elements of CP(A). Such a family is said to be rigidly collinear if every pair of distinct elements of the family is rigidly collinear. The next result describes such families.

Theorem 5.5. Let A be a W∗-algebra, let P(A) be the complete orthomodular lattice of projections in A, with centre ZP(A), let CP(A) be the complete ∗-lattice of centrally equivalent pairs of elements of P(A), and let $((e_j,f_j))_{j \in \Lambda}$ be a rigidly collinear family of elements of CP(A). For j and k in $\Lambda$ and l equal to 1, 2 or 3, let $w(j,k)_l$ be the unique element of ZP(A) such that
\begin{align*}
&w(j,k)_1e_j = w(j,k)_1e_k, \qquad w(j,k)_1f_j \perp w(j,k)_1f_k, \\
&w(j,k)_2e_j \perp w(j,k)_2e_k, \qquad w(j,k)_2f_j = w(j,k)_2f_k, \\
&w(j,k)_3e_j = w(j,k)_3f_j = w(j,k)_3e_k = w(j,k)_3f_k = 0, \\
&w(j,k)_1 \le c(e_j,f_j), \qquad w(j,k)_2 \le c(e_j,f_j), \qquad \sum_{l=1}^{3} w(j,k)_l = 1.
\end{align*}
Then, there exist uniquely pairwise orthogonal elements $w_1$, $w_2$ and $w_3$ of ZP(A) of sum 1 and elements $e_0$ and $f_0$ in P(A) such that:
(i) for all distinct j and k in $\Lambda$, and l equal to 1, 2 or 3, $w(j,k)_l = w_l$;
(ii) for all j in $\Lambda$,
\begin{align*}
w_1e_j &= e_0, & w_2f_j &= f_0, & w_3e_j &= w_3f_j = 0, \\
c(e_0) &= w_1, & c(f_0) &= w_2, & c(e_j,f_j) &= w_1 + w_2,
\end{align*}
and $(w_1f_j)_{j \in \Lambda}$ and $(w_2e_j)_{j \in \Lambda}$ are families of pairwise orthogonal elements of P(A);
(iii) writing
\[ e = \bigvee_{j \in \Lambda} w_2e_j, \qquad f = \bigvee_{j \in \Lambda} w_1f_j, \]
$(e_0,f)$ and $(e,f_0)$ are centrally orthogonal elements of CP(A) such that
\[ \bigvee_{j \in \Lambda} (e_j,f_j) = (e_0,f) \vee (e,f_0) = (e_0 + e, f_0 + f). \]

Proof. Notice that, in the case in which there exists j in $\Lambda$ such that $(e_j,f_j)$ is equal to $(0,0)$ then the same is true for all elements of $\Lambda$ and $w_3$ is equal to 1, with $w_1$ and $w_2$ both equal to zero. Otherwise, for distinct elements j and k of $\Lambda$, $(e_j,f_j)$ and $(e_k,f_k)$ are distinct elements of CP(A) or both would be zero. Therefore, it can be assumed that $(e_j,f_j)$ is non-zero and all elements of the family are distinct. Observe that, by Theorem 5.3(iii), for all j in $\Lambda$, $c(e_j,f_j)$ is the same element of ZP(A), which will be denoted by $w_3'$. Hence, for distinct elements j and k in $\Lambda$,
\begin{align}
w(j,k)_1 + w(j,k)_2 &= w_3', \tag{5.53} \\
w(j,k)_3 &= w_3. \tag{5.54}
\end{align}
The case in which $\Lambda$ consists of two elements is covered by Theorem 5.3. Therefore, let $j_1$, $j_2$ and $j_3$ be three distinct elements of $\Lambda$. It follows from Theorem 5.3 that
\begin{align*}
w(j_1,j_2)_1w(j_2,j_3)_2e_{j_1} &= w(j_1,j_2)_1w(j_2,j_3)_2e_{j_2} \perp w(j_1,j_2)_1w(j_2,j_3)_2e_{j_3}, \\
w(j_1,j_2)_1w(j_2,j_3)_2f_{j_1} &\perp w(j_1,j_2)_1w(j_2,j_3)_2f_{j_2} = w(j_1,j_2)_1w(j_2,j_3)_2f_{j_3},
\end{align*}
and, also using (5.54), that
\[ w(j_1,j_2)_1w(j_2,j_3)_2 \le w(j_1,j_3)_3 = w_3. \]
But, by (5.53), $w(j_1,j_2)_1w(j_2,j_3)_2 \le w_3'$, and it follows that
\[ w(j_1,j_2)_1w(j_2,j_3)_2 = 0. \tag{5.55} \]
Therefore, by (5.53) and (5.55),
\begin{align*}
w(j_1,j_2)_1 &= w(j_1,j_2)_1(w(j_1,j_2)_1 + w(j_1,j_2)_2) \\
&= w(j_1,j_2)_1(w(j_2,j_3)_1 + w(j_2,j_3)_2) \\
&= w(j_1,j_2)_1w(j_2,j_3)_1.
\end{align*}
Therefore, $w(j_1,j_2)_1 \le w(j_2,j_3)_1$, and, similarly, $w(j_2,j_3)_1 \le w(j_3,j_1)_1$ and $w(j_3,j_1)_1 \le w(j_1,j_2)_1$. Hence,
\[ w(j_1,j_2)_1 = w(j_2,j_3)_1 = w(j_3,j_1)_1. \]
Let the common value be denoted by $w_1$. Using (5.53), it follows that
\[ w(j_1,j_2)_2 = w(j_2,j_3)_2 = w(j_3,j_1)_2, \]
and the common value can be denoted by $w_2$. It can be seen that there exist elements $e_0$ and $f_0$ in P(A) such that
\begin{align*}
&w_3e_{j_1} = w_3e_{j_2} = w_3e_{j_3} = w_3f_{j_1} = w_3f_{j_2} = w_3f_{j_3} = 0, \\
&w_1e_{j_1} = w_1e_{j_2} = w_1e_{j_3} = e_0, \\
&w_2f_{j_1} = w_2f_{j_2} = w_2f_{j_3} = f_0,
\end{align*}
and $w_2e_{j_1}, w_2e_{j_2}, w_2e_{j_3}$ and $w_1f_{j_1}, w_1f_{j_2}, w_1f_{j_3}$ are families of pairwise orthogonal elements of P(A). Observe that,
\[ c(e_0) = c(w_1e_{j_1}) = w_1c(e_{j_1},f_{j_1}) = w_1(w_1 + w_2) = w_1, \]
and, similarly, $c(f_0) = w_2$. Part (ii) of the theorem follows immediately from these remarks. Finally, notice that, using (2.2),
\[ c(e) = \bigvee_{j \in \Lambda} c(w_2e_j) = \bigvee_{j \in \Lambda} w_2c(e_j,f_j) = \bigvee_{j \in \Lambda} w_2(w_1 + w_2) = w_2, \]
and, similarly, $c(f) = w_1$. Therefore $(e_0,f)$ and $(e,f_0)$ are centrally orthogonal elements of CP(A) such that
\begin{align*}
\bigvee_{j \in \Lambda} (e_j,f_j) &= \Bigl(\bigvee_{j \in \Lambda} e_j,\ \bigvee_{j \in \Lambda} f_j\Bigr) \\
&= \Bigl(\bigvee_{j \in \Lambda} w_1e_j + \bigvee_{j \in \Lambda} w_2e_j,\ \bigvee_{j \in \Lambda} w_1f_j + \bigvee_{j \in \Lambda} w_2f_j\Bigr) \\
&= (e_0 + e, f + f_0) = (e_0,f) \vee (e,f_0).
\end{align*}
Uniqueness follows immediately from Theorem 5.3, and the proof of the theorem is complete.

Recall that a W∗-algebra A is said to be a factor if the centre Z(A) of A consists of complex multiples of the unit in A. In this case Theorem 5.5 has a particularly simple form.

Corollary 5.6. Let A be a factor, let P(A) be the complete orthomodular lattice of projections in A, let CP(A) be the complete ∗-lattice of centrally equivalent pairs of elements of P(A), and let $((e_j,f_j))_{j \in \Lambda}$ be a rigidly collinear family of non-zero elements of CP(A). Then, either there exists an element $e_0$ in P(A) such that, for all j in $\Lambda$, $e_j$ is equal to $e_0$ and $(f_j)_{j \in \Lambda}$ is a family of pairwise orthogonal elements of P(A) or there exists an element $f_0$ in P(A) such that, for all j in $\Lambda$, $f_j$ is equal to $f_0$ and $(e_j)_{j \in \Lambda}$ is a family of pairwise orthogonal elements of P(A).
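For a two-element family, the dichotomy in Corollary 5.6 can be checked exhaustively in a small factor. The sketch below is only an illustration under stated assumptions: A is taken to be $M_3(\mathbb{C})$, only diagonal projections are enumerated (each stored as a set of indices), and the inclusions of Corollary 5.4 are tested on the matrix-unit positions that span $eAf$ and $eAf' + e'Af$. The function names are hypothetical.

```python
from itertools import combinations, product

N = 3  # assumption: the factor A = M_3(C); only diagonal projections are enumerated
INDICES = frozenset(range(N))

def nonzero_subsets():
    """All non-empty index sets, i.e. all non-zero diagonal projections."""
    return [frozenset(c) for r in range(1, N + 1) for c in combinations(range(N), r)]

def peirce2(e, f):
    """Matrix-unit positions spanning A_2(e, f) = e A f."""
    return set(product(e, f))

def peirce1(e, f):
    """Matrix-unit positions spanning A_1(e, f) = e A f' + e' A f."""
    return set(product(e, INDICES - f)) | set(product(INDICES - e, f))

def rigidly_collinear(p, q):
    """The two inclusions of Corollary 5.4, tested on matrix-unit positions."""
    return peirce2(*p) <= peirce1(*q) and peirce2(*q) <= peirce1(*p)

def corollary_alternative(p, q):
    """Equal left parts with orthogonal right parts, or the mirror image."""
    (e1, f1), (e2, f2) = p, q
    return (e1 == e2 and not (f1 & f2)) or (f1 == f2 and not (e1 & e2))

# In a factor, the non-zero elements of CP(A) are the pairs of non-zero projections.
pairs = [(e, f) for e in nonzero_subsets() for f in nonzero_subsets()]
mismatches = [(p, q) for p in pairs for q in pairs
              if rigidly_collinear(p, q) != corollary_alternative(p, q)]
collinear_count = sum(rigidly_collinear(p, q) for p in pairs for q in pairs)
print(len(pairs), len(mismatches), collinear_count)
```

An empty `mismatches` list means that, within this restricted model, a pair is rigidly collinear exactly when one of the two alternatives of the corollary holds.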
Proof. Since ZP(A) is the set {0, 1} it follows that, in the theorem, one of $w_1$, $w_2$ and $w_3$ is equal to 1 and the other two are equal to zero. If $w_3$ is equal to 1 then all elements of the rigidly collinear family are zero, giving a contradiction. If $w_1$ is equal to 1 and $w_2$ is equal to zero, then, by Theorem 5.5, for all j in $\Lambda$, $e_j$ is equal to $e_0$ and $(f_j)_{j \in \Lambda}$ is a family of pairwise orthogonal elements of P(A). The other possibility occurs if $w_1$ is equal to zero and $w_2$ is equal to 1.

6. Measures on the Complete ∗-Lattice CP(A)

In this section certain measures on the complete ∗-lattice CP(A) of centrally equivalent pairs of projections in the W∗-algebra A are analysed. A measure m on CP(A) is a mapping from CP(A) to $\mathbb{C}$ such that, for each pair $(e_1,f_1)$ and $(e_2,f_2)$ of elements of CP(A) that are either centrally orthogonal or rigidly collinear,
\[ m((e_1,f_1) \vee (e_2,f_2)) = m((e_1,f_1)) + m((e_2,f_2)). \]
The measure m is said to be bounded if the set $\{m((e,f)) : (e,f) \in CP(A)\}$ is a bounded subset of $\mathbb{C}$ and is said to be hermitian if, for all elements $(e,f)$ in CP(A),
\[ m((e,f)^*) = \overline{m((e,f))}. \]
Recall that, according to [32], a mapping $\nu$ from P(A) × P(A) to $\mathbb{C}$ is said to be a quantum bimeasure if, for $e_1$, $e_2$, f in P(A), with $e_1 \perp e_2$,
\[ \nu(e_1 + e_2, f) = \nu(e_1,f) + \nu(e_2,f), \]
and, for e, $f_1$ and $f_2$ in P(A), with $f_1 \perp f_2$,
\[ \nu(e, f_1 + f_2) = \nu(e,f_1) + \nu(e,f_2). \]
The quantum bimeasure $\nu$ is said to be bounded if the set $\{\nu(e,f) : e, f \in P(A)\}$ is a bounded subset of $\mathbb{C}$ and is said to be hermitian if, for all e and f in P(A),
\[ \nu(f,e) = \overline{\nu(e,f)}. \]
The first lemma describes the relationship that exists between measures on CP(A) and quantum bimeasures on P(A) × P(A).

Lemma 6.1. Let A be a W∗-algebra, let P(A) be the complete orthomodular lattice of projections in A, with centre ZP(A), and let CP(A) be the complete ∗-lattice of centrally equivalent pairs of elements of P(A). Then, there exists a bijection $m \mapsto \nu_m$ from the set of measures m on CP(A) onto the set of quantum bimeasures $\nu$ on P(A) × P(A) with the property that, for all elements z in ZP(A), and all elements e and f in P(A),
\[ \nu(ze,f) = \nu(e,zf), \]
defined, for e and f in P(A), by
\[ \nu_m(e,f) = m((c(f)e, c(e)f)). \]
The mapping sends the set of bounded measures into the set of bounded quantum bimeasures and the set of hermitian measures into the set of hermitian quantum bimeasures.
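The correspondence in Lemma 6.1 can be made concrete in a toy model. The sketch below assumes A is the direct sum of two copies of $M_2(\mathbb{C})$ with diagonal projections only, and uses a hypothetical bimeasure `nu` (the sum over blocks of the product of the ranks), which is an illustrative choice and not taken from the paper. It checks additivity in the first argument, the central condition $\nu(ze,f) = \nu(e,zf)$, and that restricting `nu` to CP(A) and then forming $\nu_m$ returns `nu`.

```python
from itertools import combinations, product

BLOCKS, DIM = 2, 2  # assumption: A = M_2(C) + M_2(C), diagonal projections only
SUBSETS = [frozenset(c) for r in range(DIM + 1) for c in combinations(range(DIM), r)]
PROJECTIONS = list(product(SUBSETS, repeat=BLOCKS))
CENTRAL = [frozenset(c) for r in range(BLOCKS + 1) for c in combinations(range(BLOCKS), r)]

def central_support(p):
    """Set of block indices on which p is non-zero; this models c(p)."""
    return frozenset(j for j, block in enumerate(p) if block)

def cut(z, p):
    """Product of the central projection z with p."""
    return tuple(block if j in z else frozenset() for j, block in enumerate(p))

def orthogonal(p, q):
    return all(not (a & b) for a, b in zip(p, q))

def add(p, q):
    """Sum of two orthogonal diagonal projections."""
    return tuple(a | b for a, b in zip(p, q))

def nu(e, f):
    """Hypothetical toy bimeasure: sum over blocks of rank(e_j) * rank(f_j)."""
    return sum(len(a) * len(b) for a, b in zip(e, f))

def m(pair):
    """Candidate measure on CP(A): nu restricted to centrally equivalent pairs."""
    e, f = pair
    assert central_support(e) == central_support(f)
    return nu(e, f)

def nu_m(e, f):
    """nu_m(e, f) = m((c(f) e, c(e) f)), as in Lemma 6.1."""
    return m((cut(central_support(f), e), cut(central_support(e), f)))

additive = all(nu(add(e1, e2), f) == nu(e1, f) + nu(e2, f)
               for e1 in PROJECTIONS for e2 in PROJECTIONS if orthogonal(e1, e2)
               for f in PROJECTIONS)
central_ok = all(nu(cut(z, e), f) == nu(e, cut(z, f))
                 for z in CENTRAL for e in PROJECTIONS for f in PROJECTIONS)
round_trip = all(nu_m(e, f) == nu(e, f) for e in PROJECTIONS for f in PROJECTIONS)
print(additive, central_ok, round_trip)
```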
Proof. Let m be a measure on CP(A) and let $e_1$, $e_2$ and f be elements of P(A) with $e_1 \perp e_2$. Observe that $c(e_1)c(e_2)$, $c(e_1)c(e_2)'$, $c(e_1)'c(e_2)$ and $c(e_1)'c(e_2)'$ are pairwise orthogonal elements of ZP(A) of sum 1. Using Theorem 4.6(ii) and the fact that m is additive on centrally orthogonal pairs of elements of CP(A),
\begin{align}
\nu_m(e_1,f) &= m((c(f)c(e_1)c(e_2)e_1, c(e_1)c(e_2)f)) + m((c(f)c(e_1)c(e_2)'e_1, c(e_1)c(e_2)'f)) \notag \\
&\quad + m((c(f)c(e_1)'c(e_2)e_1, c(e_1)c(e_1)'c(e_2)f)) + m((c(f)c(e_1)'c(e_2)'e_1, c(e_1)c(e_1)'c(e_2)'f)) \notag \\
&= m((c(f)c(e_1)c(e_2)e_1, c(e_1)c(e_2)f)) + m((c(f)c(e_1)c(e_2)'e_1, c(e_1)c(e_2)'f)). \tag{6.1}
\end{align}
Similarly,
\[ \nu_m(e_2,f) = m((c(f)c(e_1)c(e_2)e_2, c(e_1)c(e_2)f)) + m((c(f)c(e_1)'c(e_2)e_2, c(e_1)'c(e_2)f)). \tag{6.2} \]
Furthermore, by Lemma 2.1(iii) and Theorem 4.6,
\begin{align}
\nu_m(e_1 + e_2, f) &= m((c(f)(e_1 + e_2), (c(e_1) \vee c(e_2))f)) \notag \\
&= m((c(f)(e_1 + e_2), (c(e_1) + c(e_2) - c(e_1)c(e_2))f)) \notag \\
&= m((c(f)c(e_1)c(e_2)(e_1 + e_2), c(e_1)c(e_2)f)) + m((c(f)c(e_1)c(e_2)'(e_1 + e_2), c(e_1)c(e_2)'f)) \notag \\
&\quad + m((c(f)c(e_1)'c(e_2)(e_1 + e_2), c(e_1)'c(e_2)f)) + m((c(f)c(e_1)'c(e_2)'(e_1 + e_2), c(e_1)'c(e_2)'f)) \tag{6.3} \\
&= m((c(f)c(e_1)c(e_2)(e_1 + e_2), c(e_1)c(e_2)f)) + m((c(f)c(e_1)c(e_2)'e_1, c(e_1)c(e_2)'f)) \notag \\
&\quad + m((c(f)c(e_1)'c(e_2)e_2, c(e_1)'c(e_2)f)). \notag
\end{align}
However, $(c(f)c(e_1)c(e_2)e_1, c(e_1)c(e_2)f)$ and $(c(f)c(e_1)c(e_2)e_2, c(e_1)c(e_2)f)$ form a rigidly collinear pair of elements of CP(A) and, therefore,
\begin{align}
m((c(f)c(e_1)c(e_2)(e_1 + e_2), c(e_1)c(e_2)f)) &= m((c(f)c(e_1)c(e_2)e_1, c(e_1)c(e_2)f)) \notag \\
&\quad + m((c(f)c(e_1)c(e_2)e_2, c(e_1)c(e_2)f)). \tag{6.4}
\end{align}
First, substituting from (6.4) in (6.3), and then, using (6.1) and (6.2), gives
\begin{align*}
\nu_m(e_1 + e_2, f) &= m((c(f)c(e_1)c(e_2)e_1, c(e_1)c(e_2)f)) + m((c(f)c(e_1)c(e_2)e_2, c(e_1)c(e_2)f)) \\
&\quad + m((c(f)c(e_1)c(e_2)'e_1, c(e_1)c(e_2)'f)) + m((c(f)c(e_1)'c(e_2)e_2, c(e_1)'c(e_2)f)) \\
&= \nu_m(e_1,f) + \nu_m(e_2,f).
\end{align*}
Similarly, for e, $f_1$ and $f_2$ in P(A), with $f_1 \perp f_2$,
\[ \nu_m(e, f_1 + f_2) = \nu_m(e,f_1) + \nu_m(e,f_2). \]
Therefore, $\nu_m$ is a quantum bimeasure that, clearly, is bounded when m is bounded, and hermitian when m is hermitian. Let z be an element of ZP(A) and let e and f be elements of P(A). Then, using Lemma 2.1,
\[ \nu_m(ze,f) = m((zc(f)e, c(ze)f)) = m((zc(f)e, zc(e)f)) = m((c(zf)ze, c(ze)zf)) = \nu_m(ze,zf), \]
and, similarly, $\nu_m(e, zf) = \nu_m(ze, zf) = \nu_m(ze, f)$, as required. Clearly, the mapping $m \mapsto \nu_m$ is an injection. Conversely, suppose that $\nu$ is a quantum bimeasure on P(A) × P(A) such that, for all elements $z$ of ZP(A) and all elements $e$ and $f$ in P(A),
$$\nu(ze, f) = \nu(e, zf) = \nu(ze, zf). \tag{6.5}$$
It follows that, for $e$ and $f$ in P(A),
$$\nu(c(f)e, c(e)f) = \nu(e, c(e)c(f)f) = \nu(e, c(e)f) = \nu(c(e)e, f) = \nu(e, f). \tag{6.6}$$
Let $m$ be the mapping from CP(A) to $\mathbb{C}$ defined, for each element $(e, f)$ in CP(A), by $m((e, f)) = \nu(e, f)$. To show that $m$ is a measure on CP(A), first let $(e_1, f_1)$ and $(e_2, f_2)$ be centrally orthogonal elements of CP(A), and let $z$ be an element of ZP(A) such that $(e_1, f_1) \le (z, z)$ and $(e_2, f_2) \le (z', z')$. Then, using the definition of a quantum bimeasure and (6.5),
$$\begin{aligned}
m((e_1, f_1) \vee (e_2, f_2)) &= m((e_1 \vee e_2,\ f_1 \vee f_2)) = \nu(e_1 \vee e_2,\ f_1 \vee f_2) \\
&= \nu(ze_1 + z'e_2,\ f_1 \vee f_2) = \nu(ze_1,\ f_1 \vee f_2) + \nu(z'e_2,\ f_1 \vee f_2) \\
&= \nu(ze_1,\ zf_1 + z'f_2) + \nu(z'e_2,\ zf_1 + z'f_2) \\
&= \nu(ze_1, zf_1) + \nu(ze_1, z'f_2) + \nu(z'e_2, zf_1) + \nu(z'e_2, z'f_2) \\
&= \nu(e_1, f_1) + \nu(e_2, f_2) = m((e_1, f_1)) + m((e_2, f_2)).
\end{aligned}$$
Now suppose that $(e_1, f_1)$ and $(e_2, f_2)$ are rigidly collinear elements of CP(A). Let $w_1$, $w_2$ and $w_3$ be the unique elements of ZP(A) of sum 1 such that
$$w_1 e_1 = w_1 e_2, \qquad w_1 f_1 \perp w_1 f_2, \tag{6.7}$$
$$w_2 e_1 \perp w_2 e_2, \qquad w_2 f_1 = w_2 f_2, \tag{6.8}$$
$$w_3 e_1 = w_3 e_2 = w_3 f_1 = w_3 f_2 = 0, \qquad c(e_1, f_1) = c(e_2, f_2) = w_1 + w_2. \tag{6.9}$$
Recall that, by Theorem 5.3(ii), $w_3(e_1 \vee e_2) = w_3(f_1 \vee f_2) = 0$. Therefore, using (6.5), (6.7) and the additive properties of a quantum bimeasure,
m(((e1 , f1 ) ∨ (e2 , f2 ))) = m(((e1 ∨ e2 ), (f1 ∨ f2 ))) = ν((e1 ∨ e2 ), (f1 ∨ f2 )) = ν((w1 + w2 )(e1 ∨ e2 ), (w1 + w2 )(f1 ∨ f2 )) = ν((w1 e1 ∨ w1 e2 ) + (w2 e1 ∨ w2 e2 ), (w1 f1 ∨ w1 f2 ) + (w2 f1 ∨ w2 f2 )) = ν(w1 e1 + w2 (e1 + e2 ), w1 (f1 + f2 ) + w2 f2 ) = ν(w1 e1 , w1 (f1 + f2 ) + w2 f2 ) + ν(w2 (e1 + e2 ), w1 (f1 + f2 ) + w2 f2 ) = ν(w1 e1 , w1 f1 + w1 f2 ) + ν(w1 e1 , w2 f2 ) + ν(w2 e1 + w2 e2 , w1 f1 + w1 f2 ) + ν(w2 e1 + w2 e2 , w2 f2 ) = ν(w1 e1 , w1 f1 ) + ν(w1 e1 , w1 f2 ) + ν(w1 e1 , w2 f2 ) + ν(w2 e1 , w1 f1 ) + ν(w2 e1 , w1 f2 ) + ν(w2 e2 , w1 f1 ) + ν(w2 e2 , w1 f2 ) + ν(w2 e1 , w2 f2 ) + ν(w2 e2 , w2 f2 ) = ν(w1 e1 , f1 ) + ν(w1 e1 , f2 ) + ν(w2 w1 e1 , f2 ) + ν(w1 w2 e1 , f1 ) + ν(w1 w2 e1 , f2 ) + ν(w1 w2 e2 , f1 ) + ν(w1 w2 e2 , f2 ) + ν(w2 e1 , f2 ) + ν(w2 e2 , f2 ), = ν(w1 e1 , f1 ) + ν(w1 e1 , f2 ) + ν(w2 e1 , f2 ) + ν(w2 e2 , f2 ) = (ν(w1 e1 , f1 ) + ν(w2 e1 , w2 f2 )) + (ν(w1 e1 , f2 ) + ν(w2 e2 , f2 )) = (ν(w1 e1 , f1 ) + ν(w2 e1 , w2 f1 )) + (ν(w1 e2 , f2 ) + ν(w2 e2 , f2 )) = (ν(w1 e1 , f1 ) + ν(w2 e1 , f1 )) + (ν(w1 e2 , f2 ) + ν(w2 e2 , f2 )) = ν(e1 , f1 ) + ν(e2 , f2 ) = m((e1 , f1 )) + m((e2 , f2 )). It follows that m is a measure on CP(A), that is bounded if ν is bounded, and is hermitian if ν is hermitian. Furthermore, for e and f in P(A), using (6.6), νm (e, f ) = m((c(f )e, c(e)f )) = ν(c(f )e, c(e)f ) = ν(e, f ), and it follows that νm and ν coincide. This completes the proof of the lemma.
This result can now be combined with those of Wright [32] to give a precise description of the bounded measures on CP(A), at least when A has no direct summand of Type I2.

Theorem 6.2. Let A be a W∗-algebra that contains no weak∗-closed ideal of Type I2, with centre Z(A), let P(A) be the complete orthomodular lattice of projections in A, and let CP(A) be the complete ∗-lattice of centrally equivalent pairs of elements of P(A). Let Sb(A) be the space of bounded sesquilinear functionals $\phi$ from A × A to $\mathbb{C}$ such that, for all elements $a$ and $b$ in A and $c$ in Z(A),
$$\phi(ca, b) = \phi(a, c^* b),$$
and let Sbh(A) be the subspace of Sb(A) consisting of hermitian functionals. Then, the mapping $\phi \mapsto m_\phi$ defined, for each element $\phi$ in Sb(A) and each element $(e, f)$ of CP(A), by
$$m_\phi((e, f)) = \phi(e, f)$$
is a bijection from Sb(A) onto the space Mb(CP(A)) of bounded measures on CP(A), which maps Sbh(A) onto the space Mbh(CP(A)) of bounded hermitian measures on CP(A).
Proof. Let $\phi$ be an element of Sb(A) and denote by $\nu_\phi$ its restriction to P(A) × P(A). Then, $\nu_\phi$ is clearly a quantum bimeasure and, for all elements $e$ and $f$ in P(A) and $z$ in ZP(A),
$$\nu_\phi(ze, f) = \phi(ze, f) = \phi(e, z^* f) = \phi(e, zf) = \nu_\phi(e, zf).$$
It follows from Lemma 6.1 that the function $m_\phi$ from CP(A) to $\mathbb{C}$, defined above, is a bounded measure on CP(A). Let $m$ be a bounded measure on CP(A) and let $\nu_m$ be the bounded quantum bimeasure defined in Lemma 6.1. Then, by [32], Theorem 1, there exists a unique bounded bilinear functional $\psi_m$ on A extending $\nu_m$. For $a$ and $b$ in A, define a mapping $\phi_m$ from A × A to $\mathbb{C}$ by
$$\phi_m(a, b) = \psi_m(a, b^*).$$
Then, $\phi_m$ is a bounded sesquilinear functional from A × A to $\mathbb{C}$ extending $\nu_m$. Moreover, for $e$ and $f$ in P(A) and $z$ in ZP(A),
$$\phi_m(ze, f) = \nu_m(ze, f) = \nu_m(e, zf) = \phi_m(e, zf).$$
Since the space of finite linear combinations of elements of P(A) is dense in A for the norm topology, it follows that, for all elements $a$ and $b$ in A and $z$ in ZP(A), $\phi_m(za, b) = \phi_m(a, zb)$. Since the space of finite linear combinations of elements of ZP(A) is dense in Z(A) in the norm topology, recalling that $\phi_m$ is conjugate linear in the second variable, it follows that, for all elements $a$ and $b$ in A and $c$ in Z(A),
$$\phi_m(ca, b) = \phi_m(a, c^* b).$$
Hence, $\phi_m$ lies in Sb(A) and, for $(e, f)$ in CP(A),
$$m_{\phi_m}((e, f)) = \phi_m(e, f) = \psi_m(e, f) = \nu_m(e, f) = m((e, f)).$$
This completes the proof of the first part of the theorem.

Suppose that $\phi$ lies in Sbh(A). Then, for $e$ and $f$ in P(A),
$$\nu_\phi(e, f) = \phi(e, f) = \overline{\phi(f, e)} = \overline{\nu_\phi(f, e)},$$
and the quantum bimeasure $\nu_\phi$ is hermitian. It follows from Lemma 6.1 that $m_\phi$ is hermitian. Conversely, let $m$ be a bounded hermitian measure on CP(A) and let $\phi_m$ be the element of Sb(A) defined in the first part of the proof. Let $e_1, e_2, \dots, e_m$ and $f_1, f_2, \dots, f_n$ be elements of P(A), let $\lambda_1, \lambda_2, \dots, \lambda_m$ and $\mu_1, \mu_2, \dots, \mu_n$ be elements of $\mathbb{C}$, and let
$$a = \sum_{j=1}^{m} \lambda_j e_j, \qquad b = \sum_{k=1}^{n} \mu_k f_k.$$
Then, using Lemma 6.1,
$$\begin{aligned}
\phi_m(a, b) = \psi_m(a, b^*) &= \sum_{j=1}^{m} \sum_{k=1}^{n} \lambda_j \bar{\mu}_k\, \nu_m(e_j, f_k) \\
&= \sum_{j=1}^{m} \sum_{k=1}^{n} \lambda_j \bar{\mu}_k\, \overline{\nu_m(f_k, e_j)} \\
&= \overline{\psi_m(b, a^*)} = \overline{\phi_m(b, a)}.
\end{aligned} \tag{6.10}$$
Since finite linear combinations of elements of P(A) are dense in A in the norm topology and since $\phi_m$ is bounded, (6.10) holds for all elements $a$ and $b$, and $\phi_m$ is hermitian, as required.

Let $\phi$ be an element of Sbh(A) and let $m_\phi$ be the bounded hermitian measure on CP(A) defined in Theorem 6.2. Then, by [34], there exist a complex Hilbert space $H$, an element $\xi$ of $H$ of norm one, a Jordan ∗-homomorphism $\pi$ from A into $B(H)$ such that $\pi(A)\xi$ is dense in $H$, and a self-adjoint operator $T$ in $B(H)$ such that, for all elements $a$ and $b$ in A,
$$\phi(a, b) = \langle T\pi(a)\xi, \pi(b)\xi \rangle.$$
However, from Theorem 6.2, for each element $c$ in Z(A),
$$\langle T\pi(c)\pi(a)\xi, \pi(b)\xi \rangle = \langle T\pi(ca)\xi, \pi(b)\xi \rangle = \phi(ca, b) = \phi(a, c^* b) = \langle T\pi(a)\xi, \pi(c)^*\pi(b)\xi \rangle = \langle \pi(c)T\pi(a)\xi, \pi(b)\xi \rangle,$$
and it follows that, for all $c$ in Z(A),
$$T\pi(c) = \pi(c)T.$$
This remark leads to the following corollary that provides a complete description of bounded hermitian measures on CP(A).

Corollary 6.3. Let A be a W∗-algebra that contains no weak∗-closed ideal of Type I2, with centre Z(A), let P(A) be the complete orthomodular lattice of projections in A, and let $m$ be a bounded hermitian measure on the complete ∗-lattice CP(A) of centrally equivalent pairs of elements of P(A). Then, there exist a complex Hilbert space $H$, an element $\xi$ in $H$ of norm one, a Jordan ∗-homomorphism $\pi$ from A into the W∗-algebra $B(H)$ of bounded linear operators on $H$ such that $\pi(A)\xi$ is dense in $H$, and a self-adjoint element $T$ in the commutant $\pi(Z(A))'$ of $\pi(Z(A))$ in $B(H)$ such that, for each element $(e, f)$ in CP(A),
$$m((e, f)) = \langle T\pi(e)\xi, \pi(f)\xi \rangle.$$
Furthermore, the bounded hermitian functional $\phi_m$ on A × A defined, for elements $a$ and $b$ in A, by
$$\phi_m(a, b) = \langle T\pi(a)\xi, \pi(b)\xi \rangle$$
is the unique bounded hermitian functional on A × A having the property that, for all elements $a$ and $b$ in A and $c$ in Z(A), $\phi_m(ca, b) = \phi_m(a, c^* b)$, such that, for each element $(e, f)$ in CP(A), $\phi_m(e, f) = m((e, f))$.
Recall that the mapping $e \mapsto (e, e)$ is an order isomorphism from the complete orthomodular lattice P(A) onto the sub-complete lattice CP(A)h of CP(A) consisting of elements invariant under the involution on CP(A). Observe that a pair $(e_1, e_1)$ and $(e_2, e_2)$ of distinct elements of CP(A)h is compatible if and only if $e_1$ and $e_2$ commute, is orthogonal if and only if $e_1 \perp e_2$, is centrally orthogonal if and only if $c(e_1) \perp c(e_2)$, but is never rigidly collinear. Consequently, the restrictions of measures on CP(A) to CP(A)h are those that are additive only on centrally orthogonal elements of CP(A)h. They are not, in general, orthoadditive, and, therefore, do not satisfy the conditions of Gleason's theorem for W∗-algebras. To obtain a statement of Gleason's theorem from the results above, it is necessary to use Corollary 5.6 to observe that, for a factor, rigid collinearity of pairs of elements of CP(A) reduces to orthogonality in P(A).

A measure $m$ on CP(A) is said to be normal if, for every centrally orthogonal or rigidly collinear family $((e_j, f_j))_{j \in \Lambda}$ of elements of CP(A),
$$m\Bigl(\bigvee_{j \in \Lambda} (e_j, f_j)\Bigr) = \sum_{j \in \Lambda} m((e_j, f_j)),$$
where the sum is defined to be the limit of the net formed by taking sums over finite subsets of $\Lambda$. The final result describes the normal bounded measures on CP(A), but first the following lemmas are required.

Lemma 6.4. Let A be a W∗-algebra, let P(A) be the complete orthomodular lattice of projections in A, with centre ZP(A), and let $(e_j)_{j \in \Lambda}$ be a family of pairwise orthogonal elements of P(A). Let $\Gamma$ be the set of all subsets of $\Lambda$, excluding the empty set. Then, there exists a family $(z_\alpha)_{\alpha \in \Gamma}$ of pairwise orthogonal elements of ZP(A) such that:
(i) $\bigvee_{\alpha \in \Gamma} z_\alpha = \bigvee_{j \in \Lambda} c(e_j)$;
(ii) for each $\alpha$ in $\Gamma$, and for each $j$ in $\alpha$, $c(e_j z_\alpha) = z_\alpha$, and, for each $j$ in $\Lambda \setminus \alpha$, $e_j z_\alpha = 0$;
(iii) $\bigvee_{\alpha \in \Gamma} \bigvee_{j \in \alpha} e_j z_\alpha = \bigvee_{j \in \Lambda} e_j$.
Proof. Since Z(A) is an abelian W∗-algebra, there exists a hyperstonian space $\Omega$ such that Z(A) is isomorphic to the abelian W∗-algebra $C(\Omega)$ of continuous complex-valued functions on $\Omega$. The complete Boolean lattice ZP(A) is mapped onto the family of characteristic functions of clopen subsets of $\Omega$. For each element $z$ in ZP(A), the corresponding subset of $\Omega$ will be denoted by $\Omega_z$. For $\alpha$ in $\Gamma$, let
$$z_\alpha = \bigwedge_{j \in \alpha} c(e_j) \wedge \bigwedge_{j \in \Lambda \setminus \alpha} c(e_j)'.$$
Notice that, for $\alpha_1$ and $\alpha_2$ distinct elements of $\Gamma$, without loss of generality, it can be assumed that there exists an element $j$ in $\Lambda$ such that $j \in \alpha_1$ and $j \notin \alpha_2$. Then,
$$z_{\alpha_1} \le c(e_j), \qquad z_{\alpha_2} \le c(e_j)',$$
and it follows that $z_{\alpha_1}$ and $z_{\alpha_2}$ are orthogonal. Therefore, $(z_\alpha)_{\alpha \in \Gamma}$ is a family of pairwise orthogonal elements of ZP(A). Let
$$z = \bigvee_{\alpha \in \Gamma} z_\alpha, \qquad w = \bigvee_{j \in \Lambda} c(e_j).$$
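Observe that, for each $\alpha$ in $\Gamma$, there exists an element $k$ in $\Lambda$ such that $z_\alpha \le c(e_k)$. Therefore,
$$\Omega_{z_\alpha} \subseteq \Omega_{c(e_k)} \subseteq \bigcup_{j \in \Lambda} \Omega_{c(e_j)}.$$
Hence,
$$\bigcup_{\alpha \in \Gamma} \Omega_{z_\alpha} \subseteq \bigcup_{j \in \Lambda} \Omega_{c(e_j)}. \tag{6.11}$$
Let $\omega$ be an element of $\bigcup_{j \in \Lambda} \Omega_{c(e_j)}$. Then, there exists an element $k$ in $\Lambda$ such that $\omega$ is contained in $\Omega_{c(e_k)}$. Then, the set $\beta = \{j \in \Lambda : \omega \in \Omega_{c(e_j)}\}$ is non-empty, and, for each element $j$ in $\beta$, $\omega$ is contained in $\Omega_{c(e_j)}$, whilst, for each element $j$ in $\Lambda \setminus \beta$, $\omega$ is contained in $\Omega_{c(e_j)'}$. It follows that
$$\omega \in \bigcap_{j \in \beta} \Omega_{c(e_j)} \cap \bigcap_{j \in \Lambda \setminus \beta} \Omega_{c(e_j)'} = \Omega_{z_\beta}.$$
Therefore,
$$\bigcup_{j \in \Lambda} \Omega_{c(e_j)} \subseteq \bigcup_{\alpha \in \Gamma} \Omega_{z_\alpha}. \tag{6.12}$$
Using (6.11) and (6.12),
$$\Omega_z = \bigcup_{\alpha \in \Gamma} \Omega_{z_\alpha} = \bigcup_{j \in \Lambda} \Omega_{c(e_j)} = \Omega_w,$$
and it can be seen that $w$ and $z$ coincide, thereby completing the proof of (i). For fixed $\alpha$ in $\Gamma$ and $j$ in $\alpha$, by (2.1),
$$c(e_j z_\alpha) = c(e_j) z_\alpha = z_\alpha.$$
For $j$ in $\Lambda \setminus \alpha$,
$$e_j z_\alpha \le c(e_j) z_\alpha \le c(e_j) c(e_j)' = 0.$$
This completes the proof of (ii). Using (ii), observe that, for each $\alpha$ in $\Gamma$,
$$\bigvee_{j \in \alpha} e_j z_\alpha = \bigvee_{j \in \Lambda} e_j z_\alpha = \Bigl(\bigvee_{j \in \Lambda} e_j\Bigr) z_\alpha. \tag{6.13}$$
Hence, from (6.13),
$$\bigvee_{j \in \Lambda} e_j = \Bigl(\bigvee_{j \in \Lambda} e_j\Bigr) w = \Bigl(\bigvee_{j \in \Lambda} e_j\Bigr) z = \bigvee_{\alpha \in \Gamma} \Bigl(\bigvee_{j \in \alpha} e_j\Bigr) z_\alpha = \bigvee_{\alpha \in \Gamma} \bigvee_{j \in \alpha} e_j z_\alpha,$$
as required.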
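The construction of the family $(z_\alpha)$ in Lemma 6.4 can be illustrated in a finite toy model. The sketch below is not part of the original argument: it assumes that the central supports $c(e_j)$ are modelled by subsets of a finite set `omega` (a stand-in for clopen subsets of the hyperstonian space), so that meets become intersections and the complement $c(e_j)'$ becomes a set difference. The function name `central_partition` and the sample data are hypothetical.

```python
from itertools import combinations


def central_partition(omega, supports):
    """Toy model of Lemma 6.4: for every non-empty index set alpha, form
    z_alpha = (meet of c(e_j), j in alpha) meet (meet of c(e_j)', j not in alpha),
    with central projections modelled as subsets of the finite set omega."""
    indices = sorted(supports)
    pieces = {}
    for r in range(1, len(indices) + 1):
        for alpha in combinations(indices, r):
            z = set(omega)
            for j in indices:
                # intersect with c(e_j) if j is in alpha, else with its complement
                z &= supports[j] if j in alpha else (omega - supports[j])
            pieces[alpha] = frozenset(z)
    return pieces


if __name__ == "__main__":
    omega = frozenset(range(6))
    supports = {1: frozenset({0, 1, 2}), 2: frozenset({2, 3}), 3: frozenset({3, 4})}
    for alpha, z in central_partition(omega, supports).items():
        print(alpha, sorted(z))
```

In this finite model the sets `pieces[alpha]` are pairwise disjoint and their union is the union of the supports, which mirrors part (i) of the lemma; the infinite case treated in the proof needs the topological argument given above.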
Lemma 6.5. Let A be a W∗-algebra, with centre Z(A), let P(A) be the complete orthomodular lattice of projections in A, with centre ZP(A), and let $x$ be a bounded linear functional on A. Then, $x$ is weak∗-continuous on A if and only if, for every family $(e_j)_{j \in \Lambda}$ of elements of P(A) that is either centrally orthogonal, or is orthogonal and such that each member has the same central support,
$$x\Bigl(\bigvee_{j \in \Lambda} e_j\Bigr) = \sum_{j \in \Lambda} x(e_j).$$

Proof. Suppose that $x$ is weak∗-continuous on A. Then, by [28], 1.13, for every orthogonal family $(e_j)_{j \in \Lambda}$ in P(A),
$$x\Bigl(\bigvee_{j \in \Lambda} e_j\Bigr) = \sum_{j \in \Lambda} x(e_j).$$
In particular, this holds if $(e_j)_{j \in \Lambda}$ is centrally orthogonal or consists of elements having a common central support. To prove the converse, suppose that $(e_j)_{j \in \Lambda}$ is an orthogonal family of elements in P(A), let $\Gamma$ be the set of subsets of $\Lambda$, excluding the empty set, and let $(z_\alpha)_{\alpha \in \Gamma}$ be the orthogonal family of elements of ZP(A) constructed in Lemma 6.4. Observe that, for fixed $\alpha$ in $\Gamma$, $(e_j z_\alpha)_{j \in \alpha}$ is an orthogonal family, each member of which has central support $z_\alpha$, and $(\bigvee_{j \in \alpha} e_j z_\alpha)_{\alpha \in \Gamma}$ is a centrally orthogonal family. Therefore, using Lemma 6.4(iii),
$$\begin{aligned}
x\Bigl(\bigvee_{j \in \Lambda} e_j\Bigr) &= x\Bigl(\bigvee_{\alpha \in \Gamma} \bigvee_{j \in \alpha} e_j z_\alpha\Bigr) = \sum_{\alpha \in \Gamma} x\Bigl(\bigvee_{j \in \alpha} e_j z_\alpha\Bigr) \\
&= \sum_{\alpha \in \Gamma} \sum_{j \in \alpha} x(e_j z_\alpha) = \sum_{\alpha \in \Gamma} \sum_{j \in \Lambda} x(e_j z_\alpha),
\end{aligned}$$
since, for $j$ in $\Lambda \setminus \alpha$, $e_j z_\alpha$ is zero,
$$= \sum_{j \in \Lambda} \sum_{\alpha \in \Gamma} x(e_j z_\alpha) = \sum_{j \in \Lambda} x\Bigl(\bigvee_{\alpha \in \Gamma} e_j z_\alpha\Bigr),$$
since, for each $j$ in $\Lambda$, $(e_j z_\alpha)_{\alpha \in \Gamma}$ is a centrally orthogonal family,
$$= \sum_{j \in \Lambda} x(e_j z) = \sum_{j \in \Lambda} x(e_j w) = \sum_{j \in \Lambda} x(e_j),$$
since, for each $j$ in $\Lambda$, $c(e_j) \le w$. That $x$ is weak∗-continuous follows from [28], 1.13.

It is now possible to prove the final main result that shows how normal bounded measures may be characterized.

Theorem 6.6. Let A be a W∗-algebra that contains no weak∗-closed ideal of Type I2, with centre Z(A), let P(A) be the complete orthomodular lattice of projections in A, let CP(A) be the complete ∗-lattice of centrally equivalent pairs of elements of P(A), and let $\phi \mapsto m_\phi$ be the bijection, defined in Theorem 6.2, from the space Sb(A) of bounded sesquilinear functionals $\phi$ from A × A to $\mathbb{C}$ such that, for all elements $a$ and $b$ in A and $c$ in Z(A),
$$\phi(ca, b) = \phi(a, c^* b),$$
onto the space Mb(CP(A)) of bounded measures on CP(A). Then $\phi$ is separately weak∗-continuous if and only if $m_\phi$ is normal.
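Proof. Suppose that $\phi$ is separately weak∗-continuous and let $((e_j, f_j))_{j \in \Lambda}$ be a centrally orthogonal family. Let $\Gamma$ be the directed set of finite subsets of $\Lambda$. Then, using Lemma 2.3, and the fact that, for $j$ and $k$ distinct elements of $\Lambda$, $\phi(e_j, f_k) = 0$,
$$\begin{aligned}
m_\phi\Bigl(\bigvee_{j \in \Lambda} (e_j, f_j)\Bigr) &= m_\phi\Bigl(\Bigl(\bigvee_{j \in \Lambda} e_j,\ \bigvee_{j \in \Lambda} f_j\Bigr)\Bigr) = \phi\Bigl(\bigvee_{j \in \Lambda} e_j,\ \bigvee_{k \in \Lambda} f_k\Bigr) \\
&= \lim_{\alpha \in \Gamma} \phi\Bigl(\sum_{j \in \alpha} e_j,\ \bigvee_{k \in \Lambda} f_k\Bigr) = \lim_{\alpha \in \Gamma} \sum_{j \in \alpha} \phi\Bigl(e_j,\ \bigvee_{k \in \Lambda} f_k\Bigr) \\
&= \lim_{\alpha \in \Gamma} \lim_{\beta \in \Gamma} \sum_{j \in \alpha} \sum_{k \in \beta} \phi(e_j, f_k) = \lim_{\alpha \in \Gamma} \sum_{j \in \alpha} \phi(e_j, f_j) \\
&= \sum_{j \in \Lambda} m_\phi((e_j, f_j)),
\end{aligned}$$
as required. Now let $((e_j, f_j))_{j \in \Lambda}$ be a rigidly collinear family. Let $w_1$, $w_2$ and $w_3$ be the elements of ZP(A) and $(e_0, f)$ and $(e, f_0)$ the centrally orthogonal elements of CP(A) defined in Theorem 5.5. Then, using the separate weak∗-continuity of $\phi$ and Lemma 2.3,
$$\begin{aligned}
m_\phi\Bigl(\bigvee_{j \in \Lambda} (e_j, f_j)\Bigr) &= m_\phi((e_0, f) \vee (e, f_0)) = m_\phi((e_0, f)) + m_\phi((e, f_0)) \\
&= \phi(e_0, f) + \phi(e, f_0) = \phi\Bigl(e_0,\ \bigvee_{j \in \Lambda} w_1 f_j\Bigr) + \phi\Bigl(\bigvee_{j \in \Lambda} w_2 e_j,\ f_0\Bigr) \\
&= \lim_{\alpha \in \Gamma} \sum_{j \in \alpha} \phi(e_0, w_1 f_j) + \lim_{\alpha \in \Gamma} \sum_{j \in \alpha} \phi(w_2 e_j, f_0) \\
&= \lim_{\alpha \in \Gamma} \sum_{j \in \alpha} \bigl(\phi(w_1 e_j, w_1 f_j) + \phi(w_2 e_j, w_2 f_j)\bigr) \\
&= \lim_{\alpha \in \Gamma} \sum_{j \in \alpha} \bigl(m_\phi((w_1 e_j, w_1 f_j)) + m_\phi((w_2 e_j, w_2 f_j))\bigr) \\
&= \lim_{\alpha \in \Gamma} \sum_{j \in \alpha} m_\phi((w_1 e_j, w_1 f_j) \vee (w_2 e_j, w_2 f_j)),
\end{aligned}$$
using the fact that $(w_1 e_j, w_1 f_j)$ and $(w_2 e_j, w_2 f_j)$ are centrally orthogonal. Hence,
$$m_\phi\Bigl(\bigvee_{j \in \Lambda} (e_j, f_j)\Bigr) = \lim_{\alpha \in \Gamma} \sum_{j \in \alpha} m_\phi((w_1 e_j + w_2 e_j,\ w_1 f_j + w_2 f_j)) = \lim_{\alpha \in \Gamma} \sum_{j \in \alpha} m_\phi((e_j, f_j)) = \sum_{j \in \Lambda} m_\phi((e_j, f_j)),$$
since $w_3 e_j = w_3 f_j = 0$. It follows that the measure $m_\phi$ is normal.

Suppose now that the measure $m_\phi$ is normal and that $f$ is an element of P(A). Let $x$ be the bounded linear functional on A defined, for each element $a$ in A, by $x(a) = \phi(a, f)$. Let $(e_j)_{j \in \Lambda}$ be a centrally orthogonal family of elements of P(A). Using the properties of $\phi$, and the fact that $((c(f)e_j, c(e_j)f))_{j \in \Lambda}$ is a centrally orthogonal family, observe that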
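$$\begin{aligned}
x\Bigl(\bigvee_{j \in \Lambda} e_j\Bigr) &= \phi\Bigl(\bigvee_{j \in \Lambda} e_j,\ f\Bigr) = \phi\Bigl(c(f) \bigvee_{j \in \Lambda} e_j,\ c\Bigl(\bigvee_{j \in \Lambda} e_j\Bigr) f\Bigr) \\
&= m_\phi\Bigl(\Bigl(c(f) \bigvee_{j \in \Lambda} e_j,\ c\Bigl(\bigvee_{j \in \Lambda} e_j\Bigr) f\Bigr)\Bigr) = m_\phi\Bigl(\Bigl(\bigvee_{j \in \Lambda} c(f)e_j,\ \bigvee_{j \in \Lambda} c(e_j)f\Bigr)\Bigr) \\
&= m_\phi\Bigl(\bigvee_{j \in \Lambda} (c(f)e_j, c(e_j)f)\Bigr) = \sum_{j \in \Lambda} m_\phi((c(f)e_j, c(e_j)f)) \\
&= \sum_{j \in \Lambda} \phi(e_j, f) = \sum_{j \in \Lambda} x(e_j).
\end{aligned}$$
Now, let $(e_j)_{j \in \Lambda}$ be an orthogonal family of elements of P(A) each having central support equal to $z$. Notice that $((c(f)e_j, zf))_{j \in \Lambda}$ is a rigidly collinear family of elements of CP(A). Therefore
$$\begin{aligned}
x\Bigl(\bigvee_{j \in \Lambda} e_j\Bigr) &= \phi\Bigl(\bigvee_{j \in \Lambda} e_j,\ f\Bigr) = \phi\Bigl(c(f) \bigvee_{j \in \Lambda} e_j,\ c\Bigl(\bigvee_{j \in \Lambda} e_j\Bigr) f\Bigr) \\
&= m_\phi\Bigl(\Bigl(c(f) \bigvee_{j \in \Lambda} e_j,\ c\Bigl(\bigvee_{j \in \Lambda} e_j\Bigr) f\Bigr)\Bigr) = m_\phi\Bigl(\Bigl(\bigvee_{j \in \Lambda} c(f)e_j,\ \bigvee_{j \in \Lambda} c(e_j)f\Bigr)\Bigr) \\
&= m_\phi\Bigl(\bigvee_{j \in \Lambda} (c(f)e_j, zf)\Bigr) = \sum_{j \in \Lambda} m_\phi((c(f)e_j, zf)) \\
&= \sum_{j \in \Lambda} \phi(e_j, f) = \sum_{j \in \Lambda} x(e_j).
\end{aligned}$$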
Lemma 6.5 shows that the bounded linear functional $x$ is weak∗-continuous on A. Since the set of finite linear combinations of elements of P(A) is dense in A in the norm topology, by approximating a fixed element $b$ in A by a finite linear combination of elements of P(A), it can be seen that the bounded linear functional mapping $a \mapsto \phi(a, b)$ is also weak∗-continuous on A. Similarly, for each fixed element $a$ in A, the mapping $b \mapsto \phi(a, b)$ is weak∗-continuous on A. This completes the proof of the theorem.

References
1. Barton, T.J., Timoney, R.M.: Weak∗-continuity of Jordan triple products and its applications. Math. Scand. 59, 177–191 (1986)
2. Bonsall, F.F., Duncan, J.: Numerical Ranges of Operators on Normed Spaces and of Elements of Normed Algebras. London Mathematical Society Lecture Note Series 2. Cambridge: Cambridge University Press, 1971
3. Bunce, L.J., Wright, J.D.M.: The Mackey–Gleason problem. Bull. Am. Math. Soc. 26, 288–293 (1992)
4. Bunce, L.J., Wright, J.D.M.: Complex measures on projections in von Neumann algebras. J. Lond. Math. Soc. 46, 269–279 (1992)
5. Bunce, L.J., Wright, J.D.M.: The Mackey–Gleason problem for vector measures on projections in von Neumann algebras. J. Lond. Math. Soc. 49, 133–149 (1994)
6. Bunce, L.J., Wright, J.D.M.: Skew-symmetric functions on the sphere and quantum measures. Expositiones Math. 12, 271–280 (1994)
7. Edwards, C.M.: On Jordan W∗-algebras. Bull. Sc. Math. 2e série 104, 393–403 (1980)
8. Edwards, C.M., McCrimmon, K., Rüttimann, G.T.: The range of a structural projection. J. Funct. Anal. 139, 196–224 (1996)
9. Edwards, C.M., Rüttimann, G.T.: On the facial structure of the unit balls in a JBW∗-triple and its predual. J. Lond. Math. Soc. 38, 317–322 (1988)
10. Edwards, C.M., Rüttimann, G.T.: Inner ideals in W∗-algebras. Michigan Math. J. 36, 147–159 (1989)
11. Edwards, C.M., Rüttimann, G.T.: Structural projections on JBW∗-triples. J. Lond. Math. Soc. 53, 354–368 (1996)
12. Edwards, C.M., Rüttimann, G.T.: Peirce inner ideals in Jordan ∗-triples. J. Alg. 180, 41–66 (1996)
13. Edwards, C.M., Rüttimann, G.T., Vasilovsky, S.Yu.: Invariant inner ideals in W∗-algebras. Math. Nachr. 172, 95–108 (1995)
14. Effros, E.G., Størmer, E.: Positive projections and Jordan structure in operator algebras. Math. Scand. 45, 127–138 (1979)
15. Friedman, Y., Russo, B.: Structure of the predual of a JBW∗-triple. J. Reine Angew. Math. 356, 67–89 (1985)
16. Gleason, A.M.: Measures on the closed subspaces of a Hilbert space. J. Math. and Mech. 6, 885–894 (1957)
17. Hanche-Olsen, H., Størmer, E.: Jordan Operator Algebras. London: Pitman, 1984
18. Horn, G.: Characterization of the predual and the ideal structure of a JBW∗-triple. Math. Scand. 61, 117–133 (1987)
19. Isham, C.J.: Quantum logic and histories approach to quantum theory. J. Math. Phys. 35, 2157–2185 (1994)
20. Isham, C.J., Linden, N.: Quantum temporal logic and decoherence functionals in the histories approach to generalised quantum theory. J. Math. Phys. 35, 5452–5476 (1994)
21. Isham, C.J., Linden, N., Schreckenberg, S.: The classification of decoherence functionals: An analogue of Gleason's theorem. J. Math. Phys. 35, 6360–6370 (1994)
22. Kadison, R.V.: Isometries of operator algebras. Ann. Math. 54, 325–338 (1951)
23. Kaup, W.: Riemann mapping theorem for bounded symmetric domains in complex Banach spaces. Math. Z. 183, 503–529 (1983)
24. Kaup, W.: Contractive projections on Jordan C∗-algebras and generalizations. Math. Scand. 54, 95–100 (1984)
25. Kaup, W., Upmeier, H.: Banach spaces with biholomorphically equivalent unit balls are isomorphic. Proc. Am. Math. Soc. 58, 129–133 (1976)
26. McCrimmon, K.: Compatible Peirce decomposition of Jordan triple systems. Pac. J. Math. 83, 415–439 (1979)
27. Pedersen, G.K.: C∗-algebras and their automorphism groups (London Mathematical Society Monographs 14). London: Academic Press, 1979
28. Sakai, S.: C∗-algebras and W∗-algebras. Berlin–Heidelberg–New York: Springer, 1971
29. Rüttimann, G.T.: Non-commutative measure theory. Habilitationsschrift, Universität Bern, 1980
30. Upmeier, H.: Symmetric Banach manifolds and Jordan C∗-algebras. Amsterdam: North Holland, 1985
31. Wright, J.D.M.: Jordan C∗-algebras. Michigan Math. J. 24, 291–302 (1977)
32. Wright, J.D.M.: The structure of decoherence functionals for von Neumann quantum histories. J. Math. Phys. 36, 5409–5413 (1995)
33. Wright, J.D.M.: Decoherence functionals for von Neumann quantum histories: Boundedness and countable additivity. Commun. Math. Phys. 191, 493–500 (1998)
34. Ylinen, K.: The structure of bounded bilinear functionals on products of C∗-algebras. Proc. Am. Math. Soc. 102, 599–601 (1988)
35. Youngson, M.A.: A Vidav theorem for Banach Jordan algebras. Math. Proc. Cambridge Philos. Soc. 84, 263–272 (1978)

Communicated by H. Araki
Commun. Math. Phys. 197, 167 – 197 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Fedosov ∗-Products and Quantum Momentum Maps Ping Xu? Department of Mathematics, The Pennsylvania State University, University Park, PA 16802, USA. E-mail:
[email protected] Received: 8 July 1997 / Accepted: 4 March 1998
Abstract: The purpose of this paper is to study various aspects of star products on a symplectic manifold related to the Fedosov method. By introducing the notion of “quantum exponential maps” we give a characterization of Fedosov connections. As an application, a geometric realization is obtained for the equivalence between an arbitrary ∗-product and a Fedosov one. Every Fedosov ∗-product is shown to be a Vey ∗-product. Consequently, we find that every ∗-product is equivalent to a Vey ∗-product, a classical result of Lichnerowicz. Quantization of a hamiltonian G-space, and in particular, quantum momentum maps are studied. Lagrangian submanifolds are also studied under a deformation quantization.
1. Introduction

In classical mechanics, observables are smooth functions on a phase space, constituting a Poisson algebra, while in quantum mechanics observables form a noncommutative associative algebra. Deformation quantization, as laid out by Bayen, Flato, Fronsdal, Lichnerowicz and Sternheimer in the 1970's [5], is one of the important methods aimed at establishing a correspondence between these two mechanics. A classical phase space is usually a symplectic manifold $M$. A deformation quantization, or more precisely a star-product, is a family of associative products $*_\hbar$ (depending on the Planck constant $\hbar$) on $C^\infty(M)[[\hbar]]$, the space of formal power series in $\hbar$ with coefficients in $C^\infty(M)$:
$$f *_\hbar g = fg + \hbar C_1(f, g) + \cdots + \hbar^k C_k(f, g) + \cdots, \qquad \forall f, g \in C^\infty(M)[[\hbar]], \tag{1}$$
such that
(i) $C_1(f, g) - C_1(g, f) = -i\{f, g\}$;
(ii) $C_k(1, f) = C_k(f, 1) = 0$, for $k \ge 1$;
(iii) each $C_k(\cdot, \cdot)$ is a bidifferential operator.
Here $\{f, g\}$ is the Poisson bracket on the symplectic manifold $M$. A ∗-product is said to satisfy the parity condition if in addition
(iv) $C_k(f, g) = (-1)^k C_k(g, f)$, $\forall k$.

* Research partially supported by NSF grants DMS95-04913 and DMS97-04391.

On a symplectic vector space $V$ there exists a standard ∗-product known as the Moyal–Weyl product:
$$f *_\hbar g = \sum_{k=0}^{\infty} \Bigl(-\frac{i\hbar}{2}\Bigr)^k \frac{1}{k!}\, \pi^{i_1 j_1} \cdots \pi^{i_k j_k}\, \frac{\partial^k f}{\partial y^{i_1} \cdots \partial y^{i_k}}\, \frac{\partial^k g}{\partial y^{j_1} \cdots \partial y^{j_k}}, \tag{2}$$
$\forall f, g \in C^\infty(V)[[\hbar]]$, where $y^1, \dots, y^{2n}$ are linear coordinates on $V$, and $\pi^{ij} = \{y^i, y^j\}$. It is simple to see that this formula is independent of the choice of linear coordinates. The Moyal–Weyl formula has a straightforward generalization to the case of a symplectic manifold admitting a flat torsion free symplectic connection $\nabla$, as shown in [5]. In this case, for any $f, g \in C^\infty(M)[[\hbar]]$, set
$$(f *_\hbar g)(x) = \bigl[(\exp_x^* f)(y) *_\hbar (\exp_x^* g)(y)\bigr]\big|_{y=0}. \tag{3}$$
Here $\exp_x : T_x M \to M$ is the exponential map corresponding to the connection $\nabla$ defined in a neighborhood of the origin. The product $*_\hbar$ on the right-hand side is the standard Moyal–Weyl ∗-product on the symplectic vector space $T_x M$, so that
$$(f *_\hbar g)(x) = \sum_{k=0}^{\infty} \Bigl(-\frac{i\hbar}{2}\Bigr)^k \frac{1}{k!}\, \pi^{i_1 j_1} \cdots \pi^{i_k j_k}\, (\partial_{i_1} \cdots \partial_{i_k} f)(\partial_{j_1} \cdots \partial_{j_k} g). \tag{4}$$
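As an illustration of the Moyal–Weyl formula (2), the following minimal sketch computes the product exactly for polynomials on a two-dimensional symplectic vector space. It is not taken from the paper: it assumes coordinates $(q, p)$ with $\pi^{qp} = \{q, p\} = 1$, and it introduces the bookkeeping symbol `nu` for $-i\hbar/2$ so that all coefficients stay rational. The helper names `diff`, `mul`, `add` and `star` are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

# A polynomial is a dict {(a, b, k): coeff} meaning coeff * q^a * p^b * nu^k,
# where nu abbreviates -i*hbar/2 (an assumption of this sketch).


def diff(poly, dq, dp):
    """Apply (d/dq)^dq (d/dp)^dp to a polynomial."""
    out = {}
    for (a, b, k), c in poly.items():
        if a >= dq and b >= dp:
            factor = (factorial(a) // factorial(a - dq)) * (factorial(b) // factorial(b - dp))
            key = (a - dq, b - dp, k)
            out[key] = out.get(key, 0) + c * factor
    return out


def mul(f, g, shift=0):
    """Pointwise (commutative) product, multiplied by nu^shift."""
    out = {}
    for (a1, b1, k1), c1 in f.items():
        for (a2, b2, k2), c2 in g.items():
            key = (a1 + a2, b1 + b2, k1 + k2 + shift)
            out[key] = out.get(key, 0) + c1 * c2
    return out


def add(f, g, scale=1):
    """Return f + scale * g, dropping zero coefficients."""
    out = dict(f)
    for key, c in g.items():
        out[key] = out.get(key, 0) + scale * c
    return {key: c for key, c in out.items() if c != 0}


def star(f, g):
    """Moyal-Weyl product of Eq. (2) for pi^{qp} = 1, pi^{pq} = -1."""
    if not f or not g:
        return {}
    # the series terminates once k exceeds the polynomial degree of f or g
    kmax = min(max(a + b for a, b, _ in f), max(a + b for a, b, _ in g))
    result = {}
    for k in range(kmax + 1):
        for m in range(k + 1):
            # m index pairs of type (q, p) contribute +1, k - m pairs of type (p, q) contribute -1
            coeff = Fraction((-1) ** (k - m) * comb(k, m), factorial(k))
            term = mul(diff(f, m, k - m), diff(g, k - m, m), shift=k)
            result = add(result, term, coeff)
    return result


if __name__ == "__main__":
    q = {(1, 0, 0): Fraction(1)}
    p = {(0, 1, 0): Fraction(1)}
    print(add(star(q, p), star(p, q), -1))  # commutator of q and p: a multiple of nu
```

With these conventions the commutator of `q` and `p` is $2\nu = -i\hbar$, in agreement with condition (i), and the product is associative on polynomials because the flat space has no curvature.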
The multiplication defined by Eq. (3) fails to be associative when $\nabla$ has nonzero curvature. The existence of ∗-products on a general symplectic manifold was first proved by de Wilde and Lecomte [10] using a homological argument. Later, an alternate proof using Weyl manifolds was found by Omori et al. [26]. Guillemin [18] showed that the results in [8] also imply the existence of ∗-products on any compact pre-quantizable symplectic manifold. Recently, Fedosov has given a nice geometrical proof of existence on a general symplectic manifold.

This paper grew out of an attempt to understand the Fedosov method, as well as to solve some other related problems using his technique. In this setting, as discussed earlier in this introduction, each tangent space $T_x M$ is a symplectic vector space, hence can be quantized using the standard Moyal–Weyl formula. This gives a bundle of algebras $W \to M$, called the Weyl bundle by Fedosov, which can be thought of as a "quantum tangent bundle". Fedosov found a nice iteration method for constructing a flat connection on the Weyl bundle whose space of parallel sections can be naturally identified with $C^\infty(M)[[\hbar]]$. Thus the product on the bundle induces a ∗-product on $C^\infty(M)[[\hbar]]$. Intuitively, a Fedosov connection can be thought of as a "quantum connection" on the "quantum tangent bundle" $W$, which is obtained by adding some "quantum correction" to the usual affine connection on the tangent bundle (see Sect. 2). Then, the correspondence between $C^\infty(M)[[\hbar]]$ and the parallel sections of $W$ can be considered as taking the exponential map for such a quantum connection.
This viewpoint, essentially due to Weinstein ([31, 12]), leads to our introduction of the notion of "quantum exponential maps". By a "quantum exponential map", we mean a map from $C^\infty(M)[[\hbar]]$ to the space of sections $\Gamma W$ of the Weyl bundle, which satisfies certain axioms (see Sect. 3 for details). In fact, we will show that any quantum exponential map is equivalent to a Fedosov connection, which agrees with the general principle that connections and exponential maps are equivalent. This result enables us to characterize those subalgebras of $\Gamma W$ arising from Fedosov connections. As an application we realize geometrically the equivalence between any given ∗-product and a Fedosov ∗-product, a result obtained by Nest–Tsygan [25], Deligne [13] and recently Bertelson–Cahen–Gutt [7] for instance.

Since a "quantum exponential map" is a "quantum correction" to the usual exponential map, the ∗-products obtained by the Fedosov method, called Fedosov ∗-products in this paper, should be closely related to Eq. (3). A Vey ∗-product, by definition, is a ∗-product where the principal terms coincide with those in Eq. (3) (see Sect. 4 for the precise definition). Vey ∗-products have played an important role since the beginning of deformation quantization theory (see [5, 23, 28]). In this paper, we will show that every Fedosov ∗-product is a Vey ∗-product, and so we recover the well known result of Lichnerowicz [23] that every ∗-product is equivalent to a Vey ∗-product.

Because of the basic simplicity of its construction, Fedosov's method provides a useful tool for studying some other problems in deformation quantization theory. In particular, in this paper we will study the following question: to what do lagrangian submanifolds correspond under a deformation quantization? Given a lagrangian submanifold $L$, the space $C_L^\infty(M)$ of smooth functions vanishing on $L$ forms a Poisson subalgebra. We will show that for any ∗-product $C_L^\infty(M)[[\hbar]]$ is, roughly speaking, a subalgebra under a "quantum correction".

In symplectic geometry, hamiltonian $G$-spaces, and in particular, momentum maps play a very special role. It is natural to ask: what is the quantum analogue of momentum maps? The second part of the paper, as another application of Fedosov's method, is devoted to the study of quantum momentum maps. In particular, we derive a sufficient condition for their existence. When a quantum momentum map exists, we obtain a pair of mutual commutants which can be considered as a quantum analogue of the well known Poisson dual pair $\mathfrak{g}^* \xleftarrow{\ J\ } M \xrightarrow{\ pr\ } M/G$ of Weinstein [30].

Derivations are always important for associative algebras. In Appendix A, we list some basic results regarding derivations on a ∗-algebra $C^\infty(M)[[\hbar]]$, which are needed for the study of quantum momentum maps. Most of them are proved using Fedosov's method. We will see that Fedosov's method provides a simpler way of understanding even some well known results.

During the preparation of this paper, it came to the author's attention that quantum momentum maps are under study by other authors as well, including Astashkevich [4], Kostant, and Tsygan [27]. In particular, Theorem 6.4 has also been proved by Kostant. Quantum momentum maps in a broader sense were already studied by Landsman [20] and Lu [24]. Recently, Bordemann–Waldmann also proved independently that a Fedosov ∗-product is of Vey type [9] for the special case when the Weyl curvature is the symplectic form.
2. Fedosov Quantization

In this section, we recall some basic ingredients of the Fedosov construction of ∗-products on a symplectic manifold, as well as some useful notations, which are needed in the sequel. For details, readers should consult [14, 15]. Let $(M, \omega)$ be a symplectic manifold of dimension $2n$. Then, each tangent space $T_x M$ is equipped with a linear symplectic structure, which can be quantized using the standard Moyal–Weyl product. The resulting space is denoted by $W_x$. More precisely,

Definition 2.1. A formal Weyl algebra $W_x$ associated to $T_x M$ is an associative algebra with a unit over $\mathbb{C}$, whose elements consist of formal power series in the formal parameter $\hbar$ with coefficients being formal polynomials in $T_x M$. In other words, each element has the form
$$a(y, \hbar) = \sum \hbar^k a_{k,\alpha}\, y^\alpha, \tag{5}$$
where $y = (y^1, \dots, y^{2n})$ is a linear coordinate system on $T_x M$, $\alpha = (\alpha_1, \dots, \alpha_{2n})$ is a multi-index, $y^\alpha = (y^1)^{\alpha_1} \cdots (y^{2n})^{\alpha_{2n}}$, and $a_{k,\alpha}$ are constants. The product is defined according to the Moyal–Weyl rule:
$$a * b = \sum_{k=0}^{\infty} \Bigl(-\frac{i\hbar}{2}\Bigr)^k \frac{1}{k!}\, \pi^{i_1 j_1} \cdots \pi^{i_k j_k}\, \frac{\partial^k a}{\partial y^{i_1} \cdots \partial y^{i_k}}\, \frac{\partial^k b}{\partial y^{j_1} \cdots \partial y^{j_k}}. \tag{6}$$

Let $W = \cup_{x \in M} W_x$. Then $W$ is a bundle of algebras over $M$, called the Weyl bundle. Its space of sections $\Gamma W$ forms an associative algebra with unit under fiberwise multiplication. One may think of $W$ as a "quantum tangent bundle" of $M$, whose space of sections $\Gamma W$ gives rise to a deformation quantization for the tangent bundle $TM$, considered as a Poisson manifold with fiberwise linear symplectic structures. The center $Z(W)$ of $\Gamma W$ consists of sections not containing $y$'s, thus can be naturally identified with $C^\infty(M)[[\hbar]]$. By assigning degrees to the $y$'s and $\hbar$ with $\deg y^i = 1$ and $\deg \hbar = 2$, there is a natural filtration
$$C^\infty(M) \subset \Gamma(W_1) \subset \cdots \subset \Gamma(W_i) \subset \Gamma(W_{i+1}) \subset \cdots \subset \Gamma(W)$$
with respect to the total degree (e.g. any individual term in the summation on the RHS of Eq. (5) has degree $2k + |\alpha|$).

A differential form with values in $W$ is a section of the bundle $W \otimes \wedge^q T^*M$, which can be expressed locally as
$$a(x, y, \hbar, dx) = \sum \hbar^k a_{k, i_1 \cdots i_p, j_1 \cdots j_q}\, y^{i_1} \cdots y^{i_p}\, dx^{j_1} \wedge \cdots \wedge dx^{j_q}. \tag{7}$$
Here the coefficient $a_{k, i_1 \cdots i_p, j_1 \cdots j_q}$ is a covariant tensor symmetric with respect to $i_1 \cdots i_p$ and antisymmetric in $j_1 \cdots j_q$. For short, we denote the space of these sections by $\Gamma W \otimes \Lambda^q$. The usual exterior derivative on differential forms extends, in a straightforward way, to an operator $\delta$ on $W$-valued differential forms:
$$\delta a = \sum_i dx^i \wedge \frac{\partial a}{\partial y^i}, \qquad \forall a \in \Gamma W \otimes \Lambda^*. \tag{8}$$
Fedosov ∗-Products and Quantum Momentum Maps
By $\delta^{-1}$, we denote its “inverse” operator, defined by
\[
\delta^{-1} a = \frac{1}{p+q} \sum_i y^i \Big( \frac{\partial}{\partial x^i} \,\lrcorner\, a \Big) \tag{9}
\]
when $p + q > 0$, and $\delta^{-1} a = 0$ when $p + q = 0$, where $a \in \Gamma W \otimes \Lambda^q$ is homogeneous of degree $p$ in $y$. There is a “Hodge” decomposition:
\[
a = \delta\delta^{-1} a + \delta^{-1}\delta a + a_{00}, \quad \forall a \in \Gamma W \otimes \Lambda^*, \tag{10}
\]
where $a_{00}(x)$ is the constant term of $a$, i.e., the 0-form term of $a|_{y=0}$, or $a_{00}(x) = a(x, 0, 0, 0)$. The operator $\delta$ inherits most of the basic properties of the usual exterior derivative. For example, $\delta^2 = 0$ and $(\delta^{-1})^2 = 0$.

Let $\nabla$ be a torsion-free symplectic connection on $M$ and $\partial : \Gamma W \longrightarrow \Gamma W \otimes \Lambda^1$ its induced covariant derivative. Consider a connection on $W$ of the form:
\[
D = -\delta + \partial + \frac{i}{\hbar}[\gamma, \cdot\,], \tag{11}
\]
with $\gamma \in \Gamma W \otimes \Lambda^1$. Clearly, $D$ is a derivation with respect to the Moyal–Weyl product, i.e.,
\[
D(a * b) = a * Db + Da * b, \quad \forall a, b \in \Gamma W. \tag{12}
\]
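Returning briefly to the Hodge decomposition (10), it can be checked by hand on a simple element. The sketch below takes $a = y^i\,dx^j$ (so $p = q = 1$ and $a_{00} = 0$) and reads the operation in Eq. (9) as contraction with $\partial/\partial x^i$; the test element is our own choice and is purely illustrative.

```latex
\begin{align*}
\delta^{-1} a &= \tfrac12\, y^i y^j, &
\delta\delta^{-1} a &= \tfrac12\,\bigl(y^j\,dx^i + y^i\,dx^j\bigr), \\
\delta a &= dx^i \wedge dx^j, &
\delta^{-1}\delta a &= \tfrac12\,\bigl(y^i\,dx^j - y^j\,dx^i\bigr), \\
\delta\delta^{-1} a + \delta^{-1}\delta a &= y^i\,dx^j = a.
\end{align*}
```

The symmetric part of $a$ is recovered from $\delta\delta^{-1}$ and the antisymmetric part from $\delta^{-1}\delta$.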
A simple calculation yields that
\[
D^2 a = -\Big[\frac{i}{\hbar}\Omega, a\Big], \quad \forall a \in \Gamma W, \tag{13}
\]
where
\[
\Omega = \omega - R + \delta\gamma - \partial\gamma - \frac{i}{\hbar}\gamma^2. \tag{14}
\]
Here $R = \frac{1}{4} R_{ijkl}\, y^i y^j\, dx^k \wedge dx^l$ and $R_{ijkl} = \omega_{im} R^m_{jkl}$ is the curvature tensor of the symplectic connection. A connection of the form (11) is called Abelian if $\Omega$ is a scalar 2-form, i.e., $\Omega \in \Omega^2(M)[[\hbar]]$. It is called a Fedosov connection if it is Abelian and in addition $\gamma \in \Gamma W_3 \otimes \Lambda^1$. For an Abelian connection, the Bianchi identity implies that $d\Omega = D\Omega = 0$, i.e., $\Omega \in Z^2(M)[[\hbar]]$. In this case, $\Omega$ is called the Weyl curvature.
Theorem 2.2 (Fedosov [15]). Let $\nabla$ be any torsion-free symplectic connection, and $\Omega = \omega + \hbar\omega_1 + \cdots \in Z^2(M)[[\hbar]]$ a perturbation of the symplectic form in the space $Z^2(M)[[\hbar]]$. There exists a unique $\gamma \in \Gamma W_3 \otimes \Lambda^1$ such that $D$, given by Eq. (11), is a Fedosov connection, which has Weyl curvature $\Omega$ and satisfies $\delta^{-1}\gamma = 0$.
P. Xu
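Proof. It suffices to solve the equation:
\[
\omega - R + \delta\gamma - \partial\gamma - \frac{i}{\hbar}\gamma^2 = \Omega. \tag{15}
\]
This is equivalent to
\[
\delta\gamma = \tilde{\Omega} + \partial\gamma + \frac{i}{\hbar}\gamma^2, \tag{16}
\]
where $\tilde{\Omega} = \Omega - \omega + R$. Applying the operator $\delta^{-1}$ to Eq. (16) and using the Hodge decomposition, Eq. (10), together with the normalization $\delta^{-1}\gamma = 0$, we obtain
\[
\gamma = \delta^{-1}\tilde{\Omega} + \delta^{-1}\Big(\partial\gamma + \frac{i}{\hbar}\gamma^2\Big). \tag{17}
\]
Here we note that $\gamma_{00} = 0$ since $\gamma$ is a 1-form. Take $\gamma_0 = \delta^{-1}\tilde{\Omega}$, and consider the following iteration equation:
\[
\gamma_{n+1} = \gamma_0 + \delta^{-1}\Big(\partial\gamma_n + \frac{i}{\hbar}\gamma_n^2\Big), \quad \forall n \geq 0. \tag{18}
\]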
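To make the starting point of the iteration (18) concrete, here is a sketch of $\gamma_0$ in the simplest case $\Omega = \omega$, where $\tilde{\Omega} = R$. It uses only Eq. (9) with $p = q = 2$ and the antisymmetry of the coefficients of $dx^k \wedge dx^l$ in $k, l$; the special choice $\Omega = \omega$ is an assumption made for illustration.

```latex
\begin{align*}
\gamma_0 = \delta^{-1} R
  &= \frac{1}{4}\cdot\frac{1}{4}\, R_{ijkl}\, y^i y^j
     \bigl( y^k\,dx^l - y^l\,dx^k \bigr) \\
  &= \frac{1}{8}\, R_{ijkl}\, y^i y^j y^k\,dx^l .
\end{align*}
```

In this case $\gamma_0$ is cubic in $y$, so it has degree 3 in the filtration.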
Since the operator $\partial$ preserves the filtration and $\delta^{-1}$ raises it by 1, $\gamma_n$ defined by Eq. (18) converges to a unique $\gamma \in \Gamma W \otimes \Lambda^1$, which is clearly a solution to Eq. (17). Moreover, since $\gamma_0$ is at least of degree 3, $\gamma$ is indeed in $\Gamma W_3 \otimes \Lambda^1$.

This theorem indicates that a Fedosov connection $D$ is uniquely determined by a torsion-free symplectic connection $\nabla$ and a Weyl curvature $\Omega = \sum_{i=0}^{\infty} \hbar^i \omega_i \in Z^2(M)[[\hbar]]$. For this reason, we will say that $D$ is a Fedosov connection corresponding to the pair $(\nabla, \Omega)$. If $D$ is a Fedosov connection, the space of all parallel sections $W_D$ automatically becomes an associative algebra. Let $\sigma$ denote the projection from $W_D$ to its center $C^\infty(M)[[\hbar]]$ defined by $\sigma(a) = a|_{y=0}$.

Theorem 2.3 (Fedosov [15]). For any $a_0(x, \hbar) \in C^\infty(M)[[\hbar]]$ there is a unique section $a \in W_D$ such that $\sigma(a) = a_0$. Therefore, $\sigma$ establishes an isomorphism between $W_D$ and $C^\infty(M)[[\hbar]]$ as vector spaces.

Proof. The equation $Da = 0$ can be written as
\[
\delta a = \partial a + \Big[\frac{i}{\hbar}\gamma, a\Big].
\]
Applying the operator $\delta^{-1}$, it follows from the Hodge decomposition (Eq. (10)) that
\[
a = a_0 + \delta^{-1}\Big(\partial a + \Big[\frac{i}{\hbar}\gamma, a\Big]\Big). \tag{19}
\]
By an iteration analogous to the one in the proof of Theorem 2.2, we see that the equation above has a unique solution, since $\delta^{-1}$ increases the filtration. This concludes the proof.
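As an illustration of the iteration behind Eq. (19), the first two steps can be written out explicitly. This is a sketch under the assumptions that $a_0 \in C^\infty(M)$ is independent of $\hbar$ and that $\gamma \in \Gamma W_3 \otimes \Lambda^1$, so that the commutator term contributes only in degree 3 and higher; $\partial_i \partial_j a_0$ denotes the second covariant derivative, and the labels $a^{(n)}$ for the iterates are ours.

```latex
\begin{align*}
a^{(0)} &= a_0, \\
a^{(1)} &= a_0 + \delta^{-1}(da_0) = a_0 + (\partial_i a_0)\, y^i, \\
a^{(2)} &= a_0 + (\partial_i a_0)\, y^i
          + \tfrac12\, (\partial_i \partial_j a_0)\, y^i y^j
          \pmod{W_3}.
\end{align*}
```

The first step uses $[\gamma, a_0] = 0$, since $a_0$ is central.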
3. Quantum Exponential Maps

If $\nabla$ is flat and $\Omega = \omega$, the Fedosov connection takes a simple form: $D = -\delta + \partial$. In this case, the solution to Eq. (19) can be expressed explicitly as
\[
a = \sum_{k=0}^{\infty} \frac{1}{k!} (\partial_{i_1} \cdots \partial_{i_k} a_0)\, y^{i_1} \cdots y^{i_k},
\]
which is just the Taylor expansion of $\exp_x^* a_0$ at the origin. So the correspondence from $C^\infty(M)[[\hbar]]$ to $W_D$ is indeed the pullback by the ($C^\infty$-jet at the origin of the) usual exponential map. Thus for a general Fedosov connection, one may consider the correspondence $C^\infty(M)[[\hbar]] \longrightarrow W_D$ as a “quantum correction” to the exponential map. In this section, we will make this idea more precise by introducing the notion of quantum exponential maps, which gives a simple characterization for Fedosov connections. As an application, we will realize geometrically the equivalence between an arbitrary ∗-product and a Fedosov one, i.e., a ∗-product arising from the Fedosov construction.

Definition 3.1. A quantum exponential map is an $\hbar$-linear map $\rho : C^\infty(M)[[\hbar]] \longrightarrow \Gamma W$ such that
(i) $\rho(C^\infty(M)[[\hbar]])$ is a subalgebra of $\Gamma W$;
(ii) $\rho(a)|_{y=0} = a$, $\forall a \in C^\infty(M)[[\hbar]]$;
(iii) $\rho(a) = a + \delta^{-1} da$ mod $W_2$, $\forall a \in C^\infty(M)$;
(iv) $\rho(a)$ can be expressed as a formal power series in $y$ and $\hbar$, with coefficients being derivatives of $a$.
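For orientation, the flat model shows what Definition 3.1 asks for. The sketch below assumes $M = \mathbb{R}^{2n}$ with constant $\pi^{ij}$, the flat connection, and $\Omega = \omega$; the symbol $*_M$, denoting the Moyal–Weyl product (6) applied to functions of $x$, is notation introduced only for this example.

```latex
\begin{align*}
\rho(a)(x, y) &= \sum_{k=0}^{\infty} \frac{1}{k!}\,
   (\partial_{i_1}\cdots\partial_{i_k} a)(x)\, y^{i_1}\cdots y^{i_k}
   \quad (\text{formal Taylor series of } a(x+y)), \\
\rho(a)\big|_{y=0} &= a, \\
\rho(a) &= a + (\partial_i a)\, y^i = a + \delta^{-1} da \pmod{W_2}, \\
\rho(a) * \rho(b) &= \rho(a *_M b).
\end{align*}
```

The last line holds in this model because $\partial \rho(a) / \partial y^i = \rho(\partial_i a)$ and $\rho$ is multiplicative for the pointwise product, so conditions (i) to (iv) can all be read off directly.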
Given a quantum exponential map $\rho$, Condition (ii) implies that $\rho$ establishes an isomorphism between $C^\infty(M)[[\hbar]]$ and its image as vector spaces. Therefore, $C^\infty(M)[[\hbar]]$ becomes an associative algebra because of the first condition. It is simple to see that the third condition implies that this is indeed a deformation of the symplectic structure, and the last one implies that it is a ∗-product. Therefore,

Proposition 3.2. Any quantum exponential map on a symplectic manifold $M$ induces a star-product on $M$.

It is easy to see that for any Fedosov connection $D$, the correspondence from $C^\infty(M)[[\hbar]]$ to the space of parallel sections $W_D$, as constructed in Theorem 2.3, satisfies all the properties of a quantum exponential map. Thus, one may consider the Fedosov construction as a way of constructing quantum exponential maps, and so a quantum exponential map always exists.

Theorem 3.3. Quantum exponential maps exist on any symplectic manifold. Therefore, any symplectic manifold admits a ∗-product.

The ∗-product induced from a Fedosov connection is usually called a Fedosov ∗-product. In what follows, we will show that quantum exponential maps and Fedosov connections are indeed equivalent.

Theorem 3.4. Quantum exponential maps are equivalent to Fedosov connections.

Before proving this theorem, we start by investigating the following closely related question: what kind of subalgebras of $\Gamma W$ arise from a Fedosov connection?
Proposition 3.5. Suppose that $A \subset \Gamma W$ is a subalgebra satisfying the following conditions:
(i) “Completeness”: for any $x_0 \in M$ and any $a(y, \hbar) \in W_{x_0}$, there is an element $\tilde{a}(x, y, \hbar) \in A$ such that $\tilde{a}(x_0, y, \hbar) = a(y, \hbar)$.
(ii) “Uniqueness”: if $\tilde{a}$ and $\tilde{b} \in A$ are such that $\tilde{a}|_{x_0} = \tilde{b}|_{x_0}$, then $\tilde{a}_*|_{x_0} = \tilde{b}_*|_{x_0}$, where $\tilde{a}$ and $\tilde{b}$ are considered as maps $M \longrightarrow W$, and $\tilde{a}_*|_{x_0}$ and $\tilde{b}_*|_{x_0}$ refer to their tangent maps at $x_0$.
Then, there exists a unique Abelian connection $D$ such that $A \subseteq W_D$.

Proof. Take a torsion-free symplectic connection $\nabla$, and let $\partial : \Gamma W \longrightarrow \Gamma W \otimes \Lambda^1$ be its corresponding covariant derivative. For any $x_0 \in M$, introduce an operator $\rho_{x_0} : W_{x_0} \longrightarrow W_{x_0} \otimes \Lambda^1$ by
\[
\rho_{x_0}(a(y, \hbar)) = (\delta - \partial)\tilde{a}(x, y, \hbar)|_{x = x_0},
\]
where $\tilde{a}(x, y, \hbar) \in A$ is such that $\tilde{a}(x_0, y, \hbar) = a(y, \hbar)$. By assumption, $\rho_{x_0}$ is well-defined and is in fact a derivation of the algebra $W_{x_0}$. Therefore, there is a unique element $\gamma_{x_0} \in W_{x_0} \otimes \Lambda^1$ with $\gamma_{x_0}|_{y=0} = 0$ such that $\rho_{x_0} = \mathrm{ad}\, \frac{i}{\hbar}\gamma_{x_0} = [\frac{i}{\hbar}\gamma_{x_0}, \cdot\,]$. Applying this process pointwise, we obtain a global section $\gamma \in W_1 \otimes \Lambda^1$ with $\gamma_0 \overset{\mathrm{def}}{=} \gamma|_{y=0} = 0$. Let $D = -\delta + \partial + [\frac{i}{\hbar}\gamma, \cdot\,]$. Then, $D$ is a connection on the Weyl bundle $W$ and satisfies the condition $D\tilde{a}(x, y, \hbar) = 0$ for all $\tilde{a}(x, y, \hbar) \in A$. As in Sect. 2, let $\Omega = \omega - R + \delta\gamma - \partial\gamma - \frac{i}{\hbar}\gamma^2$ denote the Weyl curvature of the connection $D$. It thus follows from Eq. (13) that
\[
D^2 \tilde{a} = -\Big[\frac{i}{\hbar}\Omega, \tilde{a}\Big] = 0, \quad \forall \tilde{a} \in A.
\]
Since $A$ is complete, $\Omega$ belongs to the center. Therefore, $\Omega \in Z^2(M)[[\hbar]]$. In other words, $D$ is an Abelian connection.

The following lemma gives a simple sufficient condition for a subalgebra $A \subseteq \Gamma W$ to be complete.

Lemma 3.6. Let $A \subset \Gamma W$ be a subalgebra with unit. Suppose that for any $a_0(x) \in C^\infty(M)$, there is $\tilde{a} \in A$ such that
\[
\tilde{a} = a_0 + \delta^{-1} da_0, \quad \text{mod } W_2.
\]
Then, $A$ is complete.

Proof. We will prove, by induction, the following statement: $\forall x_0 \in M$ and $\forall a(y, \hbar) \in W_k|_{x_0}$, there is an element $\tilde{a} \in A$ such that $\tilde{a}(x_0, y, \hbar) - a(y, \hbar) \in W_{k+1}|_{x_0}$. The conclusion will then follow immediately from a standard iteration argument. By assumption, the statement holds for both $k = 0$ and $k = 1$. Assume that $a(y, \hbar) = \hbar^j y^{i_1} \cdots y^{i_p}$ with $2j + p = k$. Now
\[
a(y, \hbar) - \hbar^j y^{i_1} * \cdots * y^{i_p} = \hbar^j (y^{i_1} \cdots y^{i_p} - y^{i_1} * \cdots * y^{i_p}) = \sum_{2i + s = p} \hbar^{i+j} C_{i+j, j_1 \cdots j_s}\, y^{j_1} \cdots y^{j_s},
\]
where $i \geq 1$. Since $s = p - 2i = k - 2j - 2i < k$, we can apply the induction assumption for each term $y^{j_1} \cdots y^{j_s}$ in the above summation. This allows us to conclude that there is an element $\tilde{b} \in A$ such that
\[
a(y, \hbar) - \hbar^j y^{i_1} * \cdots * y^{i_p} = \tilde{b}(x_0, y, \hbar), \quad \text{mod } W_{k+1}.
\]
On the other hand, for each $y^{i_l}$, there is $\tilde{a}_{i_l} \in A$ such that $\tilde{a}_{i_l}|_{x_0} = y^{i_l}$, mod $W_2$, for $1 \leq l \leq p$. It is then clear that
\[
\hbar^j y^{i_1} * \cdots * y^{i_p} = \hbar^j \tilde{a}_{i_1} * \cdots * \tilde{a}_{i_p}|_{x_0}, \quad \text{mod } W_{k+1}.
\]
Thus, we have
\[
a(y, \hbar) = \hbar^j \tilde{a}_{i_1} * \cdots * \tilde{a}_{i_p}|_{x_0} + \tilde{b}(x_0, y, \hbar), \quad \text{mod } W_{k+1}.
\]
This concludes the proof.
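The induction step above can be seen concretely in the cubic case. The following sketch computes from Eq. (6) with $j = 0$ and $p = k = 3$, writing the indices as $a, b, c$ to avoid a clash with the letters used in the proof, and treating $\pi$ as constant on the fixed tangent space; only the $\hbar^1$ term appears, with $s = 1 < 3$.

```latex
\begin{align*}
y^a * y^b * y^c &= y^a y^b y^c
  - \frac{i\hbar}{2}\bigl( \pi^{ab} y^c + \pi^{ac} y^b + \pi^{bc} y^a \bigr), \\
y^a y^b y^c - y^a * y^b * y^c &=
  \frac{i\hbar}{2}\bigl( \pi^{ab} y^c + \pi^{ac} y^b + \pi^{bc} y^a \bigr).
\end{align*}
```

Each correction term is $\hbar$ times a monomial of length $s = 1$, which is exactly the kind of term to which the induction assumption applies.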
Now we are ready to formulate the following result, which gives criteria characterizing a Fedosov algebra.

Theorem 3.7. Let $A$ be a subalgebra of $\Gamma W$ satisfying:
(i) $\forall a_0 \in C^\infty(M)$, there is an element $\tilde{a} \in A$ such that $\tilde{a} = a_0 + \delta^{-1} da_0$, mod $W_2$; and
(ii) if $a$ and $b \in A$ are such that $a|_{x_0} = b|_{x_0}$, then $a_*|_{x_0} = b_*|_{x_0}$, where $a$ and $b$ are considered as maps $M \longrightarrow W$, and $a_*|_{x_0}$ and $b_*|_{x_0}$ refer to their tangent maps at $x_0$.
Then, $A$ is a Fedosov algebra, i.e., $A$ arises from a Fedosov connection.

Proof. According to Lemma 3.6, $A$ is complete. Thus by Proposition 3.5, there is an Abelian connection $D$ such that $A \subseteq W_D$. We will break the rest of the proof into two lemmas below.

Lemma 3.8. This Abelian connection $D$ is in fact a Fedosov connection.

Proof. By assumption, for any $a_0 \in C^\infty(M)$, there is $\tilde{a} \in A$ such that
\[
\tilde{a} = a_0 + \delta^{-1} da_0, \quad \text{mod } W_2.
\]
It follows from $D\tilde{a} = 0$ that
\[
(\delta - \partial)\tilde{a} = \Big[\frac{i}{\hbar}\gamma, \tilde{a}\Big].
\]
The degree zero term of the LHS is easily seen to be zero, while the degree zero term of the RHS is $\{\gamma_1, \delta^{-1} da_0\}$, where $\gamma_1$ is the degree one term of $\gamma$. Thus $\{\gamma_1, \delta^{-1} da_0\} = 0$. Since $a_0$ is arbitrary, $\gamma_1$ must be constant with respect to $y$. On the other hand, since it is linear in $y$, it must be identically zero. This implies that $\gamma \in W_2 \otimes \Lambda^1$.

Denote by $\gamma_2$ the degree 2 term of $\gamma$, and assume that $\gamma_2 = \sum r_{ij,k}(x)\, y^i y^j\, dx^k$. Note that this is the most general form of $\gamma_2$ since $\gamma|_{y=0} = 0$. Also, we may always assume that $r_{ij,k} = r_{ji,k}$. Since $D$ is Abelian, its curvature $\Omega \in Z^2(M)[[\hbar]]$. Assume that
\[
\Omega = \sum_{i=0}^{\infty} \hbar^i \omega_i = \omega_0 + \hbar\omega_1 + \hbar^2\omega_2 + \cdots. \tag{20}
\]
On the other hand, according to Eq. (14), we have
\[
\Omega = \omega - R + \delta\gamma - \partial\gamma - \frac{i}{\hbar}\gamma^2. \tag{21}
\]
Comparing the degree zero terms of Eqs. (20) and (21), it follows immediately that $\omega_0 = \omega$. Now the terms $R$, $\partial\gamma$ and $\frac{i}{\hbar}\gamma^2$ are all of degree not less than 2, so the only degree 1 term in Eq. (21) would be $\delta\gamma_2$. Hence $\delta\gamma_2 = 0$. On the other hand, a simple calculation yields that
\[
\begin{aligned}
\delta\gamma_2 &= \sum \big(r_{ij,k}(x)\, y^j\, dx^i \wedge dx^k + r_{ij,k}(x)\, y^i\, dx^j \wedge dx^k\big) \\
&= \sum r_{ij,k}(x)\, y^j\, dx^i \wedge dx^k + \sum r_{ji,k}(x)\, y^j\, dx^i \wedge dx^k \\
&= 2 \sum r_{ij,k}(x)\, y^j\, dx^i \wedge dx^k.
\end{aligned}
\]
It thus follows that for any $i \neq k$,
\[
\sum_j r_{ij,k}(x)\, y^j = \sum_j r_{kj,i}(x)\, y^j.
\]
That is, $r_{ij,k}(x) = r_{kj,i}(x)$. Hence, $r_{ij,k}$ is completely symmetric with respect to $i$, $j$, and $k$. Let $\Gamma'_{ijk} = \Gamma_{ijk} + 2 r_{ij,k}$. Then, $\Gamma'_{ijk}$ defines a torsion-free symplectic connection with induced differential $\partial' = d + [\frac{i}{\hbar}\Gamma', \cdot\,]$, where $\Gamma' = \frac{1}{2}\Gamma'_{ijk}\, y^i y^j\, dx^k$. It is easy to see that $\partial' = \partial + [\frac{i}{\hbar}\gamma_2, \cdot\,]$. Let $\gamma' = \gamma - \gamma_2$. Then $\gamma' \in W_3 \otimes \Lambda^1$, and $D = -\delta + \partial' + [\frac{i}{\hbar}\gamma', \cdot\,]$. This shows that $D$ is indeed a Fedosov connection.

Lemma 3.9. For any $\tilde{a} \in W_D$, there is $\tilde{a}_0 \in A$ and $\tilde{b} \in W_D$ such that $\tilde{a} = \tilde{a}_0 + \hbar\tilde{b}$.

Proof. Let $a = \tilde{a}|_{y=0} = a_0(x) + \hbar a_1(x) + \cdots \in C^\infty(M)[[\hbar]]$. Take $\tilde{a}_0 \in A$ such that $\tilde{a}_0 = a_0 + \delta^{-1} da_0$ mod $W_2$, which is always possible by assumption. Thus, $\tilde{a}_0|_{y=0} = a_0(x) + O(\hbar)$ and then $(\tilde{a} - \tilde{a}_0)|_{y=0} = O(\hbar)$. However, we know that $\tilde{a} - \tilde{a}_0 \in W_D$ since $\tilde{a}_0 \in A \subseteq W_D$. This implies that $\tilde{a} - \tilde{a}_0 = \hbar\tilde{b}$ for some $\tilde{b} \in W_D$.

We now return to the proof of Theorem 3.7. By using the lemma above repeatedly, one immediately obtains the other direction of inclusion, i.e., $W_D \subseteq A$. This concludes our proof.

Clearly, Theorem 3.4 is an immediate consequence of Theorem 3.7. As another application, we will give a geometric constructive proof of the following (see [7, 13, 25]):

Theorem 3.10. Any ∗-product is equivalent to a Fedosov ∗-product.

Proof. As in Sect. 2, let $P = TM$ be the regular Poisson manifold equipped with the fiberwise linear symplectic structures. Let $\mathcal{P} = \Gamma(\cup_{m \in M} C_0^\infty(T_mM, \mathbb{R}))$, where $C_0^\infty(T_mM, \mathbb{R})$ denotes the set of $\infty$-jets at 0 of real valued functions on $T_mM$. Thus, $\mathcal{P}$ becomes a Poisson algebra with a naturally induced Poisson structure. The following lemma is of independent interest.
Lemma 3.11. Any ∗-product on $C^\infty(M)[[\hbar]]$ induces a ∗-product on the Poisson manifold $P = TM$ so that there is an algebra embedding $\rho : C^\infty(M)[[\hbar]] \longrightarrow \mathcal{P}[[\hbar]]$.

Proof. Take any torsion-free symplectic connection $\nabla$, and for any fixed $m \in M$, let $\mathrm{Exp}_m$ be the formal symplectic exponential map introduced by Emmrich and Weinstein [12]. Then, $(\mathrm{Exp}_m)^* : C^\infty(M) \longrightarrow C_0^\infty(T_mM, \mathbb{R})$ is a Poisson algebra morphism, which in fact maps $\mathrm{jet}_m^\infty C^\infty(M)$ to $C_0^\infty(T_mM, \mathbb{R})$ isomorphically. Therefore, any ∗-product on $C^\infty(M)[[\hbar]]$ induces a ∗-product on $C_0^\infty(T_mM, \mathbb{R})[[\hbar]]$, hence on $C^\infty(T_mM)[[\hbar]]$. Thus, we obtain a regular ∗-product (see Appendix B for the definition), denoted by $\tilde{*}_\hbar$, on the Poisson manifold $P$, and $\mathrm{Exp}^*$ is clearly an embedding of the algebra.

According to Proposition 9.1 in Appendix B, a regular ∗-product on $P$ is essentially unique. Hence, there exists an equivalence operator $T_\hbar = 1 + \hbar T_1 + \hbar^2 T_2 + \cdots$, with the $T_i$ being leafwise differential operators, between $(\mathcal{P}[[\hbar]], \tilde{*}_\hbar)$ and the standard Weyl quantization $\Gamma W$. Let $A = (T_\hbar \circ \mathrm{Exp}^*)(C^\infty(M)[[\hbar]]) \subset \Gamma W$. Then $A$ is a subalgebra of $\Gamma W$. It is simple to see that $A$ satisfies all the conditions in Theorem 3.7, so it is a Fedosov algebra. This concludes the proof of the theorem.

Since every ∗-product is equivalent to a Fedosov ∗-product, its characteristic class, as defined by Nest–Tsygan [25], is the class in $H^2(M)[[\hbar]]$ of the Weyl curvature of its equivalent Fedosov ∗-product. This is well-defined since the Weyl curvatures of equivalent Fedosov ∗-products are cohomologous (see [15]). In fact, Fedosov showed that two Fedosov ∗-products are equivalent iff their Weyl curvatures are cohomologous [15]. Thus (see [25]),

Theorem 3.12. Two ∗-products are equivalent iff they have the same characteristic class.

Remark. We note that a similar classification result due to Lecomte also appeared in [21].

4. Vey ∗-Products

Let $(M, \omega)$ be a symplectic manifold, and $\nabla$ a torsion-free symplectic connection with covariant derivative $\partial$. Define $\partial u = du$, $\partial^2 u = \partial(\partial u)$, $\partial^k u = \partial(\partial^{k-1} u)$ and so on. It is simple to see that $\partial^k u$ is a symmetric covariant $k$-tensor. Let $\pi$ be the Poisson bivector field on $M$. By $\pi^k$, we denote the $2k$-tensor $(\pi^k)^{i_1 \cdots i_k, j_1 \cdots j_k} = \pi^{i_1 j_1} \pi^{i_2 j_2} \cdots \pi^{i_k j_k}$. Set
\[
P_\nabla^k(u, v) = \langle \pi^k, \partial^k u \otimes \partial^k v \rangle = \pi^{i_1 j_1} \cdots \pi^{i_k j_k} (\partial_{i_1} \cdots \partial_{i_k} u)(\partial_{j_1} \cdots \partial_{j_k} v).
\]
In particular, $P_\nabla^0(u, v) = uv$ and $P_\nabla^1(u, v) = \{u, v\}$.
Definition 4.1 ([5]). A Vey ∗-product is a star product on $C^\infty(M)[[\hbar]]$ such that
\[
u *_\hbar v = \sum_k \frac{1}{k!} \Big(-\frac{i\hbar}{2}\Big)^k Q_k(u, v), \tag{22}
\]
where $Q_k$ is a bidifferential operator of maximum order $k$ in each argument and its principal symbol coincides with that of $P_\nabla^k$.
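Writing out the first terms of Eq. (22) may help fix conventions. The sketch below uses the usual normalization $Q_0(u, v) = uv$ and $Q_1(u, v) = \{u, v\}$ (an assumption here, matching $P_\nabla^0$ and $P_\nabla^1$). As a model case it takes $M = \mathbb{R}^{2n}$ with constant $\pi$ and the flat connection, where the Moyal–Weyl product has $Q_k = P_\nabla^k$ for all $k$.

```latex
\begin{align*}
u *_\hbar v &= uv - \frac{i\hbar}{2}\{u, v\}
   + \frac{1}{2}\Bigl(\frac{i\hbar}{2}\Bigr)^{2} Q_2(u, v) + \cdots, \\
\text{Moyal--Weyl on } \mathbb{R}^{2n}:\quad
Q_2(u, v) &= P^2_\nabla(u, v)
   = \pi^{i_1 j_1}\pi^{i_2 j_2}\,
     (\partial_{i_1}\partial_{i_2} u)(\partial_{j_1}\partial_{j_2} v).
\end{align*}
```

The coefficient of $Q_2$ comes from $\frac{1}{2!}(-\frac{i\hbar}{2})^2 = \frac{1}{2}(\frac{i\hbar}{2})^2$.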
The main result of the section is

Theorem 4.2. Any Fedosov ∗-product is a Vey ∗-product. Moreover, if $D = -\delta + \partial + [\frac{i}{\hbar}\gamma, \cdot\,]$ as in Theorem 2.2, then
\[
Q_2(u, v) = P_\nabla^2(u, v) + C_2(u, v), \tag{23}
\]
where $C_2$ is a bidifferential operator of maximum order 1 in each argument.

Given a Fedosov connection $D = -\delta + \partial + [\frac{i}{\hbar}\gamma, \cdot\,]$, any element
\[
a(x, y, \hbar) = \sum \hbar^i a_{i, j_1 \cdots j_k}(x)\, y^{j_1} \cdots y^{j_k} \tag{24}
\]
in $W_D$ is determined by the iteration formula (19). Assume that $a_0(x) = \sigma(a) = a(x, 0, \hbar) \in C^\infty(M)[[\hbar]]$. It is clear that each coefficient in Eq. (24) can be expressed as $a_{i, j_1 \cdots j_k}(x) = D_{i, j_1 \cdots j_k} a_0$ for a certain differential operator $D_{i, j_1 \cdots j_k}$. We will say that the term $\hbar^i a_{i, j_1 \cdots j_k}(x)\, y^{j_1} \cdots y^{j_k}$ is of degree $(i + k, s)$ if $D_{i, j_1 \cdots j_k}$ is a differential operator of degree $s$.

Proposition 4.3. Under the same hypothesis as above, assume that
\[
a(x, y, \hbar) = \sum \hbar^i a_{i, j_1 \cdots j_k}(x)\, y^{j_1} \cdots y^{j_k} \in W_D.
\]
Then
(i) $a_{i, j_1 \cdots j_k}(x) = D_{i, j_1 \cdots j_k} a_0$, where $D_{i, j_1 \cdots j_k}$ is a differential operator of degree not greater than $i + k$.
(ii) $a(x, y, 0) = \sum_{k=0}^{\infty} \frac{1}{k!} (\partial_{i_1} \cdots \partial_{i_k} a_0)\, y^{i_1} \cdots y^{i_k} + H$, where all the terms in the remainder $H$ have degree $(t, s)$ with $t > s$.
(iii) For any $j$, the order of the differential operator $D_{1,j}$ is not greater than 1.

Proof. Since $a(x, y, \hbar)$ is generated by the iteration formula (19), we prove (i) by induction. For this purpose, we need to analyse the effect of the operators $\delta^{-1}\partial$ and $\delta^{-1}[\frac{i}{\hbar}\gamma, \cdot\,]$ on an arbitrary term of degree $(i + k, s)$. It is obvious that $\delta^{-1}\partial$ maps terms of degree $(i + k, s)$ to those of degree $(i + k + 1, s + 1)$. On the other hand, it is not difficult to check that $\delta^{-1}[\frac{i}{\hbar}\gamma, \cdot\,]$ maps terms of degree $(i + k, s)$ to those of degree $(m, s)$ with $m \geq i + k$. Claim (i) thus follows immediately.

To prove (ii), we need to analyse the $\hbar^0$-term produced by the operator $\delta^{-1}[\frac{i}{\hbar}\gamma, \cdot\,]$ in the iteration formula (19). The only possibility would be $\delta^{-1}(-i\hbar\{\frac{i}{\hbar}\gamma_0, \tilde{a}\})$. Here $\gamma_0 = \gamma(x, y, 0, dx)$ and $\tilde{a} = a_{0, j_1 \cdots j_k}(x)\, y^{j_1} \cdots y^{j_k}$, which is assumed of degree $(l, k)$. By Part (i), we know that $l \geq k$. Now $\delta^{-1}(-i\hbar\{\frac{i}{\hbar}\gamma_0, \tilde{a}\}) = \delta^{-1}\{\gamma_0, \tilde{a}\}$. Since $\gamma_0$ is at least cubic in $y$ ($\deg \gamma \geq 3$), the latter is of degree $(l, k')$ with $k'$ being at least $k + 2$. Therefore, all terms of degree $(k, k)$ in $a(x, y, 0)$ come solely from the iteration of the operator $\delta^{-1}\partial$ on elements of the same type. The conclusion thus follows immediately.

To prove (iii), we will concentrate on terms of the type
\[
\sum \hbar\, a_{1,j}(x)\, y^j. \tag{25}
\]
They have degree $(l, 2)$ according to Part (i). Since the operator $\delta^{-1}\partial$ maps any $(t, s)$-term into a $(t + 1, s + 1)$-term, and $\delta^{-1}\partial$ does not produce any $\hbar$, the only possible way to obtain such a term is when $\delta^{-1}\partial$ acts on terms of the type $\hbar a_1(x)$. This however will never happen since we already assume that $a(x, 0, \hbar) = a_0(x)$ is independent of $\hbar$. As for the operator $\delta^{-1}[\frac{i}{\hbar}\gamma, \cdot\,]$, the only possible way of producing terms of such a type is $\delta^{-1}(-i\hbar\{\frac{i}{\hbar}\gamma', a'\}) = \delta^{-1}\{\gamma', a'\}$. Here $\gamma'$ is any term in $\gamma$ having the form $\hbar\gamma_{i,j}(x)\, y^i\, dx^j$, and $a'$ is any term in $a(x, y, \hbar)$ of the form $a' = a_{0,m}(x)\, y^m$. Thus, $\delta^{-1}\{\gamma', a'\} = \hbar\gamma_{ij}(x)\, a_{0,m}(x)\, \omega^{im}(x)\, y^j$ is of degree $(1, 2)$. This concludes the proof.

An immediate consequence of Proposition 4.3 is the following:

Corollary 4.4. Let $a(x, y, \hbar) \in W_D$, and let $x_0 \in M$ be any point. Then, $a(x_0, y, \hbar) = 0$, $\forall y \in T_{x_0}M$ iff $\mathrm{jet}_{x_0}^\infty a(x, \hbar) = 0$, where $a(x, \hbar) = a(x, 0, \hbar) = \sigma(a(x, y, \hbar))$.

Proof. One direction follows directly from Part (i) of Proposition 4.3. To prove the other direction, let $a_0(x) = a(x, 0, 0)$ and assume that $a(x, \hbar) = a_0(x) + \hbar a_1(x) + \hbar^2 a_2(x) + \cdots$. We shall prove, as the first step, that $\mathrm{jet}_{x_0}^\infty a_0(x) = 0$. Clearly, $a_0(x_0) = 0$. Assume that all the derivatives of $a_0$ up to the $(n-1)$th order vanish at $x_0$. According to Part (ii) of Proposition 4.3,
\[
a(x_0, y, 0) = \sum_{k=0}^{\infty} \frac{1}{k!} (\partial_{i_1} \cdots \partial_{i_k} a_0)\, y^{i_1} \cdots y^{i_k} + H,
\]
where each term in $H$ is of degree $(t, s)$ with $t > s$. Since the coefficient of $y^{i_1} \cdots y^{i_n}$ is zero, it follows that $\frac{1}{n!} (\partial_{i_1} \cdots \partial_{i_n} a_0)(x_0) + (Da_0)(x_0) = 0$, for some differential operator $D$ of degree less than $n$. By using the induction assumption, we deduce that $(\partial_{i_1} \cdots \partial_{i_n} a_0)(x_0) = 0$. This proves that $\mathrm{jet}_{x_0}^\infty a_0(x) = 0$. Let $a'(x, y, \hbar) \in W_D$ be the parallel section corresponding to $a_0(x)$. Then $a'(x_0, y, \hbar) = 0$, $\forall y \in T_{x_0}M$. By considering the element $\frac{1}{\hbar}[a(x, y, \hbar) - a'(x, y, \hbar)] \in W_D$, one obtains that $\mathrm{jet}_{x_0}^\infty a_1(x) = 0$. The conclusion thus follows by using the same argument repeatedly.

Proof of Theorem 4.2. Let $a_0(x)$ and $b_0(x)$ be any smooth functions on $M$, and $a(x, y, \hbar)$ and $b(x, y, \hbar)$ their corresponding parallel sections in $W_D$. Assume that
\[
a(x, y, \hbar) = \sum \hbar^i a_{i, j_1 \cdots j_k}(x)\, y^{j_1} \cdots y^{j_k}, \quad \text{and} \quad b(x, y, \hbar) = \sum \hbar^i b_{i, j_1 \cdots j_k}(x)\, y^{j_1} \cdots y^{j_k}.
\]
By the definition of the Moyal–Weyl product,
\[
\begin{aligned}
a(x, y, \hbar) * b(x, y, \hbar)|_{y=0}
&= \sum \Big(-\frac{i}{2}\Big)^p \frac{1}{p!}\, \hbar^{k+l+p}\, a_{k, i_1 \cdots i_p}(x)\, b_{l, j_1 \cdots j_p}(x)\, \pi^{t_1 s_1} \cdots \pi^{t_p s_p}\, \frac{\partial^p (y^{i_1} \cdots y^{i_p})}{\partial y^{t_1} \cdots \partial y^{t_p}}\, \frac{\partial^p (y^{j_1} \cdots y^{j_p})}{\partial y^{s_1} \cdots \partial y^{s_p}} \\
&= \sum_n \sum_{k+l+p=n} \Big(-\frac{i}{2}\Big)^p \frac{1}{p!}\, \hbar^{k+l+p}\, (D_{k, i_1 \cdots i_p} a_0)(D_{l, j_1 \cdots j_p} b_0)\, \pi^{t_1 s_1} \cdots \pi^{t_p s_p}\, \frac{\partial^p (y^{i_1} \cdots y^{i_p})}{\partial y^{t_1} \cdots \partial y^{t_p}}\, \frac{\partial^p (y^{j_1} \cdots y^{j_p})}{\partial y^{s_1} \cdots \partial y^{s_p}}.
\end{aligned}
\]
According to Proposition 4.3, the order of the differential operator $D_{k, i_1 \cdots i_p}$ is not greater than $k + p$, which is less than or equal to $n$; similarly for the order of $D_{l, j_1 \cdots j_p}$. If $l \neq 0$, then $k + p = n - l < n$, so the order of $D_{k, i_1 \cdots i_p}$ is less than $n$. Similarly, if $k \neq 0$, the order of $D_{l, j_1 \cdots j_p}$ is less than $n$. So in order to have the maximum order in both arguments, it is necessary that $k = l = 0$, and in this case $p = n$. Using Part (ii) of Proposition 4.3, it is simple to see that the principal part is
\[
\sum \Big(-\frac{i\hbar}{2}\Big)^n \frac{1}{n!}\, (\partial_{i_1} \cdots \partial_{i_n} a_0)(\partial_{j_1} \cdots \partial_{j_n} b_0)\, \pi^{i_1 j_1} \cdots \pi^{i_n j_n}.
\]
For the $\hbar^2$-term, we need $k + l + p = 2$. If $k = l = 0$ and $p = 2$, we obtain the principal term. When $k = p = 1$ and $l = 0$, then $a_{1, i_1}(x) = D_{1, i_1} a_0$ and $b_{0, j_1}(x) = D_{0, j_1} b_0$. According to Proposition 4.3, both $D_{1, i_1}$ and $D_{0, j_1}$ have degrees not greater than 1; similarly for the case that $l = p = 1$ and $k = 0$. This concludes the proof.

As an immediate consequence, we obtain the following result, which was first proved by Lichnerowicz using homological methods [23].

Corollary 4.5 (Lichnerowicz). Any ∗-product on a symplectic manifold is equivalent to a Vey ∗-product.

Remark. As is well known, the equivalence class of a Fedosov ∗-product is determined by the class of its Weyl curvature in $H^2(M)[[\hbar]]$, which is independent of the symplectic connection used in the construction. On the other hand, as we see in Theorem 4.2, the symplectic connection is completely encoded in the Fedosov ∗-product itself. In fact, if two Fedosov connections $D_i = -\delta + \partial_i + [\frac{i}{\hbar}\gamma_i, \cdot\,]$ induce an identical ∗-product on $C^\infty(M)[[\hbar]]$, it is necessary that their symplectic connections coincide, i.e., $\partial_1 = \partial_2$. We would like to conjecture that:

Two Fedosov connections $D_i = -\delta + \partial_i + [\frac{i}{\hbar}\gamma_i, \cdot\,]$ induce an identical ∗-product on $C^\infty(M)[[\hbar]]$ iff $D_1 = D_2$.

To prove this, one needs to show that the Weyl curvatures of $D_1$ and $D_2$ are not only cohomologous, but indeed coincide, or equivalently $W_{D_1} = W_{D_2}$. In other words, one needs to decode the Weyl curvature (not only its class in $H^2(M)[[\hbar]]$) from a Fedosov ∗-product directly. We end this section with the following inverse question:

Question. Is every Vey ∗-product necessarily a Fedosov ∗-product?
5. Quantization of Lagrangian Submanifolds

Lagrangian submanifolds play a fundamental role in the study of symplectic manifolds. In fact, according to the “symplectic creed”, everything can be thought of as a lagrangian submanifold [29]. It is very natural to ask: what do lagrangian submanifolds correspond to under a deformation quantization? This is what we aim to answer in this section.

Lemma 5.1. Let $*_\hbar$ be the Moyal–Weyl product for a symplectic vector space $V$. Suppose that $L \subset V$ is a lagrangian subspace. If both $f$ and $g \in C^\infty(V)$ vanish on $L$, so does $f *_\hbar g$.
Proof. Note that the Moyal–Weyl formula, Eq. (2), is independent of the choice of linear coordinates. Let us choose a lagrangian subspace $K$ supplementary to $L$, and linear coordinates $(q_1, \cdots, q_n) \in L$, $(p_1, \cdots, p_n) \in K$ such that $\omega(q_i, p_j) = \delta_{ij}$. Then the claim follows directly from Eq. (2).

Let $L$ be a lagrangian submanifold. For any $x \in L$, by $W_x^L$ we denote the subspace of the Weyl algebra $W_x$ consisting of all elements which vanish when restricted to $T_xL$. According to Lemma 5.1, $W_x^L$ is a subalgebra. Let $W^L = \cup_{x \in L} W_x^L$. Then $W^L$ is a subbundle of the Weyl bundle. Similarly, we can define $W_1^L$, $W_2^L$, etc. according to the natural filtration in $W$. We say that an element $a \in W_p \otimes \Lambda^q$ belongs to the subspace $(W_p \otimes \Lambda^q)^L$ if for any $v_1, \cdots, v_q \in T_xL$, $a(v_1, \cdots, v_q) \in W_p^L$. By $(W \otimes \Lambda)^L$, we denote the direct sum $\oplus_{p,q} (W_p \otimes \Lambda^q)^L$. It is clear that $(W \otimes \Lambda)^L$ is invariant under both $\delta$ and $\delta^{-1}$. More precisely, $\delta$ maps $(W_p \otimes \Lambda^q)^L$ into $(W_{p-1} \otimes \Lambda^{q+1})^L$, while $\delta^{-1}$ maps $(W_p \otimes \Lambda^q)^L$ into $(W_{p+1} \otimes \Lambda^{q-1})^L$. Let $\nabla$ be a torsion-free symplectic connection on $M$ such that $L$ is totally geodesic. Then its induced differential $\partial$ maps $(W \otimes \Lambda^q)^L$ into $(W \otimes \Lambda^{q+1})^L$.

Proposition 5.2. Under the same hypothesis as in Theorem 2.2, if in addition $L$ is a totally geodesic submanifold which is lagrangian with respect to the Weyl curvature $\Omega$ (i.e., $i^*\Omega = 0$), then $\gamma$ belongs to $(W_3 \otimes \Lambda^1)^L$. Here $i : L \longrightarrow M$ is the embedding. Therefore, the corresponding Fedosov connection $D$ preserves $(W \otimes \Lambda)^L$.

Proof. According to Eq. (17), $\gamma$ is determined by the iteration formula:
\[
\gamma = \delta^{-1}\tilde{\Omega} + \delta^{-1}\Big(\partial\gamma + \frac{i}{\hbar}\gamma^2\Big), \tag{26}
\]
where $\tilde{\Omega} = \Omega - \omega + R$. It is easy to see that $R \in (W_2 \otimes \Lambda^2)^L$ since $L$ is a totally geodesic lagrangian submanifold. Therefore, it follows that $\tilde{\Omega} \in (W_2 \otimes \Lambda^2)^L$. It is then clear, from the above iteration formula (see also Eq. (18)), that $\gamma \in (W_3 \otimes \Lambda^1)^L$ since $(W \otimes \Lambda^*)^L$ is invariant under all the operators involved in Eq. (26).

Let $W_D^L$ denote the subspace of $W_D$ consisting of sections whose restriction to $L$ belongs to $W^L$. As an immediate consequence, we have

Corollary 5.3. Under the same hypothesis as in Proposition 5.2, $W_D^L$ is a subalgebra of $W_D$.

By $C_L^\infty(M)$, we denote the space of smooth functions on $M$ which vanish on $L$.

Proposition 5.4. Under the same hypothesis as in Proposition 5.2,
\[
\sigma(W_D^L) = C_L^\infty(M)[[\hbar]],
\]
where $\sigma : W_D \longrightarrow C^\infty(M)[[\hbar]]$ is the isomorphism introduced in Sect. 2. Therefore, $C_L^\infty(M)[[\hbar]]$ is a subalgebra of the Fedosov ∗-algebra $(C^\infty(M)[[\hbar]], *_\hbar)$.

Proof. According to Theorem 2.3, if $a \in W_D$ with $\sigma(a) = a_0$, $a$ is determined by the iteration formula
\[
a = a_0 + \delta^{-1}\Big(\partial a + \Big[\frac{i}{\hbar}\gamma, a\Big]\Big).
\]
If $a_0 \in C_L^\infty(M)[[\hbar]]$, then $a_0 \in W^L$. It follows immediately that $a \in W^L$, since $\gamma \in (W_3 \otimes \Lambda^1)^L$ and all the operators involved in the equation above preserve the space $(W \otimes \Lambda)^L$. Conversely, if $a \in W_D^L$, it is obvious that $a_0 = \sigma(a) \in W^L$.

Example 5.5. If $f : M \longrightarrow N$ is a symplectic diffeomorphism, its graph $G_f = \{(x, f(x)) \mid x \in M\}$ is a lagrangian submanifold of $S = M \times \bar{N}$. Moreover, if $f^*\Omega_N = \Omega_M$, where $\Omega_M$ and $\Omega_N$ are Weyl curvatures on $M$ and $N$, respectively, $G_f$ is lagrangian with respect to $(\Omega_M, \Omega_N)$. Let $S$ be equipped with a product symplectic connection $\nabla \times \tilde{\nabla}$. It is easy to see that $G_f$ is totally geodesic iff $\tilde{\nabla} = f_*\nabla$. In this case, $C_L^\infty(S)[[\hbar]]$, with $L = G_f$, is a subalgebra of the corresponding Fedosov ∗-algebra $(C^\infty(S)[[\hbar]], *_\hbar)$. This implies that $f_*$ is an algebra morphism between the Fedosov ∗-algebras $(C^\infty(M)[[\hbar]], *_\hbar)$ and $(C^\infty(N)[[\hbar]], *_\hbar)$.

The following lemma indicates that a totally geodesic symplectic connection always exists for any given lagrangian submanifold.

Lemma 5.6. Given any lagrangian submanifold $L \subset M$, there always exists a torsion-free symplectic connection on $M$ such that $L$ is totally geodesic.

Proof. First, take any torsion-free connection $\nabla$ on $M$ such that $L$ is totally geodesic. Any other connection can be written as
\[
\tilde{\nabla}_X Y = \nabla_X Y + S(X, Y), \quad \forall X, Y \in \mathfrak{X}(M), \tag{27}
\]
where $S$ is a $(2, 1)$-tensor. Clearly, $\tilde{\nabla}$ is torsion-free iff $S$ is symmetric, i.e., $S(X, Y) = S(Y, X)$ for any $X, Y \in \mathfrak{X}(M)$. $\tilde{\nabla}$ is symplectic iff $\tilde{\nabla}_X \omega = 0$. The latter is equivalent to
\[
\omega(S(X, Y), Z) - \omega(S(X, Z), Y) = (\nabla_X \omega)(Y, Z). \tag{28}
\]
Let $S$ be the $(2, 1)$-tensor defined by the equation:
\[
\omega(S(X, Y), Z) = \frac{1}{3}\big[(\nabla_X \omega)(Y, Z) + (\nabla_Y \omega)(X, Z)\big]. \tag{29}
\]
Clearly, $S(X, Y)$, defined in this way, is symmetric with respect to $X$ and $Y$. Now
\[
\begin{aligned}
\omega(S(X, Y), Z) - \omega(S(X, Z), Y)
&= \frac{1}{3}\big[(\nabla_X \omega)(Y, Z) + (\nabla_Y \omega)(X, Z)\big] - \frac{1}{3}\big[(\nabla_X \omega)(Z, Y) + (\nabla_Z \omega)(X, Y)\big] \\
&= \frac{1}{3}\big[(\nabla_X \omega)(Y, Z) + (\nabla_Y \omega)(X, Z) + (\nabla_X \omega)(Y, Z) + (\nabla_Z \omega)(Y, X)\big] \\
&= (\nabla_X \omega)(Y, Z),
\end{aligned}
\]
where the last step follows from the identity:
\[
(\nabla_X \omega)(Y, Z) + (\nabla_Y \omega)(Z, X) + (\nabla_Z \omega)(X, Y) = 0.
\]
This means that $\tilde{\nabla}$ is a torsion-free symplectic connection. It is clear from Eq. (29) that if $X$, $Y$ and $Z$ are all tangent to $L$, then $\omega(S(X, Y), Z) = 0$, since $L$ is totally geodesic with respect to $\nabla$. Hence, $S(X, Y)$ is tangent to $L$ whenever $X$, $Y$ are tangent to $L$. In other words, $L$ is totally geodesic with respect to $\tilde{\nabla}$.
Since a ∗-product is always equivalent to a Fedosov ∗-product, as a consequence, any lagrangian submanifold, under a deformation quantization, becomes a subalgebra after some “quantum correction”. More precisely, we have

Theorem 5.7. Let $*_\hbar$ be a ∗-product on a symplectic manifold $(M, \omega)$ with characteristic class $[\Omega] \in H^2(M)[[\hbar]]$. Suppose that $L$ is a lagrangian submanifold such that $i^*[\Omega] \in H^2(L)[[\hbar]]$ vanishes, where $i : L \longrightarrow M$ is the embedding. Then there exists a formal operator $T_\hbar = 1 + \hbar T_1 + \hbar^2 T_2 + \cdots$, with the $T_i$ being differential operators on $M$, such that $T_\hbar(C_L^\infty(M)[[\hbar]])$ is a subalgebra of $(C^\infty(M)[[\hbar]], *_\hbar)$.

Proof. By assumption, $i^*\Omega$ is an exact two-form on $L$, i.e., $i^*\Omega = d\theta_L$ for some $\theta_L \in \Omega^1(L)$. Extending $\theta_L$ to a one-form on $M$, we may assume that $i^*\Omega = d\, i^*\theta$ for some $\theta \in \Omega^1(M)$. Let $\tilde{\Omega} = \Omega - d\theta$. Take a torsion-free symplectic connection $\nabla$ such that $L$ is totally geodesic. Let $\bar{*}_\hbar$ be the corresponding Fedosov ∗-product with Weyl curvature $\tilde{\Omega}$. Then $C_L^\infty(M)[[\hbar]]$ is a $\bar{*}_\hbar$-subalgebra. According to Theorem 3.12, $*_\hbar$ and $\bar{*}_\hbar$ are equivalent ∗-products. The conclusion thus follows immediately.

Remark. (1) The quantum counterparts of lagrangian submanifolds, according to Lu [24], are left ideals. However, it is not clear how this can be realized for ∗-products in our case. It seems that a possible candidate would be the space $C_L^\infty(M)[[\hbar]]$ modified in a certain way.
(2) For a symplectic manifold $M$, a coisotropic submanifold is a submanifold $C$ such that the space of functions vanishing on $C$ is a Poisson subalgebra. It is natural to expect that the above result can be generalized to any coisotropic submanifold. But we cannot prove this at the moment, because for a general coisotropic submanifold $C$ it is not clear whether there always exists a symplectic connection such that $C$ is totally geodesic.
6. Quantum Momentum Maps

This section is devoted to the study of deformation quantization of a symplectic $G$-space. In particular, we introduce the notion of quantum momentum maps, which plays the role of a quantum analogue of the usual momentum maps. Let $(M, \omega)$ be a symplectic $G$-space with the action $\Phi_g : M \longrightarrow M$, $\forall g \in G$. A ∗-product on $M$ is called $G$-equivariant if for any $u$ and $v \in C^\infty(M)[[\hbar]]$,
\[
\Phi_g^*(u *_\hbar v) = (\Phi_g^* u) *_\hbar (\Phi_g^* v). \tag{30}
\]
In general, $M$ does not necessarily admit a $G$-equivariant ∗-product. It is known [5, 23] that the existence of such a ∗-product is closely related to the existence of a $G$-invariant connection on the manifold. Recall that a natural ∗-product is a ∗-product:
\[
u *_\hbar v = uv - \frac{i\hbar}{2}\{u, v\} + \hbar^2 C_2(u, v) + \cdots,
\]
where the $\hbar^2$-term $C_2$ is a bidifferential operator of order 2 in each argument.

Proposition 6.1. Let $M$ be a symplectic $G$-space. $M$ admits a $G$-equivariant natural ∗-product iff there exists a $G$-invariant connection on $M$.
Proof. Assume that there exists a $G$-equivariant natural $*$-product
$$u *_\hbar v = uv - \frac{i\hbar}{2}\{u, v\} + \frac{1}{2}\Big(\frac{i\hbar}{2}\Big)^2 Q_2(u, v) + \cdots.$$
According to Proposition 10.1, there is a unique symplectic connection $\nabla$ such that
$$Q_2(u, v) = P^2_\nabla(u, v) + H(u, v),$$
where $H(u, v)$ is a bidifferential operator of maximum order 1 in each argument. Since the $*$-product is $G$-equivariant and $\nabla$ is uniquely determined by it, $\nabla$ is $G$-invariant. Conversely, suppose that there exists a $G$-invariant connection on $M$. As is well known, there is a standard method of producing a torsion-free symplectic connection from an arbitrary affine connection. If one starts with a $G$-invariant connection, the resulting symplectic connection is $G$-invariant as well. Then the corresponding Fedosov $*$-product (with a $G$-invariant symplectic connection $\nabla$ and Weyl curvature $\omega$) is $G$-equivariant.

A $G$-invariant connection always exists if $G$ is compact. However, when $G$ is noncompact, there are cases where $G$-invariant connections do not exist. Various attempts have been made to handle such a situation; for details, readers can consult [2, 3, 17].

In what follows we will nevertheless assume that the star-product $*_\hbar$ is $G$-equivariant. Thus the corresponding infinitesimal action $\xi \longrightarrow \hat\xi$ defines a Lie algebra homomorphism from $\mathfrak g$ to the space of derivations $\mathrm{Der}\,C^\infty(M)[[\hbar]]$. By $U\mathfrak g[[\hbar]]$ we denote the space of formal power series in $\hbar$ with coefficients in the universal enveloping algebra $U\mathfrak g$. Then $U\mathfrak g[[\hbar]]$ can be naturally identified with the universal enveloping algebra of the deformed Lie algebra $\mathfrak g_\hbar$, whose bracket is the $\hbar$-linear extension of
$$[X, Y]_\hbar = -i\hbar[X, Y], \quad \forall X, Y \in \mathfrak g. \tag{31}$$
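As a quick consistency check on the normalization in Eq. (31), the following short computation (a sketch that uses only the deformed bracket defined above) shows that the rescaled commutator $\frac{i}{\hbar}[\cdot,\cdot]$ in the enveloping algebra of $\mathfrak g_\hbar$ reproduces the original bracket of $\mathfrak g$ on generators:

```latex
% In U(g_hbar), the commutator of two generators is the deformed bracket (31):
\[
  XY - YX = [X, Y]_\hbar = -i\hbar\,[X, Y], \qquad X, Y \in \mathfrak{g}.
\]
% Multiplying by i/hbar removes the deformation factor:
\[
  \frac{i}{\hbar}\,(XY - YX) = \frac{i}{\hbar}\,(-i\hbar)\,[X, Y] = [X, Y].
\]
```

This is the reason the factor $\frac{i}{\hbar}$ accompanies the commutators that appear in the rest of this section.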
Below, the algebra structure on $U\mathfrak g[[\hbar]]$ always means the one induced from the universal enveloping algebra of $\mathfrak g_\hbar$.

Definition 6.2. A quantum momentum map is a homomorphism of associative algebras
$$\mu_\hbar : U\mathfrak g[[\hbar]] \longrightarrow C^\infty(M)[[\hbar]]$$
such that for any $\xi \in \mathfrak g$,
$$\hat\xi = \mathrm{ad}\,\frac{i}{\hbar}\mu_\hbar(\xi), \tag{32}$$
where both sides are considered as derivations on $C^\infty(M)[[\hbar]]$.

It is obvious from the definition that a necessary condition for the existence of a momentum map is that the derivation $\rho_\xi f = \hat\xi f$, $\forall \xi \in \mathfrak g$, be inner. Let us first assume that this is true. Thus there is a linear map from $\mathfrak g$ to $C^\infty(M)[[\hbar]]$, denoted by $\xi \longrightarrow a_\xi$, such that for any $f \in C^\infty(M)$ and $\xi \in \mathfrak g$,
$$\hat\xi f = \Big[\frac{i}{\hbar}a_\xi, f\Big]. \tag{33}$$
Therefore,
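$$\widehat{[\xi, \eta]}f = [\hat\xi, \hat\eta]f = \Big[\frac{i}{\hbar}a_\xi, \Big[\frac{i}{\hbar}a_\eta, f\Big]\Big] - \Big[\frac{i}{\hbar}a_\eta, \Big[\frac{i}{\hbar}a_\xi, f\Big]\Big] = \Big[\Big[\frac{i}{\hbar}a_\xi, \frac{i}{\hbar}a_\eta\Big], f\Big].$$
On the other hand, by definition,
$$\widehat{[\xi, \eta]}f = \Big[\frac{i}{\hbar}a_{[\xi,\eta]}, f\Big].$$
Therefore,
$$\Big[a_{[\xi,\eta]} - \frac{i}{\hbar}[a_\xi, a_\eta], f\Big] = 0, \quad \forall f \in C^\infty(M).$$
So $a_{[\xi,\eta]} - \frac{i}{\hbar}[a_\xi, a_\eta]$, as an element of $C^\infty(M)[[\hbar]]$, is constant on $M$. Define $\lambda : \wedge^2\mathfrak g \longrightarrow \mathbb C[[\hbar]]$ by
$$\lambda(\xi, \eta) = a_{[\xi,\eta]} - \frac{i}{\hbar}[a_\xi, a_\eta], \quad \forall \xi, \eta \in \mathfrak g. \tag{34}$$

Proposition 6.3. (i) $\lambda$ is a Lie algebra 2-cocycle; (ii) its cohomology class $[\lambda] \in H^2(\mathfrak g, \mathbb C[[\hbar]]) \cong H^2(\mathfrak g) \otimes \mathbb C[[\hbar]]$ is independent of the choice of the linear map $a_\xi$; (iii) a quantum momentum map exists iff $[\lambda] = 0$.

Proof. Assertions (i)–(ii) are quite obvious, and are left for the reader to check. For (iii), suppose that a quantum momentum map $\mu_\hbar$ exists. Then we may take $a_\xi = \mu_\hbar(\xi)$ as our linear map. In this case,
$$\lambda(\xi, \eta) = a_{[\xi,\eta]} - \frac{i}{\hbar}[a_\xi, a_\eta] = \mu_\hbar[\xi, \eta] - \frac{i}{\hbar}[\mu_\hbar\xi, \mu_\hbar\eta] = \mu_\hbar\Big([\xi, \eta] - \frac{i}{\hbar}[\xi, \eta]_\hbar\Big) = 0.$$
Conversely, if $[\lambda] = 0$, by adding a suitable coboundary we can always choose a linear map $a : \mathfrak g \longrightarrow C^\infty(M)[[\hbar]]$ such that Eq. (33) holds and $a_{[\xi,\eta]} - \frac{i}{\hbar}[a_\xi, a_\eta] = 0$. In other words, $a$ is a Lie algebra homomorphism from $\mathfrak g_\hbar$ to the commutator Lie algebra of $(C^\infty(M)[[\hbar]], *_\hbar)$. Therefore, it extends to an associative algebra morphism
$$\mu_\hbar : U\mathfrak g_\hbar \longrightarrow (C^\infty(M)[[\hbar]], *_\hbar).$$
Hence $\mu_\hbar$ is a quantum momentum map by definition. This concludes the proof.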
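Assertion (ii) of Proposition 6.3 can be checked in a few lines. The sketch below assumes a second choice $a'_\xi = a_\xi + c(\xi)$, where $c : \mathfrak g \to \mathbb C[[\hbar]]$ is a linear map; the symbols $a'$, $c$ and $\lambda'$ are introduced here only for illustration. By the same argument as in the text, two linear maps satisfying Eq. (33) differ by such constants.

```latex
% Constants are central for the star-commutator, so they drop out of [a_xi, a_eta]:
\[
  [a'_\xi, a'_\eta] = [a_\xi + c(\xi),\; a_\eta + c(\eta)] = [a_\xi, a_\eta].
\]
% Hence only the first term of (34) changes:
\[
  \lambda'(\xi, \eta) = a'_{[\xi,\eta]} - \frac{i}{\hbar}[a'_\xi, a'_\eta]
                      = \lambda(\xi, \eta) + c([\xi, \eta]).
\]
% Up to the sign convention chosen for the Chevalley-Eilenberg differential,
% (xi, eta) -> c([xi, eta]) is the coboundary of the 1-cochain c.
```

So $\lambda'$ and $\lambda$ differ by a coboundary, and $[\lambda'] = [\lambda]$.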
According to Theorem 8.2 below, derivations are automatically inner if $H^1(M) = 0$. Thus we have

Theorem 6.4. There exists a quantum momentum map if $H^1(M) = 0$ and $H^2(\mathfrak g) = 0$. In particular, a quantum momentum map exists if $M$ is simply connected and $\mathfrak g$ is semisimple.
We note that the above condition is exactly the same sufficient condition as for the existence of a classical momentum map [1]. However, there are many cases where classical momentum maps still exist even if this condition is not satisfied. It is reasonable to expect that this phenomenon occurs for quantum momentum maps as well. However, we do not know many examples apart from the following.

Example 6.5. Suppose that $Q$ is a $G$-manifold with action $\varphi_g$, which admits a $G$-invariant torsion-free connection $\nabla$. Let $M = T^*Q$ be equipped with the standard cotangent bundle symplectic structure. The $G$-action on $Q$ naturally lifts to a symplectic action $\Phi_g$ on $M = T^*Q$ with an equivariant momentum map $J : T^*Q \longrightarrow \mathfrak g^*$ [1]:
$$\langle J(\xi_q), X\rangle = \langle \hat X, \xi_q\rangle, \quad \forall \xi_q \in T_q^*Q,\ X \in \mathfrak g,$$
where $\hat X$ denotes the vector field on $Q$ generated by $X \in \mathfrak g$. To any differential operator $D$ we assign as its (complete) symbol the polynomial $S_D$ on $T^*Q$ given by
$$S_D(\xi_q) = D\, e^{\langle \exp_q^{-1}x,\ \xi_q\rangle}\Big|_{x=q}, \quad \forall \xi_q \in T_q^*Q. \tag{35}$$
Here $\exp_q : T_qQ \longrightarrow Q$ is the exponential map of the connection $\nabla$, defined in a neighborhood of 0. This assignment in fact establishes an isomorphism between the space $\mathcal D$ of differential operators on $Q$ and that of polynomials on $T^*Q$. Deform the commutation relations on $C^\infty(Q) \oplus \mathcal X(Q)$ according to
$$[Z, f]_\hbar = -i\hbar Zf, \quad \forall Z \in \mathcal X(Q) \text{ and } f \in C^\infty(Q),$$
and
$$[Y, Z]_\hbar = -i\hbar[Y, Z], \quad \forall Y, Z \in \mathcal X(Q).$$
Since differential operators are generated by $C^\infty(Q)$ and $\mathcal X(Q)$ over the module $C^\infty(Q)$, this deformed bracket induces an $\hbar$-dependent multiplication on $\mathcal D$,$^1$ which in turn induces an $\hbar$-dependent multiplication on the space of polynomials on $T^*Q$, hence a $*$-product on $T^*Q$. It is simple to see that for any $D \in \mathcal D$,
$$\Phi_g^* S_D = S_{g^{-1}\cdot D}, \tag{36}$$
where $g \cdot D \overset{\mathrm{def}}{=} \varphi^*_{g^{-1}} \circ D \circ \varphi^*_g$. In Eq. (36), by letting $g = \exp tX$, $\forall X \in \mathfrak g$, and taking the derivative at $t = 0$, one obtains immediately that
$$\hat X(S_D) = S_{[\hat X, D]}.$$
Here $\hat X$ on the RHS refers to the vector field on $Q$, while $\hat X$ on the LHS stands for the one on $T^*Q$ generated by $X \in \mathfrak g$. This implies that
$$\hat X f = \Big[\frac{i}{\hbar}J^*l_X, f\Big], \quad \forall f \in C^\infty(T^*Q).$$
In other words, $\mu_\hbar X = J^*l_X$ defines a quantum momentum map. However, it is not clear how to express $\mu_\hbar f$ explicitly for a general element $f \in U\mathfrak g[[\hbar]]$.

$^1$ A more intrinsic viewpoint is to think of the tangent bundle $TQ$ as a Lie algebroid, and of the above construction as deforming the Lie algebroid structure by multiplying by the factor $-i\hbar$. Such a construction then admits an immediate generalization to Lie algebroids, which should give rise to a $*$-product for the Lie-Poisson structure associated with a Lie algebroid.
Similar to the classical situation, quantum momentum maps are in general not unique. Assume that both $\mu_\hbar$ and $\nu_\hbar : U\mathfrak g[[\hbar]] \longrightarrow C^\infty(M)[[\hbar]]$ are quantum momentum maps. Let $\tau_\hbar : \mathfrak g \longrightarrow C^\infty(M)[[\hbar]]$ be the map
$$\tau_\hbar(\xi) = \mu_\hbar(\xi) - \nu_\hbar(\xi), \quad \forall \xi \in \mathfrak g.$$
Then, for any $f \in C^\infty(M)$, $\frac{i}{\hbar}[\tau_\hbar(\xi), f] = 0$. Thus it follows that $\tau_\hbar(\xi)$ is constant on $M$. Also, it is easy to see that $\tau_\hbar([\xi, \eta]) = 0$. That is, $\tau_\hbar : \mathfrak g \longrightarrow \mathbb C[[\hbar]]$ is a 1-cocycle. Since all 1-coboundaries are trivial, it follows that the quantum momentum map is unique if $H^1(\mathfrak g) = 0$.

Proposition 6.6. If $H^1(\mathfrak g) = 0$, then the quantum momentum map, if it exists, is unique.

In general, for $\xi \in \mathfrak g$, we have $\mu_\hbar\xi = \nu_\hbar\xi + \tau_\hbar\xi$. However, it is not clear how to express $\mu_\hbar f$ for a general $f \in U\mathfrak g[[\hbar]]$ in terms of $\nu_\hbar$ and $\tau_\hbar$. To see the relation between a quantum momentum map and a classical one, we start with the following simple

Lemma 6.7. Let
$$f *_\hbar g = \sum_k \hbar^k C_k(f, g)$$
be a $*$-product on a symplectic manifold $M$. Assume that $X \in \mathcal X(M)$ is a vector field on $M$ which is an inner derivation when considered as an operator on $C^\infty(M)[[\hbar]]$. Assume that $Xf = [\frac{i}{\hbar}a, f]$, $\forall f \in C^\infty(M)[[\hbar]]$. Then, modulo a constant formal power series in $\hbar$, $a$ is of the form $a = \sum_i \hbar^i a_i$, where $a_0$ is a hamiltonian function generating the vector field $X$, and $a_k$, for $k \ge 1$, is determined by the equation
$$\{a_k, f\} = -i\sum_{i+j=k+1,\ i\ge 2,\ j\ge 0}\big(C_i(a_j, f) - C_i(f, a_j)\big).$$
Proof. It is a direct verification, and is left for the reader.
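The verification can be sketched as follows, under the assumption (consistent with the natural $*$-products considered earlier in this section, but not stated in the lemma itself) that $C_0(f,g) = fg$ and $C_1(f,g) = -\frac{i}{2}\{f,g\}$. As in the lemma, the letter $i$ denotes the imaginary unit when it appears as a factor and a summation index when it appears as a subscript.

```latex
% Star-commutator of a = \sum_j \hbar^j a_j with f; the C_0 terms cancel by symmetry.
\[
  \frac{i}{\hbar}[a, f]
  = \frac{i}{\hbar}\sum_{i \ge 1}\sum_{j \ge 0}
    \hbar^{i+j}\bigl(C_i(a_j, f) - C_i(f, a_j)\bigr).
\]
% The i = 1 terms give Poisson brackets, since C_1(a_j,f) - C_1(f,a_j) = -i\{a_j, f\}:
\[
  \frac{i}{\hbar}[a, f]
  = \sum_{k \ge 0} \hbar^{k}\Bigl(\{a_k, f\}
    + i\sum_{\substack{i+j=k+1 \\ i \ge 2,\ j \ge 0}}
      \bigl(C_i(a_j, f) - C_i(f, a_j)\bigr)\Bigr).
\]
% Compare with Xf, which carries no power of hbar when f is in C^\infty(M):
\[
  \hbar^0:\ \{a_0, f\} = Xf, \qquad
  \hbar^k\ (k \ge 1):\ \{a_k, f\}
    = -i\sum_{\substack{i+j=k+1 \\ i \ge 2,\ j \ge 0}}
      \bigl(C_i(a_j, f) - C_i(f, a_j)\bigr).
\]
```

Each equation fixes $a_k$ only up to an additive constant, which matches the phrase "modulo a constant formal power series" in the statement.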
As a vector space, $U\mathfrak g[[\hbar]]$ is canonically isomorphic to $\mathrm{pol}(\mathfrak g^*)[[\hbar]]$, the space of formal power series in $\hbar$ with coefficients being polynomials on $\mathfrak g^*$. The isomorphism is established by symmetrization (see [6] for details). Therefore, the algebra structure on $U\mathfrak g[[\hbar]]$ induces a $*$-product on $\mathrm{pol}(\mathfrak g^*)[[\hbar]]$, which gives rise to a deformation quantization for the Lie-Poisson structure on $\mathfrak g^*$. Below, we will identify these two spaces and use them interchangeably if there is no confusion.

Proposition 6.8. Suppose that $*_\hbar$ is a $G$-equivariant $*$-product on a symplectic manifold $M$. Assume that $\mu_\hbar : U\mathfrak g[[\hbar]] \longrightarrow C^\infty(M)[[\hbar]]$ is a quantum momentum map. Then $M$ is a hamiltonian $G$-space, i.e., the symplectic $G$-action admits an equivariant (classical) momentum map $J$. Moreover,
$$\mu_\hbar f = J^*f + O(\hbar), \quad \forall f \in \mathrm{pol}(\mathfrak g^*).$$
Proof. Since $\mu_\hbar\xi$, $\forall \xi \in \mathfrak g$, depends on $\xi$ linearly, it defines a map $J : M \longrightarrow \mathfrak g^*$ uniquely by the relation $\mu_\hbar(l_\xi) = J^*l_\xi + O(\hbar)$, $\forall \xi \in \mathfrak g$. Then clearly $J$ is a (classical) momentum map according to Lemma 6.7. Write the 2-cocycle $\lambda$ defined by Eq. (34) as $\lambda = \lambda_0 + \hbar\lambda_1 + \cdots$. Then each $\lambda_i : \wedge^2\mathfrak g \longrightarrow \mathbb C$ is a Lie algebra 2-cocycle. It is simple to see that $\lambda_0(\xi, \eta) = J^*l_{[\xi,\eta]} - \{J^*l_\xi, J^*l_\eta\}$. The vanishing of $\lambda$ implies that $\lambda_0 = 0$, which means that $J$ is equivariant. The rest of the proposition thus follows trivially.

It is, however, not clear whether the converse of Proposition 6.8 is true. We end this section by posing the following

Question. Does the existence of a classical momentum map imply the existence of a quantum momentum map?

7. Quantum Dual Pair

This is a continuation of the last section. We assume the same hypotheses as in the previous section, and in particular assume that a quantum momentum map $\mu_\hbar : U\mathfrak g[[\hbar]] \longrightarrow C^\infty(M)[[\hbar]]$ exists. Let us first recall some notions. Given an associative algebra $A$ and a subset $B \subseteq A$, $B'$ is the subset of $A$ consisting of all elements commuting with $B$. Then $B'$ is an associative subalgebra of $A$, called the commutant of $B$ [11]. The following proposition describes the commutant of the image of a quantum momentum map. Let $C^\infty(M)^G$ denote the space of all $G$-invariant functions on $M$.

Proposition 7.1.
$$(\mu_\hbar U\mathfrak g[[\hbar]])' \cong C^\infty(M)^G[[\hbar]].$$

Proof. Let $f = \sum_i \hbar^i f_i \in C^\infty(M)[[\hbar]]$. Then $f$ commutes with $\mu_\hbar U\mathfrak g[[\hbar]]$ iff it commutes with its generators, that is, $[f, \mu_\hbar\xi] = 0$, $\forall \xi \in \mathfrak g$. The latter is equivalent to $\hat\xi f = 0$, or in other words, $f$ is $G$-invariant.

According to this proposition, we have $\mu_\hbar(U\mathfrak g[[\hbar]]) \subseteq (C^\infty(M)^G[[\hbar]])'$. In order to describe the commutant $(C^\infty(M)^G[[\hbar]])'$ completely, we need to extend the quantum momentum map $\mu_\hbar$. It is well known that the star-product on $\mathrm{pol}(\mathfrak g^*)[[\hbar]]$ naturally extends to a star-product on smooth functions $C^\infty(\mathfrak g^*)[[\hbar]]$. Below we will indicate that a quantum momentum map, if it exists, extends to $C^\infty(\mathfrak g^*)[[\hbar]]$ as well. More precisely, we have

Proposition 7.2. Let $\mu_\hbar : \mathrm{pol}(\mathfrak g^*)[[\hbar]] \longrightarrow C^\infty(M)[[\hbar]]$ be a quantum momentum map. Then it naturally extends to an algebra morphism, denoted by the same symbol $\mu_\hbar$, from $C^\infty(\mathfrak g^*)[[\hbar]]$ to $C^\infty(M)[[\hbar]]$ such that
(i) for any $f \in C^\infty(\mathfrak g^*)$, $\mu_\hbar f$ commutes with $C^\infty(M)^G[[\hbar]]$, and (ii) for any $f \in C^\infty(\mathfrak g^*)$, $\mu_\hbar f = J^*f + O(\hbar)$.

Proof. Given a smooth function $f \in C^\infty(\mathfrak g^*)$, for any $x_0 \in M$, let $\tilde f_{x_0}(u)$, $u \in \mathfrak g^*$, denote its Taylor expansion at the point $u_0 = J(x_0)$. Define$^2$
$$(\mu_\hbar f)(x_0) = (\mu_\hbar\tilde f_{x_0})\big|_{x=x_0}. \tag{37}$$
It is clear that when $f$ is a polynomial this reduces to the original $\mu_\hbar$. It is also obvious from the definition that $\mu_\hbar f$ commutes with $C^\infty(M)^G[[\hbar]]$. It is simple to see that each term in the expansion of $\tilde f_{x_0}(u) - f(J(x_0))$, as a function of $u$, is a homogeneous polynomial in $u - u_0$. Hence we have $\mu_\hbar f = J^*f + O(\hbar)$. It remains to check that $\mu_\hbar$ is an algebra homomorphism. This follows from the fact that
$$\widetilde{(f *_\hbar g)}_{x_0} = \tilde f_{x_0} *_\hbar \tilde g_{x_0} \tag{38}$$
for all $f$ and $g \in C^\infty(\mathfrak g^*)$. To see this, write $f *_\hbar g = \sum_k \hbar^k C_k(f, g)$. Then Eq. (38) essentially follows from the fact that $\widetilde{C_k(f, g)}_{x_0} = C_k(\tilde f_{x_0}, \tilde g_{x_0})$, which is in turn a consequence of the fact that $\mathrm{pol}(\mathfrak g^*)[[\hbar]]$ is closed under the $*$-product on $C^\infty(\mathfrak g^*)[[\hbar]]$.

Proposition 7.3. Under the same hypothesis as in Proposition 7.2, if, in addition, the $G$-action is free and proper, then
$$(C^\infty(M)^G[[\hbar]])' \cong \mu_\hbar(C^\infty(\mathfrak g^*)[[\hbar]]).$$

Proof. According to Proposition 7.2, $\mu_\hbar(C^\infty(\mathfrak g^*)[[\hbar]]) \subseteq (C^\infty(M)^G[[\hbar]])'$. To prove the other direction, let us assume that $f = \sum_i \hbar^i f_i \in (C^\infty(M)^G[[\hbar]])'$. It is clear that for any $g \in C^\infty(M)$,
$$\frac{i}{\hbar}[f, g] = \{f_0, g\} + O(\hbar).$$
Then it follows that $\{f_0, g\} = 0$, $\forall g \in C^\infty(M)^G$. This implies that $f_0 = J^*f_0'$ for some smooth function $f_0' \in C^\infty(\mathfrak g^*)$. Now according to Proposition 7.2, $\mu_\hbar f_0' = J^*f_0' + O(\hbar) = f_0 + O(\hbar)$. Therefore, $f - \mu_\hbar f_0' = \hbar\tilde f$, where $\tilde f \in (C^\infty(M)^G[[\hbar]])'$ since both $f$ and $\mu_\hbar f_0'$ belong to $(C^\infty(M)^G[[\hbar]])'$. By repeating the same argument on $\tilde f$ and so on, we deduce that $f \in \mu_\hbar(C^\infty(\mathfrak g^*)[[\hbar]])$.

Combining Propositions 7.1–7.3, we have

Theorem 7.4. Suppose that $*_\hbar$ is a $G$-equivariant $*$-product on $C^\infty(M)[[\hbar]]$ with a quantum momentum map $\mu_\hbar$. Assume that the action is free and proper. Then
$$(C^\infty(M)^G[[\hbar]])' \cong \mu_\hbar(C^\infty(\mathfrak g^*)[[\hbar]]), \qquad (\mu_\hbar C^\infty(\mathfrak g^*)[[\hbar]])' \cong C^\infty(M)^G[[\hbar]].$$

$^2$ The author is grateful to A. Weinstein for suggesting this method of extension.
Given an associative algebra $A$, two subalgebras $B_1$ and $B_2$ are called mutual commutants if $B_1' = B_2$ and $B_2' = B_1$. The notion of mutual commutants is an important concept in the theory of associative algebras, especially in operator algebras. It was generalized to the context of groups by Roger Howe [19], under the name of dual pairs of groups, in his study of representation theory and mathematical physics. On the classical level, or more precisely on the level of Poisson manifolds, an analogue was introduced by Weinstein [30], called dual pairs of Poisson manifolds. In fact, the Poisson manifolds $\mathfrak g^*$ and $M/G$, together with the Poisson maps $J : M \longrightarrow \mathfrak g^*$ and $p : M \longrightarrow M/G$, constitute a dual pair in the sense of Weinstein [30]. Now $(C^\infty(\mathfrak g^*)[[\hbar]], *_\hbar)$ provides a deformation quantization for the Lie-Poisson structure on $\mathfrak g^*$, while $(C^\infty(M)^G[[\hbar]], *_\hbar)$ quantizes the reduced Poisson space $M/G$. So Theorem 7.4 equivalently says that under deformation quantization, the classical dual pair $\mathfrak g^*$ and $M/G$ becomes a pair of mutual commutants. For this reason, we shall call the pair of algebras $(C^\infty(\mathfrak g^*)[[\hbar]], *_\hbar)$ and $(C^\infty(M)^G[[\hbar]], *_\hbar)$ a quantum dual pair.

In general, given two Poisson manifolds $P_1$ and $P_2$ and their deformation quantizations $(C^\infty(P_1)[[\hbar]], *_\hbar)$ and $(C^\infty(P_2)[[\hbar]], *_\hbar)$, we say that they form a quantum dual pair if there is a symplectic manifold $M$ with a star-product $(C^\infty(M)[[\hbar]], *_\hbar)$, and algebra morphisms $\rho_1 : C^\infty(P_1)[[\hbar]] \longrightarrow C^\infty(M)[[\hbar]]$ and $\rho_2 : C^\infty(P_2)[[\hbar]] \longrightarrow C^\infty(M)[[\hbar]]$, such that $\rho_1(C^\infty(P_1)[[\hbar]])$ and $\rho_2(C^\infty(P_2)[[\hbar]])$ are mutual commutants. Unfortunately, at the moment we only know a few examples of deformation quantizable Poisson manifolds. In fact, we do not know any examples of quantum dual pairs besides the example in Theorem 7.4 and the trivial ones. In particular, it is not clear in general whether a classical dual pair (or Morita equivalent Poisson manifolds [34]) can be quantized to a quantum dual pair.
The answer to all these questions relies upon how successful the deformation quantization theory of Poisson manifolds is.

8. Appendix A (Derivations of ∗-Algebras)

In this appendix, we list some basic facts concerning derivations of a $*$-algebra on a symplectic manifold $M$. Some of them are well known. However, as we shall see, the Fedosov method sheds new light on these results.

Definition 8.1. A derivation of a $*$-algebra $(C^\infty(M)[[\hbar]], *_\hbar)$ is a formal power series in $\hbar$ with coefficients being linear operators on $C^\infty(M)$, $\tau = D_0 + \hbar D_1 + \cdots + \hbar^i D_i + \cdots$, such that
$$\tau(f *_\hbar g) = \tau f *_\hbar g + f *_\hbar \tau g, \quad \forall f, g \in C^\infty(M)[[\hbar]]. \tag{39}$$
A derivation is said to be inner if $\tau = \mathrm{ad}\,\frac{i}{\hbar}H = [\frac{i}{\hbar}H, \cdot]$ for some $H \in C^\infty(M)[[\hbar]]$.

Suppose that $\tau = \sum_i \hbar^i D_i$ is a derivation. Expand both sides of Eq. (39). Considering the $\hbar^0$-terms, we find that $D_0(fg) = D_0(f)g + fD_0(g)$. That is, $D_0$ is a vector field. Similarly, by considering the $\hbar^1$-terms, one obtains that
$$D_0\{f, g\} - \{D_0f, g\} - \{f, D_0g\} = (D_1f)g + f(D_1g) - D_1(fg).$$
Since the LHS is skew-symmetric with respect to $f$ and $g$ while the RHS is symmetric, both sides have to vanish identically. Thus, $D_1$ is a vector field and $D_0$ is a symplectic vector field. Moreover, we have the following (see [5] for an equivalent result, which however concerns derivations of the corresponding deformed Lie algebra; see also [7]).
Theorem 8.2. Suppose that $D = D_0 + \hbar D_1 + \cdots + \hbar^i D_i + \cdots$ is a derivation of a $*$-algebra $(C^\infty(M)[[\hbar]], *_\hbar)$ on a symplectic manifold $M$. Then, (i) the operator $D_i$, for each $i$, is a differential operator, and in particular $D_0$ is a symplectic vector field; (ii) there is a canonical one-one correspondence between derivations and closed 1-forms in $Z^1(M)[[\hbar]]$; (iii) under such a correspondence, inner derivations correspond to exact 1-forms in $B^1(M)[[\hbar]]$.

To begin with, we will consider Fedosov algebras, and assume that $A = W_D$ for some Fedosov connection $D$. We need a couple of lemmas first.

Lemma 8.3. Let $K \in \Gamma W$ be a section. Then $\rho = \mathrm{ad}\,\frac{i}{\hbar}K = [\frac{i}{\hbar}K, \cdot]$ defines a derivation of $W_D$ iff $DK$ is a scalar closed one-form on $M$.

Proof. It is clear that $\rho$, defined in this way, satisfies the derivation property. For any $a \in W_D$,
$$D\rho(a) = \Big[\frac{i}{\hbar}DK, a\Big] + \Big[\frac{i}{\hbar}K, Da\Big].$$
If $\rho W_D \subseteq W_D$, it follows that $[DK, a] = 0$, $\forall a \in W_D$. Thus $DK = \theta$ is a scalar one-form on $M$, i.e., $\theta \in \Omega^1(M)[[\hbar]]$. Then $\theta$ must be closed since $d\theta = D\theta = D^2K = 0$. The converse follows essentially from the same argument run backwards.

Lemma 8.4. For a $*$-product on a symplectic manifold $M$, any symplectic vector field $X$ extends to a derivation $\tau = \sum_i \hbar^i D_i$, with the $D_i$ being differential operators, such that $D_0 = X$. If $X$ is a hamiltonian vector field, $\tau$ may be chosen as an inner derivation. If the $*$-product is in addition assumed to satisfy the parity condition, the derivation above can be chosen to involve only even powers of $\hbar$.

Proof. Since every $*$-product is equivalent to a Fedosov $*$-product, we may confine ourselves to a Fedosov algebra $W_D$. Let $\theta = X \,\lrcorner\, \omega$. Then $\theta$ is a closed one-form on $M$. Let $K \in \Gamma W$ be a section satisfying
$$DK = \theta \quad \text{and} \quad K|_{y=0} = 0. \tag{40}$$
Note that such a section always exists uniquely according to the Fedosov iteration method. In fact, $K$ is determined by the iteration formula
$$K = -\delta^{-1}\theta + \delta^{-1}\Big(\partial K + \Big[\frac{i}{\hbar}\gamma, K\Big]\Big). \tag{41}$$
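Formula (41) can be read off from Eq. (40) by a standard manipulation. The sketch below assumes the usual form of a Fedosov connection, $D = -\delta + \partial + [\frac{i}{\hbar}\gamma, \cdot]$, and the usual Hodge-type decomposition for $\delta$ and $\delta^{-1}$; both are assumptions taken from Fedosov's construction [14] and should be checked against the conventions used earlier in the paper.

```latex
% Assumed form of the Fedosov connection acting on a section K of W:
\[
  DK = -\delta K + \partial K + \Bigl[\frac{i}{\hbar}\gamma, K\Bigr].
\]
% Imposing DK = \theta and solving for \delta K:
\[
  \delta K = -\theta + \partial K + \Bigl[\frac{i}{\hbar}\gamma, K\Bigr].
\]
% Assumed decomposition for a 0-form section: K = K|_{y=0} + \delta^{-1}\delta K
% (the term \delta\delta^{-1}K is absent because \delta^{-1} kills 0-forms).
% With the normalization K|_{y=0} = 0 this gives the iteration formula (41):
\[
  K = \delta^{-1}\delta K
    = -\delta^{-1}\theta
      + \delta^{-1}\Bigl(\partial K + \Bigl[\frac{i}{\hbar}\gamma, K\Bigr]\Bigr).
\]
```

In Fedosov's setting $\delta^{-1}$ raises the degree in the Weyl filtration, so the right-hand side determines $K$ recursively, degree by degree.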
Take $\tau = \mathrm{ad}\,\frac{i}{\hbar}K$. Then $\tau$ is easily seen to be a required derivation. If $X$ is a hamiltonian vector field with hamiltonian function $H$, we may simply take $\tau = [\frac{i}{\hbar}H, \cdot]$, which is an inner derivation having the desired property. Here the bracket $[\cdot, \cdot]$ is the commutator of the $*$-product. Note that locally we can always choose $\tau$ to be inner and of the form $\tau = [\frac{i}{\hbar}H, \cdot]$ for some local hamiltonian function $H$. So if the $*$-product satisfies the parity condition, then only even powers of $\hbar$ are involved in the expansion of $\tau$.
Proof of Theorem 8.2. According to the observation preceding Theorem 8.2, $D_0$ is a symplectic vector field. Thus, it extends to a derivation $\tau_0 = D_0 + O(\hbar)$, whose coefficients are differential operators, according to the lemma above. Let $\tilde\tau = \frac{1}{\hbar}(\tau - \tau_0)$. Then $\tilde\tau$ is again a derivation and $\tau = \tau_0 + \hbar\tilde\tau$. Applying this argument repeatedly, we find that $\tau = \sum_i \hbar^i\tau_i$, where every $\tau_i$ is a derivation with coefficients being differential operators. So $\tau$ itself is such a derivation as well.

To continue, without loss of generality, we shall confine ourselves to the case of a Fedosov algebra $W_D$. Assume that $\tau : W_D \longrightarrow W_D$ is a derivation. For any $x_0 \in M$, we define a derivation $\rho_{x_0}$ on $W_{x_0}$ by
$$\rho_{x_0}a(y, \hbar) = (\tau\tilde a)\big|_{x=x_0}, \tag{42}$$
where $a(y, \hbar) \in W_{x_0}$ and $\tilde a \in W_D$ is such that $\tilde a(x_0, y, \hbar) = a(y, \hbar)$. Clearly, $\rho_{x_0}$ is well defined since $\tau$ is a local operator according to Part (i). Since all derivations on $W_{x_0}$ are inner, there is an element $K(x_0) \in W_{x_0}$ such that $\rho_{x_0} = \mathrm{ad}\,\frac{i}{\hbar}K(x_0)$. By requiring that $K(x_0)|_{y=0} = 0$, $K(x_0)$ becomes unique. Repeating this process pointwise, we obtain a global section $K \in \Gamma W$ with $K|_{y=0} = 0$ such that
$$\tau\tilde a = \Big[\frac{i}{\hbar}K, \tilde a\Big], \quad \forall \tilde a \in W_D.$$
According to Lemma 8.3, $\theta = DK$ belongs to $Z^1(M)[[\hbar]]$. In this way, we obtain a map $\varphi$ from the space of derivations $\mathrm{Der}\,C^\infty(M)[[\hbar]]$ to that of closed 1-forms $Z^1(M)[[\hbar]]$. Conversely, given any $\theta \in Z^1(M)[[\hbar]]$, the equation
$$DK = \theta \quad \text{and} \quad K|_{y=0} = 0 \tag{43}$$
has a unique solution. Thus, $\tau = \mathrm{ad}\,\frac{i}{\hbar}K$ defines a derivation of $W_D$ such that $\varphi\tau = \theta$. In other words, $\varphi$ is onto. It is also simple to see that $\varphi$ is injective since the solution to Eq. (43) is unique.

Suppose that $\tau$ is an inner derivation: $\tau = \mathrm{ad}\,\frac{i}{\hbar}H$ for some $H \in W_D$. Thus $K = H - H_0$, where $H_0 = H|_{y=0} \in C^\infty(M)[[\hbar]]$. Then $\theta = DK = D(H - H_0) = -dH_0$, which is clearly exact. Conversely, if $\varphi\tau = \theta$ is exact, i.e., $\theta = dH$ for some $H \in C^\infty(M)[[\hbar]]$, we have $D(K - H) = \theta - dH = 0$. That is, $K - H \in W_D$. Thus $\tau = \mathrm{ad}\,\frac{i}{\hbar}K = \mathrm{ad}\,\frac{i}{\hbar}(K - H)$ is clearly inner. This concludes the proof of the theorem.

It is well known that the bracket of any two symplectic vector fields is hamiltonian. As another immediate application of the Fedosov method, we obtain the following "quantum" analogue of this fact.

Proposition 8.5. Let $*_\hbar$ be any $*$-product on a symplectic manifold $M$. Then the bracket of any two derivations is an inner derivation.

Proof. Without loss of generality, assume that $A = W_D$ for some Fedosov connection $D$. Assume that $\tau_1 = \mathrm{ad}\,\frac{i}{\hbar}K_1$ and $\tau_2 = \mathrm{ad}\,\frac{i}{\hbar}K_2$ are derivations of $W_D$, where $K_1$ and $K_2$ are sections of $W$. Then $DK_1$ and $DK_2$ belong to $Z^1(M)[[\hbar]]$ according to Lemma 8.3. Now $[\tau_1, \tau_2] = \mathrm{ad}\,\frac{i}{\hbar}(\frac{i}{\hbar}[K_1, K_2])$. It is clear that $K = \frac{i}{\hbar}[K_1, K_2]$ is a section of $W$ and $DK = 0$. In other words, $K \in W_D$. Therefore, $[\tau_1, \tau_2]$ is an inner derivation.
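The two claims used in this last proof, namely the formula for $[\tau_1, \tau_2]$ and the vanishing of $DK$, can be verified as follows. The sketch assumes, as is standard for a Fedosov connection, that $D$ acts as a derivation of the commutator on sections of $W$ and that scalar forms are central.

```latex
% Jacobi identity: the commutator of two operators of the form ad(.) is again of that form.
\[
  [\tau_1, \tau_2]
  = \Bigl[\mathrm{ad}\,\tfrac{i}{\hbar}K_1,\ \mathrm{ad}\,\tfrac{i}{\hbar}K_2\Bigr]
  = \mathrm{ad}\,\Bigl[\tfrac{i}{\hbar}K_1,\ \tfrac{i}{\hbar}K_2\Bigr]
  = \mathrm{ad}\,\tfrac{i}{\hbar}\Bigl(\tfrac{i}{\hbar}[K_1, K_2]\Bigr).
\]
% D is a derivation of the commutator (K_1 and K_2 are 0-forms, so no extra signs),
% and DK_1, DK_2 are scalar one-forms by Lemma 8.3, hence central:
\[
  D\Bigl(\tfrac{i}{\hbar}[K_1, K_2]\Bigr)
  = \tfrac{i}{\hbar}\bigl([DK_1, K_2] + [K_1, DK_2]\bigr) = 0.
\]
```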
Remark. (1) Note that our definition of inner derivations differs from the usual one by a factor $\frac{i}{\hbar}$. If we modify the usual Hochschild coboundary operator by multiplying by the factor $\frac{i}{\hbar}$, Theorem 8.2 implies that the first Hochschild cohomology $H^1(C^\infty(M)[[\hbar]], C^\infty(M)[[\hbar]])$ is isomorphic to $H^1(M)[[\hbar]]$ (see [5]). This fact can be generalized to higher order cohomology so that $H^*(C^\infty(M)[[\hbar]], C^\infty(M)[[\hbar]]) \cong H^*(M)[[\hbar]]$. In particular, the isomorphism at order two establishes an intrinsic connection between the characteristic class of a $*$-algebra and the second de Rham cohomology (see [33] for details).

(2) The correspondence in Theorem 8.2 induces a Lie bracket on the space $Z^1(M)[[\hbar]]$ so that the bracket of any two closed one-forms is exact (see Proposition 8.5). It is easy to see that this bracket has the form $[\theta_1, \theta_2] = \{\theta_1, \theta_2\} + O(\hbar)$, $\forall \theta_1, \theta_2$, where $\{\cdot, \cdot\}$ is the standard Lie bracket on one-forms on the symplectic manifold (see [1]). However, it is difficult to find an explicit expression for the entire bracket $[\cdot, \cdot]$. The latter should be related to the Weyl curvature of the $*$-product. Also, for a symplectic manifold, it is well known that the Poisson bracket defined on closed one-forms extends to a bracket on all one-forms. It is not clear, however, whether one can extend the bracket $[\cdot, \cdot]$ above to $\Omega^1(M)[[\hbar]]$. It seems that these problems are all related to the question raised by Weinstein regarding "quantum Lie algebroids" [32].

Another interesting consequence of Theorem 8.2 is the following:

Corollary 8.6. Assume that the star product satisfies the parity condition. If $\tau = D_0 + \hbar D_1 + \hbar^2D_2 + \cdots = \sum_i \hbar^i D_i$ is a derivation, then both $\tau_{even} = D_0 + \hbar^2D_2 + \cdots = \sum_i \hbar^{2i}D_{2i}$ and $\tau_{odd} = D_1 + \hbar^2D_3 + \cdots = \sum_i \hbar^{2i}D_{2i+1}$ are derivations.

Proof. Without loss of generality, we assume that this is a Fedosov $*$-product, and consider $W_D$ as our algebra.
Then any derivation $\tau$ can be written as $\tau = [\frac{i}{\hbar}K, \cdot]$ for some $K \in \Gamma W$. Let $\theta = DK \in Z^1(M)[[\hbar]]$. Write $\theta = \theta_{even} + \hbar\theta_{odd}$, where $\theta_{even}$ is the sum of all even terms in $\hbar$ while $\theta_{odd}$ is the sum of all odd terms in $\hbar$ divided by $\hbar$. Let $K_{even}$ and $K_{odd}$ be the sections of $W$ corresponding to $\theta_{even}$ and $\theta_{odd}$, respectively, defined by Eq. (40), and let
$$\tilde\tau_{even} = \mathrm{ad}\,\frac{i}{\hbar}K_{even} \quad \text{and} \quad \tilde\tau_{odd} = \mathrm{ad}\,\frac{i}{\hbar}K_{odd}.$$
Then, clearly both $\tilde\tau_{even}$ and $\tilde\tau_{odd}$ are derivations, and consist only of even powers of $\hbar$ according to Lemma 8.4. Also $\tau = \tilde\tau_{even} + \hbar\tilde\tau_{odd}$ by construction. Therefore, $\tilde\tau_{even} = \tau_{even}$ and $\tilde\tau_{odd} = \tau_{odd}$. This concludes the proof.

9. Appendix B

A $*$-product
$$f *_\hbar g = \sum_k \hbar^k C_k(f, g)$$
on a regular Poisson manifold is said to be regular if $C_k(\cdot, \cdot)$ is a leafwise bidifferential operator for every $k$. In this section, for completeness, we will outline a proof for the following result, whose proof can also be found in [25].
Proposition 9.1. Suppose that $P$ is a regular Poisson manifold whose symplectic foliation is a fibration $P \longrightarrow M$. Assume that the second Betti number of the fibers is zero. Then there exists an essentially unique regular $*$-product on $P$. That is, any two regular $*$-products are equivalent, and the equivalence operator can be chosen as a formal power series in $\hbar$ with coefficients being leafwise differential operators.

Recall that, given a fibration $\mathcal F$ on a manifold $P$, a leafwise Hochschild cochain is a $k$-linear form on $C^\infty(P)$ with values in $C^\infty(P)$ which is required to be a leafwise $k$-differential operator on $P$. The Hochschild differential is given by
$$(bc)(u_0, \cdots, u_k) = u_0c(u_1, \cdots, u_k) + \sum_{i=0}^{k-1}(-1)^{i+1}c(u_0, \cdots, u_iu_{i+1}, \cdots, u_k) + (-1)^{k+1}c(u_0, \cdots, u_{k-1})u_k.$$
As usual, the space of $k$-cochains is denoted by $C^k_{\mathcal F}(C^\infty(P), C^\infty(P))$, while its cohomology is denoted by $H^k_{\mathcal F}(C^\infty(P), C^\infty(P))$.

Lemma 9.2 (Nest–Tsygan [25]). Let $\mathcal F$ be a fibration on a manifold $P$. Then
$$H^k_{\mathcal F}(C^\infty(P), C^\infty(P)) \cong \Gamma(\wedge^kT\mathcal F). \tag{44}$$

An immediate consequence is the following

Lemma 9.3. (i) If $c \in C^2_{\mathcal F}(C^\infty(P), C^\infty(P))$ is an antisymmetric two-cocycle, then $c$ is a leafwise bivector field, i.e., $c \in \Gamma(\wedge^2T\mathcal F)$. (ii) If $c \in C^2_{\mathcal F}(C^\infty(P), C^\infty(P))$ is a symmetric two-cocycle, then it is a coboundary.

Proof of Proposition 9.1. The proof is simply a modification of the standard argument for symplectic manifolds. The idea, roughly speaking, is as follows. The classification of $*$-products is equivalent to that of their commutator Lie algebras $[\cdot, \cdot]_*$, which are Lie algebra deformations of the Poisson bracket $\{\cdot, \cdot\}$. The latter is classified by the 2nd order leafwise Chevalley cohomology of the Poisson Lie algebra. However, the 2nd order leafwise Chevalley cohomology is isomorphic to the second leafwise de Rham cohomology, and is zero by assumption.

Recall that the Chevalley coboundary operator (see [23]) is given by
$$(\partial c)(u_0, \cdots, u_p) = \epsilon^{\lambda_0\cdots\lambda_p}_{0\cdots p}\Big[\frac{1}{p!}\{u_{\lambda_0}, c(u_{\lambda_1}, \cdots, u_{\lambda_p})\} - \frac{1}{2(p-1)!}c(\{u_{\lambda_0}, u_{\lambda_1}\}, u_{\lambda_2}, \cdots, u_{\lambda_p})\Big], \tag{45}$$
where $u_\lambda \in C^\infty(P)$ and $\epsilon$ is the Kronecker symbol. Since $c \in C^2_{\mathcal F}(C^\infty(P), C^\infty(P))$ is antisymmetric,
$$(\partial c)(u_0, u_1, u_2) = \big(\{u_0, c(u_1, u_2)\} - c(\{u_0, u_1\}, u_2)\big) + \text{c.p.},$$
where c.p. stands for the cyclic permutations. Hence if $c \in \Gamma(\wedge^2T\mathcal F)$, $\partial c = [\pi, c] = d_\pi c$. Here the bracket is the Schouten bracket and $d_\pi : \Gamma(\wedge^*T\mathcal F) \longrightarrow \Gamma(\wedge^{*+1}T\mathcal F)$ is the coboundary operator of the (leafwise) Poisson cohomology.
Suppose that
$$f *_\hbar g = \sum_k \hbar^kC_k(f, g) \quad \text{and} \quad f *'_\hbar g = \sum_k \hbar^kC'_k(f, g)$$
are two regular $*$-products on $P$. We need to construct an equivalence operator between them. This will proceed by induction. Assume that they are equivalent up to $i = k$, i.e., we can find an equivalence operator under which $C_i = C'_i$ for $0 \le i \le k$. Then we have $b(C_{k+1} - C'_{k+1}) = 0$. Write $C_{k+1} = A_{k+1} + B_{k+1}$ and $C'_{k+1} = A'_{k+1} + B'_{k+1}$, where $A_{k+1}$ and $A'_{k+1}$ are skew-symmetric, and $B_{k+1}$ and $B'_{k+1}$ are symmetric. It is simple to see that both $A_{k+1} - A'_{k+1}$ and $B_{k+1} - B'_{k+1}$ are Hochschild 2-cocycles. Since $A_{k+1} - A'_{k+1}$ is skew-symmetric, it belongs to $\Gamma(\wedge^2T\mathcal F)$ by Lemma 9.3. On the other hand, both $[\cdot, \cdot]_*$ and $[\cdot, \cdot]_{*'}$ are deformations of the Poisson Lie algebra $\{\cdot, \cdot\}$. Thus, $\partial(A_{k+1} - A'_{k+1}) = 0$. That is, $[\pi, A_{k+1} - A'_{k+1}] = 0$. In other words, $A_{k+1} - A'_{k+1}$ is a 2-cocycle of the (leafwise) Poisson cohomology, which is isomorphic to the leafwise de Rham cohomology. Since the 2nd leafwise de Rham cohomology is zero by assumption, there is a vector field $X \in \Gamma(T\mathcal F)$ such that $A_{k+1} - A'_{k+1} = [\pi, X]$. In other words,
$$A_{k+1}(f, g) - A'_{k+1}(f, g) = \{Xf, g\} + \{f, Xg\} - X\{f, g\}, \quad \forall f, g \in C^\infty(P).$$
On the other hand, since $B_{k+1} - B'_{k+1}$ is a symmetric Hochschild 2-cocycle, $B_{k+1} - B'_{k+1} = bD$ for some $D \in C^1_{\mathcal F}(C^\infty(P), C^\infty(P))$ according to Lemma 9.3. Now it is clear that $T = 1 + \hbar^kX + \hbar^{k+1}D$ establishes an isomorphism between $*_\hbar$ and $*'_\hbar$ up to $\hbar^{k+1}$. This concludes the proof.
10. Appendix C

Recall that a natural $*$-product on a symplectic manifold $M$ is a $*$-product
$$u *_\hbar v = uv - \frac{i\hbar}{2}\{u, v\} + \frac{1}{2}\Big(-\frac{i\hbar}{2}\Big)^2Q_2(u, v) + \cdots,$$
where $Q_2(u, v)$ is a bidifferential operator of order 2 in each argument. It is well known [5, 23] that associated to a natural $*$-product there is a canonical torsion-free symplectic connection. More precisely, we have

Proposition 10.1. Let $u *_\hbar v = uv - \frac{i\hbar}{2}\{u, v\} + \frac{1}{2}(-\frac{i\hbar}{2})^2Q_2(u, v) + \cdots$ be a natural $*$-product on $M$. Then there exists a unique torsion-free symplectic connection $\nabla$ such that
$$Q_2(u, v) = P^2_\nabla(u, v) + H(u, v),$$
where $H(u, v)$ is a bidifferential operator of maximum order 1 in each argument.

For completeness, we outline a proof here.

Proof of Proposition 10.1. Take an arbitrary torsion-free symplectic connection $\tilde\nabla$. Then
$$u *_\hbar v = uv - \frac{i\hbar}{2}\{u, v\} + \frac{1}{2}\Big(-\frac{i\hbar}{2}\Big)^2P^2_{\tilde\nabla}(u, v)$$
is a $*$-product of order up to $\hbar^2$. Therefore, $b(Q_2 - P^2_{\tilde\nabla}) = 0$, where $b$ denotes the usual Hochschild differential. Since $Q_2 - P^2_{\tilde\nabla}$ is symmetric, it must be a 2-coboundary. Hence there is a differential operator $D$ of maximum order 3 such that
$$Q_2 - P^2_{\tilde\nabla} = bD.$$
The principal term of $D$ corresponds to a covariant symmetric 3-tensor $T^{ijk}$. In local coordinates, write $\tilde\nabla_{\frac{\partial}{\partial x^i}}\frac{\partial}{\partial x^j} = \tilde\Gamma^k_{ij}\frac{\partial}{\partial x^k}$, and let $\tilde\Gamma^{ijk} = \tilde\Gamma^i_{lm}\pi^{lj}\pi^{mk}$. Set
$$\Gamma^{ijk} = \tilde\Gamma^{ijk} + 3T^{ijk}. \tag{46}$$
Since $T^{ijk}$ is a completely symmetric tensor, the equation above defines a torsion-free symplectic connection: $\nabla_{\frac{\partial}{\partial x^i}}\frac{\partial}{\partial x^j} = \Gamma^k_{ij}\frac{\partial}{\partial x^k}$, with $\Gamma^{ijk} = \Gamma^i_{lm}\pi^{lj}\pi^{mk}$. A simple calculation yields that
$$P^2_{\tilde\nabla}(u, v) - P^2_\nabla(u, v) = \pi^{i_1j_1}\pi^{i_2j_2}\big(\tilde\Gamma^k_{i_1i_2}\tilde\Gamma^l_{j_1j_2} - \Gamma^k_{i_1i_2}\Gamma^l_{j_1j_2}\big)\frac{\partial u}{\partial x^k}\frac{\partial v}{\partial x^l} + 3T^{i_1i_2i_3}\frac{\partial u}{\partial x^{i_1}}\frac{\partial^2v}{\partial x^{i_2}\partial x^{i_3}} + 3T^{i_1i_2i_3}\frac{\partial^2u}{\partial x^{i_1}\partial x^{i_2}}\frac{\partial v}{\partial x^{i_3}}. \tag{47}$$
On the other hand,
$$(bD)(u, v) = -3T^{i_1i_2i_3}\frac{\partial u}{\partial x^{i_1}}\frac{\partial^2v}{\partial x^{i_2}\partial x^{i_3}} - 3T^{i_1i_2i_3}\frac{\partial^2u}{\partial x^{i_1}\partial x^{i_2}}\frac{\partial v}{\partial x^{i_3}} + \tilde H(u, v),$$
where $\tilde H(u, v)$ is a bidifferential operator of maximum order 1 in each argument. Therefore,
$$Q_2(u, v) - P^2_\nabla(u, v) = P^2_{\tilde\nabla}(u, v) - P^2_\nabla(u, v) + (bD)(u, v)$$
is clearly a bidifferential operator of maximum order 1 in each argument.

To see that such a connection is unique, it suffices to note that $P^2_{\tilde\nabla}(u, v) - P^2_\nabla(u, v)$ is a bidifferential operator of maximum order 1 iff $\tilde\nabla = \nabla$. This can be easily seen from Eq. (47).

Acknowledgement. The author would like to thank Pierre Bieliavsky, Jean-Luc Brylinski, Ranee Brylinski, Moshe Flato, Victor Guillemin, Yuri Manin, Marc Rieffel, Jim Stasheff, Daniel Sternheimer, Boris Tsygan and Alan Weinstein for useful discussions and comments. In addition to the funding sources mentioned in the first footnote, he would also like to thank IHES and Max-Planck-Institut for their hospitality and financial support while part of this project was being done.
References
1. Abraham, R., and Marsden, J.: Foundations of Mechanics. Reading, MA: Addison-Wesley, 2nd edition, 1985
2. Arnal, D. and Cortet, J. C.: ∗-products in the method of orbits for nilpotent groups. J. Geom. Phys. 2, 83–116 (1985)
3. Arnal, D. and Cortet, J. C.: Représentations ∗ des groupes exponentiels. J. Funct. Anal. 92, 103–135 (1990)
4. Astashkevich, A.: On Fedosov's quantization of semisimple coadjoint orbits. Ph.D. thesis (MIT), (1996)
5. Bayen, F., Flato, M., Fronsdal, C., Lichnerowicz, A., and Sternheimer, D.: Deformation theory and quantization, I and II. Ann. Phys. 111, 61–151 (1977)
6. Berezin, F.A.: Some remarks about the associated envelope of a Lie algebra. Funct. Anal. Appl. 1, 91–102 (1967)
7. Bertelson, M., Cahen, M., and Gutt, S.: Equivalence of star products. Classical and Quantum Gravity 14, 93–107 (1997)
8. Boutet de Monvel, L. and Guillemin, V.: The spectral theory of Toeplitz operators. In: Annals of Mathematics Studies, 99, Princeton: Princeton University Press, 1981
9. Bordemann, M., and Waldmann, S.: A Fedosov star product of Wick type for Kähler manifolds. Lett. Math. Phys. 41, 243–253 (1997)
10. De Wilde, M., and Lecomte, P.: Existence of star-products and of formal deformations of Poisson Lie algebra of arbitrary symplectic manifolds. Lett. Math. Phys. 7, 487–496 (1983)
11. Dixmier, J.: Von Neumann algebras. New York: North-Holland, 1981
12. Emmrich, C., and Weinstein, A.: The differential geometry of Fedosov's quantization. In: Lie theory and geometry, in honor of B. Kostant, Progress in Math. 123, New York: Birkhäuser, 1994
13. Deligne, P.: Déformations de l'algèbre des fonctions d'une variété symplectique: Comparaison entre Fedosov et De Wilde, Lecomte. Selecta Math. (N.S.) 1, 667–697 (1995)
14. Fedosov, B.: A simple geometrical construction of deformation quantization. J. Diff. Geom. 40, 213–238 (1994)
15. Fedosov, B.: Deformation quantization and index theory. In: Mathematical Topics 9, Berlin: Akademie Verlag, 1996
16. Fedosov, B.: Reduction and eigenstates in deformation quantization. In: Pseudo-differential calculus and mathematical physics, Math. Top., 5, Berlin: Akademie Verlag, 1994, pp. 277–297
17. Flato, M. and Sternheimer, D.: Closedness of star products and cohomologies. In: Lie theory and geometry, in honor of B. Kostant, Progress in Math. 123, New York: Birkhäuser, 1994, pp. 241–259
18. Guillemin, V.: Star products on compact pre-quantizable symplectic manifolds. Lett. Math. Phys. 35, 85–89 (1995)
19. Howe, R.: Dual pairs in physics: Harmonic oscillators, photons, electrons, and singletons. Lectures in Appl. Math. 21, 179–207 (1985)
20. Landsman, K.: Rieffel induction as generalized quantum Marsden-Weinstein reduction. J. Geom. Phys. 15, 285–319 (1995)
21. Lecomte, P.: Application of the cohomology of graded Lie algebras to formal deformations of Lie algebras. Lett. Math. Phys. 13, 157–166 (1987)
22. Lichnerowicz, A.: Les variétés de Poisson et leurs algèbres de Lie associées. J. Diff. Geom. 12, 253–300 (1977)
23. Lichnerowicz, A.: Déformations d'algèbres associées à une variété symplectique (les ∗ν-produits). Ann. Inst. Fourier, Grenoble 32 No. 1, 157–209 (1982)
24. Lu, J.-H.: Moment maps at the quantum level. Commun. Math. Phys. 157, 387–404 (1993)
25. Nest, R. and Tsygan, B.: Algebraic index theorem for families. Advances in Math. 113, 151–205 (1995)
26. Omori, H., Maeda, Y. and Yoshioka, A.: Weyl manifolds and deformation quantization. Advances in Math. 85, 224–255 (1991)
27. Tsygan, B.: Private communication
28. Vey, J.: Déformation du crochet de Poisson sur une variété symplectique. Comment. Math. Helv. 50, 421–454 (1975)
29. Weinstein, A.: Lectures on symplectic manifolds. CBMS Reg. Conf. Series in Math. Vol 29, Providence, RI: AMS, 1977
30. Weinstein, A.: The local structure of Poisson manifolds. J. Diff. Geom. 18, 523–557 (1983)
31. Weinstein, A.: Deformation quantization. Séminaire Bourbaki, 46ème année, N. 789 (1993–1994), Astérisque 227, 389–409 (1995)
32. Weinstein, A.: Private communication
33. Weinstein, A. and Xu, P.: Hochschild cohomology and characteristic classes for star-products. In: Festschrift for V. I. Arnol'd's 60th birthday, Vol 2, Providence, RI: AMS, (to appear)
34. Xu, P.: Morita equivalence of Poisson manifolds. Commun. Math. Phys. 142, 493–509 (1991)

Communicated by H. Araki
Commun. Math. Phys. 197, 199 – 210 (1998)
Measure Solutions for the Steady Nonlinear Boltzmann Equation in a Slab

Carlo Cercignani

Dipartimento di Matematica, Politecnico di Milano, I–20133 Milano, Italy

Received: 22 January 1997 / Accepted: 6 March 1998
Abstract: We prove that the Boltzmann equation in a slab, with boundary conditions of diffuse reflection and mass conservation at the wall and assigned total mass (per unit area) in a slab has at least one non-negative measure-valued solution. The physical model is provided by Maxwellian molecules with angular cutoff. The equation is rewritten after a change of variables which is shown to be invertible.
1. Introduction

Very little is known about boundary value problems for the Boltzmann equation [1–3] with data arbitrarily large and removed from equilibrium. There are several difficulties even in the simplest case (two-plate problems), when the solution depends on just one space variable, say $x$ (as well as on the velocity components). The main difficulty is, presumably, related to the presence of the value zero among the values taken by the $x$-component of the molecular velocity which multiplies the space derivative. Having this in mind, a few years ago C. Cercignani, R. Illner and M. Shinbrot [4] attacked the problem in the case of discrete velocity models, when none of the velocities has a zero component along the $x$-axis and the distribution of inflowing particles is given. The authors applied a version of the Leray–Schauder theorem. This can be done by a suitable use of the equations and the boundary conditions, and the existence of (at least) one solution follows. An existence theorem for walls with zero net particle flow is proved in the same paper. So far, however, no uniqueness result has been proved; it is easy, of course, to prove uniqueness (by contraction) if the ratio of the thickness $L$ of the slab to the mean free path is sufficiently small. A result of this kind has been extended by the same authors to the case of a plane Broadwell model in a rectangle [5]. The extension to half-space problems is discussed in a joint paper with Pulvirenti [6]. The theorem and its proof can be extended to the case of continuous velocities, at least with a cutoff for small values of the $x$-component of the molecular velocity $v$, provided the solution is taken to be a measure [7].

In the following we are concerned with the one-dimensional nonlinear Boltzmann equation for Maxwellian molecules with angular cutoff, in order to prove the existence of a steady state solution without cutoff in $v_x$. In this case the Boltzmann equation may be written as follows:
\[ v_x \partial_x f = (Qf)(x, v) \tag{1.1} \]
with $x \in [0, L]$ and $v \in \mathbb{R}^3$. The collision term in (1.1) can be written in the following form:
\[ (Qf)(x, v) = \int_{\mathbb{R}^3} \int_{S^2_+} B(\theta) \left[ f(x, v'_*) f(x, v') - f(x, v_*) f(x, v) \right] dn\, dv_*, \tag{1.2} \]
where $\theta$ is, as usual, the angle between $v - v_*$ and $v - v'$, and we assume, for simplicity, that $\int B(\theta)\, d\theta = 1$. The velocities $v'$ and $v'_*$ are related to $v$ and $v_*$ by
\[ v' = v - n\, n \cdot (v - v_*), \qquad v'_* = v_* + n\, n \cdot (v - v_*), \tag{1.3} \]
where $n$ is the unit vector along $v - v'$. Equation (1.1) is equipped with diffusive boundary conditions at $x = 0$ and $x = L$, i.e. if a particle hits the wall, it is re-emitted according to a Maxwellian distribution with a fixed wall temperature. We assume that the ingoing flux is assigned at $x = 0$, whereas it is equal to the outgoing flux at $x = L$. If there is a solution of the problem, of course, the net flux will be zero at $x = L$ as well, thanks to mass conservation. We shall not use this circumstance till the end of the paper; in fact we shall prove an existence theorem exactly in this case. Thus we write the boundary conditions in the following form:
\[
\begin{aligned}
f_0 &= f(0, v) = M_+(v)\, C_0 && \text{if } v_x > 0, \\
f_L &= f(L, v) = M_-(v) \int_{v'_x > 0} |v'_x|\, f^+(L, v')\, dv' && \text{if } v_x < 0.
\end{aligned}
\tag{1.4}
\]
Here $C_0$ is a given constant, whereas $M_+$ and $M_-$ denote (scaled) half-space Maxwellians
\[ M_\pm(v) = \frac{1}{T_\pm \sqrt{2\pi T_\pm}} \exp\left( -\frac{v^2}{2 T_\pm} \right). \tag{1.5} \]
Note that, as remarked above, one has mass conservation at the boundary $x = L$ due to
\[ \int_{v_x < 0} |v_x|\, M_-(v)\, dv = 1. \]
We first remark that Eq. (1.1) can be rewritten in the following way:
\[ v_x \partial_x f + \rho f = (Q_+ f)(x, v), \tag{1.6} \]
where $Q_+ f$ can be written in the following form:
\[ (Q_+ f)(x, v) = \int_{\mathbb{R}^3} \int_{S^2_+} B(\theta)\, f(x, v'_*)\, f(x, v')\, dn\, dv_*. \tag{1.7} \]
We now slightly modify our problem by defining a new independent variable:
\[ z = \int_0^x \rho(x')\, dx'. \tag{1.8} \]
We remark that if there is a solution with a summable positive $\rho$, this change of variable is well-defined and invertible. We thus rewrite Eq. (1.6) in the following form:
\[ v_x \partial_z f + f = \frac{(Q_+ f)(z, v)}{\rho}, \tag{1.9} \]
where, by abuse of notation, we continue to denote by $f$ and $\rho$ the values of the functions of $z$ coinciding with the values of the corresponding functions of $x$ at corresponding points. For brevity we shall denote the right-hand side of (1.9) by $(Q_+ f/\rho)(z, v)$. The boundaries will now be located at $z = 0$ and $z = M$, where $M = \int_0^L \rho(x)\, dx$. The advantage of the new form of the equation is that it is homogeneous of first degree in the unknown, so that when we multiply $C_0$ by a factor the solution (if any) will also be multiplied by the same factor. Henceforth we shall deal with Eq. (1.9), and return to Eq. (1.1) at the end of the paper. For later use, we define the outgoing mass fluxes $J_\pm$ by
\[ J_+ = \int_{v'_x > 0} |v'_x|\, f^+(M, v')\, dv'; \qquad J_- = \int_{v'_x < 0} |v'_x|\, f^-(0, v')\, dv'. \tag{1.10} \]
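The change of variable (1.8) and its inverse can be illustrated numerically. The following is only a sketch under stated assumptions: the density profile `rho` is a hypothetical positive function chosen for illustration (it is not a solution of the problem), and the helper `cumulative_trapezoid` is a name introduced here.

```python
import math

def cumulative_trapezoid(xs, ys):
    """Return the running integral of ys over xs (trapezoid rule), starting at 0."""
    total, out = 0.0, [0.0]
    for i in range(1, len(xs)):
        total += 0.5 * (ys[i] + ys[i - 1]) * (xs[i] - xs[i - 1])
        out.append(total)
    return out

# Hypothetical density profile on a slab of thickness L = 2 (illustration only).
L = 2.0
n = 2000
x = [L * i / n for i in range(n + 1)]
rho = [1.0 + 0.5 * math.sin(math.pi * xi / L) for xi in x]

# Forward map (1.8): z(x) = int_0^x rho(x') dx'.
z = cumulative_trapezoid(x, rho)
M = z[-1]  # total mass per unit area, i.e. the new right boundary z = M

# Inverse map: x(z) = int_0^z dz' / rho, with rho regarded as a function of z.
x_back = cumulative_trapezoid(z, [1.0 / r for r in rho])
max_error = max(abs(a - b) for a, b in zip(x, x_back))
```

Because the sample density is strictly positive, `z` is strictly increasing and `x_back` reproduces `x` up to quadrature error, which is the invertibility used in the text.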
One may derive a formal solution for the modified Boltzmann equation by integrating (1.9) along the characteristics, which yields
\[
\begin{aligned}
f(z, v) ={}& f_0 \exp\left( -\frac{z}{v_x} \right) H(v_x) + f_L \exp\left( -\frac{z - M}{v_x} \right) H(-v_x) \\
&+ \frac{1}{|v_x|} \int_0^M \exp\left( -\frac{|z - y|}{|v_x|} \right) \frac{Q_+ f}{\rho}(y, v)\, H(v_x (z - y))\, dy
\end{aligned}
\tag{1.11}
\]
with $H(v)$ the Heaviside function
\[ H(v) = \begin{cases} 1 & v > 0, \\ 0 & v < 0. \end{cases} \tag{1.12} \]
Note that we have not defined $H(v)$ for $v = 0$. This is inessential if we work in an $L^p$ space, but it will be a crucial point later when we work in the space of measures. The fact that $H(v_x)$ is not defined for $v_x = 0$ will serve as a reminder that Eq. (1.1) holds only for $v_x \neq 0$.

We denote the right-hand side of (1.11) by $\tilde{A} f$, so that the search for a formal solution of (1.1) may be written as a fixed point problem,
\[ f(z, v) = \tilde{A} f(z, v). \tag{1.13} \]
To clarify the influence of the prescribed boundary conditions, we consider for a moment the case $v_x > 0$. Then, one obtains
\[ f(z, v) = f_0 \exp\left( -\frac{z}{v_x} \right) + \frac{1}{v_x} \int_0^z \exp\left( -\frac{z - y}{v_x} \right) \frac{Q_+ f}{\rho}(y, v)\, dy. \tag{1.14} \]
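The structure of (1.14) can be checked on a toy case. The sketch below fixes one velocity $v_x = v > 0$ and replaces the unknown gain term $(Q_+ f/\rho)(y, v)$ by a known function `source`; this stand-in, the numerical values, and the helper name `mild_solution` are assumptions made purely for illustration. With the source $1 + y$ the equation $v f' + f = 1 + z$, $f(0) = f_0$, has a closed-form solution to compare against.

```python
import math

def mild_solution(z, v, f0, source, n=4000):
    """Evaluate the right-hand side of (1.14) for one v > 0 (trapezoid rule in y)."""
    h = z / n
    integral = 0.0
    for i in range(n + 1):
        y = i * h
        weight = 0.5 if i in (0, n) else 1.0
        integral += weight * math.exp(-(z - y) / v) * source(y)
    integral *= h
    return f0 * math.exp(-z / v) + integral / v

def source(y):
    # Hypothetical stand-in for (Q_+ f / rho)(y, v); any known function works here.
    return 1.0 + y

v, f0, z = 0.7, 2.0, 1.5
numeric = mild_solution(z, v, f0, source)

# Closed-form solution of v f' + f = 1 + z with f(0) = f0, for comparison.
exact = (f0 - (1.0 - v)) * math.exp(-z / v) + 1.0 + z - v
```

The two values agree to quadrature accuracy, which is the sense in which (1.14) is the solution of (1.9) integrated along a characteristic with $v_x > 0$.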
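Letting $z = M$, multiplying $f$ by $v_x$ and integrating over the half space $v_x > 0$ yields
\[ J_+ = \int_{v_x > 0} M_+(v)\, C_0 \exp\left( -\frac{M}{v_x} \right) v_x\, dv + \int_{v_x > 0} \int_0^M \exp\left( -\frac{M - y}{v_x} \right) \frac{Q_+ f}{\rho}(y, v)\, dy\, dv, \tag{1.15} \]
where $J_+$ was defined in (1.10). Then
\[ J_+ = a C_0 + c, \tag{1.16} \]
where
\[ a = \int_{v_x > 0} v_x M_+(v) \exp\left( -\frac{M}{v_x} \right) dv, \qquad c = \int_0^M \int_{v_x > 0} \exp\left( -\frac{M - y}{v_x} \right) \frac{Q_+ f}{\rho}(y, v)\, dv\, dy. \tag{1.17} \]
Analogously, we have:
\[ J_- = a J_+ + d = a^2 C_0 + a c + d, \tag{1.18} \]
where
\[ d = \int_0^M \int_{v_x < 0} \exp\left( -\frac{y}{|v_x|} \right) \frac{Q_+ f}{\rho}(y, v)\, dv\, dy. \tag{1.19} \]
We remark that these definitions of $J_+$ and $J_-$ make sense even if $f$ is not a solution of the problem.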
2. The Measure Formulation of the Steady Boltzmann Equation

We try to prove the existence of a solution for the one-dimensional Boltzmann equation in the space of bounded measures on $(z, v)$. Our unknown measure $d\mu$ will be such that, when it is a.c. with respect to the Lebesgue-Borel measure $dz\, dv$, we have
\[ d\mu(z, v) = f(z, v)\, dz\, dv. \tag{2.1} \]
In order to pass to the measure formulation of our problem, we have to deal with the set $\{v_x = 0\}$; it would be tempting to define its measure to be zero. Then we would not need to worry about this set in the following. A detailed argument shows, however, that this is not possible, because the continuity of the operator $A$ proved below would collapse for measures giving a nonzero measure to the set $\{v_x = 0\}$. We therefore have to let the measure of this set be fixed by the equation, and we must define $H(0) = 1/2$ in our equations. This is needed to preserve the continuity of $A$. Let us now multiply Eq. (1.11) by a (bounded and continuous) test function $\varphi$ and integrate over $(z, v) \in [0, M] \times \mathbb{R}^3$,
\[
\begin{aligned}
\int_0^M \int_{\mathbb{R}^3} \varphi(z, v) f(z, v)\, dv\, dz ={}& \int_0^M \int_{\mathbb{R}^3} \varphi(z, v) H(v_x) M_+ C_0 \exp\left( -\frac{z}{v_x} \right) dv\, dz \\
&+ \int_0^M \int_{\mathbb{R}^3} \varphi(z, v) H(-v_x) M_- J_+ \exp\left( -\frac{z - M}{v_x} \right) dv\, dz \\
&+ \int_0^M \int_{\mathbb{R}^3} \varphi(z, v) \frac{1}{|v_x|} \int_0^M \exp\left( -\frac{|z - y|}{|v_x|} \right) \frac{Q_+ f}{\rho}(y, v)\, H(v_x (z - y))\, dy\, dv\, dz.
\end{aligned}
\tag{2.2}
\]
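Introducing
\[
\begin{aligned}
C^+ &= \int_0^M \int_{\mathbb{R}^3} \varphi(z, v) H(v_x) M_+ \exp\left( -\frac{z}{v_x} \right) dv\, dz, \\
C^- &= \int_0^M \int_{\mathbb{R}^3} \varphi(z, v) H(-v_x) M_- \exp\left( -\frac{z - M}{v_x} \right) dv\, dz,
\end{aligned}
\tag{2.3}
\]
Equation (2.2) may be written as
\[
\begin{aligned}
\int_0^M \int_{\mathbb{R}^3} \varphi(z, v) f(z, v)\, dv\, dz ={}& C^+ C_0 + C^- J_+ \\
&+ \int_0^M \int_{\mathbb{R}^3} \varphi(z, v) \frac{1}{|v_x|} \int_0^M \exp\left( -\frac{|z - y|}{|v_x|} \right) \frac{Q_+ f}{\rho}(y, v)\, H(v_x (z - y))\, dy\, dv\, dz.
\end{aligned}
\tag{2.4}
\]
Now, using the explicit formula for $J_+$ as obtained in the previous section, we may write
\[
\begin{aligned}
\int_0^M \int_{\mathbb{R}^3} \varphi(z, v) f(z, v)\, dv\, dz ={}& (C^- a + C^+) C_0 + C^- c \\
&+ \int_0^M \int_{\mathbb{R}^3} \varphi(z, v) \frac{1}{|v_x|} \int_0^M \exp\left( -\frac{|z - y|}{|v_x|} \right) \frac{Q_+ f}{\rho}(y, v)\, H(v_x (z - y))\, dy\, dv\, dz
\end{aligned}
\tag{2.5}
\]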
with $c$ given by (1.17). Now we pass to the measure formulation by just replacing $f\, dz\, dv$ by $d\mu$. We must, of course, also define $d\mu_\rho = \int_{\mathbb{R}^3} d\mu(z, v)\, dv$. Of course $d\mu_\rho$ is a measure on $[0, M]$. We also remark that we can define $Q_+ f/\rho$ for measures quite easily because $d\mu'\, d\mu'_* / d\mu_\rho$ is well defined. Here we remark that, apart from an inessential factor, $d\mu$ is a probability measure, and we invoke the theorem on the existence of regular conditional probability measures on complete separable metric spaces [8], here applied to a finite-dimensional space. The point is that the factor $d\mu'/d\mu_\rho$ can be viewed as a conditional distribution of $v'$ given $x$, and the theorem just quoted implies that $x \to d\mu'/d\mu_\rho$ is well defined as a measurable map from $[0, M]$ to the space of probability measures in $\mathbb{R}^3$ with the narrow topology. One can then take the product with the other factor $d\mu'_*$ and obtain the desired measure. Equation (2.5) then becomes
\[
\begin{aligned}
\int_0^M \int_{\mathbb{R}^3} \varphi(z, v)\, d\mu(z, v) ={}& (C^- a + C^+) C_0 + C^- c \\
&+ \int_0^M \int_{\mathbb{R}^3} \varphi(z, v) \frac{1}{|v_x|} \int_0^M \exp\left( -\frac{|z - y|}{|v_x|} \right) H(v_x (z - y))\, dQ(y, v)\, dy,
\end{aligned}
\tag{2.6}
\]
where
\[ dQ(y, v) = \int_{\mathbb{R}^3} \int_{S^2_+} B(\theta)\, d\mu(y, v')\, \frac{d\mu(y, v'_*)}{d\mu_\rho(y)}\, dn. \tag{2.7} \]
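Note that $c$ depends on the measure $\mu$. Thus we must also give its explicit definition:
\[ c = \int_{v_x > 0} \int_0^M \int_{\mathbb{R}^3} \exp\left( -\frac{M - y}{v_x} \right) dQ(y, v). \tag{2.8} \]
Then, we define an operator $A$ on the space of measures $\mathcal{M}$,
\[ A : \mathcal{M} \to \mathcal{M}, \qquad \mu \to \sigma = A\mu \tag{2.9} \]
by the relation
\[
\begin{aligned}
\int_0^M \int_{\mathbb{R}^3} \varphi(z, v)\, d\sigma(z, v) ={}& (C^- a + C^+) C_0 + C^- c \\
&+ \int_0^M \int_{\mathbb{R}^3} \varphi(z, v) \frac{1}{|v_x|} \int_0^M \exp\left( -\frac{|z - y|}{|v_x|} \right) H(v_x (z - y))\, dQ(y, v)\, dy \\
={}& \int_0^M \int_{\mathbb{R}^3} \varphi(z, v)\, dA(\mu).
\end{aligned}
\tag{2.10}
\]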
3. Formulation of the Main Result

We want to prove that the operator $A$ is continuous on $\mathcal{M}$. Since $C^\pm$, $a$, $C_0$ are given constants or bounded functions, we need only show that this is the case for the part involving $c$ and the last integral in the definition of $A$. The proof of the continuity of $c$ is analogous to that of the aforementioned integral. Thus we shall limit ourselves to proving the continuity of the latter. In other words, we want to prove that the operator $B$, defined by
\begin{align}
&\int_0^M \int_{\mathbb{R}^3} \varphi(z, v)\, dB(\mu) \tag{3.1} \\
&\quad = \int_0^M \int_{\mathbb{R}^3} \varphi(z, v) \frac{1}{|v_x|} \int_0^M \exp\left( -\frac{|z - y|}{|v_x|} \right) H(v_x (z - y))\, dQ(y, v)\, dz \tag{3.2}
\end{align}
is continuous on $\mathcal{M}$. To this end we first consider the measure
\[ d\nu = d\mu(y, v')\, \frac{d\mu(y, v'_*)}{d\mu_\rho(y)}, \tag{3.3} \]
which is a well defined measure on $[0, M] \times \mathbb{R}^3 \times \mathbb{R}^3$, uniquely associated with the measure $d\mu$. If we consider another measure $d\mu_0$ and the associated measures $d\mu_{0\rho}$ and $d\nu_0$, we have, writing $d\mu' = d\mu(y, v')$ and $d\mu'_* = d\mu(y, v'_*)$ (and similarly for $d\mu_0$):
\[
\begin{aligned}
|d\nu - d\nu_0| &\le \left| d\mu' \frac{d\mu'_*}{d\mu_\rho} - d\mu'_0 \frac{d\mu'_*}{d\mu_\rho} \right| + \left| d\mu'_0 \frac{d\mu'_*}{d\mu_\rho} - d\mu'_0 \frac{d\mu'_*}{d\mu_{0\rho}} \right| + \left| d\mu'_0 \frac{d\mu'_*}{d\mu_{0\rho}} - d\mu'_0 \frac{d\mu'_{0*}}{d\mu_{0\rho}} \right| \\
&\le \frac{d\mu'_*}{d\mu_\rho} |d\mu' - d\mu'_0| + \frac{d\mu'_0\, d\mu'_*}{d\mu_\rho\, d\mu_{0\rho}} |d\mu_\rho - d\mu_{0\rho}| + \frac{d\mu'_0}{d\mu_{0\rho}} |d\mu'_* - d\mu'_{0*}|.
\end{aligned}
\tag{3.4}
\]
Now we replace in the norm of $A\mu - A\mu_0$ both $\varphi$ and $d\nu - d\nu_0$ by their absolute values, use the last inequality, perform the usual trick of exchanging primed and unprimed variables [1–3], replace $|\varphi|$ by its supremum, and perform the integration with respect to $z$. This gives an $M$-dependent factor bounded by unity times a remaining integral that equals the norm of $3|d\mu - d\mu_0|$, provided we take into account that the integrals of $d\mu$ and $d\mu_0$ with respect to $v$ equal $d\mu_\rho$ and $d\mu_{0\rho}$, and that the integral of $|d\mu_\rho - d\mu_{0\rho}|$ is not larger than $\|d\mu - d\mu_0\|$. Hence, finally:
\[ \|B\, d\mu - B\, d\mu_0\| \le 3\, \|d\mu - d\mu_0\|. \tag{3.5} \]
As we said before, we can argue similarly on the constant $c$ to obtain:
\[ \|A\, d\mu - A\, d\mu_0\| \le 6\, \|d\mu - d\mu_0\|. \tag{3.6} \]
Then the operator $A$ is continuous on $\mathcal{M}$. We introduce the retract function $T_R : \mathcal{M} \to B_R(0)$,
\[ T_R\, d\mu = \begin{cases} d\mu & \text{if } \|d\mu\| \le R, \\ (R/\|d\mu\|)\, d\mu & \text{if } \|d\mu\| > R. \end{cases} \tag{3.7} \]
Since $T_R$ is continuous, $T_R \cdot A$ is continuous and certainly maps $B_R(0)$ into itself, and we can use the following [9–12]

Lemma 1 (Tychonoff). Let $\Omega$ be a non-empty compact convex subset of a locally convex, complete Hausdorff vector space $X$. If $T : \Omega \to \Omega$ is continuous then $T$ has a fixed point.

with $T = T_R \cdot A$ and $\Omega = B_R(0)$.

Corollary 1. $T_R \cdot A$ has a fixed point in $B_R(0)$.
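The retract function $T_R$ can be illustrated on a finite-dimensional stand-in. The sketch below assumes a discrete signed measure represented by a list of weights, with the total variation norm as $\|\cdot\|$; this discretisation and the function names are assumptions introduced for illustration only.

```python
def tv_norm(weights):
    """Total variation norm of a discrete (signed) measure given by its weights."""
    return sum(abs(w) for w in weights)

def retract(weights, R):
    """Radial retraction T_R: identity inside the ball of radius R, rescaling outside."""
    norm = tv_norm(weights)
    if norm <= R:
        return list(weights)
    return [w * R / norm for w in weights]

inside = retract([0.2, 0.3], R=1.0)    # norm 0.5, returned unchanged
outside = retract([3.0, 1.0], R=1.0)   # norm 4, rescaled onto the sphere of radius 1
```

In both cases the result lies in the ball of radius $R$, which is why the composition with $A$ maps $B_R(0)$ into itself.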
Let $d\mu_f$ be this fixed point: $d\mu_f$ is not zero because one can choose $C_0 > 0$. Without restricting the generality, suppose that $\|A\, d\mu_f\| \ge R$, i.e.
\[ A\, d\mu_f = \frac{1}{\lambda}\, d\mu_f, \tag{3.8} \]
where
\[ 0 < \lambda = \frac{R}{\|A\, d\mu_f\|}. \tag{3.9} \]
We notice that $d\mu$ is more regular than a general measure because $A$ has smoothing properties. In fact, if we drop the integration with respect to $z$ in (2.12) and put $\varphi = 1$, the terms arising from the boundary conditions are exponentials in $z$, whereas, performing the usual trick of exchanging primed and unprimed variables in the collision term, we should examine the kernel:
\[ k(z, y; v, v_*) = \int B(\theta)\, dn\, \frac{1}{|v'_x|} \exp\left( -\frac{|z - y|}{|v'_x|} \right). \]
If this kernel is smooth enough to transform measures defined on $[0, M] \times \mathbb{R}^3 \times \mathbb{R}^3$ into measures on $[0, M]$, then $d\mu_\rho$ is absolutely continuous with respect to the Lebesgue measure. We shall not investigate this problem in the present paper; we only remark that it is made nontrivial by the presence of $1/|v'_x|$, which is unbounded when $v'_x$ vanishes. On the other hand it is quite easy to show that $\int_{\mathbb{R}^3} v_x f$ is a function in $L^1([0, M])$.

In order to proceed, we remark that it makes sense to compute $v_x \partial_z (\tilde{A}\, d\mu_f)$ in the sense of distributions, to obtain
\[ v_x \frac{\partial}{\partial z} \tilde{A}\, d\mu_f = dQ - \tilde{A}\, d\mu_f. \]
We can now replace $\tilde{A}\, d\mu_f$ by $d\mu_f/\lambda$ throughout to obtain
\[ v_x \frac{\partial}{\partial z}\, d\mu_f = \lambda\, dQ - d\mu_f. \tag{3.10} \]
We must also look at the boundary conditions satisfied by $d\mu_f$, since they are not those we started with; the right-hand sides will be multiplied by $\lambda$. In particular the ingoing fluxes will be $\lambda C_0$ (at $z = 0$) and $\lambda J_+$ (at $z = M$), and the net fluxes $\lambda C_0 - J_-$ and $(\lambda - 1) J_+$. After integrating against the test function identically equal to unity, we have:
\[
\begin{aligned}
\frac{1}{\lambda} \int v_x \left[ d\mu_f(M, v) - d\mu_f(0, v) \right] &= \int dQ - \frac{1}{\lambda} \int d\mu_f(z, v), \\
(1 - \lambda) J_+ + J_- - \lambda C_0 &= \lambda \int dQ - \int d\mu_f(z, v), \\
(1 - \lambda) J_+ + J_- - \lambda C_0 &= \|d\mu_{f\rho}\|\, (\lambda - 1).
\end{aligned}
\tag{3.11}
\]
Let us show that λ cannot be (strictly) less than 1, for all R. To prove this, we assume, by contradiction, that λ < 1 and remark that C0 is a given constant and the fixed point dµf might change with R since we are using the retract operator, which depends on R. Let us first consider the possibility that dµf does not change for R ≥ R0 . This means that dµf is inside the ball of radius R0 and we do not need the retract if we choose this radius.
If $d\mu_f$ changes with $R$, its norm must increase with $R$ and actually diverge for $R \to \infty$ (otherwise we would find an $R_0$ with the property just discussed). Assuming that this is the case, let us divide both sides of Eq. (3.11) by $1 - \lambda$; we obtain:
\[ J_+ + (J_- - \lambda C_0)(1 - \lambda)^{-1} = -\|d\mu_{f\rho}\|. \tag{3.12} \]
If we increase $R$ without bound the left-hand side tends to $-\infty$; then we have a contradiction, unless $J_- \le \lambda C_0$ and $\lambda \to 1$. Let us consider a sequence of radii divergent to infinity, $\{R_n\}$, and let $\{d\mu_n\}$ be the sequence of the corresponding fixed points renormalized in such a way that $\|d\mu_n\| = 1$. The measures $d\mu_n$ satisfy the same equation as the previously considered fixed points, except for the fact that $C_0$ is replaced by $C_0/\|d\mu_n\|$. If we now, passing possibly to a subsequence, let $n$ diverge, $d\mu_n$ will tend to some limiting measure $d\mu$ of norm unity and this measure, by continuity, will satisfy $A_0\, d\mu = d\mu$, where $A_0$ is the same as $A$, except for the fact that we let $C_0 = 0$. Now proceeding as before (see Eq. (3.11)) we obtain that $d\mu$ satisfies:
\[ J_- = 0. \tag{3.13} \]
This shows that at the wall $z = 0$ the flows of both arriving and leaving particles are zero, i.e. that at $z = 0$ the measure must have support at $v_x = 0$. It is easy to show that this must occur at the other wall as well and indeed in the entire slab. Actually the solution of the problem must be a measure with support at $v = 0$. If this solution were impossible, we would be finished; unfortunately, it is a perfectly acceptable solution, describing a state in which all the molecules are at rest.

In order to avoid this difficulty, we slightly modify our problem by adding a phantom gas having the same density as the actual gas but a distribution function $M_0$, which is a Maxwellian at some temperature $T_0$:
\[ M_0(v) = (2\pi T_0)^{-3/2} \exp\left( -\frac{v^2}{2 T_0} \right), \]
and the physical gas has a probability $\varepsilon\rho$ per unit time of colliding with the phantom gas. The name phantom gas is chosen because there is a very small chance for the real gas to see it, and we are actually interested in the case where it is not seen at all. We shall further assume that the collisions with the phantom gas scatter the molecule of the real gas into a distribution $\rho M_0$. This means that we must add to the collision term defined in Eq. (1.2) another term $Q_\varepsilon$ defined as follows:
\[ (Q_\varepsilon f)(x, v) = \varepsilon \rho\, (\rho M_0 - f(x, v)). \]
It is easy to verify that we can repeat all the arguments in this paper for the model with the phantom gas $G_\varepsilon$ without any essential change. There is only one important difference. The very last remark in the proof that we were trying to carry out is false: the solution with all the molecules at rest is not acceptable, because the gas $G_\varepsilon$ would immediately destroy this state. We then conclude that for $G_\varepsilon$ a $\lambda$ strictly less than unity is not acceptable. This means that we can avoid using the retract function and $d\mu_f$ is a fixed point for $A_\varepsilon$ (the operator analogous to $A$ in the case of the gas $G_\varepsilon$). We have proved the following theorem:

Theorem 1. There exists a positive $R$ such that the operator $A_\varepsilon$ is continuous on a positive measure space $\mathcal{M}$ into $B_R(0)$. Then there exists at least one fixed point $d\mu_\varepsilon$, which is not zero because we have chosen $C_0 > 0$.

In particular, $\lambda$ must equal unity in the above argument and the net current is zero at both $z = M$ and $z = 0$. Thus there exists a measure solution of the boundary value problem for the gas $G_\varepsilon$. In particular, it is trivial to check that the boundary conditions are satisfied and
\[ C_0 = \int_{v'_x < 0} |v'_x|\, d\mu_\varepsilon(0, v')\, dv'. \]
Hence if we had chosen the homogeneous boundary condition from the beginning we would still have a solution. This solution contains an arbitrary constant ($C_0$) as required by physics.

We prove some elementary properties of the solution we have found. First, since the homogeneous boundary conditions are satisfied, the particle flow $j_\varepsilon\, dz = \int_{\mathbb{R}^3} v_x\, d\mu_\varepsilon$ is zero. The momentum flow $p_\varepsilon\, dz = \int_{\mathbb{R}^3} v_x^2\, d\mu_\varepsilon$ then is constant, because the collisions in the real gas conserve momentum and the collisions with the phantom gas produce a loss proportional to $j_\varepsilon\, dz$, which has just been shown to be zero. We have thus used the mass balance and the balance of momentum. If we now consider the energy balance, we obtain (writing $d\rho_\varepsilon$ for $d\mu_{\varepsilon\rho}$)
\[ \frac{d(q_\varepsilon\, dz)}{dz} = \varepsilon \left[ \frac{3}{2}\, d\rho_\varepsilon\, T_0 - \frac{1}{2} \int_{\mathbb{R}^3} |v|^2\, d\mu_\varepsilon \right] \le \varepsilon \left[ \frac{3}{2}\, d\rho_\varepsilon\, T_0 - \frac{1}{2}\, p_\varepsilon\, dz \right] \]
or, integrating over the slab:
\[ p_\varepsilon M \le \|d\rho_\varepsilon\|\, 3 T_0 + \frac{2}{\varepsilon} \left[ q_\varepsilon(0) - q_\varepsilon(L) \right] \le \|d\rho_\varepsilon\|\, 3 T_0 + \frac{4}{\varepsilon} \left[ T_+ J_- + T_- J_+ \right], \]
where $q_\varepsilon = \int_{\mathbb{R}^3} v_x^2\, d\mu_\varepsilon$. It would be easy to obtain estimates for $J_\pm$ in terms of $\|d\rho_\varepsilon\|$, but this is not necessary. In fact, if these quantities were growing faster than $\|d\rho_\varepsilon\|$ for sufficiently small $\varepsilon$, the previous argument would work to exclude the trivial solution with all the molecules at rest. Thus we can safely assume that $p_\varepsilon$ is bounded. Now we change again the norm of the solution, imposing the condition $p_\varepsilon = 1$. We consider now this solution with homogeneous boundary conditions with $\varepsilon > 0$ and let $\varepsilon$ go to zero. It might of course happen that the norm increases, but $p_\varepsilon = 1$ will remain finite. When we let $\varepsilon$ go to zero, we can assume, by taking a subsequence, that $d\mu_\varepsilon$ tends to a limiting measure $d\mu_0$ with the property $p_0 = 1$. Because of the continuity of the operators we can pass to the limit when $\varepsilon$ goes to zero, and $d\mu_0$ will provide the required solution (which will also be nonzero by construction, but might have an infinitely large norm). We have thus proved:

Theorem 2. There exists at least a nonzero solution $d\mu_0$ of the problem with homogeneous boundary conditions.
In particular, $\lambda$ must equal unity in the above argument and the net current is zero at both $z = M$ and $z = 0$. We remark that this solution cannot be the previous solution with all the particles at rest, since this solution makes $p_\varepsilon = p_0 = 0$.

It remains to return to the original formulation. The most ambitious aim would be to show that $d\mu_\rho$ is actually a.c. with respect to the Lebesgue measure $dz$. We remark that the corresponding density $\rho$ might become unbounded. It is simpler to consider $1/\rho$ and show that it is an $L^1$ function of $z$. In fact, since $p_0 = 1$, we can find a fixed positive $\omega$ such that $\int_{1 \le v_x^2 \le \omega} v_x^2\, d\mu/dz = 1/K_\omega \le 1$ and hence (for $0 \le \delta \le 1$)
\[ \frac{1}{K_\omega} \le \int_{\delta \le v_x^2 \le \omega} v_x^2\, \frac{d\mu}{dz} \le \omega \int_{\delta \le v_x^2} \frac{d\mu}{dz} = \omega \rho_\delta, \]
where $\rho_\delta = \int_{\delta \le v_x^2} d\mu/dz$ is an $L^1$ function whose reciprocal is uniformly bounded with respect to $\delta$. If we let $\delta$ go to zero, we obtain, by monotonicity, a bounded function, which may be considered the reciprocal of $\rho_0$ when the latter is finite.

We remark that the problem is not uniquely fixed by Eq. (1.1) and the boundary conditions. In fact we must also assign the total mass $M = \int_0^L \rho(x)\, dx$. When we pass to the new formulation, Eq. (1.9), we assign $M$ as the range of the independent variable and the solution is defined up to a constant factor because the problem is homogeneous of the first degree. We shall now make use of this factor to show that we can solve the original problem for any thickness $L$ of the slab. We remark that $1/\rho$ is well defined and positive. Thus we can invert Eq. (1.8) in the following form:
\[ x = \int_0^z \frac{1}{\rho(z')}\, dz'. \]
Then we multiply Eq. (1.9) by $\rho$ and use the relation between $x$ and $z$ to obtain that, after the change of variables, $f$ satisfies Eq. (1.1). Since $M$ is assigned and finite and $\rho$ is certainly bounded from below, $L$ is also finite. As we remarked, the solution of the modified problem is determined up to a factor; then $L$, defined by
\[ L = \int_0^M \frac{1}{\rho(z')}\, dz', \tag{3.1} \]
is determined up to a factor. Playing with this factor we can recover any $L$. This means that we have the following

Theorem 3. The Boltzmann equation (1.1) with homogeneous boundary conditions, Eq. (1.4) with $C_0$ equal to the outgoing flux, has at least one nonzero measure valued solution, the support of which is different from $\{v = 0\}$. The solution is completely determined if we assign the total mass (per unit area) given by $M = \int_0^L \int_{\mathbb{R}^3} d\mu(x, v)$.

4. Concluding Remarks

We have considered the one-dimensional nonlinear Boltzmann equation for Maxwellian molecules with angular cutoff, in order to prove the existence of a steady state solution. We have proved that the equation, with boundary conditions of diffuse reflection and mass conservation at the wall and assigned total mass (per unit area) in a slab, has at least one nonzero solution in a suitable measure formulation where a mass variable is used in place of a space coordinate. The support of the solution is different from $\{v = 0\}$.

Acknowledgement. The author is grateful to Leif Arkeryd and Reinhard Illner for their comments on an early draft of the paper which led to the present improved version.
References

1. Cercignani, C.: Mathematical Methods in Kinetic Theory. 2nd edition, New York: Plenum Press, 1990
2. Cercignani, C.: The Boltzmann Equation and its Applications. New York: Springer, 1988
3. Cercignani, C., Illner, R. and Pulvirenti, M.: The Mathematical Theory of Dilute Gases. New York: Springer-Verlag, 1994
4. Cercignani, C., Illner, R. and Shinbrot, M.: A Boundary Value Problem for Discrete Velocity Models. Duke Math. J. 51, 889–900 (1987)
5. Cercignani, C., Illner, R. and Shinbrot, M.: A Boundary Value Problem for the Two Dimensional Broadwell Model. Commun. Math. Phys. 114, 687–698 (1988)
6. Cercignani, C., Illner, R., Shinbrot, M. and Pulvirenti, M.: On Nonlinear Stationary Half-Space Problems in Discrete Kinetic Theory. J. Stat. Phys. 52, 885–896 (1988)
7. Arkeryd, L., Cercignani, C. and Illner, R.: Measure solutions of the steady Boltzmann equation in a slab. Commun. Math. Phys. 142, 285–296 (1991)
8. Parthasarathy, K.R.: Probability measures on metric spaces. New York: Academic Press, 1967
9. Reed, M.C. and Simon, B.: Methods of Modern Mathematical Physics I. New York: Academic Press, 1980
10. Tychonoff, A.: Ein Fixpunktsatz. Math. Ann. 111, 767–776 (1935)
11. Schäfer, H.: Ueber die Methode der a-priori Schranken. Math. Ann. 129, 415–416 (1955)
12. Dunford, N. and Schwartz, J.T.: Linear operators I, II, III. New York: Wiley-Interscience, 1958

Communicated by J. L. Lebowitz
Commun. Math. Phys. 197, 211 – 228 (1998)
Feigenbaum Theory for Unimodal Maps with Asymmetric Critical Point: Rigorous Results

B. D. Mestel$^1$, A. H. Osbaldestin$^2$

$^1$ Department of Mathematics, University of Exeter, Exeter, EX4 4QE, UK. E-mail: [email protected]
$^2$ Department of Mathematical Sciences, Loughborough University, Loughborough, LE11 3TU, UK. E-mail: [email protected]

Received: 12 December 1997 / Accepted: 12 March 1998
Abstract: We apply the methods of H. Epstein to prove the existence of a line of period-2 solutions of the Feigenbaum period-doubling renormalisation transformation. These solutions govern the universal behaviour of families of unimodal maps with "asymmetric critical points" of degree $d$, for which the $d$th derivative has differing left and right limits.

1. Introduction

In this paper we study functional equations of the Feigenbaum type which occur in the theory of period-doubling cascades for families of unimodal maps $f$ of an interval with a single critical point (which, without loss of generality, we take to be 0) of degree $d$ and which has differing left and right $d$th derivative at 0. Universality for period doubling in such maps was considered by Arneodo et al [1] (see also [25, 5, 23], and [22]), where period-two scaling behaviour was observed. In [20] it was shown that the critical behaviour of such families of maps was governed by a period-two point of a generalised Feigenbaum renormalisation operator, given by the solution of the functional equations
\begin{align}
\tilde{f}_L(x) &= -\lambda^{-1} f_R f_R(-\lambda x), \tag{1.1a} \\
\tilde{f}_R(x) &= -\lambda^{-1} f_R f_L(-\lambda x), \tag{1.1b} \\
f_L(x) &= -\tilde{\lambda}^{-1} \tilde{f}_R \tilde{f}_R(-\tilde{\lambda} x), \tag{1.1c} \\
f_R(x) &= -\tilde{\lambda}^{-1} \tilde{f}_R \tilde{f}_L(-\tilde{\lambda} x), \tag{1.1d}
\end{align}
with the normalisations $f_L(0) = f_R(0) = \tilde{f}_L(0) = \tilde{f}_R(0) = 1$ so that $\lambda = -f_R(1) > 0$ and $\tilde{\lambda} = -\tilde{f}_R(1) > 0$. (Note that these equations were written with $\alpha = -\lambda$ and $\tilde{\alpha} = -\tilde{\lambda}$ in [20].)
Here $f_L$ and $f_R$ correspond to the left and the right parts of the unimodal map $f$, i.e.,
\[ f(x) = \begin{cases} f_L(x) & \text{if } x \le 0; \\ f_R(x) & \text{if } x \ge 0. \end{cases} \tag{1.2} \]
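One application of the renormalisation step in (1.1a) and (1.1b) can be sketched as follows. This is only an illustration under stated assumptions: the two quadratic branches are hypothetical sample maps with $d = 2$ (they are not a solution of (1.1)), and the function name `renormalise` is introduced here.

```python
def renormalise(f_left, f_right):
    """Apply (1.1a)-(1.1b) to the pair (f_L, f_R); return the new pair and lambda."""
    lam = -f_right(1.0)  # normalisation lambda = -f_R(1), assumed positive

    def new_left(x):
        # For x <= 0 the argument -lam * x is >= 0, so the right branch is used twice.
        return -f_right(f_right(-lam * x)) / lam

    def new_right(x):
        # For x >= 0 the argument -lam * x is <= 0, so the left branch is applied first.
        return -f_right(f_left(-lam * x)) / lam

    return new_left, new_right, lam

# Hypothetical quadratic branches with f_L(0) = f_R(0) = 1 (illustration only).
def f_left(x):
    return 1.0 - 1.2 * x * x

def f_right(x):
    return 1.0 - 1.5 * x * x

g_left, g_right, lam = renormalise(f_left, f_right)
```

With these sample branches $\lambda = 0.5$, and the new pair again takes the value 1 at the critical point, as the normalisation requires. The ratio of the leading quadratic coefficients is $1.2/1.5 = 0.8$ for the original pair and $1.25$ for the renormalised pair, so the roles of the two sides are exchanged after one step.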
As observed by Arneodo et al [1], the solutions of (1.1) depend on two parameters, viz., the degree $d$ of the critical point and a "modulus" $\mu$, which (for the case when $d$ is an even integer) is the ratio
\[ \mu = \frac{f_L^{(d)}(0-)}{f_R^{(d)}(0+)}. \tag{1.3} \]
The well-known Feigenbaum case corresponds to the value $\mu = 1$. See [13, 14, 6]. The first rigorous proofs in this case were given by Lanford [16], and by Campanino, Epstein, and Ruelle [2, 3]. See also [9] and [4]. More recently proofs have been given both by Epstein ([10, 11, 12]) using Herglotz function techniques, and by Sullivan ([24, 18]) using deep ideas from complex analysis. In this paper we use the Herglotz function techniques of Epstein to prove that the period-2 point exists for all $d > 1$ and all $\mu > 0$. Here $d$ may be non-integral. More precisely, we prove the following theorem.

Theorem 1. Let $d > 1$ and let $\mu > 0$. Then there exists a solution of (1.1) with the properties:
1. $f_L = F_L(|x|^d)$ and $f_R = F_R(|x|^d)$, where $F_L$ and $F_R$ are analytic functions on open intervals containing $[0, \mu^{-1}]$ and $[0, 1]$ respectively, and with $F_L'(0), F_R'(0) \neq 0$.
2. $\tilde{f}_L = \tilde{F}_L(|x|^d)$ and $\tilde{f}_R = \tilde{F}_R(|x|^d)$, where $\tilde{F}_L$ and $\tilde{F}_R$ are analytic functions on open intervals containing $[0, \mu]$ and $[0, 1]$ respectively, and with $\tilde{F}_L'(0), \tilde{F}_R'(0) \neq 0$.
3. $\mu = F_L'(0)/F_R'(0) = \tilde{F}_R'(0)/\tilde{F}_L'(0)$.

Theorem 1 puts the scaling observations of Arneodo et al [1] (as expounded in [20]) on a rigorous foundation. A different instance of a renormalisation scheme in which a similar invariant modulus appears is [19]. We note that if $d$ is an even integer, then the functions $f_L$, $f_R$, $\tilde{f}_L$ and $\tilde{f}_R$ are themselves analytic.

The organisation of this paper is as follows. In the next section we give a brief résumé of the properties of Herglotz and anti-Herglotz functions that we shall need. We then give a reformulation of the problem in terms of normalised anti-Herglotz functions. Following that we give a proof that the reformulated problem has a solution. Finally we derive the solution of (1.1) from the solution of the reformulated problem.

2. Résumé of Properties of Herglotz Functions

In this section we give a brief résumé of the properties of Herglotz (also known as Pick) functions. For more details and proofs, we refer the reader to Epstein's work [10, 11, 12] and the book by Donoghue [7]. We shall largely keep to the notation used by Epstein, although we use $d$ for the degree rather than Epstein's $r$. Let $\mathbb{C}_+$, $\mathbb{C}_-$ denote the upper and lower half planes in $\mathbb{C}$. If $I \subseteq \mathbb{R}$ is an open interval, then we denote by $\Omega(I)$ the domain $\mathbb{C}_+ \cup \mathbb{C}_- \cup I$. $I$ may possibly be empty or the whole of $\mathbb{R}$.
Definition 1. A complex analytic function on C+ ∪ C− is Herglotz (resp. anti-Herglotz) ¯ + and f (C− ) ⊆ C ¯ − (resp. f (C+ ) ⊆ C ¯ − and f (C− ) ⊆ C ¯ + ). if f (C+ ) ⊆ C The following are important properties of Herglotz and anti-Herglotz functions, which we state without proof. Proposition 1. The following hold for Herglotz and anti-Herglotz functions. 1. The composition of two Herglotz functions is a Herglotz function, the composition of two anti-Herglotz function is a Herglotz function, whilst the composition of an anti-Herglotz function and a Herglotz function is an anti-Herglotz function. 2. (Integral representation for Herglotz functions.) A Herglotz function f has the representation Z t 1 − dµ(t), (2.1) f (x) = αx + β + t − x t2 + 1 where µ is a Borel measure on R. The function f is analytic precisely on the complement of the support of µ in R. 3. On any interval I ⊆ R on which it is analytic, a Herglotz (resp. anti-Herglotz) function is increasing (resp. decreasing) and strictly so, unless it is constant. Moreover a non-constant Herglotz or anti-Herglotz function has non-zero derivative on I. 4. On any interval I ⊆ R on which it is analytic, a non-constant Herglotz or antiHerglotz function f has positive Schwarzian derivative 2 f 000 (x) 3 f 00 (x) − > 0. (2.2) S(f )(x) = 0 f (x) 2 f 0 (x) One consequence of this is that a Herglotz function cannot have a local maximum in its first derivative. 5. On any interval I on which it is analytic the right- and left-hand limits exist respectively at the left and right-hand endpoints of I, although they may be ±∞. It is ¯ We shall do this implicitly therefore possible to extend the function continuously to I. in what follows. Let A, B ∈ R satisfy A < 0 < 1 < B and let (A, B) denote C+ ∪ C− ∪(A, B). We denote by H(A, B) and AH(A, B) (respectively) the space of Herglotz and anti-Herglotz functions (respectively) analytic on the interval (A, B). 
Furthermore, let $E(A, B)$ denote the space of anti-Herglotz functions $\psi \in AH(A, B)$ which satisfy the normalisations
\[ \psi(0) = 1, \qquad \psi(1) = 0. \tag{2.3} \]
As is normal, we equip $H(A, B)$, $AH(A, B)$ and $E(A, B)$ with the topology of uniform convergence on compact subsets of $\Omega(A, B)$. The following results are fundamental and are proved in [10, 11, 12].

Proposition 2. Let $\psi \in E(A, B)$. Then

1. $\psi$ has an integral representation of the form
\[ \log \psi(x) = \int \left( \frac{1}{t} - \frac{1}{t - x} \right) \sigma(t)\, dt, \tag{2.4} \]
where $\sigma(t)$ satisfies
\[ 0 \le \sigma(t) \le 1, \qquad \sigma(t) = 0 \ \text{for } A \le t < 1, \qquad \sigma(t) = 1 \ \text{for } 1 \le t \le B. \tag{2.5} \]
2. For $x \in (0, 1)$ we have
\[ \frac{A(1 - x)}{A - x} \le \psi(x) \le \frac{B(1 - x)}{B - x}, \tag{2.6} \]
and for $x \in (A, 0)$ we have
\[ \frac{B(1 - x)}{B - x} \le \psi(x) \le \frac{A(1 - x)}{A - x}. \tag{2.7} \]
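The bounds (2.6) and (2.7) are easy to check numerically on a sample function. The following is a minimal sketch, not part of the original argument: the function names and the sample choice $\psi(x) = 1 - x$ (which is anti-Herglotz and satisfies the normalisations (2.3)) are assumptions made purely for illustration.

```python
def lower_upper_bounds(x, A, B):
    """Return (lower, upper) bounds from (2.6)/(2.7) for a point x in (A, 1), x != 0."""
    bound_A = A * (1 - x) / (A - x)
    bound_B = B * (1 - x) / (B - x)
    if x > 0:                      # (2.6): 0 < x < 1
        return bound_A, bound_B
    return bound_B, bound_A        # (2.7): A < x < 0


def psi_linear(x):
    """Illustrative element of E(A, B): psi(x) = 1 - x (hypothetical sample)."""
    return 1.0 - x


def check_bounds(psi, A, B, samples=200, tol=1e-12):
    """Check lower <= psi(x) <= upper on grids inside (A, 0) and (0, 1)."""
    ok = True
    for k in range(1, samples):
        for x in (A * k / samples, k / samples):
            lo, hi = lower_upper_bounds(x, A, B)
            ok = ok and (lo - tol <= psi(x) <= hi + tol)
    return ok
```

For instance, `check_bounds(psi_linear, -2.0, 1.5)` tests the sample function against both inequalities on a uniform grid; a function that is not in $E(A, B)$, such as $1 - 3x$, fails the check.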
We shall need the following simple lemma in what follows.

Lemma 1. Let $I$ be an open real interval and let $f : I \to \mathbb{R}$ be an increasing analytic function with positive Schwarzian derivative. Let $x_1, x_2 \in I$ with $x_1 < x_2$. Then if $f(x_1) > x_1$ and $f(x_2) < x_2$, there exists a unique fixed point $x_* \in I$ of $f$ strictly between $x_1$ and $x_2$.

Proof. The existence of a fixed point is a simple consequence of the intermediate value theorem. The uniqueness follows from the positivity of the Schwarzian derivative as follows. Suppose there are at least two fixed points in $(x_1, x_2)$. Let $x_*$ and $x_*'$ be the smallest and largest respectively. Then $f'(x_*) \le 1$ and $f'(x_*') \le 1$. By the mean value theorem, there exists $x \in (x_1, x_2)$ with $f'(x) = 1$. Since $f$ is not the identity function, it follows that $f'$ must have a local maximum in $(x_*, x_*')$, contradicting the positivity of the Schwarzian derivative.
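3. Reformulation of the Problem

The following theorem is a reformulation of Theorem 1 in terms of normalised anti-Herglotz functions.

Theorem 2. Let $d > 1$ and $\mu > 0$, $\nu = \mu^{1/d}$ and $0 < b < 1$. Then, for $b$ sufficiently close to 1, there exists a function pair $(\psi, \tilde\psi) \in E(-\tilde\lambda^{-1}, (\lambda\tilde\lambda)^{-1}) \times E(-\lambda^{-1}, (\lambda\tilde\lambda)^{-1}) \subseteq E(-b^{-1}\nu^{-1}, b^{-2}) \times E(-b^{-1}\nu, b^{-2})$ satisfying:
\[ \psi(x) = \frac{\mu\, \tilde z_1^{\,d}}{\tilde\lambda^{d} z_1^{\,d}}\, \tilde\psi\bigl(\tilde z_1 \tilde\psi(-\tilde\lambda x)^{1/d}\bigr), \tag{3.1a} \]
\[ \tilde\psi(x) = \frac{1}{\mu \lambda^{d}}\, \frac{z_1^{\,d}}{\tilde z_1^{\,d}}\, \psi\bigl(z_1 \psi(-\lambda x)^{1/d}\bigr), \tag{3.1b} \]
where
\[ z_1 = \psi(-\lambda)^{-1/d}, \tag{3.2a} \]
\[ \tilde z_1 = \tilde\psi(-\tilde\lambda)^{-1/d}, \tag{3.2b} \]
and $0 < \lambda < b\nu^{-1}$ and $0 < \tilde\lambda < b\nu$.

The bulk of our work will be in establishing Theorem 2.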
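4. Preparatory Results

4.1. Notation. For $0 < b < 1$, we define $\lambda_+ = b\nu^{-1}$ and $\tilde\lambda_+ = b\nu$. Note that $\lambda_+ \tilde\lambda_+ = b^2 < 1$. Let $\Psi_1$, $\Psi_2$, $\tilde\Psi_1$ and $\tilde\Psi_2$ denote the anti-Herglotz functions
\[ \Psi_1(x) = \frac{1 - x}{1 + b\nu^{-1} x}, \tag{4.1a} \]
\[ \tilde\Psi_1(x) = \frac{1 - x}{1 + b\nu x}, \tag{4.1b} \]
\[ \Psi_2(x) = \tilde\Psi_2(x) = \frac{1 - x}{1 - b^2 x}. \tag{4.1c} \]
Let $\psi \in E(-b^{-1}\nu^{-1}, b^{-2})$ and $\tilde\psi \in E(-b^{-1}\nu, b^{-2})$. Then, in view of Proposition 2, we have for $x < 0$:
\[ \Psi_2(x) \le \psi(x) \le \Psi_1(x), \tag{4.2a} \]
\[ \tilde\Psi_2(x) \le \tilde\psi(x) \le \tilde\Psi_1(x), \tag{4.2b} \]
and for $0 < x < 1$:
\[ \Psi_1(x) \le \psi(x) \le \Psi_2(x), \tag{4.3a} \]
\[ \tilde\Psi_1(x) \le \tilde\psi(x) \le \tilde\Psi_2(x). \tag{4.3b} \]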
4.2. The maps $\theta_A$. For $A \in \mathbb{R} \cup \{+\infty\}$, we denote by $\theta_A$ the map
\[ \theta_A(x) = \frac{(A + 1)x}{A + x}. \tag{4.4} \]
We further denote by $\theta_{\max}$ the map
\[ \theta_{\max}(x) = \frac{b^{-2} x}{b^{-2} - 1 + x}. \tag{4.5} \]
The following proposition is straightforward to verify.

Proposition 3. The map $\theta_A$ has the following properties:

1. $\theta_A$ is Herglotz.
2. $\theta_A$ is analytic on $(-A, +\infty)$; in particular it is analytic on $(0, +\infty)$ if $A > 0$.
3. $\theta_A(0) = 0$; $\theta_A(1) = 1$.
4. For $0 < A < +\infty$, $\theta_A(x) > x$ for $0 < x < 1$; $\theta_A(x) < x$ for $x > 1$.
5. $\theta_A(x) = x$ for $A = +\infty$.
6. $\lim_{x \to +\infty} \theta_A(x) = A + 1$.
7. Let $0 < b < 1$ and let $B \ge b^{-2}$ (including the possibility that $B = +\infty$). Then there is a unique $A \in [b^{-2} - 1, +\infty]$ such that $\theta_A(B) = b^{-2}$. Remark. $A = B(1 - b^2)/(Bb^2 - 1)$ and if $B = b^{-2}$ we have $A = +\infty$.
8. $\theta_A$ depends continuously on $A$ and, for $0 < x < 1$, $\theta_A(x)$ decreases as $A$ increases. Hence for $A \in [b^{-2} - 1, +\infty]$ we have $\theta_A(x) \le \theta_{\max}(x)$ for $0 < x < 1$.
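Several items of Proposition 3 can be checked directly in floating point. The sketch below is illustrative only; the function names `theta`, `theta_max` and `A_for` are hypothetical, and the formula used in `A_for` is the one stated in the remark to item 7.

```python
import math


def theta(A, x):
    """theta_A(x) = (A + 1) x / (A + x), cf. (4.4); A = math.inf gives the identity."""
    if math.isinf(A):
        return x
    return (A + 1.0) * x / (A + x)


def theta_max(b, x):
    """theta_max of (4.5), i.e. theta_A with A = b**-2 - 1."""
    return theta(b ** -2 - 1.0, x)


def A_for(B, b):
    """Parameter A with theta_A(B) = b**-2, assuming B >= b**-2 (item 7)."""
    target = b ** -2
    if math.isinf(B):
        return target - 1.0
    if B == target:
        return math.inf
    return B * (1.0 - b * b) / (B * b * b - 1.0)
```

With $b = 0.9$ and $B = 2$, for example, `theta(A_for(2.0, 0.9), 2.0)` reproduces $b^{-2}$ up to rounding, and the resulting $\theta_A$ lies below $\theta_{\max}$ on $(0, 1)$, as item 8 asserts.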
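4.3. Definition of $\lambda_-$, $\tilde\lambda_-$. We shall prove the following lemma.

Lemma 2. For $b$ sufficiently close to 1, there exist solutions $\lambda_-$, $\tilde\lambda_-$ of the equations
\[ \Psi_1\bigl(\theta_{\max}(\Psi_2(-\lambda_-)^{-1/d})\bigr)^{1/d}\, \tilde\Psi_1\bigl(\theta_{\max}(\tilde\Psi_2(-\tilde\lambda_-)^{-1/d})\bigr)^{1/d} / \tilde\lambda_+ = \lambda_-, \tag{4.6a} \]
\[ \Psi_1\bigl(\theta_{\max}(\Psi_2(-\lambda_-)^{-1/d})\bigr)^{1/d}\, \tilde\Psi_1\bigl(\theta_{\max}(\tilde\Psi_2(-\tilde\lambda_-)^{-1/d})\bigr)^{1/d} / \lambda_+ = \tilde\lambda_-. \tag{4.6b} \]

Proof. Writing $\tilde\lambda_- = \tilde\lambda_+ \lambda_- / \lambda_+$, we see that it is sufficient to solve the following
\[ \Psi_1\bigl(\theta_{\max}(\Psi_2(-\lambda_-)^{-1/d})\bigr)^{1/d}\, \tilde\Psi_1\bigl(\theta_{\max}(\tilde\Psi_2(-\tilde\lambda_+ \lambda_- / \lambda_+)^{-1/d})\bigr)^{1/d} / \tilde\lambda_+ = \lambda_-. \tag{4.7} \]
The left-hand side of this is an increasing analytic function of $\lambda_-$. At $\lambda_- = 0$, the left-hand side is zero with infinite slope and thus is greater than $\lambda_-$ for $\lambda_-$ sufficiently close to 0. It will follow that there is a solution if the left-hand side is less than $\lambda_+$ for $\lambda_- = \lambda_+$. For $\lambda_- = \lambda_+$ the equation becomes
\[ \Psi_1\bigl(\theta_{\max}(\Psi_2(-\lambda_+)^{-1/d})\bigr)^{1/d}\, \tilde\Psi_1\bigl(\theta_{\max}(\tilde\Psi_2(-\tilde\lambda_+)^{-1/d})\bigr)^{1/d} = \tilde\lambda_+ \lambda_+ = b^2. \tag{4.8} \]
Now setting
\[ \begin{aligned} x &= \theta_{\max}\bigl(\Psi_2(-\lambda_+)^{-1/d}\bigr) = \theta_{\max}\!\left( \left( \frac{1 + \lambda_+}{1 + b^2 \lambda_+} \right)^{-1/d} \right) = \theta_{\max}\!\left( \left( \frac{1 + b\nu^{-1}}{1 + b^3 \nu^{-1}} \right)^{-1/d} \right) \\ &= \frac{ b^{-2} \left( \dfrac{1 + b\nu^{-1}}{1 + b^3 \nu^{-1}} \right)^{-1/d} }{ b^{-2} - 1 + \left( \dfrac{1 + b\nu^{-1}}{1 + b^3 \nu^{-1}} \right)^{-1/d} }, \end{aligned} \tag{4.9} \]
we have $x \to 1$ as $b \to 1$. It follows that $\Psi_1(x) \to 0$ as $b \to 1$, and, hence, for $b$ sufficiently close to 1, we have
\[ \Psi_1\bigl(\theta_{\max}(\Psi_2(-\lambda_+)^{-1/d})\bigr)^{1/d}\, \tilde\Psi_1\bigl(\theta_{\max}(\tilde\Psi_2(-\tilde\lambda_+)^{-1/d})\bigr)^{1/d} < \Psi_1(x)^{1/d} < b^2, \tag{4.10} \]
as required. Here we have used the fact that $\tilde\Psi_1\bigl(\theta_{\max}(\tilde\Psi_2(-\tilde\lambda_+)^{-1/d})\bigr)^{1/d} < 1$.

Henceforth we define $\lambda_-$ and $\tilde\lambda_-$ to be the solutions of (4.6) with $0 < \lambda_- < \lambda_+$ and $0 < \tilde\lambda_- < \tilde\lambda_+$.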
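The existence argument of Lemma 2 is a sign-change argument for Eq. (4.7), so it can be illustrated with a bisection. The sketch below is a hedged numerical illustration, not the authors' method: the function names are hypothetical, and the parameter values used in the usage note ($b = 0.9$, $\mu = 2$, $d = 4$) are assumptions chosen only so that the required sign change is present. The routine raises an error rather than guessing if the sign change is absent for the supplied parameters.

```python
def lhs_47(lam, b, nu, d):
    """Left-hand side of (4.7) as a function of lambda_-."""
    psi1 = lambda x: (1 - x) / (1 + b / nu * x)        # Psi_1, (4.1a)
    psi1_t = lambda x: (1 - x) / (1 + b * nu * x)      # tilde Psi_1, (4.1b)
    psi2 = lambda x: (1 - x) / (1 - b * b * x)         # Psi_2 = tilde Psi_2, (4.1c)
    theta_max = lambda x: b ** -2 * x / (b ** -2 - 1 + x)   # (4.5)
    lam_p, lam_t_p = b / nu, b * nu                    # lambda_+, tilde lambda_+
    first = psi1(theta_max(psi2(-lam) ** (-1 / d))) ** (1 / d)
    second = psi1_t(theta_max(psi2(-lam_t_p * lam / lam_p) ** (-1 / d))) ** (1 / d)
    return first * second / lam_t_p


def solve_lambda_minus(b, nu, d, lo=1e-9, iters=200):
    """Bisection for a root of (4.7) in (lo, lambda_+); returns (lambda_-, tilde lambda_-)."""
    lam_p, lam_t_p = b / nu, b * nu
    g = lambda lam: lhs_47(lam, b, nu, d) - lam
    if not (g(lo) > 0 and g(lam_p) < 0):
        raise ValueError("no sign change on [lo, lambda_+] for these parameters")
    hi = lam_p
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam_m = 0.5 * (lo + hi)
    return lam_m, lam_t_p * lam_m / lam_p
```

Calling `solve_lambda_minus(0.9, 2 ** 0.25, 4)` returns a pair with $0 < \lambda_- < \lambda_+$ and $\tilde\lambda_- = \tilde\lambda_+ \lambda_- / \lambda_+$, which is the substitution used in the proof.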
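5. Proof of Theorem 2

Let $E$ be the space of quadruples $(\psi, \lambda, \tilde\psi, \tilde\lambda)$ with
\[ \psi \in AH(-b^{-1}\nu^{-1}, b^{-2}), \qquad \psi(0) = 1, \qquad \psi(1) = 0, \tag{5.1} \]
$\lambda \in [\lambda_-, \lambda_+]$,
\[ \tilde\psi \in AH(-b^{-1}\nu, b^{-2}), \qquad \tilde\psi(0) = 1, \qquad \tilde\psi(1) = 0, \tag{5.2} \]
and $\tilde\lambda \in [\tilde\lambda_-, \tilde\lambda_+]$. We equip $E$ with the product topology. An important result is the following:

Proposition 4. $E$ is compact and convex and has the “fixed-point property”, i.e., any continuous map $T : E \to E$ has a fixed point.

The last property is a consequence of the Schauder-Tikhonov fixed point theorem. See [8] for details.

5.1. Outline of the method of proof. The method of proof is to define a “renormalisation transformation”, i.e., a continuous map $T : E \to E$ such that a fixed point of $T$ is a solution of Eqs. (3.1). Ideally we would define $T(\psi, \lambda, \tilde\psi, \tilde\lambda) = (\psi_1, \lambda_1, \tilde\psi_1, \tilde\lambda_1)$, where
\[ \psi_1(x) = \tilde\tau^{-1} \tilde\psi(\tilde\phi(x)), \quad \text{for } x \in (-b^{-1}\nu^{-1}, b^{-2}), \tag{5.3a} \]
\[ \tilde\psi_1(x) = \tau^{-1} \psi(\phi(x)), \quad \text{for } x \in (-b^{-1}\nu, b^{-2}), \tag{5.3b} \]
where
\[ \phi(x) = z_1 \psi(-\lambda x)^{1/d}, \tag{5.4a} \]
\[ \tilde\phi(x) = \tilde z_1 \tilde\psi(-\tilde\lambda x)^{1/d}, \tag{5.4b} \]
where
\[ z_1 = \psi(-\lambda)^{-1/d}, \tag{5.5a} \]
\[ \tilde z_1 = \tilde\psi(-\tilde\lambda)^{-1/d}, \tag{5.5b} \]
\[ \tau = \psi(z_1) = \psi\bigl(\psi(-\lambda)^{-1/d}\bigr), \tag{5.5c} \]
\[ \tilde\tau = \tilde\psi(\tilde z_1) = \tilde\psi\bigl(\tilde\psi(-\tilde\lambda)^{-1/d}\bigr), \tag{5.5d} \]
and where $\lambda_1$, $\tilde\lambda_1$ are given by
\[ \lambda_1 = \frac{\tau^{1/d}\, \tilde\psi(-\tilde\lambda_1)^{1/d}}{\nu\, \psi(-\lambda_1)^{1/d}}, \tag{5.6a} \]
\[ \tilde\lambda_1 = \frac{\nu\, \tilde\tau^{1/d}\, \psi(-\lambda_1)^{1/d}}{\tilde\psi(-\tilde\lambda_1)^{1/d}}. \tag{5.6b} \]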
However there are a number of obstructions to this goal. Firstly Eqs. (5.6) may not have a solution in the ranges $[\lambda_-, \lambda_+]$ and $[\tilde\lambda_-, \tilde\lambda_+]$ unless $b$ is taken sufficiently close to 1. More seriously, the functions $\phi$ and $\tilde\phi$ as defined by (5.4) may not be well behaved, particularly close to the point $b^{-2}$. We overcome this problem by modifying miscreant $\phi$, $\tilde\phi$ using functions $\theta$, $\tilde\theta$, which are themselves Herglotz but which depend continuously on the quadruple $(\psi, \lambda, \tilde\psi, \tilde\lambda)$.
We are therefore led to define a modified map $T$ which coincides with the desired transformation given by Eqs. (5.3)–(5.6) on a large part of $E$. From the Schauder-Tikhonov theorem, we see that the modified map $T$ has a fixed point, but a priori this fixed point may not satisfy (5.3)–(5.6). However, by using properties of a fixed point of the modified map $T$, we may show a posteriori that the map coincides with our original map at the fixed point, at least for $b$ chosen sufficiently close to 1.

5.2. Definition of the map $T$. We now proceed to define a map $T : E \to E$ as follows. Let $(\psi, \lambda, \tilde\psi, \tilde\lambda) \in E$. We first define
\[ \phi_1(x) = \frac{\psi(-\lambda x)^{1/d}}{\psi(-\lambda)^{1/d}}, \qquad \tilde\phi_1(x) = \frac{\tilde\psi(-\tilde\lambda x)^{1/d}}{\tilde\psi(-\tilde\lambda)^{1/d}}. \tag{5.7} \]
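Then, since $\lambda \le \lambda_+ = b\nu^{-1}$ and $\tilde\lambda \le \tilde\lambda_+ = b\nu$, we have that $\phi_1 \in H(-\lambda^{-1}, \lambda^{-1} b^{-1} \nu^{-1})$ and $\tilde\phi_1 \in H(-\tilde\lambda^{-1}, \tilde\lambda^{-1} b^{-1} \nu)$ with $\phi_1(1) = 1$ and $\tilde\phi_1(1) = 1$. Note that $\lambda^{-1} b^{-1} \nu^{-1} \ge b^{-2}$ and $\tilde\lambda^{-1} b^{-1} \nu \ge b^{-2}$ and also that $\phi_1((-\lambda^{-1}, \lambda^{-1} b^{-1} \nu^{-1})) \subseteq (0, \infty)$ and $\tilde\phi_1((-\tilde\lambda^{-1}, \tilde\lambda^{-1} b^{-1} \nu)) \subseteq (0, \infty)$.

We now consider $\phi_1(b^{-2})$ and $\tilde\phi_1(b^{-2})$. In order to form the compositions $\psi \circ \phi_1$ and $\tilde\psi \circ \tilde\phi_1$, we require that $\phi_1(b^{-2}) \le b^{-2}$ and $\tilde\phi_1(b^{-2}) \le b^{-2}$. We do not know a priori that this is the case. In order to overcome this problem we modify $\phi_1$ and $\tilde\phi_1$ in a continuous way as follows. Define $\theta$ by
\[ \theta(x) = \begin{cases} x & \text{if } \phi_1(b^{-2}) \le b^{-2}, \\ \theta_A(x) & \text{if } \phi_1(b^{-2}) > b^{-2}, \end{cases} \tag{5.8} \]
where $A$ is chosen so that $\theta_A(\phi_1(b^{-2})) = b^{-2}$, and define
\[ \tilde\theta(x) = \begin{cases} x & \text{if } \tilde\phi_1(b^{-2}) \le b^{-2}, \\ \theta_{\tilde A}(x) & \text{if } \tilde\phi_1(b^{-2}) > b^{-2}, \end{cases} \tag{5.9} \]
where $\tilde A$ is chosen so that $\theta_{\tilde A}(\tilde\phi_1(b^{-2})) = b^{-2}$. Then $\theta$, $\tilde\theta$ depend continuously on $\psi$, $\lambda$, $\tilde\psi$, $\tilde\lambda$, since the defining conditions on $\theta$, $\tilde\theta$ depend continuously on $\psi$, $\lambda$, $\tilde\psi$, $\tilde\lambda$.

Note that $\theta$, $\tilde\theta$ are Herglotz functions fixing 0 and 1, and analytic on $(0, \infty)$, so that the functions defined by
\[ \phi = \theta \circ \phi_1, \qquad \tilde\phi = \tilde\theta \circ \tilde\phi_1 \tag{5.10} \]
satisfy
\[ \phi \in H(-\lambda^{-1}, \lambda^{-1} b^{-1} \nu^{-1}) \subseteq H(-b^{-1}\nu, b^{-2}), \tag{5.11} \]
\[ \phi(1) = 1, \qquad \phi((-b^{-1}\nu, b^{-2})) \subseteq (0, b^{-2}); \tag{5.12} \]
\[ \tilde\phi \in H(-\tilde\lambda^{-1}, \tilde\lambda^{-1} b^{-1} \nu) \subseteq H(-b^{-1}\nu^{-1}, b^{-2}), \tag{5.13} \]
\[ \tilde\phi(1) = 1, \qquad \tilde\phi((-b^{-1}\nu^{-1}, b^{-2})) \subseteq (0, b^{-2}). \tag{5.14} \]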
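We now define the new functions $\psi_1$ and $\tilde\psi_1$ as follows. Let
\[ \tau = \psi(\phi(0)), \tag{5.15a} \]
\[ \tilde\tau = \tilde\psi(\tilde\phi(0)). \tag{5.15b} \]
Then $0 < \tau < 1$ and $0 < \tilde\tau < 1$. Now let
\[ \psi_1(x) = \tilde\tau^{-1} \tilde\psi(\tilde\phi(x)), \qquad \tilde\psi_1(x) = \tau^{-1} \psi(\phi(x)). \tag{5.16} \]
Then
\[ \psi_1 \in AH(-b^{-1}\nu^{-1}, b^{-2}), \qquad \psi_1(0) = 1, \qquad \psi_1(1) = 0, \tag{5.17a} \]
\[ \tilde\psi_1 \in AH(-b^{-1}\nu, b^{-2}), \qquad \tilde\psi_1(0) = 1, \qquad \tilde\psi_1(1) = 0. \tag{5.17b} \]
These follow from the fact that $\phi(1) = 1$, $\tilde\phi(1) = 1$, and the definition of $\tau$ and $\tilde\tau$ (5.15).

Our next task is to define new values of $\lambda$, $\tilde\lambda$ so that they satisfy the constraints for $\lambda$, $\tilde\lambda$. To this end we observe that
\[ \tau^{1/d} = \psi(\phi(0))^{1/d} \ge \Psi_1(\phi(0))^{1/d} \ge \Psi_1\bigl(\theta_{\max}(\Psi_2(-\lambda_-)^{-1/d})\bigr)^{1/d}. \tag{5.18} \]
Here we have used the fact that $\lambda \ge \lambda_-$ and Proposition 3. Similarly
\[ \tilde\tau^{1/d} = \tilde\psi(\tilde\phi(0))^{1/d} \ge \tilde\Psi_1(\tilde\phi(0))^{1/d} \ge \tilde\Psi_1\bigl(\theta_{\max}(\tilde\Psi_2(-\tilde\lambda_-)^{-1/d})\bigr)^{1/d}, \tag{5.19} \]
so that
\[ (\tau\tilde\tau)^{1/d} \ge \Psi_1\bigl(\theta_{\max}(\Psi_2(-\lambda_-)^{-1/d})\bigr)^{1/d}\, \tilde\Psi_1\bigl(\theta_{\max}(\tilde\Psi_2(-\tilde\lambda_-)^{-1/d})\bigr)^{1/d}. \tag{5.20} \]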
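We shall use this result in the next section in conjunction with Lemma 2.

5.3. Definition of $\lambda_1$, $\tilde\lambda_1$. Our aim is to define $\lambda_1$, $\tilde\lambda_1$ so that
\[ \lambda_- \le \lambda_1 \le \lambda_+, \qquad \tilde\lambda_- \le \tilde\lambda_1 \le \tilde\lambda_+. \tag{5.21} \]
If $(\tau\tilde\tau)^{1/d} \ge b^2$ we set $\lambda_1 = \lambda_+$ and $\tilde\lambda_1 = \tilde\lambda_+$ so that (5.21) is clearly satisfied and
\[ \lambda_1 \tilde\lambda_1 \le (\tau\tilde\tau)^{1/d}, \tag{5.22} \]
since $\lambda_+ \tilde\lambda_+ = b^2$. This is case 0.

If, on the other hand, $(\tau\tilde\tau)^{1/d} < b^2$ we set $\gamma = (\tau\tilde\tau)^{1/d}$ and consider the equation
\[ \lambda_1 = \frac{\tau^{1/d}\, \tilde\psi(-\gamma/\lambda_1)^{1/d}}{\nu\, \psi(-\lambda_1)^{1/d}}, \tag{5.23} \]
for
\[ \frac{\gamma}{\tilde\lambda_+} \le \lambda_1 \le \lambda_+. \tag{5.24} \]
The right-hand side of (5.23) is strictly decreasing (as a function of $\lambda_1$) so we have three cases:

Case 1. The right-hand side of (5.23) is less than $\lambda_1$ for $\lambda_1 = \gamma/\tilde\lambda_+$.
Case 2. The right-hand side of (5.23) is greater than $\lambda_1$ for $\lambda_1 = \lambda_+$.
Case 3. Equation (5.23) has a unique solution in the interval $[\gamma/\tilde\lambda_+, \lambda_+]$.

We deal with these three cases separately.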
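5.3.1. Case 1. The right-hand side of (5.23) is less than $\lambda_1$ for $\lambda_1 = \gamma/\tilde\lambda_+$. Set
\[ \lambda_1 = \gamma/\tilde\lambda_+, \qquad \tilde\lambda_1 = \tilde\lambda_+. \tag{5.25} \]
Then $\lambda_1 \tilde\lambda_1 = \gamma = (\tau\tilde\tau)^{1/d}$, so
\[ \lambda_1 = \frac{\gamma}{\tilde\lambda_+} = \frac{(\tau\tilde\tau)^{1/d}}{\tilde\lambda_+} \ge \lambda_-, \tag{5.26} \]
by (4.6) and (5.20). Hence we have
\[ \lambda_- \le \lambda_1 \le \lambda_+, \qquad \tilde\lambda_- \le \tilde\lambda_1 \le \tilde\lambda_+. \tag{5.27} \]

5.3.2. Case 2. The right-hand side of (5.23) is greater than $\lambda_1$ for $\lambda_1 = \lambda_+$. Set
\[ \lambda_1 = \lambda_+, \qquad \tilde\lambda_1 = \gamma/\lambda_+. \tag{5.28} \]
Then $\lambda_1 \tilde\lambda_1 = \gamma = (\tau\tilde\tau)^{1/d}$, and, similarly to case 1,
\[ \tilde\lambda_1 = \frac{\gamma}{\lambda_+} = \frac{(\tau\tilde\tau)^{1/d}}{\lambda_+} \ge \tilde\lambda_-, \tag{5.29} \]
again by (4.6) and (5.20). Hence we have
\[ \lambda_- \le \lambda_1 \le \lambda_+, \qquad \tilde\lambda_- \le \tilde\lambda_1 \le \tilde\lambda_+. \tag{5.30} \]

5.3.3. Case 3. Equation (5.23) has a unique solution in the interval $[\gamma/\tilde\lambda_+, \lambda_+]$. Set $\lambda_1$ to be the solution to (5.23) and set
\[ \tilde\lambda_1 = \gamma/\lambda_1. \tag{5.31} \]
Then we have
\[ \lambda_- \le \frac{\gamma}{\tilde\lambda_+} \le \lambda_1 \le \lambda_+, \tag{5.32} \]
and
\[ \tilde\lambda_- \le \frac{\gamma}{\lambda_+} \le \tilde\lambda_1 \le \frac{\gamma}{\gamma/\tilde\lambda_+} = \tilde\lambda_+, \tag{5.33} \]
using (4.6) and (5.20) again.

5.4. Conclusion of the definition of $T$. From the previous section we have that, in all cases,
\[ \lambda_1 \tilde\lambda_1 \le (\tau\tilde\tau)^{1/d}, \tag{5.34} \]
and, except when $\lambda_1 = \lambda_+$ and $\tilde\lambda_1 = \tilde\lambda_+$ (case 0), we have
\[ \lambda_1 \tilde\lambda_1 = \gamma = (\tau\tilde\tau)^{1/d}. \tag{5.35} \]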
We also have in all cases
\[ \lambda_- \le \lambda_1 \le \lambda_+, \qquad \tilde\lambda_- \le \tilde\lambda_1 \le \tilde\lambda_+. \tag{5.36} \]
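The case analysis of Sect. 5.3 can be summarised as a small selection routine. The following is a sketch under stated assumptions, not the construction itself: $\tau$ and $\tilde\tau$ are passed in as free inputs (in the text they are determined by (5.15)), the function name `choose_lambdas` is hypothetical, and case 3 is solved by bisection, which relies on the right-hand side of (5.23) being strictly decreasing.

```python
def choose_lambdas(psi, psi_t, tau, tau_t, b, nu, d, iters=200):
    """Return (case, lambda_1, tilde lambda_1) following cases 0-3 of Sect. 5.3."""
    lam_p, lam_t_p = b / nu, b * nu            # lambda_+, tilde lambda_+
    gamma = (tau * tau_t) ** (1 / d)
    if gamma >= b * b:                         # case 0
        return 0, lam_p, lam_t_p

    def rhs(lam):                              # right-hand side of (5.23)
        return (tau ** (1 / d) * psi_t(-gamma / lam) ** (1 / d)
                / (nu * psi(-lam) ** (1 / d)))

    lo, hi = gamma / lam_t_p, lam_p            # admissible range (5.24)
    if rhs(lo) < lo:                           # case 1
        return 1, lo, lam_t_p
    if rhs(hi) > hi:                           # case 2
        return 2, hi, gamma / hi
    for _ in range(iters):                     # case 3: bisection on rhs(lam) - lam
        mid = 0.5 * (lo + hi)
        if rhs(mid) > mid:
            lo = mid
        else:
            hi = mid
    lam1 = 0.5 * (lo + hi)
    return 3, lam1, gamma / lam1
```

With the illustrative choice $\psi(x) = \tilde\psi(x) = 1 - x$, $b = 0.9$, $\nu = 1$, $d = 2$ and $\tau = \tilde\tau = 0.5$, the routine lands in case 3 and returns a pair with $\lambda_1 \tilde\lambda_1 = \gamma$, in agreement with (5.35).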
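Hence we may finally define our map $T : E \to E$ by
\[ T : (\psi, \lambda, \tilde\psi, \tilde\lambda) \mapsto (\psi_1, \lambda_1, \tilde\psi_1, \tilde\lambda_1). \tag{5.37} \]
Because all quantities are defined in a continuous manner, it follows from the standard continuity results for composition operators that $T$ is a continuous function. We can therefore apply the Schauder-Tikhonov theorem to deduce that $T$ has a fixed point. From now on we let $(\psi, \lambda, \tilde\psi, \tilde\lambda)$ be the fixed point so that $\psi = \psi_1$, $\lambda = \lambda_1$, $\tilde\psi = \tilde\psi_1$ and $\tilde\lambda = \tilde\lambda_1$.

5.5. Elimination of $\theta$ and $\tilde\theta$. Our first goal is to show that $\theta$ and $\tilde\theta$ are both the identity function. We have first that
\[ \psi(x) = \psi_1(x) = \tilde\tau^{-1} \tilde\psi(\tilde\phi(x)), \tag{5.38a} \]
\[ \tilde\psi(x) = \tilde\psi_1(x) = \tau^{-1} \psi(\phi(x)); \tag{5.38b} \]
\[ \phi(x) = \theta\bigl(\psi(-\lambda x)^{1/d} / \psi(-\lambda)^{1/d}\bigr), \tag{5.39a} \]
\[ \tilde\phi(x) = \tilde\theta\bigl(\tilde\psi(-\tilde\lambda x)^{1/d} / \tilde\psi(-\tilde\lambda)^{1/d}\bigr). \tag{5.39b} \]
Then, a priori, we have that $\psi \in AH(-b^{-1}\nu^{-1}, b^{-2})$ and $\tilde\psi \in AH(-b^{-1}\nu, b^{-2})$. But $\phi$ and $\tilde\phi$ are defined on larger domains so that in fact we have $\phi \in H(-\lambda^{-1}, \lambda^{-1} b^{-1} \nu^{-1})$ and $\tilde\phi \in H(-\tilde\lambda^{-1}, \tilde\lambda^{-1} b^{-1} \nu)$, and by construction of $\theta$, $\tilde\theta$,
\[ \phi((-\lambda^{-1}, b^{-2})) \subseteq (0, b^{-2}), \qquad \tilde\phi((-\tilde\lambda^{-1}, b^{-2})) \subseteq (0, b^{-2}), \tag{5.40} \]
and thus $\psi \in AH(-\tilde\lambda^{-1}, b^{-2})$, $\tilde\psi \in AH(-\lambda^{-1}, b^{-2})$. Moreover
\[ \psi(-\tilde\lambda^{-1})^{1/d} \le \tilde\tau^{-1/d}, \qquad \tilde\psi(-\lambda^{-1})^{1/d} \le \tau^{-1/d}. \tag{5.41} \]
Multiplying, we get
\[ \psi(-\tilde\lambda^{-1})^{1/d}\, \tilde\psi(-\lambda^{-1})^{1/d} \le (\tau\tilde\tau)^{-1/d} \le (\lambda\tilde\lambda)^{-1}, \tag{5.42} \]
and since both factors in the leftmost expression are larger than one, we obtain
\[ \psi(-\tilde\lambda^{-1})^{1/d} \le (\lambda\tilde\lambda)^{-1}, \qquad \tilde\psi(-\lambda^{-1})^{1/d} \le (\lambda\tilde\lambda)^{-1}. \tag{5.43} \]
Hence $\phi_1 \in H(-\lambda^{-1}, (\lambda\tilde\lambda)^{-1})$, and
\[ \phi_1((\lambda\tilde\lambda)^{-1}) = \frac{\psi(-\lambda(\lambda\tilde\lambda)^{-1})^{1/d}}{\psi(-\lambda)^{1/d}} \le (\lambda\tilde\lambda)^{-1}. \tag{5.44} \]
Since $\phi_1$ is a Herglotz function with $\phi_1(1) = 1$, we have, using Lemma 1, that $\phi_1(x) < x$ for $1 < x < (\lambda\tilde\lambda)^{-1}$. In particular $\phi_1(b^{-2}) < b^{-2}$ and $\theta(x) = x$. Hence $\phi_1 = \phi$. A similar argument, mutatis mutandis, shows that $\tilde\phi_1(b^{-2}) < b^{-2}$ and $\tilde\theta(x) = x$, and, hence, that $\tilde\phi_1 = \tilde\phi$.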
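We deduce that
\[ \psi(x) = \tilde\tau^{-1} \tilde\psi(\tilde\phi(x)), \qquad \tilde\psi(x) = \tau^{-1} \psi(\phi(x)), \tag{5.45} \]
\[ \phi(x) = \frac{\psi(-\lambda x)^{1/d}}{\psi(-\lambda)^{1/d}}, \qquad \tilde\phi(x) = \frac{\tilde\psi(-\tilde\lambda x)^{1/d}}{\tilde\psi(-\tilde\lambda)^{1/d}}, \tag{5.46} \]
with either $\lambda\tilde\lambda = b^2$ (if and only if $\lambda = \lambda_+$, $\tilde\lambda = \tilde\lambda_+$) (case 0); or $\lambda\tilde\lambda = (\tau\tilde\tau)^{1/d}$ and $\tilde\lambda = \tilde\lambda_+$ (case 1); or $\lambda\tilde\lambda = (\tau\tilde\tau)^{1/d}$ and $\lambda = \lambda_+$ (case 2); or $\lambda\tilde\lambda = (\tau\tilde\tau)^{1/d}$ and
\[ \lambda = \frac{\tau^{1/d}\, \tilde\psi(-\tilde\lambda)^{1/d}}{\nu\, \psi(-\lambda)^{1/d}}. \tag{5.47} \]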
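Our objective now is to show that, for $b$ chosen sufficiently close to one, cases 0, 1, 2 cannot occur, so that case 3 applies. To do this we require some properties of the fixed point $(\psi, \lambda, \tilde\psi, \tilde\lambda)$.

5.6. Properties of a fixed point.

Lemma 3. From the equations
\[ \tilde\psi(x) = \tau^{-1} \psi(\phi(x)), \qquad \psi(x) = \tilde\tau^{-1} \tilde\psi(\tilde\phi(x)), \tag{5.48} \]
we may deduce

1. $0 < \tau\tilde\psi(x) < 1$ for $x \in (-\lambda^{-1}, 1)$, and $0 < \tilde\tau\psi(x) < 1$ for $x \in (-\tilde\lambda^{-1}, 1)$.

2. $\psi$ and $\tilde\psi$ may be extended to the intervals $(-\tilde\lambda^{-1}, (\lambda\tilde\lambda)^{-1})$ and $(-\lambda^{-1}, (\lambda\tilde\lambda)^{-1})$ respectively, so that they satisfy the following bounds. For $x < 0$:
\[ \frac{1 - x}{1 - \lambda\tilde\lambda x} \le \psi(x) \le \frac{1 - x}{1 + \tilde\lambda x}, \tag{5.49a} \]
\[ \frac{1 - x}{1 - \lambda\tilde\lambda x} \le \tilde\psi(x) \le \frac{1 - x}{1 + \lambda x}; \tag{5.49b} \]
and for $0 < x < 1$:
\[ \frac{1 - x}{1 + \tilde\lambda x} \le \psi(x) \le \frac{1 - x}{1 - \lambda\tilde\lambda x}, \tag{5.50a} \]
\[ \frac{1 - x}{1 + \lambda x} \le \tilde\psi(x) \le \frac{1 - x}{1 - \lambda\tilde\lambda x}. \tag{5.50b} \]

3. Writing
\[ z_1 = \psi(-\lambda)^{-1/d}, \qquad \tilde z_1 = \tilde\psi(-\tilde\lambda)^{-1/d}, \tag{5.51} \]
we have
\[ \tau^{1/d} \le \tilde z_1, \qquad \tilde\tau^{1/d} \le z_1, \tag{5.52} \]
and
\[ \lambda\tilde\lambda \le (\tau\tilde\tau)^{1/d} \le z_1 \tilde z_1. \tag{5.53} \]
Note that $\tau = \psi(z_1)$, and $\tilde\tau = \tilde\psi(\tilde z_1)$.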
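Proof. 1. The result follows immediately from the fact that $\phi((-\lambda^{-1}, 1)) \subseteq (0, 1)$ and $\tilde\phi((-\tilde\lambda^{-1}, 1)) \subseteq (0, 1)$.

2. The extension of $\psi$, $\tilde\psi$ is straightforward. We have that $1 < \phi(x) < x$ and $1 < \tilde\phi(x) < x$ for $1 < x < (\lambda\tilde\lambda)^{-1}$, so there exists $n \in \mathbb{N}$ so that
\[ (\phi \circ \tilde\phi)^n\bigl((1, (\lambda\tilde\lambda)^{-1})\bigr) \subseteq (1, b^{-2}). \tag{5.54} \]
We then denote, for $1 < x < (\lambda\tilde\lambda)^{-1}$,
\[ \psi_{\mathrm{ext}}(x) = (\tau\tilde\tau)^{-n}\, \psi\bigl((\phi \circ \tilde\phi)^n(x)\bigr). \tag{5.55} \]
By iterating (5.45), we see that this function coincides with $\psi$ sufficiently close to one, and since $\psi$ and $\psi_{\mathrm{ext}}$ are both analytic, $\psi_{\mathrm{ext}}$ is an analytic extension of $\psi$. We denote this extension again by $\psi$. The extension of $\tilde\psi$ is analogous.

3. This follows from 1, and the fact that $\lambda\tilde\lambda \le (\tau\tilde\tau)^{1/d}$.

5.7. Elimination of “bad” cases. Our goal is now to eliminate cases 0, 1, and 2 above.

5.7.1. Elimination of case 0. Suppose $\lambda = \lambda_+$, $\tilde\lambda = \tilde\lambda_+$, $\lambda\tilde\lambda = b^2$. Then
\[ \begin{aligned} b^{2d} = (\lambda\tilde\lambda)^d \le \tau\tilde\tau = \psi(z_1)\tilde\psi(\tilde z_1) &\le \frac{1 - z_1}{1 - \lambda\tilde\lambda z_1} \cdot \frac{1 - \tilde z_1}{1 - \lambda\tilde\lambda \tilde z_1} \\ &\le \frac{1 - z_1 \tilde z_1}{1 - \lambda\tilde\lambda z_1 \tilde z_1} \\ &\le \frac{1 - \lambda\tilde\lambda}{1 - (\lambda\tilde\lambda)^2} \quad \text{using (5.53)} \\ &= \frac{1 - b^2}{1 - b^4} = \frac{1}{1 + b^2}. \end{aligned} \tag{5.56} \]
By taking $b$ sufficiently close to one we obtain a contradiction.

Remark. We have used the following crude inequality, valid for $0 < c, x, y < 1$:
\[ \frac{1 - x}{1 - cx} \cdot \frac{1 - y}{1 - cy} \le \frac{1 - x}{1 - cx} \le \frac{1 - xy}{1 - cxy}. \tag{5.57} \]

5.7.2. Elimination of case 1. We have $\lambda\tilde\lambda = (\tau\tilde\tau)^{1/d}$, $\tilde\lambda = \tilde\lambda_+$, and
\[ \frac{\tau^{1/d}\, \tilde\psi(-\tilde\lambda)^{1/d}}{\nu\, \psi(-\lambda)^{1/d}} < \lambda. \tag{5.58} \]
Our aim is to obtain a contradiction by bounding $\lambda\tilde\lambda$ away from 1. Now
\[ \tilde\lambda = \frac{(\tau\tilde\tau)^{1/d}}{\lambda} < \frac{\nu (\tau\tilde\tau)^{1/d}\, \psi(-\lambda)^{1/d}}{\tau^{1/d}\, \tilde\psi(-\tilde\lambda)^{1/d}} = \frac{\nu\, \tilde\tau^{1/d}\, \psi(-\lambda)^{1/d}}{\tilde\psi(-\tilde\lambda)^{1/d}}, \tag{5.59} \]
so that
\[ b = \tilde\lambda_+ \nu^{-1} = \tilde\lambda \nu^{-1} < \tilde z_1\, \psi(-\lambda)^{1/d}\, \tilde\tau^{1/d} \le \tilde z_1, \tag{5.60} \]
using Lemma 3. Thus
\[ \lambda\tilde\lambda_+ = (\tau\tilde\tau)^{1/d} \le \tau^{1/d}\, \tilde\psi(b)^{1/d} \le \tilde\psi(b)^{1/d}, \tag{5.61} \]
hence, since $\tilde\psi$ is decreasing,
\[ \lambda \le \frac{1}{b\nu}\, \tilde\psi(b)^{1/d} \le \frac{1}{b\nu} \left( \frac{1 - b}{1 - \lambda\tilde\lambda b} \right)^{1/d} \le \frac{1}{b\nu} \left( \frac{1 - b}{1 - b^3} \right)^{1/d} \le \tilde K \nu^{-1}, \tag{5.62} \]
where $0 < \tilde K < b < 1$, $\tilde K$ is independent of $b$, and provided $b$ is sufficiently close to one. Hence
\[ \begin{aligned} \frac{\nu\, \psi(-\lambda)^{1/d}}{\tilde\psi(-\tilde\lambda)^{1/d}} < \nu\, \psi(-\lambda)^{1/d} &\le \nu\, \psi(-\tilde K \nu^{-1})^{1/d} \\ &\le \nu \left( \frac{1 + \tilde K \nu^{-1}}{1 - \tilde\lambda \tilde K \nu^{-1}} \right)^{1/d} = \nu \left( \frac{1 + \tilde K \nu^{-1}}{1 - \tilde K b} \right)^{1/d} \\ &\le \nu \left( \frac{1 + \tilde K \nu^{-1}}{1 - \tilde K} \right)^{1/d} = \tilde K', \end{aligned} \tag{5.63} \]
where $\tilde K'$ is a constant independent of $b$. Hence
\[ \begin{aligned} b\nu = \tilde\lambda_+ < \frac{\nu\, \tilde\tau^{1/d}\, \psi(-\lambda)^{1/d}}{\tilde\psi(-\tilde\lambda)^{1/d}} &\le \tilde K' \tilde\tau^{1/d} = \tilde K' \tilde\psi(\tilde z_1)^{1/d} \\ &\le \tilde K' \tilde\psi(b)^{1/d} \quad \text{by (5.60)} \\ &\le \tilde K' \left( \frac{1 - b}{1 - \lambda\tilde\lambda b} \right)^{1/d}. \end{aligned} \tag{5.64} \]
But $\lambda\tilde\lambda \le \tilde K \nu^{-1} b\nu$, so we have
\[ b\nu \le \tilde K' \left( \frac{1 - b}{1 - \tilde K b^2} \right)^{1/d} \le \frac{\tilde K' (1 - b)^{1/d}}{(1 - \tilde K)^{1/d}}. \tag{5.65} \]
Since the right-hand side of this expression tends to zero as $b \to 1$ we have a contradiction for $b$ sufficiently close to one.

5.7.3. Elimination of case 2. The elimination of case 2 is analogous to that for case 1.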
5.8. Conclusion of the proof of Theorem 2. Since we have eliminated the “bad” cases 0, 1, and 2, we conclude that case 3 must apply and so $\lambda$, $\tilde\lambda$ satisfy
\[ \lambda\tilde\lambda = (\tau\tilde\tau)^{1/d}, \tag{5.66} \]
and
\[ \lambda = \frac{\tau^{1/d}\, \tilde\psi(-\tilde\lambda)^{1/d}}{\nu\, \psi(-\lambda)^{1/d}}. \tag{5.67} \]
It follows that
\[ \tilde\lambda = \frac{\nu\, \tilde\tau^{1/d}\, \psi(-\lambda)^{1/d}}{\tilde\psi(-\tilde\lambda)^{1/d}}. \tag{5.68} \]
In conclusion, we have shown that, provided $b$ is chosen sufficiently close to 1, so that Eqs. (4.6) have a solution and cases 0, 1, and 2 cannot occur, we have a fixed point quadruple $(\psi, \lambda, \tilde\psi, \tilde\lambda) = (\psi_1, \lambda_1, \tilde\psi_1, \tilde\lambda_1)$ of $T$ that satisfies Eqs. (5.3)–(5.6). We note that the functions $\psi$, $\tilde\psi$ may be extended to the domains $(-\tilde\lambda^{-1}, (\lambda\tilde\lambda)^{-1})$ and $(-\lambda^{-1}, (\lambda\tilde\lambda)^{-1})$ respectively (see Lemma 3) and that Eqs. (5.3) are satisfied on these larger domains. The proof of Theorem 2 is complete.
6. Proof that Theorem 1 Follows from Theorem 2

Proof. Let us assume that $(\psi, \tilde\psi)$ is a solution of (3.1). We may construct a solution of (1.1) as follows. We define
\[ U(x) = z_1^{\,d}\, \psi(x), \qquad \tilde U(x) = \tilde z_1^{\,d}\, \tilde\psi(x). \tag{6.1} \]
Now $U$ and $\tilde U$ satisfy the equations:
\[ U(x) = \mu \tilde\lambda^{-d}\, \tilde U\bigl(\tilde U(-\tilde\lambda x)^{1/d}\bigr) \quad \text{valid for } x \in (-\tilde\lambda^{-1}, (\lambda\tilde\lambda)^{-1}), \tag{6.2a} \]
\[ \tilde U(x) = \mu^{-1} \lambda^{-d}\, U\bigl(U(-\lambda x)^{1/d}\bigr) \quad \text{valid for } x \in (-\lambda^{-1}, (\lambda\tilde\lambda)^{-1}), \tag{6.2b} \]
and $U$ and $\tilde U$ also satisfy the normalisations
\[ U(1) = 0, \qquad U(-\lambda) = 1, \tag{6.3a} \]
\[ \tilde U(1) = 0, \qquad \tilde U(-\tilde\lambda) = 1. \tag{6.3b} \]
We note that $U$ and $\tilde U$ are both strictly decreasing functions on their real domains. We may therefore define their inverses:
\[ F_R(x) = U^{-1}(x), \qquad \tilde F_R(x) = \tilde U^{-1}(x). \tag{6.4} \]
Then $F_R$ and $\tilde F_R$ are analytic on open intervals of $[0, 1]$. We note that $F_R$ and $\tilde F_R$ satisfy the normalisation conditions
\[ F_R(0) = 1, \qquad \tilde F_R(0) = 1, \tag{6.5} \]
in consequence of (6.3). Furthermore $U$ and $\tilde U$ satisfy
\[ U([-\lambda, 1]) = [0, 1], \tag{6.6a} \]
\[ \tilde U([-\tilde\lambda, 1]) = [0, 1], \tag{6.6b} \]
and
\[ \mu^{-1} \lambda^{-d}\, U([0, 1]) = [0, \mu^{-1} \lambda^{-d} z_1^{\,d}], \tag{6.7a} \]
\[ \mu \tilde\lambda^{-d}\, \tilde U([0, 1]) = [0, \mu \tilde\lambda^{-d} \tilde z_1^{\,d}]. \tag{6.7b} \]
Provided $F_R(\mu \lambda^d x) > 0$ and $\tilde F_R(\mu^{-1} \tilde\lambda^d x) > 0$ (so that the $d$th power is analytic), we may invert Eqs. (6.2) so that the functions $F_R$ and $\tilde F_R$ satisfy:
\[ F_R(x) = -\tilde\lambda^{-1}\, \tilde F_R\bigl(\tilde F_R(\mu^{-1} \tilde\lambda^d x)^d\bigr), \tag{6.8a} \]
\[ \tilde F_R(x) = -\lambda^{-1}\, F_R\bigl(F_R(\mu \lambda^d x)^d\bigr). \tag{6.8b} \]
We observe that, in view of the first statement of Lemma 3 and Eqs. (5.6) at the fixed point, we have
\[ \mu^{-1} \lambda^{-d} z_1^{\,d} > 1, \tag{6.9a} \]
\[ \mu \tilde\lambda^{-d} \tilde z_1^{\,d} > 1. \tag{6.9b} \]
From this and (6.7), we see that the above equations are valid on open intervals of $[0, 1]$ and $F_R$ and $\tilde F_R$ are analytic at all points of these intervals. Note also that $F_R$ and $\tilde F_R$ satisfy:
\[ F_R([0, 1]) = [-\lambda, 1], \tag{6.10a} \]
\[ \tilde F_R([0, 1]) = [-\tilde\lambda, 1]. \tag{6.10b} \]
We now define $f_R$ and $\tilde f_R$ by
\[ f_R(x) = F_R(|x|^d), \qquad \tilde f_R(x) = \tilde F_R(|x|^d). \tag{6.11} \]
$f_R$ and $\tilde f_R$ satisfy the equations:
\[ f_R(x) = -\tilde\lambda^{-1}\, \tilde f_R\bigl(\tilde f_R(\tilde\lambda \mu^{-1/d} x)\bigr), \tag{6.12a} \]
\[ \tilde f_R(x) = -\lambda^{-1}\, f_R\bigl(f_R(\lambda \mu^{1/d} x)\bigr), \tag{6.12b} \]
with normalisations $f_R(0) = \tilde f_R(0) = 1$. Equations (6.12) hold for open intervals of $[-1, 1]$. Finally, we define $F_L$, $\tilde F_L$ by
\[ F_L(x) = F_R(\mu x), \qquad \tilde F_L(x) = \tilde F_R(\mu^{-1} x), \tag{6.13} \]
and $f_L$ and $\tilde f_L$ by
\[ f_L(x) = F_L(|x|^d), \qquad \tilde f_L(x) = \tilde F_L(|x|^d). \tag{6.14} \]
The functions $F_L$ and $\tilde F_L$ are defined on open intervals of $[0, \mu^{-1}]$ and $[0, \mu]$ respectively. Hence $f_R$, $f_L$ are defined on open intervals of $[-1, 1]$ and $[-\mu^{-1/d}, \mu^{-1/d}]$ respectively and, similarly, $\tilde f_R$, $\tilde f_L$ are defined on open intervals of $[-1, 1]$ and $[-\mu^{1/d}, \mu^{1/d}]$ respectively. It follows from (6.12) that the following equations are satisfied on those intervals:
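\[ f_L(x) = -\tilde\lambda^{-1}\, \tilde f_R\bigl(\tilde f_R(-\tilde\lambda x)\bigr), \tag{6.15a} \]
\[ f_R(x) = -\tilde\lambda^{-1}\, \tilde f_R\bigl(\tilde f_L(-\tilde\lambda x)\bigr), \tag{6.15b} \]
\[ \tilde f_L(x) = -\lambda^{-1}\, f_R\bigl(f_R(-\lambda x)\bigr), \tag{6.15c} \]
\[ \tilde f_R(x) = -\lambda^{-1}\, f_R\bigl(f_L(-\lambda x)\bigr). \tag{6.15d} \]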
Here we have used the fact that the functions are even. In conclusion, it follows from the fact that $U$ and $\tilde U$ are anti-Herglotz functions that $F_L'(0), F_R'(0), \tilde F_L'(0), \tilde F_R'(0) \ne 0$. It is also clear that $\mu = F_L'(0)/F_R'(0) = \tilde F_R'(0)/\tilde F_L'(0)$. The proof of Theorem 1 is complete.

7. Discussion

The proof we have given is unsatisfactory in a number of ways. Firstly, the use of the Schauder-Tikhonov fixed point theorem means that we can only show existence of a solution of the functional equations (1.1). No information is given on the question of uniqueness, nor of how the solution varies with respect to the degree $d$ and the modulus $\mu$. It is to be expected that the solution varies analytically with respect to these parameters, but not even continuity is guaranteed by the Schauder-Tikhonov theorem. Moreover, we have not established the hyperbolicity of the period-two points for fixed $d$ and $\mu$; nor does our method furnish information on the dominant eigenvalue $\delta$ and its variation with $d$ and $\mu$.

It may be possible to apply the more advanced techniques of Sullivan [24] and McMullen [17] to prove convergence of infinitely renormalizable maps to the period-two point under renormalization (at least for the case of $d$ an even integer). However it is not immediately apparent how to extend the theory of polynomial-like maps to this case. It is worth reiterating that the Herglotz function techniques used in this paper impose no restriction on the degree $d$ other than that it be greater than one.

Asymmetry may also be introduced by considering maps with differing left- and right-hand degrees of criticality. Such maps have application to the theory of forced nonlinear oscillators [21]. As observed numerically by Jensen and Ma [15], scaling behaviour for such maps is nongeometric and the renormalization theory is likely to be of a different nature. This is likely to be a fruitful area for further research.

Acknowledgement. This paper was written while B. D. Mestel was a visitor at the Department of Mathematical Sciences, Loughborough University. We would like to thank the referee for alerting us to some previous work in this area.
References

1. Arneodo, A., Coullet, P., Tresser, C.: A renormalization group with periodic behaviour. Phys. Lett. A 70, 74–76 (1979)
2. Campanino, M., Epstein, H.: On the existence of Feigenbaum’s fixed point. Commun. Math. Phys. 79, 261–302 (1981)
3. Campanino, M., Epstein, H., Ruelle, D.: On Feigenbaum’s functional equation g ◦ g(λx) + λg(x) = 0. Topology 21, 125–129 (1982)
4. Collet, P., Eckmann, J.-P.: Iterated Maps on the Interval as Dynamical Systems. Boston: Birkhäuser, 1980
5. Cosenza, M. G., Swift, J. B.: Influence of asymmetry on multifractal properties of maps. Phys. Rev. A 43, 4095–4099 (1991)
6. Coullet, P., Tresser, C.: Itération d’endomorphismes et groupe de renormalisation. J. de Physique C 5, 25–28 (1978)
7. Donoghue, W. F. Jr.: Monotone Matrix Functions and Analytic Continuation. Berlin: Springer-Verlag, 1974
8. Dunford, N., Schwartz, J. T.: Linear Operators I. New York: Interscience, 1957
9. Eckmann, J.-P., Wittwer, P.: Computer Methods and Borel Summability Applied to Feigenbaum’s Equation. Lecture Notes in Physics, vol. 227. Berlin: Springer, 1985
10. Epstein, H.: New proofs of the existence of the Feigenbaum functions. Commun. Math. Phys. 106, 395–426 (1986)
11. Epstein, H.: Fixed points of composition operators. In: Non-linear Evolution and Chaotic Phenomena, ed. G. Gallavotti and P. Zweifel. New York: Plenum, 1988
12. Epstein, H.: Fixed points of composition operators II. Nonlinearity 2, 305–310 (1989)
13. Feigenbaum, M. J.: Quantitative universality for a class of nonlinear transformations. J. Stat. Phys. 19, 25–52 (1978)
14. Feigenbaum, M. J.: The universal metric properties of nonlinear transformations. J. Stat. Phys. 21, 669–706 (1979)
15. Jensen, R. V., Ma, L. K. H.: Nonuniversal behavior of asymmetric unimodal maps. Phys. Rev. A 31, 3993–3995 (1985)
16. Lanford, O. E.: A computer-assisted proof of the Feigenbaum conjectures. Bull. Am. Math. Soc. 6, 427–434 (1982)
17. McMullen, C. T.: Renormalization and 3-Manifolds which Fiber over the Circle. Annals of Mathematical Studies No. 142. Princeton: Princeton University Press, 1996
18. de Melo, W., van Strien, S.: One-Dimensional Dynamics. Berlin: Springer-Verlag, 1993
19. Mestel, B. D., Osbaldestin, A. H.: Renormalisation in implicit complex maps. Physica D 39, 149–162 (1989)
20. Mestel, B. D., Osbaldestin, A. H.: Feigenbaum theory for unimodal maps with asymmetric critical point. J. Phys. A 31, 3287–3296 (1998)
21. Octavio, M., DaCosta, A., Aponte, J.: Nonuniversality and metric properties of a forced nonlinear oscillator. Phys. Rev. A 34, 1512–1515 (1986)
22. de Sousa Vieira, M.: Influence of asymmetries on N-tupling sequences. Phys. Lett. A 143, 279–282 (1990)
23. de Sousa Vieira, M. C., Tsallis, C.: Scaling and multifractality in one-dimensional asymmetric maps. Phys. Rev. A 40, 5305–5310 (1989)
24. Sullivan, D.: Bounds, quadratic differentials and renormalization conjectures. In: Mathematics into the Twenty-first Century, Vol. 2, ed. F. Browder. Providence, RI: Amer. Math. Soc., 1992, pp. 417–466
25. Urumov, V.: Multifurcations of asymmetric maps and their metric properties. Phys. Lett. A 156, 187–191 (1991)

Communicated by Ya. G. Sinai
Commun. Math. Phys. 197, 229 – 246 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
A Solution of the Quantum Knizhnik Zamolodchikov Equation of Type Cn Katsuhisa Mimachi Department of Mathematics, Kyushu University 33, Hakozaki, Fukuoka 812-8581, Japan. E-mail:
[email protected] Received: 23 January 1998 / Accepted: 14 March 1998
Abstract: We construct a solution of Cherednik’s quantum Knizhnik Zamolodchikov equation associated with the root system of type Cn . This solution is given in terms of a restriction of a q-Jordan–Pochhammer integral. As its application, we give an explicit expression of a special case of the Macdonald polynomial of the Cn type. Finally we explain the connection with the representation of the Hecke algebra.
1. Introduction

We study the quantum Knizhnik Zamolodchikov (QKZ) equation ([2]) associated with the root system of type $C_n$. A solution to this equation is found by means of a restriction of the q-Jordan–Pochhammer integral. A solution of the QKZ equation of type $A_{n-1}$ is given in [14]. Since the appearance of that work, however, there has been no progress in the study of the QKZ equation for other types of root systems with regard to the determination of solutions. This paper is devoted to such a task.

To construct our solution, we exploit a family of rational functions which would correspond to a basis of the q-de Rham cohomology attached to the integrand. This turns out to be a natural basis for the representation of the Hecke algebra $H(W)$ through the Lusztig operator $T_i$.

Next, as a byproduct of our investigation, we obtain an integral representation of the special case of an eigenfunction associated with the Macdonald operator of the $C_n$ type. In particular, it is seen that, taking a suitable cycle, a restriction of the q-Jordan–Pochhammer integral expresses the Macdonald polynomial of the $C_n$ type parametrized by the partition $(\lambda, 0, \ldots, 0)$. This integral leads to a more explicit expression.

We believe that the present paper represents a first step toward understanding the $BC_n$ type QKZ equation and the $BC_n$ type Macdonald polynomial. It is noteworthy that even in the classical ($q = 1$) case it was not previously known that such an integral
gives spherical functions associated with the root system $C_n$. For related works on $BC_n$ type spherical functions, we refer the reader to [6] and references therein. Throughout this paper, $q$ is regarded as a real number satisfying $0 \le q < 1$.

2. QKZ Equation of Type $C_n$

We first give a review of the QKZ equation associated with the root system of type $C_n$ for the reader’s convenience, following Cherednik [2] and Kato [7].

Let $E = \bigoplus_{1 \le i \le n} \mathbb{R}\epsilon_i$ be the real Euclidean space with inner product $\langle\ ,\ \rangle$ such that $\langle \epsilon_i, \epsilon_j \rangle = \delta_{ij}$. Let $\Delta = \{ \pm\epsilon_i \pm \epsilon_j\ (1 \le i < j \le n),\ \pm 2\epsilon_i\ (1 \le i \le n) \}$ be the root system of type $C_n$, $\Delta^+ = \{ \epsilon_i \pm \epsilon_j\ (1 \le i < j \le n),\ 2\epsilon_i\ (1 \le i \le n) \}$ the set of positive roots, $\Pi = \{ \alpha_i = \epsilon_i - \epsilon_{i+1}\ (1 \le i \le n - 1),\ \alpha_n = 2\epsilon_n \}$ the set of simple roots, $P = \bigoplus_{1 \le i \le n} \mathbb{Z}\epsilon_i$ the weight lattice, and $P^\vee = \bigoplus_{1 \le i \le n} \mathbb{Z}\epsilon_i + \mathbb{Z}\bigl(\tfrac{1}{2} \sum_{i=1}^n \epsilon_i\bigr)$ the dual weight lattice for the root system $\Delta$. We frequently write $\alpha \in \Delta^+$ as $\alpha > 0$. An element of the group algebra $A = \mathbb{C}[P]$ is denoted by $e^\lambda$, as is customary. Then the Weyl group $W = W(C_n) = \langle s_1, s_2, \ldots, s_n \rangle$ (where each $s_i$ is a standard generator corresponding to the simple root $\alpha_i$) acts on $A$ as $w(e^\lambda) = e^{w\lambda}$ ($w \in W$). The symbol $s_\alpha$ denoting the reflections is defined by $s_\alpha(x) = x - \langle x, \alpha \rangle \alpha^\vee$, with $\alpha^\vee = 2\alpha / \langle \alpha, \alpha \rangle$ for $x \in E$ and $\alpha \in \Delta$.

The set of affine roots associated with $\Delta$ is $\tilde\Delta = \{ \alpha + m\delta\ ;\ \alpha \in \Delta,\ m \in \mathbb{Z} \}$, where $\delta$ denotes the constant function 1 on $E$. The simple roots are $a_0 = -\theta + \delta$ with the highest root $\theta = 2\epsilon_1$ and $a_i = \alpha_i \in \Delta$ for $1 \le i \le n$. We use the symbol introduced above, $s_i$ ($0 \le i \le n$), to also represent the generator for the corresponding affine Weyl group. We note that $s_0 = \tau(\theta^\vee) s_\theta = \tau(\epsilon_1) s_{2\epsilon_1}$, where $\tau(\mu)$ is a translation by $\mu$.

Let us introduce $V$ as the left free $A$-module of rank $|W| = 2^n n!$ with the free basis $h_w$ ($w \in W$); each element $F$ of $V$ can be written uniquely as $F = \sum_{w \in W} f_w h_w$ ($f_w \in A$). Then, let $A^\sim$ be a completion of the quotient field of $A$. We then have $V^\sim = A^\sim \otimes_A V$.
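The rank $|W| = 2^n n!$ quoted above can be confirmed for small $n$ by generating $W(C_n)$ from the simple reflections. The sketch below is an illustration only, with hypothetical function names: it uses the reflection formula $s_\alpha(x) = x - \langle x, \alpha \rangle \alpha^\vee$ in exact rational arithmetic and represents each group element by the images of the standard basis vectors.

```python
from fractions import Fraction
from math import factorial


def simple_roots(n):
    """Simple roots of C_n: e_i - e_{i+1} for i < n, and 2 e_n."""
    roots = []
    for i in range(n - 1):
        alpha = [0] * n
        alpha[i], alpha[i + 1] = 1, -1
        roots.append(tuple(alpha))
    last = [0] * n
    last[n - 1] = 2
    roots.append(tuple(last))
    return roots


def reflect(x, alpha):
    """s_alpha(x) = x - <x, alpha> alpha^vee, with alpha^vee = 2 alpha / <alpha, alpha>."""
    inner = sum(a * b for a, b in zip(x, alpha))
    norm = sum(a * a for a in alpha)
    coeff = Fraction(2 * inner, norm)
    return tuple(xi - coeff * ai for xi, ai in zip(x, alpha))


def weyl_group(n):
    """All elements of W(C_n), each stored as the tuple of images of e_1, ..., e_n."""
    identity = tuple(tuple(1 if i == j else 0 for j in range(n)) for i in range(n))
    gens = simple_roots(n)
    seen = {identity}
    frontier = [identity]
    while frontier:
        nxt = []
        for w in frontier:
            for alpha in gens:
                sw = tuple(reflect(image, alpha) for image in w)  # s_alpha composed with w
                if sw not in seen:
                    seen.add(sw)
                    nxt.append(sw)
        frontier = nxt
    return seen
```

For $n = 2$ and $n = 3$ the generated sets have 8 and 48 elements respectively, matching $2^n n!$; the elements are exactly the signed permutations of the $\epsilon_i$.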
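The action $r_w$ of the Weyl group $W$ on $V^\sim$ is defined by the following:
\[ r_w(f h_y) = w(f)\, h_{wy} \quad \text{for } f \in A \text{ and } w, y \in W. \]
Moreover, the action of the translation $\tau(\mu)$ ($\mu \in P^\vee$) for a parameter $u \in E$ is given by
\[ \tau(\mu) e^\lambda = q^{-\langle \lambda, \mu \rangle} e^\lambda \quad \text{for } \lambda \in P, \qquad \tau(\mu) h_w = q^{\langle \mu, wu \rangle} h_w \quad \text{for } w \in W, \]
and
\[ r_{\tau(\mu)}(f h_w) = \tau(\mu)(f)\, q^{\langle \mu, wu \rangle} h_w \quad \text{for } f \in A \text{ and } w \in W. \]
This is an evaluation representation for which $e^\delta$ is identified with $q$. Hereafter the symbol $r_w$ is used also to represent the element $w$ from the extended affine Weyl group $W_{P^\vee} = W \ltimes P^\vee$ (the semidirect product of $W$ and $P^\vee$). Then $r_w \tau(\epsilon_i) = \tau(w(\epsilon_i))\, r_w$. Note also that, if $w = v\tau(\lambda)$, $v \in W$, $\lambda \in P^\vee$, we have $w(\mu) = v\mu - \langle \lambda, \mu \rangle \delta$ for $\mu \in P$.

For an affine root $\alpha + m\delta$ ($\alpha \in \Delta$, $m \in \mathbb{Z}$), define the R-matrix $R_{\alpha + m\delta}$ as an element of $\mathrm{End}_{A^\sim}(V^\sim)$ by the formula
\[ R_{\alpha + m\delta}\, h_y = \begin{cases} a_{\alpha + m\delta}\, h_y + q^{m \langle \alpha^\vee, yu \rangle}\, b_{\alpha + m\delta}\, h_{s_\alpha y}, & y^{-1}(\alpha) > 0, \\ c_{\alpha + m\delta}\, h_y + q^{m \langle \alpha^\vee, yu \rangle}\, d_{\alpha + m\delta}\, h_{s_\alpha y}, & y^{-1}(\alpha) < 0 \end{cases} \]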
Solution of Quantum Knizhnik Zamolodchikov Equation
231
for y ∈ W , where 1 − q m eα , 1 − tα q m e α tα (1 − q m eα ) = , 1 − tα q m e α
1 − tα , 1 − tα q m e α q m eα (1 − tα ) = 1 − tα q m e α
aα+mδ =
bα+mδ =
cα+mδ
dα+mδ
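and $\alpha\mapsto t_\alpha$ is a $W$-invariant function taking positive values; there are two different $t_\alpha$, which we may write as $t_1 = t_{\pm\epsilon_i\pm\epsilon_j}$, $t_2 = t_{\pm 2\epsilon_j}$. It is seen that
$$ r_w R_\alpha = R_{w(\alpha)}\,r_w \quad\text{for } \alpha\in\tilde\Delta,\ w\in W_{P^\vee}, \tag{2.1} $$
$$ R_\beta^{-1} = R_{-\beta} \quad\text{for } \beta\in\tilde\Delta \tag{2.2} $$
and
$$ \begin{aligned} R_{\epsilon_i-\epsilon_j}R_{\epsilon_i-\epsilon_k}R_{\epsilon_j-\epsilon_k} &= R_{\epsilon_j-\epsilon_k}R_{\epsilon_i-\epsilon_k}R_{\epsilon_i-\epsilon_j}, && 1\le i<j<k\le n,\\ R_{\epsilon_i-\epsilon_j}R_{2\epsilon_i}R_{\epsilon_i+\epsilon_j}R_{2\epsilon_j} &= R_{2\epsilon_j}R_{\epsilon_i+\epsilon_j}R_{2\epsilon_i}R_{\epsilon_i-\epsilon_j}, && 1\le i<j\le n. \end{aligned} \tag{2.3} $$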
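The relations in (2.3) constitute the Yang–Baxter equation associated with the root system of type $C_n$. Then we can state the definition of the QKZ equation for the root system of type $C_n$.

Definition 2.1. The QKZ equation for the root system $C_n$ with a parameter $u = (u_1,\ldots,u_n)\in\mathbb{R}^n$ is the following system of equations:
$$ r^{-1}_{\tau(\epsilon_i)}F = R_{\tau(\epsilon_i)}F, \quad 1\le i\le n, $$
and
$$ r^{-1}_{\tau(\frac12(\epsilon_1+\cdots+\epsilon_n))}F = R_{\tau(\frac12(\epsilon_1+\cdots+\epsilon_n))}F $$
for $F\in V^\sim$, with
$$ R_{\tau(\epsilon_i)} = R_{\epsilon_i-\epsilon_{i-1}+\delta}\cdots R_{\epsilon_i-\epsilon_1+\delta}\,R_{2\epsilon_i+\delta}\,R_{\epsilon_1+\epsilon_i}\cdots R_{\epsilon_{i-1}+\epsilon_i} \times R_{\epsilon_i+\epsilon_{i+1}}\cdots R_{\epsilon_i+\epsilon_n}\,R_{2\epsilon_i}\,R_{\epsilon_i-\epsilon_n}\cdots R_{\epsilon_i-\epsilon_{i+1}} $$
for $1\le i\le n$, and
$$ R_{\tau(\frac12(\epsilon_1+\cdots+\epsilon_n))} = (R_{2\epsilon_1}R_{\epsilon_1+\epsilon_2}R_{\epsilon_1+\epsilon_3}\cdots R_{\epsilon_1+\epsilon_n}) \times (R_{2\epsilon_2}R_{\epsilon_2+\epsilon_3}\cdots R_{\epsilon_2+\epsilon_n}) \times\cdots\times (R_{2\epsilon_{n-1}}R_{\epsilon_{n-1}+\epsilon_n})\,R_{2\epsilon_n}. $$

Remark. If we introduce the operators $L_\mu\in\mathrm{End}_{\mathbb{C}}(V^\sim)$ $(\mu\in P^\vee)$ and $P^u_\mu\in\mathrm{End}_{A^\sim}(V^\sim)$ $(\mu\in P^\vee,\ u\in E)$ defined by
$$ L_\mu\Bigl(\sum f_w h_w\Bigr) = \sum L_\mu(f_w)\,h_w \ \text{ with }\ L_\mu(e^\lambda) = q^{\langle\mu,\lambda\rangle}e^\lambda\ (\lambda\in P) \quad\text{and}\quad P^u_\mu(h_w) = q^{\langle\mu,wu\rangle}h_w, $$
then the equation above can be rewritten as
$$ L_{\epsilon_i}F = P^u_{\epsilon_i}R_{\tau(\epsilon_i)}F \quad\text{and}\quad L_{\frac12(\epsilon_1+\cdots+\epsilon_n)}F = P^u_{\frac12(\epsilon_1+\cdots+\epsilon_n)}R_{\tau(\frac12(\epsilon_1+\cdots+\epsilon_n))}F. $$

Fulfillment of the compatibility condition of the QKZ equation is guaranteed by the Yang–Baxter equation (2.3). In the next section, we will construct a solution of the QKZ equation for the special case $u = -\lambda\epsilon_1$ $(\lambda>0)$ through application of the q-Jordan–Pochhammer integral.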
3. Integrals and Main Result

We introduce the form
$$ \Phi = x^\lambda\prod_{1\le j\le n}\frac{(ty_j/x)_\infty\,(ty_j^{-1}/x)_\infty}{(y_j/x)_\infty\,(y_j^{-1}/x)_\infty}\,\frac{dx}{x}, \tag{3.1} $$
where $(a)_\infty = \prod_{s\ge 0}(1-aq^s)$. This can be regarded as a restriction of the q-Jordan–Pochhammer integral
$$ x^\lambda\prod_{1\le j\le 2n}\frac{(ty_j/x)_\infty}{(y_j/x)_\infty}\,\frac{dx}{x}, $$
which is studied in [14] and [1].

Next, to construct our solution in the case $u = -\lambda\epsilon_1$ $(\lambda>0)$, we use the representation of the Weyl group $W = W(C_n)$ induced from the trivial representation of a parabolic subgroup. As a parabolic subgroup of $W$, we choose the stabilizer $W_1 = \langle s_2,\ldots,s_n\rangle$ of $\epsilon_1$. A set of representatives of the quotient $W/W_1$ is fixed to be
$$ \begin{aligned} & w_1 = e,\quad w_2 = s_1,\quad w_3 = s_2s_1,\ \ldots,\ w_{n+1} = s_n\cdots s_2s_1,\\ & w_{n+2} = s_{n-1}w_{n+1},\quad w_{n+3} = s_{n-2}s_{n-1}w_{n+1},\ \ldots,\ w_{2n} = s_1\cdots s_{n-1}w_{n+1}. \end{aligned} $$
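As an informal illustration (not part of the original argument), the representatives $w_1,\ldots,w_{2n}$ can be realized as signed permutations of $\epsilon_1,\ldots,\epsilon_n$, and one can check that they send $\epsilon_1$ to $\epsilon_1,\ldots,\epsilon_n,-\epsilon_n,\ldots,-\epsilon_1$ in this order. The encoding below (a tuple whose $k$-th entry $\pm m$ stands for $\epsilon_k\mapsto\pm\epsilon_m$) and all function names are assumptions made for this sketch.

```python
# Signed-permutation model of W(C_n) (an assumed encoding for illustration):
# an element is a tuple img with img[k-1] = +m or -m meaning eps_k -> +eps_m or -eps_m.

def simple_reflection(i, n):
    """Return the generator s_i (1 <= i <= n) as a signed permutation."""
    img = list(range(1, n + 1))
    if i < n:
        # s_i swaps eps_i and eps_{i+1}
        img[i - 1], img[i] = img[i], img[i - 1]
    else:
        # s_n sends eps_n to -eps_n
        img[n - 1] = -n
    return tuple(img)

def compose(a, b):
    """Return the product a b (apply b first, then a)."""
    out = []
    for m in b:
        sign = 1 if m > 0 else -1
        out.append(sign * a[abs(m) - 1])
    return tuple(out)

def coset_representatives(n):
    """Return [w_1, ..., w_{2n}] following the list of representatives above."""
    s = {i: simple_reflection(i, n) for i in range(1, n + 1)}
    reps = [tuple(range(1, n + 1))]          # w_1 = e
    for i in range(1, n + 1):                # w_{i+1} = s_i w_i for i = 1..n
        reps.append(compose(s[i], reps[-1]))
    for i in range(n - 1, 0, -1):            # w_{n+2} = s_{n-1} w_{n+1}, ..., w_{2n} = s_1 w_{2n-1}
        reps.append(compose(s[i], reps[-1]))
    return reps

if __name__ == "__main__":
    # Images of eps_1 under w_1, ..., w_{2n} for n = 3
    print([w[0] for w in coset_representatives(3)])
```

Since the images of $\epsilon_1$ are pairwise distinct, the $2n$ elements lie in distinct cosets of the stabilizer $W_1$.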
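It is seen that the element $h_e = \sum_{g\in W_1}h_g$ is invariant under the action of $W_1$ and the induced representation of $W$ from $h_e$ is given by the elements
$$ h_{w_i} = \sum_{g\in W_1}h_{w_i g} \quad (1\le i\le 2n). $$
Using $w_i$ as suffixes, we define the following rational functions:
$$ \varphi_{w_i} = \begin{cases} \dfrac{\prod_{1\le\mu<i}\bigl(1-\frac{y_\mu^{-1}}{x}\bigr)}{\prod_{1\le\mu\le i}\bigl(1-t\frac{y_\mu^{-1}}{x}\bigr)}, & 1\le i\le n,\\[14pt] \dfrac{\prod_{1\le\mu\le n}\bigl(1-\frac{y_\mu^{-1}}{x}\bigr)\ \prod_{2n-i+1<\mu\le n}\bigl(1-\frac{y_\mu}{x}\bigr)}{\prod_{1\le\mu\le n}\bigl(1-t\frac{y_\mu^{-1}}{x}\bigr)\ \prod_{2n-i+1\le\mu\le n}\bigl(1-t\frac{y_\mu}{x}\bigr)}, & n+1\le i\le 2n. \end{cases} $$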
Associated with the function $\Phi$, we write
$$ \langle\psi\rangle = \int_C\psi\,\Phi $$
for a rational function $\psi$ and a fixed cycle $C$, and define the element $\Psi$ by
$$ \Psi = \sum_{1\le i\le 2n}\langle\varphi_{w_i}\rangle\,h_{w_i}. $$
Then we obtain the following, which will be proven in the next section.

Proposition 3.1. $r_{a_i}\Psi = R_{a_i}\Psi$ for $0\le i\le n$.
We are now in a position to state our main result.

Theorem 3.2. The function
$$ \Psi = \sum_{1\le i\le 2n}\langle\varphi_{w_i}\rangle\,h_{w_i} $$
satisfies the QKZ equation of type $C_n$ with the parameter $u = -\lambda\epsilon_1$ $(\lambda>0)$ and $t_1 = t_2 = t$:
$$ r^{-1}_{\tau(\epsilon_i)}\Psi = R_{\tau(\epsilon_i)}\Psi, \quad 1\le i\le n, \tag{3.2} $$
and
$$ r^{-1}_{\tau(\frac12(\epsilon_1+\cdots+\epsilon_n))}\Psi = R_{\tau(\frac12(\epsilon_1+\cdots+\epsilon_n))}\Psi. \tag{3.3} $$
From this point we use the identification $y_i = e^{\epsilon_i}$ for $1\le i\le n$. It is seen that a system of fundamental solutions is obtained by taking suitable linearly independent cycles.

Proof. We first note
$$ r^{-1}_{\tau(\epsilon_1)}\Psi = r_{s_\theta s_0}\Psi. $$
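Proposition 3.1 and (2.1) imply
$$ r_{s_\theta s_0}\Psi = r_{s_\theta}r_{s_0}\Psi = r_{s_\theta}R_{\alpha_0}\Psi = R_{s_\theta(\alpha_0)}\,r_{s_\theta}\Psi. $$
Applying this process repeatedly, we finally obtain
$$ \begin{aligned} r_{s_\theta s_0}\Psi &= R_{s_\theta(\alpha_0)}\,R_{(s_1\cdots s_n)(s_{n-1}\cdots s_2)(\alpha_1)}\,R_{(s_1\cdots s_n)(s_{n-1}\cdots s_3)(\alpha_2)}\cdots R_{(s_1\cdots s_n)(\alpha_{n-1})}\\ &\qquad\times R_{(s_1\cdots s_{n-1})(\alpha_n)}\cdots R_{s_1(\alpha_2)}\,R_{\alpha_1}\Psi\\ &= R_{2\epsilon_1+\delta}\,R_{\epsilon_1+\epsilon_2}\cdots R_{\epsilon_1+\epsilon_n}\,R_{2\epsilon_1}\,R_{\epsilon_1-\epsilon_n}\cdots R_{\epsilon_1-\epsilon_2}\Psi, \end{aligned} $$
since $s_\theta = (s_1\cdots s_{n-1})(s_n\cdots s_1)$. Thus we have
$$ r^{-1}_{\tau(\epsilon_1)}\Psi = R_{2\epsilon_1+\delta}\,R_{\epsilon_1+\epsilon_2}\cdots R_{\epsilon_1+\epsilon_n}\,R_{2\epsilon_1}\,R_{\epsilon_1-\epsilon_n}\cdots R_{\epsilon_1-\epsilon_2}\Psi. \tag{3.4} $$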
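Next, let us apply $r_{s_{i-1}\cdots s_1}$ on both sides of (3.4). Then the left-hand side is
$$ \begin{aligned} r_{s_{i-1}\cdots s_1}\,r^{-1}_{\tau(\epsilon_1)}\Psi &= r^{-1}_{\tau(s_{i-1}\cdots s_1(\epsilon_1))}\,r_{s_{i-1}\cdots s_1}\Psi\\ &= r^{-1}_{\tau(s_{i-1}\cdots s_1(\epsilon_1))}\,R_{s_{i-1}\cdots s_2(\alpha_1)}\,R_{s_{i-1}\cdots s_3(\alpha_2)}\cdots R_{s_{i-1}(\alpha_{i-2})}\,R_{\alpha_{i-1}}\Psi\\ &= r^{-1}_{\tau(\epsilon_i)}\,R_{\epsilon_1-\epsilon_i}R_{\epsilon_2-\epsilon_i}\cdots R_{\epsilon_{i-2}-\epsilon_i}R_{\epsilon_{i-1}-\epsilon_i}\Psi\\ &= R_{\epsilon_1-\epsilon_i-\delta}R_{\epsilon_2-\epsilon_i-\delta}\cdots R_{\epsilon_{i-2}-\epsilon_i-\delta}R_{\epsilon_{i-1}-\epsilon_i-\delta}\,r^{-1}_{\tau(\epsilon_i)}\Psi. \end{aligned} $$
This follows from the relation $\tau(-\epsilon_i)(\epsilon_j-\epsilon_i) = \epsilon_j-\epsilon_i-\delta$. On the other hand, the right-hand side is
$$ r_{s_{i-1}\cdots s_1}\,R_{2\epsilon_1+\delta}R_{\epsilon_1+\epsilon_2}\cdots R_{\epsilon_1+\epsilon_n}R_{2\epsilon_1}R_{\epsilon_1-\epsilon_n}\cdots R_{\epsilon_1-\epsilon_2}\Psi = R_{2\epsilon_i+\delta}R_{\epsilon_1+\epsilon_i}R_{\epsilon_2+\epsilon_i}\cdots R_{\epsilon_{i-1}+\epsilon_i}R_{\epsilon_i+\epsilon_{i+1}}\cdots R_{\epsilon_i+\epsilon_n}R_{2\epsilon_i}\Psi. $$
Here we have used
$$ r^{-1}_{s_{i-1}\cdots s_1}\Psi = R_{\epsilon_i-\epsilon_n}\cdots R_{\epsilon_{n-1}-\epsilon_n}\Psi. $$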
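Therefore we reach the desired relation (3.2) by using (2.2).

Next we proceed to derive (3.3). For $1\le i\le n$, we have
$$ r^{-1}_{\tau(\frac12(\epsilon_1+\cdots+\epsilon_n))}\langle\varphi_{w_i}\rangle = \int_C x^\lambda \prod_{k=1}^{n}\frac{(q^{1/2}ty_k/x)_\infty}{(q^{1/2}y_k/x)_\infty}\cdot \frac{\prod_{k=i+1}^{n}(q^{-1/2}ty_k^{-1}/x)_\infty\ \prod_{k=1}^{i}(q^{1/2}ty_k^{-1}/x)_\infty}{\prod_{k=i}^{n}(q^{-1/2}y_k^{-1}/x)_\infty\ \prod_{k=1}^{i-1}(q^{1/2}y_k^{-1}/x)_\infty}\,\frac{dx}{x} \tag{3.5} $$
and
$$ r^{-1}_{\tau(\frac12(\epsilon_1+\cdots+\epsilon_n))}\langle\varphi_{w_{n+i}}\rangle = \int_C x^\lambda\, \frac{\prod_{k=1}^{n-i}(q^{1/2}ty_k/x)_\infty\ \prod_{k=n-i+1}^{n}(q^{3/2}ty_k/x)_\infty}{\prod_{k=1}^{n-i+1}(q^{1/2}y_k/x)_\infty\ \prod_{k=n-i+2}^{n}(q^{3/2}y_k/x)_\infty}\cdot \prod_{k=1}^{n}\frac{(q^{1/2}ty_k^{-1}/x)_\infty}{(q^{1/2}y_k^{-1}/x)_\infty}\,\frac{dx}{x}. \tag{3.6} $$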
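By changing the integration variable such that $x\mapsto q^{-1/2}x$, from (3.5) we have
$$ \begin{aligned} r^{-1}_{\tau(\frac12(\epsilon_1+\cdots+\epsilon_n))}\langle\varphi_{w_i}\rangle &= q^{-\lambda/2}\int_C x^\lambda \prod_{k=1}^{n}\frac{(qty_k/x)_\infty}{(qy_k/x)_\infty}\cdot \frac{\prod_{k=i+1}^{n}(ty_k^{-1}/x)_\infty\ \prod_{k=1}^{i}(qty_k^{-1}/x)_\infty}{\prod_{k=i}^{n}(y_k^{-1}/x)_\infty\ \prod_{k=1}^{i-1}(qy_k^{-1}/x)_\infty}\,\frac{dx}{x}\\ &= q^{-\lambda/2}\,\langle g\varphi_{w_{n+i}}\rangle \end{aligned} \tag{3.7} $$
with
$$ g = s_n(s_{n-1}s_n)(s_{n-2}s_{n-1}s_n)\cdots(s_1\cdots s_n)\in W. $$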
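Here we note $g(\epsilon_i) = -\epsilon_{n-i+1}$ for each $1\le i\le n$. Similarly, as a result of the change $x\mapsto q^{1/2}x$, from (3.6) we have
$$ \begin{aligned} r^{-1}_{\tau(\frac12(\epsilon_1+\cdots+\epsilon_n))}\langle\varphi_{w_{n+i}}\rangle &= q^{\lambda/2}\int_C x^\lambda\, \frac{\prod_{k=1}^{n-i}(ty_k/x)_\infty\ \prod_{k=n-i+1}^{n}(qty_k/x)_\infty}{\prod_{k=1}^{n-i+1}(y_k/x)_\infty\ \prod_{k=n-i+2}^{n}(qy_k/x)_\infty}\cdot \prod_{k=1}^{n}\frac{(ty_k^{-1}/x)_\infty}{(y_k^{-1}/x)_\infty}\,\frac{dx}{x}\\ &= q^{\lambda/2}\,\langle g\varphi_{w_i}\rangle \end{aligned} $$
with the same $g\in W$. As for this $g = s_n(s_{n-1}s_n)(s_{n-2}s_{n-1}s_n)\cdots(s_1\cdots s_n)\in W$, we have
$$ g w_i = w_{n+i}\,s_n(s_{n-1}s_n)(s_{n-2}s_{n-1}s_n)\cdots(s_2\cdots s_{n-1}s_n), \qquad g w_{n+i} = w_i\,s_n(s_{n-1}s_n)(s_{n-2}s_{n-1}s_n)\cdots(s_2\cdots s_{n-1}s_n) $$
for $1\le i\le n$. These relations lead to
$$ g h_{w_i} = h_{g w_i} = h_{w_{n+i}}, \qquad g h_{w_{n+i}} = h_{g w_{n+i}} = h_{w_i} $$
for $1\le i\le n$. On the other hand, noting $u = -\lambda\epsilon_1$ (so that $w_i u = -\lambda\epsilon_i$ and $w_{n+i}u = \lambda\epsilon_{n-i+1}$), we obtain
$$ \begin{aligned} \tau\bigl(-\tfrac12(\epsilon_1+\cdots+\epsilon_n)\bigr)h_{w_i} &= q^{\langle -\frac12(\epsilon_1+\cdots+\epsilon_n),\,-\lambda\epsilon_i\rangle}h_{w_i} = q^{\lambda/2}h_{w_i},\\ \tau\bigl(-\tfrac12(\epsilon_1+\cdots+\epsilon_n)\bigr)h_{w_{n+i}} &= q^{\langle -\frac12(\epsilon_1+\cdots+\epsilon_n),\,\lambda\epsilon_{n-i+1}\rangle}h_{w_{n+i}} = q^{-\lambda/2}h_{w_{n+i}} \end{aligned} $$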
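for $1\le i\le n$. Combining these relations, we get
$$ \begin{aligned} \tau\bigl(-\tfrac12(\epsilon_1+\cdots+\epsilon_n)\bigr)\Psi &= \tau\bigl(-\tfrac12(\epsilon_1+\cdots+\epsilon_n)\bigr)\sum_{1\le i\le n}\bigl\{\langle\varphi_{w_i}\rangle h_{w_i} + \langle\varphi_{w_{n+i}}\rangle h_{w_{n+i}}\bigr\}\\ &= \sum_{1\le i\le n}\bigl\{q^{-\lambda/2}\langle g\varphi_{w_{n+i}}\rangle\,q^{\lambda/2}h_{w_i} + q^{\lambda/2}\langle g\varphi_{w_i}\rangle\,q^{-\lambda/2}h_{w_{n+i}}\bigr\}\\ &= \sum_{1\le i\le n}\bigl\{\langle g\varphi_{w_{n+i}}\rangle h_{w_i} + \langle g\varphi_{w_i}\rangle h_{w_{n+i}}\bigr\}\\ &= \sum_{1\le i\le n}\bigl\{\langle g\varphi_{w_{n+i}}\rangle\,g h_{w_{n+i}} + \langle g\varphi_{w_i}\rangle\,g h_{w_i}\bigr\}\\ &= r_g\Psi. \end{aligned} \tag{3.8} $$
At this stage, applying the relation
$$ \begin{aligned} r_{(s_n(s_{n-1}s_n)\cdots(s_{k+1}\cdots s_n))\,s_k\cdots s_n}\Psi &= R_{(s_n(s_{n-1}s_n)\cdots(s_{k+1}\cdots s_n))\,s_k\cdots s_{n-1}(\alpha_n)}\,R_{(s_n(s_{n-1}s_n)\cdots(s_{k+1}\cdots s_n))\,s_k\cdots s_{n-2}(\alpha_{n-1})}\\ &\qquad\times\cdots\times R_{(s_n(s_{n-1}s_n)\cdots(s_{k+1}\cdots s_n))\,s_k(\alpha_{k+1})}\,R_{(s_n(s_{n-1}s_n)\cdots(s_{k+1}\cdots s_n))(\alpha_k)}\\ &\qquad\times r_{(s_n(s_{n-1}s_n)\cdots(s_{k+1}\cdots s_{n-1}s_n))}\Psi\\ &= R_{2\epsilon_k}R_{\epsilon_k+\epsilon_{k+1}}\cdots R_{\epsilon_k+\epsilon_{n-1}}R_{\epsilon_k+\epsilon_n}\times r_{(s_n(s_{n-1}s_n)\cdots(s_{k+1}\cdots s_{n-1}s_n))}\Psi \qquad (1\le k\le n) \end{aligned} $$
repeatedly, we finally obtain
$$ r_g\Psi = (R_{2\epsilon_1}R_{\epsilon_1+\epsilon_2}\cdots R_{\epsilon_1+\epsilon_n})(R_{2\epsilon_2}R_{\epsilon_2+\epsilon_3}\cdots R_{\epsilon_2+\epsilon_n})\times\cdots\times(R_{2\epsilon_{n-1}}R_{\epsilon_{n-1}+\epsilon_n})\,R_{2\epsilon_n}\Psi. $$
Therefore, we reach the desired result (3.3).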
4. Proof of Proposition 3.1

To prove Proposition 3.1, we start by considering the action of $s_i\in W$ on the $\varphi_{w_k}$.

Lemma 4.1. (a) If $1\le i\le n-1$, $s_i\varphi_{w_k} = \varphi_{w_k}$ for each $1\le k\le 2n$ such that $k\ne i,\ i+1,\ 2n-i,\ 2n-i+1$.
(b) $s_n\varphi_{w_k} = \varphi_{w_k}$ for each $1\le k\le 2n$ such that $k\ne n,\ n+1$.
(c) $s_0\varphi_{w_k} = \varphi_{w_k}$ for each $1\le k\le 2n$ such that $k\ne 1,\ 2n$.

Proof. These assertions follow from the definition of $s_i$ and $\varphi_{w_k}$.
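Moreover we have

Lemma 4.2. (a) For $1\le i\le n-1$:
$$ \begin{cases} s_i\varphi_{w_{i+1}} = a_{\alpha_i}\varphi_{w_i} + d_{\alpha_i}\varphi_{w_{i+1}},\\ s_i\varphi_{w_i} = b_{\alpha_i}\varphi_{w_i} + c_{\alpha_i}\varphi_{w_{i+1}}, \end{cases} \tag{4.1} $$
and
$$ \begin{cases} s_i\varphi_{w_{2n-i+1}} = a_{\alpha_i}\varphi_{w_{2n-i}} + d_{\alpha_i}\varphi_{w_{2n-i+1}},\\ s_i\varphi_{w_{2n-i}} = b_{\alpha_i}\varphi_{w_{2n-i}} + c_{\alpha_i}\varphi_{w_{2n-i+1}}. \end{cases} \tag{4.2} $$
(b)
$$ \begin{cases} s_n\varphi_{w_{n+1}} = a_{\alpha_n}\varphi_{w_n} + d_{\alpha_n}\varphi_{w_{n+1}},\\ s_n\varphi_{w_n} = b_{\alpha_n}\varphi_{w_n} + c_{\alpha_n}\varphi_{w_{n+1}}. \end{cases} \tag{4.3} $$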
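Proof. By direct calculation or expansion of partial fractions, we find
$$ \frac{1-\frac{y_{i+1}^{-1}}{x}}{\bigl(1-t\frac{y_{i+1}^{-1}}{x}\bigr)\bigl(1-t\frac{y_i^{-1}}{x}\bigr)} = a_{\alpha_i}\,\frac{1}{1-t\frac{y_i^{-1}}{x}} + d_{\alpha_i}\,\frac{1-\frac{y_i^{-1}}{x}}{\bigl(1-t\frac{y_{i+1}^{-1}}{x}\bigr)\bigl(1-t\frac{y_i^{-1}}{x}\bigr)} \tag{4.4} $$
and
$$ \frac{1}{1-t\frac{y_{i+1}^{-1}}{x}} = b_{\alpha_i}\,\frac{1}{1-t\frac{y_i^{-1}}{x}} + c_{\alpha_i}\,\frac{1-\frac{y_i^{-1}}{x}}{\bigl(1-t\frac{y_{i+1}^{-1}}{x}\bigr)\bigl(1-t\frac{y_i^{-1}}{x}\bigr)}. \tag{4.5} $$
Multiplying the factor
$$ \prod_{j=1}^{i-1}\frac{1-\frac{y_j^{-1}}{x}}{1-t\frac{y_j^{-1}}{x}} $$
on both sides of each equality (4.4) or (4.5), we get the desired relations (4.1).

While the change of variables $\epsilon_i\mapsto-\epsilon_{i+1}$ and $\epsilon_{i+1}\mapsto-\epsilon_i$ leaves $\alpha_i$ unchanged, it produces the following from (4.4) and (4.5):
$$ \frac{1-\frac{y_i}{x}}{\bigl(1-t\frac{y_{i+1}}{x}\bigr)\bigl(1-t\frac{y_i}{x}\bigr)} = a_{\alpha_i}\,\frac{1}{1-t\frac{y_{i+1}}{x}} + d_{\alpha_i}\,\frac{1-\frac{y_{i+1}}{x}}{\bigl(1-t\frac{y_{i+1}}{x}\bigr)\bigl(1-t\frac{y_i}{x}\bigr)} \tag{4.6} $$
and
$$ \frac{1}{1-t\frac{y_i}{x}} = b_{\alpha_i}\,\frac{1}{1-t\frac{y_{i+1}}{x}} + c_{\alpha_i}\,\frac{1-\frac{y_{i+1}}{x}}{\bigl(1-t\frac{y_{i+1}}{x}\bigr)\bigl(1-t\frac{y_i}{x}\bigr)}. \tag{4.7} $$
Multiplying the factor
$$ \prod_{j=i+2}^{n}\frac{1-\frac{y_j}{x}}{1-t\frac{y_j}{x}}\ \prod_{j=1}^{n}\frac{1-\frac{y_j^{-1}}{x}}{1-t\frac{y_j^{-1}}{x}} $$
on both sides of equalities (4.6) and (4.7), we obtain the desired relations (4.2).

Similarly, changing $\epsilon_i\mapsto\epsilon_n$ and $\epsilon_{i+1}\mapsto-\epsilon_n$ induces $\alpha_i\mapsto\alpha_n$ and leads from equalities (4.4) and (4.5) to
$$ \frac{1-\frac{y_n}{x}}{\bigl(1-t\frac{y_n^{-1}}{x}\bigr)\bigl(1-t\frac{y_n}{x}\bigr)} = a_{\alpha_n}\,\frac{1}{1-t\frac{y_n^{-1}}{x}} + d_{\alpha_n}\,\frac{1-\frac{y_n^{-1}}{x}}{\bigl(1-t\frac{y_n^{-1}}{x}\bigr)\bigl(1-t\frac{y_n}{x}\bigr)} \tag{4.8} $$
and
$$ \frac{1}{1-t\frac{y_n}{x}} = b_{\alpha_n}\,\frac{1}{1-t\frac{y_n^{-1}}{x}} + c_{\alpha_n}\,\frac{1-\frac{y_n^{-1}}{x}}{\bigl(1-t\frac{y_n^{-1}}{x}\bigr)\bigl(1-t\frac{y_n}{x}\bigr)}. \tag{4.9} $$
Multiplying the factor
$$ \prod_{j=1}^{n-1}\frac{1-\frac{y_j^{-1}}{x}}{1-t\frac{y_j^{-1}}{x}} $$
on both sides of equalities (4.8) and (4.9), we get the desired relations (4.3).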
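The partial-fraction identities (4.4) and (4.5) can be checked mechanically. The following sketch (an illustration only, with function names chosen here) evaluates both sides in exact rational arithmetic at a sample point, using the $m=0$ coefficients $a,b,c,d$ of Sect. 2 with $e^{\alpha_i} = y_i/y_{i+1}$; it also confirms $a+d = b+c = 1$.

```python
from fractions import Fraction as F

def r_coefficients(t, e_alpha):
    """Coefficients a, b, c, d of the R-matrix for m = 0; e_alpha stands for e^alpha."""
    den = 1 - t * e_alpha
    a = (1 - e_alpha) / den
    b = (1 - t) / den
    c = t * (1 - e_alpha) / den
    d = e_alpha * (1 - t) / den
    return a, b, c, d

def check_partial_fractions(t, x, yi, yj):
    """Return (lhs - rhs) for (4.4) and for (4.5), with y_i = yi and y_{i+1} = yj."""
    u = 1 / (yi * x)                       # y_i^{-1} / x
    v = 1 / (yj * x)                       # y_{i+1}^{-1} / x
    a, b, c, d = r_coefficients(t, yi / yj)  # e^{alpha_i} = y_i / y_{i+1}
    common = (1 - t * v) * (1 - t * u)
    lhs44 = (1 - v) / common
    rhs44 = a / (1 - t * u) + d * (1 - u) / common
    lhs45 = 1 / (1 - t * v)
    rhs45 = b / (1 - t * u) + c * (1 - u) / common
    return lhs44 - rhs44, lhs45 - rhs45

if __name__ == "__main__":
    # Sample point chosen arbitrarily; both differences vanish exactly.
    print(check_partial_fractions(F(1, 3), F(5), F(2), F(7)))
```

Exact fractions are used rather than floating point so that a vanishing difference is a genuine identity check at the chosen point, not a rounding coincidence.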
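In contrast to the action of $s_i$ for $1\le i\le n$, the action of $s_0$ is understood as an action on the q-de Rham cohomology, not on the rational functions.

Lemma 4.3.
$$ \begin{cases} q^\lambda\langle s_0\varphi_{w_1}\rangle = a_{\delta-\theta}\langle\varphi_{w_{2n}}\rangle + q^\lambda d_{\delta-\theta}\langle\varphi_{w_1}\rangle,\\ q^{-\lambda}\langle s_0\varphi_{w_{2n}}\rangle = q^{-\lambda}b_{\delta-\theta}\langle\varphi_{w_{2n}}\rangle + c_{\delta-\theta}\langle\varphi_{w_1}\rangle. \end{cases} \tag{4.10} $$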
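Proof. Make the change of variables $\epsilon_i\mapsto-\epsilon_1$ and $\epsilon_{i+1}\mapsto\epsilon_1-\delta$ (i.e. $y_i^{-1}\mapsto y_1$, $y_{i+1}^{-1}\mapsto qy_1^{-1}$) in (4.4) and (4.5). Then we have
$$ \frac{1-q\frac{y_1^{-1}}{x}}{\bigl(1-tq\frac{y_1^{-1}}{x}\bigr)\bigl(1-t\frac{y_1}{x}\bigr)} = a_{\delta-\theta}\,\frac{1}{1-t\frac{y_1}{x}} + d_{\delta-\theta}\,\frac{1-\frac{y_1}{x}}{\bigl(1-t\frac{y_1}{x}\bigr)\bigl(1-tq\frac{y_1^{-1}}{x}\bigr)} \tag{4.11} $$
and
$$ \frac{1}{1-tq\frac{y_1^{-1}}{x}} = b_{\delta-\theta}\,\frac{1}{1-t\frac{y_1}{x}} + c_{\delta-\theta}\,\frac{1-\frac{y_1}{x}}{\bigl(1-t\frac{y_1}{x}\bigr)\bigl(1-tq\frac{y_1^{-1}}{x}\bigr)}. \tag{4.12} $$
Integration after multiplying the factor
$$ \prod_{j=2}^{n}\frac{1-\frac{y_j}{x}}{1-t\frac{y_j}{x}}\ \prod_{j=1}^{n}\frac{1-\frac{y_j^{-1}}{x}}{1-t\frac{y_j^{-1}}{x}}\ \Phi $$
on both sides of equalities (4.11) and (4.12) gives the following:
$$ \begin{aligned} &\int_C x^\lambda\, \frac{\prod_{k=1}^{n}(qty_k/x)_\infty\ (q^2ty_1^{-1}/x)_\infty\ \prod_{k=2}^{n}(qty_k^{-1}/x)_\infty}{(y_1/x)_\infty\ \prod_{k=2}^{n}(qy_k/x)_\infty\ (q^2y_1^{-1}/x)_\infty\ \prod_{k=2}^{n}(qy_k^{-1}/x)_\infty}\,\frac{dx}{x}\\ &\quad= a_{\delta-\theta}\int_C x^\lambda\, \frac{\prod_{k=1}^{n}(qty_k/x)_\infty\ \prod_{k=1}^{n}(qty_k^{-1}/x)_\infty}{(y_1/x)_\infty\ \prod_{k=2}^{n}(qy_k/x)_\infty\ \prod_{k=1}^{n}(qy_k^{-1}/x)_\infty}\,\frac{dx}{x}\\ &\qquad+ d_{\delta-\theta}\int_C x^\lambda\, \frac{\prod_{k=1}^{n}(qty_k/x)_\infty\ (q^2ty_1^{-1}/x)_\infty\ \prod_{k=2}^{n}(qty_k^{-1}/x)_\infty}{\prod_{k=1}^{n}(qy_k/x)_\infty\ \prod_{k=1}^{n}(qy_k^{-1}/x)_\infty}\,\frac{dx}{x} \end{aligned} \tag{4.13} $$
and
$$ \begin{aligned} &\int_C x^\lambda\, \frac{(ty_1/x)_\infty\ \prod_{k=2}^{n}(qty_k/x)_\infty\ (q^2ty_1^{-1}/x)_\infty\ \prod_{k=2}^{n}(qty_k^{-1}/x)_\infty}{(y_1/x)_\infty\ \prod_{k=2}^{n}(qy_k/x)_\infty\ \prod_{k=1}^{n}(qy_k^{-1}/x)_\infty}\,\frac{dx}{x}\\ &\quad= b_{\delta-\theta}\int_C x^\lambda\, \frac{\prod_{k=1}^{n}(qty_k/x)_\infty\ \prod_{k=1}^{n}(qty_k^{-1}/x)_\infty}{(y_1/x)_\infty\ \prod_{k=2}^{n}(qy_k/x)_\infty\ \prod_{k=1}^{n}(qy_k^{-1}/x)_\infty}\,\frac{dx}{x}\\ &\qquad+ c_{\delta-\theta}\int_C x^\lambda\, \frac{\prod_{k=1}^{n}(qty_k/x)_\infty\ (q^2ty_1^{-1}/x)_\infty\ \prod_{k=2}^{n}(qty_k^{-1}/x)_\infty}{\prod_{k=1}^{n}(qy_k/x)_\infty\ \prod_{k=1}^{n}(qy_k^{-1}/x)_\infty}\,\frac{dx}{x}. \end{aligned} \tag{4.14} $$
Here, changing the integration variable such that $x\mapsto qx$, we have
$$ \begin{aligned} &\int_C x^\lambda\, \frac{\prod_{k=1}^{n}(qty_k/x)_\infty\ (q^2ty_1^{-1}/x)_\infty\ \prod_{k=2}^{n}(qty_k^{-1}/x)_\infty}{(y_1/x)_\infty\ \prod_{k=2}^{n}(qy_k/x)_\infty\ (q^2y_1^{-1}/x)_\infty\ \prod_{k=2}^{n}(qy_k^{-1}/x)_\infty}\,\frac{dx}{x}\\ &\quad= q^\lambda\int_C x^\lambda\, \frac{\prod_{k=1}^{n}(ty_k/x)_\infty\ (qty_1^{-1}/x)_\infty\ \prod_{k=2}^{n}(ty_k^{-1}/x)_\infty}{(q^{-1}y_1/x)_\infty\ \prod_{k=2}^{n}(y_k/x)_\infty\ (qy_1^{-1}/x)_\infty\ \prod_{k=2}^{n}(y_k^{-1}/x)_\infty}\,\frac{dx}{x}\\ &\quad= q^\lambda\langle s_0\varphi_{w_1}\rangle \end{aligned} $$
and
$$ \begin{aligned} &\int_C x^\lambda\, \frac{\prod_{k=1}^{n}(qty_k/x)_\infty\ (q^2ty_1^{-1}/x)_\infty\ \prod_{k=2}^{n}(qty_k^{-1}/x)_\infty}{\prod_{k=1}^{n}(qy_k/x)_\infty\ \prod_{k=1}^{n}(qy_k^{-1}/x)_\infty}\,\frac{dx}{x}\\ &\quad= q^\lambda\int_C x^\lambda\, \frac{\prod_{k=1}^{n}(ty_k/x)_\infty\ (qty_1^{-1}/x)_\infty\ \prod_{k=2}^{n}(ty_k^{-1}/x)_\infty}{\prod_{k=1}^{n}(y_k/x)_\infty\ \prod_{k=1}^{n}(y_k^{-1}/x)_\infty}\,\frac{dx}{x}\\ &\quad= q^\lambda\langle\varphi_{w_1}\rangle. \end{aligned} $$
Therefore, it is seen that (4.13) and (4.14) are equivalent to the desired relations (4.10).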
Next, we consider the action of $W$ on the $h_{w_k}$.

Lemma 4.4. (a) If $1\le i\le n-1$, $h_{s_iw_k} = h_{w_k}$ for $k\ne i,\ i+1,\ 2n-i,\ 2n-i+1$.
(b) $h_{s_nw_k} = h_{w_k}$ for $k\ne n,\ n+1$.
(c) $h_{s_\theta w_k} = h_{w_k}$ for $k\ne 1,\ 2n$.

Proof. In the case that $1\le i\le n-1$, we have $s_iw_k = w_ks_i$ for $1\le k\le i-1$ or $2n-i+2\le k\le 2n$, and $s_iw_k = w_ks_{i+1}$ for $i+2\le k\le 2n-i-1$. These lead to the desired equalities in (a). In the same way, the relations $s_nw_k = w_ks_n$ $(k\ne n,\ n+1)$ and $s_\theta w_k = w_k(s_2\cdots s_{n-1})(s_n\cdots s_2)$ $(k\ne 1,\ 2n)$ lead to the relations in (b) and (c).

Next we consider the action of $R_{\alpha_i}$ on the $h_{w_k}$:

Lemma 4.5. (a) If $1\le i\le n-1$, $R_{\alpha_i}h_{w_k} = h_{w_k}$ for each $1\le k\le 2n$ such that $k\ne i,\ i+1,\ 2n-i,\ 2n-i+1$.
(b) $R_{\alpha_n}h_{w_k} = h_{w_k}$ for each $1\le k\le 2n$ such that $k\ne n,\ n+1$.
(c) $R_{\delta-\theta}h_{w_k} = h_{w_k}$ for each $2\le k\le 2n-1$.

Proof. Since $w_k^{-1}\alpha_i = \alpha_i>0$ for $1\le k\le i-1$ (then $i\ge 2$), we have
$$ R_{\alpha_i}h_{w_k} = a_{\alpha_i}h_{w_k} + b_{\alpha_i}h_{s_iw_k}, \qquad R_{\alpha_i}h_{s_iw_k} = c_{\alpha_i}h_{s_iw_k} + d_{\alpha_i}h_{w_k}. $$
These imply $R_{\alpha_i}(h_{w_k}+h_{s_iw_k}) = h_{w_k}+h_{s_iw_k}$, following from the relations $a_{\alpha_i}+d_{\alpha_i} = b_{\alpha_i}+c_{\alpha_i} = 1$. Hence, noting $s_iw_k = w_ks_i$, we obtain $R_{\alpha_i}h_{w_k} = h_{w_k}$. Other cases are similarly derived.

Lemma 4.6. (a) For $1\le i\le n-1$,
$$ \begin{cases} R_{\alpha_i}h_{w_i} = a_{\alpha_i}h_{w_i} + b_{\alpha_i}h_{w_{i+1}},\\ R_{\alpha_i}h_{w_{i+1}} = c_{\alpha_i}h_{w_{i+1}} + d_{\alpha_i}h_{w_i}, \end{cases} \qquad \begin{cases} R_{\alpha_i}h_{w_{2n-i}} = a_{\alpha_i}h_{w_{2n-i}} + b_{\alpha_i}h_{w_{2n-i+1}},\\ R_{\alpha_i}h_{w_{2n-i+1}} = c_{\alpha_i}h_{w_{2n-i+1}} + d_{\alpha_i}h_{w_{2n-i}}. \end{cases} $$
(b)
$$ \begin{cases} R_{\alpha_n}h_{w_n} = a_{\alpha_n}h_{w_n} + b_{\alpha_n}h_{w_{n+1}},\\ R_{\alpha_n}h_{w_{n+1}} = c_{\alpha_n}h_{w_{n+1}} + d_{\alpha_n}h_{w_n}. \end{cases} $$
(c)
$$ \begin{cases} R_{\delta-\theta}h_{w_{2n}} = a_{\delta-\theta}h_{w_{2n}} + q^{-\lambda}b_{\delta-\theta}h_{w_1},\\ R_{\delta-\theta}h_{w_1} = c_{\delta-\theta}h_{w_1} + q^{\lambda}d_{\delta-\theta}h_{w_{2n}}. \end{cases} $$

Proof. This follows almost immediately from the definitions.
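At this stage, by combination of the above lemmas, we obtain the following. In the case of $1\le i\le n-1$, write $K_i = \{i,\ i+1,\ 2n-i,\ 2n-i+1\}$. Then we have
$$ \begin{aligned} r_{s_i}\Psi &= \sum_{1\le k\le 2n}\langle s_i\varphi_{w_k}\rangle h_{s_iw_k} = \Bigl\{\sum_{k\notin K_i} + \sum_{k\in K_i}\Bigr\}\langle s_i\varphi_{w_k}\rangle h_{s_iw_k}\\ &= \sum_{k\notin K_i}\langle\varphi_{w_k}\rangle h_{w_k} + \{b_{\alpha_i}\langle\varphi_{w_i}\rangle + c_{\alpha_i}\langle\varphi_{w_{i+1}}\rangle\}h_{s_iw_i} + \{a_{\alpha_i}\langle\varphi_{w_i}\rangle + d_{\alpha_i}\langle\varphi_{w_{i+1}}\rangle\}h_{s_iw_{i+1}}\\ &\quad+ \{b_{\alpha_i}\langle\varphi_{w_{2n-i}}\rangle + c_{\alpha_i}\langle\varphi_{w_{2n-i+1}}\rangle\}h_{s_iw_{2n-i}} + \{a_{\alpha_i}\langle\varphi_{w_{2n-i}}\rangle + d_{\alpha_i}\langle\varphi_{w_{2n-i+1}}\rangle\}h_{s_iw_{2n-i+1}}\\ &= \sum_{k\notin K_i}\langle\varphi_{w_k}\rangle h_{w_k} + \langle\varphi_{w_i}\rangle\{b_{\alpha_i}h_{w_{i+1}} + a_{\alpha_i}h_{w_i}\} + \langle\varphi_{w_{i+1}}\rangle\{c_{\alpha_i}h_{w_{i+1}} + d_{\alpha_i}h_{w_i}\}\\ &\quad+ \langle\varphi_{w_{2n-i}}\rangle\{b_{\alpha_i}h_{w_{2n-i+1}} + a_{\alpha_i}h_{w_{2n-i}}\} + \langle\varphi_{w_{2n-i+1}}\rangle\{c_{\alpha_i}h_{w_{2n-i+1}} + d_{\alpha_i}h_{w_{2n-i}}\}\\ &= R_{\alpha_i}\Psi. \end{aligned} $$
Similarly, in the case $i = n$, we have
$$ \begin{aligned} r_{s_n}\Psi &= \Bigl\{\sum_{k\ne n,\,n+1} + \sum_{k=n,\,n+1}\Bigr\}\langle s_n\varphi_{w_k}\rangle h_{s_nw_k}\\ &= \sum_{k\ne n,\,n+1}\langle\varphi_{w_k}\rangle h_{w_k} + \{b_{\alpha_n}\langle\varphi_{w_n}\rangle + c_{\alpha_n}\langle\varphi_{w_{n+1}}\rangle\}h_{s_nw_n} + \{a_{\alpha_n}\langle\varphi_{w_n}\rangle + d_{\alpha_n}\langle\varphi_{w_{n+1}}\rangle\}h_{s_nw_{n+1}}\\ &= \sum_{k\ne n,\,n+1}\langle\varphi_{w_k}\rangle h_{w_k} + \langle\varphi_{w_n}\rangle\{b_{\alpha_n}h_{w_{n+1}} + a_{\alpha_n}h_{w_n}\} + \langle\varphi_{w_{n+1}}\rangle\{c_{\alpha_n}h_{w_{n+1}} + d_{\alpha_n}h_{w_n}\}\\ &= R_{\alpha_n}\Psi. \end{aligned} $$
Finally, if $i = 0$, by noting that
$$ s_0h_{w_1} = q^\lambda h_{s_\theta w_1} \quad\text{and}\quad s_0h_{w_{2n}} = q^{-\lambda}h_{s_\theta w_{2n}}, $$
we have
$$ \begin{aligned} r_{s_0}\Psi &= \sum_{k\ne 1,\,2n}\langle\varphi_{w_k}\rangle h_{s_\theta w_k} + \langle s_0\varphi_{w_1}\rangle q^\lambda h_{s_\theta w_1} + \langle s_0\varphi_{w_{2n}}\rangle q^{-\lambda}h_{s_\theta w_{2n}}\\ &= \sum_{k\ne 1,\,2n}\langle\varphi_{w_k}\rangle h_{w_k} + \langle s_0\varphi_{w_1}\rangle q^\lambda h_{w_{2n}} + \langle s_0\varphi_{w_{2n}}\rangle q^{-\lambda}h_{w_1}\\ &= \sum_{k\ne 1,\,2n}\langle\varphi_{w_k}\rangle h_{w_k} + \{c_{\delta-\theta}\langle\varphi_{w_1}\rangle + q^{-\lambda}b_{\delta-\theta}\langle\varphi_{w_{2n}}\rangle\}h_{w_1} + \{q^\lambda d_{\delta-\theta}\langle\varphi_{w_1}\rangle + a_{\delta-\theta}\langle\varphi_{w_{2n}}\rangle\}h_{w_{2n}}\\ &= \sum_{k\ne 1,\,2n}\langle\varphi_{w_k}\rangle h_{w_k} + \langle\varphi_{w_1}\rangle\{c_{\delta-\theta}h_{w_1} + q^\lambda d_{\delta-\theta}h_{w_{2n}}\} + \langle\varphi_{w_{2n}}\rangle\{a_{\delta-\theta}h_{w_{2n}} + q^{-\lambda}b_{\delta-\theta}h_{w_1}\}\\ &= R_{\delta-\theta}\Psi. \end{aligned} $$
This completes the proof of Proposition 3.1.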
5. Macdonald Polynomials

Macdonald introduced the q-difference operators [10] to define his orthogonal polynomials associated with root systems. In the case of a root system of type $C_n$, the q-difference operator to define such a polynomial is given by
$$ E = \sum_{a_1,\ldots,a_n=\pm 1}\ \prod_{1\le i<j\le n}\frac{1-ty_i^{a_i}y_j^{a_j}}{1-y_i^{a_i}y_j^{a_j}}\ \prod_{1\le i\le n}\frac{1-ty_i^{2a_i}}{1-y_i^{2a_i}}\,T_{y_i}^{\frac12 a_i}, $$
where $(T_{y_i}f)(y_1,\ldots,y_n) = f(y_1,\ldots,qy_i,\ldots,y_n)$. Its eigenvalue is known to be
$$ c_\mu = \sum_{a_1,\ldots,a_n=\pm 1}\ \prod_{j=1}^{n}q^{\frac12\lambda_ja_j}\,t^{\frac12(n-j+1)a_j} = q^{-\frac12(\lambda_1+\cdots+\lambda_n)}\,t^{-\frac14 n(n+1)}\prod_{j=1}^{n}\bigl(1+t^jq^{\lambda_{n-j+1}}\bigr) $$
with the parameter $\mu = (\lambda_1,\ldots,\lambda_n)$. (We consider only the special case corresponding to the condition $t_1 = t_2 = t$.) As for the eigenfunction of the operator $E$, we easily find the following:

Corollary 5.1. The sum
$$ \sum_{i=1}^{2n}t^{i-1}\langle\varphi_{w_i}\rangle \tag{5.1} $$
is a solution of the equation attached to the parameter $(\lambda,0,\ldots,0)$:
$$ E\psi = c_{(\lambda,0,\ldots,0)}\psi. \tag{5.2} $$

Proof. This is proven by applying the result of Kato (Theorem 4.6 in [7]) to our Theorem 3.2.
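The factorization of the eigenvalue $c_\mu$ from a sum over signs into a product is elementary: each factor $q^{\lambda_j/2}t^{(n-j+1)/2} + q^{-\lambda_j/2}t^{-(n-j+1)/2}$ equals $q^{-\lambda_j/2}t^{-(n-j+1)/2}(1+q^{\lambda_j}t^{n-j+1})$, which produces the overall power $t^{-n(n+1)/4}$ alongside $q^{-(\lambda_1+\cdots+\lambda_n)/2}$. The sketch below is an illustration only; it checks this numerically in exact arithmetic for integer $\lambda_j$, taking $q^{1/2}$ and $t^{1/2}$ as rational inputs (an assumption made to keep the computation exact).

```python
from fractions import Fraction as F
from itertools import product

def eigenvalue_sum(q_half, t_half, lam):
    """Sum over a_j = +-1 of prod_j q^{lam_j a_j / 2} t^{(n-j+1) a_j / 2}."""
    n = len(lam)
    total = F(0)
    for signs in product((1, -1), repeat=n):
        term = F(1)
        for j, a in enumerate(signs, start=1):
            term *= q_half ** (lam[j - 1] * a) * t_half ** ((n - j + 1) * a)
        total += term
    return total

def eigenvalue_product(q_half, t_half, lam):
    """q^{-|lam|/2} t^{-n(n+1)/4} prod_j (1 + t^j q^{lam_{n-j+1}})."""
    n = len(lam)
    q, t = q_half ** 2, t_half ** 2
    prefactor = q_half ** (-sum(lam)) * t_half ** (-(n * (n + 1) // 2))
    prod_ = F(1)
    for j in range(1, n + 1):
        prod_ *= 1 + t ** j * q ** lam[n - j]   # lam[n - j] is lambda_{n-j+1}
    return prefactor * prod_

if __name__ == "__main__":
    args = (F(1, 2), F(2, 3), (3, 1, 0))
    print(eigenvalue_sum(*args) == eigenvalue_product(*args))
```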
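We next proceed to simplify the sum (5.1). We note the equality
$$ t^{2n}\prod_{j=1}^{n}\frac{\bigl(1-\frac{y_j}{x}\bigr)\bigl(1-\frac{y_j^{-1}}{x}\bigr)}{\bigl(1-t\frac{y_j}{x}\bigr)\bigl(1-t\frac{y_j^{-1}}{x}\bigr)} = 1 + (t-1)\sum_{i=1}^{2n}t^{i-1}\varphi_{w_i}, \tag{5.3} $$
which is demonstrated by using the partial fractions. On the other hand, we have
$$ \Bigl\langle\,\prod_{j=1}^{n}\frac{\bigl(1-\frac{y_j}{x}\bigr)\bigl(1-\frac{y_j^{-1}}{x}\bigr)}{\bigl(1-t\frac{y_j}{x}\bigr)\bigl(1-t\frac{y_j^{-1}}{x}\bigr)}\Bigr\rangle = q^\lambda\langle 1\rangle = q^\lambda\int_C\Phi, \tag{5.4} $$
which is demonstrated by changing the integration variable such that $x\mapsto qx$. Hence, combination of (5.3) and (5.4) gives the relation
$$ \sum_{i=1}^{2n}t^{i-1}\langle\varphi_{w_i}\rangle = \frac{1-q^\lambda t^{2n}}{1-t}\int_C\Phi. $$
Therefore we reach

Proposition 5.2. The function $\int_C\Phi$ is a solution to Eq. (5.2).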
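The equality (5.3) is a telescoping sum. If one lists the $2n$ variables in the order $z = (y_1^{-1}/x,\ldots,y_n^{-1}/x,\ y_n/x,\ldots,y_1/x)$, the definition of the rational functions in Sect. 3 reads $\varphi_{w_i} = \prod_{\mu<i}(1-z_\mu)\big/\prod_{\mu\le i}(1-tz_\mu)$, and with $P_i = t^i\prod_{\mu\le i}(1-z_\mu)/(1-tz_\mu)$ one has $P_i - P_{i-1} = (t-1)t^{i-1}\varphi_{w_i}$. The following sketch (illustrative only; the ordering $z$ and the function names are conventions introduced here) checks (5.3) in exact arithmetic.

```python
from fractions import Fraction as F

def phi(i, z, t):
    """phi_{w_i} (1 <= i <= 2n) written in the ordered variables z_1, ..., z_{2n}."""
    num = F(1)
    den = F(1)
    for mu in range(i - 1):        # numerator: product over mu < i
        num *= 1 - z[mu]
    for mu in range(i):            # denominator: product over mu <= i
        den *= 1 - t * z[mu]
    return num / den

def check_5_3(y, x, t):
    """Return lhs - rhs of (5.3) at the point (y_1..y_n, x, t)."""
    n = len(y)
    z = [1 / (yk * x) for yk in y] + [yk / x for yk in reversed(y)]
    lhs = t ** (2 * n)
    for zk in z:
        lhs *= (1 - zk) / (1 - t * zk)
    rhs = 1 + (t - 1) * sum(t ** (i - 1) * phi(i, z, t) for i in range(1, 2 * n + 1))
    return lhs - rhs

if __name__ == "__main__":
    print(check_5_3([F(2), F(3)], F(7), F(1, 5)))   # exact difference of both sides
```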
It should be remarked that this is valid for an arbitrary cycle $C$ and that linearly independent solutions are obtained by choosing several cycles. This situation is similar to that studied in [15].

In the case that the parameter $\mu$ is from the set of partitions, the eigenfunction of the form
$$ P_\mu(y|q,t) = m_\mu + \sum_{\nu<\mu}a_{\mu\nu}\,m_\nu $$
is the Macdonald polynomial for the root system $C_n$. Here $m_\mu = \sum_{\nu\in W\mu}e^\nu$, and $\nu<\mu$ is defined to be $\mu-\nu\in Q^+$ with $Q^+$ the positive cone of the root lattice. In our case, to get the Macdonald polynomial, it is enough to consider the case that $\lambda$ is a positive integer and take the cycle, with the counterclockwise direction, which encircles the sequences of poles $y_i,\ y_iq,\ y_iq^2,\ldots$ for $1\le i\le n$ and $y_i^{-1},\ y_i^{-1}q,\ y_i^{-1}q^2,\ldots$ for $1\le i\le n$. This is an integral representation of the Macdonald polynomial $P_{(\lambda,0,\ldots,0)}(y|q,t)$. Moreover, applying the q-binomial theorem
$$ \sum_{m\ge 0}\frac{(a)_m}{(q)_m}z^m = \frac{(az)_\infty}{(z)_\infty} \quad (|z|<1), \qquad (a)_m = \prod_{0\le k\le m-1}(1-aq^k), $$
and the residue calculus to our integral, we obtain an exact expression of the Macdonald polynomial for the root system $C_n$.
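For readers who wish to experiment, the q-binomial theorem quoted above is easy to test numerically. The sketch below is only a floating-point illustration under the assumption $0<q<1$ and $|z|<1$; the infinite products and the series are truncated after a fixed number of terms, which is a choice made here and not something taken from the text.

```python
def q_pochhammer(a, q, m):
    """(a)_m = prod_{k=0}^{m-1} (1 - a q^k); large m approximates (a)_infinity."""
    out = 1.0
    for k in range(m):
        out *= 1 - a * q ** k
    return out

def q_binomial_sides(a, z, q, terms=200):
    """Return truncated (series, product) sides of the q-binomial theorem."""
    series = sum(
        q_pochhammer(a, q, m) / q_pochhammer(q, q, m) * z ** m
        for m in range(terms)
    )
    product_side = q_pochhammer(a * z, q, terms) / q_pochhammer(z, q, terms)
    return series, product_side

if __name__ == "__main__":
    lhs, rhs = q_binomial_sides(a=0.7, z=0.3, q=0.5)
    print(abs(lhs - rhs))   # small, up to truncation and rounding error
```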
Theorem 5.3.
$$ P_{(\lambda,0,\ldots,0)}(y|q,t) = \frac{(q)_\lambda}{(t)_\lambda}\sum_{\substack{i_1+\cdots+i_{2n}=\lambda\\ i_1,\ldots,i_{2n}\ge 0}}\frac{(t)_{i_1}\cdots(t)_{i_{2n}}}{(q)_{i_1}\cdots(q)_{i_{2n}}}\,y_1^{i_1-i_{2n}}\,y_2^{i_2-i_{2n-1}}\cdots y_n^{i_n-i_{n+1}}. $$

Remark. We also have a direct way to obtain the integral representation of the eigenfunction for (5.2). This will appear in a future paper. For the related work, we also refer the reader to [16].
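The explicit sum of Theorem 5.3 is straightforward to evaluate at a numerical point. The sketch below (an illustration with names chosen here, not a reference implementation) enumerates the compositions $i_1+\cdots+i_{2n}=\lambda$ by brute force and uses exact fractions. For $n=1$ it reproduces $y+y^{-1}$ when $\lambda=1$, and $y^2+y^{-2}+\frac{(1+q)(1-t)}{1-tq}$ when $\lambda=2$, as a direct expansion of the formula shows.

```python
from fractions import Fraction as F
from itertools import product

def qpoch(a, q, m):
    """Finite q-shifted factorial (a)_m = prod_{k=0}^{m-1} (1 - a q^k)."""
    out = F(1)
    for k in range(m):
        out *= 1 - a * q ** k
    return out

def macdonald_one_row(lam, y, q, t):
    """Evaluate the right-hand side of Theorem 5.3 at the point y = (y_1..y_n)."""
    n = len(y)
    total = F(0)
    # Brute-force enumeration of (i_1, ..., i_{2n}) with sum lam; fine for small cases.
    for idx in product(range(lam + 1), repeat=2 * n):
        if sum(idx) != lam:
            continue
        coeff = F(1)
        for i in idx:
            coeff *= qpoch(t, q, i) / qpoch(q, q, i)
        mono = F(1)
        for k in range(n):
            # exponent of y_{k+1} is i_{k+1} - i_{2n-k}
            mono *= y[k] ** (idx[k] - idx[2 * n - 1 - k])
        total += coeff * mono
    return qpoch(q, q, lam) / qpoch(t, q, lam) * total

if __name__ == "__main__":
    q, t, y = F(1, 2), F(1, 3), F(5)
    print(macdonald_one_row(2, [y], q, t))
```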
6. Final Comment

We finally make a comment on the meaning of our elements $\varphi_{w_i}$ from the viewpoint of the Hecke algebra. Set
$$ T_i = t + \frac{1-te^{\alpha_i}}{1-e^{\alpha_i}}\,(s_i-1), \quad\text{for } 1\le i\le n, $$
where $\alpha_i$ is a simple root and $s_i$ the corresponding generator of the Weyl group $W$. This is the Lusztig operator associated with the root system $C_n$ (in the special case $t_1 = t_2 = t$), which satisfies the following:
$$ \begin{aligned} (T_i-t)(T_i+1) &= 0 && (1\le i\le n),\\ T_iT_{i+1}T_i &= T_{i+1}T_iT_{i+1} && (1\le i\le n-2),\\ T_{n-1}T_nT_{n-1}T_n &= T_nT_{n-1}T_nT_{n-1},\\ T_iT_j &= T_jT_i && (|i-j|\ge 2). \end{aligned} $$
These are the fundamental relations for the Hecke algebra $H(W)$ associated with the root system of type $C_n$. The action of the Lusztig operator on our $\varphi_{w_i}$ is given as follows.

Proposition 6.1. For $1\le k\le n$:
$$ \begin{cases} T_i\varphi_{w_k} = t\varphi_{w_k}, & i\ne k-1,\ k,\\ T_{k-1}\varphi_{w_k} = (t-1)\varphi_{w_k} + \varphi_{w_{k-1}},\\ T_k\varphi_{w_k} = t\varphi_{w_{k+1}}, \end{cases} \qquad \begin{cases} T_i\varphi_{w_{n+k}} = t\varphi_{w_{n+k}}, & i\ne n-k,\ n-k+1,\\ T_{n-k+1}\varphi_{w_{n+k}} = (t-1)\varphi_{w_{n+k}} + \varphi_{w_{n+k-1}},\\ T_{n-k}\varphi_{w_{n+k}} = t\varphi_{w_{n+k+1}}. \end{cases} $$

This shows that the vector space $\bigoplus_{i=1}^{2n}\mathbb{C}\varphi_{w_i}$ gives a representation of the Hecke algebra $H(W)$ of type $C_n$. Moreover, we can also obtain a representation of the affine Hecke algebra in the space of the q-de Rham cohomology. See [16] for the $A_{n-1}$ case. In any case, we expect that such a basis attached to the action of the Hecke algebras could be generalized to the case of higher representations. This is a problem for future work.

Acknowledgement. The author wishes to thank Professor Shin-ichi Kato for a valuable suggestion.
References

1. Aomoto, K., Kato, Y., Mimachi, K.: A solution of the Yang–Baxter equation as connection coefficients of a holonomic q-difference system. Internat. Math. Res. Notices 1992, No. 1, 7–15
2. Cherednik, I.: Quantum Knizhnik–Zamolodchikov equations and affine root systems. Commun. Math. Phys. 150, 109–136 (1992)
3. Cherednik, I.: Double affine Hecke algebras, Knizhnik–Zamolodchikov equations, and Macdonald's operators. Internat. Math. Res. Notices 1992, No. 9, 171–180
4. Cherednik, I.: Double affine Hecke algebras, and Macdonald's conjectures. Ann. of Math. 141, 191–216 (1995)
5. Cherednik, I.: Induced representations of double affine Hecke algebras and applications. Math. Res. Lett. 1, 319–337 (1994)
6. Debiard, A. and Gaveau, B.: Integral formulas for the spherical polynomials of a root system of type $BC_2$. J. Funct. Anal. 119, 401–454 (1994)
7. Kato, S.: R-matrix arising from affine Hecke algebras and its application to Macdonald's difference operators. Commun. Math. Phys. 165, 533–553 (1994)
8. Koornwinder, T. H.: Askey-Wilson polynomials for root systems of type BC. In: Hypergeometric functions on domains of positivity, Jack polynomials and applications, D. St. P. Richards (ed.), Contemp. Math. 138, Providence, RI: Amer. Math. Soc., 1992, pp. 189–204
9. Lusztig, G.: Affine Hecke algebras and their graded version. J. Amer. Math. Soc. 2, 599–635 (1989)
10. Macdonald, I.G.: A new class of symmetric functions. In: Actes Séminaire Lotharingen, Publ. Inst. Rech. Math. Adv., Strasbourg, 1988, pp. 131–171
11. Macdonald, I.G.: Affine Hecke algebras and orthogonal polynomials. Séminaire BOURBAKI, 47ème année, 1994–95, n° 797
12. Macdonald, I.G.: Symmetric Functions and Hall Polynomials (Second Edition), Oxford Mathematical Monographs, Oxford: Clarendon Press, 1995
13. Mimachi, K.: Connection problem in holonomic q-difference system associated with a Jackson integral of Jordan–Pochhammer type. Nagoya Math. J. 116, 149–161 (1989)
14. Mimachi, K.: A solution to quantum Knizhnik–Zamolodchikov equations and its application to eigenvalue problems of the Macdonald type. Duke Math. J. 85, 635–658 (1996)
15. Mimachi, K.: Rational solutions to eigenvalue problems of the Macdonald type. In preparation
16. Mimachi, K. and Noumi, M.: An integral representation of eigenfunctions for Macdonald's q-difference operators. Tôhoku Math. J. 49, 517–525 (1997)
17. Mimachi, K. and Noumi, M.: Representations of the Hecke algebra on a family of rational functions. Preprint 1997

Communicated by T. Miwa
Commun. Math. Phys. 197, 247 – 276 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
On Ruelle's Probability Cascades and an Abstract Cavity Method

E. Bolthausen (1), A.-S. Sznitman (2)

(1) Angewandte Mathematik, Universität Zürich, Winterthurerstr. 190, CH-8057 Zürich, Switzerland
(2) Departement Mathematik, ETH-Zürich, CH-8092 Zürich, Switzerland
Received: 6 October 1997 / Accepted: 2 March 1998
Abstract: We construct in this work a Markov process which describes a clustering mechanism through which equivalence classes on $\mathbb{N}$ are progressively lumped together. This clustering process gives a new description of Ruelle's continuous probability cascades. It also enables us to introduce an abstract cavity method, which mimics certain features of the cavity method developed by physicists in the context of the Sherrington–Kirkpatrick model.

0. Introduction

We construct in this article a continuous time Markov process, $(\Gamma_u)_{u\ge 0}$, with state space the set $E$ of equivalence relations on $\mathbb{N}$. We call it the "clustering process"; it describes an evolution in which $\Gamma_0$-equivalence classes are "lumped together" to form at a later time $u$ the collection of $\Gamma_u$-equivalence classes. The trace of the clustering process on the set $E_I$ of equivalence relations of an arbitrary finite subset $I$ of $\mathbb{N}$ is a pure jump process with generator:
$$ (L_If)(\Gamma) = \sum_{\Gamma'}a_{\Gamma,\Gamma'}\,f(\Gamma') - (N-1)f(\Gamma), \quad\text{for } \Gamma\in E_I, \tag{0.1} $$
where $a_{\Gamma,\Gamma'}$ is 0 unless $\Gamma'$ is obtained by collapsing $k\ge 2$ of the $N$ distinct equivalence classes of $\Gamma$ into a single class, in which case $a_{\Gamma,\Gamma'} = 1\big/\bigl[(N-1)\binom{N-2}{k-2}\bigr]$. The precise mechanism of clustering is described in Sect. 1 below. This process is instrumental for the abstract cavity method we develop in this work. It offers a concrete representation of the "continuous probability cascades" constructed by Ruelle in [10]. It also provides an example of a coalescent Markov process, in the spirit of Kingman [1, 2]. For further developments around the clustering process, see also Pitman [9].
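A quick consistency check on the generator (0.1): there are $\binom{N}{k}$ ways to choose which $k$ of the $N$ classes collapse, so the total jump rate out of $\Gamma$ is $\sum_{k=2}^{N}\binom{N}{k}\big/\bigl[(N-1)\binom{N-2}{k-2}\bigr] = N\sum_{k=2}^{N}\frac{1}{k(k-1)} = N-1$, which matches the diagonal term $-(N-1)f(\Gamma)$. The short sketch below (illustrative only; the function names are chosen here) confirms this in exact arithmetic.

```python
from fractions import Fraction as F
from math import comb

def collapse_rate(N, k):
    """Rate a_{Gamma,Gamma'} for collapsing a given set of k out of N classes."""
    return F(1, (N - 1) * comb(N - 2, k - 2))

def total_jump_rate(N):
    """Sum of the rates over all ways of choosing k >= 2 of the N classes."""
    return sum(comb(N, k) * collapse_rate(N, k) for k in range(2, N + 1))

if __name__ == "__main__":
    for N in range(2, 8):
        print(N, total_jump_rate(N))   # total rate for N classes
```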
Let us briefly recall what the "probability cascades" are. For $x\in(0,1)$, we denote by $P_x$ the law of the Poisson point process on $(0,\infty)$ with intensity $x\eta^{-x-1}d\eta$. If $M_p$ stands for the set of simple pure point Radon measures on $(0,\infty)$, the law $P_x$ is concentrated on:
$$ M = \{m\in M_p\,;\ m((0,1]) = \infty, \text{ and } |m|<\infty\}, \tag{0.2} $$
where
$$ |m| = \int_{(0,\infty)}\eta\,dm(\eta). \tag{0.3} $$
Each $m\in M$ can uniquely be written in the form
$$ m = \sum_{\ell\ge 0}\delta_{\eta_\ell(m)}, \ \text{ where } \eta_\ell(m),\ \ell\ge 0, \text{ is a strictly decreasing sequence which tends to 0 as } \ell \text{ tends to } \infty. \tag{0.4} $$
The probability cascades come as follows. For any finite sequence:
$$ 0<x_1<\cdots<x_K<1, \tag{0.5} $$
one considers a collection of random variables $\eta^k_{(i_1,\ldots,i_k)}$, $i_1,\ldots,i_k\ge 0$, $k\in[1,K]$, such that
$$ \text{the sequences } (\eta^k_{(i_1,\ldots,i_{k-1},j)},\ j\ge 0), \text{ for } k\in[1,K],\ i_1,\ldots,i_{k-1}\ge 0, \text{ are independent and respectively distributed as } (\eta_j(m))_{j\ge 0} \text{ under } P_{x_k}. \tag{0.6} $$
Then the random weights
$$ \pi_{i_1,\ldots,i_K} = \eta^1_{i_1}\cdots\eta^K_{i_1,\ldots,i_K}, \tag{0.7} $$
are a.s. summable:
$$ C = \sum_{i_1,\ldots,i_K\ge 0}\pi_{i_1,\ldots,i_K}<\infty, \ \text{a.s.}, \tag{0.8} $$
and one can recursively define the
$$ (\bar\pi_{i_1},\ \bar\pi_{i_1,i_2},\ \ldots,\ \bar\pi_{i_1,\ldots,i_K})_{i_1,\ldots,i_K\ge 0}, \tag{0.9} $$
via
$$ \bar\pi_{i_1,\ldots,i_K} = \pi_{i_1,\ldots,i_K}/C, \ \text{ and } \ \bar\pi_{i_1,\ldots,i_{k-1}} = \sum_{j\ge 0}\bar\pi_{i_1,\ldots,i_{k-1},j}, \ \text{ for } k\in[1,K]. \tag{0.10} $$
Ruelle introduces in [8] "unordered families", for which one only keeps track of the "tree structure of the labels" in (0.8). He shows a consistency property of the resulting distributions as $K$ and the finite sequence $x_1<\cdots<x_K$ vary. The "continuous probability cascades" are then constructed in [10] by means of an abstract projective limit argument. It turns out that the clustering obtained by looking backwards from the last component of (0.9), clumping together points which have a common ancestor on level $K-1$, then on level $K-2$, etc., has a Markovian structure. In fact, it is the discrete skeleton of a continuous time Markov process, which is a time change of the clustering process essentially defined by (0.1). It is therefore possible to define the continuous cascades directly from the clustering process. The precise connection with Ruelle's cascades is presented in Sect. 2, Theorem 2.2. Other links of Ruelle's cascades with continuous
branching processes have also been discussed in Neveu [7]. The clustering process enables us to introduce the variables
$$ \tau_{\ell,\ell'} = \inf\{u\ge 0,\ (\ell,\ell')\in\Gamma_u\}, \quad \ell,\ell'\ge 0, \tag{0.11} $$
which represent the time at which $\ell$ and $\ell'$ are "lumped together". For an initial distribution concentrated on the "trivial" equality relation on $\mathbb{N}$, the variables $\tau_{\ell,\ell'}$ naturally define a random ultrametric distance on $\mathbb{N}$, see (1.32) below. In fact when $\ell\ne\ell'$, the $\tau_{\ell,\ell'}$ are standard exponential variables.

A second goal of the present article is to develop an "abstract cavity method", which mimics features of the cavity method for the Sherrington–Kirkpatrick model, as presented in Chapter 5 of Mézard–Parisi–Virasoro [6]. Quite a number of quantities, which appear in the physicists' prediction of the large $N$ behavior of the SK model, naturally arise in our context. The basic ingredients for the abstract cavity method are:
$$ \text{a sequence of normalized random weights } \nu_\ell = \frac{\eta_\ell(m)}{|m|},\ \ell\ge 0, \tag{0.12} $$
where $m$ is $P_{x_M}$-distributed for a given $x_M\in(0,1)$,
$$ \text{an independent standard clustering process } \Gamma_\cdot\,, \tag{0.13} $$
$$ \text{a collection } y^\ell(x),\ x\in[0,x_M],\ \ell\ge 0, \text{ of stochastic processes,} \tag{0.14} $$
which conditional on the normalized weights and the clustering process are centered Gaussian with covariance:
$$ \mathrm{cov}\bigl(y^\ell(x),\,y^{\ell'}(x')\bigr) = q(x\wedge x'\wedge X_{\ell,\ell'}), \quad x,x'\in[0,x_M],\ \ell,\ell'\ge 0, \tag{0.15} $$
where $q(\cdot):[0,x_M]\to[0,q_M]$ is an increasing $C^1$-diffeomorphism, and
$$ X_{\ell,\ell'} = x_M\,e^{-\tau_{\ell,\ell'}}, \quad \ell,\ell'\ge 0, \tag{0.16} $$
$$ \text{a function } \psi:\mathbb{R}\to\mathbb{R} \text{ in the class } C_b^4. \tag{0.17} $$
In the language of Mézard–Parisi–Virasoro [6], the coefficients $\nu_\ell$ mimic the Gibbsian weights in decreasing order of the "pure states", whereas the clustering process $\Gamma_\cdot$, with the help of the variables $X_{\ell,\ell'}$, induces an "ultrametric structure" on the "pure states", and the $y^\ell(x_M)$ play the role of the "mean cavity field" inside the "pure state" with weight $\nu_\ell$, i.e. up to relabelling the $h_\alpha(N)$ variables of [6], p. 67. In the Mézard–Parisi–Virasoro picture, the addition of a new spin variable $\sigma$ induces a change $\sigma y^\ell(x_M)$ of the Hamiltonian in "pure state" $\ell$. Summing on this spin variable, the added energy is $\psi(y^\ell(x_M))$, where $\psi(x) = \log\cosh(\beta x)$, $\beta$ being the inverse temperature. Of crucial importance for the cavity method is the effect of this energy change on the Gibbsian weights of the countably many pure states. This effect can be described in an abstract setup, where for technical reasons, we assume that $\psi$ is bounded. Reshuffling occurs as one multiplies the individual weights $\nu_\ell$ by a factor $e^{\psi(y^\ell(x_M))}$, thereby changing the relative rank of importance of the weights. One thus introduces a random permutation of $\mathbb{N}$, $\tilde\sigma(\cdot)$, with inverse $\sigma(\cdot)$, such that for $\ell\ge 0$:
$$ \tilde\ell = \tilde\sigma(\ell) \text{ is the rank of } \mu_\ell = \nu_\ell\exp\{\psi(y^\ell(x_M))\}, \text{ among the collection } \mu_{\ell'} = \nu_{\ell'}\exp\{\psi(y^{\ell'}(x_M))\},\ \ell'\ge 0. \tag{0.18} $$
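The reshuffling operation is the replacement of $(\nu_\ell)$, $(X_{\ell,\ell'})$, $(y^\ell(\cdot))$ by $\bigl((\nu^{(R)}_{\tilde\ell}),\ (X^{(R)}_{\tilde\ell,\tilde\ell'}),\ (y^{\tilde\ell}_{(R)}(\cdot))\bigr)$, where for $\tilde\ell,\tilde\ell'\ge 0$:
$$ \nu^{(R)}_{\tilde\ell} = \frac{\mu_{\sigma(\tilde\ell)}}{\sum_\ell\mu_\ell}, \qquad X^{(R)}_{\tilde\ell,\tilde\ell'} = X_{\sigma(\tilde\ell),\sigma(\tilde\ell')}, \qquad y^{\tilde\ell}_{(R)}(\cdot) = y^{\sigma(\tilde\ell)}(\cdot). \tag{0.19} $$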
In other words, the relative importance of the weights is changed, but the initial ultrametric structure and marking processes y(·) are carried along the reshuffling operation. Our main result Theorem 4.2 describes the effect of reshuffling. It shows that the joint law of the normalized weights and of the ultrametric structure is left invariant by e ` (·), `e ≥ 0, still preserves this operation. On the other hand, the conditional law of the y(R) the tree structure, but is not Gaussian anymore. For instance, the conditional law of a component y(·), can be represented as that of a time changed process: zq(x) , x ∈ [0, xM ], where zq , solves the SDE: dzq = dBq + x(q) m(q, zq ) dq, 0 ≤ q ≤ qM , z0 = 0,
(0.20)
with x(·) the inverse of the function q(·), B. a Brownian motion and m(q, y) = ∂y f (q, y), for f (q, y) the unique Cb1,2 solution of ∂q f +
1 2
∂y2 f +
x(q) (∂y f )2 = 0, on (0, qM ) × R, f (qM , ·) = ψ(·). 2
(0.21)
Expressions like (0.20), (0.21) can for instance be found in [6], p. 45 or in Parisi [8], see [6], p. 163, as part of the prediction of the large N behavior of the SK model. The boundedness assumption on ψ(·) in (0.17), though technically convenient, excludes the natural choice ψ(·) = log(cosh(β·)), with β > 0 the inverse temperature, in the context of the SK model. In the case of a non-constant, symmetric function ψ, we can further define an “abstract iteration” procedure, which to q(·) associates a new q (R) (·), see Theorem 5.4. The fixed point equation q (R) (·) = q(·) corresponds to the so-called “selfconsistency equation”, for the SK model, see [6] (III.63), p. 45. Let us now describe how the article is organized: In Sect. 1, we construct the clustering process and derive some of its properties. Section 2 develops the connection between the clustering process and Ruelle’s probability cascades. In Sect. 3, we prepare the ground for Sect. 4 and investigate an approximate reshuffling operation. Section 4 contains the main result Theorem 2.2 of the abstract cavity method, which describes the effect of reshuffling. In Sect. 5, we give some applications of the abstract cavity method, to calculations on “single and double replicas”. This enables for a non-degenerate symmetric function ψ(·) the definition of an iteration mechanism for the function q(·), see Theorem 5.4. This work grew out of our efforts to decipher and unravel the probabilistic structure underlying the prediction of the large N behavior of the SK model at low temperature, as presented in the book of M´ezard–Parisi–Virasoro [6]. We wish to thank M. Aizenman for helpful discussions in this matter, as well as J.F. Le Gall, J. Pitman, and D. Ruelle for all their comments.
Ruelle’s Probability Cascades and Abstract Cavity Method
251
1. The Clustering Process

In this section we shall construct the clustering process. It is a continuous time Markov process with state space
$$E = \{\Gamma \subset \mathbb{N} \times \mathbb{N};\ \Gamma \text{ defines an equivalence relation on } \mathbb{N}\}. \tag{1.1}$$
The set $E$ is endowed with the canonical σ-field $\mathcal{E}$ generated by all events of the form $\{\Gamma \in E;\ (a, b) \in \Gamma\}$, for $a, b \in \mathbb{N}$. Remark that $E$ can be viewed as a closed and therefore compact subset of $\{0, 1\}^{\mathbb{N} \times \mathbb{N}}$, the latter being equipped with the product topology. When $I$ is a non-empty subset of $\mathbb{N}$, it is convenient to consider the set $E_I$ and the σ-field $\mathcal{E}_I$, which are defined analogously to $E$ and $\mathcal{E}$, with $\mathbb{N}$ replaced by $I$. The process we shall introduce describes a “clustering mechanism”. Its trajectories $\Gamma_u$, $u \ge 0$, are nondecreasing $E$-valued functions (for the inclusion relation on $E$).

We keep the notations introduced in the Introduction. The set $M_p$ of simple pure point measures on $(0, \infty)$ is endowed with its canonical σ-field $\mathcal{M}_p$ generated by the maps $m \in M_p \to m(A) \in \mathbb{N} \cup \{\infty\}$ for $A \in \mathcal{B}((0, \infty))$. The measurable subsets $M$ in (0.2), and
$$M_1 = \{m \in M;\ |m| = 1\}, \tag{1.2}$$
are endowed with the respective trace σ-fields $\mathcal{M}$ and $\mathcal{M}_1$. For $x \in (0, 1)$, we shall denote by $\overline{P}_x$ the image on $M_1$ of the Poisson law $P_x$ under the normalization map:
$$\mathcal{N} : M \to M_1, \qquad \mathcal{N}\Big(\sum_{\ell \ge 0} \delta_{\eta_\ell(m)}\Big) = \sum_{\ell \ge 0} \delta_{\eta_\ell(m)/|m|}. \tag{1.3}$$
We now define for each non-empty $I \subseteq \mathbb{N}$, and $u \ge 0$, a probability kernel $R_u^I$ on $E_I$. When $u = 0$, $R_0^I(\Gamma, d\Gamma')$ is simply the Dirac mass at $\Gamma \in E_I$. On the other hand when $u > 0$, and $\Gamma \in E_I$, $R_u^I(\Gamma, d\Gamma')$ is defined as follows. We consider the at most denumerable collection $C_\Gamma$ of Γ-equivalence classes on $I$. The space $M_1 \times \mathbb{N}^{C_\Gamma}$ is endowed with the canonical σ-field and the probability
$$Q_x = \overline{P}_x(dm) \otimes \bigotimes_{C \in C_\Gamma} \Big(\sum_{\ell \ge 0} \eta_\ell(m)\, \delta_\ell(y_C)\Big), \quad \text{where } x = e^{-u}, \tag{1.4}$$
and $y_C$, $C \in C_\Gamma$, are the canonical coordinates on $\mathbb{N}^{C_\Gamma}$. In other words, conditional to $m = \sum_{\ell \ge 0} \delta_{\eta_\ell(m)} \in M_1$, the variables $y_C$, $C \in C_\Gamma$, are independent and $\sum_{\ell \ge 0} \eta_\ell\, \delta_\ell$-distributed. We now “lump together” the Γ-equivalence classes $C$ which possess the same mark $y_C$, and obtain a random equivalence relation $\Gamma'$ on $I$. Formally, for $(m, (y_C, C \in C_\Gamma)) \in M_1 \times \mathbb{N}^{C_\Gamma}$, the collection of subsets
$$C'_\ell = \bigcup_{C:\, y_C = \ell} C, \quad \ell \ge 0, \tag{1.5}$$
defines a partition of $I$, which uniquely determines an equivalence relation $\Gamma' \supset \Gamma$ on $I$, with equivalence classes the non-empty $C'_\ell$, $\ell \ge 0$. We then define
$$R_u^I(\Gamma, d\Gamma') = \text{the law of the } E_I\text{-valued variable } \Gamma', \text{ under } Q_x. \tag{1.6}$$
When $J \subset I$ are non-empty subsets of $\mathbb{N}$, we denote by $r_{I,J}$ the measurable restriction map from $E_I$ to $E_J$:
$$r_{I,J}(\Gamma) = \Gamma \cap (J \times J). \tag{1.7}$$
When $I = \mathbb{N}$, we simply write $r_J$ in place of $r_{I,J}$.
252
E. Bolthausen, A.-S. Sznitman
Proposition 1.1.
$$R_u^I,\ u \ge 0, \text{ is a Feller semigroup on } E_I. \tag{1.8}$$
For $J \subset I$ non-empty subsets of $\mathbb{N}$, $u \ge 0$, $\Gamma \in E_I$, one has the compatibility relation
$$R_u^J(r_{I,J}(\Gamma), \cdot) \text{ is the image of } R_u^I(\Gamma, \cdot) \text{ under } r_{I,J}. \tag{1.9}$$
When $C_1, \dots, C_k$ are $k \ge 2$ distinct Γ-equivalence classes on $I$:
$$R_u^I(\Gamma, \{C_1, \dots, C_k \text{ are in the same } \Gamma'\text{-class}\}) = \frac{(k - 1 - e^{-u})(k - 2 - e^{-u}) \cdots (1 - e^{-u})}{(k - 1)!}, \quad \text{for } u > 0. \tag{1.10}$$
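As a quick numerical illustration of (1.10), the following minimal Python sketch evaluates its right-hand side. The function name `same_class_probability` is a hypothetical helper introduced here for illustration only; it is not part of the original text.

```python
from math import exp, factorial


def same_class_probability(k: int, u: float) -> float:
    """Right-hand side of (1.10): (k-1-x)(k-2-x)...(1-x) / (k-1)!, x = exp(-u)."""
    if k < 2:
        raise ValueError("k must be at least 2")
    x = exp(-u)
    product = 1.0
    for j in range(1, k):  # factors (1 - x), (2 - x), ..., (k - 1 - x)
        product *= j - x
    return product / factorial(k - 1)


# For k = 2 the expression reduces to 1 - exp(-u).
p_two = same_class_probability(2, 0.7)
```

For k = 2 this gives $1 - e^{-u}$, which is the distribution function of a standard exponential variable evaluated at u.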
Proof. The compatibility relation (1.9) is a direct consequence of the definition of $R_u^I$ and $R_u^J$. Let us now prove that $R_u^I$, $u \ge 0$, are Feller semigroups. Observe that $R_u^I$, for $u \ge 0$, preserves the space of continuous functions on $E_I$. Indeed, in view of the Stone–Weierstrass theorem it suffices to prove the continuity of the map
$$\Gamma \in E_I \to R_u^I(\Gamma, A) \in [0, 1], \quad \text{for } u \ge 0, \tag{1.11}$$
when $A$ has the form
$$A = \bigcap_{i=1}^{n} \{\Gamma' \in E_I;\ (a_i, b_i) \in \Gamma'\}, \tag{1.12}$$
with $a_i, b_i \in I$, for $i = 1, \dots, n$. For such an $A$, we can apply (1.9) with $J = \{a_i, b_i,\ i = 1, \dots, n\} \subseteq I$. We are therefore reduced to the case of a finite set $I$, where the continuity of the map in (1.11) is obvious. As a consequence of (1.10), with $k = 2$, which is proven below, $R_u^I$ tends to the identity when $I$ is finite and $u$ tends to 0. By a similar argument as above, it follows that for arbitrary $I$ and $f$ continuous on $E_I$, $R_u^I f$ tends uniformly to $f$ as $u$ tends to 0.

We now come to the proof of the semigroup property. For notational simplicity, we assume $I = \mathbb{N}$, although this plays no role in the proof. Given $\Gamma \in E$, $u, v > 0$, we can construct the law $R_u R_v(\Gamma, \cdot)$ on $E$ as follows. We consider on some auxiliary space $(\Omega, \mathcal{A}, P)$, $(m_1, (y_C, C \in C_\Gamma))$ independent of $(m_2, (Z_\ell, \ell \ge 0))$ such that $m_1$ is $P_{x_1}$-distributed with $x_1 = e^{-u}$; conditionally on $m_1$, the variables $y_C$, $C \in C_\Gamma$, are i.i.d. $\sum_{\ell \ge 0} \frac{\eta_\ell(m_1)}{|m_1|}\, \delta_\ell$-distributed; $m_2$ is $P_{x_2}$-distributed with $x_2 = e^{-v}$; conditionally on $m_2$, the variables $Z_\ell$, $\ell \ge 0$, are i.i.d. $\sum_{\ell' \ge 0} \frac{\eta_{\ell'}(m_2)}{|m_2|}\, \delta_{\ell'}$-distributed. We can define variables $y'_C$, for $C \in C_\Gamma$, via:
$$y'_C = Z_{y_C}. \tag{1.13}$$
The formula:
$$C''_{\ell'} = \bigcup_{C \in C_\Gamma:\, y'_C = \ell'} C, \quad \text{for } \ell' \ge 0, \tag{1.14}$$
defines a partition of $\mathbb{N}$, which naturally determines an equivalence relation $\Gamma''$, which is precisely $R_u R_v(\Gamma, \cdot)$-distributed. We shall now construct a suitable random permutation τ of $\mathbb{N}$ such that the variables $\tau(y'_C)$, under $P$, have the same joint distribution as the variables $y_C$, under $Q_x$ in (1.4), with $x = e^{-(u+v)}$. This will complete the proof of the semigroup property. Conditionally on $m_1$, $m_2$, $Z_\ell$, $\ell \ge 0$, the variables $y'_C$, $C \in C_\Gamma$, are independent with common distribution:
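$$P[y'_C = \ell' \mid m_1, m_2, (Z_\ell)_{\ell \ge 0}] = \sum_{\ell:\, Z_\ell = \ell'} \eta_\ell^1 \Big/ \sum_{\ell \ge 0} \eta_\ell^1, \tag{1.15}$$
where we write for simplicity $\eta_\ell(m_1) = \eta_\ell^1$ and $\eta_{\ell'}(m_2) = \eta_{\ell'}^2$. Taking a closer look at the numerator of the fraction in (1.15), we observe that conditional on $m_2$:
$$\sum_{\ell \ge 0} 1\{Z_\ell = \ell'\}\, \delta_{\eta_\ell^1}, \quad \ell' \ge 0, \tag{1.16}$$
are independent Poisson point processes on $(0, \infty)$ with respective intensities $\big(\eta_{\ell'}^2 / \sum_{k \ge 0} \eta_k^2\big)\, x_1\, \eta^{-x_1 - 1}\, d\eta$. Using scaling (i.e. (A.7) with a constant function $g$), we see that conditional on $m_2$:
$$m^{\ell'} \overset{\text{def}}{=} \sum_{\ell \ge 0} 1\{Z_\ell = \ell'\}\, \delta_{C_{\ell'}^{-1} \eta_\ell^1}, \quad \ell' \ge 0, \text{ with} \tag{1.17}$$
$$C_{\ell'} = \Big(\eta_{\ell'}^2 \Big/ \sum_{k \ge 0} \eta_k^2\Big)^{1/x_1}, \tag{1.18}$$
are i.i.d. $P_{x_1}$-distributed. Coming back to (1.15), $P$-a.s.
$$P[y'_C = \ell' \mid m_1, m_2, (Z_\ell)_{\ell \ge 0}] = C_{\ell'}\, |m^{\ell'}| \Big/ \sum_{k} C_k\, |m^k| = (\eta_{\ell'}^2)^{1/x_1}\, |m^{\ell'}| \Big/ \sum_{k \ge 0} (\eta_k^2)^{1/x_1}\, |m^k|, \quad \ell' \ge 0. \tag{1.19}$$
From (A.2), we know that $\sum_{\ell' \ge 0} \delta_{(\eta_{\ell'}^2)^{1/x_1}}$ is $P_{x_1 x_2}$-distributed. Since the $m^{\ell'}$, $\ell' \ge 0$, are i.i.d., independent from the $(\eta_{\ell'}^2)^{1/x_1}$, $\ell' \ge 0$, and $E[|m^{\ell'}|^{x_1 x_2}] < \infty$ by (A.3), it follows from (A.7) that
$$m_3 \overset{\text{def}}{=} \mathcal{N}\Big(\sum_{\ell' \ge 0} \delta_{(\eta_{\ell'}^2)^{1/x_1} |m^{\ell'}|}\Big) \text{ is } \overline{P}_{x_1 x_2}\text{-distributed}. \tag{1.20}$$
The formula $\tau(\ell') = j$, on the set
$$\Big\{\eta_j(m_3) = (\eta_{\ell'}^2)^{1/x_1}\, |m^{\ell'}| \Big/ \sum_{k \ge 0} (\eta_k^2)^{1/x_1}\, |m^k|\Big\}, \tag{1.21}$$
$P$-a.s. defines a $\sigma(m_1, m_2, Z_\ell, \ell \ge 0)$-measurable permutation of $\mathbb{N}$. We can thus consider the variables
$$\tilde{y}_C = \tau(y'_C), \quad \text{for } C \in C_\Gamma. \tag{1.22}$$
Considering (1.15), (1.19), we see that conditional on $m_1$, $m_2$, $Z_\ell$, $\ell \ge 0$, the $\tilde{y}_C$, $C \in C_\Gamma$, are i.i.d. with common distribution:
$$P[\tilde{y}_C = j \mid m_1, m_2, Z_\ell, \ell \ge 0] = \eta_j(m_3), \quad j \ge 0.$$
This conditional distribution only depends on $m_3$. Thus conditional on $m_3$, the $\tilde{y}_C$, $C \in C_\Gamma$, are i.i.d., $\sum_{j \ge 0} \eta_j(m_3)\, \delta_j$-distributed. Since the collection $\bigcup_{C \in C_\Gamma,\, \tilde{y}_C = j} C$, $j \ge 0$, defines up to relabelling the same partition of $\mathbb{N}$ as (1.14), and $m_3$ is $\overline{P}_{x_1 x_2}$-distributed, we have proved that $\Gamma''$ is $R_{u+v}(\Gamma, \cdot)$-distributed.

Let us finally prove (1.10). By (A.4), the left member of (1.10) equals
$$E^{\overline{P}_x}\Big[\sum_{\ell \ge 0} \eta_\ell^k\Big] = \frac{1}{(k-1)!}\, (k - 1 - x) \cdots (1 - x), \quad \text{with } x = e^{-u}.$$
This concludes the proof of Proposition 1.1.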
The canonical space for the “clustering process” will be
$$T: \text{ the set of non-decreasing right continuous } E\text{-valued functions } \Gamma_u,\ u \ge 0. \tag{1.23}$$
Observe that $E_I$ is finite when $I$ is finite. Thus the right continuous non-decreasing function $r_I(\Gamma_u)$, $u \ge 0$, is a step function, which only takes finitely many values. We endow $T$ with the canonical σ-field $\mathcal{T}$ generated by the canonical $E$-valued coordinates, and with the filtration:
$$\mathcal{T}_u = \sigma(\Gamma_v,\ 0 \le v \le u), \quad u \ge 0. \tag{1.24}$$
Theorem 1.2 (The Clustering Process). There is a unique collection of probabilities $P_\Gamma$ on $(T, \mathcal{T})$, $\Gamma \in E$, such that $(T, \mathcal{T}, (\Gamma_u)_{u \ge 0}, (\mathcal{T}_u)_{u \ge 0}, P_\Gamma)$ is a Markov process with semigroup $R_u$, $u \ge 0$.

Proof. Uniqueness is obvious; we shall thus only explain the construction of the probabilities $P_\Gamma$. As shown in Proposition 1.1, $R_u^I$, $u \ge 0$, is a strongly continuous semigroup on the finite state space $E_I$, when $I \ne \emptyset$ is finite. Thus for a given $\Gamma \in E$, with the help of the compatibility relation (1.9), we can construct on some auxiliary probability space a sequence $\Gamma_u^n$, $u \ge 0$, of right continuous $E_{I_n}$-valued processes, where $I_n = [0, n]$, such that:
$$\Gamma_u^n,\ u \ge 0, \text{ is a Markov process with semigroup } R_u^{I_n},\ u \ge 0, \text{ and } \Gamma_0^n = r_{I_n}(\Gamma), \tag{1.25}$$
$$r_{I_m, I_n}(\Gamma_u^m) = \Gamma_u^n, \quad \text{for } m \ge n,\ u \ge 0. \tag{1.26}$$
We simply define $\Gamma_u^\infty = \bigcup_{n \ge 0} \Gamma_u^n$, for $u \ge 0$. It is straightforward using (1.9) to see that $\Gamma_u^\infty$, $u \ge 0$, is a Markov process with semigroup $R_u$. Moreover $\Gamma_\cdot^\infty$ is a $(T, \mathcal{T})$-valued random variable, and we define $P_\Gamma$ to be its law.

As already mentioned in (0.11), it is convenient to introduce on $T$ the variables:
$$\tau_{\ell,\ell'} = \inf\{u \ge 0,\ (\ell, \ell') \in \Gamma_u\}, \quad \ell, \ell' \ge 0. \tag{1.27}$$
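It is immediate to check that for $\ell, \ell', \ell''$:
$$\tau_{\ell,\ell} = 0, \quad \tau_{\ell,\ell'} = \tau_{\ell',\ell}, \quad \tau_{\ell,\ell''} \le \max\{\tau_{\ell,\ell'}, \tau_{\ell',\ell''}\}. \tag{1.28}$$
The variables $\tau_{\ell,\ell'}$ are $P_\Gamma$-a.s. finite for any $\Gamma \in E$, since either $(\ell, \ell') \in \Gamma$, in which case $\tau_{\ell,\ell'} = 0$, $P_\Gamma$-a.s., or from (1.10)
$$P_\Gamma(\tau_{\ell,\ell'} > u) = e^{-u}, \quad u \ge 0, \text{ when } (\ell, \ell') \notin \Gamma, \tag{1.29}$$
i.e. $\tau_{\ell,\ell'}$ is a standard exponential variable. Observe also that $\Gamma_u$, $u \ge 0$, is a measurable function of the variables $\tau_{\ell,\ell'}$, $\ell, \ell' \ge 0$, since by (1.27) and the right continuity of the non-decreasing trajectories,
$$\Gamma_u = \{(\ell, \ell') \in \mathbb{N} \times \mathbb{N},\ \tau_{\ell,\ell'} \le u\}. \tag{1.30}$$
When Γ is the equality relation on $\mathbb{N}$, we shall simply write $P$ in place of $P_\Gamma$. Note that
$$P\text{-a.s. } \tau_{\ell,\ell'},\ \ell, \ell' \ge 0, \text{ defines an ultrametric distance on } \mathbb{N}. \tag{1.31}$$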
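To illustrate how the merging times encode the clustering, here is a small Python sketch that recovers the classes present at time u from a finite matrix of merging times. The function `classes_at_time` and the sample matrix are hypothetical illustrations; the sketch assumes the matrix satisfies the symmetry and ultrametric inequality of (1.28), so that collecting all labels whose merging time with a given label is at most u yields a genuine equivalence class.

```python
def classes_at_time(tau, u):
    """Group labels 0..n-1 into classes: l and l' share a class iff tau[l][l'] <= u.

    Assumes tau is symmetric, zero on the diagonal and ultrametric, as in (1.28).
    """
    n = len(tau)
    assigned = [False] * n
    classes = []
    for l in range(n):
        if assigned[l]:
            continue
        block = [m for m in range(n) if tau[l][m] <= u]
        for m in block:
            assigned[m] = True
        classes.append(block)
    return classes


# Labels 0 and 1 merge at time 0.5, and everything merges at time 2.0.
tau_example = [
    [0.0, 0.5, 2.0],
    [0.5, 0.0, 2.0],
    [2.0, 2.0, 0.0],
]
```

Increasing u can only merge classes, which mirrors the non-decreasing trajectories of the clustering process.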
We shall now close this section with a description of the pure jump process associated to the semigroups $R_u^I$, $u \ge 0$, for finite $I$. Although not explicitly needed for the sequel, this will provide further insights into the structure of the clustering process. We denote by $L^I$ the generator of the semigroup $R_u^I$, $u \ge 0$, so that for $f$ a function on $E_I$:
$$L^I f(\Gamma) = \sum_{\Gamma' \in E_I} a^I_{\Gamma,\Gamma'}\, f(\Gamma'), \quad \Gamma \in E_I. \tag{1.32}$$
Proposition 1.3 ($I$ finite). Let $N \ge 1$ denote the number of distinct equivalence classes of $\Gamma \in E_I$. If $N = 1$ all $a^I_{\Gamma,\Gamma'} = 0$ (trivially). If $N \ge 2$, $a^I_{\Gamma,\Gamma'} = 0$ unless $\Gamma'$ is obtained by lumping together $k \ge 2$ distinct classes of Γ, in which case:
$$a^I_{\Gamma,\Gamma'} = \frac{1}{(N - 1) \binom{N-2}{k-2}}, \tag{1.33}$$
or $\Gamma' = \Gamma$, in which case:
$$a^I_{\Gamma,\Gamma'} = 1 - N. \tag{1.34}$$
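The rates of Proposition 1.3 can be checked with exact arithmetic. The following Python sketch (helper names `lump_rate` and `total_rate_by_size` are hypothetical, introduced only for this illustration) multiplies the rate (1.33) by the number $\binom{N}{k}$ of ways to choose k classes among N, and sums over k.

```python
from fractions import Fraction
from math import comb


def lump_rate(N: int, k: int) -> Fraction:
    """Rate (1.33) at which one specific group of k classes, out of N, is lumped."""
    return Fraction(1, (N - 1) * comb(N - 2, k - 2))


def total_rate_by_size(N: int) -> dict:
    """Total rate of all jumps lumping exactly k classes, for k = 2, ..., N."""
    return {k: comb(N, k) * lump_rate(N, k) for k in range(2, N + 1)}


N = 6
rates = total_rate_by_size(N)
total = sum(rates.values())
```

The total off-diagonal rate equals N − 1, in agreement with the diagonal entry 1 − N in (1.34), and the rate of jumps lumping exactly k classes equals N / (k(k − 1)).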
Proof. One can compute the intensity of the second moments of the point processes $\overline{P}_x$, by the same technique as in Proposition 2.1 of [10], see also [6] p. 55, and see that
$$E^{\overline{P}_x}\Big[\sum_{\ell \ne \ell'} \eta_\ell^2\, \eta_{\ell'}^2\Big] = o(u), \quad \text{as } u \to 0, \text{ with } x = e^{-u}.$$
As a result $a^I_{\Gamma,\Gamma'}$ vanishes unless $\Gamma'$ is obtained by lumping together one subcollection of $C_\Gamma$. From (1.10), we deduce that when $\Gamma \in E_I$ and $C_1, \dots, C_k$ are $k \ge 2$ distinct equivalence classes of Γ:
$$L^I(1\{C_1, \dots, C_k \text{ are lumped together}\})(\Gamma) = \frac{1}{k - 1}. \tag{1.35}$$
It follows from an “inclusion exclusion” argument that:
$$L^I(1\{C_1, \dots, C_k \text{ form an equivalence class}\})(\Gamma) = \sum_{p=0}^{N-k} \binom{N-k}{p} (-1)^p\, \frac{1}{k - 1 + p} = \int_0^1 \sum_{p=0}^{N-k} \binom{N-k}{p} (-1)^p\, t^{k-2+p}\, dt$$
$$= \int_0^1 t^{k-2} (1 - t)^{N-k}\, dt = \frac{\Gamma(k - 1)\, \Gamma(N - k + 1)}{\Gamma(N)} = \Big[(N - 1) \binom{N-2}{k-2}\Big]^{-1},$$
where $\Gamma(\cdot)$ in the last line denotes the Gamma function. This proves (1.33). As for (1.34), it follows immediately from the identity $a^I_{\Gamma,\Gamma} = -\sum_{\Gamma' \ne \Gamma} a^I_{\Gamma,\Gamma'}$.

The pure jump process attached to $R_u^I$, $u \ge 0$, when $I$ is finite, is now easy to describe. It has a finite number of jump times: $0 < \tau_1 < \tau_1 + \tau_2 < \cdots < \tau_1 + \cdots + \tau_\lambda < \infty$. If the initial condition Γ has $N \ge 2$ classes, then $\tau_1$ is exponentially distributed with expectation $(N - 1)^{-1}$. At time $\tau_1$, the process jumps to an equivalence relation $\tilde\Gamma_1$ by collapsing $x_1$ classes of Γ, where the distribution of $x_1$ is
$$P[x_1 = k] = \frac{N}{N - 1}\, \frac{1}{k(k - 1)}, \quad 2 \le k \le N. \tag{1.36}$$
Conditionally on $\{x_1 = k\}$, $\tilde\Gamma_1$ is chosen uniformly among the $\binom{N}{k}$ possibilities. After that, $\tau_2$ is chosen with $\tilde\Gamma_1$ as the new starting element, etc. After a finite number of jumps, the final state with one class is reached. It should also be remarked that the integer valued process which counts the number of equivalence classes is Markovian as well, with downwards jumps and transition kernel essentially described by (1.36). There is in fact a simple explicit expression for the semigroup $R_u^I$, which we now provide.

Proposition 1.4 ($I$ finite). If Γ has $N \ge 2$ classes, and $\Gamma'$ is obtained by respective clumpings of $m_1, m_2, \dots, m_k \ge 1$ classes of Γ, with $\sum_j m_j = N$, then
$$R_u^I(\Gamma, \Gamma') = \frac{(k - 1)!}{(N - 1)!}\, e^{-(k-1)u} \prod_{i=1}^{k} g_{m_i}(u), \tag{1.37}$$
where $g_1(u) = 1$, and for $m \ge 2$,
$$g_m(u) = (m - 1 - e^{-u})(m - 2 - e^{-u}) \cdots (1 - e^{-u}). \tag{1.38}$$
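As a sanity check on (1.37), (1.38), the Python sketch below evaluates the formula for every set partition of N singleton classes and sums the results. The names `g`, `transition_probability` and `set_partitions` are hypothetical helpers written for this illustration; the brute-force enumeration is only practical for small N.

```python
from math import exp, factorial


def g(m: int, u: float) -> float:
    """g_m(u) from (1.38); the empty product gives g_1(u) = 1."""
    x = exp(-u)
    out = 1.0
    for j in range(1, m):
        out *= j - x
    return out


def transition_probability(block_sizes, u: float) -> float:
    """Right-hand side of (1.37) for clumpings of sizes m_1, ..., m_k."""
    k, N = len(block_sizes), sum(block_sizes)
    value = factorial(k - 1) / factorial(N - 1) * exp(-(k - 1) * u)
    for m in block_sizes:
        value *= g(m, u)
    return value


def set_partitions(items):
    """Yield every partition of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        for i in range(len(partition)):  # add `first` to an existing block
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        yield [[first]] + partition  # or open a new block


N, u = 5, 0.4
total_mass = sum(
    transition_probability([len(block) for block in partition], u)
    for partition in set_partitions(list(range(N)))
)
```

The total mass over all partitions comes out equal to 1 up to rounding, as expected of a transition kernel, and at u = 0 only the partition into singletons carries mass.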
Proof. It is convenient to set $x = e^{-u}$, $u \ge 0$, and $f_x(s) = s^x$, for $s > 0$. If $f_x^{(m)}$ denotes the $m$-th derivative of $f_x$ with respect to $s$, then the right-hand side of (1.37) equals
$$(-1)^{N-k}\, \frac{(k - 1)!}{(N - 1)!}\, \frac{1}{x} \prod_{j=1}^{k} f_x^{(m_j)}(1) \overset{\text{def}}{=} \tilde{R}_u^I(\Gamma, \Gamma'). \tag{1.39}$$
This will be helpful in order to check the backward equation
$$\frac{d}{du} \tilde{R}_u^I = L^I \tilde{R}_u^I, \quad u \ge 0. \tag{1.40}$$
Since $\tilde{R}_0^I$ is obviously the identity matrix, our claim (1.37) will follow. Observe that for $m \ge 1$:
$$\frac{\partial}{\partial u} f_x^{(m)}(1) = \frac{\partial^m}{\partial s^m} \big(-x (\log s)\, f_x(s)\big)\Big|_{s=1} = x \sum_{j=1}^{m} (-1)^j \binom{m}{j} (j - 1)!\, f_x^{(m-j)}(1).$$
Using the identity $x f_x^{(\ell)}(1) = f_x^{(\ell+1)}(1) + \ell f_x^{(\ell)}(1)$, we get
$$\frac{\partial}{\partial u} f_x^{(m)}(1) = \sum_{j=1}^{m} (-1)^j \binom{m}{j} (j - 1)!\, \big[f_x^{(m-j+1)}(1) + (m - j) f_x^{(m-j)}(1)\big]$$
$$= m! \sum_{j=2}^{m} (-1)^{j-1}\, \frac{1}{j(j - 1)(m - j)!}\, f_x^{(m-j+1)}(1) - m\, f_x^{(m)}(1), \tag{1.41}$$
after regrouping, and the above sum over $j$ is 0 if $m = 1$. We use this expression to differentiate $\tilde{R}_u^I(\Gamma, \Gamma')$ with respect to $u$. We find:
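$$\frac{\partial}{\partial u} \tilde{R}_u^I(\Gamma, \Gamma') = (-N + 1)\, \tilde{R}_u^I(\Gamma, \Gamma') + (-1)^{N-k}\, \frac{(k - 1)!}{(N - 1)!} \sum_{i:\, m_i \ge 2}\ \sum_{\ell=2}^{m_i} (-1)^{\ell-1}\, \frac{m_i!}{(m_i - \ell)!}\, \frac{1}{\ell(\ell - 1)}\, \frac{1}{x}\, f_x^{(m_i - \ell + 1)}(1) \prod_{j \ne i} f_x^{(m_j)}(1)$$
$$= (-N + 1)\, \tilde{R}_u^I(\Gamma, \Gamma') + \sum_{i:\, m_i \ge 2}\ \sum_{\ell=2}^{m_i} \binom{m_i}{\ell}\, a_{N,\ell}\, (-1)^{N-k-\ell+1}\, \frac{(k - 1)!}{(N - \ell)!}\, \frac{1}{x}\, f_x^{(m_i - \ell + 1)}(1) \prod_{j \ne i} f_x^{(m_j)}(1)$$
$$= \sum_{\Gamma'':\, \Gamma \subseteq \Gamma'' \subseteq \Gamma'} a^I_{\Gamma,\Gamma''}\, \tilde{R}_u^I(\Gamma'', \Gamma'), \tag{1.42}$$
where we have used the notation $a_{N,\ell} = \big[(N - 1) \binom{N-2}{\ell-2}\big]^{-1}$. This finishes the proof of (1.37).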
2. Clustering Process and Ruelle’s Probability Cascades

We shall present in this section the precise connection between the clustering process constructed in the previous section and Ruelle’s cascades as defined in [10]. We consider a sequence
$$0 < x_1 < x_2 < \cdots < x_K < 1, \quad K \ge 1, \tag{2.1}$$
as well as
$$u_1 = \log\frac{x_K}{x_1} > u_2 = \log\frac{x_K}{x_2} > \cdots > u_K = \log\frac{x_K}{x_K} = 0. \tag{2.2}$$
The main object of this section is to give an alternative description of the law on $M_1 \times E^K$ of $(\mathcal{N}(m), \Gamma_{u_K}, \Gamma_{u_{K-1}}, \dots, \Gamma_{u_1})$ under $P_{x_K} \times P$.

We first introduce some notations. We denote by $I_k$, for $k \ge 1$, the set $\mathbb{N}^k$ of multi-indices $i = (i_1, \dots, i_k)$ of length $k$. As a convention we also define $I_0 = \{\emptyset\}$. If $i \in I_k$, $i' \in I_{k'}$, with $k, k' \ge 0$, $i.i'$ denotes the concatenation of $i$ and $i'$. Furthermore, when $k \le k'$, $i \preceq i'$ means that $i'$ extends $i$, whereas $[i']_k$ stands for the truncation to order $k$ of $i'$. We now introduce an auxiliary probability space $(S, \mathcal{S}, Q)$, endowed with a family of $(0, \infty)$-valued random variables $\eta_i^k$, $i \in I_k$, $1 \le k \le K$, satisfying (0.6). For $i \in I_k$, $1 \le k \le K$, it is convenient to introduce the generalization of (0.7):
$$\pi_i = \eta^1_{[i]_1} \cdot \eta^2_{[i]_2} \cdots \eta^k_{[i]_k} \quad (\text{where of course } [i]_k = i). \tag{2.3}$$
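The product weights in (2.3) can be explored numerically. The Python sketch below is a truncated simulation, under the assumption (inferred from (2.6), since condition (0.6) is not restated here) that for each parent index the children's variables form independent Poisson point processes with intensity $x_k \eta^{-x_k-1} d\eta$. It uses the standard representation of such a process as $T_j^{-1/x}$, where $T_j$ are the arrival times of a unit-rate Poisson process. The names `top_poisson_points` and `truncated_cascade`, and the truncation to the n largest points per node, are assumptions made for illustration; the truncation makes the result only an approximation of the normalized cascade.

```python
import random


def top_poisson_points(x: float, n: int, rng: random.Random):
    """n largest points, in decreasing order, of a Poisson point process on
    (0, inf) with intensity x * eta**(-x - 1) d(eta)."""
    points, arrival = [], 0.0
    for _ in range(n):
        arrival += rng.expovariate(1.0)  # arrival times of a unit-rate process
        points.append(arrival ** (-1.0 / x))
    return points


def truncated_cascade(xs, n: int, seed: int = 0):
    """Approximate ranked, normalized weights pi_i for levels xs[0] < xs[1] < ...

    Each node keeps only its n largest children (a truncation, hence approximate).
    Returns a list of (multi_index, normalized_weight), largest weight first.
    """
    rng = random.Random(seed)
    weights = {(): 1.0}
    for x in xs:
        next_level = {}
        for index, pi in weights.items():
            for j, eta in enumerate(top_poisson_points(x, n, rng)):
                next_level[index + (j,)] = pi * eta
        weights = next_level
    total = sum(weights.values())
    ranked = sorted(weights, key=weights.get, reverse=True)
    return [(index, weights[index] / total) for index in ranked]


cascade = truncated_cascade([0.3, 0.6], n=20)
```

Two leaves whose multi-indices share the same first coordinate correspond to labels lying in the same class of the coarser equivalence relation.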
The following lemma is actually part of the results proved in Sect. 3 of Ruelle [10]. Nevertheless its simple proof is included for the reader’s convenience.

Lemma 2.1.
$$\omega = \sum_{i \in I_K} \delta_{\pi_i} \text{ is } Q\text{-a.s. } M\text{-valued and} \tag{2.4}$$
$$\overline{\omega} = \mathcal{N}(\omega) \text{ is } \overline{P}_{x_K}\text{-distributed (see (1.3) for the notation)}. \tag{2.5}$$
Proof. We use induction on $K$. When $K = 1$, (2.4), (2.5) are immediate. Consider $K > 1$. Conditional on $\eta^1_\cdot, \eta^2_\cdot, \dots, \eta^{K-1}_\cdot$,
$$\sum_{j \ge 0} \delta_{\pi_i\, \eta^K_{i \cdot j}}, \quad \text{for } i \in I_{K-1}, \tag{2.6}$$
are independent Poisson point processes on $(0, \infty)$, with respective intensities $\pi_i^{x_K}\, x_K\, \eta^{-x_K - 1}\, d\eta$. It also follows from (A.2) that the collection of variables
$$\tilde\eta_i^k = (\eta_i^k)^{x_K}, \quad i \in I_k,\ k \in [1, K - 1], \tag{2.7}$$
satisfies (0.6) relative to the sequence:
$$0 < \tilde{x}_1 = \frac{x_1}{x_K} < \tilde{x}_2 = \frac{x_2}{x_K} < \cdots < \tilde{x}_{K-1} = \frac{x_{K-1}}{x_K}. \tag{2.8}$$
Defining $\tilde\pi_i$ analogously to (2.3) in terms of the $\tilde\eta$ variables, the induction hypothesis implies that
$$C \overset{\text{def}}{=} \sum_{I_{K-1}} \pi_i^{x_K} < \infty, \quad Q\text{-a.s.} \tag{2.9}$$
Coming back to (2.6), we see that conditional on $\eta^1_\cdot, \dots, \eta^{K-1}_\cdot$, the variable ω is distributed as a Poisson point process on $(0, \infty)$ with intensity $C\, x_K\, \eta^{-x_K - 1}\, d\eta$. Using the scaling relation (A.7) (when $g$ is constant), we see that
$$\tilde\omega = \sum_{I_K} \delta_{C^{-1/x_K} \pi_i} \tag{2.10}$$
is $P_{x_K}$-distributed and independent of $\eta^1_\cdot, \dots, \eta^{K-1}_\cdot$. It now follows that ω is $Q$-a.s. $M$-valued and $\mathcal{N}(\omega) = \mathcal{N}(\tilde\omega)$ is $\overline{P}_{x_K}$-distributed. This concludes the proof of the induction step.

With the help of Lemma 2.1, we introduce on a set of full $Q$-probability a measurable bijection $i(\cdot) : \mathbb{N} \to I_K$, such that:
$$\pi_{i(\ell)} = \eta_\ell(\omega), \quad \text{for } \ell \ge 0. \tag{2.11}$$
In other words $\pi_{i(\ell)}$ has rank ℓ among the collection $\pi_i$, $i \in I_K$. Furthermore, we consider a decreasing sequence of $E$-valued variables:
$$\Gamma^k = \{(\ell, \ell') \in \mathbb{N} \times \mathbb{N};\ [i(\ell)]_k = [i(\ell')]_k\}, \quad \text{so that } \Gamma^1 \supseteq \Gamma^2 \supseteq \cdots \supseteq \Gamma^K. \tag{2.12}$$
The connection between the clustering process and Ruelle’s cascades comes in the following

Theorem 2.2.
$$(\mathcal{N}(m), \Gamma_{u_K}, \dots, \Gamma_{u_1}) \text{ has the same distribution under } P_{x_K} \times P \text{ as } (\overline{\omega}, \Gamma^K, \dots, \Gamma^1) \text{ under } Q. \tag{2.13}$$
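Proof. With the help of (2.5) and the fact that $\Gamma_{u_K}$ and $\Gamma^K$ both almost surely coincide with the equality relation on $\mathbb{N}$, we see that $(\mathcal{N}(m), \Gamma_{u_K})$ and $(\overline{\omega}, \Gamma^K)$ have the same law. In view of the Markov property asserted in Theorem 1.2, the claim (2.13) will follow when we show that
$$R_{(u_r - u_{r+1})}(\Gamma^{r+1}, \cdot) \text{ is a version of the conditional law of } \Gamma^r \text{ given } (\overline{\omega}, \Gamma^K, \dots, \Gamma^{r+1}), \text{ for } r \in [1, K - 1]. \tag{2.14}$$
The argument used in the proof of Lemma 2.1 shows that
$$C \overset{\text{def}}{=} \sum_{I_r} \pi_i^{x_{r+1}} < \infty, \quad Q\text{-a.s.}, \tag{2.15}$$
and conditional to $\eta^1_\cdot, \dots, \eta^r_\cdot$,
$$\omega_i = \sum_{i' \in I_{r+1}:\, i' \succeq i} \delta_{C^{-1/x_{r+1}} \pi_{i'}}, \quad \text{for } i \in I_r, \tag{2.16}$$
are independent Poisson point processes on $(0, \infty)$ with respective intensities $C^{-1}\, \pi_i^{x_{r+1}}\, x_{r+1}\, \eta^{-x_{r+1} - 1}\, d\eta$. If we now define the point process µ on $(0, \infty) \times I_r$:
$$\mu = \sum_{i' \in I_{r+1}} \delta_{(C^{-1/x_{r+1}} \pi_{i'},\ [i']_r)}, \tag{2.17}$$
we see that conditionally on $\eta^1_\cdot, \dots, \eta^r_\cdot$, µ is a Poisson point process with intensity
$$x_{r+1}\, \eta^{-x_{r+1} - 1}\, d\eta \otimes \sum_{I_r} \frac{\pi_i^{x_{r+1}}}{C}\, \delta_i, \text{ and} \tag{2.18}$$
$$\omega' \overset{\text{def}}{=} \sum_{i' \in I_{r+1}} \delta_{C^{-1/x_{r+1}} \pi_{i'}} \text{ is independent of } \eta^1_\cdot, \dots, \eta^r_\cdot, \text{ and } P_{x_{r+1}}\text{-distributed}. \tag{2.19}$$
Analogously to (2.11), we introduce a $Q$-a.s. defined measurable bijection $i_{r+1}(\cdot) : \mathbb{N} \to I_{r+1}$, such that:
$$\eta_j(\omega') = C^{-1/x_{r+1}}\, \pi_{i_{r+1}(j)}, \quad \text{for } j \ge 0. \tag{2.20}$$
We further introduce variables $\overline{\eta}_i^k$, $k \in [1, K - r]$, $i \in I_k$:
$$\overline{\eta}_j^1 = \eta_j(\omega'), \quad \text{for } j \in \mathbb{N} = I_1, \tag{2.21}$$
$$\overline{\eta}^k_{(j_1, \dots, j_k)} = \eta^{r+k}_{i_{r+1}(j_1) \cdot (j_2, \dots, j_k)}, \quad \text{for } k \in [2, K - r],\ (j_1, \dots, j_k) \in I_k. \tag{2.22}$$
The variables $\overline{\eta}^2_\cdot, \dots, \overline{\eta}^{K-r}_\cdot$ are independent of $\eta^1_\cdot, \dots, \eta^r_\cdot$, µ, and coming back to (2.17), conditionally on $\eta^1_\cdot, \dots, \eta^r_\cdot$, $\overline{\eta}^1_\cdot, \dots, \overline{\eta}^{K-r}_\cdot$,
$$\text{the } I_r\text{-valued variables } [i_{r+1}(j)]_r,\ j \ge 0, \text{ are i.i.d., with common law } \sum_{I_r} \frac{\pi_i^{x_{r+1}}}{C}\, \delta_i. \tag{2.23}$$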
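It is plain that the variables $\overline{\eta}$ satisfy (0.6) relative to the sequence: $\overline{x}_1 = x_{r+1} < \cdots < \overline{x}_{K-r} = x_K$. Defining $\overline{\pi}_i$, for $i \in I_k$, $k \in [1, K - r]$, in analogy with (2.3), we can introduce a $Q$-a.s. defined measurable bijection $\overline{i} : \mathbb{N} \to I_{K-r}$, such that:
$$\overline{\pi}_{\overline{i}(\ell)} = \eta_\ell\Big(\sum_{I_{K-r}} \delta_{\overline{\pi}_i}\Big) = C^{-1/x_{r+1}}\, \eta_\ell(\omega), \quad \text{for } \ell \ge 0. \tag{2.24}$$
We now find that $Q$-a.s., for $k \in [r + 1, K]$:
$$\Gamma^k = \{(\ell, \ell') \in \mathbb{N} \times \mathbb{N};\ [\overline{i}(\ell)]_{k-r} = [\overline{i}(\ell')]_{k-r}\}, \tag{2.25}$$
whereas for $k = r$,
$$\Gamma^r = \{(\ell, \ell') \in \mathbb{N} \times \mathbb{N};\ [i_{r+1}([\overline{i}(\ell)]_1)]_r = [i_{r+1}([\overline{i}(\ell')]_1)]_r\}. \tag{2.26}$$
In other words the $\Gamma^r$-equivalence classes are the
$$C_i^r = \bigcup_{j:\, [i_{r+1}(j)]_r = i} C_j^{r+1}, \quad \text{for } i \in I_r, \text{ where } C_j^{r+1} = \{\ell \in \mathbb{N};\ [\overline{i}(\ell)]_1 = j\}, \text{ for } j \ge 0, \tag{2.27}$$
are the various $\Gamma^{r+1}$-equivalence classes. Observe that (2.25) expresses the $\Gamma^k$, $k \in [r + 1, K]$, in terms of the $\overline{\eta}^{k-r}$, $k \in [r + 1, K]$, and that $\overline{\omega} = \mathcal{N}\big(\sum_{I_{K-r}} \delta_{\overline{\pi}_i}\big)$. Furthermore, it follows from (A.2) and (2.9) (when $K = r$), that
$$\mathcal{N}\Big(\sum_{I_r} \delta_{\pi_i^{x_{r+1}}}\Big) \text{ is } \overline{P}_{x_r / x_{r+1}}\text{-distributed, with } \frac{x_r}{x_{r+1}} = e^{-(u_r - u_{r+1})}. \tag{2.28}$$
If we now recall (2.27) and (2.23), it is routine to deduce (2.14). This concludes our proof of (2.13).

As an application of Theorem 2.2, we can consider the process of “mass coagulation”, $m_u$, $u \ge 0$, defined on $M_1 \times T$, as the random pure point measure on $(0, \infty)$:
$$m_u = \sum_{C:\ \Gamma_u\text{-equivalence classes}} \delta_{\sum_{\ell \in C} \eta_\ell(m)}. \tag{2.29}$$
Under $\overline{P}_x \otimes P$, for $x \in (0, 1)$, its law is in fact concentrated on $M_1$, as follows from the

Corollary 2.3.
$$\text{Under } \overline{P}_x \otimes P,\ m_u \text{ is } \overline{P}_{x e^{-u}}\text{-distributed}. \tag{2.30}$$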
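Proof. We choose $K = 2$, $x_1 = x e^{-u} < x_2 = x$. From Theorem 2.2, $(\mathcal{N}(m), \Gamma_0, \Gamma_u)$ has the same law under $P_x \otimes P$ as $(\overline{\omega}, \Gamma^2, \Gamma^1)$ under $Q$. It follows that the law of $m_u$ is the same as that of
$$\overline{m} = \mathcal{N}\Big(\sum_{j \ge 0} \delta_{\eta_j^1 \sum_{j' \ge 0} \eta^2_{(j,j')}}\Big). \tag{2.31}$$
As a consequence of (A.7), $\sum_{j \ge 0} \delta_{\eta_j^1 c_j}$ is $P_{x_1}$-distributed, if
$$c_j = \sum_{j' \ge 0} \eta^2_{(j,j')} \Big/ E^{P_x}\big[|m|^{x_1}\big]^{1/x_1},$$
where $E^{P_x}[|m|^{x_1}] < \infty$, by (A.3). Coming back to (2.31), we see that $\overline{m}$ is $\overline{P}_{x_1}$-distributed. This proves our claim.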
With the help of Theorem 1.2, it is straightforward to see that $m_u$, $u \ge 0$, under $\overline{P}_x \otimes P$, with $x \in (0, 1)$, is a simple Markov process with semigroup:
$$R_u(m, \cdot) = \text{law of } m_u \text{ under } \delta_m \otimes P. \tag{2.32}$$
3. Approximate Reshuffling

The main goal of this section is to prepare the ground for the next section, where we shall derive the effect of the true reshuffling operation. We first need to introduce some notations. We suppose we are given $x_M \in (0, 1)$, $q_M \in (0, 1]$, and a non-decreasing function $q(\cdot)$ such that:
$$q(\cdot) \text{ is a } C^1\text{-diffeomorphism between } [0, x_M] \text{ and } [0, q_M], \quad q(x) = q_M, \text{ for } x \in [x_M, 1]. \tag{3.1}$$
We denote by $x(\cdot) : [0, q_M] \to [0, x_M]$ the inverse of $q(\cdot)$. We consider the probability space $(\tilde\Sigma, \tilde{\mathcal{B}}, \tilde{Q})$, where
$$\tilde\Sigma = M \times T \times C_0(\mathbb{R}_+, \mathbb{R})^{\mathbb{N}}, \tag{3.2}$$
$\tilde{\mathcal{B}}$ is the canonical product σ-field, and under the law $\tilde{Q}$, the canonical coordinates $m$, $(\Gamma_u, u \ge 0)$, $w^\ell(\cdot)$, $\ell \ge 0$, on $\tilde\Sigma$, are independent, respectively $P_{x_M}$, $P$, and $W$ distributed, with $W$ the Wiener measure on $C_0(\mathbb{R}_+, \mathbb{R})$. We also introduce the $[0, x_M]$-valued variables on $T$ (and $\tilde\Sigma$ as well):
$$X_{\ell,\ell'} = x_M \exp\{-\tau_{\ell,\ell'}\}, \quad \ell, \ell' \ge 0 \quad (\text{see (1.27) for the notation}). \tag{3.3}$$
In view of (1.30), $\Gamma_u$, $u \ge 0$, is a measurable function of the $X_{\ell,\ell'}$, $\ell, \ell' \ge 0$, and
$$X_{\ell,\ell'} \ge \min(X_{\ell,\ell''}, X_{\ell'',\ell'}), \quad \text{for } \ell, \ell', \ell'' \ge 0. \tag{3.4}$$
We then come to the construction of the conditionally Gaussian stochastic processes announced in (0.14). We define by induction a sequence $Y^\ell(x)$, $x \in [0, x_M]$, $\ell \ge 0$, of stochastic processes with:
$$Y^0(x) = w^0(q(x)), \quad x \in [0, x_M], \text{ and for } N \ge 0,$$
$$Y^{N+1}(x) = \begin{cases} Y^L(x), & \text{for } x \in [0, X_{N+1,L}], \\ w^{N+1}\big(q(x) - q(X_{N+1,L})\big) + Y^L(X_{N+1,L}), & \text{for } x \in [X_{N+1,L}, x_M], \end{cases} \tag{3.5}$$
provided $L \in [0, N]$ is any integer such that:
$$X_{N+1,L} = \max\{X_{N+1,\ell};\ \ell \in [0, N]\}. \tag{3.6}$$
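The inductive construction (3.5), (3.6) can be sketched on a finite grid as follows. This Python illustration makes two simplifying assumptions that are not in the original text: only finitely many labels are used, and the branch points $X_{\ell,\ell'}$ are taken to lie exactly on the grid (as happens for the discretized variables introduced later in the section). The function name `build_paths` and its arguments are hypothetical.

```python
import random


def build_paths(grid, q, X, seed=0):
    """Sample Y^0, ..., Y^(n-1) on `grid` following (3.5)-(3.6).

    grid : increasing list of x-values, grid[0] = 0 and grid[-1] = x_M
    q    : non-decreasing function with q(0) = 0
    X    : symmetric matrix of branch points; X[l][l2] must be a grid value
    """
    rng = random.Random(seed)

    def extend(path, start):
        # Continue `path` beyond grid index `start` with fresh, independent
        # Gaussian increments of variance q(x_g) - q(x_{g-1}).
        for g_index in range(start + 1, len(grid)):
            variance = q(grid[g_index]) - q(grid[g_index - 1])
            path.append(path[-1] + rng.gauss(0.0, variance ** 0.5))
        return path

    paths = [extend([0.0], 0)]  # Y^0(x) = w^0(q(x)), started at 0
    for new in range(1, len(X)):
        # (3.6): a label L in [0, new - 1] maximizing the branch point X[new][L]
        L = max(range(new), key=lambda l: X[new][l])
        split = grid.index(X[new][L])
        # (3.5): copy Y^L up to the branch point, then evolve independently
        paths.append(extend(paths[L][: split + 1], split))
    return paths


grid = [0.0, 0.1, 0.2, 0.3, 0.4]
X = [
    [0.4, 0.2, 0.1],
    [0.2, 0.4, 0.1],
    [0.1, 0.1, 0.4],
]
paths = build_paths(grid, lambda x: 2.0 * x, X)
```

In this example the paths of labels 0 and 1 coincide up to x = 0.2, and all three paths coincide up to x = 0.1, reflecting the tree structure encoded by X.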
With the help of (3.4) and an induction argument, one readily checks that (3.5)–(3.6) unambiguously defines $Y^{N+1}(\cdot)$. In fact one has:
$$Y^\ell(x) = Y^{\ell'}(x), \quad \text{for } x \in [0, X_{\ell,\ell'}], \tag{3.7}$$
and under $\tilde{Q}$, conditional to $m$, $\Gamma_u$, $u \ge 0$, the $Y^\ell(x)$, $x \in [0, x_M]$, $\ell \ge 0$, are centered Gaussian processes with covariance
$$E^{\tilde{Q}}[Y^\ell(x)\, Y^{\ell'}(x') \mid m, \Gamma_u, u \ge 0] = q(x \wedge x' \wedge X_{\ell,\ell'}), \quad x, x' \in [0, x_M],\ \ell, \ell' \ge 0. \tag{3.8}$$
We shall now introduce a sequence $Y_n^\ell(\cdot)$, for $n \ge 0$, which approximates the processes $Y^\ell(\cdot)$ as $n$ tends to infinity, through a discretization of $[0, x_M]$. For $n \ge 0$, $k \in [0, 2^n]$, we define:
$$x_{k,n} = x\Big(\frac{k}{2^n}\, q_M\Big) \tag{3.9}$$
(see (3.1) for the notation), as well as for $n, \ell, \ell' \ge 0$:
$$X^n_{\ell,\ell'} = \sum_{0 \le k \le 2^n} x_{k,n}\, 1\{x_{k,n} \le X_{\ell,\ell'} < x_{k+1,n}\}, \quad \text{with the convention } x_{2^n+1,n} = 1. \tag{3.10}$$
The processes $Y_n^\ell(\cdot)$ are defined exactly as in (3.5)–(3.6), except that we replace everywhere in the definition $X_{\cdot,\cdot}$ by $X^n_{\cdot,\cdot}$. Then in analogy to (3.8), conditional to $m$ and $\Gamma_u$, $u \ge 0$, the $Y_n^\ell(x)$, $x \in [0, x_M]$, $\ell \ge 0$, are centered Gaussian processes with covariance
$$E^{\tilde{Q}}[Y_n^\ell(x)\, Y_n^{\ell'}(x') \mid m, \Gamma_u, u \ge 0] = q(x \wedge x' \wedge X^n_{\ell,\ell'}), \quad x, x' \in [0, x_M],\ \ell, \ell' \ge 0. \tag{3.11}$$
It is plain that when $n \to \infty$:
$$X^n_{\ell,\ell'} \uparrow X_{\ell,\ell'}, \quad \text{for } \ell, \ell' \ge 0, \tag{3.12}$$
$$Y_n^\ell(\cdot) \text{ converges to } Y^\ell(\cdot) \text{ uniformly on } [0, x_M], \text{ for } \ell \ge 0. \tag{3.13}$$
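The discretization (3.9), (3.10) amounts to rounding each branch point down to the nearest grid value. A minimal Python sketch, in which `grid_points`, `round_down` and the sample inverse function are hypothetical choices made for illustration:

```python
from bisect import bisect_right


def grid_points(x_inverse, q_M: float, n: int):
    """Grid x_{k,n} = x(k * q_M / 2**n), k = 0, ..., 2**n, as in (3.9)."""
    return [x_inverse(k * q_M / 2 ** n) for k in range(2 ** n + 1)]


def round_down(value: float, grid):
    """Largest grid point x_{k,n} with x_{k,n} <= value, as in (3.10).

    Assumes grid is increasing and grid[0] <= value.
    """
    return grid[bisect_right(grid, value) - 1]


# Example with q(x) = 2x on [0, 0.4], so x(q) = q / 2, x_M = 0.4, q_M = 0.8.
example_grid = grid_points(lambda q: q / 2.0, q_M=0.8, n=2)
```

Refining the grid (increasing n) can only move the rounded value up towards the original one, in line with (3.12).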
It is useful, in view of the calculations on the effect of the approximate reshuffling operation, to give an alternate description of the joint law of $m$, $(X^n_{\ell,\ell'})_{\ell,\ell' \ge 0}$, $(Y_n^\ell(\cdot))_{\ell \ge 0}$, for $n$ fixed. To this end, we consider the situation of Sect. 2, with $K = 2^n$ and $x_k = x_{k,n}$, for $k \in [1, K]$. We assume that on the auxiliary probability space $(S, \mathcal{S}, Q)$, parallel to the variables $\eta_i^k$, $i \in I_k$, $k \in [1, 2^n]$, we also have mutually independent standard Wiener processes $z_i^k(\cdot)$, $i \in I_k$, $k \in [1, 2^n]$, which are also independent of the $\eta_i^k$ variables. For $i \in I_k$, $k \in [1, 2^n]$, $x \in [0, x_{k,n}]$, we unambiguously define
$$\overline{Y}_i(x) = \sum_{1 \le k' < k_0} z^{k'}_{[i]_{k'}}\Big(\frac{q_M}{2^n}\Big) + z^{k_0}_{[i]_{k_0}}\Big(q(x) - \frac{(k_0 - 1)\, q_M}{2^n}\Big), \tag{3.14}$$
where $k_0 \in [1, k]$ is such that $x \in [x_{k_0 - 1,n}, x_{k_0,n}]$. In the case $i \in I_K$ (recall $K$ stands for $2^n$), $\overline{Y}_i(\cdot)$ is thus a continuous process on $[0, x_M]$. We are now ready for

Lemma 3.1 ($n \ge 0$ is fixed). $(m, (X^n_{\ell,\ell'})_{\ell,\ell' \ge 0}, (Y_n^\ell(\cdot))_{\ell \ge 0})$ has the same law under $\tilde{Q}$ as $(\omega, (\overline{X}_{\ell,\ell'})_{\ell,\ell' \ge 0}, (\overline{Y}_{i(\ell)})_{\ell \ge 0})$ under $Q$, with $i(\cdot)$ defined as in (2.11), and for $\ell, \ell' \ge 0$,
$$\overline{X}_{\ell,\ell'} = \sup\{x_{k,n};\ k \in [1, K] \text{ such that } [i(\ell)]_k = [i(\ell')]_k\}, \quad \text{with the convention } \sup \emptyset = 0. \tag{3.15}$$
Proof. We shall write $x_k$ instead of $x_{k,n}$, for simplicity. It follows from (3.10) that $\tilde{Q}$-a.s., for $\ell, \ell' \ge 0$:
$$X^n_{\ell,\ell'} = \sup\{x_k,\ k \in [1, K] \text{ such that } (\ell, \ell') \in \Gamma_{u_k}\},$$
where $u_k$ is defined as in (2.2), and $\sup \emptyset = 0$, as in (3.15). Applying Theorem 2.2, we thus see that:
$$(m, (X^n_{\ell,\ell'})_{\ell,\ell' \ge 0}) \text{ has the same law under } \tilde{Q} \text{ as } (\omega, (\overline{X}_{\ell,\ell'})_{\ell,\ell' \ge 0}) \text{ under } Q. \tag{3.16}$$
Furthermore, conditional on $(m, (X^n_{\ell,\ell'})_{\ell,\ell' \ge 0})$, the processes $Y_n^\ell(\cdot)$ are centered Gaussian processes with covariance as in (3.11). On the other hand, by inspection of (3.14), the processes $\overline{Y}_i(\cdot)$, $i \in I_K$, are independent of $(\omega, (\overline{X}_{\ell,\ell'})_{\ell,\ell' \ge 0}, i(\cdot))$, centered Gaussian with covariance
$$\mathrm{cov}(\overline{Y}_i(x), \overline{Y}_{i'}(x')) = \min\big(q(x), q(x'), \sup\{q(x_k);\ [i]_k = [i']_k\}\big).$$
It follows that conditional on $(\omega, (\overline{X}_{\ell,\ell'})_{\ell,\ell' \ge 0}, i(\cdot))$, the processes $\overline{Y}_{i(\ell)}(\cdot)$, $\ell \ge 0$, are centered Gaussian with covariance:
$$\min\big(q(x), q(x'), q(\overline{X}_{\ell,\ell'})\big) = q(x \wedge x' \wedge \overline{X}_{\ell,\ell'}), \tag{3.17}$$
which is a measurable function of $(\overline{X}_{\ell,\ell'})_{\ell,\ell' \ge 0}$. This proves that conditional on $(\omega, (\overline{X}_{\ell,\ell'})_{\ell,\ell' \ge 0})$, the $\overline{Y}_{i(\ell)}(\cdot)$, $\ell \ge 0$, are centered Gaussian processes with covariance as in (3.17). This concludes the proof of Lemma 3.1.

We shall now define the approximate reshuffling operation. To this end we consider
$$\text{a function } \psi(\cdot) : \mathbb{R} \to \mathbb{R}, \text{ bounded measurable}. \tag{3.18}$$
The boundedness assumption is here for technical convenience, although it excludes the natural choice $\psi(x) = \log(2 \cosh(\beta x))$ in the context of [6]. For the time being, we keep $n \ge 0$ fixed, and write $x_k$ in place of $x_{k,n}$. We introduce a sequence of functions $\psi_k$, $k \in [0, 2^n]$, via:
$$\psi_{2^n}(\cdot) = \psi(\cdot), \text{ and } \psi_{k-1}(\cdot) = \frac{1}{x_k} \log P_{q_M/2^n}\big[e^{x_k \psi_k}\big](\cdot), \quad 1 \le k \le 2^n, \tag{3.19}$$
where $P_t$, $t \ge 0$, stands for the usual Brownian semigroup:
$$P_t h(y) = \int \frac{1}{\sqrt{2\pi t}} \exp\Big\{-\frac{(y' - y)^2}{2t}\Big\}\, h(y')\, dy', \text{ when } t > 0, \qquad P_t h(y) = h(y), \text{ for } t = 0, \tag{3.20}$$
for $y \in \mathbb{R}$ and $h$ bounded measurable.
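One step of the recursion (3.19) can be approximated numerically. In the Python sketch below, the Brownian semigroup (3.20) is written as an expectation over a standard Gaussian variable and evaluated with a plain midpoint rule; the function names, the integration window and the number of steps are assumptions chosen for illustration, not part of the original text. Composing many steps through nested closures would be slow, so a practical implementation would tabulate each $\psi_k$ on a grid; only a single step is shown here.

```python
from math import exp, log, pi, sqrt


def brownian_semigroup(h, t: float, y: float, half_width: float = 10.0, steps: int = 4000) -> float:
    """Approximate P_t h(y) = E[h(y + sqrt(t) * Z)], Z standard Gaussian, by a midpoint rule."""
    if t == 0:
        return h(y)
    dz = 2.0 * half_width / steps
    total = 0.0
    for s in range(steps):
        z = -half_width + (s + 0.5) * dz
        total += h(y + sqrt(t) * z) * exp(-0.5 * z * z)
    return total * dz / sqrt(2.0 * pi)


def backward_step(psi_k, x_k: float, t: float):
    """Return psi_{k-1} = (1 / x_k) * log P_t[exp(x_k * psi_k)], one step of (3.19)."""
    def psi_prev(y: float) -> float:
        return log(brownian_semigroup(lambda v: exp(x_k * psi_k(v)), t, y)) / x_k
    return psi_prev


# A constant function is left unchanged by the step.
psi_prev_const = backward_step(lambda y: 1.5, x_k=0.7, t=0.1)
```

As a check of the quadrature, for the (unbounded, hence outside assumption (3.18)) linear function $\psi_k(y) = a y$ the step can be computed in closed form: $\psi_{k-1}(y) = a y + x_k a^2 t / 2$.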
Functions closely related to the $\psi_k$ appear in Mézard–Virasoro [4], p. 1299. We also introduce a sequence $H_N$, $N \ge 0$, of random variables on $\tilde\Sigma$:
$$H_0 = \exp\Big\{x_M\, \psi(Y_n^0(x_M)) + \sum_{0 \le k < 2^n} (x_k - x_{k+1})\, \psi_k(Y_n^0(x_k))\Big\}, \tag{3.21}$$
and for $N \ge 0$,
$$H_{N+1} = H_N \exp\Big\{x_M\, \psi(Y_n^{N+1}(x_M)) - X^n_{N+1,L}\, \psi_{k_0}(Y_n^{N+1}(X^n_{N+1,L})) + \sum_{k_0 \le k < 2^n} (x_k - x_{k+1})\, \psi_k(Y_n^{N+1}(x_k))\Big\},$$
where $L \in [0, N]$ is such that $X^n_{N+1,L} = \max\{X^n_{N+1,\ell},\ 0 \le \ell \le N\}$, and $q(X^n_{N+1,L}) = \frac{k_0}{2^n}\, q_M$.

The approximate reshuffling operation comes as follows. On a set of full $\tilde{Q}$-probability, the sequence
$$\nu_\ell = \eta_\ell(m) \exp\{\psi(Y_n^\ell(x_M))\}, \quad \ell \ge 0, \tag{3.22}$$
has pairwise distinct terms, and is summable. We can thus introduce on a set of full $\tilde{Q}$-probability a measurable permutation $\tilde\sigma_n$ of $\mathbb{N}$, with inverse $\sigma_n$, such that
$$\tilde\ell = \tilde\sigma_n(\ell) \text{ is the rank of } \nu_\ell \text{ among the } \nu_{\ell'},\ \ell' \ge 0. \tag{3.23}$$
The approximate reshuffling corresponds to considering
$$m^{(r)} = \mathcal{N}\Big(\sum_{\ell \ge 0} \delta_{\nu_\ell}\Big), \quad X^{(r)}_{\tilde\ell,\tilde\ell'} = X^n_{\sigma_n(\tilde\ell),\sigma_n(\tilde\ell')}, \quad Y_{(r)}^{\tilde\ell}(\cdot) = Y_n^{\sigma_n(\tilde\ell)}(\cdot), \quad \tilde\ell, \tilde\ell' \ge 0, \tag{3.24}$$
in place of $m$, $X^n_{\ell,\ell'}$, $Y_n^\ell(\cdot)$, $\ell, \ell' \ge 0$, and the dependence on $n$ of the reshuffled objects is omitted from the notation (3.24) for simplicity. Intuitively, one keeps the initial tree structure and marking processes, but relabels according to the new relative importance of the weights $\nu_\ell$. The next result is crucial for the sequel:

Proposition 3.2. For $N \ge 0$, $(m^{(r)}, (X^{(r)}_{\cdot,\cdot}), (Y_{(r)}^{\tilde\ell}(\cdot))_{0 \le \tilde\ell \le N})$ has the same law under $\tilde{Q}$ as $(m, (X^n_{\cdot,\cdot}), (Y_n^\ell(\cdot))_{0 \le \ell \le N})$ under the probability $H_N \cdot \tilde{Q}$.
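The relabelling in (3.22)–(3.24) can be sketched for finitely many labels as follows. This Python illustration assumes a finite truncation of the weight sequence and takes rank 0 to be the largest weight; the function `reshuffle` and the sample data are hypothetical and only meant to show the bookkeeping.

```python
from math import exp, tanh


def reshuffle(eta, y_end, X, psi):
    """Relabel finitely many weights as in (3.22)-(3.24).

    eta   : weights eta_l
    y_end : terminal values Y^l(x_M)
    X     : matrix of branch points X[l][l2]
    psi   : bounded function

    Returns (sigma, weights, X_r): sigma[rank] is the old label with that rank,
    weights are the normalized reshuffled weights in decreasing order, and
    X_r[a][b] = X[sigma[a]][sigma[b]].
    """
    nu = [e * exp(psi(y)) for e, y in zip(eta, y_end)]
    sigma = sorted(range(len(nu)), key=lambda l: nu[l], reverse=True)
    total = sum(nu)
    weights = [nu[l] / total for l in sigma]
    X_r = [[X[a][b] for b in sigma] for a in sigma]
    return sigma, weights, X_r


eta = [0.5, 0.3, 0.2]
y_end = [0.0, 2.0, 1.0]
X = [
    [0.4, 0.2, 0.1],
    [0.2, 0.4, 0.1],
    [0.1, 0.1, 0.4],
]
sigma, weights, X_r = reshuffle(eta, y_end, X, tanh)
```

Here the second label overtakes the first after reweighting, while the matrix of branch points is merely permuted: the tree structure itself is unchanged.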
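Proof. We use Lemma 3.1, and recall that $K$ stands for $2^n$. We introduce on a set of full $Q$-probability a measurable bijection $\tau : \mathbb{N} \to I_K$, such that:
$$\pi_{\tau(\tilde\ell)} \exp\{\psi(\overline{Y}_{\tau(\tilde\ell)}(x_M))\} \text{ has rank } \tilde\ell \text{ among the collection } \pi_i \exp\{\psi(\overline{Y}_i(x_M))\},\ i \in I_K. \tag{3.25}$$
It readily follows from Lemma 3.1 that $(m, (X^n_{\cdot,\cdot}), (Y_n^\ell(\cdot))_{\ell \ge 0}, \tilde\sigma_n)$ has the same law under $\tilde{Q}$ as $(\omega, (\overline{X}_{\cdot,\cdot}), (\overline{Y}_{i(\ell)}(\cdot))_{\ell \ge 0}, \tau^{-1} \circ i)$ under $Q$ (where it should be observed that $\tau^{-1} \circ i$ depends measurably on ω and $(\overline{Y}_{i(\ell)}(x_M))_{\ell \ge 0}$). As a result $(m^{(r)}, (X^{(r)}_{\cdot,\cdot}), (Y_{(r)}^{\tilde\ell}(\cdot))_{\tilde\ell \ge 0})$ has the same law under $\tilde{Q}$ as $(\omega^{(r)}, (\overline{X}^{(r)}_{\cdot,\cdot}), (\overline{Y}_{(r)}^{\tilde\ell}(\cdot))_{\tilde\ell \ge 0})$ under $Q$, where
$$\omega^{(r)} = \mathcal{N}\Big(\sum_{I_K} \delta_{\pi_i \exp\{\psi(\overline{Y}_i(x_M))\}}\Big), \text{ and for } \tilde\ell, \tilde\ell' \ge 0,$$
$$\overline{X}^{(r)}_{\tilde\ell,\tilde\ell'} = \sup\{x_k;\ k \in [1, K] \text{ such that } [\tau(\tilde\ell)]_k = [\tau(\tilde\ell')]_k\}, \qquad \overline{Y}_{(r)}^{\tilde\ell}(\cdot) = \overline{Y}_{\tau(\tilde\ell)}(\cdot). \tag{3.26}$$
The key observation is that we can write for $i \in I_K$,
$$\pi_i \exp\{\psi(\overline{Y}_i(x_M))\} = \eta^1_{[i]_1} \cdots \eta^K_{[i]_K} \exp\{\psi_K(\overline{Y}_i(x_M))\}$$
$$= e^{\psi_0(0)}\ \eta^1_{[i]_1}\, e^{\psi_1(z^1_{[i]_1}(\frac{q_M}{2^n})) - \psi_0(0)} \cdots \eta^k_{[i]_k}\, e^{\psi_k(\overline{Y}_i(x_{k-1}) + z^k_{[i]_k}(\frac{q_M}{2^n})) - \psi_{k-1}(\overline{Y}_i(x_{k-1}))} \cdots \eta^K_{[i]_K}\, e^{\psi_K(\overline{Y}_i(x_{K-1}) + z^K_i(\frac{q_M}{2^n})) - \psi_{K-1}(\overline{Y}_i(x_{K-1}))}, \tag{3.27}$$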
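and with the help of (A.7), and the definition (3.19), conditional to the σ-algebras:
$$\mathcal{G}_k \overset{\text{def}}{=} \sigma(\eta_i^{k'}, z_i^{k'},\ i \in I_{k'},\ k' < k), \quad k \in [1, K],$$
the collection of marked point processes on $(0, \infty) \times C_0(\mathbb{R}_+, \mathbb{R})$
$$\sum_{j \ge 0} \delta_{\big(\eta^k_{i \cdot j}\, e^{\{\psi_k(\overline{Y}_i(x_{k-1}) + z^k_{i \cdot j}(\frac{q_M}{2^n})) - \psi_{k-1}(\overline{Y}_i(x_{k-1}))\}},\ z^k_{i \cdot j}(\cdot)\big)}, \quad i \in I_{k-1}, \tag{3.28}$$
are independent Poisson with respective intensities:
$$x_k\, \eta^{-x_k - 1}\, d\eta \otimes \mu_i^k(dz), \tag{3.29}$$
where $\mu_i^k$ is the probability on $C_0(\mathbb{R}_+, \mathbb{R})$ defined by:
$$\mu_i^k(dz) = e^{x_k \{\psi_k(\overline{Y}_i(x_{k-1}) + z(\frac{q_M}{2^n})) - \psi_{k-1}(\overline{Y}_i(x_{k-1}))\}}\ W(dz), \tag{3.30}$$
where $W$ stands for the Wiener measure. As a result of (3.28), we can find variables $\tilde\eta_i^k$, $\tilde{z}_i^k$, $i \in I_k$, $k \in [1, K]$, obtained by successive relabellings of the variables $\eta_i^k \exp\{\psi_k(\overline{Y}_i(x_k)) - \psi_{k-1}(\overline{Y}_{[i]_{k-1}}(x_{k-1}))\}$, $z_i^k(\cdot)$, $i \in I_k$, $k \in [1, K]$, such that:
$$\tilde\eta_i^k,\ i \in I_k,\ k \in [1, K], \text{ has the same distribution as } \eta_i^k,\ i \in I_k,\ k \in [1, K], \tag{3.31}$$
$$\text{the } \tilde{z}_i^k \text{ variables are independent of the } \tilde\eta_i^k \text{ variables}, \tag{3.32}$$
$$\text{conditional to } \tilde{\mathcal{G}}_k \overset{\text{def}}{=} \sigma(\tilde\eta_{i'}^{k'}, \tilde{z}_{i'}^{k'},\ i' \in I_{k'},\ k' < k), \tag{3.33}$$
the $\tilde{z}^k_{i \cdot j}(\cdot)$, $i \in I_{k-1}$, $j \ge 0$, are independent, respectively $\tilde\mu_i^k$-distributed, with
$$\tilde\mu_i^k(dz) = e^{x_k \{\psi_k(\tilde{Y}_i(x_{k-1}) + z(\frac{q_M}{2^n})) - \psi_{k-1}(\tilde{Y}_i(x_{k-1}))\}}\ W(dz), \tag{3.34}$$
and the $\tilde{Y}_i(\cdot)$ are defined like the $\overline{Y}_i(\cdot)$ in (3.14), with the $\tilde{z}$ variables in place of the $z$ variables. Taking into account the fact that $\psi_0(0)$ in (3.27) is constant, and plays no role after normalization, we find that
$$\omega^{(r)} = \mathcal{N}\Big(\sum_{I_K} \delta_{\tilde\pi_i}\Big) \text{ (with obvious notations), and for } \tilde\ell, \tilde\ell' \ge 0:$$
$$\overline{X}^{(r)}_{\tilde\ell,\tilde\ell'} = \sup\{x_k;\ k \in [1, K],\ [\tilde\tau(\tilde\ell)]_k = [\tilde\tau(\tilde\ell')]_k\}, \qquad \overline{Y}_{(r)}^{\tilde\ell}(\cdot) = \tilde{Y}_{\tilde\tau(\tilde\ell)}(\cdot), \tag{3.35}$$
where $\tilde\tau(\cdot)$ is the $Q$-a.s. defined measurable bijection between $\mathbb{N}$ and $I_K$, such that:
$$\tilde\pi_{\tilde\tau(\ell)} \text{ has rank } \ell \text{ among the collection } \tilde\pi_i,\ i \in I_K. \tag{3.36}$$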
As a result, we see that $(\omega^{(r)}, (\overline{X}^{(r)}_{\cdot,\cdot}), (\overline{Y}_{(r)}^{\ell}(\cdot))_{0 \le \ell \le N})$ has the same law under $Q$ as $(\omega, (\overline{X}_{\cdot,\cdot}), (\overline{Y}_{i(\ell)}(\cdot))_{0 \le \ell \le N})$ under $\overline{H}_N\, Q$, where
$$\overline{H}_0 = \exp\Big\{\sum_{k=1}^{K} x_k\, \big[\psi_k(\overline{Y}_{i(0)}(x_k)) - \psi_{k-1}(\overline{Y}_{i(0)}(x_{k-1}))\big]\Big\}, \text{ and for } N \ge 0,$$
$$\overline{H}_{N+1} = \overline{H}_N \exp\Big\{\sum_{k:\ x_k > \overline{X}_{N+1,L}} x_k\, \big[\psi_k(\overline{Y}_{i(N+1)}(x_k)) - \psi_{k-1}(\overline{Y}_{i(N+1)}(x_{k-1}))\big]\Big\}, \tag{3.37}$$
with $L \in [0, N]$ any number such that $\overline{X}_{N+1,L} = \sup\{\overline{X}_{N+1,\ell},\ 0 \le \ell \le N\}$. Thus with the help of Lemma 3.1, and the identity in law above (3.26), Proposition 3.2 follows.

4. Reshuffling

The object of this section is to study the true reshuffling operation. The description of the effect of this operation on the random weights, the clustering process, and the $y^\ell(\cdot)$ components will exhibit several quantities which arise in the prediction of the “limit picture” for the Sherrington–Kirkpatrick model, see Mézard–Parisi–Virasoro [6], p. 45, and [5]. In the light of these references, the operation of reshuffling we introduce in (4.4) and (4.24) below can be seen as a kind of abstract cavity method. Supposedly in the context of the Sherrington–Kirkpatrick model, the cavity method yields a description of the disordered averaged SK-measure on $N$ and $(N + 1)$ sites, when $N$ is large. We keep the notations of Sect. 3, and assume from now on that
$$\psi(\cdot) \text{ belongs to } C_b^4(\mathbb{R}).$$
(4.1)
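We define on $(\tilde\Sigma, \tilde{\mathcal{B}}, \tilde Q)$ the sequence:
\[
\mu_\ell = \eta_\ell(m) \exp\{\psi(Y^\ell(x_M))\}, \quad \ell \ge 0. \tag{4.2}
\]
By the same argument as in (3.22), (3.23), we can introduce on a set of full $\tilde Q$-probability a measurable permutation $\tilde\sigma(\cdot)$ of $\mathbb{N}$, with inverse $\sigma(\cdot)$, such that:
\[
\tilde\ell = \tilde\sigma(\ell) \text{ is the rank of } \mu_\ell \text{ among the sequence } \mu_{\ell'},\ \ell' \ge 0. \tag{4.3}
\]
As in (3.24), we can then define the quantities:
\[
m_{(R)} = \mathcal{N}\Big(\sum_{\ell \ge 0} \delta_{\mu_\ell}\Big), \quad X^{(R)}_{\tilde\ell, \tilde\ell'} = X_{\sigma(\tilde\ell), \sigma(\tilde\ell')}, \quad Y^{\tilde\ell}_{(R)}(\cdot) = Y^{\sigma(\tilde\ell)}(\cdot), \quad \tilde\ell, \tilde\ell' \ge 0, \tag{4.4}
\]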
which describe the reshuffling operation. Our main objective is to find the law of the random vector (4.4). To this end we introduce for $n \ge 0$, $q \in [0, q_M]$, $y \in \mathbb{R}$:
\[
f_n(q, y) = \frac{1}{x_{k,n}} \log\big\{P_{\frac{k}{2^n} q_M - q}\big(\exp\{x_{k,n}\, \psi_k\}\big)(y)\big\}, \quad \text{when } \frac{k-1}{2^n}\, q_M \le q \le \frac{k}{2^n}\, q_M, \tag{4.5}
\]
with the notations of (3.19).
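Lemma 4.1. As $n$ tends to infinity, $f_n$ converges uniformly on compact sets of $[0, q_M] \times \mathbb{R}$ to the unique solution $f(q, y)$ in $C_b^{1,2}([0, q_M] \times \mathbb{R})$ of
\[
\begin{cases}
\partial_q f + \dfrac{1}{2}\, \partial_y^2 f + \dfrac{x(q)}{2}\, (\partial_y f)^2 = 0, & \text{on } (0, q_M) \times \mathbb{R}, \\[4pt]
f(q_M, y) = \psi(y).
\end{cases} \tag{4.6}
\]
Proof. It follows from (4.5) that on $[\frac{k-1}{2^n} q_M, \frac{k}{2^n} q_M] \times \mathbb{R}$:
\[
e^{x_{k,n} f_n} = P_{\frac{k}{2^n} q_M - q}\big(e^{x_{k,n} \psi_k}\big). \tag{4.7}
\]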
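Differentiating twice in the $y$ variable, we find
\[
\partial_y f_n = P_{\frac{k}{2^n} q_M - q}\big(\partial_y \psi_k\, e^{x_{k,n} \psi_k}\big) \big/ P_{\frac{k}{2^n} q_M - q}\big(e^{x_{k,n} \psi_k}\big), \text{ and} \tag{4.8}
\]
\[
\partial_y^2 f_n + x_{k,n} (\partial_y f_n)^2 = P_{\frac{k}{2^n} q_M - q}\big((\partial_y^2 \psi_k + x_{k,n} (\partial_y \psi_k)^2)\, e^{x_{k,n} \psi_k}\big) \big/ P_{\frac{k}{2^n} q_M - q}\big(e^{x_{k,n} \psi_k}\big). \tag{4.9}
\]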
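From (4.8), we see recursively on $k$ that:
\[
\sup_{n, q, y} |\partial_y f_n| \le \|\partial_y \psi\|_\infty,
\]
and then choosing $q = \frac{k-1}{2^n} q_M$ in (4.9), we see that:
\[
\|\partial_y^2 \psi_{k-1} + x_{k-1,n} (\partial_y \psi_{k-1})^2\|_\infty \le \|\partial_y^2 \psi_k + x_{k,n} (\partial_y \psi_k)^2\|_\infty + (x_{k,n} - x_{k-1,n})\, \|\partial_y \psi\|_\infty^2, \tag{4.10}
\]
which together with (4.9) easily implies that:
\[
\sup_{n, q, y} |\partial_y^2 f_n| < \infty.
\]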
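Moreover, differentiating (4.7) in the $q$ variable and recalling that $P_t$ is the Brownian semigroup, we find that:
\[
\begin{cases}
\partial_q f_n + \dfrac{1}{2}\, \partial_y^2 f_n + \dfrac{x_n(q)}{2}\, (\partial_y f_n)^2 = 0, & \text{for } q \ne \dfrac{k}{2^n}\, q_M,\ k \in [1, 2^n], \\[4pt]
f_n(q_M, y) = \psi(y),
\end{cases} \tag{4.11}
\]
with the notation:
\[
x_n(q) = \sum_{k=1}^{2^n} x_{k,n}\, 1\Big\{\frac{k-1}{2^n}\, q_M < q \le \frac{k}{2^n}\, q_M\Big\}. \tag{4.12}
\]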
If we write the relations analogous to (4.8) and (4.9) obtained for the third and fourth derivative of $f_n$ in the $y$ variable, and derive analogous controls to (4.10), we easily deduce that:
\[
\sup_{n \ge 0,\ 0 \le j \le 4}\ \sup_{q, y} |\partial_y^j f_n(q, y)| < \infty. \tag{4.13}
\]
Taking first and second derivatives of (4.11) in the $y$ variable, we see that
\[
\sup_{n \ge 0,\ 0 \le j \le 2} \|\partial_q \partial_y^j f_n\|_{L^\infty([0, q_M] \times \mathbb{R})} < \infty. \tag{4.14}
\]
It then follows from (4.13), (4.14) and $f_n(q_M, \cdot) = \psi(\cdot)$, that $\partial_y^j f_n$, $0 \le j \le 2$, are relatively compact sequences for the topology of uniform convergence on compact sets of $[0, q_M] \times \mathbb{R}$. From any subsequence of $f_n$, we can extract a subsequence along which $f_n$, $\partial_y f_n$, $\partial_y^2 f_n$ converge uniformly on compact subsets of $[0, q_M] \times \mathbb{R}$ respectively to the bounded continuous functions $f_\infty$, $\partial_y f_\infty$, $\partial_y^2 f_\infty$. Coming back to (4.11) in integral form, we see that $f_\infty \in C_b^{1,2}([0, q_M] \times \mathbb{R})$, and $f_\infty$ satisfies (4.6). Observe now that (4.6) has a unique solution. Indeed, if $f$ and $f'$ are two solutions, $w = f - f' \in C_b^{1,2}([0, q_M] \times \mathbb{R})$ satisfies
\[
\begin{cases}
\partial_q w + \dfrac{1}{2}\, \partial_y^2 w + \dfrac{x(q)}{2}\, (\partial_y f + \partial_y f')\, \partial_y w = 0, \\[4pt]
w(q_M, y) = 0,
\end{cases} \tag{4.15}
\]
and the maximum principle (see Theorem 8.1.4 of Krylov [3]) implies that $w = 0$ (one could also give an argument based on an S.D.E. representation of $w$). This shows that $f_\infty$ is uniquely determined, and thus $f_n$ converges uniformly on compact sets of $[0, q_M] \times \mathbb{R}$ to the solution of (4.6). We shall use the notation
\[
m(q, y) = \partial_y f(q, y), \quad (q, y) \in [0, q_M] \times \mathbb{R}, \tag{4.16}
\]
where $f$ is the unique solution of (4.6). We can now introduce a sequence of random variables $I_N$, $N \ge 0$, on $\tilde\Sigma$:
\[
\begin{aligned}
I_0 &= \exp\Big\{x_M\, \psi(Y^0(x_M)) - \int_0^{x_M} f(q(x), Y^0(x))\, dx\Big\}, \text{ and for } N \ge 0, \\
I_{N+1} &= I_N \exp\Big\{x_M\, \psi(Y^{N+1}(x_M)) - X_{N+1,L}\, \psi(Y^{N+1}(X_{N+1,L})) - \int_{X_{N+1,L}}^{x_M} f(q(x), Y^{N+1}(x))\, dx\Big\},
\end{aligned} \tag{4.17}
\]
where $L \in [0, N]$, and $X_{N+1,L}$ are as in (3.6). With the help of (3.7), this unambiguously defines $I_N$, $N \ge 0$. We can give an alternative expression for $I_N$, if we notice that for $x_1 \in [0, x_M]$, $\ell \ge 0$, Itô's formula implies
\[
\begin{aligned}
x_1 f(q(x_1), Y^\ell(x_1)) - \int_0^{x_1} f(q(x), Y^\ell(x))\, dx
&= \int_0^{x_1} x\, m(q(x), Y^\ell(x))\, dY^\ell(x) + \int_0^{x_1} x\, \big(\partial_q + \tfrac{1}{2}\, \partial_y^2\big) f(q(x), Y^\ell(x))\, dq(x) \\
&\overset{(4.6)}{=} \int_0^{x_1} x\, m(q(x), Y^\ell(x))\, dY^\ell(x) - \frac{1}{2} \int_0^{x_1} x^2\, m^2(q(x), Y^\ell(x))\, dq(x).
\end{aligned} \tag{4.18}
\]
As a result, we can write
\[
\begin{aligned}
I_0 &= \exp\Big\{\int_0^{x_M} x\, m(q(x), Y^0(x))\, dY^0(x) - \frac{1}{2} \int_0^{x_M} x^2\, m^2(q(x), Y^0(x))\, dq(x)\Big\}, \\
I_{N+1} &= I_N \exp\Big\{\int_{X_{N+1,L}}^{x_M} x\, m(q(x), Y^{N+1}(x))\, dY^{N+1}(x) - \frac{1}{2} \int_{X_{N+1,L}}^{x_M} x^2\, m^2(q(x), Y^{N+1}(x))\, dq(x)\Big\},
\end{aligned} \tag{4.19}
\]
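for $N \ge 0$. This gives a more transparent interpretation for the $I_N$-variables if one keeps in mind the Girsanov formula. On the other hand (4.17) is better suited to the approximation scheme.

Theorem 4.2. For $N \ge 0$, $(m_{(R)}, (X^{(R)}_{\cdot,\cdot}), (Y^{\tilde\ell}_{(R)}(\cdot))_{0 \le \tilde\ell \le N})$ has the same law under $\tilde Q$ as $(m, (X_{\cdot,\cdot}), (Y^\ell(\cdot))_{0 \le \ell \le N})$ under $I_N \cdot \tilde Q$.

Proof. We reintroduce the $n$-dependence in the notation $m_{n,(r)}$, $X^{n,(r)}_{\cdot,\cdot}$, $Y^{\cdot}_{n,(r)}(\cdot)$, for the quantities defined in (3.24). As a result of (3.12), (3.13), on a set of full $\tilde Q$-probability, as $n \to \infty$,
\[
m_{n,(r)} \to m_{(R)}, \text{ vaguely on } (0, \infty), \tag{4.20}
\]
$\sigma_n(\cdot)$ and $\tilde\sigma_n(\cdot)$ converge simply on $\mathbb{N}$, respectively to $\sigma(\cdot)$ and $\tilde\sigma(\cdot)$. Therefore, when $n \to \infty$:
\[
\begin{aligned}
&X^{n,(r)}_{\tilde\ell, \tilde\ell'} = X_{\sigma_n(\tilde\ell), \sigma_n(\tilde\ell')} \to X^{(R)}_{\tilde\ell, \tilde\ell'} = X_{\sigma(\tilde\ell), \sigma(\tilde\ell')}, \text{ for } \tilde\ell, \tilde\ell' \ge 0, \\
&Y^{\tilde\ell}_{n,(r)}(\cdot) = Y_n^{\sigma_n(\tilde\ell)}(\cdot) \to Y^{\tilde\ell}_{(R)}(\cdot) = Y^{\sigma(\tilde\ell)}(\cdot) \text{ uniformly on } [0, x_M], \text{ for } \tilde\ell \ge 0.
\end{aligned}
\]
If we endow $M_p \times [0, x_M]^{\mathbb{N} \times \mathbb{N}} \times C([0, x_M], \mathbb{R})^{\mathbb{N}}$ with the canonical product topology, and denote by $F$ a continuous bounded function on this space, it follows from these convergences, Proposition 3.2 and Lemma 4.1, that:
\[
\begin{aligned}
E^{\tilde Q}\big[F(m_{(R)}, (X^{(R)}_{\cdot,\cdot}), (Y^{\tilde\ell}_{(R)}(\cdot))_{0 \le \tilde\ell \le N})\big]
&= \lim_{n} E^{\tilde Q}\big[F(m_{n,(r)}, (X^{n,(r)}_{\cdot,\cdot}), (Y^{\ell}_{n,(r)})_{0 \le \ell \le N})\big] \\
&= \lim_{n \to \infty} E^{\tilde Q}\big[F(m, (X_{\cdot,\cdot}), (Y_n^\ell(\cdot))_{0 \le \ell \le N})\, H_N^n\big] \\
&= E^{\tilde Q}\big[F(m, (X_{\cdot,\cdot}), (Y^\ell(\cdot))_{0 \le \ell \le N})\, I_N\big].
\end{aligned} \tag{4.21}
\]
Since $N$ and $F$ are arbitrary, this proves our claim.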
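The equation (4.6) behind Lemma 4.1 can be sanity-checked on a closed-form case. The sketch below is only an illustration under assumptions that are not the paper's: it takes $x(q) \equiv 1$ and $\psi(y) = \log\cosh(\beta y)$ (which is not bounded, so it lies outside (4.1)), for which the Hopf–Cole type formula (4.5) gives $f(q, y) = \log\cosh(\beta y) + \beta^2 (q_M - q)/2$. The names `BETA`, `Q_M`, `f`, `m` and `pde_residual` are hypothetical. The code checks the PDE by central differences and the bound $|\partial_y f| \le \|\partial_y \psi\|_\infty$.

```python
import math

BETA = 0.7   # hypothetical choice; psi(y) = log cosh(BETA * y)
Q_M = 1.0    # hypothetical terminal value q_M


def f(q, y):
    """Closed form of log P_{q_M - q}(e^psi)(y) when x(q) == 1 (an assumption)."""
    return math.log(math.cosh(BETA * y)) + 0.5 * BETA ** 2 * (Q_M - q)


def m(q, y):
    """m = d/dy f, cf. (4.16); here it does not depend on q."""
    return BETA * math.tanh(BETA * y)


def pde_residual(q, y, h=1e-4):
    """Central-difference value of d_q f + (1/2) d_y^2 f + (1/2) (d_y f)^2."""
    dq = (f(q + h, y) - f(q - h, y)) / (2 * h)
    dy = (f(q, y + h) - f(q, y - h)) / (2 * h)
    dyy = (f(q, y + h) - 2 * f(q, y) + f(q, y - h)) / h ** 2
    return dq + 0.5 * dyy + 0.5 * dy ** 2


if __name__ == "__main__":
    for q in (0.1, 0.5, 0.9):
        for y in (-2.0, 0.0, 1.5):
            print(q, y, pde_residual(q, y))
```

The residuals printed are at the level of finite-difference error, and `f(Q_M, y)` reduces to the terminal condition $\psi(y)$. With a non-constant $x(q)$ no such closed form is available in general, which is why the paper proceeds through the approximation $f_n$.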
We can give a slightly different formulation of Theorem 4.2 by considering:
\[
\Sigma = M \times T \times C([0, x_M], \mathbb{R})^{\mathbb{N}}, \tag{4.22}
\]
endowed with the natural product σ-algebra $\mathcal{B}$ and with the probability $Q$ for which the canonical coordinates $m$ and $(\Gamma_u)_{u \ge 0}$ are independent, respectively $P_{x_M}$ and $P$ distributed, and conditional to $m$, $(\Gamma_u)_{u \ge 0}$, the $y^\ell(\cdot)$, $\ell \ge 0$, are centered Gaussian processes with
\[
E^Q[y^\ell(x)\, y^{\ell'}(x') \mid m, (\Gamma_u)_{u \ge 0}] = q(x \wedge x' \wedge X_{\ell,\ell'}), \quad x, x' \in [0, x_M],\ \ell, \ell' \ge 0. \tag{4.23}
\]
Then we can define the reshuffling operation as the measurable map $\Phi : \Sigma_0 \to \Sigma$, where $\Sigma_0$ has full $Q$ probability, and $\Phi$ is such that:
\[
\Phi\big(m, (\Gamma_u)_{u \ge 0}, (y^\ell(\cdot))_{\ell \ge 0}\big) = \Big(\mathcal{N}\Big(\sum_{\ell \ge 0} \delta_{\eta_\ell\, e^{\psi(y^\ell(x_M))}}\Big),\ ((\tilde\sigma \otimes \tilde\sigma)(\Gamma_u))_{u \ge 0},\ (y^{\sigma(\tilde\ell)}(\cdot))_{\tilde\ell \ge 0}\Big), \tag{4.24}
\]
with $\tilde\sigma(\cdot)$ and $\sigma(\cdot)$ as in (4.3), with $Y^\ell$ replaced by $y^\ell$. Furthermore, we can introduce on $\Sigma$ the $Q$-densities:
\[
J_N = \exp\Big\{\sum_{\ell \ge 0} \int_0^{x_M} 1\{\ell \in L_N(x)\}\, x\, m(q(x), y^\ell(x))\, dy^\ell(x) - \frac{1}{2} \sum_{\ell \ge 0} \int_0^{x_M} 1\{\ell \in L_N(x)\}\, x^2\, m^2(q(x), y^\ell(x))\, dq(x)\Big\}, \tag{4.25}
\]
for $N \ge 0$, where $L_N$ is any map of the form $S(r_{[0,N]} \circ \Gamma_{\log(x_M/x)})$: composition of the restriction to $[0, N]$ of $\Gamma_{\log(x_M/x)}$ (a piecewise constant map), with $S$ which to $\Gamma \in E_{[0,N]}$ associates a selection $S(\Gamma) \subseteq [0, N]$ of representatives of the $\Gamma$-equivalence classes in $[0, N]$. As a result of (4.23), $J_N$ is unambiguously defined up to null equivalence. The key Theorem 4.2 can be reformulated as:

Theorem 4.3. If $\mathcal{H}_N$ is the σ-algebra generated by $(m, (\Gamma_u)_{u \ge 0}, (y^\ell(\cdot))_{0 \le \ell \le N})$, for $N \ge 0$, then
\[
\Phi \circ Q = J_N \cdot Q \text{ on } \mathcal{H}_N. \tag{4.26}
\]
Proof. We only need to notice that $I_N$ in (4.19) can be rewritten as
\[
I_N = \exp\Big\{\sum_{\ell \ge 0} \int_0^{x_M} 1\{\ell \in L_N(x)\}\, x\, m(q(x), Y^\ell(x))\, dY^\ell(x) - \frac{1}{2} \sum_{\ell \ge 0} \int_0^{x_M} 1\{\ell \in L_N(x)\}\, x^2\, m^2(q(x), Y^\ell(x))\, dq(x)\Big\},
\]
for a suitable $L_N(x)$, as after (4.25).
Corollary 4.4. Let $\tau$ be a permutation of $[0, N]$, $N \ge 0$, then under $\Phi \circ Q$, $G = ((X_{\ell,\ell'})_{0 \le \ell,\ell' \le N}, (y^\ell(\cdot))_{0 \le \ell \le N})$ and $G_\tau = ((X_{\tau(\ell),\tau(\ell')})_{0 \le \ell,\ell' \le N}, (y^{\tau(\ell)}(\cdot))_{0 \le \ell \le N})$ have the same law.

Proof. It is obvious that $(X_{\ell,\ell'})_{0 \le \ell,\ell' \le N}$ and $(X_{\tau(\ell),\tau(\ell')})_{0 \le \ell,\ell' \le N}$ have the same law under $P$. Together with (4.23), this shows that $G$ and $G_\tau$ have the same law under $Q$. Expressing $J_N$ as a measurable function of $G_\tau$, we find that $G_\tau$ under $J_N \cdot Q$ has the same law as $G$ under $\tilde J_N \cdot Q$, with
\[
\tilde J_N = \exp\Big\{\sum_{\ell \ge 0} \int_0^{x_M} 1\{\ell \in \tilde L_N(x)\}\, x\, m(q(x), y^\ell(x))\, dy^\ell(x) - \frac{1}{2} \sum_{\ell \ge 0} \int_0^{x_M} 1\{\ell \in \tilde L_N(x)\}\, x^2\, m^2(q(x), y^\ell(x))\, dq(x)\Big\},
\]
with $\tilde L_N(x) = \tilde S(r_{[0,N]} \circ \Gamma_{\log(x_M/x)})$, where $\tilde S : E_{[0,N]} \to \mathcal{P}([0, N])$ is defined by:
\[
\tilde S(\Gamma) = \tau^{-1}\big(S((\tau \otimes \tau)(\Gamma))\big),
\]
and associates to $\Gamma \in E_{[0,N]}$ a selection of representatives of the $\Gamma$-equivalence classes. As a result $J_N = \tilde J_N$, $Q$-a.s., and this proves our claim.

5. Single and Double Replicas Calculations

We want to apply the results of the previous section to investigate the "single replica distribution":
\[
E^{\Phi \circ Q}\Big[\sum_{\ell \ge 0} \eta_\ell\, \delta_{y^\ell(x_M)}\Big], \tag{5.1}
\]
which is a probability on $\mathbb{R}$, as well as the "double replicas distribution":
\[
E^{\Phi \circ Q}\Big[\sum_{\ell, \ell' \ge 0} \eta_\ell\, \eta_{\ell'}\, 1\{X_{\ell,\ell'} \in \cdot\}\, \delta_{y^\ell(x_M)} \otimes \delta_{y^{\ell'}(x_M)}\Big], \tag{5.2}
\]
which is a probability on $[0, x_M] \times \mathbb{R} \times \mathbb{R}$. As a result, when $\psi$ is symmetric non-constant, we shall define a transformation of the function $q(\cdot)$. The equation for fixed points of this transformation in the context of the SK-model is the so-called "consistency equation", see [6], p. 45. It is useful to introduce the time inhomogeneous transition probability $(R_{x_0,x_1})_{0 \le x_0 \le x_1 \le x_M}$ of the solution of the SDE
\[
dy(x) = dM(x) + x\, m(q(x), y(x))\, dq(x) \tag{5.3}
\]
with $M(x) = w(q(x))$, $0 \le x \le x_M$, the time changed motion of the standard Brownian motion $w(\cdot)$. More precisely, we denote by $P^y_{x_0,x_1}$ the law of $(y + w(q(x) - q(x_0)))_{x_0 \le x \le x_1}$, for $y \in \mathbb{R}$, $x_0 \le x_1$ in $[0, x_M]$, and introduce
\[
\begin{aligned}
R_{x_0,x_1} h(y) &= \int h(y(x_1)) \exp\Big\{\int_{x_0}^{x_1} x\, m(q(x), y(x))\, dy(x) - \frac{1}{2} \int_{x_0}^{x_1} x^2\, m^2(q(x), y(x))\, dq(x)\Big\}\, dP^y_{x_0,x_1}(y(\cdot)) \\
&= \int h(y(x_1)) \exp\Big\{x_1 f(q(x_1), y(x_1)) - x_0 f(q(x_0), y(x_0)) - \int_{x_0}^{x_1} f(q(x), y(x))\, dx\Big\}\, dP^y_{x_0,x_1}(y(\cdot)),
\end{aligned} \tag{5.4}
\]
where the second equality follows from Itô's formula and (4.6), as in (4.18). The composition rule $R_{x_0,x_1} R_{x_1,x_2} = R_{x_0,x_2}$, for $0 \le x_0 \le x_1 \le x_2 \le x_M$, is immediate from the second line of (5.4).

Lemma 5.1.
\[
R_{x,x_M}(\partial_y \psi)(\cdot) = \partial_y f(q(x), \cdot) = m(q(x), \cdot), \text{ for } x \in [0, x_M]. \tag{5.5}
\]
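Proof. We introduce a regularization by convolution $f^\epsilon = f * \psi_\epsilon$, and $m^\epsilon = \partial_y(f * \psi_\epsilon) = m * \psi_\epsilon$, where $\psi_\epsilon = \epsilon^{-2}\, \psi(\cdot/\epsilon)$, with $\psi(q, y) \ge 0$ smooth, compactly supported and $\int \psi\, dq\, dy = 1$ (this mollifier on $\mathbb{R}^2$ is not to be confused with the terminal condition $\psi(y)$ in (4.6)). It follows from (4.6) that when $I = [q_0, q_1] \subset (0, q_M)$, for small $\epsilon$:
\[
\partial_q m^\epsilon + \frac{1}{2}\, \partial_y^2 m^\epsilon + H^\epsilon = 0 \text{ in } I \times \mathbb{R}, \text{ with } H^\epsilon = \partial_y\Big[\frac{x(\cdot)}{2}\, (\partial_y f)^2\Big] * \psi_\epsilon = [x(\cdot)\, m\, \partial_y m] * \psi_\epsilon. \tag{5.6}
\]
Let $x_0 = x(q_0)$, $x_1 = x(q_1)$ and $Z_x$, $x \in [x_0, x_M]$, stand for the exponential martingale (under $P^y_{x_0,x_M}$):
\[
Z_x = \exp\Big\{\int_{x_0}^{x} u\, m(q(u), y(u))\, dy(u) - \frac{1}{2} \int_{x_0}^{x} u^2\, m^2(q(u), y(u))\, dq(u)\Big\}.
\]
Observe that by a similar calculation as in the second line of (5.4), $Z_\cdot$ is bounded. It follows from Itô's formula and (5.6) that
\[
m^\epsilon(q(x_1), y(x_1)) = m^\epsilon(q(x_0), y(x_0)) + \int_{x_0}^{x_1} \partial_y m^\epsilon(q(x), y(x))\, dy(x) - \int_{x_0}^{x_1} H^\epsilon(q(x), y(x))\, dq(x), \text{ when } \epsilon \text{ is small}.
\]
Letting $\epsilon$ tend to 0, we find
\[
m(q(x_1), y(x_1)) = m(q(x_0), y(x_0)) + \int_{x_0}^{x_1} \partial_y m(q(x), y(x))\, dN_x, \text{ where } N_x = y(x) - \int_{x_0}^{x} u\, m(q(u), y(u))\, dq(u),\ x \in [x_0, x_1], \tag{5.7}
\]
is a martingale under $Z_{x_1} \cdot P^y_{x_0,x_M}$. If we take expectations of (5.7) with respect to the above probability and let $x_1$ tend to $x_M$ and $x_0$ vary in $(0, x_M]$, we find our claim (5.5).

Theorem 5.2. For $h$ bounded measurable and $x_0 \in [0, x_M]$,
\[
E^{\Phi \circ Q}\Big[\sum_{\ell \ge 0} \eta_\ell\, h(y^\ell(x_M))\Big] = R_{0,x_M} h(0), \tag{5.8}
\]
\[
E^{\Phi \circ Q}\Big[\sum_{\ell, \ell' \ge 0} \eta_\ell\, \eta_{\ell'}\, 1(X_{\ell,\ell'} \ge x_0)\, h(y^\ell(x_M))\, h(y^{\ell'}(x_M))\Big] = \int_{x_0}^{x_M} R_{0,x}(R_{x,x_M} h)^2(0)\, dx + (1 - x_M)\, R_{0,x_M}(h^2)(0). \tag{5.9}
\]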
Proof. As a result of Theorem 4.3, $(\eta_\ell)_{\ell \ge 0}$ and $(y^\ell(\cdot))_{\ell \ge 0}$ are independent under $\Phi \circ Q$, therefore the left member of (5.8) equals:
\[
\sum_{\ell \ge 0} E^{\Phi \circ Q}[\eta_\ell]\, E^{\Phi \circ Q}[h(y^\ell(x_M))],
\]
and using Corollary 4.4 and Lemma 5.1, this equals:
\[
E^Q[h(y^0(x_M))\, J_0] = R_{0,x_M} h(0).
\]
This proves (5.8). By similar considerations, the left-hand member of (5.9) equals:
\[
\sum_{\ell \ne \ell'} E^{P_{x_M}}[\eta_\ell\, \eta_{\ell'}]\, E^Q[1\{X_{0,1} \ge x_0\}\, h(y^0(x_M))\, h(y^1(x_M))\, J_1] + \sum_{\ell \ge 0} E^{P_{x_M}}[\eta_\ell^2]\, E^Q[h^2(y^0(x_M))\, J_0]. \tag{5.10}
\]
In view of (A.4) and (5.4), the last term of (5.10) equals $(1 - x_M)\, R_{0,x_M} h^2(0)$. As for the first term of (5.10), note that $X_{0,1}$ is uniformly distributed on $(0, x_M)$ under $Q$, see (1.30), (3.31), and thus:
\[
\begin{aligned}
E^Q[1\{X_{0,1} \ge x_0\}\, h(y^0(x_M))\, h(y^1(x_M))\, J_1]
&= \frac{1}{x_M} \int_{x_0}^{x_M} dx\, E^Q[h(y^0(x_M))\, h(y^1(x_M))\, J_1 \mid X_{0,1} = x] \\
&= \frac{1}{x_M} \int_{x_0}^{x_M} R_{0,x}(R_{x,x_M} h)^2(0)\, dx,
\end{aligned} \tag{5.11}
\]
by the definition of $J_1$ and (5.4). Since $E^{P_{x_M}}[\sum_{\ell \ne \ell'} \eta_\ell\, \eta_{\ell'}] = x_M$, this concludes the proof of (5.9).
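The two weight constants used in this proof come from (A.4) with $k = 2$: $E[\sum_\ell \eta_\ell^2] = \Gamma(2 - x_M)/(\Gamma(1 - x_M)\Gamma(2)) = 1 - x_M$, and, assuming as the proof does that the weights are normalized so that $\sum_\ell \eta_\ell = 1$, the off-diagonal sum is $1 - (1 - x_M) = x_M$. A minimal numerical check of this arithmetic follows; the function names are hypothetical and the normalization is stated as an assumption in the code.

```python
import math


def moment(x, k):
    """Right-hand side of (A.4): Gamma(k - x) / (Gamma(1 - x) * Gamma(k))."""
    return math.gamma(k - x) / (math.gamma(1.0 - x) * math.gamma(k))


def moment_product(x, k):
    """Equivalent product form (k - 1 - x) ... (1 - x) / (k - 1)! from (A.4)."""
    num = 1.0
    for j in range(1, k):
        num *= (j - x)
    return num / math.factorial(k - 1)


def weight_split(x_m):
    """Return (E[sum eta^2], E[sum_{l != l'} eta_l eta_l']).

    Assumes the weights sum to one, so the two terms add up to 1.
    """
    diagonal = moment(x_m, 2)
    return diagonal, 1.0 - diagonal


if __name__ == "__main__":
    print(weight_split(0.3))
```

With $h \equiv 1$ these two numbers are exactly the coefficients $(1 - x_M)$ and $x_M$ appearing in (5.9) and (5.11).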
Remark 5.3. In the special case $h = \partial_y \psi$, whose special virtue is explained below, Lemma 5.1 and Theorem 5.2 show that:
\[
E^{\Phi \circ Q}\Big[\sum_{\ell \ge 0} \eta_\ell\, \partial_y \psi(y^\ell(x_M))\Big] = m(0, 0), \text{ and} \tag{5.12}
\]
\[
E^{\Phi \circ Q}\Big[\sum_{\ell, \ell' \ge 0} \eta_\ell\, \eta_{\ell'}\, 1\{X_{\ell,\ell'} \ge x_0\}\, \partial_y \psi(y^\ell(x_M))\, \partial_y \psi(y^{\ell'}(x_M))\Big] = \int_{x_0}^{x_M} R_{0,x}(m(q(x), \cdot)^2)(0)\, dx + (1 - x_M)\, R_{0,x_M}((\partial_y \psi)^2)(0). \tag{5.13}
\]
We now specialise to the case where $\psi$ is symmetric and non-constant. As a result $f(q, \cdot)$ and $m(q, \cdot)$ are respectively symmetric and antisymmetric functions, so that $m(q, 0) = 0$. In the context of the SK measure, a (very) non-rigorous cavity calculation in the spirit of Chapter 5 of [6] leads to an "approximate identity":
\[
\int_{x_0}^{1} q^{(n+1)}(x)\, dx \ \text{``}\approx\text{''}\ E\Big[\sum_{\alpha, \alpha'} \eta_\alpha^{(n+1)}\, \eta_{\alpha'}^{(n+1)}\, 1\{q^{(n+1)}_{\alpha,\alpha'} \ge q^{(n+1)}(x_0)\}\, \tanh(\beta y_\alpha^{(n)})\, \tanh(\beta y_{\alpha'}^{(n)})\Big],
\]
where $q^{(n+1)}(\cdot)$ stands for the overlap function on $(n + 1)$ spins, $\eta_\alpha^{(n+1)}$ are the weights of the respective "states" in the decomposition of the SK measure on $(n + 1)$ spins, $q^{(n+1)}_{\alpha,\alpha'}$ are the mutual overlaps, $y_\alpha^{(n)}$ are the respective cavity fields, $E$ denotes the disorder expectation, and $q^{(n+1)}(x_0)$ is smaller than the maximum value of $q^{(n+1)}(\cdot)$. Recall that in this situation $\psi(x)$ should be viewed as $\log(\cosh \beta x)$. In our abstract set up, this suggests defining an "inverse temperature"
\[
\beta = \|\partial_y \psi\|_\infty \in (0, \infty), \tag{5.14}
\]
and a reshuffled function $q^{(R)}(\cdot)$ via:
\[
\begin{aligned}
q^{(R)}(x) &= -\frac{1}{\beta^2}\, \frac{d}{dx_0}\, E^{\Phi \circ Q}\Big[\sum_{\ell, \ell' \ge 0} \eta_\ell\, \eta_{\ell'}\, 1\{X_{\ell,\ell'} \ge x_0\}\, \partial_y \psi(y^\ell(x_M))\, \partial_y \psi(y^{\ell'}(x_M))\Big]\Big|_{x_0 = x \wedge x_M} \\
&\overset{(5.13)}{=} \beta^{-2}\, R_{0, x \wedge x_M}(m(q(x \wedge x_M), \cdot)^2)(0), \quad x \in [0, 1].
\end{aligned} \tag{5.15}
\]
As we shall now see, $q^{(R)}(\cdot)$ fulfills analogous properties to the function $q(\cdot)$.

Theorem 5.4. The function $q^{(R)}(\cdot)$ is continuous increasing, with values in $[0, 1]$, constant on $[x_M, 1]$,
\[
q^{(R)}(0) = 0, \quad q^{(R)}(x_M) = \beta^{-2}\, R_{0,x_M}((\partial_y \psi)^2)(0), \text{ and} \tag{5.16}
\]
\[
{q^{(R)}}'(x) = \beta^{-2}\, R_{0,x}(\partial_y^2 f(q(x), \cdot))^2(0) \cdot q'(x), \text{ for } x \in [0, x_M]. \tag{5.17}
\]
Proof. The formula (5.15) clearly defines a continuous increasing function, constant on $[x_M, 1]$, for which (5.16) holds. The calculation in (5.7) shows that
\[
m(q(x), y(x)) = \int_0^x \partial_y m(q(v), y(v))\, dN_v, \text{ where } N_x = y(x) - \int_0^x v\, m(q(v), y(v))\, dq(v),\ x \in [0, x_M], \tag{5.18}
\]
defines a martingale, with increasing process:
\[
\int_0^x \partial_y m(q(v), y(v))^2\, dq(v),
\]
under the law $\tilde P \overset{\text{def}}{=} Z_{x_M} \cdot P^{y=0}_{x_0=0,\, x_1=x_M}$. Therefore for $0 \le x_0 < x_1 \le x_M$,
\[
\begin{aligned}
q^{(R)}(x_1) - q^{(R)}(x_0) &= \beta^{-2}\, E^{\tilde P}[m^2(q(x_1), y(x_1)) - m^2(q(x_0), y(x_0))] \\
&= \beta^{-2} \int_{x_0}^{x_1} E^{\tilde P}[(\partial_y m)^2(q(x), y(x))]\, dq(x),
\end{aligned} \tag{5.19}
\]
and our claim (5.17) readily follows.
Remark 5.5. The above theorem suggests looking at iterations of the transformation $q(\cdot) \to q^{(R)}(\cdot)$. The fixed point equation $q(\cdot) = q^{(R)}(\cdot)$ essentially corresponds to the self-consistency equation (III.63), p. 45 of Mézard–Parisi–Virasoro [6], for the SK-model. We thus see once again that several quantities related to the physical prediction of the SK-model appear in the context of our abstract cavity method.
Appendix

We shall collect in this appendix some useful results on the laws Px and Px which are defined above (0.2) and (1.3), and are used throughout this article. We recall that $M_p$ is the set of simple pure point Radon measures on $(0, \infty)$, and it is endowed with the topology of vague convergence.

Proposition A.1. For $x \in (0, 1)$, $P_x$ is the weak limit of the laws on $M_p$ of
\[
\sum_{i=1}^{n} \delta_{\exp\{\frac{1}{x}(X_i - \log n)\}}, \text{ where } X_1, \dots, X_n \text{ are standard i.i.d. exponential variables.} \tag{A.1}
\]
For $x_1 \in (0, 1)$ and $x_1 x_2 \in (0, 1)$, $P_{x_1 x_2}$ is the image of $P_{x_1}$ under the map:
\[
m \to \eta^{\frac{1}{x_2}} \circ m, \quad \Big(\text{i.e. } m = \sum \delta_{\eta_\ell} \to \sum \delta_{(\eta_\ell)^{1/x_2}}\Big). \tag{A.2}
\]
\[
E^{P_{x_2}}[|m|^{x_1}] < \infty, \text{ if } 0 < x_1 < x_2 < 1. \tag{A.3}
\]
\[
\text{For } x \in (0, 1),\ k \ge 1, \quad E^{P_x}[\langle m, \eta^k \rangle] = \frac{\Gamma(k - x)}{\Gamma(1 - x)\, \Gamma(k)} = \frac{(k - 1 - x) \cdots (1 - x)}{(k - 1)!}. \tag{A.4}
\]
Proof. For the proof of (A.1), observe that $\sum_{i=1}^{n} \delta_{(X_i - \log n)}$ converges in law to a Poisson point process with intensity $\exp\{-z\}\, dz$. Indeed for $f \in C_c(\mathbb{R})$,
\[
\begin{aligned}
E\Big[\exp\Big\{-\sum_{i=1}^{n} f(X_i - \log n)\Big\}\Big] &= \Big(\int_0^\infty e^{-f(z - \log n) - z}\, dz\Big)^n \\
&= \Big(1 - \int_{-\log n}^\infty (1 - e^{-f(x)})\, \frac{e^{-x}}{n}\, dx\Big)^n \xrightarrow[n \to \infty]{} \exp\Big\{-\int_{\mathbb{R}} (1 - e^{-f(x)})\, e^{-x}\, dx\Big\}.
\end{aligned}
\]
Our claim (A.1) now follows once we notice that the image of the Poisson law with intensity $e^{-z}\, dz$ on $\mathbb{R}$ under the continuous map $m \in M_p(\mathbb{R}) \to \exp\{\frac{\cdot}{x}\} \circ m \in M_p$ is the Poisson law with intensity $\exp\{\frac{\cdot}{x}\} \circ (e^{-z}\, dz) = x\, \eta^{-x-1}\, d\eta$. As for (A.2), it is an immediate consequence of (A.1). Finally (A.3) and (A.4) can be found in Corollary 2.2 of Ruelle [10].

We also want to look at the situation where $P$ is the law of $m(d\eta\, dy)$, a Poisson point process on $(0, \infty)$ with marks in $E$, a Polish space, with intensity $x\, \eta^{-x-1}\, d\eta \otimes d\mu$, where $\mu$ is a probability on $E$. We consider a positive measurable function $g$ on $E$, such that
\[
\int_E g(y)^x\, d\mu(y) < \infty. \tag{A.5}
\]
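We can then define a new random pure point measure on $(0, \infty) \times E$ through the formula
\[
m(d\eta\, dy) \longrightarrow \tilde m(d\eta\, dy) : \quad \int_{(0,\infty) \times E} f\, d\tilde m = \int_{(0,\infty) \times E} f(\eta g(y), y)\, dm, \tag{A.6}
\]
for $f \ge 0$ measurable.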
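The change of variables behind (A.1) can be checked directly on tail counts. For $X_i$ standard exponential, the expected number of points of $\sum_i \delta_{\exp\{(X_i - \log n)/x\}}$ above a level $a$ is $n\, P(X_1 > \log n + x \log a)$, which equals $a^{-x}$ as soon as $\log n + x \log a \ge 0$, and $a^{-x}$ is exactly the mass that the intensity $x\, \eta^{-x-1}\, d\eta$ gives to $(a, \infty)$. The sketch below only illustrates this computation; the function names are hypothetical.

```python
import math


def expected_points_above(n, x, a):
    """E[#{i <= n : exp((X_i - log n) / x) > a}] for X_i i.i.d. Exp(1)."""
    threshold = math.log(n) + x * math.log(a)  # the event is {X_i > threshold}
    return n * math.exp(-max(threshold, 0.0))


def poisson_tail(x, a):
    """Mass of (a, infinity) under the intensity x * eta^(-x-1) d(eta)."""
    return a ** (-x)


if __name__ == "__main__":
    for a in (0.01, 1.0, 2.0):
        print(a, expected_points_above(1000, 0.5, a), poisson_tail(0.5, a))
```

For a fixed level $a$ the two quantities agree for every $n \ge a^{-x}$, so the first-moment measures already coincide before passing to the limit; the Laplace functional computation in the proof upgrades this to convergence in law.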
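Proposition A.2. $\tilde m$ is a Poisson point process on $(0, \infty) \times E$ with intensity
\[
x\, \eta^{-x-1}\, d\eta \otimes g(y)^x\, d\mu(y). \tag{A.7}
\]
Proof. For $f \ge 0$ measurable on $(0, \infty) \times E$:
\[
\begin{aligned}
E^P[\exp\{-\langle \tilde m, f \rangle\}] &= E^P\Big[\exp\Big\{-\int_{(0,\infty) \times E} f(\eta g(y), y)\, dm\Big\}\Big] \\
&= \exp\Big\{-\int_{(0,\infty) \times E} (1 - e^{-f(\eta g(y), y)})\, x\, \eta^{-x-1}\, d\eta\, d\mu(y)\Big\} \\
&= \exp\Big\{-\int_{(0,\infty) \times E} (1 - e^{-f})\, x\, \eta^{-x-1}\, d\eta\, g(y)^x\, d\mu(y)\Big\}.
\end{aligned}
\]
This proves our claim.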
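The last equality in this proof is the scaling $\eta \to \eta g(y)$ under the measure $x\, \eta^{-x-1}\, d\eta$, which multiplies the intensity by $g(y)^x$. As an illustrative check, for a fixed mark with $g(y) = g$ the mass of $\{\eta : \eta g > a\}$ should be $g^x a^{-x}$. The sketch below compares this with a direct midpoint quadrature in the variable $t = \log \eta$; the names and the truncation level `t_max` are hypothetical choices, adequate only when $x \cdot$ `t_max` is large.

```python
import math


def tail_by_quadrature(x, lower, steps=200_000, t_max=80.0):
    """Midpoint rule for int_lower^inf x * eta^(-x-1) d(eta), with eta = e^t.

    The integral is truncated at t = t_max, so the result is only accurate
    when exp(-x * t_max) is negligible.
    """
    t0 = math.log(lower)
    h = (t_max - t0) / steps
    total = 0.0
    for i in range(steps):
        t = t0 + (i + 0.5) * h
        total += x * math.exp(-x * t)
    return total * h


def scaled_tail(x, g, a):
    """g^x * a^(-x): the tail mass predicted by the intensity (A.7)."""
    return g ** x * a ** (-x)


if __name__ == "__main__":
    x, g, a = 0.5, 2.0, 1.0
    # Points with eta * g > a are the points with eta > a / g.
    print(tail_by_quadrature(x, a / g), scaled_tail(x, g, a))
```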
References

1. Kingman, J.F.C.: The Coalescent. Stochastic Processes and their Appl. 13, 249–261 (1982)
2. Kingman, J.F.C.: On the genealogy of large populations. In: Essays in statistical sciences. J. Gani, E.J. Hannan editors, J. Appl. Probab. special volume 19A, 27–43, 1982, Applied Probability Trust
3. Krylov, N.V.: Lectures on elliptic and parabolic equations in Hölder spaces. Graduate Studies in Mathematics, Providence, RI: A.M.S. Vol. 12, 1996
4. Mezard, M., Virasoro, M.A.: The microstructure of ultrametricity. J. Physique 46, 1293–1307 (1983)
5. Mezard, M., Parisi, G., Virasoro, M.A.: SK model: The replica solution without replicas. Europhysics Lett. 1 (2), 77–82 (1986)
6. Mezard, M., Parisi, G., Virasoro, M.A.: Spin glass theory and beyond. Singapore: World Scientific, 1987
7. Neveu, J.: A continuous state branching process in relation with the GREM model of spin glasses theory. Rapport interne no. 267, Ecole Polytechnique, Juillet 1992
8. Parisi, G.: A sequence of approximated solutions of the SK model for spin glasses. J. Phys. A: Math. Gen. 13, L115–L121 (1980)
9. Pitman, J.: Coalescents with multiple collisions. Preprint
10. Ruelle, D.: A mathematical reformulation of Derrida's REM and GREM. Commun. Math. Phys. 108, 225–239 (1987)

Communicated by J. L. Lebowitz
Commun. Math. Phys. 197, 277 – 301 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
How High-Dimensional Stadia Look Like
Leonid A. Bunimovich (1), Jan Rehacek (2)
(1) School of Mathematics, Georgia Institute of Technology, Atlanta, GA 30332, USA
(2) Center for Nonlinear Systems, MS B-258, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
Received: 1 June 1997 / Accepted: 3 March 1998
Abstract: We give the affirmative answer to the long-standing question whether or not the mechanism of defocusing can produce a chaotic behavior in high-dimensional Hamiltonian systems. To do this we prove that billiards in a class of regions in Rn , n > 2, with focusing and flat boundary components have nonvanishing Lyapunov exponents. 0. Introduction Systems of a billiard type play a special role in the ergodic theory of dynamical systems. On the one hand, billiards correspond to the first models that inspired Boltzmann’s and Gibbs’ creation of the ergodic theory itself. Indeed, the celebrated Boltzmann ergodic hypothesis deals with the gas of hard spheres, i.e. with a billiard. On the other hand, billiards provide the most visible models of nonuniformly hyperbolic dynamical systems. The first rigorously studied classical dynamical systems with strongly chaotic behavior (that was until recently referred to as a stochastic behavior) were the Hamiltonian systems generated by the geodesic flows on surfaces of a constant negative curvature. Hadamard, Hedlund and Hopf [H, He, Ho1] were responsible for this breakthrough. Anosov [A] and Smale [Sm] essentially generalized these ideas and introduced much more general classes of hyperbolic dynamical systems, Anosov and Axiom A systems respectively, whose dynamics essentially resembles the one in geodesic flows on surfaces of negative curvature. However, all of these systems are smooth while the classical models (gas of hard spheres) were nonsmooth. Krylov pointed out [K] that a motion of molecules (hard spheres) in the gas of hard spheres resembles very much a dynamics of geodesic flows on surfaces of negative curvature. However, his claim, while being very deep and made well ahead of his time, was extremely vague, and formally it is not proved yet whether or not it was correct. 
Certainly, everybody believes that it was correct, but the Boltzmann hypothesis has not been proven yet, even though the recent breakthrough [KSS1, KSS2] made it possible to prove it for three particles.
It was Sinai [S1] who started and developed the theory of hyperbolic billiards that is at the heart of the modern theory of nonuniformly hyperbolic dynamical systems. He introduced the class of billiards with a smooth boundary that is convex inwards. (Such billiards are now called Sinai billiards.) Sinai billiards form the class of the "best" nonuniformly hyperbolic systems. However, already in this situation one encounters a series of very essential difficulties, both technical and principal ones. (As an example of such difficulties, Hopf's brilliant idea for proving ergodicity of smooth hyperbolic systems does not work for systems with singularities. Sinai's "main" or "fundamental" lemma demonstrated how one can get around this.) One of the applications of Sinai's theory was the proof of Boltzmann's hypothesis for a gas that contains only two particles (two hard disks). While this is a very special case, it is difficult to overestimate the influence of this result on the physics community. The striking and "unbelievable" fact that the ergodic hypothesis holds for a system of only two particles, while Boltzmann's idea always related ergodicity of a system to an extremely large number of particles (degrees of freedom), forced a rethinking of some basic "physical" philosophy. Sinai's result has changed the "view of the universe" for several generations of physicists and was the beginning of the triumphal penetration into physics of the ideas of the modern theory of dynamical systems. On the other hand, the classical integrable geodesic flows on surfaces of positive curvature, together with the classical examples of integrable focusing billiards (circles and ellipses), seemed to demonstrate that, in contrast to dispersing behavior, focusing behavior (produced by positive curvature) would always help to stabilize dynamics. This intuition was strongly held by mathematicians as well as physicists.
Therefore the discovery of the chaotic focusing billiards made in [B2] had a strong impact on both communities. It demonstrated that there is another mechanism besides dispersing that produces a chaotic motion, i.e. the mechanism of defocusing. It is worthwhile to mention that, as is often the case, this discovery of defocusing was made accidentally. The general philosophy in mathematics, as well as in physics, is that generically strong (robust) phenomena cannot be destroyed by small perturbations. For instance, this kind of idea has been expressed by Hopf [Ho2] who conjectured that the presence of relatively small pieces of positive curvature may not destroy stochasticity of geodesic flows on surfaces of negative curvature. Sinai suggested to the first author to look at the perturbations of dispersing billiards by small focusing components inserted into a boundary. The corresponding result was proven in [B1]. Indeed, under some simple geometric conditions stochasticity of dispersing billiards will be preserved. However, these conditions imply that the dispersing boundary can not only be changed onto focusing one at some relatively small pieces, but completely removed. The first class of such regions (containing a surprisingly popular “stadium”) was introduced in [B2]. Then more general two dimensional focusing billiards with chaotic behavior were studied in [B2-4],[D3],[M1,2],[W2]. The same idea has been applied to the construction of ergodic geodesic flows on a two-dimensional sphere and on a two-dimensional torus [D1,2, B-G, O]. All these examples studied by these authors were two-dimensional ones. In the paper [B-G2], geodesic flows were constructed on high-dimensional manifolds that are ergodic thanks to the mechanism of defocusing. However, the manifolds studied in this paper were products of lower dimensional manifolds, where the focusing is taking place again in a two-dimensional subspace of the n-dimensional tangent space. 
The principal question, that was around since [B2], remained open: Does the mechanism of defocusing also generate chaotic billiards in higher dimensions? At first sight, the answer to this question should be negative. The reason for that is the well known phenomenon in optics, which is called an astigmatism.
Astigmatism means that focusing in higher-dimensional spaces (d ≥ 3) is not the same in different two-dimensional planes. However, Coddington's formula (see formula (1.2) below) suggests that for small pieces of spheres the situation is still not completely hopeless (Coddington's formula was apparently first published in 1829, see [C], p. 66). This formula suggests that the internal angle of a piece of a sphere should be not more than 90° to ensure defocusing. It has been claimed in [B3] that some ergodic nowhere dispersing billiards exist in 3D. However, ergodicity requires more conditions than just defocusing, which only ensures nonzero Lyapunov exponents. That was the reason why the pieces of 3D spheres considered in [B3] were even smaller than the spherical caps with the internal angle of 90°. The ergodicity of some classes of nowhere dispersing billiards has been established in the very recent paper [B-R2]. However, the pieces of spheres considered in that paper are even smaller than in [B3]. Actually, the spherical caps used in [B-R2] are inscribed into the pieces of spheres used in [B3]. An important contribution towards a negative answer to the question whether or not the mechanism of defocusing has higher dimensional counterparts was made by Wojtkowski [W3]. He constructed some nowhere dispersing billiards in 3D, with boundaries containing semi-spheres arbitrarily far apart, which have linearly stable periodic orbits. In that paper Coddington's formula, which is fundamental in geometric optics, was derived independently. However, there do exist chaotic nowhere dispersing billiards in higher (d ≥ 3) dimensions. In the recent paper [B-R1] the affirmative answer to this question was obtained in dimension three. The method of proof in that paper essentially used 3D geometry. So the general question has just been shifted by one more dimension.
In the given paper we develop the general approach that allows one to prove the existence of nowhere dispersing billiards with nonvanishing Lyapunov exponents in any dimension. It is important to mention that we study the class of billiards that is the natural highdimensional analog of the class of billiards introduced in the first paper [B2] on the new mechanism of chaos in Hamiltonian systems. In particular, our class contains highdimensional stadia. The structure of the paper is as follows. In the first section, we introduce the regions in question and review the necessary background from the theory of billiards. In Sect. 2 we state some technical lemmas which are necessary to control the curvature evolution during one series of reflections in a spherical cap. The third section contains the proof of the main result.
1. Description of the Model and the Main Result We will study billiards in some class of n-dimensional regions (n ≥ 2) described below (an example of a typical region is in Fig. 1). The boundary of the region consists of flat walls and spherical caps attached to them. We will consider only focusing caps whose angle ω 00 < 90◦ . By the angle of the spherical cap we mean the maximum angle subtended by the spherical cap at the center of the sphere. Our aim is to show that by restricting the angular size of the spherical caps, one can achieve focusing that is strong enough to obtain overall divergence of nearby trajectories. Consider an evolution of an (n − 1)-dimensional infinitesimal control surface γ (also called a wavefront) of class C 2 perpendicular to the orbit. The rate at which the neighboring trajectories diverge is measured by the curvature operator (the operator of the second fundamental form) of the surface γ. To demonstrate that the defocusing
[Fig. 1. Billiard trajectory]
mechanism works, we show that if the surface γ approaches a spherical cap with the positively defined curvature operator, then the whole billiard region can be configured in such a way, that after passing through the spherical cap, the surface γ will focus “relatively soon” (and thus the curvature operator becomes positively defined again). This property ensures that the mechanism of defocusing discovered in [B2] for 2-D billiard works also for some n-D regions. Since very often we will be working with the n − 1-dimensional hypersurface γ, in the rest of the paper we will adopt the convention m = n − 1. We will mainly use the notation from [B-R1]. Let Q ⊂ Rn be a region described above and equipped with a field of inward normal vectors n(q). Further, let M’ be a restriction to Q of the unit tangent bundle of Rn . Points in M’ have the form x = (q, v), where q ∈ Q is the support of x and v ∈ Tq (Rn ). By an n-D billiard we mean a dynamical system in M’, generated by the motion of x ∈ M 0 along a straight line determined by v with unit speed. When this line reaches the boundary of Q it is reflected according to the rule “the angle of reflection is equal to the angle of incidence”. The angle φ is measured with respect to the normal n(q). This motion generates a flow on the phase space which we shall denote by St . As usual, this flow generates the discrete dynamical system with the induced mapping T obtained by the restriction of St to the boundary (see Fig. 1) M = ∂Q × S n . The n-D billiard mapping T preserves the measure dµ(q, v) = const. (v, n(q)).dq.dω, where dq is the (n-1)-dimensional Lebesgue measure on the boundary of Q generated by a volume and dω is the (n-1)-dimensional (i.e. m-dimensional) Lebesgue measure on the unit sphere. The const is the usual normalizing constant so that µ(M ) = 1. Strictly
How High-Dimensional Stadia Look Like
speaking, the billiard mapping is defined only for a subset (of full measure) of M. Since detailed information about the singularity set is not necessary for our purposes, we will still call this subset M. The dynamics in the vicinity of a billiard orbit is described by the second fundamental form of the control surface orthogonal to the flow (we will call its matrix the curvature operator), whose value we usually denote by (Gx, x). It is necessary to understand how this form changes upon reflection and how it evolves during the free path. Let n(q) be a unit normal vector at the point of reflection and let v_+ and v_− be the unit vectors along the directions of the outgoing and incoming billiard orbit respectively. Then v_+ = v_− − 2(n·v_−)n, and the vectors v_+ and v_− span a plane P in which the whole piece of the orbit lies as long as it does not leave the sphere S. This plane is unique and contains any two points of reflection in a series of reflections from the spherical cap along with the center of the corresponding sphere. The plane P naturally splits the tangent space U of the control surface into two subspaces: U = U_p ⊕ U_t, where U_p = U ∩ P and U_t is the (m − 1)-dimensional orthogonal complement to U_p in U. The space U thus has two distinguished parts, which will be referred to as the planar direction (since it is 1-dimensional) and the transversal (or orthogonal) subspace. To obtain the formula for the reflection, we have to introduce some auxiliary operators (see [S2]), which make up for the fact that the curvature operators before and after reflection, as well as the curvature operator of the boundary at the point of reflection, act on different planes (T_− and T_+ for the first two). Let V be the isometric operator which maps T_− onto T_+ in a direction parallel to the vector n(q) normal to the boundary at q. This operator hence realizes the necessary rotation of the tangent plane so that it is still perpendicular to the direction of the reflected velocity.
Similarly, let W be the operator which maps T_+ onto T0 in a direction parallel to the vector v_+ and let W* be the adjoint operator which maps the plane T0 onto T_+. The operator W thus transforms the curvature operator of the boundary onto the tangent plane to the orbit, where it can be added to the curvature operator of the control surface. The explicit formula for this addition is G_+ = V^{-1} G_− V − 2(v_+, n) W* K W, where K is the second fundamental form of the boundary at the point of reflection q. To get an idea of what happens to a control hypersurface γ upon reflection, let us consider the situation in Fig. 2. Here we fix one particular unit direction d in the tangent space U. The value of the curvature of γ in this direction is (G_− d, d). To obtain the same directional curvature of γ after the reflection, that is (G_+ d, d), we write d = d_p cos(θ) + d_t sin(θ), where θ is the angle between d and U_p and d_p, d_t are normalized projections of d onto U_p and U_t respectively. Then the change in the curvature is
$$\kappa_+ = \kappa_- - \cos^2(\theta)\,\frac{2}{r\cos(\phi)} - \sin^2(\theta)\,\frac{2\cos(\phi)}{r},$$
where r is the radius of the reflecting sphere and φ is the angle of reflection. As a special case of this formula we see that the curvature in the planar direction (θ = 0) changes just like in the planar case,
$$\kappa_+ = \kappa_- - \frac{2k}{\cos(\phi)}, \tag{1.1}$$
while in the orthogonal subspace (θ = 90°) it obeys
$$\kappa_+ = \kappa_- - 2k\cos(\phi), \tag{1.2}$$
where k = 1/r is the curvature of the reflecting sphere.
Fig. 2. Decomposition of the tangent space
As for the evolution of the curvature operator during the free path, we notice that the principal curvature directions are preserved and the principal curvatures (eigenvalues of the curvature operator) evolve according to the standard “reciprocal” rule,
$$\kappa(t) = \frac{\kappa(0)}{1 + \kappa(0)\,t}, \tag{1.3}$$
where t is the time elapsed from, say, x to y. In terms of the quadratic forms we obtain
$$G_t(y) = G_0(x)\,\bigl(I + t\,G_0(x)\bigr)^{-1}, \tag{1.4}$$
where G_t(y) is the second fundamental form acting on the hyperplane U(y) and G_0(x) is the second fundamental form operator acting on the hyperplane U(x) (perpendicular to the direction of the motion).
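The reflection rule and the free-path rule (1.3) can be sketched numerically as follows. This is only an illustrative sketch: the function names `reflect_curvature` and `free_path` are hypothetical, and angles are taken in radians rather than degrees.

```python
import math

def reflect_curvature(kappa, theta, phi, r):
    """Directional curvature just after a reflection from a focusing sphere.

    kappa: directional curvature just before the reflection
    theta: angle between the direction d and the planar direction U_p
    phi:   angle of reflection, measured from the normal
    r:     radius of the reflecting sphere (k = 1/r)
    """
    planar_drop = 2.0 / (r * math.cos(phi))        # theta = 0, formula (1.1)
    transversal_drop = 2.0 * math.cos(phi) / r     # theta = pi/2, formula (1.2)
    return (kappa
            - math.cos(theta) ** 2 * planar_drop
            - math.sin(theta) ** 2 * transversal_drop)

def free_path(kappa, t):
    """Principal curvature after a free flight of duration t, formula (1.3)."""
    return kappa / (1.0 + kappa * t)
```

For instance, a flat wavefront (κ = 0) hitting a unit sphere at φ = 60° acquires curvature −4 in the planar direction and −1 in the transversal direction; after a further free path of length 1/2 the planar curvature has passed through a focus and equals +4, which is the defocusing mechanism in its simplest form.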
Now we formulate our main theorem.
Theorem 1. Let the boundary ∂Q contain only spherical caps and flat components. Suppose that the angles of all spherical caps are smaller than 90° and that almost every trajectory enters some spherical cap. Then the billiard in Q has non-vanishing Lyapunov exponents, provided that each spherical cap is enclosed in a subregion of Q which has walls perpendicular to the one containing the cap. The size of this subregion should be such that the center of the sphere, whose cap is attached to it, is inside this subregion.
2. Technical Results
In this section we will prove a series of propositions which we will need in the last section to prove the theorem about non-vanishing of Lyapunov exponents. Recall that an infinitesimal hypersurface perpendicular to the orbit is determined by the curvature operator (the second fundamental form), acting on the tangent space U to this hypersurface. This operator can be regarded as a symmetric m × m matrix. In order to study the dynamics in the vicinity of the given orbit we first state several definitions.
Definition 1. The effective angle of a billiard orbit in a spherical cap is the angle of the circular arc that the plane of the orbit determines.
See Fig. 3 for an illustration. The maximal angle of the spherical cap will be denoted by ω″, while the effective angle of a particular orbit will be denoted by ω′. The letter ω will be reserved for the angle between the midpoints of the first and last chords of a given orbit.
Fig. 3. Reflections in a sphere
Definition 2. Two quadratic forms S and S′ are said to have substantial tangency if there exists an (m − 1)-dimensional subspace U′ ⊂ U such that (Sv, v) = (S′v, v) for all v ∈ U′. Two infinitesimal surfaces perpendicular to the billiard orbit are said to have substantial tangency if their corresponding curvature operators (second fundamental forms) have the above property.
Definition 3. A control hypersurface γ will be called aligned if one of its principal curvature directions coincides with the planar direction U_p ⊂ U in the tangent space to γ.
Definition 4. A “zone of focusing” of a given spherical cap is a part of the billiard region which is bounded by the spherical cap, by the flat component to which the cap is attached, by flat components perpendicular to the one with the spherical cap and finally by a transparent virtual flat component, which is parallel to the one with the spherical cap and at the distance R from it. This “bottom wall” will usually be designated by W and the number R = R(ω″, ρ) will be called the size of a zone of focusing. In this paper, the size of the zone of focusing will be such that the center C of the sphere lies in the bottom wall W. Hence R(ω″, ρ) = ρ cos(ω″/2). (In Fig. 4, the “top” wall is denoted by T and the “bottom” wall by W.)
Definition 5. The principal curvatures of the surface γ at the point of entrance to the zone of focusing will be called entrance curvatures, while the principal curvatures at the point of exit from the zone will be called exit curvatures. Both entrance and exit are through the transparent “bottom wall” W.
Remark 1. There are two difficulties in the proof of Theorem 1. One is caused by the fact that in the transversal direction the focusing is much weaker than in the planar direction and orbits must be given sufficient time to defocus. 
The second one stems from the fact that computation of curvatures of a control surface is feasible only in the case when one of the principal curvature directions coincides with the planar direction or when the control surface is a piece of a sphere. In this paper, we show that the dynamics of an arbitrary control surface is determined (at least as far as the focusing properties are concerned) by the aligned surfaces. In this sense the behavior of the m-dimensional control surface can be studied by investigating the behavior of m 1-dimensional control curves separately. The same conclusion can also be inferred from the general formalism developed in [L-W]. However, our approach is less abstract and provides a tool to study the quantitative behavior of trajectories in the vicinity of the given orbit. We prove four propositions in this section that deal with the evolution of the curvature operator of the control surface γ. In the first proposition, we deal with the dynamics in the planar direction and in the orthogonal subspace separately. We show that an infinitesimal beam of trajectories which enters the “focusing zone” as diverging leaves the zone again as diverging. For this we will need the perpendicular walls, since the plane of the orbit may cut the spherical cap in a very small angle, which may lead to weak focusing. The perpendicular walls then ensure that the orbit is given a sufficiently long time to defocus. In Proposition 2, we show that a general surface γ can be approximated by two aligned surfaces which have substantial tangency with γ, i.e. they share the directional
Fig. 4. Zone of focusing (labels: center C, top wall T, bottom wall W, cap angle ω″, radius ρ, size R(ω″, ρ), region Q)
curvature values along an (m − 1)-dimensional subspace. In Proposition 3, we prove that the substantial tangency is preserved during the whole series of consecutive reflections. Finally, in Proposition 4, we use the substantial tangency to control the exit curvatures of the general surface γ by the exit curvatures of the surfaces constructed in Proposition 2.
Proposition 1. Consider a billiard orbit having N consecutive reflections in a spherical cap, whose (maximal) angle ω″ < 90°. Let a zone of focusing around this spherical cap be given, with the “bottom” wall W at the distance of the radius of the sphere. Then every incoming control surface γ, whose curvature operator is aligned and positive semidefinite at the moment of crossing the “transparent” wall W, leaves the zone of focusing as diverging in all directions, i.e. its curvature operator at the moment the surface leaves through the wall W is positive definite.
Proof. First, it is clear from similarity arguments that it is enough to consider only a spherical cap with the radius 1. In the case of a radius ρ the size of the zone of focusing would be adjusted accordingly and the whole discussion would proceed without changes. Second, since the “spherical” part of the orbit lies in the same plane, we can restrict ourselves to this plane (in Fig. 3 it is the plane AOB, which also contains the points P, Q, p and q). If we follow the incoming orbit beyond A and the outgoing orbit beyond the point B, after some time and possibly after some reflections from the surrounding flat walls, both the incoming and the outgoing orbit will intersect the “bottom” wall W at points A″ and B″ respectively. Consider now the plane AOB only. The cap is represented by a circular arc of an angle ω′ (the effective angle), while the “bottom” W of the focusing zone corresponds to a line through the center O of the sphere and parallel to the line representing the top of the zone of focusing.
We again extend the incoming and outgoing orbits beyond A and B (this time not considering any reflections from the surrounding flat walls). These orbits intersect the “bottom” line at points A′ and B′ respectively. We claim that the total length of the free path from A to A″ is the same as from A to A′ and likewise for B. Denote the wall to which the cap is attached by T and its normal vector by n(T). Since the surrounding walls are perpendicular to the wall T, reflections from them do not change the component of the incoming (outgoing) velocity in the direction n(T). Since the time it takes the billiard particle to traverse to the bottom wall W depends only on this component, we could just as well let the billiard particle go through the surrounding walls without any reflections at all. But that exactly corresponds to the free path from A to A′. This is the only part of the proof where we need the boundary components adjacent to T to be perpendicular to it. In the rest of the proof we will thus consider only the situation in the plane AOB. We begin by stating a lemma which describes what happens to the curvature of the control surface in the planar and transversal directions.
Lemma 1. Suppose that the planar and transversal directions are invariant during a series of N reflections in the spherical cap of radius ρ = 1 and maximal angle ω″ < 90°. Let κ be the curvature of the control curve in the middle of the first chord (point A) and κ′ the curvature in the middle of the last chord (point B). Then the following formulas hold:
$$\kappa' = \frac{2N}{\rho\cos\phi} + \kappa \tag{2.1}$$
(in the planar direction),
$$\kappa' = \frac{-\dfrac{\sin\omega}{\rho\sin\phi} + \kappa\cos\omega}{\cos\omega + \kappa\rho\sin\omega\sin\phi} \tag{2.2}$$
(in the transversal direction).
Proof. We prove only the second part of the lemma. The first one is a well known fact about planar billiards and can be verified using formulas (1.1) and (1.3). For the proof of the second part we use induction on the number of reflections. Recall that we set ρ = 1 and define
$$\omega_N = N(180^\circ - 2\phi), \tag{2.3}$$
$$\Delta\omega = 180^\circ - 2\phi. \tag{2.4}$$
The curvature evolution in the orthogonal subspace is governed by (1.2) and (1.3). First, we will derive a formula for the curvature change from the center of one chord to the next one. Since any chord during one series of reflections has the length 2 cos φ, we will need to combine formula (1.3) with t = cos φ with (1.2) and then with (1.3) again. If we denote the curvature at the beginning of this process by κ then for the resulting curvature we get
$$\kappa' = \frac{\kappa - 2\cos\phi - 2\kappa\cos^2\phi}{1 + 2\kappa\cos\phi - 2\cos^2\phi - 2\kappa\cos^3\phi},$$
which can be simplified to
$$\kappa' = \frac{-2\cos\phi - \kappa\cos 2\phi}{-\cos 2\phi + 2\kappa\cos\phi\sin^2\phi},$$
and further to
$$\kappa' = \frac{-\dfrac{\sin 2\phi}{\sin\phi} - \kappa\cos 2\phi}{-\cos 2\phi + \kappa\sin 2\phi\sin\phi}.$$
From this formula it is clear that it is advantageous to do the computation with the rescaled curvatures
$$\tilde\kappa = \kappa\sin\phi. \tag{2.5}$$
With this convention and using (2.4) we finally obtain
$$\tilde\kappa' = \frac{-\sin\Delta\omega + \tilde\kappa\cos\Delta\omega}{\cos\Delta\omega + \tilde\kappa\sin\Delta\omega}, \tag{2.6}$$
which establishes (2.2) for 1 reflection. We now assume the validity of (2.2) for N reflections,
$$\tilde\kappa' = \frac{-\sin\omega_N + \tilde\kappa\cos\omega_N}{\cos\omega_N + \tilde\kappa\sin\omega_N}.$$
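As a numerical sanity check of both parts of Lemma 1 (not part of the original argument), one can iterate the chord-to-chord step built from (1.1)–(1.3) and compare it with the closed forms (2.1) and (2.2). The sketch below assumes ρ = 1, works in radians, and uses hypothetical helper names.

```python
import math

def free_path(kappa, t):
    # Formula (1.3): curvature after a free flight of duration t.
    return kappa / (1.0 + kappa * t)

def chord_to_chord(kappa, phi, planar):
    """Mid-chord to next mid-chord in a unit sphere:
    half a chord of free flight, one reflection, half a chord again."""
    half = math.cos(phi)                  # half of the chord length 2 cos(phi)
    if planar:
        drop = 2.0 / math.cos(phi)        # formula (1.1) with k = 1
    else:
        drop = 2.0 * math.cos(phi)        # formula (1.2) with k = 1
    return free_path(free_path(kappa, half) - drop, half)

def transversal_closed_form(kappa, phi, n):
    """Right-hand side of (2.2) for n reflections, rho = 1."""
    w = n * (math.pi - 2.0 * phi)         # omega_N from (2.3), in radians
    s = math.sin(phi)
    return ((-math.sin(w) / s + kappa * math.cos(w))
            / (math.cos(w) + kappa * math.sin(w) * s))

def planar_closed_form(kappa, phi, n):
    """Right-hand side of (2.1) for n reflections, rho = 1."""
    return 2.0 * n / math.cos(phi) + kappa
```

Iterating `chord_to_chord` N times from the same starting curvature should reproduce the corresponding closed form up to rounding, for any φ for which no intermediate denominator vanishes.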
Plugging this expression into (2.6) yields a similar expression with the angle ω_N + Δω = ω_{N+1}. The formula (2.2) for N + 1 reflections is now obtained by returning to the nonrescaled curvatures using (2.5).
Remark 2. The formula (2.2) shows that in the transversal direction, the evolution of infinitesimal orbit variations (the Jacobi fields) is essentially a rotation by the angle ω. The formulas at the end of the proof above then correspond to the composition of such rotations.
End of the proof of Proposition 1. Working in the plane AOB, we denote the angle AOB again by ω, and the angle A′OA by σ. The angle σ ranges from −90° to 90°, while
$$\omega + \sigma > 90^\circ, \tag{2.7}$$
so that the outgoing ray approaches the “bottom” wall W. The dynamics in the planar direction has been thoroughly studied and from (2.1) the statement of the proposition easily follows. The only possible complication would occur in the case of a nearly normal hit of the edge of the spherical cap. In this case the point A′ would lie “beneath” the wall W and so the curvature at A might be negative and, according to (2.1), so might be the curvature at B. Denote by α the angle BB′O. Then the angle AA′O is α + 2φ, with 0 < α + 2φ < 90°. If the beam of trajectories that focused at A′ is to focus again before B′, then
$$-\frac{\tan(\alpha + 2\phi)}{\sin\phi} + \frac{2}{\cos\phi} \le -\frac{\tan\alpha}{\sin\phi},$$
where the left-hand side is the curvature at B of the family of trajectories focusing at A′ and the right-hand side is the curvature of a family that would focus at B′. This inequality may be rewritten as tan α + 2 tan φ ≤ tan(α + 2φ), and is deduced from the convexity of the function tan on (0, 90°). Thus even in this case the free path between B and B′ will make the curvature at B′ non-negative. The dynamics in the transversal subspace U_t is uniform in the sense that all the principal curvatures with eigenvectors in this subspace are subject to the same curvature drop (1.2) after the reflection. Hence, we can consider the evolution of these curvatures
simultaneously and use the formula (2.2) for all of them. Our task is to show that if the curvature (in the transversal direction) at A′ is non-negative, so is the curvature at B′. Since (2.2) is a linear fractional transformation (LFT) in κ with positive derivative and since the free path (from A′ to A and from B to B′) is, by (1.3), also an LFT with positive derivative, we infer that the relation between the curvatures at A′ and B′ is also an LFT with positive derivative. Thus it is enough to show that the image of zero curvature (at A′) is a positive curvature (at B′) and the image of infinity is infinity. Then we will have an increasing LFT with a positive value at 0 and ∞ at ∞, and from this we conclude that the curvature at B′ is non-negative. So let us assume that at A′ we have κ = 0; the same curvature is then at A. Thus at B (according to (2.2)) we obtain κ = −tan ω / sin φ. The length of the free path between A′ and A is
$$t = \tan\sigma\,\sin\phi, \tag{2.8}$$
and between B and B′,
$$t = -\tan(\omega + \sigma)\,\sin\phi. \tag{2.9}$$
Thus for the curvature at B′ we get
$$\kappa = \frac{-1}{\sin\phi\,\bigl(\cot\omega + \tan(\sigma + \omega)\bigr)},$$
which, in view of (2.7), is positive. Now assume that at A′ the infinitesimal beam of trajectories focuses (i.e. κ = ∞). Then the curvature at A is κ = 1/(tan σ sin φ) and at B we get from (2.2),
$$\kappa = \frac{1}{\sin\phi}\,\frac{-\sin\omega + \dfrac{\cos\omega}{\tan\sigma}}{\cos\omega + \dfrac{\sin\omega}{\tan\sigma}} = \frac{1}{\sin\phi}\,\frac{1 - \tan\sigma\tan\omega}{\tan\sigma + \tan\omega} = \frac{1}{\sin\phi}\,\frac{1}{\tan(\omega + \sigma)}.$$
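The two limiting cases can also be traced numerically. The following sketch composes the free path (2.8), the passage through the cap (2.2) and the free path (2.9) for a unit sphere; the function names are hypothetical, the angles ω, σ, φ are treated as given parameters in radians, and a very large finite curvature stands in for κ = ∞.

```python
import math

def free_path(kappa, t):
    # Formula (1.3): curvature after a free flight of duration t.
    return kappa / (1.0 + kappa * t)

def through_cap(kappa, omega, phi):
    # Formula (2.2) with rho = 1: mid-first-chord to mid-last-chord.
    s = math.sin(phi)
    return ((-math.sin(omega) / s + kappa * math.cos(omega))
            / (math.cos(omega) + kappa * math.sin(omega) * s))

def exit_curvature(kappa_in, omega, sigma, phi):
    """Transversal curvature at B' for a given curvature kappa_in at A'."""
    t_in = math.tan(sigma) * math.sin(phi)             # (2.8), from A' to A
    t_out = -math.tan(omega + sigma) * math.sin(phi)   # (2.9), from B to B'
    at_a = free_path(kappa_in, t_in)
    at_b = through_cap(at_a, omega, phi)
    return free_path(at_b, t_out)
```

With, say, ω = 60°, σ = 50° and φ = 75° (so that (2.7) holds), `exit_curvature(0.0, ...)` is positive and agrees with the closed form for the curvature at B′ given above, the map is increasing in `kappa_in`, and very large entrance curvatures are sent to very large exit curvatures.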
The curvature at B obtained in this second case is negative (by (2.7)) and the absolute value of its reciprocal is exactly the length of the free path between B and B′, which shows that the infinitesimal beam focuses again at B′. This concludes the proof of Proposition 1.
When the incoming control surface γ is not aligned, we cannot break the dynamic behavior into planar and orthogonal subspaces separately since this time the principal curvature directions change upon reflection. Our first task will be to construct two auxiliary hypersurfaces that are aligned and both have substantial tangency with γ. Their alignment allows us to compute their eigenvalues (curvatures), while their substantial tangency with γ guarantees that from these computed curvatures one can obtain estimates of curvatures of γ using the interlacing property of eigenvalues, which we recall below as Lemma 2. Moreover, one of them will majorize γ and one will minorize it in the following sense. Denote by γ_a and γ_c the minorizing and majorizing hypersurfaces respectively. Since all three hypersurfaces are orthogonal to the orbit at any point, they are all characterized by their curvature operators. They will be denoted by F, G, H for γ_a, γ and γ_c respectively. Then for all the vectors x ∈ U (U is the common tangent space to the surfaces) we have
$$(Fx, x) \le (Gx, x) \le (Hx, x). \tag{2.10}$$
Lemma 2. Let A and B be two symmetric matrices and let their eigenvalues be denoted by a_1 ≤ a_2 ≤ ... ≤ a_n and b_1 ≤ b_2 ≤ ... ≤ b_n respectively. If the matrix B − A has 1-dimensional range, then their eigenvalues are interlaced, i.e. they satisfy a_1 ≤ b_1 ≤ a_2 ≤ b_2 ≤ ... ≤ a_n ≤ b_n in the case A ≤ B, and the reverse inequalities if B ≤ A.
Proof. The minimax principle claims that the ith eigenvalue of A is
$$a_i = \min_{S}\,\max_{x \in S}\,\frac{(Ax, x)}{(x, x)},$$
where the minimum is taken over all i-dimensional subspaces S of R^n. Without loss of generality, we can assume that A ≤ B, in which case we immediately obtain a_i ≤ b_i for all i = 1, ..., n. Next, we observe that every j-dimensional subspace S contains a (j − 1)-dimensional subspace S′ which is orthogonal to the (1-dimensional) range of B − A. Since (Ax, x) = (Bx, x) for x ∈ S′,
$$\max_{x \in S'}\frac{(Bx, x)}{(x, x)} \le \max_{x \in S}\frac{(Ax, x)}{(x, x)}.$$
Hence b_{j−1} ≤ a_j.
In addition to alignment and substantial tangency with γ, we want our auxiliary operators F and H to have curvatures close to those of G. The reason for this is clear: the more the curvatures differ at the beginning, the more they differ later on. In Proposition 2 we show that the operators F and H can be chosen in such a way that their curvatures lie “within” the range of curvatures of G. To make this “within” more specific, let us denote by b_1 the smallest curvature of γ (i.e. the smallest eigenvalue of G) and by b_m the biggest one.
Proposition 2. Let a positive semidefinite curvature operator G be given. Then there exist two aligned positive semidefinite operators F and H such that (2.10) holds and both have substantial tangency with G. Moreover, F and H can be chosen so that all their eigenvalues lie in the interval I = [b_1, b_m].
Proof. We start with the existence of the “minorizing” operator F. Thus, given a real symmetric matrix G, our task is to construct a matrix F such that
i. G ≥ F and G − F has rank 1,
ii. x_p is an eigenvector of F,
iii. the eigenvalues of F lie in I,
where x_p is a unit vector in U_p. Recall that the m-dimensional space tangent to all three surfaces γ_a, γ and γ_c is naturally split into U = U_p ⊕ U_t. We choose coordinate vectors as follows: for e_1 we take x_p and for the remaining m − 1 vectors we take any orthonormal basis in the subspace U_t. In these coordinates the quadratic forms F, G and H have matrices f_ij, g_ij and h_ij. We further assume that e_1 is not an eigenvector of G, for if it were then G itself would be aligned and its dynamics (i.e. the dynamics of γ) could be studied directly. This implies that g_11 > 0. Indeed, if 0 = g_11 = (Ge_1, e_1) then e_1 would be an eigenvector, since G ≥ 0. Matrices with rank 1 are determined by a vector, say u = (u_1, u_2, ..., u_m). With this notation the linear mapping v → (v·u)u has the matrix U =
(u_ij), where u_ij = u_i u_j. Our aim is first to find u so that F := G − U satisfies (i) and (ii). This is easily done by choosing
$$u = (d,\; g_{21}/d,\; g_{31}/d,\; \dots,\; g_{m1}/d), \tag{2.11}$$
where d is an arbitrary nonzero constant. Since U = G − F is of rank 1 and positive semidefinite, (i) holds. Direct computation yields that the first column of F is (g_11 − d^2, 0, ..., 0)^T. Hence e_1 = (1, 0, ..., 0) is an eigenvector of F, which proves (ii). To show (iii) we have to choose some particular d > 0. Since b_1 is the smallest eigenvalue of G, and e_1 is not an eigenvector of G, we obtain g_11 = (Ge_1, e_1) > b_1. Thus we can choose d = √(g_11 − b_1) > 0 so that g_11 − d^2 = b_1. With this choice of d the eigenvalue of F corresponding to e_1 is b_1. Since G ≥ F it is clear that the eigenvalues of F cannot lie above b_m. We now show that with the choice of d as above there are no eigenvalues of F below b_1. Indeed, suppose that there is an eigenvector e and an eigenvalue b (Fe = be) such that b < b_1. We look at the 2D subspace S = span(e_1, e). Since b < b_1, the quadratic form satisfies (Fx, x) ≤ b_1 (x, x) for all x ∈ S, with equality only for multiples of e_1. On the other hand, since b_1 is the smallest eigenvalue of G,
$$\frac{(Gx, x)}{(x, x)} \ge b_1.$$
Substantial tangency, however, implies that in the 2D subspace S there must be at least one vector x for which (Gx, x) = (Fx, x). From the two inequalities above it follows that this can happen only for x = e_1. But that would imply that e_1 is an eigenvector of G (with the eigenvalue b_1), and that possibility we have excluded. One can similarly show that with the above choice of d at least two eigenvalues of F must have the value b_1. We already know that one eigenvalue of F has this value (the one corresponding to e_1). Suppose that all the other eigenvalues of F are strictly bigger than b_1. Then let e ≠ e_1 be the unit eigenvector of G belonging to b_1. Obviously (Ge, e) = b_1, while (Fe, e) > b_1, because F has only one eigenvector belonging to b_1 (and it is not e); this contradicts G ≥ F. Hence at least two eigenvalues of F must be equal to b_1. If we denote the eigenvalues of F by a_i, we obtain the following configuration: a_1 = b_1 = a_2 ≤ b_2 ≤ ... ≤ a_m ≤ b_m.
Of course, with an arbitrary choice of d, the equalities on the left of the above relation become just inequalities. Construction of the majorizing form H is completely analogous. We look for a vector u so that H := G + U satisfies the analogues of (i) and (ii). This is accomplished by (note the change in sign)
$$u = (d,\; -g_{21}/d,\; -g_{31}/d,\; \dots,\; -g_{m1}/d), \tag{2.11'}$$
where d is again an arbitrary nonzero constant. Since H ≥ G the eigenvalues of H cannot lie below b_1. Carefully selecting the value d one can again show that there will be two eigenvalues (one corresponding to e_1) of H equal to b_m. The reasoning is the same as in the case of the “minorizing form” F. That concludes the proof of Proposition 2.
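The rank-one constructions (2.11) and (2.11′) can be tried out on a small example. The sketch below is illustrative only: the helper names `rank_one_update` and `matvec` are hypothetical, and the 3 × 3 matrix G in the usage note is our own choice, picked because its eigenvalues 1, 2, 4 (with eigenvectors (1, −1, 1), (1, 0, −1), (1, 2, 1)) can be checked by hand.

```python
import math

def rank_one_update(G, d, sign):
    """Return G + sign * u u^T.

    sign = -1 uses u from (2.11)  and gives the minorizing matrix F;
    sign = +1 uses u from (2.11') and gives the majorizing matrix H.
    """
    m = len(G)
    # First entry is d; the remaining entries are -sign * g_{i1} / d.
    u = [d] + [-sign * G[i][0] / d for i in range(1, m)]
    return [[G[i][j] + sign * u[i] * u[j] for j in range(m)] for i in range(m)]

def matvec(A, x):
    """Plain matrix-vector product, used to check eigenvectors by hand."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Example: G has eigenvalues b_1 = 1, b_2 = 2, b_3 = 4 and g_11 = 2.
G = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
F = rank_one_update(G, math.sqrt(2.0 - 1.0), -1)   # d^2 = g_11 - b_1
H = rank_one_update(G, math.sqrt(4.0 - 2.0), +1)   # d^2 = b_m - g_11
```

Here F has first column (1, 0, 0)^T and lower block [[2, 1], [1, 2]], so its eigenvalues are 1, 1, 3, while H has first column (4, 0, 0)^T and lower block [[3.5, 1], [1, 2]], with eigenvalues 1.5, 4, 4. Both sets interlace with 1, 2, 4 as in Lemma 2, and in each case the extreme eigenvalue of G is attained twice, in agreement with the configuration described in the proof.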
Fig. 5. Quadratic form (labels: semi-axes x1, x2, x3, subspaces U_p and U_t, squeezing direction w, sphere H0)
In order to illustrate the above construction, let us consider the case m = 3. The quadratic form corresponding to the matrix G can be thought of as an ellipsoid with principal semi-axes x_i (see Fig. 5). Eventually we want to construct an enclosing ellipsoid with one semi-axis pointing in the direction U_p and such that it touches the ellipsoid determined by G along a 2-dimensional subspace. We easily find a sphere H0 which “encloses” G; however, the tangency is, in general, only 1-dimensional (along x_3 in Fig. 5). It is clear that in order to keep the form H aligned, we can “squeeze” the sphere H0 only along the subspace U_t. On the other hand, we want to preserve the tangency which we have along x_3, and in order to achieve that, we can “squeeze” the sphere H0 only along the subspace x_3^⊥. In 3D this results in only one possible direction of squeezing w. In more dimensions, we have to “squeeze” the sphere H0 (m − 2) times along suitable directions in order to achieve substantial tangency. This can be formalized in a purely geometrical proof of Proposition 2, which readers can easily discover for themselves.
Proposition 3. Substantial tangency is preserved during the free path.
Proof. Since during the free path the principal directions stay the same and the principal curvatures evolve according to (1.3), we get formula (1.4) for the curvature operator after time t. Note that the operators G = G_0 and (I + tG)^{-1} commute. For the difference of operators we obtain
$$G_t - F_t = (I + tG)^{-1}(G - F)(I + tF)^{-1} = (I + tF)^{-1}(G - F)(I + tG)^{-1}.$$
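For completeness, the first equality can be spelled out as follows. This is a short worked step under the conventions of (1.4), using only that G commutes with (I + tG)^{-1}; the second equality follows in the same way with the roles of the two factors exchanged.

```latex
\begin{align*}
G_t - F_t &= (I + tG)^{-1} G \;-\; F (I + tF)^{-1} \\
          &= (I + tG)^{-1}\bigl[\, G (I + tF) - (I + tG) F \,\bigr](I + tF)^{-1} \\
          &= (I + tG)^{-1} (G - F) (I + tF)^{-1}.
\end{align*}
```

Since (I + tG)^{-1} and (I + tF)^{-1} are invertible, the range of G_t − F_t has the same dimension as the range of G − F.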
Now it suffices to realize that two operators have substantial tangency if the difference of their matrices has 1-dimensional range. Knowing this about G − F, we can infer the same about G_t − F_t directly from the above equality. The substantial tangency is also preserved upon reflection, since for each unit direction x we subtract the same quantity from both (Fx, x) and (Gx, x) (which represent the directional curvatures of the surfaces in question). Therefore the substantial tangency is preserved during the whole series of reflections in a given spherical cap.
The reason for creating this “substantial tangency” in Proposition 2 and showing that it is preserved in a series of reflections is that it enables us to say that at each moment the curvature operators satisfy either F ≤ G or G ≤ F, which we utilize in Proposition 4. Besides, it gives good estimates of the actual eigenvalues of G in terms of those of F and H (using the interlacing property of the eigenvalues). Let us also mention that, unless the quadratic form G is proportional to unity (and that case is trivial), its eigenvalues are not all equal, which entails c_1 − a_1 > 0 and c_m − a_m > 0.
Proposition 4. All of the exit eigenvalues of the general surface γ are bounded by the eigenvalues of γ_a and γ_c; in particular they are all positive.
Proof. During the series of reflections the eigenvalues a_1 and c_1 (corresponding to the planar direction) evolve according to the known rules for planar billiards. As for the other eigenvalues of F and H (whose eigenvectors lie in the orthogonal subspace U_t), they evolve homogeneously, in the sense that during the reflection the same value 2 cos φ / r is subtracted from all of them, while during the free path they evolve independently of each other according to (1.3). After the surfaces γ_a and γ_c arrive at the exit point B′, their curvatures are all non-negative, according to Proposition 1. 
Since both surfaces have substantial tangency with γ, we deduce that F ≤ G ≤ H at the point B′. Indeed, by considering the 2-D subspace S spanned by a unit vector in the planar direction and by a suitable vector in the transversal direction, we obtain
$$(Fx, x) < (Hx, x) \quad \text{for all } x \in S.$$
If G < F or H < G, the surfaces γ_a and γ_c could not have substantial tangency with γ. From this it follows that the eigenvalues of the quadratic form G (which are the exit curvatures of the surface γ at B′) are included between those of F and H, which are all positive.
Remark 3. The notion of substantial tangency allows one to make statements about the dynamical behavior of the m-dimensional control surface that cannot be broken down into m separate 1-dimensional cases. This happens whenever none of the principal curvature directions coincides with the planar direction. In this case, the principal curvature directions change with every reflection and computation of curvatures according to (1.1)–(1.3) becomes infeasible.
To summarize the results from this section, we now formulate three conditions that we impose on the billiard region, which we assume consists only of flat components and spherical caps (an example of such a region is in Fig. 6).
Condition A: The angles of all spherical caps are less than the right angle.
Fig. 6. A billiard region Q (labels: cap angles ω″_1, ω″_2, radii ρ_1, ρ_2, zone sizes R(ω″_1, ρ_1), R(ω″_2, ρ_2))
Condition B: Every spherical cap has its own zone of focusing, whose size is equal to (or bigger than) the radius of the spherical cap. Zones of focusing for different caps are separated by a positive distance.
Condition C: The set of all the phase points x ∈ M whose orbit never enters any spherical cap has measure 0.
Remark 4. The purpose of “enclosing” each spherical cap in a zone of focusing is to give the outgoing control surface ample time to defocus. By requiring that the respective zones of focusing are separated, we make sure that the control surface always enters the zone of focusing with a positive definite curvature operator (second fundamental form). This in turn causes the control surface to enter the corresponding spherical cap with small curvatures and, after leaving it, to focus while still being in the same focusing zone. The size of the zone of focusing is such that the original “stadium” is generalized in the most natural way. If we think of the rectangular box (such as the one in Fig. 1) with two spherical caps attached to it on the opposite faces, then the center of the lower sphere should be below the center of the upper sphere. Another purpose of having the zones of focusing enclosed by flat components is to destroy the continuous group of symmetries one would obtain if the “stadium” were rotated about the horizontal axis. The reader may also notice that for some regions (e.g. the one in Fig. 1, which has a rectangular cross-section) Condition C is satisfied automatically. However, for a general position of flat walls it is not yet known (although it is widely believed) whether the set of points x whose orbit never enters spherical caps has zero measure. That is why we have postulated Condition C.
3. Lyapunov Exponents
We shall show that the n-D billiard systems described above have non-vanishing Lyapunov exponents. To achieve this we use the approach via invariant sectors discussed in [W1, L-W]. The key ingredient is to define a family of cones in the tangent bundle which is invariant (and eventually strictly invariant) with respect to the billiard map. Before we state the theorem, we would like to review a few facts from the geometry of tangent vectors and from symplectic geometry. It is customary to relate tangent vectors for billiard systems to infinitesimal families of trajectories. More precisely, consider a point x = (r, φ) ∈ M (as in Fig. 7), determining a dashed billiard orbit (both r and φ have m components). A vector x′ = (r′, φ′) ∈ T_x M can naturally be related to a family of orbits o(σ) = (r(σ), φ(σ)) = (r + σr′, φ + σφ′). Note that this family represents a curve in the phase space, satisfying o(0) = x and o′(0) = x′. Hence this family is a natural representative of a class of equivalent curves from the usual definition of a tangent vector.
Fig. 7. Tangent vectors
However, representing tangent vectors as families of trajectories originating from the billiard boundary has one formal drawback. For the purpose of expressing the behavior of nearby trajectories it is convenient to use the curvature of the wavefront corresponding to the family o(σ). Note that two tangent vectors which differ only by a scalar multiple give rise to families with the same curvature. The quantity dφ/dr, which is the natural candidate to look at, is related to this curvature through a factor of the cosine of the angle of reflection. For this reason we will consider the new arc-length parameter, which can be thought of as measuring the distances in the plane perpendicular to the orbit rather than on the billiard boundary.
Consider an arbitrary point x = (q, v) ∈ M. By simply adding the arc-length parameter r_0 in the direction of the motion, we have a complete set of coordinates on T_x(Q × S^m) for any point of the (continuous) billiard orbit. Since the dynamics in the direction of the motion is trivial, this coordinate is usually suppressed. The remaining coordinates parameterize the plane perpendicular to the orbit at any point of Q, which includes the boundary points. This can be thought of as taking the tangent space to Q × S^m, quotiented by the direction of the orbit. In the remainder of this section T_x M will always mean this perpendicular subspace of the tangent space at each point. Thus T_x M can be defined also for non-boundary points x.

Now we will take a closer look at the tangent vectors and the curvatures of wavefronts defined by the associated families o(σ). First let us recall that the second fundamental form of any smooth surface is a symmetric bilinear form G, expressing the change in the normal vectors of neighboring points. In terms of our coordinates, G can be expressed as

$$G = \begin{pmatrix} \dfrac{d\varphi_1}{dr_1} & \dfrac{d\varphi_1}{dr_2} & \cdots & \dfrac{d\varphi_1}{dr_m} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{d\varphi_m}{dr_1} & \dfrac{d\varphi_m}{dr_2} & \cdots & \dfrac{d\varphi_m}{dr_m} \end{pmatrix}.$$

Suppose that we fix a point x = (r_1, ..., r_m, φ_1, ..., φ_m) ∈ M and a tangent vector x′ = (r′_1, ..., r′_m, φ′_1, ..., φ′_m) ∈ T_x M. This vector defines a family of trajectories o(σ), which can also be thought of as an infinitesimal curve (perpendicular to the orbit). We will describe below how to express the curvature of this infinitesimal curve using the second fundamental form. Denote r′ = (r′_1, ..., r′_m) and φ′ = (φ′_1, ..., φ′_m). Then the second fundamental form of a surface maps the arc-length vector onto the angular vector: φ′ = Gr′. Since the curvature of a surface in the direction of a unit vector u is given by (Gu, u), taking u = r′/|r′| allows us to compute the curvature of the family o(σ) as

$$\kappa = \frac{\varphi' \cdot r'}{|r'|^2}. \tag{3.1}$$
Since the curvature depends only on the direction, we can always rescale the vector x′ so that r′ is a unit vector. Since the dynamics around the given orbit is best described by a local hypersurface, orthogonal to the orbit, rather than by a curve, we will consider such orthogonal hypersurfaces. They correspond to m-dimensional subspaces in the tangent space, spanned by m independent vectors x′_i = (r′_i, φ′_i). However, not every m-D subspace in the 2m-D tangent space corresponds to an infinitesimal perpendicular surface. In order that span(x′_1, ..., x′_m) corresponds to an infinitesimal surface, it is necessary and sufficient that for all i, j = 1, ..., m,

$$r'_i \cdot \varphi'_j = r'_j \cdot \varphi'_i. \tag{3.2}$$
This is just the condition for the symmetry of the curvature matrix G. If we think of R^{2m} as a symplectic space with the standard symplectic form ω, then Eq. (3.2) becomes just ω(x′_i, x′_j) = 0. Hence the infinitesimal surfaces perpendicular to the orbit can be identified with the Lagrangian subspaces of R^{2m}, i.e. with planes that are skew-orthogonal to themselves (ω(x′, y′) = 0 for any two vectors from that plane).
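The curvature formula (3.1) and the Lagrangian condition (3.2) can be checked on small numerical examples. The following is a minimal sketch in plain Python, not part of the original argument: the matrices `G_sym` and `G_bad` are hypothetical, and the sign convention ω((r′, φ′), (s′, ψ′)) = r′·ψ′ − s′·φ′ is assumed because it is the one under which (3.2) reads ω(x′_i, x′_j) = 0.

```python
def dot(a, b):
    """Euclidean inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def apply(G, r):
    """Matrix-vector product: phi' = G r'."""
    return [dot(row, r) for row in G]

def curvature(r, phi):
    """kappa = (phi' . r') / |r'|^2, as in Eq. (3.1)."""
    return dot(phi, r) / dot(r, r)

def omega(x, y):
    """Assumed convention: omega((r, phi), (s, psi)) = r.psi - s.phi."""
    (r, phi), (s, psi) = x, y
    return dot(r, psi) - dot(s, phi)

def is_lagrangian(vectors, tol=1e-12):
    """True if omega vanishes on every pair of the given spanning vectors."""
    return all(abs(omega(x, y)) <= tol for x in vectors for y in vectors)

# Hypothetical curvature matrices for m = 2 (illustration only).
G_sym = [[2.0, 0.5], [0.5, 1.0]]   # symmetric
G_bad = [[2.0, 0.5], [0.0, 1.0]]   # not symmetric
basis = [[1.0, 0.0], [0.0, 1.0]]

# Graphs {(r', G r')} of the two matrices, spanned by the images of the basis.
plane_sym = [(r, apply(G_sym, r)) for r in basis]
plane_bad = [(r, apply(G_bad, r)) for r in basis]
```

With these definitions `is_lagrangian(plane_sym)` holds while `is_lagrangian(plane_bad)` fails, which is the statement that (3.2) is equivalent to the symmetry of G; `curvature` evaluated on the first basis vector returns the corresponding diagonal entry of G.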
Before we prove the main theorem, let us introduce the notion of sectors in the tangent space (for more detailed treatment see [L-W]) and recall some elementary facts about them. Let V_1, V_2 ⊂ T_x M be two transversal Lagrangian subspaces, i.e. every vector w ∈ T_x M can be uniquely written as w = v_1 + v_2, where v_i ∈ V_i. This decomposition allows one to define a quadratic form Q(w) = ω(v_1, v_2) on T_x M. Recall that ω(x′, y′) = r′·ψ′ − s′·φ′, where r′, s′, φ′, ψ′ ∈ R^m and x′ = (r′, φ′), y′ = (s′, ψ′) ∈ R^{2m} ≅ T_x M (“≅” denotes an isomorphism between the linear spaces). Given V_1 and V_2 we can define a sector (cone)

$$C = C(V_1, V_2) = \{ w \in T_x M : Q(w) \ge 0 \}. \tag{3.3}$$
The interior of the sector is then defined as the set of vectors on which the quadratic form Q is strictly positive. Since the definition (3.3) is difficult to work with, we will now evaluate the quadratic form Q explicitly for a particular choice of the Lagrangian subspaces V_1 and V_2. Namely,

$$V_1 = \{ (r', 0),\ r' \in \mathbb{R}^m \}, \tag{3.4}$$

$$V_2 = \{ (0, r'),\ r' \in \mathbb{R}^m \}. \tag{3.5}$$
It is clear that these two subspaces are Lagrangian and that they are transversal, i.e. R^{2m} = V_1 ⊕ V_2 (“⊕” stands for a direct sum). These subspaces correspond to infinitesimal surfaces, one of which is flat and one is focusing (i.e. with infinite curvature), and the corresponding sector is called the standard sector (for more detailed treatment, see again [L-W]). With this choice the quadratic form becomes Q(x′) = r′·φ′. It is clear that if Q(x′) > 0 (or Q(x′) ≥ 0), then we can find an infinitesimal surface with positive definite (semidefinite) curvature operator, such that the vector x′ lies in the Lagrangian subspace that corresponds to it (the reader can find more details in [B-R1]). On the other hand, every infinitesimal surface with a positive definite (semidefinite) curvature operator lies (strictly) in the standard sector C(V_1, V_2). Thus invariance of sectors can be fully described by means of curvature operators of the infinitesimal surfaces perpendicular to the orbit. We can now finish the proof of the main theorem of this paper (see Sect. 1).

Theorem. The billiard map T for the region Q satisfying the conditions (A), (B) and (C) has non-vanishing Lyapunov exponents for almost every x ∈ M.

Proof. For a point y_i = (R_i, u) ∈ M (see Fig. 8) for which the trajectory is defined we will construct the sectors (cones) first for certain points inside the billiard region and then translate them using the differential of the flow back to the boundary. We define these sectors roughly as those representing surfaces which have positive curvatures at the entrance of any focusing zone which the region Q may contain. More precisely, given a point R_i, let us denote by A_i the point in the middle of the chord immediately before the first reflection in the next series of reflections in any spherical cap that belongs to Q.
Since each spherical cap is enclosed in a zone of focusing, each point A_i has a corresponding point A″_i at which the orbit going through A_i enters the focusing zone (see the beginning of the proof of Proposition 1). At these points A″_i
Fig. 8. Invariant cones construction
we will define the sectors, which we then move to other points. The fact that we can find such A″_i's for a set of points of full measure follows from Condition C and from the Poincaré Recurrence Theorem. We denote the unit velocities at the points A″_i by v_i and set x_i = (A″_i, v_i). For the given billiard orbit we denote the map carrying x_i to x_{i+1} by s (thus s(x_i) = T^{t_i}(x_i) = x_{i+1} for a suitable time t_i) and the corresponding differential that acts from T_{x_i}M to T_{x_{i+1}}M by S. Since the x_i's are not points of the boundary, we must explain what we mean by T_{x_i}M. It is the orthogonal complement of the velocity vector v_i in the (2m + 1)-D tangent space T_{x_i}(Q × S^m). Since this space is 2m-D and plays the same role as T_x M, we keep this notation at the points x_i. Thus we factor out the direction of motion and look only at the dynamics in the orthogonal complement of this direction. By considering first the dynamics between the configuration points A″_i, we establish the non-vanishing of Lyapunov exponents for the first return map (with respect to the focusing zones). The non-vanishing of the Lyapunov exponents for the billiard map itself then follows from the standard argument (see [W1]): the Lyapunov exponents of the map T and of the “first return map” s (between the x_i's) are proportional, the constant of proportionality being the average of the return time t_i (T^{t_i} x_i = x_{i+1}), whose existence is guaranteed by the Ergodic Theorem. Let x_i be a phase point that corresponds to A″_i. We define the cone C(x_i) at a point x_i by a standard sector described by (3.3), (3.4) and (3.5), C(x_i) = C = {x′ ∈ T_x M, Q(x′) ≥ 0}. In terms of geometry, the cone C(x_i) consists of those infinitesimal surfaces (i.e. Lagrangian subspaces) whose principal curvatures are non-negative. Instead of defining
cones at A″_i we could have defined them at A_i (as was done in [B-R1]) by modifying the Lagrangian subspaces V_1 and V_2. We mention that the differential S of the “first return” map s is a symplectic matrix, since it is a product of symplectic matrices representing the differential of the billiard map between the individual reflections. Since both s and S are measurable maps, the pair (s, S) is a measurable cocycle and we can use the results of [W1] about symplectic matrices. The set of 2m × 2m symplectic matrices will be denoted by Sp(2m), and of special interest will be its subset consisting of “cone-preserving” matrices

$$F = \{ S \in Sp(2m) : Q(Sx') > 0 \ \text{for } x' \in C \}. \tag{3.6}$$
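To make the definition (3.6) concrete, here is a minimal sketch in plain Python for the standard sector with Q(x′) = r′·φ′. It is an illustration only: the 2 × 2 matrix `S_good` is a hypothetical example (not a differential computed from any billiard in this paper), the test is run on a finite sample of nonzero cone vectors and is therefore a sanity check rather than a proof, and the block form J = [[0, I], [−I, 0]] of the symplectic structure is an assumed convention.

```python
def Q(x):
    """Standard-sector quadratic form Q(r', phi') = r' . phi' on R^{2m}."""
    m = len(x) // 2
    return sum(x[i] * x[m + i] for i in range(m))

def matvec(S, x):
    return [sum(a * b for a, b in zip(row, x)) for row in S]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def J(m):
    """Assumed standard symplectic matrix [[0, I], [-I, 0]] of size 2m."""
    n = 2 * m
    return [[1 if j == i + m else -1 if i == j + m else 0 for j in range(n)]
            for i in range(n)]

def is_symplectic(S, tol=1e-12):
    """Check S^T J S = J entrywise."""
    m = len(S) // 2
    lhs, rhs = matmul(matmul(transpose(S), J(m)), S), J(m)
    return all(abs(lhs[i][j] - rhs[i][j]) <= tol
               for i in range(2 * m) for j in range(2 * m))

def strictly_maps_cone(S, samples):
    """Sampled version of (3.6): Q(Sx') > 0 for every sample x' with Q(x') >= 0."""
    return all(Q(matvec(S, x)) > 0 for x in samples if Q(x) >= 0)

# Hypothetical example for m = 1: samples include the two boundary rays of C.
samples = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, -2.0]]
S_good = [[1.0, 0.5], [2.0, 2.0]]      # determinant 1, positive entries
S_identity = [[1.0, 0.0], [0.0, 1.0]]  # symplectic, but keeps the boundary of C fixed
```

On these samples `S_good` sends every cone vector into the interior of the cone, whereas the identity maps boundary vectors to boundary vectors and so fails the strict inequality; this is the distinction between invariance and strict invariance used in the proof.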
Now we recall the fact that vectors from the cone C can be embedded in infinitesimal surfaces with a positive semidefinite curvature operator (it follows from (3.1) and the definition of the quadratic form Q). Results from the previous section can now be expressed in terms of the invariance of the cones C(x_i). After passing through the spherical cap, such surfaces will have a positive definite curvature operator at the exit of the focusing zone (in the notation of the previous section, at the point B″). When we say positive definite, we do not exclude the possibility that the surface focuses exactly at the point of exit from the zone of focusing (i.e. one or more curvatures is infinite at that point). A vector corresponding to the direction of the focusing lies at the boundary of the cone at the exit point. If the exit point of one zone were also the entrance point to another one, vectors from the boundary of one cone would be mapped onto the boundary of the next one. However, since we have allowed some distance between the focusing zones, this situation will never occur and all surfaces will enter the next focusing zone as diverging and having positive definite curvature operator with finite eigenvalues. Hence, Q(Sx′) > 0 for every vector x′ ∈ C, and S belongs to F for every pair x_i, x_{i+1}. The proof of our theorem is now concluded by application of Theorem 5.1 (from [W1]), which states that every measurable cocycle with values in F has only non-vanishing Lyapunov exponents.

Remark 5. From the above it is clear that in order to achieve the proportional growth of the essential free path, as compared to the effective angle of the billiard orbit, it suffices to attach the spherical caps to those faces of the billiard region that are perpendicular to the faces adjacent to them. There are, however, examples of billiards which do not satisfy this requirement and for which numerical studies indicate the non-vanishing of Lyapunov exponents ([BCG]).
One such example is a “fair die”, which consists of a cube whose corners are replaced by 8 pieces of spheres (Fig. 9). The center of any such sphere is at the point (a, a, a) and its radius (the dotted line) is a√2. The problem with this example is the existence of trajectories which after leaving the spherical cap have a reflection from the flat wall and immediately return to the same spherical cap (see the dashed line in Fig. 9). Even though the flat walls leave the principal curvatures unchanged, they rotate the principal curvature directions. Thus curvatures in the planar direction may be partially or fully interchanged with the curvatures in the transversal direction, and the surfaces γ_a and γ_c which we have constructed no longer capture the full dynamics of the system. To get a quantitative estimate of the rotation of the principal direction, let us consider two pieces of an orbit, both having reflections from the flat wall at the point F (which lies on the same cube face as A and E). It is clear that the planar direction before the reflection and the planar direction after it differ by an angle between the planes P and
Fig. 9. A corner of the “fair die”
Q, where P is determined by the outcoming velocity vector v and the center C of the sphere, while Q is determined by the vector v and the point C′, which is the reflection of the point C in the front face (containing the points A and E). If we denote by v_c the vector FC and by v_{c′} the vector FC′, we obtain an explicit formula for the desired angle of rotation φ,

$$\cos\varphi = \frac{(v \times v_c)\cdot(v \times v_{c'})}{\|v \times v_c\|\,\|v \times v_{c'}\|} = \frac{\cos\varphi_{cc'} - \cos\varphi_c \cos\varphi_{c'}}{\sin\varphi_c \sin\varphi_{c'}}, \tag{5.27}$$

where φ_c is the angle between v and v_c, φ_{c′} is the angle between v and v_{c′}, and φ_{cc′} is the angle between v_c and v_{c′}. Using this formula we have calculated the exit curvatures for the orbit having a reflection from the spherical cap near the point A, then having a reflection from the flat wall (near F) and finally being reflected from the sphere again (near E) and leaving. In the transversal direction we obtained a negative curvature. Then we did the same for the orbit reflecting near A again, then from the flat wall while still close to A, and finally reflecting from the sphere in the middle of the circular arc connecting E and B. This time the exit curvature in the transversal direction was positive. From continuity it is clear that we could then find negative exit curvatures arbitrarily close to zero. Unfortunately, in this case the setup does not allow us to deduce appropriate dilation of the free path as in the billiards studied in this paper. We believe that in this and similar examples, the observed chaotic nature of billiards is caused by the statistical averaging of the principal curvature directions of the control surface during its repeated entrances into the spherical cap. It is not clear, however, how to keep track of the principal curvature directions between two passages through the spherical cap. Therefore our theorem does not cover the full range of numerically observed phenomena, while it provides the first class of focusing billiards with chaotic dynamics in any dimension.

Acknowledgement. The authors wish to express their gratitude to N.I. Chernov for valuable discussions. This work was supported by the NSF grants #DMS-9303769 and #DMS-9530637.
References

[A] Anosov, D.V.: Geodesic Flows on Riemann manifolds with negative curvature. Proc. Steklov Inst. of Math. 90, 210 (1967)
[B1] Bunimovich, L.A.: On billiards close to dispersing. Math. USSR Sb. 23, 45–67 (1974)
[B2] Bunimovich, L.A.: On Ergodic properties of Certain Billiards. Funct. Anal. and Its Appl. 8, 254–255 (1974), also On the Ergodic Properties of Nowhere Dispersing Billiards. Commun. Math. Phys. 65, 295–312 (1979)
[B3] Bunimovich, L.A.: Many-dimensional nowhere dispersing billiards with chaotic behavior. Physica D 33, 58–64 (1988)
[B4] Bunimovich, L.A.: A Theorem on Ergodicity of Two-Dimensional Hyperbolic Billiards. Commun. Math. Phys. 130, 599–621 (1990)
[B5] Bunimovich, L.A.: On absolutely focusing mirrors. In: Ergodic Theory and Related Topics III (ed. by U. Krengel), Lect. Notes 1514, New York: Springer-Verlag, 1992, pp. 62–82
[B6] Bunimovich, L.A.: Two mechanisms of Dynamical Chaos: Permanent Stochasticity and Intermittency. In: Nonlinear and Turbulent Processes in Physics, ed. by V.G. Baryakhtan et al., Kiev: Naukova Dumka, 1988
[BCG] Bunimovich, L.A., Casati, G., Guarneri, I.: Chaotic focusing billiards in higher dimensions. Phys. Rev. Letter. 77, 2941–2944 (1996)
[B-R1] Bunimovich, L.A., Rehacek, J.: Nowhere Dispersing 3D billiards with Non-vanishing Lyapunov Exponents. Commun. Math. Phys. 189, 729–757 (1997)
[B-R2] Bunimovich, L.A., Rehacek, J.: On the ergodicity of many-dimensional focusing billiards. To appear in Annales IHP
[B-S] Bunimovich, L.A., Sinai, Ya.G.: On the fundamental theorem in the theory of dispersing billiards. Math. Sb. 90, 415–431 (1973)
[B-G1] Burns, K., Gerber, M.: Real analytic Bernoulli geodesic flows on S^2. Ergod. Th. & Dyn. Sys. 9, 27–45 (1989)
[B-G2] Burns, K., Gerber, M.: Ergodic geodesic flows on product manifolds with low-dimensional factors. J. Reine Angew. Math. 450, 1–35 (1994)
[C] Coddington, H.: Treatise on Reflection and Refraction of Light. London: Simpkin & Marshall, 1829
[D1] Donnay, V.J.: Geodesic flow on the two-sphere. I: Positive measure entropy. Ergod. Th. & Dyn. Sys. 8, 531–553 (1988)
[D2] Donnay, V.J.: Geodesic flow on the two-sphere. II: Ergodicity. In: Dynamical Systems, ed. by J.C. Alexander, Lect. Notes 1342, New York: Springer-Verlag, 1988, pp. 112–153
[D3] Donnay, V.J.: Using Integrability to Produce Chaos: Billiards with Positive Entropy. Commun. Math. Phys. 141, 225–257 (1991)
[H] Hadamard, J.: Sur l’iteration et les solution asymptotiques des equations differentielles. Bull. Soc. Math. de France 29, 224–228 (1901)
[He] Hedlund, G.A.: Metric transitivity of the geodesics on closed surface of constant negative curvature. Ann. Math. 33, 787–808 (1934)
[Ho1] Hopf, E.: Statistik der Geodätischen Linien in Mannigfaltigkeiten Negativer Krümmung. Ber. Verh. Sächs. Akad. Wiss. 91, 261–304 (1939)
[Ho2] Hopf, E.: Statistik der Lösungen geodaetischer Probleme vom unstabilen Typus II. Math. Annalen 117, 590–608 (1940)
[KSS1] Kramli, A., Simanyi, N., Szasz, D.: A “Transversal” Fundamental Theorem for Semi-Dispersing Billiard. Commun. Math. Phys. 129, 535–560 (1990)
[KSS2] Kramli, A., Simanyi, N., Szasz, D.: Three Billiard Balls on the ν-dimensional Torus is a K-flow. Ann. Math. 133, 37–72 (1991)
[L] Lazutkin, V.F.: On the existence of caustics for the billiard ball problem in a convex domain. Math. USSR Izv. 37, 186–216 (1973)
[L-W] Liverani, C., Wojtkowski, M.: Ergodicity in Hamiltonian Systems. Dynamics Reported 4, 130–202 (1995)
[M1] Markarian, R.: Billiards with Pesin region of measure one. Commun. Math. Phys. 118, 87–97 (1988)
[M2] Markarian, R.: New ergodic billiards: Exact results. Nonlinearity 6, 819–841 (1993)
[O] Oseledec, V.I.: The multiplicative ergodic theorem: the Lyapunov characteristic numbers of dynamical system. Trans. Mosc. Math. Soc. 19, 197–231 (1993)
[S1] Sinai, Ya.G.: Dynamical systems with elastic reflections. Russ. Math. Surv. 25, 137–189 (1970)
[S2] Sinai, Ya.G.: Development of Krylov’s ideas. In: N.S. Krylov: Works on the foundations of statistical physics, Princeton, NJ: Princeton University Press, 1979, pp. 239–281
[S-C] Sinai, Ya.G., Chernov, N.I.: Ergodic properties of certain systems of two-dimensional discs and three-dimensional balls. Russ. Math. Surv. 42, 181–207 (1987)
[Sm] Smale, S.: Differentiable Dynamical Systems. Bull. of AMS 73, 747–817 (1967)
[W1] Wojtkowski, M.P.: Invariant families of cones and Lyapunov exponents. Ergod. Th. & Dyn. Sys. 5, 145–161 (1985)
[W2] Wojtkowski, M.P.: Principles for the design of billiards with non-vanishing Lyapunov exponents. Commun. Math. Phys. 105, 391–414 (1986)
[W3] Wojtkowski, M.P.: Linearly Stable Orbits in 3 Dimensional Billiards. Commun. Math. Phys. 129, 319–327 (1990)
Communicated by P. Sarnak
Commun. Math. Phys. 197, 303 – 324 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Bihamiltonian Geometry, Darboux Coverings, and Linearization of the KP Hierarchy*
Gregorio Falqui1, Franco Magri2,4, Marco Pedroni3,4
1 SISSA, Via Beirut 2/4, I-34014 Trieste, Italy. E-mail: [email protected]
2 Dipartimento di Matematica, Università di Milano, Via C. Saldini 50, I-20133 Milano, Italy. E-mail: [email protected]
3 Dipartimento di Matematica, Università di Genova, Via Dodecaneso 35, I-16146 Genova, Italy. E-mail: [email protected]
4 Centre E. Borel, UMS 839 IHP, (CNRS/UPMC)-Paris, France
Received: 7 July 1997 / Accepted: 8 March 1998
Abstract: We use ideas of the geometry of bihamiltonian manifolds, developed by Gel’fand and Zakharevich, to study the KP equations. In this approach they have the form of local conservation laws, and can be traded for a system of ordinary differential equations of Riccati type, which we call the Central System. We show that the latter can be linearized by means of a Darboux covering, and we use this procedure as an alternative technique to construct rational solutions of the KP equations.
1. Introduction

In this paper we study some aspects of the KP theory from the point of view of the bihamiltonian approach to integrable systems. Our purpose is twofold. At first we describe how the KP theory can be defined by means of a suitable application of the method of Poisson pencils, in the light of the so-called Gel’fand–Zakharevich (hereinafter GZ) theorem on the local geometry of a bihamiltonian manifold [10]. Secondly we discuss, from this point of view, how one can trade the KP hierarchy of partial differential equations for a system of ordinary differential equations, which we term the Central System. We show how this system can be linearized and solved by means of a Darboux transformation. Our approach is inductive. We start from the KdV theory, which we tackle as the GZ theory of the Poisson pencil,

$$P_\lambda = -\frac{1}{2}\partial_x^3 + 2(u + \lambda)\partial_x + u_x, \tag{1.1}$$
defined on the manifold of C^∞-functions on S^1. Following the GZ scheme we study the Casimir function H_λ of this pencil. We show that it can be written in the integral form

$$H(z) = 2z \int_{S^1} h(x, z)\,dx, \tag{1.2}$$

where the local density h is a Laurent series

$$h = z + \sum_{j \ge 1} h_j z^{-j} \tag{1.3}$$

in z = √λ. This density is related to the point u of the phase space by the Riccati Eq. [17, 14]

$$u + z^2 = h_x + h^2, \tag{1.4}$$

which, as is well known (see, e.g., [1]), defines all the coefficients h_j, j ≥ 1, as differential polynomials of u. The next step is to study the conservation laws associated with the KdV hierarchy. A trivial but far reaching consequence of the involutivity of the KdV Hamiltonians is that the KdV flows imply the local conservation laws [24, 7, 4]

$$\frac{\partial h}{\partial t_j} = \partial_x H^{(j)} \tag{1.5}$$

for the density h. These equations introduce the principal characters of our picture: the currents H^{(j)}. We show that they can be computed in the following way. Among the (finite) linear combinations Σ C_l h^{(l)} of the Faà di Bruno iterates

$$h^{(j)} = (\partial_x + h)\,h^{(j-1)}, \qquad h^{(0)} = 1, \tag{1.6}$$

of h, with coefficients C_l independent of z, we pick up, for every j ≥ 0, the unique combination having the asymptotic expansion

$$H^{(j)} = z^j + \sum_{k \ge 1} H^j_k z^{-k}. \tag{1.7}$$

When we insert these currents into the conservation laws (1.5) we obtain a hierarchy of mutually commuting vector fields on a generic Laurent series of the type (1.3). They are (a possible form of) the celebrated KP equations. When h is required to be a solution of the Riccati Eq. (1.4), these equations collapse into the conservation laws associated with the Poisson pencil (1.1). As a further step we study the time evolution of the currents H^{(j)}. We show that they satisfy a closed system of ordinary differential equations which has the form of a generalized Riccati system:

$$\frac{\partial H^{(k)}}{\partial t_j} = H^{(j+k)} - H^{(j)} H^{(k)} + \sum_{l=1}^{k} H^j_l H^{(k-l)} + \sum_{l=1}^{j} H^k_l H^{(j-l)}. \tag{1.8}$$

This we call the Central System (CS). It encompasses and extends the KP hierarchy. In [5] we have discussed how the KP equations can be recovered from CS by a projection on the orbit space of the first vector field ∂/∂t_1. Different projections allow one to obtain those KP systems related to fractional KdV hierarchies [3, 11].

* Work supported by the Italian M.U.R.S.T. and by the G.N.F.M. of the Italian C.N.R.
Finally we turn to the problem of solving the Central System. By means of the method of Darboux covering, discussed in [15], we prove that the Miura-like map

$$H^{(j)} = \Big( \sum_{l=0}^{j} W^0_{j-l}\, W^{(l)} \Big) \Big/ W^{(0)} \tag{1.9}$$

connects the Central System with another system of Riccati equations, defined on the space of sequences of Laurent series {W^{(k)}}_{k≥0} of the form

$$W^{(k)} = z^k + \sum_{l \ge 1} W^k_l z^{-l}, \tag{1.10}$$

which reads

$$\frac{\partial}{\partial t_j} W^{(k)} + z^j W^{(k)} = W^{(j+k)} + \sum_{l=1}^{j} W^k_l W^{(j-l)}. \tag{1.11}$$
This system (which we call the Sato System, see [23]) can be explicitly linearized using methods well known from the theory of Riccati Eqs. [20]. As a result the Miura map (1.9), which in the present picture is the analog of a dressing transformation, allows one to construct explicit families of solutions of the Central System, and hence of the KP hierarchy. In our opinion, this paper clarifies some issues in the analysis of the different pictures of the KP hierarchy, their mutual relations, and the linearization of the KP flows. More specifically we refer to the following three representations for KP:
a) The Lax representation in the space of pseudodifferential operators;
b) The Sato representation as linear flows of a maximal torus in GL_∞ on the Universal Sato Grassmannian UGr;
c) The “bihamiltonian representation” as conservation laws (1.5) satisfied by the Hamiltonian density h.
The analysis of the representations a) and b), and of their equivalence, has been deeply expounded in a number of by now classical papers and lecture notes (see, e.g., [8, 9, 16, 18, 19, 21, 22, 23]), while the picture of the KP equations as conservation laws, although already introduced in [7, 12, 21, 24], has somehow been left aside from the main stream of the research work on the subject. By fully developing the bihamiltonian approach, this paper aims to show that the KP theory can also be approached on a traditional ground, in the spirit of geometrical methods of classical mechanics [2].

The paper is organized as follows. It starts with a brief résumé of the bihamiltonian theory, devoted only to those aspects of the GZ theorem relevant to the paper. The next three sections provide the description of the path from KdV to KP and the Central System, briefly discussed above. In Sect. 6 we consider the action of Darboux transformations on the Central System. In the final sections we take advantage of such a point of view to linearize the KP theory and we address the problem of writing explicit solutions, which are Hirota-like polynomial solutions.
2. The Method of Poisson Pencils (to Construct Integrable Hamiltonian Systems)

In the simplest setting of this method one considers a Poisson manifold M and a vector field X. The vector field is used to deform the Poisson bracket {·, ·} on M. We denote by

$$\{f, g\}' = \{X(f), g\} + \{f, X(g)\} - X(\{f, g\}), \qquad \{f, g\}'' = \{X(f), g\}' + \{f, X(g)\}' - X(\{f, g\}') \tag{2.1}$$

the first two Lie derivatives of this bracket along X. As the unique condition on X, we demand that the second derivative identically vanishes on M:

$$\{f, g\}'' \equiv 0. \tag{2.2}$$

In this case, the pull-back

$$\{f, g\}_\lambda := \{f \circ \phi_{-\lambda},\, g \circ \phi_{-\lambda}\} \circ \phi_\lambda \tag{2.3}$$

of the given bracket with respect to the flow φ_λ : M → M associated with X depends linearly on λ,

$$\{f, g\}_\lambda = \{f, g\} - \lambda \{f, g\}', \tag{2.4}$$
and, therefore, it defines a linear pencil of Poisson brackets. Under these circumstances, we say that M is an (exact) bihamiltonian manifold and that X is its Liouville vector field. The names are chosen to suggest the analogy with the case of exact symplectic manifolds. The basic idea of the method is to use the Casimir functions of the pencil (2.4) to construct integrable Hamiltonian systems on M. We describe this technique in the case of an odd-dimensional manifold endowed with a Poisson pencil of maximal rank. This entails that {·, ·}_λ has a unique Casimir function H_λ, depending on λ. Let us set dim M = 2n + 1. Gel’fand and Zakharevich [10] have shown that H_λ is a degree n polynomial in λ,

$$H_\lambda = H_0 \lambda^n + H_1 \lambda^{n-1} + \cdots + H_n, \tag{2.5}$$

which starts with the Casimir function H_0 of {·, ·}′ and ends with the Casimir function H_n of {·, ·}. The coefficients H_j verify the recursion relations

$$\{\cdot, H_{j+1}\}' = \{\cdot, H_j\}, \tag{2.6}$$

and therefore are in involution with respect to all the brackets of the pencil:

$$\{H_j, H_k\}_\lambda = 0. \tag{2.7}$$

In the compact case, their level surfaces are n-dimensional tori defining a Lagrangian foliation of M. To convert this result into a statement on dynamical systems, we consider the pencil of vector fields

$$X_\lambda(f) := \{f, H_\lambda\}' \tag{2.8}$$
associated with H_λ through the deformed bracket {·, ·}′. We make two remarks. First we notice that X_λ is a bihamiltonian vector field since we can write

$$X_\lambda(f) = \{f, H_\lambda\}' = \{f, H'_\lambda\}_\lambda. \tag{2.9}$$

The derivative

$$H'_\lambda = X(H_\lambda) \tag{2.10}$$

is the second Hamiltonian function. Then we notice that X_λ is a completely integrable system in the sense of Liouville since

$$X_\lambda(H_j) = 0. \tag{2.11}$$

We call the polynomial family of vector fields,

$$X_\lambda = X_0 \lambda^n + X_1 \lambda^{n-1} + \cdots + X_n, \tag{2.12}$$

the canonical hierarchy defined on the exact bihamiltonian manifold M.

3. The KdV Hierarchy

In this section we define the KdV hierarchy as the canonical hierarchy on a special exact bihamiltonian manifold, and we use this point of view to pave the way to the KP theory. In this example the manifold M is the space of scalar-valued C^∞-functions on S^1, the Liouville vector field is

$$\dot u := X(u) = 1, \tag{3.1}$$
and the Poisson pencil is given in the form of a one-parameter family of skew-symmetric maps from the cotangent to the tangent bundle [9, 6]:

$$\dot u = (P_\lambda)_u v = -\frac{1}{2} v_{xxx} + 2(u + \lambda) v_x + u_x v. \tag{3.2}$$

In this formula u is a point of M, v is a covector attached at u, and the value of v on a generic tangent vector u̇ is given by

$$\langle v, \dot u \rangle = \int_{S^1} v(x)\,\dot u(x)\,dx, \tag{3.3}$$

where x is the coordinate on S^1. The first problem is to compute the Casimir function H_λ and its derivative H′_λ along X. In the present infinite-dimensional context they can be conveniently written as integrals

$$H = 2z \int_{S^1} h\,dx, \tag{3.4}$$

$$H' = \int_{S^1} h'\,dx \tag{3.5}$$

of local densities

$$h(z) = z + \sum_{j \ge 1} h_j z^{-j}, \tag{3.6}$$

$$h'(z) = 1 + \sum_{j \ge 1} h'_j z^{-j}, \tag{3.7}$$

which are Laurent series (rather than polynomials) in z = √λ.
Proposition 3.1. Let h and h′ be the unique solutions of the Riccati system

$$h_x + h^2 = u + z^2, \tag{3.8}$$

$$-\frac{1}{2} h'_x + h h' = z \tag{3.9}$$

admitting the asymptotic expansions (3.6) and (3.7). Then their integrals H_λ and H′_λ are respectively the Casimir function of the Poisson pencil (3.2) and its associated second Hamiltonian function.

Proof. We use the identity

$$v \Big( -\frac{1}{2} v_{xxx} + 2(u + z^2) v_x + u_x v \Big) = \frac{d}{dx} \Big( \frac{1}{4} v_x^2 - \frac{1}{2} v v_{xx} + (u + z^2) v^2 \Big) \tag{3.10}$$

to prove that the solution of the equation

$$\frac{1}{4} v_x^2 - \frac{1}{2} v v_{xx} + (u + \lambda) v^2 = \lambda \tag{3.11}$$

belongs to the kernel of (3.2). Setting λ = z^2 we note that Eq. (3.11) can be written in the form of a Riccati equation,

$$\Big( \frac{z}{v} + \frac{v_x}{2v} \Big)_x + \Big( \frac{z}{v} + \frac{v_x}{2v} \Big)^2 = u + z^2, \tag{3.12}$$

on

$$h(z) := \frac{z}{v} + \frac{v_x}{2v}. \tag{3.13}$$

By deriving this equation along any curve u(t) in M, we have u̇ = ḣ_x + 2hḣ. Therefore,

$$\langle v, \dot u \rangle = \int_{S^1} v(x)\,(\dot h_x + 2 h \dot h)\,dx = 2 \int_{S^1} \Big( -\frac{1}{2} v_x + h v \Big) \dot h\,dx = \frac{d}{dt} \Big( 2z \int_{S^1} h\,dx \Big). \tag{3.14}$$

This formula proves that v is the differential of the first Hamiltonian H_λ, which, consequently, is the Casimir function we were looking for. To compute the second Hamiltonian H′_λ, it suffices to notice that

$$H'_\lambda = X(H_\lambda) = \langle v, 1 \rangle = \int_{S^1} v\,dx. \tag{3.15}$$

This suggests to set

$$h' = v. \tag{3.16}$$

Equations (3.8) and (3.9) follow from (3.12) and (3.13) respectively.
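The statement that the Riccati equation h_x + h^2 = u + z^2 fixes every coefficient h_j as a differential polynomial in u can be made explicit. Inserting h = z + Σ h_j z^{-j} and comparing powers of z gives h_1 = u/2 and 2h_{j+1} = −∂_x h_j − Σ_{i=1}^{j−1} h_i h_{j−i}. The following is a minimal sketch in plain Python of this recursion; the dictionary encoding of differential polynomials (a monomial u^{e_0} u_x^{e_1} u_xx^{e_2} ... is stored as the exponent tuple (e_0, e_1, e_2, ...)) and all function names are assumptions made for the illustration.

```python
from fractions import Fraction

def trim(m):
    """Drop trailing zero exponents so each monomial has a unique key."""
    m = list(m)
    while m and m[-1] == 0:
        m.pop()
    return tuple(m)

def add(p, q, c=1):
    """Return p + c*q for differential polynomials stored as dicts."""
    r = dict(p)
    for m, a in q.items():
        r[m] = r.get(m, 0) + c * a
        if r[m] == 0:
            del r[m]
    return r

def scale(p, c):
    return {m: c * a for m, a in p.items()}

def mul(p, q):
    """Product: exponent tuples add componentwise."""
    r = {}
    for m1, a in p.items():
        for m2, b in q.items():
            n = max(len(m1), len(m2))
            m = trim([(m1[i] if i < len(m1) else 0) +
                      (m2[i] if i < len(m2) else 0) for i in range(n)])
            r[m] = r.get(m, 0) + a * b
            if r[m] == 0:
                del r[m]
    return r

def dx(p):
    """Total x-derivative: the k-th derivative of u goes to the (k+1)-th."""
    r = {}
    for m, a in p.items():
        for i, e in enumerate(m):
            if e == 0:
                continue
            new = list(m) + [0]
            new[i] -= 1
            new[i + 1] += 1
            key = trim(new)
            r[key] = r.get(key, 0) + a * e
            if r[key] == 0:
                del r[key]
    return r

def riccati_coefficients(n):
    """h_1, ..., h_n from h_1 = u/2, 2 h_{j+1} = -dx(h_j) - sum_i h_i h_{j-i}."""
    h = {1: {(1,): Fraction(1, 2)}}
    for j in range(1, n):
        rhs = scale(dx(h[j]), -1)
        for i in range(1, j):
            rhs = add(rhs, mul(h[i], h[j - i]), -1)
        h[j + 1] = scale(rhs, Fraction(1, 2))
    return h

h = riccati_coefficients(3)
```

In this encoding the first three coefficients come out as h_1 = u/2, h_2 = −u_x/4 and h_3 = (u_xx − u^2)/8, which can be confirmed by hand from the first three orders of the expansion.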
The next problem is to study the canonical hierarchy associated with $H_\lambda$. It admits three different representations, according to the use of Eqs. (2.8), (2.9), or (2.11) of Sect. 2. In the first representation, based on formula (2.8), the KdV equations are written as Hamiltonian equations
$$\frac{\partial u}{\partial t_{2j}} = -2\partial_x v_{2j+1} = 0, \qquad (3.17)$$
$$\frac{\partial u}{\partial t_{2j+1}} = -2\partial_x v_{2j+2} \qquad (3.18)$$
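with respect to the derived bracket $\{\cdot,\cdot\}'$. In the second representation, based on (2.9), they are written as Hamiltonian equations with respect to the pencil $P_\lambda$. After some straightforward computations [1, 6], we get
$$\frac{\partial u}{\partial t_{2j}} = 0, \qquad (3.19)$$
$$\frac{\partial u}{\partial t_{2j+1}} = \left(-\tfrac12\partial_x^3 + 2(u+z^2)\partial_x + u_x\right)\cdot(\lambda^j v(\lambda))_+, \qquad (3.20)$$
where
$$(\lambda^j v(\lambda))_+ = \sum_{i=0}^{j} v_{j-i}\lambda^i. \qquad (3.21)$$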
In the third representation, based on formula (2.11), the attention is focused on the local Hamiltonian density $h(z)$. It must obey local conservation laws of the form
$$\frac{\partial h}{\partial t_j} = \partial_x H^{(j)}, \qquad (3.22)$$
where the $H^{(j)}$ are suitable “current densities”, since the Hamiltonian $H_\lambda$ is constant along the flows of the KdV hierarchy. The final problem is to compute these densities.

Proposition 3.2. The current densities $H^{(j)}$ of the KdV hierarchy are given by the formulas
$$H^{(2j)} = \lambda^j, \qquad (3.23)$$
$$H^{(2j+1)} = -\tfrac12(\lambda^j v)_{+,x} + h(\lambda^j v)_+. \qquad (3.24)$$
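Proof. From Eq. (3.8) we have
$$\frac{\partial u}{\partial t_j} = (\partial_x + 2h)\left(\frac{\partial h}{\partial t_j}\right), \qquad (3.25)$$
and from the second representation (3.20) of the KdV equations we deduce
$$\frac{\partial u}{\partial t_{2j+1}} = (\partial_x + 2h)\left(\tfrac12\partial_x\right)(-\partial_x + 2h)(\lambda^j v)_+, \qquad (3.26)$$
by noticing that
$$-\tfrac12 v_{xxx} + 2(u+\lambda)v_x + u_x v = (\partial_x + 2h)\left(\tfrac12\partial_x\right)(-\partial_x + 2h)\cdot v. \qquad (3.27)$$
Therefore
$$(\partial_x + 2h)\left(\frac{\partial h}{\partial t_{2j+1}}\right) = (\partial_x + 2h)\left(\tfrac12\partial_x\right)\left(-(\lambda^j v)_{+,x} + 2h(\lambda^j v)_+\right) = (\partial_x + 2h)\,\partial_x\left(-\tfrac12(\lambda^j v)_{+,x} + h(\lambda^j v)_+\right), \qquad (3.28)$$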
proving (3.24). The formula $H^{(2j)} = z^{2j}$ is obvious, since the even times are trivial.

The definition (3.2) of the Poisson pencil, the Riccati system (3.8, 3.9) for the Hamiltonians, and the definition (3.23, 3.24) of the currents are the basic formulas of the Hamiltonian theory of the KdV equations. They introduce a new object, the currents $H^{(j)}$. Their study is the leading theme of the paper.

4. The KP Hierarchy

The aim of this section is to give a new characterization of the current $H^{(j)}$ in terms of the Hamiltonian density $h(z)$. To this end we write (3.24) in the equivalent forms
$$H^{(2j+1)} = z^{2j}\left(-\tfrac12 v_x + hv\right) + \tfrac12 (z^{2j}v)_{-,x} - h(z^{2j}v)_-, \qquad (4.1)$$
$$H^{(2j+1)} = \sum_{l=0}^{j}\left(-\tfrac12 v_{j-l,x}\,(z^{2l}\cdot 1) + v_{j-l}\,(z^{2l}\cdot h)\right), \qquad (4.2)$$
where $(z^{2j}v)_-$ denotes the strictly negative part of the expansion of $z^{2j}v$ in powers of z. Each of these representations points out an important property of the currents $H^{(j)}$. Equation (4.1) allows us to control the expansion of $H^{(2j+1)}$ in powers of z. Indeed, from it we obtain
$$H^{(j)} = z^j + O(z^{-1}) \qquad (4.3)$$
by noticing that the second Riccati Eq. (3.9) implies $z^{2j}(-\tfrac12 v_x + hv) = z^{2j+1}$. The interpretation of (4.2) is more subtle: it provides a different type of expansion of the current $H^{(2j+1)}$ on a basis attached to the Hamiltonian density h. To display this expansion, we consider the Faà di Bruno iterates of $h^{(0)} = 1$ at the point h, defined by
$$h^{(j+1)} = (\partial_x + h)\cdot h^{(j)}. \qquad (4.4)$$
The linear space spanned by them (over $C^\infty$-functions) is denoted by $\mathcal{H}_+$. Since
$$h^{(2)} = h_x + h^2, \qquad (4.5)$$
we can write the Riccati Eq. (3.8) in the form
$$z^2 = h^{(2)} - u h^{(0)}, \qquad (4.6)$$
showing that $z^2 \in \mathcal{H}_+$. Applying the operators $(\partial_x + h)^j$ to both sides of this equation, one shows that $z^2(\mathcal{H}_+) \subset \mathcal{H}_+$. In particular, $z^{2j}\cdot 1 \in \mathcal{H}_+$ and $z^{2j}\cdot h \in \mathcal{H}_+$ for $j \ge 0$. Then Eq. (4.2) means that the currents $H^{(2j+1)}$ belong to $\mathcal{H}_+$. The same is trivially true for $H^{(2j)}$. Therefore we conclude that all the currents $H^{(j)}$ are Faà di Bruno polynomials
$$H^{(j)} = \sum_{l=0}^{j} c^j_l\, h^{(l)} \qquad (4.7)$$
with coefficients $c^j_l$ independent of z. For any $j \ge 0$ there is a unique choice of these coefficients leading to a “degree” j Faà di Bruno polynomial with the asymptotic expansion (4.3) in powers of z: the currents $H^{(j)}$ are (with alternating signs) the principal minors of the infinite matrix
$$H = \begin{pmatrix}
h^{(0)} & h^{(1)} & h^{(2)} & h^{(3)} & \cdots \\
\operatorname{res}\frac{h^{(0)}}{z} & \operatorname{res}\frac{h^{(1)}}{z} & \operatorname{res}\frac{h^{(2)}}{z} & \operatorname{res}\frac{h^{(3)}}{z} & \cdots \\
0 & \operatorname{res}\frac{h^{(1)}}{z^2} & \operatorname{res}\frac{h^{(2)}}{z^2} & \operatorname{res}\frac{h^{(3)}}{z^2} & \cdots \\
0 & 0 & \operatorname{res}\frac{h^{(2)}}{z^3} & \operatorname{res}\frac{h^{(3)}}{z^3} & \cdots \\
0 & 0 & 0 & \operatorname{res}\frac{h^{(3)}}{z^4} & \cdots \\
\cdots & \cdots & \cdots & \cdots & \cdots
\end{pmatrix}. \qquad (4.8)$$
Summarizing:

Proposition 4.1. The current densities of the KdV theory are the principal minors of the matrix (4.8), i.e., they are the unique Faà di Bruno polynomials (4.7) having the asymptotic expansion (4.3) in powers of z.

The advantage of this definition with respect to the one of Proposition 3.2 is that it no longer requires that the Laurent series
$$h(z) = z + \sum_{l\ge 1} h_l z^{-l} \qquad (4.9)$$
be a solution of the Riccati equation (3.8). We have already encoded the Hamiltonian origin of the currents $H^{(j)}$ into the Faà di Bruno expansion (4.7). We can thus forget the Poisson pencil (3.2) and the associated Riccati system, and retain simply the property stated in Proposition 4.1. Henceforth, we shall regard it as the definition of the currents $H^{(j)}$ associated with any monic Laurent series (4.9). This allows us to extend Eqs. (3.22) to these series.

Definition 4.2. The KP equations are the equations
$$\frac{\partial h}{\partial t_j} = \partial_x H^{(j)} \qquad (4.10)$$
on an arbitrary monic Laurent series (4.9), where the currents $H^{(j)}$ are the Faà di Bruno polynomials considered in Proposition 4.1.

After a suitable change of variables [4], this definition reproduces the standard one, usually written in the language of pseudodifferential operators (see, e.g., [8, 9]). Let us briefly explain this relation. First of all, we consider the negative Faà di Bruno iterates, obtained by solving backwards the recursion relations
$$h^{(j+1)} = (\partial_x + h)h^{(j)}, \qquad j < 0. \qquad (4.11)$$
The coefficients of the $h^{(j)}$ can be computed recursively, and one can easily show that $h^{(j)} = z^j + O(z^{j-1})$. Then we develop z on the basis $\{h^{(j)}\}_{j\in\mathbb{Z}}$:
$$z = h - \sum_{j\ge 1} q_j h^{(-j)}. \qquad (4.12)$$
This gives an invertible relation between the coefficients $h_i$ of h and the $q_i$. For instance, the first relations are
$$q_1 = h_1, \qquad q_2 = h_2, \qquad q_3 = h_3 + h_1^2, \qquad q_4 = h_4 + 3h_1h_2 - h_1h_{1x}. \qquad (4.13)$$
Finally, we introduce the pseudodifferential operator
$$Q = \partial - \sum_{j\ge 1} q_j\partial^{-j}. \qquad (4.14)$$
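As a consistency check on the first three relations in (4.13), one can solve the backward recursion (4.11) to low order and compare coefficients in (4.12). The computation below is a sketch under the stated normalization $h^{(j)} = z^j + O(z^{j-1})$; the coefficient names $a_n$, $b_n$ are introduced here for illustration only.

```latex
% Write h^{(-1)} = \sum_{n\ge1} a_n z^{-n} and impose (\partial_x + h) h^{(-1)} = h^{(0)} = 1,
% with h = z + h_1 z^{-1} + h_2 z^{-2} + \dots :
\begin{align*}
z^{0}:&\quad a_1 = 1, \\
z^{-1}:&\quad a_{1,x} + a_2 = 0 \;\Rightarrow\; a_2 = 0, \\
z^{-2}:&\quad a_{2,x} + a_3 + h_1 a_1 = 0 \;\Rightarrow\; a_3 = -h_1 .
\end{align*}
% Hence h^{(-1)} = z^{-1} - h_1 z^{-3} + O(z^{-4}).
% Similarly, h^{(-2)} = \sum_{n\ge2} b_n z^{-n} with (\partial_x + h) h^{(-2)} = h^{(-1)} gives
% b_2 = 1, b_3 = 0, and h^{(-3)} = z^{-3} + O(z^{-4}).
%
% Comparing coefficients in h - z = \sum_{j\ge1} q_j h^{(-j)}:
\begin{align*}
z^{-1}:&\quad h_1 = q_1, \\
z^{-2}:&\quad h_2 = q_1 a_2 + q_2 b_2 = q_2, \\
z^{-3}:&\quad h_3 = q_1 a_3 + q_2 b_3 + q_3 = -h_1^2 + q_3 ,
\end{align*}
% i.e. q_1 = h_1, q_2 = h_2, q_3 = h_3 + h_1^2, in agreement with (4.13).
```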
One can show [4] that the KP Eqs. (4.10) on h entail the Lax equations on Q,
$$\frac{\partial Q}{\partial t_j} = [Q, (Q^j)_+], \qquad (4.15)$$
where $(Q^j)_+$ is the purely differential part of the j-th power of Q. The transformation (4.12) can be usefully compared with the change of representation in the classical treatment of the equations of motion of the Euler top. If we use the space representation we simply write the conservation law of the angular momentum as
$$\frac{dL}{dt} = 0. \qquad (4.16)$$
If we use the body representation, we write the Euler equation
$$\frac{dL}{dt} = [L, \Omega]. \qquad (4.17)$$
The same happens in the KP theory. When we pass from the Hamiltonian representation (4.10) to the Lax representation (4.15) we are performing the analog of the passage from the space representation to the body representation of classical mechanics. We will not discuss the KP equations any longer, but rather consider them as an intermediate step towards the main topic of this paper, i.e., the analysis of the equations on the currents $H^{(j)}$ themselves.

5. The Central System

In this section we shall see the Riccati equations appear again in a disguised form. They arise here from the study of the time evolution of the currents $H^{(j)}$. We interpret the KP Eqs. (4.10) as the commutativity conditions of the operators $\partial_x + h$ and $\frac{\partial}{\partial t_j} + H^{(j)}$:
$$\left[\partial_x + h,\ \frac{\partial}{\partial t_j} + H^{(j)}\right] = 0. \qquad (5.1)$$
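The equivalence between (4.10) and the commutativity condition (5.1) is a one-line computation. The sketch below treats $h$ and $H^{(j)}$ as multiplication operators acting on a test function $\phi$, which is an illustrative device and not notation from the text.

```latex
\begin{align*}
\Bigl[\partial_x + h,\ \tfrac{\partial}{\partial t_j} + H^{(j)}\Bigr]\phi
 &= \partial_x\bigl(H^{(j)}\phi\bigr) - H^{(j)}\partial_x\phi
    + h\,\tfrac{\partial\phi}{\partial t_j} - \tfrac{\partial}{\partial t_j}\bigl(h\phi\bigr) \\
 &= \Bigl(\partial_x H^{(j)} - \tfrac{\partial h}{\partial t_j}\Bigr)\phi ,
\end{align*}
% since \partial_x commutes with \partial/\partial t_j, and h commutes with H^{(j)}.
% The commutator therefore vanishes for all \phi exactly when
% \partial h / \partial t_j = \partial_x H^{(j)}, which is (4.10).
```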
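Since $H^{(j)} \in \mathcal{H}_+$ and $\mathcal{H}_+$ is invariant with respect to the operator $\partial_x + h$, we see that $\mathcal{H}_+$ is invariant also with respect to the operators $\frac{\partial}{\partial t_j} + H^{(j)}$,
$$\left(\frac{\partial}{\partial t_j} + H^{(j)}\right)(\mathcal{H}_+) \subset \mathcal{H}_+, \qquad (5.2)$$
as shown by the following simple argument:
$$\left(\frac{\partial}{\partial t_j} + H^{(j)}\right)h^{(k)} = \left(\frac{\partial}{\partial t_j} + H^{(j)}\right)(\partial_x + h)^k\, 1 = (\partial_x + h)^k H^{(j)} \in \mathcal{H}_+. \qquad (5.3)$$
Let us now remark that the sequence $\{H^{(j)}\}_{j\ge 0}$ is, in its turn, a basis in $\mathcal{H}_+$. Then the previous invariance condition implies that there exist coefficients $\gamma^{jk}_l$ (independent of z) such that
$$\frac{\partial H^{(k)}}{\partial t_j} + H^{(j)}H^{(k)} = \sum_{l=0}^{j+k}\gamma^{jk}_l(h)\,H^{(l)}. \qquad (5.4)$$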
They can be easily identified by comparing the expansion of both sides of this equation in powers of z.

Proposition 5.1. Along the trajectories of the KP hierarchy, the current densities $H^{(k)}$ obey the equations
$$\frac{\partial H^{(k)}}{\partial t_j} + H^{(j)}H^{(k)} = H^{(j+k)} + \sum_{l=1}^{k} H^j_l H^{(k-l)} + \sum_{l=1}^{j} H^k_l H^{(j-l)}, \qquad (5.5)$$
which we call the Central System (CS) associated with the KP theory.

In these equations the Hamiltonian density $h(z)$ plays no special role. Hence we can forget the Faà di Bruno rule to construct the polynomials $H^{(j)}$, and we can look at them simply as a collection $\{H^{(j)}\}_{j\in\mathbb{N}}$ of Laurent series,
$$H^{(j)} = z^j + \sum_{l\ge 1} H^j_l z^{-l}, \qquad (5.6)$$
with independent coefficients $H^j_l$. From this point of view, the Central System, written in the componentwise form
$$\frac{\partial H^k_m}{\partial t_j} + H^j_{m+k} + H^k_{j+m} + \sum_{l=1}^{m-1} H^j_l H^k_{m-l} = H^{j+k}_m + \sum_{l=1}^{j-1} H^k_l H^{j-l}_m + \sum_{l=1}^{k-1} H^j_l H^{k-l}_m, \qquad (5.7)$$
is manifestly a system of ordinary differential equations of Riccati type for the new variables Hlj . Through our two-step process from the KdV equations to the Central System, we eventually passed from a hierarchy of partial differential equations to a dynamical system. This result, which is originally due to Sato, admits an interesting geometrical interpretation [5], which we briefly recall. The idea is that the KP and KdV equations are a “reduction” of the Central System, and the problem is to understand this reduction
process. We have to recall that the vector fields of the Central System pairwise commute. Then we are allowed to perform two kinds of “reductions”. The first is a restriction to the submanifold of singular points of any vector field of the system. The second is a projection onto the orbit space of any vector field of the system along its trajectories. The two processes commute. In [5] we have shown that KP can be obtained from CS as the projection along the trajectories of the first vector field of CS. It is this process which converts the original family of ordinary differential equations into a hierarchy of partial differential equations. The projection is defined by the Faà di Bruno condition
$$H^{(k)} = \sum_{l=0}^{k} c^k_l\, h^{(l)}. \qquad (5.8)$$
On the other hand, as is well known, KdV is a restriction of KP to the manifold of singular points of the second vector field of the hierarchy. The restriction is defined by the Riccati equation
$$h_x + h^2 = u + z^2. \qquad (5.9)$$
Therefore the passage from CS to KdV is a combined process, involving both a projection and a restriction. This gives the geometrical meaning of the Faà di Bruno condition and of the Riccati equation. Of course this is only the simplest example of such a procedure. Other examples are the so-called fractional KdV hierarchies [11], see [5].

Remark 5.2. We end this section on the Central System with some cursory remarks on its relation with the theory of the τ-function and of the Baker–Akhiezer function ψ (see, e.g., [8, 9, 13, 18, 21, 22]). The link with the Baker–Akhiezer function ψ and the alternative forms of the KP equations suggested by Sato and Sato in the seminal paper [21] rests on the following argument. A glance at the Central System shows the symmetry conditions
$$\frac{\partial H^{(j)}}{\partial t_k} = \frac{\partial H^{(k)}}{\partial t_j}. \qquad (5.10)$$
Therefore, there exists a function ψ such that
$$\frac{\partial\psi}{\partial t_j} = H^{(j)}\psi. \qquad (5.11)$$
With a suitable normalization this is a Baker–Akhiezer function. Moreover, by the same conditions, the differential operators
$$D_j := \frac{\partial}{\partial t_j} + H^{(j)} \qquad (5.12)$$
commute. By acting recursively with these operators on the lowest order current $H^{(0)} = 1$ one obtains vectors $D_{j_1}\cdots D_{j_k}(H^{(0)})$ belonging to $\mathcal{H}_+$ thanks to the invariance condition (5.2). One can see that they satisfy remarkable constraints. For instance, we have, for $a, b \in \mathbb{C}$,
$$(aD_2 + bD_1^2)(1) = aH^{(2)} + b\left(H^{(2)} + 2H^1_1H^{(0)}\right) = (a+b)H^{(2)} + 2bH^1_1H^{(0)},$$
so that setting $a = -b$ we can force the above linear combination to be a multiple of $H^{(0)}$. Analogously
$$(aD_3 + bD_1D_2 + cD_1^3)(1) = (a+b+c)H^{(3)} + (b+3c)H^1_1H^{(1)} + \left[(b-3c)H^1_2 + (b+3c)H^2_1\right]H^{(0)}$$
is independent of z for $a = 2c$ and $b = -3c$. In general it can be proved that the vectors $D_{j_1}\cdots D_{j_k}(1)$ verify the constraints
$$p_k\left(-\tfrac{1}{l}D_l\right)(1) = H^1_{k-1}, \qquad (5.13)$$
where the Schur polynomials $p_k(t_1, t_2, \ldots)$ are as usual defined via the relation
$$\exp\left(\sum_{i\ge 1} t_i z^i\right) = \sum_{k\ge 0} p_k(t_l)\,z^k. \qquad (5.14)$$
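For readers who want to evaluate the Schur polynomials of (5.14) numerically, here is a minimal sketch. It uses the recursion $k\,p_k = \sum_{i=1}^{k} i\,t_i\,p_{k-i}$, obtained by differentiating the generating function in (5.14) with respect to z; the function name `schur_polynomials` and the list-based interface are assumptions made for this illustration, not notation from the paper.

```python
from fractions import Fraction


def schur_polynomials(t, kmax):
    """Return [p_0, ..., p_kmax] evaluated at the times t = [t_1, t_2, ...].

    Missing times are treated as zero. Exact rational arithmetic is used so
    that the values can be compared with closed forms such as
    p_2 = t_2 + t_1**2 / 2.
    """
    t = [Fraction(x) for x in t]
    p = [Fraction(1)]  # p_0 = 1
    for k in range(1, kmax + 1):
        total = Fraction(0)
        for i in range(1, k + 1):
            t_i = t[i - 1] if i <= len(t) else Fraction(0)
            total += i * t_i * p[k - i]
        p.append(total / k)  # k * p_k = sum_i i * t_i * p_{k-i}
    return p


if __name__ == "__main__":
    t1, t2 = Fraction(2), Fraction(3)
    p = schur_polynomials([t1, t2], 3)
    print(p[2] == t2 + t1**2 / 2)            # closed form of p_2
    print(p[3] == t1 * t2 + t1**3 / 6)       # closed form of p_3 with t_3 = 0
```

With $t_l$ replaced by $-\tfrac1l D_l$, the closed forms of $p_2$ and $p_3$ reproduce the combinations $a = -b$ and $a = 2c$, $b = -3c$ used in the examples preceding (5.13).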
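Then the simple identity
$$\psi\cdot p_k\left(-\tfrac{1}{l}D_l\right)(1) = p_k\left(-\tfrac{1}{l}\frac{\partial}{\partial t_l}\right)(\psi) \qquad (5.15)$$
allows us to recover the Sato constraints
$$p_k\left(-\tfrac{1}{l}\frac{\partial}{\partial t_l}\right)(\psi) = H^1_{k-1}\cdot\psi \qquad (5.16)$$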
presented in [21].

The link with the τ-function is provided by the second Hamiltonian function $H'_\lambda$ discussed in Sect. 3. It is preserved along the flows of the KP hierarchy too, and therefore its Hamiltonian density $h'$ verifies local conservation laws of the form
$$\frac{\partial h'}{\partial t_j} = \partial_x H'_{(j)}, \qquad (5.17)$$
which we call dual KP equations. Setting
$$H_{(l)} = z^{l-1} - \sum_{k\ge 1} H^k_l\, z^{-(k+1)}, \qquad (5.18)$$
one finds [4] that the dual currents $H'_{(j)}$ are given by
$$H'_{(j)} = \sum_{l=1}^{j} H_{(l)}H^{(j-l)}. \qquad (5.19)$$
This formula allows us to compute the coefficients $H'_{jk}$ of the expansion of $H'_{(j)}$ in powers of z,
$$H'_{(j)} = jz^{j-1} - \sum_{l\ge 1} H'_{jl}\, z^{-(l+1)}, \qquad (5.20)$$
as quadratic polynomials of the coefficients $H^i_k$ of the primal currents $H^{(i)}$. By using this representation one can show the symmetry property
$$H'_{jk} = H'_{kj}. \qquad (5.21)$$
Furthermore, as functions of the times $(t_1, t_2, \ldots)$, they verify the differential conditions
$$\frac{\partial H'_{jk}}{\partial t_l} = \frac{\partial H'_{lk}}{\partial t_j}. \qquad (5.22)$$
Therefore, there exists a function $\tau(t_1, t_2, \ldots)$ independent of z such that
$$H'_{jk} = \frac{\partial^2}{\partial t_j\,\partial t_k}\log\tau. \qquad (5.23)$$
This function is the Hirota τ-function associated with the Central System. As is well known, by introducing this function it is possible to set the KP hierarchy in the form of Hirota bilinear equations. However, it is also possible to set directly the equations of the Central System in the form of a linear system by means of a suitable transformation, to be discussed in the next sections.

6. Darboux Coverings and the Central System

To linearize the Central System we shall use the method of Darboux coverings [15]. The hallmark of such a rather “unconventional” approach to the classical subject of Darboux maps and symmetries is quite simple. In the first instance, one replaces the search for a transformation between two vector fields X and Z defined on the manifolds M and P respectively, by that of a third vector field Y (defined in general on a bigger manifold N) separately related to X and Z by two maps $\pi : N \to M$ and $\sigma : N \to P$:
$$X = \pi_*(Y), \qquad Z = \sigma_*(Y). \qquad (6.1)$$
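By definition, integral curves of Y are mapped to integral curves of X by π, and to integral curves of Z by σ. We say for short that Y intertwines X with Z. Moreover, we say that Y is a Darboux covering of X if N is a fiber bundle over M and π is the canonical projection. In this case, any section ρ of π, invariant under Y, allows us to define a (Miura) map $\mu : M \to P$ relating directly the vector fields X and Z. In pictures:
$$\begin{array}{ccc} & Y & \\ {\scriptstyle\sigma_*}\swarrow & & \searrow{\scriptstyle\pi_*} \\ Z & & X \end{array}
\qquad\qquad
\begin{array}{ccc} & Y & \\ {\scriptstyle\sigma_*}\swarrow & & \nwarrow{\scriptstyle\rho_*} \\ Z & \xleftarrow{\ \mu_*\ } & X \end{array}$$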
In the present instance, M is the space of sequences of Laurent series $\{W^{(k)}\}_{k\ge 0}$ of the form
$$W^{(k)} = z^k + \sum_{l\ge 1} W^k_l z^{-l}. \qquad (6.2)$$
This space is a natural parameter space for the big cell in the Sato Grassmannian [21, 22]. The manifold P is a second copy of M, formed by sequences $\{H^{(k)}\}_{k\ge 0}$ of the form
$$H^{(k)} = z^k + \sum_{l\ge 1} H^k_l z^{-l}. \qquad (6.3)$$
Since the sequences $W^{(k)}$ and $H^{(k)}$ will play different roles in the sequel, it is convenient to regard the spaces M and P as distinct. Finally, the manifold N is the Cartesian product $M \times G$ of M by the group G of invertible Laurent series of the form
$$w = 1 + \sum_{l\ge 1} w_l z^{-l}. \qquad (6.4)$$
The vector field Z on P is any vector field of the Central System (5.5). We recall that it is completely characterized by the property
$$\left(\frac{\partial}{\partial t_j} + H^{(j)}\right)\mathcal{H}_+ \subset \mathcal{H}_+, \qquad (6.5)$$
where $\mathcal{H}_+$ is the linear span of the Laurent series $\{H^{(k)}\}_{k\ge 0}$. The vector field X on M is analogously characterized by the property
$$\left(\frac{\partial}{\partial t_j} + z^j\right)\mathcal{W}_+ \subset \mathcal{W}_+, \qquad (6.6)$$
where $\mathcal{W}_+$ is the linear span of the Laurent series $\{W^{(k)}\}_{k\ge 0}$. By comparing the expansions of both sides of Eq. (6.6) in powers of z, it is easily seen that the equations defining X are
$$\frac{\partial}{\partial t_j}W^{(k)} + z^jW^{(k)} = W^{(j+k)} + \sum_{l=1}^{j} W^k_l W^{(j-l)}. \qquad (6.7)$$
It can be shown [23] that these are precisely the linear flows on the Sato Grassmannian. Finally, to define the vector field Y on N we impose the further invariance condition
$$\left(\frac{\partial}{\partial t_j} + z^j\right)(w) \in \mathcal{W}_+, \qquad (6.8)$$
which is tantamount to defining
$$\frac{\partial}{\partial t_j}w + z^jw = \sum_{l=0}^{j} w_l W^{(j-l)}. \qquad (6.9)$$
We summarize this discussion in the following

Definition 6.1. The Central System (CS) is the family of vector fields
$$\frac{\partial H^{(k)}}{\partial t_j} + H^{(j)}H^{(k)} = H^{(j+k)} + \sum_{l=1}^{k} H^j_l H^{(k-l)} + \sum_{l=1}^{j} H^k_l H^{(j-l)}$$
on P uniquely characterized by the invariance condition (6.5). The Sato System (S) is the family of vector fields
$$\frac{\partial}{\partial t_j}W^{(k)} + z^jW^{(k)} = W^{(j+k)} + \sum_{l=1}^{j} W^k_l W^{(j-l)}$$
on M uniquely characterized by the invariance condition (6.6). The Darboux Sato System (DS) is the family of vector fields
$$\frac{\partial}{\partial t_j}W^{(k)} + z^jW^{(k)} = W^{(j+k)} + \sum_{l=1}^{j} W^k_l W^{(j-l)}, \qquad \frac{\partial}{\partial t_j}w + z^jw = \sum_{l=0}^{j} w_l W^{(j-l)}$$
on N uniquely characterized by the invariance conditions (6.6) and (6.8).

To complete the geometrical scheme of Darboux covering we have yet to define the maps σ and π. The map π is of course the canonical projection
$$\pi(w, \{W^{(k)}\}) = \{W^{(k)}\}. \qquad (6.10)$$
The map σ is defined by imposing the intertwining condition
$$w\cdot(\mathcal{H}_+) = \mathcal{W}_+ \qquad (6.11)$$
on the linear spans $\mathcal{H}_+$ and $\mathcal{W}_+$. It means that multiplying any element $H^{(j)}$ by w we get an element of $\mathcal{W}_+$. This happens if and only if
$$wH^{(j)} = \sum_{l=0}^{j} w_{j-l}W^{(l)} \qquad \forall j \ge 0. \qquad (6.12)$$
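Definition 6.2. We say that the sequence $\{H^{(k)}\}_{k\ge 0}$ is related to the sequence $\{W^{(k)}\}_{k\ge 0}$ by the Darboux transformation generated by w, and we write $\mathcal{H}_+ = D_w(\mathcal{W}_+)$, if $w\cdot\mathcal{H}_+ = \mathcal{W}_+$.

We are now in a position to prove the main property of the DS equations.

Proposition 6.3. The DS system is a Darboux covering of the Sato System, intertwining it with CS.

Proof. The only thing to show is that $\sigma_*(DS) = CS$. Notice that the definitions of σ and DS entail
$$\frac{\partial w}{\partial t_j} + z^jw = wH^{(j)}. \qquad (6.13)$$
This means that the operators $\frac{\partial}{\partial t_j} + z^j$ and $\frac{\partial}{\partial t_j} + H^{(j)}$ are intertwined by the multiplication by w, i.e.,
$$w\cdot\left(\frac{\partial}{\partial t_j} + H^{(j)}\right) = \left(\frac{\partial}{\partial t_j} + z^j\right)\cdot w. \qquad (6.14)$$
Therefore
$$w\cdot\left(\frac{\partial}{\partial t_j} + H^{(j)}\right)(\mathcal{H}_+) = \left(\frac{\partial}{\partial t_j} + z^j\right)(w(\mathcal{H}_+)) = \left(\frac{\partial}{\partial t_j} + z^j\right)\mathcal{W}_+ \subset \mathcal{W}_+, \qquad (6.15)$$
and consequently $\left(\frac{\partial}{\partial t_j} + H^{(j)}\right)(\mathcal{H}_+) \subset \mathcal{H}_+$, so that CS follows.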
We now exploit this result to define a Miura map relating directly S to CS. We consider in P the submanifold
$$H^{(0)} = 1, \qquad (6.16)$$
which is clearly invariant under CS, and we construct its inverse image in N,
$$w = W^{(0)}, \qquad (6.17)$$
with respect to the Darboux map $\sigma : N \to P$. The corresponding section $\rho : M \to N$ is given by $\rho(\{W^{(k)}\}) = (W^{(0)}, \{W^{(k)}\})$.

Proposition 6.4. The submanifold defined by Eq. (6.17) in N is a section of $\pi : N \to M$ which is invariant under DS.

Proof. We have to compare the DS equations for the pair $(W^{(0)}, w)$. They are:
$$\frac{\partial}{\partial t_j}w = -z^jw + \sum_{l=0}^{j} w_l W^{(j-l)}, \qquad \frac{\partial}{\partial t_j}W^{(0)} = -z^jW^{(0)} + \sum_{l=0}^{j} W^0_l W^{(j-l)}, \qquad (6.18)$$
since $W^0_0 = 1$. Hence
$$\frac{\partial}{\partial t_j}(w - W^{(0)}) = -z^j(w - W^{(0)}) + \sum_{l=0}^{j}(w_l - W^0_l)W^{(j-l)}, \qquad (6.19)$$
proving the statement.
Motivated by this result, we give the following

Definition 6.5. The nonlinear map
$$\mu = \sigma\circ\rho : M \to P \qquad (6.20)$$
given by
$$H^{(j)} = \left(\sum_{l=0}^{j} W^0_{j-l}W^{(l)}\right)\Big/\,W^{(0)} \qquad (6.21)$$
is the Miura map relating S to CS. It enjoys the property of mapping any solution of the Sato System into a solution of CS satisfying the constraint $H^{(0)} = 1$.
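7. Linearization of the Sato System and Families of Solutions

The final step is to show that the Sato System can be explicitly linearized. To this end it is useful to write it in matrix form. Notice that in components it reads
$$\frac{\partial W^j_m}{\partial t_k} + W^j_{k+m} - W^{j+k}_m = \sum_{l=1}^{k} W^j_l W^{k-l}_m. \qquad (7.1)$$
If we consider the infinite shift matrix
$$\Lambda = \begin{pmatrix} 0 & 1 & 0 & \cdots & \\ 0 & 0 & 1 & 0 & \cdots \\ \vdots & & \ddots & \ddots & \\ \vdots & & & \ddots & \ddots \end{pmatrix} \qquad (7.2)$$
and the convolution matrix of level k, with ones along the k-th antidiagonal and zeros elsewhere,
$$\Gamma_k = \begin{pmatrix} 0 & \cdots & 0 & 1 & 0 & \cdots \\ \vdots & & 1 & 0 & \cdots & \\ 0 & 1 & 0 & & & \\ 1 & 0 & & & & \\ 0 & \cdots & & & & \\ \vdots & & & & & \end{pmatrix}, \qquad (7.3)$$
we can write (7.1) in the matrix form:
$$\frac{\partial}{\partial t_k}W + W\cdot{}^T\!\Lambda^k - \Lambda^k\cdot W = W\,\Gamma_k\,W. \qquad (7.4)$$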
This equation belongs to a well-known class of linearizable matrix Riccati Eqs. [20].

Proposition 7.1. The infinite matrix W is a solution of the matrix Riccati Eq. (7.4) if and only if it has the form $W = V\cdot U^{-1}$, with U and V satisfying the constant coefficients linear system
$$\frac{\partial}{\partial t_k}U = {}^T\!\Lambda^k U - \Gamma_k V, \qquad \frac{\partial}{\partial t_k}V = \Lambda^k V. \qquad (7.5)$$
Proof. We only have to check that every solution of Eq. (7.4) can be obtained from (7.5). Let W be such a solution, and let
$$V = \exp\Bigl(\sum_{k\ge 1} t_k\Lambda^k\Bigr), \qquad U = W^{-1}V. \qquad (7.6)$$
Then (U, V) is a solution of (7.5).
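The proof above treats only one direction. The converse computation is short; the sketch below writes $\Lambda$ for the shift matrix (7.2), ${}^T\!\Lambda$ for its transpose and $\Gamma_k$ for the convolution matrix (7.3), and it assumes that U is invertible and that the formal products of infinite matrices make sense.

```latex
% Assume (U, V) solves the linear system (7.5) and set W = V U^{-1}.
\begin{align*}
\frac{\partial W}{\partial t_k}
 &= \frac{\partial V}{\partial t_k}\,U^{-1}
    - V U^{-1}\,\frac{\partial U}{\partial t_k}\,U^{-1} \\
 &= \Lambda^k V U^{-1}
    - V U^{-1}\bigl({}^T\!\Lambda^k U - \Gamma_k V\bigr)U^{-1} \\
 &= \Lambda^k W - W\,{}^T\!\Lambda^k + W\,\Gamma_k\,W ,
\end{align*}
% which is the matrix Riccati equation (7.4) after moving the linear terms to the left.
```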
Since U and V are infinite matrices, in general one cannot explicitly solve the linear system (7.5), and this procedure would require a discussion of suitable notions of convergence for formal series in infinitely many variables. Nevertheless, many solutions can be constructed as follows.¹ Let us notice that the constraint $W^j_i = 0$ ∀ i > n, j > m, is compatible with Eqs. (7.4). In other words, the space $\mathcal{W}_{m,n}$ of matrices W which have zero entries outside the first m rows and the first n columns is invariant for the Sato System. If we denote by $M_{m,n}$ the m × n matrix obtained from the infinite matrix M by taking its m × n upper corner, then for the matrices $W \in \mathcal{W}_{m,n}$ Eqs. (7.4) can be written as
$$\frac{\partial}{\partial t_k}W_{m,n} + W_{m,n}\cdot\left({}^T\!\Lambda_{n,n}\right)^k - \left(\Lambda_{m,m}\right)^k\cdot W_{m,n} = W_{m,n}\,(\Gamma_k)_{n,m}\,W_{m,n}. \qquad (7.7)$$
These are matrix Riccati equations for finite matrices. They can be linearized as in Proposition 7.1, and explicitly solved. Their solutions depend only on {tk }k=1,... ,m+n−1 , and should be compared with the Hirota polynomial solutions of the KP hierarchy.
¹ This is part of a joint work with J.P. Zubelli, which will appear in a forthcoming paper.

8. An Explicit Example

To make the discussion of the finite rank solutions more concrete, we present some explicit computations for the case of $\mathcal{W}_{3,2}$. To avoid clumsy notations, let us redefine the matrix coefficients appearing in (7.7) as follows:
$$W_{3,2} = W, \qquad \Lambda_{3,3} = A, \qquad {}^T\!\Lambda_{2,2} = B, \qquad (\Gamma_k)_{2,3} = C_k, \qquad (8.1)$$
for k = 1, . . . , 4. Hence, the only non-vanishing coefficients are:
$$A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad A^2, \quad B = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},$$
$$C_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad C_2 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \quad C_3 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \quad C_4 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \qquad (8.2)$$
This shows that the Sato System on $\mathcal{W}_{3,2}$ can be seen as a system of four Riccati-type ordinary differential equations in $\mathbb{C}^6$. According to the recipe of Proposition 7.1, we set $W = VU^{-1}$, where V is a 3 × 2 matrix, and U is a non-singular matrix of rank 2. We study the Cauchy problems
$$\frac{\partial}{\partial t_k}U = B^kU - C_kV, \quad U(0) = I, \qquad\qquad \frac{\partial}{\partial t_k}V = A^kV, \quad V(0) = W(0). \qquad (8.3)$$
We first consider the equation for V. Since A is nilpotent, the solution is the polynomial
$$V(t) = \exp\Bigl(\sum_{l=1}^{2} t_lA^l\Bigr)W(0). \qquad (8.4)$$
Now we recall the definition of the Schur polynomials $p_l(t_1, t_2, \ldots)$ given in Eq. (5.14) and of their “adjoint” ones,
$$\exp\Bigl(-\sum_{i=1}^{\infty} t_iz^i\Bigr) = \sum_{l=0}^{\infty}\tilde p_l(t_1, t_2, \ldots)\,z^l. \qquad (8.5)$$
Thus we can rewrite (8.4) as
$$V(t) = \sum_{l=0}^{2} p_l(t)A^l\,W(0) = \begin{pmatrix} 1 & p_1 & p_2 \\ 0 & 1 & p_1 \\ 0 & 0 & 1 \end{pmatrix}\cdot W(0). \qquad (8.6)$$
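The passage from (8.4) to (8.6) can be checked with exact arithmetic. The sketch below takes A to be the 3 × 3 upper shift matrix, sums the exponential series (which stops after the quadratic term because the exponent is nilpotent), and compares the result with the matrix of Schur polynomials $p_1 = t_1$, $p_2 = t_2 + \tfrac12 t_1^2$. The helper names (`matmul`, `lincomb`, `exp_of_flow`, `schur_matrix`) are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

# A is the 3x3 upper shift matrix; A**3 = 0.
A = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]


def matmul(X, Y):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def lincomb(terms, n=3):
    """Sum of coefficient * matrix over (coefficient, matrix) pairs."""
    return [[sum(c * M[i][j] for c, M in terms) for j in range(n)]
            for i in range(n)]


def identity(n=3):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def exp_of_flow(t1, t2):
    """exp(t1*A + t2*A^2), summed exactly: N^3 = 0, so the series stops at N^2/2."""
    t1, t2 = Fraction(t1), Fraction(t2)
    N = lincomb([(t1, A), (t2, matmul(A, A))])
    return lincomb([(Fraction(1), identity()),
                    (Fraction(1), N),
                    (Fraction(1, 2), matmul(N, N))])


def schur_matrix(t1, t2):
    """p_0*I + p_1*A + p_2*A^2 with p_1 = t1 and p_2 = t2 + t1^2/2."""
    t1, t2 = Fraction(t1), Fraction(t2)
    return lincomb([(Fraction(1), identity()),
                    (t1, A),
                    (t2 + t1 * t1 / 2, matmul(A, A))])


if __name__ == "__main__":
    print(exp_of_flow(2, 3) == schur_matrix(2, 3))
```

Multiplying either matrix on the right by the initial datum W(0) gives V(t).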
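As far as the second set of equations in (8.3) is concerned, we put $U = \exp(t_1B)\,U_0(t)$, so that these equations can be rewritten as
$$\frac{\partial}{\partial t_k}U_0 = -\Bigl(\sum_{l=0}^{1}\tilde p_lB^l\Bigr)\cdot C_k\Bigl(\sum_{j=0}^{2} p_jA^j\Bigr)\cdot W(0), \qquad (8.7)$$
with $U_0(0) = I$. The Cauchy problems with initial data $U_0(0) = I$ can thus be easily solved; we get
$$U_0(t) = I - \begin{pmatrix} t_1 & t_2 + \tfrac12 t_1^2 & t_3 + t_1t_2 + \tfrac16 t_1^3 \\ t_2 - \tfrac12 t_1^2 & t_3 - \tfrac13 t_1^3 & t_4 + \tfrac12 t_2^2 - \tfrac12 t_1^2t_2 - \tfrac18 t_1^4 \end{pmatrix}W(0). \qquad (8.8)$$
In particular, for
$$W(0) = \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 1 \end{pmatrix},$$
we obtain
$$W(t) = V(t)U(t)^{-1} = \tau^{-1}\begin{pmatrix} -t_1t_2 - \tfrac12 t_1^3 & t_2 + \tfrac12 t_1^2 \\ -t_1^2 & t_1 \\ -t_1 & 1 \end{pmatrix},$$
where $\tau = \det U(t) = 1 - t_4 - \tfrac12 t_2^2 + \tfrac12 t_1^2t_2 + \tfrac18 t_1^4$. Therefore the corresponding solution of S is
$$\begin{aligned}
W^{(0)} &= 1 + \tau^{-1}\left(-(t_1t_2 + \tfrac12 t_1^3)z^{-1} + (t_2 + \tfrac12 t_1^2)z^{-2}\right), \\
W^{(1)} &= z + \tau^{-1}\left(-t_1^2z^{-1} + t_1z^{-2}\right), \\
W^{(2)} &= z^2 + \tau^{-1}\left(-t_1z^{-1} + z^{-2}\right), \\
W^{(k)} &= z^k \quad \text{for } k \ge 3.
\end{aligned} \qquad (8.9)$$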
Using the Miura map (6.21) we obtain a solution $\{H^{(j)}\}$ of the Central System such that $H^{(k)} = z^k$ for k = 0 and k ≥ 5. For example, $H^{(1)}$ is given by
$$W^{(0)}H^{(1)} = W^{(1)} + W^0_1W^{(0)}. \qquad (8.10)$$
This means that the coefficients $H^1_k$ can be computed using the recursion relations
$$\begin{aligned}
H^1_1 &= W^1_1 + (W^0_1)^2 - W^0_2, \\
H^1_2 &= W^1_2 + 2W^0_1W^0_2 - W^0_1W^1_1 - (W^0_1)^3, \\
H^1_{j+2} &= -\left(H^1_{j+1}W^0_1 + H^1_jW^0_2\right) \quad \text{for } j \ge 1.
\end{aligned}$$
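The recursion relations above amount to dividing one Laurent series by another, coefficient by coefficient. The following sketch carries out that division for (8.10) in the two-column case of this section, where $W^{(0)}$ and $W^{(1)}$ have only the coefficients of $z^{-1}$ and $z^{-2}$. The function name `h1_coefficients` and its tuple arguments are assumptions made for illustration; the inputs are treated as plain numbers rather than functions of the times.

```python
from fractions import Fraction


def h1_coefficients(w0, w1, n_terms):
    """Coefficients H^1_1, ..., H^1_n of H^(1) solving W^(0) H^(1) = W^(1) + W^0_1 W^(0).

    w0 = (W^0_1, W^0_2) and w1 = (W^1_1, W^1_2) are the only non-zero
    coefficients of negative powers in W^(0) and W^(1) (two-column case).
    """
    a1, a2 = (Fraction(x) for x in w0)
    b1, b2 = (Fraction(x) for x in w1)
    # Coefficients of z^{-1} and z^{-2} on the right-hand side of (8.10).
    rhs = {1: b1 + a1 * a1, 2: b2 + a1 * a2}
    # h[-1] is the coefficient of z in H^(1); h[0] is the coefficient of z^0.
    h = {-1: Fraction(1), 0: Fraction(0)}
    for m in range(1, n_terms + 1):
        # Coefficient of z^{-m} on the left: h[m] + a1*h[m-1] + a2*h[m-2].
        h[m] = rhs.get(m, Fraction(0)) - a1 * h[m - 1] - a2 * h[m - 2]
    return [h[m] for m in range(1, n_terms + 1)]


if __name__ == "__main__":
    a1, a2, b1, b2 = 2, 3, 5, 7
    coeffs = h1_coefficients((a1, a2), (b1, b2), 4)
    # Compare with the closed forms for H^1_1 and H^1_2 quoted in the text.
    print(coeffs[0] == b1 + a1**2 - a2)
    print(coeffs[1] == b2 + 2 * a1 * a2 - a1 * b1 - a1**3)
```

From the third coefficient onward the loop reduces to the homogeneous recursion $H^1_{j+2} = -(H^1_{j+1}W^0_1 + H^1_jW^0_2)$, since the right-hand side of (8.10) has no terms beyond $z^{-2}$.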
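In a more compact form,
$$H^{(1)} = \left(W^{(1)} + W^0_1W^{(0)}\right)/W^{(0)} = -\tau^{-1}\left(t_1t_2 + \tfrac12 t_1^3\right) + \frac{z + \tau^{-1}\left(-t_1^2z^{-1} + t_1z^{-2}\right)}{1 + \tau^{-1}\left(-(t_1t_2 + \tfrac12 t_1^3)z^{-1} + (t_2 + \tfrac12 t_1^2)z^{-2}\right)}.$$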
As we have seen in Sect. 5, $h = H^{(1)}$ is a solution of the KP equations after putting $t_1 = x$.

9. Summary

In this paper we have tried to give an overview of the bihamiltonian approach to the KP theory, starting from the primitive idea of the Poisson pencil to arrive at the polynomial solutions of these equations. The approach consists of two parts, dealing with the equations and with their solutions respectively. In the first part we have traced the way from KdV to CS (through a double process of extension), and backwards from CS to KdV (through a projection and a restriction). In the second part we have shown how to use the technique of Darboux coverings to linearize the equations and, therefore, to construct explicit solutions. We hope that, by providing an alternative view of the theory, the present paper may clarify the logical structure of the Hamiltonian approach to the KP theory.

Acknowledgement. We thank P. Casati and J.P. Zubelli for illuminating discussions and comments. Two of us (F.M. and M.P.) would like to thank the staff of the Centre Emile Borel and the organizers of the semester Integrable Systems, Professors O. Babelon, P. van Moerbeke, and J.B. Zuber, for providing a warm environment where part of this work has been done. Thanks are also due to the anonymous referee for useful remarks.
References
1. Alber, S.I.: On stationary problems for equations of Korteweg–de Vries type. Comm. Pure Appl. Math. 34, 259–272 (1981)
2. Arnol’d, V.I.: Mathematical methods of Classical Mechanics. Second Edition, New York: Springer-Verlag, 1989
3. Bakas, I., Depireux, D.A.: A Fractional KdV Hierarchy. Mod. Phys. Lett. A6, 1561–1573 (1991)
4. Casati, P., Falqui, G., Magri, F., Pedroni, M.: The KP theory revisited. IV. KP equations, Dual KP equations, Baker–Akhiezer and τ-functions. SISSA preprint 5/96/FM
5. Casati, P., Falqui, G., Magri, F., Pedroni, M.: A note on Fractional KdV Hierarchies. J. Math. Phys. 38, 4606–4628 (1997)
6. Casati, P., Magri, F., Pedroni, M.: Bihamiltonian Manifolds and τ-function. In: Mathematical Aspects of Classical Field Theory 1991, M.J. Gotay et al. eds., Contemporary Mathematics, Vol. 132, Providence, RI: American Mathematical Society, 1992, pp. 213–234
7. Cherednik, I.V.: Differential equations for the Baker–Akhiezer functions of Algebraic curves. Funct. Anal. Appl. 10, 195–203 (1978)
8. Date, E., Jimbo, M., Kashiwara, M., Miwa, T.: Transformation Groups for Soliton Equations. In: Proceedings of R.I.M.S. Symposium on Nonlinear Integrable Systems-Classical Theory and Quantum Theory, M. Jimbo, T. Miwa, eds., Singapore: World Scientific, 1983, pp. 39–119
9. Dickey, L.A.: Soliton Equations and Hamiltonian Systems. Adv. Series in Math. Phys. Vol. 12, Singapore: World Scientific, 1991
10. Gel’fand, I.M., Zakharevich, I.: On the local geometry of a bi-Hamiltonian structure. In: The Gelfand Mathematical Seminars 1990–1992, L. Corwin et al. eds., Boston: Birkhäuser, 1993, pp. 51–112
11. de Groot, M.F., Hollowood, T.J., Miramontes, J.L.: Generalized Drinfel’d–Sokolov Hierarchies. Commun. Math. Phys. 145, 57–84 (1992)
12. Flaschka, H.: Construction of Conservation Laws for Lax equations: Comments on a paper by G. Wilson. Quart. J. Math. Oxford 34, 61–65 (1983)
13. Hirota, R.: Exact solution of the Korteweg–de Vries equation for multiple collisions of solitons. Phys. Rev. Lett. 27, 1192–1194 (1972)
14. Kupershmidt, B.A.: On the nature of the Gardner transformation. J. Math. Phys. 22, 449–451 (1981)
15. Magri, F., Pedroni, M., Zubelli, J.P.: On the Geometry of Darboux Transformations for the KP Hierarchy and its Connection with the Discrete KP Hierarchy. Commun. Math. Phys. 188, 305–325 (1997)
16. Manin, Yu.I.: Algebraic aspects of non-linear differential equations. J. Sov. Math. 11, 1–122 (1978)
17. Miura, R.M., Gardner, C.S., Kruskal, M.D.: Korteweg–de Vries equation and generalizations. II. Existence of conservation laws and constants of motion. J. Math. Phys. 9, 1204–1209 (1968)
18. van Moerbeke, P.: Integrable Foundations of String Theory. In: Lectures on Integrable Systems, O. Babelon et al. eds., Singapore: World Scientific, 1994, pp. 163–268
19. Mulase, M.: Algebraic Theory of the KP Equations. In: Perspectives in Mathematical Physics, R. Penner and S-T. Yau, eds., Boston: International Press, 1994, pp. 151–218
20. Reid, W.T.: Riccati Differential Equations. New York: Academic Press, 1972
21. Sato, M., Sato, Y.: Soliton equations as dynamical systems on infinite-dimensional Grassmann manifold. In: Nonlinear PDEs in Applied Sciences (US-Japan Seminar, Tokyo), P. Lax and H. Fujita eds., Amsterdam: North-Holland, 1982, pp. 259–271
22. Segal, G., Wilson, G.: Loop Groups and equations of the KdV type. Publ. Math. IHES 61, 5–65 (1985)
23. Takasaki, K.: Geometry of Universal Grassmann Manifold from Algebraic point of view. Rev. Math. Phys. 1, 1–46 (1989)
24. Wilson, G.: On two Constructions of Conservation Laws for Lax equations. Quart. J. Math. Oxford 32, 491–512 (1981)

Communicated by G. Felder
Commun. Math. Phys. 197, 325 – 345 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
A Stochastic Model for Growing Sandpiles and its Continuum Limit
Lawrence C. Evans*, Fraydoun Rezakhanlou**
Department of Mathematics, University of California, Berkeley, California 94720, USA
Received: 4 November 1997 / Accepted: 9 March 1998

* Supported in part by NSF Grant DMS94–24342.
** Supported in part by NSF Grant DMS97–04565 and a Sloan Foundation fellowship.

Abstract: We study a stochastic model for sandpile growth. The height of the sandpile above every site of a d-dimensional lattice is denoted by an integer. The height changes randomly and, after a suitable scaling of time and space, we show that the rescaled height converges to a deterministic function. The limit is governed by a nonlinear evolution equation.

1. Introduction

This paper introduces a stochastic model for “sandpile growth” in any number of spatial dimensions, and verifies convergence in a rescaled limit to a simple deterministic nonlinear and nonlocal evolution governed by a convex functional. The probabilistic setting is this. We envision the lattice $\mathbb{Z}^2 \subset \mathbb{R}^2$ as regularly subdividing the plane into unit squares. We model our sandpile at each instant of time as a stack of unit cubes resting on the plane, each column of cubes lying above a base square. At each moment the configuration of cubes must be stable, which means that the heights of any two adjacent stacks of cubes can differ at most by one. (A given column has four adjacent columns, to the north, south, east, west.) The configuration can thus be a complicated stack of cubes, with columns of varying heights, but nowhere can there be a jump of size greater than or equal to two in any coordinate direction.

Imagine next that a new cube is randomly added to the pile, being placed either upon a heretofore unoccupied square in the plane or else upon the top of a current column. In the former case the new configuration is stable, but otherwise this need not be so. In this circumstance we ordain that the new cube “fall down the stack”, moving downhill in jumps of size one, from one to another adjacent column, until such a move is no longer possible. The new cube thereafter remains in place and we have reached a new stable configuration. If there are several downhill “staircases” along which the cube can move, it will randomly select among the allowable downhill paths. We finally suppose that, starting from an empty stack, we continually add cubes at random locations, each of which as necessary quickly falls downhill into a stable configuration.

What happens in the limit if we rescale in both space and time, so as to consider growing piles of more and more, smaller and smaller cubes? Take N to be a large integer and imagine then such a complex landscape built from cubes of side length $O(N^{-1})$, along the surface of which cubes newly and randomly added at rate $O(N)$ are continually falling downhill through complicated paths into stable resting places. Any particular realization of such a stochastic process can be complicated, and in particular displays long range correlations, as there is no upper bound, independent of N, on the number of steps a moving cube may take. It is consequently surprising that the macroscopic limit as $N \to \infty$ is rather simple. We generalize to $\mathbb{R}^n$ and prove that the limiting dynamics for the height $u = u(x, t)$ ($x \in \mathbb{R}^n$, $t \ge 0$) of the sandpile are governed by the nonlinear evolution equation:
$$\begin{cases} f - u_t \in \partial J[u] & (t \ge 0) \\ u = 0 & (t = 0), \end{cases} \qquad (1.1)$$
where $f = f(x, t)$ is the macroscopic rate at which cubes are added at the point x, time t, and $J[\cdot]$ is the convex functional defined by
$$J[v] := \begin{cases} 0 & \text{if } v \in L^2(\mathbb{R}^n),\ |v_{x_i}| \le 1 \text{ a.e. } (i = 1, \ldots, n) \\ +\infty & \text{otherwise.} \end{cases} \qquad (1.2)$$
More precisely,
$$J[v] = \begin{cases} 0 & \text{if } v \in K \\ +\infty & \text{if } v \in L^2(\mathbb{R}^n) - K \end{cases} \qquad (1.3)$$
for
$$K := \{v \in L^2(\mathbb{R}^n) \mid v \text{ is Lipschitz},\ |v_{x_i}| \le 1 \text{ a.e. } (i = 1, \ldots, n)\}. \qquad (1.4)$$
Recall that a Lipschitz function is differentiable a.e. The subdifferential notation “∂J[·]” in (1.1) means that for a.e. time t ≥ 0:
\[ J[u(\cdot, t)] + (f - u_t(\cdot, t),\, v - u(\cdot, t))_{L^2(\mathbb{R}^n)} \le J[v] \tag{1.5} \]
for each v ∈ L^2(R^n), ( , )_{L^2} being the L^2-inner product. As (1.1) implies u(·, t) ∈ K and the right-hand side of (1.5) equals +∞ unless v ∈ K, we can rewrite (1.5):
\[ \int_{\mathbb{R}^n} (f - u_t(\cdot, t))(v - u(\cdot, t))\,dx \le 0 \quad \text{for a.e. } t \ge 0 \text{ and each } v \in K. \tag{1.6} \]
The inequality here signals the dissipative nature of the dynamics (1.1). We further generalize and study models for which cubes are either added or removed randomly. If f^+ ≥ 0 and f^- ≥ 0 are the rates at which cubes are added and taken away, then the macroscopic height u solves (1.1) with f := f^+ − f^-. We will briefly discuss in Sect. 7 connections of this model with those suggested by Prigozhin [P1, P2, P3], Puhl [PL] and others. The paper is organized as follows. In Sect. 2 we introduce a probabilistic lattice model for the evolving pile of cubes. We thus consider a Markov process for the height
Stochastic Model for Growing Sandpiles and Continuum Limit
η(i, t), defined for times t ≥ 0 and sites i ∈ Z^n. The stability condition requires for each time t ≥ 0 that |η(i, t) − η(j, t)| ≤ 1 if i, j ∈ Z^n are adjacent sites. Rescaled source terms f^+(i/N, t/N) and f^-(i/N, t/N) control the rate at which new cubes are added to the pile or removed from it, and we introduce as well highly nonlocal factors c^±(j, η(·, t), t) recording the rate at which new cubes come to rest at the site j after “falling downhill”. It turns out that the precise details as to how cubes move when several downhill paths are available are largely irrelevant. In Sect. 3 we provide some elementary estimates relating the sums of the terms involving f^± and c^± over various types of sets of sites. The key estimates (3.10), (3.11) provide a kind of microscopic version of (1.6). Section 4 introduces a deterministic lattice approximation to (1.1) and Sect. 5 uses this approximation to prove E sup_{x∈R^n} |(1/N) η([xN], Nt) − u(x, t)| → 0 as N → ∞. Section 6 discusses “sandpiles” growing within a bounded region D, and explains how to modify our arguments for the two natural boundary conditions that cubes either “pile up” or else “fall off” of the edge of D. The concluding Sect. 7 discusses some further features of our model, and draws connections with related models in the literature.

2. The Model

We make precise in this section the probabilistic model informally described in Sect. 1. We write Z^n for the usual n-dimensional integer lattice and i = (i_1, . . . , i_n) for a typical site in Z^n. We say i, j ∈ Z^n are adjacent, written i ∼ j, provided n − 1 of the components of i and j agree and the remaining component differs by ±1. A (stable) configuration is a mapping η : Z^n → Z such that
\[ |\eta(i) - \eta(j)| \le 1 \ \text{ if } i \sim j, \text{ and } \eta \text{ has bounded support.} \tag{2.1} \]
The state space is S := {η : Z^n → Z | η is a configuration}.
Given a configuration η ∈ S and a site i ∈ Z^n, we write
\[ \Gamma^+(i, \eta) := \{ j \in \mathbb{Z}^n \mid \text{there exist sites } i = i_1 \sim i_2 \sim \dots \sim i_m = j \text{ with } \eta(i_{l+1}) = \eta(i_l) - 1 \ (l = 1, \dots, m-1),\ \eta(k) \ne \eta(j) - 1 \text{ for all } k \sim j \}, \tag{2.2} \]
\[ \Gamma^-(i, \eta) := \{ j \in \mathbb{Z}^n \mid \text{there exist sites } i = i_1 \sim i_2 \sim \dots \sim i_m = j \text{ with } \eta(i_{l+1}) = \eta(i_l) + 1 \ (l = 1, \dots, m-1),\ \eta(k) \ne \eta(j) + 1 \text{ for all } k \sim j \}. \tag{2.3} \]
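The set of resting sites in (2.2) can be computed by a graph search along descending staircases. The following is a minimal sketch, not taken from the paper: the function names `neighbors` and `gamma_plus` are hypothetical, a configuration is stored as a dictionary from sites (tuples of integers) to heights, and sites missing from the dictionary are assumed to have height 0.

```python
def neighbors(site):
    """Adjacent lattice sites: exactly one coordinate changed by +1 or -1."""
    for axis in range(len(site)):
        for step in (-1, 1):
            yield site[:axis] + (site[axis] + step,) + site[axis + 1:]


def gamma_plus(i, eta):
    """Sites where a cube added at i can come to rest, following (2.2).

    eta maps sites to integer heights (assumption: missing sites have height 0).
    """
    def height(s):
        return eta.get(s, 0)

    seen, stack, rest = {i}, [i], set()
    while stack:
        s = stack.pop()
        # Neighbours exactly one unit lower continue the downward staircase.
        lower = [k for k in neighbors(s) if height(k) == height(s) - 1]
        if not lower:
            rest.add(s)  # no neighbour at height(s) - 1: the cube stops here
        for k in lower:
            if k not in seen:
                seen.add(k)
                stack.append(k)
    return rest


# A one-dimensional pyramid of height 2 centred at the origin.
pile = {(0,): 2, (1,): 1, (-1,): 1}
resting = gamma_plus((0,), pile)
```

For this pile a cube added at the origin must fall two steps to either side, so `resting` contains the sites `(2,)` and `(-2,)`. The upward sets of (2.3) are obtained by replacing `height(s) - 1` with `height(s) + 1`.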
Thus j ∈ Γ^+(i, η) (resp. Γ^-(i, η)) if and only if there exists on the graph of η a downward (resp. upward) staircase through adjacent sites, starting at i and ending at j. To each j ∈ Γ^±(i, η) we assign a number 0 ≤ p^±(i, j, η) ≤ 1, in such a way that
\[ \sum_{j \in \Gamma^\pm(i, \eta)} p^\pm(i, j, \eta) = 1 \qquad (\eta \in S,\ i \in \mathbb{Z}^n). \tag{2.4} \]
Let us also set p^±(i, j, η) = 0 if j ∉ Γ^±(i, η). We regard p^+(i, j, η) as the probability that a new cube placed at site i will end up at j, after it has “fallen downward” over the stack with height η. Likewise, we regard p^-(i, j, η) as the probability that the removal of a cube from the pile at i will result in a removal at site j, after the cubes along a staircase each shift downwards to fill in the gap created at site i. Given then t ≥ 0 and any function F : S → R, we define
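\[ L_t F : S \to \mathbb{R} \]
by
\[ (L_t F)(\eta) := (L_t^+ F)(\eta) + (L_t^- F)(\eta) := \sum_{j \in \mathbb{Z}^n} c^+(j, \eta, t)\,(F(\eta^j) - F(\eta)) + \sum_{j \in \mathbb{Z}^n} c^-(j, \eta, t)\,(F(\eta_j) - F(\eta)), \tag{2.5} \]
where
\[ \eta^j(i) := \begin{cases} \eta(i) + 1 & \text{if } i = j \\ \eta(i) & \text{otherwise,} \end{cases} \qquad \eta_j(i) := \begin{cases} \eta(i) - 1 & \text{if } i = j \\ \eta(i) & \text{otherwise,} \end{cases} \]
and
\[ c^\pm(j, \eta, t) := \sum_{i :\, j \in \Gamma^\pm(i, \eta)} p^\pm(i, j, \eta)\, f^\pm\!\left(\frac{i}{N}, \frac{t}{N}\right). \tag{2.6} \]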
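To make (2.6) concrete, the sketch below computes the deposition rates c^+ for a small pile. It is only an illustration under stated assumptions: the paper leaves the probabilities p^+ arbitrary subject to (2.4), and here they are taken uniform on the resting set; the names `gamma_plus` and `rates_plus` and the numerical source values are hypothetical.

```python
from collections import defaultdict


def neighbors(site):
    """Adjacent lattice sites: exactly one coordinate changed by +1 or -1."""
    for axis in range(len(site)):
        for step in (-1, 1):
            yield site[:axis] + (site[axis] + step,) + site[axis + 1:]


def gamma_plus(i, eta):
    """Resting sites reachable from i by a downward staircase, as in (2.2)."""
    def height(s):
        return eta.get(s, 0)

    seen, stack, rest = {i}, [i], set()
    while stack:
        s = stack.pop()
        lower = [k for k in neighbors(s) if height(k) == height(s) - 1]
        if not lower:
            rest.add(s)
        for k in lower:
            if k not in seen:
                seen.add(k)
                stack.append(k)
    return rest


def rates_plus(eta, source):
    """c^+(j) = sum over i of p^+(i, j) * source[i], as in (2.6).

    Assumption: p^+(i, .) is uniform on the resting set of i, which
    is one admissible choice satisfying (2.4).
    """
    c = defaultdict(float)
    for i, rate in source.items():
        rest = gamma_plus(i, eta)
        for j in rest:
            c[j] += rate / len(rest)
    return dict(c)


pile = {(0,): 2, (1,): 1, (-1,): 1}
source = {(0,): 3.0, (1,): 1.0}  # hypothetical values of f^+(i/N, t/N)
c_plus = rates_plus(pile, source)
total_in, total_out = sum(source.values()), sum(c_plus.values())
```

Because the probabilities sum to one over each resting set, the total deposition rate `total_out` equals the total source rate `total_in`; the rates are merely redistributed to the sites where cubes can rest.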
Thus c^+(j, η, t) is the rate at which newly added cubes come to rest at the site j, and is computed by considering the rate at which cubes are added at each uphill site i for which j ∈ Γ^+(i, η), times the probability that a cube appearing on the pile at i will end up at j. Likewise c^-(j, η, t) is the rate at which cubes are removed from the site j. Then L_t is the infinitesimal generator at time t of a continuous-time Markov process on S. Let {η(·, t)}_{t≥0} denote this Markov process on S generated by {L_t}_{t≥0}, starting for simplicity with η(·, 0) ≡ 0. We intend to prove that
\[ \frac{1}{N}\, \eta([xN], tN) \to u(x, t) \qquad (x \in \mathbb{R}^n,\ t \ge 0) \tag{2.7} \]
a.s. as N → ∞, where u is the unique solution of the nonlinear evolution (1.1). We will assume hereafter
\[ \begin{cases} f^\pm : \mathbb{R}^n \times [0, \infty) \to [0, \infty) \text{ are bounded, Lipschitz, nonnegative, and} \\ \text{there exists } R > 0 \text{ such that } \operatorname{spt} f^\pm(\cdot, t) \subset Q(0, R) \ (t \ge 0). \end{cases} \tag{2.9} \]
Here Q(0, R) denotes the cube with center 0, side length 2R, and faces parallel to the coordinate planes.
The proof of (2.7) can be easily generalized to cover nonzero initial data. If we assume
\[ \frac{1}{N}\, \eta([xN], 0) \to u_0(x) \qquad (x \in \mathbb{R}^n) \]
for some u_0 ∈ K, then we can show (2.7), where u is the unique solution to the nonlinear evolution (1.1) with the initial condition u(x, 0) = u_0(x).
3. Sets of Types I–IV

This section is devoted to some combinatorial lemmas. Fix a configuration η ∈ S. We define a set A ⊆ Z^n to be of type I for η provided
\[ i \in A \ \text{ implies } \ \Gamma^+(i, \eta) \subseteq A. \tag{3.1} \]
In other words, if i ∈ A, then no cube falling downhill starting from i can come to rest outside A. On the other hand, a cube added to the pile at a site outside of A could possibly end up within A. We declare a set A ⊆ Z^n to be of type II for η provided A^c := Z^n − A is of type I. In other words, if i ∉ A, then no cube falling downhill starting from i can come to rest within A. We likewise define a set A ⊆ Z^n to be of type III for η provided
\[ i \in A \ \text{ implies } \ \Gamma^-(i, \eta) \subseteq A. \tag{3.2} \]
We lastly declare a set A ⊆ Z^n to be of type IV for η provided A^c is of type III.

Lemma 3.1. Fix any configuration η ∈ S.
(i) If ξ ∈ S is also a configuration and r ∈ Z, then
\[ A_r := \{ i \mid \eta(i) \ge \xi(i) + r \} \tag{3.3} \]
is a set of types II and III for η, and
\[ B_r := \{ i \mid \eta(i) \le \xi(i) + r \} \tag{3.4} \]
is a set of types I and IV for η.
(ii) If A is a set of type II for η and t ≥ 0, then
\[ \sum_{j \in A} c^+(j, \eta, t) \le \sum_{i \in A} f^+\!\left(\frac{i}{N}, \frac{t}{N}\right). \tag{3.5} \]
If B is a set of type I for η and t ≥ 0, then
\[ \sum_{j \in B} c^+(j, \eta, t) \ge \sum_{i \in B} f^+\!\left(\frac{i}{N}, \frac{t}{N}\right). \tag{3.6} \]
(iii) If A is a set of type IV for η and t ≥ 0, then
\[ \sum_{j \in A} c^-(j, \eta, t) \le \sum_{i \in A} f^-\!\left(\frac{i}{N}, \frac{t}{N}\right). \tag{3.7} \]
If B is a set of type III for η and t ≥ 0, then
\[ \sum_{j \in B} c^-(j, \eta, t) \ge \sum_{i \in B} f^-\!\left(\frac{i}{N}, \frac{t}{N}\right). \tag{3.8} \]
The interpretation of (3.5) is that since cubes newly placed in the type II set A can possibly move and come to rest outside of A, the growth rate of the height of the pile within A is less than or equal to the rate at which cubes are added to A. Conversely, for a type I set B, the height growth rate may exceed the rate at which cubes are added, as some cubes may move into B from outside: this is (3.6). Inequalities (3.7), (3.8) admit similar interpretations.

Proof. 1. If we demonstrate that the set A_r is of types II and III, then it follows that the set B_r is of types I and IV, because B_r^c = A_{r+1}. Let us show A_r is of type II. So take η, ξ ∈ S, define A_r by (3.3), and assume i ∉ A_r. Then
\[ \eta(i) < \xi(i) + r. \tag{3.9} \]
Let j ∈ Γ^+(i, η). Then there exist sites i = i_1 ∼ i_2 ∼ · · · ∼ i_m = j with
\[ \begin{cases} \eta(i_{l+1}) = \eta(i_l) - 1 & (l = 1, \dots, m-1) \\ \eta(k) \ne \eta(j) - 1 & \text{for all } k \sim j. \end{cases} \]
This and (3.9) imply:
\[ \begin{aligned} r + \xi(i_2) &\ge r + \xi(i_1) - 1 > \eta(i_1) - 1 = \eta(i_2), \\ &\ \ \vdots \\ r + \xi(i_m) &\ge r + \xi(i_{m-1}) - 1 > \eta(i_{m-1}) - 1 = \eta(i_m). \end{aligned} \]
Remember that i_m = j. Thus η(j) < ξ(j) + r, and so j ∉ A_r. Consequently Γ^+(i, η) ∩ A_r = ∅ if i ∉ A_r, and this means that A_r is of type II. In the same way we can show that the set A_r is of type III.

2. Next we prove assertion (ii). Assume A is of type II and t ≥ 0. Then (2.6) implies
\[ \sum_{j \in A} c^+(j, \eta, t) = \sum_{j \in A} \ \sum_{i :\, j \in \Gamma^+(i, \eta)} p^+(i, j, \eta)\, f^+\!\left(\frac{i}{N}, \frac{t}{N}\right). \]
Since A is of type II, any site i for which Γ^+(i, η) contains a site j ∈ A must itself lie in A. Consequently
\[ \sum_{j \in A} c^+(j, \eta, t) \le \sum_{i \in A} f^+\!\left(\frac{i}{N}, \frac{t}{N}\right) \sum_{j \in A} p^+(i, j, \eta) \le \sum_{i \in A} f^+\!\left(\frac{i}{N}, \frac{t}{N}\right), \]
owing to (2.4), since f^+ ≥ 0.

3. Assume next B is of type I and t ≥ 0. Then (2.6) implies
\[ \sum_{j \in B} c^+(j, \eta, t) = \sum_{j \in B} \ \sum_{i :\, j \in \Gamma^+(i, \eta)} p^+(i, j, \eta)\, f^+\!\left(\frac{i}{N}, \frac{t}{N}\right). \]
Since B is of type I, i ∈ B implies Γ^+(i, η) ⊆ B. Therefore
\[ \sum_{j \in B} c^+(j, \eta, t) \ge \sum_{j \in B} \ \sum_{i \in B :\, j \in \Gamma^+(i, \eta)} p^+(i, j, \eta)\, f^+\!\left(\frac{i}{N}, \frac{t}{N}\right) = \sum_{i \in B} f^+\!\left(\frac{i}{N}, \frac{t}{N}\right) \sum_{j \in \Gamma^+(i, \eta)} p^+(i, j, \eta) = \sum_{i \in B} f^+\!\left(\frac{i}{N}, \frac{t}{N}\right) \]
by (2.4).

4. The inequalities (3.7), (3.8) are established similarly.
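Assertion (i) of Lemma 3.1 lends itself to a quick numerical sanity check. The sketch below works on the one-dimensional lattice only and relies on assumptions not made in the paper: random stable configurations are produced by dropping cubes at random sites and letting them fall to a uniformly chosen resting site, and the type I and type II properties are tested on a finite window of sites. All function names are hypothetical.

```python
import random


def gamma_plus(i, eta):
    """Resting sites of a cube added at site i of the 1D lattice, following (2.2)."""
    def height(s):
        return eta.get(s, 0)

    seen, stack, rest = {i}, [i], set()
    while stack:
        s = stack.pop()
        lower = [k for k in (s - 1, s + 1) if height(k) == height(s) - 1]
        if not lower:  # no neighbour exactly one lower: the cube rests here
            rest.add(s)
        for k in lower:
            if k not in seen:
                seen.add(k)
                stack.append(k)
    return rest


def random_configuration(rng, cubes=40, width=6):
    """Build a stable configuration by dropping cubes at random sites."""
    eta = {}
    for _ in range(cubes):
        start = rng.randint(-width, width)
        j = rng.choice(sorted(gamma_plus(start, eta)))
        eta[j] = eta.get(j, 0) + 1
    return eta


def check_lemma(eta, xi, r, window):
    """Check that A_r is of type II and B_r is of type I on the given window."""
    def in_A(i):  # A_r of (3.3)
        return eta.get(i, 0) >= xi.get(i, 0) + r

    def in_B(i):  # B_r of (3.4)
        return eta.get(i, 0) <= xi.get(i, 0) + r

    type_II = all(not in_A(j)
                  for i in window if not in_A(i)
                  for j in gamma_plus(i, eta))
    type_I = all(in_B(j)
                 for i in window if in_B(i)
                 for j in gamma_plus(i, eta))
    return type_II and type_I


rng = random.Random(0)
all_ok = all(
    check_lemma(random_configuration(rng), random_configuration(rng),
                r, range(-30, 31))
    for _ in range(100)
    for r in range(-3, 4)
)
print(all_ok)
```

The check passes for every sampled pair of configurations, as the proof predicts: along a downward staircase η drops by exactly one per step, while ξ can drop by at most one.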
Next are the basic inequalities we will eventually need to pass from (2.5) to (1.1).

Lemma 3.2. If η, ξ are any two configurations in S, then
\[ \sum_j c^+(j, \eta, t)(\eta(j) - \xi(j)) \le \sum_i f^+\!\left(\frac{i}{N}, \frac{t}{N}\right)(\eta(i) - \xi(i)), \tag{3.10} \]
\[ \sum_j c^-(j, \eta, t)(\eta(j) - \xi(j)) \ge \sum_i f^-\!\left(\frac{i}{N}, \frac{t}{N}\right)(\eta(i) - \xi(i)). \tag{3.11} \]

Proof. 1. We rewrite the left-hand side of the expression in (3.10) as \(\sum_{r=-\infty}^{\infty} M(r)\, r\), where
\[ M(r) := \sum_{j :\, \eta(j) - \xi(j) = r} c^+(j, \eta, t). \]
Likewise the right-hand side of (3.10) is \(\sum_{r=-\infty}^{\infty} N(r)\, r\), for
\[ N(r) := \sum_{i :\, \eta(i) - \xi(i) = r} f^+\!\left(\frac{i}{N}, \frac{t}{N}\right). \tag{3.12} \]
We must therefore prove
\[ \sum_{r=-\infty}^{\infty} M(r)\, r \le \sum_{r=-\infty}^{\infty} N(r)\, r. \tag{3.13} \]
We sum by parts, by first of all setting
\[ \begin{cases} \hat M(r) := \sum_{s \ge r} M(s), & \hat N(r) := \sum_{s \ge r} N(s), \\ \tilde M(r) := \sum_{s \le r} M(s), & \tilde N(r) := \sum_{s \le r} N(s) \end{cases} \tag{3.14} \]
for r ∈ Z. Since η and ξ have compact support, M(r) = N(r) = 0 for sufficiently large |r|. Hence \(\hat M(r) = \hat N(r) = 0\) for large positive r, and furthermore \(\tilde M(r) = \tilde N(r) = 0\) for large negative r. Therefore
\[ \sum_{r=-\infty}^{\infty} M(r)\, r = \sum_{r=-\infty}^{0} [\tilde M(r) - \tilde M(r-1)]\, r + \sum_{r=0}^{\infty} [\hat M(r) - \hat M(r+1)]\, r = \sum_{r=1}^{\infty} \hat M(r) - \sum_{r=-\infty}^{-1} \tilde M(r). \]
Also
\[ \sum_{r=-\infty}^{\infty} N(r)\, r = \sum_{r=1}^{\infty} \hat N(r) - \sum_{r=-\infty}^{-1} \tilde N(r). \]
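The summation-by-parts identity used in this step, which rewrites the weighted sum of M(r) r in terms of the upper and lower tail sums, can be verified directly for any finitely supported sequence. The following sketch uses arbitrary integer test data; the range bound `R` and the helper names are hypothetical.

```python
import random

rng = random.Random(1)
R = 5  # M(r) is taken to vanish for |r| > R
M = {r: rng.randint(0, 9) for r in range(-R, R + 1)}


def M_hat(r):
    """Upper tail sum: sum of M(s) over s >= r."""
    return sum(value for s, value in M.items() if s >= r)


def M_tilde(r):
    """Lower tail sum: sum of M(s) over s <= r."""
    return sum(value for s, value in M.items() if s <= r)


# Left-hand side: the weighted sum of M(r) * r over all r.
lhs = sum(value * r for r, value in M.items())

# Right-hand side: upper tails over r >= 1 minus lower tails over r <= -1.
rhs = (sum(M_hat(r) for r in range(1, R + 1))
       - sum(M_tilde(r) for r in range(-R, 0)))

print(lhs == rhs)
```

The identity holds because each M(s) with s ≥ 1 appears in exactly s of the upper tails, and each M(s) with s ≤ −1 appears in exactly |s| of the lower tails.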
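Consequently (3.13) will follow provided we show:
\[ \hat M(r) \le \hat N(r) \quad \text{for each } r > 0, \tag{3.15} \]
and
\[ \tilde M(r) \ge \tilde N(r) \quad \text{for each } r < 0. \tag{3.16} \]
To confirm this, rewrite (3.15) as
\[ \sum_{j :\, \eta(j) \ge \xi(j) + r} c^+(j, \eta, t) \le \sum_{i :\, \eta(i) \ge \xi(i) + r} f^+\!\left(\frac{i}{N}, \frac{t}{N}\right). \tag{3.17} \]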
Now Lemma 3.1,(i) asserts A_r := {i | η(i) ≥ ξ(i) + r} is a set of type II for η. Lemma 3.1,(ii) then provides us with estimate (3.15). In the same way, we establish inequality (3.16), employing the fact that B_r := {i | η(i) ≤ ξ(i) + r} is a set of type I.

2. The proof of (3.11) is similar: in the above argument we replace c^+ and f^+ with c^- and f^-.

4. An Approximation

For our later proof of the continuum limit, it will be convenient to introduce a lattice approximation to the nonlinear evolution
\[ \begin{cases} f - u_t \in \partial J[u] & (t > 0) \\ u = 0 & (t = 0). \end{cases} \tag{4.1} \]
We therefore introduce the Hilbert space
\[ H := l^2(\mathbb{Z}^n) = \Big\{ \eta : \mathbb{Z}^n \to \mathbb{R} \ \Big|\ \|\eta\|^2 := \sum_i \eta(i)^2 < \infty \Big\} \]
and define
\[ \hat K := \{ \eta \in H \mid |\eta(i) - \eta(j)| \le 1 \text{ if } i \sim j \}. \tag{4.2} \]
Then \(\hat K\) is a closed, convex subset of H, with \(S \subseteq \hat K\). We define then
\[ \hat J[\xi] := \begin{cases} 0 & \text{if } \xi \in \hat K \\ +\infty & \text{otherwise.} \end{cases} \tag{4.3} \]
For each N = 1, 2, . . . , let us also define f^N : Z^n × [0, ∞) → R by
\[ f^N(i, t) := f\!\left(\frac{i}{N}, \frac{t}{N}\right) \qquad (i \in \mathbb{Z}^n,\ t \ge 0). \tag{4.4} \]
Since f has bounded support, f^N(·, t) ∈ H for each t ≥ 0. Finally we introduce these nonlinear dynamics on H:
\[ \begin{cases} f^N - \xi^N_t \in \partial \hat J[\xi^N] & (t > 0) \\ \xi^N = 0 & (t = 0). \end{cases} \tag{4.5} \]
According to nonlinear semigroup theory (e.g. Brezis [B], Zeidler [Z], etc.) the flow (4.5) has a unique solution
\[ \xi^N \in C([0, \infty); H), \tag{4.6} \]
such that
\[ \xi^N_t \in L^2_{\mathrm{loc}}((0, \infty); H) \quad \text{and} \quad \xi^N(\cdot, t) \in \hat K \ \text{ for each time } t \ge 0. \tag{4.7} \]
The evolution (4.5) means that
\[ \sum_i (f^N(i, t) - \xi^N_t(i, t))(\eta(i) - \xi^N(i, t)) \le 0 \tag{4.8} \]
for a.e. t ≥ 0 and each \(\eta \in \hat K\). (This is a discrete analogue of (1.6).)

Lemma 4.1. (i) For each i ∈ Z^n, the mapping t ↦ ξ^N(i, t) is Lipschitz continuous. Furthermore there exists a constant C such that
\[ \Big( \frac{1}{N^n} \sum_i \xi^N_t(i, t)^2 \Big)^{1/2} \le C \tag{4.9} \]
for all N = 1, 2, . . . and a.e. t ≥ 0.
(ii) For each time t ≥ 0, the mapping ξ^N(·, t) : Z^n → R has compact support. Furthermore,
\[ |\{ i \in \mathbb{Z}^n \mid \xi^N(i, Nt) \ne 0 \}| = O(N^n), \tag{4.10} \]
uniformly for 0 ≤ t ≤ T.
(iii) We have
\[ \sup_i |\xi^N|(i, Nt) = O(N), \tag{4.11} \]
uniformly for 0 ≤ t ≤ T.
Proof. 1. Fix any time s ≥ 0. Then for a.e. t ≥ s:
\[ \frac{d}{dt}\Big( \tfrac12 \|\xi^N(\cdot, t) - \xi^N(\cdot, s)\|^2 \Big) = \sum_i \xi^N_t(i, t)(\xi^N(i, t) - \xi^N(i, s)) \le \sum_i f^N(i, t)(\xi^N(i, t) - \xi^N(i, s)), \tag{4.12} \]
owing to (4.8). Thus, writing φ(t) := ‖ξ^N(·, t) − ξ^N(·, s)‖^2, we have
\[ \frac{d}{dt}\phi(t) \le 2 \Big( \sum_i f^N(i, t)^2 \Big)^{1/2} \phi(t)^{1/2}. \]
Now (2.9) and (4.4) imply
\[ \sum_i f^N(i, t)^2 = \sum_i f\!\left(\frac{i}{N}, \frac{t}{N}\right)^2 = \sum_{|i| \le RN} f\!\left(\frac{i}{N}, \frac{t}{N}\right)^2 \le C N^n. \tag{4.13} \]
Consequently,
\[ \frac{d}{dt}\big( \phi(t)^{1/2} \big) \le C N^{n/2} \qquad (\text{a.e. } t \ge 0), \]
and so
\[ \Big( \sum_i (\xi^N(i, t) - \xi^N(i, s))^2 \Big)^{1/2} \le C N^{n/2} (t - s) \tag{4.14} \]
for all t ≥ s, because ξ^N is continuous in t. As the left-hand side of (4.14) is greater than or equal to |ξ^N(i, t) − ξ^N(i, s)| for any given site i ∈ Z^n, the mapping t ↦ ξ^N(i, t) is Lipschitz continuous. Dividing (4.14) by t − s and passing to limits as s → t− yields estimate (4.9). For use later, we note that taking s = 0 in (4.14) gives
\[ \Big( \frac{1}{N^n} \sum_i \xi^N(i, t)^2 \Big)^{1/2} \le C t \qquad (t \ge 0). \tag{4.15} \]

2. We define the lattice distance
\[ |i - j| := |i_1 - j_1| + \dots + |i_n - j_n|, \tag{4.16} \]
where i = (i_1, . . . , i_n), j = (j_1, . . . , j_n), i, j ∈ Z^n. Write φ^N(i, t) := max(0, At + B − |i|) for t ≥ 0, i ∈ Z^n, the positive constants A, B to be selected below. Then
\[ \phi^N_t(i, t) = \begin{cases} A & \text{if } |i| < At + B \\ 0 & \text{if } |i| > At + B. \end{cases} \tag{4.17} \]
For a.e. t ≥ 0 the case |i| = At + B will not occur. Thus for a.e. t ≥ 0,
\[ \frac{d}{dt}\Big( \tfrac12 \|(\xi^N(\cdot, t) - \phi^N(\cdot, t))^+\|^2 \Big) = \sum_i (\xi^N_t(i, t) - \eta^N_t(i, t))(\xi^N(i, t) - \eta^N(i, t)) \]
because
\[ (\xi^N - \phi^N)^+ = \xi^N - \eta^N \quad \text{for} \quad \eta^N := \min(\xi^N, \phi^N) \in \hat K. \]
Hence
\[ \begin{aligned} \frac{d}{dt}\Big( \tfrac12 \|(\xi^N(\cdot, t) - \phi^N(\cdot, t))^+\|^2 \Big) &= \sum_i (\xi^N_t(i, t) - \eta^N_t(i, t))(\xi^N(i, t) - \eta^N(i, t)) \\ &\le \sum_i (f^N(i, t) - \eta^N_t(i, t))(\xi^N(i, t) - \eta^N(i, t)) \quad \text{according to (4.8)} \\ &= \sum_i (f^N(i, t) - \eta^N_t(i, t))(\xi^N(i, t) - \phi^N(i, t))^+. \end{aligned} \tag{4.18} \]
Now select B = O(N) so large that f^N(i, t) ≠ 0 implies |i| < B. Then take A := ‖f‖_{L^∞}. It follows from (4.17) that whenever ξ^N(i, t) > φ^N(i, t),
\[ f^N(i, t) \le \eta^N_t(i, t) \qquad (i \in \mathbb{Z}^n,\ t \ge 0). \]
This implies that the right-hand side of (4.18) is nonpositive. Consequently (4.18) implies
\[ \xi^N \le \phi^N \qquad (i \in \mathbb{Z}^n,\ t \ge 0). \]
Likewise −φ^N ≤ ξ^N. As φ^N(·, t) has bounded support for each time t ≥ 0, assertion (ii) of the Lemma follows.

3. Since \(\xi^N(\cdot, t) \in \hat K\), we have
\[ |\xi^N(i, t) - \xi^N(j, t)| \le |i - j| \tag{4.19} \]
for t ≥ 0, i, j ∈ Z^n. Thus if
\[ \xi^N(j, Nt) = m, \tag{4.20} \]
then (4.19) implies
\[ \xi^N(i, Nt) \ge m - k \quad \text{if } |i - j| = k,\ k = 0, 1, \dots, m. \]
Consequently, for some positive constant c,
\[ \frac{1}{m^n} \sum_{i :\, |i - j| \le m} \xi^N(i, Nt) \ge \frac{1}{m^n} \sum_{k=0}^{m} (m - k)\, |\{ i : |i - j| = k \}| \ge c\, m. \]
Hence
\[ m \le C\, \frac{1}{m^n} \sum_{i :\, |i - j| \le m} \xi^N(i, Nt) \le C \Big( \frac{1}{m^n} \sum_i \xi^N(i, Nt)^2 \Big)^{1/2} \le C\, \frac{N^{n/2}}{m^{n/2}}\, N T, \]
according to (4.15). Thus
\[ m^{\frac{n}{2} + 1} \le C N^{\frac{n}{2} + 1}. \]
This estimate is valid whenever (4.20) holds at some site j, and a similar argument holds if ξ^N(j, Nt) = −m. Consequently,
\[ \sup_j |\xi^N|(j, Nt) = O(N) \]
for 0 ≤ t ≤ T.
Motivated by estimate (4.11), we define
\[ u^N(x, t) := \frac{1}{N}\, \xi^N([Nx], Nt) \qquad (x \in \mathbb{R}^n,\ t \ge 0). \tag{4.21} \]

Lemma 4.2. For each time T ≥ 0,
\[ u^N \to u \quad \text{uniformly in } \mathbb{R}^n \times [0, T], \tag{4.22} \]
where u is the solution of (4.1).

Proof. 1. Take any function v ∈ K and define
\[ \eta^N(i) := N v\!\left(\frac{i}{N}\right) \qquad (i \in \mathbb{Z}^n). \]
Then \(\eta^N \in \hat K\). Thus (4.8) implies
\[ \sum_i (f^N(i, Nt) - \xi^N_t(i, Nt))(\eta^N(i) - \xi^N(i, Nt)) \le 0; \]
whence
\[ \sum_i \Big( f\big(\tfrac{i}{N}, t\big) - u^N_t\big(\tfrac{i}{N}, t\big) \Big) \Big( v\big(\tfrac{i}{N}\big) - u^N\big(\tfrac{i}{N}, t\big) \Big) \le 0. \tag{4.23} \]
Integrate in time and divide by N^n:
\[ \frac{1}{2N^n} \sum_i \Big( v\big(\tfrac{i}{N}\big) - u^N\big(\tfrac{i}{N}, t\big) \Big)^2 \le \frac{1}{2N^n} \sum_i \Big( v\big(\tfrac{i}{N}\big) - u^N\big(\tfrac{i}{N}, s\big) \Big)^2 + \int_s^t \frac{1}{N^n} \sum_i f\big(\tfrac{i}{N}, \tau\big) \Big( u^N\big(\tfrac{i}{N}, \tau\big) - v\big(\tfrac{i}{N}\big) \Big)\, d\tau. \tag{4.24} \]
Write
\[ \begin{cases} v^N(x) := v\big(\tfrac{[Nx]}{N}\big), \\ f^N(x, t) := f\big(\tfrac{[Nx]}{N}, t\big) \end{cases} \qquad (x \in \mathbb{R}^n,\ t \ge 0). \tag{4.25} \]
Then (4.24), (4.25) and (2.9) imply
\[ \frac12 \int_{\mathbb{R}^n} (v^N(x) - u^N(x, t))^2\, dx \le \frac12 \int_{\mathbb{R}^n} (v^N(x) - u^N(x, s))^2\, dx + \int_s^t \!\! \int_{\mathbb{R}^n} f^N(x, \tau)(u^N(x, \tau) - v(x))\, dx\, d\tau. \tag{4.26} \]

2. For each time 0 ≤ t ≤ T,
\[ \Big| u^N\big(\tfrac{i}{N}, t\big) - u^N\big(\tfrac{j}{N}, t\big) \Big| = \Big| \frac{\xi^N}{N}(i, Nt) - \frac{\xi^N}{N}(j, Nt) \Big| \le \Big| \frac{i}{N} - \frac{j}{N} \Big|. \tag{4.27} \]
Also,
\[ \frac{1}{N^n} \sum_i \Big| u^N\big(\tfrac{i}{N}, t\big) - u^N\big(\tfrac{i}{N}, s\big) \Big| \le \frac{1}{N^n} \sum_i \int_s^t \Big| u^N_t\big(\tfrac{i}{N}, \tau\big) \Big|\, d\tau \le (t - s) \sup_{\tau \ge 0} \Big( \frac{1}{N^n} \sum_i |\xi^N_t(i, \tau)| \Big) \le C (t - s), \tag{4.28} \]
according to (4.21), (4.10) and (4.9). Estimates (4.11), (4.27), (4.28) imply that
\[ \text{the sequence } u^N(\cdot), \text{ restricted to } \mathbb{Z}^n/N \times [0, T], \text{ is bounded, equicontinuous.} \tag{4.29} \]
Thus there exists a continuous function u and a sequence N_j → ∞ such that
\[ u^{N_j} \to u \quad \text{uniformly on } \mathbb{R}^n \times [0, T] \tag{4.30} \]
for each T > 0.

3. We claim that u is the solution of (4.1). First, passing to limits as N = N_j → ∞ in (4.26) we find
\[ \frac12 \|v(\cdot) - u(\cdot, t)\|^2_{L^2(\mathbb{R}^n)} \le \frac12 \|v(\cdot) - u(\cdot, s)\|^2_{L^2(\mathbb{R}^n)} + \int_s^t \!\! \int_{\mathbb{R}^n} f(x, \tau)(u(x, \tau) - v(x))\, dx\, d\tau. \tag{4.31} \]
Since u^N satisfies (4.27), we deduce that
\[ |u(x + h e_i, t) - u(x, t)| \le h \]
for each x ∈ R^n, 0 ≤ t ≤ T, h > 0 and i = 1, . . . , n, where e_i = (0, . . . , 1, . . . , 0). Thus |u_{x_i}| ≤ 1 a.e. (i = 1, . . . , n) and so u(·, t) ∈ K for t ≥ 0. Set v(·) = u(·, s) in (4.31). Then
\[ \frac12 \|u(\cdot, t) - u(\cdot, s)\|^2_{L^2} \le \int_s^t \!\! \int_{\mathbb{R}^n} f(x, \tau)(u(x, \tau) - u(x, s))\, dx\, d\tau \le C \int_s^t \|u(\cdot, \tau) - u(\cdot, s)\|_{L^2}\, d\tau. \]
This estimate is valid for all 0 ≤ s ≤ t. Consequently
\[ \sup_{s \le \tau \le t} \|u(\cdot, \tau) - u(\cdot, s)\|^2_{L^2} \le C (t - s) \sup_{s \le \tau \le t} \|u(\cdot, \tau) - u(\cdot, s)\|_{L^2} \]
and thus
\[ t \mapsto u(\cdot, t) \text{ is a Lipschitz continuous mapping from } [0, \infty) \text{ into } L^2(\mathbb{R}^n). \]
Hence u_t(·, t) exists for a.e. t, and we deduce from (4.31) that
\[ \int_{\mathbb{R}^n} (f(x, t) - u_t(x, t))(v(x) - u(x, t))\, dx \le 0 \]
for a.e. t ≥ 0 and all v ∈ K. This means that u solves (4.1). As this solution is unique, we conclude the proof of assertion (4.22).
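5. Continuum Limit

Our main result is this:

Theorem 5.1. For each t ≥ 0, we have
\[ E\, \sup_{x \in \mathbb{R}^n} \Big| \frac{1}{N}\, \eta([Nx], Nt) - u(x, t) \Big| \to 0 \tag{5.1} \]
as N → ∞, where u is the unique solution of
\[ \begin{cases} f - u_t \in \partial J[u] & (t > 0) \\ u = 0 & (t = 0). \end{cases} \tag{5.2} \]

Proof. 1. If F : S × [0, ∞) → R is Lipschitz continuous in t and F(η(·, 0), 0) ≡ 0, then
\[ F(\eta(\cdot, t), t) = \int_0^t \Big( \frac{\partial F}{\partial s} + L_s F \Big)(\eta(\cdot, s), s)\, ds + \mathcal{M}(t), \tag{5.3} \]
where {M(t)}_{t≥0} is a martingale with
\[ E(\mathcal{M}(t)) = 0 \qquad (t \ge 0). \]
Define
\[ F(\eta, t) := \frac{1}{2N^{n+2}} \sum_i (\eta(i) - \xi^N(i, t))^2 \qquad (\eta \in S,\ t \ge 0), \]
for ξ^N as in the previous section. Then
\[ \frac{\partial F}{\partial t}(\eta, t) = \frac{1}{N^{n+2}} \sum_i (-\xi^N_t(i, t))(\eta(i) - \xi^N(i, t)) \tag{5.4} \]
and, according to (2.5),
\[ \begin{aligned} L^+_t F(\eta, t) &= \sum_j c^+(j, \eta, t)(F(\eta^j, t) - F(\eta, t)) \\ &= \frac{1}{2N^{n+2}} \sum_{i, j} c^+(j, \eta, t)\big[ (\eta^j(i) - \xi^N(i, t))^2 - (\eta(i) - \xi^N(i, t))^2 \big] \\ &= \frac{1}{2N^{n+2}} \sum_j c^+(j, \eta, t)\big[ (1 + \eta(j) - \xi^N(j, t))^2 - (\eta(j) - \xi^N(j, t))^2 \big] \\ &= \frac{1}{N^{n+2}} \sum_j c^+(j, \eta, t)\big( \eta(j) - \xi^N(j, t) + \tfrac12 \big). \end{aligned} \tag{5.5} \]
Likewise,
\[ L^-_t F(\eta, t) = \frac{1}{N^{n+2}} \sum_j c^-(j, \eta, t)\big( \xi^N(j, t) - \eta(j) + \tfrac12 \big). \tag{5.6} \]
Thus if 0 ≤ s ≤ Nt:
\[ \begin{aligned} \Big( \frac{\partial F}{\partial t} + L_s F \Big)(\eta(\cdot, s), s) &= \frac{1}{N^{n+2}} \sum_j \big( c^+(j, \eta(\cdot, s), s) - c^-(j, \eta(\cdot, s), s) - \xi^N_s(j, s) \big)(\eta(j, s) - \xi^N(j, s)) \\ &\quad + \frac{1}{2N^{n+2}} \sum_j \big( c^+(j, \eta(\cdot, s), s) + c^-(j, \eta(\cdot, s), s) \big). \end{aligned} \tag{5.7} \]
Observe that (2.4), (2.6) imply
\[ \frac{1}{2N^{n+2}} \sum_j c^\pm(j, \eta(\cdot, s), s) = \frac{1}{2N^{n+2}} \sum_i f^\pm\!\left(\frac{i}{N}, \frac{s}{N}\right) = O\!\left(\frac{1}{N^2}\right), \tag{5.8} \]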
uniformly for s ≥ 0.

2. Fix 0 ≤ s ≤ Nt and define ξ : Z^n → Z by
\[ \xi(j) := [\xi^N(j, s)] \qquad (j \in \mathbb{Z}^n). \tag{5.9} \]
We claim that
\[ \xi \text{ is a configuration, i.e. } \xi \in S. \]
Owing to Lemma 4.1, (ii), we note ξ : Z^n → Z and ξ has compact support. Next let i ∼ j and write
\[ \xi^N(i, s) = n_i + \alpha_i, \qquad \xi^N(j, s) = n_j + \alpha_j, \]
where n_i = ξ(i), n_j = ξ(j), and so 0 ≤ α_i, α_j < 1. We may assume
\[ \xi^N(i, s) \ge \xi^N(j, s), \qquad n_i \ge n_j. \]
As \(\xi^N(\cdot, s) \in \hat K\),
\[ \xi^N(i, s) \le \xi^N(j, s) + 1. \]
Thus n_i + α_i ≤ n_j + α_j + 1, and so n_i ≤ n_j + α_j + 1. Since n_i, n_j are integers and α_j < 1, we conclude n_j ≤ n_i ≤ n_j + 1. Hence |ξ(i) − ξ(j)| ≤ 1, and so ξ is a configuration.

3. We next study the first term on the right-hand side of (5.7). We have
\[ \begin{aligned} \frac{1}{N^{n+2}} \sum_j c^+(j, \eta(\cdot, s), s)(\eta(j, s) - \xi^N(j, s)) &\le \frac{1}{N^{n+2}} \sum_j c^+(j, \eta(\cdot, s), s)(\eta(j, s) - \xi(j)) + O\!\left(\frac{1}{N^2}\right) \\ &\le \frac{1}{N^{n+2}} \sum_i f^+\!\left(\frac{i}{N}, \frac{s}{N}\right)(\eta(i, s) - \xi(i)) + O\!\left(\frac{1}{N^2}\right) \\ &= \frac{1}{N^{n+2}} \sum_i f^+\!\left(\frac{i}{N}, \frac{s}{N}\right)(\eta(i, s) - \xi^N(i, s)) + O\!\left(\frac{1}{N^2}\right), \end{aligned} \tag{5.10} \]
where we employed Lemma 3.2 for the second inequality. Similarly,
\[ -\frac{1}{N^{n+2}} \sum_j c^-(j, \eta(\cdot, s), s)(\eta(j, s) - \xi^N(j, s)) \le -\frac{1}{N^{n+2}} \sum_i f^-\!\left(\frac{i}{N}, \frac{s}{N}\right)(\eta(i, s) - \xi^N(i, s)) + O\!\left(\frac{1}{N^2}\right). \tag{5.11} \]

4. Combine now (5.7), (5.8), (5.10), (5.11):
\[ \begin{aligned} \Big( \frac{\partial F}{\partial s} + L_s F \Big)(\eta(\cdot, s), s) &\le \frac{1}{N^{n+2}} \sum_i \Big( f\!\left(\frac{i}{N}, \frac{s}{N}\right) - \xi^N_s(i, s) \Big)(\eta(i, s) - \xi^N(i, s)) + O\!\left(\frac{1}{N^2}\right) \\ &= \frac{1}{N^{n+2}} \sum_i \big( f^N(i, s) - \xi^N_s(i, s) \big)(\eta(i, s) - \xi^N(i, s)) + O\!\left(\frac{1}{N^2}\right) \\ &\le O\!\left(\frac{1}{N^2}\right) \qquad \text{for a.e. } 0 \le s \le Nt, \end{aligned} \]
the last inequality holding in view of (4.8), since \(\eta(\cdot, s) \in S \subseteq \hat K\). Return now to (5.3). We employ the foregoing estimate to compute:
\[ E(F(\eta(\cdot, Nt), Nt)) = O\!\left(\frac{1}{N}\right) \]
as N → ∞. Therefore
\[ E\Big( \frac{1}{N^n} \sum_i |\eta^N(i, t) - u^N(i, t)|^2 \Big) \to 0, \tag{5.12} \]
for
\[ \begin{cases} \eta^N(i, t) := \frac{\eta}{N}(i, Nt), \\ u^N(i, t) := \frac{\xi^N}{N}(i, Nt). \end{cases} \]
But since Lemma 4.2 asserts u^N → u uniformly on R^n × [0, T], we deduce from (5.12) that
\[ E \int_{\mathbb{R}^n} \Big| \frac{\eta}{N}([Nx], Nt) - u(x, t) \Big|\, dx \to 0 \]
as N → ∞. Since u(·, t) : R^n → R and η : Z^n → Z are Lipschitz continuous, (5.1) follows.

6. Boundary Conditions

Our methods allow us also to handle certain problems on bounded domains.

a. No-flux boundary conditions. For the first case, imagine that for each N, we have a lattice-connected set D_N ⊆ Z^n such that no cube can leave D_N. We may intuitively think of D_N as being surrounded by a very high, vertical wall. A configuration is then a mapping η : D_N → Z such that η has compact support and
\[ |\eta(i) - \eta(j)| \le 1 \quad \text{if } i \sim j,\ i, j \in D_N. \]
The sets Γ^+ are defined by
\[ \Gamma^+(i, \eta) := \{ j \in \mathbb{Z}^n \mid \text{there exist sites } i = i_1 \sim i_2 \sim \dots \sim i_m = j \text{ in } D_N \text{ with } \eta(i_{l+1}) = \eta(i_l) - 1 \ (l = 1, \dots, m-1),\ \eta(k) \ne \eta(j) - 1 \text{ for all } k \sim j,\ k \in D_N \}. \]
A similar modification of Γ^- in (2.3) provides the new definition of Γ^-. We continue as in (2.3) and (2.6) to define c^±(j, η, t) for j ∈ D_N. We choose the lattice sets D_N to approximate a macroscopic set D. For simplicity, let us assume D ⊂ R^n is a convex set, and define
\[ D_N := \Big\{ i \in \mathbb{Z}^n \ \Big|\ \frac{i}{N} \in D \Big\}, \qquad K_D := \{ v \in L^2(D) \mid v \text{ is Lipschitz continuous},\ |v_{x_i}| \le 1 \text{ a.e. in } D \}. \]
A straightforward adaptation of the proof of Theorem 5.1 yields

Theorem 6.1. For each t ≥ 0, we have
\[ E\, \sup_{x \in D} \Big| \frac{1}{N}\, \eta([Nx], Nt) - u(x, t) \Big| \to 0 \]
as N → ∞, where u is the unique solution of
\[ \begin{cases} f - u_t \in \partial J_D[u] & (t > 0) \\ u = 0 & (t = 0), \end{cases} \tag{6.1} \]
for
\[ J_D[v] := \begin{cases} 0 & \text{if } v \in K_D \\ \infty & \text{otherwise.} \end{cases} \]
b. Zero-height boundary conditions. Another possible boundary condition ordains that a cube be removed whenever it lands outside D_N. In other words, we can imagine D_N as being a “table” and any moving cube that reaches the edge as “falling off”. For simplicity we only allow addition of cubes, that is, f^- ≡ 0. Let us take the lattice D_N as before, and now define the configuration space S_0 to be the set of mappings η : Z^n → Z in S such that η(i) = 0 if i ∉ D_N and η(i) ≥ 0 for all i ∈ D_N. We assume f = f^+ ≡ 0 on R^n − D. We define Γ^+ as in (2.2). Note then that for η ∈ S_0 and i ∈ D_N, we have
\[ \Gamma^+(i, \eta) := \{ j \in \mathbb{Z}^n \mid \text{there exist sites } i = i_1 \sim i_2 \sim \dots \sim i_m = j, \text{ with } i_1, i_2, \dots, i_{m-1} \in D_N,\ \eta(i_{l+1}) = \eta(i_l) - 1 \ (l = 1, \dots, m-1), \text{ and either } j \notin D_N \text{ or else } j \in D_N,\ \eta(k) \ne \eta(j) - 1 \text{ for } k \sim j \}. \]
To each j ∈ Γ^+(i, η) we assign a number 0 ≤ p(i, j, η) ≤ 1, such that
\[ \sum_{j \in \Gamma^+(i, \eta)} p(i, j, \eta) = 1. \]
Finally we define, as before,
\[ c^+(j, \eta, t) := \sum_{i :\, j \in \Gamma^+(i, \eta)} p(i, j, \eta)\, f^+\!\left(\frac{i}{N}, \frac{t}{N}\right), \tag{6.2} \]
and
\[ (L_t F)(\eta) := \sum_{j \in D_N} c^+(j, \eta, t)(F(\eta^j) - F(\eta)). \tag{6.3} \]
Clearly, η ≡ 0 outside D_N for all time. Because of this, we define
\[ K_0 := \{ v \in K \mid v(x) = 0 \text{ if } x \notin D \}, \tag{6.4} \]
\[ J_0[v] := \begin{cases} 0 & \text{if } v \in K_0 \\ +\infty & \text{otherwise.} \end{cases} \tag{6.5} \]

Theorem 6.2. For each t ≥ 0, the convergence (5.1) holds, for u the unique solution of
\[ \begin{cases} f - u_t \in \partial J_0[u] & (t > 0) \\ u = 0 & (t = 0). \end{cases} \tag{6.6} \]

Proof. The proof of this theorem follows the previous proof. In view of (3.15) and (3.16), it is enough to establish for every ξ ∈ S_0 that
\[ \sum_{j \in D_N,\ \eta(j) \ge \xi(j) + r} c^+(j, \eta, t) \le \sum_{i :\, \eta(i) \ge \xi(i) + r} f^+\!\left(\frac{i}{N}, \frac{t}{N}\right) \tag{6.7} \]
for r > 0, and
\[ \sum_{j \in D_N,\ \eta(j) \le \xi(j) + r} c^+(j, \eta, t) \ge \sum_{i :\, \eta(i) \le \xi(i) + r} f^+\!\left(\frac{i}{N}, \frac{t}{N}\right) \tag{6.8} \]
for r < 0. The proof of (6.7) is obvious and holds for every r, because of (3.17). For (6.8), first observe that since η, ξ ∈ S_0 and r < 0, we have B := {j | η(j) ≤ ξ(j) + r} ⊆ D_N. From this, we deduce
\[ \sum_{j \in B} c^+(j, \eta, t) = \sum_{j \in B \cap D_N} c^+(j, \eta, t). \]
Therefore (3.16) implies (6.8).
7. Commentary

Our problem is, we think, of some interest as it models evolving random landscapes in any number of dimensions and supports long range interactions between sites. The macroscopic dynamics are nonlocal, dissipative, and involve no intrinsic length or time scales, other than those imposed by the source term f: cf. Bak, Tang and Wiesenfeld [B-T-K]. More specifically, what we provide here is a sort of stochastic, microscopic particle model for the macroscopic “sandpile” dynamics introduced in Prigozhin [P1, P2, P3] and in [A-E-W, E-F-G]. These papers suggest as a model for growing sandpiles the evolution
\[ \begin{cases} f - u_t \in \partial I[u] & (t \ge 0) \\ u = 0 & (t = 0), \end{cases} \tag{7.1} \]
for the convex functional
\[ I[v] := \begin{cases} 0 & \text{if } v \in L^2(\mathbb{R}^n),\ |Dv| \le 1 \text{ a.e.} \\ +\infty & \text{otherwise,} \end{cases} \tag{7.2} \]
where Dv = (v_{x_1}, . . . , v_{x_n}), |Dv| = (\sum_{i=1}^n v_{x_i}^2)^{1/2}. Observe that (7.1), (7.2) is an isotropic version of (1.1), (1.2). As noted in [E-F-G], we can regard (7.1) as describing an evolution where the measure µ^+ := f(·, t)dx, corresponding to mass being continually added to the system, is instantly rearranged by an optimal Monge–Kantorovich mass transfer into the measure µ^- := u_t(·, t)dy. So we have here a rearrangement of mass on a “fast” time scale, which on the slower, O(1) time scale forces the dynamics (7.1). The cost associated with a given mass reallocation scheme s here is
\[ \int_{\mathbb{R}^n} c(x, s(x))\, d\mu^+, \tag{7.3} \]
for
\[ c(x, y) := |x - y|, \tag{7.4} \]
the Euclidean distance between the points x, y. The height function u(·, t) is the Monge–Kantorovich potential, which generates at each moment of time the instantaneous and optimal mass transfer, as the sand simply falls downhill. See [E] for more about this interpretation and other instances of Monge–Kantorovich phenomena in physical problems. Our problem (1.1), (1.2) corresponds to optimal Monge–Kantorovich mass transfer corresponding to the cost (7.3), but now for the l^1-distance,
\[ c(x, y) := \sum_{i=1}^n |x_i - y_i|, \tag{7.5} \]
instead of (7.4). This nonisotropic distance function reflects the lattice orientation on the macroscopic scale. The special case that f = \sum_k f_k(t) δ_{d_k}, f_k ≥ 0, represents mass added at point sources, and the dynamics (7.1), (7.2) correspond then to growing, interacting pyramids of sand: cf. Aronsson [A, A-E-W]. There is a vast physical literature concerning “sandpile models” of various sorts, these mostly classified as critical height, critical slope or Laplacian models, according
to whether the evolution of the sandpile depends upon the height, the first derivatives of the height, or the second derivatives: see Manna [MA], Mehta [M], Carlson–Chayes–Grannan–Swindle [C-C-G-S, C-G-S-T], etc. We are mostly unqualified to comment in any detail on connections with others’ work, except to note that ours is certainly a critical slope model, directly related to the stochastic models of Puhl [PL] (with the “disorder thresholds” removed). See [PL] for computer simulations and also comments about experiments with actual sand. The evolutions (1.1), (6.1), (6.6), (7.1), although extremely crude cartoons of the real physics of flowing sand, do in fact capture some experimental observations, most notably the behavior of the pile on a bounded domain under the zero-height boundary conditions discussed in Sect. 6. For the case that f^+ is a point source and f^- ≡ 0, the evolution (6.6) predicts that the height function be constant in time once an edge of the sandpile reaches the edge of the domain D. This is consistent with the observations of Puhl.

Remark. We should also explicitly note that although we think our stochastic model is fairly natural, it is also in a sense only “weakly probabilistic”. The reason is that we are describing a surface flow: once a cube comes to rest in a pile to which sand is only added, it never thereafter moves. Consequently in a pile of O(N^{n+1}) cubes, we can expect that only about O(N^n) cubes are on the surface and so possibly in stochastic motion. Even more striking is the case of a single point source. If cubes are added only at the site 0, the height process η(·, t) is random, as newly added cubes fall down the pile according to the rules explained in Sects. 1, 2. At any moment the graph of η(·, t) is roughly a pyramid, with random irregularities in the outer layer of cubes.
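The single point source case of this Remark is easy to explore numerically. The sketch below is a one-dimensional illustration only, under assumptions not fixed by the paper: cubes are added one at a time at the site 0, each falls to a resting site chosen uniformly at random, and the function names are hypothetical. On the one-dimensional lattice a perfect pyramid of height h contains h^2 cubes, so the pile is compared with that pyramid after h^2 additions.

```python
import random


def resting_sites(i, eta):
    """Sites where a cube added at i can come to rest on the 1D lattice, as in (2.2)."""
    def height(s):
        return eta.get(s, 0)

    seen, stack, rest = {i}, [i], set()
    while stack:
        s = stack.pop()
        lower = [k for k in (s - 1, s + 1) if height(k) == height(s) - 1]
        if not lower:
            rest.add(s)
        for k in lower:
            if k not in seen:
                seen.add(k)
                stack.append(k)
    return rest


def grow_from_origin(cubes, rng):
    """Add cubes at site 0 one by one; each falls to a random resting site."""
    eta = {}
    for _ in range(cubes):
        j = rng.choice(sorted(resting_sites(0, eta)))
        eta[j] = eta.get(j, 0) + 1
    return eta


def pyramid(h):
    """Perfect 1D pyramid of height h, listing only sites of positive height."""
    return {i: h - abs(i) for i in range(-h + 1, h)}


rng = random.Random(2024)
matches = [grow_from_origin(h * h, rng) == pyramid(h) for h in range(1, 7)]
print(matches)
```

In every run the pile after h^2 cubes coincides with the perfect pyramid, whatever random choices were made along the way; between two such counts the outer layer is filled in a random order.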
But whenever the total number of cubes N added up to some time is a pyramid number, that is, whenever it is possible to arrange the N cubes into a perfect pyramid (actually a ziggurat), then the graph of the process η(·, t) must in fact be this regular pyramid. In other words, no newly added cubes can start forming the next layer of the pyramid until the previous layer is completely finished. This is all a consequence of the strong stability constraint that |η(i, t) − η(j, t)| ≤ 1 if i ∼ j. The case we have treated of many spatially distributed sources (or sinks) determined by f is more complicated of course, owing to interactions between cubes newly appearing at different sites.

References

[A] Aronsson, G.: A mathematical model in sand mechanics. SIAM J. of Applied Math. 22, 437–458 (1972)
[A-E-W] Aronsson, G., Evans, L.C. and Wu, Y.: Fast/slow diffusion and growing sandpiles. J. Diff. Eqs. 131, 304–335 (1996)
[B-T-K] Bak, P., Tang, C. and Wiesenfeld, K.: Self-organized criticality. Phys. Rev. A. 38, 364–378 (1988)
[B] Brezis, H.: Opérateurs Maximaux Monotones et Semi-groupes de Contractions dans les Espaces de Hilbert. Amsterdam: North Holland, 1973
[C-C-G-S] Carlson, J.M., Chayes, J., Grannan, E.R. and Swindle, G.H.: Self-organized criticality and singular diffusion. Phys. Rev. Lett. 65, 2547–2550 (1990)
[C-G-S-T] Carlson, J.M., Grannan, E.R., Swindle, G.H. and Tour, J.: Singular diffusion limits of a class of reversible self-organizing particle systems. Annals of Prob. 21, 1372–1393 (1993)
[E] Evans, L.C.: Partial differential equations and Monge–Kantorovich mass transfer (survey paper). To appear in Current Developments in Math, 1997
[E-F-G] Evans, L.C., Feldman, M. and Gariepy, R.: Fast/slow diffusion and collapsing sandpiles. J. Diff. Eqs. 137, 166–209 (1997)
[MA] Manna, S.S.: Critical exponents of the sand pile models in two dimensions. Physica A 179, 249–268 (1991)
[M] Mehta, A. (ed.): Granular matter: an Interdisciplinary Approach. Berlin–Heidelberg–New York: Springer, 1994
[P1] Prigozhin, L.: A variational problem of bulk solid mechanics and free surface segregation. Chem. Eng. Sci. 78, 3647–3656 (1993)
[P2] Prigozhin, L.: Sandpiles and river networks: extended systems with nonlocal interactions. Phys. Rev. E. 49, 1161–1167 (1994)
[P3] Prigozhin, L.: Variational model of sandpile growth. European J. Applied Math 4, 225–235 (1996)
[PL] Puhl, H.: On the modeling of real sandpiles. Physica A 182, 295–319 (1992)
[Z] Zeidler, E.: Nonlinear Functional Analysis and its Applications Vol. 2, Berlin–Heidelberg–New York: Springer
Communicated by J. L. Lebowitz
Commun. Math. Phys. 197, 347 – 360 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Nekhoroshev-Stability of Elliptic Equilibria of Hamiltonian Systems
Francesco Fassò*, Massimiliano Guzzo**, Giancarlo Benettin***
Università di Padova, Dipartimento di Matematica Pura e Applicata, Via G. Belzoni 7, 35131 Padova, Italy
Received: 19 September 1997 / Accepted: 12 March 1998
* Gruppo Nazionale di Fisica Matematica (CNR). E-mail: [email protected]
** Gruppo Nazionale di Fisica Matematica (CNR). E-mail: [email protected]
*** Istituto Nazionale di Fisica della Materia and Gruppo Nazionale di Fisica Matematica (CNR). E-mail: [email protected]
Abstract: We prove a conjecture by N.N. Nekhoroshev about the long-time stability of elliptic equilibria of Hamiltonian systems, without any Diophantine condition on the frequencies. Higher order terms of the Hamiltonian are used to provide convexity. The singularity of the action-angle coordinates at the origin is overcome by working in cartesian coordinates.
1. Introduction and Statement of the Result
A. Introduction. In this paper we prove the Nekhoroshev-stability of elliptic equilibria of analytic Hamiltonian systems. The Hamiltonian of the problem is a convergent power series
$$ h_2 + h_3 + h_4 + \cdots , \tag{1.1} $$
where any $h_k$ is a homogeneous polynomial of degree $k$ in the canonical coordinates $(p, q) \in \mathbb{R}^{2n}$, and
$$ h_2 = \frac{1}{2} \sum_{j=1}^{n} \bar\omega_j \left( p_j^2 + q_j^2 \right) $$
with real $\bar\omega_j$. We assume that the frequency vector $\bar\omega = (\bar\omega_1, \dots, \bar\omega_n)$ does not satisfy any resonance condition up to order four and that the nonlinear term $h_3 + h_4$ has the generic property of “removing the degeneracy” of $h_2$, more precisely of providing convexity, in a sense specified below. Under these hypotheses, the Hamiltonian (1.1) can be transformed
with two Birkhoff normalization steps into a Hamiltonian which is integrable and convex (hence anisochronous) up to order five. By studying the latter system with the techniques of Hamiltonian perturbation theory, we prove that any motion starting sufficiently close to the origin remains close to it for times which grow exponentially fast with (a power of) the inverse of the distance from the origin. This result was conjectured by Nekhoroshev in his celebrated 1977 paper ([18], Sect. 2.2). It should be noticed that this result does not directly follow from Nekhoroshev’s general theorem about the stability of the actions in nearly integrable Hamiltonian systems, because of the singularity at the origin of the action-angle coordinates. Thus, in order to prove it, one should perform perturbation theory using cartesian coordinates, which are regular in a neighbourhood of the equilibrium. This procedure is quite standard for isochronous systems, i.e. for constructing the so-called Birkhoff series (see e.g. [20, 10, 13]), while it seems that the case of anisochronous systems had not been understood until very recently [3]. Correspondingly, the problem of the stability of elliptic equilibria over exponentially long times has so far been investigated in the following ways: On the one hand, the problem has been studied in several papers under the assumption that the frequency $\bar\omega$ satisfies a Diophantine condition (see e.g. [10, 11, 14]). Indeed, as realized by Gallavotti in a closely related problem [8], under this strong nonresonance condition one can regard (1.1) as a small perturbation of the isochronous system described by $h_2$ and avoid any “geometry of resonances”.¹ On the other hand, Nekhoroshev’s full conjecture – without any Diophantine assumption on the frequencies – was investigated by Lochak in [16, 17], who however worked in action-angle coordinates. 
Therefore, in order to consistently avoid the singularities, he had to exclude a small cusp-shaped region around each hyperplane $p_j^2 + q_j^2 = 0$, $j = 1, \dots, n$. Hence, the “stability region” that he finds, although arriving arbitrarily close to the equilibrium, does not include any neighbourhood of it. (The need to enter these cusp-shaped regions is typical of Nekhoroshev’s approach, while it is not as urgent in KAM theory, see e.g. [5].) Finally, we mention that a Nekhoroshev-like result for the rather special case of completely resonant frequencies has been very recently obtained in [1] in the framework of a study of the nonlinear Schrödinger equation.
A way of performing perturbation theory for anisochronous systems using cartesian coordinates was understood in [3] in connection with a problem of rigid body dynamics, namely, the stability of the fast rotations of a rigid body close to “proper” rotations. The similarity with the problem of the elliptic equilibrium resides in the fact that the proper rotations of a rigid body lie on transversally elliptic circles, on which one pair of action-angle coordinates becomes singular. Here we proceed precisely as there, although we need to generalize the perturbative scheme introduced there and adapt it to the problem at hand. The main difference comes from the fact that the small parameter here represents the distance from the fixed point, while in the rigid body problem it was an independent parameter measuring the strength of the external forces; moreover, here one needs to work with any number of frequencies, and not just with two of them. We shall come back to these points at the end of this section.
¹ Gallavotti’s result, further developed and improved in [2, 9, 6], concerns small perturbations of systems of uncoupled harmonic oscillators, written in action-angle coordinates; the difference with the problem of elliptic equilibria is that the perturbation is assumed to be analytic in the actions, so that one can work in action-angle coordinates.
B. Statement of the result. We make use of the usual complex canonical coordinates
$$ w_j = \frac{p_j - i q_j}{i\sqrt{2}} , \qquad z_j = \frac{p_j + i q_j}{\sqrt{2}} , \qquad j = 1, \dots, n , $$
and denote by
$$ I_j(w_j, z_j) = i w_j z_j , \qquad j = 1, \dots, n , $$
the $n$ actions. As is well known, if the vector $\bar\omega$ does not admit any resonance $\nu \cdot \bar\omega = 0$ with $\nu \in \mathbb{Z}^n \setminus \{0\}$, $|\nu| \le 4$, where $|\nu| = \sum_j |\nu_j|$, then one can perform two Birkhoff normalization steps on the Hamiltonian (1.1), thus obtaining a Hamiltonian of the form
$$ h(w, z) = k(izw) + f(w, z) , \tag{1.2} $$
where
$$ k(I) = \bar\omega \cdot I + \frac{1}{2}\, I \cdot AI , \tag{1.3} $$
$A$ being an $n \times n$ real symmetric matrix, while $f$ is a series in $(w, z)$ which starts with terms of order five. In any polydisk $|w_j| \le R$, $|z_j| \le R$ the canonical transformation leading to (1.2) differs from the identity by terms of order $R^2$. Both Hamiltonians (1.1) and (1.2) are real whenever $z = -i\bar w$ (that is, for real $p$, $q$). Throughout the paper, we use the following notation:
$$ |I|_\infty = \max_{1 \le j \le n} |I_j| , \qquad [x] = \text{the integer part of } x \in \mathbb{R} . $$
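The two properties of the complex coordinates used repeatedly below, namely that the actions $I_j = i w_j z_j$ reduce to $(p_j^2 + q_j^2)/2$ and that real $(p, q)$ correspond to $z = -i\bar w$, can be checked numerically. The following is only a minimal sketch; the function names and the sample values of $p$, $q$ are illustrative choices and do not come from the paper.

```python
import math

def to_complex_coordinates(p, q):
    """Map real canonical coordinates (p_j, q_j) to the complex pair (w_j, z_j)."""
    w = [(pj - 1j * qj) / (1j * math.sqrt(2)) for pj, qj in zip(p, q)]
    z = [(pj + 1j * qj) / math.sqrt(2) for pj, qj in zip(p, q)]
    return w, z

def actions(w, z):
    """Actions I_j = i * w_j * z_j."""
    return [1j * wj * zj for wj, zj in zip(w, z)]

# Arbitrary sample point (hypothetical values, chosen only for the check).
p = [0.3, -1.2, 2.0]
q = [0.7, 0.4, -0.5]
w, z = to_complex_coordinates(p, q)
I = actions(w, z)

for pj, qj, wj, zj, Ij in zip(p, q, w, z, I):
    # The action is real and equals (p_j^2 + q_j^2) / 2.
    print(abs(Ij - (pj**2 + qj**2) / 2) < 1e-12,
          # Reality condition z = -i * conj(w) for real p, q.
          abs(zj - (-1j * wj.conjugate())) < 1e-12)
```

In particular $|I_j| = |w_j|^2 = |z_j|^2$ on real data, which is how bounds on $|w_t|_\infty$ translate into bounds on the actions later in the proof.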
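Theorem. Assume that the Hamiltonian (1.2) is convergent for $|w|_\infty, |z|_\infty \le R_c$, with some $R_c > 0$, and that the quadratic form $I \cdot AI$ is convex, i.e.
$$ |I \cdot AI| \ge m\, I \cdot I \qquad \forall I \in \mathbb{R}^n \tag{1.4} $$
with some $m > 0$. Then, there exist positive constants $R \le R_c$, $\varepsilon_*$, $\varepsilon_0$, $c$ and $C$ such that any motion of the system (1.2), with “real” initial data $(w(0), z(0)) = (w(0), -i\bar w(0))$ with
$$ \varepsilon := \frac{|I(0)|_\infty}{R^2} \le \varepsilon_0 , \tag{1.5} $$
satisfies
$$ |I(t)|_\infty \le c \left( \frac{\varepsilon}{\varepsilon_*} \right)^{\frac{1}{n}} R^2 \quad \text{for} \quad |t| \le C \exp\left[ \frac{1}{2} \left( \frac{\varepsilon_*}{\varepsilon} \right)^{\frac{1}{n}} \right] \tag{1.6} $$
as well as
$$ |I(t)|_\infty \le c \left( \frac{\varepsilon}{\varepsilon_*} \right)^{\frac{1}{2} + \frac{1}{2n}} R^2 \quad \text{for} \quad |t| \le C \exp\left[ \frac{1}{2} \left( \frac{\varepsilon_*}{\varepsilon} \right)^{\frac{1}{2n}} \right] . \tag{1.7} $$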
Possible values of the constants $R, \dots, C$ will be provided in the course of the proof (see Eqs. (2.21)–(2.23)). As an application of this result, we would like to mention here the problem of the Nekhoroshev-stability of the so-called Riemann ellipsoids, which will be reported in a forthcoming paper [7]. (See [4, 15] and references therein for general information about the problem.) After reduction, the Riemann ellipsoids appear as equilibria of a Hamiltonian system with four degrees of freedom, whose frequencies depend on only two functions of the momentum. Therefore, the frequencies are certainly not Diophantine for generic values of the momentum; moreover, it is not even clear whether there is any value for which they are Diophantine at all, so that relaxing such a condition is crucial. In fact, the study of the stability of the Riemann ellipsoids was the occasion for us to become interested in the problem studied in this paper.
Remarks. (i) The following generalizations, not explicitly treated here, are straightforward:
• The case in which $k$ is quasi-convex.
• The case in which the elliptic equilibrium is replaced by a transversally elliptic torus. In the latter case, in addition to the coordinates $(w, z)$, there are also one or more pairs of nonsingular action-angle coordinates, and everything runs smoothly with no important differences. (This case is in fact close to the situation considered in [3] in connection with the rigid body problem.)
(ii) If $\bar\omega$ has better arithmetic properties, namely is nonresonant for $|\nu| \le s - 1$ with $s > 5$, then the results can be improved. Specifically, it is possible to obtain a significantly better confinement of the actions, i.e. of order $\varepsilon$, on even longer time scales. However, this result requires a somewhat different treatment (it cannot be obtained just by putting a generic $s > 5$ in place of $s = 5$ in the estimates below) and will be reported in a forthcoming paper [12]. We shall come back to this point at the end of the paper.
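(iii) Inequalities (1.6) and (1.7) are in fact unified by
$$ |I(t)|_\infty \le c \left( \frac{\varepsilon}{\varepsilon_*} \right)^{\frac{1+\mu}{n+\mu}} R^2 \quad \text{for} \quad |t| \le C \exp\left[ \frac{1}{2} \left( \frac{\varepsilon_*}{\varepsilon} \right)^{\frac{1}{n+\mu}} \right] \qquad \forall \mu \ge 0 \tag{1.8} $$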
which is what we shall actually prove.
(iv) The stability time $\exp(\varepsilon^{-1/n})$ of (1.6) is the same which one finds if $\bar\omega$ satisfies the Diophantine condition $|\bar\omega \cdot \nu| \ge \mathrm{const}\, |\nu|^{-\tau}$, in the limit case $\tau = n - 1$ (see [14]). In the Diophantine case, however, there is a much better confinement of the actions, of order $\varepsilon$.
C. On the proof. The proof of the theorem is in a sense very traditional, being quite similar to the proof of Nekhoroshev’s theorem. There are however two differences. The first one is the use of the complex coordinates $(w, z)$ instead of the action-angle coordinates. This is achieved with the technique introduced in ref. [3], which resides in the use of suitable “Fourier series” in these coordinates which, at variance with Taylor series, exactly correspond (away from the origin) to the Fourier series in the angles. Such an idea, which in fact already appears in [20] in connection with an isochronous system, is fully described and exploited in ref. [3], so we do not further comment on it here. The second difference concerns the meaning of the small parameter, which is important for the confinement of the actions. Proceeding exactly as in the proof of the Nekhoroshev theorem, for any value of a cut-off $N$ (which will be eventually chosen to depend on $\varepsilon$ as $N \sim \varepsilon^{-1/(n+\mu)}$), we cover the frequency space by suitable “resonant regions”. On account of the nondegeneracy of the unperturbed Hamiltonian $k(I)$, such
a covering has a diffeomorphic image in the action space. We construct a Nekhoroshev normal form only in that particular resonant region to which the equilibrium belongs, and then use this normal form to confine the motions. In this way, one essentially finds that all motions which start within such a resonant region remain inside it for times of order $\exp N$.² This process is highly nonuniform, since the size of the resonant regions rapidly grows (as a power of $1/\varepsilon$) with the number of resonances. However, and this is delicate and crucial, although $\bar\omega$ is fixed, as a rule one does not know to which resonant region it belongs. The reason is that, unless $\bar\omega$ has very special arithmetic properties, by increasing the cut-off $N$ (i.e., by considering initial data closer to the equilibrium, and exploring their motions on longer time scales), $\bar\omega$ does not remain in one and the same resonant region, but endlessly jumps from region to region. This fact, which is typical of Nekhoroshev’s theory although it is not often stressed, is understood by thinking of the effects of increasing the cutoff: on the one hand, new resonances must be taken into account, so that new resonant regions suddenly appear in the frequency space and $\bar\omega$ may happen to fall into one of them; on the other hand, the existing resonant regions get thinner, so that $\bar\omega$ can exit a resonant region and enter a less resonant one. For a generically given $\bar\omega$, this process is very difficult (if not just prohibitive) to describe. For further comments see ref. [3], where the same phenomenon also plays a crucial role.
In this situation, the confinement of the actions is obtained by observing that, if the motion starts at a distance from the equilibrium which is smaller than the radius of the smallest resonant region (so that it certainly starts within the resonant region of $\bar\omega$, whichever it is), then, for times of order $\exp N$, it will remain within the radius of the largest resonant region (which certainly contains the resonant region of $\bar\omega$). As it turns out, denoting by $\varepsilon$ the size of the smallest resonant region (which is the nonresonant one), then the size of the largest one is of order $\varepsilon^{(1+\mu)/(n+\mu)}$, which accounts for the estimate (1.8).³
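Remark (iii) states that (1.6) and (1.7) are two instances of the one-parameter estimate (1.8). A quick way to see which values of $\mu$ are involved is to tabulate the two exponents $(1+\mu)/(n+\mu)$ and $1/(n+\mu)$ exactly. The sketch below does this with rational arithmetic; the function name is an illustrative choice, and $\mu$ is restricted to integers only for the sake of exact fractions (the estimate itself is stated for all real $\mu \ge 0$).

```python
from fractions import Fraction

def unified_exponents(n, mu):
    """Return (confinement exponent, time exponent) appearing in estimate (1.8).

    n  : number of degrees of freedom (positive integer)
    mu : nonnegative integer here, so that the result is an exact fraction
    """
    confinement = Fraction(1 + mu, n + mu)   # power of eps/eps_* bounding |I(t)|
    time = Fraction(1, n + mu)               # power of eps_*/eps inside the exponential
    return confinement, time

for n in (2, 3, 5):
    # mu = 0 gives the exponents of (1.6); mu = n gives those of (1.7).
    print(n, unified_exponents(n, 0), unified_exponents(n, n))
```

The trade-off is visible in the table: increasing $\mu$ improves the confinement exponent towards 1 while shortening the stability time.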
2. Proof of the Theorem
A. Preliminaries. We denote by $\|\cdot\|$ the Euclidean (or Hermitian) norm and by $|\cdot|_\infty$ the supremum norm. For $\nu \in \mathbb{Z}^n$ we write $|\nu| = \sum_j |\nu_j|$. We often denote by $\xi = (\xi_1, \dots, \xi_{2n})$ the coordinates $(w, z) \in \mathbb{C}^{2n}$, and work in polydiscs
$$ D_S = \{ \xi \in \mathbb{C}^{2n} : |\xi|_\infty \le S \} , \qquad S > 0 . $$
For any $S > 0$, let $\mathcal{A}_S$ be the set of all analytic functions $f : D_S \to \mathbb{C}$ which, like the Hamiltonians (1.1) and (1.2), satisfy the reality condition $f(w, -i\bar w) \in \mathbb{R}$. For any $\nu \in \mathbb{Z}^n$, let $\mathcal{A}^\nu_S$ be the subset of $\mathcal{A}_S$ consisting of all functions of the form $\hat f_\nu(i z_1 w_1, \dots, i z_n w_n)\, e_\nu(w, z)$, where the complex function $\hat f_\nu$ is analytic on the polydisc $\{ I \in \mathbb{C}^n : |I|_\infty \le S^2 \}$ and
$$ e_\nu(w, z) = \prod_{j=1}^{n} \eta_{\nu_j}^{|\nu_j|} , \qquad \eta_{\nu_j} = \begin{cases} z_j & \text{for } \nu_j > 0 \\ 1 & \text{for } \nu_j = 0 \\ w_j & \text{for } \nu_j < 0 . \end{cases} $$
² More precisely, the motions must start at a distance from the equilibrium smaller than a certain fraction of the radius of the resonant region. Also, the nonresonant region is by definition the complement of the union of all resonant regions, but here, we mean instead the small polydisc where the nonresonant normal form is constructed.
³ The largest resonant region will be one of the $(n-1)$-fold resonant regions. This is due to the fact that we shall choose the free parameters of the theory (specifically, the parameter $R$ entering the theorem) in such a way that the equilibrium never belongs to the completely resonant region. The reason is that, in such a region, in the case $\mu = 0$, one would have a very poor confinement of the actions, of order $\varepsilon^0$.
Correspondingly, for functions $f \in \mathcal{A}_S$ we define the “Fourier series”
$$ f = \sum_{\nu \in \mathbb{Z}^n} f_\nu , \qquad f_\nu = \hat f_\nu e_\nu \in \mathcal{A}^\nu_S , $$
and we use the norm
$$ |f|_S = \sum_{\nu \in \mathbb{Z}^n} |f_\nu|^\infty_S , \qquad |f_\nu|^\infty_S = \sup_{\xi \in D_S} |f_\nu(\xi)| . $$
The functions $f_\nu$ will be called the harmonics of $f$. We shall also deal with vector fields. If $f = \sum_\nu f_\nu \in \mathcal{A}_S$, we denote by $F$ its Hamiltonian vector field and we use the Fourier series
$$ F = \sum_{\nu \in \mathbb{Z}^n} F_\nu , $$
where the “harmonic” $F_\nu$ is by definition the Hamiltonian vector field of $f_\nu$. We also use the norm
$$ \|F\|_S = \sum_{\nu \in \mathbb{Z}^n} \|F_\nu\|^\infty_S , \qquad \|F_\nu\|^\infty_S = \frac{1}{R} \max_{j=1,\dots,2n} \left| \frac{\partial f_\nu}{\partial \xi_j} \right|^\infty_S , \tag{2.1} $$
where $R$ is a positive parameter (the same entering the statement of the theorem) which will be fixed in the sequel, at the very end of the proof; for the moment, only $R \le R_c$ is important. Finally, let us consider the Hamiltonian $h = k + f$ as in (1.2), (1.3), and let
$$ M = \sup_{I \in \mathbb{C}^n \setminus \{0\}} \frac{|AI|_\infty}{|I|_\infty} ; \tag{2.2} $$
for simplicity, we assume $M \ge m$, where $m$ is the constant entering the convexity condition (1.4). Furthermore, we observe that, since the Taylor series for $f$ in the Hamiltonian (1.2) begins with terms of order $s \ge 5$, for any $0 < S \le R$ we have
$$ |f|_S \le \left( \frac{S}{R} \right)^{s} \mathcal{F} R^2 , \qquad \|F\|_S \le \left( \frac{S}{R} \right)^{s-1} \mathcal{F} , \tag{2.3} $$
where $F$ is the Hamiltonian vector field of $f$ and
$$ \mathcal{F} = \max\left\{ \|F\|_R , \frac{|f|_R}{R^2} \right\} . $$
B. Resonant regions. We begin the proof of the theorem by recalling the decomposition of the frequency space $\mathbb{R}^n$ into resonant regions, which is used in the proof of Nekhoroshev’s theorem, with the improved estimates given by Pöschel [19]. For any given $N \ge 1$ we consider all the $d$-dimensional sublattices $\Lambda$ of $\mathbb{Z}^n$, $0 \le d \le n$, called $N$-lattices, such that:
(i) $\Lambda$ is generated by $d$ vectors $\nu_1, \dots, \nu_d \in \mathbb{Z}^n$ with $|\nu_j| \le N$.
(ii) $\Lambda$ is maximal (it is not properly contained in any sublattice of the same dimension).
We denote by $0$ the zero-dimensional lattice, constituted by the null vector alone. The cells of any lattice $\Lambda \ne 0$ have a minimal $d$-dimensional euclidean volume, which we denote $\|\Lambda\|$; we put $\|0\| = 1$. Moreover, we denote by $\omega_\Lambda$ the orthogonal projection of a vector $\omega$ onto a lattice $\Lambda \ne 0$. The definition of the resonant regions depends on the cutoff $N$ and on two positive parameters $b$ and $\delta$. Specifically, following [19], for any $N$-lattice $\Lambda$ of dimension $d \ge 1$ one defines
$$ \delta_\Lambda = (bN)^{d-1} \frac{\delta}{\|\Lambda\|} . \tag{2.4} $$
For any $N$-lattice $\Lambda$ of dimension $d = 0, \dots, n$, one then defines the resonant region $B_\Lambda$ as the set of all points $\omega \in \mathbb{R}^n$ such that
$$ \|\omega_\Lambda\| < \delta_\Lambda , \qquad \|\omega_{\Lambda'}\| \ge \delta_{\Lambda'} \quad \text{for any } N\text{-lattice } \Lambda' \text{ of dimension } d + 1 \tag{2.5} $$
(only the former condition for $d = n$, only the latter for $d = 0$). As is clear, for any given $N \ge 1$, the resonant regions cover $\mathbb{R}^n$.
Lemma 1 ([19]). Consider any $N \ge 1$ and assume $b > \sqrt{2}$. Then for any $N$-lattice $\Lambda \ne \mathbb{Z}^n$ and any $\omega \in B_\Lambda$ one has
$$ |\omega \cdot \nu| \ge \gamma_\Lambda \qquad \forall \nu \in \mathbb{Z}^n \setminus \Lambda , \ |\nu| \le N $$
with
$$ \gamma_0 = \delta , \qquad \gamma_\Lambda = (b - \sqrt{2})\, N \delta_\Lambda \quad \text{if } \Lambda \ne 0, \mathbb{Z}^n . \tag{2.6} $$
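To get a feeling for how the widths $\delta_\Lambda$ of (2.4) and the small-divisor bounds $\gamma_\Lambda$ of (2.6) scale with the lattice dimension $d$ and the cutoff $N$, the two formulas can be evaluated directly. This is a minimal sketch under the assumption that the cell volume $\|\Lambda\|$ is supplied by the caller; the function names and the argument `volume` are hypothetical and not part of the paper.

```python
import math

def delta_lattice(b, N, delta, d, volume):
    """Width delta_Lambda = (b N)^(d-1) * delta / ||Lambda|| of (2.4), for d >= 1."""
    if d < 1:
        raise ValueError("(2.4) is stated for lattices of dimension d >= 1")
    return (b * N) ** (d - 1) * delta / volume

def gamma_lattice(b, N, delta, d, volume=1.0):
    """Small-divisor bound of (2.6): gamma_0 = delta, else (b - sqrt 2) N delta_Lambda."""
    if b <= math.sqrt(2):
        raise ValueError("Lemma 1 assumes b > sqrt(2)")
    if d == 0:
        return delta
    return (b - math.sqrt(2)) * N * delta_lattice(b, N, delta, d, volume)

# Illustrative numbers only: each extra resonance widens the region by a factor b*N.
for d in (1, 2, 3):
    print(d, delta_lattice(3.0, 10, 1e-3, d, 1.0), gamma_lattice(3.0, 10, 1e-3, d, 1.0))
```

The factor $bN$ gained at each dimension is what makes the hierarchy of resonant regions strongly nonuniform, as remarked in the Introduction.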
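C. The normal forms. We now construct a normal form for the Hamiltonian (1.2), adapted to the resonance properties of $\bar\omega$, up to an exponentially small remainder:
Lemma 2. Let $k(I) = \bar\omega \cdot I + \frac{1}{2} I \cdot AI$ and assume that $f \in \mathcal{A}_R$, $R > 0$, satisfies (2.3). Consider any $N \ge 2$ and let the $N$-lattice $\Lambda$ be such that $\bar\omega \in B_\Lambda$; let $r = [N/2]$. Consider any positive number $R_\Lambda \le R$ such that
$$ R_\Lambda^2 \le \frac{\gamma_\Lambda}{8MN} , \tag{2.7a} $$
$$ \left( \frac{R_\Lambda}{R} \right)^{s-4} \le \frac{M R^2}{8\mathcal{F}} . \tag{2.7b} $$
Then, there exists a symplectic diffeomorphism $\Phi : D_{R_\Lambda/2} \to \Phi(D_{R_\Lambda/2}) \subset D_{R_\Lambda}$ which satisfies
$$ |\Phi(\xi) - \xi|_\infty \le \frac{\mathcal{F}}{4MRr} \left( \frac{R_\Lambda}{R} \right)^{s-3} \le \frac{R_\Lambda}{32r} \tag{2.8} $$
and is such that $(k + f) \circ \Phi$ belongs to $\mathcal{A}_{R_\Lambda/2}$ and has the form
$$ k + 2g + e^{-r} f' \tag{2.9} $$
with $g = \sum_{\nu \in \Lambda} g_\nu$ and, denoting by $F'$ the Hamiltonian vector field of $f'$,
$$ \max\left( |g|_{R_\Lambda/2} , |f'|_{R_\Lambda/2} \right) \le \left( \frac{R_\Lambda}{R} \right)^{s} R^2 \mathcal{F} , \qquad \|F'\|_{R_\Lambda/2} \le \left( \frac{R_\Lambda}{R} \right)^{s-1} \mathcal{F} . \tag{2.10} $$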
Proof of Lemma 2. The proof of Lemma 2 is quite standard, except for the usage of the coordinates $(w, z)$. Hence, we just sketch this proof here. As usual, we construct the diffeomorphism $\Phi$ as a composition of $r$ elementary diffeomorphisms. The single step is described in the following
Lemma 3. Let $k$, $R$, $N$ and $\Lambda$ be as in Lemma 2. Consider two positive numbers $x$ and $S$ such that $xR \le S \le R$ and two functions $u, v \in \mathcal{A}_S$ with $u = \sum_{\nu \in \Lambda} u_\nu$. Denote by $U$ and $V$ the Hamiltonian vector fields of $u$ and $v$, respectively, and assume
$$ S^2 \le \frac{\gamma_\Lambda}{4MN} , \qquad \frac{16}{x \gamma_\Lambda} \|V\|_S \le 1 . $$
Then, there exists a symplectic diffeomorphism $\Psi : D_{S-xR} \to \Psi(D_{S-xR}) \subset D_S$, analytic together with its inverse, which satisfies
$$ |\Psi(\xi) - \xi|_\infty \le \frac{2R}{\gamma_\Lambda} \|V\|_S \le \frac{xR}{8} \tag{2.11} $$
and is such that $(k + u + v) \circ \Psi$ belongs to $\mathcal{A}_{S-xR}$ and has the form
$$ k + u + \sum_{\nu \in \Lambda,\ |\nu| \le N} v_\nu + v' $$
with $v'$ and its Hamiltonian vector field $V'$ satisfying
$$ |v'|_{S-xR} \le \frac{\beta}{x} |v|_S , \qquad \|V'\|_{S-xR} \le \frac{\beta}{x} \|V\|_S , \tag{2.12} $$
where
$$ \beta = \max\left\{ \frac{8}{\gamma_\Lambda} \left( \|U\|_S + 2\|V\|_S \right) , \frac{S}{[N]\, e R} \right\} . $$
Proof of Lemma 3. We generate $\Psi$ by the Lie method, that is, as the time-one map $\Phi^X_1$ of the flow of a Hamiltonian vector field $X$. (We use the version of the method for vector fields described in [6].) Specifically, if the Hamiltonian $\chi$ of $X$ satisfies the so-called homological equation
$$ \{k, \chi\} = \sum_{\nu \notin \Lambda,\ |\nu| \le N} v_\nu \tag{2.13} $$
then $(k + u + v) \circ \Phi^X_1$ has the required form, with
$$ v' = R^1_X(u + v) + R^2_X(k) + v^{>N} ; \tag{2.14} $$
here $R^p_X = \sum_{j=p}^{\infty} \frac{1}{j!} L^j_X$ denotes the $p$th remainder of the Lie series, $L_X$ being the Lie derivative associated to $X$, and $v^{>N} = \sum_{|\nu| > N} v_\nu$ is the “ultraviolet” part of $v$. Equation (2.13) is solved by
$$ \chi = \sum_{\nu \notin \Lambda,\ |\nu| \le N} \frac{v_\nu}{i\, \omega(I) \cdot \nu} , $$
where $\omega(I) = \frac{\partial k}{\partial I}(I) = \bar\omega + AI$. Since
$$ |\omega(I) \cdot \nu| \ge |\bar\omega \cdot \nu| - |AI \cdot \nu| \ge \gamma_\Lambda - M S^2 N \ge \frac{3}{4} \gamma_\Lambda , \tag{2.15} $$
one has $|\chi|_S \le \frac{4}{3\gamma_\Lambda} |v^{\le N}|_S$, with $v^{\le N} = v - v^{>N}$. We now show that the Hamiltonian vector field $X$ of $\chi$ satisfies
$$ \|X\|_S \le \frac{2}{\gamma_\Lambda} \|V^{\le N}\|_S , \tag{2.16} $$
$V^{\le N}$ being the Hamiltonian vector field of $v^{\le N}$, which in turn implies (2.11). Indeed, using the equality $\nu_j v_\nu = -w_j V^{z_j}_\nu - z_j V^{w_j}_\nu$ (where $V^{\xi_j}_\nu$ is the $\xi_j$-component of $V_\nu$), one verifies that
$$ X^{\xi_j}_\nu = \frac{V^{\xi_j}_\nu}{i \omega \cdot \nu} \pm \frac{\xi_j}{(\omega \cdot \nu)^2} \sum_{l=1}^{n} A_{lj} \left( w_l V^{z_l}_\nu + z_l V^{w_l}_\nu \right) $$
with the minus sign if $j = 1, \dots, n$ and the plus sign if $j = n + 1, \dots, 2n$. Inequality (2.16) is obtained from here, observing that $|w_j|, |z_j| \le S$. The estimates (2.12) are obtained by estimating the various terms of (2.14) and of the analogous expression for $V'$. For this, one needs the following estimates on the remainders of the Lie series:
$$ \begin{aligned} \|R^1_X(U + V)\|_{S-xR} &\le \frac{4}{x} \|X\|_S \|U + V\|_S , \\ \|R^2_X(K)\|_{S-xR} &\le \frac{2}{x} \|X\|_S \|L_X K\|_S \le \frac{2}{x} \|X\|_S \|V\|_S , \\ |R^1_X(u + v)|_{S-xR} &\le \frac{4}{x} \|U + V\|_S |\chi|_S , \\ |R^2_X(k)|_{S-xR} &\le \frac{4}{x} \|X\|_S |L_X k|_S \le \frac{4}{x} \|X\|_S |v^{\le N}|_S . \end{aligned} $$
The proof of these inequalities is quite standard (some hints are given in [3]). One also needs to estimate the ultraviolet part of functions and vector fields. Recalling that the maximum of an analytic function is attained at the border, one computes
$$ |f_\nu|^\infty_{S-xR} = |\hat f_\nu e_\nu|^\infty_{S-xR} \le |\hat f_\nu|^\infty_S\, S^{|\nu|} \left( 1 - \frac{xR}{S} \right)^{|\nu|} = |f_\nu|^\infty_S \left( 1 - \frac{xR}{S} \right)^{|\nu|} , $$
so that
$$ |f^{>N}|_{S-xR} \le \left( 1 - \frac{xR}{S} \right)^{1+[N]} |f^{>N}|_S . $$
Similarly, observing that if $f_\nu \in \mathcal{A}^\nu_S$ then $\frac{\partial f_\nu}{\partial w_j} \in \mathcal{A}^{\nu+\delta_j}_S$ and $\frac{\partial f_\nu}{\partial z_j} \in \mathcal{A}^{\nu-\delta_j}_S$, where $\delta_j = (\dots, 0, 1, 0, \dots)$ (1 at the $j$th place), one finds $\|F^{>N}\|_{S-xR} \le (1 - xR/S)^{[N]} \|F^{>N}\|_S$. In (2.12), we also used $(1 - y)^N \le (eNy)^{-1}$, $y > 0$.
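The elementary inequality $(1 - y)^N \le (eNy)^{-1}$ invoked for the ultraviolet terms follows from $(1 - y)^N \le e^{-Ny}$ together with $t e^{-t} \le e^{-1}$ for $t = Ny$. It can also be checked on a grid; the sketch below does so under the assumption $0 < y \le 1$, which is the range relevant here since $y = xR/S \le 1$. The function names and the grid resolution are illustrative choices.

```python
import math

def ultraviolet_factor(y, N):
    """Left-hand side (1 - y)^N of the elementary inequality used in (2.12)."""
    return (1.0 - y) ** N

def ultraviolet_bound(y, N):
    """Right-hand side 1 / (e N y)."""
    return 1.0 / (math.e * N * y)

def check_inequality(N_values, samples=200):
    """Return True if (1 - y)^N <= 1/(e N y) holds on a uniform grid of 0 < y <= 1."""
    for N in N_values:
        for i in range(1, samples + 1):
            y = i / samples
            # Small relative slack guards against floating-point rounding only.
            if ultraviolet_factor(y, N) > ultraviolet_bound(y, N) * (1 + 1e-12):
                return False
    return True

print(check_inequality(range(1, 60)))
```

With $y = xR/S$ and $N$ replaced by $[N]$, the right-hand side becomes $S/([N]\,e\,xR)$, which is the second entry of $\beta$ in (2.12) divided by $x$.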
Lemma 2 is proven by iterating $r = [N/2]$ times the normalization procedure of Lemma 3, each time with $x = R_\Lambda/(2rR)$, beginning with $S = R_\Lambda$ at the first step. One finds that the iteration is possible if $R_\Lambda$ satisfies (2.7a) and if moreover
$$ \frac{2^7 r R}{\gamma_\Lambda R_\Lambda} \|F\|_{R_\Lambda} \le 1 , \tag{2.17} $$
in which case one has $|\Phi(\xi) - \xi|_\infty \le \frac{4R}{\gamma_\Lambda} \|F\|_{R_\Lambda}$, $\max\left( |g|_{R_\Lambda/2}, |f'|_{R_\Lambda/2} \right) \le |f|_{R_\Lambda}$, and $\|F'\|_{R_\Lambda/2} \le \|F\|_{R_\Lambda}$. Using the fact that the Taylor series of $f$ begins with terms of order $s$, i.e. inequalities (2.3), one verifies (using also (2.7a)) that inequality (2.17) can be replaced by (2.7b), while the estimates above on $\Phi$ and on the normal form reduce to (2.8) and (2.10), respectively.
D. Confinement of the actions. The next step in the proof is to use the normal form of Lemma 2, together with the convexity of $k$, so as to produce bounds on the variation of the actions. Here too the use of the coordinates $(w, z)$ introduces only a few changes with respect to the usual proof which uses action-angle coordinates, and we limit ourselves to a sketch.
Lemma 4. Within the hypotheses and the notation of Lemma 2, assume that $\bar\omega \in B_\Lambda$ and that the quadratic form $k(I)$ satisfies the convexity condition (1.4). Consider any motion $t \mapsto (w_t, z_t)$ of the system (1.2) with “real” initial data $(w_0, z_0) = (w_0, -i\bar w_0)$.
(i) If $\Lambda = 0$, then
$$ |w_0|_\infty \le \frac{R_0}{8} \quad \Longrightarrow \quad |w_t|_\infty \le \frac{R_0}{2} \quad \text{for} \quad |t| \le \frac{e^{[N/2]}}{4\mathcal{F}} . \tag{2.18} $$
(ii) If $\Lambda \ne 0, \mathbb{Z}^n$ and
$$ R_\Lambda^2 \ge \frac{32}{m} \delta_\Lambda , \tag{2.19a} $$
$$ \left( \frac{R_\Lambda}{R} \right)^{s-4} \le \frac{m R^2}{2^{12} \mathcal{F}} \min\left( 1, \frac{2^9}{n^{1/4}} \sqrt{\frac{M}{m}} \right) , \tag{2.19b} $$
then
$$ |w_0|_\infty \le \frac{R_\Lambda}{8 n^{1/4}} \sqrt{\frac{m}{M}} \quad \Longrightarrow \quad |w_t|_\infty \le \frac{R_\Lambda}{2} \quad \text{for} \quad |t| \le \frac{2}{\Omega}\, e^{[N/2]} , \tag{2.20} $$
where $\Omega = \sqrt{n}\, \sup_{|I| \le R} \|\bar\omega + AI\|$.
Proof. Let us prove statement (ii). Let $\Phi$ be the symplectic diffeomorphism constructed in Lemma 2, and write $(w, z) = \Phi(w', z')$, $I' = i w' z'$, $\Delta I = I'_t - I'_0$, $\Delta k = k(I'_t) - k(I'_0)$, and $\omega_0 = \omega(I'_0) = \bar\omega + A I'_0$. We assume that $(w_t, z_t)$ and $(w'_t, z'_t)$ remain within $D_{R_\Lambda/2}$ for all the times we deal with (the consistency of this assumption may be easily verified at the end of the proof). The convexity condition (1.4) gives $\frac{m}{2} \|\Delta I\|^2 \le |\Delta k| + |\omega_0 \cdot \Delta I|$. The conservation of energy for the Hamiltonian (2.9) implies $|\Delta k| \le 6 |f|_{R_\Lambda}$ so that, writing also $\omega_0 = \omega_{0\Lambda} + (\omega_0 - \omega_{0\Lambda})$, one obtains
$$ \frac{m}{2} \|\Delta I\|^2 \le 6 \left( \frac{R_\Lambda}{R} \right)^{s} R^2 \mathcal{F} + \|\omega_{0\Lambda}\|\, \|\Delta I\| + |(\omega_0 - \omega_{0\Lambda}) \cdot \Delta I| . $$
Now, using (2.9) and (2.10), observing that $\{I'_j, k\} = 0$ for all $j$, that $\{I'_j, g\} = \sum_{\nu \in \Lambda} i \nu_j \hat g_\nu e_\nu$ are the components of a vector parallel to $\Lambda$, and that $\{I'_j, f'\} = i z'_j (F')^{z_j} + i w'_j (F')^{w_j}$, one computes
$$ |(\omega_0 - \omega_{0\Lambda}) \cdot \Delta I| = \left| \int_0^t \sum_{j=1}^{n} (\omega_0 - \omega_{0\Lambda})_j \{I'_j, e^{-[N/2]} f'\}(s)\, ds \right| \le |t| \sqrt{n}\, \|\omega_0\|\, e^{-[N/2]} \max_j |\{I'_j, f'\}|^\infty_{R_\Lambda/2} \le 2 \left( \frac{R_\Lambda}{R} \right)^{s} R^2 \mathcal{F} $$
for $|t| \le 2 \Omega^{-1} e^{[N/2]}$. Hence, on this time scale, $\frac{m}{2} \|\Delta I\|^2 \le 8 \left( \frac{R_\Lambda}{R} \right)^{s} R^2 \mathcal{F} + \|\omega_{0\Lambda}\|\, \|\Delta I\|$. Taking also into account inequality (2.19b), this inequality is seen to imply
$$ \|\Delta I\| \le \frac{2}{m} \|\omega_{0\Lambda}\| + \frac{R_\Lambda^2}{16} . $$
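The passage from the quadratic inequality to the bound on the variation of the actions is elementary, and it may help to see where the constant 1/16 comes from. The following worked step is a sketch; the shorthand $x$, $P$, $Q$ is introduced here only for illustration and does not appear in the paper.

```latex
% Shorthand (introduced here): x = \|\Delta I\|, Q = \|\omega_{0\Lambda}\|,
% P = 8 (R_\Lambda/R)^s R^2 \mathcal{F}, so that the inequality reads (m/2) x^2 \le P + Q x.
\begin{align*}
\frac{m}{2}\,x^2 \le P + Q\,x
  &\;\Longrightarrow\;
  x \le \frac{Q + \sqrt{Q^2 + 2mP}}{m}
    \le \frac{2Q}{m} + \sqrt{\frac{2P}{m}} , \\
P = 8\Big(\frac{R_\Lambda}{R}\Big)^{4}\Big(\frac{R_\Lambda}{R}\Big)^{s-4} R^2 \mathcal{F}
  &\le 8\,\frac{R_\Lambda^4}{R^4}\cdot\frac{m R^2}{2^{12}\mathcal{F}}\cdot R^2 \mathcal{F}
   = \frac{m R_\Lambda^4}{2^{9}}
  \qquad \text{by (2.19b)} , \\
\sqrt{\frac{2P}{m}}
  &\le \sqrt{\frac{R_\Lambda^4}{2^{8}}} = \frac{R_\Lambda^2}{16} .
\end{align*}
```

Only the factor $m R^2 / (2^{12}\mathcal{F})$ of (2.19b) is used in this step; the additional minimum in (2.19b) enters later, when the bound on $w'_0$ is transferred to $w_0$.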
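Now, since $\bar\omega \in B_\Lambda$, one has $\|\omega_{0\Lambda}\| \le \|\omega_0 - \bar\omega\| + \|\bar\omega_\Lambda\| \le \sqrt{n}\, M |I'_0|_\infty + \delta_\Lambda$ so that, using also $|I'_t| \le |I'_0| + |\Delta I|$, $m \le M$ and (2.19a),
$$ |I'_t|_\infty \le \frac{3M\sqrt{n}}{m} |I'_0|_\infty + \frac{R_\Lambda^2}{8} . $$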
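This implies $|w'_t|_\infty \le R_\Lambda/2$ and $|w_t|_\infty \le R_\Lambda/2$, provided that
$$ |w'_0|_\infty \le \sqrt{\frac{m}{2M}}\, \frac{R_\Lambda}{4 n^{1/4}} . $$
Using (2.8) and (2.19b), one sees that the latter inequality follows from the hypothesis (2.20) on $w_0$. The proof of statement (i) is similar, and actually easier.
E. Proof of the Theorem. We now show that, if we make the choice
$$ R = \min\left\{ R_c , \frac{\sqrt{m \|\bar\omega\|}}{32 M} \right\} \tag{2.21} $$
and moreover
$$ \varepsilon_* = \min\left\{ \frac{1}{64} , \frac{1}{2\sqrt{n}} \right\} , \qquad \varepsilon_0 = \varepsilon_* \left[ \left( \frac{m}{2^8 M} \right)^{n-1} a^{\frac{1}{s-4}} \right]^{\frac{\mu+n}{\mu+1}} \tag{2.22} $$
with
$$ a = \min\left\{ 1, \frac{m R^2}{2^{12} \mathcal{F}} \min\left( 1, \frac{2^9}{n^{1/4}} \sqrt{\frac{M}{m}} \right) \right\} , $$
then all the hypotheses for the applicability of Lemma 4 are satisfied whenever the initial data fulfill (1.5), and in such a case the estimates (2.20) and (2.18) of Lemma 4 imply (1.8) with
$$ c = \frac{1}{8} \left( \frac{2^9 M}{m} \right)^{n-1} , \qquad C = \min\left\{ \frac{2}{\Omega} , \frac{1}{4\mathcal{F}} \right\} , \tag{2.23} $$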
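with $\Omega$ as in Lemma 4. As we shall see below, the restriction $R \le \sqrt{m \|\bar\omega\|}/(32M)$ in (2.21) serves to prevent $\bar\omega$ from belonging to the completely resonant region $B_{\mathbb{Z}^n}$, i.e., to assure that
$$ \|\bar\omega\| \ge \delta_{\mathbb{Z}^n} = (bN)^{n-1} \delta \tag{2.24} $$
(as already mentioned in the Introduction, the reason for this request is the very poor confinement of the actions in the completely resonant region when $\mu = 0$). If (2.24) is verified, then Lemma 4 can be applied whenever the three inequalities $R_\Lambda \le R$, (2.7a,b) are satisfied for any $\Lambda \ne \mathbb{Z}^n$ and the two inequalities (2.19a,b) are satisfied for any $\Lambda \ne 0, \mathbb{Z}^n$. The latter four inequalities read
$$ R_\Lambda^2 \le \frac{\gamma_\Lambda}{8MN} , \qquad \Lambda \ne \mathbb{Z}^n , \tag{2.25a} $$
$$ \left( \frac{R_\Lambda}{R} \right)^{s-4} \le \frac{M R^2}{8\mathcal{F}} , \qquad \Lambda \ne \mathbb{Z}^n , \tag{2.25b} $$
$$ R_\Lambda^2 \ge \frac{32}{m} \delta_\Lambda , \qquad \Lambda \ne 0, \mathbb{Z}^n , \tag{2.25c} $$
$$ \left( \frac{R_\Lambda}{R} \right)^{s-4} \le \frac{m R^2}{2^{12} \mathcal{F}} \min\left( 1, \frac{2^9}{n^{1/4}} \sqrt{\frac{M}{m}} \right) , \qquad \Lambda \ne 0, \mathbb{Z}^n . \tag{2.25d} $$
In order to fulfill these conditions, we begin by making the (rather natural) choice
$$ R_0^2 = \frac{R^2}{N^{n+\mu}} , \qquad \mu \ge 0 . \tag{2.26} $$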
Recalling the definition (2.6) of $\gamma_\Lambda$ we choose $\delta$ (for $\Lambda = 0$) and $R_\Lambda$ (for $\Lambda \ne 0, \mathbb{Z}^n$) so as to fulfil all conditions (2.25a) as equalities:
$$ \delta = 8 M N R_0^2 , \qquad R_\Lambda^2 = \frac{b - \sqrt{2}}{b}\, \frac{(bN)^d R_0^2}{\|\Lambda\|} \quad \text{if} \quad d = \dim \Lambda = 1, \dots, n - 1 . $$
Similarly, we choose the constant $b$ so as to fulfil (2.25c) as equalities, too:
$$ b = \sqrt{2} + 2^8\, \frac{M}{m} . $$
1 1 n+µ
η
.
We can now fulfil the two conditions (2.25b,d) for all 3 6= 0, Zn just by requiring that R0 , namely η, is small enough, precisely η
1+µ n+µ
≤
1 bn−1
29 mR2 min 1, 1/4 212 F n
r
1 ! s−4
M m
(2.27)
2 (recall that m ≤ M and d ≤ n − 1). Finally, for 3 6= Zn one has R3 ≤ n−1 (1+µ)/(n+µ) 2 n b η R and so all inequalities R3 ≤ R, 3 6= Z , are fulfilled if
Nekhoroshev-Stability of of Hamiltonian Systems 1+µ
η n+µ ≤
359
1 . bn−1
(2.28)
With the above choice of the parameters R, b, δ, and R3 , 3 6= Zn , we may now 2 apply the estimates of Lemma 4 with ω¯ ∈ B3 for some 3 6= Zn . Since R02 ≤ 28mM R3 n n−1 for all 3 6= 0 , Z (as one sees using k3k ≤ N ), it is easy to verify that, if 1 1 , √ (2.29) |I(0)|∞ ≤ min R02 = ∗ η R2 64 2 m with ∗ as in (2.22), then I(0) satisfies either the inequality (2.18) (if 3 = 0) or the inequality (2.20) (if 3 6= 0, Zn ) of Lemma 4, so that in all cases |I(t)|∞ ≤
2 µ+1 µ+1 bn−1 µ+n R3 ≤ η R2 = c η µ+n R2 4 4
with $c$ as in (2.23), for $|t|$ as in (1.8). In the theorem, we have used the parameter $\varepsilon = \varepsilon_* \eta$ instead of $\eta$; the two conditions (2.27) and (2.28) follow from $\varepsilon \le \varepsilon_0$, with $\varepsilon_0$ as in (2.22). This concludes the proof of the theorem.
F. Conclusive comment. As a final comment, we would like to discuss why the results do not significantly improve when $s > 5$. We used a generic $s$ in all of the formulas above precisely to keep track of this fact: as the reader can see, the only improvement of a larger $s$ is that the threshold $\varepsilon_0$ gets slightly larger, see (2.22). The reason for this fact is easily envisaged: our strategy was to confine the motions in the resonant region of $\bar\omega$, and so the confinement of the actions is determined solely by the size of the latter. Hence, in order to improve the confinement of the actions, one should allow the size of the resonant regions to be essentially smaller than the distance from the equilibrium. This requires a finer analysis, in which one further subdivides into different resonant regions what is here regarded as the resonant region of $\bar\omega$. This analysis will be published elsewhere [12].
Acknowledgement. The idea of this work came during some conversations with Debra Lewis regarding the stability of the Riemann ellipsoids. We are very indebted to her for bringing the problem to our attention and for many stimulating discussions. F.F. also wishes to thank Tudor Ratiu for his hospitality at the Mathematics Department of the University of California, Santa Cruz, where part of this work was done, as well as for many useful discussions. This work has been supported by the grant EC contract ERBCHRXCT940460 for the project Stability and Universality in Classical Mechanics. F.F. was also partially supported by DOE contract DEFG03-95ER25245-A000 while visiting UCSC.
Note added in proofs: After this paper had been submitted for publication, we received a preprint by L.
Niederman (Nonlinear stability around an elliptic equilibrium point in an Hamiltonian system) which treats the same problem and obtains very similar (though not identical) results. From a technical point of view, Niederman uses Lochak’s scheme for the proof of the Nekhoroshev theorem (the so-called “simultaneous approximation” method), but implemented in cartesian coordinates.
References
1. Bambusi, D.: Long time stability of some small amplitude solutions in nonlinear Schrödinger equations. Commun. Math. Phys. 189, 205–226 (1997)
2. Benettin, G. and Gallavotti, G.: Stability of motions near resonances in quasi-integrable Hamiltonian systems. J. Stat. Phys. 44, 293–338 (1986)
3. Benettin, G., Fassò, F. and Guzzo, M.: Fast rotations of the rigid body: A study by Hamiltonian perturbation theory. Part II: Gyroscopic rotations. Nonlinearity 10, 1695–1717 (1997)
4. Chandrasekhar, S.: Ellipsoidal Figures of Equilibrium. New Haven, CT: Yale University Press, 1969
5. Delshams, A. and Gutiérrez, P.: Estimates on invariant tori near an elliptic equilibrium point of a Hamiltonian system. J. Diff. Eq. 131, 277–303 (1996)
6. Fassò, F.: Lie series method for vector fields and Hamiltonian perturbation theory. J. Appl. Math. Phys. (ZAMP) 41, 843–864 (1990)
7. Fassò, F. and Lewis, D.: Paper in preparation
8. Gallavotti, G.: Quasi-Integrable Mechanical Systems. In: Critical phenomena, Random Systems, Gauge Theories, K. Osterwalder and R. Stora editors, Les Houches, Session XLIII, 1984. Amsterdam: North-Holland, 1986
9. Giorgilli, A. and Galgani, L.: Rigorous estimates for the series expansions of Hamiltonian perturbation theory. Celestial Mechanics 37, 95–112 (1985)
10. Giorgilli, A.: Rigorous results on the power expansions for the integrals of a Hamiltonian system near an elliptic equilibrium point. Ann. Inst. Henri Poincaré - Physique Théorique 48, 423–439 (1988)
11. Giorgilli, A., Delshams, A., Fontich, E., Galgani, L. and Simó, C.: Effective Stability for a Hamiltonian System near an Elliptic Equilibrium Point, with an Application to the Restricted three Body Problem. J. Diff. Eq. 77, 167–198 (1989)
12. Guzzo, M., Fassò, F. and Benettin, G.: On the stability of elliptic equilibria. MPEJ - Mathematical Physics Electronic Journal 4, Paper 1 (1998)
13. Ito, H.: Convergence of Birkhoff normal forms for integrable systems. Comment. Math. Helvetici 64, 412–461 (1989)
14. Jorba, À. and Villanueva, J.: On the normal behaviour of partially elliptic lower dimensional tori of Hamiltonian systems. Nonlinearity 10, 783–822 (1997)
15. Lewis, D.: Bifurcation of liquid drops. Nonlinearity 6, 491–522 (1993)
16. Lochak, P.: Canonical perturbation theory via simultaneous approximation. Russ. Math. Surv. 47, 57–133 (1992)
17. Lochak, P.: Stability of Hamiltonian systems over exponentially long times: the near linear case. In: H. Dumas, K. Meyer, D. Schmidt (eds), Hamiltonian Dynamical Systems – History, Theory, and Applications, The IMA Volumes in Mathematics and its Applications 63, New York: Springer, 1995, pp. 221–229
18. Nekhoroshev, N.N.: An exponential estimate of the time of stability of nearly integrable Hamiltonian systems. Usp. Mat. Nauk 32:6, 5–66 (1977) [Russ. Math. Surv. 32:6, 1–65 (1977)]
19. Pöschel, J.: Nekhoroshev estimates for quasi-convex Hamiltonian Systems. Math. Z. 213, 187–216 (1993)
20. Siegel, C. and Moser, J.: Lectures on Celestial Mechanics. Berlin: Springer, 1971
Communicated by Ya. G. Sinai
Commun. Math. Phys. 197, 361 – 386 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Modular Invariants, Graphs and α-Induction for Nets of Subfactors I
J. Böckenhauer, D. E. Evans
School of Mathematics, University of Wales Cardiff, PO Box 926, Senghennydd Road, Cardiff CF2 4YH, Wales, UK
Received: 2 February 1998 / Accepted: 13 March 1998
Abstract: We analyze the induction and restriction of sectors for nets of subfactors defined by Longo and Rehren. Picking a local subfactor we derive a formula which specifies the structure of the induced sectors in terms of the original DHR sectors of the smaller net and canonical endomorphisms. We also obtain a reciprocity formula for induction and restriction of sectors, and we prove a certain homomorphism property of the induction mapping. Developing further some ideas of F. Xu we will apply this theory in a forthcoming paper to nets of subfactors arising from conformal field theory, in particular those coming from conformal embeddings or orbifold inclusions of SU(n) WZW models. This will provide a better understanding of the labelling of modular invariants by certain graphs, in particular of the A-D-E classification of SU(2) modular invariants.

Contents
1. Introduction
2. Preliminaries
2.1. Subfactors and sectors
2.2. Statistics operators in algebraic quantum field theory
2.3. The braiding fusion equations
2.4. Nets of subfactors
3. α-Induction for Nets of Subfactors
3.1. Definition of α-induction
3.2. The main formula for α-induction
3.3. Homomorphism property of α-induction
3.4. σ-restriction and ασ-reciprocity
3.5. The inverse braiding
4. Miscellanea
4.1. The results in terms of sector algebras
4.2. The subgroup net of subfactors
4.3. Remarks
1. Introduction

Modular invariants associated to SU(2) characters have been classified in [3], each being labelled by a graph, a Dynkin diagram of type A-D-E. Similarly, subfactors give rise to natural invariants, e.g. their principal graphs. Each $A$, $D_{\mathrm{even}}$, $E_{\mathrm{even}}$ is the principal graph (or fusion graph) of a subfactor of index less than four. Here we begin to look systematically at this relation between modular invariants, graphs and subfactors.

Our treatment begins with the formulae for the extension ($\lambda \mapsto \alpha_\lambda$) and the restriction endomorphism ($\beta \mapsto \sigma_\beta$) for nets of subfactors $\mathcal{N} \subset \mathcal{M}$ defined by Longo and Rehren [19]. We derive several properties of these extension and restriction endomorphisms, including a reciprocity formula, and therefore we prefer the names α-induced and σ-restricted endomorphisms.

In a forthcoming paper [1] we will apply the procedure of α-induction to several nets of subfactors arising from conformal field theory. We pay special attention to the current algebras of the $SU(n)_k$ WZW models. There we are dealing with nets of subfactors $\mathcal{N} \subset \mathcal{M}$ where the smaller net $\mathcal{N}$ is given in terms of representations of local loop groups of SU(n). Firstly, we consider conformal embeddings of type $SU(n)_k \subset G_1$ with $G$ simple. In this case the enveloping net $\mathcal{M}$ is given by the local loop groups of $G$ in the level 1 vacuum representation. To such a conformal embedding corresponds a modular invariant. Secondly, we consider modular invariants of orbifold type. In this case we can construct the enveloping net $\mathcal{M}$ as an extension of $\mathcal{N}$ by simple currents; this crossed product construction is similar to the construction of the field algebra in [8]. Our treatment gives some new insights into the programme of labelling (block-diagonal) modular invariants by certain graphs initiated by Di Francesco and Zuber [5, 6] (see also [4]).
With $\lambda$ being the localized endomorphisms associated to the positive energy representations of $LSU(n)$ at level $k$ we obtain a fusion algebra generated by the subsectors of the α-induced endomorphisms $\alpha_\lambda$. Graphs are obtained by drawing the fusion graphs of the α-induced endomorphisms associated to the fundamental representation(s). They satisfy the axioms for graphs which Di Francesco and Zuber associate to modular invariants [5] (see also [20]), and for all our (SU(2) and SU(3)) examples we in fact reproduce their graphs. For SU(2) our theory yields an explanation of why the entries in the (non-trivial) block-diagonal modular invariants correspond to Coxeter exponents of the $D_{\mathrm{even}}$, $E_6$ and $E_8$ Dynkin diagrams. We will also discuss the application of α-induction to extended $U(1)$ theories from [2] and to the minimal models.

In [21], Xu defined a map $\lambda \mapsto a_\lambda$ by a similar, but different formula for the induced endomorphism. (In fact in his setting both $\lambda$ and $a_\lambda$ are endomorphisms of the same III$_1$-factor $M$.) He has already obtained the fusion graphs for the conformal inclusions involving SU(n); however, we can also treat the orbifold inclusions of SU(n). Our underlying framework is more general because it applies, for a given net of subfactors $\mathcal{N} \subset \mathcal{M}$ satisfying certain assumptions (which are fulfilled for many chiral conformal field theory models), to the whole class of localized, transportable endomorphisms of $\mathcal{N}$ whereas Xu restricts his analysis to the $LSU(n)$ setting. Moreover, we believe that our formalism is more appropriate as the nature of induction and restriction of sectors becomes more transparent, and we believe that our setting enables us to present simpler proofs.

This article is the first in a series of papers about modular invariants, graphs, and nets of subfactors. Here we develop the machinery of α-induction in a general setting. In Sect.
2 we derive the braiding fusion equations that arise naturally from the notion of localized transportable endomorphisms of algebraic quantum field theory, and which play a crucial role in our analysis. In Sect. 3 we give the definition and prove several properties of α-induction; we derive an important formula and the homomorphism property of α-induction, and we also establish ασ-reciprocity of α-induction and σ-restriction. The game of α-induction and σ-restriction of sectors generalizes the restriction and (Mackey) induction of group representations to nets of subfactors which are in general not governed by group symmetries. Nevertheless, as an illustration we briefly discuss the case of a net of subfactors arising from a subgroup of a finite group in Subsect. 4.2. The above mentioned applications of this theory to several models of conformal field theory will be presented in a forthcoming paper [1].

2. Preliminaries

In this section we review several facts about subfactors, sectors, algebraic quantum field theory and nets of subfactors, which we will need for our analysis.

2.1. Subfactors and sectors. We first briefly review some basic facts about subfactors and Longo’s theory of sectors. For a detailed treatment of these topics we refer to textbooks on operator algebras, e.g. [11]. A von Neumann algebra is a weakly closed subalgebra $M \subset B(\mathcal{H})$ of the algebra of bounded operators on some Hilbert space $\mathcal{H}$. It is called a factor if its center is trivial, $M' \cap M = \mathbb{C}\mathbf{1}$. A factor is called infinite if there is an isometry $v \in M$ with range projection $vv^* \neq \mathbf{1}$, and purely infinite or type III if $M_p = pMp$ is infinite for every non-zero projection $p \in M$. An inclusion $N \subset M$ of factors with common unit is called a subfactor. A subfactor is called irreducible if the relative commutant is trivial, $N' \cap M = \mathbb{C}\mathbf{1}$, and it is called infinite if $N$ and $M$ are infinite factors. Let $N \subset M$ be an infinite subfactor on a separable Hilbert space $\mathcal{H}$. Then there is a vector $\Phi \in \mathcal{H}$ which is cyclic and separating for both $M$ and $N$. Let $J_M$ and $J_N$ be the modular conjugations of $M$ and $N$ with respect to $\Phi$. Then the endomorphism $\gamma = \mathrm{Ad}(J_N J_M)|_M$ of $M$ satisfies $\gamma(M) \subset N$ and is called a canonical endomorphism from $M$ into $N$.
It is unique up to conjugation by a unitary in $N$. The restriction $\theta = \gamma|_N$ is called a dual canonical endomorphism. If the Kosaki index [15] is finite, $[M:N] < \infty$, then there are isometries $v \in M$ and $w \in N$ such that
\[ vm = \gamma(m)v, \quad m \in M, \qquad wn = \theta(n)w, \quad n \in N, \qquad w^*v = [M:N]^{-1/2}\,\mathbf{1} = w^*\gamma(v). \]
Then $E^M_N(m) = w^*\gamma(m)w$, $m \in M$, is a conditional expectation from $M$ onto $N$ and the identity
\[ m = [M:N]\cdot E^M_N(mv^*)\,v, \qquad m \in M, \]
holds [16]. This means in particular that every $m \in M$ can be written as $m = nv$ for some $n \in N$, i.e. $M = Nv$.

For any unital $*$-algebra $M$ we denote by $\mathrm{End}(M)$ the set of unital $*$-endomorphisms of $M$. For $\lambda, \mu \in \mathrm{End}(M)$ we define the intertwiner space
\[ \mathrm{Hom}_M(\lambda,\mu) = \{ t \in M : t\lambda(m) = \mu(m)t, \ m \in M \} \]
and
\[ \langle\lambda,\mu\rangle_M = \dim \mathrm{Hom}_M(\lambda,\mu). \]
We have $\langle\lambda,\mu\rangle_M = \langle\mu,\lambda\rangle_M$.

Now let $M$ be a type III factor. An endomorphism $\lambda \in \mathrm{End}(M)$ is called irreducible if $\lambda(M)' \cap M = \mathbb{C}\mathbf{1}$. Endomorphisms $\lambda, \mu \in \mathrm{End}(M)$ are called (inner) equivalent if there is a unitary $u \in M$ such that $\lambda = \mathrm{Ad}(u)\circ\mu$. The quotient of $\mathrm{End}(M)$ by inner equivalence is called the set of sectors of $M$ and denoted by $\mathrm{Sect}(M)$, and the equivalence class of $\lambda \in \mathrm{End}(M)$ is denoted by $[\lambda]_M$. However, we often drop the suffix and write $[\lambda]$ for $[\lambda]_M$ as long as it is clear which factor is meant. There is a natural product of sectors coming from the composition of endomorphisms. Explicitly, $[\lambda]\times[\mu] = [\lambda\circ\mu]$. There is also an addition of sectors. Let $\lambda_i \in \mathrm{End}(M)$, $i = 1, 2, \dots, n$. Since $M$ is infinite we can take a set of isometries $t_i \in M$, $i = 1, 2, \dots, n$, satisfying the relations of the Cuntz algebra $\mathcal{O}_n$,
\[ t_i^* t_j = \delta_{i,j}\,\mathbf{1}, \qquad \sum_{i=1}^n t_i t_i^* = \mathbf{1}. \]
Define $\lambda \in \mathrm{End}(M)$ by
\[ \lambda(m) = \sum_{i=1}^n t_i\,\lambda_i(m)\,t_i^*, \qquad m \in M. \]
Then $[\lambda]$ does not depend on the choice of the set of isometries and hence we can define the sum
\[ \bigoplus_{i=1}^n [\lambda_i] = [\lambda]. \]
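The independence of $[\lambda]$ from the chosen isometries can be checked directly. The following is a minimal sketch, not taken from the paper: the second family $s_1, \dots, s_n$ and the unitary $u$ are auxiliary names introduced here for illustration only.

```latex
% Assumption (illustrative): s_1,...,s_n is a second family of isometries in M
% with s_i^* s_j = \delta_{i,j} 1 and \sum_i s_i s_i^* = 1.  Put u = \sum_i s_i t_i^*.
\begin{align*}
u^*u &= \sum_{i,j} t_i\, s_i^* s_j\, t_j^* = \sum_i t_i t_i^* = \mathbf{1},
\qquad
uu^* = \sum_{i,j} s_i\, t_i^* t_j\, s_j^* = \sum_i s_i s_i^* = \mathbf{1}, \\
u\,\lambda(m)\,u^*
 &= \sum_{i,j,k} s_i\, t_i^* t_j\, \lambda_j(m)\, t_j^* t_k\, s_k^*
  = \sum_i s_i\,\lambda_i(m)\,s_i^*, \qquad m \in M .
\end{align*}
```

So the endomorphism built from the $s_i$ is $\mathrm{Ad}(u)\circ\lambda$ with $u \in M$ unitary, and it defines the same sector as $\lambda$.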
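Each $[\lambda_i]$ is called a subsector of $[\lambda]$. With the operations $\times$ and $\oplus$ that fulfill associativity and distributivity, $\mathrm{Sect}(M)$ becomes a unital semi-ring, and the unit is given by the identity (or trivial) sector $[\mathrm{id}]$. For $\lambda \in \mathrm{End}(M)$ irreducible, $\bar\lambda \in \mathrm{End}(M)$ is called conjugate if $[\lambda\circ\bar\lambda]$ and $[\bar\lambda\circ\lambda]$ both contain the identity sector once. The conjugate is unique up to inner equivalence. For general $\lambda$ let $\gamma_\lambda$ be the canonical endomorphism of $M$ into $\lambda(M)$. Then a conjugate is given by $\bar\lambda = \lambda^{-1}\circ\gamma_\lambda$. $[\bar\lambda]$ is called the conjugate sector, and the map $[\lambda] \mapsto [\bar\lambda]$ preserves sums (if $[\lambda] = [\lambda_1]\oplus[\lambda_2]$ then $[\bar\lambda] = [\bar\lambda_1]\oplus[\bar\lambda_2]$) and reverses products (if $[\lambda] = [\mu]\times[\nu]$ then $[\bar\lambda] = [\bar\nu]\times[\bar\mu]$). Furthermore, for an automorphism $\alpha \in \mathrm{Aut}(M)$ we have $[\alpha^{-1}] = [\bar\alpha]$. The number $d_\lambda = [M:\lambda(M)]^{1/2}$ is called the statistical dimension of $\lambda$. Then $d_\lambda = d_{\lambda'}$ if $[\lambda] = [\lambda']$, and $d_\lambda = d_{\bar\lambda}$. For $\lambda_1, \lambda_2 \in \mathrm{End}(M)$ such that $[\lambda] = [\lambda_1]\oplus[\lambda_2]$ we have $\langle\lambda,\mu\rangle_M = \langle\lambda_1,\mu\rangle_M + \langle\lambda_2,\mu\rangle_M$. If $\lambda, \mu, \nu, \bar\lambda, \bar\mu \in \mathrm{End}(M)$ have finite statistical dimension and $\bar\lambda$ and $\bar\mu$ are conjugates of $\lambda$ and $\mu$, respectively, then we have [18]
\[ \langle\lambda\circ\mu,\nu\rangle_M = \langle\lambda,\nu\circ\bar\mu\rangle_M = \langle\mu,\bar\lambda\circ\nu\rangle_M, \tag{1} \]
in particular $\langle\lambda,\mu\rangle_M = \langle\bar\lambda,\bar\mu\rangle_M$.

2.2. Statistics operators in algebraic quantum field theory. Let us briefly review some facts about the algebraic framework of quantum field theory [7, 8, 9, 10, 14]. As all our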
later applications are chiral theories we present the whole setting with the unit circle $S^1$ as the underlying “space-time” from the beginning. Since we will make explicit use of several well-known results and in order to make this article more self-contained we prefer to present the proofs which are simple and instructive, but compare also [12, 13].

Fix a point $z \in S^1$ on the circle and set
\[ \mathcal{J}_z = \{ I \subset S^1 \text{ non-void open interval},\ z \notin \bar{I} \}, \]
where $\bar{I}$ denotes the closure of $I$. A Haag–Kastler net on the punctured circle $\mathcal{A} = \{A(I),\ I \in \mathcal{J}_z\}$ is a family of von Neumann algebras acting on a Hilbert space $\mathcal{H}_0$ such that isotony holds, i.e. $I \subset J$ implies $A(I) \subset A(J)$, and we also have locality, i.e. $I_1 \cap I_2 = \emptyset$ implies $A(I_1) \subset A(I_2)'$. For subsets $R \subset S^1$ (which may touch or contain the “point at infinity” $z$) we define
\[ C^{(0)}_{\mathcal{A}}(R) = \bigcup_{J \in \mathcal{J}_z,\ J \subset R} A(J), \qquad C_{\mathcal{A}}(R) = \overline{C^{(0)}_{\mathcal{A}}(R)}^{\,\|\cdot\|}. \]
As usual, we denote the $C^*$-algebra of the whole circle by the same symbol as the net itself, $\mathcal{A} = C_{\mathcal{A}}(S^1)$. An endomorphism $\lambda \in \mathrm{End}(\mathcal{A})$ is called localized in an interval $I \in \mathcal{J}_z$ if $\lambda(a) = a$ for all $a \in C_{\mathcal{A}}(I')$, where $I'$ denotes the interior of the complement of $I$. A localized endomorphism $\lambda$ is called transportable if for all $J \in \mathcal{J}_z$ there are unitaries $U_{\lambda;I,J} \in \mathcal{A}$, called charge transporters, such that $\tilde\lambda = \mathrm{Ad}(U_{\lambda;I,J})\circ\lambda$ is localized in $J$. By $\Delta_{\mathcal{A}}(I)$ we denote the set of localized transportable (“DHR”) endomorphisms of $\mathcal{A}$ localized in $I \in \mathcal{J}_z$. Let us now assume Haag duality (on the punctured circle),
\[ A(I) = C_{\mathcal{A}}(I')', \qquad I \in \mathcal{J}_z. \tag{2} \]
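Note that then an endomorphism $\lambda \in \Delta_{\mathcal{A}}(I_\circ)$ leaves any local algebra $A(K)$ with $K \in \mathcal{J}_z$, $I_\circ \subset K$, invariant since $a'\lambda(a) = \lambda(a'a) = \lambda(aa') = \lambda(a)a'$ for any $a \in A(K)$ and $a' \in C_{\mathcal{A}}(K')$, hence $\lambda(a) \in A(K)$ by Haag duality.

Lemma 2.1. Let $I_1, I_2 \in \mathcal{J}_z$ such that $I_1 \cap I_2 = \emptyset$ and let $\lambda_i \in \Delta_{\mathcal{A}}(I_i)$, $i = 1, 2$. Then $\lambda_1$ and $\lambda_2$ commute, $\lambda_1\circ\lambda_2 = \lambda_2\circ\lambda_1$.

Proof. Take $I \in \mathcal{J}_z$ arbitrary. Then choose intervals $J_1, J_2 \in \mathcal{J}_z$ such that $J_i \cap I = \emptyset$, $i = 1, 2$, and that there are also intervals $K_1, K_2 \in \mathcal{J}_z$, $K_i \supset I_i \cup J_i$, $i = 1, 2$, and $K_1 \cap K_2 = \emptyset$. By transportability there are unitaries $U_i \equiv U_{\lambda_i;I_i,J_i}$ such that $\tilde\lambda_i = \mathrm{Ad}(U_i)\circ\lambda_i \in \Delta_{\mathcal{A}}(J_i)$, $i = 1, 2$. Then $U_i \in A(K_i)$ by Haag duality, hence $U_1U_2 = U_2U_1$ and $\tilde\lambda_1(U_2) = U_2$ and $\tilde\lambda_2(U_1) = U_1$. Then for any $a \in A(I)$ we have $\tilde\lambda_i(a) = a$, $i = 1, 2$, and thus
\begin{align*}
\lambda_1\circ\lambda_2(a) &= \mathrm{Ad}(U_1^*)\circ\tilde\lambda_1\circ\mathrm{Ad}(U_2^*)\circ\tilde\lambda_2(a) \\
 &= \mathrm{Ad}(U_1^*\tilde\lambda_1(U_2^*))\circ\tilde\lambda_1\circ\tilde\lambda_2(a) \\
 &= U_1^*U_2^*\,a\,U_2U_1 = U_2^*U_1^*\,a\,U_1U_2 \\
 &= \mathrm{Ad}(U_2^*\tilde\lambda_2(U_1^*))\circ\tilde\lambda_2\circ\tilde\lambda_1(a) \\
 &= \mathrm{Ad}(U_2^*)\circ\tilde\lambda_2\circ\mathrm{Ad}(U_1^*)\circ\tilde\lambda_1(a) \\
 &= \lambda_2\circ\lambda_1(a).
\end{align*}
Since $I$ was arbitrary it follows $\lambda_1\circ\lambda_2(a) = \lambda_2\circ\lambda_1(a)$ for any $a \in \mathcal{A}$.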
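Now assume that $\lambda, \mu$ are localized in the same interval $I \in \mathcal{J}_z$, $\lambda, \mu \in \Delta_{\mathcal{A}}(I)$. Then they will in general not commute; however, they are intertwined by a unitary operator which will be discussed in the following. Choose $I_1, I_2 \in \mathcal{J}_z$ such that $I_1 \cap I_2 = \emptyset$. Then there are unitaries $U_1 \equiv U_{\lambda;I,I_1}$ and $U_2 \equiv U_{\mu;I,I_2}$ such that $\lambda_1 = \mathrm{Ad}(U_1)\circ\lambda \in \Delta_{\mathcal{A}}(I_1)$ and $\mu_2 = \mathrm{Ad}(U_2)\circ\mu \in \Delta_{\mathcal{A}}(I_2)$. We set
\[ \epsilon^{I_1,I_2}_{U_1,U_2}(\lambda,\mu) = \mu(U_1^*)\,U_2^*\,U_1\,\lambda(U_2). \]
This operator has remarkable invariance properties. Let
\[ \mathcal{J}^2_{z,\mathrm{dis}} = \{ (I_1,I_2) \in \mathcal{J}_z\times\mathcal{J}_z,\ I_1 \cap I_2 = \emptyset \}. \]
For disjoint intervals $I_1, I_2 \in \mathcal{J}_z$ denote $I_2 > I_1$ (respectively $I_2 < I_1$) if $I_1$ lies clockwise (respectively counter-clockwise) to $I_2$ relative to the point $z$. Let
\[ \mathcal{J}^2_{z,+} = \{ (I_1,I_2) \in \mathcal{J}_z\times\mathcal{J}_z,\ I_2 > I_1 \}, \qquad \mathcal{J}^2_{z,-} = \{ (I_1,I_2) \in \mathcal{J}_z\times\mathcal{J}_z,\ I_2 < I_1 \}. \]
Then clearly $\mathcal{J}^2_{z,\mathrm{dis}} = \mathcal{J}^2_{z,+} \cup \mathcal{J}^2_{z,-}$.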
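Lemma 2.2. The operators $\epsilon^{I_1,I_2}_{U_1,U_2}(\lambda,\mu)$ do not depend on the special choice of $U_1$ and $U_2$; moreover, varying $I_1$ and $I_2$, $\epsilon^{I_1,I_2}_{U_1,U_2}(\lambda,\mu)$ remains constant on $\mathcal{J}^2_{z,+}$ and $\mathcal{J}^2_{z,-}$.

Proof. First replace $U_1$ by $\tilde U_1$ such that $\tilde\lambda_1 = \mathrm{Ad}(\tilde U_1)\circ\lambda \in \Delta_{\mathcal{A}}(I_1)$ as well. Then with $V_1 = \tilde U_1U_1^*$ we have $\tilde\lambda_1 = \mathrm{Ad}(V_1)\circ\lambda_1$ and hence $V_1 \in A(I_1)$ by Haag duality. Then we have
\begin{align*}
\epsilon^{I_1,I_2}_{\tilde U_1,U_2}(\lambda,\mu) &= \mu(\tilde U_1^*)\,U_2^*\,\tilde U_1\,\lambda(U_2) \\
 &= \mu(U_1^*V_1^*)\,U_2^*\,V_1U_1\,\lambda(U_2) \\
 &= \mu(U_1^*)\,U_2^*\,\mu_2(V_1^*)\,V_1U_1\,\lambda(U_2) = \epsilon^{I_1,I_2}_{U_1,U_2}(\lambda,\mu),
\end{align*}
since $\mu_2(V_1) = V_1$ by $I_1 \cap I_2 = \emptyset$. In the same way we can replace $U_2$ by some $\tilde U_2$ such that $\tilde\mu_2 = \mathrm{Ad}(\tilde U_2)\circ\mu \in \Delta_{\mathcal{A}}(I_2)$. In the next step we replace $I_1$ by some $\tilde I_1$ such that $\tilde I_1 \cap I_1 \neq \emptyset$ but still $\tilde I_1 \cap I_2 = \emptyset$. We can now assume that our chosen $\tilde U_1$ is such that $\tilde\lambda_1 \in \Delta_{\mathcal{A}}(\tilde I_1 \cap I_1)$, and hence we can use the same $\tilde U_1$ for the new interval $\tilde I_1$. In the same way we can replace $I_2$ by $\tilde I_2$. As long as $\tilde I_1 \cap \tilde I_2 = \emptyset$ we have the freedom to vary $\tilde U_1$ and $\tilde U_2$, and so on. Now assume that we have $I_2 > I_1$ for our initial intervals. By iteration of the above arguments we can reach any pair of intervals in $\mathcal{J}^2_{z,+}$, and similarly in $\mathcal{J}^2_{z,-}$ if $I_2 < I_1$; the lemma is proven.

We conclude that for any $\lambda, \mu \in \Delta_{\mathcal{A}}(I)$ there are only two operators $\epsilon_\pm(\lambda,\mu) = \epsilon^{I_1,I_2}_{U_1,U_2}(\lambda,\mu)$, where $(I_1,I_2) \in \mathcal{J}^2_{z,\pm}$, but $\epsilon_+(\lambda,\mu)$ and $\epsilon_-(\lambda,\mu)$ may be different in general. We now have even the choice to set $I_1 = I$ and $U_1 = \mathbf{1}$. We choose intervals $I_\pm \in \mathcal{J}_z$ such that $I_+ > I$ and $I_- < I$. If $U_{\mu,\pm} \equiv U_{\mu;I,I_\pm}$ are unitaries such that $\mu_\pm = \mathrm{Ad}(U_{\mu,\pm})\circ\mu \in \Delta_{\mathcal{A}}(I_\pm)$ then we find by putting $I_2 = I_+$ or $I_2 = I_-$,
\[ \epsilon_\pm(\lambda,\mu) = U_{\mu,\pm}^*\,\lambda(U_{\mu,\pm}). \]
The $\epsilon_\pm(\lambda,\mu)$’s are usually called statistics operators. Choose $K_+, K_- \in \mathcal{J}_z$ such that $I \cup I_\pm \subset K_\pm$ and $I_\pm \cap K_\mp = \emptyset$. Note that $U_{\mu,\pm} \in A(K_\pm)$ by Haag duality.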
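As a quick consistency check, the short form of the statistics operators can be read off from the general definition with $I_1 = I$, $U_1 = \mathbf{1}$ and $U_2 = U_{\mu,\pm}$, and unitarity follows because $\lambda$ is a unital $*$-endomorphism. This is a sketch only; $\epsilon$ is used here as the symbol for the statistics operators of the net $\mathcal{A}$.

```latex
\begin{align*}
\epsilon^{I,I_\pm}_{\mathbf{1},U_{\mu,\pm}}(\lambda,\mu)
 &= \mu(\mathbf{1})\,U_{\mu,\pm}^*\,\mathbf{1}\,\lambda(U_{\mu,\pm})
  = U_{\mu,\pm}^*\,\lambda(U_{\mu,\pm}), \\
\epsilon_\pm(\lambda,\mu)^*\,\epsilon_\pm(\lambda,\mu)
 &= \lambda(U_{\mu,\pm}^*)\,U_{\mu,\pm}U_{\mu,\pm}^*\,\lambda(U_{\mu,\pm})
  = \lambda(U_{\mu,\pm}^*U_{\mu,\pm}) = \mathbf{1}, \\
\epsilon_\pm(\lambda,\mu)\,\epsilon_\pm(\lambda,\mu)^*
 &= U_{\mu,\pm}^*\,\lambda(U_{\mu,\pm}U_{\mu,\pm}^*)\,U_{\mu,\pm} = \mathbf{1}.
\end{align*}
```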
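Lemma 2.3. For $\lambda, \mu, \nu \in \Delta_{\mathcal{A}}(I)$ we have
\begin{align*}
\epsilon_\pm(\lambda,\mu)\cdot\lambda\circ\mu(a) &= \mu\circ\lambda(a)\cdot\epsilon_\pm(\lambda,\mu), \qquad a \in \mathcal{A}, \tag{3} \\
\epsilon_\pm(\lambda,\mu) &\in A(I), \tag{4} \\
\epsilon_+(\lambda,\mu) &= (\epsilon_-(\mu,\lambda))^*, \tag{5} \\
\epsilon_\pm(\lambda\circ\mu,\nu) &= \epsilon_\pm(\lambda,\nu)\,\lambda(\epsilon_\pm(\mu,\nu)), \tag{6} \\
\epsilon_\pm(\lambda,\mu\circ\nu) &= \mu(\epsilon_\pm(\lambda,\nu))\,\epsilon_\pm(\lambda,\mu). \tag{7}
\end{align*}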
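Proof. Ad Eq. (3): For $a \in \mathcal{A}$ we compute
\begin{align*}
\epsilon_\pm(\lambda,\mu)\cdot\lambda\circ\mu(a) &= U_{\mu,\pm}^*\,\lambda(U_{\mu,\pm})\cdot\lambda\circ\mu(a) \\
 &= U_{\mu,\pm}^*\,\lambda(U_{\mu,\pm}\,\mu(a)) \\
 &= U_{\mu,\pm}^*\,\lambda(\mu_\pm(a)\,U_{\mu,\pm}) \\
 &= U_{\mu,\pm}^*\cdot\mu_\pm\circ\lambda(a)\cdot\lambda(U_{\mu,\pm}) \\
 &= \mu\circ\lambda(a)\cdot U_{\mu,\pm}^*\,\lambda(U_{\mu,\pm}) \\
 &= \mu\circ\lambda(a)\cdot\epsilon_\pm(\lambda,\mu).
\end{align*}
Ad Eq. (4): For $a \in C_{\mathcal{A}}(I')$ Eq. (3) reads $\epsilon_\pm(\lambda,\mu)\,a = a\,\epsilon_\pm(\lambda,\mu)$, i.e. $\epsilon_\pm(\lambda,\mu) \in C_{\mathcal{A}}(I')' = A(I)$.

Ad Eq. (5): From $U_{\mu,\pm} \in A(K_\pm)$ it follows $\lambda_-(U_{\mu,+}) = U_{\mu,+}$ and $\mu_+(U_{\lambda,-}) = U_{\lambda,-}$. Hence
\begin{align*}
\epsilon_+(\lambda,\mu) &= U_{\mu,+}^*\,\lambda(U_{\mu,+}) \\
 &= U_{\mu,+}^*\,U_{\lambda,-}^*\,U_{\lambda,-}\,\lambda(U_{\mu,+}) \\
 &= U_{\mu,+}^*\,U_{\lambda,-}^*\,\lambda_-(U_{\mu,+})\,U_{\lambda,-} \\
 &= U_{\mu,+}^*\,U_{\lambda,-}^*\,U_{\mu,+}\,U_{\lambda,-} \\
 &= U_{\mu,+}^*\,\mu_+(U_{\lambda,-}^*)\,U_{\mu,+}\,U_{\lambda,-} \\
 &= U_{\mu,+}^*\,U_{\mu,+}\,\mu(U_{\lambda,-}^*)\,U_{\lambda,-} \\
 &= \mu(U_{\lambda,-}^*)\,U_{\lambda,-} \\
 &= (\epsilon_-(\mu,\lambda))^*.
\end{align*}
Ad Eq. (6): Clearly $(\lambda\circ\mu)_\pm \in \Delta_{\mathcal{A}}(I_\pm)$, where
\[ (\lambda\circ\mu)_\pm = \lambda_\pm\circ\mu_\pm = \mathrm{Ad}(U_{\lambda\circ\mu,\pm})\circ\lambda\circ\mu, \qquad U_{\lambda\circ\mu,\pm} = U_{\lambda,\pm}\,\lambda(U_{\mu,\pm}). \]
Hence
\begin{align*}
\epsilon_\pm(\lambda\circ\mu,\nu) &= U_{\nu,\pm}^*\cdot\lambda\circ\mu(U_{\nu,\pm}) \\
 &= U_{\nu,\pm}^*\,\lambda(U_{\nu,\pm})\,\lambda(U_{\nu,\pm}^*)\cdot\lambda\circ\mu(U_{\nu,\pm}) \\
 &= \epsilon_\pm(\lambda,\nu)\,\lambda(\epsilon_\pm(\mu,\nu)).
\end{align*}
Ad Eq. (7): This follows now easily from Eqs. (5) and (6).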
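The step for Eq. (7) can be spelled out as follows. This is a sketch, writing $\epsilon_\pm$ for the statistics operators and using Eq. (5) in the form $\epsilon_\pm(\lambda,\mu) = (\epsilon_\mp(\mu,\lambda))^*$, which is Eq. (5) together with its adjoint after exchanging $\lambda$ and $\mu$.

```latex
\begin{align*}
\epsilon_\pm(\lambda,\mu\circ\nu)
 &= \bigl(\epsilon_\mp(\mu\circ\nu,\lambda)\bigr)^*
   && \text{by (5)} \\
 &= \bigl(\epsilon_\mp(\mu,\lambda)\,\mu(\epsilon_\mp(\nu,\lambda))\bigr)^*
   && \text{by (6)} \\
 &= \mu\bigl(\epsilon_\mp(\nu,\lambda)^*\bigr)\,\epsilon_\mp(\mu,\lambda)^* \\
 &= \mu\bigl(\epsilon_\pm(\lambda,\nu)\bigr)\,\epsilon_\pm(\lambda,\mu)
   && \text{by (5)}.
\end{align*}
```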
Note that Eq. (5) nicely reflects the invariance properties of $\epsilon_\pm(\lambda,\mu)$ as stated in Lemma 2.2.

2.3. The braiding fusion equations. We will now describe how the naturality and braiding fusion equations (BFEs) arise in the algebraic framework. The content of this subsection
is not essentially new (e.g. versions of these equations have already been given in [13]); however, as we will make explicit use of the different versions of the BFE we again present the proofs. Moreover, in view of our applications we want to formulate the BFEs for local intertwiners and therefore we have to require strong additivity of the underlying Haag–Kastler net. Strong additivity (or “irrelevance of points”) means that $A(I) = A(I_1) \vee A(I_2)$ whenever intervals $I_1$ and $I_2$ are obtained by removing one single point from the interval $I \in \mathcal{J}_z$. (For chiral conformal field theories strong additivity is in fact equivalent to Haag duality on the punctured circle.) This requirement basically ensures the equivalence of local and global intertwiners. In the following we will often consider elements of the set $\Delta_{\mathcal{A}}(I)$ as elements of $\mathrm{End}(A(K))$ for $I, K \in \mathcal{J}_z$ such that $I \subset K$, which is possible since elements of $\Delta_{\mathcal{A}}(I)$ leave $A(K)$ invariant.

Lemma 2.4. Suppose that $\mathcal{A}$ is strongly additive. Then for $\lambda, \mu \in \Delta_{\mathcal{A}}(I_\circ)$, $I_\circ \in \mathcal{J}_z$, we have
\[ \mathrm{Hom}_{\mathcal{A}}(\lambda,\mu) = \mathrm{Hom}_{A(I_\circ)}(\lambda,\mu). \tag{8} \]
Proof. We first show “⊂”. Assume $T \in \mathrm{Hom}_{\mathcal{A}}(\lambda,\mu)$. Then clearly $T\lambda(a) = \mu(a)T$ for all $a \in A(I_\circ)$. Moreover, as $Ta = T\lambda(a) = \mu(a)T = aT$ for all $a \in C_{\mathcal{A}}(I_\circ')$ we find $T \in C_{\mathcal{A}}(I_\circ')' = A(I_\circ)$, proving “⊂”. Next we show “⊃”. Assume $T_\circ \in \mathrm{Hom}_{A(I_\circ)}(\lambda,\mu)$. It suffices to show $T_\circ\lambda(a) = \mu(a)T_\circ$ for all $a \in A(I)$ and all $I \in \mathcal{J}_z$ such that $I_\circ \subset I$ ($I_\circ \neq I$) because then $T_\circ \in \mathrm{Hom}_{\mathcal{A}}(\lambda,\mu)$ by norm continuity. First assume that $I_\circ$ and $I$ have one boundary point in common, i.e. $I$ extends $I_\circ$ on one side. Then $I_1 = I \cap I_\circ'$ is an interval in $\mathcal{J}_z$ and $A(I) = A(I_\circ) \vee A(I_1)$ by strong additivity. We have $T_\circ\lambda(a) = \mu(a)T_\circ$ for all $a \in A(I_\circ)$ by assumption and also $T_\circ\lambda(a) = T_\circ a = aT_\circ = \mu(a)T_\circ$ for all $a \in A(I_1)$ since $T_\circ \in A(I_\circ)$. Hence $T_\circ$ intertwines $\lambda$ and $\mu$ on the subalgebra of $A(I)$ which is algebraically generated by $A(I_\circ)$ and $A(I_1)$ and is weakly dense by strong additivity. As endomorphisms in $\Delta_{\mathcal{A}}(I_\circ)$ are weakly continuous on any $A(I)$, $I_\circ \subset I$, it follows $T_\circ\lambda(a) = \mu(a)T_\circ$ for all $a \in A(I)$. If $I$ has no common boundary point with $I_\circ$ we just have to repeat the procedure to extend the interval also on the other side.

Now we are ready to prove the naturality equations for local intertwiners.

Proposition 2.5. For $\lambda, \mu, \rho \in \Delta_{\mathcal{A}}(I_\circ)$, $I_\circ \in \mathcal{J}_z$, and $T \in \mathrm{Hom}_{A(I_\circ)}(\lambda,\mu)$ we have the naturality equations
\begin{align*}
\rho(T)\,\epsilon_\pm(\lambda,\rho) &= \epsilon_\pm(\mu,\rho)\,T, \tag{9} \\
T\,\epsilon_\pm(\rho,\lambda) &= \epsilon_\pm(\rho,\mu)\,\rho(T). \tag{10}
\end{align*}
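Proof. Choose intervals $I_+, I_- \in \mathcal{J}_z$ such that $I_- < I_\circ < I_+$. We take unitaries $U_{\rho,\pm} \in \mathcal{A}$ such that $\rho_\pm = \mathrm{Ad}(U_{\rho,\pm})\circ\rho$ are localized in $I_\pm$. Then $T\lambda(U_{\rho,\pm}) = \mu(U_{\rho,\pm})T$ by Lemma 2.4. Moreover, $\rho_\pm(T) = T$ as $T \in A(I_\circ)$. We can now compute
\begin{align*}
\rho(T)\,\epsilon_\pm(\lambda,\rho) &= \rho(T)\,U_{\rho,\pm}^*\,\lambda(U_{\rho,\pm}) \\
 &= U_{\rho,\pm}^*\,\rho_\pm(T)\,\lambda(U_{\rho,\pm}) \\
 &= U_{\rho,\pm}^*\,T\,\lambda(U_{\rho,\pm}) \\
 &= U_{\rho,\pm}^*\,\mu(U_{\rho,\pm})\,T \\
 &= \epsilon_\pm(\mu,\rho)\,T,
\end{align*}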
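and Eq. (10) is obtained just by applying Eq. (9) to $T^* \in \mathrm{Hom}_{A(I_\circ)}(\mu,\lambda)$ and using Eq. (5).

By use of Eqs. (6) and (7) we obtain immediately the following

Corollary 2.6. For $\lambda, \mu, \nu, \rho \in \Delta_{\mathcal{A}}(I_\circ)$, $I_\circ \in \mathcal{J}_z$, and $S \in \mathrm{Hom}_{A(I_\circ)}(\lambda\circ\mu,\nu)$ we have the BFEs
\begin{align*}
\rho(S)\,\epsilon_\pm(\lambda,\rho)\,\lambda(\epsilon_\pm(\mu,\rho)) &= \epsilon_\pm(\nu,\rho)\,S, \tag{11} \\
S\,\lambda(\epsilon_\pm(\rho,\mu))\,\epsilon_\pm(\rho,\lambda) &= \epsilon_\pm(\rho,\nu)\,\rho(S). \tag{12}
\end{align*}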
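The way Eqs. (11) and (12) follow from the naturality equations can be sketched in two lines each. Here $\epsilon_\pm$ denotes the statistics operators, and Eqs. (9) and (10) are applied with $\lambda\circ\mu$ in place of $\lambda$, $\nu$ in place of $\mu$, and $T = S$.

```latex
\begin{align*}
\rho(S)\,\epsilon_\pm(\lambda,\rho)\,\lambda(\epsilon_\pm(\mu,\rho))
 &= \rho(S)\,\epsilon_\pm(\lambda\circ\mu,\rho) && \text{by (6)} \\
 &= \epsilon_\pm(\nu,\rho)\,S && \text{by (9) with } T = S, \\[4pt]
S\,\lambda(\epsilon_\pm(\rho,\mu))\,\epsilon_\pm(\rho,\lambda)
 &= S\,\epsilon_\pm(\rho,\lambda\circ\mu) && \text{by (7)} \\
 &= \epsilon_\pm(\rho,\nu)\,\rho(S) && \text{by (10) with } T = S.
\end{align*}
```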
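By Lemma 2.3, Eqs. (3) and (4), we find $\epsilon_\pm(\lambda,\mu) \in \mathrm{Hom}_{A(I_\circ)}(\lambda\circ\mu,\mu\circ\lambda)$. Using Eq. (11) and also Eq. (6) we obtain the Yang-Baxter equation (YBE).

Corollary 2.7. For $\lambda, \mu, \nu \in \Delta_{\mathcal{A}}(I_\circ)$ we have the YBE
\[ \nu(\epsilon_\pm(\lambda,\mu))\,\epsilon_\pm(\lambda,\nu)\,\lambda(\epsilon_\pm(\mu,\nu)) = \epsilon_\pm(\mu,\nu)\,\mu(\epsilon_\pm(\lambda,\nu))\,\epsilon_\pm(\lambda,\mu). \tag{13} \]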
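For orientation, the substitution behind Eq. (13) can be written out as follows. This is a sketch with $\epsilon_\pm$ denoting the statistics operators: take $S = \epsilon_\pm(\lambda,\mu)$ and $\rho = \nu$ in Eq. (11), so that the endomorphism called $\nu$ in Eq. (11) is $\mu\circ\lambda$ here.

```latex
\begin{align*}
\nu(\epsilon_\pm(\lambda,\mu))\,\epsilon_\pm(\lambda,\nu)\,\lambda(\epsilon_\pm(\mu,\nu))
 &= \epsilon_\pm(\mu\circ\lambda,\nu)\,\epsilon_\pm(\lambda,\mu)
   && \text{by (11)} \\
 &= \epsilon_\pm(\mu,\nu)\,\mu(\epsilon_\pm(\lambda,\nu))\,\epsilon_\pm(\lambda,\mu)
   && \text{by (6)}.
\end{align*}
```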
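We remark that the YBE is also true without the assumption of strong additivity because the statistics operators are global intertwiners.

Assume we have a Haag–Kastler net $\mathcal{N} = \{N(I),\ I \in \mathcal{J}_z\}$ of von Neumann algebras acting on a Hilbert space $\mathcal{H}$. If (the $C^*$-algebra) $\mathcal{N}$ leaves a subspace $\mathcal{H}_0 \subset \mathcal{H}$ invariant and the corresponding subrepresentation $\pi_0$ of the defining representation of $\mathcal{N}$ is faithful, we denote by $\mathcal{A} = \{A(I),\ I \in \mathcal{J}_z\}$ the isomorphic net given by
\[ A(I) = \pi_0(N(I)), \qquad I \in \mathcal{J}_z. \]
Then strong additivity of the net $\mathcal{N}$ is equivalent to strong additivity of the net $\mathcal{A}$. If the net $\mathcal{A}$ is Haag dual then we say that $\mathcal{N}$ has a faithful Haag dual subrepresentation. In that case one checks that $N(I) = C_{\mathcal{N}}(I')' \cap \mathcal{N}$ for $I \in \mathcal{J}_z$. Let $\Delta_{\mathcal{N}}(I)$ denote the set of transportable endomorphisms of $\mathcal{N}$ localized in $I \in \mathcal{J}_z$, i.e. for $\lambda \in \Delta_{\mathcal{N}}(I)$ and any $J \in \mathcal{J}_z$ there are unitary charge transporters $u_{\lambda;I,J} \in \mathcal{N}$ such that $\tilde\lambda = \mathrm{Ad}(u_{\lambda;I,J})\circ\lambda$ is localized in $J$. Then $U_{\lambda_0;I,J} = \pi_0(u_{\lambda;I,J})$ is a charge transporter of $\lambda_0 = \pi_0\circ\lambda\circ\pi_0^{-1} \in \Delta_{\mathcal{A}}(I)$. Note that, if $\mathcal{N}$ has a Haag dual subrepresentation, elements of $\Delta_{\mathcal{N}}(I)$ leave $N(K)$ invariant whenever $K \in \mathcal{J}_z$ contains $I$, so that elements of $\Delta_{\mathcal{N}}(I)$ can also be considered as elements of $\mathrm{End}(N(K))$. Now choose again $I_\circ, I_\pm \in \mathcal{J}_z$ such that $I_- < I_\circ < I_+$. For $\lambda, \mu \in \Delta_{\mathcal{N}}(I_\circ)$ we set $u_{\mu,\pm} = u_{\mu;I_\circ,I_\pm}$, and
\[ \varepsilon_\pm(\lambda,\mu) = u_{\mu,\pm}^*\,\lambda(u_{\mu,\pm}) \]
so that $\epsilon_\pm(\lambda_0,\mu_0) = \pi_0(\varepsilon_\pm(\lambda,\mu))$. We call the $\varepsilon_\pm(\lambda,\mu)$’s statistics operators as well. Now assume that $\mathcal{N}$ is strongly additive and let $\lambda, \mu, \rho \in \Delta_{\mathcal{N}}(I_\circ)$ and $t \in \mathrm{Hom}_{N(I_\circ)}(\lambda,\mu)$. Then $T = \pi_0(t) \in \mathrm{Hom}_{A(I_\circ)}(\lambda_0,\mu_0)$. This way we obtain $\mathrm{Hom}_{\mathcal{N}}(\lambda,\mu) = \mathrm{Hom}_{N(I_\circ)}(\lambda,\mu)$ from Lemma 2.4, and we have the naturality equations
\[ \rho_0(T)\,\epsilon_\pm(\lambda_0,\rho_0) = \epsilon_\pm(\mu_0,\rho_0)\,T, \qquad T\,\epsilon_\pm(\rho_0,\lambda_0) = \epsilon_\pm(\rho_0,\mu_0)\,\rho_0(T). \]
Applying $\pi_0^{-1}$ to this and Lemma 2.3 we arrive at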
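Corollary 2.8. Assume that $\mathcal{N}$ has a faithful Haag dual subrepresentation. Then we have for $\lambda, \mu \in \Delta_{\mathcal{N}}(I_\circ)$, $I_\circ \in \mathcal{J}_z$,
\begin{align*}
\varepsilon_\pm(\lambda,\mu)\cdot\lambda\circ\mu(n) &= \mu\circ\lambda(n)\cdot\varepsilon_\pm(\lambda,\mu), \qquad n \in \mathcal{N}, \tag{14} \\
\varepsilon_\pm(\lambda,\mu) &\in N(I_\circ), \tag{15} \\
\varepsilon_+(\lambda,\mu) &= (\varepsilon_-(\mu,\lambda))^*, \tag{16} \\
\varepsilon_\pm(\lambda\circ\mu,\nu) &= \varepsilon_\pm(\lambda,\nu)\,\lambda(\varepsilon_\pm(\mu,\nu)), \tag{17} \\
\varepsilon_\pm(\lambda,\mu\circ\nu) &= \mu(\varepsilon_\pm(\lambda,\nu))\,\varepsilon_\pm(\lambda,\mu). \tag{18}
\end{align*}
If in addition $\mathcal{N}$ is strongly additive and also $\nu, \rho \in \Delta_{\mathcal{N}}(I_\circ)$, then for $t \in \mathrm{Hom}_{N(I_\circ)}(\lambda,\mu)$ we have the naturality equations
\begin{align*}
\rho(t)\,\varepsilon_\pm(\lambda,\rho) &= \varepsilon_\pm(\mu,\rho)\,t, \tag{19} \\
t\,\varepsilon_\pm(\rho,\lambda) &= \varepsilon_\pm(\rho,\mu)\,\rho(t), \tag{20}
\end{align*}
for $s \in \mathrm{Hom}_{N(I_\circ)}(\lambda\circ\mu,\nu)$ we have the BFEs
\begin{align*}
\rho(s)\,\varepsilon_\pm(\lambda,\rho)\,\lambda(\varepsilon_\pm(\mu,\rho)) &= \varepsilon_\pm(\nu,\rho)\,s, \tag{21} \\
s\,\lambda(\varepsilon_\pm(\rho,\mu))\,\varepsilon_\pm(\rho,\lambda) &= \varepsilon_\pm(\rho,\nu)\,\rho(s), \tag{22}
\end{align*}
and the YBE
\[ \nu(\varepsilon_\pm(\lambda,\mu))\,\varepsilon_\pm(\lambda,\nu)\,\lambda(\varepsilon_\pm(\mu,\nu)) = \varepsilon_\pm(\mu,\nu)\,\mu(\varepsilon_\pm(\lambda,\nu))\,\varepsilon_\pm(\lambda,\mu). \tag{23} \]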
2.4. Nets of subfactors. A net of von Neumann algebras (or even factors) over a partially ordered index set $\mathcal{J}$ is an assignment $\mathcal{M} : \mathcal{J} \ni i \mapsto M_i$ of von Neumann algebras (or factors) on a Hilbert space $\mathcal{H}$ such that we have isotony, $M_i \subset M_j$ whenever $i \le j$. A net of subfactors consists of two nets of factors $\mathcal{N}$ and $\mathcal{M}$ such that we have subfactors $N_i \subset M_i$ for all $i \in \mathcal{J}$. We simply write $\mathcal{N} \subset \mathcal{M}$. A net of subfactors is called standard if there is a vector $\Omega \in \mathcal{H}$ that is cyclic and separating for every $M_i$ on $\mathcal{H}$ and $N_i$ on a subspace $\mathcal{H}_0 \subset \mathcal{H}$. Note that the projection $e_N \in B(\mathcal{H})$ onto $\mathcal{H}_0$ is the Jones projection for each inclusion $N_i \subset M_i$ for a standard net of subfactors. If there is also an assignment $E : \mathcal{J} \ni i \mapsto E_i$ of faithful normal conditional expectations from $M_i$ onto $N_i$ such that $E_i = E_j|_{M_i}$ for $i \le j$, then we say that $\mathcal{N} \subset \mathcal{M}$ has a faithful normal conditional expectation. $E$ is called standard if it preserves the vector state $\omega = \langle\Omega, \cdot\,\Omega\rangle$. If the index set $\mathcal{J}$ is directed we simply say $\mathcal{N} \subset \mathcal{M}$ is a directed net and we can form the $C^*$-algebras $\overline{\bigcup_{i\in\mathcal{J}} N_i}$ and $\overline{\bigcup_{i\in\mathcal{J}} M_i}$ and denote them, by abuse of notation, by the same symbols as used for the nets, $\mathcal{N}$ and $\mathcal{M}$, respectively. In [19] the following is proven.

Proposition 2.9. Let $\mathcal{N} \subset \mathcal{M}$ be a directed standard net of subfactors with a standard conditional expectation. For every $i \in \mathcal{J}$ there is an endomorphism $\gamma$ of the $C^*$-algebra $\mathcal{M}$ into $\mathcal{N}$ such that $\gamma|_{M_j}$ is a canonical endomorphism of $M_j$ into $N_j$ whenever $i \le j$. Furthermore, $\gamma$ acts trivially on $M_i' \cap \mathcal{N}$. As $i \in \mathcal{J}$ varies to any $i' \in \mathcal{J}$ the corresponding $\gamma$ and $\gamma'$ are inner equivalent by a unitary in $N_k$ provided $i, i' \le k$.

Since (for a fixed $i \in \mathcal{J}$) $\gamma$ is a canonical endomorphism of $M_j$ into $N_j$ whenever $i \le j$ there is a restriction of $\gamma$ to $\mathcal{N}$ that we denote by $\theta$,
\[ \theta = \gamma|_{\mathcal{N}} \in \mathrm{End}(\mathcal{N}). \tag{24} \]
Proposition 2.10. Let $\mathcal{N} \subset \mathcal{M}$ be a directed standard net of subfactors with a standard conditional expectation. Let $\gamma \in \mathrm{End}(\mathcal{M})$ be associated with some $i \in \mathcal{J}$ and $\theta \in \mathrm{End}(\mathcal{N})$ its restriction as above. Then we have unitary equivalences
\[ \pi^0 \simeq \pi_0\circ\gamma \qquad\text{and}\qquad \pi^0|_{\mathcal{N}} \simeq \pi_0\circ\theta, \tag{25} \]
where $\pi^0$ is the defining representation of $\mathcal{M}$ on $\mathcal{H}$ and $\pi_0$ the ensuing representation of $\mathcal{N}$ on $\mathcal{H}_0 = \overline{\mathcal{N}\Omega}$.

It is also proven in [19] that the Kosaki index is constant in a directed standard net of subfactors with a standard conditional expectation. Moreover, for such nets the following is shown in [19]. Pick $\gamma$ and $\theta$ for some $i \in \mathcal{J}$ as above. Then there is an isometry $w \in N_i$ satisfying $wn = \theta(n)w$ for all $n \in \mathcal{N}$ and inducing the conditional expectation $E$ by $E(m) = w^*\gamma(m)w$ for $m \in \mathcal{M}$. If in addition the index is finite, $[\mathcal{M}:\mathcal{N}] \equiv [M_i:N_i] < \infty$, then there is also an isometry $v \in M_i$ satisfying $vm = \gamma(m)v$ for all $m \in \mathcal{M}$ and $w^*v = [\mathcal{M}:\mathcal{N}]^{-1/2}\,\mathbf{1} = w^*\gamma(v)$. Then clearly $E(vv^*) = [\mathcal{M}:\mathcal{N}]^{-1}\,\mathbf{1}$, and we have also $M_j = N_jv$ whenever $i \le j$, and finally $\mathcal{M} = \mathcal{N}v$.

A directed standard net of subfactors with a standard conditional expectation is called a quantum field theoretical net of subfactors if the index set $\mathcal{J}$ admits a causal structure and we have $N_i \subset M_j'$ if $i$ and $j$ are causally disjoint. For our purposes we choose the directed set $\mathcal{J} = \mathcal{J}_z$ and assume that we have a given quantum field theoretical net of subfactors $\mathcal{N} \subset \mathcal{M}$. We denote by $\mathcal{A}$ the net (and the $C^*$-algebra)
\[ A(I) = \pi_0(N(I)), \qquad I \in \mathcal{J}_z. \tag{26} \]
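The relation $E(vv^*) = [\mathcal{M}:\mathcal{N}]^{-1}\,\mathbf{1}$ quoted above, and its compatibility with $\mathcal{M} = \mathcal{N}v$, can be verified in a few lines. This is a sketch that uses only $E(m) = w^*\gamma(m)w$, the normalization $w^*\gamma(v) = [\mathcal{M}:\mathcal{N}]^{-1/2}\,\mathbf{1}$, and the standard bimodule property $E(nm) = nE(m)$ of a conditional expectation onto $\mathcal{N}$.

```latex
\begin{align*}
E(vv^*) &= w^*\gamma(v)\,\gamma(v)^*w
  = \bigl(w^*\gamma(v)\bigr)\bigl(w^*\gamma(v)\bigr)^*
  = [\mathcal{M}:\mathcal{N}]^{-1}\,\mathbf{1}, \\
\text{for } m = nv,\ n \in \mathcal{N}:\qquad
[\mathcal{M}:\mathcal{N}]\cdot E(mv^*)\,v
 &= [\mathcal{M}:\mathcal{N}]\cdot n\,E(vv^*)\,v = nv = m .
\end{align*}
```

In particular the coefficient $n$ in $m = nv$ is recovered as $n = [\mathcal{M}:\mathcal{N}]\cdot E(mv^*)$.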
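As we are dealing with factors, $\pi_0$ is automatically faithful. We assume that $\mathcal{A}$ satisfies Haag duality, i.e. $\mathcal{N}$ has a faithful Haag dual subrepresentation. Fix an interval $I_\circ \in \mathcal{J}_z$ and take the endomorphism $\gamma$ of Proposition 2.9. First note that Proposition 2.9 tells us that $\theta \in \Delta_{\mathcal{N}}(I_\circ)$. Let us consider the situation that $\pi^0$ decomposes into a finite number of representations of $\mathcal{N}$ as follows,
\[ \pi_0\circ\theta \simeq \pi^0|_{\mathcal{N}} \simeq \bigoplus_{\ell=0}^n m_\ell\,\pi_\ell, \]
where $\pi_\ell$, $\ell = 0, 1, \dots, n$, are irreducible, mutually disjoint representations of $\mathcal{N}$ and $m_\ell$ are multiplicities. Assume that $\pi_\ell$ are such that we can write
\[ \pi_0\circ\theta \simeq \bigoplus_{\ell=0}^n m_\ell\cdot\pi_0\circ\lambda_\ell \]
with $\lambda_\ell \in \Delta_{\mathcal{N}}(I_\circ)$. Then this means that we have isometries $T_{\ell,r} \in B(\mathcal{H}_0)$, $\ell = 0, 1, \dots, n$, $r = 1, 2, \dots, m_\ell$, such that
\[ T_{\ell',r'}^*\,T_{\ell,r} = \delta_{\ell,\ell'}\,\delta_{r,r'}\,\mathbf{1}, \qquad \sum_{\ell=0}^n\sum_{r=1}^{m_\ell} T_{\ell,r}\,T_{\ell,r}^* = \mathbf{1}, \]
and
\[ \pi_0\circ\theta(n) = \sum_{\ell=0}^n\sum_{r=1}^{m_\ell} T_{\ell,r}\cdot\pi_0\circ\lambda_\ell(n)\cdot T_{\ell,r}^*, \qquad n \in \mathcal{N}. \]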
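As $\theta, \lambda_\ell \in \Delta_{\mathcal{N}}(I_\circ)$ it follows
\[ a = \sum_{\ell=0}^n\sum_{r=1}^{m_\ell} T_{\ell,r}\,a\,T_{\ell,r}^*, \qquad a \in C_{\mathcal{A}}(I_\circ'), \]
hence $T_{\ell,r} \in C_{\mathcal{A}}(I_\circ')' = A(I_\circ)$. Thus we can define $t_{\ell,r} = \pi_0^{-1}(T_{\ell,r}) \in N(I_\circ)$, and we find in particular
\[ \theta(n) = \sum_{\ell=0}^n\sum_{r=1}^{m_\ell} t_{\ell,r}\,\lambda_\ell(n)\,t_{\ell,r}^*, \qquad n \in N(I_\circ), \]
and this is in terms of sectors of $N(I_\circ)$
\[ [\theta] = \bigoplus_{\ell=0}^n m_\ell\,[\lambda_\ell]. \tag{27} \]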
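3. α-Induction for Nets of Subfactors

From now on we assume that we have a given quantum field theoretical net of subfactors $\mathcal{N} \subset \mathcal{M}$ over the index set $\mathcal{J}_z$, i.e. $N(I_1) \subset M(I_2)'$ if $I_1 \cap I_2 = \emptyset$. This implies locality of the net $\mathcal{N}$ but we even assume the net $\mathcal{M}$ to be local, and we also assume the net $\mathcal{A} = \{A(I) = \pi_0(N(I)),\ I \in \mathcal{J}_z\}$ to satisfy Haag duality. We also assume the net $\mathcal{N}$ (or equivalently the net $\mathcal{A}$) to be strongly additive. Moreover, we require the net $\mathcal{N} \subset \mathcal{M}$ to be of finite index, $[\mathcal{M}:\mathcal{N}] < \infty$. We fix an arbitrary interval $I_\circ \in \mathcal{J}_z$ and take the corresponding endomorphism $\gamma$ of Proposition 2.9.

3.1. Definition of α-induction. In the following we set $\varepsilon(\lambda,\mu) = \varepsilon_+(\lambda,\mu)$ for any $\lambda, \mu \in \Delta_{\mathcal{N}}(I_\circ)$. As usual, we denote by $v \in M(I_\circ)$ and $w \in N(I_\circ)$ the isometries which intertwine $\gamma \in \mathrm{End}(\mathcal{M})$ and its restriction $\theta \in \Delta_{\mathcal{N}}(I_\circ)$, respectively, and satisfy $w^*v = [\mathcal{M}:\mathcal{N}]^{-1/2}\,\mathbf{1} = w^*\gamma(v)$.

Lemma 3.1. For $\lambda \in \Delta_{\mathcal{N}}(I_\circ)$ we have
\[ \mathrm{Ad}(\varepsilon(\lambda,\theta))\circ\lambda\circ\gamma(v) = \theta(\varepsilon(\lambda,\theta)^*)\,\gamma(v). \tag{28} \]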
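Proof. By the intertwining property of $v$ we find $\gamma(v)^* \in \mathrm{Hom}_{N(I_\circ)}(\theta^2,\theta)$. Hence we can apply the BFE, Eq. (22), and obtain
\[ \varepsilon(\lambda,\theta)\cdot\lambda\circ\gamma(v)^* = \gamma(v)^*\,\theta(\varepsilon(\lambda,\theta))\,\varepsilon(\lambda,\theta), \]
hence
\[ \varepsilon(\lambda,\theta)\cdot\lambda\circ\gamma(v)\cdot\varepsilon(\lambda,\theta)^* = \theta(\varepsilon(\lambda,\theta)^*)\,\gamma(v). \]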
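The individual steps of this proof can be made explicit as follows. This is a sketch in which $\varepsilon$ abbreviates $\varepsilon(\lambda,\theta)$, and Eq. (22) is used with $s = \gamma(v)^*$, with $\theta$ in the roles of $\lambda$, $\mu$, $\nu$, and with the present $\lambda$ in the role of $\rho$.

```latex
\begin{align*}
&\text{Intertwining: apply } \gamma \text{ to } vn = \gamma(n)v,\ n \in N(I_\circ): \\
&\qquad \gamma(v)\,\theta(n) = \theta^2(n)\,\gamma(v)
  \ \Longrightarrow\ \gamma(v)^*\,\theta^2(n) = \theta(n)\,\gamma(v)^* . \\
&\text{BFE (22):}\qquad
  \gamma(v)^*\,\theta(\varepsilon)\,\varepsilon = \varepsilon\,\lambda(\gamma(v)^*) . \\
&\text{Adjoint:}\qquad
  \varepsilon^*\,\theta(\varepsilon^*)\,\gamma(v) = \lambda(\gamma(v))\,\varepsilon^* . \\
&\text{Multiply by the unitary } \varepsilon \text{ from the left:}\qquad
  \theta(\varepsilon^*)\,\gamma(v) = \varepsilon\,\lambda(\gamma(v))\,\varepsilon^*
  = \mathrm{Ad}(\varepsilon)\circ\lambda\circ\gamma(v) .
\end{align*}
```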
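If $I \in \mathcal{J}_z$ contains $I_\circ$ then for $n \in N(I)$ we have $\mathrm{Ad}(\varepsilon(\lambda,\theta))\circ\lambda\circ\gamma(n) = \mathrm{Ad}(\varepsilon(\lambda,\theta))\circ\lambda\circ\theta(n) = \theta\circ\lambda(n) \in \theta(N(I)) \subset \gamma(M(I))$, and note that then also $\theta(\varepsilon(\lambda,\theta)^*)\,\gamma(v) \in \gamma(M(I))$. Since each $m \in M(I)$ can be written as $m = nv$ for some $n \in N(I)$ we find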
Corollary 3.2. For any $I \in \mathcal J_z$ such that $I_\circ \subset I$ we have
\[ \mathrm{Ad}(\varepsilon(\lambda,\theta))\circ\lambda\circ\gamma(M(I)) \subset \gamma(M(I)). \tag{29} \]
Now we are ready to define α-induction – just by the formula (3.10) for the extended endomorphism in Proposition 3.9 in [19]. However, we have shown that this endomorphism leaves each algebra $M(I)$ with $I \in \mathcal J_z$ such that $I_\circ \subset I$ invariant.

Definition 3.3. For $\lambda \in \Delta_{\mathcal N}(I_\circ)$ we define the α-induced endomorphism $\alpha_\lambda \in \mathrm{End}(\mathcal M)$ by
\[ \alpha_\lambda = \gamma^{-1}\circ\mathrm{Ad}(\varepsilon(\lambda,\theta))\circ\lambda\circ\gamma. \tag{30} \]
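As a quick consistency check on this formula, the following sketch verifies that the induced endomorphism acts as $\lambda$ on $n \in N(I)$ for $I \supset I_\circ$. It uses only facts quoted earlier in this section: $\gamma$ restricts to $\theta$ on $\mathcal N$, and the statistics operator $\varepsilon(\lambda,\theta)$ is a unitary intertwining $\lambda\circ\theta$ and $\theta\circ\lambda$.

```latex
\begin{align*}
\alpha_\lambda(n)
  &= \gamma^{-1}\bigl(\varepsilon(\lambda,\theta)\,\lambda(\gamma(n))\,\varepsilon(\lambda,\theta)^*\bigr) \\
  &= \gamma^{-1}\bigl(\varepsilon(\lambda,\theta)\,\lambda(\theta(n))\,\varepsilon(\lambda,\theta)^*\bigr)
     && \text{since } \gamma|_{\mathcal N} = \theta \\
  &= \gamma^{-1}\bigl(\theta(\lambda(n))\bigr)
     && \text{since } \varepsilon(\lambda,\theta)\,\lambda\theta(n) = \theta\lambda(n)\,\varepsilon(\lambda,\theta) \\
  &= \gamma^{-1}\bigl(\gamma(\lambda(n))\bigr) = \lambda(n).
\end{align*}
```

So the only place where $\alpha_\lambda$ can differ from $\lambda$ is on the isometry $v$, which is why the later proofs reduce to statements about $\alpha_\lambda(v)$.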
Thanks to Corollary 3.2, $\alpha_\lambda$ is well defined and can also be considered as an element of $\mathrm{End}(M(I))$ as long as $I \in \mathcal J_z$ contains $I_\circ$. The definition of α-induction is such that $\alpha_\lambda$ is an extension of $\lambda$, i.e. we have $\alpha_\lambda(n) = \lambda(n)$ obviously for $n \in \mathcal N$.

3.2. The main formula for α-induction. Choose $I_+ \in \mathcal J_z$ such that $I_\circ < I_+$ and denote by $\gamma_+$ a (canonical) endomorphism associated to $I_+$ as in Proposition 2.9, and let $\theta_+$ be its restriction to $\mathcal N$. Then the unitary $u = [M : N]\cdot E(v_+ v^*) \in \mathcal N$ intertwines $\gamma$ and $\gamma_+$ and relates isometries $v$ and $v_+ \in M(I_+)$ by $v_+ = uv$ [19]. The proof of the following lemma from [19] makes use of locality of the net $\mathcal M$.

Lemma 3.4. We have
\[ \varepsilon(\theta,\theta)\,v^2 = \varepsilon(\theta,\theta)^*\,v^2 = v^2, \qquad \varepsilon(\theta,\theta)\,\gamma(v) = \varepsilon(\theta,\theta)^*\,\gamma(v) = \gamma(v). \tag{31} \]
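Proof. By the intertwining property of $u$ we have in particular $\theta_+ = \mathrm{Ad}(u)\circ\theta$. Therefore $u = u_{\theta,+}$ is a charge transporter for $\theta$ and we can write $\varepsilon(\theta,\theta) = u^*\theta(u)$. By locality of $\mathcal M$ we find $v_+ v = v v_+$, i.e. $uvv = vuv = \theta(u)vv$, hence $\varepsilon(\theta,\theta)\,v^2 \equiv u^*\theta(u)\,v^2 = v^2$. Since $v^2 = \gamma(v)v$ we obtain $\varepsilon(\theta,\theta)\,\gamma(v)vv^* = \gamma(v)vv^*$ by right multiplication with $v^*$. Application of the conditional expectation yields $\varepsilon(\theta,\theta)\,\gamma(v) = \gamma(v)$ since $E(vv^*) = w^*\gamma(vv^*)w = [M : N]^{-1}\,\mathbf 1$. Multiplying the obtained relations by $\varepsilon(\theta,\theta)^*$ from the left yields the full statement.

Later we will use the following important

Lemma 3.5. Let $t \in M(I_\circ)$ such that $t\lambda(n) = \mu(n)t$ for all $n \in N(I_\circ)$ and some $\lambda, \mu \in \Delta_{\mathcal N}(I_\circ)$. Then $t \in \mathrm{Hom}_{M(I_\circ)}(\alpha_\lambda, \alpha_\mu)$.

Proof. As $\alpha_\lambda, \alpha_\mu$ restrict, respectively, to $\lambda, \mu$ on $N(I_\circ)$ it suffices to show $t\alpha_\lambda(v) = \alpha_\mu(v)t$. Let $s = \gamma(t)$. Then clearly $s \in \mathrm{Hom}_{N(I_\circ)}(\theta\circ\lambda, \theta\circ\mu)$. By the BFE, Eq. (21), we obtain
\[ \varepsilon(\theta\circ\mu,\theta)\, s = \theta(s)\,\varepsilon(\theta,\theta)\,\theta(\varepsilon(\lambda,\theta)). \]
Since $\varepsilon(\theta\circ\mu,\theta) = \varepsilon(\theta,\theta)\,\theta(\varepsilon(\mu,\theta))$ we find
\[ s\,\theta(\varepsilon(\lambda,\theta)^*) = \theta(\varepsilon(\mu,\theta)^*)\,\varepsilon(\theta,\theta)^*\,\theta(s)\,\varepsilon(\theta,\theta). \]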
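So let us compute
\begin{align*}
s\cdot\mathrm{Ad}(\varepsilon(\lambda,\theta))\circ\lambda\circ\gamma(v)
 &= s\,\theta(\varepsilon(\lambda,\theta)^*)\,\gamma(v) \\
 &= \theta(\varepsilon(\mu,\theta)^*)\,\varepsilon(\theta,\theta)^*\,\theta(s)\,\varepsilon(\theta,\theta)\,\gamma(v) \\
 &= \theta(\varepsilon(\mu,\theta)^*)\,\varepsilon(\theta,\theta)^*\,\theta(s)\,\gamma(v) \\
 &= \theta(\varepsilon(\mu,\theta)^*)\,\varepsilon(\theta,\theta)^*\,\gamma(v)\,s \\
 &= \theta(\varepsilon(\mu,\theta)^*)\,\gamma(v)\,s \\
 &= \mathrm{Ad}(\varepsilon(\mu,\theta))\circ\mu\circ\gamma(v)\cdot s,
\end{align*}
where we repeatedly used Lemmata 3.1, 3.4, and also that $\theta(s)\gamma(v) = \gamma^2(t)\gamma(v) = \gamma(v)\gamma(t) = \gamma(v)s$. Thanks to Corollary 3.2 we can now apply $\gamma^{-1}$ and obtain $t\alpha_\lambda(v) = \alpha_\mu(v)t$.

Note that we obtained Lemma 3.5 just by the following ingredients: Haag duality and strong additivity of the net $\mathcal A$, implying existence of statistics operators and the BFEs for local intertwiners of endomorphisms in $\Delta_{\mathcal N}(I_\circ)$, and locality of the net $\mathcal M$, implying Lemma 3.4, and of course, finiteness of the index guaranteeing the existence of the isometry $v$.

Now consider the following special situation $\lambda = \mu = \mathrm{id}$ in Lemma 3.5. First note that $\alpha_{\mathrm{id}} = \mathrm{id}$ by the definition of α-induction. Then for each $t \in N(I_\circ)' \cap M(I_\circ)$ we find $t \in \mathrm{Hom}_{M(I_\circ)}(\mathrm{id}, \mathrm{id}) = M(I_\circ)' \cap M(I_\circ)$, i.e. $N(I_\circ)' \cap M(I_\circ) \subset M(I_\circ)' \cap M(I_\circ) = \mathbb C\,\mathbf 1$, and $I_\circ \in \mathcal J_z$ was arbitrary. Somewhat surprisingly, we gained

Corollary 3.6. Let $\mathcal N \subset \mathcal M$ be a directed quantum field theoretical net of subfactors over $\mathcal J_z$ with finite index. If $\mathcal N$ is strongly additive and has a Haag dual subrepresentation and $\mathcal M$ satisfies locality, then $\mathcal N \subset \mathcal M$ is a net of irreducible subfactors.

Another immediate consequence of Lemma 3.5 is the following

Corollary 3.7. If $[\lambda] = [\mu]$ for some $\lambda, \mu \in \Delta_{\mathcal N}(I_\circ)$, then $[\alpha_\lambda] = [\alpha_\mu]$.

(Here and in the following we use the sector brackets for sectors of either $N(I_\circ)$ or $M(I_\circ)$.)

Lemma 3.8. If $n \in \mathcal N$ then $nv = 0$ implies $n = 0$. Similarly, for $m \in \mathcal M$, $w^*\gamma(m) = 0$ implies $m = 0$.

Proof. This follows from the identities $n = [M : N]^{1/2}\, n w^*\gamma(v) = [M : N]^{1/2}\, w^*\gamma(nv)$, $n \in \mathcal N$, and $m = [M : N]^{1/2}\, w^* v m = [M : N]^{1/2}\, w^*\gamma(m)v$, $m \in \mathcal M$.

We are now ready to prove the main formula for α-induction given in the following

Theorem 3.9. For $\lambda, \mu \in \Delta_{\mathcal N}(I_\circ)$ we have
\[ \langle\alpha_\lambda, \alpha_\mu\rangle_{M(I_\circ)} = \langle\theta\circ\lambda, \mu\rangle_{N(I_\circ)}. \tag{32} \]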
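A simple sanity check of Eq. (32), sketched here under the standing assumptions of this section, is the case $\lambda = \mu = \mathrm{id}$. Since $\alpha_{\mathrm{id}} = \mathrm{id}$ and $M(I_\circ)$ is a factor, the left-hand side equals 1, so the formula says that $[\theta]$ contains the identity sector of $N(I_\circ)$ exactly once:

```latex
\[
  \langle \alpha_{\mathrm{id}}, \alpha_{\mathrm{id}} \rangle_{M(I_\circ)}
  = \dim\bigl( M(I_\circ)' \cap M(I_\circ) \bigr)
  = 1
  = \langle \theta, \mathrm{id} \rangle_{N(I_\circ)} .
\]
```

This matches the irreducibility statement of Corollary 3.6.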
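Proof. We first show “$\le$”. Let $t \in \mathrm{Hom}_{M(I_\circ)}(\alpha_\lambda, \alpha_\mu)$. We show that $r = w^*\gamma(t) \in \mathrm{Hom}_{N(I_\circ)}(\theta\circ\lambda, \mu)$. Clearly, $r \in N(I_\circ)$. By assumption, we have $t\alpha_\lambda(m) = \alpha_\mu(m)t$ for all $m \in M(I_\circ)$. Restriction to $N(I_\circ)$ and application of $\gamma$ yields $\gamma(t) \in \mathrm{Hom}_{N(I_\circ)}(\theta\circ\lambda, \theta\circ\mu)$. It follows for all $n \in N(I_\circ)$,
\[ r\cdot\theta\circ\lambda(n) = w^*\cdot\gamma(t)\cdot\theta\circ\lambda(n) = w^*\cdot\theta\circ\mu(n)\cdot\gamma(t) = \mu(n)\, r, \]
since $w^*\theta(n) = n w^*$. By Lemma 3.8 the map $t \mapsto r = w^*\gamma(t)$ is injective, thus “$\le$” is proven. We now turn to “$\ge$”. Suppose $r \in \mathrm{Hom}_{N(I_\circ)}(\theta\circ\lambda, \mu)$ is given. We show that $t = rv \in \mathrm{Hom}_{M(I_\circ)}(\alpha_\lambda, \alpha_\mu)$. Clearly, $t = rv \in M(I_\circ)$, and we have for all $n \in N(I_\circ)$,
\[ t\,\lambda(n) = rv\,\lambda(n) = r\cdot\theta\circ\lambda(n)\cdot v = \mu(n)\, rv = \mu(n)\, t. \]
Hence, by Lemma 3.5, we have $t \in \mathrm{Hom}_{M(I_\circ)}(\alpha_\lambda, \alpha_\mu)$. By Lemma 3.8, the map $r \mapsto t = rv$ is injective; the proof is complete.

3.3. Homomorphism property of α-induction. As $\alpha_\lambda$ restricts to $\lambda$ on $N(I_\circ)$ which is of finite index in $M(I_\circ)$, we find $d_{\alpha_\lambda} = d_\lambda$. This is an immediate consequence of the multiplicativity of the minimal index [17]: Consider the chain of inclusions $\alpha_\lambda(N(I_\circ)) \subset N(I_\circ) \subset M(I_\circ)$. Choose $\eta \in \mathrm{End}(M(I_\circ))$ such that $\eta(M(I_\circ)) = N(I_\circ)$. Then $[M(I_\circ) : N(I_\circ)] = d_\eta^2$ and $[M(I_\circ) : \alpha_\lambda(N(I_\circ))] = d_{\alpha_\lambda}^2 d_\eta^2$, hence $[N(I_\circ) : \alpha_\lambda(N(I_\circ))] = d_{\alpha_\lambda}^2$ but $[N(I_\circ) : \alpha_\lambda(N(I_\circ))] \equiv [N(I_\circ) : \lambda(N(I_\circ))] = d_\lambda^2$, thus indeed
\[ d_{\alpha_\lambda} = d_\lambda. \tag{33} \]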
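However, there are more properties.

Lemma 3.10. For any $\lambda, \mu \in \Delta_{\mathcal N}(I_\circ)$ we have $\alpha_{\lambda\circ\mu} = \alpha_\lambda\circ\alpha_\mu$.

Proof. We compute
\begin{align*}
\alpha_{\lambda\circ\mu}
 &= \gamma^{-1}\circ\mathrm{Ad}(\varepsilon(\lambda\circ\mu,\theta))\circ\lambda\circ\mu\circ\gamma \\
 &= \gamma^{-1}\circ\mathrm{Ad}(\varepsilon(\lambda,\theta)\,\lambda(\varepsilon(\mu,\theta)))\circ\lambda\circ\mu\circ\gamma \\
 &= \gamma^{-1}\circ\mathrm{Ad}(\varepsilon(\lambda,\theta))\circ\lambda\circ\mathrm{Ad}(\varepsilon(\mu,\theta))\circ\mu\circ\gamma \\
 &= \alpha_\lambda\circ\alpha_\mu,
\end{align*}
where we used Eq. (6).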
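The step in this computation that moves $\lambda(\varepsilon(\mu,\theta))$ inside $\lambda$ can be spelled out as follows. This is only a sketch: the shorthand $\varepsilon_\lambda = \varepsilon(\lambda,\theta)$, $\varepsilon_\mu = \varepsilon(\mu,\theta)$ is introduced here for readability and is not notation from the original, and $x$ stands for any element in the range of $\gamma$. The only property used is that $\lambda$ is a $*$-homomorphism.

```latex
\begin{align*}
\mathrm{Ad}\bigl(\varepsilon_\lambda\,\lambda(\varepsilon_\mu)\bigr)\bigl(\lambda(\mu(x))\bigr)
  &= \varepsilon_\lambda\,\lambda(\varepsilon_\mu)\,\lambda(\mu(x))\,\lambda(\varepsilon_\mu)^*\,\varepsilon_\lambda^* \\
  &= \varepsilon_\lambda\,\lambda\bigl(\varepsilon_\mu\,\mu(x)\,\varepsilon_\mu^*\bigr)\,\varepsilon_\lambda^* \\
  &= \mathrm{Ad}(\varepsilon_\lambda)\circ\lambda\circ\mathrm{Ad}(\varepsilon_\mu)\circ\mu\,(x).
\end{align*}
```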
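As $\varepsilon(\lambda,\mu) \in \mathrm{Hom}_{N(I_\circ)}(\lambda\circ\mu, \mu\circ\lambda)$ we obtain from Lemma 3.5 that $\varepsilon(\lambda,\mu)\,\alpha_{\lambda\circ\mu}(m) = \alpha_{\mu\circ\lambda}(m)\,\varepsilon(\lambda,\mu)$ for all $m \in M(I_\circ)$, in particular for $m = v$. Since $\mathcal M = \mathcal N v$ we obtain from Lemma 3.10 the following

Corollary 3.11. For $\lambda, \mu \in \Delta_{\mathcal N}(I_\circ)$ we have
\[ \alpha_\mu\circ\alpha_\lambda = \mathrm{Ad}(\varepsilon(\lambda,\mu))\circ\alpha_\lambda\circ\alpha_\mu. \tag{34} \]

As $\alpha_\lambda$ restricts to $\lambda$ on $\mathcal N$ we clearly have $\alpha_\lambda(\varepsilon(\mu,\nu)) = \lambda(\varepsilon(\mu,\nu))$ for $\lambda, \mu, \nu \in \Delta_{\mathcal N}(I_\circ)$. Therefore, by rewriting the YBE, Eq. (23), and recalling that $\varepsilon(\lambda,\lambda) \in \alpha_\lambda^2(M(I_\circ))' \cap M(I_\circ)$ by Corollary 3.11, we arrive at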
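Corollary 3.12. For $\lambda, \mu, \nu \in \Delta_{\mathcal N}(I_\circ)$ we have the YBE
\[ \alpha_\nu(\varepsilon(\lambda,\mu))\;\varepsilon(\lambda,\nu)\;\alpha_\lambda(\varepsilon(\mu,\nu)) = \varepsilon(\mu,\nu)\;\alpha_\mu(\varepsilon(\lambda,\nu))\;\varepsilon(\lambda,\mu), \tag{35} \]
in particular, the endomorphisms $\alpha_\lambda$ are braided endomorphisms, i.e. setting $\sigma_i = \alpha_\lambda^{i-1}(\varepsilon(\lambda,\lambda))$, $i = 1, 2, 3, \dots$, yields a representation of the braid group $B_\infty$.

Next we show that α-induction preserves also sums of sectors.

Lemma 3.13. Let $\lambda, \lambda_1, \lambda_2 \in \Delta_{\mathcal N}(I_\circ)$ such that $[\lambda] = [\lambda_1] \oplus [\lambda_2]$. Then $[\alpha_\lambda] = [\alpha_{\lambda_1}] \oplus [\alpha_{\lambda_2}]$.

Proof. As $[\lambda] = [\lambda_1] \oplus [\lambda_2]$ we have isometries $y_1, y_2 \in N(I_\circ)$ fulfilling the relations of $\mathcal O_2$, $y_i^* y_j = \delta_{i,j}\,\mathbf 1$, $\sum_{i=1}^2 y_i y_i^* = \mathbf 1$, and
\[ \lambda(n) = \sum_{i=1}^{2} y_i\,\lambda_i(n)\,y_i^*, \qquad n \in N(I_\circ). \]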
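We now choose an interval $I_+ \in \mathcal J_z$ such that $I_\circ < I_+$. Note that $y_i \in \mathrm{Hom}_{N(I_\circ)}(\lambda_i, \lambda) = \mathrm{Hom}_{\mathcal N}(\lambda_i, \lambda)$, $i = 1, 2$. Choose a charge transporter $u_{\theta,+} \in \mathcal N$ such that $\theta_+ = \mathrm{Ad}(u_{\theta,+})\circ\theta \in \Delta_{\mathcal N}(I_+)$. Then we have
\[ \varepsilon(\lambda,\theta) = u_{\theta,+}^*\,\lambda(u_{\theta,+}) = \sum_{i=1}^{2} u_{\theta,+}^*\,\lambda(u_{\theta,+})\,y_i y_i^* = \sum_{i=1}^{2} u_{\theta,+}^*\,y_i\,\lambda_i(u_{\theta,+})\,y_i^*. \]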
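Since $y_i \in N(I_\circ)$ we also find $\theta_+(y_i) = y_i$, $i = 1, 2$, and thus we compute for $n \in N(I_\circ)$,
\begin{align*}
\mathrm{Ad}(\varepsilon(\lambda,\theta))\circ\lambda(n)
 &= \sum_{i=1}^{2} u_{\theta,+}^*\,y_i\,\lambda_i(u_{\theta,+}\,n\,u_{\theta,+}^*)\,y_i^*\,u_{\theta,+} \\
 &= \sum_{i=1}^{2} u_{\theta,+}^*\,\theta_+(y_i)\,\lambda_i(u_{\theta,+}\,n\,u_{\theta,+}^*)\,\theta_+(y_i^*)\,u_{\theta,+} \\
 &= \sum_{i=1}^{2} \theta(y_i)\,u_{\theta,+}^*\,\lambda_i(u_{\theta,+}\,n\,u_{\theta,+}^*)\,u_{\theta,+}\,\theta(y_i^*) \\
 &= \sum_{i=1}^{2} \theta(y_i)\cdot\mathrm{Ad}(\varepsilon(\lambda_i,\theta))\circ\lambda_i(n)\cdot\theta(y_i^*).
\end{align*}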
Specializing to $n = \gamma(m)$, $m \in M(I_\circ)$, and applying $\gamma^{-1}$ yields
\[ \alpha_\lambda(m) = \sum_{i=1}^{2} y_i\,\alpha_{\lambda_i}(m)\,y_i^*, \qquad m \in M(I_\circ), \]
the lemma is proven.
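For sectors with finite statistical dimension we can show that α-induction preserves also sector conjugation.

Lemma 3.14. If $\bar\lambda \in \Delta_{\mathcal N}(I_\circ)$ is a conjugate to $\lambda \in \Delta_{\mathcal N}(I_\circ)$, $d_\lambda < \infty$, then $\alpha_{\bar\lambda}$ is a conjugate to $\alpha_\lambda$, i.e. $[\alpha_{\bar\lambda}] = [\overline{\alpha_\lambda}]$.

Proof. Using Lemma 3.10, Theorem 3.9 and Eq. (1) we get
\begin{align*}
\langle\alpha_\lambda, \alpha_\lambda\rangle_{M(I_\circ)}
 &= \langle\theta\circ\lambda, \lambda\rangle_{N(I_\circ)}
  = \langle\theta\circ\lambda\circ\bar\lambda, \mathrm{id}_{N(I_\circ)}\rangle_{N(I_\circ)}
  = \langle\alpha_{\lambda\circ\bar\lambda}, \mathrm{id}_{M(I_\circ)}\rangle_{M(I_\circ)} \\
 &= \langle\alpha_\lambda\circ\alpha_{\bar\lambda}, \mathrm{id}_{M(I_\circ)}\rangle_{M(I_\circ)}
  = \langle\alpha_{\bar\lambda}, \overline{\alpha_\lambda}\rangle_{M(I_\circ)}.
\end{align*}
Replacing $\lambda$ by $\bar\lambda$ yields $\langle\alpha_{\bar\lambda}, \alpha_{\bar\lambda}\rangle_{M(I_\circ)} = \langle\alpha_\lambda, \overline{\alpha_{\bar\lambda}}\rangle_{M(I_\circ)}$ whereas conjugation yields $\langle\alpha_\lambda, \overline{\alpha_{\bar\lambda}}\rangle_{M(I_\circ)} = \langle\overline{\alpha_\lambda}, \alpha_{\bar\lambda}\rangle_{M(I_\circ)}$. Thus we found $\langle\alpha_\lambda, \alpha_\lambda\rangle_{M(I_\circ)} = \langle\alpha_{\bar\lambda}, \overline{\alpha_\lambda}\rangle_{M(I_\circ)} = \langle\alpha_{\bar\lambda}, \alpha_{\bar\lambda}\rangle_{M(I_\circ)}$, and because we assumed finite statistical dimensions, these expressions are finite. Then this implies the statement.

Next we want to discuss certain commutativity rules between sectors arising from α-induction.

Lemma 3.15. Let $\lambda, \mu, \rho \in \Delta_{\mathcal N}(I_\circ)$ and $r \in M(I_\circ)$ such that $r\lambda(n) = \mu(n)r$ for all $n \in N(I_\circ)$. Then we have $r\,\varepsilon(\rho,\lambda) = \varepsilon(\rho,\mu)\,\alpha_\rho(r)$.

Proof. Note that $s = \gamma(r) \in \mathrm{Hom}_{N(I_\circ)}(\theta\circ\lambda, \theta\circ\mu)$. Thus the BFE, Eq. (22), yields
\[ s\,\theta(\varepsilon(\rho,\lambda))\,\varepsilon(\rho,\theta) = \varepsilon(\rho,\theta\circ\mu)\,\rho(s), \]
hence we obtain by using Eq. (18),
\[ s\,\theta(\varepsilon(\rho,\lambda)) = \theta(\varepsilon(\rho,\mu))\,\varepsilon(\rho,\theta)\,\rho(s)\,\varepsilon(\rho,\theta)^*, \]
and applying $\gamma^{-1}$ yields the statement.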
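Proposition 3.16. Let $\lambda, \mu \in \Delta_{\mathcal N}(I_\circ)$ and $\beta \in \mathrm{End}(M(I_\circ))$ such that $[\beta]$ is a subsector of $[\alpha_\mu]$. Then $[\alpha_\lambda\circ\beta] = [\beta\circ\alpha_\lambda]$.

Proof. By assumption, there is an isometry $t \in M(I_\circ)$, $t^* t = \mathbf 1$, such that
\[ t\,\beta(m) = \alpha_\mu(m)\,t, \qquad m \in M(I_\circ). \]
Then $u = t^*\varepsilon(\lambda,\mu)\alpha_\lambda(t) \in \mathrm{Hom}_{M(I_\circ)}(\alpha_\lambda\circ\beta, \beta\circ\alpha_\lambda)$ as we have for all $m \in M(I_\circ)$,
\begin{align*}
t^*\varepsilon(\lambda,\mu)\alpha_\lambda(t)\cdot\alpha_\lambda\circ\beta(m)
 &= t^*\varepsilon(\lambda,\mu)\cdot\alpha_\lambda\circ\alpha_\mu(m)\cdot\alpha_\lambda(t) \\
 &= t^*\cdot\alpha_\mu\circ\alpha_\lambda(m)\cdot\varepsilon(\lambda,\mu)\alpha_\lambda(t) \\
 &= \beta\circ\alpha_\lambda(m)\cdot t^*\varepsilon(\lambda,\mu)\alpha_\lambda(t),
\end{align*}
where we used Corollary 3.11. All we have to show is that $u$ is unitary. Note that $tt^* \in \mathrm{Hom}_{M(I_\circ)}(\alpha_\mu, \alpha_\mu)$ and hence in particular $tt^* \in \mu(N(I_\circ))' \cap M(I_\circ)$ as $\alpha_\mu$ restricts to $\mu$ on $N(I_\circ)$. Then Lemma 3.15 yields $tt^*\varepsilon(\lambda,\mu) = \varepsilon(\lambda,\mu)\alpha_\lambda(tt^*)$. Therefore
\[ u^* u = \alpha_\lambda(t^*)\,\varepsilon(\lambda,\mu)^*\,tt^*\,\varepsilon(\lambda,\mu)\,\alpha_\lambda(t) = \alpha_\lambda(t^* tt^* t) = \mathbf 1, \]
and
\[ u u^* = t^*\,\varepsilon(\lambda,\mu)\,\alpha_\lambda(tt^*)\,\varepsilon(\lambda,\mu)^*\,t = t^* tt^* t = \mathbf 1, \]
the proof is complete.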
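3.4. σ-restriction and ασ-reciprocity. In [19] there is also defined a restriction for endomorphisms. In our context, we will call that σ-restriction.

Definition 3.17. For $\beta \in \mathrm{End}(\mathcal M)$ the σ-restricted endomorphism $\sigma_\beta \in \mathrm{End}(\mathcal N)$ is defined by
\[ \sigma_\beta = \gamma\circ\beta|_{\mathcal N}. \tag{36} \]

If $\beta \in \mathrm{End}(\mathcal M)$ leaves $M(I)$ invariant for $I \in \mathcal J_z$, $I_\circ \subset I$, then clearly $\sigma_\beta$ leaves $N(I)$ invariant. Moreover, the formula $\sigma_\beta(n) = \gamma\circ\beta(n)$, $n \in N(I)$, defines also a map from $\mathrm{End}(M(I))$ to $\mathrm{End}(N(I))$. For $\lambda \in \Delta_{\mathcal N}(I_\circ)$ we obviously have $\sigma_{\alpha_\lambda} = \theta\circ\lambda$ so that in particular $[\lambda]$ is a subsector of $[\sigma_{\alpha_\lambda}]$. It is natural to ask whether $[\beta]$ is a subsector of $[\alpha_{\sigma_\beta}]$. For localized, transportable $\beta$ we are going to prove an even stronger result which is a sort of Frobenius reciprocity for α-induction and σ-restriction. For this we need some more preparation.

Clearly, if $\beta$ is localized in $I_\circ$ then so is $\sigma_\beta$ as for $n \in C_{\mathcal N}(I_\circ')$ we find $\sigma_\beta(n) = \gamma\circ\beta(n) = \gamma(n) = \theta(n) = n$ since $\theta$ is localized in $I_\circ$. Now suppose that $\beta$ is also transportable: For each $I_1 \in \mathcal J_z$ we have unitary charge transporters $Q_{\beta;I_\circ,I_1} \in \mathcal M$ such that $\beta_{I_1} = \mathrm{Ad}(Q_{\beta;I_\circ,I_1})\circ\beta$ is localized in $I_1$.

Lemma 3.18. If $\beta \in \Delta_{\mathcal M}(I_\circ)$ then $\sigma_\beta \in \Delta_{\mathcal N}(I_\circ)$. Namely, for any $I_1 \in \mathcal J_z$ we have $\sigma_{\beta,I_1} = \mathrm{Ad}(u_{\sigma_\beta;I_\circ,I_1})\circ\sigma_\beta \in \Delta_{\mathcal N}(I_1)$ with
\[ u_{\sigma_\beta;I_\circ,I_1} = u_{\theta;I_\circ,I_1}\,\gamma(Q_{\beta;I_\circ,I_1}). \tag{37} \]

Proof. We have to show that $\sigma_{\beta,I_1} = \mathrm{Ad}(u_{\sigma_\beta;I_\circ,I_1})\circ\sigma_\beta$ is localized in $I_1$. Now for $n \in C_{\mathcal N}(I_1')$ we have
\begin{align*}
\sigma_{\beta,I_1}(n)
 &= u_{\theta;I_\circ,I_1}\,\gamma(Q_{\beta;I_\circ,I_1})\cdot\gamma\circ\beta(n)\cdot\gamma(Q_{\beta;I_\circ,I_1})^*\,u_{\theta;I_\circ,I_1}^* \\
 &= u_{\theta;I_\circ,I_1}\cdot\gamma\circ\beta_{I_1}(n)\cdot u_{\theta;I_\circ,I_1}^*
  = u_{\theta;I_\circ,I_1}\,\gamma(n)\,u_{\theta;I_\circ,I_1}^* \\
 &= u_{\theta;I_\circ,I_1}\,\theta(n)\,u_{\theta;I_\circ,I_1}^*
  = \theta_{I_1}(n) = n,
\end{align*}
since $\theta_{I_1} = \mathrm{Ad}(u_{\theta;I_\circ,I_1})\circ\theta$ is localized in $I_1$.

For some interval $I_- \in \mathcal J_z$ such that $I_- < I_\circ$ we set $Q_{\beta,-} = Q_{\beta;I_\circ,I_-}$.

Lemma 3.19. For $\beta \in \Delta_{\mathcal M}(I_\circ)$ we have
\[ \varepsilon(\sigma_\beta,\theta) = \gamma^2(Q_{\beta,-})^*\,\varepsilon(\theta,\theta)\,\gamma(Q_{\beta,-}). \tag{38} \]

Proof. We compute
\begin{align*}
\varepsilon(\sigma_\beta,\theta)
 &= \varepsilon^-(\theta,\sigma_\beta)^*
  = \theta(u_{\sigma_\beta;I_\circ,I_-})^*\,u_{\sigma_\beta;I_\circ,I_-} \\
 &= \theta\bigl(\gamma(Q_{\beta,-})^*\,u_{\theta,-}^*\bigr)\,u_{\theta,-}\,\gamma(Q_{\beta,-}) \\
 &= \gamma^2(Q_{\beta,-})^*\,\theta(u_{\theta,-})^*\,u_{\theta,-}\,\gamma(Q_{\beta,-}) \\
 &= \gamma^2(Q_{\beta,-})^*\,\varepsilon^-(\theta,\theta)^*\,\gamma(Q_{\beta,-})
  = \gamma^2(Q_{\beta,-})^*\,\varepsilon(\theta,\theta)\,\gamma(Q_{\beta,-}),
\end{align*}
where we used Eq. (16).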
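For $I \in \mathcal J_z$ let $\Delta^{(0)}_{\mathcal M}(I)$ denote the set of transportable endomorphisms localized in $I$ which leave $M(K)$ invariant for any $K \in \mathcal J_z$ with $I \subset K$. Note that $\lambda(M(K)) \subset M(K)$ for $\lambda \in \Delta_{\mathcal M}(I)$ is automatically satisfied if $\mathcal M$ is Haag dual, i.e. $\Delta^{(0)}_{\mathcal M}(I) = \Delta_{\mathcal M}(I)$ in this case. However, in order to be as general as possible we do not assume Haag duality of $\mathcal M$ (although it is satisfied in the applications we have in mind) but we do need invariance of local algebras as we often consider elements of $\Delta^{(0)}_{\mathcal M}(I)$ as elements of $\mathrm{End}(M(K))$ for $I \subset K$.

Lemma 3.20. Let $t \in M(I_\circ)$ such that $t\lambda(n) = \beta(n)t$ for all $n \in N(I_\circ)$ and some $\lambda \in \Delta_{\mathcal N}(I_\circ)$ and $\beta \in \Delta^{(0)}_{\mathcal M}(I_\circ)$. Then $t \in \mathrm{Hom}_{M(I_\circ)}(\alpha_\lambda, \beta)$.

Proof. As $\alpha_\lambda(n) = \lambda(n)$ for all $n \in N(I_\circ)$ it suffices to show $t\alpha_\lambda(v) = \beta(v)t$. Let $s = \gamma(t)$. Then clearly $s \in \mathrm{Hom}_{N(I_\circ)}(\theta\circ\lambda, \sigma_\beta)$. By the BFE, Eq. (21), we obtain
\[ s\,\theta(\varepsilon(\lambda,\theta)^*) = \varepsilon(\sigma_\beta,\theta)^*\,\theta(s)\,\varepsilon(\theta,\theta). \]
So let us compute
\begin{align*}
s\cdot\mathrm{Ad}(\varepsilon(\lambda,\theta))\circ\lambda\circ\gamma(v)
 &= s\,\theta(\varepsilon(\lambda,\theta)^*)\,\gamma(v) \\
 &= \varepsilon(\sigma_\beta,\theta)^*\,\theta(s)\,\varepsilon(\theta,\theta)\,\gamma(v) \\
 &= \varepsilon(\sigma_\beta,\theta)^*\,\theta(s)\,\gamma(v) \\
 &= \varepsilon(\sigma_\beta,\theta)^*\,\gamma(v)\,s \\
 &= \gamma(Q_{\beta,-})^*\,\varepsilon(\theta,\theta)^*\,\gamma^2(Q_{\beta,-})\,\gamma(v)\,s \\
 &= \gamma(Q_{\beta,-})^*\,\varepsilon(\theta,\theta)^*\,\gamma(v)\,\gamma(Q_{\beta,-})\,s \\
 &= \gamma(Q_{\beta,-})^*\,\gamma(v)\,\gamma(Q_{\beta,-})\,s \\
 &= \gamma(Q_{\beta,-}^*\,v\,Q_{\beta,-})\,s \\
 &= \gamma(Q_{\beta,-}^*\,\beta_{I_-}(v)\,Q_{\beta,-})\,s \\
 &= \gamma\circ\beta(v)\cdot s,
\end{align*}
where we repeatedly used Lemmata 3.1, 3.4 and 3.19. Applying $\gamma^{-1}$ yields $t\alpha_\lambda(v) = \beta(v)t$.

Now we are ready to prove the reciprocity theorem.

Theorem 3.21. For $\lambda \in \Delta_{\mathcal N}(I_\circ)$ and $\beta \in \Delta^{(0)}_{\mathcal M}(I_\circ)$ we have ασ-reciprocity,
\[ \langle\alpha_\lambda, \beta\rangle_{M(I_\circ)} = \langle\lambda, \sigma_\beta\rangle_{N(I_\circ)}. \tag{39} \]

Proof. We first show “$\le$”. Let $t \in \mathrm{Hom}_{M(I_\circ)}(\alpha_\lambda, \beta)$. We show that $r = \gamma(t)w \in \mathrm{Hom}_{N(I_\circ)}(\lambda, \sigma_\beta)$. Clearly, $r \in N(I_\circ)$. By assumption, we have $t\alpha_\lambda(m) = \beta(m)t$ for all $m \in M(I_\circ)$. Restriction to $N(I_\circ)$ and application of $\gamma$ yields $\gamma(t) \in \mathrm{Hom}_{N(I_\circ)}(\theta\circ\lambda, \sigma_\beta)$. It follows for all $n \in N(I_\circ)$,
\[ r\,\lambda(n) = \gamma(t)w\,\lambda(n) = \gamma(t)\cdot\theta\circ\lambda(n)\cdot w = \sigma_\beta(n)\,\gamma(t)w = \sigma_\beta(n)\,r. \]
By Lemma 3.8 the map $t \mapsto r = \gamma(t)w$ is injective, thus “$\le$” is proven. We now turn to “$\ge$”. Suppose $r \in \mathrm{Hom}_{N(I_\circ)}(\lambda, \sigma_\beta)$ is given. We show that $t = v^* r \in \mathrm{Hom}_{M(I_\circ)}(\alpha_\lambda, \beta)$. Clearly, $t = v^* r \in M(I_\circ)$, and we have for all $n \in N(I_\circ)$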
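\[ t\,\lambda(n) = v^* r\,\lambda(n) = v^*\,\sigma_\beta(n)\,r = v^*\cdot\gamma\circ\beta(n)\cdot r = \beta(n)\,v^* r = \beta(n)\,t. \]
Hence, by Lemma 3.20, we have $t \in \mathrm{Hom}_{M(I_\circ)}(\alpha_\lambda, \beta)$. It follows again from Lemma 3.8 that the map $r \mapsto t = v^* r$ is injective; the proof is complete.

It follows from the proof that we have $v^* \in \mathrm{Hom}_{M(I_\circ)}(\alpha_{\sigma_\beta}, \beta)$ since $\mathbf 1 \in \mathrm{Hom}_{N(I_\circ)}(\sigma_\beta, \sigma_\beta)$. (Recall $\sigma_\beta \in \Delta_{\mathcal N}(I_\circ)$ by Lemma 3.18.) We conclude that $[\beta]$ is a subsector of $[\alpha_{\sigma_\beta}]$.

Remark. Note that Theorem 3.21 is not a generalization of Theorem 3.9 since we assumed in particular that $\beta$ is localized. However, $\alpha_\mu$ is in general not localized; it is localized if and only if the monodromy $\varepsilon(\mu,\theta)\varepsilon(\theta,\mu)$ is trivial (Prop. 3.9 in [19]).

Note that σ-restriction does not preserve sector products, i.e. $[\sigma_{\beta_1}\circ\sigma_{\beta_2}]$ is in general different from $[\sigma_{\beta_1\circ\beta_2}]$, e.g. for $\beta_1 = \beta_2 = \mathrm{id}$. However, we add the following

Lemma 3.22. Let $\beta, \beta_1, \beta_2 \in \mathrm{End}(M(I_\circ))$. If $[\beta] = [\beta_1] \oplus [\beta_2]$ then $[\sigma_\beta] = [\sigma_{\beta_1}] \oplus [\sigma_{\beta_2}]$. If $[\beta_1] = [\beta_2]$ then $[\sigma_{\beta_1}] = [\sigma_{\beta_2}]$.

Proof. If $[\beta] = [\beta_1] \oplus [\beta_2]$ then there are isometries $t_1, t_2 \in M(I_\circ)$ satisfying the relations of $\mathcal O_2$ and $\beta(m) = \sum_{i=1}^2 t_i\,\beta_i(m)\,t_i^*$ for $m \in M(I_\circ)$. Then $s_i = \gamma(t_i)$ satisfy the relations of $\mathcal O_2$ as well and
\[ \sigma_\beta(n) = \gamma\circ\beta(n) = \sum_{i=1}^{2} s_i\cdot\gamma\circ\beta_i(n)\cdot s_i^* = \sum_{i=1}^{2} s_i\,\sigma_{\beta_i}(n)\,s_i^*, \qquad n \in N(I_\circ). \]
If $[\beta_1] = [\beta_2]$ then $\beta_2 = \mathrm{Ad}(u)\circ\beta_1$ with some unitary $u \in M(I_\circ)$. Then clearly $\sigma_{\beta_2} = \mathrm{Ad}(\gamma(u))\circ\sigma_{\beta_1}$, and $\gamma(u) \in N(I_\circ)$ is unitary.

3.5. The inverse braiding. We have used the statistics operators $\varepsilon(\lambda,\theta) \equiv \varepsilon^+(\lambda,\theta)$ for the definition of the α-induced endomorphism $\alpha_\lambda \equiv \alpha_\lambda^+$. Of course, all the results we derived hold similarly for the endomorphisms $\alpha_\lambda^-$, analogously defined by use of $\varepsilon^-(\lambda,\theta)$. However, $\alpha_\lambda$ and $\alpha_\lambda^-$ are in general not the same. In this subsection we investigate several relations between $\alpha_\lambda$ and $\alpha_\lambda^-$. The following proposition is instructive.

Proposition 3.23. For $\lambda \in \Delta_{\mathcal N}(I_\circ)$ the following are equivalent:
1. $[\alpha_\lambda] = [\alpha_\lambda^-]$,
2. $\alpha_\lambda = \alpha_\lambda^-$,
3. The monodromy is trivial: $\varepsilon(\lambda,\theta)\varepsilon(\theta,\lambda) = \mathbf 1$.

Proof. If $[\alpha_\lambda] = [\alpha_\lambda^-]$ then there is a unitary $u \in \mathrm{Hom}_{M(I_\circ)}(\alpha_\lambda, \alpha_\lambda^-)$, i.e. $u\alpha_\lambda(m) = \alpha_\lambda^-(m)u$ for all $m \in M(I_\circ)$. Restriction yields $u\lambda(n) = \lambda(n)u$ for all $n \in N(I_\circ)$. By Lemma 3.5 we find $u \in \mathrm{Hom}_{M(I_\circ)}(\alpha_\lambda, \alpha_\lambda)$, in particular $u\alpha_\lambda(v) = \alpha_\lambda(v)u$, hence $\alpha_\lambda(v)u = \alpha_\lambda^-(v)u$, thus $\alpha_\lambda(v) = \alpha_\lambda^-(v)$. But $\alpha_\lambda(n) = \lambda(n) = \alpha_\lambda^-(n)$ for all $n \in \mathcal N$, therefore $\alpha_\lambda(m) = \alpha_\lambda^-(m)$ for any $m \in \mathcal M$, proving $\alpha_\lambda = \alpha_\lambda^-$. Now by Lemma 3.1 we have $\alpha_\lambda(v) = \varepsilon(\lambda,\theta)^* v$, and similarly $\alpha_\lambda^-(v) = \varepsilon^-(\lambda,\theta)^* v = \varepsilon(\theta,\lambda)v$. Therefore $\alpha_\lambda(v) = \alpha_\lambda^-(v)$ implies $\varepsilon(\lambda,\theta)\varepsilon(\theta,\lambda)v = v$ and hence $\varepsilon(\lambda,\theta)\varepsilon(\theta,\lambda) = \mathbf 1$ by Lemma 3.8. Now if the monodromy is trivial then $\varepsilon(\lambda,\theta) = \varepsilon^-(\lambda,\theta)$, and this trivially leads to $[\alpha_\lambda] = [\alpha_\lambda^-]$.

Nevertheless we have the following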
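Lemma 3.24. For $\lambda, \mu \in \Delta_{\mathcal N}(I_\circ)$ we have
\[ \mathrm{Ad}(\varepsilon(\lambda,\mu))\circ\alpha_\lambda\circ\alpha_\mu^- = \alpha_\mu^-\circ\alpha_\lambda. \tag{40} \]

Proof. As $\alpha_\lambda$ and $\alpha_\mu^-$ restrict to $\lambda$ and $\mu$, respectively, on $\mathcal N$ it suffices to show $\varepsilon(\lambda,\mu)\cdot\alpha_\lambda\circ\alpha_\mu^-(v) = \alpha_\mu^-\circ\alpha_\lambda(v)\cdot\varepsilon(\lambda,\mu)$. Recall $\alpha_\lambda(v) = \varepsilon(\lambda,\theta)^* v$ by Lemma 3.1, and similarly $\alpha_\mu^-(v) = \varepsilon^-(\mu,\theta)^* v = \varepsilon(\theta,\mu)v$. The YBE, Eq. (23), can be written as
\[ \varepsilon(\lambda,\mu)\,\lambda(\varepsilon(\theta,\mu))\,\varepsilon(\lambda,\theta)^* = \mu(\varepsilon(\lambda,\theta)^*)\,\varepsilon(\theta,\mu)\,\theta(\varepsilon(\lambda,\mu)). \]
Now we compute
\begin{align*}
\varepsilon(\lambda,\mu)\cdot\alpha_\lambda\circ\alpha_\mu^-(v)
 &= \varepsilon(\lambda,\mu)\,\alpha_\lambda(\varepsilon(\theta,\mu)v) \\
 &= \varepsilon(\lambda,\mu)\,\lambda(\varepsilon(\theta,\mu))\,\varepsilon(\lambda,\theta)^*\,v \\
 &= \mu(\varepsilon(\lambda,\theta)^*)\,\varepsilon(\theta,\mu)\,\theta(\varepsilon(\lambda,\mu))\,v \\
 &= \mu(\varepsilon(\lambda,\theta)^*)\,\varepsilon(\theta,\mu)\,v\,\varepsilon(\lambda,\mu) \\
 &= \alpha_\mu^-(\varepsilon(\lambda,\theta)^* v)\,\varepsilon(\lambda,\mu) \\
 &= \alpha_\mu^-\circ\alpha_\lambda(v)\cdot\varepsilon(\lambda,\mu),
\end{align*}
proving the lemma.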
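The following lemma establishes a sort of naturality equations for the α-induced endomorphisms.

Lemma 3.25. Let $\lambda, \mu, \rho \in \Delta_{\mathcal N}(I_\circ)$. For an $r \in M(I_\circ)$ such that $r\lambda(n) = \mu(n)r$ for all $n \in N(I_\circ)$ we have
\begin{align}
\alpha_\rho^\pm(r)\,\varepsilon^\mp(\lambda,\rho) &= \varepsilon^\mp(\mu,\rho)\,r, \tag{41} \\
r\,\varepsilon^\pm(\rho,\lambda) &= \varepsilon^\pm(\rho,\mu)\,\alpha_\rho^\pm(r). \tag{42}
\end{align}

Proof. Completely analogous to Lemma 3.15 we also obtain $r\,\varepsilon^-(\rho,\lambda) = \varepsilon^-(\rho,\mu)\,\alpha_\rho^-(r)$, establishing Eq. (42). Now note that $r^*\mu(n) = \lambda(n)r^*$ for all $n \in N(I_\circ)$, therefore we can apply Eq. (42) yielding Eq. (41) by use of Eq. (16).

We are now ready to prove the following

Proposition 3.26. Let $\lambda, \mu \in \Delta_{\mathcal N}(I_\circ)$ and $\beta, \delta \in \mathrm{End}(M(I_\circ))$ such that $[\beta]$ and $[\delta]$ are subsectors of $[\alpha_\lambda]$ and $[\alpha_\mu^-]$, respectively. Then $[\beta\circ\delta] = [\delta\circ\beta]$.

Proof. By assumption, there are isometries $t, s \in M(I_\circ)$, $t^* t = s^* s = \mathbf 1$, such that
\[ t\,\beta(m) = \alpha_\lambda(m)\,t, \qquad s\,\delta(m) = \alpha_\mu^-(m)\,s, \qquad m \in M(I_\circ). \]
Then $u = s^*\alpha_\mu^-(t^*)\,\varepsilon(\lambda,\mu)\,\alpha_\lambda(s)\,t \in \mathrm{Hom}_{M(I_\circ)}(\beta\circ\delta, \delta\circ\beta)$ as we have for all $m \in M(I_\circ)$,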
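\begin{align*}
s^*\alpha_\mu^-(t^*)\,\varepsilon(\lambda,\mu)\,\alpha_\lambda(s)\,t\cdot\beta\circ\delta(m)
 &= s^*\alpha_\mu^-(t^*)\,\varepsilon(\lambda,\mu)\,\alpha_\lambda(s)\cdot\alpha_\lambda\circ\delta(m)\cdot t \\
 &= s^*\alpha_\mu^-(t^*)\,\varepsilon(\lambda,\mu)\cdot\alpha_\lambda\circ\alpha_\mu^-(m)\cdot\alpha_\lambda(s)\,t \\
 &= s^*\alpha_\mu^-(t^*)\cdot\alpha_\mu^-\circ\alpha_\lambda(m)\cdot\varepsilon(\lambda,\mu)\,\alpha_\lambda(s)\,t \\
 &= s^*\cdot\alpha_\mu^-\circ\beta(m)\cdot\alpha_\mu^-(t^*)\,\varepsilon(\lambda,\mu)\,\alpha_\lambda(s)\,t \\
 &= \delta\circ\beta(m)\cdot s^*\alpha_\mu^-(t^*)\,\varepsilon(\lambda,\mu)\,\alpha_\lambda(s)\,t,
\end{align*}
where we used Lemma 3.24. All we have to show is that $u$ is unitary. Note that $tt^* \in \mathrm{Hom}_{M(I_\circ)}(\alpha_\lambda, \alpha_\lambda)$ and $ss^* \in \mathrm{Hom}_{M(I_\circ)}(\alpha_\mu^-, \alpha_\mu^-)$ and hence in particular $tt^* \in \lambda(N(I_\circ))' \cap M(I_\circ)$ and $ss^* \in \mu(N(I_\circ))' \cap M(I_\circ)$ as $\alpha_\lambda$ and $\alpha_\mu^-$ restrict to $\lambda$ and $\mu$, respectively, on $N(I_\circ)$. Then Lemma 3.25 yields $\alpha_\mu^-(tt^*)\,\varepsilon(\lambda,\mu) = \varepsilon(\lambda,\mu)\,tt^*$ by Eq. (41) and $ss^*\,\varepsilon(\lambda,\mu) = \varepsilon(\lambda,\mu)\,\alpha_\lambda(ss^*)$ by Eq. (42). Therefore
\begin{align*}
u^* u
 &= t^*\alpha_\lambda(s^*)\,\varepsilon(\lambda,\mu)^*\,\alpha_\mu^-(t)\,ss^*\,\alpha_\mu^-(t^*)\,\varepsilon(\lambda,\mu)\,\alpha_\lambda(s)\,t \\
 &= t^*\alpha_\lambda(s^*)\,\varepsilon(\lambda,\mu)^*\,ss^*\,\alpha_\mu^-(tt^*)\,\varepsilon(\lambda,\mu)\,\alpha_\lambda(s)\,t \\
 &= t^*\alpha_\lambda(s^*)\,\varepsilon(\lambda,\mu)^*\,ss^*\,\varepsilon(\lambda,\mu)\,tt^*\,\alpha_\lambda(s)\,t \\
 &= t^*\alpha_\lambda(s^* ss^* s)\,tt^* t = \mathbf 1,
\end{align*}
and
\begin{align*}
u u^*
 &= s^*\alpha_\mu^-(t^*)\,\varepsilon(\lambda,\mu)\,\alpha_\lambda(s)\,tt^*\,\alpha_\lambda(s^*)\,\varepsilon(\lambda,\mu)^*\,\alpha_\mu^-(t)\,s \\
 &= s^*\alpha_\mu^-(t^*)\,\varepsilon(\lambda,\mu)\,tt^*\,\alpha_\lambda(ss^*)\,\varepsilon(\lambda,\mu)^*\,\alpha_\mu^-(t)\,s \\
 &= s^*\alpha_\mu^-(t^* tt^*)\,ss^*\,\alpha_\mu^-(t)\,s = \mathbf 1,
\end{align*}
the proof is complete.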
Recall that $[\beta]$ is a subsector of $[\alpha_{\sigma_\beta}]$ for any $\beta \in \Delta^{(0)}_{\mathcal M}(I_\circ)$, and in the same way it is a subsector of $[\alpha^-_{\sigma_\beta}]$. From Proposition 3.26 we obtain immediately

Corollary 3.27. For any $\beta \in \Delta^{(0)}_{\mathcal M}(I_\circ)$ and any $\delta \in \mathrm{End}(M(I_\circ))$ such that $[\delta]$ is a subsector of some $[\alpha_\mu]$, $\mu \in \Delta_{\mathcal N}(I_\circ)$, we have $[\beta\circ\delta] = [\delta\circ\beta]$.

4. Miscellanea

4.1. The results in terms of sector algebras. We now want to present our results in the language of sector algebras. We need some preparation.

Definition 4.1. Let $V$ be a (real or complex), finite dimensional, unital, associative algebra (with addition $\oplus$ and multiplication $\times$) together with a basis $\mathcal V = \{v_0, v_1, v_2, \dots, v_{d-1}\}$ (in the linear space sense) such that $\mathbf 1 \in \mathcal V$, say $v_0 = \mathbf 1$. Let $N_{i,j}^k$ be the structure constants, defined by
\[ v_i \times v_j = \bigoplus_{k=0}^{d-1} N_{i,j}^k\, v_k. \tag{43} \]
If
1. (Conjugation) there is an involutive permutation $i \mapsto \bar\imath$, $\bar{\bar\imath} = i$, satisfying $N_{i,j}^0 = \delta_{i,\bar\jmath}$ and $N_{i,j}^k = N_{\bar\imath,k}^j = N_{k,\bar\jmath}^i$ (so that it extends to an anti-automorphism of $V$),
2. (Positive Integrality) the structure constants are non-negative integers, $N_{i,j}^k \in \mathbb N_0$,
then $(V, \mathcal V)$ (or simply $V$) is called a sector algebra. If $V$ is commutative, $N_{i,j}^k = N_{j,i}^k$, then it is called a fusion algebra.
Now let $M$ be an infinite factor and $\mathcal V = \{[\lambda_0], [\lambda_1], [\lambda_2], \dots, [\lambda_{d-1}]\}$ be a finite set of irreducible sectors with finite statistical dimension, which contains the trivial sector, say $[\lambda_0] = [\mathrm{id}]$, and is closed under sector conjugation and the sector product. The latter means that the irreducible decomposition of each product $[\lambda_i] \times [\lambda_j]$ is a sum of elements in $\mathcal V$ (possibly with some multiplicities). We simply call such a set a sector basis. We can consider a sector basis as the basis of an algebra $V$ where the summation $\oplus$ and multiplication $\times$ comes from the sum and product of sectors in the obvious sense. By the properties of addition and multiplication of sectors, $V$ is indeed a sector algebra, and the structure constants are given by $N_{i,j}^k = \langle\lambda_i\circ\lambda_j, \lambda_k\rangle_M$, where $\lambda_i$ denote representative endomorphisms of the sector $[\lambda_i]$.

Now suppose that we have a net of subfactors $\mathcal N \subset \mathcal M$ as described at the beginning of Sect. 3. We denote by $[\Delta]_{\mathcal N}(I_\circ) \subset \mathrm{Sect}(N(I_\circ))$ the set of DHR sectors, i.e. the quotient of $\Delta_{\mathcal N}(I_\circ)$ by inner equivalence in $N(I_\circ)$ (and similarly $[\Delta]^{(0)}_{\mathcal M}(I_\circ) \subset \mathrm{Sect}(M(I_\circ))$ as the quotient of $\Delta^{(0)}_{\mathcal M}(I_\circ)$ by inner equivalence in $M(I_\circ)$). Suppose we have a given sector basis $\mathcal W \subset [\Delta]_{\mathcal N}(I_\circ)$. Because of the commutativity of sectors in $[\Delta]_{\mathcal N}(I_\circ)$, the associated sector algebra $W$ is indeed a fusion algebra. As α-induction preserves unitary equivalence by Corollary 3.7, the map $\lambda \mapsto \alpha_\lambda$ extends to a map $[\alpha]: [\lambda] \mapsto [\alpha_\lambda]$, from $\mathcal W$ to $\mathrm{Sect}(M(I_\circ))$. Now let $\mathcal V$ denote the set of all irreducible subsectors $[\beta] \in \mathrm{Sect}(M(I_\circ))$ of every $[\alpha_\lambda]$, $[\lambda] \in \mathcal W$. Since α-induction preserves the sector product and conjugation, $\mathcal V$ must be a sector basis and we denote by $V$ the associated sector algebra. However, $V$ is not necessarily commutative. We now summarize the results of Subsect. 3.3, Eq. (33), Lemmata 3.10, 3.13, 3.14 and Proposition 3.16, in the following

Theorem 4.2. Let $\mathcal W \subset [\Delta]_{\mathcal N}(I_\circ)$ be a sector basis and $W$ the associated fusion algebra, and let $\mathcal V \subset \mathrm{Sect}(M(I_\circ))$ be the corresponding sector basis obtained by α-induction and $V$ the associated sector algebra. Then α-induction extends to a homomorphism $[\alpha] : W \to V$, preserving conjugates and statistical dimensions. Each $[\alpha_\lambda]$, $[\lambda] \in \mathcal W$, commutes with each $[\beta] \in \mathcal V$. If $[\alpha]$ is surjective, i.e. each element in $V$ can be written as a linear combination of $[\alpha_{\lambda_i}]$'s, $[\lambda_i] \in \mathcal W$, then the sector algebra $V$ is a fusion algebra.

Now we turn to the discussion of σ-restriction in terms of sectors. By Lemma 3.22, the map $\beta \mapsto \sigma_\beta$ extends to a map from $\mathrm{Sect}(M(I_\circ))$ to $\mathrm{Sect}(N(I_\circ))$. We can therefore summarize the results of Theorem 3.21 and Corollary 3.27 as follows.

Theorem 4.3. Let $\mathcal T \subset [\Delta]^{(0)}_{\mathcal M}(I_\circ)$ be a sector basis and $T$ the associated fusion algebra. Let also $\mathcal W \subset [\Delta]_{\mathcal N}(I_\circ)$ be a sector basis with associated fusion algebra $W$, and $\mathcal V$, $V$ obtained by α-induction as above. If all elements of $\mathcal T$ are mapped to elements in $W$ by σ-restriction, then $\mathcal T \subset \mathcal V$ and $T \subset V$ is a (sector) subalgebra. Moreover, any element of $\mathcal T$ commutes with every element of $\mathcal V$.

4.2. The subgroup net of subfactors. Although we postpone all our (conformal field theory) applications to the forthcoming paper [1] let us briefly discuss a simple example here. Consider a situation as in the DHR theory [7], i.e. we have a net $\mathcal F$ of local field algebras $F(I)$, $I \in \mathcal J_z$, that are type III-factors, and we have a compact gauge group
$G$ acting outerly on each $F(I)$, and this action is implemented on the Hilbert space $\mathcal H$ by a unitary representation $U$. The net $\mathcal A$ of observable algebras is then given by the fixed point algebras $A(I) = F(I)^G$. (There are also some more physically motivated assumptions, e.g. certain space-time transformation properties and that observables and fields associated to relatively spacelike regions commute.) Now suppose that we are dealing with a finite gauge group, and that $H \subset G$ is a subgroup. We define another net $\mathcal B$ by taking the fixed point algebras with respect to the subgroup, $B(I) = F(I)^H$. Then we clearly obtain a net of subfactors $\mathcal A \subset \mathcal B$ of finite index. (The index is in fact $[G : H]$.) Under the standard assumptions of the DHR theory [7] the Hilbert space $\mathcal H$ decomposes with respect to the action of $\mathcal A$ as
\[ \mathcal H = \bigoplus_{\pi \in \hat G} \mathcal H_\pi \otimes \mathbb C^{d_\pi}. \tag{44} \]
Here $\pi \in \hat G$ are the irreducible representations of $G$ of dimension $d_\pi$, and $\mathcal H_\pi$ are pairwise inequivalent representation spaces of $\mathcal A$, the superselection sectors. The gauge group $G$ acts on the multiplicity spaces $\mathbb C^{d_\pi}$ by the representation $\pi$, i.e.
\[ U(g) = \bigoplus_{\pi \in \hat G} \mathbf 1_{\mathcal H_\pi} \otimes \pi(g), \qquad g \in G. \tag{45} \]
With respect to $\mathcal B$ we have another decomposition
\[ \mathcal H = \bigoplus_{\rho \in \hat H} \mathcal H_\rho \otimes \mathbb C^{d_\rho}, \tag{46} \]
where now $\rho \in \hat H$ label the irreducible representations (of dimension $d_\rho$) of the subgroup $H$. Since $A(I) = F(I) \cap U(G)'$ and $B(I) = F(I) \cap U(H)'$ it is not hard to see that the decompositions of Eq. (44) and Eq. (46) are related by
\[ \mathcal H_\rho = \bigoplus_{\pi \in \hat G} \mathcal H_\pi \otimes \mathbb C^{n^\pi_\rho}, \tag{47} \]
where $n^\pi_\rho$ are the induction-restriction coefficients
\[ n^\pi_\rho = \langle\rho, \mathrm{res}^G_H\,\pi\rangle_{\mathbb Z[\hat H]} = \langle\mathrm{ind}^G_H\,\rho, \pi\rangle_{\mathbb Z[\hat G]}. \]

Now we define a net of subfactors $\mathcal N \subset \mathcal M$ by $N(I) = \hat\pi_0(A(I))$ and $M(I) = \hat\pi_0(B(I))$ where $\hat\pi_0$ is the ensuing representation of $\mathcal B$ on $\mathcal H_{\rho=0}$. Let us assume that our requirements of Haag duality and strong additivity for the net $\mathcal N$ and locality of the net $\mathcal M$ are fulfilled. Let $\lambda_\pi \in \Delta_{\mathcal N}(I_\circ)$ and $\beta_\rho \in \Delta^{(0)}_{\mathcal M}(I_\circ)$, $I_\circ \in \mathcal J_z$, denote localized endomorphisms corresponding to the superselection sectors $\mathcal H_\pi$, $\pi \in \hat G$, and $\mathcal H_\rho$, $\rho \in \hat H$, so that they obey in particular the fusion rules of $\hat G$ and $\hat H$, respectively, and their statistical dimensions coincide with the dimensions of the corresponding group representations. We learn from Proposition 2.10 (see also [19]) that σ-restriction corresponds to the restriction of representations of the net $\mathcal M$ to the net $\mathcal N$, i.e. $\pi^0\circ\beta_\rho|_{\mathcal N} \simeq \pi_0\circ\sigma_{\beta_\rho}$. This restriction can be read off from Eq. (47), hence we conclude for $\rho \in \hat H$,
\[ [\sigma_{\beta_\rho}] = \bigoplus_{\pi \in \hat G} n^\pi_\rho\, [\lambda_\pi]. \]
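From ασ-reciprocity, Theorem 3.21, $\langle\alpha_{\lambda_\pi}, \beta_\rho\rangle_{M(I_\circ)} = \langle\lambda_\pi, \sigma_{\beta_\rho}\rangle_{N(I_\circ)} = n^\pi_\rho$, we conclude (recall $d_{\alpha_\lambda} = d_\lambda$)
\[ [\alpha_{\lambda_\pi}] = \bigoplus_{\rho \in \hat H} n^\pi_\rho\, [\beta_\rho]. \]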
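To make these decompositions concrete, here is a small worked instance under an illustrative choice that is not in the original: $G = S_3$ with $H = \mathbb Z_2$ generated by a transposition. Write $\mathbf 1$, $\mathrm{sgn}$, $\mathrm{std}$ for the trivial, sign and two-dimensional irreducible representations of $S_3$, and $+$, $-$ for the two characters of $\mathbb Z_2$. Restricting each $\pi$ to $H$ gives the coefficients $n^\pi_\rho$, and the two formulas of this subsection then read as follows.

```latex
% Restriction to H = Z_2:  1 -> +,   sgn -> -,   std -> (+) plus (-)
\[
\begin{array}{c|cc}
 n^\pi_\rho          & \rho = + & \rho = - \\ \hline
 \pi = \mathbf 1     & 1        & 0        \\
 \pi = \mathrm{sgn}  & 0        & 1        \\
 \pi = \mathrm{std}  & 1        & 1
\end{array}
\]
\begin{align*}
[\alpha_{\lambda_{\mathbf 1}}]    &= [\beta_+], &
[\alpha_{\lambda_{\mathrm{sgn}}}] &= [\beta_-], &
[\alpha_{\lambda_{\mathrm{std}}}] &= [\beta_+] \oplus [\beta_-], \\
[\sigma_{\beta_+}] &= [\lambda_{\mathbf 1}] \oplus [\lambda_{\mathrm{std}}], &
[\sigma_{\beta_-}] &= [\lambda_{\mathrm{sgn}}] \oplus [\lambda_{\mathrm{std}}]. &&
\end{align*}
```

In this sketch $[\theta] = [\sigma_{\beta_+}]$ has statistical dimension $1 + 2 = 3 = [G : H]$, matching the index, and $\langle\alpha_{\lambda_{\mathrm{std}}}, \alpha_{\lambda_{\mathrm{std}}}\rangle = 2$ agrees with $\langle\theta\circ\lambda_{\mathrm{std}}, \lambda_{\mathrm{std}}\rangle = 2$ as predicted by Theorem 3.9, since $(\mathbf 1 \oplus \mathrm{std}) \otimes \mathrm{std} = \mathbf 1 \oplus \mathrm{sgn} \oplus 2\,\mathrm{std}$.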
In other words, for this particular example of the subgroup net of subfactors, σ-restriction corresponds to the induction, α-induction corresponds to the restriction of group representations, and ασ-reciprocity reflects Frobenius reciprocity.

4.3. Remarks. In view of our later applications to chiral conformal field theories [1] we have presented the theory for nets of subfactors indexed by the set $\mathcal J_z$, i.e. with the punctured circle $S^1 \setminus \{z\}$ as the underlying “space-time”, and we also required strong additivity of $\mathcal N$ or, equivalently, of $\mathcal A$. Note that for chiral conformal field theories strong additivity is equivalent to the already assumed Haag duality (on the punctured circle). For the general case we assumed strong additivity so that local intertwiners (of localized endomorphisms) extend to global ones and therefore satisfy the naturality equations and BFEs. One may however drop the strong additivity assumption and work with global intertwiners from the beginning. The invariance of local algebras $M(I)$, $I \in \mathcal J_z$, $I_\circ \subset I$, under the action of $\alpha_\lambda$ is also true without the strong additivity assumption because $v$ itself is a global intertwiner. Moreover, many of our results possess global analogues, e.g. Theorem 3.9 then reads $\langle\alpha_\lambda, \alpha_\mu\rangle_{\mathcal M} = \langle\theta\circ\lambda, \mu\rangle_{\mathcal N}$ for $\lambda, \mu \in \Delta_{\mathcal N}(I_\circ)$ or Theorem 3.21 becomes $\langle\alpha_\lambda, \beta\rangle_{\mathcal M} = \langle\lambda, \sigma_\beta\rangle_{\mathcal N}$, $\beta \in \Delta^{(0)}_{\mathcal M}(I_\circ)$. However, we cannot obtain Corollary 3.6 without the strong additivity assumption and we need the local formulation for the results concerning the subsectors of the $[\alpha_\lambda]$'s. But global analogues of the results not depending on the strong additivity can also be generalized to other space-times like the $D$-dimensional Minkowski space $\mathbb M^D$ with $D = 2, 3, 4, \dots$ (as long as we have transportable endomorphisms). One just has to replace intervals $I$ by double cones $\mathcal O$ and to substitute “disjoint”, $I_1 \cap I_2 = \emptyset$, by “causally disjoint”, i.e. “relatively spacelike”, $\mathcal O_1 \subset \mathcal O_2'$. But notice that for $\mathbb M^D$ with $D > 2$ the spacelike complement of any double cone is connected, and this implies that we only have one statistics operator. There are no longer two different braidings and hence we have $\alpha_\lambda \equiv \alpha_\lambda^+ = \alpha_\lambda^-$, and in particular all induced endomorphisms $\alpha_\lambda$ are localized.

Acknowledgement. We are grateful to K.-H. Rehren for several useful comments on an earlier version of the manuscript. This project is supported by the EU TMR Network in Non-Commutative Geometry.
References
1. Böckenhauer, J., Evans, D.E.: Modular Invariants, Graphs and α-Induction for Nets of Subfactors II. Preprint, hep-th/9801171, to appear in Commun. Math. Phys.
2. Buchholz, D., Mack, G., Todorov, I.: The Current Algebra on the Circle as a Germ of Local Field Theories. Nucl. Phys. B (Proc. Suppl.) 5B, 20–56 (1988)
3. Cappelli, A., Itzykson, C., Zuber, J.-B.: The A-D-E Classification of Minimal and $A_1^{(1)}$ Conformal Invariant Theories. Commun. Math. Phys. 113, 1–26 (1987)
4. Di Francesco, P.: Integrable Lattice Models, Graphs and Modular Invariant Conformal Field Theories. Int. J. Mod. Phys. A 7, 407–500 (1992)
5. Di Francesco, P., Zuber, J.-B.: SU(N) Lattice Integrable Models Associated with Graphs. Nucl. Phys. B 338, 602–646 (1990)
6. Di Francesco, P., Zuber, J.-B.: SU(N) Lattice Integrable Models and Modular Invariants. In: Randjbar, S. et al (eds.): Recent Developments in Conformal Field Theories. Singapore: World Scientific 1990, 179–215
7. Doplicher, S., Haag, R., Roberts, J.E.: Fields, Observables and Gauge Transformations I. Commun. Math. Phys. 13, 1–23 (1969)
8. Doplicher, S., Haag, R., Roberts, J.E.: Fields, Observables and Gauge Transformations II. Commun. Math. Phys. 15, 173–200 (1969)
9. Doplicher, S., Haag, R., Roberts, J.E.: Local Observables and Particle Statistics I. Commun. Math. Phys. 23, 199–230 (1971)
10. Doplicher, S., Haag, R., Roberts, J.E.: Local Observables and Particle Statistics II. Commun. Math. Phys. 35, 49–85 (1974)
11. Evans, D.E., Kawahigashi, Y.: Quantum Symmetries on Operator Algebras. Oxford University Press 1998
12. Fredenhagen, K., Rehren, K.-H., Schroer, B.: Superselection Sectors with Braid Group Statistics and Exchange Algebras I. Commun. Math. Phys. 125, 201–226 (1989)
13. Fredenhagen, K., Rehren, K.-H., Schroer, B.: Superselection Sectors with Braid Group Statistics and Exchange Algebras II. Rev. Math. Phys. Special Issue, 113–157 (1992)
14. Haag, R.: Local Quantum Physics. Berlin, Heidelberg, New York: Springer-Verlag 1992
15. Kosaki, H.: Extension of Jones Theory on Index to Arbitrary Factors. J. Funct. Anal. 66, 123–140 (1986)
16. Longo, R.: Index of Subfactors and Statistics of Quantum Fields II. Commun. Math. Phys. 130, 285–309 (1990)
17. Longo, R.: Minimal Index of Braided Subfactors. J. Funct. Anal. 109, 98–112 (1991)
18. Longo, R.: A Duality for Hopf Algebras and for Subfactors. I. Commun. Math. Phys. 159, 133–150 (1994)
19. Longo, R., Rehren, K.-H.: Nets of Subfactors. Rev. Math. Phys. 7, 567–597 (1995)
20. Petkova, V.B., Zuber, J.-B.: From CFT to Graphs. Nucl. Phys. B 463, 161–193 (1996)
21. Xu, F.: New braided endomorphisms from conformal inclusions. Commun. Math. Phys. 192, 349–403 (1998)

Communicated by H. Araki
Commun. Math. Phys. 197, 387 – 404 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
A KMS-like State of Hadamard Type on Robertson–Walker Spacetimes and its Time Evolution
Mathias Trucks
Institut für Theoretische Physik, Technische Universität Berlin, Hardenbergstraße 36, 10623 Berlin, Germany. E-mail: [email protected]
Received: 10 November 1997 / Accepted: 18 March 1998
Abstract: In this work we define a new state on the Weyl algebra of the free massive scalar Klein–Gordon field on a Robertson–Walker spacetime and prove that it is a Hadamard state. The state is supposed to approximate a thermal equilibrium state on a Robertson–Walker spacetime and we call it an adiabatic KMS state. This opens the possibility to do quantum statistical mechanics on Robertson–Walker spacetimes in the algebraic framework and the analysis of the free Bose gas on Robertson–Walker spacetimes. The state reduces to an adiabatic vacuum state if the temperature is zero and it reduces to the usual KMS state if the scaling factor in the metric of the Robertson–Walker spacetime is constant. In the second part of our work we discuss the time evolution of adiabatic KMS states. The time evolution is described by a family of propagators on the classical phase space. With the help of this family, we prove the existence of a family of propagators on the one-particle Hilbert space. We use these propagators to analyze the evolution of the two-point function of the KMS state. The inverse temperature change is proportional to the scale factor in the metric of the Robertson–Walker spacetime, as one expects for a relativistic Bose gas.
1. Introduction Beyond the standard model of cosmology the inflationary scenario involving phase transitions and symmetry breaking has been intensively discussed during the last years. For an investigation of such phenomena it is necessary to describe the thermal behavior of quantum fields and states. In this work we start a discussion of quantum statistical mechanics on Robertson–Walker spacetimes in the framework of algebraic quantum field theory. An analysis of the free Bose gas on Robertson–Walker spacetimes could serve as a model for more complicated quantum field theories. This model may show a phase transition, namely Bose–Einstein condensation.
The algebraic framework of quantum field theory started with the work of Haag and Kastler [7]; for an overview and basic results see the book of Haag [5]. Dimock [4] generalized the axioms to globally hyperbolic spacetimes. The basic object is a net of $C^*$-algebras arising from the assignment of a $C^*$-algebra $\mathcal{A}(\mathcal{O})$ to each open, relatively compact subset $\mathcal{O}$ of a manifold $\mathcal{M}$. The algebra $\mathcal{A}(\mathcal{O})$ is the algebra of local observables, i.e. the observables that can be measured in the region $\mathcal{O}$. The quasilocal algebra $\mathcal{A}$ is defined as the inductive limit of the $\mathcal{A}(\mathcal{O})$, i.e. $\mathcal{A} = \overline{\bigcup_{\mathcal{O} \subset \mathcal{M}} \mathcal{A}(\mathcal{O})}$, where the bar denotes the norm closure and the union runs through all relatively compact open sets. A state is a positive normalized linear functional on $\mathcal{A}$. One of the major problems in algebraic quantum field theory on curved spacetimes is to pick out physically relevant states among all positive normalized linear functionals. In quantum field theory on Minkowski spacetime the Poincaré group as the symmetry group and the spectrum condition determine the Minkowski vacuum. There is no symmetry group on a generic curved spacetime and therefore no way to distinguish a vacuum-like state. In quantum field theory on curved spacetimes the class of Hadamard states is believed to be a class of physically relevant states. We mention two reasons supporting this opinion:
1. The class of Hadamard states allows the renormalization of the energy-momentum tensor $T_{\mu\nu}$. Quantum field theory on curved spacetimes is a semi-classical theory, where matter fields are quantized but not the metric, so one has to deal with the semiclassical Einstein equation $G_{\mu\nu} = 8\pi \langle T_{\mu\nu} \rangle_\omega$. The energy-momentum tensor contains products of fields and their derivatives at one point. The expectation value of $T_{\mu\nu}$ in a state $\omega$ on the right-hand side requires a renormalization procedure. For Hadamard states the renormalization by a point-splitting procedure is possible (see the book of Wald [24, Chap. 4.6] and references therein).
2. The “principle of local definiteness”, formulated by Haag et al. [8], contains requirements on physically relevant states, namely local quasi-equivalence, local primarity and local definiteness. It was shown by Verch [22] that all Hadamard states on a globally hyperbolic spacetime are locally quasi-equivalent. For ultrastatic spacetimes he showed that the local von Neumann algebras arising from Hadamard states are factors (of type III$_1$), i.e. they are locally primary, and he also showed local definiteness. Recently he strengthened these results to arbitrary globally hyperbolic spacetimes, see Verch [23].
For these reasons it is reasonable to consider Hadamard states as good candidates for physically relevant states. There are not many explicitly known Hadamard states although Junker [10, Chap. 3.7] has given an explicit construction of Hadamard states. As explicitly known Hadamard states we mention the ground state on ultrastatic spacetimes, KMS states on ultrastatic spacetimes with compact spacelike Cauchy surfaces, the adiabatic vacua on Robertson–Walker spacetimes and, although perhaps this has never been proven explicitly, the Hartle-Hawking state on extended Schwarzschild spacetime. The class of adiabatic vacuum states is a class of Hadamard states, defined on the Weyl algebra of the free massive Klein–Gordon field on Robertson–Walker spacetimes, which approximates a vacuum state. On the other hand on ultrastatic spacetimes, e.g. the Einstein static universe, it is possible to define KMS states, i.e. thermal equilibrium
states, in the usual way. Here, we combine these definitions. The resulting state, which approximates a thermal equilibrium state on a Robertson–Walker spacetime, is called an adiabatic KMS state. We will prove that this state is a Hadamard state. We generalize our definition by introducing a chemical potential µ. The state is a Hadamard state if µ < m, where m is the mass parameter in the Klein–Gordon equation. We remark that Bose–Einstein condensation sets in, if the value of the chemical potential reaches the value of the mass parameter [1, 9]. We also show that an adiabatic KMS state satisfies the KMS condition with respect to an automorphism group. This automorphism group does not generate the time translations of the system. This possibility was already mentioned in the fundamental work on KMS states by Haag et al. [6]. The general idea of adiabatic KMS states and vacua is to maintain as many properties as possible of KMS states and ground states in the ultrastatic case. But it is known that a “naive” generalization does not lead to Hadamard states, Junker [10, Chap. 3.6]. The “positive frequencies” cannot be fixed on a Cauchy surface, but must be determined dynamically off the Cauchy surface. We think of an adiabatic KMS state as a state which approximates a thermal equilibrium state in the sense that switching off the expansion of the Robertson–Walker spacetime would lead to a thermal equilibrium state of inverse temperature β for t → ∞. This can be seen with the methods of the second part of the work. In the second part of the work we analyze the evolution of an adiabatic KMS state. We introduce new coordinates, so that the Klein–Gordon equation can be written as a first order system having only off-diagonal entries. This defines a semigroup for fixed t. We prove the existence of a family of propagators on the classical phase space with the help of these semigroups. 
By a natural generalization of the notion of a one-particle Hilbert space structure we are able to prove the existence of a family of propagators on the one-particle Hilbert space. It is given by a unitary propagator. This unitary operator is not identical to the unitary operator coming from the group of automorphisms with respect to which an adiabatic KMS state satisfies the KMS condition. It can be seen that an adiabatic vacuum state is invariant under this time evolution in a certain sense. For an adiabatic KMS state, the inverse temperature is proportional to the scale factor R in the Robertson–Walker metric, as one expects for a relativistic Bose gas. Similarly, the inverse temperature change could be shown to be proportional to R2 for a non-relativistic Bose gas. The work is organized as follows. In the next section the notion of adiabatic vacua is reviewed. After some preliminary remarks on one-particle Hilbert space structures and the definition of KMS states we give the definition of an adiabatic KMS state in Sect. 3. In Sect. 4 we give a precise definition of Hadamard states and prove that an adiabatic KMS state is of this kind. A section on the KMS condition follows. In the following section we prove the existence of a family of propagators on the classical phase space. The time evolution on the one-particle Hilbert space is described in Sect. 7. In the last section we compute the evolution of an adiabatic KMS state. The necessary results on pseudodifferential operators and wave-front sets are summarized in the appendix.
2. Adiabatic Vacuum States In this section we briefly summarize the definition of adiabatic vacua. For a more detailed discussion see [10, 15, 21]. Readers only interested in the definition of an adiabatic KMS
state can skip over this section. Adiabatic vacua were originally introduced by Parker [17]. We consider the Klein–Gordon equation
\[ (\Box_g - m^2)\varphi = 0 \]
on a Lorentz manifold $(\mathcal{M}, g)$ topologically of the form $\mathcal{M} = I \times S_\varepsilon$, $I \subset \mathbb{R}$, where $\varepsilon = 1, 0, -1$ corresponds to the spherical, the flat and the hyperbolic case, respectively, and $g$ is a Robertson–Walker metric
\[ g = -dt^2 + R(t)^2 [d\theta_1^2 + \Sigma_\varepsilon^2 (d\theta_2^2 + \sin^2\theta_2\, d\phi^2)] = -dt^2 + h_{ij}(S_\varepsilon)\, dx^i dx^j, \quad i, j = 1, 2, 3, \tag{1} \]
with $\Sigma_1 = \sin\theta_1$, $\Sigma_0 = \theta_1$, $\Sigma_{-1} = \sinh\theta_1$ and $R(t) > 0$. We construct the Weyl algebra CCR$(\mathcal{D}, \sigma)$ associated with the space of classical solutions of the Klein–Gordon operator on Robertson–Walker spacetimes, where $\mathcal{D}$ is a real vector space and $\sigma$ a symplectic form on $\mathcal{D}$ (see e.g. Baumgärtel and Wollenberg [3, Chap. 8.2]). As the real symplectic vector space we consider the space of real Cauchy data on a Cauchy surface, $\mathcal{D} := C^\infty_{0,\mathbb{R}}(S_\varepsilon) \oplus C^\infty_{0,\mathbb{R}}(S_\varepsilon)$. Let $E$ be the uniquely existing causal propagator of the Klein–Gordon operator (see Dimock [4]). For a function $f \in C^\infty_{0,\mathbb{R}}(\mathcal{M})$ the mapping to $\mathcal{D}$ is given by $\rho_0 Ef \oplus \rho_1 Ef$, where $\rho_0 : C^\infty_{0,\mathbb{R}}(\mathcal{M}) \to C^\infty_{0,\mathbb{R}}(S_\varepsilon)$ is the restriction operator to the Cauchy surface and $\rho_1 : C^\infty_{0,\mathbb{R}}(\mathcal{M}) \to C^\infty_{0,\mathbb{R}}(S_\varepsilon)$ is the forward normal derivative on the Cauchy surface. The Weyl algebra CCR$(\mathcal{D}, \sigma)$ is the algebra generated by the elements $W(F) \neq 0$, $F \in \mathcal{D}$, obeying the Weyl form of the canonical commutation relations
\[ W(F) W(G) = e^{-i\sigma(F,G)/2}\, W(F + G), \quad F, G \in \mathcal{D}, \]
and with the property $W(F)^* = W(-F)$, see e.g. Baumgärtel and Wollenberg [3, 8.2]. We define the symplectic form by
\[ \sigma(F, G) = \int_{S_\varepsilon} (f_1 g_2 - f_2 g_1)\, d\mu(S_\varepsilon), \quad F = f_1 \oplus f_2, \quad G = g_1 \oplus g_2, \]
where $d\mu(S_\varepsilon) = \sqrt{\det h_{ij}(S_\varepsilon)}\, d^3x$ is the invariant measure on the spaces $S_\varepsilon$. The algebras of local observables are
\[ \mathcal{A}(\mathcal{O}) = C^*(W(F),\ F \in \mathcal{D},\ \mathrm{supp}(F) \subset \mathcal{O} \subset \mathcal{M}), \]
where we mean the $C^*$-algebra generated by these elements. The Klein–Gordon operator on $(\mathcal{M}, g)$ has the form
\[ \Box_g - m^2 = -\partial_t^2 - 3H(t)\partial_t + R^{-2}(t)\Delta_\varepsilon - m^2, \]
where $\Delta_\varepsilon$ is the Laplace operator on the respective spatial parts, $H(t) = \dot R(t)/R(t)$ and $\partial_t \equiv \partial/\partial t$. The eigenvectors and eigenvalues of the Laplace operator are explicitly known in these three cases. This fact allows us to separate the time-dependent part of the equation. Denoting the eigenvectors by $Y_k(x)$ and the eigenvalues of the Laplace operator by $-E(k)$, i.e. $\Delta Y_k(x) = -E(k) Y_k(x)$, we can express the elements of the set of Cauchy data $\mathcal{D}$ on a Cauchy surface with the uniform notation
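\[ F = \begin{pmatrix} f_1 \\ f_2 \end{pmatrix} = \int dk \begin{pmatrix} c(k) \\ \hat c(k) \end{pmatrix} Y_k(x), \quad G = \begin{pmatrix} g_1 \\ g_2 \end{pmatrix} = \int dk \begin{pmatrix} b(k) \\ \hat b(k) \end{pmatrix} Y_k(x), \quad F, G \in \mathcal{D}, \]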
where the integral reduces to a sum in the spherical case (see Junker [10, 3.5] for details). The main result concerning the structure of Fock states is summarized in the following
Theorem 1. The Fock states for the Klein–Gordon field on a Robertson–Walker spacetime are given by a two-point function of the form
\[ \langle F | G \rangle_S = \int dk\, [\overline{c(k)}\, b(k)\, S_{00}(k) + \overline{c(k)}\, \hat b(k)\, S_{01}(k) + b(k)\, \overline{\hat c(k)}\, S_{10}(k) + \overline{\hat c(k)}\, \hat b(k)\, S_{11}(k)]. \tag{2} \]
The entries of the matrix $S$ can be expressed in the form
\[ S_{00}(k) = |p(k)|^2, \quad S_{11}(k) = R^6 |q(k)|^2, \quad S_{01}(k) = -R^3 q(k) \overline{p(k)}, \quad S_{10} = \overline{S_{01}}, \tag{3} \]
where $p$ and $q$ are polynomially bounded measurable functions satisfying
\[ \overline{q(k)}\, p(k) - \overline{p(k)}\, q(k) = i. \tag{4} \]
Conversely every pair of polynomially bounded measurable functions satisfying Eq. (4) yields via (3) and (2) the two-point function of a Fock state. For the proof see Lüders and Roberts [15, Thm. 2.3]. We consider the time-dependence of the Klein–Gordon equation
\[ \left[ \frac{d^2}{dt^2} + 3H(t) \frac{d}{dt} + m^2 + R^{-2}(t) E(k) \right] T_k(t) = 0, \quad \forall k. \tag{5} \]
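This equation can be solved explicitly only in exceptional cases. In the general case, one tries to solve it by an iteration procedure. For finding the iteration, we consider
\[ T_k(t) = [2R^3(t)\Omega_k(t)]^{-1/2} \exp\left( i \int_{t_0}^{t} \Omega_k(t')\, dt' \right), \quad \forall k, \]
where the functions $\Omega_k$ have to be determined. Inserting this ansatz in Eq. (5) we find that the functions $\Omega_k$ have to satisfy
\[ \Omega_k^2 = \omega_k^2 - \frac{3}{4} \left( \frac{\dot R}{R} \right)^2 - \frac{3}{2} \frac{\ddot R}{R} + \frac{3}{4} \left( \frac{\dot\Omega_k}{\Omega_k} \right)^2 - \frac{1}{2} \frac{\ddot\Omega_k}{\Omega_k}, \tag{6} \]
where $\omega_k^2 = E(k)/R^2 + m^2$. With
\[ (\Omega_k^{(0)})^2 := \omega_k^2 = E(k)/R^2 + m^2 \]
the iteration is given by
\[ (\Omega_k^{(n+1)})^2 = \omega_k^2 - \frac{3}{4} \left( \frac{\dot R}{R} \right)^2 - \frac{3}{2} \frac{\ddot R}{R} + \frac{3}{4} \left( \frac{\dot\Omega_k^{(n)}}{\Omega_k^{(n)}} \right)^2 - \frac{1}{2} \frac{\ddot\Omega_k^{(n)}}{\Omega_k^{(n)}}. \]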
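The iteration can be illustrated numerically for a single mode. The following is only a minimal sketch under stated assumptions: the helper names (`omega0`, `d1`, `d2`, `next_order_squared`), the central finite differences used for the time derivatives, and the sample scale factor are hypothetical choices made for illustration and are not part of the construction above.

```python
import math

def omega0(t, R, E_k, m):
    """Zeroth-order frequency omega_k(t) = sqrt(E(k)/R(t)^2 + m^2)."""
    return math.sqrt(E_k / R(t) ** 2 + m ** 2)

def d1(f, t, h=1e-4):
    """Central finite-difference approximation of the first derivative (assumption of this sketch)."""
    return (f(t + h) - f(t - h)) / (2 * h)

def d2(f, t, h=1e-4):
    """Central finite-difference approximation of the second derivative (assumption of this sketch)."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / h ** 2

def next_order_squared(Omega, t, R, E_k, m):
    """One iteration step: (Omega^{(n+1)}_k(t))^2 computed from the function Omega = Omega^{(n)}_k."""
    w2 = E_k / R(t) ** 2 + m ** 2
    Rd, Rdd = d1(R, t), d2(R, t)
    O, Od, Odd = Omega(t), d1(Omega, t), d2(Omega, t)
    return (w2
            - 0.75 * (Rd / R(t)) ** 2
            - 1.5 * Rdd / R(t)
            + 0.75 * (Od / O) ** 2
            - 0.5 * Odd / O)

# Hypothetical example: slowly expanding scale factor, mode with E(k) = 4 and m = 1.
R_slow = lambda t: 1.0 + 0.01 * t
Omega_0 = lambda t: omega0(t, R_slow, 4.0, 1.0)
Omega_1_squared = next_order_squared(Omega_0, 0.0, R_slow, 4.0, 1.0)
```

For a constant scale factor all derivative terms vanish and the step returns $\omega_k^2$ unchanged, which is the ultrastatic case; for a slowly varying $R(t)$ the first-order correction is small compared with $\omega_k^2$.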
The functions $T_k(t)$ and $\dot T_k(t)$ are related to the functions $q(k)$ and $p(k)$, which constitute the matrix $S$ by Eq. (3). On a Cauchy surface at time $t$ these relations are
\[ T_k(t) = q(k), \quad \dot T_k(t) = R^{-3}(t)\, p(k). \tag{7} \]
An adiabatic vacuum state will now be defined by initial values at time $t$:
Definition 1. For $t_0, t \in \mathbb{R}$, let
\[ W_k^{(n)}(t) := [2R^3(t)\Omega_k^{(n)}(t)]^{-1/2} \exp\left( i \int_{t_0}^{t} \Omega_k^{(n)}(t')\, dt' \right). \]
An adiabatic vacuum state of order $n$ is a Fock state, obtained via Eqs. (3) and (7), where the initial values at time $t$ for Eq. (5) can be expressed by
\[ T_k(t) = W_k^{(n)}(t), \quad \dot T_k(t) = \dot W_k^{(n)}(t). \]
For later purposes we notice that for an adiabatic vacuum state of zeroth order we have
\[ S_{11}(k) = R^6 |q(k)|^2 = R^6 |T_k(t)|^2 = \frac{R^3}{2\omega_k}. \tag{8} \]
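As a plausibility check, one can verify for sample numbers that initial data of the form given in Definition 1 satisfy the normalization condition of Eq. (4) and reproduce Eq. (8). The sketch below is illustrative only: the function names and the sample values are hypothetical, Eq. (4) is read as $\overline{q}p - \overline{p}q = i$, and the time derivative of $W_k^{(n)}$ is obtained by differentiating the ansatz by hand.

```python
import cmath
import math

def adiabatic_mode(R, H, Omega, Omega_dot, phase):
    """Return (W, dW/dt) for W = (2 R^3 Omega)^(-1/2) exp(i*phase), where d(phase)/dt = Omega.

    H = (dR/dt)/R is the Hubble function; differentiating the prefactor gives
    the real terms -(3/2) H - (1/2) Omega_dot / Omega.
    """
    W = cmath.exp(1j * phase) / math.sqrt(2.0 * R ** 3 * Omega)
    W_dot = (1j * Omega - 1.5 * H - 0.5 * Omega_dot / Omega) * W
    return W, W_dot

def wronskian_and_S11(R, H, Omega, Omega_dot, phase):
    """Left-hand side of Eq. (4) and the entry S_11 of Eq. (3), using the relations of Eq. (7)."""
    W, W_dot = adiabatic_mode(R, H, Omega, Omega_dot, phase)
    q, p = W, R ** 3 * W_dot
    wronskian = q.conjugate() * p - p.conjugate() * q
    S11 = R ** 6 * abs(q) ** 2
    return wronskian, S11

# Hypothetical sample values at a fixed time t.
wr, S11 = wronskian_and_S11(R=1.7, H=0.3, Omega=2.5, Omega_dot=-0.4, phase=0.9)
```

The real terms in the derivative drop out of the combination in Eq. (4), so the condition holds for every order $n$; with $\Omega_k^{(0)} = \omega_k$ the value of `S11` is $R^3/(2\omega_k)$.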
3. Adiabatic KMS States
We review the definitions of ground and KMS states on ultrastatic spacetimes by their one-particle Hilbert space structures (see e.g. Kay [11, 13]). We describe an adiabatic vacuum state (of zeroth order) in a similar way. A short computation shows the coincidence of this definition with the one given by Lüders and Roberts [15]. Adiabatic KMS states are defined by imitating the connection of KMS states on ultrastatic spacetimes with the ground state on the same spacetime.
3.1. One-particle Hilbert space structures. Let $\omega_S$ be a state on the Weyl algebra CCR$(\mathcal{D}, \sigma)$. A one-particle Hilbert space structure is a real-linear map $K : \mathcal{D} \to \mathcal{H}$, $\mathcal{H}$ a Hilbert space, satisfying
1. $K\mathcal{D} + iK\mathcal{D}$ is dense in $\mathcal{H}$,
2. $[S(F, G) + i\sigma(F, G)]/2 = \langle KF | KG \rangle$, $F, G \in \mathcal{D}$,
where $S(\cdot, \cdot)$ is a real scalar product on $\mathcal{D}$ and $\omega_S(W(F)) = \exp(-S(F, F)/4)$ is the generating functional of the state $\omega_S$. Usually, the map $K$ is required to intertwine the time evolutions on the phase space $\mathcal{D}$ and the Hilbert space $\mathcal{H}$. We come back to this point in Sect. 7.
3.1.1. One-particle structure $K$ for a ground state on ultrastatic spacetimes. For a ground state on an ultrastatic spacetime, the one-particle Hilbert space structure is given by
\[ K : \mathcal{D} \to \mathcal{H} = L^2_{\mathbb{C}}(\mathcal{S}, d\mu), \quad f_1 \oplus f_2 \mapsto 2^{-1/2} (A^{1/4} f_1 + i A^{-1/4} f_2), \quad A := m^2 - \Delta, \]
where $\Delta$ is the Laplacian on the Cauchy surface $\mathcal{S}$.
3.1.2. One-particle structure $K^\beta$ for a KMS state on ultrastatic spacetimes. For a KMS state of inverse temperature $\beta$, the one-particle Hilbert space structure is defined by doubling the Hilbert space:
\[ K^\beta : \mathcal{D} \to \mathcal{H} \oplus \mathcal{H}, \quad F \mapsto (\cosh Z^\beta) KF \oplus C (\sinh Z^\beta) KF, \]
where $Z^\beta$ is implicitly defined by $\tanh Z^\beta = \exp(-\beta A^{1/2}/2)$, i.e.
\[ \cosh^2 Z^\beta = [1 - \exp(-\beta A^{1/2})]^{-1}, \quad \sinh^2 Z^\beta = \frac{\exp(-\beta A^{1/2})}{1 - \exp(-\beta A^{1/2})}, \]
$C$ is a conjugation and $K$ is the map defined in Subsubsect. 3.1.1.
3.1.3. One-particle structure $K_t^a$ for an adiabatic vacuum state. An adiabatic vacuum state (of zeroth order) can also be described by a one-particle Hilbert space structure, as we will show below. The mapping which defines an adiabatic vacuum state is given by
\[ K_t^a : \mathcal{D} \to \mathcal{H} = L^2_{\mathbb{C}}(S_\varepsilon(t), d\mu), \quad f_1 \oplus f_2 \mapsto B_1(t) f_1 + i B_2(t) f_2, \]
\[ B_1(t) = 2^{-1/2} A^{1/4}(t) \{ 1 + iH(t) [A^{-1/2}(t) + m^2 A^{-3/2}(t)/2] \}, \quad B_2(t) = 2^{-1/2} A^{-1/4}(t), \]
where $A(t) = m^2 - \Delta_\varepsilon / R^2(t)$ ($\varepsilon = 1, 0, -1$ refers to the closed, flat and hyperbolic spatial part resp.). This one-particle Hilbert space structure leads to a two-point function
\[ \langle K_t^a(f_1 \oplus f_2) | K_t^a(g_1 \oplus g_2) \rangle = \langle B_1 f_1 + i B_2 f_2 | B_1 g_1 + i B_2 g_2 \rangle = \left\langle \begin{pmatrix} f_1 \\ f_2 \end{pmatrix} \middle| \begin{pmatrix} B_1^* B_1 & i B_1^* B_2 \\ -i B_2^* B_1 & B_2^* B_2 \end{pmatrix} \begin{pmatrix} g_1 \\ g_2 \end{pmatrix} \right\rangle. \]
We show this expression to be equivalent to the two-point function of an adiabatic vacuum state of zeroth order as defined by Lüders and Roberts [15]. For example for the fourth entry we have
\begin{align*}
2 \langle f_2 | B_2^* B_2 g_2 \rangle &= \langle f_2 | A^{-1/2} g_2 \rangle \\
&= \left\langle \int dk\, \hat c(k) Y_k(x) \,\middle|\, A^{-1/2} \int dk'\, \hat b(k') Y_{k'}(x) \right\rangle \\
&= \int dk\, \overline{\hat c(k)} \int dk'\, \hat b(k')\, \omega_{k'}^{-1} \langle Y_k(x) | Y_{k'}(x) \rangle \\
&= \int dk\, \overline{\hat c(k)} \int dk'\, \hat b(k')\, \omega_{k'}^{-1} R^3 \delta(k - k') \\
&= R^3 \int dk\, \overline{\hat c(k)}\, \omega_k^{-1}\, \hat b(k).
\end{align*}
Comparing this with Eq. (2) gives $S_{11}(k) = R^3/(2\omega_k)$, which is the desired result of Eq. (8).
3.2. Definition of an adiabatic KMS state. If we look at the definition of a KMS state on an ultrastatic spacetime, we find that it is connected with the ground state on this spacetime. In a similar way we connect an adiabatic KMS state with an adiabatic vacuum state:
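Definition 2. We define an adiabatic KMS state by a one-particle Hilbert space structure $K_t^{a\beta}$ given by
\[ K_t^{a\beta} : \mathcal{D} \to \mathcal{H} \oplus \mathcal{H}, \quad F \mapsto (\cosh Z^\beta) K_t^a F \oplus C (\sinh Z^\beta) K_t^a F, \]
where $K_t^a$ is the one-particle structure of an adiabatic vacuum state, $C$ is a conjugation and $\tanh Z^\beta = \exp(-\beta A^{1/2}(t)/2)$, so that $\cosh^2 Z^\beta$ and $\sinh^2 Z^\beta$ are given by the expressions of Subsubsect. 3.1.2 with $A$ replaced by $A(t)$.
This definition leads to the following two-point function of an adiabatic KMS state:
\begin{align*}
\langle K_t^{a\beta}(f_1 \oplus f_2) | K_t^{a\beta}(g_1 \oplus g_2) \rangle &= \langle B_1 f_1 + i B_2 f_2 | \cosh^2 Z^\beta\, (B_1 g_1 + i B_2 g_2) \rangle \\
&\quad + \langle B_1^* f_1 - i B_2^* f_2 | \sinh^2 Z^\beta\, (B_1^* g_1 - i B_2^* g_2) \rangle.
\end{align*}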
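On a single mode, where $A^{1/2}$ acts as multiplication by a frequency $\omega > 0$, the weights $\cosh^2 Z^\beta$ and $\sinh^2 Z^\beta$ appearing in this two-point function reduce to ordinary numbers. The following minimal sketch evaluates the explicit expressions of Subsubsect. 3.1.2 for one mode; the function name `thermal_factors` and the sample values are hypothetical and serve only as an illustration.

```python
import math

def thermal_factors(beta, omega):
    """Single-mode values of cosh^2 Z and sinh^2 Z.

    Uses cosh^2 Z = 1 / (1 - exp(-beta*omega)) and
    sinh^2 Z = exp(-beta*omega) / (1 - exp(-beta*omega)), beta > 0, omega > 0.
    """
    x = math.exp(-beta * omega)
    cosh2 = 1.0 / (1.0 - x)
    sinh2 = x / (1.0 - x)
    return cosh2, sinh2

# Hypothetical sample mode.
cosh2, sinh2 = thermal_factors(beta=2.0, omega=1.5)
bose_einstein = 1.0 / (math.exp(2.0 * 1.5) - 1.0)
```

Here `sinh2` coincides with the Bose–Einstein occupation number $1/(e^{\beta\omega} - 1)$, and `cosh2 - sinh2` equals 1, which is the single-mode form of the relation $C^*C - S^*S = 1$ used in Sect. 4. As $\beta \to \infty$ the factor `sinh2` tends to zero, so only the vacuum term survives.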
In the following, we also use the (“four-smeared”) two-point distribution. For the connection of the (“symplectically smeared”) two-point function and the two-point distribution see e.g. [10, Chap. 3.1 and 3.3]. The two-point distribution $\Lambda$ has the form (we suppress the $t$-dependence of $A(t)$):
\begin{align*}
\Lambda(f, g) &= \frac{1}{2} \left\langle \left[ A^{1/2} + iH \left( 1 + \frac{m^2}{2} A^{-1} \right) + i\partial_t \right] Ef \,\middle|\, A^{-1/2} \cosh^2 Z^\beta \left[ A^{1/2} + iH \left( 1 + \frac{m^2}{2} A^{-1} \right) + i\partial_t \right] Eg \right\rangle \\
&\quad + \frac{1}{2} \left\langle \left[ A^{1/2} - iH \left( 1 + \frac{m^2}{2} A^{-1} \right) - i\partial_t \right] Ef \,\middle|\, A^{-1/2} \sinh^2 Z^\beta \left[ A^{1/2} - iH \left( 1 + \frac{m^2}{2} A^{-1} \right) - i\partial_t \right] Eg \right\rangle, \tag{9}
\end{align*}
for $f, g \in C^\infty_{0,\mathbb{R}}(\mathcal{M})$, where $E$ is the causal propagator (see Sect. 2). In Sect. 4 we will prove that this two-point distribution is the two-point distribution of a Hadamard state. We remark that on closed Robertson–Walker spacetimes adiabatic KMS states can be defined by $\omega(\cdot) = Z^{-1}\, \mathrm{tr}\{\exp[-\beta H(\mu)]\, \cdot\}$, where $H(\mu)$ is the second quantization of $A^{1/2}(t) - \mu$ and $Z = \mathrm{tr} \exp[-\beta H(\mu)]$. $A^{1/2}(t)$ has purely discrete eigenvalues of finite multiplicity going to infinity and therefore $\exp[-\beta H(\mu)]$ is of trace class for $\mu < m$ (see [2, Prop. 5.2.27]). For fixed $t$, this is the usual expression of a grand canonical ensemble. This justifies the term “free Bose gas”.
4. Hadamard Property of an Adiabatic KMS State
In the next subsection we give a precise definition of a Hadamard state and prove in the following subsection that an adiabatic KMS state is a Hadamard state. The necessary results on pseudodifferential operators and wave-front sets are summarized in the appendix.
4.1. Hadamard states. Since the work of Radzikowski [18] it is known that Hadamard states can be characterized by the wave-front set of their two-point distribution. Earlier
definitions required a specific form of the two-point distribution (see Kay and Wald [14]). The characterization of a Hadamard state by its wave-front set is easier to handle and offers new possibilities to prove the Hadamard property of a state, but it requires some knowledge of pseudodifferential operators (PDO’s) and wave-front sets of distributions. We refer to the appendix for notation and some results used below.
Definition 3. A quasifree state of a Klein–Gordon quantum field on a globally hyperbolic spacetime is a Hadamard state iff the wave-front set of its two-point distribution $\Lambda$ is of the form:
\[ WF(\Lambda) = \{ (x_1, \xi_1; x_2, -\xi_2) \in T^*(\mathcal{M} \times \mathcal{M}) \setminus \{0\} \mid (x_1, \xi_1) \sim (x_2, \xi_2),\ \xi_1^0 \ge 0 \}, \tag{10} \]
where the notation $(x_1, \xi_1) \sim (x_2, \xi_2)$ means that $x_1$ and $x_2$ can be joined by a null geodesic $\gamma$ and $\xi_1$ is tangent to $\gamma$ in $x_1$ and $\xi_2$ is the parallel transport of $\xi_1$ along $\gamma$ in $x_2$.
The proof that a state is a Hadamard state requires only the analysis of the wave-front set of its two-point distribution. We will do this in the next subsection for an adiabatic KMS state.
4.2. An adiabatic KMS state is a Hadamard state. In the proof that an adiabatic KMS state is a Hadamard state we use the following theorems due to Junker [10, Thm. 3.11 and 3.12].
Theorem 2. Let $(\mathcal{M}, g)$ be a globally hyperbolic spacetime with Cauchy surface $\mathcal{S}$ and $(\mathcal{D}, \sigma)$ be the phase space of initial data on $\mathcal{S}$ of the Klein–Gordon field. Let $B, I, S, C$ be operators on $L^2_{\mathbb{R}}(\mathcal{S}, d\mu)$, such that $I$ is symmetric, $B$ is selfadjoint, positive and invertible and $C^*C - S^*S = 1$. Then, with $\mathcal{H} = L^2_{\mathbb{C}}(\mathcal{S}, d\mu)$,
\[ K : \mathcal{D} \to \tilde{\mathcal{H}} = \mathcal{H} \oplus \mathcal{H}, \quad (f_1, f_2) \mapsto C (2B)^{-1/2} [(B + iI) f_1 + i f_2] \oplus S (2B)^{-1/2} [(B - iI) f_1 - i f_2], \]
is a one-particle Hilbert space structure.
For a proof see Junker [10, Thm. 3.11] (where different conventions are used). Under the assumption that the metric has the form of Eq. (1), the two-point distribution $\Lambda$ resulting from this one-particle structure is given by
\begin{align*}
\Lambda(f, g) &= \frac{1}{2} \left\langle (B + iI + i\partial_t) Ef \,\middle|\, B^{-1/2} C^* C B^{-1/2} (B + iI + i\partial_t) Eg \right\rangle \\
&\quad + \frac{1}{2} \left\langle (B - iI - i\partial_t) Ef \,\middle|\, B^{-1/2} S^* S B^{-1/2} (B - iI - i\partial_t) Eg \right\rangle, \tag{11}
\end{align*}
where $f, g \in C^\infty_{0,\mathbb{R}}(\mathcal{M})$ and $E$ is the causal propagator (see Sect. 2). Now let $(\mathcal{M}, g)$ be a globally hyperbolic spacetime, foliated in a neighborhood of $\mathcal{S}$ into $(-T, T) \times \mathcal{S}$ with $\mathcal{S}_t := \{t\} \times \mathcal{S}$ and $\mathcal{S}_0 = \mathcal{S}$ and $g$ of the form given in Eq. (1).
Theorem 3. Let $B(t), I(t), S(t), C(t)$ be PDO’s on $\mathcal{S}_t$, $t \in (-T, T)$, satisfying the properties stated in Theorem 2, such that $B$ is elliptic, $S \in OPS^{-\infty}$, and such that there exists a PDO $Q$ on $\mathcal{M}$ with the property
\[ Q (B + iI + i\partial_t) = \Box_g - m^2 \]
which possesses a principal symbol $q$ with
\[ q^{-1}(0) \setminus \{0\} \subset \{ (x, \xi) \in T^*(\mathcal{M}) \mid \xi^0 \ge 0 \}. \]
Then the quasifree state given by the one-particle Hilbert space structure of Theorem 2 is a Hadamard state, i.e. the wave-front set of the corresponding two-point distribution $\Lambda$ has the form of Eq. (10).
For a proof see Junker [10, Thm. 3.12]. This theorem will be used in the proof of the following theorem. The proof for closed Robertson–Walker spacetimes is in fact a generalization of the proof by Junker [10, Chap. 3.4] that a KMS state on an ultrastatic spacetime with compact Cauchy surface is a Hadamard state.
Theorem 4. An adiabatic KMS state on the Weyl algebra of the free massive Klein–Gordon field on Robertson–Walker spacetimes, as defined in Definition 2, is a Hadamard state.
Proof. We have to show that the wave-front set of the two-point distribution (9) has the form of Eq. (10). For $F = f_1 \oplus f_2 \in \mathcal{D}$, we have
\[ K_t^a F = (2A^{1/2})^{-1/2} \{ [A^{1/2} + iH(1 + m^2 A^{-1}/2)] f_1 + i f_2 \}. \]
We identify the operator $B$ in Theorem 2 resp. Theorem 3 with $A^{1/2} = (m^2 - \Delta_\varepsilon/R^2)^{1/2}$, which is an elliptic, selfadjoint, positive PDO (of order 1). Furthermore the operator $I$ is identified with $H(1 + m^2 A^{-1}/2)$, which is a symmetric PDO. The operators $S$ resp. $C$ are identified with $\sinh Z^\beta$ resp. $\cosh Z^\beta$, so that $C^*C - S^*S = 1$. For closed Robertson–Walker spacetimes we proceed as follows: Since $S = \exp(-\beta A^{1/2}/2) (1 - \exp(-\beta A^{1/2}))^{-1/2}$ and $A^{1/2}$ has the properties of Theorem 8 in the appendix with the real-valued principal symbol given by $a(x, \xi) = (h^{ij} \xi_i \xi_j)^{1/2}$, we can apply this theorem to conclude that $S$ is a PDO with principal symbol
\[ p(a(x, \xi)) = \frac{\exp(-\beta a(x, \xi)/2)}{(1 - \exp(-\beta a(x, \xi)))^{1/2}}. \]
This principal symbol falls off faster than any inverse power of $\xi$, so that $S \in OPS^{-\infty}$ and this also means that $C \in OPS^0$. For flat Robertson–Walker spacetimes we show directly that the involved operators are PDO’s:
\[ (\cosh Z^\beta f)(t, x) = (1 - \exp(-\beta A^{1/2}))^{-1/2} f(t, x) = (2\pi)^{-3/2} \int (1 - \exp(-\beta \omega_k))^{-1/2} \tilde f(t, k) Y_k(x)\, dk, \]
which is a PDO of order zero, because
\[ a(k) = \left( 1 - \exp\left( -\beta \sqrt{k^2/R^2 + m^2} \right) \right)^{-1/2} \]
is a symbol of order zero. Furthermore,
\[ (\sinh Z^\beta f)(t, x) = (2\pi)^{-3/2} \int \frac{\exp(-\beta \omega_k/2)}{(1 - \exp(-\beta \omega_k))^{1/2}} \tilde f(t, k) Y_k(x)\, dk \]
is a PDO of order $-\infty$, because
\[ a(k) = \frac{\exp\left( -\beta \sqrt{k^2/R^2 + m^2}/2 \right)}{\left( 1 - \exp\left( -\beta \sqrt{k^2/R^2 + m^2} \right) \right)^{1/2}} \]
and the derivatives of $a(k)$ tend to zero faster than any inverse power of $k$. For hyperbolic Robertson–Walker spacetimes one has to express the eigenfunctions of the Laplace operator in terms of $\exp(ix \cdot \xi)$ in the same way as it was done by Junker [10, Proof of Lemma 3.26], to see that the operators are PDO’s of the desired type. The operator $Q$ is given by
\[ Q = 3iH/2 - A^{-1/2} \partial_t A^{1/2} - A^{1/2} + i\partial_t. \]
This can be verified with the help of Eq. (6).
4.3. Introduction of a chemical potential. We generalize the definition to the case of a non-vanishing chemical potential $\mu$. This can be done by changing the definition of $\tanh Z^\beta$: Let
\[ \tanh Z^\beta = \exp(-\beta h(\mu)/2), \quad h(\mu) = A^{1/2}(t) - \mu. \tag{12} \]
The operator $h(\mu)$ is selfadjoint on $\mathrm{dom}(A^{1/2})$, positive if $\mu < m$ and elliptic. So we can generalize the proof in the case of a closed Robertson–Walker spacetime to the case of a non-vanishing chemical potential $\mu$ under the restriction $\mu < m$. In the case of non-closed Robertson–Walker spacetimes we can proceed in the same way as in the proof of Theorem 4.
Theorem 5. An adiabatic KMS state on the Weyl algebra of the free massive Klein–Gordon field on Robertson–Walker spacetimes, as defined in Definition 2, generalized by Eq. (12), is a Hadamard state if $\mu < m$.
5. On the KMS Condition
In this section we show that an adiabatic KMS state fulfills the KMS condition with respect to an automorphism group $\alpha_s$. This group does not describe the time-translational automorphisms of the system, which, because of non-stationarity, will not in general constitute a group at all. It was already remarked in the fundamental paper on the KMS condition by Haag et al. [6] that such a situation can occur. This means the system would be in equilibrium if the time evolution were given by the automorphism group $\alpha_s$, i.e. if $h(s) = (m^2 - \Delta_\varepsilon/R^2(s))^{1/2}$ were independent of $s$. A KMS state can be defined in the following way (see Kay and Wald [14]).
Definition 4. Let $\alpha_s$ be an automorphism group on a $C^*$-algebra $\mathcal{A}$ and $\omega$ an $\alpha_s$-invariant state. $\omega$ is a KMS state at inverse temperature $\beta$ if its GNS triple $(\mathcal{F}, \pi_\beta, \Omega_\beta)$ satisfies the following properties:
1. The unique unitary group $U(s) : \mathcal{F} \to \mathcal{F}$ which implements $\alpha_s$ and leaves $\Omega_\beta$ invariant, is strongly continuous, so that $U(s) = \exp(-iHs)$ for some selfadjoint operator $H$.
2. $\pi_\beta(\mathcal{A})\Omega_\beta$ is contained in the domain of $\exp(-\beta H/2)$.
3. There exists a complex conjugation $J$ on $\mathcal{F}$ satisfying $[J, \exp(-iHs)] = 0$, $\forall s \in \mathbb{R}$, and $\exp(-\beta H/2) \pi_\beta(A) \Omega_\beta = J \pi_\beta(A^*) \Omega_\beta$, $A \in \mathcal{A}$.
For quasifree states the definition can be reduced to the one-particle Hilbert space. $K_t^{a\beta}$ maps to $\tilde{\mathcal{H}} = \mathcal{H} \oplus \mathcal{H}$, so the representation Hilbert space can be chosen to be
\[ \mathcal{F} = \mathcal{F}_s(\mathcal{H} \oplus \mathcal{H}) = \mathcal{F}_s(\mathcal{H}) \otimes \mathcal{F}_s(\mathcal{H}), \]
where $\mathcal{F}_s(\mathcal{H})$ is the symmetric Fock space over $\mathcal{H}$. The Weyl operator $W(f)$ on $\mathcal{F}$ is represented by
\[ W(f) = W_F(\cosh Z^\beta K_t^a f) \otimes W_F(C \sinh Z^\beta K_t^a f), \]
where $W_F$ is the usual Weyl operator on $\mathcal{F}_s$. With $h = (m^2 - \Delta_\varepsilon/R^2)^{1/2}$ we define on $\tilde{\mathcal{H}}$
\[ e^{-i\tilde h s} = e^{-ihs} \oplus e^{ihs}, \quad e^{-\beta \tilde h/2} = e^{-\beta h/2} \oplus e^{\beta h/2}, \quad j(x \oplus y) = (-Cy) \oplus (-Cx) \]
and the operators on $\mathcal{F}$ by second quantization. The second condition in Definition 4 can be reduced to Kay’s regularity condition $K\mathcal{D} \subset \mathrm{dom}(h^{-1/2})$ (see Kay [13]). $h$ is a positive, selfadjoint operator, so that $h^{-1/2}$ is bounded and $\mathrm{dom}(h^{-1/2}) = \mathcal{H}$. The condition $[J, e^{-iHs}] = 0$ reduces to $[j, e^{-i\tilde h s}] = 0$, which can easily be verified. The condition $\exp(-\beta H/2) \pi_\beta(A) \Omega_\beta = J \pi_\beta(A^*) \Omega_\beta$ reduces to $e^{-\beta \tilde h/2}(iK^{a\beta} f) = j(-iK^{a\beta} f)$ and can also be verified. One finds $e^{-\beta \tilde h/2}(x \oplus y) = Cy \oplus Cx$, so that the one-particle KMS condition
\[ \left\langle e^{-is\tilde h} x \,\middle|\, y \right\rangle_{\tilde{\mathcal{H}}} = \left\langle e^{-\beta \tilde h/2} y \,\middle|\, e^{-is\tilde h} e^{-\beta \tilde h/2} x \right\rangle_{\tilde{\mathcal{H}}} \]
is valid for $x, y \in \tilde{\mathcal{H}}$ and all $s \in \mathbb{R}$.
6. Time Evolution by a Family of Propagators
In this section we describe the time evolution on the classical phase space with the help of semigroup theory. We prove the existence of a family of propagators on a Hilbert space in which the classical phase space is a dense subset. In [12] the time evolution on the classical phase space is also considered. Here, we construct propagators having a larger domain. Propagators of our type were also considered by Moreno in [16]. While that work is in some aspects more general than ours, it is applicable only to closed Robertson–Walker spacetimes.
6.1. The Klein–Gordon equation as a first order system. To analyze the time evolution of an adiabatic KMS state, we first have to describe the time evolution on the classical phase space $\mathcal{D}$. This can be achieved by introducing different coordinates, so that the metric has the form
\[ g = -R^6(t)\, dt^2 + R^2(t) [d\theta_1^2 + \Sigma_\varepsilon^2 (d\theta_2^2 + \sin^2\theta_2\, d\phi^2)], \quad R(t) > 0, \]
where again $\varepsilon = 1, 0, -1$ corresponds to the spherical, the flat and the hyperbolic spatial part respectively. We assume $R(t)$ and $\dot R(t)$ to be positive and continuous on any compact subset. In these coordinates the Klein–Gordon equation has the form
\[ [-\partial_t^2 + R^4(t) \Delta_\varepsilon - m^2 R^6(t)] \varphi = 0, \]
where $\Delta_\varepsilon$ is again the Laplace operator on the respective spatial spaces. The Klein–Gordon equation can be written as a first order system:
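\[ \partial_t F = -H(t) F, \quad F = \begin{pmatrix} f_1 \\ f_2 \end{pmatrix}, \quad -H(t) = \begin{pmatrix} 0 & 1 \\ -B^2(t) & 0 \end{pmatrix}, \quad -B^2(t) = R^4(t) \Delta_\varepsilon - m^2 R^6(t). \]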
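Restricted to a single eigenfunction $Y_k$ of the Laplace operator, this first order system becomes a $2 \times 2$ system of ordinary differential equations with $B^2(t) = R^4(t) E(k) + m^2 R^6(t)$. The following sketch approximates its evolution from time 0 to time 1 by freezing the generator at the left endpoint of each of $k$ subintervals and multiplying the resulting exponentials, which is the kind of approximate propagator used in Sect. 6.2. The function names, the sample scale factor and the closed-form matrix exponential for the frozen generator are choices made for this illustration only.

```python
import math

def b_squared(t, R, E_k, m):
    """B^2(t) on one mode: R^4 E(k) + m^2 R^6, using Delta Y_k = -E(k) Y_k."""
    r = R(t)
    return r ** 4 * E_k + m ** 2 * r ** 6

def frozen_step(b2, tau):
    """exp(-tau*H) for the constant generator -H = [[0, 1], [-b2, 0]] with b2 > 0.

    Since (-H)^2 = -b2 * identity, the exponential is cos(b*tau)*1 + (sin(b*tau)/b)*(-H).
    """
    b = math.sqrt(b2)
    c, s = math.cos(b * tau), math.sin(b * tau)
    return [[c, s / b], [-b * s, c]]

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][l] * B[l][j] for l in range(2)) for j in range(2)]
            for i in range(2)]

def approx_propagator(k, R, E_k, m):
    """Approximate propagator from time 0 to 1 with k frozen steps (later times act on the left)."""
    U = [[1.0, 0.0], [0.0, 1.0]]
    for i in range(1, k + 1):
        step = frozen_step(b_squared((i - 1) / k, R, E_k, m), 1.0 / k)
        U = matmul(step, U)
    return U

# Hypothetical example: linearly growing scale factor, mode with E(k) = 1 and m = 1.
R_lin = lambda t: 1.0 + 0.5 * t
U_200 = approx_propagator(200, R_lin, 1.0, 1.0)
```

Each frozen step has determinant one, so the product preserves the symplectic form on the two-dimensional mode space. For a constant scale factor the product reproduces the exact one-parameter group for every $k$.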
We define the operator $H(t)$ on the real Hilbert space $\mathcal{H}_t = \mathrm{dom}(B(t)) \oplus L^2(S_\varepsilon(t))$, where $S_\varepsilon$ are the respective spatial spaces, with scalar product
\[ \langle f_1 \oplus f_2 | g_1 \oplus g_2 \rangle_B = \langle B f_1 | B g_1 \rangle + \langle f_2 | g_2 \rangle. \]
The phase space $\mathcal{D}$ is of course a dense subspace of $\mathcal{H}_t$. For fixed $s$ the operator $H(s)$ is skew-adjoint on this space and therefore defines a contractive semigroup $T(t) = \exp[-tH(s)]$ on $\mathcal{H}$.
6.2. Existence of the propagator. We use the following theorem to prove the existence of the propagator. For each positive integer $k$, we define an approximate propagator $U_k(t, s)$ on $0 \le s \le t \le 1$ by
\[ U_k(t, s) = \exp\left[ -(t - s) H\left( \frac{i - 1}{k} \right) \right], \quad \frac{i - 1}{k} \le s \le t \le \frac{i}{k} \quad (\text{where } 1 \le i \le k), \]
and $U_k(t, r) = U_k(t, s) U_k(s, r)$ if $0 \le r \le s \le t \le 1$. We also define $C(t, s) = H(t) H(s)^{-1} - 1$.
Theorem 6. Let $X$ be a Banach space and let $I$ be an open interval in $\mathbb{R}$. For each $t \in I$, let $H(t)$ be the generator of a contraction semigroup on $X$ so that $0 \in \rho(H(t))$, the resolvent set of $H(t)$, and
1. The $H(t)$ have a common dense domain $D$.
2. For each $\varphi \in X$, $(t - s)^{-1} C(t, s) \varphi$ is uniformly strongly continuous and uniformly bounded in $s$ and $t$ for $t \neq s$ lying in any fixed compact subinterval of $I$.
3. For each $\varphi \in X$, $C(t)\varphi \equiv \lim_{s \nearrow t} (t - s)^{-1} C(t, s) \varphi$ exists uniformly for $t$ in each compact subinterval and $C(t)$ is bounded and strongly continuous in $t$.
Then for all $s \le t$ in any compact subinterval of $I$ and any $\varphi \in X$,
\[ U(t, s) \varphi = \lim_{k \to \infty} U_k(t, s) \varphi \]
exists uniformly in $s$ and $t$. Further, if $\psi \in D$, then $\varphi_s(t) = U(t, s)\psi$ is in $D$ for all $t$ and satisfies
\[ \frac{d}{dt} \varphi_s(t) = -H(t) \varphi_s(t), \quad \varphi_s(s) = \psi \]
and $\|\varphi_s(t)\| \le \|\psi\|$ for all $t \ge s$.
For a proof see Reed and Simon [19, Thm. X.70]. As a consequence of the positivity of $B^2(t)$, 0 is in the resolvent set of $H(t)$. We first verify condition 1, i.e. we have to show that for all $t \in I$ the operators $H(t)$ have a common dense domain. For this it is sufficient to show that the spaces $L^2(S_\varepsilon(t))$ are setwise equivalent. Let $h$ be the determinant of the spatial part of the metric $g$ and let $\mu_h(t) = \sqrt{h(t)}$ and $\mu_h(t') = \sqrt{h(t')}$ be the invariant measures induced by the metric $g$ on the Cauchy surfaces at time $t$ and $t'$ respectively. Then
\[ \int_{S_\varepsilon} |f|^2 \sqrt{h(t)}\, d^3x = \int_{S_\varepsilon} |f|^2\, \frac{\sqrt{h(t)}}{\sqrt{h(t')}}\, \sqrt{h(t')}\, d^3x , \]
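where $\sqrt{h(t)}/\sqrt{h(t')}$ is smooth, bounded and strictly positive, namely the Radon–Nikodym derivative of the measure $\mu_h(t)$ with respect to the measure $\mu_h(t')$. Therefore the measures $\{\mu_h(t)\}_{t\in I}$ are mutually absolutely continuous and, because of the boundedness of $\sqrt{h(t)}/\sqrt{h(t')}$, we have $f \in L^2(S_\varepsilon(t))$ iff $f \in L^2(S_\varepsilon(t'))$.

We have to verify condition 2. The operator $C(t,s)$ is given by
\[ C(t,s) = H(t)H(s)^{-1} - 1 = \begin{pmatrix} 0 & -1 \\ B^2(t) & 0 \end{pmatrix} \begin{pmatrix} 0 & B^{-2}(s) \\ -1 & 0 \end{pmatrix} - \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & B^2(t)B^{-2}(s) - 1 \end{pmatrix}, \]
so we have
\[ (t-s)^{-1}C(t,s)F = (t-s)^{-1}\left[\frac{m^2R^6(t) - R^4(t)\Delta_\varepsilon}{m^2R^6(s) - R^4(s)\Delta_\varepsilon} - 1\right]\pi, \qquad F = \varphi \oplus \pi \in D. \]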
We assumed $R(t)$ to be continuous and as a consequence $R(t)$ is jointly continuous on every compact subinterval. Therefore $C(t,s)$ is jointly continuous in $t$ and $s$. The operator is also jointly bounded on every compact subinterval (by using the eigenvalues of $\Delta_\varepsilon$) because $R(t)$ is bounded. The last step is to show, for $\varphi \in \mathcal{H}$, that
\[ C(t)\varphi = \lim_{s \nearrow t}(t-s)^{-1}\big(H(t)H(s)^{-1} - 1\big)\varphi \]
exists uniformly for $t$ in each compact subinterval and that $C(t)$ is bounded and strongly continuous in $t$. The existence of the limit can be shown with the rule of de l'Hospital. It is
\[ C(t) = 4\,\frac{\dot R(t)}{R(t)} + 2m^2 R(t)\dot R(t)\big(m^2R^2(t) - \Delta_\varepsilon\big)^{-1}. \]
The operator is bounded by the boundedness of $(m^2R^2(t) - \Delta_\varepsilon)^{-1}$ and because we assumed $R(t)$ and $\dot R(t)$ to be bounded functions, and it is of course strongly continuous. We have therefore proved the existence of the propagator $U(t,s)$ as the strong limit of the approximate propagators $U_k(t,s)$.

Remark 1. We introduced new coordinates. In these coordinates the existence proof is easiest. It is also possible to prove the existence of the propagator using the coordinates given in Eq. (1). Then the continuity and boundedness of $\ddot R$ has to be assumed too.

7. Time Evolution on the One-Particle Hilbert Space

We will now describe the time evolution on the one-particle Hilbert space. In Definition 3 we defined a one-particle Hilbert space structure. For static spacetimes it is also required that
\[ U(t)K = KT(t), \tag{13} \]
where T (t) describes the time evolution on the classical phase space D and U (t) the time evolution on H, i.e. K intertwines the time evolutions.
In our case, the time evolution on the classical phase space is given by the propagator of Subsect. 6. The propagator maps Cauchy data at time $s$ to Cauchy data at time $t$. It is therefore natural to require the following generalization of condition (13). For the one-particle Hilbert space structure $K_t^a$ of an adiabatic vacuum state we demand that $K_t^a$ intertwines the time evolutions such that
\[ \tilde U(t,s)K_s^a = K_t^a U(t,s), \]
where $\tilde U(t,s)$ is the time evolution from $\mathcal{H}_s$ to $\mathcal{H}_t$. This is a natural generalization. On the right-hand side the evolution from a Cauchy surface at time $s$ to a Cauchy surface at time $t$ is given on phase space followed by the mapping in the one-particle Hilbert space at time $t$. On the left-hand side we map in the one-particle Hilbert space at time $s$ and the evolution is given to the Hilbert space at time $t$.

The operator $K_s^a$ is injective for each $s$: for $F = f_1 \oplus f_2$, $G = g_1 \oplus g_2 \in D$, we conclude (suppressing the $s$-dependence of $A(s)$) from
\[ \sqrt{2}\,K_s^a(F-G) = A^{1/4}(f_1 - g_1) + iA^{-1/4}H\Big(1 + \frac{m^2}{2}A^{-1}\Big)(f_1 - g_1) + iA^{-1/4}(f_2 - g_2) = 0 \]
that $f_1 = g_1$, because $A^{1/4}$ maps real-valued functions to real-valued functions, and this leads to $f_2 = g_2$, i.e. $F = G$. Since the kernel of $K_s^a$ contains only the zero vector, we can define the propagator $\tilde U(t,s)$ on the one-particle Hilbert space by
\[ \tilde U(t,s) = K_t^a U(t,s)(K_s^a)^{-1} \]
on the range of $K_s^a$, which is dense in $\mathcal{H}$. Furthermore we will show that $\tilde U(t,s)$ is isometric on the range of $K_s^a$ and can be extended to a unitary operator from $\mathcal{H}_s$ to $\mathcal{H}_t$.

The propagator $U(t,s)$ leaves the symplectic form $\sigma$ invariant, because $\sigma$ is invariant under solutions of the Klein–Gordon equation. Since the real scalar product $S$ on $D$ can be defined by a complexification $J$ ($J : D \to D$, $J^2 = -1$, see Baumgärtel and Wollenberg [3, 8.2.4]) via
\[ S(F,G) = \sigma(F,JG), \]
we also have $S(U(t,s)F, U(t,s)G) = S(F,G)$ and $U(t,s)J_s = J_t U(t,s)$, where $J_s$ means the complexification, defining a state at time $s$.
Now with $f = K_s^a F$, $g = K_s^a G$, $F, G \in D$, we have
\begin{align*}
\langle f | g \rangle &= \langle K_s^a F \,|\, K_s^a G \rangle \\
&= [S(F,G) + i\sigma(F,G)]/2 \\
&= [S(U(t,s)F, U(t,s)G) + i\sigma(U(t,s)F, U(t,s)G)]/2 \\
&= [S(U(t,s)(K_s^a)^{-1}f, U(t,s)(K_s^a)^{-1}g) + i\sigma(U(t,s)(K_s^a)^{-1}f, U(t,s)(K_s^a)^{-1}g)]/2 \\
&= \langle K_t^a U(t,s)(K_s^a)^{-1}f \,|\, K_t^a U(t,s)(K_s^a)^{-1}g \rangle \\
&= \langle \tilde U(t,s)f \,|\, \tilde U(t,s)g \rangle ,
\end{align*}
which shows that $\tilde U(t,s)$ is an isometry. Since $\tilde U(t,s)$ is a bounded linear operator, densely defined on the range of $K_s^a$, it can be extended to a unitary operator from $\mathcal{H}_s$ to $\mathcal{H}_t$.

The propagator $U(t,s)$ leaves the symplectic form $\sigma$ invariant. Therefore it defines an automorphism $\alpha_{t,s}$ of the algebra given by $\alpha_{t,s}(W(F)) = W(U(t,s)F)$. For an adiabatic vacuum state $\omega$ we have $\omega_t(\alpha_{t,s}(W(F))) = \omega_s(W(F))$, as it should be.
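The approximate propagators $U_k(t,s)$ of Theorem 6 and the invariance of the symplectic form can be illustrated on a single mode. The following is a minimal sketch under stated assumptions, not the construction used in the text: the operator $B^2(t)$ is replaced by a number $\omega^2(t) = m^2R^6(t) + \lambda R^4(t)$ (with $-\lambda$ standing in for an eigenvalue of $\Delta_\varepsilon$), the scale factor $R(t) = 1 + t/2$ and the values $m = \lambda = 1$ are arbitrary choices, and all function names are hypothetical.

```python
import math

def omega(t, m=1.0, lam=1.0):
    """Single-mode frequency, omega^2 = m^2 R^6 + lam R^4, with the assumed R(t) = 1 + t/2."""
    R = 1.0 + 0.5 * t
    return math.sqrt(m * m * R**6 + lam * R**4)

def exp_step(tau, w):
    """Closed form of exp(-tau*H) for H = [[0, -1], [w^2, 0]], using H^2 = -w^2 * Id."""
    c, s = math.cos(w * tau), math.sin(w * tau)
    return [[c, s / w], [-w * s, c]]

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def approx_propagator(k):
    """U_k(1, 0): k steps with the generator frozen at (i-1)/k, later times acting on the left."""
    U = [[1.0, 0.0], [0.0, 1.0]]
    for i in range(1, k + 1):
        U = matmul2(exp_step(1.0 / k, omega((i - 1) / k)), U)
    return U

def det2(a):
    """Determinant; det = 1 is the 2x2 form of symplectic invariance."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def max_diff(a, b):
    """Largest entrywise difference of two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))
```

Each factor has determinant one, so every $U_k$ is symplectic in this toy setting, and the differences `max_diff(approx_propagator(k), approx_propagator(2 * k))` shrink as `k` grows, which mirrors the strong convergence asserted in Theorem 6.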
8. Time Evolution of an Adiabatic KMS State

In this section we will answer the question about the time evolution of an adiabatic KMS state. It is well known that the inverse temperature of a relativistic Bose gas on Robertson–Walker spacetimes is proportional to the scale parameter $R$ of the metric, while for a non-relativistic gas the inverse temperature is proportional to $R^2$. We will find the same behavior for a Bose gas on a Robertson–Walker spacetime described by an adiabatic KMS state. We start with the two-point function of an adiabatic KMS state of inverse temperature $\beta_t$ on a Cauchy surface at time $t$ and compute the two-point function on a Cauchy surface $s$,
\begin{align*}
\langle K_t^{a\beta_t}(U(t,s)F) \,|\, K_t^{a\beta_t}(U(t,s)F) \rangle
&= \Big\langle K_t^a(U(t,s)F) \,\Big|\, \frac{1 + \exp[-\beta_t A^{1/2}(t)]}{1 - \exp[-\beta_t A^{1/2}(t)]}\, K_t^a(U(t,s)F) \Big\rangle \\
&= \Big\langle \tilde U(t,s)K_s^a(F) \,\Big|\, \frac{1 + \exp[-\beta_t A^{1/2}(t)]}{1 - \exp[-\beta_t A^{1/2}(t)]}\, \tilde U(t,s)K_s^a(F) \Big\rangle \\
&= \Big\langle K_s^a(F) \,\Big|\, \frac{1 + \exp[-\beta_t A^{1/2}(t)]}{1 - \exp[-\beta_t A^{1/2}(t)]}\, K_s^a(F) \Big\rangle , \qquad F \in D.
\end{align*}
We have
\begin{align*}
-\beta_t A^{1/2}(t) &= -\beta_t\Big(m^2 - \frac{\Delta_\varepsilon}{R^2(t)}\Big)^{1/2} \\
&= -\beta_t\,\frac{R(s)}{R(t)}\Big(m^2\frac{R^2(t)}{R^2(s)} - \frac{\Delta_\varepsilon}{R^2(s)}\Big)^{1/2} \\
&= -\beta_s\Big(m^2\frac{R^2(t)}{R^2(s)} - \frac{\Delta_\varepsilon}{R^2(s)}\Big)^{1/2},
\end{align*}
so the state on the Cauchy surface $s$ can be interpreted as a state of inverse temperature $\beta_s = \beta_t R(s)/R(t)$. This means that the inverse temperature change is proportional to the scale parameter $R$.

A. Pseudodifferential Operators and Wave-Front Sets

We briefly review in this appendix the necessary results on pseudodifferential operators and wave-front sets (see e.g. Taylor [20]).

Definition 5. Let $O$ be an open subset of $\mathbb{R}^n$. We define the symbol class $S^m(O)$, $m \in \mathbb{R}$, to consist of all functions $p \in C^\infty(O \times \mathbb{R}^n)$ with the property that, for any compact set $K \subset O$ and any multi-indices $\alpha, \beta$, there exists a constant $C_{K,\alpha,\beta}$ such that
\[ \big|D_x^\beta D_\xi^\alpha p(x,\xi)\big| \le C_{K,\alpha,\beta}(1 + |\xi|)^{m-|\alpha|} \]
for all $x \in K$, $\xi \in \mathbb{R}^n$.

Remark 2. It is possible to define more general symbol classes but it is not necessary for our work.
With each symbol $p(x,\xi)$ we associate the pseudodifferential operator $P$ by
\[ (Pf)(x) = (2\pi)^{-n}\int p(x,\xi)\tilde f(\xi)e^{i\xi\cdot x}\,d\xi, \qquad f \in \mathcal{S}(\mathbb{R}^n), \]
where $\tilde{\ }$ denotes the Fourier transform, and if $p(x,\xi) \in S^m$, we say $P \in OPS^m$. The operator $P$ is a continuous operator of $\mathcal{D}(\mathbb{R}^n)$ to $C^\infty(\mathbb{R}^n)$ and can be extended to a continuous operator of $\mathcal{E}'(\mathbb{R}^n)$ to $\mathcal{D}'(\mathbb{R}^n)$. By the Schwartz kernel theorem we can associate a distribution kernel $K_P \in \mathcal{D}'(\mathbb{R}^n \times \mathbb{R}^n)$ with the map $P$ such that $\langle Pu \,|\, v \rangle = \langle K_P \,|\, u \otimes v \rangle$. It is also possible to define a pseudodifferential operator (PDO) on a paracompact manifold.

Definition 6. An operator $P : C_0^\infty(M) \to C^\infty(M)$ belongs to $OPS^m(M)$ if the kernel of $P$ is smooth off the diagonal in $M \times M$ and for any coordinate neighborhood $U \subset M$ there is a diffeomorphism $\chi : U \to O \subset \mathbb{R}^n$, such that the map of $C_0^\infty(O)$ into $C^\infty(O)$ given by $u \mapsto P(u \circ \chi) \circ \chi^{-1}$ belongs to $OPS^m(O)$.

If $P \in OPS^m(O)$, we define the principal symbol of $P$ to be the member of the equivalence class in $S^m(O)/S^{m-1}(O)$. Now we define the wave-front set of a distribution. If $p \in S^m(O)$ and $p_m$ is its principal symbol, the characteristic set $\mathrm{char}\,P$ of the PDO $P$ associated with the symbol $p$ is given by
\[ \mathrm{char}\,P = \{(x,\xi) \in T^*(O) \setminus \{0\} \,|\, p_m(x,\xi) = 0\}. \]
The wave-front set $WF(u)$ of a distribution $u$ is defined by
\[ WF(u) = \bigcap \{\mathrm{char}\,P \,|\, P \in OPS^0,\ Pu \in C^\infty\}. \]
A useful characterization of the wave-front set is given in the next theorem.

Theorem 7. The point $(x_0,\xi_0) \notin WF(u)$ iff there is $\phi \in C_0^\infty(O)$, $\phi(x_0) \ne 0$, and a conic neighborhood $\Gamma$ of $\xi_0$, such that, for every $n$,
\[ |\widetilde{\phi u}(\xi)| \le C_n(1 + |\xi|)^{-n}, \qquad \xi \in \Gamma. \]
For a proof see Taylor [20, Chap. VI §1]. We quote two further results, important for this work.

Definition 7. The operator $p(x,D) \in OPS^m(O)$ is elliptic of order $m$, if on each compact subset $K \subset O$ there are constants $C_K$ and $R$ such that
\[ |p(x,\xi)| \ge C_K(1 + |\xi|)^m \qquad \text{if } x \in K,\ |\xi| \ge R. \]

Theorem 8. Let $A \in OPS^1(M)$ be an elliptic, selfadjoint, positive operator on a compact manifold $M$ with real valued principal symbol $a(x,\xi)$. Let $p(\lambda) \in S^m(\mathbb{R})$ be a Borel function. Then $p(A) \in OPS^m(M)$ with principal symbol $p(a(x,\xi))$.

For a proof see Taylor [20, Chap. XII §1]. We remark that the square root of the Laplace operator on a compact manifold is of this type, $(-\Delta)^{1/2} \in OPS^1(M)$ with principal symbol given by $\sqrt{h^{ij}\xi_i\xi_j}$, where $h^{ij}$ are the metric coefficients.

Theorem 9.
1. If $A \in OPS^m(O)$, then the associated kernel distribution $K_A$ is smooth everywhere off the diagonal in $O \times O$.
2. If $A \in OPS^{-\infty}(O)$, then $K_A$ is smooth everywhere in $O \times O$.

For a proof see Junker [10, Lemma 2.6].

Acknowledgement. I thank K.-E. Hellwig, W. Junker, M. Keyl and R. Verch for helpful hints and discussions.
References
1. Aragão de Carvalho, C.A., Goulart Rosa Jr., S.: Comments on “Bose–Einstein condensation in an Einstein universe”. J. Phys. A: Math. Gen. 13, 989–994 (1980)
2. Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics II. Berlin–Heidelberg–New York: Springer-Verlag, 1981
3. Baumgärtel, H., Wollenberg, M.: Causal Nets of Operator Algebras. Berlin: Akademie Verlag, 1992
4. Dimock, J.: Algebras of Local Observables on a Manifold. Commun. Math. Phys. 77, 219–228 (1980)
5. Haag, R.: Local Quantum Physics. Berlin–Heidelberg–New York: Springer-Verlag, 1992
6. Haag, R., Hugenholtz, N.M., Winnink, M.: On the Equilibrium States in Quantum Statistical Mechanics. Commun. Math. Phys. 5, 215–236 (1967)
7. Haag, R., Kastler, D.: An Algebraic Approach to Quantum Field Theory. J. Math. Phys. 5, 848–861 (1964)
8. Haag, R., Narnhofer, H., Stein, U.: On Quantum Field Theory in Gravitational Background. Commun. Math. Phys. 94, 219–238 (1984)
9. Haber, H.E., Weldon, H.A.: Thermodynamics of an Ultrarelativistic Bose Gas. Phys. Rev. Lett. 46, 1497–1500 (1981)
10. Junker, W.: Hadamard States, Adiabatic Vacua and the Construction of Physical States for Quantum Fields on Curved Spacetime. Rev. Math. Phys. 8, 1091–1159 (1996)
11. Kay, B.S.: Linear Spin-Zero Quantum Fields in External Gravitational and Scalar Fields, I. A One Particle Structure for the Stationary Case. Commun. Math. Phys. 62, 55–70 (1978)
12. Kay, B.S.: Linear Spin-Zero Quantum Fields in External Gravitational and Scalar Fields, II. Generally Covariant Perturbation Theory. Commun. Math. Phys. 71, 29–46 (1980)
13. Kay, B.S.: A uniqueness result for quasi-free KMS states. Helv. Phys. Acta 58, 1017–1029 (1985)
14. Kay, B.S., Wald, R.M.: Theorems on the Uniqueness and Thermal Properties of Stationary, Nonsingular, Quasifree States on Spacetimes with a Bifurcate Killing Horizon. Phys. Rep. 207, 49–136 (1991)
15. Lüders, C., Roberts, J.E.: Local Quasiequivalence and Adiabatic Vacuum States. Commun. Math. Phys. 134, 29–63 (1990)
16. Moreno, C.: On the spaces of positive and negative frequency solutions of the Klein–Gordon equation in curved space-times. Rep. Math. Phys. 17, 333–358 (1980)
17. Parker, L.: Quantized Fields and Particle Creation in Expanding Universes I. Phys. Rev. 183, 1057–1068 (1969)
18. Radzikowski, M.J.: Micro-Local Approach to the Hadamard Condition in Quantum Field Theory on Curved Spacetime. Commun. Math. Phys. 179, 529–553 (1996)
19. Reed, M., Simon, B.: Methods of Modern Mathematical Physics II, Fourier Analysis, Self-Adjointness. New York: Academic Press, 1975
20. Taylor, M.E.: Pseudodifferential Operators. Princeton, NJ: Princeton University Press, 1981
21. Trucks, M.: Correlations of Quantum Fields on Robertson–Walker Spacetimes. Class. Quant. Grav. 13, 2941–2952 (1996)
22. Verch, R.: Local definiteness, primarity and quasiequivalence of quasifree Hadamard quantum states in curved spacetime. Commun. Math. Phys. 160, 507–536 (1994)
23. Verch, R.: Continuity of Symplectically Adjoint Maps and the Algebraic Structure of Hadamard Vacuum Representations for Quantum Fields on Curved Spacetime. Rev. Math. Phys. 9, 635–674 (1997)
24. Wald, R.M.: Quantum Field Theory in Curved Spacetime and Black Hole Thermodynamics. Chicago, IL: University of Chicago Press, 1994

Communicated by G. Felder
Commun. Math. Phys. 197, 405 – 425 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
(Semi)-Nonrelativistic Limits of the Dirac Equation with External Time-Dependent Electromagnetic Field
Philippe Bechouche², Norbert J. Mauser¹,³, Frédéric Poupaud²
¹ Institut für Mathematik, Universität Wien, Strudlhofgasse 4, A-1090 Wien, Austria
² Univ. Nice, Lab. J.A. Dieudonné, URM 6621 du CNRS, Parc Valrose, F-06108 Nice, France
³ Courant Institute, 251 Mercer str., New York, NY 10612, USA
Received: 6 November 1997 / Accepted: 18 March 1998
Abstract: We perform a mathematical study of the limit of infinite velocity of light for the Dirac equation with given time-dependent electromagnetic potential. Our approach is based on the use of appropriate projection operators for the electron and the positron component of the spinor which are better suited than the widely used simple splitting into “upper (large)” and “lower (small) component”. The “semi-nonrelativistic limit” yields the approximation by the Pauli equation for the electron component of the 4-spinor where first order corrections are kept. Like in the Foldy–Wouthuysen approach we use a rescaling of time to subtract the rest energy of the electron component and add it for the positron component which is assumed to be “small” initially. We also give rigorous results for the nonrelativistic limit to the Schrödinger equation. In this case we keep the symmetry of electron and positron components in the rescaling, thus avoiding smallness assumptions on the initial data, and obtain a decoupled pair of Schrödinger equations (with negative mass for the positron component). Convergence results for the relativistic current are included.

1. Introduction

We deal with the linear one-particle Dirac equation describing e.g. a fast electron in a given electromagnetic field. In its most compact form this equation reads
\[ \big(i\gamma^\mu\partial_\mu - M + g\gamma^\mu A_\mu\big)\psi = 0. \tag{1.1} \]
Here the unknown $\psi$ is the 4-vector of the “Spinorfield”: $\psi(t,x) \in \mathbb{C}^4$, $x_0 = ct \in \mathbb{R}$, $x = (x_1,x_2,x_3) \in \mathbb{R}^3$. $\partial_\mu$ stands for $\frac{\partial}{\partial x_\mu}$, i.e. $\partial_0 = \frac{\partial}{\partial x_0}$, $\partial_k = \frac{\partial}{\partial x_k}$, where we consistently adopt the notation that the Greek letter $\mu$ denotes 0, 1, 2, 3 and $k$ denotes the 3 spatial dimension indices 1, 2, 3. We always use the summation convention for indices that appear as “a pair of co- and contravariant indices”, e.g. $\gamma^\mu A_\mu$ stands for $\sum_{\mu=0}^{3}\gamma^\mu A_\mu$.
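The physical constants are $M = m_0c/\hbar$, $g = e/\hbar$, where $m_0$ is the electron's rest mass, $c$ is the velocity of light, $\hbar$ is the Planck constant and $e$ is the unit charge. By $\gamma^\mu \in \mathbb{C}^{4\times4}$, $\mu = 0,\dots,3$, we denote the $4 \times 4$ Dirac matrices given (in $2 \times 2$ block notation) by
\[ \gamma^0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \gamma^k = \begin{pmatrix} 0 & \sigma^k \\ -\sigma^k & 0 \end{pmatrix}, \tag{1.2} \]
where the Pauli matrices $\sigma^k$ are given by
\[ \sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{1.3} \]
The Dirac matrices satisfy ($^*$ denotes the transposed, complex conjugate matrix)
\[ \gamma^{0*} = \gamma^0, \qquad \gamma^{k*} = -\gamma^k, \quad k = 1,2,3, \qquad (\gamma^0\gamma^k)^* = \gamma^0\gamma^k, \tag{1.4} \]
\[ \gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 0, \quad \mu \ne \nu, \qquad (\gamma^0)^2 = \mathrm{Id}, \qquad (\gamma^k)^2 = -\mathrm{Id}. \tag{1.5} \]
The following related matrices occur frequently
\[ \gamma^0\gamma^k = \begin{pmatrix} 0 & \sigma^k \\ \sigma^k & 0 \end{pmatrix}, \qquad S^m := i\gamma^k\gamma^l = \begin{pmatrix} \sigma^m & 0 \\ 0 & \sigma^m \end{pmatrix}, \tag{1.6} \]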
where $(k,l,m)$ are cyclic permutations of (1,2,3). Note that the matrices $S^m$ represent the “spin-operator” [LL3].

$A_\mu(t,x) \in \mathbb{R}$, $\mu = 0,\dots,3$, are the components of the time-dependent electromagnetic potential; in particular $A_0(t,x)$ is the electric potential and $\vec A(t,x) = (A_1,A_2,A_3)^\top$ is the magnetic potential vector. Hence the electric field is given by $\vec E(t,x) = \nabla A_0 - \partial_t\vec A$ and the magnetic field by $\vec B(t,x) = \mathrm{curl}\,\vec A$.

From $\psi$ we obtain the “physical” quantities, in particular the relativistic current density as a 4-vector with elements $J_\mu = \gamma^0\gamma^\mu\psi\cdot\psi$, where the component $J_0$ is the nonnegative “position” density. Separating the time derivative associated to the “relativistic time variable” $x_0 = ct$ and applying $\gamma^0$ from the left, we have
\[ i\partial_t\psi = -ic\gamma^0\gamma^k\partial_k\psi + \frac{mc^2}{\hbar}\gamma^0\psi - \frac{e}{\hbar}A_k\gamma^0\gamma^k\psi - \frac{e}{\hbar}A_0\psi. \tag{1.7} \]
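The algebraic relations (1.4)–(1.6) can be checked mechanically from the block definitions (1.2) and (1.3). The following is a small sketch in plain Python; the helper names (`matmul`, `dagger`, `block`, `check_dirac_relations`) are illustrative choices and not part of the text.

```python
def matmul(a, b):
    """Matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def dagger(a):
    """Transposed, complex conjugate matrix (the * of (1.4))."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
I4 = block(I2, Z2, Z2, I2)
Z4 = scale(0, I4)

# Pauli matrices (1.3) and Dirac matrices (1.2)
sigma = {1: [[0, 1], [1, 0]], 2: [[0, -1j], [1j, 0]], 3: [[1, 0], [0, -1]]}
gamma = {0: block(I2, Z2, Z2, scale(-1, I2))}
for k in (1, 2, 3):
    gamma[k] = block(Z2, sigma[k], scale(-1, sigma[k]), Z2)

def check_dirac_relations():
    """Return True if (1.4), (1.5) and (1.6) all hold for the matrices above."""
    ok = close(dagger(gamma[0]), gamma[0]) and close(matmul(gamma[0], gamma[0]), I4)
    for k in (1, 2, 3):
        g0gk = matmul(gamma[0], gamma[k])
        ok = ok and close(dagger(gamma[k]), scale(-1, gamma[k]))     # (1.4)
        ok = ok and close(dagger(g0gk), g0gk)                        # (1.4)
        ok = ok and close(matmul(gamma[k], gamma[k]), scale(-1, I4)) # (1.5)
        ok = ok and close(g0gk, block(Z2, sigma[k], sigma[k], Z2))   # (1.6)
    for mu in range(4):
        for nu in range(mu + 1, 4):                                  # (1.5), mu != nu
            anti = add(matmul(gamma[mu], gamma[nu]), matmul(gamma[nu], gamma[mu]))
            ok = ok and close(anti, Z4)
    for k, l, m in ((1, 2, 3), (2, 3, 1), (3, 1, 2)):                # (1.6), cyclic
        spin = scale(1j, matmul(gamma[k], gamma[l]))
        ok = ok and close(spin, block(sigma[m], Z2, Z2, sigma[m]))
    return ok
```

Calling `check_dirac_relations()` exercises every relation once, including the three cyclic index triples that define $S^m$.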
We rescale this equation by introducing a reference length, time and strength of the electromagnetic potential: $x \to x/L$, $t \to t/T$, $A_\mu \to A_\mu/A$. The important dimensionless parameter $\varepsilon$ is given by the ratio of the reference velocity $V = L/T$ to the speed of light
\[ \varepsilon := \frac{V}{c} = \frac{L/T}{c} \ll 1. \tag{1.8} \]
The scaled Planck constant is the second dimensionless parameter: $\delta := \hbar/(mV^2T)$. We choose $L, T, A$ such that $mV^2 = eA$ and such that $\delta = 1$. The classical limit $\delta \to 0$ with $\varepsilon$ fixed was rigorously performed in [GMMP] using Wigner transform techniques. Similar techniques were used in [AS] for the classical and the nonrelativistic limit of a Pauli equation. In this work we concentrate on the nonrelativistic limit $c \to \infty$, i.e. $\varepsilon \to 0$. In the sequel we shall denote the dependence on $\varepsilon$ by a superscript. The resulting scaled Dirac equation for $\psi^\varepsilon = \psi^\varepsilon(t,x)$ reads
\[ i\partial_t\psi^\varepsilon = -\frac{i}{\varepsilon}\gamma^0\gamma^k\partial_k\psi^\varepsilon + \frac{1}{\varepsilon^2}\gamma^0\psi^\varepsilon - A_k\gamma^0\gamma^k\psi^\varepsilon - A_0\psi^\varepsilon, \tag{1.9} \]
\[ \psi^\varepsilon(t=0,x) = \psi_I^\varepsilon(x). \tag{1.10} \]
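Note that in this scaling the magnetic field is a relativistic effect which does not appear in the zeroth order approximation of the Dirac equation by Schrödinger equations. In the scaling chosen e.g. in [H1] the magnetic potential is rescaled such that the dynamic of the spinor is the main relativistic effect. In our scaling the position density $n^\varepsilon(t,x)$ is given by
\[ n^\varepsilon(t,x) = J_0^\varepsilon(t,x) = \psi^\varepsilon(t,x)\cdot\psi^\varepsilon(t,x), \tag{1.11} \]
and the components $J_k^\varepsilon$ of the 3-vector $\vec J^\varepsilon$ of the current density by
\[ J_k^\varepsilon(t,x) = \frac{1}{\varepsilon}\gamma^0\gamma^k\psi^\varepsilon(t,x)\cdot\psi^\varepsilon(t,x). \tag{1.12} \]
Multiplying (1.9) by $\psi^\varepsilon$ and taking imaginary parts we obtain the conservation law
\[ \partial_t n^\varepsilon + \mathrm{div}\,\vec J^\varepsilon = 0, \tag{1.13} \]
and hence the $\varepsilon$-independent estimates
\[ \|\psi^\varepsilon(t)\|_{L^2(\mathbb{R}^3)^4} = \|\psi_I^\varepsilon\|_{L^2(\mathbb{R}^3)^4}, \qquad \forall t, \tag{1.14} \]
\[ n^\varepsilon \ge 0, \qquad \|n^\varepsilon(t)\|_{L^1(\mathbb{R}^3)} = \|n_I^\varepsilon\|_{L^1(\mathbb{R}^3)}, \qquad \forall t. \tag{1.15} \]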
Our notation of functional spaces etc. follows [RS4]. In the following we will denote by $W^{m,p}(\mathbb{R}^3)$ the Sobolev spaces of functions which belong to $L^p(\mathbb{R}^3)$ $(1 \le p \le \infty)$ together with all the derivatives up to order $m$ in the sense of distributions, i.e.:
\[ W^{m,p}(\mathbb{R}^3) = \Big\{f \in L^p(\mathbb{R}^3) \text{ such that } \forall\alpha \in \mathbb{N}^3 \text{ with } |\alpha| \le m \text{ there exists } g \in L^p(\mathbb{R}^3) \text{ such that } \forall\phi \in C_c^\infty(\mathbb{R}^3):\ \int_{\mathbb{R}^3} f D^\alpha\phi = (-1)^{|\alpha|}\int_{\mathbb{R}^3} g\phi\Big\}. \tag{1.16} \]
In particular $H^m(\mathbb{R}^3)$ denotes the Sobolev space $W^{m,2}(\mathbb{R}^3)$ of functions which are in $L^2(\mathbb{R}^3)$ up to derivatives of $m$th order in the sense of distributions. In the sequel $D = (D_1,D_2,D_3)$ stands for the spatial derivative corresponding to the Fourier multiplier $\xi$ and $D_k = -i\partial_k$ for the partial derivative associated to $\xi_k$. By introducing the “free Dirac operator” $Q^\varepsilon$
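\[ Q^\varepsilon(D) := \varepsilon\gamma^0\gamma^kD_k + \gamma^0\,\mathrm{Id} \tag{1.17} \]
and introducing the “electromagnetic operator” $A \equiv A(t,x)$
\[ A := A_k(t,x)\gamma^0\gamma^k + A_0(t,x)\,\mathrm{Id} \tag{1.18} \]
we can rewrite (1.9) as
\[ i\partial_t\psi^\varepsilon = \frac{1}{\varepsilon^2}Q^\varepsilon\psi^\varepsilon - A\psi^\varepsilon. \tag{1.19} \]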
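We regard the spectral problem for the self-adjoint operator $Q^\varepsilon$ on Fourier space, i.e. for $Q^\varepsilon(\xi) = \varepsilon\gamma^0\gamma^k\xi_k + \gamma^0\,\mathrm{Id}$. For each $\xi$ there are 2 eigenvalues $\pm\lambda^\varepsilon$ given by
\[ \lambda^\varepsilon(\xi) = \sqrt{1 + \varepsilon^2|\xi|^2} \tag{1.20} \]
with geometric multiplicity 2. The associated projectors $\Pi_\pm^\varepsilon(\xi)$ are easily calculated as
\[ \Pi_\pm^\varepsilon(\xi) = \frac{1}{2}\Big(\mathrm{Id} \pm \frac{Q^\varepsilon(\xi)}{\lambda^\varepsilon(\xi)}\Big) \tag{1.21} \]
and correspond to the pseudodifferential operators $\Pi_\pm^\varepsilon(D)$ on $L^2(\mathbb{R}_x^3)^4$. Since the positive and negative eigenvalues $\pm\lambda^\varepsilon$ correspond to positive and negative energies of a free Dirac particle, the spectral decomposition of $Q^\varepsilon$,
\[ Q^\varepsilon(D) = \lambda^\varepsilon(D)\Pi_+^\varepsilon(D) - \lambda^\varepsilon(D)\Pi_-^\varepsilon(D), \tag{1.22} \]
is related to electrons and positrons ([D1]). The formal limit $\varepsilon \to 0$ of $\Pi_\pm^\varepsilon$ yields the operators
\[ \Pi_\pm^0 = \frac{1}{2}(I \pm \gamma^0), \qquad \Pi_+^0 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad \Pi_-^0 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}. \tag{1.23} \]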
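The projector properties behind (1.20)–(1.23) can be verified numerically for a fixed frequency $\xi$. The sketch below is an illustration under assumptions: the sample values of $\varepsilon$ and $\xi$ are arbitrary, and the helper names are hypothetical. It builds $Q^\varepsilon(\xi)$ from the block form (1.6) of $\gamma^0\gamma^k$ and returns $\lambda^\varepsilon$, $Q^\varepsilon$ and $\Pi_\pm^\varepsilon$.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
I4 = block(I2, Z2, Z2, I2)
Z4 = scale(0, I4)
sigma = {1: [[0, 1], [1, 0]], 2: [[0, -1j], [1j, 0]], 3: [[1, 0], [0, -1]]}

GAMMA0 = block(I2, Z2, Z2, scale(-1, I2))                         # gamma^0 from (1.2)
ALPHA = {k: block(Z2, sigma[k], sigma[k], Z2) for k in (1, 2, 3)}  # gamma^0 gamma^k, (1.6)

def free_dirac_symbol(eps, xi):
    """Q^eps(xi) = eps * gamma^0 gamma^k xi_k + gamma^0."""
    q = [row[:] for row in GAMMA0]
    for k in (1, 2, 3):
        q = add(q, scale(eps * xi[k - 1], ALPHA[k]))
    return q

def projectors(eps, xi):
    """Return (lambda, Q, Pi_plus, Pi_minus) following (1.20) and (1.21)."""
    lam = math.sqrt(1.0 + eps**2 * sum(x * x for x in xi))
    q = free_dirac_symbol(eps, xi)
    p_plus = scale(0.5, add(I4, scale(1.0 / lam, q)))
    p_minus = scale(0.5, add(I4, scale(-1.0 / lam, q)))
    return lam, q, p_plus, p_minus
```

For example, with `lam, q, pp, pm = projectors(0.3, (1.0, -2.0, 0.5))` one can compare `matmul(pp, pp)` with `pp`, `matmul(pp, pm)` with the zero matrix, and `scale(lam, add(pp, scale(-1, pm)))` with `q`, which is the pointwise version of (1.22). Taking `eps` very small reproduces the algebraic projectors of (1.23).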
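These “algebraic” projection operators are employed for the standard “large and small component” approach, i.e. the splitting of the “Dirac 4-spinor” into two “Pauli 2-spinors” as follows (e.g. [LL4a, IZ, ES]): Defining the “upper” and “lower component” as
\[ \psi_l^\varepsilon(t,x) := \Pi_+^0\psi^\varepsilon(t,x), \qquad \psi_s^\varepsilon(t,x) := \Pi_-^0\psi^\varepsilon(t,x) \tag{1.24} \]
we have, of course,
\[ \psi^\varepsilon = \begin{pmatrix} \psi_1^\varepsilon \\ \psi_2^\varepsilon \\ \psi_3^\varepsilon \\ \psi_4^\varepsilon \end{pmatrix} = \begin{pmatrix} \psi_1^\varepsilon \\ \psi_2^\varepsilon \\ 0 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ \psi_3^\varepsilon \\ \psi_4^\varepsilon \end{pmatrix} = \psi_l^\varepsilon + \psi_s^\varepsilon. \tag{1.25} \]
Introducing $\tilde\psi_l^\varepsilon$ and $\tilde\psi_s^\varepsilon$ as the vectors with two components obtained by skipping the zeros in $\psi_l^\varepsilon$, $\psi_s^\varepsilon$, the Dirac equation can be split into two equations for 2-vectors involving the Pauli matrices. Defining the “upper-large” and the “lower-small component” as the 2-vectors
\[ \varphi_l^\varepsilon(t,x) := e^{it/\varepsilon^2}\tilde\psi_l^\varepsilon(t,x), \qquad \varphi_s^\varepsilon(t,x) := e^{it/\varepsilon^2}\tilde\psi_s^\varepsilon(t,x) \tag{1.26} \]
we immediately obtain the equations
\[ i\partial_t\varphi_l^\varepsilon = -\frac{i}{\varepsilon}\sigma^k\partial_k\varphi_s^\varepsilon - A_k\sigma^k\varphi_s^\varepsilon - A_0\varphi_l^\varepsilon, \tag{1.27} \]
\[ i\partial_t\varphi_s^\varepsilon = -\frac{i}{\varepsilon}\sigma^k\partial_k\varphi_l^\varepsilon - A_k\sigma^k\varphi_l^\varepsilon - A_0\varphi_s^\varepsilon - \frac{2}{\varepsilon^2}\varphi_s^\varepsilon. \tag{1.28} \]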
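By formal considerations of orders of magnitude, in particular assuming that $\partial_t\varphi_s^\varepsilon$ is of $O(1)$, we obtain from (1.28) that the lower-small component is $O(\varepsilon)$,
\[ \varphi_s^\varepsilon = -\frac{i}{2}\varepsilon\sigma^k\partial_k\varphi_l^\varepsilon + O(\varepsilon^2), \tag{1.29} \]
and for the upper-large component we obtain by using (1.29) in (1.27),
\[ i\partial_t\varphi_l^\varepsilon = -\frac{1}{2}\sigma^k\sigma^l\partial_k\partial_l\varphi_l^\varepsilon - A_0\varphi_l^\varepsilon + \frac{i}{2}\sigma^k\sigma^l\partial_kA_l\varphi_l^\varepsilon + iA_k\partial_k\varphi_l^\varepsilon + O(\varepsilon^2). \tag{1.30} \]
Using the properties of the Pauli matrices $\sigma^j$ and adding the $O(\varepsilon^2)$ term $\varepsilon^2|\vec A|^2$, where $\vec A := (A_1,A_2,A_3)$ is the magnetic potential, this formal procedure finally yields the “Pauli equation” [LL4a, IZ] for $\tilde\varphi_l^\varepsilon$ as the “$O(\varepsilon)$ approximation” of the upper-large component $\varphi_l^\varepsilon$,
\[ i\partial_t\tilde\varphi_l^\varepsilon = \frac{1}{2}\big(i\vec\nabla + \varepsilon\vec A\big)^2\tilde\varphi_l^\varepsilon - A_0\tilde\varphi_l^\varepsilon - \varepsilon\frac{1}{2}\sigma^kB_k\tilde\varphi_l^\varepsilon. \tag{1.31} \]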
Here and in the sequel the tilde denotes the solution of an approximative equation, i.e. $\tilde\varphi_l^\varepsilon$ is the solution of the approximation (1.31) obtained by neglecting $O(\varepsilon^2)$ terms in the exact equation (1.30). We reserve the name “Pauli equation” for semi-nonrelativistic approximations (where “semi” means that we keep terms “at first order in $\varepsilon$”) for 2-spinors or 4-spinors, containing in any case the famous spin-magnetic field coupling term at order $\varepsilon$ as in (1.31). The “total” nonrelativistic limit $\varepsilon \to 0$ yields “Schrödinger equations” where the spin only remains as a divergence-free term in the current density and, in our scaling, the magnetic field has vanished (in contrast to the scaling in (the works following) e.g. [H1]).

Foldy and Wouthuysen (F-W) have given the first systematic approach to (semi)-nonrelativistic limits [FW], which lacks a rigorous justification (e.g. [GNP, Wh]). A mathematically rigorous theory of the problem has been developed in a series of papers [Ve, H1, CC, GGT, GNP, Wh], where a pseudoresolvent convergence approach using the spectral theorem was pursued. In [Sc] another approach based on singular perturbation theory and semigroups was given. All these works treat the purely static case, i.e. time-independent electromagnetic potentials, and it is not obvious how the methods can be adapted to the non-static case (e.g. via the Trotter–Kato formula). For a general survey with more references on the problem see [T1].

In this work we use very direct functional analytic methods for proving the limit to Schrödinger and Pauli equations for time-dependent potentials. Our rigorous approach is related to the F-W transform, which is hence to some extent made rigorous and also gives corrections at any order in $\varepsilon$. The F-W approach is essentially the search for a unitary transformation which diagonalizes the Dirac Hamiltonian with respect to the orthogonal decomposition based on $\Pi_\pm^0$ (cp. e.g. [IZ, GNP]). The analogous decomposition based on the PDO projectors $\Pi_\pm^\varepsilon(D)$ is better suited, which is reflected e.g. by the fact that the resulting “positron-small component” is $O(\varepsilon^2)$ in contrast to the $O(\varepsilon)$ “lower-small component” (1.29) obtained via the $\Pi_\pm^0$ decomposition. Using projectors including the electromagnetic potential would be another possibility (like e.g. in [GMMP]), but in particular for the time-dependent case those time-dependent PDOs have disadvantages in comparison with the projectors we use. The projectors $\Pi_\pm^\varepsilon(\xi)$ are already given in Dirac's book ([D1]), but for the sake of formally deriving the Pauli equation textbooks like [LL4a, IZ] prefer the simpler projectors $\Pi_\pm^0$ for the “splitting” of the Dirac equation. In the framework of pseudoresolvent convergence (e.g. [Wh, GNP]) similar projectors are defined with emphasis on the question whether the resulting family of operators is analytic in $\varepsilon$. Of course, in the expansion of PDOs like in Lemma 2.1 we “lose regularity for every power of $\varepsilon$”, which makes the remainder terms more and more singular. However, assuming sufficient regularity of the initial data and the electromagnetic potential we can always counterbalance in order to avoid distributional spaces.

This paper is organized as follows: after a short discussion of general properties of our projectors $\Pi_\pm^\varepsilon(D)$ we perform the rigorous nonrelativistic limit of the Dirac equation in Sect. 3. There we choose an approach where the rescaling of time means subtracting the rest energy both for the electron and the positron component and hence keeps them on an equal footing. Thus we avoid any smallness assumptions on the initial data. In Sect. 4 we give the crucial results for the “positron-small component”, which is obtained by adding the rest energy in the transformation that rescales the time variable. For this approach we need more regularity of the initial data and the electromagnetic potential than for the nonrelativistic limit and we have to assume smallness of the initial positron component. The semi-nonrelativistic approximation of the Dirac equation by the Pauli equation for the electron component is then performed in Sect. 5.

2. Prerequisites

For the pseudodifferential operators $\Pi_\pm^\varepsilon(D)$ given in (1.21) we have

Lemma 2.1.
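(i) The projectors $\Pi_\pm^\varepsilon(D)$ are uniformly (in $\varepsilon$) bounded operators from $H^m(\mathbb{R}^3)^4 \to H^m(\mathbb{R}^3)^4$ (where $H^m$ is the usual Sobolev space of functions that are in $L^2$ up to the $m$th derivative in the sense of (1.16)).
(ii) A series expansion of $\Pi_\pm^\varepsilon(D)$ w.r.t. $\varepsilon$ is given in the following sense:
\begin{align}
\Pi_+^\varepsilon(D) &= \Pi_+^0 - i\frac{\varepsilon}{2}\gamma^0\gamma^k\partial_k + \varepsilon^2R_2 \tag{2.1} \\
&= \Pi_+^0 - i\frac{\varepsilon}{2}\gamma^0\gamma^k\partial_k + \frac{\varepsilon^2}{4}\gamma^0\Delta + \varepsilon^4R_4, \tag{2.2} \\
\Pi_-^\varepsilon(D) &= \Pi_-^0 + i\frac{\varepsilon}{2}\gamma^0\gamma^k\partial_k - \varepsilon^2R_2 \tag{2.3} \\
&= \Pi_-^0 + i\frac{\varepsilon}{2}\gamma^0\gamma^k\partial_k - \frac{\varepsilon^2}{4}\gamma^0\Delta - \varepsilon^4R_4, \tag{2.4}
\end{align}
where $R_2$ and $R_4$, resp., stand for a uniformly (in $\varepsilon$) bounded operator $H^m(\mathbb{R}^3)^4 \to H^{m-2}(\mathbb{R}^3)^4$ and $H^m(\mathbb{R}^3)^4 \to H^{m-4}(\mathbb{R}^3)^4$, resp.

Proof. Assertion (i) follows immediately from (1.17), (1.20) and (1.21). In order to prove (ii) we take
\[ \Pi_+^\varepsilon(\xi) = \frac{1}{2}\Big(\mathrm{Id} + \frac{\gamma^0}{\sqrt{1 + \varepsilon^2|\xi|^2}} + \varepsilon\frac{\gamma^0\gamma^k\xi_k}{\sqrt{1 + \varepsilon^2|\xi|^2}}\Big) \]
and use
\[ C\varepsilon|\xi| \le \frac{1}{\sqrt{1 + \varepsilon^2|\xi|^2}} - 1 \le 0, \qquad -\frac{1}{2}\varepsilon^2|\xi|^2 \le \frac{1}{\sqrt{1 + \varepsilon^2|\xi|^2}} - 1 \le 0 \]
and
\[ 0 \le \frac{1}{\sqrt{1 + \varepsilon^2|\xi|^2}} - \Big(1 - \frac{\varepsilon^2|\xi|^2}{2}\Big) \le \frac{3}{8}\varepsilon^4|\xi|^4. \]
Note that we can also take $\varepsilon R_4$ as an operator $H^m(\mathbb{R}^3)^4 \to H^{m-3}(\mathbb{R}^3)^4$.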
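The three scalar bounds used in this proof depend only on $x = \varepsilon|\xi| \ge 0$, so they can be spot-checked on a grid. The sketch below does this in plain Python; the function names, the grid and the rounding tolerance are assumptions made for illustration, and the first bound is tested with the particular constant $C = -1$.

```python
import math

def inv_lambda(x):
    """1 / sqrt(1 + x^2), i.e. 1/lambda^eps(xi) with x = eps * |xi|."""
    return 1.0 / math.sqrt(1.0 + x * x)

def bounds_hold(x, tol=1e-12):
    """Check the bounds from the proof of Lemma 2.1 at one point x >= 0.

    tol only absorbs floating-point rounding near x = 0.
    """
    a = inv_lambda(x)
    linear = -x - tol <= a - 1.0 <= tol                  # first bound, with C = -1
    quadratic = -0.5 * x * x - tol <= a - 1.0 <= tol     # second bound
    remainder = a - (1.0 - 0.5 * x * x)
    quartic = -tol <= remainder <= 0.375 * x**4 + tol    # third bound, 3/8 = 0.375
    return linear and quadratic and quartic

def all_bounds_hold(n_points=1001, step=0.05):
    """Scan x = 0, step, 2*step, ... (here up to x = 50)."""
    return all(bounds_hold(step * i) for i in range(n_points))
```

The quadratic and quartic bounds are what turn into the remainders $\varepsilon^2R_2$ and $\varepsilon^4R_4$ in (2.1)–(2.4), since each power of $|\xi|$ costs one derivative.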
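We have “conservation of regularity” of the Dirac equation with time-dependent electromagnetic potential $A$:

Lemma 2.2. Let $A = (A_0,A_1,A_2,A_3) \in L^\infty((0,T); W^{m,\infty}(\mathbb{R}^3)^4)$ and $\psi_I^\varepsilon = O(1) \in H^m(\mathbb{R}^3)^4$, $m \in \mathbb{N}$ (where $W^{m,\infty}$ stands for the space of functions that are bounded up to the $m$th derivative in the sense of (1.16)). Then the solution $\psi^\varepsilon(t,x)$ of the Dirac equation (1.9) satisfies
\[ \psi^\varepsilon \in L^\infty((0,T); H^m(\mathbb{R}^3)^4) \tag{2.5} \]
with a uniform bound w.r.t. $\varepsilon$.

Proof. The case $m = 0$ is due to the conservation law (1.14). Assume that (2.5) holds for $m-1$ and that we have
\[ \|\psi^\varepsilon(t)\|_{H^{m-1}(\mathbb{R}^3)^4} \le C\|\psi_I^\varepsilon\|_{H^{m-1}(\mathbb{R}^3)^4}f(t), \tag{2.6} \]
where $C$ is a constant and $f(t)$ is polynomial in $t$. Let $D^m$ denote the derivative of order $m$. We apply $D^m$ to (1.9), multiply by $D^m\psi^\varepsilon$ and take imaginary parts:
\[ \partial_t(D^m\psi^\varepsilon\cdot D^m\psi^\varepsilon) + \partial_k\Big(\frac{1}{\varepsilon}\gamma^0\gamma^kD^m\psi^\varepsilon\cdot D^m\psi^\varepsilon\Big) + 2\,\mathrm{Im}\Big(\sum_{\alpha+\beta=m,\ \beta\ne m}(D^\alpha A_k)\gamma^0\gamma^kD^\beta\psi^\varepsilon\cdot D^m\psi^\varepsilon + (D^\alpha A_0)D^\beta\psi^\varepsilon\cdot D^m\psi^\varepsilon\Big) = 0. \]
The summand $\beta = m$ vanishes since $A$ is selfadjoint. Integration with respect to the space variable and using the assumptions on $A$ gives
\[ \frac{d}{dt}\|D^m\psi^\varepsilon(t)\|^2_{L^2(\mathbb{R}^3)^4} \le C\|A(t)\|_{W^{m,\infty}(\mathbb{R}^3)^4}\|\psi^\varepsilon(t)\|_{H^{m-1}(\mathbb{R}^3)^4}\|D^m\psi^\varepsilon(t)\|_{L^2(\mathbb{R}^3)^4}. \]
From the induction hypothesis (2.6) and integrating with respect to the time variable we obtain $\forall t \in [0,T)$:
\[ \|D^m\psi^\varepsilon(t)\|_{L^2(\mathbb{R}^3)^4} \le C\|A\|_{L^\infty((0,T);W^{m,\infty}(\mathbb{R}^3)^4)}\|\psi_I^\varepsilon\|_{H^{m-1}(\mathbb{R}^3)^4}\int_0^t f(\tau)\,d\tau + \|D^m\psi_I^\varepsilon\|_{L^2(\mathbb{R}^3)^4}, \]
which allows us to conclude the $H^m$ estimate (2.5).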
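We now turn to the crucial separation of the Dirac equation in two equations according to the spectral decomposition of $L^2(\mathbb{R}^3)^4$ due to the eigenvalue problem of the “free” Dirac equation. We define the “electron component” $\psi_+^\varepsilon$ and the “positron component” $\psi_-^\varepsilon$
\[ \psi_+^\varepsilon := \Pi_+^\varepsilon(D)\psi^\varepsilon, \qquad \psi_-^\varepsilon := \Pi_-^\varepsilon(D)\psi^\varepsilon. \tag{2.7} \]
Applying the projectors $\Pi_\pm^\varepsilon(D)$ to (1.19) gives:
\[ i\partial_t\psi_+^\varepsilon - \frac{\lambda^\varepsilon}{\varepsilon^2}\psi_+^\varepsilon + \Pi_+^\varepsilon(A\psi^\varepsilon) = 0, \tag{2.8} \]
\[ i\partial_t\psi_-^\varepsilon + \frac{\lambda^\varepsilon}{\varepsilon^2}\psi_-^\varepsilon + \Pi_-^\varepsilon(A\psi^\varepsilon) = 0. \tag{2.9} \]
Obviously, the “PDO projectors” $\Pi_\pm^\varepsilon(D)$ do not commute with the $x$-dependent electromagnetic operator $A(t,x)$. By direct calculation using Lemma 2.1 we obtain
\begin{align}
\Pi_+^\varepsilon(A_k\gamma^0\gamma^k\psi^\varepsilon) &= A_k\gamma^0\gamma^k\Pi_-^\varepsilon\psi^\varepsilon + \varepsilon R_1(A_k\psi^\varepsilon) \tag{2.10} \\
&= A_k\gamma^0\gamma^k\Pi_-^\varepsilon\psi^\varepsilon - i\varepsilon\Big(A_k\partial_k\psi^\varepsilon - \frac{1}{2}\gamma^l\gamma^k(\partial_lA_k)\psi^\varepsilon\Big) + \varepsilon^2R_2(A_k\psi^\varepsilon), \tag{2.11} \\
\Pi_-^\varepsilon(A_k\gamma^0\gamma^k\psi^\varepsilon) &= A_k\gamma^0\gamma^k\Pi_+^\varepsilon\psi^\varepsilon + \varepsilon R_1(A_k\psi^\varepsilon) \tag{2.12} \\
&= A_k\gamma^0\gamma^k\Pi_+^\varepsilon\psi^\varepsilon + i\varepsilon\Big(A_k\partial_k\psi^\varepsilon - \frac{1}{2}\gamma^l\gamma^k(\partial_lA_k)\psi^\varepsilon\Big) + \varepsilon^2R_2(A_k\psi^\varepsilon), \tag{2.13} \\
\Pi_+^\varepsilon(A_0\psi^\varepsilon) &= A_0\Pi_+^\varepsilon\psi^\varepsilon + \varepsilon R_1(A_0\psi^\varepsilon) \tag{2.14} \\
&= A_0\Pi_+^\varepsilon\psi^\varepsilon - \frac{i}{2}\varepsilon\gamma^0\gamma^k(\partial_kA_0)\psi^\varepsilon + \varepsilon^2R_2(A_0\psi^\varepsilon), \tag{2.15} \\
\Pi_-^\varepsilon(A_0\psi^\varepsilon) &= A_0\Pi_-^\varepsilon\psi^\varepsilon + \varepsilon R_1(A_0\psi^\varepsilon) \tag{2.16} \\
&= A_0\Pi_-^\varepsilon\psi^\varepsilon + \frac{i}{2}\varepsilon\gamma^0\gamma^k(\partial_kA_0)\psi^\varepsilon + \varepsilon^2R_2(A_0\psi^\varepsilon), \tag{2.17}
\end{align}
where $R_1$ (resp. $R_2$) stands for unspecified, uniformly (in $\varepsilon$) bounded operators $H^m(\mathbb{R}^3)^4 \to H^{m-1}(\mathbb{R}^3)^4$ (resp. $H^m(\mathbb{R}^3)^4 \to H^{m-2}(\mathbb{R}^3)^4$).

3. The Nonrelativistic Limit

First we perform the rigorous limit $\varepsilon \to 0$, i.e. $c \to \infty$. In this limit the 4-vector of the solution of the Dirac equation (1.9), (1.10) converges to a decoupled pair of 2-vectors obeying Schrödinger-type equations; one for the electron component and one for the positron component with negative mass. In our scaling the magnetic potential does not appear in the limit equations and only the time-dependent electric potential remains. The spin, however, contributes to the current at leading order.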
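The appearance of Schrödinger-type limit equations can be read off the free part of (2.8): once the rest energy $1/\varepsilon^2$ is removed from $\lambda^\varepsilon(\xi)/\varepsilon^2$, the remaining symbol $(\lambda^\varepsilon(\xi) - 1)/\varepsilon^2$ tends to $|\xi|^2/2$, the symbol of $-\frac{1}{2}\Delta$. The sketch below evaluates this symbol in a cancellation-free form; the function names and sample values are assumptions made for illustration.

```python
import math

def kinetic_symbol(eps, xi_sq):
    """(lambda^eps - 1) / eps^2, rewritten as |xi|^2 / (1 + lambda^eps) to avoid cancellation."""
    lam = math.sqrt(1.0 + eps * eps * xi_sq)
    return xi_sq / (1.0 + lam)

def schroedinger_gap(eps, xi_sq):
    """Distance to the Schroedinger symbol |xi|^2 / 2; it is at most eps^2 |xi|^4 / 8."""
    return 0.5 * xi_sq - kinetic_symbol(eps, xi_sq)
```

For fixed $|\xi|$ the gap is $O(\varepsilon^2)$, while for fixed $\varepsilon$ it grows like $|\xi|^4$, which is one way to see why the limit requires additional regularity of the data. For the positron component the same computation applies with the opposite sign, giving $+\frac{1}{2}\Delta$.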
ε We define the “electron” and “positron component”, i.e. ψ+ε and ψ− as in (2.7). In order to be able to perform the nonrelativistic limit, we rescale the time similarily to (1.26), but in this chapter we make the fundamental difference that we use two different signs in the phase factor. Hence we introduce φεe , φεp by
φεe (t, x) := eit/ε ψ+ε (t, x), 2
ε φεp (t, x) := e−it/ε ψ− (t, x). 2
(3.1)
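By subtracting the rest energy for both components, i.e. compensating the negative rest energy of the positrons by adding a $1/\varepsilon^{2}$ term in the positron part of the Dirac equation, we keep the electron and the positron component on an equal footing, which avoids smallness assumptions on the initial data. The approach in Sects. 4 and 5 would allow for a nonrelativistic limit of, e.g., the electron component assuming smallness of the initial positron component, which gives somewhat stronger results (cf. Remark 5.1).

Theorem 3.1. Let the electromagnetic potential $(A_0, A_1, A_2, A_3) \in L^{\infty}((0,T); W^{1,\infty}(\mathbb{R}^3)^4)$ and take $\psi^{\varepsilon}_{I} = \psi^{\varepsilon}(t=0) = O(1) \in H^{1}(\mathbb{R}^3)^4$. Let $\phi^{0}_{eI}$, $\phi^{0}_{pI}$ be accumulation points as $\varepsilon \to 0$ of the sequences $\{\Pi^{\varepsilon}_{+}\psi^{\varepsilon}_{I}\}$, $\{\Pi^{\varepsilon}_{-}\psi^{\varepsilon}_{I}\}$ in the $L^{2}(\mathbb{R}^3)^4$ topology.

(i) Then the nonrelativistic limit of the Dirac equation (1.9), (1.10) for the 4-spinor $\psi^{\varepsilon}(t,x)$ is given by two Schrödinger equations for an electron and a positron component in the following way: Up to extraction of subsequences we have
\[ \phi^{\varepsilon}_{e}(t,x) \xrightarrow{\ \varepsilon\to0\ } \phi^{0}_{e}(t,x) \quad \text{in } L^{2}((0,T)\times\mathbb{R}^3)^4, \tag{3.2} \]
\[ \phi^{\varepsilon}_{p}(t,x) \xrightarrow{\ \varepsilon\to0\ } \phi^{0}_{p}(t,x) \quad \text{in } L^{2}((0,T)\times\mathbb{R}^3)^4, \tag{3.3} \]
where $\phi^{0}_{e,p}(t,x)$ are solutions of
\[ i\partial_t\phi^{0}_{e} = -\frac{1}{2}\Delta\phi^{0}_{e} - A_0\phi^{0}_{e}, \tag{3.4} \]
\[ \phi^{0}_{e}(t=0) = \phi^{0}_{eI}, \tag{3.5} \]
\[ i\partial_t\phi^{0}_{p} = +\frac{1}{2}\Delta\phi^{0}_{p} - A_0\phi^{0}_{p}, \tag{3.6} \]
\[ \phi^{0}_{p}(t=0) = \phi^{0}_{pI}. \tag{3.7} \]

(ii) The relativistic position density converges as follows: Writing $n^{\varepsilon}_{e}(t,x)$ for $|\phi^{\varepsilon}_{e}(t,x)|^{2}$ and $n^{\varepsilon}_{p}(t,x)$ for $|\phi^{\varepsilon}_{p}(t,x)|^{2}$ we have
\[ n^{\varepsilon}_{e}(t,x) \xrightarrow{\ \varepsilon\to0\ } n^{0}_{e}(t,x) = |\phi^{0}_{e}(t,x)|^{2} \quad \text{in } L^{1}((0,T)\times\mathbb{R}^3)^4, \tag{3.8} \]
\[ n^{\varepsilon}_{p}(t,x) \xrightarrow{\ \varepsilon\to0\ } n^{0}_{p}(t,x) = |\phi^{0}_{p}(t,x)|^{2} \quad \text{in } L^{1}((0,T)\times\mathbb{R}^3)^4, \tag{3.9} \]
and the position density (1.11) of the Dirac spinor converges
\[ n^{\varepsilon}(t,x) \xrightarrow{\ \varepsilon\to0\ } \bigl(n^{0}_{e}(t,x) + n^{0}_{p}(t,x)\bigr) \quad \text{in } L^{1}((0,T)\times\mathbb{R}^3)^4. \tag{3.10} \]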
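(iii) Also the relativistic current densities converge to the Schrödinger limit, but only with additional assumptions and in a weak sense: Let $\psi^{\varepsilon}_{I} = \psi^{\varepsilon}(t=0) = O(1) \in H^{2}(\mathbb{R}^3)^4$ in addition to the above assumptions. Then the current density (1.12) of the Dirac spinor converges as follows: Let
\[ J^{0}_{k}(t,x) := \operatorname{Im}(\phi^{0}_{e}\cdot\partial_k\phi^{0}_{e}) + \operatorname{Im}(\phi^{0}_{p}\cdot\partial_k\phi^{0}_{p}) + \operatorname{curl}(\phi^{0}_{e}\cdot\vec{S}\phi^{0}_{e})_k + \operatorname{curl}(\phi^{0}_{p}\cdot\vec{S}\phi^{0}_{p})_k, \tag{3.11} \]
where the notation $\operatorname{curl}(\phi\cdot\vec{S}\phi)_k$ stands for the $k$-th component of the curl of the 3-vector $(\phi\cdot S^{1}\phi,\ \phi\cdot S^{2}\phi,\ \phi\cdot S^{3}\phi)^{T}$, which is the well-known divergence-free additional term in the current due to the interaction of spin and magnetic field [LL3]. Then
\[ J^{\varepsilon}_{k}(t,x) \xrightarrow{\ \varepsilon\to0\ } J^{0}_{k}(t,x) \quad \text{in } C^{1}{}'\bigl((0,T);\,C^{0}{}'(\mathbb{R}^3)^4\ \text{weak-}{*}\bigr)\ \text{weak-}{*}. \tag{3.12} \]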
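Remark 3.1. Clearly, the above 4-vectors $\phi^{0}_{eI}$, $\phi^{0}_{e}$ and $\phi^{0}_{pI}$, $\phi^{0}_{p}$ have two zeros in the upper and the lower half, resp. Hence we can work with the 2-vectors $\varphi^{0}_{e}$, $\varphi^{0}_{p}$ obtained by skipping these zero components (cp. (1.25)). The resulting equations are the same as (3.4)–(3.12), besides the use of the Pauli matrices $\sigma^{m}$ instead of their $4\times4$ extensions $S^{m}$ in (3.12).

Proof. A straightforward calculation starting from (2.8), (2.9) using (2.10), (2.12), (2.14), (2.16) gives the following equations for $\phi^{\varepsilon}_{e}$, $\phi^{\varepsilon}_{p}$ defined in (3.1):
\[ i\partial_t\phi^{\varepsilon}_{e} + \frac{\lambda^{\varepsilon}(D)-1}{\varepsilon^{2}}\,\phi^{\varepsilon}_{e} + A_k\gamma^{k}\phi^{\varepsilon}_{p}\,e^{-2it/\varepsilon^{2}} + A_0\phi^{\varepsilon}_{e} = R^{\varepsilon}_{+}\,e^{-it/\varepsilon^{2}}, \tag{3.13} \]
\[ i\partial_t\phi^{\varepsilon}_{p} + \frac{1-\lambda^{\varepsilon}(D)}{\varepsilon^{2}}\,\phi^{\varepsilon}_{p} + A_k\gamma^{k}\phi^{\varepsilon}_{e}\,e^{+2it/\varepsilon^{2}} + A_0\phi^{\varepsilon}_{p} = R^{\varepsilon}_{-}\,e^{-it/\varepsilon^{2}}, \tag{3.14} \]
where $R^{\varepsilon}_{\pm} \xrightarrow{\ \varepsilon\to0\ } 0$ in $L^{\infty}((0,T), L^{2}(\mathbb{R}^3)^4)$.

Since $0 \le \frac{\lambda^{\varepsilon}(\xi)-1}{\varepsilon^{2}} \le |\xi|^{2}$ and $\phi^{\varepsilon}_{e,p} \in L^{\infty}((0,T); H^{1}(\mathbb{R}^3)^4)$ it is clear that $\frac{1-\lambda^{\varepsilon}(D)}{\varepsilon^{2}}\,\phi^{\varepsilon}_{e,p}$ is bounded in $L^{\infty}((0,T), H^{-1}(\mathbb{R}^3)^4)$ and hence also $\partial_t\phi^{\varepsilon}_{e,p}$ is uniformly bounded in $L^{\infty}((0,T), H^{-1}(\mathbb{R}^3)^4)$. From Aubin's Lemma we conclude that $\{\phi^{\varepsilon}_{e,p}\}_{\varepsilon>0}$ is in a compact set in $L^{2}((0,T)\times\mathbb{R}^3)^4$ and (3.2), (3.3) hold. Therefore
\[ A_k\gamma^{k}\phi^{\varepsilon}_{e,p}\,e^{\mp 2it/\varepsilon^{2}} \ \overset{\varepsilon\to0}{\rightharpoonup}\ 0 \quad \text{in } L^{2}((0,T)\times\mathbb{R}^3)^4 \text{ weakly.} \tag{3.15} \]
For any compact set $K$ in $\mathbb{R}^3$ we have
\[ \pm\frac{\lambda^{\varepsilon}(\xi)-1}{\varepsilon^{2}}\,\widehat{\phi^{\varepsilon}_{e,p}}(\xi) \xrightarrow{\ \varepsilon\to0\ } \pm\frac{|\xi|^{2}}{2}\,\widehat{\phi^{0}_{e,p}}(\xi) \quad \text{in } L^{2}((0,T)\times K)^4. \tag{3.16} \]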
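Together with the $L^{\infty}((0,T); H^{1}(\mathbb{R}^3)^4)$ bound we hence obtain for $s > 1$,
\[ \pm\frac{\lambda^{\varepsilon}(D)-1}{\varepsilon^{2}}\,\phi^{\varepsilon}_{e,p} \xrightarrow{\ \varepsilon\to0\ } \pm\frac{1}{2}\Delta\phi^{0}_{e,p} \quad \text{in } L^{2}((0,T); H^{-s}(\mathbb{R}^3)^4), \tag{3.17} \]
which allows us to finally conclude the limit in the equations. The convergence of the position densities in (ii) follows immediately from (i).

In order to prove (3.12) we plug the decomposition $\psi^{\varepsilon} = \Pi^{\varepsilon}_{+}\psi^{\varepsilon} + \Pi^{\varepsilon}_{-}\psi^{\varepsilon}$ into (1.11), (1.12) and obtain
\[ J^{\varepsilon}_{k} = \frac{1}{\varepsilon}\,\gamma^{0}\gamma^{k}\psi^{\varepsilon}_{+}\cdot\psi^{\varepsilon}_{+} + \frac{1}{\varepsilon}\,\gamma^{0}\gamma^{k}\psi^{\varepsilon}_{-}\cdot\psi^{\varepsilon}_{-} + \frac{2}{\varepsilon}\operatorname{Re}(\gamma^{0}\gamma^{k}\psi^{\varepsilon}_{+}\cdot\psi^{\varepsilon}_{-}). \tag{3.18} \]
From Lemma 2.1 we obtain
\[ \frac{1}{\varepsilon}\,\Pi^{\varepsilon}_{+}(D) = \frac{1}{\varepsilon}\,\Pi^{0}_{+} - \frac{i}{2}\,\gamma^{0}\gamma^{k}\partial_k + \varepsilon R_2, \tag{3.19} \]
where $R_2$ is a uniformly bounded operator $L^{\infty}((0,T); H^{2}(\mathbb{R}^3)^4) \to L^{\infty}((0,T); L^{2}(\mathbb{R}^3)^4)$. Therefore we can rewrite the first term in (3.18) as
\[ \frac{1}{\varepsilon}\,\gamma^{0}\gamma^{k}\Pi^{0}_{+}\psi^{\varepsilon}\cdot\Pi^{0}_{+}\psi^{\varepsilon} - \frac{i}{2}\,\gamma^{0}\gamma^{k}\gamma^{0}\gamma^{l}\partial_l\psi^{\varepsilon}\cdot\Pi^{0}_{+}\psi^{\varepsilon} + \frac{i}{2}\,\gamma^{0}\gamma^{k}\Pi^{0}_{+}\psi^{\varepsilon}\cdot\gamma^{0}\gamma^{l}\partial_l\psi^{\varepsilon} + \varepsilon r, \tag{3.20} \]
where $r$ is a uniformly bounded $L^{\infty}((0,T); L^{1}(\mathbb{R}^3)^4)$ function due to our $H^{2}$-assumption on the initial data. Since $\gamma^{0}\gamma^{k}\Pi^{0}_{+}\psi^{\varepsilon}$ is orthogonal to $\Pi^{0}_{+}\psi^{\varepsilon}$, the leading order term is $O(1)$ in $L^{\infty}((0,T); L^{1}(\mathbb{R}^3)^4)$, given by the diagonals and off-diagonals of the second and third term in (3.20),
\[ -\frac{i}{2}\,\partial_k\psi^{\varepsilon}\cdot\Pi^{0}_{+}\psi^{\varepsilon} + \frac{i}{2}\,\Pi^{0}_{+}\psi^{\varepsilon}\cdot\partial_k\psi^{\varepsilon} - \frac{i}{2}\sum_{l\neq k}\gamma^{k}\gamma^{l}\partial_l\psi^{\varepsilon}\cdot\Pi^{0}_{+}\psi^{\varepsilon} + \frac{i}{2}\sum_{l\neq k}\gamma^{l}\gamma^{k}\Pi^{0}_{+}\psi^{\varepsilon}\cdot\partial_l\psi^{\varepsilon}. \tag{3.21} \]
Now we can replace both $\psi^{\varepsilon}$ and $\Pi^{0}_{+}\psi^{\varepsilon}$ by $\psi^{\varepsilon}_{+}$ with a total remainder term $\varepsilon r$, where $r$ is a uniformly (w.r.t. $\varepsilon$) bounded function in $L^{\infty}((0,T); L^{1}(\mathbb{R}^3)^4)$. Hence we have
\[ \frac{1}{\varepsilon}\,\gamma^{0}\gamma^{k}\psi^{\varepsilon}_{+}\cdot\psi^{\varepsilon}_{+} = \operatorname{Im}(\psi^{\varepsilon}_{+}\cdot\partial_k\psi^{\varepsilon}_{+}) + \operatorname{Im}\sum_{l\neq k}\gamma^{k}\gamma^{l}\partial_l\psi^{\varepsilon}_{+}\cdot\psi^{\varepsilon}_{+} + \varepsilon r, \tag{3.22} \]
where the $\gamma^{k}\gamma^{l}$-term can be brought to the curl-form using (1.6). The second term (i.e. the positron term) in (3.18) can be treated in complete analogy.

Replacing $\psi^{\varepsilon}_{\pm}$ by $\phi^{\varepsilon}_{e,p}$ as defined in (3.1) gives a rapidly oscillating phase factor in the third (“mixed”) term in (3.18) whereas the first (“electron”) and the second (“positron”) term keep their form:
\[ J^{\varepsilon}_{k} = \operatorname{Im}(\phi^{\varepsilon}_{e}\cdot\partial_k\phi^{\varepsilon}_{e}) + \operatorname{Im}(\phi^{\varepsilon}_{p}\cdot\partial_k\phi^{\varepsilon}_{p}) + \operatorname{curl}(\phi^{\varepsilon}_{e}\cdot\vec{S}\phi^{\varepsilon}_{e})_k + \operatorname{curl}(\phi^{\varepsilon}_{p}\cdot\vec{S}\phi^{\varepsilon}_{p})_k + \varepsilon r + \frac{2}{\varepsilon}\operatorname{Re}\bigl(\gamma^{0}\gamma^{k}\phi^{\varepsilon}_{e}\cdot\phi^{\varepsilon}_{p}\,e^{2it/\varepsilon^{2}}\bigr), \tag{3.23} \]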
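where $r$ is a uniformly bounded $L^{\infty}((0,T); L^{1}(\mathbb{R}^3)^4)$ function. By integration by parts in time and using test functions in $C^{1}((0,T); C^{0}(\mathbb{R}^3)^4)$ we can show that
\[ \frac{2}{\varepsilon}\operatorname{Re}\bigl(\gamma^{0}\gamma^{k}\phi^{\varepsilon}_{e}\cdot\phi^{\varepsilon}_{p}\,e^{2it/\varepsilon^{2}}\bigr) \ \overset{\varepsilon\to0}{\rightharpoonup}\ 0 \quad \text{in } C^{1}{}'\bigl((0,T);\,C^{0}{}'(\mathbb{R}^3)^4\ \text{weak-}{*}\bigr)\ \text{weak-}{*}, \tag{3.24} \]
and we can finally conclude the convergence (3.12) in the limit $\varepsilon \to 0$ of (3.23).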
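The mechanism behind (3.24) can be sketched in a scalar model case. The following is only an illustration under simplifying assumptions that are not taken from the proof: $g^{\varepsilon}$ is a hypothetical scalar function standing in for the mixed term $\gamma^{0}\gamma^{k}\phi^{\varepsilon}_{e}\cdot\phi^{\varepsilon}_{p}$ after integration against a spatial test function, assumed to be uniformly bounded in $W^{1,1}(0,T)$, and $\theta \in C^{1}([0,T])$ is a test function vanishing at $t=0$ and $t=T$.

```latex
% Model computation: one integration by parts in time gains a factor
% \varepsilon^2, which beats the prefactor 2/\varepsilon.
% g^\varepsilon and \theta are hypothetical stand-ins (see the lead-in).
\begin{align*}
\frac{2}{\varepsilon}\, e^{2it/\varepsilon^{2}}
  &= \frac{\varepsilon}{i}\,\partial_t\, e^{2it/\varepsilon^{2}},\\
\int_0^T \theta(t)\,\frac{2}{\varepsilon}\, e^{2it/\varepsilon^{2}}\, g^{\varepsilon}(t)\,dt
  &= -\frac{\varepsilon}{i}\int_0^T e^{2it/\varepsilon^{2}}\,
     \partial_t\bigl(\theta(t)\, g^{\varepsilon}(t)\bigr)\,dt,\\
\Bigl|\int_0^T \theta(t)\,\frac{2}{\varepsilon}\, e^{2it/\varepsilon^{2}}\, g^{\varepsilon}(t)\,dt\Bigr|
  &\le \varepsilon\,\|\theta\|_{C^{1}}\,\|g^{\varepsilon}\|_{W^{1,1}(0,T)}
  \;\xrightarrow{\ \varepsilon\to0\ }\; 0 .
\end{align*}
```

The time derivative falls on $g^{\varepsilon}$, so a uniform bound on $\partial_t g^{\varepsilon}$ is needed. This is presumably where the additional $H^{2}$ assumption on the initial data and the $C^{1}$ regularity in time of the test functions enter, although that reading is an interpretation and not a statement from the proof.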
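4. Estimates for the “Small Component”

We now define (cp. [LL4a, IZ]) the “electron-large component” $\phi^{\varepsilon}_{+}$ and the “positron-small component” $\phi^{\varepsilon}_{-}$ and rigorously justify the terms “large vs. small”. We start from the “electron” component $\psi_{+}$ and the “positron” component $\psi_{-}$ as defined via the PDO-projectors $\Pi^{\varepsilon}_{\pm}$ in (2.7). As in (1.26) we “factor out” the rest energy $m_0c^{2}$, i.e., $1/\varepsilon^{2}$ in our scaling:
\[ \phi^{\varepsilon}(t,x) := e^{it/\varepsilon^{2}}\,\psi^{\varepsilon}(t,x) = \phi^{\varepsilon}_{+}(t,x) + \phi^{\varepsilon}_{-}(t,x) \tag{4.1} \]
with
\[ \phi^{\varepsilon}_{+}(t,x) := \Pi^{\varepsilon}_{+}(D)\phi^{\varepsilon}(t,x) = e^{it/\varepsilon^{2}}\,\psi^{\varepsilon}_{+}(t,x), \qquad \phi^{\varepsilon}_{-}(t,x) := \Pi^{\varepsilon}_{-}(D)\phi^{\varepsilon}(t,x) = e^{it/\varepsilon^{2}}\,\psi^{\varepsilon}_{-}(t,x). \tag{4.2} \]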
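Note that the use of the same sign in the exponential means that we subtract the positive rest energy of the electron component and add the negative rest energy of the positron component (as apparent e.g. in (4.8)). This is the basis of any “large” vs. “small” component approach. Since $\phi^{\varepsilon}$ differs from $\psi^{\varepsilon}$ by a mere phase factor in time, we can (in any case concerning the signs in these phase factors) immediately transcribe Lemma 2.2:

Proposition 4.1. Let $A = (A_0, A_1, A_2, A_3) \in L^{\infty}((0,T); W^{m,\infty}(\mathbb{R}^3)^4)$ and let $\psi^{\varepsilon}_{I}$ belong to a bounded set of $H^{m}(\mathbb{R}^3)^4$, $m \in \mathbb{N}$. Then $\phi^{\varepsilon}(t,x)$ satisfies
\[ \|\phi^{\varepsilon}\|_{L^{\infty}((0,T);H^{m}(\mathbb{R}^3)^4)} = \|\psi^{\varepsilon}\|_{L^{\infty}((0,T);H^{m}(\mathbb{R}^3)^4)} \le C, \tag{4.3} \]
where $C$ is independent of $\varepsilon$, and initially we have, of course,
\[ \phi^{\varepsilon}(t=0,x) = \psi^{\varepsilon}_{I}(x). \tag{4.4} \]
The “split Dirac equation” for $\phi^{\varepsilon}_{+}$, $\phi^{\varepsilon}_{-}$ is readily obtained from (2.8), (2.9):
\[ i\partial_t\phi^{\varepsilon}_{+} - \frac{\lambda^{\varepsilon}-1}{\varepsilon^{2}}\,\phi^{\varepsilon}_{+} + \Pi^{\varepsilon}_{+}(A\phi^{\varepsilon}) = 0, \tag{4.5} \]
\[ i\partial_t\phi^{\varepsilon}_{-} + \frac{\lambda^{\varepsilon}+1}{\varepsilon^{2}}\,\phi^{\varepsilon}_{-} + \Pi^{\varepsilon}_{-}(A\phi^{\varepsilon}) = 0. \tag{4.6} \]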
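The step from (2.8), (2.9) to (4.5), (4.6) is a one-line phase computation. The sketch below assumes that (2.8), (2.9), which are not reproduced in this section, have the form $i\partial_t\psi^{\varepsilon}_{\pm} \mp \varepsilon^{-2}\lambda^{\varepsilon}(D)\psi^{\varepsilon}_{\pm} + \Pi^{\varepsilon}_{\pm}(A\psi^{\varepsilon}) = 0$; this form is inferred backwards from (4.5), (4.6) and should be checked against Sect. 2.

```latex
% Assumed form of (2.8), (2.9) (inferred, not quoted from the source):
%   i\partial_t\psi_\pm \mp \varepsilon^{-2}\lambda^\varepsilon(D)\psi_\pm
%     + \Pi_\pm(A\psi) = 0.
% The scalar phase e^{it/\varepsilon^2} commutes with \lambda^\varepsilon(D),
% \Pi^\varepsilon_\pm and multiplication by A.
\begin{align*}
\phi^{\varepsilon}_{\pm} = e^{it/\varepsilon^{2}}\psi^{\varepsilon}_{\pm}
&\quad\Longrightarrow\quad
e^{it/\varepsilon^{2}}\, i\partial_t\psi^{\varepsilon}_{\pm}
 = i\partial_t\phi^{\varepsilon}_{\pm} + \frac{1}{\varepsilon^{2}}\,\phi^{\varepsilon}_{\pm},\\
i\partial_t\phi^{\varepsilon}_{+} + \frac{1}{\varepsilon^{2}}\,\phi^{\varepsilon}_{+}
 - \frac{\lambda^{\varepsilon}}{\varepsilon^{2}}\,\phi^{\varepsilon}_{+}
 + \Pi^{\varepsilon}_{+}(A\phi^{\varepsilon}) = 0
&\quad\Longleftrightarrow\quad (4.5),\\
i\partial_t\phi^{\varepsilon}_{-} + \frac{1}{\varepsilon^{2}}\,\phi^{\varepsilon}_{-}
 + \frac{\lambda^{\varepsilon}}{\varepsilon^{2}}\,\phi^{\varepsilon}_{-}
 + \Pi^{\varepsilon}_{-}(A\phi^{\varepsilon}) = 0
&\quad\Longleftrightarrow\quad (4.6).
\end{align*}
```

Under this assumption the same phase shifts the electron symbol from $\lambda^{\varepsilon}$ to $\lambda^{\varepsilon}-1$ and the positron symbol from $\lambda^{\varepsilon}$ to $\lambda^{\varepsilon}+1$.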
Since we have used the same sign for the phase factors in the transformation (4.2) we have broken the symmetry between electrons and positrons. This asymmetry of (4.5) and (4.6) becomes immediately transparent in the term $\frac{2}{\varepsilon^{2}}\,\mathrm{Id}$ (twice the positron's rest energy) in the following
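Lemma 4.1. For the pseudodifferential operator associated to $\lambda^{\varepsilon}(\xi)$ as given by (1.20) we have
\[ \frac{\lambda^{\varepsilon}(D)-1}{\varepsilon^{2}} = \frac{1}{2}\Delta + \varepsilon^{2}R, \tag{4.7} \]
\[ \frac{\lambda^{\varepsilon}(D)+1}{\varepsilon^{2}} = \frac{2}{\varepsilon^{2}}\,\mathrm{Id} + \frac{1}{2}\Delta + \varepsilon^{2}R, \tag{4.8} \]
where $R$ stands for an unspecified uniformly (w.r.t. $\varepsilon$) bounded operator $H^{m}(\mathbb{R}^3)^4 \to H^{m-4}(\mathbb{R}^3)^4$.

Proof. Follows immediately from the inequalities for $\lambda^{\varepsilon}(\xi)$
\[ -\frac{3}{8}\,\varepsilon^{2}|\xi|^{4} \le \frac{\lambda^{\varepsilon}(\xi)-1}{\varepsilon^{2}} - \frac{|\xi|^{2}}{2} \le 0 \qquad\text{and}\qquad -\frac{3}{8}\,\varepsilon^{2}|\xi|^{4} \le \frac{\lambda^{\varepsilon}(\xi)+1}{\varepsilon^{2}} - \frac{2}{\varepsilon^{2}} - \frac{|\xi|^{2}}{2} \le 0. \]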
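The inequalities in the proof can be checked by an elementary expansion. The sketch assumes that (1.20), which is not reproduced here, defines $\lambda^{\varepsilon}(\xi) = \sqrt{1+\varepsilon^{2}|\xi|^{2}}$; the variable $x$ is introduced only for this illustration.

```latex
% Assumption: \lambda^\varepsilon(\xi) = \sqrt{1+\varepsilon^2|\xi|^2},
% and x := \varepsilon^2 |\xi|^2 \ge 0 (notation used only here).
\begin{align*}
1 + \frac{x}{2} - \frac{x^{2}}{8} \;\le\; \sqrt{1+x} \;&\le\; 1 + \frac{x}{2}
\qquad (x \ge 0),\\
-\frac{1}{8}\,\varepsilon^{2}|\xi|^{4}
 \;\le\; \frac{\lambda^{\varepsilon}(\xi)-1}{\varepsilon^{2}} - \frac{|\xi|^{2}}{2}
 \;&\le\; 0,\\
\frac{\lambda^{\varepsilon}(\xi)+1}{\varepsilon^{2}} - \frac{2}{\varepsilon^{2}}
 \;&=\; \frac{\lambda^{\varepsilon}(\xi)-1}{\varepsilon^{2}} .
\end{align*}
```

Under this assumption the constant $3/8$ in the proof is not sharp but certainly sufficient, and the second inequality of the proof reduces to the first by the last line.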
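The first crucial estimate on the “positron-small component” $\phi^{\varepsilon}_{-}$ defined in (4.2) is

Proposition 4.2. Let $A = (A_0, A_1, A_2, A_3) \in W^{1,\infty}((0,T); L^{\infty}(\mathbb{R}^3)^4)$ and take $\psi^{\varepsilon}_{I} = \psi^{\varepsilon}(t=0) = \phi^{\varepsilon}(t=0) = O(1) \in H^{2}(\mathbb{R}^3)^4$. Then $\partial_t\phi^{\varepsilon}_{-}(t,x) \in L^{\infty}((0,T); L^{2}(\mathbb{R}^3)^4)$ with the bound
\[ \|\partial_t\phi^{\varepsilon}_{-}(t)\|_{L^{2}(\mathbb{R}^3)^4} \le C_T\,\|A\|_{W^{1,\infty}((0,T);L^{\infty}(\mathbb{R}^3)^4)}\,\|\psi^{\varepsilon}_{I}\|_{H^{2}(\mathbb{R}^3)^4} + \|\partial_t\phi^{\varepsilon}_{-}(t=0)\|_{L^{2}(\mathbb{R}^3)^4}, \tag{4.9} \]
where $C_T$ is an $\varepsilon$-independent constant depending on the time interval $(0,T)$.

Remark 4.1. Note that $\partial_t\phi^{\varepsilon}_{-}(t=0) \in L^{2}(\mathbb{R}^3)^4$ follows immediately from the assumptions on $\psi^{\varepsilon}_{I}$ and $A$ via Eq. (4.6).

Proof. We differentiate (4.6) with respect to $t$, multiply by $\partial_t\phi^{\varepsilon}_{-}$, take imaginary parts and integrate with respect to $x$. The second term vanishes since $\frac{\lambda^{\varepsilon}(D)+1}{\varepsilon^{2}}$ is self-adjoint and we obtain
\[ \frac{d}{dt}\,\|\partial_t\phi^{\varepsilon}_{-}(t)\|^{2}_{L^{2}(\mathbb{R}^3)^4} = \operatorname{Im}\Bigl(\int_{\mathbb{R}^3}\Pi^{\varepsilon}_{-}\bigl((\partial_tA)\phi^{\varepsilon}\bigr)\cdot\partial_t\phi^{\varepsilon}_{-}\,dx + \int_{\mathbb{R}^3}\Pi^{\varepsilon}_{-}(A\,\partial_t\phi^{\varepsilon})\cdot\partial_t\phi^{\varepsilon}_{-}\,dx\Bigr). \tag{4.10} \]
We split $\phi^{\varepsilon} = \phi^{\varepsilon}_{+} + \phi^{\varepsilon}_{-}$ and since $(\Pi^{\varepsilon}_{-})^{2} = \Pi^{\varepsilon}_{-}$ and $A$ is self-adjoint
\[ \int_{\mathbb{R}^3}\Pi^{\varepsilon}_{-}(A\,\partial_t\phi^{\varepsilon}_{-})\cdot\partial_t\phi^{\varepsilon}_{-}\,dx = \int_{\mathbb{R}^3}A\,\partial_t\phi^{\varepsilon}_{-}\cdot\partial_t\phi^{\varepsilon}_{-}\,dx \in \mathbb{R}. \]
For the $\phi^{\varepsilon}_{+}$-term in this third summand in (4.10) we have
\[ \operatorname{Im}\int_{\mathbb{R}^3}\Pi^{\varepsilon}_{-}(A\,\partial_t\phi^{\varepsilon}_{+})\cdot\partial_t\phi^{\varepsilon}_{-}\,dx = -\operatorname{Re}\int_{\mathbb{R}^3}\Pi^{\varepsilon}_{-}A(i\partial_t\phi^{\varepsilon}_{+})\cdot\partial_t\phi^{\varepsilon}_{-}\,dx = -\operatorname{Re}\int_{\mathbb{R}^3}A\Bigl(\frac{\lambda^{\varepsilon}-1}{\varepsilon^{2}}\,\phi^{\varepsilon}_{+} - \Pi^{\varepsilon}_{+}(A\phi^{\varepsilon})\Bigr)\cdot\partial_t\phi^{\varepsilon}_{-}\,dx. \]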
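Hence we can estimate
\[ \Bigl|\operatorname{Im}\int_{\mathbb{R}^3}\Pi^{\varepsilon}_{-}(A\,\partial_t\phi^{\varepsilon}_{+})\cdot\partial_t\phi^{\varepsilon}_{-}\,dx\Bigr| \le C\Bigl(\|A(t)\|_{L^{\infty}(\mathbb{R}^3)^4}\,\|\phi^{\varepsilon}(t)\|_{H^{2}(\mathbb{R}^3)^4} + \|A(t)\|^{2}_{L^{\infty}(\mathbb{R}^3)^4}\,\|\phi^{\varepsilon}(t)\|_{L^{2}(\mathbb{R}^3)^4}\Bigr)\,\|\partial_t\phi^{\varepsilon}_{-}(t)\|_{L^{2}(\mathbb{R}^3)^4}. \]
For the first integral in the imaginary part in (4.10) we use again the self-adjointness and idempotency of $\Pi^{\varepsilon}_{-}$ and obtain
\[ \Bigl|\operatorname{Im}\int_{\mathbb{R}^3}\partial_tA\,\phi^{\varepsilon}(t)\cdot\partial_t\phi^{\varepsilon}_{-}\,dx\Bigr| \le \|\partial_tA(t)\|_{L^{\infty}(\mathbb{R}^3)^4}\,\|\phi^{\varepsilon}(t)\|_{L^{2}(\mathbb{R}^3)^4}\,\|\partial_t\phi^{\varepsilon}_{-}(t)\|_{L^{2}(\mathbb{R}^3)^4}. \]
Thus (4.10) gives
\[ \frac{d}{dt}\,\|\partial_t\phi^{\varepsilon}_{-}(t)\|_{L^{2}(\mathbb{R}^3)^4} \le C\,\|A\|_{W^{1,\infty}((0,T);L^{\infty}(\mathbb{R}^3)^4)}\,\|\phi^{\varepsilon}(t)\|_{H^{2}(\mathbb{R}^3)^4} \]
and we can conclude (4.9) by integrating over time and Lemma 4.1.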
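Lemma 2.2 on the “conservation of regularity” of the Dirac equation holds also for $\phi^{\varepsilon}$ as introduced in (4.2) in the following sense:

Lemma 4.2. For $m \in \mathbb{N}$, let $A = (A_0, A_1, A_2, A_3) \in W^{1,\infty}((0,T); W^{m,\infty}(\mathbb{R}^3)^4)$, $\psi^{\varepsilon}_{I} = \phi^{\varepsilon}(t=0) = O(1) \in H^{m+1}(\mathbb{R}^3)^4$. Then the solution $\phi^{\varepsilon}(t,x) = \phi^{\varepsilon}_{+}(t,x) + \phi^{\varepsilon}_{-}(t,x)$ (cp. (4.1)) of the “split Dirac equation” (4.5), (4.6) satisfies
\[ \partial_t\phi^{\varepsilon} \in L^{\infty}((0,T); H^{m}(\mathbb{R}^3)^4) \tag{4.11} \]
and
\[ \|\phi^{\varepsilon}\|_{W^{1,\infty}((0,T);H^{m}(\mathbb{R}^3)^4)} \le C_T\,\|A\|_{W^{1,\infty}((0,T);W^{m,\infty}(\mathbb{R}^3)^4)}\,\|\psi^{\varepsilon}_{I}\|_{H^{m}(\mathbb{R}^3)^4} + \|\partial_t\phi^{\varepsilon}(t=0)\|_{H^{m}(\mathbb{R}^3)^4}, \tag{4.12} \]
where $C_T$ is an $\varepsilon$-independent constant depending on the time interval $(0,T)$. The same estimates hold for the components $\phi^{\varepsilon}_{+}$, $\phi^{\varepsilon}_{-}$, of course.

Proof. We proceed by induction as in the proof of Lemma 2.2. We have (4.9) and the analogous estimate for $\partial_t\phi^{\varepsilon}_{+}$ follows by exactly the same proof using the self-adjointness of $\frac{1-\lambda^{\varepsilon}(D)}{\varepsilon^{2}}$. Hence the result holds for $m = 0$. We take (4.5), (4.6), differentiate with respect to $t$ and apply $D^{m}$, multiply by $\partial_tD^{m}\phi^{\varepsilon}_{\pm}$, take imaginary parts and integrate with respect to $x$. We use again that $\frac{\lambda^{\varepsilon}(D)\pm1}{\varepsilon^{2}}$ is self-adjoint and we obtain by using again the idempotency of $\Pi^{\varepsilon}_{\pm}$:
\[ \frac{d}{dt}\,\|\partial_tD^{m}\phi^{\varepsilon}_{\pm}\|^{2}_{L^{2}(\mathbb{R}^3)^4} = \operatorname{Im}\int_{\mathbb{R}^3}D^{m}(\partial_tA\,\phi^{\varepsilon})\cdot\partial_tD^{m}\phi^{\varepsilon}_{\pm}\,dx + \operatorname{Im}\int_{\mathbb{R}^3}D^{m}(A\,\partial_t\phi^{\varepsilon})\cdot\partial_tD^{m}\phi^{\varepsilon}_{\pm}\,dx. \tag{4.13} \]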
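We can estimate the two terms on the r.h.s. as in the proof of Lemma 2.2 and finally get
\[ \frac{d}{dt}\,\|\partial_tD^{m}\phi^{\varepsilon}_{\pm}(t)\|_{L^{2}(\mathbb{R}^3)^4} \le C\,\|A\|_{W^{1,\infty}((0,T);W^{m,\infty}(\mathbb{R}^3)^4)}\Bigl(\|\phi^{\varepsilon}(t)\|_{H^{m}(\mathbb{R}^3)^4} + \|\partial_t\phi^{\varepsilon}(t)\|_{H^{m-1}(\mathbb{R}^3)^4}\Bigr). \]
Using Proposition 4.1 and the induction assumption for $m-1$ and integrating with respect to time we hence obtain
\[ \|\partial_tD^{m}\phi^{\varepsilon}_{\pm}(t)\|_{L^{2}(\mathbb{R}^3)^4} \le C\cdot t\,\|A\|_{W^{1,\infty}((0,T);W^{m,\infty}(\mathbb{R}^3)^4)}\Bigl(\|\psi^{\varepsilon}_{I}\|_{H^{m}(\mathbb{R}^3)^4} + \|\partial_t\phi^{\varepsilon}(t=0)\|_{H^{m-1}(\mathbb{R}^3)^4}\Bigr) + \|\partial_tD^{m}\phi^{\varepsilon}(t=0)\|_{L^{2}(\mathbb{R}^3)^4}. \]
As in Remark 4.1 we can conclude $\partial_t\phi^{\varepsilon}_{-}(t=0) \in H^{m}(\mathbb{R}^3)^4$ from the assumptions on $\psi^{\varepsilon}_{I}$ and $A$ directly from Eq. (4.6). Since $\|\phi^{\varepsilon}(t)\|^{2}_{H^{m}(\mathbb{R}^3)^4} = \|\phi^{\varepsilon}_{+}(t)\|^{2}_{H^{m}(\mathbb{R}^3)^4} + \|\phi^{\varepsilon}_{-}(t)\|^{2}_{H^{m}(\mathbb{R}^3)^4}$ we can conclude.

The following proposition is the rigorous justification of the name “small” component for $\phi^{\varepsilon}_{-}$ and can be interpreted as saying that “very few” positrons stay “very few” in the finite time evolution governed by the Dirac equation (1.7). Note that our definition (4.2) of the “small” component gives $O(\varepsilon^{2})$ whereas the usual definition (1.26) gives only $O(\varepsilon)$.

Proposition 4.3. Let $A = (A_0, A_1, A_2, A_3) \in W^{1,\infty}((0,T); W^{2,\infty}(\mathbb{R}^3)^4)$ and take $\psi^{\varepsilon}_{I} = \psi^{\varepsilon}(t=0) = \phi^{\varepsilon}(t=0) = O(1) \in H^{3}(\mathbb{R}^3)^4$ and $\Pi^{\varepsilon}_{-}\psi^{\varepsilon}_{I} = \phi^{\varepsilon}_{-}(t=0) = O(\varepsilon^{2}) \in H^{2}(\mathbb{R}^3)^4$. Then
\[ \|\phi^{\varepsilon}_{-}(t)\|_{H^{2}(\mathbb{R}^3)^4} = O(\varepsilon^{2}), \qquad \forall t \in (0,T). \tag{4.14} \]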
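Remark 4.2. As in Remark 4.1 we can conclude that $\partial_t\phi^{\varepsilon}(t=0) = O(1) \in L^{2}(\mathbb{R}^3)^4$ and from Lemma 4.2 we have
\[ \|\phi^{\varepsilon}\|_{W^{1,\infty}((0,T);H^{2}(\mathbb{R}^3)^4)} = O(1). \tag{4.15} \]

Proof. From (4.6) we obtain $(\lambda^{\varepsilon}+1)\phi^{\varepsilon}_{-} = -\varepsilon^{2}\bigl(i\partial_t\phi^{\varepsilon}_{-} + \Pi^{\varepsilon}_{-}(A\phi^{\varepsilon})\bigr)$. Hence (4.8) gives
\[ \|\phi^{\varepsilon}_{-}(t)\|_{L^{2}(\mathbb{R}^3)^4} \le \varepsilon^{2}C\Bigl(\|\partial_t\phi^{\varepsilon}_{-}(t)\|_{L^{2}(\mathbb{R}^3)^4} + \|A(t)\|_{L^{\infty}(\mathbb{R}^3)^4}\,\|\phi^{\varepsilon}(t)\|_{L^{2}(\mathbb{R}^3)^4}\Bigr). \tag{4.16} \]
We take (4.6) for $\Delta\phi^{\varepsilon}_{-}$ and use $\lambda^{\varepsilon}(D)+1 \ge 2\,\mathrm{Id}$. From (4.12) of Lemma 4.2 for $m = 2$ we obtain
\[ \|\Delta\phi^{\varepsilon}_{-}(t)\|_{L^{2}(\mathbb{R}^3)^4} \le \varepsilon^{2}C\Bigl(\|\partial_t\Delta\phi^{\varepsilon}_{-}(t)\|_{L^{2}(\mathbb{R}^3)^4} + \|A(t)\|_{W^{2,\infty}(\mathbb{R}^3)^4}\,\|\phi^{\varepsilon}(t)\|_{H^{2}(\mathbb{R}^3)^4}\Bigr). \tag{4.17} \]
Together with (4.16) we immediately have (4.14).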
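The passage from (4.6) to (4.16) can be spelled out on the Fourier side. The sketch below assumes $\lambda^{\varepsilon}(\xi) \ge 1$ (as would hold for $\lambda^{\varepsilon}(\xi) = \sqrt{1+\varepsilon^{2}|\xi|^{2}}$, an assumption about (1.20)) and uses that $\Pi^{\varepsilon}_{-}$ is self-adjoint and idempotent, as in the proof of Proposition 4.2, hence of norm at most one on $L^{2}$. All norms are $L^{2}(\mathbb{R}^3)^4$ norms unless marked otherwise.

```latex
% Assumption: \lambda^\varepsilon(\xi) \ge 1, so the symbol
% \lambda^\varepsilon(\xi)+1 is bounded below by 2.
\begin{align*}
\|\phi^{\varepsilon}_{-}(t)\|
 &\le \tfrac{1}{2}\,\bigl\|(\lambda^{\varepsilon}(D)+1)\,\phi^{\varepsilon}_{-}(t)\bigr\|
   && \text{(Plancherel and } \lambda^{\varepsilon}(\xi)+1 \ge 2\text{)}\\
 &= \tfrac{\varepsilon^{2}}{2}\,\bigl\| i\partial_t\phi^{\varepsilon}_{-}(t)
      + \Pi^{\varepsilon}_{-}\bigl(A\phi^{\varepsilon}\bigr)(t)\bigr\|
   && \text{(by (4.6))}\\
 &\le \varepsilon^{2}\,C\,\Bigl(\|\partial_t\phi^{\varepsilon}_{-}(t)\|
      + \|A(t)\|_{L^{\infty}}\,\|\phi^{\varepsilon}(t)\|\Bigr)
   && \text{(}\|\Pi^{\varepsilon}_{-}\| \le 1\text{)}.
\end{align*}
```

The constant $C$ absorbs the factor $1/2$ and the matrix norms hidden in the action of $A$ on 4-spinors. The bound (4.17) follows in the same way after applying $\Delta$ to (4.6).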
From the proofs of Proposition 4.2, Lemma 4.2 and Proposition 4.3 we immediately see that Proposition 4.2 and Proposition 4.3 can be generalized to $H^{m-2}(\mathbb{R}^3)^4$ and $H^{m}(\mathbb{R}^3)^4$ estimates:
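Corollary 4.3. Let $A = (A_0, A_1, A_2, A_3) \in W^{1,\infty}((0,T), W^{m,\infty}(\mathbb{R}^3)^4)$ and take initial data such that $\psi^{\varepsilon}_{I} = \psi^{\varepsilon}(t=0) = \phi^{\varepsilon}(t=0) = O(1) \in H^{m+1}(\mathbb{R}^3)^4$ and $\Pi^{\varepsilon}_{-}\psi^{\varepsilon}_{I} = \phi^{\varepsilon}_{-}(t=0) = O(\varepsilon^{2}) \in H^{m}(\mathbb{R}^3)^4$. Then

(i)
\[ \|\partial_t\phi^{\varepsilon}_{-}(t,x)\|_{H^{m-2}(\mathbb{R}^3)^4} = O(1), \qquad \forall t \in (0,T) \tag{4.18} \]
with a bound analogous to (4.9), and

(ii)
\[ \|\phi^{\varepsilon}_{-}(t)\|_{H^{m}(\mathbb{R}^3)^4} = O(\varepsilon^{2}), \qquad \forall t \in (0,T). \tag{4.19} \]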
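5. The “Semi-Nonrelativistic Limit”

We now turn to the nonrelativistic limit of the Dirac equation (1.9) keeping terms at $O(\varepsilon) \sim O(1/c)$. In contrast to the widely used approach of “upper-large and lower-small component” by applying the “algebraic projector” $\Pi^{0}_{\pm}$ to (1.9), our mathematically rigorous theory is based on the use of the pseudodifferential operator $\Pi^{\varepsilon}_{\pm}(D)$. This implies that the “small component” $\Pi^{\varepsilon}_{-}\phi$ is of $O(\varepsilon^{2})$ in $L^{\infty}((0,T); H^{m}(\mathbb{R}^3)^4)$ and not of $O(\varepsilon)$ as in (1.29), and the magnetic field appears in the small component at leading order. In addition, the “Pauli equation” becomes an equation for a 4-vector with an additional term at $O(\varepsilon)$. However, we can reformulate this Pauli equation as an equation for a 2-vector in the usual form (1.31).

The equations for the electron-large and positron-small component $\phi^{\varepsilon}_{\pm}$ and $\phi^{\varepsilon}$ as defined in (4.2), (4.1) are readily obtained from (4.5) using Lemma 4.1 and (2.10)–(2.16):
\[ i\partial_t\phi^{\varepsilon}_{+} + \frac{1}{2}\Delta\phi^{\varepsilon}_{+} + A_0\phi^{\varepsilon}_{+} + A_k\gamma^{0}\gamma^{k}\phi^{\varepsilon}_{-} + \varepsilon G\phi^{\varepsilon} = \varepsilon^{2}R\phi^{\varepsilon}, \tag{5.1} \]
\[ i\partial_t\phi^{\varepsilon}_{-} - \frac{1}{2}\Delta\phi^{\varepsilon}_{-} + \frac{2}{\varepsilon^{2}}\,\phi^{\varepsilon}_{-} + A_0\phi^{\varepsilon}_{-} + A_k\gamma^{0}\gamma^{k}\phi^{\varepsilon}_{+} - \varepsilon G\phi^{\varepsilon} = -\varepsilon^{2}R\phi^{\varepsilon}, \tag{5.2} \]
where $G\phi^{\varepsilon}$ follows from (2.11)–(2.17)
\[ G\phi^{\varepsilon} = -iA_k\partial_k\phi^{\varepsilon} + \frac{i}{2}\,\gamma^{l}\gamma^{k}(\partial_lA_k)\phi^{\varepsilon} - \frac{i}{2}\,\gamma^{0}\gamma^{k}(\partial_kA_0)\phi^{\varepsilon} \tag{5.3} \]
and $G$ is a uniformly (w.r.t. $\varepsilon$) bounded operator $H^{m}(\mathbb{R}^3)^4 \to H^{m-1}(\mathbb{R}^3)^4$ for $A \in L^{\infty}((0,T); W^{m,\infty}(\mathbb{R}^3)^4)$. $R$ is the sum of the remainder terms due to Lemma 4.1 and (2.11)–(2.17), hence it is a uniformly (w.r.t. $\varepsilon$) bounded operator $H^{m}(\mathbb{R}^3)^4 \to H^{m-4}(\mathbb{R}^3)^4$ for $A \in L^{\infty}((0,T); W^{m-2,\infty}(\mathbb{R}^3)^4)$.

We can rewrite $G\phi^{\varepsilon}$ using the algebra of the Dirac matrices $\gamma^{j}$: the diagonal in the double sum of the second term in (5.3) gives $-\frac{i}{2}(\partial_kA_k)\phi^{\varepsilon}$ and the off-diagonal in this term yields $\frac{1}{2}S^{m}\operatorname{curl}_m\vec{A}\,\phi^{\varepsilon} = \frac{1}{2}S^{m}B_m\phi^{\varepsilon}$, where the “block-diagonal” matrix $S^{m}$ is given by (1.6). Hence we have
\[ G\phi^{\varepsilon} = -iA_k\partial_k\phi^{\varepsilon} - \frac{i}{2}(\partial_kA_k)\phi^{\varepsilon} - \frac{1}{2}\,S^{k}B_k\phi^{\varepsilon} + \frac{i}{2}\,\gamma^{0}\gamma^{k}(\partial_kA_0)\phi^{\varepsilon}. \tag{5.4} \]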
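The last term in (5.4) contains the “block-anti-diagonal” matrices $\gamma^{0}\gamma^{k}$ (cf. (1.6)). This term plays a special role as we shall see in the following. Note that $\partial_kA_0 = E_k - \partial_tA_k$ and only in the case of a static magnetic field this term involves the mere electric field. This is a difference to the analogous term found in [D1, LL4a, Schiff]. We note that $i\gamma^{0}\gamma^{k}(\partial_kA_0)$ is anti-selfadjoint since the $\gamma^{0}\gamma^{k}$ are self-adjoint and $\partial_kA_0$ is real. We collect the results on the nonrelativistic approximation up to terms of $O(\varepsilon^{2})$ in

Theorem 5.1. Let $A = (A_0, A_1, A_2, A_3) \in W^{1,\infty}((0,T); W^{2,\infty}(\mathbb{R}^3)^4)$ and take $\psi^{\varepsilon}_{I} = O(1) \in H^{4}(\mathbb{R}^3)^4$ and $\Pi^{\varepsilon}_{-}\psi^{\varepsilon}_{I} = O(\varepsilon^{2}) \in H^{2}(\mathbb{R}^3)^4$.

(i) Then the semi-nonrelativistic approximation of the Dirac equation is given by the following “Pauli equation” for the 4-vector $\tilde{\phi}^{\varepsilon}_{+}$,
\[ i\partial_t\tilde{\phi}^{\varepsilon}_{+} = (i\nabla + \varepsilon\vec{A})^{2}\,\tilde{\phi}^{\varepsilon}_{+} + A_0\tilde{\phi}^{\varepsilon}_{+} - \varepsilon\,\frac{1}{2}\,S^{k}B_k\tilde{\phi}^{\varepsilon}_{+} + \varepsilon\,\frac{i}{2}\,\gamma^{0}\gamma^{k}(\partial_kA_0)\tilde{\phi}^{\varepsilon}_{+}, \tag{5.5} \]
\[ \tilde{\phi}^{\varepsilon}_{+}(t=0,x) = \tilde{\phi}^{\varepsilon}_{+I}(x), \tag{5.6} \]
where the approximation is to be understood as follows: Let $\psi^{\varepsilon}(t,x)$ be the solution of the Dirac equation (1.9) with initial datum $\psi^{\varepsilon}_{I}(x)$ and let $\phi^{\varepsilon}(t,x) = e^{it/\varepsilon^{2}}\psi^{\varepsilon}(t,x)$ and $\phi^{\varepsilon}_{\pm}(t,x) = \Pi^{\varepsilon}_{\pm}(D)\phi^{\varepsilon}(t,x)$. Let $\tilde{\phi}^{\varepsilon}_{+}$ be the solution of the Pauli equation (5.5) with initial datum $\tilde{\phi}^{\varepsilon}_{+I}$ such that $\|\psi^{\varepsilon}_{I} - \tilde{\phi}^{\varepsilon}_{+I}\|_{L^{2}(\mathbb{R}^3)^4} = O(\varepsilon^{2})$. Then

(ii)
\[ \|\tilde{\phi}^{\varepsilon}_{-}(t)\|_{L^{2}(\mathbb{R}^3)^4} + \|(\phi^{\varepsilon}_{+} - \tilde{\phi}^{\varepsilon}_{+})(t)\|_{L^{2}(\mathbb{R}^3)^4} \le C_T\,\varepsilon^{2}, \tag{5.7} \]
where $C_T$ is an $\varepsilon$-independent constant depending on the time interval $(0,T)$.

(iii) For the positron-small component $\phi^{\varepsilon}_{-}$ we have
\[ \phi^{\varepsilon}_{-} = -\varepsilon^{2}\,\frac{1}{2}\,A_k\gamma^{0}\gamma^{k}\tilde{\phi}^{\varepsilon}_{+} - i\varepsilon^{2}\,\partial_t\phi^{\varepsilon}_{-} + \varepsilon^{3}r, \tag{5.8} \]
where $\tilde{\phi}^{\varepsilon}_{+}$ is the solution of the Pauli equation (5.5) and $r$, $\frac{1}{2}A_k\gamma^{0}\gamma^{k}\phi^{\varepsilon}_{+}$ and $\partial_t\phi^{\varepsilon}_{-}$ are uniformly (w.r.t. $\varepsilon$) bounded functions in $L^{\infty}((0,T); L^{2}(\mathbb{R}^3)^4)$.

(iv) The semi-nonrelativistic approximation of the position density $n^{\varepsilon} = |\psi^{\varepsilon}|^{2} = |\phi^{\varepsilon}|^{2}$ is given by $\tilde{n}^{\varepsilon} = |\tilde{\phi}^{\varepsilon}_{+}|^{2}$ with
\[ \|n^{\varepsilon}(t) - \tilde{n}^{\varepsilon}(t)\|_{L^{1}(\mathbb{R}^3)^4} = O(\varepsilon^{2}), \qquad \forall t \in (0,T), \tag{5.9} \]
where $\tilde{\phi}^{\varepsilon}_{+}$ is the solution of the Pauli equation (5.5).

(v) The semi-nonrelativistic approximation of the current density
\[ J^{\varepsilon}_{k}(t,x) = \frac{1}{\varepsilon}\,\gamma^{0}\gamma^{k}\psi^{\varepsilon}(t,x)\cdot\psi^{\varepsilon}(t,x) = \frac{1}{\varepsilon}\,\gamma^{0}\gamma^{k}\phi^{\varepsilon}(t,x)\cdot\phi^{\varepsilon}(t,x) \]
is given by
\[ \tilde{J}^{\varepsilon}_{k} = \operatorname{Im}(\tilde{\phi}^{\varepsilon}_{+}\cdot\partial_k\tilde{\phi}^{\varepsilon}_{+}) + \operatorname{curl}(\tilde{\phi}^{\varepsilon}_{+}\cdot\vec{S}\tilde{\phi}^{\varepsilon}_{+})_k + \varepsilon A_k|\tilde{\phi}^{\varepsilon}_{+}|^{2}, \tag{5.10} \]
where $\tilde{\phi}^{\varepsilon}_{+}$ is again a solution of the Pauli equation (5.5) and the approximation is to be understood in the sense that
\[ J^{\varepsilon}_{k} - \tilde{J}^{\varepsilon}_{k} = \varepsilon r, \tag{5.11} \]
where $r \xrightarrow{\ \varepsilon\to0\ } 0$ in $L^{2}\bigl((0,T);\,C^{0}{}'(\mathbb{R}^3)^4\ \text{weak-}{*}\bigr)$ weakly.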
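Remark 5.1. In contrast to the approach in Sect. 3, the nonrelativistic limit of the current density to the Schrödinger limit (3.11) holds in a strong sense (cp. (3.12)):
\[ J^{\varepsilon}_{k}(t,x) \xrightarrow{\ \varepsilon\to0\ } J^{0}_{k} \quad \text{in } L^{\infty}((0,T); L^{1}(\mathbb{R}^3)^4). \tag{5.12} \]

Proof. Assertions (i) and (iii) are obtained by using the estimates (4.9), (4.14) of Propositions 4.2, 4.3, in order to estimate the terms in (5.1), (5.2). With our assumptions the functions $G\phi^{\varepsilon}$ and $R\phi^{\varepsilon}$ are uniformly bounded in $L^{\infty}((0,T); L^{2}(\mathbb{R}^3)^4)$ and we can conclude by keeping the leading order terms and adding the $O(\varepsilon^{2})$ term $\varepsilon^{2}\vec{A}^{2}$.

In order to shorten notations in the proof of (ii) we define the “Pauli operator” $P^{\varepsilon}$,
\[ P^{\varepsilon} := (i\nabla + \varepsilon\vec{A}\,\mathrm{Id})^{2} - \varepsilon\,\frac{1}{2}\,S^{k}B_k\,\mathrm{Id}, \tag{5.13} \]
which is obviously self-adjoint. Let $\delta_{+}$ be the difference between $\phi^{\varepsilon}_{+}$ and $\tilde{\phi}^{\varepsilon}_{+}$. Then $\delta_{+}$ verifies the equation
\[ i\partial_t\delta_{+} = P^{\varepsilon}\delta_{+} + \varepsilon\,\frac{i}{2}\,\gamma^{0}\gamma^{k}(\partial_kA_0)\delta_{+} + \varepsilon^{2}r, \tag{5.14} \]
\[ \delta_{+I} = \phi^{\varepsilon}_{+I} - \tilde{\phi}^{\varepsilon}_{+I} \tag{5.15} \]
with $r$ uniformly (w.r.t. $\varepsilon$) bounded in $L^{\infty}((0,T); L^{2}(\mathbb{R}^3)^4)$. We multiply this equation by $\delta_{+}$, integrate and take imaginary parts:
\[ \frac{d}{dt}\,\|\delta_{+}\|^{2}_{L^{2}(\mathbb{R}^3)} = \int_{\mathbb{R}^3}\frac{\varepsilon}{2}\,\gamma^{0}\gamma^{k}(\partial_kA_0)\delta_{+}\cdot\delta_{+}\,dx + \varepsilon^{2}\int_{\mathbb{R}^3}r\cdot\delta_{+}\,dx. \]
Hence
\[ \frac{d}{dt}\,\|\delta_{+}\|_{L^{2}(\mathbb{R}^3)^4} \le C\bigl(\varepsilon\,\|\delta_{+}\|_{L^{2}(\mathbb{R}^3)^4} + \varepsilon^{2}\bigr), \]
where $C$ is a constant. Applying the Gronwall Lemma to this inequality gives
\[ \|\delta_{+}\|_{L^{2}(\mathbb{R}^3)^4} \le \|\delta_{+I}\|_{L^{2}(\mathbb{R}^3)^4}\,e^{\varepsilon Ct} + \varepsilon\bigl(e^{\varepsilon Ct} - 1\bigr), \]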
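which proves (ii).

In order to prove (iv), (v) we plug the decomposition $\psi^{\varepsilon} = \Pi^{\varepsilon}_{+}\psi^{\varepsilon} + \Pi^{\varepsilon}_{-}\psi^{\varepsilon}$ into (1.11), (1.12) and replace $\psi^{\varepsilon}_{\pm}$ by $\phi^{\varepsilon}_{\pm}$ (cf. (4.2), (2.7)) since the phase factors cancel completely here. Note the contrast to the proof of Theorem 3.1 (ii) which is not simply obtained by “letting $\varepsilon$ vanish” in this proof based on the “positron-small component” (5.8). Since $\phi^{\varepsilon}_{-}$ is $O(\varepsilon^{2})$ in $L^{\infty}((0,T); L^{2}(\mathbb{R}^3)^4)$ due to (iii) we immediately obtain (5.9) in (iv). For the current (1.12) we thus obtain
\[ J^{\varepsilon}_{k} = \frac{1}{\varepsilon}\,\gamma^{0}\gamma^{k}\tilde{\phi}^{\varepsilon}_{+}\cdot\tilde{\phi}^{\varepsilon}_{+} + \frac{2}{\varepsilon}\operatorname{Re}(\gamma^{0}\gamma^{k}\tilde{\phi}^{\varepsilon}_{+}\cdot\tilde{\phi}^{\varepsilon}_{-}) + \varepsilon^{3}r \tag{5.16} \]
with $r$ uniformly bounded in $L^{\infty}((0,T); L^{1}(\mathbb{R}^3)^4)$. Using Lemma 2.1 we can rewrite the first term in (5.16) as
\[ \frac{1}{\varepsilon}\,\gamma^{0}\gamma^{k}(\Pi^{0}_{+} + R)\phi^{\varepsilon}_{+}\cdot(\Pi^{0}_{+} + R)\phi^{\varepsilon}_{+} \tag{5.17} \]
with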
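\[ R\phi^{\varepsilon}_{+} = -i\,\frac{\varepsilon}{2}\,\gamma^{0}\gamma^{l}\partial_l\phi^{\varepsilon}_{+} + \frac{1}{4}\,\varepsilon^{2}\gamma^{0}\Delta\phi^{\varepsilon}_{+} + \varepsilon^{3}r, \tag{5.18} \]
where $r$ is uniformly (w.r.t. $\varepsilon$) bounded in $L^{\infty}((0,T); L^{2}(\mathbb{R}^3)^4)$. By calculations analogous to the derivation of (3.22) in the proof of Theorem 3.1 (ii) we obtain that the leading order term of (5.17) is $O(1)$ in $L^{\infty}((0,T); L^{1}(\mathbb{R}^3)^4)$, given by
\[ \operatorname{Im}(\tilde{\phi}^{\varepsilon}_{+}\cdot\partial_k\tilde{\phi}^{\varepsilon}_{+}) + \operatorname{Im}\sum_{l\neq k}\gamma^{k}\gamma^{l}\partial_l\tilde{\phi}^{\varepsilon}_{+}\cdot\tilde{\phi}^{\varepsilon}_{+}, \]
where the last term can again be rewritten in the curl-form due to the algebra of the Dirac matrices. By straightforward calculation we see that the $O(\varepsilon)$ term stemming from $\frac{1}{4}\varepsilon^{2}\gamma^{0}\Delta$ in (5.18) vanishes.

The second term on the r.h.s. of (5.16) gives an $O(\varepsilon)$ correction. We plug in $\phi^{\varepsilon}_{-}$ given by (5.8). We have
\[ \partial_t\tilde{\phi}^{\varepsilon}_{-} \rightharpoonup 0 \quad \text{in } L^{2}((0,T)\times\mathbb{R}^3)^4 \text{ weakly.} \tag{5.19} \]
To see this we remark that $\tilde{\phi}^{\varepsilon}_{-} \xrightarrow{\ \varepsilon\to0\ } 0$ in $L^{\infty}((0,T); L^{2}(\mathbb{R}^3)^4)$. Hence $\partial_t\tilde{\phi}^{\varepsilon}_{-} \xrightarrow{\ \varepsilon\to0\ } 0$ in $\mathcal{D}'$ and we can conclude the weak convergence in $L^{2}$. Hence we have
\[ \tilde{\phi}^{\varepsilon}_{+}\cdot\tilde{\phi}^{\varepsilon}_{-} \xrightarrow{\ \varepsilon\to0\ } 0 \quad \text{in } L^{2}\bigl((0,T);\,C^{0}{}'(\mathbb{R}^3)^4\ \text{weak-}{*}\bigr) \text{ weakly.} \]
Therefore $\frac{2}{\varepsilon}\operatorname{Re}(\gamma^{0}\gamma^{k}\tilde{\phi}^{\varepsilon}_{+}\cdot\tilde{\phi}^{\varepsilon}_{-})$ gives the only term
\[ -\varepsilon\operatorname{Re}(\gamma^{k}\gamma^{l}A_l\tilde{\phi}^{\varepsilon}_{+}\cdot\tilde{\phi}^{\varepsilon}_{+}) = -\varepsilon A_k|\tilde{\phi}^{\varepsilon}_{+}|^{2}, \]
since the real part of the off-diagonal terms vanishes.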
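The Gronwall step in the proof of (ii) can be made explicit. In the sketch below $y(t)$ is shorthand, introduced only here, for $\|\delta_{+}(t)\|_{L^{2}(\mathbb{R}^3)^4}$, and $C$ is the constant of the differential inequality in the proof.

```latex
% y(t) := \|\delta_+(t)\|_{L^2(\mathbb{R}^3)^4}  (notation used only here).
\begin{align*}
y'(t) \le C\varepsilon\, y(t) + C\varepsilon^{2}
 &\quad\Longrightarrow\quad
 \frac{d}{dt}\Bigl(e^{-C\varepsilon t}\,y(t)\Bigr) \le C\varepsilon^{2}\,e^{-C\varepsilon t},\\
y(t) &\le y(0)\,e^{C\varepsilon t}
      + C\varepsilon^{2}\int_0^t e^{C\varepsilon (t-s)}\,ds
   = y(0)\,e^{C\varepsilon t} + \varepsilon\bigl(e^{C\varepsilon t}-1\bigr),\\
e^{C\varepsilon t}-1 &\le C\varepsilon T\,e^{C\varepsilon T}
   \qquad (0 \le t \le T).
\end{align*}
```

The last line shows that the second term is $O(\varepsilon^{2})$ on $(0,T)$. The first term is $O(\varepsilon^{2})$ as soon as $y(0) = \|\delta_{+I}\|_{L^{2}(\mathbb{R}^3)^4} = O(\varepsilon^{2})$, which is how the assumptions on the initial data in Theorem 5.1 appear to be used.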
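It remains to connect (5.5) with the usual Pauli equation (1.31), in particular to clarify the relation between the 4-vector $\tilde{\phi}^{\varepsilon}_{+}$ and the 2-vector $\tilde{\varphi}^{\varepsilon}_{l}$ as in (1.31) and the nature of the last term in (5.5), which looks like a coupling of spin and electric field at $O(\varepsilon)$. Defining the “upper electron-large” and the “lower electron-large” component as
\[ \phi^{\varepsilon}_{+l} := \Pi^{0}_{+}\phi^{\varepsilon}_{+} = \Pi^{0}_{+}\Pi^{\varepsilon}_{+}(D)\phi^{\varepsilon}, \qquad \phi^{\varepsilon}_{+s} := \Pi^{0}_{-}\phi^{\varepsilon}_{+} = \Pi^{0}_{-}\Pi^{\varepsilon}_{+}(D)\phi^{\varepsilon} \tag{5.20} \]
(cp. (1.24)) and analogously for the approximations $\tilde{\phi}^{\varepsilon}_{+l}$ and $\tilde{\phi}^{\varepsilon}_{+s}$ we can split the Pauli equation (5.5). From Theorem 5.1 (ii) we have $\tilde{\phi}^{\varepsilon}_{+l} = \phi^{\varepsilon}_{+l} + O(\varepsilon^{2})$ in $L^{\infty}((0,T); L^{2}(\mathbb{R}^3)^4)$ and (5.5) becomes
\[ i\partial_t\tilde{\phi}^{\varepsilon}_{+l} = P^{\varepsilon}\tilde{\phi}^{\varepsilon}_{+l} - i\,\frac{\varepsilon}{2}\,S^{k}(\partial_kA_0)\tilde{\phi}^{\varepsilon}_{+s}, \tag{5.21} \]
\[ i\partial_t\tilde{\phi}^{\varepsilon}_{+s} = P^{\varepsilon}\tilde{\phi}^{\varepsilon}_{+s} - i\,\frac{\varepsilon}{2}\,S^{k}(\partial_kA_0)\tilde{\phi}^{\varepsilon}_{+l}. \tag{5.22} \]
The last terms on the r.h.s. which couple the system result from the $\gamma^{0}\gamma^{k}$-term in (5.5). Using Lemma 2.1 and the orthogonality of $\Pi^{0}_{+}$ and $\Pi^{0}_{-}$ we have $\phi^{\varepsilon}_{+l} = \Pi^{0}_{+}\phi^{\varepsilon} - i\frac{\varepsilon}{2}\gamma^{0}\gamma^{k}\partial_k\Pi^{0}_{+}\phi^{\varepsilon} + O(\varepsilon^{2}) = O(1)$ and $\phi^{\varepsilon}_{+s} = +i\frac{\varepsilon}{2}\gamma^{0}\gamma^{k}\partial_k\Pi^{0}_{-}\phi^{\varepsilon} + O(\varepsilon^{2}) = O(\varepsilon)$ in $L^{\infty}((0,T); L^{2}(\mathbb{R}^3)^4)$.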
This yields finally $-i\frac{\varepsilon}{2}S^{k}(\partial_kA_0)\tilde{\phi}^{\varepsilon}_{+s} = O(\varepsilon^{2})$ in $L^{\infty}((0,T); L^{2}(\mathbb{R}^3)^4)$ and we can neglect this term in the $O(\varepsilon)$ approximation. A more heuristic approach to this issue is given e.g. in [D1, Schiff]. Note again that we find a “spin-electric field coupling” term involving the magnetic field since $\nabla A_0 = -\vec{E} + \partial_t\vec{A}$ for the non-static case. This difference in the $O(\varepsilon^{2})$ approximation is dealt with in detail in [M3].

With proper initial data we can therefore define $\tilde{\varphi}^{\varepsilon}_{+l}$ as the 2-vector skipping the zero components of $\phi^{\varepsilon}_{+}$ (cp. (1.25)) and approximate it by the solution of the Pauli equation (1.31) for the 2-vector $\tilde{\varphi}^{\varepsilon}_{+l}$:

Corollary 5.1. The 2-vector of the upper components of the approximation $\tilde{\phi}^{\varepsilon}_{+}$ of the Dirac equation according to (5.5), (5.7) can be approximated up to terms of $O(\varepsilon^{2})$ in $L^{\infty}((0,T); L^{2}(\mathbb{R}^3)^4)$ by the solution of the “usual Pauli equation” (1.31),
\[ i\partial_t\tilde{\varphi}^{\varepsilon}_{+l} = (-i\nabla - \varepsilon\vec{A})^{2}\,\tilde{\varphi}^{\varepsilon}_{+l} - A_0\tilde{\varphi}^{\varepsilon}_{+l} - \varepsilon\,\frac{1}{2}\,\sigma^{k}B_k\tilde{\varphi}^{\varepsilon}_{+l}. \tag{5.23} \]
Acknowledgement. The authors acknowledge financial support by the DAAD-PROCOPE. The second author also acknowledges support by his “Marie Curie-Fellowship” contract # ERBFMBICT961125, and by his “Erwin Schrödinger-Fellowship”.
References

[AS] Arnold, A., Steinrück, H.: The 'electromagnetic' Wigner equation for an electron with spin. ZAMP 40, No. 6, 793–815 (1989)
[CC] Cirincione, R.J., Chernoff, P.R.: Dirac and Klein-Gordon equations: Convergence of solutions in the nonrelativistic limit. Commun. Math. Phys. 79, 33–46 (1981)
[D1] Dirac, P.A.M.: Principles of Quantum Mechanics. London: Oxford University Press, 4th ed., 1958
[ES] Esteban, M.J., Séré, E.: Existence and multiplicity of solutions for linear and nonlinear Dirac problems. Preprint No. 9542, CEREMADE, Univ. Paris IX (1996)
[FW] Foldy, L.L., Wouthuysen, S.A.: On the Dirac theory of Spin 1/2 Particles and its Nonrelativistic Limit. Phys. Rev. 78, 29–36 (1950)
[GMMP] Gérard, P., Markowich, P.A., Mauser, N.J. and Poupaud, F.: Homogenization Limits and Wigner Transforms. Comm. Pure and Appl. Math. 50, 321–377 (1997)
[GGT] Gesztesy, F., Grosse, H. and Thaller, B.: A rigorous approach to relativistic corrections of bound state energies for spin-1/2 particles. Ann. Inst. Henri Poincaré, Phys. Théor. 40, 159–174 (1984)
[GNP] Grigore, D.R., Nenciu, G. and Purice, R.: On the nonrelativistic limits of the Dirac Hamiltonian. Ann. Inst. Henri Poincaré, Phys. Théor. 51, No. 3, 231–263 (1989)
[H1] Hunziker, W.: On the nonrelativistic limit of the Dirac Theory. Commun. Math. Phys. 40, 215–222 (1975)
[IZ] Itzykson, C. and Zuber, J.B.: Quantum Field Theory. New York: McGraw-Hill, 1985
[LL3] Landau, L. and Lifschitz, Vol. III, 2nd ed., Quantenmechanik. Berlin: Akademie-Verlag, 1971
[LL4a] Landau, L. and Lifschitz, Vol. IVa, 2nd ed., Relativistische Quantenmechanik. Berlin: Akademie-Verlag, 1971
[M3] Mauser, N.J.: Rigorous derivation of the Pauli equation with time-dependent electromagnetic field. To appear in VLSI Design (1998)
[RS4] Reed, M., Simon, B.: Methods of Modern Mathematical Physics, Vol. I–IV, 3rd ed., New York–San Francisco–London: Academic Press, 1987
[Sc] Schoene, A.Y.: On the nonrelativistic limits of the Klein-Gordon and Dirac equations. J. Math. Anal. Appl. 71, 36–47 (1979)
[Schiff] Schiff, L.I.: Quantum Mechanics. 3rd ed., New York: McGraw-Hill, 1968
[T1] Thaller, B.: The Dirac Equation. New York–Wien: Springer, 1992
[Ve] Veselic, K.: Perturbation of Pseudoresolvents and Analyticity in 1/c of Relativistic Quantum Mechanics. Commun. Math. Phys. 22, 27–43 (1971)
[Wh] White, G.B.: Splitting of the Dirac operator in the nonrelativistic limit. Ann. Inst. Henri Poincaré, Phys. Théor. 40, 109–121 (1990)
Communicated by B. Simon
Commun. Math. Phys. 197, 427 – 441 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Strong Symmetry Defined by Twisting Modules, Applied to Quasi-Hereditary Algebras with Triangular Decomposition and Vanishing Radical Cube
Steffen König¹, Changchang Xi²
¹ Fakultät für Mathematik, Universität Bielefeld, D-33501 Bielefeld, Germany. E-mail: [email protected]
² Department of Mathematics, Beijing Normal University, 100875 Beijing, P.R. China. E-mail: [email protected]
Received: 5 February 1996 / Accepted: 20 March 1998
Abstract: For a finite dimensional algebra with triangular decomposition, a new kind of modules is defined, the twisting modules. Using the structure of these modules, for some algebras with vanishing radical cube the characteristic tilting module is described and its endomorphism ring is computed. This covers both Temperley–Lieb algebras and q-Schur algebras of finite representation type. Our basic tool is a new symmetry condition, stronger than the symmetry provided by the existence of a triangular decomposition in general.

1. Introduction

Finite dimensional quasi-hereditary algebras occur rather frequently in applications: (classical or quantized) Schur algebras whose modules are the polynomial representations of (classical or quantum) $GL_n$ have this structure as well as the algebras associated with blocks of the BGG-category $\mathcal{O}$ of a semisimple complex Lie algebra and as Temperley–Lieb algebras (see [12]) which are of interest both in statistical mechanics and for constructing the Jones knot polynomial. Triangular decompositions for finite dimensional algebras are the analogue of PBW theorems for (quantized) enveloping algebras and also of crossed products (e.g. the Drinfeld double) for Hopf algebras. Such decompositions exist in many of the situations just mentioned. Various questions of interest in applications can be asked for the class of algebras with triangular decompositions, or more generally, for quasi-hereditary algebras. A basic problem is to compute quiver and relations of such algebras. This is needed for example in order to apply the sophisticated methods of representation theory of finite dimensional algebras. For many of the algebras mentioned above, the quiver (which has as vertices the isomorphism classes of simple modules and arrows corresponding to extensions between simples) can be computed. For example in the case of category $\mathcal{O}$, knowing the quiver is just an equivalent form of the (proven) Kazhdan–Lusztig conjecture.
However, no general methods seem to be known to compute the relations.
Moreover, one would like to be able to compute Ringel’s characteristic tilting modules, their endomorphism rings, and the quadratic duals of algebras with triangular decompositions. However, at present these questions seem to be too hard for this quite large class of algebras in general. Thus there are two possible strategies. One strategy is to look for additional properties, in particular symmetry conditions, which are satisfied in (most of) the applications in order to make the class of algebras smaller. The other one is to study subclasses which are easier to access, but contain at least reasonably many examples from applications. As examples of related investigations we mention Dyer’s study of blocks of category O and related algebras [7, 8], and the equivalences Khoroshkin has found between certain categories of D-modules and certain representations of quivers [13]. In this note we study such a subclass. We require the algebras to be quadratic and – a strong restriction – to have vanishing radical cube. This covers all Temperley–Lieb algebras, and a few of the other applications (for example, (quantized) Schur algebras of finite representation type). We give a general construction by quiver and relations of the algebras in this class, and find a condition, satisfied for Temperley–Lieb algebras, under which we prove that the characteristic tilting module is induced from an injective module over the exact Borel subalgebra. (Note that this is not true for the dual extensions defined in [25] and studied in [4]). In this way we compute the characteristic tilting modules and determine their endomorphism rings. We also compute quadratic duals. Verifying the condition for Temperley–Lieb algebras, we also classify the blocks of these algebras. The condition which we use and which is satisfied in all these examples is an isomorphism between certain induced and certain coinduced modules. 
This seems to be a “strong symmetry condition” of the kind one has to look for when following the first of the above strategies. Such symmetry conditions are also of interest with respect to the following uniqueness problem (which at present is completely open): A finite dimensional algebra $A$ having a triangular decomposition $A \simeq C \otimes_S B$ is not uniquely determined by the data $C$, $B$, $S$. In view of the analogy with crossed products for Hopf algebras one would like to find a special such algebra $A$, analogous to the Drinfeld double, with better symmetry properties. Our strong symmetry condition might be such a symmetry property (see Propositions 3.3, 3.4, and Sect. 5, where we use strong symmetry to classify the blocks of Temperley–Lieb algebras). Our main tool is what we call twisting modules. They are defined in Sect. Three (for algebras with triangular decomposition in general). In Sect. Four we look at the related concept of twisting relations. In Sect. Five we formulate our “strong symmetry condition”. In Sect. Six we give a general construction of algebras with triangular decomposition. In Sect. Seven we classify the blocks of Temperley–Lieb algebras. In Sect. Eight we produce characteristic tilting modules by induction from an exact Borel subalgebra, and we compute their endomorphism rings. Quadratic duals are computed in Sect. Nine. Proofs of all results are given in Sect. Ten.
2. Algebras with Triangular Decomposition

For simplicity we will work with finite dimensional algebras over an algebraically closed field $k$. Under this assumption, a finite dimensional algebra is Morita equivalent to a basic one (that is, having a product of copies of the field $k$ as maximal semisimple quotient). And a basic $k$-algebra $A$ can be written as quotient of the path algebra $kQ$, where $Q$ is the opposite quiver of the algebra. In the path algebra $kQ$, the product of
arrows α : i → j and β : l → m is written as αβ and defined to be zero if j ≠ l, and otherwise to be the path of length two starting in i and going via α and β to m. We note that the vertices of Q are by definition the isomorphism classes of simple A-modules. The arrows i → j form a basis of the k-vector space Ext^1_A(L(j), L(i)), where L(i) and L(j) are simple modules representing the classes i and j, respectively. Modules are always finite dimensional left modules.

First, we recall the definition of a quasi-hereditary algebra as given by Cline, Parshall, and Scott [2].

Definition 2.1. Let A be a finite dimensional algebra over a field, and I the set of isomorphism classes of simple A-modules. Choose representatives L(i) of the elements of I. Let ≤ be a partial order on I. Then (A, ≤) is called quasi-hereditary if and only if the following assertions are true:
(a) For each i ∈ I, there exists a finite dimensional A-module Δ(i) with an epimorphism Δ(i) → L(i) such that the composition factors L(j) of the kernel satisfy j < i.
(b) For each i ∈ I, a projective cover P(i) of L(i) maps onto Δ(i) such that the kernel has a finite filtration with factors Δ(j) satisfying j > i.

The module Δ(i) is called the standard module of index i. The objects of the full subcategory F(Δ) of A-mod are by definition the A-modules having a finite filtration with standard modules as factors. These modules are called Δ-good. Injective A-modules are filtered by modules ∇(i) (which are dual to the standard modules of the quasi-hereditary algebra (A^op, ≤)).

An algebra (A, ≤) is called directed if it is quasi-hereditary with simple standard modules. Equivalently, the Cartan matrix of A is unitriangular with respect to the ordering of rows and columns given by ≤. Thus, the projective covers P(i) of L(i) and P(j) of L(j) satisfy Hom_A(P(i), P(j)) = 0 unless i ≥ j, and End_A(P(i)) = End_A(L(i)), for all indices i and j.
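The unitriangularity criterion for directed algebras can be checked mechanically once a Cartan matrix is written down. The following is only a minimal sketch: the function name `is_unitriangular`, the list-of-lists encoding, and the sample matrices are illustrative assumptions and do not come from the text. Whether the matrix is upper or lower triangular depends on the chosen convention, so the side is left as a parameter.

```python
def is_unitriangular(cartan, upper=True):
    """Return True if the square integer matrix `cartan` has ones on the
    diagonal and zeros strictly below it (upper=True) or strictly above
    it (upper=False). Rows and columns are assumed to be listed in an
    order compatible with the partial order on the simple modules."""
    n = len(cartan)
    for r in range(n):
        if len(cartan[r]) != n or cartan[r][r] != 1:
            return False
        for c in range(n):
            on_forbidden_side = c < r if upper else c > r
            if on_forbidden_side and cartan[r][c] != 0:
                return False
    return True


# Hypothetical Cartan matrix of a directed algebra with three simples.
directed_example = [[1, 1, 1],
                    [0, 1, 1],
                    [0, 0, 1]]
# Hypothetical matrix that is not unitriangular in either sense.
non_directed_example = [[1, 1],
                        [1, 2]]
```

Here `is_unitriangular(directed_example)` holds, while `non_directed_example` fails the test for both choices of `upper`.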
Exact Borel subalgebras and Δ-subalgebras of quasi-hereditary algebras over fields have been introduced in [14]. Their existence has been shown for the blocks of category O [14] and for many other quasi-hereditary algebras (see for example [5]). They are defined as follows: Let (A, ≤) be quasi-hereditary over a field and S a semisimple subalgebra of A which has the same number of simple modules (up to isomorphism) as A. Fix a bijection between the two index sets of simple A-modules and simple S-modules. A subalgebra B of A which contains S as a maximal semisimple subalgebra (that is, B/rad(B) ≃ S ⊂ B) is called an exact Borel subalgebra of (A, ≤) if (B, ≤) is directed, and the induction functor A ⊗_B − is exact and produces standard modules from simple modules: A ⊗_B L(B, i) ≃ Δ(i). A Δ-subalgebra C of (A, ≤) is a subalgebra of A which contains S as a maximal semisimple subalgebra, and which has the property that for each primitive idempotent e(i) ∈ S the epimorphism A · e(i) → Δ(i) restricts to an isomorphism C · e(i) ≃ Δ(i) of C-modules. Thus, standard modules over A are indecomposable projective over C. The algebra C is a Δ-subalgebra of (A, ≤) if and only if C^op is an exact Borel subalgebra of (A^op, ≤). In particular, C is directed with respect to the partial order ≥ which is opposite to ≤.

We will use the following properties of these two kinds of subalgebras: If B is an exact Borel subalgebra of A, then the induction functor A ⊗_B − is exact. If C is a Δ-subalgebra of A, then the coinduction functor Hom_C(A^op, −) is exact. (Here, A^op is a C-module by restriction of the A-module structure.) Applying these functors to a
B-module M or a C-module N, we produce an induced module from M or a coinduced module from N, respectively.

The main theorem in [15] implies that (A, ≤) has both an exact Borel subalgebra B and a Δ-subalgebra C which intersect in S if and only if the multiplication in A induces an isomorphism C ⊗_S B ≃ A of left C-modules and right B-modules. If in addition C ≃ B^op by an isomorphism fixing S elementwise, then an isomorphism A ≃ C ⊗_S B will be called a triangular decomposition of (A, ≤).

Exact Borel subalgebras are analogous to Lie theoretic Borel subalgebras, since there is directedness (which replaces solvability), an exact induction functor (that is, a PBW theorem), and Verma modules (that is, standard modules) are induced from simple modules over the subalgebra. In [16] it has been shown that the existence of a triangular decomposition is Morita invariant. Hence we may work with basic algebras without loss of generality.
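In the basic case, where S is a product of copies of k with primitive idempotents e(i), the tensor product C ⊗_S B splits as the direct sum over i of C e(i) ⊗_k e(i) B, so a triangular decomposition fixes the dimension of A. The sketch below only illustrates this count; the function name `dim_triangular` and the sample dimensions are assumptions made for the example and are not taken from the text.

```python
def dim_triangular(c_dims, b_dims):
    """k-dimension of C (x)_S B, assuming S is a product of copies of k
    indexed by the simple modules.

    c_dims[i] is dim_k C e(i) and b_dims[i] is dim_k e(i) B; both
    dictionaries must use the same index set."""
    if c_dims.keys() != b_dims.keys():
        raise ValueError("C and B must share the index set of S")
    return sum(c_dims[i] * b_dims[i] for i in c_dims)


# Hypothetical example with two simple modules: C e(1) is simple and
# C e(2) has length two; B is taken to be C^op, so the dimensions of
# e(i) B agree with those of C e(i).
example_dims = {1: 1, 2: 2}
example_total = dim_triangular(example_dims, example_dims)
```

With these sample values the count is 1·1 + 2·2 = 5. When B ≃ C^op, as in a triangular decomposition, the two dictionaries coincide and the dimension is a sum of squares.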
3. Twisting Modules

Our main technical tool in this paper is twisting modules, which will be defined and explained in this section. Because of their potential use for defining symmetry conditions for algebras with triangular decomposition in general, we give the most general definition. We define a whole family of twisting modules. Later on, we will make use of the smallest and the biggest twisting modules. They are closely related to twisting relations (to be explained below), which arise naturally in triangular decompositions.

Let A ≃ C ⊗_S B be an algebra with triangular decomposition where C (and thus B ≃ C^op) is assumed to be basic (which is no loss of generality), but A is arbitrary. Fix an index i and a B-module M (necessarily indecomposable) having simple socle L(B, i) and Loewy length at most two. Induction produces an A-module M(A) := A ⊗_B M which has a Δ-good filtration with one factor Δ(i) and all other factors Δ(j) satisfying j < i. Dually, fixing a C-module N having simple top L(C, i) and Loewy length at most two, we produce by coinduction an A-module N(A) := Hom_C(A^op, N) which is ∇-good with one factor ∇(i) and all other factors ∇(j) satisfying j < i. (Here we use the following chain of isomorphisms of A-modules: ∇(A, j) ≃ Hom_k(Δ(A^op, j), k) ≃ Hom_k(A^op ⊗_{C^op} L(C^op, j), k) ≃ Hom_{C^op}(L(C^op, j), Hom_k(A^op, k)) ≃ Hom_C(A^op, L(C, j)); note that inside A^op the elements of C form an algebra which is isomorphic to C^op.) Since each ∇(j) is injective as a B-module, restricting N(A) to B gives a direct sum decomposition N(A)|_B ≃ ∇(i) ⊕ ⊕_j ∇(j). Hence Hom_A(M(A), N(A)) ≃ Hom_B(M, N(A)) contains an element ψ (unique up to a non-zero scalar multiple) which is the B-injective envelope M → ∇(i). We denote the image of the A-homomorphism ψ (defined up to A-isomorphism) by TM(M, N) and call it the twisting module associated with M and N.

Now we discuss two special cases which will be of particular interest.
Pick three indices i, j, and l, and elements α ∈ Ext^1_B(L(B, i), L(B, j)) and β ∈ Ext^1_C(L(C, j), L(C, l)). These elements define short exact sequences whose middle terms are the modules, say, X(α) and Y(β), both of length at most two over the algebras B and C, respectively. The twisting module TM(X(α), Y(β)) can alternatively be constructed by using such extensions in the following way. Adjointness provides us with an isomorphism Hom_B(L(B, l), Δ(A, j)) ≃ Hom_A(Δ(A, l), Δ(A, j)),
hence the first space equals k if and only if l = j, and is zero otherwise. This implies that as a B-module, Δ(A, j) has exactly one copy of L(B, j) in the socle. The inclusion produces a map ϕ : Ext^1_B(−, L(B, j)) → Ext^1_B(−, Δ(A, j)) which sends α to ϕ(α). Exactness of induction yields a canonical isomorphism Ext^1_B(L(B, i), −) ≃ Ext^1_A(Δ(A, i), −). Composing ϕ with this isomorphism, α defines an extension 0 → Δ(A, j) → E(α) → Δ(A, i) → 0 whose middle term decomposes if and only if α is zero. The middle term E(α) is induced from the B-module X(α) of length at most two (containing L(B, j) in the socle) defined by α. By duality, β produces an extension 0 → ∇(A, l) → F(β) → ∇(A, j) → 0. The module F(β) restricted to B is the direct sum of two indecomposable injective B-modules ∇(A, j) and ∇(A, l). Hence Hom_A(E(α), F(β)) ≃ Hom_B(X(α), F(β)) contains in particular the map which gives the injective envelope of L(B, j) ⊂ X(α) → ∇(A, j) (in case α = 0, that is, when X(α) decomposes, one has to change the notation here). Let ψ be the A-homomorphism defined by this injective envelope via the above isomorphism of homomorphism spaces. Then the twisting module TM(α, β) is just the image of ψ in F(β). It coincides of course with the twisting module associated with X(α) and Y(β). If α is zero, the twisting module is a quotient of Δ(A, j). If β is zero, the twisting module is a submodule of ∇(A, j). If both α and β are zero, then the twisting module equals L(A, j). Immediately from the construction of TM(α, β) and properties of Δ(A, i) and ∇(A, l) we get:

Proposition 3.1. The twisting module TM(α, β) has top L(i) and socle L(l).

For blocks of category O, and some related algebras (all with triangular decomposition), such special twisting modules can be constructed in an alternative way using Jantzen’s translation functors [10, 11], as will be indicated below (in the section on Temperley–Lieb algebras).
They play a role in Kazhdan–Lusztig theory, since subtle information like the validity of Vogan’s conjecture depends on them (see [3] for a treatment of such questions within the theory of quasi-hereditary algebras). We do not go into detail here.

The other kind of twisting module which we need is the following: Pick an index i, let I(B, i) be the injective B-module with socle L(B, i) and let M(i) be the unique maximal submodule of I(B, i) with Loewy length at most two. Similarly, let N(i) be the unique maximal quotient of the indecomposable projective C-module C(i) with Loewy length at most two. As before we construct the twisting module TM(M(i), N(i)), denote it by TM(i) and call it the universal twisting module at i. The module M(i) contains any other M which we may use for constructing a twisting module, and N(i) maps onto any other N which we may use. Thus we get:

Proposition 3.2. Each twisting module is a subquotient of a universal twisting module TM(i) for some (unique) index i. The top and the socle of the universal twisting module TM(i) are both isomorphic to a direct sum of simple modules ⊕_j L(j), where j runs through the indices with Ext^1_B(L(j), L(i)) ≠ 0, with multiplicity given by the dimension of this extension space.
4. Twisting Relations

The construction of algebras with triangular decomposition given in Sect. 6 relies very much on what we call twisting relations. They are defined as follows: Let A be an algebra with a fixed triangular decomposition A ≃ C ⊗_S B (with B ≃ C^op). Then A is generated (as a k-vector space) by elements of the form c ⊗ b, where c ∈ C and b ∈ B. Products (1 ⊗ b) × (c ⊗ 1) (with b and c running through sets of generators of rad(B)/rad^2(B) and rad(C)/rad^2(C), respectively) can be rewritten as linear combinations of basis elements c′ ⊗ b′; thus there are relations (1 ⊗ b) × (c ⊗ 1) = Σ_i c_i ⊗ b_i, which we call twisting relations. In contrast to the relations in C and B, which are of course satisfied in A, these twisting relations do not only depend on C and B. They contain more subtle information: indeed, two non-isomorphic algebras A and A′ may share a common triangular decomposition, in the sense that there are directed algebras C and B such that there are triangular decompositions A ≃ C ⊗ B ≃ A′. The difference between A and A′ is to be found in the twisting relations.

Let C (and thus B ≃ C^op) be given by quiver and relations. To fix the twisting relations in A it is enough to define the multiplication of a B-arrow with a C-arrow. Since most of the examples we want to cover are quadratic algebras, we will always use quadratic twisting relations, which rewrite such a product as a linear combination of products of C-arrows with B-arrows. Which products occur with non-zero coefficient in such a linear combination? The next proposition will tell us that this can be read off from twisting modules associated with arrows. Twisting modules do not depend on the choice of a presentation of the algebra A by quiver and relations. Thus we may choose a special such presentation.
In many applications one can choose a presentation of A in such a way that B equals A^+ (the subalgebra of A generated by all vertices and the arrows i → j with j > i) and C equals A^− (the subalgebra of A generated by all vertices and the arrows i → j with j < i). We always assume such a presentation to be admissible (that is, the ideal of relations contains only linear combinations of paths of length at least two). This situation occurs for example for blocks of category O and also for Temperley–Lieb algebras, as we will see later on; however, it does not always occur for Schur algebras.

Proposition 4.1. Suppose A is basic and has a triangular decomposition A ≃ C ⊗_S B with B = A^+ and C = A^−.
(a) The twisting relations for A are quadratic if and only if all twisting modules TM(α, β) (where α and β independently run through all B-arrows and C-arrows, respectively) have Loewy length at most three.
(b) Suppose all twisting relations are quadratic. Let i < j > l. Then there exists a B-arrow 0 ≠ α ∈ Ext^1_B(L(j), L(i)) and a C-arrow 0 ≠ β ∈ Ext^1_C(L(l), L(j)) such that (1 ⊗ α) × (β ⊗ 1) : i → j → l occurs with non-zero coefficient in the twisting relation of i → h → l if and only if L(h) is a composition factor of TM(α, β).

We keep the assumptions of the proposition. Much more information can be read off from the universal twisting module at i: For each triple j, l, h with j < l > h and j > i < h such that there are arrows j → l and l → h there is a twisting relation rewriting the path j → l → h. The path j → i → h occurs on the right hand side of this relation with a coefficient, say λ(j, l, h) ∈ k. Thus for fixed l we get a matrix (λ(j, l, h))_{j,h} which we call the ith twisting matrix at l (compare [5] for a special case of twisting matrices).
Proposition 4.2. Suppose A is basic, has a triangular decomposition A ≃ C ⊗_S B with B = A^+ and C = A^−, and all twisting relations are quadratic. Fix two indices i and l and choose j such that i < j < l and there is a B-arrow j → l and a C-arrow j → i. Let m be the number of all such paths (where j varies). Then L(l) occurs as a composition factor with multiplicity m of the universal twisting module TM(i) modulo its socle if and only if the ith twisting matrix at l is invertible.
5. The Strong Symmetry Condition

Now we can formulate our strong symmetry condition.

Definition 5.1. Suppose the algebra A is basic, has a triangular decomposition A ≃ C ⊗_S B with B = A^+ and C = A^−, and all twisting relations are quadratic. Then A is said to have strong symmetry if and only if (for all indices) all twisting matrices are invertible.

Our results in the previous section now translate as follows:

Corollary 5.1. Let A be as above and have vanishing radical cube. Then A has strong symmetry if and only if each universal twisting module is both induced and coinduced.

Note that the entries of the twisting matrices depend on the choice of generators of A. We have, however, shown that the invertibility of the matrices is an invariant of the algebra A, once the triangular decomposition is fixed.
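Once the twisting matrices of a concrete algebra have been written down, the condition of Definition 5.1 is a finite invertibility check. The following sketch is illustrative only: it works over the rational numbers as a stand-in for the field k, and the names `is_invertible` and `has_strong_symmetry`, the keying of matrices by pairs (i, l), and the sample matrices are all assumptions introduced for the example.

```python
from fractions import Fraction


def is_invertible(matrix):
    """Decide invertibility of a square matrix by exact Gaussian
    elimination over the rationals (a stand-in for the field k)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n = len(rows)
    if any(len(row) != n for row in rows):
        return False  # non-square matrices are never invertible
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero pivot.
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return False
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return True


def has_strong_symmetry(twisting_matrices):
    """twisting_matrices maps a pair (i, l) to the i-th twisting matrix
    at l; strong symmetry asks that every one of them be invertible."""
    return all(is_invertible(m) for m in twisting_matrices.values())


# Hypothetical data: two invertible twisting matrices ...
symmetric_case = {(1, 3): [[1]], (2, 4): [[1, 2], [3, 4]]}
# ... and a family containing a singular one.
singular_case = {(1, 3): [[1]], (2, 4): [[1, 2], [2, 4]]}
```

For the sample data, `has_strong_symmetry(symmetric_case)` holds and `has_strong_symmetry(singular_case)` fails. Since invertibility does not depend on the choice of generators, the outcome of such a check is an invariant of the algebra with its fixed triangular decomposition, even though the matrix entries are not.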
6. A General Construction

The definition of M-twisted double incidence algebras of posets given in [5] can be generalized to any directed algebra given by a quiver with relations. In this section we restate this definition and show how algebras with triangular decomposition and vanishing radical cube fit into this construction. In order to cover algebras with multiple arrows we have to change the terminology used in [5] a bit. We note that our assumption on the vanishing of the radical cube (although we will see that it covers several interesting examples) is rather restrictive if one considers all the examples mentioned in the introduction. For example, for blocks of category O, vanishing radical cube should not be expected in the case of sl_3 or any larger example (except for “very singular” blocks).

We assume that we are given an algebra C defined by the quiver Q = (Q_0, Q_1) and certain relations which we do not have to specify. We denote by Q̄ the underlying graph of the quiver Q and by Q^op the opposite quiver of Q, the arrows of which will be denoted by α′ if α is an arrow in Q. For an arrow α we denote by s(α) and t(α) the starting vertex and the terminal vertex, respectively.

First we define twisting labels. Assume we are given an arrow α : w → z in Q and an arrow β : z → x in Q^op. Let y_i run through all vertices (with multiplicities if multiple arrows occur) such that there is a pair of arrows w → y_i in Q^op and y_i → x in Q. With each such pair of arrows we associate an element l(α, β, y_i) ∈ k, the label. The collection of all labels is called the labelling M of C.
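The index set over which the labels l(α, β, y_i) range is determined by the quiver alone and can be listed mechanically. The sketch below is an illustration under stated assumptions: arrows of Q are encoded as triples (name, source, target), an arrow z → x of Q^op is represented by the underlying arrow x → z of Q, and the function name `label_index_sets` and the sample quiver are hypothetical.

```python
def label_index_sets(arrows):
    """For each arrow alpha: w -> z of Q and each arrow z -> x of Q^op
    (given by its underlying arrow beta: x -> z of Q), list the pairs
    (gamma, delta) of arrows of Q with gamma: y -> w and delta: y -> x.

    Each such pair corresponds to a path w -> y in Q^op followed by
    y -> x in Q, hence to one label l(alpha, beta, y)."""
    result = {}
    for a_name, w, z in arrows:
        for b_name, x, b_target in arrows:
            if b_target != z:
                continue  # beta must end where alpha ends
            pairs = [
                (g_name, d_name)
                for g_name, y, g_target in arrows if g_target == w
                for d_name, d_source, d_target in arrows
                if d_target == x and d_source == y
            ]
            result[(a_name, b_name)] = pairs
    return result


# Hypothetical linear quiver 1 -> 2 -> 3 with arrows a and b.
linear_quiver = [("a", 1, 2), ("b", 2, 3)]
linear_labels = label_index_sets(linear_quiver)
```

For this sample quiver the pair (a, a) has an empty index set, since no arrow ends at vertex 1, while the pair (b, b) has the single entry (a, a), corresponding to the vertex y = 1. A labelling M then amounts to choosing one scalar in k for every listed pair.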
Definition 6.1. Let C be a directed algebra with a labelling M. We define a new algebra A(C, M) which is given by the quiver obtained from Q and the opposite quiver Q^op by forming the union of the two quivers, identifying the vertices i in Q and i in Q^op, and imposing the following three types of relations: (1) the relations of the algebra C; (2) the relations of the algebra C^op; and (3) the twisting relations: for each pair α, β in Q with t(α) = t(β), put the relation

αβ′ = Σ_i l(α, β, y_i) γ′δ,

where the summation runs over all y_i such that there are arrows γ : y_i → s(α) and δ : y_i → s(β) in the quiver Q.

It is clear that this new algebra A(C, M) is a finite dimensional k-algebra, and we call it the M-twisted double of C. The construction is made in order to produce algebras with a triangular decomposition C ⊗ C^op. However, in general A(C, M) need not have this property (see [5]). In fact, the multiplication map C ⊗_S C^op → A(C, M) is always surjective, but not always injective. However, in the situation we are interested in, this problem does not arise.

Proposition 6.1. Let C be a basic directed algebra with rad^2(C) = 0 and maximal semisimple subalgebra S. Then for any chosen labelling M, the M-twisted double A of C has a triangular decomposition A ≃ C ⊗_S C^op. Moreover, A has vanishing radical cube and quadratic twisting relations. Conversely, any basic algebra A with a triangular decomposition A ≃ C ⊗_S C^op, quadratic twisting relations, and vanishing radical cube is of this form.

7. Temperley–Lieb Algebras

Temperley–Lieb algebras A_n(δ) (where n is a natural number and δ ≠ 0 a field element) can be defined as quotients of Hecke algebras of type A_n, or as algebras generated by diagrams (that is, homotopy classes of n non-intersecting strings, which are multiplied by concatenation, where the parameter δ is used). For more information the reader is referred to [19] (chapters 7 and 9), [9] or [23]. We classify the blocks of these algebras in order to provide examples of algebras as defined in the previous section. This reproves a result of Westbury [23] (and closes a gap in the proof given in [23]). Alternatively, all the results in this section can be read off from the explicit results in P. Martin’s book [19] (see in particular the main theorem in Sect. 7.3 of [19], of which our corollary below is just a reformulation).¹

Proposition 7.1.
For any n and any δ ≠ 0, the Temperley–Lieb algebra A_n(δ) has the following properties:
(a) It has vanishing radical cube.
(b) Each standard module Δ(i) and each costandard module ∇(i) is either simple or has length two.
(c) It has a triangular decomposition with respect to a total order ≤ which will be denoted 1 < · · · < m, where m is the number of simple modules.

¹ We are grateful to the referee for pointing out this reference.
It is convenient now to change the notation for simple modules and to denote the simple modules of a fixed block by indices 1, . . . , m for a certain natural number m. According to [19] and to [9], 2.2 and 2.3, the indices occurring in a block form an orbit under the action of the affine Weyl group of type A_1^(1). We will use this fact in the proof of the next proposition when translating the assertion to be shown there into an assertion on algebraic groups. Note that Goodman and Wenzl [9] give precise rules for distributing the simple modules among the blocks. We do not repeat these rules here. The proposition gives us precisely the Cartan matrices of the blocks. In contrast to an assertion in [23] this is however not enough to classify the blocks. There is still a choice inside a finite set of non-isomorphic algebras (depending on whether in the twisting relations the coefficients scalar(i) mentioned in the proof are zero or not). This problem is solved by the next proposition. We give an overkill proof for the proposition, which has the advantage of showing many other natural examples of twisting relations with non-zero coefficients.

Proposition 7.2. Let A be a block of a Temperley–Lieb algebra, and i an index which is not minimal. Then the twisting module TM(i, i + 1, i) := TM(α_{i+1}, β_{i+1}) has length four, and contains a composition factor L(i − 1). In particular, the strong symmetry condition is satisfied.

As a consequence we get the classification of blocks:

Corollary 7.3. A block of the Temperley–Lieb algebra is Morita equivalent to an algebra with the following quiver (the natural number m ≥ 1 depending on the block):

A: • ⇄ • ⇄ • ⋯ • ⇄ • (consecutive vertices are joined by a pair of opposite arrows, labelled α_1, α_2, . . . , α_m and β_1, β_2, . . . , β_m)

modulo the ideal generated by the following relations: β_1 · α_1 = 0 and, for each i with 1 ≤ i ≤ m − 1: α_i · α_{i+1} = 0, β_{i+1} · β_i = 0, β_i · α_i = α_{i+1} · β_{i+1}.

As B.
Westbury told us (private communication), another way of closing the gap in [23] is to use the double centralizer property which relates blocks of the Temperley–Lieb algebras to (well-known) blocks of U_q(sl(2)). In fact, the indecomposable projective modules of the blocks of U_q(sl(2)) are known, and the double centralizer property implies that the Temperley–Lieb algebra is the endomorphism ring of certain modules over these blocks. We note that the same structure of blocks occurs for q-Schur algebras of finite representation type [24].
8. Characteristic Tilting Modules as Induced Modules and Ringel Duals

For any quasi-hereditary algebra (A, ≤) Ringel has shown in [21] that for each i (indexing simple modules) there exists a unique indecomposable module T(i) which is both Δ-good and ∇-good such that all standard modules Δ(j) occurring in a Δ-good filtration of T(i) satisfy j ≤ i, and Δ(i) occurs precisely once in this filtration. The direct sum T = ⊕_i T(i) is a tilting module, that is, it does not admit self-extensions and it generates
a certain derived category. The module T is called the characteristic tilting module of (A, ≤). In this section we show for the algebras we are studying that T(i) is an induced module of the form A ⊗_B I(B, i), where I(B, i) is indecomposable injective as a B-module (hence isomorphic to ∇(A, i)). We also compute the endomorphism ring of T, which is usually called the Ringel dual of A, and which has been shown by Ringel [21] to be quasi-hereditary. We note that over an algebra with triangular decomposition and singular twisting matrices, the module T(i) need not be an induced module and the Ringel dual may have a different structure (for a study of such situations see [6]).

Theorem 8.1. Let A be basic with a triangular decomposition A ≃ C ⊗_S B and B = A^+ and C = A^−. Assume that rad^2(C) = 0 and that all twisting relations are quadratic. Let I(B, i) be the indecomposable injective B-envelope of L(i). Then the following assertions are equivalent:
(I) The induced module A ⊗_B I(B, i) is both Δ-good and ∇-good, and indecomposable (hence a direct summand of the characteristic tilting module of A).
(II) All ith twisting matrices are invertible.
(III) The induced module A ⊗_B I(B, i) is isomorphic to the coinduced module Hom_C(A^op, C(i)).
Note that these conditions are satisfied for Temperley–Lieb algebras. By definition, the Ringel dual of (A, ≤) is the endomorphism ring of the characteristic tilting module of (A, ≤). (This is given only up to Morita equivalence; we choose the basic algebra in this class.)

Theorem 8.2. Let A be basic with a triangular decomposition A ≃ C ⊗_S B with B = A^+ and C = A^−. Assume that rad^2(C) = 0 and that all twisting relations are quadratic. Assume moreover that A satisfies the strong symmetry condition. Then the Ringel dual R(A) has a triangular decomposition R(A) = B ⊗_S C with C and B as before and satisfying C = R(A)^+ and B = R(A)^−. Moreover, for each B-arrow α : i → j the twisting module X = TM(α, α^op) over R(A) has Loewy length three with rad(X)/soc(X) being semisimple of the form L(j) ⊕ ⊕_l L(l), with l running through the sinks of C-arrows i → l (with multiplicities).

9. Quadratic Duals

Quadratic duals and Ext-algebras are as interesting as Ringel duals of quasi-hereditary algebras. In this section we determine the quadratic duals of algebras with triangular decomposition having vanishing radical cube, and we observe that the quadratic duals are precisely the Ext-algebras. It turns out that taking quadratic duals commutes with forming twisted doubles in the following sense: The quadratic dual of an algebra with radical square zero is hereditary and vice versa. Now we show that the quadratic dual of the twisted double of a radical square zero algebra is the twisted double of a hereditary algebra. An application of Bergman’s diamond lemma [1] yields the first part of the following proposition. The second part is a special case of the main result in [17]; it follows also from Sect. 3 of [5].

Proposition 9.1. The twisted double of a hereditary algebra (with any labelling) is quasi-hereditary, hence it has a triangular decomposition. Moreover, it has global dimension two if the hereditary algebra is not semisimple (and zero otherwise).
Since quadratic algebras of global dimension two are Koszul, the Ext-algebra of such a twisted double of a hereditary algebra coincides with the quadratic dual. The quadratic dual itself has the following description (which is proved by direct computation as in Sect. 2 of [5]). Here, M^t denotes the transpose of the matrix M. The matrix −M is obtained from M by multiplying each entry by −1.

Proposition 9.2. Let C be finite dimensional hereditary and M any labelling. Then the quadratic dual of the M-twisted double of C is the (−M^t)-twisted double of C^op/rad^2(C^op).

This also determines the quadratic dual (coinciding with the Ext-algebra) of algebras with triangular decomposition and vanishing radical cube.

10. Proofs

This section contains the proofs of all the results mentioned before. In each case we keep the notation of the result being proved without repeating it explicitly.

Proof of 4.1. The assumption on A implies that all paths occurring in twisting relations have length at least two (otherwise the presentation of A would not be admissible). Fix α ∈ Ext^1_B(L(j), L(i)) and β ∈ Ext^1_C(L(l), L(j)). Let P be the (indecomposable) projective cover of the twisting module TM(α, β). Using the twisting relation as in (b) we write i → h → l as a(αβ) + γ, where a is a scalar (possibly zero) and γ is a linear combination of paths different from αβ. The projective cover map sends the element i → h → l of P to an element of TM(α, β). Now the residue class of αβ in the twisting module being non-zero implies the residue class of i → h → l to be non-zero, which means that there is a composition factor L(h) (generated by the residue class of i → h) in the twisting module. Conversely, if there is no such composition factor, the residue class of i → h → l and hence the coefficient of αβ in the twisting relation must be zero for all choices of α and β.
Similarly, a relation involving paths of length greater than two produces a uniserial submodule of the twisting module, which is of Loewy length greater than two.

Proof of 4.2. Let M(B, i) be the unique largest submodule of I(B, i) having Loewy length two. Each triple i, j, l as in the assumption gives rise to a composition factor L(j) in A ⊗_B M(B, i), and there are precisely m such composition factors. Thus the question is which of these composition factors are in the image of the map ψ : M(i) → N(i). Write A ⊗_B M(B, i) as a quotient of a direct sum of indecomposable projective modules. Assume there is an element j_1 → l in this sum (a linear combination of arrows) such that the images in TM(i) of all products j_1 → l → j_2 with a certain arrow l → j_2 (with j_2 < l being the starting vertex of an arrow j_2 → i and j_2 < i) are mapped to zero by ψ. Considering the twisting module associated with the element j_1 → i in B and the element i → j_2 in C we conclude (by the previous proposition) that in the twisting relation for j_1 → l → j_2 the element j_1 → i → j_2 occurs with coefficient zero. But this element is a linear combination of rows of the ith twisting matrix at l, which thus must be singular. Conversely, if the twisting matrix is singular we can in the same way find such an element which is mapped to zero by ψ.

Proof of 6.1. Considering k-dimensions one sees that A is quasi-hereditary if and only if it has a triangular decomposition. Since the multiplication map C ⊗_S C^op → A always
is surjective, it suffices to show that A has the same k-dimension as C ⊗_S C^op. The diamond lemma [1] provides a basis of A (note that because of rad^2(C) = 0 there are no ambiguities to resolve) which proves this assertion. The triangular decomposition and the twisting relations imply that A has vanishing radical cube. Conversely, any algebra with triangular decomposition and quadratic twisting relations can be written as A(C, M). The vanishing of the radical cube of A implies the vanishing of the radical square of C.

Proof of 7.1. In [19] and in [9] (Sect. 2) it is shown that Temperley–Lieb algebras have vanishing radical cube. Thus (a) holds true. We prove (b) by using certain facts shown in [19] and in [9]. A more computational approach to the statement is contained in [23]. In Theorem 2.3 of [9] (see also the discussion preceding that theorem) the radical of a Temperley–Lieb algebra is described as a bimodule over certain simple subalgebras. This description implies that all multiplicities [Δ(i) : L(j)] must be zero or one. In [9] the Bratteli diagrams of the inclusions Λ = A_n(δ) ⊂ Γ = A_{n+1}(δ) (if they are semisimple) are given. The partial order of partitions indexing simple modules in these diagrams is the same as the partial order used for defining quasi-heredity. Since standard modules are defined combinatorially and coincide with simple modules in the semisimple case, the Bratteli diagrams carry information on the general case, too. More precisely, from the Bratteli diagram one reads off the following facts: If Δ(Γ, 1) is restricted to Λ, then it coincides with Δ(Λ, 1). If for 1 < i < n the standard module Δ(Γ, i) is restricted to Λ, then it has the same composition factors as Δ(Λ, i − 1) and Δ(Λ, i) together. For i = n one of the two preceding situations occurs (depending on n being even or odd).
Now proceeding inductively (first on the index n of the algebra A_n(δ), then on the index of simple modules) one checks that each standard module is either simple or of length two: This is clear for Δ(1), which is simple because of A_n(δ) being quasi-hereditary. Restricting Δ(2) gives a module which has length three or two. In the first case the above statement on composition multiplicities zero or one implies that L(2), if restricted, has length two. Hence Δ(2) (unrestricted) has length one or two. Next, Δ(3) restricted has length between two and four. Length two is no problem. Length three splits into two cases; in one of them – one factor with multiplicity two – the fact that [Δ(3) : L(2)] is zero or one implies that L(3), if restricted, must have length two; in the other case – three different factors – we are in the first case we had for Δ(2), which again implies what we want. Finally, length four implies that Δ(2) restricted has length two, which implies that L(2) restricted has length two. Then L(3) restricted must contain the other copy of the second Λ-simple.

This shows that the basic algebras of the blocks of the Temperley–Lieb algebra have quivers which are of the following form:

A: • ⇄ • ⇄ • ⋯ • ⇄ • (consecutive vertices are joined by a pair of opposite arrows, labelled α_1, α_2, . . . , α_m and β_1, β_2, . . . , β_m)

Moreover, because of the assertion in (b), the following relations must be satisfied: α_i α_{i+1} = 0 and β_{i+1} β_i = 0 (for all 1 ≤ i ≤ m − 1). Also, there are relations α_{i+1} β_{i+1} = scalar(i) β_i α_i for all 1 ≤ i ≤ m − 1, where scalar(i) (depending on i) is zero or one. This does not fix the algebra up to isomorphism, but all algebras having this quiver and satisfying these relations have triangular decompositions with B being generated
by the vertices and the arrows αi and C being generated by the vertices and the arrows βi . This proves (c). Proof of 7.2. It is enough to show that there is an indecomposable module M of length four having top and socle L(i) and two more composition factors L(i − 1) and L(i + 1). The Temperley–Lieb algebra is a quotient of a Hecke algebra, and the quotient map is generic, that is, it is defined independently of the value of the parameter q (that is, of δ). In the semisimple case, the Temperley–Lieb algebra is the direct factor of the Hecke algebra such that its simple modules are associated with partitions with at most two parts. Thus in general, the Temperley–Lieb algebra is the largest quotient of the Hecke algebra which has precisely these simple modules. Hence it is enough to show that such a module exists for the Hecke algebra. The Hecke algebra can be written as eAe, where A is a q-Schur algebra. Therefore it is enough to construct an analogue of M over the q-Schur algebra (the Schur functor e · − then will send this module to M since it preserves simple modules and it preserves top and socle). Now M exists if an analogue exists over the classical Schur algebra, since finite dimensional modules can be quantized. But over the classical Schur algebra, such a twisting module can be constructed using Jantzen’s translation functor (see [11], 7.19). Here we have to use the parametrization of simple modules in the block of the Temperley–Lieb algebra as orbits under the action of the affine Weyl group as given in [9]. Proof of 8.1. The induced and the coinduced module appearing in (III) have the same composition factors with the same multiplicities, hence the same dimension. 
Since the universal twisting module at i, TM(i), has the same socle as the coinduced module and the same top as the induced module it follows that (III) holds if and only if the map ψ provides an isomorphism between the induced module and TM(i) if and only if TM(i) coincides with the coinduced module. Thus the equivalence of (II) and (III) follows from Proposition 4.2. Suppose (II) and (III) are satisfied. The triangular decomposition implies the exactness of A ⊗_B −, and this induction functor sends simple modules to standard modules. Hence A ⊗ I(B, i) is Δ-good. Moreover, I(B, i) ≅ ∇(A, i) has Loewy length two, socle L(i) and top ⊕_l L(l), where l runs through the sinks of the C-arrows i → l. Hence A ⊗ I(B, i) occurs in an exact sequence of A-modules (∗) 0 → Δ(i) → A ⊗ I(B, i) → ⊕_l Δ(l) → 0. Condition (III) implies that A ⊗_B I(B, i) is ∇-good, too, and it equals TM(i), the universal twisting module at i. In particular, the socle of A ⊗ I(B, i) equals the socle of TM(i) which is the socle of Δ(i). This implies that A ⊗ I(B, i) is indecomposable. In fact, in a direct sum decomposition A ⊗ I(B, i) = X ⊕ Y, one summand, say X, must contain the unique composition factor L(i), hence X contains the submodule Δ(i) which is generated by that composition factor, thus it contains all of soc(A ⊗ I(B, i)), and Y equals 0. Conversely, assume (I) is satisfied. Because of the exact sequence (∗), the socle of A ⊗ I(B, i) contains the socle of Δ(i). Hence for each direct summand L(j) of the socle of Δ(i), there must be a factor ∇(j) in the ∇-good filtration of A ⊗ I(B, i). Counting composition factors shows that A ⊗_B I(B, i) modulo these factors must be isomorphic
to ∇(i). Hence the socle of A ⊗ I(B, i) equals the socle of Δ(i) which implies that A ⊗ I(B, i) is contained in, hence equal to, the coinduced module.

Proof of 8.2. From Theorem 8.1 we know that the characteristic tilting module of A is ⊕_i T(i), where by definition T(i) := A ⊗_B I(B, i). By adjointness we have that as a vector space Hom_A(A ⊗_B I(B, i), A ⊗_B I(B, j)) is isomorphic to Hom_B(I(B, i), A ⊗_B I(B, j)). Moreover, TL(j) = A ⊗_B I(B, j) as a B-module is the direct sum of the injective B-modules ∇(j) ⊕ ⊕_l ∇(l), where l runs through the sinks of C-arrows j → l (with multiplicities). Now the B-homomorphisms ∇(i) → ∇(l) are precisely all the linear combinations of B-arrows i → l. Hence the following maps form a k-basis of the algebra R(A) (where i and j independently run through all indices):
(1) the identity map on T(i),
(2) the map T(i) → T(j) defined by a C-arrow i → j: it sends ∇(i) to ∇(j) via this arrow (hence i > j),
(3) the map T(i) → T(j) defined by a B-arrow i → j: it sends ∇(i) identically to ∇(i) (which occurs in the ∇-good filtration of T(j)) (hence i < j),
(4) the map T(i) → T(j) defined by a B-arrow h → j and a C-arrow i → j as a composition of maps as in (2) and (3).
The maps in (2) and (3) are irreducible in the sense that they cannot be written in a non-trivial way as a linear combination of products of maps in this list, whilst the maps of type (4) can. The maps of type (2) and (3) define the arrows in the quiver of R(A). Moreover, via the maps in (2) and (3), the algebras B and C become subalgebras of R(A) which intersect in S (the maximal semisimple subalgebra of R(A), which is generated by the maps of type (1)). By [21], the Ringel dual is quasi-hereditary with respect to the partial order ≥ opposite to the given one ≤. Its standard modules are of the form Hom_A(T, ∇(i)). This (or the above basis) implies that B ⊗_S C is a triangular decomposition of R(A).
From the above basis we read off that the Ringel dual has quadratic twisting relations again (and has vanishing radical cube). In order to prove the statement on twisting modules, it is enough to show the following: Let α = βγ be the composition of two non-zero maps β : T(i) → T(l) (with l > i) of type (3), and γ : T(l) → T(i) of type (2). Then α is not zero. This is because the first map, β, sends ∇(i) isomorphically to a certain copy of ∇(i) occurring in a ∇-good filtration of T(l). Hence it sends top(T(i)) = top(∇(i)) to the top of this copy of ∇(i). The second map, however, sends top(∇(i)) isomorphically to soc(Δ(i)) (here we use the Δ-filtrations of T(l) and of T(i)).

Acknowledgement. Both authors would like to thank C.M. Ringel for his hospitality during their stay in Bielefeld. C.C. Xi is partially supported by NSF of China.
References
1. Bergman, G.: The diamond lemma for ring theory. Adv. Math. 29, 178–218 (1978)
2. Cline, E., Parshall, B. and Scott, L.: Finite dimensional algebras and highest weight categories. J. Reine Angew. Math. 391, 85–99 (1988)
3. Cline, E., Parshall, B. and Scott, L.: Abstract Kazhdan–Lusztig theory. Tohoku Math. J. 45, 511–534 (1993)
4. Deng, B.M. and Xi, C.C.: Quasi-hereditary algebras which are dual extensions of algebras. Comm. Alg. 22, 4717–4736 (1994)
5. Deng, B.M. and Xi, C.C.: Quasi-hereditary algebras which are twisted double incidence algebras of posets. Contrib. Algebra and Geom. 36, 37–72 (1995)
6. Deng, B.M. and Xi, C.C.: Ringel duals of quasi-hereditary algebras. Comm. Alg. 24, 2825–2838 (1996)
7. Dyer, M.: Kazhdan–Lusztig–Stanley polynomials and quadratic algebras I. Preprint (1992)
8. Dyer, M.: Algebras associated to Bruhat intervals and polyhedral cones. In: Finite dimensional algebras and related topics. NATO ASI series 424, Dordrecht: Kluwer, 1994, pp. 95–121
9. Goodman, F.M. and Wenzl, H.: The Temperley–Lieb algebra at roots of unity. Pac. J. Math. 161, 307–334 (1993)
10. Jantzen, J.C.: Einhüllende Algebren halbeinfacher Lie-Algebren. Berlin–Heidelberg–New York: Springer, 1983
11. Jantzen, J.C.: Representations of algebraic groups. London–New York: Academic Press, 1987
12. Kauffman, L.H.: Knots and physics. Singapore: World Scientific, 1991
13. Khoroshkin, S.: D-modules over the arrangements of hyperplanes. Comm. Alg. 23, 3481–3504 (1995)
14. König, S.: Exact Borel subalgebras of quasi-hereditary algebras, I. With an appendix by L. Scott. Math. Z. 220, 399–426 (1995)
15. König, S.: Exact Borel subalgebras of quasi-hereditary algebras, II. Comm. Alg. 23, 2331–2344 (1995)
16. König, S.: Strong exact Borel subalgebras of quasi-hereditary algebras and abstract Kazhdan–Lusztig theory. To appear in Adv. in Math.
17. König, S.: On the global dimension of quasi-hereditary algebras with triangular decomposition. Proc. A. M. S. 124, 1993–1999 (1996)
18. König, S.: A criterion for quasi-hereditary, and an abstract straightening formula. Invent. Math. 127, 481–488 (1997)
19. Martin, P.: Potts Models and related problems in Statistical Mechanics. Singapore: World Scientific, 1991
20. Parshall, B. and Scott, L.L.: Derived categories, quasi-hereditary algebras and algebraic groups. Proc. of the Ottawa–Moosonee Workshop in Algebra 1987, Math. Lect. Note Series, Carleton University and Université d'Ottawa (1988)
21. Ringel, C.M.: The category of modules with good filtration over a quasi-hereditary algebra has almost split sequences. Math. Z. 208, 209–223 (1991)
22. Temperley, H.N.V. and Lieb, E.H.: Relations between the "percolation" and "coloring" problem and other graph-theoretical problems associated with regular planar lattices: Some exact results for the "percolation" problem. Proc. Roy. Soc. London A 322, 251–280 (1971)
23. Westbury, B.: The representation theory of Temperley–Lieb algebras. Math. Z. 219, 539–566 (1995)
24. Xi, C.C.: On representation types of q-Schur algebras. J. Pure Appl. Alg. 84, 73–84 (1993)
25. Xi, C.C.: Quasi-hereditary algebras with a duality. J. reine angew. Math. 449, 201–215 (1994)

Communicated by M. Jimbo
Commun. Math. Phys. 197, 443 – 450 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Definition of Chern-Simons Terms in Thermal QED$_3$ Revisited*
S. Deser$^1$, L. Griguolo$^2$, D. Seminara$^1$
$^1$ Department of Physics, Brandeis University, Waltham, MA 02254, USA. E-mail:
[email protected],
[email protected] 2 Center for Theoretical Physics, Laboratory for Nuclear Science and Department of Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA. E-mail:
[email protected]
Received: 28 December 1997 / Accepted: 27 March 1998
Abstract: We present two compact derivations of the correct definition of the Chern-Simons term in the topologically non trivial context of thermal QED$_3$. One is based on a transgression descent from a D = 4 background connection, the other on embedding the abelian model in SU(2). The results agree with earlier cohomology conclusions and can be also used to justify a recent simple heuristic approach. The correction to the naive Chern-Simons term, and its behavior under large gauge transformations are displayed.

Chern-Simons (CS) terms, defined in odd dimension, contain gauge information not accessible through the field strength alone. We have in mind large gauge transformations, ones not shrinkable to the identity, to which CS (but not $F_{\mu\nu}$) is sensitive. However it is also known [1, 2, 3, 4] that, as normally expressed, the abelian CS term (in D = 3, say)

\[ I_{CS} = \frac{1}{8\pi^2} \int A \wedge F \tag{1} \]

is not always well-defined, but requires corrections$^1$. Our twofold aim is to obtain the correct form, in two complementary, compact ways and to show explicitly that "improvement" of $I_{CS}$ is already needed in simple but quite physical contexts, such as abelian U(1) gauge fields in D = 3 with non-trivial topology. A generic example is finite temperature QED$_3$, where t ranges over a finite circle $S^1$ of perimeter $\beta = 1/\kappa T$ and space is a closed 2-manifold $\Sigma^2$ with associated non-vanishing magnetic flux $\Phi = \int d^2x\, B$, $B \equiv \frac{1}{2}\epsilon^{ij}F_{ij} = F_{12}$. [We recall that the flux is a necessarily quantized topological invariant, $\Phi = 2\pi k$; see e.g. [6].] The need to improve the naive $I_{CS}$ is due to the fact that it explicitly involves the vector potential A, "modulated" by the field strength. But presence of magnetic flux implies that A will depend on the patches needed to cover the closed manifold $\Sigma^2$; hence the integral in (1) as it stands will be, unacceptably, patch-dependent. This difficulty has been recognized and cured long ago both in cohomological D = 3 calculations [1, 4] and by descent from D = 4 [2]; recently we have given a heuristic approach to the solution [3]. Improvement of $I_{CS}$ is not merely a mathematical nicety, but has direct bearing on real QED$_3$ questions such as the necessity and amount of quantization of its coefficient when $I_{CS}$ is viewed as a dynamical field action. Our present interest in $I_{CS}$ was aroused by calculations of effective QED actions induced by charged fermions, and the complex of questions raised there about the seeming appearance of induced CS terms and their coefficients [3].

Here we will first present a different route to the (same) correct definition of $I_{CS}$, based on the Chern-Weil theorem using the transgression formula involving a background connection $\hat A_\mu$ on the non-trivial bundle, that compactly replaces the patch-dependence and the associated boundary "counter-terms", by a simpler $\hat A_\mu$-dependent addition. We will then compare this method with the two earlier approaches and use it also to justify the simple-minded "derivation". Finally, we give what is perhaps a still easier definition by use of nonabelian embedding to take advantage of the simpler (!) cohomological properties available there.

* This work is supported by NSF grant PHY93-15811, in part by funds provided by the U.S. D.O.E. under cooperative agreement #DE-FC02-94ER40818 and by INFN, Frascati, Italy.
$^1$ The non-abelian case differs in a number of respects; it will be addressed separately [5].

We begin our analysis from the usual 4-dimensional identity that leads to the introduction of the abelian CS form,

\[ F \wedge F \equiv d(A \wedge F). \tag{2} \]
One cannot apply the Poincaré lemma to this identity when $A_\mu$ is nontrivial, as when it carries a nontrivial magnetic flux through $\Sigma^2$. For then $F \wedge F$, while closed, is not exact; equivalently $A_\mu$ is globally defined on $M_4$ as a connection, but not as a 1-form. We circumvent this obstacle through the Chern–Weil theorem (see e.g., [6]) which states (for our case) that, if $(F, \hat F)$ are field strengths corresponding to two different connections $(A, \hat A)$ on some bundle, then $(F \wedge F - \hat F \wedge \hat F)$ is exact as well. A corollary, the transgression formula, provides the explicit 3-form whose divergence it is:

\[ F \wedge F - \hat F \wedge \hat F = d\left[ (A - \hat A) \wedge (F + \hat F) \right], \tag{3} \]

as is easily verified since the cross-terms on the r.h.s. cancel. We can therefore define $I_{CS}$, also on non-trivial bundles, to be$^2$

\[ \bar I_{CS} \equiv \frac{1}{8\pi^2} \int_{M_4} \hat F \wedge \hat F + \frac{1}{8\pi^2} \int_{\partial M_4} (A - \hat A) \wedge (F + \hat F). \tag{4} \]

The explicit dependence of $\bar I_{CS}$ on $A - \hat A$ ensures that it is globally well defined: recall that a bundle is defined by the gauge transition functions between patches, all connections on the bundle having the same patch behavior. In particular, $A$ and $\hat A$ carry the same flux through $\Sigma^2$; see also the discussion after (6). We digress for a moment to note that appearance of an intrinsic reference background is common in connection with (gauge or gravitational) anomalies in non trivial topologies. What makes $\hat A$ unusual is that while it transforms as a connection when changing patches – so as to neutralize the same behavior in $A$ – we may (and do) choose it not to transform under gauge transformations that affect $A$ only; this too is not unknown, for example in the background field expansion of QFT. These different roles for $A$ and $\hat A$ can be justified in terms of the usual BRST analysis (see e.g., [7]).

$^2$ In general a D = 3 bundle is not a boundary of a 4-bundle but cochains may be required [2]. This complication does not occur in our explicit examples, but can be handled as well.

Returning now to $\bar I_{CS}$, the elegant aspect of (4) is its "covariance" (no patch dependence), paid for by its apparent dependence on the D = 4 background $\hat A$. In the other approaches there is no $\hat A$, but covariance is lost. One gratifying property of (4) is that it immediately reproduces the correct gauge variation of $\bar I_{CS}$ at finite temperature. Under the large gauge transformation $A_0 \to A_0 + 2\pi n/\beta$, $\mathbf{A} \to \mathbf{A}$, $\bar I_{CS}$ changes proportionally to the flux,

\[ \bar I_{CS} \to \bar I_{CS} + \frac{1}{8\pi^2}\, 2\pi n\, [\Phi(B) + \Phi(\hat B)] = \bar I_{CS} + \frac{1}{8\pi^2}\, 2\pi n\, (2\pi k + 2\pi k) = \bar I_{CS} + nk. \tag{5} \]

The variation is double what would be naively expected from (1), where the background contribution $\Phi(\hat B)$ is absent (see also [1]). A related physical issue involves the requirement that the coefficient $\mu$ in $\mu I_{CS}$, viewed as a quantum action, be quantized. The usual argument (using $\mu I_{CS}$) is that its phase exponential (the relevant quantum path integral object) must also be (large-)invariant, so that $\mu I_{CS}$ must vary by $2\pi m$, $m \in \mathbb{Z}$, requiring $\mu/2\pi$ to be even. Instead, (4,5) imply that the parameter $\mu/2\pi$ is any integer [1]. This choice leads to a manifestly invariant complete set of states with all possible (of course integer) fluxes.$^3$

$^3$ It has also been argued that consistency is preserved with a less stringent (but still based on (5)) quantization: specifically, if $\mu/2\pi$ is merely rational, (only) states with the corresponding flux values are allowed [8, 9]; states with vanishing flux are compatible with any value of $\mu$ [8]. Here, the Hilbert spaces are required to carry projective representations of the large gauge group.

Let us next compare this definition of $\bar I_{CS}$ with the direct way of computing the integral $\int F \wedge F$. Here we must specify the embedding space; for simplicity, we take it to be $M_4 = D^2 \times S^2$. [The apparent ambiguity in choice of embedding as well as of connection $A$, but keeping the desired boundary $\partial M_4$ and the desired values of $A$ on it, does not affect $\bar I_{CS}$, the differences being at most integer-valued. For example, different embeddings differ by the Chern class of the manifold obtained by gluing them together [2].] The angles $(\theta, \phi)$ span the 2-sphere $S^2$ while $(r, t)$ are the polar (radial and angular) coordinates parameterizing the disc $D^2$. Our desired 3-space is the boundary $S^1 \times S^2$. A nontrivial gauge connection $A$ on this manifold is then realized by requiring its (integer) flux $\int_{S^2} F$ through $S^2$ to be nonvanishing, entailing nontrivial transition functions between the different charts covering the sphere. At the simplest level, we use two charts, splitting $S^2$ into two caps $H_\pm$ intersecting at some latitude $\theta = \theta_0$ and assign U(1) connection 1-forms to each,

\[ A = A_\pm + d\psi_\pm \quad \text{on } H_\pm. \tag{6} \]

The transition function for nontrivial flux is given by $\exp(i\psi_+) = \exp(ik\phi) \exp(i\psi_-)$, which implies $A_+ - A_- = k\,d\phi$ (i.e. $\Phi = 2\pi k$). Regularity also requires all fields to be periodic in the angular variable t, with period $\beta$. We are ready now to perform the integration, for which we revert to index notation,
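\begin{align*}
4 \int_{D^2 \times S^2} F \wedge F
&= \int_{D^2 \times S^2} dr\, dt\, d\theta\, d\phi\; \epsilon^{\lambda\mu\nu\rho} F_{\lambda\mu} F_{\nu\rho}
 = 2 \int_{D^2 \times S^2} dr\, dt\, d\theta\, d\phi\; \epsilon^{\lambda\mu\nu\rho}\, \partial_\lambda A_\mu F_{\nu\rho} \\
&= 2 \int_{S^1 \times S^2} dt\, d\theta\, d\phi \int_0^1 dr\; \epsilon^{r\mu\nu\rho}\, \partial_r A_\mu F_{\nu\rho}
 + 2 \int_{S^2} d\theta\, d\phi \int_0^1 dr \int_0^\beta dt\; \epsilon^{t\mu\nu\rho}\, \partial_t A_\mu F_{\nu\rho} \\
&\quad + 2 \int_{D^2} dt\, dr \int_{S^2} d\theta\, d\phi\; \epsilon^{i\mu\nu\rho}\, \partial_i A_\mu F_{\nu\rho}, \tag{7}
\end{align*}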
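where $i \equiv (\theta, \phi)$. The first integral in the final equality produces (upon restoring the required normalization) the naive CS action of (1),

\[ I_{CS} \equiv \frac{1}{16\pi^2} \int_{S^1 \times S^2} dt\, d\theta\, d\phi\; \epsilon^{r\mu\nu\rho} A_\mu F_{\nu\rho} \quad (r = 1); \tag{8} \]

the contribution at r = 0 vanishes since A is a regular connection on the disc. The second integral is zero since the integrand is periodic in t. However, the last term

\begin{align*}
2 \int_{D^2} dt\, dr \int_{S^2} d\theta\, d\phi\; \epsilon^{i\mu\nu\rho}\, \partial_i A_\mu F_{\nu\rho}
&= 2 \int_{D^2} dt\, dr \int_{S^2} d\theta\, d\phi\; \epsilon^{ir\nu\rho}\, \partial_i A_r F_{\nu\rho}
 + 2 \int_{D^2} dt\, dr \int_{S^2} d\theta\, d\phi\; \epsilon^{it\nu\rho}\, \partial_i A_t F_{\nu\rho} \\
&\quad + 2 \int_{D^2} dt\, dr \int_{S^2} d\theta\, d\phi\; \epsilon^{ij\nu\rho}\, \partial_i A_j F_{\nu\rho} \\
&= 4 \int_{D^2} dt\, dr \int_{S^2} d\theta\, d\phi\; \epsilon^{ij}\, \partial_i A_j F_{rt} \equiv \Delta, \qquad \epsilon^{ij} \equiv \epsilon^{rtij}, \tag{9}
\end{align*}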
requires a more careful analysis. The first two integrals in the second equality can be dropped, since $A_r$ and $A_t$ define two regular scalar functions on the 2-sphere. The surviving term carries all the non-trivial information. In fact, with our above choice of patches, we have

\[ \Delta = 4 \int_{D^2} dt\, dr \left[ \int_{H_+} d\theta\, d\phi\; \epsilon^{ij}\, \partial_i A_j F_{rt} + \int_{H_-} d\theta\, d\phi\; \epsilon^{ij}\, \partial_i A_j F_{rt} \right]. \tag{10} \]
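Using the Poincaré lemma in each cap then yields

\begin{align*}
\Delta &= 4 \int_{D^2} dt\, dr \int_0^{2\pi} d\phi\, \left[ A_{+\phi} - A_{-\phi} \right] F_{rt}(\theta = \theta_0) \\
&= 4k \int_0^{2\pi} d\phi \int_0^\beta dt\, A_t(\theta = \theta_0, r = 1), \tag{11}
\end{align*}

where we have used $A_{+\phi} - A_{-\phi} = k$, as required by (6). The final result for $\bar I_{CS}$ in this procedure thus reads

\begin{align*}
\bar I_{CS} &= \frac{1}{16\pi^2} \int_{S^1 \times S^2} dt\, d\theta\, d\phi\; \epsilon^{r\mu\nu\rho} A_\mu F_{\nu\rho} + \frac{k}{8\pi^2} \int_0^{2\pi} d\phi \int_0^\beta dt\, A_t(\theta = \theta_0) \tag{12} \\
&\equiv I_{CS} + \frac{\Delta}{32\pi^2}.
\end{align*}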
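As a consistency check on (12), the large-gauge shift quoted in (5) can be reproduced by simple arithmetic. The following Python sketch is illustrative only: the function name and the sample values of n, k and β are assumptions introduced here, and orientation conventions are taken such that both pieces enter with a positive sign. Under $A_t \to A_t + 2\pi n/\beta$ the field strength is unchanged, so only the explicit $A_t$ factors in (12) contribute.

```python
import math

def large_gauge_shift(n, k, beta=1.7):
    """Shift of the two pieces of (12) under A_t -> A_t + 2*pi*n/beta.

    n: winding of the large gauge transformation (integer)
    k: flux number, Phi = 2*pi*k
    beta: perimeter of the thermal circle (drops out of the result)
    """
    delta_at = 2 * math.pi * n / beta   # constant shift of A_t
    flux = 2 * math.pi * k              # integral of F_{theta phi} over S^2

    # Naive term: eps^{r t nu rho} A_t F_{nu rho} contains F_{theta phi} twice
    # (nu <-> rho); the t-integral of the shift gives delta_at * beta.
    naive = (1 / (16 * math.pi**2)) * 2 * (delta_at * beta) * flux

    # Correction term: (k / 8 pi^2) * int_0^{2 pi} dphi int_0^beta dt A_t.
    correction = (k / (8 * math.pi**2)) * (2 * math.pi) * (delta_at * beta)
    return naive, correction

naive, correction = large_gauge_shift(n=3, k=2)
print(naive, correction, naive + correction)
```

Each piece shifts by nk/2 in this bookkeeping, so the naive term alone accounts for only half of the full shift nk, in line with the remark following (5).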
We have dropped the r = 1 argument because henceforth all fields in (12) are the 4-dimensional ones computed on the r = 1 boundary, that is, the 3-dimensional ones. The above route is a realization of the prescription of [2] as well as of the procedures of [4]. Several comments about (12) are in order: (a) It is "small" gauge invariant: in fact the new contribution depends only on the integral of $A_t$ over $S^1$ and this quantity, like the naive CS, is small (but not large) invariant$^4$. (b) As advertised previously, the final result is fundamentally dependent on the patches or more precisely on the specific intersection between different charts. (c) Finally, although (12) seems quite different from (4), the two are actually the same (and of course (12) varies exactly as in (5)). The equivalence can be easily shown by an appropriate choice of the reference connection in (4). Take, for example, $\hat A$ to be any four dimensional connection that reduces on $\partial M_4$ to $(0, \hat{\mathbf{A}})$, where $\hat{\mathbf{A}}$ is the usual instanton of topological charge k on $S^2$; then (4) can be shown to reduce to (12). (d) This also shows that, in (4), the 4D dependent parts of $\hat A$ cancel between the two terms there; its residual lack of invariance under large transformations is purely 3-dimensional.

$^4$ That the final result is not large invariant even though the original 4D integral is manifestly unchanged by all gauge transformations, is traceable to the fact that three-dimensional fields differing by a large transformation are not gauge equivalent as components of four-dimensional fields. In our case one need merely notice that a 3D large transformation affecting the integral of $A_t$ over $S^1$ must alter the flux of the 4D field through the disc. Recall that under $U_n = \exp(2\pi i n t/\beta)$, $A_t \to A_t + 2\pi n/\beta$. In 4D language, this corresponds to sending $A_t \to A_t + 2\pi r n/\beta$, $\mathbf{A} \to \mathbf{A}$, which is not a gauge transformation ($\int dr\, dt\, \Delta F_{rt} = 2\pi n$).

We have just noticed in the descent from D = 4 that the correction $\Delta$ required by the naive $I_{CS}$ is an integral over the intersection of the patches of the transition function modulated by $A_t$. This correction was derived in [1] entirely within D = 3 (and also related there to the above descent method) using the machinery presented in [4]. To accomplish this "intrinsic" process, the cohomological aspects are carried here by the various transition functions of the (generally complicated) overlaps. The extra contributions beyond the sum over the patches $\alpha$ of $\int A_\alpha \wedge F$ in

\[ \int A \wedge F = \sum_\alpha \int A_\alpha \wedge F + \sum_\alpha T_\alpha \tag{13} \]

stand for the various transition region overlap terms required cohomologically to give the improved $\bar I_{CS}$. They in turn are specified by the flux $\Phi$. From this redefinition it is then also possible to read the desired variation of $\bar I_{CS}$ on a large transformation.

All the above routes for defining a correct CS action rely heavily on cohomological machinery. We now relate them to the heuristic, "physical" approach [3] that recasts the naive $I_{CS}$ of (1) into a "maximally" gauge invariant form, discarding any "ill-defined" contribution in the process (specifically in the integrations by parts). To simplify our analysis, we confine ourselves again to the case of $S^1 \times S^2$. It can be shown that there is a gauge reachable by small transformations $U = \exp i\Omega$, $\Omega = \tilde A_0$ (leaving $I_{CS}$ invariant) in which, starting from arbitrary $A_\mu$, the new $A_\mu$ become

\[ A^U_0(t, x) = \frac{1}{\beta} \int_0^\beta dt'\, A_0(t', x) \equiv A_0(x), \tag{14a} \]

\[ \mathbf{A}^U(t, x) = \mathbf{A}(0, x) - \tilde{\mathbf{E}}(t, x), \qquad \tilde{\mathbf{E}} \equiv -\left( \int_0^t dt' - \frac{t}{\beta} \int_0^\beta dt' \right) \mathbf{E}(t', x). \tag{14b} \]
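In terms of these variables, the naive $I_{CS}$ has the form

\begin{align*}
I_{CS} &= 2 \int_0^\beta dt \int d^2x\, \left[ A_0(x) B(t, x) + \epsilon^{ij} \big( \tilde E_i(t, x) + A_i(0, x) \big) E_j(t, x) \right] \\
&= 2 \int_0^\beta dt \int d^2x\, \left[ A_0(x) B(t, x) + \epsilon^{ij} \tilde E_i(t, x) E_j(t, x) + \epsilon^{ij} A_i(0, x)\, \partial_j A_0(x) \right] \\
&= 2 \int_0^\beta dt \int d^2x\, \left[ A_0(x) \big( B(t, x) + B(0, x) \big) + \epsilon^{ij} \tilde E_i(t, x) E_j(t, x) \right], \tag{15}
\end{align*}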
where, in the last term of the second equality, we have used $\mathbf{E}(t, x) = \nabla A_0(x) - \partial_0 \mathbf{A}(t, x)$ and then dropped $\partial_0 \mathbf{A}(t, x)$ by periodicity. In the last equality, we have omitted the boundary term $K \equiv \int d^3x\, \partial_j\, \epsilon^{ij} \big( A_0(x) A_i(0, x) \big)$ coming from the integration by parts,
which is patch-dependent. Surprisingly, the final truncated expression (15) is the correct answer. A quick way of checking this is to choose, in (4), any background $\hat A$ that reduces to $\mathbf{A}(0, x)$ on the boundary. In other words the heuristic approach implicitly promotes $\mathbf{A}(0, x)$ to be our reference connection $\hat A$. However, this simple "derivation" really involves an unjustified choice: the amount of "bad term" that we have to throw away is not uniquely defined. Before integrating by parts, the last term in the second equality of (15) involves $\partial_j A_0(x)$ and so does not depend on the constant part a of $A_0$, while this dependence is restored (by hand) after the integration. This mismatch obviously arises as a consequence of having dropped the specific boundary term K. However, since a part proportional to a is well-defined irrespective of A's jumps, the amount of a that goes into the action or into the boundary contribution cannot be decided merely from requiring a well-defined final result.

Our discussion so far has been entirely abelian. Our final derivation will take advantage of a simplification available in the nonabelian context of simply connected groups such as SU(N), where all D = 3 bundles are trivial (nontrivial bundles, as in SO(3), are discussed in [5]). This implies that there are always gauges in which the connection has no jumps$^5$ and therefore the standard formula

\[ I^{NA}_{CS} = \frac{1}{16\pi^2} \int \mathrm{Tr}\left[ A \wedge dA + \frac{2}{3}\, i\, A \wedge A \wedge A \right] \tag{16} \]

is valid without improvement. This fact is easy to understand in our $S^1 \times S^2$ context because the structure of the transition function between the caps on $S^2$ is necessarily trivial, $\Pi_1(SU(N)) = 0$. Hence there are always sections where A has been trivialized (no jumps) and (16) is applicable. Let us therefore embed our (of course well-defined at the abelian level!) connection A of U(1) in SU(2), by turning it into the SU(2)-valued form $\tilde A \equiv A\sigma_3$. [Obviously embedding in any higher SU(N) is not necessary or useful.]
To remove the discontinuity in $A_\phi$, we have to introduce the – necessarily nonabelian – gauging U, with as usual, $\tilde A^U = U^{-1} \tilde A U - i U^{-1} dU$. For our model, we take for simplicity (rather than "symmetrizing" U's in both patches),

\[ U_+(\theta, \phi) = \sin f(\theta) \cos n\phi\; I + i \sin f(\theta) \sin n\phi\; \sigma_3 + i \cos f(\theta)\; \sigma_2, \qquad U_- = I, \tag{17} \]

where ± refers to the two caps on $S^2$ and $f(\theta)$ is a monotonic regular function such that: $f(\pi/2) = \pi/2$ and $f(0) = f'(0) = f'(\pi/2) = 0$. At this point, $\tilde A^U$ is no longer just along $\sigma_3$ of course and we must keep both terms in (16). The standard gauge transformation rule for $I^{NA}_{CS}$ is [11]

\begin{align*}
I^{NA}_{CS}[\tilde A^U] &= I[\tilde A] + \frac{i}{16\pi^2} \int \mathrm{Tr}\left[ d(\tilde A \wedge dU\, U^{-1}) \right] + w(U), \\
w(U) &= \frac{1}{24\pi^2} \int \mathrm{Tr}\left[ U^{-1} dU \wedge U^{-1} dU \wedge U^{-1} dU \right]. \tag{18}
\end{align*}

For us, $I[\tilde A] = I_{CS}[A]$ is just the naive abelian form (1), so the complete, well-defined, result is to take $\bar I_{CS} = I^{NA}_{CS}$. Next, we observe that the winding number contribution $w(U)$ vanishes since it involves an explicit $\partial_t$ and the U of (17) is time-independent. The equality of the remaining term in (18) with $\Delta$ of (10) is easily verified by direct computation. [In the language of our initial analysis, the role of U is essentially that of the background $\hat A$.]

$^5$ A non-abelian configuration with "abelian" characteristics that lead to apparent definition difficulties for $I^{NA}_{CS}$ is proposed in [10].

Of course, casting an abelian configuration in nonabelian terms does not alter the fundamental differences between them; to avoid confusion we comment on these:
1. The U of (17) is a local gauge function on one patch, not being smooth at the transition circle $\theta = \pi/2$; it is thus not globally defined, as truly nonabelian U's must be.
2. In our abelian case, $w(U)$ vanishes while the divergence term in (18) does not; the opposite holds for true nonabelian configurations. Consequently, the sources of $\mu$-quantization are also not the same, since it is $w(U)$ that requires it in the latter [11]. This difference is itself related to that between their homotopies:

\[ \Pi_3(SU(2)) \neq 0 = \Pi_1(SU(2)), \qquad \Pi_3(U(1)) = 0 \neq \Pi_1(U(1)). \tag{19} \]
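The properties of the gauge function $U_+$ in (17) that are used above can be checked numerically. The sketch below is an illustration, not part of the derivation: the profile $f(\theta) = (\pi/2)\sin^2\theta$ is one hypothetical choice satisfying the stated conditions on the cap $0 \le \theta \le \pi/2$, and the helper names are introduced here. It checks that $U_+$ is unitary with unit determinant, that at $\theta = \pi/2$ it reduces to the abelian transition function $\exp(in\phi\sigma_3)$, and that at $\theta = 0$ it is the constant matrix $i\sigma_2$.

```python
import cmath
import math

def f(theta):
    # Hypothetical profile: f(0) = f'(0) = f'(pi/2) = 0 and f(pi/2) = pi/2,
    # monotonic on [0, pi/2] since f'(theta) = (pi/2) * sin(2*theta) >= 0.
    return (math.pi / 2) * math.sin(theta) ** 2

def u_plus(theta, phi, n):
    """U_+ of (17) as a 2x2 matrix, with sigma_3 = diag(1, -1)
    and i*sigma_2 = [[0, 1], [-1, 0]]."""
    a = math.sin(f(theta)) * math.cos(n * phi)   # coefficient of the identity
    b = math.sin(f(theta)) * math.sin(n * phi)   # coefficient of i*sigma_3
    c = math.cos(f(theta))                       # coefficient of i*sigma_2
    return [[complex(a, b), complex(c, 0.0)],
            [complex(-c, 0.0), complex(a, -b)]]

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def dagger_times(m):
    """Return M^dagger M for a 2x2 complex matrix."""
    return [[sum(m[r][i].conjugate() * m[r][j] for r in range(2))
             for j in range(2)] for i in range(2)]

def max_abs_diff(m1, m2):
    return max(abs(m1[i][j] - m2[i][j]) for i in range(2) for j in range(2))

identity = [[1, 0], [0, 1]]
u = u_plus(theta=0.9, phi=2.3, n=3)
print(abs(det(u) - 1) < 1e-12, max_abs_diff(dagger_times(u), identity) < 1e-12)
```

With this choice of f, the matrix interpolates between a constant at the pole and the winding transition function on the overlap circle, which is the sense in which the nonabelian gauging removes the jump in $A_\phi$.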
Despite the above contrasts, our mapping into nonabelian form has led to the desired well-defined abelian $\bar I_{CS}$!

To summarize, we have been comparing, from different points of view, a set of topological and cohomological issues encountered in the analysis of the abelian CS term at finite temperature. We have shown how transgression naturally allows us to define $\bar I_{CS}$ on nontrivial bundles, which are unavoidable in interesting (non-vanishing flux) configurations, and to easily reproduce its behavior under large gauge changes; we have compared this approach with previous ones given in the literature and also shown it to underlie a simple but correct heuristic definition. Finally, we have availed ourselves of the cohomological properties of simply connected groups by embedding U(1) in SU(2); the resulting nonabelian CS term immediately produced the desired improvement.

References
1. Polychronakos, A.P.: Nucl. Phys. B. 281, 241 (1987)
2. Dijkgraaf, R. and Witten, E.: Commun. Math. Phys. 129, 393 (1990)
3. Deser, S., Griguolo, L. and Seminara, D.: Effective QED Actions: Representations, Gauge Invariance, Anomalies and Mass Expansions. BRX-TH-417, hep-th 9712066; Phys. Rev D15, 57 (1998)
4. Alvarez, O.: Commun. Math. Phys. 100, 279 (1985)
5. Deser, S., Griguolo, L. and Seminara, D.: Finite Temperature Non Abelian Effective Actions. BRX-TH-426 (in preparation)
6. Nakahara, M.: Geometry, Topology and Physics. Bristol: IOP Publishing Ltd, Chap. 11.
7. Manes, J., Stora, R. and Zumino, B.: Commun. Math. Phys. 102, 157 (1985)
8. Polychronakos, A.P.: Phys. Lett. B. 241, 37 (1990)
9. Iengo, R. and Lechner, K.: Phys. Rept. 213, 179 (1992)
10. Jackiw, R. and Pi, S.-Y.: Reducing the Chern-Simons term by a symmetry. MIT-CTP-2696, BU-HEP-97-30, HEP-TH/9712087
11. Deser, S., Jackiw, R. and Templeton, S.: Ann. Phys. 140, 372 (1982)

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 197, 451 – 487 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Orthogonal Polynomials of Types A and B and Related Calogero Models Charles F. Dunkl Department of Mathematics, Kerchof Hall, University of Virginia, Charlottesville, VA 22903-3199, USA. E-mail:
[email protected] Received: 17 October 1997 / Accepted: 31 March 1998
Abstract: There are examples of Calogero–Sutherland models associated to the Weyl groups of type A and B. When exchange terms are added to the Hamiltonians the systems have non-symmetric eigenfunctions, which can be expressed as products of the ground state with members of a family of orthogonal polynomials. These polynomials can be defined and studied by using the differential-difference operators introduced by the author in Trans. Am. Math. Soc. 311, 167–183 (1989). After a description of known results, particularly from the works of Baker and Forrester, and Sahi; there is a study of polynomials which are invariant or alternating for parabolic subgroups of the symmetric group. The detailed analysis depends on using two bases of polynomials, one of which transforms monomially under group actions and the other one is orthogonal. There are formulas for norms and point-evaluations which are simplifications of those of Sahi. For any parabolic subgroup of the symmetric group there is a skew operator on polynomials which leads to evaluation at (1, 1, . . . , 1) of the quotient of the unique skew polynomial in a given irreducible subspace by the minimum alternating polynomial, analogously to a Weyl character formula. The last section concerns orthogonal polynomials for the type B Weyl group with an emphasis on the Hermite-type polynomials. These can be expressed by using the generalized binomial coefficients. A complete basis of eigenfunctions of Yamamoto’s BN spin Calogero model is obtained by multiplying these polynomials by the ground state.
1. Introduction A Calogero–Sutherland model is an exactly solvable quantum many-body system in one dimension. There are examples associated to the Weyl groups of type A and B, and by the addition of exchange (reflection) terms the Hamiltonians have non-symmetric eigenfunctions. In the two situations described here, the eigenfunctions are polynomials times the ground state.
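The first example consists of $N$ particles on a circle, with particle $j$ being at angle $\theta_j$, $0 \le \theta_j < 2\pi$, parameter $k > 0$; the Hamiltonian is
$$H_1 := -\sum_{i=1}^{N}\left(\frac{\partial}{\partial\theta_i}\right)^{2} + \frac{k}{2}\sum_{1\le i < j\le N}\frac{k-(ij)}{\sin^{2}\left(\frac12(\theta_i-\theta_j)\right)}, \tag{1.1}$$
where $(ij)$ denotes the transposition (“exchange”) $\theta_i \longleftrightarrow \theta_j$. Under the transformation $x_s = \exp(\theta_s\sqrt{-1})$,
$$H_1 = \sum_{i=1}^{N}\left(x_i\frac{\partial}{\partial x_i}\right)^{2} + 2k\sum_{1\le i < j\le N}\frac{x_i x_j}{(x_i-x_j)^{2}}\bigl((ij)-k\bigr). \tag{1.2}$$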
The orthogonal polynomials associated to $H_1$ are called the non-symmetric Jack polynomials; see Baker and Forrester [BF1], Lapointe and Vinet [LV2]. In Sect. 2 this Hamiltonian will be further described. The second example to be studied is the B-type spin Calogero model of Yamamoto [Y, YT]; parameters $k, k_1$:
$$H_2 = -\sum_{i=1}^{N}\left(\frac{\partial}{\partial x_i}\right)^{2} + \frac14\sum_{i=1}^{N}x_i^{2} + \sum_{i=1}^{N}\frac{k_1(k_1-\sigma_i)}{x_i^{2}} + 2k\sum_{1\le i < j\le N}\left\{\frac{k-\sigma_{ij}}{(x_i-x_j)^{2}}+\frac{k-\tau_{ij}}{(x_i+x_j)^{2}}\right\}, \tag{1.3}$$
where $\sigma_i, \sigma_{ij}, \tau_{ij}$ are the reflections in the hyperoctahedral group $W_N$, defined by $x\sigma_i = (x_1,\ldots,\overset{i}{-x_i},\ldots,x_N)$, $x\sigma_{ij} = (\ldots,\overset{i}{x_j},\ldots,\overset{j}{x_i},\ldots)$, $x\tau_{ij} = (\ldots,\overset{i}{-x_j},\ldots,\overset{j}{-x_i},\ldots)$. The coefficient $\frac14$ in $H_2$ is a coupling constant; it can be changed by rescaling $x$; this choice is to use the weight function $\exp(-|x|^2/2)$, as will be seen later. In Sect. 5 is the description of a complete orthogonal system of eigenfunctions of $H_2$, consisting of polynomials times the ground state
$$\prod_{1\le i < j\le N}|x_i^{2}-x_j^{2}|^{k}\prod_{i=1}^{N}|x_i|^{k_1}\exp(-|x|^{2}/4).$$
The technical foundation of this paper is the algebra of differential-difference (“Dunkl”) operators associated to a reflection group [D1]. In Sect. 2, there is an outline of known results for the type A case, namely the symmetric group $S_N$ acting on $\mathbb{R}^N$ by permutation of coordinates. This section includes a discussion of inner products and self-adjoint operators, and orthogonal decompositions. Section 3 is concerned with polynomials and operators invariant under parabolic subgroups of $S_N$; these are subgroups which leave intervals $\{1, 2, 3, \ldots, \ell_1\}$, $\{\ell_1+1, \ldots, \ell_1+\ell_2\}$, $\ldots$ invariant. Formulas for the norms of invariant polynomials are obtained. In Sect. 4, the alternating or skew polynomials and operators are examined. There is the construction of an important operator associated to any interval $\{\ell+1, \ell+2, \ldots, \ell+m\}$ which is skew for the associated symmetric group, and which commutes with the appropriately transformed version of the Hamiltonian $H_1$. Any polynomial which is
skew-symmetric for a parabolic (Young) subgroup of $S_N$ is divisible by an appropriate minimal alternating polynomial (a product of discriminants). The skew operator is used to evaluate the ratio at $x = (1, 1, \ldots, 1)$, a generalization of the Weyl dimension formula. Section 5 addresses the type B situation and shows how the type A polynomials can be used to build type B Hermite polynomials, the eigenfunctions of the transformed $H_2$. This results in a complete set of eigenfunctions with arbitrary parity, that is for any subset $A \subset \{1, 2, \ldots, N\}$ there are eigenfunctions which are odd in $x_i$, $i \in A$ and even in $x_i$, $i \notin A$. Previously, only the cases of all even or all odd parity were studied, the so-called generalized Laguerre polynomials.
Notations used throughout.
• $\mathbb{Z}_+ = \{0, 1, 2, 3, \ldots\}$, $\mathbb{N}_N = \mathbb{Z}_+^N$, the set of compositions;
• $\mathbb{N}_N^P$ is the set of partitions with no more than $N$ nonzero parts; $\mathbb{N}_N^P = \{\lambda \in \mathbb{N}_N : \lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \cdots \ge \lambda_N \ge 0\}$;
• for $\alpha \in \mathbb{N}_N$, $\alpha^+$ denotes the sorting of $\alpha$ to a partition; the permutation of $\alpha$ lying in $\mathbb{N}_N^P$;
• for $\alpha, \beta \in \mathbb{N}_N$, the dominance order is defined by $\alpha \succeq \beta$ if and only if $\sum_{i=1}^{j}\alpha_i \ge \sum_{i=1}^{j}\beta_i$ for $1 \le j \le N$; and $\alpha \succ \beta$ means $\alpha \succeq \beta$ and $\alpha \ne \beta$;
• for $w \in S_N$, the symmetric group, and $x \in \mathbb{R}^N$, let $xw \in \mathbb{R}^N$ be defined by $(xw)_i = x_{w(i)}$, $1 \le i \le N$, and $\mathrm{sgn}(w)$ denotes the sign of $w$;
• for a function $f$ on $\mathbb{R}^N$, let $(wf)(x) = f(xw)$, $x \in \mathbb{R}^N$;
• for $\alpha \in \mathbb{N}_N$, let $x^\alpha := \prod_{i=1}^{N} x_i^{\alpha_i}$, then $w(x^\alpha) = x^{w\alpha}$, where $(w\alpha)_i = \alpha_{w^{-1}(i)}$, $1 \le i \le N$, $w \in S_N$;
• an interval $[\ell+1, \ell+m] := \{j \in \mathbb{Z} : \ell+1 \le j \le \ell+m\}$;
• for an interval $I$, let $S_I = \{w \in S_N : i \notin I \text{ implies } w(i) = i\}$ (that is, $S_I$ is isomorphic to the symmetric group of $I$);
• for an interval $I$, let $\sigma_I$ be the longest element in $S_I$, that is, $\sigma_I(i) = 2\ell+m+1-i$, $i \in I = [\ell+1, \ell+m]$;
• for an interval $I$, the associated alternating polynomial is $a_I(x) = \prod\{x_i - x_j : i < j \text{ and } i, j \in I\}$;
• for $\alpha \in \mathbb{N}_N$, $|\alpha| := \sum_{i=1}^{N}\alpha_i$, $\alpha! := \prod_{i=1}^{N}(\alpha_i!)$;
• for a set $A$, $\#A$ is the cardinality;
• for $\lambda \in \mathbb{N}_N^P$, and $t \in \mathbb{R}$, the hook length product is $h(\lambda, t) := \prod_{i=1}^{N}\prod_{j=1}^{\lambda_i}(\lambda_i - j + t + k\,\#\{s : s > i \text{ and } j \le \lambda_s \le \lambda_i\})$;
• for $\alpha \in \mathbb{N}_N$, $1 \le i \le N$, $\kappa_i(\alpha) = Nk - k(\#\{s : \alpha_s > \alpha_i\} + \#\{s : s < i \text{ and } \alpha_s = \alpha_i\}) + \alpha_i + 1$ (a frequently used eigenvalue associated to $\alpha$);
• for an interval $I$, $\alpha \in \mathbb{N}_N$ satisfies condition $(\ge, I)$ respectively $(>, I)$ if $i, j \in I$ and $i < j$ implies $\alpha_i \ge \alpha_j$, respectively $\alpha_i > \alpha_j$;
• for two linear operators $A, B$ the commutator is $[A, B] := AB - BA$;
• for $t \in \mathbb{R}$, $m \in \mathbb{Z}_+$, the shifted factorial is $(t)_m = \prod_{i=1}^{m}(t+i-1)$; for $\lambda \in \mathbb{N}_N^P$ (and implicit parameter $k$) the generalized shifted factorial is $(t)_\lambda := \prod_{i=1}^{N}(t-(i-1)k)_{\lambda_i}$;
• $1^N = (1, 1, \ldots, 1) \in \mathbb{R}^N$.

2. Background
We review facts about the non-symmetric Jack polynomials expressed in two different bases, the relation to the operators introduced by Cherednik, and the inductive calculation of norm formulas by use of adjacent transpositions.
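The combinatorial notation fixed in the list of notations (the sorting $\alpha^+$, the dominance order, the eigenvalue $\kappa_i(\alpha)$ and the generalized shifted factorial $(t)_\lambda$) is used constantly in what follows. As a reading aid, here is a minimal Python sketch of those four definitions. The function names are illustrative and not taken from the paper; indices are 1-based as in the text, and exact rational arithmetic is assumed for the parameter $k$.

```python
from fractions import Fraction

def sort_to_partition(alpha):
    """alpha^+ : the rearrangement of alpha into weakly decreasing order."""
    return tuple(sorted(alpha, reverse=True))

def dominates(alpha, beta):
    """Dominance order: every partial sum of alpha is >= that of beta."""
    sa = sb = 0
    for a, b in zip(alpha, beta):
        sa += a
        sb += b
        if sa < sb:
            return False
    return True

def kappa(alpha, i, k):
    """kappa_i(alpha) for a 1-based index i, as in the notation list."""
    n = len(alpha)
    a_i = alpha[i - 1]
    larger = sum(1 for a in alpha if a > a_i)
    equal_before = sum(1 for s in range(i - 1) if alpha[s] == a_i)
    return n * k - k * (larger + equal_before) + a_i + 1

def shifted_factorial(t, m):
    """(t)_m = t (t + 1) ... (t + m - 1)."""
    result = Fraction(1)
    for r in range(m):
        result *= t + r
    return result

def generalized_shifted_factorial(t, lam, k):
    """(t)_lambda = prod_i (t - (i - 1) k)_{lambda_i}."""
    result = Fraction(1)
    for idx, part in enumerate(lam):  # idx plays the role of i - 1
        result *= shifted_factorial(t - idx * k, part)
    return result
```

For a partition $\lambda$ the two counting terms in `kappa` add up to $i-1$, so $\kappa_i(\lambda) = (N-i+1)k + \lambda_i + 1$.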
The symmetric group $S_N$ acts on $\mathbb{R}^N$ by permutation of coordinates and thus extends to an action on functions $wf(x) := f(xw)$, $x \in \mathbb{R}^N$, $w \in S_N$. For a parameter $k \ge 0$, the type A Dunkl operators are defined by
$$T_i = \frac{\partial}{\partial x_i} + k\sum_{j\ne i}\frac{1}{x_i-x_j}\bigl(1-(ij)\bigr), \quad 1 \le i \le N. \tag{2.1}$$
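To make definition (2.1) concrete, the following Python sketch applies $T_i$ to polynomials stored as dictionaries from exponent tuples to coefficients. It is only an illustration under stated assumptions: the representation and the names `add_term` and `dunkl` are hypothetical, indices are 0-based, and the divided difference $(1-(ij))/(x_i-x_j)$ is expanded monomial by monomial as a finite geometric sum.

```python
from fractions import Fraction

def add_term(poly, expo, coeff):
    """Add coeff * x^expo to a polynomial stored as {exponent tuple: coefficient}."""
    value = poly.get(expo, 0) + coeff
    if value == 0:
        poly.pop(expo, None)
    else:
        poly[expo] = value

def dunkl(poly, i, k):
    """Apply the type A Dunkl operator T_i (0-based index i) with parameter k."""
    out = {}
    for expo, coeff in poly.items():
        a = expo[i]
        # Partial derivative with respect to x_i.
        if a > 0:
            e = list(expo)
            e[i] = a - 1
            add_term(out, tuple(e), coeff * a)
        # k * (f - (ij) f) / (x_i - x_j) for each j != i.
        for j in range(len(expo)):
            if j == i:
                continue
            b = expo[j]
            low, high = min(a, b), max(a, b)
            sign = 1 if a > b else -1
            # (x_i^a x_j^b - x_i^b x_j^a) / (x_i - x_j) is a sum of |a - b| monomials.
            for s in range(high - low):
                e = list(expo)
                e[i] = low + s
                e[j] = high - 1 - s
                add_term(out, tuple(e), sign * k * coeff)
    return out
```

With this representation one can check on examples that the operators commute, and that $T_i x_i$ applied to the constant $1$ gives $1 + (N-1)k$.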
For each partition $\lambda \in \mathbb{N}_N^P$ (henceforth partitions will be assumed to have no more than $N$ parts), there is a space $E_\lambda$ of polynomials, invariant and irreducible under the algebra generated by $\{T_i x_i : 1 \le i \le N\}$ and $\{w : w \in S_N\}$. This algebra can be considered as a subalgebra of the degenerate double affine Hecke algebra of type A (the latter acts on Laurent series and also contains the multiplications by $x_i^{-1}$, $1 \le i \le N$, Cherednik [C], Kakei [K3]). There is a simple relationship between the rational and trigonometric differential-difference operators of type A; indeed, let $x_i = e^{y_i}$, $1 \le i \le N$; and suppose $f$ is a linear combination of $\{e^{y_i}, e^{-y_i} : 1 \le i \le N\}$, then
$$(T_i x_i)f = (1+(N-1)k)f + \frac{\partial f}{\partial y_i} + k\sum_{j\ne i}\frac{1}{e^{y_i-y_j}-1}\bigl(f-(ij)f\bigr).$$
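However, there is no corresponding relationship for type B (the weight function for rational type B on the $N$-torus only involves one parameter). For $\alpha \in \mathbb{N}_N$,
$$T_i x_i x^\alpha = (Nk - k\,\#\{j : \alpha_j > \alpha_i\} + \alpha_i + 1)x^\alpha - k\sum_{\alpha_j > \alpha_i+1}\ \sum_{s=0}^{\alpha_j-\alpha_i-2} x^\alpha x_i^{s+1}x_j^{-s-1} + k\sum_{\alpha_j\le\alpha_i}\ \sum_{s=1}^{\alpha_i-\alpha_j} x^\alpha x_i^{-s}x_j^{s};$$
every $x^\beta$ in the sums satisfies $\beta^+ \prec \alpha^+$ except the cases $s = \alpha_i - \alpha_j > 0$ which produce $k(ij)x^\alpha$. When $j > i$ and $\alpha_i > \alpha_j$, $(ij)\alpha \prec \alpha$. Thus the operator $U_i := T_i x_i - k\sum_{j < i}(ij)$ acts triangularly on monomials: $U_i x^\alpha$ is a scalar multiple of $x^\alpha$ plus a sum of the form
$$\sum\{A(\beta,\alpha)x^\beta : |\beta| = |\alpha|,\ \beta^+ \prec \alpha^+ \text{ or } \alpha^+ = \beta^+ \text{ and } \alpha \succ \beta\}.$$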
The type-A Cherednik operator $\xi_i$ (as in [BF3]) is defined by
$$\xi_i = \frac1k\,x_i\frac{\partial}{\partial x_i} + x_i\sum_{j < i}\frac{1-(ij)}{x_i-x_j} + \sum_{j > i}\frac{x_j(1-(ij))}{x_i-x_j} + 1 - i,$$
and satisfies $\xi_i = \frac1k\bigl(U_i - (k(N-1)+1)\bigr)$. The set $\{U_i : i = 1, \ldots, N\}$ is commutative (more details below), thus there is a basis (for polynomials) of simultaneous eigenfunctions, called non-symmetric Jack polynomials. The notation $E_\alpha(x; 1/k)$ is used for the normalization having one as leading coefficient (of $x^\alpha$); that is,
$$E_\alpha(x;1/k) = x^\alpha + \sum\{A'(\beta,\alpha)x^\beta : |\beta| = |\alpha|,\ \beta^+ \prec \alpha^+ \text{ or } \beta^+ = \alpha^+ \text{ and } \beta \prec \alpha\}$$
(and the coefficients $A'(\beta,\alpha)$ depend on $k$ and $N$). In the present paper, we use the dual basis $\{p_\alpha : \alpha \in \mathbb{N}_N\}$ defined by the generating function
$$F_k(x,y) := \sum_{\alpha} p_\alpha(x)y^\alpha = \prod_{i=1}^{N}\Bigl[(1-x_iy_i)^{-1}\prod_{j=1}^{N}(1-x_iy_j)^{-k}\Bigr] \tag{2.2}$$
($x, y \in \mathbb{R}^N$). For $\lambda \in \mathbb{N}_N^P$, $\omega_\lambda$ is defined to be the scalar multiple of $E_\lambda(x;1/k)$ such that
$$\omega_\lambda = p_\lambda + \sum\{B(\beta,\lambda)p_\beta : |\beta| = |\lambda|,\ \beta^+ \succ \lambda\};$$
the triangularity property of $B(\beta,\lambda)$ was shown in [D3], further $B(\beta,\lambda)$ is independent of $N$ in the sense that $B(\beta,\lambda)$ remains constant when $\beta$ and $\lambda$ are changed to $(\beta_1,\ldots,\beta_N,0,0,\ldots,0)$, $(\lambda_1,\ldots,\lambda_N,0,\ldots,0) \in \mathbb{N}_M$ respectively (and $M \ge N$). Note that the triangularity for $\{p_\alpha\}$ is in the opposite direction to that of $\{x^\alpha\}$. Next we define the linear space $E_\lambda$ as the span of the $S_N$-orbit of $\omega_\lambda$, with basis $\{\omega_\alpha : \alpha^+ = \lambda\}$, where $\omega_{w\lambda} := w\omega_\lambda$ for $w \in S_N$ (this is well defined, since $w\lambda = \lambda$ implies $w\omega_\lambda = \omega_\lambda$ for $w \in S_N$).
The intertwining operator of type A is the unique linear map $V$ on polynomials which satisfies: $V1 = 1$, $V : P_n \to P_n$ for each $n = 0, 1, 2, \ldots$, and $V\frac{\partial}{\partial x_i}p(x) = T_i(Vp)(x)$ for $1 \le i \le N$, $x \in \mathbb{R}^N$, each polynomial $p$ (see [D3]). Let $\xi$ be the linear map on polynomials defined by $\xi : p_\alpha \mapsto x^\alpha/\alpha!$, $\alpha \in \mathbb{N}_N$ and extended by linearity; then each $E_\lambda$ is an eigenmanifold for $V\xi$ and $V\xi\,\omega_\alpha = ((Nk+1)_{\alpha^+})^{-1}\omega_\alpha$, each $\alpha$.
We will use $\langle f, g\rangle$ to denote inner products (of polynomials $f, g$) which satisfy two conditions: $T_i x_i$ is self-adjoint for each $i$, and the inner product is $S_N$-invariant; that is, $\langle T_i x_i f(x), g(x)\rangle = \langle f(x), T_i x_i g(x)\rangle$ and $\langle wf, g\rangle = \langle f, w^{-1}g\rangle$, $w \in S_N$. The irreducibility properties of $E_\lambda$ imply that such inner products are uniquely determined up to a constant on each $E_\lambda$.
Definition 2.1. For polynomials $f(x) = \sum_\alpha f_\alpha x^\alpha$, $g(x) = \sum_\alpha g_\alpha x^\alpha$, define the A-inner product
$$\langle f, g\rangle_A := \sum_{\alpha,\beta} f_\alpha g_\beta\, T^\alpha x^\beta\big|_{x=0},$$
and the p-inner product
$$\langle f, g\rangle_p := \sum_{\alpha,\beta} f_\alpha g_\beta\,(H^{-1})_{\alpha\beta},$$
where the matrix $H$ is defined by
$$F_k(x,y) = \sum_{\alpha,\beta} H_{\alpha\beta}\,x^\alpha y^\beta.$$
Alternatively, $\langle x^\alpha, p_\beta(x)\rangle_p = \delta_{\alpha\beta}$.
Homogeneous polynomials of different degrees are orthogonal in both inner products. The A-product was introduced in [D2] and shown to be positive-definite.
Proposition 2.2. The operators $T_i x_i$ are self-adjoint in the p- and A-inner products.
Proof. The adjoint of multiplication by $x_i$ in the A-product is clearly $T_i$. For the p-product, self-adjointness is equivalent to $T_i^{(x)}x_i F_k(x,y) = T_i^{(y)}y_i F_k(x,y)$ (the superscripts refer to the variables being acted on); but
$$T_i^{(x)}x_i F_k(x,y) = F_k(x,y)\left[1 + \frac{(k+1)x_iy_i}{1-x_iy_i} + k\sum_{j\ne i}\frac{1-x_jy_j}{(1-x_iy_j)(1-x_jy_i)}\right],$$
which is symmetric under the interchange of $x$ and $y$.
Proposition 2.3. For polynomials $f, g$ and $w \in S_N$, $\langle f, g\rangle_p = \langle wf, wg\rangle_p$ and $\langle f, g\rangle_A = \langle wf, wg\rangle_A$. In particular, the transpositions $(ij)$ are self-adjoint.
Proof. The first part follows from the equation $F_k(xw, yw) = F_k(x, y)$. The second part depends on the transformation properties of $T_i$, namely $w^{-1}T_i w = T_{w^{-1}(i)}$, $1 \le i \le N$, $w \in S_N$.
Corollary 2.4. For partitions $\lambda, \mu$ with $\lambda \ne \mu$, $E_\lambda \perp E_\mu$ in both p- and A-products.
Proof. The method of ([D3], Theorem 4.3) used only the self-adjointness of each $T_i x_i$.
Sahi [Sa] proved this orthogonality for the p-product. In [D3] we used the modification $T_i\rho_i = T_i x_i + k$, where $\rho_i p_\alpha = p(\alpha_1, \ldots, \alpha_i+1, \ldots)$ is a “raising” operator; see also [D4].
Proposition 2.5. For $\lambda \in \mathbb{N}_N^P$, $f, g \in E_\lambda$, $\langle f, g\rangle_A = (Nk+1)_\lambda\langle f, g\rangle_p$.
Proof. Since $g \in E_\lambda$, $\sum\{p_\alpha T^\alpha g : \alpha \in \mathbb{N}_N, |\alpha| = |\lambda|\} = (Nk+1)_\lambda\,g$ (formula for $(V\xi)^{-1}$). That is, $T^\alpha g$ is $(Nk+1)_\lambda$ times the coefficient of $p_\alpha$ in the expansion of $g$ in the basis $\{p_\beta\}$. Let $f = \sum_\alpha f_\alpha x^\alpha$, $g = \sum_\beta g_\beta p_\beta$, then
$$\langle f, g\rangle_A = \sum_\alpha f_\alpha T^\alpha g = \sum_\alpha f_\alpha g_\alpha (Nk+1)_\lambda = (Nk+1)_\lambda\langle f, g\rangle_p.$$
Corollary 2.6. Let $f \in E_\lambda$, then $f(T)^*1 = (Nk+1)_\lambda f$, where $f(T)^*$ denotes the p-adjoint of the operator.
Proof. For any $\mu \in \mathbb{N}_N^P$, $g \in E_\mu$, $\langle g, f(T)^*1\rangle_p = \langle f(T)g, 1\rangle_p = \langle f, g\rangle_A = \delta_{\mu\lambda}(Nk+1)_\lambda\langle f, g\rangle_p$. Since $g$ and $\mu$ are arbitrary, $f(T)^*1 = (Nk+1)_\lambda f$.
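In [D3] we showed that
$$T_i\rho_i\,\omega_\alpha = (Nk - k\,\#\{s : \alpha_s > \alpha_i\} + \alpha_i + 1)\,\omega_\alpha + k\sum\{(ij)\omega_\alpha : \alpha_j > \alpha_i\}, \quad \text{for } \alpha \in \mathbb{N}_N,\ 1 \le i \le N.$$
Also the commutator $[T_i\rho_i, T_j\rho_j] = k(T_i\rho_i - T_j\rho_j)(ij)$, for $i \ne j$ ([D3], Lemma 2.5(iii)). This leads to the pairwise commuting operators $U_i := T_i\rho_i - k\sum_{j < i}(ij)$, $1 \le i \le N$. For $\alpha \in \mathbb{N}_N$ and $1 \le i \le N$, let $\kappa_i(\alpha) := Nk - k(\#\{s : \alpha_s > \alpha_i\} + \#\{s : s < i \text{ and } \alpha_s = \alpha_i\}) + \alpha_i + 1$. These will appear as eigenvalues of $U_i$, because
$$U_i\omega_\alpha = \kappa_i(\alpha)\omega_\alpha + k\sum\{(ij)\omega_\alpha : i < j \text{ and } \alpha_i < \alpha_j\} - k\sum\{(ij)\omega_\alpha : j < i \text{ and } \alpha_j < \alpha_i\}.$$
For each partition $\lambda$, the matrix of $U_i$ with respect to the basis $\{\omega_\alpha : \alpha^+ = \lambda\}$ for $E_\lambda$ is triangular in the dominance ordering (also for the lexicographic order, a total one). Note if $\alpha \in \mathbb{N}_N$, and $\alpha_i < \alpha_j$, $i < j$ for some $i, j$, then $(ij)\alpha \succ \alpha$. The operators $U_i$ satisfy some commutation properties with adjacent transpositions:
(i) $[U_i, (j, j+1)] = 0$, if $i < j$ or $j+1 < i$;
(ii) $(i, i+1)U_i(i, i+1) = U_{i+1} + k(i, i+1)$. (2.3)
Theorem 2.8 ([D4], Sect. 3). For each partition $\lambda$ there exists a unique basis $\{\zeta_\alpha : \alpha^+ = \lambda\}$ for $E_\lambda$ satisfying
(1) $U_i\zeta_\alpha = \kappa_i(\alpha)\zeta_\alpha$, $1 \le i \le N$.
(2) $\zeta_\alpha = \omega_\alpha + \sum\{B(\beta,\alpha)\omega_\beta : \beta^+ = \lambda \text{ and } \beta \succ \alpha\}$.
(3) $\langle\zeta_\alpha, \zeta_\beta\rangle = 0$ if $\alpha \ne \beta$.
Corollary 2.9. $\zeta_\lambda = \omega_\lambda$, and if $\alpha_i = \alpha_{i+1}$, then $(i, i+1)\zeta_\alpha = \zeta_\alpha$.
Proof. The partition $\lambda$ is the maximum element in $\{\alpha : \alpha^+ = \lambda\}$. Suppose $\alpha_i = \alpha_{i+1}$, and expand $(i, i+1)\zeta_\alpha$ in the basis $\{\zeta_\beta : \beta^+ = \lambda = \alpha^+\}$. Because $(i, i+1)\zeta_\alpha$ is an eigenvector for each $U_j$ with $|i-j| > 1$ with eigenvalue $\kappa_j(\alpha)$, it must be a scalar multiple of $\zeta_\alpha$. The fact that $(i, i+1)\omega_\alpha = \omega_\alpha$ and (2.3)(ii) shows the factor is 1.
Proposition 2.10. Suppose $\alpha \in \mathbb{N}_N$ and $\alpha_i > \alpha_{i+1}$, then $\mathrm{span}\{\zeta_\alpha, \zeta_{\sigma\alpha}\}$ is invariant under $\sigma = (i, i+1)$, and the matrix of $\sigma$ in this basis is
$$\begin{bmatrix} c & 1-c^2 \\ 1 & -c \end{bmatrix}, \quad \text{where } c = \frac{k}{\kappa_i(\alpha)-\kappa_{i+1}(\alpha)}.$$
Proof. Let $g = \sigma\zeta_\alpha - c\zeta_\alpha$, then $U_j g = \kappa_j(\alpha)g$ for $j < i$ or $i+1 < j$; this shows $g \in \mathrm{span}\{\zeta_\alpha, \zeta_{\sigma\alpha}\}$. The coefficient of $\omega_{\sigma\alpha}$ in $g$ is 1, since $\alpha \succ \sigma\alpha$. The commutation relation $\sigma U_i\sigma = U_{i+1} + k\sigma$ shows $U_i g = \kappa_{i+1}(\alpha)g$ and $U_{i+1}g = \kappa_i(\alpha)g$, but $\kappa_{i+1}(\sigma\alpha) = \kappa_i(\alpha)$ and $\kappa_{i+1}(\alpha) = \kappa_i(\sigma\alpha)$, thus $g = \zeta_{\sigma\alpha}$. Finally, $\sigma g = \zeta_\alpha - c\sigma\zeta_\alpha = (1-c^2)\zeta_\alpha - cg$.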
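As a sanity check on Proposition 2.10, the matrix of $\sigma$ must square to the identity, since $\sigma$ is a transposition. The sketch below builds $c$ from the eigenvalues $\kappa_i(\alpha)$ and checks this with exact fractions. The helper names are illustrative assumptions, not notation from the paper, and indices are 1-based as in the text.

```python
from fractions import Fraction

def kappa(alpha, i, k):
    """kappa_i(alpha) for a 1-based index i."""
    n = len(alpha)
    a_i = alpha[i - 1]
    larger = sum(1 for a in alpha if a > a_i)
    equal_before = sum(1 for s in range(i - 1) if alpha[s] == a_i)
    return n * k - k * (larger + equal_before) + a_i + 1

def sigma_matrix(alpha, i, k):
    """c and the matrix of sigma = (i, i+1) on span{zeta_alpha, zeta_{sigma alpha}}."""
    if alpha[i - 1] <= alpha[i]:
        raise ValueError("Proposition 2.10 assumes alpha_i > alpha_{i+1}")
    c = k / (kappa(alpha, i, k) - kappa(alpha, i + 1, k))
    return c, [[c, 1 - c * c], [Fraction(1), -c]]

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[r][t] * b[t][s] for t in range(2)) for s in range(2)]
            for r in range(2)]
```

For instance, with $\alpha = (3, 1, 2)$, $i = 1$ and $k = 1/2$ one finds $\kappa_1(\alpha) - \kappa_2(\alpha) = 3$ and hence $c = 1/6$, which lies strictly between 0 and 1.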
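These equations were found by Sahi [Sa], see also Baker and Forrester [BF3]. In the basis $\{E_\alpha(x;1/k), E_{\sigma\alpha}(x;1/k)\}$ the matrix of $\sigma$ is $\begin{bmatrix} c & 1 \\ 1-c^2 & -c \end{bmatrix}$. By use of the known evaluations at $x = 1^N$ and the notation of Definitions 3.10 and 3.17,
$$E_\alpha(x;1/k) = \frac{h(\alpha^+, 1)}{h(\alpha^+, k+1)E_+(\alpha)E_-(\alpha)}\,\zeta_\alpha(x).$$
Corollary 2.11. Suppose $\alpha \in \mathbb{N}_N$ and $\alpha_i > \alpha_{i+1}$, then
$$\zeta_{\sigma\alpha}(1^N) = (1-c)\,\zeta_\alpha(1^N), \quad\text{and}\quad \|\zeta_{\sigma\alpha}\|^2 = (1-c^2)\|\zeta_\alpha\|^2,$$
for $c = \dfrac{k}{\kappa_i(\alpha)-\kappa_{i+1}(\alpha)}$ and $\sigma = (i, i+1)$.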
Proof. Since $\zeta_{\sigma\alpha} = \sigma\zeta_\alpha - c\zeta_\alpha$, we have that $\zeta_{\sigma\alpha}(1^N) = \sigma\zeta_\alpha(1^N) - c\zeta_\alpha(1^N) = (1-c)\zeta_\alpha(1^N)$. Since $\sigma$ is self-adjoint the matrix of $\sigma$ in the orthonormal basis $\{\zeta_\alpha/\|\zeta_\alpha\|, \zeta_{\sigma\alpha}/\|\zeta_{\sigma\alpha}\|\}$ must be symmetric, hence $\|\zeta_{\sigma\alpha}\|^2 = (1-c^2)\|\zeta_\alpha\|^2$.
Corollary 2.12. In the same notation, let
$$f_0 = \zeta_\alpha + \frac{1}{1+c}\,\zeta_{\sigma\alpha}, \quad\text{and}\quad f_1 = \zeta_\alpha - \frac{1}{1-c}\,\zeta_{\sigma\alpha}, \tag{2.4}$$
then $\sigma f_0 = f_0$ and $\sigma f_1 = -f_1$.
In the next section we derive expressions for the norms $\|\zeta_\alpha\|_p^2$, $\|\zeta_\alpha\|_A^2$ as a by-product. If $\alpha \in \mathbb{N}_N$ and $\alpha_i > \alpha_{i+1}$, then $\kappa_i(\alpha) - \kappa_{i+1}(\alpha) \ge \alpha_i - \alpha_{i+1} + k$, thus $0 < c < 1$ (when $k > 0$); in fact, $\kappa_i(\alpha) - \kappa_{i+1}(\alpha) = \alpha_i - \alpha_{i+1} + k(1 + \#\{s : s > i \text{ and } \alpha_s = \alpha_i\} + \#\{s : s < i \text{ and } \alpha_s = \alpha_{i+1}\} + \#\{s : \alpha_{i+1} < \alpha_s < \alpha_i\})$.
The relation of the operator $U_i$ to the Hamiltonian $H_1$ in (1.2) is as follows: let
$$h(x) = \prod_{1\le i < j\le N}|x_i-x_j|^{k}\prod_{i=1}^{N}|x_i|^{-k(N-1)/2}$$
(an $S_N$-invariant positively homogeneous function), then
$$h(x)\bigl(U_i - 1 - k(N+1)/2\bigr)\bigl(f(x)/h(x)\bigr) := A_i f(x) = x_i\frac{\partial f(x)}{\partial x_i} - k\sum_{j\ne i}\frac{x_{\max(i,j)}}{x_i-x_j}\,(ij)f(x),$$
and $\sum_{i=1}^{N}A_i^2 = H_1$. The eigenvalue of $H_1$ on the space $h(x)E_\lambda$ is
$$\sum_{i=1}^{N}\bigl(\kappa_i(\lambda) - 1 - k(N+1)/2\bigr)^2 = \sum_{i=1}^{N}\lambda_i^2 + k\sum_{i=1}^{N}(N-2i+1)\lambda_i + k^2N(N^2-1)/12$$
(see Baker and Forrester [BF1, BF2], Lapointe and Vinet [LV2]); the last term is the energy of the ground state. In the coordinates $x_j = \exp(\sqrt{-1}\,\theta_j)$,
$$h = 2^{kN(N-1)/2}\prod_{i < j}|\sin((\theta_i-\theta_j)/2)|^{k}.$$
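The closed form for the eigenvalue of $H_1$ follows from $\kappa_i(\lambda) = (N-i+1)k + \lambda_i + 1$ for a partition $\lambda$, so that each summand is $(\lambda_i + k(N-2i+1)/2)^2$. The short sketch below compares both sides with exact fractions; the function names are illustrative and not part of the paper.

```python
from fractions import Fraction

def kappa(alpha, i, k):
    """kappa_i(alpha) for a 1-based index i."""
    n = len(alpha)
    a_i = alpha[i - 1]
    larger = sum(1 for a in alpha if a > a_i)
    equal_before = sum(1 for s in range(i - 1) if alpha[s] == a_i)
    return n * k - k * (larger + equal_before) + a_i + 1

def energy_from_kappa(lam, k):
    """Sum over i of (kappa_i(lambda) - 1 - k(N+1)/2)^2."""
    n = len(lam)
    shift = 1 + k * Fraction(n + 1, 2)
    return sum((kappa(lam, i, k) - shift) ** 2 for i in range(1, n + 1))

def energy_closed_form(lam, k):
    """sum lambda_i^2 + k sum (N - 2i + 1) lambda_i + k^2 N (N^2 - 1) / 12."""
    n = len(lam)
    quadratic = sum(part * part for part in lam)
    linear = sum((n - 2 * i + 1) * lam[i - 1] for i in range(1, n + 1))
    ground = k * k * Fraction(n * (n * n - 1), 12)
    return quadratic + k * linear + ground
```

Taking $\lambda = (0, \ldots, 0)$ isolates the ground state energy $k^2N(N^2-1)/12$.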
3. Subgroup Invariants
A parabolic subgroup of $S_N$ is by definition generated by a subset of $\{(i, i+1) : 1 \le i < N\}$. This section concerns subspaces of $E_\lambda$ invariant under a parabolic subgroup. We start with the basic structure, an interval and its group of permutations. A typical interval is denoted by $I$ or $[\ell_1, \ell_2]$ and is defined to be $\{n \in \mathbb{Z} : \ell_1 \le n \le \ell_2\}$. The associated permutation group, denoted by $S_I$, is defined as $\{w \in S_N : w(i) = i \text{ for all } i \notin I\}$. Thus $S_I \cong S_m$, where $m = \#I$. The parabolic subgroups of $S_N$ are direct products of such groups corresponding to a collection of disjoint intervals in $[1, N]$. The technique developed in this section will be used to derive formulas for the norms of the non-symmetric Jack polynomials $\zeta_\alpha$, as well as for polynomials with prescribed symmetric or skew-symmetric properties for parabolic subgroups. For any $\beta \in \mathbb{N}_N$, the set $\{w\beta : w \in S_I\}$ has a unique $\succeq$-maximal element, which satisfies the following:
Definition 3.1. For an interval $I$, say that a composition $\alpha$ satisfies property $(\ge, I)$ or $(>, I)$ if $\alpha_i \ge \alpha_j$, respectively $\alpha_i > \alpha_j$, whenever $i, j \in I$ and $i < j$.
We deal with the case of one interval first. Let $I = [\ell+1, \ell+m]$ with $1 \le \ell+1 < \ell+m \le N$. The object is to analyze $\mathrm{span}\{\zeta_{w\alpha} : w \in S_I\}$ and $\mathrm{span}\{w\zeta_\alpha : w \in S_I\}$ for a fixed $\alpha$ satisfying $(\ge, I)$. The structure of $\mathrm{span}\{\zeta_{w\alpha}\}$ mimics that of $E_\lambda$ (with $m$ variables) with an analogue of $T_i\rho_i$. Part of the motivation for the following definition is to have commutativity among the operators associated with disjoint intervals.
Definition 3.2. For a fixed interval $I = [\ell+1, \ell+m]$, for $i \in I$, let
$$\tau_i := T_i\rho_i - k\sum_{j\le\ell}(ij).$$
Note that $U_i = \tau_i - k\sum_{\ell < j < i}(ij)$.
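Proof. Consider the set $A = \{\gamma : \zeta_\gamma \in \mathrm{span}\{g_\beta\}\}$; by Proposition 2.10, if $\gamma \in A$ and $\gamma_i \ne \gamma_{i+1}$ for $\ell+1 \le i < \ell+m$, then $(i, i+1)\gamma \in A$. Also $\alpha \in A$, hence $A = S_I\alpha$.
Proposition 3.5. For $\beta \in S_I\alpha$, and $i \in I$,
$$\tau_i g_\beta = \kappa_i'(\beta)g_\beta + k\sum\{(ij)g_\beta : j \in I \text{ and } \beta_j > \beta_i\},$$
where $\kappa_i'(\beta) = Nk - k(\#\{s : \beta_s > \beta_i\} + \#\{s : s \le \ell \text{ and } \beta_s = \beta_i\}) + \beta_i + 1$.
Proof. First for $\beta = \alpha$,
$$\tau_i g_\alpha = \Bigl(U_i + k\sum_{\ell < j < i}(ij)\Bigr)g_\alpha = \bigl(\kappa_i(\alpha) + k\,\#\{s : \ell < s < i \text{ and } \alpha_s = \alpha_i\}\bigr)g_\alpha + k\sum\{(ij)g_\alpha : \ell < j < i \text{ and } \alpha_j > \alpha_i\}, \tag{3.1}$$
because $g_\alpha = \zeta_\alpha$. This is the required formula for this case; the situation $\{j : i < j \le \ell+m \text{ and } \alpha_j > \alpha_i\}$ does not occur. For an arbitrary $w \in S_I$, let $s = w^{-1}(i)$, $\beta = w\alpha$, then
$$\tau_i g_\beta = \tau_i w g_\alpha = w\tau_s g_\alpha = \bigl(Nk - k(\#\{j : \alpha_j > \alpha_s\} + \#\{j : j \le \ell, \alpha_j = \alpha_s\}) + \alpha_s + 1\bigr)wg_\alpha + k\sum\{w(s, j)g_\alpha : \ell < j \le \ell+m,\ \alpha_j > \alpha_s\},$$
but $w(s, j) = (w(s), w(j))w = (i, w(j))w$, $\beta_t = \alpha_{w^{-1}(t)}$ for any $t$, $\beta_i = \alpha_s$. Thus
$$\tau_i g_\beta = \kappa_i'(\beta)g_\beta + k\sum\{(i, w(j))g_\beta : j \in I \text{ and } \alpha_j > \beta_i\},$$
and $\alpha_j = \beta_{w(j)}$.
This showed that the structure of the operators $\{\tau_i : i \in I\}$ on $X$ is essentially the same as that of $\{T_i\rho_i : 1 \le i \le N\}$ on $E_\lambda$.
Proposition 3.6. Suppose $C$ is a linear operator on $X$ and $[C, \tau_i] = 0$ for each $i \in I$, then $C = c1$, a multiple of the identity.
Proof. The same proof as ([D3], Proposition 3.2) for $E_\lambda$ works, replacing $\{\omega_\beta : \beta^+ = \lambda\}$ by $\{g_\beta : \beta \in S_I\alpha\}$.
Proposition 3.7. The operator $U_I := \prod_{i\in I}U_i$ commutes with each $w \in S_I$; also $[U_I, \tau_i] = 0$ for $i \in I$ and $U_I g = \prod_{i\in I}\kappa_i(\alpha)\,g$ for each $g \in X$.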
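Proof. To show $[U_I, w] = 0$ for $w \in S_I$ it suffices to prove this for $w = (i, i+1)$, $\ell+1 \le i < \ell+m$. Also $wU_j = U_j w$ if $j < i$ or $j > i+1$, thus consider
$$wU_iU_{i+1}w = (U_{i+1}w + k)U_{i+1}w = U_{i+1}(U_iw - k)w + kU_{i+1}w = U_{i+1}U_i = U_iU_{i+1}$$
(by (2.3)). Since $\tau_{\ell+1} = U_{\ell+1}$ we have $[U_I, \tau_{\ell+1}] = 0$. For any $w \in S_I$, $\tau_{\ell+1} = w^{-1}\tau_{w(\ell+1)}w$, and this shows $[U_I, \tau_j] = 0$ for each $j \in I$. For any basis element $g_{w\alpha}$ of $X$, $U_I g_{w\alpha} = wU_I g_\alpha = w\prod_{i\in I}\kappa_i(\alpha)g_\alpha$, since $g_\alpha = \zeta_\alpha$.
The change of basis matrix for $\{g_\beta\}$ to $\{\zeta_\beta\}$ is triangular; define $B$ by
$$\zeta_\beta = \sum_{\gamma\in S_I\alpha}B(\gamma,\beta)g_\gamma, \quad\text{then } B(\gamma,\gamma) = 1, \tag{3.2}$$
and $B(\gamma,\beta) = 0$ unless $\gamma \succeq \beta$. The usual proof (for $\{\omega_\beta\}$ and $\{\zeta_\beta\}$) applies. There is a nice relationship between $B$ and the Gram matrix for $\{g_\beta\}$.
Definition 3.8. For $\beta, \gamma \in S_I\alpha$, let $H(\beta,\gamma) := \langle g_\beta, g_\gamma\rangle/\|g_\alpha\|^2$ (independent of choice of permissible inner product).
Proposition 3.9. For $w_1, w_2 \in S_I$, and $\beta, \gamma \in S_I\alpha$, $H(w_1\beta, w_2\gamma) = H(\beta, w_1^{-1}w_2\gamma)$; in particular, $H(w_1\alpha, w_2\alpha) = H(\alpha, w_1^{-1}w_2\alpha) = B^{-1}(\alpha, w_1^{-1}w_2\alpha)$.
Proof. The first identity follows from the $S_N$-invariance of the inner product. For $\beta \in S_I\alpha$,
$$H(\alpha,\beta) = \Bigl\langle\zeta_\alpha, \sum_{\gamma\succeq\beta}B^{-1}(\gamma,\beta)\zeta_\gamma\Bigr\rangle\Big/\|\zeta_\alpha\|^2 = B^{-1}(\alpha,\beta)$$
(by orthogonality, $g_\alpha = \zeta_\alpha$).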
Sahi [Sa] found a formula for $\|\zeta_\beta\|_p^2$ in terms of a hook length product associated to the Ferrers diagram of the composition $\beta$. Here we give an expression whose complexity is roughly the number of adjacent transpositions needed to transform $\alpha$ to $\beta$; the upper and lower hook length products for partitions (Stanley [St]) will also be used eventually.
Definition 3.10. For $\varepsilon = +$ or $-$ (“sign”), an interval $I$, $\beta \in \mathbb{N}_N$, let
$$E_\varepsilon(\beta, I) := \prod\left\{1 + \frac{\varepsilon k}{\kappa_j(\beta)-\kappa_i(\beta)} : \beta_i < \beta_j,\ i < j, \text{ and } i, j \in I\right\}.$$
Observe $E_\varepsilon(\alpha, I) = 1$ ($\alpha$ satisfies $(\ge, I)$).
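The product in Definition 3.10 is easy to evaluate directly. The following sketch does so with exact fractions and can be used to observe how $E_\varepsilon$ changes when two adjacent entries with $\beta_i > \beta_{i+1}$ are swapped: the value is multiplied by $1 + \varepsilon k/(\kappa_i(\beta) - \kappa_{i+1}(\beta))$. The function names and the encoding of $\varepsilon$ as $\pm 1$ and of $I$ as a pair of 1-based endpoints are illustrative assumptions.

```python
from fractions import Fraction

def kappa(alpha, i, k):
    """kappa_i(alpha) for a 1-based index i."""
    n = len(alpha)
    a_i = alpha[i - 1]
    larger = sum(1 for a in alpha if a > a_i)
    equal_before = sum(1 for s in range(i - 1) if alpha[s] == a_i)
    return n * k - k * (larger + equal_before) + a_i + 1

def e_product(beta, interval, k, eps):
    """E_eps(beta, I) for I = [first, last] (1-based, inclusive), eps = +1 or -1."""
    first, last = interval
    result = Fraction(1)
    for i in range(first, last + 1):
        for j in range(i + 1, last + 1):
            if beta[i - 1] < beta[j - 1]:
                result *= 1 + eps * k / (kappa(beta, j, k) - kappa(beta, i, k))
    return result

def swap_adjacent(beta, i):
    """Return (i, i+1) beta, exchanging the entries in positions i and i+1."""
    b = list(beta)
    b[i - 1], b[i] = b[i], b[i - 1]
    return tuple(b)
```

For $\beta = (2, 1)$, $I = [1, 2]$ and $k = 1/2$ this gives $E_+((1,2)) = 4/3$ and $E_-((1,2)) = 2/3$, whose product $8/9$ equals $1 - c^2$ with $c = 1/3$.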
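Lemma 3.11. If $\beta_i > \beta_{i+1}$ for some $i$ with $\ell+1 \le i < \ell+m$, then
$$E_\varepsilon((i, i+1)\beta, I)/E_\varepsilon(\beta, I) = 1 + \frac{\varepsilon k}{\kappa_i(\beta)-\kappa_{i+1}(\beta)}, \quad \varepsilon = \pm.$$
Proof. For $\{i, j\} \subset I$ let $t(\beta; i, j) = 1 + \dfrac{\varepsilon k}{\kappa_j(\beta)-\kappa_i(\beta)}$ if $\beta_i < \beta_j$ and $i < j$, else $t(\beta; i, j) = 1$. Recall
$$\kappa_j(\beta) = Nk - k(\#\{s : \beta_s > \beta_j\} + \#\{s : s < j, \beta_s = \beta_j\}) + \beta_j + 1, \quad\text{each } j.$$
Let $\sigma = (i, i+1)$. Then $t(\sigma\beta; i, j) = t(\beta; i+1, j)$ and $t(\sigma\beta; i+1, j) = t(\beta; i, j)$ for $j > i+1$; $t(\sigma\beta; j, i) = t(\beta; j, i+1)$ and $t(\sigma\beta; j, i+1) = t(\beta; j, i)$ for $j < i$. Also $t(\beta; i, i+1) = 1$, and
$$t(\sigma\beta; i, i+1) = 1 + \frac{\varepsilon k}{\kappa_{i+1}(\sigma\beta)-\kappa_i(\sigma\beta)} = 1 + \frac{\varepsilon k}{\kappa_i(\beta)-\kappa_{i+1}(\beta)}.$$
The values $t(\beta; j_1, j_2) = t(\sigma\beta; j_1, j_2)$ for indices $(j_1, j_2)$ not listed above. Since $E_\varepsilon(\beta; I) = \prod_{i,j\in I}t(\beta; i, j)$ this shows $E_\varepsilon(\sigma\beta; I)/E_\varepsilon(\beta; I)$ has the specified value.
Proposition 3.12. Suppose $\beta \in S_I\alpha$, then $\zeta_\beta(1^N) = E_-(\beta; I)\zeta_\alpha(1^N)$, and $\|\zeta_\beta\|^2 = E_+(\beta; I)E_-(\beta; I)\|\zeta_\alpha\|^2$.
Proof. Corollary 2.11 showed that
$$\frac{\|\zeta_{\sigma\beta}\|^2}{E_+(\sigma\beta; I)E_-(\sigma\beta; I)} = \frac{\|\zeta_\beta\|^2}{E_+(\beta; I)E_-(\beta; I)}$$
for $\beta_i > \beta_{i+1}$, $\ell+1 \le i < \ell+m$, and $\sigma := (i, i+1)$. The transpositions $(i, i+1)$ generate $S_I$. Similarly, $\zeta_\beta(1^N)/E_-(\beta; I)$ is constant on $S_I\alpha$.
Observe that the case $I = [1, N]$ provides the values $\|\zeta_\beta\|^2/\|\zeta_\lambda\|^2$ and $\zeta_\beta(1^N)/\zeta_\lambda(1^N)$ for $\lambda = \beta^+$. The minimum (for $\succeq$) element in $S_I\alpha$ occurs in several important formulas. It is denoted by $\alpha^R := (\alpha_1, \ldots, \alpha_\ell, \alpha_{\ell+m}, \alpha_{\ell+m-1}, \ldots, \alpha_{\ell+1}, \alpha_{\ell+m+1}, \ldots, \alpha_N)$, that is, $\alpha^R = \sigma_I\alpha$, where $\sigma_I$ is the longest element in $S_I$ (the length of a permutation is the minimum number of adjacent transpositions needed to produce it, as a product). Thus $\sigma_I = (\ell+m, \ell+1)(\ell+m-1, \ell+2)\cdots$ and is an involution, $\sigma_I^2 = 1$.
Lemma 3.13.
$$E_\varepsilon(\alpha^R; I) = \prod\left\{1 + \frac{\varepsilon k}{\kappa_i(\alpha)-\kappa_j(\alpha)} : \ell+1 \le i < j \le \ell+m \text{ and } \alpha_i > \alpha_j\right\}.$$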
Proof. When $\alpha_i = \alpha_{i+1}$ for some $i \in I$, then the list of eigenvalues $\kappa_{\ell+1}(\alpha^R), \ldots, \kappa_{\ell+m}(\alpha^R)$ is not the reverse of the list $\kappa_{\ell+1}(\alpha), \ldots, \kappa_{\ell+m}(\alpha)$, nevertheless the relative order of pairwise different values is reversed; that is, $\alpha_i^R > \alpha_j^R$ if and only if $\alpha_{2\ell+m+1-i} > \alpha_{2\ell+m+1-j}$ (for $\ell+1 \le j < i \le \ell+m$). This suffices to establish the formula. As an example for the case $\alpha_i = \alpha_{i+1}$, take $\ell = 0$, $m = 3$, and $\alpha = (\alpha_1, \alpha_1, \alpha_3)$ with $\alpha_1 > \alpha_3$; then $\kappa_1(\alpha^R) = \kappa_3(\alpha)$, $\kappa_2(\alpha^R) = \kappa_1(\alpha)$, $\kappa_3(\alpha^R) = \kappa_2(\alpha)$.
There is a unique $S_I$-invariant in $X$, and if $\alpha$ satisfies $(>, I)$ there is a unique $S_I$-alternating polynomial in $X$. Invariance for $S_I$ means $wf = f$, “alternating” means $wf = \mathrm{sgn}(w)f$, for all $w \in S_I$. To establish such properties it suffices to show $(i, i+1)f = f$ for invariance, $(i, i+1)f = -f$ for alternating, for $\ell+1 \le i < \ell+m$.
Definition 3.14. Let
$$j_{\alpha;I} := E_+(\alpha^R; I)\sum_{\beta\in S_I\alpha}\frac{1}{E_+(\beta; I)}\,\zeta_\beta;$$
and
$$a_{\alpha;I} := E_-(\alpha^R; I)\sum_{w\in S_I}\frac{\mathrm{sgn}(w)}{E_-(w\alpha; I)}\,\zeta_{w\alpha},$$
provided $\alpha$ satisfies $(>, I)$.
Theorem 3.15. The polynomials $j_{\alpha;I}$ and $a_{\alpha;I}$ have the following properties:
(1) $wj_{\alpha;I} = j_{\alpha;I}$ for $w \in S_I$;
(2) $\|j_{\alpha;I}\|^2 = (\#S_I\alpha)E_+(\alpha^R; I)\|\zeta_\alpha\|^2$;
(3) $j_{\alpha;I}(1^N) = (\#S_I\alpha)\zeta_\alpha(1^N)$;
(4) $j_{\alpha;I} = \sum_{\beta\in S_I\alpha}g_\beta$;
(5) $wa_{\alpha;I} = (\mathrm{sgn}\,w)a_{\alpha;I}$ for $w \in S_I$;
(6) $\|a_{\alpha;I}\|^2 = (\#S_I)E_-(\alpha^R; I)\|\zeta_\alpha\|^2$;
(7) $a_{\alpha;I} = \sum_{w\in S_I}(\mathrm{sgn}\,w)g_{w\alpha}$.
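Proof. It is clear that the polynomials defined in (4) and (7) are the only (up to scalar multiple) invariant and alternating elements of $X$. To show (1) and (5), consider a typical element $f = \sum_{\beta\in S_I\alpha}f_\beta\zeta_\beta$, and fix $i$ with $\ell+1 \le i < \ell+m$, and let $\sigma = (i, i+1)$. Then
$$f = \sum\{f_\beta\zeta_\beta : \beta_i = \beta_{i+1},\ \beta \in S_I\alpha\} + \sum\{f_\beta\zeta_\beta + f_{\sigma\beta}\zeta_{\sigma\beta} : \beta_i > \beta_{i+1},\ \beta \in S_I\alpha\}.$$
By Corollary 2.12, $\sigma f = f$ if and only if
$$\frac{f_{\sigma\beta}}{f_\beta} = \frac{E_+(\beta; I)}{E_+(\sigma\beta; I)} = \left(1 + \frac{k}{\kappa_i(\beta)-\kappa_{i+1}(\beta)}\right)^{-1},$$
for each $\beta \in S_I\alpha$ with $\beta_i > \beta_{i+1}$; also $\sigma f = -f$ if and only if $\beta_i = \beta_{i+1}$ implies $f_\beta = 0$ and $\beta_i > \beta_{i+1}$ implies
$$\frac{f_{\sigma\beta}}{f_\beta} = -\frac{E_-(\beta; I)}{E_-(\sigma\beta; I)}.$$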
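Note if $\alpha$ does not satisfy $(>, I)$ then there is no nonzero alternating element. By the triangularity of $B$, the coefficient of $g_{\alpha^R}$ in $j_{\alpha;I}$ or $a_{\alpha;I}$ is the same as the coefficient of $\zeta_{\alpha^R}$; this establishes (4) and (7).
To compute $\|j_{\alpha;I}\|^2$, observe that $\langle g_\beta, \sum_{\gamma\in S_I\alpha}g_\gamma\rangle = \langle g_\alpha, \sum_\gamma g_\gamma\rangle$ for each $\beta \in S_I\alpha$ by the $S_N$-invariance of the inner product. Sum this equation over all $\beta \in S_I\alpha$ to obtain
$$\langle j_{\alpha;I}, j_{\alpha;I}\rangle = (\#S_I\alpha)\langle\zeta_\alpha, j_{\alpha;I}\rangle = (\#S_I\alpha)E_+(\alpha^R; I)\|\zeta_\alpha\|^2$$
(using the original definition of $j_{\alpha;I}$). Formula (3) is a direct consequence of (4). The calculation for $\|a_{\alpha;I}\|^2$ proceeds similarly:
$$\|a_{\alpha;I}\|^2 = \sum_{w_1\in S_I}\sum_{w_2\in S_I}\mathrm{sgn}(w_1)\mathrm{sgn}(w_2)\langle g_{w_1\alpha}, g_{w_2\alpha}\rangle = (\#S_I)\sum_{w_2\in S_I}\mathrm{sgn}(w_2)\langle g_\alpha, g_{w_2\alpha}\rangle = (\#S_I)\langle\zeta_\alpha, a_{\alpha;I}\rangle = (\#S_I)E_-(\alpha^R; I)\|\zeta_\alpha\|^2$$
(replacing $w_2$ by $w_1w_2$ in the inner sum).
Corollary 3.16.
$$\sum_{\beta\in S_I\alpha}H(\alpha,\beta) = E_+(\alpha^R; I),$$
and if $\alpha$ satisfies $(>, I)$, then
$$\sum_{w\in S_I}\mathrm{sgn}(w)H(\alpha, w\alpha) = E_-(\alpha^R; I).$$
Proof. By definition of $H$,
$$\sum_{\beta\in S_I\alpha}H(\alpha,\beta) = \Bigl\langle g_\alpha, \sum_\beta g_\beta\Bigr\rangle\Big/\|g_\alpha\|^2 = \langle g_\alpha, j_{\alpha;I}\rangle/\|g_\alpha\|^2,$$
and
$$\sum_{w\in S_I}\mathrm{sgn}(w)H(\alpha, w\alpha) = \langle g_\alpha, a_{\alpha;I}\rangle/\|g_\alpha\|^2.$$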
The triangularity argument for extracting the coefficient of ζαR was used already by Baker, Dunkl, and Forrester [BDF] in the same context. Earlier, Baker and Forrester [BF2] considered some special cases of subgroup invariance and relations to the Jack polynomials. An analogue of evaluation at 1N for aα;I will be discussed later. Hook length products are used in the norm calculations.
Definition 3.17. For a partition $\lambda$ and parameter $t$, let
$$h(\lambda, t) := \prod_{i=1}^{N-m_0}\prod_{j=1}^{\lambda_i}\bigl(\lambda_i - j + t + k\,\#\{s : s > i \text{ and } j \le \lambda_s \le \lambda_i\}\bigr),$$
where $\lambda_i = 0$ for $i > N - m_0$. The special cases $t = 1, k$ satisfy $k^{-|\lambda|}h(\lambda, 1) = h^*(\lambda)$, $k^{-|\lambda|}h(\lambda, k) = h_*(\lambda)$, the upper and lower hook length products for parameter $1/k$ (Stanley [St]), respectively.
In the next paragraphs, let $I = [1, N]$, and let $\lambda$ be a partition and $\lambda^R = (\lambda_N, \lambda_{N-1}, \ldots, \lambda_1)$. We will use known results for the Jack polynomials to determine $\|\zeta_\lambda\|_p^2$; these are due to Stanley [St]. Sahi first found $\|\zeta_\lambda\|_p^2$, and recently Baker and Forrester [BF4] presented a concise self-contained determination of the structural constants of the Jack polynomials. We apply the previous results of this section and we suppress the letter $I$ in $j_{\lambda;I}$ and $E_\varepsilon(\beta; I)$ since $I = [1, N]$. From the orthogonality relations on the $N$-torus (more on this later), it is known that $J_\lambda(x; 1/k)$ is a multiple of $j_\lambda$ (where $J_\lambda$ is the standard Jack polynomial). Stanley showed $J_\lambda(1^N; 1/k) = (Nk)_\lambda k^{-|\lambda|}$, also in ([D3], Proposition 4.3) we showed $\zeta_\lambda(1^N) = \omega_\lambda(1^N) = (Nk+1)_\lambda/h(\lambda, 1)$, so for any $\beta \in \mathbb{N}_N$, by Proposition 3.12,
$$\zeta_\beta(1^N) = E_-(\beta)(Nk+1)_{\beta^+}/h(\beta^+, 1). \tag{3.3}$$
Thus by Theorem 3.15(3),
$$J_\lambda(x; 1/k) = \frac{(Nk)_\lambda\,h(\lambda, 1)}{(Nk+1)_\lambda\,k^{|\lambda|}\,(\#S_N\lambda)}\,j_\lambda(x).$$
Note $\#S_N\lambda = \#\{\beta : \beta^+ = \lambda\}$, the dimension of $E_\lambda$. The ${}_1F_0$ formula for Jack polynomials (Yan [Yn], Beerends and Opdam [BO]) asserts
$$\prod_{i=1}^{N}(1-x_i)^{-(Nk+1)} = \sum_{\lambda\in\mathbb{N}_N^P}\frac{(Nk+1)_\lambda}{k^{|\lambda|}h^*(\lambda)h_*(\lambda)}\,J_\lambda\Bigl(x; \frac1k\Bigr).$$
We convert this to an expression in $j_\lambda$.
Lemma 3.18.
$$h(\lambda, k) = \frac{(Nk)_\lambda\,E_+(\lambda^R)\,h(\lambda, k+1)}{(Nk+1)_\lambda\,(\#S_N\lambda)},$$
for $\lambda \in \mathbb{N}_N^P$.
Proof. We will show
$$\frac{h(\lambda, k)(Nk+1)_\lambda}{h(\lambda, k+1)(Nk)_\lambda} = \frac{E_+(\lambda^R)}{\#S_N\lambda}.$$
Denote the factors of $h(\lambda, t)$ by $h(i, j; t) = \lambda_i - j + t + k\,\#\{s : s > i \text{ and } j \le \lambda_s \le \lambda_i\}$, $1 \le j \le \lambda_i$. Then $h(i, j; k) = h(i, j+1; k+1)$ whenever $j \ne \lambda_s$ for any $s$. The ratio $h(\lambda, k)/h(\lambda, k+1)$, after cancellation, is a product of factors like $h(i, \lambda_i; k)/h(i, 1; k+1)$ and $h(i, \lambda_s; k)/h(i, \lambda_s+1; k+1)$, for $\lambda_s < \lambda_i$. We have $h(i, \lambda_i; k) = k(1 + \#\{s : s > i, \lambda_s = \lambda_i\})$ and
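$h(i, 1; k+1) = \lambda_i + k + k\,\#\{s : s > i, 1 \le \lambda_s \le \lambda_i\} = \lambda_i + k(N - m_0 - i + 1)$, where $m_0$ is the number of zero parts of $\lambda$ (as element of $\mathbb{N}_N$). For each $j \in \mathbb{Z}_+$, let $m_j = \#\{i : \lambda_i = j\}$, then
$$\prod_{i=1}^{N-m_0}h(i, \lambda_i; k) = k^{N-m_0}\prod_{j\ge 1}m_j!.$$
Also,
$$\frac{(Nk+1)_\lambda}{(Nk)_\lambda} = \prod_{i=1}^{N-m_0}\frac{\lambda_i + (N-i+1)k}{k(N-i+1)} = \frac{m_0!}{k^{N-m_0}N!}\prod_{i=1}^{N-m_0}\bigl(\lambda_i + (N-i+1)k\bigr).$$
Thus
$$\frac{h(\lambda, k)(Nk+1)_\lambda}{h(\lambda, k+1)(Nk)_\lambda} = \frac{m_0!\prod_{j\ge 1}m_j!}{N!}\prod_{i=1}^{N-m_0}\prod\left\{\frac{\lambda_i - \lambda_j + k(1 + \#\{s : s > i, \lambda_j \le \lambda_s \le \lambda_i\})}{\lambda_i - \lambda_j + k(1 + \#\{s : s > i, \lambda_j < \lambda_s \le \lambda_i\})} : \text{distinct values of } \lambda_j < \lambda_i\right\}.$$
In the latter product $\lambda_j = 0$ is used; it contributes
$$\frac{\lambda_i + k(N-i+1)}{\lambda_i + k(N-m_0-i+1)}.$$
Since $\#S_N\lambda = N!/\prod_j m_j!$ it suffices to identify the remaining product with
$$E_+(\lambda^R) = \prod\left\{\frac{\kappa_i(\lambda)-\kappa_j(\lambda)+k}{\kappa_i(\lambda)-\kappa_j(\lambda)} : i < j \text{ and } \lambda_i > \lambda_j\right\}.$$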
Let µ1 , µ2 be two distinct values in (λ1 , λ2 , . . . , λN ) with µ1 > µ2 ; suppose λa1 = λa1 +1 = · · · = λa1 +n1 −1 = µ1 and λa2 = · · · = λa2 +n2 −1 = µ2 (and no other appearances of µ1 , µ2 ) and a1 +n1 −1 < a2 . This implies κt (λ) = (N −t+1)k+µ1 +1, a1 ≤ t < a1 +n1 and κu (λ) = (N − u + 1)k + µ2 + 1, a2 ≤ u < a2 + n2 .
The contribution of the pair $(\mu_1, \mu_2)$ to the left-hand product is
$$\prod_{t=a_1}^{a_1+n_1-1}\frac{\mu_1-\mu_2+k(a_2+n_2-t)}{\mu_1-\mu_2+k(a_2-t)} = \prod_{t=a_1}^{a_1+n_1-1}\ \prod_{u=a_2}^{a_2+n_2-1}\frac{\mu_1-\mu_2+k(u+1-t)}{\mu_1-\mu_2+k(u-t)} = \prod_{t,u}\frac{\kappa_t(\lambda)-\kappa_u(\lambda)+k}{\kappa_t(\lambda)-\kappa_u(\lambda)},$$
which is exactly the contribution of $(\mu_1, \mu_2)$ to $E_+(\lambda^R)$.
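Lemma 3.18 can also be checked numerically for small partitions. The sketch below evaluates both sides with exact fractions, computing $E_+(\lambda^R)$ straight from Definition 3.10 on the reversed composition rather than from the product formula used in the proof. All function names are illustrative assumptions; the orbit size $\#S_N\lambda$ is obtained by brute-force enumeration, which is only sensible for small $N$.

```python
from fractions import Fraction
from itertools import permutations

def kappa(alpha, i, k):
    """kappa_i(alpha) for a 1-based index i."""
    n = len(alpha)
    a_i = alpha[i - 1]
    larger = sum(1 for a in alpha if a > a_i)
    equal_before = sum(1 for s in range(i - 1) if alpha[s] == a_i)
    return n * k - k * (larger + equal_before) + a_i + 1

def gen_shifted_factorial(t, lam, k):
    """(t)_lambda = prod_i (t - (i - 1) k)_{lambda_i}."""
    result = Fraction(1)
    for idx, part in enumerate(lam):
        for r in range(part):
            result *= t - idx * k + r
    return result

def hook_product(lam, t, k):
    """h(lambda, t) as in Definition 3.17 (rows with lambda_i = 0 contribute 1)."""
    n = len(lam)
    result = Fraction(1)
    for i in range(n):
        for j in range(1, lam[i] + 1):
            count = sum(1 for s in range(i + 1, n) if j <= lam[s] <= lam[i])
            result *= lam[i] - j + t + k * count
    return result

def e_plus(beta, k):
    """E_+(beta) for the full interval I = [1, N]."""
    n = len(beta)
    result = Fraction(1)
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            if beta[i - 1] < beta[j - 1]:
                result *= 1 + k / (kappa(beta, j, k) - kappa(beta, i, k))
    return result

def lemma_3_18_sides(lam, k):
    """Return (left side, right side) of Lemma 3.18 for the partition lam."""
    n = len(lam)
    orbit_size = len(set(permutations(lam)))
    left = hook_product(lam, k, k)
    right = (gen_shifted_factorial(n * k, lam, k) * e_plus(tuple(reversed(lam)), k)
             * hook_product(lam, k + 1, k)
             / (gen_shifted_factorial(n * k + 1, lam, k) * orbit_size))
    return left, right
```

For example, $\lambda = (2, 1, 0)$ with $k = 1$ gives $h(\lambda, 1) = 3$ on the left and $24 \cdot \tfrac{45}{16} \cdot 16/(60 \cdot 6) = 3$ on the right.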
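Proposition 3.19.
$$\prod_{i=1}^{N}(1-x_i)^{-(Nk+1)} = \sum_{\lambda\in\mathbb{N}_N^P}\frac{(Nk+1)_\lambda}{h(\lambda, k+1)E_+(\lambda^R)}\,j_\lambda(x) \quad (|x_i| < 1 \text{ each } i).$$
Proof. This is exactly the ${}_1F_0$ series with $h(\lambda, k)$ replaced using the lemma.
Corollary 3.20. For $\lambda \in \mathbb{N}_N^P$, $\|\zeta_\lambda\|_p^2 = \dfrac{h(\lambda, k+1)}{h(\lambda, 1)}$.
Proof. By the definition of the p-inner product,
$$F_k(x, y) = \sum_{\beta\in\mathbb{N}_N}\frac{1}{\|\zeta_\beta\|_p^2}\,\zeta_\beta(x)\zeta_\beta(y).$$
Set $y = 1^N$, and use $\|\zeta_\beta\|_p^2 = E_+(\beta)E_-(\beta)\|\zeta_\lambda\|_p^2$, $\zeta_\beta(1^N) = E_-(\beta)\zeta_\lambda(1^N) = E_-(\beta)(Nk+1)_\lambda/h(\lambda, 1)$, for $\lambda = \beta^+$. This shows
$$\prod_{i=1}^{N}(1-x_i)^{-(Nk+1)} = \sum_{\lambda\in\mathbb{N}_N^P}\frac{(Nk+1)_\lambda}{\|\zeta_\lambda\|_p^2\,h(\lambda, 1)}\sum_{\beta\in S_N\lambda}\frac{\zeta_\beta(x)}{E_+(\beta)} = \sum_{\lambda\in\mathbb{N}_N^P}\frac{(Nk+1)_\lambda}{\|\zeta_\lambda\|_p^2\,h(\lambda, 1)E_+(\lambda^R)}\,j_\lambda(x).$$
Match up the coefficients with the ${}_1F_0$-expansion.
The value of $\|\zeta_\lambda\|_p^2$ was first obtained by Sahi who used a recurrence relation (lowering the degree). There is one more important $S_N$-invariant inner product for which $T_i x_i$ is self-adjoint, $1 \le i \le N$, obtained by considering polynomials as (analytic) functions on the torus in $\mathbb{C}^N$ (see [D3], Proposition 4.2).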
Definition 3.21. For $f, g$ polynomials with coefficients in $\mathbb{Q}(k)$, let
$$\langle f, g\rangle_T := \frac{\Gamma(k+1)^N}{(2\pi)^N\,\Gamma(Nk+1)} \int_{T^N} f(x)\,g^{\vee}(x) \prod_{1\le j<\ell\le N} \big((x_j - x_\ell)(x_j^{-1} - x_\ell^{-1})\big)^k \, dm(x), \qquad (3.4)$$
where $g^{\vee}(x) := g(x_1^{-1}, x_2^{-1}, \ldots, x_N^{-1})$, $dm(x) = d\theta_1\cdots d\theta_N$ and $x_j = \exp(\sqrt{-1}\,\theta_j)$, $-\pi < \theta_j \le \pi$, $1 \le j \le N$.

Beerends and Opdam [BO] evaluated $\langle J_\lambda, J_\lambda\rangle_T$, from which one can deduce: for $\lambda \in \mathbb{N}_N^P$, $g \in E_\lambda$,
$$\|g\|_T^2 = \frac{(Nk+1)_\lambda}{((N-1)k+1)_\lambda}\,\|g\|_p^2.$$
Baker and Forrester [BF3] computed $\|\zeta_\alpha\|_T^2$ with a different method.

4. Subgroup Alternating Polynomials

Whenever a polynomial is skew-symmetric for a parabolic subgroup of $S_N$ it is divisible by an appropriate minimal alternating polynomial (a product of discriminants). The evaluation of the quotient at $x = 1^N$ is a generalization of the Weyl dimension formula. This evaluation for polynomials in $E_\lambda$ ($\lambda \in \mathbb{N}_N^P$) will be carried out in this section, by constructing for each interval $I \subset [1, N]$ a skew operator $\psi_I$ with at least these properties:
(1) $\psi_I w = \mathrm{sgn}(w)\,w\,\psi_I$, for $w \in S_I$;
(2) $[\psi_I, U_j] = 0$ for $j \notin I$;
(3) if $I_1$ is an interval disjoint from $I$, then $[\psi_I, w] = 0$ for $w \in S_{I_1}$;
(4) if $\alpha \in \mathbb{N}_N$ and satisfies $(>, I)$, then $\psi_I$ maps $\mathrm{span}\{\zeta_{w\alpha} : w \in S_I\}$ to itself.
Heuristically, one might suspect $\prod_{i<j}(\tau_i - \tau_j)$ works, but it is not skew because of non-commutativity, and $\prod_{i<j}(U_i - U_j)$ has the wrong transformation properties (for $\#I \ge 3$).
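The normalizing constant in the torus inner product (3.4) can be sanity-checked for positive integer $k$. In that case the weight is a Laurent polynomial, so its integral over $T^N$ is $(2\pi)^N$ times its constant term, and $\langle 1, 1\rangle_T = 1$ amounts to the constant term being $(Nk)!/(k!)^N$ (Dyson's constant-term identity). The following is a minimal sketch under that integrality assumption; the function names are illustrative and not taken from the paper.

```python
from itertools import combinations
from math import factorial


def multiply(p, q):
    """Multiply two Laurent polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for ea, ca in p.items():
        for eb, cb in q.items():
            e = tuple(a + b for a, b in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    return {e: c for e, c in out.items() if c != 0}


def pair_factor(n, j, l):
    """(x_j - x_l)(1/x_j - 1/x_l) = 2 - x_j/x_l - x_l/x_j as a Laurent polynomial."""
    zero = [0] * n
    up = zero.copy()
    up[j], up[l] = 1, -1
    down = zero.copy()
    down[j], down[l] = -1, 1
    return {tuple(zero): 2, tuple(up): -1, tuple(down): -1}


def torus_constant_term(n, k):
    """Constant term of prod_{j<l} ((x_j - x_l)(1/x_j - 1/x_l))^k for integer k >= 0."""
    weight = {tuple([0] * n): 1}
    for j, l in combinations(range(n), 2):
        factor = pair_factor(n, j, l)
        for _ in range(k):
            weight = multiply(weight, factor)
    return weight.get(tuple([0] * n), 0)


def dyson_value(n, k):
    """Gamma(Nk+1) / Gamma(k+1)^N for integer k."""
    return factorial(n * k) // factorial(k) ** n
```

Agreement of `torus_constant_term(n, k)` with `dyson_value(n, k)` for small $N$ and $k$ is consistent with the constant $\Gamma(k+1)^N/((2\pi)^N\Gamma(Nk+1))$ in (3.4); it is only a spot check, not a proof.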
Definition 4.1. For an interval $I$, let $a_I$ denote the minimal alternating polynomial for $I$, that is, $a_I(x) := \prod\{x_i - x_j : i < j \text{ and } i, j \in I\}$. Let $A_I$ denote the associated division symmetrizing operator on polynomials:
$$A_I f(x) := \frac{1}{(\#I)!} \sum_{w\in S_I} \mathrm{sgn}(w)\, f(xw)/a_I(x).$$
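To make Definition 4.1 concrete, the sketch below evaluates $A_I f$ at a rational point whose coordinates in $I$ are distinct. It assumes 0-based indices and takes $f$ as a Python callable on a coordinate list; both choices are illustrative conventions rather than the paper's. Evaluation at $x = 1^N$ is not available this way, since $a_I(1^N) = 0$; that limit is what Theorem 4.14 addresses.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def sign(perm):
    """Sign of a permutation given as a tuple, computed from its inversion count."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def a_I(x, I):
    """Minimal alternating polynomial: product of x_i - x_j over i < j in I."""
    out = Fraction(1)
    for a in range(len(I)):
        for b in range(a + 1, len(I)):
            out *= x[I[a]] - x[I[b]]
    return out


def A_I(f, x, I):
    """Division symmetrizing operator of Definition 4.1, evaluated at the point x."""
    total = Fraction(0)
    for perm in permutations(range(len(I))):
        y = list(x)
        # Permute only the coordinates indexed by the interval I.
        for pos, src in zip(I, perm):
            y[pos] = x[I[src]]
        total += sign(perm) * f(y)
    return total / (factorial(len(I)) * a_I(x, I))
```

For example, with $I$ the first three coordinates and $f = x_1^2 x_2$, the signed sum is the Vandermonde determinant $a_I(x)$, so $A_I f$ is the constant $1/3!$; a symmetric $f$ is annihilated.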
For $\alpha$ satisfying $(>, I)$ we will evaluate the functional $(A_I a_{\alpha;I})(1^N)$, in fact, the more general situation for a collection of disjoint intervals, $((A_{I_1}A_{I_2}\cdots)f)(1^N)$, for suitable $f$. We will impose one more condition on $\psi_I$, which, surprisingly, is enough to determine the restriction to $X$ uniquely. We will also construct an operator on $X$, using the Gram matrix $H$, satisfying the conditions; this will allow the determination of the matrix entries for $\psi_I$ in the basis $\{g_{w\alpha} : w \in S_I\}$. The aforementioned condition comes from the idea of a “reversing” transformation: the requirement that $\psi_I\zeta_{w\alpha}$ is a simultaneous eigenvector of $\tau_i - k\sum_{i<j\le \ell+m}(ij)$, for $\ell < i \le \ell+m$. Note that $U_i = \tau_i - k\sum_{\ell<j<i}(ij)$.

Definition 4.2. For $i \in I$, let
$$\theta_i := \tau_i - \frac{k}{2}\sum_{j\ne i,\; j\in I}(ij).$$
We show later that if $\psi_I$ is skew and $[\psi_I, \theta_i] = 0$ for each $i \in I$, then $\psi_I$ has the reversing property. Fix $\alpha$ satisfying $(>, I)$, $I = [\ell+1, \ell+m]$ and $X = \mathrm{span}\{g_{w\alpha} : w \in S_I\} = \mathrm{span}\{\zeta_{w\alpha} : w \in S_I\}$. For a linear transformation $A$ on $X$ we use the matrix notation $A g_\beta = \sum_{\gamma\in S_I\alpha} A(\gamma, \beta)\, g_\gamma$.

Lemma 4.3. The linear transformation $A$ on $X$ is skew if and only if $A(w_1\alpha, w_2\alpha) = \mathrm{sgn}(w_1)\,c(w_1^{-1}w_2)$ for some function $c$ on $S_I$.

Proof. Given the function $c$ define $A$ as indicated (recall $w_1 \ne w_2$ implies $w_1\alpha \ne w_2\alpha$). The transformation $A$ is skew if and only if $(\mathrm{sgn}\, w_1)\, w_1 A g_{w_2\alpha} = A w_1 g_{w_2\alpha}$ for each $w_1, w_2 \in S_I$, that is
$$\mathrm{sgn}(w_1)\sum_{w\in S_I} A(w\alpha, w_2\alpha)\, w_1 g_{w\alpha} = \mathrm{sgn}(w_1)\sum_{w\in S_I} A(w_1^{-1}w\alpha, w_2\alpha)\, g_{w\alpha} = \sum_{w\in S_I} A(w\alpha, w_1w_2\alpha)\, g_{w\alpha}.$$
Matching up coefficients of $g_{w\alpha}$ shows that $A$ is skew if and only if $\mathrm{sgn}(w_1)A(w_1^{-1}w\alpha, w_2\alpha) = A(w\alpha, w_1w_2\alpha)$ for all $w, w_1, w_2 \in S_I$, consistent with the relation $A(w\alpha, w_1w_2\alpha) = \mathrm{sgn}(w)\,c(w^{-1}w_1w_2)$.
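Corollary 4.4. With the same hypotheses,
$$A a_{\alpha;I} = \Big(\sum_{w\in S_I}\mathrm{sgn}(w)\,c(w)\Big) j_{\alpha;I} \quad\text{and}\quad A j_{\alpha;I} = \Big(\sum_{w\in S_I} c(w)\Big) a_{\alpha;I}.$$

Proof. By Definition 3.10,
$$A a_{\alpha;I} = \sum_{w_1}\sum_{w_2} \mathrm{sgn}(w_2)\,A(w_1\alpha, w_2\alpha)\, g_{w_1\alpha} = \sum_{w_1} g_{w_1\alpha}\Big(\sum_{w_2}\mathrm{sgn}(w_2)\,c(w_1^{-1}w_2)\,\mathrm{sgn}(w_1)\Big) = \sum_{w_1} g_{w_1\alpha}\Big(\sum_{w_3}\mathrm{sgn}(w_3)\,c(w_3)\Big),$$
changing the second summation variable $w_2 = w_1w_3$. A similar argument shows $A j_{\alpha;I} = \big(\sum_w c(w)\big) a_{\alpha;I}$.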
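Two more relations apply when $\alpha$ satisfies $(>, I)$:
$$\kappa_i(w\alpha) = \kappa_{w^{-1}(i)}(\alpha), \quad\text{for } i \in I,\ w \in S_I; \qquad (4.1)$$
$$E_\varepsilon(w\alpha; I)\,E_\varepsilon(\sigma_I w\alpha; I) = E_\varepsilon(\alpha^R; I), \quad\text{for } w \in S_I,\ \varepsilon = \pm. \qquad (4.2)$$
For the second equation, note for any given $w\alpha$, and $\ell+1 \le i < j \le \ell+m$, the term $1 + \dfrac{\varepsilon k}{\kappa_i(\alpha) - \kappa_j(\alpha)}$ appears in $E_\varepsilon(w\alpha; I)$ if $w(j) < w(i)$, else in $E_\varepsilon(\sigma_I w\alpha; I)$; note $(w\alpha)_{w(i)} = \alpha_i$.

Lemma 4.5. Suppose $\alpha$ satisfies $(>, I)$, then
$$\theta_i g_\beta = \kappa'_i(\beta)\, g_\beta + \frac{k}{2}\sum_{j\in I,\, j\ne i} \mathrm{sgn}(\beta_j - \beta_i)\,(ij)\, g_\beta,$$
for $\beta \in S_I\alpha$.

Proof. By Proposition 3.5,
$$\theta_i g_\beta = \kappa'_i(\beta)\, g_\beta + \frac{k}{2}\sum\{(ij)g_\beta : j \in I,\ \beta_j > \beta_i\} - \frac{k}{2}\sum\{(ij)g_\beta : j \in I,\ \beta_j < \beta_i\}.$$
The case $\beta_i = \beta_j$ cannot occur.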
The first important example of a skew operator commuting with each $\theta_i$ is defined using the Gram matrix for $H$; thus the domain is just the space $X$ (for a given $\alpha$ satisfying $(>, I)$). Let $P$ be the operator on $X$ with the matrix $P(w_1\alpha, w_2\alpha) = \mathrm{sgn}(w_1)\,\delta_{w_1,w_2}$ (“$P$” suggests parity). Note that $P^2 = 1$.

Proposition 4.6. The operator on $X$ with the matrix $PH$ is skew and $[PH, \theta_i] = 0$ for $i \in I$.

Proof. Let $A_i$ be the matrix for $\theta_i$ in the basis $\{g_{w\alpha} : w \in S_I\}$. By the lemma, $A_i(\gamma, \beta) = \kappa'_i(\beta)$ if $\gamma = \beta$, and $= \frac{k}{2}\,\mathrm{sgn}(\beta_j - \beta_i)$ if $\gamma = (ij)\beta$, and $= 0$ else. Thus $A_i(\gamma, \beta) = 0$ unless $\gamma = \beta$ or $\gamma = (ij)\beta$ for some $j \in I$, $j \ne i$. This shows $P A_i P = A_i^T$ ($T$ for transpose), since $(P A_i P)(w_1\alpha, w_2\alpha) = \mathrm{sgn}(w_1)\,\mathrm{sgn}(w_2)\,A_i(w_1\alpha, w_2\alpha)$, thus $(P A_i P)((ij)\beta, \beta) = -A_i((ij)\beta, \beta)$, for $\beta \in S_I\alpha$. Also $\theta_i$ is self-adjoint and $H$ is a scalar multiple of the Gram matrix for $\{g_{w\alpha} : w \in S_I\}$, hence $A_i^T H = H A_i$, that is $P A_i P H = H A_i$ and $A_i P H = P H A_i$. The operator $PH$ is skew by the lemma (and $H(w_1\alpha, w_2\alpha) = H(\alpha, w_1^{-1}w_2\alpha)$, $w_1, w_2 \in S_I$).
Proposition 4.7. Suppose $A$ is a skew operator on $X$, corresponding to the function $c$ on $S_I$, then $[A, \theta_i] = 0$ for each $i \in I$ if and only if
$$(\kappa'_i(w\alpha) - \kappa'_i(\alpha))\,c(w) = \frac{k}{2}\sum_{j\in I,\, j\ne i} c((ij)w)\,\big(\mathrm{sgn}(i - j) + \mathrm{sgn}(w^{-1}(j) - w^{-1}(i))\big),$$
for each $w \in S_I$, $i \in I$.

Proof. For a linear transformation $A$ on $X$ the condition $[A, \theta_i] = 0$ is equivalent to
$$\kappa'_i(\beta)A(\gamma, \beta) + \frac{k}{2}\sum_{j\ne i}\mathrm{sgn}(\beta_j - \beta_i)\,A(\gamma, (ij)\beta) = \kappa'_i(\gamma)A(\gamma, \beta) + \frac{k}{2}\sum_{j\ne i}\mathrm{sgn}(\gamma_i - \gamma_j)\,A((ij)\gamma, \beta)$$
($\gamma, \beta \in S_I\alpha$, $j \in I$). Let $\gamma = w_1\alpha$, $\beta = w_2\alpha$ and replace $A(w_1\alpha, w_2\alpha)$ by $\mathrm{sgn}(w_1)\,c(w_1^{-1}w_2)$. The equation becomes
$$(\kappa'_i(w_2\alpha) - \kappa'_i(w_1\alpha))\,\mathrm{sgn}(w_1)\,c(w_1^{-1}w_2) = \frac{k}{2}\,\mathrm{sgn}(w_1)\sum_{j_0\ne i_0} c\big(w_1^{-1}(w_1(i_0), w_1(j_0))w_2\big)\big(\mathrm{sgn}(i_0 - j_0) + \mathrm{sgn}(w_2^{-1}w_1(j_0) - w_2^{-1}w_1(i_0))\big),$$
where $i_0 = w_1^{-1}(i)$, $j_0 = w_1^{-1}(j)$. Canceling out $\mathrm{sgn}(w_1)$ gives an equation depending only on $w := w_1^{-1}w_2$ and $i_0$, because $\kappa'_i(w_2\alpha) = \kappa'_{w_1(i_0)}(w_2\alpha) = \kappa'_{i_0}(w\alpha)$ and $\kappa'_i(w_1\alpha) = \kappa'_{i_0}(\alpha)$. This is the equation in the statement. Note for any $w \in S_I$, $i \in I$, $\kappa'_i(\alpha) = \kappa'_{w(i)}(w\alpha)$, and $\mathrm{sgn}(\alpha_i - \alpha_j) = -\mathrm{sgn}(j - i)$, for $i, j \in I$.

Corollary 4.8. If $A$ is skew and $[A, \theta_i] = 0$ for each $i \in I$, and $k > 0$, then the values $c(w)$ are uniquely determined for given $c(1)$, $w \in S_I$.

Proof. A certain subset of the equations in the proposition is extracted. For any $w \ne 1$ there is a unique $i \in I$ so that $w(j) = j$ for $j < i$ and $w^{-1}(i) > i$. Specialize the equations to these values of $w, i$; $\mathrm{sgn}(i - j) + \mathrm{sgn}(w^{-1}(j) - w^{-1}(i)) \ne 0$ exactly when $i < j$ and $w^{-1}(i) > w^{-1}(j)$ (by construction $j < i$ implies $w^{-1}(j) < w^{-1}(i)$). Thus
$$(\kappa'_{w^{-1}(i)}(\alpha) - \kappa'_i(\alpha))\,c(w) = -k\sum\{c((ij)w) : w^{-1}(i) > w^{-1}(j),\ i < j \le \ell+m\}.$$
The coefficient of $c(w)$ is nonzero, and if $\beta = (ij)w\alpha$ with $w^{-1}(i) > w^{-1}(j)$, then $\beta \rhd w\alpha$. Thus $c(w)$ is uniquely determined in terms of the values $\{c(w_1) : w_1\alpha \rhd w\alpha,\ w_1 \in S_I\}$ when $k \ne 0$.

This corollary shows that an operator on polynomials that is skew for $S_I$ and commutes with $\theta_i$, $i \in I$, is determined on each $X$ ($= \mathrm{span}\{w\zeta_\alpha : w \in S_I\}$, $\alpha$ satisfies $(>, I)$) as a scalar multiple of $PH$. We derive some equations which will be instrumental in computing the multiple.
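Proposition 4.9. Suppose $A$ is a skew linear operator on $X$ and $[A, \theta_i] = 0$ for each $i \in I$, then there is a constant $b$ such that $A = bPH$ and
(1) $A j_{\alpha;I} = b\,E_+(\alpha^R; I)\,a_{\alpha;I}$;
(2) $A a_{\alpha;I} = b\,E_-(\alpha^R; I)\,j_{\alpha;I}$;
(3) $A\zeta_{w\alpha} = b\,\mathrm{sgn}(w)\,E_+(w\alpha; I)\,\sigma_I\zeta_{\sigma_I w\alpha}$, for $w \in S_I$;
(4) $A^2 = b^2 E_+(\alpha^R; I)E_-(\alpha^R; I)\,1$.

Proof. The fact that $A$ is a scalar multiple of $PH$ follows from Corollary 4.8 and Proposition 4.6. Equations (1) and (2) follow from Corollary 3.16 and Corollary 4.4. We prove (3) by exhibiting a simultaneous eigenvector structure for $\{\sigma_I\zeta_{w\alpha} : w \in S_I\}$ and its relation to $A$. For $i \in I$, let $U_i^R = \tau_i - k\sum_{i<j\le\ell+m}(ij)$, then $\sigma_I U_{\sigma_I(i)}\sigma_I = U_i^R$ (note $\sigma_I(i) = 2\ell + m + 1 - i$). Further $U_i^R A = A U_i$, indeed $U_i = \theta_i + \frac{k}{2}\sum_{j\in I,\, j\ne i}\mathrm{sgn}(j - i)(ij)$ and $U_i^R = \theta_i - \frac{k}{2}\sum_{j\in I,\, j\ne i}\mathrm{sgn}(j - i)(ij)$, and $[A, \theta_i] = 0$, $A(ij) = -(ij)A$ by hypothesis. For $w \in S_I$, $\sigma_I A\zeta_{w\alpha}$ is an eigenvector of $U_{\sigma_I(i)}$ with eigenvalue $\kappa_i(w\alpha)$, each $i \in I$, because
$$\kappa_i(w\alpha)\,A\zeta_{w\alpha} = A U_i\zeta_{w\alpha} = U_i^R A\zeta_{w\alpha} = \sigma_I U_{\sigma_I(i)}(\sigma_I A\zeta_{w\alpha}).$$
Since $\kappa_i(w\alpha) = \kappa_{\sigma_I(i)}(\sigma_I w\alpha)$, this shows there is a constant $v(w)$ so that $\sigma_I A\zeta_{w\alpha} = v(w)\,\zeta_{\sigma_I w\alpha}$, each $w \in S_I$. We use (2) to find $v(w)$; on the one hand
$$A a_{\alpha;I} = b\,E_-(\alpha^R; I)\,j_{\alpha;I} = b\,E_-(\alpha^R; I)\,\sigma_I j_{\alpha;I} = b\,E_-(\alpha^R; I)E_+(\alpha^R; I)\,\sigma_I\sum_{w\in S_I}\frac{1}{E_+(w\alpha; I)}\,\zeta_{w\alpha};$$
on the other hand
$$A a_{\alpha;I} = E_-(\alpha^R; I)\sum_{w\in S_I}\frac{\mathrm{sgn}(w)}{E_-(w\alpha; I)}\,v(w)\,\sigma_I\zeta_{\sigma_I w\alpha}.$$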
In the first equation, change the summation variable to σI w, and match up coefficients of σI σwα in the two equations; this shows v(w) = b sgn(w)E− (wα; I)E+ (αR ; I)/E+ (σI wα; I) = b sgn(w)E− (wα; I)E+ (wα; I) (by (4.2)). Finally, A2 ζwα = b2 sgn(σI )v(w)σI AζσI wα − b2 sgn(σI )v(w)v(σI w)ζwα = b2 E+ (αR ; I)E− (αR ; I)ζwα , w ∈ SI .
We construct a skew operator commuting with each θi , i ∈ I, in the algebra generated by {τi : i ∈ I} ∪ SI , by induction on the size of the interval.
Definition 4.10. For the interval $I = [\ell+1, \ell+m]$ and $1 \le s < m$, let $\psi_1 := 1$,
$$\tilde\psi_{s+1} := U_{\ell+1}U_{\ell+2}\cdots U_{\ell+s}\,\psi_s; \qquad \psi_{s+1} := \tilde\psi_{s+1} - \sum_{i=\ell+1}^{\ell+s}(i, \ell+s+1)\,\tilde\psi_{s+1}\,(i, \ell+s+1).$$
Then $\psi_I := \psi_m$.

Theorem 4.11. The operator $\psi_I$ satisfies
(1) $\psi_I w = \mathrm{sgn}(w)\,w\,\psi_I$ for $w \in S_I$;
(2) $\psi_I w = w\,\psi_I$ for $w \in S_{[1,\ell]} \times S_{[\ell+m+1,N]}$;
(3) $[\psi_I, U_j] = 0$ for $j \notin I$;
(4) $[\psi_I, \theta_i] = 0$, for $i \in I$;
(5) for any $\alpha$ satisfying $(\ge, I)$, $\mathrm{span}\{w\zeta_\alpha : w \in S_I\}$ is an invariant subspace of $\psi_I$.

Proof. Properties (2) and (3) follow immediately from the definition; and property (5) is a consequence of (1). Let $(1_s)$ be the condition $\psi_s(ij) = -(ij)\psi_s$ for $\ell+1 \le i < j \le \ell+s$, and $(4_s)$ be
$$\Big[\psi_s,\ \tau_i - \frac{k}{2}\sum_{j=\ell+1,\, j\ne i}^{\ell+s}(ij)\Big] = 0.$$
The case $s = 1$ is trivial. Inductively, suppose $(1_s)$ and $(4_s)$ are true. For convenience, let $t = \ell+s+1$: then $(ij)\tilde\psi_{s+1} = -\tilde\psi_{s+1}(ij)$ for $\ell+1 \le i < j \le \ell+s$, because $[U_{\ell+1}\cdots U_{\ell+s}, (ij)] = 0$ (see Proposition 3.7). Now evaluate $(ij)\psi_{s+1}(ij)$ using the definition; each term in the sum is transformed to the negative of the corresponding term in $\psi_{s+1}$, with the exception of the terms labeled $i$ and $j$ which are also interchanged; note $(i, j)(i, t) = (j, t)(i, j)$. It suffices to show $(\ell+s, t)\,\psi_{s+1}\,(\ell+s, t) = -\psi_{s+1}$; again the terms in the sum correspond to the negatives, example:
$$(\ell+s, t)(i, t)\,\tilde\psi_{s+1}\,(i, t)(\ell+s, t) = (i, t)(i, \ell+s)\,\tilde\psi_{s+1}\,(i, \ell+s)(i, t) = -(i, t)\,\tilde\psi_{s+1}\,(i, t).$$
The other part is $(\ell+s, t)\big(\tilde\psi_{s+1} - (\ell+s, t)\tilde\psi_{s+1}(\ell+s, t)\big)(\ell+s, t)$. This shows $(1_{s+1})$.

To show $(4_{s+1})$, let $B_j := \tau_j - \frac{k}{2}\sum_{i=\ell+1,\, i\ne j}^{\ell+s}(ij)$, for $\ell+1 \le j \le \ell+s+1$. Then for $j \le \ell+s$, $[\tilde\psi_{s+1}, B_j] = 0$ by $(4_s)$ and $[U_{\ell+1}\cdots U_{\ell+s}, B_j] = 0$; the latter follows from $\tau_j = (\ell+1, j)\,U_{\ell+1}\,(\ell+1, j)$ and $U_{\ell+1}\cdots U_{\ell+s}$ commutes with $U_{\ell+1}$ and $w \in S_{[\ell+1,\ell+s]}$. It will suffice to show $[\psi_{s+1}, B_t] = 0$ since
$$(j, t)\,B_t\,(j, t) = \tau_j - \frac{k}{2}\sum_{i=\ell+1,\, i\ne j}^{\ell+s+1}(ij).$$
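For $\ell+1 \le i \le \ell+s$,
$$[(i, t)\tilde\psi_{s+1}(i, t),\ B_t] = (i, t)\big[\tilde\psi_{s+1},\ (i, t)B_t(i, t)\big](i, t) = (i, t)\Big[\tilde\psi_{s+1},\ B_i - \frac{k}{2}(i, t)\Big](i, t) = -\frac{k}{2}(i, t)\big[\tilde\psi_{s+1}, (i, t)\big](i, t) = \frac{k}{2}\big[\tilde\psi_{s+1}, (i, t)\big].$$
Also $[\tilde\psi_{s+1}, U_t] = 0$ by (3), that is,
$$\Big[\tilde\psi_{s+1},\ \tau_t - k\sum_{i=\ell+1}^{\ell+s}(i, t)\Big] = 0,$$
equivalently,
$$[\tilde\psi_{s+1}, B_t] = \frac{k}{2}\sum_{i=\ell+1}^{\ell+s}\big[\tilde\psi_{s+1}, (i, t)\big].$$
This shows
$$\Big[\tilde\psi_{s+1} - \sum_{i=\ell+1}^{\ell+s}(i, t)\,\tilde\psi_{s+1}\,(i, t),\ B_t\Big] = 0.$$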
Theorem 4.12. Suppose $\alpha$ satisfies $(>, I)$, then
$$\psi_I|_X = \prod\{\kappa_i(\alpha) - \kappa_j(\alpha) : \ell+1 \le i < j \le \ell+m\}\ PH.$$

Proof. By 4.9, $PH\zeta_\alpha = \sigma_I\zeta_{\sigma_I\alpha}$; thus it suffices to compute the coefficient of $\zeta_{\sigma_I\alpha}$ in $\sigma_I\psi_I\zeta_\alpha$. We use the inductive framework from Definition 4.10 and Theorem 4.11. For fixed $s < m$, let $\sigma_0, \sigma_1$ be the reversing permutations for the intervals $[\ell+1, \ell+s]$, $[\ell+1, \ell+s+1]$ respectively. The inductive hypothesis is that
$$\psi_s\zeta_\gamma = \prod_{\ell+1\le i<j\le \ell+s}(\kappa_i(\gamma) - \kappa_j(\gamma))\ \sigma_0\zeta_{\sigma_0\gamma}$$
for any $\gamma$ satisfying $(>, [\ell+1, \ell+s])$ (trivial for $s = 1$). Fix $\alpha$ satisfying $(>, I)$; let
$$\pi_r := \prod\{\kappa_i(\alpha) - \kappa_j(\alpha) : \ell+1 \le i < j \le \ell+s+1,\ i \ne r,\ j \ne r\},$$
and
$$\pi'_r := \prod\{\kappa_i(\alpha) : \ell+1 \le i \le \ell+s+1,\ i \ne r\}.$$
For $\ell+1 \le j \le \ell+s+1$, let $w_{(j)}$ be the cycle $(\ell+s+1, \ell+s, \ldots, j+1, j)$ in $S_{[\ell+1,\ell+s+1]}$. In the notation of Definition 4.10,
$$\psi_{s+1} = \sum_{j=\ell+1}^{\ell+s+1}(-1)^{\ell+s+1-j}\, w_{(j)}^{-1}\,\tilde\psi_{s+1}\, w_{(j)},$$
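because $w_{(j)} = (\ell+s, \ell+s-1, \ldots, j)(j, \ell+1+s)$ (in cycle notation) and $\mathrm{sgn}(w_{(j)}) = (-1)^{\ell+s+1-j}$; coming from the skew property of $\tilde\psi_{s+1}$ for $S_{[\ell+1,\ell+s]}$. Let $\alpha^{(j)} = w_{(j)}\alpha$; thus
$$\alpha^{(j)} = (\alpha_1, \ldots, \alpha_{\ell+1}, \alpha_{\ell+2}, \ldots, \alpha_{j-1}, \alpha_{j+1}, \ldots, \alpha_{\ell+s+1}, \alpha_j, \alpha_{\ell+s+2}, \ldots),$$
which satisfies $(>, [\ell+1, \ell+s])$. We claim the coefficient of $\zeta_{\sigma_1\alpha}$ in $\sigma_1 w_{(j)}^{-1}\tilde\psi_{s+1}w_{(j)}\zeta_\alpha$ is $\pi_j\pi'_j$. By a standard identity for alternating polynomials
$$\sum_{j=\ell+1}^{\ell+s+1}(-1)^{\ell+s+1-j}\,\pi_j\pi'_j = \prod_{\ell+1\le i<j\le \ell+s+1}(\kappa_i(\alpha) - \kappa_j(\alpha)).$$
To establish the claim:
$$w_{(j)}\zeta_\alpha = \zeta_{\alpha^{(j)}} + \sum\{b(\gamma)\zeta_\gamma : \gamma \in S_{[\ell+1,\ell+s+1]}\alpha \text{ and } \gamma \rhd \alpha^{(j)}\}$$
(the triangularity of the matrix relating the bases $\{\zeta_{w\alpha}\}$ and $\{w\zeta_\alpha\}$). By the inductive hypothesis,
$$\sigma_1 w_{(j)}^{-1}\tilde\psi_{s+1}w_{(j)}\zeta_\alpha = \sigma_1 w_{(j)}^{-1}\sigma_0\Big(\pi_j\pi'_j\,\zeta_{\sigma_0\alpha^{(j)}} + \sum\{b'(\gamma)\zeta_{\sigma_0\gamma} : \gamma \rhd \alpha^{(j)}\}\Big),$$
for some coefficients $b'(\gamma)$. But
$$\sigma_1 w_{(j)}^{-1}\sigma_0 = w_{(\sigma_1(j))}^{-1} \qquad (\sigma_1(j) = 2\ell + s + 2 - j),$$
which fixes $[\ell+1, 2\ell+s+1-j]$ pointwise, which shows that
$$\sigma_1 w_{(j)}^{-1}\sigma_0\,\zeta_{\sigma_0\gamma} \in \mathrm{span}\{\zeta_{w\sigma_0\gamma} : w \in S_{[2\ell+s+2-j,\ \ell+s+1]}\}.$$
Now $(\sigma_0\gamma)_{\ell+s+1} = \gamma_{\ell+s+1} \in \{\alpha_{j+1}, \ldots, \alpha_{\ell+s+1}\}$, because $\gamma \rhd w_{(j)}\alpha$; hence for any $\beta = w\sigma_0\gamma$ in this span, there must be an element of $\{\alpha_{j+1}, \ldots, \alpha_{\ell+s+1}\}$ not appearing in $\{\beta_{\ell+1}, \ldots, \beta_{2\ell+s+1-j}\}$, thus $\beta \ne \sigma_1\alpha$. The argument will be finished once it is shown that the coefficient of $\zeta_{\sigma_1\alpha}$ in the $\{\zeta_{w\alpha}\}$ expansion of $w^{-1}\zeta_{w\sigma_1\alpha}$ is 1: in the notation of (3.1),
$$w^{-1}\zeta_{w\sigma_1\alpha} = w^{-1}\Big(g_{w\sigma_1\alpha} + \sum_{\gamma\rhd w\sigma_1\alpha}B(\gamma, w\sigma_1\alpha)\,g_\gamma\Big) = g_{\sigma_1\alpha} + \sum_\gamma B(\gamma, w\sigma_1\alpha)\,g_{w^{-1}\gamma};$$
$\gamma \rhd w\sigma_1\alpha$ implies $w^{-1}\gamma \ne \sigma_1\alpha$, the minimality of $\sigma_1\alpha$ shows the coefficient of $\zeta_{\sigma_1\alpha}$ in the right-hand side is 1.

Corollary 4.13.
$$\psi_I a_{\alpha;I} = \prod_{\ell+1\le i<j\le \ell+m}(\kappa_i(\alpha) - \kappa_j(\alpha) - k)\ j_{\alpha;I}.$$

Proof. $\prod_{i<j}(\kappa_i(\alpha) - \kappa_j(\alpha))\,E_-(\alpha^R; I)$ has the specified value; see Lemma 3.13.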
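The "standard identity for alternating polynomials" used in the proof of Theorem 4.12 is the cofactor expansion of a Vandermonde determinant along its row of ones. The sketch below checks it in exact rational arithmetic for arbitrary stand-in values of $\kappa_{\ell+1}, \ldots, \kappa_{\ell+s+1}$; indices are 0-based, so the sign $(-1)^{\ell+s+1-j}$ becomes $(-1)^{n-1-j}$ with $n = s+1$. The helper names are illustrative.

```python
from fractions import Fraction


def vandermonde(values):
    """Product of values[a] - values[b] over a < b."""
    out = Fraction(1)
    for a in range(len(values)):
        for b in range(a + 1, len(values)):
            out *= values[a] - values[b]
    return out


def cofactor_sum(kappa):
    """Sum over j of (-1)^(n-1-j) * pi_j * pi'_j, where the index j is omitted
    from both the Vandermonde factor pi_j and the plain product pi'_j."""
    n = len(kappa)
    total = Fraction(0)
    for j in range(n):
        rest = kappa[:j] + kappa[j + 1:]
        pi_j = vandermonde(rest)
        pi_prime_j = Fraction(1)
        for value in rest:
            pi_prime_j *= value
        total += (-1) ** (n - 1 - j) * pi_j * pi_prime_j
    return total
```

For any list of rationals, `cofactor_sum(kappa)` should equal `vandermonde(kappa)`, which is the right-hand side $\prod_{i<j}(\kappa_i - \kappa_j)$ of the identity.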
We return to the problem of evaluating $A_I f(1^N)$. The proof of the following comes later.

Theorem 4.14. Suppose $\{I_1, I_2, \ldots, I_t\}$ is a collection of disjoint subintervals of $[1, N]$, $m_i := \#I_i$, $1 \le i \le t$ and $f$ is a polynomial, then
$$(A_{I_1}A_{I_2}\cdots A_{I_t}f)(1^N) = \frac{1}{(Nk+1)_\mu\prod_{i=1}^t m_i!}\,(\psi_{I_1}\psi_{I_2}\cdots\psi_{I_t}f)(1^N),$$
where $\mu = (m_1 - 1, m_1 - 2, \ldots, 1, 0, m_2 - 1, \ldots, 0, \ldots, m_t - 1, \ldots, 1, 0, \ldots)^+$.

In the sequel, for an operator $A$ on polynomials, $A^*$ denotes the adjoint with respect to the $p$-inner product. Let $\iota(x) := \prod_{i=1}^N(1 - x_i)^{-(Nk+1)}$, then $\langle f, \iota\rangle_p = f(1^N)$ for any polynomial $f$; more generally, $\langle f, F_k(\,\cdot\,, z)\rangle_p = f(z)$. Let $u_i := x_i/(1 - x_i)$, $1 \le i \le N$.

Lemma 4.15.
$$A^*_{I_1}\cdots A^*_{I_t}\,\iota(x) = \prod_{i=1}^t\frac{1}{m_i!}\,a_{I_i}(u)\ \prod_{j=1}^N(1 - x_j)^{-(Nk+1)}.$$

Proof. First apply $A_{I_j}$ to $F_k(x, z)$ with respect to $z$. Without loss of generality, assume $I_j = [1, m]$ (with $m = m_j$); then
$$A_{I_j}F_k(x, z) = \frac{1}{m!}\sum_{w\in S_{[1,m]}}\frac{\mathrm{sgn}(w)}{\prod_{i=1}^m(1 - x_i(zw)_i)} \times \prod_{j=m+1}^N(1 - x_jz_j)^{-1}\ \frac{\prod_{i,j=1}^N(1 - x_iz_j)^{-k}}{a_{[1,m]}(z)}$$
$$= \frac{1}{m!}\,a_{[1,m]}(x)\prod_{i,j=1}^m(1 - x_iz_j)^{-1}\prod_{j=m+1}^N(1 - x_jz_j)^{-1}\prod_{i,j=1}^N(1 - x_iz_j)^{-k},$$
where the sum evaluates to $a_{[1,m]}(x)\,a_{[1,m]}(z)/\prod_{i,j=1}^m(1 - x_iz_j)$ by an identity of Cauchy (this identity was used by Baker and Forrester [BF4] in a similar calculation). Because the intervals are disjoint,
$$A_{I_1}A_{I_2}\cdots A_{I_t}F_k(x, z) = \prod_{r=1}^t h_r(x, z)\,F_k(x, z),$$
where
$$h_r(x, z) = \frac{1}{m_r!}\,a_{I_r}(x)\prod\{(1 - x_iz_j)^{-1} : i, j \in I_r,\ i \ne j\}.$$
To find $A^*_{I_1}\cdots A^*_{I_t}\,\iota$ put $z = 1^N$ in this formula; note
$$h_r(x, 1^N) = \frac{1}{m_r!}\,a_{I_r}(x)\prod_{i\in I_r}(1 - x_i)^{-(m_r - 1)} = \frac{1}{m_r!}\,a_{I_r}(u).$$
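Two facts carry the proof of Lemma 4.15: Cauchy's identity for the alternating sum over $S_{[1,m]}$, and the relation $a_I(u) = a_I(x)\prod_{i\in I}(1 - x_i)^{-(m-1)}$ for $u_i = x_i/(1 - x_i)$. Both can be spot-checked with exact rationals for $m = 3$. The sample points below are arbitrary choices for illustration, and the helper names are not from the paper.

```python
from fractions import Fraction
from itertools import permutations


def sign(perm):
    """Sign of a permutation tuple, from its inversion count."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def alternant(v):
    """a(v): product of v_i - v_j over i < j."""
    out = Fraction(1)
    for i in range(len(v)):
        for j in range(i + 1, len(v)):
            out *= v[i] - v[j]
    return out


def cauchy_sum(x, z):
    """Sum over w of sgn(w) / prod_i (1 - x_i z_{w(i)})."""
    total = Fraction(0)
    for perm in permutations(range(len(x))):
        term = Fraction(sign(perm))
        for i, wi in enumerate(perm):
            term /= 1 - x[i] * z[wi]
        total += term
    return total


def cauchy_closed_form(x, z):
    """a(x) a(z) / prod_{i,j} (1 - x_i z_j)."""
    denom = Fraction(1)
    for xi in x:
        for zj in z:
            denom *= 1 - xi * zj
    return alternant(x) * alternant(z) / denom


x = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 5)]
z = [Fraction(1, 7), Fraction(2, 3), Fraction(-1, 4)]
u = [xi / (1 - xi) for xi in x]
```

The first check compares `cauchy_sum(x, z)` with `cauchy_closed_form(x, z)`; the second compares `alternant(u)` with `alternant(x)` times $\prod_i (1 - x_i)^{-(m-1)}$, which is the step that turns $h_r(x, 1^N)$ into $a_{I_r}(u)/m_r!$.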
The operators $\psi_I$ are generated by $\{T_i\rho_i : i \in I\}$ and transpositions. Recall $T_i\rho_i = T_ix_i + k$. Write $T_i^u$ to denote action in the variable $(u_1, u_2, \ldots)$.

Lemma 4.16. Suppose $f$ is a polynomial in $u$, then
$$(T_ix_i + k)(f(u)\iota(x)) = \big((1 + u_i)(T_i^uu_i + k)f(u)\big)\,\iota(x), \quad 1 \le i \le N.$$

Proof. The product rule $T_i(h(x)\iota(x)) = (T_ih(x))\,\iota(x) + h(x)\,\dfrac{\partial\iota(x)}{\partial x_i}$ applies because $\iota(x)$ is $S_N$-invariant. The chain rule implies
$$\frac{(T_ix_i + k)(f(u)\iota(x))}{\iota(x)} = f(u)(1 + k) + (Nk+1)\,u_if(u) + u_i(1 + u_i)\frac{\partial f(u)}{\partial u_i} + k\sum_{j\ne i}\frac{u_i(1 + u_j)f(u) - u_j(1 + u_i)f(u(ij))}{u_i(1 + u_j) - u_j(1 + u_i)}$$
(note $x_i = u_i/(1 + u_i)$). The typical term in the sum equals
$$(1 + u_i)\,\frac{u_if(u) - u_jf(u(ij))}{u_i - u_j} - u_if(u).$$

The effect on $f$ is to raise the degree by 1; in fact, the highest degree term of the operator coincides with $T_i^{u*} = u_i(T_i^uu_i + k)$ ([D3], Proposition 4.1).

Lemma 4.17. For an interval $I = [\ell+1, \ell+m]$ and a polynomial $f(u)$ of degree $t$,
$$\psi_I^*(f(u)\iota(x)) = \Big(\prod_{\ell+1\le i<j\le \ell+m}(T_i^* - T_j^*)f(u) + f_1(u)\Big)\iota(x),$$
where $f_1(u)$ is a polynomial of degree $< t + m(m-1)/2$.

Proof. Using the inductive framework from Theorem 4.11, suppose the statement is true for the interval $[\ell+1, \ell+s]$ (trivial for $s = 1$). In the notation of Definition 4.10, $\tilde\psi_{s+1}^* = \psi_s^*(U_{\ell+1}\cdots U_{\ell+s})^*$ and by Lemma 4.16,
$$U_i^*(f(u)\iota(x)) = \Big((1 + u_i)(T_i^uu_i + k)f(u) - k\sum_{j<i}f(u(ij))\Big)\iota(x),$$
which has the same highest degree terms as $(T_i^*f(u))\,\iota(x)$. By the commutativity of $\{T_i^*\}$, the highest degree term of $\tilde\psi_{s+1}^*(f(u)\iota(x))$ is
$$\Big(\prod_{i=\ell+1}^{\ell+s}T_i^*\prod_{\ell+1\le i<j\le \ell+s}(T_i^* - T_j^*)f(u)\Big)\iota(x)$$
(inductive hypothesis). Finally
$$\psi_{s+1}^* = \tilde\psi_{s+1}^* - \sum_{i=\ell+1}^{\ell+s}(i, \ell+s+1)\,\tilde\psi_{s+1}^*\,(i, \ell+s+1);$$
and the usual identity for $a_{[\ell+1,\ell+s+1]}$ finishes the proof, since $(i, \ell+s+1)\,T_i^*\,(i, \ell+s+1) = T_{\ell+s+1}^*$.
Lemma 4.18.
$$\Big(\prod_{i=1}^t\psi_{I_i}^*\Big)\iota(x) = c_I\prod_{i=1}^t a_{I_i}(u)\ \iota(x)$$
for some constant $c_I$.

Proof. By the previous lemma,
$$\frac{\big(\prod_{i=1}^t\psi_{I_i}^*\big)\iota(x)}{\iota(x)} = f_0(u) + f_1(u),$$
where $f_0(u) = \prod_{i=1}^t a_{I_i}(T^*)1$ and $\deg f_1 < \sum_{i=1}^t m_i(m_i - 1)/2$. Also $f_0(u) + f_1(u)$ is skew for $\prod_{i=1}^t S_{I_i}$ (direct product) by the skew property of $\psi_I$; which implies $f_1 = 0$ and $f_0(u) = c_I\prod_{i=1}^t a_{I_i}(u)$ because this is the unique skew polynomial of minimum degree.

Proof of Theorem 4.14. The lemmas show that
$$\prod_{i=1}^t\psi_{I_i}^*\,\iota(x) = \Big(\Big(\prod_{i=1}^t a_{I_i}(T^{u*})\Big)1\Big)\iota(x).$$
In ([D3], Theorem 3.1) it was shown that $\prod_i a_{I_i} \in E_\mu$ for $\mu = (m_1 - 1, m_1 - 2, \ldots, 1, 0, m_2 - 1, \ldots, 1, 0, \ldots)^+$, thus (by Corollary 2.6)
$$\prod_{i=1}^t a_{I_i}(T^{u*})1 = (Nk+1)_\mu\prod_{i=1}^t a_{I_i}(u)$$
(see also [DH]); and so
$$\prod_{i=1}^t\psi_{I_i}^*\,\iota = (Nk+1)_\mu\Big(\prod_{i=1}^t m_i!\Big)\prod_{i=1}^t A_{I_i}^*\,\iota.$$

Corollary 4.19. Suppose $\alpha$ satisfies $(>, I_i)$ for each $i$, $1 \le i \le t$, then
$$\Big(\prod_{i=1}^t A_{I_i}\sum_{w\in S_I}(\mathrm{sgn}\, w)\,w\zeta_\alpha\Big)(1^N) = \frac{(Nk+1)_{\alpha^+}\,E_-(\alpha, [1, N])}{(Nk+1)_\mu\,h(\alpha^+, 1)}\prod_{r=1}^t\prod\{\kappa_i(\alpha) - \kappa_j(\alpha) - k : i, j \in I_r,\ i < j\},$$
where $S_I := S_{I_1} \times S_{I_2} \times \cdots \times S_{I_t}$.
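Proof. Let $\pi_r = \prod\{\kappa_i(\alpha) - \kappa_j(\alpha) - k : i, j \in I_r,\ i < j\}$. For each interval $I_r$, $\psi_{I_r}\sum_{w\in S_{I_r}}\mathrm{sgn}(w)\,w\zeta_\alpha = \pi_r\sum_{w\in S_{I_r}}w\zeta_\alpha$, by Corollary 4.13. Since $[\psi_{I_r}, w] = 0$ for $w \in S_{I_i}$ and $[\psi_{I_r}, \psi_{I_i}] = 0$, for $i \ne r$ (Theorem 4.11),
$$\prod_{r=1}^t\psi_{I_r}\sum_{w\in S_{I_r}}\mathrm{sgn}(w)\,w\zeta_\alpha = \prod_{r=1}^t\pi_r\sum_{w\in S_I}w\zeta_\alpha$$
(the sum is a product of sums over $S_{I_1}, \ldots, S_{I_t}$). Also
$$\Big(\sum_{w\in S_I}w\zeta_\alpha\Big)(1^N) = \Big(\prod_{i=1}^t m_i!\Big)\zeta_\alpha(1^N),$$
and
$$\zeta_\alpha(1^N) = (Nk+1)_{\alpha^+}\,E_-(\alpha, [1, N])/h(\alpha^+, 1)$$
(by (3.2)). Further
$$A_{I_r}\sum_{w\in S_{I_r}}\mathrm{sgn}(w)\,w\zeta_\alpha = \sum_{w\in S_{I_r}}\mathrm{sgn}(w)\,w\zeta_\alpha/a_{I_r}$$
(and this is a typical term in the product sum over $S_{I_1} \times \cdots \times S_{I_t}$).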
5. Orthogonal Polynomials of Type $B_N$

This section deals with operators and orthogonal decompositions associated with the group generated by sign-changes and permutations of coordinates. Previously, Baker and Forrester [BF2] considered some of the orthogonal polynomials, called generalized Laguerre polynomials. These come from polynomials which are even in each coordinate or odd in each coordinate. The general situation is developed in the sequel. One consequence is a complete set of eigenfunctions for the Hamiltonian of the $B_N$ spin Calogero model ($1/r^2$ interactions confined in harmonic potential), with arbitrary parity, that is, for any subset $A \subset [1, N]$ we find the eigenfunctions which are odd in $x_i$, $i \in A$ and even in $x_i$, $i \notin A$.

The underlying symmetry group $W_N$ is the Weyl group of type $B_N$, called the hyperoctahedral group. It is generated by permutations of coordinates and sign-changes on $\mathbb{R}^N$. The reflections in $W_N$ are $\{\sigma_{ij}, \tau_{ij} : 1 \le i < j \le N\}$ and $\{\sigma_i : 1 \le i \le N\}$, defined by
$$x\sigma_{ij} = (x_1, \ldots, \overset{i}{x_j}, \ldots, \overset{j}{x_i}, \ldots), \quad x\tau_{ij} = (x_1, \ldots, \overset{i}{-x_j}, \ldots, \overset{j}{-x_i}, \ldots), \quad x\sigma_i = (x_1, \ldots, \overset{i}{-x_i}, \ldots).$$
There are two parameters $k, k_1$ in the algebra of differential-difference operators:
$$T_i^Bf(x) := \frac{\partial f}{\partial x_i} + k_1\frac{f(x) - f(x\sigma_i)}{x_i} + k\sum_{j\ne i}\Big(\frac{f(x) - f(x\sigma_{ij})}{x_i - x_j} + \frac{f(x) - f(x\tau_{ij})}{x_i + x_j}\Big) \qquad (5.1)$$
($1 \le i \le N$, for convenience $\sigma_{ij} = \sigma_{ji}$, $\tau_{ij} = \tau_{ji}$ for $j < i$).

The $S_N$-theory can be applied to the analysis of $W_N$ by writing polynomials in the form $x_A\,g(x_1^2, x_2^2, \ldots, x_N^2)$, where $x_A := \prod_{i\in A}x_i$, $A \subset [1, N]$. For a composition $\alpha$ let
$\hat p_\alpha(x) := x_A\,p_\beta(x_1^2, \ldots, x_N^2)$, where $A = \{i : \alpha_i \text{ is odd}\}$, and $\beta_i = \lfloor\alpha_i/2\rfloor$, $1 \le i \le N$. Observe
$$\sum_\alpha\hat p_\alpha(x)\,z^\alpha = \prod_{i=1}^N\Big((1 - x_iz_i)^{-1}\prod_{j=1}^N(1 - x_i^2z_j^2)^{-k}\Big).$$
The raising operator $\hat\rho_i$ is defined by $\hat\rho_i\hat p_\alpha = \hat p(\alpha_1, \ldots, \alpha_i + 1, \ldots)$. We will use $y \in \mathbb{R}^N$ to denote $(x_1^2, x_2^2, \ldots, x_N^2)$ and the $A$-type operators $T_i$ act on $y$. The following properties hold ([D4], Prop. 2.1), for $A \subset [1, N]$, any polynomial $g$ in $y$:
$$(\sigma_{ij} + \tau_{ij})\,x_Ag(y) = \begin{cases}(2x_A)(ij)g(y), & i, j \in A \text{ or } i, j \notin A,\\ 0, & \text{else;}\end{cases} \qquad (5.2)$$
$$T_i^B\hat\rho_i(x_Ag(y)) = \begin{cases}2x_A\,(T_i\rho_ig(y)), & i \in A,\\ 2x_A\big((k_1 - k - \tfrac12)g(y) + T_i\rho_ig(y) - k\sum_{j\in A}(ij)g(y)\big), & i \notin A.\end{cases} \qquad (5.3)$$

Definition 5.1. For $1 \le i \le N$, $U_i^B := T_i^B\hat\rho_i - k\sum_{j<i}(\sigma_{ij} + \tau_{ij})$ acts on polynomials in $x$. For a subset $A \subset [1, N]$, let
$$U_{A,i} := T_i\rho_i - k\sum\{(ij) : j < i \text{ and } j \in A\} \quad\text{for } i \in A,$$
$$U_{A,i} := T_i\rho_i + (k_1 - k - \tfrac12)1 - k\sum\{(ij) : j \in A \text{ or } (j \notin A \text{ and } j < i)\} \quad\text{for } i \notin A,$$
acting on polynomials in $y$.

Proposition 5.2. For $A \subset [1, N]$, $U_i^B(x_Ag(y)) = 2x_A\,U_{A,i}\,g(y)$, $1 \le i \le N$. Also
$$\prod_{i\in A}T_i^B\,(x_Ag(y)) = 2^{\#A}\prod_{i\in A}\Big(U_{A,i} + k_1 - k - \frac12\Big)g(y).$$

This proves the commutativity of $\{U_i^B\}$ in terms of propositions about $S_N$; a direct proof is also possible.

Let $V^B$ denote the intertwining operator for $W_N$, thus $T_i^BV^B = V^B\dfrac{\partial}{\partial x_i}$, and $V^B$ is homogeneous of degree 0; also let $\xi^B$ be the linear operator on polynomials defined by $\xi^B\hat p_\alpha = x^\alpha/\alpha!$. Here is a description of the eigenspace decomposition of $V^B\xi^B$ constructed in [D4]. Each space is irreducible for the algebra generated by $\{T_i^B\hat\rho_i : 1 \le i \le N\}$.

Definition 5.3. A $B$-partition $\alpha$ is a composition $(\alpha_1, \alpha_2, \ldots, \alpha_N)$, where the odd and even parts are respectively nonincreasing, that is, $i < j$ and $\alpha_i \equiv \alpha_j \bmod 2$ implies $\alpha_i \ge \alpha_j$. A standard $B$-partition is one in which the odd parts come first (for some $\ell$, $\alpha_i$ is odd for $i \le \ell$, even for $i > \ell$). For $\alpha \in \mathbb{N}_N$, let $h(\alpha) = (\lfloor\alpha_1/2\rfloor, \lfloor\alpha_2/2\rfloor, \ldots)$, $b(\alpha) = (\alpha_1 - \lfloor\alpha_1/2\rfloor, \alpha_2 - \lfloor\alpha_2/2\rfloor, \ldots)$.
This is an example of a $B$-partition: $\alpha = (4, 5, 3, 4, 2, 0, 1)$, and $h(\alpha) = (2, 2, 1, 2, 1, 0, 0)$, $b(\alpha) = (2, 3, 2, 2, 1, 0, 1)$. The corresponding standard $B$-partition is $(5, 3, 1, 4, 4, 2, 0)$. Any $B$-partition can be rearranged to a standard one; if $\alpha$ is a $B$-partition with $\ell$ odd parts, then there exists a standard $B$-partition $\tilde\alpha$ and $w \in S_N$ so that $w\tilde\alpha = \alpha$ and $1 \le i < j \le \ell$ or $\ell+1 \le i < j \le N$ implies $w(i) < w(j)$ (note $\alpha_{w(i)} = \tilde\alpha_i$, each $i$). The property of permutations described above will be used several times in the sequel.

Definition 5.4. For any subset $A \subset [1, N]$ let $w_A \in S_N$ be the unique permutation satisfying: $w_A([1, \ell]) = A$ (for $\ell = \#A$), and $1 \le i < j \le \ell$ or $\ell+1 \le i < j \le N$ implies $w_A(i) < w_A(j)$. If $1 \le \ell < N$, the correspondence $A \to w_A$ is one-to-one.

Proposition 5.5. Let $A \subset [1, N]$, $\ell = \#A$, then
$$U_i^Bw_A(x_1x_2\cdots x_\ell\,g(y)) = w_A(2x_1\cdots x_\ell\,U_{[1,\ell],s}\,g(y)) = w_A(U_s^B(x_1\cdots x_\ell\,g(y))),$$
for $1 \le i \le N$, $s = w_A^{-1}(i)$.

Proof. $(T_i^B\hat\rho_i)w_A = w_AT_s^B\hat\rho_s$, for $s = w_A^{-1}(i)$, and $(ij)w_A = w_A(w_A^{-1}(i), w_A^{-1}(j))$ for $j \ne i$. The order preserving properties of $w_A$ and Proposition 5.2 imply the stated equations.
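The $B$-partition example above, together with the order-preserving permutation of Definition 5.4, can be reproduced mechanically. The sketch below is illustrative only: it uses tuples for compositions, returns $w_A$ as a list of 1-based positions, and its function names are not from the paper.

```python
def h(alpha):
    """h(alpha): componentwise floor(alpha_i / 2)."""
    return tuple(a // 2 for a in alpha)


def b(alpha):
    """b(alpha): componentwise alpha_i - floor(alpha_i / 2)."""
    return tuple(a - a // 2 for a in alpha)


def is_b_partition(alpha):
    """Odd parts and even parts must each be nonincreasing, read left to right."""
    odds = [a for a in alpha if a % 2 == 1]
    evens = [a for a in alpha if a % 2 == 0]
    return all(p >= q for p, q in zip(odds, odds[1:])) and all(
        p >= q for p, q in zip(evens, evens[1:])
    )


def w_A(alpha):
    """Order-preserving permutation of Definition 5.4 for A = positions of odd parts.
    Returned as the list [w(1), ..., w(N)] of 1-based positions."""
    odd_positions = [i + 1 for i, a in enumerate(alpha) if a % 2 == 1]
    even_positions = [i + 1 for i, a in enumerate(alpha) if a % 2 == 0]
    return odd_positions + even_positions


def standardize(alpha):
    """Standard B-partition alpha~ with alpha~_i = alpha_{w(i)}."""
    return tuple(alpha[pos - 1] for pos in w_A(alpha))
```

Applied to $\alpha = (4, 5, 3, 4, 2, 0, 1)$ this gives $A = \{2, 3, 7\}$, $w_A = (2, 3, 7, 1, 4, 5, 6)$ in one-line form, and the standard $B$-partition $(5, 3, 1, 4, 4, 2, 0)$ quoted in the text.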
The proposition shows how to find joint eigenfunctions of $\{U_i^B\}$ corresponding to arbitrary $B$-partitions once the standard case is done. For a given standard $B$-partition $\alpha$, with $\alpha_i$ being odd exactly when $1 \le i \le \ell$, let $G_\ell = S_{[1,\ell]} \times S_{[\ell+1,N]}$, $\beta = h(\alpha)$, $\mu = \beta^+$. Define $E_\alpha^B = \mathrm{span}\{x_1x_2\cdots x_\ell\,\zeta_\gamma(y) : \gamma = w\beta,\ w \in G_\ell\}$.

Proposition 5.6. $E_\alpha^B$ is invariant under $G_\ell$, $U_i^B(x_1\cdots x_\ell\,\zeta_\gamma(y)) = c_i\,x_1\cdots x_\ell\,\zeta_\gamma(y)$, where $c_i = 2\kappa_i(\gamma)$ for $i \in [1, \ell]$, $c_i = 2(\kappa_i(\gamma) + k_1 - k - \frac12)$ for $i \in [\ell+1, N]$.

This follows from Proposition 5.2. To express the eigenvalues of $V^B\xi^B$, let
$$\Lambda(\alpha) := (Nk+1)_{h(\alpha)^+}\Big((N-1)k + k_1 + \frac12\Big)_{b(\alpha)^+},$$
for $\alpha \in \mathbb{N}_N$. Then $E_\alpha^B$ is an eigenspace for $V^B\xi^B$ with eigenvalue $2^{-|\alpha|}\Lambda(\alpha)^{-1}$. It was shown in ([D4], Proposition 4.2) that $x_1\cdots x_\ell\,\zeta_\beta(y)$ is an eigenfunction of $V^B\xi^B$ with this eigenvalue, and $V^B\xi^B$ commutes with $w \in W_N$.

Again there are three inner products in which each $U_i^B$ is self-adjoint, and which are $W_N$-invariant:
(1) the $p$-product, $\langle x^\alpha, \hat p_\beta\rangle_p = \delta_{\alpha\beta}$;
(2) the $B$-product, $\langle f, g\rangle_B = f(T^B)g(x)|_{x=0}$ (depends on both parameters);
(3) the $N$-torus product,
$$\langle f, g\rangle_T := c_k\int_{T^N}f(x)\,g^{\vee}(x)\prod_{1\le j<\ell\le N}\big|(x_j^2 - x_\ell^2)(x_j^{-2} - x_\ell^{-2})\big|^k\,dm(x)$$
(same notation as in 3.20; $c_k$ is the normalizing constant).
It is obvious that $\langle x_{A_1}g_1(y), x_{A_2}g_2(y)\rangle = 0$ if $A_1, A_2 \subset [1, N]$ and $A_1 \ne A_2$, by the $W_N$-invariance (change the sign of $x_i$ for $i \in A_1 \setminus A_2$ or $i \in A_2 \setminus A_1$). Further
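$\langle x_Ag_1(y), x_Ag_2(y)\rangle = \langle g_1, g_2\rangle$ for the $p$- or $T$-products, using the type $A$ definition on the right. By the results in Sect. 3, let $f(x) = x_1\cdots x_\ell\,\zeta_\gamma(y) \in E_\alpha^B$, then $\|f\|_p^2 = \|\zeta_\gamma\|_p^2 = E_+(\gamma)E_-(\gamma)\,h(\gamma^+, k+1)/h(\gamma^+, 1)$. By the same argument as Proposition 2.5,
$$\|f\|_B^2 = f(T^B)f(x) = 2^{|\alpha|}\Lambda(\alpha)\,\|\zeta_\gamma\|_p^2. \qquad (5.4)$$
In [D2] we showed that
$$\langle f, g\rangle_B = c_{k,k'}\int_{\mathbb{R}^N}\big(e^{-L/2}f\big)\big(e^{-L/2}g\big)\prod_{j=1}^N|x_j|^{2k_1}\prod_{1\le i<j\le N}|x_i^2 - x_j^2|^{2k}\,e^{-|x|^2/2}\,dx, \qquad (5.5)$$
for polynomials $f, g$ where $L := \sum_{i=1}^N(T_i^B)^2$ (the normalizing constant $c_{k,k'}$ is chosen to make $\langle 1, 1\rangle_B = 1$ and is computed by means of the Macdonald-Selberg integral). Define a generalized Hermite polynomial
$$H_\beta(x) := e^{-L/2}\big(x_1\cdots x_\ell\,\zeta_\gamma(y)\big), \qquad (5.6)$$
with $\beta = (2\gamma_1 + 1, \ldots, 2\gamma_\ell + 1, 2\gamma_{\ell+1}, \ldots, 2\gamma_N)$ (so $\gamma = h(\beta)$).

The complete orthogonal basis is given by $\{w_AH_\beta(x) : A \subset [1, N],\ \ell = \#A,\ \beta_i \text{ is odd exactly for } i \in [1, \ell]\}$. By using the “non-symmetric” binomial coefficients introduced by Baker and Forrester [BF3] we can produce an expression for $H_\beta$ in terms of $\{\zeta_\gamma\}$; although very little is known about the coefficients.

Definition 5.7. For $\alpha \in \mathbb{N}_N$, the binomial coefficients (depending on $k$) are implicitly defined by
$$\frac{\zeta_\alpha(y + 1^N)}{\zeta_\alpha(1^N)} = \sum_\gamma\binom{\alpha}{\gamma}\frac{\zeta_\gamma(y)}{\zeta_\gamma(1^N)}, \quad y \in \mathbb{R}^N.$$
It is known that $\binom{\alpha}{\gamma} = 0$ unless $\gamma^+ \subset \alpha^+$ (that is, $(\gamma^+)_i \le (\alpha^+)_i$, $1 \le i \le N$). For any scalar $s$, the homogeneity of $\zeta_\alpha$ implies
$$\frac{\zeta_\alpha(y + s1^N)}{\zeta_\alpha(1^N)} = \sum_\gamma\binom{\alpha}{\gamma}s^{|\alpha| - |\gamma|}\frac{\zeta_\gamma(y)}{\zeta_\gamma(1^N)}, \quad y \in \mathbb{R}^N.$$
When $k = 0$, $\binom{\alpha}{\gamma} = \prod_{i=1}^N\binom{\alpha_i}{\gamma_i}$ (ordinary binomial coefficients).

Proposition 5.8 (Baker and Forrester [BF3]). For $\alpha \in \mathbb{N}_N$,
$$\exp\Big(s\sum_{i=1}^Ny_i\Big)\zeta_\alpha(y) = \sum_{\gamma^+\supset\alpha^+}\frac{h(\alpha^+, k+1)\,E_+(\alpha)}{h(\gamma^+, k+1)\,E_+(\gamma)}\binom{\gamma}{\alpha}s^{|\gamma| - |\alpha|}\,\zeta_\gamma(y), \quad y \in \mathbb{R}^N,\ s \in \mathbb{R}.$$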
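The remark that the coefficients of Definition 5.7 reduce to ordinary binomial coefficients at $k = 0$ can be checked directly, under the assumption (not restated in this section) that $\zeta_\alpha(y) = y^\alpha$ when $k = 0$, so that $\zeta_\alpha(1^N) = 1$. The sketch below then verifies the scaled expansion of $\zeta_\alpha(y + s1^N)$ in exact arithmetic; all names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb


def binom_k0(alpha, gamma):
    """Product of ordinary binomial coefficients (the k = 0 case of Definition 5.7)."""
    out = 1
    for a, g in zip(alpha, gamma):
        out *= comb(a, g)
    return out


def monomial(y, alpha):
    """y^alpha, which is assumed here to equal zeta_alpha(y) at k = 0."""
    out = Fraction(1)
    for yi, a in zip(y, alpha):
        out *= yi ** a
    return out


def shifted_expansion(alpha, y, s):
    """Right-hand side: sum over gamma <= alpha of binom * s^(|alpha|-|gamma|) * y^gamma."""
    total = Fraction(0)
    for gamma in product(*(range(a + 1) for a in alpha)):
        weight = s ** (sum(alpha) - sum(gamma))
        total += binom_k0(alpha, gamma) * weight * monomial(y, gamma)
    return total
```

Comparing `monomial([yi + s for yi in y], alpha)` with `shifted_expansion(alpha, y, s)` confirms the $k = 0$ case; for $k \ne 0$ the coefficients are not given by a closed formula of this kind, as the text notes.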
Proof. The adjoint in the $A$-inner product of multiplication by $\exp(s\sum_iy_i)$ is translation $g(y) \mapsto g(y + s1^N)$. Indeed, suppose $f$ and $g$ are polynomials, then
$$\langle e^{s\sum y_i}f(y), g(y)\rangle_A = f(T)\,e^{s\sum T_i}g(y)\big|_{y=0} = f(T)\exp\Big(s\sum_{i=1}^N\frac{\partial}{\partial y_i}\Big)g(y)\Big|_{y=0} = f(T)\,g(y + s1^N)\big|_{y=0},$$
because $\sum_{i=1}^NT_i = \sum_{i=1}^N\dfrac{\partial}{\partial y_i}$. The given expression is found by using the $A$-norms of the orthogonal basis elements $\zeta_\gamma$ (see Corollary 3.19).

The adjoint of $e^{-sL}$ in the $B$-product is multiplication by $\exp\big(-s\sum_{i=1}^Nx_i^2\big)$, which can be evaluated using the previous result for $\zeta_\gamma(y)$. For $\beta \in \mathbb{N}_N$, $0 \le \ell \le N$, let $b(\beta, \ell) = (2\beta_1 + 1, \ldots, 2\beta_\ell + 1, 2\beta_{\ell+1}, \ldots, 2\beta_N)$. For $\alpha \in \mathbb{N}_N$, $s \in \mathbb{R}$,
$$e^{sL}\big(x_1\cdots x_\ell\,\zeta_\alpha(y)\big) = \sum_{\beta^+\subset\alpha^+}\frac{\Lambda(b(\alpha, \ell))\,h(\beta^+, 1)\,E_-(\alpha)}{\Lambda(b(\beta, \ell))\,h(\alpha^+, 1)\,E_-(\beta)}\binom{\alpha}{\beta}(4s)^{|\alpha| - |\beta|}\,x_1x_2\cdots x_\ell\,\zeta_\beta(y). \qquad (5.7)$$
The formula follows from a similar adjoint-type calculation as in 5.8. Here the norm of the orthogonal basis element is
$$\|x_1x_2\cdots x_\ell\,\zeta_\beta(y)\|_B^2 = 2^{2|\beta| + \ell}\Lambda(b(\beta, \ell))\,\|\zeta_\beta\|_p^2 = 2^{2|\beta| + \ell}\Lambda(b(\beta, \ell))\,h(\beta^+, k+1)\,E_+(\beta)E_-(\beta)/h(\beta^+, 1)$$
(see (5.4)). Put $s = -\frac12$ in the formula to produce an orthogonal basis for $e^{-|x|^2/2}$, $s = -\frac14$ for $e^{-|x|^2}$ in the formula (5.5); that is, including all the polynomials $w_A(x_1\cdots x_\ell\,\zeta_\alpha(y))$, $A \subset [1, N]$, $\ell = \#A$. The special cases $\ell = 0$ and $\ell = N$ have already been obtained by Baker and Forrester [BF3], who called them generalized Laguerre polynomials.

We discuss the connection to the $B_N$-type spin Calogero model in (1.3). From the commutation $[L, x_i] = 2T_i^B$ ([D1], Proposition 2.2), it follows that $e^{-L/2}x_i = (x_i - T_i^B)\,e^{-L/2}$. Also
$$T_i^Bx_i = x_iT_i^B + 1 + 2k_1\sigma_i + k\sum_{j\ne i}(\sigma_{ij} + \tau_{ij}),$$
which shows that
$$e^{-L/2}\,U_i^B\,e^{L/2} = x_iT_i^B - (T_i^B)^2 + 1 + k + (2k_1 - k)\sigma_i + k\sum_{j>i}(\sigma_{ij} + \tau_{ij}).$$
The Hermite polynomials from (5.6) (with $s = -\frac12$) are simultaneous eigenfunctions of these operators. Then
$$H_3 := e^{-L/2}\sum_{i=1}^NU_i^B\,e^{L/2} = \sum_{i=1}^Nx_i\partial_i - \sum_{i=1}^N(T_i^B)^2 + (k_1 - k)\sum_{i=1}^N\sigma_i + N(Nk + 1 + k_1).$$
The eigenvalues depend only on the degree and the number of odd indices,
$$H_3\,e^{-L/2}\big(x_1\cdots x_\ell\,\zeta_\alpha(x^2)\big) = \big((2|\alpha| + \ell) + 2\ell(k - k_1) + N(Nk + 1 + 2k_1 - k)\big)\big(e^{-L/2}x_1\cdots x_\ell\,\zeta_\alpha(x^2)\big).$$
Let
$$h(x) = \prod_{1\le i<j\le N}|x_i^2 - x_j^2|^k\prod_{i=1}^N|x_i|^{k_1},$$
then
$$h(x)\,H_3\,h(x)^{-1} = \sum_{i=1}^N\Big(x_i\frac{\partial}{\partial x_i} - \frac{\partial^2}{\partial x_i^2}\Big) + (k_1 - k)\sum_{i=1}^N\sigma_i + k_1\sum_{i=1}^N\frac{k_1 - \sigma_i}{x_i^2} + 2k\sum_{1\le i<j\le N}\Big(\frac{k - \sigma_{ij}}{(x_i - x_j)^2} + \frac{k - \tau_{ij}}{(x_i + x_j)^2}\Big) + N(k + 1).$$
Conjugating once more leads to
$$e^{-|x|^2/4}\,h(x)\,H_3\,h(x)^{-1}\,e^{|x|^2/4} = H_2 + (k_1 - k)\sum_{i=1}^N\sigma_i + N\Big(k + \frac12\Big).$$
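The eigenvalue of $H_3$ quoted above should equal the sum of the eigenvalues $c_i$ of Proposition 5.6, since $H_3$ is the conjugate of $\sum_iU_i^B$. The sketch below checks this in exact arithmetic. It assumes the eigenvalue formula $\kappa_i(\gamma) = (N - \#\{j : \gamma_j > \gamma_i\} - \#\{j < i : \gamma_j = \gamma_i\})k + \gamma_i + 1$, which is defined earlier in the paper and is not restated in this section, so it is labeled here as an assumption; indices are 0-based and the function names are illustrative.

```python
from fractions import Fraction


def kappa(gamma, i, k):
    """Assumed eigenvalue kappa_i(gamma) of U_i on zeta_gamma (0-based index i)."""
    n = len(gamma)
    bigger = sum(1 for g in gamma if g > gamma[i])
    ties_before = sum(1 for j in range(i) if gamma[j] == gamma[i])
    return (n - bigger - ties_before) * k + gamma[i] + 1


def sum_of_c(gamma, ell, k, k1):
    """Sum of the eigenvalues c_i of Proposition 5.6 (first ell indices are odd)."""
    total = Fraction(0)
    for i in range(len(gamma)):
        c = 2 * kappa(gamma, i, k)
        if i >= ell:
            c += 2 * (k1 - k - Fraction(1, 2))
        total += c
    return total


def h3_eigenvalue(gamma, ell, k, k1):
    """(2|gamma| + ell) + 2 ell (k - k1) + N (N k + 1 + 2 k1 - k)."""
    n = len(gamma)
    return (2 * sum(gamma) + ell) + 2 * ell * (k - k1) + n * (n * k + 1 + 2 * k1 - k)
```

The agreement holds for every composition because the multipliers of $k$ in $\kappa_1, \ldots, \kappa_N$ run through $1, \ldots, N$ exactly once, so only $|\gamma|$, $\ell$ and $N$ survive in the sum.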
The middle term is basically counting the odd indices. Yamamoto [Y] already found that the eigenvalues were evenly spaced. Baker and Forrester [BF3] studied the $A$-version of this model with a similar transformation, namely $\exp\big(-\sum_{i=1}^NT_i^2/2\big)$. Van Diejen [vD] and Kakei [K1, K2, K3] have studied the symmetric ($W_N$-invariant) eigenfunctions of this model.

The theory of symmetric and alternating polynomials from Sects. 3 and 4 can be applied. We will only write down the two-interval situation, but the methods apply to finer partitions as well. Fix an interval $[1, \ell]$, let $G_\ell = S_{[1,\ell]} \times S_{[\ell+1,N]}$, and choose $\alpha \in \mathbb{N}_N$ which satisfies $(\ge, [1, \ell])$ and $(\ge, [\ell+1, N])$ (corresponding to a standard $B$-partition). Then $\alpha^R := \sigma_{[1,\ell]}\sigma_{[\ell+1,N]}\alpha = (\alpha_\ell, \alpha_{\ell-1}, \ldots, \alpha_1, \alpha_N, \ldots, \alpha_{\ell+1})$,
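and $\#G_\ell\alpha$ is the number of distinct permutations of $(\alpha_1, \ldots, \alpha_\ell)$, $(\alpha_{\ell+1}, \ldots, \alpha_N)$. The following polynomial is an invariant of $S_{[1,\ell]} \times W_{[\ell+1,N]}$: Let
$$j_{\alpha;\ell} := x_1x_2\cdots x_\ell\ E_+(\alpha^R, [1, \ell])\,E_+(\alpha^R, [\ell+1, N])\cdot\sum_{\beta\in G_\ell\alpha}\frac{1}{E_+(\beta, [1, \ell])\,E_+(\beta, [\ell+1, N])}\,\zeta_\beta(x_1^2, \ldots, x_N^2).$$
Then $j_{\alpha;\ell} = x_1x_2\cdots x_\ell\sum_ww\zeta_\alpha$ (summing over a complete set of representatives for the cosets $w\{w' \in G_\ell : w'\alpha = \alpha\}$). Further,
$$\|j_{\alpha;\ell}\|_p^2 = (\#G_\ell\alpha)\,E_+(\alpha^R, [1, \ell])\,E_+(\alpha^R, [\ell+1, N])\,\|\zeta_\alpha\|_p^2,$$
where $\|\zeta_\alpha\|_p^2 = E_+(\alpha)E_-(\alpha)\,h(\alpha^+, k+1)/h(\alpha^+, 1)$; and
$$\|j_{\alpha;\ell}\|_B^2 = 2^{2|\alpha| + \ell}\Lambda(b(\alpha, \ell))\,\|j_{\alpha;\ell}\|_p^2.$$
This is also the squared norm (from (5.5)) of $e^{-L/2}j_{\alpha;\ell}$, which has the same $S_{[1,\ell]} \times W_{[\ell+1,N]}$ invariance. (For an interval $I$, $W_I$ is the group generated by $S_I$ and $\{\sigma_i : i \in I\}$.) Further
$$j_{\alpha;\ell}(1^N) = (\#G_\ell\alpha)\,E_-(\alpha)\,(Nk+1)_{\alpha^+}/h(\alpha^+, 1).$$
Suppose that $\alpha$ satisfies $(>, [1, \ell])$ and $(>, [\ell+1, N])$. The following polynomial is alternating for $W_{[1,\ell]} \times S_{[\ell+1,N]}$. Let
$$a_{\alpha;\ell} = x_1\cdots x_\ell\ E_-(\alpha^R, [1, \ell])\,E_-(\alpha^R, [\ell+1, N])\cdot\sum_{w\in G_\ell}\frac{\mathrm{sgn}(w)}{E_-(w\alpha, [1, \ell])\,E_-(w\alpha, [\ell+1, N])}\,\zeta_{w\alpha}(x_1^2, \ldots, x_N^2).$$
Then $a_{\alpha;\ell} = x_1\cdots x_\ell\sum_{w\in G_\ell}\mathrm{sgn}(w)\,w\zeta_\alpha$,
$$\|a_{\alpha;\ell}\|_p^2 = \ell!\,(N - \ell)!\,E_-(\alpha^R, [1, \ell])\,E_-(\alpha^R, [\ell+1, N])\,\|\zeta_\alpha\|_p^2,$$
and $\|a_{\alpha;\ell}\|_B^2 = 2^{2|\alpha| + \ell}\Lambda(b(\alpha, \ell))\,\|a_{\alpha;\ell}\|_p^2$. Further,
$$\frac{a_{\alpha;\ell}(x)}{a_{[1,\ell]}(x^2)\,a_{[\ell+1,N]}(x^2)}\bigg|_{x=1^N} = \frac{(Nk+1)_{\alpha^+}\,E_-(\alpha)}{(Nk+1)_\mu\,h(\alpha^+, 1)}\prod\{\kappa_i(\alpha) - \kappa_j(\alpha) - k : 1 \le i < j \le \ell \text{ or } \ell+1 \le i < j \le N\},$$
where $\mu = (\ell - 1, \ell - 2, \ldots, 1, 0, N - \ell - 1, \ldots, 1, 0)^+$; recall $a_{[1,\ell]}(x^2) = \prod_{1\le i<j\le \ell}(x_i^2 - x_j^2)$. Again $e^{-L/2}a_{\alpha;\ell}$ is also alternating for $W_{[1,\ell]} \times S_{[\ell+1,N]}$, and its squared norm (for (5.5)) equals $\|a_{\alpha;\ell}\|_B^2$.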
As mentioned above, the techniques of Sects. 2 and 3 can be used to describe polynomials with prescribed symmetry for direct products $W_{I_1} \times W_{I_2} \times \cdots \times W_{I_r} \times S_{I_{r+1}} \times \cdots \times S_{I_t}$ for any collection $\{I_1, I_2, \ldots, I_t\}$ of pairwise disjoint intervals in $[1, N]$. More information about the generalized binomial coefficients needs to be obtained so that more concrete algorithms for the type-B Hermite polynomials can be found. The polynomials could be useful in the numerical cubature associated to the Macdonald-Mehta-Selberg integral. Also, such knowledge could lead to orthogonal bases for harmonic polynomials, which, by definition, are annihilated by $\sum_{i=1}^{N} (T_i^B)^2$; for example, these polynomials appear when one uses spherical polar coordinate systems to find eigenfunctions of some Hamiltonians, see Sect. 3.4 in [vD].

References
[BF1] Baker, T.H. and Forrester, P.J.: The Calogero–Sutherland model and generalized classical polynomials. Commun. Math. Phys. 188, 195–216 (1997)
[BF2] Baker, T.H. and Forrester, P.J.: The Calogero–Sutherland model and polynomials with prescribed symmetry. Nucl. Phys. B492, 682–716 (1997)
[BF3] Baker, T.H. and Forrester, P.J.: Non-symmetric Jack polynomials and integral kernels. Duke Math. J. to appear; preprint, q-alg/9612003
[BF4] Baker, T.H. and Forrester, P.J.: Symmetric Jack polynomials from the non-symmetric theory. Preprint, q-alg/9707001, 1 Jul. 1997
[BDF] Baker, T.H., Dunkl, C.F. and Forrester, P.J.: Polynomial eigenfunctions of the Calogero–Sutherland–Moser models with exchange terms; Proc. CRM Workshop on Calogero–Sutherland–Moser models. To appear
[BO] Beerends, R. and Opdam, E.: Certain hypergeometric series related to the root system BC. Trans. Am. Math. Soc. 339, 581–609 (1993)
[C] Cherednik, I.: A unification of the Knizhnik–Zamolodchikov and Dunkl operators via affine Hecke algebras. Inv. Math. 106, 411–432 (1991)
[vD] van Diejen, J.F.: Confluent hypergeometric orthogonal polynomials related to the rational quantum Calogero system with harmonic confinement. Commun. Math. Phys. 188, 467–497 (1997)
[D1] Dunkl, C.F.: Differential-difference operators associated to reflection groups. Trans. Am. Math. Soc. 311, 167–183 (1989)
[D2] Dunkl, C.F.: Integral kernels with reflection group invariance. Canadian J. Math. 43, 1213–1227 (1991)
[D3] Dunkl, C.F.: Intertwining operators and polynomials associated with the symmetric group. Monatsh. Math. to appear
[D4] Dunkl, C.F.: Intertwining operators of type $B_N$. Proc. CRM Workshop on q-special functions and algebraic methods. Preprint CRM-2380, 1996
[DH] Dunkl, C.F. and Hanlon, P.: Integrals of polynomials associated with tableaux and the Garsia–Haiman conjecture. Math. Z. 228, 537–567 (1998)
[K1] Kakei, S.: Common algebraic structure for the Calogero–Sutherland models. Preprint, solv-int/9608009, 27 Aug. 1996, J. Phys. A29, L619–L624 (1996)
[K2] Kakei, S.: An orthogonal basis for the $B_N$-type Calogero model. Preprint, solv-int/9610010, 28 Oct. 1996
[K3] Kakei, S.: Intertwining operators for a degenerate double affine Hecke algebra and multivariable orthogonal polynomials. Preprint, q-alg/9706019, 17 June 1997
[LV1] Lapointe, L. and Vinet, L.: A Rodrigues formula for the Jack polynomials and the Macdonald–Stanley conjecture. IMRN 9, 419–424 (1995)
[LV2] Lapointe, L. and Vinet, L.: Exact operator solution of the Calogero–Sutherland model. Commun. Math. Phys. 178, 425–452 (1996)
[Sa] Sahi, S.: A new scalar product for nonsymmetric Jack polynomials. IMRN 20, 997–1004 (1996)
[St] Stanley, R.P.: Some combinatorial properties of Jack symmetric functions. Adv. Math. 77, 76–115 (1989)
[Y] Yamamoto, T.: Multicomponent Calogero model of $B_N$-type confined in harmonic potential. Phys. Lett. A208, 293–302 (1995)
[YT] Yamamoto, T. and Tsuchiya, O.: Integrable $1/r^2$ spin chain with reflecting end. J. Phys. A29, 3977–3984 (1996)
[Yn] Yan, Z.: A class of hypergeometric functions in several variables. Canad. J. Math. 44, 1317–1338 (1992)
Communicated by T. Miwa
Commun. Math. Phys. 197, 489 – 519 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
On the Algebras of BPS States
Jeffrey A. Harvey$^1$, Gregory Moore$^2$
$^1$ Enrico Fermi Institute, University of Chicago, 5640 Ellis Avenue, Chicago, IL 60637, USA. E-mail: [email protected]
$^2$ Department of Physics, Yale University, New Haven, CT 06511, USA. E-mail: [email protected]
Received: 29 August 1997 / Accepted: 5 November 1997
Abstract: We define an algebra on the space of BPS states in theories with extended supersymmetry. We show that the algebra of perturbative BPS states in toroidal compactification of the heterotic string is closely related to a generalized Kac–Moody algebra. We use D-brane theory to compare the formulation of RR-charged BPS algebras in type II compactification with the requirements of string/string duality and find that the RR charged BPS states should be regarded as cohomology classes on moduli spaces of coherent sheaves. The equivalence of the algebra of BPS states in heterotic/IIA dual pairs elucidates certain results and conjectures of Nakajima and Gritsenko & Nikulin, on geometrically defined algebras and furthermore suggests nontrivial generalizations of these algebras. In particular, to any Calabi–Yau 3-fold there are two canonically associated algebras exchanged by mirror symmetry.
1. Introduction

String theories and field theories with extended supersymmetry have a distinguished set of states in their Hilbert space known as BPS states. Thanks to supersymmetry, one can make exact statements about these magical states even in the face of all the complexities, perplexities, and uncertainties that plague most attempts to understand nonperturbative Quantum Field Theory and Quantum String Theory. As a result, they have played a special role in the study of strong-weak coupling duality in both field theory and string theory. In this paper we point out that there is a simple, physical, and universal property of BPS states: They form an algebra. There are four reasons the algebra of BPS states is interesting:

1. BPS algebras appear to be infinite-dimensional gauge algebras, typically spontaneously broken down to a finite dimensional unbroken gauge symmetry.
2. Comparing BPS algebras in dual string pairs has important applications in mathematics.
3. The BPS algebras appear to control the threshold corrections in $d = 4$, $N = 2$ string compactification.
4. BPS algebras appear to be intimately related to black hole physics. In particular, the counting of nonperturbative black hole degeneracies seems to be related to generalized Kac–Moody algebras.

We will discuss (1) and (2) in this paper. Item (3) is the subject of several papers [1–5]. Item (4) has been proposed recently in an imaginative paper of Dijkgraaf, Verlinde, and Verlinde [6].

The outline of this paper is as follows. Section Two contains the basic definition of the algebra of BPS states. In Section Three we use the definition to compute the algebra of perturbative BPS states in toroidal compactifications of heterotic string theory and discuss the relation of this algebra to Generalized Kac–Moody (GKM) algebras.$^1$ In the fourth section we turn to an analysis of BPS states in Type II string theory; we discuss the formulation in terms of moduli spaces and argue that sheaves provide the correct language for a general discussion of BPS states. Section 5 develops the sheaf-theoretic interpretation of BPS states in some detail for K3 and $T^4$ compactifications. In Sect. 6 this is extended to the Calabi–Yau case. Section 7 contains a conjectural method for the computation of the algebra of BPS states in Type II string theory. String duality predicts isomorphisms between certain algebras of Type II BPS states and the dual algebra of perturbative heterotic BPS states. In Sects. 8 and 9 we discuss this isomorphism in a certain limit. The final section contains brief conclusions and a discussion of open issues.

2. The Space of BPS States is Always an Algebra

2.1. Definition. The definition of the algebra of BPS states uses very little information and is therefore quite general. We suppose that

1. There are absolutely conserved charges $Q$ and therefore the Hilbert space of asymptotic particle states is graded, $\mathcal{H} = \oplus_Q \mathcal{H}^Q$.
2. In each superselection sector there is a Bogomolnyi bound on the energy:

$$E \ge \| Z(Q) \|, \tag{2.0}$$

where $Z(Q)$ is a central charge and $\| \cdot \|$ is some norm function.

Given the two conditions above we can define the Hilbert space of BPS states $\mathcal{H}_{BPS}$ to be the space of one-particle states saturating (2.0).$^2$ An algebra is simply a vector space with a product and we can define the product

$$R : \mathcal{H}_{BPS} \otimes \mathcal{H}_{BPS} \to \mathcal{H}_{BPS} \tag{2.1}$$

as follows. Take two BPS states $\psi_i$ of charges $Q_i$, $i = 1, 2$. Boost them by momenta $\pm\vec p_*$ to produce a two-body state in the center of mass frame such that the total energy satisfies the BPS bound $E = \| Z(Q_1 + Q_2) \|$. By definition $R(\psi_1 \otimes \psi_2)$ is the orthogonal projection of $\Lambda_{\vec p_*}(\psi_1) \otimes \Lambda_{-\vec p_*}(\psi_2)$ onto $\mathcal{H}_{BPS}^{Q_1+Q_2}$, where $\Lambda_{\vec p}$ is the Lorentz boost. The algebra (2.1) is the central object of study in this paper.

$^1$ In this paper the term GKM is used for something slightly different from the object defined by Borcherds. See note added.
$^2$ If the one-particle state is a bound state at threshold it can be distinguished from a two-particle state satisfying (2.0) by the representation of the supertranslation group.
Remarks.

1. The nature of the charges and the Bogomolnyi bound depend on the context (dimension, number of supersymmetries, global vs. local supersymmetry, etc.). For example, in $d = 4$, $N = 2$ theories with unbroken $U(1)^r$ gauge group, $Q = (n_I, m^I)$ refers to the electric and magnetic charges of the gauge group. The ($N = 2$) supersymmetry algebra in these superselection sectors has a central charge given in terms of symplectic periods $(X^I, F_I)$ by:

$$Z(Q) = n_I X^I + m^I F_I. \tag{2.2}$$

In $N = 2$ supergravity we have $\| Z \|^2 \equiv e^K |Z|^2$. On the other hand, for $N = 4, 8$, $Z(Q)$ is the maximal eigenvalue of the matrix $Z_{ij}$, etc.

2. The value of $\vec p_*^{\,2}$ is fixed in terms of central charges:

$$\vec p_*^{\,2} = \frac{1}{4}\left( \| Z_{12} \|^2 + \frac{(\| Z_1 \|^2 - \| Z_2 \|^2)^2}{\| Z_{12} \|^2} \right) - \frac{1}{2}\left( \| Z_1 \|^2 + \| Z_2 \|^2 \right), \tag{2.3}$$

where $Z_{12} = Z(Q_1) + Z(Q_2)$. It follows from (2.3) that $\vec p_*^{\,2} \le 0$ with equality iff $\| Z_{12} \| = \| Z(Q_1) \| + \| Z(Q_2) \|$. Therefore to implement the above definition we must use analytic continuation in $\vec p$. We will analytically continue in the magnitude, leaving the direction $\hat p$ real. Note that the definition of the algebra in principle depends on the choice of direction $\hat p$.$^3$

2.2. S-matrix interpretation. This definition has a simple S-matrix interpretation: Consider the two-body state of two boosted BPS states $\psi_{1,2}$ with quantum numbers (in the center of mass frame) $(E_i, \pm\vec p; Q_i)$. Consider the scattering process:

$$\psi_1 + \psi_2 \to \mathcal{F}, \tag{2.4}$$

where $\mathcal{F}$ is some final state which is a vector in the superselection sector $Q_1 + Q_2$. The S-matrix for (2.4) has a distinguished pole:

$$S(\psi_1 + \psi_2 \to \mathcal{F}) \sim \frac{\langle \mathcal{F} | R[\psi_1 \otimes \psi_2] \rangle}{s - \| Z(Q_1 + Q_2) \|^2} \tag{2.5}$$

and the residue of the pole defines the product. In the case of massless BPS states or BPS states which are bound states at threshold, care is needed. The algebra should be computed using a limiting procedure, or using techniques such as those employed in [7].

2.3. Relation to topological field theory. The algebra (2.1) may be viewed as a generalization of the chiral algebra of chiral primary fields in massive $d = 2$, $N = 2$ theories [8]. Indeed, given the extended spacetime supersymmetry we may twist the theory [9]. The space $\mathcal{H}_{BPS}$ is the BRST cohomology of a scalar supersymmetry Q. Interpolating operators which create such states should have nonsingular products (modulo Q).

In many theories BPS states are thought to be smoothly connected to extremal black hole states. This suggests an alternative viewpoint on the BPS algebra. One could collide two extremal black holes of charges $Q_1$, $Q_2$ and consider the amplitude for the resulting state to settle down to an extremal black hole of charge $Q_1 + Q_2$. It would be extremely interesting to relate this amplitude to the structure constants of the BPS algebra.

$^3$ We thank R. Dijkgraaf and E. Verlinde for emphasizing this.
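As a quick numerical sanity check on (2.3), the following sketch evaluates $\vec p_*^{\,2}$ and compares it with ordinary two-body kinematics. It assumes the simplest case in which a central charge is a complex number and $\| Z \| = |Z|$; the function names are illustrative and do not come from the paper. Writing $m_i = \| Z_i \|$ and $M = \| Z_{12} \|$, the right-hand side of (2.3) is algebraically equal to $[M^2 - (m_1 + m_2)^2][M^2 - (m_1 - m_2)^2]/(4M^2)$, which is non-positive in the range allowed by the triangle inequality.

```python
import math


def p_star_squared(m1: float, m2: float, m12: float) -> float:
    """Right-hand side of Eq. (2.3) with m1 = ||Z_1||, m2 = ||Z_2||, m12 = ||Z_12||."""
    return 0.25 * (m12**2 + (m1**2 - m2**2) ** 2 / m12**2) - 0.5 * (m1**2 + m2**2)


def cm_energies(m1: float, m2: float, m12: float) -> tuple[float, float]:
    """Energies of the two boosted states when the total energy equals m12."""
    e1 = (m12**2 + m1**2 - m2**2) / (2.0 * m12)
    e2 = (m12**2 + m2**2 - m1**2) / (2.0 * m12)
    return e1, e2


# Assumption: central charges are plain complex numbers and ||Z|| = |Z|.
z1, z2 = 1.0 + 0.0j, 0.0 + 1.0j  # misaligned charges: nonzero binding energy
m1, m2, m12 = abs(z1), abs(z2), abs(z1 + z2)

p2 = p_star_squared(m1, m2, m12)
e1, e2 = cm_energies(m1, m2, m12)

print(p2)                   # negative, so the boost needs analytic continuation
print(e1 + e2 - m12)        # close to 0: total energy saturates the BPS bound
print(e1**2 - m1**2 - p2)   # close to 0: each state stays on its mass shell

# Aligned central charges (bound state at threshold): no boost is needed.
print(p_star_squared(2.0, 3.0, 5.0))
```

The second and third printed values confirm that (2.3) is exactly the momentum for which both states remain on shell while the total energy equals $\| Z(Q_1 + Q_2) \|$.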
3. Example: Toroidally Compactified Heterotic String

The simplest and most elementary example of the algebra of BPS states in string theory is given by heterotic string compactification on $T^d$. This gives the algebra of Dabholkar–Harvey (DH) states. In this case the algebra may be studied (in string perturbation theory) using well-developed vertex operator techniques. The string tree level BPS algebra is closely related to the Gerstenhaber and BV algebras investigated in [10]. The pole in the S-matrix simply comes from the “dot product” of suitably boosted BRST classes:

$$R(V_1 \otimes V_2)(z_2, \bar z_2) \equiv \lim_{z_1 \to z_2} \Lambda_{\vec p_*} V_1(z_1, \bar z_1)\, \Lambda_{-\vec p_*} V_2(z_2, \bar z_2) \mod Q_{BRST}, \tag{3.1}$$

where $\Lambda_{\vec p_*}$ is the action of a Lorentz boost along some fixed direction with magnitude specified by (2.3). The BRST class on the LHS of (3.1) is a class of (left, right) ghost number (2, 2) but, for the models under consideration, ghost numbers 1 and 2 may be identified via:

$$\partial c\, c\, V \to c V \tag{3.2}$$

for $V$ a matter vertex operator.

We now describe the algebra in more detail. BPS states are those in the right-moving supersymmetric ground state [11] and thus the space of BPS states has the form

$$\mathcal{H}_{BPS} = \mathcal{H}^{mult} \otimes \pi, \tag{3.3}$$

where $\pi$ is a massless representation of real dimension $8_{boson} \oplus 8_{fermion}$ of the spacetime supertranslation algebra with $\vec p = 0$. The left-moving operators also carry spin. We will show that those multiplets with no left-moving spin in the uncompactified dimensions, $\mathcal{H}_0^{mult}$, can be given the structure of an infinite dimensional generalized Kac–Moody algebra [12, 13]. The full space of multiplets $\mathcal{H}^{mult}$ carries an interesting algebraic structure, described below.

$\mathcal{H}^{mult}$ is graded by vectors in the Narain lattice:$^4$

$$\mathcal{H}^{mult} = \oplus_{(P_L; P_R) \in \Gamma^{16+d,d}} \mathcal{H}^{mult}_{(P_L; P_R)}. \tag{3.4}$$

Furthermore, we may choose a basis of bosonic states which (almost) factorize between left and right as:

$$V_{I,P,\tilde\zeta} = V^{left}_{P_L, I}(z) \otimes \tilde V^{right}_{P_R, \tilde\zeta}(\bar z)\, \mathcal{C}_P, \tag{3.5}$$

where $\mathcal{C}_P$ is a cocycle factor for the lattice. We will denote right-moving quantities with a tilde. So, $\tilde\zeta^M$ is a polarization vector in 10 dimensional space, $M = 0, \ldots, 9$. While states of string may be associated to an arbitrary vector in the Narain lattice $(P_L; P_R)$, only those charge sectors with $P^2 \ge -2$ can satisfy a Bogomolnyi bound. We thus have:

$$N = \tfrac{1}{2}(P_R^2 - P_L^2) + 1 = \tfrac{1}{2} P^2 + 1 \ge 0, \qquad E^2 - \vec p^{\,2} = P_R^2, \tag{3.6}$$

where $N$ is the oscillator level of $V^{left}$. The index $I$ runs from 1 to $p_{24}(N)$ over a basis of oscillator states. In the basis (3.5) the product of states in $\mathcal{H}_0^{mult}$ takes the form

$$R\left( V_{P_1, I_1, \tilde\zeta_1} \otimes V_{P_2, I_2, \tilde\zeta_2} \right) = \left[ V^{left}_{P_1, I_1}, V^{left}_{P_2, I_2} \right] \otimes \tilde V^{right}_{\tilde\zeta_{12}}, \tag{3.7}$$

$^4$ Signature convention: $\Gamma^{16+d,d}$ has signature $((-1)^{16+d}, (+1)^d)$.
where $\tilde\zeta_{12}$ is a function, given below, of $\tilde\zeta_i$ and $P_i$, and the first factor is a Lie bracket.

To compute the algebra in this basis we first examine the right-moving component in the $(-1)$ picture:

$$\tilde V^{right}_{P_R, \tilde\zeta}(\bar z) = \tilde c(\bar z)\, e^{-\tilde\phi}\, e^{i \tilde k \cdot \tilde x}\, \tilde\zeta \cdot \tilde\psi(\bar z). \tag{3.8}$$

Here $\tilde\phi$ is the superconformal ghost, and $\tilde x^M$, $M = 0, \ldots, 9$, labels all right-moving spacetime coordinates. The right-moving momentum is $\tilde k = (E, \vec p; P_R)$ and the BPS condition requires $\vec p = 0$ and that $\tilde k$ is lightlike:$^5$

$$\tilde k^2 = 0. \tag{3.9}$$

We also have: $\tilde k \cdot \tilde\zeta = 0$ and $\tilde\zeta \sim \tilde\zeta + \alpha \tilde k$. The right-moving part of the product is most easily computed by multiplying operators in the $-1$ and $0$ pictures. The physical boost used in the definition in the previous section is characterized by the requirement that the total spatial momentum vanish and that the new right-moving momentum remains lightlike:

$$\left( \tilde k_1(\vec p_*) + \tilde k_2(-\vec p_*) \right)^2 = 0. \tag{3.10}$$

Under these circumstances the only BRST invariant operator on the right is again a $d = 10$ $U(1)$ SYM multiplet. The multiplicative structure on the multiplet is exactly given by the on-shell three-point vertices of the SYM multiplet. In particular for the three-point vertex of bosons we have:

$$\begin{aligned} \tilde\zeta_{12} = {} & \{\tilde\zeta_1(\vec p_*) \cdot \tilde k_2(-\vec p_*)\}\, \tilde\zeta_2(-\vec p_*) - \{\tilde\zeta_2(-\vec p_*) \cdot \tilde k_1(\vec p_*)\}\, \tilde\zeta_1(\vec p_*) \\ & - \{\tilde\zeta_1(\vec p_*) \cdot \tilde\zeta_2(-\vec p_*)\}\, \tilde k_2(-\vec p_*) \mod \left( \tilde k_1(\vec p_*) + \tilde k_2(-\vec p_*) \right), \end{aligned} \tag{3.11}$$

where $\tilde\zeta(\vec p_*)$ is the boosted polarization tensor.

Now let us turn to the left-moving operators. These have the form:

$$V^{left}_{P_L, I} = c\, e^{i k \cdot x(z)}\, P_I(\partial^* x(z)), \tag{3.12}$$

where $k = (E, \vec p; P_L)$ is the left-moving momentum and the BPS condition states that $\vec p = 0$ and

$$\tfrac{1}{2} k^2 + N = 1. \tag{3.13}$$

In particular, the matter part of the vertex operator is a dimension one primary. In (3.12) $P_I(\partial^* x(z))$ runs over a basis of representatives of the BRST cohomology, so $I = 1, \ldots, p_{24}(N)$. The operator product of the ghost factors in two such boosted states is $c(z_1) c(z_2) = z_{12}\, c \partial c + \cdots$ and therefore we must isolate the simple pole in the OPE of two dimension one primaries. Thus the left-moving matter part of the product state is given by:

$$\oint_{z_2} dz_1\, \Lambda_{\vec p_*} V_{P_1^L, I_1}(z_1)\, \Lambda_{-\vec p_*} V_{P_2^L, I_2}(z_2) \mod Vir^+. \tag{3.14}$$

It is well-known that the pole terms in mutually local dimension one primaries generate a current algebra [14, 15]. (The matter CFT is not unitary and therefore there will be higher poles in the OPE, but these do not contribute to the BRST cohomology.)

$^5$ Our signature convention for the spacetime metric is that $\eta_{00} = -1$. The index 0 refers to time.
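The counting of left-moving states in (3.6) and (3.12) can be made concrete with a short sketch. It assumes that $p_{24}(N)$ denotes the coefficient of $q^N$ in $\prod_{n \ge 1} (1 - q^n)^{-24}$, i.e. the number of partitions of $N$ into parts of 24 colours, and that $P^2$ is even, as it is for vectors of an even lattice. The function names are illustrative only.

```python
def p24_table(n_max: int) -> list[int]:
    """Coefficients of q^0 .. q^n_max in prod_{n>=1} (1 - q^n)^(-24)."""
    coeffs = [1] + [0] * n_max
    for n in range(1, n_max + 1):
        for _ in range(24):
            # Multiply the truncated series by 1/(1 - q^n) in place.
            for m in range(n, n_max + 1):
                coeffs[m] += coeffs[m - n]
    return coeffs


def oscillator_level(p_squared: int) -> int:
    """Level N = P^2/2 + 1 from Eq. (3.6); assumes P^2 is even."""
    return p_squared // 2 + 1


def num_oscillator_states(p_squared: int) -> int:
    """Number of values of the index I in the charge sector with the given P^2."""
    level = oscillator_level(p_squared)
    if level < 0:
        return 0  # P^2 < -2: no state can satisfy the Bogomolnyi bound
    return p24_table(level)[level]


print(p24_table(3))
for p_sq in (-4, -2, 0, 2):
    print(p_sq, oscillator_level(p_sq), num_oscillator_states(p_sq))
```

In this sketch the sectors with $P^2 = -2$ sit at level $N = 0$ and carry a single oscillator state, while sectors with $P^2 < -2$ are empty, in line with the bound $P^2 \ge -2$ quoted above.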
It follows immediately that the product in (3.14) defines a current algebra for the states which are bound states at threshold. For these states we can take $\vec p = 0$. When there is a nontrivial binding energy then $k_1 \cdot k_2$ will not be integer and the simple operator products of DH vertex operators will not define a new DH state. As for the right-movers, the boost solves the problem and

$$k_1(\vec p_*) \cdot k_2(-\vec p_*) = -P_1 \cdot P_2 = N_1 + N_2 - N_{12} - 1, \tag{3.15}$$

where $N_{12}$ is the oscillator level of the product BPS state.

It remains to determine the properties of the product on states requiring a boost. For states in $\mathcal{H}_0^{mult}$ which are left-Lorentz scalars the product again determines a Lie algebra. This can be seen as follows. We choose four left-moving dimension one primaries $\Psi_i$ and consider the correlator

$$\langle \Lambda_1(\Psi_1) \Lambda_2(\Psi_2) \Lambda_3(\Psi_3) \Lambda_4(\Psi_4) \rangle. \tag{3.16}$$

The $\Lambda_i$ are Lorentz boosts determined by the conditions that $(\tilde k_i + \tilde k_j)^2 = 0$ for all pairs $i, j$. In terms of the boosts used to define the product we have

$$\Lambda_i \Psi_i(z_i, \bar z_i)\, \Lambda_j \Psi_j(z_j, \bar z_j) = \Lambda_{ij}\, \Lambda_{\vec p_{ij,*}} \Psi_i\, \Lambda_{\vec p_{ji,*}} \Psi_j, \tag{3.17}$$

where $\Lambda_{ij}$ is an overall Lorentz transformation and $\Lambda_{\vec p_{ij,*}}$ are the special boosts for the pair $\Psi_i$, $\Psi_j$ defined above. We are only interested in the first order poles in (3.16). Thus we multiply (3.16) by $dz_1 \wedge dz_2 \wedge dz_3 \wedge dz_4$ and consider the resulting expression as a DeRham class on $(\mathbb{P}^1)^4 - BD$, where $BD$ stands for the big diagonal where any two points coincide. The idea is that on such a space terms of the form $z^n dz$ are exact except for $n = -1$. In this way we focus on the pole terms. The three point functions are given by

$$\langle \Lambda_1(\Psi_I) \Lambda_2(\Psi_J) \Lambda_3(\Psi_K) \rangle\, dz_1 \wedge dz_2 \wedge dz_3 = f_{IJK}(\hat p_{12})\, \frac{dz_1 \wedge dz_2 \wedge dz_3}{z_{12} z_{23} z_{31}} \mod d(*), \tag{3.18}$$

where we have emphasized that, having chosen a basis of states $\Psi_I$, the structure constants depend on direction. The two-point function defines a positive form on the algebra. Comparing expressions for (3.16) derived from the singularities in $z_1$ with that derived from the singularities in $z_2$ gives a Jacobi-like identity on $f_{IJK}$:

$$f_{12I}(\hat p_{12}) f_{34I}(\hat p_{34}) - f_{13I}(\hat p_{13}) f_{24I}(\hat p_{24}) + f_{14I}(\hat p_{14}) f_{23I}(\hat p_{23}) = 0, \tag{3.19}$$

where the directions are related to each other by the requirement that $\tilde s = \tilde t = \tilde u = 0$. For vertex operators which are scalars under the left-moving noncompact Lorentz group, or for states not requiring a boost, the direction dependence vanishes and (3.19) is the Jacobi identity. In this case the product defines a Lie algebra as a subalgebra of the algebra of BPS states. The general structure on arbitrary BPS states is given by (3.19).

Finally, we must clarify the sense in which $\mathcal{H}_0^{mult}$ is a GKM algebra. We would like to apply the definition of a GKM algebra in Sect. 4 of [13], replacing only the $\mathbb{Z}$-grading by a $II^{d+16,d}$-grading. The positive and negative grading is defined by the Narain vectors $\pm P$ supporting BPS and anti-BPS states, respectively. The subspaces at fixed grading with respect to $II^{d+16,d}$ are finite dimensional. Unfortunately if we convert the $II^{d+16,d}$-grading to a $\mathbb{Z}$-grading the finite-dimensionality of the graded subspaces need
not hold any longer. Thus, the algebras we are discussing are themselves generalizations of generalized Kac–Moody algebras.$^6$

From the construction it is clear that the Lie algebra $\mathcal{H}_0^{mult} \subset \mathcal{H}^{mult}$ satisfies two key properties. It is invariant under $O(d + 16, d; \mathbb{R})$ rotations, and at enhanced symmetry subvarieties the massless subalgebra is the unbroken gauge group of the low energy theory. Thus, the BPS algebra is a kind of universal algebra for toroidal compactification. We regard it as a physically sensible version of the “duality invariant gauge algebra” of [16] and of the “universal gauge algebra” of [17]. Note that it involves only physical on-shell$^7$ states, with positive definite inner product, and no compactification of time. (The role of compactified time has been supplanted by the BPS conditions.) Because of the properties of the BPS algebra described above, and because Narain moduli are Higgs fields, the algebra of BPS states (or at least $\mathcal{H}_0^{mult}$) should be regarded as a spontaneously broken gauge algebra.

Remarks.

1. The above construction is closely related to the technique used in [18]. The boost $\Lambda_{\vec p}$ is similar to the choice of certain kinematic invariants in [18]. In particular the result of that paper can be applied to deduce that the “Ward identities” of the algebra of BPS states completely fix the tree level BPS scattering matrix. In general, a strongly broken gauge symmetry is as useless as having no symmetry at all, but something unusual appears to be happening in the present context. Not only is the S-matrix fixed, but, in $d = 4$, $N = 2$ theories the quantum corrections appear to be closely related to the BPS algebra [1, 2, 4, 5].

2. We have computed the residue using a tree level string calculation. The nonrenormalization theorems of string perturbation theory could possibly be applied here to guarantee that the algebra is unchanged to all orders of perturbation theory.

3. Similar considerations apply to $d = 4$, $N = 2$ compactifications of the heterotic string. The space of DH states now has the form:

$$\mathcal{H}_{BPS} = \mathcal{H}_{vm} \otimes \pi_{vm} \oplus \mathcal{H}_{hm} \otimes \pi_{hm}, \tag{3.20}$$

where $\pi_{vm}$, $\pi_{hm}$ are the massless vectormultiplet and hypermultiplet representations of the supertranslation algebra. Both have real superdimension $4_{boson} \oplus 4_{fermion}$. (We include the supergravity multiplet as a vectormultiplet.) The algebra of states now has a $\mathbb{Z}_2$ grading with vectormultiplets even and hypermultiplets odd.

4. Previous attempts [1] at defining the BPS algebra using vertex operators have used the “left-right swap” according to which we associate a left-moving current

$$J(z) \equiv e^{i p_L X(z) - i p_R \tilde X(z)}\, P_I(\partial^* X) \tag{3.21}$$

to the internal part of the BPS vertex operator, and then use null gauging to remove the right-moving oscillators. We regard the present formulation as a significant improvement on the old one.

5. It would be interesting to understand the system of simple roots for this algebra. The real simple roots will be the states associated with the $P^2 = -2$ vectors. The set of reflections in these roots generates the Weyl group of the BPS algebra, which is thus a subgroup of the T-duality group. This subgroup should be viewed as a gauge group, as in [19, 20].

$^6$ We are grateful to R. Borcherds for an illuminating comment on this point.
$^7$ Although we use analytic continuation in $\vec p$, we always work within the framework of BRST cohomology.
6. A similar construction also applies to the perturbative BPS states of toroidally compactified type II strings. The multiplication $R(\psi_1 \otimes \psi_2)$ can be defined, but we lose the obvious connection to currents since now vertex operators (for medium sized representations) satisfy $\Delta(P) + \tfrac{1}{2} P_L^2 - \tfrac{1}{2} P_R^2 = 0$, so the algebra need not be a GKM algebra. This algebra is of interest because, by U-duality, it also computes the algebra of RR charged BPS states for type II on a torus.

4. Geometrical Realization of BPS States for Type II on Calabi–Yau Manifolds $X_d$

The results of [1] and the previous section exhibit an interesting algebraic structure in the interactions of BPS states in toroidal and K3 compactifications of the heterotic string. Given the duality between the heterotic string on $T^4$ and the IIA string on K3 [21–24] and between the heterotic string on $K3 \times T^2$ and type II theory on K3-fibered Calabi–Yau manifolds [25, 26] one expects to find the same algebraic structure of BPS states in the dual formulation. This and the following sections are devoted to the development of this idea.

We start with type II string theory on $X_d \times \mathbb{R}^{D,1}$, where $D = 9 - 2d$ and $X_d$ is a Calabi–Yau manifold of complex dimension $d$. We will be interested in BPS particle states obtained by wrapping D-branes on cycles in $X_d$.

4.1. The generalized Mukai vector. The charges of the unbroken $U(1)$ gauge symmetries are naturally associated with a vector:

$$Q \in H^*(X_d; \mathbb{Z}), \tag{4.1}$$

where $*$ is even for IIA and odd for IIB strings.$^8$ The reason is that the $U(1)$ gauge fields are obtained by Kaluza–Klein reduction of RR $(p + 1)$-form fields $C^{(p+1)}$ and for each homology $p$-cycle $\Sigma \subset X_d$ we may define a $U(1)$ gauge field: $A_\Sigma \equiv \int_\Sigma C^{(p+1)}$. The charge lattice should have a basis dual to the basis of gauge fields, hence (4.1). The physical interpretation of $Q$ depends on dimension. For $D = 5$ the particles can only have electric charge; then $Q \in H^*(K3; \mathbb{Z}) \cong \Gamma^{20,4}$ is an electric charge vector. For $D = 3$, $Q$ is a vector of electric and magnetic charges.

In many cases the space of BPS states with a fixed charge can be defined in terms of the cohomology of the moduli space of instantons [27, 28]. Let us suppose that there are $r$ wrapped $2d$-branes on $X_d$.$^9$ The low energy dynamics on the D-brane is governed by maximally symmetric SYM theory in $X_d \times \mathbb{R}$. The Chan–Paton spaces form a vector bundle $E \to X_d$ and the ten-dimensional gauge field $A_M$, $M = 0, \ldots, 9$, on $E$ becomes a gauge field $A_\mu$, $\mu = 0, \ldots, 2d$, and a Higgs field $\Phi_i$, $i = 1, \ldots, D$, which are $r \times r$ antihermitian matrices. In general the Chan–Paton vector bundle $E$ is a twisted vector bundle. Indeed, the RR charge vector $Q$ and the characteristic classes of the bundle are related by the important formula:

$$\begin{aligned} Q = v(E) & \equiv ch(E) \sqrt{\hat A(X_d)} \\ & = ch(E) \sqrt{Td(X_d)} \in H^{2*}(X_d; \tfrac{1}{N} \mathbb{Z}). \end{aligned} \tag{4.2}$$

$^8$ We assume for simplicity that there is no torsion. Otherwise we mod out by the torsion. We will also find a possibility for fractional charges below, so $\mathbb{Z}$ should be replaced by $\tfrac{1}{N} \mathbb{Z}$, where $N$ depends on the manifold $X_d$.
$^9$ Here we are considering the IIA theory.
The second line follows since $X_d$ is Calabi–Yau. The integer $N$ depends on $X_d$ and is one for $d = 2$, is a divisor of 24 for $d = 3$, and so on. The Chern character should be regarded as $ch(E) = \mathrm{Tr} \exp\left[ \tfrac{1}{2\pi}(F - B) \right]$, where $F$ is the field strength of the gauge field on the brane and $B$ is the bulk NS-NS anti-symmetric tensor field [27]. The expression (4.2) gives the various brane-charges associated to a gauge field configuration via $Q = (r_{2d}, r_{2d-2}, \ldots, r_0) \in H^0 \oplus H^2 \oplus \cdots \oplus H^{2d}$. We will refer to the vector $v(E)$ as a “generalized Mukai vector.”

Remark. Various pieces of this formula appeared in [29–32] and the final form was derived in [1] using an anomaly inflow argument. The vector [33] has also appeared in the mathematics literature in the work of Mukai for the case $d = 2$ [34]. The origin of the vector in [34] is the Riemann-Roch-Grothendieck formula and is closely related to the derivation of [33].

4.2. Attempt at a precise formulation of the space of BPS states. In this section we will attempt to give a precise formulation of the space of BPS states associated with $r$ wrapped D-branes on $X_d$. Supersymmetric Yang–Mills does not have nontrivial infrared dynamics in $\mathbb{R}^{2d} \times \mathbb{R}$ for $d \ge 2$. Thus the classical and quantum moduli spaces must coincide. The situation can be different for SYM on $X_d \times \mathbb{R}$ since $X_d$ is compact. However, due to the topological nature of BPS states mentioned in Sect. 2.3, the space of BPS states should be independent of the volume $V$ of $X_d$. This motivates our first assumption:

Assumption A. We may describe the BPS states by the supersymmetric ground states of a supersymmetric quantum mechanics with target space given by the moduli of supersymmetric field configurations.

Note that we have made an important change of limits, exchanging the large volume and low energy limits. The moduli space of classical ground states is the moduli space of solutions to the “generalized Hitchin system” defined by the equations:

$$\delta\chi = \Gamma^{MN} \epsilon_1 F_{MN} + \mathbf{1}\, \epsilon_2 = 0 \tag{4.3}$$

for some pair of covariantly constant spinors $\epsilon_i$ on $\mathbb{R}^D \times X_d$. Here $\chi$, $F_{MN}$ are the gaugino and field strength respectively, taking values in the Lie algebra $u(r)$. The second term lives in the $u(1)$ subalgebra. In $N = 4$ SYM with gauge group $U(1)$ the scalar fields can be viewed as the Nambu-Goldstone modes related to translations transverse to the D-brane; the non-linearly realized supersymmetry transformation involving $\epsilon_2$ is the superpartner of these translations.$^{10}$

We would now like to simplify Eqs. (4.3). Since the gauge field $A_M$ is a massless field representing excitations of open strings with Dirichlet boundary conditions we are motivated to make

Assumption B. The fields $(A_\mu, \Phi_i)$ in (4.3) are functions only of the coordinates $x^\mu$ on $X_d$, and not functions of the coordinates of $\mathbb{R}^D$.

We will now argue that, while assumptions A and B are reasonable, and approximately correct, in fact they are incompatible with string/string duality.

Let us now examine more closely the consequences of Assumptions A and B. We can choose our covariantly constant spinors to be of the form $\epsilon \otimes \eta$, where $\eta$ is a constant

$^{10}$ The presence of the second term was noticed in conversations with J. Polchinski and A. Strominger.
spinor on $\mathbb{R}^D$ and $\epsilon$ is covariantly constant on $X_d$ in the Calabi–Yau metric. We can normalize $\epsilon_1$ so that the Kähler form is

$$\omega_{MN} = \epsilon_1^\dagger \Gamma_{MN} \epsilon_1. \tag{4.4}$$

Expanding, we find three terms which must separately vanish. In the part depending on $\Gamma^{\mu\nu}$, $\epsilon_2$ is chosen so that the equations for the Yang–Mills field are: $F$ is type $(1, 1)$ and $\omega^{d-1} \wedge F = \lambda \omega^d$, where $\omega$ is the Kähler form and $\lambda$ is a constant. Explicitly, $\lambda = \int_X \omega^{d-1} c_1 / \int_X \omega^d$. The resulting “generalized Hitchin equations” are

$$F \in \Omega^{1,1}(X_d), \tag{5a}$$
$$\omega^{d-1} \wedge F = \lambda \omega^d, \tag{5b}$$
$$D_\mu \Phi_i = 0, \tag{5c}$$
$$[\Phi_i, \Phi_j] = 0. \tag{5d}$$

Equations (5a, b) are the Hermitian Yang–Mills equations. The $\Phi_i$ represent the normal motions of the $2d$-brane. Moreover, from (5c) we see that if $\Phi_i$ is nondiagonal then the vector bundle on $X_d$ must in general be reducible. For a fixed Mukai vector $Q = v(E)$ we will define $\mathcal{M}_0(Q)$ to be the moduli space of solutions of (5a, b, c, d) modulo the $U(r)$ gauge group.

We expect that the moduli space $\mathcal{M}_0(Q)$ will be rather singular. But we expect it to be a stratified space with smooth strata. Note from (5d) that $\mathcal{M}_0(Q)$ has a natural projection to a configuration space of points:

$$\pi : \mathcal{M}_0(Q) \to S^r(\mathbb{R}^D) \tag{4.6}$$

given by the eigenvalues of the (simultaneously diagonalizable) $\Phi_i$:

$$\pi[(A_\mu, \Phi_i)] \to \{a_i^{(1)}, \ldots, a_i^{(r)}\}, \qquad \Phi_i \sim \mathrm{Diag}\{a_i^{(1)}, \ldots, a_i^{(r)}\}. \tag{4.7}$$

These give the positions of the $r$ wrapped branes, so (4.6) provides a partial stratification of $\mathcal{M}_0(Q)$ by $S^r(\mathbb{R}^D) = \amalg_\nu S^r_\nu(\mathbb{R}^D)$, where $\nu$ labels partitions of $r$. Finally, from (5a, b) we see that over the “small diagonal” $\Delta^{(r)} \subset S^r(\mathbb{R}^D)$, where all points coincide, we have a moduli space of instantons. We define:

$$\mathcal{M}(Q) \equiv \pi^{-1}(p) \tag{4.8}$$

for $p \in \Delta^{(r)}$ (the fiber does not depend on $p$).

A crucial point is that the space $\mathcal{M}(Q)$ includes the reducible connections. Thus, $\mathcal{M}(Q)$ will itself be a stratified singular space with singular strata corresponding to the loci of reducible connections. Roughly speaking, the reducible connections are the connections for which the gauge field can be made block diagonal:

$$A = \begin{pmatrix} A^{(1)} & 0 \\ 0 & A^{(2)} \end{pmatrix}. \tag{4.9}$$

This will happen when we can split the Chan–Paton bundle as: $E_3 = E_1 \oplus E_2$. More technically, the holonomy group of the connection should have trivial normalizer (in the adjoint group). On the reducible locus the moduli space is roughly a product of smaller moduli spaces.
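The charge bookkeeping behind a split $E_3 = E_1 \oplus E_2$ can be illustrated for $X_2 = K3$. The sketch below assumes the standard facts that $Td(K3) = 1 + c_2(K3)/12$ with $\int c_2 = 24$, so that $\sqrt{Td(K3)} = 1 + [\mathrm{pt}]$ and (4.2) reduces to $v(E) = (r, c_1, ch_2 + r)$. For simplicity $c_1$ is restricted to a hyperbolic plane $U \subset H^2(K3; \mathbb{Z})$ with Gram matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, and the pairing uses one common sign convention. The class and function names are illustrative and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MukaiVector:
    r: int                 # H^0 component: rank, i.e. 4-brane charge
    c1: tuple[int, int]    # H^2 component, coordinates in a hyperbolic plane U
    s: int                 # H^4 component: ch_2 + r

    def __add__(self, other: "MukaiVector") -> "MukaiVector":
        return MukaiVector(
            self.r + other.r,
            (self.c1[0] + other.c1[0], self.c1[1] + other.c1[1]),
            self.s + other.s,
        )


def mukai_vector(rank: int, c1: tuple[int, int], ch2: int) -> MukaiVector:
    """v(E) = ch(E) * sqrt(Td(K3)), using sqrt(Td(K3)) = 1 + [point]."""
    return MukaiVector(rank, c1, ch2 + rank)


def pairing(v: MukaiVector, w: MukaiVector) -> int:
    """Mukai pairing in the convention c1.c1' - r s' - s r'."""
    c1_dot = v.c1[0] * w.c1[1] + v.c1[1] * w.c1[0]  # Gram matrix of U
    return c1_dot - v.r * w.s - v.s * w.r


# Two line bundles with c1^2 = 0, hence ch_2 = c1^2 / 2 = 0.
v1 = mukai_vector(1, (1, 0), 0)
v2 = mukai_vector(1, (0, 1), 0)
# Their direct sum: rank, c1 and ch_2 all add because ch is additive.
v3 = mukai_vector(2, (1, 1), 0)

print(v3 == v1 + v2)          # charges add on the reducible locus
print(pairing(v1, v1), pairing(v3, v3))
```

Because the Chern character is additive under direct sums, the charge of a reducible configuration is the sum of the charges of its blocks, which is why wave functions supported on the reducible locus describe multi-particle rather than bound states.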
Let us now consider the BPS states. These should correspond to harmonic $L^2$ forms on $\mathcal{M}_0(Q)$. Since we are interested in bound states the forms should have support on the stratum over $\Delta^{(r)}$. Moreover, we do not want wave functions with support on the reducible locus: such states will be products of wave functions already accounted for at smaller charges and do not correspond to bound states, but rather to two (or more)-particle states. For this reason we expect that the bound states will be associated with the cohomology of the subspace $\mathcal{M}^{irred}(Q) \subset \mathcal{M}(Q)$ of irreducible connections. With this motivation we therefore adopt the preliminary definition:

$$\mathcal{H}^Q_{BPS} \stackrel{?}{=} H^*(\mathcal{M}^{irred}(Q)) \tag{4.10}$$

at least when $r_{2d} > 0$. Since $\mathcal{M}^{irred}(Q)$ is noncompact we should specify carefully the notion of cohomology. The moduli space has a natural metric, and the physically correct notion of cohomology would appear to be $L^2$ cohomology.$^{11}$ In fact, as we will see below, Eq. (4.10) is incompatible with string/string duality. One way to fix it is discussed in the next section.

Remarks.

1. If we have a wrapped $2d$ brane with a nontrivial normal bundle, then, as remarked in [28], the scalars $\Phi_i$ take values in the normal bundle and we get a topologically twisted SYM. In the case of a 2-brane wrapping a holomorphic curve $\Sigma$ in a K3 surface, Eqs. (5) become the Hitchin equations on $\Sigma$, after discarding the scalars $\Phi$ corresponding to noncompact directions [28]. This is the reason for the terminology “generalized Hitchin equations.” Note that the remaining $\Phi_i$ represent normal motions of the brane within a compact space, and hence the Hitchin space must be compactified.

2. Two parallel D-branes of spatial dimensions $p$, $p'$ appear to break supersymmetry unless $p = p' \bmod 4$ [35]. This raises a paradox: How then can there be $(0, 2, 4, \ldots)$ bound states? One resolution of this paradox was explained in [35]. When binding a $(p-2)$-brane in a $p$-brane the $(p-2)$-brane can decay via the nucleation of magnetic objects. This decay is allowed by the “Chern-Simons couplings” which lead to the formula (4.2). Another description of the same phenomenon can be given in terms of the effective field theory on the $(p-2)$-brane [36]. (We learned this during discussions with J. Polchinski and A. Strominger.) Naively there is a negative vacuum energy for parallel $p$ and $(p-2)$-branes. However, the theory develops an FI term and a hypermultiplet field condenses in such a way as to maintain supersymmetry.

3. The above description of BPS states differs markedly from the picture discussed in [37]. It is likely that the two descriptions are valid in different regimes and that the present one is only valid in the limit $g^{II}_{string} \to 0$. It is quite important to understand this point more clearly.

4.3. Chan–Paton sheaves. Let us return to the proposal (4.10) for the space of BPS states. It is easy to see that this is incompatible with string duality. Consider, for example, $X_2 = K3$, for which IIA is dual to the heterotic theory on $T^4$. Consider a state with four-brane charge one and large 0-brane charge. There are many heterotic states with these charges, but there are no such $U(1)$ instantons.$^{12}$ There are similar problems for other RR charges.

$^{11}$ However, G. Segal suggests that relative cohomology with respect to the reducible locus might be more appropriate.
$^{12}$ This paradox is very similar to a problem which was addressed in [38], Sect. 5.3, and the resolution is the same: one must generalize from vector bundles to sheaves.
These problems can be resolved if we take a compactification of $\mathcal{M}^{irred}(Q)$ provided by moduli spaces of sheaves. That is: to maintain string duality we should replace Chan–Paton bundles by Chan–Paton sheaves. This procedure restores string duality and has other advantages, described below. The generalization from a twisted Chan–Paton vector bundle E to a Chan–Paton sheaf is extremely natural in D-brane theory. Intuitively, a sheaf is very much like a vector bundle except that the dimension of the fiber can change discontinuously. In particular, the fibers can be the zero-dimensional vector space everywhere except at a point (“skyscraper sheaves”) or on a curve. This picture coincides nicely with the physical picture of the Chan–Paton vector spaces associated to 0-branes and wrapped 2-branes, respectively. We will not go very deeply into sheaf theory in this paper. Almost everything that one needs to know can be found in Sects. 0.3 and 5.3 of [39]. The introduction of Chan–Paton sheaves also provides a nice compactification of the moduli space of instantons. We assume here that $X_d$ is algebraic and hence, by the Donaldson-Uhlenbeck-Yau theorem, the moduli space of irreducible connections can be identified with the moduli of stable holomorphic vector bundles on $X_d$. In algebraic geometry the natural compactification of the moduli of holomorphic bundles is provided by a certain moduli space of sheaves. There is another advantage to the sheaf viewpoint. When there are no 2d-branes wrapping $X_d$, but there are branes wrapping submanifolds of $X_d$, we cannot use SYM theory on $X_d$. The advantage of the sheaf viewpoint is that in the IIA theory, the entire set of (0, 2, 4, . . . ) bound states can be discussed within a single framework.
Remarks.
1. An important open problem is to derive the sheaf viewpoint from first principles. We believe this can be done by studying the string field theory of the DN sector states in D-brane theory.
2. The necessity to include sheaves has already been remarked in another investigation into D-brane moduli space [40] where it was noted that the full equivalence of instantons with bound states of branes requires the generalization of vector bundles to sheaves. We have recently learned that the idea that the natural setting for D-branes is in the category of coherent sheaves has also been advocated by R. Dijkgraaf, M. Kontsevich [41], D. Morrison [42], and G. Segal [43]. Sheaves have also recently played an important role in (0, 2) Calabi–Yau compactifications [44].
5. Example: Sheaf-Theoretic Interpretation of (0, 2, 4) Bound States on an Algebraic K3 or Abelian Surface
Suppose the compactification manifold $X_2$ is an algebraic K3 surface or abelian variety ($T^4$).13 In this section we will advocate that, in order to have string/string duality, BPS states should be formulated as cohomology classes on the moduli space of so-called “coherent simple semistable sheaves”:
$$\mathcal{H}^{Q}_{BPS} = H^{*}(\mathcal{M}^{spl}(Q)).$$ (5.1)
Let us briefly indicate why such animals should be relevant to D-brane physics. First, “coherent” essentially means that the sheaf fits in an exact sequence $F \to G \to E \to 0$, where F, G are “locally free”, that is, sheaves of sections of a holomorphic vector bundle. We will interpret this later as the condition that the 0-brane and 2-brane states can participate in interactions with wrapped 4-brane states. Second, the adjective “simple” means that the sheaf has no nontrivial automorphisms14 and is the analog of the requirement that the corresponding gauge field be irreducible. Indeed, note that the direct sum of two sheaves carries a nontrivial automorphism obtained by scaling the sections of just one summand. The criterion of simplicity is required if we are to count BPS bound states, and not 2-particle states at zero relative momentum. Unfortunately, there are a few exceptional cases, involving states with zero 4-brane charge, where simplicity is not the correct criterion. Nevertheless, as described below, these states still admit a sheaf description.15 Finally, “semistable” means that for any subsheaf $0 \to F \to E$ we have an inequality on the “slopes” $\mu(F)$, $\mu(E)$:
$$\mu(F) \equiv \frac{\int \omega^{d-1} c_1(F)}{ch_0(F)} \le \frac{\int \omega^{d-1} c_1(E)}{ch_0(E)} \equiv \mu(E)$$ (5.2)
(we assume $ch_0 > 0$ for simplicity). The physical interpretation of this is much less evident, but it is required for a nice16 moduli space. The condition (5.2) will play a role in the geometrical formulation of the BPS algebra below and therefore has some physical sense.
13 We will need the technical assumption that these surfaces are “polarized”, which means that we choose an embedding of the surface into projective space. In particular, we assume the Kähler class $[\omega]$ is the restriction of the hyperplane class.
Let us work out the RR charge vector for $X_2$ a K3 surface or abelian variety ($T^4$). Then $p_1 = -48$ or $p_1 = 0$, respectively, and the BPS states have electric charge vector given by:
$$Q = v(E) = (rk(E), c_1(E), \epsilon\, rk(E) + ch_2(E)) \in H^0(X; Z) \oplus H^2(X; Z) \oplus H^4(X; Z),$$ (5.3)
where $\epsilon = 1, 0$ for K3, $T^4$. Note that with our conventions $ch_2(E) < 0$ for a bundle admitting an ASD connection. The inner product on a vector $v = (r, c_1, r - \ell)$ is:
$$v^2 = (r, c_1, r - \ell)^2 = c_1^2 - 2r(r - \ell),$$ (5.4)
where $c_1^2$ is the inner product on $H^2(X, Z) \cong \Gamma^{19,3}$ or $H^2(X, Z) \cong \Gamma^{3,3}$ for K3 or $T^4$ respectively. For $X_2 = K3$ we need only interpret states for $v^2 \ge -2$. These decompose into BPS and anti-BPS states. BPS states have $r > 0$, or $r = 0$, $c_1 > 0$, or $r = c_1 = 0$, $\ell > 0$. A theorem of Mukai [34] shows that the space of coherent simple semistable sheaves with Chern classes specified by Q is smooth and compact and has dimension:
$$\dim_R \mathcal{M}^{spl}(Q) = 4(\tfrac{1}{2} Q^2 + 1).$$ (5.5)
This is consistent with string duality. We will now describe the sheaf-theoretic interpretation of the various states on a case-by-case basis.
14 Sheaves always carry a trivial automorphism obtained by simply scaling all sections by a constant factor.
15 The fact that the sheaves in these exceptional cases are not simple was pointed out to us by D. Morrison.
16 e.g., Hausdorff
5.1. $Q = (r; c_1; L)$, $r > 1$. When $r > 1$ there is an open dense set in $\mathcal{M}(Q)$ consisting of the moduli space of holomorphic vector bundles (equivalently, the moduli of irreducible instantons). The moduli space is compactified by adding semistable sheaves. In the physics literature it is sometimes stated that $\mathcal{M}(Q)$ is the N-fold symmetric product $S^N X_2$ for $N = \tfrac{1}{2} Q^2 + 1$, or, more accurately, its hyperkähler resolution by the “Hilbert scheme of points”
$$\mathcal{M}(Q) \stackrel{?}{=} X_2^{[N]},$$ (5.6)
and this is quoted as providing evidence for string/string duality. Equation (5.6) is certainly true at the level of dimensions, by Mukai’s theorem, and is known to be true at the level of Hodge numbers for some cases [45, 46] provided we use $\mathcal{M}^{spl}(Q)$. It is false at the level of complex structures; see [47] for a counterexample. As more detailed questions about the nature of BPS states are addressed, the exact nature of the relation of these spaces will become more important.
5.2. $Q = (1; 0; 1 - \ell)$. From Mukai’s theorem $\dim \mathcal{M}^{spl}(Q) = 4\ell$. Indeed, for this case it is known that [34]:
$$\mathcal{M}^{spl}(Q) \cong X^{[\ell]}.$$ (5.7)
The isomorphism is explained in the appendix. More generally, we should modify the above by twisting by a nontrivial line bundle on X to get charge vector $(1, c_1, 1 - \ell)$. This is covered by Mukai’s theorem, of course. Note that (5.7) resolves the glaring discrepancy with string duality noted in Sect. 4.3.
5.3. $Q = (0; ch_1; ch_2)$. Mukai’s theorem includes sheaves with $r = 0$ and support on a curve. These are fairly strange objects from the point of view of a Yang–Mills theory on $X_2$. Note that the curve must be irreducible because we are only interested in bound states. This will be guaranteed by the condition that the sheaf be simple. Suppose n D-branes wrap a holomorphically embedded curve $\iota : \Sigma \to X_2$. We then have a rank n Chan–Paton vector bundle $E \to \Sigma$. The corresponding sheaf on $X_2$ is
$$\mathcal{E} = \iota_*(E).$$ (5.8)
The Chern characters of the sheaf $\mathcal{E} = \iota_*(E)$ are easily computed from the Riemann-Roch-Grothendieck (RRG) theorem:
$$ch(\mathcal{E})\, Td(TX_2) = \iota_*(ch(E)\, Td(T\Sigma)).$$ (5.9)
Expanding this out we obtain the Mukai vector:
$$v(\mathcal{E}) = (0, n[\Sigma], \deg(E) - n \tfrac{1}{2} \Sigma \cdot \Sigma),$$ (5.10)
where $\deg E = \int_\Sigma ch_1(E)$. The 2-brane states should be associated to the cohomology $H^*(\mathcal{M}^{spl}(v(\mathcal{E})))$. Roughly, $\mathcal{M}^{spl}(v(\mathcal{E}))$ is the moduli of pairs consisting of a holomorphic curve in a linear system, $C \subset |n[\Sigma]|$, together with a rank n vector bundle $E \to C$ of fixed degree.17 In [28] the space of states associated with wrapped two-branes was characterized in terms of the cohomology of Hitchin moduli space. We need a unified description of the BPS states in order to describe the BPS algebras, so we prefer the sheaf description. As noted in [28], a paper of Donagi et al. [48] shows that there is a close relation between Mukai’s moduli space and Hitchin’s.
17 There are some exceptional cases where (5.10) does not define a simple sheaf, e.g., $v(\mathcal{E}) = (0, L[E], 0)$ where E is elliptic. In these cases there is still a sheaf description, similar to that described in the next subsection.
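The expansion of (5.9) leading to (5.10) is short enough to write out. What follows is a sketch under stated assumptions, not the authors' own computation: $g$ (the genus of $\Sigma$), $[\mathrm{pt}]$ (the class of a point) and $\epsilon$ ($1$ for K3, $0$ for $T^4$) are notation introduced here, and $\Sigma$ is assumed smooth, so that adjunction on a surface with trivial canonical class gives $2g - 2 = \Sigma \cdot \Sigma$.

```latex
% Sketch only: g, [pt] and \epsilon are illustrative notation, not from the text.
\begin{align*}
\mathrm{ch}(E)\,\mathrm{Td}(T\Sigma)
  &= \bigl(n + c_1(E)\bigr)\bigl(1 + \tfrac{1}{2} c_1(T\Sigma)\bigr)
   = n + c_1(E) + \tfrac{n}{2}\, c_1(T\Sigma), \\
\iota_*\bigl(\mathrm{ch}(E)\,\mathrm{Td}(T\Sigma)\bigr)
  &= n[\Sigma] + \bigl(\deg E + n(1-g)\bigr)[\mathrm{pt}], \\
\mathrm{Td}(TX_2)
  &= 1 + 2\epsilon\,[\mathrm{pt}]
  \quad\Longrightarrow\quad
  \mathrm{ch}(\mathcal{E})\,\mathrm{Td}(TX_2) = \mathrm{ch}(\mathcal{E})
  \quad \text{(since } \mathrm{ch}_0(\mathcal{E}) = 0\text{)}, \\
v(\mathcal{E})
  &= \bigl(0,\ n[\Sigma],\ \deg E + n(1-g)\bigr)
   = \bigl(0,\ n[\Sigma],\ \deg E - \tfrac{n}{2}\,\Sigma\cdot\Sigma\bigr).
\end{align*}
```

As a consistency check in the K3 case, the pairing (5.4) gives $v^2 = n^2\, \Sigma \cdot \Sigma = n^2 (2g - 2)$, so Mukai's formula yields real dimension $4(n^2 (g-1) + 1)$. For $n = 1$ this is complex dimension $2g$, which matches $g$ parameters for the curve moving in its linear system plus $g$ for the line bundle on it.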
5.4. $Q = (0; \vec{0}; -L)$. Describing zerobranes of charge L turns out to be the most subtle case. They must correspond to sheaves of length L but concentrated at one point.18 At first sight string/string duality suggests that the proper moduli space is the small diagonal: $\Delta^{(L)} \subset S^L X$. However, for the sheaf interpretation we must modify this slightly. Let
$$\pi : X^{[L]} \to S^L X$$ (5.11)
be the Hilbert scheme of points resolving the symmetric product. Bound states will only form when the 0-branes are at the same point in spacetime, so we expect the 0-brane charge $-L < 0$ states to be represented by cohomology classes in
$$\pi^{-1}(\Delta^{(L)}) = (X^{[L]})_L \equiv \Xi_L.$$ (5.12)
There are very natural homology cycles in $(X^{[L]})_L$. Let $\Sigma$ be a cycle in X. Consider the cycle:
$$\tilde{\Sigma}_L \equiv \{S \in \Xi_L : \mathrm{Supp}(S) = P \in \Sigma\}.$$ (5.13)
This allows us to map a homology class $[\Sigma] \in H_*(X)$ to a class $[\tilde{\Sigma}_L] \in H_*(\Xi_L)$. The dual cohomology classes to these homology classes in $\Xi_L$ are the correct differential forms to associate with pure zerobranes of charge $-L$.19 It should be noted that none of the sheaves in (5.12) are simple, since they all have nontrivial automorphisms, given by arbitrary $GL(L, C)$ rotations of the fiber above a point.
18 This means that the fiber above the point is an L-dimensional vector space, in accord with the relation of $U(L)$ SYM to a charge L 0-brane.
19 The space $\Xi_L$ is topologically complicated. For example, the fiber has lots of cohomology: $b_{2i}(\pi^{-1}(L[P])) = p(L, L - i)$ [49, 50]. Nevertheless, there are distinguished cohomology classes on $\Xi_L$ associated to multiplying the top degree cycle of the fiber with the homology of the base. We thank G. Segal for some very helpful remarks on this point.
5.5. Summary. In conclusion we have shown that the space of D-brane states can be interpreted as a space of cohomology classes on the moduli space of coherent simple sheaves on X if the 2- or 4-brane charge is positive. For some exceptional 2-brane configurations and for pure 0-branes, the states are a set of distinguished cohomology classes in a moduli space of sheaves supported on a curve and on a point, respectively.
6. Calabi–Yau Compactification
If we compactify the IIA string on a Calabi–Yau threefold $X_3$ then the RR charged BPS states will be (0, 2, 4, 6) bound states. The RR charge vector is connected to the characteristic classes of the sheaves via:
$$Q = (ch_0,\ ch_1,\ ch_2 - \frac{p_1}{48} ch_0,\ ch_3 - \frac{p_1}{48} ch_1).$$ (6.1)
Q is now interpreted as a vector of electric and magnetic charges. The shift by $-\frac{p_1}{48}$ is a geometric version of the Witten effect. Indeed, choosing an electric/magnetic polarization so that $H^0 \oplus H^2$ is the lattice of magnetic charges we observe a shift in the electric vector: $q_e \to q_e - \frac{p_1}{48} q_m$. In particular, although the shift induces fractional D-brane charges, it does not violate the Dirac quantization condition.20 Once again we expect
$$\mathcal{H}^{Q}_{BPS} = H^{*}(\mathcal{M}^{spl}(Q))$$ (6.2)
although much less evidence is available to test this proposal.
20 We thank M. Douglas and E. Witten for a discussion on this point.
6.1. Special features of K3 fibrations. In general very little is known about the BPS algebras associated to general Calabi–Yau manifolds. However, thanks to string/string duality it is possible to make some nontrivial statements about the algebra in the case that there is a heterotic dual. It is now understood that heterotic/IIA duality is intimately connected with K3 fibrations [51–53]. Let us consider therefore a Calabi–Yau 3-fold $X_3 \to P^1$.
(6.3)
We denote the K3 fiber over $z \in P^1$ by $K_z$. In all known examples it has Picard number $\ge 1$ on the heterotic side, so $K_z$ is always algebraic. There is a subset of states which are of special relevance in string/string duality. These are the BPS states which have finite mass in the heterotic weak coupling limit:
$$\mathrm{Im}(t_s) = \int_{P^1} \omega \to \infty.$$ (6.4)
These will be (0, 2, 4) bound states based on supersymmetric cycles which only wrap in the fiber $K_z$. We will refer to these as the “fiber bound states.” Let us determine the lattice of charges and RR charge vectors of the fiber bound states. Consider first the supersymmetric 2-cycles in the fiber. These must be holomorphic curves in the CY $X_3$. Therefore, the supersymmetric 2-cycles with finite mass in the limit (6.4) must be holomorphic curves in the K3 in the complex structure of $K_z$. The elementary 2-branes will be labelled by vectors $r \in \Delta_{ir} \subset H^2(K_z; Z)$ of classes dual to irreducible holomorphic curves. Note that $r^2 = 2g - 2 \ge -2$ determines the genus of the curve. Multiply wrapped branes with 2-brane charge $N r$ can produce new bound states. Therefore, the set of 2-brane charges is contained in the NEF cone $NEF(K_z)$ and correspondingly, the lattice of 2-brane charges is a sublattice of the Picard lattice $Pic(K_z)$. (For a clear discussion of these concepts in the physics literature see [53, 54].) In fact, the Picard lattice can undergo monodromy when circling a singular fiber. Thus, in general, we expect the lattice of 2-brane charges to be $Pic(K_z)^{invt}$ [54]. To obtain the full lattice of charges we recall that there are 2 more gauge fields from wrapping $C^{(5)}$ on the K3 surface and from $C^{(1)}$, giving another lattice $\Gamma^{1,1}$ for $H^0(K_z; Z) \oplus H^4(K_z; Z)$. Note that rather than use the Kähler class of the $P^1$ base we have used its magnetic dual corresponding to wrapping $C^{(5)}$ on the K3 fiber. Thus we are not using the standard CY polarization $H^0 \oplus H^2$ for the magnetic charges.21 The total lattice of charges is
$$(r_4, r_2, r_0) \in H^0(K_z; Z) \oplus Pic(K_z)^{invt} \oplus H^4(K_z; Z).$$ (6.5)
This lattice embeds into the full lattice $H^*(X_3)$. The moduli space of sheaves $\mathcal{M}^{spl}(Q)$ should be regarded as the moduli of sheaves in the full Calabi–Yau with specified Chern classes. We may now compare with the predictions of string/string duality. On the heterotic side the lattice of charges is the lattice of vectormultiplet charges in a heterotic dual theory. Here the gauge instanton breaks $E_8 \times E_8$ to a rank s subgroup leaving a Narain moduli space based on a lattice $\Gamma^{s+2,2}$. By general results $Pic(K_z)$ is a lattice of signature $[(-1)^n, (+1)^1]$. We identify $Pic(K_z) = \Gamma^{s+1,1} \subset \Gamma^{s+2,2}$. By comparing this to the Type II picture and using string duality we see that the 2-brane charges must in fact fill the entire (monodromy invariant) NEF cone. Moreover, the perturbative BPS states in the heterotic description (which correspond to (6.4)) are easily described in terms of vertex operators, and are counted by the elliptic genus of certain vector bundles on the heterotic K3 surface [1].
21 And since the intersection form on $H^{2*}(X_3)$ is symmetric it is a little mysterious why, on a priori grounds, such distinct polarizations should be related by electro-magnetic duality.
Remarks.
1. It is interesting to contrast (6.5) with the lattice of 2-brane charges in K3 compactification. In K3 compactification we do not count holomorphic curves in K3, but, rather, curves which are holomorphic in some complex structure compatible with the fixed hyperkähler structure. The appropriate counting function is $1/\eta^{24}$ [28]. By contrast, the set of holomorphic curves in a fixed family of complex structures is a more subtle object. For example, in [1] the counting function for the K3 family $x_1^2 + x_2^3 + x_3^7 + x_4^{42} = 0$ was identified as $E_6/\eta^{24}$.
2. A related point is that the BPS spectrum is chaotic in the sense that it changes discontinuously on a dense subset of hypermultiplet moduli space [1]. This is most easily seen in the heterotic dual, where it corresponds to discontinuity as a function of the hypermultiplets describing the heterotic K3 moduli. Translating this to the type IIA side we see that the BPS spectrum is highly chaotic as a function of the complex structure. This makes sense: a small perturbation of the complex structure changes wildly the allowed holomorphic curves in the K3 surface (generically there are none; even for an algebraic K3 the Picard lattice jumps discontinuously). In spite of this chaotic structure, the difference between the number of vector multiplet and hypermultiplet BPS states is stable and it is this difference that governs physical quantities [1].
7. The Geometrical Realization of the BPS Algebra for Type II Strings
We would like to calculate the bound state BPS pole in the scattering of two BPS states to a final state $\mathcal{F}$:
$$\psi_1 + \psi_2 \to \psi_3 \to \mathcal{F}$$ (7.1)
corresponding to charge vectors:
$$Q_1 + Q_2 = Q_3.$$ (7.2)
7.1. Positive and negative BPS states. It is important to note that we are scattering both BPS and anti-BPS states. The distinction is determined by the orientation of $X_d$. We refer to these as positive and negative BPS states. The notion of positivity corresponds to the positivity of roots in a Lie algebra, and plays an important role in the geometrical formulation of the BPS algebra.
Example 1. In the case of (0, 2, 4) bound states on K3 we can order the charges by saying that
1. $(r, \vec{c}_1, r - |ch_2|) > 0$ if $r > 0$.
2. $(0, \vec{c}_1, -|ch_2|) > 0$ if $\vec{c}_1 = \sum n_i [\Sigma_i]$ is in the NEF cone, i.e., if the $n_i$ are nonnegative.
3. $(0; 0; -L) > 0$ for $L > 0$.
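The ordering in Example 1 is a lexicographic-style test on the three entries of the charge vector, and can be sketched in a few lines. This is only an illustration: the function name `k3_charge_sign`, the representation of $\vec{c}_1$ by its coefficients $n_i$ in some fixed basis of curve classes $[\Sigma_i]$, and the convention that a negative state is one whose opposite charge is positive are all assumptions made here, not constructions from the text.

```python
from typing import Sequence


def k3_charge_sign(r: int, n: Sequence[int], q0: int) -> int:
    """Classify a K3 charge vector following Example 1 (illustrative sketch).

    r  : 4-brane charge (the rank).
    n  : coefficients n_i of c_1 = sum_i n_i [Sigma_i] in an ASSUMED fixed
         basis of curve classes.
    q0 : third entry of the charge vector (equal to -L for pure 0-branes).

    Returns +1 for a positive state, -1 for a negative state (assumed here
    to mean that the opposite charge is positive), and 0 if neither rule applies.
    """
    # Rule 1: a nonzero 4-brane charge decides the sign.
    if r != 0:
        return 1 if r > 0 else -1
    # Rule 2: with r = 0, a nonzero c_1 decides, via the signs of the n_i.
    if any(ni != 0 for ni in n):
        if all(ni >= 0 for ni in n):
            return 1
        if all(ni <= 0 for ni in n):
            return -1
        return 0  # mixed signs: neither c_1 nor -c_1 has nonnegative n_i
    # Rule 3: pure 0-branes (0; 0; -L) are positive for L > 0, i.e. q0 < 0.
    if q0 != 0:
        return 1 if q0 < 0 else -1
    return 0


if __name__ == "__main__":
    print(k3_charge_sign(1, [0, 0], -3))   # 4-brane bound to 0-branes
    print(k3_charge_sign(0, [2, 1], -1))   # 2-brane on a curve class
    print(k3_charge_sign(0, [0, 0], -4))   # pure 0-branes, L = 4
```

Returning a three-valued sign rather than a boolean keeps the case of mixed-sign $n_i$ visible instead of silently labelling it negative.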
Example 2. We now consider (0, 2, 4, 6) bound states on a Calabi–Yau 3-fold $X_3$. These will have charges $(r_6, r_4, r_2, r_0) \in H^*(X; Z)$. $r_2$, $r_4$ are vectors in lattices, but these lattices still have cones: the Mori cone and the NEF cone respectively.22 Bogomolnyi states still lie in a cone in these lattices and we say that the BPS states are positive if $r_6 > 0$, or $r_6 = 0$, $r_4 > 0$, or $r_6 = r_4 = 0$, $r_2 > 0$, or $r_6 = r_4 = r_2 = 0$, $r_0 > 0$.
7.2. The correspondence conjecture. We now consider the BPS states as differential forms on the moduli space of sheaves. Then the BPS states are represented by cohomology classes $\omega_i \in H^*(\mathcal{M}^{spl}(Q_i))$. We need to define the projection of the groundstate wavefunction $\omega_1 \otimes \omega_2 \in H^*(\mathcal{M}^{spl}(Q_1) \times \mathcal{M}^{spl}(Q_2))$ onto a groundstate wavefunction in $H^*(\mathcal{M}^{spl}(Q_3))$. A conjecture, motivated by the work of Nakajima, and of Ginzburg et al. [56, 57], is the following. Suppose first that the three vectors $Q_i$ in (7.2) represent positive BPS states. Recall that the charges are Chern characters of sheaves. There is only one natural way that the three sheaves $E_1$, $E_2$, $E_3$ can be related and satisfy (7.2). They must fit into an exact sequence:
$$0 \to E_1 \to E_3 \to E_2 \to 0$$ (7.3)
or
$$0 \to E_2 \to E_3 \to E_1 \to 0.$$ (7.4)
The ambiguity between (7.3) and (7.4) is resolved by the requirement that $E_3$ be semistable: since Chern characters are additive the inequality (5.2) cannot hold for both $F = E_1$ and $F = E_2$.23 We define the correspondence region to be the subset of $\mathcal{M}(Q_1) \times \mathcal{M}(Q_2) \times \mathcal{M}(Q_3)$ defined by the set of triples:
$$C^{+++}(Q_1, Q_2; Q_3) = \{(E_1, E_2, E_3) : 0 \to E_1 \to E_3 \to E_2 \to 0\}.$$ (7.5)
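The claim that semistability selects one of (7.3) and (7.4) follows from a one-line computation with slopes. The sketch below assumes, as in (5.2), that both $ch_0(E_1)$ and $ch_0(E_2)$ are positive; the abbreviations $r_i$, $d_i$ and $\mu_i$ are notation introduced here for illustration.

```latex
% Sketch only: r_i = ch_0(E_i) > 0, d_i = \int \omega^{d-1} c_1(E_i), \mu_i = d_i / r_i.
\begin{align*}
\mu_3 &= \frac{d_1 + d_2}{r_1 + r_2}
       = \frac{r_1\,\mu_1 + r_2\,\mu_2}{r_1 + r_2}
  && \text{(additivity of } ch_0 \text{ and } c_1\text{)}, \\
\mu_3 - \mu_1 &= \frac{r_2\,(\mu_2 - \mu_1)}{r_1 + r_2},
\qquad
\mu_3 - \mu_2 = \frac{r_1\,(\mu_1 - \mu_2)}{r_1 + r_2}.
\end{align*}
```

So $\mu_3$ lies between $\mu_1$ and $\mu_2$: the condition $\mu(E_1) \le \mu(E_3)$ holds exactly when $\mu_1 \le \mu_2$, and $\mu(E_2) \le \mu(E_3)$ exactly when $\mu_2 \le \mu_1$. Unless the two slopes coincide, only the sheaf of smaller slope can appear as the subsheaf, which fixes the direction of the sequence.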
If in (7.1), (7.2) we have some negative BPS states then we rewrite (7.2) in terms of positive vectors and write the corresponding sequence. For example, suppose $Q_1 > 0$, $Q_2 < 0$, $Q_3 > 0$. Then $Q_1 = Q_3 + (-Q_2)$ and the correspondence is
$$C^{+-+}(Q_1, Q_2; Q_3) = \{(E_1, E_2, E_3) : 0 \to E_3 \to E_1 \to E_2 \to 0\} \subset \mathcal{M}(Q_1) \times \mathcal{M}(-Q_2) \times \mathcal{M}(Q_3)$$ (7.6)
when $\mu(E_3) \le \mu(E_1)$, etc. We can now state the
Correspondence conjecture. We conjecture that the residue of the boundstate pole is the overlap of the quantum wavefunctions on the correspondence region:
$$\langle \omega_3 | R(\omega_1 \otimes \omega_2) \rangle = \int_{C(Q_1, Q_2; Q_3)} \omega_3^* \, \omega_1 \, \omega_2.$$ (7.7)
22 The Mori cone is the cone in $H_2(X; Z)$ of homology classes of holomorphic curves in $X_3$ [55].
23 We thank D. Morrison for a helpful remark on this point.
Remarks.
1. In order for (7.7) to make sense we must first restrict the forms $\omega_i$ from $\mathcal{M}^{spl}(Q_i)$ to the moduli space of irreducible instantons $\mathcal{M}^{irred}(Q_i)$ (using the Donaldson-Uhlenbeck-Yau theorem). We are then assuming that the forms extend to the reducible locus, perhaps with singularities, but such that the integrals over $\mathcal{M}(Q)$ are well-defined.
2. This definition of R is extremely natural. It amounts to the assumption that the bound state formation simply corresponds to local additivity of the Chan–Paton vector spaces.
3. The structure constants are therefore given in terms of an intersection number, in harmony with the topological field theory interpretation.
4. Recall that the sheaf description of 2-brane states required $[\Sigma]$ to be of type (1, 1). Thus the above proposal does not cover the scattering of 2-branes which cannot be simultaneously made into (1, 1) classes by rotation of complex structure.
5. The proposal (7.7) admits a nontrivial consistency check in terms of the degree of the form $R(\omega_1 \otimes \omega_2)$. See Appendix B for details.
7.3. An heuristic argument for the correspondence conjecture. Here we try to justify further the correspondence conjecture. The basic strategy is to use a standard result of quantum mechanics: the residue of a bound state pole is related to the coefficient of the exponential falloff in the bound state wavefunction [58]. We therefore attempt to construct an $L^2$ harmonic form
$$\omega_3' \in H^*(\mathcal{M}'(Q_3), C)$$ (7.8)
with the asymptotic behavior:
$$\omega_3' \to e^{-|\vec{p}_{pole}|\,|\vec{a}^{(1)} - \vec{a}^{(2)}|}\, \omega_1 \omega_2$$ (7.9)
in the region in $\mathcal{M}'(Q_3)$ corresponding to widely separated states $\psi_1$, $\psi_2$ at positions $\vec{a}^{(i)} \in R^D$.24 The restriction of $\omega_3'$ to $\mathcal{M}(Q_3)$ should define the product $R(\psi_1 \otimes \psi_2)$. Here we are assuming that we can restrict and extend $H^*(\mathcal{M}^{spl}(Q)) \to H^*(\mathcal{M}^{irred}(Q)) \to H^*(\mathcal{M}(Q))$ as in Remark 1 above. We are also assuming that there is a reasonable Hodge theory on the singular stratified space $\mathcal{M}'(Q)$ which allows us to identify cohomology classes with harmonic forms. Now, to construct $\omega_3'$ we use the formulation of $\mathcal{M}'(Q_3)$ in terms of the generalized Hitchin system (5 ) described above. Recall that this space is stratified by the partitions $S^r_\nu(R^D)$.
A. Suppose $Q_1 = (r_1, \ldots)$, $Q_2 = (r_2, \ldots)$. On all strata except $\pi^{-1}(\Delta^{(r_1)} \times \Delta^{(r_2)})$ and $\pi^{-1}(\Delta^{(r_3)})$ we take $\omega_3' = 0$.
B. On the stratum $\pi^{-1}(\Delta^{(r_1)} \times \Delta^{(r_2)})$, where we have the Higgs field
$$\Phi_i = \begin{pmatrix} a_i^{(1)}\, 1_{r_1 \times r_1} & 0 \\ 0 & a_i^{(2)}\, 1_{r_2 \times r_2} \end{pmatrix}$$ (7.10)
we take
$$\omega_3' = e^{-|\vec{a}^{(1)} - \vec{a}^{(2)}|\,|\vec{p}_{pole}|}\, \omega_1 \omega_2,$$ (7.11)
where $\omega_i$ are harmonic forms on $\mathcal{M}^{spl}(Q_i)$ restricted and extended to $\mathcal{M}(Q_i)$. Asymptotically the Hamiltonian is written in terms of the Laplacians on the moduli spaces $\mathcal{M}_1$, $\mathcal{M}_2$ as: $H = -\frac{d^2}{d\vec{a}^2} + \Delta_1 + \Delta_2$.
C. Finally, on the stratum $|\vec{a}^{(1)} - \vec{a}^{(2)}| = 0$, corresponding to $\pi^{-1}(\Delta^{(r)})$, we take:
$$\omega_3' = \int_{\mathcal{M}_1 \times \mathcal{M}_2} \eta(C \to \mathcal{M}(Q_1) \times \mathcal{M}(Q_2) \times \mathcal{M}(Q_3))\, \omega_1 \omega_2.$$ (7.12)
24 When the state is a bound state at threshold the falloff will be a power of $r = |\vec{a}^{(1)} - \vec{a}^{(2)}|$.
Here $\eta(X \to Y)$ denotes a harmonic representative of the Poincaré dual of a space X embedded in Y. This defines a form $\omega_3'$ on all of $\mathcal{M}'(Q_3)$. It is a harmonic form on the various smooth strata. It is also continuous because, if $m_3$ is in the reducible locus of $\mathcal{M}(Q_3)$, then R imposes $E_3 = E_1 \oplus E_2$ so $\omega_3' = \omega_1 \omega_2$, just right to match to stratum B. We presume that such a form is unique. Thus, modulo the above caveats, we conclude that $\omega_3'$ is the form representing the bound state in the scattering process. Now, if we want the overlap with a bound state $\omega_3 \in H^*(\mathcal{M}^{irred}(Q_3))$ we recall that $\omega_3$ only has support in stratum C. The overlap $\langle \omega_3 | \omega_3' \rangle$ is exactly given by (7.7). This concludes the argument for the correspondence conjecture.
Remarks.
1. The above is an ansatz for the bound state waveform. If there turns out to be a nontrivial metric on $S^r(R^D)$ then the ansatz will have to be modified.
2. We have not treated the singularities at the reducible instantons with care. This is related to Assumption A of Sect. 4.2. It is possible that the moduli space is smoothed out at the reducible locus due to short fundamental string degrees of freedom, along the lines of [37], and some evidence for this is provided by [40]. This could lead to corrections to (7.7), but these corrections should vanish in the large volume limit.
7.4. Implications of Heterotic/IIA duality. The above geometrical formulation of the algebra of BPS states together with string duality has important applications to mathematics.
7.4.1. Nakajima algebras. Nakajima constructed algebras using correspondence varieties exactly as in the correspondence conjecture. While these definitions make sense for any algebraic surface the resulting algebraic structures are relatively unknown. In particular, the answer for K3 was not hitherto known. We see that type II/heterotic duality together with the correspondence conjecture makes an interesting prediction [59]: Nakajima’s construction applied to the moduli space of $U(r)$ “instantons” for all $r \ge 0$ defines a generalized Kac–Moody algebra whose root lattice is $\Gamma^{20,4}$ and which can be described explicitly as the algebra of BPS states in the heterotic string on $T^4$. In the next two sections we will show how two special cases of this statement reproduce exactly Nakajima’s results.
7.4.2. Remark on a conjecture of Gritsenko & Nikulin. In a fascinating paper, Gritsenko and Nikulin [60] postulated the existence of generalized Kac–Moody algebras whose simple roots are associated to collections of elements in the NEF cone of a K3 surface. The algebra of (0, 2)-brane fiber bound states in a K3 fibration, or of (0, 2, 4) fiber bound states, provides examples of such GKM algebras, by virtue of string/string duality. One of these algebras is probably the algebra needed for the “Mirror conjecture” stated in [60]. (It is possible that one will need to take a quotient algebra of the algebra of BPS states.) Indeed, the (0, 2, 4) bound states, when interpreted on the heterotic side, have counting functions associated with various threshold corrections, which involve automorphic forms of the type entering in Gritsenko and Nikulin’s conjecture.
8. Comparison of Heterotic and Type II Algebras: the Heisenberg Algebras
In the next two sections we compare the BPS algebras of the type II and heterotic strings on K3 and $T^4$ respectively. We will restrict to the subspace of Narain moduli space where we have the orthogonal decomposition:
$$\Gamma^{20,4} = \Gamma^{1,1} \oplus \Gamma^{19,3} \subset R^{1,1} \oplus R^{19,3}$$ (8.1)
so we can describe the K3 in terms of classical geometry. A waveform $\omega \in H^*(\mathcal{M}^{spl}(Q))$ for $Q = (r, c_1, r - L)$ corresponds, on the heterotic side, to a vertex operator with matter of the form:
$$P(\partial^* x(z))\, e^{iEt(z,\bar{z})}\, e^{i \frac{1}{\sqrt{2}} \left( \frac{(L-r)}{V} - rV \right) X(z)} \otimes e^{i \frac{1}{\sqrt{2}} \left( \frac{(L-r)}{V} + rV \right) \tilde{X}(\bar{z})}\, e^{i c_1 \cdot Y}(z, \bar{z}).$$ (8.2)
The notation is as follows. V is the real parameter of the lattice $\Gamma^{1,1}$. It corresponds to a radius on the heterotic side, and to the volume of the K3 on the IIA side. $x(z)$ stands for all 26 holomorphic left-moving coordinates. $(X, \tilde{X})$ are left and right-moving coordinates on $R^{1,1}$, while $Y = (y, \tilde{y})$ are left and right coordinates on $R^{19,3}$.
8.1. Type IIA description. Nakajima’s Heisenberg algebra [56, 61] corresponds to the scattering of 0-branes off of a single 4-brane bound to collections of 0-branes. Indeed, the space of such BPS states is:
$$\mathcal{H} \equiv \oplus_{L \ge 0}\, \mathcal{H}^{(1;0;1-L)}_{BPS} \cong \oplus_{L \ge 0}\, H^*(X^{[L]}),$$ (8.3)
where in the second line we have used (5.7). Let us consider a single four-brane bound to zerobranes with total 0-brane charge $1 - L_1$, and let us scatter a 0-brane of charge $-L_2$. The corresponding BPS states are represented by homology classes:
$$[S_1] + [\tilde{\Sigma}_{L_2}] \to [S_3],$$ (8.4)
where
$$[S_1] \in H_*(\mathcal{M}(1, 0, 1 - L_1)), \quad [\tilde{\Sigma}_{L_2}] \in H_*(\Xi_{L_2}), \quad [S_3] \in H_*(\mathcal{M}(1, 0, 1 - L_3)),$$ (8.5)
and $L_3 = L_1 + L_2$. Nakajima defines the operator $\alpha^{\Sigma}_{-L_2}$ by
$$\langle [S_3] | \alpha^{\Sigma}_{-L_2} | [S_1] \rangle = C^{+++} \cap S_1 \times \tilde{\Sigma}_{L_2} \times S_3 = \int_{C(Q_1, Q_2; Q_3)} \eta_{S_1}\, \eta_{S_3}\, \eta(\tilde{\Sigma}_{L_2} \to \Xi_{L_2}).$$ (8.6)
Recall that $\eta$ stands for the Poincaré dual. In a similar way the absorption of a 0-brane bound state of charge $+L_2$ is defined by
$$\langle [S_3] | \alpha^{\Sigma}_{L_2} | [S_1] \rangle = C^{+-+} \cap S_1 \times \tilde{\Sigma}_{L_2} \times S_3 = \int_{C^{+-+}(Q_1, Q_2; Q_3)} \eta_{S_1}\, \eta_{S_3}\, \eta(\tilde{\Sigma}_{L_2} \to \Xi_{L_2}).$$ (8.7)
Nakajima has shown that the operators $\alpha^I_n$ defined in this way as operators on (8.3) form a Heisenberg algebra
$$[\alpha^I_n, \alpha^J_m] = c_n\, \eta^{IJ}\, \delta_{n+m,0},$$ (8.8)
where $\eta^{IJ}$ is the intersection pairing on $H_*(X)$ and $c_n$ are constants. Ellingsrud and Stromme [2] have been able to calculate $c_n$ using intersection theory and find, remarkably,
$$c_n = n(-1)^n,$$ (8.9)
so the operators defined this way are canonically normalized in the sense of string theory!
8.2. Heterotic description. According to (8.2) the heterotic operators $A_{L,\zeta}$ corresponding to zerobranes of charge L have a left-moving matter piece given by
$$A^{left}_{L,\zeta} = \zeta \cdot \partial x\, e^{-i \frac{L}{\sqrt{2} V}(t + X)},$$ (8.10)
where $(X, \tilde{X})$ are left/right coordinates on $R^{1,1}$, and x runs over all 26 left-moving coordinates. We have $\zeta \cdot k = 0$ and $\zeta \sim \zeta + \lambda k$, where k is the lightlike momentum in the exponential in (8.10). Thus, $\zeta$ span a 24-real dimensional space. The algebra of the operators $A_{n,\zeta}$ is easily computed. The boost $\vec{p} = 0$, and we simply find:
$$R(A_{n,\zeta} \otimes A_{n',\zeta'}) = n\, \zeta \cdot \zeta'\, \delta_{n+n',0}\, \frac{i}{\sqrt{2} V}(\partial t + \partial X).$$ (8.11)
Note that it is important that we are working within BRST cohomology. The algebra (8.11) is definitely not a Heisenberg algebra, but becomes closer to a Heisenberg algebra when we consider the scattering of 0-branes off of a 4-brane bound to zerobranes. Thus, we consider the subspace of BPS states (8.3) from the heterotic side. When acting on the module $\mathcal{H}$ we find that the operator $\frac{i}{\sqrt{2} V}(\partial t + \partial X)$ acts as the c-number $-1$, so that, acting on the module $\mathcal{H}$, the operators $A_{n,\zeta}$ are represented by:
$$[A_{n,\zeta}, A_{n',\zeta'}] = n'\, \delta_{n+n',0}\, \zeta \cdot \zeta'.$$ (8.12)
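The passage from (8.11) to (8.12) uses only the stated c-number value and the support of the Kronecker delta. Spelled out as a sketch, with the symbol $\mathcal{H}$ standing for the module (8.3):

```latex
% Sketch only: substitute the c-number -1 into (8.11), then use n = -n' on the
% support of \delta_{n+n',0}.
\begin{align*}
R(A_{n,\zeta} \otimes A_{n',\zeta'})\Big|_{\mathcal{H}}
  &= n\,\zeta\cdot\zeta'\,\delta_{n+n',0}\,\cdot(-1)
   = -n\,\delta_{n+n',0}\,\zeta\cdot\zeta' \\
  &= n'\,\delta_{n+n',0}\,\zeta\cdot\zeta'
  \qquad \text{since } \delta_{n+n',0} \neq 0 \text{ only for } n = -n' .
\end{align*}
```

The right-hand side is a c-number antisymmetric under exchange of $(n, \zeta)$ with $(n', \zeta')$, which is what allows it to be read as a commutator.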
Thus we recover Nakajima's Heisenberg algebra.
Remarks.
1. There is a very close analogy of the above 0-brane operators with DDF operators.
2. The above remarks can be generalized to scattering off of wrapped 4-brane states with charges $(r, c_1, r-L)$ at fixed $r, c_1$. In this case the Heisenberg algebra becomes $[A_{n,\zeta}, A_{n',\zeta'}] = r n'\,\delta_{n+n',0}\,\zeta\cdot\zeta'$.
3. The previous remark can be further generalized: When the volume of K3 satisfies V = 1 then in fact the algebra of pure 0-brane and 4-brane bound states is the algebra $w_\infty$ of area-preserving diffeomorphisms. Note that at V = 1 the X-CFT has the symmetry of $\widehat{SU(2)}_1$ and we can define primary fields
$$V_{J,m}(X) \equiv \left(\oint e^{-i\sqrt{2}X}\right)^{J-m} e^{i\sqrt{2}JX} \qquad (8.13)$$
of dimension $\Delta = J^2$, where J = 0, 1/2, 1, . . . and m ≤ |J|. It follows immediately from [62, 63] that the corresponding BPS multiplets $\Psi^{(+)}_{J,m}$ with energy $E = \sqrt{2}\sqrt{J^2-1}$ satisfy:
$$R(\Psi^{(+)}_{J_1,m_1}\otimes\Psi^{(+)}_{J_2,m_2}) = (J_2 m_1 - J_1 m_2)\,\Psi^{(+)}_{J_1+J_2-1,\,m_1+m_2}. \qquad (8.14)$$
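The structure constants in (8.14) can be checked for consistency with a Jacobi identity. The following is only a sketch: it treats (8.14) as an antisymmetric bracket on abstract labels (J, m), assumes |m| ≤ J in integer steps, and ignores the physical conditions on the multiplets. The function names are illustrative and do not come from the paper.

```python
from fractions import Fraction


def bracket(a, b):
    """Structure constant and resulting label for the product in (8.14).

    a = (J1, m1), b = (J2, m2); returns (coefficient, (J1 + J2 - 1, m1 + m2)).
    """
    (j1, m1), (j2, m2) = a, b
    return j2 * m1 - j1 * m2, (j1 + j2 - 1, m1 + m2)


def jacobi_defect(a, b, c):
    """Coefficient of the cyclic sum [[a,b],c] + [[b,c],a] + [[c,a],b].

    All three terms carry the same label (J1+J2+J3-2, m1+m2+m3),
    so it is enough to add up their coefficients.
    """
    total = Fraction(0)
    for x, y, z in ((a, b, c), (b, c, a), (c, a, b)):
        coeff_xy, xy = bracket(x, y)
        coeff_xyz, _ = bracket(xy, z)
        total += coeff_xy * coeff_xyz
    return total


def generators(j_max):
    """Labels (J, m) with J = 1/2, 1, ..., j_max and m = -J, -J+1, ..., J (assumed range)."""
    gens = []
    j = Fraction(1, 2)
    while j <= j_max:
        m = -j
        while m <= j:
            gens.append((j, m))
            m += 1
        j += Fraction(1, 2)
    return gens


if __name__ == "__main__":
    gens = generators(Fraction(2))
    ok = all(jacobi_defect(a, b, c) == 0 for a in gens for b in gens for c in gens)
    print("Jacobi identity holds on all sampled triples:", ok)
```

The check passes because the coefficient $J_2 m_1 - J_1 m_2$ is, up to a factor of 2, the Poisson bracket of the monomials $x^{J+m}y^{J-m}$, which is the usual realization of area-preserving diffeomorphisms.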
8.3. Intuitive picture. There is an extremely simple intuitive picture that explains the noncommutativity of the algebra of scattering 0-branes off 4-branes. We now define $\alpha^I_L$ for L > 0 to be the operator on BPS states obtained by absorbing a charge +L 0-brane and projecting onto the BPS state. Similarly, $\alpha^I_{-L}$ is the operation of absorbing a charge −L 0-brane and projecting onto the BPS state. The 4-brane wrapped around a K3 constitutes the Heisenberg vacuum: $\alpha_L|0\rangle = 0$ because absorbing a charge L 0-brane breaks supersymmetry. Moreover, 0-branes of charge L can only annihilate 0-branes of charge −L. Using these simple pictures one can understand some aspects of the algebras. Note that if we replace K3 by $T^4$ then the algebra is not highest weight since $T^4$ does not break any supersymmetry. This is in accord with the fact that in the toroidally compactified type II theory there are two BPS towers, one on the left and one on the right.
9. Comparison of Heterotic and Type II Algebras: The Affine Lie Algebras
9.1. Nakajima's construction. In a famous set of papers Nakajima [56] showed how to construct highest weight representations of affine Lie algebras on the cohomology spaces of quiver varieties using the correspondences (7.5)(7.6). In particular, Nakajima associated $\widehat{SU(n)}_r$ current algebra to the moduli of U(r) instantons on the ALE space $X_n(\vec\zeta)$ which is a resolution of $C^2/Z_n$.$^{25}$ In this section we will show how to recover Nakajima's result from comparison of BPS algebras via string/string duality.$^{26}$ Let us consider a family $S(\epsilon)$ of K3 surfaces degenerating to a surface with an ADE singularity. Thus, in addition to (8.1) we suppose that $\Gamma^{19,3}(\epsilon)$ degenerates to $\Gamma^{19,3}_*$, where
$$(\Gamma(g);\, \tilde 0_R) \subset \Gamma^{19,3}_* \qquad (9.1)$$
for an ADE root lattice $\Gamma(g)$. For simplicity we will restrict our attention to $g = A_{n-1}$. In this limit the two-spheres associated to the roots of g shrink to zero size so we simultaneously take a limit V → ∞ of the $H^0\oplus H^4$ lattice $\Gamma^{1,1}(V)$ so that the area of the two-spheres $A = \epsilon^2 V$ is fixed. The result is that as ε → 0 the K3 degenerates to an ALE space of type ADE. As ε → 0 the moduli space of instantons on the K3 breaks up into components corresponding to the various components of finite action instantons on the ALE. The latter have the topological classification by ρ, a flat U(r) connection on $S^3/Z_n$. So we expect that as the K3 degenerates to an ALE with finite area 2-spheres the cohomology of the moduli space of instantons behaves as
$$H^*(M(v; S(\epsilon))) \;\xrightarrow{\;\epsilon\to 0\;}\; \oplus_\rho H^*(M(v,\rho; X_n(\vec\zeta))). \qquad (9.2)$$
The definition of correspondences carries over to Nakajima's definition so we should expect to recover Nakajima's current algebras if we translate the above degeneration to the heterotic side. By (9.2), a prediction of string/string duality is that on the heterotic side we should find all the highest weight representations. We will verify this below. Nakajima associated highest weight representations to the middle-dimensional cohomology. We should take all the cohomology. This is in keeping with the heterotic description where it is evident that there are many representations of $\hat g_r$ in $H_{BPS}$.
25 The $\vec\zeta$ are the hyperkähler moduli.
26 A suggestion that heterotic/IIA duality might be the right arena to explain Nakajima's affine Lie algebras associated to ALE manifolds appeared in [64]. A related suggestion had been proposed independently by one of us previously in unpublished work.
9.2. Recovering Affine Lie algebras. Let us consider the heterotic algebra of BPS states under the degeneration
$$\lim_{\epsilon\to 0}\; \Gamma^{1,1}(A/\epsilon^2)\oplus\Gamma^{19,3}(\epsilon) \qquad (9.3)$$
described above. We will consider algebras and modules associated to states with charge vectors $Q = (r, c_1, r-L)$. These have matter operators given by (8.2), as above. In the limit ε → 0 we obtain BPS states with internal left-moving vertex operator:
$$J^{\vec\alpha} \leftrightarrow e^{i\vec\alpha\cdot\vec y}\,\epsilon_\alpha, \qquad \vec H \leftrightarrow -i\partial\vec y \qquad (9.4)$$
for roots $\vec\alpha$ of g.$^{27}$ The algebra of these states is simply the Lie algebra g, in the Cartan-Weyl basis. ($\epsilon_\alpha$ is a cocycle factor.) We now consider a larger algebra obtained by adding generators with left-moving matter:
$$J^{-\vec\theta}_{+1} \leftrightarrow e^{-i\frac{1}{\sqrt{2V}}(t(z)+X(z))}\, e^{i\vec\theta\cdot\vec y(z)}, \qquad J^{\vec\theta}_{-1} \leftrightarrow e^{+i\frac{1}{\sqrt{2V}}(t(z)+X(z))}\, e^{-i\vec\theta\cdot\vec y(z)}, \qquad (9.5)$$
where $\vec\theta$ is the highest root of g. The product of the two states in (9.5) is
$$R(J^{\vec\theta}_{-1}\otimes J^{-\vec\theta}_{+1}) \sim -i\vec\theta\cdot\partial\vec y + \frac{i}{\sqrt{2V}}(\partial t + \partial X). \qquad (9.6)$$
While we do not obtain an affine Lie algebra in this way we can introduce a subspace of BPS states analogous to (8.3). We choose $r \in Z_+$, $c_1 \in \Gamma^{19,3}$ and define:
$$H_{r,c_1} \equiv \sum_{\alpha\in(\Lambda_R(g);0),\; L\ge 0} H_{BPS}^{(r;\alpha+c_1;r-L)}, \qquad (9.7)$$
where we hold $r, c_1$ fixed. Now, when acting on the module (9.7), $\frac{i}{\sqrt{2V}}(\partial t + \partial X)$ is not a c-number, but acting on the summand $H_{BPS}^{(r;\alpha+c_1;r-L)}$
it becomes multiplication by r L−r 1 L−r 2 1 2 ) + 2(~cR )− (rV + − 2 (r − 1 ) . V2 2V V
(9.8)
This is a complicated function, but, in the V → ∞ limit it becomes simply −r, a c-number on the entire module $H_{r,c_1}$. Thus, in the V → ∞ limit we have:
$$[J^{\vec\theta}_{-1}, J^{-\vec\theta}_{+1}] = \vec\theta\cdot\vec H - r. \qquad (9.9)$$
27 We use a vector sign to denote a vector in a Euclidean signature space.
Similarly, we can easily compute:
$$\left[\, i\vec\theta\cdot\partial\vec y - \frac{i}{\sqrt{2V}}(\partial t + \partial X),\; J^{-\vec\theta}_{+1} \right] = 2J^{-\vec\theta}_{+1}, \qquad \left[\, i\vec\theta\cdot\partial\vec y - \frac{i}{\sqrt{2V}}(\partial t + \partial X),\; J^{+\vec\theta}_{-1} \right] = -2J^{+\vec\theta}_{-1} \qquad (9.10)$$
(this is valid for all volumes V). Therefore, the subalgebra of BPS states generated by (9.5), (9.4) acting on the module (9.7) is a deformation of the affine Kac–Moody algebra $\hat g_r$ of level r. In the V → ∞ limit the subalgebra generated by (9.5)(9.4) becomes exactly $\hat g_r$. Equations (9.5)(9.4) are the Serre generators of the algebra.
9.2.1. The representations. We will now show that by choosing $c_1$ appropriately we can obtain all the integrable highest weight representations in (9.7) in the large volume limit. Since $J^{-\vec\theta}_{+1}$ lowers L the modules (9.7) are always highest weight representations. States of type (8.2) come with degeneracy $p_{24}(N)$ where
$$N = r(L-r) + \tfrac{1}{2}c_1^2 + 1. \qquad (9.11)$$
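As a small numerical illustration of (9.11), the sketch below computes the level N and the corresponding degeneracy. It assumes that $p_{24}(N)$ denotes the number of partitions of N into parts of 24 colors, i.e. the coefficient of $q^N$ in $\prod_{k\ge 1}(1-q^k)^{-24}$, and that $c_1^2$ is an even integer. The function names are illustrative only.

```python
def colored_partitions(colors, n_max):
    """Coefficients of prod_{k>=1} (1 - q^k)^(-colors) up to order q^n_max."""
    coeffs = [1] + [0] * n_max
    for k in range(1, n_max + 1):
        for _ in range(colors):
            # Multiply the truncated series in place by 1/(1 - q^k).
            for n in range(k, n_max + 1):
                coeffs[n] += coeffs[n - k]
    return coeffs


def level(r, L, c1_squared):
    """N = r(L - r) + c1^2/2 + 1 as in (9.11); c1^2 is assumed to be even."""
    if c1_squared % 2 != 0:
        raise ValueError("this sketch assumes an even value of c1^2")
    return r * (L - r) + c1_squared // 2 + 1


if __name__ == "__main__":
    p24 = colored_partitions(24, 5)
    n = level(r=3, L=3, c1_squared=-2)
    print("N =", n, "degeneracy =", p24[n])
```

With L = r and $c_1^2 = -2$ the level is N = 0, where the degeneracy is one under the stated assumption on $p_{24}$.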
The highest weight state in a highest weight representation should have degeneracy one. This can be ensured by choosing L = r and $c_1^2 = -2$. We do not assume that $c_1$ is purely left-moving, although its projection to the subspace $\Lambda_R(g)\otimes R$ must be some weight vector $\vec\lambda$ since $\Gamma^{20,4}$ is selfdual. At enhanced symmetry points (9.1) we can obtain all dominant weights by choosing suitable $c_1$.$^{28}$ We choose a basis vector $\Psi_{r,c_1,0}$ in this space. Consider the state:
$$(J^{\vec\theta}_{-1})^n\, \Psi_{r,c_1,0}. \qquad (9.12)$$
This state has charge vector $Q = (r, c_1 + n\theta, n)$. The smallest value of n for which $\frac{1}{2}Q^2 + 1 < 0$ (and hence, for which there cannot be any BPS state) is:
$$n = r - \vec\lambda\cdot\vec\theta + 1, \qquad (9.13)$$
since $c_1\cdot\theta = -\vec\lambda\cdot\vec\theta$. For this value of n the vector (9.12) must vanish. Thus, we interpret (9.12) for n given by (9.13) as the null vector of the integrable highest weight representation of level r with weight $\vec\lambda$.$^{29}$ Similarly, we may demand that the action of $J^{-\vec\theta}_{+1}$ produce zero because we have violated the BPS condition. This implies the inequality:
$$\tfrac{1}{2}(c_1-\theta)^2 + r(r-1-r) + 1 < 0 \qquad (9.14)$$
or, equivalently:
$$\vec\lambda\cdot\vec\theta \le r \qquad (9.15)$$
which is just the integrable highest weight condition for the affine Lie algebra. In fact, the space (9.7) forms a module of states under the sub-algebra generated by BPS states (8.2) with charges $Q = (0; \vec\alpha; -N)$, for $\vec\alpha \in \Lambda_R(g)$. These are 2-branes bound to arbitrary numbers of 0-branes. This is in principle a larger algebra than the affine Lie algebra.
Remark. A recent paper [65] appears to be closely related to the construction of this section.
28 We have only verified this for g of rank ≤ 3, but believe it to be generally true.
29 To complete the argument we should show that the states do not vanish for smaller values of n. We have not done so.
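The condition (9.15) can be made concrete for $g = A_{n-1}$. The sketch below assumes the standard parametrization of a dominant weight by non-negative Dynkin labels $(a_1, \ldots, a_{n-1})$, for which the pairing with the highest root is $a_1 + \cdots + a_{n-1}$ in a positive-definite convention. The function name is illustrative.

```python
from itertools import product
from math import comb


def integrable_weights(n, r):
    """Dynkin labels of dominant weights of A_{n-1} with lambda . theta <= r.

    For A_{n-1} every comark equals 1, so lambda . theta is the sum of the labels.
    """
    rank = n - 1
    return [labels for labels in product(range(r + 1), repeat=rank)
            if sum(labels) <= r]


if __name__ == "__main__":
    for n, r in [(2, 1), (2, 3), (3, 2), (4, 2)]:
        weights = integrable_weights(n, r)
        # The count agrees with the closed form binomial(n - 1 + r, r).
        print(n, r, len(weights), comb(n - 1 + r, r))
```

For SU(2) at level r this gives the familiar r + 1 representations.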
10. Conclusions and future directions
In this paper we have discussed the algebras of BPS states associated to toroidally compactified heterotic strings and to IIA strings on K3 surfaces. However, as we have stressed, the concept of a BPS algebra is quite general. In particular, to any Calabi–Yau 3-fold $X_3$ there are two canonically associated algebras, $g_A(X_3)$, $g_B(X_3)$ defined by the BPS algebras in the IIA and the IIB theory. Moreover, by quantum mirror symmetry, if $X_3, \tilde X_3$ are mirror pairs then
$$g_A(X_3) \cong g_B(\tilde X_3). \qquad (10.1)$$
Almost nothing is known about these Calabi–Yau algebras. The formulation in terms of correspondences gives a definition of the IIA algebra but does not give an effective calculational scheme. Moreover, there is at present no analogous mathematical formulation of the IIB algebra other than that provided by (10.1). We would like to stress that the CY algebras are algebras of nonperturbative dyonic BPS states in d = 4, N = 2 type II compactifications. Much remains to be understood here. First, in the d = 4, N = 4 theory we can use 6-dimensional string/string duality of heterotic/$T^6$ with IIA/$K3\times T^2$ to obtain the electric subalgebra. This will be a GKM algebra with root lattice $\Gamma^{22,6}$. Moreover, the results of [6] suggest that the full d = 4, N = 4 dyonic BPS algebra should be a GKM algebra.$^{30}$ Moving on to more complicated Calabi–Yau threefolds, heterotic/IIA duality for K3-fibered Calabi–Yau's shows that the (0, 2, 4) fiber bound states form a GKM algebra. This suggests that the full dyonic algebra $g_A(X_3)$ is a generalized Kac–Moody algebra and would fit in well with the natural conjecture that the result of [6] should generalize to d = 4, N = 2 compactifications. There are many interesting avenues for further investigation of the above ideas. It would be interesting to investigate in detail the algebras of BPS states associated to 4-folds of exceptional holonomy together with their supersymmetric cycles. In view of recent developments [66] the BPS algebra of 0-branes in IIA theory on tori seems of particular importance. In [1] it was shown that automorphic forms of the kind appearing in the study of GKM algebras appear in threshold corrections. At the present moment we do not have a good understanding of how the algebra of BPS states and algebras appearing in threshold corrections are connected. Indeed, until recently no clear connection has been established between any automorphic form appearing in threshold corrections and a concretely defined GKM.
That situation has been improved recently [5], but there is still much to learn. Recently a very interesting phase transition with an infinite number of massless dyonic particles has been discussed [67, 68]. These states may be thought of as analytic continuations of (0, 2, 4) brane bound states in a Del Pezzo surface. These will form a subalgebra of the IIA algebra. Perhaps the phase transition discussed in [67, 68] is a phase, rather analogous to D = 2 string theory at the self-dual radius or to total string compactification at special radii for the timelike coordinate, at which an entire GKM gauge algebra is becoming unbroken.
30 We are using the term GKM algebra loosely here. See note below.
Notes added.
1. We would like to draw the reader's attention to two papers making use of correspondences in D-brane interactions [69, 70].
2. R. Borcherds has pointed out to us that our use of the term "generalized Kac–Moody algebras" is inaccurate. The algebras defined in [12, 13] are Z graded, whereas our algebras are graded by a lattice which is possibly non-Lorentzian (e.g. $II^{p,q}$). It seems to us to be an important and interesting problem to develop the theory of this more general class of algebras.
Acknowledgement. The remarks on correspondences were inspired by discussions and collaboration with A. Losev, N. Nekrasov, and S. Shatashvili on Nakajima algebras. We would also like to thank T. Banks, R. Dijkgraaf, M. Douglas, D. Freed, M. Green, S. Katz, E. Martinec, D. Morrison, R. Plesser, J. Polchinski, G. Segal, A. Sen, A. Strominger, S. Shenker, W. Taylor, A. Todorov, E. and H. Verlinde, and G. Zuckerman for discussions. We thank the Rutgers Physics Department and the ITP at Santa Barbara for hospitality during the course of this work. GM would like to thank the Aspen Center for Physics for providing a stimulating atmosphere during the completion of this work. Some of these results were announced on July 16, 1996 at the conference Strings '96 at Santa Barbara. We thank the organizers for the opportunity to present them. This work was supported in part by NSF Grant No. PHY 91-23780 and DOE grant DE-FG02-92ER40704.
Appendix A. The Hilbert Scheme of Points
We collect here a few basic facts about the Hilbert scheme of points. For more information see [49]. For any manifold X we denote
$$S^N(X) \equiv [X\times\cdots\times X]/S_N. \qquad (A.1)$$
This space has orbifold singularities whenever two or more points in $X^N$ coincide. Indeed, the space may be written as a stratified space parametrized by the partitions of N:
$$S^N(X) = \amalg_\nu\, (S^N(X))_\nu, \qquad (A.2)$$
where if ν is the partition $(1)^{n_1}(2)^{n_2}\cdots(s)^{n_s}$ then
$$(S^N(X))_\nu = \prod_i\, [X^{n_i} - BD]/S_{n_i}$$
and BD stands for the "big diagonal" where any two points coincide. If X is a complex surface then there is a resolution of singularities $\pi : X^{[N]} \to S^N X$ given by the "Hilbert scheme of points on X". This is the moduli space of sheaves supported at points of length N. The length is given by the dimension of the stalk at the point. The Hilbert scheme of points on higher dimensional complex manifolds also admits a projection to $S^N X$ but is not smooth. While the space $X^{[N]}$ parametrizes sheaves supported at points it also parametrizes sheaves of generic rank one which are rank zero at a finite set of points. The parametrization is via
$$0 \to I \to O \to S \to 0, \qquad (A.3)$$
where S is a skyscraper sheaf, and I is the sheaf defining the rank 1 Chan–Paton sheaf. A local model for I on the open dense space $(X^N - BD)/S_N$ is the following. Suppose we work near a point x = y = 0. The structure sheaf is just the sheaf of analytic functions $O = C[[x, y]]$. On the other hand, I(U) for U containing x = y = 0 is the $O = C[[x, y]]$ module of analytic series vanishing at x = y = 0. Explicitly we have:
$$I(U) = \{a_{01}x + a_{10}y + a_{20}x^2 + a_{11}xy + a_{02}y^2 + \cdots\} \qquad (A.4)$$
on small open sets U containing x = y = 0. The sequence (A.3) gives the vector space $S(U) = \{a_{00}\}$ with O-module structure $x\cdot a_{00} = y\cdot a_{00} = 0$, if x = y = 0 is in U, that is, S is a skyscraper sheaf. At the other extreme, $\pi^{-1}(P)$ for $P \in \Delta(L)$ (with $\Delta(L)$ the small diagonal) parametrizes ideals I in O such that O/I is of dimension L and supported at P.
Appendix B. Consistency Check for the Correspondence Conjecture
It is possible to give a nontrivial consistency check of (7.7) by considering the degrees of the forms involved. We thank G. Segal and K. Hori for asking the questions which led to this calculation. A massive BPS multiplet transforms under the group SO(5), the little group of $\vec p = 0$, as well as the d = 6, N = 2 supertranslation algebra. SO(5) acts on the space of multiplets and we expect the two Cartan generators to be conserved in forming the BPS product. Let us see how this is realized from the type IIA side. We consider the process (7.5) for definiteness, and moreover suppose $Q_1 \ne Q_2$. The moduli space $M_{spl}(Q)$ inherits a hyperkähler structure from $X_2$. As is well known, the complex cohomology of a Kähler manifold has an action of sl(2, C) [39]. On a hyperkähler manifold this is promoted to so(5, C) [71]. The two Cartan generators are diagonal on $H^{p,q}(M(Q), C)$ and take the values:
$$J_{12} = p + q - \dim_C M(Q), \qquad J_{34} = p - q. \qquad (B.1)$$
It is natural to interpret the so(5, C) action as the action of the complexified little group, see, e.g., [54]. Let us verify that these are conserved by the product (7.7). First, since C is an analytic subvariety its Poincaré dual is of type (p, p). Hence $J_{34}$ is trivially conserved. It takes more work to verify that $J_{12}$ is conserved. To begin, the conservation of $J_{12}$ is equivalent to:
$$\deg\omega_3 = \deg\omega_1 + \deg\omega_2 + 2Q_1\cdot Q_2 - 2. \qquad (B.2)$$
Now, (7.7) predicts that
$$\deg\omega_3 = \deg\omega_1 + \deg\omega_2 + \dim_R M(Q_3) - \dim_R C^{+++}, \qquad (B.3)$$
since η is a Poincaré dual. Now, $\dim C^{+++}$ can be computed as
$$\dim_C C^{+++} = \dim_C M(Q_1) + \dim_C M(Q_2) + \dim_C H^1(X; \mathrm{Hom}(E_2, E_1)) - 1. \qquad (B.4)$$
The reason for this is that, having chosen $E_1, E_2$ there is a nontrivial space of extensions given by $H^1(X; \mathrm{Hom}(E_2, E_1))$. (See, for example, [73], Prop. 10.2.4.) The extra −1 comes about because $H^1(X; \mathrm{Hom}(E_i, E_i))$ for i = 1, 2 are both 1-dimensional (since the sheaves are simple) so the "ratio" of these defines a spurious direction in $H^1(X; \mathrm{Hom}(E_2, E_1))$. In terms of transition functions:
$$\begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} \begin{pmatrix} 1 & \chi_{\alpha\beta} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}^{-1} = \begin{pmatrix} 1 & \lambda_1\lambda_2^{-1}\chi_{\alpha\beta} \\ 0 & 1 \end{pmatrix} \qquad (B.5)$$
identifies a representative $\chi_{\alpha\beta}$ of $H^1(X; \mathrm{Hom}(E_2, E_1))$ with $\lambda_1\lambda_2^{-1}\chi_{\alpha\beta}$. Thus, the RHS of (B.3) becomes
$$\deg\omega_1 + \deg\omega_2 + 4Q_1\cdot Q_2 - 2\dim_C H^1(X; \mathrm{Hom}(E_2, E_1)) - 2. \qquad (B.6)$$
Now, by RRG we can compute
$$\dim_C H^1(X; \mathrm{Hom}(E_2, E_1)) = Q_1\cdot Q_2 + \dim_C H^0(X; \mathrm{Hom}(E_2, E_1)) + \dim_C H^2(X; \mathrm{Hom}(E_2, E_1)). \qquad (B.7)$$
See [47], Eq. 3.21. Moreover, $H^2(X; \mathrm{Hom}(E_2, E_1))$ should vanish if C is smooth (this is the space of obstructions). Finally, from the long exact sequence associated to
$$0 \to \mathrm{Hom}(E_2, E_1) \to \mathrm{Hom}(E_2, E_3) \to \mathrm{Hom}(E_2, E_2) \to 0 \qquad (B.8)$$
we get $H^0(X; \mathrm{Hom}(E_2, E_1)) \cong H^0(X; \mathrm{Hom}(E_2, E_3))$. However, since $E_2, E_3$ are semistable one finds the latter space is the zero vector space. Proof: If $\psi : E_2 \to E_3$ were nonzero then, since $E_2$ is semistable, $\mu(E_2) < \mu(E_2/\ker\psi)$. We also know $\mu(E_3) < \mu(E_2)$, and $E_2/\ker\psi \cong \mathrm{im}(\psi) \subset E_3$ is a subsheaf of $E_3$; this contradicts semistability of $E_3$. Thus $\dim_C H^1(X; \mathrm{Hom}(E_2, E_1)) = Q_1\cdot Q_2$, and we finally get agreement between the RHS of (B.3) and (B.2).
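The degree count above can be replayed numerically. The sketch below rests on two assumptions that are not stated in this appendix: the dimension formula $\dim_C M(Q) = Q^2 + 2$ for moduli spaces of simple sheaves on K3, and charge conservation $Q_3 = Q_1 + Q_2$. With these, (B.3) and (B.4) reproduce (B.6), and setting $\dim_C H^1 = Q_1\cdot Q_2$ gives (B.2). All function names are illustrative.

```python
def dim_c_moduli(q_squared):
    """Assumed complex dimension of M(Q): Q^2 + 2."""
    return q_squared + 2


def rhs_b3(deg1, deg2, q1_sq, q2_sq, q1_dot_q2, h1):
    """Right-hand side of (B.3), with dim_C C^{+++} taken from (B.4)."""
    q3_sq = q1_sq + q2_sq + 2 * q1_dot_q2          # Q3 = Q1 + Q2 (assumption)
    dim_c_corr = dim_c_moduli(q1_sq) + dim_c_moduli(q2_sq) + h1 - 1
    # Real dimensions are twice the complex ones.
    return deg1 + deg2 + 2 * dim_c_moduli(q3_sq) - 2 * dim_c_corr


def rhs_b6(deg1, deg2, q1_dot_q2, h1):
    """Expression (B.6)."""
    return deg1 + deg2 + 4 * q1_dot_q2 - 2 * h1 - 2


def rhs_b2(deg1, deg2, q1_dot_q2):
    """Right-hand side of (B.2)."""
    return deg1 + deg2 + 2 * q1_dot_q2 - 2


if __name__ == "__main__":
    sample = dict(deg1=2, deg2=4, q1_sq=0, q2_sq=2, q1_dot_q2=3)
    h1 = sample["q1_dot_q2"]                       # dim_C H^1 = Q1 . Q2
    print(rhs_b3(h1=h1, **sample), rhs_b2(2, 4, 3))
```

The two printed numbers coincide for any choice of the sample values, which is the agreement claimed at the end of the appendix.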
References
1. Harvey, J.A. and Moore, G.: Algebras, BPS states, and strings. hep-th/9510182; Nucl. Phys. B463, 315 (1996)
2. Kawai, T.: N = 2 heterotic string threshold correction, K3 surface and generalized Kac–Moody superalgebra. hep-th/9512046; String Duality and Modular Forms. hep-th/9607078
3. Cardoso, G.L., Curio, G., Lust, D., Mohaupt, T., Rey, S.-J.: BPS Spectra and Non-Perturbative Couplings in N = 2, 4 Supersymmetric String Theories. hep-th/9512129; Cardoso, G.L., Curio, G., Lust, D., Mohaupt: Instanton Numbers and Exchange Symmetries in N = 2 Dual String Pairs. hep-th/9602108; Cardoso, G.L., Curio, G., Lust: Perturbative Couplings and Modular Forms in N = 2 String Models with a Wilson Line. hep-th/9608154
4. Henningson, M. and Moore, G.: Counting Curves with Modular Forms. hep-th/9602154; Threshold corrections in K3 × T^2 heterotic string compactifications. hep-th/9608145
5. Harvey, J.A. and Moore, G.: Gravitational threshold corrections and the FHSV model. To appear
6. Dijkgraaf, R., Verlinde, E., and Verlinde, H.: Counting dyons in N=4 string theory. hep-th/9607026
7. Sen, A.: A note on marginally stable bound states in Type II string theory. hep-th/9510229
8. Lerche, W., Vafa, C. and Warner, N.P.: Chiral rings in N = 2 superconformal theories. Nucl. Phys. B324, 427 (1989)
9. Witten, E.: Topological quantum field theory. Commun. Math. Phys. 117, 353 (1988)
10. Lian, B.H. and Zuckerman, G.J.: New Perspectives on the BRST Algebraic Structure of String Theory. Commun. Math. Phys. 154, 613 (1993); hep-th/9211072
11. Dabholkar, A. and Harvey, J.A.: Nonrenormalization of the superstring tension. Phys. Rev. Lett. 63, 478 (1989); Dabholkar, A., Gibbons, G., Harvey, J.A. and Ruiz Ruiz, F.: Superstrings and solitons. Nucl. Phys. B340, 33 (1990)
12. Borcherds, R.: Generalized Kac–Moody algebras. J. Alg. 115, 50 (1988)
13. Borcherds, R.: Monstrous moonshine and monstrous Lie superalgebras. Invent. Math. 109, 405 (1992)
14. Frenkel, I.: Representation of Kac–Moody algebras and dual resonance models. In: Applications of Group Theory in Physics and Mathematical Physics. Vol. 21, Lectures in Applied Mathematics, M. Flato, P. Sally, G. Zuckerman, eds., Providence, RI: AMS, 1985
15. Goddard, P. and Olive, D.: Algebras, Lattices, and Strings. In: Vertex operators in mathematics and physics. ed. L. Lepowsky et al., Berlin–Heidelberg–New York: Springer-Verlag, 1985
16. Giveon, A. and Porrati, M.: Duality invariant string algebra and D = 4 effective actions. Nucl. Phys. B355, 422 (1991)
17. Moore, G.: Finite in All Directions. hep-th/9305139; Moore, G.: Symmetries and symmetry-breaking in string theory. hep-th/9308052
18. Moore, G.: Symmetries of the Bosonic String S-Matrix. hep-th/9310026; Addendum to: Symmetries of the Bosonic String S-Matrix. hep-th/9404025
19. Dine, M., Huet, P. and Seiberg, N.: Nucl. Phys. B322, 301 (1989)
20. Giveon, A., Porrati, M. and Rabinovici, E.: Target space duality in string theory. Phys. Rept. 244, 77 (1994)
21. Hull, C. and Townsend, P.: Unity of superstring dualities. Nucl. Phys. B438, 109 (1995)
22. Witten, E.: String theory dynamics in various dimensions. Nucl. Phys. B443, 85 (1995); hep-th/9503124
23. Sen, A.: Nucl. Phys. B450, 103 (1995); hep-th/9504027
24. Harvey, J.A. and Strominger, A.: Nucl. Phys. B449, 535, ERRATUM-ibid 458, 456 (1996); hep-th/9504047
25. Kachru, S. and Vafa, C.: Exact results for N = 2 compactifications of heterotic strings. Nucl. Phys. B450, 69 (1995); hep-th/9505105
26. Ferrara, S., Harvey, J.A., Strominger, A., Vafa, C.: Second-Quantized Mirror Symmetry. Phys. Lett. B361, 59 (1995); hep-th/9505162
27. Witten, E.: Bound states of string and p-branes. Nucl. Phys. B460, 335 (1996); hep-th/9510135
28. Bershadsky, M., Sadov, V. and Vafa, C.: D-Branes and Topological Field Theories. Nucl. Phys. B463, 420 (1996); hep-th/9511222
29. Polchinski, J. and Cai, Y.: Consistency of open superstring theories. Nucl. Phys. B296, 91 (1988)
30. Callan, C.G., Lovelace, C., Nappi, C.R. and Yost, S.A.: Nucl. Phys. B308, 221 (1988)
31. Li, M.: Boundary states of D-branes and Dy-strings. Nucl. Phys. B460, 351 (1996); hep-th/9510161
32. Douglas, M.: Branes within branes. hep-th/9512077
33. Green, M., Harvey, J. and Moore, G.: I-brane inflow and anomalous couplings on D-branes. hep-th/9605033
34. Mukai, S.: Symplectic structure of the moduli of sheaves on an abelian or K3 surface. Invent. Math. 77, 101 (1984); On the moduli space of bundles on K3 surfaces, I. In: Vector Bundles on Algebraic Varieties, Tata Inst. of Fund. Research
35. Chaudhuri, S., Johnson, C. and Polchinski, J.: Notes on D-branes. hep-th/9602052
36. We learned this during discussions with J. Polchinski and A. Strominger
37. Douglas, M.R., Kabat, D., Pouliot, P. and Shenker, S.H.: D-branes and short distances in string theory. hep-th/9608024
38. Losev, A., Moore, G., Nekrasov, N., Shatashvili, S.: Four-Dimensional Avatars of 2D RCFT. hep-th/9509151
39. Griffiths, P. and Harris, J.: Principles of Algebraic Geometry. New York: J. Wiley and Sons, 1978
40. Douglas, M. and Moore, G.: D-Branes, Quivers, and ALE Instantons. hep-th/9603167
41. Indeed, some aspects of the D-brane development were anticipated from this very viewpoint in M. Kontsevich, Homological Algebra of Mirror Symmetry. Proc. of the 1994 International Congress of Mathematicians, Basel–Boston: Birkhäuser, 1995, p. 120; alg-geom/9411018
42. Morrison, D.R.: The geometry underlying mirror symmetry. alg-geom/9608006
43. Segal, G.: Equivariant K-theory and symmetric products. Manuscript, and talk at the Aspen Center for Physics, August, 1996
44. Distler, J., Greene, B. and Morrison, D.: Resolving singularities in (0, 2) models. hep-th/9605222
45. Göttsche, L. and Huybrechts, D.: Hodge numbers of moduli spaces of stable bundles on K3 surfaces. alg-geom/9408001
46. Bruzzo, U. and Maciocia, A.: Hilbert schemes of points on some K3 surfaces and Gieseker stable bundles. alg-geom/9412014
47. Mukai, S.: Moduli of vector bundles on K3 surfaces, and symplectic manifolds. Sugaku Expositions, Vol. 1, 139
48. Donagi, R., Ein, L., Lazarsfeld, R.: A non-linear deformation of the Hitchin dynamical system. alg-geom/9504017
49. Göttsche, L.: Lecture Notes in Mathematics 1572, Hilbert Schemes of Zero-Dimensional Subschemes of Smooth Varieties. Berlin: Springer-Verlag, 1994
50. Ellingsrud, G. and Stromme, S.A.: An intersection number for the punctual Hilbert scheme of a surface. alg-geom/9603015
51. Klemm, A., Lerche, W. and Mayr, P.: K3-fibrations and Heterotic-Type II string duality. Phys. Lett. B357, 313 (1995); hep-th/9596122
52. Vafa, C., Witten, E.: Dual String Pairs With N=1 And N=2 Supersymmetry In Four Dimensions. hep-th/9507050
53. Aspinwall, P. and Louis, J.: On the Ubiquity of K3 fibrations in string duality. Phys. Lett. B369, 233 (1996); hep-th/9510234
54. Aspinwall, P.S.: Enhanced gauge symmetries and Calabi–Yau threefolds. hep-th/9511171
55. Kollar, J.: The structure of algebraic threefolds: An introduction to Mori's program. Bull. Am. Math. Soc. 17, 211 (1987)
56. Nakajima, H.: Homology of moduli spaces of instantons on ALE Spaces. I. J. Diff. Geom. 40, 105 (1990); Instantons on ALE spaces, quiver varieties, and Kac–Moody algebras. Duke Math. 76, 365 (1994); Gauge theory on resolutions of simple singularities and simple Lie algebras. Intl. Math. Res. Not. 2, 61 (1994); Quiver Varieties and Kac–Moody algebras. Preprint; Heisenberg algebra and Hilbert schemes of points on projective surfaces. alg-geom/9507012; Instantons and affine Lie algebras. alg-geom/9502013
57. Ginzburg, V., Kapranov, M. and Vasserot, E.: Langlands reciprocity for algebraic surfaces. q-alg/9502013
58. See, e.g., Landau, L.D. and Lifschitz, E.M.: Quantum Mechanics. London: Pergamon Press, sec. 128
59. The idea that Nakajima's construction would lead to a GKM algebra with root lattice $\Gamma^{20,4}$ first arose in a discussion with R. Dijkgraaf, I. Grojnowski, and S. Shatashvili, in Nov. 1994
60. Gritsenko, V.A., Nikulin, V.V.: K3 Surfaces, Lorentzian Kac–Moody Algebras, and Mirror Symmetry. alg-geom/9510008
61. Grojnowski, I.: Instantons and affine algebras I: The Hilbert scheme and vertex operators. alg-geom/9506020
62. Klebanov, I. and Polyakov, A.: Interaction of discrete states in two-dimensional string theory. Mod. Phys. Lett. A6, 3273 (1991); hep-th/9109032
63. Witten, E.: Ground ring of two-dimensional string theory. Nucl. Phys. B373, 187 (1992); hep-th/9108004
64. Vafa, C.: Instantons on D-branes. Nucl. Phys. B463, 435 (1996); hep-th/9512078
65. Gebert, R.W. and Nicolai, H.: An affine string vertex operator construction at arbitrary level. hep-th/9608014
66. Banks, T., Fischler, W., Shenker, S. and Susskind, L.: To appear
67. Vafa, C. and Morrison, D.: Compactifications of F theory on Calabi–Yau threefolds 2. hep-th/9603161
68. Klemm, A., Mayr, P. and Vafa, C.: BPS States of Exceptional Non-critical strings. hep-th/9607139
69. Furuuchi, K., Kunitomo, H., Nakatsu, T.: Topological Field Theory and Second-Quantized Five-Branes. hep-th/9610016
70. Douglas, M.: Enhanced Gauge Symmetry in M(atrix) Theory. hep-th/9612126
71. Verbitsky, M.: Cohomology of compact hyperkähler manifolds and its applications. alg-geom/9511009
72. Witten, E.: Phase transitions in M-theory and F-theory. hep-th/9603150
73. Donaldson, S.K. and Kronheimer, P.B.: The Geometry of Four-Manifolds. Oxford: Clarendon Press, 1990
Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 197, 521 – 526 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Estimates of Periodic Potentials in Terms of Gap Lengths
Evgeni Korotyaev*
Math. Dept. 2, ETU, 5 Prof. Popov Str., St. Petersburg, 197376, Russia. E-mail: [email protected]
Received: 29 December 1997 / Accepted: 11 February 1998
Dedicated to my teacher Mikhail Birman on the occasion of his 70th birthday
Abstract: Define the Hill operator $T = -d^2/dx^2 + q(x)$ in $L^2(R)$ and suppose $q \in L^2(0,1)$ is a 1-periodic real potential, $\int_0^1 q(x)\,dx = 0$. We prove the estimate $\|q\| \le 2\|\gamma\|(1 + \|\gamma\|^{1/3})$, where $\|\gamma\|^2 = \sum_{n\ge 1}|\gamma_n|^2$ and $|\gamma_n| \ge 0$, $n \ge 1$, is the gap length of T.
1. Introduction
We consider the Hill operator $T = -d^2/dx^2 + q(x)$ in $L^2(R)$, where q is a 1-periodic real potential and $q \in L^2(0,1)$, i.e. $\|q\|^2 = \int_0^1 q(x)^2\,dx < \infty$. It is well known that the spectrum of T is absolutely continuous and consists of intervals $[\alpha^+_{n-1}, \alpha^-_n]$, where $\alpha^+_{n-1} \le \alpha^-_n \le \alpha^+_n$, $n \ge 1$. These intervals are separated by the gaps $\gamma_n = (\alpha^-_n, \alpha^+_n)$, with the length $|\gamma_n| \ge 0$. Assume that $q = q_0 + q_1$, where a constant $q_0 > 0$ is defined by the condition: $\alpha^+_0 = 0$ and $\int_0^1 q_1(x)\,dx = 0$. Let $\varphi(x,\lambda)$, $\vartheta(x,\lambda)$ be the solutions of the equation
$$-y'' + qy = \lambda y, \qquad \lambda \in C, \qquad (1.1)$$
satisfying $\varphi'(0,\lambda) = \vartheta(0,\lambda) = 1$, and $\varphi(0,\lambda) = \vartheta'(0,\lambda) = 0$. Introduce the Lyapunov function $\Delta(\lambda) = (\varphi'(1,\lambda) + \vartheta(1,\lambda))/2$ and note that $\Delta(\alpha^\pm_n) = (-1)^n$, $n \ge 0$. The sequence $\alpha^+_0 < \alpha^-_1 \le \alpha^+_1 < \ldots$ is the spectrum of Eq. (1.1) with the periodic boundary conditions of period 2, i.e. $y(x+2) = y(x)$, $x \in R$. Here the equality means that $\alpha^-_n = \alpha^+_n$ is the double eigenvalue. The eigenfunctions corresponding to $\alpha^\pm_n$ have period 1 when n is even and they are antiperiodic, $y(x+1) = -y(x)$, $x \in R$, when n is odd. It is well known that the sequence $\gamma = \gamma(T) \equiv \{|\gamma_n|\}_1^\infty$ belongs to the Hilbert space $\ell^2 = \{ z = \{z_n\}_1^\infty,\ \|z\|^2 = \sum_{n\ge 1} z_n^2 < \infty \}$ (see e.g. [MO]). We formulate our main result.
* Partially supported by INTAS and Russian Fund of Fundamental Research.
Theorem. Let $q = q_0 + q_1 \in L^2(0,1)$, where a constant $q_0 > 0$ and $\int_0^1 q_1(x)\,dx = 0$. Then the following estimates are fulfilled:
$$\|q_1\| \le 2\|\gamma\| \max\{1, (\|\gamma\|/2)^{1/3}\}, \qquad (1.2)$$
$$\|\gamma\| \le 2\|q_1\| \max\{1, (\|q_1\|/\sqrt{3})^{1/3}\}. \qquad (1.3)$$
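To get a feel for the two-sided estimates, the following sketch evaluates the right-hand sides of (1.2) and (1.3) for given norms and compares (1.2) with the slightly weaker form $2\|\gamma\|(1 + \|\gamma\|^{1/3})$ quoted in the abstract. The function names are illustrative and the inputs are plain non-negative numbers standing for the norms.

```python
def potential_bound(gamma_norm):
    """Right-hand side of (1.2): an upper bound for ||q_1|| given ||gamma||."""
    return 2.0 * gamma_norm * max(1.0, (gamma_norm / 2.0) ** (1.0 / 3.0))


def gap_bound(q1_norm):
    """Right-hand side of (1.3): an upper bound for ||gamma|| given ||q_1||."""
    return 2.0 * q1_norm * max(1.0, (q1_norm / 3.0 ** 0.5) ** (1.0 / 3.0))


def abstract_bound(gamma_norm):
    """The weaker bound 2||gamma||(1 + ||gamma||^(1/3)) from the abstract."""
    return 2.0 * gamma_norm * (1.0 + gamma_norm ** (1.0 / 3.0))


if __name__ == "__main__":
    for g in (0.1, 1.0, 2.0, 16.0, 100.0):
        print(g, potential_bound(g), abstract_bound(g), gap_bound(g))
```

For small norms both bounds are linear with constant 2, while for large norms they grow like the 4/3 power of the norm.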
Remark. i) Estimate (1.2) is new. ii) In [KK2] the estimate $\|\gamma\| \le 2\|q_1\|(1 + \|q_1\|)$ was proved. iii) In the same way it is possible to obtain estimates of potentials in terms of the Sobolev norms. iv) In the paper [GT] Garnett and Trubowitz gave a new method to solve the inverse problem: periodic potentials → gap lengths. In this approach (see also [KK2]) the Gel'fand–Levitan–Marchenko equation was not used but estimates of potentials in terms of "inverse problem data" were important. The proof in [GT] was not complete since the problem of an estimate of $\|q\|$ in terms of $\|\gamma\|$ remained open. Note that for a Schrödinger operator with short range potential the double-sided estimate of the potential in terms of scattering data is absent. v) In [K3] the same estimates for the Dirac operator were found. vi) To prove it we use conformal mappings corresponding to the quasimomentum of the Hill operator, identities (2.3) from [K2] and we reformulate the problem for the differential operator as a problem of the conformal mapping theory. Such a mapping was introduced into the spectral theory of the Hill operator in [F] and [MO]. vii) The embedding Theorem 2.2 is the important point in the proof.
We now give the corollary from the theorem. Define the Hill operator $T_r = -d^2/dx^2 + q(x, r)$ in $L^2(R)$, where $q(x, r) = q_1(x) + q_2(x/r)$ and $q_1, q_2 \in L^2(0,1)$ are 1-periodic real potentials with the conditions: $\int_0^1 q_n(x)\,dx = 0$, n = 1, 2. The parameter $r \ge 1$ is an integer. From the physical point of view the operator $T_1 = -d^2/dx^2 + q_1(x)$ is unperturbed and $q_1$ is the potential of a "solid state". The perturbation $q_2(x/r)$ is the slowly oscillating potential (an external field). If $q_2 \not\equiv 0$ then roughly speaking in each band of $T_1$ there exist r gaps of $T_r$. It is possible to estimate all these gap lengths in terms of the potentials.
Corollary. Let $q_1, q_2 \in L^2(0,1)$. Then for each integer $r \ge 1$ the following estimate is fulfilled:
$$\|\gamma(T_r)\| \le 2M(1 + r^{2/3}M^{1/3}), \qquad M = \|q_1\| + \|q_2\|. \qquad (1.4)$$
Let in addition $\|q_2\| \le \|q_1\|/2$ or $\int_0^1 q_1(rx)q_2(x)\,dx = 0$. Then
$$M \le 6\|\gamma(T_r)\|(1 + r^{2/3}\|\gamma(T_r)\|^{1/3}). \qquad (1.5)$$
Proof. Introducing the variable y = x/r and multiplying $T_r$ by $r^2$ we obtain the Hill operator $L_r = -d^2/dy^2 + V_r(y)$ with the 1-periodic potential $V_r(y) = r^2(q_1(ry) + q_2(y))$. Using estimate (1.3) for the operator $L_r$ we get $\|\gamma(L_r)\| \le 2\|V_r\|(1 + \|V_r\|^{1/3})$, and since $r^2\|\gamma(T_r)\| = \|\gamma(L_r)\|$ and $\|V_r\| \le r^2(\|q_1\| + \|q_2\|)$, we have (1.4). The same consideration, estimate (1.2) and the estimate $\|V_r\| \ge r^2(\|q_1\| - \|q_2\|) \ge r^2 M/3$ imply (1.5) in the first case. In the second case, instead of the last inequality we use the identity $\|V_r\|^2 = r^4(\|q_1\|^2 + \|q_2\|^2)$.
2. Proofs

We describe the properties of the quasimomentum $k(\cdot)$, which is defined by the formula
$$k(w) = \arccos \Delta(w^2), \quad w \in W = \mathbb{C} \setminus \cup g_n, \qquad g_n = (a_n^-, a_n^+) = -g_{-n}, \quad a_n^\pm = \sqrt{\alpha_n^\pm} > 0, \quad n \ge 1,$$
see [F, MO]. The function $k(w)$ is a conformal mapping from $W$ onto a quasimomentum domain $K = \mathbb{C} \setminus \cup c_n$, where $c_n = (\pi n + i h_n, \pi n - i h_n)$ is an excised slit of height $h_n \ge 0$, $n \in \mathbb{Z}$, $h_0 = 0$. The function $k$ maps the gap $g_n$ onto the slit $c_n$, and $k(-w) = -k(w)$, $w \in W$. Let $w(k)$ be the inverse function for $k(w)$ and let $k = p + is$, $w = u + iv$. The function $s_0(u) \equiv s(u + i0) > 0$, $u \in g_n \ne \emptyset$, $n \in \mathbb{Z}$, and $s_0(u) = 0$, $u \in \mathbb{R} \setminus \cup g_n$; note that $s(u - i0) = -s_0(u) \le 0$, $u \in \mathbb{R}$. The function $s_0(u)$, $u \in g_n$, attains its maximum at the point $a_n \in g_n$, and $h_n = s_0(a_n)$. Note that $s_0(t)$ is even and $p(t)$ is odd in $t \in \mathbb{R}$, and $p(t) = \pi n$, $t \in g_n$. Introduce the functions $\xi(k) = w(k)(k - w(k))$, $E(k) = k^2 - w(k)^2$, $k \in K$. Later we need the identities
$$\|q\|^2 = \frac{8}{\pi} \int t^2 s_0(t)\,dt, \qquad \frac{1}{\pi} \int s_0(t)\,dt = Q_0 \equiv \frac{1}{2} \int_0^1 q(x)\,dx, \tag{2.1}$$
$$\|q_1\|^2 = \frac{8}{\pi} \int t\,p(t)\,s_0(t)\,dt = \frac{1}{\pi} \iint |E(k)'|^2\,dp\,ds = 2\|q\|^2 - \frac{4}{\pi} \iint |\xi(k)'|^2\,dp\,ds, \tag{2.2}$$
(2.1) from [MO] and (2.2) from [K2]. Here and below, an integral with no limits indicated denotes integration over R or R2 . We define the numbers h+ = sup hn , b = − + ∞ max{1, h+ /π}, and the sequences H = {Hn }∞ 1 , Hn = (an + an )hn , and H = {Hn }1 , Hn = 2πnhn . Let us prove ”the embedding theorem”, the basic estimates in the paper. Theorem 2.1. Let q ∈ L2 (0, 1). Then the following estimates are fulfilled: √ kHk 6 (π 3b/2)kqk, √ kHk 6 (π b/2)kq1 k.
(2.3) (2.4)
Proof. First give the needed result from [K3]. Let a domain D(h) = (0, π) × (0, h) for some h > 0 and let a real function f obey the conditions: a) f (p, s) is continuous in ¯ D(h), and f belongs to the Sobolev space W12 (D(h)), b) f (p, 0) = 0 for all p ∈ (0, π), Then Z Z Z h π h |f (0, s)|2 ds 6 max 1, |∇f |2 dpds. (2.5) s 2 π D(h) 0 Introduce the function f (k) = Imξ = su(k) + pv(k) − 2u(k)v(k) and the domain Dn = (πn, π(n + 1)) × (0, hn ), n > 1. Fix n > 1 for some hn > 0. The function f (k) ¯ n , since w(k) is continuous in D ¯ n . The identity v(p + i0) = 0, p ∈ R, is continuous in D yields f (p+i0) = 0, p ∈ R. Assume that z(s) ≡ u(πn+0+is), s ∈ B ≡ [0, hn ], is convex upward. Then z(s) > y(s) for any s ∈ B, where y(s) = a+n −ts, t = (a+n −an )/hn . Using
the simple estimate Hn2 < 12 sz(s), s ∈ B, we obtain X n>1
Hn2 6 12
R hn 0
XZ n>1
3πb 6 2
sy(s)2 ds, and (2.5) for the function f (πn + 0 + is) =
hn
sz 2 (s)ds 6 6πb
0
Z Z
XZ Z n>1
Dn
|ξ(k)0 |2 dpds
|ξ(k)0 |2 dpds,
and (2.2) yields (2.3). We have to prove the convexity of z. It is well known that s000 (u) < 0 + 0, u ∈ gn , and s00 (u) > 0, u ∈ (a− n , an ) and s0 (u) < 0, u ∈ (an , an ) (see e.g. [K1]). 0 0 00 00 0 3 00 00 00 Therefore, w = 1/k and w = −k /(k ) imply z = −(u)pp = −s0 (u)(s00 (u))−3 < 0, for s ∈ B, and z is convex. To prove (2.4) we apply estimate (2.5) for the function F = Im E(k) = 2(ps − u(k)v(k)) and the domain Dn . Note that F is continuous in C and the identity v(p + i0) = 0, p ∈ R, implies F (p + i0) = 0, p ∈ R. Then using F (πn + is) = 2πns, 0 < s < hn , and (2.5), we obtain Z hn Z Z 2 2 (2πn) sds 6 πb |E(k)0 |2 dpds, (2πnhn ) = 2 Dn
0
summing and using (2.2) we get (2.4).
We prove the theorem with few additional results. Theorem 2.2. Let q ∈ L2 (0, 1). Then (1.2-3) and the additional estimates are fulfilled: √ 1 √ kγk 6 kqk 6 rkq1 k, 2
r =1+
2h+ , π
(2.6)
h2+ 6 kqk,
(2.7)
p kqk 6 8 b/3kγk,
(2.8)
√ kq1 k 6 2 bkγk.
(2.9) p + Proof. The estimates s0 (t) > (t − a− n )(an − t) > 0, t ∈ gn , (see [KK1]), yield Z Z q π − π + (an + a+n )2 |gn |2 = t2 s0 (t)dt > t2 (t − a− |γn |2 , n > 1, n )(an − t)dt > 32 32 gn gn summing and using (2.1) we get first the estimate in (2.6). In order to give the Pndetailed analysis we reprove the last inequality from [KK1]. Due to the identity a+n = 1 (|σm | + |gm |), where σm = [a+m−1 , a− m ], and the simple estimates |σm | 6 π, |gm | 6 2hm (see [MO]) we have t 6 rπn = rp(t), t ∈ gn , n > 1. Therefore, Z Z 8 8r t2 s0 (t)dt 6 tp(t)s0 (t)dt = rkq1 k2 . kqk2 = π π We prove (2.7). It is well known that the function v(k) = Imw(k) is positive and harmonic in ”the comb” K+ = C+ ∩ K, and v(k) = 0, k ∈ ∂K+ and v(is) = s + (Q0 + o(1))/s
as s → ∞ (see e.g. [MO]). Assume that h+ = hm for some m > 1. Introduce the ˆ + = C+ \ cm , and let wˆ be the corresponding conformal mapping from K ˆ+ new comb K ˆ onto C+ . Then the function v(k) ˆ = Imw(k) ˆ is positive and harmonic in the comb K+ , ˆ + and v(is) and v(k) ˆ = 0, k ∈ ∂ K ˆ = s+p (Qˆ 0 + o(1))/s as s → ∞. Moreover, since the corresponding imaginary part sˆ0 (u) = h2+ − y 2 > 0, y = u − a ∈ (−h+ , h+ ) and R sˆ0 (u) = 0, y ∈ R \ (−h+ , h+ ), for some a > 0, we have h2+ = 2 sˆ0 (u)du/π = 2Qˆ 0 . The maximum principle gives v(k) ˆ < v(k), k ∈ K+ , and comparing the asymptotics of v(is), ˆ v(is), as s → ∞, we obtain h2+ = 2Qˆ 0 < 2Q0 (see [KK3]). Therefore, (2.1) yields h2+ 6 q0 6 kqk. 2 Let r 6 2, then (2.6) r > 2, then p implies (1.3). Let 1/3 √ (2.6-7)√yield h+ 6 1/2 2 (4h+ /π) kq1 k, and then 2h+ /π 6 (kq1 k/c) , c = π / 32 > 3, which together with (2.6) yields (1.3). Due to the relation 0 < s0 (t) 6 hn , t ∈ gn , and identity (2.2) we deduce that XZ X Z π kqk2 = t2 s0 (t)dt 6 hn t2 dt 16 gn gn n>1
n>1
X hn X1 3 ((a+n )3 − (a− = |γn |Hn , n) ) 6 3 3 n>1
n>1
and then 3πkqk2√6 16kγkkHk. Substituting (2.3) into the last estimate we have 3πkqk2 6 8kγkπ 3bkqk, which yields (2.8). Using the estimate s0 (u) 6 hn , u ∈ gn , and identity (2.2) we obtain XZ X 2 kq1 k = (4/π) 2πns0 (u)du2 6 (4/π) 2πnhn |γn | 6 (4/π)kHkkγk, n>1
gn
n>1
and relation (2.4) yields (2.9). Assume that b √ 6 1, then √ (2.9) implies √ (1.2). Suppose b > 1, then using (2.7-8) we deduce that b2 6 bkγk(π 2 3/8) 6 bkγk/2. Then b3/2 < kγk/2 and (2.9) gives (1.2). Remark. In order to obtain the more exact constants in (1.2-3) we prove (2.4) and (2.9). Acknowledgement. The author would like to thank H. Kn¨orrer and E. Trubowitz (ETH Zurich) and T. Hoffmann-Ostenhof (ESI, Vienna) for their hospitality where the various parts of this paper were written.
References
[F] Firsova, N.: Riemann surface of quasimomentum and scattering theory for the perturbed Hill operator. J. Soviet Math. 11, 487–497 (1979)
[GT] Garnett, J., Trubowitz, E.: Gaps and bands of one dimensional periodic Schrödinger operator II. Comment. Math. Helv. 62, 18–37 (1987)
[KK1] Kargaev, P., Korotyaev, E.: Effective masses and conformal mappings. Commun. Math. Phys. 169, 597–625 (1995)
[KK2] Kargaev, P., Korotyaev, E.: Inverse Problem for the Hill Operator, the Direct Approach. Invent. Math. 129, no. 3, 567–593 (1997)
[KK3] Kargaev, P., Korotyaev, E.: The Inverse Problem Generated by the Conformal Mapping on the Complex Plane with Parallel Slits. To appear
[K1] Korotyaev, E.: Propagation of the waves in the one-dimensional periodic media. Asymptotic Analysis 15, 1, 1–24 (1997)
[K2] Korotyaev, E.: The Estimates of Potential in Terms of the Effective Masses. Commun. Math. Phys. 183, 383–400 (1997)
[K3] Korotyaev, E.: The metric properties of the conformal mapping on the complex plane with parallel slits. IMRN 10, 493–503 (1996)
[MO] Marchenko, V., Ostrovski, I.: A characterization of the spectrum of the Hill operator. Math. USSR Sbornik 26, 493–554 (1975)
Communicated by B. Simon
Commun. Math. Phys. 197, 527 – 551 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Asymptotic Behaviour of the Ground State of Singularly Perturbed Elliptic Equations Andrey L. Piatnitski P.N. Lebedev Physical Institute RAS, Leninski prospect 53, Moscow 117333, Russia. E-mail: [email protected] Received: 6 February 1998 / Accepted: 16 February 1998
Abstract: The ground state of a singularly perturbed nonselfadjoint elliptic operator $\mu^2 \nabla_i a^{ij}(x) \nabla_j + \mu b^i(x) \nabla_i + v(x)$, defined on a smooth compact Riemannian manifold with metric $a_{ij}(x) = (a^{ij}(x))^{-1}$, is studied. We investigate the limiting behaviour of the first eigenvalue of this operator as $\mu$ goes to zero, and find the logarithmic asymptotics of the first eigenfunction everywhere on the manifold. The results are formulated in terms of auxiliary variational problems on the manifold. This approach also allows one to study the general singularly perturbed second order elliptic operator on a bounded domain in $\mathbb{R}^n$.

0. Introduction

Let $M$ be an $n$-dimensional smooth compact Riemannian manifold endowed with metric $a_{ij}(x)$. Consider the following eigenvalue problem
$$A_\mu p = \frac{1}{4} \mu^2 |a(x)|^{-1/2} \frac{\partial}{\partial x^i} a^{ij}(x) |a(x)|^{1/2} \frac{\partial}{\partial x^j} p(x) + \mu b^i(x) \frac{\partial}{\partial x^i} p(x) + v(x) p(x) = -\lambda p(x) \tag{0.1}$$
on $M$, where $\mu > 0$ is a small parameter; $(a^{ij}(x)) = (a_{ij}(x))^{-1}$, $|a(x)| = \det\{a_{ij}(x)\}$ and $|a(x)|^{-1/2} \frac{\partial}{\partial x^i} a^{ij}(x) |a(x)|^{1/2} \frac{\partial}{\partial x^j}$ is the Laplace–Beltrami operator on $M$ (see [1], [8] for the relevant definitions). The coefficients $a^{ij}(x)$, $b^i(x)$, $v(x)$ are supposed to be continuously differentiable real functions and the matrix $a^{ij}(x)$ is uniformly positive:
$$a^{ij}(x) \xi_i \xi_j \ge c |\xi|^2, \qquad \xi \in \mathbb{R}^n, \ c > 0.$$
Thus, we deal with a singularly perturbed uniformly elliptic operator.
It is well known (see, for example, [15]) that for any fixed $\mu > 0$ the operator $A_\mu$ has a discrete spectrum $\{\lambda_0, \lambda_1, \dots\}$, $\operatorname{Re} \lambda_k \to +\infty$ as $k \to \infty$. According to [6], the first eigenvalue (i.e., the eigenvalue with the smallest real part) $\lambda_0$ is simple and real, and the first eigenfunction $p_0(x)$ is also real and does not change sign. It then follows from the maximum principle that $p_0(x)$ can be chosen positive (see [4]). We assume, without loss of generality, the following normalizing conditions to be satisfied:
$$\int_M p_0(x)\, m(dx) = 1, \qquad \int_M m(dx) = 1, \qquad \max_M v(x) = 0, \tag{0.2}$$
where $m(dx) = |a(x)|^{1/2} dx^1 \dots dx^n$ in local coordinates.

This paper is aimed at an investigation of the asymptotic behaviour of $\lambda_0$ and $p_0(x)$ as $\mu \to 0$. One of the most important applications of the results obtained here is a study of the large-time behaviour of solutions to the Cauchy problem for singularly perturbed parabolic equations. The growth or decay rate of the solutions, as well as their limiting shape, can be described in terms of the corresponding ground state. Moreover, in the case of a torus, these asymptotics play a significant role in homogenization theory (see [5], [12], [14]). Indeed, as demonstrated in [4], [7], the homogenization problem for $A_\mu$ can be reduced to the homogenization problem for a certain operator involving no zero-order term. Then, the standard homogenization technique can be applied. The coefficients of the latter operator depend on the ground state of $A_\mu$ and, therefore, in order to investigate the limit behaviour of the effective coefficients of $A_\mu$ for small $\mu$, one should know the asymptotics of the ground state.

If the operator $A_\mu$ is selfadjoint, i.e. if $b(x) \equiv 0$, one can use the variational technique in order to find the limit of $\lambda_0$. In contrast with selfadjoint operators, in the case under consideration the standard variational approach cannot be applied, so even the study of the first eigenvalue becomes nontrivial. Both the behaviour of $\lambda_0$ and the asymptotics of $p_0(x)$ are described here in terms of auxiliary variational problems on the manifold $M$. In particular, we give a simple necessary and sufficient condition for convergence of $\lambda_0$ to zero (recall that it is always so in the selfadjoint case). We also show that under certain conditions $p_0(x)$ decays exponentially at almost all points of $M$, and give the corresponding asymptotics in a logarithmic scale.

Previously, closely related eigenvalue problems in a smooth bounded domain in $\mathbb{R}^n$ for a singularly perturbed operator of special form
$$\mu a^{ij}(x) \frac{\partial}{\partial x^i} \frac{\partial}{\partial x^j} + b^i(x) \frac{\partial}{\partial x^i} \tag{0.3}$$
were considered in [9]–[11], where only the asymptotics of the first eigenvalue were analyzed, but the first eigenfunction was not considered. Note that this operator is, in fact, a special case of (0.1). Indeed, it suffices to divide (0.1) by µ and set v(x) ≡ 0. It turns out that the limit behaviour of λ0 (µ) in this special case depends crucially on the geometry of the integral curves of the equation x˙ = b(x), especially near the boundary of the domain. If, for instance, b(x) · ν > 0 at the boundary, where ν is the inner normal, i.e. if all the trajectories starting in the domain never leave it, then, as shown in [10], [11], λ0 (µ) decays exponentially as µ → 0 and, under additional assumptions, its logarithmic asymptotics can be calculated with the help of the rate functional for the corresponding diffusion process. On the other hand, if there is a smooth φ(x) such that b(x) · ∇φ(x) > 0 in the entire domain then, according to [9], λ0 (µ) > c for some c > 0 (note that our normalization differs from [9]).
Although in some special cases the methods developed in the cited papers yield precise asymptotics of the first eigenvalue, they only work for operators with $v(x) \equiv 0$ and for domains with non-empty boundary. Even in those cases, the asymptotic upper and lower bounds for the lowest eigenvalue need not coincide. Moreover, there are no results on the corresponding eigenfunction. In the present paper we use another approach which combines various variational methods with the large deviation technique. This approach can also be applied to operators defined in a bounded domain. In particular, the rough asymptotics of $\lambda_0(\mu)$ can be found for the Dirichlet problem for an arbitrary elliptic operator of the form (0.1) or (0.3).

In Sect. 1 we introduce auxiliary variational functionals and study their properties. The basic result here is the existence of the following limit
$$\hat\lambda = \lim_{T \to \infty} \inf_{x(\cdot)} \frac{1}{T} \int_0^T \bigl[ a_{ij}(x(t)) (\dot x^i - b^i(x(t))) (\dot x^j - b^j(x(t))) - v(x(t)) \bigr]\, dt,$$
where the infimum is taken over all smooth curves on the manifold. In Sect. 2 we prove the convergence of the first eigenvalue as $\mu \to 0$. The main assertion here, Theorem 1, states that
$$\lim_{\mu \to 0} \lambda_0(\mu) = \hat\lambda.$$
Section 3 is devoted to the investigation of $p_0(x)$. Under an additional condition on the operator (see the definition of a recursive operator below), the logarithmic asymptotics of $p_0(x)$, uniform over the manifold, is constructed. This condition concerns the set of accumulation points of trajectories satisfying the relation
$$\sup_T \Bigl( \int_0^T \bigl[ a_{ij}(x(t)) (\dot x^i - b^i(x(t))) (\dot x^j - b^j(x(t))) - v(x(t)) \bigr]\, dt - \hat\lambda T \Bigr) < \infty.$$
Namely, the intersection of all such sets is assumed to be nonempty.

Section 4 is of special interest. Here we consider the operators with "potential" first order terms:
$$b^i(x) = a^{ij}(x) \frac{\partial}{\partial x^j} U(x), \qquad i = 1, 2, \dots, n,$$
for a smooth function $U(x)$ on $M$, i.e. $b(x)$ is the gradient of $U(x)$ in the metric $a_{ij}(x)$. Then the problem of finding $\hat\lambda$ takes the following algebraic form:
$$\hat\lambda = \min_{x \in M} \Bigl[ a^{ij}(x) \frac{\partial}{\partial x^i} U(x) \frac{\partial}{\partial x^j} U(x) - v(x) \Bigr].$$
Moreover, an operator with potential first order terms is recursive if and only if the minimum point of the above expression is unique; therefore, in this case, the recursiveness condition is that of a general position.

Section 5 contains results about selfadjoint operators ($b(x) \equiv 0$). These results have been proved in [12] by other methods and are included in this paper for the sake of completeness. It should be noted that in the case of selfadjoint operators the logarithmic asymptotics of $p_0(x)$ admits a simple geometric interpretation:
$$\lim_{\mu \to 0} \mu \ln p_0(x) = -\operatorname{dist}_{|v| a_{ij}}(x, x_0),$$
where $x_0$ is the unique maximum point of $v(x)$, $v(x_0) = 0$, and the distance is taken in the metric $|v(x)| a_{ij}(x)$.

1. Auxiliary Variational Problems

For absolutely continuous curves $x(t) = (x^1(t), \dots, x^n(t))$, $0 \le t \le T$, on $M$, we define the functional $I(x(\cdot), T)$ as follows:
$$I(x(\cdot), T) = \int_0^T \bigl[ a_{ij}(x(t)) (\dot x^i - b^i(x(t))) (\dot x^j - b^j(x(t))) - v(x(t)) \bigr]\, dt;$$
here $a_{ij}(x)$ is the inverse matrix to $a^{ij}(x)$; recall that $a_{ij}$ is the metric on $M$. Let us extend $I(x(\cdot), T)$ to the space $C(0, T; M)$ of continuous functions by setting $I(x(\cdot), T) = \infty$ for all other $x(\cdot)$. It is easy to see that for every fixed $T > 0$ the functional $I(x(\cdot), T)$ maps $C(0, T; M)$ into $(0, +\infty)$. In what follows, we also use the functionals
$$S(x, y, T) = \inf_{x(\cdot),\, x(0) = x,\, x(T) = y} I(x(\cdot), T),$$
$$m(T) = \inf_{x, y \in M} S(x, y, T), \qquad M(T) = \inf_{x \in M} S(x, x, T),$$
$$I_0(x(\cdot), T) = I(x(\cdot), T) + \int_0^T v(x(t))\, dt = \int_0^T a_{ij}(x(t)) (\dot x^i - b^i(x(t))) (\dot x^j - b^j(x(t)))\, dt.$$
Taking the curve $x(\cdot) = \mathrm{const}$ as a trial function in the definitions of $m(T)$ and $M(T)$ we obtain
$$m(T) \le M(T) \le c_0 T \tag{1.1}$$
with a constant $c_0 \ge 0$. Hence,
$$\limsup_{T \to \infty} \frac{m(T)}{T} \le \limsup_{T \to \infty} \frac{M(T)}{T} \le c_0. \tag{1.2}$$
In fact, the limits $\lim_{T \to \infty} m(T)/T$ and $\lim_{T \to \infty} M(T)/T$ exist and coincide.

Lemma 1. The functions $m(T)/T$ and $M(T)/T$ converge to the same limit $\hat\lambda \ge 0$ as $T \to \infty$. The inequality
$$m(T) \le \hat\lambda T \le M(T) \tag{1.3}$$
holds for all $T > 0$, and there exists a sequence $t_k \to \infty$ such that $\lim_{k \to \infty} |M(t_k) - \hat\lambda t_k| = 0$.
Proof. The main idea of the proof is to use sub- and super-additivity of $m(t)$ and $M(t)$, respectively, in combination with the upper bound for $|M(t) - m(t)|$ obtained in Proposition 2 below. Propositions 1 and 3 provide relevant technical estimates and Proposition 4 states that the function $S(x, y, t)$ is Lipschitz continuous in all the variables.

The following relations
$$M(kt) \le k M(t), \qquad m(kt) \ge k m(t) \tag{1.4}$$
hold for any $t > 0$ and any integer $k > 0$. To prove the first one, let us rewrite the definition of $M(t)$ in the form $M(t) = \inf_{x(\cdot),\, x(0) = x(t)} I(x(\cdot), t)$ and note that the infimum in the last relation can be replaced by the minimum, i.e. that the infimum is attained. Indeed, in view of the positive definiteness of $a_{ij}(x)$ and the compactness of $M$, any minimizing sequence $\{x_k(\cdot)\}$ for $M(t)$ is uniformly bounded and, therefore, weakly compact in the functional space $H^1(0, t)$ endowed with the norm
$$\|x(\cdot)\|^2 = \int_0^t \bigl( x^2(s) + \dot x^2(s) \bigr)\, ds.$$
Passing to the weak limit as $k \to \infty$ and taking into account the weak semicontinuity of $I(x(\cdot), t)$, we obtain the desired curve. Then, iterating the closed curve $\bar x(\cdot)$ which provides the minimum for $M(t)$ and using the same notation $\bar x(\cdot)$ for the curve obtained, we find that $M(kt) \le I(\bar x(\cdot), kt) = k I(\bar x(\cdot), t) = k M(t)$ for any positive integer $k$. To prove the second inequality in (1.4), it suffices to consider the curve $\tilde x(\cdot)$ which provides the minimum for $m(kt)$ and to divide the interval $(0, kt)$ into $k$ equal parts:
$$m(kt) = I(\tilde x(\cdot), kt) = \sum_{l=0}^{k-1} I(\tilde x(\cdot + lt), t) \ge k m(t).$$
Set $\bar\lambda = \limsup_{T \to \infty} M(T)/T$ and $\underline\lambda = \liminf_{T \to \infty} M(T)/T$, and suppose that the difference $\delta = \bar\lambda - \underline\lambda$ is positive. Then for some sequences $t'_k \to \infty$ and $t''_k \to \infty$ we have
$$\lim_{k \to \infty} \frac{M(t'_k)}{t'_k} = \bar\lambda, \qquad \lim_{k \to \infty} \frac{M(t''_k)}{t''_k} = \underline\lambda.$$
Let us fix $T > 1$ such that $M(T)/T < \underline\lambda + \delta/4$. Then by (1.4) we have $M(kT)/(kT) < \underline\lambda + \delta/4$ for all positive integers $k$. On the other hand, for sufficiently large $k$, we have
$$M(t'_k)/t'_k > \bar\lambda - \delta/4, \qquad \bigl( t'_k - T [t'_k/T] \bigr)/t'_k < T/t'_k < \delta/(4 c_1), \tag{1.5}$$
where $c_1 = \max_{x \in M} \bigl( a_{ij}(x) b^i(x) b^j(x) + v(x) \bigr) + 1$ and $[\cdot]$ means the integer part. Now let us iterate $[t'_k/T]$ times the curve that provides the minimum for $M(T)$ and extend it as a constant. Using the curve obtained as a trial function in the definition of $M(t'_k)$ and taking into account (1.5) and the choice of $T$, we find that
$$\bar\lambda - \delta/4 < M(t'_k)/t'_k = M\bigl( [t'_k/T] T + (t'_k - [t'_k/T] T) \bigr)/t'_k \le [t'_k/T] M(T)/t'_k + c_1 (t'_k - [t'_k/T] T)/t'_k \le M(T)/T + c_1 \delta/(4 c_1) \le \underline\lambda + \delta/2.$$
This contradicts the fact that $\delta > 0$. Thus $\bar\lambda = \underline\lambda$ and the limit $\lim_{t \to \infty} (M(t)/t)$ does exist.

The existence of $\lim_{t \to \infty} (m(t)/t)$ is a consequence of the following statements.
Proposition 1. The inequality
$$S(x, y, T) \le c \Bigl( T + \frac{1}{T} \Bigr)$$
holds uniformly in $x, y \in M$.

Proof. Let us first note that in view of the compactness of $M$ and the positive definiteness of $a_{ij}$, we have
$$\inf_{x(\cdot),\, x(0) = x,\, x(T) = y} \int_0^T a_{ij}(x(t)) \dot x^i(t) \dot x^j(t)\, dt = \frac{1}{T} \inf_{x(\cdot),\, x(0) = x,\, x(1) = y} \int_0^1 a_{ij}(x(t)) \dot x^i(t) \dot x^j(t)\, dt \le \frac{c}{T} \operatorname{dist}(x, y),$$
where $\operatorname{dist}(x, y)$ is the geodesic distance in the metric $a_{ij}$, and $c$ does not depend on $x$ and $y$. Then, reducing the set of trial curves in the definition of $S(x, y, T)$ to the single curve $\hat x(\cdot)$ that provides a minimum in the above problem, we obtain
$$\begin{aligned}
S(x, y, T) \le I(\hat x(\cdot), T) &= \int_0^T a_{ij}(\hat x(t)) \dot{\hat x}^i(t) \dot{\hat x}^j(t)\, dt - 2 \int_0^T a_{ij}(\hat x(t)) \dot{\hat x}^i(t) b^j(\hat x(t))\, dt \\
&\quad + \int_0^T \bigl[ a_{ij}(\hat x(t)) b^i(\hat x(t)) b^j(\hat x(t)) - v(\hat x(t)) \bigr]\, dt \\
&\le 2 \int_0^T a_{ij}(\hat x(t)) \dot{\hat x}^i(t) \dot{\hat x}^j(t)\, dt + \int_0^T \bigl[ 2 a_{ij}(\hat x(t)) b^i(\hat x(t)) b^j(\hat x(t)) - v(\hat x(t)) \bigr]\, dt \\
&\le \frac{2c}{T} \operatorname{dist}(x, y) + 2 c_1 T \le c \Bigl( T + \frac{1}{T} \Bigr).
\end{aligned}$$
The last inequality follows from the compactness of $M$.
Proposition 2. The inequality $m(t) \le M(t) \le m(t) + c$ holds uniformly in $t > 0$.

Proof. By Proposition 1, the function $S(x, y, 1)$ is bounded uniformly in $x$ and $y$. Let $\tilde x(\cdot)$ be a curve providing the minimum for $m(t)$. Combining $\tilde x(\cdot)$ with a curve which provides the minimum for $S(\tilde x(t), \tilde x(0), 1)$ and using the curve obtained as a trial function in the definition of $M(t + 1)$ we find $M(t + 1) \le m(t) + S(\tilde x(t), \tilde x(0), 1) \le m(t + 1) + c_2$; here the monotonicity of $m(t)$ has also been used.
The relation $\lim_{t \to \infty} (m(t)/t) = \lim_{t \to \infty} (M(t)/t)$ easily follows from Proposition 2.

Further, the estimate $m(t) \le \hat\lambda t \le M(t)$ follows from (1.4). Indeed, if $M(t_0) < \hat\lambda t_0$ for some $t_0 > 0$, then $M(t_0) = (\hat\lambda - \delta') t_0$ for some $\delta' > 0$. By (1.4) we have $M(k t_0)/(k t_0) \le \hat\lambda - \delta'$ for all positive integers $k$. Therefore, $\lim_{t \to \infty} M(t)/t \le \hat\lambda - \delta'$, in contradiction with the definition of $\hat\lambda$. The estimate $m(t) \le \hat\lambda t$ can be derived from (1.4) in the same way. The last assertion of the lemma relies on the following statements.

Proposition 3. The inequality
$$\int_0^T a_{ij}(x(t)) \dot x^i \dot x^j\, dt \le c_2 I(x(\cdot), T) + c_3 T \tag{1.6}$$
holds uniformly in $T > 0$ and $x(\cdot) \in C(0, T; M)$.

Proof. By the Schwarz inequality,
$$\begin{aligned}
I(x(\cdot), T) &= \int_0^T a_{ij}(x(t)) \dot x^i \dot x^j\, dt - 2 \int_0^T a_{ij}(x(t)) \dot x^i b^j(x(t))\, dt + \int_0^T \bigl[ a_{ij}(x(t)) b^i(x(t)) b^j(x(t)) - v(x(t)) \bigr]\, dt \\
&\ge \int_0^T a_{ij}(x(t)) \dot x^i \dot x^j\, dt - \frac{1}{2} \int_0^T a_{ij}(x(t)) \dot x^i \dot x^j\, dt - c_4 \int_0^T a_{ij}(x(t)) b^i(x(t)) b^j(x(t))\, dt \\
&\quad + \int_0^T \bigl[ a_{ij}(x(t)) b^i(x(t)) b^j(x(t)) - v(x(t)) \bigr]\, dt \\
&\ge \frac{1}{2} \int_0^T a_{ij}(x(t)) \dot x^i \dot x^j\, dt - c_5 \int_0^T \bigl[ a_{ij}(x(t)) b^i(x(t)) b^j(x(t)) - v(x(t)) \bigr]\, dt.
\end{aligned}$$
In view of the boundedness of the coefficients of $A_\mu$ this inequality implies (1.6).
Proposition 4. For each $T_0 > 0$, the function $S(x, y, t)$ is uniformly Lipschitz continuous on the set $M \times M \times [T_0, \infty)$.

Proof. First, let us establish the estimate
$$|S(x, y, t') - S(x, y, t'')| \le L |t' - t''| \tag{1.7}$$
with a constant $L$ that depends only on $T_0$. To this end we note that, after a simple transformation, the difference $I(x(\cdot), t') - I(x(\tfrac{t''}{t'} \cdot), t'')$ can be written as
$$I(x(\cdot), t') - I(x(\tfrac{t''}{t'} \cdot), t'') = \frac{t' - t''}{t'} \int_0^{t'} a_{ij}(x(t)) \dot x^i \dot x^j\, dt + \frac{t'' - t'}{t''} \int_0^{t'} \bigl[ a_{ij}(x(t)) b^i(x(t)) b^j(x(t)) - v(x(t)) \bigr]\, dt.$$
From this relation, substituting the curve $x(\cdot)$ that provides the minimum for $S(x, y, t')$ and using Propositions 1 and 3, we get
$$\begin{aligned}
S(x, y, t'') - S(x, y, t') &\le I(x(\tfrac{t''}{t'} \cdot), t'') - I(x(\cdot), t') \\
&\le \frac{|t' - t''|}{t'} \Bigl( c c_2 \Bigl( t' + \frac{1}{t'} \Bigr) + c_3 t' \Bigr) + \frac{|t' - t''|}{t''}\, c t' \\
&\le c |t' - t''| \Bigl( 1 + \frac{1}{T_0^2} + \frac{t'}{t''} \Bigr).
\end{aligned}$$
If we suppose that $|t' - t''|$ is bounded, say by $T_0$, then $t'/t'' \le 2$ and the last estimate takes the form
$$S(x, y, t'') - S(x, y, t') \le c |t' - t''| \Bigl( 3 + \frac{1}{T_0^2} \Bigr).$$
It remains to note that (1.7) for arbitrary $|t' - t''|$ follows from (1.7) for sufficiently small $|t' - t''|$.

Similarly, it suffices to verify the inequality
$$|S(x, y', t) - S(x, y'', t)| \le L \operatorname{dist}(y', y'') \tag{1.8}$$
for sufficiently small $\operatorname{dist}(y', y'')$. This allows us to use the same local coordinates for $y'$ and $y''$. In particular, we can write $|y' - y''|$ instead of $\operatorname{dist}(y', y'')$. Let $x(\cdot)$ be a curve minimizing $S(x, y', t)$. Extending this curve to the interval $(t, t + |y' - y''|)$ as the function $x(s) = y' + (y'' - y')(s - t)/|y' - y''|$, linear in the local coordinates, and considering (1.7), we obtain
$$S(x, y', t) + c |y' - y''| \ge I(x(\cdot), t + |y' - y''|) \ge S(x, y'', t + |y' - y''|) \ge S(x, y'', t) - c |y' - y''|.$$
Thus, $S(x, y'', t) - S(x, y', t) \le c |y' - y''| \le c_1 \operatorname{dist}(y', y'')$. In view of the symmetry between $y'$ and $y''$, this implies (1.8). The inequality $|S(x', y, t) - S(x'', y, t)| \le L \operatorname{dist}(x', x'')$ can be proved in the same way.
To complete the proof of Lemma 1 we have to show that for any $T > 0$ and $\delta > 0$ there is $t > T$ such that $|M(t) - \hat\lambda t| < \delta$. For this purpose, we cover the manifold $M$ by finitely many balls of radius $\delta_1 = \delta/L$, where $L$ is the Lipschitz constant from Proposition 4 corresponding to $T_0 = 1$. Denote by $N$ the number of balls forming the covering. According to (1.3), for any positive integer $k$ there exists a curve $x(\cdot)$ defined on the interval $(0, k(N + 1)T)$ such that
$$I(x(\cdot), k(N + 1)T) \le \hat\lambda k (N + 1) T. \tag{1.9}$$
Consider the set $\{x(jT)\}_{j=0}^{k(N+1)}$. It is clear that at least one ball in the covering contains $(k + 1)$ or more points of this set. Denote these points by $z_1, z_2, \dots, z_s$, $s \ge k + 1$, and the corresponding arguments by $t_1, \dots, t_s$. Let us suppose that the inequalities
$$I(x(\cdot - t_j), t_{j+1} - t_j) \ge \hat\lambda (t_{j+1} - t_j) + \delta/2 \tag{1.10}$$
hold for all $j < s$. According to (1.3) and Proposition 2, the first and the last segments of the curve satisfy the estimates
$$I(x(\cdot), t_1) \ge \hat\lambda t_1 - c_0, \qquad I(x(\cdot - t_s), k(N + 1)T - t_s) \ge \hat\lambda (k(N + 1)T - t_s) - c_0.$$
Taking the sum of the inequalities (1.10) for all $j = 1, 2, \dots, s$, and then adding the last two inequalities, we find that
$$I(x(\cdot), k(N + 1)T) \ge \hat\lambda k (N + 1) T + k \delta/2 - 2 c_0.$$
For sufficiently large $k$ this relation contradicts (1.9). Hence, for some $j < s$ we get
$$S(z_j, z_{j+1}, t_{j+1} - t_j) \le I(x(\cdot - t_j), t_{j+1} - t_j) < \hat\lambda (t_{j+1} - t_j) + \delta/2.$$
Our construction guarantees that $|z_j - z_{j+1}| \le \frac{\delta}{2L}$ and $t_{j+1} - t_j \ge T$. Therefore, by (1.3) and Proposition 4, we have
$$\hat\lambda (t_{j+1} - t_j) \le M(t_{j+1} - t_j) \le S(z_j, z_j, t_{j+1} - t_j) \le S(z_j, z_{j+1}, t_{j+1} - t_j) + L \frac{\delta}{2L} \le \hat\lambda (t_{j+1} - t_j) + \delta,$$
and Lemma 1 is proved.

Corollary 1. The inequality
ˆ
(1.11)
holds uniformly in $t > T_0$ and $x, y \in M$.

2. Convergence of the First Eigenvalue

In this section we study the first eigenvalue $\lambda_0$ of problem (0.1). The limit behaviour of $\lambda_0$ is described by the following

Theorem 1. The relation $\lim_{\mu \to 0} \lambda_0(\mu) = \hat\lambda$ holds.
First of all we establish some rough estimates for $\lambda_0$ and $p_0(x)$. This is the subject of the following two statements.

Proposition 5. For all $\mu > 0$,
$$\min_{x \in M} v(x) \le -\lambda_0(\mu) \le \max_{x \in M} v(x) = 0.$$
Proof. Due to the assumed normalizing conditions, $p_0(x)$ is a positive function on $M$. Denote by $x_1$ a maximum point of $p_0$. Then, according to the maximum principle, we have $-\lambda_0 p_0(x_1) = A_\mu p_0(x_1) \le v(x_1) p_0(x_1)$. This implies the upper bound for $-\lambda_0$. Similarly, writing down Eq. (0.1) at a minimum point of $p_0$, we obtain the lower bound.

Proposition 6. The following inequalities hold uniformly in $x \in M$:
$$e^{-c(M)/\mu} \le p_0(x) \le \mu^{-n}; \qquad \max_{x \in M} p_0(x) \ge 1. \tag{2.1}$$

Proof. The last inequality in (2.1) obviously follows from the normalizing conditions (0.2). To prove the first one, let us rewrite Eq. (0.1) in the rescaled local coordinates $y = x/\mu$:
$$\frac{1}{4} |a(\mu y)|^{-1/2} \frac{\partial}{\partial y^i} |a(\mu y)|^{1/2} a^{ij}(\mu y) \frac{\partial}{\partial y^j} p_0(\mu y) + b^i(\mu y) \frac{\partial}{\partial y^i} p_0(\mu y) + v(\mu y) p_0(\mu y) = -\lambda_0 p_0(\mu y).$$
According to our assumptions and Proposition 5, the coefficients of this equation are continuously differentiable functions bounded uniformly in $\mu$; therefore, by the Harnack inequality (see [3]) we have $0 < c_1 < p_0(\mu y_1)/p_0(\mu y_2) < c_2$ uniformly in $\mu > 0$ and $y_1, y_2 \in M$ satisfying the condition $\operatorname{dist}(y_1, y_2) < 1$. Thus, in the coordinates $x$ we have
$$c_1 < p_0(x_1)/p_0(x_2) < c_2 \tag{2.2}$$
for all $x_1, x_2 \in M$ such that $\operatorname{dist}(x_1, x_2) < \mu$. Let $x_1$ be a maximum point of $p_0$. Since $M$ is compact, it follows that for any $x \in M$ there exists a sequence $x = z_1, z_2, z_3, \dots, z_{N-1}, z_N = x_1$ with the following properties: $\operatorname{dist}(z_j, z_{j+1}) < \mu$ for all $j < N$; $N \le N_0(M)/\mu$ with $N_0(M)$ independent of $\mu$ and $x$. Iterating (2.2), we find
$$p_0(x)/p_0(x_1) = \prod_{j=1}^{N-1} \frac{p_0(z_j)}{p_0(z_{j+1})} \ge (c_1)^N \ge e^{N_0(M) \ln c_1/\mu}.$$
This yields the lower bound (2.1). To prove the upper bound, let us consider $p_0$ in the ball $Q_x = \{z \mid \operatorname{dist}(z, x) < \mu\}$. By (2.2) and (0.2) we have
$$p_0(x) \le c_2 \min_{z \in Q_x} p_0(z) \le c_3 \mu^{-n} \int_{Q_x} \min_{z \in Q_x} p_0(z)\, m(dz) \le c_3 \mu^{-n} \int_M p_0(z)\, m(dz) \le c_3 \mu^{-n},$$
and thereby the proposition is proved.

Remark 1. In fact, the method developed here allows us to obtain the following inequality
$$p_0(x)/p_0(y) \le \exp(c \operatorname{dist}(x, y)/\mu). \tag{2.3}$$
Proof of Theorem 1. In order to obtain the proper lower and upper bounds for $\lambda_0(\mu)$, let us consider an auxiliary Cauchy problem
$$\frac{\partial u}{\partial t} = \frac{1}{4} \mu |a(x)|^{-1/2} \frac{\partial}{\partial x^i} |a(x)|^{1/2} a^{ij}(x) \frac{\partial}{\partial x^j} u(x, t) + b^i(x) \frac{\partial}{\partial x^i} u(x, t) + \frac{1}{\mu} v(x) u(x, t), \qquad u|_{t=0} = p_0(x) \tag{2.4}$$
in the cylinder $M \times (0, +\infty)$. To estimate its solution $u(x, t)$, which is obviously equal to $\exp(-\frac{\lambda_0}{\mu} t)\, p_0(x)$, we introduce the operator
$$B_\mu = \frac{1}{4} \mu |a(x)|^{-1/2} \frac{\partial}{\partial x^i} |a(x)|^{1/2} a^{ij}(x) \frac{\partial}{\partial x^j} + b^i(x) \frac{\partial}{\partial x^i},$$
and denote by $\xi_t^x$ the corresponding diffusion process on $M$ issuing from the point $x$. The process $\xi_t^x$ is defined on some probability space $(\Omega, \mathcal{F}, \mathbf{P})$ and is assumed to have continuous trajectories. The relevant definitions can be found in [1]. Now the solution of (2.4) has the following probabilistic representation (see [1]):
$$u(x, t) = \mathbf{E} \Bigl[ p_0(\xi_t^x) \exp\Bigl( \frac{1}{\mu} \int_0^t v(\xi_s^x)\, ds \Bigr) \Bigr], \tag{2.5}$$
where $\mathbf{E}$ denotes the expectation of random variables.

Upper bound. Let $\operatorname{dist}_{[0,t]}(x'(\cdot), x''(\cdot))$ stand for the distance $\sup_{0 \le s \le t} \operatorname{dist}(x'(s), x''(s))$ in the functional space $C(0, t; M)$. According to [2, Chapter 5, Th. 3.2], the process $\xi_t^x$ satisfies the large deviation principle uniformly in $x \in M$, with the rate functional $I_0(x(\cdot), t)$; see the definition of $I_0(x(\cdot), t)$ in the previous section. In particular, for any absolutely continuous curve $x(\cdot)$ and any $\delta > 0$ and $\gamma > 0$, there exists $\mu_0 > 0$ such that for all $\mu < \mu_0$,
$$\mathbf{P}\{\operatorname{dist}_{[0,t]}(\xi_\cdot^x, x(\cdot)) < \delta\} \ge \exp\Bigl( -\frac{I_0(x(\cdot), t) + \gamma}{\mu} \Bigr). \tag{2.6}$$
Since the function $v(x)$ is smooth, the inequality
$$\Bigl| \int_0^t v(\xi_s^x)\, ds - \int_0^t v(x(s))\, ds \Bigr| \le c \delta t \tag{2.7}$$
holds for any trajectory $\xi_t^x$ that satisfies the estimate $\operatorname{dist}_{[0,t]}(\xi_\cdot^x, x(\cdot)) \le \delta$. Let $\bar x(\cdot)$ be the minimizing curve for $M(t)$. By (2.6) and (2.7), for all $\mu < \mu_0$ we have
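$$\begin{aligned}
p_0(\bar x(0))\, e^{-\lambda_0 t/\mu} &= u(\bar x(0), t) = \mathbf{E} \Bigl[ p_0(\xi_t^{\bar x(0)}) \exp\Bigl( \frac{1}{\mu} \int_0^t v(\xi_s^{\bar x(0)})\, ds \Bigr) \Bigr] \\
&\ge \min_{y \in M} p_0(y)\, \mathbf{P}\bigl\{ \operatorname{dist}_{[0,t]}(\xi_\cdot^{\bar x(0)}, \bar x(\cdot)) < \delta \bigr\} \exp\Bigl( \frac{1}{\mu} \Bigl( \int_0^t v(\bar x(s))\, ds - c \delta t \Bigr) \Bigr) \\
&\ge \min_{y \in M} p_0(y) \exp\Bigl( -\frac{I_0(\bar x(\cdot), t) + \gamma}{\mu} \Bigr) \exp\Bigl( \frac{1}{\mu} \Bigl( \int_0^t v(\bar x(s))\, ds - c \delta t \Bigr) \Bigr) \\
&= \min_{y \in M} p_0(y) \exp\Bigl( -\frac{I(\bar x(\cdot), t) + \gamma + c \delta t}{\mu} \Bigr);
\end{aligned}$$
here the equality $I(\bar x(\cdot), t) = I_0(\bar x(\cdot), t) - \int_0^t v(\bar x(s))\, ds$ has also been used. According to our choice of $\bar x(\cdot)$, we have $I(\bar x(\cdot), t) = M(t)$, and therefore,
$$p_0(\bar x(0))\, e^{-\lambda_0 t/\mu} \ge \min_{y \in M} p_0(y) \exp\Bigl( -\frac{M(t) + \gamma + c \delta t}{\mu} \Bigr).$$
Finally, by Proposition 6
$$e^{-\lambda_0 t/\mu} \ge \exp\Bigl( -\frac{2 c(M)}{\mu} \Bigr) \exp\Bigl( -\frac{I(\bar x(\cdot), t) + \gamma + c \delta t}{\mu} \Bigr) = \exp\Bigl( -\frac{I(\bar x(\cdot), t) + \gamma + c \delta t + 2 c(M)}{\mu} \Bigr).$$
Therefore,
$$\lambda_0 \le \frac{M(t)}{t} + c \delta + \frac{\gamma}{t} + \frac{2 c(M)}{t}$$
for all sufficiently small $\mu$. Since $\delta$ and $\gamma$ are arbitrary numbers and $c(M)$ does not depend on $t$, this implies
$$\limsup_{\mu \to 0} \lambda_0 \le \lim_{t \to \infty} \frac{M(t)}{t} = \hat\lambda. \tag{2.8}$$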
Lower bound. To estimate λ0 from below, we consider the following subset of C(0, t; M ): 8xt (α) = {x(·)|x(0) = x, I0 (x(·), t) ≤ α} . According to [2], this set is compact in C(0, t; M ), and for every δ > 0, γ > 0 and α > 0 there exists µ0 > 0 such that α−γ P {dist[0,t] (ξ·x , 8xt (α)) > δ} ≤ exp − (2.9) µ for all µ < µ0 . Moreover, for any α0 > 0 the estimate (2.9) is uniform in α < α0 and in x ∈ M . Let x be the initial point of the curve that provides theminimum for m(t). Represent ¯ (t)) ≥ δ , ing as a union of the following two events: Q1 = dist [0,t] (ξ·x , 8xt (2M
A Ground State of Singularly Perturbed Equations
539
¯ (t)) < δ , M ¯ (t) = max(M (t), 1), and denoting by χ the Q2 = dist [0,t] (ξ·x , 8xt (2M · characteristic function of a set, one can rewrite (2.5) in the form
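\[
p_0(x)\,e^{-\lambda_0 t/\mu} = E\,\chi_{Q_1}\, p_0(\xi^x_t) \exp\left(\frac{1}{\mu}\int_0^t v(\xi^x_s)\,ds\right) + E\,\chi_{Q_2}\, p_0(\xi^x_t) \exp\left(\frac{1}{\mu}\int_0^t v(\xi^x_s)\,ds\right). \tag{2.10}
\]
It follows from (2.9), Proposition 6 and the negativity of $v(\cdot)$ that the first term on the right-hand side satisfies the estimate
\[
E\,\chi_{Q_1}\, p_0(\xi^x_t) \exp\left(\frac{1}{\mu}\int_0^t v(\xi^x_s)\,ds\right) \le c\mu^{-n} P(Q_1) \le c\mu^{-n} \exp\left(-\frac{2\bar M(t) - \gamma}{\mu}\right) \tag{2.11}
\]
for sufficiently small $\mu$. To estimate the second term, let us represent $Q_2$ in the form
\[
\begin{aligned}
Q_2 &= \{\operatorname{dist}_{[0,t]}(\xi^x_\cdot, \Phi^x_t(\delta)) < \delta\} \cup \bigcup_{k=2}^{[2\bar M(t)/\delta]+1} \Big( \{\operatorname{dist}_{[0,t]}(\xi^x_\cdot, \Phi^x_t(k\delta)) < \delta\} \cap \{\operatorname{dist}_{[0,t]}(\xi^x_\cdot, \Phi^x_t((k-1)\delta)) \ge \delta\} \Big) \\
&= Q_2^1 \cup \bigcup_{k=2}^{[2\bar M(t)/\delta]+1} Q_2^k.
\end{aligned}
\]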
It should be noted that some events in this union can be empty. Since $m(t) \le I(x(\cdot), t) = I_0(x(\cdot), t) - \int_0^t v(x(s))\,ds$, we obtain $-\int_0^t v(x(s))\,ds \ge m(t) - k\delta$ for any $x(\cdot) \in \Phi^x_t(k\delta)$. Hence, by (2.7) we get
\[
-\int_0^t v(\xi^x_s)\,ds \ge m(t) - k\delta - c\delta t \tag{2.12}
\]
for all trajectories from $Q_2^k$. At the same time, according to (2.9),
\[
P(Q_2^k) \le P\{\operatorname{dist}_{[0,t]}(\xi^x_\cdot, \Phi^x_t((k-1)\delta)) \ge \delta\} \le \exp\left(-\frac{(k-2)\delta}{\mu}\right).
\]
Combining the last two estimates, we find
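\[
\begin{aligned}
E\,\chi_{Q_2}\, p_0(\xi^x_t) \exp\left(\frac{1}{\mu}\int_0^t v(\xi^x_s)\,ds\right) &\le \sum_{k=1}^{[2\bar M(t)/\delta]+1} P(Q_2^k) \max_{y \in M} p_0(y) \exp\left(-\frac{m(t) - k\delta - c\delta t}{\mu}\right) \\
&\le c\mu^{-n} \sum_{k=1}^{[2\bar M(t)/\delta]+1} \exp\left(-\frac{(k-2)\delta}{\mu}\right) \exp\left(-\frac{m(t) - k\delta - c\delta t}{\mu}\right) \\
&\le c\mu^{-n}\, \frac{\bar M(t)}{\delta} \exp\left(-\frac{m(t) - 2\delta - c\delta t}{\mu}\right).
\end{aligned}
\]
From (2.10), taking into account (2.11) and the last inequality, we derive
\[
p_0(x)\,e^{-\lambda_0 t/\mu} \le c\mu^{-n} \left( \exp\left(-\frac{2\bar M(t) - \gamma}{\mu}\right) + \frac{3\bar M(t)}{\delta} \exp\left(-\frac{m(t) - 2\delta - c\delta t}{\mu}\right) \right)
\]
or, after simple transformation,
\[
\lambda_0 t \ge c(M) - n\mu \ln\mu\,(\ln\delta - \ln\bar M(t)) + m(t) - \gamma - 2\delta - c\delta t - c_1;
\]
here $c_1$ depends neither on $t$ nor on $\mu$. Since $\delta$ and $\gamma$ are arbitrary numbers, this implies
\[
\liminf_{\mu \to 0} \lambda_0 \ge \lim_{t \to \infty} \frac{m(t)}{t} = \hat\lambda,
\]
which, in view of (2.8), completes the proof of Theorem 1.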
Corollary 2. The first eigenvalue $\lambda_0$ of problem (0.1) converges to zero as $\mu \to 0$ if and only if the identity $v(x(t)) \equiv 0$ holds along at least one solution of the equation $\dot x = b(x)$ on $M$.

3. Asymptotics of the First Eigenfunction

In this section the asymptotic behaviour of the first eigenfunction $p_0(x)$ is studied. Some additional assumptions are required in order to ensure the existence of the asymptotics. These assumptions, in turn, involve the following definitions.

Condition A. A curve $x(\cdot)$ defined on $(0, +\infty)$ satisfies condition A if for any $\varepsilon > 0$ there is $T > 0$ such that for all $t > 0$ we have
\[
I(x(\cdot - T), t) = \int_T^{T+t} \Big( a_{ij}(x(t))\,(\dot x^i - b^i(x(t)))(\dot x^j - b^j(x(t))) - v(x(t)) \Big)\,dt < \hat\lambda t + \varepsilon,
\]
where $\hat\lambda$ is defined in Lemma 1.

Condition B. A curve $x(\cdot)$ defined on $(0, +\infty)$ satisfies condition B if the inequality $I(x(\cdot), t) \le \hat\lambda t + c$ holds uniformly in $t > 0$.
First of all, we should answer the question of whether curves satisfying Conditions A and B exist. The proof of the following two simple assertions is outlined briefly.

Proposition 7. The conditions A and B are equivalent.

Proof. The implication A⇒B is obvious. To derive A from B it suffices to note that the set $\{t \mid I(x(\cdot), t) - \hat\lambda t > \sup_s (I(x(\cdot), s) - \hat\lambda s) - \varepsilon\}$ is not empty for each $x(\cdot)$ satisfying Condition B, and to take an arbitrary $T$ from this set.

Proposition 8. A curve satisfying condition B does exist.

Proof. Thanks to the last statement of Lemma 1 and the definition of $M(t)$, there exist a sequence $t_k \to \infty$ and curves $x_k(t)$, $0 \le t \le t_k$, $x_k(0) = x_k(t_k)$, such that
\[
\lim_{k \to \infty} |I(x_k(\cdot), t_k) - \hat\lambda t_k| = 0.
\]
Taking, if necessary, a proper subsequence, one can assume that the sequence $x_k(0)$ converges, and that the inequalities
\[
\operatorname{dist}(x_k(0), x_{k+1}(0)) < \exp(-k), \qquad |I(x_k(\cdot), t_k) - \hat\lambda t_k| < \exp(-k)
\]
hold. Now, combining the curves $x_k(\cdot)$ and segments of geodesics that connect $x_k(t_k)$ and $x_{k+1}(0)$, we obtain the desired curve.

Next, we introduce the class of operators to be studied.

Definition. The operator $A^\mu$ is recursive if there is at least one point $x_0$ of $M$ such that for any $\varepsilon > 0$, $T > 0$ and any $x(\cdot)$ satisfying condition A, the inequality $\operatorname{dist}(x(t), x_0) < \varepsilon$ holds for some $t > T$. The point $x_0$ is called recurrent for $A^\mu$.

The following property of recurrent points plays an important role in further considerations.

Proposition 9. For each recurrent point $x_0$ of $A^\mu$,
\[
\liminf_{t \to \infty}\, (S(x_0, x_0, t) - \hat\lambda t) = 0.
\]
Proof. By Lemma 1, $S(x_0, x_0, t) - \hat\lambda t \ge M(t) - \hat\lambda t \ge 0$. Thus, $\liminf_{t \to \infty} (S(x_0, x_0, t) - \hat\lambda t) \ge 0$. If we suppose that $\liminf_{t \to \infty} (S(x_0, x_0, t) - \hat\lambda t) = c > 0$, then Proposition 4 implies that $S(x, y, t) - \hat\lambda t \ge c/2$ for sufficiently large $t$ and $x$, $y$ close to $x_0$. Let $x(\cdot)$ satisfy condition B. Since $x_0$ is a recurrent point of $A^\mu$, one can find a sequence $\{t_k\}_{k=1}^\infty$ such that $(t_{k+1} - t_k) \to \infty$ as $k \to \infty$ and $x(t_k)$ are close to $x_0$ for all $k$. Hence,
\[
I(x(\cdot), t_{k+1}) - \hat\lambda t_{k+1} = (I(x(\cdot), t_1) - \hat\lambda t_1) + \sum_{s=1}^{k} \big(I(x(\cdot - t_s), t_{s+1} - t_s) - \hat\lambda (t_{s+1} - t_s)\big) \ge kc/2 - c_1.
\]
For sufficiently large $k$ this estimate contradicts the fact that $x(\cdot)$ satisfies condition B.
For a recursive operator we define the function $W_0$ as follows:
\[
W_0(x) = \inf_{t > 0}\ \inf_{x(\cdot):\ x(0) = x,\ x(t) = x_0} (I(x(\cdot), t) - \hat\lambda t) = \inf_{t > 0}\, (S(x, x_0, t) - \hat\lambda t), \tag{3.1}
\]
where $x_0$ is a recurrent point.

Remark 2. In fact, the infimum over all $t > 0$ in (3.1) can be replaced by that over an arbitrary half-line $t > T_0$, $T_0 \ge 0$. Indeed, let $\{x_k(\cdot)\}$ be a sequence of curves with the following properties:
\[
x_k(0) = x_k(t_k) = x_0, \qquad \lim_{k \to \infty} t_k = \infty, \qquad \lim_{k \to \infty} (I(x_k(\cdot), t_k) - \hat\lambda t_k) = 0.
\]
Proposition 9 guarantees the existence of such a sequence. Now it suffices to extend the curves from an arbitrary minimizing sequence for $W_0$ by the curves from the sequence just constructed.

In view of Remark 2, the following statement easily follows from Proposition 4.

Proposition 10. $W_0$ is a Lipschitz continuous function on $M$.

It should be observed that, in general, the function $W_0$ depends on the choice of the recurrent point $x_0$. Define the function $W(x)$ on $M$ by the formula
\[
W(x) = W_0(x) - \min_{y \in M} W_0(y).
\]
A remarkable property of $W$ is its independence of $x_0$.

Proposition 11. $W(x)$ is a well-defined function on $M$; it does not depend on the choice of the recurrent point $x_0$.

Proof. Consider two arbitrary recurrent points $x_0'$ and $x_0''$ of the operator $A^\mu$. The corresponding functions (3.1) will be marked by $'$ and $''$, respectively. Our proof is based on the following relation:
\[
W_0'(x_0'') + W_0''(x_0') = 0. \tag{3.2}
\]
In order to establish (3.2), let us first assume that $W_0'(x_0'') + W_0''(x_0') = c > 0$. In view of Propositions 4 and 10, this implies the estimate
\[
\inf_{t > 0}\, (S(x_1, y_1, t) - \hat\lambda t) + \inf_{t > 0}\, (S(y_2, x_2, t) - \hat\lambda t) \ge c/2
\]
for all $x_1$, $x_2$ close to $x_0'$ and $y_1$, $y_2$ close to $x_0''$. Fixing an arbitrary curve $x(\cdot)$ which satisfies condition B and taking into account the properties of $x_0'$ and $x_0''$, one can easily construct an increasing sequence $\{t_k\}$ such that $x(t_{2k})$ are close to $x_0'$ and $x(t_{2k-1})$ are close to $x_0''$ for all $k > 0$. Then our assumption leads to the following inequality:
\[
\begin{aligned}
I(x(\cdot), t_{2k+1}) - \hat\lambda t_{2k+1} &= (I(x(\cdot), t_1) - \hat\lambda t_1) + \sum_{s=1}^{k} \Big\{ \big(I(x(\cdot - t_{2s-1}), t_{2s} - t_{2s-1}) - \hat\lambda (t_{2s} - t_{2s-1})\big) \\
&\qquad\qquad + \big(I(x(\cdot - t_{2s}), t_{2s+1} - t_{2s}) - \hat\lambda (t_{2s+1} - t_{2s})\big) \Big\} \\
&\ge (I(x(\cdot), t_1) - \hat\lambda t_1) + \sum_{s=1}^{k} \Big\{ \big(S(x(t_{2s-1}), x(t_{2s}), t_{2s} - t_{2s-1}) - \hat\lambda (t_{2s} - t_{2s-1})\big) \\
&\qquad\qquad + \big(S(x(t_{2s}), x(t_{2s+1}), t_{2s+1} - t_{2s}) - \hat\lambda (t_{2s+1} - t_{2s})\big) \Big\} \ge c_1 + kc/2,
\end{aligned}
\]
which contradicts the fact that $x(\cdot)$ satisfies condition B. Thus, $W_0'(x_0'') + W_0''(x_0') \le 0$. On the other hand,
\[
\begin{aligned}
W_0'(x_0'') + W_0''(x_0') &= \inf_{t > 0}\, (S(x_0', x_0'', t) - \hat\lambda t) + \inf_{t > 0}\, (S(x_0'', x_0', t) - \hat\lambda t) \\
&\ge \inf_{t > 0}\, (S(x_0', x_0', t) - \hat\lambda t) \ge \inf_{t > 0}\, (M(t) - \hat\lambda t) \ge 0
\end{aligned}
\]
and (3.2) follows. Now,
\[
W_0'(x) = \inf_{t > 0}\, (S(x, x_0', t) - \hat\lambda t) \le \inf_{t > 0}\, (S(x, x_0'', t) - \hat\lambda t) + \inf_{t > 0}\, (S(x_0'', x_0', t) - \hat\lambda t) = W_0''(x) + W_0'(x_0'').
\]
Similarly, $W_0''(x) \le W_0'(x) + W_0''(x_0')$. In view of (3.2), this means that $W_0''(x) = W_0'(x) + W_0''(x_0')$. In other words, the difference $W_0'(x) - W_0''(x)$ is constant, and therefore, $W$ is well-defined.

The main result of this section is the following

Theorem 2. Let the operator $A^\mu$ be recursive. Then
\[
\lim_{\mu \to 0} \mu \ln p_0(x) = -W(x) \tag{3.3}
\]
uniformly in $x \in M$.

Proof. Lower bound. Let us fix an arbitrary recurrent point $x_0$ of the operator $A^\mu$ and estimate the ratio $p_0(x)/p_0(x_0)$ from below. According to the definition of $W_0$, for any $x \in M$ and $\delta > 0$ there is a curve $x(\cdot)$ defined on the interval $(0, T(\delta))$ and such that
\[
x(0) = x, \qquad x(T(\delta)) = x_0, \qquad I(x(\cdot), T(\delta)) - \hat\lambda T(\delta) < W_0(x) + \delta. \tag{3.4}
\]
Moreover, using compactness arguments and Propositions 4 and 10, we can choose $T(\delta)$ bounded by some $T_0(\delta)$ uniformly in $x \in M$. Indeed, if we construct the segment of a geodesic curve that connects $y$ and $x$, combine it with $x(\cdot)$ and denote the obtained curve by $\tilde x(\cdot)$, then we get
\[
I(\tilde x(\cdot), T(\delta) + \operatorname{dist}(y, x)) - \hat\lambda (T(\delta) + \operatorname{dist}(y, x)) < W_0(x) + \frac{2\delta}{3} < W_0(y) + \delta
\]
for all $y$ from a sufficiently small neighbourhood of $x$. According to [2], for any $\delta_1 > 0$ there exists $\mu_0 > 0$ such that
\[
P\{\operatorname{dist}_{[0,T(\delta)]}(\xi^x_\cdot, x(\cdot)) < \delta_1\} \ge \exp\left(-\frac{I_0(x(\cdot), T(\delta)) + \delta_1}{\mu}\right) \tag{3.5}
\]
for all $\mu < \mu_0$. From (2.3), (2.7), (3.4) and the last estimate, we get
\[
\begin{aligned}
p_0(x)\,e^{-\lambda_0 T(\delta)/\mu} &= E\, p_0\big(\xi^x_{T(\delta)}\big) \exp\left(\frac{1}{\mu}\int_0^{T(\delta)} v(\xi^x_s)\,ds\right) \\
&\ge P\{\operatorname{dist}_{[0,T(\delta)]}(\xi^x_\cdot, x(\cdot)) < \delta_1\} \times p_0(x_0) \exp\left(-\frac{c\delta_1}{\mu}\right) \exp\left(\frac{1}{\mu}\left[\int_0^{T(\delta)} v(x(s))\,ds - cT(\delta)\delta_1\right]\right) \\
&\ge p_0(x_0) \exp\left(-\frac{I(x(\cdot), T(\delta)) + \delta_1 + c\delta_1 + cT(\delta)\delta_1}{\mu}\right).
\end{aligned}
\]
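Finally, using (3.4), we find
\[
p_0(x) \ge p_0(x_0) \exp\left(-\frac{W_0(x) + \delta + T(\delta)|\hat\lambda - \lambda_0| + (1 + c_1 + cT(\delta))\delta_1}{\mu}\right).
\]
For suitably chosen $\delta$, $\delta_1$ and $\mu_0$, the quantity $(\delta + T(\delta)|\hat\lambda - \lambda_0| + (1 + c_1 + cT(\delta))\delta_1)$ becomes arbitrarily small, and therefore,
\[
\liminf_{\mu \to 0} \mu \ln(p_0(x)/p_0(x_0)) \ge -W_0(x). \tag{3.6}
\]
Upper bound. The following statement is a direct consequence of the definition of a recurrent point.

Proposition 12. Under the above conditions, for any $\delta > 0$ and $\bar c > 0$, there exists $t_0 = t_0(\bar c, \delta)$ such that for all $t > t_0$ the inequality $\min_{0 \le s \le t} \operatorname{dist}(x(s), x_0) > \delta$ implies that
\[
I(x(\cdot), t) \ge \hat\lambda t + \bar c. \tag{3.7}
\]
The constant $\bar c$ will be fixed later. Again using compactness arguments, we deduce from Corollary 1 that for any $\delta > 0$ there is $t_1 = t_1(\delta)$ such that
\[
\Big| \inf_{t > 0}\, (S(x, x_0, t) - \hat\lambda t) - \inf_{0 \le t \le t_1} (S(x, x_0, t) - \hat\lambda t) \Big| < \delta \tag{3.8}
\]
uniformly in $x \in M$. Let us denote $\max(t_0(\bar c, \delta), t_1(\delta))$ by $\bar t$ and fix $\mu_0(\delta)$ such that the estimate
\[
|\lambda_0 - \hat\lambda|\,\bar t < \delta \tag{3.9}
\]
holds for all $\mu < \mu_0(\delta)$. Later on we assume that $\mu < \mu_0(\delta)$.

It is easy to check that the function $\tilde u(x, t) = p_0(x) \exp(-(\lambda_0 - \hat\lambda)t/\mu)$ satisfies the equation
\[
\left( \frac{\partial}{\partial t} - \frac{1}{\mu} A^\mu - \frac{\hat\lambda}{\mu} \right) \tilde u = 0, \qquad \tilde u|_{t=0} = p_0. \tag{3.10}
\]
Moreover, according to our choice of $\mu_0(\delta)$, the relation $\tilde u(x, t) = p_0(x) \exp(O(\delta)/\mu)$ holds for all $t < \bar t$ and $\mu < \mu_0(\delta)$.

Let $\tau^x_{2\delta}$ be the exit time for the domain $M \setminus O_{2\delta}(x_0)$, where $O_{2\delta}(x_0)$ is the ball $\{y \in M \mid \operatorname{dist}(y, x_0) < 2\delta\}$. For our purposes, it is convenient to fix $\delta_0 > 0$ and divide the set $\Omega$ into three parts:
\[
\begin{aligned}
\Omega_1 &= \big\{\operatorname{dist}_{[0,\bar t]}(\Phi^x_{\bar t}(K), \xi^x_\cdot) \ge \delta_0\big\}, \\
\Omega_2 &= \{\tau^x_{2\delta} > \bar t\} \cap \big\{\operatorname{dist}_{[0,\bar t]}(\Phi^x_{\bar t}(K), \xi^x_\cdot) < \delta_0\big\}, \\
\Omega_3 &= \{\tau^x_{2\delta} \le \bar t\} \cap \big\{\operatorname{dist}_{[0,\bar t]}(\Phi^x_{\bar t}(K), \xi^x_\cdot) < \delta_0\big\},
\end{aligned}
\]
where $K = \bar c + \bar t \max_{y \in M} |v(y)|$ and $\Phi^x_{\bar t}(K)$ is defined in the previous section. According to [2], for sufficiently small $\mu$ we have
\[
P(\Omega_1) \le \exp\left(-\frac{K - \delta}{\mu}\right). \tag{3.11}
\]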
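To estimate the contribution of $\Omega_3$ to the solution $\tilde u(x, t)$ written in a probabilistic form, let us fix an arbitrary positive $\delta_1$ and represent $\Omega_3$ as a finite union of the following events:
\[
\begin{aligned}
\Omega_3 = \bigcup_{k=1}^{\bar t/\delta_1} \bigcup_{l=1}^{K/\delta} \Big( &\{k\delta_1 \le \tau^x_{2\delta} < (k+1)\delta_1\} \cap \big\{\operatorname{dist}_{[0,(k+1)\delta_1]}(\Phi^x_{(k+1)\delta_1}((l+1)\delta), \xi^x_\cdot) < \delta_0\big\} \\
&\cap \big\{\operatorname{dist}_{[0,(k+1)\delta_1]}(\Phi^x_{(k+1)\delta_1}(l\delta), \xi^x_\cdot) \ge \delta_0\big\} \Big) = \bigcup_{k=1}^{\bar t/\delta_1} \bigcup_{l=1}^{K/\delta} \Omega_3^{k,l}.
\end{aligned}
\]
We also fix a positive $\nu$ and suppose that $x \in M \setminus O_\nu(x_0)$. The opposite case, namely $x \in O_\nu(x_0)$, will be examined later. In what follows we assume that $\delta$, $\delta_0$, $\delta_1$ and $\nu$ are sufficiently small and satisfy the relations $\nu \gg \delta > \delta_0$, $\nu \gg \delta_1$. According to [2], there exists $t_2(\nu)$ such that
\[
P\{\tau^x_{2\delta} < t_2(\nu)\} \le \exp(-K/\mu)
\]
for all $x \in M \setminus O_\nu(x_0)$. In view of the definition of $\Omega_3^{k,l}$, this implies that $P(\Omega_3^{k,l}) \le \exp(-K/\mu)$ for all $k < t_2(\nu)/\delta_1$, and therefore it suffices to examine $\Omega_3^{k,l}$ only for $k$ from the interval $t_2(\nu) \le k\delta_1 \le \bar t$. According to [2],
\[
P(\Omega_3^{k,l}) \le P\big\{\operatorname{dist}_{[0,(k+1)\delta_1]}(\Phi^x_{(k+1)\delta_1}(l\delta), \xi^x_\cdot) \ge \delta_0\big\} \le \exp\left(-\frac{l\delta - \delta}{\mu}\right) \tag{3.12}
\]
for sufficiently small $\mu$. At the same time, it follows from the definition of $\Phi^x_{(k+1)\delta_1}((l+1)\delta)$ that for any $\xi^x_\cdot \in \Omega_3^{k,l}$ there is a curve $x(\cdot)$ satisfying the estimates
\[
I_0(x(\cdot), (k+1)\delta_1) \le (l+1)\delta, \qquad \operatorname{dist}_{[0,(k+1)\delta_1]}(x(\cdot), \xi^x_\cdot) < \delta_0. \tag{3.13}
\]
In view of the evident relation $|\xi^x_{\tau^x_{2\delta}} - x_0| = 2\delta$, this implies
\[
|x(\tau^x_{2\delta}) - x_0| < 2\delta + \delta_0, \tag{3.14}
\]
where the argument $\tau^x_{2\delta}$ is random. To estimate the same difference at a nonrandom point, we apply the following

Proposition 13. The inequality
\[
\operatorname{dist}(x(s_1), x(s_2)) \le c(K, \bar t)\sqrt{|s_1 - s_2|}, \qquad s_1, s_2 \le t, \tag{3.15}
\]
holds uniformly in $t \le \bar t$ and $x(\cdot)$ satisfying the condition $I_0(x(\cdot), t) \le K$.

Proof. Proposition 3, the definition of the distance and the Schwarz inequality yield
\[
\operatorname{dist}(x(s_1), x(s_2)) \le \int_{s_1}^{s_2} \big(a_{ij}(x(s))\,\dot x^i(s)\dot x^j(s)\big)^{1/2}\,ds \le \sqrt{\int_{s_1}^{s_2} a_{ij}(x(s))\,\dot x^i(s)\dot x^j(s)\,ds}\ \sqrt{|s_1 - s_2|} \le c(K, \bar t)\sqrt{|s_1 - s_2|}.
\]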
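Now, taking into account the inequality $k\delta_1 \le \tau^x_{2\delta} < (k+1)\delta_1$ and Proposition 13, we deduce from (3.14) that
\[
|x((k+1)\delta_1) - x_0| < \big(c(K, \bar t)\sqrt{\delta_1} + 2\delta + \delta_0\big). \tag{3.16}
\]
Note that the constant $c(K, \bar t)$ in (3.16) does not depend on $\delta_1$ and $\delta_0$. Then, by the definition of $S(x, y, t)$, we get
\[
I_0(x(\cdot), (k+1)\delta_1) - \int_0^{(k+1)\delta_1} v(x(t))\,dt \ge S(x, x((k+1)\delta_1), (k+1)\delta_1).
\]
In view of (3.16), Proposition 4 and the inequality $k\delta_1 > t_2(\nu)$, this implies
\[
I_0(x(\cdot), (k+1)\delta_1) - \int_0^{(k+1)\delta_1} v(x(t))\,dt \ge S(x, x_0, (k+1)\delta_1) - c_0(\nu)\big(c(K, \bar t)\sqrt{\delta_1} + \delta + \delta_0\big).
\]
Hence, by virtue of (3.13),
\[
-\int_0^{(k+1)\delta_1} v(x(t))\,dt \ge S(x, x_0, (k+1)\delta_1) - (l+1)\delta - c_0(\nu)\big(c(K, \bar t)\sqrt{\delta_1} + \delta + \delta_0\big).
\]
Finally, by (2.7),
\[
-\int_0^{(k+1)\delta_1} v(\xi^x_t)\,dt \ge S(x, x_0, (k+1)\delta_1) - (l+1)\delta - c_0(\nu)\big(c(K, \bar t)\sqrt{\delta_1} + \delta + \delta_0\big) - c\bar t\delta_0.
\]
Using this inequality and (3.12), we obtain
\[
\begin{aligned}
E\,\chi_{\Omega_3^{k,l}} &\exp\left(\frac{1}{\mu}\int_0^{\tau^x_{2\delta}} (v(\xi^x_s) - \hat\lambda)\,ds\right) \le \exp\left(\frac{c\delta_1}{\mu}\right) E\,\chi_{\Omega_3^{k,l}} \exp\left(\frac{1}{\mu}\int_0^{(k+1)\delta_1} (v(\xi^x_s) - \hat\lambda)\,ds\right) \\
&\le \exp\left(\frac{c\delta_1}{\mu}\right) P(\Omega_3^{k,l}) \exp\left(-\frac{S(x, x_0, (k+1)\delta_1) - (k+1)\delta_1\hat\lambda - (l+1)\delta - c_0(\nu)\big(c(K, \bar t)\sqrt{\delta_1} + \delta + \delta_0\big) - c\bar t\delta_0}{\mu}\right) \\
&\le \exp\left(-\frac{l\delta - \delta}{\mu}\right) \exp\left(-\frac{W_0(x) - (l+1)\delta - c_0(\nu)\big(c(K, \bar t)\sqrt{\delta_1} + \delta + \delta_0\big) - c\bar t\delta_0 - c\delta_1}{\mu}\right) \\
&\le \exp\left(-\frac{W_0(x) - 2\delta - c_0(\nu)\big(c(K, \bar t)\sqrt{\delta_1} + \delta + \delta_0\big) - c\bar t\delta_0 - c\delta_1}{\mu}\right);
\end{aligned} \tag{3.17}
\]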
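here we have also used the obvious inequality $W_0(x) \le S(x, x_0, (k+1)\delta_1) - (k+1)\delta_1\hat\lambda$. Then, representing $\Omega_2$ in the form
\[
\Omega_2 = \bigcup_{l=1}^{K/\delta} \Big( \big\{\operatorname{dist}_{[0,\bar t]}(\Phi^x_{\bar t}((l+1)\delta), \xi^x_\cdot) < \delta_0\big\} \cap \big\{\operatorname{dist}_{[0,\bar t]}(\Phi^x_{\bar t}(l\delta), \xi^x_\cdot) \ge \delta_0\big\} \Big) = \bigcup_{l=1}^{K/\delta} \Omega_2^l,
\]
and applying the above arguments with obvious simplifications, we find that
\[
E\,\chi_{\Omega_2^l} \exp\left(\frac{1}{\mu}\int_0^{\bar t} (v(\xi^x_s) - \hat\lambda)\,ds\right) \le \exp\left(-\frac{\bar c - \bar S - c\delta - c\bar t\delta_0}{\mu}\right), \tag{3.18}
\]
where $\bar S = \inf_{x,y,t}\, (S(x, y, t) - \hat\lambda t)$. Now let us notice that the solution of the following initial boundary value problem
\[
\left( \frac{\partial}{\partial t} - \frac{1}{\mu} A^\mu - \frac{\hat\lambda}{\mu} \right) \bar u = 0, \quad x \in M \setminus O_{2\delta}(x_0); \qquad \bar u|_{t=0} = p_0, \qquad \bar u|_{\partial O_{2\delta}(x_0)} = \tilde u|_{\partial O_{2\delta}(x_0)},
\]
coincides with $\tilde u(x, t)$ for $x \in M \setminus O_{2\delta}(x_0)$ and can be written in the form (see [1])
\[
\tilde u(x, t) = \bar u(x, t) = E\, \tilde u\big(\xi^x_{\bar\tau^x_{2\delta}}, \bar\tau^x_{2\delta}\big) \exp\left(\frac{1}{\mu}\int_0^{\bar\tau^x_{2\delta}} (v(\xi^x_s) - \hat\lambda)\,ds\right),
\]
where $\bar\tau^x_{2\delta} = \min(\tau^x_{2\delta}, \bar t)$. Using the relation $\tilde u(x, t) = \exp(O(\delta)/\mu)\,p_0(x)$, $t < \bar t$, we obtain
\[
p_0(x) \le \exp\left(\frac{c\delta}{\mu}\right) E\,(\chi_{\Omega_1} + \chi_{\Omega_2} + \chi_{\Omega_3})\, p_0\big(\xi^x_{\bar\tau^x_{2\delta}}\big) \exp\left(\frac{1}{\mu}\int_0^{\bar\tau^x_{2\delta}} (v(\xi^x_s) - \hat\lambda)\,ds\right).
\]
In view of (3.11), Proposition 6 and the choice of $K$, the first term on the right-hand side can be estimated as follows:
\[
\begin{aligned}
E\,\chi_{\Omega_1}\, p_0\big(\xi^x_{\bar\tau^x_{2\delta}}\big) \exp\left(\frac{1}{\mu}\int_0^{\bar\tau^x_{2\delta}} (v(\xi^x_s) - \hat\lambda)\,ds\right) &\le \exp\left(-\frac{K - \delta - c(M) - \bar t \max|v(y)|}{\mu}\right) p_0(x_0) \\
&\le \exp\left(-\frac{\bar c - c(M) - \delta}{\mu}\right) p_0(x_0).
\end{aligned}
\]
Then, by the definition of $\Omega_3$, Remark 1 and (3.17), we get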
\[
\begin{aligned}
E\,\chi_{\Omega_3}\, p_0\big(\xi^x_{\bar\tau^x_{2\delta}}\big) &\exp\left(\frac{1}{\mu}\int_0^{\bar\tau^x_{2\delta}} (v(\xi^x_s) - \hat\lambda)\,ds\right) \\
&\le \frac{\bar t}{\delta_1}\,\frac{K}{\delta} \exp\left(-\frac{W_0(x) - 2\delta - c_0(\nu)\big(c(K, \bar t)\sqrt{\delta_1} + \delta + \delta_0\big) - c\bar t\delta_0 - c\delta_1}{\mu}\right) \times \exp\left(\frac{c\delta}{\mu}\right) p_0(x_0).
\end{aligned}
\]
Similarly, by (3.18) and Proposition 6, we get
\[
E\,\chi_{\Omega_2}\, p_0\big(\xi^x_{\bar\tau^x_{2\delta}}\big) \exp\left(\frac{1}{\mu}\int_0^{\bar\tau^x_{2\delta}} (v(\xi^x_s) - \hat\lambda)\,ds\right) \le \frac{K}{\delta} \exp\left(-\frac{\bar c - \bar S - c\delta - c\bar t\delta_0}{\mu}\right) \exp\left(\frac{c(M)}{\mu}\right) p_0(x_0).
\]
Combining the last three estimates and choosing $\bar c$, $\nu$, $\delta$, $\delta_0$, $\delta_1$ properly, we find that
\[
\limsup_{\mu \to 0} \mu \ln(p_0(x)/p_0(x_0)) \le -W_0(x)
\]
for all $x \in M \setminus O_\nu(x_0)$. In view of (3.6), this yields
\[
\lim_{\mu \to 0} \mu \ln(p_0(x)/p_0(x_0)) = -W_0(x)
\]
for all $x \in M \setminus O_\nu(x_0)$. Since $\nu > 0$ is arbitrary, the last equality holds for all $x \ne x_0$. But, according to Remark 1, the functions $\mu \ln(p(x)/p(x_0))$ are equicontinuous. Therefore, this equality holds uniformly in $x \in M$. Now, the statement of Theorem 2 follows from our normalizing conditions.

4. Operators with Potential First Order Terms

In this section we consider operators $A^\mu$ with 'potential' first order terms. These operators admit explicit formulae, both for the limit of the first eigenvalue and for the recurrent point. Moreover, the function $W(x)$ can be expressed in terms of the geodesic distance in a proper auxiliary metric.

Definition. The operator $A^\mu$ has potential first order terms if there is a function $U(x)$ on $M$ such that
\[
b^i(x) = a^{ij}(x)\frac{\partial}{\partial x^j} U(x), \qquad i = 1, 2, \dots, n.
\]
Theorem 3. Suppose that the operator $A^\mu$ has potential first order terms. Then
\[
\lim_{\mu \to 0} \lambda_0 = \min_{x \in M} \left( a^{ij}(x)\frac{\partial}{\partial x^i} U(x)\frac{\partial}{\partial x^j} U(x) - v(x) \right).
\]
The operator $A^\mu$ is recursive if and only if the function $a^{ij}(x)\frac{\partial}{\partial x^i} U(x)\frac{\partial}{\partial x^j} U(x) - v(x)$ has a unique minimum point on $M$. This minimum point is the only recurrent point of $A^\mu$.
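The limit formula of Theorem 3 can be explored numerically. The following is a minimal sketch, assuming a one-dimensional example that is not taken from the paper: $M$ is the circle $\mathbb{R}/2\pi\mathbb{Z}$, $a^{11} \equiv 1$, $U(x) = \cos x$ and $v(x) = \cos x - 2$. The helper name `limiting_eigenvalue` is hypothetical. The code only evaluates the minimum of $a^{ij}\partial_i U\,\partial_j U - v$ on a grid; it does not solve the eigenvalue problem itself.

```python
import numpy as np

def limiting_eigenvalue(a_upper, dU, v, grid):
    """Return (minimum, minimizer) of a(x) * U'(x)**2 - v(x) over the grid."""
    F = a_upper(grid) * dU(grid) ** 2 - v(grid)
    i = int(np.argmin(F))
    return float(F[i]), float(grid[i])

# Hypothetical data on the circle M = R / (2*pi*Z); none of it comes from the paper.
grid = np.linspace(0.0, 2.0 * np.pi, 4000, endpoint=False)
a_upper = lambda x: np.ones_like(x)   # a^{11}(x) = 1
dU = lambda x: -np.sin(x)             # derivative of U(x) = cos(x)
v = lambda x: np.cos(x) - 2.0         # a negative potential

lam_limit, x0 = limiting_eigenvalue(a_upper, dU, v, grid)
print(lam_limit, x0)
```

For this choice the function $\sin^2 x + 2 - \cos x$ has a single minimum at $x = 0$, so under the hypotheses of Theorem 3 the operator would be recursive with recurrent point $x_0 = 0$.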
Proof. Let $x_0$ be a minimum point of the function $a^{ij}(x)\frac{\partial}{\partial x^i} U(x)\frac{\partial}{\partial x^j} U(x) - v(x)$. After simple transformation, we find that
\[
\begin{aligned}
I(x(\cdot), t) &= \int_0^t \left( a_{ij}(x(s))\,\dot x^i \dot x^j + a^{ij}(x(s))\frac{\partial}{\partial x^i} U(x(s))\frac{\partial}{\partial x^j} U(x(s)) - v(x(s)) \right) ds + 2(U(x(t)) - U(x(0))) \\
&\ge t \min_{x \in M} \left( a^{ij}(x)\frac{\partial}{\partial x^i} U(x)\frac{\partial}{\partial x^j} U(x) - v(x) \right) + 2(U(x(t)) - U(x(0)))
\end{aligned}
\]
for any absolutely continuous curve $x(\cdot)$. Since $2(U(x(t)) - U(x(0)))$ is bounded uniformly in $t$, we have $\hat\lambda \ge \min_{x \in M} \big( a^{ij}(x)\frac{\partial}{\partial x^i} U(x)\frac{\partial}{\partial x^j} U(x) - v(x) \big)$. On the other hand, taking the curve $x(\cdot)$ identically equal to $x_0$, we obtain
\[
\hat\lambda \le \lim_{t \to \infty} \frac{1}{t} I(x(\cdot), t) = \min_{x \in M} \left( a^{ij}(x)\frac{\partial}{\partial x^i} U(x)\frac{\partial}{\partial x^j} U(x) - v(x) \right).
\]
The other assertions of the theorem can be proved in the same way.

Denote
\[
V(x) = \left( a^{ij}(x)\frac{\partial}{\partial x^i} U(x)\frac{\partial}{\partial x^j} U(x) - v(x) \right) - \min_{y \in M} \left( a^{ij}(y)\frac{\partial}{\partial y^i} U(y)\frac{\partial}{\partial y^j} U(y) - v(y) \right).
\]
The next statement provides the geometric interpretation for $W_0(x)$.

Theorem 4. Let $x_0$ be the unique minimum point of $V(x)$ on $M$. Then
\[
W_0(x) = 2\left( U(x_0) - U(x) + \operatorname{dist}_{(V(x))a_{ij}(x)}(x, x_0) \right);
\]
here $\operatorname{dist}_{(V(x))a_{ij}(x)}$ is a distance in the metric $(V(x))a_{ij}(x)$.

The proof is the same as that of Theorem 5 below.

Remark 3. The point $x_0$ need not belong to the set of minimum points of $W_0(x)$ (and, hence, $W(x)$). Thus, $p_0(x_0)$ might be exponentially small.

5. Selfadjoint Operators

In this section we suppose that $b(x) \equiv 0$, i.e. that the operator $A^\mu$ is selfadjoint. Then the formulae of the previous section admit an interesting geometric interpretation. Clearly, for selfadjoint operators $\hat\lambda = \min_{x \in M}(-v(x)) = 0$ and Condition B is equivalent to the uniqueness of a minimum point of $-v(x)$. Without loss of generality we suppose that $\min_{x \in M}(-v(x)) = 0$. Denote the minimum point by $x_0$.

Theorem 5. Let $b(x) \equiv 0$, and assume that the function $(-v(x))$ has a unique minimum point. Then
\[
\lim_{\mu \to 0} \mu \ln p(x) = -2\operatorname{dist}_{(-v(x))a_{ij}(x)}(x, x_0);
\]
here $\operatorname{dist}_{(-v(x))a_{ij}(x)}$ is a distance in the metric $(-v(x))a_{ij}(x)$.
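To make the right-hand side of Theorem 5 concrete, here is a minimal numerical sketch under assumptions that are not part of the paper: $M$ is the circle $\mathbb{R}/2\pi\mathbb{Z}$, $a_{ij} \equiv 1$, and $v(x) = \cos x - 1$, so that $-v$ has its unique minimum (equal to zero) at $x_0 = 0$. In one dimension the distance in the metric $(-v)a_{ij}$ reduces to the shorter of the two arc integrals of $\sqrt{-v}$. The function name `log_asymptotics_on_circle` is hypothetical.

```python
import numpy as np

def log_asymptotics_on_circle(v, n=20001):
    """Return grid points s and the candidate limit -2 * dist(s, 0) on the circle.

    Assumes a_ij = 1 and that -v >= 0 vanishes only at s = 0 (mod 2*pi).
    """
    s = np.linspace(0.0, 2.0 * np.pi, n)        # both endpoints represent x0 = 0
    w = np.sqrt(np.maximum(-v(s), 0.0))         # line element sqrt(-v) ds
    ds = s[1] - s[0]
    # Cumulative trapezoid rule: length of the arc from 0 to s in the new metric.
    arc = np.concatenate(([0.0], np.cumsum(0.5 * (w[1:] + w[:-1]) * ds)))
    dist = np.minimum(arc, arc[-1] - arc)       # go around the shorter way
    return s, -2.0 * dist

v = lambda x: np.cos(x) - 1.0                   # hypothetical potential
s, limit = log_asymptotics_on_circle(v)
mid = len(s) // 2                               # grid index of the point x = pi
print(s[mid], limit[mid])
```

For this example the arc integral can be done by hand: $\sqrt{-v(s)} = \sqrt{2}\,\sin(s/2)$ on $[0, 2\pi]$, so the distance from $x = \pi$ to $x_0$ is $2\sqrt{2}$ and the predicted limit there is $-4\sqrt{2}$.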
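Remark 4. Under the assumption of Theorem 5 the metric $(-v(x))a_{ij}(x)$ degenerates only at the point $x_0$.

Proof of Theorem 5. We will prove the following chain of equalities:
\[
\begin{aligned}
\lim_{\mu \to 0} \mu \ln p(x) &= -\inf_{T > 0}\ \inf_{x(\cdot):\ x(0) = x,\ x(T) = x_0} \int_0^T \big( a_{ij}(x(t))\,\dot x^i \dot x^j - v(x(t)) \big)\,dt \\
&= -2 \inf_{x(\cdot):\ x(0) = x,\ x(1) = x_0} \int_0^1 \sqrt{-v(x(t))\,a_{ij}(x(t))\,\dot x^i \dot x^j}\,dt = -2\operatorname{dist}_{(-v(x))a_{ij}(x)}(x, x_0).
\end{aligned} \tag{4.1}
\]
The first equality in (4.1) is a direct consequence of Theorem 3. To obtain the second one, let us consider a family of regularized functions $v_\kappa(x) = v(x) - \kappa$, $\kappa > 0$. We have
\[
\begin{aligned}
\inf_{T > 0}\ \inf_{x(\cdot):\ x(0) = x,\ x(T) = x_0} \int_0^T \big( a_{ij}(x(t))\,\dot x^i \dot x^j - v_\kappa(x(t)) \big)\,dt &= \inf_{T > 0}\ \inf_{x(\cdot):\ x(0) = x,\ x(1) = x_0} \int_0^1 \left( \frac{1}{T} a_{ij}(x(t))\,\dot x^i \dot x^j - T v_\kappa(x(t)) \right) dt \\
&\ge 2 \inf_{x(\cdot):\ x(0) = x,\ x(1) = x_0} \int_0^1 \sqrt{-v_\kappa(x(t))\,a_{ij}(x(t))\,\dot x^i \dot x^j}\,dt.
\end{aligned}
\]
Now, for any fixed curve $x(t)$, $x(0) = x$, $x(1) = x_0$, we consider the equation
\[
\dot\tau = T \sqrt{\frac{-v_\kappa(x(\tau(t)))}{a_{ij}(x(\tau(t)))\,\dot x^i \dot x^j}}, \qquad \tau(0) = 0,
\]
and choose $T$ in such a way that $\tau(1) = 1$. Changing the parametrization, $z(t) = x(\tau(t))$, gives
\[
\begin{aligned}
\int_0^1 \left( \frac{1}{T} a_{ij}(z(t))\,\dot z^i \dot z^j - T v_\kappa(z(t)) \right) dt &= \int_0^1 \left( \frac{\dot\tau}{T} a_{ij}(x(\tau))\,\dot x^i \dot x^j - \frac{T}{\dot\tau} v_\kappa(x(\tau)) \right) d\tau \\
&= 2 \int_0^1 \sqrt{-v_\kappa(x(\tau))\,a_{ij}(x(\tau))\,\dot x^i \dot x^j}\,d\tau.
\end{aligned}
\]
Thus, the relation
\[
\inf_{T > 0}\ \inf_{x(\cdot):\ x(0) = x,\ x(1) = x_0} \int_0^1 \left( \frac{1}{T} a_{ij}(x(t))\,\dot x^i \dot x^j - T v_\kappa(x(t)) \right) dt = 2 \inf_{x(\cdot):\ x(0) = x,\ x(1) = x_0} \int_0^1 \sqrt{-v_\kappa(x(t))\,a_{ij}(x(t))\,\dot x^i \dot x^j}\,dt
\]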
holds. Passing to the limit as κ → 0 we obtain (4.1). The theorem is proved.
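The two steps of this argument (the lower bound for every parametrization and equality after reparametrization) can be checked numerically on a toy case. The sketch below is an illustration under assumptions not made in the paper: one space dimension in a local chart, $a_{ij} \equiv 1$, the hypothetical regularized potential $-v_\kappa(x) = x^2 + \kappa$ with $\kappa = 0.1$, start point $x = 1$ and end point $x_0 = 0$. The helper names `trapz`, `action` and `length` are introduced here for illustration only.

```python
import numpy as np

def trapz(y, x):
    """Trapezoid rule on a possibly non-uniform grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

kappa = 0.1
neg_v = lambda x: x ** 2 + kappa      # -v_kappa(x) > 0 (hypothetical potential)
x_start, n = 1.0, 4001

def action(T, t, x):
    """Integral over [0, 1] of (1/T) * xdot**2 - T * v_kappa(x)."""
    xdot = np.gradient(x, t)
    return trapz(xdot ** 2 / T + T * neg_v(x), t)

def length(t, x):
    """Twice the length of the curve in the metric (-v_kappa) * a_ij."""
    xdot = np.gradient(x, t)
    return 2.0 * trapz(np.sqrt(neg_v(x)) * np.abs(xdot), t)

# (a) Straight-line parametrization: even the best T only gives an upper value.
t_lin = np.linspace(0.0, 1.0, n)
x_lin = x_start * (1.0 - t_lin)
A = trapz(np.gradient(x_lin, t_lin) ** 2, t_lin)
B = trapz(neg_v(x_lin), t_lin)
T_star = np.sqrt(A / B)               # minimizes A / T + T * B over T > 0
action_lin = action(T_star, t_lin, x_lin)
length_lin = length(t_lin, x_lin)

# (b) Reparametrize the same segment so that |zdot| = T * sqrt(-v_kappa(z)).
xs = np.linspace(0.0, x_start, n)
f = 1.0 / np.sqrt(neg_v(xs))
G = np.concatenate(([0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * np.diff(xs))))
T_opt = G[-1]                         # total travel time fixes T
t_opt = (1.0 - G / T_opt)[::-1]       # increasing times from 0 to 1
z_opt = xs[::-1]                      # runs from x_start down to 0
action_opt = action(T_opt, t_opt, z_opt)

print(action_lin, length_lin, action_opt)
```

In this example the action of the straight-line parametrization stays strictly above twice the length, while the reparametrized curve attains it up to discretization error, which mirrors the inequality and the subsequent equality in the proof.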
Acknowledgement. This paper was written during the author's visit to Augsburg University, which was an enjoyable experience. I express my deep gratitude to Professor Jochen Brüning and the staff of the Institut für Mathematik der Universität Augsburg for their hospitality. This work was supported in part by Volkswagen Foundation "Cooperation between Mathematicians from the former Soviet Union and Germany".
References
1. Ikeda, N., Watanabe, S.: Stochastic differential equation and diffusion process. Amsterdam: North-Holland Publ. Co., 1989
2. Freidlin, M.I., Wentzell, A.D.: Random perturbations of dynamical systems. Berlin–Heidelberg–New York: Springer-Verlag, 1984
3. Gilbarg, D., Trudinger, N.S.: Elliptic Partial Differential Equations of Second Order. Berlin: Springer-Verlag, 1983
4. Kozlov, S.M.: Reducibility of quasiperiodic operators and homogenization. Trans. Moscow Math. Soc. 46, 99–123 (1983)
5. Kozlov, S.M., Piatnitski, A.L.: Effective diffusion for a parabolic operator with periodic potential. SIAM J. Appl. Math. 53:2, 401–418 (1993)
6. Krasnosel'skii, M.A., Lifshits, E.A., Sobolev, A.V.: Positive linear systems. The method of positive linear operators. Sigma Series in Applied Mathematics, 5. Berlin: Heldermann Verlag, 1989
7. Zhikov, V.V.: An asymptotic behaviour and stabilization of the solutions of second order parabolic equation with low order terms. Trans. Moscow Math. Soc. 46, 70–98 (1983)
8. Agmon, S.: On positive solutions of elliptic equations with periodic coefficients in R^n, spectral results and extensions to elliptic operators on Riemannian manifolds. Differential equations (Birmingham, Ala., 1983), North Holland Math. Stud. 92, Amsterdam: North Holland, 1984, pp. 7–17
9. Devinatz, A., Ellis, R., Friedman, A.: The asymptotic behavior of the first real eigenvalue problem of second order elliptic operators with a small parameter in the highest derivatives, II. Indiana Univ. Math. J. 23, N11, 991–1011 (1974)
10. Friedman, A.: The asymptotic behavior of the first real eigenvalue problem of second order elliptic operators with a small parameter in the highest derivatives, I. Indiana Univ. Math. J. 2, 1005–1015 (1973)
11. Wentzell, A.D.: The asymptotic behaviour of the largest eigenvalue of a second order elliptic differential operator with a small parameter multiplying the highest derivatives. Dokl. Akad. Nauk SSSR 202, 19–22 (1972)
12. Kozlov, S.M., Piatnitski, A.L.: Degeneration of effective diffusion in the presence of periodic potential. Ann. de l'Inst. Henri Poincaré, Probability and Statistics 32, N5, 571–587 (1996)
13. Agmon, S.: Lectures on exponential decay of solutions of second order elliptic equations: Bounds on eigenfunctions of n-body Schrödinger operators. Mathematical Notes 29. Princeton, NJ: Princeton University Press
14. Piatnitski, A.L.: Homogenization of singularly perturbed operators. GAKUTO International Series, Math. Sciences and Appl., 9, Homogenization and Application to Material Sciences, 1997, pp. 355–361
15. Markus, A.S.: Introduction to the spectral theory of polynomial operator pencils. Providence, RI: AMS, 1988

Communicated by Ya. G. Sinai
Commun. Math. Phys. 197, 553 – 569 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Wannier–Bloch Oscillators
Vincenzo Grecchi1, Andrea Sacchetti2
1 Dipartimento di Matematica, Università di Bologna, Piazza di Porta S. Donato 5, I-40127 Bologna, Italy. E-mail: [email protected].
2 Università di Modena, Dipartimento di Matematica, Via Campi 213/B, I-41100 Modena, Italy. E-mail: [email protected]
Received: 30 July 1997 / Accepted: 1 March 1998
Abstract: We consider a Wannier–Stark problem with only one ladder for weak field. We prove that a generic first-band state is a metastable state (Wannier–Bloch oscillator) oscillating because of a beating effect and decaying at the rate given by the imaginary part of the Wannier-Stark resonances. By this result we have at the same time the realization of the ideas of Bloch about the oscillations, of Wannier about the approximate quantization and of Zener about the metastability. Such oscillators, which generically perform a breathing mode motion in a large spatial region, have been experimentally observed.
1. Introduction

Let us consider the dynamics of an electron driven by a constant electric field in a one-dimensional crystal. The Hamiltonian operator is formally given by
\[
H_f = -\frac{d^2}{dx^2} + V(x) + f x, \tag{1}
\]
where V (x+2π) = V (x) and f > 0 is the electric field strength. This very basic problem has attracted a lot of interest since the beginning of quantum mechanics. Bloch [7], by means of a semiclassical approximation, suggested the validity of a kind of “acceleration theorem” (see [23] and [11]) for one-band states. More precisely, if the state is restricted to the first band and represented in the crystal momentum space by a function a1 (k) at t = 0, at a later time t, following Bloch, we expect that it will remain in the same band and it will be represented by a1 (k + τ ), where τ = f t is the adiabatic time. Since k is a torus variable, i. e. an angle, the uniform translation is a periodic motion (oscillation) of period tB = 1/f . The uniform translation of the crystal momentum k can be seen as an acceleration induced by the external electric field. As a direct consequence of his “acceleration theorem”, Bloch obtained oscillating bound states, usually called “Bloch oscillators”.
Later, Wannier [27] was able to find stationary states, the Wannier–Stark ladders (WSL's), simply by applying the single-band approximation. On the other hand, Zener [29] called into question the very existence of any kind of bound states because of the tunneling probability through the "tilted gap barrier". Thus, combining the ideas of Zener and Wannier we expect resonances: ladders of resonances. For many years the research in the field was mostly devoted to the proof of the existence of such resonances (see [1, 6, 12, 14]). At present, we have a proof of the existence of the resonant states, for any sufficiently small field, in the particular class of problems with a finite number of bands [15]. The proof was obtained by starting from the single-band approximation of Wannier and by using perturbation theory on the complex-translated problem [20]. Moreover, it was proved that the resonances are arranged on regular ladders parallel to the real axis and that the imaginary part of the resonances is exponentially small [17], as predicted by Zener, the Fermi Golden Rule, Kane and Blount [21], numerical and analytical researches [3, 5] and the adiabatic approximation [9, 10]. Hence, the dynamics of a generic single-band state is strongly affected by an infinite number of sharp resonances producing a global beating effect. Now, after many years of discussions on other aspects of the problem, we return to the original Bloch issue. The result is a synthesis of the different aspects of the problem giving a metastable oscillator, i.e. the Wannier–Bloch oscillator (WBO), explicitly related to the Wannier–Stark ladder of resonances and consequently to the Zener probability of decay. We also have a new version of the "acceleration theorem" of Bloch. The proof is based on the previous results about the Wannier–Stark ladder [15, 17].
Let us consider a Wannier–Stark problem with only two bands at $f = 0$ and a state initially restricted to the first band, so that in the crystal momentum representation we have $a = (a_1, a_2)$, with $a_2 = 0$ and $a_1 = a_1(k)$ smooth enough as specified below. Let the first band be $B_1 = [E_1^b, E_1^t]$ with band function $E_1(k)$ and mean value $\langle E_1 \rangle = 0$. We prove the following behavior (see Corollary 1) of the first band component $a_1^t(k)$ of the state in the crystal momentum representation for $f$ small and $\tau > \tau_1(f) = O(\log(1/f))$:
\[
a_1^t(k) \sim a_1(k + \tau)\, U_\tau(k) \exp(-iE_0(f)t), \tag{2}
\]
where $U_\tau(k) = \omega_{1,0}(k)/\omega_{1,0}(k + \tau)$ is a phase factor and $\omega_{1,0}(k)$ is the weak field approximation of the first band component of the simplest Stark–Wannier state: $\omega_{1,0}(k) = \exp[i\theta(k)/f]$, where $\theta(k) = \int_0^k E_1(q)\,dq$. $E_0(f)$ is the complex energy of a resonance of the ladder, whose imaginary part is exponentially small under certain conditions to be specified below (see Hypothesis ii)). In formula (2) we have the generic behavior of a first band state written as the product of three factors. The first one, $a_1(k + \tau)$, was introduced above as the Bloch approximation of the full oscillator. The second one, $U_\tau(k)$, is a non-trivial phase factor playing a relevant role in the spatial movements, and the third one is the time behavior given by the complex energy of a Wannier–Stark resonance. The explanation of the result is simple: the states of the resonances of the Wannier–Stark ladder are a basis in the first band space and force the state to decay at the rate given by the imaginary part. The beating effect among resonances produces the rotation of the original state divided by the resonance state, as can be seen by rewriting formula (2) in the following way:
\[
a_1^t(k)/\omega_{1,0}^t(k) \sim a_1(k + \tau)/\omega_{1,0}(k + \tau), \tag{3}
\]
where $\omega_{1,0}^t(k) = \omega_{1,0}(k) \exp(-iE_0(f)t)$ is the approximate Wannier–Stark state with its exact time dependence.
Now, let us discuss the localization properties of the WBO for small field and for times that are large but much smaller than its lifetime. Generically we have spatial localization of the WBO on a large breathing region defined by the adiabatic space variable $\xi = f x$. The proof is based on the stationary phase asymptotics given by the phase factor $U_\tau(k)$. We have spatial localization on the breathing region defined by $|\xi| < \xi^+(\tau)$, where $\xi^+(\tau) = \max_k \{E_1(k) - E_1(k+\tau)\}$ is such that $\xi^+(\tau) = \xi^+(\tau+1)$ and $0 = \xi^+(0) \le \xi^+(\tau) \le \xi^+(1/2) = |B_1|$. In this region the state is a finite combination of Bloch states with oscillating quasimomentum (see Theorem 2). In particular, if the state is localized in the k-space about a given point $k^0$ at $t = 0$, it will remain localized about the moving point $k^t = k^0 - \tau$, with a tunneling probability $p = -2\,\Im E_0(f)\,t_B$ for every value of $t$ such that $k^t - [k^t] = 1/2$, i.e. $E_1(k^t) = E_1^t$. In this case we have energy localization about $E_1(k^t)$ and spatial localization about the oscillating point $\xi^t = E_1(k^0) - E_1(k^t)$. Another particular Wannier–Bloch oscillator is an approximate Wannier–Stark state, defined by the $f$-dependent $a_1(k) = \omega_{1,0}(k)$. It actually does not oscillate: it is localized in energy about $E_0(0) = 0$ and has a spatial localization on the fixed “tilted band” region $-E_1^t < \xi < -E_1^b$ (see Remark 11 and [5]).

Let us recall that the WBO's have been observed experimentally and discussed in simple cases by an Aachen group [28]. Recently, a paper by Lyssenko et al. [24] has been published where the experimental observation of the spatial displacement of the WBO is reported. For analytical and numerical studies of the breathing of oscillators in different models we refer to [13] and [8].

This paper is organized as follows: in Sect. 2 we consider the crystal momentum representation of the Wannier–Bloch oscillators (see Theorem 1, Corollary 1 and Corollary 2); in Sect. 3 we give the proof of Theorem 1 following the general method of Herbst [19] (see also [26]) and using the peculiarity of our model, and we also prove the two Corollaries; in Sect. 4 we consider the x-representation of the Bloch oscillators and give the adiabatic limit (see Theorem 2). Our main results have been announced in [18].
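As an illustration of the breathing region described above, the following minimal sketch evaluates $\xi^+(\tau) = \max_k\{E_1(k) - E_1(k+\tau)\}$ numerically. The cosine band $E_1(k) = -(W/2)\cos(2\pi k)$, the width `W` and the function names are assumptions made only for this example; they are not the band function of the model.

```python
import numpy as np

W = 1.0  # assumed band width |B_1| of the toy band


def band(k):
    """Toy even, 1-periodic band function with zero mean (an assumption)."""
    return -0.5 * W * np.cos(2.0 * np.pi * k)


def xi_plus(tau, n_grid=4001):
    """Numerical max over the Brillouin zone of E1(k) - E1(k + tau)."""
    k = np.linspace(0.0, 1.0, n_grid)
    return float(np.max(band(k) - band(k + tau)))


if __name__ == "__main__":
    for tau in (0.0, 0.25, 0.5, 0.75, 1.0, 1.25):
        print(f"tau = {tau:4.2f}   xi_plus = {xi_plus(tau):.6f}")
```

For this toy band the maximum can also be computed by hand, $\xi^+(\tau) = W|\sin(\pi\tau)|$, which vanishes at integer $\tau$ and equals the band width at $\tau = 1/2$, in agreement with the bounds quoted above.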
2. The Wannier–Bloch Oscillators in the Crystal Momentum Representation

We consider the time-dependent one-dimensional Schrödinger equation

$$ i\frac{\partial \psi^t(x)}{\partial t} = -\frac{\partial^2 \psi^t(x)}{\partial x^2} + V(x)\psi^t(x) + f x\,\psi^t(x), \qquad f > 0, \tag{4} $$

with the choice of units such that $\hbar^2 = 2m = 1$, where $m$ is the mass of the electron (actually, for superlattices, $m$ denotes an effective mass, see [2]), and where the potential $V(x)$ is a periodic real-valued function with period $2\pi$, i.e. $V(x) = V(x + 2\pi)$. Let $H_f$ be the Wannier–Stark operator acting on $L^2(\mathbb{R}, dx)$ and formally defined as

$$ H_f = H_B + f x, \qquad H_B = -\frac{d^2}{dx^2} + V(x). \tag{5} $$
V (x) is regular enough and Hf is a self-adjoint operator with purely absolutely continuous spectrum: σ(Hf ) = σac (Hf ) = R (see [4] and [22]). We assume:
Hypotheses.

i) The potential $V(x)$ is an even function, i.e. $V(x) = V(-x)$, and the Bloch operator $H_B$ has the first gap open and the others closed, i.e. $\sigma(H_B) = [E_1^b, E_1^t] \cup [E_2^b, +\infty)$, $E_1^t < E_2^b$;

ii) Let $k_c$ and $\bar k_c$ be the Kohn branch points connecting the two band functions, i.e. $E_1(k) = E_2(k)$ at $k = k_c$ and $k = \bar k_c$, where $E_\ell(k)$, $\ell = 1, 2$, are the two band functions. We assume that the two anti-Stokes lines, starting from the branch point $\bar k_c$ and such that $\Im \int_{\bar k_c}^{k} [E_2(q) - E_1(q)]\,dq = 0$, belong to the strip $|\Im k| < |\Im k_c|$ (except at the starting point).

As a consequence of Hypothesis i), the potential $V(z)$ is an analytic function in the strip $|\Im z| < \lambda_0$ for some $\lambda_0 > 0$ (see Theorem XIII.91(d) in [25]). As an example satisfying Hypothesis i) we have $V(x) = V_\rho(x) = 2\rho^2\,\mathrm{sn}^2(x;\rho)$, where $\mathrm{sn}(x;\rho)$ is a Jacobian elliptic function with modulus $\rho \in (0,1)$. For a certain value of $\rho$ it is proved that $V_\rho$ satisfies Hypothesis ii) too [17].

Let us recall the crystal momentum representation (hereafter CMR, for details see [17]) as a necessary tool in order to understand the result. Let $\mathcal{H} = \mathcal{H}_1 \oplus \mathcal{H}_2$, where $\mathcal{H}_1 = L^2(B, dk)$ with periodic boundary conditions, $B = \mathbb{R}/\mathbb{Z}$ (with representative values taken in $[0,1)$) is the Brillouin zone, and $\mathcal{H}_2 = L^2(\mathbb{R}, dp)$. We consider the unitary transformation $U$ from $L^2(\mathbb{R}, dx)$ onto $\mathcal{H}$ defined on $\zeta \in S(\mathbb{R})$, the space of smooth functions of rapid decrease, by $U\zeta = (a_1, a_2)$, where

$$ a_1(k) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \bar\varphi_1(x,k)\,\zeta(x)\,dx, \qquad a_2(p) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \bar\varphi_2(x,p)\,\zeta(x)\,dx, $$

with inverse

$$ \zeta(x) = \frac{1}{\sqrt{2\pi}} \left[ \int_{B} a_1(k)\varphi_1(x,k)\,dk + \int_{\mathbb{R}} a_2(p)\varphi_2(x,p)\,dp \right], \tag{6} $$

and extended to the whole Hilbert space $L^2(\mathbb{R}, dx)$ by continuity. Here $\varphi_j(x,p) = e^{ipx}u_j(x,p)$, $u_j(x,p) = u_j(x+2\pi,p)$, $j = 1, 2$, are the two Bloch functions. In the CMR the operator $H_f$ takes the form (still denoted by $H_f$)

$$ H_f = H_f^{DB} + f X, \qquad H_f^{DB} = \mathrm{diag}(H_1, H_2), $$

where

$$ H_1 = i f \frac{d}{dk} + E_1(k), \qquad H_2 = i f \frac{d}{dp} + E_2(p), $$

and where $X$ is a bounded operator representing the coupling term between the two bands. The decoupled band approximation $H_f^{DB}$, obtained by neglecting the coupling term $X$ in $H_f$, gives a ladder of simple eigenvalues $\langle E_1\rangle + 2\pi j f$, $j \in \mathbb{Z}$, with eigenvectors $\omega_j = (\omega_{1,j}, \omega_{2,j}) \in \mathcal{H}$, $j \in \mathbb{Z}$, where

$$ \omega_{1,j}(k) = [e_1(k)]^{-j} \exp\left\{ \frac{i}{f} \int_0^k \big(E_1(q) - \langle E_1\rangle\big)\,dq \right\}, \qquad \omega_{2,j}(p) \equiv 0, $$

and where $e_1(k) = \exp(i2\pi k)$. In the following we fix the energy constant in the potential $V$ to have $\langle E_1\rangle = 0$.
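The ladder of the decoupled band approximation can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a toy band $E_1(k) = -(W/2)\cos(2\pi k)$ (so that $\langle E_1\rangle = 0$) and an arbitrary field value `F`, builds $\omega_{1,j}$ from the formula above, and verifies with a centered finite difference that $H_1\omega_{1,j} = 2\pi j f\,\omega_{1,j}$. The names `band`, `omega`, `apply_h1` and `ladder_residual` are hypothetical helpers introduced here.

```python
import numpy as np

W = 1.0   # assumed band width of the toy band
F = 0.05  # assumed field strength f


def band(k):
    """Toy band E1(k) with zero mean over the Brillouin zone (assumption)."""
    return -0.5 * W * np.cos(2.0 * np.pi * k)


def omega(k, rung, f=F):
    """omega_{1,rung}(k) = e1(k)^(-rung) * exp((i/f) * int_0^k E1(q) dq)."""
    primitive = -0.5 * W * np.sin(2.0 * np.pi * k) / (2.0 * np.pi)
    return np.exp(-2j * np.pi * rung * k) * np.exp(1j * primitive / f)


def apply_h1(func, k, f=F, step=1e-6):
    """(H1 a)(k) = i f a'(k) + E1(k) a(k), derivative by central difference."""
    derivative = (func(k + step) - func(k - step)) / (2.0 * step)
    return 1j * f * derivative + band(k) * func(k)


def ladder_residual(rung, f=F):
    """Max over a k-grid of |H1 omega - 2 pi rung f omega|."""
    k = np.linspace(0.0, 1.0, 201)
    lhs = apply_h1(lambda q: omega(q, rung, f), k, f)
    rhs = 2.0 * np.pi * rung * f * omega(k, rung, f)
    return float(np.max(np.abs(lhs - rhs)))


if __name__ == "__main__":
    for rung in (-2, 0, 3):
        print(f"j = {rung:2d}   residual = {ladder_residual(rung):.2e}")
```

The residuals are limited only by the finite-difference step, and the functions are periodic on $B$ because the primitive of the zero-mean band vanishes at $k = 1$.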
For $f$ small enough and by means of the regular perturbation theory it was proved [15] that this ladder of simple eigenvalues of $H_f^{DB}$ becomes a ladder $E_j(f) = E_0(f) + 2\pi j f$, $j \in \mathbb{Z}$, of resonances of $H_f$, with

$$ \Re E_0(f) = O(f^2) \quad\text{and}\quad \Im E_0(f) = O(f^\infty) \tag{7} $$

as $f$ goes to zero. Let $\psi^t(x)$ be the solution of Eq. (4) satisfying the initial condition $\psi^0(x) = \psi(x)$ for a given $\psi \in L^2(\mathbb{R}, dx)$. Let $a = (a_1, a_2) = U\psi$, $a^t = (a_1^t, a_2^t) = U\psi^t$ and $b = (b_1, b_2) = U\gamma$, where $\gamma \in L^2(\mathbb{R}, dx)$ is a test function. Our main result, stated below, concerns the asymptotic behavior of

$$ \alpha(t) := \langle \gamma, \psi^t\rangle_{L^2(\mathbb{R},dx)} = \langle b, a^t\rangle_{\mathcal{H}} \tag{8} $$
when the initial state $\psi$ and the test state $\gamma$ are both prepared in the first band, i.e. $a_2 = b_2 \equiv 0$.

Theorem 1. Let the Hypothesis i) be satisfied, let $\tau = f t$ and let $T_\tau$ be the translation operator in $\mathcal{H}_1$: $(T_\tau a_1)(k) = a_1(k+\tau)$. Then there exists $f_0 > 0$ such that for any fixed $a_1, b_1 \in D(H_1)$, for any $f \in (0, f_0)$ and $t \ge 0$ we have

$$ \alpha(t) = e^{-iE_0 t} h(\tau) + R(t), \tag{9} $$

where $h(\tau)$ is a periodic function with period 1 such that

$$ h(\tau) = \langle b_1, \omega_{1,0} T_\tau (a_1/\omega_{1,0})\rangle_{\mathcal{H}_1} + O(f), \tag{10} $$

as $f$ goes to zero. Moreover, there exists a positive constant $c_1$, independent of $f$ and $t$, such that

$$ |R(t)| \le c_1 f^{-3} e^{-\lambda_0 \tau/2}. \tag{11} $$
The proof of Theorem 1 essentially follows the one given by Herbst [19] for a Stark potential superimposed on a Coulomb-type potential. We have adapted the method to our case where the Coulomb-type potential, vanishing at infinity, is replaced by a periodic potential, and there are infinitely many resonances periodically arranged in a strip parallel to the real axis. We collect here some remarks and two corollaries.

Corollary 1. Let $\tau_1(f) := (6/\lambda_0 + \epsilon)\log(1/f)$, for any fixed $\epsilon > 0$. We have: $R(t) \to 0$ as $f$ goes to zero and $t$ goes to $+\infty$ such that $f t = \tau > \tau_1(f)$. In particular, keeping $\tau > \tau_1(f)$, we have

$$ a_1^t \overset{w}{\sim} e^{-iE_0 t}\,\omega_{1,0} T_\tau (a_1/\omega_{1,0}) = e^{-iE_0 t}\,U_\tau T_\tau a_1, \tag{12} $$

as $f$ goes to zero, where $U_\tau$ is the multiplication operator given by

$$ (U_\tau a_1)(k) = U_\tau(k)\,a_1(k), \qquad U_\tau(k) = \frac{\omega_{1,0}(k)}{\omega_{1,0}(k+\tau)} = \exp\left\{ \frac{i}{f}\int_{k+\tau}^{k} E_1(q)\,dq \right\}. $$
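The threshold $\tau_1(f)$ is what makes the bound (11) vanish. The short computation below spells this out; it uses only (11) and the definition of $\tau_1(f)$ in Corollary 1.

```latex
\begin{align*}
\tau > \tau_1(f) = \Big(\frac{6}{\lambda_0}+\epsilon\Big)\log\frac{1}{f}
\quad\Longrightarrow\quad
e^{-\lambda_0\tau/2} &< e^{-(3+\epsilon\lambda_0/2)\log(1/f)} = f^{\,3+\epsilon\lambda_0/2},\\
|R(t)| \le c_1 f^{-3}e^{-\lambda_0\tau/2} &< c_1 f^{\,\epsilon\lambda_0/2}\longrightarrow 0
\quad\text{as } f\to 0 .
\end{align*}
```

The factor $6/\lambda_0$ thus compensates the prefactor $f^{-3}$, and any fixed $\epsilon > 0$ supplies the decay.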
Remark 1. Eq. (12) can be written as (see Eq. (28) below)

$$ a_1^t \overset{w}{\sim} \Psi^t_{1,0}\, T_\tau\, a_1 \bar\Psi_{1,0}, \tag{13} $$

where the term $\Psi^t_{1,0}$ is the Wannier–Stark resonant state restricted to the first band, $\Psi_{1,0} = \Psi^0_{1,0}$, and the other term $T_\tau a_1 \bar\Psi_{1,0}$ represents the beating effect among the WS states, which is a translation on the torus. We should notice that the beating effect is not affected by the decay since the lifetime is much longer than the beating period, for $f$ small enough.

Remark 2. Let Hypothesis ii) be satisfied. The asymptotics (7) of the imaginary part of the resonance can be improved. In [17] it has been proved that

$$ \Im E_0(f) = -\tfrac{1}{2} f e^{-2\rho_Z}\,(1 + o(1)), \quad \text{as } f \text{ goes to zero}, $$

where

$$ \rho_Z = \frac{1}{8f} \int_0^{2\pi} [V(x) - \langle V\rangle]^2\,dx. $$
Remark 3. The physical meaning of Corollary 1 concerns the general behavior of a regular state of the first band as a Wannier–Bloch oscillator. In particular, if the state is localized in the k-space about a given point $k^0$ at $t = 0$, it will remain localized about the moving point $k^t = k^0 - \tau$. Under Hypothesis ii) and from the decay rate we compute the Zener probability of tunneling $p$ per period, which coincides with the value suggested by Kane and Blount [21]:

$$ p = 2 t_B\,|\Im E_0(f)| = e^{-2\rho_Z(f)}\,(1 + o(1)), \quad \text{as } f \to 0, \tag{14} $$

where $t_B = 1/f$ is the period of the periodic motion (12) in the first band.

Corollary 2. Let $\tau_2(f) = C_n f^{-n}$ for any $n$ and some $C_n > 0$ or, under Hypothesis ii), let $\tau_2(f) = \exp(C/f)$ for $0 < C < f\rho_Z$ fixed. Let us take a different gauge in order to eliminate the phase factor $\exp(-i\,\Re E_0(f)\,t)$ in (12). We have

$$ T_{-\tau} U_\tau^{-1} a_1^t \xrightarrow{\ \|\cdot\|_{\mathcal{H}_1}\ } a_1 \tag{15} $$

as $f$ goes to 0 and $t$ goes to $+\infty$ such that $\tau_1(f) < \tau = f t < \tau_2(f)$.

Remark 4. From the essential constancy of the norm of $a_1^t$ it follows that the state, initially prepared in the first band, is still there for $f$ vanishing and $\tau$ diverging as specified above. That is, $a_2^t \to a_2 = 0$ in the norm of $\mathcal{H}_2$ in the same limit. Moreover, if we take only integer values for $\tau$, in the same limit as above for $f$ and $\tau$, we have:

$$ a^t \xrightarrow{\ \|\cdot\|_{\mathcal{H}}\ } a. \tag{16} $$

Remark 5. Formula (15) represents a new version of the “acceleration theorem” and a new kind of Bloch oscillators. If we consider the function $a_1(k)$ in the limit case of a Dirac delta $\delta(k - k_0)$, so that in the x-representation we have a Bloch state, we recover the classical versions of the Bloch oscillator and of the “acceleration theorem” as given by Kittel [23]. Actually in this limit the phase factor $U_\tau(k)$ can be taken as $k$-independent, i.e. physically negligible, so that it can be eliminated in (15). In such a case, the translation of the mean value $\langle\hat k\rangle^t + \tau \to \langle\hat k\rangle^0$, as can be easily verified from Corollary 2, strengthens to the translation of the exact value of the crystal momentum $k$: $k^t + \tau \to k^0$.
Remark 6. Let us consider the adiabatic behavior directly on $a_1^t$. For integer $\tau$, it is essentially stationary at the initial state value. For non-integer $\tau$, the term $a_1^t$ contains the phase factor $U_\tau(k)$, which quickly oscillates as $f$ goes to zero. There are two stationary phase points such that $E_1(k) = E_1(k+\tau)$, that is $k_1(\tau) = \frac{1}{2} - \frac{1}{2}(\tau - [\tau])$ and $k_2(\tau) = \frac{1}{2} + k_1(\tau)$, where $[\tau]$ denotes the integer part of $\tau$. Hence, we have the following behavior in the distributional sense:

$$ a_1^t(k) \sim \sqrt{i 2\pi f/|E_1'(k)|}\;(U_\tau T_\tau a_1)(k)\,\big[\delta(k - k_1(\tau)) - i\,\delta(k - k_2(\tau))\big] $$

as $f$ goes to zero, for non-integer values of $\tau$, where $\delta$ denotes the Dirac delta. That is, the WBO wavefunction is locally a linear combination of two Bloch states. Let us notice that this k-localization is naturally given by the vanishing of $f$, so that it is definitely different from the Kittel one imposed by the initial conditions. The meaning of this distributional limit will be clear in the x-representation and by extending the limit above to the adiabatic scale: $f x \to \xi$. So we will formally recover the conservation of the norm of the state.

Remark 7. Hypothesis i) can be weakened by assuming that the first $N$ gaps are open and the others are closed. In such a case (9) and (10) are still true (with a different bound for $R(t)$) for any $f$ small enough and such that the ladders of resonances do not cross. When the ladders cross, a similar formula could be obtained following the analysis, based on the degenerate perturbation theory, performed in [16] for a three-band model.

3. Proofs of Theorem 1 and of Corollaries

Proof of Theorem 1. As a first step, we collect some results about the WS resonances (see [17] for details). Resonances of $H_f$ are defined as complex eigenvalues of the non-self-adjoint operator $H_f^\lambda$, $-f\lambda_0 < \Im\lambda < 0$, obtained from $H_f$ by means of an analytic translation [20]. This deformation leaves the space $\mathcal{H}_1$ unaffected and gives

$$ H_f^\lambda = \mathrm{diag}(H_1, H_2^\lambda) + f X^\lambda, $$

where $H_2^\lambda = H_2 + f\lambda$ and where $X^\lambda = (X^\lambda_{\ell m})_{\ell,m=1,2}$ with $X^\lambda_{11} \equiv 0$ and $X^\lambda_{12}$, $X^\lambda_{21}$, $X^\lambda_{22}$ defined in formulas (18), (19) and (26) in [17]. For $f$ small enough, there exists exactly one ladder $E_j(f) = E_0(f) + 2\pi j f$, $j \in \mathbb{Z}$, of simple eigenvalues of $H_f^\lambda$ in the strip $-f\lambda_0/2 < \Im z < 0$, that is of resonances of $H_f$, with associated eigenvectors $\Psi_j^\lambda = (\Psi_{1,j}, \Psi^\lambda_{2,j})$, with norm $\|\Psi_j^\lambda\|_{\mathcal{H}} = 1$. We recall also that

$$ \Psi_{1,0} = \hat\omega_0\,\mu_0, \qquad \|\mu_0 - 1\|_{\mathcal{H}_1} = O(f) \quad\text{and}\quad \|\Psi^\lambda_{2,j}\|_{\mathcal{H}_2} = O_\lambda(f) \tag{17} $$

as $f$ goes to zero, where $\hat\omega_0$ is the multiplication operator defined on $\mathcal{H}_1$ by $\omega_{1,0}(k)$ and $1(k) \equiv 1$ (see Lemma 1 in [17]), and that $\Psi_{1,j}(p)$ is an analytic function in the strip $|\Im p| < k_c$ containing the real axis. Moreover:

Lemma 1. Let $\hat e$ be the unitary operator defined on $(a_1, a_2) \in \mathcal{H}$ by $\hat e\,(a_1, a_2) = (\hat e_1 a_1, \hat e_2 a_2)$, where $\hat e_\ell$, $\ell = 1, 2$, is the multiplication operator defined on $\mathcal{H}_\ell$ by $e_\ell$, where $e_1(k) = e^{i2\pi k}$ and $e_2(p) = e^{i2\pi p}$. Then $\hat e^{\,j}\,\Psi_j^\lambda = \Psi_0^\lambda$.
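Proof of Lemma 1. The lemma immediately follows from the simplicity of the ladder and because

$$ \mathrm{diag}(H_1, H_2^\lambda)\,\hat e = \hat e\;\mathrm{diag}(H_1 - 2\pi f,\; H_2^\lambda - 2\pi f) \tag{18} $$

and

$$ X_{12}^\lambda\,\hat e_2 = \hat e_1 X_{12}^\lambda, \qquad X_{21}^\lambda\,\hat e_1 = \hat e_2 X_{21}^\lambda, \qquad X_{22}^\lambda\,\hat e_2 = \hat e_2 X_{22}^\lambda, \tag{19} $$

as one can directly verify from formulas (18), (19) and (26) in [17].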
Let $a_1, b_1 \in \mathcal{H}_1$ be absolutely continuous, and let $b = (b_1, 0)$, $a = (a_1, 0) \in \mathcal{H}$; let

$$ F^+(z) := \langle b, [H_f^\lambda - z]^{-1} a\rangle_{\mathcal{H}} = \langle b_1, [H_1 - z - f\mathcal{F}^+(z)]^{-1} a_1\rangle_{\mathcal{H}_1} $$

be defined for $\Im z > 0$, where the operator $\mathcal{F}^+(z)$ is defined as $\mathcal{F}^+(z) = X_{12}\, f\, [H_2 + f X_{22} - z]^{-1} X_{21}$ for $\Im z > 0$. $F^+(z)$ admits an analytic extension to $-f\lambda_0 < \Im z$ as

$$ F^+(z) \equiv F^{+,\lambda}(z) := \langle b_1, [H_1 - z - f\mathcal{F}^{+,\lambda}(z)]^{-1} a_1\rangle_{\mathcal{H}_1} $$

for any $\lambda$ such that $-\lambda_0 < \Im\lambda < \Im z$, where

$$ \mathcal{F}^{+,\lambda}(z) := X_{12}^\lambda\, f\, [H_2^\lambda + f X_{22}^\lambda - z]^{-1} X_{21}^\lambda. $$

Let us recall that (Lemma 10 in [15]) $\mathcal{F}^{+,\lambda}(z)$ is norm bounded and

$$ \|\mathcal{F}^{+,\lambda}(z)\,[H_1 - z]^{-1}\|_{L(\mathcal{H}_1)} \le C \tag{20} $$

for any $z \in \partial S_0$, where $S_0 = [-\pi f, \pi f] \times i[-f\lambda_0/2, f\lambda_0/2]$, for $f$ small enough and for some positive constant $C$ independent of $f$ and $z$. Similarly,

$$ F^-(z) = \langle b, [H_f^\lambda - z]^{-1} a\rangle_{\mathcal{H}} = \langle b_1, [H_1 - z - f\mathcal{F}^-(z)]^{-1} a_1\rangle_{\mathcal{H}_1} $$

is defined for $\Im z < 0$ and it admits an analytic extension to $\Im z < f\lambda_0$ as $F^-(z) = \langle b_1, [H_1 - z - f\mathcal{F}^{-,\lambda}(z)]^{-1} a_1\rangle_{\mathcal{H}_1}$, and (20) holds for $\mathcal{F}^{-,\lambda}(z)$ too. Then, the function $G(z) = F^+(z) - F^-(z)$ admits an analytic extension to the strip $-f\lambda_0 < \Im z < +f\lambda_0$. In the following let us drop $\lambda$, if not necessary, and let us denote by $C$ any positive constant. We now give the following preliminary results:

Lemma 2. Let $S_j$ be the complex box $S_j = [s(j), s(j+1)] \times i[-f\lambda_0/2, f\lambda_0/2]$, where $s(j) = (2j-1) f\pi$. Let $S = \cup_{j\in\mathbb{Z}}\,\partial S_j$, where $\partial S_j$ denotes the boundary of $S_j$. For $f$ small enough, it follows that

$$ |G(z)| \le C f^{-3} (|z|^2 + 1)^{-1} \tag{21} $$

for any $z$ belonging to $S$, for some positive constant $C$ independent of $f$ and $z$.
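Proof of Lemma 2. We set $F_j^\pm(z) = F^\pm(z + 2\pi j f)$ and we restrict our analysis to $z \in \partial S_0$ and large $j$. To this end we observe that

$$ F^\pm(z + 2\pi f) = \langle b_1, [H_1 - z - 2\pi f - f\mathcal{F}^\pm(z + 2\pi f)]^{-1} a_1\rangle_{\mathcal{H}_1} = \langle \hat e_1 b_1, [H_1 - z - f\mathcal{F}^\pm(z)]^{-1}\,\hat e_1 a_1\rangle_{\mathcal{H}_1} $$

from (18) and (19). Hence, it follows that

$$ F_j^\pm(z) = \langle \hat e_1^{\,j} b_1, [H_1 - z - f\mathcal{F}^\pm(z)]^{-1}\,\hat e_1^{\,j} a_1\rangle_{\mathcal{H}_1}. \tag{22} $$

From the first resolvent formula we obtain

$$ F_j^\pm(z) = A_j^\pm(z) + B_j^\pm(z) + C_j^\pm(z), $$

where

$$ A_j^+(z) = A_j^-(z) = \langle \hat e_1^{\,j} b_1, [H_1 - z]^{-1}\,\hat e_1^{\,j} a_1\rangle_{\mathcal{H}_1}, $$

$$ B_j^\pm(z) = \langle \hat e_1^{\,j} b_1, [H_1 - z]^{-1} f\mathcal{F}^\pm(z) [H_1 - z]^{-1}\,\hat e_1^{\,j} a_1\rangle_{\mathcal{H}_1} = f\,\langle [H_1 - \bar z]^{-1}\hat e_1^{\,j} b_1,\; \mathcal{F}^\pm(z)[H_1 - z]^{-1}\hat e_1^{\,j} a_1\rangle_{\mathcal{H}_1} $$

and

$$ C_j^\pm(z) = \langle \hat e_1^{\,j} b_1, [H_1 - z]^{-1} f Q^\pm(z) f [H_1 - z]^{-1}\,\hat e_1^{\,j} a_1\rangle_{\mathcal{H}_1} = f^2\,\langle [H_1 - \bar z]^{-1}\hat e_1^{\,j} b_1,\; Q^\pm(z)[H_1 - z]^{-1}\hat e_1^{\,j} a_1\rangle_{\mathcal{H}_1}, $$

where we set $Q^\pm(z) = \mathcal{F}^\pm(z)\,[H_1 - z - f\mathcal{F}^\pm(z)]^{-1}\,\mathcal{F}^\pm(z)$. Since

$$ \big([H_1 - z]^{-1}\hat e_1^{\,j} a_1\big)(k) = -\frac{i}{f}\,\phi(k) \left\{ \frac{1}{e^{iz/f} - 1} \int_B \phi^{-1}(q)\,a_1(q)\,e^{i2\pi q j}\,dq + \int_0^k \phi^{-1}(q)\,a_1(q)\,e^{i2\pi q j}\,dq \right\}, $$

where

$$ \phi(k) = \exp\left[ \frac{i}{f} \int_0^k (E_1(q) - z)\,dq \right], $$

and since $a_1$ is absolutely continuous, integration by parts yields

$$ \big|\big([H_1 - z]^{-1}\hat e_1^{\,j} a_1\big)(k)\big| \le C f^{-2} (|j| + 1)^{-1} \tag{23} $$

for some positive constant $C$ independent of $j$, $f$ and $k$; a similar bound holds for $[H_1 - z]^{-1}\hat e_1^{\,j} b_1$ too. From this and from (20) it follows that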
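$$ |B_j^\pm| \le f\,\big\|[H_1 - z]^{-1}\hat e_1^{\,j} a_1\big\|_{\mathcal{H}_1}\,\big\|[H_1 - \bar z]^{-1}\hat e_1^{\,j} b_1\big\|_{\mathcal{H}_1}\,\|\mathcal{F}^\pm(z)\| \le C f^{-3}(j^2 + 1)^{-1} $$

for any $z \in \partial S_0$, for some positive constant $C$ independent of $f$, $j$ and $z$. A similar bound for $C_j^\pm$ follows. From this and from the fact that $A_j^+ - A_j^- = 0$, (21) follows at once.

Lemma 3. Let $E_j(f) = E_0(f) + 2\pi j f$ be the ladder of resonances of $H_f$ contained in the strip $-f\lambda_0/2 < \Im z < 0$. It follows that the series

$$ A(t) := \sum_{j\in\mathbb{Z}} \mathrm{Res}\,F^+(z)\big|_{z=E_j}\,e^{-iE_j t} = e^{-iE_0 t}\,h(\tau) \tag{24} $$

is absolutely convergent for any $t$, where $\mathrm{Res}\,F^+(z)|_{z=E_j}$ denotes the residue of $F^+(z)$ at $z = E_j$. The function $h$ is a periodic function with period 1.

Proof of Lemma 3. Let $P_j^\lambda$ be the eigenprojection on the space spanned by $\Psi_j^\lambda$. We observe that

$$ q_j := \mathrm{Res}\,F^+(E_j) = \langle b, P_j^\lambda a\rangle_{\mathcal{H}} = \langle b, \Psi_j^\lambda\rangle_{\mathcal{H}}\langle \Psi_j^\lambda, a\rangle_{\mathcal{H}} = \langle b_1, \Psi_{1,j}\rangle_{\mathcal{H}_1}\langle \Psi_{1,j}, a_1\rangle_{\mathcal{H}_1} = \langle b_1, \hat e_1^{-j}\Psi_{1,0}\rangle_{\mathcal{H}_1}\langle \hat e_1^{-j}\Psi_{1,0}, a_1\rangle_{\mathcal{H}_1} = O(j^{-2}) $$

for $|j|$ large, as can be proven by integration by parts since $a_1$ and $b_1$ are absolutely continuous functions. Hence, the Fourier series

$$ h(\tau) = \sum_{j\in\mathbb{Z}} q_j\,e^{-i2\pi j\tau} $$

is absolutely convergent for any $\tau$.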
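The Fourier series of Lemma 3 can be resummed into the translated inner product appearing in (10). The following sketch checks this identity on a uniform grid, replacing the resonant state by $\omega_{1,0}$ (its leading order according to (17)). The toy band $E_1(k) = -\tfrac12\cos(2\pi k)$, the field value `F`, the test functions and all function names are assumptions made for the illustration.

```python
import numpy as np

N = 64    # grid points in the Brillouin zone B = [0, 1)
F = 0.05  # assumed field strength f
K = np.arange(N) / N


def inner(u, v):
    """Discretized <u, v> on H1, conjugate-linear in the first argument."""
    return np.sum(np.conj(u) * v) / N


def omega0(k, f=F):
    """omega_{1,0}(k) for the assumed toy band E1(k) = -0.5 cos(2 pi k)."""
    return np.exp(-1j * 0.5 * np.sin(2.0 * np.pi * k) / (2.0 * np.pi * f))


def h_series(a1, b1, shift):
    """h(tau) = sum_n q_n exp(-2 pi i n tau) at tau = shift / N."""
    tau = shift / N
    total = 0.0 + 0.0j
    for n in range(-N // 2, N // 2):
        mode = np.exp(-2j * np.pi * n * K) * omega0(K)  # e1^(-n) omega_{1,0}
        q_n = inner(b1, mode) * inner(mode, a1)
        total += q_n * np.exp(-2j * np.pi * n * tau)
    return total


def h_translated(a1, b1, shift):
    """<b1, omega_{1,0} T_tau (a1 / omega_{1,0})> with tau = shift / N."""
    ratio = a1 / omega0(K)
    return inner(b1, omega0(K) * np.roll(ratio, -shift))


if __name__ == "__main__":
    a1 = np.exp(np.cos(2.0 * np.pi * K)) + 0.3j * np.sin(2.0 * np.pi * K)
    b1 = np.exp(np.sin(2.0 * np.pi * K))
    for shift in (0, 5, 32):
        diff = abs(h_series(a1, b1, shift) - h_translated(a1, b1, shift))
        print(f"tau = {shift / N:.4f}   |series - translated| = {diff:.2e}")
```

On the grid the two expressions agree to rounding error for shifts that are multiples of $1/N$, because the $N$ discrete modes $e_1^{-n}$ form a complete orthonormal set for the discretized inner product.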
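Now we are ready to conclude the proof of Theorem 1. From the spectral theorem and since the spectrum of $H_f$ is purely absolutely continuous, we can write

$$ \alpha(t) = \langle b, e^{-iH_f t} a\rangle_{\mathcal{H}} = \frac{1}{2\pi i}\int_{-\infty}^{+\infty} G(z)\,e^{-izt}\,dz = \lim_{N_1,N_2\to+\infty} I_{N_1,N_2}, \tag{25} $$

where $N_1, N_2 \in \mathbb{N}$ and

$$ I_{N_1,N_2} = \frac{1}{2\pi i}\int_{s(-N_1)}^{s(N_2+1)} G(z)\,e^{-izt}\,dz. $$

From the Cauchy Theorem we can replace the path of integration with the line $\gamma_{N_1,N_2} = \partial S_{N_1,N_2} \cap \mathbb{C}_-$, where $\mathbb{C}_- = \{z \in \mathbb{C} : \Im z \le 0\}$ and where $S_{N_1,N_2} = \cup_{j=-N_1}^{N_2} S_j$, obtaining

$$ I_{N_1,N_2} = \sum_{j=-N_1}^{N_2} \mathrm{Res}\,F^+(z)\big|_{z=E_j}\,e^{-iE_j t} + \frac{1}{2\pi i}\int_{\gamma_{N_1,N_2}} G(z)\,e^{-izt}\,dz. \tag{26} $$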
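In the limit $N_1, N_2 \to +\infty$ we have that the series converges (see Lemma 3) and the integral converges (see Lemma 2). Hence we obtain $\alpha(t) = A(t) + R(t)$, where

$$ R(t) = e^{-\lambda_0 t f/2}\,\frac{1}{2\pi i}\int_{-\infty}^{+\infty} G(z - i\lambda_0 f/2)\,e^{-izt}\,dz. \tag{27} $$

From Lemma 2 it follows that

$$ |R(t)| \le e^{-\lambda_0\tau/2}\,\frac{C}{2\pi f^3}\int_{-\infty}^{+\infty}\frac{dz}{z^2 + 1} \le c_1 f^{-3} e^{-\lambda_0\tau/2} $$

for some positive constant $c_1$ independent of $f$ and $t$. Moreover, it follows that

$$
\begin{aligned}
A(t) &= \sum_{j\in\mathbb{Z}} e^{-iE_0 t - i2\pi j\tau}\,\langle b_1, \hat e_1^{-j}\Psi_{1,0}\rangle_{\mathcal{H}_1}\langle \hat e_1^{-j}\Psi_{1,0}, a_1\rangle_{\mathcal{H}_1} \\
&= \sum_{j\in\mathbb{Z}} \langle \bar\Psi^t_{1,0} b_1, e_1^{-j}\rangle_{\mathcal{H}_1}\langle e_1^{-j} e^{i2\pi j\tau}, \bar\Psi_{1,0} a_1\rangle_{\mathcal{H}_1} \\
&= \sum_{j\in\mathbb{Z}} \langle \bar\Psi^t_{1,0} b_1, e_1^{-j}\rangle_{\mathcal{H}_1}\langle T_{-\tau}(e_1^{-j}), \bar\Psi_{1,0} a_1\rangle_{\mathcal{H}_1} \\
&= \sum_{j\in\mathbb{Z}} \langle \bar\Psi^t_{1,0} b_1, e_1^{-j}\rangle_{\mathcal{H}_1}\langle e_1^{-j}, T_\tau \bar\Psi_{1,0} a_1\rangle_{\mathcal{H}_1} \\
&= \langle b_1, \Psi^t_{1,0}\,T_\tau \bar\Psi_{1,0} a_1\rangle_{\mathcal{H}_1},
\end{aligned}
\tag{28}
$$

where $\Psi^t_{1,0} = e^{-iE_0 t}\,\Psi_{1,0}$ is the time evolution of the resonant state $\Psi_0^\lambda$ projected on the first band, and where we used that $\{e_1^j\}_{j\in\mathbb{Z}}$ is a complete orthonormal basis of $\mathcal{H}_1$. From (24) it follows that

$$ h(\tau) = \langle b_1, \Psi_{1,0}\,T_\tau \bar\Psi_{1,0} a_1\rangle_{\mathcal{H}_1} = \langle b_1, \omega_{1,0}\,T_\tau \bar\omega_{1,0} a_1\rangle_{\mathcal{H}_1} + O(f) $$

because of (17). Thus Theorem 1 is completely proved.

Proof of Corollary 1. We have that $R(t) = \langle b_1, \tilde R a_1\rangle_{\mathcal{H}_1}$, where $\tilde R$ is a bounded operator acting on $\mathcal{H}_1$ as

$$ \tilde R a_1 = P_1 e^{-iH_f t}\,\Pi_1 a_1 - \Psi^t_{1,0}\,T_\tau \bar\Psi_{1,0} a_1, $$

where $\Pi_1 a_1 = a$. Moreover, from (11) we have that

$$ R(t) = \langle b_1, \tilde R a_1\rangle_{\mathcal{H}_1} \to 0 \tag{29} $$

as $f$ goes to zero with $\tau > \tau_1(f)$ and for any $b_1 \in D(H_1)$. From this and from the fact that $D(H_1)$ is dense in $\mathcal{H}_1$ it follows that $\tilde R a_1 \overset{w}{\to} 0$ as $f$ goes to zero. Finally, (12) follows since $\Psi_{1,0} = \omega_{1,0}[1 + O(f)]$ by (17).

Proof of Corollary 2. From Corollary 1 it immediately follows that

$$ T_{-\tau} U_\tau^{-1} a_1^t \overset{w}{\to} a_1 $$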
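as $f$ goes to zero and $t$ goes to infinity, in such a way that $\tau_1 < \tau < \tau_2$. From this and from the following inequality:

$$ \|a_1^t\|_{\mathcal{H}_1} = \|P_1 e^{-iH_f t} a\|_{\mathcal{H}_1} \le \|e^{-iH_f t} a\|_{\mathcal{H}} = \|a\|_{\mathcal{H}} = \|a_1\|_{\mathcal{H}_1} = \|U_\tau T_\tau a_1\|_{\mathcal{H}_1} $$

we prove that the limit above holds in norm.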
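The last step is a standard Hilbert space argument, spelled out here for convenience. Write $x_f$ for the vector $T_{-\tau}U_\tau^{-1}a_1^t$ in the gauge of Corollary 2 (the symbol $x_f$ is introduced only for this computation). Since $T_{-\tau}$ and $U_\tau^{-1}$ are unitary and the gauge factor has modulus one, the inequality above gives $\|x_f\|_{\mathcal{H}_1} \le \|a_1\|_{\mathcal{H}_1}$, while the weak convergence gives $\langle a_1, x_f\rangle_{\mathcal{H}_1} \to \|a_1\|^2_{\mathcal{H}_1}$. Therefore:

```latex
\begin{align*}
\|x_f - a_1\|^2_{\mathcal H_1}
 &= \|x_f\|^2_{\mathcal H_1} + \|a_1\|^2_{\mathcal H_1}
    - 2\,\Re\langle a_1, x_f\rangle_{\mathcal H_1}\\
 &\le 2\|a_1\|^2_{\mathcal H_1} - 2\,\Re\langle a_1, x_f\rangle_{\mathcal H_1}
 \;\longrightarrow\; 2\|a_1\|^2_{\mathcal H_1} - 2\|a_1\|^2_{\mathcal H_1} = 0 .
\end{align*}
```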
4. The Wannier–Bloch Oscillators in the x Representation and the Adiabatic Behavior

Let $\psi^t(x)$ be the time dependent Wannier–Bloch oscillator (WBO) wavefunction in the x representation given by (see (6))

$$ \psi^t(x) = \frac{1}{\sqrt{2\pi}}\left[ \int_B a_1^t(k)\varphi_1(x,k)\,dk + \int_{\mathbb{R}} a_2^t(p)\varphi_2(x,p)\,dp \right]. $$

From Theorem 1 and the related remarks we are able to study the asymptotics of the WBO wavefunction in the x representation for any $t$ such that $\tau_1 < \tau < \tau_2$, where $\tau = f t$ and where $\tau_1$ and $\tau_2$ are defined in Corollaries 1 and 2. From Corollary 2 and Remark 4 it follows that $a_1^t \overset{w}{\to} U_\tau T_\tau a_1$ and $a_2^t \overset{w}{\to} 0$ in the above limit, hence the leading term of the WBO wavefunction $\psi^t(x)$ is a periodic function with respect to $t$, with period $t_B = 1/f$, given by

$$ \psi^t(x) \sim \frac{1}{\sqrt{2\pi}}\int_B \varphi_1(x,k)\,U_\tau(k)\,a_1(k+\tau)\,dk \tag{30} $$

as $f$ goes to zero. From this and from the result of Remark 6 it follows that, for any $x$ fixed and non-integer $\tau$, we have

$$ \psi^t(x) \sim \sqrt{f}\,\sum_{j=1}^{2}\sqrt{\frac{i}{E_1'(k_j)}}\;U_\tau(k_j)\,a_1(k_j+\tau)\,\varphi_1(x,k_j) \tag{31} $$

as $f$ goes to zero, where $k_j = k_j(\tau)$ are defined as in Remark 6. The asymptotics (31), given by the stationary phase theorem, is understood in the pointwise sense and it is uniform in a compact domain of the x-axis. Let us notice that the wave function vanishes as $\sqrt{f}$. In order to extend the study to a larger region and in order to have a non-vanishing function, it is useful to take the same limit in an adiabatic scale. To this end we consider the limit $f \to 0$ and $x \to \infty$ such that $f x \to \xi$ finite. We introduce the adiabatic spatial variable $\xi$ such that $x = \xi/f + y$, with $y$ belonging to a compact set, and introduce the function $\phi^t(\xi, y) = f^{-1/2}\psi^t(\xi/f + y) = f^{-1/2}\psi^t(x)$. From this and from (30) it follows

$$ \phi^t(\xi,y) \sim \frac{1}{\sqrt{2\pi f}}\int_B u_1(\xi/f + y, k)\,e^{i(\xi/f + y)k}\,e^{\frac{i}{f}\int_{k+\tau}^{k}E_1(q)\,dq}\,a_1(k+\tau)\,dk, $$
as $f$ goes to zero with $\tau_1(f) < \tau < \tau_2(f)$. For integer $\tau$, $\phi^t(\xi,y)$ approaches the initial function $\phi^0(\xi,y)$. For non-integer $\tau$ we apply the stationary phase theorem given in the following, suitable form:

Lemma 4. Let

$$ I(f) = \int_a^b u(x(f),k)\,e^{i\omega(k)/f}\,g(k)\,dk, \qquad -\infty < a < b < +\infty,\quad f > 0, \tag{32} $$

where $\omega$ is a real-valued function, $x(f)$ is any function with values in $\mathbb{R}/2\pi\mathbb{Z}$, $u$ is a periodic function with respect to $x$, and $u$, $g$ and $\omega$ are regular enough, i.e. $g(k) \in C^1([a,b])$, $u(x,\cdot) \in C^2([a,b])$, $\omega(k) \in C^2([a,b])$, $u(\cdot,k) \in C^1(\mathbb{R}/2\pi\mathbb{Z})$. Let $k_0 \in [a,b]$ be such that $\omega'(k_0) = 0$ and $\omega''(k_0) \ne 0$, and let $\omega'(k) \ne 0$ for any $k \ne k_0$. Then,

$$ I(f) = u(x(f),k_0)\int_a^b e^{i\omega(k)/f}\,g(k)\,dk + O(f) \tag{33} $$

$$ \phantom{I(f)} = \sqrt{2\pi f}\;u(x(f),k_0)\,e^{i\omega(k_0)/f}\,g(k_0)\,\sqrt{i s/|\omega''(k_0)|} + O(f) \tag{34} $$

as $f$ goes to zero, where $s$ is the sign of $\omega''(k_0)$.

Proof. From the Lagrange theorem we have

$$ u(x,k) = u(x,k_0) + (k - k_0)\,\tilde u(x,k), $$

where $\tilde u(x,k)$ and its derivative with respect to $k$ are continuous and bounded functions in $x \in \mathbb{R}/2\pi\mathbb{Z}$. Hence

$$ I(f) = u(x(f),k_0)\int_a^b e^{i\omega(k)/f}\,g(k)\,dk + R(f), $$

where

$$ R(f) = \int_a^b K(k,f)\,dk \quad\text{and}\quad K(k,f) = (k - k_0)\,\tilde u(x(f),k)\,g(k)\,e^{i\omega(k)/f}. $$

Let us assume, for the sake of definiteness, that $k_0 \in (a,b)$ and let, for $f$ small enough,

$$ R(f) = R_1(f) + R_2(f) + R_3(f) = \int_a^{k_0-\sqrt{f}} K(k,f)\,dk + \int_{k_0-\sqrt{f}}^{k_0+\sqrt{f}} K(k,f)\,dk + \int_{k_0+\sqrt{f}}^{b} K(k,f)\,dk. $$

Since $|K(k,f)| \le C_2|k - k_0|/2$ for some positive constant $C_2$, then $|R_2| \le C_2 f$. In order to estimate $R_1$ we integrate by parts obtaining
$$ R_1(f) = -i f\left[\frac{K(k,f)}{\omega'(k)}\right]_a^{k_0-\sqrt{f}} + i f\int_a^{k_0-\sqrt{f}} e^{i\omega(k)/f}\,\frac{d}{dk}\,\frac{\tilde u(x(f),k)\,g(k)}{\omega'(k)/(k - k_0)}\,dk, $$

where $\omega'(k)/(k - k_0)$ is a $C^1([a,b])$ function and is different from zero for any $k \in [a,b]$. Hence $K(k,f)/\omega'(k)$ is bounded and $(k - k_0)\,\tilde u(x,k)\,g(k)/\omega'(k) \in C^1([a,b])$, so that $|R_1| \le C_1 f$ for some positive constant $C_1$. A similar bound holds for $R_3$. The proof of (33) is thus complete. From this and from the stationary phase method (34) follows.

Remark 8. This result (with a different estimate of the remainder) can be easily extended to the different cases: $\omega(k) = \omega(k_0) + (k - k_0)^\alpha\,\tilde\omega(k)$ with $\alpha > 2$.

Let us consider the asymptotic behavior of $\phi^t(\xi,y)$ for $\xi$ fixed and $y$ in a compact set. We use Lemma 4 with $u_1$ for $u$, $k\xi + \int_{k+\tau}^{k}E_1(q)\,dq$ for $\omega(k)$ and $e^{iky}a_1(k+\tau)$ for $g(k)$. The stationary phase points are the solutions of the equation

$$ E_1(k) - E_1(k+\tau) + \xi = 0. \tag{35} $$
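Before Lemma 4 is applied to this phase, its leading term (34) can be sanity-checked on a toy integral. The sketch below is an illustration under explicit assumptions: it takes $u \equiv 1$, the phase $\omega(k) = k^2$ on $[-1,1]$ (so $k_0 = 0$, $\omega''(k_0) = 2$, $s = +1$) and the amplitude $g(k) = \cos k$, none of which come from the model, and compares a brute-force trapezoidal quadrature of (32) with (34).

```python
import numpy as np


def phase(k):
    """Assumed toy phase omega(k) = k^2 with a single stationary point k0 = 0."""
    return k ** 2


def amplitude(k):
    """Assumed smooth amplitude g(k)."""
    return np.cos(k)


def integral_numeric(f, a=-1.0, b=1.0, n_grid=400001):
    """Trapezoidal approximation of int_a^b exp(i omega(k)/f) g(k) dk."""
    k = np.linspace(a, b, n_grid)
    values = np.exp(1j * phase(k) / f) * amplitude(k)
    dk = k[1] - k[0]
    return dk * (np.sum(values) - 0.5 * (values[0] + values[-1]))


def integral_stationary_phase(f, k0=0.0, second_derivative=2.0):
    """Leading term (34); sqrt(i s) is evaluated as exp(i s pi / 4)."""
    s = np.sign(second_derivative)
    prefactor = np.sqrt(2.0 * np.pi * f / abs(second_derivative))
    oscillation = np.exp(1j * phase(k0) / f) * np.exp(1j * s * np.pi / 4.0)
    return prefactor * oscillation * amplitude(k0)


if __name__ == "__main__":
    for f in (1e-2, 1e-3):
        exact = integral_numeric(f)
        approx = integral_stationary_phase(f)
        print(f"f = {f:.0e}   |I - (34)| / f = {abs(exact - approx) / f:.3f}")
```

The printed ratio stays bounded as $f$ decreases, which is the $O(f)$ remainder of Lemma 4; in this toy case it is dominated by the endpoint contributions at $k = \pm 1$.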
We introduce the following periodic function

$$ \xi^+(\tau) = \max_{k\in B}\,[E_1(k) - E_1(k+\tau)]; \tag{36} $$

in particular, $\xi^+(\tau) = 0$ at $\tau$ integer and $\xi^+(\tau)$ has maximum value at $\tau = 1/2$ given by the first band width $|B_1| = E_1^t - E_1^b$. For given $\xi$ and $\tau$, Eq. (35) has solutions $k_j = k_j(\xi,\tau) \in B$, $j = 1, 2, \ldots, 2N$ for some $N = N(\xi,\tau) \ge 1$, provided that $\xi$ belongs to the interval $[-\xi^+(\tau), \xi^+(\tau)]$. From this and from the stationary phase theorem given in Lemma 4 it follows that for any fixed $\xi$ and $y$ taken in a compact set, the function $\phi^t(\xi,y)$ has an oscillating behavior for any $\tau \ne 0$ such that $\xi \in [-\xi^+(\tau), \xi^+(\tau)]$. In particular, if $\tau$ is such that

$$ g_\tau(k_j) := E_1'(k_j) - E_1'(k_j+\tau) \ne 0, \tag{37} $$

then (34) gives

$$ \phi^t(\xi,y) \sim \sum_{j=1}^{2N} C_\tau(k_j)\,U_\tau(k_j)\,a_1(k_j+\tau)\,\varphi_1(\xi/f + y, k_j) \tag{38} $$

as $f$ goes to zero, where $C_\tau(k_j) = \sqrt{i s_\tau(k_j)/|g_\tau(k_j)|}$ and $s_\tau(k)$ denotes the sign of $g_\tau(k)$. At $\xi = \pm\xi^+(\tau)$ Eq. (35) has a degenerate solution so that

$$ |\phi^t(\pm\xi^+(\tau), y)| = O(f^{-n/(4+2n)}) \tag{39} $$

as $f$ goes to zero, for some $n \ge 1$. A similar behavior could appear also at the points $\xi \in (-\xi^+(\tau), \xi^+(\tau))$ such that the solution $k_j(\xi,\tau)$ has multiplicity larger than one. The number of such points is finite because $E_1(k)$ is an analytic function. On the other hand, $\phi^t(\xi,y)$ as a function of $\xi$ has an exponentially decreasing behavior (modulo the error of the asymptotic formula) outside the interval $[-\xi^+(\tau), \xi^+(\tau)]$ because the solutions of Eq. (35) have imaginary part different from zero (see also Fig. 1). Going back to the WBO $\psi^t(x)$ we can summarize this result in the following theorem.
Fig. 1. In picture 1b the grey zone is the domain, in the $(\xi,\tau)$ plane, of localization of the Wannier–Bloch oscillator wavefunction. For any fixed non-integer $\tau$ the WBO wavefunction is localized in an interval $[-\xi^+(\tau), \xi^+(\tau)]$ and its amplitude attains a maximum value of order $O(f^{-1/6})$ at the endpoints of the interval; inside the interval the amplitude is of order $O(1)$. The interval $[-\xi^+(\tau), \xi^+(\tau)]$ is periodic with period 1 and it has a maximum amplitude given by $2(E_1^t - E_1^b)$, as suggested by the tilted band Zener picture (see picture 1a, where the grey zone denotes the bands). In picture 1b we draw the graph of the functions $\phi^t(\xi,0)$, where $\xi_1 = -\xi^+(\tau_1)$ and $\xi_2 = -\xi^+(\tau_2)$, for $a_1(k)$ symmetric.
Theorem 2. Let Hypotheses i) and ii) be satisfied. In the limit $f \to 0^+$ and for $t \to +\infty$ such that $f t = \tau \in (\tau_1(f), \tau_2(f))$, the WBO wavefunction $\psi^t(x)$ is periodic with respect to $t$ with period $t_B = 1/f$. Moreover, for $x \to \infty$ such that $f x \to \xi$ finite, we obtain the following adiabatic behavior of the leading term of the WBO wavefunction $\psi^t(x)$:

i) If we take for $\tau$ only integer values, the WBO wavefunction $\psi^t(x)$ approaches the initial state $\psi^0(x)$.

ii) For any fixed and non-integer $\tau$ the WBO wavefunction $\psi^t(x)$ is localized in an interval such that $\xi \in [-\xi^+(\tau), \xi^+(\tau)]$, where

$$ \xi^+(\tau) = \max_{k\in B}\,[E_1(k) - E_1(k+\tau)]. \tag{40} $$

The function $\xi^+(\tau)$ is a periodic function with period 1; it has minimum value 0 at $\tau = 0$ and maximum value $|B_1| = E_1^t - E_1^b$ at $\tau = 1/2$. The asymptotic behavior of the WBO is generically given by

$$ \psi^t(x) \sim \sqrt{f}\,\sum_{j=1}^{2N} C_\tau(k_j)\,U_\tau(k_j)\,a_1(k_j+\tau)\,\varphi_1(x,k_j) \tag{41} $$
for some $N \ge 1$, as $f$ goes to zero and $f x \in (-\xi^+(\tau), \xi^+(\tau))$, where $k_j$ are the solutions of Eq. (35) and the coefficients $C_\tau(k_j)$ are defined after Eq. (38). The amplitude of the WBO wavefunction $\psi^t(x)$ is generically of order $O(f^{1/2})$ at any $x \in (-\xi^+(\tau)/f, \xi^+(\tau)/f)$, in agreement with the normalization of $\psi^t(x)$. The amplitude of the wavefunction is generically of order $O(f^{1/3})$ at the boundary points $x = \pm\xi^+(\tau)/f$.

iii) For non-integer $\tau$, the WBO wavefunction $\psi^t(x)$ is locally given by a linear combination (38) of a finite number of Bloch states. In particular (Remark 6), at $\xi = 0$ we observe two Bloch states with two opposite quasimomentum values $k_1(\tau) = \frac{1}{2} - \frac{1}{2}(\tau - [\tau])$ and $k_2(\tau) = k_1(\tau) + 1/2$ moving on the torus with a velocity equal to 1/2.

We close with the following remarks.

Remark 9. In the limit case of the band function locally equal to the free one, $E_1(k) \approx C(k^2 - 1/12)$, $-1/2 < k \le 1/2$, we have $\xi^+(\tau) \approx C(\eta - \eta^2)$, where $\eta = \tau - [\tau]$, and at $0 < |\xi| < \xi^+(\tau)$ and $\eta \ne 0$ we have only two Bloch states with quasimomentum values $k_1(\tau,\xi) \approx -\xi/(2C\eta) - \eta/2$ and $k_2(\tau,\xi) \approx -\xi/(2C(\eta - 1)) - (\eta - 1)/2$.

Remark 10. From the above theorem it follows that the WBO wavefunction $\psi^t(x)$ has a “breathing” behavior. That is, the x-region in which $\psi^t(x)$ is localized periodically widens and shrinks according to the Bloch period $t_B = 1/f$. Actually, there are particular states much more localized in the x-space. Let us consider a state concentrated about $k^0$ at $t = 0$ and at $k^t = k^0 - \tau$ for intermediate time. The spatial localization is now much stronger and the region of localization is asymmetrical with respect to the origin. Indeed, from formula (41) and Eq. (35), we have spatial localization about the point defined by the adiabatic value $\xi^t = E_1(k^0) - E_1(k^t)$. For analytical and numerical analysis of the “breathing” spatial motion of oscillators see also [13] and [8].

Remark 11. A Wannier–Stark state in the small field approximation, $a_1(k) = \omega_{1,j}(k)$, is $f$-dependent and is localized in a time-independent way. The energy localization is about $E_j(0) = 2\pi\gamma$, where $\gamma$ is the adiabatic index, different from zero if $|j|$ increases with $1/f$ and $f j \to \gamma$. The spatial localization is on the fixed region $-E_1^t - 2\pi\gamma < \xi < -E_1^b - 2\pi\gamma$, where the wave function behaves as

$$ \psi^t(x) \sim \sqrt{i f/E_1'(k_1)}\,\big(a_1(k_1)\varphi_1(x,k_1) - i\,a_1(-k_1)\varphi_1(x,-k_1)\big), \tag{42} $$

where $k_1 = E_1^{-1}(-\xi - 2\pi\gamma)$ and the function $E_1^{-1}(E)$ is the inverse of $E_1(k)$ restricted to the interval $0 < k < 1/2$.

Acknowledgement. This work is partially supported by the Italian MURST and CNR-GNFM. One of us (AS) is supported by the program “Progetti di ricerca avanzata o applicata” of the University of Modena and the other one (VG) by the INFN.
References

1. Agler, J., Froese, R.: Existence of Stark ladder resonances. Commun. Math. Phys. 100, 161–171 (1985)
2. Altarelli, M.: Envelope function approach to electronic states in heterostructures. In: Interfaces, Quantum well and superlattice, ed. by R. Leavens and R. Taylor. London: Plenum Publ. Corp., 1988, pp. 43–66
3. Avron, J.: The lifetime of Wannier ladder states. Ann. Phys. 143, 33–53 (1982)
4. Bentosela, F., Carmona, R., Duclos, P., Simon, B., Souillard, B., Weder, R.: Schrödinger operators with an electric field and random or deterministic potentials. Commun. Math. Phys. 88, 387–397 (1983)
5. Bentosela, F., Grecchi, V., Zironi, F.: Approximate ladder of resonances in a semi-infinite crystal. J. Phys. C: Solid State Phys. 15, 7119–7131 (1982)
6. Bentosela, F., Grecchi, V.: Stark–Wannier ladders. Commun. Math. Phys. 142, 169–192 (1991)
7. Bloch, F.: Über die Quantenmechanik der Elektronen in Kristallgittern. Z. Phys. 52, 555–600 (1928)
8. Bouchard, A.M., Luban, M.: Bloch oscillations and other dynamical phenomena of electrons in semiconductor superlattices. Phys. Rev. B 52, 5105–5123 (1995)
9. Buslaev, V., Dmitrieva, L.: A Bloch electron in an external field. Leningrad Math. J. 1, 287–320 (1990)
10. Buslaev, V., Grigis, A.: Imaginary parts of Stark–Wannier resonances. J. Math. Phys. 39, 2520–2550 (1998)
11. Callaway, J.: Quantum theory of the solid state. New York: Academic Press, 1974
12. Combes, J.M., Hislop, P.: Stark ladder resonances for small electric field. Commun. Math. Phys. 140, 291–320 (1991)
13. Dignam, M., Sipe, J.E., Shah, J.: Coherent excitations in the Stark ladder: Excitonic Bloch oscillations. Phys. Rev. B 49, 10502–10513 (1994)
14. Grecchi, V., Maioli, M., Sacchetti, A.: Stark resonances in disordered systems. Commun. Math. Phys. 146, 231–240 (1992)
15. Grecchi, V., Maioli, M., Sacchetti, A.: Stark ladder of resonances: Wannier ladders and perturbation theory. Commun. Math. Phys. 159, 605–618 (1994)
16. Grecchi, V., Sacchetti, A.: Crossing and anticrossing of resonances: the Wannier–Stark ladders. Ann. Phys. 241, 258–284 (1995)
17. Grecchi, V., Sacchetti, A.: Lifetime of the Wannier–Stark resonances and perturbation theory. Commun. Math. Phys. 185, 359–378 (1997)
18. Grecchi, V., Sacchetti, A.: Metastable Bloch oscillators. Phys. Rev. Lett. 78, 4474–4477 (1997)
19. Herbst, I.: Exponential decay in the Stark effect. Commun. Math. Phys. 75, 197–205 (1980)
20. Herbst, I., Howland, J.: The Stark ladder and the other one-dimensional external electric field problems. Commun. Math. Phys. 80, 23–42 (1981)
21. Kane, E.O., Blount, E.: Interband tunneling. In: Tunneling phenomena in solids, ed. Burstein, E. and Lundqvist, S. New York: Plenum Press, 1969, pp. 79–91
22. Kiselev, A.: Absolutely continuous spectrum of perturbed Stark operators. Preprint mp-arc 97–201 (1997)
23. Kittel, C.: Quantum theory of solids. New York: John Wiley and Sons, second edition, 1987
24. Lyssenko, V.G., Valusis, G., Löser, F., Hasche, T., Leo, K., Dignam, M.M., Köhler, K.: Direct measurement of the spatial displacement of Bloch-oscillating electrons in semiconductor superlattices. Phys. Rev. Lett. 79, 301–304 (1997)
25. Reed, M., Simon, B.: Methods of Modern Mathematical Physics: IV Analysis of operators. New York: Academic, 1978
26. Simon, B.: Resonances in n-body quantum systems with dilation analytic potentials and the foundations of time-dependent perturbation theory. Ann. Math. 97, 247–274 (1973)
27. Wannier, G.H.: Wave functions and effective Hamiltonian for Bloch electrons in an electric field. Phys. Rev. 117, 432–439 (1960)
28. Waschke, C., Roskos, H.G., Schwedler, R., Leo, K., Kurz, H., Köhler, K.: Coherent submillimeter-wave emission from Bloch oscillations in a semiconductor superlattice. Phys. Rev. Lett. 70, 3318 (1993)
29. Zener, C.: A theory of electric breakdown of solid dielectrics. Proc. R. Soc. A 145, 523–529 (1934)

Communicated by B. Simon
Commun. Math. Phys. 197, 571 – 621 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Four-Dimensional Yang–Mills Theory as a Deformation of Topological BF Theory

A. S. Cattaneo (1,*), P. Cotta-Ramusino (2,3), F. Fucito (4), M. Martellini (3,5,6), M. Rinaldi (7), A. Tanzini (4,8), M. Zeni (3,5,**)

(1) Lyman Laboratory of Physics, Harvard University, Cambridge, MA 02138, USA
(2) Dipartimento di Matematica, Università di Milano, Via Saldini 50, 20133 Milano, Italy
(3) I.N.F.N. Sezione di Milano, Via Celoria 16, 20133 Milano, Italy
(4) I.N.F.N. Sezione di Roma II, Via Della Ricerca Scientifica, 00133 Roma, Italy
(5) Dipartimento di Fisica, Università di Milano, Via Celoria 16, 20133 Milano, Italy
(6) Landau Network at “Centro Volta”, Como, Italy
(7) Dipartimento di Matematica, Università di Trieste, Piazzale Europa 1, 34100 Trieste, Italy
(8) Dipartimento di Fisica, Università di Roma II “Tor Vergata”, Italy

(*) Supported by I.N.F.N. Grant No. 5565/95 and DOE Grant No. DE-FG02-94ER25228, Amendment A003.
(**) Supported by M.U.R.S.T. TMR program ERB-4061-PL-95-0789.

Received: 12 June 1997 / Accepted: 2 March 1998

Abstract: The classical action for pure Yang–Mills gauge theory can be formulated as a deformation of the topological BF theory where, besides the two-form field $B$, one has to add one extra field $\eta$ given by a one-form which transforms as the difference of two connections. The ensuing action functional gives a theory that is equivalent to the original Yang–Mills theory both at the classical and at the quantum level. In order to prove such an equivalence, it is shown that the dependence on the field $\eta$ can be gauged away completely. This gives rise to a field theory that, for this reason, can be considered as semi-topological, or topological in some but not all the fields of the theory. The symmetry group involved in this theory is an affine extension of the tangent gauge group acting on the tangent bundle of the space of connections. A mathematical analysis of this group action and of the relevant BRST complex is presented in detail.

1. Introduction

Among the open problems of quantum Yang–Mills (YM) theory, there is certainly the absence of any proof of the property of confinement, which is observed in nature for systems supposedly described by a YM Lagrangian, and which is proved true only in lattice formulations of the theory. The non-perturbative dynamics of gauge theories has been discussed at length in the literature from different points of view. More recently, some of the authors of this paper observed [8, 9] that the presence of a two-form field in the first-order formulation of YM theory might allow the construction of a surface observable that is related to 't Hooft's magnetic order operator [21]. A preliminary description of such surface observables can be found in [13, 7]; for a rigorous mathematical definition of these observables (in the case of paths of paths), see [11].
The study of first-order YM theory was originally proposed in [17] as a way of taking into account strong-coupling effects (after some manipulations). For a discussion of this topic and its development, see [19] and references therein.

Another aspect of first-order YM theory pointed out in [8] is its formal relationship with the topological BF theory [20, 5].¹ This suggested the possibility of finding Donaldson-like invariants [14] inside ordinary (not supersymmetric) YM theory with a mechanism similar to that described in [22].

The relation between YM and BF theory is actually more involved because the latter has fewer physical degrees of freedom than the former. In this paper we establish the correct relation by using, as an intermediate step, a new theory, called BF-Yang–Mills (BFYM) theory, which contains a new one-form field $\eta$ whose role is to provide the missing degrees of freedom.²

The first aim of this paper is to prove the classical and quantum equivalence between YM and BFYM theory. Cohomological proofs of this were considered in [18] and in [16]. In the present paper we give a different and more explicit proof by fixing the gauge of the theory in three different but equivalent ways, which we refer to as the trivial, the covariant and the self-dual gauge fixing. The reason for considering three different gauge fixings is that each of them provides a different setting for perturbation theory: in the trivial gauge, we have an expansion identical to that found in first-order YM theory (see [15]). In the covariant gauge, the perturbative expansion around a flat connection can be organized by using the same propagators as in the topological pure BF theory (in the same gauge). Finally, the perturbative expansion in the self-dual gauge in a neighborhood of anti-self-dual connections makes use of the propagators of the topological BF theory with a cosmological term (in the same gauge).

The relation between BFYM and the BF theories is then discussed in a more formal way in the subsequent section by using the Batalin–Vilkovisky (BV) formalism [2] (which is a generalization of the more familiar BRST formalism [3]). In particular, we show that, after a canonical transformation, it is possible to safely take the limit in which the YM coupling constant vanishes and obtain the pure BF theory plus the (covariant) kinetic term for the extra field $\eta$.

In the last part of the paper the geometrical aspects of BFYM are discussed. The group of symmetries of the BFYM theory (for an extended discussion, see [12]) turns out to be an affine extension of the tangent gauge group. The action of this group on the space of fields is not free, but a BRST complex is obtainable directly from the action of the tangent gauge group on the space of fields of the theory. The situation has both similarities and differences with respect to the case of topological gauge theories [4]. As in [4], the BRST equations are obtained as structure equations and Bianchi identities for the curvature of a suitable connection on the space of fields; but, differently from [4], the only symmetry for the connection $A$ is the gauge invariance, as in the YM theory. Hence the BFYM theory can be seen as “semi-topological” (or topological in the field $\eta$ and non-topological in the field $A$).

¹ For the study of BF theory with observables, see also [10].
² For an anticipation of some results of this paper by some of the authors, see [16].
2. Preliminaries

In this section we introduce YM theory both in its usual second-order and in its first-order formulation. We prove the equivalence of the two formulations and discuss the problems related to the weak-coupling perturbative expansion of the latter. Then we will raise the issue of the topological embedding for a gauge field theory and discuss some of its general properties. It will be the aim of the following sections to prove that, through the topological embedding, it is possible to define a weak-coupling perturbative expansion of first-order YM theory around the topological BF theory.

2.1. Second-order YM theory. Let $P \to M$ be a principal $G$-bundle. The manifold $M$ is a closed, simply connected, oriented four-manifold and $G$ is $SU(N)$ or, more generally, a simple compact Lie group. The standard (second-order) YM action is the following local functional:

$$ S_{YM}[A] = \frac{1}{4 g_{YM}^2} \langle F_A , F_A \rangle , \tag{1} $$

where $g_{YM}$ is a real parameter (known as the YM coupling constant), and $F_A$ is the curvature of the connection $A$. Here we consider the inner product $\langle \cdot , \cdot \rangle$ defined on the space $\Omega^*(M, \mathrm{ad}P)$ of forms on $M$ with values in the adjoint bundle $\mathrm{ad}P$ and given by

$$ \langle \alpha , \beta \rangle = \int_M \mathrm{Tr}\,(\alpha \wedge *\beta), \tag{2} $$
where $*$ is the Hodge operator on $M$. Even though the physical space–time is Minkowskian, we assume we have performed a Wick rotation so that $M$ is a Riemannian manifold; viz., it has a Euclidean structure.

The gauge group (or group of gauge transformations) is, by definition, the group $\mathcal{G}$ of maps $g : P \to G$ which are equivariant, i.e., which satisfy the equation $g(ph) = \mathrm{Ad}_{h^{-1}} g(p)$ for any $p \in P$ and $h \in G$. Locally, elements of $\mathcal{G}$ are represented by maps $M \to G$. Under a gauge transformation $g \in \mathcal{G}$, the connection $A$ transforms as

$$ A \to A^g = \mathrm{Ad}_{g^{-1}} A + g^{-1} dg, \tag{3} $$

while any form $\psi \in \Omega^*(M, \mathrm{ad}P)$ transforms as $\psi \to \mathrm{Ad}_{g^{-1}} \psi$. The Yang–Mills action is gauge-invariant, i.e., it is invariant under the action of $\mathcal{G}$. We denote the space of all connections by the symbol $\mathcal{A}$. With some restrictions, the group $\mathcal{G}$ acts freely on $\mathcal{A}$ and $\mathcal{A} \to \mathcal{A}/\mathcal{G}$ is a principal $\mathcal{G}$-bundle.³

2.2. Classical analysis. A classical solution is a minimum of the action (modulo gauge transformations), i.e., a solution of the equation

$$ d_A^* F_A = 0. \tag{4} $$

If we add a source $J(A)$ to the action, then the equations of motion become

$$ d_A^* F_A + 2 g_{YM}^2 J' = 0. $$

³ For instance we can consider only the space of irreducible connections and the action of the group obtained by dividing $\mathcal{G}$ by its center. But in order to avoid cumbersome notation we will keep on writing the quotient as $\mathcal{A}/\mathcal{G}$.
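The equation of motion with a source can be recovered from a first variation of the action. The following is only a sketch: the variation $a \in \Omega^1(M, \mathrm{ad}P)$ and the definition of $J'$ through the linear term of $J(A+a)$ are assumptions made here for illustration, and $d_A^*$ is taken to be the formal adjoint of $d_A$ with respect to the inner product (2).

```latex
% Sketch: first variation of S_YM + J under A -> A + a.
% Assumption: J(A + a) = J(A) + <a, J'> + O(a^2) defines J'.
\begin{align*}
F_{A+a} &= F_A + d_A a + O(a^2), \\
S_{YM}[A+a] - S_{YM}[A]
  &= \frac{1}{2 g_{YM}^2}\,\langle d_A a , F_A \rangle + O(a^2)
   = \frac{1}{2 g_{YM}^2}\,\langle a , d_A^* F_A \rangle + O(a^2), \\
\delta\,(S_{YM} + J) = 0 \ \text{for all } a
  &\iff d_A^* F_A + 2 g_{YM}^2\, J' = 0 .
\end{align*}
```

Setting $J = 0$ gives back equation (4).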
A particular class of solutions is given by the (anti)self-dual connections, i.e., connections whose curvature is (anti)self-dual (i.e., satisfies the equation $F_A = \pm * F_A$). We will denote by $\mathcal{M}_{YM}$ the moduli space of solutions of the YM equations of motion modulo gauge transformations.

2.3. Quantum analysis. If we denote by $\mathcal{A}$ the space of connections, then the partition function is defined as

$$ Z_{YM} = \int_{\mathcal{A}/\mathcal{G}} \exp(-S_{YM}). \tag{5} $$

The way physicists deal with this quotient in the quantum analysis is by introducing the BRST complex

$$ \begin{array}{c|ccc} & -1 & 0 & 1 \\ \hline 1 & & A & \\ 0 & \bar{c} & h_c & c \end{array} \tag{6} $$

(where each row has the same form-degree and each column has the same ghost-number), and the BRST transformations:

$$ sA = d_A c, \qquad sc = -\tfrac{1}{2} [c, c], \tag{7} $$
$$ s\bar{c} = h_c, \qquad s h_c = 0. \tag{8} $$

Notice that the first equation is just the infinitesimal version of the action (3) of $\mathcal{G}$ on $\mathcal{A}$, while the second is the Maurer–Cartan equation for the group $\mathcal{G}$. A (local) section $\mathcal{A}/\mathcal{G} \to \mathcal{A}$ is chosen by introducing a gauge fixing, i.e., by imposing a condition like, e.g., $d_{A_0}^* (A - A_0) = 0$ for $A$ belonging to a suitable neighborhood of $A_0$. Correspondingly, one introduces a gauge-fixing fermion $\Psi_{YM}$, i.e., a local functional of ghost number $-1$ given by⁴

$$ \Psi_{YM} = \langle \bar{c} , d_{A_0}^* (A - A_0) \rangle , \tag{9} $$

where $A_0$ is a background connection. In perturbative calculations, we work in a neighborhood of a critical solution; i.e., we assume that $A_0$ is a solution of the classical equations of motion. The original action is then replaced by

$$ S_{YM}^{g.f.} = S_{YM} + i\, s\Psi_{YM} , \tag{10} $$

and the functional integration is performed over the vector spaces to which the fields of the BRST complex belong (notice that the integration over the affine space of connections is replaced by an integration over the vector space $\Omega^1(M, \mathrm{ad}P)$ to which $A - A_0$ belongs).

To perform computations, it is also useful to assign each field a canonical (scaling) dimension so that the gauge-fixed action has dimension zero. Since the derivative, the volume integration and the BRST operator have respectively dimension 1, −4 and 0, we get the following table:

Dimension 0: $c$.
Dimension 1: $A$.
Dimension 2: $\bar{c}$, $h_c$.

Notice that the coupling constant $g_{YM}$ has dimension 0.

⁴ This way of implementing a gauge fixing is known as the Landau gauge. A more general gauge-fixing fermion implementing the same conditions is obtained by the replacement $\Psi_{YM} \to \Psi_{YM} + \lambda \langle \bar{c} , h_c \rangle$, where $\lambda$ is a free parameter. By integrating out $h_c$ and setting $\lambda = 1$, one recovers the Feynman gauge. In this paper we will consider only Landau gauges.

Perturbative expansion. The perturbative expansion of YM theory around a critical connection $A_0$ is performed by setting

$$ A = A_0 + \sqrt{2}\, g_{YM}\, \alpha. \tag{11} $$

By also using the gauge-fixing condition, the quadratic part of the action then reads

$$ \tfrac{1}{2} \langle \alpha , \check{\Delta}_{A_0} \alpha \rangle + i \langle h_c , d_{A_0}^* \alpha \rangle - i \langle \bar{c} , \Delta_{A_0} c \rangle , \tag{12} $$

where

$$ \check{\Delta}_{A_0} = \Delta_{A_0} + *[*F_{A_0} , \ \cdot\ ], \tag{13} $$

and $\Delta_{A_0}$ is the covariant Laplace operator. The $\alpha\alpha$-propagator is given by the inverse of $\check{\Delta}_{A_0}$.

2.4. The first-order formulation of YM theory. Now we consider the local functional

$$ S_{YM'}[A, E] = i \langle E , *F_A \rangle + g_{YM}^2 \langle E , E \rangle , \tag{14} $$
where $F_A$ is the curvature of the connection $A$ and $E \in \mathcal{B} \equiv \Omega^2(M, \mathrm{ad}P)$. As for the canonical dimension, $E$ is assigned dimension 2. The first-order YM theory, which we will prove in a moment to be equivalent to YM theory both at the classical and at the quantum level, is particularly interesting because of the new independent field $E$, which allows the introduction of new observables which depend on loops of paths on $M$ (or on the spanned surfaces) and could not be defined in ordinary YM theory [11].

The only symmetry of the theory corresponding to (14) is the gauge symmetry. It acts on the space of fields $\mathcal{A} \times \mathcal{B}$ as in (3) plus

$$ E \to \mathrm{Ad}_{g^{-1}} E. \tag{15} $$

The group $\mathcal{G}$ acts freely on the manifold $\mathcal{A} \times \mathcal{B}$, which becomes a principal $\mathcal{G}$-bundle with the projection $(\mathcal{A} \times \mathcal{B}) \to \mathcal{A}$ being a bundle-morphism. As a consequence, the gauge fixing on $\mathcal{A}$ is enough to completely fix the first-order formulation of YM theory.

Remark. The presence of an $i$ in the action (14) may look odd but is necessary since the $EF$ term is not positive definite. Notice that without the Wick rotation (thus on a Minkowskian manifold), the factor $i$ would disappear from the action.

2.4.1. Classical equivalence. The critical points (which are not minima) of the action (14) correspond to the solutions of the following equations of motion:

$$ i * F_A + 2 g_{YM}^2 E = 0, \qquad i * d_A E = 0. \tag{16} $$
By applying the operator $*d_A$ to the first equation, we see that, because of the second equation, $A$ must solve the YM equation (4). By the first equation we then see that $E$ must be equal to $-(i/2g_{YM}^2) * F_A$. Thus, the space of solutions of first-order YM theory is in a one-to-one correspondence with the space of solutions of second-order YM theory. Moreover, this correspondence is preserved by gauge equivalence. Therefore, the moduli spaces of the two theories are the same.

The presence of an $i$ is a bit disturbing since it requires an imaginary solution for $E$. Moreover, it could seem that the factor $i$ does not play any role in the classical equations. However, if we add a source $J(A)$ to the action (14), the second equation is replaced by $i * d_A E + J' = 0$. The application of $*d_A$ to the first equation gives the correct answer $d_A^* F_A + 2 g_{YM}^2 J' = 0$ just because of the $i$ factor. Observe that if we were working on a Minkowski space, then we would get the correct answer by removing the factor $i$ (this is because $*^2$ keeps track of the signature of the metric).
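The correspondence between the two sets of classical solutions can be made more concrete by completing the square in $E$. The lines below are a sketch under stated assumptions: the inner product (2) is taken to be symmetric and bilinear, the Hodge star is assumed to be an isometry (as it is on a Riemannian manifold), and $E_0(A)$ is a shorthand introduced here for the on-shell value of $E$.

```latex
% Sketch. Assumptions: <.,.> symmetric and bilinear, <*a, *b> = <a, b>.
% E_0(A) is shorthand (introduced for this example) for the on-shell value of E.
\begin{align*}
E_0(A) &:= -\frac{i}{2 g_{YM}^2} * F_A , \\
g_{YM}^2 \langle E - E_0 , E - E_0 \rangle
  &= g_{YM}^2 \langle E , E \rangle + i \langle E , *F_A \rangle
     - \frac{1}{4 g_{YM}^2} \langle *F_A , *F_A \rangle , \\
S_{YM'}[A, E]
  &= S_{YM}[A] + g_{YM}^2 \langle E - E_0(A) , E - E_0(A) \rangle .
\end{align*}
```

At $E = E_0(A)$ the first-order action reduces to the second-order one, and the remainder is purely quadratic in $E - E_0(A)$.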
2.4.2. Quantum equivalence. It is not difficult to see that a Gaussian integration over $E$ yields

$$ \int_{(\mathcal{A} \times \mathcal{B})/\mathcal{G}} \exp(-S_{YM'}[A, E])\, \mathcal{O}[A] \;\propto\; \int_{\mathcal{A}/\mathcal{G}} \exp(-S_{YM}[A])\, \mathcal{O}[A], \tag{17} $$

where $\mathcal{O}$ is any gauge-invariant observable for YM theory. Notice that the proportionality constant depends on $g_{YM}$ because of the determinant coming from the $E$-integration. This dependence can be removed if one defines the functional measure for $E$ as already containing this factor. Anyhow, this constant factor is irrelevant since computing vacuum expectation values involves a ratio, and we will not take care of it. More explicitly, the integration can be performed by introducing the BRST complex and the BRST transformations (7) and (8) plus

$$ sE = [E, c], \tag{18} $$

which corresponds to the infinitesimal version of the gauge transformation (15). Notice again that the presence of the $i$ factor in the action is essential to make the Gaussian integration meaningful and to get the correct answer [instead of $\exp(+S_{YM})$].

Finally, it is important to notice that $\mathcal{B}$ is a vector space, so it is not necessary to fix a background solution $E_0$ to perform the integration; therefore, the equivalence between first- and second-order YM theories is non-perturbative. However, one might also decide to fix a background solution $A_0$ of the YM equations and a background field $E_0 = -(i/2g_{YM}^2) * F_{A_0}$ and integrate in the variable $E - E_0$; the result will be an equivalence with YM theory expanded around the same $A_0$.

2.4.3. The perturbative expansion. The exact computation of a functional integral is a formidable task. Usually one considers a perturbative expansion around a classical solution. YM theory can be computed as an asymptotic series in $g_{YM}$ (i.e., in weak coupling) after defining the integration variable as $\alpha = (A - A_0)/(\sqrt{2}\, g_{YM})$. In the first-order formulation, perturbation theory requires choosing a background for $E$ as well and introducing the integration variable $\beta = \sqrt{2}\, g_{YM} (E - E_0)$. The action will then contain the quadratic part

$$ \langle \beta , *d_{A_0} \alpha \rangle + 2 g_{YM}^2 \langle E_0 , *(\alpha \wedge \alpha) \rangle + \tfrac{1}{2} \langle \beta , \beta \rangle + \text{gauge fixing} $$

plus terms of order $g_{YM}$ [notice that $g_{YM}^2 E_0 = O(1)$]. From the quadratic part one reads the propagators and the Feynman rules leading to the usual ultraviolet behaviour [15].
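The Gaussian $E$-integration behind (17), and the role of the factor $i$ stressed in Sect. 2.4.2, can be illustrated in a one-variable toy model. The sketch below is only an analogy and not part of the argument: the real numbers `f` and `g` stand in for $*F_A$ and $g_{YM}$, the function names are made up for this illustration, and the integral is approximated by a plain trapezoidal sum over a finite window.

```python
import cmath
import math


def toy_e_integral(f, g, with_i=True, n=40001):
    """Trapezoidal estimate of the integral over real e of
    exp(-(k*e*f + g^2 * e^2)), with k = i if with_i else 1.

    Toy stand-in for the E-integration in (17); f plays the role of *F_A.
    """
    # Window wide enough that the Gaussian factor is negligible at the ends,
    # including the shifted peak that appears when the factor i is dropped.
    half_width = 12.0 / g + abs(f) / (2.0 * g * g)
    k = 1j if with_i else 1.0
    h = 2.0 * half_width / (n - 1)
    total = 0.0 + 0.0j
    for j in range(n):
        e = -half_width + j * h
        weight = 0.5 if j in (0, n - 1) else 1.0  # trapezoidal end weights
        total += weight * cmath.exp(-(k * e * f + g * g * e * e))
    return total * h


def closed_form(f, g, with_i=True):
    """sqrt(pi)/g * exp(-f^2/(4 g^2)) with the factor i,
    sqrt(pi)/g * exp(+f^2/(4 g^2)) without it."""
    sign = -1.0 if with_i else 1.0
    return math.sqrt(math.pi) / g * math.exp(sign * f * f / (4.0 * g * g))


if __name__ == "__main__":
    f, g = 1.3, 0.7
    for flag in (True, False):
        numeric = toy_e_integral(f, g, with_i=flag)
        print(flag, abs(numeric - closed_form(f, g, with_i=flag)))
```

With the factor $i$ the exponent $-f^2/(4g^2)$ mirrors $-S_{YM}$; without it the sign flips, which is the toy counterpart of obtaining $\exp(+S_{YM})$.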
Another possibility, which is worth exploring, is to consider the term $\langle \beta , \beta \rangle$ as a perturbation. In this way the propagators will resemble those of the topological pure BF theory defined by the action

$$ S_{BF} = i \langle B , *F_A \rangle , \tag{19} $$

where the field $B$ is just a new name for our $E$. Unfortunately, however, the operator acting on $(\alpha, \beta)$ in this scheme is not invertible since its kernel includes any pair $(0, \beta)$ with $\beta \in \ker(d_{A_0})$. In the pure BF theory there is no problem since the theory itself has a larger set of symmetries (see Sect. 4) and, therefore, an additional gauge fixing is required.

Another way of seeing our problem is the following: the pure BF theory appears formally as the $g_{YM}^2 \to 0$ limit of the first-order YM theory (after renaming $E$ as $B$). However, the number of degrees of freedom is reduced in this limit and this shows that the limit is ill-defined. The purpose of the next sections is to show that it is possible to restore the missing degrees of freedom by introducing an extra field, and that this makes the above limit meaningful. The mechanism that will allow us to do so is the so-called “topological embedding”.

Remark. Notice that in the first-order YM theory one can also define a strong-coupling expansion (i.e., an asymptotic expansion in $1/g_{YM}$) after integrating out the connection in (14) (notice that this integration is Gaussian). For details, see [17, 19].

2.5. The “topological embedding”. The so-called topological embedding refers to the idea of “embedding” a topological theory into a physical one. The way we discuss such a scheme is partly related to the arguments presented in [1]. The basic idea is to consider an action $S[A]$ (or, more generally, an action $S[A, E]$) as a functional of an auxiliary field $\eta$ as well. One then writes $S[A, \eta]$, but it is understood that $\delta S/\delta \eta = 0$. Of course, this gives the theory a huge set of symmetries, viz., all possible shifts of $\eta$, so that $\eta$ has no physical degrees of freedom. This is similar to what happens in the topological field theories of the so-called cohomological (or Witten) type where all fields are subject to such a symmetry.

One might also speak of a semi-topological theory since it is topological, in the previous sense, only in some field directions. In our case the new field $\eta$ belongs to $\Omega^1(M, \mathrm{ad}P)$, so that the pair $(A, \eta)$ is an element of the tangent bundle $T\mathcal{A}$. The Lie group $T\mathcal{G}$, which can be represented as the semi-direct product of $\mathcal{G}$ and the abelian group $\Omega^0(M, \mathrm{ad}P)$, has a natural action on $T\mathcal{A}$ given by

$$ (A, \eta) \to (A^g , \mathrm{Ad}_{g^{-1}} \eta + d_{A^g} \zeta), \tag{20} $$

where $A^g$ is defined in (3) and $(g, \zeta) \in T\mathcal{G}$ [here $\zeta \in \Omega^0(M, \mathrm{ad}P)$]. It is convenient to combine the $T\mathcal{G}$-transformation with the translations acting on the field $\eta$ under which the theory is invariant; viz., we write the transformation of $(A, \eta)$ as

$$ (A, \eta) \to \left( A^g , \mathrm{Ad}_{g^{-1}} \eta + d_{A^g} \zeta - \tau \right), \tag{21} $$

where $\tau \in \Omega^1(M, \mathrm{ad}P)$ represents the translation. In this way we can write the group of the symmetries of the theory as the semidirect product of $T\mathcal{G}$ with $\Omega^1(M, \mathrm{ad}P)$. In the following we will denote this group by $\mathcal{G}_{aff}$.
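It may help to see the infinitesimal form of the transformation (21), since it is this form that the BRST rules encode. The expansion below is a sketch under assumptions made here for illustration: $G$ is treated as a matrix group, $g = 1 + \epsilon c + O(\epsilon^2)$, and the parameters $(\zeta, \tau)$ are replaced by $(\epsilon \xi, \epsilon \tilde{\psi})$, with $c$, $\xi$, $\tilde{\psi}$ ordinary commuting parameters rather than ghosts.

```latex
% Sketch: first-order expansion of (21) around the identity of G_aff.
% Assumptions: matrix group, g = 1 + \epsilon c, \zeta = \epsilon\xi,
% \tau = \epsilon\tilde\psi; all parameters are commuting at this stage.
\begin{align*}
A^g &= g^{-1} A g + g^{-1} dg
     = A + \epsilon\,\bigl(dc + [A, c]\bigr) + O(\epsilon^2)
     = A + \epsilon\, d_A c + O(\epsilon^2), \\
\mathrm{Ad}_{g^{-1}} \eta + d_{A^g} \zeta - \tau
    &= \eta + \epsilon\,\bigl([\eta, c] + d_A \xi - \tilde\psi\bigr) + O(\epsilon^2).
\end{align*}
```

The first line reproduces the structure of $sA = d_A c$ in (7); the second has the same structure as the BRST transformation of $\eta$ once the parameters are promoted to ghosts.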
Unfortunately, the action of $\mathcal{G}_{aff}$ given by (21) is not free. However, as will be shown in detail in Subsect. 5, this problem can be successfully dealt with by considering the BRST complex defined by

$$ s\eta = [\eta, c] + d_A \xi - \tilde{\psi}, \tag{22} $$

where $\xi$ and $\tilde{\psi}$ are new ghosts. For the BRST operator to be nilpotent they must obey the transformation rules

$$ s\xi = -[\xi, c] + \tilde{\phi}, \qquad s\tilde{\psi} = -[\tilde{\psi}, c] + d_A \tilde{\phi}, \tag{23} $$

where $\tilde{\phi}$ is a ghost-for-ghost (i.e., has ghost number 2) which transforms as

$$ s\tilde{\phi} = [\tilde{\phi}, c]. \tag{24} $$
For further details on the space $T\mathcal{A}$, on its symmetries and on the implementation of the BRST procedure, see Sect. 5 and Ref. [12].

The quantization of such a theory requires a gauge fixing for this topological symmetry as well. The apparently trivial operation of adding a new field on which the theory does not depend and then gauging it away can have some interesting consequences:

1. A trivial gauge fixing for $\eta$ (i.e., setting $\eta = 0$) is always available, but if a non-trivial gauge fixing for $\eta$ is chosen, this may introduce a non-trivial measure on the moduli space. Heuristically we have

$$ \int_{T\mathcal{A}/\mathcal{G}_{aff}} \exp(-S[A]) = \int_{\mathcal{A}/\mathcal{G}} \exp(-S[A])\, \mu[A, M], \tag{25} $$

where⁵ $\mu$ is the outcome of the $\eta$-integration and depends on $A$ and on $M$ and its metric structure through the chosen gauge fixing. Since one expects quantization not to depend on small deformations of the gauge fixing (in particular on small deformations of the metric of $M$), one can argue that $\mu[A, M]$ is a (possibly trivial) measure which, once integrated over the space of critical solutions modulo gauge transformations (moduli space), gives a smooth invariant of $M$. In particular we may choose as non-trivial gauge fixing an incomplete one which leaves only a finite number of symmetries, and then introduce a “topological observable” as a volume form.⁶

2. When this procedure is applied to the first-order formulation of YM theory, the limit $g_{YM}^2 \to 0$ becomes meaningful, as we will see in the next sections. Moreover, a weak-coupling perturbation theory with the propagators of one of the topological BF theories becomes available.

Before discussing the last point, it is better to rewrite the action $S_{YM'}[A, E, \eta]$ of (extended) first-order YM theory by making the change of variables

$$ B = E + \frac{1}{\sqrt{2}\, g_{YM}}\, d_A \eta. \tag{26} $$

This yields the action

$$ S_{BFYM}[A, B, \eta] = i \langle B , *F_A \rangle + g_{YM}^2 \left\langle B - \frac{1}{\sqrt{2}\, g_{YM}}\, d_A \eta \,,\; B - \frac{1}{\sqrt{2}\, g_{YM}}\, d_A \eta \right\rangle , \tag{27} $$

which we will call BFYM theory since it is related both to the YM theory, after integrating out $B$ and $\eta$, and to the BF theory, in the limit $g_{YM}^2 \to 0$; in particular, this limit yields

$$ S_{BFYM}[A, B, \eta] \;\underset{g_{YM}^2 \to 0}{\sim}\; i \langle B , *F_A \rangle + \tfrac{1}{2} \langle d_A \eta , d_A \eta \rangle , \tag{28} $$

where we recognize the BF theory plus a non-topological term that restores the degrees of freedom of YM theory. Notice that the factor $1/(\sqrt{2}\, g_{YM})$ in (26) is designed so as to give the kinetic term for $\eta$ the correct normalization.

In the next section we will reconsider the equivalence of the BFYM and YM theories and prove it explicitly by using three different gauge fixings. In Sect. 4 we will discuss the limit $g_{YM}^2 \to 0$ and show that it is well-defined in the present context.

⁵ The quotient space $T\mathcal{A}/\mathcal{G}_{aff}$ is not a manifold since, as we discussed before, the action of $\mathcal{G}_{aff}$ is not free. However, a BRST structure and the relevant quantization for the $\mathcal{G}_{aff}$ symmetry are available. This is the reason why we can still write, at the heuristic level, the identity (25) between functional integrals.

⁶ These are essentially the motivations of the approach of [1], where the following version of the YM action

$$ S[A_0, A_q] = \frac{1}{4 g_{YM}^2} \left\langle F_{A_0 + g_{YM} A_q} , F_{A_0 + g_{YM} A_q} \right\rangle $$

is considered. This is equivalent to choosing $S[A, \eta] = (1/4g_{YM}^2) \langle F_A , F_A \rangle$ by the change of variables (with constant Jacobian)

$$ A_0 = A - \eta, \qquad A_q = \frac{1}{g_{YM}}\, \eta. $$

Moreover, if one also considers the following change of ghost variables (with constant Jacobian)

$$ C_0 = c - \xi, \qquad C_q = \frac{1}{g_{YM}}\, \xi, \qquad \psi_0 = \tilde{\psi} - [\eta, \xi], \qquad \phi_0 = \tfrac{1}{2} [\xi, \xi] - \tilde{\phi}, $$

then the BRST transformations (7), (22), (23) and (24) become

$$ \begin{aligned} sA_0 &= \psi_0 + d_{A_0} C_0 , & sC_0 &= \phi_0 - \tfrac{1}{2} [C_0 , C_0], \\ s\psi_0 &= -d_{A_0} \phi_0 - [\psi_0 , C_0], & s\phi_0 &= [\phi_0 , C_0], \\ sA_q &= -\frac{1}{g_{YM}}\, \psi_0 + d_{A_0 + g_{YM} A_q} C_q + [A_q , C_0], & sC_q &= -\frac{1}{g_{YM}}\, \phi_0 - [C_0 , C_q] - \frac{g_{YM}}{2} [C_q , C_q] \end{aligned} $$

(which correspond to the BRST transformations listed in [1]). In this way one recognizes the topological set of transformations for $A_0$, which is later reinterpreted as the background connection.

3. The BFYM Theory

In this section we discuss the theory described by the action (27), i.e.,

$$ S_{BFYM}[A, B, \eta] = i \langle B , *F_A \rangle + g_{YM}^2 \left\langle B - \frac{1}{\sqrt{2}\, g_{YM}}\, d_A \eta \,,\; B - \frac{1}{\sqrt{2}\, g_{YM}}\, d_A \eta \right\rangle . $$

First we consider the equations of motion. They can be obtained directly by looking at the stationary points of (27) or from the equations of motion (16) of the first-order YM theory together with the change of variables (26). In any case, the equations of motion can be written, after a little algebra, as

$$ \begin{aligned} i * F_A + 2 g_{YM}^2 B - \sqrt{2}\, g_{YM}\, d_A \eta &= 0, \\ d_A^* F_A &= 0, \\ d_A F_A &= 0. \end{aligned} \tag{29} $$
Notice that the third equation is just the Bianchi identity.

Then we come to the symmetries. They can be obtained starting from the symmetries (3) and (15) of the first-order YM theory and from the topological symmetry (21) for $\eta$ together with the change of variables (26). Explicitly, we have

$$ \begin{aligned} A &\to A^g = \mathrm{Ad}_{g^{-1}} A + g^{-1} dg, \\ \eta &\to \mathrm{Ad}_{g^{-1}} \eta + d_{A^g} \zeta - \sqrt{2}\, g_{YM}\, \tau, \\ B &\to \mathrm{Ad}_{g^{-1}} B + \frac{1}{\sqrt{2}\, g_{YM}} [F_{A^g} , \zeta] - d_{A^g} \tau, \end{aligned} \tag{30} $$

with $(g, \zeta, \tau) \in \mathcal{G}_{aff}$. The action of $\mathcal{G}_{aff}$ on the space of triples $(A, \eta, B)$ is again not a free action, but again, as will be shown in a moment, a BRST complex and the relevant quantization are available.

Notice that we have rescaled $\tau \to \sqrt{2}\, g_{YM}\, \tau$ so as to see the shift on $\eta$ as a perturbation of the tangent group action; this is consistent with the limit $g_{YM}^2 \to 0$ in (28). We will discuss this issue further in the next section, where we also explain why the $B$-transformation becomes singular in this limit. Notice that for the computations considered in this section all these rescalings are irrelevant.

A further remark concerns the geometric interpretation of the field $B$: since it transforms as $d_A \eta$, it is natural to see it as an element of the tangent space $T_{F_A} \mathcal{B}$ and not of $\mathcal{B}$.

To quantize the theory we have to describe the BRST symmetry. Again the BRST transformations for BFYM theory can be obtained from (7), (18), (22), (23), (24) and (26). Explicitly they read

$$ \begin{aligned} sA &= d_A c, & sc &= -\tfrac{1}{2} [c, c], \\ s\eta &= [\eta, c] + d_A \xi - \sqrt{2}\, g_{YM}\, \tilde{\psi}, & s\xi &= -[\xi, c] + \sqrt{2}\, g_{YM}\, \tilde{\phi}, \\ sB &= [B, c] + \frac{1}{\sqrt{2}\, g_{YM}} [F_A , \xi] - d_A \tilde{\psi}, & & \\ s\tilde{\psi} &= -[\tilde{\psi}, c] + d_A \tilde{\phi}, & s\tilde{\phi} &= [\tilde{\phi}, c], \end{aligned} \tag{31} $$

where $c$ and $\xi$ have ghost number 1 and belong to $\Omega^0(M, \mathrm{ad}P)$, $\tilde{\psi} \in \Omega^1(M, \mathrm{ad}P)$ has ghost number 1, and $\tilde{\phi} \in \Omega^0(M, \mathrm{ad}P)$ has ghost number 2. Notice that we have rescaled $(\tilde{\psi}, \tilde{\phi}) \to \sqrt{2}\, g_{YM} (\tilde{\psi}, \tilde{\phi})$ so as to see the shifts as perturbations of the $T\mathcal{G}$ transformations on $\eta$ and $\xi$.

To study the theory, both at the classical and at the quantum level, we have to fix the symmetries (30). After having done this, our first aim will be to prove that the gauge-fixed BFYM and YM theories are classically equivalent, i.e., that their moduli spaces are in one-to-one correspondence with each other. Our second aim will be to prove the quantum equivalence, i.e.,

$$ \int_{(T\mathcal{A} \times T_{F_A} \mathcal{B})/\mathcal{G}_{aff}} \exp(-S_{BFYM}[A, \eta, B])\, \mathcal{O}[A] \;\propto\; \int_{\mathcal{A}/\mathcal{G}} \exp(-S_{YM}[A])\, \mathcal{O}[A]. \tag{32} $$
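A short consistency check makes the link between the BFYM action and the first-order YM action explicit. It is a sketch under assumptions stated here: $M$ is closed, so that no boundary terms arise, the inner product (2) is symmetric and bilinear, and $E$ is recovered from (26) as $E = B - d_A \eta/(\sqrt{2}\, g_{YM})$.

```latex
% Sketch. Assumptions: M closed, <.,.> symmetric and bilinear,
% ** = 1 on two-forms of a Riemannian four-manifold, d_A F_A = 0 (Bianchi).
\begin{align*}
\langle d_A \eta , *F_A \rangle
  &= \int_M \mathrm{Tr}\,(d_A \eta \wedge F_A)
   = \int_M d\, \mathrm{Tr}\,(\eta \wedge F_A) = 0 , \\
i \langle E , *F_A \rangle + g_{YM}^2 \langle E , E \rangle
  &= i \langle B , *F_A \rangle
   + g_{YM}^2 \Bigl\langle B - \tfrac{1}{\sqrt{2}\, g_{YM}}\, d_A \eta \,,\;
                          B - \tfrac{1}{\sqrt{2}\, g_{YM}}\, d_A \eta \Bigr\rangle , \\
g_{YM}^2 \Bigl\langle B - \tfrac{1}{\sqrt{2}\, g_{YM}}\, d_A \eta \,,\;
                     B - \tfrac{1}{\sqrt{2}\, g_{YM}}\, d_A \eta \Bigr\rangle
  &= g_{YM}^2 \langle B , B \rangle
   - \sqrt{2}\, g_{YM} \langle B , d_A \eta \rangle
   + \tfrac{1}{2} \langle d_A \eta , d_A \eta \rangle .
\end{align*}
```

The second line is the statement that (27) is the action (14) in the new variables; in the third line only the last term survives formally as $g_{YM} \to 0$, which is the kinetic term for $\eta$ in (28).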
As in (17), the proportionality constant will depend on $g_{YM}$ but will not affect the vacuum expectation values, and we will not take care of it. Notice that, as was the case in (25), the quotient $(T\mathcal{A} \times T_{F_A} \mathcal{B})/\mathcal{G}_{aff}$ is not a manifold since the action of $\mathcal{G}_{aff}$ is not free. The same argument as in footnote 5 applies here, and a detailed discussion of how to deal with the non-freeness of this group action is given in Subsect. 5.5.

The formal computation of the functional integral can be performed after choosing a gauge fixing. In general, whenever we verify that some conditions are a gauge fixing (at least in a neighborhood of critical solutions), we expect the equivalence to be realized (in that neighborhood); for we can always go back to the variables $A$, $\eta$, $E$ by (26) and perform the Gaussian $E$-integration. The $\eta$-integration should give at most some topological contributions since $\eta$ appears only in $s$-exact terms now. However, the change of variables (26) becomes singular as $g_{YM} \to 0$, so we prefer to work the equivalence out by using the variables $A$, $\eta$, $B$.

We will consider three different gauge fixings which we call the trivial, the covariant and the self-dual gauge fixings. The last two of them will be dealt with in the next two subsections, and the conditions under which classical and quantum equivalence are true will be discussed. As for the trivial gauge fixing, characterized by the condition $\eta = 0$ plus a gauge-fixing condition on $A$, we see immediately that BFYM theory turns out to be equivalent to the first-order formulation of YM theory which, as we proved in the previous section, is equivalent to the second-order formulation. The other two gauge fixings are equivalent to the trivial one (when they are defined), so we can be sure of the equivalence between BFYM and YM theory in any of these gauges without any further computation. However, we prefer to check the equivalence explicitly, for this treatment also produces the correct framework to consider perturbation theory around BF theories.

Obviously a weak-coupling expansion as in first-order YM theory is always possible, and this is the only possibility in the trivial gauge. In the covariant gauge, however, we will show that perturbation theory around a flat connection can be organized in a different way so that the $AB$-sector and the $\eta$-sector of the theory decouple in the unperturbed action and the $AB$-propagator turns out to be the propagator of the topological pure BF theory (in the covariant gauge).
Finally, in the self-dual gauge, perturbation theory around an anti-self-dual non-trivial connection (the only kind of connection around which this gauge is well defined) can again be organized in such a way that the $AB$-sector and the $\eta$-sector decouple in the unperturbed action; moreover, the propagators in the $AB$-sector are recognized as those of the topological BF theory with a cosmological term (in the self-dual gauge).

3.1. The covariant gauge fixing. The covariant gauge fixing, which will be discussed explicitly in Subsect. 5.6, is characterized by a gauge-fixing condition on $A$ together with

$$ d_A^* \eta = 0, \qquad \eta \perp \mathrm{Harm}_A^1(M, \mathrm{ad}P), \qquad d_A^* B \in d_A \Omega^0(M, \mathrm{ad}P), \tag{33} $$

where

$$ \mathrm{Harm}_A^k(M, \mathrm{ad}P) \equiv \{ \omega \in \Omega^k(M, \mathrm{ad}P) \mid \Delta_A \omega = 0 \}, \tag{34} $$

and

$$ \Delta_A \equiv d_A^* d_A + d_A d_A^* : \Omega^*(M, \mathrm{ad}P) \to \Omega^*(M, \mathrm{ad}P). \tag{35} $$
Notice that if $b_1[A] = \dim \mathrm{Harm}_A^1(M, \mathrm{ad}P)$ is not constant on the whole space $\mathcal{A}$, the covariant gauge fixing is consistently defined only in those open regions where it is constant. In particular, we will denote by $\mathcal{N}$ the open neighborhood of the space of connections where this is true (in particular cases, $\mathcal{N}$ may be the whole $\mathcal{A}$). By consistency, on the shift $\tau$ in (30) we must impose the same conditions as those which fix the $T\mathcal{G}$ symmetry on $\eta$, viz.,

$$ d_A^* \tau = 0, \qquad \tau \perp \mathrm{Harm}_A^1(M, \mathrm{ad}P). $$

Similarly, in the context of BRST quantization, the ghost $\tilde{\psi}$ is subject to the same conditions

$$ d_A^* \tilde{\psi} = 0, \qquad \tilde{\psi} \perp \mathrm{Harm}_A^1(M, \mathrm{ad}P). \tag{36} $$
Since we have $\mathrm{Harm}^0_A(M,\mathrm{ad}P) = \{0\}$ (which is a consequence of taking $A$ irreducible), this is actually a gauge fixing. Notice that there is a set of interpolating (complete) gauge fixings between (33) and the trivial gauge fixing, $\eta = 0$, which can also be written as $d_A^*\eta = 0$, $\eta \in d_A\Omega^0(M,\mathrm{ad}P)$. The interpolating gauge fixings are then given by $\lambda\, d_A^* B + (1-\lambda)\,\eta \in d_A\Omega^0(M,\mathrm{ad}P)$, with $\lambda \in [0,1]$. One might also choose $\lambda$ to be smooth but not constant on $\mathcal{A}$. In particular, one could choose $\lambda$ to be constant and equal to 1 in an open neighborhood of the space of critical connections contained in the neighborhood $\mathcal{N}$, and constant and equal to 0 outside $\mathcal{N}$. In this way one would obtain a gauge fixing that is defined on the whole space $\mathcal{A}$ and restricts to the covariant gauge fixing close to the critical connections.

3.1.1. Classical equivalence. Consider the equations of motion (29). The second and the third tell us that $A$ is a solution of the YM equations. The first implies that
$$ d_A^*\big(2g_{\mathrm{YM}}^2 B - \sqrt{2}\, g_{\mathrm{YM}}\, d_A\eta\big) = 0, $$
so that
$$ \big\langle 2g_{\mathrm{YM}}^2 B - \sqrt{2}\, g_{\mathrm{YM}}\, d_A\eta \,,\; \sqrt{2}\, g_{\mathrm{YM}}\, d_A\eta \big\rangle = 0. $$
On the other hand, the gauge-fixing conditions (33) imply that $\langle d_A\eta , B\rangle = 0$. So we conclude that
$$ \|d_A\eta\|^2 = 0. $$
By the positivity of the norm (remember that we are on a Riemannian manifold) we then get $d_A\eta = 0$. Since the gauge fixing also imposes $d_A^*\eta = 0$ and requires $\eta$ to be orthogonal to the harmonic forms, we conclude that
$$ \eta = 0. \tag{37} $$
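The two intermediate steps of this argument can be spelled out as follows. This is only a sketch: it assumes that $d_A^*$ is the formal adjoint of $d_A$ with respect to the scalar product and that the pairing is symmetric on the fields involved; the zero-form $\zeta$ is a name introduced here for illustration.

```latex
% By (33) there is a zero-form \zeta with d_A^* B = d_A \zeta. Then
\begin{align*}
\langle d_A\eta , B\rangle
  &= \langle \eta , d_A^* B\rangle
   = \langle \eta , d_A\zeta\rangle
   = \langle d_A^*\eta , \zeta\rangle = 0 ,\\[2pt]
% Pairing d_A^*(2 g^2 B - \sqrt{2} g\, d_A\eta) = 0 with \eta gives
0 &= \big\langle 2g_{\mathrm{YM}}^2 B - \sqrt{2}\,g_{\mathrm{YM}}\,d_A\eta \,,\; d_A\eta\big\rangle
   = 2g_{\mathrm{YM}}^2\,\langle B , d_A\eta\rangle
     - \sqrt{2}\,g_{\mathrm{YM}}\,\|d_A\eta\|^2
   = -\sqrt{2}\,g_{\mathrm{YM}}\,\|d_A\eta\|^2 .
\end{align*}
% Finally d_A\eta = 0 and d_A^*\eta = 0 give \Delta_A\eta = 0, so \eta is harmonic;
% being also orthogonal to Harm^1_A, it must vanish.
```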
Finally, inserting this result in (29) yields
$$ B = -\frac{i}{2g_{\mathrm{YM}}^2} * F_A. \tag{38} $$
Therefore, we have shown that a solution A of the YM equations completely determines a solution of BFYM equations in the covariant gauge fixing. Notice that this solution coincides with that obtained with the trivial gauge fixing. 3.1.2. Quantum equivalence. To implement the covariant gauge fixing in the BRST formalism, we have first to introduce the full BRST complex which generalizes (6). It
Four-Dimensional Yang–Mills Theory as Deformation of Topological BF Theory
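is useful to organize all the fields in the following tables, where each row has the same form-degree and each column has the same ghost-number:
$$ \begin{array}{c|ccc}
 & -1 & 0 & 1 \\ \hline
1 & & (A,\eta) & \\
0 & (\bar c,\bar\xi) & (h_c,h_\xi) & (c,\xi)
\end{array} \tag{39} $$
$$ \begin{array}{c|ccccc}
 & -2 & -1 & 0 & 1 & 2 \\ \hline
2 & & & B & & \\
1 & & \bar{\tilde\psi} & h_{\tilde\psi} & \tilde\psi & \\
0 & \bar{\tilde\phi}_1 & h_{\tilde\phi_1} & \bar{\tilde\phi}_2 & h_{\tilde\phi_2} & \tilde\phi
\end{array} \tag{40} $$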
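The BRST transformations are given by (31) together with
$$ \begin{aligned}
s\bar c &= h_c, & s h_c &= 0, & s\bar\xi &= h_\xi, & s h_\xi &= 0,\\
s\bar{\tilde\psi} &= h_{\tilde\psi}, & s h_{\tilde\psi} &= 0, & & & &\\
s\bar{\tilde\phi}_1 &= h_{\tilde\phi_1}, & s h_{\tilde\phi_1} &= 0, & s\bar{\tilde\phi}_2 &= h_{\tilde\phi_2}, & s h_{\tilde\phi_2} &= 0.
\end{aligned} \tag{41} $$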
If harmonic one-forms are present, in order to implement the covariant gauge fixing, (33) and (36), it is better to rewrite the BRST transformations for $\eta$ and $\tilde\psi$ displaying the harmonic contribution. First we take an orthogonal basis $\omega_i[A]$ of $\mathrm{Harm}^1_A(M,\mathrm{ad}P)$, with $i = 1,\dots,b^1[A] = \dim \mathrm{Harm}^1_A(M,\mathrm{ad}P)$. To be consistent with the scaling dimensions, we normalize this basis as
$$ \langle \omega_i[A] , \omega_j[A]\rangle = \delta_{ij}\sqrt{V}, \tag{42} $$
where $V$ is the volume of the manifold $M$. As a consequence of the fact that $\omega_i[A^g] = \mathrm{Ad}_{g^{-1}}\,\omega_i[A]$, we get the BRST transformation rule
$$ s\,\omega_i[A] = [\omega_i[A], c]. \tag{43} $$
Then we add a family of constant ghosts $k^i$ and $r^i$ (respectively of ghost number 1 and 2) and BRST transformation rules
$$ s k^i = \sqrt{2}\, g_{\mathrm{YM}}\, r^i, \qquad s r^i = 0. \tag{44} $$
Finally, we rewrite the BRST transformations for $\eta$ and $\tilde\psi$ as
$$ \begin{aligned}
s\eta &= [\eta, c] + d_A\xi - \sqrt{2}\, g_{\mathrm{YM}}\,\tilde\psi + k^i\omega_i[A],\\
s\tilde\psi &= -[\tilde\psi, c] + d_A\tilde\phi + r^i\omega_i[A],
\end{aligned} \tag{45} $$
where a sum over repeated indices is understood. It is easily verified that the BRST operator is still nilpotent. To implement the gauge fixing, we have then to build the BRST complex, i.e., add to (39) and (40) the following table:
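$$ \begin{array}{ccccc}
-2 & -1 & 0 & 1 & 2 \\ \hline
 & \bar k^i & h_k^i & k^i & \\
\bar r_1^{\,i} & h_{r_1}^i & \bar r_2^{\,i} & h_{r_2}^i & r^i
\end{array} \tag{46} $$
where each column has the displayed ghost-number. We conclude by giving the last BRST transformations, viz.,
$$ \begin{aligned}
s\bar k^i &= h_k^i, & s h_k^i &= 0,\\
s\bar r_1^{\,i} &= h_{r_1}^i, & s h_{r_1}^i &= 0,\\
s\bar r_2^{\,i} &= h_{r_2}^i, & s h_{r_2}^i &= 0.
\end{aligned} \tag{47} $$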
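Now we are in a position to write down the gauge-fixing fermion that implements the conditions (33) and (36):
$$ \begin{aligned}
\Psi = \Psi_{\mathrm{YM}}
 &+ \langle \bar\xi , d_A^*\eta\rangle + \bar k^i\langle \omega_i[A] , \eta\rangle \\
 &+ \big\langle \bar{\tilde\phi}_1 , d_A^*\tilde\psi\big\rangle + \bar r_1^{\,i}\big\langle \omega_i[A] , \tilde\psi\big\rangle \\
 &+ \big\langle \bar{\tilde\psi} , d_A^* B + d_A\bar{\tilde\phi}_2 + \bar r_2^{\,i}\,\omega_i[A]\big\rangle ,
\end{aligned} \tag{48} $$
where $\Psi_{\mathrm{YM}}$ is a gauge-fixing fermion for YM theory like, e.g., in (9). Notice that both $d_A^* B$ and $d_A\bar{\tilde\phi}_2$ are in the orthogonal complement of $\mathrm{Harm}^1_A(M,\mathrm{ad}P)$; thus, to implement the second gauge-fixing condition in (33), we must take $\bar{\tilde\psi}$ in this orthogonal complement as well. This is accomplished by the last term in (48). The gauge-fixed action will then read
$$ S^{\mathrm{g.f.}}_{\mathrm{BFYM}} = S_{\mathrm{BFYM}} + i\, s\Psi. \tag{49} $$
Notice the double role played here by $\bar{\tilde\psi}$, $\bar{\tilde\phi}_2$ and $\bar r_2^{\,i}$: on the one hand, we can see $\bar{\tilde\psi}$ as the antighost orthogonal to the harmonic forms that allows an explicit implementation of the gauge-fixing condition for $B$, viz.,
$$ d_A^* B + d_A\bar{\tilde\phi}_2 = 0; \tag{50} $$
on the other hand, we can see $\bar{\tilde\phi}_2$ and $\bar r_2^{\,i}$ as antighosts that implement on $\bar{\tilde\psi}$ the same conditions as those satisfied by $\tilde\psi$, viz.,
$$ d_A^*\bar{\tilde\psi} = 0, \qquad \bar{\tilde\psi} \perp \mathrm{Harm}^1_A(M,\mathrm{ad}P). \tag{51} $$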
As in the case of YM theory, it is useful to assign a canonical dimension to all the fields in such a way that the gauge-fixed action, the derivative, the volume integration and the BRST operator have, respectively, dimensions 0, 1, −4 and 0. Therefore, we get

Dimension 0: $c$, $\xi$, $\tilde\phi$, $k^i$, $r^i$.
Dimension 1: $A$, $\eta$, $\tilde\psi$, $\bar{\tilde\psi}$, $h_{\tilde\psi}$.
Dimension 2: $B$, $\bar c$, $\bar\xi$, $h_c$, $h_\xi$, $\bar{\tilde\phi}_1$, $h_{\tilde\phi_1}$, $\bar{\tilde\phi}_2$, $h_{\tilde\phi_2}$, $\bar k^i$, $h_k^i$, $\bar r_1^{\,i}$, $h_{r_1}^i$, $\bar r_2^{\,i}$, $h_{r_2}^i$.
The explicit computation. Our first task is to compute $s\Psi$. This will produce many terms which we can divide into two classes: terms that contain a Lagrange multiplier (the $h$-fields) and terms that do not. The former will impose the gauge-fixing conditions (33), (50), (36) and (51) (notice that – and this is the advantage of working in the Landau gauge – we do not have quadratic terms in the $h$'s, so the $h$-integrations produce δ-functionals of the constraints). In the computation of the latter, several terms will be canceled after explicitly imposing these gauge-fixing conditions. In particular, all the terms containing the ghost $c$ (apart from those in $s\Psi_{\mathrm{YM}}$) are killed since the covariant gauge-fixing conditions are $\mathcal{G}$-equivariant; e.g., in the $s$-variation of $\langle\bar\xi , d_A^*\eta\rangle$, we will remove the term $\langle\bar\xi , [d_A^*\eta, c]\rangle$ by imposing $d_A^*\eta = 0$. Particular care has to be taken in the variation of the last line in (48) since $\bar{\tilde\phi}_2$ is not $\mathcal{G}$-equivariant; the $c$-dependent part will then read (by adding and subtracting $d_A[\bar{\tilde\phi}_2, c]$)
$$ \big\langle \bar{\tilde\psi} , [d_A^* B + d_A\bar{\tilde\phi}_2 + \bar r_2^{\,i}\,\omega_i[A], c] - d_A[\bar{\tilde\phi}_2, c]\big\rangle . $$
The first term then vanishes by the gauge-fixing condition (50) of $B$, while the last term can be rewritten as $\big\langle d_A^*\bar{\tilde\psi} , [\bar{\tilde\phi}_2, c]\big\rangle$ and vanishes by the gauge-fixing conditions (51) of $\bar{\tilde\psi}$.

By imposing the gauge-fixing conditions, we can also simplify the action $S_{\mathrm{BFYM}}$: the effect is to eliminate the mixed term in $B$ and $\eta$. Finally, we see that, thanks to the gauge-fixing conditions, we can always replace $d_A^* d_A$ by the invertible operator $\Delta'_A$ defined as
$$ \Delta'_A = \Delta_A + \pi_{\mathrm{Harm}_A} =
\begin{cases}
1 & \text{on } \ker(\Delta_A) = \mathrm{Harm}_A,\\
\Delta_A & \text{on } \mathrm{coker}(\Delta_A),
\end{cases} \tag{52} $$
where $\pi_{\mathrm{Harm}_A}$ is the projection to harmonic forms. Notice that $\Delta'_A = \Delta_A$ on zero-forms since $A$ is an irreducible connection. In the following, we will denote by $G_A$ the inverse of $\Delta'_A$ and by $\det{}'(\Delta_A)$ the determinant of $\Delta'_A$. Therefore, the gauge-fixed action – after all these simplifications – reads
$$ \begin{aligned}
S^{\mathrm{cov.\,g.f.}}_{\mathrm{BFYM}} ={}& i\,\langle B , *F_A\rangle + g_{\mathrm{YM}}^2\langle B , B\rangle + \tfrac12\langle \eta , \Delta'_A\eta\rangle \\
&+ i\,\Big\{ s\Psi_{\mathrm{YM}} + \langle h_\xi , d_A^*\eta\rangle + h_k^i\langle\omega_i[A] , \eta\rangle \\
&\qquad + \big\langle h_{\tilde\phi_1} , d_A^*\tilde\psi\big\rangle + h_{r_1}^i\big\langle\omega_i[A] , \tilde\psi\big\rangle \\
&\qquad + \big\langle h_{\tilde\psi} , d_A^* B + d_A\bar{\tilde\phi}_2 + \bar r_2^{\,i}\,\omega_i[A]\big\rangle \\
&\qquad + \big\langle h_{\tilde\phi_2} , d_A^*\bar{\tilde\psi}\big\rangle + h_{r_2}^i\big\langle\omega_i[A] , \bar{\tilde\psi}\big\rangle \\
&\qquad - \langle\bar\xi , \Delta_A\xi\rangle - \bar k^i k^j\delta_{ij}\sqrt{V} \\
&\qquad + \big\langle\bar{\tilde\phi}_1 , \Delta_A\tilde\phi\big\rangle + \bar r_1^{\,i} r^j\delta_{ij}\sqrt{V} \\
&\qquad - \frac{1}{\sqrt{2}\,g_{\mathrm{YM}}}\big\langle d_A\bar{\tilde\psi} , [F_A, \xi]\big\rangle + \big\langle\bar{\tilde\psi} , \Delta'_A\tilde\psi\big\rangle \Big\}.
\end{aligned} \tag{53} $$
Notice that there is only one term which is singular as $g_{\mathrm{YM}} \to 0$. However, this singularity can be removed easily if one rescales $\xi \to g_{\mathrm{YM}}\,\xi$ and $\bar\xi \to \bar\xi/g_{\mathrm{YM}}$. Now we can start integrating out fields in order to prove (32). We want to point out that it is not necessary to choose a background for $\eta$ and $B$ since they already belong to vector spaces.
Step 1. Integrate $\bar k^i$, $k^i$, $\bar r_1^{\,i}$, $r^i$. The integration over the first two variables yields $V^{b^1[A]/2}$, while the integration over the last two of them yields $V^{-b^1[A]/2}$; therefore, the contributions cancel each other.

Step 2. Integrate $\bar\xi$, $\xi$, $\bar{\tilde\phi}_1$, $\tilde\phi$. The first integration yields $\det\Delta^{(0)}_A$, while the second yields $(\det\Delta^{(0)}_A)^{-1}$, and they cancel each other. Notice that there are no sources in $\bar\xi$, so the integration kills the term in $\xi$.

Step 3. Integrate $\bar{\tilde\psi}$, $\tilde\psi$, $h_{\tilde\phi_1}$, $h_{\tilde\phi_2}$, $h_{r_1}^i$, $h_{r_2}^i$. The integration over the first two fields yields $\det{}'\Delta^{(1)}_A$. Moreover, since there are linear sources in $\tilde\psi$ and $\bar{\tilde\psi}$, viz.,
$$ i\,\big\langle d_A h_{\tilde\phi_1} + h_{r_1}^i\,\omega_i[A] , \tilde\psi\big\rangle - i\,\big\langle\bar{\tilde\psi} , d_A h_{\tilde\phi_2} + h_{r_2}^i\,\omega_i[A]\big\rangle , $$
the Gaussian integration will give the following contribution to the action:
$$ i\,\Big( \big\langle h_{\tilde\phi_1} , X_A h_{\tilde\phi_2}\big\rangle + h_{r_1}^i h_{r_2}^j\delta_{ij}\sqrt{V} \Big), \tag{54} $$
where
$$ X_A = d_A^* G_A d_A : \Omega^0(M,\mathrm{ad}P) \to \Omega^0(M,\mathrm{ad}P). \tag{55} $$
Notice that there are no other terms since $G_A\omega_i[A] = \omega_i[A]$ and $d_A^*\omega_i[A] = 0$. Now the integration over $h_{\tilde\phi_1}$ and $h_{\tilde\phi_2}$ yields $\det X_A$ (we will show shortly that $X_A$ is invertible), while the integration over $h_{r_1}^i$ and $h_{r_2}^i$ yields $V^{b^1[A]/2}$. Therefore, the net contribution of this step is given by
$$ \det{}'\Delta^{(1)}_A \,\det X_A \, V^{b^1[A]/2}. $$

Step 4. Integrate $\eta$, $h_\xi$, $h_k^i$. The first integration yields $(\det{}'\Delta^{(1)}_A)^{-1/2}$; moreover, the linear source in $\eta$, viz.,
$$ i\,\big\langle d_A h_\xi + h_k^i\,\omega_i[A] , \eta\big\rangle , $$
produces the following contribution to the action:
$$ \langle h_\xi , X_A h_\xi\rangle + h_k^i h_k^j\delta_{ij}\sqrt{V}. $$
Then the $h_\xi$-integration yields $(\det X_A)^{-1/2}$, while the $h_k^i$-integrations yield $(4V)^{-b^1[A]/4}$. Therefore, the net contribution of this step is given by
$$ \big(\det{}'\Delta^{(1)}_A \,\det X_A\big)^{-1/2}\,(4V)^{-b^1[A]/4}. $$

Step 5. Integrate $B$. Apart from an irrelevant $g_{\mathrm{YM}}$-dependent factor, the Gaussian $B$-integration with source
$$ i\,\big\langle B , *F_A + d_A h_{\tilde\psi}\big\rangle $$
gives the following contribution to the action:
$$ \frac{1}{4g_{\mathrm{YM}}^2}\Big( \langle F_A , F_A\rangle + \big\langle h_{\tilde\psi} , \Delta'_A h_{\tilde\psi}\big\rangle \Big), \tag{56} $$
the mixed terms disappearing because of the Bianchi identity.
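The Gaussian integration of Step 5 amounts to completing the square in $B$. The following is a minimal finite-dimensional sketch, assuming that real vectors stand in for the forms and the Euclidean dot product stands in for the scalar product; the names `curvature` and `exact_part` (playing the roles of $*F_A$ and $d_A h_{\tilde\psi}$) and all numerical values are illustrative and not taken from the paper. The two vectors are chosen orthogonal to mimic the vanishing of the mixed terms.

```python
import numpy as np


def action(g, b, source):
    """Finite-dimensional stand-in for g^2 <B, B> + i <B, J>."""
    # np.dot does not conjugate, so this is the bilinear pairing.
    return g**2 * np.dot(b, b) + 1j * np.dot(b, source)


def stationary_point(g, source):
    """Solve dS/db = 2 g^2 b + i J = 0 for b."""
    return -1j * source / (2 * g**2)


g = 0.7  # illustrative value of the coupling
curvature = np.array([1.0, 2.0, 0.0, 0.0, 0.0])    # stands in for *F_A
exact_part = np.array([0.0, 0.0, 3.0, 1.0, -1.0])  # stands in for d_A h
source = curvature + exact_part                    # orthogonal pieces: no mixed term

b_star = stationary_point(g, source)
gradient = 2 * g**2 * b_star + 1j * source
value = action(g, b_star, source)

# Expected value: (|F|^2 + |d h|^2) / (4 g^2), the analogue of (56).
expected = (np.dot(curvature, curvature)
            + np.dot(exact_part, exact_part)) / (4 * g**2)
```

Evaluating the action at the stationary point reproduces the $1/(4g_{\mathrm{YM}}^2)$ prefactor of (56); the overall $g_{\mathrm{YM}}$-dependent normalization of the integral is not modeled here.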
Step 6. Integrate $h_{\tilde\psi}$, $\bar{\tilde\phi}_2$, $\bar r_2^{\,i}$. The $h_{\tilde\psi}$-integration with quadratic term given in (56) and source
$$ i\,\big\langle h_{\tilde\psi} , d_A\bar{\tilde\phi}_2 + \bar r_2^{\,i}\,\omega_i[A]\big\rangle $$
yields $(\det{}'\Delta^{(1)}_A)^{-1/2}$ plus the contribution
$$ g_{\mathrm{YM}}^2\Big( \big\langle\bar{\tilde\phi}_2 , X_A\bar{\tilde\phi}_2\big\rangle + \bar r_2^{\,i}\,\bar r_2^{\,j}\delta_{ij}\sqrt{V} \Big). $$
Then the remaining integrations yield $(\det X_A)^{-1/2}$ and $(4g_{\mathrm{YM}}^4 V)^{-b^1[A]/4}$. Therefore, the net contribution of this step is
$$ (\det{}'\Delta^{(1)}_A)^{-1/2}\,(\det X_A)^{-1/2}\,(4g_{\mathrm{YM}}^4 V)^{-b^1[A]/4}. $$

The operator $X_A$ is invertible. In order to complete all the steps in the functional integration, we still have to prove that the operator $X_A$ is invertible. Let us represent the space $\Omega^1(M,\mathrm{ad}P)$ as the sum of three orthogonal subspaces: the vertical subspace $V_A = d_A\Omega^0(M,\mathrm{ad}P)$, $\mathrm{Harm}^1_A(M,\mathrm{ad}P)$, and $\hat H_A \equiv H_A \ominus \mathrm{Harm}^1_A(M,\mathrm{ad}P)$, where $H_A = \ker d_A^*$ is the horizontal subspace. Here horizontality and verticality are defined with respect to the connection form on $\mathcal{A}$ given by $G_A d_A^* : \Omega^1(M,\mathrm{ad}P) \to \Omega^0(M,\mathrm{ad}P)$.

The operator $\Delta^{(1)}_A : \hat H_A \oplus V_A \to \hat H_A \oplus V_A$ is injective and satisfies the following relation:
$$ \big\langle \eta_1 , \Delta^{(1)}_A\eta_2\big\rangle = \big\langle (d_A + d_A^*)\eta_1 , (d_A + d_A^*)\eta_2\big\rangle . $$
Hence both $\Delta^{(1)}_A$ and its inverse $G_A$ are (formally) self-adjoint and positive. We can consider the (formal) "square root" $G_A^{1/2}$ and have, for any $\zeta \in \Omega^0(M,\mathrm{ad}P)$,
$$ \langle \zeta , X_A\zeta\rangle = \langle d_A\zeta , G_A d_A\zeta\rangle = \|G_A^{1/2} d_A\zeta\|^2 . $$
Hence $X_A$ is invertible.

Conclusions. Collecting all the contributions, we get the YM action as the effective action. All the determinants cancel each other and the only net contribution of all the integrations is a factor $(2g_{\mathrm{YM}})^{-b^1[A]}$. However, $b^1[A]$ is constant in the neighborhood $\mathcal{N}$. If this neighborhood does not coincide with $\mathcal{A}$, one can choose an interpolating gauge that smoothly connects the covariant gauge in the interior of $\mathcal{N}$ with the trivial gauge outside. This ends the proof of (32) in the covariant gauge.

3.1.3. The perturbative expansion. As we have remarked after Eq. (53), a suitable rescaling of $\xi$ and $\bar\xi$ removes any singularity as $g_{\mathrm{YM}} \to 0$. This allows weak-coupling perturbation theory. To start with, one has to consider the fluctuations $\alpha$, $e$ and $\beta$ of the fields $A$, $\eta$, $B$; viz.,
$$ A = A_0 + q\,\alpha, \qquad \eta = \eta_0 + e, \qquad B = B_0 + \frac{1}{q}\,\beta, \tag{57} $$
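where $q$ is a free parameter, and $(A_0, \eta_0, B_0)$ is a critical point of the action: i.e., $A_0$ is a critical connection, $\eta_0 = 0$ and $B_0$ is given by (38). On the fluctuations, the covariant gauge fixing reads
$$ \begin{gathered}
d^*_{A_0} e + O(q) = 0, \qquad e \perp \mathrm{Harm}^1_A(M,\mathrm{ad}P),\\
d^*_{A_0}\beta + q^2 * [\alpha, *B_0] + O(q) \in d_A\Omega^0(M,\mathrm{ad}P) + \mathrm{Harm}^1_A(M,\mathrm{ad}P).
\end{gathered} \tag{58} $$
[Recall that $q^2 B_0 = O(1)$.]

The general case. The quadratic part of the gauge-fixed action (53) reads
$$ i\,\langle\beta , *d_{A_0}\alpha\rangle + \frac12\Big(\frac{q}{g_{\mathrm{YM}}}\Big)^2\langle F_{A_0} , \alpha\wedge\alpha\rangle + \Big(\frac{g_{\mathrm{YM}}}{q}\Big)^2\langle\beta , \beta\rangle + \frac12\big\langle e , \Delta'_{A_0} e\big\rangle $$
plus the gauge-fixing terms. Therefore, we see that the $\alpha\beta$- and $e$-sectors decouple. Since we have both a term in $g_{\mathrm{YM}}/q$ and one in $q/g_{\mathrm{YM}}$, we must take $q \sim g_{\mathrm{YM}}$ (a convenient choice is $q = \sqrt{2}\,g_{\mathrm{YM}}$).

Perturbative expansion around a flat connection. If the connection $A_0$ is flat, then $B_0 = F_{A_0} = 0$ and there are no terms in $q/g_{\mathrm{YM}}$ in the quadratic part of the action. Therefore, we can also take $g_{\mathrm{YM}} \ll q \ll 1$ and consider $(g_{\mathrm{YM}}/q)^2\langle\beta , \beta\rangle$ as a perturbation. More precisely, we take
$$ i\,\langle\beta , *d_{A_0}\alpha\rangle + \frac12\big\langle e , \Delta'_{A_0} e\big\rangle + \text{gauge-fixing terms} \tag{59} $$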
as unperturbed action. This is possible since the quadratic form in (59) is non-degenerate. In fact, the kernel is determined by the conditions
$$ d_{A_0}\beta = 0, \qquad d^*_{A_0}\beta + d_{A_0}\bar{\tilde\phi}_2 = 0. $$
(Notice that $b^1[A_0] = 0$ if $A_0$ is flat.) Applying $d_{A_0}$ to the second equation we get $\Delta_{A_0}\beta = 0$. The kernel is therefore trivial if there are no harmonic two-forms (and in general is finite dimensional).

The propagators can be computed easily. The $\alpha\beta$-propagator (i.e., the inverse of $*d_{A_0}$ on its image) is the same as in pure BF theory in the covariant gauge, as is clear by comparing (59) with (101); viz., it is the integral kernel of (generalized) Gauss linking numbers.$^7$ The $ee$-propagator is the same as the propagator for the fluctuation of the connection in YM theory, as is clear by comparing (59) with (12).

The perturbative expansion will then be organized as a formal double expansion in $q$ and $(g_{\mathrm{YM}}/q)$. Notice that the theory is however independent of $q$; in fact, a rescaling $q \to tq$ can be reabsorbed by the rescaling $\alpha \to \alpha/t$, $\beta \to t\beta$, $h_{\tilde\psi} \to h_{\tilde\psi}/t$. This reflects the analogous independence of the coupling constant found in pure BF theory in any dimension. It is conceivable that quantization might break this symmetry if we consider $B$-dependent observables. (The equivalence with YM theory rules out this possibility when we consider only YM observables.)

$^7$ Recall that in four dimensions we have linking numbers between spheres and loops as opposed to the standard linking numbers between loops in three dimensions.
3.2. The self-dual gauge fixing.

Preliminaries. The space of two-forms can canonically be split into the sum of self-dual and anti-self-dual forms [denoted by $\Omega^{(2,+)}(M,\mathrm{ad}P)$ and $\Omega^{(2,-)}(M,\mathrm{ad}P)$], which satisfy $P^+\omega = \omega$ and $P^-\omega = \omega$ respectively, where the projection operators $P^+$ and $P^-$ are defined as
$$ P^\pm = \frac{1 \pm *}{2}. \tag{60} $$
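The projector properties of (60) can be checked explicitly in the simplest setting. The sketch below assumes flat Euclidean $\mathbb{R}^4$ with the standard orientation and constant-coefficient two-forms, so that the Hodge star becomes a 6×6 matrix; this is an illustration of the algebra only, not of the bundle-valued forms used in the text, and all names are introduced here.

```python
import itertools
import numpy as np


def perm_sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign


# Basis of two-forms e_i ^ e_j with i < j on R^4.
pairs = list(itertools.combinations(range(4), 2))
index = {pair: n for n, pair in enumerate(pairs)}


def hodge_star_two_forms():
    """Matrix of * on two-forms: *(e_i ^ e_j) = sign(i, j, k, l) e_k ^ e_l."""
    star = np.zeros((6, 6))
    for (i, j) in pairs:
        k, l = [m for m in range(4) if m not in (i, j)]
        star[index[(k, l)], index[(i, j)]] = perm_sign((i, j, k, l))
    return star


star = hodge_star_two_forms()
identity = np.eye(6)
p_plus = (identity + star) / 2    # projector on self-dual two-forms
p_minus = (identity - star) / 2   # projector on anti-self-dual two-forms

# An anti-self-dual vector satisfies <B, B> = -<B, *B>.
rng = np.random.default_rng(0)
b_asd = p_minus @ rng.normal(size=6)
norm_sq = np.dot(b_asd, b_asd)
pairing_with_dual = np.dot(b_asd, star @ b_asd)
```

In this model $*^2 = 1$, the two projectors are complementary, and each projects onto a three-dimensional subspace, which is the pointwise count of components of a self-dual two-form in four dimensions.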
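By using one of these projection operators (whatever follows is true by replacing self-duality by anti-self-duality everywhere), we can define a new operator $D_A$ on the complex $\Omega^*(M,\mathrm{ad}P) \ominus \Omega^{(2,-)}(M,\mathrm{ad}P)$ as
$$ D_A := \begin{cases}
d_A & : \Omega^0(M,\mathrm{ad}P) \to \Omega^1(M,\mathrm{ad}P),\\
\sqrt{2}\,P^+ d_A & : \Omega^1(M,\mathrm{ad}P) \to \Omega^{(2,+)}(M,\mathrm{ad}P),\\
\sqrt{2}\,d_A & : \Omega^{(2,+)}(M,\mathrm{ad}P) \to \Omega^3(M,\mathrm{ad}P),\\
d_A & : \Omega^3(M,\mathrm{ad}P) \to \Omega^4(M,\mathrm{ad}P).
\end{cases} \tag{61} $$
Then we can define the elliptic operator
$$ \tilde\Delta_A = D_A^* D_A + D_A D_A^*, \tag{62} $$
and prove the following identities for this deformed Laplace operator on forms of various degrees:
$$ \begin{aligned}
\tilde\Delta^{(0)}_A &= \Delta^{(0)}_A,\\
\tilde\Delta^{(1)}_A &= \Delta^{(1)}_A - *[F_A, \,\cdot\,],\\
\tilde\Delta^{(2)}_A &= 2D_A^* D_A = 2D_A D_A^* = 2P^+\Delta^{(2)}_A P^+.
\end{aligned} \tag{63} $$
Since we are considering only irreducible connections, the (deformed) Laplace operator is invertible on zero-forms.

We will denote by $\widetilde{\mathrm{Harm}}_A(M,\mathrm{ad}P)$ the (finite-dimensional) kernel of $\tilde\Delta_A$. Notice that
$$ \begin{aligned}
\widetilde{\mathrm{Harm}}{}^0_A(M,\mathrm{ad}P) &= \mathrm{Harm}^0_A(M,\mathrm{ad}P) = \{0\},\\
\widetilde{\mathrm{Harm}}{}^1_A(M,\mathrm{ad}P) &\supset \mathrm{Harm}^1_A(M,\mathrm{ad}P),\\
\widetilde{\mathrm{Harm}}{}^2_A(M,\mathrm{ad}P) &= P^+\,\mathrm{Harm}^2_A(M,\mathrm{ad}P).
\end{aligned} \tag{64} $$
As in the case of the ordinary covariant Laplacian, see (52), we can define the invertible operator
$$ \tilde\Delta'_A = \tilde\Delta_A + \pi_{\widetilde{\mathrm{Harm}}_A} =
\begin{cases}
1 & \text{on } \ker(\tilde\Delta_A) = \widetilde{\mathrm{Harm}}_A,\\
\tilde\Delta_A & \text{on } \mathrm{coker}(\tilde\Delta_A),
\end{cases} \tag{65} $$
and its inverse $\tilde G_A$.

Finally, if $A$ is a non-trivial anti-self-dual connection (i.e., $P^+F_A = 0$), then we assume that $D_A : \Omega^1(M,\mathrm{ad}P) \to \Omega^{(2,+)}(M,\mathrm{ad}P)$ is surjective, or equivalently, that
$$ \ker(D_A^*) = \{0\}; \qquad D_A^* : \Omega^{(2,+)}(M,\mathrm{ad}P) \to \Omega^1(M,\mathrm{ad}P). \tag{66} $$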
Notice that (66) is verified for a dense set of (conformal classes of) metrics for $G = SU(2)$ [14]. For any $\sigma \in \Omega^1(M,\mathrm{ad}P)$ we have
$$ D^*_{A+t\sigma} = D_A^* + tQ_\sigma, $$
where the $A$-independent operator $Q_\sigma$ is defined by
$$ Q_\sigma\varphi \equiv \sqrt{2} * [\sigma, \varphi], \qquad \varphi \in \Omega^{(2,+)}(M,\mathrm{ad}P). $$
This implies that for $t$ sufficiently small $D^*_{A+t\sigma}$ is also invertible and that there is a neighborhood of the space $\mathcal{M}^-$ of anti-self-dual connections – which we will denote by $\mathcal{N}$ – where the property (66) holds. We take $\mathcal{N}$ to be the inverse image of a neighborhood of the moduli space. By (66) it follows that $\tilde\Delta^{(2)}_A$ is invertible if $A \in \mathcal{N}$, so $\widetilde{\mathrm{Harm}}{}^2_A(M,\mathrm{ad}P) = \{0\}$.

If $A$ is an anti-self-dual connection, then $\widetilde{\mathrm{Harm}}{}^1_A(M,\mathrm{ad}P) = T_A\mathcal{M}^-$; therefore, $\dim\widetilde{\mathrm{Harm}}{}^1_A(M,\mathrm{ad}P) = \dim\mathcal{M}^- = m^-$. Moreover, for $A$ in a neighborhood of $\mathcal{M}^-$, $D_A^2 = F_A^+ = 0 : \Omega^0(M,\mathrm{ad}P) \to \Omega^{(2,+)}(M,\mathrm{ad}P)$. This implies that
$$ \mathrm{coker}\,D_A^* \,\cap\, (\mathrm{Im}\,D_A^*) \,\cap\, \Omega^1(M,\mathrm{ad}P) = \{0\} \qquad \text{if } A \in \mathcal{M}^-. $$
Therefore, (66) is an injection from $\Omega^{(2,+)}(M,\mathrm{ad}P)$ to $\ker D_A^* \cap \Omega^1(M,\mathrm{ad}P)$. By continuity this property will hold in a neighborhood of $\mathcal{M}^-$. We will denote by $\mathcal{N}'$ the intersection of this neighborhood with $\mathcal{N}$ and with the neighborhood where $\dim\widetilde{\mathrm{Harm}}{}^1_A(M,\mathrm{ad}P)$ is constant.

Therefore, the neighborhood $\mathcal{N}'$ – which we will use in the rest of this section – is characterized by the following two properties:
1. $D_A : \big(\ker D_A^* \cap \Omega^1(M,\mathrm{ad}P)\big) \to \Omega^{(2,+)}(M,\mathrm{ad}P)$ is surjective if $A \in \mathcal{N}'$;
2. $\dim\widetilde{\mathrm{Harm}}{}^1_A(M,\mathrm{ad}P) = m^-$ if $A \in \mathcal{N}'$.

The definition of the self-dual gauge fixing. Now we are in a position to define the self-dual gauge fixing (for further details, see Subsect. 5) in terms of a gauge-fixing condition on the connection $A \in \mathcal{N}'$ together with the conditions
$$ D_A^*\eta = 0, \qquad \eta \perp \widetilde{\mathrm{Harm}}{}^1_A(M,\mathrm{ad}P), \qquad P^+B = 0, \tag{67} $$
and, by consistency,
$$ D_A^*\tau = 0, \qquad \tau \perp \widetilde{\mathrm{Harm}}{}^1_A(M,\mathrm{ad}P). \tag{68} $$
In the context of BRST quantization, the last conditions will imply
$$ D_A^*\tilde\psi = 0, \qquad \tilde\psi \perp \widetilde{\mathrm{Harm}}{}^1_A(M,\mathrm{ad}P). \tag{69} $$
Also in this case we have gauge fixings which are interpolating between (67) and the trivial gauge fixing $\eta = 0$. In fact, the trivial gauge fixing can be written as
$$ D_A^*\eta = 0, \qquad \eta \perp \widetilde{\mathrm{Harm}}{}^1_A(M,\mathrm{ad}P), \qquad D_A\eta = 0. $$
The interpolating gauge fixings can then be written as $\lambda P^+B + (1-\lambda)D_A\eta = 0$, with $\lambda \in [0,1]$. Again one might also choose $\lambda$ to be smooth but not constant on $\mathcal{A}$. In particular, if we choose $\lambda$ to be constant and equal to 1 in an open neighborhood of $\mathcal{M}^-$ contained in the neighborhood $\mathcal{N}'$, and constant and equal to 0 outside $\mathcal{N}'$, we obtain a gauge fixing that is defined on the whole space $\mathcal{A}$ and restricts to the self-dual gauge fixing close to the anti-self-dual connections.

3.2.1. Classical equivalence. First we observe that an anti-self-dual connection solves the YM equations of motion. Then we see that the self-dual part of the first equation of (29) reads $D_A\eta = 0$, which together with the gauge-fixing conditions implies
$$ \eta = 0; \tag{70} $$
therefore, we get
$$ B = -\frac{i}{2g_{\mathrm{YM}}^2} * F_A. \tag{71} $$
Notice that this solution is the same as those obtained with the trivial and the covariant gauge fixings.

3.2.2. Quantum equivalence. To implement the self-dual gauge fixing, we have to introduce a BRST complex which is slightly different from that used for the covariant case. More precisely, we have to replace the pairs $(\bar{\tilde\psi}, \bar{\tilde\phi}_2)$ and $(h_{\tilde\psi}, h_{\tilde\phi_2})$ respectively by the self-dual antighost $\bar\chi^+$ (with ghost number −1) and by the self-dual Lagrange multiplier $h^+_\chi$ (with ghost number 0). Notice that the number of degrees of freedom is preserved; in fact, $\bar{\tilde\psi}$ is a one-form with ghost number −1 (so four fermionic degrees of freedom), while $\bar{\tilde\phi}_2$ is a zero-form with ghost number 0 (so one bosonic or, equivalently, minus one fermionic degree of freedom); this gives three fermionic degrees of freedom, which is consistent with the fact that $\bar\chi^+$ is a self-dual two-form with ghost number −1. A similar counting holds for the other fields.

The BRST transformation rules for the antighosts and the Lagrange multipliers are the same as those described in the case of the covariant gauge fixing; as for the new fields, we have
$$ s\bar\chi^+ = h^+_\chi, \qquad s h^+_\chi = 0. \tag{72} $$
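The counting of degrees of freedom used above can be reproduced with a few lines of arithmetic. This is only a bookkeeping sketch under the convention stated in the text (a bosonic component counts as minus one fermionic component); the function and variable names are illustrative.

```python
from math import comb

DIMENSION = 4  # dimension of the manifold M


def form_components(degree):
    """Pointwise number of independent components of a form of given degree."""
    return comb(DIMENSION, degree)


def self_dual_two_form_components():
    """Two-forms split evenly into self-dual and anti-self-dual parts in 4D."""
    return form_components(2) // 2


# Covariant gauge: fermionic one-form antighost plus bosonic zero-form antighost.
fermionic_from_one_form = form_components(1)    # +4
fermionic_from_zero_form = -form_components(0)  # bosonic, counted as -1
net_fermionic_covariant = fermionic_from_one_form + fermionic_from_zero_form

# Self-dual gauge: a single fermionic self-dual two-form antighost.
net_fermionic_self_dual = self_dual_two_form_components()
```

Both counts give three fermionic degrees of freedom per point, in agreement with the statement that the replacement preserves the number of degrees of freedom.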
To deal with the harmonic one-forms of the deformed Laplace operator, we introduce an orthogonal basis for $\widetilde{\mathrm{Harm}}{}^1_A(M,\mathrm{ad}P)$ – which we still denote by $\omega_i[A]$, $i = 1,\dots,m^-$ – and normalize it as in (42). As in the case of the covariant gauge fixing, we introduce new constant ghosts $k^i$ and $r^i$, together with their antighosts and Lagrange multipliers, with BRST transformation
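rules given by (44) and (47). Moreover, we rewrite the BRST transformations for $\eta$ and $\tilde\psi$ as in (45). The self-dual gauge fixing is eventually implemented by choosing the following gauge-fixing fermion:
$$ \begin{aligned}
\Psi = \Psi_{\mathrm{YM}}
 &+ \langle\bar\xi , D_A^*\eta\rangle + \bar k^i\langle\omega_i[A] , \eta\rangle \\
 &+ \big\langle\bar{\tilde\phi}_1 , D_A^*\tilde\psi\big\rangle + \bar r_1^{\,i}\big\langle\omega_i[A] , \tilde\psi\big\rangle \\
 &+ \langle\bar\chi^+ , B\rangle .
\end{aligned} \tag{73} $$
The canonical dimensions of the old fields are the same as in the case of the covariant gauge fixing, while the new fields $\bar\chi^+$ and $h^+_\chi$ both have dimension two.

The explicit computation. As in the computation with the covariant gauge fixing, the gauge-fixed action $S_{\mathrm{BFYM}} + i\,s\Psi$ can be simplified if one imposes the gauge-fixing conditions explicitly. At the end we get
$$ \begin{aligned}
S^{\mathrm{s.d.\,g.f.}}_{\mathrm{BFYM}} ={}& -i\,\langle B^- , P^-F_A\rangle + g_{\mathrm{YM}}^2\langle B^- , B^-\rangle \\
&+ \tfrac12\langle\eta , \Delta_A\eta\rangle - \sqrt{2}\,g_{\mathrm{YM}}\langle B^- , P^-(d_A\eta)\rangle \\
&+ i\,\Big\{ s\Psi_{\mathrm{YM}} + \langle h_\xi , D_A^*\eta\rangle + h_k^i\langle\omega_i[A] , \eta\rangle \\
&\qquad + \big\langle h_{\tilde\phi_1} , D_A^*\tilde\psi\big\rangle + h_{r_1}^i\big\langle\omega_i[A] , \tilde\psi\big\rangle + \langle h^+_\chi , B^+\rangle \\
&\qquad - \langle\bar\xi , \Delta_A\xi\rangle - \bar k^i k^j\delta_{ij}\sqrt{V} \\
&\qquad + \big\langle\bar{\tilde\phi}_1 , \Delta_A\tilde\phi\big\rangle + \bar r_1^{\,i} r^j\delta_{ij}\sqrt{V} \\
&\qquad - \frac{1}{\sqrt{2}\,g_{\mathrm{YM}}}\langle\bar\chi^+ , P^+[F_A, \xi]\rangle + \big\langle\bar\chi^+ , D_A\tilde\psi\big\rangle \Big\},
\end{aligned} \tag{74} $$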
where $B^\pm$ are the self-dual and anti-self-dual components of $B$. Notice that there is only one term which is singular as $g_{\mathrm{YM}} \to 0$. However, this singularity can be easily removed if one rescales $\xi \to g_{\mathrm{YM}}\,\xi$ and $\bar\xi \to \bar\xi/g_{\mathrm{YM}}$. Now we can start integrating out the fields.

Step 1. Integrate $\bar k^i$, $k^i$, $\bar r_1^{\,i}$, $r^i$. As in the case of the covariant gauge fixing, this integration gives no contribution.

Step 2. Integrate $\bar\xi$, $\xi$, $\bar{\tilde\phi}_1$, $\tilde\phi$. Again, this integration does not contribute.

Step 3. Integrate $\bar\chi^+$, $\tilde\psi$, $h_{\tilde\phi_1}$, $h_{r_1}^i$. The relevant terms in the action can be written as
$$ \frac{i}{2}\,\langle X , MX\rangle , $$
where $X$ is the vector
$$ X = \begin{pmatrix} \tilde\psi \\ h_{r_1} \\ \bar\chi^+ \\ h_{\tilde\phi_1} \end{pmatrix} \in \Omega^1 \oplus \mathbb{R}^{m^-} \oplus \Omega^{(2,+)} \oplus \Omega^0 , $$
and $M$ is the anti-hermitian operator
$$ M = \begin{pmatrix}
0 & -\omega_A & -D_A^* & -D_A \\
\omega_A^* & 0 & 0 & 0 \\
D_A & 0 & 0 & 0 \\
D_A^* & 0 & 0 & 0
\end{pmatrix}. \tag{75} $$
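The scalar product is defined as in (2) on $\Omega^*(M,\mathrm{ad}P)$ and is the ordinary Euclidean scalar product on $\mathbb{R}^{m^-}$. The operator $\omega_A : \mathbb{R}^{m^-} \to \Omega^1(M,\mathrm{ad}P)$ is defined by
$$ \omega_A h_{r_1} = \sum_{i=1}^{m^-} \omega_i[A]\, h_{r_1}^i , $$
and its adjoint $\omega_A^* : \Omega^1(M,\mathrm{ad}P) \to \mathbb{R}^{m^-}$ acts as
$$ (\omega_A^*\tilde\psi)^i = \big\langle\omega_i[A] , \tilde\psi\big\rangle . $$
The functional integration will then produce the Pfaffian of $M$ which, as we will prove in App. A, is given, up to an irrelevant constant, by
$$ \mathrm{Pf}(M) \propto \Big(\det\big(\Delta^{(0)}_A - R_A\big)\,\det\Delta^{(2,+)}_A\,\det{}'\tilde\Delta^{(1)}_A\Big)^{1/4}\, V^{m^-/8}, \tag{76} $$
where
$$ R_A = D_A^*\,\pi_{\mathrm{coker}(D_A)}\,D_A : \Omega^0(M,\mathrm{ad}P) \to \Omega^0(M,\mathrm{ad}P). \tag{77} $$
Notice that the operator
$$ \hat\Delta_A = \Delta_A - R_A : \Omega^0(M,\mathrm{ad}P) \to \Omega^0(M,\mathrm{ad}P) \tag{78} $$
is invertible for $A \in \mathcal{N}'$ (see App. B).

Step 4. Integrate $h^+_\chi$, $B$. First notice that, since self-dual and anti-self-dual two-forms are orthogonal, the integration over $B$ can be replaced by an integration over $B^+$ and $B^-$ with Jacobian equal to 1. The $(h^+_\chi, B^+)$-integration is then trivial. The $B^-$-integration with source
$$ \big\langle B^- , P^-\big(-iF_A - \sqrt{2}\,g_{\mathrm{YM}}\,d_A\eta\big)\big\rangle $$
yields a constant term depending on $g_{\mathrm{YM}}$ (which we disregard) plus the following contribution to the action:
$$ \frac{1}{4g_{\mathrm{YM}}^2}\langle P^-F_A , P^-F_A\rangle - \frac{i}{\sqrt{2}\,g_{\mathrm{YM}}}\langle P^-F_A , P^-(d_A\eta)\rangle - \frac12\langle P^-(d_A\eta) , P^-(d_A\eta)\rangle . $$
Therefore, at this stage we get the following effective action:
$$ \frac{1}{4g_{\mathrm{YM}}^2}\langle P^-F_A , P^-F_A\rangle + \frac14\big\langle\eta , \tilde\Delta'_A\eta\big\rangle + i\,\Big\langle\eta , -\frac{1}{2g_{\mathrm{YM}}}D_A^*P^+F_A + D_A h_\xi + h_k^i\,\omega_i[A]\Big\rangle + i\,s\Psi_{\mathrm{YM}}, $$
where we have used the fact that $\sqrt{2}\,d_A^*P^-F_A = \sqrt{2}\,d_A^*P^+F_A = D_A^*P^+F_A$.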
Step 5. Integrate $\eta$, $h_\xi$, $h_k^i$. The $\eta$-integration yields $(\det{}'\tilde\Delta^{(1)}_A)^{-1/2}$ plus the following contribution to the action:
$$ \frac{1}{4g_{\mathrm{YM}}^2}\big\langle P^+F_A , \tilde Z_A P^+F_A\big\rangle - \frac{1}{g_{\mathrm{YM}}}\big\langle h_\xi , D_A^*\tilde G_A D_A^* P^+F_A\big\rangle + \big\langle h_\xi , \tilde X_A h_\xi\big\rangle + h_k^i h_k^j\delta_{ij}\sqrt{V}, $$
where
$$ \begin{aligned}
\tilde X_A &= D_A^*\tilde G_A D_A : \Omega^0(M,\mathrm{ad}P) \to \Omega^0(M,\mathrm{ad}P),\\
\tilde Z_A &= D_A\tilde G_A D_A^* : \Omega^{(2,+)}(M,\mathrm{ad}P) \to \Omega^{(2,+)}(M,\mathrm{ad}P).
\end{aligned} \tag{79} $$
Even though $D_A$ does not commute with $\tilde G_A$ (unless $A \in \mathcal{M}^-$), these two operators are identity operators as long as $A \in \mathcal{N}'$. For details, see App. B.

The $h_k^i$-integrations yield $(4V)^{-m^-/4}$, while the $h_\xi$-integration produces $(\det\tilde X_A)^{-1/2} = 1$ plus the contribution
$$ -\frac{1}{4g_{\mathrm{YM}}^2}\big\langle P^+F_A , \hat Z_A P^+F_A\big\rangle , $$
where
$$ \hat Z_A = D_A\tilde G_A D_A D_A^*\tilde G_A D_A^* : \Omega^{(2,+)}(M,\mathrm{ad}P) \to \Omega^{(2,+)}(M,\mathrm{ad}P). \tag{80} $$
As long as $A \in \mathcal{N}'$, this operator is null as proved in App. B.

Conclusions. Putting together the determinants coming from Steps 3 and 5 we find a net contribution
$$ J[A] = \frac{\big(\det\Delta^{(2,+)}_A\big)^{1/4}\,\big(\det(\Delta^{(0)}_A - R_A)\big)^{1/4}}{\big(\det{}'\tilde\Delta^{(1)}_A\big)^{1/4}}. \tag{81} $$
In App. B, we show that $J[A] = 1$ if $A \in \mathcal{N}'$. Moreover, Step 4 and Step 5 reconstruct the YM action in the form
$$ S_{\mathrm{YM}}[A] = \frac{1}{4g_{\mathrm{YM}}^2}\langle P^-F_A , P^-F_A\rangle + \frac{1}{4g_{\mathrm{YM}}^2}\langle P^+F_A , P^+F_A\rangle . $$
Therefore, we have proved the equivalence between BFYM and YM theory (for $A \in \mathcal{N}'$) by using the self-dual gauge fixing. More explicitly, we have shown that
$$ \int_{(T\mathcal{N}' \times T_{F_A}\mathcal{B})/\mathcal{G}_{\mathrm{aff}},\ \text{self-dual}} \exp(-S_{\mathrm{BFYM}}[A,\eta,B])\,\mathcal{O}[A] \;\propto\; \int_{\mathcal{N}'/\mathcal{G}} \exp(-S_{\mathrm{YM}}[A])\,\mathcal{O}[A]. \tag{82} $$
If we choose a gauge fixing that restricts to the self-dual gauge in the interior of $\mathcal{N}'$ and to the trivial gauge outside, we can extend the equivalence to the whole $\mathcal{A}$.

3.2.3. The perturbative expansion around an anti-self-dual connection. Again we can consider fluctuations around a background as in (57). Since we assume the connection not to be flat, we will have both terms in $q/g_{\mathrm{YM}}$ and in $g_{\mathrm{YM}}/q$, so we must take $q \sim g_{\mathrm{YM}}$. It is convenient to choose $q = \sqrt{2}\,g_{\mathrm{YM}}$.
The gauge-fixing conditions (67) on the fluctuations simply read
$$ d^*_{A_0}\eta + O(g_{\mathrm{YM}}) = 0, \qquad \eta \perp \widetilde{\mathrm{Harm}}{}^1_{A_0}(M,\mathrm{ad}P), \qquad \beta^+ = 0. $$
The quadratic part of the gauge-fixed action (74) reads
$$ \begin{aligned}
&-i\,\langle\beta^- , d_{A_0}\alpha\rangle + \langle F_{A_0} , \alpha\wedge\alpha\rangle + \tfrac12\langle\beta^- , \beta^-\rangle \\
&- \langle\beta^- , d_{A_0}e\rangle - i\,\langle F_{A_0} , [\alpha, e]\rangle + \tfrac12\big\langle e , \tilde\Delta'_{A_0}e + *[F_{A_0}, e]\big\rangle ,
\end{aligned} $$
plus the gauge-fixing terms. Unlike in the case of the covariant gauge fixing, the $\alpha\beta$- and $e$-sectors do not decouple. However, if we perform the change of variables
$$ \alpha' = \alpha - ie, \qquad e' = e, \qquad \beta' = \beta, \tag{83} $$
the quadratic part of the action turns out to be
$$ -i\,\langle(\beta')^- , d_{A_0}\alpha'\rangle + \langle F_{A_0} , \alpha'\wedge\alpha'\rangle + \tfrac12\langle(\beta')^- , (\beta')^-\rangle + \tfrac12\big\langle e' , \tilde\Delta'_{A_0}e'\big\rangle , \tag{84} $$
plus the gauge-fixing terms. Now the $\alpha'\beta'$- and $e'$-sectors decouple. Moreover, for the $\alpha'\beta'$-sector we recognize the propagators of the topological BF theory with a cosmological term in the self-dual gauge, see (106), whereas the $e'e'$-propagator turns out to be the same as the propagator for the fluctuation of the connection in YM theory [thanks to (13) and to the second equation of (63)].

4. The Relation with the Topological BF Theories

In the previous section, studying perturbative BFYM theory in the covariant gauge around a flat connection or in the self-dual gauge around a non-trivial anti-self-dual connection, we have discovered that a sector of the theory corresponds to the topological BF theory (pure or, respectively, with a cosmological term) in the same gauge. In this section we will recall the properties of the topological BF theories. The main problem with these theories is that the symmetries are described by a BRST operator that is nilpotent only on-shell. Therefore, one has to resort to the BV formalism, which we briefly introduce in Subsect. 4.1.

We also want to discuss the relations between the BFYM and the BF theories before starting perturbation theory. In the case of the self-dual gauge, this relation simply relies on the fact that $\langle B , B\rangle = -\langle B , *B\rangle$ when $B$ is anti-self-dual. The case of the covariant gauge fixing with a flat background connection is however more intricate, for it is related to the limit $g_{\mathrm{YM}} \to 0$, which is ill-defined as discussed at the end of Subsect. 2.4. We have already observed that in this limit the BFYM theory formally reduces to the topological pure BF theory plus a dynamical term for $\eta$, see Eq. (28). We have also observed that this limit is well-defined after fixing the gauge. However, we would like this limit to be meaningful for the theory even before a gauge is chosen.
Dealing with the theory defined by (28) presents some difficulties. In fact, the term hdA η , dA ηi has a different symmetry on shell (the T G action) and off shell (only the G action). Of course, one has to consider the larger symmetry if one wants to quantize
the theory. The on-shell symmetry for $\eta$ can be made into an off-shell symmetry of the whole theory by setting
$$ sB = [B, c] - d_A\tilde\psi + *[d_A\eta, \xi]. $$
However, now the BRST operator is nilpotent only on shell. (Notice that this is a problem affecting pure BF theory as well.)

Another way of seeing the problem is to perform the limit $g_{\mathrm{YM}}^2 \to 0$ in BFYM theory. We meet the following difficulties:
1. There is no way of getting in the limit the previous BRST transformation on $B$.
2. If we consider the BRST transformations as in (31), we get, in the limit, the correct on-shell symmetry for $\eta$ but a divergent transformation for $B$.
3. If we try to avoid this problem by rescaling $\xi \to \sqrt{2}\,g_{\mathrm{YM}}\,\xi$, we get a well-defined transformation for $B$, but the transformation for $\eta$ is correct only off shell now; this leads to contradictions when we try to quantize the theory. In fact, if we decide not to fix the gauge for $\eta$ we get in trouble when the curvature vanishes; on the other hand, if we want to gauge fix it, we have to introduce the antighost $\bar\xi$, but then we get in trouble since the $\xi$-dependent terms in the gauge-fixed action are killed. (Of course, if we first fix the gauge and then let $g_{\mathrm{YM}} \to 0$, we do not have any problem.)
4. If we also decide to rescale $\eta \to \sqrt{2}\,g_{\mathrm{YM}}\,\eta$, the quadratic term in $\eta$ disappears from the action. This means that the symmetry on $\eta$ is given, as we correctly obtain, by the whole $\mathcal{G}_{\mathrm{aff}}$ action. However, now $B$ can be shifted by $d_A\tilde\psi'$ with no relation between $\tilde\psi$ and $\tilde\psi'$. That is, we have to introduce new ghosts.

The solution to these problems is again in the use of the BV formalism.

4.1. The BV formalism. In the BV formalism, one considers the $\mathbb{Z}$-graded algebra of polynomials in the fields $\{\Phi^i\}$ of the theory. We will denote by $\epsilon(K)$ the grading – i.e., the ghost number – of the monomial $K$. As a shorthand notation, we will simply write $\epsilon_i$ for the ghost number of the generator $\Phi^i$.

Moreover, each field is given a Grassmann parity by the reduction mod 2 of the ghost number (if half-integer-spin particles are present, then their Grassmann parity is increased by one). To each field $\Phi^i$ is then associated an antifield $\Phi^\dagger_i$ which is completely equivalent to $\Phi^i$ in all respects but the ghost number; i.e., $\Phi^\dagger_i$ is a section of the same principal or associated bundle as $\Phi^i$ and is given ghost number by
$$ \epsilon(\Phi^\dagger_i) = -\epsilon_i - 1. \tag{85} $$

4.1.1. BV antibracket and Laplacian. Given two functions $X$ and $Y$ of the variables $\{\Phi^i, \Phi^\dagger_i\}$, one defines the BV antibracket as
$$ (X , Y) := X\left\langle \frac{\overleftarrow{\delta}}{\delta\Phi^i} , \frac{\overrightarrow{\delta}}{\delta\Phi^\dagger_i}\right\rangle Y - X\left\langle \frac{\overleftarrow{\delta}}{\delta\Phi^\dagger_i} , \frac{\overrightarrow{\delta}}{\delta\Phi^i}\right\rangle Y \tag{86} $$
and the BV Laplacian as
$$ \Delta X = \sum_i (-1)^{\epsilon_i}\left\langle \frac{\overrightarrow{\delta}}{\delta\Phi^i} , \frac{\overrightarrow{\delta}}{\delta\Phi^\dagger_i}\right\rangle X. \tag{87} $$
Notice that both the antibracket and the Laplacian increase the ghost number by one. We remark that the two previous operations are not independent: in fact, the BV antibracket can be written in terms of the BV Laplacian and of the pointwise product of functions. 4.1.2. Canonical transformations. The BV formalism is defined modulo canonical transformations, i.e., transformations of the fields and antifields that preserve the BV Laplacian and, consequently, the BV antibracket. A canonical transformation can be obtained by introducing a generating functional f F (8i , 8†i ), with (F ) = −1, such that fi = 8
− → δ F, f δ 8†i
8†i =
− → δ F. δ8i
(88)
In the BV context, there is no analogue of Liouville's theorem in classical mechanics, and, in general, the volume form is not preserved by canonical transformations. Notice that rescalings of the form $\Phi^i \to \lambda_i\Phi^i$, $\Phi^\dagger_i \to \Phi^\dagger_i/\lambda_i$ are canonical transformations, their generating functional being $F = \sum_i \lambda_i\langle\Phi^i, \tilde\Phi^\dagger_i\rangle$.
4.1.3. The implementation of symmetries. Suppose we have an action $S[\varphi]$, where by $\varphi$ we denote the classical fields (i.e., the zero-ghost-number fields that appear in the action). The study of the on-shell symmetries allows the construction of the BRST complex (i.e., the whole set of fields $\Phi^i$) together with the BRST operator $s$. In many cases, this operator turns out to be nilpotent also off shell, and the BRST formalism is enough to quantize the theory. However, there are situations (e.g., in the BF theories) where this is not true. In these cases, the BV formalism provides a useful generalization of the BRST formalism. First of all one has to look for the BV action, i.e., a functional $S^{BV}[\Phi, \Phi^\dagger]$ that solves the master equation
$$(S^{BV}, S^{BV}) = 0, \tag{89}$$
and reduces to the classical action $S$ when the antifields are turned off, viz.,
$$S^{BV}[\Phi, 0] = S[\varphi]. \tag{90}$$
In particular, one looks for a proper solution of (89); i.e., one requires the Hessian of $S^{BV}$ evaluated on-shell to have rank equal to the number of fields. There is a theorem that states that, under some mild assumptions on $S$, there exists one and only one (up to canonical transformations) proper solution $S^{BV}$ to the master equation (89) with boundary conditions (90). Thanks to the master equation and to the properties of the BV antibracket, the operator $\sigma$ defined by
$$\sigma X := (S^{BV}, X) \tag{91}$$
turns out to be nilpotent. The boundary condition (90) then ensures that, up to possible terms in the antifields, $\sigma$ acts on the fields as $s$.
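The nilpotency of $\sigma$ can be sketched in two lines. The following is only an outline, assuming that $S^{BV}$ is even and that the antibracket obeys a graded Jacobi identity; the overall sign in the middle step depends on the grading conventions and is therefore left as $\pm$.

```latex
\begin{align*}
\sigma^2 X &= \bigl(S^{BV}, (S^{BV}, X)\bigr) \\
           &= \pm\tfrac12\,\bigl((S^{BV}, S^{BV}), X\bigr)
              && \text{(graded Jacobi identity, $S^{BV}$ even)} \\
           &= 0 && \text{(master equation (89))}.
\end{align*}
```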
If the BRST operator $s$ is nilpotent also off shell, one can write the BV action as
$$S^{BV}[\Phi, \Phi^\dagger] = S[\varphi] + \sum_i \langle s\Phi^i, \Phi^\dagger_i\rangle. \tag{92}$$
In this case $\sigma = s$ on all fields.
4.1.4. The BV quantization. The quantization of the theory then proceeds by fixing the gauge. This is achieved, as in the BRST formalism, by introducing a gauge-fixing fermion $\Psi[\Phi]$. Now, however, the gauge-fixed action is defined by
$$S^{g.f.}[\Phi] = S^{BV}[\Phi, \Phi^\dagger]\Big|_{\Phi^\dagger_j = i\,\delta\Psi/\delta\Phi^j}. \tag{93}$$
If $S^{BV}$ has the form (92), then this procedure gives $S^{g.f.} = S + i\,s\Psi$, as in the BRST formalism. The condition that $S^{BV}$ should be a proper solution of the master equation makes perturbative quantization possible and (if $\Delta S = 0$) independent of small deformations of $\Psi$.$^8$ Moreover, the vacuum expectation value of a functional $O(\Phi, \Phi^\dagger)$ such that $\sigma O = 0$ turns out to be independent of small deformations of $\Psi$ as well.
4.2. Applications of the BV formalism. The theories we have considered in this paper (viz., first- and second-order YM theory and BFYM theory) have a BRST operator that closes also off shell; therefore, up to canonical transformations, they can be written as in (92). Explicitly, we have
$$\begin{aligned}
S^{BV}_{\rm YM} &= S_{\rm YM} + \langle d_A c, A^\dagger\rangle + \langle -\tfrac12[c,c], c^\dagger\rangle + \langle h_c, \bar c^\dagger\rangle,\\
S^{BV}_{\rm YM'} &= S_{\rm YM'} + \langle d_A c, A^\dagger\rangle + \langle [E,c], E^\dagger\rangle + \langle -\tfrac12[c,c], c^\dagger\rangle + \langle h_c, \bar c^\dagger\rangle,\\
S^{BV}_{\rm BFYM} &= S_{\rm BFYM} + \langle d_A c, A^\dagger\rangle + \langle [\eta,c] + d_A\xi - \sqrt2\, g_{\rm YM}\tilde\psi, \eta^\dagger\rangle\\
&\quad + \Bigl\langle [B,c] + \tfrac{1}{\sqrt2\, g_{\rm YM}}[F_A,\xi] - d_A\tilde\psi, B^\dagger\Bigr\rangle\\
&\quad + \langle -\tfrac12[c,c], c^\dagger\rangle + \langle -[\xi,c] + \sqrt2\, g_{\rm YM}\tilde\phi, \xi^\dagger\rangle\\
&\quad + \langle -[\tilde\psi,c] + d_A\tilde\phi, \tilde\psi^\dagger\rangle + \langle [\tilde\phi,c], \tilde\phi^\dagger\rangle + \sum_i\langle h_i, \bar c_i^\dagger\rangle,
\end{aligned} \tag{94}$$
where, in the last line, we have denoted by $h_i$ and $\bar c_i$ the Lagrange multipliers and antighosts.
The canonical transformation $\xi \to \sqrt2\, g_{\rm YM}\,\xi$, $\xi^\dagger \to \xi^\dagger/(\sqrt2\, g_{\rm YM})$ seems to remove all the singularities from $S^{BV}_{\rm BFYM}$. However, in the $g_{\rm YM} \to 0$ limit the BV action turns out not to be proper. As a consequence, if we fix the gauge with $\Psi$ as in (48), we do not get the kinetic term for $\xi$, $\bar\xi$ and quantization becomes impossible.

8 Usually one can choose $\Delta S = 0$. However, the quantum corrections due to renormalization generally break this condition. To deal with this case, one has to consider the quantum BV action $S^{BV}_\hbar(\Phi, \Phi^\dagger)$ which satisfies the quantum master equation
$$(S^{BV}_\hbar, S^{BV}_\hbar) + 2\hbar\,\Delta S^{BV}_\hbar = 0$$
and the boundary condition
$$S^{BV}_0(\Phi, \Phi^\dagger) = S^{BV}(\Phi, \Phi^\dagger).$$
Notice that to a BV action there might correspond no quantum BV action; in this case the theory is said to be anomalous. When the theory is not anomalous, fixing the gauge as in (93) with $S^{BV}$ replaced by $S^{BV}_\hbar$ yields a theory whose perturbative quantization is independent of small deformations of $\Psi$. Moreover, the quantum master equation implies that the operator $\sigma_\hbar$ defined as $\sigma_\hbar = \sigma + \hbar\Delta$ is nilpotent. Then one can show that the vacuum expectation value of a functional $O_\hbar(\Phi, \Phi^\dagger)$ such that $\sigma_\hbar O_\hbar = 0$ is independent of small deformations of $\Psi$.

4.2.1. The pure BF theory. The pure BF theory is described by the action
$$S_{\rm BF} = i\,\langle B, *F_A\rangle, \tag{95}$$
and its symmetries are encoded by the following BRST transformations:
$$\begin{aligned}
sA &= d_A c, & sc &= -\tfrac12[c,c],\\
sB &= [B,c] - d_A\tilde\psi, & s\tilde\phi &= [\tilde\phi, c],\\
s\tilde\psi &= -[\tilde\psi, c] + d_A\tilde\phi.
\end{aligned} \tag{96}$$
Notice that $s^2 \neq 0$ off shell; as a matter of fact, $s^2$ vanishes on all fields but on $B$, where one gets
$$s^2 B = -[F_A, \tilde\phi].$$
The BV action can be written as
$$\begin{aligned}
S^{BV}_{\rm BF} &= S_{\rm BF} + \langle d_A c, A^\dagger\rangle + \langle [B,c] - d_A\tilde\psi - \tfrac12 *[B^\dagger, \tilde\phi], B^\dagger\rangle\\
&\quad + \langle -\tfrac12[c,c], c^\dagger\rangle + \langle -[\tilde\psi,c] + d_A\tilde\phi, \tilde\psi^\dagger\rangle + \langle [\tilde\phi,c], \tilde\phi^\dagger\rangle + \sum_i\langle h_i, \bar c_i^\dagger\rangle,
\end{aligned} \tag{97}$$
where the sum is over the same antighosts and Lagrange multipliers as in BFYM but $\bar\xi$ and $h_\xi$. It can be shown that (97) is a proper solution of the master equation. Notice that the BV action is not linear in $B^\dagger$. This implies that the operator $\sigma$ on $B$ acts in a different way than the operator $s$; viz.,
$$\sigma B = [B,c] - d_A\tilde\psi - *[B^\dagger, \tilde\phi]. \tag{98}$$
On all other fields, σ = s. Notice that σ 2 = 0 since σB † = −[B † , c] − ∗FA . Quantization in the covariant gauge. The covariant gauge fixing for pure BF theory is defined exactly as in BFYM theory (33) and is quantistically implemented by the same gauge-fixing fermion (48) (of course, forgetting the conditions on η) using (93). After some algebra we can write the gauge-fixed action as cov. g.f. (99) = i hB , ∗FA i + i s9YM + SBF D E D E + hφe , d∗A ψe + hir1 ωi [A] , ψe + E D 1 + hψe , d∗A B + dA φe2 + r2 i ωi [A] + D D E E + hφe , d∗A ψe + hir2 ωi [A] , ψe + E D 2 √ + φe1 , 1A φe + r1 i rj δij V + D E D E e φ] e + ψe , 10 ψe . + 21 dA ψe , ∗[dA ψ, A
As is usual in theories whose BRST operatorD is nilpotent onlyEon shell, a cubic term e φ] e . This term is however appears in the ghost–antighost variables, viz., dA ψe , ∗[dA ψ, since there are no sources in φe1 . Recall that also in (53) killed byDthe φe1 φe integration E the term dA ψe , [FA , ξ] was irrelevant since the ξξ integration killed it. Therefore, the gauge-fixing terms in (53) are the same as those which appear in (99) plus the terms related to the gauge fixing on η. Explicitly, we have cov. g.f. cov. g.f. 2 = SBF + gYM hB , Bi + SBFYM
1 i hη , 10A ηi + is ξ , d∗A η + k hωi [A] , ηi . 2 (100)
Perturbative expansion. The equations of motion of BF theory are $F_A = d_A B = 0$. Therefore, $A$ is a flat connection. In the covariant background, we also have $B = 0$ (if there are harmonic two-forms, we have to require $B$ to be orthogonal to them). If we then consider fluctuations around a critical background (i.e., $F_{A_0} = B_0 = 0$) as in (57), we get the quadratic action
$$i\,\langle\beta, *d_{A_0}\alpha\rangle + \text{gauge-fixing terms}, \tag{101}$$
which corresponds to the $\alpha\beta$ part of (59). Notice that (101) is completely independent of the parameter $q$.
4.2.2. The BF theory with a cosmological term. This theory is described by the action
$$S_{{\rm BF},\kappa} = i\,\langle B, *F_A\rangle + i\,\frac{\kappa}{2}\,\langle B, *B\rangle, \tag{102}$$
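where $\kappa$ is a coupling constant known as the cosmological constant. The symmetries are encoded in the following BRST transformations:
$$\begin{aligned}
sA &= d_A c + \kappa\tilde\psi, & sc &= -\tfrac12[c,c] - \kappa\tilde\phi,\\
sB &= [B,c] - d_A\tilde\psi, & s\tilde\phi &= [\tilde\phi, c],\\
s\tilde\psi &= -[\tilde\psi, c] + d_A\tilde\phi.
\end{aligned} \tag{103}$$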
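Again, $s$ is not nilpotent off shell, and its failure is given by
$$s^2 B = -[F_A + \kappa B, \tilde\phi]$$
(notice that $[\tilde\psi, \tilde\psi] = 0$). The BV action reads
$$\begin{aligned}
S^{BV}_{{\rm BF},\kappa} &= S_{{\rm BF},\kappa} + \langle d_A c + \kappa\tilde\psi, A^\dagger\rangle + \langle [B,c] - d_A\tilde\psi - \tfrac12 *[B^\dagger, \tilde\phi], B^\dagger\rangle\\
&\quad + \langle -\tfrac12[c,c] - \kappa\tilde\phi, c^\dagger\rangle + \langle -[\tilde\psi,c] + d_A\tilde\phi, \tilde\psi^\dagger\rangle\\
&\quad + \langle [\tilde\phi,c], \tilde\phi^\dagger\rangle + \sum_i\langle h_i, \bar c_i^\dagger\rangle,
\end{aligned} \tag{104}$$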
and the only field on which σ acts in a different way is still B. σB is still given by (98), and its nilpotency is ensured by σB † = [−B † , c] − ∗(FA + κB). Quantization in the self-dual gauge. The self-dual gauge is defined by putting B + = 0 and again is well defined in the same hypotheses of Subsect. 3.2. Its quantum implementation is obtained by (93) with the gauge-fixing fermion (73) (forgetting of η). The gauge-fixed action then turns out to be s.d. g.f. = −i hB − , P − FA i − i κ2 hB − , B − i + SBF,κ E D D E ∗ e YM + i s0 9YM + κ ψe , δ9 + h , D ψ + A δA e1 φ D E
+ hir1 ωi [A] , ψe + h+χ , B + + D D E E √ e ∗ψ] e + + φe1 , 1A φe + r1 i rj δij V + κ φe1 , ∗[ψ, D D EE δ + κ r1 i ψe , δA ωi [A] , ψe + E D D E e + χ+ , DA ψe , + 21 χ+ , [χ+ , φ]
(105)
where $s_0$ is $s$ at $\kappa = 0$.
Notice that $[\tilde\psi, \tilde\psi]$ vanishes, yet $[\tilde\psi, *\tilde\psi]$ does not. This means that we have a source in $\tilde\phi_1$; hence, the $\tilde\phi_1\tilde\phi$-integration will produce a term which is quartic in the ghost variables (viz., $\langle[\chi^+, \chi^+], *G_A *[\tilde\psi, *\tilde\psi]\rangle$).
Therefore, differently from the covariant case (100), the BFYM theory and the BF theory with a cosmological term in the self-dual gauge have quite different vertices. However, we can still relate their quadratic parts.
Perturbative expansion. We consider fluctuations as in (57) with $q = \sqrt{\kappa}$, $A_0$ an anti-self-dual connection and $B_0 = -F_{A_0}/\kappa$ (notice that this is a solution of the equations of motion in the self-dual gauge). In this case, the quadratic action reads
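$$-i\,\langle\beta^-, d_{A_0}\alpha\rangle + i\,\langle F_{A_0}, \alpha\wedge\alpha\rangle - \frac{i}{2}\,\langle\beta^-, \beta^-\rangle + \text{gauge-fixing terms}.$$
By making the change of variables
$$\alpha' = e^{i\pi/4}\,\alpha, \qquad \beta' = e^{-i\pi/4}\,\beta,$$
we get
$$-i\,\langle(\beta')^-, d_{A_0}\alpha'\rangle + \langle F_{A_0}, \alpha'\wedge\alpha'\rangle + \frac12\,\langle(\beta')^-, (\beta')^-\rangle + \text{gauge-fixing terms}, \tag{106}$$
which is exactly the $\alpha'\beta'$ part of (84).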
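The effect of the change of variables on the three quadratic terms is pure phase bookkeeping, which can be checked numerically. The snippet below is a minimal sketch, assuming the pairing $\langle\cdot,\cdot\rangle$ is bilinear (not sesquilinear) so that phases simply multiply; the function name is illustrative and not from the paper.

```python
import cmath

# Inverse substitution: alpha = e^{-i pi/4} alpha', beta = e^{+i pi/4} beta'.
PHASE_ALPHA = cmath.exp(-1j * cmath.pi / 4)
PHASE_BETA = cmath.exp(+1j * cmath.pi / 4)


def transformed_coefficients():
    """Coefficients of the three quadratic terms in the primed variables.

    Assumes a bilinear pairing, so each field contributes its phase once
    per occurrence.
    """
    beta_d_alpha = -1j * PHASE_BETA * PHASE_ALPHA       # -i <beta^-, d alpha>
    f_alpha_alpha = 1j * PHASE_ALPHA * PHASE_ALPHA      # +i <F, alpha ^ alpha>
    beta_beta = -0.5j * PHASE_BETA * PHASE_BETA         # -(i/2) <beta^-, beta^->
    return beta_d_alpha, f_alpha_alpha, beta_beta


if __name__ == "__main__":
    for coefficient in transformed_coefficients():
        print(coefficient)
```

Up to rounding, the three coefficients come out as $-i$, $1$ and $1/2$, matching the coefficients displayed in (106).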
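4.2.3. From BFYM to pure BF theory as $g_{\rm YM} \to 0$. We want to find a canonical transformation that lets the symmetries of BFYM theory become similar to those of BF theory. First we compute the action of $\sigma$ on $B^\dagger$ in BFYM theory, getting
$$\sigma B^\dagger = -[B^\dagger, c] - *F_A + \sqrt2\, g_{\rm YM}\, d_A\eta - 2g^2 B.$$
Then we see that, if we make the change of variables
$$B \to \tilde B = B + \frac{1}{\sqrt2\, g_{\rm YM}} *[B^\dagger, \xi], \tag{107}$$
we get
$$\sigma\tilde B = [\tilde B, c] - d_A\tilde\psi - *[B^\dagger, \tilde\phi] + *[d_A\eta, \xi] + [[B^\dagger, \xi], \xi] - \sqrt2\, g_{\rm YM} *[\tilde B, \xi]. \tag{108}$$
The transformation (107) can be obtained as a canonical transformation generated by
$$F(\Phi, \tilde\Phi^\dagger) = \sum_i \langle\Phi^i, \tilde\Phi^\dagger_i\rangle + \frac{1}{2\sqrt2\, g_{\rm YM}}\,\langle\tilde B^\dagger, *[\tilde B^\dagger, \xi]\rangle. \tag{109}$$
Notice that, on all other fields than $B$ and on all antifields but $\xi^\dagger$, the transformation is the identity; on $\xi^\dagger$ we have
$$\xi^\dagger \to \tilde\xi^\dagger = \xi^\dagger - \frac{1}{2\sqrt2\, g_{\rm YM}} *[B^\dagger, B^\dagger].$$
Therefore, in the following we will drop all the tildes but on $\tilde B$ and $\tilde\xi^\dagger$.
We can now rewrite the BV action for BFYM theory in the new variables:
$$\begin{aligned}
S^{BV}_{\rm BFYM} &= \tilde S_{\rm BFYM} + \langle d_A c, A^\dagger\rangle + \langle [\eta,c] + d_A\xi - \sqrt2\, g_{\rm YM}\tilde\psi, \eta^\dagger\rangle\\
&\quad + \langle [\tilde B, c] - d_A\tilde\psi - \tfrac12 *[B^\dagger, \tilde\phi] + *[d_A\eta, \xi], B^\dagger\rangle\\
&\quad + \langle \tfrac12[[B^\dagger, \xi], \xi] - \sqrt2\, g_{\rm YM} *[\tilde B, \xi], B^\dagger\rangle\\
&\quad + \langle -\tfrac12[c,c], c^\dagger\rangle + \langle -[\xi,c] + \sqrt2\, g_{\rm YM}\tilde\phi, \xi^\dagger\rangle\\
&\quad + \langle -[\tilde\psi,c] + d_A\tilde\phi, \tilde\psi^\dagger\rangle + \langle [\tilde\phi,c], \tilde\phi^\dagger\rangle + \sum_i\langle h_i, \bar c_i^\dagger\rangle,
\end{aligned} \tag{110}$$
where $\tilde S_{\rm BFYM}$ is the BFYM action evaluated at $\tilde B$. Notice that now the BV action does not have singular terms in the limit $g^2_{\rm YM} \to 0$; moreover, the BV action is still proper in the limit.
Notice that, if we instead rescaled $\xi \to \sqrt2\, g_{\rm YM}\,\xi$, $\tilde\xi^\dagger \to \tilde\xi^\dagger/(\sqrt2\, g_{\rm YM})$ (in order for the BV transformation (108) of $\tilde B$ to become, in the limit $g^2_{\rm YM} \to 0$, the same as the BV transformation (98) on $B$ in pure BF theory), we would not get a proper BV action in the limit.
The same problems would be encountered if we decided to rescale also $\eta \to \sqrt2\, g_{\rm YM}\,\eta$ unless we introduced the required new ghosts.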
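The partition function of BFYM at $g_{\rm YM} = 0$. Consider the ($g_{\rm YM} = 0$)-BFYM action
$$\begin{aligned}
S^{BV,0}_{\rm BFYM} &= \tilde S^0_{\rm BFYM} + \langle d_A c, A^\dagger\rangle + \langle [\eta,c] + d_A\xi, \eta^\dagger\rangle\\
&\quad + \langle [\tilde B, c] - d_A\tilde\psi - \tfrac12 *[B^\dagger, \tilde\phi] + *[d_A\eta, \xi] + \tfrac12[[B^\dagger, \xi], \xi], B^\dagger\rangle\\
&\quad + \langle -\tfrac12[c,c], c^\dagger\rangle + \langle -[\xi,c], \xi^\dagger\rangle\\
&\quad + \langle -[\tilde\psi,c] + d_A\tilde\phi, \tilde\psi^\dagger\rangle + \langle [\tilde\phi,c], \tilde\phi^\dagger\rangle + \sum_i\langle h_i, \bar c_i^\dagger\rangle.
\end{aligned} \tag{111}$$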
Notice that the equations of motion impose $A$ to be flat. Therefore, to quantize the theory, it is convenient to choose the covariant gauge-fixing fermion $\Psi$ defined in (48). After fixing the gauge, we have at our disposal the rescaling $\xi \to \epsilon\xi$, $\bar\xi \to \bar\xi/\epsilon$. Since the partition function does not depend on the parameter $\epsilon$, we can as well let $\epsilon \to 0$. This way the $\eta$, $\xi$, $h_\xi$, $\bar\xi$ fields decouple from the others, and their contribution to the partition function turns out to be
$$\frac{\det\Delta^{(0)}_A}{\bigl(\det{}'\Delta^{(1)}_A\,\det X_A\bigr)^{1/2}}.$$
The $B$ integration then selects the flat connections. The partition function of BF theory is the analytic torsion, which is trivial in even dimension; moreover, notice that $X_A = 1$ if $A$ is a flat connection. Therefore, we have
$$Z_{\rm BFYM}\big|_{g^2_{\rm YM}=0} = \int_{A\in\mathcal{M}_0} \frac{\det\Delta^{(0)}_A}{\bigl(\det{}'\Delta^{(1)}_A\bigr)^{1/2}}, \tag{112}$$
where $\mathcal{M}_0$ is the moduli space of flat connections.
Notice that YM theory in the limit $g^2_{\rm YM} \to 0$ leads to the same result.

5. Geometry
In this section we discuss the geometrical meaning of the set of fields appearing in (39) and (40) and of the BRST Eqs. (31).$^9$ The situation is as follows:
1. In a topological gauge theory one deals with a connection $c$ on the bundle of gauge orbits $\mathcal{A} \to \mathcal{A}/\mathcal{G}$, considers the corresponding connection $A + c$ on the $G$-bundle $P \times \mathcal{A}$, and obtains the BRST equations as the structure equations and Bianchi identities for the curvature of $A + c$ [4].
2. In a non-topological Yang–Mills theory one considers the fiber immersion $j_A : \mathcal{G} \to \mathcal{A}$ and the pulled-back connection $j_A^* c$. This is just the Maurer–Cartan form on $\mathcal{G}$; the resulting structure equations for the curvature of $A + j_A^* c$ give the classical BRST equations [6]. It is customary to use the same symbol $c$ also for the pulled-back connection $j_A^* c$: in Yang–Mills theory this is the ghost field.
3. In a full topological theory that includes the field $\eta \in \Omega^1(M, {\rm ad}P)$, one has to consider the tangent bundle $T\mathcal{A}$, where there is a (free) action of the tangent gauge group $T\mathcal{G}$. In complete analogy to point 1 above, one should consider a connection on $T\mathcal{A}$ and explicitly spell out the structure equations and Bianchi identities for the corresponding connection on the $TG$-bundle $TP \times T\mathcal{A}$.

9 To simplify the notations, we will take $\sqrt2\, g_{\rm YM} = 1$ throughout this section.
4. In a semi-topological theory that includes the field $\eta$, one should not follow the analogy of point 2 above, i.e., consider the $T\mathcal{G}$-orbit in $T\mathcal{A}$; instead one should take into account the pulled-back bundle $j_A^* T\mathcal{A}$, where $j_A : \mathcal{G} \to \mathcal{A}$ is the fiber immersion. The BRST equations will then be given as the structure equations and the Bianchi identities for a curvature on the bundle $TP \times j_A^* T\mathcal{A}$.
In this way the connection $A$ is allowed to move only in a given $\mathcal{G}$-orbit, as in the Yang–Mills theory, and the only symmetry that the theory requires for the field $A$ is gauge invariance. On the contrary, in the bundle $j_A^* T\mathcal{A}$ the field $\eta$ may be any element of $\Omega^1(M, {\rm ad}P) \approx T_A\mathcal{A}$; this means that the symmetries of the theory include the translation invariance for such an $\eta$. In other words, the theory is topological in one field-direction ($\eta$) and non-topological in another field-direction ($A$).
The last framework is the one that suits the theory described in this paper. Another way to see this is to start with the action (30) of the group $\mathcal{G}_{\rm aff}$ on pairs $(A, \eta) \in T\mathcal{A}$. Such an action does the required job: it gauge-transforms $A$ and acts on $\eta$ by translations. Unfortunately, such an action is not free, so the quotient space is not a manifold. In order to get around this problem, one has to consider exactly the bundle $j_A^* T\mathcal{A}$ and obtain the BRST equations in the way mentioned above (point 4) and discussed in detail in the following pages.
It is exactly in the framework of point 4 discussed above that the field $B \in \Omega^2(M, {\rm ad}P)$ can be included in the BRST equations by keeping the BRST operator nilpotent.
Dealing with the tangent gauge group and with the tangent bundle of the space of connections means taking first-order approximations. These relations can be made clearer if we consider paths (straight lines) of connections on the bundle $P^I \times \mathcal{A}^I$, where $\mathcal{A}^I \equiv {\rm Map}([0,1], \mathcal{A})$ and $P^I$ is defined similarly.
Finally, in this section we are going to discuss the geometrical aspects of the gauge-fixing problems of our theory.
5.1. Tangent gauge group. Let $P$ be any $G$-principal bundle over a closed oriented manifold $M$. The tangent bundle $TG$ of any Lie group is a Lie group itself, which is isomorphic to the semidirect product $G \times_s {\rm Lie}(G)$ of the Lie group with its Lie algebra. The product of two pairs $(a, x), (b, y) \in G \times_s {\rm Lie}(G)$ is defined as
$$(a, x)(b, y) \equiv (ab, {\rm Ad}_{b^{-1}}(x) + y). \tag{113}$$
Its Lie algebra is the semi-direct sum of two copies of ${\rm Lie}(G)$ with commutator
$$[(x_1, y_1), (x_2, y_2)] \equiv ([x_1, x_2], [x_1, y_2] + [y_1, x_2]). \tag{114}$$
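The product (113) and the commutator (114) can be sanity-checked on explicit matrices. The following is a minimal sketch, not part of the paper: it assumes a matrix realization with integer $2\times2$ matrices of determinant 1 as group elements and arbitrary integer $2\times2$ matrices as algebra elements, so that all comparisons are exact; every function name is illustrative.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def add(a, b):
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))


def sub(a, b):
    return tuple(tuple(a[i][j] - b[i][j] for j in range(2)) for i in range(2))


def inv(g):
    """Inverse of an integer 2x2 matrix of determinant 1."""
    (p, q), (r, s) = g
    assert p * s - q * r == 1
    return ((s, -q), (-r, p))


def Ad(g, x):
    """Adjoint action Ad_g(x) = g x g^{-1}."""
    return matmul(matmul(g, x), inv(g))


def group_product(u, v):
    """Product (113): (a, x)(b, y) = (ab, Ad_{b^{-1}}(x) + y)."""
    (a, x), (b, y) = u, v
    return (matmul(a, b), add(Ad(inv(b), x), y))


def comm(x, y):
    return sub(matmul(x, y), matmul(y, x))


def algebra_bracket(u, v):
    """Commutator (114) on pairs of algebra elements."""
    (x1, y1), (x2, y2) = u, v
    return (comm(x1, x2), add(comm(x1, y2), comm(y1, x2)))


def is_associative(u, v, w):
    left = group_product(group_product(u, v), w)
    right = group_product(u, group_product(v, w))
    return left == right


def satisfies_jacobi(u, v, w):
    terms = [
        algebra_bracket(u, algebra_bracket(v, w)),
        algebra_bracket(v, algebra_bracket(w, u)),
        algebra_bracket(w, algebra_bracket(u, v)),
    ]
    zero = ((0, 0), (0, 0))
    total = (zero, zero)
    for t in terms:
        total = (add(total[0], t[0]), add(total[1], t[1]))
    return total == (zero, zero)


# Sample data: first entries have determinant 1, second entries are arbitrary.
U = (((1, 1), (0, 1)), ((0, 1), (2, -3)))
V = (((1, 0), (2, 1)), ((4, 0), (1, 5)))
W = (((2, 1), (1, 1)), ((-1, 2), (3, 0)))
```

Associativity of (113) rests on ${\rm Ad}_{(bc)^{-1}} = {\rm Ad}_{c^{-1}}{\rm Ad}_{b^{-1}}$, and the Jacobi identity for (114) follows from the one on ${\rm Lie}(G)$; `is_associative(U, V, W)` and `satisfies_jacobi(U, V, W)` test both on the sample data.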
The tangent bundle $TP$ is a $TG$-principal bundle with base space $TM$. The action of $TG$ on $TP$ (obtained as the derivative of the $G$-action on $P$) is given as follows:
$$TP \times TG \ni \bigl((p, X), (g, x)\bigr) \mapsto \bigl(pg, (R_g)_* X + i(x)_{pg}\bigr) \in TP, \tag{115}$$
where $R_g$ denotes the (right) multiplication by $g \in G$ and $i(x)_p$ denotes the fundamental vector field corresponding to $x \in {\rm Lie}(G)$ evaluated at $p \in P$.
A connection $A$ on $P$ is defined as a ${\rm Lie}(G)$-valued one-form on $P$ with special properties. First of all, we require its equivariance, viz.,
$$A_{pg}\bigl((R_g)_* X\bigr) = {\rm Ad}_{g^{-1}}\bigl(A_p(X)\bigr), \qquad X \in T_pP.$$
Moreover, we require it to be the identity on fundamental vector fields:
$$A_p\bigl(i(x)_p\bigr) = x, \qquad \forall x \in {\rm Lie}(G).$$
A smooth map $p : [a, b]^2 \subset \mathbb{R}^2 \to P$, with $p(0, 0) = p \in P$, defines an element of the double tangent
$$\Bigl(p, p', \dot p, \frac{dp'}{dt}\Bigr) \in TTP,$$
where $(t, s) \in [a, b]^2$ and the prime denotes the derivative w.r.t. $s$, while the dot denotes the derivative w.r.t. $t$. There is a canonical involution
$$\alpha : TTP \to TTP, \qquad \alpha\Bigl(p, p', \dot p, \frac{dp'}{dt}\Bigr) = \Bigl(p, \dot p, p', \frac{d\dot p}{ds}\Bigr).$$
Now we consider the evaluation map ${\rm ev} : \mathcal{A} \times TP \to {\rm Lie}(G)$ and its derivative
$${\rm ev}_* : T\mathcal{A} \times TTP \to {\rm Lie}(G) \times {\rm Lie}(G),$$
which has the following property.
Theorem 1. For any $(A, \eta) \in T\mathcal{A}$, the one-form on $TP$ given by
$$[p] \mapsto {\rm ev}_*(A, \eta; \alpha_P[p]) = \Bigl(A_p(p'),\ \frac{d}{dt}\Big|_{t=0} A_{p(0,t)}\bigl(p'(0, t)\bigr) + \eta(\dot p)\Bigr) \tag{116}$$
defines a connection on $TP$.
For the proof, see [12]. In this way we can identify the tangent bundle $T\mathcal{A}$ as a subset of the space of connections on $TP$. It can be seen easily that it is a proper subset [12].
The gauge group $\mathcal{G}$ is the space of equivariant maps
$$\mathcal{G} = {\rm Map}_G(P, G) \ni g \ \Rightarrow\ g(pa) = a^{-1}g(p)a, \qquad \forall a \in G.$$
We have the following:
Theorem 2. The tangent gauge group $T\mathcal{G}$ is a proper subgroup of the group of gauge transformations for $TP$.
Proof. Let $(\psi, \chi) \in \mathcal{G} \times_s {\rm Lie}(\mathcal{G})$. Notice that for any $(p, X) \in TP$ and $(g, x) \in G \times_s {\rm Lie}(G)$ we have the equation
$$\psi^{-1}d\psi\bigl(pg, (R_g)_* X + i(x)|_{pg}\bigr) = {\rm Ad}_{g^{-1}}\bigl[\psi^{-1}d\psi(X)\bigr] + x - {\rm Ad}_{g^{-1}}{\rm Ad}_{\psi^{-1}(p)}{\rm Ad}_g\, x.$$
This shows that the map
$$(\psi, \chi) : TP \to G \times_s {\rm Lie}(G)$$
given by
$$(p, X) \mapsto \bigl(\psi(p), \psi^{-1}d\psi(p, X) + \chi(p)\bigr) \in G \times_s {\rm Lie}(G)$$
is a gauge transformation for $TP$. Notice that the above map is given by the derivative of the evaluation map ${\rm ev} : P \times \mathcal{G} \to G$. For any one-form $\eta \in \Omega^1(M, {\rm ad}P)$ and for any $(\psi, \chi) \in \mathcal{G} \times_s {\rm Lie}(\mathcal{G})$, the map
$$(p, X) \mapsto \bigl(\psi(p), \psi^{-1}d\psi(p, X) + \chi(p) + \eta(p, X)\bigr)$$
is also a gauge transformation on $TP$, thus showing that the inclusion $T\mathcal{G}_P \subset \mathcal{G}_{TP}$ is proper.
In the proof of the previous theorem we showed explicitly that the group $\mathcal{G}_{\rm aff}$ (the semidirect product of $T\mathcal{G}$ with the abelian group $\Omega^1(M, {\rm ad}P)$) is also a subgroup of $\mathcal{G}_{TP}$.
Remark 1. From the discussion above we conclude that $T\mathcal{G}$ acts freely on $T\mathcal{A}$ and this action coincides with the restriction of the action of the gauge group of $TP$ ($\mathcal{G}_{TP}$) on the space $\mathcal{A}_{TP}$ of connections on $TP$.
Remark 2. The group $\mathcal{G}_{\rm aff}$ acts non-freely on $T\mathcal{A}$ as in (21). The group $\mathcal{G}_{\rm aff}$ is a subgroup of $\mathcal{G}_{TP}$, but the action (21) is not given by the restriction of the action of $\mathcal{G}_{TP}$ on $\mathcal{A}_{TP}$.
5.2. Paths on a principal bundle. For any manifold $X$ we denote by $X^I$ the space of smooth paths ${\rm Map}(I, X)$, where $I = [0, 1]$. If $P(M, G)$ is a principal bundle, the group $G^I$ acts freely on $P^I$ and the bundle $P^I(M^I, G^I)$ is a principal bundle. A path in $\mathcal{A} \equiv \mathcal{A}_P$ defines a connection on $P^I$. In this way we identify $\mathcal{A}^I$ with $\mathcal{A}_{P^I}$.
There is a natural bundle homomorphism
$$P^I \to TP, \qquad p(t) \mapsto \bigl(p(0), \dot p(0)\bigr), \tag{117}$$
which corresponds to the group homomorphism
$$G^I \to G \times_s {\rm Lie}(G), \qquad g(t) \mapsto \bigl(g(0), g^{-1}(0)\dot g(0)\bigr). \tag{118}$$
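That (118) is compatible with the product (113) can be verified directly. The computation below is a sketch assuming $G$ is a matrix group, so that products and derivatives of paths can be written explicitly; $g(t)$ and $h(t)$ denote two arbitrary paths in $G$.

```latex
\begin{align*}
(gh)^{-1}(0)\,\frac{d}{dt}\bigl(g(t)h(t)\bigr)\Big|_{t=0}
  &= h^{-1}(0)\,g^{-1}(0)\,\bigl(\dot g(0)\,h(0) + g(0)\,\dot h(0)\bigr) \\
  &= \mathrm{Ad}_{h(0)^{-1}}\bigl(g^{-1}(0)\,\dot g(0)\bigr) + h^{-1}(0)\,\dot h(0).
\end{align*}
```

The right-hand side is the second component of $(g(0), g^{-1}(0)\dot g(0))\,(h(0), h^{-1}(0)\dot h(0))$ computed with (113), while the first components multiply as $g(0)h(0)$.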
Under the homomorphisms (117) and (118), a connection $A(t)$ is sent into $\bigl(A(0), \dot A(0)\bigr) \in T\mathcal{A}$.
If we have a connection $c$ (a.k.a. a gauge fixing) on the bundle of gauge orbits $\mathcal{A} \to \mathcal{A}/\mathcal{G}$, then $A + c$ is a connection on the bundle
$$\frac{P \times \mathcal{A}}{\mathcal{G}} \to M \times \frac{\mathcal{A}}{\mathcal{G}}. \tag{119}$$
In fact the one-form on $P \times \mathcal{A}$ given by
$$(A + c)_{(p,A)}(X, \eta) \equiv A(X)_p + c(\eta)_{(A,p)} \tag{120}$$
is a connection on $P \times \mathcal{A}$ which is $\mathcal{G}$-invariant, i.e., descends to a connection on the principal $G$-bundle (119). Forms on $P \times \mathcal{A}$ have a bi-degree $(k, s)$, where $k$ is the order of the form on $P$ and $s$ is the order of the form on $\mathcal{A}$, a.k.a. the ghost number. By taking the tangent bundles of (119) one obtains the bundle
$$\frac{TP \times T\mathcal{A}}{T\mathcal{G}} \to TM \times \frac{T\mathcal{A}}{T\mathcal{G}}. \tag{121}$$
By considering the relevant path spaces, one has the bundle
$$\frac{P^I \times \mathcal{A}^I}{\mathcal{G}^I} \to M^I \times \frac{\mathcal{A}^I}{\mathcal{G}^I}. \tag{122}$$
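If $c(t)$ is a path of connections in $\mathcal{A} \to \mathcal{A}/\mathcal{G}$, and $A(t)$ is a path of connections in $\mathcal{A}$, then a connection on (122) is given by
$$A(t) + c_{A(t)}(t), \tag{123}$$
where we have explicitly represented the dependency of the connection $c(t)$ on the point $A(t) \in \mathcal{A}$. As particular paths we can take straight lines,
$$A(t) = A + t\eta, \quad \eta \in \Omega^1(M, {\rm ad}P), \qquad c(t) = c + t\hat c, \tag{124}$$
where $\hat c$ is an assignment to each connection $A \in \mathcal{A}$ of a map $\hat c_A : \Omega^1(M, {\rm ad}P) \to \Omega^0(M, {\rm ad}P)$ with the property of $\mathcal{G}$-equivariance,
$$\hat c_{A^g}\bigl({\rm Ad}_{g^{-1}}\tau\bigr) = {\rm Ad}_{g^{-1}}\bigl(\hat c_A(\tau)\bigr),$$
and of tensoriality, ${\rm Im}(d_A) \subset \ker(\hat c_A)$. In physics, $\hat c$ is an infinitesimal variation of the gauge fixing.
It is convenient to rewrite the connection given by (124) as
$$A + t\eta + c_{A+t\eta} + t\hat c_{A+t\eta} = A + c_A + t\bigl(\eta + \xi_{A,\eta}\bigr) + \sum_{n=2}^{+\infty} t^n \xi^{(n)}_{A,\eta}. \tag{125}$$
In the previous expression we have:
1. identified the tangent bundle $T\mathcal{A}$ with $\mathcal{A} \times \Omega^1(M, {\rm ad}P)$ [forms on $\mathcal{A}$ can then be evaluated on elements of $\Omega^1(M, {\rm ad}P)$];
2. defined
$$\xi^{(n)}_{A,\eta}(\tau) \equiv \frac{1}{n!}\frac{d^n}{dt^n}\Big|_{t=0} c_{A+t\eta}(\tau) + \frac{1}{(n-1)!}\frac{d^{(n-1)}}{dt^{n-1}}\Big|_{t=0} \hat c_{A+t\eta}(\tau), \qquad \tau \in \Omega^1(M, {\rm ad}P),$$
and set $\xi^{(1)}_{A,\eta} \equiv \xi_{A,\eta}$.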
As will be shown in a moment, the pair $(A + c, \eta + \xi)$ can be seen as an honest connection with values in the Lie algebra of the tangent group $TG$. First we notice that an infinite-dimensional version of (1) implies that the pair $(c, \hat c)$ defines a connection on the $T\mathcal{G}$-bundle $T\mathcal{A}$. Explicitly we have:
Theorem 3. When we identify the double tangent bundle $TT\mathcal{A}$ with $\mathcal{A} \times \Omega^1(M, {\rm ad}P)^{\times 3}$, then the connection on $T\mathcal{A}$ represented by $(c, \hat c)$ is a map
$$\mathcal{A} \times \Omega^1(M, {\rm ad}P)^{\times 3} \ni (A, \eta, \tau, \sigma) \mapsto \bigl(c_A(\tau), \xi_{A,\eta}(\tau) + c_A(\sigma)\bigr) \in {\rm Lie}(\mathcal{G}) \oplus_s {\rm Lie}(\mathcal{G}).$$
Now we look again at the bundles (121) and (122). Given the natural inclusions
$$P \to TP, \quad p \mapsto (p, 0), \qquad P \to P^I, \quad p \mapsto [p(t) = p],$$
we establish from now on the following
Convention 1. We will generally assume that the forms on $TP \times T\mathcal{A}$ we are going to consider are restricted to forms on $P \times T\mathcal{A}$ and that the forms on $P^I \times \mathcal{A}^I$ we are going to consider are restricted to forms on $P \times \mathcal{A}^I$.
Moreover, we assume
Convention 2. We consider only elements in $TT\mathcal{A} \approx \mathcal{A} \times \Omega^1(M, {\rm ad}P)^{\times 3}$ that have 0 as the fourth component.
We conclude that the ${\rm Lie}(G)$-valued form
$$(A + c, \eta + \xi), \tag{126}$$
whose explicit expression is given by
$$(A + c, \eta + \xi)_{p;A,\eta}(X, \tau) = \bigl(A_p(X) + c_A(\tau), \eta_p(X) + \xi_{A,\eta}(\tau)\bigr),$$
with $p \in P$, $X \in T_pP$ and $(A, \eta, \tau) \in \mathcal{A} \times \Omega^1(M, {\rm ad}P)^{\times 2}$, represents a connection on the bundle (121) provided that conventions 1 and 2 are understood.
5.3. Curvatures. As is customary in topological (cohomological) field theories [4], the BRST equations are nothing but the structure equations and the Bianchi identities for the connections of some bundles of fields.
We start by recalling the expression of the curvature of the connection (120). It is given by
$$F_A + \psi + \phi, \tag{127}$$
where the three terms above are forms of degree $(2, 0)$, $(1, 1)$, $(0, 2)$ in the product space $P \times \mathcal{A}$, the second number being the ghost number. More precisely:
1. $\psi$ is minus the projection of $\Omega^1(M, {\rm ad}P)$ on the horizontal subspace;
2. $\phi$ is the curvature of the connection $c$ on the bundle $\mathcal{A} \to \mathcal{A}/\mathcal{G}$.
The structure equations and the Bianchi identities in this case read
$$\begin{aligned}
\delta A &= d_A c - \psi, & \delta c &= -\tfrac12[c, c] + \phi, & d_A F_A &= 0,\\
\delta\psi &= d_A\phi - [\psi, c], & \delta\phi &= [\phi, c], & \delta F_A &= [F_A, c] - d_A\psi,
\end{aligned} \tag{128}$$
where we have denoted by $\delta$ the exterior derivative on $\mathcal{A}$, a.k.a. the BRST operator. The total derivative for $(k, s)$-forms on $P \times \mathcal{A}$ is given by
$$d_{\rm tot} = d + (-1)^k\delta. \tag{129}$$
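The sign $(-1)^k$ in (129) is what makes the total derivative square to zero. As a short check, assuming $d^2 = \delta^2 = 0$ and that $d$ and $\delta$ commute (they act on different factors of $P \times \mathcal{A}$; this commutation is the assumption made here), one finds on a $(k, s)$-form $\omega$:

```latex
\begin{align*}
d_{\rm tot}^2\,\omega
  &= d_{\rm tot}\bigl(d\omega + (-1)^k\delta\omega\bigr) \\
  &= d^2\omega + (-1)^{k+1}\delta d\omega
     + (-1)^k\bigl(d\delta\omega + (-1)^k\delta^2\omega\bigr) \\
  &= d^2\omega + \delta^2\omega + (-1)^k\,(d\delta - \delta d)\,\omega = 0,
\end{align*}
```

where the second line uses that $d\omega$ has degree $(k+1, s)$ while $\delta\omega$ has degree $(k, s+1)$.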
The commutator of forms of bidegree $(k, s)$ is assumed to satisfy the equation
$$\bigl[\omega_1^{(k_1,s_1)}, \omega_2^{(k_2,s_2)}\bigr] = (-1)^{k_1k_2 + s_1s_2 + 1}\bigl[\omega_2^{(k_2,s_2)}, \omega_1^{(k_1,s_1)}\bigr].$$
In this way the total covariant derivative satisfies the same sign-rule as (129), namely it is given by $d_A + (-1)^k\delta_c$.
Equations (128) are the field equations for the topological field theory considered in [4].
Next we consider the curvature of (125). It is given by
$$F_A + \psi + \phi + t\bigl(d_A\eta + \tilde\psi + \tilde\phi\bigr) + t^2\Bigl(\tilde\psi^{(2)} + \tilde\phi^{(2)} + \tfrac12[\eta + \xi, \eta + \xi]\Bigr) + o(t^2), \tag{130}$$
where the forms $\tilde\psi$ and $\tilde\phi$ are defined as
$$\begin{aligned}
\tilde\psi &\equiv d_A\xi + [\eta, c] - \delta\eta,\\
\tilde\psi^{(2)} &\equiv d_A\xi^{(2)},\\
\tilde\phi &\equiv \delta\xi + [c, \xi],\\
\tilde\phi^{(2)} &\equiv \delta\xi^{(2)} + [c, \xi^{(2)}] = \delta_c\xi^{(2)}.
\end{aligned} \tag{131}$$
Accordingly the curvature of the connection (126) is given by the first-order term of (130), i.e., by the ${\rm Lie}(G) \oplus_s {\rm Lie}(G)$-valued form
$$\bigl(F_A + \psi + \phi,\ d_A\eta + \tilde\psi + \tilde\phi\bigr). \tag{132}$$
The structure equations and the Bianchi identity for (132) and (130) are the natural generalization of (128). They are spelled out explicitly in [12]. Here we are concerned with the geometrical interpretation of the BRST equations considered in Sect. 3, and this requires a restriction to the $\mathcal{G}$-fiber in the bundle $T\mathcal{A} \to T\mathcal{A}/T\mathcal{G}$.
5.4. Restriction to the $\mathcal{G}$-fiber. In order to obtain the standard BRST equations from (128), we need to consider the fiber imbedding
$$j_A : \mathcal{G} \to \mathcal{A}, \qquad j_A(g) = A^g, \tag{133}$$
and the pulled-back bundle
$$P \times \mathcal{G} \to M \times \mathcal{G}. \tag{134}$$
The connection $A + c$ (120) on $P \times \mathcal{A}$ is pulled back to (134). It is customary (and unfortunately confusing) to denote the pulled-back connection by the same symbol $A + c$. This means that in this case $c$ is just the Maurer–Cartan form on $\mathcal{G}$. The curvature of the pulled-back connection is simply $F_A$, and the structure and Bianchi identities become
$$\delta A = d_A c, \qquad \delta c = -\tfrac12[c, c], \qquad d_A F_A = 0, \qquad \delta F_A = [F_A, c]. \tag{135}$$
These are the standard BRST equations for the Yang–Mills theory, and the connection $A + c$ on (134) gives the geometrical interpretation of the set of fields and ghosts appearing in (6) [6].
Let us apply now a similar fiber imbedding to the bundle $TP \times T\mathcal{A}$. Two choices are possible:
1. consider the fiber imbedding of the full tangent gauge group $T\mathcal{G}$, i.e., the bundle $TP \times j_{A,\eta}T\mathcal{G} \to TM \times j_{A,\eta}T\mathcal{G}$;
2. consider only the fiber imbedding (133) and restrict the tangent bundle $T\mathcal{A}$ to the image of $j_A$. This means considering the pulled-back bundle
$$TP \times j_A^*(T\mathcal{A}) \to TM \times j_A^*(T\mathcal{A}). \tag{136}$$
The first alternative would lead us to dealing with the Maurer–Cartan form on $T\mathcal{G}$. But we are in fact interested in the second alternative since we want the field $\eta$ to be generic and not restricted to be tangent to the $\mathcal{G}$-orbit. Hence, from now on, only the second alternative will be considered: this means that in the connection $(A + c, \eta + \xi)$ the form $c$ becomes the Maurer–Cartan form on $j_A(\mathcal{G})$. Taking always into consideration convention 1, the corresponding curvature becomes
$$\bigl(F_A,\ d_A\eta + \tilde\psi + \tilde\phi\bigr). \tag{137}$$
The Bianchi and structure equations for (137) are
$$\begin{aligned}
\delta A &= d_A c,\\
\delta\eta &= -\tilde\psi + d_A\xi + [\eta, c],\\
\delta F_A &= [F_A, c],\\
\delta(d_A\eta) &= -d_A\tilde\psi + [F_A, \xi] + [d_A\eta, c],\\
\delta c &= -\tfrac12[c, c],\\
\delta\xi &= \tilde\phi - [c, \xi],\\
\delta\tilde\psi &= -[\tilde\psi, c] + d_A\tilde\phi,\\
\delta\tilde\phi &= [\tilde\phi, c].
\end{aligned} \tag{138}$$
The connection $(A + c, \eta + \xi)$ for (136) gives the geometrical interpretation of the set of fields and ghosts appearing in (39) and (40) (but for the field $B$).
The analogous construction for the connection (125) implies the following steps:
1. take the map
$$j_A^*(T\mathcal{A}) \approx \mathcal{G} \times T_A\mathcal{A} \to \mathcal{A}^I, \qquad (g, A, \eta) \mapsto (A + t\eta)^g; \tag{139}$$
2. pull back the connection (125) to the bundle $P^I \times \mathcal{G} \times T_A\mathcal{A}$;
3. consider only constant paths in $P^I$, according to convention 1.
At this point the formal expression of the connection is the same as in (125), i.e.,
$$A + c + t(\eta + \xi) + \sum_{n=2}^{+\infty} t^n\xi^{(n)}, \tag{140}$$
but now $c$ is the Maurer–Cartan form. The relevant curvature is
$$F_A + t\bigl(d_A\eta + \tilde\psi + \tilde\phi\bigr) + t^2\Bigl(\tilde\psi^{(2)} + \tilde\phi^{(2)} + \tfrac12[\eta + \xi, \eta + \xi]\Bigr) + o(t^2). \tag{141}$$
Computing the structure and Bianchi identities for (141) will give again (138) and some other transformation laws for the fields $\tilde\psi^{(n)}$, $\tilde\phi^{(n)}$, $\tilde\xi^{(n)}$ which we do not discuss here (see [12]).
5.5. Including the field $B$. In four-dimensional BF quantum field theories, the field $B$ behaves like a curvature but does not depend on the connection. It is then represented by an element of $\Omega^2(M, {\rm ad}P)$. Now we show that such a field can be incorporated into the field Eqs. (138). By incorporating we mean that the BRST double complex with operators $(d, \delta)$ can consistently be extended to a double complex with operators $(d, s)$ that includes the space $\Omega^2(M, {\rm ad}P)$ in such a way that:
1. $s$ extends $\delta$, so that $s^2 = 0$,
2. the gauge equivariance is preserved, and
3. the field equations are preserved.
We use here the same notation as in Sect. 2; viz., we set $\mathcal{B} \equiv \Omega^2(M, {\rm ad}P)$, and consider the tangent bundle $T\mathcal{B} \approx \Omega^2(M, {\rm ad}P) \times \Omega^2(M, {\rm ad}P)$. The group $T\mathcal{G}$ acts on the cartesian product $T\mathcal{A} \times T\mathcal{B}$ as follows:
$$(A, \eta; C, E) \cdot (g, \zeta) = \bigl(A^g, {\rm Ad}_{g^{-1}}\eta + d_{A^g}\zeta;\ {\rm Ad}_{g^{-1}}C, {\rm Ad}_{g^{-1}}E + [{\rm Ad}_{g^{-1}}C, \zeta]\bigr), \tag{142}$$
yielding a principal $T\mathcal{G}$-bundle. Since the projection $\mathcal{A} \times \mathcal{B} \to \mathcal{A}$ is a morphism of $\mathcal{G}$-bundles, the connection $c$ on $\mathcal{A}$ is also a connection on $\mathcal{A} \times \mathcal{B}$, and the connection $(c, \xi)$ [determined by the pair $(c, \hat c)$] on $T\mathcal{A}$ is also a connection on $T\mathcal{A} \times T\mathcal{B}$. Moreover, $(A + c, \eta + \xi)$ is a ${\rm Lie}(TG)$-valued connection on the bundle
$$\frac{TP \times T\mathcal{A} \times T\mathcal{B}}{T\mathcal{G}} \to TM \times \frac{T\mathcal{A} \times T\mathcal{B}}{T\mathcal{G}}, \tag{143}$$
where again we intend to apply conventions 1 and 2. Forms on $TP \times T\mathcal{A} \times T\mathcal{B}$ will be characterized by three indices $(m, n, p)$ which represent the degree with respect to the three spaces $TP$, $T\mathcal{A}$, $T\mathcal{B}$. The middle integer $n$ is again the ghost number. The pair $(C, E) \in T\mathcal{B}$ is a ${\rm Lie}(TG)$-valued $(2, 0, 0)$-form that is constant on $T\mathcal{A}$.
If we neglect the forms of degree $(m, n, p)$ with $p > 0$, we find that, under the action of the total covariant derivative $d^{\rm tot}_{(A+c,\eta+\xi)}$, the pair $(C, E)$ is transformed into
$$\bigl(d_A C + [c, C],\ d_A E + [\eta, C] + [c, E] + [\xi, C]\bigr),$$
where we have used the fact that the pair $(C, E)$ is constant on the space $T\mathcal{A}$.
Now we are ready to consider the possible extensions of the BRST operator $\delta$ to $T\mathcal{B}$ that satisfy the requirements 1, 2 and 3 above. In order to take into account requirement number 2, we have to consider covariant derivatives, so we set
$$s(C, E) \equiv \bigl([C, c],\ [C, \xi] + [E, c]\bigr); \tag{144}$$
i.e., $(d_A - s)(C, E)$ coincides, up to forms of positive degree in the $T\mathcal{B}$-component, with $d^{\rm tot}_{(A+c,\eta+\xi)}(C, E)$. If the $(2, 2, 0)$ component of
$$\bigl(d^{\rm tot}_{(A+c,\eta+\xi)}\bigr)^2(C, E) = \bigl[F_{(A+c,\eta+\xi)}, (C, E)\bigr] = \bigl([F_A, C],\ [F_A, E] + [d_A\eta + \tilde\psi + \tilde\phi, C]\bigr) \tag{145}$$
is zero, then we may add to our field Eqs. (138) the transformations (144) and obtain a consistent BRST algebra that includes the elements $(C, E) \in T\mathcal{B}$. It is a matter of simple calculations to check the following
Theorem 4. A consistent BRST algebra that includes pairs $(C, E) \in T\mathcal{B}$ and extends (138) is possible only for pairs $(0, E)$ for any $E$.
If we perform the change of variables
$$d_A\eta + E = B \tag{146}$$
and replace $\delta$ with $s$ in (138), we obtain the following set of equations:
$$\begin{aligned}
sA &= d_A c,\\
s\eta &= -\tilde\psi + d_A\xi + [\eta, c],\\
sF_A &= [F_A, c],\\
sB &= -d_A\tilde\psi + [F_A, \xi] + [B, c],\\
sc &= -\tfrac12[c, c],\\
s\xi &= \tilde\phi - [c, \xi],\\
s\tilde\psi &= -[\tilde\psi, c] + d_A\tilde\phi,\\
s\tilde\phi &= [\tilde\phi, c],
\end{aligned} \tag{147}$$
which are immediately recognized as Eqs. (31). The change of variables (146) implies that $B$ is a tangent vector in $T_{F_A}\mathcal{B}$. Accordingly its transformation under the group $T\mathcal{G}$ is as follows:
$$B \cdot (g, \zeta) = {\rm Ad}_{g^{-1}}B + [F_{A^g}, \zeta]. \tag{148}$$
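The transformation law (148) can be recovered from (142) and (146). The following is a sketch under the assumptions that $C = 0$ (as in Theorem 4), that $d_{A^g}\,{\rm Ad}_{g^{-1}} = {\rm Ad}_{g^{-1}}\,d_A$, and that $d_{A^g}^2\zeta = [F_{A^g}, \zeta]$:

```latex
\begin{align*}
B\cdot(g,\zeta)
  &= d_{A^g}\bigl(\mathrm{Ad}_{g^{-1}}\eta + d_{A^g}\zeta\bigr) + \mathrm{Ad}_{g^{-1}}E \\
  &= \mathrm{Ad}_{g^{-1}}\bigl(d_A\eta\bigr) + [F_{A^g},\zeta] + \mathrm{Ad}_{g^{-1}}E \\
  &= \mathrm{Ad}_{g^{-1}}B + [F_{A^g},\zeta].
\end{align*}
```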
5.6. Gauge fixing and orbits. The fields of our theory are triples $(A, \eta, B)$, where $(A, \eta) \in T\mathcal{A}$ and $B \in T_{F_A}\mathcal{B}$. This space of fields can be described conveniently by means of the curvature map
$$K : \mathcal{A} \to \mathcal{A} \times \mathcal{B}, \qquad K(A) = (A, F_A), \tag{149}$$
which descends to a map
$$K : \frac{\mathcal{A}}{\mathcal{G}} \to \frac{\mathcal{A} \times \mathcal{B}}{\mathcal{G}}.$$
The space of fields [i.e., the set of triples $(A, \eta, B)$] coincides then with the set of elements of the pulled-back bundle $K^*(T\mathcal{A} \times T\mathcal{B})$. By taking into account the $T\mathcal{G}$-invariance, the space of orbits of the theory is given by
$$\frac{K^*(T\mathcal{A} \times T\mathcal{B})}{T\mathcal{G}} \approx \frac{K^*(H\mathcal{A} \times T\mathcal{B})}{\mathcal{G}}, \tag{150}$$
where by $H\mathcal{A}$ we denote the bundle of horizontal tangent vectors of $T\mathcal{A}$ with respect to a given connection on $\mathcal{A} \to \mathcal{A}/\mathcal{G}$. The above diffeomorphism is induced by the linear map
$$[A, \eta, B]_{T\mathcal{G}} \mapsto [A, \eta^H, B - d_A\eta^V]_{\mathcal{G}},$$
where the superscripts $H$ and $V$ denote the horizontal and vertical component.
The general $\mathcal{G}_{\rm aff}$-invariance of the action (27) implies the following further translational invariance:
$$K^*(H\mathcal{A} \times T\mathcal{B}) \ni (A, \eta, B) \mapsto (A, \eta + \tau, B - d_A\tau), \qquad \tau \in H\mathcal{A}. \tag{151}$$
Relative to this translational invariance, two different gauges are possible: 1. η = 0; 2. B ⊥ dA (HA); i.e., d∗A B ∈ Im(dA ). We can therefore identify the space of gauge-fixed fields as HA Harm1A (M, adP ) ⊕ Bˆ K2∗ T B ≈ , (152) G G where K2 : A → B denotes the second component of K, and Bˆ is a vector bundle over A defined by: BˆA ≡ {B ∈ 2 (M, adP ) | d∗A B ∈ Im(dA )}. Let us finally come to the self-dual gauge. Here we consider the operator DA : HA → (2,+) (M, adP ) [s. (61)], and assume the following condition: Im(DA ) = (2,+) (M, adP ).
(153)
We know that (153) is satisfied when A is an anti-self-dual connection. By the same argument discussed in Subsect. 3.2, we conclude that, if A is a connection such that (153) is satisfied, then there is a neighborhood of A in which the same condition is satisfied as well; so the set of connections for which (153) is satisfied is an open set. On such an open set there is another way of fixing the translational invariance (151). If we set B + = 0⇔B ∈ (2,−) (M, adP )⇔B ⊥ (2,+) (M, adP ), 1
(154)
] A (M, adP ). We can then then τ ∈ HA is determined only up to elements in Harm conclude that the space of gauge-fixed fields can, in a neighborhood of an anti-self-dual connection, be given by 1 ] HA HarmA (M, adP ) ⊕ (2,−) (M, adP ) K2∗ T B ≈ . (155) G G
6. Conclusions In this paper we have discussed the possibility of describing YM theory in terms of a theory that shares many characteristics with the topological field theories (of the BF type). The first step, Sect. 2, has considered the first-order formulation of YM theory with the addition of an extra field to be gauged away. The resulting theory (27), which we call BFYM theory, shows a formal resemblance with pure BF theory as the coupling constant vanishes. In Sect. 3, we have shown that BFYM theory is indeed equivalent to YM theory. Our proof is an explicit path-integral computation performed with three different (but equivalent) gauge fixings: the trivial, the covariant and the self-dual. The most interesting result is that perturbation theory in the last two gauges can be organized in a different way than in the second-order YM theory and explicitly shows the propagators of the topological BF theories. In Sect. 4, after recalling some basic facts on the BV formalism, we have given a brief description of the BV quantization of the BF theories. Moreover, we have shown that BFYM theory can be formulated in a canonically equivalent way so that the limit for vanishing coupling is well-defined and yields the pure BF theory plus a covariant kinetic term for the extra field. Finally, in Sect. 5, we have described the geometric structure of BFYM theory and have explicitly shown how to deal with the non-freedom of the action of the symmetry group on the space of fields. We conclude by recalling that one of the reasons for considering BFYM theory (and looking for its underlying topological properties) is the possibility of introducing new observables that might realize ‘t Hooft’s picture; but this will be discussed elsewhere. Acknowledgement. Two of us (F.F. & A.T.) thank S.P. Sorella for useful discussions.
A. Computation of the Pfaffian of M When studying BFYM theory in the self-dual gauge, we needed to compute the Pfaffian of the matrix M defined in (75). By a well known algebraic identity, (PfM )2 = det M ; therefore, (PfM )4 = det M 2 . An explicit computation using (65) and the last line of (63) yields
e 0A 0 0 1 √ M2 = − 0 V 1 0 , 0 0 N with
N=
1e 2 1A ∗ ∗ DA DA
DA DA eA 1
acting on (2,+) (M, adP ) ⊗ 0 (M, adP ). Up to a possible irrelevant phase, the deter− e (1) minant of M 2 is equal to the product V m /2 det 0 1 A det N . An explicit computation yields e (0) e (2) det N = det(1 A /2) det(1A − RA ), with
∗ ∗ e DA 2GA DA DA : 0 (M, adP ) → 0 (M, adP ). R A = DA
To simplify RA , we notice that e A DA : 1 (M, adP ) → (2,+) (M, adP ) 2G ∗ is the inverse of DA on its image; more precisely, we have ∗ e DA 2GA DA = πcoker(DA ) ,
which implies (77). Moreover, by the last line of (63), we have
$$\det(\tilde\Delta_A^{(2)}/2) = \det(\Delta_A^{(2,+)}).$$
Putting together all the pieces, we finally get (76).

B. Some Useful Properties of the Operator $D_A$

In Subsect. 3.2, we have introduced the operator $D_A$. This operator is an injection from the zero- to the one-forms since A is irreducible. Moreover, we have chosen to work in a neighborhood $\mathcal{N}'$ of the space of anti-self-dual connections where $D_A$ is also a surjection from the one- to the self-dual two-forms. These two properties are enough to prove a series of facts which were used in Subsect. 3.2 to prove the equivalence between YM theory and BFYM theory in the self-dual gauge. In Subsect. B.1 we will prove these facts in general; in Subsect. B.2 we will specialize to our case.

B.1. The general case. Let us consider three (finite-dimensional) vector spaces X, Y and Z together with an injective linear operator
$$p : X \to Y \qquad (156)$$
and a surjective linear operator
$$q : Y \to Z. \qquad (157)$$
Moreover, we assume $\ker q \cap \ker p^* = \{0\}$. Then we introduce the Laplace operators
$$\Delta_X = p^* p, \qquad \Delta_Y = pp^* + q^* q, \qquad \Delta_Z = qq^*. \qquad (158)$$
With our hypotheses, these operators are invertible; we will denote by $G_*$ their inverse.
Now we have the following:

Theorem 5. If qp = 0, then
1. $0 \to X \xrightarrow{\,p\,} Y \xrightarrow{\,q\,} Z \to 0$ is an exact sequence, and
2. $\dfrac{\det\Delta_X \,\det\Delta_Z}{\det\Delta_Y} = 1.$

Proof. Fact 1 just follows from the definition of exact sequence. For fact 2, write $Y = Y_1 \oplus Y_2$, with
$$Y_1 = \mathrm{Im}\, p = \ker q, \qquad Y_2 = \mathrm{Im}\, q^* = \mathrm{coker}\, q.$$
$\Delta_Y$ is block diagonal with respect to this decomposition, so $\det\Delta_Y = \det\Delta_{Y_1} \det\Delta_{Y_2}$, with
$$\Delta_{Y_1} = pp^* : Y_1 \to Y_1, \qquad \Delta_{Y_2} = q^* q : Y_2 \to Y_2.$$
Moreover, $Y_1$ ($Y_2$) is isomorphic to X (Z), and p and $p^*$ ($q^*$ and q) are invertible operators when restricted to this space. It follows that
$$\det\Delta_{Y_1} = \det\Delta_X, \qquad \det\Delta_{Y_2} = \det\Delta_Z.$$
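As a finite-dimensional sanity check of Theorem 5, the following minimal sketch evaluates the determinant ratio with exact rational arithmetic. The matrices `p` and `q` are hypothetical example data (not taken from the paper), chosen so that p is injective, q is surjective and qp = 0; the helper names are likewise only illustrative.

```python
from fractions import Fraction


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matadd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def det(m):
    """Determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in m]
    n, d = len(m), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return d


def theorem5_ratio(p, q):
    """Return det(Delta_X) det(Delta_Z) / det(Delta_Y) for real matrices p, q with qp = 0."""
    assert all(x == 0 for row in matmul(q, p) for x in row), "requires qp = 0"
    pt, qt = transpose(p), transpose(q)
    delta_x = matmul(pt, p)                          # p* p
    delta_y = matadd(matmul(p, pt), matmul(qt, q))   # p p* + q* q
    delta_z = matmul(q, qt)                          # q q*
    return det(delta_x) * det(delta_z) / det(delta_y)


# Hypothetical example: X = R^1, Y = R^3, Z = R^2, with Im p = ker q.
p = [[1], [1], [0]]
q = [[1, -1, 0], [0, 0, 1]]
ratio = theorem5_ratio(p, q)
```

For this choice the three determinants are 2, 2 and 4, so the ratio is 1, as the theorem asserts.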
If $qp \neq 0$, we cannot identify ker q with Im p. We can however reduce to the preceding situation by defining
$$\bar p = Hp : X \to Y, \qquad (159)$$
where H is the projection operator
$$H = 1 - q^* G_Z q : Y \to Y. \qquad (160)$$
Since qH = 0, we get $q\bar p = 0$. We will then define
$$\bar\Delta_X = \bar p^* \bar p, \qquad \bar\Delta_Y = \bar p \bar p^* + q^* q.$$
Then we have the following

Corollary 1. If $\bar p$ is injective and q is surjective, then
$$\frac{\det\bar\Delta_X \,\det\Delta_Z}{\det\bar\Delta_Y} = 1.$$

This corollary can however be refined to give the following

Theorem 6. If $\bar\Delta_X$ is invertible and q is surjective, then
$$\frac{\det\bar\Delta_X \,\det\Delta_Z}{\det\Delta_Y} = 1. \qquad (161)$$
Proof. First of all we notice that “$\bar p$ injective” is equivalent to “$\bar\Delta_X$ invertible”. Then we proceed as in the proof of Thm. 5 and split Y as $Y = Y_1 \oplus Y_2$, where $Y_1 = \ker q$ and $Y_2 = \mathrm{coker}\, q$. Notice that $HY_2 = \{0\}$ and $H|_{Y_1} = 1$, so $\ker H = Y_2$. The Laplace operator $\bar\Delta_Y$ is block diagonal with respect to the above decomposition of Y. Therefore,
$$\det\bar\Delta_Y = {\det}_{Y_1}(pp^*)\; {\det}_{Y_2}(q^* q).$$
The Laplace operator $\Delta_Y$ is not block diagonal but has the following form with respect to the above decomposition:
$$\Delta_Y = \begin{pmatrix} pp^* & pp^* \\ pp^* & pp^* + q^* q \end{pmatrix}.$$
By subtracting the first from the second row, we get
$$\det\Delta_Y = \det\begin{pmatrix} pp^* & pp^* \\ 0 & q^* q \end{pmatrix} = \det\bar\Delta_Y.$$
Remark. The condition that $\bar p$ is injective is equivalent to $\mathrm{Im}\, p \cap \mathrm{coker}\, q = \{0\}$ (supposing that p is injective). If qp = 0, then this condition is immediate since in this case Im p = ker q.

A dual way of solving the problem is to define
$$\tilde q = qK : Y \to Z \qquad (162)$$
with
$$K = 1 - pG_X p^* : Y \to Y. \qquad (163)$$
In this case, we get $\tilde q p = 0$. We can then define $\tilde\Delta_Z = \tilde q \tilde q^*$ and have a theorem analogous to Thm. 6. We summarize the previous results, and some more, in the following

Theorem 7. If p is injective and q is surjective, then
1. $\bar\Delta_X$ is invertible if and only if $\tilde\Delta_Z$ is, or, equivalently, if and only if $\mathrm{Im}\, p \cap \mathrm{coker}\, q = \mathrm{Im}\, q^* \cap \mathrm{coker}\, p^* = \{0\}$.
2. If any one is invertible then
a. $$\frac{\det\bar\Delta_X \,\det\Delta_Z}{\det\Delta_Y} = \frac{\det\Delta_X \,\det\tilde\Delta_Z}{\det\Delta_Y} = 1;$$
b. the operators
$$\xi = p^* G_Y p : X \to X, \qquad \zeta = qG_Y q^* : Z \to Z$$
are identity operators;
c. the operators
$$\hat\xi = p^* G_Y q^* q G_Y p : X \to X, \qquad \hat\zeta = qG_Y pp^* G_Y q^* : Z \to Z$$
are null operators.
Notice that if qp = 0 statements 2b and 2c follow trivially from the commutativity of p and q with the Laplace operators. The remarkable fact is that $\xi = \zeta = 1$ and $\hat\xi = \hat\zeta = 0$ even without this condition.

Proof. To prove 1 we notice that $\tilde\Delta_Z$ is invertible if and only if $\tilde q^*$ is injective, i.e., if and only if $\mathrm{Im}\, q^* \cap \mathrm{coker}\, p^* = \{0\}$ (supposing $q^*$ injective). Since $\mathrm{Im}\, p = \mathrm{coker}\, p^*$ and $\mathrm{Im}\, q^* = \mathrm{coker}\, q$, the last condition turns out to be $\mathrm{Im}\, p \cap \mathrm{coker}\, q = \{0\}$, which is equivalent to the invertibility condition for $\bar\Delta_X$ (s. the preceding remark).

Statement 2a follows from Thm. 6 after exchanging X with Z and p with $q^*$. By the way, either numerator is equal to the determinant of the operator
$$\begin{pmatrix} \Delta_Z & qp \\ p^* q^* & \Delta_X \end{pmatrix} : Z \oplus X \to Z \oplus X.$$

To prove 2b, first of all we notice that we have the commutation rules:
$$p\,\Delta_X - \Delta_Y\, p = -q^* qp, \qquad q\,\Delta_Y - \Delta_Z\, q = qpp^*,$$
together with their conjugates
$$\Delta_X\, p^* - p^* \Delta_Y = -p^* q^* q, \qquad \Delta_Y\, q^* - q^* \Delta_Z = pp^* q^*.$$
Now we multiply the above relations by $G_*$ both from the left and from the right, getting
$$\begin{aligned}
pG_X - G_Y p &= G_Y q^* qpG_X,\\
qG_Y - G_Z q &= -G_Z qpp^* G_Y,\\
G_X p^* - p^* G_Y &= G_X p^* q^* qG_Y,\\
G_Y q^* - q^* G_Z &= -G_Y pp^* q^* G_Z.
\end{aligned} \qquad (164)$$
Using the third relation of (164), we can write
$$\xi = 1 - G_X p^* q^* qG_Y p.$$
Then we use the second relation of (164) and obtain
$$\xi = 1 - G_X R + G_X R\,\xi, \qquad \text{where } R = p^* q^* G_Z qp.$$
Finally, we notice that
$$\bar\Delta_X = \Delta_X - R.$$
Applying $\Delta_X$ to the last formula for ξ, we get
$$\bar\Delta_X\, \xi = \bar\Delta_X.$$
If $\bar\Delta_X$ is invertible, this implies ξ = 1. A dual proof shows that ζ = 1 when $\tilde\Delta_Z$ is invertible.

To prove 2c, we use the fourth relation of (164) and obtain
$$\hat\xi = (1 - \xi)\, p^* q^* G_Z qG_Y p = 0.$$
A dual proof shows that $\hat\zeta = 0$.
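The statements of Theorem 7 can be checked numerically in a small example where qp does not vanish. The sketch below uses exact rational arithmetic; the matrices `p` and `q` are hypothetical example data (not from the paper), chosen so that p is injective, q is surjective and Im p meets coker q only in zero. The function and helper names are illustrative only.

```python
from fractions import Fraction


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matadd(a, b, sign=1):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def inverse(m):
    """Gauss-Jordan inverse over the rationals (raises StopIteration if m is singular)."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + identity(n)[i] for i, row in enumerate(m)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        pv = aug[c][c]
        aug[c] = [x / pv for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def det(m):
    """Determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in m]
    n, d = len(m), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return d


def theorem7_check(p, q):
    pt, qt = transpose(p), transpose(q)
    d_y = matadd(matmul(p, pt), matmul(qt, q))            # Delta_Y = p p* + q* q
    d_z = matmul(q, qt)                                   # Delta_Z = q q*
    g_y, g_z = inverse(d_y), inverse(d_z)
    h = matadd(identity(len(d_y)), matmul(qt, matmul(g_z, q)), sign=-1)  # H = 1 - q* G_Z q
    p_bar = matmul(h, p)
    d_x_bar = matmul(transpose(p_bar), p_bar)             # bar Delta_X
    ratio = det(d_x_bar) * det(d_z) / det(d_y)            # statement 2a
    gy_p = matmul(g_y, p)
    gy_qt = matmul(g_y, qt)
    xi = matmul(pt, gy_p)                                 # statement 2b
    zeta = matmul(q, gy_qt)
    xi_hat = matmul(matmul(pt, gy_qt), matmul(q, gy_p))   # statement 2c
    zeta_hat = matmul(matmul(q, gy_p), matmul(pt, gy_qt))
    return ratio, xi, zeta, xi_hat, zeta_hat


# Hypothetical example: X = R^1, Y = R^3, Z = R^2, with qp != 0.
p = [[1], [1], [1]]
q = [[1, 0, 0], [0, 1, 0]]
ratio, xi, zeta, xi_hat, zeta_hat = theorem7_check(p, q)
```

In this example the ratio in 2a equals 1, ξ and ζ are identity matrices, and the hatted operators vanish, even though qp is nonzero.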
The infinite-dimensional case. If X, Y and Z are (infinite-dimensional) Hilbert spaces, all the previous theorems hold if we add the hypothesis that p and q have elliptic Laplacians. In this case, by Hodge’s theorem, the spectra of the Laplace operators are discrete, the eigenspaces are finite-dimensional and each Hilbert space is the direct sum of these eigenspaces. Therefore, we can use the ζ-function regularization. Namely, if we denote by λ the eigenvalues of a Laplace operator Δ and by $d_\lambda$ the dimension of the corresponding eigenspace, the ζ-function is defined as
$$\zeta_s(\Delta) = \mathrm{Tr}\, \Delta^{-s} = \sum_\lambda \lambda^{-s} d_\lambda$$
for s large enough and then analytically extended. The regularized determinant is then defined as $\det\Delta := \exp[-\zeta_0'(\Delta)]$. The proof of Thm. 5 can be refined by showing that p is an isomorphism between each eigenspace of $\Delta_X$ and each eigenspace of $\Delta_{Y_1}$. This is essentially due to the fact that $p\,\Delta_X = \Delta_{Y_1}\, p$ and to the assumption that there are no zero eigenvalues. Therefore, $\zeta_s(\Delta_X) = \zeta_s(\Delta_{Y_1})$ for all s. By similar considerations on $\Delta_{Y_2}$ and $\Delta_Z$, we get finally
$$\zeta_s(\Delta_X) - \zeta_s(\Delta_Y) + \zeta_s(\Delta_Z) = 0, \qquad \forall s.$$
Differentiating with respect to s and setting s = 0 yields (the logarithm of) the required formula.

B.2. Our case. To apply the previous analysis to our case, we set $X = \Omega^0(M, \mathrm{ad}P)$,
1
] A (M, adP ), Y = 1 (M, adP ) Harm Z = (2,+) (M, adP ). The operators p and q correspond to the operator DA . With these definitions we have 1X = 1(0) A, (1) e 1Y = 1A ,
1Z = 1(2,+) A .
Moreover, b (0) ¯X =1 1 A, e ξ = XA , ζ = ZeA , ζˆ = ZbA , where the operators on the r.h.s. are defined in (78), (80). T T (79) and ∗ ∗ ) 1 (M, adP ) = {0} if coker DA Recall that, by the definition of N 0 , (Im DA b (0) A ∈ N 0 . Therefore, by statement 1 of Thm. 7, 1 A is invertible. Then, by statements 2b and 2c of Thm. 7 we see that eA = 1, ZeA = 1, ZbA = 0; X (165) finally, by statement 2a, we have J[A] = 1, with J[A] defined in (81).
(166)
References 1. Anselmi, D.: Topological Field Theory and Physics. Class. Quant. Grav. 14, 1–20 (1997) 2. Batalin, I.A. and Vilkovisky, G.A.: Relativistic S-Matrix of Dynamical Systems with Boson and Fermion Constraints. Phys. Lett. 69 B, 309–312 (1977); Fradkin, E.S. and Fradkina, T.E.: Quantization of Relativistic Systems with Boson and Fermion First- and Second-Class Constraints. Phys. Lett. 72 B, 343–348 (1978) 3. Becchi, C., Rouet, A. and Stora R.: Renormalization of the Abelian Higgs–Kibble Model. Commun. Math. Phys. 42, 127 (1975); Tyutin, I.V.: Lebedev Institute preprint N39, 1975 4. Baulieu, L. and Singer, I.M.: Topological Yang–Mills Symmetry. Nucl. Phys. B (Proc. Suppl.) 5B, 12 (1988) 5. Horowitz, G.T.: Commun. Math. Phys. 125, 417–436 (1989); Blau, M. and Thompson, G.: Topological Gauge Theories of Antisymmetric Tensor Fields. Ann. Phys. 205, 130–172 (1991); Birmingham, D., Blau, M., Rakowski, M. and Thompson, G.: Topological Field Theory. Phys. Rept. 209, 129 (1991) 6. Bonora, L. and Cotta-Ramusino, P.: Some Remarks on BRS Transformations, Anomalies and the Cohomology of the Lie Algebra of the Group of Gauge Transformations. Commun. Math. Phys. 87, 589–603 (1983); Bonora, L., Cotta-Ramusino, P., Rinaldi, M. and Stasheff, J.: The Evaluation Map in Field Theory, Sigma-Models and Strings. I, Commun. Math. Phys. 112, 237–282 (1987), & II Commun. Math. Phys. 114, 381–437 (1988) 7. Cattaneo, A.S.: Teorie topologiche di tipo BF ed invarianti dei nodi. Ph.D. Thesis, Milan University, 1995 8. Cattaneo, A.S., Cotta-Ramusino, P., Gamba, A. and Martellini, M.: The Donaldson–Witten Invariants in Pure 4D-QCD with Order and Disorder ’t Hooft-like Operators. Phys. Lett. B 355, 245–254 (1995) 9. Fucito, F., Martellini, M., Zeni, M.: The BF Formalism for QCD and Quark Confinement. Nucl. Phys. B 496, 259–284 (1997) 10. Cattaneo, A.S., Cotta-Ramusino, P. and Martellini, M.: Three-Dimensional BF Theories and the Alexander–Conway Invariant of Knots. Nucl. Phys. 
B 346, 355–382 (1995); Cattaneo, A.S., CottaRamusino, P., Fr¨ohlich, J. and Martellini, M.: Topological BF Theories in 3 and 4 Dimensions. J. Math. Phys. 36, 6137–6160 (1995); Cattaneo, A.S.: Cabled Wilson Loops in BF Theories. J. Math. Phys. 37, 3684–3703 (1996); Cattaneo, A.S.: Abelian BF Theories and Knot Invariants. Commun. Math. Phys. 189, 795–828 (1997) 11. Cattaneo, A.S., Cotta-Ramusino, P. and Rinaldi, M.: Loop and Path Spaces and Four-Dimensional BF Theories: Connections, Holonomies and Observables. In preparation 12. Cattaneo, A.S., Cotta-Ramusino, P. and Rinaldi, M.: BRST Symmetries for the Tangent Gauge Group. J. Math. Phys. 39, 1316–1339 (1998) 13. Cotta-Ramusino, P. and Martellini, M.: BF Theories and 2-Knots. In: Knots and Quantum Gravity (J. C. Baez ed.), Oxford, New York: Oxford University Press, 1994 14. Donaldson, S.K. and Kronheimer, P.B.: The Geometry of Four-Manifolds. Oxford, New York: Oxford University Press, 1990 15. Martellini, M., Zeni, M.: Feynman Rules and β-Function for the BF Yang-Mills Theory. Phys. Lett. B 401, 62–68 (1997) 16. Fucito, F., Martellini, M., Sorella, S.P., Tanzini, A., Vilar, L.C.Q. and Zeni, M.: Algebraic Renormalization of the BF Yang–Mills Theory. Phys. Lett. B 404, 94–100 (1997), See also Accardi, A. and Belli, A.: Renormalization Transformations of the 4-D BFYM Theory. Mod. Phys. Lett. A 12, 2353–2366 (1997), and Accardi, A., Belli, A., Martellini, A. and Zeni, M.: Cohomology and Renormalization of BFYM Theory in Three Dimensions. Nucl. Phys. B 505, 540–566 (1997), for the 3D case 17. Halpern, M.B.: Field Strength Formulation of Quantum Chromodynamics. Phys. Rev. D 16, 1798 (1977); Gauge Invariant Formulation of the Selfdual Sector. Phys. Rev. D 16, 3515 (1977) 18. Henneaux, M.: Anomalies and Renormalization of BFYM Theory. Phys. Lett. B 406, 66–69 (1997) 19. Reinhardt, H.: Dual Description of QCD. hep-th/9608191 20. 
Schwarz, A.S.: The Partition Function of Degenerate Quadratic Functionals and Ray–Singer Invariants. Lett. Math. Phys. 2, 247–252 (1978) 21. ‘t Hooft, G.: On the Phase Transition towards Permanent Quark Confinement. Nucl. Phys. B 138, 1 (1978); A Property of Electric and Magnetic Flux in Nonabelian Gauge Theories. Nucl. Phys. B 153, 141 (1979)
22. Witten, E.: Topological Quantum Field Theory. Commun. Math. Phys. 117, 353–386 (1988); Seiberg, N. and Witten, E.: Monopoles, Duality and Chiral Symmetry Breaking in N = 2 Supersymmetric QCD. Nucl. Phys. B 431, 581–640 (1994) Communicated by G. Felder
Commun. Math. Phys. 197, 623 – 640 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Discontinuity of the Spin-Wave Stiffness in the Two-Dimensional XY Model
L. Chayes
Department of Mathematics, University of California, Los Angeles, CA 90095-1555, USA
Received: 21 March 1997 / Accepted: 3 March 1998
Abstract: Using a graphical representation based on the Wolff algorithm, the (classical) d-dimensional XY model and some related spin-systems are studied. It is proved that in d = 2, the predicted discontinuity in the spin-wave stiffness indeed occurs. Further, the critical properties of the spin-system are related to percolation properties of the graphical representation. In particular, a suitably defined notion of percolation in the graphical representation is proved to be the necessary and sufficient condition for positivity of the spontaneous magnetization.
Introduction

Among the most noted early achievements of the renormalization group was the analysis of the defect (vortex) unbinding transition in two-dimensional systems with Abelian symmetries [B, KT]. The definitive (and experimentally accessible) prediction of this analysis is the occurrence of discontinuities at the edge of the low-temperature phase. Such a phenomenon is remarkable in view of the fact that the transition itself, by any other criterion, is continuous. In the language of superfluid systems, the above-mentioned discontinuity occurs in the superfluid density; for spin-systems, it occurs in the spin-wave stiffness, sometimes known as the helicity modulus. This prediction has been borne out by theoretical, numerical and experimental (and analog/experimental) tests; cf. the review articles [N, M] and references therein. In this note, a complete mathematical proof for the (classical) 2d-XY model is provided. The method of proof employs the graphical representation – or cluster representation – due to Wolff [W] (more precisely, the graphical representation that is implicit in the Wolff algorithm). The importance of understanding this representation was stressed in [PS] and this representation was exploited in [A] in the study of the “vortex-free” XY model. In [CMII], critical properties of the spin-system and the graphical representation were shown to be related. Here some characterizations are presented: Up to constant
factors, the magnetization in the spin-system is equal to the percolation density in the Wolff-representation and the susceptibility is “equal” to the average size of the connected clusters. Of more immediate relevance is the fact that the spin-wave stiffness tested in finite volume is directly related to crossing probabilities in the graphical representation and, in particular, a small stiffness implies and is implied by a small crossing probability. If this probability is “too small” then, using elementary rescaling ideas borrowed from rigorous percolation theory, it tends to zero exponentially at larger scales (which furthermore implies exponential decay of correlations). Thus, the stiffness is either uniformly positive at all scales or it is zero. The existence of a low-temperature phase with power law decay of correlations (proved in [FS]) thus implies a discontinuity of the stiffness at a positive temperature.

A related class of problems – in the sense that the RG equations turn out to be nearly identical – are the one-dimensional long-range discrete models, e.g. the $1/r^2$ Ising model. In this context, the magnetization at the critical point plays the role of the spin-wave stiffness and it was predicted in [T] to be discontinuous at $T_c$ (the Thouless effect). This was rigorously established in [ACCN] by vaguely similar methods: graphical representations and “real space renormalization group” inequalities. However, in the rigorous as well as in the renormalization group arenas the deeper relationship between these two problems is still unclear.

The remainder of this note is organized along the following lines: Below, the definition of the spin-wave stiffness used in this note is provided. In the next section, the Wolff representation is developed. Here, the key relationship between the spin-wave stiffness and appropriate crossing probabilities is derived.
This will be followed by the section in which the main result – the discontinuity of the spin-wave stiffness in d = 2 – is established. In the final section, some auxiliary results will be stated (but not proved) and in the appendix, complete proofs of these results and various properties of the Wolff representation will be provided.

Spin-wave stiffness

The spin-wave stiffness is the appropriate notion of a leading correction to the bulk free energy when the surface tension is zero. It may be defined as follows: Consider a regular finite volume d-dimensional shape V with two (separated) boundary components. Let $V_L$ denote the lattice approximation to this shape at scale L, i.e. the intersection of $\mathbb{Z}^d$ with the image of V that has been uniformly scaled by a factor of L. The general strategy is to consider the difference in free energies of the system with uniform boundary conditions and twisted boundary conditions on $V_L$. For typical ferromagnetic spin-systems, “uniform” means that all the boundary spins are aligned and “twisted” means that the two boundary components are individually aligned but are anti-parallel. For the purposes of this note, the above is sufficient. In more generality, one may consider cylindrical or even toroidal geometries which, in other contexts, are arguably a better choice, cf. the discussion in [FJB]. Modulo constants, for $L \gg 1$, the log of the ratio of the twisted and uniform partition functions serves to “define” the spin-wave stiffness K. Let us proceed more cautiously and define this ratio as
$$e^{-\beta K_L(V,\beta)\, g(V)\, L^{d-2}}$$
with β the inverse temperature and g(V) a geometric constant (which is essentially the capacitance) to be described below. A spin-wave stiffness may be defined via the limiting behavior of $K_L(V, \beta)$; since there is no general proof that the limit exists, let alone is independent of V, the matter will be left as it stands. Suffice it to say that if for any V of a roughly annular shape, $K_L(V, \beta)$ tends to zero then all possible $K_L$’s tend to zero (and similarly, in d > 2, if any $K_L(V, \beta) L^{d-2} \to 0$, then they all do).
Remark. In addition to the above mentioned geometries that will not be considered, it is worth noting that there is another class of geometries that also won’t be considered: One may try to define the spin-wave stiffness in geometries where there are more than two boundary components, the first two twisted/aligned and the rest free. The prominent example (which more or less falls into this class) is a hyper-rectangle where one pair of opposing faces is twisted/aligned and the other 2(d − 1) faces are free. Modulo the geometric constant to be discussed below (in these cases, one would presumably have to solve a free boundary problem which, for rectangular geometries, is particularly simple) a spin-wave stiffness could be “defined” pretty much as above. In all cases that have now been described in this note, the finite volume stiffnesses have definitive stochastic geometric interpretations in terms of crossing probabilities. (In the case of toroidal boundary conditions, this is not obvious – certainly it does not follow immediately from anything in this note – but it is nevertheless true [CMu]. For the boundary conditions introduced in this remark, it is indeed obvious and the relationship between this version of the stiffness and appropriate crossing probabilities follows mutatis mutandis the derivation in Proposition 1.) For independent percolation, it is not hard to show that if the probability of a crossing from the inside to the outside of an annulus at any scale is less than some small number (the value of which depends on the details of the shape) then the system is subcritical. (Arguments for percolation using annuli have appeared in a variety of contexts, e.g. [C, CMa, A].) Statements of this sort are sometimes possible with graphical representations of spin-systems, e.g. [CCFS] (and a number of systems discussed in [CMI, CMII] although an explicit proof has not been written).
But these systems are usually more difficult due to the dependence in these problems on boundary conditions. (This, in a nutshell, is what gives the work in this note a formidable appearance.) Typically, one must say that if the (annular) crossing probability is sufficiently small in those boundary conditions that optimize the crossing probability then the system is in some sort of high temperature phase. Further, one would like to relate crossing probabilities in such boundary conditions to an appropriate surface free energy or response function. On the other hand, it is only for independent percolation (to the author’s knowledge) where any such statement is possible concerning crossing probabilities of rectangles for the type of boundary conditions appropriate to a definition of spin-wave stiffness or surface tension. Indeed, for percolation, it is possible to show that if the “easy-way” crossing of a “squat” hyperrectangle (e.g. a 2L × 2L × · · · × 2L × L) is small then the system is subcritical (see, e.g. [CC] Prop. 2.10). And, for independent percolation, it is not hard to see that the easy-way crossing of rectangles is small if and only if the crossing probability from the inside to the outside of various annuli is small. These relations between the crossing probabilities in these geometries are readily established for independent percolation because of the essential absence of any boundary conditions in this system. Similar statements along these lines (again, to the author’s knowledge) have not been made in the context of graphical representations of spin-systems when the relevant boundary conditions are used. Further, and on an even more ambitious track, is to establish a definitive equivalence between smallness of hard-way crossings and easy-way crossings. (One direction for percolation – and even for certain graphical representations – is obvious; hence the nomenclature.)
To the best of the author’s knowledge, this has only been done in d = 2 for independent percolation for the case of a square – the so-called RSW lemma [R, SW]. Specifically, it was shown that if the probability of crossing a square is small then the probability of crossing rectangles the easy-way is also small. However, this has not yet been proved in d > 2 and in fact even in d = 2, this has not yet been pushed below the level of a square. Needless to say, such results also haven’t been established in the context of graphical representations for spin-systems. Indeed
here, not even a two-dimensional result along the lines of the RSW lemma is known to the author. In particular, it is worth noting that for models with self-duality – such as the Potts or (generalized) Ashkin–Teller models – an RSW lemma for a square crossing may represent the first step in proving, for the case of a continuous transition, that the self-dual point is the unique transition point. (In [BC] such results have recently been established if the transition is discontinuous.) However no such geometric lemmas seem to exist and certainly not for the representation used here (for which the author is not aware of any self-dual properties). Finally, the harder problems such as the analogs of RSW lemmas in d > 2 and, in d = 2, RSW lemmas for more extreme cases than squares – in boundary conditions easily related to surface tension or spin-wave stiffness – also do not appear to be any easier in the context of interacting graphical problems than they are in the independent case. Hence these issues will not be discussed further in this work and we will stick to the straightforward definition of stiffness as defined in annular regions.

Let us tend to the constant g(V). The models under consideration will have spins with bounded values in $\mathbb{R}^2$; let us assume that the bound is one. Furthermore (and here rather vaguely) let us assume that if the Hamiltonian is expressed in “deviation” variables, the leading non-constant term is quadratic with coefficient 1/2. Let $\phi_V$ be the solution to Laplace’s equation with boundary values ±1 on the two components. Then
$$g = \int_V |\nabla\phi_V|^2\, d^d x. \qquad (1)$$
With this definition, it is an elementary exercise to show, for the standard XY model on $\mathbb{Z}^d$ (e.g. as defined in Eq. (3.a) with unit couplings between neighboring sites) that
$$\lim_{L\to\infty} \lim_{\beta\to\infty} K_L(V, \beta) = 1. \qquad (2)$$
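To get a feeling for the geometric constant of Eq. (1), the following rough sketch estimates g numerically in d = 2 for the square annulus consisting of a square of side 3 with the central square of side 1 removed. The discretization (a five-point finite-difference Laplacian relaxed by Gauss–Seidel sweeps) and the function name `annulus_capacitance` are assumptions made for illustration only; they are not part of the paper. In d = 2 the Dirichlet integral is scale invariant, so the lattice spacing drops out and the estimate is simply a sum of squared differences over nearest-neighbour edges.

```python
def annulus_capacitance(n=8, sweeps=400):
    """Estimate g = int |grad phi|^2 for the square annulus S(3) minus S(1) in d = 2.

    n is the number of grid cells per unit length (an illustrative choice).
    phi = +1 on the outer boundary and -1 on (and inside) the inner square.
    """
    N = 3 * n

    def boundary_value(i, j):
        if i in (0, N) or j in (0, N):
            return 1.0                      # outer boundary component
        if n <= i <= 2 * n and n <= j <= 2 * n:
            return -1.0                     # inner boundary component (filled in)
        return None                         # free interior node

    phi = [[boundary_value(i, j) or 0.0 for j in range(N + 1)] for i in range(N + 1)]
    free = [(i, j) for i in range(N + 1) for j in range(N + 1)
            if boundary_value(i, j) is None]

    # Gauss-Seidel relaxation of the discrete Laplace equation.
    for _ in range(sweeps):
        for i, j in free:
            phi[i][j] = 0.25 * (phi[i - 1][j] + phi[i + 1][j]
                                + phi[i][j - 1] + phi[i][j + 1])

    # Discrete Dirichlet energy: sum of squared differences over all edges.
    g = 0.0
    for i in range(N + 1):
        for j in range(N + 1):
            if i < N:
                g += (phi[i + 1][j] - phi[i][j]) ** 2
            if j < N:
                g += (phi[i][j + 1] - phi[i][j]) ** 2
    return g


g_estimate = annulus_capacitance()
```

Comparison with circular annuli that are contained in, or contain, the square annulus bounds the continuum value between 8π/ln(3√2) and 8π/ln(3/√2); a coarse grid such as the default one should only be expected to land in that general range.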
In this paper, all that is needed is the simplest of annular shapes: Consider, in d = 2, the square of size 3, $S(3) = \{(x_1, x_2) \mid -\tfrac{3}{2} \le x_1 \le +\tfrac{3}{2},\ -\tfrac{3}{2} \le x_2 \le +\tfrac{3}{2}\}$ and S(1) defined accordingly. The shape of interest is $A \equiv S(3) \setminus S(1)$. In d > 2 the corresponding generalization is used: a hypercube of side 3 with the central hypercube of side 1 removed.

The Representation: Notation and Definitions

Although the primary concern is with the behavior of uniform systems on regular d-dimensional lattices, the cluster representation is just as easily formulated on an arbitrary (finite) graph. Indeed, there is a need for these sorts of generalities in order to formulate the representation of these systems in the presence of boundary conditions. Thus, let G denote a finite graph with sites $S_G$ and bonds $B_G$. For each $i \in S_G$, let $\vec s_i$ denote a 2d spin of length one and for each $\langle i, j\rangle \in B_G$, let $J_{i,j} > 0$ denote the couplings. The XY-Hamiltonian is given by
$$H_G^{XY} = -\sum_{\langle i,j\rangle} J_{i,j}\, \vec s_i \cdot \vec s_j. \qquad (3.a)$$
Writing $a_i$ and $b_i$ for the magnitude of the Y and X components respectively (here $0 \le a_i, b_i \le 1$) and allowing $\tau_i = \pm 1$ and $\sigma_i = \pm 1$, $H_G^{XY}$ may be read
$$H_G^{XY} = -\sum_{\langle i,j\rangle} J_{i,j}\, [a_i a_j \tau_i \tau_j + b_i b_j \sigma_i \sigma_j]. \qquad (3.b)$$
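The rewriting of Eq. (3.a) as Eq. (3.b) can be illustrated with a short numerical check. In the sketch below a unit spin is parametrized by an angle (an assumption of the illustration, as are the function names), split into magnitudes (a, b) and Ising signs (τ, σ), and the single-bond energy is evaluated both ways.

```python
import math


def ising_decomposition(theta):
    """Split the unit spin (cos theta, sin theta) into magnitudes and Ising signs."""
    x, y = math.cos(theta), math.sin(theta)
    b, sigma = abs(x), (1 if x >= 0 else -1)   # X component: magnitude and sign
    a, tau = abs(y), (1 if y >= 0 else -1)     # Y component: magnitude and sign
    return a, tau, b, sigma


def bond_energy_xy(theta_i, theta_j, coupling=1.0):
    """Single-bond term of Eq. (3.a): -J s_i . s_j."""
    return -coupling * math.cos(theta_i - theta_j)


def bond_energy_ising_form(theta_i, theta_j, coupling=1.0):
    """Single-bond term of Eq. (3.b): -J [a_i a_j tau_i tau_j + b_i b_j sigma_i sigma_j]."""
    ai, ti, bi, si = ising_decomposition(theta_i)
    aj, tj, bj, sj = ising_decomposition(theta_j)
    return -coupling * (ai * aj * ti * tj + bi * bj * si * sj)


# Arbitrary sample angles, one in each quadrant.
angles = [0.3, 2.0, 4.0, 5.5]
max_mismatch = max(abs(bond_energy_xy(s, t) - bond_energy_ising_form(s, t))
                   for s in angles for t in angles)
```

The two expressions agree up to floating-point rounding, which is just the statement that the dot product is the sum of the products of the X and Y components.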
For most of what remains, we will have little use for the specifics of the XY -model itself. Indeed, we might just as well allow the right-hand side of Eq. (3.b) to define the model along with some constraint on the (ai , bi ) that makes one a decreasing function of the other and an a priori distribution, fi , for the bi (which need not be continuous). For the purposes of brevity we will, however assume complete symmetry between the a’s and the b’s and that these objects are bounded. The idea behind the Wolff representation is to develop one (or both) of the Ising systems in an FK [FK] random cluster representation. 1 The partition is given by the usual P XZ Y β Ji,j [ai aj τi τj +bi bj σi σj ] hi,ji dfi (bi )e . (4) Z(G, J, β) = σ, τ
i
In the above, σ and τ are notation for the Ising configurations on G while J denotes the collection of couplings. And similarly, a and b will be notation for configurations of the magnitude of the spin components with the ai understood to be a function of the bi . Let us start by writing the Ising portion of the Hamiltonian in Potts form: σi σj = 2δσi σj − 1, etc. For fixed b, let us trace over the τ variables and then trade the σ degrees of freedom for those of an FK expansion. Thus let ZaI (β) denote the Ising partition function according to an Ising Hamiltonian written in Potts form: X Ji,j ai aj (δτi τj − 1), (5.a) HaI = − hi,ji
ZaI (β) =
X
e
−βHaI
.
(5.b)
τ
Here, the dependence of these quantities on G, and the (J) has been temporarily suppressed. Unfortunately, the relevant β is twice what appears in Eq. (5.b) so to avoid confusion, this parameter will stay with us. Performing the afore mentioned trace and expansion, we arrive at the weights (or density function) of a joint distribution for the b and bond configurations ω ⊂ BG : Y eβJi,j (ai aj +bi bj ) Wb;2β (ω), (6) VβW (b, ω) = ZaI (2β) hi,ji
where $W_{b;2\beta}(\omega)$ are the usual (q = 2) FK weights with couplings $J_{i,j} b_i b_j$ and inverse temperature 2β:

$$W_{b;2\beta}(\omega) = q^{C(\omega)} \prod_{\langle i,j\rangle \in \omega} p_{i,j} \prod_{\langle i,j\rangle \notin \omega} (1 - p_{i,j}), \tag{7}$$

with $p_{i,j} = 1 - e^{-2\beta J_{i,j} b_i b_j}$ and $C(\omega)$ the number of connected components of ω. The measures defined by the weights in Eq. (6) will be denoted by $\nu_\beta^W(-)$. Let us consider the two marginal distributions: (i) Integrate out the b degrees of freedom to obtain a measure on the bond configurations ω. These will be denoted by $M_\beta(-)$ – or $M^{*}_{\beta,G...}(-)$, with ∗ signifying possible boundary conditions to be discussed later. (ii) Integrate out the ω degrees of freedom (i.e. skip the FK step and trace the σ variables). The associated density will be denoted by $\rho_\beta(-)$ – or $\rho^{*}_{\beta,G...}(-)$ when the need arises. Finally, let us consider the conditional FK measures, $\mu_b^{FK}(-)$, determined by the weights in Eq. (7). These distributions allow for a convenient decomposition of $M_\beta(-)$,

$$M_\beta(-) = \int_b d\rho_\beta(b)\, \mu_b^{FK}(-). \tag{8}$$

¹ In typical simulations one does this for only one of the Ising variables – as will most often be the case here – but picking a direction at random. However, as argued in [CMII], it may be advantageous to use the full expansion in conjunction with the Invaded Cluster algorithm.
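The representation in Eqs. (6) and (7) can be sanity-checked by brute force on a very small graph. The sketch below is an illustration only, not part of the original argument: for one fixed configuration b it compares the direct sum over σ and τ of the Boltzmann factor in Eq. (4) with the sum over bond configurations ω of the weights in Eq. (6). The triangle graph, the numerical couplings, the choice $a_i = \sqrt{1 - b_i^2}$ and all function names are assumptions made for the example.

```python
import itertools
import math


def count_components(n, open_bonds):
    """Number of connected components (isolated sites included) via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in open_bonds:
        parent[find(i)] = find(j)
    return len({find(x) for x in range(n)})


def spin_sum(n, bonds, J, a, b, beta):
    """Sum over sigma, tau of exp(beta * sum J [a a tau tau + b b sigma sigma])."""
    total = 0.0
    for sigma in itertools.product((-1, 1), repeat=n):
        for tau in itertools.product((-1, 1), repeat=n):
            s = sum(J[(i, j)] * (a[i] * a[j] * tau[i] * tau[j]
                                 + b[i] * b[j] * sigma[i] * sigma[j])
                    for i, j in bonds)
            total += math.exp(beta * s)
    return total


def z_ising_potts(n, bonds, J, a, beta):
    """Z_a^I(beta) of Eq. (5.b), with H_a^I = -sum J a a (delta - 1) as in Eq. (5.a)."""
    total = 0.0
    for tau in itertools.product((-1, 1), repeat=n):
        h = -sum(J[(i, j)] * a[i] * a[j] * (int(tau[i] == tau[j]) - 1)
                 for i, j in bonds)
        total += math.exp(-beta * h)
    return total


def wolff_sum(n, bonds, J, a, b, beta, q=2):
    """Sum over bond configurations omega of the weights V(b, omega) of Eq. (6)."""
    prefactor = z_ising_potts(n, bonds, J, a, 2 * beta)
    for i, j in bonds:
        prefactor *= math.exp(beta * J[(i, j)] * (a[i] * a[j] + b[i] * b[j]))
    # Bond probabilities p = 1 - exp(-2 beta J b b), as in Eq. (7).
    p = {e: 1.0 - math.exp(-2 * beta * J[e] * b[e[0]] * b[e[1]]) for e in bonds}
    fk_total = 0.0
    for mask in itertools.product((0, 1), repeat=len(bonds)):
        open_bonds = [e for e, m in zip(bonds, mask) if m]
        w = float(q) ** count_components(n, open_bonds)
        for e, m in zip(bonds, mask):
            w *= p[e] if m else 1.0 - p[e]
        fk_total += w
    return prefactor * fk_total


# Hypothetical test case: a triangle with arbitrary couplings and magnitudes.
n = 3
bonds = [(0, 1), (1, 2), (0, 2)]
J = {(0, 1): 1.0, (1, 2): 0.5, (0, 2): 2.0}
b = [0.3, 0.8, 0.6]
a = [math.sqrt(1 - x * x) for x in b]  # one admissible constraint between a and b
beta = 0.7

lhs = spin_sum(n, bonds, J, a, b, beta)
rhs = wolff_sum(n, bonds, J, a, b, beta)
print(lhs, rhs)
```

The two numbers agree up to rounding. The agreement relies on the negative exponent in $p_{i,j}$ and on the doubling of β that comes from $\sigma_i\sigma_j = 2\delta_{\sigma_i\sigma_j} - 1$.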
Some immediate applications of these measures have been discussed in [A] and [CMII]. For example, in the usual isotropic XY case, if $T_{i,j}$ is the (bond) event that i is connected to j then, e.g. in free boundary conditions,

$$2 M_{\beta,G}(T_{i,j}) \ge \langle \vec{s}_i \cdot \vec{s}_j \rangle_{\beta,G} \tag{9}$$
with $\langle - \rangle_{\beta,G}$ denoting expectation with respect to the canonical distribution. This has been supplemented by a lower bound proportional to a power of $M_{\beta,G}(T_{i,j})$. Here we will obtain a lower bound of a constant times $M_{\beta,G}(T_{i,j})$. Of direct relevance to the present work is the following: Let $K_L(A, \beta)$ denote the spin wave stiffness as discussed in the introduction. Explicitly, let $Z^{\imath^+ o^+}(A_L, \beta)$ denote the partition function on the annulus $A_L$ with boundary conditions obtained by setting all boundary spins on the inner boundary (ı) and the outer boundary (o) to the X-direction. (Or, in the language of Eq. (3.b), all the $b_i$'s are set to their maximum values and $\sigma_i \equiv 1$ on the boundary.) Similarly let $Z^{\imath^- o^+}(A_L, \beta)$ be the partition function for the setup in $A_L$ where the spins on the outer boundary are pointing in the positive X-direction and the spins on the inner boundary are pointing in the negative X-direction. Thus

$$e^{-\beta g(A) K_L(A,\beta) L^{d-2}} \equiv Z^{\imath^- o^+}(A_L, \beta) / Z^{\imath^+ o^+}(A_L, \beta).$$
Concerning the “$\imath^+ o^+$” system, it is clear that we can treat this setup along the lines already described: the boundary spins act as a single spin albeit with a concentrated distribution. Let us denote by $M^{1^{++}}_{\beta,A_L}(-)$ the bond measure associated with these boundary conditions and let $T_{\imath,o}$ denote the event of a connection between the inner and outer boundaries of $A_L$. The first claim is

Proposition 1.

$$1 - Z^{\imath^- o^+}(A_L, \beta) / Z^{\imath^+ o^+}(A_L, \beta) = M^{1^{++}}_{\beta,A_L}(T_{\imath,o}).$$
In particular, the spin-wave stiffness is related in a simple way to the probability of a connection between the boundary components of $A_L$.

Proof. As is well known, in random cluster measures corresponding to Potts systems with spins on the boundary set to some fixed value, the weights for the graphical configurations are given by the standard ones with the interpretation that $C(\omega)$ counts only the components that are disconnected from the boundary. (Equivalently, up to an irrelevant constant, one counts all the sites that are attached to the boundary as part of the same component.) Thus if we write

$$Z^{\imath^+ o^+}(A_L, \beta) = \sum_{\omega} \int_b dV_\beta^{W,1^{++}}(b, \omega), \tag{10}$$

the sum contains terms both with and without connections between the boundaries. On the other hand, in a situation where two separate boundary components in the Potts system are set to different values, the rule for counting clusters is the same but now bond configurations containing connections between these components are assigned zero weight. Thus for fixed b, the formula for the Wolff weights $V_\beta^{W,1^{+-}}(b, \omega)$ corresponding to the twisted boundary condition is seen to be identical except for the proviso that ω does not connect ı with o – and here these configurations are discounted. The desired result is established.

It is plausible that these measures enjoy various monotonicity properties but, in any case, this will not be easy to prove. In particular it turns out that the joint measure is not strong FKG. What can be proved is that for a certain class of boundary conditions – that are called the -boundary conditions – the ρ-measures do have the FKG property. The precise definition of a -boundary condition is somewhat intricate but this class includes every boundary condition of physical interest where one could expect the FKG property to hold, e.g. free, periodic and setting all the boundary spins to the positive X-direction. Furthermore, among all boundary specifications in the -class, this latter mentioned is maximal in the sense of FKG. The same dominance therefore holds over the -class of specifications which is defined as superpositions of specifications from the -class. This larger class has the property that its restrictions to smaller sets are also in the -class relative to the “larger” boundary component. The relevant consequences of the above are summarized in the form of a lemma:

Lemma 2. Let G denote a graph. Then for every $L \subset S_G$, there is a class of specifications on L called the -class such that: (1) If $K \supset L$ and ∗ is a -specification on L then the restriction of the various measures, $\nu_\beta^{W,*}(-)$, $M^{*}_{\beta,G}(-)$, etc.
to the complement of K is itself a -class specification on K. (2) Setting all spins of L to the X-direction constitutes a -class specification on L; this is denoted by the $1^+$ boundary conditions on L. If ∗ is any other -specification on L then

$$M^{1^+}_{\beta,G}(-) \underset{\mathrm{FKG}}{\ge} M^{*}_{\beta,G}(-).$$
A proof (including relevant definitions) will be supplied in the appendix. The important point is that among all possible relevant boundary conditions on $A_L$, the one that maximizes the probability of $T_{\imath,o}$ is precisely $M^{1^{++}}_{\beta,A_L}(-)$.

Main Results

With the identity of Proposition 1 and the inequalities of Lemma 2, the main argument reduces to a standard routine in percolation theory:

Theorem 3. There is an $\epsilon_0 = \epsilon_0(d)$ such that if for any $L_0$, $M^{1^{++}}_{\beta,A_{L_0}}(T_{\imath,o}(L_0)) < \epsilon_0$ then

$$\lim_{L\to\infty} M^{1^{++}}_{\beta,A_L}(T_{\imath,o}(L)) = 0.$$

In particular, under these conditions, $M^{1^{++}}_{\beta,A_L}(T_{\imath,o})$ tends to zero exponentially fast in L.
Proof. Suppose that $M^{1^{++}}_{\beta,A_{L_0}}(T_{\imath,o}(L_0)) \le \epsilon < \epsilon_0$ with $\epsilon_0$ to be specified below. Let $N \gg 1$ and consider the event $T_{\imath,o}(N L_0)$ for the annulus $A_{N L_0}$. Divide $A_{N L_0}$ into a grid of scale $L_0$ so as to have the appearance of an $A_N$ on the large scale lattice. If $P : \imath \to o$ is a path in $A_{N L_0}$, each “site” on the large scale lattice that is visited by P has achieved an event like $T_{\imath,o}(L_0)$ – with the possible exception of the sites next to the boundary. Let us declare a “site” of $A_N$ to be “occupied” if the analog of $T_{\imath,o}(L_0)$ occurs and vacant otherwise. For the sake of being definitive, let us deem all sites neighboring the boundary of $A_N$ to be occupied. It is clear that $M^{1^{++}}_{\beta,A_{N L_0}}(T_{\imath,o}(N L_0))$ does not exceed the probability of a connection between the ı and the o of $A_N$ in the large-scale problem. Now of course, these site variables are not independent. However, let us regard a sublattice consisting of a fraction – $1/3^d$ – of these sites as sitting in the center of a translate of $A_{L_0}$ with these translates of $A_{L_0}$ situated in such a way that they tile the lattice. With the maximizing boundary conditions on these translates of $A_{L_0}$, the site occupation variables of the sublattice are independent and the probability of occupation is bounded above by ε. There are $3^d$ possible ways to design such sublattices (depending on which sites are chosen as the centers) such that each site of $A_N$ is a central site on one of these $3^d$ sublattices. Thus an “occupied cluster” consisting of K interior sites of $A_N$ must have at least $1/3^d$ of these sites on (at least) one of the sublattices. Therefore, the probability of a given occupied cluster with K interior sites is less than $\epsilon^{K/3^d}$. The minimum sized cluster that permits the possibility of an actual path is essentially N and there are only of the order of $N^{d-1}$ starting points on the inner boundary. Hence

$$M^{1^{++}}_{\beta,A_{N L_0}}(T_{\imath,o}) \le C_2 N^{d-1} \sum_{K > N - C_1} [\lambda(d)\, \epsilon^{1/3^d}]^K \tag{11}$$
with $C_1$ and $C_2$ constants of the order of unity and λ(d) < (d − 1) the connectivity constant. It is evident that if $\epsilon < \epsilon_0 = 1/\lambda^{3^d}$, the stated result follows.

Corollary. For the 2d models, the spin-wave stiffness does not go continuously to zero at any temperature. In any dimension, if the conditions of Theorem 3 hold for some finite $L_0$, there is exponential decay of correlations in any limiting -state.

Proof. According to Lemma 2, the -state that maximizes the probability of $T_{i,j}$ is always the $1^+$-state. Under the conditions stated in Theorem 3, it is clear that the probability of $T_{i,j}$ tends to zero exponentially in any limiting -state. (Later we will show that under these conditions there is in fact a unique limiting -state.) Using a bound along the lines of Eq. (9), exponential decay for the 2-point function is readily established: The factor of 2 in this inequality is for the X and Y-component pieces of $\vec{s}_i \cdot \vec{s}_j$. Indeed, in any boundary condition ∗,

$$\langle s_i^{[X]} s_j^{[X]} \rangle^{*}_{\beta,G} \equiv \langle b_i \sigma_i b_j \sigma_j \rangle^{*}_{\beta,G} \le M^{*}_{\beta,G}(T_{i,j}) \tag{12}$$

with connections through the boundary included in the definition of $T_{i,j}$. Since, among limiting -states this is maximized in the limiting $1^+$-state, the correlation among the X-components goes to zero. The correlations between the Y components (in -states) would be maximized in the analog of the $1^+$-state and hence, by the symmetry between X and Y components, is also (in any -state) always bounded by the probability of $T_{i,j}$ in the $1^+$-state. Thus we actually recover Eq. (9) for the $1^+$-states and the conclusion about exponential decay is immediate. The statement concerning the spin wave stiffness is a tautology; however, cf. Remark 2 below.
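The arithmetic behind the bound (11) in the proof of Theorem 3 is a geometric series, and a small numerical sketch may help to see how the threshold $\epsilon_0 = 1/\lambda^{3^d}$ arises. Everything numerical below is hypothetical: the value of λ, the constants $C_1$ and $C_2$, and the function names are chosen only for illustration and are not taken from the paper.

```python
import math


def epsilon_threshold(lam, d):
    """Threshold eps_0 = lam ** (-(3 ** d)), below which lam * eps ** (1 / 3 ** d) < 1."""
    return lam ** (-(3 ** d))


def tail_bound(eps, lam, d, N, C1=1, C2=1.0):
    """Closed form of C2 * N**(d-1) * sum_{K > N - C1} x**K, with x = lam * eps**(1/3**d).

    Assumes integer N > C1, so that the sum starts at K = N - C1 + 1.
    """
    x = lam * eps ** (1.0 / 3 ** d)
    if not 0.0 <= x < 1.0:
        raise ValueError("the series only converges for eps below the threshold")
    first = N - C1 + 1
    return C2 * N ** (d - 1) * x ** first / (1.0 - x)


# Hypothetical numbers: d = 2 and an assumed connectivity constant lam = 3.
lam, d = 3.0, 2
eps0 = epsilon_threshold(lam, d)
eps = eps0 / 2.0
for N in (50, 100, 200):
    print(N, tail_bound(eps, lam, d, N))
```

For ε below the threshold the polynomial prefactor $N^{d-1}$ is eventually dominated by the geometric factor, which is the exponential decay asserted in Theorem 3.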
Remark 1. If $\beta_c$ is defined by the infimum over temperatures at which $K_\infty(\beta)$ is zero, then, by an obvious continuity argument, $K_\infty(\beta_c) > 0$ in d = 2. For the XY-model, the results of [FS] (concerning the existence of a region of power law decay of correlations) rather easily imply that such a discontinuity occurs at a finite β.

Remark 2. Starting with [NK], detailed renormalization group studies of this “class” of problems predict a universal value of $\beta_c K_\infty(\beta_c)$. Although the present derivation is a far cry from a proof of any such statement, it is worth observing that the same set of results proved in Theorem 3 hold for a variety of models with “O(2)” characteristics – e.g. the $Z_{4n}$-clock models – using the same value of $\epsilon_0$. Thus we have a universal lower bound on $\beta_c K_\infty(\beta_c)$. This is analogous to (and borrowed from) the current situation in percolation theory: various crossing probabilities – even the one used here – which at the critical point are believed to converge to universal values at large length scale, can at least be shown to satisfy uniform bounds with universal constants.
Additional Results

Some further results will be stated below but all the remaining proofs have been relegated to the appendix. The usual definition of percolation in correlated models starts, in finite volume, with the probability of a connection to the boundary in the boundary conditions that optimize this probability (cf. [CMI], definition following Eq. (II.11)). Here, let us define:

Definition. Let $\Lambda \subset Z^d$ be a finite connected set that contains the origin and let $T_{0,\partial\Lambda}$ denote the event that the origin is connected to the boundary. Let

$$\Pi_\Lambda(\beta) = M^{1^+}_{\beta,\Lambda}(T_{0,\partial\Lambda}) \equiv \max_{*} M^{*}_{\beta,\Lambda}(T_{0,\partial\Lambda}) \tag{13.a}$$

(the maximum being over specifications ∗ in the -class) and

$$\Pi_\infty(\beta) = \lim_{\Lambda \nearrow Z^d} \Pi_\Lambda(\beta). \tag{13.b}$$
In light of Lemma 2, the existence of this limit is not hard to establish. The actual percolation probabilities, denoted by P's instead of Π's, are defined as in Eqs. (13) but with the maximum taken over all boundary conditions.

Theorem 4. (A) Let m(β) denote the spontaneous magnetization. Then there are finite non-zero constants, $c_1$ and $c_2$ (that depend only on minor details of the model) such that

$$c_2 \Pi_\infty(\beta) \le m(\beta) \le c_1 \Pi_\infty(\beta).$$

(B) If m(β) = 0, there is a unique limiting -state.

Proof. A proof will be provided in the appendix.

Remark. The results concerning uniqueness are hardly an improvement over the existing results which apply to most of the models considered here – uniqueness among translation invariant states when the magnetization vanishes [MMPf]. Of greater concern (to the author) is the connection between phase transitions in the spin-systems and percolation in the corresponding graphical representation. This is further underscored by the final result:
Theorem 5. Let ∗ denote any finite volume -measure or infinite volume limit thereof and let $\langle s_i^{[X]} s_j^{[X]} \rangle^{*}_{\beta} \equiv \langle b_i \sigma_i b_j \sigma_j \rangle^{*}_{\beta}$ denote the (untruncated) correlation function for the X-components. Then,

$$c_1^2 M^{*}_{\beta}(T_{i,j}) \ge \langle s_i^{[X]} s_j^{[X]} \rangle^{*}_{\beta} \ge c_2^2 M^{*}_{\beta}(T_{i,j})$$

with $c_1$ and $c_2$ as in Theorem 4. In particular, if m(β) = 0 and $\mathcal{X}$ is defined by

$$\mathcal{X}(\beta) = \sum_j \langle s_0^{[X]} s_j^{[X]} \rangle_{\beta}$$

evaluated in the unique limiting -state then $c_1^2 E_\beta(|C_0|) \ge \mathcal{X}(\beta) \ge c_2^2 E_\beta(|C_0|)$, where $E_\beta(|C_0|)$ is the expected size of the connected cluster of the origin in the graphical representation.

Proof. The upper bound for the correlation function was derived in [A]; the rest will be proved in the Appendix.

Theorems 4 and 5 provide complete justification for the use of “percolation” as the critical criterion in the Wolff algorithm [W] or the Invaded Cluster version of this algorithm [CMII].

Appendix: Monotonicity Properties of the Wolff Measures

For reasons that are primarily of a technical nature, this appendix will be concerned with generalizations of the types of models already discussed (even though such generalizations are “unphysical” from the perspective of systems with O(2) symmetry). Thus consider a graph G and let $H_G$ denote the Hamiltonian

$$H_G = -\sum_{\langle i,j\rangle} (K_{i,j}\, a_i a_j \tau_i \tau_j + J_{i,j}\, b_i b_j \sigma_i \sigma_j) \tag{A.1}$$
with $K_{i,j}, J_{i,j} \ge 0$. As discussed previously, the single site a priori measures and the range of the $a_i$ and $b_i$ as well as the constraint between them may be regarded as fairly arbitrary: It is enough to assume that they are non-negative, uniformly bounded and that $a_i$ goes down when $b_i$ goes up. Finally, it will be assumed that if $b_i$ achieves its maximum value then the corresponding $a_i$ is zero. Most of these assumptions can be removed but with an unreasonable cost of labor and space. To avoid spurious notational provisos, let us assume that the single site measures are discrete. (Indeed, since we will always start in finite volume, the “general” case can be recovered from the discrete by a limiting procedure.) Thus we let $\rho^{J,K,f}_{\beta,G}(-)$ denote the measure on configurations $b = (b_i \mid i \in S_G)$ defined by the weights

$$R^{J,K,f}_{\beta,G}(b) = Z^I_{a,K}(2\beta)\, Z^I_{b,J}(2\beta) \prod_{\langle i,j\rangle \in B_G} e^{\beta[K_{i,j} a_i a_j + J_{i,j} b_i b_j]} \prod_{i \in S_G} f_i(b_i), \tag{A.2}$$

where $f_i(b_i)$ is the a priori probability of $b_i$, $f \equiv (f_i \mid i \in S_G)$, $K \equiv (K_{i,j} \mid \langle i,j\rangle \in B_G)$ and all other notation has been defined elsewhere.
Proposition A.1. The measures $\rho^{J,K,f}_{\beta,G}(-)$ are (strong) FKG.
Proof. Let b denote a fixed configuration and let u and v denote any distinct pair of sites in G. Let $\Delta_u > 0$ and let $\delta_u$ denote the configuration that is zero except at the site u, where it is equal to $b_u + \Delta_u$; similarly for $\delta_v$ with some $\Delta_v > 0$. It may as well be assumed that $f_u(b_u + \Delta_u)$ and $f_v(b_v + \Delta_v)$ are positive. Thus, the configuration $b \vee \delta_u \vee \delta_v$ has been “raised” at the sites u and v while $b \vee \delta_u$ has been raised only at u, etc. Similarly, if $\Gamma_u$ is the corresponding amount that $a_u$ has to be lowered (determined by the constraint at u, the value of $b_u$ and $\Delta_u$) then let $a \wedge \gamma_u$ denote the configuration of a's that has been lowered at u, etc. (Formally, $\gamma_u$ is $a_u - \Gamma_u$ at the site u and infinite elsewhere.) To prove the desired claim, it is sufficient (and necessary) to show

$$R^{J,K,f}_{\beta,G}(b \vee \delta_u \vee \delta_v)\, R^{J,K,f}_{\beta,G}(b) \ge R^{J,K,f}_{\beta,G}(b \vee \delta_u)\, R^{J,K,f}_{\beta,G}(b \vee \delta_v). \tag{A.3}$$

After cancellation of all manifestly equal terms (assumed non-zero) the purported inequality boils down to

$$[e^{\beta J_{u,v} \Delta_u \Delta_v} Z^I_{b \vee \delta_u \vee \delta_v, J}(2\beta)\, Z^I_{b,J}(2\beta)]\,(e^{\beta K_{u,v} \Gamma_u \Gamma_v} Z^I_{a \wedge \gamma_u \wedge \gamma_v, K}(2\beta)\, Z^I_{a,K}(2\beta)) \ge [Z^I_{b \vee \delta_u, J}(2\beta)\, Z^I_{b \vee \delta_v, J}(2\beta)]\,(Z^I_{a \wedge \gamma_u, K}(2\beta)\, Z^I_{a \wedge \gamma_v, K}(2\beta)). \tag{A.4}$$

It is claimed that the term in the square bracket on the rhs does not exceed the corresponding term on the left and similarly for the terms in the round bracket. Indeed, a moment's reflection will show that these two inequalities are of an identical form. Let us therefore focus on proving

$$[e^{\beta J_{u,v} \Delta_u \Delta_v} Z^I_{b \vee \delta_u \vee \delta_v, J}(2\beta)\, Z^I_{b,J}(2\beta)] \ge [Z^I_{b \vee \delta_u, J}(2\beta)\, Z^I_{b \vee \delta_v, J}(2\beta)], \tag{A.5}$$
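and the same derivation will hold for the a-pairs. It turns out that the derivation is far easier without the annoyance of the $\Delta_u \Delta_v$ cross terms. Let us thus define

$$H^{(0)} = -\sum_{\langle i,j\rangle} J_{i,j} (\delta_{\sigma_i,\sigma_j} - 1)\, b_i b_j, \tag{A.6a}$$

$$H^{(U)} = -\sum_{\langle i,u\rangle} J_{i,u} (\delta_{\sigma_i,\sigma_u} - 1)\, \Delta_u b_i, \tag{A.6b}$$

and similarly for $H^{(V)}$. In these terms $Z^I_{b \vee \delta_u \vee \delta_v, J}(2\beta)$ is given by

$$Z^I_{b \vee \delta_u \vee \delta_v, J}(2\beta) = \mathrm{Tr}[e^{-2\beta H^{(0)}} e^{-2\beta H^{(U)}} e^{-2\beta H^{(V)}} e^{2\beta J_{u,v} \Delta_u \Delta_v (\delta_{\sigma_u,\sigma_v} - 1)}]. \tag{A.7}$$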
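To get rid of the cross terms, it will be shown that

$$e^{\beta \Delta_u \Delta_v J_{u,v}}\, \mathrm{Tr}[e^{-2\beta H^{(0)}} e^{-2\beta H^{(U)}} e^{-2\beta H^{(V)}} e^{2\beta J_{u,v} \Delta_u \Delta_v (\delta_{\sigma_u,\sigma_v} - 1)}] \ge \mathrm{Tr}[e^{-2\beta H^{(0)}} e^{-2\beta H^{(U)}} e^{-2\beta H^{(V)}}]. \tag{A.8a}$$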
Indeed, dividing both sides of the purported inequality (A.8a) by the right-hand side and denoting by $E^I_{H,\beta}(-)$ the expectation with respect to the Ising Hamiltonian H at inverse temperature β, the desired (A.8a) reads

$$e^{\beta \Delta_u \Delta_v J_{u,v}}\, E^I_{H^{(0)} + H^{(U)} + H^{(V)}, 2\beta}(e^{2\beta J_{u,v} \Delta_u \Delta_v (\delta_{\sigma_u,\sigma_v} - 1)}) \ge 1. \tag{A.8b}$$
Expanding the integrand in the usual FK fashion, this reduces to showing that

$$e^{-\beta \Delta_u \Delta_v J_{u,v}} + 2\,\mathrm{sh}(\beta \Delta_u \Delta_v J_{u,v})\, E^I_{H^{(0)} + H^{(U)} + H^{(V)}, 2\beta}(\delta_{\sigma_u,\sigma_v}) \ge 1. \tag{A.8c}$$
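The passage from (A.8b) to (A.8c) can be spelled out in a few lines. In the sketch below, $x = \beta \Delta_u \Delta_v J_{u,v} \ge 0$ is an abbreviation introduced here for convenience (it does not appear in the original), and $\mathbb{E}$ stands for the Ising expectation $E^I_{H^{(0)} + H^{(U)} + H^{(V)}, 2\beta}$. The first line uses only that $\delta_{\sigma_u,\sigma_v}$ takes the values 0 and 1; the last line uses that in a ferromagnetic Ising system $\mathbb{E}(\delta_{\sigma_u,\sigma_v}) = \tfrac12 (1 + \mathbb{E}(\sigma_u \sigma_v)) \ge \tfrac12$.

```latex
\begin{align*}
e^{2x(\delta_{\sigma_u,\sigma_v} - 1)}
  &= e^{-2x} + (1 - e^{-2x})\,\delta_{\sigma_u,\sigma_v}, \\
e^{x}\,\mathbb{E}\bigl(e^{2x(\delta_{\sigma_u,\sigma_v} - 1)}\bigr)
  &= e^{-x} + (e^{x} - e^{-x})\,\mathbb{E}(\delta_{\sigma_u,\sigma_v})
   = e^{-x} + 2\,\mathrm{sh}(x)\,\mathbb{E}(\delta_{\sigma_u,\sigma_v}), \\
e^{-x} + 2\,\mathrm{sh}(x)\,\mathbb{E}(\delta_{\sigma_u,\sigma_v})
  &\ge e^{-x} + \mathrm{sh}(x) = \mathrm{ch}(x) \ge 1 .
\end{align*}
```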
Here is one of the few places where the fact that the underlying model has an Ising structure is used: $E^I_{H,\beta}(\delta_{\sigma_i,\sigma_u}) \ge 1/2$ so the left-hand side of (A.8c) is at least as big as $\mathrm{ch}(\beta \Delta_u \Delta_v J_{u,v})$. For the remainder of the proof, it might just as well be assumed that the underlying model is the q-state Potts model. The remainder of this proof reduces to showing

$$E^I_{H^{(0)} + H^{(U)}, 2\beta}(e^{-2\beta H^{(V)}}) \ge E^I_{H^{(0)}, 2\beta}(e^{-2\beta H^{(V)}}). \tag{A.9}$$
This is very similar to the sorts of inequalities that were established in [C] so here the derivation will be succinct. Let $\epsilon_{i,v} = 1 - e^{-2\beta J_{i,v} b_i \Delta_v}$ and let $N_v$ denote the collection of sets in $S_G$ each of which contains v and some subset of the sites in G that are connected to v. Expanding $e^{-2\beta H^{(V)}}$ in the usual FK fashion, it is seen that

$$e^{-2\beta H^{(V)}} = \sum_{F \in N_v} r_F\, \delta_{\sigma_F} \tag{A.10}$$

with $r_F = \prod_{i \in F} \epsilon_{i,v} \prod_{j \notin F} (1 - \epsilon_{j,v})$, and where $\delta_{\sigma_F}$ is one if all the spins in F agree and zero otherwise. However, using an FK expansion of the q-state Potts system with Hamiltonian H, it is not hard to show

$$E^I_{H,\beta}(\delta_{\sigma_F}) = E^{FK\,(q=2)}_{H,\beta}\bigl((1/q)^{C_F - 1}\bigr), \tag{A.10}$$
where $C_F$ is the number of connected components of the set F. This is the expectation of an FKG increasing function and thus the desired inequality follows – term by term – from the fact that the random cluster model that comes from the “bigger” Hamiltonian (i.e. $H^{(0)} + H^{(U)}$) is FKG dominant.

Corollary I. Consider two systems on the same graph G with parameters J, J′ and single site measures determined by the collections f and f′ respectively. Suppose that $J \succ J'$, meaning that for each $\langle i,j\rangle \in B_G$, $J_{i,j} \ge J'_{i,j}$, and further suppose that $f \succ f'$ in the sense that for each i, $f_i(b_i)/f'_i(b_i)$ is an increasing function of $b_i$. Then

$$\rho^{J,K,f}_{\beta,G}(-) \underset{\mathrm{FKG}}{\ge} \rho^{J',K,f'}_{\beta,G}(-).$$
Proof. This is an immediate consequence of the FKG properties of these measures and the previous derivation. First, if $f' \prec f$, then

$$\prod_{i \in S_G} f_i(b_i) = \Bigl[\frac{\prod_{i \in S_G} f_i(b_i)}{\prod_{i \in S_G} f'_i(b_i)}\Bigr] \prod_{i \in S_G} f'_i(b_i) \tag{A.11}$$

so the f-weights are of the form [increasing function] × f′-weights. To establish the desired result for $J \succ J'$ it is sufficient to consider one bond at a time. Thus let $\langle u,v\rangle \in B_G$ and suppose that $J_{u,v} = J'_{u,v} + L_{u,v}$ (with $L_{u,v} > 0$) and all other J's equal. Then

$$R^{J,K,f}_{\beta,G}(b) / R^{J',K,f}_{\beta,G}(b) = e^{\beta L_{u,v} b_u b_v}\, E^I_{H_b^I, 2\beta}[e^{2\beta L_{u,v} b_u b_v (\delta_{\sigma_u,\sigma_v} - 1)}], \tag{A.12a}$$
where the Ising Hamiltonian $H_b^I$ was defined in Eq. (5.a) – and the J dependence has been suppressed. After a few manipulations along the lines of those in the previous proposition, Eq. (A.12a) reduces to

$$R^{J,K,f}_{\beta,G}(b) / R^{J',K,f}_{\beta,G}(b) = \mathrm{ch}(\beta L_{u,v} b_u b_v) + \mathrm{sh}(\beta L_{u,v} b_u b_v)\, E^{FK\,(q=2)}_{H_b^I, 2\beta}(\mathcal{X}_{T_{u,v}}), \tag{A.12b}$$
where $\mathcal{X}_{T_{u,v}}$ is the indicator of the event that u is connected to v. The hyperbolic sines and cosines are manifestly (non-negative) increasing functions of b, while the random cluster term is the expectation of a positive event and is therefore an increasing function of all couplings in the Hamiltonian – including the b's.

Let us now turn to a discussion of boundary conditions. Let G denote a graph and let $L \subset S_G$. The starting point will be the consideration of conditional measures for $\nu^{W\,J,K,f}_{\beta,G}(-)$, the measures corresponding to the weights in Eq. (6) cast in the more general framework – subject to specifications on L and the consequence of these specifications on the b marginals. A specification ∗ will be called a ˜ -specification if (i) the values $(b_i \mid i \in L)$ are specified: $b_i = b_i^{*}$; $i \in L$ and (ii) L is divided into disjoint components $\ell_1^{*}, \ell_2^{*}, \ldots, \ell_k^{*}$ such that the counting rule in the FK expansion deems all the sites in and connected to each $\ell_n^{*}$ to be part of the same cluster.

Remark. Back in the spin-system, one interpretation of a ˜ -specification is obvious: having determined the $b_i$ on L, the signs of the X-components of the spins – the $\sigma_i$'s – are locked together within each component and they take on both values with equal probability. On the other hand, the same graphical weights emerge if one (and only one) of the components is deemed to represent spins pointing in the positive X-direction. The reader is cautioned that at this stage, the signs of the Y components of the boundary spins still have all their a priori degrees of freedom.

There is a natural partial order on the set of all possible ˜ -specifications: $* \succ *'$ if (1) $L \supset L'$ and each $b_i$ on $L \setminus L'$ is set to the maximum value, (2) each $b_i^{*} \ge b_i^{*'}$, $i \in L \cap L'$ and (3) the components of ∗, $\ell_1^{*}, \ell_2^{*}, \ldots, \ell_k^{*}$, “contain” the ∗′-components $\ell_1^{*'}, \ell_2^{*'}, \ldots, \ell_{k'}^{*'}$ in the sense that if $\ell_{j'}^{*'} \cap \ell_j^{*} \ne \emptyset$ then $\ell_{j'}^{*'} \subset \ell_j^{*}$. The following is easily seen:
Corollary II. If ∗ is a ˜ -specification and $\rho^{*J,K,f}_{\beta,G}(-)$ is the associated measure on the remaining b's then $\rho^{*J,K,f}_{\beta,G}(-)$ is (strong) FKG. Furthermore if $* \succ *'$ in the sense described above, $J \succ J'$ and $f \succ f'$ then

$$\rho^{*J,K,f}_{\beta,G}(-) \underset{\mathrm{FKG}}{\ge} \rho^{*'J',K,f'}_{\beta,G}(-).$$
Proof. The above is clear given the following mechanism to create a ˜ -specification: to fix the values of $b_i$ on L, concentrate the a priori measures. To lock the components, introduce artificial J-type couplings between all pairs of sites in a given component and send these couplings to infinity; the desired measure is recovered in the limit. If $* \succ *'$ this procedure involves higher J's and higher b's.

Proposition A.2. Let $\nu^{W\,*J,K,f}_{\beta,G}(-)$ and $\nu^{W\,*'J',K,f'}_{\beta,G}(-)$ denote two Wolff measures with all primed quantities below unprimed quantities in the sense described. Let $M^{*J,K,f}_{\beta,G}(-)$ and $M^{*'J',K,f'}_{\beta,G}(-)$ denote the corresponding bond measures. Then

$$M^{*J,K,f}_{\beta,G}(-) \underset{\mathrm{FKG}}{\ge} M^{*'J',K,f'}_{\beta,G}(-).$$
Proof. Let A denote an increasing bond event. Let us write, as in Eq. (8),

$$M^{*J,K,f}_{\beta,G}(A) = \sum_b \rho^{*J,K,f}_{\beta,G}(b)\, \mu^{FK*}_{J,b}(A), \tag{A.13}$$

and similarly for $M^{*'J',K,f'}_{\beta,G}(A)$. The desired result follows immediately from the FKG properties of the usual random cluster measures: both $\mu^{FK*}_{J,b}(A)$ and $\mu^{FK*'}_{J',b}(A)$ are increasing functions of b and furthermore, if $* \succ *'$ and $J \succ J'$ then $\mu^{FK*}_{J,b}(A) \ge \mu^{FK*'}_{J',b}(A)$.
Thus far, the Y degrees of freedom have been left completely unspecified. Now the same sorts of specifications will be considered for these objects and this defines a -specification: In addition to a ˜ -specification, L is divided into disjoint components 1 , . . . , m on which the τ-variables act in unison. A recapitulation of the previous arguments yields:

Proposition A.3. Let ∗ denote a specification and let $\rho^{*J,K,f}_{\beta,G}(-)$ denote the corresponding measure. Then $\rho^{*J,K,f}_{\beta,G}(-)$ is FKG. Further, if $* \succ *'$, meaning the same as above regarding the J's, the f's and the ℓ-components while $K' \succ K$ and the ′1 , . . . , ′m contain the 1 , . . . , m , then

$$\rho^{*J,K,f}_{\beta,G}(-) \underset{\mathrm{FKG}}{\ge} \rho^{*'J',K',f'}_{\beta,G}(-),$$

and accordingly

$$M^{*J,K,f}_{\beta,G}(-) \underset{\mathrm{FKG}}{\ge} M^{*'J',K',f'}_{\beta,G}(-).$$
In particular, the FKG maximizing boundary condition (on L) in the -class is the $b_i$ set to the maximum value, $\sigma_i \equiv 1$ and the 1 , . . . , m being the individual sites of L. The latter is, of course, automatic if $b_i$ maximized ⇒ $a_i = 0$.

Proof. Follows the lines of the previous arguments along with the observation that any increasing function of a is a decreasing function of b.

Superpositions of -specifications do not constitute a -class boundary condition nor, in general, are they FKG measures. This is the usual situation in ferromagnetic systems and is of no serious consequence since we have knowledge of the maximizing measure in the -class. In any case, let us define the -class as that which consists of superpositions from the -class. The following is pivotal:

Lemma A.4. Let $L \subset S_G$ and let ∗ denote a -specification on L. Let $K \supset L$ and consider $\rho^{*J,K,f}_{\beta,G}(-)|_{S_G \setminus K}$, the restriction of $\rho^{*J,K,f}_{\beta,G}(-)$ to the remaining sites. Then this restricted measure is of the -class.

Proof. It is sufficient to discuss the case where ∗ is itself a pure -specification. Consider the full Wolff measures $w^{*J,K,f}_{\beta,G}(-)$ on configurations (ω, η, b), where ω and b are as have been described and η denotes configurations of FK bonds in the random cluster expansion of the τ-system. Thus, e.g. the $\nu^{J,K,f}_{\beta,G}(-)$ measures are obtained by integrating out the η-bonds. Now, to study the restricted measure, let us condition on an (ω, η, b) configuration on K and sum over all η-configurations (and, if desired, ω-configurations) pertaining to the bonds of $S_G \setminus K$. Having done so, a sum must be performed over all the external configurations with the appropriate weights assigned by $w^{*J,K,f}_{\beta,G}(-)$. But, since ∗ is a -specification, it is clear that each (ω, η, b) configuration on K provides a -specification on $S_G \setminus K$: Indeed, the b-values are fixed, the components $\ell_1, \ldots, \ell_k$ are just the ω-components while the η-components constitute the 1 , . . . , m .

It is now straightforward to establish the various results claimed in Theorems 4 and 5. Indeed, everything except the statements concerning uniqueness follows immediately from the existing machinery. Here, to simplify matters notationally, let us again assume that β, J, K and the graph G are fixed and omit any further explicit reference. All of Theorem 5 amounts to the stated bound of the correlation function in terms of the connectivity function. Recalling that in a -state, the event $T_{i,j}$ includes connections via the boundary component, these bounds are easily proved:

Proof of Theorem 5. If ∗ denotes a state, it is claimed that

$$\langle s_i^{[X]} s_j^{[X]} \rangle^{*} = E^{*}_{\rho}[b_i b_j\, \mu^{*}_b(T_{i,j})], \tag{A.14}$$
where $E^{*}_{\rho}[-]$ denotes expectation with respect to the $\rho^{*}(-)$ measure on the b-configurations. Indeed, fixing b and ω, the Ising spins are equal if i is connected to j – either directly or via one of the boundary components – and are uncorrelated with at least one of them having equal probability of ±1 otherwise. Summing over all ω with b fixed and then summing over b yields the identity displayed in Eq. (A.14). But obviously, $b_i$ and $b_j$ cannot exceed their maximum values and this provides the upper bound with $c_1$ equal to any uniform bound on these values. On the other hand, $\mu^{*}_b(T_{i,j})$, $b_i$ and $b_j$ are all increasing functions of b and hence, the FKG inequality provides the bound

$$\langle s_i^{[X]} s_j^{[X]} \rangle^{*} \ge E^{*}_{\rho}[\mu^{*}_b(T_{i,j})]\, E^{*}_{\rho}[b_i]\, E^{*}_{\rho}[b_j]. \tag{A.15}$$
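The fixed-b ingredient of Eq. (A.14), namely that the Ising correlation $\langle \sigma_i \sigma_j \rangle$ equals the q = 2 random cluster probability of the connection event $T_{i,j}$, can be checked by enumeration on a small graph. The following is a minimal sketch with free boundary conditions only; the four-site cycle, the numerical values and the function names are hypothetical. The dictionary `coupling` holds the effective Ising couplings $\beta J_{i,j} b_i b_j$, so the bond probabilities are $1 - e^{-2\,\mathrm{coupling}}$ as in Eq. (7).

```python
import itertools
import math


def find(parent, x):
    """Root of x in a union-find forest, with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def ising_correlation(n, bonds, coupling, i, j):
    """<sigma_i sigma_j> for the weights exp(sum_e coupling[e] * sigma_x * sigma_y)."""
    num = den = 0.0
    for s in itertools.product((-1, 1), repeat=n):
        w = math.exp(sum(coupling[e] * s[e[0]] * s[e[1]] for e in bonds))
        num += s[i] * s[j] * w
        den += w
    return num / den


def fk_connection_probability(n, bonds, coupling, i, j, q=2):
    """Random cluster probability that i and j lie in the same cluster."""
    num = den = 0.0
    for mask in itertools.product((0, 1), repeat=len(bonds)):
        parent = list(range(n))
        w = 1.0
        for e, m in zip(bonds, mask):
            p = 1.0 - math.exp(-2.0 * coupling[e])
            if m:
                w *= p
                parent[find(parent, e[0])] = find(parent, e[1])
            else:
                w *= 1.0 - p
        w *= float(q) ** len({find(parent, x) for x in range(n)})
        den += w
        if find(parent, i) == find(parent, j):
            num += w
    return num / den


# Hypothetical example: a 4-cycle with fixed magnitudes b and unit J.
n = 4
bonds = [(0, 1), (1, 2), (2, 3), (0, 3)]
beta = 0.6
b = [0.9, 0.4, 0.7, 1.0]
coupling = {e: beta * b[e[0]] * b[e[1]] for e in bonds}

corr = ising_correlation(n, bonds, coupling, 0, 2)
conn = fk_connection_probability(n, bonds, coupling, 0, 2)
print(corr, conn)
```

Multiplying either number by $b_0 b_2$ gives the quantity that is averaged over b in Eq. (A.14).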
The quantities $E^{*}_{\rho}[b_i]$ and $E^{*}_{\rho}[b_j]$ may be estimated by considering the worst case boundary conditions on the neighborhoods of i and j, which yields the uniformly positive constant $c_2$. For the d-dimensional XY-model, we have $c_1 = 1$ and $c_2 = (2/\pi)\, e^{-2d\beta}$.

Proof of Theorem 4 (A). First observe that the lower bound follows because the magnetization can be estimated from below by the average of the $s^{[X]}$'s in any state, and by using the $1^+$-state, this is obtained. In fact, for the XY model, and several other of the models under consideration, both of these bounds follow because it can be proved, via correlation inequalities, that the $1^+$ state is exactly the state that produces the magnetization. For the general case, consider the addition of the usual magnetic term:

$$\sum_i h s_i^{[X]} \equiv \sum_i [2 h b_i (\delta_{\sigma_i,+} - 1) + h b_i] \tag{A.16}$$
to the Hamiltonian. The effect of this additional term may be incorporated into the present analysis by the addition of a single “ghost” spin connected to all other spins with coupling h. (Here the ghost spin plays more the role of a boundary site than a full blown XY-degree of freedom.) Now for a.e. h, the (thermodynamic) magnetization can be defined by evaluating the actual magnetization (the average of the $s^{[X]}$'s) in any convenient state. Thus, using the limiting state constructed from $1^+$ boundary conditions, it is clear that for a.e. positive h, the magnetization is bounded above by the (limiting) average fraction of sites connected to the ghost site or the boundary. Let $\Lambda_L$ denote the box of scale L and define

$$\pi_L(h, \beta) = \frac{1}{|\Lambda_L|} \sum_{i \in \Lambda_L} M^{1^+}_{\beta,h,\Lambda_L}(T_{i,B}), \tag{A.17}$$

where $T_{i,B}$ is the event that the site i is connected to the boundary or the ghost site and the sum includes the contribution from the boundary sites themselves. The desired result follows from two elementary facts: First, by continuity in finite volume,

$$\lim_{h \to 0} \pi_L(h, \beta) = \pi_L(0, \beta). \tag{A.18}$$
Second, by a sequence of fairly standard manipulations,

$$\Pi_\infty(\beta) \equiv \lim_{L \to \infty} \Pi_{\Lambda_L}(\beta) = \lim_{L \to \infty} \pi_L(0, \beta). \tag{A.19}$$
Now, for h > 0 suppose we were to evaluate m(h, β) starting on $\Lambda_{NL}$ using $1^+$ boundary conditions and letting $N \to \infty$. Since, for finite N, this is a certified finite volume state, we increase the value by conditioning on the event that the grid that divides $\Lambda_{NL}$ into small copies of $\Lambda_L$ is fully occupied. Thus, at each stage it is learned that

$$m_{\Lambda_{NL}}(h, \beta) \equiv \frac{1}{|\Lambda_{NL}|} \sum_{i \in \Lambda_{NL}} \langle s_i^{[X]} \rangle^{1^+}_{\beta,h,\Lambda_{NL}} \le \pi_L(h, \beta). \tag{A.20}$$
Taking h ↓ 0 (along a sequence of points of continuity) the desired result follows from Eqs. (A.18) and (A.19).

Proof of Theorem 4 (B). Let G denote a graph, $I \subset S_G$ and $K = S_G \setminus I$. Let $\gamma = \{\langle i,k\rangle \in B_G \mid i \in I, k \in K\}$ denote the connecting bonds and let $\Gamma(\gamma)$ denote the contour event that every ω-bond in γ is vacant. In what follows, it is assumed that if there is any specification on G, it is of the -type and involves only the sites of K. It is claimed that if $\Gamma(\gamma)$ occurs then the measure on the $(b_i \mid i \in I)$ lies below, in the sense of FKG, the “free measure” on I that would be obtained if all the $J_{i,k}$ on γ were zero. Indeed, for any fixed b on K and η-configuration the weights for the configurations $(b_i \mid i \in I)$ are given by $Z_a^{I,\eta}(2\beta) \prod_{\langle i,k\rangle \in \gamma} e^{\beta J_{i,k}(a_i a_k - b_i b_k)}\, Z_b^{I,f}(2\beta)$, where $Z^{I,f}$ denotes the free boundary partition function and $Z^{I,\eta}$ denotes the partition function with ( -type) boundary specification provided by η. On the other hand, the free weights are given simply by $Z_a^{I,f}(2\beta)\, Z_b^{I,f}(2\beta)$. Thus it is clear that irrespective of the information on the outside, the conditional weights are a decreasing function times the free weights.

Now, supposing that $\Pi_\infty(\beta) = 0$, it is easy to establish uniqueness of the limiting ρ-measures among -states: Let $\Lambda \subset Z^d$ be a finite connected set. Let $\Xi \supset \Lambda$ with Ξ so large that the probability of an ω-connection between Λ and ∂Ξ in the $1^+$ state on Ξ is negligible. Under these circumstances, there are contours separating Λ from ∂Ξ; let γ denote such a contour and let $\tilde{\Gamma}(\gamma)$ denote the event that γ is the outermost such separating contour. These contour events form a disjoint partition so, up to the negligible probability of a connection between Λ and ∂Ξ, the restriction of the maximal measure in Ξ to Λ is below a superposition of free measures on various separating contours.
Now consider the lowest boundary condition on Ξ: setting all the boundary $a_i$ to one and locking their spin directions. By a ↔ b symmetry, the same outermost contours (in the η expansion) appear with the same probabilities and we find, again up to negligible terms, that this worst measure in Ξ restricted to Λ lies above the previously discussed superposition. Evidently the two restricted measures coincide in the $\Xi \nearrow \mathbb{Z}^d$ limit and hence all the limiting -measures coincide at least as far as the distributions of b’s are concerned. However, the same argument implies uniqueness for the various other Wolff measures in the -class and, given the fact that all bond clusters are finite, uniqueness among all Gibbs measures of the -class follows easily.

Acknowledgement. I would like to thank Jon Machta who provided help and encouragement in all stages of this work and without whom this project would not have existed.
Communicated by J. L. Lebowitz
Commun. Math. Phys. 197, 641 – 665 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Soliton Immersions
R. K. Dodd⋆
Department of Applied Mathematics, University of New South Wales, Sydney, Australia
Received: 26 August 1997 / Accepted: 3 March 1998
⋆ Permanent address: Department of Mathematics & Computer Science, San Jose State University, San Jose, California, USA. E-mail: [email protected]

Abstract: We develop the theory of soliton immersions of arbitrary dimension since they, rather than soliton surface immersions, perhaps more correctly encompass the integrable nature of soliton equations.

Summary
This paper investigates soliton systems considered as immersions in infinite dimensional spaces defined by the loop group and algebra with which they are associated. The usual soliton hierarchy of equations is considered to be the natural object which is immersed in the infinite dimensional space. Canonical forms of immersions are connected with the conjugacy classes of the underlying finite dimensional simple Lie algebra and the real forms of the loop algebras. The relationship with the usual soliton surface theory is discussed through the introduction of evaluations which are homomorphisms onto the simple finite dimensional group or algebra. A final section gives some of the simplest possible examples of the theory developed in the paper.

1. Immersions in a Flat Space
The moving frame approach for immersions in a real flat space $E^m$ equipped with a metric $\langle\cdot,\cdot\rangle$ can be quickly developed. Let $\{\hat e_a : a = 1,\dots,m\}$ denote an orthonormal frame, $\langle \hat e_a,\hat e_b\rangle = \epsilon_a\delta_{ab}$, $\epsilon_a = \pm 1$, which is aligned at a point P of the n dimensional immersed manifold M so that the frame vectors $\hat e_a$, a = n + 1, ..., m are normal to it. If the signature of the space $E^m$ is p − q ($\mathrm{sig}(E^m) = p - q$), then p + q = m and there exists a permutation of the basis such that the metric has the form $\mathrm{diag}(1_p, -1_q)$, where $1_p$ denotes a list of p 1’s. It is convenient to let Roman indices a, b, c run from 1 to m, Roman indices i, j, k run from 1 to n and Greek indices run from n + 1 to m. We also adopt the usual Einstein summation convention for these sets of indices (a repeated index in a superior and inferior position is a summed index). There exists a dual basis $\{\sigma^i : i = 1,\dots,n\}$ to the frame vectors $\{\hat e_i : i = 1,\dots,n\}$ such that
$$dP = \sigma^i\,\hat e_i. \tag{1}$$
This equation expresses the fact that a displacement is to remain in the immersed subspace. We also have
$$d\hat e_a = \hat e_b\,\omega^b{}_a. \tag{2}$$
The one forms introduced in this equation are all skew since $d\langle \hat e_a,\hat e_b\rangle = 0$ implies that $\omega_{ab} + \omega_{ba} = 0$. They can be expressed in terms of the dual frame vectors $\{\sigma^i\}$,
$$\omega^a{}_b = \gamma^a{}_{bi}\,\sigma^i. \tag{3}$$
The coefficients satisfy $\gamma_{abi} + \gamma_{bai} = 0$. The $\sigma^i$ are a basis for one forms on M and therefore
$$d\sigma^i = \tfrac{1}{2}\,c^i{}_{jk}\,\sigma^j\wedge\sigma^k,\qquad c^i{}_{jk} + c^i{}_{kj} = 0. \tag{4}$$
A calculation shows that
$$\gamma_{ijk} = \tfrac{1}{2}\,(c_{kij} - c_{jki} - c_{ijk}). \tag{5}$$
In terms of the group theoretical approach which is developed in the next section, we see that the rotation coefficients are determined in terms of the structure constants for the Lie algebra of the rotation group associated with the moving frame restricted to M. Introduce the matrix notation
$$\Omega = (\omega^i{}_j),\qquad \Psi = (\omega^\alpha{}_\beta),\qquad S = (\omega^i{}_\alpha). \tag{6}$$
The integrability conditions on Eq. (1) give the Gauss formulas
$$d\sigma^i + \omega^i{}_j\wedge\sigma^j = 0,\qquad \omega^\alpha{}_j\wedge\sigma^j = 0, \tag{7}$$
or in terms of matrix notation with $\sigma = (\sigma^1,\dots,\sigma^n)^t$,
$$d\sigma + \Omega\wedge\sigma = 0,\qquad S^t\wedge\sigma = 0. \tag{8}$$
The integrability conditions of the Gauss-Weingarten system (2) give the Gauss–Codazzi–Ricci equations
$$d\omega^c{}_a + \omega^c{}_b\wedge\omega^b{}_a = 0. \tag{9}$$
The curvature two form of the immersed manifold is defined by $\Theta := d\Omega + \Omega\wedge\Omega$, $\Theta = (\theta^i{}_j)$. The Gauss–Codazzi–Ricci equations can be written in matrix form as
$$\begin{aligned}
\Theta - S\wedge S^t &= 0 &&\text{Gauss eqn.,}\\
dS + \Omega\wedge S + S\wedge\Psi &= 0 &&\text{Codazzi eqn.,}\\
d\Psi - S^t\wedge S + \Psi\wedge\Psi &= 0 &&\text{Ricci eqn.}
\end{aligned} \tag{10}$$
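Notice that from (10) $d^2 = 0$ so that the Bianchi identities are automatically satisfied as are all the remaining integrability conditions. Introduce the following components with respect to the basis of one forms $\{\sigma^i\}$ on M:
$$\theta^i{}_j = -\tfrac{1}{2}R^i{}_{jkm}\,\sigma^k\wedge\sigma^m,\qquad \omega^i{}_j = \gamma^i{}_{jk}\,\sigma^k,\qquad \omega^\alpha{}_\beta = \mu^\alpha{}_{\beta k}\,\sigma^k,\qquad \omega^i{}_\alpha = b^i{}_{\alpha k}\,\sigma^k. \tag{11}$$
The condition $S^t\wedge\sigma = 0$ requires that the b’s are symmetric and the skew symmetry of the forms $\omega_{ij}$ and $\omega_{\alpha\beta}$ implies the skew symmetry of their coefficients,
$$b_{i\alpha j} = b_{j\alpha i},\qquad \mu_{\alpha\beta i} + \mu_{\beta\alpha i} = 0,\qquad \gamma_{ijk} + \gamma_{jik} = 0. \tag{12}$$
The equations can then be written as (cf. [8])
$$\begin{aligned}
&R_{ijkm} = \sum_\alpha \epsilon_\alpha\,(b_{i\alpha m}b_{k\alpha j} - b_{i\alpha k}b_{m\alpha j}) &&\text{Gauss eqns.,}\\
&(\nabla_k b_{i\alpha j} - \nabla_j b_{i\alpha k}) + \gamma_i{}^m{}_k\,b_{m\alpha j} - \gamma_i{}^m{}_j\,b_{m\alpha k} + \sum_m b_{i\alpha m}(\gamma^m{}_{jk} - \gamma^m{}_{kj})\\
&\qquad + \sum_\beta \epsilon_\beta\,(b_{i\beta k}\mu_{\beta\alpha j} - b_{i\beta j}\mu_{\beta\alpha k}) = 0 &&\text{Codazzi eqns.,}\\
&\nabla_k\mu_{\alpha\beta i} - \nabla_i\mu_{\alpha\beta k} + \sum_\delta \epsilon_\delta\,(\mu_{\delta\alpha i}\mu_{\delta\beta k} - \mu_{\delta\alpha k}\mu_{\delta\beta i})\\
&\qquad + \mu_{\alpha\beta m}(\gamma^m{}_{ki} - \gamma^m{}_{ik}) + (b^m{}_{\alpha i}b_{m\beta k} - b^m{}_{\alpha k}b_{m\beta i}) = 0 &&\text{Ricci eqns.}
\end{aligned} \tag{13}$$
The integrability of the equations for the immersed manifold is contained in Poincaré’s Lemma $d^2 = 0$. If we write
$$d = \sum_i \sigma^i\,\nabla_i, \tag{14}$$
where the $\nabla_i$ are intrinsic derivatives then Poincaré’s Lemma requires that
$$\nabla_i\nabla_j - \nabla_j\nabla_i + \sum_k (\gamma^j{}_{ik} - \gamma^i{}_{jk})\nabla_k = 0,\qquad i < j = 1,\dots,m. \tag{15}$$
For the existence of a two dimensional surface immersed in $\mathbb{R}^3$ (n = 2) we require a two dimensional involutive distribution. The Gauss formulas give in this case $\gamma^3{}_{13} = 0 = \gamma^3{}_{23}$ and $\gamma^1{}_{23} = \gamma^2{}_{13}$ and from (15) we have the involutive distribution
$$\nabla_1\nabla_2 - \nabla_2\nabla_1 + \gamma^1{}_{12}\nabla_1 + \gamma^2{}_{12}\nabla_2 = 0. \tag{16}$$
Different integrable equations are associated with distinct coordinate systems. The nonlinear Schrödinger equation for example is associated with a geodesic coordinate system. For this case one of the frame vectors, ($\hat e_1 \cong \hat t$), is lined up with the tangent vector to a geodesic at P. Then locally at $M_P$ there exist polar coordinates and the frame becomes a Frenet-Serret frame in which the normal to the geodesic is the normal to the surface $\hat n = \hat e_3$. The tangent and the binormal for this case move by parallel translation and the frame is said to be adapted to the distribution on the surface.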
Another example is provided by surfaces of constant negative curvature, for which 1 2 γ 23 − (γ 123 )2 ≡ − K = γ23
1 , ρ2
ρ ∈ R.
(17)
In this case the frame vectors in the surface can be chosen so that $\hat e_1 = \hat t$ and $\hat e_2 = \hat n$ which means that asymptotic coordinates can be introduced. The corresponding coordinate curves can be parametrised by arc length, and so define a Tschebyscheff net, and the system of equations defining the surface is equivalent to the sine-Gordon equation,
$$u_{xy} = \frac{1}{\rho^2}\,\sin u. \tag{18}$$
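As a quick numerical sanity check of Eq. (18), the following sketch evaluates the kink profile u = 4 arctan exp(a x + y/(a ρ²)) and compares a finite-difference estimate of u_xy with sin(u)/ρ². The kink is a standard explicit solution chosen here for illustration and is not taken from the text; the function names, the parameter a and the step size h are likewise assumptions.

```python
import math

def kink(x, y, a=1.3, rho=2.0):
    """Assumed test solution u = 4*atan(exp(a*x + y/(a*rho**2)))."""
    return 4.0 * math.atan(math.exp(a * x + y / (a * rho ** 2)))

def mixed_derivative(u, x, y, h=1e-3):
    """Central finite-difference estimate of the mixed derivative u_xy."""
    return (u(x + h, y + h) - u(x + h, y - h)
            - u(x - h, y + h) + u(x - h, y - h)) / (4.0 * h * h)

def sine_gordon_residual(x, y, a=1.3, rho=2.0):
    """Residual u_xy - sin(u)/rho**2 of Eq. (18) at the point (x, y)."""
    u = lambda s, t: kink(s, t, a=a, rho=rho)
    return mixed_derivative(u, x, y) - math.sin(u(x, y)) / rho ** 2

if __name__ == "__main__":
    for point in [(0.0, 0.0), (0.4, -0.7), (-1.0, 2.0)]:
        print(point, sine_gordon_residual(*point))
```

The residuals should vanish up to the discretisation error of the difference quotient; any other choice of a ≠ 0 gives another solution of the same equation.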
2. Soliton Immersions
It should be clear from these two examples that unravelling a particular integrable equation involves, in part, choosing a coordinate system, which requires considerable ingenuity. The whole approach for immersions of dim M > 2 becomes considerably more difficult. An alternative strategy is to exploit the soliton equations directly and derive an immersion from the soliton system. This approach was developed by Sym for soliton surfaces in a series of papers [1, 2, 3] and references therein, following work by Pohlmeyer [4] and Lund [5] (cf. also [6]). Let
$$\Phi_{,i} = A_i\,\Phi,\qquad i = 1,\dots,n \tag{19}$$
denote a system of equations where $A_i = A_i(x;k)$ are rational functions of $k \in \mathbb{C}$ and differentiable functions of the real variables $(x) = (x^1,\dots,x^n)$ ($\Phi_{,i} = \partial_i\Phi := \partial\Phi/\partial x^i$). The functions $\Phi(x;k)$ take values in a semisimple matrix group G over F (F = C or R) and $A_i \in \mathfrak{g}$, the Lie algebra of G. Systems of this type were first studied by Zakharov and Shabat [7]. The integrability conditions on this overdetermined system of Eqs. (19) require that
$$A_{i,j} - A_{j,i} + [A_i, A_j] = 0,\qquad i < j = 1,\dots,n. \tag{20}$$
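The zero curvature condition (20) can be checked numerically on a concrete example. The sketch below uses an AKNS-type pair of 2×2 matrices for the sine-Gordon equation u_xy = sin u (the case ρ = 1) together with a kink solution; this particular pair, the kink, and all function names are assumptions made for illustration and are not taken from the text. Derivatives of A_1 and A_2 are approximated by central differences.

```python
import math

# 2x2 complex matrices are stored as nested lists [[a, b], [c, d]].

def add(p, q):
    return [[p[i][j] + q[i][j] for j in range(2)] for i in range(2)]

def sub(p, q):
    return [[p[i][j] - q[i][j] for j in range(2)] for i in range(2)]

def mul(p, q):
    return [[sum(p[i][m] * q[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, p):
    return [[c * p[i][j] for j in range(2)] for i in range(2)]

def u(x, y, a=1.3):
    """Kink solution of u_xy = sin(u); `a` is a free parameter."""
    return 4.0 * math.atan(math.exp(a * x + y / a))

def u_x(x, y, a=1.3):
    """Exact x-derivative of the kink: 2*a*sech(a*x + y/a)."""
    return 2.0 * a / math.cosh(a * x + y / a)

def A1(x, y, k):
    """Assumed AKNS-type matrix for the x-direction (x^1 = x)."""
    q = -0.5 * u_x(x, y)
    return [[-1j * k, q], [-q, 1j * k]]

def A2(x, y, k):
    """Assumed AKNS-type matrix for the y-direction (x^2 = y)."""
    c = 1j / (4.0 * k)
    v = u(x, y)
    return [[c * math.cos(v), c * math.sin(v)],
            [c * math.sin(v), -c * math.cos(v)]]

def zero_curvature_residual(x, y, k, h=1e-5):
    """Largest entry of A1_{,2} - A2_{,1} + [A1, A2], cf. Eq. (20)."""
    dA1_dy = scale(1.0 / (2 * h), sub(A1(x, y + h, k), A1(x, y - h, k)))
    dA2_dx = scale(1.0 / (2 * h), sub(A2(x + h, y, k), A2(x - h, y, k)))
    a1, a2 = A1(x, y, k), A2(x, y, k)
    bracket = sub(mul(a1, a2), mul(a2, a1))
    residual = add(sub(dA1_dy, dA2_dx), bracket)
    return max(abs(entry) for row in residual for entry in row)

if __name__ == "__main__":
    print(zero_curvature_residual(0.3, -0.5, 0.7 + 0.2j))
```

The residual is small for every nonzero value of k tried, which illustrates the sense in which the nonlinear equation itself does not depend on the parameter k.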
This is a system of nonlinear integrable equations which is associated with the linear problem (19). Generally in this paper we assume that the integrable equations are independent of k. The Killing form on g, $\langle X, Y\rangle = \mathrm{Tr}(\mathrm{ad}X\,\mathrm{ad}Y)$, X, Y ∈ g, is nondegenerate on g and so an orthonormal basis can be introduced by the Gram-Schmidt process. If g is a complex semisimple Lie algebra then it is natural to introduce a positive definite hermitian form on g and use this, rather than the form $\langle\cdot,\cdot\rangle$, to define the orthonormal basis. Let γ be a conjugation which is an anti-involution, that is $\gamma^2 = 1$ and
$$\gamma[x,y] = -[\gamma(x),\gamma(y)],\qquad \gamma(ax) = a^*\gamma(x),\qquad x, y\in\mathfrak{g},\ a\in\mathbb{C}. \tag{21}$$
The positive definite hermitian form is given by
$$\langle x,y\rangle_H := \langle x,\gamma(y)\rangle,\qquad x, y\in\mathfrak{g}. \tag{22}$$
In the rest of the paper whenever we deal with a complex Lie algebra we shall always use this form to obtain the orthonormal basis and use the form < ·, · > for real forms
of the complex Lie algebra. Let x.y denote the product for x, y ∈ g for either case. It is also convenient to specify the complex or real case by indicating F = C or F = R. Let $\{\hat h_a : a = 1,\dots,m\}$, $m = \dim(\mathfrak{g})$, be an orthonormal basis for g obtained with respect to the appropriate form and let $(y^a)$ be the coordinates thereby defined on g. Since $\hat h_a.\hat h_b = \epsilon_a\delta_{ab}$ it follows that $\mathfrak{g} \cong E^m$ where $E^m$ is a flat space which depends upon g. If g is a complex semisimple Lie algebra then $E^m \cong \mathbb{C}^m$ and the hermitian form defines a hermitian metric on $E^m$ which in the orthonormal basis has standard form, $ds^2 = \sum_a dy^a\otimes dy^a$ ($\epsilon_a = 1$, a = 1, ..., m). This case is associated, through the introduction of real coordinates ($y^a = u^a + iv^a$), with a real Euclidean space $\tilde E^{2m}$ ($\epsilon_a = 1$, a = 1, ..., 2m). If g is a compact real form then g is a Euclidean space ($\epsilon_a = -1$, a = 1, ..., m), otherwise if g is some other real form then the space will have signature p − q for some p, q such that p + q = m.

Let $(x^i)$ be a coordinate system for the immersed manifold M. We shall assume that the coordinates for M are real (the case of complex coordinates presents no difficulties). Since $E^m$ is a flat space we can make the identification $\hat h_a \cong \partial_{y^a}$. If g is complex then the complex conjugate coordinates $(y^{a*})$ are defined with respect to the basis $\{\hat h^*_a := \gamma(\hat h_a) : a = 1,\dots,m\}$ and $\hat h^*_a \cong \partial_{y^{a*}}$. An n-dimensional immersion in g is defined by a map $f : \mathbb{R}^n \to \mathfrak{g} \cong E^m$. The function f is defined once the component functions $f = (f^a)$ with respect to the basis $\{\hat h_a\}$ are specified. Notice that this has important consequences if the soliton system (19) is given. For in this case it turns out that the immersions are integrable, that is there exists a large class of immersion functions f which can be explicitly constructed. An immersion is specified parametrically by
$$y^a = f^a(x^1,\dots,x^n),\qquad a = 1,\dots,m. \tag{23}$$
The precise relationship between the immersion and the soliton system is given later. Conversely it might be expected that a study of the Gauss-Codazzi-Ricci equations could lead to a clearer insight into the nature of the integrability which occurs in soliton systems. Let $f : \mathbb{R}^n \to E^m$ define an immersion; then the tangent vectors to M are
$$\begin{aligned}
\partial_{x^i} &= f^a{}_{,i}\,\hat h_a + (f^a{}_{,i})^*\,\hat h^*_a &&(F = \mathbb{C}),\\
\partial_{x^i} &= f^a{}_{,i}\,\hat h_a &&(F = \mathbb{R}),
\end{aligned} \tag{24}$$
and the induced metric is given by
$$\begin{aligned}
g_{ij} &= \partial_{x^i}.\partial_{x^j} = h_{ab}\bigl(f^a{}_{,i}\,(f^b{}_{,j})^* + (f^a{}_{,i})^*\,f^b{}_{,j}\bigr) &&(F = \mathbb{C}),\\
g_{ij} &= \partial_{x^i}.\partial_{x^j} = h_{ab}\,f^a{}_{,i}\,f^b{}_{,j} &&(F = \mathbb{R}),
\end{aligned} \tag{25}$$
where $h_{ab} := \mathrm{diag}(\epsilon_1,\dots,\epsilon_m)$. In the text this metric will be used to raise and lower the indices a, b, ... For the complex algebra we shall consider the immersion to be in the associated real Euclidean space $\tilde E^{2m}$ obtained by introducing real coordinates $y^a = u^a + iv^a$. Then we obtain in a consistent fashion from the complex orthonormal basis $\{\hat h_a\}$ the orthonormal real basis $\{\tilde{\hat h}_a : a = 1,\dots,2m\}$, where
$$\tilde{\hat h}_a \cong \partial_{\tilde y^a} = \begin{cases}
\dfrac{1}{\sqrt 2}\,\partial_{u^a} \cong \dfrac{1}{\sqrt 2}\,(\hat h_a + \hat h^*_a), & a = 1,\dots,m,\\[2mm]
\dfrac{1}{\sqrt 2}\,\partial_{v^{a-m}} \cong \dfrac{i}{\sqrt 2}\,(\hat h^*_{a-m} - \hat h_{a-m}), & a = m+1,\dots,2m.
\end{cases} \tag{26}$$
This basis is related to the compact real form u of the algebra since g = u ⊕ iu where u = {x ∈ g : ω(x) = −x}. Thus the complex case is transformed to the real case with orthonormal basis $\{\tilde{\hat h}_a\}$, coordinates $(\tilde y^a)$, a = 1, ..., 2m and immersion function $\tilde f : \mathbb{R}^n \to E^{2m} \cong \mathbb{R}^{2m}$ given by $\tilde f^a = \Re(f^a)$, $\tilde f^{a+m} = \Im(f^a)$, a = 1, ..., m. The following calculations therefore apply also to the complex case with these changes. It is convenient to adopt the notation
$$\langle f_{,i}, f_{,j}\rangle := h_{ab}\,f^a{}_{,i}\,f^b{}_{,j}. \tag{27}$$
In this notation put $\hat n_\alpha = n^a{}_\alpha\,\hat h_a$ so that the moving frame vectors normal to M can be defined by
$$\langle n_\alpha, n_\beta\rangle = \epsilon_\alpha\delta_{\alpha\beta},\qquad \langle f_{,i}, n_\alpha\rangle = 0,\qquad \alpha,\beta = n+1,\dots,m,\quad i = 1,\dots,n. \tag{28}$$
Standard calculations (for example Spivak [9]) give the Gauss equations and the integrability equations for the immersion. The induced metric is
$$g_{ij} = \langle f_{,i}, f_{,j}\rangle. \tag{29}$$
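The induced metric (29) is easy to evaluate numerically for a concrete immersion. The following sketch, which is an illustration and not part of the original construction, takes the unit sphere in Euclidean $E^3$ (all signs $\epsilon_a = +1$) as an assumed example, approximates the tangent vectors $f_{,i}$ by central differences and forms $g_{ij} = h_{ab} f^a{}_{,i} f^b{}_{,j}$. The function names and the step size are hypothetical.

```python
import math

def sphere(x1, x2):
    """Assumed example immersion R^2 -> E^3: the unit sphere in polar angles."""
    return (math.sin(x1) * math.cos(x2),
            math.sin(x1) * math.sin(x2),
            math.cos(x1))

def tangent(f, x, i, h=1e-6):
    """Central-difference approximation of the tangent vector f_{,i} at x."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return [(p - m) / (2.0 * h) for p, m in zip(f(*xp), f(*xm))]

def induced_metric(f, x, signs):
    """g_ij = sum_a eps_a f^a_{,i} f^a_{,j}, with h_ab = diag(signs)."""
    n = len(x)
    t = [tangent(f, x, i) for i in range(n)]
    return [[sum(e * a * b for e, a, b in zip(signs, t[i], t[j]))
             for j in range(n)] for i in range(n)]

if __name__ == "__main__":
    g = induced_metric(sphere, (0.8, 0.3), (1, 1, 1))
    # For the unit sphere one expects g = diag(1, sin(x1)**2).
    print(g, math.sin(0.8) ** 2)
```

Passing a different tuple of signs, for instance (1, 1, -1), gives the metric induced from a flat space of indefinite signature.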
Define the functions
$$b_{i\alpha j} = \langle f_{,ij}, n_\alpha\rangle,\qquad \alpha = n+1,\dots,m,\quad i, j = 1,\dots,n, \tag{30}$$
then we can introduce the first and second fundamental forms by
$$\begin{aligned}
\mathrm{I}:&\quad ds^2 = g_{ij}\,dx^i\otimes dx^j,\\
\mathrm{II}:&\quad \sum_\alpha \epsilon_\alpha\,b_{i\alpha j}\,b_{k\alpha\ell}\;dx^i\otimes dx^j\otimes dx^k\otimes dx^\ell.
\end{aligned} \tag{31}$$
In this coordinate system by differentiating the metric (29) we obtain the Gauss formulas
$$f_{,ij} = \Gamma^h{}_{ij}\,f_{,h} + \sum_\alpha b_{i\alpha j}\,n_\alpha, \tag{32}$$
where
$$\Gamma^h{}_{ij} = \tfrac{1}{2}\,g^{hs}\,(g_{is,j} + g_{js,i} - g_{ij,s}). \tag{33}$$
This equation could also be expressed in terms of covariant derivatives ($v_{i;j} := v_{i,j} + \Gamma^k{}_{ij}v_k$). The integrability conditions then give rise to the Gauss–Codazzi–Ricci equations expressed in terms of the local coordinates $(x^i)$ for the intrinsic geometry of the immersed manifold M. In fact these can be obtained directly from Eqs. (13) by formally replacing the orthonormal frame with a natural one, $\sigma^i \mapsto dx^i$ (“formally” because although this is not an orthonormal frame, it can easily be seen that the calculations proceed as before with $\gamma^i{}_{jk} \mapsto \Gamma^i{}_{jk}$):
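$$\begin{aligned}
&R_{ijk\ell} = \sum_\alpha \epsilon_\alpha\,(b_{i\alpha\ell}b_{k\alpha j} - b_{i\alpha k}b_{\ell\alpha j}) &&\text{Gauss eqns.,}\\
&(b_{i\alpha j;k} - b_{i\alpha k;j}) + \sum_\beta \epsilon_\beta\,(b_{i\beta k}\mu_{\beta\alpha j} - b_{i\beta j}\mu_{\beta\alpha k}) = 0 &&\text{Codazzi eqns.,}\\
&\mu_{\alpha\beta i;k} - \mu_{\alpha\beta k;i} + \sum_\delta \epsilon_\delta\,(\mu_{\delta\alpha i}\mu_{\delta\beta k} - \mu_{\delta\alpha k}\mu_{\delta\beta i}) + (b^\ell{}_{\alpha i}b_{\ell\beta k} - b^\ell{}_{\alpha k}b_{\ell\beta i}) = 0 &&\text{Ricci eqns.}
\end{aligned} \tag{34}$$
All the coefficient functions are defined with respect to the coordinate system $(x^i)$. The relationship with the presentation in the first section is established by constructing an orthonormal moving frame $\{\hat e_a : a = 1,\dots,m\}$ at $P \in M \subset E^m$. From the natural frame $\{\partial_{x^i}\}$ at P ∈ M we construct an orthonormal frame $\{\hat e_i : i = 1,\dots,n\}$. Put $\hat e_i = \lambda^j{}_i\,\partial_{x^j}$ and $\xi^\alpha{}_i = f^\alpha{}_{,j}\lambda^j{}_i$, then
$$\langle \hat e_i, \hat e_j\rangle = \bigl\langle \xi^\alpha{}_i\hat h_\alpha,\ \xi^\beta{}_j\hat h_\beta\bigr\rangle = \lambda^s{}_i\lambda^k{}_j\,g_{sk} = \epsilon_i\delta_{ij},\qquad i, j = 1,\dots,n. \tag{35}$$
Thus if (·, ·) is the metric on M defined by $g_{ij}$ the first n frame vectors of the moving frame are constructed by determining the functions $\lambda^i{}_j$ from
$$(\lambda^k{}_i\,\partial_{x^k},\ \lambda^s{}_j\,\partial_{x^s}) = \epsilon_i\delta_{ij}, \tag{36}$$
and then $\hat e_i = f^\alpha{}_{,j}\lambda^j{}_i\,\hat h_\alpha$, i = 1, ..., n. The remaining frame vectors are given by
$$\hat e_\alpha = \hat n_\alpha,\qquad \alpha = n+1,\dots,m. \tag{37}$$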
Equations (19) can be interpreted as defining a G-valued connection, with Eqs. (20) showing that the curvature tensor of the connection is zero. We now need a prescription for constructing f from the soliton system (19) so that the integrability conditions (20), or zero curvature condition for the Cartan-Ehresmann connection, are satisfied. Introduce an arbitrary variation δ which commutes with the derivatives $\partial_i$, $\delta\partial_i = \partial_i\delta$. Then from (19) we get
$$(\delta\Phi)_{,i} = \delta A_i\,\Phi + A_i\,\delta\Phi \tag{38}$$
and consequently
$$(\Phi^{-1}\delta\Phi)_{,i} = \Phi^{-1}\,\delta A_i\,\Phi. \tag{39}$$
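The step from (38) to (39) can be spelled out as follows. The only additional ingredient is the derivative of the inverse, $(\Phi^{-1})_{,i} = -\Phi^{-1}\Phi_{,i}\Phi^{-1}$, which follows from differentiating $\Phi^{-1}\Phi = 1$; together with (19) this gives $(\Phi^{-1})_{,i} = -\Phi^{-1}A_i$.

```latex
\begin{align*}
(\Phi^{-1}\delta\Phi)_{,i}
  &= (\Phi^{-1})_{,i}\,\delta\Phi + \Phi^{-1}(\delta\Phi)_{,i} \\
  &= -\Phi^{-1}A_i\,\delta\Phi
     + \Phi^{-1}\bigl(\delta A_i\,\Phi + A_i\,\delta\Phi\bigr) \\
  &= \Phi^{-1}\,\delta A_i\,\Phi .
\end{align*}
```

The two terms containing $A_i\,\delta\Phi$ cancel, so only the variation of the potential survives.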
This equation has already played an important role in soliton theory. It provides one way in which the variations in the potentials in the linear problem (19) can be related to variations in the scattering data [10]. The system of Eqs. (39) is completely integrable since $(\Phi^{-1}\delta\Phi)_{,ij} = (\Phi^{-1}\delta\Phi)_{,ji}$ is satisfied provided
$$\Phi^{-1}\bigl(\delta A_{i,j} - \delta A_{j,i} + [\delta A_i, A_j] + [A_i, \delta A_j]\bigr)\Phi = 0. \tag{40}$$
Thus the conditions that the second derivatives commute are consequences of the zero curvature condition (20), and are the δ-variations of this equation. It follows that
$$(\Phi^{-1}\delta\Phi)_{,ij} = \partial_j(\Phi^{-1}\delta A_i\,\Phi) = \partial_i(\Phi^{-1}\delta A_j\,\Phi),\qquad i < j = 1,\dots,m, \tag{41}$$
so that there exists a function $f : \mathbb{R}^n \to \mathfrak{g} \cong E^m$ such that
$$f_{,i} = \Phi^{-1}\,\delta A_i\,\Phi \tag{42}$$
and
$$f = \Phi^{-1}\delta\Phi + H,\qquad H\in\mathfrak{g}. \tag{43}$$
Notice that H = H(k) in this last equation and that in addition we assume that δ is such that $\Phi^{-1}\delta\Phi \in \mathfrak{g}$. The general form of a variation which commutes with the derivations $\{\partial_i\}$ is
$$\delta = \sum_i a_i\,\partial_i + b\,\partial_k + C, \tag{44}$$
where C ∈ g acts as an inner derivation on g through the adjoint action. The compatibility conditions are satisfied provided the coefficients only depend on k. The function f is only defined up to a choice of basis $\{\hat h_a\}$ and an arbitrary translation H ∈ g. The choice of orthonormal basis is invariant under elements of the rotation group $A \in O(E^m)$ since if $\{\hat h_a\}$ and $\{\tilde h_a\}$ are two orthonormal bases of $E^m$,
$$\hat h_a.\hat h_b = A_a{}^c\,\tilde h_c\,.\,A_b{}^d\,\tilde h_d\quad\Rightarrow\quad A_a{}^d A^{*b}{}_d = \delta^b_a. \tag{45}$$
If F = R then the matrices $(A^b{}_a)$ are real. Thus $\tilde f = \tilde f^a\tilde h_a = (A^a{}_b f^b - T^a)\tilde h_a$ also defines an immersion, where $T^a = A^a{}_b H^b$. Different solutions $f : \mathbb{R}^n \to \mathfrak{g} \cong E^m$ of (39) for a fixed function $\Phi = \Phi(x;k)$ therefore define n dimensional manifolds immersed in $E^m$ which can be superimposed after a rotation and a translation, that is a motion in $E^m$. This interpretation of the degrees of freedom in f is only valid if H and the other functions are considered to be evaluated at a fixed value of k. However the parameter k can also be included in this formalism by interpreting g as a loop algebra (Sect. 3), whereupon m = ∞ and B(k) is an arbitrary element in a loop algebra defined by g.

Let σ ∈ Aut(g); then $\mathrm{ad}(\sigma\cdot X) = \sigma\,\mathrm{ad}X\,\sigma^{-1}$ for X ∈ g. Since for a matrix group $\mathrm{Ad}(\Phi)X = \Phi X\Phi^{-1}$ and $\mathrm{Ad}(\Phi) \in \mathrm{Aut}(\mathfrak{g})$ we deduce that
$$\bigl\langle \mathrm{Ad}(\Phi^{-1})\cdot X,\ \mathrm{Ad}(\Phi^{-1})\cdot Y\bigr\rangle = \mathrm{Tr}\bigl(\mathrm{Ad}(\Phi^{-1})\cdot\mathrm{ad}X\cdot\mathrm{ad}Y\cdot\mathrm{Ad}(\Phi)\bigr) = \langle X, Y\rangle. \tag{46}$$
For the matrix groups we can take any other nondegenerate invariant bilinear form on the algebra such as the form associated with their natural representation, that is their lowest dimensional faithful matrix representation. This is the simplest form to work with and there is no loss in generality since the bilinear forms are equivalent up to a scaling factor. If X, Y ∈ g are elements in the natural representation then $\langle X, Y\rangle_N := \mathrm{Tr}\,XY$. The invariance of this form under inner automorphisms can be immediately established. Thus in the case of real forms the function Φ defines an inner automorphism of g which preserves the metric
$$g_{ij} = \langle f_{,i}, f_{,j}\rangle = \langle \mathrm{Ad}(\Phi)\cdot\delta A_i,\ \mathrm{Ad}(\Phi)\cdot\delta A_j\rangle = \langle \delta A_i, \delta A_j\rangle. \tag{47}$$
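The invariance of the trace form $\langle X, Y\rangle_N = \mathrm{Tr}\,XY$ under $\mathrm{Ad}(\Phi)$, which underlies (47), can be illustrated with a few lines of code. The sketch below works with 2×2 complex matrices stored as nested lists; the matrices used in the example run are arbitrary illustrative values and the function names are assumptions.

```python
def mul(p, q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(p[i][m] * q[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(p):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    det = p[0][0] * p[1][1] - p[0][1] * p[1][0]
    return [[p[1][1] / det, -p[0][1] / det],
            [-p[1][0] / det, p[0][0] / det]]

def Ad(g, x):
    """Adjoint action Ad(g) x = g x g^{-1} of a matrix group element."""
    return mul(mul(g, x), inverse(g))

def trace_form(x, y):
    """Trace form <x, y>_N = Tr(x y) in the natural representation."""
    xy = mul(x, y)
    return xy[0][0] + xy[1][1]

if __name__ == "__main__":
    g = [[1 + 2j, 0.5], [-0.3j, 2.0]]   # arbitrary invertible matrix
    X = [[1j, 2.0], [-2.0, -1j]]        # traceless test elements
    Y = [[0.5, 1j], [3.0, -0.5]]
    print(trace_form(X, Y), trace_form(Ad(g, X), Ad(g, Y)))
```

The two printed values agree up to rounding, since $\mathrm{Tr}(gXg^{-1}gYg^{-1}) = \mathrm{Tr}(XY)$ by cyclicity of the trace.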
Moreover the new frame vectors $e_\alpha = \mathrm{Ad}(\Phi)\cdot n_\alpha$ are also normal,
$$\langle e_\alpha, e_\beta\rangle = \epsilon_\alpha\delta_{\alpha\beta},\qquad \langle \delta A_i, e_\alpha\rangle = 0, \tag{48}$$
$$b_{i\alpha j} = \langle -A_j\,\delta A_i + \delta A_{i,j} + \delta A_j\,A_i,\ e_\alpha\rangle. \tag{49}$$
For real forms Eqs. (47), (48), and (49) define the fundamental forms I and II in terms of the quantities which appear in the linear Eqs. (19). If the function f is given then the Gauss–Codazzi–Ricci equations are the conditions that the local integrability conditions on f are satisfied to order 3. The construction of f shows that this is automatically true. Higher order integrability conditions are then satisfied since they correspond to further (total) differentiations of Eq. (20). The construction of soliton immersions is now guaranteed provided we can:
(i) Find a soliton system (19) for which n < dim(g).
(ii) Construct an orthonormal basis for g.
(iii) Construct a function $f : \mathbb{R}^n \to \mathfrak{g} \cong E^m$ from a variation δ : G → TG which defines a canonical map G → g under left translation.
For soliton systems there are a very large number of such constructions.

3. Loop Algebras
Throughout this and the remaining sections we abandon the previous usage of indices. The range of values for an index is specified when it is introduced. The existence of a large family of immersions for soliton systems is a consequence of the fact that the systems (19) are associated not with semisimple groups and Lie algebras over F, but rather loop groups and affine Lie algebras over F. The treatment of soliton systems in the previous section ignored the role of the parameter k in the theory. In this section we principally investigate the case when g is a semisimple Lie algebra over C equipped with the standard bilinear form $\langle\cdot,\cdot\rangle$. We only consider the compact real form of the algebras. If h is a Cartan subalgebra of g let Δ be the set of roots for g with respect to h and let π be a set of simple roots for Δ. Let $\mathfrak{h}^*$ denote the dual of h and define a basis $\pi^\vee$ of h through the pairing
$$\alpha_i(\alpha_j^\vee) = a_{ji},\qquad \alpha_i\in\pi,\ \alpha_j^\vee\in\mathfrak{h}, \tag{50}$$
where $A = (a_{ij})$ is the Cartan matrix for the algebra. The bilinear form $\langle\cdot,\cdot\rangle$ defines a nondegenerate symmetric bilinear form on $\mathfrak{h}^*$ (by linearity) which we denote with the same notation
$$\langle\alpha_i,\alpha_j\rangle = \alpha_j(\alpha_i^\vee),\qquad \alpha_i,\alpha_j\in\pi. \tag{51}$$
The Killing form (or any other standard form) on g, normalised so that long roots have length 2, can be used to directly generate the form (51). The equivalence classes of automorphisms which define distinct classes of soliton equations [11] are governed by automorphisms of any set of simple roots π ⊂ Δ. Let ν ∈ Aut(π) be an automorphism of finite order p, that is $\nu^p = 1$, the identity on π. An automorphism ν ∈ Aut(π) defines a unique automorphism of g which we denote by the same letter. Let ω be a primitive pth root of unity ($\omega = e^{2\pi i/p}$). For simplicity we shall
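restrict the discussion to affine Lie algebras which are simply laced, that is their Dynkin diagrams involve only single bonds. In this case Aut(π) ≅ W, the Weyl group of g, and we denote the automorphism by w ∈ W. The Cartan subalgebra h ⊂ g is chosen so that the weight lattice
$$L = \bigoplus_{\alpha_i\in\pi}\mathbb{Z}\alpha_i \tag{52}$$
of g and the automorphism w ∈ W satisfy [12]
(i) $\langle\alpha,\alpha\rangle \in 2\mathbb{Z}$, α ∈ L.
(ii) $\langle w\alpha, w\beta\rangle = \langle\alpha,\beta\rangle$, α, β ∈ L.
(iii) $\langle w^{p/2}\alpha, \alpha\rangle \in 2\mathbb{Z}$ if p is even.
The second condition indicates that elements of W are isometries of the bilinear form and the final condition ensures that if α ∈ L then so is −α. For simply laced algebras the Cartan matrices are symmetric and
$$\langle\alpha_i,\alpha_j\rangle = \langle\alpha_i^\vee,\alpha_j^\vee\rangle. \tag{53}$$
The isomorphism between h and its dual becomes $\alpha_i^\vee \mapsto \alpha_i$ and so we can make the identification $\mathfrak{h} \cong \mathfrak{h}^*$ and put $\mathfrak{h} = \mathbb{C}\otimes_{\mathbb{Z}} L$. Let $\epsilon : L\times L \to \langle\pm 1\rangle$ be the bimultiplicative function
$$\epsilon(\alpha+\gamma,\beta) = \epsilon(\alpha,\beta)\,\epsilon(\gamma,\beta),\qquad \epsilon(\alpha,\beta+\gamma) = \epsilon(\alpha,\beta)\,\epsilon(\alpha,\gamma), \tag{54}$$
which satisfies the condition
$$\epsilon(\alpha,\alpha) = (-1)^{\frac{1}{2}\langle\alpha,\alpha\rangle}. \tag{55}$$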
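A function ε with the properties (54) and (55) can be written down explicitly. The sketch below does this for the root lattice of type $A_2$, using a common recipe that is an assumption here and not taken from the text: on simple roots set ε(α_i, α_i) = −1, ε(α_i, α_j) = $(-1)^{\langle\alpha_i,\alpha_j\rangle}$ for i < j and ε(α_i, α_j) = 1 for i > j, then extend bimultiplicatively. Lattice vectors are stored as integer coefficient pairs with respect to the simple roots.

```python
CARTAN_A2 = [[2, -1], [-1, 2]]   # <alpha_i, alpha_j> for the simple roots of A2

def inner(a, b):
    """Bilinear form <a, b> on the root lattice in simple-root coordinates."""
    return sum(a[i] * CARTAN_A2[i][j] * b[j]
               for i in range(2) for j in range(2))

def eps_simple(i, j):
    """Assumed values of eps on pairs of simple roots."""
    if i == j:
        return -1
    if i < j:
        return -1 if CARTAN_A2[i][j] % 2 else 1
    return 1

def eps(a, b):
    """Bimultiplicative extension: product of eps_simple(i, j)**(a_i * b_j)."""
    sign = 1
    for i in range(2):
        for j in range(2):
            if (a[i] * b[j]) % 2 and eps_simple(i, j) == -1:
                sign = -sign
    return sign

ROOTS = [(1, 0), (0, 1), (1, 1), (-1, 0), (0, -1), (-1, -1)]

if __name__ == "__main__":
    for r in ROOTS:
        minus_r = (-r[0], -r[1])
        print(r, eps(r, r), eps(r, minus_r))
```

With this choice one finds ε(α, α) = −1 and ε(α, −α) = −1 for every root α of $A_2$, in agreement with (55) for roots of squared length 2.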
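Then for a simply laced complex Lie algebra g with Cartan subalgebra h there exists a root space decomposition
$$\mathfrak{g} = \mathfrak{h}\oplus\bigoplus_{\alpha\in\Delta}\mathfrak{g}_\alpha, \tag{56}$$
where $\mathfrak{g}_\alpha = \mathbb{C}x_\alpha$,
$$[h, x_\alpha] = \langle h,\alpha\rangle\,x_\alpha,\qquad h\in\mathfrak{h}. \tag{57}$$
For α, β ∈ Δ the basis elements $x_\alpha \in \mathfrak{g}_\alpha$ can be chosen to have the commutators [13]
$$[x_\alpha, x_\beta] = \begin{cases}
\epsilon(\alpha,-\alpha)\,\alpha & \text{if } \alpha+\beta = 0,\\
\epsilon(\alpha,\beta)\,x_{\alpha+\beta} & \text{if } \langle\alpha,\beta\rangle = -1,\\
0 & \text{if } \langle\alpha,\beta\rangle \ge 0.
\end{cases} \tag{58}$$
The bilinear form on g restricted to the root spaces is given by
$$\langle x_\alpha, x_\beta\rangle = \begin{cases}
\epsilon(\alpha,-\alpha) & \text{if } \alpha+\beta = 0,\\
0 & \text{if } \alpha+\beta \ne 0.
\end{cases} \tag{59}$$
With the given definition of ε we have ε(α, −α) = −1, α ∈ Δ. The automorphism w ∈ W extends to an automorphism of g, ν ∈ Aut(g),
$$\nu^j x_\alpha = \eta(j,\alpha)\,x_{\nu^j\alpha}. \tag{60}$$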
Since $\nu^p = 1$ it follows that $\eta(j,\alpha) \in \langle\omega\rangle$ and $\eta(j,\alpha) = \eta(1,\alpha)^j$. From the same relationship we also have $\eta(1,\nu^j\alpha) = \eta(1,\alpha)$. For calculations it is useful to have a representation of ν so that the values of $\eta : \mathbb{Z}_p\times L \to \langle\omega\rangle$ are determined from the explicit action of ν on elements in g. Let
$$\mathfrak{h} = \bigoplus_{j\in\mathbb{Z}_p}\mathfrak{h}^{(j)} \tag{61}$$
be the eigenspace decomposition of h under w. In the text we always make the interpretation (j) ≡ (j) mod p. There exist y ∈ g such that ν = Ad(exp 2πiy) = exp 2πi ad y for which $\nu|_{\mathfrak{h}} = w$ and $[y, \mathfrak{h}^{(0)}] = 0$. Fix such a y which in addition satisfies the properties $\langle y, \mathfrak{h}\rangle = 0$ and the eigenvalues of ad y belong to $\frac{1}{p}\mathbb{Z}$ [14]. The eigenspace decomposition of x ∈ g can then be directly determined from the explicit form of ν. Let g(ν) denote the $\mathbb{Z}_m$-gradation defined by ν ∈ Aut(g),
$$\mathfrak{g}(\nu) = \bigoplus_{j\in\mathbb{Z}_m}\mathfrak{g}^{(j)}. \tag{62}$$
The loop algebra $\tilde{\mathfrak{g}}(\nu)$ associated with the finite order automorphism ν of g is defined by
$$\tilde{\mathfrak{g}}(\nu) = \bigoplus_{j\in\mathbb{Z}}\mathfrak{g}^{(j)}\otimes k^j \tag{63}$$
with the Lie bracket
$$[x\otimes k^j,\ y\otimes k^\ell] = [x,y]\otimes k^{j+\ell},\qquad x\in\mathfrak{g}^{(j)},\ y\in\mathfrak{g}^{(\ell)}. \tag{64}$$
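The bracket (64) is straightforward to model on a computer. The following sketch represents a loop algebra element as a dictionary mapping a power of k to a 2×2 integer matrix and extends (64) bilinearly. It ignores the restriction to the graded pieces $\mathfrak{g}^{(j)}$, which amounts to the untwisted case ν = 1; this simplification, the choice of sl(2) generators and the function names are assumptions made for illustration.

```python
def commutator(x, y):
    """Matrix commutator [x, y] = x y - y x for 2x2 nested lists."""
    def mul(p, q):
        return [[sum(p[i][m] * q[m][j] for m in range(2)) for j in range(2)]
                for i in range(2)]
    xy, yx = mul(x, y), mul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(2)] for i in range(2)]

def loop_bracket(u, v):
    """Bracket of loop elements {power: matrix}: [x k^j, y k^l] = [x, y] k^(j+l)."""
    out = {}
    for j, x in u.items():
        for l, y in v.items():
            c = commutator(x, y)
            if j + l in out:
                prev = out[j + l]
                c = [[prev[a][b] + c[a][b] for b in range(2)] for a in range(2)]
            out[j + l] = c
    return out

# Standard sl(2) generators in the natural representation.
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]

if __name__ == "__main__":
    print(loop_bracket({1: E}, {-3: F}))        # expect H at power -2
    print(loop_bracket({0: H, 2: E}, {1: E}))   # expect 2E at power 1, zero at power 3
```

A twisted loop algebra in the sense of (63) would additionally require that the coefficient of $k^j$ lies in the eigenspace $\mathfrak{g}^{(j)}$ of ν; the bracket itself is unchanged.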
The indeterminate k for soliton equations can be interpreted as a local coordinate on a Riemann surface X. A standard bilinear form (·, ·) on a Lie algebra g has the following properties:
(i) (·, ·) is invariant.
(ii) (·, ·) restricted to h is nondegenerate.
(iii) $(\mathfrak{g}_\alpha, \mathfrak{g}_\beta) = 0$ if α + β ≠ 0.
(iv) (·, ·) is nondegenerate on $\mathfrak{g}_\alpha + \mathfrak{g}_{-\alpha}$, α ∈ Δ and [x, y] = (x, y)α for $x\in\mathfrak{g}_\alpha$, $y\in\mathfrak{g}_{-\alpha}$.
Thus $\langle\cdot,\cdot\rangle$ is a standard form for g and a standard form on $\tilde{\mathfrak{g}}$ is defined for $x\otimes p(k),\ y\otimes q(k) \in \tilde{\mathfrak{g}}(\nu)$ by
$$(x\otimes p(k),\ y\otimes q(k)) = \langle x,y\rangle\,\mathrm{Res}\,(k^{-1}p(k)q(k)). \tag{65}$$
Let γ be the conjugation introduced in Sect. 2. Define σ = −γ; then σ acts on g as
$$\sigma([x,y]) = [\sigma(x),\sigma(y)],\qquad \sigma(ax) = a^*\sigma(x),\qquad a\in\mathbb{C}. \tag{66}$$
The conjugation σ preserves the relationships (57) and (58),
$$\sigma:\quad x_{\pm\alpha}\mapsto x_{\mp\alpha},\qquad \alpha\mapsto -\alpha,\qquad \alpha\in\Delta. \tag{67}$$
Therefore an orthonormal basis for g with respect to the hermitian form
$$\langle x,y\rangle_H := -\langle x,\sigma(y)\rangle,\qquad x, y\in\mathfrak{g} \tag{68}$$
is given by $B = \{u_i, x_\alpha : \alpha\in\Delta,\ 1\le i\le\dim(\mathfrak{h})\}$, where $\{u_i\}$ is obtained from the simple coroots $\pi^\vee$. The hermitian form can be extended to $\tilde{\mathfrak{g}}$ by defining $\sigma(k) = k^{-1}$ so that
$$\langle x\otimes p(k),\ y\otimes q(k)\rangle_H := \langle x,y\rangle_H\,\mathrm{Res}\,\bigl(k^{-1}p(k)\,q^*(k^{-1})\bigr). \tag{69}$$
In this definition $p(z)^* = p^*(z^*)$, where p is a holomorphic function of z. The loops are considered as maps $S^1 \to \mathfrak{g}$. The physically important case of loops $\mathbb{R} \to \mathfrak{g}$ is considered in Sect. 6. We also require an orthonormal basis for the twisted Lie algebra g(ν). Since $\langle\nu x,\nu y\rangle = \langle x,y\rangle$, x, y ∈ g(ν), it follows that
$$\langle\mathfrak{g}^{(i)},\mathfrak{g}^{(j)}\rangle = 0,\qquad i + j \ne 0 \bmod p. \tag{70}$$
Consequently we can determine a basis for h, which taking (70) into account, reflects the gradation. The orthonormal basis is obtained by applying the Gram-Schmidt process starting with the basis vectors of $\mathfrak{h}^{(0)}$. The basis can be expressed in terms of the vectors $u^{(i)}_j$ which satisfy the conditions
$$\bigl\langle u^{(i)}_h,\ u^{(j)}_k\bigr\rangle_H = \delta^{(i+j),0}_{hk},\qquad u^{(i)}_h,\ u^{(j)}_k\in\mathfrak{h}. \tag{71}$$
Define the vectors $\psi^{\pm(0)}_j \cong u^{(0)}_j$, $\psi^{\pm(i)}_j = (u^{(i)}_j \pm u^{(-i)}_j)/\sqrt{2}$ and if p is even $\psi^{\pm(p/2)}_j \cong u^{(p/2)}_j$. The orthonormal basis for h is therefore $\{\psi^{\pm(i)}_j : 1\le j\le\dim\mathfrak{h}^{(i)},\ i\in\mathbb{Z}_p\}$. The orthonormal basis for g(ν) is completed by basis elements from the root spaces. If (60) represents the action of ν on the basis elements and ω is a fixed primitive pth root of unity, then let $\ell_\alpha$ be such that $\eta(1,\alpha) = \omega^{\ell_\alpha}$. A particular vector $x_\alpha\in\mathfrak{g}_\alpha$ determines $h_\alpha$ basis vectors in the ν-gradation of g, where $h_\alpha = \min\{i : \eta(1,\alpha)^i = 1,\ i\in\mathbb{N}\}$. The partial basis of g(ν) determined by $x_\alpha$ is given by the set of vectors
$$x^{(\ell_\alpha - i)}_\alpha = \sum_{j=0}^{h_\alpha - 1}\omega^{ij}\,x_{\nu^j\alpha},\qquad i = 0,\dots,h_\alpha - 1, \tag{72}$$
which are eigenvectors of ν,
$$\nu\,x^{(\ell_\alpha - i)}_\alpha = \omega^{\ell_\alpha - i}\,x^{(\ell_\alpha - i)}_\alpha,\qquad i = 0,\dots,h_\alpha - 1. \tag{73}$$
Moreover the adjoint element is in $\mathfrak{g}^{(-\ell_\alpha + i)}$,
$$\sigma\bigl(x^{(\ell_\alpha - i)}_\alpha\bigr) = x^{(-\ell_\alpha + i)}_{-\alpha}. \tag{74}$$
Let [α] represent the orbit of α ∈ Δ under the action of $\langle\nu\rangle$. Since β ∈ [α], β ∈ [γ] implies that [α] = [γ] the root vectors are partitioned into disjoint orbits. Thus the basis vectors which reflect the ν-gradation are $\{x^{(i)}_{[\alpha]}\}$ and the orthogonal property (70) imposes the condition that
$$\bigl\langle x^{(i)}_{[\alpha]},\ x^{(j)}_{[\beta]}\bigr\rangle_H = \delta^{i,j}_{\alpha,\beta}\,r^{(i)}_{[\alpha]}, \tag{75}$$
where the quantities $r^{(i)}_{[\alpha]}$ can be explicitly calculated. In this equation the requirement α = β means that there exist γ ∈ [α], λ ∈ [β] such that γ = λ. Let $\tau^{(i)}_{[\alpha]}$ be the normalised
(i) ±(i) vector which corresponds to x(i) } is an orthonormal basis for [α] then B(ν) := {τ[α] , ψj (i) g(ν) (r[α] = 1). The orthonormal basis for a twisted simply laced Lie algebra can be obtained in a [i] [i] similar way. It is given by B˜ := {τ[α] , ψj± } where [i]
(i) ⊗ ki , τ[α] = τ[α]
ψj± = ψ (i) ⊗ k i , [i]
i ∈ Z, α ∈ 1.
(76)
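Real forms for the algebras can be obtained in the standard way from the compact real form. The compact real form of $\tilde g(\nu)$ is the real Lie algebra which is the fixed point set of the conjugation $\tilde\sigma$, $\tilde u(\nu) = \{x \in \tilde g(\nu) : \tilde\sigma(x) = x\}$. The orthonormal basis for the twisted compact real form is $\tilde B_\sigma(\nu) = \{\kappa^{\pm[j]}_{[\alpha]}, {}^{\pm}\xi_\ell^{\pm[j]}\}$, where
$$\kappa^{\pm[j]}_{[\alpha]} = \begin{cases} \dfrac{\sqrt{\pm 1}}{\sqrt 2}\big(\tau^{(j)}_{[\alpha]} \otimes k^{j} \pm \tau^{(-j)}_{[-\alpha]} \otimes k^{-j}\big), & \sigma(\tau^{(j)}_{[\alpha]}) \neq \tau^{(j)}_{[\alpha]}, \\[2mm] \dfrac{\sqrt{\pm 1}}{\sqrt 2}\,\tau^{(j)}_{[\alpha]} \otimes (k^{j} \pm k^{-j}), & \sigma(\tau^{(j)}_{[\alpha]}) = \tau^{(j)}_{[\alpha]}, \end{cases} \qquad j \in Z,\ \alpha \in \Delta, \tag{77}$$
$${}^{\pm}\xi_j^{\pm[i]} = \begin{cases} \dfrac{1}{\sqrt 2}\,\psi_j^{\pm(i)} \otimes (k^{i} - k^{-i}) & ({}^{-}\xi_j^{\pm(i)}), \\[2mm] \dfrac{i}{\sqrt 2}\,\psi_j^{\pm(i)} \otimes (k^{i} + k^{-i}) & ({}^{+}\xi_j^{\pm(i)}), \end{cases} \qquad i \in Z,\ \alpha \in \Delta. \tag{78}$$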
The other real forms can be obtained from involutions of the compact form as in the finite dimensional case. The real forms for $\widetilde{sl}(2, C)$ are given in Sect. 6; a complete classification is given in [11].
4. Baker–Akhiezer Functions

An important class of functions $\Phi(x; k)$ which are solutions to (19) are called Baker–Akhiezer functions [15]. These functions have an analytic representation and so define integrable immersion functions through Eq. (43). In the general situation the functions $A_i$ in (19) can have poles at points $k = a_\eta$ of multiplicity $r_\eta$, $\eta = 1, \dots, N$, including the point at infinity. For the equations which fall within the scope of the representations of the loop algebras which we have presented, the poles occur either at $k = \infty$ or $k = 0$. Let $X$ be a compact Riemann surface of finite genus $g$ and let $\pi : X \to P$ be the associated covering map which we take to be $p$-sheeted. Let $B = \{b_1, \dots, b_L\}$ denote the set of branch points of $X$. Local coordinates on $X$, $z_+ = k$, $z_- = 1/k$, can be assigned in a neighbourhood of each singularity $a_+ = 0$, $a_- = \infty$ respectively. A Baker–Akhiezer function is defined by the following data:

(i) The points $\{a_+ = 0, a_- = \infty\} \in P \setminus B$.

(ii) The diagonal $p \times p$ matrices $\Theta^{(\epsilon)}(k) = (\theta_\alpha^{(\epsilon)}(k)\,\delta_{\alpha\beta})$ where
$$\theta_\alpha^{(\epsilon)} = \sum_{j=1}^{r_\epsilon} \frac{\gamma^{(\epsilon)}_{j\alpha}}{j}\, z_\epsilon^{-j} + \ell^{(\epsilon)}_\alpha \log z_\epsilon, \qquad \ell^{(\epsilon)}_\alpha \in Z,\ \alpha = 1, \dots, p.$$

(iii) A positive divisor $D = P_1 + \cdots + P_{g+\ell+p-1}$ of degree $g + \ell + p - 1$, where $\ell = \sum_{\epsilon,\alpha} \ell^{(\epsilon)}_\alpha$.
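The Baker–Akhiezer functions $\Phi^{(\epsilon)}(k)$, $\epsilon = \pm$, are then constructed from this data as follows. First form the scalar Baker–Akhiezer functions $\phi^{(\epsilon)}_\alpha(P)$, $\alpha = 1, \dots, p$, with the properties

1. $\phi^{(\epsilon)}_\alpha(P)$ is meromorphic on $X \setminus \{\pi^{-1}(0) \cup \pi^{-1}(\infty)\}$ and has $D$ as its pole divisor.

2. Select $p$ points $P^{\epsilon}_1, \dots, P^{\epsilon}_p \in \pi^{-1}(a_\epsilon)$ such that in a neighbourhood of $P^{\epsilon}_\beta \in \pi^{-1}(a_\epsilon)$, $\phi^{(\epsilon)}_\alpha(P)$ behaves like
$$\phi^{(\epsilon)}_\alpha(P) = \begin{cases} (\delta_{\alpha\beta} + O(z_\epsilon)) \exp(\theta^{(\epsilon)}_\beta(k)) & \text{as } P \to P^{\epsilon}_\beta, \\ O(1) \exp(\theta^{(\mu)}_\beta(k)) & \text{as } P \to P^{\mu}_\beta,\ \mu \neq \epsilon. \end{cases}$$

The functions defined in this way exist and are unique [15, 16]. The points $P^{\epsilon}_\beta$, $\beta = 1, \dots, p$, are the points at infinity in the asymptotic expansion of the scalar Baker–Akhiezer functions. Under $\pi : X \to P$ let $P \mapsto k$. From the functions $\{\phi^{(\epsilon)}_\alpha\}$ we construct the $p \times p$ matrix functions $\Phi^{(\epsilon)}(k)$ which are multivalued on $P$,
$$\Phi^{(\epsilon)}(k) = \big(\phi^{(\epsilon)}_\alpha(P^{(\epsilon)}_\beta(k))\big). \tag{79}$$
The coordinates are chosen so that $P^{(\epsilon)}_\alpha(k) \in \pi^{-1}(k)$ and $P^{(\epsilon)}_\alpha(a_\epsilon) = P^{\epsilon}_\alpha$. The functions obtained this way have the form
$$\Phi^{(\epsilon)}(k) = \hat\Phi^{(\epsilon)}(k) \exp \Theta^{(\epsilon)}(k), \tag{80}$$
where the functions $\hat\Phi^{(\epsilon)}(k)$ have the asymptotic expansions
$$\hat\Phi^{(\epsilon)}(k) = \sum_{j \ge 0} \Phi^{(\epsilon)}_j k^{\epsilon j}, \qquad \Phi^{(\epsilon)}_0 = 1. \tag{81}$$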
The data $\{a_+, \gamma^{(+)}_1, \dots, \gamma^{(+)}_{r_+}, a_-, \gamma^{(-)}_1, \dots, \gamma^{(-)}_{r_-}\}$ is the set of deformation parameters associated with the system. For the cases which we consider the points $a_\epsilon \in \{0, \infty\}$ are fixed. Identify the remaining data with the variables $\{x_1, \dots, x_n\}$. Consider the function $\Phi(k) := \Phi^{(-)}(k)$ which is multivalued on $P$ and introduce the gauge function $G^{(+)}$, which is independent of $k$, such that $\Phi(k) = G^{(+)}\Phi^{(+)}(k)$. Thus the function $\Phi \exp(-\Theta)$ is normalised to the identity at $k = \infty$, but the normalisation at the origin depends upon the gauge function $G^{(+)}$. The constructed function defines a rational function $B(k)$ such that $B(k) = \Phi(k)_{,k}\,\Phi(k)^{-1}$,
$$B(k) = \sum_{j=0}^{r_+} B_{+j}\, k^{-(j+1)} - \sum_{j=1}^{r_-} B_{-j}\, k^{j-1} + \sum_{i=1}^{L} \frac{\hat B_i}{k - b_i} + \sum_{i=1}^{g+\ell+p-1} \frac{\tilde B_i}{k - c_i}, \tag{82}$$
where $c_i = \pi(P_i)$. The rational functions $A_i$ which appear in (19) are given by
$$A_i(k) = \Phi(k)_{,x_i}\,\Phi(k)^{-1}. \tag{83}$$
Conversely it is possible starting from (19) to construct formal solutions $\Phi(k)$ defined by rational functions $A_i$ such that the associated nonlinear equations preserve the monodromy data of the system (19) [17]. The Baker–Akhiezer functions are a special class
of such functions which have a finite monodromy group determined by the covering map $\pi : X \to P$, and for which the Stokes multipliers are all trivial.

We can relate this material to the previous section by introducing the loop group associated with the Lie algebra $g$. Let $G$ be the simply connected group whose Lie algebra is $g$, such that $\exp h := H \subset G$. The loop group $\tilde G$ is the group of smooth maps $S^1 \to G$ with pointwise multiplication. As in the loop algebra case the map is still called a loop if it can be analytically extended off the circle as a regular rational map $C^\times \to G$. Let $\mathcal{H} = G/N(H)$, where $N$ is the normaliser of $H$ in $G$, denote the variety of all maximal abelian subgroups, and let $\tilde{\mathcal{H}}$ denote its loop space. The maximal abelian subalgebras of $\tilde g$ are in one to one correspondence with the loops in $\tilde{\mathcal{H}}$. Let $\tilde h \subset \tilde g$ be the maximal abelian subalgebra of $\tilde g$ which corresponds to the loop $\tilde H \in \tilde{\mathcal{H}}$, then
$$\tilde h = \{x \in \tilde g : \exp x(k) \in \tilde H,\ k \in C^\times\}. \tag{84}$$
Let $\nu \in \mathrm{Aut}(g)$ be the lift of an element $w \in W$ determined by a fixed $y \in g$ introduced earlier, so that $\nu = \mathrm{Ad}(\exp 2\pi i y)$. Then the loops $x \in \tilde g(\nu)$ have the property $x(\omega k) = \nu(x(k))$, where $\omega^p = 1$. The twisted loop group $\tilde G(\nu)$ is defined by exactly the same formula applied to loops $X \in \tilde G$, $\nu(X(k)) = X(\omega k)$, where $\nu$ is the inner automorphism determined by $\exp 2\pi i y$.

In this picture a loop $\gamma \in \tilde G(\nu)$ has a Birkhoff decomposition $\gamma(k) = \gamma^-(k).\lambda(k).\gamma^+(k)$, where $\gamma^\pm(k)$ are holomorphic in $|k|^{\pm 1} < 1$, $k \in P$, respectively. The function $\lambda(k)$ is a loop on $S^1$ with values in the diagonal matrices of $G$. Thus we see that a Baker–Akhiezer function $\Phi \in \tilde G(\nu)$ has the Birkhoff decomposition
$$\Phi = \hat\Phi(k).\lambda(k).\exp \hat\Theta(k), \qquad \lambda(k) = \mathrm{diag}(k^{-\ell_1}, \dots, k^{-\ell_p}), \qquad \hat\Theta(k) = \Theta(k) - \log \lambda(k). \tag{85}$$
An important point to observe is that the map $g \mapsto \Phi^{-1} g \Phi$, $g \in \tilde g(\nu)$, is no longer an automorphism of $\tilde g(\nu)$, but rather a homomorphism into a completion $\bar g(\nu)$ of the algebra, consisting of infinite sums of elements in $\tilde g(\nu)$. The completion $\bar g(\nu)$ is not a Lie algebra as the product is not well defined. However the homomorphic image of $\tilde g(\nu)$ is a Lie algebra with a bracket induced by that for $\tilde g(\nu)$.

Let $v^i_j \in \tilde B(\nu)$ denote the $j$th basis element of $\tilde g^{(i)}$ in some arbitrary, but fixed ordering of the elements in $\tilde B(\nu)$. A point $g \in \tilde g(\nu)$ has coordinates $(\tilde y^j_i)$, where
$$g = \sum_{i,j} v^i_j\, \tilde y^j_i. \tag{86}$$
The coordinates of an immersed soliton manifold are $\tilde y^j_i = \tilde y^j_i(x)$, where the functions on the right-hand side are given by
$$\Phi^{-1}\delta\Phi = \sum_{i,j} v^i_j\, \tilde y^j_i(x). \tag{87}$$
This makes sense because of the analytic properties of the Baker–Akhiezer function and because the action of $\delta$ on $\Phi$ is well defined. The variations defined by (87) are obtained by a homomorphism from the elements $A_i$, $B$ in the Lie algebra of $\tilde G(\nu)$, to the completion $\bar g(\nu)$,
$$\Phi^{-1}\delta\Phi = \Phi^{-1}\Big(\sum_i a_i A_i + bB + C\Big)\Phi. \tag{88}$$
From Eq. (43) and the comments following this equation we see that in general the soliton immersions $f = \Phi^{-1}\,\Phi$ involve a flat space $E^\infty$ which is associated with the loop algebra $\tilde g(\nu)$ and are only defined up to a motion. We can therefore take the function $H$ in (43) to be zero. The flat space $E^\infty$ consists of infinite sequences of complex numbers with the inverse limit topology; it is therefore not a Banach space, though it is a Fréchet space. The existence of a loop group and algebra associated with an integrable equation is a manifestation of the infinite group of symmetries which can be associated with the equation or system of Eqs. (20). The original system of Eqs. (19) can be extended by including additional equations which are (infinitesimal) symmetries of the original set of equations. Let
$$\Phi_{x_i} = A_i \Phi, \qquad i = n + 1, \dots \tag{89}$$
be the additional symmetries obtained in this way, with necessarily $A_i \in \tilde g(\nu)$ for all $i$. These symmetries define new integrable equations which together with the original system of equations constitute the hierarchy of integrable equations defined by the given system of Eqs. (19). Either through this result and the use of asymptotics or from directly extending the construction of the Baker–Akhiezer function to depend on $N$ variables, an explicit formula can be found for the Baker–Akhiezer function in $N$ variables $(x_i)$. The function $\Phi$ therefore defines through Eq. (87) an immersion of the complete hierarchy of equations in the space $E^\infty$. As shown in Sect. 2 the immersion can be interpreted as an immersion in an associated Euclidean space by introducing real variables and using the conjugation $\sigma = -\gamma$.

Proposition 1. The soliton systems which can be expressed as an overdetermined system of linear Eqs. (19) are associated with immersions in infinite dimensional flat spaces defined by twisted loop algebras. The complete hierarchy of equations defined by such a system can be immersed in the same space. The soliton systems associated with real forms of the algebras correspond to reductions of the soliton systems and so have a similar representation as an immersion in an infinite dimensional real flat space with a signature which depends upon the real form.

Specific examples are presented in Sect. 6 and the general case will be presented in a later paper.
5. Minimal Immersions and Projections

For each $k \in C^\times$, introduce a homomorphism $k : \tilde g(\nu) \to g(\nu)$, defined on monomials by $g \otimes k^j \mapsto g k^j$, where $g \in g^{(j)} \subset g(\nu)$. We call the homomorphic image $k : \tilde g(\nu) \to g(\nu)$ the evaluation of $\tilde g(\nu)$ at $k \in C^\times$. Let $x(k) = k(x)$ denote the evaluation of $x \in \tilde g(\nu)$ at $k \in C^\times$. In particular the evaluation map $k = 1$ maps $\tilde B(\nu)$ onto $B(\nu)$. For example under this map
$$\tau^{[i]}_{[\alpha]} \mapsto \tau^{(i)}_{[\alpha]}, \qquad i \in Z. \tag{90}$$
If, as in the previous section, $v^i_j \in \tilde B(\nu)$ denotes the $j$th basis element of $\tilde g^{(i)}$ in some arbitrary, but fixed ordering of the elements in $\tilde B(\nu)$, then let
$$v^i_j \mapsto w^h_j, \qquad i \in Z, \qquad h = i \bmod \pm Z_p \tag{91}$$
under this evaluation. The evaluation at $k$ of (87) can then be written
$$\Phi^{-1}(k)\,\delta\Phi(k) = \sum_{ij} y^j_i(x; k)\, w^i_j, \tag{92}$$
where $B := \{\tau^{(i)}_{[\alpha]}, \psi_j^{\pm(i)}\} \cong \{w^i_j\}$. Provided $\dim g(\nu) = m > n$ this defines an immersion of the soliton system in $g(\nu)$. The complex immersion is associated with a real immersion in a Euclidean space $E^{2m}$, obtained by introducing real coordinates, which corresponds to a solution of the Gauss–Codazzi–Ricci equations (Sect. 2).
Corollary 1. If $\dim g(\nu) = m > n$ then the soliton system (19) can be immersed in the flat space defined by $g(\nu)$ with local coordinates $y^j_i = y^j_i(x; k)$ given by (92). Associated with every complex immersion is an immersion in a real Euclidean space $E^{2m}$ defined by introducing real coordinates. Such an immersion exists provided $n < 2m$. Moreover by choosing different evaluation maps an infinity of such immersions can be obtained.

An immersion of a soliton system in a flat space of smaller dimension than that defined by Corollary 1 exists if the soliton system (20) can be imbedded in a zero curvature matrix representation (19) associated with a subalgebra of the given representation. This is only possible if the given finite dimensional Lie algebra is semisimple, in which case the soliton system might belong to a simple Lie algebra, or if the soliton system corresponds to a restriction of the given simple Lie algebra. For example the soliton system might be associated with the Lie algebra $d_\ell$ rather than $a_{2\ell-1}$ ($\dim a_{2\ell-1} = (2\ell-1)(2\ell+1)$, $\dim d_\ell = \ell(2\ell-1)$). A slightly different situation arises if the system is connected with a real form of the simple Lie algebra $g$. In this case the complex flat space $E^\infty$ becomes a real space and the evaluation map defines immersions in a real space $E^q$, $q = \dim g_R$, where $g_R$ is the real form, and $q < m$. We consider examples of this type in the next section.

If $m < n$ (or $2m < n$ for a real immersion) we can still consider projections of the $n$ dimensional immersed manifold onto submanifolds $E^m$ ($E^{2m}$) of the infinite dimensional flat space $E^\infty$. The only projections which we shall consider are the ones which arise naturally from the evaluation map.

Corollary 2. If $\dim g = m \le n$ then the soliton system (19) which is immersed in the infinite dimensional flat space $E^\infty$ has a projection onto the flat space defined by $g(\nu)$ with local coordinates $y^j_i = y^j_i(x; k)$ given by (92).
Real projections in a Euclidean space $E^{2m}$ are defined for $2m < n$ by taking real coordinates. There is an infinity of projections which can be obtained by using different evaluation maps.

If we fix some of the variables $(x_i)$ then we obtain a subimmersion of the soliton immersion. Thus the usual soliton surface immersions considered in the literature [1, 2, 3] correspond to subimmersions of the immersed soliton hierarchy.

6. Immersions, Projections and Subimmersions

In this section we present some examples of soliton immersions, projections and subimmersions connected with the AKNS system [18]. This well known system is convenient to use since the soliton surface immersions of some of the equations in the system have been considered by other authors [1, 2, 3]. It also contains many important physical
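integrable systems. The AKNS system belongs to the loop groups and algebras connected with $sl(2, C)$. There are two conjugacy classes in the Weyl group for this algebra which can be identified with the identity element and the Coxeter element of $W$. The identity element defines the homogeneous loop representation $\widetilde{sl}(2, C)(1) \cong \widetilde{sl}(2, C)$ and the Coxeter element gives rise to the principal loop representation $\widetilde{sl}(2, C)(\theta)$, where $1$ and $\theta$ are respectively the identity element and an involution, $\theta^2 = 1$, defined below, of $\mathrm{Aut}(sl(2, C))$. Let
$$\alpha = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad x_\alpha = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad x_{-\alpha} = \begin{pmatrix} 0 & 0 \\ -1 & 0 \end{pmatrix} \tag{93}$$
denote a basis of $sl(2, C)$ which satisfies the standard conditions given in Sect. 3, $\epsilon(\alpha, -\alpha) = -1$ and $\langle x, y \rangle := \mathrm{Tr}\, xy$, $x, y \in sl(2, C)$. The involution $\theta$ introduced above is defined by
$$\theta(\alpha) = -\alpha, \qquad \theta(x_{\pm\alpha}) = x_{\mp\alpha}. \tag{94}$$
Thus the algebra $sl(2, C)(\theta) = h^{(0)} \oplus C x^{(0)}_{[\alpha]} \oplus h^{(1)} \oplus C x^{(1)}_{[\alpha]}$ is given by
$$h = h^{(0)} \oplus h^{(1)} = \{0\} \oplus C\alpha, \qquad x^{(0)}_{[\alpha]} = x_\alpha + \theta x_\alpha = x_\alpha + x_{-\alpha}, \qquad x^{(1)}_{[\alpha]} = x_\alpha - \theta x_\alpha = x_\alpha - x_{-\alpha}. \tag{95}$$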
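The relations (93)–(95) can be verified with exact integer arithmetic. The sketch below makes one assumption that is not stated in the text: it realises $\theta$ as conjugation by the matrix $J = x_\alpha + x_{-\alpha}$, which is checked in the code to reproduce (94). It then confirms the eigenspace decomposition (95) and the twisted loop condition $x(\omega k) = \theta(x(k))$ with $\omega = -1$ for a sample loop. All function names are illustrative.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def trace_form(x, y):
    """The bilinear form <x, y> = Tr(xy)."""
    p = matmul(x, y)
    return p[0][0] + p[1][1]


# Basis (93).
ALPHA = [[1, 0], [0, -1]]
X_POS = [[0, 1], [0, 0]]
X_NEG = [[0, 0], [-1, 0]]

# Assumed realisation of theta as an inner automorphism Ad(J).
J = [[0, 1], [-1, 0]]
J_INV = [[0, -1], [1, 0]]


def theta(x):
    return matmul(matmul(J, x), J_INV)


# Graded root vectors of (95).
X0 = add(X_POS, X_NEG)             # grade 0, theta-eigenvalue +1
X1 = add(X_POS, scale(-1, X_NEG))  # grade 1, theta-eigenvalue -1


def sample_loop(k):
    """A loop with even powers on X0 and odd powers on X1 and alpha."""
    return add(add(scale(k ** 2, X0), scale(k, X1)), scale(k ** 3, ALPHA))


print(theta(ALPHA), theta(X_POS), theta(X_NEG))
print(sample_loop(-3) == theta(sample_loop(3)))
```

Since $h^{(0)} = \{0\}$ here, $\alpha$ can only appear with odd powers of $k$, which is why the sample loop pairs $\alpha$ with $k^3$.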
An orthonormal basis for the two gradations is provided by $B(1) = \{\tau^{(0)}_{[\alpha]} = x_\alpha,\ \tau^{(0)}_{[-\alpha]} = x_{-\alpha},\ \psi^{(0)}_1 = \alpha/\sqrt 2\} = \{w^0_i\}$ and $B(\theta) = \{\tau^{(0)}_{[\alpha]} = (x_\alpha + x_{-\alpha})/\sqrt 2,\ \tau^{(1)}_{[\alpha]} = (x_\alpha - x_{-\alpha})/\sqrt 2,\ \psi^{(1)}_1 = \alpha/\sqrt 2\} = \{w^0_1, w^1_1, w^1_2\}$. The loop Lie algebras $\widetilde{sl}(2, C)(1)$ and $\widetilde{sl}(2, C)(\theta)$ have the orthonormal bases $\tilde B(1)$ and $\tilde B(\theta)$ respectively, where
$$\tilde B(\nu) = \{\tau^{[j]}_{[\alpha]} = \tau^{(j)}_{[\alpha]} \otimes k^j,\ \psi^{[j]}_1 = \psi^{(j)}_1 \otimes k^j : \tau^{(j)}_{[\alpha]}, \psi^{(j)}_1 \in B(\nu),\ j \in Z\}. \tag{96}$$
Let the standard basis elements be given the ordering $\tilde B(1) = \{v^j_1 = \tau^{[j]}_{[\alpha]},\ v^j_2 = \tau^{[j]}_{[-\alpha]},\ v^j_3 = \psi^{[j]}_1\}$ and $\tilde B(\theta) = \{v^{2j}_1 = \tau^{[2j]}_{[\alpha]},\ v^{2j+1}_1 = \tau^{[2j+1]}_{[\alpha]},\ v^{2j+1}_2 = \psi^{[2j+1]}_1\}$. Under the evaluation map $k = 1$ the basis elements in $\tilde B(\nu)$ are mapped onto the basis elements in $B(\nu)$. An element $g \in \widetilde{sl}(2, C)(\nu)$, $\nu = 1, \theta$, can therefore be identified with a point of the infinite dimensional flat $C$-space $E^\infty$ with local coordinates $(\tilde y^j_i)$, where $g = \tilde y^j_i v^i_j$. The projections onto the flat space $E^m$, where $m = \dim g$, defined by the evaluation maps, are given by $k(g) = y^j_i(k)\, w^i_j$, where $(y^j_i(k))$ are the local coordinates on $E^m$. The AKNS system of equations is given by (19) with
$$P_1 := A_1 = \begin{pmatrix} ik & q \\ r & -ik \end{pmatrix}, \qquad P_j := A_j = \begin{pmatrix} a_j & b_j \\ c_j & -a_j \end{pmatrix}, \quad j > 1. \tag{97}$$
For this system the variable $x := x_1$ often has the interpretation of a spatial variable and the variables $x_j$, $j > 1$, are time-like, though we do not make this distinction. Families of integrable equations can be obtained by making assumptions about the dependency of the matrix functions $A_j$ on $k$. This is useful if the fundamental forms (31) associated with the immersions are required (see below). We shall exclude the case where either $q$ or $r$ is constant, which corresponds to the principal form of the algebra.
A powerful way of displaying the integrable equations associated with this system is to use the inverse scattering method for the case of smooth functions $r$, $q$ which are in the Schwartz class, that is, they decay rapidly to zero, along with all their derivatives, as $|x_i| \to \infty$. In particular if we take the first equation $\Psi_{,x_1} = A_1 \Psi$ to define the scattering problem and write $x = x_1$, then the integrable equations are given by [18]
$$\begin{pmatrix} r_{,x_j} \\ -q_{,x_j} \end{pmatrix} - \Omega_j(L) \begin{pmatrix} r \\ q \end{pmatrix} = 0, \qquad j > 1, \tag{98}$$
where the recursion operator $L$ is defined by
$$L := \frac{1}{2i} \begin{pmatrix} \partial_x + 2r \int_x^\infty dy\, q & -2r \int_x^\infty dy\, r \\ 2q \int_x^\infty dy\, q & -\partial_x - 2q \int_x^\infty dy\, r \end{pmatrix}. \tag{99}$$
The dispersion relations of the $j$th linearised equations for $q$, $r$, $\omega_q(k)$ and $\omega_r(k)$ respectively, are defined by the function $\Omega_j$ (which can be rational)
$$\Omega_j(k/2) = -i\omega_r(k) = i\omega_q(-k). \tag{100}$$
Rather than investigate the Baker–Akhiezer functions for this system we produce the wave functions 9 which correspond to soliton solutions of the complete system (98) using techniques associated with the inverse method. They can be obtained, for example, by using Darboux transformations or by solving a Riemann-Hilbert problem. In this case the Riemann-Hilbert problem is associated with the real line rather than the circle. Since R ∪ ∞ is isomorphic to S 1 under stereographic projection θ 7→ 2 tan(θ/2), θ ∈] − π, π], it might appear that the wave function which is defined on the line can be directly translated into the situation discussed in the paper. However it is simpler to analytically continue the wave functions we obtain into the complex plane, ˆ j ; k) exp 2(xj ; k), 9(xj ; k) = 9(x eθ+θ eθ i (k − k ) (k −k) 1 1 1 (k1 −k) ˆ j ; k) = 1 + , 9(x θ θ+θ (1 + eθ+θ ) −i (k e−k) − e 1 (k −k) 1
2(xj ; k) = −φα = −ikx1 α −
1X j (k)xj α. 2
(101)
j>1
In this equation $k \in R$ and the eigenvalues defining the soliton solution are located in the upper ($\Im k_1 > 0$) and lower ($\Im \bar k_1 < 0$) $k$-half plane and $\theta = 2ik_1 x_1 + \sum_j \Omega_j(k_1) x_j + \delta$, $\bar\theta = -2i\bar k_1 x_1 - \sum_j \Omega_j(\bar k_1) x_j - \bar\delta$, where $\delta$, $\bar\delta$ are arbitrary constants (determined by the initial data). The case when the $\Omega_j$ are polynomials is particularly simple to investigate. For this case we choose $k_1$ and $\bar k_1$ to lie in the annulus $1 < |k| < \rho < \infty$. Then $\Psi$ has a power series expansion in $k$ which is convergent on the unit circle with an essential singularity at $\infty$. The immersion is defined by the function
$$\Psi^{-1}\delta\Psi = \exp(-\Theta)\big(\hat\Psi^{-1}\delta\hat\Psi + \delta\Theta\big)\exp\Theta = F\,\delta k + G\,\delta\theta + \bar G\,\delta\bar\theta - \delta\phi\,\alpha, \tag{102}$$
where
F =
(k1 − k 1 ) (1 + eθ+θ )
eθ+θ (k1 −k)(k1 −k) θ−2φ
ie − (k 2 1 −k) θ
ieθ+2φ (k1 −k)2
− (k
eθ+θ 1 −k)(k1 −k)
i(k1 −k)e e (k 1 − k1 )eθ (k1 −k) − (k1 −k)2 G= −2φ θ (1 + eθ+θ )2 − (kie −k) − (k e−k) 1
G=
(k 1 − k1 )eθ (1 + eθ+θ )2
,
2θ+2φ
,
1
eθ ie2φ (k1 −k) (k1 −k) θ i(k1 −k)e2θ−2φ − (k1e−k) (k1 −k)2
! .
(103)
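Consider the variation $\delta = \partial_k$. In this case
$$\Psi^{-1}\Psi_{,k} = \frac{(\bar k_1 - k_1)}{(1 + e^{\theta+\bar\theta})} \left( \frac{i e^{\theta - 2\phi}}{(k_1 - k)^2}\, x_{-\alpha} + \frac{e^{\theta+\bar\theta}}{(\bar k_1 - k)(k_1 - k)}\, \alpha + \frac{i e^{\bar\theta + 2\phi}}{(\bar k_1 - k)^2}\, x_\alpha \right) - \phi_{,k}\, \alpha. \tag{104}$$
The function $\phi$ has a Laurent polynomial expansion in $k$, the precise form of which depends upon the functions $\Omega_j$,
$$\phi_{,k} = \sum_j \phi_j k^j. \tag{105}$$
If the number of commuting flows is $n < \infty$, or alternatively we consider a section of the immersed manifold, then there exists $m > 0$ such that $\phi_j = 0$ for $\pm j > m$.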
Fig. 1. The immersion for the AKNS system with $n = 2$ and $\Omega_2 = 4ik^2$ represented by the coordinates $(\tilde u^1_j, \tilde v^2_j, \tilde v^3_j)$. The pictures (a), (b) show the first two components $j = 0, 1$ for $k_1 = 2 + i$, $\bar k_1 = 2 - i$, which is in fact the compact case. The third picture (c) is the component $j = 0$ for the case $k_1 = 2 + i$, $\bar k_1 = 3 - 2i$
For the case when the functions $\Omega_j$ are polynomial, $\phi_j \neq 0$ for $0 \le j \le m$. If there are no constraints on $q$ and $r$ then the basis is $\tilde B(1)$, and the coordinates for the immersed soliton manifold can be explicitly calculated because of the analytic properties of $\Psi$. The coordinates corresponding to the basis elements $v^i_3 = \psi^{[i]}_1$, for example, are obtained from the power series expansion of $1/(\bar k_1 - k)(k_1 - k)$. In this way we get
$$\tilde y^3_j = \begin{cases} 0, & j < 0, \\ \sqrt 2 \left[ \big(k_1^{-(j+1)} - \bar k_1^{-(j+1)}\big) \dfrac{e^{\theta+\bar\theta}}{(1 + e^{\theta+\bar\theta})} - \phi_j \right], & j \ge 0. \end{cases} \tag{106}$$
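The step from (104) to (106) is a partial-fraction expansion followed by two geometric series. As a sketch, assuming that the $\alpha$-component of (104) carries the prefactor $(\bar k_1 - k_1)/(1 + e^{\theta+\bar\theta})$ and that $|k| < \min(|k_1|, |\bar k_1|)$ so that both series converge:

```latex
\frac{\bar k_1 - k_1}{(\bar k_1 - k)(k_1 - k)}
  = \frac{1}{k_1 - k} - \frac{1}{\bar k_1 - k}
  = \sum_{j \ge 0} \Bigl( k_1^{-(j+1)} - \bar k_1^{-(j+1)} \Bigr) k^{j},
\qquad
\frac{1}{c - k} = \sum_{j \ge 0} \frac{k^{j}}{c^{\,j+1}} \quad (|k| < |c|).
```

Multiplying by $e^{\theta+\bar\theta}/(1 + e^{\theta+\bar\theta})$, subtracting the coefficient $\phi_j$ of (105), and rescaling by $\sqrt 2$ because the basis element is $\psi_1 = \alpha/\sqrt 2$ gives the $j \ge 0$ coefficients of the expansion.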
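The other coordinates are obtained by utilising the analytic properties of the exponential function,
$$\frac{e^{2\phi}}{(k - \bar k_1)^2} = \Big( \sum_{j \ge 0} \frac{(2\phi(k))^j}{j!} \Big) \Big( \sum_{s} s\, \frac{k^{s-1}}{\bar k_1^{\,s+1}} \Big) = \sum_{j \ge 0} a_j k^j, \qquad \frac{e^{-2\phi}}{(k - k_1)^2} = \sum_{j \ge 0} b_j k^j. \tag{107}$$
They are given by
$$\tilde y^1_j = \begin{cases} \dfrac{i a_j (\bar k_1 - k_1) e^{\bar\theta}}{(1 + e^{\theta+\bar\theta})}, & j \ge 0, \\ 0, & j < 0, \end{cases} \qquad \tilde y^2_j = \begin{cases} \dfrac{i b_j (\bar k_1 - k_1) e^{\theta}}{(1 + e^{\theta+\bar\theta})}, & j \ge 0, \\ 0, & j < 0. \end{cases} \tag{108}$$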
The functions $a_j$, $b_j$ can be calculated once the $\Omega_j$ are specified. For example the first two equations in the nonlinear Schrödinger hierarchy are defined by $\Omega_2 = 4ik^2$, $\Omega_3 = 8ik^3$ and the corresponding equations are
$$\begin{aligned} & i r_{,x_2} - r_{,2x_1} + 2r^2 q = 0, \quad i q_{,x_2} + q_{,2x_1} - 2q^2 r = 0 && (\Omega_2 = 4ik^2), \\ & r_{,x_3} + r_{,3x_1} - 6rq\, r_{,x_1} = 0, \quad q_{,x_3} + q_{,3x_1} - 6rq\, q_{,x_1} = 0 && (\Omega_3 = 8ik^3). \end{aligned} \tag{109}$$
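The first few nontrivial coordinates for this case ($n = 3$) are
$$a_0 = \frac{1}{\bar k_1^2}, \qquad a_1 = \frac{2}{\bar k_1^3}(1 + i x_1 \bar k_1), \qquad a_2 = \frac{1}{\bar k_1^4}\big[(4i x_2 - 2(x_1)^2)\bar k_1^2 + 4i x_1 \bar k_1 + 3\big], \tag{110}$$
$$b_0 = \frac{1}{k_1^2}, \qquad b_1 = \frac{2}{k_1^3}(1 - i x_1 k_1), \qquad b_2 = \frac{1}{k_1^4}\big[-(4i x_2 + 2(x_1)^2) k_1^2 - 4i x_1 k_1 + 3\big]. \tag{111}$$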
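The closed forms (110) and (111) can be cross-checked against a direct truncated power series. The sketch below assumes $2\phi = 2ik x_1 + 4ik^2 x_2 + 8ik^3 x_3$ (that is, $\Omega_2 = 4ik^2$, $\Omega_3 = 8ik^3$) and that $a_j$, $b_j$ are the Taylor coefficients of $e^{2\phi}/(k - \bar k_1)^2$ and $e^{-2\phi}/(k - k_1)^2$ respectively. The helper names and the sample parameter values are illustrative only.

```python
def exp_series(p, n):
    """Coefficients of exp(p(k)) up to k^(n-1), for a series p with p[0] == 0.

    Uses E' = p' E, i.e. m * e[m] = sum_{s=1}^{m} s * p[s] * e[m - s].
    """
    e = [1 + 0j] + [0j] * (n - 1)
    for m in range(1, n):
        top = min(m, len(p) - 1)
        e[m] = sum(s * p[s] * e[m - s] for s in range(1, top + 1)) / m
    return e


def inverse_square_series(c, n):
    """Coefficients of 1/(k - c)^2 = sum_j (j + 1) k^j / c^(j + 2)."""
    return [(j + 1) / c ** (j + 2) for j in range(n)]


def product_series(a, b, n):
    """Cauchy product of two series, truncated to n terms."""
    return [sum(a[s] * b[j - s] for s in range(j + 1)) for j in range(n)]


def series_coefficients(sign, x1, x2, x3, pole, n=3):
    """Taylor coefficients of exp(sign * 2 * phi) / (k - pole)^2."""
    two_phi = [0j, sign * 2j * x1, sign * 4j * x2, sign * 8j * x3]
    return product_series(exp_series(two_phi, n),
                          inverse_square_series(pole, n), n)


def closed_form_a(x1, x2, c):
    """Eq. (110) with c standing for bar k_1."""
    return [1 / c ** 2,
            2 * (1 + 1j * x1 * c) / c ** 3,
            ((4j * x2 - 2 * x1 ** 2) * c ** 2 + 4j * x1 * c + 3) / c ** 4]


def closed_form_b(x1, x2, c):
    """Eq. (111) with c standing for k_1."""
    return [1 / c ** 2,
            2 * (1 - 1j * x1 * c) / c ** 3,
            (-(4j * x2 + 2 * x1 ** 2) * c ** 2 - 4j * x1 * c + 3) / c ** 4]


# Sample values (assumptions for the check only).
x1, x2, x3 = 0.3, -0.7, 1.1
k1, kbar1 = 2 + 1j, 2 - 1j

a_series = series_coefficients(+1, x1, x2, x3, kbar1)
b_series = series_coefficients(-1, x1, x2, x3, k1)
err_a = max(abs(u - v) for u, v in zip(a_series, closed_form_a(x1, x2, kbar1)))
err_b = max(abs(u - v) for u, v in zip(b_series, closed_form_b(x1, x2, k1)))
print(err_a, err_b)
```

The variable $x_3$ does not enter $a_0$, $a_1$, $a_2$ or $b_0$, $b_1$, $b_2$, since it first appears at order $k^3$; it is kept in the series only so that higher coefficients can be generated by raising `n`.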
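Immersions into a real space are defined by introducing real coordinates $\tilde y^h_j = \tilde u^h_j + i\tilde v^h_j$. Projections onto $E^3$ defined by evaluations are easy to obtain from (104). Thus for example the evaluation at $k = 1$ gives rise to a projection with respect to the basis $B(1) = \{\tau^{(0)}_{[\alpha]} = x_\alpha,\ \tau^{(0)}_{[-\alpha]} = x_{-\alpha},\ \psi_1 = \alpha/\sqrt 2\} = \{w_1, w_2, w_3\}$ given by
$$y^1 = \frac{i(\bar k_1 - k_1) e^{\bar\theta + 2\phi(1)}}{(1 + e^{\theta+\bar\theta})(\bar k_1 - 1)^2}, \qquad y^2 = -\frac{i(\bar k_1 - k_1) e^{\theta - 2\phi(1)}}{(1 + e^{\theta+\bar\theta})(k_1 - 1)^2}, \qquad y^3 = \frac{\sqrt 2 (\bar k_1 - k_1) e^{\theta+\bar\theta}}{(1 + e^{\theta+\bar\theta})(\bar k_1 - 1)(k_1 - 1)} - \sqrt 2\, \phi_{,k}(1). \tag{112}$$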
Since $\phi = \phi(x_j; k)$ the projections can involve an infinite number of independent variables. If we choose subimmersions of the manifold immersed in $E^\infty$ then we can reduce the number of independent variables. The usual subimmersions are codimension $n$ immersions given by $A_n = \{x_j = 0 : j > n\}$. The case $A_2$ for evaluations corresponds to soliton surfaces immersed in a three dimensional complex space. The immersions, projections and sections of soliton manifolds into real spaces are obtained as usual by introducing real coordinates $\tilde y^h_j = \tilde u^h_j + i\tilde v^h_j$ (which doubles the
Fig. 2. The subimmersion for the AKNS system with $n = 3$ and $\Omega_2 = 4ik^2$, $\Omega_3 = 8ik^3$ represented by the coordinates $(\tilde u^1_j, \tilde v^2_j, \tilde v^3_j)$. The pictures (a), (b), (c) show the first three components $j = 0, 1, 2$ for the case $k_1 = 2 + i$, $\bar k_1 = 2 - i$. The relation between $k_1$ and $\bar k_1$ and the choice of components plotted mean that these are also the pictures for the compact case with $k_1 = 2 + i$. Picture (d) is the Taylor series sum of the surfaces (a)–(c)
Fig. 3. Subimmersions of the AKNS system of equations defined by $\Omega_1 = 4ik^2$, $\Omega_2 = 8ik^3$ in complex 3-space represented by the coordinates $(u^1, v^2, v^3)$. In each figure $k_1 = 2 + i$, $\bar k_1 = 3 - 2i$: (a) surface subimmersion $x_3 = 0$, $k = 0$; (b) surface subimmersion $x_3 = 0$, $k = 1$; (c) projection $k = 1$; (d) surface subimmersion $x_2 = 0$, $k = 1$; (e) surface subimmersion $x_1 = 0$, $k = 1$
dimension of the flat space in the case of finite $m$). In this case another projection can be obtained if $2m > n$, where $n$ is the number of coordinates of a section, by selecting $p$ real coordinates such that $p > n$. Thus for example associated with a soliton surface immersion we can define projections onto three dimensional Euclidean spaces. Figs. 1 and 2 are obtained in this fashion.

Most physical soliton systems however are associated with the real forms of the loop algebras [11]. Consider first the real forms of $sl(2, C)$. The compact real form $su(2)$ is defined by the conjugation $\sigma = \theta *$, where $*$ is the operation of complex conjugation. Thus a basis for $su(2)$ is
$$su(2) = R(x_\alpha + x_{-\alpha}) \oplus Ri(x_\alpha - x_{-\alpha}) \oplus Ri\alpha. \tag{113}$$
The other real forms are obtained from $su(2)$ by involutive automorphisms. If $\tau$ is an involutive automorphism of $su(2)$ then let $su(2) = p_0 \oplus p_1$ be the eigenspace decomposition with respect to $\tau$. Then $g_0 = p_0 \oplus q_0$, $q_0 := ip_1$, is a real form of $sl(2, C)$. Thus $\tau = *$ gives $sl(2, R)$ and $\tau = \mathrm{Ad}(I_{1,1})$, $I_{1,1} = \mathrm{diag}(-1, 1)$, defines $su(1, 1)$. The decomposition of the algebras in each case is given by
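$$sl(2, R) = Rx_\alpha \oplus Rx_{-\alpha} \oplus R\alpha, \qquad su(1, 1) = R(x_\alpha - x_{-\alpha}) \oplus Ri(x_\alpha + x_{-\alpha}) \oplus Ri\alpha. \tag{114}$$
The restriction of $\langle \cdot, \cdot \rangle$ to these algebras results in flat spaces $E^3$ with the metrics
$$\langle \cdot, \cdot \rangle_{su(2)} = \mathrm{diag}(-1, -1, -1), \qquad \langle \cdot, \cdot \rangle_{su(1,1)} = \mathrm{diag}(+1, -1, -1), \qquad \langle \cdot, \cdot \rangle_{sl(2,R)} = \mathrm{diag}(+1, +1, -1). \tag{115}$$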
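The compact real form of $\widetilde{sl}(2, C)(1)$ is obtained from the fixed point set of the involution $\tilde\sigma = \theta *$, where $\tilde\sigma(k) = k^{-1}$. The compact real form is spanned by the basis vectors $\{d^j_1 = (x_\alpha \otimes k^j + x_{-\alpha} \otimes k^{-j}),\ d^j_2 = i(x_\alpha \otimes k^j - x_{-\alpha} \otimes k^{-j}),\ d^j_3 = i\alpha \otimes (k^j + k^{-j}),\ d^j_4 = \alpha \otimes (k^j - k^{-j})\}$,
$$\widetilde{su}(2)(1) = \coprod_{j \in Z} Rd^j_1 \oplus Rd^j_2 \oplus Rd^j_3 \oplus Rd^j_4. \tag{116}$$
The integrable equations are connected with the real forms obtained from the involution $\kappa(k) = k^{-1}$ which acts as the identity on $g$. Thus we have
$$\begin{aligned} \widetilde{su}(2)(\kappa) &= \coprod_{j \in Z} R(x_\alpha + x_{-\alpha}) \otimes k^j \oplus Ri(x_\alpha - x_{-\alpha}) \otimes k^j \oplus Ri\alpha \otimes k^j, \\ \widetilde{su}(2)(*\kappa) &= \coprod_{j \in Z} Rx_\alpha \otimes k^j \oplus Rx_{-\alpha} \otimes k^j \oplus R\alpha \otimes k^j, \\ \widetilde{su}(2)(\tau \circ \kappa) &= \coprod_{j \in Z} R(x_\alpha - x_{-\alpha}) \otimes k^j \oplus Ri(x_\alpha + x_{-\alpha}) \otimes k^j \oplus Ri\alpha \otimes k^j. \end{aligned} \tag{117}$$
These real forms correspond to the standard reductions of the AKNS method:
$$\begin{aligned} \widetilde{su}(2)(\kappa) &: \quad r = -q^*, \qquad \Psi(k) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \Psi^*(k^*) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \\ \widetilde{su}(2)(*\kappa) &: \quad r, q \text{ are real}, \qquad \Psi(k) = \Psi^*(k), \\ \widetilde{su}(2)(\tau \circ \kappa) &: \quad r = q^*, \qquad \Psi(k) = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \Psi^*(k) \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. \end{aligned}$$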
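The $\widetilde{su}(2)(\kappa)$ reduction leads to an immersion in a Euclidean space $E^\infty$ with the coordinates $\{\tilde u^1_j, \tilde u^2_j, \tilde u^3_j\}$ given in terms of the coordinates $\{\tilde y^j_i\}$, Eqs. (106), (108), by
$$\tilde u^1_j = \frac{1}{\sqrt 2}(\tilde y^1_j + \tilde y^2_j), \qquad \tilde u^2_j = \frac{i}{\sqrt 2}(\tilde y^2_j - \tilde y^1_j), \qquad \tilde u^3_j = -i\tilde y^3_j, \qquad j \in Z. \tag{118}$$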
For the real forms of the loop algebras, projections onto the real forms of the algebra $sl(2, C)$ require that the evaluations only involve real values of $k$. Thus the evaluation at $k = 1$ is, for the real form $\widetilde{sl}(2)(\kappa)$,
$$\begin{aligned} u^1 &= \frac{i(k_1^* - k_1)}{\sqrt 2 (1 + e^{\theta+\theta^*})} \left( \frac{e^{\theta^* + 2\phi(1)}}{(k_1^* - 1)^2} + \frac{e^{\theta - 2\phi(1)}}{(k_1 - 1)^2} \right), \\ u^2 &= \frac{(k_1^* - k_1)}{\sqrt 2 (1 + e^{\theta+\theta^*})} \left( \frac{e^{\theta^* + 2\phi(1)}}{(k_1^* - 1)^2} - \frac{e^{\theta - 2\phi(1)}}{(k_1 - 1)^2} \right), \\ u^3 &= -\frac{\sqrt 2\, i(\bar k_1 - k_1) e^{\theta+\bar\theta}}{(1 + e^{\theta+\bar\theta})(\bar k_1 - 1)(k_1 - 1)} + \sqrt 2\, i\phi_{,k}(1). \end{aligned} \tag{119}$$

Fig. 4. The surface immersion for the compact form of the AKNS system with $n = 2$, $\Omega_2 = 4ik^2$ and $k_1 = 2 + i$: (a) $k = 0$, (b) $k = 1$, (c) $k = -1$

Fig. 5. Subimmersions for the compact form of the AKNS system, $n = 3$, with $\Omega_2 = 4ik^2$, $\Omega_3 = 8ik^3$, and $x_1 = 0$: (a) $k = -1$, (b) $k = -0.1$, (c) $k = 0$, (d) $k = 0.1$, (e) $k = 1$. The tubular helix collapses and thickens and then expands out again as the parameter changes from $-1$ to $1$
Finally we consider the fundamental forms associated with the Gauss–Codazzi–Ricci equations which the AKNS system defines when $n = 3$, $\Omega_2 = 4ik^2$ and $\Omega_3 = 8ik^3$ in the compact case. From Eq. (47) we have $g_{ij} = \langle A_{i,k}, \delta A_{j,k} \rangle$ and a calculation gives the metric explicitly for an immersion in $R^3$ (the functions are all evaluated at a specific value of $k$),
$$g_{ij}(k/2) = -2 \begin{pmatrix} 1 & 2k & 3k^2 \\ \cdot & 4(k^2 + |q|^2) & 12k^2 + 8k|q|^2 + 2i(q^* q_{,x_1} - q q^*_{,x_1}) \\ \cdot & \cdot & 9k^4 + 16k^2|q|^2 + 4ki(q^* q_{,x_1} - q q^*_{,x_1}) + 4 q_{x_1} q^*_{x_1} \end{pmatrix}. \tag{120}$$
The tensor $b_{i\alpha j}$ however depends upon the immersion function $\Psi$, (49), and so does not have a simple expression. It can however still be calculated, for example
$$b_{1\alpha 1} = \left\langle \begin{pmatrix} 0 & -2iq \\ 2ir & 0 \end{pmatrix}, e_\alpha \right\rangle, \qquad b_{1\alpha 2} = \left\langle \begin{pmatrix} -2k^2 + 3qr & -4ikq + q_{,x_1} \\ 4ikr - r_{,x_1} & -2k^2 + 3qr \end{pmatrix}, e_\alpha \right\rangle. \tag{121}$$
The vectors $e_\alpha$ are then calculated from $\mathrm{Ad}(\Psi) \cdot n_\alpha$, where $n_\alpha \in \{(x_\alpha + x_{-\alpha})/\sqrt 2,\ i(x_\alpha - x_{-\alpha})/\sqrt 2,\ i\alpha/\sqrt 2\}$.

Acknowledgement. I would like to thank Colin Rogers and Wolfgang Schieff for the hospitality shown me during my stay at the University of New South Wales, and for introducing me to the area of soliton immersions. This work was supported in part by the National Science Foundation under Grant DMS-9404 290.
References
1. Sym, A.: Soliton surfaces. Lett. Nuovo Cimento 33, 394 (1982)
2. Sym, A.: II. Geometric unification of solvable nonlinearities. Lett. Nuovo Cimento 36, 307–312 (1985)
3. Sym, A.: Soliton surfaces and their applications (soliton geometry from spectral problems). In: Geometrical aspects of the Einstein equations and integrable systems. Lect. Notes Phys. 239. Berlin–Heidelberg–New York: Springer, 1985, pp. 154–231
4. Pohlmeyer, K.: Integrable Hamiltonian systems and interactions through quadratic constraints. Commun. Math. Phys. 46, 207 (1976)
5. Lund, F.: Ann. Phys. (N.Y.) 115, 251 (1978)
6. Fokas, A. S., Gelfand, I. M.: Surfaces on Lie groups, on Lie algebras, and their integrability. Commun. Math. Phys. 177, 203–220 (1996)
7. Zakharov, V. E., Shabat, A. B.: Integration of nonlinear equations of mathematical physics by the method of inverse scattering II. Funct. Anal. Appl. 13, 166–174 (1979)
8. Eisenhart, L. P.: Riemannian geometry. Princeton, NJ: Princeton U.P., 1964
9. Spivak, M.: Comprehensive introduction to differential geometry 3. Boston, MA: Publish or Perish Inc., 1975
10. Flaschka, H., Newell, A. C.: Integrable systems of nonlinear evolution equations. In: Dynamical systems and their applications. Ed. Moser, J. Lect. Notes Phys. 38. Berlin–Heidelberg–New York: Springer, 1975
11. Dodd, R. K.: A restricted classification of integrable equations. Preprint 1997
12. Lepowsky, J.: Calculus of twisted vertex operators. Proc. Natl. Acad. Sci. USA 82, 8295–8299 (1985)
13. Frenkel, I. B., Kac, V. G.: Basic representations of affine Lie algebras and dual resonance models. Invent. Math. 62, 23–66 (1981)
14. Kac, V. G.: An elucidation of “Infinite dimensional algebras...and the very strange formula”. E8 and the cube root of the modular invariant j. Adv. in Math. 35, 264–273 (1980)
15. Krichever, I. M.: Integration of non-linear equations by methods of algebraic geometry. Funct. Anal. Appl. 11, 12–26 (1977)
16. Krichever, I. M.: Methods of algebraic geometry in the theory of non-linear equations. Russ. Math. Surv. 32, 185–213 (1977)
17. Jimbo, M., Miwa, T., Ueno, K.: Monodromy preserving deformation of linear ordinary equations with rational coefficients I. Physica 2D, 306–352 (1981)
18. Ablowitz, M. J., Kaup, D. J., Newell, A. C., Segur, H.: The inverse scattering transform-Fourier analysis for nonlinear problems. Stud. Appl. Math. 53, 249–315 (1974)

Communicated by T. Miwa
Commun. Math. Phys. 197, 667 – 712 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Duality Symmetries and Noncommutative Geometry of String Spacetimes

Fedele Lizzi*, Richard J. Szabo
Department of Physics – Theoretical Physics, University of Oxford, 1 Keble Road, Oxford OX1 3NP, UK
* Permanent address: Dipartimento di Scienze Fisiche, Università di Napoli Federico II and INFN, Sezione di Napoli, Italy.

Received: 24 January 1998 / Accepted: 4 March 1998

Abstract: We examine the structure of spacetime symmetries of toroidally compactified string theory within the framework of noncommutative geometry. Following a proposal of Fröhlich and Gawędzki, we describe the noncommutative string spacetime using an algebraic construction of the vertex operator algebra. We show that the spacetime duality and discrete worldsheet symmetries of the string theory are a consequence of the existence of two independent Dirac operators, arising from the chiral structure of the conformal field theory. We demonstrate that these Dirac operators are also responsible for the emergence of ordinary classical spacetime as a low-energy limit of the string spacetime, and from this we establish a relationship between T-duality and changes of spin structure of the target space manifold. We study the automorphism group of the vertex operator algebra and show that spacetime duality is naturally a gauge symmetry in this formalism. We show that classical general covariance also becomes a gauge symmetry of the string spacetime. We explore some larger symmetries of the algebra in the context of a universal gauge group for string theory, and connect these symmetry groups with some of the algebraic structures which arise in the mathematical theory of vertex operator algebras, such as the Monster group. We also briefly describe how the classical topology of spacetime is modified by the string theory, and calculate the cohomology groups of the noncommutative spacetime.

1. Introduction

Duality has emerged as an important non-perturbative tool for the understanding of the spacetime structure of string theory and certain aspects of confinement in supersymmetric gauge theories (see [1, 2] for respective reviews). Recently, its principal applications
668
F. Lizzi, R. J. Szabo
have been in string theory within the unified framework of M theory [3], in which all five consistent superstring theories in ten-dimensions are related to one another by duality transformations. Target space duality, in its simplest toroidal version, i.e. T-duality, relates large and small compactification radius circles to one another. It therefore relates two different spacetimes in which the strings live and, implicitly, large and small distances. The quantum string theory is invariant under such a transformation of the target space. The symmetry between these inequivalent string backgrounds leads to the notion of a stringy or quantum spacetime which forms the moduli space of string vacua and describes the appropriate stringy modification of classical general relativity. T-duality naturally leads to a fundamental length scale in string theory, which is customarily identified as the Planck length lP . A common idea is that at distances smaller than lP the conventional notion of a spacetime geometry is inadequate to describe its structure. As strings are extended objects, the notion of a “point” in the spacetime may not make sense, just as the notion of a point in a quantum phase space is meaningless. In fact, it has been conjectured that the string configurations themselves obey an uncertainty principle, so that at small distances they become smeared out. A recent candidate theory for this picture is the effective matrix field theory for D-branes [4, 5] in which the spacetime coordinates are described by noncommuting matrices. In this paper we shall describe a natural algebraic framework for studying the geometry of spacetime implied by string theory. Duality and string theory seem to point to a description of spacetime which goes beyond the one given by ordinary geometry, with its concept of manifolds, points, dimensions etc. In this respect, a tool ideally suited to generalize the concept of ordinary differential geometry is Noncommutative Geometry [6]. 
The central idea of Noncommutative Geometry is that, since a generic (separable) topological space (for example a manifold) is completely characterized by the commutative C ∗ -algebra of continuous complex-valued functions defined on it, it may be useful to regard this algebra as the algebra generated by the coordinates of the space. Conversely, given a commutative C ∗ algebra, it is possible to construct, with purely algebraic methods, a topological space. The terminology noncommutative geometry refers to the possibility of generalizing these concepts to the case where the algebra is noncommutative. This would be the case of some sort of space in which the coordinates do not commute. The structure of the theory is, however, even more powerful than this first generalization. The commutative algebra gives not only the possibility to construct the points of a space, but also to give them a topology, again using only algebraic methods. Metric aspects (distances, etc.) can also be constructed by representing the algebra as bounded operators on a Hilbert space, and then defining on it a generalization of the Dirac operator. Differential forms are also represented as operators on the Hilbert space and gauge transformations act as conjugation by elements of the group of unitary operators of the algebra. This set of three objects, an algebra, a Hilbert space on which the algebra is represented, and the generalized Dirac operator, goes under the name of a Spectral Triple. One is now led to ask if the generalization of spacetime hinted to by string theory can find a place in the framework of noncommutative geometry. Namely, if there is an algebra which can provide the noncommutative “coordinates” of string theory. The structure should be such that in the low-energy limit, in which strings are effectively point particles described by ordinary quantum field theory, one recovers the usual (commutative) spacetime manifolds. 
In the following we shall elaborate on a program initiated by Fröhlich and Gawędzki [7] (see also [8]) for understanding the geometry of string spacetimes using noncommutative geometry. This program has also been pursued recently by Fröhlich, Grandjean and Recknagel [9]. In [7] it was proposed that the algebra representing the stringy generalization of spacetime is the vertex operator algebra of the underlying conformal field theory for the strings. We shall analyse some of the properties of this spacetime through an algebraic study of the properties of vertex operator algebras. A conformal field theory has a natural chiral structure, from which we show how to naturally construct two Dirac operators. These operators are crucial to the construction of the low-energy limit of the noncommutative spacetime which gives the conventional spacetimes of classical general relativity at large distance scales. We will explicitly construct this limit, using the tools of noncommutative geometry, and show further how our Dirac operators are related to the more conventional ones that arise from N = 1 superconformal field theories. Chamseddine [10] has recently used these Dirac operators in the spectral action principle [11] of noncommutative geometry and shown that they lead to the desired effective superstring action. However, the Dirac operators that we incorporate into the spectral triple are more general and as such they illuminate the full structure of the duality symmetries of the string spacetime. All of the information concerning the target space dualities and discrete worldsheet symmetries of the string theory lies in the relationships between these two operators. They define isometric noncommutative spacetimes at the level of their spectral triples, and as such lead naturally to equivalences between their low-energy projective subspaces which imply the duality symmetries between classical spacetimes and the quantum string theory. From this we also deduce a non-trivial relation between T-duality transformations and changes of spin structure.
The main focus of this paper will be on the applications of noncommutative geometry to a systematic analysis of the symmetries of a string spacetime using the algebraic formulation of the theory of vertex operator algebras (see for example [12, 13, 14]). We will show (following [15]) that target space duality is, at the level of the string theory spectral triple, just a very simple inner automorphism of the vertex operator algebra, i.e. a gauge transformation (this was anticipated in part in [16]). This transformation leaves the algebra (which represents the noncommutative topology of the stringy spacetime) invariant, but changes the Dirac operator, and hence the metric properties. We shall also describe other automorphisms of the vertex operator algebra. For example, discrete worldsheet symmetries of the conformal field theory appear as outer automorphisms of the noncommutative geometry. For the commutative algebra of functions on a manifold there are no inner automorphisms (gauge symmetries), and the group of (outer) automorphisms coincides with the diffeomorphism group of the manifold. We will show that the outer automorphisms of the low-energy projective subalgebras of the vertex operator algebra, which define diffeomorphisms of the classical spacetime, are induced via the projections from inner automorphisms of the full string theory spectral triple. This implies that, in the framework of noncommutative geometry, general covariance appears naturally as a gauge symmetry of the quantum spacetime, and general relativity is therefore formulated as a gauge theory. We shall also analyse briefly the problem of computing the full automorphism group of the noncommutative string spacetime. This ties in with the problem of finding a universal gauge group of string theory which overlies all of its dynamical symmetries. We are not aware of a full classification of the automorphisms of a vertex operator algebra, nor will we attempt it here.
Some very important mathematical aspects of this group are known, such as its relation to the Monster group [12], and we examine these properties within the context of the noncommutative geometry of the string spacetime. We also briefly discuss some properties of the noncommutative differential topology of the string
spacetime and compare them with the classical topologies. From this we shall see the natural emergence of spacetime topology change. For definitiveness and clarity we restrict ourselves in this paper to a detailed analysis of essentially the simplest string theory, the linear sigma-model, i.e. closed strings compactified on a flat n-dimensional torus. Already at this level we will find a very rich noncommutative geometrical structure, which illustrates how the geometry and topology relevant for general relativity must be embedded into a larger noncommutative structure in string theory. Our analysis indicates that noncommutative geometry holds one of the best promises of studying physics at the Planck scale, which at the same time incorporates the dynamical features governed by string theory. Our results also apply to N = 1 superconformal field theories (whose target spaces are effectively restricted to tori), and we also indicate along the way how the analysis generalizes to other conformal field theories. However, the most interesting generalization would be in the context of open strings, where the two chiral sectors of the theory merge and T-duality connects strings with D-branes. D-branes have been discussed in the context of Noncommutative Geometry recently in [17] (see also [9]). Duality now relates spectral triples of different string spacetimes, which could have remarkable implications in the context of M theory. A complete dynamical description of M theory is yet unknown, but in the context of this paper this could be achieved by some large vertex operator algebra. In this respect the conjecture of [5], which relates M-theory to a matrix model, could be interpreted as a particular truncation of a set of vertex operators to finite-dimensional N ×N matrices. The large-N limit then recovers the aspects of the full theory. The structure of this paper is as follows. In Sect. 
2 we present a brief introduction to the main ideas and definitions of noncommutative geometry that will be used throughout the paper. In Sect. 3 we then describe the systematic construction (following [7] for the most part) of the string spacetime. Here the Dirac operators of the theory are introduced, along with their alternative spin structures, and are related to similar objects which appear in supersymmetric sigma-models. The relevant algebra (the vertex operator algebra) is also introduced here, and we present an algebraic description of this space of operators. Sections 4 and 5 then contain the main results of this paper. In Sect. 4 the spectral triples are analysed and the low-energy sectors are constructed from the two Dirac operators. The duality symmetries are also presented as isomorphisms between the spectral triples. Section 5 discusses the symmetries and topology of the noncommutative string spacetime in a general setting, and some aspects of a universal string theory gauge group, Borcherds algebras and relations with the Monster group are presented. Some more formal algebraic aspects of vertex operator algebras, along with some features about how they construct the string spacetime, are included in an appendix with emphasis on how the analysis and results of this paper extend to more general string theories.
2. Elements of Noncommutative Geometry
In this section we will give a very brief introduction to the basic tools and definitions of noncommutative geometry that we will use in this paper. The classic reference for the subject is Connes' book [6]. Other introductions can be found in [18, 19, 20]. The basic concepts of classical Riemannian geometry are very familiar to theoretical physicists. Here one has the notion of a manifold $M$ whose points $x \in M$ are labelled by a finite number of real coordinates $x^\mu \in \mathbb{R}$. The metric is determined by the infinitesimal length element
$$ ds^2 = g_{\mu\nu}\, dx^\mu dx^\nu. \tag{2.1} $$
The main idea behind noncommutative geometry is to study the topology (and geometrical properties) of a space by not seeing it as a set of points, but rather by investigating the set of fields defined on it. In this sense the tools of noncommutative geometry resemble the methods of modern theoretical physics. The point is that there exists a dual description of the above situation, in that the specification of the topology of a space $M$ is equivalent to studying the algebra $A = C(M)$ of continuous complex-valued functions on it. This algebra encodes the topology of $M$ through the continuity criterion, and it is an example of a commutative unital C*-algebra. The ∗-conjugation is defined by $f^*(x) \equiv \overline{f(x)}$ $\forall x \in M$, $f \in A$, and the norm on $A$ is the $L^\infty$-norm$^1$,
$$ \|f\|_\infty = \sup_{x \in M} |f(x)|. \tag{2.2} $$
Thus given a topological space (for instance a manifold), one can naturally associate to it a commutative C*-algebra $A$. That the converse is also true is the content of a series of theorems due to Gel'fand and Naimark (for a review see for example [21, 22]). The Gel'fand-Naimark theorem states that there is a one-to-one correspondence between (locally compact) Hausdorff topological spaces and commutative C*-algebras. The correspondence in the other direction is much more sophisticated and can be seen in a constructive way. Given an abelian C*-algebra $A$, i.e. a set of elements with two commutative operations, a norm and a conjugation, one can reconstruct the topological space $M$ for which $A \cong C(M)$ is the algebra of continuous complex-valued functions on $M$. In this construction the points are taken to lie in the structure space of $A$, the space of equivalence classes of irreducible representations of the algebra modulo algebra isomorphism. In the case of commutative algebras the irreducible representations are all one-dimensional. They are thus the (non-zero) multiplicative ∗-linear functionals $\phi : A \to \mathbb{C}$. We can then consider a representation $\phi_x$ as the evaluation map which gives the values of the elements of the algebra at a point $x$:
$$ \phi_x(f) \equiv f(x), \quad f \in A. \tag{2.3} $$
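For a finite set $M$ the identification of points with multiplicative ∗-linear functionals can be checked directly. The following is a minimal sketch, not part of the original construction: functions on a hypothetical three-point set are stored as tuples, `evaluation(x)` plays the role of $\phi_x$ in (2.3), and `average` is an illustrative linear functional that is not an evaluation map. All names and sample values are assumptions made for the example.

```python
def product(f, g):
    """Pointwise product of two functions on M, stored as tuples of values."""
    return tuple(a * b for a, b in zip(f, g))


def star(f):
    """The *-conjugation f*(x) = complex conjugate of f(x)."""
    return tuple(complex(a).conjugate() for a in f)


def evaluation(x):
    """The functional phi_x of (2.3): f -> f(x)."""
    return lambda f: f[x]


def average(f):
    """A linear functional that is NOT an evaluation map (for contrast)."""
    return sum(f) / len(f)


def is_character(phi, functions):
    """Check that phi is unital, multiplicative and *-preserving on the samples."""
    unit = tuple(1 for _ in functions[0])
    if phi(unit) != 1:
        return False
    for f in functions:
        if phi(star(f)) != complex(phi(f)).conjugate():
            return False
        for g in functions:
            if phi(product(f, g)) != phi(f) * phi(g):
                return False
    return True


# Hypothetical sample functions on the three-point set M = {0, 1, 2}.
SAMPLES = [(1 + 2j, 3, -1j), (2, 1j, 4)]
```

On these samples every evaluation map passes the check while the averaging functional fails multiplicativity, which illustrates why the one-dimensional representations single out the points of $M$.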
The topology in the structure space is given using the concept of pointwise convergence. A sequence $\phi_{x_n}$ of representations is said to converge to $\phi_x$ if and only if for any $a \in A$ the sequence $\phi_{x_n}(a) \to \phi_x(a)$ in the usual topology of $\mathbb{C}$. There are various other algebraic ways to reconstruct the points of $M$ [19, 21], such as through the maximal ideals, primitive ideals, characters or pure states of $A$. For commutative algebras these constructions are all equivalent. For a noncommutative algebra, for which not all irreducible representations are one-dimensional, this is no longer true. The important aspect to note is that the investigation of the topology of a space $M$ in terms of the relations among its points is completely equivalent to the analysis of the algebra of functions defined on it. Therefore the study of topological spaces (manifolds, etc.) can be substituted by a study of C*-algebras. In the case where $M$ is a manifold, in addition to the topology the differentiable structure is determined by the algebra $C^\infty(M)$ of smooth functions on $M$. Connes [6] has shown that a metric and other local aspects can also be encoded at the level of algebras. The key property is another important result due to Gel'fand, the fact that any C*-algebra $A$ (commutative or otherwise) can be represented faithfully as a subalgebra of the algebra $B(H)$ of bounded operators on an infinite-dimensional separable Hilbert space $H$. In the following we will not distinguish between the algebra and this representation, and we take $A \subset B(H)$. The norm of $A$ is represented by the operator norm on $B(H)$,
$$ \|a\| = \sup_{|\psi\rangle, |\phi\rangle \in H} |\langle\psi|a|\phi\rangle|, \tag{2.4} $$
where the supremum is taken over unit vectors.
1 In the cases where $M$ is non-compact one has to further restrict to functions which vanish on the frontier of the space.
The metric structure, as well as the noncommutative generalization of differential and integral calculus, is obtained via an operator which is a generalization of the usual Dirac operator. We shall call it a Dirac operator even in this generalized context. From the point of view of noncommutative geometry the Dirac operator is a (not necessarily bounded) operator $D$ on $H$ with the following properties:
1. $D$ is self-adjoint, i.e. $D = D^\dagger$ on the common domain of the two operators.
2. The commutator $[D, a]$ ($a \in A$) is bounded on a dense subalgebra of $A$.
3. $D$ has compact resolvent, i.e. $(D - \lambda)^{-1}$ for $\lambda \notin \mathbb{R}$ is a compact operator on $H$.
If the algebra is commutative, and therefore it has a structure space which determines a corresponding algebra of continuous functions on a topological space, then the Dirac operator enables one to give the topological space a metric structure via the definition of the distance. We define the distance between two points of the structure space as
$$ d(x, y) = \sup_{a \in A} \left\{ |a(x) - a(y)| : \|[D, a]\| \le 1 \right\}. \tag{2.5} $$
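The distance formula (2.5) can be evaluated by hand in the simplest commutative example, a space with two points. The sketch below is an illustration under stated assumptions and is not taken from the paper: the algebra is $A = \mathbb{C} \oplus \mathbb{C}$ acting diagonally on $H = \mathbb{C}^2$, and the toy Dirac operator is the off-diagonal matrix with entries $m$ and $\bar m$. In that case $\|[D, a]\| = |m|\,|a_1 - a_2|$, so the supremum in (2.5) should equal $1/|m|$. The function names and the scan resolution are hypothetical.

```python
def commutator_norm(m, a1, a2):
    """Operator norm of [D, a] for a = diag(a1, a2) and D = [[0, m], [conj(m), 0]].

    [D, a] = [[0, m*(a2 - a1)], [conj(m)*(a1 - a2), 0]] is anti-diagonal,
    so its operator norm is the larger modulus of its two entries.
    """
    upper = m * (a2 - a1)
    lower = m.conjugate() * (a1 - a2)
    return max(abs(upper), abs(lower))


def connes_distance(m, samples=2000):
    """Approximate sup{|a1 - a2| : ||[D, a]|| <= 1} by scanning trial gaps."""
    best = 0.0
    for k in range(samples + 1):
        gap = 2.0 * k / (samples * abs(m))  # trial value of a1 - a2 in [0, 2/|m|]
        if commutator_norm(m, gap, 0.0) <= 1.0 + 1e-12:
            best = max(best, abs(gap))
    return best
```

The scan only confirms the closed-form value $1/|m|$ numerically; it shows that the separation of the two points is controlled entirely by the Dirac operator.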
The distance (2.5) coincides with the usual definition of geodesic distance for a spin-manifold $M$ with the usual Dirac operator $D = i\gamma^\mu \nabla_\mu$ acting on some dense domain of the Hilbert space $H = L^2(\mathrm{spin}(M))$ of square-integrable spinors $\psi(x)$ on $M$. The real-valued gamma-matrices generate the corresponding Clifford algebra $\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}$ of the spin bundle $\mathrm{spin}(M)$ of $M$, and $\nabla_\mu$ is the usual covariant derivative constructed from the Levi-Civita spin-connection of $M$. Notice how these latter two dependences of $D$ encode the Riemannian geometry of $M$. We take $A$ to be the algebra of continuous complex-valued functions on $M$ and consider them as operators on $H$ acting by pointwise multiplication. Given the notion of distance between pairs of points, the other properties of a metric space can be obtained by the traditional techniques. The set of data $(A, H, D)$, i.e. a C*-algebra $A$ of bounded operators on a Hilbert space $H$ and a Dirac operator $D$ on $H$, is called a spectral triple, and it encodes the geometry and topology of a space under consideration. The pair $(H, D)$ is called a Dirac K-cycle for $A$ and it can be used to describe the cohomology of a space using K-theory [6]. A different choice of Dirac operator will alter the metric properties of the space. A change of metric is, in this context, a change of Dirac operator. Later on we will see how different choices of the Dirac operator for the case of string theory correspond to spacetimes whose metric structures are related by T-duality and other geometric transformations.
Another important role played by the Dirac operator $D$ is in the construction of the algebra of differential forms in the context of noncommutative geometry. The key idea is to also represent differential forms as operators on $H$, on a par with $D$ and $A$. We first define the (abstract) universal differential algebra of forms as the $\mathbb{Z}$-graded algebra
$$ \Omega^* A = \bigoplus_{p \ge 0} \Omega^p A \tag{2.6} $$
which is generated as follows:
$$ \Omega^0 A = A \tag{2.7} $$
and $\Omega^1 A$ is generated by a set of abstract symbols $da$ which are linear and satisfy the Leibniz rule on $A$. Elements of $\Omega^p A$ are linear combinations of elements of the form $\omega = a_0\, da_1 \cdots da_p$. This makes $\Omega^p A$ a $\mathbb{Z}_2$-graded $A$-module. The graded exterior derivative operator is the nilpotent linear map $d : \Omega^p A \to \Omega^{p+1} A$ defined by
$$ d(a_0\, da_1 \cdots da_p) = da_0\, da_1 \cdots da_p. \tag{2.8} $$
We define a linear representation $\pi_D : \Omega^* A \to B(H)$ of the universal algebra of abstract forms by
$$ \pi_D(a_0\, da_1 \cdots da_p) = a_0 [D, a_1] \cdots [D, a_p]. \tag{2.9} $$
Notice, however, that $\pi_D(\omega) = 0$ does not necessarily imply $\pi_D(d\omega) = 0$. Forms $\omega$ with $\pi_D(\omega) = 0$ but $\pi_D(d\omega) \neq 0$ are called junk forms. They generate a $\mathbb{Z}$-graded ideal in $\Omega^* A$ and have to be quotiented out [6, 19]. Then the noncommutative differential algebra is represented by the quotient space
$$ \Omega^*_D A = \pi_D\big( \Omega^* A / (\ker \pi_D \oplus d \ker \pi_D) \big) \tag{2.10} $$
which we note depends explicitly on the particular choice of $D$. The algebra $\Omega^*_D A$ determines a DeRham complex whose cohomology groups can be computed using the conventional methods. With this machinery, it is also possible to naturally define (formally at this level) a vector bundle $E$ over $A$ as a finitely-generated projective left $A$-module, and along with it the usual definitions of connection, curvature, and so on [23]. However, in what follows we shall for the most part use only the trivial bundle over a unital C*-algebra $A$. For this we define a gauge group $U(A)$ as the group of unitary elements of $A$,
$$ U(A) = \left\{ u \in A \mid u^\dagger u = u u^\dagger = I \right\}, \tag{2.11} $$
where $I$ is the identity operator. Alternatively, the gauge group can be defined as the subgroup of unimodular elements $u \in U(A)$. The presence of one-forms is then tantamount to the possibility of defining a connection, which is a generic Hermitian one-form $\rho = \sum_i a_i [D, b_i]$, and with it a covariant Dirac operator $D_\rho = D + \rho$. The curvature of a connection $\rho$ is defined to be
$$ \theta = [D, \rho] + \rho^2. \tag{2.12} $$
We close this section by turning again to the example of a p-dimensional spin manifold above. Zero-forms are just complex-valued functions, while we can represent an exact one-form as
$$ \pi_D(df) = [D, f] = i\gamma^\mu \partial_\mu f \tag{2.13} $$
so that a generic one-form is represented as $f_\mu \gamma^\mu$, i.e. $\Omega^1_D A$ is a free $A$-module with basis $\{\gamma^\mu\}$. As we build two-forms we run into the problem of junk forms. Consider the form $\alpha = f\, df - (df) f$. As a form in the universal algebra it is non-zero. Its representation does, however, vanish:
$$ \pi_D(\alpha) = i\gamma^\mu \left( f \partial_\mu f - (\partial_\mu f) f \right) = 0 \tag{2.14} $$
while the representation of its differential $d\alpha = 2\, df\, df$ does not vanish:
$$ \pi_D(d\alpha) = -2\gamma^\mu \gamma^\nu \partial_\mu f\, \partial_\nu f = -2 g^{\mu\nu} \partial_\mu f\, \partial_\nu f. \tag{2.15} $$
We can therefore identify junk forms as the symmetric part of the product of two one-forms. This has to be quotiented out leaving only the antisymmetric part, so that a generic two-form is represented as $f_{\mu\nu} \gamma^{\mu\nu}$ where $\gamma^{\mu\nu} = \frac{1}{2} [\gamma^\mu, \gamma^\nu]$, i.e. $\Omega^2_D A$ is a free $A$-module with basis $\{\gamma^{\mu\nu}\}$. Analogously one constructs higher-degree forms by antisymmetrizations of the $\gamma$'s. Forms of degree $p + 1$ or higher are all junk forms. A connection is a generic one-form $\rho = A_\mu \gamma^\mu$ and the curvature defined in (2.12) is the familiar Maxwell tensor:
$$ \theta = \tfrac{1}{2} F_{\mu\nu} \gamma^{\mu\nu} \tag{2.16} $$
with $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. Note that, in this example, $D$ acts densely on the Hilbert space, i.e. it maps a dense subspace of $H = L^2(\mathrm{spin}(M))$ into itself. In this case arbitrary iterated commutators of the form $[D, [D, [\cdots, [D, a]] \cdots ]]$ are bounded only on the dense subalgebra $C^\infty(M) \subset C(M)$. Thus to ensure boundedness of all operators under consideration, one should restrict attention to the ∗-algebra $C^\infty(M)$. As the completion of $C^\infty(M)$ is $C^0(M)$, nothing is lost in such a restriction, and therefore in the following we shall for the most part not be concerned with the completeness of the algebra in the spectral triple.
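The splitting of a product of two one-forms into a symmetric (junk) part and an antisymmetric part can be verified with explicit matrices. The sketch below is an illustration only: it assumes flat two-dimensional Euclidean space with the real representation $\gamma^1 = \sigma_1$, $\gamma^2 = \sigma_3$, takes the one-form coefficients to be constants, and uses hypothetical helper names. It checks that $a_\mu \gamma^\mu\, b_\nu \gamma^\nu = g^{\mu\nu} a_\mu b_\nu\, I + a_\mu b_\nu \gamma^{\mu\nu}$.

```python
from fractions import Fraction

IDENTITY = [[1, 0], [0, 1]]
# Assumed real 2x2 representation: gamma^1 = sigma_1, gamma^2 = sigma_3.
GAMMA = [[[0, 1], [1, 0]], [[1, 0], [0, -1]]]
METRIC = [[1, 0], [0, 1]]  # flat Euclidean g^{mu nu}


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combine(a, b, coeff=1):
    """Return a + coeff * b entrywise."""
    return [[a[i][j] + coeff * b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    """Return c * a entrywise."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def gamma2(mu, nu):
    """gamma^{mu nu} = (1/2) [gamma^mu, gamma^nu]."""
    comm = combine(matmul(GAMMA[mu], GAMMA[nu]), matmul(GAMMA[nu], GAMMA[mu]), -1)
    return scale(Fraction(1, 2), comm)


def one_form(coeffs):
    """Represent f_mu gamma^mu for constant coefficients f_mu."""
    return combine(scale(coeffs[0], GAMMA[0]), scale(coeffs[1], GAMMA[1]))


def split_product(a, b):
    """Split one_form(a) * one_form(b) into symmetric and antisymmetric parts."""
    contraction = sum(a[m] * METRIC[m][n] * b[n] for m in range(2) for n in range(2))
    sym = scale(contraction, IDENTITY)
    anti = [[0, 0], [0, 0]]
    for m in range(2):
        for n in range(2):
            anti = combine(anti, gamma2(m, n), a[m] * b[n])
    return sym, anti
```

When the two coefficient vectors coincide the antisymmetric part vanishes and only the scalar term proportional to the identity survives, which is the piece that is quotiented out as junk.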
3. Quantum Spacetimes and the Fröhlich-Gawędzki Construction
In this section we will extend the material of the previous section to the generalized spaces suggested by noncommutative geometry. We first discuss some generalities on noncommutative spaces, leading up to the spacetime of string theory. We then discuss the construction of the spectral triples pertinent to string theory, as suggested by the work of Fröhlich and Gawędzki [7].
Noncommutative spaces. The classic example of a noncommutative space is provided by the Connes-Lott noncommutative geometry of the standard model [6, 11, 24] (see [25] for a review). This is described by the noncommutative algebra
$$ A_{SM} = C^\infty(M) \otimes \left[ \mathbb{C} \oplus \mathbb{H} \oplus M(3, \mathbb{C}) \right], \tag{3.1} $$
where $\mathbb{H}$ is the algebra of quaternions and $M(3, \mathbb{C})$ is the algebra of 3×3 complex-valued matrices. The unimodular group of this algebra is the familiar gauge group $U(1) \times SU(2) \times SU(3)$ of the standard model. There is a standard prescription for defining fermionic and bosonic actions in noncommutative geometry [6], and in the present case the action contains the Yang-Mills action and the Higgs term. The input parameters are the masses of all fermions and the coupling constants of the gauge group, while the (classical) mass of the Higgs boson is a prediction of the model. Other predictions for masses at current energies can also be made [26]. The actions of [11, 27] also enable the introduction of gravity into the model. There are various examples of noncommutative spaces which are close in character to the context which will be described in the following. Some of them are the Fuzzy Sphere [28], noncommutative lattices [29], and the possibility of having a spacetime with noncommuting coordinates as described in [4, 5, 30]. Quantum groups [31] are noncommutative spaces, in the sense that the algebra of functions defined on them is noncommutative, while another example is the quantum plane [32].
What is common to this variety of noncommutative spaces is that they are described by the noncommutative algebra defined upon them. It might in general be a misnomer to call them “spaces”, in the sense that the concept of point is not appropriate. The problem in dealing with noncommutative C*-algebras is that the identification of the points of the space becomes ambiguous, and has to be abandoned. However, another important feature, which is common to these examples, is that there exists some regime, usually when a scale goes to zero, in which it is possible to recognize a topological space. Such a “low energy” regime is of course necessary if one wants to identify at some level the space in which we live and do experiments. In string theory a low energy regime is one in which no excited, vibrational states of the string are present and, in the case of closed strings, no string modes wind around the spacetime. In this regime the theory is well described by an ordinary (point particle) quantum field theory. In this respect we look for a noncommutative algebra which describes the “space” of interacting strings. Vertex operators were originally introduced in string theory to describe the interactions of strings. They operate on the Hilbert space of strings as insertions on the worldsheet corresponding to the emission or absorption of string states. The “coordinates” of a string are the Fubini-Veneziano fields $X$ (which we will define in the next subsection), and the basic vertex operators are objects of the form $e^{ipX}$. These vertex operators act as a basis of a “smeared” set of vertex operators which form a noncommutative algebra. For toroidally compactified strings restrictions are imposed on the momenta $p$, which have to lie on an even self-dual lattice. We will say more about this algebra later on, and we have described some of the more formal algebraic properties of vertex operator algebras in the appendix.
This is the spacetime that was proposed by Fröhlich and Gawędzki in [7]. In their construction the algebra is the vertex operator algebra of the string theory under consideration. The spacetime is thus described by the operator algebra which describes the relations among the quantum fields of the conformal field theory. String theories come with a scale (the string tension), which is usually set to be of the order of the Planck mass. Usually the dimensions of compactified directions are taken to be of a similar scale, so that “low energy” in this context means to neglect higher (vibrational) excited states of the strings, as well as the non-local states which correspond to the strings winding around a compactified direction. In this limit the theory becomes a theory of point particles, and we should thus expect a commutative spacetime. In fact, a quantum theory of point particles on a spin-manifold $M$ naturally supplies the Hilbert space $H = L^2(\mathrm{spin}(M))$ of physical states and the commutative ∗-algebra $A = C^\infty(M)$ of observables. Thus the low-energy limit of the noncommutative string spacetime should be represented by a spectral triple $(C^\infty(M), L^2(\mathrm{spin}(M)), i g^{\mu\nu} \gamma_\mu \nabla_\nu)$ corresponding to an ordinary spacetime manifold $M$ at large distance scales. Let us briefly remark that there is another approach to the noncommutative spacetime which is inspired by D-brane field theory [9, 17]. An effective low-energy description of open superstrings is provided by supersymmetric $U(N)$ Yang-Mills theory in ten dimensions. Dimensional reduction of this gauge theory down to a $(p + 1)$-dimensional manifold $M_{p+1}$ describes the low-energy dynamics of $N$ Dp-branes, with $U(N)$ gauge symmetry. In [9, 17] the spacetime is thus described in analogy with the Connes-Lott formulation of the standard model by choosing as algebra $A_D = C^\infty(M_{p+1}) \otimes M(N, \mathbb{C})$.
This construction is rather different in spirit from the one which we shall present in the following, in that it utilizes a different low-energy regime of the string theory.
Fubini-Veneziano fields. We now begin describing the spectral triple corresponding to the noncommutative spacetime of string theory. From the point of view of the corresponding conformal field theory, it is more natural to start with the Hilbert space of physical states rather than the algebra since the former, with its oscillatory Fock space component, is one of the immediate characterizations of the string theory. We shall take this to be the space of states on which the quantum string configurations act. We consider a linear sigma model with target space the flat n-torus
$$ T^n = (S^1)^n \cong \mathbb{R}^n / 2\pi\Gamma, \tag{3.2} $$
where $\Gamma$ is a lattice of rank $n$ with inner product $g_{\mu\nu}$ of Euclidean signature. The action of the model is (in units where $l_P = 1$)
$$ S[X] = \frac{1}{2\pi} \int_\Sigma \left( g_{\mu\nu}\, dX^\mu \wedge \star dX^\nu + 2 X^*(\beta) \right), \tag{3.3} $$
where $X^*(\beta) = \frac{1}{2} \beta_{\mu\nu}\, dX^\mu \wedge dX^\nu$ is the pull-back of the constant two-form $\beta$ to the worldsheet $\Sigma$ by the embedding fields $X : \Sigma \to T^n$, and $\star$ denotes the Hodge dual. The kinetic term in (3.3) is defined by the constant Riemannian metric $g \equiv \frac{1}{2} g_{\mu\nu}\, dX^\mu \otimes dX^\nu$ on $T^n$, and it leads to local propagating degrees of freedom in the field theory. The second term in (3.3) is the topological instanton term. At the quantum level the two-form $\beta$ takes values in the torus $\beta \in H^2(T^n; \mathbb{R}) / H^2(T^n; \mathbb{Z})$, where the real cohomology represents the local gauge transformations $\beta \to \beta + d\lambda$ while the integer cohomology represents the large gauge transformations $\beta \to \beta + 4\pi^2 C$ with $C$ a closed two-form with integer periods. Let us introduce the non-singular “background” matrices
$$ [d^\pm_{\mu\nu}] = [g_{\mu\nu} \pm \beta_{\mu\nu}] \tag{3.4} $$
which, when $n$ is even, determine both the complex and Kähler structures of $T^n$. To highlight the relevant features of the construction, in this paper we shall consider only the simplest non-trivial case where the Riemann surface $\Sigma$ is taken to be an infinite cylinder with local coordinates $(\tau, \sigma) \in \mathbb{R} \times S^1$. We introduce holomorphic coordinates $z_\pm = e^{-i(\tau \pm \sigma)}$, which after a Wick rotation of the worldsheet temporal coordinate map the cylinder onto the complex plane. The classical embedding functions of the strings in this case are well-known to be given by the Fubini-Veneziano fields
$$ X^\mu_\pm(z_\pm) = x^\mu_\pm + i g^{\mu\nu} p^\pm_\nu \log z_\pm + \sum_{k \neq 0} \frac{1}{ik}\, \alpha^{(\pm)\mu}_k z_\pm^{-k}, \tag{3.5} $$
where the zero-modes $x^\mu_\pm$ (the center of mass coordinates of the string) and the (center of mass) momenta $p^\pm_\mu$ are canonically conjugate variables. The left-right momenta are

$$p^\pm_\mu = \frac{1}{\sqrt{2}}\left( p_\mu \pm d^\pm_{\mu\nu} w^\nu \right), \tag{3.6}$$

where $\{p_\mu\} \in \Gamma^*$ are the spacetime momenta and $\{w^\mu\} \in \Gamma$ are the winding numbers representing the number of times that the worldsheet circle $S^1$ is wrapped around the circles of the torus $(S^1)^n$. The set of momenta $\{(p^+_\mu, p^-_\mu)\}$ along with the integer-valued quadratic form

$$\langle p, q \rangle_\Lambda \equiv p^+_\mu g^{\mu\nu} q^+_\nu - p^-_\mu g^{\mu\nu} q^-_\nu = p_\mu v^\mu + q_\mu w^\mu, \tag{3.7}$$

where $q^\pm_\mu = \frac{1}{\sqrt{2}}(q_\mu \pm d^\pm_{\mu\nu} v^\nu)$, form an even self-dual Lorentzian lattice

$$\Lambda = \Gamma^* \oplus \Gamma \tag{3.8}$$
Duality Symmetries and Noncommutative Geometry of String Spacetimes
of rank $2n$ and signature $(n, n)$ called the Narain lattice [33]. The functions (3.5) define chiral multi-valued quantum fields of the sigma-model. The oscillatory modes $\alpha^{(\pm)\mu}_k$ in (3.5) yield bosonic creation and annihilation operators (acting on some vacuum states $|0\rangle_\pm$) with the non-vanishing commutation relations

$$\left[ \alpha^{(\pm)\mu}_k , \alpha^{(\pm)\nu}_m \right] = k\, g^{\mu\nu} \delta_{k+m,0}. \tag{3.9}$$

The Hilbert space of states of this quantum field theory is thus

$$\mathcal{H}_X = L^2\Big((S^1)^n, \prod_{\mu=1}^n \tfrac{dx^\mu}{\sqrt{2\pi}}\Big)^\Gamma \otimes \mathcal{F}^+ \otimes \mathcal{F}^-, \tag{3.10}$$
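where

$$L^2\Big((S^1)^n, \prod_{\mu=1}^n \tfrac{dx^\mu}{\sqrt{2\pi}}\Big)^\Gamma = \bigoplus_{\{w^\mu\} \in \Gamma} L^2\Big((S^1)^n, \prod_{\mu=1}^n \tfrac{dx^\mu}{\sqrt{2\pi}}\Big) \tag{3.11}$$

with $L^2((S^1)^n, \prod_{\mu=1}^n \frac{dx^\mu}{\sqrt{2\pi}})$ the space of square integrable functions on $T^n$ with its Riemannian volume form. Periodic functions of $x^\mu = \frac{1}{\sqrt{2}}(x^\mu_+ + x^\mu_-)$ act in (3.11) by multiplication and $p_\mu$ as the derivative operator $i\frac{\partial}{\partial x^\mu}$ in each $L^2$-component labelled by the winding numbers $w^\mu$. The Hilbert space (3.11) is spanned by the eigenvectors $|p^+; p^-\rangle = e^{-i p_\mu x^\mu}$ of $i\frac{\partial}{\partial x^\mu}$ in each component of the direct sum. The spaces $\mathcal{F}^\pm$ are two commuting copies of a bosonic Fock space built on the vacuum states $|0\rangle_\pm$ with $\alpha^{(\pm)\mu}_k |0\rangle_\pm = 0$ for $k > 0$. The unique vacuum state of $\mathcal{H}_X$ is

$$|\mathrm{vac}\rangle \equiv |0; 0\rangle \otimes |0\rangle_+ \otimes |0\rangle_- \tag{3.12}$$

with $p^\pm_\mu |\mathrm{vac}\rangle = \alpha^{(\pm)\mu}_k |\mathrm{vac}\rangle = 0 \;\; \forall k > 0$.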
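The second equality in the quadratic form (3.7) can be checked numerically. The following is a minimal sketch, assuming that momenta and windings are represented by integer component vectors in a lattice basis, that $g$ and $\beta$ are ordinary real matrices, and that the sample values below (which are hypothetical and not taken from the text) are representative. The function names are illustrative only.

```python
import numpy as np

def left_right_momenta(p, w, g, beta):
    """Left/right momenta of (3.6), with background matrices d^± = g ± beta."""
    p_plus = (p + (g + beta) @ w) / np.sqrt(2)
    p_minus = (p - (g - beta) @ w) / np.sqrt(2)
    return p_plus, p_minus

def narain_form(p, w, q, v, g, beta):
    """Left-hand side of (3.7): p+ g^{-1} q+  minus  p- g^{-1} q-."""
    g_inv = np.linalg.inv(g)
    p_plus, p_minus = left_right_momenta(p, w, g, beta)
    q_plus, q_minus = left_right_momenta(q, v, g, beta)
    return p_plus @ g_inv @ q_plus - p_minus @ g_inv @ q_minus

# Hypothetical sample background: symmetric positive metric, antisymmetric two-form.
g = np.array([[2.0, 0.3, 0.0],
              [0.3, 1.5, -0.2],
              [0.0, -0.2, 1.0]])
beta = np.array([[0.0, 0.7, -0.4],
                 [-0.7, 0.0, 0.1],
                 [0.4, -0.1, 0.0]])

# Hypothetical integer momenta (p, q) and windings (w, v).
p, w = np.array([1, -2, 3]), np.array([2, 0, -1])
q, v = np.array([0, 1, 4]), np.array([-3, 1, 1])

lhs = narain_form(p, w, q, v, g, beta)   # p+ . q+ - p- . q-
rhs = p @ v + q @ w                      # p_mu v^mu + q_mu w^mu
print(lhs, rhs)
```

The background dependence drops out because $\beta$ is antisymmetric, so the form takes integer values on integer data. Setting $q = p$ and $v = w$ gives $\langle p, p \rangle_\Lambda = 2 p_\mu w^\mu$, which is the evenness property of the lattice.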
Gauge symmetry and the Dirac–Ramond Operator. We shall now introduce a Dirac operator that will yield the Riemannian geometry of the string spacetime. The Dirac operator is related to the two fundamental, infinite-dimensional continuous symmetries of the conformal field theory (3.3). We shall also briefly show how this Dirac operator is related to some previous approaches to the geometrical and topological properties of a spacetime using supersymmetric sigma-models. Superconformal field theories have been emphasized recently as the correct field theoretical structure for the description of stringy spacetimes in the framework of noncommutative geometry [7]–[10], [34].

The first fundamental symmetry is target space reparametrization $X^\mu_\pm(z_\pm) \to X^\mu_\pm(z_\pm) + \delta X^\mu_\pm(z_\pm)$, where $\delta X^\mu_\pm(z_\pm)$ are arbitrary periodic functions. This symmetry is generated on the Hilbert space (3.10) by the $u(1)^n_+ \oplus u(1)^n_-$ Kac-Moody algebra at level 2 with conserved currents

$$J^\mu_\pm(z_\pm) = -i z_\pm \partial_{z_\pm} X^\mu_\pm(z_\pm) = \sum_{k=-\infty}^{\infty} \alpha^{(\pm)\mu}_k z_\pm^{-k} \tag{3.13}$$

obeying the commutation relations (3.9), where we have defined $\alpha^{(\pm)\mu}_0 \equiv g^{\mu\nu} p^\pm_\nu$. Then the Hilbert space (3.10) is a direct sum of the irreducible highest-weight representations of the current algebra acting in $|p^+; p^-\rangle \otimes \mathcal{F}^+ \otimes \mathcal{F}^-$ and labelled by the $U(1)^n_\pm$ charges $p^\pm_\mu$.
F. Lizzi, R. J. Szabo
The currents (3.13) can be used to define the Dirac operator which describes the DeRham cohomology and Riemannian geometry of the effective spacetime of the sigma-model (3.3). For this, we endow the toroidal spacetime $T^n$ with a spin structure. Note that there are $2^n$ possibilities corresponding to a choice of Neveu-Schwarz or Ramond fermionic boundary conditions around each of the $n$ circles of $T^n$. We then introduce two anti-commuting copies of the spin($n$) Clifford algebra $C(T^n)^\pm$ whose corresponding Dirac generators $\gamma^\pm_\mu = (\gamma^\pm_\mu)^*$ obey the non-vanishing anti-commutation relations

$$\left\{ \gamma^\pm_\mu , \gamma^\pm_\nu \right\} = 2 g_{\mu\nu}. \tag{3.14}$$

To define the appropriate Dirac operator we must first enlarge the Hilbert space (3.10) to include the spin structure. Thus we replace $\mathcal{H}_X$ by

$$\mathcal{H} = \bigoplus_{S[C(T^n)]} L^2(\mathrm{spin}(T^n))^\Gamma \otimes \mathcal{F}^+ \otimes \mathcal{F}^-, \tag{3.15}$$

where

$$L^2(\mathrm{spin}(T^n))^\Gamma = \bigoplus_{\{w^\mu\} \in \Gamma} S_{\{w^\mu\}}[C(T^n)] \otimes L^2\Big((S^1)^n, \prod_{\mu=1}^n \tfrac{dx^\mu}{\sqrt{2\pi}}\Big) \tag{3.16}$$

is (a local trivialization of) the space of square integrable spinors, i.e. $L^2$-sections of the spin bundle spin($T^n$) of the $n$-torus, with $S_{\{w^\mu\}}[C(T^n)]$ unitary irreducible representations of the double Clifford algebra $C(T^n) = C(T^n)^+ \oplus C(T^n)^-$. These modules have the form

$$S_{\{w^\mu\}}[C(T^n)] = \begin{cases} S_{\{w^\mu\}}[C(T^n)]^+ \otimes S_{\{w^\mu\}}[C(T^n)]^- & \text{for } n \text{ even} \\ S_{\{w^\mu\}}[C(T^n)]^+ \otimes S_{\{w^\mu\}}[C(T^n)]^- \otimes \mathbb{C}^2 & \text{for } n \text{ odd} \end{cases} \tag{3.17}$$

where the bundle of spinors contains both chiralities of fermion fields in the even dimensional case. These various representations of the Clifford algebra need not be the same, but, to simplify notation in the following, we shall typically omit the explicit representation labels for the spinor parts of the Hilbert space (3.15). Define two anti-commuting Dirac operators acting on the Hilbert space (3.15) by

$$D\!\!\!\!/^{\,\pm}(z_\pm) = \sqrt{2}\, \gamma^\pm_\mu \otimes J^\mu_\pm(z_\pm) = \sqrt{2}\, \gamma^\pm_\mu \otimes \sum_{k=-\infty}^{\infty} \alpha^{(\pm)\mu}_k z_\pm^{-k} \equiv \sum_{k=-\infty}^{\infty} D\!\!\!\!/^{\,\pm}_k\, z_\pm^{-k}. \tag{3.18}$$

That these Dirac operators are appropriate for the target space geometry can be seen by noting that $J^\mu_\pm \sim \frac{\delta}{\delta X^\mu_\pm}$ when acting on spacetime functions of $X^\mu_\pm$. Furthermore, as we shall see, their squares determine the appropriate Laplace-Beltrami operator for the target space geometry which can also be found from conventional conformal field theory.

The Dirac operator introduced above is a low-energy limit of Witten's Dirac–Ramond operator for the full superstring theory corresponding to the two-dimensional $N = 1$ supersymmetric sigma-model [35, 36]. For the simple linear sigma-model under consideration here, this field theory is obtained by adding to the action (3.3) a free fermion term, so that the total action is

$$S[X, \psi] = S[X] + \frac{i}{2\pi} \int dz_+\, dz_-\; g_{\mu\nu} \left( \psi^\mu_+ \partial_{z_-} \psi^\nu_+ + \psi^\mu_- \partial_{z_+} \psi^\nu_- \right), \tag{3.19}$$
where $\psi^\mu_\pm$ are Majorana spinor fields. The worldsheet supersymmetry generators are the $N = 1$ supercharges

$$Q_\pm = \frac{1}{\sqrt{2}}\, g_{\mu\nu} \psi^\mu_\pm \partial_\pm X^\nu \tag{3.20}$$

In the Ramond sector the fermionic zero modes $\psi^{(\pm)\mu}_0$ generate the Clifford algebras (3.14) and thus coincide with the gamma-matrices introduced above in specific irreducible representations $S_F[C(T^n)]^\pm$, i.e.

$$\gamma^\pm_\mu = \frac{1}{\sqrt{2}}\, g_{\mu\nu} \psi^{(\pm)\nu}_0. \tag{3.21}$$

The total Hilbert space of the supersymmetric sigma-model (3.19) is

$$\mathcal{H}_{X,\psi} = \mathcal{F}_F \otimes \mathcal{H}_X, \tag{3.22}$$

where $\mathcal{F}_F$ is the total fermionic Fock space of the field theory (3.19) including both chiralities of the Neveu-Schwarz and Ramond sectors. In the Ramond sector the quantized fermionic zero-modes of the worldsheet supercharges (3.20) coincide with the generalized Dirac operators (3.18) acting in the representation sector of the Hilbert space (3.15) determined by the fermion fields. Define a mode expansion $Q_\pm(z_\pm) = \sum_{k \in \mathbb{Z}+\epsilon} Q^\pm_{k,\epsilon}\, z_\pm^{-k-1/2}$ of the worldsheet supercharges with the operators

$$Q^\pm_{k,\epsilon} = \sum_{m \in \mathbb{Z}+\epsilon} g_{\mu\nu}\, \psi^{(\pm)\mu}_m \alpha^{(\pm)\nu}_{k-m} \tag{3.23}$$

acting on the full Hilbert space (3.22) of the supersymmetric sigma-model, where $\psi^{(\pm)\mu}_m$ are the fermionic modes and $\epsilon = \frac{1}{2}$ for Neveu-Schwarz boundary conditions while $\epsilon = 0$ for Ramond boundary conditions. Under the orthogonal projection of the Hilbert space $\mathcal{H}_{X,\psi}$, with projector $\mathcal{P}^{(0)}_R$, onto the fermionic zero-modes, the Hilbert space (3.22) reduces to (3.15) and the supersymmetry charges (3.20) coincide with the Dirac operators (3.18) in the representation $S_F[C(T^n)]$ of the Clifford algebra determined by the fermion fields,

$$\mathcal{H}\big|_{S_F[C(T^n)]} \cong \mathcal{H}^{(0)}_R \equiv \mathcal{P}^{(0)}_R \mathcal{H}_{X,\psi}, \qquad D\!\!\!\!/^{\,\pm}_k = \mathcal{P}^{(0)}_R\, Q^\pm_{k,0}\, \mathcal{P}^{(0)}_R \tag{3.24}$$
This supersymmetric construction thus exhibits an algebraic, field-theoretical origin for the Dirac operators introduced in (3.18).

Conformal symmetry and the Witten complex. The other basic symmetry that the field theory (3.3) possesses is worldsheet conformal invariance under transformations which act by reparametrization of the holomorphic coordinates $z_\pm$. At the quantum level, this symmetry is represented on the Hilbert space (3.15) by a commuting pair of Virasoro algebras with the conserved stress-energy tensors

$$T^\pm(z_\pm) = -\frac{1}{2} : D\!\!\!\!/^{\,\pm}(z_\pm)^2 : \; = \sum_{k=-\infty}^{\infty} I \otimes L^\pm_k\, z_\pm^{-k-2}, \tag{3.25}$$

where the double colons denote the usual Wick normal ordering and the Sugawara-Virasoro generators
$$L^\pm_k = \frac{1}{2} \sum_{m=-\infty}^{\infty} g_{\mu\nu} : \alpha^{(\pm)\mu}_m \alpha^{(\pm)\nu}_{k-m} : \tag{3.26}$$

act on the bosonic Hilbert space (3.10). The operators (3.26) generate the Virasoro algebra

$$\left[ L^\pm_k , L^\pm_m \right] = (k - m) L^\pm_{k+m} + \frac{c}{12} \left( k^3 - k \right) \delta_{k+m,0} \tag{3.27}$$

with central charge $c = n$, the dimension of the toroidal spacetime. These Virasoro operators can be used to construct the global spacetime symmetry generators of the Poincaré algebra. The momentum operator generating space translations is $P = L^+_0 - L^-_0$. The Hamiltonian operator is $H = L^+_0 + L^-_0 - \frac{c}{12}$, which can be written explicitly as

$$H = \frac{1}{2} g^{\mu\nu} p^+_\mu p^+_\nu + \frac{1}{2} g^{\mu\nu} p^-_\mu p^-_\nu + \sum_{k>0} g_{\mu\nu}\, \alpha^{(+)\mu}_{-k} \alpha^{(+)\nu}_k + \sum_{k>0} g_{\mu\nu}\, \alpha^{(-)\mu}_{-k} \alpha^{(-)\nu}_k - \frac{n}{12}. \tag{3.28}$$
It determines the Laplace-Beltrami operator of the Riemannian geometry of the effective string spacetime [7], and it is the square of the Dirac operator (3.18) in the sense of (3.25). This geometrical property can be made somewhat more precise by turning to the supersymmetric sigma-model (3.19). The spacetime Poincaré generators of the bosonic sigma-model are given by the projections

$$I \otimes (P \pm H) = \frac{1}{2}\, \mathcal{P}^{(0)}_R \left( Q^\pm_{0,0} \right)^2 \mathcal{P}^{(0)}_R \tag{3.29}$$

onto the spin-extended Hilbert space (3.15). The operators $(Q^\pm_{0,0})^2$ also annihilate all states of the form $|\Psi_\pm\rangle \otimes |\mathrm{vac}\rangle$, where $\Psi_\pm \in S_F[C(T^n)]^\pm$ label the Clifford vacua of the Ramond Fock spaces, so that

$$\ker Q^\pm_{0,0} \cong S_F[C(T^n)]^\pm. \tag{3.30}$$

Thus the global supersymmetry is unbroken and there exists a whole family of supersymmetric ground states of the field theory (3.19). When restricted to Hilbert space states of spacetime momentum $P = 0$, the fields of (3.19) generate the DeRham complex of the target space $T^n$ [35]. The $P = 0$ projected supercharges $\mathcal{P}_{P=0}\, Q_0\, \mathcal{P}_{P=0}$ and $\mathcal{P}_{P=0}\, \bar{Q}_0\, \mathcal{P}_{P=0}$, with

$$Q_0 = \frac{1}{\sqrt{2}} \left( Q^+_{0,0} + Q^-_{0,0} \right), \qquad \bar{Q}_0 = \frac{1}{\sqrt{2}} \left( Q^+_{0,0} - Q^-_{0,0} \right), \tag{3.31}$$

realize the exterior derivative $d$ and the co-derivative $\star d \star$, respectively, when acting on the Ramond sector of the Hilbert space of the supersymmetric sigma-model. Moreover, the projected fermionic zero-modes $\mathcal{P}_{P=0}\, \psi^\mu\, \mathcal{P}_{P=0}$ and $\mathcal{P}_{P=0}\, \bar{\psi}^\mu\, \mathcal{P}_{P=0}$, with

$$\psi^\mu = \psi^{(+)\mu}_{0,0} + \psi^{(-)\mu}_{0,0}, \qquad \bar{\psi}^\mu = \psi^{(+)\mu}_{0,0} - \psi^{(-)\mu}_{0,0} \tag{3.32}$$

correspond, respectively, to basis differential one-forms and basis vector fields, and the Poincaré-Hodge duality is realized by (Hermitian) conjugation in the sense of mappings $\psi^\pm \to \pm\psi^\pm$ of left and right chiral sectors in the parametrizations above.

Vertex operator algebra. The final ingredient we need is an appropriate operator algebra acting on the Hilbert space (3.10) which will give the necessary topology and differentiable structure to the string spacetime. We want to use an algebra that acts on (3.10)
densely, i.e. it maps a dense subspace of $\mathcal{H}_X$ into itself, so as to capture the full structure of the string spacetime as determined by the Fubini-Veneziano fields (3.5). Consider the basic (single-valued) quantum fields of the sigma-model which are the mutually local holomorphic and anti-holomorphic vertex operators

$$V_{q^\pm}(z_\pm) = \; : e^{-i q^\pm_\mu X^\mu_\pm(z_\pm)} :, \tag{3.33}$$

where single-valuedness restricts their momenta to

$$q^\pm_\mu = \frac{1}{\sqrt{2}} \left( q_\mu \pm d^\pm_{\mu\nu} v^\nu \right) \quad \text{with } \{q_\mu\} \in \Gamma^*, \; \{v^\nu\} \in \Gamma \tag{3.34}$$

so that $(q^+, q^-) \in \Lambda$. The complete, left-right symmetric local vertex operators of the conformal field theory are

$$V_{q^+ q^-}(z_+, z_-) = c_{q^+ q^-}(p^+, p^-)\, V_{q^+}(z_+) V_{q^-}(z_-) = c_{q^+ q^-}(p^+, p^-) : e^{-i q^+_\mu X^\mu_+(z_+) - i q^-_\mu X^\mu_-(z_-)} : . \tag{3.35}$$

The operator-valued phases

$$c_{q^+ q^-}(p^+, p^-) = (-1)^{q_\mu w^\mu} \tag{3.36}$$

are 2-cocycles of the lattice algebra generated by the complexification $\Lambda^c = \Lambda \otimes_{\mathbb{Z}} \mathbb{C}$, and they are inserted to correct the algebraic transformation properties (both gauge and conformal) of the vertex operators. They also enable the vertex operator construction of affine Kac-Moody algebras [37].

The algebraic properties of the local vertex operators can be described as follows. Consider the $n$-fold Heisenberg-Weyl operator algebra spanned by the oscillator modes,

$$\hat{h}_\pm = \left\{ q^\pm_\mu \alpha^{(\pm)\mu}_m \;\middle|\; (q^+, q^-) \in \Lambda^c , \; m \in \mathbb{Z} \right\} \tag{3.37}$$

and the algebra of polynomials on the bosonic creation operators which is the symmetric vector space

$$S(\hat{h}^{(-)}_\pm) = \bigoplus_{k>0} \left\{ \prod_{i=1}^{k} q^{(i)\pm}_\mu \alpha^{(\pm)\mu}_{-m_i} \;\middle|\; (q^{(i)+}, q^{(i)-}) \in \Lambda , \; m_i > 0 \right\}. \tag{3.38}$$

Next we consider the group algebra $\mathbb{C}[\Lambda] = \mathbb{C}[\Lambda]^+ \times \mathbb{C}[\Lambda]^-$ of the complexified lattice $\Lambda^c$, which is the abelian group generated by the translation operators $e^{i q^\pm_\mu x^\mu_\pm}$, $(q^+, q^-) \in \Lambda$. Because of the inclusion of the lattice cocycles in the definition of the complete vertex operators (3.35), we "twist" these group algebra generators by defining the operators

$$\varepsilon_{q^+ q^-} \equiv e^{i q^+_\mu x^\mu_+ + i q^-_\mu x^\mu_-}\, c_{q^+ q^-}(p^+, p^-) \tag{3.39}$$

The operators (3.39) generate the clock algebra

$$\varepsilon_{q^+ q^-}\, \varepsilon_{r^+ r^-} = (-1)^{\langle q, r \rangle_\Lambda}\, \varepsilon_{r^+ r^-}\, \varepsilon_{q^+ q^-} \tag{3.40}$$

with the 2-cocycles $(-1)^{\langle q, r \rangle_\Lambda}$ in the lattice algebra generated by $\Lambda^c$. Notice that the 2-cocycles (3.36) are maps $c : \Lambda^c \oplus \Lambda^c \to \mathbb{Z}_2$. This means that the twisted operators (3.39) generate the group algebra of the double cover $\hat{\Lambda}^c = \mathbb{Z}_2 \times \Lambda^c$ of the complexified lattice $\Lambda^c$. Thus taking the twisted group algebra $\mathbb{C}\{\Lambda\}$ generated
by the operators $\varepsilon_{q^+ q^-}$, the vertex operators (3.35) are formally endomorphisms of the $\mathbb{Z}$-graded Fock space of operators

$$\hat{\mathcal{H}}_X(\Lambda) = \mathbb{C}\{\Lambda\} \otimes S(\hat{h}^{(-)}_+) \otimes S(\hat{h}^{(-)}_-) \tag{3.41}$$

which defines the space of twisted endomorphisms of the Hilbert space (3.10). The oscillator modes $\alpha^{(\pm)\mu}_k$ for $k < 0$ act in the latter tensor products of (3.41) by multiplication, and for $k > 0$ their (adjoint) action is defined by (3.9). The zero mode operators $\alpha^{(\pm)\mu}_0$ act on $\mathbb{C}\{\Lambda\}$ by $\alpha^{(\pm)\mu}_0 \varepsilon_{q^+ q^-} = g^{\mu\nu} q^\pm_\nu\, \varepsilon_{q^+ q^-}$ while the action of $\varepsilon_{q^+ q^-}$ on the twisted group algebra is given by (3.40). To a typical homogeneous element

$$\Psi = \varepsilon_{q^+ q^-} \otimes \prod_j r^{(j)+}_\mu \alpha^{(+)\mu}_{-n_j} \otimes \prod_k r^{(k)-}_\nu \alpha^{(-)\nu}_{-m_k} \tag{3.42}$$

of $\hat{\mathcal{H}}_X(\Lambda)$ with $(q^+, q^-), (r^{(i)+}, r^{(i)-}) \in \Lambda$, we associate the higher-spin vertex operator

$$V(\Psi; z_+, z_-) \equiv V^\Omega_{q^+ q^-}(z_+, z_-) = \; : i\, V_{q^+ q^-}(z_+, z_-) \prod_j \frac{r^{(j)+}_\mu}{(n_j - 1)!}\, \partial^{n_j}_{z_+} X^\mu_+ \prod_k \frac{r^{(k)-}_\nu}{(m_k - 1)!}\, \partial^{m_k}_{z_-} X^\nu_- :, \tag{3.43}$$

where $\Omega = \{(r^{(i)+}, r^{(i)-}); n_j, m_k\}$ labels the fields. Extending this by linearity we obtain a well-defined one-to-one mapping from $\hat{\mathcal{H}}_X(\Lambda)$ into the algebra of endomorphism-valued Laurent power series in the variables $z_\pm$,

$$\hat{\mathcal{H}}_X(\Lambda) \to \left( \mathrm{End}\, \hat{\mathcal{H}}_X(\Lambda) \right) [z^{\pm 1}_+, z^{\pm 1}_-] \tag{3.44}$$

which defines the space of quantum conformal fields of the sigma-model. The coefficients of the monomials in the variables $z_\pm$, that arise in an expansion of products of the basis vertex operators

$$V^{(R)}_{[q^+ q^-]}[z_+, z_-] = \; : \prod_{i=1}^{R} V_{q^{(i)+} q^{(i)-}}(z^{(i)}_+, z^{(i)}_-) : \tag{3.45}$$

in terms of Schur polynomials, span the linear space (3.41) as $R$ and the momenta $(q^{(i)+}, q^{(i)-}) \in \Lambda$ are varied. The complete set of relations of the conformal field algebra is therefore determined by the operator product expansion formula for the operators (3.45), which is the local cocycle relation

$$\begin{aligned} V^{(R)}_{[q^+ q^-]}[z_+, z_-]\, V^{(S)}_{[r^+ r^-]}[w_+, w_-] = {} & V^{(S)}_{[r^+ r^-]}[w_+, w_-]\, V^{(R)}_{[q^+ q^-]}[z_+, z_-] \\ & \times \prod_{1 \le (i,j) \le (R,S)} \exp\left\{ -i\pi g^{\mu\nu} q^{(i)+}_\mu r^{(j)+}_\nu\, \mathrm{sgn}\big( \arg z^{(i)}_+ - \arg w^{(j)}_+ \big) \right\} \\ & \times \exp\left\{ -i\pi g^{\mu\nu} q^{(i)-}_\mu r^{(j)-}_\nu\, \mathrm{sgn}\big( \arg z^{(i)}_- - \arg w^{(j)}_- \big) \right\}. \end{aligned} \tag{3.46}$$

The fundamental vertex operators (3.35) are primary fields of both the Kac-Moody and Virasoro algebras of the sigma-model. Specifically, $V_{q^+ q^-}(z_+, z_-)$ transform as local
fields of $U(1)^n_\pm$ charges $q^\pm_\mu$ under the local gauge transformations generated by the Kac-Moody currents,

$$\left[ \alpha^{(\pm)\mu}_k , V_{q^+ q^-}(z_+, z_-) \right] = g^{\mu\nu} q^\pm_\nu\, z^k_\pm\, V_{q^+ q^-}(z_+, z_-). \tag{3.47}$$

Moreover,

$$\left[ L^\pm_k , V_{q^+ q^-}(z_+, z_-) \right] = \left( z^{k+1}_\pm \partial_{z_\pm} + (k + 1) \Delta_{q^\pm} z^k_\pm \right) V_{q^+ q^-}(z_+, z_-) \tag{3.48}$$

so that the local vertex operators (3.35) also transform under conformal transformations as tensors of weight $\Delta_{q^\pm} = \frac{1}{2} g^{\mu\nu} q^\pm_\mu q^\pm_\nu$. However, (3.47) holds only at $k = 0$ and (3.48) only for $k = 0$ and $k = -1$ in general for the higher-spin vertex operators $V^\Omega_{q^+ q^-}(z_+, z_-)$. Thus the general operators (3.43) also have $U(1)^n_\pm$ charges $q^\pm_\mu$ and scaling dimensions

$$\Delta^\Omega_{q^+} = \frac{1}{2} g^{\mu\nu} q^+_\mu q^+_\nu + \sum_j n_j, \qquad \Delta^\Omega_{q^-} = \frac{1}{2} g^{\mu\nu} q^-_\mu q^-_\nu + \sum_k m_k. \tag{3.49}$$

The general transformation property (3.48) holds for the vertex operators (3.43) which are primary fields of the sigma-model (of weights $\Delta^\Omega_{q^\pm}$), i.e. the endomorphisms acting on the subspaces

$$\hat{\mathcal{P}}_{\Delta_q}(\Lambda) \equiv \left\{ \Psi \in \hat{\mathcal{H}}_X(\Lambda) \;\middle|\; L^\pm_0 \Psi = \Delta_{q^\pm} \Psi , \; L^\pm_k \Psi = 0 \;\; \forall k > 0 \right\} \tag{3.50}$$

of highest-weight operators in $\hat{\mathcal{H}}_X(\Lambda)$. The space (3.41) is then a direct sum of the irreducible highest-weight representations of the Virasoro algebra acting in the subspaces (3.50) and labelled by the conformal weights $\Delta^\Omega_{q^\pm}$. Similarly, one can decompose (3.41) into irreducible highest-weight representations of the current algebra acting in subspaces of definite $U(1)^n_\pm$ charges $q^\pm_\mu$.

In fact, in addition to defining the grading of the space $\hat{\mathcal{H}}_X(\Lambda)$, the conformal dimension and spacetime momentum eigenvalues grade the Hilbert space (3.10). This follows from the operator-state correspondence which relates the states

$$|\varphi^\Omega_{q^+ q^-}\rangle = |q^+; q^-\rangle \otimes \prod_j r^{(j)+}_\mu \alpha^{(+)\mu}_{-n_j} |0\rangle_+ \otimes \prod_k r^{(k)-}_\nu \alpha^{(-)\nu}_{-m_k} |0\rangle_- \in \mathcal{H}_X \tag{3.51}$$

to the higher-spin vertex operators (3.43), where $\Omega$ labels the quantum numbers of the states. For example, the spin-0 tachyon state corresponds to the basis vertex operators themselves. In this case, the basis primary fields correspond to the Hilbert space states

$$|q^+; q^-\rangle \otimes |0\rangle_+ \otimes |0\rangle_- = \lim_{z_\pm \to 0} V_{q^+ q^-}(z_+, z_-) |\mathrm{vac}\rangle \in L^2\Big((S^1)^n, \prod_{\mu=1}^n \tfrac{dx^\mu}{\sqrt{2\pi}}\Big). \tag{3.52}$$

The (primary) tachyon vectors (3.52) are eigenstates of $\alpha^{(\pm)\mu}_0$ with $U(1)^n_\pm$ charge eigenvalues $q^\pm_\mu$ and of $L^\pm_0$ with conformal weight eigenvalues $\Delta_{q^\pm}$. Likewise, the general states (3.51) have spacetime momentum eigenvalues $q^\pm_\mu$ and conformal dimension eigenvalues (3.49). An important state is the graviton state which corresponds to the minimal spin-2 operator

$$V^{\mu\nu}_{q^+ q^-}(z_+, z_-) = \; : i\, V_{q^+ q^-}(z_+, z_-)\, \partial_{z_+} X^\mu_+\, \partial_{z_-} X^\nu_- : . \tag{3.53}$$
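As a short consistency check (a sketch that only combines the scaling dimensions in (3.49) with the quadratic form (3.7), under the lattice conventions stated there), the difference of the left and right scaling dimensions of a general operator (3.43) is an integer:

$$\Big( \tfrac{1}{2} g^{\mu\nu} q^+_\mu q^+_\nu + \sum_j n_j \Big) - \Big( \tfrac{1}{2} g^{\mu\nu} q^-_\mu q^-_\nu + \sum_k m_k \Big) = \tfrac{1}{2} \langle q, q \rangle_\Lambda + \sum_j n_j - \sum_k m_k = q_\mu v^\mu + \sum_j n_j - \sum_k m_k \in \mathbb{Z},$$

since $q_\mu v^\mu \in \mathbb{Z}$ for $\{q_\mu\} \in \Gamma^*$ and $\{v^\nu\} \in \Gamma$. For the operator (3.53), which carries one left and one right derivative ($n_1 = m_1 = 1$), the oscillator contributions cancel and the difference reduces to $q_\mu v^\mu$.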
The operator (3.53) creates a graviton of polarization $(\mu\nu)$ and represents the Fourier modes of the background matrices $d^\pm_{\mu\nu}$, i.e. of the metric $g_{\mu\nu}$ and instanton form $\beta_{\mu\nu}$.
It corresponds to the string state $|q^+; q^-\rangle \otimes \alpha^{(+)\mu}_{-1} |0\rangle_+ \otimes \alpha^{(-)\nu}_{-1} |0\rangle_- \in \mathcal{H}_X$, which is the lowest-lying vector with non-trivial string oscillations and will play an important role in Sect. 5 (see also the appendix). In the general case, the operator-state correspondence is achieved by the action of the smeared vertex operators

$$V_\Omega(q^+, q^-) = \int \frac{dz_+\, dz_-}{4\pi z_+ z_-}\; V^\Omega_{q^+ q^-}(z_+, z_-)\, f_S(z_+, z_-) \in \hat{\mathcal{H}}_X(\Lambda) \tag{3.54}$$

on the vacuum state of the Hilbert space, so that

$$|\varphi^\Omega_{q^+ q^-}\rangle = V_\Omega(q^+, q^-) |\mathrm{vac}\rangle. \tag{3.55}$$

Here $f_S$ is an appropriate Schwartz space test function which smears out the operator-valued distributions $V^\Omega_{q^+ q^-}(z_+, z_-)$ (footnote 2). The operator (3.54) is said to create a string state of type $\Omega$ and momentum $(q^+, q^-) \in \Lambda$. The smeared vertex operators represent spacetime Fourier transformations of the string state operators inserted on the worldsheet. They span the space of primary fields of the conformal sigma model and generate the set of all functionals on the vector space (3.41), i.e. they span the twisted dual of the Hilbert space (3.10). The quantities (3.54) are naturally (non-local) elements of the operator algebra (3.41) and are well-defined operators on (3.10). In fact, they map a dense subspace of $\mathcal{H}_X$ into itself, as follows from (3.55) and the completeness of the primary states (3.51) in $\mathcal{H}_X$. The operator-state correspondence is thus the Fock space mapping between the Hilbert space (3.10) and its twisted endomorphisms (3.41)

$$|\varphi^\Omega_{q^+ q^-}\rangle \leftrightarrow V_\Omega(q^+, q^-) \; : \; \mathcal{H}_X \leftrightarrow \hat{\mathcal{H}}_X(\Lambda) \tag{3.56}$$
Note that this mapping is not one-to-one as there are many smeared vertex operators which can be associated to a given conformal state of the Hilbert space (3.10). Moreover the operators (3.54) are in general not bounded. As we are interested in a $C^*$-algebra, which is necessarily made of bounded operators, one has therefore to consider the algebra generated by the subset of states which give rise to bounded operators.

The algebraic make-up of the vertex operators described above has the formal mathematical structure of a (smeared) vertex operator algebra [12] (see also [13, 14] for concise introductions). A vertex operator algebra is the formal mathematical definition of a chiral algebra in conformal field theory [38], which is defined to be the operator product algebra of (primary) holomorphic fields in the conformal field theory. In this context the Klein transformations (3.36), which in the above were introduced to correct signs in the Kac-Moody and Virasoro commutation relations with the vertex operators, are needed to adjust some signs in the so-called "Jacobi identities" of the vertex operator algebra. In conformal field theory, these Jacobi identities are the precise statement of the Ward identities associated with the conformal invariance on the 3-punctured Riemann sphere. The general properties of vertex operator algebras as they pertain to this paper are briefly discussed in the appendix.

The smeared vertex operators $V_\Omega(q^+, q^-)$ generate a noncommutative unital $*$-algebra $\mathcal{A}_X$ which contains two Virasoro and two Kac-Moody subalgebras. The operator-state correspondence then implies that

$$\mathcal{A}_X \mathcal{H}_X = \mathcal{H}_X. \tag{3.57}$$

Footnote 2: For ease of notation in the following, we suppress the explicit dependence on the test function.
The noncommutativity of $\mathcal{A}_X$ is expressed in terms of the cocycle relation (3.46) which determines the complete set of relations between the smeared vertex operators. As a vertex operator algebra, the identity (3.46) in fact leads immediately to the Jacobi identity, which in turn encodes many other non-trivial algebraic relations among the elements of $\mathcal{A}_X$. The complicated nature of the 3-term Jacobi relation among the elements of the vertex operator algebra $\mathcal{A}_X$ is what distinguishes the string spacetime not only from a classical (commutative) spacetime, but also from the conventional examples of noncommutative spaces. The full set of relations of the vertex operator algebra $\mathcal{A}_X$ are presented in a more general context in the appendix, where it is also discussed how the algebraic structure of $\mathcal{A}_X$ leads to a construction of the corresponding string spacetime along the lines described in Sect. 2. In particular, they illustrate the complicated nature of the string spacetime determined by the algebra $\mathcal{A}_X$, and how more general string spacetimes constructed from other conformal field theories arise from the general structure of vertex operator algebras.

Note that the chiral and anti-chiral vertex operators (3.33) generate local chiral algebras $\mathcal{E}^\pm$ whose products combine into the full algebra $\mathcal{E} = \mathcal{E}^+ \otimes \mathcal{E}^-$ corresponding to the usual decomposition of the operator product algebra of primary holomorphic fields in conformal field theory. As mentioned before, the twisting of this chirally-symmetric algebraic structure by the cocycle factors (3.36) is required to compensate signs in the Jacobi identity for $\mathcal{A}_X$. But we shall see that the non-trivial duality structure of the effective string spacetime is represented essentially as a left-right chirality symmetry between the Dirac operators introduced above. The twistings then yield more complicated duality transformations, and, as we will discuss in Sect. 5, at certain special points in the quantum moduli space of the sigma-model the chiral symmetry is restored and the algebra $\mathcal{E}$ controls the structure of the quantum spacetime.

4. Quantum Geometry of Toroidal Spacetimes

We shall now begin applying the formalism developed thus far to a systematic analysis of the geometrical properties of the string spacetime. With the above constructions, we obtain the spectral triple

$$\mathcal{T} \equiv (\mathcal{A} , \mathcal{H} , D) \tag{4.1}$$

associated to the sigma-model with target space the $n$-torus $(S^1)^n$ with metric $g_{\mu\nu}$ and torsion form $\beta_{\mu\nu}$. The triple (4.1) encodes the effective target space geometry of the linear sigma-model, i.e. the moduli space of this class of conformal field theories. Here $D$ is an appropriately defined Dirac operator on the spin-extended Hilbert space $\mathcal{H}$ which encodes the effective geometry and topology of the string spacetime described by (4.1). It will be constructed below from the two Dirac operators introduced in the previous section. The algebra $\mathcal{A} \equiv I \otimes \mathcal{A}_X$ acts trivially on the spinor part of $\mathcal{H}$ (footnote 3).

Footnote 3: An algebraic, field-theoretical construction of (4.1) can be naturally given as a low-energy limit of the canonical spectral triple associated with the $N = 1$ superconformal field theory (3.19). There we take the Hilbert space (3.22), the algebra generated by the superconformal primary fields corresponding to states of $\mathcal{H}_{X,\psi}$, and a Dirac operator $Q$ constructed from the worldsheet supersymmetry generators (3.20). The resulting unital $*$-algebra is described as a vertex operator superalgebra. The projection onto fermionic zero-modes gives $\mathcal{H}^{(0)}_R$, the Dirac operator $\mathcal{P}^{(0)}_R Q \mathcal{P}^{(0)}_R$, and the algebra $\mathcal{A} \mathcal{P}^{(0)}_R$, where $\mathcal{A} = I \otimes \mathcal{A}_X$ is generated by the primary conformal fields corresponding to states of the Neveu-Schwarz sector. This is the construction that has been elucidated in [7, 8, 9, 34].

As we have pointed out in Sect. 2, different selections of $D$ lead to different geometries for the
string spacetime. Below we shall examine the properties of the string spacetime with appropriate selections of this operator. The topology and differentiable structure of the spacetime is encoded via the complicated, noncommutative vertex operator algebra $\mathcal{A}_X$ that we described in the previous section. The effective spacetime of the strings is in this sense horribly complicated. The smearing of the vertex operators effectively achieves the non-locality property of noncommuting spacetime coordinate fields.

We can get some insight into the various symmetries of the stringy geometries by viewing the emergence of ordinary spacetime (i.e. one with a commutative geometry) as a low-energy limit of the quantum spacetime of the strings. The low-energy limit of the sigma-model is the limit wherein the oscillatory modes $\alpha^{(\pm)\mu}_k$ of the string vanish, leaving only the particle-like, center of mass degrees of freedom $x^\mu_\pm$, $p^\pm_\mu$. Thus in the low-energy limit of the string theory, the spectral triple (4.1) should contain a subspace

$$\mathcal{T}_0 = \left( C^\infty(T^n) \, , \, L^2(\mathrm{spin}(T^n)) \, , \, i g^{\mu\nu} \gamma_\mu \partial_\nu \right) \tag{4.2}$$

which represents the ordinary spacetime manifold $T^n$ at large distance scales. We shall see that these commutative spacetimes are indeed small subspaces of the larger string spacetime given by the spectral triple $\mathcal{T}$. The various string theoretic symmetries that determine the quantum moduli space of the toroidal sigma-models will then appear naturally via the possibility of introducing more than one Dirac operator $D$ corresponding to isometries of $\mathcal{T}$.

The moduli space of linear sigma-models is determined by the symmetries of the lattice $\Gamma$ and its dual $\Gamma^*$ which define the toroidal spacetime. Thus the "classical" moduli space is the right coset manifold [1]

$$\mathcal{M}_{cl} = O(n, n)/(O(n) \times O(n)). \tag{4.3}$$
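As a quick consistency check on (4.3), one can count dimensions. This is a sketch that uses only the standard dimensions $\dim O(p, q) = \frac{1}{2}(p+q)(p+q-1)$ of the orthogonal groups:

$$\dim \mathcal{M}_{cl} = \dim O(n, n) - 2 \dim O(n) = n(2n - 1) - n(n - 1) = n^2 = \frac{n(n+1)}{2} + \frac{n(n-1)}{2}.$$

The last expression is the number of independent components of a symmetric metric $g_{\mu\nu}$ plus those of an antisymmetric two-form $\beta_{\mu\nu}$, i.e. the $n^2$ entries of either background matrix $d^\pm_{\mu\nu} = g_{\mu\nu} \pm \beta_{\mu\nu}$ of (3.4).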
The homogeneous space (4.3) is isomorphic to the Grassmannian $\mathrm{Gr}(n, n)$ of maximal positive subspaces of $\mathbb{R}^{n,n}$ which is parametrized by the constant $n \times n$ matrix $d^+_{\mu\nu}$ (equivalently $d^-_{\mu\nu}$). In the following we shall see how the appropriate quantum string modification of the moduli space (4.3) appears naturally in the framework of the noncommutative geometry of the sigma-model.

We will see that the discrete group of automorphisms of the quantum sigma-model, the so-called "duality group", can be readily identified by viewing the string spacetime as the spectral triple (4.1). By construction, this point of view automatically establishes the equivalence between the original quantum field theory and its dual (i.e. that the correlation functions of the models are also equivalent to each other), this being encoded in the quantum field algebra $\mathcal{A}$. The spacetime duality maps are, by definition, those which lead to isomorphisms between inequivalent low-energy spectral triples (4.2). As we will show, they emerge from the possibility of defining the two independent (smeared) Dirac operators

$$\begin{aligned} D\!\!\!\!/ &= \frac{1}{\sqrt{2}} \int \frac{dz_+\, dz_-}{4\pi z_+ z_-} \left( D\!\!\!\!/^{\,+}(z_+) + D\!\!\!\!/^{\,-}(z_-) \right) f_S(z_+, z_-), \\ \bar{D\!\!\!\!/} &= \frac{1}{\sqrt{2}} \int \frac{dz_+\, dz_-}{4\pi z_+ z_-} \left( D\!\!\!\!/^{\,+}(z_+) - D\!\!\!\!/^{\,-}(z_-) \right) f_S(z_+, z_-) \end{aligned} \tag{4.4}$$

in the spectral triple (4.1). The main point is that there exist several unitary transformations $T : \mathcal{H} \to \mathcal{H}$ with

$$\bar{D\!\!\!\!/} = T\, D\!\!\!\!/\; T^{-1} \tag{4.5}$$
which define automorphisms of the vertex operator algebra, i.e. $T \mathcal{A} T^{-1} = \mathcal{A}$. This then immediately leads to the isomorphism of spectral triples

$$\mathcal{T}_{D\!\!\!\!/} \equiv (\mathcal{A} , \mathcal{H} , D\!\!\!\!/\,) \cong (\mathcal{A} , \mathcal{H} , \bar{D\!\!\!\!/}\,) \equiv \mathcal{T}_{\bar{D\!\!\!\!/}}. \tag{4.6}$$

The isomorphism (4.6) is a special case of the spectral action principle [11] which describes Riemannian manifolds which are isospectral (i.e. their Dirac K-cycles have the same spectrum) but not necessarily isometric. It states that the noncommutative string spacetime determined by the spectral triple $\mathcal{T}$ with $D = D\!\!\!\!/$ is identical to that defined with $D = \bar{D\!\!\!\!/}$. As such a change of Dirac operator in noncommutative geometry simply represents a change of metric structure on the spacetime, the isomorphism (4.6) is simply the statement of general covariance of the noncommutative string spacetime represented as an isometry of $\mathcal{T}$. From this point of view, target space duality can be represented symbolically by the commutative diagram

$$\begin{array}{ccc} \mathcal{T}_{D\!\!\!\!/} & \stackrel{T}{\longrightarrow} & \mathcal{T}_{\bar{D\!\!\!\!/}} \cong \mathcal{T}_{D\!\!\!\!/} \\ {\scriptstyle \mathcal{P}_0} \downarrow & & \downarrow {\scriptstyle \bar{\mathcal{P}}_0} \\ \mathcal{T}_0 & \stackrel{T_0}{\longrightarrow} & \bar{\mathcal{T}}_0 \end{array} \tag{4.7}$$

The operators $\mathcal{P}_0$ and $\bar{\mathcal{P}}_0$ project the full spectral triples onto their respective low-energy subspaces $\mathcal{T}_0$ and $\bar{\mathcal{T}}_0$ (these projections will be defined formally below). The triple $\mathcal{T}_0$ is the commutative spacetime (4.2) while $\bar{\mathcal{T}}_0$ represents a duality transformed commutative spacetime. From the point of view of classical general relativity, these two spacetimes are inequivalent. However, the isomorphism $T$ in the top line of (4.7) makes the diagram commutative, so that $T_0 \mathcal{P}_0 = \bar{\mathcal{P}}_0 T$ and $T_0$ is an isomorphism of subspaces of the noncommutative spacetime. Thus as subspaces of the full quantum spacetime (4.1), the commutative spacetimes $\mathcal{T}_0$ and $\bar{\mathcal{T}}_0$ are equivalent. This is the essence of duality and the stringy modification of classical general relativity.

T-duality, low-energy projections and spin structures. The first symmetry of the string spacetime that we explore is T-duality. In terms of the isomorphism (4.6), it is the unitary mapping $T \equiv T_S \otimes T_X : \mathcal{H} \to \mathcal{H}$ in (4.5) which is defined as follows (footnote 4). $T_X$ acts trivially on the spinor part of $\mathcal{H}$ and on the bosonic Hilbert space $\mathcal{H}_X$ itself it is defined by

$$\begin{aligned} T_X\, |p^+; p^-\rangle \otimes |0\rangle_+ \otimes |0\rangle_- &= c_{p^+ p^-}(p^+, p^-)\, |(d^+)^{-1} p^+; -(d^-)^{-1} p^-\rangle \otimes |0\rangle_+ \otimes |0\rangle_-, \\ T_X\, \alpha^{(\pm)\mu}_k\, T_X^{-1} &= \pm g_{\nu\lambda} (d^\mp)^{\mu\nu} \alpha^{(\pm)\lambda}_k . \end{aligned} \tag{4.8}$$
Footnote 4: In the next section we shall present explicit operator expressions for the duality transformations $T$.

$T_S$ acts trivially on $\mathcal{H}_X$ and on the generators of the spin bundle as

$$T_S\, \gamma^\pm_\mu\, T_S^{-1} = g^{\nu\lambda} d^\mp_{\mu\nu} \gamma^\pm_\lambda . \tag{4.9}$$

Thus the operator $T$ simply redefines the bosonic basis of the Hilbert space, and it also changes the choice of spin structure on the target space by defining a different representation of the double Clifford algebra (3.14). This latter property implies that the target space metric $g_{\mu\nu}$ changes to its dual

$$\tilde{g}^{\mu\nu} = (d^+)^{\mu\lambda} g_{\lambda\rho} (d^-)^{\rho\nu}, \tag{4.10}$$
which defines an inner product on the dual lattice $\Gamma^*$. The quadratic form (3.7) of the lattice $\Lambda$ is invariant under these transformations. The Hilbert space (3.15) is thus invariant under this mapping between the two Dirac operators (4.4), as is the Hamiltonian (3.28) (and hence the spectrum of the theory). Furthermore, the action of $T$ on the smeared vertex operator algebra is given by

$$T_X\, V_\Omega(q^+, q^-)\, T_X^{-1} = c_{q^+ q^-}(p^+, p^-)\, V_\Omega\big( q^+ (d^+)^{-1}, -q^- (d^-)^{-1} \big) \tag{4.11}$$

which amounts to simply producing the same type of vertex operator with a redefined twisted momentum in $\hat{\Lambda}^c$. The algebra $\mathcal{A} = T \mathcal{A} T^{-1}$ is thus invariant under the unitary mapping $T$.

The transformation $T$ defined by (4.8)–(4.11) which yields an explicit realization of the isomorphism (4.6) is the noncommutative geometry version of the celebrated 'T-duality' transformation of string theory which exchanges the torus $T^n$ with its dual $(T^n)^*$, and at the same time interchanges momenta and winding numbers in the spectrum of the compactified string theory. It corresponds to an inversion of the background matrices $d^\pm \to (d^\pm)^{-1}$ and is the $n$-dimensional analog of the $R \to 1/R$ circle duality [1]. From the point of view of noncommutative geometry, T-duality thus appears quite naturally as a very simple geometric invariance property of the noncommutative spacetime. As we will see, the choice and independence of spin structure in the string theory is also intimately related to the T-duality symmetry.

We now describe the low-energy projections in detail and explore the consequences of the above duality map in this sector of the string spacetime. Consider the subspace

$$\bar{\mathcal{H}}_0 \equiv \ker D\!\!\!\!/ \; \cong \bigotimes_{\mu=1}^{n} \left( \bar{\mathcal{H}}^{(+)\mu}_0 \oplus \bar{\mathcal{H}}^{(-)\mu}_0 \right), \tag{4.12}$$
where
n o − H¯ 0(+)µ = |ψi ⊗ |p+ ; p− i ∈ H¯ 0 | g νλ d+λµ γν+ |ψi = g νλ d− γ |ψi , p = 0 , µ ν λµ (4.13) H¯ 0(−)µ = |ψi ⊗ |p+ ; p− i ∈ H¯ 0 | γµ+ |ψi = −γµ− |ψi , wµ = 0 .
In $\bar{\mathcal H}_0 \subset \mathcal H$ all oscillator modes are suppressed, leaving only the center of mass zero modes of the strings. Each of the $2n$ smaller subspaces (4.13) represents a particular choice of chiral or anti-chiral representation of the double Clifford algebra $\mathcal C(T^n) = \mathcal C(T^n)^+ \oplus \mathcal C(T^n)^-$, respectively (see (3.17)). Each such representation in turn encodes a particular choice of one of the $2^n$ possible spin structures of the torus. The Hilbert space $\bar{\mathcal H}_0$ contains the subspace of highest-weight vectors which belong to complex-conjugate pairs of left-right representations of the $u(1)^n_+ \oplus u(1)^n_-$ current algebra. The $2^n$ subspaces in (4.12) are all naturally isomorphic to the canonical anti-chiral subspace
$$\bar{\mathcal H}_0^{(-)} = \bar{\mathcal H}_0^{(-)1} \otimes \bar{\mathcal H}_0^{(-)2} \otimes \cdots \otimes \bar{\mathcal H}_0^{(-)n}. \tag{4.14}$$
The explicit isomorphism maps $\bar{\mathcal H}_0^{(+)\mu} \leftrightarrow \bar{\mathcal H}_0^{(-)\mu}$ by $g^{\nu\lambda} d^\pm_{\lambda\mu} \gamma^\pm_\nu \leftrightarrow \pm\gamma^\pm_\mu$ and $g^{\mu\nu} p_\nu \leftrightarrow w^\mu$ (or equivalently $g^{\mu\nu} p^\pm_\nu \leftrightarrow \pm (d^\pm)^{\mu\nu} p^\pm_\nu$). These isomorphisms themselves determine a sort of partial T-duality transformation that exchanges $m \leq n$ momenta with winding numbers. We will see below that they are related to another type of duality called ‘factorized duality’. Thus the various chiral and anti-chiral subspaces (4.13) are all naturally isomorphic to one another under such “partial” T-duality transformations. Here we shall
Duality Symmetries and Noncommutative Geometry of String Spacetimes
make the canonical choice of anti-chiral low-energy subspace (4.14) corresponding to the representation of the double Clifford algebra for which $\gamma^+_\mu = -\gamma^-_\mu \equiv \gamma_\mu$ $\forall\mu$. The isomorphisms above demonstrate explicitly the independence of quantities on the choice of spin structure for the spacetime. It is intriguing that a change of spin structure is manifested as a T-duality symmetry of the string theory. In the subspace $\bar{\mathcal H}_0^{(-)}$, we have $w^\mu = 0$ $\forall\mu$, as required since in the low-energy sector there should be no global winding modes of the string around the compactified directions of the target space. It follows that $\sqrt{2}\, p^+_\mu = \sqrt{2}\, p^-_\mu = p_\mu \in \Gamma^*$, and thus the subspace (4.14) is naturally isomorphic to the Hilbert space
$$\bar{\mathcal H}_0^{(-)} \cong \varrho_-[\mathcal C(T^n)] \otimes L^2\Big( (S^1)^n \,,\ \prod_{\mu=1}^{n} \frac{dx^\mu}{\sqrt{2\pi}} \Big) \tag{4.15}$$
which is (a local trivialization of) the bundle of anti-chiral square-integrable spinors on the torus. Here $\varrho_-$ denotes the anti-chiral spinor representation and the explicit $L^2$-isomorphism in (4.15) maps $|p; p\rangle \leftrightarrow e^{-i p_\mu x^\mu}$ in the restriction to the $L^2$-component of winding number 0. The anti-holomorphic Dirac operator $\bar{\slashed{D}}$ acting in (4.15) is
$$\bar{\slashed{D}}\, \bar P_0^{(-)} = i g^{\mu\nu} \gamma_\mu \otimes \frac{\partial}{\partial x^\nu}, \tag{4.16}$$
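where $\bar P_0^{(-)}$ is the operator that projects $\mathcal H$ orthogonally onto $\bar{\mathcal H}_0^{(-)}$, and we have defined $x^\mu = \frac{1}{\sqrt{2}} (x^\mu_+ + x^\mu_-)$. In particular, $\bar{\mathcal H}_0^{(-)}$ is an invariant subspace of the Dirac operator, $\bar{\slashed{D}}\, \bar{\mathcal H}_0^{(-)} = \bar{\mathcal H}_0^{(-)}$, and the full low-energy Hilbert space (4.12) is a maximal subspace of $\mathcal H$ with this invariance property.

Next we need a corresponding low-energy projection $\bar{\mathcal A}_0$ of the vertex operator algebra $\mathcal A$. We define $\bar{\mathcal A}_0$ to be the commutant of the Dirac operator restricted to the Hilbert space $\bar{\mathcal H}_0$,
$$\bar{\mathcal A}_0 = \bar P_0\, (\mathrm{comm}\, \slashed{D})\, \bar P_0 \equiv \{ V \in \mathcal A \mid [\slashed{D}, V]\, \bar P_0 = 0 \}. \tag{4.17}$$
It is the largest subalgebra of $\mathcal A$ with the property
$$\bar{\mathcal A}_0\, \bar{\mathcal H}_0 = \bar{\mathcal H}_0. \tag{4.18}$$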
To describe the restriction of $\bar{\mathcal A}_0$ to $\bar{\mathcal H}_0^{(-)}$, consider a typical homogeneous smeared vertex operator $V(q^+, q^-) \in \mathcal A_X$ of type and momentum $(q^+, q^-) \in \Lambda$. Since
$$\bar P_0^{(-)} \left[ \slashed{D} \,,\ I \otimes V(q^+, q^-) \right] \bar P_0^{(-)} = g^{\mu\nu} \gamma_\mu \otimes \left( q^+_\nu - q^-_\nu \right) \bar P_0^{(-)}\, V(q^+, q^-)\, \bar P_0^{(-)}, \tag{4.19}$$
it follows that the subalgebra $\bar P_0^{(-)} \bar{\mathcal A}_0 \bar P_0^{(-)}$ consists of those vertex operators which create string states of identical left and right chiral momentum, i.e. $\sqrt{2}\, q^+_\mu = \sqrt{2}\, q^-_\mu = q_\mu \in \Gamma^*$, which again agrees heuristically with the zero winding number restriction of the low-energy sector. For the basis (tachyon) vertex operators of $\mathcal A$ we have in general that
$$\bar P_0 \left( I \otimes V_{q^+ q^-}(1, 1) \right) |\psi; p^+, p^-\rangle = |\psi; q^+ + p^+, q^- + p^-\rangle. \tag{4.20}$$
It follows that $V(q, q)$ generate $\bar P_0^{(-)} \bar{\mathcal A}_0 \bar P_0^{(-)}$ and, in particular, we have
$$\left( I \otimes V_{qq}(z_+, z_-) \right) \bar P_0^{(-)} = e^{-i q_\mu x^\mu} \left( z_+ z_- \right)^{q_\mu g^{\mu\nu} p_\nu / 2}. \tag{4.21}$$
Thus the smeared tachyon generators of $\bar P_0^{(-)} \bar{\mathcal A}_0 \bar P_0^{(-)}$ coincide with the spacetime functions $e^{-i q_\mu x^\mu}$ which constitute a basis for the algebra $C^\infty(T^n)$ of smooth (single-valued) functions on the toroidal target space. The low-energy algebra $\bar{\mathcal A}_0$ therefore yields a natural isomorphism with the abelian algebra
$$\bar P_0^{(-)}\, \bar{\mathcal A}_0\, \bar P_0^{(-)} \cong C^\infty(T^n). \tag{4.22}$$
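The abelian structure behind (4.22) can be illustrated with a small toy model. The sketch below is not taken from the paper: it assumes that a basis has been chosen in which $\Gamma^*$ is identified with $\mathbb{Z}^n$, represents a function on $T^n$ by its finitely many nonzero Fourier coefficients, and uses the hypothetical helper names `basis` and `multiply`. It only checks that multiplying the modes $e^{-i q_\mu x^\mu}$ adds their momenta, as in (4.20), and that the resulting product is commutative.

```python
from collections import defaultdict

def basis(q):
    """Fourier mode exp(-i q.x), stored as {momentum: coefficient}.

    Assumption: momenta are integer tuples, i.e. Gamma^* is identified with Z^n.
    """
    return {tuple(q): 1}

def multiply(f, h):
    """Pointwise product of two functions given by their Fourier coefficients.

    Multiplying modes adds their momenta, so the product is a convolution.
    """
    out = defaultdict(int)
    for q1, c1 in f.items():
        for q2, c2 in h.items():
            q = tuple(a + b for a, b in zip(q1, q2))
            out[q] += c1 * c2
    # Drop modes whose coefficients cancelled.
    return {q: c for q, c in out.items() if c != 0}

# Two basis modes on T^2: momenta add under multiplication.
product = multiply(basis((1, 0)), basis((2, -1)))

# Two generic trigonometric polynomials: the product does not depend on the order.
f = {(1, 0): 2, (0, 1): -1}
h = {(0, 0): 1, (1, 1): 3}
fh = multiply(f, h)
hf = multiply(h, f)
```

In this toy picture the zero-winding tachyon generators $V(q,q)$ play the role of `basis(q)`, and commutativity of `multiply` is the statement that the projected algebra is abelian.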
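To summarize then, we have proven that there is a natural isomorphism between the spectral triples:
$$\bar P_0^{(-)}\, \mathcal T_{\bar{\slashed{D}}}\, \bar P_0^{(-)} \equiv \left( \bar P_0^{(-)} \bar{\mathcal A}_0 \bar P_0^{(-)} \,,\ \bar{\mathcal H}_0^{(-)} \,,\ \bar{\slashed{D}}\, \bar P_0^{(-)} \right) \cong \left( C^\infty(T^n) \,,\ L^2(\mathrm{spin}^-(T^n)) \,,\ i g^{\mu\nu} \gamma_\mu \partial_\nu \right). \tag{4.23}$$
This says that the low-energy projection of the spectral triple (4.1) determined by the kernel of the Dirac operator $\slashed{D}$ coincides with the spectral triple that describes the ordinary (commutative) spacetime geometry of the $n$-torus $T^n \cong \mathbb R^n / 2\pi\Gamma$ with metric $g_{\mu\nu}$. Thus the full noncommutative spacetime (4.1) of the string theory can be projected onto an ordinary, commutative spacetime via an explicit choice of Dirac operator. A key feature of these low-energy projections is that the corresponding algebras $\bar{\mathcal A}_0$ consist only of the zero-mode components of the tachyon vertex operators. The full noncommutative spacetime is then built from the highest-weight states of the current algebra which are eigenstates of the Dirac operator $\slashed{D}$,
$$\left[ \slashed{D} \,,\ I \otimes V(q^+, q^-) \right] = g^{\mu\nu} \left( \gamma^+_\mu \otimes q^+_\nu + \gamma^-_\mu \otimes q^-_\nu \right) V(q^+, q^-) \tag{4.24}$$
which is just another feature of the spectral action principle [11].

Now let us treat the second Dirac operator $\bar{\slashed{D}}$ in (4.4) in an analogous way. It yields another low-energy subspace of the Hilbert space $\mathcal H$,
$$\mathcal H_0 \equiv \ker \bar{\slashed{D}} \cong \bigotimes_{\mu=1}^{n} \left[ \mathcal H_0^{(+)\mu} \oplus \mathcal H_0^{(-)\mu} \right], \tag{4.25}$$
where
$$\begin{aligned}
\mathcal H_0^{(+)\mu} &= \left\{ |\psi\rangle \otimes |p^+; p^-\rangle \in \mathcal H_0 \;\middle|\; \gamma^+_\mu |\psi\rangle = \gamma^-_\mu |\psi\rangle \,,\ w^\mu = 0 \right\}, \\
\mathcal H_0^{(-)\mu} &= \left\{ |\psi\rangle \otimes |p^+; p^-\rangle \in \mathcal H_0 \;\middle|\; g^{\nu\lambda} d^+_{\lambda\mu} \gamma^+_\nu |\psi\rangle = -g^{\nu\lambda} d^-_{\lambda\mu} \gamma^-_\nu |\psi\rangle \,,\ p_\mu = 0 \right\}
\end{aligned} \tag{4.26}$$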
define the chiral and anti-chiral subspaces of the kernel of $\bar{\slashed{D}}$. Again the $2^n$ spin structure subspaces in (4.25) are all naturally isomorphic under partial T-duality, and we therefore take the canonical anti-chiral subspace $\mathcal H_0^{(-)} = \mathcal H_0^{(-)1} \otimes \mathcal H_0^{(-)2} \otimes \cdots \otimes \mathcal H_0^{(-)n}$ with the representation of the double Clifford algebra for which $g^{\nu\lambda} d^+_{\lambda\mu} \gamma^+_\nu = -g^{\nu\lambda} d^-_{\lambda\mu} \gamma^-_\nu \equiv \tilde\gamma_\mu$ $\forall\mu$ (which is dual to the anti-chirality condition of the subspace (4.14)). In this subspace we have $p_\mu = 0$ $\forall\mu$ so that $\sqrt{2}\, (d^+)^{\mu\nu} p^+_\nu = -\sqrt{2}\, (d^-)^{\mu\nu} p^-_\nu = w^\mu \in \Gamma$. $\mathcal H_0^{(-)}$ is also naturally isomorphic to the low-energy particle Hilbert space (4.15), with the dual anti-chiral spinor representation $(\varrho_-)^*$, under the identification $|d^+ w; -d^- w\rangle \leftrightarrow e^{-i d^-_{\mu\lambda} g^{\lambda\rho} d^+_{\rho\nu} w^\nu x^\mu}$, and the holomorphic Dirac operator action is given by
$$\slashed{D}\, P_0^{(-)} = i \tilde g^{\mu\nu} \tilde\gamma_\mu \otimes \frac{\partial}{\partial x^\nu}, \tag{4.27}$$
where the dual metric $\tilde g^{\mu\nu}$ is defined by (4.10). Again the full low-energy Hilbert space $\mathcal H_0$ is a maximal invariant subspace for the Dirac operator $\slashed{D}$.

Finally, we define the maximal subalgebra
$$\mathcal A_0 = P_0\, (\mathrm{comm}\, \bar{\slashed{D}})\, P_0 \tag{4.28}$$
of the smeared vertex operator algebra $\mathcal A$ with the property $\mathcal A_0 \mathcal H_0 = \mathcal H_0$. Since
$$P_0^{(-)} \left[ \bar{\slashed{D}} \,,\ I \otimes V(q^+, q^-) \right] P_0^{(-)} = \tilde\gamma_\mu \otimes \left( (d^+)^{\nu\mu} q^+_\nu + (d^-)^{\nu\mu} q^-_\nu \right) P_0^{(-)}\, V(q^+, q^-)\, P_0^{(-)}, \tag{4.29}$$
it follows that the subalgebra $P_0^{(-)} \mathcal A_0 P_0^{(-)}$ consists of smeared vertex operators $V(q^+, q^-)$ with $\sqrt{2}\, (d^+)^{\nu\mu} q^+_\nu = -\sqrt{2}\, (d^-)^{\nu\mu} q^-_\nu = v^\mu \in \Gamma$. $P_0^{(-)} \mathcal A_0 P_0^{(-)}$ is generated by the smeared tachyon vertex operators $V(d^+ v, -d^- v)$ which coincide with the basis spacetime functions $e^{-i \tilde g_{\mu\nu} v^\nu x^\mu}$ of $C^\infty(T^n)^*$. Thus there is also the natural isomorphism of spectral triples,
$$P_0^{(-)}\, \mathcal T_{\slashed{D}}\, P_0^{(-)} \equiv \left( P_0^{(-)} \mathcal A_0 P_0^{(-)} \,,\ \mathcal H_0^{(-)} \,,\ \slashed{D}\, P_0^{(-)} \right) \cong \left( C^\infty(T^n)^* \,,\ L^2(\mathrm{spin}^-(T^n)^*) \,,\ i \tilde g^{\mu\nu} \tilde\gamma_\mu \partial_\nu \right) \tag{4.30}$$
which identifies the low-energy projection of $\mathcal T$ determined by the kernel of the Dirac operator $\bar{\slashed{D}}$ with the commutative spacetime geometry of the dual $n$-torus $(T^n)^* \cong \mathbb R^n / 2\pi\Gamma^*$ with metric $\tilde g^{\mu\nu}$. According to the T-duality symmetry (4.6) of the effective string spacetime, as subspaces of the quantum spacetime we have the isomorphism
$$\left( C^\infty(T^n) \,,\ L^2(\mathrm{spin}^-(T^n)) \,,\ i g^{\mu\nu} \gamma_\mu \partial_\nu \right) \cong \left( C^\infty(T^n)^* \,,\ L^2(\mathrm{spin}^-(T^n)^*) \,,\ i \tilde g^{\mu\nu} \tilde\gamma_\mu \partial_\nu \right) \tag{4.31}$$
which is the usual statement of the T-duality $T^n \leftrightarrow (T^n)^*$ of string theory compactified on an $n$-torus. Notice how the statement that this duality symmetry corresponds to the target space symmetry $g_{\mu\nu} \leftrightarrow \tilde g^{\mu\nu}$ and the symmetry under interchange of momentum and winding numbers in the compactified string spectrum arise very naturally from the point of view of noncommutative geometry. It appears as a discrete $\mathbb Z_2$-symmetry of the noncommutative geometry.

Worldsheet parity. T-duality was shown above to be the linear isomorphism of the string spacetime which relates the anti-chiral low-energy subspaces defined by the pair of Dirac operators (4.4). There are several other symmetries of the string spacetime which also arise from the above construction, represented by the isomorphisms between other pairs of subspaces in (4.12) and (4.25).
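For example, suppose we consider the chiral subspace of (4.25),
$$\mathcal H_0^{(+)} = \mathcal H_0^{(+)1} \otimes \mathcal H_0^{(+)2} \otimes \cdots \otimes \mathcal H_0^{(+)n}. \tag{4.32}$$
In this subspace we have $\gamma^+_\mu = \gamma^-_\mu \equiv \gamma_\mu$ and $w^\mu = 0$ for all $\mu$. The Hilbert space $\mathcal H_0^{(+)}$ is also isomorphic to (4.15), with the chiral spinor representation $\varrho_+$, and the holomorphic Dirac operator action is now
$$\slashed{D}\, P_0^{(+)} = -i g^{\mu\nu} \gamma_\mu \otimes \frac{\partial}{\partial x^\nu} = -\bar{\slashed{D}}\, \bar P_0^{(-)}. \tag{4.33}$$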
Moreover, one finds that again $P_0^{(+)} \mathcal A_0 P_0^{(+)}$ is generated by the smeared tachyon vertex operators $V(q, q) \sim e^{-i q_\mu x^\mu}$. According to the equivalences discussed above, we then have the isomorphism of low-energy spectral triples
$$\left( C^\infty(T^n) \,,\ L^2(\mathrm{spin}^-(T^n)) \,,\ i g^{\mu\nu} \gamma_\mu \partial_\nu \right) \cong \left( C^\infty(T^n) \,,\ L^2(\mathrm{spin}^+(T^n)) \,,\ -i g^{\mu\nu} \gamma_\mu \partial_\nu \right). \tag{4.34}$$
This quantum symmetry of the string spacetime is the worldsheet parity symmetry of string theory. It is the left-right chirality symmetry which reflects the worldsheet spatial coordinate $\sigma \to -\sigma$ and thus acts on the string background as $\beta \to -\beta$, i.e. $d^\pm_{\mu\nu} \to d^\mp_{\mu\nu}$. It acts by interchanging the left and right chirality sectors of the string theory, so that it flips the sign of the Lorentzian quadratic form (3.7) and is thus not an automorphism of the lattice $\Lambda$. In terms of the full spectral triple describing the quantum spacetime, it is the $\mathbb Z_2$-transformation $T \equiv W_S \otimes W_X : \mathcal H \to \mathcal H$ that achieves (4.5) and (4.6) by mapping
$$\begin{aligned}
W_X\, \alpha_k^{(\pm)\mu}\, W_X^{-1} &= \alpha_k^{(\mp)\mu}, \qquad W_S\, \gamma^\pm_\mu\, W_S^{-1} = \pm\gamma^\mp_\mu, \\
W_X\, V(q^+, q^-)\, W_X^{-1} &= V(q^-, q^+),
\end{aligned} \tag{4.35}$$
which amounts to the interchange $\pm \leftrightarrow \mp$ on both $\mathcal H$ and $\mathcal A$. This worldsheet quantum symmetry of the sigma-model thus also arises as a change of Dirac operator for the noncommutative geometry, and so the isometries of the spectral triple (4.1) account for both target space and discrete worldsheet symmetries of the quantum geometry.

Factorized duality and spacetime topology change. The next generalization of the T-duality isomorphism of low-energy sectors is to compare the anti-chiral subspace (4.14) with the subspaces
$$\mathcal H_0^{[+;\mu]} = \mathcal H_0^{(+)1} \otimes \cdots \otimes \mathcal H_0^{(+)\mu-1} \otimes \mathcal H_0^{(-)\mu} \otimes \mathcal H_0^{(+)\mu+1} \otimes \cdots \otimes \mathcal H_0^{(+)n} \tag{4.36}$$
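which are defined for each $\mu = 1, \ldots, n$. In $\mathcal H_0^{[+;\mu]}$ we have $\gamma^+_\nu = \gamma^-_\nu \equiv \gamma_\nu$ and $w^\nu = 0$ for all $\nu \neq \mu$, while $g^{\lambda\rho} d^+_{\lambda\mu} \gamma^+_\rho = -g^{\lambda\rho} d^-_{\lambda\mu} \gamma^-_\rho \equiv \tilde\gamma_\mu$ and $p_\mu = 0$. This Hilbert space is isomorphic to (4.15) with the mixed chirality spinor representation $\varrho_{(\mu)}$ determined by the spinor conditions of (4.36), where the explicit $L^2$-isomorphism maps the bosonic states of (4.36) to the functions
$$f^{(\mu)}(x) = \exp\Big( -i \sum_{\nu \neq \mu} p_\nu x^\nu - i \sum_{\lambda} \tilde g_{\mu\lambda} w^\mu x^\lambda \Big). \tag{4.37}$$
The restriction of the holomorphic Dirac operator to (4.36) is
$$\slashed{D}\, P_0^{[+;\mu]} = -i \sum_{\nu \neq \mu} \sum_{\lambda} g^{\nu\lambda} \gamma_\nu \otimes \frac{\partial}{\partial x^\lambda} + i \sum_{\lambda} \tilde g^{\mu\lambda} \tilde\gamma_\mu \otimes \frac{\partial}{\partial x^\lambda}. \tag{4.38}$$
The algebra $P_0^{[+;\mu]} \mathcal A_0 P_0^{[+;\mu]}$ is generated by the smeared tachyon vertex operators which are given by the basis spacetime functions (4.37) of $C^\infty(T^n)^{(\mu)}$, and we arrive at the spectral triple isomorphism
$$\left( C^\infty(T^n) ,\ L^2(\mathrm{spin}^-(T^n)) ,\ i g^{\mu\nu} \gamma_\mu \partial_\nu \right) \cong \Big( C^\infty(T^n)^{(\mu)} ,\ L^2(\mathrm{spin}^{(\mu)}(T^n)) ,\ -i \sum_{\nu \neq \mu} \sum_{\lambda} g^{\nu\lambda} \gamma_\nu \partial_\lambda + i \sum_{\lambda} \tilde g^{\mu\lambda} \tilde\gamma_\mu \partial_\lambda \Big). \tag{4.39}$$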
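The symmetry (4.39) of the string spacetime is called “factorized duality”. For each $\mu = 1, \ldots, n$ it is a generalization of the $R \to 1/R$ circle duality in the $X^\mu$ direction of $T^n$. This becomes more transparent if we choose a particular basis of the lattice $\Gamma$ that splits the $n$-torus into a product of a circle $S^1$ of radius $R_\mu$ and an $(n-1)$-dimensional background $T^{n-1}$. The factorized duality map then takes $R_\mu \to 1/R_\mu$ leaving $T^{n-1}$ unchanged, and at the same time interchanges the $\mu$th momentum and winding mode $g^{\mu\nu} p_\nu \leftrightarrow w^\mu$ leaving all others invariant. Acting on the full spectral triple of the noncommutative spacetime it is described formally as follows. Let $(E_\mu)^{\nu\lambda} = \delta^\nu_\mu \delta^\lambda_\mu$ be the $n$-dimensional step operators, and consider the unitary transformation $T \equiv D_S^{(\mu)} \otimes D_X^{(\mu)} : \mathcal H \to \mathcal H$ in (4.5) defined by
$$\begin{aligned}
D_X^{(\mu)}\, |p^+; p^-\rangle \otimes |0\rangle_+ \otimes |0\rangle_- &= (-1)^{p_\mu w^\mu}\, |\tilde p^+; \tilde p^-\rangle \otimes |0\rangle_+ \otimes |0\rangle_-, \\
D_X^{(\mu)}\, \alpha_k^{(\pm)\nu}\, D_X^{(\mu)-1} &= \left[ \delta^\nu_\lambda - g_{\lambda\rho} (E_\mu)^{\nu\rho} \pm (E_\mu)^{\nu\rho} d^\mp_{\rho\lambda} \right] \alpha_k^{(\pm)\lambda}, \\
D_S^{(\mu)}\, \gamma^\pm_\nu\, D_S^{(\mu)-1} &= \left[ \delta^\lambda_\nu - g_{\nu\rho} (E_\mu)^{\rho\lambda} + g_{\nu\sigma} g^{\alpha\lambda} (E_\mu)^{\sigma\rho} d^\mp_{\rho\alpha} \right] \gamma^\pm_\lambda, \\
D_X^{(\mu)}\, V(q^+, q^-)\, D_X^{(\mu)-1} &= (-1)^{q_\mu w^\mu}\, V(\tilde q^+, \tilde q^-),
\end{aligned} \tag{4.40}$$
where $\tilde p^\pm_\nu = p^\pm_\nu$ $\forall\nu \neq \mu$ and $\tilde p^\pm_\mu = \pm g_{\mu\lambda} (d^\pm)^{\lambda\rho} p^\pm_\rho$ (or equivalently $g^{\mu\nu} p_\nu \leftrightarrow w^\mu$). The mapping (4.40) acts on the background matrices and the metric tensor as
$$\begin{aligned}
d^\pm_{\nu\rho} &\to \left[ \delta^\lambda_\nu - g_{\nu\alpha} (E_\mu)^{\alpha\lambda} + g_{\nu\alpha} g^{\lambda\beta} (E_\mu)^{\alpha\gamma} d^\pm_{\gamma\beta} \right] d^\pm_{\lambda\gamma} \left( \left[ E_\mu \cdot d^\pm + (I_n - g \cdot E_\mu) \right]^{-1} \right)^{\gamma}{}_{\rho}, \\
g_{\nu\rho} &\to \left[ \delta^\lambda_\nu - g_{\nu\alpha} (E_\mu)^{\alpha\lambda} + g_{\nu\alpha} g^{\lambda\beta} (E_\mu)^{\alpha\gamma} d^-_{\gamma\beta} \right] g_{\lambda\sigma} \\
&\qquad \times \left[ \delta^\sigma_\rho - g_{\rho\alpha} (E_\mu)^{\alpha\sigma} + g_{\rho\alpha} g^{\beta\sigma} (E_\mu)^{\alpha\gamma} d^+_{\gamma\beta} \right],
\end{aligned} \tag{4.41}$$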
where $I_n$ is the $n \times n$ identity matrix. As before, the transformation (4.40) leads to the isomorphism (4.6) of the full noncommutative geometry. When $n$ is even, the factorized duality map yields the famous ‘mirror symmetry’ of string theory which expresses the equivalence of string spacetimes under the interchange of the complex and Kähler structures of $T^n$. It is equivalent to the interchange of the Dolbeault cohomology groups $H^{p,q}(T^n)$ and $H^{\frac{n}{2}-p,q}(T^n)$ which gives a mirror reflection along the diagonals of the corresponding Hodge diamonds. The change of Dirac operator in (4.5) therefore also yields the stringy phenomenon of spacetime topology change. It arises in the present point of view from the non-trivial chirality structure which is present in the spin bundle of $T^n$ when $n$ is even (see (3.17))$^5$. Furthermore, the above analysis shows that generally a factorized duality transformation in the $\mu$th direction must be accompanied by a worldsheet parity transformation in all of the other $n - 1$ directions. This is somewhat anticipated from our earlier remarks about the relationship between the spin structures of $T^n$ and T-duality, and it agrees with some basic statements concerning mirror symmetry when $n$ is even. Thus the noncommutative geometry formulation of the quantum geometry shows that worldsheet parity is in fact a crucial part of the duality symmetries. The remaining isomorphisms between pairs of

$^5$ More complicated duality symmetries, such as mirror symmetry between distinct, curved Calabi-Yau manifolds [1], can be obtained by introducing a larger set of Dirac operators which are related to, for instance, an $N = 2$ supersymmetric sigma-model. The resulting spectral triple contains a larger symmetry than just the chiral-antichiral one used in this paper. Some of these points are addressed in [7, 9].
subspaces in (4.12) and (4.25) are then combinations of the discrete duality transforms exhibited above.

Lattice isomorphism. The duality symmetries which were essentially deduced above from the various isomorphisms that exist between the low-energy spectral triples exhaust the transformations (4.5) which lead to non-trivial equivalences between spacetimes of distinct geometry and topology. There are, however, two other discrete “internal” spacetime symmetries that leave each of the Dirac operators in (4.4) invariant and trivially leave the corresponding spectral triples unaffected. These transformations do not affect the classical spacetimes, but they do lead to non-trivial dynamical effects in the quantum field theory and are therefore associated with symmetries of the quantum geometry. The first one is a change of basis of the compactification lattice $\Gamma$, which is described by an invertible, integer-valued matrix $[A^\mu{}_\nu] \in GL(n, \mathbb Z)$ that acts on the spacetime metric as
$$g_{\mu\nu} \to (A^{-1})^{\lambda}{}_{\mu}\, g_{\lambda\rho}\, (A^{-1})^{\rho}{}_{\nu}. \tag{4.42}$$
In general, all covariant tensors from which the spectral triples are built transform under $A^{-1}$ while all contravariant tensors transform under $A$. This leads to a simple reparametrization of all quantities composing the spectral triples, thus leaving them unaltered. This change of basis is therefore also trivially a symmetry of the low-energy commutative string spacetime (4.2). For instance, some of these $GL(n, \mathbb Z)$ transformations simply permute the spacetime dimensions, while others reflect the configurations $X^\mu \to -X^\mu$. T-duality can be shown to be the composition of a succession of factorized dualities and dimension permutations in all of the directions of $T^n$. This is naturally evident from the isomorphisms between (4.12) and (4.25), where appropriate permutations of the spin structures map factorized duality onto T-duality.

Torsion cohomology. The final quantum symmetry is the shift
$$\beta_{\mu\nu} \to \beta_{\mu\nu} + C_{\mu\nu} \tag{4.43}$$
of the spacetime torsion form by an antisymmetric, integer-valued matrix $C_{\mu\nu}$. This shift corresponds to a change of integer cohomology class of the instanton form, and thus only affects the winding numbers in the target space $T^n$. It can be absorbed by a shift $p_\mu \to p_\mu - C_{\mu\nu} w^\nu$ which simply yields a reparametrization of the momenta $\{p_\mu\} \in \Gamma^*$. All other quantities are left invariant by this shift, and the action of the Dirac operators on $(\mathcal A, \mathcal H)$ is unaffected by this discrete transformation. Again we trivially have the equivalence between the corresponding string spacetimes.

Quantum moduli space. It can be shown that the discrete transformations exhibited in this section generate the duality group of the string theory, which is the semi-direct product
$$G_d = O(n, n; \mathbb Z) \otimes_S \mathbb Z_2 \tag{4.44}$$
of the lattice automorphism group $O(n, n; \mathbb Z)$ (i.e. the group of transformations of $\Lambda$ that preserve the quadratic form (3.7)) by the action of the reflection group $\mathbb Z_2$ corresponding to worldsheet parity. The group $O(n, n; \mathbb Z)$ is the arithmetic subgroup of $O(n, n)$ and it acts on the background matrices $d^\pm_{\mu\nu}$ by linear fractional transformations. Note that inside the duality group $G_d$ lies the discrete geometrical subgroup $SL(n, \mathbb Z) \subset O(n, n; \mathbb Z)$ which represents the group of large diffeomorphisms of $T^n$. The quantum modification of (4.3) is therefore given by the Narain moduli space [33]
$$\mathcal M_{qu} = O(n, n; \mathbb Z) \backslash O(n, n) / (O(n) \times O(n)) \otimes_S \mathbb Z_2, \tag{4.45}$$
where the quotient by the infinite discrete group O(n, n; Z) acts on O(n, n) from the left. As we have discussed, the duality group (4.44) is a discrete subgroup of the group Aut(A) of automorphisms of the vertex operator algebra A. In the next section we shall discuss the structure of Aut(A) in a somewhat more general setting.
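As a matrix-level consistency check of the T-duality statements of this section, the following sketch verifies for one sample background that inverting $d^\pm = g \pm \beta$ reproduces the dual metric (4.10) and that the map is an involution. It is an illustration only: the $2 \times 2$ background, the exact rational arithmetic and the helper names (`background`, `t_dual`, `inverse2`) are assumptions of this example and not constructions of the paper, and index positions are ignored by treating all tensors as plain matrices.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse2(m):
    """Exact inverse of a 2x2 matrix with rational entries."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def background(g, beta):
    """Return the background matrices d^+ = g + beta and d^- = g - beta."""
    dp = [[g[i][j] + beta[i][j] for j in range(2)] for i in range(2)]
    dm = [[g[i][j] - beta[i][j] for j in range(2)] for i in range(2)]
    return dp, dm

def t_dual(g, beta):
    """Invert d^+ and d^-, then read off the new symmetric and antisymmetric parts."""
    dp, dm = background(g, beta)
    dp_inv, dm_inv = inverse2(dp), inverse2(dm)
    g_dual = [[(dp_inv[i][j] + dm_inv[i][j]) / 2 for j in range(2)] for i in range(2)]
    beta_dual = [[(dp_inv[i][j] - dm_inv[i][j]) / 2 for j in range(2)] for i in range(2)]
    return g_dual, beta_dual

# Sample background: symmetric metric g and antisymmetric torsion beta.
g = [[F(2), F(1)], [F(1), F(3)]]
beta = [[F(0), F(1)], [F(-1), F(0)]]

# Dual metric computed directly from (4.10): (d^+)^{-1} g (d^-)^{-1}.
dp, dm = background(g, beta)
g_from_410 = matmul(matmul(inverse2(dp), g), inverse2(dm))

g_dual, beta_dual = t_dual(g, beta)
back = t_dual(g_dual, beta_dual)

# Circle check: a diagonal metric with beta = 0 has R^2 -> 1/R^2 in each direction.
zero = [[F(0), F(0)], [F(0), F(0)]]
circle_dual, _ = t_dual([[F(4), F(0)], [F(0), F(1)]], zero)
```

The agreement of `g_from_410` with `g_dual` reflects the identity $(d^+)^{-1} + (d^-)^{-1} = 2\,(d^+)^{-1} g\, (d^-)^{-1}$, which follows from $d^+ + d^- = 2g$.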
5. Symmetries of the Noncommutative String Spacetime

In the previous section we examined the symmetries of the quantum spacetime by determining the discrete automorphisms of the spectral triples which led to isomorphisms between their low-energy projection subspaces. The full duality group (4.44) was thus determined as the set of all isomorphisms between subspaces of (4.12) and (4.25). Along the way we showed that this way of viewing the target space duality in noncommutative geometry led to new insights into the relationships between choices of spin structure, worldsheet parity, and T-duality. However, there are many more possible automorphisms of the vertex operator algebra $\mathcal A$ than just those which preserve the commutative subspaces. In fact, the elements of the duality group $G_d$ arise from the equivalences between the zero-mode eigenspaces of the Dirac operators (4.4), while the general isospectral automorphisms of the spectral triple (4.1) are determined by unitary transformations (such as (4.5)) between different Dirac operators that have the same spectrum [11]. Indeed, the structure of the full string spacetime is determined by the spectrum of the Dirac K-cycle $(\mathcal H, \slashed{D})$ or $(\mathcal H, \bar{\slashed{D}})$ (see (4.24)), which in turn incorporates the non-zero oscillatory modes of the strings. In this final section we shall examine the geometrical symmetries of the string spacetime in a more general setting by viewing them as automorphisms of the vertex operator algebra. This point of view will, among other things, naturally establish the framework for viewing duality as a gauge symmetry [1]. We shall also comment on some non-metric aspects of the noncommutative geometry.

Automorphisms of the vertex operator algebra. The basic symmetry group of the noncommutative string spacetime (4.1) is $\mathrm{Aut}(\mathcal A)$.
An automorphism of the vertex operator algebra is a unitary transformation $g : \mathcal H \to \mathcal H$ which preserves both the vacuum state and the stress-energy tensors, and hence the representations of the Virasoro subalgebras, such that the actions of $g$ and $\mathcal V(\Psi; z_+, z_-)$ on $\hat{\mathcal H}_X(\Lambda)$ are compatible in the sense that
$$g\, \mathcal V(\Psi; z_+, z_-)\, g^{-1} = \mathcal V(g\Psi; z_+, z_-) \,, \qquad \forall\, \Psi \in \hat{\mathcal H}_X(\Lambda). \tag{5.1}$$
Thus the mapping $\mathcal V$ on (3.44) is equivariant with respect to the natural adjoint actions of $g$. The automorphism $g$ also preserves the grading of $\hat{\mathcal H}_X(\Lambda)$ (i.e. the subspaces (3.50)) which can be decomposed into a direct sum of the eigenspaces of $g$,
$$\hat{\mathcal H}_X(\Lambda) = \bigoplus_{j \in \mathbb Z_r} \hat{\mathcal H}_X^{(j)}(\Lambda), \tag{5.2}$$
where $r > 0$ is the order of $g$ and $\hat{\mathcal H}_X^{(j)}(\Lambda) = \{ \Psi \in \hat{\mathcal H}_X(\Lambda) \mid g\Psi = \eta^j \Psi \}$ with $\eta$ the generator of $\mathbb Z_r$. Note that, from (3.25), the invariance of the stress-energy tensors automatically implies invariance among the Dirac operators $\slashed{D}^\pm$. Two immediate examples with $r = \infty$ are provided by the Kac-Moody and Virasoro transformation groups of the
sigma-model. The former group decomposes $\mathcal A$ in terms of the spectrum of the Dirac operators. Generally, given a subgroup $G \subset \mathrm{Aut}(\mathcal A)$, the Fock space $\hat{\mathcal H}_X(\Lambda)$ decomposes into a direct sum of irreducible $G$-modules $\hat{\mathcal H}_X^{[R(G)]}(\Lambda)$, $\hat{\mathcal H}_X(\Lambda) = \bigoplus_{R(G)} \hat{\mathcal H}_X^{[R(G)]}(\Lambda)$.

In the ordinary commutative case of a manifold $M$, the group $\mathrm{Diff}(M)$ of diffeomorphisms of $M$ is naturally isomorphic to the group of automorphisms of the abelian algebra $\mathcal A = C^\infty(M)$. To each $\varphi \in \mathrm{Diff}(M)$ one associates the algebra-preserving map $g_\varphi : \mathcal A \to \mathcal A$ by $g_\varphi(f) = f \circ \varphi^{-1}$ $\forall f \in \mathcal A$. In the general noncommutative case, the group $\mathrm{Aut}(\mathcal A)$ has a natural normal subgroup $\mathrm{Inn}(\mathcal A) \subset \mathrm{Aut}(\mathcal A)$ consisting of inner automorphisms of $\mathcal A$, i.e. the algebra-preserving maps $g_u : \mathcal A \to \mathcal A$ that act on the algebra as conjugation by elements of the group (2.11) of unitary operators in $\mathcal A$,
$$g_u(a) = u\, a\, u^\dagger \,, \qquad \forall\, a \in \mathcal A. \tag{5.3}$$
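The exact sequence of groups
$$I \to \mathrm{Inn}(\mathcal A) \to \mathrm{Aut}(\mathcal A) \to \mathrm{Out}(\mathcal A) \to I \tag{5.4}$$
defines the remaining outer automorphisms in $\mathrm{Aut}(\mathcal A)$ such that the automorphism group is the semi-direct product
$$\mathrm{Aut}(\mathcal A) = \mathrm{Inn}(\mathcal A) \otimes_S \mathrm{Out}(\mathcal A) \tag{5.5}$$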
of $\mathrm{Inn}(\mathcal A)$ by the natural action of $\mathrm{Out}(\mathcal A)$. Note that for an abelian algebra $\mathcal A$ the group of inner automorphisms $\mathrm{Inn}(\mathcal A) = \{I\}$ is trivial, so that in the case of a commutative space $M$ the diffeomorphisms of the manifold correspond to outer automorphisms. Recall from Sect. 2 that the group (2.11) of unitary elements of an algebra $\mathcal A$ defines a natural gauge group of the space. In this context inner automorphisms then correspond to gauge transformations. For example, the automorphism group (5.5) of the noncommutative algebra (3.1) of the standard model is the semi-direct product
$$\mathrm{Aut}(\mathcal A_{SM}) = C^\infty(M, U(1) \times SU(2) \times U(3)) \otimes_S \mathrm{Diff}(M) \tag{5.6}$$
of the group of local gauge transformations by the natural action of the diffeomorphism group of M . The inner automorphisms in this case are therefore associated with the local internal gauge invariance of the model while the outer automorphisms represent the spacetime general covariance dictated by general relativity. In fact, (5.6) is the canonical invariance group of the standard model coupled to Einstein gravity, modulo an overall U (1) phase group which can be eliminated by restricting to the unimodular group of ASM . In the general case then, one can identify the outer automorphisms of the noncommutative string spacetime as general coordinate transformations and the inner automorphisms as internal gauge symmetry transformations [39], i.e. internal fluctuations of the noncommutative geometry corresponding to the rotations (5.3) of the elements of A. We shall see that these symmetry structures of the string spacetime have some remarkable features. Duality transformations as inner automorphisms and Gauge symmetries. We shall now begin to explore the structure of the symmetry group Aut(A). For illustration, we start by giving the explicit duality maps T that were exhibited in the previous section and show that they correspond to inner automorphisms of the vertex operator algebra. A general formalism for viewing symmetries of string theory as inner automorphisms of the vertex operator algebra has been given in [40] and applied to duality transformations in [15].
The basic idea behind this approach is that such an inner automorphism represents a deformation of the conformal field theory by a marginal operator, and as such it represents the same point in the corresponding moduli space. Here we wish to stress the fact that such automorphisms arise quite naturally from the point of view of the noncommutative geometry formalism and lead immediately to the well-known interpretation of duality as a gauge symmetry [1].

Recall that the Dirac operators were constructed from the generators (3.13) of the fundamental $U(1)^n_+ \times U(1)^n_-$ gauge symmetry of the theory. It turns out that this symmetry group is augmented at the fixed point of the T-duality transformation of the string spacetime. T-duality is tantamount to the inversion $d^\pm \to (d^\pm)^{-1}$ of the background matrices. This transformation has a unique fixed point $(d^\pm)^2 = I_n$ given by $g_{\mu\nu} = \delta_{\mu\nu}$ and $\beta_{\mu\nu} = 0$. At this single fixed point the generic $U(1)^n_+ \times U(1)^n_-$ gauge symmetry is “enhanced” to a level 1 representation of the affine Lie group $\widehat{SU(2)}{}^n_+ \times \widehat{SU(2)}{}^n_-$ [41]. Thus the fixed point $\Lambda_0 \in \mathcal M_{qu}$ in the Narain moduli space of toroidal compactification coincides with the occurrence of “enhanced gauge symmetries”. It is due to the appearance of extra dimension $(\Delta_+, \Delta_-) = (1, 0)$ and $(0, 1)$ operators in the theory. To describe this structure, let $k^{(i)}_\mu$, $i = 1, \ldots, n$, be a suitable basis of (constant) Killing forms on $T^n$ which are the generators of isometries of the spacetime metric $g_{\mu\nu}$. Then the operators
$$J_\pm^{\alpha(i)}(z_\alpha) = {:}\, e^{\pm i k^{(i)}_\mu X^\mu_\alpha(z_\alpha)} \,{:} \,, \qquad J_3^{\alpha(i)}(z_\alpha) = i k^{(i)}_\mu J^\mu_\alpha(z_\alpha), \tag{5.7}$$
where $\alpha = \pm$, generate a level 1 $su(2)^n_+ \oplus su(2)^n_-$ Kac-Moody algebra,
$$\left[ J_{3,k}^{\alpha(i)} , J_{\pm,m}^{\alpha(i)} \right] = \pm J_{\pm,k+m}^{\alpha(i)} \,, \qquad \left[ J_{+,k}^{\alpha(i)} , J_{-,m}^{\alpha(i)} \right] = 2 J_{3,k+m}^{\alpha(i)} + 2k\, \delta_{k+m,0} \tag{5.8}$$
with all other commutators vanishing, and where we have defined the mode expansions $J_a^{\alpha(i)}(z_\alpha) = \sum_{k \in \mathbb Z} J_{a,k}^{\alpha(i)}\, z_\alpha^{-k-1}$.

Associated with the $SU(2)^n_+ \times SU(2)^n_-$ gauge symmetry of the theory is the corresponding gauge group element
$$g = e^{i G_\chi}, \tag{5.9}$$
where the generator $G_\chi$ is defined as the smeared operator
$$G_\chi = \int \frac{dz_+\, dz_-}{4\pi z_+ z_-} \left( \chi^a_{+,\mu}[X]\, J_a^{+(\mu)}(z_+) + \chi^a_{-,\mu}[X]\, J_a^{-(\mu)}(z_-) \right) f_S(z_+, z_-), \tag{5.10}$$
and the gauge parameter functions $\chi^a_{\pm,\mu}[X]$, $a = 1, 2, 3$, $\mu = 1, \ldots, n$, are sections of the spin bundle of $T^n$. Here and in the following we define $X = \frac{1}{\sqrt{2}} (X_+ + X_-)$. The unitary operators (5.9) locally decompose as $g = g_S \otimes g_X$, as in the previous section, where the operators $g_X$ act as inner automorphisms of $\mathcal A_X$. The automorphisms $g_S$ are defined by their corresponding actions on the generators $\gamma^\pm_\mu$ and lead to reparametrization of the spin structure of the target space.

The operators (5.9) that implement spacetime duality transformations of the string theory have been constructed in [15]. The $\mu$th factorized duality map corresponds to the action of the inner automorphism (5.9) with $G_\chi = G^{(\mu)}$, where
$$G^{(\mu)} = \frac{\pi}{2i} \int \frac{dz_+\, dz_-}{4\pi z_+ z_-} \left( J_+^{+(\mu)} J_+^{-(\mu)} - J_-^{+(\mu)} J_-^{-(\mu)} \right) f_S. \tag{5.11}$$
Another duality map which comes from the enhanced gauge symmetry follows from choosing $k^{(i)}_\mu = \delta^i_\mu$ and $G_\chi = G_+^{(\mu)} + G_-^{(\mu)}$, where
$$G_\pm^{(\mu)} = \frac{\pi}{2i} \int \frac{dz_+\, dz_-}{4\pi z_+ z_-} \left( J_+^{\pm(\mu)} - J_-^{\pm(\mu)} \right) f_S. \tag{5.12}$$
The inner automorphism generated by (5.12) corresponds to a reflection $X^\mu \to -X^\mu$ of the coordinates of $T^n$ and is part of the lattice isomorphism group $GL(n, \mathbb Z)$. Thus factorized dualities and spacetime reflections are enhanced gauge symmetries of the noncommutative geometry, and as such they are intrinsic properties of the string spacetime.

The remaining $O(n, n; \mathbb Z)$ transformations are abelian gauge symmetries. By the definition of the currents (3.13), a general spacetime coordinate transformation $X \to \xi(X)$, with $\xi(X)$ a local section of $\mathrm{spin}(T^n)$, is generated by $G_\chi = G_\xi$ with
$$G_\xi = \int \frac{dz_+\, dz_-}{4\pi z_+ z_-}\, \xi_\mu(X) \left( J_+^\mu(z_+) + J_-^\mu(z_-) \right) f_S(z_+, z_-). \tag{5.13}$$
Taking the large diffeomorphism $\xi_\mu(X) = \xi_\mu^{(\pi)}(X) = \frac{\pi}{2}\, \mathrm{sgn}(\pi)\, g_{\pi(\mu),\nu} X^\nu$ then yields a permutation $\pi \in S_n$ of the coordinates of $T^n$ (corresponding to another lattice isomorphism). Combining this with the factorized duality transformations above yields T-duality in the form of an inner automorphism. As such, T-duality corresponds to the global gauge transformation in the Weyl subgroup $\mathbb Z_2$ of $SU(2)$. Next, the local gauge transformations $\beta \to \beta + d\lambda$ of the torsion two-form are generated by $G_\chi = G_\lambda$ with
$$G_\lambda = \int \frac{dz_+\, dz_-}{4\pi z_+ z_-}\, \lambda_\mu(X) \left( J_+^\mu(z_+) - J_-^\mu(z_-) \right) f_S(z_+, z_-). \tag{5.14}$$
Taking the gauge transformation $\lambda_\mu(X) = C_{\mu\nu} X^\nu$, with $C_{\mu\nu}$ a constant antisymmetric matrix, gives effectively the torsion shift (4.43). Single-valuedness of the corresponding group element (5.9) then forces $C_{\mu\nu} \in \mathbb Z$ $\forall \mu, \nu$ yielding a large gauge transformation. The final $O(n, n; \mathbb Z)$ transformations correspond to large diffeomorphisms $\xi_\mu(X) = T_{\mu\nu} X^\nu$ of the $n$-torus. Again single-valuedness of the corresponding gauge group element puts $[T_{\mu\nu}] \in SL(n, \mathbb Z)$.
These discrete vertex operator automorphisms all correspond to large gauge transformations of the internal string spacetime. As anticipated, the $\mathbb Z_2$ part of the duality group $G_d$ representing worldsheet parity corresponds to an outer automorphism of the vertex operator algebra. This is because it corresponds to the automorphism $W_X \in \mathrm{Aut}(\mathcal A)$ that interchanges the left and right chiral algebras $\mathcal E = \mathcal E^+ \otimes \mathcal E^- \xrightarrow{\,W_X\,} \mathcal E^- \otimes \mathcal E^+$. Clearly no inner automorphism of $\mathcal A$ can achieve this transformation, and indeed worldsheet parity is the outer automorphism of $\mathcal A$ represented by the $\mathbb Z_2$ generator
$$W_X = \begin{pmatrix} 0 & I_n \\ I_n & 0 \end{pmatrix} \tag{5.15}$$
acting in the two-dimensional space labelled by the chiral components $\mathcal E = \mathcal E^+ \otimes \mathcal E^-$. Thus worldsheet parity cannot be interpreted in terms of any gauge symmetry and represents a large diffeomorphism of the noncommutative string spacetime. This $\mathbb Z_2$-symmetry is actually a discrete subgroup of the $U(1)$ worldsheet symmetry group that acts by rotating the chiral sectors among each other. Associated to the spin structure of the
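string worldsheet there is a representation of $\mathrm{spin}(2) \cong \mathbb R$ on the Hilbert space $\mathcal H_X$ that implements the group $SO(2) \cong U(1)$ with generator
$$W_\theta = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} , \qquad \theta \in [0, 2\pi). \tag{5.16}$$
Thus, as a start, we can identify the infinite-dimensional symmetry algebras which contain the target space duality group $O(n, n; \mathbb Z)$ as inner automorphisms at the fixed point $\Lambda_0 \in \mathcal M_{qu}$,
$$\begin{aligned}
\mathrm{Inn}(\mathcal A_{\Lambda_0}) &\supset \left( \widehat{SU(2)}{}^n_+ \times \widehat{SU(2)}{}^n_- \right) \otimes_S \left( \mathrm{Vir}^+ \times \mathrm{Vir}^- \right) \\
&\supset \left( \widehat{U(1)}{}^n_+ \times \widehat{U(1)}{}^n_- \right) \otimes_S \left( \mathrm{Vir}^+ \times \mathrm{Vir}^- \right),
\end{aligned} \tag{5.17}$$
where $\mathrm{Vir}^\pm$ denotes the chiral Virasoro groups. The abelian gauge symmetry group in (5.17) (the maximal torus of the nonabelian one) is present at a generic point $\Lambda \neq \Lambda_0$. Furthermore, worldsheet parity is a finite subgroup of the outer automorphism group of $\mathcal A$ which contains a finite-dimensional worldsheet rotation group,
$$\mathrm{Out}(\mathcal A) \supset O(2). \tag{5.18}$$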
It is interesting to note what happens to these automorphisms when projected onto the low-energy sector $\bar{P}_0^{(-)} \bar{A}_0 \bar{P}_0^{(-)}$ representing the ordinary spacetime $T^n$. Only the inner automorphisms (5.13) act non-trivially on this subalgebra of $A$ and represent the generators of $\mathrm{Diff}(T^n)$ in terms of the canonically conjugate center of mass variables $x^\mu$, $p_\mu$. The other transformations when restricted to this subalgebra act as the identity $I$, i.e. as inner automorphisms. Thus, the subgroup (5.17) of the inner automorphism group of $A$, representing internal gauge symmetries of the string spacetime, is projected onto the full group of outer automorphisms of the low-energy target space $T^n$, corresponding to diffeomorphisms of the manifold. This approach therefore naturally identifies the usual invariance principles of general relativity as a gauge symmetry of the stringy modification. The diffeomorphisms of the full string spacetime are completely unobservable in the low-energy sector (for instance in the anti-chiral projection onto $\bar{H}_0^{(-)}$ the operator (5.15) acts as $I_n$), as are the gauge symmetries corresponding to the duality transformations. This is indeed another essence of the target space duality in string theory. It is only observable as a symmetry of the huge noncommutative spacetime represented by the full spectral triple (4.1) and acts trivially on the corresponding low-energy projections representing the conventional spacetimes. In fact, the duality automorphisms naturally partition the full vertex operator algebra into sectors, each of which project onto the various low-energy spacetimes we described in the last section. Each such low-energy sector is distinct at the classical level but related to the other ones by the duality maps. At the level of the full spectral triple, there are two sectors corresponding to the two eigenspaces of the duality maps as dictated by the decomposition (5.2).
For example, in the case of worldsheet parity, the two eigenspaces consist of holomorphic and antiholomorphic combinations, respectively, of the chirality sectors of the vertex operator algebra. In a low-energy projection, where the notion of chirality is absent, the effects of duality are unobservable. Similar decompositions can also be made for the larger symmetry groups in (5.17) and (5.18) in terms of their irreducible representations. Note that, given a subgroup $G$ of automorphisms, the $G$-invariant subspace $\widehat{H}_X^{(0)}(\Lambda)$ in (5.2)
(corresponding to the one-dimensional trivial representation of $G$) defines a vertex operator subalgebra and hence leads to a subspace of the string spacetime which is invariant under the $G$-transformations. In particular, the corresponding low-energy subspace from the decomposition with respect to a duality map is then completely unaffected by the duality transformation. Thus, the above presentation of duality (and other symmetries of the string spacetime) naturally leads to a systematic construction of the low-energy projective subspaces that we presented in the previous section.

Universal Gauge groups and monster symmetry. At present, the general structure of the unitary group of the vertex operator algebra $A$ is not known, nor are its general automorphisms. The group $U(A)$ represents the complete internal (gauge) symmetry group of the noncommutative spacetime and appears to be quite non-trivial. Even at the commutative level where $A = C^\infty(M)$, the unitary group is the complicated, infinite-dimensional loop group $C^\infty(M, S^1)$ of the manifold $M$. The inner automorphism group of the noncommutative string spacetime includes spacetime diffeomorphisms, two copies of the Virasoro group, and the Kac-Moody symmetry groups in (5.17) which contain the spacetime duality symmetries. There are a number of additional infinite-dimensional subalgebras of $A$ that have been identified as subspaces of the inner automorphism algebra $\mathrm{inn}(A)$, such as the algebras of area-preserving ($W_\infty$) and volume-preserving diffeomorphisms in $n = 2$ dimensions [42] and also the weighted tensor algebras described in [43]. In all of these instances the inner automorphisms define appropriate mixings among the chiral Dirac operators $\slashed{D}^\pm$ which preserve the conformal invariance of the theory. Indeed, the chiral and conformal properties of the worldsheet theory are, as we have extensively shown in this paper, crucial aspects of the string spacetime.
A classification of $U(A)$ would ultimately lead to a “universal symmetry group” of string theory that would contain all unbroken gauge groups and represent the true stringy symmetries of the quantum spacetime. The problem with such a classification scheme though is that the “size” of $U(A)$ appears to be very sensitive to the lattice $\Lambda \in M_{qu}$ from which the vertex operator algebra is built. A natural Lie group $G_\Lambda$ of automorphisms arises from exponentiating the Lie algebra $L_\Lambda$ associated with the lattice $\Lambda$ [12] (see the appendix). Then $G_\Lambda$ acts continuously and faithfully on the vertex operator algebra $A$. The construction of $U(A)$ has been discussed by Moore in [44] who considered the appearance of enhanced symmetry points in the Narain moduli space, i.e. points $\Lambda \in M_{qu}$ at which extra dimension (1,0) and (0,1) operators appear and generate new symmetries of the conformal field theory. For this, we analytically continue in the spacetime momenta and extend the lattice $\Lambda$ to a module over the Gaussian integers,
\[ \Lambda^{(G)} = \Lambda \otimes_{\mathbb{Z}} \mathbb{Z}[i] \subset \Lambda^c . \tag{5.19} \]
We then form the corresponding operator Fock space (3.41) based on $\Lambda^{(G)}$ by
\[ \widehat{H}_X(\Lambda^{(G)}) = \mathbb{C}\{\Lambda^{(G)}\} \otimes S(\hat{h}_+^{(-)}) \otimes S(\hat{h}_-^{(-)}) \tag{5.20} \]
so that the corresponding Lie algebra of dimension 1 primary fields (3.50) is (see the appendix)
\[ L_U \equiv \widehat{P}_1(\Lambda^{(G)}) \Big/ \bigcup_{k \geq 1} \left( \widehat{P}_1(\Lambda^{(G)}) \cap L_{-k}^{-} \otimes L_{-k}^{+}\, \widehat{H}_X(\Lambda^{(G)}) \right). \tag{5.21} \]
Moore proved that, since the action of $O(n, n; \mathbb{Z})$ on $M_{qu}$ is transitive, the Lie algebra (5.21) generates a universal symmetry group of the string theory, i.e. if $L_\Lambda$ is the (affine) Lie algebra that appears at an enhanced symmetry point $\Lambda$, then there is a natural Lie subalgebra embedding $L_\Lambda \hookrightarrow L_U$. We refer to [44] for the details.
We would like to stress that, from the point of view of the noncommutative geometry formalism that we have discussed, not only is the interpretation of duality symmetries as being part of some mysterious gauge group [1] now clarified, but the Lie group generated by (5.21) now has a natural geometrical description in terms of the theory of vertex operator algebras and the noncommutative geometry of $A$. The Lie algebra $L_U$ naturally overlies all symmetries of the string spacetime obtained from marginal deformations of the conformal field theory, and geometrically it contains many of the internal rotational symmetries of the noncommutative geometry. This by no means exhausts all of the inner automorphisms of $A$, but it provides a geometric, universal way of identifying gauge symmetries. Note that, in contrast to the low-energy subspaces which were determined by the tachyon sector of the vertex operator algebra $A$, the universal gauge symmetries are determined by the graviton sector of $A$.

To get an idea of how large the gauge group $U(A)$ can be, it is instructive to consider a specific example. We consider an $n = (25 + 1)$ dimensional toroidal spacetime defined by Wick rotating the target space coordinate $X^{26}$. The change from Euclidean to hyperbolic compactification lattices $\Gamma$ is well-known to have dramatic effects on the structures of the corresponding vertex operator algebra [12, 14] and on the Narain moduli space [44]. Consider the unique 26-dimensional even unimodular Lorentzian lattice $\Gamma = \Pi^{25,1}$. It can be shown [44] that $\Lambda_* = \Pi^{25,1} \oplus \Pi^{25,1} \in M_{qu}$ is the unique point in the Narain moduli space at which the vertex operator algebra $A$ completely factorizes between its left and right chiral sectors,
\[ \widehat{H}_X(\Lambda_*) = C^+ \otimes C^- , \tag{5.22} \]
where
\[ C^\pm = \mathbb{C}\{\Pi^{25,1}\} \otimes S(\hat{h}_\pm^{*(-)}) \tag{5.23} \]
and $\hat{h}_\pm^*$ is the Heisenberg-Weyl algebra (3.37) built on $\Lambda_*$. The distinguished point $\Lambda_* \in M_{qu}$ is an enhanced symmetry point and the corresponding Lie algebra $L_{\Lambda_*}$ is a maximal symmetry algebra, in the sense that it contains all unbroken gauge symmetry algebras. $L_{\Lambda_*}$ is not, however, universal since the gauge symmetries are not necessarily embedded into it as Lie subalgebras. Again the framework of noncommutative geometry naturally constructs $L_{\Lambda_*}$ as a symmetry algebra of the string theory. The Lie algebra $L_{\Lambda_*} = B \oplus B$ is an example of a mathematical entity known as a Borcherds or generalized Kac-Moody algebra [45], where
\[ B = \widehat{P}_1(\Pi^{25,1}) / \ker\langle \cdot, \cdot \rangle \tag{5.24} \]
and $\langle \cdot, \cdot \rangle$ is the bilinear form on the Lie algebra of primary fields of weight one defined in the appendix. The root lattice of $B$ is $\Pi^{25,1}$ along with the set of positive integer multiples of the Weyl vector $\vec{\rho} = (0, 1, 2, \ldots, 24; 70) \in \Pi^{25,1}$, each of multiplicity 24. It is generated by $\varepsilon_{q^\pm}$, $\varepsilon_{-q^\pm}$, $q_\mu^\pm \alpha_{-1}^{(\pm)\mu}$ (in each chiral sector) and $e_{m\vec{\rho}}$, where $q^\pm \in \Pi^{25,1}$ and $m \in \mathbb{Z}$. The first three of these generators span an infinite-dimensional Kac-Moody algebra of infinite rank. The simple roots of $B$ are the simple roots of this Kac-Moody algebra, and the positive-norm simple roots of the lattice $\Pi^{25,1}$ lie in the Leech lattice $\Gamma_{\rm Leech}$, which is the unique 24-dimensional even unimodular Euclidean lattice with no vectors of square length two. The symmetries of its Dynkin diagram can be classified according to the automorphism group of the Leech lattice. The Lie algebra $B$ is called the fake Monster Lie algebra [12, 14, 45].
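The Weyl vector of the even unimodular lattice of signature (25, 1) is lightlike, which in the standard coordinates $(0, 1, \ldots, 24; 70)$ rests on the identity $0^2 + 1^2 + \cdots + 24^2 = 70^2$. A minimal check, where the ordering of the entries (25 spacelike coordinates followed by one timelike coordinate) and the function name are assumptions of this sketch:

```python
# Sketch: norm of the Weyl vector in coordinates where the quadratic form is
# x_1^2 + ... + x_25^2 - x_26^2 (an assumed ordering of the coordinates).

def lorentzian_norm(v):
    """Norm with 25 spacelike entries followed by one timelike entry."""
    *space, time = v
    return sum(x * x for x in space) - time * time

rho = list(range(25)) + [70]          # (0, 1, ..., 24; 70)
shifted = list(range(1, 26)) + [70]   # (1, 2, ..., 25; 70), for comparison

print(lorentzian_norm(rho))      # 0
print(lorentzian_norm(shifted))  # 625
```

Only the first list is lightlike; shifting the spacelike entries by one gives norm 625.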
Thus the fake Monster Lie algebra (5.24) is a maximal symmetry algebra of the string theory, so that Borcherds algebras, when interpreted as generalized symmetry algebras of the noncommutative geometry, seem to be relevant for the construction of a universal symmetry of string theory. These algebras, being a natural generalization of affine Lie algebras, may emerge as new symmetry algebras for string spacetimes within the unified framework of vertex operator algebras and noncommutative geometry. The fake Monster Lie algebra can also be used to construct the Monster Lie algebra [12]. Mathematically, the most interesting aspect of this construction is that a subgroup of the automorphism group of the Monster vertex operator algebra is the celebrated Monster group, which is the full automorphism group of the 196884-dimensional Griess algebra that is constructed from the Monster Lie algebra and the moonshine module. The Monster group is the largest of the sporadic finite simple groups. The appearance of this Monster symmetry as a gauge symmetry of the noncommutative spacetime emphasizes the point that these exotic mathematical structures, such as those contained in Borcherds algebras, might play a role as a sort of dynamical Lie algebra which changes the Dirac operators and the Fock space gradings. But the underlying noncommutative geometrical structure of the string spacetime remains unchanged. We know of no complete classification of such vertex operator algebra automorphisms, and, in the context of this paper, obtaining one remains an important open problem for understanding the full set of geometrical symmetries that underlie the stringy modification of classical general relativity.

Differential topology of the quantum spacetime.
As a final application of the formalism developed in this paper, we look briefly at the problem of computing the cohomology groups of the noncommutative string spacetime and compare them with the known (DeRham) cohomology groups of the ordinary n-torus
\[ H^k(T^n; \mathbb{R}) = \begin{cases} \mathbb{R}^{\binom{n}{k}} & \text{for } 0 \leq k \leq n \\ \{0\} & \text{otherwise} \end{cases} . \tag{5.25} \]
The rigorous way to describe the non-metric aspects of noncommutative geometry is through noncommutative K-theory [6], but we shall not enter into this formalism here. Here we shall simply compute the cohomology groups in analogy with the example of a manifold that was presented in Sect. 2. This approach is based on a natural generalization of the Witten complex of Sect. 3 which describes the cohomology (5.25). We assume, for simplicity, that $n$ is even and that a basis for the compactification lattice $\Gamma$ has been chosen so that $g_{\mu\nu} = \delta_{\mu\nu}$. In light of our analysis above, no loss of generality occurs with this choice of point in the Narain moduli space. In that case the $\mathrm{spin}(n)$ Clifford algebras $C(T^n)^\pm$ each possess a chirality matrix
\[ \gamma_c^\pm = \gamma_1^\pm \gamma_2^\pm \cdots \gamma_n^\pm \tag{5.26} \]
whose actions on the generators of the spin bundle are
\[ \left\{ \gamma_c^\pm , \gamma_\mu^\pm \right\} = 0 , \qquad \left[ \gamma_c^\pm , \gamma_\mu^\mp \right] = 0. \tag{5.27} \]
The chirality matrices are of order 2, $(\gamma_c^\pm)^2 = I$. Our first observation is that the two Dirac operators in (4.4) are related by
\[ \bar{\slashed{D}} = \gamma_c^-\, \slashed{D}\, \gamma_c^- . \tag{5.28} \]
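The chirality-matrix properties used above, namely that the product of all gamma matrices of one chiral sector squares to the identity and anticommutes with each generator when $n$ is even, can be checked in an explicit representation. The sketch below does this for $n = 4$ in a single chiral sector, using a Pauli-matrix representation that is an assumption of this example and is not taken from the paper; $n = 4$ is chosen so that no extra phase is needed for the product to square to $+I$ with Hermitian gamma matrices.

```python
# Sketch: four anticommuting gamma matrices (one chiral sector, n = 4) built
# from Pauli matrices, and a check of the chirality-matrix properties.
# The representation is an assumed, illustrative choice.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def is_zero(a):
    return all(x == 0 for row in a for x in row)

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
I2 = identity(2)

gammas = [kron(S1, I2), kron(S2, I2), kron(S3, S1), kron(S3, S2)]

# Chirality matrix: ordered product gamma_1 gamma_2 gamma_3 gamma_4.
gamma_c = identity(4)
for g in gammas:
    gamma_c = matmul(gamma_c, g)

squares_to_identity = matmul(gamma_c, gamma_c) == identity(4)
anticommutes = all(
    is_zero(add(matmul(gamma_c, g), matmul(g, gamma_c))) for g in gammas
)
print(squares_to_identity, anticommutes)
```

Both checks are exact here because every matrix entry is one of 0, ±1 or ±i.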
As we shall see below, the chirality operators (5.26) define a Klein operator $\tilde{\gamma}$, which provides a natural $\mathbb{Z}_2$-grading, and a Hodge duality operator $\star$ by⁶
\[ \tilde{\gamma} = \gamma_c^+ \gamma_c^- , \qquad \star = \gamma_c^- . \tag{5.29} \]
Recalling the description of Sect. 2 and the discussion in Sect. 3 concerning the Witten complex, we can identify the holomorphic Dirac operator $\slashed{D}$ as an exterior derivative operator $d$. According to (5.28) and (5.29), the anti-holomorphic Dirac operator $\bar{\slashed{D}}$ can then be identified with the co-derivative $d^\dagger = \star d \star$. The duality isomorphisms of Sect. 4 then state that the string spacetime is invariant under the exchange between the exterior derivative and its dual $d \leftrightarrow d^\dagger = \star d \star$, which is another well-known characterization of target space duality in string theory. We can now proceed to construct the complex of differential forms of the noncommutative string spacetime as described in Sect. 2. As always our starting point is
\[ \Omega^0_{\slashed{D}} A = A. \tag{5.30} \]
Next, we can compute, for $V = I \otimes V \in A$, the exact one-form
\[ \pi_{\slashed{D}}(dV) = [\slashed{D}, V]. \tag{5.31} \]
Since the vertex operators of definite spacetime momentum $(q^+, q^-) \in \Lambda$ span the vertex operator algebra $A_X$, it follows that all commutators of the form $[J_\pm^\mu, V]$ sweep out the space $A_X$ as $V$ is varied, i.e. the Dirac operator $\slashed{D}$ acts densely on $H$, just as in the commutative case of Sect. 2. Using the explicit form of the Dirac operator we can thus identify the linear space of differential one-forms as
\[ \Omega^1_{\slashed{D}} A = A \otimes_{\mathbb{R}} \left( C(T^n)^+ \oplus C(T^n)^- \right) , \qquad \dim_A \Omega^1_{\slashed{D}} A = 2n \tag{5.32} \]
with basis $\{\gamma_\mu^+, \gamma_\mu^-\}_{\mu=1}^n$. Similarly, one can proceed to calculate the higher-degree spaces $\Omega^k_{\slashed{D}} A$ just as in the example of a spin-manifold described in Sect. 2. As occurred there, we will encounter junk forms for $k \geq 2$, which can be eliminated by antisymmetrizations of the gamma-matrices in each chiral sector. Since the left and right chiral sector gamma-matrices already anticommute, this need not be done for mixed chirality products of the $\gamma$’s. We therefore arrive at
\[ \Omega^k_{\slashed{D}} A = A \otimes_{\mathbb{R}} \left( \bigoplus_{i=0}^{k} C(T^n)^{+[i]} \otimes C(T^n)^{-[k-i]} \right) , \qquad \dim_A \Omega^k_{\slashed{D}} A = \sum_{i=0}^{k} \binom{n}{i} \binom{n}{k-i} , \tag{5.33} \]
where $C(T^n)^{\pm[j]}$ is the linear space spanned by the antisymmetrized products $\gamma^\pm_{[\mu_1} \cdots \gamma^\pm_{\mu_j]} = \frac{1}{j!} \sum_{\pi \in S_j} \mathrm{sgn}\,\pi \prod_{l=1}^{j} \gamma^\pm_{\mu_{\pi(l)}}$. The linear space (5.33) is defined for all $0 \leq k \leq n$. What is interesting about the noncommutative differential complex is that, unlike that of the torus $T^n$, forms of degree higher than $n$ exist. To construct $\Omega^l_{\slashed{D}} A$ for $l > n$, we exploit the interpretation of the chirality matrices above as Hodge duality operators. However, on the differential
6 For a definition of the Hodge duality operator in a more general setting which can be applied to the cases where n is odd, see [34].
complex, we use a slightly different representation than that given in (5.29) to avoid forms of negative degree. This is simply a matter of convenience, and the entire differential topology can be instead given using the Hodge dual in (5.29). Thus we take
\[ \pi_{\slashed{D}}(\star) = m_C \circ \tilde{\gamma} , \tag{5.34} \]
where $m_C$ is the multiplication operator on the double Clifford algebra $C(T^n)$. If one now proceeds to construct differential $l$-forms by antisymmetrizations of products of $l$ gamma-matrices $\gamma_\mu^\pm$ for $l > n$, it is straightforward to see that
\[ \Omega^l_{\slashed{D}} A \cong \tilde{\gamma} \cdot \Omega^{2n-l}_{\slashed{D}} A \cong \Omega^{2n-l}_{\slashed{D}} A \quad \text{for } l > n. \tag{5.35} \]
This process will terminate at $l = 2n$, so that the algebra of differential forms of the noncommutative spacetime is
\[ \Omega^*_{\slashed{D}} A = \bigoplus_{k=0}^{2n} \Omega^k_{\slashed{D}} A. \tag{5.36} \]
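The dimension counts in (5.25), (5.33) and (5.35) can be compared numerically. The sketch below evaluates the sum in (5.33) for $0 \leq k \leq n$, folds degrees $l > n$ back to $2n - l$ as in (5.35), and checks the result against the single binomial coefficient $\binom{2n}{k}$, which follows from the Vandermonde identity. The function names are illustrative and not part of the paper.

```python
from math import comb

def torus_betti(n, k):
    """dim H^k(T^n; R) from (5.25)."""
    return comb(n, k) if 0 <= k <= n else 0

def form_dimension(n, k):
    """dim_A of the space of k-forms of the doubled complex, 0 <= k <= 2n."""
    if not 0 <= k <= 2 * n:
        return 0
    if k > n:
        # (5.35): forms of degree l > n are identified with degree 2n - l.
        k = 2 * n - k
    # (5.33): sum over the chiral/anti-chiral split of the degree.
    return sum(comb(n, i) * comb(n, k - i) for i in range(k + 1))

n = 2
torus = [torus_betti(n, k) for k in range(n + 1)]
forms = [form_dimension(n, k) for k in range(2 * n + 1)]
print(torus)  # [1, 2, 1]
print(forms)  # [1, 4, 6, 4, 1]
```

For $n = 2$ the doubled complex has total dimension $16 = 2^{2n}$ over $A$, compared with $4 = 2^n$ for the ordinary torus, which is one way to see the “doubling” produced by the two chiral sectors.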
The action of the Dirac operator $\slashed{D}$ as defined in (5.31) gives a nilpotent linear map $d : \Omega^k_{\slashed{D}} A \to \Omega^{k+1}_{\slashed{D}} A$ with $d(\Omega^{2n}_{\slashed{D}} A) = \{0\}$, while that of $\bar{\slashed{D}}$ defined by (5.31) and (5.34) gives the adjoint nilpotent map $d^\dagger = \star d \star : \Omega^k_{\slashed{D}} A \to \Omega^{k-1}_{\slashed{D}} A$ with $d^\dagger(\Omega^0_{\slashed{D}} A) = \{0\}$. Thus the chirality structure of the worldsheet theory leads to a “doubling” in the differential complex of the string spacetime. Since the $A$-module $\Omega^1_{\slashed{D}} A$ is finitely-generated and projective, it can be viewed as a cotangent bundle, and one can proceed to equip it with connections, although at the level of a toroidal target space there is really no modification from the classical (commutative) case. Since $\Omega^1_{\slashed{D}} A$ is free with basis given by $\{\gamma_\mu^\pm\}_{\mu=1}^n$, we can define a connection $\nabla_{\slashed{D}} : \Omega^1_{\slashed{D}} A \to \Omega^1_{\slashed{D}} A \otimes_A \Omega^1_{\slashed{D}} A$ by [7, 23]
\[ \nabla_{\slashed{D}}\, \omega = [\slashed{D}, \omega]. \tag{5.37} \]
Thus $\{\gamma_\mu^\pm\}_{\mu=1}^n$ constitutes a parallel basis for $\Omega^1_{\slashed{D}} A$, i.e. $\nabla_{\slashed{D}}(\gamma_\mu^\pm \otimes I) = 0$, and $\nabla_{\slashed{D}}$ has vanishing curvature. Thus the string spacetime is a noncommutative space with flat connections of zero torsion, and the curvature properties of the toroidal general relativity are unchanged by stringy effects.

To compute the cohomology ring of the noncommutative spacetime, we define the cohomology group $H^k_{\slashed{D}}(A)$ to be the linear space spanned by the harmonic differential $k$-forms, i.e. the $k$-forms annihilated by both $\slashed{D}$ and $\bar{\slashed{D}}$ in the representation $\pi_{\slashed{D}}$ defined above. For instance, the harmonic zero-forms are the vertex operators $V \in A$ with $[\slashed{D}, V] = 0$, so that $H^0_{\slashed{D}}(A) \cong \mathrm{comm}\, \slashed{D}$. The situation for the higher degree cohomology groups is similar, except that now one obtains higher-dimensional spaces corresponding to the global string oscillations around the cycles of $T^n$. After some calculation, we find
\[ H^k_{\slashed{D}}(A) \cong \begin{cases} \mathrm{comm}\, \slashed{D} & \text{for } k = 0 \\ A_{\slashed{D},\bar{\slashed{D}}} \otimes_{\mathbb{R}} \mathbb{R}^{\dim_A \Omega^k_{\slashed{D}} A} & \text{for } 0 < k < 2n \\ \mathrm{comm}\, \bar{\slashed{D}} & \text{for } k = 2n \\ \{0\} & \text{otherwise} \end{cases} , \tag{5.38} \]
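where
\[ A_{\slashed{D},\bar{\slashed{D}}} \equiv \mathrm{comm}\, \slashed{D} \cap \mathrm{comm}\, \bar{\slashed{D}} \tag{5.39} \]
and the dimension of $\Omega^k_{\slashed{D}} A$ is given in (5.33) and by (5.35). The intermediate cohomology groups for $0 < k < 2n$ are characterized by vertex operators $V \in A$ with
\[ [\slashed{D}, V] = [\bar{\slashed{D}}, V] = 0. \tag{5.40} \]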
These harmonic $k$-forms are the vertex operators which constitute ‘isometries’ of the string spacetime. From Sect. 4, it follows that $A_{\slashed{D},\bar{\slashed{D}}}$ contains smeared tachyon vertex operators $I \otimes V(q^+, q^-)$ with $q^+ = q^- = 0$. In the low-energy projection onto $\bar{H}_0^{(-)}$, one finds that $\bar{P}_0^{(-)} A_{\slashed{D},\bar{\slashed{D}}} \bar{P}_0^{(-)} \cong \mathbb{C}$. But there are still higher-spin vertex operators of zero charge that survive in (5.39) in the general case. This is wherein most of the stringy modification of the topology of the classical spacetime lies, in that higher-spin oscillatory modes of the strings “excite” the cohomology groups (5.25) leading to generalized, infinitely-many connected components in the string spacetime. The spaces (5.38) essentially represent the vertex operators which are invariant under the global $U(1)_+^n \times U(1)_-^n$ Kac-Moody gauge symmetry of the string theory, and as such they represent the globally diffeomorphism-invariant spacetime observables of the noncommutative geometry. Explicit calculations can eliminate potential vertex operators from belonging to (5.39), for instance the graviton field (3.53). This space consists of those states which belong to the simultaneous zero-mode eigenspaces of the two Dirac operators $\slashed{D}$ and $\bar{\slashed{D}}$. At this stage though we have not found any elegant way of characterizing the cohomology (5.38) and it would be interesting to explore these spaces further. The cohomology for $k = 0$ is determined by the vertex operators of zero winding number but non-zero spacetime momentum, and vice-versa for $k = 2n$. The number of independent “$k$-cycles” is larger in general than $\binom{n}{k}$ because in the generic high-energy sector the string spacetime distinguishes between chirality combinations and accounts for string oscillations and windings about the circles of $T^n$. Using the $\mathbb{Z}_2$-grading $\tilde{\gamma}$ (Klein operator) and the Hodge dual $\star$ defined in (5.29), it is also possible to compute topological invariants, such as the Euler characteristic and the Hirzebruch signature, of the noncommutative geometry, in analogy to the Witten complex [9, 34, 35].

Notice that in a low-energy projection the cohomology groups (5.38) do not coincide with (5.25). One needs to first project the complex (5.36) and Dirac operators and then compute the cohomology groups along the lines described in Sect. 2. The key feature to this is that then the chirality sectors of the spaces (5.33) become equivalent, and for $l > n$ the spaces (5.35) “fold” back onto the $n$ linear spaces in (5.33). In the general case the cohomology (5.38) leads immediately to the mirror symmetry of the string spacetime. We can naturally define “Dolbeault” cohomology groups $H^{k,l}_{\slashed{D}}(A)$ by using the chiral and anti-chiral Clifford algebra decompositions in (5.33) to split the spaces (5.38) into holomorphic and anti-holomorphic combinations. Using the chirality matrix we then have $\gamma_c^- \cdot H^{k,l}_{\slashed{D}}(A) \cong H^{k,n-l}_{\slashed{D}}(A)$. Comparing the respective projections onto $\bar{H}_0^{(-)}$ and $H_0^{[+;\mu]}$ as described in Sect. 4 leads immediately to the usual statement of mirror symmetry between the “Dolbeault” cohomology groups arising from “foldings” in (5.38). Of course this analysis is only meant to be somewhat heuristic since, as mentioned before, a complete analysis of the topological properties of a noncommutative spacetime
entails more sophisticated techniques. But the above results show how the algebraic structures inherent in the vertex operator algebra modify the geometry and topology, as well as the general symmetry principles of general relativity. More non-trivial structures could be displayed by the Wess-Zumino-Witten models studied in [7, 9], and even in the generalizations of the conformal field theory (3.3) to arbitrary compact Riemann surfaces $\Sigma$ or to embedding fields $X^\mu$ which live in toroidal orbifold target spaces. The methods emphasized in this paper can be more or less straightforwardly extended to the analysis of such string theories.

Acknowledgement. We thank I. Giannakis for pointing out previous work on vertex operator algebra automorphisms in string theory to us, and G. Mason for comments on the manuscript. F.L. gratefully thanks the Theoretical Physics Department of Oxford University for hospitality during the course of this work. The work of R.J.S. was supported in part by the Natural Sciences and Engineering Research Council of Canada.
Appendix. Properties of Vertex Operator Algebras and Construction of Quantum Spacetimes

In string theory, vertex operators generate the algebra of observables of the underlying two-dimensional conformal quantum field theory whose action on the vacuum state $|\mathrm{vac}\rangle$ forms a dense subspace of vectors of the corresponding Hilbert space $H$ of physical states. The (anti-)chiral algebra $E^+$ ($E^-$) is defined to be the operator product algebra of the (anti-)holomorphic fields in the conformal field theory. The chiral algebras contain, in particular, copies of the Virasoro algebras characterizing the conformal invariance of the string theory. A rational conformal field theory is completely characterized by its chiral algebra. The fundamental relations of local conformal field theory are given by braiding and fusion relations in each chiral algebra, and the sewing transformations which combine the two chiral algebras (see (3.35)). These relations then combine into the operator product expansion (see (3.46)) on the full algebra $E = E^+ \otimes E^-$. They also lead to the property of locality, i.e. that the quantum fields commute whenever their arguments are space-like separated, and also the property of duality, i.e. crossing-symmetry of the 4-point functions on the Riemann sphere. These identities can be expressed in terms of a single, compact relation by turning to the formal notion of a Vertex Operator Algebra. This single identity is known as the “Jacobi identity” and it encodes the full non-triviality of the structure of the Vertex Operator Algebra (as the above relations do for the local conformal field theory). The standard discussion above can be cast into a formal form that is useful in the more algebraic applications of conformal field theory, such as that required in the noncommutative geometry of string spacetimes.
In this appendix we will define formally a Vertex Operator Algebra and describe some of its algebraic properties in the context of the construction of quantum spacetimes. More details can be found in, for example, the introductory reviews [13, 14], the book [12], and references therein.

Axiomatics. A vertex operator algebra consists of a $\mathbb{Z}$-graded complex vector space
\[ F = \bigoplus_{n \in \mathbb{Z}} F_n \tag{A.1} \]
and a linear map $V$ which associates to each element $\Psi \in F$ an endomorphism of $F$ that can be expressed as a formal sum in a variable $z$:
\[ V(\Psi, z) = \sum_{n \in \mathbb{Z}} \Psi_n z^{-n-1} \tag{A.2} \]
with $\Psi_n \in F_n$. The element $\Psi \in F$ is called a state and the endomorphism $V(\Psi, z)$ is called a vertex operator. The vertex operators $V(\Psi, z)$ are required to satisfy the following axioms:
1. Given any $\Phi \in F$ we have
\[ \Psi_n \Phi = 0 \quad \text{for } n \text{ sufficiently large}. \tag{A.3} \]
2. There is a preferred vector $\mathbf{1}$ called the vacuum such that
\[ \mathbf{1}_n = \delta_{n+1,0} \tag{A.4} \]
and therefore $V(\mathbf{1}, z)\Phi = \Phi$, $\forall \Phi \in F$.
3. $\Psi_n = 0$ $\forall n \in \mathbb{Z}$ $\iff$ $\Psi = 0$.
4. There exists a conformal vector $T$ with $T_{n+1} = L_n$, the $L_n$ satisfying the Virasoro Algebra (3.27) for some central charge $c \in \mathbb{C}$. This vector provides as well a translation generator:
\[ V(L_{-1}\Psi, z) = \frac{d}{dz} V(\Psi, z) \tag{A.5} \]
and the grading of $F$:
\[ L_0 \Psi_n = n \Psi_n \quad \forall \Psi_n \in F_n . \tag{A.6} \]
5. The spectrum of $L_0$ is bounded from below.
6. The eigenspaces $F_n$ of $L_0$ are finite-dimensional.
7. The vertex operators must also satisfy a Jacobi identity:
\[ \sum_{i \geq 0} (-1)^i \binom{l}{i} \Big( \Psi_{l+m-i}(\Phi_{n+i}\Xi) - (-1)^l \Phi_{l+n-i}(\Psi_{m+i}\Xi) \Big) = \sum_{i \geq 0} \binom{m}{i} (\Psi_{l+i}\Phi)_{m+n-i}\Xi \tag{A.7} \]
for all $\Psi, \Phi, \Xi \in F$, $l, m, n \in \mathbb{Z}$. Axioms 1–7 define a Vertex Operator Algebra, which contains the Virasoro algebra (Axiom 4). The Jacobi identity (A.7) contains the most information about the algebra. Three special cases of it are particularly interesting. They represent associativity:
\[ (\Psi_m \Phi)_n = \sum_{i \geq 0} (-1)^i \binom{m}{i} \Big( \Psi_{m-i}\Phi_{n+i} - (-1)^m \Phi_{m+n-i}\Psi_i \Big) , \tag{A.8} \]
the commutator formula:
\[ [\Psi_m , \Phi_n] = \sum_{i \geq 0} \binom{m}{i} (\Psi_i \Phi)_{m+n-i} , \tag{A.9} \]
and skew-symmetry:
\[ \Psi_n \Phi = (-1)^{n+1} \Phi_n \Psi + \sum_{i \geq 1} \frac{1}{i!} (-1)^{i+n+1} (L_{-1})^i (\Phi_{n+i}\Psi) \tag{A.10} \]
for all $\Psi, \Phi \in F$, $m, n \in \mathbb{Z}$. The Jacobi identity therefore encodes the complete noncommutativity (as well as other nontrivial properties) of the vertex operator algebra.

Conformal highest weight vectors. Vertex operators generate states when applied to the vacuum (this is the original motivation for their name):
\[ V(\Psi, z)\mathbf{1} = e^{zL_{-1}} \Psi. \tag{A.11} \]
In general, the conformal highest weight vectors (or primary fields) are states which satisfy:
\[ L_0 \Psi = \Delta_\Psi \Psi , \qquad L_n \Psi = 0 \quad \forall n > 0, \tag{A.12} \]
where $\Delta_\Psi$ is the conformal weight of the vector $\Psi$ (defined by (A.6)). In particular, the vacuum is a primary state of weight zero.

Algebra of primary fields of weight one. The primary fields of weight one form a Lie algebra
\[ L \equiv F_1 / (F_1 \cap L_{-1}F) = F_1 / L_{-1}F_0 \tag{A.13} \]
with antisymmetric bracket:
\[ [\Psi, \Phi] \equiv \Psi_0 \Phi \tag{A.14} \]
and $L$-invariant bilinear form:
\[ \langle \Psi, \Phi \rangle \equiv \Psi_1 \Phi, \tag{A.15} \]
provided that the spectrum of $L_0$ is non-negative and the weight zero subspace $F_0$ is one-dimensional. The former constraint is usually imposed out of physical considerations, as often $L_0$ is identified with the Hamiltonian of the system. The quotient of $F_1$ in (A.13) is by the set of spurious states. The classical Jacobi identity for the Lie bracket (A.14) follows from the Jacobi identity for the vertex operator algebra, while the symmetry of the inner product (A.15) follows directly from the skew-symmetry property. Setting $\Psi = \Phi = \mathbf{1}$ in (A.7), using the vacuum axiom 2 and definition (A.2), we obtain the usual Cauchy theorem of classical complex analysis. Thus the Jacobi identity for vertex operator algebras is a combination of the classical Jacobi identity for Lie algebras and the Cauchy residue formula for meromorphic functions.

Vertex operators and even lattices. One of the most important results in the theory of vertex operator algebras (and the one of great relevance to the present work) is the theorem which states that: Associated with any even positive-definite lattice $\Gamma$ there is a vertex operator algebra. The proof of the theorem is a constructive procedure. The construction is in fact the one which associates a bosonic string theory compactified on a torus, as we did in Sect. 3. In this paper we had a Fock space as our starting point. In general this is not necessary, and the Fock space can actually be constructed starting from the lattice $\Gamma$, so that the lattice is the only ingredient necessary for the construction. The proof that the algebra one constructs is indeed a vertex operator algebra, as well as the details of the formal construction, can be found in [12, 14].
In the case of the vertex operator algebra of Sect. 3, the algebra one constructs is actually the chiral algebra $E^\pm$, the endomorphisms on $\mathbb{C}\{\Gamma\}^\pm \otimes S(\hat{h}^{(-)})$ (respectively on $\Gamma^*$). The full vertex operator algebra is then constructed using the sewing transformation (3.35). For the chiral algebras, the Jacobi identity follows from the analogous braiding relations (3.46) which lead to the fusion relations [12]
\[ \begin{aligned} V\Big( V\big(V^{(R)}_{[q^\pm]}[z_\pm], z_\pm\big)\, V^{(S)}_{[r^\pm]}[w_\pm],\, w_\pm \Big) = {} & \prod_{1 \leq (i,j) \leq (R,S)} \Big( z_\pm + \big(z_\pm^{(i)} - w_\pm^{(j)}\big) \Big)^{g^{\mu\nu} q_\mu^{(i)\pm} r_\nu^{(j)\pm}} \\ & \times {:}\, V\big(V^{(R)}_{[q^\pm]}[z_\pm], z_\pm + w_\pm\big)\, V\big(V^{(S)}_{[r^\pm]}[w_\pm], w_\pm\big) \,{:} \end{aligned} \tag{A.16} \]
By the completeness of the chiral vertex operators $V^{(R)}_{[q^\pm]}[z_\pm]$ on the respective Hilbert spaces, the relation (A.16) leads immediately to the Jacobi identity for the chiral vertex operator algebras $E^\pm$. Note that self-duality $\Gamma = \Gamma^*$ is not a requirement. In the general case, we have the coset decomposition
\[ \Gamma^* = \bigcup_{x \in \Gamma^*/\Gamma} (\Gamma + \lambda_x) \tag{A.17} \]
of the dual lattice with $\lambda_0 = 0$. It can be shown [12, 46] that $\{R_x \mid x \in \Gamma^*/\Gamma\}$ is the complete set of inequivalent irreducible representations of the vertex operator algebra, where
\[ R_x = \mathbb{C}\{\Gamma + \lambda_x\} \otimes S(\hat{h}^{(-)}). \tag{A.18} \]
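For a positive-definite lattice, the number of cosets in (A.17), and hence the number of representations $R_x$ in (A.18), equals the determinant of the Gram matrix of $\Gamma$. The sketch below computes this number exactly for three root lattices chosen purely for illustration (they are not examples taken from the paper): $A_1$, $A_2$ and the self-dual lattice $E_8$, for which there is a single coset. The Gram matrices are taken to be the Cartan matrices in a basis of simple roots, and the function names are assumptions of this example.

```python
from fractions import Fraction

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def gram_from_edges(rank, edges):
    """Gram matrix with norm-2 basis vectors and inner product -1 on edges."""
    gram = [[2 if i == j else 0 for j in range(rank)] for i in range(rank)]
    for i, j in edges:
        gram[i][j] = gram[j][i] = -1
    return gram

lattices = {
    "A1": gram_from_edges(1, []),
    "A2": gram_from_edges(2, [(0, 1)]),
    # E8: a chain of seven nodes with an eighth node attached to the third.
    "E8": gram_from_edges(8, [(i, i + 1) for i in range(6)] + [(2, 7)]),
}

coset_counts = {name: int(determinant(g)) for name, g in lattices.items()}
print(coset_counts)  # {'A1': 2, 'A2': 3, 'E8': 1}
```

A count of one corresponds to a self-dual lattice, for which the only representation in (A.18) is the one with $\lambda_0 = 0$.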
Then $\widehat{H}_X(\Gamma^*) = \bigoplus_{x \in \Gamma^*/\Gamma} R_x$, and the representations (A.18) can be identified with the “points” of a noncommutative spacetime. Furthermore, the corresponding linear space of characters $\phi_{R_x} \equiv \mathrm{tr}_{R_x}\, q^{L_0 - c/24}$, where $q = e^{\pi i \tau}$ with $\mathrm{Im}\, \tau > 0$, is modular invariant with respect to $SL(2, \mathbb{Z})$ [47]. If the lattice $\Gamma$ is self-dual then the vertex operator algebra is holomorphic, i.e. it is its only irreducible representation. This shows how the symmetries of $\Gamma$ control the structure of the associated spacetime. In the cases studied throughout this paper, we can now see how the structure of the quantum spacetime is intimately tied to the properties of the compactification lattice. Those lattices associated to large symmetries of the spacetime, such as that associated with the Monster group, give smaller quantum spacetimes than those associated to nonsymmetrical lattices, i.e. large gauge symmetries essentially exhaust the full structure of the spacetime. This increase in symmetry of the spacetime from a decrease in the number of its “points” is similar to the effect of increasing the number of elements of an algebra to gain a decrease in the number of points of a topological space. Furthermore, given a compact automorphism group $G$ of the vertex operator algebra as in Sect. 5, each $G$-module $\widehat{H}_X^{[R(G)]}(\Lambda)$ is an irreducible representation of the $G$-invariant subalgebra $\widehat{H}_X^{(0)}(\Lambda)$ [48]. This exemplifies, in particular, how the construction of the commutative low-energy projective subspaces carries through from the structure of the vertex operator algebra. Moreover, the above results show that more general theories than the ones we have presented in this paper will also lead to vertex operator algebras, and thus similar noncommutative spacetimes. For example, the allowed momenta and winding modes
of heterotic string theory live on an $(n + 16, n)$-dimensional even self-dual Lorentzian lattice [33], and the construction of Sect. 4 can be used to show that the target space duality group in this case is isomorphic to $O(n + 16, n; \mathbb{Z})$ [1]. In this general class of vertex operator algebras, $F_1$ is a Lie algebra with generators $\varepsilon_q$ and $q_\mu\alpha^{\mu}_{-1}$, where $q \in \Gamma$. Its root lattice is precisely the lattice $\Gamma$, and the affinization of $F_1$ then yields the usual Frenkel–Kac construction of affine Lie algebras [37]. Note that, as a subspace of the noncommutative spacetime, the subalgebra $F_1$ contains the lowest non-trivial oscillatory modes of the strings, so that the universal gauge symmetry of the string spacetime coincides with the smallest excitations of the commutative subspaces. Thus, more general spacetime gauge symmetric structures can also be incorporated into the constructions presented in this paper (leading to analogs of the results of Sect. 5), so that our results extend naturally to a larger class of models than just the linear sigma-models described here.
References
1. Giveon, A., Porrati, M. and Rabinovici, E.: Target Space Duality in String Theory. Phys. Rep. 244, 77 (1994), (hep-th/9401139)
2. Alvarez-Gaumé, L. and Hassan, S.F.: Introduction to S-Duality in N = 2 Supersymmetric Gauge Theories: A Pedagogical Review of the work of Seiberg and Witten. Fortsch. Phys. 45, 159 (1997), (hep-th/9701069)
3. Witten, E.: String Theory Dynamics in Various Dimensions. Nucl. Phys. B443, 85 (1995); Schwarz, J.H.: The Power of M Theory. Phys. Lett. B367, 97 (1996); Lectures on Superstring and M Theory Dualities. Nucl. Phys. B (Proc. Suppl.) 55B, 1 (1997); Duff, M.J.: M Theory (The Theory formerly known as Strings). Intern. J. Mod. Phys. A11, 5623 (1996)
4. Witten, E.: Bound States of Strings and p-branes. Nucl. Phys. B460, 335 (1996), (hep-th/9510135); Shenker, S.H.: Another Length Scale in String Theory? hep-th/9509132; Danielsson, U.H., Ferretti, G. and Sundborg, B.: D-particle Dynamics and Bound States. Intern. J. Mod. Phys. A11, 5463 (1996), (hep-th/9603081); Kabat, D. and Pouliot, P.: A Comment on Zero-brane Quantum Mechanics. Phys. Rev. Lett. 77, 1004 (1996), (hep-th/9603127); Douglas, M.R., Kabat, D., Pouliot, P. and Shenker, S.H.: D-branes and Short Distances in String Theory. Nucl. Phys. B485, 85–127 (1997), (hep-th/9608024)
5. Banks, T., Fischler, W., Shenker, S.H. and Susskind, L.: M Theory as a Matrix Model: A Conjecture. Phys. Rev. D55, 5112–5128 (1997), (hep-th/9610043)
6. Connes, A.: Noncommutative Geometry. London–New York: Academic Press, 1994
7. Fröhlich, J. and Gawędzki, K.: Conformal Field Theory and Geometry of Strings. CRM Proc. Lecture Notes 7, 57–97 (1994), (hep-th/9310187)
8. Chamseddine, A.H. and Fröhlich, J.: Some Elements of Connes’ Noncommutative Geometry and Spacetime Geometry. In: Yang Festschrift, eds. C.S. Liu and S.-F. Yau, Boston: International Press, 1995, pp. 10–34
9. Fröhlich, J., Grandjean, O. and Recknagel, A.: Supersymmetric Quantum Theory, Noncommutative Geometry and Gravitation. In: Quantum Symmetries, Proc. LXIV Les Houches Session, eds. A. Connes and K. Gawędzki, to appear, (hep-th/9706132)
10. Chamseddine, A.H.: The Spectral Action Principle in Noncommutative Geometry and the Superstring. Phys. Lett. B400, 87 (1997), (hep-th/9701096); An Effective Superstring Spectral Action. Phys. Rev. D56, 3555 (1997), (hep-th/9705153)
11. Chamseddine, A.H. and Connes, A.: Universal Formula for Noncommutative Geometry Actions: Unification of Gravity and the Standard Model. Phys. Rev. Lett. 77, 4868 (1996); The Spectral Action Principle. Commun. Math. Phys. 186, 731 (1997), (hep-th/9606001)
12. Frenkel, I.B., Lepowsky, J. and Meurman, A.: Vertex Operator Algebras and the Monster. Pure Appl. Math. 134, New York: Academic Press, 1988
13. Huang, Y.-Z.: Vertex Operator Algebras and Conformal Field Theory. Intern. J. Mod. Phys. A7, 2109–2151 (1992)
14. Gebert, R.W.: Introduction to Vertex Algebras, Borcherds Algebras and the Monster Lie Algebra. Intern. J. Mod. Phys. A8, 5441–5503 (1993), (hep-th/9308151)
15. Evans, M. and Giannakis, I.: T Duality in Arbitrary String Backgrounds. Nucl. Phys. B472, 139–162 (1996); Giannakis, I.: O(d, d; Z) Transformations as Automorphisms of the Operator Algebra. Phys. Lett. B388, 543–549 (1996)
16. Lizzi, F. and Szabo, R.J.: Target Space Duality in Noncommutative Geometry. Phys. Rev. Lett. 79, 3581 (1997), (hep-th/9706107)
17. Douglas, M.R.: Superstring Dualities, Dirichlet Branes and the Small Scale Structure of Space. In: Quantum Symmetries, Proc. LXIV Les Houches Session, eds. A. Connes and K. Gawędzki, to appear, (hep-th/9610041); Ho, P.-M. and Wu, Y.-S.: Noncommutative Geometry and D-branes. Phys. Lett. B398, 52–60 (1997), (hep-th/9611233); Kalkkinen, J.: Dimensionally Reduced Yang–Mills Theories in Noncommutative Geometry. Phys. Lett. B399, 243 (1997), (hep-th/9612027)
18. Várilly, J.C. and Gracia-Bondía, J.M.: Connes’ Noncommutative Differential Geometry and the Standard Model. J. Geom. Phys. 12, 223 (1993)
19. Landi, G.: An Introduction to Noncommutative Spaces and their Geometries. Springer Lecture Notes in Physics 51, Berlin–Heidelberg: Springer-Verlag, 1997
20. Madore, J.: An Introduction to Noncommutative Geometry and its Physical Applications. LMS Lecture Notes 206, (1995)
21. Fell, J.M.G. and Doran, R.S.: Representations of ∗-Algebras, Locally Compact Groups and Banach ∗-Algebraic Bundles. London–New York: Academic Press, 1988
22. Dixmier, J.: Les C∗-algèbres et leurs Représentations. Paris: Gauthier-Villars, 1964
23. Chamseddine, A.H., Felder, G. and Fröhlich, J.: Gravity in Noncommutative Geometry. Commun. Math. Phys. 155, 205–217 (1993)
24. Connes, A. and Lott, J.: Particle Models and Noncommutative Geometry. Nucl. Phys. B (Proc. Suppl.) 18B, 29 (1990); The Metric Aspect of Noncommutative Geometry. In: New Symmetry Principles in Quantum Field Theory, eds. J. Fröhlich, G. ’t Hooft, A. Jaffe, G. Mack, P.K. Mitter and R. Stora, New York: Plenum Press, 1992, pp. 53–94
25. Martín, C.P., Gracia-Bondía, J.M. and Várilly, J.C.: The Standard Model as a Noncommutative Geometry: The Low Energy Regime. Phys. Rep. 294, 363 (1998), (hep-th/9605001)
26. Iochum, B., Kastler, D. and Schücker, T.: On the Universal Chamseddine–Connes Action I. Details of the Action Computation. J. Math. Phys. 38, 4929 (1997), (hep-th/9607015)
27. Figueroa, H., Lizzi, F., Gracia-Bondía, J.M. and Várilly, J.C.: A Nonperturbative Form of the Spectral Action Principle in Noncommutative Geometry. hep-th/9701179, to appear in J. Geom. Phys.
28. Madore, J.: The Fuzzy Sphere. Class. Quant. Grav. 9, 69 (1992)
29. Balachandran, A.P., Bimonte, G., Ercolessi, E., Landi, G., Lizzi, F., Sparano, G. and Teotonio-Sobrinho, P.: Noncommutative Lattices as Finite Approximations. J. Geom. Phys. 18, 163–194, (hep-th/9510217); Finite Quantum Physics and Noncommutative Geometry. Nucl. Phys. B (Proc. Suppl.) 37C, 20–45 (1995), (hep-th/9403067)
30. Doplicher, S., Fredenhagen, K. and Roberts, J.E.: Spacetime Quantization induced by Classical Gravity. Phys. Lett. B331, 39–44 (1994); The Quantum Structure of Spacetime at the Planck Scale and Quantum Fields. Commun. Math. Phys. 172, 187–220 (1995); Kempf, A. and Mangano, G.: Minimal Length Uncertainty Relation and Ultraviolet Regularization. Phys. Rev. D55, 7909–7920 (1997), (hep-th/9612084); Mangano, G.: Path Integral Approach to Noncommutative Spacetimes. gr-qc/9705040, to appear in J. Math. Phys.
31. Majid, S.: Foundations of Quantum Group Theory. Cambridge: Cambridge University Press, 1995
32. Wess, J. and Zumino, B.: Covariant Differential Calculus on the Quantum Hyperplane. Nucl. Phys. B (Proc. Suppl.) 18B, 302 (1991)
33. Narain, K.S.: New Heterotic String Theories in Uncompactified Dimensions < 10. Phys. Lett. B169, 41 (1986); Narain, K.S., Sarmadi, M.H. and Witten, E.: A Note on Toroidal Compactification of Heterotic String Theory. Nucl. Phys. B279, 369 (1987)
34. Fröhlich, J., Grandjean, O. and Recknagel, A.: Supersymmetric Quantum Theory and (Noncommutative) Differential Geometry. hep-th/9612205, to appear in Commun. Math. Phys.
35. Witten, E.: Supersymmetry and Morse Theory. J. Diff. Geom. 17, 661 (1982); Constraints on Supersymmetry Breaking. Nucl. Phys. B202, 253–316 (1982)
36. Witten, E.: Global Anomalies in String Theory. In: Anomalies, Geometry and Topology, eds. W. Bardeen and A. White, Singapore: World Scientific, 1985, pp. 61–99; Noncommutative Geometry and String Field Theory. Nucl. Phys. B268, 253–294 (1986)
37. Goddard, P. and Olive, D.: Kac–Moody and Virasoro Algebras in relation to Quantum Physics. Intern. J. Mod. Phys. A1, 303–414 (1986)
38. Belavin, A.A., Polyakov, A.M. and Zamolodchikov, A.B.: Infinite Conformal Symmetry in Two-dimensional Quantum Field Theory. Nucl. Phys. B241, 333–380 (1984); Moore, G. and Seiberg, N.: Classical and Quantum Conformal Field Theory. Commun. Math. Phys. 123, 177 (1989)
39. Connes, A.: Gravity coupled with Matter and the Foundation of Noncommutative Geometry. Commun. Math. Phys. 182, 155 (1996)
40. Evans, M. and Ovrut, B.: Spontaneously Broken Intermass Level Symmetries in String Theory. Phys. Lett. B231, 80 (1989); Deformations of Conformal Field Theories and Symmetries of the String. Phys. Rev. D41, 3149 (1990); Evans, M. and Giannakis, I.: Gauge Covariant Deformations, Symmetries and Free Parameters of String Theory. Phys. Rev. D44, 2467–2479 (1991)
41. Ginsparg, P.: Applied Conformal Field Theory. In: Fields, Strings and Critical Phenomena, eds. E. Brézin and J. Zinn-Justin, Amsterdam: North Holland, 1990, pp. 1–168
42. Hull, C.M.: Higher Spin Extended Conformal Algebras and W Gravities. Nucl. Phys. B353, 707–756 (1991); Witten, E. and Zwiebach, B.: Algebraic Structures and Differential Geometry in 2D String Theory. Nucl. Phys. B377, 55 (1992)
43. Evans, M., Giannakis, I. and Nanopoulos, D.V.: An Infinite-dimensional Symmetry Algebra in String Theory. Phys. Rev. D50, 4022–4031 (1994)
44. Moore, G.: Finite in all Directions. hep-th/9305139
45. Borcherds, R.E.: Generalized Kac–Moody Algebras. J. Algebra 115, 501 (1988); Monstrous Moonshine and Monstrous Lie Superalgebras. Invent. Math. 109, 405–444 (1992)
46. Borcherds, R.E.: Proc. Nat. Acad. Sci. USA 83, 3068 (1986); Dong, C.: Vertex Algebras associated with Even Lattices. J. Algebra 160, 245–265 (1993)
47. Zhu, Y.: Modular Invariance of Characters of Vertex Operator Algebras. J. Am. Math. Soc. 9, 237–302 (1996); Dong, C., Li, H. and Mason, G.: Twisted Representations of Vertex Operator Algebras. To appear in Mathematische Annalen, q-alg/9509005
48. Dong, C., Li, H. and Mason, G.: Compact Automorphism Groups of Vertex Operator Algebras. q-alg/9608009

Communicated by A. Connes
Commun. Math. Phys. 197, 713 – 727 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Twistor Spaces for QKT Manifolds
P. S. Howe¹, A. Opfermann², G. Papadopoulos²
¹ Department of Mathematics, King’s College London, London WC2R 2LS, UK
² DAMTP, Silver Street, University of Cambridge, Cambridge CB3 9EW, UK
Received: 13 October 1997 / Accepted: 27 March 1998
Abstract: We find that the target space of two-dimensional (4,0) supersymmetric sigma models with torsion coupled to (4,0) supergravity is a QKT manifold, that is, a quaternionic Kähler manifold with torsion. We give four examples of geodesically complete QKT manifolds, one of which is a generalisation of the LeBrun geometry. We then construct the twistor space associated with a QKT manifold and show that under certain conditions it is a Kähler manifold with a complex contact structure. We also show that, for some 4k-dimensional QKT manifold, there is an associated 4(k+1)-dimensional hyper-Kähler one.
1. Introduction
The geometry of the target space of two-dimensional sigma models with extended supersymmetry is described by the properties of a metric connection with torsion [1, 2]. Rigid (4,0) supersymmetry requires that the target space of two-dimensional sigma models without Wess–Zumino term (torsion) is a hyper-Kähler (HK) manifold. In the presence of torsion the geometry of the target space becomes hyper-Kähler with torsion (HKT) [3]. Manifolds with either HK or HKT structure admit three complex structures which obey the algebra of imaginary unit quaternions, and the sigma model metric is hermitian with respect to all complex structures. In addition, in a HK geometry the complex structures are covariantly constant with respect to the Levi–Civita connection, while in a HKT geometry the complex structures are covariantly constant with respect to a metric connection with torsion. Local (4,0) supersymmetry requires that the target space of two-dimensional sigma models with torsion is either (i) HKT or (ii) a generalisation of the standard quaternionic Kähler (QK) geometry (see [4, 5]) for which the associated metric connection has torsion [6]; we shall call this geometry quaternionic Kähler with torsion (QKT). This is unlike the case of (4,4) locally supersymmetric sigma models where it has been shown that the geometry of the target space is either of HKT type or it
is the standard quaternionic Kähler geometry [7]. Thus a QKT geometry is not compatible with (4,4) local supersymmetry. Nevertheless, the conditions on the geometry of the target space of two-dimensional sigma models required by (4,0) local supersymmetry can be derived from an appropriate truncation of the conditions found for the (4,4) locally supersymmetric ones. It is well known in a QK geometry that the holonomy of the Levi–Civita connection is a subgroup of Sp(k) · Sp(1). Similarly, a QKT geometry is characterised by the fact that the holonomy of a metric connection with torsion is a subgroup of Sp(k) · Sp(1). The torsion is the exterior derivative of the Wess–Zumino term of the sigma model action and is therefore a closed three-form on the sigma model manifold, at least in the classical theory. In this paper, we list the conditions on the target manifold of a sigma model required by (4,0) local supersymmetry and thus derive the restrictions on a Riemannian manifold that must be satisfied in order for it to admit a QKT geometry. We shall then explore some of the properties of a QKT geometry. In particular, we shall show that for every four-dimensional quaternionic Kähler manifold there is an associated class of QKT manifolds. These manifolds are parameterised by harmonic functions (possibly with delta function singularities on the QK manifold). This gives a large class of QKT manifolds since every orientable 4-manifold is QK due to the fact that SO(4) = Sp(1) · Sp(1). Using this method, we present four examples of complete four-dimensional QKT manifolds. Allowing $dH \neq 0$, we show that any 4k-dimensional QKT manifold admits a twistor construction. We construct the twistor space of a QKT manifold and show that it is a Kähler manifold with a complex contact structure provided that $k > 1$ and $dH$ is a (2,2)-form with respect to all three complex structures. In addition, we associate to some 4k-dimensional QKT manifold a 4(k + 1)-dimensional HK one which is a fibre bundle over the QKT manifold with fibre $\mathbb{C}^2 - \{0\}$. In the limit that the torsion $H$ vanishes the results of Salamon [5] and Swann [8] for QK manifolds are recovered. This paper is organised as follows: in Sect. Two we state the algebraic and differential conditions required by a QKT geometry on a Riemannian manifold; in Sect. Three we present examples of QKT manifolds; in Sect. Four we give the twistor construction for QKT manifolds and show that under certain conditions the twistor space admits a complex contact structure; in Sect. Five we show that for some QKT manifold there is an associated HK one, and in Sect. Six we make some concluding remarks.
2. Local (4,0) Supersymmetry
The multiplets required for the construction of a two-dimensional (4,0) locally supersymmetric theory coupled to sigma model matter are as follows: (i) the supergravity multiplet $(e, C, \psi)$, which comprises the graviton $e$, an SO(3) gauge field $\{C^r; r = 1, 2, 3\}$ and four Majorana–Weyl gravitini $\{\psi, \psi^r; r = 1, 2, 3\}$; (ii) sigma model scalar multiplets $(\phi, \chi)$, each comprised of four real scalars $\phi$ and four Majorana–Weyl fermions $\chi$. The spinors of the scalar multiplet have opposite chirality to those of the supergravity one. Let $M$ be the sigma model manifold of dimension $4k$ with metric $g$, Wess–Zumino 3-form $H$, an LSO(3)-valued one-form $B$ and three almost complex structures $\{I_r; r = 1, 2, 3\}$. The Lagrangian¹ that describes the (4,0)-supergravity multiplet coupled to $k$ scalar multiplets is
¹ The letters from the beginning of the Greek alphabet, $\alpha, \beta, \gamma, \delta = 0, 1$, are worldvolume indices and the letters from the middle of the Greek alphabet, $\lambda, \mu, \nu, \kappa = 1, \ldots, 4k$, are target space indices. We have also suppressed spinor indices.
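$$\begin{aligned}
2e^{-1}L ={}& g_{\mu\nu}\,\partial_\alpha\phi^\mu\,\partial^\alpha\phi^\nu - \epsilon^{\alpha\beta}\,b_{\mu\nu}\,\partial_\alpha\phi^\mu\,\partial_\beta\phi^\nu \\
& - i g_{\mu\nu}\,\bar\chi^\mu\gamma^\alpha D_\alpha\chi^\nu + i g_{\mu\nu}\,\bar\chi^\mu\gamma^\alpha\gamma^\beta\,\partial_\beta\phi^\kappa\left(\delta_\kappa{}^\nu\psi_\alpha - (I_r)_\kappa{}^\nu\psi^r_\alpha\right) \\
& - \frac{1}{3}\,\bar\chi^\lambda\gamma^\alpha\chi^\nu\,\bar\chi^\mu\left(3H_{\kappa[\lambda\nu}(I_r)_{\mu]}{}^\kappa\,\psi^r_\alpha + H_{\mu\lambda\nu}\,\psi_\alpha\right) \\
& - \frac{1}{8}\,g_{\mu\kappa}\,\bar\chi^\mu\gamma^\alpha\chi^\nu\left((I_r)_\nu{}^\lambda\,\bar\psi^r_\beta + \delta_\nu{}^\lambda\,\bar\psi_\beta\right)\gamma_\alpha\gamma^\delta\gamma^\beta\left((I_r)_\lambda{}^\kappa\,\psi^r_\delta - \delta_\lambda{}^\kappa\,\psi_\delta\right),
\end{aligned} \tag{1}$$
where
$$D_\alpha\chi^\mu = \nabla^{(+)}_\alpha\chi^\mu + B^r_\alpha\,(I_r)^\mu{}_\nu\,\chi^\nu - \frac{1}{2}\,\omega_\alpha\chi^\mu - C^r_\alpha\,(I_r)^\mu{}_\nu\,\chi^\nu, \tag{2}$$
$B^r_\alpha$ is the pull back of $B^r_\mu$ with respect to $\phi$, $\omega_\alpha$ is the spin connection of the worldvolume and the covariant derivatives $\nabla^{(\pm)}$ are associated with the connections
$$\Gamma^{(\pm)\mu}{}_{\nu\kappa} = \hat\Gamma^{\mu}{}_{\nu\kappa} \pm \frac{1}{2}\,H^{\mu}{}_{\nu\kappa}; \tag{3}$$
$\hat\Gamma$ is the Levi–Civita connection of the metric $g$. To simplify the notation we set $\Gamma = \Gamma^{(+)}$ ($\nabla = \nabla^{(+)}$). The conditions on the geometry of $M$ required by (4,0) local supersymmetry can be found by appropriately truncating the conditions required by (4,4) local supersymmetry [7]. The former are the following:
$$\begin{aligned}
& I_r I_s = -\delta_{rs} + \epsilon_{rst}\,I_t, \\
& (I_r)^\kappa{}_\mu\,(I_r)^\lambda{}_\nu\,g_{\kappa\lambda} = g_{\mu\nu}, \\
& D_\mu (I_r)_\kappa{}^\nu = 0, \\
& N_{\hat D}(I_r)^\mu{}_{\nu\kappa} = 0, \qquad r = 1, 2, 3,
\end{aligned} \tag{4}$$
where
$$D_\mu (I_r)_\kappa{}^\nu = \nabla_\mu (I_r)_\kappa{}^\nu + B_\mu{}^r{}_s\,(I_s)_\kappa{}^\nu, \qquad B^r{}_s = -2B^t\,\epsilon_{trs}. \tag{5}$$
In addition
$$N_{\hat D}(I_r)^\mu{}_{\nu\kappa} = (I_r)_\nu{}^\lambda\,\hat D_{[\lambda}(I_r)_{\kappa]}{}^\mu - (\nu \leftrightarrow \kappa) \tag{6}$$
is a Nijenhuis-like tensor associated with the covariant derivative
$$\hat D_\mu (I_r)_\nu{}^\kappa = \hat\nabla_\mu (I_r)_\nu{}^\kappa + B_\mu{}^r{}_s\,(I_s)_\nu{}^\kappa, \tag{7}$$
where $\hat\nabla$ is the Levi–Civita covariant derivative. We remark that this Nijenhuis tensor is independent of the Levi–Civita part of $\hat D$. The first three conditions in (4) imply that (i) the almost complex structures, $I_r$, obey the algebra of imaginary unit quaternions, (ii) the metric $g$ is hermitian with respect to all almost complex structures and (iii) the holonomy of the connection $D$ is a subgroup of Sp(k) · Sp(1), respectively. The covariantised Nijenhuis condition, $N_{\hat D}(I_r) = 0$, together with the third condition in (4) imply that the torsion is a (1,2)- and (2,1)-form with respect to all almost complex structures. We remark that in the commutator of two supersymmetry transformations, apart from $N_{\hat D}(I_r)$, the mixed covariantised Nijenhuis “tensors”, $N_{\hat D}(I_r, I_s)$, appear as well (see for example [9]). However they do not give independent conditions on the almost complex structures since they vanish provided that $N_{\hat D}(I_r) = 0$.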
In analogy with the HKT case [3], we say that the manifold $M$ with tensors $g$, $I$, $B$ and $H$ that satisfy (4) has a weak QKT structure if no further conditions are imposed on $H$. However, if in addition we take $H$ to be a closed 3-form ($dH = 0$), we say that $M$ has a strong QKT structure, in which case we can write
$$H = 3\,db \tag{8}$$
for some locally defined two-form $b$ on $M$. Finally, if $H$ vanishes, the manifold $M$ becomes quaternionic Kähler. The target space, $M$, of a (classical) (4,0) locally supersymmetric sigma model with torsion is a manifold with a strong QKT structure. The couplings of the classical action of the theory are the metric $g$, the LSO(3)-valued one-form $B$ and the two-form $b$. However, in the quantum theory and in particular in the context of the anomaly cancellation mechanism [10–12], the (classical) torsion $H$ of (2,0)-supersymmetric sigma models receives corrections². The new torsion is not a closed three-form. Therefore, although classically the target space of (4,0)-supersymmetric sigma models has a strong QKT structure, quantum mechanically this may change to a weak QKT structure, albeit of a particular type. It is well known that all 4k-dimensional QK manifolds are Einstein, i.e.
$$R_{\mu\nu} = \Lambda\,g_{\mu\nu}, \tag{9}$$
and that
$$G^r_{\mu\nu} = \left(dB + B\wedge B\right)^r_{\mu\nu} = -\frac{\Lambda}{k+2}\,(I^r)_{\mu\nu}, \tag{10}$$
where $\Lambda$ is a constant and $G$ is the curvature of the $B$ connection. There is no direct analogue of these statements in the context of a QKT geometry. However, one can show that if the curvature $G$ of the $B$ connection satisfies (10) then the torsion $H$ vanishes. To show this, we first differentiate (10) with respect to the $\nabla_\kappa$ connection and then antisymmetrise in all three $\kappa, \mu, \nu$ indices. Then, using the fact that $DI_r = 0$, we find that the right-hand side of the equation can be expressed in terms of $B$ and $I_r$. Using (10) once more we find that the left-hand side of the equation is expressed in terms of the torsion and a term similar to that of the right-hand side. Finally, one gets
$$(I_r)_{[\kappa}{}^{\lambda}\,H_{\mu\nu]\lambda} = 0. \tag{11}$$
Using this together with the fact that $H$ is a (2,1) and (1,2) tensor on $M$, we conclude that $H$ vanishes. Thus Eq. (10) excludes torsion. A consequence of this is that the (4,0) locally supersymmetric models constructed in [13] have zero torsion.
² Apart from the sigma model anomalies, these models also have two-dimensional gravitational anomalies.

3. Examples
To construct examples of QKT geometry, we generalise the ansatz used in [14] to find HKT geometries from HK ones. As we have already mentioned in the introduction, any oriented four-dimensional manifold is QK. Let $M$ be such a manifold with metric $h$, connection $B$ and compatible almost complex structures $I_r$. The volume form, $\epsilon$, of $M$ can be written in terms of the almost complex structures as
$$\epsilon = \sum_{r=1}^{3}\omega_r\wedge\omega_r, \tag{12}$$
where
$$\omega_r(X, Y) = h(X, I_r Y) \tag{13}$$
are the Kähler-like forms of the almost complex structures $I_r$. We remark that $\epsilon$ is covariantly constant with respect to the Levi–Civita connection. We also mention for later use that³
$$\sum_{r=1}^{3}(\omega_r)_{\mu\nu}\,(\omega_r)_{\kappa\lambda} = \epsilon_{\mu\nu\kappa\lambda} + h_{\mu\kappa}h_{\nu\lambda} - h_{\nu\kappa}h_{\mu\lambda}. \tag{14}$$
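As a sanity check of the quaternion algebra in (4) and of the identity (14), one can test both on flat $\mathbb{R}^4$. The following is only a sketch under stated assumptions: the choice $I_r = -\eta^r$, with $\eta^r$ the self-dual 't Hooft symbols, the flat metric $h = \delta$ and the orientation $\epsilon_{1234} = +1$ are illustrative conventions that are not taken from the text, and indices are counted from zero.

```python
from itertools import product


def levi_civita(*idx):
    """Sign of the permutation idx of (0, ..., n-1); 0 if an index repeats."""
    if len(set(idx)) != len(idx):
        return 0
    inversions = sum(
        1
        for i in range(len(idx))
        for j in range(i + 1, len(idx))
        if idx[i] > idx[j]
    )
    return -1 if inversions % 2 else 1


def eta(a, mu, nu):
    """Self-dual 't Hooft symbol eta^a_{mu nu} (a = 0..2, mu, nu = 0..3)."""
    if mu < 3 and nu < 3:
        return levi_civita(a, mu, nu)
    if mu < 3 and nu == 3:
        return 1 if a == mu else 0
    if mu == 3 and nu < 3:
        return -1 if a == nu else 0
    return 0


# Assumed illustrative choice: I_r = -eta^r as 4x4 matrices. With the flat
# metric h = delta, omega_r(X, Y) = h(X, I_r Y) has the same components as I_r.
I = [[[-eta(r, m, n) for n in range(4)] for m in range(4)] for r in range(3)]


def quaternion_algebra_holds():
    """Check I_r I_s = -delta_rs + eps_rst I_t (first condition in (4))."""
    for r, s, m, n in product(range(3), range(3), range(4), range(4)):
        lhs = sum(I[r][m][k] * I[s][k][n] for k in range(4))
        rhs = -(r == s) * (m == n) + sum(
            levi_civita(r, s, t) * I[t][m][n] for t in range(3)
        )
        if lhs != rhs:
            return False
    return True


def identity_14_holds():
    """Check sum_r omega_r omega_r = eps + h h - h h, i.e. Eq. (14), for h = delta."""
    for m, n, k, l in product(range(4), repeat=4):
        lhs = sum(I[r][m][n] * I[r][k][l] for r in range(3))
        rhs = levi_civita(m, n, k, l) + (m == k) * (n == l) - (n == k) * (m == l)
        if lhs != rhs:
            return False
    return True


if __name__ == "__main__":
    print(quaternion_algebra_holds(), identity_14_holds())
```

Choosing the anti-self-dual symbols instead would flip the sign of the $\epsilon_{\mu\nu\kappa\lambda}$ term, so the check is tied to the orientation convention assumed above.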
To construct four-dimensional QKT manifolds, we use the ansatz
$$g = F h, \qquad H = \frac{1}{2}\star dF, \tag{15}$$
where $\star$ is the Hodge dual with respect to $\epsilon$. The manifold $M$ with metric $g$, torsion $H$, almost complex structures $I_r$ and connection $B$ is a weak QKT manifold. To show the covariant constancy condition of the almost complex structures in (4), we use Eq. (14) and the ansatz (15). The remaining conditions in (4) are straightforwardly satisfied. For $M$ to have a strong QKT structure, $H$ must be closed, which in turn implies that $F$ must be a harmonic function on $M$ with respect to the $h$ metric, i.e.
$$d\star dF = 0. \tag{16}$$
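The step from closure of $H$ to harmonicity of $F$ can be spelled out as follows. This is a sketch assuming the convention $d\star dF = (\Delta_h F)\,\epsilon$ for the Laplacian of $h$ (sign conventions vary); the warped-product form of $h$, the warp factor $f$ and the sphere dimension $n$ in the second step are illustrative notation that is not in the text.

```latex
% Closure of H in the ansatz (15):
dH = \tfrac{1}{2}\, d\star dF = \tfrac{1}{2}\,(\Delta_h F)\,\epsilon ,
\qquad
\Delta_h F = \frac{1}{\sqrt{\det h}}\,
             \partial_\mu\!\left(\sqrt{\det h}\; h^{\mu\nu}\,\partial_\nu F\right).

% Assumed special case: h = dv^2 + f(v)^2\, d\Omega_{(n)}^2 (plus possible flat
% directions) and F = F(v). Then \sqrt{\det h} \propto f^{\,n} and
\Delta_h F = f^{-n}\,\frac{d}{dv}\!\left(f^{\,n}\,\frac{dF}{dv}\right) = 0
\quad\Longrightarrow\quad
F(v) = \lambda_1 \int^{v} \frac{dv'}{f(v')^{\,n}} + \lambda_2 .
```

In this special case the harmonic functions depending on $v$ alone form a two-parameter family, which is the origin of the constants $\lambda_1$ and $\lambda_2$ appearing in the examples.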
We shall allow $F$ to have delta function singularities on $M$. There always exist non-trivial solutions of (16) on any four-dimensional manifold. So we conclude that there is a family of QKT manifolds associated to every four-dimensional QK manifold, labeled by the harmonic functions of the latter⁴.
³ Although the 4-form $\epsilon$ can be defined for QK manifolds of any dimension, this identity holds only for four-dimensional manifolds.
⁴ Note that the QKT manifold with metric $g$ is also QK with respect to the same metric, as a four-dimensional manifold, but with a different set of almost complex structures.
Due to the singularities of $F$, the associated QKT metric may be geodesically incomplete. This in fact is the case for some choices of harmonic function for the compact four-dimensional Wolf spaces $S^4$ and $\mathbb{CP}^2$. However there are examples of complete QKT geometries. Here we shall present four non-singular QKT manifolds starting from the QK manifolds $\mathbb{R}\times dS_{(3)}$, $dS_{(4)}$, the Tolman wormhole and a LeBrun-like metric, respectively, where $dS_{(n)}$ is the n-dimensional de Sitter space. The metric $h$ on $\mathbb{R}\times dS_{(3)}$ is
$$ds^2 = du^2 + dv^2 + \cosh^2 v\; d\Omega^2_{(2)}, \tag{17}$$
where $-\infty < u, v < \infty$ and $d\Omega^2_{(2)}$ is the SO(3) invariant metric on $S^2$. Supposing that the harmonic function, $F$, depends only on $v$, we get
$$F = \lambda_1\tanh(v) + \lambda_2, \tag{18}$$
where $\lambda_1$ and $\lambda_2$ are real numbers. It is straightforward to compute the metric and the torsion of the QKT manifold to find that
$$ds^2_F = \left(\lambda_1\tanh(v) + \lambda_2\right)\left(du^2 + dv^2 + \cosh^2 v\; d\Omega^2_{(2)}\right), \qquad H = \lambda_1\sin\theta\; du\wedge d\theta\wedge d\phi, \tag{19}$$
where $\theta, \phi$ are the angular coordinates on $S^2$. This QKT metric is geodesically complete if we choose $\lambda_2 > |\lambda_1|$. The metric $h$ on $dS_{(4)}$ is
$$ds^2 = dv^2 + \cosh^2 v\; d\Omega^2_{(3)}, \tag{20}$$
where $-\infty < v < \infty$ and $d\Omega^2_{(3)}$ is the SO(4) invariant metric on $S^3$. Supposing that the harmonic function, $F$, depends only on $v$, we get
$$F = \lambda_1\left(\frac{\sinh(v)}{\cosh^2(v)} + \arctan(\sinh(v))\right) + \lambda_2, \tag{21}$$
where $\lambda_1$ and $\lambda_2$ are real numbers. It is straightforward to compute the metric and the torsion of the QKT manifold to find that
$$ds^2_F = \left[\lambda_1\left(\frac{\sinh(v)}{\cosh^2(v)} + \arctan(\sinh(v))\right) + \lambda_2\right]\left(dv^2 + \cosh^2 v\; d\Omega^2_{(3)}\right), \qquad H = \frac{\lambda_1}{2}\sin^2\theta\,\sin\phi\; d\theta\wedge d\phi\wedge d\psi, \tag{22}$$
where $\theta, \phi, \psi$ are the angular coordinates on $S^3$. This QKT metric is geodesically complete if we choose $\lambda_2 > \frac{\pi}{2}\lambda_1 > 0$. The metric of the Tolman wormhole is
$$ds^2 = dv^2 + (a^2 + v^2)\, d\Omega^2_{(3)}, \tag{23}$$
where $-\infty < v < \infty$, $a$ is a real non-zero constant, and $d\Omega^2_{(3)}$ is the SO(4) invariant metric on $S^3$. This metric is the analytic continuation of the FRW model of a universe filled with a perfect fluid with pressure equal to 1/3 of its density. Using the Einstein equations, we find that the Tolman wormhole metric has zero scalar curvature. In addition, it is conformally flat,
$$ds^2 = \left(1 + \frac{a^2}{4r^2}\right)^2\left(dr^2 + r^2\, d\Omega^2_{(3)}\right), \tag{24}$$
as can be easily seen using the coordinate transformation
$$v = r - \frac{a^2}{4r}. \tag{25}$$
Supposing that the harmonic function, $F$, depends only on $v$, we get
$$F = \lambda_1\,\frac{v}{a^2\sqrt{a^2 + v^2}} + \lambda_2, \tag{26}$$
where $\lambda_1$ and $\lambda_2$ are real numbers. It is straightforward to compute the metric and the torsion of the QKT manifold to find that
$$ds^2_F = \left(\lambda_1\,\frac{v}{a^2\sqrt{a^2 + v^2}} + \lambda_2\right)\left(dv^2 + (a^2 + v^2)\, d\Omega^2_{(3)}\right), \qquad H = \frac{\lambda_1}{2}\sin^2\theta\,\sin\phi\; d\theta\wedge d\phi\wedge d\psi, \tag{27}$$
where $\theta, \phi, \psi$ are the angular coordinates on $S^3$. This QKT metric is geodesically complete if we choose $\lambda_2 > (1/a^2)|\lambda_1|$.
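The harmonicity of the three functions (18), (21) and (26) can be checked numerically. The sketch below assumes that, for a metric of the form $dv^2 + f(v)^2 d\Omega^2_{(n)}$ (plus flat directions) and $F = F(v)$, Eq. (16) reduces to the statement that the "flux" $f^n\,dF/dv$ is constant. The helper names `flux` and `flux_spread`, the sample points, the finite-difference step and the value chosen for the Tolman parameter $a$ are all illustrative; the check sets $\lambda_1 = 1$, $\lambda_2 = 0$.

```python
import math


def flux(F, f, n, v, step=1e-5):
    """Return f(v)**n * F'(v), with F' from a central finite difference."""
    dF = (F(v + step) - F(v - step)) / (2.0 * step)
    return f(v) ** n * dF


A = 1.5  # illustrative value of the Tolman parameter a

# name -> (warp factor f, sphere dimension n, F with lambda_1 = 1, lambda_2 = 0)
EXAMPLES = {
    "R x dS(3)": (math.cosh, 2, math.tanh),
    "dS(4)": (
        math.cosh,
        3,
        lambda v: math.sinh(v) / math.cosh(v) ** 2 + math.atan(math.sinh(v)),
    ),
    "Tolman": (
        lambda v: math.sqrt(A * A + v * v),
        3,
        lambda v: v / (A * A * math.sqrt(A * A + v * v)),
    ),
}


def flux_spread(name, points=(-2.0, -0.5, 0.0, 0.7, 1.9)):
    """Max minus min of the flux over the sample points; near 0 if F is harmonic."""
    f, n, F = EXAMPLES[name]
    values = [flux(F, f, n, v) for v in points]
    return max(values) - min(values)


if __name__ == "__main__":
    for name in EXAMPLES:
        print(name, flux_spread(name))
```

The check concerns only Eq. (16); it does not test the normalisation of $H$ in (19), (22) and (27), which depends on orientation and volume-form conventions.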
All the examples of QKT geometries presented so far are conformally flat and therefore their Weyl tensor vanishes. For reasons that will become apparent in the twistor construction of four-dimensional QKT manifolds, we give an example of a QKT geometry with non-vanishing but self-dual Weyl tensor. To do this we begin with the four-dimensional metric
$$ds^2 = V^{-1}(d\tau + \omega)^2 + V\, ds^2_{(3)}, \tag{28}$$
where
$$ds^2_{(3)} = \frac{1}{q^2}\left(dx^2 + dy^2 + dq^2\right) \tag{29}$$
is the hyperbolic 3-metric and
$$d\omega = \star dV \tag{30}$$
with the Hodge duality operation taken with respect to the metric $ds^2_{(3)}$. Equation (30) is just the magnetic monopole equation in a hyperbolic background. The function $V$ is harmonic with respect to the hyperbolic 3-metric. Solving (30) for one monopole we get
$$V = 1 + \frac{1}{2}(\coth\rho - 1), \qquad \omega_x = -\frac{1}{2}\,\frac{y}{x^2 + y^2}\coth\rho, \qquad \omega_y = \frac{1}{2}\,\frac{x}{x^2 + y^2}\coth\rho, \qquad \omega_q = 0, \tag{31}$$
where
$$\coth\rho = \frac{x^2 + y^2 + q^2 + q_0^2}{\sqrt{(x^2 + y^2 + q^2 + q_0^2)^2 - 4q^2 q_0^2}}. \tag{32}$$
To construct the associated QKT geometry, let us suppose that the harmonic function $F$ depends only on the coordinate $q$. Then we find that
$$F = q^2. \tag{33}$$
Therefore the associated QKT geometry is
$$ds^2_F = q^2\left(V^{-1}(d\tau + \omega)^2 + V\, ds^2_{(3)}\right), \qquad H = d\tau\wedge dx\wedge dy. \tag{34}$$
The metric $ds^2_F$ is the LeBrun metric, which has been shown to be complete in [15]. It is also known to have a non-vanishing but self-dual Weyl tensor.
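That $F = q^2$ solves Eq. (16) for the metric (28), whatever the function $V$, can be seen from a short computation. This is a sketch; the integration constants $c_1$, $c_2$ are introduced here for illustration, and the sign of $H$ depends on the orientation convention.

```latex
% h = V^{-1}(d\tau+\omega)^2 + V q^{-2}(dx^2+dy^2+dq^2), with \omega having no dq component:
\sqrt{\det h} = V\,q^{-3}, \qquad h^{qq} = V^{-1} q^{2}, \qquad h^{\tau q} = h^{xq} = h^{yq} = 0 .

% For F = F(q):
\Delta_h F = \frac{1}{\sqrt{\det h}}\;
             \partial_q\!\left(\sqrt{\det h}\; h^{qq}\,\partial_q F\right)
           = \frac{q^{3}}{V}\;\partial_q\!\left(q^{-1}\,\partial_q F\right),

\Delta_h F = 0
\;\Longleftrightarrow\; \partial_q F \propto q
\;\Longleftrightarrow\; F = c_1\,q^{2} + c_2 .

% With F = q^2 and \star dq = \pm\, q^{-1}\, d\tau\wedge dx\wedge dy:
H = \tfrac{1}{2}\star dF = q \star dq = \pm\, d\tau\wedge dx\wedge dy .
```

The factors of $V$ cancel in $\sqrt{\det h}\,h^{qq}$, which is why the result does not depend on the monopole solution (31).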
4. Twistor Spaces
Let $M$ be a 4k-dimensional weak QKT manifold. Since the connection $\Gamma$ of $M$ has holonomy Sp(k) · Sp(1), the tangent bundle is associated to a principal Sp(k) · Sp(1) bundle. In particular this implies that the complexified tangent bundle splits as $T_cM = T_{2k}\otimes T_2$, with the first subbundle associated with Sp(k) and the second associated with Sp(1)⁵. Next we introduce a frame $e^{ai}$ and write the metric as
$$ds^2 = e^{bj}\otimes e^{ai}\,\eta_{ab}\,\epsilon_{ij}, \tag{35}$$
where $\eta$ is the invariant Sp(k) symplectic form ($a, b = 1, \ldots, 2k$) and $\epsilon$ is the invariant Sp(1) symplectic form ($i, j = 1, 2$). The reality condition for a vector $X$ in this frame is
$$\bar X_{ai} = X^{bj}\,\eta_{ba}\,\epsilon_{ji}, \tag{36}$$
which can be extended to tensors in a straightforward way. A basis for the almost complex structures in this frame is
$$(I_r)_{ai}{}^{bj} = -i\,\delta_a{}^b\,(\tau_r)_i{}^j, \tag{37}$$
where the $\tau_r$ are the Pauli matrices; the almost complex structures are real tensors. The connection-form $\Gamma$ can be written in this basis as
$$\Gamma_{ai}{}^{bj} = \delta_i{}^j A_a{}^b + \delta_a{}^b B_i{}^j, \tag{38}$$
where $A_a{}^b$ is the Sp(k) connection and $B_i{}^j$ is the Sp(1) connection introduced in Eq. (5). Similarly the curvature can be decomposed as
$$R_{ai}{}^{bj} = \delta_i{}^j F_a{}^b + \delta_a{}^b G_i{}^j. \tag{39}$$
The twistor space, $Z$, can be defined either as the projective bundle of $T_2$ or as the quotient $U(1)\backslash P$ of the principal Sp(1) subbundle, $P$, of the principal Sp(k) × Sp(1) bundle. (We take the group of a principal bundle to act from the left.) We shall work mainly with $P$. Functions on twistor space are U(1) invariant functions on $P$, while U(1) equivariant functions on $P$ correspond to sections of U(1) line bundles over $Z$ associated to $P$ considered as a U(1) principal bundle over $Z$. This allows us to work with $P$ and then reduce our results to $Z$. For this we introduce “coordinates” $(x, u)$ on $P$, where $x$ are coordinates on the base space $M$ and $u \in SU(2)$. We write $u$ as $u_I{}^i$ (with inverse $u_i{}^I$), ($i = 1, 2$) and ($I = 1, 2$), with the local Sp(1) gauge transformations acting from the right, i.e. on the index $i$, and the rigid Sp(1) transformations acting from the left, i.e. on the index $I$, as we have already mentioned⁶. Since the structure group of $P$ as a principal bundle over $Z$ is U(1), it will be appropriate to split up the capital $I$ indices into two indices (1,2) indicating the U(1) charges.
⁵ In principle $T_2$ and $T_{2k}$ are only locally defined, but we shall assume that they exist globally for simplicity; this is similar to demanding the existence of a spin structure and means that one can define a principal Sp(k) × Sp(1) bundle. See [5] for a discussion.
⁶ The equivariant formalism used here has been called “harmonic space” formalism elsewhere; it was applied to QK geometry in [16].
The right-invariant one-forms on the fibre (of $P \to M$) in these coordinates are
$$e_I{}^J = du_I{}^i\, u_i{}^J \tag{40}$$
with
$$e_I{}^I = 0 \tag{41}$$
as a consequence of the fact that $\det u = 1$. The dual right-invariant vector fields $D_I{}^J$ satisfy
$$D_I{}^J u_K{}^i = \delta_K{}^J u_I{}^i - \frac{1}{2}\,\delta_I{}^J u_K{}^i \tag{42}$$
and the algebra of vector fields is
$$[D_I{}^J, D_K{}^L] = \delta_K{}^J D_I{}^L - \delta_I{}^L D_K{}^J, \tag{43}$$
which is isomorphic to the LSp(1) Lie algebra. To see this, we note that $D_I{}^I = 0$ and set
$$D_0 = D_1{}^1 - D_2{}^2. \tag{44}$$
It is then easy to verify that $\{D_0, D_1{}^2, D_2{}^1\}$ satisfy the familiar Lie algebra commutator relations of SU(2). We shall take the vector field $D_0$ to be tangent to the orbits of the U(1) subgroup of SU(2) acting on $P$ from the left which we have used to define the twistor space $Z = U(1)\backslash P$. We also note that
$$D_0 u_1{}^i = u_1{}^i, \qquad D_0 u_2{}^i = -u_2{}^i. \tag{45}$$
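The relations (42) to (45) can be verified mechanically by encoding the action (42) of each $D_I{}^J$ on the coordinates $u_K{}^i$ as a 2×2 matrix acting on the index $K$. This is a sketch with illustrative helper names (`action`, `commutator`, `algebra_43_holds`); indices are counted from zero, so index 0 stands for $I = 1$ and index 1 for $I = 2$, and exact rational arithmetic is used for the factor 1/2.

```python
from fractions import Fraction
from itertools import product


def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0


def action(I, J):
    """2x2 matrix M with D_I^J u_K = sum_N M[K][N] u_N, read off from (42)."""
    return [
        [
            delta(K, J) * delta(I, N) - Fraction(1, 2) * delta(I, J) * delta(K, N)
            for N in range(2)
        ]
        for K in range(2)
    ]


def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def combine(A, B, a=1, b=1):
    """Return a*A + b*B entrywise."""
    return [[a * A[i][j] + b * B[i][j] for j in range(2)] for i in range(2)]


def commutator(I, J, K, L):
    """Matrix of [D_I^J, D_K^L] acting on u.

    The D's are derivations, so D_I^J applied to (B u) gives B (A u):
    composing the fields multiplies their matrices in the opposite order.
    """
    A, B = action(I, J), action(K, L)
    return combine(matmul(B, A), matmul(A, B), 1, -1)


def algebra_43_holds():
    """Check (43) on the coordinates u for all index values."""
    for I, J, K, L in product(range(2), repeat=4):
        rhs = combine(action(I, L), action(K, J), delta(K, J), -delta(I, L))
        if commutator(I, J, K, L) != rhs:
            return False
    return True


D0 = combine(action(0, 0), action(1, 1), 1, -1)  # D_0 = D_1^1 - D_2^2, Eq. (44)
TRACE = combine(action(0, 0), action(1, 1))      # D_I^I, expected to vanish

if __name__ == "__main__":
    print(algebra_43_holds(), D0, TRACE)
```

The matrix `D0` encodes Eq. (45): it is diagonal, with the U(1) charges of $u_1{}^i$ and $u_2{}^i$ on the diagonal.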
In the following we shall use the properties of the torsion and the curvature of the Sp(1) connection
$$T_{a1b1c1} \equiv u_1{}^i u_1{}^j u_1{}^k H_{aibjck} = 0, \qquad G_{aibj,k\ell} = \left(\epsilon_{ik}\epsilon_{j\ell} + \epsilon_{i\ell}\epsilon_{jk}\right) G_{ab}, \tag{46}$$
respectively. The latter condition holds provided that $k \geq 2$ and that $dH$ is (2,2) with respect to all almost complex structures. An outline of the proof of the above properties is given in the appendix. Now we can state the properties of the twistor space $Z$ associated with a QKT manifold $M$:
(i) $Z$ is a complex manifold provided that $k \geq 2$.
(ii) $Z$ has a real structure.
(iii) $Z$ admits a complex contact structure provided that $k \geq 2$, $dH$ is (2,2) with respect to all almost complex structures and $\det(G_{ab}) \neq 0$.
(iv) $Z$ is a Kähler manifold provided that $(-\epsilon_{ij} G_{ab})$ is positive definite, $k \geq 2$ and $dH$ is (2,2) with respect to all almost complex structures as in the previous property.
The real structure is induced on $Z$ from the antipodal map on each two-sphere fibre of $Z$ over $M$ in exactly the same way as in hyper-Kähler and quaternionic Kähler geometry, so we refer the reader to the literature for discussions of this point [5, 17]. To prove (i) we introduce the horizontal lift basis on $P$:
$$\tilde E_I{}^J = D_I{}^J, \qquad \tilde E_{aI} = \hat e_{aI} - B_{aI,J}{}^K D_K{}^J, \tag{47}$$
with dual basis given by
$$E_I{}^J = e_I{}^J + e^{aK} B_{aK,I}{}^J, \qquad E^{aI} = e^{ai} u_i{}^I, \tag{48}$$
where we convert $i, j, k$ indices to $I, J, K$ indices using $u_I{}^i$ or $u_i{}^I$ as appropriate and where $\hat e_{ai}$ are the basis vector fields on $M$ dual to $e^{ai}$. We then find that
P. S. Howe, A. Opfermann, G. Papadopoulos
$$ dE_I{}^J = -E_I{}^K \wedge E_K{}^J + G_I{}^J, \qquad dE^{aI} = -E^{aJ} \wedge E_J{}^I + T^{aI} - E^{bI} \wedge A_b{}^a, \tag{49} $$
where $G_I{}^J$ is the $Sp(1)$ curvature and $T^{aI}$ is the torsion in the $\{E_I{}^J, E^{aI}\}$ frame. We claim that the set of vector fields $\{\tilde E_{a1}, \tilde D_1{}^2\}$ spans an integrable distribution up to a $U(1)$ translation and therefore defines a complex structure in $Z$. To show this, we write the second equation in (49) in the dual form
$$ [\tilde E_{aI}, \tilde E_{bJ}] = -T_{aIbJ}{}^{cK} \tilde E_{cK} + A_{aI,b}{}^c \tilde E_{cJ} - A_{bJ,a}{}^c \tilde E_{cI} - G_{aIbJ,K}{}^L \tilde D_L{}^K. \tag{50} $$
Setting $I = J = 1$ we find that the commutator $[\tilde E_{a1}, \tilde E_{b1}]$ closes on terms linear in $\{\tilde E_{a1}, D_1{}^2\}$ and $D_0$ provided that
$$ T_{a1b1}{}^{c2} = 0, \qquad G_{a1b1,1}{}^2 = 0. \tag{51} $$
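To make the origin of the two conditions in (51) explicit, the sketch below expands (50) at $I = J = 1$. It writes $D_L{}^K$ for the vertical vector fields appearing in the last term of (50) (identified with $\tilde E_L{}^K = D_L{}^K$ as in (47)) and uses only $D_I{}^I = 0$ and the definition (44), so that $D_1{}^1 = -D_2{}^2 = \tfrac12 D_0$; the index placement follows our reading of (50) and should be treated as an assumption.

```latex
\begin{align*}
[\tilde E_{a1}, \tilde E_{b1}]
 &= -T_{a1b1}{}^{c1}\,\tilde E_{c1} - T_{a1b1}{}^{c2}\,\tilde E_{c2}
    + \big(A_{a1,b}{}^{c} - A_{b1,a}{}^{c}\big)\,\tilde E_{c1} \\
 &\quad - G_{a1b1,1}{}^{2}\, D_2{}^1 - G_{a1b1,2}{}^{1}\, D_1{}^2
    - \tfrac12\big(G_{a1b1,1}{}^{1} - G_{a1b1,2}{}^{2}\big)\, D_0 .
\end{align*}
% The only terms outside span{ \tilde E_{c1}, D_1{}^2, D_0 } are those
% proportional to \tilde E_{c2} and D_2{}^1; their coefficients are
% T_{a1b1}{}^{c2} and G_{a1b1,1}{}^{2}, which is condition (51).
```

Read this way, (51) is simply the statement that the two "wrong-type" directions $\tilde E_{c2}$ and $D_2{}^1$ drop out of the commutator.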
Similarly, the commutator $[\tilde E_{a1}, D_1{}^2]$ closes on terms linear in $D_1{}^2$ and $D_0$. The first condition in (51) is equivalent to the first condition in (46). For $k \geq 2$, the second condition in (51) is a special case of the second condition in (46), and this special case holds for any weak QKT space, even if $dH$ is not (2,2) with respect to all almost complex structures (see the appendix). Therefore, for $k \geq 2$, the twistor space $Z$ is always a complex manifold. For $k = 1$, the second condition in (51) must be imposed in addition to the conditions required by (4,0) local supersymmetry on the geometry of $M$. In particular, for the examples that we have presented in Sect. 3, this condition always holds because the Weyl tensor is self-dual.

To show (iii), we first note that a complex contact structure is defined locally by a (1,0) form $\beta$ such that
$$ \beta \wedge (\partial\beta)^k \neq 0. \tag{52} $$
In our case we choose
$$ \beta = E_1{}^2. \tag{53} $$
Using the definition of $E_1{}^2$ and the second condition in (46), we find that
$$ dE_1{}^2 = -E^{b2} \wedge E^{a2}\, G_{ab} - (E_1{}^1 - E_2{}^2) \wedge E_1{}^2. \tag{54} $$
So
$$ \partial\beta = -E^{b2} \wedge E^{a2}\, G_{ab}, \tag{55} $$
and the condition (52) is satisfied provided that
$$ \det(G_{ab}) \neq 0. \tag{56} $$
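The role of $\det(G_{ab})$ in (56) can be seen from a short computation. The sketch below assumes that $G_{ab}$ is antisymmetric in $a, b$ (as it is in the QK case, where $G_{ab} = \lambda\,\eta_{ab}$) and uses the standard Pfaffian identity for the top power of a two-form; the shorthand $\theta^a$ is introduced here for illustration only, and the overall sign depends on orientation conventions.

```latex
% Shorthand (hypothetical notation): \theta^a := E^{a2}, a = 1, ..., 2k,
% with G_{ab} = -G_{ba} assumed.
\begin{align*}
\partial\beta &= -\theta^b \wedge \theta^a\, G_{ab} = G_{ab}\,\theta^a \wedge \theta^b, \\
(\partial\beta)^k &= 2^k\, k!\;\mathrm{Pf}(G)\;\theta^1 \wedge \cdots \wedge \theta^{2k},
\qquad \mathrm{Pf}(G)^2 = \det(G_{ab}), \\
\beta \wedge (\partial\beta)^k &= 2^k\, k!\;\mathrm{Pf}(G)\;
   E_1{}^2 \wedge \theta^1 \wedge \cdots \wedge \theta^{2k}.
\end{align*}
```

Since $E_1{}^2$ and the $E^{a2}$ are linearly independent one-forms, the right-hand side is non-zero exactly when $\mathrm{Pf}(G) \neq 0$, i.e. when $\det(G_{ab}) \neq 0$.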
As we have already mentioned, for $k \geq 2$ the second condition in (46) always holds provided that $dH$ is (2,2) with respect to all almost complex structures. For $k = 1$, the second condition in (46) must be imposed in addition to those required by (4,0) local supersymmetry on $M$. For the examples that we have presented in Sect. 3, this always holds since the Weyl tensor is self-dual. Note that for $k = 1$, $dH$ is always (2,2) with respect to all almost complex structures.

It remains to show (iv). Since we have already shown that $Z$ is complex, it is enough to find the appropriate Kähler form $\Omega$. The metric can then be constructed from the Kähler form and the complex structure. We choose as Kähler form
$$ \Omega = 2i\,\big(E_1{}^2 \wedge E_2{}^1 + E^{b2} \wedge E^{a1}\, G_{ab}\big). \tag{57} $$
Clearly, $\Omega$ is (1,1) with respect to the chosen complex structure, so it remains to show that it is closed. For this, using (46) we find that
$$ d\Omega = 2i\,\big(E^{c2} \wedge E^{b1} \wedge E^{a1}\, \nabla_{a1} G_{bc} + E^{c2} \wedge E^{b1} \wedge E^{a2}\, \nabla_{a2} G_{bc} + E^{c2} \wedge T^{b1}\, G_{bc} - T^{c2} \wedge E^{b1}\, G_{bc}\big). \tag{58} $$
Expanding $T$ using the $E$ basis, we find that the term involving $E^{c2} \wedge E^{b1} \wedge E^{a1}$ in the above equation is proportional to
$$ \nabla_{a1} G_{bc} + \tfrac{1}{2}\, T_{a1b1}{}^{d1}\, G_{dc} - T_{a1c2}{}^{d2}\, G_{db}, \tag{59} $$
antisymmetrised on $a$ and $b$, which vanishes because of the second Bianchi identity
$$ \nabla_{aI} G_{bJcK,LM} + T_{aIbJ}{}^{dN}\, G_{dNcK,LM} + \text{cyclic in } (aI, bJ, cK) = 0. \tag{60} $$
Similarly, the term proportional to $E^{c2} \wedge E^{b1} \wedge E^{a2}$ in (58) vanishes and therefore $\Omega$ is closed. The metric is non-degenerate and positive definite provided that $(-\epsilon_{ij} G_{ab})$ is non-degenerate and positive definite.

5. HK Structures from QKT Manifolds

As in the previous section, let $M$ be a $4k$-dimensional weak QKT manifold. As we have already mentioned, the tangent bundle can be written as $TM = T_{2k} \otimes T_2$. The main task of this section is to show that $\hat T_2$, which is defined to be $T_2$ with the zero section removed, is an HK manifold provided that $k \geq 2$, $dH$ is (2,2) with respect to all three almost complex structures and $(-\epsilon_{ij} G_{ab})$ is non-degenerate and positive definite. Introducing complex coordinates $\{y^i;\ i = 1, 2\}$ along the fibres of $\hat T_2$, we define a set of $2k + 2$ complex one-forms as follows:
$$ E^i = dy^i + y^j B_j{}^i, \qquad E^a = e^{ai} y_i, \tag{61} $$
where $y_i = y^j \epsilon_{ji}$. We claim that this set of forms defines a complex structure on $\hat T_2$, i.e., that it defines a basis set of (1,0) forms. To show this we use the differential form version of Frobenius’ theorem which states, in the current context, that the exterior derivative of any (1,0) form should be a sum of two-forms each one of which has a (1,0) factor. Differentiating (61) we find
$$ dE^a = -E^b \wedge A_b{}^a + e^{ai} \wedge E_i + T^{ai} y_i, \qquad dE^i = -E^j \wedge B_j{}^i + y^j G_j{}^i. \tag{62} $$
Since $H$ is (2,1) and (1,2) with respect to all almost complex structures, we can write
$$ T_{aibjck} = H_{aibjck} = \epsilon_{ij} H_{ab,ck} + \epsilon_{ki} H_{ca,bj} + \epsilon_{jk} H_{bc,ai}, \tag{63} $$
where $H_{ab,ck} = H_{ba,ck}$ and $H_{(ab,c)k} = 0$, so that
$$ T^{ai} y_i = 2\, e^{cj} \wedge E^b\, \big(H_{bc,}{}^{a}{}_{j} - H^{a}{}_{b,cj}\big). \tag{64} $$
Then, using the expression in (46) for the $Sp(1)$ curvature $G$, we find
$$ y^j G_j{}^i = -e^{bi} \wedge E^a\, G_{ab}. \tag{65} $$
Hence the right-hand sides of both of the Eqs. (62) have the required structure for Frobenius’ theorem to hold. We choose the first complex structure to be diagonal with respect to this integrable distribution, i.e. $(IE)^i = iE^i$ and $(IE)^a = iE^a$.

To find the metric and the rest of the hyper-Kähler structure, it is enough to determine two of the three Kähler forms, $\{\Omega_r;\ r = 1, 2, 3\}$. As we are working in a basis in which one of the complex structures is diagonal, one of the Kähler forms, say $\Omega_1$, is a (1,1)-form with respect to the chosen complex structure while the other two are (2,0) plus (0,2) with respect to the same complex structure. The first Kähler form is
$$ \Omega_1 = 2i\,\big(\bar E_i \wedge E^i - \bar E_b \wedge E^a\, G_a{}^b\big), \tag{66} $$
where the bars denote complex conjugation (see footnote 7). In particular, the frame $\{\bar E_a, \bar E_i\}$ is
$$ \bar E_i = d\bar y_i - B_i{}^j\, \bar y_j, \qquad \bar E_a = -e_{ai}\, \bar y^i. \tag{67} $$
The connection forms $\{B_i{}^j, A_a{}^b\}$ are skew-hermitian (e.g. $\overline{(B_i{}^j)} = \bar B^i{}_j = -B_j{}^i$) and the basis forms $e^{ai}$ satisfy the reality condition (36). It is clear that $\Omega_1$ is (1,1) with respect to the chosen complex structure, so it remains to show that it is closed. That this is so follows on using the second Bianchi identity for $G$, $DG_{ij} = 0$, where $D$ is the $Sp(1)$ covariant exterior derivative, and contracting it with $y^i \bar y^j$.

Next we choose the second almost complex structure $J$ to be
$$ J(E^a) = \bar E_b\, \eta^{ba}, \quad J(E^i) = \bar E_j\, \epsilon^{ji}, \quad J(\bar E_a) = E^b\, \eta_{ba}, \quad J(\bar E_i) = E^j\, \epsilon_{ji}. \tag{68} $$
The almost complex structure $J$ is integrable, as may easily be seen by observing that a basis of (1,0) forms for $J$ is $\{E^a + i\bar E^a,\ E^i + i\bar E^i\}$ and then by using Frobenius’ theorem. The $J$ complex structure anticommutes with the $I$ complex structure as required. The (2,0) part of the Kähler form of the $J$ complex structure is
$$ \Omega_0 = E^j \wedge E^i\, \epsilon_{ij} - E^b \wedge E^a\, G_{ab}. \tag{69} $$
The proof that this form is closed is similar to that for $\Omega_1$, with the difference that one must use the second Bianchi identity for $G$ contracted with $y^i y^j$. The Kähler forms $\Omega_2, \Omega_3$ are the real and imaginary parts of $\Omega_0$,
$$ \Omega_0 = \tfrac{1}{2}\,(\Omega_2 + i\,\Omega_3). $$
This shows that $\hat T_2$ is an HK manifold, since the third complex structure can be constructed from the first two and its integrability is also implied by the integrability of the first two. The metric is
$$ ds^2 = 2\,\bar E^i \otimes E_i - 2\,\bar E^b \otimes E^a\, G_{ab}. \tag{70} $$
It is hermitian with respect to all three complex structures and is non-degenerate and positive definite provided that $-\epsilon_{ij} G_{ab}$ is non-degenerate and positive definite.

7 Note that $Sp(1)$ and $Sp(k)$ indices are raised or lowered by complex conjugation as well as the corresponding symplectic invariant tensors.
6. Concluding Remarks

We have shown that the QKT geometry of manifolds that arise in the context of two-dimensional (4,0) locally supersymmetric sigma models is determined by the properties of a metric connection with torsion. This connection has holonomy $Sp(k) \cdot Sp(1)$ so that the corresponding geometry is a generalization of a QK geometry. QKT manifolds admit a twistor construction. The twistor space is a $(4k + 2)$-dimensional Kähler manifold with a complex contact structure. In addition, for some QKT manifolds there is a $(4k + 4)$-dimensional hyper-Kähler manifold which is obtained from a vector bundle over the QKT manifold with fibre $\mathbb{C}^2$ associated to the $Sp(1)$ principal bundle by omitting the zero section.

There are various limits in the twistor construction for QKT manifolds in which one or more tensors associated with this structure vanish. In the limit that the torsion vanishes, as we have already mentioned, the QKT structure degenerates to a QK one and one recovers the results of Salamon [5] and Swann [8] for QK manifolds. In another limit, where the torsion does not vanish but the holonomy becomes $Sp(k)$, the manifold becomes HKT, for which the twistor construction was given in [3]. Finally, if both the torsion vanishes and the holonomy is $Sp(k)$, then the manifold is HK, for which the twistor construction was given in [17].

In this paper we have not investigated the applications of the twistor construction in (4,0) supergravity coupled to a sigma model matter system. However, it is likely that the sigma model maps can be thought of as holomorphic maps from a harmonic extension of the (4,0) superspace to the twistor space of $M$, thus generalizing a similar property of (4,0) superfields for the models with rigid supersymmetry [3, 18]. It would also be of interest to find more examples of QKT manifolds in $4k$ dimensions for $k > 1$. For example, there might be locally symmetric spaces with a QKT structure in direct analogy to the Wolf spaces for QK manifolds [19] or to the group manifold examples for HKT manifolds [20]. New QKT manifolds may also be constructed starting from QK manifolds with an isometry that respects the QK structure and then performing a T-duality transformation along the Killing direction. By this means one might expect to develop relationships between QK and QKT manifolds similar to those that hold between HK and HKT manifolds [21, 22].

Acknowledgement. One of us, G.P., would like to thank R. Goto for helpful discussions. A.O. is supported by the EPSRC and the German National Foundation. G.P. is supported by a University Research Fellowship from the Royal Society.
Appendix

Here we shall show that
$$ T_{a1b1}{}^{c2} = 0, \qquad G_{aibj,k\ell} = (\epsilon_{ik}\epsilon_{j\ell} + \epsilon_{i\ell}\epsilon_{jk})\, G_{ab}. \tag{71} $$
The first condition is equivalent to
$$ T_{a1b1c1} \equiv u_1{}^i u_1{}^j u_1{}^k H_{aibjck} = 0. \tag{72} $$
Then the first condition in (71) follows by contracting the expression for $H$ in (63) with $u_1{}^i$ as in (72).
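The contraction just described can be written out in one line. The sketch below assumes that (63) is the decomposition of $H_{aibjck}$ into three terms, each carrying one factor of the antisymmetric $Sp(1)$ invariant tensor $\epsilon_{ij}$; no other input is used.

```latex
\begin{align*}
u_1{}^i u_1{}^j u_1{}^k\, H_{aibjck}
 &= \big(u_1{}^i u_1{}^j \epsilon_{ij}\big)\, u_1{}^k H_{ab,ck}
  + \big(u_1{}^k u_1{}^i \epsilon_{ki}\big)\, u_1{}^j H_{ca,bj}
  + \big(u_1{}^j u_1{}^k \epsilon_{jk}\big)\, u_1{}^i H_{bc,ai} \\
 &= 0,
 \qquad\text{since}\qquad u_1{}^i u_1{}^j \epsilon_{ij} = 0
 \ \text{ by antisymmetry of } \epsilon_{ij}.
\end{align*}
```

Each of the three terms contains a symmetric product of two copies of $u_1$ contracted with $\epsilon$, so the sum vanishes identically for every $u$.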
To show the second condition in (71), one uses the Bianchi identity
$$ R_{\mu[\nu\rho\sigma]} = \tfrac{1}{3}\,\nabla_\mu H_{\nu\rho\sigma} - 2P_{\mu\nu\rho\sigma}, \tag{73} $$
where $P_{\mu\nu\rho\sigma} = 3\,\partial_{[\mu} H_{\nu\rho\sigma]}$. We first write this Bianchi identity using the $ai$ coordinate description and then contract all four $Sp(1)$ indices $i, j, k, \ell$ with $u_1{}^i, u_1{}^j, u_1{}^k, u_1{}^\ell$. This gives
$$ R_{a1[b1c1d1]} = \tfrac{1}{3}\,\nabla_{a1} T_{b1c1d1} - 2P_{a1b1c1d1}. \tag{74} $$
From the first condition in (71), we find that the right-hand side of (74) vanishes. But
$$ R_{aibj,ckd\ell} = \epsilon_{k\ell}\, F_{aibj,cd} + \eta_{cd}\, G_{aibj,k\ell} \tag{75} $$
and so we find that
$$ R_{a1b1c1d1} = \eta_{cd}\, G_{a1b1,11} \equiv \eta_{cd}\, G'_{ab}, \tag{76} $$
where $F_a{}^b$ is the curvature of the $Sp(k)$ connection $A_a{}^b$ in (38). Substituting this in (74) we find that
$$ G'_{ab} = 0, \tag{77} $$
provided that $k \geq 2$. Since $G'_{ab} = 0$ for any $u$ and $G'_{ab} = u_1{}^i u_1{}^j u_1{}^k u_1{}^\ell\, G_{aibj,k\ell}$, this implies that
$$ G_{ab,(ijk\ell)} = 0. \tag{78} $$
We remark that this condition is enough to show that the twistor space is a complex manifold.

Next we contract the Bianchi identity (73) with $u_2{}^i, u_1{}^j, u_1{}^k, u_1{}^\ell$ and we get
$$ R_{a2[b1,c1d1]} = \tfrac{1}{3}\,\nabla_{a2} T_{b1c1d1} - 2P_{a2b1c1d1}. \tag{79} $$
The right-hand side of the above equation vanishes provided that $P_{a2b1c1d1} = 0$, which is precisely the condition for $dH$ to be (2,2) with respect to all three almost complex structures. Using (75) for the curvature, we find that
$$ R_{a2b1,c1d1} = \eta_{cd}\, G_{a2b1,11}. \tag{80} $$
Substituting this back into (79), we find that both $G_{(ab)21,11}$ and $G_{[ab]21,11}$ vanish provided that $k \geq 2$. Since this is again the case for any $u$, the first condition implies that $G_{(ab)ij}$ vanishes and the second together with (78) imply the second condition in (71). We remark that for QK manifolds $G_{ab} = \lambda\, \eta_{ab}$, where $\lambda$ is a real constant.

References
1. Gates, S.J., Hull, C.M. and Roček, M.: Nucl. Phys. B248, 157 (1984)
2. Howe, P.S. and Papadopoulos, G.: Nucl. Phys. B289, 264 (1987); Class. Quantum Grav. 5, 1647 (1988); Nucl. Phys. B381, 360 (1992)
3. Howe, P.S. and Papadopoulos, G.: Phys. Lett. B379, 80 (1996)
4. Ishihara, S.: J. Diff. Geom. 9, 483 (1974)
5. Salamon, S.: Invent Math. 67, 143 (1982); Ann. Sci. ENS Supp. 19, (1986)
6. Nishino, H.: Phys. Lett. B355, 117 (1995)
7. de Wit, B. and van Nieuwenhuizen, P.: Nucl. Phys. B312, 58 (1989)
8. Swann, A.: C.R. Acad. Sci. Paris, t. 308, 225 (1989)
9. Howe, P.S. and Papadopoulos, G.: Commun. Math. Phys. 151, 467 (1993)
10. Hull, C.M. and Witten, E.: Phys. Lett. 160B, 398 (1985)
11. Sen, A.: Nucl. Phys. B278, 289 (1986)
12. Howe, P.S. and Papadopoulos, G.: Class. Quantum Grav. 4, 1749 (1987)
13. Bergshoeff, E. and Sezgin, E.: Mod. Phys. Lett. A 1, 191 (1986)
14. Callan, C.G., Harvey, J.A. and Strominger, A.: Nucl. Phys. B359, 611 (1991)
15. LeBrun, C.: J. Diff. Geom. 34, 223 (1991)
16. Galperin, A., Ivanov, E. and Ogievetsky, O.: Ann. Phys. 230, 201 (1994)
17. Hitchin, N.J., Karlhede, A., Lindström, U. and Roček, M.: Commun. Math. Phys. 108, 535 (1987)
18. Delduc, F., Kalitsin, S. and Sokatchev, E.: Class. Quantum Grav. 7, 1567 (1990)
19. Wolf, J.: J. Math. Mech. 14, 1033 (1965)
20. Spindel, Ph., Sevrin, A., Troost, W. and Van Proeyen, A.: Nucl. Phys. B308, 662 (1988); B311, 465 (1988)
21. Gibbons, G.W., Papadopoulos, G. and Stelle, K.S.: Nucl. Phys. B 508, 623 (1997)
22. Opfermann, A.: Phys. Lett. 416, 101 (1998)
Communicated by R. H. Dijkgraaf