q zq p
p6=q (1 − q
− q bii zπ(p) ).
(39)
Substituting (38) and (39) into (37) we get: Y
(1 − q
bii zq
zp
p6=q
Q1−aij r=1
P
Fj i ( zwr )
1−aij
Y
)
s=1
Q
p>q
P1−aij
(1 − q bij
zs )Pij = w
(40)
zq 1 bii zq zq (1 − q zp )Fii ( zp )
×
1 − aij × k qi Q l(π) (z bii π(q) − q zπ(p) )× p
k=0
(−1)k
(41)
Now Lemma 11 follows immediately from Theorem 9 with $t = q_i$, $m = a_{ij}$.

Proof of the theorem. The conditions of the theorem imply that, as a formal power series,
\[
P_{ij} = \prod_{p \neq q} \frac{1}{1 - q^{b_{ii}} \frac{z_q}{z_p}} \prod_{s=1}^{1-a_{ij}} \frac{1}{1 - q^{b_{ij}} \frac{z_s}{w}} \cdot \prod_{p \neq q} \Bigl(1 - q^{b_{ii}} \frac{z_q}{z_p}\Bigr) \prod_{s=1}^{1-a_{ij}} \Bigl(1 - q^{b_{ij}} \frac{z_s}{w}\Bigr) P_{ij}. \tag{42}
\]
From Lemma 11 it follows that
\[
\prod_{p \neq q} \Bigl(1 - q^{b_{ii}} \frac{z_q}{z_p}\Bigr) \prod_{s=1}^{1-a_{ij}} \Bigl(1 - q^{b_{ij}} \frac{z_s}{w}\Bigr) P_{ij} = 0.
\]
Therefore, $P_{ij} = 0$. This concludes the proof. □

One can formulate several versions of Theorem 10. For instance, the following is true.
16
A. Sevostyanov
Proposition 12. Let $F_{kl}(z)$, $k, l = 1, \dots, l$ be a solution of system (34). Suppose that for some $i$ and $j$ the product
\[
\prod_{p \neq q} \frac{1}{1 - q^{b_{ii}} \frac{z_q}{z_p}} \cdot \prod_{s=1}^{1-a_{ij}} \frac{1}{1 - q^{b_{ij}} \frac{z_s}{w}} \cdot P_{ij} \tag{43}
\]
is well-defined as a formal power series. Then $P_{ij} = 0$.

The proof of the proposition is similar to that of Theorem 10. □

Similar statements exist for $|q| > 1$.

Acknowledgements. The author would like to thank B. Enriquez, G. Felder and V. Tarasov for useful discussions. I am also grateful to A. Alekseev and to M. Golenishcheva-Kutuzova for careful reading of the text.
Communicated by G. Felder
Commun. Math. Phys. 204, 17 – 38 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Separation of Variables in the Elliptic Gaudin Model

Evgueni K. Sklyanin¹, Takashi Takebe²,*

¹ St. Petersburg Branch of the Steklov Mathematical Institute, Fontanka 27, St. Petersburg 191011, Russia. E-mail: [email protected]
² Department of Mathematical Sciences, University of Tokyo, Komaba 3-8-1, Meguro-ku, Tokyo, 153-8914, Japan. E-mail: [email protected]

Received: 21 August 1998 / Accepted: 12 January 1999
* Present address: Department of Mathematics, Faculty of Science, Ochanomizu University, Otsuka 2-1-1, Bunkyo-ku, Tokyo, 112-8610, Japan. E-mail: [email protected]

Abstract: For the elliptic Gaudin model (a degenerate case of the XYZ integrable spin chain) a separation of variables is constructed in the classical case. The corresponding separated coordinates are obtained as the poles of a suitably normalized Baker-Akhiezer function. The classical results are generalized to the quantum case where the kernel of the separating integral operator is constructed. The simplest one-degree-of-freedom case is studied in detail.

1. Introduction

The quantum elliptic (or XYZ) Gaudin model was introduced in [1], see also [2], as a limiting case of the integrable XYZ spin chain [3]. The commuting Hamiltonians $H_n$ of the model are expressed as quadratic combinations of $sl_2$ spin operators. Determining the spectrum of $H_n$ turned out to be a difficult problem, as it is for the original XYZ spin chain. Let us list the known facts related to this problem.

• A solution by means of the Algebraic Bethe Ansatz has been obtained only recently [4]. See also [5].
• As shown in [6], in the SU(2)-invariant, or XXX, or rational, case the spectrum and the eigenfunctions of the model can be found via an alternative method, Separation of Variables, see also the survey [7].
• In [8] the separation of variables in the rational Gaudin model [6] was interpreted as a geometric Langlands correspondence.
• In [9] a separation of variables was constructed for the elliptic Gaudin–Calogero model which is closely related to the XYZ Gaudin model, though the separation of variables for the former one is much simpler.
• The results of [8] and [9] are based on the interpretation of the corresponding Gaudin models as conformal field theoretical models (Wess–Zumino–Witten models). The corresponding interpretation of the XYZ Gaudin model was obtained in [10], but the conformal field theoretical model corresponding to the XYZ Gaudin model turned out to be so complicated that writing down the geometric Langlands correspondence for this system, following [8], is not easy.

The main task of the present paper is to present a construction of separated variables for the XYZ Gaudin model both in the classical and quantum cases.

The paper is organized as follows. After giving a detailed description of the XYZ Gaudin model in Sect. 2, we proceed, in Sect. 3, with the classical case and, following the general philosophy of [7], construct the separated coordinates as the poles of an appropriately normalized Baker-Akhiezer function. The corresponding eigenvalues of the Lax matrix are then shown to provide the canonically conjugated momenta. The whole construction is a simplified version of the one used in [11].

The quantum case is considered in Sect. 4. The separating classical canonical transformation is replaced by an integral operator $K$. We write down a system of differential equations for the kernel of $K$ and show that it is integrable. The resulting integral operator $K$ intertwines the original and the separated variables and provides, respectively, a Radon–Penrose transformation of the corresponding D-modules. The quantization constructed is a formal one, since we do not study the transformations of the functional spaces of quantum states, leaving it for a further study.

A detailed study of the spectral problem is given in the simplest case only: $N = 1$ (Sect. 5). We show that the corresponding separated equation is none other than the (generalized) Lamé equation.

Two appendices contain, respectively, a list of properties of elliptic functions, and the formulas describing a realization of finite-dimensional representations of $sl_2(\mathbb{C})$ on the elliptic curve which are used throughout the paper.

2. Description of the Model

Let us recall the definition of the XYZ Gaudin model, following [4]. The elementary Lax operator $L(u)$ of the model depending on a complex parameter $u$ (spectral parameter) is given by
\[
L(u) = \frac{1}{2} \sum_{a=1}^{3} w_a(u)\, S^a \otimes \sigma^a = \begin{pmatrix} A(u) & B(u) \\ C(u) & -A(u) \end{pmatrix}. \tag{2.1}
\]
Here $\sigma^a$ are the Pauli matrices,
\[
w_1(u) = \frac{\theta'_{11}\,\theta_{10}(u)}{\theta_{10}\,\theta_{11}(u)}, \quad
w_2(u) = \frac{\theta'_{11}\,\theta_{00}(u)}{\theta_{00}\,\theta_{11}(u)}, \quad
w_3(u) = \frac{\theta'_{11}\,\theta_{01}(u)}{\theta_{01}\,\theta_{11}(u)}, \tag{2.2}
\]
where $\theta_{\alpha\beta}(u) = \theta_{\alpha\beta}(u; \tau)$, $\theta_{\alpha\beta} = \theta_{\alpha\beta}(0)$, $\theta'_{11} = d/du\,(\theta_{11}(u))|_{u=0}$ (see Appendix A), and $S^a$ are generators of the Lie algebra $sl_2(\mathbb{C})$:
\[
[S^a, S^b] = iS^c.
\]
Hereafter $(a, b, c)$ denotes a cyclic permutation of $(1, 2, 3)$. Note that $A$, $B$, $C$ are holomorphic except at $u \in \mathbb{Z} + \tau\mathbb{Z}$, where these operators have poles of first order.

Introducing the notation $\overset{1}{L} := L \otimes \mathbf{1}_2$ and $\overset{2}{L} := \mathbf{1}_2 \otimes L$, where $\mathbf{1}_2$ is the unit operator in $\mathbb{C}^2$, one can establish the commutation relation
\[
[\overset{1}{L}(u), \overset{2}{L}(v)] = [r(u - v), \overset{1}{L}(u) + \overset{2}{L}(v)], \tag{2.3}
\]
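where $r(u)$ is a classical r matrix defined by
\[
r(u) = -\frac{1}{2} \sum_{a=1}^{3} w_a(u)\, \sigma^a \otimes \sigma^a. \tag{2.4}
\]
The r matrix behaves as $-\frac{1}{u}\bigl(P - \frac{1}{2}\bigr) + O(u)$ when $u \to 0$. Here $P$ is the permutation operator: $P(x \otimes y) = y \otimes x$. Explicitly, in the natural basis in $\mathbb{C}^2 \otimes \mathbb{C}^2$,
\[
r(u) = \begin{pmatrix} a(u) & 0 & 0 & d(u) \\ 0 & b(u) & c(u) & 0 \\ 0 & c(u) & b(u) & 0 \\ d(u) & 0 & 0 & a(u) \end{pmatrix}, \tag{2.5}
\]
where
\[
a(u) = -b(u) = -\frac{w_3(u)}{2}, \quad c(u) = -\frac{w_1(u) + w_2(u)}{2}, \quad d(u) = -\frac{w_1(u) - w_2(u)}{2}.
\]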
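The block structure (2.5) can be checked directly from the definition (2.4). The following is a minimal sketch in plain Python, assuming arbitrary numeric stand-ins for the values $w_a(u)$; the helper names `kron`, `r_matrix` and `r_from_abcd` are illustrative and do not come from the paper.

```python
# Pauli matrices sigma^1, sigma^2, sigma^3 as nested lists.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def kron(A, B):
    """Kronecker product of two 2x2 matrices, returned as a 4x4 nested list."""
    return [[A[i][j] * B[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def r_matrix(w):
    """r = -(1/2) * sum_a w_a sigma^a (x) sigma^a, as in (2.4), for numeric w."""
    r = [[0j] * 4 for _ in range(4)]
    for w_a, s in zip(w, SIGMA):
        ss = kron(s, s)
        for i in range(4):
            for j in range(4):
                r[i][j] += -0.5 * w_a * ss[i][j]
    return r


def r_from_abcd(w):
    """The same matrix written in the block form (2.5)."""
    w1, w2, w3 = w
    a = -w3 / 2
    b = -a
    c = -(w1 + w2) / 2
    d = -(w1 - w2) / 2
    return [[a, 0, 0, d], [0, b, c, 0], [0, c, b, 0], [d, 0, 0, a]]
```

Setting all three weights equal to 1 gives $-(P - \frac{1}{2})$, which is the matrix multiplying $1/u$ in the small-$u$ behaviour quoted above.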
Since $w_a(u)$ are quasiperiodic in $u$ because of (A.3):
\[
\begin{aligned}
w_1(u) &= w_1(u + 1) = -w_1(u + \tau), \\
w_2(u) &= -w_2(u + 1) = -w_2(u + \tau), \\
w_3(u) &= -w_3(u + 1) = w_3(u + \tau),
\end{aligned} \tag{2.6}
\]
the L operator (2.1) has the following quasiperiodicity:
\[
L(u + 1) = \sigma^1 L(u) \sigma^1, \qquad L(u + \tau) = \sigma^3 L(u) \sigma^3. \tag{2.7}
\]
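The quasiperiodicity (2.6) and the simple pole of $w_a$ at $u = 0$ can be probed numerically. The sketch below is only an illustration: Appendix A is not reproduced here, so the series convention used for $\theta_{\alpha\beta}$ is an assumption (the standard theta functions with characteristics), and the function names are hypothetical.

```python
import cmath
import math


def theta(a, b, u, tau, terms=40):
    """Truncated series for theta_{ab}(u; tau), a, b in {0, 1}.

    Assumed convention: sum over n of
    exp(i*pi*(n + a/2)^2 * tau + 2*i*pi*(n + a/2)*(u + b/2)).
    """
    total = 0j
    for n in range(-terms, terms + 1):
        m = n + a / 2
        total += cmath.exp(1j * math.pi * (m * m * tau + 2 * m * (u + b / 2)))
    return total


def theta11_prime0(tau, terms=40):
    """d/du theta_11(u; tau) at u = 0, differentiated term by term."""
    total = 0j
    for n in range(-terms, terms + 1):
        m = n + 0.5
        total += 2j * math.pi * m * cmath.exp(1j * math.pi * (m * m * tau + m))
    return total


# Characteristics (alpha, beta) of the theta function in the numerator of w_a, see (2.2).
CHARACTERISTICS = {1: (1, 0), 2: (0, 0), 3: (0, 1)}


def w(a, u, tau):
    """The weight w_a(u) of (2.2) under the assumed theta convention."""
    al, be = CHARACTERISTICS[a]
    return (theta11_prime0(tau) * theta(al, be, u, tau)
            / (theta(al, be, 0, tau) * theta(1, 1, u, tau)))
```

With this convention the signs picked up under $u \to u + 1$ and $u \to u + \tau$ agree with (2.6), and $u\,w_a(u) \to 1$ as $u \to 0$, which is the behaviour behind the first-order poles of $A$, $B$, $C$.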
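Let $\ell_n$ $(n = 1, \dots, N)$ be half integers. The total Hilbert space of the model is $V = \bigotimes_{n=1}^{N} V_n$, where $V_n \simeq V^{(\ell_n)}$ and $V^{(\ell)}$ is a spin $\ell$ representation space of $sl_2$:
\[
\rho^{\ell} : sl_2(\mathbb{C}) \to \mathrm{End}_{\mathbb{C}}(V^{(\ell)}), \qquad V^{(\ell)} \simeq \mathbb{C}^{2\ell+1}. \tag{2.8}
\]
The generating function of the integrals of motion is
\[
\hat\tau(u) = \frac{1}{2} \operatorname{Tr} T^2(u), \tag{2.9}
\]
where the matrix $T(u)$ is constructed as the sum of elementary Lax operators (2.1),
\[
T(u) = \sum_{n=1}^{N} L_n(u - z_n) = \begin{pmatrix} A(u) & B(u) \\ C(u) & -A(u) \end{pmatrix}. \tag{2.10}
\]
Here $z_n$ are mutually distinct complex parameters,
\[
L_n(u) = \frac{1}{2} \sum_{a=1}^{3} w_a(u)\, S_n^a \otimes \sigma^a, \tag{2.11}
\]
\[
S_n^a = \mathbf{1}_{V_1} \otimes \dots \otimes \mathbf{1}_{V_{n-1}} \otimes \rho^{\ell_n}(S^a) \otimes \mathbf{1}_{V_{n+1}} \otimes \dots \otimes \mathbf{1}_{V_N}. \tag{2.12}
\]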
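The embedding (2.12) can be made concrete for small chains. Here is a minimal sketch, assuming every site carries spin $\ell_n = 1/2$ so that $\rho^{1/2}(S^a) = \sigma^a/2$; the function names are illustrative only.

```python
IDENTITY2 = [[1, 0], [0, 1]]

# rho^{1/2}(S^a) = sigma^a / 2 for a = 1, 2, 3.
SPIN_HALF = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]


def kron(A, B):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(A), len(B)
    return [[A[i][j] * B[k][l] for j in range(n) for l in range(m)]
            for i in range(n) for k in range(m)]


def matmul(A, B):
    """Product of two square matrices of equal size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(A, B):
    """[A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]


def site_operator(a, n, N):
    """S_n^a of (2.12): sigma^a / 2 in the n-th factor (1-based), identity elsewhere."""
    op = [[1]]
    for site in range(1, N + 1):
        op = kron(op, SPIN_HALF[a - 1] if site == n else IDENTITY2)
    return op
```

Operators on the same site reproduce $[S^a, S^b] = iS^c$, while operators on different sites commute, which is what allows the sum (2.10) to inherit the relation (2.3).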
By virtue of the commutation relations (2.3) the operator $T(u)$ satisfies the same commutation relations
\[
[\overset{1}{T}(u), \overset{2}{T}(v)] = [r(u - v), \overset{1}{T}(u) + \overset{2}{T}(v)], \tag{2.13}
\]
which implies the commutativity of $\hat\tau(u)$:
\[
[\hat\tau(u), \hat\tau(v)] = 0. \tag{2.14}
\]
The operator $\hat\tau(u)$ is explicitly written down as follows:
\[
\hat\tau(u) = \sum_{n=1}^{N} \wp_{11}(u - z_n)\, \ell_n(\ell_n + 1) + \sum_{n=1}^{N} H_n\, \zeta_{11}(u - z_n) + H_0. \tag{2.15}
\]
Here $\wp_{11}$, $\zeta_{11}$ are normalized Weierstraß functions defined by (A.5) and
\[
H_n = \frac{1}{2} \sum_{m \neq n} \sum_{a=1}^{3} w_a(z_n - z_m)\, S_n^a S_m^a, \qquad
H_0 = \sum_{n,m=1}^{N} \sum_{a=1}^{3} Z_a(z_n - z_m)\, S_n^a S_m^a \tag{2.16}
\]
are integrals of motion, where
\[
Z_1(t) = \frac{\theta'_{11}\,\theta'_{10}(t)}{4\theta_{10}\,\theta_{11}(t)}, \quad
Z_2(t) = \frac{\theta'_{11}\,\theta'_{00}(t)}{4\theta_{00}\,\theta_{11}(t)}, \quad
Z_3(t) = \frac{\theta'_{11}\,\theta'_{01}(t)}{4\theta_{01}\,\theta_{11}(t)}. \tag{2.17}
\]
Note that the integrals of motion $H_n$ $(n = 0, \dots, N)$ appear as coefficients of the elliptic Knizhnik-Zamolodchikov equations in [10]. Our expression (2.16) for $H_0$ differs from that given in [4] because of different normalization of the $\wp$ and $\zeta$ functions.

The classical Gaudin model is obtained if we replace all the commutators with the Poisson brackets, e.g.
\[
\{\overset{1}{T}(u), \overset{2}{T}(v)\} = [r(u - v), \overset{1}{T}(u) + \overset{2}{T}(v)], \tag{2.18}
\]
instead of (2.13). The spin variables $S^a$ satisfy, respectively, the Poisson commutation relations $\{S^a, S^b\} = iS^c$ and are subject to the constraint $\sum_{a=1}^{3} (S^a)^2 = \ell^2$.

3. Classical Separation of Variables

According to the recipe in [7], the separated coordinates $x_n$ should be constructed as the poles of a suitably normalized Baker-Akhiezer function (eigenvector of the Lax matrix $T(u)$). The corresponding canonically conjugated variables should then appear as the corresponding eigenvalues of $T(x_n)$. Instead of choosing a normalization, we shall rather speak of a choice of a gauge transformation $M$ of $T(u)$. The separated coordinates $x_n$ will then be obtained as the zeros of the off-diagonal element $\tilde B(u)$ of the twisted matrix $\tilde T = M^{-1} T M$.

The classical XYZ Gaudin model is a degenerate case of the classical lattice Landau-Lifshits equation for which a separation of variables has been constructed in [11], see also a discussion in [7]. Here we use essentially the same gauge transformation $M(u)$ as in [11], and our calculations represent a revised and simplified version of those in [11].
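3.1. Gauge transformation. Let $M(u; \tilde u)$ be the following $2 \times 2$ matrix
\[
M(u; \tilde u) := \begin{pmatrix}
-\theta_{01}\bigl(\frac{u - \tilde u}{2}; \frac{\tau}{2}\bigr) & -\theta_{01}\bigl(\frac{u + \tilde u}{2}; \frac{\tau}{2}\bigr) \\
\theta_{00}\bigl(\frac{u - \tilde u}{2}; \frac{\tau}{2}\bigr) & \theta_{00}\bigl(\frac{u + \tilde u}{2}; \frac{\tau}{2}\bigr)
\end{pmatrix}, \tag{3.1}
\]
where $u$ and $\tilde u$ are (possibly dynamical) parameters. (This matrix appears also in the context of the algebraic Bethe Ansatz. See [12, 4].) A twisted L-operator $\tilde L(u, v; \tilde u)$ depending on a parameter $\tilde u$ is defined by
\[
\tilde L(u, v; \tilde u) = \begin{pmatrix} \tilde A(u, v; \tilde u) & \tilde B(u, v; \tilde u) \\ \tilde C(u, v; \tilde u) & -\tilde A(u, v; \tilde u) \end{pmatrix} := M^{-1}(u; \tilde u)\, L(u - v)\, M(u; \tilde u). \tag{3.2}
\]
Likewise we define the twisted Lax matrix by
\[
\tilde T(u; \tilde u) = \begin{pmatrix} \tilde A(u; \tilde u) & \tilde B(u; \tilde u) \\ \tilde C(u; \tilde u) & -\tilde A(u; \tilde u) \end{pmatrix} := M^{-1}(u; \tilde u)\, T(u)\, M(u; \tilde u). \tag{3.3}
\]
Note that $M(u; \tilde u)$ has the following quasiperiodicity because of (A.3):
\[
M(u + 1; \tilde u) = -\sigma_1 M(u; \tilde u), \qquad
M(u + \tau; \tilde u) = e^{-\pi i (u + \tau/2)}\, \sigma_3 M(u; \tilde u) \exp(\pi i \tilde u \sigma_3). \tag{3.4}
\]
These formulae together with (2.7) imply that the function $\tilde B(u; \tilde u)$ has the following quasiperiodicity properties:
\[
\tilde B(u + 1; \tilde u) = \tilde B(u; \tilde u), \qquad
\tilde B(u + \tau; \tilde u) = e^{-2\pi i \tilde u}\, \tilde B(u; \tilde u). \tag{3.5}
\]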
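The quasiperiodicity (3.5) lends itself to a numerical spot check in the classical one-site case. The sketch below is an illustration under explicit assumptions: the theta series convention is the standard one with characteristics (Appendix A is not available here), the spin components are plain numbers, and all function names are hypothetical.

```python
import cmath
import math


def theta(a, b, u, tau, terms=40):
    """Truncated series for theta_{ab}(u; tau) in the assumed standard convention."""
    total = 0j
    for n in range(-terms, terms + 1):
        m = n + a / 2
        total += cmath.exp(1j * math.pi * (m * m * tau + 2 * m * (u + b / 2)))
    return total


def theta11_prime0(tau, terms=40):
    """d/du theta_11(u; tau) at u = 0."""
    total = 0j
    for n in range(-terms, terms + 1):
        m = n + 0.5
        total += 2j * math.pi * m * cmath.exp(1j * math.pi * (m * m * tau + m))
    return total


def lax(u, spin, tau):
    """Classical elementary Lax matrix (2.1) for numeric spin = (S1, S2, S3)."""
    dth = theta11_prime0(tau)
    th11 = theta(1, 1, u, tau)
    w1 = dth * theta(1, 0, u, tau) / (theta(1, 0, 0, tau) * th11)
    w2 = dth * theta(0, 0, u, tau) / (theta(0, 0, 0, tau) * th11)
    w3 = dth * theta(0, 1, u, tau) / (theta(0, 1, 0, tau) * th11)
    s1, s2, s3 = spin
    a = 0.5 * w3 * s3
    return [[a, 0.5 * (w1 * s1 - 1j * w2 * s2)],
            [0.5 * (w1 * s1 + 1j * w2 * s2), -a]]


def gauge(u, ut, tau):
    """Gauge matrix M(u; ut) of (3.1), built from theta functions of modulus tau/2."""
    half = tau / 2
    return [[-theta(0, 1, (u - ut) / 2, half), -theta(0, 1, (u + ut) / 2, half)],
            [theta(0, 0, (u - ut) / 2, half), theta(0, 0, (u + ut) / 2, half)]]


def twisted_b(u, ut, z, spin, tau):
    """(1,2) entry of M^{-1}(u; ut) L(u - z) M(u; ut), i.e. B~ for N = 1."""
    (m11, m12), (m21, m22) = gauge(u, ut, tau)
    (l11, l12), (l21, l22) = lax(u - z, spin, tau)
    det = m11 * m22 - m12 * m21
    lm12 = l11 * m12 + l12 * m22  # (L M)_{12}
    lm22 = l21 * m12 + l22 * m22  # (L M)_{22}
    # First row of M^{-1} is (m22, -m12) / det.
    return (m22 * lm12 - m12 * lm22) / det
```

The multiplier $e^{-2\pi i \tilde u}$ is insensitive to an overall sign in the transformation law of $M$, so the check does not depend on that detail of the theta convention.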
Hence by a standard argument in the theory of elliptic functions (see [13]), we have
\[
\deg(\operatorname{div}(\tilde B(u))) = 0, \qquad
\tilde u + \sum_{y} \bigl(\operatorname{mult}_y \operatorname{div}(\tilde B(u))\bigr)\, y \in \mathbb{Z} + \tau\mathbb{Z}, \tag{3.6}
\]
where $\operatorname{mult}_y \operatorname{div}(\tilde B(u))$ is the multiplicity of a divisor $[y]$ in the divisor $\operatorname{div}(\tilde B(u))$. By the definition (3.3), the operator $\tilde B(u; \tilde u)$ is holomorphic except at poles of $A(u)$, $B(u)$, $C(u)$, i.e., $u = z_n$ $(n = 1, \dots, N)$, and at zeros of $\det M(u; \tilde u)$, i.e., $u = 0$ modulo $\mathbb{Z} + \tau\mathbb{Z}$:
\[
\operatorname{div}(\tilde B(u)) \geq -\Bigl( \sum_{n=1}^{N} [z_n] + [0] \Bigr) \pmod{\mathbb{Z} + \tau\mathbb{Z}}. \tag{3.7}
\]
Thus (3.7) and (3.6) imply that there are $(N + 1)$ points $x_0, \dots, x_N$ such that
\[
\operatorname{div}(\tilde B(u)) \equiv \sum_{j=0}^{N} [x_j] - \Bigl( \sum_{n=1}^{N} [z_n] + [0] \Bigr) \pmod{\mathbb{Z} + \tau\mathbb{Z}}, \tag{3.8}
\]
and
\[
\sum_{j=0}^{N} x_j \equiv \sum_{n=1}^{N} z_n - \tilde u \pmod{\mathbb{Z} + \tau\mathbb{Z}}. \tag{3.9}
\]
Let us fix the parameter $\tilde u$ by the condition that one of $x_j$, for example $x_0$, is a constant $\xi$. Note that $\tilde u$ then becomes a dynamical variable. Thus we have
\[
\tilde B(x_j; \tilde u) = \tilde B(u = \xi; \tilde u) = 0. \tag{3.10}
\]
Dynamical variables $x_1, \dots, x_N$ are (classically) separated coordinates of the system as we will see below.
3.2. Poisson commutation relations and classical separation of variables. The main purpose of this subsection is to prove the following commutation relations.

Theorem 3.1. Generically the dynamical variables $x_j$ and $-\tilde A(x_j)$ have the canonical Poisson brackets:
(i) $\{x_i, x_j\} = 0$ for all $i, j = 1, \dots, N$.
(ii) $\{-\tilde A(x_i), -\tilde A(x_j)\} = 0$ for all $i, j = 1, \dots, N$.
(iii) $\{-\tilde A(x_i), x_j\} = \delta_{i,j}$ for all $i, j = 1, \dots, N$.

To prove the theorem, we follow the argument of [11]. First let us introduce several notations. Define the matrices $\hat A$, $\hat B$, $\hat C$, $\hat D$ as
\[
\hat A := \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad
\hat B := \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad
\hat C := \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad
\hat D := \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}. \tag{3.11}
\]
Gauge transformations of them are defined as follows:
\[
\begin{aligned}
\hat A(u; \tilde u) &:= M(u; \tilde u)\, \hat A\, M(u; \tilde u)^{-1}, &
\hat B(u; \tilde u) &:= M(u; \tilde u)\, \hat B\, M(u; \tilde u)^{-1}, \\
\hat C(u; \tilde u) &:= M(u; \tilde u)\, \hat C\, M(u; \tilde u)^{-1}, &
\hat D(u; \tilde u) &:= M(u; \tilde u)\, \hat D\, M(u; \tilde u)^{-1}.
\end{aligned} \tag{3.12}
\]
The bracket $\langle\, , \rangle$ is the standard inner product of the $2 \times 2$ matrices:
\[
\langle X, Y \rangle = \operatorname{tr} XY. \tag{3.13}
\]
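With these definitions the pairing $\langle \hat C(u; \tilde u), T(u) \rangle$ is just the off-diagonal entry $\tilde B(u; \tilde u)$ of the twisted matrix (3.3), by cyclicity of the trace. A minimal sketch of that identity, assuming generic invertible $2 \times 2$ numeric matrices in place of $M$ and $T$ (the function names are illustrative):

```python
C_HAT = [[0, 0], [1, 0]]


def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(X):
    """Inverse of an invertible 2x2 matrix."""
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det],
            [-X[1][0] / det, X[0][0] / det]]


def pairing_c(M, T):
    """<C^(u), T> = tr(M C^ M^{-1} T), combining (3.12) and (3.13)."""
    c_gauged = matmul(matmul(M, C_HAT), inverse(M))
    prod = matmul(c_gauged, T)
    return prod[0][0] + prod[1][1]


def twisted_entry_b(M, T):
    """(1,2) entry of M^{-1} T M, i.e. the twisted B of (3.3)."""
    return matmul(matmul(inverse(M), T), M)[0][1]
```

This is why the zeros $x_j$ of $\tilde B$ can be handled through $\langle \hat C, T \rangle$ in the lemmas that follow.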
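When $X(u)$ is a variable depending on the spectral parameter $u$, we will denote $X(x_i)$ by $X_i$ for brevity. For example,
\[
(\partial_u \langle \hat C T \rangle)_i = \frac{\partial}{\partial u} \operatorname{tr}\bigl(\hat C(u; \tilde u)\, T(u)\bigr) \Big|_{u = x_i}.
\]
The following statement is proved by the same argument as in the proof of the Theorem in §2 of [11].

Lemma 3.2. For any dynamical variable $X$,
\[
\{X, \tilde u\} = -\frac{\langle \hat C_0 \{X, T\}_0 \rangle}{\langle \partial_{\tilde u} \hat C_0\, T_0 \rangle}, \tag{3.14}
\]
and
\[
\{X, x_j\} = \frac{\langle \hat C_0 \{X, T\}_0 \rangle \langle \partial_{\tilde u} \hat C_j\, T_j \rangle - \langle \hat C_j \{X, T\}_j \rangle \langle \partial_{\tilde u} \hat C_0\, T_0 \rangle}{(\partial_u \langle \hat C, T \rangle)_j\, \langle \partial_{\tilde u} \hat C_0\, T_0 \rangle}. \tag{3.15}
\]
We also need the formula for the twisted r matrix.

Lemma 3.3. Define
\[
\tilde r(u, v; \tilde u) := \overset{1}{M}(u; \tilde u)^{-1}\, \overset{2}{M}(v; \tilde u)^{-1}\, r(u - v)\, \overset{1}{M}(u; \tilde u)\, \overset{2}{M}(v; \tilde u), \tag{3.16}
\]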
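which we call the twisted r matrix and let $\tilde r_{ij}(u, v; \tilde u)$ be its $(i, j)$ element. Then it has the following form:
\[
\tilde r_{11}(u, v; \tilde u) = -\tilde r_{22}(u, v; \tilde u) = -\tilde r_{33}(u, v; \tilde u) = \tilde r_{44}(u, v; \tilde u)
= -\frac{1}{2} \left( \frac{\theta'_{11}(u - v)}{\theta_{11}(u - v)} - \frac{\theta'_{11}(u)}{\theta_{11}(u)} + \frac{\theta'_{11}(v)}{\theta_{11}(v)} \right), \tag{3.17}
\]
\[
\begin{aligned}
\tilde r_{12}(u, v; \tilde u) &= -\tilde r_{13}(v, u; \tilde u) = -\tilde r_{21}(u, v; -\tilde u) = \tilde r_{31}(v, u; -\tilde u) \\
&= \tilde r_{24}(v, u; \tilde u) = -\tilde r_{34}(u, v; \tilde u) = -\tilde r_{42}(v, u; -\tilde u) = \tilde r_{43}(u, v; -\tilde u) \\
&= \frac{-\theta'_{11}\, \theta_{11}(v + \tilde u)}{2\theta_{11}(\tilde u)\, \theta_{11}(v)},
\end{aligned} \tag{3.18}
\]
\[
\tilde r_{14}(u, v; \tilde u) = \tilde r_{41}(u, v; \tilde u) = 0, \tag{3.19}
\]
\[
\tilde r_{23}(u, v; \tilde u) = \tilde r_{32}(u, v; -\tilde u) = \frac{-\theta'_{11}\, \theta_{11}(u - v + \tilde u)}{\theta_{11}(u - v)\, \theta_{11}(\tilde u)}. \tag{3.20}
\]
Proof. The proof is given by a direct computation. For example, we have formulae like
\[
\begin{aligned}
M(u, \tilde u)^{-1}\, \sigma_1\, M(u, \tilde u) &= \frac{1}{\theta_{11}(u)\theta_{11}(\tilde u)}
\begin{pmatrix} \theta_{10}(u)\theta_{10}(\tilde u) & \theta_{10}\theta_{10}(u + \tilde u) \\ -\theta_{10}\theta_{10}(u - \tilde u) & -\theta_{10}(u)\theta_{10}(\tilde u) \end{pmatrix}, \\
M(u, \tilde u)^{-1}\, (i\sigma_2)\, M(u, \tilde u) &= \frac{1}{\theta_{11}(u)\theta_{11}(\tilde u)}
\begin{pmatrix} \theta_{00}(u)\theta_{00}(\tilde u) & \theta_{00}\theta_{00}(u + \tilde u) \\ -\theta_{00}\theta_{00}(u - \tilde u) & -\theta_{00}(u)\theta_{00}(\tilde u) \end{pmatrix}, \\
M(u, \tilde u)^{-1}\, \sigma_3\, M(u, \tilde u) &= \frac{1}{\theta_{11}(u)\theta_{11}(\tilde u)}
\begin{pmatrix} -\theta_{01}(u)\theta_{01}(\tilde u) & -\theta_{01}\theta_{01}(u + \tilde u) \\ \theta_{01}\theta_{01}(u - \tilde u) & \theta_{01}(u)\theta_{01}(\tilde u) \end{pmatrix},
\end{aligned}
\]
which follow from the addition theorems (cf. [14, pp. 20, 22]) and the Landen transformation (cf. [13, §21.52]) of theta functions. Substituting them in the definition of $\tilde r$ (3.16) and using the addition theorems again, we can prove the lemma. □

Proof of Theorem 3.1. Using the formulae (3.14) and (3.15), we have
\[
\begin{aligned}
\{x_j, x_k\} = \frac{1}{(\partial_u \langle \hat C T \rangle)_j\, (\partial_u \langle \hat C T \rangle)_k} \times \Biggl(
& \frac{\langle \partial_{\tilde u} \hat C_j T_j \rangle \langle \partial_{\tilde u} \hat C_k T_k \rangle}{\langle \partial_{\tilde u} \hat C_0 T_0 \rangle^2}
\langle \overset{1}{\hat C}_0 \overset{2}{\hat C}_0 \{\overset{1}{T}, \overset{2}{T}\}_{00} \rangle
- \frac{\langle \partial_{\tilde u} \hat C_k T_k \rangle}{\langle \partial_{\tilde u} \hat C_0 T_0 \rangle}
\langle \overset{1}{\hat C}_j \overset{2}{\hat C}_0 \{\overset{1}{T}, \overset{2}{T}\}_{j0} \rangle \\
& - \frac{\langle \partial_{\tilde u} \hat C_j T_j \rangle}{\langle \partial_{\tilde u} \hat C_0 T_0 \rangle}
\langle \overset{1}{\hat C}_0 \overset{2}{\hat C}_k \{\overset{1}{T}, \overset{2}{T}\}_{0k} \rangle
+ \langle \overset{1}{\hat C}_j \overset{2}{\hat C}_k \{\overset{1}{T}, \overset{2}{T}\}_{jk} \rangle \Biggr).
\end{aligned} \tag{3.21}
\]
Therefore computation of $\{x_j, x_k\}$ reduces to computation of $\langle \overset{1}{\hat C}_j \overset{2}{\hat C}_k \{\overset{1}{T}, \overset{2}{T}\}_{jk} \rangle$. As in Appendix B of [11], we have
\[
\langle \overset{1}{\hat \Phi}_j \overset{2}{\hat \Psi}_k \{\overset{1}{T}, \overset{2}{T}\}_{jk} \rangle
= \operatorname{tr}_1 \operatorname{tr}_2 \Bigl( [\overset{1}{\hat \Phi} \overset{2}{\hat \Psi}, \tilde r(x_j, x_k)]\, \bigl( \overset{1}{\tilde T}(x_j; \tilde u) + \overset{2}{\tilde T}(x_k; \tilde u) \bigr) \Bigr), \tag{3.22}
\]
for any $\Phi, \Psi = A, B, C, D$. Substituting $\Phi = \Psi = C$ and using (3.19), we have
\[
\langle \overset{1}{\hat C}_j \overset{2}{\hat C}_k \{\overset{1}{T}, \overset{2}{T}\}_{jk} \rangle = 0.
\]
Thus (3.21) implies that $\{x_j, x_k\} = 0$.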
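A direct consequence of this is $\{x_j, \tilde u\} = 0$, which follows from (3.9). Using these results and Lemma 3.2, we have for $j \neq k$,
\[
\{\tilde A(x_j), x_k\} = \frac{\langle \partial_{\tilde u} \hat C_k T_k \rangle \langle \overset{1}{\hat A}_j \overset{2}{\hat C}_0 \{\overset{1}{T}, \overset{2}{T}\}_{j0} \rangle - \langle \partial_{\tilde u} \hat C_0 T_0 \rangle \langle \overset{1}{\hat A}_j \overset{2}{\hat C}_k \{\overset{1}{T}, \overset{2}{T}\}_{jk} \rangle}{(\partial_u \langle \hat C, T \rangle)_k\, \langle \partial_{\tilde u} \hat C_0 T_0 \rangle}. \tag{3.23}
\]
Hence we need to know $\langle \overset{1}{\hat A}_j \overset{2}{\hat C}_k \{\overset{1}{T}, \overset{2}{T}\}_{jk} \rangle$ and $\langle \partial_{\tilde u} \hat C_k T_k \rangle$. The former can be computed by (3.22) and (3.19) and we have
\[
\langle \overset{1}{\hat A}_j \overset{2}{\hat C}_k \{\overset{1}{T}, \overset{2}{T}\}_{jk} \rangle = -2 \tilde r_{12}(x_j, x_k; \tilde u)\, \tilde A_k. \tag{3.24}
\]
The factor $\langle \partial_{\tilde u} \hat C_k T_k \rangle$ is computed as follows:
\[
\langle \partial_{\tilde u} \hat C_k T_k \rangle = \langle [M(x_k; \tilde u)^{-1} \partial_{\tilde u} M(x_k; \tilde u), \hat C]\, \tilde T(x_k; \tilde u) \rangle
= -\frac{\theta'_{11}\, \theta_{11}(x_k + \tilde u)}{\theta_{11}(x_k)\, \theta_{11}(\tilde u)}\, \tilde A_k. \tag{3.25}
\]
Substituting (3.24), (3.25) and (3.18) into (3.23), we have $\{\tilde A(x_j), x_k\} = 0$ for $j \neq k$.

The proof of $\{\tilde A(x_j), \tilde A(x_k)\} = 0$ is done in a similar way. In addition to the formulae we have shown above, we need
\[
\langle \overset{1}{\hat A}_j \overset{2}{\hat A}_k \{\overset{1}{T}, \overset{2}{T}\}_{jk} \rangle = \tilde r_{13}(x_j, x_k; \tilde u)\, \tilde C_j + \tilde r_{12}(x_j, x_k; \tilde u)\, \tilde C_k, \tag{3.26}
\]
\[
\langle \partial_{\tilde u} \hat A_k T_k \rangle = \frac{-\theta'_{11}\, \theta_{11}(x_k + \tilde u)}{2\theta_{11}(x_k)\, \theta_{11}(\tilde u)}\, \tilde C_k. \tag{3.27}
\]
The proof of the remaining equation $\{\tilde A(x_j), x_j\} = -1$ requires special care, since the r matrix $r(u)$ diverges at $u = 0$. Instead of (3.24), we use
\[
\begin{aligned}
\langle \overset{1}{\hat A}_j \overset{2}{\hat C}_j \{\overset{1}{T}, \overset{2}{T}\}_{jj} \rangle
&= -2 \tilde r_{12}(x_j, x_j; \tilde u)\, \tilde A_j - \lim_{u \to x_j} \tilde r_{32}(u, x_j; \tilde u)\, \tilde B(u; \tilde u) \\
&= -2 \tilde r_{12}(x_j, x_j; \tilde u)\, \tilde A_j + (\partial_u \tilde B)_j.
\end{aligned} \tag{3.28}
\]
Noting $(\partial_u \langle \hat C, T \rangle)_j = (\partial_u \tilde B)_j$ and substituting (3.28) and (3.25) into (3.23), we have $\{\tilde A(x_j), x_j\} = -1$. □

Since $\tilde B$ is zero at $u = x_j$, the dynamical variable $X_j := -\tilde A(x_j)$ is an eigenvalue of $T(x_j)$:
\[
\tilde T(x_j) \begin{pmatrix} 0 \\ 1 \end{pmatrix} = X_j \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \tag{3.29}
\]
\[
T(x_j) \begin{pmatrix} -\theta_{01}\bigl(\frac{x_j + \tilde u}{2}; \frac{\tau}{2}\bigr) \\ \theta_{00}\bigl(\frac{x_j + \tilde u}{2}; \frac{\tau}{2}\bigr) \end{pmatrix}
= X_j \begin{pmatrix} -\theta_{01}\bigl(\frac{x_j + \tilde u}{2}; \frac{\tau}{2}\bigr) \\ \theta_{00}\bigl(\frac{x_j + \tilde u}{2}; \frac{\tau}{2}\bigr) \end{pmatrix}. \tag{3.30}
\]
Thus if we define the characteristic polynomial by
\[
W(z, u) := \det(z - T(u)), \tag{3.31}
\]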
each pair of dynamical variables $(x_j, X_j)$ satisfies an equation
\[
W(X_j, x_j) = 0, \tag{3.32}
\]
for $j = 1, \dots, N$. Therefore, following the definition in [7], canonical variables $(x_1, \dots, x_N; X_1, \dots, X_N)$ are separated variables of the classical elliptic Gaudin model.

4. Quantum System: General Case

We return now to the quantum elliptic Gaudin model and construct the quantum separation of variables. The special case $N = 1$ is considered in the next section, Sect. 5.

4.1. Kernel function. Suppose that the representation space $V_n = V^{\ell_n}$ (2.8) is realized as a space of functions on a certain space with coordinate $y_n$ and that the operators $S^a$ are differential operators on, e.g., polynomials or elliptic functions. The separating operator $K$ is expressed as an integral operator
\[
K f(x_1, \dots, x_N) = \int dy_1 \cdots dy_N\, \Phi(x_1, \dots, x_N | y_1, \dots, y_N)\, f(y_1, \dots, y_N), \tag{4.1}
\]
which maps a function of $(y_1, \dots, y_N)$ in $V_1 \otimes \cdots \otimes V_N$ to a function of $N$ variables $x_i$ on the elliptic curve $\mathbb{C}/\mathbb{Z} + \tau\mathbb{Z}$. Let us define the operator $X_i$ as follows:
\[
X_i := \frac{\partial}{\partial x_i} - \Lambda(x_i), \qquad
\Lambda(x) = \sum_{n=1}^{N} \ell_n \frac{\theta'_{11}(x - z_n)}{\theta_{11}(x - z_n)}. \tag{4.2}
\]
Lemma 4.1. The following system of partial differential equations satisfies the Frobenius integrability condition:
\[
\tilde B^*(x_i; \tilde u)\, \Phi = 0, \qquad i = 1, \dots, N, \tag{4.3}
\]
\[
(X_i + \tilde A^*(x_i; \tilde u))\, \Phi = 0, \qquad i = 1, \dots, N, \tag{4.4}
\]
where $P^*$ is the (formal) adjoint of a differential operator $P$ with respect to $(y_1, \dots, y_N)$ and we set
\[
\tilde u = \sum_{n=1}^{N} z_n - \sum_{j=0}^{N} x_j \tag{4.5}
\]
for a certain constant $x_0 = \xi$.

Proof. This is a consequence of the commutation relation (2.13). By multiplying $\overset{1}{M}(u; \tilde u)\, \overset{2}{M}(v; \tilde u)$ from the right and its inverse from the left, we have
\[
[\overset{1}{\tilde T}(u; \tilde u), \overset{2}{\tilde T}(v; \tilde u)] = [\tilde r(u, v; \tilde u), \overset{1}{\tilde T}(u; \tilde u) + \overset{2}{\tilde T}(v; \tilde u)]. \tag{4.6}
\]
Note that $\tilde u$ is not a dynamical variable in contrast to that in Sect. 3.
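In order to show the consistency of Eqs. (4.3) for $i$ and for $j$, we prove that $[\tilde B^*(x_i; \tilde u), \tilde B^*(x_j; \tilde u)]$ is expressed as a linear combination of $\tilde B^*(x_i; \tilde u)$ and $\tilde B^*(x_j; \tilde u)$. Since the formal adjoint is an algebra anti-isomorphism, $(PQ)^* = Q^* P^*$, we have
\[
[\tilde B^*(x_i; \tilde u), \tilde B^*(x_j; \tilde u)] = [\tilde B(x_j; \tilde u), \tilde B(x_i; \tilde u)]^*. \tag{4.7}
\]
The (1,4)-element of (4.6) gives
\[
[\tilde B(u; \tilde u), \tilde B(v; \tilde u)] = 2 \bigl( \tilde r_{12}(u, v; \tilde u)\, \tilde B(u) - \tilde r_{12}(v, u; \tilde u)\, \tilde B(v) \bigr) \tag{4.8}
\]
by virtue of (3.19) and (3.18). Replacing $u$ and $v$ in (4.8) by $x_i$ and $x_j$ respectively, which are not dynamical, we obtain
\[
[\tilde B^*(x_i; \tilde u), \tilde B^*(x_j; \tilde u)] =
\frac{\theta'_{11}\, \theta_{11}(x_j + \tilde u)}{\theta_{11}(x_j)\, \theta_{11}(\tilde u)}\, \tilde B^*(x_i)
- \frac{\theta'_{11}\, \theta_{11}(x_i + \tilde u)}{\theta_{11}(x_i)\, \theta_{11}(\tilde u)}\, \tilde B^*(x_j), \tag{4.9}
\]
which means that Eqs. (4.3) for $i$ and for $j$ are compatible.

Next we show the compatibility condition
\[
[X_i + \tilde A^*(x_i; \tilde u), X_j + \tilde A^*(x_j; \tilde u)] = 0, \tag{4.10}
\]
which implies the consistency of Eqs. (4.4) for $i$ and for $j$ $(i \neq j)$. It is obvious from (4.2) that
\[
[X_i, X_j] = 0. \tag{4.11}
\]
Because of (4.5), we have
\[
[X_i, \tilde A^*(x_j; \tilde u)] = -\Bigl( \frac{\partial}{\partial \tilde u} \tilde A(x_j; \tilde u) \Bigr)^*.
\]
By the same argument as that for (3.27) the right-hand side is rewritten as
\[
[X_i, \tilde A^*(x_j; \tilde u)] =
-\frac{\theta'_{11}\, \theta_{11}(\tilde u + x_j)}{2\theta_{11}(\tilde u)\, \theta_{11}(x_j)}\, \tilde C^*(x_j; \tilde u)
- \frac{\theta'_{11}\, \theta_{11}(\tilde u - x_j)}{2\theta_{11}(\tilde u)\, \theta_{11}(x_j)}\, \tilde B^*(x_j; \tilde u). \tag{4.12}
\]
Exchanging $i$ and $j$, we have
\[
[X_j, \tilde A^*(x_i; \tilde u)] =
-\frac{\theta'_{11}\, \theta_{11}(\tilde u + x_i)}{2\theta_{11}(\tilde u)\, \theta_{11}(x_i)}\, \tilde C^*(x_i; \tilde u)
- \frac{\theta'_{11}\, \theta_{11}(\tilde u - x_i)}{2\theta_{11}(\tilde u)\, \theta_{11}(x_i)}\, \tilde B^*(x_i; \tilde u). \tag{4.13}
\]
The (1,1)-element of (4.6) means
\[
\begin{aligned}
[\tilde A^*(x_i; \tilde u), \tilde A^*(x_j; \tilde u)] = {}& -\tilde r_{13}(x_i, x_j; \tilde u)\, \tilde C^*(x_i; \tilde u) - \tilde r_{12}(x_i, x_j; \tilde u)\, \tilde C^*(x_j; \tilde u) \\
& + \tilde r_{31}(x_i, x_j; \tilde u)\, \tilde B^*(x_i; \tilde u) + \tilde r_{21}(x_i, x_j; \tilde u)\, \tilde B^*(x_j; \tilde u).
\end{aligned} \tag{4.14}
\]
Summing up (4.11), (4.12), (4.13) and (4.14), we have proved (4.10) because of (3.18).

The consistency of (4.4) for $i$ and (4.3) for $j$ is shown as follows. First assume $i \neq j$. Then the same computation as above gives
\[
\begin{aligned}
[X_i + \tilde A^*(x_i; \tilde u), \tilde B^*(x_j; \tilde u)]
&= -\Bigl( \frac{\partial}{\partial \tilde u} \tilde B(x_j; \tilde u) \Bigr)^* + [\tilde A^*(x_i; \tilde u), \tilde B^*(x_j; \tilde u)] \\
&= \left( \frac{\theta'_{11}(x_i - x_j)}{\theta_{11}(x_i - x_j)} - \frac{\theta'_{11}(x_i)}{\theta_{11}(x_i)} \right) \tilde B^*(x_j; \tilde u)
- \frac{\theta'_{11}\, \theta_{11}(x_i - x_j - \tilde u)}{\theta_{11}(x_i - x_j)\, \theta_{11}(\tilde u)}\, \tilde B^*(x_i; \tilde u).
\end{aligned} \tag{4.15}
\]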
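Thus we have proved the compatibility of (4.4) for $i$ and (4.3) for $j$. Here we used
\[
\frac{\partial}{\partial \tilde u} \tilde B(x_j; \tilde u) =
-\frac{\theta'_{11}\, \theta_{11}(x_j + \tilde u)}{\theta_{11}(x_j)\, \theta_{11}(\tilde u)}\, \tilde A(x_j; \tilde u)
+ \frac{\theta'_{11}(x_j)}{\theta_{11}(x_j)}\, \tilde B(x_j; \tilde u), \tag{4.16}
\]
and the (1,2)-element of (4.6). The case $i = j$ is almost the same, but there is another term coming from $[X_i, \tilde B^*(x_i; \tilde u)]$:
\[
[X_i + \tilde A^*(x_i; \tilde u), \tilde B^*(x_i; \tilde u)]
= \frac{\partial}{\partial u} \tilde B^*(u; \tilde u) \Big|_{u = x_i}
- \Bigl( \frac{\partial}{\partial \tilde u} \tilde B(x_i; \tilde u) \Bigr)^*
+ [\tilde A^*(x_i; \tilde u), \tilde B^*(x_i; \tilde u)]. \tag{4.17}
\]
By the same computation as (3.28), it follows from the (1,2)-element of (4.6) that
\[
[\tilde A^*(x_i; \tilde u), \tilde B^*(x_i; \tilde u)]
= \frac{\theta'_{11}(\tilde u)}{\theta_{11}(\tilde u)}\, \tilde B^*(x_i; \tilde u)
- \frac{\theta'_{11}\, \theta_{11}(x_i + \tilde u)}{\theta_{11}(x_i)\, \theta_{11}(\tilde u)}\, \tilde A^*(x_i; \tilde u)
- \frac{\partial}{\partial u} \tilde B^*(u; \tilde u) \Big|_{u = x_i}. \tag{4.18}
\]
Substituting (4.18) and (4.16) for $j = i$ into (4.17), we obtain
\[
[X_i + \tilde A^*(x_i; \tilde u), \tilde B^*(x_i; \tilde u)]
= \left( \frac{\theta'_{11}(\tilde u)}{\theta_{11}(\tilde u)} - \frac{\theta'_{11}(x_i)}{\theta_{11}(x_i)} \right) \tilde B^*(x_i; \tilde u), \tag{4.19}
\]
which proves the consistency of (4.4) for $i$ and (4.3) for $i$. □

4.2. Separating operator. The separating integral operator $K$ is defined by (4.1) with the kernel function $\Phi(x|y)$ satisfying Eqs. (4.3) and (4.4).

Proposition 4.2. (i) For any function $f$ of $(y_1, \dots, y_N)$ in $V_1 \otimes \cdots \otimes V_N$, we have
\[
K(\tilde B(x_i; \tilde u) f) = 0, \tag{4.20}
\]
\[
K(-\tilde A(x_i; \tilde u) f) = X_i K(f). \tag{4.21}
\]
(ii) The elliptic Gaudin Hamiltonian $\hat\tau(u)$ with the spectral parameter fixed to $u = x_i$ is transformed as follows:
\[
K(\hat\tau(x_i) f)(x) = X_i^2 K(f)(x), \tag{4.22}
\]
where $X_i$ is defined by (4.2).

Proof. (i) is a direct consequence of (4.3) and (4.4) respectively. (ii) By Definition (2.9),
\[
\begin{aligned}
K(\hat\tau(x_i) f)(x)
&= \int \Phi(x|y)\, \frac{1}{2} \bigl( 2\tilde A(x_i; \tilde u)^2 + \tilde B(x_i; \tilde u) \tilde C(x_i; \tilde u) + \tilde C(x_i; \tilde u) \tilde B(x_i; \tilde u) \bigr) f(y)\, dy \\
&= \int \bigl( (\tilde A^*(x_i; \tilde u))^2\, \Phi(x|y) \bigr) f(y)\, dy
+ \int \bigl( \tilde C^*(x_i; \tilde u)\, \tilde B^*(x_i; \tilde u)\, \Phi(x|y) \bigr) f(y)\, dy \\
&\quad + \frac{1}{2} \int \bigl( [\tilde B^*(x_i; \tilde u), \tilde C^*(x_i; \tilde u)]\, \Phi(x|y) \bigr) f(y)\, dy.
\end{aligned} \tag{4.23}
\]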
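The second equality in (4.23) only uses the anti-isomorphism property $(PQ)^* = Q^* P^*$, which gives $\frac{1}{2}(BC + CB)^* = C^* B^* + \frac{1}{2}[B^*, C^*]$. A minimal sketch of this reordering, assuming the matrix transpose as a finite-dimensional stand-in for the formal adjoint (both reverse the order of products); the function names are illustrative.

```python
def matmul(X, Y):
    """Product of two square matrices of equal size."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(X):
    """Transpose, used here as a model of the formal adjoint P -> P*."""
    n = len(X)
    return [[X[j][i] for j in range(n)] for i in range(n)]


def adjoint_of_symmetrized(B, C):
    """(BC + CB)^*, i.e. twice the operator appearing in (4.23) before reordering."""
    BC, CB = matmul(B, C), matmul(C, B)
    n = len(B)
    return transpose([[BC[i][j] + CB[i][j] for j in range(n)] for i in range(n)])


def reordered(B, C):
    """2 C^* B^* + [B^*, C^*], the reordered form used in (4.23)."""
    Bt, Ct = transpose(B), transpose(C)
    CtBt, BtCt = matmul(Ct, Bt), matmul(Bt, Ct)
    n = len(B)
    return [[2 * CtBt[i][j] + BtCt[i][j] - CtBt[i][j] for j in range(n)]
            for i in range(n)]
```

The reordered form places $\tilde B^*$ next to the kernel, so that (4.3) removes the middle term of (4.23).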
The first term in the right-hand side of (4.23) is rewritten by the following formula:
\[
(\tilde A^*(x_i; \tilde u))^2\, \Phi(x|y) = -\tilde A^*(x_i)\, X_i\, \Phi(x|y)
= X_i^2\, \Phi(x|y) + [X_i, \tilde A^*(x_i)]\, \Phi(x|y), \tag{4.24}
\]
where we used (4.4). The last term of (4.24) is
\[
[X_i, \tilde A^*(x_i; \tilde u)] = \frac{\partial}{\partial u} \tilde A^*(u; \tilde u) \Big|_{u = x_i} - \frac{\partial}{\partial \tilde u} \tilde A^*(u; \tilde u) \Big|_{u = x_i} \tag{4.25}
\]
because $\tilde u = \sum z_n - \sum x_i$. Hence, similarly to the derivation of (4.12), we can prove that
\[
[X_i, \tilde A^*(x_i; \tilde u)] = \frac{\partial}{\partial u} \tilde A^*(u; \tilde u) \Big|_{u = x_i}
- \frac{\theta'_{11}\, \theta_{11}(\tilde u + x_i)}{2\theta_{11}(\tilde u)\, \theta_{11}(x_i)}\, \tilde C^*(x_i; \tilde u)
- \frac{\theta'_{11}\, \theta_{11}(\tilde u - x_i)}{2\theta_{11}(\tilde u)\, \theta_{11}(x_i)}\, \tilde B^*(x_i; \tilde u). \tag{4.26}
\]
The (2,3)-element of the commutation relation (4.6) gives
\[
[\tilde B(u; \tilde u), \tilde C(u; \tilde u)] = 2\tilde A'(u; \tilde u)
- \frac{\theta'_{11}\, \theta_{11}(u - \tilde u)}{\theta_{11}(\tilde u)\, \theta_{11}(u)}\, \tilde B(u; \tilde u)
- \frac{\theta'_{11}\, \theta_{11}(u + \tilde u)}{\theta_{11}(\tilde u)\, \theta_{11}(u)}\, \tilde C(u; \tilde u) \tag{4.27}
\]
in the limit $v \to u$. Substituting (4.24), (4.26) and (4.27) into (4.23) and using (4.3), we obtain (4.22). □

Equation (4.20) is a quantum version of (3.10) and Eq. (4.21) together with the canonical commutation relation $[X_i, x_j] = \delta_{ij}$ means that the operators $(x_1, \dots, x_N; X_1, \dots, X_N)$ are the quantization of the classical separated variables in Sect. 3.2. The second statement of Proposition 4.2 provides a formal separation of variables for the quantum elliptic Gaudin model. Using the language of [8] and [9], the kernel $\Phi(x|y)$ provides a Radon–Penrose transformation of the corresponding D-modules (cf. [15]).

In principle, the quantum separation of variables should result in a one-dimensional spectral problem for the separated equation (4.22) which is equivalent to the spectral problem for the original Hamiltonians (2.16). To achieve this goal one needs to specify an integration contour in (4.1) and to study in detail the action of the integral operator $K$ on the functional space $V$. Here we examine only the simplest case $N = 1$, leaving the general case for further study.

5. Quantum System: Case N = 1

In this section we examine the special case of $N = 1$. In this case, everything can be computed explicitly and we shall see that the separated equation is nothing but the classical Lamé equation and its generalization. We adopt the realization of the representation $\rho^{\ell}$ of $sl_2$ on the space of elliptic functions reviewed in Appendix B. We could use the standard realization on the space of sections of a line bundle over $\mathbb{P}^1$, but the result is essentially the same up to coordinate transformation and gauge transformation. We omit the suffix $n$ of $z_n$ and $S_n^a$ for brevity.
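5.1. Separated variables. The quantum twisted B operator $\tilde B(u; \tilde u)$ is defined as in the classical case, (3.3) or (3.2). Substituting (2.1) we obtain
\[
\tilde B(u; \tilde u) = \frac{\theta'_{11}}{2\theta_{11}(u)\, \theta_{11}(\tilde u)\, \theta_{11}(u - z)}
\bigl( \theta_{10}(u - z)\theta_{10}(u + \tilde u)\, S^1 - \theta_{00}(u - z)\theta_{00}(u + \tilde u)\, iS^2 - \theta_{01}(u - z)\theta_{01}(u + \tilde u)\, S^3 \bigr). \tag{5.1}
\]
The realization of the representation (B.2) gives the following expression:
\[
\tilde B(u; \tilde u) = \tilde B^{(1)}(u; \tilde u) \frac{d}{dy} + \tilde B^{(0)}(u; \tilde u), \tag{5.2}
\]
where
\[
\begin{aligned}
\tilde B^{(1)}(u; \tilde u) = {}& \theta_{11}(u)^{-1}\, \theta_{11}(\tilde u)^{-1}\, \theta_{11}(u - z)^{-1}\, \theta_{11}(2y)^{-1} \\
& \times \theta_{10}\Bigl( y + u - \frac{z}{2} + \frac{\tilde u}{2} \Bigr)\, \theta_{10}\Bigl( y - u + \frac{z}{2} - \frac{\tilde u}{2} \Bigr) \\
& \times \theta_{10}\Bigl( -y - \frac{z}{2} - \frac{\tilde u}{2} \Bigr)\, \theta_{10}\Bigl( -y + \frac{z}{2} + \frac{\tilde u}{2} \Bigr),
\end{aligned} \tag{5.3}
\]
\[
\begin{aligned}
\tilde B^{(0)}(u; \tilde u) = \frac{2\ell\, \theta'_{11}}{2\theta_{11}(u)\, \theta_{11}(\tilde u)\, \theta_{11}(u - z)\, \theta_{11}(y)^2}
\Bigl( & \theta_{11}(u + \tilde u)\, \theta_{11}(u - z)\, \theta_{11}(y)^2 \\
& + 2\theta_{10}\Bigl( y + u - \frac{z}{2} + \frac{\tilde u}{2} \Bigr)\, \theta_{10}\Bigl( -y + u - \frac{z}{2} + \frac{\tilde u}{2} \Bigr)\, \theta_{10}\Bigl( -\frac{z}{2} - \frac{\tilde u}{2} \Bigr)^2 \Bigr).
\end{aligned} \tag{5.4}
\]
A special point in the case $N = 1$ is that we can make use of the freedom of $\tilde u$ so that $\tilde B(u; \tilde u)$ is a multiplication operator with the divisor of the form
\[
\operatorname{div}(\tilde B(u; \tilde u)) = [x] + [z] - [z] - [0] = [x] - [0] \pmod{\mathbb{Z} + \tau\mathbb{Z}}, \tag{5.5}
\]
as in the classical case, (3.8), (3.9). In fact, if we put $\tilde u = -z \pm 2y + 1$, $\tilde B^{(1)}(u; \tilde u) = 0$ by virtue of (5.3), and then (5.4) implies
\[
\tilde B(u; \tilde u) \big|_{\tilde u = -z \pm 2y + 1} = \frac{2\ell\, \theta'_{11}\, \theta_{11}(u - z \pm 2y)}{-2\theta_{11}(u)\, \theta_{11}(-z \pm 2y)}. \tag{5.6}
\]
(We substitute the variable "from the left", namely we define
\[
\tilde B(u; \tilde u) \big|_{\tilde u = -z \pm 2y + 1} = \tilde B^{(1)}(u; -z \pm 2y + 1) \frac{d}{dy} + \tilde B^{(0)}(u; -z \pm 2y + 1).
\]
Hereafter we always follow this normal ordering convention.) Therefore we can take $x = z \mp 2y$ in (5.5). This is one of the "separated variables" in this case.

In the classical model, Theorem 3.1, $-\tilde A(x; \tilde u)$ is a dynamical variable canonically conjugate to $x$. This is also the case in the quantum model. The definition of $\tilde A$, (3.3), is rewritten in the form
\[
\tilde A(u; \tilde u) = \frac{\theta'_{11}}{2\theta_{11}(u)\, \theta_{11}(\tilde u)\, \theta_{11}(u - z)}
\bigl( \theta_{10}^{-1}\, \theta_{10}(u - z)\theta_{10}(u)\theta_{10}(\tilde u)\, S^1
- \theta_{00}^{-1}\, \theta_{00}(u - z)\theta_{00}(u)\theta_{00}(\tilde u)\, iS^2
- \theta_{01}^{-1}\, \theta_{01}(u - z)\theta_{01}(u)\theta_{01}(\tilde u)\, S^3 \bigr) \tag{5.7}
\]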
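by (2.1). Substituting $\tilde u = -z \pm 2y + 1$ and $u = x = z \mp 2y$ (from the left), we obtain
\[
\begin{aligned}
\tilde A(u; \tilde u) \big|_{u = z \mp 2y,\ \tilde u = -z \pm 2y + 1}
&= \pm \frac{1}{2} \left( \frac{d}{dy} + 2\ell \left( \frac{\theta'_{11}\, \theta_{10}(2y)}{\theta_{10}\, \theta_{11}(2y)} + \frac{\theta'_{11}\, \theta_{00}(2y)}{\theta_{00}\, \theta_{11}(2y)} + \frac{\theta'_{11}\, \theta_{01}(2y)}{\theta_{01}\, \theta_{11}(2y)} \right) \right) \\
&= \pm \frac{1}{2} \left( \frac{d}{dy} - \ell\, \frac{\wp''(y)}{\wp'(y)} \right).
\end{aligned} \tag{5.8}
\]
The last equality can be proved by comparing the poles of both sides. Therefore $x = z \mp 2y$ and $X := -\tilde A(u; \tilde u) \big|_{u = z \mp 2y,\ \tilde u = -z \pm 2y + 1}$ are the canonical conjugate variables satisfying
\[
[X, x] = 1. \tag{5.9}
\]
We did not make use of the formulation in the previous sections explicitly. In fact, thanks to the special choice of $\tilde u$, $\Phi$ in (4.1) is a $\delta$-function type kernel, which reduces the integral operator $K$ to a coordinate transformation operator from $y$ to $x$.

5.2. Solving the spectral problem. For the case $N = 1$, the generating function of the quantum integrals of motion $\hat\tau(u)$ ($u$ is the spectral parameter)
\[
\hat\tau(u) = \frac{1}{2} \sum_{a=1}^{3} w_a(u)^2\, (\rho^{(\ell)}(S^a))^2 \tag{5.10}
\]
is explicitly written down. Here we shift the spectral parameter in the original definition (2.9) as $u \mapsto u + z$ and set $z = 0$ for the sake of simplicity. Using (B.2) or (B.6) and various identities of elliptic functions in [13], we can expand the right-hand side of (5.10):
\[
\hat\tau\Bigl( y, \frac{d}{dy}, \frac{d^2}{dy^2}; u \Bigr) = \frac{1}{4} \left( \frac{d^2}{dy^2} - 2\ell\, \frac{\wp''(y)}{\wp'(y)} \frac{d}{dy} + 4\ell(2\ell - 1)\, \wp(y) + 4\ell(\ell + 1)\, \wp(u) \right), \tag{5.11}
\]
or
\[
\begin{aligned}
\hat\tau\Bigl( \eta, \frac{d}{d\eta}, \frac{d^2}{d\eta^2}; \lambda \Bigr) = {}& (\eta - e_1)(\eta - e_2)(\eta - e_3) \\
& \times \left( \frac{d^2}{d\eta^2} + \frac{1 - 2\ell}{2} \left( \frac{1}{\eta - e_1} + \frac{1}{\eta - e_2} + \frac{1}{\eta - e_3} \right) \frac{d}{d\eta} + \frac{\ell(2\ell - 1)\eta + \ell(\ell + 1)\lambda}{(\eta - e_1)(\eta - e_2)(\eta - e_3)} \right),
\end{aligned} \tag{5.12}
\]
where $\lambda = \wp(u)$. As is expected from Proposition 4.2 and the result for the rational Gaudin model in [6], the operator $\hat\tau(u)$ is factorized as follows when the spectral parameter $u$ is fixed to a separated variable $x_1 = 2y$ (we may also take $x_1 = -2y$):
\[
\hat\tau\Bigl( y, \frac{d}{dy}, \frac{d^2}{dy^2}; u \Bigr) \Big|_{u = 2y} = \Bigl( -\tilde A(u; \tilde u) \big|_{u = 2y,\ \tilde u = 2y + 1} \Bigr)^2 = X^2, \tag{5.13}
\]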
Elliptic Gaudin Model
which immediately follows from (5.7). (The operator $X$ is defined before (5.9).) This is consistent with the general result (4.22).

Equations (5.11) or (5.12) show that the spectral problem of the elliptic Gaudin model with $N = 1$ is an ordinary differential equation of second order on the elliptic curve $\mathbb{C}/(\mathbb{Z}+\tau\mathbb{Z})$:
\[ \hat\tau\Bigl(y,\frac{d}{dy},\frac{d^2}{dy^2};u\Bigr)\psi(y) = t(u)\psi(y), \tag{5.14} \]
or on the projective line $\mathbb{P}^1(\mathbb{C})$:
\[ \hat\tau\Bigl(\eta,\frac{d}{d\eta},\frac{d^2}{d\eta^2};\lambda\Bigr)\psi(\eta) = t(\lambda)\psi(\eta). \tag{5.15} \]
Here $t(u)$, $t(\lambda)$ are eigenvalues of $\hat\tau$, and $\psi \in V^{(\ell)}$ is an eigenvector corresponding to this eigenvalue. Since the operators $\hat\tau(u)$ and $U_\alpha$ commute with each other by virtue of (B.12), (B.13) and (5.10), we can decompose each eigenspace of $\hat\tau(u)$ into those of $U_\alpha$.

Equation (5.14) has regular singularities: $y = 0$ (mod $\mathbb{Z}+\tau\mathbb{Z}$) with exponents $-4\ell$, $-2\ell+1$, and $y = \omega_\alpha/2$ ($\alpha = 1, 2, 3$; $\omega_1 = 1$, $\omega_2 = \tau$, $\omega_3 = 1+\tau$) with exponents $0$, $2\ell+1$. Equation (5.15) has regular singularities: $\eta = e_\alpha$ with exponents $0$, $(2\ell+1)/2$, and $\eta = \infty$ with exponents $\frac12-\ell$, $-2\ell$. If $\ell$ is an integer, these equations are ordinary Lamé equations, while for $\ell \in \frac12+\mathbb{Z}$ they are generalized Lamé equations studied by Brioschi, Halphen and Crawford. Following the classical theory of Lamé functions (see [13, Chap. XXIII]), we can solve the spectral problem (5.14), (5.15) in $V^{(\ell)}$ as follows.

5.2.1. Case $\ell \in \mathbb{Z}$. We want a solution $\psi(\eta)$ of (5.15) such that $\psi(\eta) \in V^{(\ell)}$. Let us assume that $\psi(\eta)$ is expanded around the singular point $e_\alpha$ as
\[ \psi(\eta) = \sum_{r=0}^{\infty} a_r^\alpha\,(\eta-e_\alpha)^{2\ell-r}, \tag{5.16} \]
with $a_0^\alpha = 1$. The condition $\psi(\eta) \in V^{(\ell)}$ means that $a_r^\alpha = 0$ for $r > 2\ell$. Substituting (5.16) into (5.15), we obtain the following recursion relation:
\[ r\Bigl(\ell+\tfrac12-r\Bigr)a_r^\alpha = \Bigl[\bigl(\ell(2\ell-1) - 3(r-1)(2\ell-r+1)\bigr)e_\alpha + E\Bigr]a_{r-1}^\alpha + (2\ell-r+2)\Bigl(\ell-r+\tfrac32\Bigr)(e_\alpha-e_\beta)(e_\alpha-e_\gamma)\,a_{r-2}^\alpha \tag{5.17} \]
for $r > 0$, where $E = \ell(\ell+1)\lambda - t(\lambda)$. (Undefined coefficients $a_r^\alpha$ for $r < 0$ are $0$.) Hence, as a function of $E$, $a_r^\alpha = a_r^\alpha(E)$ is a polynomial of degree $r$ of the form
\[ a_r^\alpha(E) = A_r E^r + O(E^{r-1}), \qquad A_r = \frac{1}{r!}\prod_{j=1}^{r}\Bigl(\ell-r-\tfrac12+j\Bigr)^{-1}. \tag{5.18} \]
Let us denote the roots of $a_{2\ell+1}^\alpha(E) = 0$ by $E_i^\alpha$ ($i = 1, \dots, 2\ell+1$). The recursion relation (5.17) implies $a_r^\alpha(E_i^\alpha) = 0$ for $r \ge 2\ell+1$. Hence we obtain a polynomial solution $\psi(\eta) = \psi(\eta; E_i^\alpha)$ of (5.15) of the form (5.16) for each $i = 1, \dots, 2\ell+1$, provided that
\[ t(\lambda) = \ell(\ell+1)\lambda - E_i^\alpha. \tag{5.19} \]
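The recursion (5.17) is easy to run exactly. The following is a minimal sketch, not the authors' code: it stores each $a_r^\alpha(E)$ as a list of rational coefficients (lowest degree first) and iterates (5.17) for integer $\ell$. The function names `lame_coefficients` and `leading_coefficient` and the sample values of $e_1, e_2, e_3$ are assumptions chosen for illustration; the only constraint used is $e_1+e_2+e_3=0$.

```python
from fractions import Fraction


def poly_add(p, q):
    """Add two polynomials in E (coefficient lists, lowest degree first)."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]


def poly_scale(p, c):
    """Multiply a polynomial by the scalar c."""
    return [c * x for x in p]


def lame_coefficients(ell, e, alpha):
    """Return [a_0, ..., a_{2*ell+1}] obtained from the recursion (5.17).

    ell   : non-negative integer spin
    e     : (e1, e2, e3) with e1 + e2 + e3 = 0
    alpha : 0, 1 or 2, selecting the expansion point e[alpha]
    """
    ea = Fraction(e[alpha])
    eb, ec = [Fraction(e[i]) for i in range(3) if i != alpha]
    K = (ea - eb) * (ea - ec)
    half = Fraction(1, 2)
    a = [[Fraction(1)]]  # a_0 = 1
    for r in range(1, 2 * ell + 2):
        d = (ell * (2 * ell - 1) - 3 * (r - 1) * (2 * ell - r + 1)) * ea
        # (d + E) * a_{r-1}: scale by d, then add the copy shifted by one power of E
        rhs = poly_add(poly_scale(a[r - 1], d), [Fraction(0)] + a[r - 1])
        if r >= 2:
            k = (2 * ell - r + 2) * (ell - r + 3 * half) * K
            rhs = poly_add(rhs, poly_scale(a[r - 2], k))
        lhs = r * (ell + half - r)  # never zero when ell is an integer
        a.append(poly_scale(rhs, 1 / lhs))
    return a


def leading_coefficient(ell, r):
    """A_r from (5.18), written as prod_j 1 / (j * (ell + 1/2 - j))."""
    result = Fraction(1)
    for j in range(1, r + 1):
        result /= j * (ell + Fraction(1, 2) - j)
    return result


# ell = 1 with (e1, e2, e3) = (1, 0, -1): a_3(E) is the cubic (4/9)(E^3 - E),
# that is (4/9)(E + e1)(E + e2)(E + e3), so the roots are E = -e1, -e2, -e3.
print(lame_coefficients(1, (1, 0, -1), 0)[3])
```

For $\ell = 1$ the three choices of $\alpha$ visibly give the same cubic, which is the simplest instance of the coincidence of the polynomials $a^\alpha_{2\ell+1}(E)$ for different $\alpha$.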
Conversely, if $\psi(\eta) \in V^{(\ell)}$ is a solution of the spectral problem (5.15), then for each $\alpha = 1, 2, 3$ there exists an $i$ such that $\psi(\eta) = \psi(\eta; E_i^\alpha)$. This is proved by expanding the polynomial $\psi(\eta)$ as in (5.16) and tracing back the above argument.

Proposition 5.1. Assume that $\omega_2 = \tau$ is pure imaginary and that the parameters $z_n$ are all real numbers. Then all $E_i^\alpha$ are real and the spectral problem (5.15) is non-degenerate. Namely, $E_i^\alpha \neq E_j^\alpha$ for distinct $i$, $j$, and the solutions $\psi(\eta; E_i^\alpha)$ span the space $V^{(\ell)}$. In particular, $E_i^\alpha$ ($i = 1, \dots, 2\ell+1$) for $\alpha = 1, 2, 3$ coincide up to order, and $a_{2\ell+1}^1(E) = a_{2\ell+1}^2(E) = a_{2\ell+1}^3(E)$. Hence we can omit the index $\alpha$ for $E_i^\alpha$ and $a_{2\ell+1}^\alpha(E)$. The vector $\psi(\eta; E_i)$ is an eigenvector of $U_\alpha$ with eigenvalue $(-1)^\ell$ if $a_\ell^\alpha(E_i) \neq 0$ and $(-1)^{\ell+1}$ if $a_\ell^\alpha(E_i) = 0$.

Proof. Under the assumption $\tau \in i\mathbb{R}$, the operator $\hat\tau(u)$ ($u \in \mathbb{R}$) is a hermitian operator because of (B.11), and hence it is obvious that the $E_i^\alpha$ are real and that the $\psi(\eta; E_i^\alpha)$ span $V^{(\ell)}$. In order to show non-degeneracy of the spectral problem (5.15) we have only to prove that the $E_i^2$ are distinct from each other. Define
\[ \tilde a_r^2(E) := \begin{cases} a_r^2(E), & r < \ell+1, \\ (-1)^{r-\ell}\,a_r^2(E), & \ell+1 \le r \le 2\ell+1. \end{cases} \]
Then the leading coefficient of $\tilde a_r^2$ is given by
\[ \tilde a_r^2(E) = \tilde A_r E^r + O(E^{r-1}), \qquad \tilde A_r = |A_r|. \tag{5.20} \]
The recursion relation (5.17) is rewritten as
\[ c_r\,\tilde a_r^2(E) = q_r\,\tilde a_{r-1}^2(E) - k_r\,\tilde a_{r-2}^2(E), \tag{5.21} \]
where
\[ c_r = r\,\Bigl|\ell+\tfrac12-r\Bigr|, \qquad q_r = \bigl(\ell(2\ell-1) - 3(r-1)(2\ell-r+1)\bigr)e_2 + E, \qquad k_r = \Bigl|\ell-r+\tfrac32\Bigr|\,(2\ell-r+2)(e_1-e_2)(e_2-e_3). \tag{5.22} \]
Since $e_1 > e_2 > e_3$ under the assumption of the proposition, we have $c_r > 0$ and $k_r > 0$. This fact together with $\tilde A_r > 0$ (see (5.20)) implies that all the roots of $\tilde a_r^2(E)$ are real and distinct by Sturm's theorem (see, e.g., Chap. IX, §§4–5, [16]). This proves the first statement of the proposition.

The operators $U_\alpha$ and $\hat\tau$ commute and each eigenspace of $\hat\tau$ is one-dimensional. Hence $\psi(\eta; E_i)$ is an eigenvector of $U_\alpha$. Recall that $U_\alpha$ has eigenvalues $(-1)^\ell$ with multiplicity $\ell+1$ and $(-1)^{\ell+1}$ with multiplicity $\ell$. (See §B.2.) If $a_\ell^\alpha(E_i) \neq 0$, then $U_\alpha\psi(\eta; E_i) = (-1)^\ell\psi(\eta; E_i)$ because of (B.15). Hence there are at most $\ell+1$ of the $E_i$ such that $a_\ell^\alpha(E_i) \neq 0$. In other words, at least $\ell$ of the $E_i$ satisfy $a_\ell^\alpha(E_i) = 0$. Since $a_\ell^\alpha(E)$ is a polynomial of degree $\ell$, this proves the second statement of the proposition. □
5.2.2. Case $\ell \in \frac12+\mathbb{Z}$. As in the case $\ell \in \mathbb{Z}$, we consider an expansion (5.16) of a solution $\psi(\eta)$ of the spectral problem (5.15), but this time we consider the series which terminate at $r = \ell-\frac12$:
\[ \psi(\eta) = \sum_{r=0}^{\ell-1/2} a_r^\alpha\,(\eta-e_\alpha)^{2\ell-r}. \tag{5.23} \]
They are parametrized by the zeros $\{E_i^\alpha\}_{i=1,\dots,\ell+\frac12}$ of the polynomial $a^\alpha_{\ell+\frac12}(E)$, as in the previous case: $\psi(\eta) = \psi(\eta; E_i^\alpha)$. Another set of solutions is obtained from this set by applying the operator $U_\alpha$:
\[ U_\alpha\psi(\eta; E_i^\alpha) = \sum_{r=0}^{\ell-1/2} a'^{\,\alpha}_r(E_i^\alpha)\,(\eta-e_\alpha)^{r}, \tag{5.24} \]
since $U_\alpha$ and $\hat\tau(u)$ commute. The following proposition is proved in the same manner as Proposition 5.1.

Proposition 5.2. Assume that $\omega_2 = \tau$ is pure imaginary and that the parameters $z_n$ are all real numbers. Then all $E_i^\alpha$ are real and $E_i^\alpha \neq E_j^\alpha$ for distinct $i$, $j$. The solutions $\psi(\eta; E_i^\alpha)$ and $U_\alpha\psi(\eta; E_i^\alpha)$ span the space $V^{(\ell)}$. In particular, $E_i^\alpha$ ($i = 1, \dots, \ell+\frac12$) for $\alpha = 1, 2, 3$ coincide up to order, and $a^1_{\ell+\frac12}(E) = a^2_{\ell+\frac12}(E) = a^3_{\ell+\frac12}(E)$. Hence we can omit the index $\alpha$ for $E_i^\alpha$ and $a^\alpha_{\ell+\frac12}(E)$. The vectors $\psi(\eta; E_i) \pm i\,U_\alpha\psi(\eta; E_i)$ are eigenvectors of $U_\alpha$ with eigenvalues $\mp i$ (recall that $U_\alpha^2 = (-1)^{2\ell} = -1$ in this case).

This proposition means that each eigenvalue $E_i$ degenerates with multiplicity two. It was Crawford [17] who first found the relation between these two solutions (one is obtained from the other by operating with $U_2$) by the explicit expansions of type (5.23), (5.24). See also p. 578 of [13].

A. Notations

We use the notation for the theta functions with characteristics as follows (see [14]): for $a, b = 0, 1$,
\[ \theta_{ab}(u;\tau) = \sum_{n\in\mathbb{Z}} e^{\pi i (n+a/2)^2\tau + 2\pi i (n+a/2)(u+b/2)}. \tag{A.1} \]
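Unless otherwise specified, $\theta_{ab}(u) = \theta_{ab}(u;\tau)$. We also use the abbreviations
\[ \theta_{ab} = \theta_{ab}(0), \qquad \theta'_{ab} = \frac{d}{du}\theta_{ab}(u)\Big|_{u=0}. \tag{A.2} \]
Quasi-periodicity properties of theta functions (the factor $(-1)^b$ follows directly from the series (A.1)):
\[ \theta_{ab}(u) = (-1)^a\,\theta_{ab}(u+1) = (-1)^b\,e^{\pi i\tau + 2\pi i u}\,\theta_{ab}(u+\tau). \tag{A.3} \]
Parity of thetas: $\theta_{00}(-u) = \theta_{00}(u)$, $\theta_{01}(-u) = \theta_{01}(u)$, $\theta_{10}(-u) = \theta_{10}(u)$, $\theta_{11}(-u) = -\theta_{11}(u)$.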
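The quasi-periodicity and parity rules can be checked numerically from the series (A.1). The sketch below is only an illustration: the function names `theta` and `max_residual`, the truncation at 30 terms, and the sample point are assumptions. The shift by $\tau$ is tested with the sign $(-1)^b$ that the series (A.1) produces, and parity is tested in the uniform form $\theta_{ab}(-u) = (-1)^{ab}\theta_{ab}(u)$.

```python
import cmath


def theta(a, b, u, tau, terms=30):
    """Truncated series (A.1) for theta_{ab}(u; tau); needs Im(tau) > 0."""
    total = 0j
    for n in range(-terms, terms + 1):
        m = n + a / 2
        total += cmath.exp(1j * cmath.pi * (m * m * tau + 2 * m * (u + b / 2)))
    return total


def max_residual(u, tau):
    """Largest relative violation of the shift and parity rules over a, b in {0, 1}."""
    worst = 0.0
    for a in (0, 1):
        for b in (0, 1):
            t = theta(a, b, u, tau)
            candidates = [
                (-1) ** a * theta(a, b, u + 1, tau),                       # shift by 1
                (-1) ** b * cmath.exp(1j * cmath.pi * (tau + 2 * u))
                * theta(a, b, u + tau, tau),                                # shift by tau
                (-1) ** (a * b) * theta(a, b, -u, tau),                     # parity
            ]
            for value in candidates:
                worst = max(worst, abs(value - t) / (1 + abs(t)))
    return worst


# Sample point chosen arbitrarily; the residual is at rounding-error level.
print(max_residual(0.37 + 0.21j, 0.3 + 1.2j))
```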
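A.1. Weierstrass functions. Below we fix $\omega_1 = 1$ and $\omega_2 = \tau$, and set
\[ \sigma(u) = u\prod_{(m,n)\neq(0,0)}\Bigl(1-\frac{u}{\omega_{mn}}\Bigr)\exp\left[\frac{u}{\omega_{mn}} + \frac12\Bigl(\frac{u}{\omega_{mn}}\Bigr)^2\right], \tag{A.4} \]
where $\omega_{mn} = m\omega_1 + n\omega_2$, and
\[ \zeta(u) = \frac{\sigma'(u)}{\sigma(u)}, \qquad \wp(u) = -\zeta'(u). \]
They satisfy
\[ \sigma(u+\omega_l) = -\sigma(u)\,e^{\eta_l(2u+\omega_l)}, \qquad \zeta(u+\omega_l) = \zeta(u) + 2\eta_l, \qquad \wp(u+\omega_l) = \wp(u), \]
where $\eta_l = \zeta(\omega_l/2)$, which satisfy $\eta_1\omega_2 - \eta_2\omega_1 = \pi i$. The sigma function is expressed by theta functions as follows:
\[ \sigma(u) = \omega_1\,e^{\eta_1 u^2/\omega_1}\,\frac{\theta_{11}(u/\omega_1)}{\theta'_{11}}. \]
Parity: $\sigma(-u) = -\sigma(u)$, $\zeta(-u) = -\zeta(u)$, $\wp(-u) = \wp(u)$. Behaviour at $u \sim 0$: $\sigma(u) = u + O(u^5)$, $\zeta(u) = u^{-1} + O(u^3)$, $\wp(u) = u^{-2} + O(u^2)$.

Other sigma functions are defined as follows:
\[ \sigma_{00}(u) = e^{-(\eta_1+\eta_2)u}\,\frac{\sigma\bigl(u+\frac{\omega_1+\omega_2}{2}\bigr)}{\sigma\bigl(\frac{\omega_1+\omega_2}{2}\bigr)} = e^{\eta_1 u^2/\omega_1}\,\frac{\theta_{00}(u/\omega_1)}{\theta_{00}(0)}, \]
\[ \sigma_{10}(u) = e^{-\eta_1 u}\,\frac{\sigma\bigl(u+\frac{\omega_1}{2}\bigr)}{\sigma\bigl(\frac{\omega_1}{2}\bigr)} = e^{\eta_1 u^2/\omega_1}\,\frac{\theta_{10}(u/\omega_1)}{\theta_{10}(0)}, \]
\[ \sigma_{01}(u) = e^{-\eta_2 u}\,\frac{\sigma\bigl(u+\frac{\omega_2}{2}\bigr)}{\sigma\bigl(\frac{\omega_2}{2}\bigr)} = e^{\eta_1 u^2/\omega_1}\,\frac{\theta_{01}(u/\omega_1)}{\theta_{01}(0)}, \]
which satisfy
\[ \sigma_{g_1g_2}(u+\omega_l) = (-1)^{g_l}\,e^{\eta_l(2u+\omega_l)}\,\sigma_{g_1g_2}(u), \qquad \sigma_{g_1g_2}(-u) = \sigma_{g_1g_2}(u), \qquad \sigma_{g_1g_2}(0) = 1. \]
Defining $e_1 = \wp(\omega_1/2)$, $e_2 = \wp((\omega_1+\omega_2)/2)$, $e_3 = \wp(\omega_2/2)$, we have
\[ \frac{\sigma_{10}^2(u)}{\sigma^2(u)} + e_1 = \frac{\sigma_{00}^2(u)}{\sigma^2(u)} + e_2 = \frac{\sigma_{01}^2(u)}{\sigma^2(u)} + e_3 = \wp(u), \qquad e_1+e_2+e_3 = 0, \]
\[ e_1-e_2 = \Bigl(\frac{\pi}{\omega_1}\Bigr)^2\theta_{01}(0)^4, \qquad e_1-e_3 = \Bigl(\frac{\pi}{\omega_1}\Bigr)^2\theta_{00}(0)^4, \qquad e_2-e_3 = \Bigl(\frac{\pi}{\omega_1}\Bigr)^2\theta_{10}(0)^4. \]
We also use normalized Weierstraß functions:
\[ \zeta_{11}(u) = \frac{d}{du}\log\theta_{11}(u), \qquad \wp_{11}(u) = -\frac{d}{du}\zeta_{11}(u). \tag{A.5} \]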
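The three theta-constant formulas for $e_1-e_2$, $e_1-e_3$, $e_2-e_3$ are compatible only if $\theta_{00}(0)^4 = \theta_{01}(0)^4 + \theta_{10}(0)^4$ (Jacobi's identity). The following sketch, with $\omega_1 = 1$ as in the text, evaluates the theta constants from the series (A.1), recovers $e_1, e_2, e_3$ using $e_1+e_2+e_3 = 0$, and reports the residual of that identity. The names `theta_constant` and `half_period_values` are hypothetical helpers introduced only for this illustration.

```python
import cmath
import math


def theta_constant(a, b, tau, terms=30):
    """theta_{ab}(0; tau) from the series (A.1); needs Im(tau) > 0."""
    return sum(
        cmath.exp(1j * cmath.pi * ((n + a / 2) ** 2 * tau + (n + a / 2) * b))
        for n in range(-terms, terms + 1)
    )


def half_period_values(tau):
    """Return ((e1, e2, e3), residual) for omega_1 = 1, omega_2 = tau."""
    scale = math.pi ** 2
    d12 = scale * theta_constant(0, 1, tau) ** 4  # e1 - e2
    d13 = scale * theta_constant(0, 0, tau) ** 4  # e1 - e3
    d23 = scale * theta_constant(1, 0, tau) ** 4  # e2 - e3
    e1 = (d12 + d13) / 3          # uses e1 + e2 + e3 = 0
    e3 = -(d13 + d23) / 3
    e2 = -(e1 + e3)
    residual = abs(d13 - d12 - d23)  # vanishes iff Jacobi's identity holds
    return (e1, e2, e3), residual


(e1, e2, e3), residual = half_period_values(2j)
print(e1.real, e2.real, e3.real, residual)  # for pure imaginary tau: e1 > e2 > e3
```

For the square lattice $\tau = i$ the same routine gives $e_2 = 0$ and $e_1 = -e_3 = \Gamma(1/4)^4/(8\pi)$, the classical lemniscatic value, which makes a convenient independent check.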
B. Realization of Spin $\ell$ Representations on an Elliptic Curve

We recall here the following realization of the spin $\ell$ representation of the Lie algebra $sl_2(\mathbb{C})$. Let $e$, $f$, $h$ be the Chevalley generators and define $S^1 = e+f$, $S^2 = -ie+if$ and $S^3 = h$. They satisfy the relation $[S^a, S^b] = 2iS^c$ for any cyclic permutation $(a, b, c)$ of $(1, 2, 3)$ and are represented by the Pauli matrices $\sigma^a$.

B.1. Spin $\ell$ representations. The representation space $V^{(\ell)}$ is realized by
\[ V^{(\ell)} = \bigoplus_{k=0}^{2\ell}\mathbb{C}\,\wp(y)^k = \{\text{even elliptic functions } f(y) \mid \operatorname{div}(f) \ge -4\ell(\mathbb{Z}+\tau\mathbb{Z})\}. \tag{B.1} \]
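The generators $S^a$ act on this space as differential operators of first order:
\[ \rho^{(\ell)}(S^1) = \frac{\theta_{10}\theta_{10}(2y)}{\theta'_{11}\theta_{11}(2y)}\frac{d}{dy} + 2\ell\,\frac{\theta_{10}(y)^2}{\theta_{11}(y)^2}, \]
\[ \frac{1}{i}\,\rho^{(\ell)}(S^2) = \frac{\theta_{00}\theta_{00}(2y)}{\theta'_{11}\theta_{11}(2y)}\frac{d}{dy} + 2\ell\,\frac{\theta_{00}(y)^2}{\theta_{11}(y)^2}, \tag{B.2} \]
\[ \rho^{(\ell)}(S^3) = \frac{\theta_{01}\theta_{01}(2y)}{\theta'_{11}\theta_{11}(2y)}\frac{d}{dy} + 2\ell\,\frac{\theta_{01}(y)^2}{\theta_{11}(y)^2}, \]
or in terms of the usual Weierstraß functions,
\[ \rho^{(\ell)}(S^1) = a_1\left(\frac{\sigma_{10}(2y)}{\sigma(2y)}\frac{d}{dy} + 2\ell(\wp(y)-e_1)\right), \]
\[ \rho^{(\ell)}(S^2) = a_2\left(\frac{\sigma_{00}(2y)}{\sigma(2y)}\frac{d}{dy} + 2\ell(\wp(y)-e_2)\right), \tag{B.3} \]
\[ \rho^{(\ell)}(S^3) = a_3\left(\frac{\sigma_{01}(2y)}{\sigma(2y)}\frac{d}{dy} + 2\ell(\wp(y)-e_3)\right), \]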
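where $e_a = \wp(\omega_{\bar a}/2)$ ($\bar a = 1, 3, 2$; $\omega_1 = 1$, $\omega_2 = \tau$, $\omega_3 = 1+\tau$) for $a = 1, 2, 3$ respectively, and
\[ a_1 = \frac{1}{\sqrt{e_1-e_2}\sqrt{e_1-e_3}}, \qquad a_2 = \frac{i}{\sqrt{e_1-e_2}\sqrt{e_2-e_3}}, \qquad a_3 = \frac{1}{\sqrt{e_2-e_3}\sqrt{e_1-e_3}}. \tag{B.4} \]
This realization is equivalent to the realization on the space of polynomials of degree $\le 2\ell$ (or, sections of a line bundle on $\mathbb{P}^1(\mathbb{C})$),
\[ e = x^2\frac{d}{dx} - 2\ell x, \qquad f = -\frac{d}{dx}, \qquad h = 2x\frac{d}{dx} - 2\ell, \]
(the constant term $-2\ell$ in $h$ is the one required by $[e, f] = h$)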
via a coordinate transformation, x = −θ01 (y; τ/2)/θ00 (y; τ/2), and a gauge transformation: θ00 (y; τ/2) n ϕ(x(y)) ∈ V (`) . {polynomials in x} 3 ϕ(x) 7 → θ11 (y; τ )2
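The polynomial realization can be checked mechanically. The sketch below is an illustration under stated assumptions: it takes $e = x^2\,d/dx - 2\ell x$, $f = -d/dx$ and $h = 2x\,d/dx - 2\ell$ (the constant term of $h$ is the one forced by $[e,f]=h$), represents polynomials as coefficient lists, and verifies $[e,f]=h$, $[h,e]=2e$, $[h,f]=-2f$ on monomials, as well as $e\,x^{2\ell}=0$, which is what keeps the space of polynomials of degree $\le 2\ell$ invariant. All helper names are hypothetical.

```python
from fractions import Fraction


def trim(p):
    """Drop trailing zero coefficients (the zero polynomial becomes [0])."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)])


def scale(c, p):
    return trim([c * x for x in p])


def deriv(p):
    return trim([k * p[k] for k in range(1, len(p))] or [0])


def times_x(p, power=1):
    return trim([0] * power + list(p))


def make_generators(ell):
    """Return (e, f, h) acting on coefficient lists, for spin ell."""
    two_ell = 2 * Fraction(ell)
    e = lambda p: add(times_x(deriv(p), 2), scale(-two_ell, times_x(p)))
    f = lambda p: scale(-1, deriv(p))
    h = lambda p: add(scale(2, times_x(deriv(p))), scale(-two_ell, p))
    return e, f, h


def commutator(A, B, p):
    return add(A(B(p)), scale(-1, B(A(p))))


def relations_hold(ell, max_degree=8):
    """Check the sl2 relations on the monomials 1, x, ..., x^max_degree."""
    e, f, h = make_generators(ell)
    for k in range(max_degree + 1):
        p = [Fraction(0)] * k + [Fraction(1)]
        if commutator(e, f, p) != h(p):
            return False
        if commutator(h, e, p) != scale(2, e(p)):
            return False
        if commutator(h, f, p) != scale(-2, f(p)):
            return False
    return True


print(relations_hold(Fraction(3, 2)))  # spin 3/2
```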
Note that this is also obtained by a gauge transformation from a quasi-classical limit of the representation of the Sklyanin algebra on theta functions [18]. The following expression is obtained from the coordinate transformation $\eta = \wp(y)$:
\[ V^{(\ell)} = \bigoplus_{k=0}^{2\ell}\mathbb{C}\,\eta^k, \tag{B.5} \]
and $S^\alpha$ acts on $V^{(\ell)}$ as
\[ \rho^{(\ell)}(S^\alpha) = a_\alpha\left(\bigl((e_\alpha-e_\beta)(e_\alpha-e_\gamma) - (\eta-e_\alpha)^2\bigr)\frac{d}{d\eta} + 2\ell(\eta-e_\alpha)\right). \tag{B.6} \]
Let us assume that $\tau$ is a pure imaginary number. Then, as is well known (see, e.g., [13]), the $e_a$ are real numbers and $e_1 > e_2 > e_3$. This implies that $a_1$ and $a_3$ are real, while $a_2$ is purely imaginary.

We introduce the following hermitian form in this representation space: for elliptic functions $f(y)$, $g(y)$ belonging to $V^{(\ell)}$ defined by (B.1), we define
\[ \langle f, g\rangle := \int_C f(\bar y_2)\,g(y_1)\,\mu(y_1, y_2), \tag{B.7} \]
where the 2-cycle $C$ is defined by $C := \{(y_1, y_2) \in (\mathbb{C}/\Gamma)^2,\ y_2 = \bar y_1\}$, and the 2-form $\mu(y_1, y_2)$ is defined by
\[ \mu(y_1, y_2) := (e_1-e_2)^{2(\ell+1)}(e_2-e_3)^{2(\ell+1)}\,\frac{\sigma(2y_2)\sigma(y_2)^{4\ell}\,\sigma(2y_1)\sigma(y_1)^{4\ell}}{\sigma_{00}(y_2-y_1)^{2(\ell+1)}\,\sigma_{00}(y_2+y_1)^{2(\ell+1)}}\,\frac{dy_2\wedge dy_1}{4i} \]
\[ \phantom{\mu(y_1, y_2) :}= \left(1 + \frac{(\wp(y_2)-e_2)(\wp(y_1)-e_2)}{(e_1-e_2)(e_2-e_3)}\right)^{-2(\ell+1)}\wp'(y_2)\,\wp'(y_1)\,\frac{dy_2\wedge dy_1}{4i}. \tag{B.8} \]
This is nothing but a twisted version of the inner product introduced in [18]. If we take the description of $V^{(\ell)}$ of the form (B.5), this hermitian form is expressed as follows:
\[ \langle f, g\rangle := \int_{\mathbb{C}} f(\bar\eta)\,g(\eta)\,\mu(\eta, \bar\eta), \tag{B.9} \]
where the 2-form $\mu(\eta, \bar\eta)$ is defined by
\[ \mu(\eta, \bar\eta) := \left(1 + \frac{(\bar\eta-e_2)(\eta-e_2)}{(e_1-e_2)(e_2-e_3)}\right)^{-2(\ell+1)}\frac{d\bar\eta\wedge d\eta}{2i}. \]
An orthogonal basis with respect to this inner product is given by $\{(\eta-e_2)^j\}_{j=0,\dots,2\ell}$:
\[ \langle(\eta-e_2)^j, (\eta-e_2)^k\rangle = 2\pi\,\frac{(2j)!!\,(4\ell-2j)!!}{(4\ell+2)!!}\,(e_1-e_2)^{j+1}(e_2-e_3)^{j+1}\,\delta_{jk}. \tag{B.10} \]
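The constant in (B.10) can be cross-checked against a direct evaluation. Writing $w = \eta - e_2$ and $K = (e_1-e_2)(e_2-e_3)$, polar coordinates reduce the diagonal entry to $\pi K^{j+1}\int_0^\infty t^j(1+t)^{-2\ell-2}\,dt = \pi K^{j+1}\,j!\,(2\ell-j)!/(2\ell+1)!$, so (B.10) amounts to the identity $2\,(2j)!!\,(4\ell-2j)!!/(4\ell+2)!! = j!\,(2\ell-j)!/(2\ell+1)!$. The sketch below verifies this exactly and also approximates the radial integral by a midpoint rule after the substitution $t = s/(1-s)$; the function names and the grid size are assumptions made for the illustration.

```python
from fractions import Fraction
from math import factorial


def double_factorial(n):
    """n!! for even n >= 0, i.e. n (n - 2) ... 2 (and 0!! = 1)."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def b10_coefficient(two_ell, j):
    """2 (2j)!! (4l - 2j)!! / (4l + 2)!!, the factor multiplying pi K^{j+1} in (B.10)."""
    return Fraction(
        2 * double_factorial(2 * j) * double_factorial(2 * two_ell - 2 * j),
        double_factorial(2 * two_ell + 2),
    )


def beta_coefficient(two_ell, j):
    """j! (2l - j)! / (2l + 1)!, the value of the radial integral."""
    return Fraction(factorial(j) * factorial(two_ell - j), factorial(two_ell + 1))


def radial_integral(two_ell, j, steps=20000):
    """Midpoint rule for int_0^1 s^j (1 - s)^(2l - j) ds (after t = s / (1 - s))."""
    h = 1.0 / steps
    return h * sum(((i + 0.5) * h) ** j * (1 - (i + 0.5) * h) ** (two_ell - j) for i in range(steps))


two_ell = 3  # spin 3/2
for j in range(two_ell + 1):
    print(j, b10_coefficient(two_ell, j), beta_coefficient(two_ell, j))
```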
The generators $S^a$ of the Lie algebra $sl_2$ act on the space $V^{(\ell)}$ as self-adjoint operators:
\[ \langle\rho^{(\ell)}(S^a)f, g\rangle = \langle f, \rho^{(\ell)}(S^a)g\rangle. \tag{B.11} \]
This was first proved in [18], but we can check it directly by using formula (B.10). Hence, if $u$ and $z_n$ are real numbers, the operator $\hat\tau(u)$ defined by (2.9) and the integrals of motion $H_n$ defined by (2.16) are hermitian operators on the Hilbert space $V$ with respect to $\langle\cdot,\cdot\rangle$.
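B.2. Involutions. There are involutive automorphisms of the Lie algebra $sl_2$ defined by
\[ X_a(S^b) = (-1)^{1-\delta_{ab}}\,S^b. \tag{B.12} \]
These automorphisms are induced on the spin $\ell$ representations as
\[ X_a(S^b) = U_a^{-1}S^bU_a, \tag{B.13} \]
where the operators $U_a : V^{(\ell)} \to V^{(\ell)}$ are defined by
\[ (U_1f)(y) = e^{\pi i\ell}\left(\frac{\wp(y)-e_1}{\sqrt{e_1-e_2}\sqrt{e_1-e_3}}\right)^{2\ell}f\Bigl(y+\frac{\omega_1}{2}\Bigr), \]
\[ (U_2f)(y) = e^{2\pi i\ell}\left(\frac{\wp(y)-e_2}{\sqrt{e_1-e_2}\sqrt{e_2-e_3}}\right)^{2\ell}f\Bigl(y+\frac{\omega_1+\omega_2}{2}\Bigr), \tag{B.14} \]
\[ (U_3f)(y) = e^{-\pi i\ell}\left(\frac{\wp(y)-e_3}{\sqrt{e_1-e_3}\sqrt{e_2-e_3}}\right)^{2\ell}f\Bigl(y+\frac{\omega_2}{2}\Bigr), \]
for an elliptic function $f(y) \in V^{(\ell)}$ (cf. [18]). They satisfy the commutation relations
\[ U_\alpha^2 = (-1)^{2\ell}, \qquad U_\alpha U_\beta = (-1)^{2\ell}\,U_\beta U_\alpha = U_\gamma \]
for any cyclic permutation $(\alpha, \beta, \gamma)$ of $(1, 2, 3)$. The action of these operators on the bases $\{(\eta-e_\alpha)^j\}_{j=0,\dots,2\ell}$ is:
\[ U_1(\eta-e_1)^j = e^{\pi i\ell}\,(e_1-e_2)^{j-\ell}(e_1-e_3)^{j-\ell}\,(\eta-e_1)^{2\ell-j}, \]
\[ U_2(\eta-e_2)^j = e^{\pi i(2\ell-j)}\,(e_1-e_2)^{j-\ell}(e_2-e_3)^{j-\ell}\,(\eta-e_2)^{2\ell-j}, \tag{B.15} \]
\[ U_3(\eta-e_3)^j = e^{-\pi i\ell}\,(e_1-e_3)^{j-\ell}(e_2-e_3)^{j-\ell}\,(\eta-e_3)^{2\ell-j}. \]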
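The spectrum of $U_1$ can be read off from (B.15): in the basis $(\eta-e_1)^j$ the operator is anti-diagonal, $U_1(\eta-e_1)^j = c_j(\eta-e_1)^{2\ell-j}$, so each pair $\{j, 2\ell-j\}$ with $j \neq \ell$ contributes the eigenvalues $\pm\sqrt{c_jc_{2\ell-j}}$ and, for integer $\ell$, the middle vector $j = \ell$ contributes $c_\ell$. The following sketch carries this out numerically. The function name `involution_spectrum` and the sample value $K = (e_1-e_2)(e_1-e_3) = 2$ are assumptions; $U_2$ and $U_3$ differ only by phases that give the same products $c_jc_{2\ell-j}$ and the same $c_\ell$.

```python
import cmath
from collections import Counter


def involution_spectrum(ell, K=2.0):
    """Eigenvalue multiplicities of U_1 on V^(ell), using the U_1 line of (B.15).

    ell : spin, an integer or half-integer given as a float (e.g. 2 or 1.5)
    K   : sample positive value of (e1 - e2)(e1 - e3)
    """
    n = round(2 * ell)
    # c[j] is the coefficient in U_1 (eta - e1)^j = c[j] (eta - e1)^(2 ell - j)
    c = [cmath.exp(1j * cmath.pi * ell) * K ** (j - ell) for j in range(n + 1)]
    eigenvalues = []
    for j in range(n + 1):
        if 2 * j < n:
            # 2x2 anti-diagonal block on span{(eta-e1)^j, (eta-e1)^(n-j)}
            root = cmath.sqrt(c[j] * c[n - j])
            eigenvalues += [root, -root]
        elif 2 * j == n:
            # fixed basis vector, present only for integer ell
            eigenvalues.append(c[j])
    rounded = [complex(round(z.real), round(z.imag)) for z in eigenvalues]
    return Counter(rounded)


print(involution_spectrum(2))    # integer spin: eigenvalues +1 and -1
print(involution_spectrum(1.5))  # half-integer spin: eigenvalues +i and -i
```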
Hence the eigenvalues of $U_a$ are $(-1)^\ell$ with multiplicity $\ell+1$ and $(-1)^{\ell+1}$ with multiplicity $\ell$ if $\ell$ is an integer, and $\pm i$, both with multiplicity $\ell+\frac12$, if $\ell$ is a half of an odd integer. When $\omega_1 = 1$ and $\omega_2$ is a pure imaginary number, these operators are unitary with respect to the hermitian form (B.7).

Acknowledgements. One of the authors (E.S.) is grateful to the Department of Applied Mathematics, University of Leeds, UK, where the main part of the paper was written, for hospitality, and acknowledges the support of EPSRC. The other one (T.T.) is grateful to the Department of Mathematics, University of California at Berkeley, USA, where some part of the work was done, for hospitality. He also acknowledges the support of the Postdoctoral Fellowship for Research Abroad of JSPS. Special thanks are given to Benjamin Enriquez and Vladimir Rubtsov, who gave us their article [9] before publication and explained details.
References

1. Gaudin, M.: Diagonalisation d'une classe d'hamiltoniens de spin. J. de Physique 37, 1087–1098 (1976)
2. Gaudin, M.: La fonction d'onde de Bethe. Paris: Masson, 1983
3. Baxter, R. J.: Exactly Solved Models of Statistical Mechanics. New York: Academic Press, 1982
4. Sklyanin, E.K., Takebe, T.: Algebraic Bethe Ansatz for XYZ Gaudin model. Phys. Lett. A 219, 217–225 (1996)
5. Babujian, H., Lima-Santos, A., Poghossian, R. H.: Knizhnik–Zamolodchikov–Bernard equations connected with the eight-vertex model. Preprint solv-int/9804015
6. Sklyanin, E.K.: Separation of Variables in the Gaudin model. Zapiski Nauchnykh Seminarov LOMI 164, 151–169 (1987) (in Russian); J. Sov. Math. 47, 2473–2488 (1989) (English transl.)
7. Sklyanin, E.K.: Separation of variables. New trends. In: Quantum field theory, integrable models and beyond (Kyoto, 1994). Progr. Theoret. Phys. Suppl. 118, 35–60 (1995)
8. Frenkel, E.: Affine algebras, Langlands duality and Bethe ansatz. XIth International Congress of Mathematical Physics (Paris, 1994), Cambridge, MA: Internat. Press, 1995, pp. 606–642
9. Enriquez, B., Feigin, B., Rubtsov, V.: Separation of variables for Gaudin–Calogero systems. Compositio Math. 110, 1–16 (1998)
10. Kuroki, G., Takebe, T.: Twisted Wess–Zumino–Witten models on elliptic curves. Commun. Math. Phys. 190, 1–56 (1997)
11. Sklyanin, E. K.: On Poisson structure of periodic classical XYZ-chain. Zapiski Nauchnykh Seminarov LOMI 150, 154–180 (1986) (in Russian); J. Sov. Math. 46, 1664–1683 (1989) (English transl.)
12. Takhtajan, L.A., Faddeev, L.D.: The quantum method of the inverse problem and the Heisenberg XYZ Model. Uspekhi Mat. Nauk 34:5, 13–63 (1979) (in Russian); Russian Math. Surveys 34:5, 11–68 (1979) (English transl.)
13. Whittaker, E.T., Watson, G.N.: A Course of Modern Analysis. 4th ed., Cambridge: Cambridge University Press, 1927
14. Mumford, D.: Tata Lectures on Theta I. Basel–Boston: Birkhäuser, 1982
15. D'Agnolo, A., Schapira, P.: Radon–Penrose transform for D-modules. J. Funct. Anal. 139, 349–382 (1996)
16. Dickson, L. E.: Elementary theory of equations. New York: John Wiley & Sons, Inc., 1914
17. Crawford, L.: On the solution of Lamé's equation $d^2U/du^2 = U\{n(n+1)\wp u + B\}$ in finite terms when $2n$ is an odd number. Quarterly J. Pure and Appl. Math. XXVIII, 93–98 (1895)
18. Sklyanin, E.K.: Some Algebraic Structures Connected with the Yang–Baxter Equation. Representations of Quantum Algebras. Funkts. analiz i ego Prilozh. 17-4, 34–48 (1983) (in Russian); Funct. Anal. Appl. 17, 273–284 (1984) (English transl.)

Communicated by G. Felder
Commun. Math. Phys. 204, 39 – 60 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Double Quantization on Some Orbits in the Coadjoint Representations of Simple Lie Groups

J. Donin¹, D. Gurevich², S. Shnider¹

¹ Department of Mathematics, Bar-Ilan University, 52900 Ramat-Gan, Israel
² ISTV, Université de Valenciennes, 59304 Valenciennes, France
Received: 15 August 1998 / Accepted: 13 January 1999
Abstract: Let $A$ be the function algebra on a semisimple orbit, $M$, in the coadjoint representation of a simple Lie group, $G$, with the Lie algebra $g$. We study one and two parameter quantizations $A_h$ and $A_{t,h}$ of $A$ such that the multiplication on the quantized algebra is invariant under the action of the Drinfeld–Jimbo quantum group, $U_h(g)$. In particular, the algebra $A_{t,h}$ specializes at $h = 0$ to a $U(g)$-invariant ($G$-invariant) quantization, $A_{t,0}$. We prove that the Poisson bracket corresponding to $A_h$ must be the sum of the so-called r-matrix bracket and an invariant bracket. We classify such brackets for all semisimple orbits, $M$, and show that they form a $\dim H^2(M)$ parameter family; then we construct their quantizations. A two parameter (or double) quantization, $A_{t,h}$, corresponds to a pair of compatible Poisson brackets: the first is as described above and the second is the Kirillov–Kostant–Souriau bracket on $M$. Not all semisimple orbits admit a compatible pair of Poisson brackets. We classify the semisimple orbits for which such pairs exist and construct the corresponding two parameter quantization of these pairs in some of the cases.
1. Introduction Passing from classical mechanics to quantum mechanics involves replacing the commutative function algebra, A, of classical observables on the phase space, M, with a noncommutative (deformed, quantized) algebra, At , of quantum observables (see [BFFLS] where the deformation quantization scheme is developed). The algebra A is a Poisson algebra and the product in the quantized algebra At is given by a power series in the formal parameter t with leading term equal to the original commutative product and with infinitesimal generator (the linear term in t, antisymmetrized) equal to the Poisson bracket. If the classical system is invariant under a Lie group of symmetries G, the associated quantum system often retains the same group of symmetries, that is, the product
J. Donin, D. Gurevich, S. Shnider
on At is invariant under the action of G, or, equivalently, under the action of the universal enveloping algebra U (g). Some modern field-theories, in particular those attempting to unify gravity and quantum field theory, investigate the possibility of deforming (quantizing) the group symmetry as well as the function algebra of the phase space. This is one of the reasons for the interest in quantum groups. The quantum group, Uh (g), defined by Drinfeld and Jimbo is a deformation of U (g) as a Hopf algebra. The quantization of the symmetry group together with the phase space corresponds to defining a Uh (g) invariant deformation of the algebra At . This leads us to the problem of finding a two parameter (or double) quantization, At,h , of a Poisson algebra with a U (g) invariant Poisson bracket. In other words, the problem of two parameter quantization arises if we want to quantize a Poisson bracket (appearing as the infinitesimal generator in t) in such a way that multiplication in the quantized algebra is invariant under the quantum group action (which depends on a deformation parameter h). In the present paper we investigate the problem of one and two parameter invariant quantizations of the Poisson function algebra on a semisimple orbit in the coadjoint representation of a simple Lie group. In general, if M is a manifold on which a semisimple Lie group G acts transitively, it is not true that there exists even a one parameter Uh (g) invariant quantization of the function algebra. In [DGM] it is proven that such a quantization exists if M = G/H and the Lie algebra of H contains a maximal nilpotent subalgebra. A Uh (g) invariant quantization of the algebra of holomorphic sections of a line bundle over a flag manifold is constructed by similar methods in [DG1]. In all these cases the Poisson bracket which is quantized is the r-matrix bracket, rM , on M. 
This bracket is determined by the bivector field (ρ ⊗ ρ)(r) for r ∈ ∧2 g, the Drinfeld–Jimbo r-matrix (of the form (3.3)), and ρ : g → Vect(M) is the mapping defined by the action of G on M. The Sklyanin–Drinfeld (SD) bracket on G is determined by the difference of two bivector fields on G which are the right and left invariant extensions of the Drinfeld– Jimbo classical r-matrix. For details see [LW]. In [DG2] it is shown that a one parameter Uh (g) invariant quantization of the SD bracket exists for all semisimple orbits in coadjoint representation of G. In the present paper we show that for the semisimple orbits, M, the SD bracket is one of a dim H 2 M parameter family of Poisson brackets admitting Uh (g) invariant quantization. For symmetric spaces M, the r-matrix bracket satisfies the Jacobi identity and hence defines a Poisson bracket. The existence of a one parameter Uh (g) invariant quantization with the r-matrix Poisson bracket is proven in [DS1]. That paper also proves that when M is a hermitian symmetric space there exists a two parameter Uh (g) invariant quantization. The hermitian symmetric spaces form a subclass of semisimple orbits in g∗ . These orbits have a very interesting property: the one-sided invariant components of the Sklyanin–Drinfeld bracket being reduced on such an orbit become Poisson brackets separately. More precisely, one of these components being reduced on M is the r-matrix bracket and the other one becomes the Kirillov–Kostant–Souriau (KKS) bracket which is roughly speaking the restriction to the orbit of the Lie bracket in g (see [KRR,DG1]). These brackets are obviously compatible (i.e. their Schouten bracket vanishes). So, we get a Poisson pencil which is the set of linear combinations of KKS and r-matrix Poisson brackets. The two parameter quantization, At,h , for hermitian symmetric spaces constructed in [DS1] is just a quantization of such a pencil. In particular, At,0 is a U (g) invariant quantization of the KKS bracket on M.
Double Quantization on Some Orbits of Simple Lie Groups
The orbits in $g^*$ on which the r-matrix bracket is Poisson have been classified in [GP]. In particular, the only such semisimple orbits are symmetric spaces. It turns out that there exist two parameter $U_h(g)$ invariant quantizations on some $G$-manifolds where the r-matrix bracket is not Poisson (does not satisfy the Jacobi identity) but one can add a $G$-invariant bracket to the r-matrix bracket and get a Poisson bracket. For example, in [Do] it is shown that for $g = sl(n)$ there is a two parameter $U_h(g)$ invariant family $(Sg)_{t,h}$, where $Sg$ is the symmetric algebra of $g$, which can be considered as the function (polynomial) algebra on $g^*$. It is proven that this family can be restricted to give a two parameter quantization on any semisimple orbit of maximal dimension (on which the r-matrix bracket may not be Poisson).

In the present paper we prove the existence of a two parameter $U_h(g)$ invariant quantization of the function algebra for some non-symmetric semisimple orbits in $g^*$. Similarly to the case of symmetric orbits, the quasiclassical (infinitesimal) term of this quantization is a Poisson pencil generated by the KKS bracket and another Poisson bracket, which must be the sum of the r-matrix bracket and a $U(g)$ invariant bracket. In particular, we shall see that such pencils exist for all semisimple orbits in the case $g = sl(n)$. Note that for non-symmetric orbits neither the r-matrix bracket nor the SD Poisson bracket is compatible with the KKS bracket. We give a complete classification of the orbits admitting such a Poisson pencil. Moreover, we classify all such pencils and construct the deformation quantization of some of them.

We now describe the content of the paper in more detail. Let $G$ be a simple, connected, complex Lie group, $g$ its Lie algebra. The Drinfeld–Jimbo quantum group $U_h(g)$ can be considered as the algebra $U(g)[[h]]$ with undeformed multiplication but deformed noncommutative comultiplication $\Delta_h$. Let $M$ be a $G$-homogeneous complex manifold. It is easy to show that $M$ is isomorphic to a semisimple orbit of $G$ in the coadjoint representation $g^*$ if and only if the stabilizer, $G_o$, of a point $o \in M$ is a Levi subgroup in $G$ (see Sect. 3 for the definition). The $G$-invariant symplectic structure on $M$ is not unique, but each such symplectic structure on $M$ arises from an isomorphism of $M$ with an orbit, $O_\lambda$, for some semisimple element $\lambda \in g^*$. On $O_\lambda$ there is the Kirillov–Kostant–Souriau (KKS) Poisson bracket, $v_\lambda$, whose action on the restriction of linear functions to the orbit is given by the Lie bracket in $g$. Each $G$-invariant symplectic structure on $M$ is induced from the KKS bracket by an isomorphism of $M$ onto a semisimple orbit (see Sect. 3).

The first problem we consider is that of quantizing the algebra $A$ of polynomial (or holomorphic) functions on $M$, such that the quantized algebra $A_h$ has a $U_h(g)$ invariant multiplication
\[ \mu_h = \sum_{i=0}^{\infty} h^i\mu_i, \]
where $\mu_0$ is the initial multiplication of functions and the $\mu_i$ for $i \ge 1$ are bidifferential operators. Invariance means that $\mu_h$ satisfies the property
\[ \mu_h\,\Delta_h(x)(a\otimes b) = x\,\mu_h(a\otimes b) \qquad \text{for } x \in U_h(g),\ a, b \in A. \]
The second problem we consider is the existence of a two parameter Uh (g) invariant quantization, At,h , of A such that the one parameter family At,0 is a U (g) invariant quantization of the KKS bracket on M. The infinitesimal of the one parameter quantization Ah is a Poisson bracket, whereas the infinitesimal of the two parameter quantization is a pencil of two compatible Poisson brackets.
In Sect. 2 we recall some facts on the Drinfeld monoidal categories of quantum group representations, used in our construction of quantization. In addition, we show that the Poisson bracket, p(a, b) = µ1 (a, b) − µ1 (b, a), corresponding to the Uh (g) invariant quantization Ah must be of a special form. Namely, let r ∈ ∧2 g be the Drinfeld–Jimbo classical r-matrix. The Schouten bracket [[r, r]] ∈ ∧3 g is the invariant element ϕ, which is unique up to a factor. Denote by rM the bracket on M determined by the bivector field (ρ ⊗ ρ)(r) where ρ : g → Vect(M) is the mapping defined by the action of G on M. We call rM an r-matrix bracket. Put also ϕM = ρ ⊗3 ϕ. Then, p(a, b) has the form p(a, b) = rM (a, b) + f (a, b),
a, b ∈ A,
(1.1)
where f (a, b) is a U (g) invariant bracket with the Schouten bracket [[f, f ]] = −ϕM .
(1.2)
Note that the r-matrix bracket is compatible with any invariant bracket, i.e. [[f, rM ]] = 0. Similarly, the two parameter quantization At,h corresponds to a pair of compatible Poisson brackets, (p, vλ ), where p = rM + f of the form (1.1) and vλ is the KKS bracket. Since rM is compatible with the invariant bracket vλ , compatibility of p with vλ is equivalent to the condition [[f, vλ ]] = 0.
(1.3)
In Sect. 3 we give a classification of all invariant brackets f satisfying condition (1.2) for all M isomorphic to semisimple orbits. We show that such brackets form a dim H 2 (M) parameter family. In the same section we show that a semisimple orbit may not have any invariant brackets satisfying conditions (1.2) and (1.3). We call an orbit a “good orbit”, if such a bracket exists and then give a classification of all good orbits. Namely, if g is of type An , all semisimple orbits are good. All orbits which are symmetric spaces are good. In cases Dn , E6 there are good orbits which are not symmetric spaces. Moreover, for the good orbits the brackets satisfying conditions (1.2) and (1.3) form a one parameter family. In fact, the property of an orbit being good depends only on its structure as a homogeneous manifold, not on the symplectic structure. The dependence is on the Lie subalgera of the stabilizer subgroup. So, if an orbit M is good, then any orbit isomorphic to M as a homogeneous manifold will be good. In Sect. 4 we consider cohomologies of the complex of invariant polyvector fields on M with differential given by the Schouten bracket with the bivector f satisfying (1.2). We show that for almost all f these cohomologies coincide with the usual de Rham cohomologies of the manifold M and then use this fact in Sect. 5 to prove the existence of an invariant quantization. In the proof we use methods of [DS1] and [DS2]. Using the same methods, we also construct the two parameter quantization for good orbits in cases Dn , E6 and brackets satisfying (1.2) and (1.3). For the case An some additional arguments are required. (See [Do] where using another method the existence of two parameter quantization for maximal orbits is proven.) In conclusion we make two remarks. Remark 1.1. Throughout this paper G is supposed to be a complex Lie group. 
However, one can consider the situation when G is a real simple Lie group with Lie algebra gR and M is a semisimple orbit of G in g∗R . In this case we take A = C ∞ (M), the complexvalued smooth functions. Let g be the complexification of gR . It is clear that the Lie algebra g and algebra U (g) act on C ∞ (M). Since all our results are formulated in terms of g action, they are valid in the real case as well (see also [DS1]).
Remark 1.2. The deformation quantization can be considered as the first step of a quantization procedure whose second step is a representation of the quantized algebra, $A_{t,h}$, as an operator algebra in a linear space. For some symmetric orbits in $sl(n)^*$ such a representation has been given in [DGR] and [DGK], and the method of [DGK] can apparently be extended to all symmetric orbits in $g^*$ for all simple Lie algebras $g$. These operator algebras have a deformed $U_h(g)$ invariant trace which, however, is not symmetric.

2. Poisson Brackets Associated with $U_h(g)$ Invariant Quantization

We recall some facts about Drinfeld algebras and the monoidal categories determined by them. They will be used, in particular, in our construction of the quantization.

Let $A$ be a commutative algebra with unit, $B$ a unitary $A$-algebra. The category of representations of $B$ in $A$-modules, i.e. the category of $B$-modules, will be a monoidal category if the algebra $B$ is equipped with an algebra morphism, $\Delta : B \to B\otimes_A B$, called comultiplication, and an invertible element $\Phi \in B^{\otimes 3}$ such that $\Delta$ and $\Phi$ satisfy the conditions (see [Dr1])
\[ (\mathrm{id}\otimes\Delta)(\Delta(b))\cdot\Phi = \Phi\cdot(\Delta\otimes\mathrm{id})(\Delta(b)), \qquad b \in B, \tag{2.1} \]
\[ (\mathrm{id}^{\otimes 2}\otimes\Delta)(\Phi)\cdot(\Delta\otimes\mathrm{id}^{\otimes 2})(\Phi) = (1\otimes\Phi)\cdot(\mathrm{id}\otimes\Delta\otimes\mathrm{id})(\Phi)\cdot(\Phi\otimes 1). \tag{2.2} \]
Define a tensor product functor for $\mathcal{C}$, the category of $B$-modules, denoted $\otimes_{\mathcal{C}}$ or simply $\otimes$ when there can be no confusion, in the following way: given $B$-modules $M$, $N$, $M\otimes_{\mathcal{C}}N = M\otimes_A N$ as an $A$-module. The action of $B$ is defined by
\[ b(m\otimes n) = (\Delta b)(m\otimes n) = b_1m\otimes b_2n, \qquad \text{where } \Delta b = b_1\otimes b_2, \]
using the Sweedler convention of an implicit summation over an index. The element $\Phi = \Phi_1\otimes\Phi_2\otimes\Phi_3$ defines the associativity constraint,
\[ a_{M,N,P} : (M\otimes N)\otimes P \to M\otimes(N\otimes P), \qquad a_{M,N,P}((m\otimes n)\otimes p) = \Phi_1m\otimes(\Phi_2n\otimes\Phi_3p). \]
Again the summation in the expression for $\Phi$ is understood. By virtue of (2.1) $\Phi$ induces an isomorphism of $B$-modules, and by virtue of (2.2) the pentagon identity for monoidal categories holds. We call the triple $(B, \Delta, \Phi)$ a Drinfeld algebra. The definition is somewhat non-standard in that we do not require the existence of an antipode. The category $\mathcal{C}$ of $B$-modules for $B$ a Drinfeld algebra becomes a monoidal category. When it becomes necessary to be more explicit we shall denote it by $\mathcal{C}(B, \Delta, \Phi)$.

Let $(B, \Delta, \Phi)$ be a Drinfeld algebra and $F \in B^{\otimes 2}$ an invertible element. Put
\[ \tilde\Delta(b) = F\Delta(b)F^{-1}, \qquad b \in B, \tag{2.3} \]
\[ \tilde\Phi = (1\otimes F)\cdot(\mathrm{id}\otimes\Delta)(F)\cdot\Phi\cdot(\Delta\otimes\mathrm{id})(F^{-1})\cdot(F\otimes 1)^{-1}. \tag{2.4} \]
Then $\tilde\Delta$ and $\tilde\Phi$ satisfy (2.1) and (2.2), therefore the triple $(B, \tilde\Delta, \tilde\Phi)$ also becomes a Drinfeld algebra which has an equivalent monoidal category of modules, $\tilde{\mathcal{C}}(B, \tilde\Delta, \tilde\Phi)$. Note that the equivalent categories $\mathcal{C}$ and $\tilde{\mathcal{C}}$ consist of the same objects as $B$-modules, and the tensor products of two objects are isomorphic as $A$-modules. The equivalence $\mathcal{C} \to \tilde{\mathcal{C}}$ is given by the pair $(\mathrm{Id}, F)$, where $\mathrm{Id} : \mathcal{C} \to \tilde{\mathcal{C}}$ is the identity functor of the categories (considered without the monoidal structures, but only as categories of $B$-modules), and $F : M\otimes_{\mathcal{C}}N \to M\otimes_{\tilde{\mathcal{C}}}N$ is defined by $m\otimes n \mapsto F_1m\otimes F_2n$, where $F_1\otimes F_2 = F$.
J. Donin, D. Gurevich, S. Shnider
Assume $\mathcal A$ is a $B$-module with a multiplication $\mu : \mathcal A \otimes_A \mathcal A \to \mathcal A$ which is a homomorphism of $A$-modules. We say that $\mu$ is $\Delta$ invariant if
\[ b\mu(x \otimes y) = \mu\Delta(b)(x \otimes y) \quad \text{for } b \in B,\ x, y \in \mathcal A, \tag{2.5} \]
and $\Phi$ associative, if
\[ \mu(\Phi_1 x \otimes \mu(\Phi_2 y \otimes \Phi_3 z)) = \mu(\mu(x \otimes y) \otimes z) \quad \text{for } x, y, z \in \mathcal A. \tag{2.6} \]
Note that a $B$-module $\mathcal A$ equipped with $\Delta$ invariant and $\Phi$ associative multiplication is an associative algebra in the monoidal category $\mathcal C(B, \Delta, \Phi)$. The multiplication $\tilde\mu = \mu F^{-1} : M \otimes_A M \to M$ will be $\tilde\Phi$-associative and invariant in the category $\tilde{\mathcal C}$. We are interested in the case when $A = \mathbb C[[h]]$, $B = U(\mathfrak g)[[h]]$, where $\mathfrak g$ is a complex simple Lie algebra. In this case, all tensor products over $\mathbb C[[h]]$ are completed in the $h$-adic topology. Denote by $\varphi \in \wedge^3 \mathfrak g$ an invariant element (unique up to scaling for $\mathfrak g$ simple) and by $r \in \wedge^2 \mathfrak g$ the so-called Drinfeld–Jimbo r-matrix of the form (3.3) such that the Schouten bracket of $r$ with itself is equal to $\varphi$:
\[ [\![r, r]\!] = \varphi. \tag{2.7} \]
In [Dr1], Drinfeld proved the following (see also [DS2] for the property c)).

Proposition 2.1. 1. There is an invariant element $\Phi = \Phi_h \in U(\mathfrak g)[[h]]^{\otimes 3}$ of the form $\Phi_h = 1 \otimes 1 \otimes 1 + h^2 \varphi + \cdots$ satisfying the following properties:
a) it depends on $h^2$, i.e. $\Phi_h = \Phi_{-h}$;
b) it satisfies Eqs. (2.1) (i.e. it is invariant) and (2.2) with the usual $\Delta$ arising from $U(\mathfrak g)$;
c) it is invariant under the Cartan involution $\theta$;
d) $\Phi_h^{-1} = \Phi_h^{321}$, where $\Phi^{321} = \Phi_3 \otimes \Phi_2 \otimes \Phi_1$ for $\Phi = \Phi_1 \otimes \Phi_2 \otimes \Phi_3$.
2. There is an element $F = F_h \in U(\mathfrak g)[[h]]^{\otimes 2}$ of the form $F_h = 1 \otimes 1 + (h/2) r + \cdots$ satisfying Eq. (2.4) with the usual $\Delta$ and with $\tilde\Phi = 1 \otimes 1 \otimes 1$.

This proposition implies that there are two nontrivial Drinfeld algebras: the first, $(U(\mathfrak g)[[h]], \Delta, \Phi_h)$ with the usual comultiplication and $\Phi$ from Proposition 2.1, and the second, $(U(\mathfrak g)[[h]], \tilde\Delta, 1)$, where $\tilde\Delta(x) = F_h \Delta(x) F_h^{-1}$ for $x \in U(\mathfrak g)$. The pair $(\mathrm{Id}, F_h)$ defines an equivalence between the corresponding monoidal categories $\mathcal C(U(\mathfrak g)[[h]], \Delta, \Phi_h)$ and $\mathcal C(U(\mathfrak g)[[h]], \tilde\Delta, 1)$.

It is clear that reduction modulo $h$ defines a functor from either of these categories to the category of representations of $U(\mathfrak g)$ and the equivalence just described reduces to the identity modulo $h$. In fact, both categories are $\mathbb C[[h]]$-linear extensions of the $\mathbb C$-linear category of representations of $\mathfrak g$. Ignoring the monoidal structure the extension is a trivial one, but the associator $\Phi$ in the first case and the comultiplication $\tilde\Delta$ in the second case make the extension non-trivial from the point of view of monoidal categories.

The bialgebra $U(\mathfrak g)[[h]]$ with comultiplication $\Delta_h = \tilde\Delta$ is denoted by $U_h(\mathfrak g)$ and is isomorphic to the Drinfeld–Jimbo quantum group ([Dr1]).

Let $\mathcal A$ be a $U(\mathfrak g)$ invariant commutative algebra, i.e. an algebra with $U(\mathfrak g)$ invariant multiplication $\mu$ in the sense of (2.5). A quantization of $\mathcal A$ is an associative algebra, $\mathcal A_h$, which is isomorphic to $\mathcal A[[h]] = \mathcal A \otimes \mathbb C[[h]]$ (completed tensor product) as a $\mathbb C[[h]]$-module, with multiplication in $\mathcal A_h$ having the form $\mu_h = \mu + h\mu_1 + o(h)$. The Poisson bracket corresponding to the quantization is given by $\{a, b\} = \mu_1(a, b) - \mu_1(b, a)$, $a, b \in \mathcal A$.
Double Quantization on Some Orbits of Simple Lie Groups
In general, we call a skew-symmetric bilinear form $\mathcal A \otimes \mathcal A \to \mathcal A$ a bracket, if it satisfies the Leibniz rule in either argument when the other is fixed. The term Poisson bracket indicates that the Jacobi identity is also true. A bracket of the form
\[ \{a, b\}_r = (r_1 a)(r_2 b) = \mu(r_1 a, r_2 b), \quad a, b \in \mathcal A, \tag{2.8} \]
where $r = r_1 \otimes r_2$ (summation implicit) is the representation of the r-matrix $r$, will be called an r-matrix bracket. Assume $\mathcal A_h$ is a $U_h(\mathfrak g)$ invariant quantization, i.e. the multiplication $\mu_h$ is $\Delta_h$ invariant. We shall show that in this case the Poisson bracket $\{\cdot, \cdot\}$ has a special form. Suppose $f$ and $g$ are two brackets on $\mathcal A$. Then we define their Schouten bracket $[\![f, g]\!]$ as
\[ [\![f, g]\!](a, b, c) = f(g(a, b), c) + g(f(a, b), c) + \text{cyclic permutations of } a, b, c. \tag{2.9} \]
Then $[\![f, g]\!]$ is a skew-symmetric map $\mathcal A^{\otimes 3} \to \mathcal A$. We call $f$ and $g$ compatible, if $[\![f, g]\!] = 0$.

Proposition 2.2. Let $\mathcal A$ be a $U(\mathfrak g)$ invariant commutative algebra and $\mathcal A_h$ a $U_h(\mathfrak g)$ invariant quantization. Then the corresponding Poisson bracket has the form
\[ \{a, b\} = f(a, b) - \{a, b\}_r, \tag{2.10} \]
where $f(a, b)$ is a $U(\mathfrak g)$ invariant bracket. The brackets $f$ and $\{\cdot, \cdot\}_r$ are compatible and $[\![f, f]\!] = -\varphi_{\mathcal A}$, where $\varphi_{\mathcal A}(a, b, c) = (\varphi_1 a)(\varphi_2 b)(\varphi_3 c)$ and $\varphi_1 \otimes \varphi_2 \otimes \varphi_3 = \varphi \in \wedge^3 \mathfrak g$ is the invariant element given by (2.7).

Proof. The permutation $\sigma : \mathcal A_h^{\otimes 2} \to \mathcal A_h^{\otimes 2}$, $a \otimes b \mapsto b \otimes a$, is an equivariant operator in the category $\mathcal C(U(\mathfrak g)[[h]], \Delta, \Phi_h)$, because $\Delta$ is a cocommutative comultiplication. The equivalence of categories $\mathcal C(U(\mathfrak g)[[h]], \Delta, \Phi_h)$ and $\mathcal C(U(\mathfrak g)[[h]], \tilde\Delta, 1)$ implies that the operator $\tilde\sigma = F \sigma F^{-1}$ on $\mathcal A_h^{\otimes 2}$ is equivariant under the action of the quantum group $U_h(\mathfrak g)$. Suppose the multiplication in $\mathcal A_h$ has the form $\mu_h(a, b) = ab + h\mu_1(a, b) + o(h)$. It is easy to calculate that
\[ \frac{1}{h}\,\mu_h(\mathrm{Id} - \tilde\sigma)(a \otimes b) = \mu_1(a, b) - \mu_1(b, a) + (r_1 a)(r_2 b) + O(h) = \{a, b\} + \{a, b\}_r + O(h). \]
But this is a $U_h(\mathfrak g)$ equivariant operator $\mathcal A_h^{\otimes 2} \to \mathcal A_h$. Taking $h = 0$ we obtain that the bracket $f(a, b) = \{a, b\} + \{a, b\}_r$ must be $U(\mathfrak g)$ invariant. So, we have $\{a, b\} = f(a, b) - \{a, b\}_r$, as required. It is easy to check that any bracket of the form $\{a, b\} = (X_1 a)(X_2 b) = \mu(X_1 a, X_2 b)$, for $X_1 \otimes X_2 \in \mathfrak g \wedge \mathfrak g$, is compatible with any invariant bracket. In particular, an r-matrix bracket is compatible with $f$. In addition, $\{\cdot, \cdot\}$ is a Poisson bracket, so its Schouten bracket with itself is equal to zero. Using this and taking into account that the Schouten bracket of the r-matrix bracket with itself is equal to $\varphi_{\mathcal A}$, we obtain from (2.10) that $[\![f, f]\!] = -\varphi_{\mathcal A}$. □

Remark 2.1. a) It is clear that if $r$ satisfies (2.7), then $-r$ satisfies (2.7), too, and we may replace $F_h$ in Proposition 2.1 with $\tilde F_h$ with leading terms $1 \otimes 1 - (h/2) r$. Then, instead of (2.10) we can write $\{a, b\} = f(a, b) + \{a, b\}_r$.
b) Assume that $\mathcal A_{t,h}$ is a two parameter quantization of $\mathcal A$, i.e. a topologically free $\mathbb C[[t, h]]$-module with a multiplication of the form $\mu_{t,h}(a, b) = ab + h\mu_1(a, b) + t\mu_1'(a, b) + o(t, h)$. Assume that $\mathcal A_{t,h}$ is $U_h(\mathfrak g)$ invariant, so that $\mathcal A_{t,0}$ is $U(\mathfrak g)$ invariant. Then there are two compatible Poisson brackets corresponding to such a quantization: the bracket $\mu_1(a, b) - \mu_1(b, a)$ of the form (2.10) and the $U(\mathfrak g)$ invariant bracket $v(a, b) = \mu_1'(a, b) - \mu_1'(b, a)$. Since $v$ is invariant, the compatibility is equivalent to $[\![f, v]\!] = 0$.

c) In view of the equivalence of the categories $\mathcal C(U(\mathfrak g)[[h]], \Delta, \Phi_h)$ and $\mathcal C(U(\mathfrak g)[[h]], \tilde\Delta, 1)$, the problem of quantizing the algebra $\mathcal A$ may be considered in the first category. If $\mathcal A_h$ is a $U_h(\mathfrak g)$ invariant quantization with multiplication $\mu_h$, then the multiplication $\bar\mu_h = \mu_h F_h = \mu + h\bar\mu_1 + o(h)$ will be $U(\mathfrak g)$ invariant and $\Phi_h$ associative in the sense of (2.6). We have $\bar\mu_1(a, b) - \bar\mu_1(b, a) = f(a, b)$, where $f$ is from (2.10). So, we see that the invariant bracket $f$ from (2.10) with $[\![f, f]\!] = -\varphi_{\mathcal A}$ plays the role of Poisson bracket for $\Phi_h$ associative quantization. Similarly, the two parameter quantization $\mathcal A_{t,h}$ corresponds to the $U(\mathfrak g)$ invariant $\Phi_h$ associative quantization in $\mathcal C(U(\mathfrak g)[[h]], \Delta, \Phi_h)$ with a pair of compatible invariant brackets $f$ and $v$, where $[\![f, f]\!] = -\varphi_{\mathcal A}$ and $[\![v, v]\!] = 0$. Working in the category $\mathcal C(U(\mathfrak g)[[h]], \Delta, \Phi_h)$ can simplify the process of quantization (see [DS1] and Sect. 5).

In the next section we consider the case when $\mathcal A$ is a function algebra on a semisimple orbit in the coadjoint representation of $\mathfrak g$ and we give a classification of the invariant brackets $f$ satisfying the property $[\![f, f]\!] = -\varphi_{\mathcal A}$. Moreover, among such $f$ we distinguish those which are compatible with the KKS Poisson brackets.

3. Pairs of Brackets on Semisimple Orbits

Let $\mathfrak g$ be a simple complex Lie algebra, $\mathfrak h$ a fixed Cartan subalgebra. Let $\Omega \subset \mathfrak h^*$ be the system of roots corresponding to $\mathfrak h$. Select a system of positive roots, $\Omega^+$, and denote by $\Pi \subset \Omega$ the subset of simple roots. Fix an element $E_\alpha \in \mathfrak g$ of weight $\alpha$ for each $\alpha \in \Omega^+$ and choose $E_{-\alpha}$ such that $(E_\alpha, E_{-\alpha}) = 1$ for the Killing form $(\cdot, \cdot)$ on $\mathfrak g$. Then, for all pairs of roots $\alpha, \beta$ such that $\alpha + \beta \neq 0$ we define the numbers $N_{\alpha,\beta}$ in the following way:
\[ [E_\alpha, E_\beta] = N_{\alpha,\beta} E_{\alpha+\beta} \ \text{ if } \alpha + \beta \in \Omega, \qquad N_{\alpha,\beta} = 0 \ \text{ if } \alpha + \beta \notin \Omega. \]
These numbers satisfy the following property, [He]. For the roots $\alpha, \beta, \gamma$ such that $\alpha + \beta + \gamma = 0$ one has
\[ N_{\alpha,\beta} = N_{\beta,\gamma} = N_{\gamma,\alpha}. \tag{3.1} \]
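Property (3.1) can be checked by hand in the smallest non-trivial case. The following is a minimal sketch for $\mathfrak g = sl(3)$, under the assumption that the Killing form is $6\,\mathrm{tr}(XY)$ and that the root $\varepsilon_i - \varepsilon_j$ is labelled by the index pair `(i, j)`. The helper names (`root_vector`, `structure_constant`) are illustrative only, and the particular normalization of $E_{-\alpha}$ is one possible choice satisfying $(E_\alpha, E_{-\alpha}) = 1$.

```python
from fractions import Fraction

def unit(i, j, scale=1):
    """3x3 matrix whose only non-zero entry is `scale` at position (i, j)."""
    m = [[Fraction(0)] * 3 for _ in range(3)]
    m[i][j] = Fraction(scale)
    return m

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def bracket(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

def killing(a, b):
    """Killing form of sl(3), assumed here in the form 6 * trace(ab)."""
    ab = matmul(a, b)
    return 6 * sum(ab[i][i] for i in range(3))

def root_vector(i, j):
    """E for the root labelled (i, j): e_ij for positive roots (i < j),
    e_ij / 6 for negative ones, so that (E_alpha, E_-alpha) = 1."""
    return unit(i, j, 1 if i < j else Fraction(1, 6))

def structure_constant(a, b):
    """N_{a,b} from [E_a, E_b] = N_{a,b} E_{a+b}, for roots a = (i, j), b = (j, k)."""
    (i, j), (j2, k) = a, b
    assert j == j2 and i != k, "expects a + b to be a root of the form (i, k)"
    lhs = bracket(root_vector(*a), root_vector(*b))
    return lhs[i][k] / root_vector(i, k)[i][k]

# Two triples of roots with alpha + beta + gamma = 0.
triples = [((0, 1), (1, 2), (2, 0)), ((0, 2), (2, 1), (1, 0))]
for alpha, beta, gamma in triples:
    print(structure_constant(alpha, beta),
          structure_constant(beta, gamma),
          structure_constant(gamma, alpha))
```

Each printed row should consist of three equal numbers, in agreement with (3.1); the common value depends on the normalization chosen for the negative root vectors.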
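Let $\Gamma$ be a subset of $\Pi$. Denote by $\mathfrak h^*_\Gamma$ the subspace in $\mathfrak h^*$ generated by $\Gamma$. Note that $\mathfrak h^* = \mathfrak h^*_\Gamma \oplus \mathfrak h^*_{\Pi\setminus\Gamma}$, and one can identify $\mathfrak h^*_{\Pi\setminus\Gamma}$ and $\mathfrak h^*/\mathfrak h^*_\Gamma$ via the projection $\mathfrak h^* \to \mathfrak h^*/\mathfrak h^*_\Gamma$. Let $\Omega_\Gamma \subset \mathfrak h^*_\Gamma$ be the subsystem of roots in $\Omega$ generated by $\Gamma$, i.e. $\Omega_\Gamma = \Omega \cap \mathfrak h^*_\Gamma$. Denote by $\mathfrak g_\Gamma$ the subalgebra of $\mathfrak g$ generated by the elements $\{E_\alpha, E_{-\alpha}\}$, $\alpha \in \Gamma$, and $\mathfrak h$. Such a subalgebra is called the Levi subalgebra. Let $G$ be a complex connected Lie group with Lie algebra $\mathfrak g$ and $G_\Gamma$ a subgroup with Lie algebra $\mathfrak g_\Gamma$. Such a subgroup is called the Levi subgroup. It is known that $G_\Gamma$ is a connected subgroup. Let $M$ be a homogeneous space of $G$ and $G_\Gamma$ be the stabilizer of a point $o \in M$. We can identify $M$ and the coset space $G/G_\Gamma$. It is known that such $M$ is isomorphic to a semisimple orbit in $\mathfrak g^*$. This orbit goes through an element $\lambda \in \mathfrak g^*$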
which is just the trivial extension to all of $\mathfrak g^*$ (identifying $\mathfrak g$ and $\mathfrak g^*$ via the Killing form) of a map $\lambda : \mathfrak h_{\Pi\setminus\Gamma} \to \mathbb C$ such that $\lambda(\alpha) \neq 0$ for all $\alpha \in \Pi \setminus \Gamma$. Conversely, it is easy to show that any semisimple orbit in $\mathfrak g^*$ is isomorphic to the quotient of $G$ by a Levi subgroup. The projection $\pi : G \to M$ induces the map $\pi_* : \mathfrak g \to T_o$, where $T_o$ is the tangent space to $M$ at the point $o$. Since the ad-action of $\mathfrak g_\Gamma$ on $\mathfrak g$ is semisimple, there exists an $\mathrm{ad}(\mathfrak g_\Gamma)$-invariant subspace $\mathfrak m = \mathfrak m_\Gamma$ of $\mathfrak g$ complementary to $\mathfrak g_\Gamma$, and one can identify $T_o$ and $\mathfrak m$ by means of $\pi_*$. It is easy to see that the subspace $\mathfrak m$ is uniquely defined and has a basis formed by the elements $E_\gamma, E_{-\gamma}$, $\gamma \in \Omega^+ \setminus \Omega_\Gamma$.

Let $v \in \mathfrak g^{\otimes m}$ be a tensor over $\mathfrak g$. Using the right and the left actions of $G$ on itself, one can associate with $v$ right and left invariant tensor fields on $G$ denoted by $v^r$ and $v^l$. We say that a tensor field, $t$, on $G$ is right $G_\Gamma$ invariant, if $t$ is invariant under the right action of $G_\Gamma$. The $G$ equivariant diffeomorphism between $M$ and $G/G_\Gamma$ implies that any right $G_\Gamma$ invariant tensor field $t$ on $G$ induces a tensor field $\pi_*(t)$ on $M$. The field $\pi_*(t)$ will be invariant on $M$ if, in addition, $t$ is left invariant on $G$, and any invariant tensor field on $M$ can be obtained in such a way. Let $v \in \mathfrak g^{\otimes m}$. For $v^l$ to be right $G_\Gamma$ invariant it is necessary and sufficient that $v$ be $\mathrm{ad}(\mathfrak g_\Gamma)$ invariant. Denote $\pi^r(v) = \pi_*(v^r)$ for any tensor $v$ on $\mathfrak g$ and $\pi^l(v) = \pi_*(v^l)$ for any $\mathrm{ad}(\mathfrak g_\Gamma)$ invariant tensor $v$ on $\mathfrak g$. Note that the tensor $\pi^r(v)$ coincides with the image of $v$ by the map $\mathfrak g^{\otimes m} \to \mathrm{Vect}(M)^{\otimes m}$ induced by the action map $\mathfrak g \to \mathrm{Vect}(M)$. Any $G$ invariant tensor on $M$ has the form $\pi^l(v)$. Moreover, $v$ clearly can be uniquely chosen from $\mathfrak m^{\otimes m}$.

Denote by $[\![v, w]\!] \in \wedge^{k+l-1} \mathfrak g$ the Schouten bracket of the polyvectors $v \in \wedge^k \mathfrak g$, $w \in \wedge^l \mathfrak g$, defined by the formula
\[ [\![X_1 \wedge \cdots \wedge X_k, Y_1 \wedge \cdots \wedge Y_l]\!] = \sum (-1)^{i+j} [X_i, Y_j] \wedge X_1 \wedge \cdots \hat X_i \cdots \hat Y_j \cdots \wedge Y_l, \]
where $[\cdot, \cdot]$ is the bracket in $\mathfrak g$. The Schouten bracket is defined in the same way for polyvector fields on a manifold, but instead of $[\cdot, \cdot]$ one uses the Lie bracket of vector fields. We will use the same notation for the Schouten bracket on manifolds. It is easy to see that $\pi^r([\![v, w]\!]) = [\![\pi^r(v), \pi^r(w)]\!]$, and the same relation is valid for $\pi^l$.

Denote by $\overline\Omega_\Gamma$ the image of $\Omega$ in $\mathfrak h^*_{\Pi\setminus\Gamma}$ without zero. It is clear that $\Pi \setminus \Gamma$ can be identified with a subset of $\overline\Omega_\Gamma$ and each element from $\overline\Omega_\Gamma$ is a linear combination of elements from $\Pi \setminus \Gamma$ with integer coefficients which are all positive or all negative. Thus, the subset $\overline\Omega{}^+_\Gamma \subset \overline\Omega_\Gamma$ of the elements with positive coefficients is exactly the image of $\Omega^+$. We call elements of $\overline\Omega_\Gamma$ quasiroots and the images of $\Pi \setminus \Gamma$ simple quasiroots.

Proposition 3.1. a) Let $\beta$ and $\beta'$ be roots from $\Omega$ such that they give the same element in $\overline\Omega_\Gamma$. Then there exist roots $\alpha_1, \ldots, \alpha_k \in \Omega_\Gamma$ such that $\beta + \alpha_1 + \cdots + \alpha_k = \beta'$ and all partial sums $\beta + \alpha_1 + \cdots + \alpha_i$, $i = 1, \ldots, k$, are roots. b) Let $\bar\beta_1, \ldots, \bar\beta_k, \bar\beta$ be elements of $\overline\Omega_\Gamma$ such that $\bar\beta = \bar\beta_1 + \cdots + \bar\beta_k$. Then there exist representatives of these elements in $\Omega$ such that $\beta = \beta_1 + \cdots + \beta_k$.

Proof. a) Let $\beta' = \beta + \gamma_1 + \cdots + \gamma_m$, where $\gamma_i \in \Gamma \cup -\Gamma$. If $(\beta', \beta) > 0$, then $\beta' - \beta$ is a root, and the proposition follows. Proceed by induction on $m$. Since $(\beta', \beta') > 0$, if $(\beta', \beta) \le 0$, then there exists a $\gamma_i$, say $\gamma_m$, such that $(\beta', \gamma_m) > 0$, so $\beta' - \gamma_m$ is a root and $\beta' - \gamma_m = \beta + \gamma_1 + \cdots + \gamma_{m-1}$. By induction, the proposition holds for the pair $\beta' - \gamma_m$ and $\beta$, i.e. there exists a representation $\beta + \alpha_1 + \cdots + \alpha_{k-1} = \beta' - \gamma_m$ satisfying the proposition. Now, putting $\alpha_k = \gamma_m$, we obtain the required representation of $\beta'$.
b) Let $\beta_1', \ldots, \beta_k', \beta'$ be some representatives of elements $\bar\beta_1, \ldots, \bar\beta_k, \bar\beta$ in $\Omega$. Then we have the equation $\beta_1' + \cdots + \beta_k' + \gamma_1 + \cdots + \gamma_m = \beta'$ for some $\gamma_i \in \Gamma \cup -\Gamma$,
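$i = 1, \ldots, m$. If $(\gamma_i, \beta') > 0$, then $\beta' - \gamma_i$ is a root, and we can take $\beta' - \gamma_i$ instead of $\beta'$. After iteration, we can assume that all $(\gamma_i, \beta') \le 0$. Then, there exists a $\beta_i'$, say $\beta_k'$, such that $(\beta_k', \beta') > 0$, so $\beta' - \beta_k'$ is a root. Applying induction on $k$, one can suppose that there are representatives $\beta_1, \ldots, \beta_{k-1}$ of $\bar\beta_1, \ldots, \bar\beta_{k-1}$, such that we have an equation $\beta_1 + \cdots + \beta_{k-1} = \beta' - \beta_k' + \gamma_1 + \cdots + \gamma_n$ for some $\gamma_i \in \Gamma \cup -\Gamma$, $i = 1, \ldots, n$, and $\tilde\beta = \beta_1 + \cdots + \beta_{k-1}$ is a root. If $(\tilde\beta, \gamma_i) > 0$, then for some $\beta_j$, say $\beta_1$, $(\beta_1, \gamma_i) > 0$, $\beta_1 - \gamma_i$ is a root, and one can replace $\beta_1$ by $\beta_1 - \gamma_i$. Repeating this argument we can assume that all $(\tilde\beta, \gamma_i) \le 0$. Then $(\tilde\beta, (\beta' - \beta_k')) \ge 0$, so either $(\tilde\beta, \beta') > 0$, and we set $\beta_k = \beta_k' - \sum \gamma_i$, $\beta = \beta'$, or $(\tilde\beta, -\beta_k') > 0$, and we set $\beta_k = \beta_k'$ and $\beta = \beta' + \sum \gamma_i$. In any case, we obtain the required representatives of $\bar\beta_1, \ldots, \bar\beta_k, \bar\beta$. □

Remark 3.1. It is obvious that $\mathfrak m$ considered as a $\mathfrak g_\Gamma$ representation space decomposes into the direct sum of subrepresentations $\mathfrak m_{\bar\beta}$, $\bar\beta \in \overline\Omega_\Gamma$, where $\mathfrak m_{\bar\beta}$ is generated by all the elements $E_\beta$, $\beta \in \Omega$, such that the projection of $\beta$ is equal to $\bar\beta$. Part a) of Proposition 3.1 shows that all $\mathfrak m_{\bar\beta}$ are irreducible. Part b) together with part a) shows that for $\bar\beta_1, \bar\beta_2 \in \overline\Omega_\Gamma$ such that $\bar\beta_1 + \bar\beta_2 \in \overline\Omega_\Gamma$ one has $[\mathfrak m_{\bar\beta_1}, \mathfrak m_{\bar\beta_2}] = \mathfrak m_{\bar\beta_1 + \bar\beta_2}$. Using the Killing form, it is easy to see that representations $\mathfrak m_{\bar\beta}$ and $\mathfrak m_{-\bar\beta}$ are dual.

Question. Is it true that for $\bar\beta_1, \bar\beta_2 \in \overline\Omega_\Gamma$ such that $\bar\beta_1 + \bar\beta_2 \in \overline\Omega_\Gamma$ the representation $\mathfrak m_{\bar\beta_1 + \bar\beta_2}$ is contained in $\mathfrak m_{\bar\beta_1} \wedge \mathfrak m_{\bar\beta_2} \subset \wedge^2 \mathfrak m$ with multiplicity one?

Since $\mathfrak g_\Gamma$ contains the Cartan subalgebra $\mathfrak h$, each $\mathfrak g_\Gamma$ invariant tensor over $\mathfrak m$ has to be of weight zero. It follows that there are no invariant vectors in $\mathfrak m$. Hence, there are no invariant vector fields on $M$. Consider the invariant bivector fields on $M$. From the above, such fields correspond to the $\mathfrak g_\Gamma$ invariant bivectors from $\wedge^2 \mathfrak m$. Note that any $\mathfrak h$ invariant bivector from $\wedge^2 \mathfrak m$ has to be of the form $\sum c(\alpha) E_\alpha \wedge E_{-\alpha}$.

Proposition 3.2. A bivector $v \in \wedge^2 \mathfrak m$ is $\mathfrak g_\Gamma$ invariant if and only if it has the form $v = \sum c(\alpha) E_\alpha \wedge E_{-\alpha}$, where the sum runs over $\alpha \in \Omega^+ \setminus \Omega_\Gamma$, and for two roots $\alpha, \beta$ which give the same element in $\mathfrak h^*/\mathfrak h^*_\Gamma$ one has $c(\alpha) = c(\beta)$.

Proof. In view of Proposition 3.1 a), we may assume that $\alpha = \beta + \gamma$, where $\gamma \in \Omega_\Gamma$. Then the coefficient before $E_{\beta+\gamma} \wedge E_{-\beta}$ in $[\![E_\gamma, v]\!]$ appears from the terms $E_\beta \wedge E_{-\beta}$ and $E_{\beta+\gamma} \wedge E_{-\beta-\gamma}$ in $v$, and is equal to $N_{\gamma,\beta}\, c(\beta) + N_{\gamma,-\beta-\gamma}\, c(\beta + \gamma)$. But from (3.1) it follows that $N_{\gamma,\beta} = -N_{\gamma,-\beta-\gamma}$, so if $v$ is invariant under the action of $E_\gamma$, i.e. $[\![E_\gamma, v]\!] = 0$, then $c(\beta) = c(\beta + \gamma)$. □

This proposition shows that coefficients of an invariant element $v = \sum c(\alpha) E_\alpha \wedge E_{-\alpha}$ depend only on the image of $\alpha$ in $\overline\Omega_\Gamma$, denoted $\bar\alpha$, so $v$ can be written in the form $v = \sum c(\bar\alpha) E_\alpha \wedge E_{-\alpha}$. Let $v \in \wedge^2 \mathfrak m$ be of the form $v = \sum c(\bar\alpha) E_\alpha \wedge E_{-\alpha}$, where the sum runs over $\alpha \in \Omega^+ \setminus \Omega_\Gamma$. Denote by $\theta$ the Cartan automorphism of $\mathfrak g$. Then, $v$ is $\theta$ anti-invariant, i.e. $\theta v = -v$. Hence, any $\mathfrak g_\Gamma$ invariant bivector is $\theta$ anti-invariant. If $v, w \in \wedge^2 \mathfrak m$ are $\mathfrak g_\Gamma$ invariant, then $[\![v, w]\!]$ is $\theta$ invariant and is of the form $[\![v, w]\!] = \sum e(\bar\alpha, \bar\beta) E_{\alpha+\beta} \wedge E_{-\alpha} \wedge E_{-\beta}$, where roots $\alpha, \beta$ are both negative or both positive and $e(\bar\alpha, \bar\beta) = -e(-\bar\alpha, -\bar\beta)$. Hence, to calculate $[\![v, w]\!]$ for such $v$ and $w$ it is sufficient to calculate coefficients $e(\bar\alpha, \bar\beta)$ for positive $\bar\alpha$ and $\bar\beta$.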
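The grouping of roots by their quasiroot image, on which the coefficients of an invariant bivector are constant, can be made concrete in type $A_n$. The sketch below is illustrative only: it encodes positive roots of $A_n$ as coefficient vectors in the simple roots and takes the subset of simple roots spanning the Levi subalgebra as a set of 1-based indices (called `gamma` here); the function names are hypothetical.

```python
from collections import defaultdict

def positive_roots_A(n):
    """Positive roots of A_n as coefficient tuples in the simple roots:
    alpha_i + ... + alpha_j for 1 <= i <= j <= n."""
    return [tuple(1 if i <= k <= j else 0 for k in range(1, n + 1))
            for i in range(1, n + 1) for j in range(i, n + 1)]

def quasiroot_classes(n, gamma):
    """Group the positive roots of A_n by their image modulo the span of the
    simple roots whose (1-based) indices lie in `gamma`."""
    kept = [k for k in range(1, n + 1) if k not in gamma]
    classes = defaultdict(list)
    for root in positive_roots_A(n):
        image = tuple(root[k - 1] for k in kept)
        if any(image):  # roots of the Levi subalgebra project to zero
            classes[image].append(root)
    return dict(classes)

# A_4 with the second simple root placed in the Levi part.
classes = quasiroot_classes(4, {2})
for image, roots in classes.items():
    print(image, "<-", roots)
```

In this example the six distinct images coincide with the positive roots of $A_3$ written in the three remaining simple roots, and every root in the same class would carry the same coefficient $c$.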
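Proposition 3.3. Let $v = \sum c(\alpha) E_\alpha \wedge E_{-\alpha}$, $w = \sum d(\alpha) E_\alpha \wedge E_{-\alpha}$ be elements from $\mathfrak g \wedge \mathfrak g$. Then for any positive roots $\alpha$, $\beta$, $(\alpha + \beta)$ the coefficient of the term $E_{\alpha+\beta} \wedge E_{-\alpha} \wedge E_{-\beta}$ in $[\![v, w]\!]$ is equal to
\[ N_{\alpha,\beta}\bigl(c(\alpha)(d(\beta) - d(\alpha + \beta)) + c(\beta)(d(\alpha) - d(\alpha + \beta)) - c(\alpha + \beta)(d(\alpha) + d(\beta))\bigr). \tag{3.2} \]

Proof. Direct computation, see [KRR]. □

Let $r \in \mathfrak g \wedge \mathfrak g$ be the Drinfeld–Jimbo r-matrix:
\[ r = \sum_{\alpha \in \Omega^+} E_\alpha \wedge E_{-\alpha}. \tag{3.3} \]
Then $[\![r, r]\!] = \varphi$ is an invariant element in $\wedge^3 \mathfrak g$. From Proposition 3.2 it follows that $r$ reduced modulo $\mathfrak g \wedge \mathfrak g_\Gamma$ is $\mathfrak g_\Gamma$-invariant. Hence, $r$ and $\varphi$ define invariant bivector and three-vector fields on $M$, $\pi^l(r)$ and $\pi^l(\varphi)$, which we denote by $r_M$ and $\varphi_M$. Recall that we identify invariant tensor fields on $M$ with invariant tensors in $\mathfrak m$. From Propositions 3.3 and 3.1 b) it follows that the condition that the Schouten bracket of the bivector $v = \sum c(\bar\alpha) E_\alpha \wedge E_{-\alpha}$ with itself gives $K^2 \varphi_M$ for a number $K$ is
\[ c(\bar\alpha + \bar\beta)(c(\bar\alpha) + c(\bar\beta)) = c(\bar\alpha) c(\bar\beta) + K^2 \tag{3.4} \]
for all the pairs of positive quasiroots $\bar\alpha$, $\bar\beta$ such that $\bar\alpha + \bar\beta$ is a quasiroot. Given $c(\bar\alpha)$ and $c(\bar\beta)$ and assuming that $c(\bar\alpha) + c(\bar\beta) \neq 0$ we find that
\[ c(\bar\alpha + \bar\beta) = \frac{c(\bar\alpha) c(\bar\beta) + K^2}{c(\bar\alpha) + c(\bar\beta)}. \tag{3.5} \]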
Assume $\bar\alpha, \bar\beta, \bar\gamma$ are positive quasiroots such that $\bar\alpha + \bar\beta$, $\bar\beta + \bar\gamma$, $\bar\alpha + \bar\beta + \bar\gamma$ are also quasiroots. Then the number $c(\bar\alpha + \bar\beta + \bar\gamma)$ can be calculated formally (ignoring possible division by zero) in two ways, using (3.5) for the pair $c(\bar\alpha)$, $c(\bar\beta + \bar\gamma)$ on the right-hand side and also for the pair $c(\bar\alpha + \bar\beta)$, $c(\bar\gamma)$. But it is easy to check that these two ways give the same value of $c(\bar\alpha + \bar\beta + \bar\gamma)$. In this sense the system of equations corresponding to (3.5) for all pairs is consistent.

Let us consider this system more carefully. For any positive quasiroot represented as a sum of simple quasiroots, $\bar\alpha = \sum a_i \bar\alpha_i$, define the height $\mathrm{ht}(\bar\alpha) = \sum a_i$. In general the coefficient $c(\bar\alpha)$ for quasiroots of height $l$ can be formally defined by iterating (3.5). Let $\bar\alpha = \bar\alpha_1 + \cdots + \bar\alpha_l$ with possible repetitions and let $c_i := c(\bar\alpha_i)$, then
\[ c(\bar\alpha) = \frac{c_1 c_2 \cdots c_l + K^2 \sum c_{i_1} c_{i_2} \cdots c_{i_{l-2}} + \cdots}{\sum c_{i_1} c_{i_2} \cdots c_{i_{l-1}} + K^2 \sum c_{i_1} c_{i_2} \cdots c_{i_{l-3}} + \cdots}. \tag{3.6} \]
For $K \neq 0$ the expression in the denominator can be expressed as
\[ \frac{1}{2K} \left( \prod_{i=1}^{l} (c_i + K) - \prod_{i=1}^{l} (c_i - K) \right). \tag{3.7} \]
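The consistency of (3.5) and the closed forms (3.6), (3.7) can be spot-checked with exact rational arithmetic. The following is a minimal sketch with hypothetical function names; it reads the sums in (3.6) as elementary symmetric polynomials in the $c_i$, and it uses the product expression normalized with the factor $1/(2K)$, which is the normalization that makes it equal to the denominator of (3.6).

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def combine(ca, cb, K):
    """Eq. (3.5): coefficient of a sum of two quasiroots."""
    return (ca * cb + K * K) / (ca + cb)

def elementary(cs, m):
    """m-th elementary symmetric polynomial of the numbers cs."""
    return sum(prod(sub) for sub in combinations(cs, m))

def closed_form(cs, K):
    """Numerator and denominator of Eq. (3.6)."""
    l = len(cs)
    num = sum(K ** (l - m) * elementary(cs, m) for m in range(l, -1, -2))
    den = sum(K ** (l - 1 - m) * elementary(cs, m) for m in range(l - 1, -1, -2))
    return num, den

cs = [Fraction(2), Fraction(3), Fraction(5)]
K = Fraction(1)

# Two ways of bracketing a sum of three quasiroots.
left = combine(combine(cs[0], cs[1], K), cs[2], K)
right = combine(cs[0], combine(cs[1], cs[2], K), K)

num, den = closed_form(cs, K)
product_form = (prod(c + K for c in cs) - prod(c - K for c in cs)) / (2 * K)

print(left, right, num / den)   # all three values agree
print(den, product_form)        # denominator of (3.6) and the product form agree
```

For $K = 0$ the rule (3.5) reduces to $1/c(\bar\alpha + \bar\beta) = 1/c(\bar\alpha) + 1/c(\bar\beta)$, so for instance `combine(Fraction(1, 2), Fraction(1, 3), 0)` gives `Fraction(1, 5)`.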
Assumption. Let $(\bar\alpha_1, \ldots, \bar\alpha_k)$ be the $k$-tuple of simple quasiroots. In the following we will assume that the point $(c(\bar\alpha_1), \ldots, c(\bar\alpha_k)) \in \mathbb C^k$ does not lie on any of the subvarieties defined by the expressions in the denominator of (3.6). It will be convenient to include the conditions $c(\bar\alpha_i) \neq 0$.

Proposition 3.4. a) Given a $k$-tuple of positive numbers $(c_1, \ldots, c_k) := (c(\bar\alpha_1), \ldots, c(\bar\alpha_k))$ satisfying the Assumption, Eq. (3.6) uniquely defines numbers $c(\bar\alpha)$ for all positive quasiroots $\bar\alpha = \sum \bar\alpha_i$ such that the bivector $v = \sum c(\bar\alpha) E_\alpha \wedge E_{-\alpha}$ satisfies the condition $[\![v, v]\!] = K^2 \varphi_M$. b) When $K = 0$, the solution described in part a) defines a Poisson bracket on $M$ and there exists a linear form $\lambda \in \mathfrak h^*_{\Pi\setminus\Gamma}$ such that
\[ c(\bar\alpha) = \frac{1}{\lambda(\bar\alpha)} \tag{3.8} \]
for all quasiroots $\bar\alpha$.

Proof. a) Since by assumption the denominator is never zero, Eq. (3.6) defines consistently $c(\bar\beta)$ satisfying (3.5) for all positive $\bar\beta$.
b) When $K = 0$, (3.4) becomes $c(\bar\alpha + \bar\beta)(c(\bar\alpha) + c(\bar\beta)) = c(\bar\alpha) c(\bar\beta)$ and further Eq. (3.6) implies that $c(\bar\alpha) \neq 0$ for all quasiroots. So setting $\lambda(\bar\alpha) = 1/c(\bar\alpha)$, we find that Eq. (3.4) is equivalent to the equation $\lambda(\bar\alpha + \bar\beta) = \lambda(\bar\alpha) + \lambda(\bar\beta)$. Thus $\lambda$ is a linear functional, i.e., an element of $\mathfrak h^*_{\Pi\setminus\Gamma}$, which by construction must be nonzero on all quasiroots. □

Remark 3.2. a) This proposition shows that invariant brackets $v$ on $M$ such that $[\![v, v]\!] = K^2 \varphi_M$ form a $k$-dimensional manifold, $\mathcal X$, which equals $\mathbb C^k$ minus the subvarieties defined in the Assumption, where $k$ is the number of elements from $\Pi \setminus \Gamma$. Further, it is known that $k = \dim H^2(M)$, [Bo]. If $K$ is regarded as indeterminate, then $v$ forms a $k + 1$ dimensional manifold, $\mathcal Y = \mathbb C^k \times \mathbb C$. The submanifold $\mathcal Y_0$ corresponds to $K = 0$, i.e. consists of Poisson brackets. It is easy to see that all the Poisson brackets of the type $c(\bar\alpha) = 1/\lambda(\bar\alpha) \neq 0$ are nondegenerate. Since $\mathcal Y$ is connected, it follows that almost all brackets $v$ (except an algebraic subset in $\mathcal Y$ of lesser dimension) are nondegenerate as well.
b) If $v$ defines a Poisson bracket on $M$, then $M$ is a symplectic manifold and may be realized as an orbit in $\mathfrak g^*$ passing through the element $\lambda$ from (3.8) trivially extended to $\mathfrak g^*$, with the KKS bracket.

Now we fix a Poisson bracket $v = \sum (1/\lambda(\bar\alpha)) E_\alpha \wedge E_{-\alpha}$, where $\lambda$ is a fixed linear form, and describe the invariant brackets $f = \sum c(\bar\alpha) E_\alpha \wedge E_{-\alpha}$ which satisfy the conditions
\[ [\![f, f]\!] = K^2 \varphi_M \ \text{ for } K \neq 0, \qquad [\![f, v]\!] = 0. \tag{3.9} \]
An ordered pair of quasiroots $\bar\alpha, \bar\beta$ such that $\bar\alpha + \bar\beta$ is a quasiroot as well will be called an admissible pair. Substituting in (3.2) the coefficients of $v$ instead of $d(\alpha)$, we obtain that the condition $[\![f, v]\!] = 0$ is equivalent to the system of equations for the coefficients of $f$,
\[ c(\bar\alpha)\lambda(\bar\alpha)^2 + c(\bar\beta)\lambda(\bar\beta)^2 = c(\bar\alpha + \bar\beta)\lambda(\bar\alpha + \bar\beta)^2 \tag{3.10} \]
for all admissible pairs $\bar\alpha, \bar\beta$.
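The passage from (3.2) to (3.10), and likewise from (3.2) to (3.4), is a short computation that can be spot-checked numerically. In the sketch below (hypothetical names, exact rational arithmetic) the bracketed factor of (3.2) is evaluated without the constant $N_{\alpha,\beta}$, and the coefficients of $v$ are taken as $1/\lambda$ with $\lambda$ additive.

```python
from fractions import Fraction

def coefficient_32(c, d):
    """Bracketed factor of Eq. (3.2), without N_{alpha,beta}.
    c and d are triples of values at (alpha, beta, alpha + beta)."""
    ca, cb, cab = c
    da, db, dab = d
    return ca * (db - dab) + cb * (da - dab) - cab * (da + db)

def kks_coefficients(lam_a, lam_b):
    """Coefficients 1/lambda of v at alpha, beta, alpha + beta (lambda linear)."""
    return (1 / lam_a, 1 / lam_b, 1 / (lam_a + lam_b))

lam_a, lam_b = Fraction(2), Fraction(3)
c = (Fraction(7), Fraction(11), Fraction(13))  # arbitrary test values

# With d = 1/lambda the factor is proportional to the residual of (3.10).
lhs = coefficient_32(c, kks_coefficients(lam_a, lam_b))
residual_310 = c[0] * lam_a**2 + c[1] * lam_b**2 - c[2] * (lam_a + lam_b)**2
rhs = residual_310 / (lam_a * lam_b * (lam_a + lam_b))
print(lhs == rhs)

# With d = c the factor is 2 (c_a c_b - c_ab (c_a + c_b)); for r (all
# coefficients 1) it equals -2, so [[v, v]] = K^2 [[r, r]] reads as (3.4).
print(coefficient_32(c, c) == 2 * (c[0] * c[1] - c[2] * (c[0] + c[1])))
print(coefficient_32((1, 1, 1), (1, 1, 1)))
```

Since the proportionality factor $1/(\lambda(\bar\alpha)\lambda(\bar\beta)\lambda(\bar\alpha+\bar\beta))$ is nonzero, the vanishing of the factor in (3.2) is the same condition as (3.10).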
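On the other hand, the condition $[\![f, f]\!] = K^2 \varphi_M$ is equivalent to the system of Eqs. (3.4) for all admissible pairs of quasiroots. Substituting $c(\bar\alpha + \bar\beta)$ from (3.10) in (3.4) we obtain
\[ (c(\bar\alpha)\lambda(\bar\alpha)^2 + c(\bar\beta)\lambda(\bar\beta)^2)(c(\bar\alpha) + c(\bar\beta)) = c(\bar\alpha) c(\bar\beta)(\lambda(\bar\alpha) + \lambda(\bar\beta))^2 + K^2 \lambda(\bar\alpha + \bar\beta)^2. \]
Cancelling terms reduces this to $(c(\bar\alpha)\lambda(\bar\alpha) - c(\bar\beta)\lambda(\bar\beta))^2 = K^2 \lambda(\bar\alpha + \bar\beta)^2$, and extracting the square root, we obtain the equation
\[ c(\bar\beta)\lambda(\bar\beta) = c(\bar\alpha)\lambda(\bar\alpha) \pm K\lambda(\bar\alpha + \bar\beta). \tag{3.11} \]
Substituting $c(\bar\beta)\lambda(\bar\beta)$ from (3.11) in (3.10), we obtain
\[ c(\bar\alpha + \bar\beta)\lambda(\bar\alpha + \bar\beta) = c(\bar\alpha)\lambda(\bar\alpha) \pm K\lambda(\bar\beta). \tag{3.12} \]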
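As a sanity check, one can go in the opposite direction: define $c(\bar\beta)$ and $c(\bar\alpha + \bar\beta)$ by (3.11) and (3.12) with a common sign and confirm that (3.4) and (3.10) hold. The sketch below does this for one admissible pair with exact rationals; the function names and the sample values are hypothetical.

```python
from fractions import Fraction

def solve_pair(c_alpha, lam_alpha, lam_beta, K, sign=1):
    """Return (c(alpha), c(beta), c(alpha+beta)) with c(beta) taken from (3.11)
    and c(alpha+beta) from (3.12), using the same sign before K in both."""
    x = c_alpha * lam_alpha
    c_beta = (x + sign * K * (lam_alpha + lam_beta)) / lam_beta
    c_sum = (x + sign * K * lam_beta) / (lam_alpha + lam_beta)
    return c_alpha, c_beta, c_sum

def residual_34(ca, cb, cab, K):
    """Left side minus right side of Eq. (3.4)."""
    return cab * (ca + cb) - ca * cb - K * K

def residual_310(ca, cb, cab, la, lb):
    """Left side minus right side of Eq. (3.10)."""
    return ca * la**2 + cb * lb**2 - cab * (la + lb)**2

la, lb, K = Fraction(2), Fraction(5), Fraction(1, 2)
for sign in (1, -1):
    ca, cb, cab = solve_pair(Fraction(3), la, lb, K, sign)
    print(sign, residual_34(ca, cb, cab, K), residual_310(ca, cb, cab, la, lb))
```

Both residuals vanish for either sign, as long as the same sign is used in (3.11) and (3.12).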
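So, the conditions (3.9) on $f$ are equivalent to the system of Eqs. (3.11, 3.12) with the same sign before $K$ for all admissible pairs $\bar\alpha$ and $\bar\beta$ from $\overline\Omega{}^+_\Gamma$. We say that an ordered triple of positive quasiroots (not necessarily different) $\bar\alpha, \bar\beta, \bar\gamma \in \overline\Omega{}^+_\Gamma$ is an admissible triple, if $\bar\alpha + \bar\beta$, $\bar\beta + \bar\gamma$, and $\bar\alpha + \bar\beta + \bar\gamma$ are quasiroots, too.

Lemma 3.1. Let $\bar\alpha, \bar\beta, \bar\gamma$ be an admissible triple of quasiroots. If $c(\bar\alpha + \bar\beta)\lambda(\bar\alpha + \bar\beta) = c(\bar\alpha)\lambda(\bar\alpha) + K\lambda(\bar\beta)$, then $c(\bar\beta + \bar\gamma)\lambda(\bar\beta + \bar\gamma) = c(\bar\beta)\lambda(\bar\beta) + K\lambda(\bar\gamma)$, that is, the signs before $K$ in (3.12) for the admissible pairs $\bar\alpha, \bar\beta$ and $\bar\beta, \bar\gamma$ are the same.

Proof. The admissible pair $\bar\alpha, \bar\beta + \bar\gamma$ gives the equation
\[ c(\bar\alpha)\lambda(\bar\alpha) = c(\bar\beta + \bar\gamma)\lambda(\bar\beta + \bar\gamma) \pm K\lambda(\bar\alpha + \bar\beta + \bar\gamma). \]
The first equation given in the lemma implies that the $+$ sign appears in Eq. (3.11). These two equations imply
\[ c(\bar\beta)\lambda(\bar\beta) - K\lambda(\bar\alpha + \bar\beta) = c(\bar\beta + \bar\gamma)\lambda(\bar\beta + \bar\gamma) \pm K\lambda(\bar\alpha + \bar\beta + \bar\gamma). \]
Substituting for $c(\bar\beta + \bar\gamma)\lambda(\bar\beta + \bar\gamma)$ using (3.12), we get
\[ c(\bar\beta)\lambda(\bar\beta) - K\lambda(\bar\alpha + \bar\beta) = c(\bar\beta)\lambda(\bar\beta) \pm K\lambda(\bar\gamma) \pm K\lambda(\bar\alpha + \bar\beta + \bar\gamma), \]
where the last two $\pm$ are independent. However if
\[ c(\bar\beta + \bar\gamma)\lambda(\bar\beta + \bar\gamma) = c(\bar\beta)\lambda(\bar\beta) - K\lambda(\bar\gamma), \]
we have a contradiction, since either the sign in front of $K\lambda(\bar\alpha + \bar\beta + \bar\gamma)$ is positive and $-K\lambda(\bar\alpha + \bar\beta) = +K\lambda(\bar\alpha + \bar\beta)$, so $0 = \lambda(\bar\alpha + \bar\beta)$, or the sign in front of $K\lambda(\bar\alpha + \bar\beta + \bar\gamma)$ is negative, implying $0 = \lambda(\bar\gamma)$. We conclude that the sign in front of $K\lambda(\bar\gamma)$ must be positive. □

In the situation of the lemma we can express $c(\bar\gamma)$ in terms of $c(\bar\alpha)$ as
\[ c(\bar\gamma)\lambda(\bar\gamma) = c(\bar\alpha)\lambda(\bar\alpha) + K(\lambda(\bar\alpha + \bar\beta) + \lambda(\bar\beta + \bar\gamma)). \tag{3.13} \]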
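To see how (3.11), (3.12) and (3.13) fit together, it may help to build a solution on a whole chain of quasiroots. The sketch below assumes a quasiroot system of type $A_k$, with positive quasiroots indexed by intervals $[i, j]$ of simple quasiroots, and uses the plus sign before $K$ for pairs ordered left to right along the chain. These conventions and all names are assumptions made for illustration, not part of the text.

```python
from fractions import Fraction

def chain_solution(lams, c1, K):
    """Coefficients c on all positive quasiroots [i..j] of a chain of type A_k.
    lams: values of lambda on the simple quasiroots; c1: c on the first one."""
    k = len(lams)

    def lam(i, j):
        return sum(lams[i:j + 1])

    x = {(0, 0): c1 * lams[0]}          # x = c * lambda
    for i in range(1, k):               # (3.11) for adjacent simple quasiroots
        x[(i, i)] = x[(i - 1, i - 1)] + K * lam(i - 1, i)
    for i in range(k):                  # (3.12): append one simple quasiroot
        for j in range(i + 1, k):
            x[(i, j)] = x[(i, j - 1)] + K * lams[j]
    return {ij: x[ij] / lam(*ij) for ij in x}

def check_chain(lams, c1, K):
    """Verify (3.4) and (3.10) for every splitting [i..m] = [i..j] + [j+1..m]."""
    c = chain_solution(lams, c1, K)
    k = len(lams)
    for i in range(k):
        for j in range(i, k - 1):
            for m in range(j + 1, k):
                ca, cb, cab = c[(i, j)], c[(j + 1, m)], c[(i, m)]
                la, lb = sum(lams[i:j + 1]), sum(lams[j + 1:m + 1])
                if cab * (ca + cb) != ca * cb + K * K:
                    return False
                if ca * la**2 + cb * lb**2 != cab * (la + lb)**2:
                    return False
    return True

lams = [Fraction(1), Fraction(2), Fraction(4)]
print(check_chain(lams, Fraction(3), Fraction(1, 2)))
```

In this construction the relation (3.13) for three adjacent simple quasiroots holds automatically, since it is the sum of two consecutive instances of (3.11).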
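Now, we consider the pair $(M, \lambda)$ as an orbit in $\mathfrak g^*$ passing through $\lambda$ with the KKS Poisson bracket
\[ v = v_\lambda = \sum_{\alpha \in \Omega^+ \setminus \Omega_\Gamma} \frac{1}{\lambda(\bar\alpha)}\, E_\alpha \wedge E_{-\alpha}. \]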
Definition 3.1. We call $M$ a good orbit, if there exists on $M$ an invariant bracket $f = \sum c(\bar\alpha) E_\alpha \wedge E_{-\alpha}$ satisfying the conditions (3.9). Recall the discussion in the introduction where we explained that conditions (3.9) are necessary for the existence of a 2-parameter quantization.

Proposition 3.5. The good semisimple orbits are as follows: a) For $\mathfrak g$ of type $A_n$ all semisimple orbits are good. b) For all other $\mathfrak g$, the orbit $M$ is good if and only if the set $\Pi \setminus \Gamma$ consists of one or two roots which appear in the representation of the maximal root with coefficient 1.

Proof. a) In this case the system of quasiroots $\overline\Omega_\Gamma$ looks like a system of roots of type $A_k$ for $k$ being equal to the number of elements of $\Pi \setminus \Gamma$. So, the simple quasiroots can be ordered in a sequence $\bar\beta_1, \ldots, \bar\beta_k$ in such a way that all subsequences consisting of three adjacent elements are admissible. Pick an arbitrary value for $c(\bar\beta_1)$ and a sign before $K$ in (3.11) for the pair $\bar\beta_1$ and $\bar\beta_2$. Then, due to Lemma 3.1, consistency of system (3.11, 3.12) implies that the sign before $K$ is the same for all adjacent pairs $\bar\beta_i$ and $\bar\beta_{i+1}$. Using Eqs. (3.11) and (3.12) for a fixed sign before $K$ and induction on $\mathrm{ht}(\bar\alpha)$, we find all the coefficients $c(\bar\alpha)$ of $f$ and see that the system (3.11, 3.12) is consistent.

b) Let $\mathfrak g$ be of type $B_n$ or $C_n$. Then the maximal root has the form $\alpha_1 + 2\alpha_2 + \cdots + 2\alpha_n$, where $\alpha_i \in \Pi$. Denote $\beta = \alpha_2 + \cdots + \alpha_n$, which is a root. If $\Pi \setminus \Gamma$ does not contain $\alpha_1$, i.e. $\bar\alpha_1 = 0$, then $\overline\Omega{}^+_\Gamma$ contains the admissible pair $\bar\beta, \bar\beta$. So, from (3.11) it follows that $c(\bar\beta)\lambda(\bar\beta) = c(\bar\beta)\lambda(\bar\beta) \pm 2K\lambda(\bar\beta)$, i.e. $\lambda(\bar\beta) = 0$, which is impossible. Assume that $\Pi \setminus \Gamma$ contains $\alpha_1$ and some roots $\alpha_i$ for $i > 1$. Then both $\bar\alpha_1$ and $\bar\beta$ are not equal to zero, and $\overline\Omega{}^+_\Gamma$ contains the admissible triple $\bar\beta, \bar\alpha_1, \bar\beta$. It follows from (3.13) that $c(\bar\beta)\lambda(\bar\beta) = c(\bar\beta)\lambda(\bar\beta) + 2K\lambda(\bar\beta + \bar\alpha_1)$, i.e. $\lambda(\bar\beta + \bar\alpha_1) = 0$, which is impossible as well, because $\bar\beta + \bar\alpha_1$ is a quasiroot. So, for consistency of system (3.11, 3.12) in cases $B_n$ and $C_n$, the set $\Gamma$ has to contain all the roots $\alpha_i$, $i > 1$. But in the latter case the system is trivially consistent, because in that case the set of quasiroots looks like $A_1$. The homogeneous space $G/G_\Gamma$ is a symmetric space.

Consider the case $D_n$. The maximal root has the form $\alpha_1 + \alpha_2 + \alpha_3 + 2\alpha_4 + \cdots + 2\alpha_n$. Denote $\beta = \alpha_4 + \cdots + \alpha_n$, which is a root. Consider several cases. The cases when two of $\bar\alpha_i$, $i \le 3$, are equal to zero and $\bar\beta$ is not equal to zero lead to an inconsistency in the system (3.11, 3.12) in the same way as in the cases $B_n$ and $C_n$ considered above. Assume that two of $\bar\alpha_i$, $i \le 3$, say $\bar\alpha_1, \bar\alpha_3$, and $\bar\beta$ are not equal to zero. Then the sequence
\[ \bar\alpha_1,\ \bar\alpha_2 + \bar\beta,\ \bar\alpha_3,\ \bar\alpha_1 + \bar\beta \tag{3.14} \]
is a sequence of four nonzero quasiroots. It is easy to see that the subsequences $\bar\alpha_1, \bar\alpha_2 + \bar\beta, \bar\alpha_3$ and $\bar\alpha_2 + \bar\beta, \bar\alpha_3, \bar\alpha_1 + \bar\beta$ form admissible triples in $\overline\Omega{}^+_\Gamma$. From Lemma 3.1 it follows that the sign before $K$ must be the same in (3.11) for all adjacent pairs. Taking, for example, the sign plus and applying (3.13) to the second triple, we obtain the equation
\[ c(\bar\alpha_1 + \bar\beta)\lambda(\bar\alpha_1 + \bar\beta) = c(\bar\alpha_2 + \bar\beta)\lambda(\bar\alpha_2 + \bar\beta) + K(\lambda(\bar\alpha_2 + \bar\beta + \bar\alpha_3) + \lambda(\bar\alpha_3 + \bar\alpha_1 + \bar\beta)). \tag{3.15} \]
From (3.11) applied to the pair $\bar\alpha_1, \bar\alpha_2 + \bar\beta$ we obtain
\[ c(\bar\alpha_2 + \bar\beta)\lambda(\bar\alpha_2 + \bar\beta) = c(\bar\alpha_1)\lambda(\bar\alpha_1) + K\lambda(\bar\alpha_1 + \bar\alpha_2 + \bar\beta). \tag{3.16} \]
Substituting $c(\bar\alpha_2 + \bar\beta)\lambda(\bar\alpha_2 + \bar\beta)$ from (3.16) in (3.15) and taking into account linearity of $\lambda$, we obtain the equality
\[ c(\bar\alpha_1 + \bar\beta)\lambda(\bar\alpha_1 + \bar\beta) = c(\bar\alpha_1)\lambda(\bar\alpha_1) + 2K\lambda(\bar\alpha_1 + \bar\alpha_2 + \bar\alpha_3 + \bar\beta) + K\lambda(\bar\beta). \tag{3.17} \]
On the other hand, expressing $c(\bar\alpha_1 + \bar\beta)$ in terms of $c(\bar\alpha_1)$ from (3.12), we obtain
\[ c(\bar\alpha_1 + \bar\beta)\lambda(\bar\alpha_1 + \bar\beta) = c(\bar\alpha_1)\lambda(\bar\alpha_1) \pm K\lambda(\bar\beta). \tag{3.18} \]
Now, comparing (3.17) and (3.18), we see that if we take plus before $K$ in (3.18), then $\lambda(\bar\alpha_1 + \bar\alpha_2 + \bar\alpha_3 + \bar\beta) = 0$, and if we take minus, then $\lambda(\bar\alpha_1 + \bar\alpha_2 + \bar\alpha_3 + 2\bar\beta) = 0$. But both of the cases are impossible, since $\lambda$ is not equal to zero on quasiroots.

Next, assume that $\bar\beta = 0$ but $\bar\alpha_i \neq 0$ for $i = 1, 2, 3$. In this case the sequence (3.14) becomes the sequence $\bar\alpha_1, \bar\alpha_2, \bar\alpha_3, \bar\alpha_1$. Using the above arguments, we obtain that in this case $\lambda(\bar\alpha_1 + \bar\alpha_2 + \bar\alpha_3) = 0$ must be true, which is impossible, since $\bar\alpha_1 + \bar\alpha_2 + \bar\alpha_3$ is a quasiroot. In the case when $\Pi \setminus \Gamma$ contains only one or two roots of $\alpha_i$, $i = 1, 2, 3$, system (3.11, 3.12) is consistent, because in these cases the set of quasiroots $\overline\Omega_\Gamma$ looks like the system of roots of type $A_1$ or $A_2$. So, the proposition is proved for the classical $\mathfrak g$. For the exceptional $\mathfrak g$ and a semisimple orbit which is not a symmetric space the system of quasiroots reduces to one of the cases which we have just excluded for $\mathfrak g$ equal to $B_n$, $C_n$ or $D_n$. □

Remark 3.3. a) Note that Proposition 3.5 may be reformulated in the following way: An orbit $M$ is good if and only if the corresponding system of quasiroots $\overline\Omega_\Gamma$ is isomorphic to a system of roots of type $A_k$. We say in this case that $M$ is of type $A_k$. Orbits of type $A_1$ are exactly the orbits which are symmetric spaces. For such orbits $\varphi_M = 0$, and we may take $f = 0$. Symmetric orbits exist for all classical $\mathfrak g$ and also for $\mathfrak g$ of types $E_6$, $E_7$. Orbits of type $A_2$ exist for $\mathfrak g$ of type $A_n$, $D_n$, and $E_6$. Orbits of type $A_k$, $k > 2$, exist only in case $\mathfrak g = sl(n)$. Moreover, in this case all semisimple orbits have the type $A_k$ for $k \ge 1$.

b) From the proof of Proposition 3.5 it follows that the bracket $f$ satisfying (3.9) is defined on good orbits by the value of its coefficient $c(\alpha)$ for a fixed simple root and the choice of a sign before $K$. On the other hand, if a fixed $f_0$ satisfies (3.9), then the family $\pm f_0 + sv$ for arbitrary numbers $s$ also satisfies these conditions. So, this family consists of all invariant brackets satisfying (3.9). Almost all brackets from this family (except a finite number) are nondegenerate, since $v$ is nondegenerate and, therefore, for large $s_0$ the bracket $\pm f_0 + s_0 v$ is nondegenerate as well. For symmetric orbits $\Pi \setminus \Gamma$ consists of one element, there is one quasiroot and $f_0$ is a multiple of $v$. Note that in [GP] a classification of all orbits in the coadjoint representation (not necessarily semisimple) is given for which $\varphi_M = 0$. In particular, if we take $K = 1$ and find $f_0$ such that $[\![f_0, f_0]\!] = \varphi_M$, then the family $\pm i f_0 + sv$ gives all the brackets $f$ satisfying $[\![f, f]\!] = -\varphi_M$ and compatible with the KKS bracket on $M$. Note that if $f$ is a bracket satisfying $[\![f, f]\!] = -\varphi_M$ and $\{\cdot, \cdot\}_r$ is the r-matrix bracket (2.8), then $f \pm \{\cdot, \cdot\}_r$ is a Poisson bracket on $M$ compatible with the KKS bracket.
4. Cohomologies Defined by Invariant Brackets In the next section we prove the existence of a Uh (g) invariant quantization of the Poisson brackets described above using the methods of [DS1]. This requires us to consider the 3-cohomology of the complex (3• (g/g0 ))g0 = (3• m)g0 of g0 invariants with differential given by the Schouten bracket with the bivector v ∈ (32 m)g0 from Proposition 3.4 a), for u ∈ (3• m)g0 . δv : u 7 → [[v, u]] The condition δv2 = 0 follows from the Jacobi identity for the Schouten bracket together with the fact that [[v, v]] = K 2 ϕM . The latter equation is equivalent to [[v, v]] = ϕ modulo g0 ∧ g ∧ g, hence [[v, v]] is invariant modulo g0 ∧ g ∧ g. Denote these cohomologies by H k (M, δv ), whereas the usual de Rham cohomologies are denoted by H k (M). Recall, Remark 3.2 a), that the brackets v satisfying [[v, v]] = K 2 ϕ form a connected manifold Y = X × C. Proposition 4.1. For almost all v ∈ Y (except an algebraic subset of lesser dimension) one has H k (M, δv ) = H k (M) for all k. In particular, H k (M, δv ) = 0 for odd k. Proof. First, let v be a Poisson bracket, i.e. v ∈ Y0 . Then the complex of polyvector fields on M, 2• , with the differential δv is well defined. Denote by • the de Rham complex on M. Since none of the coefficients c(α) ¯ of v are zero, v is a nondegenerate bivector field, and therefore it defines an A-linear isomorphism v˜ : 1 → 21 , ω 7 → v(ω, ·), which can be extended up to the isomorphism v˜ : k → 2k of k-forms onto k-vector fields for all k. Using the Jacobi identity for v and invariance of v, one can show that v˜ gives a G invariant isomorphism of these complexes, so their cohomologies are the same. Since g is simple, the subcomplex of g invariants, (• )g , splits off as a subcomplex of • . In addition, g acts trivially on cohomologies, since for any g ∈ G the map X → X, x 7 → gx, is homotopic to the identity map. (G is assumed connected.) It follows that cohomologies of complexes (• )g and • coincide. 
But $\tilde v$ gives an isomorphism of complexes $(\Omega^\bullet)^{\mathfrak g}$ and $(\Theta^\bullet)^{\mathfrak g} = ((\Lambda^\bullet\mathfrak m)^{\mathfrak g_\Gamma},\delta_v)$. So, cohomologies of the latter complex coincide with de Rham cohomologies, which proves the proposition when $v$ is a Poisson bracket. Now, consider the family of complexes $((\Lambda^\bullet\mathfrak m)^{\mathfrak g_\Gamma},\delta_v)$, $v\in Y$. It is clear that $\delta_v$ depends algebraically on $v$. It follows from the upper semicontinuity of $\dim H^k(M,\delta_v)$ and the fact that $H^k(M)=0$ for odd $k$, [Bo], that $H^k(M,\delta_v)=0$ for odd $k$ and almost all $v\in Y$. Using the upper semicontinuity again and the fact that the number $\sum_k(-1)^k\dim H^k(M,\delta_v)$ is the same for all $v\in Y$, we conclude that $\dim H^k(M,\delta_v)=\dim H^k(M)$ for even $k$ and almost all $v$. □

Remark 4.1. Call $v\in Y$ admissible, if it satisfies Proposition 4.1. From the proof of the proposition it follows that the subset $D$ such that $Y\setminus D$ consists of admissible brackets does not intersect with the subset $Y_0$ consisting of Poisson brackets. Let $M$ be a good orbit and $f_0+sv$ the family from Remark 3.3 b) satisfying (3.9) for a fixed $K$. Then for almost all numbers $s$ this bracket is admissible. Indeed, this family is
Double Quantization on Some Orbits of Simple Lie Groups
contained in the two parameter family $tf_0+sv$. For $t=0$, $s\neq0$ we obtain admissible brackets. So, there exist $t_0\neq0$ and $s_0$ such that the bracket $t_0f_0+s_0v$ is admissible. It follows that the bracket $f_0+(s_0/t_0)v$ is admissible, too. So, in the family $f_0+sv$ there is an admissible bracket, and we conclude that almost all brackets in this family (except a finite number) are admissible.

Question. Is it true that the set of admissible brackets contains all the nondegenerate brackets?

For the proof of existence of two parameter quantization for the cases $D_n$ and $E_6$ in the next section we will use the following result on invariant three-vector fields. Denote by $\theta$ the Cartan automorphism of $\mathfrak g$.

Lemma 4.1. For either $D_n$ or $E_6$ and one of the subsets, $\Gamma$, of simple roots such that $G_\Gamma$ defines a good orbit, any $\mathfrak g_\Gamma$ and $\theta$ invariant element $v$ in $\Lambda^3\mathfrak m$ is a multiple of $\varphi_M$, that is, $(\Lambda^3\mathfrak m)^{\mathfrak g_\Gamma,\theta} = \langle\varphi_M\rangle$.

Proof. Let $\mathfrak g$ be a simple Lie algebra of type $D_n$ or $E_6$ and $\{\alpha_1,\dots,\alpha_n\}$ a system of simple roots. Changing notation slightly from Sect. 3, we assume that for $\mathfrak g=D_n$, $(\alpha_i,\alpha_{i+1})=-1$ for $i=1,\dots,n-2$, $(\alpha_{n-2},\alpha_n)=-1$, with all other inner products of distinct simple roots being zero, and for $\mathfrak g=E_6$, the non-zero products are $(\alpha_i,\alpha_{i+1})=-1$ for $i=1,2,3,4$ and $(\alpha_3,\alpha_6)=-1$. For $\mathfrak g=D_n$, $\Gamma$ is one of the subsets of simple roots, $\Gamma_1=\{\alpha_1,\dots,\alpha_{n-2}\}$, $\Gamma_2=\{\alpha_2,\dots,\alpha_{n-1}\}$, or $\Gamma_3=\{\alpha_2,\dots,\alpha_{n-2},\alpha_n\}$. For $\mathfrak g=E_6$, $\Gamma=\{\alpha_2,\alpha_3,\alpha_4,\alpha_6\}$. The positive quasiroots consist of three elements, $\bar\alpha$, $\bar\alpha'$, and $\bar\alpha+\bar\alpha'$. Since a $\theta$ invariant element has the form $w+\theta w$ for $w\in\mathfrak m_{\bar\alpha}\otimes\mathfrak m_{\bar\alpha'}\otimes\mathfrak m_{-(\bar\alpha+\bar\alpha')}$, it is sufficient to show that the space of invariants in $\mathfrak m_{\bar\alpha}\otimes\mathfrak m_{\bar\alpha'}\otimes\mathfrak m_{-(\bar\alpha+\bar\alpha')}$ has dimension one. We know from Remark 3.1 that the subspaces $\mathfrak m_{\bar\alpha}$, $\mathfrak m_{\bar\alpha'}$, $\mathfrak m_{\bar\alpha+\bar\alpha'}$ are irreducible representations of $\mathfrak g_\Gamma$ and that $\mathfrak m_{\bar\alpha}$ and $\mathfrak m_{-\bar\alpha}$ are dual.
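Therefore the dimension of the space of invariants in $\mathfrak m_{\bar\alpha}\otimes\mathfrak m_{\bar\alpha'}\otimes\mathfrak m_{-(\bar\alpha+\bar\alpha')}$ is the multiplicity of the representation $\mathfrak m_{\bar\alpha+\bar\alpha'}$ in the tensor product $\mathfrak m_{\bar\alpha}\otimes\mathfrak m_{\bar\alpha'}$.

For $D_n$ and any of $\Gamma_i$ the algebra $\mathfrak g_\Gamma\cong A_{n-2}$. For $\Gamma_1$, $\alpha=\alpha_{n-1}$ and $\alpha'=\alpha_n$, the representations $\mathfrak m_{\bar\alpha}$ and $\mathfrak m_{\bar\alpha'}$ are both isomorphic to the dual vector representation for $A_{n-2}$, that is, the contragredient representation to the representation for the fundamental weight $\lambda_{n-2}$,

$$\mathfrak m_{\bar\alpha_{n-1}}\cong\mathfrak m_{\bar\alpha_n}\cong(V^{\lambda_{n-2}})^*\cong V^{\lambda_1}.$$

To see that this is so, note first of all that $\mathfrak m_{\bar\alpha_{n-1}}$ is a lowest weight representation because it has a cyclic vector $E_{\alpha_{n-1}}$ and all negative simple root vectors of $\mathfrak g_\Gamma$ annihilate $E_{\alpha_{n-1}}$. The corresponding weight of $A_{n-2}$ is $-\lambda_{n-2}$ because $(\alpha_{n-1},\alpha_j) = -(\lambda_{n-2},\alpha_j)$ if $1\le j\le n-2$. The irreducible lowest weight representation with lowest weight $-\lambda_{n-2}$ is $(V^{\lambda_{n-2}})^*$, and for $A_{n-2}$, $(V^{\lambda_j})^*\cong V^{\lambda_{n-1-j}}$. Since the subspaces $\mathfrak m_{\bar\alpha}$ and $\mathfrak m_{\bar\alpha'}$ of $\mathfrak g$ have zero intersection, the wedge product $\mathfrak m_{\bar\alpha}\wedge\mathfrak m_{\bar\alpha'}$ projects isomorphically onto the tensor product, which contains the representation

$$\mathfrak m_{\bar\alpha+\bar\alpha'}\cong(V^{2\lambda_{n-2}})^*\cong V^{2\lambda_1}$$

with multiplicity one. In the cases $\Gamma_2$ or $\Gamma_3$ the representation $\mathfrak m_{\bar\alpha_1}$ is the contragredient representation to the vector representation, $(V^{\lambda_1})^*\cong V^{\lambda_{n-2}}$. The representations $\mathfrak m_{\alpha'}=\mathfrak m_{\alpha_n}$ for the case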
$\Gamma_2$ and $\mathfrak m_{\alpha'}=\mathfrak m_{\alpha_{n-1}}$ for $\Gamma_3$ are $(V^{\lambda_{n-3}})^*\cong V^{\lambda_2}$. From the elementary representation theory of the Lie algebra $sl(n-1)$ (type $A_{n-2}$) we see that the tensor product $\mathfrak m_{\bar\alpha}\otimes\mathfrak m_{\bar\alpha'}$ contains $\mathfrak m_{\bar\alpha+\bar\alpha'}$ with multiplicity one.

In the case of $E_6$, $\mathfrak g_\Gamma\cong D_4\cong so(8)$. The representations $\mathfrak m_{\bar\alpha}$, $\mathfrak m_{\bar\alpha'}$ and $\mathfrak m_{-(\bar\alpha+\bar\alpha')}$ are the three inequivalent irreducible eight dimensional representations, the two spinor representations and the vector representation. It is well known that there is a one dimensional space of invariants in the tensor product $\mathfrak m_{\bar\alpha}\otimes\mathfrak m_{\bar\alpha'}\otimes\mathfrak m_{\bar\alpha+\bar\alpha'}$. □

The proof of the lemma gives a positive answer to the Question from Remark 3.1 for the particular case. Note that in case of symmetric orbits the space of invariant three-vector fields is equal to zero.

5. $U_h(\mathfrak g)$ Invariant Quantizations in One and Two Parameters

In this section we prove the existence of two types of $U_h(\mathfrak g)$ invariant quantization of the function algebra $A$ on $M=G/G_\Gamma$. The first is a one parameter quantization

$$\mu_h(a,b)=\sum_{n\ge0}h^n\mu_n(a,b)=ab+\sum_{n\ge1}h^n\mu_n(a,b),\qquad \mu_1(a,b)=\tfrac12\bigl(f(a,b)-\{a,b\}_r\bigr),$$

as described in Sect. 2, where $f$ is one of the invariant brackets found above, $f\in(\Lambda^2\mathfrak m)^{\mathfrak g}$, $[[f,f]]=-\varphi_M$. The second is a two parameter quantization

$$\mu_{t,h}(a,b)=ab+h\mu_1(a,b)+t\mu_1'(a,b)+\sum_{k,l\ge1}h^kt^l\mu_{k,l}(a,b).$$
Recall that in this case there are two compatible Poisson brackets corresponding to such a quantization: the bracket $\mu_1(a,b)-\mu_1(b,a)$ is skew-symmetric of the form (2.10) and $\mu_1'(a,b)-\mu_1'(b,a)$ is a $U(\mathfrak g)$ invariant bracket $v(a,b)$, which we will assume to be the KKS bracket defined by identifying $G/G_\Gamma$ with an orbit of the coadjoint representation.

We remind the reader of the method in [DS1]. The first step is to construct a $U(\mathfrak g)$ invariant quantization in the category $\mathcal C(U(\mathfrak g)[[h]],\Delta,\Phi_h)$. Then we use the equivalence given by the pair $(\mathrm{Id},F_h)$ between the monoidal categories $\mathcal C(U(\mathfrak g)[[h]],\Delta,\Phi_h)$ and $\mathcal C(U(\mathfrak g)[[h]],\tilde\Delta,\mathbf 1)$ to define a $U_h(\mathfrak g)$ invariant quantization, either $\mu_hF_h^{-1}$ in the one parameter case or $\mu_{t,h}F_h^{-1}$ in the two parameter case (see Sect. 2). In the first step we used the fact that $(\Lambda^3\mathfrak m)^{\mathfrak g_\Gamma}=0$ for symmetric spaces. In the examples considered in this paper, $(\Lambda^3\mathfrak m)^{\mathfrak g_\Gamma}$ does not necessarily vanish, and we modify the proof using a method from [DS2] (see also [NV]).

In the case of $A_n$ any semisimple orbit is a good orbit and a different method is required (see [Do] where the existence of two parameter quantization for maximal orbits is proven). For the cases $B_n$ and $C_n$, the only good orbits are symmetric spaces and the quantization was dealt with in [DS1]. In the remaining cases $\mathfrak g=D_n$ or $E_6$, we proved in Lemma 4.1 that $(\Lambda^3\mathfrak m)^{\mathfrak g_\Gamma,\theta}=\langle\varphi_M\rangle$, and a suitable modification of the proof still applies.
Theorem 5.1. For almost all (in sense of Proposition 4.1) $\mathfrak g$ invariant brackets satisfying $[[f,f]]=-\varphi_M$, there exists a multiplication $\mu_h$ on $A$,

$$\mu_h(a,b)=ab+(h/2)f(a,b)+\sum_{n\ge2}h^n\mu_n(a,b),$$

which is $\mathfrak g$ invariant (Eq. (2.5)) and $\Phi$ associative (Eq. (2.6)).

Proof. To begin, consider the multiplication $\mu^{(1)}(a,b)=ab+(h/2)f(a,b)$. The corresponding obstruction cocycle is given by

$$\mathrm{obs}_2=\frac{1}{h^2}\bigl(\mu^{(1)}(\mu^{(1)}\otimes\mathrm{id})-\mu^{(1)}(\mathrm{id}\otimes\mu^{(1)})\Phi\bigr)$$

considered modulo terms of order $h$. No $h^1$ terms appear because $f$ is a biderivation and, therefore, a Hochschild cocycle. The fact that the presence of $\Phi$ does not interfere with the cocycle condition and that this equation defines a Hochschild 3-cocycle was demonstrated in the proof of Proposition 4 in [DS1]. It is well known that if we restrict to the subcomplex of cochains given by differential operators, the differential Hochschild cohomology of $A$ in dimension $p$ is the space of polyvector fields on $M$. Since $\mathfrak g$ is reductive, the subspace of $\mathfrak g$ invariants splits off as a subcomplex and has cohomology given by $(\Lambda^p\mathfrak m)^{\mathfrak g_\Gamma}$. The complete antisymmetrization of a $p$-tensor projects the space of invariant differential $p$-cocycles onto the subspace $(\Lambda^p\mathfrak m)^{\mathfrak g_\Gamma}$ representing the cohomology. The equation $[[f,f]]+\varphi_M=0$ implies that the obstruction cocycle is a coboundary, and we can find a 2-cochain $\mu_2$ so that $\mu^{(2)}=\mu^{(1)}+h^2\mu_2$ satisfies

$$\mu^{(2)}(\mu^{(2)}\otimes\mathrm{id})-\mu^{(2)}(\mathrm{id}\otimes\mu^{(2)})\Phi=0 \mod h^2.$$

Assume we have defined the deformation $\mu^{(n)}$ to order $h^n$ such that $\Phi$ associativity holds modulo $h^n$, then we define the $(n+1)$st obstruction cocycle by

$$\mathrm{obs}_{n+1}=\frac{1}{h^{n+1}}\bigl(\mu^{(n)}(\mu^{(n)}\otimes\mathrm{id})-\mu^{(n)}(\mathrm{id}\otimes\mu^{(n)})\Phi\bigr) \mod h.$$

In [DS1] (Proposition 4) we showed that the usual proof that the obstruction cochain satisfies the cocycle condition carries through to the $\Phi$ associative case. The coboundary of $\mathrm{obs}_{n+1}$ appears as the $h^{n+1}$ coefficient of the signed sum of the compositions of $\mu^{(n+1)}$ with $\mathrm{obs}_{n+1}$. The fact that $\Phi=1$ mod $h^2$ together with the pentagon identity implies that the sum vanishes identically, and thus all coefficients vanish, including the coboundary in question. Let $\mathrm{obs}'_{n+1}\in(\Lambda^3\mathfrak m)^{\mathfrak g_\Gamma}$ be the projection of $\mathrm{obs}_{n+1}$ on the totally skew symmetric part, which represents the cohomology class of the obstruction cocycle. The coefficient of $h^{n+2}$ in the same signed sum, when projected on the skew symmetric part, is $[[f,\mathrm{obs}'_{n+1}]]$, which is the coboundary of $\mathrm{obs}'_{n+1}$ in the complex $((\Lambda^\bullet\mathfrak m)^{\mathfrak g_\Gamma},\delta_f=[[f,.]])$. Thus $\mathrm{obs}'_{n+1}$ is a $\delta_f$ cocycle. We have shown in Sect. 4 that this complex has zero cohomology. Now we modify $\mu^{(n+1)}$ by adding a term $h^n\mu_n$ with $\mu_n\in(\Lambda^2\mathfrak m)^{\mathfrak g_\Gamma}$ and consider the $(n+1)$st obstruction cocycle for $\mu'^{(n+1)}=\mu^{(n+1)}+h^n\mu_n$. Since the term we added at degree $h^n$ is a Hochschild cocycle, we do not introduce an $h^n$ term in the calculation of $\mu^{(n)}(\mu^{(n)}\otimes\mathrm{id})-\mu^{(n)}(\mathrm{id}\otimes\mu^{(n)})\Phi$, and the totally skew symmetric projection of the $h^{n+1}$ term has been modified by $[[f,\mu_n]]$. By choosing $\mu_n$ appropriately we can make the $(n+1)$st obstruction cocycle represent the zero cohomology class, and we are able to continue the recursive construction of the desired deformation. □
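As a minimal sketch of why no $h^1$ term appears in the first step of the proof, one can expand the associator of $\mu^{(1)}(a,b)=ab+(h/2)f(a,b)$ directly. The sketch assumes only what the proof states: $A$ is a commutative function algebra, $f$ is a biderivation, and $\Phi=1$ mod $h^2$, so that $\Phi$ does not contribute to the coefficient of $h$.

```latex
\begin{align*}
\mu^{(1)}(\mu^{(1)}(a,b),c)-\mu^{(1)}(a,\mu^{(1)}(b,c))
  &= \frac{h}{2}\bigl(f(ab,c)+f(a,b)\,c-f(a,bc)-a\,f(b,c)\bigr)+O(h^2),\\
f(ab,c) &= a\,f(b,c)+f(a,c)\,b, \qquad f(a,bc)=f(a,b)\,c+b\,f(a,c),\\
f(ab,c)+f(a,b)\,c-f(a,bc)-a\,f(b,c) &= f(a,c)\,b-b\,f(a,c)=0 .
\end{align*}
```

The bracket in the first line is the Hochschild coboundary of $f$ up to sign, so the computation is just the statement that a biderivation on a commutative algebra is a Hochschild 2-cocycle. The first genuine obstruction therefore sits at order $h^2$.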
Now we prove the existence of a two parameter deformation for good orbits in the cases $D_n$ and $E_6$.

Theorem 5.2. Given a pair of $\mathfrak g$ invariant brackets, $f$, $v$, on a good orbit in $D_n$ or $E_6$ satisfying $[[f,f]]=-\varphi_M$, $[[f,v]]=[[v,v]]=0$, there exists a multiplication $\mu_{t,h}$ on $A$,

$$\mu_{t,h}(a,b)=ab+(h/2)f(a,b)+(t/2)v(a,b)+\sum_{k,l\ge1}h^kt^l\mu_{k,l}(a,b),$$

which is $\mathfrak g$ invariant (Eq. (2.5)) and $\Phi$ associative (Eq. (2.6)).

Proof. The proof of the existence of a multiplication which is $\Phi$ associative up to and including $h^2$ terms is nearly identical to the previous proof. Both $f$ and $v$ are anti-invariant under the Cartan involution $\theta$. We shall look for a multiplication $\mu_{t,h}$ such that $\mu_{k,l}$ is $\theta$ anti-invariant and skew-symmetric for odd $k+l$ and $\theta$ invariant and symmetric for even $k+l$. So suppose we have a multiplication defined to order $n$,

$$\mu_{t,h}(a,b)=ab+h\mu_1(a,b)+t\mu_1'(a,b)+\sum_{k+l\le n}h^kt^l\mu_{k,l}(a,b),$$

with the above mentioned invariance properties and $\Phi$ associative to order $h^n$. Using properties c) and d) for $\Phi$ from Proposition 2.1, direct computation shows that the obstruction cochain,

$$\mathrm{obs}_{n+1}=\sum_{k=0,\dots,n+1}h^kt^{n+1-k}\beta_k,$$

has the following properties: $\mathrm{obs}_{n+1}$ is $\theta$ invariant and $\mathrm{obs}_{n+1}(a,b,c)=-\mathrm{obs}_{n+1}(c,b,a)$ for odd $n$, and $\mathrm{obs}_{n+1}$ is $\theta$ anti-invariant and $\mathrm{obs}_{n+1}(a,b,c)=\mathrm{obs}_{n+1}(c,b,a)$ for even $n$. Hence, the projection of $\mathrm{obs}_{n+1}$ on $(\Lambda^3\mathfrak m)^{\mathfrak g_\Gamma}$ is equal to zero for even $n$. It follows that all the $\beta_k$ are Hochschild coboundaries, and the standard argument implies that the multiplication can be extended up to order $n+1$ with the required properties. For odd $n$, Lemma 4.1 shows that the projection on $(\Lambda^3\mathfrak m)^{\mathfrak g_\Gamma}$ has the form

$$\mathrm{obs}_{n+1}=\sum_{k=0,\dots,n+1}a_kh^kt^{n+1-k}\varphi_M.$$

The KKS bracket is given by the two-vector

$$v=\sum_{\alpha\in\Omega^+\setminus\Omega_\Gamma}\frac{1}{\lambda(\bar\alpha)}E_\alpha\wedge E_{-\alpha}.$$

Setting

$$w=\sum_{\alpha\in\Omega^+\setminus\Omega_\Gamma}\lambda(\bar\alpha)E_\alpha\wedge E_{-\alpha},$$

gives $[[v,w]]=-3\varphi_M$.
Defining

$$\mu'^{(n)}=\mu^{(n)}+\frac{a_0}{3}t^nw,$$

the new obstruction cohomology class is

$$\mathrm{obs}'_{n+1}=\Bigl(\sum_{k=1,\dots,n+1}a_kh^kt^{n+1-k}\Bigr)\varphi_M.$$

Finally we define

$$\mu''^{(n)}=\mu'^{(n)}+\Bigl(\sum_{k=1,\dots,n}a_kh^{k-1}t^{n+1-k}\Bigr)f$$

and get an obstruction cocycle which is zero in cohomology. Now the standard argument implies that the deformation can be extended to give a $\Phi$ associative invariant multiplication with the required properties of order $n+1$. So, we are able to continue the recursive construction of the desired multiplication. □

Remark 5.1. This theorem is also true for semisimple orbits in $A_n$, but the proof, which requires different techniques, will be deferred to a forthcoming paper.

Remark 5.2. Using the $\Phi_h$ associative multiplications $\mu_h$ and $\mu_{t,h}$ from Theorems 5.1 and 5.2 and the equivalence between the monoidal categories $\mathcal C(U(\mathfrak g)[[h]],\Delta,\Phi_h)$ and $\mathcal C(U(\mathfrak g)[[h]],\tilde\Delta,\mathbf 1)$ given by the pair $(\mathrm{Id},F_h)$ (see Sect. 2), one can define $U_h(\mathfrak g)$ invariant multiplications, either $\mu_hF_h^{-1}$ in the one parameter case or $\mu_{t,h}F_h^{-1}$ in the two parameter case.

Acknowledgements. We thank the referee for bringing to our attention some as yet unpublished work dealing with related topics. In "Manin pairs and moment maps" (Ecole Polytechnique preprint, 1998) A. Alekseev and Y. Kosmann–Schwarzbach consider bivectors with Schouten bracket $[[f,f]]=\varphi$ in the framework of moment map theory. J.-H. Lu has given a classification of Poisson brackets on $G/H$ which are Poisson-invariant relative to the Poisson–Lie action of $G$.
Note added in proof. The answer to the question after Remark 3.1 is yes. See Bourbaki: Groupes et algèbres de Lie, Chap. 8, § 9, Ex. 14.

References

[BFFLS] Bayen, F., Flato, M., Fronsdal, C., Lichnerowicz, A., Sternheimer, D.: Deformation theory and quantization, I. Deformations of symplectic structures. Ann. Physics 111, 61–110 (1978)
[Bo] Borel, A.: Sur la cohomologie des espaces fibrés principaux et des espaces homogènes de groupes de Lie compacts. Ann. Math. 57, 115–207 (1953)
[Dr1] Drinfeld, V.G.: Quasi-Hopf algebras. Leningrad Math. J. 1, 1419–1457 (1990)
[Do] Donin, J.: Double quantization on the coadjoint representation of sl(n)∗. Czechoslovak J. of Phys. 47, No 11, 1115–1122 (1997), q-alg/9707031
[DG1] Donin, J. and Gurevich, D.: Quasi-Hopf Algebras and R-Matrix Structure in Line Bundles over Flag Manifolds. Selecta Math. Sovietica 12, No 1, 37–48 (1993)
[DG2] Donin, J. and Gurevich, D.: Some Poisson structures associated to Drinfeld–Jimbo R-matrices and their quantization. Israel Math. J. 92, No 1, 23–32 (1995)
[DGK] Donin, J., Gurevich, D. and Khoroshkin, S.: Double quantization of CP^n type orbits by generalized Verma modules. J. of Geom. and Phys. To appear
[DGM] Donin, J., Gurevich, D. and Majid, S.: R-matrix brackets and their quantization. Ann. de l’Institut d’Henri Poincaré, Phys. Theor. 58, No 2, 235–246 (1993)
[DGR] Donin, J., Gurevich, D., Rubtsov, V.: Quantum hyperboloid and braided modules. In: Algebre non commutative, Groupes Quantiques et Invariants, Societé Mathématique de France, Collection Séminaires et Congres, no. 2, 103–118 (1997)
[DS1] Donin, J. and Shnider, S.: Quantum symmetric spaces. J. of Pure and Appl. Algebra 100, 103–115 (1995)
[DS2] Donin, J. and Shnider, S.: Cohomological construction of quantum groups. To appear
[GP] Gurevich, D. and Panyushev, D.: On Poisson pairs associated to modified R-matrices. Duke Math. J. 73, n. 1 (1994)
[He] Helgason, S.: Differential geometry, Lie groups, and symmetric spaces, London–New York: Academic Press, 1978
[KRR] Khoroshkin, S., Radul, A. and Rubtsov, V.: A family of Poisson structures on compact Hermitian symmetric spaces. Commun. Math. Phys. 152, 299–316 (1993)
[LW] Lu, J.H. and Weinstein, A.: Poisson–Lie groups, dressing transformations, and Bruhat decompositions. J. of Diff. Geom. 31, 501–526 (1990)
[NV] Neroslavsky, O.M. and Vlasov, A.T.: Éxistence de produits * sur une variété. C.R. Acad. Sci. Paris, 292 I, 71–76 (1981)
Communicated by G. Felder
Commun. Math. Phys. 204, 61 – 84 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
A Functional-Analytic Theory of Vertex (Operator) Algebras, I

Yi-Zhi Huang

Department of Mathematics, Rutgers University, 110 Frelinghuysen Rd., Piscataway, NJ 08854-8019, USA. E-mail: [email protected]

Received: 15 August 1998 / Accepted: 13 January 1999
Abstract: This paper is the first in a series of papers developing a functional-analytic theory of vertex (operator) algebras and their representations. For an arbitrary $\mathbb Z$-graded finitely-generated vertex algebra $(V,Y,\mathbf 1)$ satisfying the standard grading-restriction axioms, a locally convex topological completion $H$ of $V$ is constructed. By the geometric interpretation of vertex (operator) algebras, there is a canonical linear map from $V\otimes V$ to $\overline V$ (the algebraic completion of $V$) realizing linearly the conformal equivalence class of a genus-zero Riemann surface with analytically parametrized boundary obtained by deleting two ordered disjoint disks from the unit disk and by giving the obvious parametrizations to the boundary components. We extend such a linear map to a linear map from $H\tilde\otimes H$ ($\tilde\otimes$ being the completed tensor product) to $H$, and prove the continuity of the extension. For any finitely-generated $\mathbb C$-graded $V$-module $(W,Y_W)$ satisfying the standard grading-restriction axioms, the same method also gives a topological completion $H^W$ of $W$ and gives the continuous extensions from $H\tilde\otimes H^W$ to $H^W$ of the linear maps from $V\otimes W$ to $\overline W$ realizing linearly the above conformal equivalence classes of the genus-zero Riemann surfaces with analytically parametrized boundaries.

0. Introduction

We begin a systematic study of the functional-analytic structure of vertex (operator) algebras and their representations in this paper. Vertex (operator) algebras were introduced rigorously in mathematics by Borcherds and by Frenkel, Lepowsky and Meurman (see [B,FLM] and [FHL]). Incorporating modules and intertwining operators, the author introduced intertwining operator algebras in [H5]. The original definition of vertex (operator) algebra is purely algebraic, but a geometric interpretation of vertex operators motivated by the path integral picture in string theory was soon observed mathematically by Frenkel [F].
In [H1]–[H7], the author developed a geometric theory of vertex operator algebras and intertwining operator algebras; the hard parts deal with the Virasoro algebra, the central charge, the interaction
Y.-Z. Huang
between the Virasoro algebra and vertex operators (or intertwining operators), and the monodromies. But in this geometric theory, the linear maps associated to elements of certain moduli spaces of punctured spheres with local coordinates (or to elements of the vector bundles forming partial operads called “genus-zero modular functors” over these moduli spaces) are from the tensor powers of a vertex operator algebra (or an intertwining operator algebra) to the algebraic completion of the algebra. To be more specific, let $(V,Y,\mathbf 1)$ be a $\mathbb Z$-graded vertex algebra. For any nonzero complex number $z$, let $P(z)$ be the conformal equivalence class of the sphere $\mathbb C\cup\{\infty\}$ with the negatively oriented puncture $\infty$, the ordered positively oriented punctures $z$ and $0$, and with the standard local coordinates. Then associated to $P(z)$ is the linear map $Y(\cdot,z)\cdot : V\otimes V\to\overline V$ given by the vertex operator map. Note that the image is in the algebraic completion $\overline V$ of $V$, not $V$ itself.

One of the main goals of the theory of vertex operator algebras is to construct conformal field theories in the sense of Segal [S1,S2] from vertex operator algebras and their representations, and to study geometry using the theory of vertex operator algebras. It is therefore necessary to construct locally convex topological completions of the underlying vector spaces of vertex (operator) algebras and their modules, such that associated to any Riemann surface with boundary, one can construct continuous linear maps between the tensor powers of these completions. In particular, it is necessary to construct a locally convex completion $H$ of the underlying vector space $V$ of a vertex algebra $(V,Y,\mathbf 1)$ such that associated to the conformal equivalence class of a disk with two smaller ordered disks deleted and with the obvious boundary parametrization, there is a canonical continuous map from $H\tilde\otimes H$ ($\tilde\otimes$ being the completed tensor product) to $H$.
Though in some algebraic applications of conformal field theories, only the algebraic structure of conformal field theories is needed, in many geometric applications, it is necessary to have a complete locally convex topological space and continuous linear maps, associated to Riemann surfaces with boundaries, between tensor powers of the space.

In the present paper (Part I), we construct a locally convex completion $H^V$ of an arbitrary finitely-generated $\mathbb Z$-graded vertex algebra $(V,Y,\mathbf 1)$ satisfying the standard grading-restriction axioms. Since $V$ is fixed in the present paper, we shall denote $H^V$ simply by $H$. The completion $H$ is the strict inductive limit of a sequence of complete locally convex spaces constructed from the correlation functions of the generators. The strong topology, rather than the weak-∗ topology, on the topological dual spaces of certain function spaces is needed in the construction. For any positive numbers $r_1$, $r_2$ and nonzero complex number $z$ satisfying $r_2+2r_1<1$ and $r_2<|z|<1$, there is a unique genus-zero Riemann surface with analytically parametrized boundary given by deleting two ordered disjoint disks, the first centered at $z$ with radius $r_1$ and the second centered at $0$ with radius $r_2$, from the unit disk and by giving the obvious parametrizations to the boundary components. Associated to this genus-zero Riemann surface with analytically parametrized boundary is a linear map $Y(r_1^{L(0)}\cdot,z)r_2^{L(0)}\cdot : V\otimes V\to\overline V$. We extend this linear map to a linear map from $H\tilde\otimes H$ ($\tilde\otimes$ being the completed tensor product) to $H$, and prove the continuity of the extension. It is clear that $H$ is linearly isomorphic to a subspace of $\overline V$ containing both $V$ and the image of $Y(r_1^{L(0)}\cdot,z)r_2^{L(0)}\cdot$.
For any finitely-generated $\mathbb C$-graded module $(W,Y_W)$ satisfying the standard grading-restriction axioms for such a finitely-generated vertex algebra, we also construct a locally convex completion $H^W$ of $W$, and construct continuous extensions from $H\tilde\otimes H^W$ to $H^W$ of the linear map $Y_W(r_1^{L(0)}\cdot,z)r_2^{L(0)}\cdot : V\otimes W\to\overline W$.
Functional-Analytic Theory of Vertex (Operator) Algebras, I
Our method depends only on the axiomatic properties or “world-sheet geometry” (mainly the duality properties) of the vertex algebra. Since our construction does not use any additional structure, we expect that the locally convex completions constructed in the present paper will be useful in solving purely algebraic problems in the representation theory of finitely-generated vertex (operator) algebras.

This paper is organized as follows: In Sect. 1, we construct a locally convex completion $H$ of a finitely-generated $\mathbb Z$-graded vertex algebra $(V,Y,\mathbf 1)$ satisfying the standard grading-restriction axioms. In Sect. 2, we extend $Y(r_1^{L(0)}\cdot,z)r_2^{L(0)}\cdot : V\otimes V\to\overline V$ to continuous linear maps from $H\tilde\otimes H$ to $H$. In Sect. 3, we present the corresponding results for modules.

In this paper, we assume that the reader is familiar with the basic definitions and results in the theory of vertex algebras. The material in [B,FLM] and [FHL] should be enough. We also assume that the reader is familiar with the basic definitions, constructions and results in the theory of locally convex topological vector spaces. The reader can find this material in, for example, [K1] and [K2].

We shall denote the set of integers, the set of real numbers and the set of complex numbers by the usual notations $\mathbb Z$, $\mathbb R$ and $\mathbb C$, respectively. We shall use $i,j,k,l,m,n,p,q$ to denote integers. In particular, when we write, say, $k>0$ (or $k\ge0$), we mean that $k$ is a positive integer (or a nonnegative integer). For a graded vector space $V$, we use $V'$, $V^*$ and $\overline V$ to denote the graded dual space, the dual space and the algebraic completion of $V$, respectively. For a topological vector space $E$, we use $E^*$ to denote the topological dual space of $E$. The symbol $\otimes$ always denotes the vector space tensor product. The bifunctor given by completing the vector space tensor product of two topological vector spaces with the tensor product topology is denoted by $\tilde\otimes$.

1. A Locally Convex Completion of a Finitely-Generated Vertex Algebra

In this section, we use “correlation functions” to construct the locally convex completion of a finitely-generated vertex algebra. For any $k\ge0$, let $R_k$ be the space of rational functions in the complex variables $z_1,\dots,z_k$ with the only possible poles $z_i=z_j$ for $i\ne j$ and $z_i=0,\infty$ ($i,j=1,\dots,k$). Let

$$M^k=\{(z_1,\dots,z_k)\in\mathbb C^k \mid z_i\ne z_j \text{ for } i\ne j;\ z_i\ne0\ (i,j=1,\dots,k)\}$$

and let $\{K_n^{(k)}\}_{n>0}$ be a sequence of compact subsets of $M^k$ satisfying $K_n^{(k)}\subset K_{n+1}^{(k)}$, $n>0$, and

$$M^k=\bigcup_{n>0}K_n^{(k)}.$$
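For concreteness, one possible choice of such an exhausting sequence is sketched below. This particular family is an illustrative assumption, not the author's choice; the construction does not depend on it.

```latex
K_n^{(k)}=\Bigl\{(z_1,\dots,z_k)\in\mathbb{C}^k \;\Bigm|\;
  \tfrac1n\le|z_i|\le n \ \text{for all } i,\quad
  |z_i-z_j|\ge\tfrac1n \ \text{for } i\ne j\Bigr\},\qquad n>0 .
```

Each such set is closed and bounded in $\mathbb C^k$ and contained in $M^k$, hence compact. The defining conditions weaken as $n$ grows, so the sets increase, and every point of $M^k$ satisfies them once $n$ is large enough. For small $n$ a set of this form may be empty, in which case those finitely many terms can simply be dropped from the sequence.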
For any $n>0$, we define a map $\|\cdot\|_{R_k,n}:R_k\to[0,\infty)$ by

$$\|f\|_{R_k,n}=\max_{(z_1,\dots,z_k)\in K_n^{(k)}}|f(z_1,\dots,z_k)|$$

for $f\in R_k$. Then it is clear that $\|\cdot\|_{R_k,n}$ is a norm on $R_k$. Using this sequence of norms, we obtain a locally convex topology on $R_k$. Note that a sequence in $R_k$ is convergent if
and only if this sequence of functions is uniformly convergent on any compact subset of $M^k$. Clearly, this topology is independent of the choice of the sequence $\{K_n^{(k)}\}_{n>0}$.

Let $V$ be a $\mathbb Z$-graded vertex algebra satisfying the standard grading-restriction axioms, that is,

$$V=\coprod_{n\in\mathbb Z}V_{(n)},$$

$\dim V_{(n)}<\infty$ for $n\in\mathbb Z$ and $V_{(n)}=0$ for $n$ sufficiently small. By the duality properties of $V$, for any $v'\in V'$, any $u_1,\dots,u_k,v\in V$, $\langle v',Y(u_1,z_1)\cdots Y(u_k,z_k)v\rangle$ is absolutely convergent in the region $|z_1|>\cdots>|z_k|>0$ and can be analytically extended to an element $R(\langle v',Y(u_1,z_1)\cdots Y(u_k,z_k)v\rangle)$ of $R_k$. For any $u_1,\dots,u_k,v\in V$ and any $(z_1,\dots,z_k)\in M^k$, we have an element $Q(u_1,\dots,u_k,v;z_1,\dots,z_k)\in\overline V$ defined by

$$\langle v',Q(u_1,\dots,u_k,v;z_1,\dots,z_k)\rangle=R(\langle v',Y(u_1,z_1)\cdots Y(u_k,z_k)v\rangle)$$

for $v'\in V'$. We denote the projections from $\overline V$ to $V_{(n)}$, $n\in\mathbb Z$, by $P_n$. Let $\tilde G$ be the subspace of $V^*$ consisting of linear functionals $\lambda$ on $V$ such that for any $k\ge0$, $u_1,\dots,u_k,v\in V$,

$$\sum_{n\in\mathbb Z}\lambda(P_n(Q(u_1,\dots,u_k,v;z_1,\dots,z_k))) \qquad (1.1)$$

is absolutely convergent for any $z_1,\dots,z_k$ in the region

$$M^k_{<1}=\{(z_1,\dots,z_k)\in M^k \mid |z_1|,\dots,|z_k|<1\}.$$

The dual pair $(V^*,V)$ of vector spaces gives $V^*$ a locally convex topology. With the topology induced from the one on $V^*$, $\tilde G$ is also a locally convex space. Note that $V'$ is a subspace of $\tilde G$.

For any $k\ge0$, $\lambda\in\tilde G$, $u_1,\dots,u_k,v\in V$, both (1.1) and

$$\sum_{n\in\mathbb Z}\frac{\partial}{\partial z_1}\lambda(P_n(Q(u_1,\dots,u_k,v;z_1,\dots,z_k)))=\sum_{n\in\mathbb Z}\lambda(P_n(Q(L(-1)u_1,\dots,u_k,v;z_1,\dots,z_k)))$$
are absolutely convergent in the region $M^k_{<1}$. Thus (1.1) is analytic in $z_1$ when $(z_1,\dots,z_k)$ is in $M^k_{<1}$. Similarly, (1.1) is analytic in $z_i$ for $i=2,\dots,k$ when $(z_1,\dots,z_k)$ is in $M^k_{<1}$. So (1.1) defines an analytic function on $M^k_{<1}$ and we denote it by

$$g_k(\lambda\otimes u_1\otimes\cdots\otimes u_k\otimes v)$$

since (1.1) is multilinear in $\lambda$, $u_1,\dots,u_k$ and $v$. These functions span a vector space $F_k$ of analytic functions on $M^k_{<1}$. So we obtain a linear map

$$g_k:\tilde G\otimes V^{\otimes(k+1)}\to F_k.$$

Fix a sequence $\{J_n^{(k)}\}_{n>0}$ of compact subsets of $M^k_{<1}$ such that $J_n^{(k)}\subset J_{n+1}^{(k)}$, $n>0$, and

$$\bigcup_{n>0}J_n^{(k)}=M^k_{<1}.$$
As in the case of $R_k$, using these compact subsets, we define a sequence of norms $\|\cdot\|_{F_k,n}$ on $F_k$, and these norms give a locally convex topology on $F_k$.

There is also an embedding $\iota_{F_k}$ from $F_k$ to $F_{k+1}$ defined as follows: We use $(z_0,\dots,z_k)$ instead of $(z_1,\dots,z_{k+1})$ to denote the elements of $M^{k+1}_{<1}$. For $\lambda\in\tilde G$, $u_1,\dots,u_k,v\in V$, since $Y(\mathbf 1,z)=1$ for any nonzero complex number $z$, $g_{k+1}(\lambda\otimes\mathbf 1\otimes u_1\otimes\cdots\otimes u_k\otimes v)$ as a function of $z_0,\dots,z_k$ is in fact independent of $z_0$, and is equal to $g_k(\lambda\otimes u_1\otimes\cdots\otimes u_k\otimes v)$ as a function of $z_1,\dots,z_k$. Thus we obtain a well-defined linear map

$$\iota_{F_k}:F_k\to F_{k+1}$$

such that

$$\iota_{F_k}\circ g_k=g_{k+1}\circ\phi_k,$$

where

$$\phi_k:\tilde G\otimes V^{\otimes(k+1)}\to\tilde G\otimes V^{\otimes(k+2)}$$

is defined by

$$\phi_k(\lambda\otimes u_1\otimes\cdots\otimes u_k\otimes v)=\lambda\otimes\mathbf 1\otimes u_1\otimes\cdots\otimes u_k\otimes v$$

for $\lambda\in\tilde G$, $u_1,\dots,u_k,v\in V$. It is clear that $\iota_{F_k}$ is injective. Thus we can regard $F_k$ as a subspace of $F_{k+1}$. Moreover, we have:

Proposition 1.1. For any $k\ge0$, $\iota_{F_k}$ as a map from $F_k$ to $\iota_{F_k}(F_k)$ is continuous and open. In other words, the topology on $F_k$ is induced from that on $F_{k+1}$.
Proof. We consider the two topologies on $F_k$, one is the topology defined above for $F_k$ and the other induced from the topology on $F_{k+1}$. We need only prove that for any $n>0$, (i) the norm $\|\cdot\|_{F_k,n}$ is continuous in the topology induced from the one on $F_{k+1}$, and (ii) the restriction of the norm $\|\cdot\|_{F_{k+1},n}$ to $F_k$ is continuous in the topology on $F_k$.

Let $\{f_\alpha\}_{\alpha\in A}$ be a net in $F_k$ convergent in the topology induced from the one on $F_{k+1}$. Then $\{f_\alpha\}_{\alpha\in A}$, when viewed as a net of functions in $z_0,z_1,\dots,z_k$, is convergent uniformly on any compact subset of $M^{k+1}$. Since $f_\alpha$, $\alpha\in A$, are independent of $z_0$, $\{f_\alpha\}_{\alpha\in A}$ is in fact convergent uniformly on any compact subset of $M^k$, proving (i).

Now let $\{f_\alpha\}_{\alpha\in A}$ be a net in $F_k$ convergent in the topology on $F_k$. Then $\{f_\alpha\}_{\alpha\in A}$ is convergent uniformly on any compact subset of $M^k$. If we view $f_\alpha$, $\alpha\in A$, as functions on $\mathbb C\times M^k$, then the net $\{f_\alpha\}_{\alpha\in A}$ is convergent uniformly on any subset of $M^{k+1}$ of the form $\mathbb C\times K$ where $K$ is a compact subset of $M^k$. But any compact subset of $M^{k+1}$ is contained in a subset of the form $\mathbb C\times K$. So $\{f_\alpha\}_{\alpha\in A}$ is convergent uniformly on any compact subset of $M^{k+1}$, proving (ii). □

We equip the topological dual space $F_k^*$, $k\ge0$, of $F_k$ with the strong topology, that is, the topology of uniform convergence on all the weakly bounded subsets of $F_k$ (see p. 256 of [K1]). Then $F_k^*$ is a locally convex space. In fact, $F_k^*$ is a (DF)-space (see p. 396 of [K1]).

For any $\lambda\in\tilde G$ and $u\in V$, let $Y_{-1}(u)\lambda\in\tilde G$ be defined by

$$(Y_{-1}(u)\lambda)(v)=\lambda((\mathrm{Res}_x\,x^{-1}Y(u,x))v)$$

for $v\in V$. (Note that $\mathrm{Res}_x\,x^{-1}Y(u,x)$ is $u_{-1}$ in the usual notation. So we could denote $Y_{-1}(u)$ by $u_{-1}$. Here we use the notation $Y_{-1}(u)$ to avoid the possible confusion with the notations $u_0,u_1,\dots$ for elements in $V$ used later.) For $k\ge0$, we define a linear map $\gamma_k:F_{k+1}\to F_k$ by

$$\gamma_k(g_{k+1}(\lambda\otimes u_0\otimes u_1\otimes\cdots\otimes u_k\otimes v))=g_k((Y_{-1}(u_0)\lambda)\otimes u_1\otimes\cdots\otimes u_k\otimes v)$$

for $\lambda\in\tilde G$, $u_0,u_1,\dots,u_k,v\in V$.

Proposition 1.2. The map $\gamma_k$ is continuous and satisfies

$$\gamma_k\circ\iota_{F_k}=I_{F_k}, \qquad (1.2)$$

where $I_{F_k}$ is the identity map on $F_k$.

Proof. We use $(z_0,\dots,z_k)$ instead of $(z_1,\dots,z_{k+1})$ to denote the elements of $M^{k+1}_{<1}$. From the definition, we see that for any positive number $\epsilon<1$, when $|z_1|,\dots,|z_k|<\epsilon$,

$$\gamma_k(g_{k+1}(\lambda\otimes u_0\otimes u_1\otimes\cdots\otimes u_k\otimes v))=\frac{1}{2\pi\sqrt{-1}}\oint_{|z_0|=\epsilon}z_0^{-1}g_{k+1}(\lambda\otimes u_0\otimes u_1\otimes\cdots\otimes u_k\otimes v)\,dz_0$$

for $\lambda\in\tilde G$, $u_0,\dots,u_k,v\in V$. Thus for any $n>0$, there exist $m_n>0$ and positive number $\epsilon_n$ such that
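\[
\begin{aligned}
\|\gamma_k(g_{k+1}(\lambda\otimes u_0\otimes u_1\otimes\cdots\otimes u_k\otimes v))\|_{F_k,n}
&= \max_{(z_1,\dots,z_k)\in J_n^{(k)}} \bigl|\gamma_k(g_{k+1}(\lambda\otimes u_0\otimes u_1\otimes\cdots\otimes u_k\otimes v))\bigr| \\
&= \max_{(z_1,\dots,z_k)\in J_n^{(k)}} \left|\frac{1}{2\pi\sqrt{-1}}\oint_{|z_0|=\epsilon_n} z_0^{-1}\, g_{k+1}(\lambda\otimes u_0\otimes u_1\otimes\cdots\otimes u_k\otimes v)\,dz_0\right| \\
&\le \max_{(z_1,\dots,z_k)\in J_n^{(k)},\ |z_0|=\epsilon_n} \bigl|g_{k+1}(\lambda\otimes u_0\otimes u_1\otimes\cdots\otimes u_k\otimes v)\bigr| \\
&\le \max_{(z_0,\dots,z_k)\in J_{m_n}^{(k+1)}} \bigl|g_{k+1}(\lambda\otimes u_0\otimes u_1\otimes\cdots\otimes u_k\otimes v)\bigr| \\
&= \|g_{k+1}(\lambda\otimes u_0\otimes u_1\otimes\cdots\otimes u_k\otimes v)\|_{F_{k+1},m_n}.
\end{aligned}
\]
This inequality implies that $\gamma_k$ is continuous.

For $\lambda\in\tilde{G}$, $u_1,\dots,u_k, v\in V$, by definition,
\[ g_{k+1}(\lambda\otimes\mathbf{1}\otimes u_1\otimes\cdots\otimes u_k\otimes v) = \iota_{F_k}\bigl(g_k(\lambda\otimes u_1\otimes\cdots\otimes u_k\otimes v)\bigr). \]
Since $Y_{-1}(\mathbf{1})\lambda=\lambda$,
\[ \gamma_k\bigl(g_{k+1}(\lambda\otimes\mathbf{1}\otimes u_1\otimes\cdots\otimes u_k\otimes v)\bigr) = g_k\bigl((Y_{-1}(\mathbf{1})\lambda)\otimes u_1\otimes\cdots\otimes u_k\otimes v\bigr) = g_k(\lambda\otimes u_1\otimes\cdots\otimes u_k\otimes v). \]
So we have (1.2). □

Corollary 1.3. The adjoint map $\gamma_k^*$ of $\gamma_k$ satisfies
\[ \iota_{F_k}^*\circ\gamma_k^* = I_{F_k^*}, \tag{1.3} \]
where $\iota_{F_k}^* : F_{k+1}^*\to F_k^*$ is the adjoint of $\iota_{F_k}$ and $I_{F_k^*}$ is the identity on $F_k^*$. The map $\gamma_k^*$ is injective and continuous. As a map from $F_k^*$ to $\gamma_k^*(F_k^*)$, it is also open. In particular, if we identify $F_k^*$ with $\gamma_k^*(F_k^*)$, the topology on $F_k^*$ is induced from the one on $F_{k+1}^*$.

Proof. The identity (1.3) is an immediate consequence of the identity (1.2). By this identity, we see that $\gamma_k^*$ is injective. The continuity of $\gamma_k^*$ is a consequence of the continuity of $\gamma_k$. To show that it is open as a map from $F_k^*$ to $\gamma_k^*(F_k^*)$, we need only show that its inverse is continuous. Let $\{\mu_\alpha\}_{\alpha\in A}$ be a net in $F_k^*$ such that $\{\gamma_k^*(\mu_\alpha)\}_{\alpha\in A}$ is convergent in $F_{k+1}^*$. Since $\iota_{F_k}^*$ is continuous, $\{\iota_{F_k}^*(\gamma_k^*(\mu_\alpha))\}_{\alpha\in A}$ is convergent in $F_k^*$. By the identity (1.3), $\mu_\alpha = \iota_{F_k}^*(\gamma_k^*(\mu_\alpha))$ for $\alpha\in A$. Thus $\{\mu_\alpha\}_{\alpha\in A}$ is convergent in $F_k^*$, proving that the inverse of $\gamma_k^*$, viewed as a map from $F_k^*$ to $\gamma_k^*(F_k^*)$, is continuous. □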
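The passage from (1.2) to (1.3), and the injectivity of $\gamma_k^*$, can be spelled out as follows. This is only a sketch; it assumes nothing beyond the standard definition of the adjoint, $(T^*\mu)(f)=\mu(Tf)$, and the notation of the text.

```latex
% For \mu \in F_k^* and f \in F_k:
\begin{align*}
\bigl((\iota_{F_k}^*\circ\gamma_k^*)(\mu)\bigr)(f)
  &= \bigl(\gamma_k^*(\mu)\bigr)\bigl(\iota_{F_k}(f)\bigr)
   = \mu\bigl(\gamma_k(\iota_{F_k}(f))\bigr) \\
  &= \mu\bigl((\gamma_k\circ\iota_{F_k})(f)\bigr)
   = \mu(f) \qquad\text{by (1.2).}
\end{align*}
% Hence \iota_{F_k}^*\circ\gamma_k^* = I_{F_k^*}. In particular, if
% \gamma_k^*(\mu) = 0 then \mu = \iota_{F_k}^*(\gamma_k^*(\mu)) = 0,
% so \gamma_k^* is injective.
```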
Y.-Z. Huang
We use $\langle\cdot,\cdot\rangle$ to denote the pairing between $\tilde{G}$ and the algebraic dual space $\tilde{G}^*$ of $\tilde{G}$. Since $V'\subset\tilde{G}$ and $V\subset\tilde{G}^*$, this pairing is an extension of the pairing between $V'$ and $V$ denoted using the same symbol. With this pairing, $\tilde{G}$ and $\tilde{G}^*$ form a dual pair of vector spaces. This dual pair of vector spaces gives a locally convex topology to $\tilde{G}^*$. Since $V'\subset\tilde{G}$, the dual space $\tilde{G}^*$ can be viewed as a subspace of $(V')^*=\overline{V}$. We define
\[ e_k : V^{\otimes(k+1)}\otimes F_k^*\to\tilde{G}^*\subset\overline{V} \]
by
\[ \langle\lambda, e_k(u_1\otimes\cdots\otimes u_k\otimes v\otimes\mu)\rangle = \mu\bigl(g_k(\lambda\otimes u_1\otimes\cdots\otimes u_k\otimes v)\bigr) \]
for $\lambda\in\tilde{G}$, $u_1,\dots,u_k,v\in V$ and $\mu\in F_k^*$.

We now assume that $V$ is finitely generated. The generators of $V$ span a finite-dimensional subspace $X$ of $V$. We assume that the vacuum vector $\mathbf{1}$ is in $X$. Any norm on $X$ induces a Banach space structure on $X$. Since $X$ is finite-dimensional, all norms on $X$ are equivalent, so that the topology induced by the norm is in fact independent of the norm. Since $X$ is a finite-dimensional Banach space and $F_k^*$ is a locally convex space, $X^{\otimes(k+1)}\otimes F_k^*$ is also a locally convex space. We denote the image $e_k(X^{\otimes(k+1)}\otimes F_k^*)$ of $X^{\otimes(k+1)}\otimes F_k^*\subset V^{\otimes(k+1)}\otimes F_k^*$ under $e_k$ by $G_k$.

Proposition 1.4. For any $k\ge 0$, $G_k\subset G_{k+1}$.

Proof. By definition,
\[
\begin{aligned}
\langle\lambda, e_k(u_1\otimes\cdots\otimes u_k\otimes v\otimes\mu)\rangle
&= \mu\bigl(g_k(\lambda\otimes u_1\otimes\cdots\otimes u_k\otimes v)\bigr) \\
&= \mu\bigl(\gamma_k(g_{k+1}(\lambda\otimes\mathbf{1}\otimes u_1\otimes\cdots\otimes u_k\otimes v))\bigr) \\
&= (\gamma_k^*(\mu))\bigl(g_{k+1}(\lambda\otimes\mathbf{1}\otimes u_1\otimes\cdots\otimes u_k\otimes v)\bigr) \\
&= \langle\lambda, e_{k+1}(\mathbf{1}\otimes u_1\otimes\cdots\otimes u_k\otimes v\otimes\gamma_k^*(\mu))\rangle
\end{aligned}
\]
for $\lambda\in\tilde{G}$, $u_1,\dots,u_k,v\in V$ and $\mu\in F_k^*$. Thus
\[ e_k(u_1\otimes\cdots\otimes u_k\otimes v\otimes\mu) = e_{k+1}(\mathbf{1}\otimes u_1\otimes\cdots\otimes u_k\otimes v\otimes\gamma_k^*(\mu))\in G_{k+1} \]
for $u_1,\dots,u_k,v\in V$ and $\mu\in F_k^*$. □

Proposition 1.5. The linear map
\[ e_k|_{X^{\otimes(k+1)}\otimes F_k^*} : X^{\otimes(k+1)}\otimes F_k^*\to\tilde{G}^* \]
is continuous.

Proof. By the definition of the locally convex topology on the tensor product $X^{\otimes(k+1)}\otimes F_k^*$, we need only prove that $e_k|_{X^{\otimes(k+1)}\otimes F_k^*}$, as a multilinear map from $X^{\otimes(k+1)}\times F_k^*$ to $\tilde{G}^*$, is continuous.

Let $\{(X_\alpha,\mu_\alpha)\}_{\alpha\in A}$ be a net in $X^{\otimes(k+1)}\times F_k^*$ convergent to 0. Then the nets $\{X_\alpha\}_{\alpha\in A}$ and $\{\mu_\alpha\}_{\alpha\in A}$ are convergent to 0 in $X^{\otimes(k+1)}$ and $F_k^*$, respectively. For any fixed $\lambda\in\tilde{G}$, since $X^{\otimes(k+1)}$ is a finite-dimensional Banach space, the linear map from $X^{\otimes(k+1)}$ to $F_k$ defined by
\[ u_1\otimes\cdots\otimes u_k\otimes v\mapsto g_k(\lambda\otimes u_1\otimes\cdots\otimes u_k\otimes v) \]
for $u_1,\dots,u_k,v\in X$ is continuous. Thus $\{g_k(\lambda\otimes X_\alpha)\}_{\alpha\in A}$ is convergent to 0 in $F_k$. In particular, there exists $\alpha_0\in A$ such that $\{g_k(\lambda\otimes X_\alpha)\}_{\alpha\in A,\,\alpha>\alpha_0}$ is weakly bounded. Thus
\[ \Bigl\{\sup_{\alpha'\in A,\ \alpha'>\alpha_0}\bigl|\mu_\alpha(g_k(\lambda\otimes X_{\alpha'}))\bigr|\Bigr\}_{\alpha\in A} \]
is convergent to 0. In particular, $\{\mu_\alpha(g_k(\lambda\otimes X_\alpha))\}_{\alpha\in A,\,\alpha>\alpha_0}$, or equivalently
\[ \{\mu_\alpha(g_k(\lambda\otimes X_\alpha))\}_{\alpha\in A}, \]
is convergent to 0. Since
\[ \langle\lambda, e_k(X_\alpha\otimes\mu_\alpha)\rangle = \mu_\alpha(g_k(\lambda\otimes X_\alpha)), \]
we see that $\{\langle\lambda, e_k(X_\alpha\otimes\mu_\alpha)\rangle\}_{\alpha\in A}$ is convergent to 0 for $\lambda\in\tilde{G}$. By the definition of the topology on $\tilde{G}^*$, $\{e_k(X_\alpha\otimes\mu_\alpha)\}_{\alpha\in A}$ is convergent to 0 in $\tilde{G}^*$. □

From Proposition 1.5, we conclude:

Corollary 1.6. The quotient space
\[ (X^{\otimes(k+1)}\otimes F_k^*)/(e_k|_{X^{\otimes(k+1)}\otimes F_k^*})^{-1}(0) \]
is a locally convex space.

Since $G_k$ is linearly isomorphic to
\[ (X^{\otimes(k+1)}\otimes F_k^*)/(e_k|_{X^{\otimes(k+1)}\otimes F_k^*})^{-1}(0), \]
the locally convex space structure on this quotient gives a locally convex space structure on $G_k$. Let $H_k$ be the completion of $G_k$. Then $H_k$ is a complete locally convex space.

Proposition 1.7. The space $H_k$ can be embedded canonically in $H_{k+1}$. The topology on $H_k$ is the same as the one induced from the topology on $H_{k+1}$.

Proof. The first conclusion follows from Proposition 1.4. To prove the second conclusion, we need only prove that the topology on $G_k$ is the same as the one induced from the topology on $G_{k+1}$. We have the following commutative diagram:
\[
\begin{array}{ccc}
X^{\otimes(k+1)}\otimes F_k^* & \xrightarrow{\ \psi_k\ } & X^{\otimes(k+2)}\otimes F_{k+1}^* \\
\Big\downarrow{\scriptstyle e_k} & & \Big\downarrow{\scriptstyle e_{k+1}} \\
G_k & \longrightarrow & G_{k+1}
\end{array}
\]
where
\[ \psi_k : X^{\otimes(k+1)}\otimes F_k^*\to X^{\otimes(k+2)}\otimes F_{k+1}^* \]
is defined by
\[ \psi_k(u_1\otimes\cdots\otimes u_k\otimes v\otimes\mu) = \mathbf{1}\otimes u_1\otimes\cdots\otimes u_k\otimes v\otimes\gamma_k^*(\mu) \]
for $u_1,\dots,u_k,v\in X$ and $\mu\in F_k^*$. By Corollary 1.3, if we identify $F_k^*$ with $\gamma_k^*(F_k^*)$, the topology on $F_k^*$ is induced from the one on $F_{k+1}^*$. By the definition of the topologies on $G_k$ and $G_{k+1}$ and the commutativity of the diagram above, we see that the topology on $G_k$ is the same as the one induced from the topology on $G_{k+1}$. □

By Proposition 1.7, we have a sequence $\{H_k\}_{k\ge 0}$ of strictly increasing complete locally convex spaces. Let
\[ H = \bigcup_{k\ge 0} H_k. \]
We equip $H$ with the inductive limit topology. Then $H$ is a complete locally convex space. Let
\[ G = \bigcup_{k\ge 0} G_k. \]
Then $G$ is a dense subspace of $H$. Note that $V\subset G\subset\overline{V}$. Since elements of $\overline{V}$ are finite or infinite sums of elements of $V$, elements of $G$ are also finite or infinite sums of elements of $V$. Thus infinite sums of elements of $V$ belonging to $G$ must be convergent in the topology on $H$. So $G$ is in the closure of $V$. Since $G$ is dense in $H$, we obtain:

Theorem 1.8. The vector space $H$ equipped with the strict inductive limit topology is a locally convex completion of $V$.

2. The Locally Convex Completion and the Vertex Operator Map

In this section we construct continuous linear maps from the topological completion of $H\otimes H$ to $H$ associated with conformal equivalence classes of closed disks with two ordered open disks inside deleted and with the standard parametrizations at the boundary components.

We consider the closed unit disk centered at 0. We delete two ordered open disks inside: the first centered at $z$ with radius $r_1$ and the second centered at 0 with radius $r_2$. See Fig. 1. The positive numbers $r_1$, $r_2$ and the nonzero complex number $z$ must satisfy the conditions $r_2+2r_1<1$ and $r_2<|z|<1$. The three boundary circles are parametrized by the maps
\[ e^{i\theta}\mapsto e^{i\theta},\qquad e^{i\theta}\mapsto z+r_1e^{i\theta},\qquad e^{i\theta}\mapsto r_2e^{i\theta}. \]
We denote the resulting closed disk with two disjoint open disks inside deleted and with ordered boundary components parametrized as above by $D(z;r_1,r_2)$. Note that any closed disk with two ordered open disks inside deleted and with the standard parametrizations is conformally equivalent to $D(z;r_1,r_2)$ for some $z\in\mathbb{C}^\times$ and some positive numbers $r_1$, $r_2$ satisfying $r_2+2r_1<1$ and $r_2<|z|<1$.
[Figure 1: the disk $D(z;r_1,r_2)$, with labels $r_1$, $z$, $r_2$, $0$ and $1$.]
Let $H\,\tilde{\otimes}\,H$ be the topological completion of the vector space tensor product $H\otimes H$. We would like to construct a continuous linear map
\[ \overline{\nu}_Y([D(z;r_1,r_2)]) : H\,\tilde{\otimes}\,H\to H \]
associated with the conformal equivalence class $[D(z;r_1,r_2)]$ of $D(z;r_1,r_2)$. We know that $D(z;r_1,r_2)$ corresponds to the sphere $\hat{\mathbb{C}}=\mathbb{C}\cup\{\infty\}$ with the negatively oriented puncture $\infty$, the ordered positively oriented punctures $z$, $0$ and with the local coordinates $w\mapsto 1/w$, $w\mapsto(w-z)/r_1$ and $w\mapsto w/r_2$ vanishing at $\infty$, $z$ and $0$, respectively. In the notation of [H6], the conformal equivalence class of this sphere with tubes of type $(1,2)$ is
\[ (z;0,(1/r_1,0),(1/r_2,0))\in K(2). \]
Associated with this class, we have a linear map
\[ \nu_Y((z;0,(1/r_1,0),(1/r_2,0))) = Y(r_1^{L(0)}\,\cdot\,,z)\,r_2^{L(0)}\,\cdot\; : V\otimes V\to\overline{V} \]
(see [H6]). We now show that this linear map can be extended to $H\,\tilde{\otimes}\,H$, that the image of this extension is in $H$ and that this extension is continuous. Then we define $\overline{\nu}_Y([D(z;r_1,r_2)])$ to be this extension.

For any $\lambda\in\tilde{G}$ and $u\in V$, we define an element $u\diamond_{[D(z;r_1,r_2)]}\lambda\in V^*$ by
\[
\begin{aligned}
(u\diamond_{[D(z;r_1,r_2)]}\lambda)(v)
&= \sum_{n\in\mathbb{Z}}\lambda\bigl(P_n(Y(r_1^{L(0)}u,z)\,r_2^{L(0)}v)\bigr) \\
&= \sum_{n\in\mathbb{Z}}\lambda\bigl(P_n(Q(r_1^{L(0)}u, r_2^{L(0)}v; z))\bigr)
\end{aligned}
\]
for $v\in V$.

Proposition 2.1. The element $u\diamond_{[D(z;r_1,r_2)]}\lambda$ is in $\tilde{G}$.
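Proof. For any $k\ge 0$, $u_1,\dots,u_k,v\in V$,
\[
\begin{aligned}
&\sum_{m\in\mathbb{Z}}(u\diamond_{[D(z;r_1,r_2)]}\lambda)\bigl(P_m(Q(u_1,\dots,u_k,v;z_1,\dots,z_k))\bigr) \\
&\quad= \sum_{m\in\mathbb{Z}}\sum_{n\in\mathbb{Z}}\lambda\bigl(P_n(Y(r_1^{L(0)}u,z)\,r_2^{L(0)}P_m(Q(u_1,\dots,u_k,v;z_1,\dots,z_k)))\bigr) \\
&\quad= \sum_{m\in\mathbb{Z}}\sum_{n\in\mathbb{Z}}\lambda\bigl(P_n(Y(r_1^{L(0)}u,z)\cdot P_m(Q(r_2^{L(0)}u_1,\dots,r_2^{L(0)}u_k,r_2^{L(0)}v;r_2z_1,\dots,r_2z_k)))\bigr).
\end{aligned}
\tag{2.1}
\]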
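The second equality in (2.1) moves $r_2^{L(0)}$ through the projection $P_m$ and through $Q$. A minimal sketch of the identities behind this step is given below. It assumes, as is standard for graded vertex algebras, that $L(0)$ acts on the homogeneous subspace $V_{(m)}$ as multiplication by $m$, that $P_m$ is the projection onto $V_{(m)}$, and that $Q(u_1,\dots,u_k,v;z_1,\dots,z_k)$ is represented by the product $Y(u_1,z_1)\cdots Y(u_k,z_k)v$ in its region of convergence; the precise definition of $Q$ is the one given earlier in the paper and is not restated here.

```latex
% Assumptions (labelled above): L(0)|_{V_{(m)}} = m, P_m = projection onto V_{(m)},
% and Q(u_1,...,u_k,v; z_1,...,z_k) ~ Y(u_1,z_1) ... Y(u_k,z_k) v where convergent.
\begin{align*}
r^{L(0)}\,Y(u,z)\,r^{-L(0)} &= Y\bigl(r^{L(0)}u,\ rz\bigr), \\
r^{L(0)}\,P_m &= P_m\,r^{L(0)} = r^{m}P_m, \\
r_2^{L(0)}\,P_m\bigl(Y(u_1,z_1)\cdots Y(u_k,z_k)v\bigr)
 &= P_m\bigl(Y(r_2^{L(0)}u_1,r_2z_1)\cdots Y(r_2^{L(0)}u_k,r_2z_k)\,r_2^{L(0)}v\bigr).
\end{align*}
% The last line is obtained by inserting r_2^{-L(0)} r_2^{L(0)} between
% consecutive vertex operators and applying the first identity k times.
```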
Since the $\mathbb{Z}$-grading of $V$ is lower-truncated and since $r_2<|z|<1$, for any $(z_1,\dots,z_k)\in M^k_{<1}$, there exists a positive number $\delta_2>1$ such that the Laurent series
\[
\begin{aligned}
&\sum_{m\in\mathbb{Z}}\lambda\bigl(P_n(Y(r_1^{L(0)}u,z)P_m(Q(r_2^{L(0)}u_1,\dots,r_2^{L(0)}u_k,r_2^{L(0)}v;r_2z_1,\dots,r_2z_k)))\bigr)\,t_2^m \\
&\quad= \sum_{m\in\mathbb{Z}}\lambda\bigl(P_n(Y(r_1^{L(0)}u,z)\cdot P_m(t_2^{L(0)}Q(r_2^{L(0)}u_1,\dots,r_2^{L(0)}u_k,r_2^{L(0)}v;r_2z_1,\dots,r_2z_k)))\bigr) \\
&\quad= \sum_{m\in\mathbb{Z}}\lambda\bigl(P_n(Y(r_1^{L(0)}u,z)P_m(Q((t_2r_2)^{L(0)}u_1,\dots,(t_2r_2)^{L(0)}u_k,(t_2r_2)^{L(0)}v;t_2r_2z_1,\dots,t_2r_2z_k)))\bigr)
\end{aligned}
\]
in $t_2$ has only finitely many negative powers and is absolutely convergent to
\[ \lambda\bigl(P_n(Q(r_1^{L(0)}u,(t_2r_2)^{L(0)}u_1,\dots,(t_2r_2)^{L(0)}u_k,(t_2r_2)^{L(0)}v;z,t_2r_2z_1,\dots,t_2r_2z_k))\bigr) \]
when $0<|t_2|<\delta_2$. Since $\lambda\in\tilde{G}$ and $|z|<1$, for any $(z_1,\dots,z_k)\in M^k_{<1}$, there exists a positive number $\delta_1>1$ such that the Laurent series in $t_1$
\[
\begin{aligned}
&\sum_{n\in\mathbb{Z}}\lambda\bigl(P_n(Q(r_1^{L(0)}u,r_2^{L(0)}u_1,\dots,r_2^{L(0)}u_k,r_2^{L(0)}v;z,r_2z_1,\dots,r_2z_k))\bigr)\,t_1^n \\
&\quad= \sum_{n\in\mathbb{Z}}\lambda\bigl(P_n(t_1^{L(0)}Q(r_1^{L(0)}u,r_2^{L(0)}u_1,\dots,r_2^{L(0)}u_k,r_2^{L(0)}v;z,r_2z_1,\dots,r_2z_k))\bigr) \\
&\quad= \sum_{n\in\mathbb{Z}}\lambda\bigl(P_n(Q((t_1r_1)^{L(0)}u,(t_1r_2)^{L(0)}u_1,\dots,(t_1r_2)^{L(0)}u_k,(t_1r_2)^{L(0)}v;t_1z,t_1r_2z_1,\dots,t_1r_2z_k))\bigr)
\end{aligned}
\]
is absolutely convergent when $0<|t_1|<\delta_1$. Thus for $(z_1,\dots,z_k)\in M^k_{<1}$, the iterated sum
\[ \sum_{n\in\mathbb{Z}}\sum_{m\in\mathbb{Z}}\lambda\bigl(P_n(Y(r_1^{L(0)}u,z)P_m(Q(r_2^{L(0)}u_1,\dots,r_2^{L(0)}u_k,r_2^{L(0)}v;r_2z_1,\dots,r_2z_k)))\bigr)\,t_1^nt_2^m \tag{2.2} \]
is absolutely convergent when $0<|t_1|<\delta_1$ and $0<|t_2|<\delta_2$.
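The auxiliary variables $t_1$ and $t_2$ in (2.2) only record the weights selected by the projections. The following sketch makes this explicit under the assumption (not restated in this part of the paper) that $P_n$ is the projection onto the weight-$n$ subspace on which $L(0)$ acts as $n$; the symbol $a_{n,m}$ is shorthand introduced here for the coefficient in (2.2).

```latex
% Assumption: L(0) acts as n on the image of P_n. Shorthand (ours):
% a_{n,m} = \lambda(P_n(Y(r_1^{L(0)}u,z) P_m(Q(r_2^{L(0)}u_1,\dots; r_2 z_1,\dots)))).
\begin{align*}
P_n\bigl(t_1^{L(0)}w\bigr) &= t_1^{\,n}\,P_n(w), \qquad
P_m\bigl(t_2^{L(0)}w\bigr) = t_2^{\,m}\,P_m(w), \\
\sum_{n\in\mathbb{Z}}\sum_{m\in\mathbb{Z}} a_{n,m}\,t_1^{\,n}t_2^{\,m}\Big|_{t_1=t_2=1}
 &= \sum_{n\in\mathbb{Z}}\sum_{m\in\mathbb{Z}} a_{n,m}.
\end{align*}
% Because \delta_1 > 1 and \delta_2 > 1, the point t_1 = t_2 = 1 lies in the
% region 0 < |t_1| < \delta_1, 0 < |t_2| < \delta_2 where (2.2) converges
% absolutely, so the specialization is legitimate.
```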
The iterated sum (2.2) gives a function of $t_1$ and $t_2$ in the region $0<|t_1|<\delta_1$, $0<|t_2|<\delta_2$. It is clear that this function is analytic in $t_1$ and $t_2$. Thus it has a Laurent expansion, which must be
\[ \sum_{n,m\in\mathbb{Z}}\lambda\bigl(P_n(Y(r_1^{L(0)}u,z)P_m(Q(r_2^{L(0)}u_1,\dots,r_2^{L(0)}u_k,r_2^{L(0)}v;r_2z_1,\dots,r_2z_k)))\bigr)\,t_1^nt_2^m. \]
Since this double sum is equal to the Laurent expansion, it is absolutely convergent and thus both iterated sums are absolutely convergent. In particular, when $t_1=t_2=1$, we see that the right-hand side of (2.1) and consequently the left-hand side of (2.1) are absolutely convergent when $(z_1,\dots,z_k)\in M^k_{<1}$, proving that $u\diamond_{[D(z;r_1,r_2)]}\lambda$ is in $\tilde{G}$. □

For any $l\ge 0$, we define a linear map $\alpha_l : \tilde{G}\otimes X^{l+1}\otimes F_l^*\to V^*$ by
\[ (\alpha_l(\lambda\otimes v_1\otimes\cdots\otimes v_l\otimes v\otimes\nu))(u) = \langle u\diamond_{[D(z;r_1,r_2)]}\lambda,\ e_l(v_1\otimes\cdots\otimes v_l\otimes v\otimes\nu)\rangle \]
for $\lambda\in\tilde{G}$, $v_1,\dots,v_l,v\in X$, $\nu\in F_l^*$ and $u\in V$.

Proposition 2.2. The image of $\alpha_l$ is in $\tilde{G}$.

Proof. For any $k\ge 0$, $\lambda\in\tilde{G}$, $u_1,\dots,u_k,u\in V$, $v_1,\dots,v_l,v\in X$ and $\nu\in F_l^*$,
\[
\begin{aligned}
&\sum_{n\in\mathbb{Z}}(\alpha_l(\lambda\otimes v_1\otimes\cdots\otimes v_l\otimes v\otimes\nu))\bigl(P_n(Q(u_1,\dots,u_k,u;z_1,\dots,z_k))\bigr) \\
&\quad= \sum_{n\in\mathbb{Z}}\bigl\langle P_n(Q(u_1,\dots,u_k,u;z_1,\dots,z_k))\diamond_{[D(z;r_1,r_2)]}\lambda,\ e_l(v_1\otimes\cdots\otimes v_l\otimes v\otimes\nu)\bigr\rangle \\
&\quad= \sum_{n\in\mathbb{Z}}\nu\bigl(g_l((P_n(Q(u_1,\dots,u_k,u;z_1,\dots,z_k))\diamond_{[D(z;r_1,r_2)]}\lambda)\otimes v_1\otimes\cdots\otimes v_l\otimes v)\bigr) \\
&\quad= \sum_{n\in\mathbb{Z}}\nu\Bigl(\sum_{m\in\mathbb{Z}}\bigl(P_n(Q(u_1,\dots,u_k,u;z_1,\dots,z_k))\diamond_{[D(z;r_1,r_2)]}\lambda\bigr)\bigl(P_m(Q(v_1,\dots,v_l,v;z_{k+1},\dots,z_{k+l}))\bigr)\Bigr).
\end{aligned}
\tag{2.3}
\]
By the definition of $P_n(Q(u_1,\dots,u_k,u;z_1,\dots,z_k))\diamond_{[D(z;r_1,r_2)]}\lambda$, we have
\[
\begin{aligned}
&\bigl(P_n(Q(u_1,\dots,u_k,u;z_1,\dots,z_k))\diamond_{[D(z;r_1,r_2)]}\lambda\bigr)\bigl(P_m(Q(v_1,\dots,v_l,v;z_{k+1},\dots,z_{k+l}))\bigr) \\
&\quad= \sum_{p\in\mathbb{Z}}\lambda\bigl(P_p(Y(r_1^{L(0)}P_n(Q(u_1,\dots,u_k,u;z_1,\dots,z_k)),z)\cdot r_2^{L(0)}P_m(Q(v_1,\dots,v_l,v;z_{k+1},\dots,z_{k+l})))\bigr) \\
&\quad= \sum_{p\in\mathbb{Z}}\lambda\bigl(P_p(Y(P_n(Q(r_1^{L(0)}u_1,\dots,r_1^{L(0)}u_k,r_1^{L(0)}u;r_1z_1,\dots,r_1z_k)),z)\cdot \\
&\qquad\qquad\cdot P_m(Q(r_2^{L(0)}v_1,\dots,r_2^{L(0)}v_l,r_2^{L(0)}v;r_2z_{k+1},\dots,r_2z_{k+l})))\bigr).
\end{aligned}
\tag{2.4}
\]
Using the associativity of vertex operators, we know that the element
\[ \sum_{n\in\mathbb{Z}}\sum_{m\in\mathbb{Z}}Y(P_n(Q(r_1^{L(0)}u_1,\dots,r_1^{L(0)}u_k,r_1^{L(0)}u;r_1z_1,\dots,r_1z_k)),z)\cdot P_m(Q(r_2^{L(0)}v_1,\dots,r_2^{L(0)}v_l,r_2^{L(0)}v;r_2z_{k+1},\dots,r_2z_{k+l})) \]
in $\overline{V}$ is equal to
\[ Q(r_1^{L(0)}u_1,\dots,r_1^{L(0)}u_k,r_1^{L(0)}u,r_2^{L(0)}v_1,\dots,r_2^{L(0)}v_l,r_2^{L(0)}v;\ r_1z_1+z,\dots,r_1z_k+z,z,r_2z_{k+1},\dots,r_2z_{k+l}) \]
when $(z_1,\dots,z_k)\in M^k_{<1}$ and $(z_{k+1},\dots,z_{k+l})\in M^l_{<1}$. This fact and (2.4) imply that
\[ \sum_{n\in\mathbb{Z}}\sum_{m\in\mathbb{Z}}\bigl(P_n(Q(u_1,\dots,u_k,u;z_1,\dots,z_k))\diamond_{[D(z;r_1,r_2)]}\lambda\bigr)\bigl(P_m(Q(v_1,\dots,v_l,v;z_{k+1},\dots,z_{k+l}))\bigr) \]
is convergent absolutely to
\[ \lambda\bigl(P_p(Q(r_1^{L(0)}u_1,\dots,r_1^{L(0)}u_k,r_1^{L(0)}u,r_2^{L(0)}v_1,\dots,r_2^{L(0)}v_l,r_2^{L(0)}v;\ r_1z_1+z,\dots,r_1z_k+z,z,r_2z_{k+1},\dots,r_2z_{k+l}))\bigr) \]
when $(z_1,\dots,z_k)\in M^k_{<1}$ and $(z_{k+1},\dots,z_{k+l})\in M^l_{<1}$. Since $\lambda\in\tilde{G}$,
\[ \sum_{p\in\mathbb{Z}}\lambda\bigl(P_p(Q(r_1^{L(0)}u_1,\dots,r_1^{L(0)}u_k,r_1^{L(0)}u,r_2^{L(0)}v_1,\dots,r_2^{L(0)}v_l,r_2^{L(0)}v;\ r_1z_1+z,\dots,r_1z_k+z,z,r_2z_{k+1},\dots,r_2z_{k+l}))\bigr) \]
is absolutely convergent when $(z_1,\dots,z_k)\in M^k_{<1}$ and $(z_{k+1},\dots,z_{k+l})\in M^l_{<1}$. The arguments above prove that the iterated sum
\[ \sum_{p\in\mathbb{Z}}\sum_{n\in\mathbb{Z}}\sum_{m\in\mathbb{Z}}\lambda\bigl(P_p(Y(P_n(Q(r_1^{L(0)}u_1,\dots,r_1^{L(0)}u_k,r_1^{L(0)}u;r_1z_1,\dots,r_1z_k)),z)\cdot P_m(Q(r_2^{L(0)}v_1,\dots,r_2^{L(0)}v_l,r_2^{L(0)}v;r_2z_{k+1},\dots,r_2z_{k+l})))\bigr) \]
is absolutely convergent when $(z_1,\dots,z_k)\in M^k_{<1}$ and $(z_{k+1},\dots,z_{k+l})\in M^l_{<1}$. The same method as in the proof of Proposition 2.1 shows that in fact the corresponding triple sum is absolutely convergent, and thus all the iterated sums are absolutely convergent and are all equal when $(z_1,\dots,z_k)\in M^k_{<1}$ and $(z_{k+1},\dots,z_{k+l})\in M^l_{<1}$. So
\[
\begin{aligned}
&\sum_{n\in\mathbb{Z}}\sum_{m\in\mathbb{Z}}\sum_{p\in\mathbb{Z}}\lambda\bigl(P_p(Y(P_n(Q(r_1^{L(0)}u_1,\dots,r_1^{L(0)}u_k,r_1^{L(0)}u;r_1z_1,\dots,r_1z_k)),z)\cdot \\
&\qquad\qquad\cdot P_m(Q(r_2^{L(0)}v_1,\dots,r_2^{L(0)}v_l,r_2^{L(0)}v;r_2z_{k+1},\dots,r_2z_{k+l})))\bigr) \\
&\quad= \sum_{n\in\mathbb{Z}}\sum_{m\in\mathbb{Z}}\bigl(P_n(Q(u_1,\dots,u_k,u;z_1,\dots,z_k))\diamond_{[D(z;r_1,r_2)]}\lambda\bigr)\bigl(P_m(Q(v_1,\dots,v_l,v;z_{k+1},\dots,z_{k+l}))\bigr)
\end{aligned}
\]
is absolutely convergent when $(z_1,\dots,z_k)\in M^k_{<1}$ and $(z_{k+1},\dots,z_{k+l})\in M^l_{<1}$. By (2.3), we see that the left-hand side of (2.3) is also absolutely convergent when $(z_1,\dots,z_k)\in M^k_{<1}$ and $(z_{k+1},\dots,z_{k+l})\in M^l_{<1}$, proving that
\[ \alpha_l(\lambda\otimes v_1\otimes\cdots\otimes v_l\otimes v\otimes\nu) \]
is indeed in $\tilde{G}$. □

By Proposition 2.2,
\[ \sum_{n\in\mathbb{Z}}\alpha_l(\lambda\otimes v_1\otimes\cdots\otimes v_l\otimes v\otimes\nu)\bigl(P_n(Q(u_1,\dots,u_k,u;z_1,\dots,z_k))\bigr) \]
is absolutely convergent and the sum is equal to
\[ g_k(\alpha_l(\lambda\otimes v_1\otimes\cdots\otimes v_l\otimes v\otimes\nu)\otimes u_1\otimes\cdots\otimes u_k\otimes u)\in F_k. \]
We define a linear map
\[ \beta_{k,l} : F_k^*\otimes F_l^*\to F_{k+l+1}^* \]
by
\[ (\beta_{k,l}(\mu,\nu))\bigl(g_{k+l+1}(\lambda\otimes u_1\otimes\cdots\otimes u_k\otimes u\otimes v_1\otimes\cdots\otimes v_l\otimes v)\bigr) = \mu\bigl(g_k(\alpha_l(\lambda\otimes v_1\otimes\cdots\otimes v_l\otimes v\otimes\nu)\otimes u_1\otimes\cdots\otimes u_k\otimes u)\bigr) \]
for $\lambda\in\tilde{G}$, $u_1,\dots,u_k,u,v_1,\dots,v_l,v\in V$, $\mu\in F_k^*$ and $\nu\in F_l^*$. In fact this formula only gives a linear map from $F_k^*\otimes F_l^*$ to the algebraic dual of $F_{k+l+1}$. We have:

Proposition 2.3. The image of the map $\beta_{k,l}$ is indeed in $F_{k+l+1}^*$ and the map $\beta_{k,l}$ is continuous.
Proof. To avoid notational confusion in the proof, we use $F_k^\zeta$, $k\ge 0$, to denote the space $F_k$ whose elements are viewed as functions of $(\zeta_1,\dots,\zeta_k)\in M^k_{<1}$. Similarly we have the notation $F_l^\eta$, $l\ge 0$. Let $\mu\in F_k^*$ and $\nu\in F_l^*$. For any $\lambda\in\tilde{G}$, $u_1,\dots,u_k,u,v_1,\dots,v_l,v\in V$, we have an element
\[ g(z_1,\dots,z_{k+l+1}) = g_{k+l+1}(\lambda\otimes u_1\otimes\cdots\otimes u_k\otimes u\otimes v_1\otimes\cdots\otimes v_l\otimes v) \tag{2.5} \]
of $F_{k+l+1}$. By definition,
\[
\begin{aligned}
|(\beta_{k,l}(\mu,\nu))(g(z_1,\dots,z_{k+l+1}))|
&= |(\beta_{k,l}(\mu,\nu))(g_{k+l+1}(\lambda\otimes u_1\otimes\cdots\otimes u_k\otimes u\otimes v_1\otimes\cdots\otimes v_l\otimes v))| \\
&= |\mu(g_k(\alpha_l(\lambda\otimes v_1\otimes\cdots\otimes v_l\otimes v\otimes\nu)\otimes u_1\otimes\cdots\otimes u_k\otimes u))|.
\end{aligned}
\tag{2.6}
\]
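We now view
\[ g_k(\alpha_l(\lambda\otimes v_1\otimes\cdots\otimes v_l\otimes v\otimes\nu)\otimes u_1\otimes\cdots\otimes u_k\otimes u) \]
as an element of $F_k^\zeta$. Then
\[
\begin{aligned}
&g_k(\alpha_l(\lambda\otimes v_1\otimes\cdots\otimes v_l\otimes v\otimes\nu)\otimes u_1\otimes\cdots\otimes u_k\otimes u) \\
&\quad= \sum_{p\in\mathbb{Z}}(\alpha_l(\lambda\otimes v_1\otimes\cdots\otimes v_l\otimes v\otimes\nu))\bigl(P_p(Q(u_1,\dots,u_k,u;\zeta_1,\dots,\zeta_k))\bigr) \\
&\quad= \sum_{p\in\mathbb{Z}}\nu\bigl(g_l((P_p(Q(u_1,\dots,u_k,u;\zeta_1,\dots,\zeta_k))\diamond_{[D(z;r_1,r_2)]}\lambda)\otimes v_1\otimes\cdots\otimes v_l\otimes v)\bigr).
\end{aligned}
\tag{2.7}
\]
For fixed $\zeta_1,\dots,\zeta_k$, we view
\[ g_l((P_p(Q(u_1,\dots,u_k,u;\zeta_1,\dots,\zeta_k))\diamond_{[D(z;r_1,r_2)]}\lambda)\otimes v_1\otimes\cdots\otimes v_l\otimes v) \]
as an element of $F_l^\eta$. Then by definition we have
\[
\begin{aligned}
&\sum_{p\in\mathbb{Z}}g_l((P_p(Q(u_1,\dots,u_k,u;\zeta_1,\dots,\zeta_k))\diamond_{[D(z;r_1,r_2)]}\lambda)\otimes v_1\otimes\cdots\otimes v_l\otimes v) \\
&\quad= \sum_{p\in\mathbb{Z}}\sum_{q\in\mathbb{Z}}\bigl(P_p(Q(u_1,\dots,u_k,u;\zeta_1,\dots,\zeta_k))\diamond_{[D(z;r_1,r_2)]}\lambda\bigr)\bigl(P_q(Q(v_1,\dots,v_l,v;\eta_1,\dots,\eta_l))\bigr) \\
&\quad= \sum_{p\in\mathbb{Z}}\sum_{q\in\mathbb{Z}}\sum_{j\in\mathbb{Z}}\lambda\bigl(P_j(Y(r_1^{L(0)}P_p(Q(u_1,\dots,u_k,u;\zeta_1,\dots,\zeta_k)),z)\cdot r_2^{L(0)}P_q(Q(v_1,\dots,v_l,v;\eta_1,\dots,\eta_l)))\bigr) \\
&\quad= \sum_{p\in\mathbb{Z}}\sum_{q\in\mathbb{Z}}\sum_{j\in\mathbb{Z}}\lambda\bigl(P_j(Y(P_p(Q(r_1^{L(0)}u_1,\dots,r_1^{L(0)}u_k,r_1^{L(0)}u;r_1\zeta_1,\dots,r_1\zeta_k)),z)\cdot \\
&\qquad\qquad\cdot P_q(Q(r_2^{L(0)}v_1,\dots,r_2^{L(0)}v_l,r_2^{L(0)}v;r_2\eta_1,\dots,r_2\eta_l)))\bigr).
\end{aligned}
\tag{2.8}
\]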
We now calculate the right-hand side of (2.8). For any nonzero complex numbers $t_0$, $t_1$, $t_2$,
\[
\begin{aligned}
&\sum_{p,q,j\in\mathbb{Z}}\lambda\bigl(P_j(Y(P_p(Q(r_1^{L(0)}u_1,\dots,r_1^{L(0)}u_k,r_1^{L(0)}u;r_1\zeta_1,\dots,r_1\zeta_k)),z)\cdot \\
&\qquad\qquad\cdot P_q(Q(r_2^{L(0)}v_1,\dots,r_2^{L(0)}v_l,r_2^{L(0)}v;r_2\eta_1,\dots,r_2\eta_l)))\bigr)\,t_0^jt_1^pt_2^q \\
&\quad= \sum_{p,q,j\in\mathbb{Z}}\lambda\bigl(P_j(t_0^{L(0)}Y(P_p(t_1^{L(0)}Q(r_1^{L(0)}u_1,\dots,r_1^{L(0)}u_k,r_1^{L(0)}u;r_1\zeta_1,\dots,r_1\zeta_k)),z)\cdot \\
&\qquad\qquad\cdot P_q(t_2^{L(0)}Q(r_2^{L(0)}v_1,\dots,r_2^{L(0)}v_l,r_2^{L(0)}v;r_2\eta_1,\dots,r_2\eta_l)))\bigr) \\
&\quad= \sum_{p,q,j\in\mathbb{Z}}\lambda\bigl(P_j(Y(P_p(Q((t_0t_1r_1)^{L(0)}u_1,\dots,(t_0t_1r_1)^{L(0)}u_k,(t_0t_1r_1)^{L(0)}u;t_0t_1r_1\zeta_1,\dots,t_0t_1r_1\zeta_k)),t_0z)\cdot \\
&\qquad\qquad\cdot P_q(Q((t_0t_2r_2)^{L(0)}v_1,\dots,(t_0t_2r_2)^{L(0)}v_l,(t_0t_2r_2)^{L(0)}v;t_0t_2r_2\eta_1,\dots,t_0t_2r_2\eta_l)))\bigr).
\end{aligned}
\tag{2.9}
\]
By the associativity of vertex operators and the definition of the map $g_{k+l+1}$, there exist real numbers $\delta_0,\delta_1,\delta_2>1$ such that the right-hand side of (2.9) is convergent absolutely to
\[
\begin{aligned}
&g_{k+l+1}\bigl(\lambda\otimes(t_0t_1r_1)^{L(0)}u_1\otimes\cdots\otimes(t_0t_1r_1)^{L(0)}u_k\otimes(t_0t_1r_1)^{L(0)}u\otimes \\
&\qquad\otimes(t_0t_2r_2)^{L(0)}v_1\otimes\cdots\otimes(t_0t_2r_2)^{L(0)}v_l\otimes(t_0t_2r_2)^{L(0)}v\bigr)\Big|_{\substack{z_i=t_0z+t_0t_1r_1\zeta_i,\ i=1,\dots,k,\\ z_{k+1}=t_0z,\ z_{k+1+j}=t_0t_2r_2\eta_j,\ j=1,\dots,l}}
\end{aligned}
\tag{2.10}
\]
when $0<|t_0|<\delta_0$, $0<|t_1|<\delta_1$ and $0<|t_2|<\delta_2$. In particular, when $t_0=t_1=t_2=1$, we see that (2.10) is equal to, and thus the right-hand side of (2.8) is convergent absolutely to,
\[
g_{k+l+1}\bigl(\lambda\otimes r_1^{L(0)}u_1\otimes\cdots\otimes r_1^{L(0)}u_k\otimes r_1^{L(0)}u\otimes r_2^{L(0)}v_1\otimes\cdots\otimes r_2^{L(0)}v_l\otimes r_2^{L(0)}v\bigr)\Big|_{\substack{z_i=z+r_1\zeta_i,\ i=1,\dots,k,\\ z_{k+1}=z,\ z_{k+1+j}=r_2\eta_j,\ j=1,\dots,l}}.
\tag{2.11}
\]
Hence for fixed $(\zeta_1,\dots,\zeta_k)\in M^k_{<1}$, (2.11) as a function of $\eta_1,\dots,\eta_l$ is an element of $F_l^\eta$. We denote this element by $f_{g;\zeta_1,\dots,\zeta_k}(\eta_1,\dots,\eta_l)$.

We now view $\mu$ and $\nu$ as elements of $(F_k^\zeta)^*$ and $(F_l^\eta)^*$, respectively. Since $\nu$ is continuous, by the calculations above, we have
\[
\begin{aligned}
\nu(f_{g;\zeta_1,\dots,\zeta_k}(\eta_1,\dots,\eta_l))
&= \nu\Bigl(\sum_{p\in\mathbb{Z}}g_l((P_p(Q(u_1,\dots,u_k,u;\zeta_1,\dots,\zeta_k))\diamond_{[D(z;r_1,r_2)]}\lambda)\otimes v_1\otimes\cdots\otimes v_l\otimes v)\Bigr) \\
&= \sum_{p\in\mathbb{Z}}\nu\bigl(g_l((P_p(Q(u_1,\dots,u_k,u;\zeta_1,\dots,\zeta_k))\diamond_{[D(z;r_1,r_2)]}\lambda)\otimes v_1\otimes\cdots\otimes v_l\otimes v)\bigr).
\end{aligned}
\tag{2.12}
\]
By the calculations from (2.6) to (2.12), we see that
\[ \nu(f_{g;\zeta_1,\dots,\zeta_k}(\eta_1,\dots,\eta_l))\in F_k^\zeta \]
and we obtain
\[ |(\beta_{k,l}(\mu,\nu))(g(z_1,\dots,z_{k+l+1}))| = |\mu(\nu(f_{g;\zeta_1,\dots,\zeta_k}(\eta_1,\dots,\eta_l)))|. \tag{2.13} \]
For $g(z_1,\dots,z_{k+l+1})\in F_{k+l+1}$ not of the form (2.5), we use linearity to define the functions $f_{g;\zeta_1,\dots,\zeta_k}(\eta_1,\dots,\eta_l)$. Then (2.13) holds for any $g(z_1,\dots,z_{k+l+1})\in F_{k+l+1}$.
From the definition of $f_{g;\zeta_1,\dots,\zeta_k}(\eta_1,\dots,\eta_l)$, we see that for any fixed $(\zeta_1,\dots,\zeta_k)\in M^k_{<1}$, the linear map from $F_{k+l+1}$ to $F_l^\eta$ given by
\[ g\mapsto f_{g;\zeta_1,\dots,\zeta_k}(\eta_1,\dots,\eta_l) \]
for $g\in F_{k+l+1}$ is continuous, and for any fixed $\nu\in(F_l^\eta)^*$, the linear map from $F_{k+l+1}$ to $F_k^\zeta$ given by
\[ g\mapsto\nu(f_{g;\zeta_1,\dots,\zeta_k}(\eta_1,\dots,\eta_l)) \]
is also continuous. Thus by (2.13), $\beta_{k,l}(\mu,\nu)$ is continuous. So the elements of the image of $\beta_{k,l}$ are in $F_{k+l+1}^*$.

From the definition of $f_{g;\zeta_1,\dots,\zeta_k}(\eta_1,\dots,\eta_l)$, we see that given any weakly bounded subset $B$ of $F_{k+l+1}$, the subset
\[ B' = \{f_{g;\zeta_1,\dots,\zeta_k}(\eta_1,\dots,\eta_l)\mid g\in B\}\subset F_l^\eta \]
for fixed $(\zeta_1,\dots,\zeta_k)\in M^k_{<1}$ is weakly bounded, and thus for any net $\{\nu_\alpha\}_{\alpha\in A}$ convergent to 0 in $(F_l^\eta)^*$, there exists $\alpha_0\in A$ such that the subset
\[ B''(\{\nu_\alpha\}_{\alpha\in A}) = \{\nu_\alpha(f_{g;\zeta_1,\dots,\zeta_k}(\eta_1,\dots,\eta_l))\mid g\in B,\ \alpha>\alpha_0\}\subset F_k^\zeta \]
is also weakly bounded.

Now let $\{(\mu_\alpha,\nu_\alpha)\}_{\alpha\in A}$ be a net convergent to 0 in $(F_k^\zeta)^*\times(F_l^\eta)^*$. Then the nets $\{\mu_\alpha\}_{\alpha\in A}$ and $\{\nu_\alpha\}_{\alpha\in A}$ are convergent to 0 in $(F_k^\zeta)^*$ and $(F_l^\eta)^*$, respectively. Thus by (2.13) and the discussion above,
\[ \sup_{g(z_1,\dots,z_{k+l+1})\in B}|(\beta_{k,l}(\mu_\alpha,\nu_\alpha))(g(z_1,\dots,z_{k+l+1}))| \le \sup_{h(\zeta_1,\dots,\zeta_k)\in B''(\{\nu_\alpha\}_{\alpha\in A})}|\mu_\alpha(h(\zeta_1,\dots,\zeta_k))| \tag{2.14} \]
when $\alpha>\alpha_0$. By the definition of the topology on $F_k^*$, the net on the right-hand side of (2.14) is convergent to 0. Thus the net on the left-hand side of (2.14) is convergent to 0, proving the continuity of the map $\beta_{k,l}$. □

Let
\[ h_1 = e_k(u_1\otimes\cdots\otimes u_k\otimes u\otimes\mu)\in G_k \quad\text{and}\quad h_2 = e_l(v_1\otimes\cdots\otimes v_l\otimes v\otimes\nu)\in G_l, \]
where $u_1,\dots,u_k,u,v_1,\dots,v_l,v\in X$, $\mu\in F_k^*$ and $\nu\in F_l^*$. We define
\[ (\overline{\nu}_Y([D(z;r_1,r_2)]))(h_1\otimes h_2) = e_{k+l+1}(u_1\otimes\cdots\otimes u_k\otimes u\otimes v_1\otimes\cdots\otimes v_l\otimes v\otimes\beta_{k,l}(\mu,\nu)). \]
Note that any element of $G_k$ or $G_l$ is a linear combination of elements of the form $h_1$ or $h_2$, respectively, given above, and that $k$ and $l$ are arbitrary. Thus we obtain a linear map
\[ \overline{\nu}_Y([D(z;r_1,r_2)])|_{G\otimes G} : G\otimes G\to G. \]

Proposition 2.4. The map $\overline{\nu}_Y([D(z;r_1,r_2)])|_{G\otimes G}$ is continuous.
Proof. From the definition of $\overline{\nu}_Y([D(z;r_1,r_2)])|_{G\otimes G}$, we see that for any $k,l\ge 0$, $(\overline{\nu}_Y([D(z;r_1,r_2)]))(G_k\otimes G_l)$ is in $G_{k+l+1}$. To prove that $\overline{\nu}_Y([D(z;r_1,r_2)])|_{G\otimes G}$ is continuous, we need only prove that for any $k,l\ge 0$, it is continuous as a map from $G_k\otimes G_l$ to $G_{k+l+1}$. Since the topologies on $G_k$ and $G_l$ are defined to be the quotient topologies on
\[ (X^{\otimes(k+1)}\otimes F_k^*)/(e_k|_{X^{\otimes(k+1)}\otimes F_k^*})^{-1}(0) \]
and
\[ (X^{\otimes(l+1)}\otimes F_l^*)/(e_l|_{X^{\otimes(l+1)}\otimes F_l^*})^{-1}(0), \]
respectively, we need only prove that the map
\[ (\overline{\nu}_Y([D(z;r_1,r_2)]))\circ(e_k|_{X^{\otimes(k+1)}\otimes F_k^*}\otimes e_l|_{X^{\otimes(l+1)}\otimes F_l^*}) : (X^{\otimes(k+1)}\otimes F_k^*)\otimes(X^{\otimes(l+1)}\otimes F_l^*)\to G_{k+l+1} \]
is continuous. On the other hand, by definition,
\[ (\overline{\nu}_Y([D(z;r_1,r_2)]))\circ(e_k|_{X^{\otimes(k+1)}\otimes F_k^*}\otimes e_l|_{X^{\otimes(l+1)}\otimes F_l^*}) = e_{k+l+1}|_{X^{\otimes(k+l+2)}\otimes F_{k+l+1}^*}\circ(I_{X^{\otimes(k+l+2)}}\otimes\beta_{k,l})\circ\sigma_{23}, \]
where $I_{X^{\otimes(k+l+2)}}$ is the identity map on $X^{\otimes(k+l+2)}$ and
\[ \sigma_{23} : (X^{\otimes(k+1)}\otimes F_k^*)\otimes(X^{\otimes(l+1)}\otimes F_l^*)\to X^{\otimes(k+l+2)}\otimes F_k^*\otimes F_l^* \]
is the map permuting the second and the third tensor factors, that is,
\[ \sigma_{23}(X_1\otimes\mu\otimes X_2\otimes\nu) = X_1\otimes X_2\otimes\mu\otimes\nu \]
for $X_1\in X^{\otimes(k+1)}$, $\mu\in F_k^*$, $X_2\in X^{\otimes(l+1)}$ and $\nu\in F_l^*$. Since the topology on $G_{k+l+1}$ is defined to be the quotient topology on
\[ (X^{\otimes(k+l+2)}\otimes F_{k+l+1}^*)/(e_{k+l+1}|_{X^{\otimes(k+l+2)}\otimes F_{k+l+1}^*})^{-1}(0), \]
we need only prove that the map $(I_{X^{\otimes(k+l+2)}}\otimes\beta_{k,l})\circ\sigma_{23}$ is continuous. But $I_{X^{\otimes(k+l+2)}}$ and $\sigma_{23}$ are obviously continuous and $\beta_{k,l}$ is continuous by Proposition 2.3. So $(I_{X^{\otimes(k+l+2)}}\otimes\beta_{k,l})\circ\sigma_{23}$ is indeed continuous. □

Since $G$ is dense in $H$, we can extend $\overline{\nu}_Y([D(z;r_1,r_2)])|_{G\otimes G}$ to a linear map $\overline{\nu}_Y([D(z;r_1,r_2)])$ from $H\,\tilde{\otimes}\,H$ to $H$.

Theorem 2.5. The map $\overline{\nu}_Y([D(z;r_1,r_2)])$ is a continuous extension of
\[ \nu_Y((z;0,(1/r_1,0),(1/r_2,0))) \]
to $H\,\tilde{\otimes}\,H$. That is, $\overline{\nu}_Y([D(z;r_1,r_2)])$ is continuous and
\[ \overline{\nu}_Y([D(z;r_1,r_2)])|_{V\otimes V} = \nu_Y((z;0,(1/r_1,0),(1/r_2,0))). \]
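Proof. The continuity of $\overline{\nu}_Y([D(z;r_1,r_2)])$ follows from the definition and Proposition 2.4. For any $m_1,\dots,m_k\in\mathbb{Z}$, let $\mu_{m_1,\dots,m_k}$ be an element of $F_k^*$ defined by
\[ \mu_{m_1,\dots,m_k}(g_k(\lambda\otimes u_1\otimes\cdots\otimes u_k\otimes u)) = \frac{1}{2\pi\sqrt{-1}}\oint_{|z_1|=\epsilon_1}\cdots\frac{1}{2\pi\sqrt{-1}}\oint_{|z_k|=\epsilon_k}z_1^{m_1}\cdots z_k^{m_k}\,g_k(\lambda\otimes u_1\otimes\cdots\otimes u_k\otimes u)\,dz_1\cdots dz_k \]
for $\lambda\in\tilde{G}$, $u_1,\dots,u_k,u\in V$, where $\epsilon_1,\dots,\epsilon_k$ are arbitrary positive real numbers satisfying $\epsilon_1>\cdots>\epsilon_k$. Since $V$ is generated by $X$, elements of the form $e_k(u_1\otimes\cdots\otimes u_k\otimes u\otimes\mu_{m_1,\dots,m_k})$, $k\ge 0$, $u_1,\dots,u_k,u\in X$ and $m_1,\dots,m_k\in\mathbb{Z}$, span $V$. Let
\[ h_1 = e_k(u_1\otimes\cdots\otimes u_k\otimes u\otimes\mu_{m_1,\dots,m_k}) \quad\text{and}\quad h_2 = e_l(v_1\otimes\cdots\otimes v_l\otimes v\otimes\mu_{n_1,\dots,n_l}) \]
be two such elements of $V$. Then
\[
\begin{aligned}
\langle\lambda,(\overline{\nu}_Y([D(z;r_1,r_2)]))(h_1\otimes h_2)\rangle
&= \langle\lambda,e_{k+l+1}(u_1\otimes\cdots\otimes u_k\otimes u\otimes v_1\otimes\cdots\otimes v_l\otimes v\otimes\beta_{k,l}(\mu_{m_1,\dots,m_k},\mu_{n_1,\dots,n_l}))\rangle \\
&= \beta_{k,l}(\mu_{m_1,\dots,m_k},\mu_{n_1,\dots,n_l})\bigl(g_{k+l+1}(\lambda\otimes u_1\otimes\cdots\otimes u_k\otimes u\otimes v_1\otimes\cdots\otimes v_l\otimes v)\bigr) \\
&= \mu_{m_1,\dots,m_k}\bigl(g_k(\alpha_l(\lambda\otimes v_1\otimes\cdots\otimes v_l\otimes v\otimes\mu_{n_1,\dots,n_l})\otimes u_1\otimes\cdots\otimes u_k\otimes u)\bigr).
\end{aligned}
\tag{2.15}
\]
As in the proof of Proposition 2.3, to avoid notational confusion, we view
\[ g_k(\alpha_l(\lambda\otimes v_1\otimes\cdots\otimes v_l\otimes v\otimes\mu_{n_1,\dots,n_l})\otimes u_1\otimes\cdots\otimes u_k\otimes u) \]
as an element of $F_k^\zeta$ and
\[ g_l((P_n(Q(u_1,\dots,u_k,u;\zeta_1,\dots,\zeta_k))\diamond_{[D(z;r_1,r_2)]}\lambda)\otimes v_1\otimes\cdots\otimes v_l\otimes v) \]
as an element of $F_l^\eta$. We also view $\mu_{m_1,\dots,m_k}$ and $\mu_{n_1,\dots,n_l}$ as elements of $(F_k^\zeta)^*$ and $(F_l^\eta)^*$, respectively. Then the right-hand side of (2.15) is equal to
\[
\begin{aligned}
&\mu_{m_1,\dots,m_k}\Bigl(\sum_{n\in\mathbb{Z}}\alpha_l(\lambda\otimes v_1\otimes\cdots\otimes v_l\otimes v\otimes\mu_{n_1,\dots,n_l})\bigl(P_n(Q(u_1,\dots,u_k,u;\zeta_1,\dots,\zeta_k))\bigr)\Bigr) \\
&\quad= \mu_{m_1,\dots,m_k}\Bigl(\sum_{n\in\mathbb{Z}}\mu_{n_1,\dots,n_l}\bigl(g_l((P_n(Q(u_1,\dots,u_k,u;\zeta_1,\dots,\zeta_k))\diamond_{[D(z;r_1,r_2)]}\lambda)\otimes v_1\otimes\cdots\otimes v_l\otimes v)\bigr)\Bigr) \\
&\quad= \mu_{m_1,\dots,m_k}\Bigl(\sum_{n\in\mathbb{Z}}\mu_{n_1,\dots,n_l}\Bigl(\sum_{m\in\mathbb{Z}}\bigl(P_n(Q(u_1,\dots,u_k,u;\zeta_1,\dots,\zeta_k))\diamond_{[D(z;r_1,r_2)]}\lambda\bigr)\bigl(P_m(Q(v_1,\dots,v_l,v;\eta_1,\dots,\eta_l))\bigr)\Bigr)\Bigr) \\
&\quad= \mu_{m_1,\dots,m_k}\Bigl(\sum_{n\in\mathbb{Z}}\mu_{n_1,\dots,n_l}\Bigl(\sum_{m\in\mathbb{Z}}\sum_{p\in\mathbb{Z}}\lambda\bigl(P_p(Y(r_1^{L(0)}P_n(Q(u_1,\dots,u_k,u;\zeta_1,\dots,\zeta_k)),z)\cdot r_2^{L(0)}P_m(Q(v_1,\dots,v_l,v;\eta_1,\dots,\eta_l)))\bigr)\Bigr)\Bigr) \\
&\quad= \frac{1}{2\pi\sqrt{-1}}\oint_{|\zeta_1|=\epsilon_1}\cdots\frac{1}{2\pi\sqrt{-1}}\oint_{|\zeta_k|=\epsilon_k}\zeta_1^{m_1}\cdots\zeta_k^{m_k}\sum_{n\in\mathbb{Z}}\frac{1}{2\pi\sqrt{-1}}\oint_{|\eta_1|=\delta_1}\cdots\frac{1}{2\pi\sqrt{-1}}\oint_{|\eta_l|=\delta_l}\eta_1^{n_1}\cdots\eta_l^{n_l}\cdot \\
&\qquad\cdot\sum_{m\in\mathbb{Z}}\sum_{p\in\mathbb{Z}}\lambda\bigl(P_p(Y(r_1^{L(0)}P_n(Q(u_1,\dots,u_k,u;\zeta_1,\dots,\zeta_k)),z)\cdot r_2^{L(0)}P_m(Q(v_1,\dots,v_l,v;\eta_1,\dots,\eta_l)))\bigr)\,d\eta_1\cdots d\eta_l\,d\zeta_1\cdots d\zeta_k \\
&\quad= \sum_{n\in\mathbb{Z}}\sum_{m\in\mathbb{Z}}\sum_{p\in\mathbb{Z}}\lambda\bigl(P_p(Y(r_1^{L(0)}P_n(\operatorname{Res}_{x_1}\cdots\operatorname{Res}_{x_k}x_1^{m_1}\cdots x_k^{m_k}\,Y(u_1,x_1)\cdots Y(u_k,x_k)u),z)\cdot \\
&\qquad\cdot r_2^{L(0)}P_m(\operatorname{Res}_{y_1}\cdots\operatorname{Res}_{y_l}y_1^{n_1}\cdots y_l^{n_l}\,Y(v_1,y_1)\cdots Y(v_l,y_l)v))\bigr) \\
&\quad= \sum_{p\in\mathbb{Z}}\lambda\bigl(P_p(Y(r_1^{L(0)}h_1,z)\,r_2^{L(0)}h_2)\bigr) \\
&\quad= \langle\lambda,Y(r_1^{L(0)}h_1,z)\,r_2^{L(0)}h_2\rangle.
\end{aligned}
\tag{2.16}
\]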
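To see concretely what the functionals $\mu_{m_1,\dots,m_k}$ extract, consider the simplest case $k=1$. The sketch below assumes the usual mode expansion $Y(u,x)=\sum_{n\in\mathbb{Z}}u_nx^{-n-1}$, takes $\lambda\in V'$ so that the series is a finite sum, and assumes that $g_1(\lambda\otimes u_1\otimes u)$ is the function $z_1\mapsto\lambda(Y(u_1,z_1)u)$; this last identification is a shorthand for the definition of $g_k$ given earlier in the paper, not a restatement of it.

```latex
% Assumptions (labelled above): Y(u,x) = \sum_n u_n x^{-n-1}, \lambda \in V',
% and g_1(\lambda \otimes u_1 \otimes u)(z_1) = \lambda(Y(u_1,z_1)u).
\begin{align*}
\mu_{m}\bigl(g_1(\lambda\otimes u_1\otimes u)\bigr)
 &= \frac{1}{2\pi\sqrt{-1}}\oint_{|z_1|=\epsilon_1} z_1^{m}
    \sum_{n\in\mathbb{Z}}\lambda\bigl((u_1)_n u\bigr)\,z_1^{-n-1}\,dz_1 \\
 &= \sum_{n\in\mathbb{Z}}\lambda\bigl((u_1)_n u\bigr)\,\delta_{n,m}
  = \lambda\bigl((u_1)_m u\bigr)
  = \lambda\bigl(\operatorname{Res}_{x_1}x_1^{m}\,Y(u_1,x_1)u\bigr).
\end{align*}
% So e_1(u_1 \otimes u \otimes \mu_m) pairs with \lambda as the vector
% (u_1)_m u \in V does, which is the k = 1 instance of the residue
% expressions appearing in (2.16).
```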
From (2.15) and (2.16), we get
\[ (\overline{\nu}_Y([D(z;r_1,r_2)]))(h_1\otimes h_2) = Y(r_1^{L(0)}h_1,z)\,r_2^{L(0)}h_2 = (\nu_Y((z;0,(1/r_1,0),(1/r_2,0))))(h_1\otimes h_2), \]
proving the theorem. □

3. A Locally Convex Completion of a Finitely-Generated Module and the Vertex Operator Map

In this section, we discuss modules. Since the constructions and proofs are all similar to the case of algebras, we shall only state the results and point out the slight differences
between the constructions and proofs in the case of modules and those in the case of algebras.

Let $V$ be a $\mathbb{Z}$-graded finitely generated vertex algebra satisfying the standard grading-restriction axioms and $W$ a $\mathbb{C}$-graded finitely-generated $V$-module satisfying the standard grading-restriction axioms, that is,
\[ W = \coprod_{n\in\mathbb{C}}W_{(n)},\qquad \dim W_{(n)}<\infty\ \text{ for } n\in\mathbb{C}, \]
and
\[ W_{(n)} = 0 \]
for $n\in\mathbb{C}$ whose real part is sufficiently small. Let $M$ be the finite-dimensional vector space spanned by a set of generators of $W$. As in Sect. 1, we assume that $X$ is a finite-dimensional subspace of $V$ containing $\mathbf{1}$ and spanned by a set of generators of $V$.

We construct spaces $\tilde{G}^W$ and $F_k^W$, $k\ge 0$, and their duals in the same way as the constructions of $\tilde{G}$ and $F_k$, $k\ge 0$, and their duals in Sect. 1, except that $V^*$ is replaced by $W^*$, $v\in V$ is replaced by $w\in W$ and the vertex operator map $Y$ for $V$ is replaced by the vertex operator map $Y_W$ for the module $W$. We also define linear maps $g_k^W$, $\iota_{F_k^W}$, $\gamma_k^W$, $e_k^W$, $k\ge 0$, and linear maps induced from them in the same way as in the definitions of $g_k$, $\iota_{F_k}$, $\gamma_k$, $e_k$, $k\ge 0$, and induced linear maps in Sect. 1, except that the spaces are replaced by the corresponding spaces involving $W$ and $M$.

For $k\ge 0$, $e_k^W$ is a linear map from $V^{\otimes k}\otimes W\otimes(F_k^W)^*$ to $(\tilde{G}^W)^*$. Let
\[ G_k^W = e_k^W(X^{\otimes k}\otimes M\otimes(F_k^W)^*) \]
and
\[ G^W = \bigcup_{k\ge 0}G_k^W. \]
As in the case of $\tilde{G}$ and $H$, the spaces $\tilde{G}^W$ and $(\tilde{G}^W)^*$ form a dual pair of vector spaces and thus we have a locally convex topology on $(\tilde{G}^W)^*$. As in Sect. 1, we have:

Proposition 3.1. For any $k\ge 0$, the linear map
\[ e_k^W|_{X^{\otimes k}\otimes M\otimes(F_k^W)^*} : X^{\otimes k}\otimes M\otimes(F_k^W)^*\to(\tilde{G}^W)^* \]
is continuous. In particular,
\[ (X^{\otimes k}\otimes M\otimes(F_k^W)^*)/(e_k^W|_{X^{\otimes k}\otimes M\otimes(F_k^W)^*})^{-1}(0) \]
and $G_k^W$ are locally convex spaces.

For $k\ge 0$, let $H_k^W$ be the completion of $G_k^W$ and
\[ H^W = \bigcup_{k\ge 0}H_k^W. \]
Then $H_k^W$, $k\ge 0$, are complete locally convex spaces.
Proposition 3.2. For $k\ge 0$, $H_k^W$ can be embedded canonically in $H_{k+1}^W$ and the topology on $H_k^W$ is induced from the topology on $H_{k+1}^W$.

Theorem 3.3. The vector space $H^W$ equipped with the strict inductive limit topology is a locally convex completion of $W$. In particular, $G^W$ is dense in $H^W$.

Next we extend the map
\[ Y_W(r_1^{L(0)}\,\cdot\,,z)\,r_2^{L(0)}\,\cdot\; : V\otimes W\to\overline{W} \]
associated to $[D(z;r_1,r_2)]$ to a linear map $\overline{\nu}_{Y_W}([D(z;r_1,r_2)])$ from $H\,\tilde{\otimes}\,H^W$ to $H^W$. We define $u\diamond_{[D(z;r_1,r_2)]}\lambda\in W^*$ for $\lambda\in\tilde{G}^W$ and $u\in V$, the maps
\[ \alpha_l^W : \tilde{G}^W\otimes X^l\otimes W\otimes(F_l^W)^*\to\tilde{G}, \]
$l\ge 0$, and
\[ \beta_{k,l}^W : F_k^*\otimes(F_l^W)^*\to(F_{k+l+1}^W)^*, \]
$k,l\ge 0$, in the same way as in the definitions in Sect. 2 of $u\diamond_{[D(z;r_1,r_2)]}\lambda\in V^*$ for $\lambda\in\tilde{G}$ and $u\in V$, the maps
\[ \alpha_l : \tilde{G}\otimes X^{l+1}\otimes F_l^*\to V^*, \]
$l\ge 0$, and
\[ \beta_{k,l} : F_k^*\otimes F_l^*\to F_{k+l+1}^*, \]
$k,l\ge 0$. (Note that the image of $\alpha_l^W$ is in $\tilde{G}$, not in $\tilde{G}^W$. This is the reason why the domain of $\beta_{k,l}^W$ is $F_k^*\otimes(F_l^W)^*$, not $(F_k^W)^*\otimes(F_l^W)^*$.)

Let
\[ h_1 = e_k(u_1\otimes\cdots\otimes u_k\otimes u\otimes\mu)\in G_k \quad\text{and}\quad h_2 = e_l^W(v_1\otimes\cdots\otimes v_l\otimes w\otimes\nu)\in G_l^W, \]
where $u_1,\dots,u_k,u,v_1,\dots,v_l\in X$, $w\in M$ and $\mu\in F_k^*$, $\nu\in(F_l^W)^*$. Define
\[ (\overline{\nu}_{Y_W}([D(z;r_1,r_2)]))(h_1\otimes h_2) = e_{k+l+1}^W(u_1\otimes\cdots\otimes u_k\otimes u\otimes v_1\otimes\cdots\otimes v_l\otimes w\otimes\beta_{k,l}^W(\mu,\nu)). \]
Note that any element of $G_k$ or $G_l^W$ is a linear combination of elements of the form $h_1$ or $h_2$, respectively, above, and that $k$ and $l$ are arbitrary. Thus we obtain a linear map
\[ \overline{\nu}_{Y_W}([D(z;r_1,r_2)])|_{G\otimes G^W} : G\otimes G^W\to G^W. \]

Proposition 3.4. The map $\overline{\nu}_{Y_W}([D(z;r_1,r_2)])|_{G\otimes G^W}$ is continuous.

By this proposition and the fact that $G$ and $G^W$ are dense in $H$ and $H^W$, respectively, we can extend $\overline{\nu}_{Y_W}([D(z;r_1,r_2)])|_{G\otimes G^W}$ to a continuous linear map $\overline{\nu}_{Y_W}([D(z;r_1,r_2)])$ from $H\,\tilde{\otimes}\,H^W$ to $H^W$.

Theorem 3.5. The map $\overline{\nu}_{Y_W}([D(z;r_1,r_2)])$ is a continuous extension of
\[ Y_W(r_1^{L(0)}\,\cdot\,,z)\,r_2^{L(0)}\,\cdot\; : V\otimes W\to\overline{W} \]
to $H\,\tilde{\otimes}\,H^W$.

Acknowledgement. This research is supported in part by NSF grant DMS-9622961.
References

[B] Borcherds, R.E.: Vertex algebras, Kac-Moody algebras, and the Monster. Proc. Natl. Acad. Sci. USA 83, 3068–3071 (1986)
[F] Frenkel, I.B.: Talk presented at the Institute for Advanced Study, 1988; and private communications
[FHL] Frenkel, I.B., Huang, Y.-Z. and Lepowsky, J.: On axiomatic approaches to vertex operator algebras and modules. Preprint, 1989; Memoirs Am. Math. Soc. 104, 1993
[FLM] Frenkel, I.B., Lepowsky, J. and Meurman, A.: Vertex operator algebras and the Monster. Pure and Appl. Math. 134, New York: Academic Press, 1988
[H1] Huang, Y.-Z.: On the geometric interpretation of vertex operator algebras. Ph.D thesis, Rutgers University, 1990
[H2] Huang, Y.-Z.: Geometric interpretation of vertex operator algebras. Proc. Natl. Acad. Sci. USA 88, 9964–9968 (1991)
[H3] Huang, Y.-Z.: Applications of the geometric interpretation of vertex operator algebras. In: Proc. 20th International Conference on Differential Geometric Methods in Theoretical Physics, New York, 1991, ed. S. Catto and A. Rocha, Singapore: World Scientific, 1992, pp. 333–343
[H4] Huang, Y.-Z.: Vertex operator algebras and conformal field theory. Intl. J. of Mod. Phys. A7, 2109–2151 (1992)
[H5] Huang, Y.-Z.: Intertwining operator algebras, genus-zero modular functors and genus-zero conformal field theories. In: Operads: Proceedings of Renaissance Conferences, ed. J.-L. Loday, J. Stasheff, and A. A. Voronov, Contemporary Math. 202, Providence, RI: Am. Math. Soc. 1997, pp. 335–355
[H6] Huang, Y.-Z.: Two-dimensional conformal geometry and vertex operator algebras. Progress in Mathematics, Vol. 148, Boston: Birkhäuser, 1997
[H7] Huang, Y.-Z.: Genus-zero modular functors and intertwining operator algebras. Internat. J. Math. 9, 845–863 (1998)
[K1] Köthe, G.: Topological vector spaces I. Grundlehren der mathematischen Wissenschaften 159, English Edition, New York: Springer-Verlag, 1969
[K2] Köthe, G.: Topological vector spaces II. Grundlehren der mathematischen Wissenschaften 237, New York: Springer-Verlag, 1979
[S1] Segal, G.B.: The definition of conformal field theory. Preprint, 1988
[S2] Segal, G.B.: Two-dimensional conformal field theories and modular functors. In: Proceedings of the IXth International Congress on Mathematical Physics, Swansea, 1988, Bristol: Hilger, 1989, 22–37

Communicated by G. Felder
Commun. Math. Phys. 204, 85 – 88 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Weakly Mixing Invariant Tori of Hamiltonian Systems

Oliver Knill

Department of Mathematics, University of Texas, Austin, TX 78712, USA. E-mail: [email protected]

Received: 24 April 1998 / Accepted: 14 January 1999
Abstract: We note that every finite or infinite dimensional real-analytic Hamiltonian system with a quasi-periodic invariant KAM torus of finite dimension $d \ge 2$ can be perturbed in such a way that the new real-analytic Hamiltonian system has a weakly mixing invariant torus of the same dimension.

1. Introduction

By the celebrated Kolmogorov–Arnold–Moser theory, Hamiltonian systems often have invariant tori on which the motion is quasi-periodic. While the dynamics on one-dimensional periodic orbits is always trivial, the Hamiltonian dynamics induced on higher-dimensional invariant tori can be interesting. The reason is the nontrivial ergodic theory of the systems

$$\frac{dx_i}{dt} = \alpha_i F(x)^{-1} \qquad (1)$$

which are obtained by a change of time from a linear flow $\dot x = \alpha$ and which have the invariant measure $\mu = F(x)\,dx$ (see [1]). The flow $\phi^t$ can be either weakly mixing or can be conjugated to the linear flow $\dot x = \alpha$. Weak mixing means

$$\lim_{T\to\infty} T^{-1}\int_0^T \left|\mu(Y\cap\phi^t(Y)) - \mu(Y)^2\right| dt = 0$$

for any measurable set $Y$ and is a weak type of chaos. The question of what kind of dynamics occurs for a given $\alpha$ and $F$ is interesting and has been studied for quite a while. Much is known in the case $d = 2$, a situation of wider interest because any smooth flow on the two-dimensional torus with no fixed points and some absolutely continuous invariant measure reduces to (1) by a change of coordinates [1]. In two dimensions and for smooth $F$, no strong mixing $\mu(Y\cap\phi^t(Y)) \to \mu(Y)^2$ can happen [4]. While weak mixing occurs Baire generically [3], one can for almost all $\alpha$ conjugate the system to the quasi-periodic flow with $F = 1$. While for Diophantine rotation numbers, there is point spectrum by Kolmogorov’s theorem and for Liouville
rotation numbers there is zero-dimensional but continuous spectrum in general [3], it is not known whether the dimension of spectral measures can become positive. The ergodic theory of (1) is less understood in dimensions $d \ge 3$. We will note in Sect. 2 that near $F = 1$, we have generically weak mixing. Arnold mentioned in an interview [6] that Kolmogorov’s work on KAM theory was motivated by the question of whether mixing invariant tori exist for most Hamiltonian systems. While this question is open, our note shows that in a weak sense, the answer is yes: weakly mixing tori of dimension $d \ge 2$ occur densely in some open sets of Hamiltonian systems. The result does not only apply to finite dimensional systems. KAM theorems for infinite dimensional Hamiltonian systems often lead to the persistence of finite-dimensional invariant tori [5,9]. In such a situation, one can perturb the infinite-dimensional Hamiltonian to obtain weakly mixing invariant measures of some PDEs.

2. A Higher Dimensional Version of Sklover’s Theorem

Sklover has shown [10,1] that smooth differential equations on the two torus exist for which the dynamics is weakly mixing. The weak mixing property is even Baire generic [3] for real analytic $F$’s. This can be generalized to higher dimensions:

Proposition 2.1 (Generalization of Sklover’s theorem). For a Baire generic set of $(F, \alpha)$ near $F = 1$, the flow $\dot x_i = \alpha_i/F(x)$ has purely singular continuous spectrum. Such systems are in general ergodic and weakly mixing.

Proof. If the coordinates $\alpha_i$ of $\alpha$ are rationally dependent, then every orbit is periodic and $\mathbb{T}^n$ is foliated by one-dimensional tori, which are parameterized by $\mathbb{T}^{n-1}$. A general measure preserving flow $T_t$ on the torus $\mathbb{T}^n$ defines a one parameter family $U_t$ of unitary operators $U_t f = f(T_{-t})$ on the Hilbert space $L^2(\mathbb{T}^d, dx)$. By Stone’s theorem there is an infinitesimal generator $L$ satisfying $U_t = \exp(iLt)$ which we call the Liouville operator.
There is a continuum of distinct ergodic invariant measures $m_y$ and the Liouville operator $L$ is an integral $L = \int_{\mathbb{T}^{n-1}} L(y)\,dy$, where $L(y) = p(y)\partial_x$ and $p(y)$ is the period of the flow. $L$ has pure absolutely continuous spectrum on the orthocomplement of constant functions if and only if the set of all orbits with a given period has measure zero. This condition for $F$ is true for an open dense set of $F$’s. If $\alpha$ is Diophantine and the real-analytic $F$ is near 1, then the flow is conjugated to the linear flow and has pure point spectrum. This result of Arnold and Moser can also be derived from the fact that every time $t$ map is conjugated to a map $x \mapsto x + \alpha$ (see [7, 2]). There is a dense set of Liouville operators with absolutely continuous spectrum and a dense set of Liouville operators with discrete spectrum. By Simon’s theorem [11], $L$ Baire generically has purely singular continuous spectrum. (See [3] for details, like how to deal with the fact that the different Liouville operators are defined on different Hilbert spaces. Simon’s theorem is: Let $X$ be a complete metric space of self-adjoint operators on a separable Hilbert space for which convergence in the metric implies strong resolvent convergence. Suppose the two sets of operators in $X$ that have purely continuous spectrum and purely discrete spectrum are both dense in $X$. Then there is a dense $G_\delta$ in $X$ of operators that have purely singular continuous spectrum.) □

Remark. While the result in two dimensions which we obtained together with A. Hof [3] is global, we don’t know whether Proposition 2.1 can be made global in dimensions $d \ge 3$.
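The time change behind system (1) can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the rotation vector `ALPHA`, the positive function `F` and the Runge–Kutta integrator are hypothetical choices that do not come from the text. It integrates $\dot x_i = \alpha_i/F(x)$ in lifted coordinates on $\mathbb{R}^2$ and checks that the orbit stays on the straight line through the starting point with direction $\alpha$, so that only the time parameterization differs from the linear flow.

```python
import math

ALPHA = (1.0, math.sqrt(2.0))  # rotation vector (illustrative choice)

def F(x):
    """Positive, 1-periodic time-change function on the 2-torus (hypothetical)."""
    return 1.0 + 0.3 * math.cos(2 * math.pi * x[0]) + 0.2 * math.sin(2 * math.pi * x[1])

def velocity(x):
    """Right-hand side of (1): dx_i/dt = alpha_i / F(x)."""
    s = 1.0 / F(x)
    return (ALPHA[0] * s, ALPHA[1] * s)

def rk4_step(x, h):
    """One classical Runge-Kutta step of size h."""
    k1 = velocity(x)
    k2 = velocity((x[0] + 0.5 * h * k1[0], x[1] + 0.5 * h * k1[1]))
    k3 = velocity((x[0] + 0.5 * h * k2[0], x[1] + 0.5 * h * k2[1]))
    k4 = velocity((x[0] + h * k3[0], x[1] + h * k3[1]))
    return (x[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            x[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)

def flow(x0, t, steps=2000):
    """Integrate in lifted coordinates on R^2; reduce mod 1 to get torus points."""
    x, h = x0, t / steps
    for _ in range(steps):
        x = rk4_step(x, h)
    return x

start = (0.1, 0.2)
end = flow(start, 5.0)
dx = (end[0] - start[0], end[1] - start[1])
# The displacement is parallel to ALPHA: the 2x2 determinant below vanishes.
cross = dx[0] * ALPHA[1] - dx[1] * ALPHA[0]
print(abs(cross) < 1e-9)
```

Whether such a flow is weakly mixing depends on fine arithmetic properties of $\alpha$ and on $F$, which a finite-precision experiment of this kind cannot detect.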
3. Weakly Mixing Invariant Tori

Consider a Hamiltonian vector field $X_H$ with smooth Hamiltonian $H$ on a symplectic manifold $M$. The function $H$ is an integral of motion and the vector field $X_H$ is tangential to any energy surface $C = \{x \mid H(x) = c\}$. If $c$ is not a critical value for $H$, then $C$ is a smooth submanifold and the vector field $X_H$ does not vanish on $C$. If $K$ is another Hamiltonian for which $C$ is an energy surface, $C = \{x \mid H(x) = c\} = \{x \mid K(x) = c'\}$ with $dH, dK \neq 0$ on $C$, then $\nabla K(x) = F(x)\nabla H(x)$ at every point $x \in C$ with $F(x) \neq 0$ and therefore $X_K(x) = F(x)X_H(x)$, $F(x) \neq 0$ on $C$. It follows that $X_H$ and $X_K$ have the same orbits on $C$ although their time parameterization will be changed in general. In particular, any invariant torus of $X_H$ which is contained in the energy surface $C$ is an invariant torus of $X_K$ and the change of the Hamiltonian produces a time change on this torus. Changing the Hamiltonian is useful for the study of periodic orbits (e.g. [12, Lemma 2.1]).

Lemma 3.1 (Poincaré trick). Given a Hamiltonian system with a $d$-dimensional invariant torus $N$ and any smooth function $F$ on $N$ with no roots on $N$, there exists a new Hamiltonian $K$ which has the same invariant torus, on which the dynamics is obtained by a change of time with function $F$.

Proof. An explicit choice for $K$ is $F(x)(H - c)$, which has $C = \{K = 0\}$. □

We also need to change the rotation vector on invariant quasi-periodic tori.

Lemma 3.2 (Change of the rotation vector). Given a Hamiltonian system with a $d$-dimensional invariant torus $N$ on which the dynamics is a rotation with rotation vector $\alpha$. For any $\beta$ near $\alpha$, there exists a Hamiltonian $K$ near $H$ for which the torus $N$ is still invariant and quasi-periodic with rotation vector $\beta$.

Proof. A change of variables $A : z \mapsto (\phi, I)$ defined near the invariant torus $N$ brings the Hamiltonian into action-angle variables on the invariant torus: $\dot\phi = \alpha + g(I, \phi)$, $\dot I = h(I, \phi)$, where $g(I, \phi)$ and $h(I, \phi)$ vanish on $N = \{I = \alpha\}$ (see e.g. [8]). If $\tilde H(\phi, I) = H(A^{-1}(\phi, I))$ is the Hamiltonian in these new variables, change it to $\tilde K(\phi, I) = \tilde H(\phi, I) + (\beta - \alpha)I$ and define $K(x, y) = \tilde K(A(x, y))$. The flow of $X_K$ leaves $N$ invariant and is conjugated there to a quasi-periodic flow with rotation vector $\beta$. □

Remark. The invariant torus $N$ obtained like this loses the KAM property during the perturbation. However, by KAM, a different torus with the same rotation vector $\alpha$ will persist and the perturbed system will now have two invariant tori, one with rotation vector $\alpha$ and one with rotation vector $\beta$.

Theorem 3.3. Given a real-analytic Hamiltonian $H$ for which there exists an invariant torus $N$ on which the dynamics is quasi-periodic, there exists a Hamiltonian $K$ arbitrarily close to $H$ for which the same torus $N$ is still invariant and for which the dynamics on $N$ is ergodic and weakly mixing.
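The identity $X_K = F\,X_H$ on the energy surface, on which Lemma 3.1 rests, can be checked numerically in a toy case. The sketch below is an illustration only: the harmonic oscillator `H`, the level `C_LEVEL`, the function `F` and the finite-difference helper are hypothetical choices that do not appear in the text, and the invariant "torus" is just the unit circle (one degree of freedom).

```python
import math

C_LEVEL = 0.5  # energy level c; the surface C = {H = c} is the unit circle

def H(q, p):
    """Toy Hamiltonian (harmonic oscillator), assumed for illustration."""
    return 0.5 * (q * q + p * p)

def F(q, p):
    """Smooth function with no zeros near the unit circle (hypothetical)."""
    return 2.0 + q

def K(q, p):
    """New Hamiltonian from the proof of Lemma 3.1: K = F * (H - c)."""
    return F(q, p) * (H(q, p) - C_LEVEL)

def hamiltonian_field(G, q, p, h=1e-5):
    """X_G = (dG/dp, -dG/dq), approximated by central differences."""
    dG_dq = (G(q + h, p) - G(q - h, p)) / (2 * h)
    dG_dp = (G(q, p + h) - G(q, p - h)) / (2 * h)
    return (dG_dp, -dG_dq)

def max_defect(samples=12):
    """Largest component of X_K - F * X_H over sample points of C."""
    worst = 0.0
    for k in range(samples):
        angle = 2 * math.pi * k / samples
        q, p = math.cos(angle), math.sin(angle)
        xh = hamiltonian_field(H, q, p)
        xk = hamiltonian_field(K, q, p)
        f = F(q, p)
        worst = max(worst, abs(xk[0] - f * xh[0]), abs(xk[1] - f * xh[1]))
    return worst

print(max_defect() < 1e-6)
```

Away from $C$ the two vector fields differ by the term $(H - c)X_F$, which is why the comparison is made only at points of the level set.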
Proof. Using Proposition 2.1, take $(\beta, F)$ with $\beta$ near $\alpha$ and $F$ near 1 such that the corresponding flow on the torus is weakly mixing. We make now a first change of the Hamiltonian $H \to K_1$ such that the rotation vector of $N$ is changed to $\beta$. Let $c$ be the energy of an orbit on $N$. Let $K = F(K_1 - c)$ be the Hamiltonian obtained from $K_1$. The flow of this Hamiltonian induced on the invariant torus $N$ is weakly mixing. □

The result generalizes obviously to infinite-dimensional Hamiltonian systems for which KAM theory assures that finite dimensional tori survive (see [5]). For example, there are perturbations of some nonlinear wave equations which have weakly mixing invariant tori.

Acknowledgements. I acknowledge the support of the Swiss National Foundation and thank the Mathematical Physics group at the University of Texas at Austin for hospitality. This note is a spin-off from a collaboration [3] with Bert Hof. A key input came from Maciej Wojtkowski, who suggested to us after our seminar talk in Tucson to use the Poincaré trick to apply the torus results to Hamiltonian systems.
References

1. Cornfeld, I.P., Fomin, S.V. and Sinai, Ya.G.: Ergodic Theory. Volume 115 of Grundlehren der mathematischen Wissenschaften in Einzeldarstellungen, Berlin–Heidelberg–New York: Springer Verlag, 1982
2. Herman, M-R.: Sur les courbes invariantes par les difféomorphismes de l’anneau. Vol. 2. Astérisque 144, 1–248 (1986)
3. Hof, A. and Knill, O.: Zero dimensional singular continuous spectrum for smooth differential equations on the torus. Ergod. Th. Dyn. Sys. 18, 879–888 (1998)
4. Katok, A.B.: Spectral properties of dynamical systems with an integral invariant on the torus. Functional Anal. Appl. 1, 296–305 (1967)
5. Kuksin, S.: Nearly integrable infinite-dimensional Hamiltonian systems. Volume 1556 of Lecture notes in mathematics, Berlin–Heidelberg–New York: Springer-Verlag, 1993
6. Lui, S.H.: An interview with Vladimir Arnold. Notices of the AMS, April, 1997
7. Moser, J.: A rapidly convergent iteration method and non-linear differential equations. II. Ann. Scuola Norm. Sup. Pisa (3) 20, 499–535 (1966)
8. Pöschel, J.: Integrability of hamiltonian systems on cantor sets. Commun. Pure Appl. Math. 35, 653–696 (1982)
9. Pöschel, J.: A KAM-theorem for some nonlinear partial differential equations. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 23, 119–148 (1996)
10. Shklover, M.D.: On classical dynamical systems on the torus with continuous spectrum. Izv. Vyssh. Uchebn. Zaved. Mat. 10, 113–124 (1967) (In Russian)
11. Simon, B.: Operators with singular continuous spectrum: I. General operators. Annals of Mathematics 141, 131–145 (1995)
12. Weinstein, A.: Periodic orbits for convex Hamiltonian systems. Annals of Mathematics 108, 507–518 (1978)

Communicated by Ya. G. Sinai
Commun. Math. Phys. 204, 89 – 114 (1999)
Kneading Theory: A Functorial Approach

J. F. Alves, J. Sousa Ramos

Departamento de Matemática, Instituto Superior Técnico, Av. Rovisco Pais, 1096 Lisboa Cedex, Portugal. E-mail: [email protected]; [email protected]

Received: 25 February 1998 / Accepted: 15 January 1999
Abstract: We study the homology theory of $\ell$-modal maps of the interval. We give another proof of the Milnor and Thurston results about zeta-functions and we give a functorial approach to kneading theory. Our results give explicit methods for computing the sequences of lap numbers $\ell(f^k)$ and the sequences of numbers of periodic points in an arbitrary interval $[x, y]$.

1. Introduction and Preliminaries

Mappings from an interval to itself provide a surprisingly complex behavior, useful to the study of higher dimensional smooth dynamical systems. In this paper we present an explicit method for computing the sequence of lap numbers $\ell(f^k)$ and the sequence of numbers of periodic points (of negative and critical type) in arbitrary intervals $[a, b]$. We obtain the Milnor and Thurston [Mi-Th 88] results using a functorial approach to kneading theory. See also Mori [Mo 90] and Baladi and Ruelle [Ba-Ru 94].

Let $I = [a, b] \subseteq \mathbb{R}$ be a compact interval. By definition, a continuous map $f : I \to I$ is piecewise monotone if there are points $a < c_1 < \cdots < c_\ell < b$ at which $f$ has a local extremum and $f$ is strictly monotone in each of the intervals $I_1 = [a, c_1]$, $I_2 = [c_1, c_2]$, ..., $I_{\ell+1} = [c_\ell, b]$. In this case, the points $c_1, \ldots, c_\ell$ are called the turning points of $f$, the intervals $I_1, \ldots, I_{\ell+1}$ are called the laps of $f$ and $\ell(f)$ is the number of laps of $f$. The relationship between the sequence of lap numbers $\ell(f), \ell(f^2), \ldots, \ell(f^n), \ldots$ and the topological entropy of $f$ is very well known [Me-St 93]: the topological entropy of $f$ is equal to the logarithm of the number

$$s(f) = \lim_{n\to\infty} \sqrt[n]{\ell(f^n)} = \inf_n \sqrt[n]{\ell(f^n)}.$$
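The growth number $s(f)$ can be made concrete on a simple example. The sketch below is a minimal illustration, assuming the full tent map on $[0, 1]$ (a hypothetical choice that is not taken from the text); the helper `lap_number` counts monotone pieces by exact sampling on a dyadic grid, which is reliable here only because every turning point of the iterates lies on that grid.

```python
from fractions import Fraction as Fr
import math

def tent(x):
    """Full tent map on [0, 1] (assumed example)."""
    return 2 * x if x <= Fr(1, 2) else 2 - 2 * x

def iterate(f, n, x):
    """Return f^n(x)."""
    for _ in range(n):
        x = f(x)
    return x

def lap_number(f, n, grid_bits):
    """Count monotone pieces of f^n from samples on the grid k / 2**grid_bits.

    Exact only when every turning point of f^n lies on that grid.
    """
    size = 2 ** grid_bits
    values = [iterate(f, n, Fr(k, size)) for k in range(size + 1)]
    laps, previous = 0, 0
    for a, b in zip(values, values[1:]):
        direction = (b > a) - (b < a)
        if direction != 0 and direction != previous:
            laps += 1
        if direction != 0:
            previous = direction
    return laps

# Turning points of tent^n are odd multiples of 2**-n, so n + 1 grid bits suffice.
laps = [lap_number(tent, n, n + 1) for n in range(1, 7)]
growth = [math.log(count) / n for n, count in enumerate(laps, start=1)]
print(laps)
```

For this map the lap numbers double at each iteration, so $\log s(f) = \log 2$.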
Milnor and Thurston [Mi-Th 88] introduced two basic invariants called the kneading matrix and the kneading determinant of f .
In that paper, they present an explicit method for computing the sequence of lap numbers in terms of the kneading matrix. As an example, if $f$ has only one turning point and $f(\partial I) \subseteq \partial I$, then we have

$$\sum_{n\ge1} \ell(f^n)\, t^{n-1} = (1-t)^{-1} + (1-t)^{-2} D_f^{-1}, \qquad (1.1)$$

where $D_f$ is the kneading determinant of $f$. The main theorem in that paper establishes an important relationship between the kneading determinant of $f$ and the periodic points of $f$. That result can be stated as follows:

$$D_f^{-1} = \exp \sum_{n\ge1} \frac{t^n}{n}\left(2N^-_{f^n} - 1\right), \qquad (1.2)$$

where $N^-_{f^n}$ is the number of fixed points of $f^n$ of negative type, i.e., fixed points of $f^n$ in the interior of some lap of $f^n$ where $f^n$ is strictly decreasing.

In this paper we introduce the correspondences $\theta_0$ and $\theta_1$. To each piecewise monotone map $f : I \to I$ we associate two linear transformations $\theta_{0,f}$ and $\theta_{1,f}$, both defined in the $\mathbb{Q}$-vector space $S_0(I, \mathbb{Q}) \overset{\text{def}}{=} S_0(I) \otimes \mathbb{Q}$, where $S_0(I)$ is the free abelian group with basis the elements of $I$, and $\mathbb{Q}$ is the field of rational numbers. These two correspondences present two fundamental properties, which will play an important role in what follows. The first one concerns iteration. Behind the construction of $\theta_0$ and $\theta_1$ we find two functors $\varepsilon.f_{\#0}$ and $\varepsilon.f_{\#1}$. As an immediate consequence of this, we will see that $\theta_0$ and $\theta_1$ have good behaviour in the presence of iteration. In fact we will show that

$$\theta_{0,f}^n = \theta_{0,f^n} \text{ and } \theta_{1,f}^n = \theta_{1,f^n}, \text{ for each } n \in \mathbb{Z}^+. \qquad (\text{P1})$$

The nature of the second property is exclusively algebraic. It is clear that $S_0(I, \mathbb{Q})$ is infinite-dimensional, and we will see that the subspaces $\mathrm{Im}\,\theta_{0,f}$ and $\mathrm{Im}\,\theta_{1,f}$ are both infinite-dimensional; however we will show that

$$\mathrm{Im}\left(\theta_{1,f} - \theta_{0,f}\right) \text{ is finite-dimensional.} \qquad (\text{P2})$$

This property allows us to define the determinant of the pair $(\theta_{0,f}, \theta_{1,f})$. This determinant is an element of $\mathbb{Q}[[t]]$, which we call $D_{(\theta_{0,f},\theta_{1,f})}$, and will play a role analogous to the kneading determinant of $f$. To compute $D_{(\theta_{0,f},\theta_{1,f})}$, we introduce the square matrix $M_{(\theta_{0,f},\theta_{1,f})}$ with coefficients in $\mathbb{Q}[[t]]$, which has many resemblances with the kneading matrix of $f$.

Let $f : I \to I$ be a piecewise monotone map, $K = [c, d] \subseteq I$ (with $d > c$) and $x_0 \in I$. As a first application we compute the number of solutions in $K$ of the equation $f^n(x) = x_0$, for each $n \in \mathbb{N}$. In fact there is a simple relationship between $\theta_{1,f}(d - c)$ and $\#\{x \in K : f(x) = x_0\}$, and applying P1 and P2 we can compute the sequence $\#\{x \in K : f^n(x) = x_0\}$.
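Formulas (1.1) and (1.2) can be cross-checked on an example with truncated power series. The sketch below is an illustration under assumptions that are not taken from the text: for the full tent map on $[0, 1]$ we use $\ell(f^n) = 2^n$ and $N^-_{f^n} = 2^{n-1}$, build $D_f^{-1}$ from (1.2), and compare both sides of (1.1) coefficient by coefficient. The helper names are hypothetical.

```python
from fractions import Fraction as Fr

ORDER = 10  # truncation order of all power series

def exp_series(a, order):
    """exp of a series a with a[0] == 0, using n*e_n = sum_k k*a_k*e_{n-k}."""
    e = [Fr(1)] + [Fr(0)] * order
    for n in range(1, order + 1):
        e[n] = sum(k * a[k] * e[n - k] for k in range(1, n + 1)) / n
    return e

def mul(a, b, order):
    """Product of two power series, truncated at the given order."""
    c = [Fr(0)] * (order + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= order:
                c[i + j] += ai * bj
    return c

# Assumed data for the full tent map: N^-_{f^n} = 2**(n-1), l(f^n) = 2**n.
log_series = [Fr(0)] + [Fr(2 * 2 ** (n - 1) - 1, n) for n in range(1, ORDER + 1)]
d_inv = exp_series(log_series, ORDER)              # right-hand side of (1.2)

geom = [Fr(1)] * (ORDER + 1)                       # (1 - t)^(-1)
geom_sq = [Fr(n + 1) for n in range(ORDER + 1)]    # (1 - t)^(-2)
rhs = [g + h for g, h in zip(geom, mul(geom_sq, d_inv, ORDER))]  # (1.1), right
lhs = [Fr(2 ** (m + 1)) for m in range(ORDER + 1)]               # l(f^(m+1))

print(rhs == lhs)
```

In closed form the series `d_inv` is $(1-t)/(1-2t)$, and both sides of (1.1) equal $2/(1-2t)$.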
Using this we present an explicit method for computing the sequence of lap numbers, which is a generalization of (1.1). The second application has to do with the number of solutions of the equation $f^n(x) = x$, for each $n \in \mathbb{Z}^+$. We shall see that there is an important relationship between the number of fixed points of $f$ and the trace of $\theta_{1,f} - \theta_{0,f}$. By P1 we easily obtain a similar version of (1.2): we will show that

$$D_{(\theta_{0,f},\theta_{1,f})}^{-1} = \exp \sum_{n\ge1} \frac{t^n}{n}\left(2N^-_{f^n} - 1 + N^0_{f^n}\right), \qquad (1.3)$$

where $N^0_{f^n}$ is the number of fixed points of critical type of $f^n$, i.e., fixed points of $f^n$ at which $f^n$ has a local extremum.

It is clear that (1.2) and (1.3) have only to do with the number of periodic points of $f$ in $I$. If we want to know something about the number of periodic points of $f$ in an arbitrary subinterval $K \subseteq I$, then we have to generalize the previous results. We can regard both equations

$$f^n(x) = x_0 \text{ and } f^n(x) = x, \text{ for each } n \in \mathbb{N},$$

as special cases of the equation

$$f^n(x) = h(x), \text{ for each } n \in \mathbb{N}, \qquad (1.4)$$

where $h : I \to I$ is a continuous map with constant sign, i.e., $h$ is strictly monotone in $I$, or $h$ is constant in $I$. Our main theorem concerns the number of solutions of Eq. (1.4), and using this result we obtain a generalization of (1.3) which is related to the number of periodic points of $f$ in arbitrary subintervals of $I$.

1.1. The determinant of a pair $(\varphi, \psi)$. Let $U$ be a vector space over $\mathbb{Q}$ and $\psi : U \to U$ a linear transformation such that $\mathrm{Im}(\psi)$ is finite-dimensional, and let $W$ be any finite-dimensional subspace of $U$ such that $\mathrm{Im}(\psi) \subseteq W$. Then we define the trace of $\psi$ by

$$\mathrm{Tr}(\psi) \overset{\text{def}}{=} \mathrm{Tr}\left(\psi|_W\right)$$

(it is easy to see that the definition does not depend on $W$). If $\psi$ has finite-dimensional image, then there are vectors $u_1, \ldots, u_k \in U$ and linear forms $\omega_1, \ldots, \omega_k \in U^*$ such that

$$\psi = \sum_{i=1}^{k} \omega_i \otimes u_i, \qquad (1.5)$$

and we can compute $\mathrm{Tr}(\psi)$ considering the matrix

$$M_\psi \overset{\text{def}}{=} \begin{pmatrix} \omega_1(u_1) & \cdots & \omega_1(u_k) \\ \vdots & & \vdots \\ \omega_k(u_1) & \cdots & \omega_k(u_k) \end{pmatrix}: \qquad (1.6)$$
we have

$$\mathrm{Tr}(\psi) = \mathrm{Tr}(M_\psi).$$

More generally, if $\psi$ has finite-dimensional image, then, for each $n \in \mathbb{Z}^+$, $\psi^n$ has finite-dimensional image, and $\mathrm{Tr}(\psi^n) = \mathrm{Tr}(M_\psi^n)$. As an immediate consequence of this, we have the following

Proposition 1.1. Let $\psi$ be a linear transformation on $U$ having finite-dimensional image, and consider $u_1, \ldots, u_k \in U$ and $\omega_1, \ldots, \omega_k \in U^*$ as in (1.5). Then we have

$$\exp \sum_{n\ge1} \frac{t^n}{n}\, \mathrm{Tr}(\psi^n) = \mathrm{Det}\left(\mathrm{id} - tM_\psi\right)^{-1},$$
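where $M_\psi$ is the matrix defined in (1.6).

Now we generalize this result. Let $(\varphi, \psi)$ be a pair of linear transformations on $U$ such that $\mathrm{Im}(\psi - \varphi)$ is finite-dimensional. We shall say that the pair $(\varphi, \psi)$ has finite-dimensional image. If the pair $(\varphi, \psi)$ has finite-dimensional image, then we can consider vectors $u_1, \ldots, u_k \in U$ and linear forms $\omega_1, \ldots, \omega_k \in U^*$ such that

$$\psi - \varphi = \sum_{i=1}^{k} \omega_i \otimes u_i. \qquad (1.7)$$

More generally

$$\psi^n - \varphi^n = \sum_{i=1}^{k} \sum_{j=1}^{n} \left(\omega_i \circ \psi^{n-j}\right) \otimes \varphi^{j-1}(u_i), \text{ for each } n > 0, \qquad (1.8)$$

and we see that $(\varphi^n, \psi^n)$ has finite-dimensional image, for each $n > 0$. Thus we define the determinant of $(\varphi, \psi)$ to be the following element of $\mathbb{Q}[[t]]$:

$$D_{(\varphi,\psi)}^{-1} \overset{\text{def}}{=} \exp \sum_{n\ge1} \frac{t^n}{n}\, \mathrm{Tr}\left(\psi^n - \varphi^n\right).$$

If $\varphi$ and $\psi$ have finite-dimensional images, then by Proposition 1.1,

$$D_{(\varphi,\psi)} = \frac{\mathrm{Det}\left(\mathrm{id} - tM_\psi\right)}{\mathrm{Det}\left(\mathrm{id} - tM_\varphi\right)}.$$

In order to compute $D_{(\varphi,\psi)}$ in the general case, we consider $u_1, \ldots, u_k \in U$ and $\omega_1, \ldots, \omega_k \in U^*$ as in (1.7), and we introduce the matrix

$$M_{(\varphi,\psi)} \overset{\text{def}}{=} \begin{pmatrix} \sum_{n\ge0} \omega_1(\varphi^n(u_1))\, t^n & \cdots & \sum_{n\ge0} \omega_1(\varphi^n(u_k))\, t^n \\ \vdots & & \vdots \\ \sum_{n\ge0} \omega_k(\varphi^n(u_1))\, t^n & \cdots & \sum_{n\ge0} \omega_k(\varphi^n(u_k))\, t^n \end{pmatrix}, \qquad (1.9)$$

with coefficients in $\mathbb{Q}[[t]]$.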
Proposition 1.2. Let $(\varphi, \psi)$ be a pair of linear transformations on $U$ having finite-dimensional image and consider $u_1, \ldots, u_k \in U$ and $\omega_1, \ldots, \omega_k \in U^*$, as in (1.7). Then we have

$$D_{(\varphi,\psi)} = \mathrm{Det}\left(\mathrm{id} - tM_{(\varphi,\psi)}\right),$$

where $M_{(\varphi,\psi)}$ is the matrix defined in (1.9).

Observe that this proposition is a generalization of Proposition 1.1, when we identify a linear transformation $\psi$ with the pair $(0, \psi)$. In fact, if $\varphi = 0$, then the matrices defined in (1.6) and (1.9) coincide. Generally, the computation of $M_{(\varphi,\psi)}$ is very difficult. The next proposition, which is an immediate consequence of the definition, may be useful in computing $D_{(\varphi,\psi)}$.

Proposition 1.3. Let $\phi$, $\varphi$ and $\psi$ be linear transformations on $U$ such that $(\phi, \varphi)$ and $(\varphi, \psi)$ have finite-dimensional images. Then we have:

1. $D_{(\varphi,\psi)} = D_{(\psi,\varphi)}^{-1}$,
2. $D_{(\phi,\psi)} = D_{(\phi,\varphi)}\, D_{(\varphi,\psi)}$.
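Proposition 1.2 can be checked on a small finite-dimensional example, where every pair has finite-dimensional image. The sketch below is an illustration only: the $2\times2$ matrix `phi`, the vector `u` and the linear form `omega` are hypothetical data, with $\psi = \varphi + \omega \otimes u$, so that $k = 1$ in (1.7) and the matrix (1.9) is a single power series. It compares the determinant computed from the definition with $\mathrm{Det}(\mathrm{id} - tM_{(\varphi,\psi)})$, both truncated at `ORDER`.

```python
from fractions import Fraction as Fr

ORDER = 8  # truncation order of all power series

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def exp_from_traces(tr, order):
    """Coefficients of exp(sum_n tr[n] t^n / n), via n*e_n = sum_k tr[k]*e_{n-k}."""
    e = [Fr(1)] + [Fr(0)] * order
    for n in range(1, order + 1):
        e[n] = sum(tr[k] * e[n - k] for k in range(1, n + 1)) / n
    return e

def inverse_series(e, order):
    """Multiplicative inverse of a power series with e[0] == 1."""
    d = [Fr(1)] + [Fr(0)] * order
    for n in range(1, order + 1):
        d[n] = -sum(e[k] * d[n - k] for k in range(1, n + 1))
    return d

# Hypothetical data: psi = phi + omega (x) u, i.e. psi(x) = phi(x) + omega(x) u.
phi = [[Fr(1), Fr(2)], [Fr(0), Fr(3)]]
u = [Fr(1), Fr(1)]
omega = [Fr(2), Fr(-1)]
psi = [[phi[r][c] + u[r] * omega[c] for c in range(2)] for r in range(2)]

# Determinant from the definition: D^{-1} = exp sum_n (t^n / n) Tr(psi^n - phi^n).
traces, p_phi, p_psi = [Fr(0)], phi, psi
for n in range(1, ORDER + 1):
    traces.append(trace(p_psi) - trace(p_phi))
    p_phi, p_psi = matmul(p_phi, phi), matmul(p_psi, psi)
d_definition = inverse_series(exp_from_traces(traces, ORDER), ORDER)

# Determinant from Proposition 1.2: 1 - t * sum_n omega(phi^n(u)) t^n.
d_matrix, vec = [Fr(1)], u
for n in range(ORDER):
    d_matrix.append(-sum(w * x for w, x in zip(omega, vec)))
    vec = matvec(phi, vec)

print(d_definition == d_matrix)
```

For these data both computations give the series of $(1 - 4t)/(1 - 3t)$, which is also the quotient $\mathrm{Det}(\mathrm{id} - t\psi)/\mathrm{Det}(\mathrm{id} - t\varphi)$.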
Later we shall need to solve two problems. The first one can be stated as follows: let $\psi$ be a linear transformation on $U$, $u \in U$ and $\omega \in U^*$. We want to compute the numbers

$$\omega(u),\ \omega(\psi(u)),\ \ldots,\ \omega(\psi^n(u)),\ \ldots. \qquad (1.10)$$

Considering a pair $(\varphi, \psi)$ with finite-dimensional image, the problem becomes easier if $\varphi$ has an easier iteration. In fact, if the pair $(\varphi, \psi)$ has finite-dimensional image, then it is possible to express these numbers in terms of $\varphi$. We begin by observing that the numbers in (1.10) appear naturally when we consider the determinant of the pair $(\psi, \psi + \omega \otimes u)$; we have $(\psi + \omega \otimes u) - \psi = \omega \otimes u$, and by Proposition 1.2,

$$D_{(\psi,\psi+\omega\otimes u)} = 1 - \sum_{n\ge0} \omega(\psi^n(u))\, t^{n+1}. \qquad (1.11)$$

Considering the pairs $(\psi, \varphi)$ and $(\varphi, \psi + \omega \otimes u)$, we apply Proposition 1.3 to decompose the first member of (1.11), and obtain the following

Corollary 1.4. Let $(\varphi, \psi)$ be a pair of linear transformations on $U$ having finite-dimensional image, $u \in U$ and $\omega \in U^*$. Then we have

$$1 - \sum_{n\ge0} \omega(\psi^n(u))\, t^{n+1} = D_{(\varphi,\psi)}^{-1}\, D_{(\varphi,\psi+\omega\otimes u)}.$$
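The second problem can be stated as follows: Let $\phi$, $\varphi$ and $\psi$ be linear transformations on $U$, such that the pair $(\varphi, \psi)$ has finite-dimensional image. We want to compute the numbers

$$\mathrm{Tr}\left(\phi \circ (\psi^n - \varphi^n)\right), \text{ for each } n > 0.$$

In other words, we want to compute the formal power series

$$\Upsilon_{(\phi,\varphi,\psi)} \overset{\text{def}}{=} \sum_{n\ge1} \mathrm{Tr}\left(\phi \circ (\psi^n - \varphi^n)\right) t^{n-1}.$$

It is clear that if $\phi = \mathrm{id}$, then by definition of $D_{(\varphi,\psi)}$,

$$\Upsilon_{(\mathrm{id},\varphi,\psi)} = -D'_{(\varphi,\psi)}\, D_{(\varphi,\psi)}^{-1},$$

and we can use Proposition 1.2 to compute $\Upsilon_{(\mathrm{id},\varphi,\psi)}$. In order to compute $\Upsilon_{(\phi,\varphi,\psi)}$, we generalize Proposition 1.2. Let $(\varphi, \psi)$ be a pair with finite-dimensional image, and consider $u_1, \ldots, u_k \in U$ and $\omega_1, \ldots, \omega_k \in U^*$ as in (1.7). By (1.8),

$$\mathrm{Tr}\left(\phi \circ (\psi^n - \varphi^n)\right) = \sum_{i=1}^{k} \sum_{j=1}^{n} \omega_i \circ \psi^{n-j}\left[\phi\left(\varphi^{j-1}(u_i)\right)\right],$$

for each $n > 0$. Thus

$$\Upsilon_{(\phi,\varphi,\psi)} = \sum_{i=1}^{k} \sum_{j\ge0} t^{j-1} \sum_{n\ge0} \omega_i\left[\psi^n\left(\phi \circ \varphi^j(u_i)\right)\right] t^{n+1} = \sum_{(i,j)\in\Lambda} t^{j-1} \sum_{n\ge0} \omega_i\left[\psi^n\left(\phi \circ \varphi^j(u_i)\right)\right] t^{n+1},$$

where

$$\Lambda = \left\{(i, j) \in \{1, 2, \ldots, k\} \times \mathbb{N} : \phi \circ \varphi^j(u_i) \neq 0\right\},$$

and from Corollary 1.4 we obtain the following

Corollary 1.5. Let $\phi$, $\varphi$, $\psi$ be linear transformations on $U$, such that the pair $(\varphi, \psi)$ has finite-dimensional image, and consider $u_1, \ldots, u_k \in U$ and $\omega_1, \ldots, \omega_k \in U^*$ as in (1.7). Then we have

$$\Upsilon_{(\phi,\varphi,\psi)} = \sum_{(i,j)\in\Lambda} t^{j-1}\left[1 - D_{(\varphi,\psi)}^{-1}\, D_{(\varphi,\psi+\omega_i\otimes\phi\circ\varphi^j(u_i))}\right].$$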
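2. The Correspondences $\theta_0$ and $\theta_1$

2.1. The functors $\varepsilon.{}_{\#0}$ and $\varepsilon.{}_{\#1}$. Let $I$ and $J$ be closed and bounded intervals of $\mathbb{R}$ and consider a continuous map $f : I \to J$. This map induces another map

$$\varepsilon_f : \mathrm{int}(I) \to \{-1, 0, 1\},$$

defined as follows: for each $x \in \mathrm{int}(I)$ such that $f$ is injective on some neighbourhood of $x$, we define the sign $\varepsilon_f(x)$ to be $+1$ or $-1$ according to whether $f$ is strictly increasing or strictly decreasing on some neighbourhood of $x$. For the remaining points of $\mathrm{int}(I)$ we define $\varepsilon_f(x) = 0$. We say that $f$ is a piecewise monotone map if the set

$$C_f = \{x \in \mathrm{int}(I) : \varepsilon_f(x) = 0\}$$

is finite. In this case the elements of $C_f$ are called turning points of $f$. If $f$ has exactly $\ell$ turning points, then we say that $f$ is an $\ell$-modal map.

Let $S_*(I, \mathbb{Q})$ and $S_*(J, \mathbb{Q})$ be the singular chain complexes of $I$ and $J$ with coefficients in $\mathbb{Q}$. A continuous map $f : I \to J$ induces a chain map $f_\#$ from $S_*(I, \mathbb{Q})$ to $S_*(J, \mathbb{Q})$:

$$\begin{array}{ccccc}
\cdots & S_1(I, \mathbb{Q}) & \xrightarrow{\ \partial_1\ } & S_0(I, \mathbb{Q}) & \longrightarrow 0 \\
 & \downarrow {\scriptstyle f_{\#1}} & & \downarrow {\scriptstyle f_{\#0}} & \\
\cdots & S_1(J, \mathbb{Q}) & \xrightarrow{\ \partial_1\ } & S_0(J, \mathbb{Q}) & \longrightarrow 0
\end{array}$$

This chain map gives us the following commutative diagram:

$$\begin{array}{ccccccc}
0 \longrightarrow & \tilde S_1(I, \mathbb{Q}) = \dfrac{S_1(I, \mathbb{Q})}{\ker(\partial_1)} & \xrightarrow{\ \partial_1\ } & S_0(I, \mathbb{Q}) & \xrightarrow{\ p\ } & H_0(I, \mathbb{Q}) & \longrightarrow 0 \\
 & \downarrow {\scriptstyle f_{\#1}} & & \downarrow {\scriptstyle f_{\#0}} & & \downarrow {\scriptstyle \pi} & \\
0 \longrightarrow & \tilde S_1(J, \mathbb{Q}) = \dfrac{S_1(J, \mathbb{Q})}{\ker(\partial_1)} & \xrightarrow{\ \partial_1\ } & S_0(J, \mathbb{Q}) & \xrightarrow{\ p\ } & H_0(J, \mathbb{Q}) & \longrightarrow 0
\end{array}$$

in which the two rows are exact sequences and $\pi$ is a natural isomorphism. Recall that $S_0(I, \mathbb{Q})$ is the $\mathbb{Q}$-vector space defined by

$$S_0(I, \mathbb{Q}) \overset{\text{def}}{=} S_0(I) \otimes \mathbb{Q},$$

where $S_0(I)$ is the free abelian group with basis $I$. Thus $I$ is a basis of $S_0(I, \mathbb{Q})$, and $f_{\#0} : S_0(I, \mathbb{Q}) \to S_0(J, \mathbb{Q})$ is the unique linear transformation that verifies $f_{\#0}(x) = f(x)$, for all $x \in I$. Using $f_{\#0}$ we define $\varepsilon.f_{\#0}$.

Definition 2.1. Let $f : I \to J$ be a piecewise monotone map. We define

$$\varepsilon.f_{\#0} : S_0(\mathrm{int}(I), \mathbb{Q}) \to S_0(\mathrm{int}(J), \mathbb{Q})$$

to be the unique linear transformation such that

$$\varepsilon.f_{\#0}(x) = \varepsilon_f(x)\, f_{\#0}(x) \text{ for all } x \in \mathrm{int}(I).$$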
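To define the morphism $\varepsilon.f_{\#1}$, recall that the subspace $\mathrm{Im}(\partial_1) \subset S_0(I, \mathbb{Q})$ is generated by

$$\{u \in S_0(I, \mathbb{Q}) : u = d - c, \text{ with } c, d \in I\}.$$

Since $\tilde S_1(I, \mathbb{Q}) \simeq \mathrm{Im}(\partial_1)$, we see that each interval $K = [c, d] \subseteq I$, with $d > c$, can be identified with the unique vector $k \in \tilde S_1(I, \mathbb{Q})$ that verifies $\partial_1(k) = d - c$. From now on we use the same symbol to denote the interval $K$ and the corresponding vector of $\tilde S_1(I, \mathbb{Q})$. With this identification we can say that the set

$$\mathcal{I} \overset{\text{def}}{=} \{K = [c, d] \subseteq I : d > c\}$$

generates $\tilde S_1(I, \mathbb{Q})$, and

$$f_{\#1}([c, d]) = \begin{cases} [f(c), f(d)] & \text{if } f(c) < f(d) \\ 0 & \text{if } f(c) = f(d) \\ -[f(d), f(c)] & \text{if } f(c) > f(d) \end{cases}, \text{ for all } [c, d] \in \mathcal{I}. \qquad (2.1)$$

Consider now the subset of $\mathcal{I}$,

$$\mathcal{I}_f \overset{\text{def}}{=} \{K \in \mathcal{I} : f \text{ is injective on } K\}.$$

Observe that, since $f$ is piecewise monotone, $\mathcal{I}_f$ generates $\tilde S_1(I, \mathbb{Q})$. For each $K \in \mathcal{I}_f$, we define the sign $\varepsilon_f(K)$ to be $1$ ($-1$) if $f$ is strictly increasing (decreasing) on $K$, and by (2.1) we see that $f_{\#1} : \tilde S_1(I, \mathbb{Q}) \to \tilde S_1(J, \mathbb{Q})$ is the unique linear transformation that verifies

$$f_{\#1}(K) = \begin{cases} f(K) & \text{if } \varepsilon_f(K) = 1 \\ -f(K) & \text{if } \varepsilon_f(K) = -1 \end{cases}, \text{ for all } K \in \mathcal{I}_f. \qquad (2.2)$$

Definition 2.2. Let $f : I \to J$ be a piecewise monotone map. We define

$$\varepsilon.f_{\#1} : \tilde S_1(I, \mathbb{Q}) \to \tilde S_1(J, \mathbb{Q})$$

to be the unique linear transformation such that

$$\varepsilon.f_{\#1}(K) = \varepsilon_f(K)\, f_{\#1}(K), \text{ for all } K \in \mathcal{I}_f.$$

Observe that, by (2.2), $\varepsilon.f_{\#1} : \tilde S_1(I, \mathbb{Q}) \to \tilde S_1(J, \mathbb{Q})$ is the unique linear transformation that verifies

$$\varepsilon.f_{\#1}(K) = f(K), \text{ for all } K \in \mathcal{I}_f. \qquad (2.3)$$

The next proposition is an immediate consequence of the definitions of $\varepsilon.f_{\#0}$ and $\varepsilon.f_{\#1}$, and will play an important role in what follows.

Proposition 2.3. Let $\mathcal{L}$ be the category whose objects are the closed and bounded subintervals of $\mathbb{R}$, and whose morphisms are piecewise monotone maps between intervals. Then the associations

$$I \mapsto S_0(\mathrm{int}(I), \mathbb{Q}),\ f \mapsto \varepsilon.f_{\#0} \quad\text{and}\quad I \mapsto \tilde S_1(I, \mathbb{Q}),\ f \mapsto \varepsilon.f_{\#1}$$

are covariant functors from $\mathcal{L}$ to the category of vector spaces over $\mathbb{Q}$.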
2.2. The equation $f^n(x) = x_0$. We are in position to approach the following problem: let $f : I \to I$ be a piecewise monotone map, $x_0 \in I$ and $K \in \mathcal{I}$. We want to compute the numbers

$$\#\{x \in K : f^n(x) = x_0\}, \text{ where } f^n \text{ is the } n\text{th iterate of } f.$$

Remark 2.4. The reason why we consider this problem will be clear later, in the computation of the numbers $\ell(f^n)$.

We begin by introducing the following notation.

Definition 2.5. Let $g : I \to I$ be a piecewise monotone map and $x_0 \in I$.

1. We define $\nu_{(g,x_0)}(K) = \#\{x \in K : g(x) = x_0\}$, for all subsets $K \subseteq I$.
2. We define $\tilde\nu_{(g,x_0)}$ to be the unique linear form on $\tilde S_1(I, \mathbb{Q})$ that verifies

$$\tilde\nu_{(g,x_0)}(K) = \nu_{(g,x_0)}(\mathrm{int}(K)) + \frac{1}{2}\,\nu_{(g,x_0)}(\partial K), \text{ for all } K \in \mathcal{I}.$$

Let $f : I \to I$ be a piecewise monotone map and $x_0 \in I$. To compute the numbers

$$\tilde\nu_{(f^0,x_0)}(K),\ \tilde\nu_{(f^1,x_0)}(K),\ \ldots,\ \tilde\nu_{(f^n,x_0)}(K),\ \ldots, \text{ with } K \in \mathcal{I},$$

we introduce $\omega_{x_0}$ to be the linear form on $\tilde S_1(I, \mathbb{Q})$ defined by:

$$\omega_{x_0}(K) = \begin{cases} 1 & \text{if } x_0 \in \mathrm{int}(K) \\ \frac{1}{2} & \text{if } x_0 \in \partial K \\ 0 & \text{if } x_0 \notin K \end{cases}, \text{ for all } K \in \mathcal{I},$$

and we have

Proposition 2.6. Let $f : I \to I$ be a piecewise monotone map, $x_0 \in I$ and $n \in \mathbb{N}$. Then we have

$$\omega_{x_0} \circ \left(\varepsilon.f_{\#1}\right)^n = \tilde\nu_{(f^n,x_0)}.$$

Let $n \in \mathbb{N}$. By Proposition 2.3, we have

$$\omega_{x_0} \circ \left(\varepsilon.f_{\#1}\right)^n = \omega_{x_0} \circ \varepsilon.f^n_{\#1},$$

therefore, Proposition 2.6 is an immediate consequence of the following

Lemma 2.7. Let $g : I \to I$ be a piecewise monotone map and $x_0 \in I$. Then we have

$$\omega_{x_0} \circ \varepsilon.g_{\#1} = \tilde\nu_{(g,x_0)}.$$

Proof. Let $K \in \mathcal{I}_g$. Since $\mathcal{I}_g$ generates $\tilde S_1(I, \mathbb{Q})$, we only have to show that

$$\omega_{x_0} \circ \varepsilon.g_{\#1}(K) = \tilde\nu_{(g,x_0)}(K),$$

and this follows from (2.3) because

$$\omega_{x_0} \circ \varepsilon.g_{\#1}(K) = \omega_{x_0}(g(K)) = \tilde\nu_{(g,x_0)}(K). \qquad \square$$
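Let $K \in \mathcal{I}$. From Proposition 2.6, we have

$$\omega_{x_0}\left(\left(\varepsilon.f_{\#1}\right)^n(K)\right) = \tilde\nu_{(f^n,x_0)}(K).$$

Observe that, with this result alone, we cannot compute the numbers $\tilde\nu_{(f^n,x_0)}(K)$, because the computation of $\left(\varepsilon.f_{\#1}\right)^n(K)$ is very complicated. To circumvent this difficulty we must work simultaneously with $\varepsilon.f_{\#0}$ and $\varepsilon.f_{\#1}$. Since these morphisms are defined in different spaces, we introduce the morphisms $\theta_{0,f}$ and $\theta_{1,f}$.

Definition 2.8. Let $f : I = [a, b] \to J$ be a piecewise monotone map, and for each interval $I$, denote by $i$ the inclusion $\mathrm{int}(I) \hookrightarrow I$. We define $\theta_{0,f}$ to be the unique linear transformation from $S_0(I, \mathbb{Q})$ into $S_0(J, \mathbb{Q})$ that verifies:

1. The diagram

$$\begin{array}{ccc}
S_0(\mathrm{int}(I), \mathbb{Q}) & \xrightarrow{\ i_{\#0}\ } & S_0(I, \mathbb{Q}) \\
\downarrow {\scriptstyle \varepsilon.f_{\#0}} & & \downarrow {\scriptstyle \theta_{0,f}} \\
S_0(\mathrm{int}(J), \mathbb{Q}) & \xrightarrow{\ i_{\#0}\ } & S_0(J, \mathbb{Q})
\end{array}$$

is commutative.
2. $\theta_{0,f}(a) = \theta_{0,f}(b) = 0$.

Definition 2.9. Let $f : I = [a, b] \to J$ be a piecewise monotone map, and $\partial_1$ the boundary operator. We define $\theta_{1,f}$ to be the unique linear transformation from $S_0(I, \mathbb{Q})$ into $S_0(J, \mathbb{Q})$ that verifies:

1. The diagram

$$\begin{array}{ccc}
\tilde S_1(I, \mathbb{Q}) & \xrightarrow{\ \partial_1\ } & S_0(I, \mathbb{Q}) \\
\downarrow {\scriptstyle \varepsilon.f_{\#1}} & & \downarrow {\scriptstyle \theta_{1,f}} \\
\tilde S_1(J, \mathbb{Q}) & \xrightarrow{\ \partial_1\ } & S_0(J, \mathbb{Q})
\end{array}$$

is commutative.
2. $\theta_{1,f}(a + b) = 0$.

From the definitions of $\theta_{0,f}$ and $\theta_{1,f}$, we see at once that $\theta_{0,\mathrm{id}} \neq \mathrm{id}$ and $\theta_{1,\mathrm{id}} \neq \mathrm{id}$, therefore the associations

$$I \mapsto S_0(I),\ f \mapsto \theta_{0,f} \quad\text{and}\quad I \mapsto S_0(I),\ f \mapsto \theta_{1,f}$$

are not functorial. However, if $f : I \to J$ and $g : J \to K$ are piecewise monotone maps, then we have from Proposition 2.3

$$\theta_{0,g\circ f} = \theta_{0,g} \circ \theta_{0,f} \text{ and } \theta_{1,g\circ f} = \theta_{1,g} \circ \theta_{1,f},$$

and, as an immediate consequence of this, we obtain the following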
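Corollary 2.10. Let $f : I \to I$ be a piecewise monotone map. Then we have

$$\theta_{0,f}^n = \theta_{0,f^n} \text{ and } \theta_{1,f}^n = \theta_{1,f^n}, \text{ for each } n \in \mathbb{Z}^+.$$

In order to write down $\theta_{0,f}$ and $\theta_{1,f}$ explicitly, we have to introduce some notation.

Definition 2.11. Let $f : I \to J$ be a piecewise monotone map. We define the function $\bar\varepsilon_f : I \to \{-1, 0, 1\}$ by

$$\bar\varepsilon_f(x) = \begin{cases} \varepsilon_f(x) & \text{if } x \in \mathrm{int}(I) \\ 0 & \text{if } x \in \partial I \end{cases}.$$

Proposition 2.12. Let $f : I \to J$ be a piecewise monotone map. Then we have

$$\theta_{0,f}(x) = \bar\varepsilon_f(x)\, f_{\#0}(x) \text{ for all } x \in I.$$

Definition 2.13. Let $f : I = [c_0, c_{\ell+1}] \to J$ be a piecewise monotone map with turning points $c_1 < \cdots < c_\ell$. For each $0 \le i \le \ell + 1$, we define $\omega_i$ to be the unique linear form on $S_0(I, \mathbb{Q})$ such that

$$\omega_i(x) = \begin{cases} \dfrac{\varepsilon_f(c_i-) - \varepsilon_f(c_i+)}{2} & \text{if } x > c_i \\[2mm] \dfrac{\varepsilon_f(c_i-) + \varepsilon_f(c_i+)}{2} & \text{if } x = c_i \\[2mm] \dfrac{\varepsilon_f(c_i+) - \varepsilon_f(c_i-)}{2} & \text{if } x < c_i \end{cases} \text{ for all } x \in I,$$

defining $\varepsilon_f(c_0-) = \varepsilon_f(c_{\ell+1}+) = 0$.

Now we are in position to write down the morphism $\theta_{1,f}$. The proof of the next proposition is a simple verification.

Proposition 2.14. Let $f : I = [c_0, c_{\ell+1}] \to J$ be a piecewise monotone map, with turning points $c_1 < \cdots < c_\ell$. For each $0 \le i \le \ell + 1$, let $\omega_i$ be the linear form defined above. Then we have

$$\theta_{1,f} - \theta_{0,f} = \sum_{i=0}^{\ell+1} \omega_i \otimes f_{\#0}(c_i).$$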
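The explicit formulas of Propositions 2.12 and 2.14 are easy to experiment with. The sketch below is a minimal illustration under stated assumptions: chains in $S_0(I, \mathbb{Q})$ are stored as dictionaries from points to rational coefficients, the map is the full tent map on $[0, 1]$, and the helper names (`make_thetas`, `apply`) are hypothetical. It checks condition 2 of Definition 2.9 and the iteration property of Corollary 2.10 for $n = 2$ on a few sample points.

```python
from fractions import Fraction as Fr

def tent(x):
    """Full tent map on [0, 1] (assumed example)."""
    return 2 * x if x <= Fr(1, 2) else 2 - 2 * x

def make_thetas(f, pts, signs):
    """Build theta_0 and theta_1 on basis points.

    pts = [c_0, ..., c_{l+1}] (end points and turning points),
    signs[i] = sign of f on the open lap ]c_i, c_{i+1}[.
    """
    last = len(pts) - 1

    def eps_bar(x):
        if x in pts:
            return 0
        for i in range(last):
            if pts[i] < x < pts[i + 1]:
                return signs[i]
        raise ValueError("point outside the interval")

    def omega(i, x):
        left = signs[i - 1] if i > 0 else 0    # eps_f(c_i -)
        right = signs[i] if i < last else 0    # eps_f(c_i +)
        if x > pts[i]:
            return Fr(left - right, 2)
        if x < pts[i]:
            return Fr(right - left, 2)
        return Fr(left + right, 2)

    def theta0_pt(x):                          # Proposition 2.12
        return {f(x): Fr(eps_bar(x))}

    def theta1_pt(x):                          # Proposition 2.14
        out = theta0_pt(x)
        for i, c in enumerate(pts):
            y = f(c)
            out[y] = out.get(y, Fr(0)) + omega(i, x)
        return out

    return theta0_pt, theta1_pt

def apply(theta_pt, chain):
    """Extend a map given on basis points linearly to a chain {point: coeff}."""
    out = {}
    for x, a in chain.items():
        for y, b in theta_pt(x).items():
            out[y] = out.get(y, Fr(0)) + a * b
    return {y: v for y, v in out.items() if v != 0}

t0, t1 = make_thetas(tent, [Fr(0), Fr(1, 2), Fr(1)], [1, -1])
s0, s1 = make_thetas(lambda x: tent(tent(x)),
                     [Fr(0), Fr(1, 4), Fr(1, 2), Fr(3, 4), Fr(1)], [1, -1, 1, -1])

samples = [Fr(0), Fr(1, 4), Fr(1, 3), Fr(1, 2), Fr(3, 5), Fr(2, 3), Fr(1)]
iteration_ok = all(
    apply(t0, apply(t0, {x: Fr(1)})) == apply(s0, {x: Fr(1)})
    and apply(t1, apply(t1, {x: Fr(1)})) == apply(s1, {x: Fr(1)})
    for x in samples)
print(iteration_ok)
```

The turning points and lap signs are passed in by hand, so the sketch does not detect them automatically; for another map they would have to be supplied accordingly.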
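Observe that, if $I = J$, then $\theta_{0,f}$ and $\theta_{1,f}$ are endomorphisms of $S_0(I, \mathbb{Q})$ and this proposition shows that the pair $(\theta_{0,f}, \theta_{1,f})$ has finite-dimensional image. Thus we may consider the determinant of this pair

$$D_{(\theta_{0,f},\theta_{1,f})}^{-1} \overset{\text{def}}{=} \exp \sum_{n\ge1} \frac{t^n}{n}\, \mathrm{Tr}\left(\theta_{1,f}^n - \theta_{0,f}^n\right),$$

and, applying Proposition 1.2 and Proposition 2.14, we obtain

$$D_{(\theta_{0,f},\theta_{1,f})} = \mathrm{Det}\left(\mathrm{id} - tM_{(\theta_{0,f},\theta_{1,f})}\right),$$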
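with

$$M_{(\theta_{0,f},\theta_{1,f})} = \begin{pmatrix} \sum_{n\ge0} \omega_0\left(\theta_{0,f}^n(f_{\#0}(c_0))\right) t^n & \cdots & \sum_{n\ge0} \omega_0\left(\theta_{0,f}^n(f_{\#0}(c_{\ell+1}))\right) t^n \\ \vdots & & \vdots \\ \sum_{n\ge0} \omega_{\ell+1}\left(\theta_{0,f}^n(f_{\#0}(c_0))\right) t^n & \cdots & \sum_{n\ge0} \omega_{\ell+1}\left(\theta_{0,f}^n(f_{\#0}(c_{\ell+1}))\right) t^n \end{pmatrix}.$$

From Proposition 2.12 we have

$$\theta_{0,f}^n(x) = \bar\varepsilon_f(x)\, \bar\varepsilon_f(f(x)) \cdots \bar\varepsilon_f\left(f^{n-1}(x)\right) f^n_{\#0}(x),$$

for each $x \in I$ and $n \in \mathbb{Z}^+$. With this, we see that $M_{(\theta_{0,f},\theta_{1,f})}$ has many resemblances with the kneading matrix of $f$ (see [Mi-Th 88]).

Example 2.15. Let $f : [0, 1] \to [0, 1]$ be the unimodal map defined by $f(x) = 4bx(1 - x)$ with $b \approx 0.9900675$. We have $c_0 = 0$, $c_1 = \frac{1}{2}$, $c_2 = 1$ and

$$c_0 = f(c_0) = f(c_2) < f^2(c_1) < f^3(c_1) < f^4(c_1) = c_1 < f(c_1) < c_2.$$

Since

$$\bar\varepsilon_f(x) = \begin{cases} 1 & \text{if } x \in\ ]c_0, c_1[ \\ 0 & \text{if } x \in \{c_0, c_1, c_2\} \\ -1 & \text{if } x \in\ ]c_1, c_2[ \end{cases},$$

we have

$$M_{(\theta_{0,f},\theta_{1,f})} = \begin{pmatrix} \frac{1}{2} & \frac{1}{2}\left(-1 + t + t^2 + t^3\right) & \frac{1}{2} \\ -1 & 1 + t + t^2 & -1 \\ \frac{1}{2} & \frac{1}{2}\left(1 - t - t^2 - t^3\right) & \frac{1}{2} \end{pmatrix}$$

and $D_{(\theta_{0,f},\theta_{1,f})} = 1 - 2t + t^4$.

Now we return to Proposition 2.6. We can rewrite this result in terms of $\theta_{1,f}$.

Definition 2.16. Let $I \subseteq \mathbb{R}$ be a compact interval and $x_0 \in I$. We define $\delta_{x_0}$ to be the unique linear form on $S_0(I, \mathbb{Q})$ that verifies

$$\delta_{x_0}(x) = \begin{cases} \frac{1}{2} & \text{if } x > x_0 \\ 0 & \text{if } x = x_0 \\ -\frac{1}{2} & \text{if } x < x_0 \end{cases}, \text{ for all } x \in I.$$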
Kneading Theory: A Functorial Approach
101
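We have $\omega_{x_0} = \delta_{x_0} \circ \partial_1$, and from the definition of $\theta_{1,f}$, we have
\[
\theta_{1,f}^n \circ \partial_1 = \partial_1 \circ (.f_{\#1})^n, \quad \text{for all } n \in \mathbb{N}.
\]
Thus, as an immediate consequence of Proposition 2.6, we have

Theorem 2.17. Let $f : I \to I$ be a piecewise monotone map, $x_0 \in I$, $n \in \mathbb{N}$, and $\partial_1 : \tilde S_1(I,\mathbb{Q}) \to S_0(I,\mathbb{Q})$ the boundary operator. Then we have
\[
\delta_{x_0} \circ \theta_{1,f}^n \circ \partial_1 = \tilde\nu_{(f^n,x_0)}.
\]

Let $K = [c,d] \in \mathcal{I}$. We have $\partial_1(K) = d - c$, and from the previous theorem, we have
\[
1 - \sum_{n\ge0} \delta_{x_0}\,\theta_{1,f}^n(d - c)\,t^{n+1} = 1 - \sum_{n\ge0} \tilde\nu_{(f^n,x_0)}(K)\,t^{n+1}. \tag{2.4}
\]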
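Since the pair $(\theta_{0,f}, \theta_{1,f})$ has finite-dimensional image, we apply Corollary 1.4 to express the first member of (2.4) in terms of $\theta_{0,f}$, and we obtain
\[
1 - \sum_{n\ge0} \tilde\nu_{(f^n,x_0)}(K)\,t^{n+1}
= D_{(\theta_{1,f},\,\theta_{1,f}+\delta_{x_0}\otimes(d-c))}
= D_{(\theta_{0,f},\theta_{1,f})}^{-1}\, D_{(\theta_{0,f},\,\theta_{1,f}+\delta_{x_0}\otimes(d-c))}. \tag{2.5}
\]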
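Example 2.18. Let $f$ be the map of Example 2.15. Set $x_0 = c_1$ and $K = [c_0, c_2]$. By Propositions 1.2 and 2.14, we have
\[
D_{(\theta_{0,f},\,\theta_{1,f}+\delta_{c_1}\otimes(c_2-c_0))} = \operatorname{Det}\left(\mathrm{id} - t M_{(\theta_{0,f},\,\theta_{1,f}+\delta_{c_1}\otimes(c_2-c_0))}\right),
\]
with
\[
M_{(\theta_{0,f},\,\theta_{1,f}+\delta_{c_1}\otimes(c_2-c_0))} =
\begin{pmatrix}
M_{(\theta_{0,f},\theta_{1,f})} & A \\
B & C
\end{pmatrix},
\]
\[
A = \begin{pmatrix}
\sum_{n\ge0} \omega_0\,\theta_{0,f}^n(c_2 - c_0)\,t^n \\
\sum_{n\ge0} \omega_1\,\theta_{0,f}^n(c_2 - c_0)\,t^n \\
\sum_{n\ge0} \omega_2\,\theta_{0,f}^n(c_2 - c_0)\,t^n
\end{pmatrix},
\qquad
B = \begin{pmatrix}
\sum_{n\ge0} \delta_{c_1}\theta_{0,f}^n(f_{\#0}(c_0))\,t^n \\
\sum_{n\ge0} \delta_{c_1}\theta_{0,f}^n(f_{\#0}(c_1))\,t^n \\
\sum_{n\ge0} \delta_{c_1}\theta_{0,f}^n(f_{\#0}(c_2))\,t^n
\end{pmatrix}^{T}
\]
and
\[
C = \sum_{n\ge0} \delta_{c_1}\theta_{0,f}^n(c_2 - c_0)\,t^n.
\]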
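Since
\[
M_{(\theta_{0,f},\,\theta_{1,f}+\delta_{c_1}\otimes(c_2-c_0))} =
\begin{pmatrix}
\frac12 & \frac12\left(-1 + t + t^2 + t^3\right) & \frac12 & -1 \\
-1 & 1 + t + t^2 & -1 & 2 \\
\frac12 & \frac12\left(1 - t - t^2 - t^3\right) & \frac12 & -1 \\
-\frac12 & \frac12\left(1 + t + t^2\right) & -\frac12 & 1
\end{pmatrix},
\]
we have
\[
D_{(\theta_{0,f},\,\theta_{1,f}+\delta_{c_1}\otimes(c_2-c_0))} = 1 - 3t + t^4.
\]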
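The two determinants of Examples 2.15 and 2.18 can be checked mechanically. The following is only a sketch under our own conventions: polynomials in $t$ are stored as coefficient lists, the helper names (`poly`, `det_id_minus_t`, `M3`, `M4`) are hypothetical, and the matrix entries are transcribed from the two matrices displayed in those examples.

```python
from fractions import Fraction

# A polynomial in t is a list of Fractions, lowest degree first.
def poly(*coeffs):
    return [Fraction(c) for c in coeffs]

def trim(p):
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def det(m):
    """Determinant of a square matrix of polynomials (cofactor expansion along row 0)."""
    if len(m) == 1:
        return m[0][0]
    total = poly(0)
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        sign = poly(1 if j % 2 == 0 else -1)
        total = add(total, mul(sign, mul(entry, det(minor))))
    return total

def det_id_minus_t(m):
    """Det(id - t*M) for a square matrix M of polynomials in t."""
    n = len(m)
    minus_t = poly(0, -1)
    return det([[add(poly(1 if i == j else 0), mul(minus_t, m[i][j]))
                 for j in range(n)] for i in range(n)])

half = Fraction(1, 2)
u = [poly(half), poly(-1), poly(half)]                      # columns 1 and 3 (Example 2.15)
v = [poly(-half, half, half, half), poly(1, 1, 1),
     poly(half, -half, -half, -half)]                       # column 2 (Example 2.15)
M3 = [[u[i], v[i], u[i]] for i in range(3)]

a_col = [poly(-1), poly(2), poly(-1)]                       # block A (Example 2.18)
b_row = [poly(-half), poly(half, half, half), poly(-half)]  # block B (Example 2.18)
M4 = [M3[i] + [a_col[i]] for i in range(3)] + [b_row + [poly(1)]]

print(det_id_minus_t(M3))  # coefficients of 1 - 2t + t^4
print(det_id_minus_t(M4))  # coefficients of 1 - 3t + t^4
```

Exact rational arithmetic is used so that the comparison with $1 - 2t + t^4$ and $1 - 3t + t^4$ is an identity of polynomials rather than a numerical approximation.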
By (2.5), we arrive at
\[
\sum_{n\ge0} \tilde\nu_{(f^n,c_1)}([c_0,c_2])\,t^{n+1} = 1 - \frac{1 - 3t + t^4}{1 - 2t + t^4} = \frac{t}{1 - 2t + t^4},
\]
and finally
\[
\sum_{n\ge0} \tilde\nu_{(f^n,c_1)}([c_0,c_2])\,t^{n} = 1 + 2t + 4t^2 + 8t^3 + 15t^4 + 28t^5 + 52t^6 + o[t]^7.
\]
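The expansion can be reproduced with a one-line recurrence: if $a(t)\,(1 - 2t + t^4) = 1$, then $a_n = 2a_{n-1} - a_{n-4}$. A minimal sketch (the function name `inverse_series` is ours, not part of the paper):

```python
def inverse_series(d, order):
    """Coefficients of 1/d(t) up to t**(order - 1); d is a list of ints with d[0] == 1."""
    a = []
    for n in range(order):
        s = 1 if n == 0 else 0
        for k in range(1, min(n, len(d) - 1) + 1):
            s -= d[k] * a[n - k]
        a.append(s)
    return a

D = [1, -2, 0, 0, 1]            # 1 - 2t + t^4
coeffs = inverse_series(D, 7)
print(coeffs)                   # [1, 2, 4, 8, 15, 28, 52]
```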
With this, we conclude for instance that the equation $f^6(x) = c_1$ has exactly 52 solutions in $[c_0, c_2]$.

2.2.1. The numbers $\ell(f^n_{|K})$. Let $f : I \to I$ be a piecewise monotone map with turning points $c_1 < \cdots < c_\ell$, and $n \in \mathbb{N}$. Recall that the lap number of $f^n$ is defined by
\[
\ell(f^n) \overset{\mathrm{def}}{=} \#\{x \in \operatorname{int}(I) : \epsilon_{f^n}(x) = 0\} + 1,
\]
in other words, $\ell(f^n)$ is the number of maximal intervals in which $f^n$ is strictly monotone. More generally, for each $K = [c,d] \in \mathcal{I}$, we define
\[
\ell(f^n_{|K}) \overset{\mathrm{def}}{=} \#\{x \in \operatorname{int}(K) : \epsilon_{f^n}(x) = 0\} + 1.
\]
If we know the numbers $\nu_{(f^i,c_j)}(\operatorname{int}(K))$, with $0 \le i \le n-1$ and $1 \le j \le \ell$, then we can easily compute $\ell(f^n_{|K})$.

Let $C_f = \{c_1, \ldots, c_\ell\}$ be the set of the turning points of $f$. For each $c_j \in C_f$, we define the sequence $\gamma_n(c_j)$ by:
\[
\gamma_0(c_j) = 1; \qquad
\gamma_{n+1}(c_j) =
\begin{cases}
0 & \text{if } f^{n+1}(c_j) \in C_f, \\
\gamma_n(c_j) & \text{if } f^{n+1}(c_j) \notin C_f,
\end{cases}
\quad \text{for all } n \ge 0,
\]
and the correspondent formal power series
\[
\gamma(c_j) \overset{\mathrm{def}}{=} \sum_{n\ge0} \gamma_n(c_j)\,t^n.
\]
Then we have
\[
\ell(f^n_{|K}) = 1 + \sum_{j=1}^{\ell} \sum_{i=0}^{n-1} \gamma_{n-1-i}(c_j)\,\nu_{(f^i,c_j)}(\operatorname{int}(K)), \quad \text{for all } n \ge 1,
\]
or equivalently
\[
\sum_{n\ge1} \ell(f^n_{|K})\,t^n = t(1-t)^{-1} + \sum_{j=1}^{\ell} \gamma(c_j) \sum_{n\ge0} \nu_{(f^n,c_j)}(\operatorname{int}(K))\,t^{n+1}.
\]
Thus, if $\nu_{(f^n,c_j)}(\operatorname{int}(K)) = \tilde\nu_{(f^n,c_j)}(K)$, for all $1 \le j \le \ell$ and $n \ge 0$, we have by (2.5),
\[
\sum_{n\ge1} \ell(f^n_{|K})\,t^n = t(1-t)^{-1} + \sum_{j=1}^{\ell} \gamma(c_j) \left(1 - \frac{D_{(\theta_{0,f},\,\theta_{1,f}+\delta_{c_j}\otimes(d-c))}}{D_{(\theta_{0,f},\theta_{1,f})}}\right). \tag{2.6}
\]
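In the particular case $K = I$, if $f(\partial I) \subseteq \partial I$, then $\nu_{(f^n,c_j)}(\operatorname{int}(I)) = \tilde\nu_{(f^n,c_j)}(I)$, for all $1 \le j \le \ell$ and $n \in \mathbb{N}$, and by (2.6) we obtain

Corollary 2.19. Let $f : I = [a,b] \to I$ be a piecewise monotone map with turning points $c_1 < \cdots < c_\ell$. If $f(\partial I) \subseteq \partial I$, then we have
\[
\sum_{n\ge1} \ell(f^n)\,t^n = t(1-t)^{-1} + \sum_{j=1}^{\ell} \gamma(c_j) \left(1 - \frac{D_{(\theta_{0,f},\,\theta_{1,f}+\delta_{c_j}\otimes(b-a))}}{D_{(\theta_{0,f},\theta_{1,f})}}\right).
\]

Remark 2.20. If $f : I = [a,b] \to I$ has a single turning point $c_1$ and $f(\partial I) \subseteq \partial I$, then we check easily
\[
D_{(\theta_{0,f},\theta_{1,f})} - D_{(\theta_{0,f},\,\theta_{1,f}+\delta_{c_1}\otimes(b-a))} = t,
\]
and from the previous corollary
\[
\sum_{n\ge1} \ell(f^n)\,t^{n-1} = (1-t)^{-1} + \gamma(c_1)\,D_{(\theta_{0,f},\theta_{1,f})}^{-1}.
\]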
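The formula of Remark 2.20 is easy to evaluate once the orbit of the turning point is known. As a sketch, we take the data of the unimodal map of Example 2.15 as assumptions: $f^4(c_1) = c_1$ is the first return of $c_1$ to $C_f$, and $D_{(\theta_{0,f},\theta_{1,f})} = 1 - 2t + t^4$. The function names are hypothetical.

```python
def inverse_series(d, order):
    """Coefficients of 1/d(t) up to t**(order - 1); d is a list of ints with d[0] == 1."""
    a = []
    for n in range(order):
        s = 1 if n == 0 else 0
        for k in range(1, min(n, len(d) - 1) + 1):
            s -= d[k] * a[n - k]
        a.append(s)
    return a

def gamma_coeffs(hits_turning_point, order):
    """gamma_0 = 1; gamma_{n+1} = 0 if f^{n+1}(c) is a turning point, else gamma_n."""
    g = [1]
    for n in range(order - 1):
        g.append(0 if hits_turning_point(n + 1) else g[-1])
    return g

def lap_numbers(gamma, d, count):
    """l(f^n), n = 1..count, read off from 1/(1 - t) + gamma(t)/D(t)."""
    inv_d = inverse_series(d, count)
    laps = []
    for m in range(count):
        conv = sum(gamma[k] * inv_d[m - k] for k in range(min(m, len(gamma) - 1) + 1))
        laps.append(1 + conv)  # the 1 is the coefficient of t^m in 1/(1 - t)
    return laps

# Assumption: c_1 has period 4 and meets no turning point before returning to itself.
gamma = gamma_coeffs(lambda m: m % 4 == 0, 6)
laps = lap_numbers(gamma, [1, -2, 0, 0, 1], 6)
print(gamma)  # [1, 1, 1, 1, 0, 0]
print(laps)   # [2, 4, 8, 16, 30, 56]
```

The first four lap numbers are the maximal values $2^n$; the defect appears at $n = 5$, one step after the critical orbit closes.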
Example 2.21. Let $f$ be the map of Example 2.15. $f$ has a single turning point at $c_1$. Since $f(c_0) = f(c_2) = c_0$, $f(\partial I) \subseteq \partial I$. We have $\gamma(c_1) = 1 + t + t^2 + t^3$, and by Corollary 2.19
\[
\sum_{n\ge1} \ell(f^n)\,t^n = \frac{t}{1-t} + \left(1 + t + t^2 + t^3\right)\left(1 - \frac{1 - 3t + t^4}{1 - 2t + t^4}\right),
\]
and
\[
\sum_{n\ge1} \ell(f^n)\,t^{n-1} = \frac{1}{1-t} + \frac{1 + t + t^2 + t^3}{1 - 2t + t^4}
= 2 + 4t + 8t^2 + 16t^3 + 30t^4 + 56t^5 + o[t]^6.
\]

2.3. The equation $f^n(x) = x$. Let $g : I \to I$ be a piecewise monotone map. As usual, we say that $x \in \operatorname{int}(I)$ is a fixed point of $g$ of negative type (positive type) if $g(x) = x$ and $g$ is strictly decreasing (increasing) in some neighbourhood of $x$. We say that $x \in I$ is a fixed point of $g$ of critical type if $g(x) = x$ and $x$ is a local minimum or local maximum point of $g$. Using the sign $\bar\epsilon_g$ (see Definition 2.11) we may write
\[
\begin{aligned}
\operatorname{Fix}_g &= \{x \in I : g(x) = x\}, \\
\operatorname{Fix}_g^- &= \{x \in I : g(x) = x,\ \bar\epsilon_g(x) = -1\}, \\
\operatorname{Fix}_g^+ &= \{x \in I : g(x) = x,\ \bar\epsilon_g(x) = 1\}, \\
\operatorname{Fix}_g^0 &= \{x \in I : g(x) = x,\ \bar\epsilon_g(x) = 0\},
\end{aligned}
\]
and the correspondent cardinals
\[
N_g = \#\operatorname{Fix}_g, \quad N_g^- = \#\operatorname{Fix}_g^-, \quad N_g^+ = \#\operatorname{Fix}_g^+ \quad \text{and} \quad N_g^0 = \#\operatorname{Fix}_g^0.
\]
Observe that, since $g$ is piecewise monotone, the sets $\operatorname{Fix}_g^-$ and $\operatorname{Fix}_g^0$ are finite, and it is clear that
\[
N_g \ge 2N_g^- - 1 + N_g^0.
\]
Note also that, in many interesting cases, it is sufficient to know the numbers $N_g^-$ and $N_g^0$ for computing $N_g$. As an example, if the piecewise monotone map $g$ is expansive (by an expansive map we mean a continuous map $g : I \to I$ such that $|g'_+(x)| > 1$ and $|g'_-(x)| > 1$ for all $x \in I$) then
\[
N_g = 2N_g^- - 1 + N_g^0.
\]

Let $f : I \to I$ be a piecewise monotone map. Recall that Milnor and Thurston proved that
\[
\exp \sum_{n\ge1} \frac{t^n}{n}\left(2N_{f^n}^- - 1\right) = D_f^{-1}, \tag{2.7}
\]
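where $D_f$ is the kneading determinant of $f$. Here we easily prove a similar result:
\[
\exp \sum_{n\ge1} \frac{t^n}{n}\left(2N_{f^n}^- - 1 + N_{f^n}^0\right) = D_{(\theta_{0,f},\theta_{1,f})}^{-1}. \tag{2.8}
\]
From the definition of $D_{(\theta_{0,f},\theta_{1,f})}$, Eq. (2.8) is an immediate consequence of the following result.

Theorem 2.22. Let $f : I \to I$ be a piecewise monotone map and $n \in \mathbb{Z}^+$. Then we have
\[
\operatorname{Tr}\left(\theta_{1,f}^n - \theta_{0,f}^n\right) = 2N_{f^n}^- - 1 + N_{f^n}^0.
\]

Let $n \in \mathbb{Z}^+$. By Corollary 2.10, we have
\[
\operatorname{Tr}\left(\theta_{1,f}^n - \theta_{0,f}^n\right) = \operatorname{Tr}\left(\theta_{1,f^n} - \theta_{0,f^n}\right),
\]
therefore, Theorem 2.22 is an immediate consequence of the following

Lemma 2.23. Let $g : I = [a,b] \to I$ be a piecewise monotone map. Then we have
\[
\operatorname{Tr}\left(\theta_{1,g} - \theta_{0,g}\right) = 2N_g^- - 1 + N_g^0.
\]

Proof. Case 1 ($\epsilon_g(a+) > 0$ and $\epsilon_g(b-) > 0$). Let $c_1 < \cdots < c_\ell$ be the turning points of $g$. By Proposition 2.14, we have
\[
\operatorname{Tr}\left(\theta_{1,g} - \theta_{0,g}\right) = \sum_{i=0}^{\ell+1} \omega_i(g_{\#0}(c_i)),
\]
with $c_0 = a$ and $c_{\ell+1} = b$. Consider the intervals
\[
I_1 = [c_1, c_2],\ I_3 = [c_3, c_4],\ \ldots,\ I_{\ell-1} = [c_{\ell-1}, c_\ell].
\]
In this case, $g$ is strictly decreasing in each of these intervals. Therefore, for each $i = 1, 3, \ldots, \ell-1$, $g$ has at most one fixed point in $I_i$, and by definition of $\omega_i$ (see Definition 2.13) we have
\[
\omega_i(g_{\#0}(c_i)) + \omega_{i+1}(g_{\#0}(c_{i+1})) =
\begin{cases}
2 & \text{if } I_i \cap \operatorname{Fix}_g^- \ne \emptyset, \\
1 & \text{if } I_i \cap \operatorname{Fix}_g^0 \ne \emptyset, \\
0 & \text{if } I_i \cap \operatorname{Fix}_g = \emptyset.
\end{cases}
\]
From the definition of $\omega_0$ and $\omega_{\ell+1}$ we also have
\[
\omega_i(g_{\#0}(c_i)) = \#\left(\operatorname{Fix}_g^0 \cap \{c_i\}\right) - \frac12, \quad \text{for each } i \in \{0, \ell+1\}.
\]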
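Thus
\[
\sum_{i=0}^{\ell+1} \omega_i(g_{\#0}(c_i)) = 2N_g^- - 1 + N_g^0,
\]
and the proof follows.

Case 2 ($\epsilon_g(a+) < 0$ or $\epsilon_g(b-) < 0$). Assume that $\epsilon_g(b-) > 0$ (in the remaining cases the proof follows in the same way). In this case we consider an interval $J = [d,b]$, with $d < a$, and $h : J \to J$ a piecewise monotone map such that $g(x) = h(x)$, for all $x \in I$, $h$ is strictly increasing in $[d,a]$ and $h(d) > d$. By definition of $h$, we have $N_h^- = N_g^-$ and $N_h^0 = N_g^0$, and by Proposition 2.14,
\[
\operatorname{Tr}\left(\theta_{1,g} - \theta_{0,g}\right) = \operatorname{Tr}\left(\theta_{1,h} - \theta_{0,h}\right).
\]
Since $h$ is in the conditions of the first case the proof follows. $\square$

Remark 2.24. We can see Theorem 2.22 as a corollary of a more general result, which shows that the numbers $2N_{f^n}^- - 1 + N_{f^n}^0$ are, in some sense, intrinsically associated with the morphisms $.f_{\#0}$ and $.f_{\#1}$. We can state this result as follows: let $f : I \to I$ be a piecewise monotone map, and consider morphisms $\alpha_0$, $\beta_0$, $\alpha_1$ and $\beta_1$ such that the diagrams
\[
\begin{CD}
0 @>>> S_0(\operatorname{int}(I)) @>{i_{\#0}}>> S_0(I,\mathbb{Q}) @>{p}>> \dfrac{S_0(I,\mathbb{Q})}{S_0(\operatorname{int}(I))} = \mathbb{Q}^2 @>>> 0 \\
@. @VV{.f_{\#0}}V @VV{\alpha_0}V @VV{\beta_0}V @. \\
0 @>>> S_0(\operatorname{int}(I)) @>{i_{\#0}}>> S_0(I,\mathbb{Q}) @>{p}>> \dfrac{S_0(I,\mathbb{Q})}{S_0(\operatorname{int}(I))} = \mathbb{Q}^2 @>>> 0
\end{CD}
\]
and
\[
\begin{CD}
0 @>>> \tilde S_1(I,\mathbb{Q}) @>{\partial_1}>> S_0(I,\mathbb{Q}) @>{p'}>> H_0(I,\mathbb{Q}) = \mathbb{Q} @>>> 0 \\
@. @VV{.f_{\#1}}V @VV{\alpha_1}V @VV{\beta_1}V @. \\
0 @>>> \tilde S_1(I,\mathbb{Q}) @>{\partial_1}>> S_0(I,\mathbb{Q}) @>{p'}>> H_0(I,\mathbb{Q}) = \mathbb{Q} @>>> 0
\end{CD}
\]
are commutative. Then the pair $(\alpha_0, \alpha_1)$ has finite-dimensional image and
\[
\operatorname{Tr}\left(\alpha_1^n - \alpha_0^n\right) - \operatorname{Tr}\beta_1^n + \operatorname{Tr}\beta_0^n = 2N_{f^n}^- - 1 + N_{f^n}^0, \quad \text{for all } n \in \mathbb{N}.
\]
To prove this result observe that the diagrams
\[
\begin{CD}
0 @>>> S_0(\operatorname{int}(I)) @>{i_{\#0}}>> S_0(I,\mathbb{Q}) @>{p}>> \dfrac{S_0(I,\mathbb{Q})}{S_0(\operatorname{int}(I))} = \mathbb{Q}^2 @>>> 0 \\
@. @VV{0}V @VV{\alpha_0^n - \theta_{0,f}^n}V @VV{\beta_0^n}V @. \\
0 @>>> S_0(\operatorname{int}(I)) @>{i_{\#0}}>> S_0(I,\mathbb{Q}) @>{p}>> \dfrac{S_0(I,\mathbb{Q})}{S_0(\operatorname{int}(I))} = \mathbb{Q}^2 @>>> 0
\end{CD}
\]
and
\[
\begin{CD}
0 @>>> \tilde S_1(I,\mathbb{Q}) @>{\partial_1}>> S_0(I,\mathbb{Q}) @>{p'}>> H_0(I,\mathbb{Q}) = \mathbb{Q} @>>> 0 \\
@. @VV{0}V @VV{\alpha_1^n - \theta_{1,f}^n}V @VV{\beta_1^n}V @. \\
0 @>>> \tilde S_1(I,\mathbb{Q}) @>{\partial_1}>> S_0(I,\mathbb{Q}) @>{p'}>> H_0(I,\mathbb{Q}) = \mathbb{Q} @>>> 0
\end{CD}
\]
are commutative. Therefore we have:
\[
\operatorname{Tr}\left(\alpha_0^n - \theta_{0,f}^n\right) = \operatorname{Tr}\beta_0^n \quad \text{and} \quad \operatorname{Tr}\left(\alpha_1^n - \theta_{1,f}^n\right) = \operatorname{Tr}\beta_1^n,
\]
and from Theorem 2.22,
\[
\begin{aligned}
2N_{f^n}^- - 1 + N_{f^n}^0 &= \operatorname{Tr}\left(\theta_{1,f}^n - \theta_{0,f}^n\right) \\
&= \operatorname{Tr}\left[\alpha_1^n - \alpha_0^n + \left(\theta_{1,f}^n - \alpha_1^n + \alpha_0^n - \theta_{0,f}^n\right)\right] \\
&= \operatorname{Tr}\left(\alpha_1^n - \alpha_0^n\right) - \operatorname{Tr}\beta_1^n + \operatorname{Tr}\beta_0^n.
\end{aligned}
\]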
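Remark 2.25. We can establish a simple relation between $D_{(\theta_{0,f},\theta_{1,f})}$ and $D_f$. In fact, from (2.7) and (2.8), we arrive at
\[
D_f \exp \sum_{n\ge1} -\frac{N_{f^n}^0}{n}\,t^n = D_{(\theta_{0,f},\theta_{1,f})},
\]
or equivalently
\[
D_f \prod_{o \in O_0} \left(1 - t^{p(o)}\right) = D_{(\theta_{0,f},\theta_{1,f})},
\]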
where $O_0$ denotes the set of periodic orbits of $f$ of critical type, and $p(o)$ the correspondent period.

Example 2.26. Let $f$ be the map of Example 2.15. We have
\[
\exp \sum_{n\ge1} \frac{t^n}{n}\left(2N_{f^n}^- - 1 + N_{f^n}^0\right) = \left(1 - 2t + t^4\right)^{-1}.
\]
Thus
\[
\sum_{n\ge1} \left(2N_{f^n}^- - 1 + N_{f^n}^0\right) t^{n-1} = \frac{2 - 4t^3}{1 - 2t + t^4},
\]
and finally
\[
\sum_{n\ge1} \left(2N_{f^n}^- - 1 + N_{f^n}^0\right) t^{n-1} = 2 + 4t + 8t^2 + 12t^3 + 22t^4 + 40t^5 + o[t]^6.
\]
With this, we can conclude for instance that $N_{f^6} \ge 2N_{f^6}^- - 1 + N_{f^6}^0 = 40$.
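The step from the exponential formula to the rational function is a logarithmic derivative: if $\exp \sum_{n\ge1} a_n t^n/n = D(t)^{-1}$ with $D(0) = 1$, then $\sum_{n\ge1} a_n t^{n-1} = -D'(t)/D(t)$, which gives a linear recurrence for the $a_n$. A minimal sketch (the name `trace_sequence` is ours):

```python
def trace_sequence(d, count):
    """a_1..a_count with exp(sum_n a_n t^n / n) = 1/d(t), where d is a list with d[0] == 1."""
    def coeff(k):
        return d[k] if k < len(d) else 0
    b = []  # b[m] = a_{m+1}, obtained by comparing coefficients in d(t) * b(t) = -d'(t)
    for m in range(count):
        s = -(m + 1) * coeff(m + 1)
        for k in range(1, m + 1):
            s -= coeff(k) * b[m - k]
        b.append(s)
    return b

traces = trace_sequence([1, -2, 0, 0, 1], 6)   # D = 1 - 2t + t^4
print(traces)                                   # [2, 4, 8, 12, 22, 40]
```

Here `traces[n - 1]` is the value of $2N_{f^n}^- - 1 + N_{f^n}^0$ for the map of Example 2.15; the last entry is the bound $40$ quoted above.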
2.4. The equation $f^n(x) = h(x)$. Let $h : I \to I$ be a continuous map. We say that $h$ is a constant signal map, if $\epsilon_h$ is a constant function on $\operatorname{int}(I)$. We define the sign of $h$ by
\[
\epsilon(h) \overset{\mathrm{def}}{=} \epsilon_h(x), \quad \text{for some } x \in \operatorname{int}(I).
\]
In the previous section we proved independently Theorems 2.17 and 2.22. These results have to do with the number of solutions of the equations
\[
f^n(x) = x_0 \quad \text{and} \quad f^n(x) = x. \tag{2.9}
\]
Since the constant map and the identity map are constant signal maps, we may regard the equation
\[
f^n(x) = h(x), \tag{2.10}
\]
where $h$ is a constant signal map, as a generalization of the equations in (2.9). Our goal now is to show that Theorems 2.17 and 2.22 are particular cases of a more general theorem concerning the number of solutions of Eq. (2.10). As an immediate consequence of this theorem, we shall obtain a stronger version of Theorem 2.22, dealing with the number of periodic points of $f$ in an arbitrary subinterval of $I$.

We begin by generalizing $\tilde\nu_{(g,x_0)}$ (see Definition 2.5). Let $g : I \to I$ be a piecewise monotone map and $h : I \to I$ a constant signal map. We define the function $\nu_{(g,h)} : I \to \{0, 1, 2\}$ by
\[
\nu_{(g,h)}(x) =
\begin{cases}
1 - \bar\epsilon_g(x)\,\epsilon(h) & \text{if } g(x) = h(x), \\
0 & \text{if } g(x) \ne h(x),
\end{cases}
\quad \text{for all } x \in I.
\]
Since the set $\{x \in I : \nu_{(g,h)}(x) \ne 0\}$ is finite, we can define

Definition 2.27. Let $g : I \to I$ be a piecewise monotone map, and $h : I \to I$ a constant signal map.
1. We define $\nu_{(g,h)}(K) = \sum_{x \in K} \nu_{(g,h)}(x)$, for all subsets $K \subseteq I$.
2. We define $\tilde\nu_{(g,h)}$ to be the unique linear form on $\tilde S_1(I,\mathbb{Q})$ that satisfies
\[
\tilde\nu_{(g,h)}(K) = \nu_{(g,h)}(\operatorname{int}(K)) + \frac12\,\nu_{(g,h)}(\partial K), \quad \text{for each } K \in \mathcal{I}.
\]

Note that, if $h = \mathrm{id}_I$, then $\epsilon(h) = 1$ and $\nu_{(g,\mathrm{id}_I)}(I) = 2N_g^- + N_g^0$. More generally
\[
\nu_{(g,\mathrm{id}_I)}(K) = 2\,\#\left(\operatorname{Fix}_g^- \cap K\right) + \#\left(\operatorname{Fix}_g^0 \cap K\right),
\]
and
\[
\tilde\nu_{(g,\mathrm{id}_I)}(K) = 2\,\#\left(\operatorname{Fix}_g^- \cap \operatorname{int}(K)\right) + \#\left(\operatorname{Fix}_g^- \cap \partial K\right) + \#\left(\operatorname{Fix}_g^0 \cap \operatorname{int}(K)\right) + \frac12\,\#\left(\operatorname{Fix}_g^0 \cap \partial K\right),
\]
for each $K \in \mathcal{I}$, and it is clear that
\[
\#\left(\operatorname{Fix}_g \cap K\right) \ge \nu_{(g,\mathrm{id}_I)}(K) - 1 \ge \tilde\nu_{(g,\mathrm{id}_I)}(K) - 1.
\]

Let $h : I \to I$ be a constant signal map and $f : I \to I$ a piecewise monotone map. Consider the linear transformation $\eta_h : S_0(I,\mathbb{Q}) \to S_0(I,\mathbb{Q})$ defined as follows: if $x \in I$ and there is $y \in I$ such that $h(y) = x$, then $\eta_h(x) = \epsilon(h).y$. For the remaining points of $I$ we define $\eta_h(x) = 0$. It is clear that, if $\epsilon(h) = 0$, then $\eta_h = 0$. If $h = \mathrm{id}_I$, then $\eta_h = \mathrm{id}_{S_0(I,\mathbb{Q})}$. Since the pair $(\theta_{0,f}, \theta_{1,f})$ has finite-dimensional image, we see that
\[
\gamma_{(n,f,h)} \overset{\mathrm{def}}{=} \eta_h \circ \left(\theta_{1,f}^n - \theta_{0,f}^n\right), \quad \text{for each } n \in \mathbb{N},
\]
is a linear transformation on $S_0(I,\mathbb{Q})$, with finite-dimensional image. Therefore we can consider $T_{(n,f,h)}$ to be the linear form on $\tilde S_1(I,\mathbb{Q})$ defined by
\[
T_{(n,f,h)}(K) = \operatorname{Tr}\left(\tilde\pi_K \circ \gamma_{(n,f,h)}\right), \quad \text{for all } K \in \mathcal{I},
\]
where $\tilde\pi_K : S_0(I,\mathbb{Q}) \to S_0(I,\mathbb{Q})$ is the unique linear transformation that satisfies
\[
\tilde\pi_K(x) =
\begin{cases}
x & \text{if } x \in \operatorname{int}(K), \\
\frac12 x & \text{if } x \in \partial K, \\
0 & \text{if } x \notin K,
\end{cases}
\quad \text{for all } x \in I.
\]
Now we can state our main result, which is a generalization of Theorems 2.17 and 2.22.

Theorem 2.28. Let $f : I \to I$ be a piecewise monotone map, $h : I \to I$ a constant signal map, $\partial_1 : \tilde S_1(I,\mathbb{Q}) \to S_0(I,\mathbb{Q})$ the boundary operator, $n \in \mathbb{Z}^+$ and $S_{(n,f,h)}$ the linear form on $S_0(I,\mathbb{Q})$ defined by
\[
S_{(n,f,h)}(x) = \delta_{h(x)}\left(\theta_{1,f}^n(x) - \epsilon(h).f_{\#0}^n(x)\right), \quad \text{for all } x \in I,
\]
where $\delta_{h(x)}$ is the linear form of Definition 2.16. Then
\[
S_{(n,f,h)} \circ \partial_1 + T_{(n,f,h)} = \tilde\nu_{(f^n,h)}.
\]
If $h$ is a homeomorphism, then
\[
\operatorname{Tr}\gamma_{(n,f,h)} = \nu_{(f^n,h)}(I) - 1.
\]
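Observe that, once more by Corollary 2.10, we have
\[
\gamma_{(n,f,h)} = \gamma_{(1,f^n,h)}, \quad T_{(n,f,h)} = T_{(1,f^n,h)} \quad \text{and} \quad S_{(n,f,h)} = S_{(1,f^n,h)},
\]
for each $n \in \mathbb{Z}^+$. Thus Theorem 2.28 is an immediate consequence of the following

Lemma 2.29. Let $g : I \to I$ be a piecewise monotone map, $h : I \to I$ a constant signal map, $\partial_1 : \tilde S_1(I,\mathbb{Q}) \to S_0(I,\mathbb{Q})$ the boundary operator, and $S_{(1,g,h)}$ the linear form on $S_0(I,\mathbb{Q})$ defined by
\[
S_{(1,g,h)}(x) = \delta_{h(x)}\left(\theta_{1,g}(x) - \epsilon(h).g_{\#0}(x)\right), \quad \text{for all } x \in I.
\]
Then
\[
S_{(1,g,h)} \circ \partial_1 + T_{(1,g,h)} = \tilde\nu_{(g,h)}.
\]
If $h$ is a homeomorphism, then
\[
\operatorname{Tr}\gamma_{(1,g,h)} = \nu_{(g,h)}(I) - 1.
\]

Proof. Observe that, if $\epsilon(h) = 0$, then the proof follows from Theorem 2.17. We shall prove the lemma for $\epsilon(h) \ne 0$. Let $g : I = [c_0, c_{\ell+1}] \to I$ be a piecewise monotone map, with turning points $c_1 < \cdots < c_\ell$, and $h : I \to I$ a constant signal map, such that $\epsilon(h) \ne 0$. Since $\mathcal{I}_g$ generates $\tilde S_1(I,\mathbb{Q})$, to prove the first part, we only need to show that
\[
S_{(1,g,h)}(d) - S_{(1,g,h)}(c) + \operatorname{Tr}\left(\tilde\pi_K \circ \gamma_{(1,g,h)}\right) = \tilde\nu_{(g,h)}(K), \tag{2.11}
\]
for all $K = [c,d] \in \mathcal{I}_g$. Let $K = [c,d] \in \mathcal{I}_g$, and consider the numbers $C$ and $D$, defined by:
\[
C = \frac12
\begin{cases}
\epsilon(h) - \epsilon_g(c+) & \text{if } h(c) < g(c), \\
0 & \text{if } h(c) = g(c) \text{ and } \bar\epsilon_g(c) \ne 0, \\
\epsilon(h)\,\epsilon_g(c+) & \text{if } h(c) = g(c) \text{ and } \bar\epsilon_g(c) = 0, \\
\epsilon_g(c+) - \epsilon(h) & \text{if } h(c) > g(c),
\end{cases}
\]
and
\[
D = \frac12
\begin{cases}
\epsilon_g(d-) - \epsilon(h) & \text{if } h(d) < g(d), \\
0 & \text{if } h(d) = g(d) \text{ and } \bar\epsilon_g(d) \ne 0, \\
\epsilon(h)\,\epsilon_g(d-) & \text{if } h(d) = g(d) \text{ and } \bar\epsilon_g(d) = 0, \\
\epsilon(h) - \epsilon_g(d-) & \text{if } h(d) > g(d).
\end{cases}
\]
By definition of $\mathcal{I}_g$, $g$ is injective in $K$, and we check easily $C + D = \tilde\nu_{(g,h)}(K)$. Thus, from (2.11) we have to prove that
\[
S_{(1,g,h)}(d) - S_{(1,g,h)}(c) + \operatorname{Tr}\left(\tilde\pi_K \circ \gamma_{(1,g,h)}\right) = C + D. \tag{2.12}
\]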
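From Propositions 2.12 and 2.14, we have
\[
S_{(1,g,h)}(x) = \left(\bar\epsilon_g(x) - \epsilon(h)\right)\delta_{h(x)}(g_{\#0}(x)) + \sum_{i=0}^{\ell+1} \omega_i(x)\,\delta_{h(x)}(g_{\#0}(c_i)) \tag{2.13}
\]
for all $x \in I$, and
\[
\operatorname{Tr}\left(\tilde\pi_K \circ \gamma_{(1,g,h)}\right) = \sum_{i=0}^{\ell+1} \omega_i\left((\tilde\pi_K \circ \eta_h \circ g_{\#0})(c_i)\right). \tag{2.14}
\]
Let $U : I \to \mathbb{Q}$ be the function defined by:
\[
U(x) = \left(\bar\epsilon_g(x) - \epsilon(h)\right)\delta_{h(x)}(g_{\#0}(x)), \quad \text{for all } x \in I.
\]
Let $V : I \to \mathbb{Q}$ be the function defined by: $V(x) = 0$ if $x \in I \setminus \{c_0, \ldots, c_{\ell+1}\}$, and
\[
V(c_i) = \omega_i\left((\tilde\pi_K \circ \eta_h \circ g_{\#0})(c_i)\right) + \omega_i(d)\,\delta_{h(d)}(g_{\#0}(c_i)) - \omega_i(c)\,\delta_{h(c)}(g_{\#0}(c_i)),
\]
for each $i = 0, \ldots, \ell+1$. We have
\[
(\tilde\pi_K \circ \eta_h)(x) = \epsilon(h)\left(\delta_{h(c)} - \delta_{h(d)}\right)(x)\,.\eta_h(x), \quad \text{for all } x \in I,
\]
therefore
\[
\omega_i\left((\tilde\pi_K \circ \eta_h \circ g_{\#0})(c_i)\right) = \omega_i\left(\epsilon(h)\,\delta_{h(c)}(g_{\#0}(c_i))\,\eta_h(g_{\#0}(c_i))\right) - \omega_i\left(\epsilon(h)\,\delta_{h(d)}(g_{\#0}(c_i))\,\eta_h(g_{\#0}(c_i))\right),
\]
and
\[
V(c_i) = \delta_{h(c)}(g_{\#0}(c_i))\,\omega_i\left(\epsilon(h)\,\eta_h(g_{\#0}(c_i)) - c\right) + \delta_{h(d)}(g_{\#0}(c_i))\,\omega_i\left(d - \epsilon(h)\,\eta_h(g_{\#0}(c_i))\right). \tag{2.15}
\]
Since $K \in \mathcal{I}_g$, we have $\{c_0, c_1, \ldots, c_{\ell+1}\} \cap \operatorname{int}(K) = \emptyset$, therefore $V(x) = 0$, for all $x \in \operatorname{int}(K)$. From (2.15), we verify easily that $V(x) = 0$, for all $x \notin K$. Thus
\[
\{x \in I : V(x) \ne 0\} \subseteq \{c, d\}, \tag{2.16}
\]
and, once more by (2.15), we have
\[
V(c) =
\begin{cases}
-\delta_{h(c)}(g_{\#0}(c))\,\epsilon_g(c+) & \text{if } g(c) \ne h(c), \\
-\delta_{h(d)}(g_{\#0}(c))\,\epsilon_g(c+) & \text{if } g(c) = h(c),
\end{cases}
\quad \text{if } \bar\epsilon_g(c) = 0, \tag{2.17}
\]
and
\[
V(d) =
\begin{cases}
\delta_{h(d)}(g_{\#0}(d))\,\epsilon_g(d-) & \text{if } g(d) \ne h(d), \\
\delta_{h(c)}(g_{\#0}(d))\,\epsilon_g(d-) & \text{if } g(d) = h(d),
\end{cases}
\quad \text{if } \bar\epsilon_g(d) = 0. \tag{2.18}
\]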
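By (2.13) and (2.14), we have
\[
S_{(1,g,h)}(d) - S_{(1,g,h)}(c) + \operatorname{Tr}\left(\tilde\pi_K \circ \gamma_{(1,g,h)}\right) = U(d) - U(c) + \sum_{x \in I} V(x),
\]
and by (2.16)
\[
U(d) - U(c) + \sum_{x \in I} V(x) = V(c) - U(c) + V(d) + U(d).
\]
From (2.17) and (2.18) we have $V(c) - U(c) = C$ and $V(d) + U(d) = D$, and finally we obtain (2.12).

Now we prove the second part. Assume that $h$ is a homeomorphism. From the first part, we obtain
\[
S_{(1,g,h)}(c_{\ell+1}) - S_{(1,g,h)}(c_0) + \operatorname{Tr}\left(\tilde\pi_I \circ \gamma_{(1,g,h)}\right) = \tilde\nu_{(g,h)}(I),
\]
therefore
\[
\operatorname{Tr}\gamma_{(1,g,h)} - \tilde\nu_{(g,h)}(I) = \operatorname{Tr}\left(\left(\mathrm{id}_{S_0(I,\mathbb{Q})} - \tilde\pi_I\right) \circ \gamma_{(1,g,h)}\right) - S_{(1,g,h)}(c_{\ell+1}) + S_{(1,g,h)}(c_0).
\]
Once more, we use Propositions 2.12 and 2.14, to show that
\[
\nu_{(g,h)}(I) - \tilde\nu_{(g,h)}(I) - 1 = \operatorname{Tr}\left(\left(\mathrm{id}_{S_0(I,\mathbb{Q})} - \tilde\pi_I\right) \circ \gamma_{(1,g,h)}\right) - S_{(1,g,h)}(c_{\ell+1}) + S_{(1,g,h)}(c_0),
\]
and the proof follows. $\square$

Remark 2.30. It is clear that we can prove the second part of this lemma using Lemma 2.23. In fact, if $h$ is a homeomorphism, then we have by Lemma 2.23,
\[
\operatorname{Tr}\left(\theta_{1,h^{-1}\circ g} - \theta_{0,h^{-1}\circ g}\right) = \nu_{(h^{-1}\circ g,\,\mathrm{id}_I)}(I) - 1. \tag{2.19}
\]
Observe that, if $h$ is a homeomorphism, then we have $\eta_h = \epsilon(h)\,h^{-1}_{\#0}$, and it is clear that $\eta_h \ne \theta_{0,h^{-1}}$ and $\eta_h \ne \theta_{1,h^{-1}}$. Nevertheless, the diagrams
\[
\begin{CD}
S_0(\operatorname{int}(I),\mathbb{Q}) @>{i_{\#0}}>> S_0(I,\mathbb{Q}) \\
@VV{.h^{-1}_{\#0}}V @VV{\eta_h}V \\
S_0(\operatorname{int}(I),\mathbb{Q}) @>{i_{\#0}}>> S_0(I,\mathbb{Q})
\end{CD}
\qquad \text{and} \qquad
\begin{CD}
\tilde S_1(I,\mathbb{Q}) @>{\partial_1}>> S_0(I,\mathbb{Q}) \\
@VV{.h^{-1}_{\#1}}V @VV{\eta_h}V \\
\tilde S_1(I,\mathbb{Q}) @>{\partial_1}>> S_0(I,\mathbb{Q})
\end{CD}
\]
are commutative. Thus, we have $\eta_h \circ \theta_{0,g} = \theta_{0,h^{-1}\circ g}$ and $\eta_h \circ \theta_{1,g} = \theta_{1,h^{-1}\circ g}$, and by (2.19),
\[
\operatorname{Tr}\left(\eta_h \circ \left(\theta_{1,g} - \theta_{0,g}\right)\right) = \operatorname{Tr}\left(\theta_{1,h^{-1}\circ g} - \theta_{0,h^{-1}\circ g}\right) = \nu_{(h^{-1}\circ g,\,\mathrm{id}_I)}(I) - 1 = \nu_{(g,h)}(I) - 1.
\]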
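We return to the case $h = \mathrm{id}_I$. In this case we have
\[
S_{(n,f,\mathrm{id}_I)}(x) = \delta_x\left(\theta_{1,f}^n(x) - f_{\#0}^n(x)\right), \quad \text{for each } x \in I,
\]
\[
T_{(n,f,\mathrm{id}_I)}(K) = \operatorname{Tr}\left(\tilde\pi_K \circ \gamma_{(n,f,\mathrm{id}_I)}\right) = \operatorname{Tr}\left(\tilde\pi_K \circ \left(\theta_{1,f}^n - \theta_{0,f}^n\right)\right), \quad \text{for each } K \in \mathcal{I}.
\]
If we consider the formal power series
\[
S_f(x) \overset{\mathrm{def}}{=} \sum_{n\ge1} S_{(n,f,\mathrm{id}_I)}(x)\,t^{n-1}, \quad \text{for each } x \in I,
\]
and
\[
T_f(K) \overset{\mathrm{def}}{=} \sum_{n\ge1} T_{(n,f,\mathrm{id}_I)}(K)\,t^{n-1}, \quad \text{for each } K \in \mathcal{I},
\]
then, as an immediate consequence of Theorem 2.28, we have
\[
S_f(d) - S_f(c) + T_f(K) = \sum_{n\ge1} \tilde\nu_{(f^n,\mathrm{id}_I)}(K)\,t^{n-1}, \tag{2.20}
\]
for each $K = [c,d] \in \mathcal{I}$. Since $T_f(K) = \Upsilon_{(\tilde\pi_K,\theta_{0,f},\theta_{1,f})}$, we apply Corollary 1.5 to compute $T_f(K)$. To compute $S_f(x)$ we apply Corollary 1.4 to obtain
\[
t^2 S_f(x) = D_{(f_{\#0},\,f_{\#0}+\delta_x\otimes x)} - D_{(\theta_{0,f},\theta_{1,f})}^{-1}\,D_{(\theta_{0,f},\,\theta_{1,f}+\delta_x\otimes x)}, \tag{2.21}
\]
for each $x \in I$.

Example 2.31. Let $f$ be the map of Example 2.15, and consider $K = [c,d]$, with $c = f^3(c_1)$ and $d = c_1$. We have
\[
\Lambda = \left\{(i,j) \in \{0,1,2\} \times \mathbb{N} : \tilde\pi_K \circ \theta_{0,f}^j(f_{\#0}(c_i)) \ne 0\right\} = \{(1,2), (1,3)\},
\]
and by Corollary 1.5,
\[
T_f(K) = \Upsilon_{(\tilde\pi_K,\theta_{0,f},\theta_{1,f})}
= \sum_{(i,j)\in\Lambda} t^{j-1}\left(1 - D_{(\theta_{0,f},\theta_{1,f})}^{-1}\,D_{(\theta_{0,f},\,\theta_{1,f}+\omega_i\otimes\tilde\pi_K\circ\theta_{0,f}^j(f_{\#0}(c_i)))}\right)
= \frac{t^2 - t^3}{2\left(1 - 2t + t^4\right)}.
\]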
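By (2.21) we have
\[
t^2 S_f(c) = \frac{-2t^3 + t^4 + t^5}{2\left(1 - 2t + t^4\right)} - \frac{t^2 + t^3 - t^4}{2\left(1 - t^4\right)},
\]
and
\[
t^2 S_f(d) = \frac{t^4 + t^3 - t^2}{2\left(1 - t^4\right)}.
\]
Thus by (2.20) we have
\[
\sum_{n\ge1} \tilde\nu_{(f^n,\mathrm{id}_I)}(K)\,t^{n-1} = \frac{t}{1 - t^4} + \frac{t - t^3}{1 - 2t + t^4},
\]
and finally
\[
\sum_{n\ge1} \tilde\nu_{(f^n,\mathrm{id}_I)}(K)\,t^{n-1} = 2t + 2t^2 + 3t^3 + 6t^4 + 12t^5 + o[t]^6.
\]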
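The last expansion can be recomputed directly from the three rational functions of this example. The sketch below is ours: it assumes the reading $T_f(K) = (t^2 - t^3)/(2(1 - 2t + t^4))$ together with the stated values of $t^2 S_f(c)$ and $t^2 S_f(d)$, forms $S_f(d) - S_f(c) + T_f(K)$ with truncated power series over the rationals, and uses the normalisation $t^{n-1}$ from the definitions of $S_f$ and $T_f$.

```python
from fractions import Fraction

ORDER = 8  # all series are truncated modulo t**ORDER

def series(*coeffs):
    c = [Fraction(x) for x in coeffs]
    return c + [Fraction(0)] * (ORDER - len(c))

def mul(p, q):
    out = [Fraction(0)] * ORDER
    for i, a in enumerate(p):
        for j, b in enumerate(q[:ORDER - i]):
            out[i + j] += a * b
    return out

def inv(p):
    """Inverse of a power series with nonzero constant term."""
    out = [1 / p[0]]
    for n in range(1, ORDER):
        out.append(-sum(p[k] * out[n - k] for k in range(1, n + 1)) / p[0])
    return out

def sub(p, q):
    return [a - b for a, b in zip(p, q)]

half = Fraction(1, 2)
inv_D = inv(series(1, -2, 0, 0, 1))   # 1/(1 - 2t + t^4)
inv_E = inv(series(1, 0, 0, 0, -1))   # 1/(1 - t^4)

T = mul(series(0, 0, half, -half), inv_D)
t2_S_c = sub(mul(series(0, 0, 0, -1, half, half), inv_D),
             mul(series(0, 0, half, half, -half), inv_E))
t2_S_d = mul(series(0, 0, -half, half, half), inv_E)

diff = sub(t2_S_d, t2_S_c)                      # t^2 (S_f(d) - S_f(c))
total = [s + u for s, u in zip(diff[2:], T)]    # divide by t^2; valid modulo t**(ORDER - 2)
coefficients = [int(c) for c in total]
print(coefficients)                             # [0, 2, 2, 3, 6, 12]
```

The same coefficients are obtained from the closed form $t/(1 - t^4) + (t - t^3)/(1 - 2t + t^4)$, which is how the numerator $t - t^3$ can be cross-checked.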
With this, we conclude, for instance, that
\[
\#\left(\operatorname{Fix}_{f^6} \cap K\right) \ge \tilde\nu_{(f^6,\mathrm{id}_I)}(K) - 1 = 11.
\]

References
[Ba-Ru 94] Baladi, V. and Ruelle, D.: An extension of the theorem of Milnor and Thurston on the zeta functions of interval maps. Ergod. Th. & Dynam. Sys. 14, 621–632 (1994)
[Me-St 93] Melo, W. and van Strien, S.: One-Dimensional Dynamics. Berlin–Heidelberg–New York: Springer-Verlag, 1993
[Mi-Th 88] Milnor, J., Thurston, W.: On Iterated Maps of the Interval. In: Dynamical Systems: Proceedings 1986–1987, Berlin–Heidelberg–New York: Springer-Verlag, 1988. Lecture Notes in Mathematics, 1342, pp. 465–563
[Mo 90] Mori, M.: Fredholm determinants for piecewise linear transformations. Osaka J. Math. 27, 81–116 (1990)

Communicated by Ya. G. Sinai
Commun. Math. Phys. 204, 115 – 136 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Free Massive Fermions Inside the Quantum Discrete sine-Gordon Model

Nadja Kutz*

Technische Universität Berlin, Sekr. MA 8-5, Strasse des 17. Juni 136, 10623 Berlin, Germany. E-mail: [email protected]

Received: 2 December 1996 / Accepted: 19 January 1999

Abstract: We extend the notion of space shifts introduced in [FV3] for certain quantum light cone lattice equations of sine-Gordon type at root of unity (e.g. [FV1,FV2,BKP,BBR]). As a result, we obtain a compatibility equation for the roots of central elements within the algebra of observables (also called current algebra). The equation, which is obtained by exponentiating these roots, is exactly the evolution equation for the “classical background” as described in [BBR]. As an application for the introduced constructions, we derive a one to one correspondence between a special case of the quantum light cone lattice equations of sine-Gordon type and free massive fermions on a lattice, as a special case of the lattice Thirring model constructed in [DV].

1. Introduction

There is a famous correspondence between the quantum sine-Gordon model and the massive Thirring model (see e.g. [C,KM]). The sine-Gordon model and the massive Thirring model have equivalents which are discrete in space and time, namely the so-called doubly discrete quantum sine-Gordon model (see e.g. [BRST,NCP,S,V,FV2,CN,EK]) and the doubly discrete massive Thirring model [DV]. One of the main purposes of this paper is to establish a connection between the quantum doubly discrete sine-Gordon model and the doubly discrete model of free massive fermions which appears as a special case of the doubly discrete massive Thirring model.

The (classical) doubly discrete sine-Gordon model belongs to a class of lattice systems defined by an evolution equation of the following type:
\[
g_u = V'(g_l - g_r) + g_d. \tag{1}
\]

* Supported by the Deutsche Forschungsgemeinschaft, Sonderforschungsbereich 288
116
N. Kutz
Equation (1) is a discrete version of the evolution equation $\Box g = V'(g)$, $\Box$ being the Laplacian on two dimensional Minkowski space time. The real value $g_u \in \mathbb{R}$ is given by the three real values $g_d, g_l, g_r \in \mathbb{R}$ and by the function $V'(x)$, which is the derivative of a real potential $V(x) : \mathbb{R} \to \mathbb{R}$. The indices $u, d, l, r$ denote up, down, left and right, respectively, and stand for the corresponding vertices in the picture of a two dimensional Minkowski space time lattice shown below. The Minkowski space time (or light cone) lattice is just $\mathbb{Z}^2$ with the interpretation that edges point in light cone directions. Clearly, if we start with e.g. space periodic initial data $g_i$ along a Cauchy zig zag $C$ on the light cone lattice (Fig. 1) then the local evolution given by (1) will determine the function $g : \mathrm{Vertices} \to \mathbb{R}$ on the whole lattice.

[Figure: an elementary cell of the light cone lattice with the vertices $g_u$, $g_l$, $g_r$, $g_d$ and a Cauchy path $C$.]

Lattice systems of the above type have been discussed e.g. in [BRST,NCP,S,V,FV2,CN,EK]. In [EK] it was shown that it is possible to derive the above equation as an equation of motion from an explicitly given action. Moreover, using covariant phase space techniques, it is possible to derive the symplectic structure belonging to the above model via a variation of this action. As a consequence, one obtains a unique Poisson bracket for the variables $g_i$ (now viewed as functions on covariant phase space with quasiperiodic initial conditions). This bracket was first stated in [FV2].

The above Poisson bracket has the property that the variables $g_i$, which are associated with the vertices of the space time lattice, do not necessarily Poisson commute with variables $g_k$ associated with vertices outside the lightcone initiating at the vertex $i$. For this reason the variables $g_i$, which we call vertex variables, are often referred to as “variables with nonlocal Poisson relations”. In contrast to this, the induced Poisson relations for the Poisson subalgebra of difference variables $g_l - g_r$ (Fig. 1) are local. Therefore the difference variables are viewed as referring to physical observables and consequently they received much attention (see e.g. [BKP,FV1,BBR,CN]). The difference variables $g_l - g_r$ can be associated with the faces of the space time lattice (Fig. 1) and are therefore sometimes called face variables.

Despite their unpleasant Poisson relations, it is worthwhile to study the “vertex algebra” generated by the vertex variables $g_i$. An important step in this direction was made in the work of Faddeev and Volkov [FV3], where quantized models of the above variables were studied. In particular, the so-called quantum L-operator, which is needed for obtaining quantum integrals of motion, can be constructed in terms of the vertex algebra but (so far) not in terms of the face algebra.
For the construction of the quantum lattice sine-Gordon/massive Thirring model analogy it is crucial to investigate a subalgebra of the vertex algebra which is generated by variables gd + gl and gr + gd , respectively (Fig. 1). This subalgebra is associated with the edges of the lattice and clearly contains the face algebra generated by the face variables. The induced Poisson relations on this subalgebra (edge algebra) are also
Free Massive Fermions Inside Discrete sine-Gordon Model
117
nonlocal. Nevertheless, it turns out that the edge variables when quantized, may furnish fermionic fields, obeying local anticommutation rules. The evolution in the lattice massive Thirring model (MTM) is given by an inner automorphism involving the (quantum) R-matrix of an integrable generalization of the six vertex model followed as a space-like shift on fermionic fields. The evolution of quantized edge variables in the lattice sine-Gordon model (SGM) case given by the quantized version of (1) can also be constructed as an automorphism involving the “dynamical” sine-Gordon R-matrix and a space-like shift. The corresponding automorphisms in that case are not always inner, in particular in the root of unity case, this automorphism gives rise to a quantum evolution with “classical background” as first noticed in [BBR]. In the work of Faddeev and Volkov [FV3] the above mentioned space-like shift automorphisms for the SGM are constructed as inner automorphisms for a special choice of monodromies. We will show that this shift automorphism together with the evolution automorphism can be generalized in the root of unity case to an “almost” inner automorphism by extending the corresponding algebra with central elements which change nontrivially under the automorphism (so-called “classical phase factors”) (see also [BBR]). As a consequence, we obtain a compatibility equation for these central elements. The equation that is obtained by exponentiating this equation is exactly the evolution equation for the “classical background” as described in [BBR]. The paper is organized as follows: in the first section we define the general classical model, which gives rise to the sine-Gordon model and a careful description of the involved algebras is given. In the second section the corresponding quantized model is described. In the third section we describe the model of sine-Gordon type at root of unity, and derive an evolution equation for the roots of central elements. 
In the last section we use the prior results to show how the quantized lattice sine-Gordon model can be related to the lattice massive Thirring model constructed by Destri and de Vega [DV] (see also [TS] and its bibliography). In particular, we construct an isomorphism between a representation of the fermionic edge algebra and the “bare” fermion operators in [DV] and hence obtain an isomorphism between a representation of the dynamical sine-Gordon R-matrix and the R-matrix of an integrable generalization of the six vertex model (for ω = 1) in [DV]. Incidentally, this R-matrix is identical to the usual quantum R-matrix (with real spectral parameter) of the sine-Gordon model, which defines the commutation relations of the sine-Gordon L-operator (see e.g. [TF]). This is not only interesting in terms of representation theory, but also because we obtain in this way an immediate connection between the phase space of the quantum lattice sine-Gordon theory and its dynamical aspects. As another result it may be possible to understand more about the true nature of the famous sine-Gordon - massive Thirring model equivalence (see e.g. [C,KM]). 2. The Phase Space 2.1. Vertex variables. A (spatially) periodic light cone lattice L2p (Fig. 1) with period 2p may be viewed as L/Z, where Z acts on the infinite light cone lattice L by shifts of 2p in a space-like direction. A quasi-periodic field is a mapping g:L→R with gt,i+2p − gt,i = gt,i+2p+2 − gt,i+2 ∀i ∈ Z,
[Figure 1. The light cone lattice with the vertex variables $g_{t,k}$, the edge variables $x_{t,k}$ and the elementary Cauchy zig zags $C_{2t}$ and $C_{2t+1}$.]

and with two (space independent) monodromies $m_t^{(1)}$, $m_t^{(2)}$ defined by
\[
m_t^{(i)} := g_{t,2p+2k+1-i} - g_{t,2k+1-i} \tag{2}
\]
for an arbitrary $k \in \mathbb{Z}$. Since the quasiperiodic variables $g_{t,k}$ are associated with the vertices of the Minkowski lattice $L$ we will simply call them vertex variables. The indices $t, k$ are adapted to Fig. 1, i.e. $t$ denotes time and $k$ space. We define an evolution of the following form (Fig. 1):
\[
g_{t+1,k+1} = V'(g_{t,k} - g_{t,k+2}) + g_{t-1,k+1}. \tag{3}
\]
Given a set of initial values $I^g_{2t} = \{(g_{2t-1,2k+1})_{k\in\{-1,\ldots,p-1\}}, (g_{2t,2k})_{k\in\{0,\ldots,p\}}\}$ or $I^g_{2t+1} = \{(g_{2t,2k})_{k\in\{0,\ldots,p\}}, (g_{2t+1,2k-1})_{k\in\{-1,\ldots,p-1\}}\}$ ($t \in \mathbb{Z}$ fixed) along an elementary Cauchy zig zag (Fig. 1), the evolution equations (3) define quasiperiodic field values at all other times. A quasiperiodic field $g$ obtained in this way will be called a solution to evolution (3). By virtue of the evolution equations the monodromies are time independent:
\[
m^{(1)}_{2t} := m^{(1)}, \qquad m^{(2)}_{2t-1} := m^{(2)}.
\]
In Fig. 1 elementary Cauchy zig zags $C_{2t}$ and $C_{2t+1}$ are visualized by dashed and nondashed lines, the grey area indicates fundamental regions for quasiperiodic initial values $I^g_{2t}$ and $I^g_{2t+1}$. The quasiperiodic initial values $I^g_t$ can be interpreted as a (global) coordinate chart $g_t : P \to \mathbb{R}^{2p+2}$ on the set of all solutions $P$ to Eq. (3) via the identification: $g_{t,i}(g) = g_{t,i} \in \mathbb{R}$. The set of all solutions to a given evolution will be called covariant phase space $P$. It is possible to define a translationally invariant action on the set of quasiperiodic fields whose variation gives the above field equations as well as a time independent, translationally invariant symplectic structure [EK], which in turn defines the following Poisson relations for the functions $g_{t,i} : P \to \mathbb{R}$:
\[
\begin{aligned}
\{g_{t,i}, g_{t,k}\} &= 0, && \text{if } i - k \text{ even}, \\
\{g_{t,i}, g_{t\pm1,k}\} &= 1, && \text{if } i - k \text{ odd},\ i < k,\ |i - k| < 2p, \\
\{g_{t,i}, m^{(k)}\} &= 0 && (i - k \text{ odd}), \\
\{g_{t,i}, m^{(k)}\} &= 2 && (i - k \text{ even}), \\
\{m^{(1)}, m^{(2)}\} &= 0.
\end{aligned} \tag{4}
\]
The Poisson algebra generated by the vertex variables gt,i : P → R together with the relations (4) will be called vertex algebra. 2.2. Edge variables. Define the following variables: xt,k : = gt,k + gt−1,k−1 xt,k : = gt−1,k + gt,k−1
for k − t even, for k − t odd.
(5)
With respect to the labeling of the vertex variables, the choice of indices for the edge variables appears to be quite counterintuitive. Nevertheless with this choice the indices t and k still refer to space and time. xt+1,k
xt+1,k−1 gt,k−2
gt,k
xt,k−1
t
xt,k
gt−1,k−1 k
The variables $x_{t,k}$ are associated with the edges of the Minkowski lattice $L$ and we call them edge variables. The corresponding Poisson subalgebra will be called edge algebra. A set of initial vertex values $I^g_{2t+1}$ or $I^g_{2t}$ defines the initial edge values $I^x_{2t+1} = (x_{2t+1,k})_{k\in\{0,\dots,2p\}}$ or $I^x_{2t} = (x_{2t,k})_{k\in\{0,\dots,2p\}}$, respectively. The edge variables are still quasiperiodic. Their monodromy is the sum of the two monodromies of the vertex variables:

$$ x_{t,k+2p} = x_{t,k} + \underbrace{m^{(1)} + m^{(2)}}_{=:\,m_x}. $$

The induced Poisson commutation rules are

$$ \{x_{t,k}, x_{t,k+n}\} = 2, \qquad n \in \{1,\dots,2p-1\}, \tag{6} $$

$$ \{x_{t,k}, m_x\} = 4. \tag{7} $$
Using definition (5) and the vertex evolution equations (3), one obtains the following evolution equations for the edge variables:

$$ x_{t+1,k} = V'(x_{t,k-1} - x_{t,k}) + x_{t,k}, \tag{8} $$

$$ x_{t+1,k-1} = V'(x_{t,k-1} - x_{t,k}) + x_{t,k-1}, \tag{9} $$

$$ m_x = \text{const}. \tag{10} $$

We will call these equations edge evolution equations.
N. Kutz
2.3. Face variables. Define the following fields:

$$ p_{t,k-1} := g_{t,k-2} - g_{t,k}. $$

[Figure: the face variables $p_{t,k-1}$ and $p_{t-1,k}$ relative to the vertex variables $g_{t,k-2}$ and $g_{t,k}$.]

The difference variables $p_{t,k}$ are associated with the faces of the Minkowski lattice $L$. They will be called face variables, and the corresponding Poisson subalgebra face algebra. A set of initial vertex values $I^g_{2t+1}$ or $I^g_{2t}$, respectively, defines the initial face values $I^p_{2t} = \{(p_{2t-1,2k})_{k\in\{0,\dots,p-1\}}, (p_{2t,2k+1})_{k\in\{0,\dots,p-1\}}\}$ or $I^p_{2t+1} = \{(p_{2t,2k+1})_{k\in\{0,\dots,p-1\}}, (p_{2t+1,2k})_{k\in\{0,\dots,p-1\}}\}$. Note that $p_{t,k-1} = x_{t,k-1} - x_{t,k} = x_{t+1,k-1} - x_{t+1,k}$ ($k - t$ even). The above variables are periodic since the monodromies cancel:

$$ p_{t,2p+k-1} = p_{t,k-1} = g_{t,k-2} + m^{(i)} - (g_{t,k} + m^{(i)}). $$

The induced Poisson relations between these variables are ultralocal in space:

$$ \{p_{t,j}, p_{t\pm1,j+1}\} = 2, \tag{11} $$

$$ \{p_{t,j}, p_{t\pm1,j+1+k}\} = 0 \quad \text{for } k \in \{1,\dots,2p-3\}. \tag{12} $$
The evolution equations in terms of the face variables read as

$$ p_{t+1,k} = V'(p_{t,k-1}) - V'(p_{t,k+1}) + p_{t-1,k}. $$

We will call these equations face evolution equations. The introduction of the above variables refers to a Marsden-Weinstein reduction of phase space as explained in [EK].

2.4. Relations to the sine-Gordon model. Let the halfperiod $p$ of the lattice be even, i.e. $p = 2s$. In [EK] it was shown that the action which describes the above dynamical systems is invariant under a redefinition $g \mapsto -g$ along every second diagonal of the Minkowski lattice, i.e. all the above structure is preserved under such a redefinition. Choosing

$$ V'(x) = -i \ln\left(\frac{1 + k e^{ix}}{k + e^{ix}}\right), \tag{13} $$

projecting the evolution to the torus $T^{2p+2}$:

$$ e^{i g_{t+1,k+1}} = \frac{k + e^{i(g_{t,k} - g_{t,k+2})}}{1 + k e^{i(g_{t,k} - g_{t,k+2})}}\, e^{i g_{t-1,k+1}}, \tag{14} $$

and applying the above redefinition to the vertex equation (3), we obtain the famous Hirota equation [H], while the face equations are commonly known as doubly discrete sine-Gordon equations (see e.g. [BKP,BBR,FV2]). Without such a redefinition along the diagonals, but still with the special potential given in (13), the above model is related to the doubly discrete mKdV model [CN]. Due to its relation to the sine-Gordon model (see e.g. [IK]), the torus projected vertex, edge and face equations with the potential (13) will be called of sine-Gordon type. In the next sections we will study evolutions of sine-Gordon type.
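The local character of the edge evolution (8), (9) with the potential (13) can be illustrated by a small numerical sketch. The function names `v_prime` and `edge_step` are hypothetical and not taken from the paper; the sketch only assumes real $x$ and real $k \in [0,1]$, for which the argument of the logarithm in (13) has modulus one, so that $V'$ is real. It shows that one local update leaves the face variable $p_{t,k-1} = x_{t,k-1} - x_{t,k}$ unchanged, in agreement with the note in Sect. 2.3.

```python
import cmath

def v_prime(x, k):
    """Potential derivative as in (13): V'(x) = -i ln((1 + k e^{ix}) / (k + e^{ix}))."""
    z = cmath.exp(1j * x)
    value = -1j * cmath.log((1 + k * z) / (k + z))
    # For real x and real k the argument of the log lies on the unit circle,
    # so the value is real up to rounding; keep only the real part.
    return value.real

def edge_step(x_left, x_right, k):
    """One local update (8), (9): x_left = x_{t,k-1}, x_right = x_{t,k}.

    Returns (x_{t+1,k-1}, x_{t+1,k}).
    """
    shift = v_prime(x_left - x_right, k)
    return shift + x_left, shift + x_right

x_left, x_right = 0.9, 0.2
new_left, new_right = edge_step(x_left, x_right, k=0.3)
# Both edges receive the same increment, so their difference is preserved.
print(new_left - new_right, x_left - x_right)
```

Both updated edges receive the same increment $V'(x_{t,k-1} - x_{t,k})$, which is why the difference is preserved; the principal branch of the logarithm used here is one possible choice and is an assumption of the sketch.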
3. Quantization of the Models

3.1. General outline. Fix an initial Cauchy path, e.g. $C_{2T}$ (Fig. 1). Our quantization scheme follows the common procedure of substituting the canonical variables $I_{2T} := \{(e^{ig_{2T-1,2k+1}})_{k\in\{-1,\dots,p-1\}}, (e^{ig_{2T,2k}})_{k\in\{0,\dots,p\}}\}$, viewed as functions on phase space, by unitary operators $I^G_{2T} := \{(G_{2T-1,2k+1})_{k\in\{-1,\dots,p-1\}}, (G_{2T,2k})_{k\in\{0,\dots,p\}}\} \subset U(\mathcal{H})$ obeying Weyl commutation relations. That is, we search for a bijection $Q : \mathcal{F}(\mathcal{P}) \to U(\mathcal{H})$, $Q(e^{ig_{t,k}}) =: G_{t,k}$, with the properties

$$ Q(\text{const}) = \text{const}\,\mathbb{1}, \qquad Q(e^{ig_{t,k}} e^{ig_{\tau,j}}) = Q(e^{ig_{t,k}})\,Q(e^{ig_{\tau,j}})\,e^{i\,\text{phase}}, $$

such that

$$
\begin{aligned}
[G_{t,i}, G_{t,k}] &= 0 && \text{if } i-k \text{ even},\\
G_{t,i}\, G_{t\pm1,k} &= q^{-\frac{m}{2}}\, G_{t\pm1,k}\, G_{t,i} && \text{if } i-k \text{ odd},\ i<k,\ |i-k|<2p,\\
[G_{t,i}, M^{(k)}] &= 0 && (i-k \text{ odd}),\\
G_{t,i}\, M^{(k)} &= q^{-m} M^{(k)} G_{t,i} && (i-k \text{ even}),\\
[M^{(1)}, M^{(2)}] &= 0,
\end{aligned}
\tag{15}
$$

where the commutator is defined as usual, $[A, B] := AB - BA$, $q \in S^1 \subset \mathbb{C}$, $m \in \mathbb{N}$ and

$$ M^{(1)} := G_{2T,2p}\, G^{-1}_{2T,0}, \qquad M^{(2)} := G_{2T-1,2p-1}\, G^{-1}_{2T-1,-1}, \qquad G_{t,k+2p} := M^{(i)} G_{t,k}. $$

$\mathbb{1} = (G_{t,k})^0$ is the identity in $U(\mathcal{H})$ and the product in $U(\mathcal{H})$ is given by the composition of operators. For our purpose $q$ will always be chosen as a root of unity, i.e. $q = e^{2\pi i/N} \in S^1 \subset \mathbb{C}$, $N \in \mathbb{N}$. Let $\mathcal{A}(G_{2T})$ be the algebra of Laurent polynomials in the generators $I^G_{2T} = \{(G_{2T-1,2k+1})_{k\in\{-1,\dots,p-1\}}, (G_{2T,2k})_{k\in\{0,\dots,p\}}\}$ ($T \in \mathbb{Z}$ fixed).

Note that, up until now, the “quantization” map $Q$ was only defined for the initial canonical variables $e^{ig_{t,k}}$, $t \in \{2T-1, 2T\}$, and, modulo a phase factor, on products of these. It was not specified for other functions on phase space $\mathcal{P}$, such as the time one evolved variables $g_{2T+1,2k-1} = g_{2T+1,2k-1}(g_{2T,2k-2}, g_{2T,2k}, g_{2T-1,2k-1})$. Our goal is to implicitly define a quantization for these functions by defining an automorphism $E_{t,k-1} : \mathcal{A}(G_{2T}) \to \mathcal{A}(G_{2T})$ such that

$$ Q(e^{ig_{t+1,k-1}}) := E_{t,k-1}(G_{t-1,k-1}) = E_{t,k-1}(Q(e^{ig_{t-1,k-1}})), $$

where $e^{ig_{t+1,k-1}}$ is given by the classical evolution. The automorphism $E_{t,k-1}$ will be closely adapted to our specific model. We will not discuss whether (or how) it would be possible to find such an automorphism in general, nor will we be concerned with a discussion of the above quantization procedure with respect to completeness (when e.g. extending by linearity), uniqueness, connection to other quantizations, etc.
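For orientation, a pair of unitary operators obeying a Weyl-type relation at a root of unity can be realized by finite matrices. The following minimal sketch uses the standard $N \times N$ clock and shift matrices; this realization and the names `clock_and_shift`, `matmul`, `mat_power` are illustrative assumptions and do not appear in the paper. It checks $UV = qVU$ with $q = e^{2\pi i/N}$ and that the $N$th powers are the identity, i.e. central, which is the mechanism behind the central elements used below.

```python
import cmath

def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def mat_power(a, n):
    """n-th power of a square matrix by repeated multiplication."""
    size = len(a)
    result = [[1 if i == j else 0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, a)
    return result

def max_diff(a, b):
    """Largest entrywise distance between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def clock_and_shift(N):
    """Clock matrix U = diag(1, q, ..., q^{N-1}) and cyclic shift V (e_j -> e_{j+1})."""
    q = cmath.exp(2j * cmath.pi / N)
    U = [[q ** i if i == j else 0 for j in range(N)] for i in range(N)]
    V = [[1 if i == (j + 1) % N else 0 for j in range(N)] for i in range(N)]
    return q, U, V

N = 5
q, U, V = clock_and_shift(N)
UV = matmul(U, V)
qVU = [[q * entry for entry in row] for row in matmul(V, U)]
identity = [[1 if i == j else 0 for j in range(N)] for i in range(N)]
print(max_diff(UV, qVU), max_diff(mat_power(U, N), identity))
```

Relations with other powers of $q$, such as $q^{-m/2}$ or $q^{-m}$ in (15), would be obtained from suitable powers of $U$ and $V$; that mapping is not spelled out here.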
Let us proceed with an explicit construction of $E_{t,k-1}$. In accordance with the classical definition (5) define edge operators ($k - t$ even):

$$ X_{t,k} := G_{t,k}\, G_{t-1,k-1}, \tag{16} $$

$$ X_{t,k-1} := G_{t-1,k-1}\, G_{t,k-2}, \tag{17} $$

$$ \Rightarrow\ M_X := X_{t,2p}\, X^{-1}_{t,0} = q^{-m} M^{(1)} M^{(2)}. \tag{18} $$

Let $\mathcal{A}(X_{2T})$ be the algebra of Laurent polynomials in the generators $I^X_{2T} = (X_{2T,k})_{k\in\{0,\dots,2p\}}$. Define face operators

$$ P_{t,k-1} := q^{-\frac{m}{2}}\, G^{-1}_{t,k}\, G_{t,k-2}. \tag{19} $$

Note that $P_{t,k-1} = X^{-1}_{t,k} X_{t,k-1} = X^{-1}_{t+1,k} X_{t+1,k-1}$. Let $\mathcal{A}(P_{2T})$ be the algebra of Laurent polynomials in the generators $I^P_{2T} = \{(P_{2T-1,2k})_{k\in\{0,\dots,p-1\}}, (P_{2T,2k+1})_{k\in\{0,\dots,p-1\}}\}$. Clearly the above construction can also be done for the odd time Cauchy path $C_{2T-1}$.
3.2. Almost Hamiltonian quantum evolution. Let $R_k(P_{t,n-1})$ be a nonvanishing Laurent polynomial in the face operator $P_{t,n-1} := q^{-\frac{m}{2}} G^{-1}_{t,n} G_{t,n-2}$, i.e. $R_k(P_{t,n-1}) \in \mathcal{A}(G_t)$, which depends on a parameter $k \in [0, 1)$. Let $e^{i\xi_k(P^B_{t,n-1})} \in S^1 \subset \mathbb{C}$ be a number which depends on a central element $P^B_{t,n-1} \in \mathcal{A}(P_T) \subset \mathcal{A}(G_T)$ and the same parameter $k \in [0, 1)$, and $B \in \mathbb{N}$. Define recursively ($n - t$ even)

$$
\begin{aligned}
G_{t+1,n-1} := E_{t,n-1}(G_{t-1,n-1}) &:= R_k(P_{t,n-1})\, G_{t-1,n-1}\, R_k(P_{t,n-1})^{-1}\, e^{i\xi_k(P^B_{t,n-1})} \\
&= \frac{R_k(P_{t,n-1})}{R_k(q^m P_{t,n-1})}\, G_{t-1,n-1}\, e^{i\xi_k(P^B_{t,n-1})},
\end{aligned}
\tag{20}
$$

$$ E_{t,n-1}(G_{\tilde t,j}) := G_{\tilde t,j} \quad \text{for } j \neq n-1 \bmod 2p;\ \tilde t \in \{t, t-1\}, \tag{21} $$
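i.e. the automorphisms $E_{t,n-1}$ act nontrivially only on operators which are associated to the $(t, n-1)$th face of the corresponding Cauchy zig zag $C_t$.

[Figure: the automorphisms $E_{t,n-1}$ and $E_{t,n+1}$ map $G_{t-1,n-1}$ and $G_{t-1,n+1}$ on the Cauchy path $C$ to $G_{t+1,n-1}$ and $G_{t+1,n+1}$.]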
An automorphism defined by conjugation with an element of the algebra and multiplication with a phase factor will be called almost inner. Define

$$ E_{2t} := \prod_{n=0}^{p-1} E_{2t,2n+1}, \qquad E_{2t-1} := \prod_{n=0}^{p-1} E_{2t-1,2n}, $$

which is well defined since the corresponding automorphisms $E_{t,n-1}$ commute. $E_t$ evolves all operators associated with a Cauchy path $C_t$ one time step further and, by the definition of the evolution automorphism, $E_t(G_{t-1,n-1}) = E_{t,n-1}(G_{t-1,n-1})$. Using the commutation relations in (15) and the periodicity of the face operators, it follows immediately that the above automorphisms preserve the monodromies. We will call such automorphisms almost Hamiltonian evolution automorphisms. The induced evolution on the subalgebras $\mathcal{A}(P_T) \subset \mathcal{A}(X_T) \subset \mathcal{A}(G_T)$ (Fig. 1) reads:
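$$
\begin{aligned}
X_{t+1,n} := G_{t,n}\, G_{t+1,n-1} &= G_{t,n}\, R_k(P_{t,n-1})\, G_{t-1,n-1}\, R_k(P_{t,n-1})^{-1}\, e^{i\xi_k(P^B_{t,n-1})} \\
&= R_k(P_{t,n-1})\, X_{t,n}\, R_k(P_{t,n-1})^{-1}\, e^{i\xi_k(P^B_{t,n-1})} \\
&= \frac{R_k(P_{t,n-1})}{R_k(q^m P_{t,n-1})}\, X_{t,n}\, e^{i\xi_k(P^B_{t,n-1})},
\end{aligned}
\tag{22}
$$

$$
\begin{aligned}
X_{t+1,n-1} := G_{t+1,n-1}\, G_{t,n-2} &= R_k(P_{t,n-1})\, G_{t-1,n-1}\, R_k(P_{t,n-1})^{-1}\, G_{t,n-2}\, e^{i\xi_k(P^B_{t,n-1})} \\
&= R_k(P_{t,n-1})\, X_{t,n-1}\, R_k(P_{t,n-1})^{-1}\, e^{i\xi_k(P^B_{t,n-1})} \\
&= \frac{R_k(P_{t,n-1})}{R_k(q^m P_{t,n-1})}\, X_{t,n-1}\, e^{i\xi_k(P^B_{t,n-1})},
\end{aligned}
\tag{23}
$$

$$
\begin{aligned}
P_{t+1,n} := q^{-\frac{m}{2}}\, G^{-1}_{t+1,n+1}\, G_{t+1,n-1}
&= q^{-\frac{m}{2}}\, R_k(P_{t,n+1})\, G^{-1}_{t-1,n+1}\, R_k(P_{t,n+1})^{-1}\, e^{-i\xi_k(P^B_{t,n+1})} \\
&\qquad \times R_k(P_{t,n-1})\, G_{t-1,n-1}\, R_k(P_{t,n-1})^{-1}\, e^{i\xi_k(P^B_{t,n-1})}
\end{aligned}
\tag{24}
$$

$$
\begin{aligned}
&= \frac{R_k(P_{t,n-1})}{R_k(q^m P_{t,n-1})}\, \frac{R_k(P_{t,n+1})}{R_k(q^{-m} P_{t,n+1})}\, P_{t-1,n}\, e^{i\xi_k(P^B_{t,n-1})}\, e^{-i\xi_k(P^B_{t,n+1})} \\
&=: E_{t,n+1} E_{t,n-1}(P_{t-1,n}).
\end{aligned}
\tag{25}
$$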
Now $I^P_{2T} = \{(P_{2T-1,2k})_{k\in\{0,\dots,p-1\}}, (P_{2T,2k+1})_{k\in\{0,\dots,p-1\}}\}$ (of course one could also take $I^P_{2T-1}$) is an initial configuration of unitary periodic (face) operators, i.e. $P_{t,2p+n} = P_{t,n} \in \mathcal{A}(P_T)$, $n \in \mathbb{Z}$. These operators commute as:

$$
\begin{aligned}
[P_{t,n}, P_{t,j}] &= 0, && n \neq j,\\
[P_{t\pm1,n}, P_{t,n+j}] &= 0 && \text{for } j \in \{2,\dots,2p-2\} \bmod 2p,\\
P_{t,n}\, P_{t\pm1,n+1} &= q^{-m}\, P_{t\pm1,n+1}\, P_{t,n}.
\end{aligned}
\tag{26}
$$
Consider the automorphism $E_{t,n+1}E_{t,n-1} : \mathcal{A}(P_T) \to \mathcal{A}(P_T)$ which was recursively defined by

$$ E_{t,n+1}E_{t,n-1}(P_{t-1,n}) := R_k(P_{t,n+1})\, R_k(P_{t,n-1})\, P_{t-1,n}\, R_k(P_{t,n-1})^{-1}\, R_k(P_{t,n+1})^{-1}\, e^{i\xi_k(P^B_{t,n-1})}\, e^{-i\xi_k(P^B_{t,n+1})}, \tag{27} $$

and $E_{t,n+1}E_{t,n-1}$ acting trivially on all other faces along the corresponding Cauchy zig zag $C_t$. Since $P_{t+1,n} = E_{t,n+1}E_{t,n-1}(P_{t-1,n})$, it follows by definition (27) that

$$
\begin{aligned}
R_k(P_{t+1,n})
&= R_k\big(R_k(P_{t,n+1})\, R_k(P_{t,n-1})\, P_{t-1,n}\, R_k(P_{t,n-1})^{-1}\, R_k(P_{t,n+1})^{-1}\, e^{i\xi_k(P^B_{t,n-1})}\, e^{-i\xi_k(P^B_{t,n+1})}\big) \\
&= R_k(P_{t,n+1})\, R_k(P_{t,n-1})\, R_k\big(e^{i\xi_k(P^B_{t,n-1})}\, e^{-i\xi_k(P^B_{t,n+1})}\, P_{t-1,n}\big)\, R_k(P_{t,n-1})^{-1}\, R_k(P_{t,n+1})^{-1}.
\end{aligned}
\tag{28}
$$

Define the operators ($t \in \mathbb{Z}$)

$$ R_{2t-1} := \prod_{n=0}^{p-1} R_k(P_{2t-1,2n}), \qquad R_{2t} := \prod_{n=0}^{p-1} R_k(P_{2t,2n+1}). $$
Proposition 3.1. The operators $R_t \in \mathcal{A}(P)$ defined as above evolve as:

$$ R_{2t+1} = R_{2t} \prod_{n=0}^{p-1} R_k\big(e^{i\xi_k(P^B_{2t,2n-1})}\, e^{-i\xi_k(P^B_{2t,2n+1})}\, P_{2t-1,2n}\big)\, R_{2t}^{-1}, $$

$$ R_{2t} = R_{2t-1} \prod_{n=0}^{p-1} R_k\big(e^{i\xi_k(P^B_{2t-1,2n})}\, e^{-i\xi_k(P^B_{2t-1,2n+2})}\, P_{2t-2,2n+1}\big)\, R_{2t-1}^{-1}. $$

Proof. Using the commutation relations in (26) and (28) and the periodicity of the face operators, the proof is straightforward. □
Corollary 3.2. If $e^{i\xi_k(P^B_{t,n-1})}\, e^{-i\xi_k(P^B_{t,n+1})} = \mathbb{1}$ for all $t, n \in \mathbb{Z}$ ($n - t$ even), then $R_{t+1} R_t = R_t R_{t-1}$ is constant.

Since in this case the evolution for the face variables is given by conjugation,

$$ P_{t+1,n} = R_t R_{t-1}\, P_{t-1,n}\, R_{t-1}^{-1} R_t^{-1}, $$

the operator $R_t R_{t-1}$ can be viewed as the discrete analog of a continuous Hamiltonian time evolution, i.e.

$$ R_t R_{t-1} \leftrightarrow e^{iH\Delta t_0}, \qquad \Delta t_0 \text{ fixed}. $$

This is why we called the automorphism constructed in (20), (21) almost Hamiltonian.

4. The Quantum Discrete sine-Gordon Model

4.1. Constructing the evolution automorphism.

Theorem 4.1. Let $\hat x$ be an element of a $*$-algebra over $\mathbb{C}$ such that $\hat x^{-1} = \hat x^*$ and $\hat x^B := \underbrace{\hat x \cdots \hat x}_{B \text{ times}}$ is a multiple of the identity element $\mathbb{1}$ in the algebra, i.e. $\hat x^B = x^B \mathbb{1}$ with $x^B \in S^1 \subset \mathbb{C}$. Let $k \in \mathbb{C}$, $B = \frac{N}{\gcd(m,N)}$, $q = e^{2\pi i/N}$. Choose any root

$$ e^{i\xi_k} = e^{i\xi_k(\hat x^B)} := \left(\frac{1 + \bar k^B x^B (-1)^{B-1}}{k^B + x^B (-1)^{B-1}}\right)^{\frac{1}{B}} \mathbb{1} \in S^1\, \mathbb{1}. \tag{29} $$

Define $R_k(\hat x) := \sum_{j=0}^{B-1} l_j \hat x^j$ with

$$ l_j := \prod_{n=1}^{j} \frac{e^{i\xi_k} q^{m(n-1)} - \bar k}{1 - e^{i\xi_k} k\, q^{mn}}, $$

then $R_k(\hat x)$ satisfies the equation

$$ R_k(\hat x) = \frac{k + \hat x}{1 + \bar k \hat x}\, R_k(q^m \hat x)\, e^{i\xi_k}. \tag{30} $$
Proof. By straightforward computation. □

We call $R_k(\hat x)$ a “dynamical R-matrix”. The above theorem is a generalization of the results found in [V] and [FZ], where it was assumed that $\hat x^B = \mathbb{1}$. As it turns out, generalizing to general central elements $\hat x^B$ is important for obtaining an evolution with nontrivial classical background as first described in [BBR], where a solution of Eq. (30) was indicated for the case $m = 2$, $N$ odd, $k \in \mathbb{R}$. Solutions to Eq. (30) when $q^m$ is not a root of unity can also be found in [BBR]. For our purpose $k \in [0, 1)$ from now on.

Unfortunately, despite the suggestive notation, the numbers $e^{i\xi_k(\hat x^B)}$ depend not only on the Casimirs $\hat x^B$ and the real constant $k \in [0, 1)$, but also on the choice of a $B$th root. Clearly, once a choice is made (for all times $t$) one can use the operators $R_k(\hat x)$ together with the chosen roots (now viewed as functions in the central elements) for defining an almost Hamiltonian quantum evolution for the sine-Gordon model as in (20). However, fixing the roots $e^{i\xi_k(\hat x^B)}$ for all time contradicts the idea of defining an evolution by a local process. In the following, we want to show that, given an initial choice of roots, it is possible to define a unique evolution for the above roots. In order to accomplish this task, we will extend the algebra $\mathcal{A}(X_T)$ by the central elements $e^{i\xi_k(P_T^B)}$.

4.2. Light cone shifts. The doubly discrete sine-Gordon equation, as well as the above described equations of sine-Gordon type (see e.g. (14)), are, as in the continuous case, invariant under light cone shifts, i.e. if $g_{t,k} : \mathbb{Z}^2 \to \mathbb{R}$ parametrizes a solution to (3) then $g_{t\pm n,k\pm n}$ also gives a solution. In this sense space time shifts can be lifted to automorphisms on covariant phase space and can be interpreted as symplectomorphisms [EK]. In the previous section, we found a quantization of another (yet trivial) symplectomorphism ([AM,GS]) on phase space, namely time evolution.
It is now natural to find quantized analogs of the above mentioned light cone shifts. This will be done by defining quantized space translations for half the lattice spacing distance, and then applying the time automorphism. Since translations of half the lattice spacing distance are hard to define on the vertex operators, as one would have to go over to the dual lattice, we restrict ourselves to the edge algebra $\mathcal{A}(X_T)$. It will be shown that space translations, or space shifts, as automorphisms on the edge algebra $\mathcal{A}(X_T)$ are almost inner automorphisms. For constructing the space shifts, we will follow an idea developed in [FV3], where such space shifts were suggested for the case of a special choice of vertex monodromies. It will turn out that the treatment of the more general case gives us a natural way to fix the roots $e^{i\xi_k(P^B_{t,k-1})}$.
Lemma 4.2. The quotient of the two vertex monodromies −1 −1 −1 P2t,2p−3 . . . P2t,1 M (1) (M (2) )−1 = P2t±1,0 P2t±1,2 . . . P2t±1,2p−2 P2t,2p−1 −2 2 2 2 = q −2mp+m X2t,0 X2t,1 X2t,2 . . . X2t,2p−1 MX −2 2 2 2 = q −2mp+m X2t+1,0 X2t+1,1 X2t+1,2 . . . X2t+1,2p−1 MX
is a central element in $\mathcal{A}(X_T)$.

Demand 4.3. For establishing quantum space time shifts, demand that

a) the central elements

$$ M^{(1)}(M^{(2)})^{-1} = P_{2t-1,0}\, P_{2t-1,2} \cdots P_{2t-1,2p-2}\; P^{-1}_{2t,2p-1}\, P^{-1}_{2t,2p-3} \cdots P^{-1}_{2t,1} $$

and $P^B_{t,k-1}$ ($B$ as in Theorem 4.1) shall be multiples of the identity within $\mathcal{A}(X_T)$;

b) the roots $e^{i\xi_0(P^B_{t,k-1})} = (P^B_{t,k-1}(-1)^{B-1})^{-\frac{1}{B}}$ (see (29)) shall be fixed in such a way that

$$ \prod_{k=0}^{p-1} \big(P^B_{2t-1,2k}(-1)^{B-1}\big)^{\frac{1}{B}} \prod_{k=0}^{p-1} \big(P^B_{2t,2k-1}(-1)^{B-1}\big)^{-\frac{1}{B}} = \prod_{k=0}^{p-1} P_{2t-1,2k} \prod_{k=0}^{p-1} P^{-1}_{2t,2k-1}. $$
By the definition of an almost Hamiltonian quantum evolution, demand a) holds automatically for all times $t$ if it is true at an initial time $T$. The same is valid for demand b), which will soon become clear. For this reason, the index $t$ in the above demands refers to all times $t$. We are now in the position to define space shifts via products of dynamical R-matrices. Observe that the product is over $2p - 1$ faces. Figure 2 shows the involved faces (labeling in accordance with Fig. 1 for $t$ odd). Define

$$
\begin{aligned}
S_t^{-1} &:= \overleftarrow{\prod_{k=2}^{2p}} R_0(X^{-1}_{t,k} X_{t,k-1}) \\
&= R_0(X^{-1}_{t,2p} X_{t,2p-1})\, R_0(X^{-1}_{t,2p-1} X_{t,2p-2}) \cdots R_0(X^{-1}_{t,2} X_{t,1}).
\end{aligned}
\tag{31}
$$

[Figure 2: the faces $P_{t-1,1}$, $P_{t,2}$, $P_{t-1,3}$, $P_{t,4}$, ..., $P_{t-1,2p-1}$ and the edges $X_{t,1}$, $X_{t,2}$, ..., $X_{t,2p}$ involved in $S_t^{-1}$.]
Proposition 4.4. For all $k \in \mathbb{Z}$,

$$ S_t^{-1} X_{t,k} S_t = q^{-m}\, e^{i\xi_0((X^{-1}_{t,k} X_{t,k-1})^B)}\, X_{t,k-1}, $$

where $e^{i\xi_0(x^B)} = (x^B(-1)^{B-1})^{-\frac{1}{B}}$ as in (29).
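Proof. For the proof, everything will be carried out at a fixed time and therefore we suppress the time index. If $n \in \{2,\dots,2p\}$ then by the commutation rules of the edge variables

$$
\begin{aligned}
S_t^{-1} X_{t,n} S_t = S^{-1} X_n S
&= \overleftarrow{\prod_{k=n}^{2p}} R_0(X_k^{-1} X_{k-1})\; X_n\; \overrightarrow{\prod_{k=n}^{2p}} R_0(X_k^{-1} X_{k-1})^{-1} \\
&= \overleftarrow{\prod_{k=n+1}^{2p}} R_0(X_k^{-1} X_{k-1})\; \frac{R_0(X_n^{-1} X_{n-1})}{R_0(q^m X_n^{-1} X_{n-1})}\; X_n\; \overrightarrow{\prod_{k=n+1}^{2p}} R_0(X_k^{-1} X_{k-1})^{-1} \\
&= \overleftarrow{\prod_{k=n+1}^{2p}} R_0(X_k^{-1} X_{k-1})\; X_n^{-1} X_{n-1} X_n\; \overrightarrow{\prod_{k=n+1}^{2p}} R_0(X_k^{-1} X_{k-1})^{-1}\; e^{i\xi_0((X_n^{-1} X_{n-1})^B)} \\
&= q^{-m} \overleftarrow{\prod_{k=n+1}^{2p}} R_0(X_k^{-1} X_{k-1})\; X_{n-1}\; \overrightarrow{\prod_{k=n+1}^{2p}} R_0(X_k^{-1} X_{k-1})^{-1}\; e^{i\xi_0((X_n^{-1} X_{n-1})^B)} \\
&= q^{-m}\, e^{i\xi_0((X_n^{-1} X_{n-1})^B)}\, X_{n-1}.
\end{aligned}
$$

Analogously, one obtains

$$
\begin{aligned}
S^{-1} X_1 S
&= \overleftarrow{\prod_{k=3}^{2p}} R_0(X_k^{-1} X_{k-1})\; X_2^{-1} X_1^2\; \overrightarrow{\prod_{k=3}^{2p}} R_0(X_k^{-1} X_{k-1})^{-1}\; e^{i\xi_0((X_2^{-1} X_1)^B)} \\
&= \overleftarrow{\prod_{k=4}^{2p}} R_0(X_k^{-1} X_{k-1})\; \frac{R_0(X_3^{-1} X_2)}{R_0(q^{-m} X_3^{-1} X_2)}\; X_2^{-1} X_1^2\; \overrightarrow{\prod_{k=4}^{2p}} R_0(X_k^{-1} X_{k-1})^{-1}\; e^{i\xi_0((X_2^{-1} X_1)^B)} \\
&= \overleftarrow{\prod_{k=4}^{2p}} R_0(X_k^{-1} X_{k-1})\; q^{2m} X_3 X_2^{-2} X_1^2\; \overrightarrow{\prod_{k=4}^{2p}} R_0(X_k^{-1} X_{k-1})^{-1}\; e^{-i\xi_0((X_3^{-1} X_2)^B) + i\xi_0((X_2^{-1} X_1)^B)} \\
&= q^{2m(p-1)}\, X_{2p}^{-1} \prod_{k=1}^{2p-1} X_k^{2(-1)^{k+1}} \prod_{k=1}^{2p} e^{(-1)^k i\xi_0((X_k^{-1} X_{k-1})^B)}\; e^{i\xi_0((X_1^{-1} X_0)^B)} \\
&\overset{b)}{=} q^{-m}\, e^{i\xi_0((X_1^{-1} X_0)^B)}\, X_0,
\end{aligned}
\tag{32}
$$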
if we suppose that demand b) holds. With $S^{-1} M_X S = M_X$ the assertion follows. □

It is easy to show that

$$ (S_t^{(l)})^{-1} X_{t,n} S_t^{(l)} = q^{-m}\, e^{i\xi_0((X^{-1}_{t,n} X_{t,n-1})^B)}\, X_{t,n-1}, $$

where

$$ (S_t^{(l)})^{-1} := \overleftarrow{\prod_{k=2+l}^{2p+l}} R_0(X^{-1}_{t,k} X_{t,k-1}). $$
Therefore the index $l$ is irrelevant and will be dropped, also if $(S_t^{(l)})^{-1}$ instead of $(S_t^{(0)})^{-1}$ is used. Clearly, one can also define an automorphism of the above kind by redefining

$$ \hat S_t^{-1} := \overleftarrow{\prod_{k=2+l}^{2p+l}} R_0(\alpha X^{-1}_{t,k} X_{t,k-1}), \qquad \alpha \in S^1, $$

which will act as

$$ \hat S_t^{-1} X_{t,n} \hat S_t = \alpha\, q^{-m}\, e^{i\xi_0((X^{-1}_{t,n} X_{t,n-1})^B)}\, X_{t,n-1}, $$

a fact we will use later. The automorphism $\mathbf{S}_t^{-1} : \mathcal{A}(X_T) \to \mathcal{A}(X_T)$ defined on the operators $X_{t,n}$ as

$$ \mathbf{S}_t^{-1}(X_{t,n}) := q^{m}\, e^{-i\xi_0((X^{-1}_{t,n} X_{t,n-1})^B)}\, S_t^{-1} X_{t,n} S_t = X_{t,n-1} $$

can be interpreted as a shift on the edge operators in a space direction. The picture of the action of $\mathbf{S}_t^{-1}$ is a little different for an automorphism on $\mathcal{A}(P_T)$; we find

$$ S_t^{-1} P_{t,k-1} S_t = P_{t-1,k-2}\, e^{-i\xi_0(P^B_{t,k-1}) + i\xi_0(P^B_{t-1,k-2})}, $$

$$ S_t^{-1} P_{t-1,k-2} S_t = P_{t,k-3}\, e^{-i\xi_0(P^B_{t-1,k-2}) + i\xi_0(P^B_{t,k-3})}. $$

Hence applying $S_t$ to face operators with the same time index $t$ results in a shift down in the light cone direction, and applying $S_t$ to face operators with time index $t - 1$ results in a shift up in the light cone direction. Hence $S_t$ is an up-and-down shift in light cone directions rather than a shift in space.

[Figures 2.1 and 2.2: action of $S_T$ and $S_{T+1}$ on the face operators $P_{T-1,k}$, $P_{T,k\pm1}$ and $P_{T+1,k}$, together with the time evolution $E_t$.]
Fix an initial time, for example $T$ odd. Then, by definition, the operators

$$ S_T^{-1} = R_0(P_{T-1,2p-1})\, R_0(P_{T,2p-2}) \cdots R_0(P_{T-1,1}), $$

$$ S_{T+1}^{-1} = R_0(P_{T+1,2p-1})\, R_0(P_{T,2p-2}) \cdots R_0(P_{T+1,1}) $$

are given by the choice of initial operators $(P_{T-1,2k+1})_{k\in\{0,\dots,p-1\}}$, $(P_{T,2k})_{k\in\{0,\dots,p-1\}}$ and the time evolved face operators $(P_{T+1,2k+1})_{k\in\{0,\dots,p-1\}}$. The operator $P_{T+1,2k+1}$ can now be obtained by shifting the operator $P_{T,2k}$ or by applying the time evolution to $P_{T-1,2k+1}$, i.e. the shifts depicted in Fig. 2.2 must commute. By the commutation relations of the face operators, it is a straightforward result that:
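Lemma 4.5.

$$
\begin{aligned}
& R_0(P_{T+1,2p-1})\, R_0(P_{T,2p-2}) \cdots R_0(P_{T+1,1}) \\
&\quad = R_T\, R_0\big(e^{-i\xi_0(P^B_{T,2p}) + i\xi_0(P^B_{T,2p-2})} P_{T-1,2p-1}\big)\, R_0(P_{T,2p-2}) \cdots R_0\big(e^{-i\xi_0(P^B_{T,2}) + i\xi_0(P^B_{T,0})} P_{T-1,1}\big)\, R_T^{-1}.
\end{aligned}
$$

Hence

$$
\begin{aligned}
S_{T+1}^{-1} P_{T,2n} S_{T+1}
&\overset{!}{=} P_{T+1,2n-1}\, e^{-i\xi_0(P^B_{T,2n}) + i\xi_0(P^B_{T+1,2n-1})} \\
&\overset{4.5}{=} R_T\, R_0\big(e^{-i\xi_0(P^B_{T,2p}) + i\xi_0(P^B_{T,2p-2})} P_{T-1,2p-1}\big)\, R_0(P_{T,2p-2}) \cdots R_0\big(e^{-i\xi_0(P^B_{T,2}) + i\xi_0(P^B_{T,0})} P_{T-1,1}\big)\; P_{T,2n} \\
&\qquad \times \Big(R_T\, R_0\big(e^{-i\xi_0(P^B_{T,2p}) + i\xi_0(P^B_{T,2p-2})} P_{T-1,2p-1}\big)\, R_0(P_{T,2p-2}) \cdots R_0\big(e^{-i\xi_0(P^B_{T,2}) + i\xi_0(P^B_{T,0})} P_{T-1,1}\big)\Big)^{-1} \\
&= R_T\, P_{T-1,2n-1}\, R_T^{-1}\; e^{-i\xi_0(P^B_{T,2n}) + i\xi_0\big((e^{-i\xi_k(P^B_{T,2n}) + i\xi_k(P^B_{T,2n-2})} P_{T-1,2n-1})^B\big)}\; e^{-i\xi_k(P^B_{T,2n}) + i\xi_k(P^B_{T,2n-2})} \\
&= P_{T+1,2n-1}\; e^{-i\xi_0(P^B_{T,2n}) + i\xi_0\big((e^{-i\xi_k(P^B_{T,2n}) + i\xi_k(P^B_{T,2n-2})} P_{T-1,2n-1})^B\big)}.
\end{aligned}
$$

Comparing both sides of the equation, one finally obtains the compatibility condition:

$$ e^{i\xi_0(P^B_{T+1,2n-1})} \overset{!}{=} e^{-i\xi_k(P^B_{T,2n}) + i\xi_k(P^B_{T,2n-2})}\; e^{i\xi_0(P^B_{T-1,2n-1})}. \tag{33} $$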
For defining the evolution, we had already fixed the roots on the right-hand side of Eq. (33), so that Eq. (33) determines the roots one time step further for $k = 0$. A continuous deformation of $k$ then defines the roots $e^{i\xi_k(P^B_{t,k-1})}$ for all $k \in [0, 1)$. Hence, the evolution of the roots $e^{i\xi_k(P^B_{t,k-1})}$ has now also been defined by a local process. A direct consequence of Eq. (33) is that demand b) holds for all time, since the classical monodromies as well as their quantum counterparts are integrals of motion [K]. Note that if one takes the $B$th power of Eq. (33) then one obtains

$$ P^B_{T+1,2n-1} = \frac{k^B + P^B_{T,2n}(-1)^{B-1}}{1 + k^B P^B_{T,2n}(-1)^{B-1}} \cdot \frac{1 + k^B P^B_{T,2n-2}(-1)^{B-1}}{k^B + P^B_{T,2n-2}(-1)^{B-1}}\; P^B_{T-1,2n-1}. $$

Hence the “classical” variables $P^B_{t,k}$ also satisfy an equation of sine-Gordon type. This fact was first noted in [BBR] by using the commutation relations of the face operators and computing $P^B_{T+1,2k-1}$. It is now straightforward to show that, as an immediate consequence,

$$ \mathbf{S}_{T+1}(X_{T+1,k}) = E_T \circ \mathbf{S}_T \circ E_T^{-1}(X_{T+1,k}), $$

i.e. shifts in time and space direction commute if (33) is satisfied. The picture of the quantum evolution developed above looks considerably complicated. It simplifies greatly if one restricts oneself to the case of Corollary 3.2:
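Proposition 4.6. Let $\mathbf{S}_T$ be an automorphism on $\mathcal{A}(X_T)$ such that at an initial time $T$,

$$ R_T = \mathbf{S}_T(R_{T-1}) = \mathbf{S}_T^{-1}(R_{T-1}), \tag{34} $$

where recursively

$$ R_{t+1} = R_t R_{t-1} R_t^{-1}, \tag{35} $$

then $R_t = R_T\, \mathbf{S}_T(R_{t-1})\, R_T^{-1}$ for all $t \in \mathbb{Z}$, $t \geq T$.

Proof. By (34) and (35),

$$ R_T = R_T\, \mathbf{S}_T(R_{T-1})\, R_T^{-1}, \tag{36} $$

$$ R_{T+1} = R_T\, \mathbf{S}_T(R_T)\, R_T^{-1}. \tag{37} $$

For completing the induction argument, assume that the following is true:

$$ R_{T+n} = R_T\, \mathbf{S}_T(R_{T-1+n})\, R_T^{-1}, \tag{38} $$

$$ R_{T+1+n} = R_T\, \mathbf{S}_T(R_{T+n})\, R_T^{-1}. \tag{39} $$

Hence

$$
\begin{aligned}
R_{T+n+2} &\overset{(35)}{=} R_{T+n+1}\, R_{T+n}\, R_{T+n+1}^{-1} \\
&\overset{(38,39)}{=} R_T\, \mathbf{S}_T(R_{T+n})\, R_T^{-1}\; R_T\, \mathbf{S}_T(R_{T-1+n})\, R_T^{-1}\; \big(R_T\, \mathbf{S}_T(R_{T+n})\, R_T^{-1}\big)^{-1} \\
&\overset{(35)}{=} R_T\, \mathbf{S}_T(R_{T+n+1})\, R_T^{-1},
\end{aligned}
\tag{40}
$$

and analogously

$$ R_{T+n+3} \overset{(39,40)}{=} R_T\, \mathbf{S}_T(R_{T+n+2})\, R_T^{-1}. $$

□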
Proposition 4.7. If $\mathbf{S}_T$ is an automorphism such that at the initial time $T$,

$$ X_{T,k} = \mathbf{S}_T(X_{T,k-1}) = \mathbf{S}_T^{-1}(X_{T,k+1}) $$

and

$$ X_{t+1,k} = R_t X_{t,k} R_t^{-1} $$

for all $t \in \mathbb{N}$, and with $R_t$ as in Proposition 4.6, then $X_{t+1,k} = R_T\, \mathbf{S}_T(X_{t,k-1})\, R_T^{-1}$.

Proof. By induction as above and by the use of Proposition 4.6. □

The connection to models of statistical mechanics is now evident. We find

$$
\begin{aligned}
X_{t+2,k+1} &= R_T\, \mathbf{S}_T^{-1}\big(R_T\, \mathbf{S}_T(X_{t,k+1})\, R_T^{-1}\big)\, R_T^{-1} \\
&= R_T\, \mathbf{S}_T^{-1}(R_T)\, X_{t,k+1}\, \mathbf{S}_T^{-1}(R_T^{-1})\, R_T^{-1}.
\end{aligned}
\tag{41}
$$

Moreover, $R_T$ is a product of “local amplitudes” $R(P_{T,k-1})$ associated to the faces at time $T$ within the light cone lattice. Hence $\mathbf{S}_T^{-1}(R_T)$ is a product of “local amplitudes” $R(P_{t,k-1})$ associated to faces which are shifted in the light cone direction. Because of (41), this picture is the same all over the lattice, hence we can interpret $R_T$ as a kind of transfer matrix (though with complex weights). Another fortunate consequence is that for investigating the evolution it suffices to control the first time step; everything else is then obtained by applying the light cone shifts $E_T \circ \mathbf{S}_T$. This is especially important for the construction of integrals of motion, since finding an operator $H_T$ that commutes with the above light cone shifts automatically yields an integral of motion. In the next section an example of such a “static” quantum field theory will be discussed.
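5. Relations to the Massive Thirring Model

Choose an initial time $T$ odd. Let

$$ B = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad S = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad C = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}. \tag{42} $$

The notation is adopted from viewing $B$ as a “Boost” and $S$ as a “Shift” operator. Note that

$$ C S C^{-1} = -iBS, \qquad C BS C^{-1} = -iS. $$

$B$, $S$, $C$ are operators acting on the Hilbert space $\mathcal{H} = \mathbb{C}^2$. Define for any operator $A$ on $\mathcal{H}$ an operator $A_l$ on a “big” Hilbert space $\mathcal{H}_{2p} = \bigotimes_{k=0}^{2p-1} \mathbb{C}^2$ by

$$ A_l = \underbrace{\mathbb{1}}_{(2p-1)\text{th site}} \otimes \cdots \otimes \underbrace{A}_{l\text{th site}} \otimes \cdots \otimes \underbrace{\mathbb{1}}_{0\text{th site}}. $$

Let

$$ \mathbf{C} := \prod_{l=0}^{2p-1} C_l. $$

Define

$$ X_{T,2k} := S_{2k} \overleftarrow{\prod_{l=0}^{2k-1}} B_l = S_{2k} B_{2k-1} \cdots B_0, \tag{43} $$

$$ X_{T,2k+1} := -B_{2k+1} S_{2k+1} \overleftarrow{\prod_{l=0}^{2k}} B_l, \tag{44} $$

$$ M_X := \mathbb{1} \otimes \mathbb{1} \otimes \cdots \otimes \mathbb{1}, \tag{45} $$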
where $X_{T,0} = S_0$. Denote shortly $BS_{2k+1} := B_{2k+1} S_{2k+1}$. The above defined operators satisfy the commutation relations of the edge operators

$$ X_{T,k}\, X_{T,k+1} = q^{-m}\, X_{T,k+1}\, X_{T,k}, \qquad X_{T,k}\, M_X = q^{-2m}\, M_X\, X_{T,k} $$

for $q^{-m} = -1$. They are periodic and

$$ X^2_{T,2k} = \mathbb{1}, \qquad X^2_{T,2k+1} = -\mathbb{1}. \tag{46} $$
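The relations stated for the operators (43), (44) can be verified directly on a small lattice. The sketch below builds the edge operators for $p = 2$ (four sites, $16 \times 16$ integer matrices) with plain nested lists; the helper names `kron`, `matmul`, `at_site` and `edge_operators` are hypothetical, and the site ordering (site $0$ as the rightmost tensor factor) follows the convention given for $A_l$.

```python
I2 = [[1, 0], [0, 1]]
B = [[1, 0], [0, -1]]
S = [[0, 1], [1, 0]]

def matmul(a, b):
    """Plain matrix product of nested lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def at_site(op, site, n_sites):
    """op acting on `site`, identity elsewhere; site 0 is the rightmost factor."""
    out = [[1]]
    for s in range(n_sites - 1, -1, -1):
        out = kron(out, op if s == site else I2)
    return out

def edge_operators(p):
    """X_{T,0}, ..., X_{T,2p-1} as in (43), (44)."""
    n = 2 * p
    ops = []
    for k in range(n):
        local = S if k % 2 == 0 else scale(-1, matmul(B, S))
        x = at_site(local, k, n)
        for l in range(k - 1, -1, -1):  # string of B's on the sites below k
            x = matmul(x, at_site(B, l, n))
        ops.append(x)
    return ops

p = 2
X = edge_operators(p)
identity = at_site(I2, 0, 2 * p)
anticommute = all(
    matmul(X[j], X[k]) == scale(-1, matmul(X[k], X[j]))
    for j in range(2 * p) for k in range(j + 1, 2 * p)
)
squares = [matmul(x, x) == scale(1 if k % 2 == 0 else -1, identity) for k, x in enumerate(X)]
print(anticommute, all(squares))
```

All arithmetic is over the integers, so the comparisons are exact. For this small example every pair of distinct edge operators anticommutes, which includes the neighbour relation with $q^{-m} = -1$.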
The corresponding face operators are almost all bilocal in terms of the tensor product:

$$ P_{T,2k} := X^{-1}_{T,2k+1} X_{T,2k} = BS_{2k+1}\, BS_{2k} \quad \text{for } k \in \{0,\dots,p-1\}, $$

$$ P_{T-1,2k+1} := X^{-1}_{T,2k+2} X_{T,2k+1} = -S_{2k+2}\, S_{2k+1} \quad \text{for } k \in \{0,\dots,p-2\}, $$

$$ P_{T-1,2p-1} = BS_{2p-1} \prod_{l=1}^{2p-2} B_l\; BS_0, $$

and

$$ P^2_{T,2k} = P^2_{T-1,2k+1} = \mathbb{1}. $$
The difference of the vertex monodromies is

$$ P_{T-1,2p-1} \cdots P_{T-1,1}\, P^{-1}_{T,0} \cdots P^{-1}_{T,2p-2} = (-1)^{p-1}\, \mathbb{1}. $$

For defining the shift automorphism $\mathbf{S}_T$ we fix the roots $e^{i\xi_0((iP_{t,n})^B)}$ ($t \in \{T, T-1\}$, $n \in \{0,\dots,2p-1\}$) as:

$$ e^{i\xi_0((iP_{T,2n})^B)} = e^{i\xi_k((iP_{T,2n})^B)} = \left(\frac{1 + k^2 (iP_{T,2n})^2 (-1)}{k^2 + (iP_{T,2n})^2 (-1)}\right)^{\frac{1}{2}} := \mathbb{1}, \qquad n \in \{0,\dots,p-1\}, $$

and analogously

$$ e^{i\xi_k((iP_{T-1,2n+1})^B)} = \mathbb{1}, \qquad n \in \{0,\dots,p-1\}, $$

for all $k \in [0, 1)$. Note that with this definition

$$ P_{T-1,2p-1} \cdots P_{T-1,1}\, P^{-1}_{T,0} \cdots P^{-1}_{T,2p-2} = (-1)^{p-1}\, e^{-i\xi_0((iP_{T-1,2p-1})^B)} \cdots e^{-i\xi_0((iP_{T-1,1})^B)}\; e^{i\xi_0((iP_{T,0})^B)} \cdots e^{i\xi_0((iP_{T,2p-2})^B)}, $$

and hence we have to be careful when defining the shift automorphism (compare with (32)). Let

$$ b = -i\,\frac{1 - k^2}{1 + k^2}, \qquad c := \frac{2k}{1 + k^2}, $$

then using Theorem 4.1 and normalizing, the local amplitudes are a straightforward result:

$$ R_k(iP_{T,2n}) = \frac{i}{\sqrt{2(1-c)}}\, \big(b + (1 - c)\, P_{T,2n}\big). $$

They obey the functional equations:

$$ \frac{R_k(iP_{T,2n})}{R_k(-iP_{T,2n})} = \frac{k + iP_{T,2n}}{1 + k\, iP_{T,2n}}. $$

The evolution for the edge operators to the next time step is given by

$$ X_{T+1,n} = R_T X_{T,n} R_T^{-1}, \qquad \text{where } R_T = \prod_{l=0}^{p-1} R_k(iP_{T,2l}). $$
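Since $P^2_{T,2n} = \mathbb{1}$, the local amplitude and its functional equation can be checked on the two eigenvalues $P = \pm 1$ of a face operator. The following sketch does this with ordinary complex numbers; the function names are hypothetical, and the values of $b$ and $c$ are read off from the text as $b = -i(1-k^2)/(1+k^2)$ and $c = 2k/(1+k^2)$, which is an assumption about the garbled source formula that the check itself supports.

```python
import math

def amplitudes(k):
    """c and b as read from the text: c = 2k/(1+k^2), b = -i(1-k^2)/(1+k^2)."""
    c = 2 * k / (1 + k * k)
    b = -1j * (1 - k * k) / (1 + k * k)
    return c, b

def local_amplitude(P, k):
    """R_k(iP) evaluated on an eigenvalue P = +1 or -1 of the face operator."""
    c, b = amplitudes(k)
    return 1j / math.sqrt(2 * (1 - c)) * (b + (1 - c) * P)

def functional_equation_gap(P, k):
    """|R_k(iP)/R_k(-iP) - (k + iP)/(1 + k iP)| on an eigenvalue P."""
    lhs = local_amplitude(P, k) / local_amplitude(-P, k)
    rhs = (k + 1j * P) / (1 + k * 1j * P)
    return abs(lhs - rhs)

for k in (0.0, 0.3, 0.9):
    for P in (1, -1):
        print(k, P, functional_equation_gap(P, k), abs(local_amplitude(P, k)))
```

On both eigenvalues the amplitude has modulus one, because $|b|^2 + (1-c)^2 = 2(1-c)$, so the chosen normalization makes $R_k(iP_{T,2n})$ unitary for $k \in [0, 1)$.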
The shift matrix shall be given by

$$ S_T^{-1} := R_0(iP_{T,2p-2})\, R_0(iP_{T-1,2p-3}) \cdots R_0(iP_{T-1,1})\, R_0(iP_{T,0}), $$

which defines the following shift automorphism on the edge algebra:

$$ \mathbf{S}_T^{-1}(X_{T,n}) := i\, S_T^{-1} X_{T,n} S_T = X_{T,n-1}, \qquad n \in \{1,\dots,2p-1\}, $$

$$ \mathbf{S}_T^{-1}(X_{T,0}) := i(-1)^{p-1}\, S_T^{-1} X_{T,0} S_T = X_{T,2p-1}. $$
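Lemma 5.1. The matrices $\mathbf{C} P_{T,2n} \mathbf{C}^{-1}$ and $\mathbf{C} P_{T,2n+1} \mathbf{C}^{-1}$ ($n \in \{0,\dots,p-1\}$) commute with all generators $I^X_T = (X_{T,n})_{n\in\{0,\dots,2p\}}$ of $\mathcal{A}(X_T)$.

Hence

$$
\begin{aligned}
& R_T\, \mathbf{C} R_T \mathbf{C}^{-1}\, S_T^{-1}\, \mathbf{C} S_T^{-1} \mathbf{C}^{-1}\, \big(X_{T,n} - i\, \mathbf{C} X_{T,n} \mathbf{C}^{-1}\big)\, \mathbf{C} S_T \mathbf{C}^{-1}\, S_T\, \mathbf{C} R_T^{-1} \mathbf{C}^{-1}\, R_T^{-1} \\
&\quad = R_T S_T^{-1} X_{T,n} S_T R_T^{-1} - i\, \mathbf{C} R_T S_T^{-1} X_{T,n} S_T R_T^{-1} \mathbf{C}^{-1} \\
&\quad = -i\big(R_T\, \mathbf{S}_T^{-1}(X_{T,n})\, R_T^{-1} - i\, \mathbf{C} R_T\, \mathbf{S}_T^{-1}(X_{T,n})\, R_T^{-1} \mathbf{C}^{-1}\big),
\end{aligned}
$$

$n \in \{1,\dots,p-1\}$, and analogously for $X_{T,0}$. Clearly this defines light cone shifts on the operators

$$ \psi_{T,n+1} := \frac{1}{2}\big(X_{T,n} - i\, \mathbf{C} X_{T,n} \mathbf{C}^{-1}\big) = \sigma^-_n \prod_{l=0}^{n-1} \sigma^z_l, $$

i.e.

$$ \psi_{T+1,n+1} := \tilde R_T\, \tilde{\mathbf{S}}_T(X_{T,n})\, \tilde R_T^{-1} $$

with $n \in \{0,\dots,p-1\}$,

$$ \tilde{\mathbf{S}}_T^{-1}(\psi_{T,n}) := i\, S_T^{-1}\, \mathbf{C} S_T^{-1} \mathbf{C}^{-1}\, \psi_{T,n}\, S_T\, \mathbf{C} S_T \mathbf{C}^{-1}, \qquad \tilde R_T := R_T\, \mathbf{C} R_T \mathbf{C}^{-1}, $$

and

$$ \sigma^- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad \sigma^z = B. $$

The shift matrices $S_T$ and $S_T^{-1}$, as well as the evolution matrix $R_T$, are products of bilocal operators,

$$ R_k(iP_{T,2n}) = \mathbb{1} \otimes \cdots R_k(iBS \otimes BS) \cdots \otimes \mathbb{1}, \qquad n \in \{0,\dots,p-1\}, $$

$$ R_k(iP_{T-1,2n+1}) = \mathbb{1} \otimes \cdots R_k(-iS \otimes S) \cdots \otimes \mathbb{1}, \qquad n \in \{0,\dots,p-2\}. $$

Hence the same holds for $\tilde R_T$ and $\tilde S_T$. A straightforward computation gives

$$
\begin{aligned}
R_k(iBS \otimes BS)\, \mathbf{C} R_k(iBS \otimes BS) \mathbf{C}^{-1}
&= R_k(iBS \otimes BS)\, R_k(-iS \otimes S) \\
&= \mathbb{1} \otimes \cdots \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & c & b & 0 \\ 0 & b & c & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \cdots \otimes \mathbb{1} \\
&= R_k(-iS \otimes S)\, \mathbf{C} R_k(-iS \otimes S) \mathbf{C}^{-1}.
\end{aligned}
$$

The shift matrices

$$ \tilde S_T^{-1} = \tilde R_0(iP_{T,2p-2})\, \tilde R_0(iP_{T-1,2p-3})\, \tilde R_0(iP_{T,2p-4}) \cdots \tilde R_0(iP_{T,0}) $$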
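act on the fermionic operators $\psi_{T,k}$ as

$$
\begin{aligned}
\tilde S_T^{-1} \psi_{T,2n} \tilde S_T
&= \tilde R_0(iP_{T,2n-2})\, \psi_{T,2n}\, \tilde R_0(iP_{T,2n-2})^{-1} \\
&= \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & -i & 0 \\ 0 & -i & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} (\sigma^-_{2n-1} \otimes \sigma^z_{2n-2}) \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & i & 0 \\ 0 & i & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \prod_{l=0}^{2n-3} \sigma^z_l \\
&= -i\, (\mathbb{1}_{2n-1} \otimes \sigma^-_{2n-2}) \prod_{l=0}^{2n-3} \sigma^z_l \\
&= -i\, \psi_{T,2n-1},
\end{aligned}
\tag{47}
$$

$n \in \{0,\dots,p-1\}$. In a similar way we have

$$ \tilde S_T^{-1} \psi_{T,2n+1} \tilde S_T = -i\, \psi_{T,2n}, \qquad n \in \{1,\dots,p-1\}, $$

$$ \tilde S_T^{-1} \psi_{T,1} \tilde S_T = i(-1)^p\, \psi_{T,2p}, $$

so that finally

$$ \tilde{\mathbf{S}}_T^{-1}(\psi_{T,n}) = i\, \tilde S_T^{-1} \psi_{T,n} \tilde S_T = \psi_{T,n-1}, \qquad n \in \{2,\dots,2p\}, $$

$$ \tilde{\mathbf{S}}_T^{-1}(\psi_{T,1}) = i(-1)^{p-1}\, \tilde S_T^{-1} \psi_{T,1} \tilde S_T = \psi_{T,2p}, $$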
which is identical to the shift automorphism constructed in [DV]. Following Propositions 4.6 and 4.7 we know that the construction of the shift automorphism $\tilde{\mathbf{S}}$ and the evolution automorphism given by the conjugation with $\tilde R$ is sufficient for constructing a Hamiltonian quantum evolution in the sense of the previous sections. Finally, comparing with the construction in [DV] one finds that the fermionic operators obey an evolution of free massive fermions. The evolution equations can be derived easily by considering the evolution for the edge variables. Recalling (46), one finds:

$$
\begin{aligned}
X_{t+1,2k} := R_t X_{t,2k} R_t^{-1}
&= \frac{k + i X^{-1}_{t,2k+1} X_{t,2k}}{1 + k\, i X^{-1}_{t,2k+1} X_{t,2k}}\, X_{t,2k} \\
&= \frac{1}{1 + k^2}\big(2k + i(1 - k^2)\, X^{-1}_{t,2k+1} X_{t,2k}\big)\, X_{t,2k} \\
&= c\, X_{t,2k} + b\, X_{t,2k+1},
\end{aligned}
\tag{48}
$$

and analogously for $X_{t+1,2k+1}$. Since these equations are linear, the evolution equations for the fermionic operators follow immediately:

$$ \psi_{t+1,2n-1} = c\, \psi_{t,2n-1} + b\, \psi_{t,2n}, \tag{49} $$

$$ \psi_{t+1,2n} = c\, \psi_{t,2n} + b\, \psi_{t,2n-1}. \tag{50} $$
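The linear evolution (49), (50) mixes each pair $(\psi_{t,2n-1}, \psi_{t,2n})$ by the matrix with diagonal entries $c$ and off-diagonal entries $b$. A minimal sketch, treating the pair as complex coefficients (an illustrative stand-in for the operators, not the operator algebra itself) and assuming $c = 2k/(1+k^2)$, $b = -i(1-k^2)/(1+k^2)$ as read from the text, shows that this mixing is unitary; the function names are hypothetical.

```python
def amplitudes(k):
    """c and b as read from the text: c = 2k/(1+k^2), b = -i(1-k^2)/(1+k^2)."""
    c = 2 * k / (1 + k * k)
    b = -1j * (1 - k * k) / (1 + k * k)
    return c, b

def fermion_step(psi_odd, psi_even, k):
    """One step of (49), (50) for the pair (psi_{t,2n-1}, psi_{t,2n})."""
    c, b = amplitudes(k)
    return c * psi_odd + b * psi_even, c * psi_even + b * psi_odd

def norm_squared(pair):
    return abs(pair[0]) ** 2 + abs(pair[1]) ** 2

pair = (0.6 + 0.2j, -0.3 + 0.7j)
initial_norm = norm_squared(pair)
for _ in range(10):
    pair = fermion_step(pair[0], pair[1], k=0.4)
print(initial_norm, norm_squared(pair))
```

Unitarity follows from $c^2 + |b|^2 = 1$ together with $c$ real and $b$ purely imaginary. At $k = 0$ the step reduces to an exchange of the two components with a factor $-i$, consistent with the shift action in (47).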
5.1. Conclusion. In the present paper a generalized model of a lattice field theory of sine-Gordon type at root of unity was suggested. One aim was to stress the purely local character of the evolution automorphism, and another was to finally derive global features like constant transfer matrices, and hence a connection to models in statistical mechanics, by restricting to a special case (Propositions 4.6 and 4.7). Since the classical phase space belonging to the evolution of the face variables (current variables) is a torus $T^{2p}$, the corresponding quantum model can be viewed as a type of quantization of this torus. Hence, a future project should be the investigation of the above within the framework of noncommutative geometry [Co]. A nice side effect of studying the above quantum lattice model was the detection of a relation to another quantum lattice model, namely the massive Thirring model in its reduced version describing free massive fermions, as given by [DV]. Since relations between the two models are known for the continuous case, see e.g. [KM,C], it seems to speak for the self-coherence of the above lattice models that they also exist in the discrete case.

References

[AM] Abraham, R., Marsden, J.E.: Foundations of Mechanics. New York: The Benjamin/Cummings Publishing Company, Inc, 1978
[BBR] Bazhanov, V., Bobenko, A., Reshetikhin, N.: Quantum Discrete Sine-Gordon Model at Roots of 1: Integrable Quantum System on the Integrable Classical Background. Commun. Math. Phys. 175, 377–400 (1996)
[BP] Bobenko, A.I., Pinkall, U.: Discrete Surfaces with Constant Negative Gaussian Curvature and the Hirota Equation. Sfb 288 Preprint Nr. 127, TU Berlin, Berlin 1994
[BKP] Bobenko, A.I., Kutz, N., Pinkall, U.: The Quantum Pendulum. Phys. Lett. A 177, 399–404 (1993)
[BRST] Bruschi, M., Ragnisco, O., Santini, P.M., Tu, G.-Z.: Integrable Symplectic Maps. Physica D 49, 273 (1991)
[Co] Connes, A.: Noncommutative Differential Geometry. Pub. Math. IHES 62, 257–360 (1986)
[C] Coleman, S.: Quantum sine-Gordon equation as the massive Thirring model. Phys. Rev. D11, No. 5, 2088–2097 (1975)
[CN] Capel, H.W., Nijhoff, F.W.: Integrable Quantum Mappings. CRM Proceedings and Lecture Notes Preprint (1994)
[CW] Crnković, C., Witten, E.: Covariant Description of canonical formalism in geometrical theories. In: Three hundred years of gravitation, Cambridge: Cambridge University Press, Ed.: Hawking, Israel, 1987, pp. 676–684
[DV] Destri, C., de Vega, H.J.: Light-Cone Lattice Approach to Fermionic Theories in 2D, The Massive Thirring Model. Nucl. Phys. B 290[FS20], 363–391 (1987)
[EK] Emmrich, C., Kutz, N.: Doubly discrete Lagrangian systems related to the Hirota and sine-Gordon equation. Phys. Lett. A 201, 156–160 (1992)
[FV1] Faddeev, L.D., Volkov, A.Yu.: Quantum Inverse Scattering Method on a Spacetime Lattice. Theor. Math. Phys. 92, 837–842 (1992)
[FV2] Faddeev, L.D., Volkov, A.Yu.: Hirota Equation as an Example of an Integrable Symplectic Map. Lett. Math. Phys. 32, 125–136 (1994)
[FV3] Faddeev, L.D., Volkov, A.Yu.: Abelian current algebra and the Virasoro algebra on the lattice. Phys. Lett. B 315, 311 (1993)
[FZ] Fateev, V.A., Zamolodchikov, A.B.: The exactly solvable case of a 2D lattice of plane rotators. Phys. Lett. A 29, Nr. 1, 35–39 (1982)
[GS] Guillemin, V., Sternberg, S.: Symplectic techniques in physics. Cambridge: Cambridge University Press, 1984
[H] Hirota, R.: Nonlinear Partial Difference Equations III; Discrete Sine-Gordon Equation. J. Phys. Soc. Japan 43, 2079–2086 (1977)
[IK] Izergin, A.G., Korepin, V.G.: The lattice Quantum Sine-Gordon Model. Lett. Math. Phys. 5, 199–205 (1981)
[KM] Klassen, T.R., Melzer, E.M.: Sine-Gordon ≠ Massive Thirring, and Related Heresies. Int. J. Mod. Phys. A8, 4131–4174 (1993)
[K] Kutz, N.: On the spectrum of the quantum pendulum. Physics Letters A 187, 365–372 (1994)
[TF] Takhdajan, L.A., Faddeev, L.D.: The Quantum Method of the Inverse Problem and the Heisenberg XYZ Model. Russ. Math. Surv. 34, 5, 11–68 (1979)
[NCP] Nijhoff, F.W., Capel, H.W., Papageorgiu, V.G.: Integrable Quantum Mappings. Phys. Rev. A 46, 2155 (1992)
[S] Suris, Yu.B.: Integrable Mappings of the Standard Type. Funct. Anal. Appl. 23, 74–79 (1989)
[TS] Truong, T.T., Schotte, K.D.: The Quantum Field Theories Associated with a “Staggered” Ice-Type Model. Nucl. Phys. B230, 1–15 (1984)
[V] Veselov, A.P.: Integrable Maps. Russ. Math. Surv. 46, No. 5, 1–51 (1991)

Communicated by T. Miwa
Commun. Math. Phys. 204, 137 – 146 (1999)
Communications in Mathematical Physics
© Springer-Verlag 1999
Reduction of Quantum Systems with Arbitrary First Class Constraints and Hecke Algebras

Alexey Sevostyanov*

Institute of Theoretical Physics, Uppsala University, Box 803, S-75108 Uppsala, Sweden.

Received: 15 June 1998 / Accepted: 25 January 1999
Abstract: We propose a method for reduction of quantum systems with arbitrary first-class constraints. An appropriate mathematical setting for the problem is the homology of associative algebras. For every such algebra $A$ and subalgebra $B$ with augmentation $\varepsilon$ there exists a cohomological complex which is a generalization of the BRST one. Its cohomology is an associative graded algebra $\mathrm{Hk}^{*}(A, B)$ which we call the Hecke algebra of the triple $(A, B, \varepsilon)$. It acts in the cohomology space $H^{*}(B, V)$ for every left $A$-module $V$. In particular the zeroth graded component $\mathrm{Hk}^{0}(A, B)$ acts in the space of $B$-invariants of $V$ and provides the reduction of the quantum system.

* Address after September 1, 1999: Max-Planck Institut für Mathematik, Vivatsgasse 7, Box 7280, D-53072 Bonn, Germany

Introduction

The purpose of this paper is to generalize the well-known BRST quantization procedure [4,2,3,8] to arbitrary associative algebras. Recall that given a Hamiltonian action of a Lie group on a Poisson manifold one can construct a super-Poisson complex, called the classical BRST complex, whose zeroth cohomology is the algebra of functions on the reduced space over the zero value of the corresponding moment map [1,11]. This complex admits a quantization, and the zeroth cohomology of the quantum complex may be treated as a quantization of the classical reduced space. In that case the quantum counterparts of matrix elements of the moment map form a Lie algebra and represent a system of first-class constraints for the quantum reduction. An appropriate mathematical setting for the BRST cohomology of Lie algebras was proposed by Kostant and Sternberg in [9].

Our approach to the BRST cohomology differs from the one described above. We start with the quantum complex directly. It turns out that the quantum BRST cohomology may be defined using the language of homological algebra, resolutions, etc. The particular
complex described in [9] corresponds to the standard resolution of the ground field. As usual, in homological algebra different choices of resolutions lead to the same homology. This allows us to generalize the BRST cohomology to arbitrary systems of first-class constraints.

The BRST quantization is not unique in the following sense. One can always define the BRST cohomology related to the usual cohomology of representation spaces of the quantum system. However, under some technical assumptions there exists another version of the BRST reduction related to the semi-infinite cohomology of the representation spaces [7]. In the case of Lie algebras it requires a normal ordering in the differential [9]. In the present paper we only discuss the usual BRST cohomology. The latter one will be explained in a subsequent paper.

1. Endomorphisms of Complexes

Let $A$ be an associative ring with unit, $X$ be a graded complex of left $A$-modules equipped with a differential of degree $-1$. Recall the definition of the complex $Y = \mathrm{End}_A(X)$ [10]. By definition $Y$ is a $\mathbb{Z}$-graded complex

\[ Y = \bigoplus_{n=-\infty}^{\infty} Y^{n}, \tag{1} \]

with graded components defined as

\[ Y^{n} = \prod_{p+q=n} Y^{p,q}, \tag{2} \]

where

\[ Y^{p,q} = \mathrm{Hom}_A(X^{p}, X^{-q}). \tag{3} \]
Clearly $Y$ is a subspace in the full algebra of $A$-endomorphisms of $X$. It is easy to see that $Y$ is closed with respect to the multiplication given by composition of endomorphisms. Thus it is a graded associative algebra. We emphasize that $Y$ is not a bigraded space. We introduce a differential on $Y$ of degree $+1$ as follows:

\[ (df)^{p,q} = (-1)^{p+q} f^{p-1,q} \circ d + d \circ f^{p,q-1}, \qquad f = \{f^{p,q}\},\ f^{p,q} \in Y^{p,q}. \tag{4} \]

If $f$ is homogeneous then

\[ df = d \circ f - (-1)^{\deg(f)} f \circ d, \tag{5} \]

that is, the differential on $Y$ is the supercommutator with the differential $d$ of $X$. We shall also consider the partial differentials $d'$ and $d''$ on $Y$: for $f \in Y^{p,q}$,

\[ (d'f)(x) = (-1)^{p+q+1} f(dx),\quad x \in X^{p+1}; \qquad (d''f)(x) = d f(x),\quad x \in X^{p}. \tag{6} \]
It is easy to check that

\[ (d')^{2} = (d'')^{2} = d'd'' + d''d' = 0. \tag{7} \]

These conditions ensure that $d^{2} = 0$. The following property of $d$ is crucial for the subsequent considerations.

Lemma 1. $d$ is a superderivation of $Y$.

Proof. Let $f$ and $g$ be homogeneous elements of $Y$. Then $\deg(fg) = \deg(f) + \deg(g)$ and (5) yields:

\[ \begin{aligned} d(fg) &= d \circ fg - (-1)^{\deg(f)+\deg(g)} fg \circ d \\ &= d \circ fg - (-1)^{\deg(f)} f \circ d \circ g + (-1)^{\deg(f)} f \circ d \circ g - (-1)^{\deg(f)+\deg(g)} fg \circ d \\ &= (df)g + (-1)^{\deg(f)} f(dg). \end{aligned} \tag{8} \]

This completes the proof. □

The most important consequence of the lemma is

Theorem 2. The homology space $H^{*}(Y)$ inherits a multiplicative structure from $Y$. Thus $H^{*}(Y)$ is a graded associative algebra.

Proof. First, the product of two cocycles is a cocycle. For if $f$ and $g$ are homogeneous and $df = dg = 0$ then

\[ d(fg) = (df)g + (-1)^{\deg(f)} f(dg) = 0. \tag{9} \]

Now we have to show that the product of homology classes is well-defined. It suffices to verify that the product of a homogeneous cocycle with a homogeneous coboundary is cohomologous to zero. For instance, consider the product $f\,dh$, where $f$ is a cocycle. Equation (5) gives

\[ \begin{aligned} f\,dh &= f \circ \bigl(d \circ h - (-1)^{\deg(h)} h \circ d\bigr) \\ &= (-1)^{\deg(f)} d \circ f \circ h - (-1)^{\deg(h)} f \circ h \circ d \\ &= (-1)^{\deg(f)} d(fh). \end{aligned} \tag{10} \]

This completes the proof. □

One of the principal statements of homological algebra says that homotopically equivalent complexes have the same homology. In particular the vector space $H^{*}(Y)$ depends only on the homotopy class of the complex $X$. It turns out that the same is true for the algebraic structure of $H^{*}(Y)$. Indeed we have the following

Theorem 3. Let $X$, $X'$ be two homotopically equivalent graded complexes of left $A$-modules, and let $Y = \mathrm{End}_A(X)$, $Y' = \mathrm{End}_A(X')$. Then

\[ H^{*}(Y) \cong H^{*}(Y') \quad \text{as graded associative algebras.} \tag{11} \]
Proof. Let $F : X \to X'$, $F' : X' \to X$ be two maps between the complexes such that

\[ F'F - \mathrm{id}_X = d_X s + s d_X, \qquad s : X \to X,\ s \in Y^{-1}, \tag{12} \]

\[ FF' - \mathrm{id}_{X'} = d_{X'} s' + s' d_{X'}, \qquad s' : X' \to X',\ s' \in {Y'}^{-1}. \tag{13} \]

Consider the induced mappings of the complexes $Y$, $Y'$:

\[ \begin{aligned} F{F'}^{*} &: Y \to Y', & F{F'}^{*} f &= F \circ f \circ F', & f &\in Y, \\ F'F^{*} &: Y' \to Y, & F'F^{*} g &= F' \circ g \circ F, & g &\in Y'. \end{aligned} \tag{14} \]

Their compositions are homotopic to the identity maps of $Y$ and $Y'$ (see Chap. 4, [5] for a general statement about equivalences of functors). But this means that $F{F'}^{*}$ is inverse to $F'F^{*}$ when restricted to homology. Thus $H^{*}(Y)$ is isomorphic to $H^{*}(Y')$ as a vector space.

We have to show that the restrictions of $F{F'}^{*}$ and $F'F^{*}$ to the homologies are homomorphisms of algebras. Let $f$ and $g$ be homogeneous elements of $Y$ and $d_X f = d_X g = 0$. By the definition of the induced maps we have

\[ F{F'}^{*}(fg) = F \circ fg \circ F'. \tag{15} \]

On the other hand

\[ F{F'}^{*}(f)\,F{F'}^{*}(g) = F \circ f \circ F'F \circ g \circ F' = F \circ f(\mathrm{id}_X + d_X s + s d_X)g \circ F'. \tag{16} \]

Now recall that $f$ and $g$ are cocycles in $Y$. By (5) they supercommute with $d_X$:

\[ d_X \circ f = (-1)^{\deg(f)} f \circ d_X. \tag{17} \]

Using (17) and the fact that $F$ and $F'$ are morphisms of complexes we can rewrite (16) as follows:

\[ \begin{aligned} F \circ f(\mathrm{id}_X + d_X s + s d_X)g \circ F' &= F \circ fg \circ F' + (-1)^{\deg(f)} d_{X'} \circ F \circ fsg \circ F' + (-1)^{\deg(g)} F \circ fsg \circ F' \circ d_{X'} \\ &= F \circ fg \circ F' + (-1)^{\deg(f)} d_{X'}(F \circ fsg \circ F'). \end{aligned} \tag{18} \]

Finally observe that by (18), $F{F'}^{*}(fg)$ and $F{F'}^{*}(f)\,F{F'}^{*}(g)$ belong to the same homology class in $H^{*}(Y')$. This completes the proof. □
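The sign conventions of this section can be sanity-checked on a toy example. The following is a minimal sketch, assuming a four-dimensional graded free module over the integers with a hand-picked differential; the basis, the matrices and all helper names are illustrative assumptions and do not come from the text. It checks that the supercommutator (5) squares to zero and satisfies the superderivation property of Lemma 1.

```python
# Toy check of (5), Lemma 1 and d^2 = 0 on a small complex of free Z-modules.
# Everything here (basis, matrices, helper names) is an illustrative assumption.

DEG = [0, 1, 1, 2]          # degrees of the basis vectors e0, e1, e2, e3 of X
N = len(DEG)


def matmul(a, b):
    """Product of two N x N integer matrices (composition of endomorphisms)."""
    return [[sum(a[i][t] * b[t][j] for t in range(N)) for j in range(N)]
            for i in range(N)]


def combine(a, b, coeff):
    """Return a + coeff * b entrywise."""
    return [[x + coeff * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def endo(entries, deg):
    """Homogeneous element of Y of degree `deg`: it lowers the degree in X by `deg`.

    entries[(row, col)] is the coefficient of e_row in the image of e_col.
    """
    m = [[0] * N for _ in range(N)]
    for (row, col), value in entries.items():
        assert DEG[col] - DEG[row] == deg, "entry is not homogeneous"
        m[row][col] = value
    return m


# Differential of X (degree -1 on X, hence an element of degree +1 in Y).
D = endo({(0, 1): 1, (0, 2): -1, (1, 3): 1, (2, 3): 1}, 1)
ZERO = [[0] * N for _ in range(N)]


def d(f, deg_f):
    """Differential on Y as in (5): df = D o f - (-1)^deg(f) f o D."""
    sign = -1 if deg_f % 2 else 1
    return combine(matmul(D, f), matmul(f, D), -sign)


def leibniz_holds(f, deg_f, g, deg_g):
    """Check d(fg) = (df) g + (-1)^deg(f) f (dg) for homogeneous f, g."""
    sign = -1 if deg_f % 2 else 1
    lhs = d(matmul(f, g), deg_f + deg_g)
    rhs = combine(matmul(d(f, deg_f), g), matmul(f, d(g, deg_g)), sign)
    return lhs == rhs


# Three homogeneous test elements of degrees 1, 0 and -1.
f = endo({(0, 1): 2, (1, 3): 3}, 1)
g = endo({(0, 0): 1, (1, 1): 2, (2, 1): 1, (1, 2): -1, (3, 3): 5}, 0)
h = endo({(1, 0): 1, (3, 2): 4}, -1)
```

For instance, `leibniz_holds(f, 1, g, 0)` evaluates the two sides of (8) for the sample elements, and `d(d(g, 0), 1)` can be compared with `ZERO`. Only the parity of the degree enters the signs, which is why the check uses `deg % 2`.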
2. Hecke Algebras

Let $A$ be an associative algebra over a ring $K$ with unit, and $B$ a subalgebra of $A$ with augmentation, that is, a $K$-algebra homomorphism $\varepsilon : B \to K$. Let $X$ be a projective resolution of the left $B$-module $K$. Then the complex $A \otimes_B X$ has the natural structure of a left $A$-module. We can apply Theorem 2 to define a graded associative algebra

\[ \mathrm{Hk}^{*}(A, B) = H^{*}(\mathrm{End}_A(A \otimes_B X)). \tag{19} \]

Observe that all $B$-projective resolutions of $K$ are homotopically equivalent. Hence by Theorem 3 $\mathrm{Hk}^{*}(A, B)$ does not depend on the resolution $X$. We shall call it the Hecke algebra of the triple $(A, B, \varepsilon)$.

Now consider $A$ as a left $A$-module and a right $B$-module via multiplication. In this way $A$ becomes a left $A \otimes B^{\mathrm{opp}}$-module. Let $X'$ be a projective resolution of this module. The complex $X' \otimes_B K$ is a left $A$-module. Therefore there exists an associative algebra

\[ \widehat{\mathrm{Hk}}^{*}(A, B) = H^{*}(\mathrm{End}_A(X' \otimes_B K)) \tag{20} \]

independent of the resolution $X'$.

Theorem 4. $\mathrm{Hk}^{*}(A, B)$ is isomorphic to $\widehat{\mathrm{Hk}}^{*}(A, B)$ as a graded associative algebra.

Proof. We shall use the standard bar resolutions for computing $\widehat{\mathrm{Hk}}^{*}(A, B)$ and $\mathrm{Hk}^{*}(A, B)$ [10,5]. Consider the complex $B \otimes T(I(B)) \otimes B$, where $I(B) = B/K$ and $T$ denotes the tensor algebra of the vector space. Elements of $B \otimes T(I(B)) \otimes B$ are usually written as $a[a_1, \dots, a_s]a'$. The differential is given by

\[ d\,a[a_1, \dots, a_s]a' = aa_1[a_2, \dots, a_s]a' + \sum_{k=1}^{s-1} (-1)^{k} a[a_1, \dots, a_k a_{k+1}, \dots, a_s]a' + (-1)^{s} a[a_1, \dots, a_{s-1}]a_s a'. \tag{21} \]

Then $B \otimes T(I(B)) \otimes B \otimes_B K = B \otimes T(I(B)) \otimes K$ is a free resolution of the left $B$-module $K$. And $A \otimes_B B \otimes T(I(B)) \otimes B = A \otimes T(I(B)) \otimes B$ is a free resolution of $A$ as a right $B$-module. The complex $A \otimes T(I(B)) \otimes B$ is also a free left $A$-module via left multiplication by elements of $A$. Hence this is an $A \otimes B^{\mathrm{opp}}$-free resolution of $A$. Thus the complex $\mathrm{End}_A(A \otimes_B B \otimes T(I(B)) \otimes K) = \mathrm{End}_A(A \otimes T(I(B)) \otimes K)$ for the computation of $\mathrm{Hk}^{*}(A, B)$ is canonically isomorphic to the complex $\mathrm{End}_A(A \otimes T(I(B)) \otimes B \otimes_B K) = \mathrm{End}_A(A \otimes T(I(B)) \otimes K)$ for the computation of $\widehat{\mathrm{Hk}}^{*}(A, B)$. This establishes the isomorphism of the algebras. □

3. Action in Homology and Cohomology Spaces

Recall that for every left $B$-module $V$ the cohomology modules are defined to be

\[ H^{*}(V) = \mathrm{Ext}^{*}_B(K, V) = H^{*}(\mathrm{Hom}_B(X, V)), \tag{22} \]

where $X$ is a projective resolution of $K$. On the other hand for every right $B$-module $W$ one can define the homology modules

\[ H_{*}(W) = \mathrm{Tor}^{B}_{*}(W, K) = H_{*}(W \otimes_B X). \tag{23} \]
Now observe that for every left $A$-module $V$ the complex in (22) for calculating its cohomology as a left $B$-module may be represented as follows:

\[ \mathrm{Hom}_B(X, V) = \mathrm{Hom}_A(A \otimes_B X, V). \tag{24} \]

Endow the space $\mathrm{Hom}_A(A \otimes_B X, V)$ with a right $\mathrm{End}_A(A \otimes_B X)$-action:

\[ \begin{aligned} &\mathrm{Hom}_A(A \otimes_B X, V) \times \mathrm{End}_A(A \otimes_B X) \to \mathrm{Hom}_A(A \otimes_B X, V), \\ &\varphi \times f \mapsto \varphi \circ f, \qquad \varphi \in \mathrm{Hom}_A(A \otimes_B X, V),\ f \in \mathrm{End}_A(A \otimes_B X). \end{aligned} \tag{25} \]

The action is well-defined since $f$ commutes with the left $A$-action. Clearly this action respects the gradings, i.e., it is an action of the graded associative algebra on the graded module.

Theorem 5. The action (25) gives rise to a right action

\[ H^{*}(V) \times \mathrm{Hk}^{*}(A, B) \to H^{*}(V), \qquad H^{n}(V) \times \mathrm{Hk}^{m}(A, B) \to H^{n+m}(V). \tag{26} \]

Proof. Let $\varphi \in \mathrm{Hom}_A(A \otimes_B X, V)$ and $d\varphi = \varphi \circ d = 0$. Let also $f \in \mathrm{End}_A(A \otimes_B X)$ be a homogeneous cocycle. By (17) $\varphi \circ f$ is a cocycle in $\mathrm{Hom}_A(A \otimes_B X, V)$. Indeed

\[ d(\varphi \circ f) = \varphi \circ f \circ d = (-1)^{\deg(f)} \varphi \circ d \circ f = 0. \tag{27} \]

Then we need to show that the action does not depend on the choice of the representative $f$ in the homology class $[f]$, that is, $\varphi \circ dg$ is homologous to zero for every homogeneous $g \in \mathrm{End}_A(A \otimes_B X)$. This is a direct consequence of the definitions:

\[ \varphi \circ dg = \varphi \circ \bigl(d \circ g - (-1)^{\deg(g)} g \circ d\bigr) = -(-1)^{\deg(g)} d(\varphi \circ g), \tag{28} \]

since $\varphi \circ d = 0$. Finally let us check that the action is independent of the representative in the homology class $[\varphi]$. For $\psi \in \mathrm{Hom}_A(A \otimes_B X, V)$, $d\psi \circ f$ is always homologous to zero:

\[ d\psi \circ f = \psi \circ d \circ f = (-1)^{\deg(f)} \psi \circ f \circ d = (-1)^{\deg(f)} d(\psi \circ f). \tag{29} \]

This concludes the proof. □

Similarly for every right $A$-module $W$ one can equip the homology module $H_{*}(W)$ with a structure of left $\mathrm{Hk}^{*}(A, B)$-module. First the complex $W \otimes_B X = W \otimes_A A \otimes_B X$ has the natural structure of a left $\mathrm{End}_A(A \otimes_B X)$-module:

\[ \begin{aligned} &\mathrm{End}_A(A \otimes_B X) \times W \otimes_A A \otimes_B X \to W \otimes_A A \otimes_B X, \\ &f \times w \otimes x \mapsto w \otimes f(x), \qquad w \otimes x \in W \otimes_A (A \otimes_B X),\ f \in \mathrm{End}_A(A \otimes_B X). \end{aligned} \tag{30} \]

Observe that according to the convention of Sect. 1 elements of $\mathrm{End}^{n}_A(A \otimes_B X)$ have degree $-n$ as operators in the graded space $W \otimes_A A \otimes_B X$:

\[ \mathrm{End}^{n}_A(A \otimes_B X) \times W \otimes_A A \otimes_B X_{m} \to W \otimes_A A \otimes_B X_{m-n}. \tag{31} \]

The following assertion is an analogue of Theorem 5 for homology.

Theorem 6. The action (30) gives rise to a left action

\[ \mathrm{Hk}^{*}(A, B) \times H_{*}(W) \to H_{*}(W), \qquad \mathrm{Hk}^{n}(A, B) \times H_{m}(W) \to H_{m-n}(W). \tag{32} \]
4. Structure of the Hecke Algebras

In this section we investigate the Hecke algebras under some technical assumptions. The main theorem here is

Theorem 7. Assume that

\[ \mathrm{Tor}^{B}_{n}(A, K) = 0 \quad \text{for } n > 0. \tag{33} \]

Then

\[ \mathrm{Hk}^{n}(A, B) = \mathrm{Ext}^{n}_A(A \otimes_B K, A \otimes_B K) = \mathrm{Ext}^{n}_B(K, A \otimes_B K). \tag{34} \]

In particular

\[ \mathrm{Hk}^{n}(A, B) = 0,\ n < 0; \qquad \mathrm{Hk}^{0}(A, B) = \mathrm{Hom}_B(K, A \otimes_B K). \tag{35} \]
Proof. Equip the complex $Y = \mathrm{End}_A(A \otimes T(I(B)) \otimes K)$, which we used in Theorem 4 for the computation of $\mathrm{Hk}^{*}(A, B)$, with the first filtration as follows:

\[ F^{k}Y = \sum_{n=-\infty}^{\infty}\ \prod_{p+q=n,\ p \ge k} Y^{p,q}. \tag{36} \]

The associated graded complex with respect to this filtration is the double direct sum

\[ \mathrm{Gr}\,Y = \sum_{p,q=-\infty}^{\infty} Y^{p,q}. \tag{37} \]

One can show that the filtration is regular and the second term of the corresponding spectral sequence is

\[ E_2^{p,q} = H^{p}_{d'}(H^{q}_{d''}(\mathrm{Gr}\,Y)), \tag{38} \]

where $H^{*}_{d'}$ and $H^{*}_{d''}$ denote the homologies of the complex with respect to the partial differentials (6).

Now observe that at the same time the complex $A \otimes T(I(B)) \otimes K$ is a complex for the calculation of $\mathrm{Tor}^{B}_{n}(A, K)$ because $A \otimes T(I(B)) \otimes B$ is a free resolution of $A$ as a right $B$-module. It is also free as a left $A$-module. Therefore the functor $\mathrm{Hom}_A(A \otimes T(I(B)) \otimes K, \cdot)$ is exact. By assumption $H^{*}(A \otimes T(I(B)) \otimes K) = \mathrm{Tor}^{B}_{0}(A, K) = A \otimes_B K$. Using the last two observations we can calculate the cohomology of the complex $\mathrm{Gr}\,Y$ with respect to the differential $d''$:

\[ H^{*}_{d''}(\mathrm{Gr}\,Y) = H^{*}_{d''}(\mathrm{Hom}_A(A \otimes T(I(B)) \otimes K, A \otimes T(I(B)) \otimes K)) = \mathrm{Hom}_A(A \otimes T(I(B)) \otimes K, A \otimes_B K). \tag{39} \]

Here $\mathrm{Hom}_A$ should be thought of as a direct sum of the double graded components. Now (39) provides that the spectral sequence (38) degenerates at the second term. Moreover,

\[ E_2^{p,*} = H^{p}_{d'}(H^{0}_{d''}(\mathrm{Gr}\,Y)) = H^{p}_{d'}(\mathrm{Hom}_A(A \otimes T(I(B)) \otimes K, A \otimes_B K)). \tag{40} \]
But the complex $A \otimes T(I(B)) \otimes K$ may be regarded as a free resolution of the left $A$-module $A \otimes_B K$. Therefore

\[ E_2^{p,*} = \mathrm{Ext}^{p}_A(A \otimes_B K, A \otimes_B K). \tag{41} \]

Finally by Theorem 5.12, [5] we have:

\[ \mathrm{Hk}^{n}(A, B) = H^{n}(Y) = E_2^{n,0} = \mathrm{Ext}^{n}_A(A \otimes_B K, A \otimes_B K). \tag{42} \]

Since $\mathrm{Tor}^{B}_{n}(A, K) = 0$ for $n > 0$ we can apply the Shapiro lemma to simplify the last expression:

\[ \mathrm{Ext}^{n}_A(A \otimes_B K, A \otimes_B K) = \mathrm{Ext}^{n}_B(K, A \otimes_B K). \tag{43} \]

This completes the proof. □

Remark 1. In particular the conditions of the theorem are satisfied if $A$ is projective as a right $B$-module. For instance suppose that there exists a subspace $N \subset A$ such that multiplication in $A$ provides an isomorphism of vector spaces $A \cong N \otimes B$. Then $A$ is a free right $B$-module.

5. Comparison with the BRST Complex

Let $\mathfrak{g}$ be a Lie algebra over a field $K$. For simplicity we suppose that $\mathfrak{g}$ is finite-dimensional. However the arguments presented below remain true, with some technical modifications, for an arbitrary Lie algebra. We shall apply the construction of Sect. 2 in the following situation. Let $B = U(\mathfrak{g})$ and let $A$ be an associative algebra over $K$ containing $B$ as a subalgebra. Note that $U(\mathfrak{g})$ is naturally augmented. Consider the $U(\mathfrak{g})$-free resolution of the left $U(\mathfrak{g})$-module $K$ as follows:

\[ \begin{aligned} X &= U(\mathfrak{g}) \otimes \Lambda(\mathfrak{g}), \\ d(u \otimes x_1 \wedge \dots \wedge x_n) &= \sum_{i=1}^{n} (-1)^{i+1} u x_i \otimes x_1 \wedge \dots \wedge \widehat{x_i} \wedge \dots \wedge x_n \\ &\quad + \sum_{1 \le i < j \le n} (-1)^{i+j} u \otimes [x_i, x_j] \wedge x_1 \wedge \dots \wedge \widehat{x_i} \wedge \dots \wedge \widehat{x_j} \wedge \dots \wedge x_n, \end{aligned} \tag{44} \]

where the symbol $\widehat{x_i}$ indicates that $x_i$ is to be omitted. Then

\[ A \otimes_{U(\mathfrak{g})} X = A \otimes \Lambda(\mathfrak{g}) \tag{45} \]

is a complex with a differential given by the operator

\[ d = \sum_{i} e_i \otimes e_i^{*} - \sum_{i,j} 1 \otimes [e_i, e_j] e_i^{*} e_j^{*}. \tag{46} \]

Here $e_i$ is a linear basis of $\mathfrak{g}$, $e_i^{*}$ is the dual basis, $e_i \otimes 1$ is regarded as the operator of right multiplication in $A$, and $1 \otimes e_i$, $1 \otimes e_i^{*}$ are the operators of exterior and inner multiplications in $\Lambda(\mathfrak{g})$ respectively. Now observe that

\[ \mathrm{End}_A(A \otimes \Lambda(\mathfrak{g})) = A^{\mathrm{opp}} \otimes \mathrm{End}_K(\Lambda(\mathfrak{g})) = A^{\mathrm{opp}} \otimes C(\mathfrak{g} + \mathfrak{g}^{*}), \tag{47} \]

where $C(\mathfrak{g} + \mathfrak{g}^{*})$ is the Clifford algebra of the space $\mathfrak{g} + \mathfrak{g}^{*}$. Under this identification $A^{\mathrm{opp}}$ acts on $A \otimes \Lambda(\mathfrak{g})$ by multiplication in $A$ on the right and the Clifford algebra acts
by the exterior and inner multiplication in $\Lambda(\mathfrak{g})$. This allows us to consider the differential (46) as an element of the complex $A^{\mathrm{opp}} \otimes C(\mathfrak{g} + \mathfrak{g}^{*})$. It is easy to see that the canonical $\mathbb{Z}$-grading of the complex $A^{\mathrm{opp}} \otimes C(\mathfrak{g} + \mathfrak{g}^{*})$ coincides mod 2 with the $\mathbb{Z}_2$-grading inherited from the Clifford algebra. Therefore according to (5) the differential $d$ is given by the supercommutator in $A^{\mathrm{opp}} \otimes C(\mathfrak{g} + \mathfrak{g}^{*})$ by the element (46). Now recall that the complex $A^{\mathrm{opp}} \otimes C(\mathfrak{g} + \mathfrak{g}^{*})$ with the differential given by the supercommutator by the element (46) is the quantum BRST complex proposed in [9]. This establishes

Theorem 8. The complex $(\mathrm{End}_A(A \otimes_{U(\mathfrak{g})} X), d)$ is isomorphic to the BRST one $A^{\mathrm{opp}} \otimes C(\mathfrak{g} + \mathfrak{g}^{*})$ with the differential being the supercommutator by the element (46).

6. Relation to Quantum Reduction

The results of the previous section imply that if $K$ is the field of complex numbers $\mathbb{C}$ then $\mathrm{Hk}^{0}(A, U(\mathfrak{g}))^{\mathrm{opp}}$ may be thought of as a result of quantum reduction in $A$ with $U(\mathfrak{g})$ being a system of first-class constraints [9]. We shall show that this treatment remains true in the general situation of Sect. 2.

Suppose that $A$ is a quantization of a classical system, that is, $A$ is included into a family of associative algebras $A_h$, parametrized by a complex number $h$, such that for different $h$ the algebras $A_h$ are isomorphic as vector spaces, $A_0$ is commutative, and the formula

\[ \{a, b\} = \lim_{h \to 0} \frac{ab - ba}{h} \]

defines a Poisson algebra structure on $A_0$. The classical limit of $B$ is a Poisson subalgebra $B_0$ in $A_0$ with a character $\varepsilon_0 : B_0 \to \mathbb{C}$. Let $J_0$ be the ideal in $A_0$ generated by the kernel of the map $\varepsilon_0$. Then the classical reduced Poisson algebra coincides with the subspace of Poisson $B_0$-invariants in the quotient $A_0/J_0$. In typical situations $A_0$ is the Poisson algebra of functions on a Poisson manifold. In this case the scheme of the reduction was suggested by Dirac in [6].

Now assume that the conditions of Theorem 7 are satisfied. Then the algebra $\mathrm{Hk}^{0}(A, B)^{\mathrm{opp}}$ is isomorphic to the algebra of $B$-invariants in the quotient $A/J$, where $J$ is the left ideal in $A$ generated by the kernel of the augmentation map $\varepsilon$. Thus the classical limit of $\mathrm{Hk}^{0}(A, B)^{\mathrm{opp}}$ is exactly the reduced Poisson algebra defined above.

If the conditions of Theorem 7 are not satisfied, the algebra $\mathrm{Hk}^{0}(A, B)^{\mathrm{opp}}$ can still be treated as a quantization of the classical reduced space in the following sense. Recall the scheme of the Dirac quantum reduction [6]. Suppose again that we are given a quantum system with first-class constraints, that is, an associative algebra $A$ over $\mathbb{C}$ together with a representation $V$ and a subalgebra $B$ of $A$ equipped with a character $\varepsilon : B \to \mathbb{C}$. According to Dirac the space of the physical states for the reduced system is the space of $B$-invariants in $V$:

\[ V^{B} = \{v \in V : bv = \varepsilon(b)v \text{ for every } b \in B\}, \tag{48} \]

that is, the zeroth cohomology space of $V$ as a left $B$-module, $H^{0}(V) = V^{B} = \mathrm{Hom}_B(\mathbb{C}, V)$. And the algebra of observables $A^{B}_{V}$ of the reduced system is formed by operators $a \in A$ such that

\[ [b, a]v = 0 \quad \text{for every } b \in B,\ v \in V^{B}. \tag{49} \]
This condition is equivalent to

\[ bav = \varepsilon(b)av. \tag{50} \]

It means that the space $V^{B}$ is invariant with respect to the action of $A^{B}_{V}$. Clearly, such operators form an algebra. On the other hand, we have an action (26) of the algebra $\mathrm{Hk}^{0}(A, B)^{\mathrm{opp}}$ in the space $V^{B}$. From the definition of the action it is clear that only the components of bidegree $(0, 0)$ give a nontrivial contribution to the action. They may be represented by elements from $A$. Moreover,

Theorem 9. The algebra $\mathrm{Hk}^{0}(A, B)^{\mathrm{opp}}$ satisfies the condition (49) for every representation $V$ of $A$. So it may be regarded as a universal Dirac reduction of the physical system.

The statement of the theorem follows directly from Theorem 5. Namely, condition (49) ensures that the action of the algebra $\mathrm{Hk}^{0}(A, B)^{\mathrm{opp}}$ in the space of invariants is well-defined.

Acknowledgements. The author is grateful to J. Stasheff for references and to the CMP referee for careful reading of the text.
References

1. Arnold, V.: Mathematical Methods of Classical Mechanics. Graduate Texts in Math. 60, Berlin: Springer, 1978
2. Batalin, I.A., Vilkovisky, G.A.: Phys. Lett. B 69, 309 (1977); Phys. Rev. D 69, 2567 (1983); J. Math. Phys. 26, 172 (1985)
3. Batalin, I.A., Fradkin, E.S.: Phys. Lett. B 122, 157 (1983)
4. Becchi, C., Rouet, A., Stora, R.: Phys. Lett. B 52, 344 (1974); Ann. Phys. (N.Y.) 98, 287 (1976)
5. Cartan, H., Eilenberg, S.: Homological Algebra. Princeton, NJ: Princeton Univ. Press, 1956
6. Dirac, P.A.M.: “Generalized Hamiltonian Dynamics”. Can. J. Math. 2, 129 (1950); “Generalized Hamiltonian Dynamics”. Proc. Roy. Soc. London A246, 326 (1958); Lectures on Quantum Mechanics. London–New York: Academic Press, 1967
7. Feigin, B.: Uspechi Mat. Nauk 39, 195 (1984)
8. Fradkin, E.S., Vilkovisky, G.A.: Phys. Lett. B 55, 224 (1975)
9. Kostant, B., Sternberg, S.: “Symplectic reduction, BRS cohomology and infinite-dimensional Clifford algebras”. Ann. Phys. 176, 49 (1987)
10. Mac Lane, S.: Homology. Berlin–Heidelberg–New York: Springer, 1995
11. Marsden, J.E., Weinstein, A.: “Reduction of symplectic manifolds with symmetry”. Rep. Math. Phys. 5, 121 (1974)

Communicated by A. Connes
Commun. Math. Phys. 204, 147 – 188 (1999)
Communications in Mathematical Physics
© Springer-Verlag 1999
Discrete Time Lagrangian Mechanics on Lie Groups, with an Application to the Lagrange Top

A. I. Bobenko, Yu. B. Suris

Fachbereich Mathematik, Technische Universität Berlin, Strasse des 17. Juni 136, D-10623 Berlin, Germany.

Received: 15 September 1998 / Accepted: 26 January 1999
Abstract: We develop the theory of discrete time Lagrangian mechanics on Lie groups, originated in the work of Veselov and Moser, and the theory of Lagrangian reduction in the discrete time setting. The results thus obtained are applied to the investigation of an integrable time discretization of a famous integrable system of classical mechanics – the Lagrange top. We recall the derivation of the Euler–Poinsot equations of motion both in the frame moving with the body and in the rest frame (the latter ones being less widely known). We find a discrete time Lagrange function turning into the known continuous time Lagrangian in the continuous limit, and elaborate both descriptions of the resulting discrete time system, namely in the body frame and in the rest frame. This system naturally inherits Poisson properties of the continuous time system, the integrals of motion being deformed. The discrete time Lax representations are also found. Kirchhoff’s kinetic analogy between elastic curves and motions of the Lagrange top is also generalised to the discrete context.
1. Introduction

This paper is devoted to the time discretization of a famous integrable system of classical mechanics – the Lagrange top. This is a special case of rotation of a rigid body around a fixed point in a homogeneous gravitational field, characterized by the following conditions: the rigid body is rotationally symmetric, i.e. two of its three principal moments of inertia coincide, and the fixed point lies on the axis of rotational symmetry. We present a discretization preserving the integrability property, and discuss its rich mechanical and geometrical structure. Notice that until recently [B] only the integrable Euler case of the rigid body motion was discretized preserving integrability [V,MV,BLS]. Consult also [AL,H,DJM,QNCV] for some fundamental early papers on the subject of integrable discretizations, and [BP,S] for reviews of this topic reflecting the viewpoints of the present authors and containing extensive bibliography.
We found the standard presentation of the Lagrange top in mechanical textbooks insufficient in several respects, and therefore chose to write this paper in a pedagogical manner, giving a systematic account of the new results along with the well-known ones represented in a form suitable for our present purposes. The paper is organized as follows. The introduction recalls the classical Euler–Poinsot equations for the motion of the spinning top in the frame moving with the body. Further, we give the less known Euler–Poinsot equations describing the Lagrange top in the rest frame (they cannot be directly generalized to the case of a general top). We finish the introduction by announcing a beautiful time discretization of the latter equations. In order to derive this discretization systematically, we need some formalism of discrete time Lagrangian mechanics on Lie groups. Discrete time Lagrangian mechanics was introduced by Veselov [V,MV], see also [WM], but the case of Lie groups has certain specific features which, in our opinion, were not worked out sufficiently. In particular, a systematic account of the discrete time version of the Lagrangian reduction (which is fairly well understood in the continuous time setting, cf. [MS,HMR]) is lacking. Also, we think that some technical details in [V,MV,WM] could be amended: in working with variational equations these authors systematically use Lagrange multipliers instead of introducing proper notions such as Lie derivatives (specific for Lie groups as opposed to general manifolds). Therefore we give a detailed exposition of the discrete time Lagrangian mechanics on Lie groups in Sect. 3. In order to underline an absolute parallelism of its structure with that of the continuous time Lagrangian mechanics, we included in Sect. 2 also a presentation of (a fragment of) the latter, which is, of course, by no means original.

Section 4 is devoted to a Lagrangian derivation of equations of motion of the Lagrange top, both in the rest and in the body frames. Finally, in Sect. 5 we do the same work for a discrete time Lagrange top. It has to be mentioned that the actual motivation for the present development came from differential geometry, more precisely, from the theory of elastic curves. A brief account of the relations between spinning tops and elastic curves is given in Sect. 6. We also give three appendices. Appendix A is for fixing the notations of Lie group theory. In Appendix B we collect the main results of Sects. 2 and 3 in the form of an easy-to-use table. Finally, Appendix C contains some conventions and simple technical results on a specific Lie group we work with, namely on $SU(2)$. It should be remarked here that our experience with various integrable discretizations convinced us that working with this group has many advantages when compared to the group $SO(3)$, more traditional in this context.

A standard form of equations of motion describing rotation of a rigid body around a fixed point in a homogeneous gravity field is the following:

\[ \begin{cases} \dot M = M \times \Omega(M) + P \times A, \\ \dot P = P \times \Omega(M). \end{cases} \tag{1.1} \]

Here $M = (M_1, M_2, M_3)^T \in \mathbb{R}^3$ is the vector of kinetic momentum of the body, expressed in the so-called body frame. This frame is firmly attached to the body, its origin is in the body's fixed point, and its axes coincide with the principal inertia axes of the body. The inertia tensor of the body in this frame is diagonal:

\[ J = \begin{pmatrix} J_1 & 0 & 0 \\ 0 & J_2 & 0 \\ 0 & 0 & J_3 \end{pmatrix}. \tag{1.2} \]
For the vector $\Omega = \Omega(M)$ of the angular velocity we have:

\[ \Omega = \Omega(M) = J^{-1}M = (J_1^{-1}M_1, J_2^{-1}M_2, J_3^{-1}M_3)^T \in \mathbb{R}^3. \tag{1.3} \]

The vector $P = (P_1, P_2, P_3)^T \in \mathbb{R}^3$ is the unit vector along the gravity field, with respect to the body frame. Finally, $A = (A_1, A_2, A_3)^T \in \mathbb{R}^3$ is the vector pointing from the fixed point to the center of mass of the body. It is a constant vector in the body frame. It is well known that the system (1.1) is Hamiltonian with respect to the Lie–Poisson bracket of the Lie algebra $e(3)$ of the Lie group $E(3)$ of euclidean motions of $\mathbb{R}^3$, i.e. with respect to the Poisson bracket

\[ \{M_i, M_j\} = \varepsilon_{ijk} M_k, \qquad \{M_i, P_j\} = \varepsilon_{ijk} P_k, \qquad \{P_i, P_j\} = 0, \tag{1.4} \]

where $\varepsilon_{ijk}$ is the sign of the permutation $(ijk)$ of $(123)$. The Hamiltonian equations of motion for an arbitrary Hamilton function $H = H(M, P)$ in the bracket (1.4) read:

\[ \begin{cases} \dot M = M \times \nabla_M H + P \times \nabla_P H, \\ \dot P = P \times \nabla_M H, \end{cases} \tag{1.5} \]

which coincides with (1.1) if

\[ H(M, P) = \frac{1}{2} \langle M, \Omega(M) \rangle + \langle P, A \rangle. \tag{1.6} \]

(Here $\langle \cdot, \cdot \rangle$ stands for the standard euclidean scalar product in $\mathbb{R}^3$.) The Poisson bracket (1.4) has two Casimir functions,

\[ C = \langle M, P \rangle \quad \text{and} \quad \langle P, P \rangle, \tag{1.7} \]

which are therefore integrals of motion for (1.1) in involution with $H(M, P)$ (and with any other function on the phase space).

The Lagrange case of the rigid body motion (the Lagrange top, for brevity) is characterized by the following data: $J_1 = J_2$ (which means that the body is rotationally symmetric with respect to the third coordinate axis), and $A_1 = A_2 = 0$ (which means that the fixed point lies on the symmetry axis). Choosing units properly, we may assume that

\[ J_1 = J_2 = 1, \qquad J_3 = \alpha, \qquad A = (0, 0, 1)^T. \tag{1.8} \]

The system (1.1) has in this case an additional integral of motion,

\[ M_3 = \langle M, A \rangle, \tag{1.9} \]

which is also in involution with $H(M, P)$, and therefore assures the complete integrability of the flow (1.1). For an actual integration of this flow in terms of elliptic functions see, e.g., [G,KS], and for a more modern account [RM,Au,CB].

Remarkable as it is, this result is, however, somewhat unsatisfying from the practical point of view. Indeed, one is usually interested in describing the motion of the top in the rest frame, which does not move in the physical space. It is less known that for the Lagrange top the corresponding equations of motion are also very nice and, actually, even somewhat simpler than (1.1):

\[ \begin{cases} \dot m = p \times a, \\ \dot a = m \times a. \end{cases} \tag{1.10} \]
Here $m = (m_1, m_2, m_3)^T \in \mathbb{R}^3$ is the vector of kinetic momentum of the body, expressed in the rest frame, $p$ is the unit vector along the gravity field, also expressed in the rest frame, so that it becomes constant:

\[ p = (0, 0, 1)^T, \tag{1.11} \]

and $a = (a_1, a_2, a_3)^T \in \mathbb{R}^3$ is the vector pointing from the fixed point to the center of mass, expressed in the rest frame. An exterior observer is mainly interested in the motion of the symmetry axis of the top, which is described by the vector $a$. The system (1.10) has several remarkable features. First of all, it does not depend explicitly on the anisotropy parameter $\alpha$ of the inertia tensor. Second, it is Hamiltonian with respect to the Lie–Poisson bracket of $e(3)$:

\[ \{m_i, m_j\} = -\varepsilon_{ijk} m_k, \qquad \{m_i, a_j\} = -\varepsilon_{ijk} a_k, \qquad \{a_i, a_j\} = 0. \tag{1.12} \]

For an arbitrary Hamilton function $H(m, a)$, the Hamiltonian equations of motion in this bracket read:

\[ \begin{cases} \dot m = \nabla_m H \times m + \nabla_a H \times a, \\ \dot a = \nabla_m H \times a. \end{cases} \tag{1.13} \]

These equations coincide with (1.10), if

\[ H_0(m, a) = \frac{1}{2} \langle m, m \rangle + \langle p, a \rangle. \tag{1.14} \]

Of course, the functions

\[ c = \langle m, a \rangle \quad \text{and} \quad \langle a, a \rangle \tag{1.15} \]

are Casimirs of the bracket (1.12), and therefore are integrals of motion for (1.10) in involution with $H_0(m, a)$ (and with any other function on the phase space). An additional integral of motion in involution with $H_0(m, a)$, assuring the complete integrability of the system (1.10), is:

\[ m_3 = \langle m, p \rangle. \tag{1.16} \]
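The conservation statements made so far for the body-frame system (1.1) in the Lagrange case (1.8) and for the rest-frame system (1.10) can be checked pointwise: the time derivative of each claimed integral along the vector field should vanish identically. The following is a minimal sketch of such a check; the function names and the dictionary layout are our own illustrative choices, not notation from the text.

```python
def cross(u, v):
    """Vector product in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Standard euclidean scalar product in R^3."""
    return sum(x * y for x, y in zip(u, v))


def body_frame_rates(M, P, alpha):
    """Time derivatives of H, <M,P>, <P,P>, M3 along (1.1) with the data (1.8)."""
    A = (0.0, 0.0, 1.0)
    Omega = (M[0], M[1], M[2] / alpha)          # Omega = J^{-1} M, J = diag(1, 1, alpha)
    Mdot = tuple(x + y for x, y in zip(cross(M, Omega), cross(P, A)))
    Pdot = cross(P, Omega)
    return {
        "H": dot(Omega, Mdot) + dot(A, Pdot),   # grad_M H = Omega, grad_P H = A
        "<M,P>": dot(Mdot, P) + dot(M, Pdot),
        "<P,P>": 2.0 * dot(P, Pdot),
        "M3": Mdot[2],
    }


def rest_frame_rates(m, a):
    """Time derivatives of H0, <m,a>, <a,a>, m3 along (1.10)."""
    p = (0.0, 0.0, 1.0)
    mdot = cross(p, a)
    adot = cross(m, a)
    return {
        "H0": dot(m, mdot) + dot(p, adot),      # grad_m H0 = m, grad_a H0 = p
        "<m,a>": dot(mdot, a) + dot(m, adot),
        "<a,a>": 2.0 * dot(a, adot),
        "m3": dot(mdot, p),
    }
```

For example, `body_frame_rates((0.3, -1.2, 0.7), (0.6, 0.0, 0.8), 2.5)` returns four numbers that vanish up to rounding. Note that the rest-frame check never uses $\alpha$, in line with the remark that (1.10) does not depend explicitly on the anisotropy parameter.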
In the main text we give a Lagrangian derivation of equations of motion (1.1) and (1.10) and an explanation of their Hamiltonian nature and integrability. Then we present a discrete Lagrangian function generating two maps approximating (1.1) and (1.10), respectively. Most beautiful is the discretization of (1.10):

\[ \begin{cases} m_{k+1} - m_k = \varepsilon\, p \times a_k, \\ a_{k+1} - a_k = \dfrac{\varepsilon}{2}\, m_{k+1} \times (a_k + a_{k+1}). \end{cases} \tag{1.17} \]

It is easy to see that the second equation in (1.17) can be uniquely solved for $a_{k+1}$, so that (1.17) defines a map $(m_k, a_k) \mapsto (m_{k+1}, a_{k+1})$, approximating, for small $\varepsilon$, the time $\varepsilon$ shift along the trajectories of (1.10). This distinguishes the situation from the one in [MV] where Lagrangian equations led to correspondences rather than to maps. We shall demonstrate that the map (1.17) is Poisson with respect to the bracket (1.12), so that the Casimirs (1.15) are integrals of motion. It is also obvious that (1.16) is an integral
Discrete Lagrange Top
151
of motion. Most remarkably, this map has another integral of motion – an analog of the Hamiltonian: ε 1 (1.18) Hε (m, a) = hm, mi + ha, pi + ha × m, pi. 2 2 The function (1.18) is in involution with (1.16), which renders the map (1.17) completely integrable. A similar discretization for the equations of motion in the body frame (1.1) is slightly less elegant. 2. Lagrangian Mechanics on T G (Continuous Time Case) Let L(g, g) ˙ : T G 7 → R be a smooth function on the tangent bundle of the Lie group G, called the Lagrange function. For an arbitrary function g(t) : [t0 , t1 ] 7 → G one can consider the action functional Z t1 L(g(t), g(t))dt. ˙ (2.1) S= t0
A standard argument shows that the functions $g(t)$ yielding extrema of this functional (in the class of variations preserving $g(t_0)$ and $g(t_1)$) necessarily satisfy the Euler–Lagrange equations. In local coordinates $\{g^i\}$ on $G$ they read:
\[ \frac{d}{dt}\frac{\partial L}{\partial \dot g^i} = \frac{\partial L}{\partial g^i}. \tag{2.2} \]
The action functional $S$ is independent of the choice of local coordinates, and thus the Euler–Lagrange equations are actually coordinate independent as well. For a coordinate-free description in the language of differential geometry, see [A,MR]. Introducing the quantities$^1$
\[ \Pi = \nabla_{\dot g} L \in T_g^* G, \tag{2.3} \]
one defines the Legendre transformation:
\[ (g, \dot g) \in TG \mapsto (g, \Pi) \in T^*G. \tag{2.4} \]
If it is invertible, i.e. if $\dot g$ can be expressed through $(g, \Pi)$, then the Legendre transformation of the Euler–Lagrange equations (2.2) yields a Hamiltonian system on $T^*G$ with respect to the standard symplectic structure on $T^*G$ and with the Hamilton function
\[ H(g, \Pi) = \langle \Pi, \dot g\rangle - L(g, \dot g) \tag{2.5} \]
(where, of course, $\dot g$ has to be expressed through $(g, \Pi)$).

Finally, we want to mention the Noether construction for deriving the existence of integrals of motion of the Euler–Lagrange equations from the symmetry groups of the Lagrange function. We shall formulate the simplest form of Noether's theorem, where Lagrange functions are invariant under the action of one-dimensional groups. Let $\zeta \in \mathfrak{g}$ be a fixed element, and consider the one-parameter subgroup
\[ G^{(\zeta)} = \{ e^{c\zeta} : c \in \mathbb{R} \} \subset G. \tag{2.6} \]

$^1$ For the notation from Lie group theory used in this and subsequent sections, see Appendix A.
Proposition 1. a) Let the Lagrange function be invariant under the action of $G^{(\zeta)}$ on $TG$ induced by left translations on $G$:
\[ L(e^{c\zeta} g, L_{e^{c\zeta}*}\dot g) = L(g, \dot g). \tag{2.7} \]
Then the following function is an integral of motion of the Euler–Lagrange equations:
\[ \langle R_g^*(\nabla_{\dot g} L), \zeta\rangle = \langle R_g^* \Pi, \zeta\rangle. \tag{2.8} \]
b) Let the Lagrange function be invariant under the action of $G^{(\zeta)}$ on $TG$ induced by right translations on $G$:
\[ L(g e^{c\zeta}, R_{e^{c\zeta}*}\dot g) = L(g, \dot g). \tag{2.9} \]
Then the following function is an integral of motion of the Euler–Lagrange equations:
\[ \langle L_g^*(\nabla_{\dot g} L), \zeta\rangle = \langle L_g^* \Pi, \zeta\rangle. \tag{2.10} \]

Proof. Differentiate (2.7) (or (2.9)) with respect to $c$, set $c = 0$, and use the Euler–Lagrange equations. $\square$

For a detailed proof of the general version of the Noether theorem see [A,MR]. In practice, it turns out to be more convenient to work not with the tangent bundle $TG$ but with its trivializations $G \times \mathfrak{g}$, which are obtained by translating the vector $\dot g \in T_g G$ into the group unit via left or right translations.

2.1. Left trivialization. Consider the trivialization map
\[ (g, \Omega) \in G \times \mathfrak{g} \mapsto (g, \dot g) \in TG, \tag{2.11} \]
where
\[ \dot g = L_{g*}\Omega \quad\Leftrightarrow\quad \Omega = L_{g^{-1}*}\dot g. \tag{2.12} \]
The trivialization (2.11) of the tangent bundle $TG$ induces the following trivialization of the cotangent bundle $T^*G$:
\[ (g, M) \in G \times \mathfrak{g}^* \mapsto (g, \Pi) \in T^*G, \tag{2.13} \]
where
\[ \Pi = L^*_{g^{-1}} M \quad\Leftrightarrow\quad M = L_g^* \Pi. \tag{2.14} \]
Denote the pull-back of the Lagrange function through
\[ L^{(l)}(g, \Omega) = L(g, \dot g), \tag{2.15} \]
so that $L^{(l)}(g, \Omega) : G \times \mathfrak{g} \mapsto \mathbb{R}$.
We want to find differential equations satisfied by those functions $(g(t), \Omega(t)) : [t_0, t_1] \mapsto G \times \mathfrak{g}$ delivering extrema of the action functional
\[ S^{(l)} = \int_{t_0}^{t_1} L^{(l)}(g(t), \Omega(t))\, dt \]
and such that $\Omega(t) = L_{g^{-1}(t)*}\dot g(t)$. Admissible variations of $(g(t), \Omega(t))$ are those preserving the latter equality and the values $g(t_0)$, $g(t_1)$.

Proposition 2. The differential equations for extremals of the functional $S^{(l)}$ read:
\[ \begin{cases} \dot M = \mathrm{ad}^*\Omega \cdot M + d'_g L^{(l)}, \\ \dot g = L_{g*}\Omega, \end{cases} \tag{2.16} \]
where$^2$
\[ M = \nabla_\Omega L^{(l)} \in \mathfrak{g}^*. \tag{2.17} \]
If the Legendre transformation
\[ (g, \Omega) \in G \times \mathfrak{g} \mapsto (g, M) \in G \times \mathfrak{g}^* \tag{2.18} \]
is invertible, it turns (2.16) into a Hamiltonian form on $G \times \mathfrak{g}^*$ with the Hamilton function
\[ H(g, M) = \langle M, \Omega\rangle - L^{(l)}(g, \Omega) \tag{2.19} \]
(where, of course, $\Omega$ has to be expressed through $(g, M)$); the underlying invariant Poisson bracket on $G \times \mathfrak{g}^*$ is the pull-back of the standard symplectic bracket on $T^*G$, so that for two arbitrary functions $f_{1,2}(g, M) : G \times \mathfrak{g}^* \mapsto \mathbb{R}$ we have:
\[ \{f_1, f_2\} = -\langle d'_g f_1, \nabla_M f_2\rangle + \langle d'_g f_2, \nabla_M f_1\rangle + \langle M, [\nabla_M f_1, \nabla_M f_2]\rangle. \tag{2.20} \]

Proof. The equations of motion (2.16) may be derived by pulling back Eqs. (2.2) under the trivialization map (2.11), but it is somewhat simpler to derive them independently. To this end, consider the admissible variations of $(g(t), \Omega(t))$ in the form
\[ g(t, \epsilon) = g(t)\, e^{\epsilon\eta(t)}, \]
where $\eta(t) : [t_0, t_1] \mapsto \mathfrak{g}$, $\eta(t_0) = \eta(t_1) = 0$, and
\[ \Omega(t, \epsilon) = L_{g^{-1}(t,\epsilon)*}\dot g(t, \epsilon) = \mathrm{Ad}\, e^{-\epsilon\eta(t)} \cdot \Omega(t) + \epsilon\dot\eta(t) + O(\epsilon^2) = \Omega(t) + \epsilon\dot\eta(t) + \epsilon[\Omega(t), \eta(t)] + O(\epsilon^2). \]

$^2$ Recall (see Appendix A) that for an arbitrary smooth function $f : G \mapsto \mathbb{R}$ its right Lie derivative $d'f$ and left Lie derivative $df$ are functions from $G$ into $\mathfrak{g}^*$ defined via the formulas
\[ \langle df(g), \eta\rangle = \frac{d}{d\epsilon} f(e^{\epsilon\eta} g)\Big|_{\epsilon=0}, \qquad \langle d'f(g), \eta\rangle = \frac{d}{d\epsilon} f(g e^{\epsilon\eta})\Big|_{\epsilon=0}, \qquad \forall \eta \in \mathfrak{g}. \]
So, equating the variation of the action to zero, we get:
\[ 0 = \frac{dS^{(l)}}{d\epsilon}\Big|_{\epsilon=0} = \int_{t_0}^{t_1} \Big( \langle d'_g L^{(l)}, \eta\rangle + \langle \nabla_\Omega L^{(l)}, \dot\eta + \mathrm{ad}\,\Omega \cdot \eta\rangle \Big)\, dt. \]
Integrating the term with $\dot\eta$ by parts and taking into account $\eta(t_0) = \eta(t_1) = 0$, we come to:
\[ \int_{t_0}^{t_1} \Big\langle d'_g L^{(l)} + \mathrm{ad}^*\Omega \cdot \nabla_\Omega L^{(l)} - \frac{d}{dt}(\nabla_\Omega L^{(l)}),\ \eta \Big\rangle\, dt = 0. \]
Due to the arbitrariness of $\eta(t)$ the following equation holds:
\[ \frac{d}{dt}(\nabla_\Omega L^{(l)}) = \mathrm{ad}^*\Omega \cdot \nabla_\Omega L^{(l)} + d'_g L^{(l)}. \]
It remains to notice that $M$ defined by (2.14), (2.3), i.e. $M = L_g^* \nabla_{\dot g} L$, coincides with (2.17), as follows from (2.12). $\square$

Remark 1. In the case when $L(g, \dot g)$ is left $G$-invariant, i.e. $L^{(l)}(g, \Omega)$ does not actually depend on $g$, the first equation in the system (2.16) becomes the standard (left) Lie–Poisson equation $\dot M = \mathrm{ad}^*\Omega \cdot M$, see e.g. [MR].

Remark 2. Variations of the angular velocity of the form $\dot\eta + [\Omega, \eta]$, used in the above proof, are standard in the theory of Euler–Poincaré equations, cf. [MS,MRW,HMR].

We now observe what Noether's theorem (more exactly, its version in Proposition 2.1) yields under left trivialization.

Proposition 3. a) Let the Lagrange function $L^{(l)}(g, \Omega)$ be invariant under the action of $G^{(\zeta)}$ on $G \times \mathfrak{g}$ induced by left translations on $G$:
\[ L^{(l)}(e^{c\zeta} g, \Omega) = L^{(l)}(g, \Omega). \tag{2.21} \]
Then the following function is an integral of motion of the Euler–Lagrange equations:
\[ \langle \mathrm{Ad}^* g^{-1} \cdot \nabla_\Omega L^{(l)}, \zeta\rangle = \langle M, \mathrm{Ad}\, g^{-1} \cdot \zeta\rangle. \tag{2.22} \]
b) Let the Lagrange function $L^{(l)}(g, \Omega)$ be invariant under the action of $G^{(\zeta)}$ on $G \times \mathfrak{g}$ induced by right translations on $G$:
\[ L^{(l)}(g e^{c\zeta}, \mathrm{Ad}\, e^{-c\zeta} \cdot \Omega) = L^{(l)}(g, \Omega). \tag{2.23} \]
Then the following function is an integral of motion of the Euler–Lagrange equations:
\[ \langle \nabla_\Omega L^{(l)}, \zeta\rangle = \langle M, \zeta\rangle. \tag{2.24} \]
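We finish this subsection by discussing the reduction procedure relevant for later applications. Let us assume that a condition somewhat stronger than (2.21) holds, namely, that the function $L^{(l)}$ is left invariant under the action of a subgroup somewhat larger than $G^{(\zeta)}$. Fix an element $\zeta \in \mathfrak{g}$, and consider the isotropy subgroup $G^{[\zeta]}$ of $\zeta$ with respect to the adjoint action of $G$, i.e.
\[ G^{[\zeta]} = \{ h : \mathrm{Ad}\, h \cdot \zeta = \zeta \} \subset G. \tag{2.25} \]
Obviously, $G^{(\zeta)} \subset G^{[\zeta]}$. Suppose that the Lagrange function $L^{(l)}(g, \Omega)$ is invariant under the action of $G^{[\zeta]}$ on $G \times \mathfrak{g}$ induced by left translations on $G$:
\[ L^{(l)}(hg, \Omega) = L^{(l)}(g, \Omega), \qquad h \in G^{[\zeta]}. \tag{2.26} \]
We want to reduce the Euler–Lagrange equations with respect to this action. As a section $(G \times \mathfrak{g})/G^{[\zeta]}$ we choose the set $\mathfrak{g}_\zeta \times \mathfrak{g}$, where $\mathfrak{g}_\zeta$ is the orbit of $\zeta$ under the adjoint action of $G$:
\[ \mathfrak{g}_\zeta = \{ \mathrm{Ad}\, g \cdot \zeta,\ g \in G \} \subset \mathfrak{g}. \tag{2.27} \]
We define the reduced Lagrange function $\mathcal{L}^{(l)} : \mathfrak{g}_\zeta \times \mathfrak{g} \mapsto \mathbb{R}$ as
\[ \mathcal{L}^{(l)}(P, \Omega) = L^{(l)}(g, \Omega), \quad\text{where}\quad P = \mathrm{Ad}\, g^{-1} \cdot \zeta. \tag{2.28} \]
This definition is correct, because from $P = \mathrm{Ad}\, g_1^{-1} \cdot \zeta = \mathrm{Ad}\, g_2^{-1} \cdot \zeta$ there follows $\mathrm{Ad}\, g_2 g_1^{-1} \cdot \zeta = \zeta$, so that $g_2 g_1^{-1} \in G^{[\zeta]}$, and $L^{(l)}(g_1, \Omega) = L^{(l)}(g_2, \Omega)$.

Proposition 4. Consider the reduction $(g, \Omega) \mapsto (P, \Omega)$. The reduced Euler–Lagrange equations (2.16) read:
\[ \begin{cases} \dot M = \mathrm{ad}^*\Omega \cdot M + \mathrm{ad}^* P \cdot \nabla_P \mathcal{L}^{(l)}, \\ \dot P = [P, \Omega], \end{cases} \tag{2.29} \]
where
\[ M = \nabla_\Omega \mathcal{L}^{(l)} \in \mathfrak{g}^*. \tag{2.30} \]
If the Legendre transformation
\[ (P, \Omega) \in \mathfrak{g}_\zeta \times \mathfrak{g} \mapsto (P, M) \in \mathfrak{g}_\zeta \times \mathfrak{g}^* \tag{2.31} \]
is invertible, it turns (2.29) into a Hamiltonian system on $\mathfrak{g}_\zeta \times \mathfrak{g}^*$, with the Hamilton function
\[ H(P, M) = \langle M, \Omega\rangle - \mathcal{L}^{(l)}(P, \Omega), \tag{2.32} \]
where $\Omega$ has to be expressed through $(P, M)$; the underlying invariant Poisson structure on $\mathfrak{g}_\zeta \times \mathfrak{g}^*$ is given by the following formula:
\[ \{F_1, F_2\} = -\langle \nabla_P F_1, [P, \nabla_M F_2]\rangle + \langle \nabla_P F_2, [P, \nabla_M F_1]\rangle + \langle M, [\nabla_M F_1, \nabla_M F_2]\rangle \tag{2.33} \]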
for two arbitrary functions $F_{1,2}(P, M) : \mathfrak{g}_\zeta \times \mathfrak{g}^* \mapsto \mathbb{R}$. (This formula indeed defines a Poisson bracket on all of $\mathfrak{g} \times \mathfrak{g}^*$.) In addition to the integral of motion (2.32), the equations of motion (2.29) always have the following integral of motion:
\[ C = \langle M, P\rangle. \tag{2.34} \]
This function is a Casimir of the bracket (2.33).

Proof. The proof is a consequence of the following formulas:
\[ d'_g L^{(l)} = \mathrm{ad}^* P \cdot \nabla_P \mathcal{L}^{(l)}, \qquad \nabla_\Omega L^{(l)} = \nabla_\Omega \mathcal{L}^{(l)}, \]
which are easy to derive from the definitions, and similar formulas connecting the Lie derivatives of $f_{1,2}$ with the gradients of $F_{1,2}$. $\square$

2.2. Right trivialization. All constructions in this subsection are completely parallel to those of the previous one; therefore we restrict ourselves to the formulations and omit all proofs. Consider the trivialization map
\[ (g, \omega) \in G \times \mathfrak{g} \mapsto (g, \dot g) \in TG, \tag{2.35} \]
where
\[ \dot g = R_{g*}\omega \quad\Leftrightarrow\quad \omega = R_{g^{-1}*}\dot g. \tag{2.36} \]
This trivialization of the tangent bundle $TG$ induces the following trivialization of the cotangent bundle $T^*G$:
\[ (g, m) \in G \times \mathfrak{g}^* \mapsto (g, \Pi) \in T^*G, \tag{2.37} \]
where
\[ \Pi = R^*_{g^{-1}} m \quad\Leftrightarrow\quad m = R_g^* \Pi. \tag{2.38} \]
The pull-back of the Lagrange function is denoted through
\[ L^{(r)}(g, \omega) = L(g, \dot g). \tag{2.39} \]

Proposition 5. The differential equations for the extremals of the functional
\[ S^{(r)} = \int_{t_0}^{t_1} L^{(r)}(g(t), \omega(t))\, dt \]
read:
\[ \begin{cases} \dot m = -\mathrm{ad}^*\omega \cdot m + d_g L^{(r)}, \\ \dot g = R_{g*}\omega, \end{cases} \tag{2.40} \]
where
\[ m = \nabla_\omega L^{(r)} \in \mathfrak{g}^*. \tag{2.41} \]
If the Legendre transformation
\[ (g, \omega) \in G \times \mathfrak{g} \mapsto (g, m) \in G \times \mathfrak{g}^* \tag{2.42} \]
is invertible, it turns (2.40) into a Hamiltonian form on $G \times \mathfrak{g}^*$ with the Hamilton function
\[ H(g, m) = \langle m, \omega\rangle - L^{(r)}(g, \omega), \tag{2.43} \]
where $\omega$ has to be expressed through $(g, m)$; the underlying invariant Poisson bracket on $G \times \mathfrak{g}^*$ is the pull-back of the standard symplectic bracket on $T^*G$, so that for two arbitrary functions $f_{1,2}(g, m) : G \times \mathfrak{g}^* \mapsto \mathbb{R}$ we have:
\[ \{f_1, f_2\} = -\langle d_g f_1, \nabla_m f_2\rangle + \langle d_g f_2, \nabla_m f_1\rangle - \langle m, [\nabla_m f_1, \nabla_m f_2]\rangle. \tag{2.44} \]

Remark 3. In the case when $L(g, \dot g)$ is right $G$-invariant, i.e. $L^{(r)}(g, \omega)$ does not depend on $g$, the first equation in the system (2.40) becomes the standard (right) Lie–Poisson equation $\dot m = -\mathrm{ad}^*\omega \cdot m$.

A version of Noether's theorem takes the following form:

Proposition 6. a) Let the Lagrange function $L^{(r)}(g, \omega)$ be invariant under the action of $G^{(\zeta)}$ on $G \times \mathfrak{g}$ induced by left translations on $G$:
\[ L^{(r)}(e^{c\zeta} g, \mathrm{Ad}\, e^{c\zeta} \cdot \omega) = L^{(r)}(g, \omega). \tag{2.45} \]
Then the following function is an integral of motion of the Euler–Lagrange equations:
\[ \langle \nabla_\omega L^{(r)}, \zeta\rangle = \langle m, \zeta\rangle. \tag{2.46} \]
b) Let the Lagrange function $L^{(r)}(g, \omega)$ be invariant under the action of $G^{(\zeta)}$ on $G \times \mathfrak{g}$ induced by right translations on $G$:
\[ L^{(r)}(g e^{c\zeta}, \omega) = L^{(r)}(g, \omega). \tag{2.47} \]
Then the following function is an integral of motion of the Euler–Lagrange equations:
\[ \langle \mathrm{Ad}^* g \cdot \nabla_\omega L^{(r)}, \zeta\rangle = \langle m, \mathrm{Ad}\, g \cdot \zeta\rangle. \tag{2.48} \]

Turning to the reduction procedure, suppose that the Lagrange function $L^{(r)}(g, \omega)$ is invariant under the action of $G^{[\zeta]}$ on $G \times \mathfrak{g}$ induced by right translations on $G$:
\[ L^{(r)}(gh, \omega) = L^{(r)}(g, \omega), \qquad h \in G^{[\zeta]}. \tag{2.49} \]
We define the reduced Lagrange function $\mathcal{L}^{(r)} : \mathfrak{g}_\zeta \times \mathfrak{g} \mapsto \mathbb{R}$ as
\[ \mathcal{L}^{(r)}(a, \omega) = L^{(r)}(g, \omega), \quad\text{where}\quad a = \mathrm{Ad}\, g \cdot \zeta. \tag{2.50} \]
Proposition 7. Consider the reduction $(g, \omega) \mapsto (a, \omega)$. The reduced Euler–Lagrange equations (2.40) read:
\[ \begin{cases} \dot m = -\mathrm{ad}^*\omega \cdot m - \mathrm{ad}^* a \cdot \nabla_a \mathcal{L}^{(r)}, \\ \dot a = [\omega, a], \end{cases} \tag{2.51} \]
where
\[ m = \nabla_\omega \mathcal{L}^{(r)} \in \mathfrak{g}^*. \tag{2.52} \]
If the Legendre transformation
\[ (a, \omega) \in \mathfrak{g}_\zeta \times \mathfrak{g} \mapsto (a, m) \in \mathfrak{g}_\zeta \times \mathfrak{g}^* \tag{2.53} \]
is invertible, it turns (2.51) into a Hamiltonian system on $\mathfrak{g}_\zeta \times \mathfrak{g}^*$, with the Hamilton function
\[ H(a, m) = \langle m, \omega\rangle - \mathcal{L}^{(r)}(a, \omega), \tag{2.54} \]
where $\omega$ has to be expressed through $(a, m)$; the underlying invariant Poisson structure on $\mathfrak{g}_\zeta \times \mathfrak{g}^*$ is given by the following formula:
\[ \{F_1, F_2\} = \langle \nabla_a F_1, [a, \nabla_m F_2]\rangle - \langle \nabla_a F_2, [a, \nabla_m F_1]\rangle - \langle m, [\nabla_m F_1, \nabla_m F_2]\rangle \tag{2.55} \]
for two arbitrary functions $F_{1,2}(a, m) : \mathfrak{g}_\zeta \times \mathfrak{g}^* \mapsto \mathbb{R}$. (This formula indeed defines a Poisson bracket on all of $\mathfrak{g} \times \mathfrak{g}^*$.) In addition to the integral of motion (2.54), the equations of motion (2.51) always have the following integral of motion:
\[ c = \langle m, a\rangle. \tag{2.56} \]
This function is a Casimir of the bracket (2.55). Notice that the brackets (2.33) and (2.55) essentially coincide (they differ only by a sign).

Remark 4. For future reference notice that the elements $\Omega, \omega \in \mathfrak{g}$ and $M, m \in \mathfrak{g}^*$ are related via the formulas
\[ \Omega = \mathrm{Ad}\, g^{-1} \cdot \omega, \tag{2.57} \]
\[ M = \mathrm{Ad}^* g \cdot m. \tag{2.58} \]

3. Lagrangian Mechanics on G × G (Discrete Time Case)

We now turn to the discrete time analog of the previous constructions, introduced in [V, MV]. Our presentation is an adaptation of the Moser–Veselov construction for the case when the basic manifold is a Lie group. We shall see that almost all constructions of the previous section have their discrete time analogs. The only exception is the existence of the "energy" integral (2.5).

Let $L(g, \hat g) : G \times G \mapsto \mathbb{R}$ be a smooth function, called the (discrete time) Lagrange function. For an arbitrary sequence $\{g_k \in G,\ k = k_0, k_0 + 1, \ldots, k_1\}$ one can consider the action functional
\[ S = \sum_{k=k_0}^{k_1 - 1} L(g_k, g_{k+1}). \tag{3.1} \]
Obviously, the sequences $\{g_k\}$ delivering extrema of this functional (in the class of variations preserving $g_{k_0}$ and $g_{k_1}$) necessarily satisfy the discrete Euler–Lagrange equations:
\[ \nabla_1 L(g_k, g_{k+1}) + \nabla_2 L(g_{k-1}, g_k) = 0. \tag{3.2} \]
Here $\nabla_1 L(g, \hat g)$ ($\nabla_2 L(g, \hat g)$) denotes the gradient of $L(g, \hat g)$ with respect to the first argument $g$ (resp. the second argument $\hat g$). Recall (see Appendix A) that for an arbitrary smooth function $f : G \mapsto \mathbb{R}$ its gradient is defined as
\[ \nabla f(g) = R^*_{g^{-1}}\, df(g) = L^*_{g^{-1}}\, d'f(g). \]
So, in our case, when $G$ is a Lie group and not just a general smooth manifold, Eq. (3.2) is written in a coordinate-free form, using the intrinsic notions of Lie theory. An invariant formulation of the Euler–Lagrange equations in the continuous time case is more sophisticated, see, e.g., [MR]. (Notice that (2.2) are written in local coordinates.) This fact seems to underline the fundamental character of the discrete Euler–Lagrange equations.

Equation (3.2) is an implicit equation for $g_{k+1}$. In general, it has more than one solution, and therefore defines a correspondence (multi-valued map) $(g_{k-1}, g_k) \mapsto (g_k, g_{k+1})$. To discuss symplectic properties of this correspondence, one defines:
\[ \Pi_k = \nabla_2 L(g_{k-1}, g_k) \in T^*_{g_k} G. \tag{3.3} \]
Then (3.2) may be rewritten as the following system:
\[ \begin{cases} \Pi_k = -\nabla_1 L(g_k, g_{k+1}), \\ \Pi_{k+1} = \nabla_2 L(g_k, g_{k+1}). \end{cases} \tag{3.4} \]
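To make the structure of (3.4) concrete, here is a minimal sketch for the simplest possible case, the commutative group $G = (\mathbb{R}, +)$. The Lagrangian $L(x, y) = (y - x)^2/(2h) - hV(x)$, the step $h$ and the potential $V$ are assumptions chosen for illustration and do not come from the text. In this case the first equation of (3.4) can be solved explicitly for the next point, and the area-preserving (symplectic) property of the resulting map can be checked through its Jacobian determinant.

```python
import math

# Discrete Euler-Lagrange system (3.4) for the commutative group G = (R, +).
# Assumed example Lagrangian: L(x, y) = (y - x)**2 / (2*h) - h * V(x).

def make_step(h, dV):
    """Return the map (x_k, p_k) -> (x_{k+1}, p_{k+1}) defined by (3.4)."""
    def step(x, p):
        # First equation of (3.4): p_k = -dL/dx = (y - x)/h + h*dV(x),
        # solved for y = x_{k+1}.
        y = x + h * (p - h * dV(x))
        # Second equation of (3.4): p_{k+1} = dL/dy = (y - x)/h.
        p_next = (y - x) / h
        return y, p_next
    return step

def jacobian_det(step, x, p, d=1e-6):
    """Central-difference Jacobian determinant of the map at (x, p)."""
    xp, pp = step(x + d, p)
    xm, pm = step(x - d, p)
    xq, pq = step(x, p + d)
    xr, pr = step(x, p - d)
    dxdx, dpdx = (xp - xm) / (2 * d), (pp - pm) / (2 * d)
    dxdp, dpdp = (xq - xr) / (2 * d), (pq - pr) / (2 * d)
    return dxdx * dpdp - dxdp * dpdx

step = make_step(0.1, math.sin)      # pendulum-type potential V(x) = -cos(x)
det = jacobian_det(step, 0.4, -0.3)  # equals 1 up to discretization error
```

In this toy case the correspondence is single-valued, because the first equation of (3.4) is linear in $x_{k+1}$; on a general Lie group it is only an implicit equation.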
This system defines a (multivalued) map $(g_k, \Pi_k) \mapsto (g_{k+1}, \Pi_{k+1})$ of $T^*G$ into itself. More precisely, the first equation in (3.4) is an implicit equation for $g_{k+1}$, while the second one allows for the explicit and unique calculation of $\Pi_{k+1}$, knowing $g_k$ and $g_{k+1}$. As demonstrated in [V,MV], this map $T^*G \mapsto T^*G$ is symplectic with respect to the standard symplectic structure on $T^*G$.

For the discrete Euler–Lagrange equations there holds an analog of Noether's theorem. Again, we give only the simplest version thereof.

Proposition 8. a) Let the Lagrange function be invariant under the action of $G^{(\zeta)}$ on $G \times G$ induced by left translations on $G$:
\[ L(e^{c\zeta} g, e^{c\zeta} \hat g) = L(g, \hat g). \tag{3.5} \]
Then the following function is an integral of motion of the discrete Euler–Lagrange equations:
\[ \langle d_2 L(g_{k-1}, g_k), \zeta\rangle = \langle R^*_{g_k} \Pi_k, \zeta\rangle. \tag{3.6} \]
b) Let the Lagrange function be invariant under the action of $G^{(\zeta)}$ on $G \times G$ induced by right translations on $G$:
\[ L(g e^{c\zeta}, \hat g e^{c\zeta}) = L(g, \hat g). \tag{3.7} \]
Then the following function is an integral of motion of the Euler–Lagrange equations:
\[ \langle -d'_1 L(g_k, g_{k+1}), \zeta\rangle = \langle L^*_{g_k} \Pi_k, \zeta\rangle. \tag{3.8} \]
Proof. Since both statements are proved similarly, we restrict ourselves to the proof of the first one. To this end differentiate (3.5) with respect to $c$ and set $c = 0$. Writing $(g_k, g_{k+1})$ for $(g, \hat g)$, we get:
\[ \langle d_1 L(g_k, g_{k+1}), \zeta\rangle + \langle d_2 L(g_k, g_{k+1}), \zeta\rangle = 0. \]
But the discrete Euler–Lagrange equations imply that $d_1 L(g_k, g_{k+1}) = -d_2 L(g_{k-1}, g_k)$. Hence
\[ \langle d_2 L(g_k, g_{k+1}), \zeta\rangle = \langle d_2 L(g_{k-1}, g_k), \zeta\rangle, \]
and the statement is proved. $\square$

Notice that the expressions of the Noether integrals in terms of $(g, \Pi)$ are exactly the same as in the continuous time case.

3.1. Left trivialization. Actually, the tangent bundle $TG$ does not appear in the discrete time context at all. We shall see that the analogs of the "angular velocities" $\Omega$, $\omega$ live not in $T_g G$ but in $G$ itself. On the contrary, the cotangent bundle $T^*G$ still plays an important role in the discrete time theory, and it is still convenient to trivialize it. This subsection is devoted to the constructions related to the left trivialization. Consider the map
\[ (g_k, W_k) \in G \times G \mapsto (g_k, g_{k+1}) \in G \times G, \tag{3.9} \]
where
\[ g_{k+1} = g_k W_k \quad\Leftrightarrow\quad W_k = g_k^{-1} g_{k+1}. \tag{3.10} \]
The group element $W_k$ is an analog of the left angular velocity $\Omega$ from (2.12). In the continuous limit $W_k$ lies in a neighborhood of the group unity $e$; more precisely, it approximates $e^{\varepsilon\Omega}$. Consider also the left trivialization of the cotangent bundle $T^*G$:
\[ (g_k, M_k) \in G \times \mathfrak{g}^* \mapsto (g_k, \Pi_k) \in T^*G, \tag{3.11} \]
where
\[ \Pi_k = L^*_{g_k^{-1}} M_k \quad\Leftrightarrow\quad M_k = L^*_{g_k} \Pi_k. \tag{3.12} \]
Denote the pull-back of the Lagrange function under (3.9) through
\[ L^{(l)}(g_k, W_k) = L(g_k, g_{k+1}). \tag{3.13} \]
We want to find difference equations satisfied by the sequences $\{(g_k, W_k),\ k = k_0, \ldots, k_1 - 1\}$ delivering extrema of the action functional
\[ S^{(l)} = \sum_{k=k_0}^{k_1 - 1} L^{(l)}(g_k, W_k), \]
and satisfying $W_k = g_k^{-1} g_{k+1}$. Admissible variations of $\{(g_k, W_k)\}$ are those preserving the values of $g_{k_0}$ and $g_{k_1} = g_{k_1 - 1} W_{k_1 - 1}$.
Proposition 9. The difference equations for extremals of the functional $S^{(l)}$ read:
\[ \begin{cases} \mathrm{Ad}^* W_k^{-1} \cdot M_{k+1} = M_k + d'_g L^{(l)}(g_k, W_k), \\ g_{k+1} = g_k W_k, \end{cases} \tag{3.14} \]
where
\[ M_k = d'_W L^{(l)}(g_{k-1}, W_{k-1}) \in \mathfrak{g}^*. \tag{3.15} \]
If the "Legendre transformation"
\[ (g_{k-1}, W_{k-1}) \in G \times G \mapsto (g_k, M_k) \in G \times \mathfrak{g}^*, \tag{3.16} \]
where $g_k = g_{k-1} W_{k-1}$, is invertible, then (3.14) defines a map $(g_k, M_k) \mapsto (g_{k+1}, M_{k+1})$ which is symplectic with respect to the Poisson bracket (2.20) on $G \times \mathfrak{g}^*$.

Proof. The simplest way to derive (3.14) is to pull back Eqs. (3.2) under the map (3.9). To do this, first rewrite (3.2) as
\[ d'_1 L(g_k, g_{k+1}) + d'_2 L(g_{k-1}, g_k) = 0. \tag{3.17} \]
We have to express these Lie derivatives in terms of $(g, W)$. The answer is this:
\[ d'_2 L(g_{k-1}, g_k) = d'_W L^{(l)}(g_{k-1}, W_{k-1}), \tag{3.18} \]
\[ d'_1 L(g_k, g_{k+1}) = d'_g L^{(l)}(g_k, W_k) - d_W L^{(l)}(g_k, W_k). \tag{3.19} \]
Indeed, let us prove, for example, the (less obvious) formula (3.19). We have:
\[ \langle d'_1 L(g_k, g_{k+1}), \eta\rangle = \frac{d}{d\epsilon} L(g_k e^{\epsilon\eta}, g_{k+1})\Big|_{\epsilon=0} = \frac{d}{d\epsilon} L^{(l)}(g_k e^{\epsilon\eta}, e^{-\epsilon\eta} W_k)\Big|_{\epsilon=0} = \langle d'_g L^{(l)}(g_k, W_k), \eta\rangle - \langle d_W L^{(l)}(g_k, W_k), \eta\rangle. \]
It remains to substitute (3.18), (3.19) into (3.17). Taking into account that, by the general relation between left and right Lie derivatives,
\[ d_W L^{(l)}(g_k, W_k) = \mathrm{Ad}^* W_k^{-1} \cdot d'_W L^{(l)}(g_k, W_k) = \mathrm{Ad}^* W_k^{-1} \cdot M_{k+1}, \]
we find (3.14). Finally, notice that the notation (3.15) is consistent with the definitions (3.3), (3.12). Indeed, from these definitions it follows that $M_k = d'_2 L(g_{k-1}, g_k)$, and the reference to (3.18) finishes the proof. $\square$

We now observe what the discrete time version of the Noether theorem from Proposition 3.1 yields under left trivialization.

Proposition 10. a) Let the Lagrange function $L^{(l)}(g, W)$ be invariant under the action of $G^{(\zeta)}$ on $G \times G$ induced by left translations on $G$:
\[ L^{(l)}(e^{c\zeta} g, W) = L^{(l)}(g, W). \tag{3.20} \]
Then the following function is an integral of motion of the Euler–Lagrange equations:
\[ \langle \mathrm{Ad}^* g_k^{-1} \cdot d'_W L^{(l)}(g_{k-1}, W_{k-1}), \zeta\rangle = \langle M_k, \mathrm{Ad}\, g_k^{-1} \cdot \zeta\rangle. \tag{3.21} \]
b) Let the Lagrange function $L^{(l)}(g, W)$ be invariant under the action of $G^{(\zeta)}$ on $G \times G$ induced by right translations on $G$:
\[ L^{(l)}(g e^{c\zeta}, e^{-c\zeta} W e^{c\zeta}) = L^{(l)}(g, W). \tag{3.22} \]
Then the following function is an integral of motion of the Euler–Lagrange equations:
\[ \langle d'_W L^{(l)}(g_{k-1}, W_{k-1}), \zeta\rangle = \langle M_k, \zeta\rangle. \tag{3.23} \]

We now discuss the reduction procedure. Assume that the function $L^{(l)}$ is invariant under the action of $G^{[\zeta]}$ on $G \times G$ induced by left translations on $G$:
\[ L^{(l)}(hg, W) = L^{(l)}(g, W), \qquad h \in G^{[\zeta]}. \tag{3.24} \]
Define the reduced Lagrange function $\Lambda^{(l)} : \mathfrak{g}_\zeta \times G \mapsto \mathbb{R}$ as
\[ \Lambda^{(l)}(P, W) = L^{(l)}(g, W), \quad\text{where}\quad P = \mathrm{Ad}\, g^{-1} \cdot \zeta. \tag{3.25} \]

Proposition 11. Consider the reduction $(g, W) \mapsto (P, W)$. The reduced Euler–Lagrange equations (3.14) read:
\[ \begin{cases} \mathrm{Ad}^* W_k^{-1} \cdot M_{k+1} = M_k + \mathrm{ad}^* P_k \cdot \nabla_P \Lambda^{(l)}(P_k, W_k), \\ P_{k+1} = \mathrm{Ad}\, W_k^{-1} \cdot P_k, \end{cases} \tag{3.26} \]
where
\[ M_k = d'_W \Lambda^{(l)}(P_{k-1}, W_{k-1}) \in \mathfrak{g}^*. \tag{3.27} \]
If the "Legendre transformation"
\[ (P_{k-1}, W_{k-1}) \in \mathfrak{g}_\zeta \times G \mapsto (P_k, M_k) \in \mathfrak{g}_\zeta \times \mathfrak{g}^*, \tag{3.28} \]
where $P_k = \mathrm{Ad}\, W_{k-1}^{-1} \cdot P_{k-1}$, is invertible, then (3.26) defines a map $(P_k, M_k) \mapsto (P_{k+1}, M_{k+1})$ of $\mathfrak{g}_\zeta \times \mathfrak{g}^*$ which is Poisson with respect to the Poisson bracket (2.33). The equations of motion (3.26) always have the following integral of motion:
\[ C = \langle M_k, P_k\rangle, \tag{3.29} \]
which is a Casimir function of the bracket (2.33).
3.2. Right trivialization. Consider the map
\[ (g_k, w_k) \in G \times G \mapsto (g_k, g_{k+1}) \in G \times G, \tag{3.30} \]
where
\[ g_{k+1} = w_k g_k \quad\Leftrightarrow\quad w_k = g_{k+1} g_k^{-1}. \tag{3.31} \]
Consider also the right trivialization of the cotangent bundle $T^*G$:
\[ (g_k, m_k) \in G \times \mathfrak{g}^* \mapsto (g_k, \Pi_k) \in T^*G, \tag{3.32} \]
where
\[ \Pi_k = R^*_{g_k^{-1}} m_k \quad\Leftrightarrow\quad m_k = R^*_{g_k} \Pi_k. \tag{3.33} \]
Denote the pull-back of the Lagrange function under (3.30) through
\[ L^{(r)}(g_k, w_k) = L(g_k, g_{k+1}). \tag{3.34} \]

Proposition 12. The difference equations for extremals of the functional
\[ S^{(r)} = \sum_{k=k_0}^{k_1 - 1} L^{(r)}(g_k, w_k) \]
read:
\[ \begin{cases} \mathrm{Ad}^* w_k \cdot m_{k+1} = m_k + d_g L^{(r)}(g_k, w_k), \\ g_{k+1} = w_k g_k, \end{cases} \tag{3.35} \]
where
\[ m_k = d_w L^{(r)}(g_{k-1}, w_{k-1}) \in \mathfrak{g}^*. \tag{3.36} \]
If the "Legendre transformation"
\[ (g_{k-1}, w_{k-1}) \in G \times G \mapsto (g_k, m_k) \in G \times \mathfrak{g}^*, \tag{3.37} \]
where $g_k = w_{k-1} g_{k-1}$, is invertible, then (3.35) defines a map $(g_k, m_k) \mapsto (g_{k+1}, m_{k+1})$ which is symplectic with respect to the Poisson bracket (2.44) on $G \times \mathfrak{g}^*$.

Proof. This time the discrete Euler–Lagrange equations (3.2) are rewritten as
\[ d_1 L(g_k, g_{k+1}) + d_2 L(g_{k-1}, g_k) = 0, \tag{3.38} \]
and the expressions for these Lie derivatives in terms of $(g, w)$ read:
\[ d_2 L(g_{k-1}, g_k) = d_w L^{(r)}(g_{k-1}, w_{k-1}), \tag{3.39} \]
\[ d_1 L(g_k, g_{k+1}) = d_g L^{(r)}(g_k, w_k) - d'_w L^{(r)}(g_k, w_k) = d_g L^{(r)}(g_k, w_k) - \mathrm{Ad}^* w_k \cdot d_w L^{(r)}(g_k, w_k). \tag{3.40} \]
The expression (3.36) is consistent with the definitions (3.3), (3.33), which imply that $m_k = d_2 L(g_{k-1}, g_k)$, and a reference to (3.39) finishes the proof. $\square$
Proposition 13. a) Let the Lagrange function $L^{(r)}(g, w)$ be invariant under the action of $G^{(\zeta)}$ on $G \times G$ induced by left translations on $G$:
\[ L^{(r)}(e^{c\zeta} g, e^{c\zeta} w e^{-c\zeta}) = L^{(r)}(g, w). \tag{3.41} \]
Then the following function is an integral of motion of the Euler–Lagrange equations:
\[ \langle d_w L^{(r)}(g_{k-1}, w_{k-1}), \zeta\rangle = \langle m_k, \zeta\rangle. \tag{3.42} \]
b) Let the Lagrange function $L^{(r)}(g, w)$ be invariant under the action of $G^{(\zeta)}$ on $G \times G$ induced by right translations on $G$:
\[ L^{(r)}(g e^{c\zeta}, w) = L^{(r)}(g, w). \tag{3.43} \]
Then the following function is an integral of motion of the Euler–Lagrange equations:
\[ \langle \mathrm{Ad}^* g_k \cdot d_w L^{(r)}(g_{k-1}, w_{k-1}), \zeta\rangle = \langle m_k, \mathrm{Ad}\, g_k \cdot \zeta\rangle. \tag{3.44} \]

Finally, we turn to the reduction procedure. Assume that the function $L^{(r)}$ is invariant under the action of $G^{[\zeta]}$ on $G \times G$ induced by right translations on $G$:
\[ L^{(r)}(gh, w) = L^{(r)}(g, w), \qquad h \in G^{[\zeta]}. \tag{3.45} \]
Define the reduced Lagrange function $\Lambda^{(r)} : \mathfrak{g}_\zeta \times G \mapsto \mathbb{R}$ as
\[ \Lambda^{(r)}(a, w) = L^{(r)}(g, w), \quad\text{where}\quad a = \mathrm{Ad}\, g \cdot \zeta. \tag{3.46} \]

Proposition 14. Consider the reduction $(g, w) \mapsto (a, w)$. The reduced Euler–Lagrange equations (3.35) read:
\[ \begin{cases} \mathrm{Ad}^* w_k \cdot m_{k+1} = m_k - \mathrm{ad}^* a_k \cdot \nabla_a \Lambda^{(r)}(a_k, w_k), \\ a_{k+1} = \mathrm{Ad}\, w_k \cdot a_k, \end{cases} \tag{3.47} \]
where
\[ m_k = d_w \Lambda^{(r)}(a_{k-1}, w_{k-1}) \in \mathfrak{g}^*. \tag{3.48} \]
If the "Legendre transformation"
\[ (a_{k-1}, w_{k-1}) \in \mathfrak{g}_\zeta \times G \mapsto (a_k, m_k) \in \mathfrak{g}_\zeta \times \mathfrak{g}^*, \tag{3.49} \]
where $a_k = \mathrm{Ad}\, w_{k-1} \cdot a_{k-1}$, is invertible, then (3.47) defines a map $(a_k, m_k) \mapsto (a_{k+1}, m_{k+1})$ of $\mathfrak{g}_\zeta \times \mathfrak{g}^*$ which is Poisson with respect to the bracket (2.55). The equations of motion (3.47) always have the following integral of motion:
\[ c = \langle m_k, a_k\rangle, \tag{3.50} \]
which is a Casimir of the bracket (2.55).

A table summarizing the unreduced and reduced Lagrangian equations of motion, both in the continuous and discrete time formulations, is given in Appendix B.
4. Lagrangian Formulation of the Lagrange Top

From now on we always work with the group $G = SU(2)$, so that $\mathfrak{g} = su(2)$; see Appendix C for the necessary background. In particular, we identify vectors from $\mathbb{R}^3$ with matrices from $\mathfrak{g}$, and do not distinguish between the vector product in $\mathbb{R}^3$ and the commutator in $\mathfrak{g}$. We write the adjoint group action as a matrix conjugation, and the operators $L_g^*$, $R_g^*$ as left and right matrix multiplication by $g^{-1}$, in accordance with (C.4) and (C.10). The following table summarizes the integrals of motion and the reductions following from the symmetries of Lagrange functions, in the terminology of the rigid body motion.

| | Left symmetry $g \mapsto e^{cp} g$ (rotation about $p$, the gravity field axis) | Right symmetry $g \mapsto g e^{cA}$ (rotation about $A$, the body symmetry axis) |
|---|---|---|
| Left trivialization $(g, \Pi) \mapsto (g, M = g^{-1}\Pi)$ (body frame) | $\langle M, P\rangle$, $P = g^{-1} p g$ | $\langle M, A\rangle$ |
| Right trivialization $(g, \Pi) \mapsto (g, m = \Pi g^{-1})$ (rest frame) | $\langle m, p\rangle$ | $\langle m, a\rangle$, $a = g A g^{-1}$ |
4.1. Body frame formulation. For an arbitrary Lagrangian system on $TG$ whose Lagrange function may be written as $L(g, \dot g) = \mathcal{L}^{(l)}(P, \Omega)$, where $\Omega = g^{-1}\dot g$, $P = g^{-1} p g$, the Euler–Lagrange equations of motion take the form
\[ \begin{cases} \dot M = [M, \Omega] + [\nabla_P \mathcal{L}^{(l)}, P], \\ \dot P = [P, \Omega], \end{cases} \tag{4.1} \]
where $M = \nabla_\Omega \mathcal{L}^{(l)}$. Such systems are characterized by the condition of invariance of $L(g, \dot g)$ under the action of $G^{(p)}$ on $TG$ induced by left translations on $G$, i.e.
\[ L(e^{cp} g, e^{cp} \dot g) = L(g, \dot g). \]
The geometrical meaning of this action is the rotation around $p$, the symmetry axis of the gravitation field. Consider the Lagrange function of the general top:
\[ \mathcal{L}^{(l)}(P, \Omega) = \tfrac{1}{2}\langle J\Omega, \Omega\rangle - \langle P, A\rangle, \tag{4.2} \]
where $J : \mathfrak{g} \mapsto \mathfrak{g}$ is a linear operator, and $A \in \mathfrak{g}$ is a constant vector. We calculate:
\[ M = \nabla_\Omega \mathcal{L}^{(l)} = J\Omega, \qquad \nabla_P \mathcal{L}^{(l)} = -A, \]
so that (4.1) takes the form
\[ \begin{cases} \dot M = [M, \Omega] + [P, A], \\ \dot P = [P, \Omega], \end{cases} \tag{4.3} \]
where
\[ M = J\Omega, \tag{4.4} \]
which is identical with (1.1). According to Proposition 2.4, this system is Hamiltonian with respect to the bracket (2.33), which in our case has the coordinate representation (1.4). The Lagrange top is distinguished by the relations (1.8). They may be represented in the following, slightly more invariant fashion:
\[ M = J\Omega = \Omega - (1 - \alpha)\langle \Omega, A\rangle A, \tag{4.5} \]
i.e. $J$ acts as multiplication by $\alpha$ in the direction of the vector $A$, and as the identity operator in the two orthogonal directions. This allows us to rewrite (4.2) as
\[ L(g, \dot g) = \mathcal{L}^{(l)}(P, \Omega) = \tfrac{1}{2}\langle \Omega, \Omega\rangle - \tfrac{1 - \alpha}{2}\langle \Omega, A\rangle^2 - \langle P, A\rangle. \tag{4.6} \]
In this case the equations of motion (4.3) clearly imply that the following function is an integral of motion:
\[ C = \langle M, A\rangle. \]
This assures the complete integrability of the Lagrange top.

Remark 5. It is easy to see that (4.5) implies $\langle M, A\rangle = \alpha\langle \Omega, A\rangle$, which allows us to invert (4.5) immediately:
\[ \Omega = M + \frac{1 - \alpha}{\alpha}\langle M, A\rangle A. \tag{4.7} \]
For further reference, we rewrite this as
\[ \Omega = \frac{1}{\alpha} M + \frac{1 - \alpha}{\alpha}[A, [A, M]]. \tag{4.8} \]
This, in turn, allows us to reconstruct the motion of the frame $g(t)$ through the motion of the reduced variables $M(t)$, $P(t)$ (actually only through $M(t)$). To this end one has to solve the linear differential equation $\dot g = g\Omega$.

Remark 6. Like almost all known integrable systems, the Lagrange top has a Lax representation [RSTS,Au]; the original references are [AM,R,RM]. It is straightforward to check the following Lax representation for (4.3), (4.5) with the matrices from the loop algebra $su(2)[\lambda]$:
\[ \dot L(\lambda) = [L(\lambda), U(\lambda)], \tag{4.9} \]
where
\[ L(\lambda) = \lambda^2 A + \lambda M + P, \qquad U(\lambda) = \lambda A + \Omega. \tag{4.10} \]
4.2. Rest frame formulation. With the formula (4.6), we can clearly rewrite the Lagrange function only in terms of $\omega = \dot g g^{-1}$, $a = g A g^{-1}$:
\[ L(g, \dot g) = \mathcal{L}^{(r)}(a, \omega) = \tfrac{1}{2}\langle \omega, \omega\rangle - \tfrac{1 - \alpha}{2}\langle \omega, a\rangle^2 - \langle p, a\rangle. \tag{4.11} \]
The possibility to represent $L(g, \dot g)$ through $\omega$, $a$ is equivalent to the invariance of $L(g, \dot g)$ under the action of $G^{(A)}$ on $TG$ induced by right translations on $G$:
\[ L(g e^{cA}, \dot g e^{cA}) = L(g, \dot g). \]
The geometrical meaning of this action is the rotation around $A$, the symmetry axis of the top. The Euler–Lagrange equations of motion for such Lagrange functions read:
\[ \begin{cases} \dot m = [\omega, m] + [a, \nabla_a \mathcal{L}^{(r)}], \\ \dot a = [\omega, a], \end{cases} \tag{4.12} \]
where $m = \nabla_\omega \mathcal{L}^{(r)}$. We calculate for the Lagrange function (4.11):
\[ m = \nabla_\omega \mathcal{L}^{(r)} = \omega - (1 - \alpha)\langle \omega, a\rangle a, \qquad \nabla_a \mathcal{L}^{(r)} = -(1 - \alpha)\langle \omega, a\rangle \omega - p. \tag{4.13} \]
Putting this into (4.12), we find:
\[ \begin{cases} \dot m = [p, a], \\ \dot a = [m, a], \end{cases} \tag{4.14} \]
which is identical with (1.10). According to Proposition 2.7, this system is Hamiltonian with respect to the bracket (2.55), whose coordinate representation coincides with (1.12).

Remark 7. It follows from (4.13) that $\langle m, a\rangle = \alpha\langle \omega, a\rangle$, so that (4.13) can be easily inverted:
\[ \omega = m + \frac{1 - \alpha}{\alpha}\langle m, a\rangle a. \tag{4.15} \]
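The inversion (4.15) is easy to confirm numerically. The sketch below is an illustration only: the function names and the sample values of $\alpha$, $a$ and $\omega$ are made up, and the check relies on the assumption that $a$ is a unit vector, which is what makes $\langle m, a\rangle = \alpha\langle \omega, a\rangle$ hold.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def legendre(omega, a, alpha):
    """m = omega - (1 - alpha) <omega, a> a, as in (4.13)."""
    s = (1 - alpha) * dot(omega, a)
    return tuple(w - s * x for w, x in zip(omega, a))

def inverse_legendre(m, a, alpha):
    """omega = m + (1 - alpha)/alpha <m, a> a, as in (4.15); needs <a, a> = 1."""
    s = (1 - alpha) / alpha * dot(m, a)
    return tuple(y + s * x for y, x in zip(m, a))

alpha = 0.4                  # assumed anisotropy parameter
a = (0.6, 0.0, 0.8)          # unit vector
omega = (1.0, -2.0, 0.5)     # arbitrary angular velocity
m = legendre(omega, a, alpha)
omega_back = inverse_legendre(m, a, alpha)
```

Here `omega_back` reproduces `omega` up to rounding, and `dot(m, a)` equals `alpha * dot(omega, a)`.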
Recall that $c = \langle m, a\rangle$ is a Casimir function of the underlying invariant Poisson bracket (1.12). Now the latter formula allows us to reconstruct the frame evolution from the evolution of the reduced variables $(m, a)$ via integration of the linear differential equation $\dot g = \omega g$.

Remark 8. It turns out to be possible to derive from (4.14) a closed second order differential equation for $a$. Indeed, take the vector product of $a$ with the second equation in (1.10) in order to obtain
\[ m = a \times \dot a + c\, a. \tag{4.16} \]
Substituting this into the first equation in (1.10), we find:
\[ a \times \ddot a + c\, \dot a = p \times a. \tag{4.17} \]
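The two steps of Remark 8 can be written out explicitly. This is a worked sketch under the assumption that $a$ is normalized, $\langle a, a\rangle = 1$, and it uses that $c = \langle m, a\rangle$ is constant along the motion:

```latex
\begin{aligned}
a \times \dot a &= a \times (m \times a)
   = \langle a, a\rangle\, m - \langle a, m\rangle\, a
   = m - c\, a
   &&\Longrightarrow\quad m = a \times \dot a + c\, a, \\
\dot m &= \dot a \times \dot a + a \times \ddot a + c\, \dot a
   = a \times \ddot a + c\, \dot a
   = p \times a .
\end{aligned}
```

The first line uses $\dot a = m \times a$, the second uses $\dot m = p \times a$ and $\dot c = 0$.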
Remark 9. The Lax representation for (4.14) is, of course, gauge equivalent to the one for the body frame formulation, but is slightly simpler than the latter [R,RSTS]. It reads:
\[ \dot\ell(\lambda) = [\ell(\lambda), u(\lambda)], \tag{4.18} \]
with the matrices
\[ \ell(\lambda) = \lambda^2 a + \lambda m + p, \qquad u(\lambda) = \lambda a. \tag{4.19} \]
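That the Lax equation (4.18) with the matrices (4.19) reproduces (4.14) can be seen by comparing powers of $\lambda$. The following short computation is a sketch added for illustration; it uses only $[a, a] = 0$ and the fact that $p$ is constant:

```latex
\begin{aligned}
\dot\ell(\lambda) &= \lambda^2\, \dot a + \lambda\, \dot m, \\
[\ell(\lambda), u(\lambda)] &= [\lambda^2 a + \lambda m + p,\ \lambda a]
   = \lambda^2\, [m, a] + \lambda\, [p, a].
\end{aligned}
```

Equating the coefficients of $\lambda^2$ and $\lambda$ gives $\dot a = [m, a]$ and $\dot m = [p, a]$, i.e. the system (4.14).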
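In Sect. 6 we indicate how this Lax representation can be derived from the zero curvature representation of the so-called Heisenberg magnet.

5. Discrete Time Lagrange Top

We now give (in an ad hoc manner) the discrete Lagrange function which is claimed to lead to a suitable discretization of the Lagrange top. The motivation for the choice of this function comes from the geometry of curves and will be given in the next section. Unlike the continuous time case, we start with the rest frame formulation.

5.1. Rest frame formulation. Consider
\[ L(g_k, g_{k+1}) = \Lambda^{(r)}(a_k, w_k) = -\frac{4\alpha}{\varepsilon} \log \mathrm{tr}(w_k) - \frac{2(1 - \alpha)}{\varepsilon} \log\big(1 + \langle a_k, w_k a_k w_k^{-1}\rangle\big) - \varepsilon\langle p, a_k\rangle, \tag{5.1} \]
where $a_k$, $w_k$ are defined as in Sect. 3.2:
\[ w_k = g_{k+1} g_k^{-1}, \qquad a_k = g_k A g_k^{-1}. \]
Notice that $\langle a_k, w_k a_k w_k^{-1}\rangle$ in (5.1) is nothing but $\langle a_k, a_{k+1}\rangle$. To see that the function (5.1) indeed gives a proper discretization of (4.11), we shall need the following simple lemma.

Lemma 1. Let $w(\varepsilon) = 1 + \varepsilon\omega + O(\varepsilon^2) \in SU(2)$ be a smooth curve, $\omega \in su(2)$. Then
\[ \mathrm{tr}(w(\varepsilon)) = 2 - \frac{\varepsilon^2}{4}\langle \omega, \omega\rangle + O(\varepsilon^3). \tag{5.2} \]
For an arbitrary $a \in su(2)$:
\[ \langle a, w(\varepsilon) a w^{-1}(\varepsilon)\rangle = \langle a, a\rangle - \frac{\varepsilon^2}{2}\big(\langle a, a\rangle\langle \omega, \omega\rangle - \langle a, \omega\rangle^2\big) + O(\varepsilon^3). \tag{5.3} \]

Proof. Let $w = 1 + \varepsilon\omega + \varepsilon^2 v + O(\varepsilon^3)$. Then from $w w^* = 1$ we get:
\[ v + v^* + \omega\omega^* = 0 \quad\Rightarrow\quad v = \tfrac{1}{2}\omega^2 + v_1, \quad v_1 \in su(2). \tag{5.4} \]
Hence
\[ \mathrm{tr}(v) = \tfrac{1}{2}\mathrm{tr}(\omega^2) = -\tfrac{1}{4}\langle \omega, \omega\rangle, \]
which proves (5.2). Similarly, we derive from (5.4):
\[ w a w^* = a + \varepsilon[\omega, a] + \frac{\varepsilon^2}{2}[\omega, [\omega, a]] + \varepsilon^2 [v_1, a] + O(\varepsilon^3), \]
which implies (5.3). $\square$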
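The expansions of Lemma 1 can be tested numerically on the exact curve $w(\varepsilon) = \exp(\varepsilon\omega)$. The sketch below rests on assumptions that are not spelled out in this section: vectors of $\mathbb{R}^3$ are identified with $su(2)$ through $v \mapsto \sum_j v_j\sigma_j/(2i)$ with the Pauli matrices $\sigma_j$, and the scalar product is taken to be $\langle x, y\rangle = -2\,\mathrm{tr}(xy)$, which is consistent with the relation $\tfrac12\mathrm{tr}(\omega^2) = -\tfrac14\langle\omega,\omega\rangle$ used in the proof. All function names are hypothetical.

```python
import math

# Pauli matrices as nested tuples of (complex) numbers.
SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)
IDENTITY = ((1, 0), (0, 1))

def lin(coeffs, mats):
    """Linear combination sum_j coeffs[j] * mats[j] of 2x2 matrices."""
    return tuple(tuple(sum(c * m[r][s] for c, m in zip(coeffs, mats))
                       for s in range(2)) for r in range(2))

def mul(x, y):
    return tuple(tuple(sum(x[r][t] * y[t][s] for t in range(2))
                       for s in range(2)) for r in range(2))

def dagger(x):
    return tuple(tuple(complex(x[s][r]).conjugate() for s in range(2))
                 for r in range(2))

def trace(x):
    return x[0][0] + x[1][1]

def su2(v):
    """Assumed identification R^3 -> su(2): v -> sum_j v_j sigma_j / (2i)."""
    return lin([c / 2j for c in v], SIGMA)

def inner(x, y):
    """Assumed scalar product <x, y> = -2 tr(x y) on su(2)."""
    return (-2 * trace(mul(x, y))).real

def exp_su2(v, eps):
    """exp(eps * su2(v)) in closed form, using su2(v)^2 = -(|v|^2 / 4) * 1."""
    n = math.sqrt(sum(c * c for c in v))
    half = eps * n / 2
    return lin([math.cos(half), 2 * math.sin(half) / n], [IDENTITY, su2(v)])

omega, a, eps = (0.3, -0.5, 0.8), (1.0, 2.0, -1.0), 1e-2
w = exp_su2(omega, eps)
A, W = su2(a), su2(omega)

lhs_52 = trace(w).real
rhs_52 = 2 - eps**2 / 4 * inner(W, W)
lhs_53 = inner(A, mul(mul(w, A), dagger(w)))   # w^{-1} = w^* in SU(2)
rhs_53 = inner(A, A) - eps**2 / 2 * (inner(A, A) * inner(W, W) - inner(A, W)**2)
```

For this particular curve both differences `lhs_52 - rhs_52` and `lhs_53 - rhs_53` are far smaller than the $\varepsilon^2$ terms themselves, in agreement with the $O(\varepsilon^3)$ remainders in (5.2) and (5.3).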
With the help of this lemma we immediately see that, if $w = 1 + \varepsilon\omega + O(\varepsilon^2)$, then, up to an additive constant,
\[
\Lambda^{(r)}(a,w) = \varepsilon L^{(r)}(a,\omega) + O(\varepsilon^2),
\]
where $L^{(r)}(a,\omega)$ is the Lagrange function (4.11) of the Lagrange top.

Theorem 1. The Euler–Lagrange equations of motion for the Lagrange function (5.1) are equivalent to the following system:
\[
\begin{cases}
m_{k+1} = m_k + \varepsilon[p, a_k], \\[2pt]
a_{k+1} = a_k + \dfrac{\varepsilon}{2}\,[m_{k+1}, a_k + a_{k+1}].
\end{cases} \tag{5.5}
\]
The second equation of motion can be uniquely solved for $a_{k+1}$:
\[
a_{k+1} = (1 + \varepsilon m_{k+1})\,a_k\,(1 + \varepsilon m_{k+1})^{-1}. \tag{5.6}
\]
The map $(m_k, a_k) \mapsto (m_{k+1}, a_{k+1})$ is Poisson with respect to the bracket (1.12) and has two integrals in involution assuring its complete integrability: $\langle m, p\rangle$ and
\[
H_\varepsilon(m,a) = \frac12\langle m,m\rangle + \langle a,p\rangle + \frac{\varepsilon}{2}\langle [a,m], p\rangle. \tag{5.7}
\]
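The map of the theorem is easy to iterate on a computer. The sketch below is an illustration under stated assumptions, not the authors' code: it works in R^3, where the commutators of (5.5) become vector products under the identification of Appendix C, and it solves the implicit second equation by the closed-form rotation a_{k+1} = a_k + 2(v × a_k + v × (v × a_k))/(1 + |v|^2) with v = ε m_{k+1}/2, which is one way of writing the solution of a_{k+1} − a_k = v × (a_{k+1} + a_k). The function names and the initial data are hypothetical.

```python
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def step(m, a, p, eps):
    """One step (m_k, a_k) -> (m_{k+1}, a_{k+1}) of the map (5.5) in R^3."""
    m1 = tuple(mi + eps * ci for mi, ci in zip(m, cross(p, a)))
    # Solve a1 - a = v x (a1 + a) with v = eps * m1 / 2 in closed form.
    v = tuple(0.5 * eps * mi for mi in m1)
    va = cross(v, a)
    vva = cross(v, va)
    c = 2.0 / (1.0 + dot(v, v))
    a1 = tuple(ai + c * (x + y) for ai, x, y in zip(a, va, vva))
    return m1, a1

def energy(m, a, p, eps):
    """The integral H_eps of (5.7), with [a, m] replaced by a x m."""
    return 0.5 * dot(m, m) + dot(a, p) + 0.5 * eps * dot(cross(a, m), p)

def iterate(m, a, p, eps, n):
    for _ in range(n):
        m, a = step(m, a, p, eps)
    return m, a

p = (0.0, 0.0, 1.0)                          # hypothetical test data
eps = 0.1
m0, a0 = (0.2, 1.0, 2.0), (0.6, 0.0, 0.8)    # a0 is a unit vector
mN, aN = iterate(m0, a0, p, eps, 1000)

# Drift of the quantities that the theorem claims to be conserved.
drift = {
    "H_eps": energy(mN, aN, p, eps) - energy(m0, a0, p, eps),
    "<m,p>": dot(mN, p) - dot(m0, p),
    "<m,a>": dot(mN, aN) - dot(m0, a0),
    "<a,a>": dot(aN, aN) - dot(a0, a0),
}
print(drift)
```

All four drifts should stay at the level of rounding errors, in contrast to a generic (non-integrable) discretization, where the energy would only be conserved approximately.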
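Proof. According to Proposition 3.7, the Euler–Lagrange equations of motion have the form:
\[
\begin{cases}
w_k^{-1} m_{k+1} w_k = m_k + [a_k, \nabla_a\Lambda^{(r)}(a_k,w_k)], \\[2pt]
a_{k+1} = w_k a_k w_k^{-1},
\end{cases} \tag{5.8}
\]
where
\[
m_{k+1} = d_w\Lambda^{(r)}(a_k,w_k). \tag{5.9}
\]
To calculate the derivatives of $\Lambda^{(r)}$, we use the following formulas:
\[
d_w\operatorname{tr}(w_k) = -\tfrac12\,\Im(w_k), \qquad d_w\langle a_k, w_k a_k w_k^{-1}\rangle = [a_{k+1}, a_k], \tag{5.10}
\]
\[
\nabla_a\langle a_k, w_k a_k w_k^{-1}\rangle = a_{k+1} + w_k^{-1} a_k w_k. \tag{5.11}
\]
Indeed, the first one of these expressions follows from:
\[
\langle d_w\operatorname{tr}(w_k), \eta\rangle = \frac{d}{d\epsilon}\operatorname{tr}(e^{\epsilon\eta}w_k)\Big|_{\epsilon=0} = \operatorname{tr}(\eta w_k) = \operatorname{tr}(\eta\,\Im(w_k)) = -\tfrac12\langle\Im(w_k), \eta\rangle.
\]
To prove the second one, proceed similarly:
\[
\langle d_w\langle a_k, w_k a_k w_k^{-1}\rangle, \eta\rangle = \frac{d}{d\epsilon}\langle a_k, e^{\epsilon\eta}w_k a_k w_k^{-1}e^{-\epsilon\eta}\rangle\Big|_{\epsilon=0} = \langle a_k, [\eta, a_{k+1}]\rangle = \langle [a_{k+1}, a_k], \eta\rangle.
\]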
Finally, as for the third expression, we have:
\[
\langle\nabla_a\langle a_k, w_k a_k w_k^{-1}\rangle, \eta\rangle = \frac{d}{d\epsilon}\langle a_k + \epsilon\eta, w_k(a_k + \epsilon\eta)w_k^{-1}\rangle\Big|_{\epsilon=0} = \langle a_{k+1} + w_k^{-1}a_k w_k, \eta\rangle.
\]
With the help of (5.10), (5.11) we find the following expressions:
\[
m_{k+1} = \frac{2\alpha}{\varepsilon}\,\frac{\Im(w_k)}{\operatorname{tr}(w_k)} - \frac{2(1-\alpha)}{\varepsilon}\,\frac{[a_{k+1}, a_k]}{1 + \langle a_k, a_{k+1}\rangle}, \tag{5.12}
\]
and
\[
w_k^{-1}m_{k+1}w_k - [a_k, \nabla_a\Lambda^{(r)}(a_k,w_k)] = \frac{2\alpha}{\varepsilon}\,\frac{\Im(w_k)}{\operatorname{tr}(w_k)} - \frac{2(1-\alpha)}{\varepsilon}\,\frac{[a_{k+1}, a_k]}{1 + \langle a_k, a_{k+1}\rangle} + \varepsilon[a_k, p] = m_{k+1} + \varepsilon[a_k, p]. \tag{5.13}
\]
Comparing the latter formula with the first equation of motion in (5.8), we find that it can be rewritten as
\[
m_{k+1} + \varepsilon[a_k, p] = m_k,
\]
which is equivalent to the first equation of motion in (5.5). To derive the second one, rewrite the second equation in (5.8) as
\[
0 = a_{k+1}w_k - w_k a_k = \Re(w_k)(a_{k+1} - a_k) + a_{k+1}\Im(w_k) - \Im(w_k)a_k = \tfrac12\operatorname{tr}(w_k)(a_{k+1} - a_k) + \tfrac12[a_{k+1} + a_k, \Im(w_k)]
\]
(we used Lemma C.3 and the equality $\langle a_{k+1}, \Im(w_k)\rangle = \langle a_k, \Im(w_k)\rangle$ which follows from the same equation $a_{k+1}w_k = w_k a_k$ we started with). So, the second equation in (5.8) is equivalent to
\[
a_{k+1} - a_k = \left[\frac{\Im(w_k)}{\operatorname{tr}(w_k)},\ a_{k+1} + a_k\right]. \tag{5.14}
\]
On the other hand, for any two unit vectors $a_k$, $a_{k+1}$ with $a_{k+1} + a_k \neq 0$ we have:
\[
a_{k+1} - a_k = -\left[\frac{[a_{k+1}, a_k]}{1 + \langle a_k, a_{k+1}\rangle},\ a_{k+1} + a_k\right]. \tag{5.15}
\]
Comparing (5.14), (5.15) with (5.12), we find the second equation of motion in (5.5).

Next, we want to show how the second equation of motion in (5.5) can be solved for $a_{k+1}$. This equation implies $\langle a_{k+1}, m_{k+1}\rangle = \langle a_k, m_{k+1}\rangle$, so that, according to Lemma C.3, it can be rewritten as
\[
a_{k+1} + \varepsilon a_{k+1}m_{k+1} = a_k + \varepsilon m_{k+1}a_k,
\]
which is clearly equivalent to (5.6). The Poisson properties of the map (5.5) are assured by Proposition 3.7.
It remains to demonstrate that the function (5.7) is indeed an integral of motion. This is done by the following derivation:
\begin{align*}
H_\varepsilon(m_{k+1}, a_{k+1}) &= \tfrac12\langle m_{k+1}, m_{k+1}\rangle + \bigl\langle a_{k+1} + \tfrac{\varepsilon}{2}[a_{k+1}, m_{k+1}],\ p\bigr\rangle \\
&= \tfrac12\langle m_{k+1}, m_{k+1}\rangle + \bigl\langle a_k - \tfrac{\varepsilon}{2}[a_k, m_{k+1}],\ p\bigr\rangle \\
&= \tfrac12\langle m_{k+1}, m_{k+1} - \varepsilon[p, a_k]\rangle + \langle a_k, p\rangle \\
&= \tfrac12\langle m_k + \varepsilon[p, a_k], m_k\rangle + \langle a_k, p\rangle = H_\varepsilon(m_k, a_k).
\end{align*}
The theorem is proved. □

Remark 10. The equations of motion (5.5), being written entirely in terms of elements of the Lie algebra su(2), are clearly equivalent to the equations of motion (1.17), which are written in terms of vectors from $\mathbb{R}^3$. The situation with (5.6) is slightly different. Indeed, it corresponds to the following formula in $\mathbb{R}^3$:
\[
a_{k+1} = Q_{k+1}a_k = \frac{1 + \varepsilon\hat m_{k+1}/2}{1 - \varepsilon\hat m_{k+1}/2}\,a_k,
\]
where the orthogonal matrix $Q_{k+1} \in SO(3)$ is constructed out of the skew-symmetric matrix $\hat m_{k+1} \in so(3)$ which corresponds to the vector $m_{k+1} \in \mathbb{R}^3$ according to the following rule:
\[
m = (m_1, m_2, m_3)^T \in \mathbb{R}^3 \quad\leftrightarrow\quad \hat m = \begin{pmatrix} 0 & -m_3 & m_2 \\ m_3 & 0 & -m_1 \\ -m_2 & m_1 & 0 \end{pmatrix} \in so(3).
\]

Just as in the continuous time case, it is possible to derive a closed second order difference equation for the motion of the body axis $a_k$.

Proposition 15. The sequence of $a_k$ satisfies the following equation:
\[
a_k \times \left(\frac{2a_{k+1}}{1 + \langle a_k, a_{k+1}\rangle} + \frac{2a_{k-1}}{1 + \langle a_{k-1}, a_k\rangle}\right) + \varepsilon c\left(\frac{a_{k+1} + a_k}{1 + \langle a_k, a_{k+1}\rangle} - \frac{a_k + a_{k-1}}{1 + \langle a_{k-1}, a_k\rangle}\right) = \varepsilon^2\, p \times a_k, \tag{5.16}
\]
where $c = \langle m_k, a_k\rangle$ is an integral of motion.

Proof. Take a vector product of the second equation of motion in (1.17) by $a_{k+1} + a_k$. Taking into account that $\langle m_{k+1}, a_{k+1}\rangle = \langle m_{k+1}, a_k\rangle = c$, we find:
\[
2a_k \times a_{k+1} = \varepsilon m_{k+1}(1 + \langle a_k, a_{k+1}\rangle) - \varepsilon c(a_{k+1} + a_k),
\]
or
\[
m_{k+1} = \frac{2}{\varepsilon}\,\frac{a_k \times a_{k+1}}{1 + \langle a_k, a_{k+1}\rangle} + c\,\frac{a_{k+1} + a_k}{1 + \langle a_k, a_{k+1}\rangle}. \tag{5.17}
\]
Plugging this into the first equation of motion in (1.17), we arrive at (5.16). □
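Formulas (5.16) and (5.17) can be tested along a computed orbit. The following sketch is an illustration only: it reuses a hypothetical `step` function implementing the map (5.5) in R^3 (with the implicit equation solved by a closed-form rotation, an assumption of the sketch), takes three consecutive axis positions, and evaluates the defects of both formulas.

```python
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def add(u, v, s=1.0):
    """u + s * v for vectors u, v."""
    return tuple(x + s * y for x, y in zip(u, v))

def scale(u, s):
    return tuple(s * x for x in u)

def step(m, a, p, eps):
    """One step of the map (5.5) in R^3 (hypothetical helper)."""
    m1 = add(m, cross(p, a), eps)
    v = scale(m1, 0.5 * eps)
    va = cross(v, a)
    a1 = add(a, add(va, cross(v, va)), 2.0 / (1.0 + dot(v, v)))
    return m1, a1

p, eps = (0.0, 0.0, 1.0), 0.1                   # hypothetical test data
m_prev, a_prev = (0.2, 1.0, 2.0), (0.6, 0.0, 0.8)
m_k, a_k = step(m_prev, a_prev, p, eps)
m_next, a_next = step(m_k, a_k, p, eps)
c = dot(m_k, a_k)
d1 = 1.0 + dot(a_k, a_next)
d0 = 1.0 + dot(a_prev, a_k)

# (5.17): m_{k+1} expressed through two consecutive positions of the axis.
m_from_a = add(scale(cross(a_k, a_next), 2.0 / (eps * d1)),
               scale(add(a_next, a_k), c / d1))
res_517 = max(abs(x - y) for x, y in zip(m_next, m_from_a))

# (5.16): closed second order difference equation for the axis.
bracket = add(scale(a_next, 2.0 / d1), scale(a_prev, 2.0 / d0))
diff = add(scale(add(a_next, a_k), 1.0 / d1),
           scale(add(a_k, a_prev), 1.0 / d0), -1.0)
lhs = add(cross(a_k, bracket), scale(diff, eps * c))
rhs = scale(cross(p, a_k), eps**2)
res_516 = max(abs(x - y) for x, y in zip(lhs, rhs))
print(res_517, res_516)
```

Both defects should vanish up to rounding, which also illustrates that two consecutive positions of the axis together with the value of c determine the orbit.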
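Further, we demonstrate how to reconstruct the “angular velocity” $w_k$ (and therefore the motion of the frame $g_k$) from the evolution of the reduced variables $(a_k, m_k)$.

Proposition 16. The discrete time evolution of the frame $g_k$ can be determined from the linear difference equation
\[
g_{k+1} = w_k g_k, \tag{5.18}
\]
where $w_k$ are given by
\[
w_k = \frac{\operatorname{tr}(w_k)}{2}\,(1 + \varepsilon\xi_k), \tag{5.19}
\]
where
\[
\xi_k = m_{k+1} + c\,\frac{1-\alpha}{\alpha}\,\frac{a_{k+1} + a_k}{1 + \langle a_k, a_{k+1}\rangle} = \frac{2}{\varepsilon}\,\frac{a_k \times a_{k+1}}{1 + \langle a_k, a_{k+1}\rangle} + \frac{c}{\alpha}\,\frac{a_{k+1} + a_k}{1 + \langle a_k, a_{k+1}\rangle}, \tag{5.20}
\]
and
\[
\operatorname{tr}(w_k) = \frac{2}{\sqrt{1 + \dfrac{\varepsilon^2}{4}\langle\xi_k, \xi_k\rangle}} = \sqrt{\frac{2\,(1 + \langle a_k, a_{k+1}\rangle)}{1 + \varepsilon^2c^2/4\alpha^2}}. \tag{5.21}
\]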
Proof. We combine (5.12) with (5.17) in order to derive the formula
\[
\frac{2\,\Im(w_k)}{\operatorname{tr}(w_k)} = \varepsilon\xi_k
\]
with the expressions for $\xi_k$ given in (5.20). Now the reference to Lemma C.2 finishes the proof. □

Finally, we give a Lax representation for the map (5.5).

Theorem 2. The map (5.5) has the following Lax representation:
\[
\ell_{k+1}(\lambda) = u_k^{-1}(\lambda)\,\ell_k(\lambda)\,u_k(\lambda), \tag{5.22}
\]
with the matrices
\[
\ell_k(\lambda) = \lambda^2\Bigl(a_k + \frac{\varepsilon}{2}[a_k, m_k] + \frac{\varepsilon^2}{4}\,p\Bigr) + \lambda m_k + p, \qquad u_k(\lambda) = 1 + \varepsilon\lambda a_k. \tag{5.23}
\]

Proof. A direct verification. □

In the next section we present a derivation of this Lax representation from the one for the so-called lattice Heisenberg magnet.
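The "direct verification" of Theorem 2 can also be carried out numerically. The sketch below is a hedged illustration: it assumes the identification (C.5) of R^3 with su(2), represents [a, m] by the vector product a × m, advances one step of the map (5.5) with a hypothetical `step` helper, and evaluates the defect of u_k ℓ_{k+1} = ℓ_k u_k at one arbitrarily chosen complex value of λ.

```python
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def step(m, a, p, eps):
    """One step of the map (5.5) in R^3 (hypothetical helper)."""
    m1 = tuple(mi + eps * ci for mi, ci in zip(m, cross(p, a)))
    v = tuple(0.5 * eps * mi for mi in m1)
    va = cross(v, a)
    vva = cross(v, va)
    c = 2.0 / (1.0 + dot(v, v))
    return m1, tuple(ai + c * (x + y) for ai, x, y in zip(a, va, vva))

def su2(v):
    """Matrix of v under (C.5); linear in v, so complex entries are allowed."""
    v1, v2, v3 = v
    return [[-0.5j * v3, 0.5 * (-v2 - 1j * v1)],
            [0.5 * (v2 - 1j * v1), 0.5j * v3]]

def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lax(m, a, p, eps, lam):
    """l_k(lambda) of (5.23)."""
    am = cross(a, m)
    top = tuple(ai + 0.5 * eps * ci + 0.25 * eps**2 * pi
                for ai, ci, pi in zip(a, am, p))
    return su2(tuple(lam**2 * t + lam * mi + pi
                     for t, mi, pi in zip(top, m, p)))

def u_mat(a, eps, lam):
    """u_k(lambda) = 1 + eps * lambda * a_k of (5.23)."""
    A = su2(a)
    return [[(i == j) + eps * lam * A[i][j] for j in range(2)]
            for i in range(2)]

p, eps, lam = (0.0, 0.0, 1.0), 0.1, 0.7 + 0.3j   # hypothetical test data
m0, a0 = (0.2, 1.0, 2.0), (0.6, 0.0, 0.8)         # a0 is a unit vector
m1, a1 = step(m0, a0, p, eps)

U = u_mat(a0, eps, lam)
lhs = mul(U, lax(m1, a1, p, eps, lam))
rhs = mul(lax(m0, a0, p, eps, lam), U)
defect = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(defect)
```

The check relies on a_0 being a unit vector; for a non-normalized axis the λ^3 terms of the two sides no longer agree.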
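5.2. Moving frame formulation. Note that the discrete Lagrange function (5.1) may be also expressed in terms of $P_k = g_k^{-1}pg_k$, $W_k = g_k^{-1}g_{k+1}$:
\[
L(g_k, g_{k+1}) = \Lambda^{(l)}(P_k, W_k) = -\frac{4\alpha}{\varepsilon}\log\operatorname{tr}(W_k) - \frac{2(1-\alpha)}{\varepsilon}\log\bigl(1 + \langle A, W_k^{-1}AW_k\rangle\bigr) - \varepsilon\langle P_k, A\rangle. \tag{5.24}
\]
Since $W_k = 1 + \varepsilon\Omega + O(\varepsilon^2)$, we can apply Lemma 5.1 to see that
\[
\Lambda^{(l)}(P_k, W_k) = \varepsilon L^{(l)}(P, \Omega) + O(\varepsilon^2),
\]
where $L^{(l)}(P, \Omega)$ is the Lagrange function (4.6) of the continuous time Lagrange top. Now, one can derive all results concerning the discrete time Lagrange top in the body frame from the ones in the rest frame by performing the change of frames so that
\[
M_k = g_k^{-1}m_kg_k, \qquad P_k = g_k^{-1}pg_k, \qquad A = g_k^{-1}a_kg_k.
\]

Theorem 3. The Euler–Lagrange equations for the Lagrange function (5.24) are equivalent to the following system:
\[
\begin{cases}
M_{k+1} = W_k^{-1}\bigl(M_k + \varepsilon[P_k, A]\bigr)W_k, \\[2pt]
P_{k+1} = W_k^{-1}P_kW_k,
\end{cases} \tag{5.25}
\]
where the “angular velocity” $W_k$ is determined by the “angular momentum” $M_{k+1}$ via the following formula and Lemma C.2:
\begin{align*}
\frac{2\,\Im(W_k)}{\operatorname{tr}(W_k)} &= \frac{\varepsilon}{\alpha}\,M_{k+1} + \frac{2(1-\alpha)}{\alpha}\,\frac{\bigl[A, (1 + \varepsilon M_{k+1})^{-1}A(1 + \varepsilon M_{k+1})\bigr]}{1 + \bigl\langle A, (1 + \varepsilon M_{k+1})^{-1}A(1 + \varepsilon M_{k+1})\bigr\rangle} \\
&= \varepsilon\Bigl(\frac{1}{\alpha}\,M_{k+1} + \frac{1-\alpha}{\alpha}\,[A, [A, M_{k+1}]]\Bigr) + O(\varepsilon^2). \tag{5.26}
\end{align*}
The map (5.25), (5.26) is Poisson with respect to the Poisson bracket (1.4) and has two integrals in involution assuring its complete integrability: $\langle M, A\rangle$ and
\[
H_\varepsilon(M, P) = \frac12\langle M, M\rangle + \langle P, A\rangle + \frac{\varepsilon}{2}\langle [M, P], A\rangle. \tag{5.27}
\]

Remark 11. It might be preferable to express $W_k$ through $(M_k, P_k)$ rather than through $M_{k+1}$ (in particular, this is necessary in order to demonstrate that the map $(M_k, P_k) \mapsto (M_{k+1}, P_{k+1})$ is well defined). The corresponding expression reads:
\[
\frac{2\,\Im(W_k)}{\operatorname{tr}(W_k)} = \frac{\varepsilon}{\alpha}\bigl(M_k + \varepsilon[P_k, A]\bigr) - \frac{2(1-\alpha)}{\alpha}\,\frac{[A, W_kAW_k^{-1}]}{1 + \langle A, W_kAW_k^{-1}\rangle}, \tag{5.28}
\]
where
\[
W_kAW_k^{-1} = \bigl(1 + \varepsilon M_k + \varepsilon^2[P_k, A]\bigr)\,A\,\bigl(1 + \varepsilon M_k + \varepsilon^2[P_k, A]\bigr)^{-1}. \tag{5.29}
\]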
We see that the resulting formula is similar to (5.26), but its right-hand side depends not only on $M_k$ but also on $P_k$ (though this latter dependence appears only in $O(\varepsilon^2)$ terms). According to Lemma C.2, both versions allow for the reconstruction of the evolution of the frame $g_k$ from the evolution of the reduced variables $(M_k, P_k)$, anyway.

We close this section with a Lax representation for the map (5.25), (5.26).

Theorem 4. The map (5.25), (5.26) has the following Lax representation:
\[
L_{k+1}(\lambda) = U_k^{-1}(\lambda)\,L_k(\lambda)\,U_k(\lambda), \tag{5.30}
\]
with the matrices
\[
L_k(\lambda) = \lambda^2\Bigl(A + \frac{\varepsilon}{2}[A, M_k] + \frac{\varepsilon^2}{4}\,P_k\Bigr) + \lambda M_k + P_k, \qquad U_k(\lambda) = (1 + \varepsilon\lambda A)\,W_k. \tag{5.31}
\]

Proof. A direct verification. □

6. Motivation: Lagrange Top and Elastic Curves

The Lagrange function (5.1) was found using an analogy between the Lagrange top and the elastic curves as a heuristic tool. The present section is devoted to an exposition of the corresponding interrelations.

Let $\gamma : [0, l] \mapsto \mathbb{R}^3$ be a smooth curve parametrized by the arclength $x \in [0, l]$. Defining the tangent vector $T : [0, l] \mapsto \mathbb{R}^3$ as $T(x) = \gamma'(x)$, the characteristic property of the arclength parametrization may be expressed as
\[
|T(x)| = 1, \tag{6.1}
\]
where $|\cdot|$ stands for the euclidean norm. The curvature of the curve $\gamma$ is defined as
\[
k(x) = |T'(x)|. \tag{6.2}
\]

Definition 1 ([L,LS]). A classical elastic curve (Bernoulli’s elastica) is a curve delivering an extremum to the functional
\[
\int_0^l k^2(x)\,dx. \tag{6.3}
\]
The admissible variations of the curve are those preserving $\gamma(0)$ and $\gamma(l)$, more precisely, those preserving $\gamma(l) - \gamma(0) = \int_0^l T(x)\,dx$.

Introducing the Lagrange multipliers $p \in \mathbb{R}^3$ corresponding to this constraint, we come to the functional
\[
\int_0^l \bigl(|T'(x)|^2 - 2\langle p, T(x)\rangle\bigr)\,dx. \tag{6.4}
\]
Identifying the arclength parameter x with the time t, this functional becomes (twice) the action functional for the spherical pendulum. So, classical elasticae are in a one-to-one correspondence with the motions of the spherical pendulum.
A generalization of these notions to elastic rods (which physically means that they can be twisted) requires the curves to be framed, i.e. to carry an orthonormal frame $\Phi(x) = (T(x), N(x), B(x))$ in each point. In other words, a framed curve is a map $\Phi : [0, l] \mapsto \{\text{frames}\}$. The curve itself is then defined by integration: $\gamma(x) = \int_0^x T(y)\,dy$. The following quantities are attributes of a framed curve: the geodesic curvature
\[
k_1(x) = \langle T'(x), N(x)\rangle, \tag{6.5}
\]
the normal curvature
\[
k_2(x) = \langle T'(x), B(x)\rangle, \tag{6.6}
\]
and the torsion
\[
\tau(x) = \langle N'(x), B(x)\rangle. \tag{6.7}
\]
Obviously, one has: $k^2(x) = k_1^2(x) + k_2^2(x)$.

Definition 2 ([L,LS]). An elastic rod (Kirchhoff’s elastica) is a framed curve delivering an extremum to the functional
\[
\int_0^l \bigl(k^2(x) + \alpha\tau^2(x)\bigr)\,dx \tag{6.8}
\]
with some $\alpha \neq 0$. The admissible variations of the curve preserve $\Phi(0)$, $\Phi(l)$, and $\gamma(l) - \gamma(0) = \int_0^l T(x)\,dx$.

The first term in (6.8) corresponds to the bending energy, the second one corresponds to the twist energy. We shall identify $\mathbb{R}^3$ with su(2), as described in Appendix C, and the frames with elements $\Phi \in SU(2)$, according to the following prescription:
\[
T = \Phi^{-1}e_3\Phi, \qquad N = \Phi^{-1}e_1\Phi, \qquad B = \Phi^{-1}e_2\Phi. \tag{6.9}
\]
Then, denoting
\[
\Omega = -\Phi'\Phi^{-1}, \qquad \omega = -\Phi^{-1}\Phi', \tag{6.10}
\]
we find:
\[
k_1 = \langle\omega, B\rangle = \langle\Omega, e_2\rangle = \Omega_2, \qquad k_2 = -\langle\omega, N\rangle = -\langle\Omega, e_1\rangle = -\Omega_1, \tag{6.11}
\]
\[
\tau = \langle\omega, T\rangle = \langle\Omega, e_3\rangle = \Omega_3. \tag{6.12}
\]
So, the variational problem for elastic rods may be formulated as follows: find $\Phi : [0, l] \mapsto SU(2)$ delivering an extremum of the functional
\[
\int_0^l \bigl(\Omega_1^2(x) + \Omega_2^2(x) + \alpha\Omega_3^2(x) - 2\langle p, T(x)\rangle\bigr)\,dx, \tag{6.13}
\]
where p is an (x-independent) Lagrange multiplier coming from the condition of fixed $\gamma(l) - \gamma(0) = \int_0^l T(x)\,dx$. Identifying the arclength parameter x with the time t, and $\Phi(x) = g^{-1}(t)$, so that $\Omega(x) = -\Phi'(x)\Phi^{-1}(x) = g^{-1}(t)\dot g(t) = \Omega(t)$, and $T(x) = \Phi^{-1}(x)e_3\Phi(x) = g(t)e_3g^{-1}(t) = a(t)$, we see that the functional (6.13) coincides with (twice) the action functional for the Lagrange top. This proves the
Proposition 17 (Kirchhoff’s kinetic analogy, [L]). The frames of arclength parametrized elastic rods are in a one-to-one correspondence with the motions of the Lagrange top.

Actually, we use another characterization of the elastic rods. From the Euler–Lagrange equations it follows:

Proposition 18. The torsion $\tau$ along the extremals of the functional (6.13) is constant, and the tangent vector T(x) satisfies the following second-order differential equation:
\[
T \times T'' + cT' = p \times T, \tag{6.14}
\]
where $c = \alpha\tau$. Conversely, each solution T(x) of (6.14) corresponds to a curve $\gamma(x)$, which, being equipped with a frame with constant torsion $\tau$, delivers an extremum to the functional (6.13) with $\alpha = c/\tau$.

Equation (6.14) is (4.17) in new notations. The latter differential equation allows the following interpretation. Consider the so-called Heisenberg flow. It is defined by the differential equation
\[
T_t = T \times T'', \tag{6.15}
\]
and describes the evolution of a curve in the binormal direction with the velocity equal to the curvature. Here the “time” t has nothing in common with the time t of the Lagrange top, which is, remember, identified with x. It is easy to see that the flow on curves defined by the vector field $T_x = T'$ (a reparametrization of a curve) commutes with the Heisenberg flow (6.15). Using this fact, we can integrate (6.15) once in order to find
\[
\gamma_t = \gamma' \times \gamma'' = T \times T'. \tag{6.16}
\]
(The reparametrization flow, once integrated, takes the form $\gamma_x = \gamma' = T$.) Now we can formulate the following fundamental statement.

Theorem 5 ([Ha,LS]). Let $\Phi : [0, l] \mapsto SU(2)$ be the frame of an elastic rod, and $\gamma : [0, l] \mapsto su(2)$ the corresponding curve with the tangent vector $T = \gamma' : [0, l] \mapsto su(2)$. Then the evolution of $\gamma$ under the Heisenberg flow (6.16) is a rigid screw-motion, and the evolution of T under the Heisenberg flow (6.15) is a rigid rotation. Conversely, if the evolution of T is a rigid rotation, then T can be lifted to a frame $\Phi$ of an elastic rod.

The first statement of the theorem follows from (6.14). The left-hand side of (6.14) can be interpreted as the vector field on curves, corresponding to a linear combination of the Heisenberg flow and the reparametrization: $T_t + cT_x = p \times T$. Integrated once, this equation yields a rigid screw motion for the curve $\gamma$: $\gamma_t + c\gamma_x = p \times \gamma + q$, where $q \in su(2)$ is a fixed vector. The converse statement follows from Proposition 6.4.

By the way, this theorem allows one to find a Lax representation for the equation (6.14), and therefore for the Lagrange top, starting from the well-known Lax representation for the Heisenberg flow.
Proposition 19. Equation (6.14) is equivalent to the Lax equation
\[
\ell_x(\lambda) = [\ell(\lambda), u(\lambda)] \tag{6.17}
\]
with the matrices
\[
\ell(\lambda) = \lambda^2T + \lambda(T \times T_x + cT) + p, \qquad u(\lambda) = \lambda T. \tag{6.18}
\]

Proof. Indeed, the Heisenberg flow (6.15) is equivalent to the following matrix equation (“zero curvature representation”, [FT]):
\[
u_t - v_x + [v, u] = 0,
\]
where $u, v \in su(2)[\lambda]$ are the following matrices:
\[
u = \lambda T, \qquad v = \lambda^2T + \lambda\,T \times T_x.
\]
Now it is easy to derive that Eq. (6.14), rewritten as $T_t + cT_x = [p, T]$, is equivalent to (6.17) with $\ell = v + cu + p$. □

Remembering that in the Kirchhoff’s kinetic analogy x is identified with t, T is identified with a, and recalling the formula (4.16), we recover the Lax representation of the Lagrange top in the rest frame given in (4.18), (4.19).

Theorem 6.5 is also a departure point for discretizing elastic curves and, therefore, the Lagrange top [B]. A discrete arc-length parametrized curve is a sequence $\gamma : \mathbb{Z} \mapsto \mathbb{R}^3$ with the property $|T_k| = 1$, where $T_k = \gamma_k - \gamma_{k-1}$. Correspondingly, discrete framed³ curves are the sequences of orthonormal frames $\Phi_k$, such that $T_k = \Phi_k^{-1}e_3\Phi_k$. As before, we identify $\mathbb{R}^3$ with su(2), and the space of orthonormal frames with SU(2). The curve $\gamma$ can be reconstructed by applying the summation operation to the sequence T.

A discretization of the Heisenberg flow is well known [Skl,FT], see also [DS] for a geometric interpretation of the discrete flow. It reads:
\[
(T_k)_t = \frac{2\,T_k \times T_{k+1}}{1 + \langle T_k, T_{k+1}\rangle} - \frac{2\,T_{k-1} \times T_k}{1 + \langle T_{k-1}, T_k\rangle}. \tag{6.19}
\]
A commuting flow approximating $T_x = T'$ is given by:
\[
(T_k)_x = \frac{T_k + T_{k+1}}{1 + \langle T_k, T_{k+1}\rangle} - \frac{T_{k-1} + T_k}{1 + \langle T_{k-1}, T_k\rangle}. \tag{6.20}
\]
Once “integrated”, this gives the flows on $\gamma_k$:
\[
(\gamma_k)_x = \frac{T_k + T_{k+1}}{1 + \langle T_k, T_{k+1}\rangle}, \qquad (\gamma_k)_t = \frac{2\,T_k \times T_{k+1}}{1 + \langle T_k, T_{k+1}\rangle}. \tag{6.21}
\]
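A small consistency check of (6.19) and (6.20): both vector fields must be tangent to the unit sphere at T_k, otherwise the flows would destroy the arc-length parametrization |T_k| = 1. The sketch below is illustrative only; the function names and the three sample unit vectors are hypothetical.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def unit(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)

def heisenberg_field(T_prev, T, T_next):
    """Right-hand side of (6.19) for the vertex between T_{k-1}, T_k, T_{k+1}."""
    f = 2.0 / (1.0 + dot(T, T_next))
    b = 2.0 / (1.0 + dot(T_prev, T))
    return tuple(f * x - b * y
                 for x, y in zip(cross(T, T_next), cross(T_prev, T)))

def shift_field(T_prev, T, T_next):
    """Right-hand side of (6.20)."""
    f = 1.0 / (1.0 + dot(T, T_next))
    b = 1.0 / (1.0 + dot(T_prev, T))
    return tuple(f * (t + tn) - b * (tp + t)
                 for tp, t, tn in zip(T_prev, T, T_next))

# Three consecutive unit tangent vectors (arbitrary sample data).
T_prev = unit((1.0, 0.2, -0.3))
T = unit((0.8, 0.5, 0.1))
T_next = unit((0.4, 0.9, 0.2))

t_dot = dot(T, heisenberg_field(T_prev, T, T_next))
x_dot = dot(T, shift_field(T_prev, T, T_next))
print(t_dot, x_dot)
```

Both scalar products vanish identically for unit vectors, so the printed values should be zero up to rounding.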
Now we accept the following discrete version of Theorem 6.5 as a definition of discrete elastic rods.

³ Note that the frames $\Phi_k$, as well as the tangent vectors $T_k$, are attached to the edges $[\gamma_{k-1}, \gamma_k]$ of the discrete curve $\gamma$.
Definition 3. A discrete elastic rod is a framed curve for which the evolution of $\gamma_k$ under a linear combination of flows $(\gamma_k)_t + c(\gamma_k)_x$ with some c is a rigid screw-motion, so that the evolution of $T_k$ under the flow $(T_k)_t + c(T_k)_x$ is a rigid rotation. In other words, the sequence $T_k$ satisfies the following second order difference equation:
\[
T_k \times \left(\frac{2T_{k+1}}{1 + \langle T_k, T_{k+1}\rangle} + \frac{2T_{k-1}}{1 + \langle T_{k-1}, T_k\rangle}\right) + c\left(\frac{T_{k+1} + T_k}{1 + \langle T_k, T_{k+1}\rangle} - \frac{T_k + T_{k-1}}{1 + \langle T_{k-1}, T_k\rangle}\right) = p \times T_k. \tag{6.22}
\]
(This is Eq. (5.16) with $\varepsilon = 1$ in new notations.)

We can immediately find the Lax representation for the difference equation (6.22).

Proposition 20. Equation (6.22) is equivalent to the Lax equation
\[
\ell_{k+1}(\lambda) = u_k^{-1}(\lambda)\,\ell_k(\lambda)\,u_k(\lambda) \tag{6.23}
\]
with the matrices
\[
\ell_k(\lambda) = \frac{\lambda^2 + c\lambda}{1 + \lambda^2/4}\cdot\frac{T_k + T_{k-1}}{1 + \langle T_k, T_{k-1}\rangle} + \frac{2\lambda - c\lambda^2/2}{1 + \lambda^2/4}\cdot\frac{T_{k-1} \times T_k}{1 + \langle T_k, T_{k-1}\rangle} + p, \tag{6.24}
\]
\[
u_k(\lambda) = 1 + \lambda T_k. \tag{6.25}
\]

Proof. It is well known (see [FT]) that the flows (6.19), (6.20) allow the following “discrete zero curvature representations”:
\[
(u_k)_t = u_kv^{(1)}_{k+1} - v^{(1)}_ku_k, \qquad (u_k)_x = u_kv^{(0)}_{k+1} - v^{(0)}_ku_k,
\]
respectively, with the matrices $u_k$ as in (6.25) and
\[
v^{(1)}_k = \frac{\lambda^2}{1 + \lambda^2/4}\cdot\frac{T_k + T_{k-1}}{1 + \langle T_k, T_{k-1}\rangle} + \frac{2\lambda}{1 + \lambda^2/4}\cdot\frac{T_{k-1} \times T_k}{1 + \langle T_k, T_{k-1}\rangle},
\]
\[
v^{(0)}_k = \frac{\lambda}{1 + \lambda^2/4}\cdot\frac{T_k + T_{k-1}}{1 + \langle T_k, T_{k-1}\rangle} - \frac{\lambda^2/2}{1 + \lambda^2/4}\cdot\frac{T_{k-1} \times T_k}{1 + \langle T_k, T_{k-1}\rangle}.
\]
Now it is easy to see that Eq. (6.22), rewritten as $(T_k)_t + c(T_k)_x = [p, T_k]$, is equivalent to $u_k\ell_{k+1} = \ell_ku_k$ with
\[
\ell_k = v^{(1)}_k + cv^{(0)}_k + p,
\]
which coincides with (6.24). □
To establish a link with the discrete time Lagrange top, recall that the formula (5.17) in our new notations reads:
\[
m_k = 2\,\frac{T_{k-1} \times T_k}{1 + \langle T_k, T_{k-1}\rangle} + c\,\frac{T_k + T_{k-1}}{1 + \langle T_k, T_{k-1}\rangle},
\]
which implies also
\[
T_k + \frac12\,T_k \times m_k = \frac{T_k + T_{k-1}}{1 + \langle T_k, T_{k-1}\rangle} - \frac{c}{2}\,\frac{T_{k-1} \times T_k}{1 + \langle T_k, T_{k-1}\rangle}.
\]
Hence we can write:
\[
(1 + \lambda^2/4)\,\ell_k = \lambda^2\Bigl(T_k + \frac12\,T_k \times m_k + \frac14\,p\Bigr) + \lambda m_k + p,
\]
which coincides with (5.23) up to a nonessential constant factor.

It remains to find a variational problem generating the equations of motion (6.22). But the calculations of Sect. 5 show that this task is solved by the functional (5.1). This gives the following alternative definition of discrete elastic rods.

Definition 4. A discrete elastic rod is a discrete framed curve given by a finite sequence $\Phi_1, \ldots, \Phi_N \in SU(2)$ delivering an extremum to the functional
\[
\sum_{k=1}^{N-1}\Bigl(-4\alpha\log\operatorname{tr}(\Phi_{k+1}\Phi_k^{-1}) - 2(1-\alpha)\log(1 + \langle T_k, T_{k+1}\rangle)\Bigr) - \sum_{k=1}^{N}\langle p, T_k\rangle \tag{6.26}
\]
with some $\alpha \neq 0$. The admissible variations of the curve preserve $\Phi_1$, $\Phi_N$, and $\gamma_N - \gamma_0 = \sum_{k=1}^{N}\Phi_k^{-1}e_3\Phi_k$.

The equivalence of Definitions 6.7 and 6.9 is the basic new result of this section. It is a geometric counterpart and a motivation for the considerations of Sect. 5.

We want to close this section by giving discretizations of geometrical notions like curvature and torsion. Notice that the functional (6.26) naturally splits into two parts, one independent of $\alpha$ and one proportional to $\alpha$. Accordingly, we declare
\[
-2\sum_k\log(1 + \langle T_k, T_{k+1}\rangle) = 2\sum_k\log\Bigl(1 + \frac14\,k_k^2\Bigr) + \mathrm{const} \tag{6.27}
\]
as a discretization of the “bending energy” $\frac12\int_0^l k^2(x)\,dx$, and
\[
\sum_k\Bigl(-4\log\operatorname{tr}(\Phi_{k+1}\Phi_k^{-1}) + 2\log(1 + \langle T_k, T_{k+1}\rangle)\Bigr) = 2\sum_k\log\Bigl(1 + \frac14\,\tau_k^2\Bigr) + \mathrm{const} \tag{6.28}
\]
as a discretization of the “twist energy” $\frac12\int_0^l \tau^2(x)\,dx$. Here we define the “discrete curvature” $k_k$ at the vertex $\gamma_k$ by
\[
1 + \frac14\,k_k^2 = \frac{2}{1 + \langle T_k, T_{k+1}\rangle} \quad\Leftarrow\quad k_k = 2\tan(\varphi_k/2),
\]
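where $\varphi_k$ is the angle between the vectors $T_k$ and $T_{k+1}$. Notice that $k_k$ depends not on the whole frame, but on the tangent vectors $T_k$ only, so that it makes sense also for non-framed curves. The “discrete torsion” $\tau_k$ at the vertex $\gamma_k$ is defined by
\[
1 + \frac14\,\tau_k^2 = \frac{2\,(1 + \langle T_k, T_{k+1}\rangle)}{\bigl(\operatorname{tr}(\Phi_{k+1}\Phi_k^{-1})\bigr)^2} \quad\Leftarrow\quad \tau_k = -2\left\langle\frac{\Im(\Phi_{k+1}\Phi_k^{-1})}{\operatorname{tr}(\Phi_{k+1}\Phi_k^{-1})},\ e_3\right\rangle.
\]
The last formula will be commented on immediately. Let us demonstrate that, in a complete analogy with the continuous case, the discrete torsion is constant along the extremals of the functional (6.26). Denoting for a moment
\[
\Phi_{k+1}\Phi_k^{-1} = \begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix} \in SU(2),
\]
we find:
\[
1 + \langle T_k, T_{k+1}\rangle = 1 - 2\operatorname{tr}(\Phi_{k+1}\Phi_k^{-1}e_3\Phi_k\Phi_{k+1}^{-1}e_3) = 2|a|^2,
\]
and also
\[
\Re(a) = \frac12\operatorname{tr}(\Phi_{k+1}\Phi_k^{-1}), \qquad \Im(a) = \operatorname{tr}(\Phi_{k+1}\Phi_k^{-1}e_3) = -\frac12\langle\Im(\Phi_{k+1}\Phi_k^{-1}), e_3\rangle,
\]
so that
\[
\tau_k = 2\,\frac{\Im(a)}{\Re(a)} = -2\left\langle\frac{\Im(\Phi_{k+1}\Phi_k^{-1})}{\operatorname{tr}(\Phi_{k+1}\Phi_k^{-1})},\ e_3\right\rangle = -2\left\langle\frac{\Im(\Phi_k^{-1}\Phi_{k+1})}{\operatorname{tr}(\Phi_k^{-1}\Phi_{k+1})},\ T_{k+1}\right\rangle. \tag{6.29}
\]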
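The relations between the discrete curvature, the discrete torsion and the entry a of the transition matrix can be confirmed on sample frames. The following sketch is hypothetical illustration code: it builds two elements of SU(2) in the form (C.1), forms the tangent vectors by (6.9), and evaluates the defects of the identities stated above; the function names and the numerical values are assumptions.

```python
import math

def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(x):
    return [[x[j][i].conjugate() for j in range(2)] for i in range(2)]

def trace(x):
    return x[0][0] + x[1][1]

def inner(x, y):
    """Scalar product (C.9): <x, y> = 2 tr(x y*)."""
    return (2 * trace(mul(x, dagger(y)))).real

def su2_group(alpha, beta):
    """Element of SU(2) in the form (C.1), normalised to |alpha|^2+|beta|^2=1."""
    n = math.sqrt(abs(alpha)**2 + abs(beta)**2)
    alpha, beta = alpha / n, beta / n
    return [[alpha, beta], [-beta.conjugate(), alpha.conjugate()]]

E3 = [[-0.5j, 0j], [0j, 0.5j]]          # basis vector e3 of (C.6)

def tangent(phi):
    """T = Phi^{-1} e3 Phi, using Phi^{-1} = Phi* in SU(2)."""
    return mul(mul(dagger(phi), E3), phi)

phi_k = su2_group(0.9 + 0.1j, 0.2 - 0.3j)      # sample frames (assumption)
phi_next = su2_group(0.8 + 0.3j, 0.1 + 0.4j)
g = mul(phi_next, dagger(phi_k))               # Phi_{k+1} Phi_k^{-1}
a = g[0][0]
tr_g = trace(g).real

cos_phi = inner(tangent(phi_k), tangent(phi_next))   # <T_k, T_{k+1}>
curvature = 2.0 * math.tan(math.acos(cos_phi) / 2.0)
torsion = 2.0 * a.imag / a.real

im_g = [[g[i][j] - 0.5 * tr_g * (i == j) for j in range(2)] for i in range(2)]
checks = {
    "1 + <T,T'> = 2|a|^2": 1.0 + cos_phi - 2.0 * abs(a)**2,
    "curvature": 1.0 + curvature**2 / 4.0 - 2.0 / (1.0 + cos_phi),
    "torsion": 1.0 + torsion**2 / 4.0 - 2.0 * (1.0 + cos_phi) / tr_g**2,
    "(6.29)": torsion + 2.0 * inner(im_g, E3) / tr_g,
}
print(curvature, torsion, checks)
```

The sample frames were chosen so that tr(Φ_{k+1}Φ_k^{-1}) is positive and T_k, T_{k+1} are not antipodal; otherwise the discrete torsion or curvature is not defined.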
Comparing this with (5.12) (remember, we set $\varepsilon = 1$ and identified $a_k$ with $T_k$ and $w_k$ with $\Phi_{k+1}^{-1}\Phi_k$), we see that $\tau_k = c/\alpha$, where $c = \langle m_{k+1}, T_{k+1}\rangle$ is an integral of motion of the Euler–Lagrange equations (a Casimir function of the e(3) Lie–Poisson bracket). This corresponds literally to the continuous case.

Remark 12. The case $\alpha = 0$ corresponds to discrete elastic curves $\gamma : \mathbb{Z} \mapsto \mathbb{R}^3$. The tangent vectors $T : \mathbb{Z} \mapsto S^2$, $T_k = \gamma_k - \gamma_{k-1}$, define a trajectory of the discrete time spherical pendulum. Its Lagrange function is obtained, as in the continuous time case, from the bending energy (6.27), upon introducing the Lagrange multiplier p. Notice that the Lagrange function of the discrete time spherical pendulum is defined on $S^2 \times S^2$.

7. Visualisation

After the theory has been developed, it is tempting to look at the spinning of the discrete time Lagrange top. Fortunately, in the computer era, a discrete time top is even simpler to simulate than a classical one. Indeed, as it is shown in Theorem 5.2, the Poisson map $(m_k, a_k) \mapsto (m_{k+1}, a_{k+1})$ is well defined and can be easily iterated. The vectors $a_k$ having been computed, Proposition 5.4 provides us with the evolution of the frame $g_k$, which describes the rotation of the top completely. So, given $(m_0, a_0)$, the rotation of the top is determined uniquely. Due to (5.17) one can take two consecutive positions $(a_0, a_1)$
of the axis as the initial conditions as well. Figure 1 demonstrates a typical discrete time precession of the axis. Compare this with the classical continuous time pictures in [KS, A]. The motion of the discrete time Lagrange top can be viewed using a web browser. The Java applet has been written by Ulrich Heller and can be found on the web page http://www-sfb288.math.tu-berlin.de/~bobenko. The applet presents an animated spinning top described by the formulas of the present paper.

Fig. 1. Evolution of the axis of the discrete spinning top

8. Conclusion

We took the opportunity of elaborating an integrable discretization of the Lagrange top to study in considerable detail the general theory of discrete time Lagrangian mechanics on Lie groups. We consider this theory an important source of symplectic and, more generally, Poisson maps. Moreover, from some points of view the variational (Lagrangian) structure is even more fundamental and important than the Poisson (Hamiltonian) one (cf. [HMR,MPS], where a similar viewpoint is represented). In particular, discrete Lagrangians on G × G may serve as models for general (not necessarily integrable) cases of the rigid body motion (cf. [WM]). It is somewhat astonishing that this construction is able to produce integrable discrete time systems, since integrability is not built into it a priori. Nevertheless, we extend the Moser–Veselov list [V,MV] of integrable discrete time Lagrangian systems with a new item, namely, an integrable discrete time Lagrange top. It seems that this list may be further continued.

In finding this new discrete time mechanical system an analogy with some differential-geometric notions was very instructive. These interrelations between integrable differential geometry and integrable mechanics, both continuous and discrete, also deserve to be studied further.

Let us mention also some more concrete problems connected with this work. First of all, the discrete time Lax representations found here call for being understood both from the r-matrix point of view [RSTS,S] and from the point of view of matrix factorizations [MV] (unfortunately, these two schemes, being in principle closely related, still could not be merged into a unified one). Further, the discrete time dynamics should be integrated in terms of elliptic functions. The methods of the finite-gap theory will be useful here
[RM]. Finally, it would be important to elaborate a variational interpretation of different integrable discretizations of the Euler top found in [BLS].

Note added in proof. Some of the problems mentioned in the Conclusion are now solved. The r-matrix interpretation of the Lax matrix $\ell_k(\lambda)$ in (5.23) is as follows: the matrices
\[
\frac{\lambda^{-1}\ell(\lambda)}{1 + \varepsilon^2\lambda^2/4} = \lambda^{-1}p + \frac{m}{1 + \varepsilon^2\lambda^2/4} + \frac{\lambda\bigl(a + \frac{\varepsilon}{2}[a, m]\bigr)}{1 + \varepsilon^2\lambda^2/4}
\]
form an orbit of the linear r-matrix bracket in the loop algebra $su(2)[\lambda, \lambda^{-1}]$ corresponding to the following standard R-operator:
\[
R\,u(\lambda) = \text{strictly positive part of } u(\lambda) - \text{nonpositive part of } u(\lambda).
\]
Notice that as $\varepsilon \to 0$ it turns into another orbit, consisting of the Lax matrices of the continuous time Lagrange top (4.19), $\lambda^{-1}\ell(\lambda) = \lambda^{-1}p + m + \lambda a$. The Lax representation (5.22) of the discrete time Lagrange top may be cast also in the form
\[
(1 + \varepsilon\lambda a_{k+1})\bigl(1 + \varepsilon\lambda^{-1}p + \varepsilon(m_{k+1} - \varepsilon a_{k+1}p)\bigr) = \bigl(1 + \varepsilon\lambda^{-1}p + \varepsilon(m_k - \varepsilon a_kp)\bigr)(1 + \varepsilon\lambda a_k),
\]
which is typical for the approach of [MV]. However, unlike the situation in [MV], the corresponding matrix factorization problem in the loop group $SU(2)[\lambda, \lambda^{-1}]$, connected with the above R-operator, has a unique solution, which explains why our discrete top is described by a genuine map and not by a correspondence.

A. Notations

We fix here some notations and definitions used throughout the paper.

Let G be a Lie group with the Lie algebra $\mathfrak{g}$, and let $\mathfrak{g}^*$ be a dual vector space to $\mathfrak{g}$. We identify $\mathfrak{g}$ and $\mathfrak{g}^*$ with the tangent space and the cotangent space to G in the group unity, respectively: $\mathfrak{g} = T_eG$, $\mathfrak{g}^* = T_e^*G$. The pairing between the cotangent and the tangent spaces $T_g^*G$ and $T_gG$ in an arbitrary point $g \in G$ is denoted by $\langle\cdot,\cdot\rangle$. The left and right translations in the group are the maps $L_g, R_g : G \mapsto G$ defined by
\[
L_g h = gh, \qquad R_g h = hg \qquad \forall h \in G,
\]
and $L_{g*}$, $R_{g*}$ stand for the differentials of these maps:
\[
L_{g*} : T_hG \mapsto T_{gh}G, \qquad R_{g*} : T_hG \mapsto T_{hg}G.
\]
We denote by
\[
\mathrm{Ad}\,g = L_{g*}R_{g^{-1}*} : \mathfrak{g} \mapsto \mathfrak{g}
\]
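the adjoint action of the Lie group G on its Lie algebra $\mathfrak{g} = T_eG$. The linear operators
\[
L_g^* : T_{gh}^*G \mapsto T_h^*G, \qquad R_g^* : T_{hg}^*G \mapsto T_h^*G
\]
are conjugated to $L_{g*}$, $R_{g*}$, respectively, via the pairing $\langle\cdot,\cdot\rangle$:
\[
\langle L_g^*\xi, \eta\rangle = \langle\xi, L_{g*}\eta\rangle \quad\text{for}\quad \xi \in T_{gh}^*G,\ \eta \in T_hG,
\]
\[
\langle R_g^*\xi, \eta\rangle = \langle\xi, R_{g*}\eta\rangle \quad\text{for}\quad \xi \in T_{hg}^*G,\ \eta \in T_hG.
\]
The coadjoint action of the group
\[
\mathrm{Ad}^*g = L_g^*R_{g^{-1}}^* : \mathfrak{g}^* \mapsto \mathfrak{g}^*
\]
is conjugated to $\mathrm{Ad}\,g$ via the pairing $\langle\cdot,\cdot\rangle$:
\[
\langle\mathrm{Ad}^*g\cdot\xi, \eta\rangle = \langle\xi, \mathrm{Ad}\,g\cdot\eta\rangle \quad\text{for}\quad \xi \in \mathfrak{g}^*,\ \eta \in \mathfrak{g}.
\]
The differentials of $\mathrm{Ad}\,g$ and of $\mathrm{Ad}^*g$ with respect to g in the group unity e are the operators
\[
\mathrm{ad}\,\eta : \mathfrak{g} \mapsto \mathfrak{g} \quad\text{and}\quad \mathrm{ad}^*\eta : \mathfrak{g}^* \mapsto \mathfrak{g}^*,
\]
respectively, also conjugated via the pairing $\langle\cdot,\cdot\rangle$:
\[
\langle\mathrm{ad}^*\eta\cdot\xi, \zeta\rangle = \langle\xi, \mathrm{ad}\,\eta\cdot\zeta\rangle \qquad \forall\xi \in \mathfrak{g}^*,\ \zeta \in \mathfrak{g}.
\]
The action of ad is given by applying the Lie bracket in $\mathfrak{g}$:
\[
\mathrm{ad}\,\eta\cdot\zeta = [\eta, \zeta], \qquad \forall\zeta \in \mathfrak{g}.
\]
Finally, we shall need the notion of gradients of functions on vector spaces and on manifolds. If $\mathcal{X}$ is a vector space, and $f : \mathcal{X} \mapsto \mathbb{R}$ is a smooth function, then the gradient $\nabla f : \mathcal{X} \mapsto \mathcal{X}^*$ is defined via the formula
\[
\langle\nabla f(x), y\rangle = \frac{d}{d\epsilon}f(x + \epsilon y)\Big|_{\epsilon=0}, \qquad \forall y \in \mathcal{X}.
\]
Similarly, for a function $f : G \mapsto \mathbb{R}$ on a smooth manifold G its gradient $\nabla f : G \mapsto T^*G$ is defined in the following way: for an arbitrary $\dot g \in T_gG$ let $g(\epsilon)$ be a curve in G through $g(0) = g$ with the tangent vector $\dot g(0) = \dot g$. Then
\[
\langle\nabla f(g), \dot g\rangle = \frac{d}{d\epsilon}f(g(\epsilon))\Big|_{\epsilon=0}.
\]
If G is a Lie group, then two convenient ways to define a curve in G through g with the tangent vector $\dot g$ are the following:
\[
g(\epsilon) = e^{\epsilon\eta}g, \qquad \eta = R_{g^{-1}*}\dot g,
\]
and
\[
g(\epsilon) = ge^{\epsilon\eta}, \qquad \eta = L_{g^{-1}*}\dot g,
\]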
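which allows one to establish the connection of the gradient $\nabla f$ with the (somewhat more convenient) notions of the left and the right Lie derivatives of a function $f : G \mapsto \mathbb{R}$:
\[
\nabla f(g) = R_{g^{-1}}^*\,df(g) = L_{g^{-1}}^*\,d'f(g).
\]
Here $df : G \mapsto \mathfrak{g}^*$ and $d'f : G \mapsto \mathfrak{g}^*$ are defined via the formulas
\[
\langle df(g), \eta\rangle = \frac{d}{d\epsilon}f(e^{\epsilon\eta}g)\Big|_{\epsilon=0}, \qquad \forall\eta \in \mathfrak{g},
\]
\[
\langle d'f(g), \eta\rangle = \frac{d}{d\epsilon}f(ge^{\epsilon\eta})\Big|_{\epsilon=0}, \qquad \forall\eta \in \mathfrak{g}.
\]

B. Lagrangian Equations of Motion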
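General Lagrangian systems

Continuous time:
\[
L(g, \dot g); \qquad \Pi = \nabla_{\dot g}L, \qquad \dot\Pi = \nabla_gL.
\]
Discrete time:
\[
L(g_k, g_{k+1}); \qquad \Pi_k = -\nabla_1L(g_k, g_{k+1}), \qquad \Pi_{k+1} = \nabla_2L(g_k, g_{k+1}).
\]

Left trivialization: $M = L_g^*\Pi$

Continuous time:
\[
L(g, \dot g) = L^{(l)}(g, \Omega), \qquad \Omega = L_{g^{-1}*}\dot g,
\]
\[
M = L_g^*\Pi = \nabla_\Omega L^{(l)}, \qquad \dot M = \mathrm{ad}^*\Omega\cdot M + d'_gL^{(l)}, \qquad \dot g = L_{g*}\Omega.
\]
Discrete time:
\[
L(g_k, g_{k+1}) = L^{(l)}(g_k, W_k), \qquad W_k = g_k^{-1}g_{k+1},
\]
\[
M_k = L_{g_k}^*\Pi_k = d'_WL^{(l)}(g_{k-1}, W_{k-1}), \qquad \mathrm{Ad}^*W_k^{-1}\cdot M_{k+1} = M_k + d'_gL^{(l)}(g_k, W_k), \qquad g_{k+1} = g_kW_k.
\]

Left trivialization, left symmetry reduction: $M = L_g^*\Pi$, $P = \mathrm{Ad}\,g^{-1}\cdot\zeta$

Continuous time:
\[
L(g, \dot g) = L^{(l)}(P, \Omega), \qquad \Omega = L_{g^{-1}*}\dot g, \qquad P = \mathrm{Ad}\,g^{-1}\cdot\zeta,
\]
\[
M = L_g^*\Pi = \nabla_\Omega L^{(l)}, \qquad \dot M = \mathrm{ad}^*\Omega\cdot M + \mathrm{ad}^*P\cdot\nabla_PL^{(l)}, \qquad \dot P = [P, \Omega].
\]
Discrete time:
\[
L(g_k, g_{k+1}) = \Lambda^{(l)}(P_k, W_k), \qquad W_k = g_k^{-1}g_{k+1}, \qquad P_k = \mathrm{Ad}\,g_k^{-1}\cdot\zeta,
\]
\[
M_k = L_{g_k}^*\Pi_k = d'_W\Lambda^{(l)}(P_{k-1}, W_{k-1}), \qquad \mathrm{Ad}^*W_k^{-1}\cdot M_{k+1} = M_k + \mathrm{ad}^*P_k\cdot\nabla_P\Lambda^{(l)}(P_k, W_k), \qquad P_{k+1} = \mathrm{Ad}\,W_k^{-1}\cdot P_k.
\]

Right trivialization: $m = R_g^*\Pi$

Continuous time:
\[
L(g, \dot g) = L^{(r)}(g, \omega), \qquad \omega = R_{g^{-1}*}\dot g,
\]
\[
m = R_g^*\Pi = \nabla_\omega L^{(r)}, \qquad \dot m = -\mathrm{ad}^*\omega\cdot m + d_gL^{(r)}, \qquad \dot g = R_{g*}\omega.
\]
Discrete time:
\[
L(g_k, g_{k+1}) = L^{(r)}(g_k, w_k), \qquad w_k = g_{k+1}g_k^{-1},
\]
\[
m_k = R_{g_k}^*\Pi_k = d_wL^{(r)}(g_{k-1}, w_{k-1}), \qquad \mathrm{Ad}^*w_k\cdot m_{k+1} = m_k + d_gL^{(r)}(g_k, w_k), \qquad g_{k+1} = w_kg_k.
\]

Right trivialization, right symmetry reduction: $m = R_g^*\Pi$, $a = \mathrm{Ad}\,g\cdot\zeta$

Continuous time:
\[
L(g, \dot g) = L^{(r)}(a, \omega), \qquad \omega = R_{g^{-1}*}\dot g, \qquad a = \mathrm{Ad}\,g\cdot\zeta,
\]
\[
m = R_g^*\Pi = \nabla_\omega L^{(r)}, \qquad \dot m = -\mathrm{ad}^*\omega\cdot m - \mathrm{ad}^*a\cdot\nabla_aL^{(r)}, \qquad \dot a = [\omega, a].
\]
Discrete time:
\[
L(g_k, g_{k+1}) = \Lambda^{(r)}(a_k, w_k), \qquad w_k = g_{k+1}g_k^{-1}, \qquad a_k = \mathrm{Ad}\,g_k\cdot\zeta,
\]
\[
m_k = R_{g_k}^*\Pi_k = d_w\Lambda^{(r)}(a_{k-1}, w_{k-1}), \qquad \mathrm{Ad}^*w_k\cdot m_{k+1} = m_k - \mathrm{ad}^*a_k\cdot\nabla_a\Lambda^{(r)}(a_k, w_k), \qquad a_{k+1} = \mathrm{Ad}\,w_k\cdot a_k.
\]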
The relation between the continuous time and the discrete time equations is established, if we set
\[
g_k = g, \qquad g_{k+1} = g + \varepsilon\dot g + O(\varepsilon^2), \qquad L(g_k, g_{k+1}) = \varepsilon L(g, \dot g) + O(\varepsilon^2);
\]
\[
P_k = P, \qquad W_k = 1 + \varepsilon\Omega + O(\varepsilon^2), \qquad \Lambda^{(l)}(P_k, W_k) = \varepsilon L^{(l)}(P, \Omega) + O(\varepsilon^2);
\]
\[
a_k = a, \qquad w_k = 1 + \varepsilon\omega + O(\varepsilon^2), \qquad \Lambda^{(r)}(a_k, w_k) = \varepsilon L^{(r)}(a, \omega) + O(\varepsilon^2).
\]

C. On SU(2) and su(2)

The Lie group G = SU(2) consists of complex 2 × 2 matrices g satisfying the condition $gg^* = g^*g = 1$, where 1 is the group unit, i.e. the 2 × 2 unit matrix, and * denotes the Hermitian conjugation, i.e. $g^* = \bar g^T$. In components:
\[
g = \begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix} = \begin{pmatrix} a + ib & c + id \\ -c + id & a - ib \end{pmatrix}, \tag{C.1}
\]
where $\alpha = a + ib$, $\beta = c + id \in \mathbb{C}$, and
\[
|\alpha|^2 + |\beta|^2 = a^2 + b^2 + c^2 + d^2 = 1. \tag{C.2}
\]
The tangent space $T_eSU(2)$ is the Lie algebra $\mathfrak{g} = su(2)$ consisting of complex 2 × 2 matrices $\eta$ such that $\eta + \eta^* = 0$. In components,
\[
\eta = \begin{pmatrix} ib & c + id \\ -c + id & -ib \end{pmatrix}. \tag{C.3}
\]
The Lie bracket in su(2) is the usual matrix commutator.

Let us introduce the following notations: for an arbitrary matrix g of the form (C.1), not necessarily belonging to SU(2), set
\[
\Re(g) = \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix}, \qquad \Im(g) = \begin{pmatrix} ib & c + id \\ -c + id & -ib \end{pmatrix},
\]
so that $\Re(g)$ is a scalar real matrix, and $\Im(g) \in su(2)$.
As it is always the case for matrix groups, we have for $g \in SU(2)$, $\eta \in su(2)$:
\[
L_{g*}\eta = g\eta, \qquad R_{g*}\eta = \eta g, \qquad \mathrm{Ad}\,g\cdot\eta = g\eta g^{-1}. \tag{C.4}
\]
If we write (C.3) as
\[
\eta = \frac12\begin{pmatrix} -i\eta_3 & -\eta_2 - i\eta_1 \\ \eta_2 - i\eta_1 & i\eta_3 \end{pmatrix}, \tag{C.5}
\]
and put this matrix in a correspondence with the vector $\eta = (\eta_1, \eta_2, \eta_3)^T \in \mathbb{R}^3$, then it is easy to verify that this correspondence is an isomorphism between su(2) and the Lie algebra $(\mathbb{R}^3, \times)$, where × stands for the vector product. This allows us not to distinguish between vectors from $\mathbb{R}^3$ and matrices from su(2). In other words, we use the following basis of the linear space su(2):
\[
e_1 = \frac12\begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix} = \frac{1}{2i}\,\sigma_1, \qquad e_2 = \frac12\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = \frac{1}{2i}\,\sigma_2, \qquad e_3 = \frac12\begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix} = \frac{1}{2i}\,\sigma_3, \tag{C.6}
\]
where $\sigma_j$ are the Pauli matrices. We supply su(2) with the scalar product $\langle\cdot,\cdot\rangle$ induced from $\mathbb{R}^3$. It is easy to see that in the matrix form it may be represented as
\[
\langle\eta, \zeta\rangle = -2\operatorname{tr}(\eta\zeta) = 2\operatorname{tr}(\eta\zeta^*). \tag{C.7}
\]
This scalar product allows us to identify the dual space su(2)* with su(2) itself, so that the coadjoint action of the algebra becomes the usual Lie bracket with minus:
\[
\mathrm{ad}^*\eta\cdot\xi = [\xi, \eta] = -\mathrm{ad}\,\eta\cdot\xi. \tag{C.8}
\]
We use a formula similar to (C.7) to define a scalar product of two arbitrary complex 2 × 2 matrices:
\[
\langle g_1, g_2\rangle = 2\operatorname{tr}(g_1g_2^*). \tag{C.9}
\]
(In particular, the square of the norm of every matrix $g \in SU(2)$ is equal to 4.) The formula (C.9) gives us a left- and right-invariant scalar product in all tangent spaces $T_gG$. Indeed, to see, for instance, the left invariance, let $g\eta, g\zeta \in T_gSU(2)$ (here $g \in SU(2)$, $\eta, \zeta \in su(2)$). Then
\[
\langle g\eta, g\zeta\rangle = 2\operatorname{tr}(g\eta\zeta^*g^*) = 2\operatorname{tr}(\eta\zeta^*) = \langle\eta, \zeta\rangle.
\]
This scalar product allows us to identify the cotangent spaces $T_g^*G$ with the tangent spaces $T_gG$. It follows easily that:
\[
L_g^*\xi = g^{-1}\xi, \qquad R_g^*\xi = \xi g^{-1}, \qquad \mathrm{Ad}^*g\cdot\xi = g^{-1}\xi g \tag{C.10}
\]
(in these formulas $g \in SU(2)$, so that $g^{-1} = g^*$).

Let us now formulate several simple properties of SU(2) and su(2) which will be used later on.
Lemma 2. For an arbitrary $g \in SU(2)$:

$$g = \cos(\theta)\,\mathbf{1} + \sin(\theta)\,\zeta \quad \text{with} \quad \zeta \in su(2), \ \langle \zeta, \zeta \rangle = 4. \tag{C.11}$$

The adjoint action of $SU(2)$ on $su(2)$ has in this notation the following geometric interpretation: $g\eta g^{-1}$ is a rotation of the vector $\eta$ around the vector $\zeta$ by the angle $2\theta$. This interpretation makes $SU(2)$ very convenient for describing rotations in $\mathbb{R}^3$ (in some respects more convenient than the standard use of $SO(3)$ in this context). Since a rotation fixes only the vectors on the rotation axis, we see that for the case $G = SU(2)$

$$G[\zeta] = G(\zeta).$$

In a different way, the previous lemma may be formulated as follows.

Lemma 3. For $g \in SU(2)$, if

$$\frac{2}{\mathrm{tr}(g)}\,\Im(g) = \xi, \tag{C.12}$$

then

$$g = \frac{\mathrm{tr}(g)}{2}\,(\mathbf{1} + \xi), \qquad \mathrm{tr}(g) = \frac{2}{\sqrt{1 + \langle \xi, \xi \rangle/4}}. \tag{C.13}$$
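As a numerical sanity check of Lemma 3 (an illustration added here, not part of the original appendix), the following Python sketch builds a matrix of the form (C.1), computes $\xi$ from (C.12), and reconstructs $g$ through (C.13). The helper names (`su2_element`, `cayley_xi`, `from_xi`) are assumptions made for this example, and the check is restricted to $\mathrm{tr}(g) > 0$, where the positive square root in (C.13) applies.

```python
import math

ONE = [[1, 0], [0, 1]]

def scal(c, m):
    return [[c * m[i][j] for j in range(2)] for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(m):
    # Hermitian conjugate g* = conj(g)^T
    return [[complex(m[j][i]).conjugate() for j in range(2)] for i in range(2)]

def trace(m):
    return m[0][0] + m[1][1]

def inner(x, y):
    # (C.9): <x, y> = 2 tr(x y*)
    return complex(2 * trace(mul(x, dagger(y)))).real

def su2_element(a, b, c, d):
    # Matrix (C.1), rescaled so that the normalisation (C.2) holds.
    n = math.sqrt(a * a + b * b + c * c + d * d)
    a, b, c, d = a / n, b / n, c / n, d / n
    return [[complex(a, b), complex(c, d)], [complex(-c, d), complex(a, -b)]]

def cayley_xi(g):
    # (C.12): xi = 2 Im(g) / tr(g), with Im(g) = g - (tr(g)/2) 1
    t = complex(trace(g)).real
    return scal(2 / t, add(g, scal(-t / 2, ONE)))

def from_xi(xi):
    # (C.13): g = (tr(g)/2)(1 + xi), tr(g) = 2 / sqrt(1 + <xi, xi>/4)
    t = 2 / math.sqrt(1 + inner(xi, xi) / 4)
    return scal(t / 2, add(ONE, xi))
```

For any input with $a > 0$, `from_xi(cayley_xi(g))` should reproduce `g` up to rounding, and `cayley_xi(g)` should be anti-Hermitian.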
We also have the following simple connection between matrix multiplication and the commutator in $su(2)$.

Lemma 4. For $\eta, \zeta \in su(2)$ their matrix product has the form (C.1), and

$$\eta\zeta = -\frac{1}{4}\langle \eta, \zeta \rangle \mathbf{1} + \frac{1}{2}[\eta, \zeta]. \tag{C.14}$$

In particular, the following corollary is important:

$$\langle \eta, \zeta \rangle = 0 \ \Rightarrow \ \eta\zeta = -\zeta\eta. \tag{C.15}$$
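The identities of this appendix are easy to test numerically. The sketch below (an added illustration; all function names are assumptions of this example) represents vectors of $\mathbb{R}^3$ by matrices in the basis (C.6) and checks that the scalar product (C.7) is the dot product, that the commutator is the vector product, that (C.14) holds, and that $g\eta g^{-1}$ with $g$ as in (C.11) is a rotation by the angle $2\theta$. The rotation is compared with Rodrigues' formula, which is a standard fact and not taken from the text.

```python
import math

ONE = [[1, 0], [0, 1]]
# Basis (C.6): e_j = sigma_j / (2i)
E = [
    [[0, -0.5j], [-0.5j, 0]],
    [[0, -0.5], [0.5, 0]],
    [[-0.5j, 0], [0, 0.5j]],
]

def scal(c, m):
    return [[c * m[i][j] for j in range(2)] for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(m):
    return [[complex(m[j][i]).conjugate() for j in range(2)] for i in range(2)]

def trace(m):
    return m[0][0] + m[1][1]

def inner(x, y):
    # (C.9): <x, y> = 2 tr(x y*); real for x, y in su(2)
    return complex(2 * trace(mul(x, dagger(y)))).real

def to_matrix(v):
    # vector (v1, v2, v3) -> v1 e1 + v2 e2 + v3 e3, cf. (C.5)
    m = [[0, 0], [0, 0]]
    for vj, ej in zip(v, E):
        m = add(m, scal(vj, ej))
    return m

def to_vector(m):
    # components with respect to the orthonormal basis e1, e2, e3
    return [inner(m, ej) for ej in E]

def commutator(a, b):
    return add(mul(a, b), scal(-1, mul(b, a)))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def rotate(v, axis, phi):
    # Rodrigues' formula: rotation by phi about the unit vector `axis`
    dot = sum(a * b for a, b in zip(axis, v))
    c = cross(axis, v)
    return [v[i] * math.cos(phi) + c[i] * math.sin(phi)
            + axis[i] * dot * (1 - math.cos(phi)) for i in range(3)]

def adjoint_action(theta, axis, v):
    # g = cos(theta) 1 + sin(theta) zeta, zeta = 2 * (unit axis), so <zeta, zeta> = 4
    zeta = scal(2, to_matrix(axis))
    g = add(scal(math.cos(theta), ONE), scal(math.sin(theta), zeta))
    return to_vector(mul(mul(g, to_matrix(v)), dagger(g)))  # g^{-1} = g*
```

With a unit vector `axis`, `adjoint_action(theta, axis, v)` should agree with `rotate(v, axis, 2 * theta)`, which is the factor-of-two relation between $SU(2)$ parameters and rotation angles stated after Lemma 2.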
Communicated by T. Miwa
Commun. Math. Phys. 204, 189 – 206 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
On the Ground State Energy of the Fractional Quantum Hall Effect*

Jingbo Xia

Department of Mathematics, State University of New York at Buffalo, Buffalo, NY 14214, USA

Received: 10 November 1998 / Accepted: 27 January 1999
Abstract: Let $I(N, R)$ be the ground state energy of $N$ electrons confined to a disc of radius $R$ with a constant magnetic field $B$ in the perpendicular direction. We show that, in the limit $R \to \infty$ and $N/R^2 \to \nu$, where $\nu$ is the Landau level filling factor, we have $I(N, R) = (\pi/4)\lambda\sqrt{\nu}\,N^{3/2} + o(N^{3/2})$ with $\lambda = (2e^3 m_e^2 c/\hbar^3|B|)^{1/2}$. The factor $\pi/4 = (1/2) \cdot (\pi/2)$ is obtained through the solution of an extreme-value problem in measure theory.

1. Introduction

Let $N$ electrons in the plane $\mathbb{R}^2$ interact with a bounded potential, with a constant magnetic field $B$ in the perpendicular direction, and with each other via the Coulomb potential. Then, in units of magnetic length $(2c\hbar/e|B|)^{1/2}$ and cyclotron energy $e|B|\hbar/2m_e c$, the Hamiltonian representing this system reads

$$H_N = \sum_{j=1}^{N}\Big(\frac{1}{2}H_j + Q(z_j)\Big) + \lambda \sum_{1 \le i < j \le N} |z_i - z_j|^{-1}.$$
Here, $z_j = (x_j, y_j) \in \mathbb{R}^2$, $H_j = (-i(\partial/\partial x_j) + y_j)^2 + (-i(\partial/\partial y_j) - x_j)^2$, $Q$ is a bounded, real-valued measurable function, and $\lambda$ is the dimensionless constant $(2e^3 m_e^2 c/\hbar^3|B|)^{1/2}$, which is large under earthly conditions.

In the study of the fractional quantum Hall effect (FQHE) one further confines the $N$ electrons to a disc of radius $R$ and investigates what happens in the limit $R \to \infty$ and $N/R^2 \to \nu$, where $\nu$ is the Landau level filling factor (see [2, Sec. VII] and [6]). One way to confine the electrons is to constrain their orbital angular momenta, as in the case of Laughlin's approximation to the ground state of FQHE [5,6]. Here, we take the approach of restricting $H_N$ to the polydisc

$$\Delta(N, R) = \{(z_1, \dots, z_N) : |z_j| < R, \ j = 1, \dots, N\}$$

in $(\mathbb{R}^2)^N$.

* Research supported in part by National Science Foundation grant DMS-9703515.

Given $N \ge 2$ and $R > 0$, let $(C_c^\infty(\Delta(N, R)))_a$ denote the collection of functions in $C_c^\infty(\Delta(N, R))$ which are anti-symmetric under the interchange of any two variables. Then $(C_c^\infty(\Delta(N, R)))_a$ is a dense domain in $(L^2(\Delta(N, R)))_a$, the collection of the $L^2$-functions on $\Delta(N, R)$ with the same property. We have the quadratic form

$$[f, g]_{H_N,R,a} = \langle H_N f, g \rangle, \qquad f, g \in (C_c^\infty(\Delta(N, R)))_a.$$

(Here $\langle |z_i - z_j|^{-1} f, g \rangle$ is simply the integral of $|z_i - z_j|^{-1} f\bar g$, which belongs to $L^1$, even though $|z_i - z_j|^{-1} f$ may not be in $L^2$.) Now $[\cdot,\cdot]_{H_N,R,a}$ is obviously closable and bounded from below, and its closure (also denoted by $[\cdot,\cdot]_{H_N,R,a}$) has a form domain which contains $(C_c^\infty(\Delta(N, R)))_a$ as a form core. By a well-known construction of Friedrichs [4,7], the Hamiltonian $H_N$ is realized via $[\cdot,\cdot]_{H_N,R,a}$ as a self-adjoint operator $H_a(N, R)$ on a dense domain $D_a(N, R)$ in the Hilbert space $(L^2(\Delta(N, R)))_a$. Indeed, $D_a(N, R)$ is contained in this form domain, $D_a(N, R)$ is a form core for $[\cdot,\cdot]_{H_N,R,a}$, and

$$\langle H_a(N, R) f, g \rangle = [f, g]_{H_N,R,a}, \qquad f, g \in D_a(N, R).$$

Obviously the choice of the Hilbert space $(L^2(\Delta(N, R)))_a$ is demanded by the exclusion principle, and the form domain $(C_c^\infty(\Delta(N, R)))_a$ in effect imposes the Dirichlet boundary condition on $\Delta(N, R)$, which guarantees that $\sum_{j=1}^{N} H_j$ is positive.

Let $I(N, R)$ denote the ground state energy of $H_a(N, R)$, i.e.,

$$I(N, R) = \inf\{\langle H_a(N, R) f, f \rangle : f \in D_a(N, R), \ \|f\| = 1\},$$

which is the bottom of the spectrum of $H_a(N, R)$. We also have

$$I(N, R) = \inf\{[g, g]_{H_N,R,a} : g \in (C_c^\infty(\Delta(N, R)))_a, \ \|g\| = 1\},$$

since $(C_c^\infty(\Delta(N, R)))_a$ is a form core for $H_a(N, R)$.

The purpose of this paper is to determine the leading term in the asymptotic expansion of $I(N, R)$ in the limit $R \to \infty$ and $N/R^2 \to \nu$. This involves estimates which do not appear to be completely trivial and which seem to shed some light on what the true ground state of FQHE might be.
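It is easy to see that, for $0 < \nu < \infty$,

$$\liminf_{\substack{R \to \infty \\ N/R^2 \to \nu}} N^{-3/2} I(N, R) \ge \frac{1}{4}\lambda\sqrt{\nu}. \tag{1.1}$$

In a previous work [9] we showed that

$$\limsup_{\substack{R \to \infty \\ N/R^2 \to \nu}} N^{-3/2} I(N, R) \le \frac{8}{3\pi}\lambda\sqrt{\nu}. \tag{1.2}$$

Thus $I(N, R) = O(N^{3/2})$. Our main result eliminates the gap between estimates (1.1) and (1.2).

Theorem 1. For any Landau level filling factor $0 < \nu < \infty$, we have

$$\lim_{\substack{R \to \infty \\ N/R^2 \to \nu}} N^{-3/2} I(N, R) = \frac{\pi}{4}\lambda\sqrt{\nu}. \tag{1.3}$$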
This gives us $I(N, R) = (\pi/4)\lambda\sqrt{\nu}\,N^{3/2} + o(N^{3/2})$ in the limit $R \to \infty$ and $N/R^2 \to \nu$. It seems that, given a trial ground state of FQHE, one should be able to use this result to test how closely it approximates the true ground state. Indeed this consideration was the original motivation for our investigation.

The proof of Theorem 1 will be divided into three steps: an upper bound for $I(N, R)$, a lower bound for $I(N, R)$, and an extreme-value problem. Furthermore, (1.3) will be slightly refined in the first two steps (Propositions 2 and 3 below).

Let us now turn to the technical aspect of the proof. Throughout the paper, $U$ denotes the closed unit disc $\{z \in \mathbb{R}^2 : |z| \le 1\}$. The usual area measure on $\mathbb{R}^2$ will be denoted by $dA$ or simply by $A$. It is important to emphasize that the normalization of $A$ is such that $A(U) = \pi$. Let $\mathcal{P}$ be the collection of probability measures on $U$ which are devoid of point masses. (Recall that, in addition to being positive and having total mass 1, a probability measure is required to be a regular Borel measure. Also recall that, if $\mu$ has no point masses, then $(\mu \times \mu)(\{(z, z) : z \in U\}) = 0$.) For each $\mu \in \mathcal{P}$, define

$$J(\mu) = \iint |z - w|^{-1}\,d\mu(z)\,d\mu(w).$$

Define $J_{00} = \inf\{J(\mu) : \mu \in \mathcal{P}\}$. Let $\mathcal{U}$ denote the collection of continuous functions $u$ on $[0,1]$ satisfying the following:

(a) $u$ is non-decreasing on $[0,1]$ and $u(0) > 0$.
(b) $\int_0^1 u(r)\,r\,dr = 1/2\pi$.

Thus for each $u \in \mathcal{U}$, the measure $d\mu_u(z) = u(|z|)\,dA(z)$ on $U$ belongs to $\mathcal{P}$. We will regard $\mathcal{U}$ as a subclass of $\mathcal{P}$ by identifying each $u$ with $\mu_u$. In particular, we write

$$J(u) = \int_U \int_U |z - w|^{-1} u(|z|) u(|w|)\,dA(z)\,dA(w)$$

for $u \in \mathcal{U}$. Furthermore, define $J_0 = \inf\{J(u) : u \in \mathcal{U}\}$. With these definitions, we can now state the three steps mentioned earlier.

Proposition 2. Let $0 < \beta < \infty$ and $u \in \mathcal{U}$. Then

$$\limsup_{\substack{N \to \infty \\ N/R^2 \le \beta}} N^{-3/2} I(N, R) \le \frac{1}{2} J(u)\lambda\sqrt{\beta}.$$

Equation (1.2) is really the easy case of this proposition where $u$ is the constant function $1/\pi$ on $U$; indeed $(1/2) \cdot J(1/\pi)$ accounts for the factor $8/3\pi$ in (1.2).

Proposition 3. Let $0 < \alpha < \infty$. Then

$$\liminf_{\substack{N \to \infty \\ N/R^2 \ge \alpha}} N^{-3/2} I(N, R) \ge \frac{1}{2} J_{00}\lambda\sqrt{\alpha}.$$
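The remark that $(1/2) \cdot J(1/\pi)$ equals $8/3\pi$ can be checked numerically. The following Python sketch is an added illustration with hypothetical function names; it evaluates $J$ for the constant profile $u = 1/\pi$. It uses polar coordinates centred at $w$, in which the factor $|z - w|^{-1}$ cancels the Jacobian, so that $\int_U |z - w|^{-1}\,dA(z)$ reduces to the integral over the angle of the distance from $w$ to the unit circle. The midpoint rule and the grid sizes are assumptions of this sketch, not part of the paper.

```python
import math

def potential_uniform_disc(r, n_theta=400):
    """Integral of |z - w|^{-1} dA(z) over the unit disc, for |w| = r < 1.

    In polar coordinates (rho, theta) centred at w the integrand 1/rho cancels
    the Jacobian rho, leaving the integral over theta of the distance
    rho_max(theta) = -r cos(theta) + sqrt(1 - r^2 sin^2(theta)) to the circle.
    """
    h = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * h
        total += -r * math.cos(theta) + math.sqrt(1 - (r * math.sin(theta)) ** 2)
    return total * h

def J_uniform(n_r=400, n_theta=400):
    """J(u) for u = 1/pi, i.e. pi^{-2} times the double integral of |z - w|^{-1}."""
    h = 1.0 / n_r
    total = 0.0
    for j in range(n_r):
        r = (j + 0.5) * h
        total += potential_uniform_disc(r, n_theta) * r
    # outer integral in polar coordinates: 2 pi * integral of phi(r) r dr
    return 2 * math.pi * total * h / math.pi ** 2
```

Under these assumptions `J_uniform()` should be close to $16/3\pi \approx 1.698$, so that half of it reproduces the constant $8/3\pi \approx 0.849$ of (1.2), which lies above the value $\pi/4 \approx 0.785$ of Theorem 1 and the constant $1/4$ of (1.1).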
Define the measure $\omega$ on $U$ by the formula

$$d\omega(z) = \frac{1}{2\pi} \cdot \frac{1}{\sqrt{1 - |z|^2}}\,dA(z), \qquad |z| < 1. \tag{1.4}$$
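Two properties of the measure in (1.4) that the paper relies on, total mass 1 and a potential $\int |z - w|^{-1}\,d\omega(z)$ that does not depend on $w \in U$, can be probed numerically. The sketch below is an added illustration with hypothetical function names. It again uses polar coordinates $(\rho, \theta)$ centred at $w = (r, 0)$, where $1 - |z|^2 = (\rho_+ - \rho)(\rho - \rho_-)$ with $\rho_\pm = -r\cos\theta \pm \sqrt{1 - r^2\sin^2\theta}$, and the substitution $\rho = \rho_+(1 - s^2)$ to remove the square-root singularity at the boundary. The midpoint rule and the restriction to $r < 1$ are choices made for this example.

```python
import math

def omega_total_mass(n=2000):
    # mass = (1/2pi) * 2pi * integral_0^1 r (1 - r^2)^(-1/2) dr; with r = sin(t)
    # this becomes the integral of sin(t) over [0, pi/2].
    h = (math.pi / 2) / n
    return sum(math.sin((i + 0.5) * h) for i in range(n)) * h

def omega_potential(r, n_theta=300, n_s=300):
    """Integral of |z - w|^{-1} d(omega)(z) for w = (r, 0), 0 <= r < 1."""
    h_theta = 2 * math.pi / n_theta
    h_s = 1.0 / n_s
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * h_theta
        root = math.sqrt(1.0 - (r * math.sin(theta)) ** 2)
        rho_plus = -r * math.cos(theta) + root    # distance from w to the circle
        rho_minus = -r * math.cos(theta) - root   # negative root
        inner = 0.0
        for j in range(n_s):
            s = (j + 0.5) * h_s
            rho = rho_plus * (1.0 - s * s)
            # d(rho) / sqrt((rho_plus - rho)(rho - rho_minus)) after substitution
            inner += 2.0 * math.sqrt(rho_plus) / math.sqrt(rho - rho_minus)
        total += inner * h_s
    return total * h_theta / (2 * math.pi)
```

For sample radii such as 0, 0.3, 0.6 and 0.9 the computed potential should stay close to $\pi/2$, which is consistent with $J(\omega) = \pi/2$ once the total mass is 1.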
It is easy to verify that $\omega$ has total mass 1. That is, $\omega \in \mathcal{P}$. As we will see later, the measure $\omega$ has the property that $\int |z - w|^{-1}\,d\omega(z)$ remains constant when $w \in U$.

Proposition 4. We have $J_0 = J_{00} = J(\omega) = \pi/2$.

Proof of Theorem 1. It follows immediately from Propositions 2, 3 and 4. □

The proofs of these propositions themselves will be given in Sects. 2, 3, and 4 respectively. Since all of these proofs are quite technical, in the rest of this section we will sketch the main ideas behind these proofs and explain what motivated these ideas.

We start with the observation that, since there are $N(N - 1)/2$ terms of Coulomb potentials in the Hamiltonian $H_N$ compared with only $N$ terms for kinetic energy, near the ground state, the repulsive potential energy clearly has a higher order in $N$. This leads to the realization that, when $N$ is large, the total energy can be close to its minimum even if the kinetic energy is not. In other words, in order to minimize the total energy, it may be necessary to reduce the potential energy at the price of elevating the kinetic energy slightly. This is what sets our approach apart from previous constructions of trial ground states for FQHE, where states from the lowest Landau level of the one-particle magnetic Hamiltonian played a key role [2,5,6].

Clearly, to bound $I(N, R)$ from above, one needs a configuration of $N$ electrons in a disc of radius $R \ge \sqrt{N/\beta}$ which is close to being energetically optimal. The idea is that, given any $u \in \mathcal{U}$, one looks for a state $\Psi$ in which the collective charge distribution of the $N$ electrons mimics the distribution given by the measure $\beta u(|z/R_0|)\,dA(z)$ on the disc $B(0, R_0) = \{z \in \mathbb{R}^2 : |z| < R_0\}$, where $R_0 = \sqrt{N/\beta} \le R$. In addition, to make the relevant estimates manageable, one needs to limit $\Psi$ to the form of a single Slater determinant,

$$\Psi(z_1, \dots, z_N) = (N!)^{-1/2} \det(\psi_k(z_j))_{j,k=1}^{N}.$$

As it turns out, there exist single-electron states $\psi_1, \dots, \psi_N$ constructed from a single state $\xi$ which meet these two requirements.

To mimic the measure $\beta u(|z/R_0|)\,dA(z)$ on $B(0, R_0)$ with the charge distribution of the electrons, we will show (latter part of Sect. 2) that $B(0, R_0)$ contains approximately $N$ pairwise disjoint open sectors $S_1, \dots, S_N$ such that

$$\beta \int_{S_k} u(|z/R_0|)\,dA(z) = 1 \quad \text{and} \quad B(c_k, \delta) \subset S_k \subset B(c_k, \delta'), \tag{1.5}$$

$1 \le k \le N$, where $\delta$ and $\delta'$ depend only on $u$ and $\beta$. We then place an electron in each $S_k$ in the following way: Starting with a unit vector $\xi \in L^2(\mathbb{R}^2)$ which is smooth and has a support strictly contained in $B(0, \delta)$, we let $\psi_k = U_k T_k \xi$, $1 \le k \le N$. Here, $T_k$ translates $\xi$ to the site $c_k$ and $U_k$ is the corresponding gauge transformation which cancels the effect of the magnetic field on the translation. That is, we start with
a single electron state $\xi$, translate it on the plane, and then individually gauge away the effect of the magnetic field. This ensures that the kinetic energy of each $\psi_k = U_k T_k \xi$ is the same as that of $\xi$, which leads to $\langle \sum_{j=1}^{N} H_j \Psi, \Psi \rangle = O(N)$. Because $\langle \sum_{j=1}^{N} H_j \Psi, \Psi \rangle$ is of lower order, the magnetic field does not appear in the first term of the asymptotic expansion of $I(N, R)$ except in the form of $\lambda = (2e^3 m_e^2 c/\hbar^3|B|)^{1/2}$.

Because of (1.5), the $j$-th electron can hardly distinguish between the actual electric charge distribution $|\psi_k(z)|^2\,dA(z)$ of the $k$-th electron and the distribution given by $\beta u(|z/R_0|)\,dA(z)$ on $S_k$, so long as these electrons are far apart. And the design of $\Psi$ ensures that each electron is far apart from most electrons. Thus each electron feels a charge distribution on $B(0, R_0)$ that is approximately given by $\beta u(|z/R_0|)\,dA(z)$. Hence such an arrangement of the electrons ensures that the potential energy $\langle \sum_{j \ne k} |z_j - z_k|^{-1} \Psi, \Psi \rangle$ approximately equals

$$\beta^2 \int_{|z| < R_0} \int_{|w| < R_0} |z - w|^{-1} u(|z/R_0|) u(|w/R_0|)\,dA(z)\,dA(w) = \beta^2 R_0^3 J(u) = \sqrt{\beta}\,N^{3/2} J(u),$$

where the first equality is the substitution $(z, w) \mapsto (z/R_0, w/R_0)$ and the second uses $N = \beta R_0^2$, giving our desired upper bound for $I(N, R)$. The purpose of Sect. 2 is to make this argument mathematically rigorous.

The lower bound for $I(N, R)$ is much less physically intuitive than the case of the upper bound, but mathematically it is the simplest part of the paper. Also, for the lower bound, we can ignore the magnetic field, for $\sum_{j=1}^{N} H_j$ is positive anyway. Thus the problem becomes the following: Given a unit vector $f \in L^2(\Delta(N, R))$, we need to find a lower bound for $\langle \sum_{i \ne j} |z_i - z_j|^{-1} f, f \rangle$. But, upon approximating the unit vector $g = |f|^2$ in $L^1(\Delta(N, R))$ by simple functions, this is quickly reduced to the problem of finding a lower bound for

$$\sum_{i \ne j} \frac{N^{-2}}{A(E_i)A(E_j)} \iint |z - w|^{-1} \chi_{E_i}(z)\chi_{E_j}(w)\,dA(z)\,dA(w), \tag{1.6}$$

where $E_1, \dots, E_N$ are Borel sets in the unit disc $U$ which have the properties that $A(E_j) > 0$ and that $\mathrm{diam}(E_j) \le 1/N$, but which are otherwise arbitrary. As it turns out, if we let $J_N$ be the infimum of (1.6) taken over all such tuples $(E_1, \dots, E_N)$, then a measure-theoretical argument shows that

$$\liminf_{N \to \infty} J_N \ge J_{00},$$

which is really the essence of Proposition 3. The details of this argument are worked out in Sect. 3.

Now, having obtained the two bounds, we arrive at the inequality

$$N^{3/2}(J_{00}/2)\lambda\sqrt{\nu} + o(N^{3/2}) \le I(N, R) \le N^{3/2}(J_0/2)\lambda\sqrt{\nu} + o(N^{3/2})$$

in the limit $N \to \infty$ and $N/R^2 \to \nu$. The remaining problem is one in pure analysis: Do the quantities $J_0$ and $J_{00}$ coincide and, if they do, what is their actual value? One's immediate guess is that $J_0 = J_{00}$ because, with regard to the functional $J(\cdot)$, $\mathcal{U}$ is "dense" in $\mathcal{P}$ in some intuitive sense. This turns out to be correct, but, to determine the actual value of $J_0 = J_{00}$, extra work is required. Two separate but converging lines of reasoning lead us to the solution of this extreme-value problem.
First of all, it is not difficult to show that the extreme value $J_{00}$ is attainable, i.e., there is a $\mu^* \in \mathcal{P}$ such that $J(\mu^*) = J_{00}$. (Interestingly, we have no proof that such a $\mu^*$ is unique in $\mathcal{P}$.) But the extremal property of $\mu^*$ entails certain consequences, one of which is the inequality

$$\int |z - w|^{-1}\,d\mu^*(z) \le J_{00} \quad \text{for every } w \in \mathbb{R}^2. \tag{1.7}$$

Now this inequality narrows our search considerably: If we can find an $\omega \in \mathcal{P}$ such that

$$\int |z - w|^{-1}\,d\omega(z) \ \text{is constant when } |w| \le 1, \tag{1.8}$$

then, applying (1.7) and (1.8) to $\iint |z - w|^{-1}\,d\omega(z)\,d\mu^*(w)$, we will have $J_{00} = J(\omega)$. Thus we need to find an $\omega$ satisfying (1.8) for which $J(\omega)$ is, hopefully, calculable.

Secondly, the evaluation of $J_{00}$ is equivalent to asking the following question: If a unit quantity of static electric charge is distributed on the unit disc $U$, what is the smallest possible potential energy? But when the potential energy is minimized, one expects the electric field on $U$ to vanish, hence (1.8).

Thus we see that (1.8) is really the key to the solution of our extreme-value problem. For obvious reasons one expects such an $\omega$ to be invariant under the rotation of $U$. Guided by integral calculations related to the identity (4.3) below, our initial guess was that $\omega$ would most likely have the form

$$d\omega(z) = \sum_{j=0}^{\infty} a_j |z|^{2j}\,dA(z), \tag{1.9}$$

which turned out to be correct. Now if one assumes (1.9), then (1.8) becomes a set of workable conditions on the $a_j$'s. As luck would have it, from these conditions we can deduce that if the $a_j$'s are the coefficients in the power series expansion of $(2\pi)^{-1}(1 - x)^{-1/2}$, then (1.8) is satisfied. That is, (1.4) gives the desired $\omega$. But once we know what $\omega$ really is, (1.8) can be directly verified without any reference to power series expansion at all (see the proof of Lemma 11). The identity $J_0 = J_{00}$ then follows from the fact that $\omega$ can be approximated by measures in $\mathcal{U}$ in the appropriate setting.

After the above sketch, let us now fill in the details.

2. Upper Bound for I(N, R)

For the remainder of the paper, we write $B(z, r) = \{w \in \mathbb{R}^2 : |w - z| < r\}$. Let $0 < \beta < \infty$ and $u \in \mathcal{U}$ be given as in the statement of Proposition 2. We write

$$a = u(0) \quad \text{and} \quad b = u(1), \tag{2.1}$$

$$d = (2\pi\beta a)^{-1}. \tag{2.2}$$

Then $0 < a \le u(r) \le b$ for all $r \in [0, 1]$. Let $\varepsilon \in (0, \beta]$ be given. Define $R_0 = R_0(N)$ by

$$N = \beta R_0^2. \tag{2.3}$$
We have $R_0 \le R$ because $N \le \beta R^2$ under the assumptions of Proposition 2. Since we are only interested in large $N$, we may assume that the following conditions are satisfied:

$$R_0 > 3d + 16, \tag{2.4}$$

$$(\beta + \varepsilon)\,b\pi\big((16/R_0)^2 + (4d/R_0)\big) \le \varepsilon, \tag{2.5}$$

$$N \ge 10^6, \ \text{which ensures } 0 < 4(N^{1/6} - 4)^{-1} < 1. \tag{2.6}$$

We claim that there exist $0 < \delta < \delta' < \infty$ which depend only on the numbers $a$, $b$ and $\beta$ such that, for $N$ and $R_0$ satisfying (2.3)-(2.5), the disc $B(0, R_0)$ contains the union of disjoint open sets $S_1, \dots, S_N$ with the properties that, for each $k \in \{1, \dots, N\}$,

$$(\beta + \varepsilon) \int_{S_k} u(|z/R_0|)\,dA(z) = 1, \tag{2.7}$$

$$B(c_k, \delta) \subset S_k \subset B(c_k, \delta'), \ \text{where } c_k \text{ is some point in } S_k. \tag{2.8}$$
In order not to be distracted from our main argument, we will postpone the verification of (2.7) and (2.8) until the end of the proof.

Denote $H = (-i(\partial/\partial x) + y)^2 + (-i(\partial/\partial y) - x)^2$. Let $\eta \in C_c^\infty(\mathbb{R}^2)$ be such that $\eta(z) = 0$ when $|z| \ge 1/3$ and $\int |\eta(z)|^2\,dA(z) = 1$. With $\delta$ given as above, define

$$\xi(z) = \delta^{-1}\eta(\delta^{-1}z), \qquad z \in \mathbb{R}^2. \tag{2.9}$$

Then $\xi$ is a unit vector in $L^2(\mathbb{R}^2)$ and $\xi = 0$ on $\mathbb{R}^2 \setminus B(0, \delta/3)$. Let $c_k = (p_k, q_k)$, $1 \le k \le N$, be the points that appeared in (2.8). For each $1 \le k \le N$, define the unitary operators

$$(T_k f)(x, y) = f(x - p_k, y - q_k) \quad \text{and} \quad (U_k f)(x, y) = e^{i(p_k y - q_k x)} f(x, y)$$

on $L^2(\mathbb{R}^2)$. Define¹ $\psi_k = U_k T_k \xi$, $1 \le k \le N$. Obviously $\|\psi_k\| = 1$, $\psi_k = 0$ on $\mathbb{R}^2 \setminus B(c_k, \delta/3)$, and $\psi_k \psi_{k'} = 0$ if $k \ne k'$. Thus the Slater determinant

$$\Psi(z_1, \dots, z_N) = (N!)^{-1/2} \det(\psi_k(z_j))_{k,j=1}^{N}$$

is a unit vector in the form domain of $H_a(N, R)$. If $\mathcal{S}_N$ denotes the permutation group of $\{1, \dots, N\}$, then $\langle \sum_{j=1}^{N} H_j \Psi, \Psi \rangle = N \sum_{\sigma \in \mathcal{S}_N} \langle H\psi_{\sigma(1)}, \psi_{\sigma(1)} \rangle / N!$. It is easy to verify that $U_k^* H U_k = T_k H T_k^*$. Thus $\|H\psi_k\| = \|H U_k T_k \xi\| = \|U_k T_k H \xi\| = \|H\xi\|$. By (2.9), $\|H\xi\|$ depends only on $\delta$ and the choice of $\eta$, not on $N$ or $R$. Therefore

$$\sum_{j=1}^{N} \langle H_j \Psi, \Psi \rangle \le N\|H\xi\| = O(N). \tag{2.10}$$
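The identity $U_k^* H U_k = T_k H T_k^*$, equivalently $H U_k T_k = U_k T_k H$, can be illustrated numerically. The sketch below is an addition to the text, with hypothetical function names, and it replaces the compactly supported $\xi$ of (2.9) by the Gaussian $e^{-(x^2 + y^2)/2}$, which is an assumption made only because this function satisfies $H\xi = 2\xi$ exactly. If the identity holds, the translated and gauged function $\psi = U T \xi$ must again satisfy $H\psi = 2\psi$. The operator is expanded as $H f = -\Delta f - 2i(y\,\partial_x f - x\,\partial_y f) + (x^2 + y^2) f$ and the derivatives are approximated by central differences.

```python
import cmath
import math

def xi(x, y):
    # Gaussian test function (an assumption of this sketch); H xi = 2 xi.
    return complex(math.exp(-(x * x + y * y) / 2))

def make_psi(p, q):
    # psi = U T xi: translate by (p, q), then apply the gauge phase e^{i(p y - q x)}
    def psi(x, y):
        return cmath.exp(1j * (p * y - q * x)) * xi(x - p, y - q)
    return psi

def apply_H(f, x, y, h=1e-3):
    # H f = -Laplacian(f) - 2i (y f_x - x f_y) + (x^2 + y^2) f, central differences
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    lap = (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h)
           - 4 * f(x, y)) / (h * h)
    return -lap - 2j * (y * fx - x * fy) + (x * x + y * y) * f(x, y)
```

For example, with `psi = make_psi(1.3, -0.7)`, the values `apply_H(psi, x, y)` and `2 * psi(x, y)` should agree up to the discretization error at any sample point, which reflects that the translation followed by the gauge factor does not change the kinetic energy.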
Next let us consider the potential energy $Y = \sum_{i<j} \langle |z_i - z_j|^{-1} \Psi, \Psi \rangle = \sum_{i<j} Y_{ij}$. The $\mathcal{S}_N$-symmetry in this sum gives us $Y = N(N - 1)Y_{12}/2 = (Z - Z')/2$, where

$$Z = \sum_{k \ne k'} \langle |z - w|^{-1} \psi_k(z)\psi_{k'}(w), \psi_k(z)\psi_{k'}(w) \rangle,$$

$$Z' = \sum_{k \ne k'} \langle |z - w|^{-1} \psi_k(z)\psi_{k'}(w), \psi_{k'}(z)\psi_k(w) \rangle.$$

¹ The author gratefully acknowledges that the idea of using the unitary transformations $T_k$ and $U_k$ in this situation, and the idea that the $\psi_k$'s should be arranged so that there is a positive distance between their supports, are due to Gero Friesecke.
For $k \ne k'$, since $\psi_k \psi_{k'} = 0$, we have $\psi_k(z)\psi_{k'}(w)\psi_{k'}(z)\psi_k(w) = 0$ for all $z, w \in \mathbb{R}^2$. This implies $Z' = 0$. That is, $Y = N(N - 1)Y_{12}/2 = Z/2$ and, therefore,

$$I(N, R) \le N\|Q\|_\infty + \frac{1}{2}\sum_{j=1}^{N} \langle H_j \Psi, \Psi \rangle + \frac{1}{2}\lambda Z.$$

Recalling (2.10) and that $\varepsilon \in (0, \beta]$ is arbitrary, Proposition 2 will follow once we establish

$$Z \le \frac{(1 + (\varepsilon/\beta))^2}{1 - 4(N^{1/6} - 4)^{-1}}\,\sqrt{\beta}\,N^{3/2} J(u) + \delta^{-1}(2\delta'/\delta)^2 N^{4/3}. \tag{2.11}$$

To this end we set

$$P = \{(k, k') : 1 \le k, k' \le N, \ k \ne k', \ |c_k - c_{k'}| > \delta' N^{1/6}\}$$

and

$$L = \{(k, k') : 1 \le k, k' \le N, \ k \ne k', \ |c_k - c_{k'}| \le \delta' N^{1/6}\}.$$

Then $Z = W + V$ with

$$W = \sum_{(k,k') \in P} \langle |z - w|^{-1} \psi_k(z)\psi_{k'}(w), \psi_k(z)\psi_{k'}(w) \rangle,$$

$$V = \sum_{(k,k') \in L} \langle |z - w|^{-1} \psi_k(z)\psi_{k'}(w), \psi_k(z)\psi_{k'}(w) \rangle.$$

Recall that $B(c_k, \delta) \cap B(c_{k'}, \delta) \subset S_k \cap S_{k'} = \emptyset$ if $k \ne k'$. Thus, for each $k \in \{1, \dots, N\}$, if $n(k)$ is the number of $k' \in \{1, \dots, N\}$ such that $|c_k - c_{k'}| \le \delta' N^{1/6}$, then $n(k)\pi\delta^2 \le \pi(2\delta' N^{1/6})^2$. Thus $n(k) \le (2\delta'/\delta)^2 N^{1/3}$ and, therefore, $\mathrm{card}(L) \le (2\delta'/\delta)^2 N^{4/3}$. Since $\psi_k = 0$ on $\mathbb{R}^2 \setminus B(c_k, \delta/3)$ and the distance between $B(c_k, \delta/3)$ and $B(c_{k'}, \delta/3)$ is at least $\delta$ when $k \ne k'$, each term in $V$ is bounded by $\delta^{-1}\|\psi_k(z)\psi_{k'}(w)\|^2 = \delta^{-1}$. Hence $V \le \delta^{-1}(2\delta'/\delta)^2 N^{4/3}$.

To estimate $W$, we need the second half of (2.8), i.e., $S_k \subset B(c_k, \delta')$. If $(k, k') \in P$ and $(z, w), (\zeta, \tau) \in S_k \times S_{k'}$, then

$$|z - w| \ge |\zeta - \tau| - 4\delta' = (1 - 4\delta'|\zeta - \tau|^{-1})|\zeta - \tau| \ge (1 - 4(N^{1/6} - 4)^{-1})|\zeta - \tau|.$$

Thus, recalling (2.7) and the fact that the $\psi_k$'s are unit vectors, when $(k, k') \in P$, we have

$$\langle |z - w|^{-1} \psi_k(z)\psi_{k'}(w), \psi_k(z)\psi_{k'}(w) \rangle \le \sup\{|z - w|^{-1} : (z, w) \in S_k \times S_{k'}\}$$
$$\le \frac{(\beta + \varepsilon)^2}{1 - 4(N^{1/6} - 4)^{-1}} \int_{S_{k'}} \Big\{ \int_{S_k} |\zeta - \tau|^{-1} u(|\zeta/R_0|)\,dA(\zeta) \Big\} u(|\tau/R_0|)\,dA(\tau).$$

Because $\{S_k \times S_{k'} : (k, k') \in P\}$ are disjoint subsets of $B(0, R_0) \times B(0, R_0)$, we obtain

$$W \le \frac{(\beta + \varepsilon)^2}{1 - 4(N^{1/6} - 4)^{-1}} \int_{B(0,R_0)} \int_{B(0,R_0)} |\zeta - \tau|^{-1} u(|\zeta/R_0|) u(|\tau/R_0|)\,dA(\zeta)\,dA(\tau).$$

The substitution $(z, w) = (\zeta/R_0, \tau/R_0)$ now yields

$$W \le \frac{(\beta + \varepsilon)^2}{1 - 4(N^{1/6} - 4)^{-1}} \cdot R_0^3 \int_U \int_U |z - w|^{-1} u(|z|) u(|w|)\,dA(z)\,dA(w).$$

Since $\beta R_0^2 = N$, this is exactly the first term on the RHS of (2.11). Thus, subject to the verification of (2.7) and (2.8), Proposition 2 is proved.

To prove (2.7) and (2.8), we need to divide $B(0, R_0)$ first into annuli and then into sectors. We start with $r_1 = 16$. By (2.4), $r_1 + 2d \le R_0$. Suppose that $r_1, \dots, r_n$
have been chosen so that $16 \le r_n$ and $r_n + 2d \le R_0$. Then, by (2.2),

$$(\beta + \varepsilon) \int_{r_n + d \le |z| \le r_n + 2d} u(|z/R_0|)\,dA(z) \ge \beta a\pi d(2r_n + 3d) \ge 2\pi\beta a \cdot d \cdot 16 = 16.$$

This shows that there is an $r_{n+1} \in [r_n + d, r_n + 2d]$ such that

$$p(n) = (\beta + \varepsilon) \int_{r_n \le |z| \le r_{n+1}} u(|z/R_0|)\,dA(z) \in \mathbb{N} \setminus \{1, 2, \dots, 7, 8\}. \tag{2.12.n}$$

Repeat this until the condition $r_n + 2d \le R_0$ is violated. That is, inductively, we obtain $r_1 < r_2 < \dots < r_{m+1}$ such that

$$d \le r_{n+1} - r_n \le 2d, \quad 1 \le n \le m, \tag{2.13}$$

and

$$R_0 - 2d < r_{m+1} \le R_0. \tag{2.14}$$

We also have (2.12.n) for all $1 \le n \le m$. Now, for each $n$, divide the annulus $\{z \in \mathbb{R}^2 : r_n \le |z| \le r_{n+1}\}$ equally into $p(n)$ portions and discard the boundary of each portion to yield open sets $S_{n,1}, \dots, S_{n,p(n)}$. Obviously we have

$$(\beta + \varepsilon) \int_{S_{n,i}} u(|z/R_0|)\,dA(z) = 1, \quad i = 1, \dots, p(n). \tag{2.15}$$

Next we show that $\sum_{n=1}^{m} p(n) \ge N$. That is, we need to show

$$(\beta + \varepsilon) \int_{r_1 \le |z| \le r_{m+1}} u(|z/R_0|)\,dA(z) \ge N. \tag{2.16}$$

Because $\int_{|\zeta| \le 1} u(|\zeta|)\,dA(\zeta) = 1$,

$$(\beta + \varepsilon) \int_{|z| \le R_0} u(|z/R_0|)\,dA(z) = (\beta + \varepsilon)R_0^2 \int_{|\zeta| \le 1} u(|\zeta|)\,dA(\zeta) = (\beta + \varepsilon)R_0^2$$

by the substitution $\zeta = z/R_0$. Since $r_1 = 16$, it follows from (2.5) that

$$(\beta + \varepsilon)\Big( \int_{|z| \le r_1} + \int_{r_{m+1} \le |z| \le R_0} \Big) u(|z/R_0|)\,dA(z) \le (\beta + \varepsilon)\,b\pi(16^2 + 2d \cdot 2R_0)$$
$$= (\beta + \varepsilon)\,b\pi\big((16/R_0)^2 + (4d/R_0)\big)R_0^2 \le \varepsilon R_0^2.$$

Thus the LHS of (2.16) is at least $(\beta + \varepsilon)R_0^2 - \varepsilon R_0^2 = \beta R_0^2 = N$, i.e., $\sum_{n=1}^{m} p(n) \ge N$. Now we simply let $\{S_1, \dots, S_N\}$ be a subfamily of $\{S_{n,i} : 1 \le i \le p(n), 1 \le n \le m\}$ of $N$ members. Then (2.7) is guaranteed by (2.15).

By the definition of $S_{n,i}$, there is a $\theta_{n,i} \in \mathbb{R}$ such that

$$S_{n,i} = \{(r\cos\theta, r\sin\theta) : r_n < r < r_{n+1}, \ \theta_{n,i} < \theta < \theta_{n,i} + 2\pi/p(n)\}.$$

Let $z_{n,i} = (1/2)(r_{n+1} + r_n)(\cos(\theta_{n,i} + \pi/p(n)), \sin(\theta_{n,i} + \pi/p(n)))$, the "center" of $S_{n,i}$. To prove (2.8), it suffices to find $0 < \delta < \delta' < \infty$ which depend only on $a$, $b$ and $\beta$ such that

$$B(z_{n,i}, \delta) \subset S_{n,i} \subset B(z_{n,i}, \delta'), \quad 1 \le i \le p(n), \ 1 \le n \le m.$$
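The inductive choice of the radii $r_n$ and the counts $p(n)$ can be sketched in code. The Python example below is an added illustration under a simplifying assumption: it takes the constant profile $u = 1/\pi$, so that $a = b = 1/\pi$, $d = (2\pi\beta a)^{-1} = 1/(2\beta)$, and the weighted mass of an annulus $r_0 \le |z| \le r_1$ is simply $(\beta + \varepsilon)(r_1^2 - r_0^2)$. The function name `annuli_partition` and the rule of picking the smallest admissible integer mass are choices of this sketch; the paper only asserts that some admissible $r_{n+1}$ exists.

```python
import math

def annuli_partition(beta, eps, R0):
    """Radii r_1 < ... < r_{m+1} and integer masses p(1), ..., p(m).

    Assumes the constant profile u = 1/pi (an assumption of this sketch), for
    which (beta + eps) * integral of u(|z/R0|) dA over r0 <= |z| <= r1 equals
    (beta + eps) * (r1^2 - r0^2).
    """
    d = 1.0 / (2.0 * beta)          # d = (2 pi beta a)^{-1} with a = 1/pi
    weight = beta + eps
    radii, counts = [16.0], []      # r_1 = 16
    while radii[-1] + 2 * d <= R0:
        r = radii[-1]
        # smallest integer mass that is reached at or beyond the radius r + d
        p = math.ceil(weight * ((r + d) ** 2 - r * r))
        r_next = math.sqrt(r * r + p / weight)   # solves weight*(r_next^2 - r^2) = p
        radii.append(r_next)
        counts.append(p)
    return radii, counts
```

With, say, $\beta = 1$, $\varepsilon = 0.1$ and $R_0 = 1000$ (so $N = 10^6$), conditions (2.4) and (2.5) hold, each step satisfies (2.13), every $p(n)$ exceeds 8, and the total `sum(counts)` is at least $N$, in line with (2.16).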
It follows from (2.15) that $(2b\beta)^{-1} \le A(S_{n,i}) \le (a\beta)^{-1}$. By (2.13), this means

$$\pi a/2b = (4b\beta d)^{-1} \le (\pi/p(n))(r_{n+1} + r_n) \le (a\beta d)^{-1} = 2\pi. \tag{2.17}$$

The distance from $z_{n,i}$ to $\{r(\cos(\theta_{n,i}), \sin(\theta_{n,i})) : r \ge 0\}$ is $(1/2)(r_{n+1} + r_n)\sin(\pi/p(n))$, which is at least $(\pi a/4b)\inf_{0 < x \le \pi/8} x^{-1}\sin x$ according to (2.17). And the distance from $z_{n,i}$ to the circles $\{z : |z| = r_n\}$ and $\{z : |z| = r_{n+1}\}$ is $(r_{n+1} - r_n)/2 \ge d/2$. Thus $S_{n,i} \supset B(z_{n,i}, \delta)$ if

$$\delta = \min\Big\{d/2, \ (\pi a/4b)\inf_{0 < x \le \pi/8} x^{-1}\sin x\Big\}.$$

To find $\delta'$, let $T_{n,i}$ be the smallest trapezoid containing $S_{n,i}$. Then it is easy to see that the distances from $z_{n,i}$ to the corners of $T_{n,i}$ are less than $d + \max\{r_{n+1}\tan(\pi/p(n)), (\pi/p(n)) \cdot (r_{n+1} + r_n)/2\}$. By (2.17) and the fact that $p(n) > 8$, the desired $\delta'$ obviously exists. This verifies (2.8).

3. Lower Bound for I(N, R)

For each integer $N \ge 2$, let $\mathcal{E}_N$ be the collection of tuples $(E_1, \dots, E_N)$ of Borel sets $E_j \subset U$ with the properties that $A(E_j) > 0$ and that $\mathrm{diam}(E_j) \le 1/N$, $1 \le j \le N$. We note that for a tuple $(E_1, \dots, E_N) \in \mathcal{E}_N$, the sets $E_1, \dots, E_N$ may intersect each other. Define

$$J_N = \inf\Big\{ \sum_{i \ne j} \frac{N^{-2}}{A(E_i)A(E_j)} \iint |z - w|^{-1} \chi_{E_i}(z)\chi_{E_j}(w)\,dA(z)\,dA(w) : (E_1, \dots, E_N) \in \mathcal{E}_N \Big\}.$$
Lemma 5. For any non-empty squares $F_1, \dots, F_N$ in $U$, we have

$$\sum_{i \ne j} \int \cdots \int |z_i - z_j|^{-1} \chi_{F_1}(z_1) \cdots \chi_{F_N}(z_N)\,dA(z_1) \cdots dA(z_N) \ge N^2 J_N A(F_1) \cdots A(F_N).$$

Proof. Both sides are additive when each $F_j$ is subdivided into the disjoint union of smaller squares. Therefore we only need to consider the case where $\mathrm{diam}(F_j) \le 1/N$, $1 \le j \le N$. In this case the lemma simply follows from the identity

$$(A(F_1) \cdots A(F_N))^{-1} \int \cdots \int |z_i - z_j|^{-1} \chi_{F_1}(z_1) \cdots \chi_{F_N}(z_N)\,dA(z_1) \cdots dA(z_N)$$
$$= (A(F_i)A(F_j))^{-1} \iint |z - w|^{-1} \chi_{F_i}(z)\chi_{F_j}(w)\,dA(z)\,dA(w)$$

and the definition of $J_N$. □
Lemma 6. Let $0 < \alpha < \infty$ and let $N \ge 2$ and $R > 0$ be such that $N/R^2 \ge \alpha$. Then

$$\sum_{1 \le i < j \le N} \langle |z_i - z_j|^{-1} f, f \rangle \ge \frac{1}{2} J_N N^{3/2}\sqrt{\alpha}$$

for any continuous function $f$ on $\Delta(N, R)$ with $\|f\|_{L^2(\Delta(N,R))} = 1$.

Proof. Let $\alpha$, $N$, $R$ and $f$ be as above. Define $g(z_1, \dots, z_N) = |f(z_1, \dots, z_N)|^2$ on $\Delta(N, R)$. Then $g \ge 0$ and $g$ is a unit vector in $L^1(\Delta(N, R))$. Let $0 < \varepsilon < 1$ be given. By the continuity of $g$ on $\Delta(N, R)$ and a routine approximation, there exist $\lambda_1 > 0, \dots, \lambda_m > 0$ and non-empty squares $Q_{n,1}, \dots, Q_{n,N}$ in $B(0, R)$, $1 \le n \le m$, such that

$$\sum_{n=1}^{m} \lambda_n \ge 1 - \varepsilon,$$

$$g(z_1, \dots, z_N) \ge \sum_{n=1}^{m} \lambda_n (A(Q_{n,1}) \cdots A(Q_{n,N}))^{-1} \chi_{Q_{n,1}}(z_1) \cdots \chi_{Q_{n,N}}(z_N).$$

Since $\varepsilon \in (0, 1)$ is arbitrary and $\sum_{i<j} = (1/2)\sum_{i \ne j}$, the lemma is reduced to proving

$$(A(Q_1) \cdots A(Q_N))^{-1} \sum_{i \ne j} \int \cdots \int |z_i - z_j|^{-1} \chi_{Q_1}(z_1) \cdots \chi_{Q_N}(z_N)\,dA(z_1) \cdots dA(z_N) \ge J_N N^{3/2}\sqrt{\alpha} \tag{3.1}$$

for all non-empty squares $Q_1, \dots, Q_N$ in $B(0, R)$. But given such $Q_1, \dots, Q_N$, we define $F_j = \{z/R : z \in Q_j\}$, $1 \le j \le N$, which are squares in $U$. By the scaling of the area measure $A$ and the homogeneity of $|z_i - z_j|^{-1}$, the LHS of (3.1) becomes

$$R^{-1}(A(F_1) \cdots A(F_N))^{-1} \sum_{i \ne j} \int \cdots \int |\zeta_i - \zeta_j|^{-1} \chi_{F_1}(\zeta_1) \cdots \chi_{F_N}(\zeta_N)\,dA(\zeta_1) \cdots dA(\zeta_N)$$

upon the substitution $(\zeta_1, \dots, \zeta_N) = (z_1/R, \dots, z_N/R)$. But Lemma 5 tells us that this is at least $R^{-1}N^2 J_N = N^{3/2} \cdot (N/R^2)^{1/2} \cdot J_N$. Since it is assumed that $N/R^2 \ge \alpha$, (3.1) follows. □

Lemma 7. We have $\liminf_{N \to \infty} J_N \ge J_{00}$.

Proof. If this were false, then there would be a subsequence $\{N_k\}_{k=1}^{\infty}$ of $\{N\}_{N=2}^{\infty}$ such that

$$\lim_{k \to \infty} J_{N_k} < J_{00}. \tag{3.2}$$

We will deduce a contradiction from this. For each $k \in \mathbb{N}$, there is an $N_k$-tuple $(E_{k,1}, \dots, E_{k,N_k}) \in \mathcal{E}_{N_k}$ such that

$$\sum_{i \ne j} \frac{N_k^{-2}}{A(E_{k,i})A(E_{k,j})} \iint |z - w|^{-1} \chi_{E_{k,i}}(z)\chi_{E_{k,j}}(w)\,dA(z)\,dA(w) \le J_{N_k} + 1/k. \tag{3.3}$$
For each k ∈ N, define the probability measure dµk (z) =
Nk 1 X 1 χE (z)dA(z) Nk A(Ek,j ) k,j j =1
on $U$. Passing to a subsequence if necessary, we may assume that $\{\mu_k\}_{k=1}^\infty$ converges in the weak-∗ topology of the dual of $C(U)$ to a probability measure $\mu_0$ on $U$. We claim that $\mu_0$ has no point masses. For, if it were true that $\mu_0(\{\zeta\}) = r > 0$ for some $\zeta\in U$, then, for every $\delta>0$, there would be a $k(\delta)\in\mathbb N$ such that $\mu_k(B(\zeta,\delta))\ge r/2$ when $k\ge k(\delta)$. We can, of course, set $k(\delta)$ so large that $\mathrm{diam}(E_{k,j})\le 1/N_k\le\delta$ when $k\ge k(\delta)$. Thus, for such a $k$, if $m(k)$ denotes the number of $E_{k,j}$'s in the tuple $(E_{k,1},\dots,E_{k,N_k})$ satisfying the condition $E_{k,j}\cap B(\zeta,\delta)\neq\emptyset$, then $m(k)/N_k\ge\mu_k(B(\zeta,\delta))\ge r/2$ by the definition of $\mu_k$. Since $N_k\to\infty$ as $k\to\infty$, we must have $m(k)\ge 2$ when $k$ is sufficiently large. Note that if $k\ge k(\delta)$ and $E_{k,j}\cap B(\zeta,\delta)\neq\emptyset$, then $E_{k,j}\subset B(\zeta,3\delta)$. Therefore, when $k\ge k(\delta)$ is such that $m(k)\ge 2$,
\[
\sum_{\substack{E_{k,i}\subset B(\zeta,3\delta)\\ E_{k,j}\subset B(\zeta,3\delta)\\ i\neq j}}\frac{N_k^{-2}}{A(E_{k,i})A(E_{k,j})}\iint |z-w|^{-1}\chi_{E_{k,i}}(z)\chi_{E_{k,j}}(w)\,dA(z)\,dA(w)
\ge m(k)(m(k)-1)N_k^{-2}(6\delta)^{-1} \ge (1/2)\cdot(r^2/4)\cdot(6\delta)^{-1}.
\]
Since $\delta>0$ can be arbitrarily small and $J_0'$ is a finite number, when $k$ is large, this is irreconcilable with (3.3) and (3.2). Hence $\mu_0$ has no point masses. That is, $\mu_0\in\mathcal P$. We next show that, for any $\epsilon>0$,
\[
\liminf_{k\to\infty}\sum_{i\neq j}\frac{N_k^{-2}}{A(E_{k,i})A(E_{k,j})}\iint |z-w|^{-1}\chi_{E_{k,i}}(z)\chi_{E_{k,j}}(w)\,dA(z)\,dA(w)
\ge \iint_{|z-w|\ge\epsilon}|z-w|^{-1}\,d\mu_0(z)\,d\mu_0(w). \tag{3.4}
\]
Indeed when $1/N_k<\epsilon/2$, we have $E_{k,j}\times E_{k,j}\subset\{(z,w):|z-w|<\epsilon/2\}$ and, consequently,
\[
\sum_{i\neq j}\frac{N_k^{-2}}{A(E_{k,i})A(E_{k,j})}\iint |z-w|^{-1}\chi_{E_{k,i}}(z)\chi_{E_{k,j}}(w)\,dA(z)\,dA(w)
\ge \iint_{|z-w|\ge\epsilon/2}|z-w|^{-1}\,d\mu_k(z)\,d\mu_k(w).
\]
Of course, $\mu_k\times\mu_k\to\mu_0\times\mu_0$ in the weak-∗ topology of the dual of $C(U\times U)$. Hence
\[
\liminf_{k\to\infty}\iint_{|z-w|\ge\epsilon/2}|z-w|^{-1}\,d\mu_k(z)\,d\mu_k(w) \ge \iint_{|z-w|\ge\epsilon}|z-w|^{-1}\,d\mu_0(z)\,d\mu_0(w).
\]
Ground State Energy of Fractional Quantum Hall Effect
This proves (3.4). Since $\epsilon>0$ is arbitrary, (3.4) implies
\[
\liminf_{k\to\infty}\sum_{i\neq j}\frac{N_k^{-2}}{A(E_{k,i})A(E_{k,j})}\iint |z-w|^{-1}\chi_{E_{k,i}}(z)\chi_{E_{k,j}}(w)\,dA(z)\,dA(w)
\ge \iint |z-w|^{-1}\,d\mu_0(z)\,d\mu_0(w) = J(\mu_0)\ge J_0',
\]
which contradicts (3.2) and (3.3). This concludes the proof of the lemma. □

Proof of Proposition 3. Let $f\in(C_c^\infty(\Delta(N,R)))_a$. Then $\sum_{j=1}^N\langle H_jf,f\rangle\ge 0$ and, therefore,
\[
[f,f]_{H_{N,R,a}} \ge -N\|Q\|_\infty\|f\|^2 + \lambda\sum_{1\le i<j\le N}\langle |z_i-z_j|^{-1}f,f\rangle.
\]
Because $(C_c^\infty(\Delta(N,R)))_a$ is a form core for $H_a(N,R)$, when $N/R^2\ge\alpha>0$, it follows from this and Lemma 6 that
\[
I(N,R) \ge \frac12 J_N N^{3/2}\lambda\sqrt{\alpha} - \|Q\|_\infty N.
\]
Now Lemma 7 completes the proof. □
4. The Extreme-Value Problem

The proof that $J_0 = J_0' = \pi/2$ requires several steps. Let us start with a result which may be of independent interest.

Proposition 8. If $\mu$ is a probability measure on $\mathbb R^2$ with a compact support $X$, then
\[
\int |z-w|^{-1}\,d\mu(z) \le \sup_{\zeta\in X}\int |z-\zeta|^{-1}\,d\mu(z) \quad\text{for every } w\in\mathbb R^2\setminus X.
\]
Although the proof of this proposition is an elementary exercise in measure theory and subharmonic functions, it does require Egoroff's theorem to extract a not-so-obvious continuity. Therefore we will present its proof in the Appendix for the sake of completeness.

Lemma 9. There exists a $\mu^*\in\mathcal P$ such that $J(\mu^*) = J_0'$.

Proof. For each $n\in\mathbb N$, let $\mu_n\in\mathcal P$ be such that $J(\mu_n)\le J_0'+1/n$. Passing to a subsequence if necessary, we may assume that $\{\mu_n\}$ converges in the weak-∗ topology of the dual of $C(U)$ to some $\mu^*$, which is necessarily a probability measure. We need to show that $\mu^*\in\mathcal P$, i.e., $\mu^*$ has no point masses, and that $J(\mu^*) = J_0'$. Suppose that $\mu^*(\{\zeta\}) = r>0$ for some $\zeta\in U$. Let $\delta>0$ be such that $(2\delta)^{-1} > (4/r^2)(J_0'+2)$. (Obviously $J_0'<\infty$.) The weak-∗ convergence $\mu_n\to\mu^*$ implies that there is an $n(\delta)>0$ such that $\mu_n(B(\zeta,\delta))\ge r/2$ if $n\ge n(\delta)$. For such an $n$, we have
\[
J(\mu_n) \ge \int_{B(\zeta,\delta)}\int_{B(\zeta,\delta)}|z-w|^{-1}\,d\mu_n(z)\,d\mu_n(w) \ge (2\delta)^{-1}(\mu_n(B(\zeta,\delta)))^2 \ge J_0'+2,
\]
which contradicts $J(\mu_n)\le J_0'+1/n$. Hence $\mu^*$ has no point masses. Since $\{\mu_n\times\mu_n\}$ converges to $\mu^*\times\mu^*$ in the weak-∗ topology of the dual of $C(U\times U)$, for any $\epsilon>0$,
\[
J_0' \ge \limsup_{n\to\infty}\iint_{|z-w|\ge\epsilon/2}|z-w|^{-1}\,d\mu_n(z)\,d\mu_n(w) \ge \iint_{|z-w|\ge\epsilon}|z-w|^{-1}\,d\mu^*(z)\,d\mu^*(w).
\]
This implies $J_0'\ge J(\mu^*)$. But $J_0'\le J(\mu^*)$ by definition. Hence $J_0' = J(\mu^*)$. □
For the rest of the section $\mu^*$ will be the measure given in Lemma 9. Since the extreme value $J_0'$ is attained at $\mu^*$, the natural urge is to do some variational calculus in $\mathcal P$. But the difficulty is that, for an arbitrary $\mu\in\mathcal P$, we do not know if there is any $x<0$ for which $\mu^*+x\mu$ is a positive measure. This difficulty is circumvented by the contra-positive argument that appears in our next proof. Define
\[
\Phi(w) = \int |z-w|^{-1}\,d\mu^*(z), \quad w\in\mathbb R^2.
\]

Lemma 10. We have $\Phi(w)\le J_0'$ for every $w\in\mathbb R^2$.

Proof. Denote the support of $\mu^*$ by $K$. That is, $K$ is the smallest compact set such that $\mu^*(K) = 1$. Proposition 8 tells us that it suffices to prove that
\[
\Phi(w)\le J_0' \quad\text{whenever } w\in K. \tag{4.1}
\]
Let $B = \{w : \Phi(w) > J_0'\}$. We claim that $\mu^*(B) = 0$. Assuming the contrary, we could define the probability measure $\mu_0$ by the formula $\mu_0(E) = \mu^*(E\cap B)/\mu^*(B)$. We will show that this leads to a contradiction. Let
\[
b = \iint |z-w|^{-1}\,d\mu^*(z)\,d\mu_0(w).
\]
Clearly, $b$ and $J(\mu_0)$ are both finite. Since $\Phi(w)>J_0'$ whenever $w\in B$, we have
\[
b = \int\Phi(w)\,d\mu_0(w) = \frac{1}{\mu^*(B)}\int_B\Phi(w)\,d\mu^*(w) > J_0' = J(\mu^*). \tag{4.2}
\]
Define $\mu_x = (\mu^*+x\mu_0)/(1+x)$ for $x>-1$. The key observation here is that, if $x>-\mu^*(B)$, then $\mu^*+x\mu_0$ is a positive measure because $1+(x/\mu^*(B))>0$. Thus $\mu_x\in\mathcal P$ whenever $x>-\mu^*(B)$. Consider the function $f(x) = J(\mu_x)$, $x>-\mu^*(B)$. It follows from Lemma 9 that $f(0) = J(\mu^*) = J_0'$. Thus $f(x)\ge f(0)$ whenever $x>-\mu^*(B)$. Because $-\mu^*(B)<0$ by our assumption, this implies that $f'(0) = 0$. Now $f(x) = (J(\mu_0)x^2+2bx+J(\mu^*))/(1+x)^2$. Thus from the equation $f'(0) = 0$ we deduce $J(\mu^*) = b$, which contradicts (4.2). This proves $\mu^*(B) = 0$.

Since $K$ is the smallest compact set such that $\mu^*(K) = 1$ and since $\mu^*(B) = 0$, any open set which intersects $K$ must intersect $G = \{w : \int |z-w|^{-1}\,d\mu^*(z)\le J_0'\}$. That is, $K$ is contained in the closure of $G$. An application of Fatou's lemma now completes the proof of (4.1). □
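The variational step in the proof of Lemma 10 rests on the derivative of the quotient $f(x)$ at $x=0$, which equals $2(b - J(\mu^*))$. The following minimal Python sketch checks this numerically. The three numbers fed to it are made-up illustrative values (chosen with $b > J(\mu^*)$ as in (4.2)), not quantities from the paper, and the names `f` and `derivative_at_zero` are hypothetical helpers.

```python
def f(x, J_star, b, J_other):
    """f(x) = (J(mu_0) x^2 + 2 b x + J(mu*)) / (1 + x)^2, as in the proof of Lemma 10."""
    return (J_other * x * x + 2.0 * b * x + J_star) / (1.0 + x) ** 2


def derivative_at_zero(J_star, b, J_other, h=1e-6):
    """Central finite difference approximating f'(0)."""
    return (f(h, J_star, b, J_other) - f(-h, J_star, b, J_other)) / (2.0 * h)


# Illustrative (made-up) values with b > J(mu*), mimicking inequality (4.2).
J_star, b, J_other = 1.5, 2.0, 3.0
slope = derivative_at_zero(J_star, b, J_other)

# The finite difference should agree with the exact value 2 * (b - J(mu*)).
print(slope, 2.0 * (b - J_star))
```

With $b > J(\mu^*)$ the slope is positive, so $f$ would dip below $f(0)$ for small negative $x$; this is exactly the contradiction with minimality that the proof exploits.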
We now turn to the measure $\omega$. Recall that the definition of $\omega$ is $d\omega(z) = (2\pi)^{-1}(1-|z|^2)^{-1/2}\,dA(z)$, $|z|<1$.

Lemma 11. We have $\int |z-w|^{-1}\,d\omega(z) = \pi/2$ whenever $|w|\le 1$.

Proof. Identify $\mathbb R^2$ with $\mathbb C$ as usual. Since $\omega$ is rotation invariant, it suffices to compute
\[
\int |z-r|^{-1}\,d\omega(z) = \frac{1}{2\pi}\int_{|\zeta+r|\le 1}(1-|\zeta+r|^2)^{-1/2}|\zeta|^{-1}\,dA(\zeta)
\]
for $0\le r\le 1$. We will take full advantage of the symmetries hidden in this integral. Letting $\zeta = \rho e^{i\theta}$ in the above, we have
\[
|\rho e^{i\theta}+r|^2 = |\rho+re^{-i\theta}|^2 = |\rho+re^{i\theta}|^2 = (\rho+r\cos\theta)^2+r^2\sin^2\theta. \tag{4.3}
\]
When $0\le r<1$, the circle $\{\rho e^{i\theta} : |\rho e^{i\theta}+r| = 1\}$ is given by the equation
\[
\rho = -r\cos\theta+\sqrt{1-r^2\sin^2\theta}, \quad 0\le\theta\le 2\pi.
\]
When $r = 1$, this is also the equation for the circle $\{\rho e^{i\theta} : |\rho e^{i\theta}+1| = 1\}$ in the sense that $-\cos\theta+\sqrt{1-\sin^2\theta} = -2\cos\theta$ if $\theta\in[\pi/2,3\pi/2]$ and $-\cos\theta+\sqrt{1-\sin^2\theta} = 0$ if $\theta\in[0,\pi/2]\cup[3\pi/2,2\pi]$. Therefore, for $0\le r\le 1$, we have
\[
\int |z-r|^{-1}\,d\omega(z) = \frac{1}{2\pi}\int_0^{2\pi}d\theta\int_0^{-r\cos\theta+\sqrt{1-r^2\sin^2\theta}}\frac{\rho^{-1}\rho\,d\rho}{\sqrt{1-(\rho+r\cos\theta)^2-r^2\sin^2\theta}}
= \frac{1}{2\pi}\int_0^{2\pi}f_r(\theta)\,d\theta = \frac{1}{2\pi}\int_0^{\pi}(f_r(\theta)+f_r(\theta+\pi))\,d\theta, \tag{4.4}
\]
where, with the convention that $\int_0^0(\text{whatever})\,d\rho$ means $0$, we write
\[
f_r(\theta) = \int_0^{-r\cos\theta+\sqrt{1-r^2\sin^2\theta}}\frac{d\rho}{\sqrt{1-(\rho+r\cos\theta)^2-r^2\sin^2\theta}}
\]
for $0\le r\le 1$ and $0\le\theta\le 2\pi$. We have the following elementary identities:
\[
\int_0^{-a+b}g((\rho+a)^2)\,d\rho+\int_0^{a+b}g((\rho-a)^2)\,d\rho = \int_a^b g(x^2)\,dx+\int_{-a}^b g(x^2)\,dx = 2\int_0^b g(x^2)\,dx
\]
if $b\ge 0$ and $a\in[-b,b]$, and
\[
\int_0^{\sqrt h}\frac{dx}{\sqrt{h-x^2}} = \frac{\pi}{2}
\]
if $h>0$. Since $\cos(\theta+\pi) = -\cos\theta$ and $\sin^2(\theta+\pi) = \sin^2\theta$, it follows from these identities that
\[
f_r(\theta)+f_r(\theta+\pi) = 2\int_0^{\sqrt{1-r^2\sin^2\theta}}\frac{dx}{\sqrt{1-r^2\sin^2\theta-x^2}} = 2\cdot\frac{\pi}{2} = \pi
\]
whenever $1-r^2\sin^2\theta>0$. A substitution of this into (4.4) yields our result. □
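As a sanity check on Lemma 11, the following Python sketch evaluates the right-hand side of (4.4) numerically. It assumes the closed form $f_r(\theta) = \pi/2 - \arcsin(a/\sqrt h)$ with $a = r\cos\theta$ and $h = 1 - r^2\sin^2\theta$, which follows from the substitution $x = \rho + a$; the function names and the midpoint rule are illustrative choices, not part of the paper.

```python
import math


def f_r(r, theta):
    """Closed form of the inner integral f_r(theta) from (4.4).

    With a = r cos(theta) and h = 1 - r^2 sin^2(theta), the substitution
    x = rho + a turns f_r into the integral of dx / sqrt(h - x^2) over
    [a, sqrt(h)], which equals pi/2 - asin(a / sqrt(h)).
    """
    a = r * math.cos(theta)
    h = 1.0 - (r * math.sin(theta)) ** 2
    if h <= 0.0:
        return 0.0  # degenerate direction (can only occur when r = 1)
    ratio = max(-1.0, min(1.0, a / math.sqrt(h)))  # guard against rounding
    return math.pi / 2.0 - math.asin(ratio)


def potential(r, n=2000):
    """Midpoint rule for (1 / (2 pi)) times the integral of f_r over [0, 2 pi]."""
    step = 2.0 * math.pi / n
    total = sum(f_r(r, (i + 0.5) * step) for i in range(n))
    return total * step / (2.0 * math.pi)


for r in (0.0, 0.3, 0.9):
    # Each value should match pi/2 up to rounding, independently of r.
    print(r, potential(r), math.pi / 2.0)
```

The independence of $r$ comes from the pairing $f_r(\theta) + f_r(\theta+\pi) = \pi$; the midpoint grid used here is symmetric under $\theta\mapsto\theta+\pi$, so the same cancellation happens in the discrete sum.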
Proof of Proposition 4. For each $\delta\in[0,1)$, let $u_\delta(x) = (2\pi)^{-1}(1-\delta x^2)^{-1/2}$, $0\le x\le 1$, and let $M_\delta = \int_U u_\delta(|z|)\,dA(z)$. Clearly, $u_\delta/M_\delta\in\mathcal U$ and
\[
J_0 \le J(u_\delta/M_\delta) = M_\delta^{-2}\int_U\int_U |z-w|^{-1}u_\delta(|z|)u_\delta(|w|)\,dA(z)\,dA(w) \le M_\delta^{-2}J(\omega).
\]
Now $M_\delta\uparrow\omega(U) = 1$ as $\delta\uparrow 1$. Hence $J_0\le J(\omega)$. By Lemma 10, we have
\[
\iint |z-w|^{-1}\,d\mu^*(z)\,d\omega(w) = \int\Phi(w)\,d\omega(w) \le \int J_0'\,d\omega(w) = J_0'.
\]
On the other hand, by Fubini's theorem and Lemma 11,
\[
\iint |z-w|^{-1}\,d\mu^*(z)\,d\omega(w) = \int\Big\{\int_U |z-w|^{-1}\,d\omega(w)\Big\}\,d\mu^*(z) = \int\frac{\pi}{2}\,d\mu^*(z) = \frac{\pi}{2}.
\]
Lemma 11 also implies that $J(\omega) = \pi/2$. Hence $J_0'\ge\pi/2 = J(\omega)\ge J_0$. But $J_0\ge J_0'$ by the definitions of these quantities. This completes the proof. □
Appendix

The main idea behind the proof of Proposition 8 is to use the maximum principle for subharmonic functions: If $Y$ is a non-empty compact set in $\mathbb R^2$ and $F\ge 0$ is a function which is subharmonic on $\mathbb R^2\setminus Y$ and continuous on the entire plane $\mathbb R^2$, then it attains its maximum value on $Y$ if it has one on $\mathbb R^2$. But the problem we face in the proof is the following: We do not know if the function $w\mapsto\int |z-w|^{-1}\,d\mu(z)$ is continuous on $\mathbb R^2$.

Proof of Proposition 8. We may assume that $\int |z-\zeta|^{-1}\,d\mu(z)<\infty$ for every $\zeta\in X$, for otherwise there is nothing to prove. But this assumption immediately leads to
\[
\lim_{k\to\infty}\int_{|z-\zeta|\le 2^{-k}}|z-\zeta|^{-1}\,d\mu(z) = 0 \quad\text{for every } \zeta\in X. \tag{A.1}
\]
Let $w_0\in\mathbb R^2\setminus X$ and $\epsilon>0$ be given. To prove the proposition, it suffices to show that
\[
\int |z-w_0|^{-1}\,d\mu(z) \le \sup_{\zeta\in X}\int |z-\zeta|^{-1}\,d\mu(z)+\epsilon. \tag{A.2}
\]
For this we use Egoroff's theorem: By (A.1) and the regularity of $\mu$, there is a non-empty compact subset $Y\subset X$ with $\mu(X\setminus Y)\le\epsilon\,d(w_0,X)$ such that the uniform convergence
\[
\lim_{k\to\infty}\sup_{\zeta\in Y}\int_{|z-\zeta|\le 2^{-k}}|z-\zeta|^{-1}\,d\mu(z) = 0 \tag{A.3}
\]
holds. Define the measure $\tilde\mu$ by the formula $\tilde\mu(E) = \mu(E\cap Y)$. Then
\[
\int |z-w_0|^{-1}\,d\mu(z) \le \int_Y |z-w_0|^{-1}\,d\mu(z)+\frac{\mu(X\setminus Y)}{d(w_0,X)} \le \int |z-w_0|^{-1}\,d\tilde\mu(z)+\epsilon.
\]
Since $\sup_{\zeta\in Y}\int |z-\zeta|^{-1}\,d\tilde\mu(z)\le\sup_{\zeta\in X}\int |z-\zeta|^{-1}\,d\mu(z)$, (A.2) will follow once we establish
\[
\int |z-w|^{-1}\,d\tilde\mu(z) \le \sup_{\zeta\in Y}\int |z-\zeta|^{-1}\,d\tilde\mu(z) \quad\text{for every } w\in\mathbb R^2\setminus Y. \tag{A.4}
\]
To prove (A.4), we first show that the function
\[
F(w) = \int |z-w|^{-1}\,d\tilde\mu(z), \quad w\in\mathbb R^2,
\]
is continuous on the entire plane $\mathbb R^2$. For each $k\in\mathbb N$, write
\[
B_k(w) = \int_{B(w,2^{-k})}|z-w|^{-1}\,d\tilde\mu(z), \quad w\in\mathbb R^2.
\]
For each $k\in\mathbb N$, let $\eta_k$ be the continuous function on $[0,\infty)$ such that $\eta_k(x) = 2^k$ for $0\le x\le 2^{-k}$ and $\eta_k(x) = x^{-1}$ for $x>2^{-k}$. Let $\xi_k(x) = x^{-1}-\eta_k(x)$, $0<x<\infty$. Then
\[
F(w) = \int\eta_k(|z-w|)\,d\tilde\mu(z)+\int\xi_k(|z-w|)\,d\tilde\mu(z) = g_k(w)+b_k(w).
\]
Now $g_k$ is clearly continuous. Since $0\le\xi_k(x)\le x^{-1}\chi_{(0,2^{-k})}(x)$, we have $0\le b_k\le B_k$. Thus, by (A.3), the continuity of $F$ will follow if we can show that
\[
B_k(w) \le 8\sup_{\zeta\in Y}\int_{|z-\zeta|\le 2^{-k+1}}|z-\zeta|^{-1}\,d\tilde\mu(z) \quad\text{for all } k\in\mathbb N \text{ and } w\in\mathbb R^2. \tag{A.5}
\]
To prove (A.5), fix a $k\in\mathbb N$. Of course, we only need to consider $w\in\mathbb R^2\setminus Y$. Given such a $w$, let $m\in\mathbb Z$ be the largest integer such that $B(w,2^{-m})\cap Y\neq\emptyset$. Because $\tilde\mu(B(w,2^{-m-1})) = 0$, the case where $m+1\le k$ is trivial. Suppose that $m\ge k$. Then
\[
B_k(w) = \sum_{j=k}^m\int_{B(w,2^{-j})\setminus B(w,2^{-j-1})}|z-w|^{-1}\,d\tilde\mu(z) \le \sum_{j=k}^m 2^{j+1}\tilde\mu(B(w,2^{-j})\setminus B(w,2^{-j-1})).
\]
Since $B(w,2^{-m})\cap Y\neq\emptyset$, there is a $\zeta\in Y$ such that $|w-\zeta|<2^{-m}$. Thus, for $j\le m$, we have $B(w,2^{-j})\subset B(\zeta,2^{-m}+2^{-j})\subset B(\zeta,2^{-(j-1)})$. Therefore
\[
B_k(w) \le \sum_{j=k}^m 2^{j+1}\tilde\mu(B(\zeta,2^{-(j-1)})) = 4\sum_{j=k-1}^{m-1}2^j\tilde\mu(B(\zeta,2^{-j})).
\]
But $B(\zeta,2^{-j}) = B(\zeta,2^{-(m-1)})\cup\big(\bigcup_{i=j}^{m-2}\{B(\zeta,2^{-i})\setminus B(\zeta,2^{-(i+1)})\}\big)$ if $j<m-1$. Hence
\begin{align*}
\frac14 B_k(w) &\le \tilde\mu(B(\zeta,2^{-(m-1)}))\sum_{j=k-1}^{m-1}2^j+\sum_{j=k-1}^{m-2}2^j\sum_{i=j}^{m-2}\tilde\mu(B(\zeta,2^{-i})\setminus B(\zeta,2^{-(i+1)})) \\
&= \tilde\mu(B(\zeta,2^{-(m-1)}))\sum_{j=k-1}^{m-1}2^j+\sum_{i=k-1}^{m-2}\Big(\sum_{j=k-1}^{i}2^j\Big)\tilde\mu(B(\zeta,2^{-i})\setminus B(\zeta,2^{-(i+1)})) \\
&\le 2\Big\{2^{m-1}\tilde\mu(B(\zeta,2^{-(m-1)}))+\sum_{i=k-1}^{m-2}2^i\tilde\mu(B(\zeta,2^{-i})\setminus B(\zeta,2^{-(i+1)}))\Big\} \\
&\le 2\int_{|z-\zeta|\le 2^{-k+1}}|z-\zeta|^{-1}\,d\tilde\mu(z)
\end{align*}
(omit $\sum_{i=k-1}^{m-2}\dots$ in the above in the case $m = k$). This proves (A.5) and completes the proof of the fact that $F$ is continuous on the entire plane $\mathbb R^2$.

Since $\Delta(x^2+y^2)^{-1/2} = (x^2+y^2)^{-3/2}>0$, it is elementary that $\Delta F(w)\ge 0$ whenever $w\notin Y$. Thus $F$ is a subharmonic function on $\mathbb R^2\setminus Y$. That is,
\[
F(w) \le \frac{1}{2\pi}\int_0^{2\pi}F(w+(r\cos\theta,r\sin\theta))\,d\theta
\]
if $d(w,Y)>r$. See [3, p. 48]. With these properties of $F$ in hand, we are now ready to prove (A.4). Since $F$ is continuous on $\mathbb R^2$, $F\ge 0$, and $\lim_{|w|\to\infty}F(w) = 0$, it has a maximum value $M$ on $\mathbb R^2$. By the maximum principle for subharmonic functions, $F$ attains $M$ on $Y$. That is, there is a $\zeta^*\in Y$ such that $F(\zeta^*) = M\ge F(w)$ for every $w\in\mathbb R^2$. This proves (A.4) and completes the proof of Proposition 8. □
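The subharmonicity of $F$ off $Y$ uses the planar identity $\Delta(x^2+y^2)^{-1/2} = (x^2+y^2)^{-3/2}$. A quick numerical check with a five-point stencil is sketched below; the sample point and step size are arbitrary illustrative choices, and the helper names are hypothetical.

```python
def u(x, y):
    """The kernel |z|^{-1} written in Cartesian coordinates."""
    return (x * x + y * y) ** -0.5


def laplacian(func, x, y, h=1e-3):
    """Five-point finite-difference approximation of the 2D Laplacian."""
    return (func(x + h, y) + func(x - h, y) + func(x, y + h) + func(x, y - h)
            - 4.0 * func(x, y)) / (h * h)


x0, y0 = 0.7, -1.1  # arbitrary sample point away from the origin
numeric = laplacian(u, x0, y0)
exact = (x0 * x0 + y0 * y0) ** -1.5

# The two values should agree up to the O(h^2) discretisation error.
print(numeric, exact)
```

The positivity of the right-hand side is what makes the kernel, and hence $F$, subharmonic in two dimensions, in contrast with the three-dimensional case where $|z|^{-1}$ is harmonic.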
References
1. Avron, J., Herbst, I., Simon, B.: Schrödinger operators with magnetic field. I. General interactions. Duke J. Math. 45, 847–884 (1978)
2. Bellissard, J., van Elst, A., Schulz-Baldes, H.: The non-commutative geometry of the quantum Hall effect. J. Math. Phys. 35, 5373–5451 (1994)
3. Garnett, J.: Bounded analytic functions. New York: Academic Press, 1981
4. Kato, T.: Perturbation theory for linear operators. New York: Springer-Verlag, 1976
5. Laughlin, R.: Anomalous quantum Hall effect: An incompressible quantum fluid with fractionally charged excitations. Phys. Rev. Lett. 50, 1395–1398 (1983)
6. Prange, R., Girvin, S. (eds.): The quantum Hall effect. Graduate text in Physics. New York: Springer-Verlag, 1990
7. Reed, M., Simon, B.: Methods of modern mathematical physics. II. Fourier analysis, self-adjointness. New York: Academic Press, 1975
8. Reed, M., Simon, B.: Methods of modern mathematical physics. IV. Analysis of operators. New York: Academic Press, 1978
9. Xia, J.: An estimate of the ground state energy of the fractional quantum Hall effect. To appear in J. Math. Phys. 40, 150–155 (1999)

Communicated by B. Simon
Commun. Math. Phys. 204, 207 – 247 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Growth of Sobolev Norms in Linear Schrödinger Equations with Quasi-Periodic Potential

J. Bourgain

School of Mathematics, Institute for Advanced Study, Princeton, NJ 08540, USA. E-mail: [email protected]

Received: 16 October 1997 / Accepted: 28 January 1999
Abstract: In this paper, we consider the following problem. Let $iu_t+\Delta u+V(x,t)u = 0$ be a linear Schrödinger equation (periodic boundary conditions) where $V$ is a real, bounded, real analytic potential which is periodic in $x$ and quasi-periodic in $t$ with diophantine frequency vector $\lambda$. Denote by $S(t)$ the corresponding flow map. Thus $S(t)$ preserves the $L^2$-norm and our aim is to study its behaviour on $H^s(\mathbb T^D)$, $s>0$. Our main result is that the growth in time is at most logarithmic; thus if $\phi\in H^s$, then
\[
\|S(t)\phi\|_{H^s} < C(\log|t|)^{Cs} \quad\text{for } |t|\to\infty. \tag{$*$}
\]
More precisely, (∗) is proven in 1D and 2D when $V$ is small. We also exhibit examples showing that a growth of higher Sobolev norms may occur in this context and (∗) is thus essentially best possible.

0. Introduction

In this paper we consider linear Schrödinger equations with periodic boundary conditions of the form
\[
iu_t-\Delta u+V(x,t)u = 0, \tag{0.1}
\]
where $V$ is a real analytic potential, periodic in $x\in\mathbb T^D$ ($D$ = space dimension) and quasi-periodic in $t$, with frequency vector $\lambda = (\lambda_1,\dots,\lambda_b)\in\mathbb R^b$ satisfying a diophantine condition
\[
\|k.\lambda\| > |k|^{-C} \quad\text{for } k\in\mathbb Z^b\setminus\{0\}. \tag{0.2}
\]
The flow map $S(t)$,
\[
S(t)u(0) = u(t), \tag{0.3}
\]
corresponding to (0.1) clearly preserves the $L^2$-norm,
\[
\|S(t)\phi\|_2 = \|\phi\|_2, \tag{0.4}
\]
and if $\phi\in H^s(\mathbb T^D)$, $s>0$, then so will $S(t)\phi$ be for all time. The problem considered here is the growth (if any) of $\|S(t)\phi\|_{H^s}$ when $|t|\to\infty$. Problems of this nature have been looked at in various contexts, including for nonlinear Hamiltonian PDE's, see [Bo1, St]. For an equation such as (0.1), assuming $V$ smooth in $x,t$, one may easily obtain estimates polynomial in $|t|$, i.e.
\[
\|S(t)\phi\|_{H^s} \le C(1+|t|)^s\|\phi\|_{H^s}. \tag{0.5}
\]
We will show here that for the quasi-periodic potential $V$ considered above, (0.5) has the following improvement:
\[
\|S(t)\phi\|_{H^s} < C[\log(2+|t|)]^{Cs} \tag{0.6}
\]
in the following cases:
$D = 1$ (see Sect. 1),
$D = 2$ and $V$ small (see Sect. 2)
(the argument may probably be extended to any dimension).
Moreover, simple constructions permit us to show that a logarithmic growth of the $H^s$-norm, $s>0$, may appear. In Sect. 5 of the paper, we give an example of an equation (0.1) with $V$ smooth and periodic in $x$, almost periodic in $t$, such that $S(t)(1)$ is not an almost periodic function in time (almost periodicity referring to the $L^2$-norm), as it is the case for time periodic $V$. In the special case of a time-periodic potential, estimate (0.6) was observed by T. Spencer (for $D$ arbitrary and no assumption on time frequency $\lambda$). Discussions with him on this subject lie at the origin of this work ([S]). Fixing a large time $T$, the method consists in describing approximately the flow $S(t)$ for $|t|<T$ in terms of “approximative” Bloch waves
\[
e^{iEt}\psi(x,t), \qquad \psi(x,t) = \sum_{(n,k)\in\mathbb Z^{D+b}}\hat\psi(n,k)\,e^{i(n.x+k.\lambda t)}, \tag{0.7}
\]
where $\hat\psi$ is “reasonably-well” localized on the lattice $\mathbb Z^{D+b}$. The analysis involved in this problem is closely related to the methods used for instance around the localization of eigenstates for the lattice Schrödinger operator with quasi-periodic potential, in particular (cf. [F-S-W])
\[
\varepsilon\Delta+\cos(n\lambda+\sigma) \tag{0.8}
\]
(i.e. Mathieu potential) and certain recent developments in the KAM theory for Hamiltonian PDE's (cf. [C-W, Bo2, 3, 4]). Similar results may be obtained for the following wave equations version of (0.1):
\[
iu_t+Bu+B^{-1}[V(x,t)u] = 0, \qquad B = \sqrt{-\Delta+\rho}, \tag{0.9}
\]
for which there is preservation of the norm (equivalent with $H^{1/2}$)
\[
\langle B\phi,\phi\rangle \tag{0.10}
\]
(defining the symplectic Hilbert-space for (0.9), in the sense of [Ku]).
More precisely, bounds of the form (0.6) may be shown to hold assuming
$D = 1$, $V$ quasi-periodic in time with $\lambda$ satisfying (0.2),
$D\ge 1$, $V$ periodic in time with “typical” frequency $\lambda\in\mathbb R$
(the exact meaning of this is stated in the last section of the paper).

Remark. In the case of the usual wave equation
\[
y_{tt}-\Delta y+\rho y+V(x,t)y = 0 \qquad (\rho>0), \tag{0.11}
\]
rewriting (0.11) as
\[
\begin{cases} y_t = -Bv \\ v_t = By+B^{-1}Vy \end{cases} \qquad B = \sqrt{-\Delta+\rho}, \tag{0.12}
\]
and putting
\[
u = y+iv \tag{0.13}
\]
yields the equation
\[
iu_t+Bu+B^{-1}\big[V(x,t)\,\tfrac12(u+\bar u)\big] = 0, \tag{0.14}
\]
since $y = \tfrac12(u+\bar u)$.
The following example shows however that even in the simplest case of $V = V(x,t)$ periodic in $x$ and $t$, one may not expect results along the preceding line for (0.11), (0.14) when $V$ has non-trivial time dependence. Take $D = 1$ and
\[
V(x,t) = v_1(x)-v_2(t), \tag{0.15}
\]
where $v_1,v_2$ are periodic potentials. Let
\begin{align}
\Big(-\frac{d^2}{dx^2}+v_1(x)\Big)y_1 &= \lambda_1y_1, \tag{0.16}\\
\Big(-\frac{d^2}{dt^2}+v_2(t)\Big)y_2 &= \lambda_2y_2, \tag{0.17}
\end{align}
where we choose $\lambda_1$ in the periodic spectrum of $-\frac{d^2}{dx^2}+v_1(x)$ and $y_1$ a periodic eigenfunction, $\lambda_2$ inside a gap of instability for $-\frac{d^2}{dt^2}+v_2(t)$ and thus $y_2$ of the form
\[
y_2(t) = e^{\sigma t}z(t), \tag{0.18}
\]
where $z = z(t)$ is periodic in $t$ and $\sigma\in\mathbb R\setminus\{0\}$. Taking
\[
\rho = \lambda_2-\lambda_1 \tag{0.19}
\]
(which one may choose arbitrarily large) the function
\[
y(x,t) = y_1(x)y_2(t) = e^{\sigma t}z(t)y_1(x) \tag{0.20}
\]
is clearly a solution of (0.11).
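The claim that the product (0.20) solves (0.11) can be checked directly. The short computation below is a sketch that uses only (0.15)–(0.19), with $\Delta = \partial_x^2$ in one space dimension; the intermediate steps are filled in here and are not displayed in the original.

```latex
\begin{align*}
y_{tt} - y_{xx} + \rho y + V y
  &= y_1 y_2'' - y_1'' y_2 + \rho\, y_1 y_2 + (v_1 - v_2)\, y_1 y_2 \\
  &= (v_2 - \lambda_2)\, y_1 y_2 + (\lambda_1 - v_1)\, y_1 y_2
     + \rho\, y_1 y_2 + (v_1 - v_2)\, y_1 y_2 \\
  &= (\lambda_1 - \lambda_2 + \rho)\, y_1 y_2 = 0 .
\end{align*}
```

The second line uses $y_2'' = (v_2-\lambda_2)y_2$ from (0.17) and $-y_1'' = (\lambda_1-v_1)y_1$ from (0.16); the last equality is the choice (0.19). Since $\sigma\neq 0$, this solution grows exponentially in one time direction.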
I. 1D Linear Schrödinger Equation with Quasi-Periodic Potential

Consider the case of the 1D linear Schrödinger equation with a potential $V$
\[
\begin{cases} i\dot u-u_{xx}+V(x,t)u = 0 \\ u(0) = \phi(x), \end{cases} \tag{1.0}
\]
where $V$ is real, 1D-periodic in $x$ and quasi-periodic in $t$ with diophantine frequency $\lambda = (\lambda_1,\dots,\lambda_b)$. We will prove the following.

Theorem 1. Let $s>0$ and $\phi\in H^s(\mathbb T)$. Then $u(t)\in H^s$ for all time $t$ and $\|u(t)\|_{H^s}<(\log t)^{Cs}$ for $t\to\infty$.

For expository reasons, we will subdivide the argument in several items presented in the next subsections. Although most considerations depend essentially on the linear structure of the equation, some bounds such as (3.31) hold in a much greater generality of nonlinear Hamiltonian PDE's (cf. [B1, St]).

1. Assume $V$ a trigonometric polynomial for simplicity, thus
\begin{align}
V(x,t) &= \sideset{}{'}\sum_{n\in\mathbb Z}\hat V(n)(t)\,e^{in.x} \tag{1.1}\\
&= \sideset{}{'}\sum_{n\in\mathbb Z,\,k\in\mathbb Z^b}\hat V(n,k)\,e^{i(n.x+k.\lambda t)}, \tag{1.2}
\end{align}
where
\[
\hat V(-n,-k) = \overline{\hat V(n,k)}. \tag{1.3}
\]
($\sum'$ denotes summation over finite sets.) One may eliminate the $\hat V(0)(t)$-term, replacing $u$ by
\[
u\cdot\exp i\Big(\hat V(0,0)t+\sideset{}{'}\sum_{k\in\mathbb Z^b\setminus\{0\}}\frac{\hat V(0,k)}{k.\lambda}(e^{ik.\lambda t}-1)\Big). \tag{1.4}
\]
Thus in (1.1), (1.2), we may assume $n\neq 0$.

2. In this subsection, we will construct the approximative Bloch waves (0.7). Their main property is to be well-localized in frequency space. Writing a general solution as a superposition of those, this will eventually permit us to control also higher Sobolev norms besides $L^2$. Letting
\[
u = e^{iEt}\psi \quad\text{with}\quad \psi(x,t) = \sum_{n\in\mathbb Z,\,k\in\mathbb Z^b}\hat\psi(n,k)\,e^{i(n.x+k.\lambda t)}, \tag{2.1}
\]
(1.0) has a representation in the $\mathbb Z\times\mathbb Z^b$-lattice as
\[
T\hat\psi \equiv (n^2-k.\lambda)\hat\psi(n,k)+(S_V\hat\psi)(n,k) = E\hat\psi, \tag{2.2}
\]
where $S_V$ is a Toeplitz multiplication operator with symbol $V$. Thus $T = D+S_V$, where $D$ is diagonal with
\[
D(n,k) = n^2-k.\lambda. \tag{2.3}
\]
Denote $T_\Omega$ the restriction of $T$ to the region
\[
\Omega = \{(n,k)\in\mathbb Z\times\mathbb Z^b : |n|\le N,\ |k| = \max_{1\le s\le b}|k_s|\le K\},
\]
where $N,K$ will be specified later. Our next purpose is to obtain certain localization properties of the eigenvectors of $T_\Omega$. Choose
\[
K^{C_1} < N_0 \ll N. \tag{2.4}
\]
Let
\[
T_\Omega\xi = E\xi, \quad |\xi| = 1. \tag{2.5}
\]
Claim. Up to an error of order $e^{-K/100}$, there is either localization of $\xi$ to the region $[(n,k)\in\Omega : |n|<2N_0]$ or there is localization to a union of boxes $\big[\big||n|-|n_0|\big|<\frac{K}{10}\big]\times\big[|k-k_0|<\frac{K}{10}\big]$ for some $(n_0,k_0)\in\Omega$.
We distinguish the following cases.

Case 1. $E<\frac12N_0^2$. Partition $\Omega = \Omega_1+\Omega_2$, where $\Omega_1 = \{(n,k)\in\Omega : |n|\le N_0\}$. Clearly, for $(n,k)\in\Omega_2$,
\[
|D(n,k)-E| > n^2-CK-|E| > \frac13N_0^2 \tag{2.6}
\]
so that
\[
(T_{\Omega_2}-E)^{-1} = (D_{\Omega_2}-E)^{-1}\sum_{j=0}^\infty(-1)^j[S_{\Omega_2}(D_{\Omega_2}-E)^{-1}]^j \tag{2.7}
\]
satisfies
\[
\|(T_{\Omega_2}-E)^{-1}\| < CN_0^{-2} \tag{2.8}
\]
and fast off-diagonal decay, in particular
\[
|(T_{\Omega_2}-E)^{-1}(z,z')| < CN_0^{-2}e^{-|z-z'|}. \tag{2.9}
\]
Writing $\xi = \xi_1+\xi_2$, it follows from (2.5) that
\[
\xi_2 = -(T_{\Omega_2}-E)^{-1}P_{\Omega_2}T\xi_1, \tag{2.10}
\]
where $P_{\Omega_2}$ denotes the projection operator on $\ell^2(\Omega_2)$. Thus, for $|n|>N_0$,
\[
\xi(n,k) = -\sum_{z_1\in\Omega_1,\,z_2\in\Omega_2}(T_{\Omega_2}-E)^{-1}\big((n,k),z_2\big)\,T(z_2,z_1)\,\xi(z_1), \tag{2.11}
\]
and by (2.9)
\[
|\xi(n,k)| < N_0^{-2}\sum_{z_1\in\Omega_1,\,z_2\in\Omega_2}e^{-|(n,k)-z_2|-|z_1-z_2|}|\xi(z_1)| < N_0^{-2}e^{-\frac12(|n|-N_0)}. \tag{2.12}
\]
In particular
\[
\big\|\xi|_{|n|>2N_0}\big\| < K^{1/2}N_0^{-2}e^{-\frac12N_0} < e^{-\frac12N_0}. \tag{2.13}
\]
Case 2. $E>\frac12N_0^2$. Since $N_0>K^{C_1}$, there is clearly, up to sign, a unique $n_0$,
\[
|n_0| > \frac{N_0}{2}, \tag{2.14}
\]
such that
\[
\min_{|k|\le K}|n_0^2-k.\lambda-E| < \frac{N_0}{4}. \tag{2.15}
\]
Hence, for $(n,k)\in\Omega$, $|n|\neq|n_0|$,
\[
|D(n,k)-E| > \frac{N_0}{4}. \tag{2.16}
\]
Also, by the diophantine property of $\lambda$,
\[
|k.\lambda| > |k|^{-C_0} \quad\text{for } k\neq 0 \text{ in } \mathbb Z^b, \tag{2.17}
\]
where we assume $10C_0<C_1$. Therefore, there is at most one $k_0$-value (and hence a single value, since $E$ is an eigenvalue) such that for $k\neq k_0$, $|k|<K$,
\[
|D(\pm n_0,k)-E| > K^{-C_0}. \tag{2.18}
\]
Consider the index set
\[
\Omega_0 = \Omega\setminus\{(n_0,k_0),(-n_0,k_0)\} \tag{2.19}
\]
partitioned as
\[
\Omega_0 = \Omega_1+\Omega_2, \tag{2.20}
\]
where
\begin{align}
\Omega_1 &= \{(n,k)\in\Omega : n\neq n_0,-n_0\}, \tag{2.21}\\
\Omega_2 &= \Omega_{2,+}\cup\Omega_{2,-}, \quad \Omega_{2,\pm} = \{(\pm n_0,k) : k\neq k_0,\ |k|\le K\}. \tag{2.22}
\end{align}
It follows from (2.16) that
\[
\|(T_{\Omega_1}-E)^{-1}\| \lesssim N_0^{-1} \tag{2.23}
\]
and satisfies exponential off-diagonal decay
\[
|(T_{\Omega_1}-E)^{-1}(z,z')| \lesssim N_0^{-1}e^{-|z-z'|}. \tag{2.24}
\]
From (2.18),
\[
\|(D_{\Omega_2}-E)^{-1}\| < K^{C_0}. \tag{2.25}
\]
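Observe that, since $\hat V(0) = 0$,
\[
P_{\Omega_{2,+}}S_VP_{\Omega_{2,+}} = 0 = P_{\Omega_{2,-}}S_VP_{\Omega_{2,-}} \tag{2.26}
\]
and, since $\hat V(n) = 0$ for $|n|$ sufficiently large, in particular for $|n|>|n_0|$, also
\[
P_{\Omega_{2,+}}S_VP_{\Omega_{2,-}} = P_{\Omega_{2,-}}S_VP_{\Omega_{2,+}} = 0. \tag{2.27}
\]
Consequently
\[
T_{\Omega_2} = D_{\Omega_2}. \tag{2.28}
\]
Write
\[
T_{\Omega_0}-E = \begin{pmatrix} T_{\Omega_1}-E & U^* \\ U & D_{\Omega_2}-E \end{pmatrix} \tag{2.29}
\]
and
\[
(T_{\Omega_0}-E)^{-1} = \begin{pmatrix} (T_{\Omega_1}-E)^{-1}+(T_{\Omega_1}-E)^{-1}U^*A^{-1}U(T_{\Omega_1}-E)^{-1} & -(T_{\Omega_1}-E)^{-1}U^*A^{-1} \\ -A^{-1}U(T_{\Omega_1}-E)^{-1} & A^{-1} \end{pmatrix}, \tag{2.30}
\]
where
\[
A = (D_{\Omega_2}-E)-U(T_{\Omega_1}-E)^{-1}U^*. \tag{2.31}
\]
Rewrite $A$ as
\[
[1_{\Omega_2}-U(T_{\Omega_1}-E)^{-1}U^*(D_{\Omega_2}-E)^{-1}](D_{\Omega_2}-E). \tag{2.32}
\]
Hence
\[
A^{-1} = (D_{\Omega_2}-E)^{-1}[1_{\Omega_2}-U(T_{\Omega_1}-E)^{-1}U^*(D_{\Omega_2}-E)^{-1}]^{-1}, \tag{2.33}
\]
where, by (2.23)–(2.25),
\[
\|U(T_{\Omega_1}-E)^{-1}U^*(D_{\Omega_2}-E)^{-1}\| < \|(T_{\Omega_1}-E)^{-1}\|\,\|(D_{\Omega_2}-E)^{-1}\| \lesssim N_0^{-1}K^{C_0} < N_0^{-1/2}. \tag{2.34}
\]
Thus
\[
\|A^{-1}\| < 2\|(D_{\Omega_2}-E)^{-1}\| < 2K^{C_0} \tag{2.35}
\]
and
\[
|A^{-1}(z,z')| \lesssim K^{2C_0}N_0^{-1}e^{-\frac12|z-z'|} < e^{-\frac12|z-z'|} \quad\text{for } z\neq z'. \tag{2.36}
\]
Invoking (2.24), (2.36), (2.30), we get that
\[
\|(T_{\Omega_0}-E)^{-1}\| \lesssim K^{C_0} \tag{2.37}
\]
and
\[
|(T_{\Omega_0}-E)^{-1}(z,z')| \lesssim e^{-\frac12|z-z'|} \quad\text{for } z\neq z'. \tag{2.38}
\]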
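Again $(T_\Omega-E)\xi = 0$ implies
\[
\xi_{\Omega_0} = -(T_{\Omega_0}-E)^{-1}P_{\Omega_0}T\xi_{\Omega\setminus\Omega_0}, \tag{2.39}
\]
where the set $\Omega\setminus\Omega_0$ is now reduced to the pair $(n_0,k_0)$, $(-n_0,k_0)$. Thus for $(n,k)\in\Omega_0$, (2.39), (2.37), (2.38) yield
\begin{align}
\xi(n,k) &= -\sum_{z'\in\Omega_0,\,z''\in\Omega\setminus\Omega_0}(T_{\Omega_0}-E)^{-1}\big((n,k),z'\big)\,T(z',z'')\,\xi(z''), \notag\\
|\xi(n,k)| &< K^{C_0}\sum_{z'\in\Omega_0,\,z''\in\Omega\setminus\Omega_0}e^{-\frac12|(n,k)-z'|-|z'-z''|} < K^{C_0}e^{-\frac14\left(\left||n|-|n_0|\right|+|k-k_0|\right)}. \tag{2.40}
\end{align}
Hence, there is $(n_0,k_0)\in\Omega$ such that
\[
\Big\|\xi\big|_{\left[\left||n|-|n_0|\right|>\frac{K}{10}\ \text{or}\ |k-k_0|>\frac{K}{10}\right]}\Big\| < e^{-\frac{K}{100}}, \tag{2.41}
\]
localizing the eigenvector $\xi$.

In summary, it follows that up to an error of size $e^{-K/100}$, the eigenvector $\xi$ is either localized to $|n|<2N_0$ or to a union of boxes $\big||n|-|n_0|\big|<\frac{K}{10}$, $|k-k_0|<\frac{K}{10}$ in $\Omega$. This proves the claim.

3. Next, we use this system of approximative Bloch waves obtained in the previous section to describe a general solution; see (3.37), (3.38) below. Consider first an initial data $\varphi$ such that $\|\varphi\|_2\le 1$ and with frequency restriction
\[
\hat\varphi(n) = 0 \quad\text{for } |n|<4N_0 \text{ or } |n|>\frac N2. \tag{3.1}
\]
Considering the vector $\tilde\varphi\in\ell^2(\Omega)$ defined by
\[
\begin{cases} \tilde\varphi(n,0) = \hat\varphi(n) \\ \tilde\varphi(n,k) = 0 \ \text{for } k\neq 0, \end{cases} \tag{3.2}
\]
expand $\tilde\varphi$ in the $\ell^2$-normalized eigenvectors of $T_\Omega$
\[
\tilde\varphi = \sum\langle\tilde\varphi,\xi\rangle\xi. \tag{3.3}
\]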
From the localization property for eigenvectors of $T_\Omega$ established in Subsect. 2, it clearly follows that
\[
\tilde\varphi = \sum_{\xi\in\mathcal E}\langle\tilde\varphi,\xi\rangle\xi+O(e^{-K/100}) = \sum_{\xi\in\mathcal E}\langle\tilde\varphi,\xi'\rangle\xi'+O(e^{-K/100}), \tag{3.4}
\]
where $\mathcal E$ denotes the set of those $\xi$'s admitting a localization in the sense of (2.41) in a region
\[
Q_\xi = \Big[|n\pm n_0|<\frac{K}{10}\Big]\times\Big[|k|<\frac{K}{10}\Big], \tag{3.5}
\]
where $\mathrm{dist}(n_0,\operatorname{supp}\hat\varphi)<\frac{K}{10}$, in particular
\[
2N_0 < |n_0| < \frac34N, \tag{3.6}
\]
and $\xi' = \xi|_{Q_\xi}$.

[Figure: the boxes $Q_\xi$ (of size $K$) and the support of $\tilde\varphi$, located between $N_0$ and $N$.]
(3.7)
and thus, for (n, k) ∈ Z × Zb , (n2 − k.λ − E)ξ 0 (n, k) + (SV ξ 0 )(n, k) = 0(e−K/100 ) from the definition of Qξ . Define ξˇ (x, t) = eiEt
X
ξ(n, k)ei
nx+(k,λ)t
(3.8)
.
(3.9)
(n,k)∈Qξ
From (3.8), $\check\xi$ is an approximative solution of (1.0), in the sense that
\[
(i\partial_t-\partial_x^2)\check\xi+V.\check\xi = O(e^{-K/100}). \tag{3.10}
\]
At this point, we recall the following elementary fact on approximative solutions.

Lemma 3.11. Consider an approximative solution $u$ of (1.0), i.e.
\[
iu_t+\partial_x^2u+Vu = \eta, \tag{3.12}
\]
where $\eta$ satisfies
\[
\|\eta(t)\|_2 < \varepsilon \quad\text{for } |t|<T. \tag{3.13}
\]
Then the solution $\tilde u$ of (1.0) with initial data $\tilde u(0) = u(0)$ satisfies
\[
\|\tilde u(t)-u(t)\|_2 < \varepsilon|t| \le \varepsilon T \quad\text{for } |t|<T. \tag{3.14}
\]
Proof. Denote by $S(t)$ the flow map corresponding to (1.0). It preserves the $L^2$-norm, i.e.
\[
\|S(t)\psi\|_2 = \|\psi\|_2. \tag{3.15}
\]
Hence, (3.14) follows from (3.15) and the integral equation
\[
(u-\tilde u)(t) = \int_0^tS(t)S(\tau)^{-1}\eta(\tau)\,d\tau. \tag{3.16}
\]
□
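The mechanism behind Lemma 3.11 (a unitary flow turns a forcing of size $\varepsilon$ into an error of size at most $\varepsilon|t|$) can be illustrated on a scalar toy model. The Python sketch below is not the PDE of the paper: it assumes the one-dimensional model $iu' + vu = \eta(t)$ with a constant real $v$ and $\eta(t) = \varepsilon e^{i\omega t}$, so that $S(t) = e^{ivt}$; the function name and all numerical values are hypothetical.

```python
import cmath


def perturbed_minus_exact(t, v, omega, eps, n=20000):
    """Duhamel term u(t) - S(t)u(0) for the toy model i u' + v u = eta(t).

    Assumptions (illustrative only): v is a real constant, so S(t) = exp(i v t)
    is unitary, and eta(t) = eps * exp(i omega t), so |eta| = eps. Then
    u(t) - S(t)u(0) = -i * integral over [0, t] of S(t - tau) eta(tau) dtau,
    evaluated here with the midpoint rule.
    """
    step = t / n
    total = 0j
    for j in range(n):
        tau = (j + 0.5) * step
        total += cmath.exp(1j * v * (t - tau)) * eps * cmath.exp(1j * omega * tau)
    return -1j * total * step


eps, v, t = 1e-3, 2.0, 50.0
resonant = abs(perturbed_minus_exact(t, v, v, eps))        # forcing in resonance
detuned = abs(perturbed_minus_exact(t, v, v + 1.0, eps))   # forcing off resonance

# Both errors are bounded by eps * t; the resonant forcing saturates the bound.
print(resonant, detuned, eps * t)
```

In this toy model the bound $\varepsilon|t|$ is attained exactly when the forcing oscillates at the frequency of the flow, which is the same resonance effect exploited in the examples of Sect. 5.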
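Applying Lemma 3.11 to (3.10), it follows that for $|t|<T$,
\[
\|\check\xi(t)-S(t)\check\xi(0)\|_2 \le e^{-K/100}T. \tag{3.17}
\]
Taking the inverse Fourier transform of (3.4), we get
\[
\varphi(x) = \sum_{\xi\in\mathcal E}\langle\tilde\varphi,\xi'\rangle\sum_{n,k}\xi'(n,k)\,e^{i(nx+k.\theta)}+O(e^{-K/100}) \tag{3.18}
\]
(where the $O(\,)$ refers to the $L^2$-norm on $\mathbb T^{1+b}$) and in particular
\[
\varphi(x) = \sum_{\xi\in\mathcal E}\langle\tilde\varphi,\xi'\rangle\sum_{n,k}\xi'(n,k)\,e^{inx}+O(e^{-K/100}K^{b/2}) \tag{3.19}
\]
or
\[
\varphi = \sum_{\xi\in\mathcal E}\langle\tilde\varphi,\xi'\rangle\,\check\xi(0)+O(e^{-K/200}). \tag{3.20}
\]
Hence, by (3.17), we get for $|t|<T$,
\begin{align}
\Big\|S(t)\varphi-\sum_{\xi\in\mathcal E}\langle\tilde\varphi,\xi'\rangle\,\check\xi(t)\Big\|_2 &< e^{-K/200}+\sum_{\xi\in\mathcal E}|\langle\tilde\varphi,\xi'\rangle|\,\|\check\xi(t)-S(t)\check\xi(0)\|_2 \notag\\
&< e^{-K/200}+(NK^b)^{1/2}e^{-K/100}T < e^{-K/300} \tag{3.21}
\end{align}
if we assume
\[
K \gg (1+s)\log N+\log T. \tag{3.22}
\]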
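Next, estimate
\begin{align}
\Big\|\sum_{\xi\in\mathcal E}\langle\tilde\varphi,\xi'\rangle\,\check\xi(t)\Big\|_{H^s} &= \Big(\sum_n|n|^{2s}\Big|\sum_\xi\langle\tilde\varphi,\xi'\rangle\langle\check\xi(t),e_n\rangle\Big|^2\Big)^{1/2} \notag\\
&= \Big(\sum_n|n|^{2s}\Big|\sum_m\sum_\xi\hat\varphi(m)\,\xi'(m,0)\langle\check\xi(t),e_n\rangle\Big|^2\Big)^{1/2}. \tag{3.23}
\end{align}
By construction of $\xi'$, $\check\xi$, the expression $\xi'(m,0)\langle\check\xi(t),e_n\rangle$ vanishes, unless $\big||m|-|n|\big|<\frac K5$, $|m|>N_0$. In particular, it follows that in (3.23),
\[
\frac12|n| < |m| < 2|n|. \tag{3.24}
\]
Define for dyadic $R$,
\[
\varphi_R = \sum_{\frac14R<|m|<4R}\hat\varphi(m)\,e^{im.x}. \tag{3.25}
\]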
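From (3.24), we have then clearly
\begin{align}
(3.23) &\sim \Big(\sum_{R\ \mathrm{dyadic}}R^{2s}\sum_{\frac R2<|n|<2R}\Big|\sum_m\sum_{\xi\in\mathcal E}\hat\varphi_R(m)\,\xi'(m,0)\langle\check\xi(t),e_n\rangle\Big|^2\Big)^{1/2} \notag\\
&\le \Big(\sum_RR^{2s}\Big\|\sum_m\sum_\xi\hat\varphi_R(m)\,\xi'(m,0)\,\check\xi(t)\Big\|_2^2\Big)^{1/2} \notag\\
&= \Big(\sum_RR^{2s}\Big\|\sum_\xi\langle\tilde\varphi_R,\xi'\rangle\,\check\xi(t)\Big\|_2^2\Big)^{1/2}. \tag{3.26}
\end{align}
By (3.21)
\[
\Big\|\sum_\xi\langle\tilde\varphi_R,\xi'\rangle\,\check\xi(t)\Big\|_2 \le \|S(t)\varphi_R\|_2+e^{-K/300}\|\varphi_R\|_2 \le 2\|\varphi_R\|_2,
\]
and substituting in (3.26), it follows from (3.25),
\[
(3.23) \lesssim \Big(\sum_{R\ \mathrm{dyadic}}R^{2s}\|\varphi_R\|_2^2\Big)^{1/2} \lesssim C^s\|\varphi\|_{H^s}. \tag{3.28}
\]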
It follows from (3.21), (3.28) that for $|t|<T$,
\[
\|\pi_N(S(t)\varphi)\|_{H^s} < C^s\|\varphi\|_{H^s}+e^{-K/300}N^s\|\varphi\|_2, \tag{3.29}
\]
where $\pi_N$ denotes a Fourier multiplier on the $N$ first Fourier modes with shape

[Figure: profile of the multiplier $\pi_N$, with marks at $-N$, $-\frac N2$, $0$, $\frac N2$, $N$.]

It remains to bound the contribution of the Fourier modes $|n|>\frac N2$. From the equation $i\dot u-\partial_x^2u+Vu = 0$
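we may estimate
\begin{align}
\|u(t+1)\|_{H^s}^2-\|u(t)\|_{H^s}^2 &= 2\,\mathrm{Re}\int_t^{t+1}\langle\dot u(\tau),(-\Delta)^su(\tau)\rangle\,d\tau \notag\\
&= 2\,\mathrm{Im}\int_t^{t+1}\langle Vu(\tau),(-\Delta)^su(\tau)\rangle\,d\tau \notag\\
&\le \sum_{\alpha+\beta=s,\ \beta<s}\int_t^{t+1}\|\partial_x^\alpha V(\tau).\partial_x^\beta u(\tau)\|_2\,\|\partial_x^su(\tau)\|_2\,d\tau \quad\text{(since $V$ is real)} \notag\\
&\le C\int_t^{t+1}\|u(\tau)\|_{H^{s-1}}\|u(\tau)\|_{H^s}\,d\tau \notag\\
&\le C\|u(t)\|_{H^s}^{2-\frac1s}\|u(t)\|_{L^2}^{\frac1s} = C\|\varphi\|_2^{\frac1s}\|u(t)\|_{H^s}^{2-\frac1s}, \tag{3.30}
\end{align}
and (3.30) implies that
\[
\|u(t)\|_{H^s} < (CT)^s\|\varphi\|_{H^s}. \tag{3.31}
\]
Previous calculation also yields similarly that
\begin{align}
&\|(I-\pi_N)u(t+1)\|_{H^s}^2-\|(I-\pi_N)u(t)\|_{H^s}^2 \notag\\
&\quad< \sum_{\alpha+\beta=s,\ \beta<s}\int_t^{t+1}\big\{\|(I-\pi_N)(\partial_x^\alpha V)(\tau)(\partial_x^\beta u)(\tau)\|_2\,\|u(\tau)\|_{H^s}+\|[V,\pi_N]u(\tau)\|_{H^s}\|u(\tau)\|_{H^s}\big\}\,d\tau \notag\\
&\quad< \frac CN\|u(t)\|_{H^s}^2 < \frac{(CT)^{2s}}{N}\|\varphi\|_{H^s}^2, \tag{3.32}
\end{align}
where $[V,\pi_N]$ denotes the commutator. Thus
\[
\|(I-\pi_N)u(t)\|_{H^s}^2 < \Big(1+\int_0^t\frac{(C\tau)^{2s}}{N}\,d\tau\Big)\|\varphi\|_{H^s}^2 < \Big(1+\frac{(C|t|)^{2s+1}}{N}\Big)\|\varphi\|_{H^s}^2 \tag{3.33}
\]
or
\[
\|(I-\pi_N)S(t)\varphi\|_{H^s} < \Big(1+\frac{(C|t|)^{s+\frac12}}{N^{1/2}}\Big)\|\varphi\|_{H^s}. \tag{3.34}
\]
Choose
\[
N > (CT)^{2s+1}. \tag{3.35}
\]
Conditions (2.4), (3.22), (3.35) lead to the parameter choice
\[
K \sim s\log N \sim s^2\log T, \qquad N_0 \sim K^{C_1}, \tag{3.36}
\]
where $C_1$ depends on $\lambda$. From (3.29), (3.34) we get then that
\[
\|S(t)\varphi\|_{H^s} \le C^s\|\varphi\|_{H^s} \tag{3.37}
\]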
for $|t|<T$ and $\varphi$ satisfying (3.1), i.e.
\[
n\in\operatorname{supp}\hat\varphi \ \Rightarrow\ 4N_0 < |n| < \frac N2. \tag{3.38}
\]
We will also use the commutator estimate
\[
\|[S(t),\pi_N]\|_{H^s\to H^s} < \frac{(1+|t|)^{3s+1}}{N}. \tag{3.39}
\]
If
\[
(i\partial_t-\partial_x^2)u+Vu = 0,
\]
then
\[
(i\partial_t-\partial_x^2)(\pi_Nu)+V(\pi_Nu) = [V,\pi_N](u). \tag{3.40}
\]
Hence, for $|t|<T$, by (3.40)
\begin{align}
[\pi_N,S(t)]\varphi &= (\pi_NS(t)-S(t)\pi_N)\varphi = \pi_Nu(t)-S(t)\pi_Nu(0) \notag\\
&= i\int_0^tS(t)S(\tau)^{-1}[V,\pi_N]u(\tau)\,d\tau, \tag{3.41}
\end{align}
and using (3.31) and again the fact that $\|[V,\pi_N]\|_{H^s\to H^s}\lesssim\frac1N$, it follows
\begin{align*}
\|[\pi_N,S(t)]\varphi\|_{H^s} &< (1+|t|)^s\int_0^t(1+|\tau|)^s\|[V,\pi_N]u(\tau)\|_{H^s}\,d\tau \\
&\lesssim (1+|t|)^s\frac1N\int_0^t(1+|\tau|)^s\|u(\tau)\|_{H^s}\,d\tau \\
&< (1+|t|)^s\frac1N\int_0^t(1+|\tau|)^{2s}\,d\tau\,\|\varphi\|_{H^s} < \frac1N(1+|t|)^{3s+1}\|\varphi\|_{H^s},
\end{align*}
which is the claim (3.39).

4. Finally, consider a general $\varphi\in H^s$, $\|\varphi\|_{H^s} = 1$ (without frequency restrictions). Denote $S(t_1,t_2)$ the flow map from time $t_1$ to time $t_2$ (which we assume bounded by $T$). From (3.34) and the choice of $N$ in (3.35), we get
\[
\|S(0,t)\varphi\|_{H^s} \le \|\pi_{\frac N{10}}S(0,t)\varphi\|_{H^s}+2\|\varphi\|_{H^s}. \tag{4.1}
\]
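Write
\begin{align}
\|\pi_{\frac N{10}}S(0,t)\varphi\|_{H^s} &\le \|\pi_{\frac N{10}}S(0,t)\pi_{4N_0}\varphi\|_{H^s} \tag{4.2}\\
&\quad+\|S(0,t)(\pi_{\frac N2}-\pi_{4N_0})\varphi\|_{H^s} \tag{4.3}\\
&\quad+\|\pi_{\frac N{10}}S(0,t)(\varphi-\pi_{\frac N2}\varphi)\|_{H^s}. \tag{4.4}
\end{align}
By (3.37),
\[
(4.3) \le C^s. \tag{4.5}
\]
By (3.39),
\[
(4.4) = \|[S(0,t),\pi_{\frac N{10}}](\varphi-\pi_{\frac N2}\varphi)\|_{H^s} \le \frac{T^{3s+1}}{N}\|\varphi\|_{H^s} < 1. \tag{4.6}
\]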
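Thus
\[
\|\pi_{\frac N{10}}S(0,t)\varphi\|_{H^s} < (4.2)+C^s. \tag{4.7}
\]
Write
\begin{align}
(4.2) &\le \|\pi_{\frac N{10}}S(1,t)\pi_{4N_0}S(0,1)\pi_{4N_0}\varphi\|_{H^s} \tag{4.8}\\
&\quad+\|S(1,t)(\pi_{\frac N2}-\pi_{4N_0})S(0,1)\pi_{4N_0}\varphi\|_{H^s} \tag{4.9}\\
&\quad+\|\pi_{\frac N{10}}S(1,t)(I-\pi_{\frac N2})S(0,1)\pi_{4N_0}\varphi\|_{H^s}. \tag{4.10}
\end{align}
Again by (3.37), (3.39),
\begin{align}
(4.9) &\lesssim C^s\|S(0,1)\pi_{4N_0}\varphi\|_{H^s} \le C^s, \tag{4.11}\\
(4.10) &\le \|[S(1,t),\pi_{\frac N{10}}](I-\pi_{\frac N2})S(0,1)\pi_{4N_0}\varphi\|_{H^s} < 1, \tag{4.12}
\end{align}
and
\[
\|\pi_{\frac N{10}}S(0,t)\varphi\|_{H^s} < \|\pi_{\frac N{10}}S(1,t)\pi_{4N_0}S(0,1)\pi_{4N_0}\varphi\|_{H^s}+2C^s. \tag{4.13}
\]
After $r$ steps, we need to estimate an expression
\[
\|\pi_{\frac N{10}}S(r,t)\pi_{4N_0}S(r-1,r)\pi_{4N_0}S(r-2,r-1)\cdots\pi_{4N_0}S(0,1)\varphi\|_{H^s} = \|\pi_{\frac N{10}}S(r,t)\pi_{4N_0}\Phi\|_{H^s}, \tag{4.14}
\]
where
\[
\|\Phi\|_2 \le \|\varphi\|_2. \tag{4.15}
\]
Writing $S(r,t) = S(r+1,t)S(r,r+1)$, we obtain
\begin{align}
\|\pi_{\frac N{10}}S(r,t)\pi_{4N_0}\Phi\|_{H^s} &\le \|\pi_{\frac N{10}}S(r+1,t)\pi_{4N_0}S(r,r+1)\pi_{4N_0}\Phi\|_{H^s} \tag{4.16}\\
&\quad+\|S(r+1,t)(\pi_{\frac N2}-\pi_{4N_0})S(r,r+1)\pi_{4N_0}\Phi\|_{H^s} \tag{4.17}\\
&\quad+\|\pi_{\frac N{10}}S(r+1,t)(I-\pi_{\frac N2})S(r,r+1)\pi_{4N_0}\Phi\|_{H^s}. \tag{4.18}
\end{align}
Again by (3.37), (3.39) and (4.15),
\begin{align}
(4.17) &< C^s\|S(r,r+1)\pi_{4N_0}\Phi\|_{H^s} \le C^s\|\pi_{4N_0}\Phi\|_{H^s} < (CN_0)^s, \tag{4.19}\\
(4.18) &= \|[S(r+1,t),\pi_{\frac N{10}}](I-\pi_{\frac N2})S(r,r+1)\pi_{4N_0}\Phi\|_{H^s} \notag\\
&\lesssim \frac{T^{3s+1}}{N}\|S(r,r+1)\pi_{4N_0}\Phi\|_{H^s} < \frac{T^{3s+1}}{N}(4N_0)^s < 1. \tag{4.20}
\end{align}
Hence
\begin{align}
&\|\pi_{\frac N{10}}S(r,t)\pi_{4N_0}S(r-1,r)\pi_{4N_0}\cdots S(0,1)\varphi\|_{H^s} \notag\\
&\quad\le \|\pi_{\frac N{10}}S(r+1,t)\pi_{4N_0}S(r,r+1)\pi_{4N_0}S(r-1,r)\cdots S(0,1)\varphi\|_{H^s}+(CN_0)^s. \tag{4.21}
\end{align}
Iterating and recalling (4.1) yields
\[
\|S(0,t)\varphi\|_{H^s} \le T(CN_0)^s. \tag{4.22}
\]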
Thus, by (3.36)
\[
\|S(0,t)\|_{H^s\to H^s} \le T(Cs^2\log T)^{Cs} \tag{4.23}
\]
for some constant $C$. Interpolating with the $L^2$-bound yields for $0<s_1<s$,
\[
\|S(0,t)\|_{H^{s_1}\to H^{s_1}} \le T^{s_1/s}(Cs^2\log T)^{Cs_1}, \tag{4.24}
\]
and, letting $s = \log T$,
\[
\|S(0,t)\|_{H^{s_1}\to H^{s_1}} < C(\log T)^{Cs_1}. \tag{4.25}
\]
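The passage from (4.24) to (4.25) works because the choice $s = \log T$ turns the polynomial factor $T^{s_1/s}$ into the constant $e^{s_1}$, leaving only powers of $\log T$. The Python sketch below checks this arithmetic; the function name and the sample values of $T$ and $s_1$ are illustrative assumptions.

```python
import math


def polynomial_factor(T, s1):
    """The factor T^(s1 / s) from (4.24), evaluated at the choice s = log T."""
    s = math.log(T)
    return T ** (s1 / s)


s1 = 2.0
for T in (1e3, 1e6, 1e12):
    # Independently of T, the factor equals e^{s1}.
    print(T, polynomial_factor(T, s1), math.exp(s1))
```

With this choice the remaining factor $(Cs^2\log T)^{Cs_1}$ becomes $(C(\log T)^3)^{Cs_1}$, which is of the form $C(\log T)^{Cs_1}$ after enlarging the constant.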
This yields the growth-estimate for the $H^s$-norm and proves Theorem 1.

5. Examples. In this section, we will first show that the logarithmic growth factor for $t\to\infty$ cannot be avoided (even for time periodic potentials). The main point, as one may expect, is to exploit resonance effects. This will be achieved in a standard perturbative construction. Next, we will exhibit an example (see (5.18))
\[
iu_t+\Delta u+V(x,t)u = 0; \quad u(0)\in\bigcap_{s>0}H^s(\mathbb T)
\]
with $V$ bounded smooth and periodic in $x$ and $t$, such that $\{S(t)u(0)\,|\,t\in\mathbb R_+\}$ is not relatively compact in $H^s$, for any $s>0$. Thus $u(t) = S(t)u(0)$ is not almost periodic (as time function) in any $H^s$, $s>0$. Recall here the classical fact that $u(t)$ is almost periodic as an $L^2$-valued function. Finally, an example
\[
iu_t+\Delta u+V(x,t)u = 0, \quad u(0) = 1
\]
is constructed, with $V$ bounded, smooth, periodic in $x$ and almost periodic in $t$, such that $\{u(t)\,|\,t\in\mathbb R_+\}$ is not relatively compact in $L^2$.

Take the 1D Sturm–Liouville operator
\[
-\frac{d^2}{dx^2}+w(x) \tag{5.1}
\]
with $w(x)$ small real analytic periodic potential. Denote
\[
\lambda_0<\lambda_1\le\lambda_2<\cdots, \qquad 1\approx\varphi_0,\ \varphi_1,\ \varphi_2,\ \dots \tag{5.2}
\]
periodic spectrum and corresponding eigenfunctions respectively. These eigenfunctions are exponentially well localized with respect to the system of exponentials, in the sense that (cf. [C-W]) for some $\alpha>0$,
\[
\sum_n|\hat\varphi_j(n)|\exp\Big(\alpha\Big||n|-\frac j2\Big|\Big) < C. \tag{5.3}
\]
Take a large index $j$ and consider the linear Schrödinger equation
\[
iu_t-u_{xx}+\Big[w(x)+\delta\big(\cos(\lambda_j-\lambda_0)t\big)\frac{\varphi_j(x)}{\varphi_0(x)}\Big]u = 0. \tag{5.4}
\]
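Denote by $S_\delta(t)$ the corresponding flow map, which acts unitarily on $L^2$. For $\delta = 0$,
\[ S_0(t)f = \sum_j \langle f,\varphi_j\rangle e^{i\lambda_j t}\varphi_j \tag{5.5} \]
and $S_0(t)$ is uniformly bounded on all $H^s$-spaces. By the integral formula, we get from (5.4),
\[ S_\delta(t)f = S_0(t)f + i\delta\int_0^t S_0(t-\tau)\Bigl[\bigl(\cos(\lambda_j-\lambda_0)\tau\bigr)\frac{\varphi_j}{\varphi_0}S_\delta(\tau)f\Bigr]d\tau. \tag{5.6} \]
Hence
\[ \|S_\delta(t)f - S_0(t)f\|_2 \le 2\delta|t|\,\|f\|_2 \tag{5.7} \]
and thus, substituting (5.7) in (5.6), we get
\[ S_\delta(t)f = S_0(t)f + i\delta\int_0^t S_0(t-\tau)\Bigl[\bigl(\cos(\lambda_j-\lambda_0)\tau\bigr)\frac{\varphi_j}{\varphi_0}S_0(\tau)f\Bigr]d\tau + O_{L^2}(\delta^2t^2). \tag{5.8} \]
Taking
\[ f = \varphi_0 \tag{5.9} \]
as initial data and choosing $t$ such that
\[ \delta t = \gamma = \text{fixed small number}, \tag{5.10} \]
(5.8) becomes
\[ S_\delta(t)\varphi_0 = e^{i\lambda_0t}\varphi_0 + i\delta\Bigl(\int_0^t \cos\bigl((\lambda_j-\lambda_0)\tau\bigr)e^{-i(\lambda_j-\lambda_0)\tau}d\tau\Bigr)e^{i\lambda_jt}\varphi_j + O(\delta^2t^2) \tag{5.11} \]
\[ = e^{i\lambda_0t}\varphi_0 + i\frac{\gamma}{2}e^{i\lambda_jt}\varphi_j + O(\gamma^2). \tag{5.12} \]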
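The step from (5.11) to (5.12) uses that the resonant integral equals $t/2$ up to a bounded oscillatory remainder: writing $\omega = \lambda_j - \lambda_0$, one has $\cos(\omega\tau)e^{-i\omega\tau} = \tfrac12(1 + e^{-2i\omega\tau})$, so the integral is $t/2 + (1 - e^{-2i\omega t})/(4i\omega)$. The sketch below checks this numerically; the values of `omega` and `t` and the function names are illustrative assumptions.

```python
import cmath
import math

def resonant_integral_closed_form(omega, t):
    """Exact value of int_0^t cos(omega*tau) * exp(-i*omega*tau) d tau."""
    return t / 2 + (1 - cmath.exp(-2j * omega * t)) / (4j * omega)

def resonant_integral_numeric(omega, t, steps=100_000):
    """Midpoint-rule approximation of the same integral."""
    h = t / steps
    total = 0j
    for i in range(steps):
        tau = (i + 0.5) * h
        total += math.cos(omega * tau) * cmath.exp(-1j * omega * tau)
    return total * h

# Illustrative values (assumptions): frequency gap omega = 7, time t = 50.
omega, t = 7.0, 50.0
exact = resonant_integral_closed_form(omega, t)
approx = resonant_integral_numeric(omega, t)
```

The remainder has modulus at most $1/(2\omega)$, so after multiplication by $\delta$ it is of lower order than the leading term $\delta t/2 = \gamma/2$ appearing in (5.12).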
Equation (5.4) is of the form
\[ iu_t - u_{xx} + V(x,t)u = 0 \tag{5.13} \]
with potential
\[ V(x,t) = w(x) + \delta[\cos(\lambda_j-\lambda_0)t]\frac{\varphi_j(x)}{\varphi_0(x)}. \tag{5.14} \]
By appropriate choice of $w$, one may ensure that
\[ \lambda_j - \lambda_0 \in \rho\mathbb{Z} \tag{5.15} \]
for any initially specified sufficiently small number $\rho$. Hence $V$ is 1-periodic in $x$ and $\frac{2\pi}{\rho}$-periodic in $t$. By the localization property (5.3), a uniform bound on the real analytic function space norm of $V$ will be achieved for $\delta = e^{-j}$, say. From (5.10), the time choice is then
\[ t = \gamma e^{j}. \tag{5.16} \]
From (5.12), it follows then that for $s>0$,
\[ \|u(t)\|_{H^s} = \|S_\delta(t)\varphi_0\|_{H^s} \sim \|\varphi_j\|_{H^s} \sim (\log|t|)^s, \tag{5.17} \]
producing lower bounds logarithmic in time for the Sobolev norms. Observe that in this construction the time frequency may be chosen diophantine.

The method of the preceding construction may be used to produce a Schrödinger equation
\[ iu_t - u_{xx} + V(x,t)u = 0 \tag{5.18} \]
with $V$ smooth and periodic in $x$, $t$ and $u(0) = f\in\bigcap_{s<\infty}H^s(\mathbb{T})$ such that, denoting by $S(t)$ the flow map for (5.18), the set
\[ \{S(t)f \mid t\in\mathbb{R}\} \tag{5.19} \]
is not relatively compact in $H^s$ for any $s>0$ (unlike in $L^2$). Hence $S(t)f$ is not almost periodic in $H^s$ for any $s>0$.
One may also construct an example (5.18) where $V$ is smooth and periodic in $x$, almost periodic in $t$, such that (5.19) is not relatively compact in $L^2$ and hence $S(t)\phi$ is not almost periodic.

Both constructions are based on a procedure of consecutive perturbations of the potential, which are exploited over an increasing time sequence $\{T_j\}$, using the results from previous sections. Fix $V_0(x,t)$ periodic in $x$, $t$ and given by a trigonometric polynomial. (In this construction, the real analytic norm of the potential will not remain bounded, but a weaker norm controlling all $H^s$, $s<\infty$, will.) Denote by $S_0(t)$ the flow map of
\[ iu_t - u_{xx} + V_0(x,t)u = 0 \tag{5.20} \]
and by $S_\delta(t)$ the flow map of a perturbed equation
\[ iu_t - u_{xx} + V_0(x,t)u + \delta V_1(x,t)u = 0. \tag{5.21} \]
Assuming $|V_1| < 1$, one has again from the integral equation
\[ S_\delta(t)f = S_0(t)f + i\delta\int_0^t S_0(t)S_0(\tau)^{-1}[V_1(x,\tau)S_\delta(\tau)f]\,d\tau, \tag{5.22} \]
\[ \|S_\delta(t)f - S_0(t)f\|_2 < \delta|t|, \tag{5.23} \]
\[ S_\delta(t)f = S_0(t)f + i\delta\int_0^t S_0(t)S_0(\tau)^{-1}[V_1(x,\tau)S_0(\tau)f]\,d\tau + O_{L^2}(\delta^2t^2). \tag{5.24} \]
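Fix $|t| < T$. From the discussion in Subsects. 1–5, one may construct approximative Bloch waves
\[ e^{iE_1t}\psi_1(x,t), \quad \|\psi_1(x,0)\|_2 = 1, \tag{5.25} \]
\[ e^{iE_2t}\psi_2(x,t), \quad \|\psi_2(x,0)\|_2 = 1 \tag{5.26} \]
for (5.20), where $\psi_1$, $\psi_2$ are periodic in $t$,
\[ |E_1| < (\log T)^C, \quad \operatorname{supp}\hat\psi_1 \subset B(0,(\log T)^C), \tag{5.27} \]
\[ E_2 \sim e^{(\log T)^{1/2}}, \quad \operatorname{supp}\hat\psi_2 \subset \bigl[\,\bigl||n| - E_2^{1/2}\bigr| < (\log T)^3\bigr]\times\bigl[|k| < (\log T)^3\bigr], \tag{5.28} \]
\[ \|S_0(t)\psi_\alpha(x,0) - e^{iE_\alpha t}\psi_\alpha(x,t)\|_2 < e^{-(\log T)^2} \quad (\alpha = 1,2). \tag{5.29} \]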
The same statement holds if we assume $V_0$ quasi-periodic in $t$ with diophantine frequency vector $\lambda$; $\psi_\alpha$ ($\alpha = 1,2$) are then $\lambda$-quasi-periodic in $t$; the constant $C$ in (5.27) has to be chosen depending on $V_0$, in particular on the frequency vector $\lambda$ in the quasi-periodic case (we let $T$ be sufficiently large here).

Coming back to (5.24), we will next describe how to choose $V_1$. Take in (5.24)
\[ f = \psi_1(x,0). \tag{5.30} \]
Write
\[ \Bigl\langle\psi_2(x,t), \int_0^t S_0(t)S_0(\tau)^{-1}[V_1(x,\tau)S_0(\tau)\psi_1(x,0)]\,d\tau\Bigr\rangle \]
\[ = \Bigl\langle S_0(t)\psi_2(x,0), \int_0^t S_0(t)S_0(\tau)^{-1}[V_1(x,\tau)S_0(\tau)\psi_1(x,0)]\,d\tau\Bigr\rangle + O(e^{-(\log T)^2}) \quad\text{(by (5.26), (5.29))} \tag{5.31} \]
\[ = \int_0^t\langle S_0(\tau)\psi_2(x,0), V_1(x,\tau)S_0(\tau)\psi_1(x,0)\rangle\,d\tau + O(e^{-(\log T)^2}) \quad\text{(since the flow map is unitary)} \tag{5.32} \]
\[ = \int_0^t\int_{\mathbb{T}} e^{i(E_2-E_1)\tau}\psi_2(x,\tau)\psi_1(x,\tau)V_1(x,\tau)\,dx\,d\tau + O(e^{-(\log T)^2}T) \quad\text{(by (5.29))}. \tag{5.33} \]
Hence, choosing either
\[ V_1(x,\tau) = \operatorname{Re}\bigl[e^{-i(E_2-E_1)\tau}\psi_2(x,\tau)\psi_1(x,\tau)\bigr] \tag{5.34} \]
or
\[ V_1(x,\tau) = \operatorname{Im}[\cdots], \tag{5.35} \]
one may ensure that the first term in (5.33) has a lower bound
\[ \frac12\int_0^t\int_{\mathbb{T}}|\psi_1(x,\tau)|^2|\psi_2(x,\tau)|^2\,dx\,d\tau. \tag{5.36} \]
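Observe that since in particular from (5.28)
\[ (n^2 - k.\lambda - E_2)\hat\psi_2(n,k) + S_{V_0}\hat\psi_2 = O(e^{-(\log T)^2}), \tag{5.37} \]
\[ |n^2 - E_2|\,|\hat\psi_2(n,k)| \lesssim (\log T)^3, \tag{5.38} \]
\[ \bigl\|\,\bigl||n| - E_2^{1/2}\bigr|\,\hat\psi_2(n)\bigr\|_2 < e^{-\frac13(\log T)^{1/2}}, \tag{5.39} \]
there exists $n_0\in\mathbb{Z}$ such that
\[ |n_0| \sim E_2^{1/2} \quad\text{and}\quad \sum_{|n|\ne|n_0|}\hat\psi_2(n,k)e^{i(nx+k.\lambda t)} = O\bigl(e^{-\frac13(\log T)^{1/2}}\bigr). \tag{5.40} \]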
Hence (5.31), (5.36)
\[ = \frac12\int_0^t\int_{\mathbb{T}}|\psi_1(x,\tau)|^2\bigl[|\hat\psi_2(n_0)(\tau)|^2 + |\hat\psi_2(-n_0)(\tau)|^2 + 2\operatorname{Re}\hat\psi_2(n_0)\hat\psi_2(-n_0)e^{2in_0x} + o(1)\bigr] \]
\[ = \frac12\int_0^t\Bigl[\int_{\mathbb{T}}|\psi_1(x,\tau)|^2\cdot\int_{\mathbb{T}}|\psi_2(x,\tau)|^2 + o(1)\Bigr]d\tau \]
\[ \sim \frac12\int_0^t\Bigl[\int_{\mathbb{T}}|S(\tau)\psi_1(x,0)|^2\cdot\int_{\mathbb{T}}|S(\tau)\psi_2(x,0)|^2 + o(1)\Bigr]d\tau \sim \frac12 t. \tag{5.41} \]
In the construction of a time periodic potential, one has to preserve the time periodicity and hence ensure that in (5.34), (5.35),
\[ E_2 - E_1 \in \lambda\mathbb{Z}. \tag{5.42} \]
This may for instance be achieved in the following way. Denote by $\rho$ a variable space frequency. Thus
\[ \widehat{\Delta\phi}(n) = -\rho n^2\hat\phi(n). \tag{5.43} \]
It is clear from (5.27), (5.28) that
\[ \frac{\partial(E_2-E_1)}{\partial\rho} > O\bigl(e^{(\log T)^{1/2}}\bigr), \tag{5.44} \]
and hence, choosing $T$ sufficiently large, (5.42) may be achieved by an arbitrarily small perturbation of $\rho$, leaving the properties of $S_0(t)$ unchanged for times $|t| < e^{(\log T)^{1/3}}$ (in particular, the times $T'$ considered at the previous stages of the construction). Also, again by (5.27), (5.28),
\[ \operatorname{supp}\hat V_1 \subset B\bigl(0, e^{(\log T)^{1/2}}\bigr), \tag{5.45} \]
and to ensure a uniform smoothness bound on $\delta V_1$ in (5.21), we let
\[ \delta = \frac{c}{T} \tag{5.46} \]
(for some small constant $c$). Equations (5.24) and (5.31)–(5.41) then yield for $t = T$ and $N\sim\exp\bigl(\tfrac12(\log T)^{1/2}\bigr)$,
\[ \|P_NS_\delta(T)\psi_1(x,0)\|_2 \equiv \Bigl(\sum_{|n|>N}\bigl|\widehat{S_\delta(T)\psi_1(x,0)}(n)\bigr|^2\Bigr)^{1/2} > |\langle S_\delta(T)\psi_1(x,0),\psi_2(x,T)\rangle| \]
\[ > \frac13\delta T - O(\delta^2T^2) - |\langle S_0(T)\psi_1(x,0),\psi_2(x,T)\rangle| > \frac14\delta T > c. \tag{5.47} \]
The original potential $V_0$ is then replaced by the potential $V = V_0 + \delta V_1$, which is periodic in $x$ and $t$ (with the same frequency, up to the $\rho$-perturbation described above). Moreover $V$ is given by a trigonometric polynomial which is uniformly bounded in weighted Fourier space
\[ |||V||| = \sum_{n,k}|\hat V(n,k)|\,e^{[\log(|k|+|n|)]^{3/2}} \tag{5.48} \]
as a consequence of (5.45), (5.46). The flow map $S(t) = S_\delta(t)$ satisfies by (5.27), (5.47),
\[ \|P_NS(T)f\|_2 > c \tag{5.49} \]
for
\[ N = \exp\Bigl(\log\frac1\delta\Bigr)^{1/2} \tag{5.50} \]
and some $f$ satisfying
\[ \operatorname{supp}\hat f \subset B\Bigl(0,\Bigl(\log\frac1\delta\Bigr)^C\Bigr). \tag{5.51} \]
Denote by $g$ the normalization of $f$ in the norm
\[ |||g||| = \sum_{n\in\mathbb{Z}}e^{(\log|n|)^{3/2}}|\hat g(n)|. \tag{5.52} \]
One gets then from (5.49), (5.50),
\[ |||g||| = 1, \quad \|S(T)g\|_{H^s} > cN^se^{-C(\log\log\frac1\delta)^{3/2}} > e^{\frac{s}{2}(\log\frac1\delta)^{1/2}} \tag{5.53} \]
for all $s>0$. It is clear that an iteration of the preceding construction permits us to construct a space and time periodic potential $V$, initial data $\{\phi_j\}$ and a rapidly increasing time sequence $\{T_j\}$, such that
\[ |||V||| < 1, \quad |||\phi_j||| < 1 \tag{5.54} \]
and
\[ \lim_{j\to\infty}\|S(T_j)\phi_j\|_{H^s} = \infty \quad\text{for all } s>0, \tag{5.55} \]
where $S(t)$ denotes the flow map of
\[ iu_t - u_{xx} + V(x,t)u = 0. \tag{5.56} \]
From this, one may easily get $\phi$, $|||\phi||| < 1$, for which $\{S(t)\phi \mid t\in\mathbb{R}\}$ is not bounded in $H^s$, for any $s>0$.

Next we sketch the construction of a time almost-periodic $V$ for which $|||V||| < 1$ and
\[ \{S(t)f \mid t\in\mathbb{R}\} \text{ is not relatively compact in } L^2, \tag{5.57} \]
where $f$ is a fixed datum (take $f = 1$ for instance).
The procedure is similar to the one described above. Assume $V_0$ a trigonometric polynomial which is periodic in $x$ and quasi-periodic in $t$ with diophantine frequency vector $\lambda\in\mathbb{R}^b$ (the value of $b$ will increase along the construction, as well as the constants involved when expressing the diophantine estimate on $\lambda$). Write (5.24). Letting $T$ be sufficiently large, consider $(E_2,\psi_2)$ satisfying (5.26), (5.28), (5.29) (as mentioned, the constant $C$ depends on $V_0$ and $\lambda$ and will increase along the construction). One has, cf. (5.31)–(5.33),
\[ \Bigl\langle\psi_2(x,t), \int_0^t S_0(t)S_0(\tau)^{-1}[V_1(x,\tau)S_0(\tau)f]\,d\tau\Bigr\rangle = \int_0^t\int_{\mathbb{T}}e^{iE_2\tau}\psi_2(x,\tau)\,S_0(\tau)f\,V_1(x,\tau)\,dx\,d\tau + O(e^{-(\log T)^2}T). \tag{5.58} \]
Furthermore
\[ S_0(\tau)f = \sum_\alpha\langle f,\psi_\alpha(x,0)\rangle e^{iE_\alpha\tau}\psi_\alpha(x,\tau) + O\bigl(e^{-(\log T)^2}\bigr), \tag{5.59} \]
where $(E_\alpha,\psi_\alpha)$ fulfill (5.25), (5.27), (5.29). Substituting (5.59) in (5.58) yields thus
\[ \int_0^t\int_{\mathbb{T}}\Bigl[\sum_\alpha e^{i(E_2-E_\alpha)\tau}\psi_2(x,\tau)\psi_\alpha(x,\tau)\langle f,\psi_\alpha(x,0)\rangle\Bigr]\cdot V_1(x,\tau)\,dx\,d\tau + O\bigl(e^{-(\log T)^2}T\bigr). \tag{5.60} \]
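Observe that in (5.60), $\psi_2$ and $\psi_\alpha$ have $\lambda$-time-frequency and the new frequencies introduced are a finite set
\[ \mu_\alpha = E_2 - E_\alpha \tag{5.61} \]
(of size at most $(\log T)^C$). Take either
\[ V_1(x,\tau) = \operatorname{Re}\sum_\alpha e^{i\mu'_\alpha\tau}\psi_2(x,\tau)\psi_\alpha(x,\tau)\langle f,\psi_\alpha(x,0)\rangle \tag{5.62} \]
or
\[ V_1(x,\tau) = \operatorname{Im}[\dots]. \tag{5.63} \]
We let here
\[ |\mu'_\alpha - \mu_\alpha| < e^{-(\log T)^2} \tag{5.64} \]
so that $\lambda' = (\lambda,\mu')$ still satisfies some diophantine estimate. Choosing $V_1$ as (5.62) or (5.63) appropriately, (5.60) gives again a lower estimate for $t = T$,
\[ \int_0^T\int_{\mathbb{T}}\Bigl|\sum_\alpha e^{iE_\alpha\tau}\psi_\alpha(x,\tau)\langle f,\psi_\alpha(x,0)\rangle\Bigr|^2|\psi_2(x,\tau)|^2\,dx\,d\tau + O\bigl(e^{-(\log T)^2}T\bigr). \tag{5.65} \]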
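Repeating the argument (5.36)–(5.41) yields then
\[ (5.65) = \int_0^T\Bigl[\int_{\mathbb{T}}\Bigl|\sum_\alpha e^{iE_\alpha\tau}\psi_\alpha(x,\tau)\langle f,\psi_\alpha(x,0)\rangle\Bigr|^2dx\Bigr]\times\Bigl[\int_{\mathbb{T}}|\psi_2(x,\tau)|^2dx\Bigr]d\tau + o(1)T \tag{5.66} \]
\[ = \int_0^T\|S_0(\tau)f\|_2^2\,\|S_0(\tau)\psi_2(x,0)\|_2^2\,d\tau + o(1)T \quad\text{(by (5.59), (5.29))} \tag{5.67} \]
\[ = T\|f\|_2^2\,\|\psi_2(x,0)\|_2^2 + o(1)T \tag{5.68} \]
\[ > cT. \tag{5.69} \]
Take
\[ \delta\sim\frac1T. \tag{5.70} \]
Thus, again (5.24) yields that for $N = \frac12\exp(\log T)^{1/2}$,
\[ \|P_N[S_\delta(T)f]\|_2 > |\langle S_\delta(T)f,\psi_2(x,T)\rangle| > c\delta T - O(\delta^2T^2) - |\langle S_0(T)f,\psi_2(x,T)\rangle| > c. \tag{5.71} \]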
We then replace the potential $V_0$ by
\[ V = V_0 + \delta V_1 \tag{5.72} \]
which is given by a trigonometric polynomial periodic in $x$ and quasi-periodic in $t$ with frequency vector $\lambda'\in\mathbb{R}^{b'}$, $b' < b + (\log T)^C$, which, by construction, still satisfies a diophantine condition. Again $V_1$ satisfies (5.45) and hence, since $\delta\sim\frac1T$, there is a uniform bound on $|||V|||$ given by (5.48). Iterating the procedure yields an almost periodic potential $V$, and rapidly increasing sequences $\{N_j\}$, $\{T_j\}$ such that for all $j$,
\[ \|P_{N_j}S(T_j)f\|_2 > c, \tag{5.73} \]
where $c>0$ is a fixed constant.

II. 2D Schrödinger Equation with Small Quasi-Periodic Potential

Consider next the case of the 2D linear Schrödinger equation with quasi-periodic potential
\[ i\dot u - \Delta u + V(x,t)u = 0, \tag{1.1} \]
where $V$ is 2D-periodic in $x$, quasi-periodic in $t$ with diophantine frequency vector $\lambda = (\lambda_1,\dots,\lambda_d)$ as in (I). We will assume moreover that $V$ is small. Our aim is to prove the analogue of Theorem 1 in this context. Thus

Theorem 2. Under the preceding hypothesis on $V = V(x,t)$, one has the growth estimate $\|u(t)\|_{H^s} < (\log t)^{Cs}$ for $t\to\infty$, assuming $u(0)\in H^s(\mathbb{T}^2)$, $s>0$.
The argument is based on similar considerations. However, due to the more serious small divisor issues, it will be technically more involved.

1. As in the 1D case, we may moreover assume
\[ \hat V(0) = \hat V(0)(t) = 0. \tag{1.2} \]
Following the same argument as in the 1D-case, we need to consider the linear operator
\[ T = D + S_V, \quad D = \text{diagonal with } D(n,k) = |n|^2 - k.\lambda, \tag{1.3} \]
restricted to the box
\[ \Omega = \{(n,k)\in\mathbb{Z}^{2+b} \mid |n| < N,\ |k| < K\}. \tag{1.4} \]
The main complication compared with 1D is the structure of singular sites of $T_\Omega - E$ when $E$ is in Case 2, i.e. $|E| > \frac12N_0^2$. This also turns out to be the reason for the smallness assumption on $V$. Clearly, by (1.3), if $(n,k)$ is a singular site, i.e.
\[ |n^2 - k\lambda - E| < 1, \tag{1.5} \]
then
\[ |n^2 - E| < CK. \tag{1.6} \]
Assume
\[ |n_0| > \frac12N_0 \quad\text{and}\quad |n_0^2 - E| < CK \tag{1.7} \]
and restrict $n\in\mathbb{Z}^2$ to a $B$-neighborhood of $n_0$. If $n$ satisfies (1.6), writing $n^2 = n_0^2 + 2n_0.\Delta n + |\Delta n|^2$ it follows that
\[ |n_0.\Delta n| < CK + B^2. \tag{1.8} \]
Observe again that for $n_\alpha\in\mathbb{Z}^2$ ($\alpha = 1,2$) and $\Delta_\alpha n \equiv n_\alpha - n_0$ linearly independent, one has
\[ \det[\Delta_1n,\Delta_2n]\in\mathbb{Z}\setminus\{0\}. \tag{1.9} \]
It follows therefore from (1.8), (1.9) that, if we assume
\[ C(K+B^2)B < N_0, \tag{1.10} \]
then, for fixed $n_0$ satisfying (1.7),
\[ \dim[\Delta n \mid |\Delta n| < B \text{ and } n \text{ satisfying (1.6)}] \le 1. \tag{1.11} \]
Thus within a $B$-neighborhood of $n_0$, the $n$-projections of singular sites lie on a line. We will first prove a result concerning these individual structures (see Claim (2.1) below). Rewrite for $n = n_0 + mv$ ($m\in\mathbb{Z}$, $v\in\mathbb{Z}^2\setminus\{0\}$, $|v| < B$) the diagonal
\[ D(n,k) - E = n^2 - k\lambda - E \tag{1.12} \]
\[ = m^2|v|^2 + 2m\,n_0.v - (k-k_0).\lambda + (n_0^2 - k_0.\lambda - E) \]
\[ = |v|^2\Bigl(m + \frac{n_0.v}{|v|^2}\Bigr)^2 - (k-k_0).\lambda - \Bigl(\frac{n_0.v}{|v|}\Bigr)^2 + (n_0^2 - k_0\lambda - E). \tag{1.13} \]
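The two facts used here, the collinearity statement (1.11) and the completed square (1.13), can be checked on a small example. The sketch below is only an illustration: the point `n0 = (1200, 600)`, the radius `B = 5` and the threshold `bound = 35` (standing in for $CK + B^2$ in (1.8)) are assumed values chosen so that $2B\cdot\text{bound} < |n_0|$, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def near_singular_displacements(n0, B, bound):
    """Nonzero integer displacements dn with |dn| < B and |n0.dn| < bound, cf. (1.8)."""
    found = []
    for dn in product(range(-B, B + 1), repeat=2):
        if dn != (0, 0) and dot(dn, dn) < B * B and abs(dot(n0, dn)) < bound:
            found.append(dn)
    return found

def all_collinear(vectors):
    """True if every pair has zero determinant, i.e. the span has dimension <= 1."""
    return all(u[0] * w[1] - u[1] * w[0] == 0 for u in vectors for w in vectors)

def diagonal_direct(n0, v, m, k_lam, E):
    """D(n, k) - E = |n|^2 - k.lambda - E at n = n0 + m v, as in (1.12)."""
    n = (n0[0] + m * v[0], n0[1] + m * v[1])
    return dot(n, n) - k_lam - E

def diagonal_completed(n0, v, m, k_lam, k0_lam, E):
    """Right-hand side of (1.13), with a = n0.v / |v|^2."""
    v2 = dot(v, v)
    a = Fraction(dot(n0, v), v2)
    return (v2 * (m + a) ** 2 - (k_lam - k0_lam)
            - Fraction(dot(n0, v) ** 2, v2)
            + (dot(n0, n0) - k0_lam - E))

n0, B, bound = (1200, 600), 5, 35
sites = near_singular_displacements(n0, B, bound)
```

In this example the admissible displacements are the multiples of $(1,-2)$, so they span a line, and the two expressions for the diagonal agree exactly for every $m$ (rational arithmetic is used to avoid rounding).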
Hence, on a neighborhood of $n_0$, there is essentially a 1-dimensional reduction with respect to the space mode $n$, and we need to describe singular sites of restrictions of a $\mathbb{Z}\times\mathbb{Z}^b$-matrix
\[ T = D + S_V, \tag{1.14} \]
where $D$ has the form
\[ D(m,k) = |v|^2(m+a)^2 - k.\lambda - \gamma \tag{1.15} \]
(for some $v\in\mathbb{Z}^2\setminus\{0\}$, $\gamma = \gamma(n_0,k_0,v,E)$ and $a = a(n_0,v)\in\mathbb{R}$, where we may assume $|a| < 1$) and $S_V$ is stationary,
\[ S_V(z,z') = 0 \text{ for } |z-z'| > C \quad\text{and}\quad \|S_V\| < \varepsilon. \tag{1.16} \]

2. In this subsection, we will discuss the inverse of the matrix (1.14). We will call here a size $M$-box in $\mathbb{Z}\times\mathbb{Z}^b$ any product of intervals of size $\sim M$. We prove the following statement by an inductive multi-scale argument.

Claim 2.1. Let $Q = [-M,M]\times Q_1$ be an $M$-box in $\mathbb{Z}\times\mathbb{Z}^b$ and $T_Q$ the corresponding restriction. Assume
\[ \|T_{Q'}^{-1}\| < \zeta < e^{M^{c_1}} \tag{2.2} \]
for any $M'$-box $Q'\subset Q$ with
\[ 1\le M' < M^{c_2}. \tag{2.3} \]
Then
\[ \|T_Q^{-1}\| < M^{C_3}\zeta^2 \quad\text{and}\quad |T_Q^{-1}(z,z')| < e^{-|z-z'|^{1/2}} \text{ for } |z-z'| > M^{2c_2}. \tag{2.4} \]
Here $0 < c_1 < c_2 < \frac1{10}$ and $C_3 > 1$ are constants to be specified later.
Remark. The claim remains valid if $Q$ is replaced by an arbitrary $M$-box. This will be clear from the proof.

Assume first $M$ bounded (depending on $\varepsilon$). Letting $M' = 1$ in (2.2), we have that
\[ |D(m,k)| > \frac1\zeta > e^{-M^{c_1}} > e^{-M} > \delta\equiv\varepsilon^{1/10} \tag{2.5} \]
for $(m,k)\in Q$. Recall (1.16). Using the Neumann series
\[ T_Q^{-1} = D^{-1}(I + S_VD^{-1})^{-1} = D^{-1}\sum_{j=0}^\infty(-1)^j(S_VD^{-1})^j, \tag{2.6} \]
where
\[ \|S_VD^{-1}\| < \varepsilon\delta^{-1} < \varepsilon^{9/10}, \tag{2.7} \]
it follows that
\[ \|T_Q^{-1}\| < 1 + \|D^{-1}\| < 1 + \zeta \quad\text{and}\quad |T_Q^{-1}(z,z')| \lesssim \delta^{-2}\varepsilon < \varepsilon^{1/2} \text{ for } z\ne z', \tag{2.8} \]
and thus (2.4).
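The Neumann series (2.6) can be illustrated on a small matrix. The sketch below is a toy instance under stated assumptions: a $4\times4$ diagonal `diag` with entries of modulus at least `delta = 2`, and a nearest-neighbour off-diagonal part of size `eps = 0.05` standing in for $S_V$; none of these values come from the text.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def neumann_inverse(diag, S, terms=40):
    """Approximate (D + S)^{-1} by D^{-1} * sum_{j < terms} (-1)^j (S D^{-1})^j, cf. (2.6)."""
    n = len(diag)
    D_inv = [[1.0 / diag[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    SD = matmul(S, D_inv)
    power = [[float(i == j) for j in range(n)] for i in range(n)]  # (S D^{-1})^0
    total = [[0.0] * n for _ in range(n)]
    sign = 1.0
    for _ in range(terms):
        for i in range(n):
            for j in range(n):
                total[i][j] += sign * power[i][j]
        power = matmul(power, SD)
        sign = -sign
    return matmul(D_inv, total)

# Toy data (assumptions): |D(z)| >= delta, finite-range off-diagonal part of size eps.
eps, delta = 0.05, 2.0
diag = [2.0, -3.0, 5.0, -4.0]
S = [[eps if abs(i - j) == 1 else 0.0 for j in range(4)] for i in range(4)]
T = [[S[i][j] + (diag[i] if i == j else 0.0) for j in range(4)] for i in range(4)]
T_inv = neumann_inverse(diag, S)
residual = matmul(T, T_inv)  # should be close to the identity
```

In this example the off-diagonal entries of the computed inverse stay below $\varepsilon\delta^{-2}$, matching the form of the bound in (2.8).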
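Next we describe the inductive step. Consider first the region
\[ \Lambda_1 = \{(m,k)\in Q \mid |m| > \tfrac12M^{c_4}\}, \tag{2.9} \]
where $0 < c_4 < 1$ is to be specified. Considering subregions
\[ \Lambda_{1,\alpha} = \{(m,k)\in\Lambda_1 \mid k\in B_\alpha\} \tag{2.10} \]
with $B_\alpha$ an $M^{c_5}$-box in $\mathbb{Z}^b$, $c_5 < \frac12c_4$, it is clear that there is at most one pair of values $m_\alpha^\pm$, $|m_\alpha^+ + m_\alpha^-| < 2$, so that if $m\ne m_\alpha^\pm$ and $k\in B_\alpha$, then
\[ \bigl||v|^2(m+a)^2 - k.\lambda - \gamma\bigr| > \frac12M^{c_4} - CM^{c_5} - \frac15M^{c_4} > \frac15M^{c_4}. \tag{2.11} \]
Since $\hat V(0,k) = 0$, the restrictions of $T$ and $T_{\Lambda_{1,\alpha}}$ to $\{(m,k)\in\Lambda_{1,\alpha} \mid m = m_\alpha^+\}$ and $\{(m,k)\in\Lambda_{1,\alpha} \mid m = m_\alpha^-\}$ are again diagonal. Assuming
\[ |k.\lambda| > \frac1{C_0}|k|^{-C_0} \quad\text{for } k\in\mathbb{Z}^b\setminus\{0\}, \tag{2.12} \]
it follows that if
\[ C_0c_5 < \frac1{10}c_4, \tag{2.13} \]
there is at most one value $k_\alpha^\pm\in B_\alpha$ such that
\[ |D(m_\alpha^\pm,k_\alpha^\pm)| = \bigl||v|^2(m_\alpha^\pm + a)^2 - k_\alpha^\pm\cdot\lambda - \gamma\bigr| < M^{-c_4/10}. \tag{2.14} \]
Hence, if we denote
\[ \Lambda'_{1,\alpha} = \Lambda_{1,\alpha}\setminus\{(m_\alpha^+,k_\alpha^+),(m_\alpha^-,k_\alpha^-)\}, \tag{2.15} \]
$T^{-1}_{\Lambda'_{1,\alpha}}$ will be well-controlled using the argument from I, Subsect. 2. The same is also true for $T^{-1}_{\Lambda'_1}$, where
\[ \Lambda'_1 = \bigcup\Lambda'_{1,\alpha} = \Lambda_1\setminus\Lambda''_1 \tag{2.16} \]
with
\[ \Lambda''_1 = \{(m,k)\in\Lambda_1 \mid \bigl||v|^2(m+a)^2 - k.\lambda - \gamma\bigr| < M^{-c_4/10}\}. \tag{2.17} \]
Moreover, from the preceding, the elements of $\Lambda''_1$ are at least $M^{c_5}$-separated. Thus
\[ \|T^{-1}_{\Lambda'_1}\| \lesssim M^{c_4/10} \quad\text{and}\quad |T^{-1}_{\Lambda'_1}(z,z')| < e^{-|z-z'|} \text{ for } z\ne z'. \tag{2.18} \]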
To treat the entire region $\Lambda_1$, consider for each $x\in\Lambda''_1$ a neighborhood $Q_x\subset\Lambda_1$ of $x$ which is a $\frac1{10}M^{c_5}$-box. Thus
\[ Q_x\cap Q_{x'} = \emptyset \quad\text{for } x\ne x'\in\Lambda''_1. \tag{2.19} \]
Assume
\[ c_1 < c_5\le c_2 \tag{2.20} \]
so that assumption (2.2) implies that
\[ \|T^{-1}_{Q_x}\| < \zeta \quad\text{for } x\in\Lambda''_1. \tag{2.21} \]
Writing
\[ \Lambda_1 = \Lambda'_1\cup\bigcup_{x\in\Lambda''_1}Q_x, \tag{2.22} \]
(2.18), (2.21) and another application of the resolvent identity yield that
\[ \|T^{-1}_{\Lambda_1}\| \lesssim \zeta M^{c_4/10} \tag{2.23} \]
and
\[ |T^{-1}_{\Lambda_1}(z,z')| < e^{-|z-z'|^{1/2}} \quad\text{for } |z-z'| > M^{2c_5}. \tag{2.24} \]
Consider next the region
\[ \Lambda_2 = \{(m,k)\in Q \mid |m| < M^{c_4}\} \tag{2.25} \]
obtained as a union of $M^{c_4}$-boxes
\[ Q' = \{(m,k)\in Q \mid |m| < M^{c_4},\ k\in Q'_1\}, \tag{2.26} \]
where $Q'_1$ is an $M^{c_4}$-box in $Q_1$. Assume $T^{-1}_{Q'}$ does not satisfy
\[ \|T^{-1}_{Q'}\| < M^{c_4C_3+4C_0} \quad\text{and}\quad |T^{-1}_{Q'}(z,z')| < e^{-|z-z'|^{1/2}} \text{ for } |z-z'| > M^{2c_2c_4} \tag{2.27} \]
(cf. (2.4)). Since $M^{c_4}$ is on a lower size scale, application of the claim (by induction) with
\[ \zeta = M^{2C_0} \tag{2.28} \]
yields then some $M''$-box $Q''\subset Q'$ such that
\[ M'' < M^{c_4c_2} \tag{2.29} \]
and
\[ \|T^{-1}_{Q''}\| > M^{2C_0}. \tag{2.30} \]
Recall (1.14), (1.15). Observe that, by translation in the $k$-coordinate and stationarity of the off-diagonal part, if
\[ Q'' = (0,k_{Q''}) + (I''\times Q''_0), \tag{2.31} \]
where $I'' = \operatorname{Proj}_m(Q'')$ is an $M''$-box and $Q''_0 = [-M'',M'']^b$, then
\[ T_{Q''} = T_{I''\times Q''_0} - (k_{Q''}\cdot\lambda)\mathbb{1}_{I''\times Q''_0}. \tag{2.32} \]
Statement (2.30) then implies by (2.31) that
\[ |E - k_{Q''}\cdot\lambda| < M^{-2C_0} \quad\text{for some eigenvalue } E \text{ of } T_{I''\times Q''_0}. \tag{2.33} \]
Observe next that the number of $E$-values may be bounded by
\[ M^{c_2c_4}M^{c_4}M^{c_2c_4(b+1)} < M^{(b+3)c_4} \tag{2.34} \]
considering the various matrices obtained by restriction to all possible $(I''\times Q''_0)$-boxes. Since (2.12) holds, there can for fixed $E$ be at most one $k$-value in $Q_1$ satisfying (2.33). This yields thus a set $\mathcal{C}\subset Q_1$ of at most $M^{(b+3)c_4}$ elements. A straightforward construction then allows one to cover an $M^{2c_4}$-neighborhood of $\mathcal{C}$ by $\tilde M_\alpha$-boxes $\tilde Q_{1,\alpha}\subset Q_1$ such that
\[ M^{4c_4} < \tilde M_\alpha < M^{3(b+5)c_4} \tag{2.35} \]
and
\[ \operatorname{dist}(\tilde Q_{1,\alpha},\tilde Q_{1,\beta}) > M^{4c_4} \quad\text{for } \alpha\ne\beta. \tag{2.36} \]
Define for each $\alpha$,
\[ \tilde Q_\alpha = [-\tilde M_\alpha,\tilde M_\alpha]\times\tilde Q_{1,\alpha}. \tag{2.37} \]
Since by construction each $Q'$ violating (2.27) is contained in an $M^{2c_4}$-neighborhood of some element of $\mathcal{C}$, we get that
\[ \bigcup_{Q' \text{ fails (2.27)}}Q' \subset \bigcup\tilde Q_\alpha. \tag{2.38} \]
Write
\[ Q = \Lambda_1\cup\bigcup_{Q'\ M^{c_4}\text{-box satisfying (2.27)}}Q'\ \cup\ \bigcup\tilde Q_\alpha, \tag{2.39} \]
where the last union in (2.39) will play the role of "singular islands". Denoting by $\tilde{\tilde Q}_\alpha\subset Q$ an $M^{c_4}$-neighborhood of $\tilde Q_\alpha$ in $Q$, one has by (2.36)
\[ \operatorname{dist}(\tilde{\tilde Q}_\alpha,\tilde{\tilde Q}_\beta) > M^{2c_4} \tag{2.40} \]
and, by (2.35), $\tilde{\tilde Q}_\alpha$ is an $\tilde{\tilde M}_\alpha$-box with
\[ \tilde{\tilde M}_\alpha < M^{3(b+6)c_4}. \tag{2.41} \]
In order to apply (2.2), (2.3), we require
\[ 3(b+6)c_4\le c_2. \tag{2.42} \]
Then, for each $\alpha$,
\[ \|T^{-1}_{\tilde{\tilde Q}_\alpha}\| < \zeta. \tag{2.43} \]
Recall also properties (2.23)–(2.24) for $\Lambda_1$ and (2.27) for the $Q'$ in (2.39). Given a point $z = (m,k)\in Q$, either $|m| > \frac34M^{c_4}$ and $B(z,\frac14M^{c_4})\subset\Lambda_1$ by (2.9), or $|m| < \frac34M^{c_4}$, in which case $B(z,\frac14M^{c_4})$ is clearly contained in a box $Q'$ as described in (2.26). Assuming $z\notin\bigcup\tilde Q_\alpha$, it follows from (2.38) that $Q'$ satisfies (2.27). Based on (2.39), the preceding comment, (2.23)–(2.24), (2.27), (2.43) and the usual considerations based on the resolvent identity, it follows that
\[ \|T_Q^{-1}\| \lesssim (\zeta M^{c_4/10} + M^{c_4C_3+4C_0})\zeta, \tag{2.44} \]
provided
\[ c_5 < \frac1{10}c_4, \quad c_2 < \frac1{10}, \tag{2.45} \]
\[ \log\zeta \ll M^{c_4/2}, \tag{2.46} \]
i.e. by (2.2)
\[ c_1 < c_4/4, \tag{2.47} \]
and also, by (2.41), (2.42),
\[ |T_Q^{-1}(z,z')| \le e^{-|z-z'|^{1/2}} \quad\text{for } |z-z'| > M^{2c_2}. \tag{2.48} \]
This yields (2.4), provided
\[ c_4 < \frac12, \quad C_3 > 8C_0 + 1. \tag{2.49} \]
Recalling the different conditions on the exponents,
\[ \begin{array}{ll} 0 < c_1 < c_2 < \frac1{10} & \\ C_0c_5 < \frac1{10}c_4 & \text{cf. (2.13)} \\ c_1 < c_5\le c_2 & \text{cf. (2.20)} \\ 3(b+6)c_4 < c_2 & \text{cf. (2.42)} \\ c_5 < \frac{c_4}{10},\ c_2 < \frac1{10} & \text{cf. (2.45)} \\ c_1 < c_4/4 & \text{cf. (2.47)} \\ c_4 < \frac12,\ C_3 > 8C_0+1 & \text{cf. (2.49)} \end{array} \]
shows that they are clearly compatible. This concludes the proof of Claim (2.1).

In order to proceed further, we will also need the following comment concerning the operator $T$ given by (1.14), (1.15) to which the claim applies. Assume we consider a "perturbed" $T$ of the form
\[ T = D + S_V + P \tag{2.50} \]
with $D$, $S_V$ as above and $P$ a (not necessarily translation invariant) perturbation satisfying
\[ \|P\| < M^{-C_6} \tag{2.51} \]
and exponential off-diagonal decay for a sufficiently large constant $C_6$ ($C_6 = 10C_0$ in the preceding setup will do). Then the claim still holds. We verify this by briefly reviewing the argument. Consider first the region $\Lambda_1$. Statement (2.18) regarding $T^{-1}_{\Lambda'_1}$ is clearly independent of the perturbation $P$, because of (2.51). The assumption (2.21) for the perturbed operator (2.50) will then again imply (2.23), (2.24) for $T^{-1}_{\Lambda_1}$. Consider next the region $\Lambda_2$. Again by (2.51), (2.30) is essentially independent of the perturbation $P$. Thus we may ignore $P$ in the considerations, involving in particular (1.15) and the stationarity of $S_V$, leading eventually to the set $\mathcal{C}$ of $k$-sites in $Q_1$. Also (2.39) is independent of $P$. The assumption (2.43), on the other hand, again refers to the perturbed operator (2.50), and the proof is completed the same way.

3. We are now returning to Subsect. 1 and the initial construction described there. Fix $n_0\in\mathbb{Z}^2$, $|n_0| > \frac12N_0$ and an $M$-box $Q_1$ in $\mathbb{Z}^b$. Fix $R < M$. If for all $n\in B(n_0,R)$,
\[ |n^2 - E| > CK, \tag{3.1} \]
hence
\[ |n^2 - k\lambda - E| > K, \tag{3.2} \]
then for
\[ \Lambda = B(n_0,R)\times Q_1, \tag{3.3} \]
$T_\Lambda - E = (D - E) + S_V|_\Lambda$ is clearly invertible by a Neumann series. Assume next there is $n_1\in B(n_0,R)$ such that
\[ |n_1^2 - E| < CK. \tag{3.4} \]
Then $B(n_0,R)\subset B(n_1,2R)$. We distinguish the following two cases.

Case A. For all $n\in B(n_1,2R)$, $n\ne n_1$,
\[ |n^2 - E| > CK. \tag{3.5} \]
Replace $\Lambda$ given by (3.3) by
\[ \Lambda = B(n_1,2R)\times Q_1. \tag{3.6} \]
For $n\in B(n_1,2R)$, $n\ne n_1$, one has by (3.5) that
\[ |n^2 - k\lambda - E| > K. \tag{3.7} \]
By (2.12), assuming
\[ M^{C_0} < K^{1/10}, \tag{3.8} \]
there is for $n = n_1$ at most one $k_1\in Q_1$ such that
\[ |n_1^2 - k_1\cdot\lambda - E| < \frac1{K^{1/10}}. \tag{3.9} \]
Defining
\[ \Lambda' = \Lambda\setminus\{(n_1,k_1)\}, \tag{3.10} \]
we have thus that
\[ \Lambda' = \Lambda_1\cup\Lambda_2, \tag{3.11} \]
\[ \Lambda_1 = (B(n_1,2R)\setminus\{n_1\})\times Q_1, \quad \Lambda_2 = \{(n_1,k) \mid k\in Q_1\setminus\{k_1\}\}, \tag{3.12} \]
where, by (3.7),
\[ \|(T_{\Lambda_1} - E)^{-1}\| < \frac1K, \tag{3.13} \]
and, since $\hat V(0,k) = 0$,
\[ T_{\Lambda_2} - E = D_{\Lambda_2} - E, \tag{3.14} \]
\[ \|(D_{\Lambda_2} - E)^{-1}\| < K^{1/10}. \tag{3.15} \]
Hence
\[ \|(T_{\Lambda'} - E)^{-1}\| < K^{1/10}, \tag{3.16} \]
\[ |(T_{\Lambda'} - E)^{-1}(z,z')| < e^{-|z-z'|} \quad\text{for } z\ne z'. \tag{3.17} \]
The only singular site for $T_\Lambda - E$ being $(n_1,k_1)$, one may thus control $(T_\Lambda - E)^{-1}$ from an estimate
\[ \|(T_{\Lambda''} - E)^{-1}\| < e^{R^{10c_1}}, \tag{3.18} \]
where $\Lambda''\subset\Lambda'$ is an $R^{1/10}$-box containing $\Lambda\cap B((n_1,k_1),R^{1/20})$.

Case B. There is $n\in B(n_1,2R)$, $n\ne n_1$, such that
\[ |n^2 - E| < CK. \tag{3.19} \]
Define then
\[ R_1 = R^{10} \tag{3.20} \]
and consider
\[ \Lambda = B(n_1,R_1)\times Q_1. \tag{3.21} \]
Assuming (1.10) fulfilled, i.e.
\[ C(K+R_1^2)R_1 < N_0, \tag{3.22} \]
it follows from (1.11) that
\[ \dim[\Delta n = n - n_1 \mid n\in B(n_1,R_1) \text{ satisfying (3.19)}] = 1. \tag{3.23} \]
Hence, the set of $n\in B(n_1,R_1)$ for which (3.19) holds is contained in
\[ L = \Bigl\{n_0 + mv \Bigm| 0\le m\le\frac{R_1}{|v|}\Bigr\} \tag{3.24} \]
for some $v\in\mathbb{Z}^2\cap B(0,2R)$, $v\ne0$, by the assumption in Case B. Write $\Lambda = \Lambda_1\cup\Lambda_2$ with
\[ \Lambda_1 = L\times Q_1 \tag{3.25} \]
and
\[ T_\Lambda = \begin{pmatrix} T_{\Lambda_1} & U \\ U^* & T_{\Lambda_2} \end{pmatrix}. \tag{3.26} \]
Since
\[ |n^2 - E| > CK, \quad |n^2 - k\lambda - E| > K \quad\text{for } n\in B(n_1,R_1)\setminus L, \tag{3.27} \]
it follows that
\[ \|(T_{\Lambda_2} - E)^{-1}\| < \frac1K, \tag{3.28} \]
thus the inverse of $T_\Lambda - E$ is controlled by the inverse of
\[ (T_{\Lambda_1} - E) - U(T_{\Lambda_2} - E)^{-1}U^* = T_{\Lambda_1} - E - P, \tag{3.29} \]
where, by (3.28),
\[ \|P\| < \frac1K. \tag{3.30} \]
By (1.13), the operator $T_{\Lambda_1}$ corresponds to the restriction of an operator $T$ of the form (1.14), (1.15) to
\[ Q = \Bigl[-\frac{R_1}{|v|},\frac{R_1}{|v|}\Bigr]\times Q_1. \tag{3.31} \]
Since $|v| < 2R$ and (3.20),
\[ R_1^{9/10} < \frac{R_1}{|v|} < R_1. \tag{3.32} \]
Letting
\[ M\sim R_1^{9/10} \tag{3.33} \]
and assuming (cf. (2.51), (3.30))
\[ M^{C_6} < K, \tag{3.34} \]
the discussion in Subsect. 2 and the claim are thus applicable to (3.29). Hence, letting $\zeta = e^{M^{c_1}}$ in (2.1), either
\[ \|(T_{\Lambda_1} - E - P)^{-1}\| < e^{3M^{c_1}} \quad\text{and}\quad |(T_{\Lambda_1} - E - P)^{-1}(z,z')| < e^{-|z-z'|^{1/2}} \text{ for } |z-z'| > M^{2c_2}, \tag{3.35} \]
or there is a restriction
\[ \|[(T_{\Lambda_1} - E - P)|_{Q'}]^{-1}\| > e^{M^{c_1}}, \quad\text{where } Q'\subset Q \text{ is an } M'\text{-box}, \tag{3.36} \]
\[ M' < M^{c_2}. \tag{3.37} \]
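It follows from the preceding that if (3.35) holds, then
\[ \|(T_\Lambda - E)^{-1}\| < e^{3M^{c_1}} \tag{3.38} \]
and
\[ |(T_\Lambda - E)^{-1}(z,z')| < e^{-(\frac{|z-z'|}{|v|})^{1/2}} < e^{-(\frac{|z-z'|}{2R})^{1/2}} < e^{-|z-z'|^{1/4}} \quad\text{if } |z-z'| > M^{1/4} > 2RM^{2c_2} + 4R^2. \tag{3.39} \]
Assume next (3.36) for some $Q'\subset Q$. Thus there is a subinterval $L'$ of $L$ of size $|v|.M'$ and an $M'$-box $Q'_1\subset Q_1$ such that, if we denote
\[ \Lambda'_1 = L'\times Q'_1\subset\Lambda_1, \tag{3.40} \]
then
\[ \|(T_{\Lambda'_1} - E - R_{\Lambda'_1}U(T_{\Lambda_2} - E)^{-1}U^*R_{\Lambda'_1})^{-1}\| > e^{M^{c_1}}, \tag{3.41} \]
where $R_{\Lambda'_1}$ denotes the restriction operator. Defining
\[ \Lambda' = \Lambda'_1\cup\Lambda_2, \tag{3.42} \]
we have
\[ T_{\Lambda'} = \begin{pmatrix} T_{\Lambda'_1} & R_{\Lambda'_1}U \\ U^*R_{\Lambda'_1} & T_{\Lambda_2} \end{pmatrix}. \tag{3.43} \]
Recalling (3.28), property (3.41) is clearly equivalent to
\[ \|(T_{\Lambda'} - E)^{-1}\| > e^{M^{c_1}}. \tag{3.44} \]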
Let $c_1$ be sufficiently small. Summarizing Cases (A), (B), it follows that given $n_0\in\mathbb{Z}^2$, $N > |n_0| > \frac12N_0$ and an $M$-box $Q_1$ in $\operatorname{Proj}_k(\Omega)$, there is a box $\Lambda$ such that
\[ \Omega\supset\Lambda = B_{R'}\times Q_1 \quad\text{with}\quad B_{R'}\supset B(n_0,R)\cap\operatorname{Proj}_n(\Omega), \tag{3.45} \]
where $R' < M^{10/9}$, such that
\[ \|(T_\Lambda - E)^{-1}\| < e^{3M^{c_1}} \tag{3.46} \]
and
\[ |(T_\Lambda - E)^{-1}(z,z')| < e^{-|z-z'|^{1/4}} \quad\text{for } |z-z'| > (R')^{1/4}, \tag{3.47} \]
unless (3.44) holds for some region $\Lambda'\subset\Lambda$ which is either a box in $\Omega$ or of the form (3.42). In particular, the number of such regions $\Lambda'\subset\Omega$ is easily bounded by
\[ K^{2b}\times N^2.M^7 < N^3. \tag{3.48} \]
Denote by $k_1$ the center of $Q_1$ and represent the spectrum of $T_{\Lambda'} + (k_1\cdot\lambda)\mathbb{1}_{\Lambda'}$ by a family of smoothly differentiable functions $\{\sigma_\alpha(\lambda)\}$ of $\lambda$, which, by first order variation, satisfy
\[ |\partial_\lambda\sigma_\alpha| < M, \tag{3.49} \]
since for $(n,k)\in\Lambda'\subset\Lambda$
\[ D(n,k) + k_1\cdot\lambda = n^2 + (k_1-k).\lambda, \quad |k-k_1| < M. \tag{3.50} \]
Property (3.44) then means that for some $\alpha$,
\[ |\sigma_\alpha(\lambda) - k_1\cdot\lambda - E| < e^{-M^{c_1}} \tag{3.51} \]
holds. From (3.48) the number of functions $\sigma = \sigma(\lambda)$ to be considered is at most
\[ N^3.M^{\frac{20}{9}+b} < N^4. \tag{3.52} \]
Our aim is, by small variation of $\lambda$ (which is harmless when considering the flow of (1.1) up to time $T$ in $H^s$-space),
\[ |\lambda - \lambda'| < T^{-Cs}, \tag{3.53} \]
to ensure that if $\sigma$, $\sigma'$ are functions in the system (3.52), then the inequalities
\[ \begin{cases} |\sigma(\lambda') - k\cdot\lambda' - E| < e^{-M^{c_1}} \\ |\sigma'(\lambda') - k'\cdot\lambda' - E| < e^{-M^{c_1}} \end{cases} \tag{3.54} \]
for some $k,k'\in\mathbb{Z}^b$, $|k|,|k'| < K$ and $E\in\mathbb{R}$ may only be satisfied for
\[ |k-k'| < M^2. \tag{3.55} \]
From (3.54), one gets indeed
\[ |\sigma(\lambda') - \sigma'(\lambda') - (k-k')\lambda'| < 2e^{-M^{c_1}}. \tag{3.56} \]
Assume $|k-k'|\ge M^2$. Thus by (3.49)
\[ |\partial_{\lambda'}[\sigma(\lambda') - \sigma'(\lambda') - (k-k')\lambda']| > M^2 - 2M > \frac12M^2. \tag{3.57} \]
From the preceding, the number of inequalities (3.56) to avoid is at most
\[ N^8.K^b, \tag{3.58} \]
so that this may be achieved for a choice of $\lambda'$ satisfying (3.53), provided
\[ N^8.K^b.\frac1{M^2}e^{-M^{c_1}} < T^{-Cs}. \tag{3.59} \]
Thus
\[ M\gtrsim(\log T)^{2/c_1}, \tag{3.60} \]
\[ K\gtrsim(\log T)^{C_6/c_1} \quad\text{(by (3.34))}, \tag{3.61} \]
\[ N_0 > (\log T)^{2C_6/c_1} \quad\text{(by (3.22))}. \tag{3.62} \]
Denote
\[ \Omega_1 = \Bigl\{(n,k)\in\mathbb{Z}^{2+b} \Bigm| \frac12N_0 < |n| < N,\ |k| < \frac{K}{10}\Bigr\}, \tag{3.63} \]
\[ \Omega_2 = \Bigl\{(n,k)\in\mathbb{Z}^{2+b} \Bigm| \frac12N_0 < |n| < N,\ \frac{K}{5} < |k| < K\Bigr\}. \tag{3.64} \]
It follows that either $\Omega_1$ or $\Omega_2$ may be covered with boxes $\Lambda$ obtained in (3.45) and satisfying (3.46), (3.47), since otherwise (3.54) would hold for a pair of eigenvalue functions $\sigma$, $\sigma'$ as considered above and $k,k'\in\mathbb{Z}^b$ satisfying
\[ |k-k'| > \frac{K}{10} > M^2, \tag{3.65} \]
contradicting the choice of $\lambda'$. Thus either
\[ |(T_{\Omega_1} - E)^{-1}(z,z')| < e^{-|z-z'|^{1/2}} \quad\text{if } |z-z'| > \sqrt K \tag{3.66} \]
or
\[ |(T_{\Omega_2} - E)^{-1}(z,z')| < e^{-|z-z'|^{1/2}} \quad\text{if } |z-z'| > \sqrt K. \tag{3.67} \]
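Letting
\[ T_\Omega\xi = E\xi, \quad E > \frac{N_0^2}{2} \tag{3.68} \]
be the eigenfunction, it follows in case (3.66) that $\xi$ is essentially supported on a neighborhood of $\Omega\setminus\Omega_1$ and hence may be ignored when expanding
\[ \tilde\varphi = \sum\langle\tilde\varphi,\xi\rangle\xi \tag{3.69} \]
with
\[ \operatorname{supp}\tilde\varphi\subset\Bigl[2N_0 < |n| < \frac{N}{2}\Bigr]\times\{0\}. \tag{3.70} \]
In case (3.67), $\xi$ is essentially supported on a neighborhood of $\Omega\setminus\Omega_2$, hence in the region
\[ Q_\xi = \bigl[\,\bigl||n| - E^{1/2}\bigr| < 1\bigr]\times\Bigl[|k| < \frac{K}{2}\Bigr]. \tag{3.71} \]
Thus if $Q_\xi\cap\operatorname{supp}\tilde\varphi\ne\emptyset$, then again
\[ \check\xi(x,t) = e^{iEt}\sum_{(n,k)\in Q_\xi}\xi(n,k)e^{i(nx+k.\lambda t)} \tag{3.72} \]
will satisfy (1.1) up to error $e^{-K^{1/2}}$. The proof may then be completed as in the 1D-case.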
III. Further Comments

In this section, we consider Eq. (0.9),
\[ iu_t + Bu + B^{-1}[V(x,t)u] = 0, \quad B = \sqrt{-\Delta+\rho},\ \rho>0. \tag{1} \]
In order to prove estimates of the form (0.6), we follow the same scheme as for the Schrödinger case. Here, the Hilbert space $L^2$ is replaced by $H^{1/2}$ with norm
\[ \|\phi\| = \langle B\phi,\phi\rangle^{1/2} = \|B^{1/2}\phi\|_2, \tag{2} \]
which is the symplectic Hilbert space for (1) and is preserved under the corresponding flow map $S(t)$. The operator $T$ is given by
\[ T = D + S, \tag{3} \]
where
\[ D(n,k) = -\langle k,\lambda\rangle + \sqrt{|n|^2+\rho} \quad\text{and}\quad S = \frac1{\sqrt{|n|^2+\rho}}S_V. \tag{4} \]
Following the scheme used in the Schrödinger case described above, the main issue is the localization of eigenfunctions. We may deal with the following cases:

$D = 1$. $V$ quasi-periodic with frequency vector $\lambda = (\lambda_1,\dots,\lambda_b)$ satisfying a diophantine condition
\[ \|\lambda.k\| > \frac1C|k|^{-C} \quad\text{for } k\in\mathbb{Z}^b\setminus\{0\}. \tag{5} \]

$D > 1$. $V$ periodic in time with frequency $\lambda\in\mathbb{R}$ satisfying a condition
\[ \Bigl|\sum_{j=0}^Ja_j\lambda^j\Bigr| > \frac1{C_J}\Bigl(\max_{0\le j\le J}|a_j|\Bigr)^{-C_J} \quad\text{for all } a = (a_j)_{0\le j\le J}\in\mathbb{Z}^{J+1}\setminus\{0\} \tag{6} \]
(which is satisfied for typical $\lambda$).

Consider first the case $D = 1$. Proceeding as in Sect. I, let $T_\Omega$ be the restriction of $T$ to a region
\[ \Omega = \{(n,k)\in\mathbb{Z}^{1+b} \mid |n| < N,\ |k| < K\} \tag{7} \]
and $E$ an eigenvalue of $T_\Omega$. Letting again
\[ N_0 = K^C \tag{8} \]
with the constant $C$ depending on $\lambda$, we distinguish the cases $E\le\frac12N_0$, $E > \frac12N_0$. In the first case, singular sites may only appear for $|n| < N_0$ and we may repeat that part of the analysis, yielding localization of the eigenfunction on $|n| < 2N_0$. Assume next $E > \frac{N_0}{2}$. Let $(n_1,k_1)$ and $(n_2,k_2)$ be two singular sites, i.e.
\[ |D(n_\alpha,k_\alpha) - E| = \Bigl|-\langle k_\alpha,\lambda\rangle + \sqrt{n_\alpha^2+\rho} - E\Bigr| < \frac{C}{\sqrt{n_\alpha^2+\rho}} < \frac{C}{|n_\alpha|} \quad (\alpha = 1,2). \tag{9} \]
Hence
\[ \bigl|-\langle k_\alpha,\lambda\rangle + |n_\alpha| - E\bigr| < \frac{C}{N_0} \quad (\alpha = 1,2), \tag{10} \]
since necessarily $|n_\alpha| > |E| - CK > \frac{N_0}{3}$. From (10), we get thus
\[ \|\langle k_1-k_2,\lambda\rangle\| < \frac{C}{N_0} \tag{11} \]
(where $\|\omega\|$ denotes the distance of $\omega\in\mathbb{R}$ to the nearest integer). Since $|k_1-k_2| < 2K$, an appropriate choice of the constant $C$ in (8), depending on diophantine properties of $\lambda$, will ensure that (11) may only happen for $k_1 = k_2$, hence $|n_1| = |n_2|$. Thus the singular sites are reduced to a pair $(\pm n_0,k_0)$ and we get localization of the eigenfunctions on a $K$-neighborhood of those sites, in the sense of (I, 3.29).

Observe in the present case that, denoting by $S(t)$ the flow map corresponding to (1), one gets by the order $-1$ smoothing
\[ S(t)\varphi = e^{itB}\varphi + i\int_0^tB^{-1}V(\tau)\operatorname{Re}S(\tau)\varphi\,d\tau, \tag{12} \]
C T s+ 2 kϕkH s , ≤ T . sup kS(τ )ϕkH s ≤ C N |τ |
k(I − πN )(S(t)ϕ − e
itB
ϕ)kH s
(13)
and thus, by the choice of N T s+ 2
(14)
k(I − πN )S(t)ϕkH s ≤ (1 + CN −1/2 )kϕkH s ≤ 2kϕkH s .
(15)
k(I − πN ) S(t) − e
itB
1
kH s →H s
In particular, cf. (I, 3.34),
Also, if $u = S(t)\varphi$, hence
\[ i\dot u + Bu + B^{-1}(Vu) = 0, \]
\[ i(\pi_Nu)^{\bullet} + B(\pi_Nu) + B^{-1}(V\pi_Nu) = B^{-1}([V,\pi_N]u). \tag{16} \]
Thus again
\[ [\pi_N,S(t)]\varphi = \pi_Nu(t) - S(t)\pi_Nu(0) = i\int_0^tS(t)S(\tau)^{-1}B^{-1}([V,\pi_N]u)(\tau)\,d\tau, \tag{17} \]
from where the commutator estimate ($s > \frac32$)
\[ \|[\pi_N,S(t)]\|_{H^s\to H^s} \le C\frac{T^{3(s-\frac12)}}{N} < CN^{-1/2} \quad\text{for } |t| < T. \tag{18} \]
This permits us to conclude the argument as in the Schrödinger case.

Consider next the case $D > 1$ with $V$ periodic in $t$ with frequency $\lambda$ satisfying (6). Let $\Omega$ be as in (7) and $E > \frac12N_0$. Again from (4) ($b = 1$) the set of "singular sites" for $T_\Omega - E\mathbb{1}$ may be defined as
\[ \Omega_0 = \Bigl\{(n,k)\in\Omega \Bigm| \bigl||n| - k\lambda - E\bigr| < \frac{C}{|n|+1}\Bigr\} \tag{19} \]
(for some constant $C$ depending on $V$). Thus $(T_\Omega - E)|_{\Omega\setminus\Omega_0}$ has an inverse controlled by a Neumann series. The main ingredient next is a partitioning of $\Omega_0$ into "separated clusters" in the following sense.

Claim. Fix a large number $K_1$ ($\ll K$ and to be specified later). Then there is a partitioning of $\Omega_0$ as
\[ \Omega_0 = \bigcup_\alpha\Omega_{0,\alpha}, \tag{20} \]
where
\[ \operatorname{diam}\Omega_{0,\alpha} < K_1^{C_1} \quad\text{for each } \alpha \tag{21} \]
and
\[ \operatorname{dist}(\Omega_{0,\alpha},\Omega_{0,\beta}) > K_1 \quad\text{for } \alpha\ne\beta \tag{22} \]
(the exponent $C_1 = C(\lambda)$).

Proof of the Claim. We basically reproduce the argument from [B4] (the only difference is the extra $E$-term, which turns out to be harmless in this argument). From the definition of $\Omega_0$, it follows that
\[ \bigl||n|^2 - (E+k\lambda)^2\bigr| < C \quad\text{for } (n,k)\in\Omega_0. \tag{23} \]
To get the partition result, it clearly suffices to show that whenever a sequence of distinct elements
\[ z_1,z_2,\dots,z_R\in\Omega_0 \tag{24} \]
satisfies
\[ |z_{j-1} - z_j| < K_1, \tag{25} \]
then there is a bound on $R$,
\[ R < K_1^{C_1-1} = K_1^{C_2}. \tag{26} \]
Fix a constant $C_3$. Another subsequence extraction permits us to assume the given system $\{z_1, z_2, \ldots, z_R\}$ to satisfy the property
$$\dim[z_i - z_j \mid 1\le i,j\le R] = \dim[z_i - z_j \mid i,j\in I], \tag{27}$$
hence
$$H = [z_i - z_j \mid 1\le i,j\le R] = [z_i - z_j \mid i,j\in I] \tag{28}$$
whenever $I$ is a sufficiently long subinterval of $[1,R]$; more precisely, $I\subset[1,R]$ is an interval such that
$$|I| \ge R_1 = R^{1/C_3} \tag{29}$$
Growth of Sobolev Norms in Linear Schrödinger Equations
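(the constant $C_2$ in (26) just needs to be replaced by $C_2\cdot C_3^{D}$). Fix next $j = 1,\ldots,R-R_1$ and denote
$$\Delta_s z = z_{j+s} - z_j \quad (s = 1,\ldots,R_1). \tag{30}$$
By construction
$$H = [\Delta_s z \mid s = 1,\ldots,R_1] \tag{31}$$
and by (25)
$$|\Delta_s z| < R_1K_1. \tag{32}$$
If $z_{j+s} = (n_{j+s}, k_{j+s})$, one has by definition of $\Gamma$,
$$\begin{aligned} 2C &> \big|\,|n_{j+s}|^2 - |n_j|^2 - (E+k_{j+s}\lambda)^2 + (E+k_j\lambda)^2\,\big| \\ &= \big|\,2\langle n_j, n_{j+s}-n_j\rangle + |n_{j+s}-n_j|^2 - 2\lambda(E+k_j\lambda)(k_{j+s}-k_j) - (k_{j+s}-k_j)^2\lambda^2\,\big| \\ &= 2\,\big|\big\langle (n_j, -\lambda(E+k_j\lambda)),\, \Delta_s z\big\rangle\big| + O(|\Delta_s z|^2). \end{aligned} \tag{33}$$
Hence, by (32),
$$\big|\big\langle (n_j, -\lambda(E+k_j\lambda)),\, \Delta_s z\big\rangle\big| < (R_1K_1)^2 \quad (1\le s\le R_1). \tag{34}$$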
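Let $d = \dim H$. By (31), (34), we may find a basis $\{e_\ell \mid \ell = 1,\ldots,d\} \subset \{\Delta_s z \mid s = 1,\ldots,R_1\}$ for $H$ such that
$$|\langle v, e_\ell\rangle| < (R_1K_1)^2 \quad (1\le\ell\le d), \tag{35}$$
where we denote
$$v = v_j = P_H\big(n_j, -\lambda(E+k_j\lambda)\big). \tag{36}$$
Writing
$$v = \sum_{\ell=1}^{d} c_\ell e_\ell, \tag{37}$$
it follows from (35) that
$$|v|^2 \le \sum_{\ell=1}^{d}|c_\ell|\,|\langle v, e_\ell\rangle| < d\,(R_1K_1)^2\max_\ell|c_\ell|. \tag{38}$$
Also, from (35), for $\ell' = 1,\ldots,d$,
$$\Big|\sum_{\ell=1}^{d} c_\ell\langle e_\ell, e_{\ell'}\rangle\Big| = |\langle v, e_{\ell'}\rangle| < (R_1K_1)^2. \tag{39}$$
Observe that since $e_\ell\in\mathbb{Z}^{D+1}$ and $\{e_\ell\}_{1\le\ell\le d}$ is linearly independent,
$$\det[\langle e_\ell, e_{\ell'}\rangle]_{1\le\ell,\ell'\le d} \in \mathbb{Z}\setminus\{0\}. \tag{40}$$
Hence
$$\big|\det[\langle e_\ell, e_{\ell'}\rangle]_{1\le\ell,\ell'\le d}\big| \ge 1. \tag{41}$$
From (32),
$$|e_\ell| < R_1K_1, \qquad |\langle e_\ell, e_{\ell'}\rangle| < (R_1K_1)^2. \tag{42}$$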
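The integrality statement (40)–(41) can be checked on small examples. The following Python sketch is an illustration under stated assumptions: the vectors are hypothetical sample data and the helper names `gram_matrix` and `determinant` are introduced here. It computes the Gram determinant of integer vectors exactly over the rationals; for linearly independent integer vectors the result is a nonzero integer, hence of absolute value at least 1, while for dependent vectors it vanishes.

```python
from fractions import Fraction


def gram_matrix(vectors):
    """Matrix of inner products <e_l, e_l'> of the given vectors."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        # Find a row with a nonzero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


# Hypothetical linearly independent integer vectors in Z^3.
independent = [(1, 2, 0), (0, 1, -1)]
# Hypothetical linearly dependent integer vectors.
dependent = [(1, 2, 0), (2, 4, 0)]

g_independent = determinant(gram_matrix(independent))
g_dependent = determinant(gram_matrix(dependent))
print(g_independent, g_dependent)
```

The lower bound (41) is what makes the inversion of the linear system (39) quantitative.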
We easily derive from (39), (41), (42) that
$$\max_\ell|c_\ell| \lesssim (R_1K_1)^2(R_1K_1)^{2(d-1)} = (R_1K_1)^{2d}. \tag{43}$$
Hence, by (38),
$$\|v\| < C_D(R_1K_1)^{d+1}. \tag{44}$$
Recalling that the elements $\{z_j\}$ are distinct points in $\mathbb{Z}^{D+1}$, we may clearly find $1\le j_1, j_2\le R-R_1$ such that
$$|z_{j_1} - z_{j_2}| > C_D^{-1}R^{1/(D+1)}. \tag{45}$$
Denote by $\tau_\lambda$ the map
$$\tau_\lambda : \mathbb{R}^{D+1}\to\mathbb{R}^{D+1} : (y,u)\mapsto(y,-\lambda^2u). \tag{46}$$
Thus, by (36), (44),
$$|P_H\tau_\lambda(z_{j_1} - z_{j_2})| = |v_{j_1} - v_{j_2}| < C_D(R_1K_1)^{d+1}. \tag{47}$$
Observe that $z = z_{j_1} - z_{j_2}\in H$. Thus, using the basis $\{e_\ell\}_{\ell=1,\ldots,d}\subset\mathbb{Z}^{D+1}\cap H$ considered above, we may write
$$z = \sum_{\ell=1}^{d} c_\ell e_\ell \tag{48}$$
and for $\ell' = 1,\ldots,d$, by (42), (47),
$$\Big|\sum_{\ell} c_\ell\langle\tau_\lambda e_\ell, e_{\ell'}\rangle\Big| = |\langle\tau_\lambda z, e_{\ell'}\rangle| \le |P_H\tau_\lambda z|\,|e_{\ell'}| < C_D(R_1K_1)^{d+2}. \tag{49}$$
Observe that
$$P(\lambda) \equiv \det[\langle\tau_\lambda e_\ell, e_{\ell'}\rangle]_{1\le\ell,\ell'\le d} \tag{50}$$
is a polynomial in $\mathbb{Z}[\lambda]$ of degree at most $2d$ and coefficients bounded by $C_D(R_1K_1)^{2d}$. Also, by (46), (41) (with $i^2 = -1$),
$$\tau_i = \mathrm{Id}, \qquad |P(+i)| \ge 1, \tag{51}$$
so that the polynomial does not vanish identically. Hence, by assumption (6) on $\lambda$, we have that
$$|P(\lambda)| > C_D^{-1}(R_1K_1)^{-2d\,C(2d)}. \tag{52}$$
Thus (49), (52) gives again an estimate
$$\max_{1\le\ell\le d}|c_\ell| \le C_D(R_1K_1)^{d+2}(R_1K_1)^{2(d-1)+2d\,C(2d)}, \tag{53}$$
and by (45), (48), (53)
$$\frac{1}{C_D}R^{1/(D+1)} < |z| < C_D(R_1K_1)^{C_D}. \tag{54}$$
Recalling $R_1$ in (29), (54) implies that, choosing $C_3$ sufficiently large,
$$R < C_DK_1^{C'} \quad\text{with}\quad C' = \frac{(D+1)C_D}{1 - \frac{(D+1)C_D}{C_3}}, \tag{55}$$
which is the desired estimate (26). Observe that $C_D$ depends only on $D$ and the hypotheses (6) on $\lambda$, while the constant $C_3$ may be chosen freely. This completes the proof of (24)–(26), hence of the claim. Having constructed the clusters $\{\Gamma_\alpha\}$ satisfying (20)–(22), define
$$\Gamma'_\alpha = \tfrac{K_1}{10}\text{-neighborhood of } \Gamma_\alpha \text{ in } \Omega. \tag{56}$$
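Let $I$ denote the set of indices $\alpha$ such that
$$\|(T_{\Gamma'_\alpha} - E\,1_{\Gamma'_\alpha})^{-1}\| < e^{K_1^{1/2}} \tag{57}$$
and
$$\Omega_1 = \Omega\setminus\bigcup_{\alpha\notin I}\Gamma_\alpha, \qquad \Omega_2 = \bigcup_{\alpha\notin I}\Gamma_\alpha. \tag{58}$$
Thus $\Omega_1 = (\Omega\setminus\Gamma)\cup\bigcup_{\alpha\in I}\Gamma'_\alpha$, where $(T_{\Omega\setminus\Gamma} - E)^{-1}$ is controlled by a Neumann series and for the additional separated neighborhoods $\Gamma'_\alpha$ of the singular islands $\{\Gamma_\alpha\}_{\alpha\in I}$ (56), (57) holds. Applications of the resolvent identity permit us to control then the inverse $(T_{\Omega_1} - E)^{-1}$ also,
$$\|(T_{\Omega_1} - E)^{-1}\| \lesssim e^{K_1^{1/2}} \tag{59}$$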
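and off-diagonal decay estimates
$$|(T_{\Omega_1} - E)^{-1}(z,z')| < e^{-c|z-z'|} \quad\text{for } |z-z'| > K_1^{C_1+1} \tag{60}$$
(recall (21)). Consider next the distribution of the remaining $\{\Gamma_\alpha\}_{\alpha\notin I}$. Choose $z_\alpha = (n_\alpha,k_\alpha)\in\Gamma_\alpha$ such that
$$\Gamma_\alpha\subset B(z_\alpha, K_1^{C_1}). \tag{61}$$
Assume $\alpha,\beta\notin I$ such that
$$|k_\alpha| < K_1^{C_1+2}, \tag{62}$$
$$|k_\beta| > K_1^{C_1+3}. \tag{63}$$
Recall that by (57),
$$\|(T_{\Gamma'_\alpha} - E)^{-1}\| > e^{K_1^{1/2}}, \tag{64}$$
$$\|(T_{\Gamma'_\beta} - E)^{-1}\| > e^{K_1^{1/2}}. \tag{65}$$
Thus there are eigenvalues $\sigma_\alpha$ (resp. $\sigma_\beta$) of $T_{\Gamma'_\alpha}$ (resp. $T_{\Gamma'_\beta}$) such that
$$|\sigma_\alpha - E| < e^{-K_1^{1/2}}, \qquad |\sigma_\beta - E| < e^{-K_1^{1/2}}. \tag{66}$$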
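Hence
$$|\sigma_\alpha - \sigma_\beta| < 2e^{-K_1^{1/2}}. \tag{67}$$
Consider $T$ and each $T_{\Gamma'_\alpha}$ ($\Gamma'_\alpha$ fixed independent of $\lambda$) as a (holomorphic) matrix function of $\lambda$ and the spectrum of $T_{\Gamma'_\alpha}$ obtained by a system $\{\sigma_\alpha(\lambda)\}$ of continuously differentiable functions of $\lambda$. Observe from (4), (61) and first order variation that
$$\sigma_\alpha(\lambda) = -k_\alpha\cdot\lambda + \tilde\sigma_\alpha(\lambda) \quad\text{with}\quad |\partial_\lambda\tilde\sigma_\alpha| < K_1^{C_1}. \tag{68}$$
We then again avoid an event (62), (63), (67) by a small perturbation of $\lambda$,
$$|\lambda' - \lambda| < T^{-Cs}. \tag{69}$$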
Some comments are in order here concerning the role of $\lambda$ and $E$ in the construction of the partition $\{\Gamma_\alpha\}$ of $\Gamma$. When defining $\Gamma$, one may clearly have $E$ (and $\lambda$) vary over an interval of sufficiently small size, say $\frac{1}{N^2}$, on which $\Gamma$ and hence $\{\Gamma_\alpha\}$ may be fixed. For fixed $\{\Gamma_\alpha\}$, one then avoids events (62), (63), (67) using the fact that
$$|\partial_\lambda[\sigma_\alpha(\lambda) - \sigma_\beta(\lambda)]| > |k_\alpha - k_\beta| - 2K_1^{C_1} > \tfrac{1}{2}K_1^{C_1+3} \tag{70}$$
(by (68)). From the preceding discussion, the number of inequalities involved may be bounded by
$$N^3(KN^D)^2(2K^{C_1})^{2(D+1)}, \tag{71}$$
which, by (69), (70), leads to the condition
$$N^3(KN^D)^2(2K^{C_1})^{2(D+1)}K_1^{-(C_1+3)}e^{-K_1^{1/2}} < T^{-Cs}. \tag{72}$$
Take as in the $D = 1$ case
$$N = T^{6s}. \tag{73}$$
Equation (72) will then be fulfilled for
$$K_1 > (\log T)^5. \tag{74}$$
We let
$$K = K_1^{C_1+10}\sim(\log T)^{5(C_1+10)}. \tag{75}$$
After replacement of $\lambda$ by $\lambda'$, we thus succeed in avoiding simultaneously (62), (63) for a pair $\alpha,\beta\notin I$. Thus either
$$|k| > \tfrac{1}{2}K_1^{C_1+2} \quad\text{for all } (n,k)\in\Omega_2 \tag{76}$$
or
$$|k| < 2K_1^{C_1+3} \quad\text{for all } (n,k)\in\Omega_2. \tag{77}$$
By (59), (60), the eigenfunction $\xi$,
$$T_\Omega\xi = E\xi, \tag{78}$$
will be, up to an $e^{-K_1}$-error, supported by a $K_1^{C_1+1}$-neighborhood $\Omega'_2$ of $\Omega_2$. Taking initial data $\varphi = u(0)$ and expanding as before
$$\tilde\varphi = \sum_\xi\langle\tilde\varphi,\xi\rangle\,\xi, \tag{79}$$
only those eigenfunctions $\xi$ are retained for which the corresponding region $\Omega'_2$ intersects $[4N_0 < |n| < \frac{N}{2}]\times\{0\}$. Hence (77) needs to hold, and $\xi$ is thus localized up to error $O(e^{-K_1})$ on a region of the form
$$Q_\xi = \big[\,\big|\,|n| - E\,\big| < CK\,\big]\times\big[\,|k| < 3K_1^{C_1+3} < \tfrac{K}{2}\,\big], \tag{80}$$
where $E > \frac{1}{2}N_0 > CK$. This again permits us to carry out the argument. □
Note added in proof. More recently, the author extended the main result of this paper to the case of linear Schrödinger equations $iu_t + \Delta u + V(x,t)u = 0$, where $V$ is a bounded, real, smooth potential, periodic in the spatial variable (without specified behaviour in time). One then gets growth estimates $\|S(t)\phi\|_{H^s} \le C_{\varepsilon,s}|t|^{\varepsilon}\|\phi\|_{H^s}$ for all $\varepsilon > 0$, $s < \infty$ as $|t|\to\infty$ (to appear in Journals d’Analyse de Jerusalem).
References
[B1] Bourgain, J.: On the growth in time of higher Sobolev norms of smooth solutions of Hamiltonian PDE. IMRN N6, 277–304 (1996)
[B2] Bourgain, J.: Construction of Quasi-periodic solutions for Hamiltonian perturbations of linear equations and applications to nonlinear PDE. IMRN N11, 475–497 (1994)
[B3] Bourgain, J.: Quasi-periodic solutions of Hamiltonian perturbations of 2D linear Schrödinger equations. Annals of Math. 148, 2, 363–439 (1998)
[B4] Bourgain, J.: Construction of periodic solutions of nonlinear wave equations in higher dimension. GAFA 4, 629–639 (1995)
[C-W] Craig, W., Wayne, W.: Newton’s method and periodic solutions of nonlinear wave equations. CPAM 46, 1409–1501 (1993)
[Ku] Kuksin, S.: Infinite dimensional symplectic capacities and a squeezing theorem for Hamiltonian PDE. CMP 167, 531–552 (1995)
[F-S-W] Fröhlich, J., Spencer, T., Wittwer, R.: Localization for a class of one dimensional quasi-periodic Schrödinger operators. CMP 132, N1, 5–25 (1990)
[S] Spencer, T.: Private communications
[St] Staffilani, G.: Quadratic forms for a 2D semilinear Schrödinger equation. Duke Math. J. 86, N1, 79–108 (1997)
Communicated by J. L. Lebowitz
Commun. Math. Phys. 204, 249 – 267 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Homogeneous Decoherence Functionals in Standard and History Quantum Mechanics
Oliver Rudolph (1), J. D. Maitland Wright (2)
(1) Theoretical Physics Group, Blackett Laboratory, Imperial College of Science, Technology and Medicine, Prince Consort Road, London SW7 2BZ, UK. E-mail: [email protected]
(2) Analysis and Combinatorics Research Centre, Mathematics Department, University of Reading, Reading RG6 6AX, UK
Received: 26 November 1998 / Accepted: 2 December 1998
Abstract: General history quantum theories are quantum theories without a globally defined notion of time. Decoherence functionals represent the states in the history approach and are defined as certain bivariate complex-valued functionals on the space of all histories. However, in practical situations – for instance in the history formulation of standard quantum mechanics – there often is a global time direction and the homogeneous decoherence functionals are specified by their values on the subspace of homogeneous histories. In this work we study the analytic properties of (i) the standard decoherence functional in the history version of standard quantum mechanics and (ii) homogeneous decoherence functionals in general history theories. We restrict ourselves to the situation where the space of histories is given by the lattice of projections on some Hilbert space H. Among other things we prove the non-existence of a finitely valued extension for the standard decoherence functional to the space of all histories, derive a representation for the standard decoherence functional as an unbounded quadratic form with a natural representation on a Hilbert space and prove the existence of an Isham–Linden–Schreckenberg (ILS) type representation for the standard decoherence functional. 1. Introduction This paper is an investigation into certain aspects of the history approach to quantum mechanics. The histories approach to quantum theory is a promising new approach to quantum mechanics [1]–[12], [14], [16]–[23], [26]–[29] which has led to several interesting developments. Originally, the consistent histories approach to quantum mechanics was introduced by Griffiths [5] as a tool for interpreting standard nonrelativistic Hilbert space quantum mechanics. This so-called “consistent histories interpretation” has been further developed and brought to its present form by Omnès [14]. 
However, in the present paper we are exclusively interested in another aspect of the histories approach, namely that it is a potential framework for a quantum theory
where time plays a subsidiary role. In a series of interesting papers Gell-Mann and Hartle [1]–[4] have studied quantum cosmology and the path integral formulation of relativistic quantum field theory in terms of the concepts of the histories approach. They put forward for the first time the idea of taking the concepts of the consistent histories approach to quantum mechanics as independent fundamental entities in their own right in a generalized quantum theory. Based on this idea, Isham has formulated in [6] a natural algebraic generalization of the consistent histories approach. With his so-called general history quantum theories he has broadened both the scope and the mathematical framework of the consistent histories approach to quantum mechanics. Standard quantum mechanics is based on the idealized notions of observable and state at a single time. Isham’s general history theories provide a framework in which these notions are replaced by temporal analogues, histories and decoherence functionals, respectively. Accordingly, in general the histories are more general objects than simply time-sequences of single-time events but are regarded as events intrinsically spread out in time. Moreover, Isham’s general quantum histories provide a possible framework for formulating a quantum theory without an external globally defined notion of time. In Isham’s approach a general history quantum theory is formally characterized by the space of histories on the one hand and by the space of decoherence functionals on the other hand. The histories are regarded as fundamental entities in their own right and are identified with the general temporal properties of the quantum system. Isham’s approach has subsequently become the subject of intense study [7]–[12], [16]–[23], [26]–[29]. In the history approach probabilities are assigned to complete histories. 
However, at the basis of the histories approach is the idea that any probability assignment to a history h makes sense only with respect to a so-called consistent set of histories containing h. Dual to the notion of history is the notion of decoherence functional. The decoherence functional determines the consistent sets of histories in the theory and the probabilities assigned to histories in the consistent sets. More specifically, a decoherence functional d maps every ordered pair of general histories h, k to a complex number denoted by d(h, k). The number d(h, k) is interpreted in physical terms as a measure of the mutual interference of the two histories h and k. A consistent set of histories consists of histories whose mutual interference is sufficiently small, such that the diagonal value d(h, h) can be interpreted as the probability of the history h in this consistent set. In the present work we study both the history version of standard quantum mechanics and Isham’s general history quantum theories. In the latter case we restrict ourselves to the situation where the space of histories is given by the space of projections on some Hilbert space. In [11] Isham, Linden and Schreckenberg studied operator representations for decoherence functionals in the finite dimensional case. For infinite dimensions, Wright [26] obtained the canonical representation of bounded decoherence functionals by quadratic forms on von Neumann algebras. As a special case of Wright’s results in [26], it is now known that a countably additive decoherence functional, defined on all the projections of an infinite dimensional separable Hilbert space, must be bounded. In [19] we have further investigated operator representations of bounded decoherence functionals in the infinite dimensional case. 
However, in practical situations – and in particular in the history formulation of standard quantum mechanics – decoherence functionals are often specified by their values on the space of so-called homogeneous histories, which are simply time sequences of single-time events (with respect to some a priori given time direction). The values of such a decoherence functional on the space of all histories
are in general unknown, and – moreover – it is not at all clear a priori whether such homogeneous decoherence functionals can be extended to the space of all histories. In this work we address the problem of whether such homogeneous decoherence functionals can be extended unambiguously to the set of all histories and study operator representations for homogeneous decoherence functionals. For finite dimensional Hilbert spaces we find in Sect. 3 that every bounded homogeneous decoherence functional admits an Isham–Linden–Schreckenberg (ILS) representation by some trace class operator and can be uniquely extended to a bounded decoherence functional on the space of all histories. In the infinite dimensional case we identify in Sect. 4 those homogeneous decoherence functionals admitting an ILS representation. Sect. 5 is devoted to the study of the homogeneous decoherence functional dρ in the history version of standard quantum mechanics corresponding to the initial state ρ. We shall show that the standard homogeneous decoherence functional dρ cannot be represented by a finitely valued complex-valued (bounded or unbounded) decoherence functional on the space of all histories whenever the single time Hilbert space is infinite dimensional. Nevertheless we show that the standard homogeneous decoherence functional admits a generalized ILS-type representation by some bounded operator. Our result shows that the standard decoherence functional – although bounded on homogeneous histories – can only be extended to a function on the space of all histories if values in the Riemann sphere C ∪ {∞} are permitted. Moreover, in 5.2, we succeed in representing the decoherence functional dρ by an (in general) unbounded quadratic form. This gives a very natural extension of Wright’s representation theory [26] for bounded decoherence functionals. 2. Homogeneous Decoherence Functionals 2.1. Standard quantum mechanics. 
In standard quantum mechanics single-time events at time $t$ are represented by projection operators $h_t$ on the single-time Hilbert space $\mathcal{H}_s$ and the quantum mechanical state is given by some density operator $\rho$ on $\mathcal{H}_s$. In the history formulation of nonrelativistic quantum mechanics one considers homogeneous histories, which are simply finite sequences $\{h_t\}$ of single-time events parametrized by the external time parameter $t$. Physically, one may think of a homogeneous history as a sequence of quantum events, or – in a measurement situation – as a sequence of measurement results. Let $h$ denote a finite (homogeneous) history, i.e., a finite sequence $h_{t_1}, h_{t_2}, \cdots, h_{t_n}$ of projection operators $h_{t_j}$ on the single time Hilbert space $\mathcal{H}_s$. We call the number $n$ the order of $h$. Then, by standard quantum mechanics, the probability (symbolically denoted by $d_\rho^{hom}(h,h)$) of the history $h$ in the quantum state $\rho$ is given by
$$d_\rho^{hom}(h,h) = \mathrm{tr}_{\mathcal{H}_s}\big(h_{t_n}h_{t_{n-1}}\cdots h_{t_1}\,\rho\,h_{t_1}\cdots h_{t_{n-1}}h_{t_n}\big).$$
This expression for the probability of a homogeneous history was first given by Wigner [25] in the context of the orthodox Copenhagen interpretation. Since a quantum mechanical state $\rho$ may be identified with a positive trace class operator on $\mathcal{H}_s$ with trace one, the expression for $d_\rho^{hom}(h,h)$ is well-defined, even when some (or all) of the projection operators $h_{t_j}$ have infinite dimensional range. Accordingly, the standard homogeneous decoherence functional $d_\rho^{hom}$ in standard quantum mechanics (associated with the state $\rho$) is defined for all pairs of finite homogeneous histories $h$ and $k$ as
$$d_\rho^{hom}(h,k) = \mathrm{tr}_{\mathcal{H}_s}\big(h_{t_n}h_{t_{n-1}}\cdots h_{t_1}\,\rho\,k_{t_1}\cdots k_{t_{n-1}}k_{t_n}\big). \tag{1}$$
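As a concrete illustration of formula (1), the following Python sketch evaluates the standard homogeneous decoherence functional for two-step histories on a two-dimensional single-time Hilbert space. Everything specific in it is an assumption made for the example: the choice of state, the four projections, the trivial time evolution, and the helper names `matmul`, `trace` and `decoherence`. Exact rational arithmetic is used because all matrices involved have real rational entries.

```python
from fractions import Fraction
from itertools import product


def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def decoherence(rho, h, k):
    """d(h, k) = tr(h_n ... h_1 rho k_1 ... k_n) for histories of equal order."""
    if len(h) != len(k):
        raise ValueError("histories must have the same order")
    m = rho
    # Each step wraps the running product: h_t on the left, k_t on the right.
    for h_t, k_t in zip(h, k):
        m = matmul(matmul(h_t, m), k_t)
    return trace(m)


half = Fraction(1, 2)
P0 = [[1, 0], [0, 0]]                      # projection onto the first basis vector
P1 = [[0, 0], [0, 1]]                      # projection onto the second basis vector
PLUS = [[half, half], [half, half]]        # projection onto (e0 + e1)/sqrt(2)
MINUS = [[half, -half], [-half, half]]     # projection onto (e0 - e1)/sqrt(2)
rho = P0                                   # hypothetical pure initial state

h = (PLUS, P0)
k = (MINUS, P0)
d_hh = decoherence(rho, h, h)   # probability-like diagonal value
d_hk = decoherence(rho, h, k)   # interference term between h and k
total = sum(decoherence(rho, g, g) for g in product((PLUS, MINUS), (P0, P1)))
print(d_hh, d_hk, total)
```

In this example the off-diagonal value `d_hk` is nonzero, so the two histories interfere, while the diagonal values over the four two-step histories built from the two resolutions of the identity add up to one.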
We can always assume without loss of generality that the order of $h$ equals the order of $k$, whenever $h$ and $k$ are finite homogeneous histories. We are working in the Heisenberg picture where the time evolution is thrown into the projection operators $h_{t_j}$ and – to keep the notation as simple and as transparent as possible – the explicit time dependence of the projection operators $h_{t_j}$ is suppressed. The main idea in Isham’s approach [6] is to map homogeneous histories injectively into the space of projection operators on some appropriate ($n$-fold) tensor product $\mathcal{K}_{t_1,\cdots,t_n} := \otimes_{i=1}^{n}\mathcal{H}_{t_i}$, where $\mathcal{H}_{t_i}$ equals the single time Hilbert space $\mathcal{H}_s$ for every $i$. I.e., the history $\{h_{t_j}\}$ is mapped to the projection operator $h_{t_1}\otimes\cdots\otimes h_{t_n}$ on $\mathcal{K}_{t_1,\cdots,t_n}$. We will normally follow Isham and identify a homogeneous history with the corresponding projection operator on the tensor product Hilbert space. Then, following Isham [6], the standard decoherence functional $d_\rho$ is defined on the pair $(h_1\otimes\cdots\otimes h_n, k_1\otimes\cdots\otimes k_n)$ as $d_\rho^{hom}(h_1,\cdots,h_n,k_1,\cdots,k_n)$. However, it will be helpful, initially, to keep the distinction between $d_\rho^{hom}$ and $d_\rho$ clear.
Assume that the spectral resolution of $\rho$ can be written as $\rho = \sum_{i=1}^{\infty}\omega_iP_{\psi_i}$, where $\{\psi_i\}$ denotes an orthonormal basis in $\mathcal{H}_s$, where $P_{\psi_i}$ denotes the one dimensional projection operator onto the subspace of $\mathcal{H}_s$ spanned by $\psi_i$ for every $i$ and where $\sum_{i=1}^{\infty}\omega_i = 1$ and $\omega_i\ge0$ for all $i$. Isham, Linden and Schreckenberg [11] have shown that by repeatedly inserting arbitrary “resolutions of the identity” into $d_\rho^{hom}(h,k)$, the decoherence functional $d_\rho^{hom}(h,k)$ can be written as
$$\begin{aligned} d_\rho^{hom}(h,k) = \sum_{i,j_1,\cdots,j_{2n+1}=1}^{\infty}\omega_i\,&\big\langle e^2_{j_2}, k_{t_1}e^1_{j_1}\big\rangle\big\langle e^3_{j_3}, k_{t_2}e^2_{j_2}\big\rangle\cdots\big\langle e^{n+1}_{j_{n+1}}, k_{t_n}e^n_{j_n}\big\rangle \\ \times\,&\big\langle e^{n+2}_{j_{n+2}}, h_{t_n}e^{n+1}_{j_{n+1}}\big\rangle\cdots\big\langle e^{2n+1}_{j_{2n+1}}, h_{t_1}e^{2n}_{j_{2n}}\big\rangle\big\langle e^1_{j_1}, P_{\psi_i}e^{2n+1}_{j_{2n+1}}\big\rangle, \end{aligned}$$
where the $\{e^r_{j_r}\}$ are orthonormal bases in $\mathcal{H}_s$ for all $r$. Thus
$$\begin{aligned} d_\rho^{hom}(h,k) = \sum_{i,j_2,\cdots,j_{2n}}\omega_i\,&\big\langle e^2_{j_2}, k_{t_1}\psi_i\big\rangle\big\langle e^3_{j_3}, k_{t_2}e^2_{j_2}\big\rangle\cdots\big\langle e^{n+1}_{j_{n+1}}, k_{t_n}e^n_{j_n}\big\rangle \\ \times\,&\big\langle e^{n+2}_{j_{n+2}}, h_{t_n}e^{n+1}_{j_{n+1}}\big\rangle\cdots\big\langle\psi_i, h_{t_1}e^{2n}_{j_{2n}}\big\rangle. \end{aligned}$$
Hence we arrive at the representation for $d_\rho$
$$d_\rho(h,k) = \sum_{j_1,\cdots,j_{2n}=1}^{\infty}\omega_{j_1}\big\langle(h\otimes k)\,\varepsilon_{j_1,\cdots,j_{2n}}, \tilde\varepsilon_{j_1,\cdots,j_{2n}}\big\rangle \tag{2}$$
for all homogeneous histories $h = h_{t_1}\otimes\cdots\otimes h_{t_n}$ and $k = k_{t_1}\otimes\cdots\otimes k_{t_n}$, where the orthonormal bases $\{\varepsilon_{j_1,\cdots,j_{2n}}\}$ and $\{\tilde\varepsilon_{j_1,\cdots,j_{2n}}\}$ of $\mathcal{K}_{t_1,\cdots,t_n}\otimes\mathcal{K}_{t_1,\cdots,t_n}$ are given by
$$\varepsilon_{j_1,\cdots,j_{2n}} := \psi_{j_1}\otimes e^{2n}_{j_{2n}}\otimes e^{2n-1}_{j_{2n-1}}\otimes\cdots\otimes e^{n+2}_{j_{n+2}}\otimes e^2_{j_2}\otimes e^3_{j_3}\otimes\cdots\otimes e^{n+1}_{j_{n+1}}, \tag{3}$$
$$\tilde\varepsilon_{j_1,\cdots,j_{2n}} := e^{2n}_{j_{2n}}\otimes e^{2n-1}_{j_{2n-1}}\otimes\cdots\otimes e^{n+1}_{j_{n+1}}\otimes\psi_{j_1}\otimes e^2_{j_2}\otimes e^3_{j_3}\otimes\cdots\otimes e^n_{j_n}. \tag{4}$$
The expression (2) is well-defined and finite for all pairs of homogeneous histories $h = h_{t_1}\otimes\cdots\otimes h_{t_n}$ and $k = k_{t_1}\otimes\cdots\otimes k_{t_n}$.
Following Isham [6], in the history reformulation of quantum mechanics the set of all histories now has to be identified with the set $\mathcal{P}(\mathcal{K}_{t_1,\cdots,t_n})$ of projection operators on $\mathcal{K}_{t_1,\cdots,t_n}$. (Strictly speaking the set of all histories has to be identified with the direct limit of the directed system $\{\mathcal{P}(\mathcal{K}_{t_1,\cdots,t_n}) : \{t_1,\cdots,t_n\}\subset\mathbb{R}\}$, see [16,17].) Those histories in $\mathcal{P}(\mathcal{K}_{t_1,\cdots,t_n})$ which are not homogeneous are called inhomogeneous histories.
2.2. General history quantum theories. Now we switch to general (abstract) history quantum theories over some Hilbert space $\mathcal{H}$. Such a theory is fully characterized by the space of histories and the space of decoherence functionals. In general history quantum theories over some Hilbert space $\mathcal{H}$ the space of all histories (or more precisely propositions about histories) is – by definition – given by $\mathcal{P}(\mathcal{H})$, see [6]–[9]. Notice that the history Hilbert space $\mathcal{H}$ must not be confused with the single time Hilbert space $\mathcal{H}_s$ in standard quantum mechanics. A generalized decoherence functional for $\mathcal{H}$ is a function $d$, defined on all ordered pairs of projections in $\mathcal{P}(\mathcal{H})$, with values in the Riemann sphere $\mathbb{C}\cup\{\infty\}$ such that:
(i) $d(p,q) = d(q,p)^*$ for each $p$ and each $q$ in $\mathcal{P}(\mathcal{H})$ (Hermitianess).
(ii) $d(p,p)\ge0$ for each $p$ (Positivity).
(iii) $d(1,1) = 1$ (Normalization).
(iv) $d(p_1+p_2,q) = d(p_1,q)+d(p_2,q)$ for each $q$ whenever $p_1$ and $p_2$ are perpendicular and all quantities and terms are finite (Ortho-additivity).
Moreover, we say that a decoherence functional $d$ is completely additive if
(iv’) whenever $\{p_i\}_{i\in I}$ is an infinite collection of pairwise orthogonal projections,
$$d\Big(\sum_{i\in I}p_i, q\Big) = \sum_{i\in I}d(p_i,q),$$
for all $q\in\mathcal{P}(\mathcal{H})$ such that the left hand side is finite and all terms in the summation on the right hand side are finite. The infinite series is required to converge absolutely. (This is automatic when $I$ is countable since the series is rearrangement invariant. When $I$ is uncountable, all but countably many of the terms of the series are zero.)
For brevity we shall write “decoherence functional” for “generalized decoherence functional” except where this could cause confusion. In the previous literature it has always been assumed that decoherence functionals are finitely valued. In the present work, however, we drop the requirement that decoherence functionals are finitely valued. Our motivation for doing so will become clear below, see Sect. 5. In the present paper we restrict ourselves to the situation where the history Hilbert space $\mathcal{H}$ can be written as a finite tensor product of a family of Hilbert spaces. Specifically we are aiming at formalising those situations where there is given a priori an external (possibly discretised or coarse grained) time direction in the theory – as for instance in the history formulation of standard quantum mechanics. In this case the notions of homogeneous history and of homogeneous decoherence functional make sense. The history reformulation of standard quantum mechanics discussed above motivates also the following general definitions for general history quantum theories.
Definition. Let $\mathcal{B}(\mathcal{H})$ be the (von Neumann) algebra of all bounded operators on a Hilbert space $\mathcal{H}$ and let $\mathcal{P}(\mathcal{H})$ be the lattice of projections in $\mathcal{B}(\mathcal{H})$. Assume that there is a finite family of Hilbert spaces $\{\mathcal{H}_i\}_{i=1}^{n}$ such that $\mathcal{H}$ can be written as the tensor product of the Hilbert spaces $\mathcal{H}_i$, i.e., $\mathcal{H} = \otimes_{i=1}^{n}\mathcal{H}_i$. Then we say that the pair $(\mathcal{H},\{\mathcal{H}_i\})$ is a homogeneous history Hilbert space of order $n$. In this situation we will – abusing language – also briefly say that $\mathcal{H}$ is a homogeneous history Hilbert space. When we do so we will always tacitly assume that a family of Hilbert spaces $\{\mathcal{H}_i\}$ has been chosen such that $(\mathcal{H},\{\mathcal{H}_i\})$ is a homogeneous history Hilbert space. A homogeneous projection $p$ on a homogeneous history Hilbert space $\mathcal{H}$ is then a projection of the form $p = p_1\otimes\cdots\otimes p_n$, where $p_i$ is a projection on $\mathcal{H}_i$ for all $i$ respectively. A homogeneous decoherence functional for $\mathcal{H}$ is a complex valued function $d^{hom}$, defined on all pairs of $n$th order history projections, i.e., $d^{hom}$ is a function $d^{hom} : \mathcal{P}(\mathcal{H}_1)\times\cdots\times\mathcal{P}(\mathcal{H}_n)\times\mathcal{P}(\mathcal{H}_1)\times\cdots\times\mathcal{P}(\mathcal{H}_n)\to\mathbb{C}$, such that:
(i) $d^{hom}(p_1,\cdots,p_n,q_1,\cdots,q_n) = d^{hom}(q_1,\cdots,q_n,p_1,\cdots,p_n)^*$ for all $p_i$ and $q_i$ in $\mathcal{P}(\mathcal{H}_i)$ (Hermitianess).
(ii) $d^{hom}(p_1,\cdots,p_n,p_1,\cdots,p_n)\ge0$ for all $(p_1,\cdots,p_n)$ (Positivity).
(iii) $d^{hom}(1_1,\cdots,1_n,1_1,\cdots,1_n) = 1$ (Normalization).
(iv) $d^{hom}$ is orthoadditive in each of its $2n$ arguments (Ortho-additivity).
Notice that we only consider finitely valued homogeneous decoherence functionals. Clearly, in the physical applications in standard quantum mechanics the Hilbert spaces Hi are all interpreted as the single time Hilbert space Hs indexed by a discrete time parameter, i.e., all the Hi are isomorphic to Hs and can be obtained from the single-time Hilbert space Hs at some fiducial time by the application of a suitable unitary time translation operator (recall that we are working in the Heisenberg picture). For every initial state ρ the homogeneous decoherence functional dρhom associated with ρ is a homogeneous decoherence functional in the above sense. From our discussion above it is clear that the homogeneous decoherence functional in standard quantum mechanics is bounded. It is thus of some interest to study the problem whether bounded homogeneous decoherence functionals in general history quantum theories can be unambiguously extended to the space of all histories. In the sequel we shall need the following theorem. Theorem 2.1. Let H = ⊗ni=1 Hi be a history Hilbert space where all Hi are of dimension greater than two. Let d hom be a bounded homogeneous decoherence functional for H. Then there exists a unique bounded multilinear functional D : B(H1 ) × · · · × B(Hn ) × B(H1 ) × · · · × B(Hn ) → C extending d hom . This theorem is a special case of the multi-form generalized Gleason theorem proved in [20]. In the next section we will briefly consider the situation where all Hi (and thus also H = ⊗ni=1 Hi ) are finite dimensional Hilbert spaces whereas in the Sects. 4 and 5 we consider the situation where the Hi are general infinite dimensional Hilbert spaces.
3. The Finite Dimensional Case: A Generalized Isham–Linden–Schreckenberg Theorem
In this section we briefly consider the question whether in history quantum theories over finite dimensional Hilbert spaces every bounded homogeneous decoherence functional can be extended to a finitely valued decoherence functional on the space of all histories. This question can indeed be answered in the affirmative.
Theorem 3.1. Let $\mathcal{H}$ be a finite dimensional homogeneous history Hilbert space and $\mathcal{H} = \otimes_{i=1}^{n}\mathcal{H}_i$ its representation as a finite tensor product of (finite dimensional) Hilbert spaces all of which have dimension greater than two. Then there is a one-to-one correspondence between bounded homogeneous decoherence functionals $d^{hom}$ for $\mathcal{H}$ and trace class operators $X$ on $\mathcal{H}\otimes\mathcal{H}$ according to the rule
$$d^{hom}(p_1,\cdots,p_n,q_1,\cdots,q_n) = \mathrm{tr}_{\mathcal{H}\otimes\mathcal{H}}\big((p_1\otimes\cdots\otimes p_n\otimes q_1\otimes\cdots\otimes q_n)X\big)$$
for all projections $p_j,q_j\in\mathcal{P}(\mathcal{H}_j)$ with the restriction that
(i) $\mathrm{tr}_{\mathcal{H}\otimes\mathcal{H}}((p_1\otimes\cdots\otimes p_n\otimes q_1\otimes\cdots\otimes q_n)X) = \mathrm{tr}_{\mathcal{H}\otimes\mathcal{H}}((q_1\otimes\cdots\otimes q_n\otimes p_1\otimes\cdots\otimes p_n)X^*)$;
(ii) $\mathrm{tr}_{\mathcal{H}\otimes\mathcal{H}}((p_1\otimes\cdots\otimes p_n\otimes p_1\otimes\cdots\otimes p_n)X)\ge0$;
(iii) $\mathrm{tr}_{\mathcal{H}\otimes\mathcal{H}}(X) = 1$.
Proof. This theorem can be proved directly by iterating the argument given by Isham, Linden and Schreckenberg in their proof of the case $n = 1$ (the ILS-Theorem) [11]. The details are left to the reader. In Sect. 4 we will derive Theorem 3.1 as a by-product of Theorem 4.1. □
Remark. From Theorem 3.1 and from the ILS-Theorem [11] it follows that in the finite dimensional case the notions of bounded homogeneous decoherence functional and bounded decoherence functional can be identified with each other.
4. The ILS-Theorem for Homogeneous Decoherence Functionals in Infinite Dimensions
Let $\mathcal{H}$ be a Hilbert space and let $\mathcal{K}(\mathcal{H})$ be the ideal of compact operators in $\mathcal{B}(\mathcal{H})$. Then $\mathcal{K}(\mathcal{H}) = \mathcal{B}(\mathcal{H})$ if, and only if, $\mathcal{H}$ is finite dimensional. We shall need some basic facts on tensor products of operator algebras. For a particularly elegant account, from first principles, of tensor products of $C^*$-algebras see Wegge-Olsen [24] and, for a more advanced treatment, see Kadison and Ringrose [13]. Let us recall that if $\mathcal{H}_1,\cdots,\mathcal{H}_n$ are Hilbert spaces, the algebraic tensor product $\mathcal{H}_1\otimes_{alg}\cdots\otimes_{alg}\mathcal{H}_n$ can be equipped with an inner product such that $\langle\varphi_1\otimes\cdots\otimes\varphi_n,\psi_1\otimes\cdots\otimes\psi_n\rangle = \langle\varphi_1,\psi_1\rangle\cdots\langle\varphi_n,\psi_n\rangle$. The completion of $\mathcal{H}_1\otimes_{alg}\cdots\otimes_{alg}\mathcal{H}_n$ with respect to this inner product is the Hilbert space tensor product $\mathcal{H}_1\otimes\cdots\otimes\mathcal{H}_n$. When $\{x_j\}_{j=1}^{n}$ is a family of bounded operators on $\mathcal{H}_j$ respectively, then there is a unique operator in $\mathcal{B}(\mathcal{H}_1\otimes\cdots\otimes\mathcal{H}_n)$, denoted by $x_1\otimes\cdots\otimes x_n$, such that $(x_1\otimes\cdots\otimes x_n)(\varphi_1\otimes\cdots\otimes\varphi_n) = x_1(\varphi_1)\otimes\cdots\otimes x_n(\varphi_n)$. Let $\{A_j\}_{j=1}^{n}$ be a family of $C^*$-algebras of operators acting on $\mathcal{H}_j$ respectively. Then the algebraic tensor product $A_1\otimes_{alg}\cdots\otimes_{alg}A_n$ can be identified with the $*$-algebra, acting on $\mathcal{H}_1\otimes\cdots\otimes\mathcal{H}_n$, which consists of all finite sums of operators of the form
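$x_1\otimes\cdots\otimes x_n$, with $x_j\in A_j$, for all $j$. The norm closure of $A_1\otimes_{alg}\cdots\otimes_{alg}A_n$ is the $C^*$-tensor product of $\{A_j\}_{j=1}^{n}$ and is denoted by $A_1\otimes\cdots\otimes A_n$. (This is also called the spatial $C^*$-tensor product to distinguish it from other possible $C^*$-tensor products; see [24].) The algebraic tensor product $\mathcal{K}(\mathcal{H}_1)\otimes_{alg}\cdots\otimes_{alg}\mathcal{K}(\mathcal{H}_n)$ embeds naturally into $\mathcal{B}(\mathcal{H}_1\otimes\cdots\otimes\mathcal{H}_n)$ (see, e.g., Kadison and Ringrose [13], Chapter 11.2). This embedding of $\mathcal{K}(\mathcal{H}_1)\otimes_{alg}\cdots\otimes_{alg}\mathcal{K}(\mathcal{H}_n)$ in $\mathcal{B}(\mathcal{H}_1\otimes\cdots\otimes\mathcal{H}_n)$ induces a (unique) pre-$C^*$-norm on $\mathcal{K}(\mathcal{H}_1)\otimes_{alg}\cdots\otimes_{alg}\mathcal{K}(\mathcal{H}_n)$. The (spatial) $C^*$-tensor product $\mathcal{K}(\mathcal{H}_1)\otimes\cdots\otimes\mathcal{K}(\mathcal{H}_n)$ is the closure of $\mathcal{K}(\mathcal{H}_1)\otimes_{alg}\cdots\otimes_{alg}\mathcal{K}(\mathcal{H}_n)$ in $\mathcal{B}(\mathcal{H}_1\otimes\cdots\otimes\mathcal{H}_n)$ with respect to this pre-$C^*$-norm and can be identified with $\mathcal{K}(\mathcal{H}_1\otimes\cdots\otimes\mathcal{H}_n)$.
Now let $\mathcal{H}$ be a homogeneous history Hilbert space of order $n > 0$ and let $\mathcal{H} = \otimes_{i=1}^{n}\mathcal{H}_i$ be its given representation as a tensor product, where all $\mathcal{H}_i$ are of dimension greater than two. Let $d^{hom}$ be a bounded homogeneous decoherence functional for $\mathcal{H}$. Then, by Theorem 2.1, there exists a (unique) bounded $2n$-linear functional $B : \mathcal{B}(\mathcal{H}_1)\times\cdots\times\mathcal{B}(\mathcal{H}_n)\times\mathcal{B}(\mathcal{H}_1)\times\cdots\times\mathcal{B}(\mathcal{H}_n)\to\mathbb{C}$ such that $d^{hom}(p_1,\cdots,p_n,q_1,\cdots,q_n) = B(p_1,\cdots,p_n,q_1,\cdots,q_n)$ for all $p_i,q_i\in\mathcal{P}(\mathcal{H}_i)$. Denote by $B_K$ the restriction of $B$ to $\mathcal{K}(\mathcal{H}_1)\times\cdots\times\mathcal{K}(\mathcal{H}_n)\times\mathcal{K}(\mathcal{H}_1)\times\cdots\times\mathcal{K}(\mathcal{H}_n)$. Then, by the fundamental property of the algebraic tensor product, there is a unique linear functional $\beta : \mathcal{K}(\mathcal{H}_1)\otimes_{alg}\cdots\otimes_{alg}\mathcal{K}(\mathcal{H}_n)\otimes_{alg}\mathcal{K}(\mathcal{H}_1)\otimes_{alg}\cdots\otimes_{alg}\mathcal{K}(\mathcal{H}_n)\to\mathbb{C}$ such that $\beta(x_1\otimes\cdots\otimes x_n\otimes y_1\otimes\cdots\otimes y_n) = B_K(x_1,\cdots,x_n,y_1,\cdots,y_n) = B(x_1,\cdots,x_n,y_1,\cdots,y_n)$ for all $x_i,y_i\in\mathcal{K}(\mathcal{H}_i)$. In particular $d^{hom}(p_1,\cdots,p_n,q_1,\cdots,q_n) = \beta(p_1\otimes\cdots\otimes p_n\otimes q_1\otimes\cdots\otimes q_n)$ for all projections $p_i,q_i\in\mathcal{K}(\mathcal{H}_i)$.
Definition.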
A homogeneous decoherence functional $d^{\mathrm{hom}}$ for a history Hilbert space $H = \otimes_{i=1}^{n} H_i$ is said to be tensor bounded if $d^{\mathrm{hom}}$ is bounded and the associated functional $\beta$ is bounded on $K(H_1) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} K(H_n) \otimes_{\mathrm{alg}} K(H_1) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} K(H_n)$, when $K(H_1) \otimes \cdots \otimes K(H_n) \otimes K(H_1) \otimes \cdots \otimes K(H_n)$ is equipped with its canonical $C^*$-norm.

Theorem 4.1. Let $H$ be a history Hilbert space with tensor product representation $H = \otimes_{i=1}^{n} H_i$, where all $H_i$ are of dimension greater than two. Let $d^{\mathrm{hom}}$ be a bounded homogeneous decoherence functional for $H$. Then $d^{\mathrm{hom}}$ is tensor bounded if, and only if, there exists a trace class operator $T$ on $H_1 \otimes \cdots \otimes H_n \otimes H_1 \otimes \cdots \otimes H_n$ such that
$$d^{\mathrm{hom}}(p_1, \dots, p_n, q_1, \dots, q_n) = \mathrm{tr}_{H \otimes H}((p_1 \otimes \cdots \otimes p_n \otimes q_1 \otimes \cdots \otimes q_n) T)$$
for all projections $p_i, q_i \in K(H_i)$.

Proof. The proof is analogous to the proof of Theorem 3.2 in [19] and omitted. □
Corollary 4.2. Let $H$ be a history Hilbert space with tensor product representation $H = \otimes_{i=1}^{n} H_i$, where all $H_i$ are of dimension greater than two. Let $d^{\mathrm{hom}}$ be a completely additive bounded homogeneous decoherence functional for $H$. Then $d^{\mathrm{hom}}$ is tensor bounded if, and only if, there exists a trace class operator $T$ on $H_1 \otimes \cdots \otimes H_n \otimes H_1 \otimes \cdots \otimes H_n$ such that
$$d^{\mathrm{hom}}(p_1, \dots, p_n, q_1, \dots, q_n) = \mathrm{tr}_{H \otimes H}((p_1 \otimes \cdots \otimes p_n \otimes q_1 \otimes \cdots \otimes q_n) T) \qquad (5)$$
for all projections $p_i, q_i \in P(H_i)$.

Proof. When there exists a trace class operator $T$ such that (5) holds, then, by Theorem 4.1, $d^{\mathrm{hom}}$ is tensor bounded. Conversely, when $d^{\mathrm{hom}}$ is tensor bounded, the existence of $T$ such that (5) holds for all projections of finite rank is guaranteed by Theorem 4.1. By appealing to the complete additivity of $d^{\mathrm{hom}}$ and the ultraweak continuity of the map $z \mapsto \mathrm{tr}(zT)$ it is straightforward to establish (5) for arbitrary projections.

Corollary 4.3. Let $H$ be a history Hilbert space with tensor product representation $H = \otimes_{i=1}^{n} H_i$, where all $H_i$ are of dimension greater than two. There is a one-to-one correspondence between completely additive, tensor bounded homogeneous decoherence functionals for $H$ and trace class operators $T$ on $H_1 \otimes \cdots \otimes H_n \otimes H_1 \otimes \cdots \otimes H_n$ such that, for $p_i, q_i \in P(H_i)$,
– $\mathrm{tr}_{H \otimes H}((p_1 \otimes \cdots \otimes p_n \otimes q_1 \otimes \cdots \otimes q_n) T) = \mathrm{tr}_{H \otimes H}((q_1 \otimes \cdots \otimes q_n \otimes p_1 \otimes \cdots \otimes p_n) T^*)$;
– $\mathrm{tr}_{H \otimes H}((p_1 \otimes \cdots \otimes p_n \otimes p_1 \otimes \cdots \otimes p_n) T) \geq 0$;
– $\mathrm{tr}_{H \otimes H}(T) = 1$.

Proof. Straightforward. □

Remark. Theorem 3.1 (the generalized Isham–Linden–Schreckenberg Theorem) follows immediately since, when the history Hilbert space $H$ is finite dimensional, then
$$K(H_1) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} K(H_n) = K(H_1) \otimes \cdots \otimes K(H_n) = B(H_1 \otimes \cdots \otimes H_n),$$
which is finite dimensional, and every linear functional on a finite dimensional normed space is bounded.

Let $d$ be a bounded decoherence functional for $(H, \{H_i\})$, where all $H_i$ have dimension greater than two. Let us call $d$ Isham–Linden–Schreckenberg-representable (or, more shortly, ILS-representable) if there exists a trace class operator $T$ in $B(H \otimes H) = B(H_1 \otimes \cdots \otimes H_n \otimes H_1 \otimes \cdots \otimes H_n)$ such that $d(p, q) = \mathrm{tr}_{H \otimes H}((p \otimes q) T)$ for all projections $p$ and $q$ in $B(H)$.
It follows from the results given above that a completely additive homogeneous decoherence functional d hom on a history Hilbert space H can be represented by an ILS-representable bounded decoherence functional d on H if, and only if, d hom is tensor bounded.
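The trace representation can be checked by hand in the simplest finite-dimensional case. The following is a minimal sketch, assuming $n = 1$, $H = \mathbb{C}^2$ and the standard functional $d_\rho(p, q) = \mathrm{tr}(p \rho q)$ of Eq. (1). The candidate operator $T = (\rho \otimes I) S$, with $S$ the swap operator, is an illustrative choice based on the elementary identity $\mathrm{tr}((A \otimes B) S) = \mathrm{tr}(AB)$; it is not taken from the text, and all helper names are hypothetical.

```python
# Sanity check of an ILS-type representation for n = 1 on C^2 (plain Python lists).
# Assumption: T = (rho ⊗ I) S is our illustrative candidate, not the paper's construction.

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def kron(a, b):
    """Matrix of a ⊗ b in the basis e_i ⊗ e_j, indexed by i * m + j (square inputs)."""
    n, m = len(a), len(b)
    return [[a[i // m][k // m] * b[i % m][k % m] for k in range(n * m)]
            for i in range(n * m)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def swap(n):
    """Swap operator S(e_i ⊗ e_j) = e_j ⊗ e_i on C^n ⊗ C^n."""
    s = [[0.0] * (n * n) for _ in range(n * n)]
    for i in range(n):
        for j in range(n):
            s[j * n + i][i * n + j] = 1.0
    return s

identity = [[1.0, 0.0], [0.0, 1.0]]
rho = [[0.75, 0.25], [0.25, 0.25]]   # positive, trace one
p = [[1.0, 0.0], [0.0, 0.0]]         # projection onto e_1
q = [[0.5, 0.5], [0.5, 0.5]]         # projection onto (e_1 + e_2)/sqrt(2)

T = matmul(kron(rho, identity), swap(2))

lhs = trace(matmul(kron(p, q), T))        # tr((p ⊗ q) T)
rhs = trace(matmul(matmul(p, rho), q))    # tr(p rho q)
print(lhs, rhs, trace(T))
```

In this toy case the two traces agree and $\mathrm{tr}(T) = \mathrm{tr}(\rho) = 1$, matching the normalization condition of Corollary 4.3. The sketch says nothing about the infinite-dimensional situation, where tensor boundedness is the decisive hypothesis.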
5. The Decoherence Functional in Standard Quantum Mechanics

5.1. Non-existence of a finitely valued extension of the standard decoherence functional. The definition of the homogeneous decoherence functional $d_\rho^{\mathrm{hom}}$ associated with the initial state $\rho$ in the history reformulation of standard quantum mechanics has been already given in Sect. 2.1 above. This function is of particular interest since the axioms characterizing general history quantum theories are abstracted from the structure of the history reformulation of standard quantum mechanics, and features which fail to be true in the history reformulation of standard quantum mechanics are unlikely to hold for more general physical history quantum theories. We recall that in standard quantum mechanics the homogeneous decoherence functional $d_\rho^{\mathrm{hom}}$ associated with the initial state $\rho$ is defined on pairs of homogeneous histories $h$ and $k$ by Eq. (1) as
$$d_\rho^{\mathrm{hom}}(h, k) = \mathrm{tr}_{H_s}(h_{t_n} h_{t_{n-1}} \cdots h_{t_1} \rho\, k_{t_1} \cdots k_{t_{n-1}} k_{t_n}).$$
We have discussed in Sect. 2.1 that in Isham's history formulation of standard quantum mechanics the general histories are identified with the projection operators on some tensor product of the single time Hilbert space by itself. As explained in Sect. 2.1, in the tensor product formalism Isham et al. [11] obtained the representation in Eq. (2) for $d_\rho$:
$$d_\rho(h, k) = \sum_{j_1, \dots, j_{2n} = 1}^{\infty} \omega_{j_1} \langle (h \otimes k)\, \varepsilon_{j_1, \dots, j_{2n}}, \tilde{\varepsilon}_{j_1, \dots, j_{2n}} \rangle$$
for all tensored homogeneous histories $h = h_{t_1} \otimes \cdots \otimes h_{t_n}$ and $k = k_{t_1} \otimes \cdots \otimes k_{t_n}$, where $\rho = \sum_{i=1}^{\infty} \omega_i P_{\psi_i}$ denotes the spectral resolution of $\rho$ and where the $\varepsilon_{j_1, \dots, j_{2n}}$ and $\tilde{\varepsilon}_{j_1, \dots, j_{2n}}$ are defined in Eqs. (3) and (4) respectively. The question which will be addressed in this section is whether this expression can be extended to the space of all histories. When the single-time Hilbert space is finite dimensional, then it follows from our result in Sect. 3 that $d_\rho$ can indeed be extended to the space of all histories and its extension is also ILS-representable. It is natural to try to define $d_\rho(p, q)$ for arbitrary projections $p, q$ by
$$d_\rho(p, q) = \sum_{j_1, \dots, j_{2n} = 1}^{\infty} \omega_{j_1} \langle (p \otimes q)\, \varepsilon_{j_1, \dots, j_{2n}}, \tilde{\varepsilon}_{j_1, \dots, j_{2n}} \rangle. \qquad (6)$$
Proposition 5.1. When the single time Hilbert space is infinite dimensional, the expression (6) does not define a finitely valued functional on the space of all histories.

Proof. Let $H_s$ denote the single-time Hilbert space of the quantum system in question. We assume that $H_s$ is a separable infinite dimensional Hilbert space. We consider two time histories, i.e., the case $n = 2$. Let $e$ be a fixed unit vector in $H_s$. Let $(e_j)_{j = 1, 2, \dots}$ be an orthonormal basis in $H_s$ with $e = e_1$. Let $P$ and $Q$ be projections on $H_s \otimes H_s$. Whenever the summation converges we define
$$d_e(P, Q) := \sum_{j(1)=1}^{\infty} \sum_{j(2)=1}^{\infty} \sum_{j(3)=1}^{\infty} \langle (P \otimes Q)(e \otimes e_{j(1)} \otimes e_{j(2)} \otimes e_{j(3)}), e_{j(1)} \otimes e_{j(3)} \otimes e \otimes e_{j(2)} \rangle$$
$$= \sum_{j(1)=1}^{\infty} \sum_{j(2)=1}^{\infty} \sum_{j(3)=1}^{\infty} \langle P(e \otimes e_{j(1)}), e_{j(1)} \otimes e_{j(3)} \rangle \langle Q(e_{j(2)} \otimes e_{j(3)}), e \otimes e_{j(2)} \rangle.$$
This expression coincides with the above formula defining the standard decoherence functional for $n = 2$ and for a pure quantum state $\rho = P_e$, where $P_e$ denotes the projection operator onto the subspace of $H_s$ spanned by $e$. If we put $P = I$, then all the terms in the summation vanish except where $j(1) = j(3) = 1$. Thus
$$d_e(I, Q) = \sum_{j(2)=1}^{\infty} \langle Q(e_{j(2)} \otimes e), e \otimes e_{j(2)} \rangle.$$
Let
$$D_e(S) := \sum_{j(2)=1}^{\infty} \langle S(e_{j(2)} \otimes e), e \otimes e_{j(2)} \rangle$$
for all $S \in B(H_s \otimes H_s)$ for which the summation converges. We observe that $D_e(I) = \langle e \otimes e, e \otimes e \rangle = 1$. If $d_e(I, Q)$ is well defined for every projection $Q$ on $H_s \otimes H_s$, then $D_e(Q)$ converges for every projection $Q$. Let $U$ be the unitary on $H_s \otimes H_s$ such that $U(e_n \otimes e_m) = e_m \otimes e_n$ for each $n$ and $m$. Then $U^2 = I$. So $U$ is self adjoint. Let $Q_U = \frac{1}{2}(U + I)$. Then $Q_U$ is a projection. Assume now that $D_e(Q_U)$ is convergent. Then $D_e(2Q_U - I)$ is convergent, i.e.,
$$\sum_{j(2)=1}^{\infty} \langle U(e_{j(2)} \otimes e), e \otimes e_{j(2)} \rangle = \sum_{j(2)=1}^{\infty} \langle e \otimes e_{j(2)}, e \otimes e_{j(2)} \rangle = \sum_{j(2)=1}^{\infty} 1$$
is convergent. This is false. So $D_e(Q_U)$ is divergent. Thus $d_e(I, Q_U)$ is not defined by the formula. Thus we conclude that the natural formula for the homogeneous decoherence functional of standard quantum mechanics does not induce a finitely valued decoherence functional on the space of all histories but rather a generalized functional $\overline{d}_\rho : P(K_{t_1, \dots, t_n}) \times P(K_{t_1, \dots, t_n}) \to \mathbb{C} \cup \{\infty\}$ with values in the Riemann sphere $\mathbb{C} \cup \{\infty\}$. □

The generalized decoherence functional $\overline{d}_\rho : P(K_{t_1, \dots, t_n}) \times P(K_{t_1, \dots, t_n}) \to \mathbb{C} \cup \{\infty\}$ of $d_\rho$ is given by
$$\overline{d}_\rho(h, k) = \begin{cases} d_\rho(h, k) & \text{whenever the series defining } d_\rho(h, k) \text{ is well-defined and finite,} \\ \infty & \text{else.} \end{cases}$$
It is easy to see that our argument above already implies that in standard quantum mechanics no standard decoherence functional (of order two or greater) over an infinite dimensional single-time Hilbert space $H_s$ can be extended to a completely additive finitely valued decoherence functional on the space of all histories. For, by Wright [27], such an extension would be bounded, contrary to the argument above. In the next section it will become clear that even if the requirement of complete additivity is dropped there is no hope of extending $d_\rho$ to all histories.

Our argument above also shows that there are histories such that no decoherence functional $d_\rho$ assumes a finite value on them. For instance, in our example above the infinite sum defining $d_\rho(I, Q_U)$ diverges independently of $\rho$. This result seems to indicate that the space $P(K_{t_1, \dots, t_n})$ contains unphysical elements and one might hope that the standard decoherence functional is well defined on some suitably chosen smaller space of histories.

Bounded decoherence functionals have canonical representations as quadratic forms on von Neumann algebras, see Corollary 4 of [26]. Surprisingly, in view of the negative results above, this is almost true for standard decoherence functionals.
It turns out that each standard decoherence functional can be identified with a positive, but unbounded, quadratic form defined on a “dense” ∗ -subalgebra of B(H). This is clarified below.
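The divergence used in the proof of Proposition 5.1 can be made concrete by computing partial sums. The following is a minimal sketch, assuming we restrict to the span of the first $N$ basis vectors (the vectors $e_j \otimes e$ and $e \otimes e_j$ with $j \leq N$ are mapped into that span by $U$, so the truncated matrix yields the partial sum of the series exactly). The function names are hypothetical, and indices start at 0 in the code, so $e = e_0$.

```python
# Partial sums of D_e(Q_U) = sum_j <Q_U(e_j ⊗ e), e ⊗ e_j>, truncated to the first n basis vectors.

def swap_projection(n):
    """Matrix of Q_U = (U + I)/2 on C^n ⊗ C^n, where U(e_a ⊗ e_b) = e_b ⊗ e_a."""
    dim = n * n
    q = [[0.0] * dim for _ in range(dim)]
    for a in range(n):
        for b in range(n):
            col = a * n + b              # index of e_a ⊗ e_b
            q[b * n + a][col] += 0.5     # contribution of U/2
            q[col][col] += 0.5           # contribution of I/2
    return q

def is_idempotent(m, tol=1e-12):
    """Check m @ m == m entrywise, i.e. that m is a projection matrix."""
    dim = len(m)
    return all(abs(sum(m[i][k] * m[k][j] for k in range(dim)) - m[i][j]) < tol
               for i in range(dim) for j in range(dim))

def partial_sum(n):
    """Sum over j < n of <Q_U(e_j ⊗ e_0), e_0 ⊗ e_j>; the matrix entry at row e_0 ⊗ e_j, column e_j ⊗ e_0."""
    q = swap_projection(n)
    return sum(q[0 * n + j][j * n + 0] for j in range(n))

print(is_idempotent(swap_projection(3)))
for n in (2, 4, 8, 16):
    print(n, partial_sum(n))
```

The $j = 1$ term of the series contributes 1 and every other term contributes $\frac{1}{2}$, so the $N$-th partial sum equals $(N + 1)/2$ and grows without bound, in line with the conclusion that $d_e(I, Q_U)$ is not finitely valued.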
5.2. Representing standard decoherence functionals by unbounded quadratic forms. The non-boundedness of the standard decoherence functional forces us to consider unbounded decoherence functionals as “necessary evils” like unbounded operators. Also, like unbounded operators, their domains are non-closed subspaces of the underlying Banach space. In this subsection we shall show that in standard quantum mechanics every decoherence functional $d_\rho$ associated with the initial state $\rho$ can be represented by an unbounded quadratic form. Specifically we consider the restriction of $d_\rho$ to histories of order $n$ (which, slightly abusing the notation, we also denote by $d_\rho$) and call the resulting functional the decoherence functional of order $n$. In this subsection we shall always assume that the single time Hilbert space $H_s$ is infinite dimensional. We shall see that for each such standard decoherence functional $d_\rho$ of order $n$ on $H = H_s \otimes \cdots \otimes H_s$ there is a Hilbert space H and an operator $R_\rho : B(H_s) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} B(H_s) \to$ H such that
$$d_\rho(p_1 \otimes p_2 \otimes \cdots \otimes p_n, q_1 \otimes q_2 \otimes \cdots \otimes q_n) = \langle R_\rho(p_1 \otimes p_2 \otimes \cdots \otimes p_n), R_\rho(q_1 \otimes q_2 \otimes \cdots \otimes q_n) \rangle.$$
It follows that $d_\rho$ extends to a positive quadratic form $D_\rho$ on $B(H_s) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} B(H_s)$, where
$$D_\rho(v, w) = \langle R_\rho(v), R_\rho(w) \rangle, \qquad (7)$$
for each $v$ and each $w$ in the $n$-fold algebraic tensor product of $B(H_s)$ by itself. We shall see below that when $H_s$ is infinite dimensional, then $D_\rho$ is not bounded and that the map $R_\rho$ is an unbounded operator. However, the representation (7) is a very close analogue of the representations of bounded decoherence functionals obtained in [26]. For, by the results of [26], any bounded decoherence functional on the projections of $H^{\#}$ (where $H^{\#}$ is a Hilbert space with dimension greater than 2) can be extended to a bounded quadratic form which can be expressed as the difference of bounded semi-inner products on $B(H^{\#})$.

Remark. For bounded decoherence functionals, three notions of positivity were distinguished in [27]. Only the strongest of these corresponds to the representing quadratic form being positive. However, all decoherence functionals arising canonically in the history formulation of standard quantum mechanics have this strong positivity property although they fail to be bounded when the single time Hilbert space $H_s$ is infinite dimensional.

Let $\pi : B(H_s)^n \to B(H_s)$ be the product map defined by
$$\pi(x_1, x_2, \dots, x_n) = x_n x_{n-1} \cdots x_1.$$
Then, since $\pi$ is an $n$-linear map, it follows by the basic algebraic properties of tensor products that there exists a unique linear map $\Pi : B(H_s) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} B(H_s) \to B(H_s)$ such that
$$\Pi(x_1 \otimes x_2 \otimes \cdots \otimes x_n) = \pi(x_1, x_2, \dots, x_n) = x_n x_{n-1} \cdots x_1$$
for all $x_1, x_2, \dots, x_n \in B(H_s)$. Let $\phi$ be a normal state on $B(H_s)$. Then, for a unique positive trace class operator $\rho$ with trace one, $\phi(x) = \mathrm{tr}_{H_s}(x\rho)$ for all $x$ in $B(H_s)$. The correspondence $\phi \leftrightarrow \rho$ is a bijection. Then define $D_\phi$ on $B(H_s) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} B(H_s)$ by
$$D_\phi(z, w) = \phi(\Pi(w)^* \Pi(z)).$$
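The definition of $D_\phi$ just given can be exercised in a finite-dimensional toy case. The following is a minimal sketch, assuming $n = 2$, $H_s = \mathbb{C}^2$ and real symmetric matrices (so the adjoint is the transpose). Elements of the algebraic tensor product are represented as lists of weighted simple tensors; this representation and all function names are illustrative assumptions, not notation from the text.

```python
# Toy check of D_phi(z, w) = phi(Pi(w)^* Pi(z)) on C^2 with n = 2 and phi(x) = tr(x rho).

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def product_map(z):
    """Pi(z) for z = sum_k c_k (x1_k ⊗ x2_k), passed as a list of (c, x1, x2); returns sum_k c_k x2_k x1_k."""
    size = len(z[0][1])
    total = [[0.0] * size for _ in range(size)]
    for c, x1, x2 in z:
        term = matmul(x2, x1)
        total = [[total[i][j] + c * term[i][j] for j in range(size)] for i in range(size)]
    return total

def d_phi(z, w, rho):
    """D_phi(z, w) = tr(Pi(w)^T Pi(z) rho); real matrices, so the adjoint is the transpose."""
    return trace(matmul(matmul(transpose(product_map(w)), product_map(z)), rho))

rho = [[0.75, 0.25], [0.25, 0.25]]   # positive, trace one
p1 = [[1.0, 0.0], [0.0, 0.0]]
p2 = [[0.5, 0.5], [0.5, 0.5]]
q1 = [[0.5, -0.5], [-0.5, 0.5]]
q2 = [[0.0, 0.0], [0.0, 1.0]]

# D_phi on simple tensors versus the Eq. (1) form tr(p2 p1 rho q1 q2).
lhs = d_phi([(1.0, p1, p2)], [(1.0, q1, q2)], rho)
rhs = trace(matmul(matmul(matmul(matmul(p2, p1), rho), q1), q2))

# Diagonal value on a non-simple element of the algebraic tensor product.
z = [(1.0, p1, p2), (-2.0, q1, q2)]
diag = d_phi(z, z, rho)
print(lhs, rhs, diag)
```

The agreement of the two values reflects cyclicity of the trace, and the diagonal value is non-negative because it has the form $\mathrm{tr}(A^{*} A \rho)$ with $\rho \geq 0$. In finite dimensions everything is bounded; the sketch only illustrates the algebra, not the unboundedness that arises when $H_s$ is infinite dimensional.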
Let $p_1, p_2, \dots, p_n$ and $q_1, q_2, \dots, q_n$ be sequences of projections in $B(H_s)$. Then
$$D_\phi(p_1 \otimes p_2 \otimes \cdots \otimes p_n, q_1 \otimes q_2 \otimes \cdots \otimes q_n) = \phi(q_1 q_2 \cdots q_n p_n p_{n-1} \cdots p_1) = \mathrm{tr}_{H_s}(q_1 q_2 \cdots q_n p_n p_{n-1} \cdots p_1 \rho).$$
But, by Proposition 5.2.2 of [15], $\mathrm{tr}(ab\rho) = \mathrm{tr}(b\rho a)$, so $D_\phi$ extends the standard decoherence functional $d_\rho$ arising from $\rho$. For any $z$ in $B(H_s) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} B(H_s)$,
$$D_\phi(z, z) = \phi(\Pi(z)^* \Pi(z)) \geq 0.$$
So $D_\phi$ is a semi-inner product on $B(H_s) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} B(H_s)$. Let $N_\phi = \{z : \phi(\Pi(z)^* \Pi(z)) = 0\}$. It follows from the Cauchy–Schwarz inequality that $N_\phi$ is a vector subspace of $B(H_s) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} B(H_s)$. Let $R_\rho$ be the quotient map from $B(H_s) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} B(H_s)$ onto $(B(H_s) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} B(H_s))/N_\phi$. Then $D_\phi$ induces an inner product $\langle \cdot, \cdot \rangle$ on the quotient such that $\langle R_\rho(z), R_\rho(w) \rangle = D_\phi(z, w)$. Thus $(B(H_s) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} B(H_s))/N_\phi$ is a pre-Hilbert space. Let H be the Hilbert space obtained by completing this pre-Hilbert space. We have proved:

Theorem 5.2. Given a standard decoherence functional $d_\rho$ of order $n$ on $H = H_s \otimes H_s \otimes \cdots \otimes H_s$, there exists a Hilbert space H and a linear operator $R_\rho$ from $B(H_s) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} B(H_s)$ onto a dense subspace of H such that
$$d_\rho(p_1 \otimes p_2 \otimes \cdots \otimes p_n, q_1 \otimes q_2 \otimes \cdots \otimes q_n) = \langle R_\rho(p_1 \otimes p_2 \otimes \cdots \otimes p_n), R_\rho(q_1 \otimes q_2 \otimes \cdots \otimes q_n) \rangle$$
for arbitrary projections $p_1, p_2, \dots, p_n, q_1, q_2, \dots, q_n$ in $B(H_s)$. Hence $d_\rho$ extends to a positive quadratic form $D_\rho$ on $B(H_s) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} B(H_s)$.

The following proposition shows that $D_\rho = D_\phi$ is unique.

Proposition 5.3. Let $Q$ be a sesquilinear form on $B(H_s) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} B(H_s)$ such that
$$Q(p_1 \otimes p_2 \otimes \cdots \otimes p_n, q_1 \otimes q_2 \otimes \cdots \otimes q_n) = d_\rho(p_1 \otimes p_2 \otimes \cdots \otimes p_n, q_1 \otimes q_2 \otimes \cdots \otimes q_n)$$
for all projections $p_1, p_2, \dots, p_n$ and $q_1, q_2, \dots, q_n$ in $B(H_s)$. Also let there exist a constant $C$ such that
$$|Q(x_1 \otimes x_2 \otimes \cdots \otimes x_n, y_1 \otimes y_2 \otimes \cdots \otimes y_n)| \leq C \|x_1\| \|x_2\| \cdots \|x_n\| \|y_1\| \|y_2\| \cdots \|y_n\|.$$
Then $Q(u, v) = \langle R_\rho(u), R_\rho(v) \rangle = D_\rho(u, v)$ for each $u$ and $v$ in $B(H_s) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} B(H_s)$.
Proof. Let us define a $2n$-linear form $L$ on $B(H_s)$ by
$$L(x_1, x_2, \dots, x_n, y_1, y_2, \dots, y_n) = Q(x_1 \otimes x_2 \otimes \cdots \otimes x_n, y_1^* \otimes y_2^* \otimes \cdots \otimes y_n^*).$$
Then $L$ is bounded. Let $p_1, p_2, \dots, p_n$ and $q_1, q_2, \dots, q_n$ be sequences of projections in $B(H_s)$. Then
$$L(p_1, p_2, \dots, p_n, q_1, q_2, \dots, q_n) = \phi(q_1 q_2 \cdots q_n p_n p_{n-1} \cdots p_1).$$
It follows from the boundedness of $L$ and spectral theory that
$$L(x_1, x_2, \dots, x_n, y_1, y_2, \dots, y_n) = \phi(y_1 y_2 \cdots y_n x_n x_{n-1} \cdots x_1)$$
for all self-adjoint $x_1, x_2, \dots, x_n, y_1, y_2, \dots, y_n$. It then follows from multilinearity that this identity remains valid when $x_1, x_2, \dots, x_n, y_1, y_2, \dots, y_n$ are arbitrary elements of $B(H_s)$. Thus
$$Q(x_1 \otimes x_2 \otimes \cdots \otimes x_n, y_1 \otimes y_2 \otimes \cdots \otimes y_n) = L(x_1, x_2, \dots, x_n, y_1^*, y_2^*, \dots, y_n^*)$$
$$= \phi(\pi(y_1, y_2, \dots, y_n)^* \pi(x_1, x_2, \dots, x_n))$$
$$= \phi(\Pi(y_1 \otimes y_2 \otimes \cdots \otimes y_n)^* \Pi(x_1 \otimes x_2 \otimes \cdots \otimes x_n))$$
$$= \langle R_\rho(x_1 \otimes x_2 \otimes \cdots \otimes x_n), R_\rho(y_1 \otimes y_2 \otimes \cdots \otimes y_n) \rangle.$$
Since each element of the algebraic tensor product $B(H_s) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} B(H_s)$ is a finite linear combination of simple tensors, $Q$ is of the required form. □

The following proposition sheds further light on the unboundedness results of Sect. 5.1. In its statement we shall take $n = 2$ to simplify the notation but the result holds whenever $n \geq 2$. Fix $\xi$, a unit vector in $H_s$. Let $\phi_\xi(x) = \langle x\xi, \xi \rangle$ for each $x$ in $B(H_s)$. Let $D_\xi$ be constructed from $\phi_\xi$ as above.

Proposition 5.4. Let $H_s$ be infinite dimensional. The positive quadratic form $D_\xi$ is unbounded on $K(H_s) \otimes_{\mathrm{alg}} K(H_s)$. Hence $D_\xi$ is unbounded on $B(H_s) \otimes_{\mathrm{alg}} B(H_s)$.

Proof. Let us assume that $D_\xi$ is bounded on $K(H_s) \otimes_{\mathrm{alg}} K(H_s)$. For each $z \in K(H_s) \otimes_{\mathrm{alg}} K(H_s)$ let $\delta(z) = D_\xi(z, 1)$. Then $\delta$ is a bounded linear functional on $K(H_s) \otimes_{\mathrm{alg}} K(H_s)$. But $\delta(x \otimes y) = \phi_\xi(yx) = \langle x, y^* \rangle$. So, by Proposition 0 of [28], $\delta$ is unbounded on $K(H_s) \otimes_{\mathrm{alg}} K(H_s)$. This contradiction completes the proof. □

The above proof is valid for standard decoherence functionals corresponding to a vector state but, by a slight modification of Proposition 0 of [28], a similar argument works for any standard decoherence functional $d_\rho$ of order $n \geq 2$, provided $H_s$ is infinite dimensional.

Remark. Proposition 5.4 implies that the quadratic form $D_\xi$ is unbounded with respect to any $C^*$-norm on $B(H_s) \otimes_{\mathrm{alg}} B(H_s)$. This is an immediate consequence of the fact that, by nuclearity, all $C^*$-norms on $B(H_s) \otimes_{\mathrm{alg}} B(H_s)$ coincide on $K(H_s) \otimes_{\mathrm{alg}} K(H_s)$.

Corollary 5.5. The map $\Pi : B(H_s) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} B(H_s) \to B(H_s)$ is unbounded if $H_s$ is infinite dimensional. The map $R_\rho : B(H_s) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} B(H_s) \to$ H is unbounded if $H_s$ is infinite dimensional.
Proof. If $\Pi$ or $R_\rho$ were bounded, then $D_\rho$ would also be bounded. □

Corollary 5.6. Each standard decoherence functional $d_\rho$ of order $n$, corresponding to a positive trace class operator $\rho$, has a unique extension to a positive quadratic form $D_\rho$ on the $n$-fold algebraic tensor product $B(H_s) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} B(H_s)$ such that $D_\rho(x, y) = \mathrm{tr}_{H_s}(\Pi(y)^* \Pi(x) \rho)$. In particular $D_\rho(I, I) = 1$.

5.3. Existence of a generalized ILS-representation for the standard homogeneous decoherence functional in infinite dimensions. In Theorem 4.1 in Sect. 4 we have seen that a general homogeneous decoherence functional has an ILS-representation by a trace class operator if and only if it is tensor bounded. However, we have seen that the decoherence functional $d_\rho$ in standard quantum mechanics associated with the initial state $\rho$ is not even bounded when the single time Hilbert space $H_s$ is infinite dimensional. Therefore from the general results of Sect. 4.1 we cannot infer the existence of an ILS-type representation for the standard decoherence functional. However, the representation of the standard decoherence functional given in Eq. (2) is almost of the required form. In the present subsection we prove the following theorem and corollary (as always we denote the single time Hilbert space by $H_s$).

Theorem 5.7. Let $d_\rho^{\mathrm{hom}}$ be the standard homogeneous decoherence functional of order $n$ in standard quantum mechanics associated with the initial state $\rho$. There exists a unique bounded linear operator $M_\rho$ on $H \otimes H = H_s \otimes \cdots \otimes H_s$ ($2n$ times) such that
$$d_\rho^{\mathrm{hom}}(p_1, \dots, p_n, q_1, \dots, q_n) = \mathrm{tr}_{H \otimes H}((p_1 \otimes \cdots \otimes p_n \otimes q_1 \otimes \cdots \otimes q_n) M_\rho)$$
whenever $p_i$ and $q_i$ are finite rank projections on $H_s$ for all $i$.

Let us recall that Eq. (1) implies that the standard decoherence functional $d_\rho^{\mathrm{hom}}$ is a homogeneous decoherence functional which is completely additive in each of its $2n$ arguments. Thus we have from Theorem 5.7

Corollary 5.8. Let $d_\rho^{\mathrm{hom}}$ be the standard homogeneous decoherence functional of order $n$ in standard quantum mechanics associated with the initial state $\rho$. Then there exists a unique bounded linear operator $M_\rho$ on $H \otimes H = H_s \otimes \cdots \otimes H_s$ ($2n$ times) such that, whenever $P_i$ and $Q_i$ are projections in $P(H_s)$ and $(p_{i,r})$ and $(q_{i,r})$ are, respectively, orthogonal families of finite rank projections with $P_i = \sum_r p_{i,r}$ and $Q_i = \sum_r q_{i,r}$, then
$$d_\rho^{\mathrm{hom}}(P_1, \dots, P_n, Q_1, \dots, Q_n) = \sum_{i_1, j_1} \cdots \sum_{i_n, j_n} \mathrm{tr}_{H \otimes H}((p_{1,i_1} \otimes \cdots \otimes p_{n,i_n} \otimes q_{1,j_1} \otimes \cdots \otimes q_{n,j_n}) M_\rho).$$

Proof. The complete additivity of $d_\rho^{\mathrm{hom}}$ and Theorem 5.7 imply the existence of a unique bounded linear operator $M_\rho$ with the required properties. □

Proof of Theorem 5.7. Let $B_\rho : B(H_s) \times \cdots \times B(H_s) \to \mathbb{C}$ be the unique bounded $2n$-linear form which extends $d_\rho^{\mathrm{hom}}$. Let $\beta_\rho$ be the unique linear functional on $K(H_s) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} K(H_s)$ ($2n$ times)
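such that $\beta_\rho(x_1 \otimes \cdots \otimes x_n \otimes y_1 \otimes \cdots \otimes y_n) = B_\rho(x_1, \dots, x_n, y_1, \dots, y_n)$ for all $x_i, y_i \in K(H_s)$. When $\xi$ is a unit vector in $H_s \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} H_s$ ($2n$ times), then we denote by $p_\xi$ the projection operator onto the subspace of $H_s \otimes \cdots \otimes H_s$ ($2n$ times) spanned by $\xi$. The projection operator $p_\xi$ is in $K(H_s) \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} K(H_s)$ ($2n$ times). Similarly, when $\xi_s$ is a unit vector in $H_s$, then we denote the projection onto the subspace spanned by $\xi_s$ by $p_{\xi_s}$. We shall see that for every positive trace class operator $\rho$ with trace one the standard decoherence functional $d_\rho$ is tracially bounded, i.e., there exists a constant $C$ such that for every unit vector $\xi$ in $H_s \otimes_{\mathrm{alg}} \cdots \otimes_{\mathrm{alg}} H_s$ ($2n$ times)
$$|\beta_\rho(p_\xi)| \leq C.$$
From Equation (2) we know that $\beta_\rho$ can be written as
$$\beta_\rho(P) = \sum_{j_1, \dots, j_{2n} = 1}^{\infty} \omega_{j_1} \langle P \varepsilon_{j_1, \dots, j_{2n}}, \tilde{\varepsilon}_{j_1, \dots, j_{2n}} \rangle.$$
For $\xi, \eta \in H_s \otimes \cdots \otimes H_s$ ($2n$ times) let
$$S_\rho(\xi, \eta) := \sum_{j_1, \dots, j_{2n} = 1}^{\infty} \omega_{j_1} \langle \varepsilon_{j_1, \dots, j_{2n}}, \xi \rangle \langle \eta, \tilde{\varepsilon}_{j_1, \dots, j_{2n}} \rangle.$$
This expression is well defined since the sequences $\{\langle \varepsilon_{j_1, \dots, j_{2n}}, \xi \rangle\}$ and $\{\langle \eta, \tilde{\varepsilon}_{j_1, \dots, j_{2n}} \rangle\}$ are square summable sequences in the Hilbert space $\ell^2(\mathbb{N}^{2n})$. So, by the Cauchy–Schwarz inequality in $\ell^2(\mathbb{N}^{2n})$,
$$|S_\rho(\xi, \eta)| \leq \sum_{j_1, \dots, j_{2n} = 1}^{\infty} |\langle \varepsilon_{j_1, \dots, j_{2n}}, \xi \rangle| \, |\langle \eta, \tilde{\varepsilon}_{j_1, \dots, j_{2n}} \rangle|$$
$$\leq \Bigl( \sum_{j_1, \dots, j_{2n} = 1}^{\infty} |\langle \varepsilon_{j_1, \dots, j_{2n}}, \xi \rangle|^2 \Bigr)^{1/2} \Bigl( \sum_{j_1, \dots, j_{2n} = 1}^{\infty} |\langle \eta, \tilde{\varepsilon}_{j_1, \dots, j_{2n}} \rangle|^2 \Bigr)^{1/2} = \|\xi\| \|\eta\|.$$
So, $S_\rho$ is a bounded sesquilinear form on $H_s \otimes \cdots \otimes H_s$ and thus there exists a bounded linear operator $M_\rho$ on $H_s \otimes \cdots \otimes H_s$ such that $S_\rho(\xi, \eta) = \langle M_\rho \xi, \eta \rangle$ for all $\xi, \eta$ in $H_s \otimes \cdots \otimes H_s$. We note that from
$$\|M_\rho \xi\|^2 = \langle M_\rho \xi, M_\rho \xi \rangle = |S_\rho(\xi, M_\rho \xi)| \leq \|\xi\| \|M_\rho \xi\|$$
it follows that $\|M_\rho\| \leq 1$. For $\xi_1, \dots, \xi_{2n} \in H_s$ let $\xi = \xi_1 \otimes \cdots \otimes \xi_{2n}$. Then $p_\xi = p_{\xi_1} \otimes \cdots \otimes p_{\xi_{2n}}$ and
$$\mathrm{tr}_{H \otimes H}((p_{\xi_1} \otimes \cdots \otimes p_{\xi_{2n}}) M_\rho) = \langle M_\rho \xi, \xi \rangle = \beta_\rho(p_{\xi_1} \otimes \cdots \otimes p_{\xi_{2n}}) = d_\rho(p_{\xi_1}, \dots, p_{\xi_n}, p_{\xi_{n+1}}, \dots, p_{\xi_{2n}}).$$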
Hence, by orthoadditivity,
$$d_\rho(p_1, \dots, p_n, q_1, \dots, q_n) = \mathrm{tr}_{H \otimes H}((p_1 \otimes \cdots \otimes p_n \otimes q_1 \otimes \cdots \otimes q_n) M_\rho)$$
whenever $p_i$ and $q_i$ are finite rank projections on $H_s$ for all $i$. The uniqueness of $M_\rho$ follows from the following lemma.

Lemma 5.9. Let $L$ be a bounded operator on $H_s \otimes \cdots \otimes H_s$ such that, for all $\alpha_i \in H_s$, $\langle L(\alpha_1 \otimes \cdots \otimes \alpha_n), \alpha_1 \otimes \cdots \otimes \alpha_n \rangle = 0$. Then $L = 0$.

Lemma 5.9 can be proved by iterating the argument given in the proof of Lemma 4.1 in [19]. This proves Theorem 5.7 and Corollary 5.8. □

5.4. Discussion. We conclude this section with a discussion of the physical meaning of histories with infinite weight. In the history reformulation of standard quantum mechanics the decoherence functional $d_\rho$ determines the consistent sets of histories. In short, a subset $C$ of $P(K_{t_1, \dots, t_n})$ is called consistent if it is a Boolean lattice with respect to the lattice theoretical operations induced from $P(K_{t_1, \dots, t_n})$ and if $\mathrm{Re}\, d_\rho(h, k) = 0$ for all disjoint $h$ and $k$. In this case $p_\rho : C \to \mathbb{R}$, $p_\rho(h) := d_\rho(h, h)$ defines a probability functional on $C$. Assertions about histories are meaningful only with respect to a consistent set of histories and, for $h \in P(K_{t_1, \dots, t_n})$, $d_\rho(h, h)$ can only be interpreted as the probability of $h$ when explicit reference is made to a fixed consistent set of histories. Isham and Linden [8] have shown that already in the history formulation of standard quantum mechanics over finite dimensional Hilbert spaces the diagonal values of the standard decoherence functional are greater than one for some inhomogeneous histories. In Sect. 5.1 we have shown that for infinite dimensional Hilbert spaces the standard decoherence functional is not even finitely valued on the space of all histories. Clearly, a value $d_\rho(h, h) > 1$ cannot be interpreted as a probability for the inhomogeneous history $h$.
We propose the following physical interpretation of inhomogeneous histories with dρ (h, h) > 1: If dρ (h, h) > 1, then h is a coarse-graining of mutually exclusive histories, whose “space-time” interference (measured by the decoherence functional dρ ) is so large that they cannot be distinguished as separate “events” in space-time. The physical point at stake is that histories, which are disjoint (i.e., which are represented by orthogonal projections) may nevertheless have a large “overlap” (since histories are spread out in time two homogeneous histories can represent exclusive propositions at some time and non-exclusive propositions at another time). Accordingly for a pair of disjoint histories h and k which have a large overlap in this sense, the coarse graining h ∨ k (representing the proposition that the history h or the history k is realized) may not represent a physically sensible proposition. As a consequence of this, all histories with dρ (h, h) > 1 represent unphysical propositions in the state and must be dismissed. The same is true for histories h for which dρ (h, h) is infinite. Such histories represent no greater conceptual problem in this interpretation than histories with 1 < dρ (h, h) < ∞. The axioms characterising a history quantum theory over an orthoalgebra or over an effect algebra are abstracted from the history reformulation of standard quantum mechanics over some Hilbert space. These axioms were first given by Isham [6]. However, in the past it has always been assumed that a decoherence functional is a complex valued
function on pairs of histories. This choice can be motivated by appealing to the history reformulation of standard quantum mechanics over finite-dimensional Hilbert spaces. However, our analysis above of the history reformulation of standard quantum mechanics over infinite dimensional Hilbert spaces has shown that the standard decoherence functional is unbounded and that its extension to the space of all histories would in general take values in the Riemann sphere. Accordingly, one has to expect that also in general history quantum theories the decoherence functional will in general be a function with values in the Riemann sphere (or, equivalently, represented by an unbounded ‘densely’ defined quadratic form). Acknowledgement. Oliver Rudolph is a Marie Curie Research Fellow and carries out his research at Imperial College as part of a European Union training project financed by the European Commission under the Training and Mobility of Researchers (TMR) programme.
References
1. Gell-Mann, M., Hartle, J.B.: Alternative decohering histories in quantum mechanics. In: Phua, K.K., Yamaguchi, Y. (eds.) Proceedings of the 25th international conference on high energy physics. Singapore: August 2–8, 1990, Singapore: World Scientific, 1990, pp. 1303–1310
2. Gell-Mann, M., Hartle, J.B.: Quantum mechanics in the light of quantum cosmology. In: Kobayashi, S., Ezawa, H., Murayama, Y., Nomura, S. (eds.) Proceedings of the third international symposium on the foundations of quantum mechanics in the light of new technology. Tokyo: Physical Society of Japan, 1990, pp. 321–343
3. Gell-Mann, M., Hartle, J.B.: Quantum mechanics in the light of quantum cosmology. In: Zurek, W. (ed.) Complexity, entropy and the physics of information, Santa Fe Institute Studies in the Science of Complexity, Vol. VIII, Reading, MA: Addison-Wesley, 1990, pp. 425–458
4. Gell-Mann, M., Hartle, J.B.: Classical equations for quantum systems. Phys. Rev. D 47, 3345–3382 (1993)
5. Griffiths, R.B.: Consistent histories and the interpretation of quantum mechanics. J. Stat. Phys. 36, 219–272 (1984)
6. Isham, C.J.: Quantum logic and the histories approach to quantum theory. J. Math. Phys. 35, 2157–2185 (1994)
7. Isham, C.J.: Topos theory and consistent histories: The internal logic of the set of all consistent sets. Int. J. Theor. Phys. 36, 785–814 (1997)
8. Isham, C.J., Linden, N.: Quantum temporal logic and decoherence functionals in the histories approach to generalized quantum theory. J. Math. Phys. 35, 5452–5476 (1994)
9. Isham, C.J., Linden, N.: Continuous histories and the history group in generalized quantum theory. J. Math. Phys. 36, 5392–5408 (1995)
10. Isham, C.J., Linden, N.: Information entropy and the space of decoherence functions in generalized quantum theory. Phys. Rev. A 55, 4030–4040 (1997)
11. Isham, C.J., Linden, N., Schreckenberg, S.: The classification of decoherence functionals: An analogue of Gleason’s theorem. J. Math. Phys. 35, 6360–6370 (1994)
12. Isham, C.J., Linden, N., Savvidou, K., Schreckenberg, S.: Continuous time and consistent histories. J. Math. Phys. 39, 1818–1834 (1998)
13. Kadison, R.V., Ringrose, J.R.: Fundamentals of the theory of operator algebras I & II. Orlando: Academic 1983 & 1986
14. Omnès, R.: The interpretation of quantum mechanics. Princeton, NJ: Princeton University Press, 1994
15. Pedersen, G.K.: C*-algebras and their automorphism groups. London: Academic Press 1979
16. Pulmannová, S.: Difference posets and the histories approach to quantum theories. Int. J. Theor. Phys. 34, 189–210 (1995)
17. Rudolph, O.: Consistent histories and operational quantum theory. Int. J. Theor. Phys. 35, 1581–1636 (1996)
18. Rudolph, O.: On the consistent effect histories approach to quantum mechanics. J. Math. Phys. 37, 5368–5379 (1996)
19. Rudolph, O., Wright, J.D.M.: On tracial operator representations of quantum decoherence functionals. J. Math. Phys. 38, 5643–5652 (1997)
20. Rudolph, O., Wright, J.D.M.: The multi-form generalized Gleason theorem. Commun. Math. Phys. 198, 705–709 (1998)
21. Schreckenberg, S.: Symmetry and history quantum theory: An analogue to Wigner’s theorem. J. Math. Phys. 37, 6086–6105 (1996)
22. Schreckenberg, S.: Symmetries of decoherence functionals. J. Math. Phys. 38, 759–769 (1997)
23. Schreckenberg, S.: Completeness of decoherence functionals. J. Math. Phys. 36, 4735–4742 (1995)
24. Wegge-Olsen, N.E.: K-Theory and C*-algebras. Oxford: Oxford University Press, 1993
25. Wigner, E.P.: The problem of measurement. Am. J. Phys. 31, 6–15 (1963)
26. Wright, J.D.M.: The structure of decoherence functionals for von Neumann quantum histories. J. Math. Phys. 36, 5409–5413 (1995)
27. Wright, J.D.M.: Decoherence functionals for von Neumann quantum histories: Boundedness and countable additivity. Commun. Math. Phys. 191, 493–500 (1998)
28. Wright, J.D.M.: Linear representations of bilinear forms on operator algebras. Expos. Math. 16, 75–84 (1998)
29. Wright, J.D.M.: Quantum decoherence functionals and positivity. Atti Sem. Mat. Fis. Univ. Modena XLVII, 137–144 (1999)

Communicated by H. Araki
Commun. Math. Phys. 204, 269 – 312 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Interface, Surface Tension and Reentrant Pinning Transition in the 2D Ising Model

C.-E. Pfister (1), Y. Velenik (2)

(1) Département de Mathématiques, EPF-L, CH-1015 Lausanne, Switzerland. E-mail: [email protected]
(2) Fachbereich Mathematik, Sekr. MA 7-4, TU-Berlin, Strasse des 17. Juni 136, D-10623 Berlin, Germany. E-mail: [email protected]

Received: 28 January 1998 / Accepted: 22 December 1998
Abstract: We develop a new way to look at the high-temperature representation of the Ising model up to the critical temperature and obtain a number of interesting consequences. In the two-dimensional case, it is possible to use these tools to prove results on phase-separation lines in the whole phase-coexistence regime, by way of a duality transformation. We illustrate the power of these techniques by studying an Ising model with a boundary magnetic field, in which a reentrant pinning transition takes place; more precisely we show that the typical configurations of the model can be described, at the macroscopic level, by interfaces which are solutions of the corresponding thermodynamic variational problem; this variational problem is solved explicitly. There exist values of the boundary magnetic field and temperatures $0 < T_1 < T_2 < T_c$ such that the interface is not pinned for $T < T_1$ or $T > T_2$, but is pinned for $T_1 < T < T_2$; we can also find values of the boundary magnetic field and temperatures $0 < T_1 < T_2 < T_3 < T_c$ such that for $T < T_1$ or $T_2 < T < T_3$ the interface is pinned, while for $T_1 < T < T_2$ or $T > T_3$ it is not pinned. An important property of the surface tension which is used in this paper is the sharp triangle inequality, about which we report some new results. The techniques used in this work are robust and can be used in a variety of different situations.

1. Introduction

Let us consider a 2D Ising model in some rectangular box with boundary conditions imposing the presence of a phase-separation line crossing the box from one fixed point of a vertical side to another fixed point of the other vertical side. We suppose that the model is in the phase-coexistence region; the boundary conditions are chosen so that above the phase-separation line we have the + phase and below it the − phase. The bottom horizontal side of the box, which we call the wall, is subject to a negative boundary magnetic field.
By varying the temperature or the boundary magnetic field one can observe an interfacial pinning-depinning or critical wetting transition as established by Abraham
C.-E. Pfister, Y. Velenik
[A1]. In [A1], however, this surface phase transition was called a “roughening transition” (although the analysis demonstrated the depinning character); further comments are made in Sect. 2.3.2 in connection with the work of McCoy and Wu who observed a related “boundary hysteresis” (see Chapters VI and XIII in [MW]). We now describe the pinning-depinning transition at the macroscopic level. For values of these parameters for which the + phase wets partially the wall, and under appropriate geometrical conditions, the equilibrium shape of the interface changes from a straight line crossing the box to a broken line touching a macroscopic part of the wall. Moreover, we show in this paper that there exist values of the boundary magnetic field and temperatures 0 < T1 < T2 < Tc such that the interface is not pinned for T < T1 or T > T2 and pinned for T1 < T < T2 ; we can also find values of the boundary magnetic field and temperatures 0 < T1 < T2 < T3 < Tc such that for T < T1 or T2 < T < T3 the interface is pinned, while for T1 < T < T2 or T > T3 it is not pinned. These reentrant pinning-depinning transitions are predicted by a macroscopic variational problem for the interface, which is formulated in terms of the surface tension and wall free energies of the model. One of the main results of the paper is the derivation of this macroscopic theory starting from the Boltzmann formula defining the equilibrium states of the model at the microscopic level. It is important to distinguish different length-scales. To do so we use two different words, “interface” and “phase-separation line”. We use the word “interface” to denote the boundary at the macroscopic scale between the two phases. At this scale the boundary is fixed (nonfluctuating). The fundamental thermodynamical function associated with an interface is the surface tension, which is non-zero below the critical temperature. (In [ABCP] similar ideas are developed). 
By contrast, the “phase-separation line” is a stochastic line whose probability distribution is determined by the Gibbs measure; it describes the boundary between the two phases at the lattice spacing scale. In this respect it is very interesting to read the introduction of [T], where Talagrand develops a similar analysis of the Law of Large Numbers for independent random variables. On the conceptual level one point of our paper is to show that the theory of the Gibbs states for the infinite volume model is inadequate for discussing some macroscopic properties of the model. The famous theorem, which states that all Gibbs states are translation-invariant for the 2D Ising model [Az1,Hi1], is not pertinent when we study the model at scales Lα , α > 1/2, L being the linear size of the box containing the system. There are non-translation invariant states at that scale, with well-defined (fixed) interfaces! Let us illustrate this point by considering the so-called ± boundary conditions, which corresponds to a special case of the present paper, where the phase-separation line goes from the middle of a vertical side of the box to the middle of the other vertical side. The definition of the phase separation line in [BLP1] coincides with the one of Gallavotti in his work [G] about the phase separation in the 2D Ising model; it differs slightly from the one used here, but in no essential way. (Notice that the terminology “interface” is sometimes used for “phase-separation line” in [BLP1].) There are three natural scales in the study of the phase-separation line, which have been first studied by Abraham and Reed [AR] in a non-perturbative manner. At the scale of the lattice spacing the phase-separation line is a stochastic geometrical line, which has well-defined properties, which depend strongly on the microscopic interaction [BLP1]. Its middle point has fluctuations typically of the order O(L1/2 ), L being the linear size of the box 3L containing the system [G,AR]. 
Because of these fluctuations the projection of the corresponding limiting Gibbs state, at the middle of the box, when L → ∞, is translation invariant [G]; in particular the magnetization
Reentrant Pinning Transition
(at the middle of the box) is zero. If we scale the lengths vertically by $(1/L)^{1/2}$ and horizontally by $1/L$, then in the limit $L \to \infty$ the phase-separation line converges to a Brownian bridge [Hi2,DH,D]. The magnetization profile on that scale has been computed in [AR]. At that intermediate scale the phase-separation line is still stochastic, but its properties show some universal features (Central Limit Theorem). However, at a scale of order $O(L^\alpha)$, $\alpha > 1/2$, the system has a well-defined fixed horizontal interface and a deterministic macroscopic magnetization profile [AR]. To describe the system at the scale $O(L^\alpha)$ we partition the box $\Lambda_L$ into square boxes $C_i$ of linear size $O(L^\alpha)$; the state of the system in each of these boxes is specified by the empirical magnetization $|C_i|^{-1} \sum_{t \in C_i} \sigma(t)$. Then we rescale all lengths by $1/L$ in order to get a measure for these normalized block-spins in the fixed (macroscopic) box $Q$. When $L \to \infty$ these measures converge to a deterministic macroscopic magnetization profile showing a well-defined horizontal interface separating the two phases of the model, characterized by a value $\pm m^*$ of the normalized block-spins, $m^*$ being the spontaneous magnetization of the model. This coarse-grained description of the equilibrium state at the thermodynamic limit is in sharp contrast with the above mentioned result implying that the equilibrium state converges to a translation invariant measure at the thermodynamic limit. These two limits are related to properties of the model at two different scales, the lattice spacing scale and the macroscopic one. We outline the content of the paper. In Sect. 2 we recall the definitions and some properties of phase-separation line, duality, surface tension and wall free energy. We give here no proof.
By duality the statistical properties of the phase-separation line at $\beta > \beta_c$ between two distant but fixed points, say $t$ and $t'$, are (essentially) the same as the statistical properties of the high-temperature contour $\lambda$ in the random-line representation (1.1) of the two-point correlation function at $\beta^* < \beta_c$,
\[ \langle \sigma(t)\sigma(t') \rangle(\beta^*) = \sum_{\lambda : t \to t'} q(\lambda). \tag{1.1} \]
In (1.1) $\lambda$ is an open contour of the high-temperature representation with end-points $t$ and $t'$; $q(\lambda)$ is the weight of the contour $\lambda$; $q(\lambda)$ depends of course on $\beta^*$. We can also interpret $\lambda$ as the part of the phase-separation line going from $t$ to $t'$ and $q(\lambda)$ is the weight at $\beta$ of that part of the phase-separation line. The sum over $\lambda$ in (1.1) is the partition function of the ensemble of stochastic lines $\lambda$ from $t$ to $t'$. We exploit the fact that this partition function is equal to $\langle \sigma(t)\sigma(t') \rangle(\beta^*)$; consequently we have a good control of this sum since we can use information, either from explicit computations or from correlation inequalities, available for the two-point correlation function. Our (working) definition of the surface tension of an interface described at the macroscopic level by a line passing through $t$ and $t'$, perpendicular to the direction $n$, is the thermodynamical function corresponding to this ensemble of stochastic lines, that is
\[ \hat\tau(n;\beta) := -\lim_{\substack{k \to \infty \\ k \in \mathbb{N}}} \frac{1}{k\|t - t'\|} \ln \sum_{\lambda : kt \to kt'} q(\lambda). \tag{1.2} \]
This is exactly the quantity, which enters in the macroscopic variational problem. (The same point of view is taken in [Pf2] and [PV1] in connection with the Wulff shape.) On the technical side this definition is much simpler to use in our problem than the previous definitions considered in the literature [A2]. The fact that this definition coincides with previous definitions considered in the literature is not trivial (Proposition 2.2). The “physical” reason, why Proposition 2.2 is true, is that the walls of the box are in the
complete wetting regime; see Sect. 7 where the dual question, equality between the short and long correlation lengths, is considered. In Sect. 3 we define precisely the main problem, which we address. In this section we give some references to earlier works. We formulate the problem from the microscopic viewpoint, but then discuss it from the macroscopic viewpoint in Sects. 4 and 5. The main physical results are contained in Sect. 5, in which we prove that the physical situations described at the beginning of this introduction take place. These two sections dealing with the macroscopic theory are formulated in terms of surface tension and wall free energy. We use known results, mostly coming from explicit computations. In fact, we do not know how to predict the reentrant phenomena, which we display, without knowing explicitly the values of the surface tension and the wall free energy. In the second part of the paper we derive the macroscopic theory starting from the microscopic Hamiltonian by analysing the typical configurations. Our starting point is a new way of dealing with the high-temperature representation of the model, which has been developed recently in [PV1]. Although different, our approach is similar in some respect to the random current representation of the Ising model of Aizenman [Az2]. This method is exposed in Sect. 6; it is the core of the paper. The method is not restricted to dimension two. Except for two proofs, which can be read in [PV1], the method is developed from scratch, with new proofs and new results. This section has its own interest and can be read independently. The major results are concentration results for the random-line representation (1.1) of the two-point correlation function above the critical temperature when D = 2. Let S(t, t 0 ; C ln kt − t 0 k) := { x ∈ Z2 : kt − xk + kt 0 − xk ≤ kt − t 0 k + C ln kt − t 0 k }. 
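(1.3)
There exists $C$, large enough, so that the stochastic lines contributing to the two-point function $\langle \sigma(t)\sigma(t') \rangle(\beta^*)$ are those contained inside the ellipse (1.3); more precisely, if $C$ is large enough, by Lemma 6.10,
\[ \lim_{\|t-t'\| \to \infty} \frac{\displaystyle\sum_{\substack{\lambda : t \to t' \\ \lambda \not\subset S(t,t';C\ln\|t-t'\|)}} q(\lambda)}{\displaystyle\sum_{\lambda : t \to t'} q(\lambda)} = 0. \tag{1.4} \]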
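As an illustration of the set $S(t,t';C\ln\|t-t'\|)$ of (1.3): it is the region bounded by an ellipse with foci $t$ and $t'$. Writing $d = \|t-t'\|$ and $\rho = C \ln d$, the semi-major axis is $(d+\rho)/2$, so the semi-minor axis is $\frac12\sqrt{2d\rho+\rho^2}$, which is of order $(d \ln d)^{1/2}$. The Python sketch below tests membership and evaluates this half-width. The function names `in_S` and `half_width` are ours, and the value $C = 1$ in the usage lines is an arbitrary choice for illustration, not the constant of Lemma 6.10.

```python
import math

def in_S(x, t, t_prime, C):
    """Membership test for S(t, t'; C ln||t - t'||) as defined in (1.3)."""
    d = math.dist(t, t_prime)
    return math.dist(t, x) + math.dist(t_prime, x) <= d + C * math.log(d)

def half_width(t, t_prime, C):
    """Semi-minor axis of the ellipse bounding S: (1/2) sqrt(2 d rho + rho^2)."""
    d = math.dist(t, t_prime)
    rho = C * math.log(d)
    return 0.5 * math.sqrt(2.0 * d * rho + rho * rho)

# Illustrative values only (C = 1 is an assumption, not taken from the paper).
t, t_prime = (0, 0), (100, 0)
print(half_width(t, t_prime, 1.0))
print(in_S((50, 15), t, t_prime, 1.0), in_S((50, 16), t, t_prime, 1.0))
```

Points on the perpendicular bisector of $t$ and $t'$ belong to $S$ exactly up to the height given by `half_width`.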
This result is sharp, since the width of the ellipse is $O\big((\|t-t'\| \ln \|t-t'\|)^{1/2}\big)$. Thus the lines $\lambda$ contributing to (1.1) are concentrated in a region whose size is essentially that of the normal fluctuations of a random walk going from $t$ to $t'$. When the model is defined on the half-infinite lattice $\mathbb{L} := \{ x \in \mathbb{Z}^2 : x_2 \ge 0 \}$ we have a random-line representation similar to (1.1) for the boundary two-point function (Lemma 6.13). There are two regimes, depending on the value of the boundary magnetic field. If the boundary coupling constant $h^*$, dual to the boundary magnetic field $h$, is not too large, then the concentration result is as above; in that case we know that the line $\lambda$ undergoes an entropic repulsion from the boundary of $\mathbb{L}$. On the other hand, if the coupling constant $h^*$ is high enough, then the line $\lambda$ sticks to the boundary of $\mathbb{L}$. We show that the lines $\lambda$ contributing to the boundary two-point correlation function, when $\|t-t'\| = |t_1 - t'_1|$ tends to infinity, are those contained in a rectangle ($t_1 < t'_1$)
\[ B(t,t';\rho) := \{ x \in \mathbb{L} : x_1 \in [t_1 - \rho, t'_1 + \rho],\ 0 \le x_2 \le \rho \}, \qquad \rho = C \ln |t_1 - t'_1|. \tag{1.5} \]
Again, this result is optimal. We stress that the only condition about the temperature is T > Tc . We give a first application of the results of Sect. 6 in Sect. 7. This section also contains one of the main estimates, a lower bound for the two-point correlation function in a finite box in terms of the two-point correlation function of the infinite system. This bound is essential for Sect. 8. We show that the pinning transition below Tc has a dual interpretation above Tc ; although there is a unique Gibbs state at the thermodynamical limit, we may have surface effects. Inspired by [SML] we introduce the notions of short correlation length and long correlation length. We prove that these two notions do not necessarily coincide. They differ when at the dual temperature the interface is pinned. The relevance of these results for the surface tension at the dual temperature is discussed at the beginning of Sect. 2.3. In Sect. 8 we justify the macroscopic theory of Sect. 4 starting from the microscopic theory, and we show how the interface emerges in the statistical description of the model, as a deterministic object in a coarse-grained description of the microscopic configurations. We add one appendix, Sect. 9, where we show that our method is robust. We apply it to a generic case with N interfaces. In this paper we derive results by very different technical tools like exact computation, correlation inequalities and high-temperature representation. We can treat mathematically various interesting physical situations for the 2D Ising model. Each of the approaches just mentioned has its own strengths and weaknesses. It is certainly advantageous to combine these methods as we do in this paper. It is evident that the method of the high-temperature representation, combined with duality, is appropriate for studying interfaces for the 2D Ising model at a scale Lα , α > 1/2. 
On the other hand we also show that we need few, but very precise results about specific quantities, like two-point correlation function, surface tension, wall free energy, values of the boundary magnetic field where the wetting transition takes place. These results depend on finer properties of the model at scales Lα , α ≤ 1/2. Here exact computations are appropriate; moreover, some of these results can be obtained only by exact computations. [A2] is a good review about exact results on interface problems in general. We also mention the work by Fisher [F] where deep insight about wetting and pinning problems and other phenomena in 2D is provided by analysing these questions in terms of random walks. Some of the results presented here are taken from [V] (see Chapter 6). 2. Definitions and Notations We introduce the notation used in the paper, which follows essentially that of [PV1]. We recall the notions of duality, phase-separation line, surface tension and wall free energy. We also state some fundamental estimates for the two-point correlation function of the model. A large part of this material is standard; references are given in the text. Throughout the paper we use the following convention: O(x) denotes a non-negative function of x ∈ R+ , such that there exists a constant C with O(x) ≤ Cx; the function O(x) may be different at different places. 2.1. Phase-separation line. As explained in the introduction, we study some macroscopic features of the 2D Ising model starting from the microscopic description of the model. It is therefore natural to start by fixing some macroscopic box Q ⊂ R2 , which we choose in an asymmetric way for latter purposes, Q := { x = (x1 , x2 ) ∈ R2 : |x1 | ≤ 1 , 0 ≤ x2 ≤ 2 }.
(2.1)
Let $L$ be an integer and $\Lambda_L \subset \mathbb{Z}^2$,
\[ \Lambda_L := \{ x = (x_1,x_2) \in \mathbb{Z}^2 : |x_1| \le L,\ 0 \le x_2 \le 2L \}. \tag{2.2} \]
Notice that after scaling by $1/L$, $\Lambda_L \subset Q$. Spin configurations are denoted by $\omega \in \{-1,+1\}^{\Lambda_L}$; the spin variable at $x \in \mathbb{Z}^2$ is $\sigma(x)$, $\sigma(x)(\omega) = \omega(x) = \pm 1$. Phase-separation lines are stochastic lines (see below), whose positions are fixed on the boundary of $\Lambda_L$ by boundary conditions. The boundary $\partial\Lambda_L$ of $\Lambda_L$ is the subset
\[ \partial\Lambda_L := \{ x \in \Lambda_L : \exists\, y \notin \Lambda_L,\ \max_{i=1,2} |y_i - x_i| = 1 \}. \tag{2.3} \]
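Definitions (2.2) and (2.3) can be checked on a small box. The following Python sketch enumerates the box and its boundary directly from the definitions; the function names `box` and `boundary` are hypothetical and serve only this illustration.

```python
def box(L):
    """Lattice box of (2.2): |x1| <= L, 0 <= x2 <= 2L."""
    return {(x1, x2) for x1 in range(-L, L + 1) for x2 in range(0, 2 * L + 1)}

def boundary(L):
    """Boundary of (2.3): points of the box with a max-norm neighbour outside it."""
    pts = box(L)
    steps = [(d1, d2) for d1 in (-1, 0, 1) for d2 in (-1, 0, 1) if (d1, d2) != (0, 0)]
    return {x for x in pts
            if any((x[0] + d1, x[1] + d2) not in pts for d1, d2 in steps)}

# The box has (2L+1)^2 sites; its boundary is the outer layer of 8L sites.
print(len(box(2)), len(boundary(2)))
```

The bottom row $x_2 = 0$, which plays the role of the wall later on, is part of this boundary.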
Boundary conditions (b.c.) for $\Lambda_L$ consist in prescribing the value of the spin at each $x \in \partial\Lambda_L$. For example, the $-$ b.c. means that $\omega(x) = -1$ $\forall x \in \partial\Lambda_L$. In the general case boundary conditions are specified by $\eta \in \{-1,+1\}^{\partial\Lambda_L}$, so that for all configurations $\omega$, $\omega(x) := \eta(x)$ $\forall x \in \partial\Lambda_L$; we refer to that boundary condition as the $\eta$ b.c. Free boundary conditions means absence of boundary conditions. The Hamiltonian of the model in $\Lambda_L$ with $\eta$ b.c. is the function on $\{-1,+1\}^{\Lambda_L}$
\[ H^\eta_{\Lambda_L}(\omega) := \begin{cases} -\sum_{\langle t,t' \rangle \subset \Lambda_L} J(t,t')\,\sigma(t)(\omega)\,\sigma(t')(\omega) & \text{if } \omega(x) = \eta(x)\ \forall x \in \partial\Lambda_L; \\ +\infty & \text{otherwise.} \end{cases} \tag{2.4} \]
Here $\langle t,t' \rangle$ is the standard notation for a pair of nearest neighbour points of the lattice $\mathbb{Z}^2$, called a bond. The coupling constants $J(t,t')$ are positive; we specify them later on. The Gibbs measure on $\{-1,+1\}^{\Lambda_L}$ with $\eta$ b.c. is
\[ \frac{\exp\{-\beta H^\eta_{\Lambda_L}(\omega)\}}{\Theta^\eta(\Lambda_L)}; \tag{2.5} \]
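$\beta$ is the inverse temperature and $\Theta^\eta(\Lambda_L)$, the partition function, is the normalization constant in (2.5). Expectation values are written $\langle\,\cdot\,\rangle^\eta_{\Lambda_L}$. The dual lattice to $\mathbb{Z}^2$ is
\[ \mathbb{Z}^{2*} := \{ x = (x_1,x_2) \in \mathbb{R}^2 : x + (1/2,1/2) \in \mathbb{Z}^2 \}, \tag{2.6} \]
and the dual box $\Lambda^*_L \subset \mathbb{Z}^{2*}$ is
\[ \Lambda^*_L := \{ x = (x_1,x_2) \in \mathbb{Z}^{2*} : |x_1| \le L - 1/2,\ 1/2 \le x_2 \le 2L - 1/2 \}. \tag{2.7} \]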
Each bond $\langle t,t' \rangle$ defines a unit segment $e(t,t') \subset \mathbb{R}^2$ with end-points $t$, $t'$; to each bond $\langle t,t' \rangle$ such that $\langle t,t' \rangle \cap \Lambda_L \setminus \partial\Lambda_L \ne \emptyset$, there corresponds a unique dual bond $\langle t^*,t'^* \rangle \subset \Lambda^*_L$, which is defined by the condition that $e(t,t') \cap e(t^*,t'^*) \ne \emptyset$. Given boundary conditions $\eta$, each configuration $\omega$, which is compatible with the $\eta$ b.c., can be uniquely specified by giving all segments $e(t,t')$ such that $\sigma(t)(\omega)\sigma(t')(\omega) = -1$ and $\{t,t'\} \cap \Lambda_L \setminus \partial\Lambda_L \ne \emptyset$; this is equivalent to specifying all dual segments $e(t^*,t'^*)$, or the corresponding dual bonds of $\Lambda^*_L$. The union of these dual segments forms a set of lines in $\mathbb{R}^2$, which we decompose into connected components. Whenever $\exists\, t \in \Lambda^*_L$, which belongs to four segments, we apply the deformation rule defined in the picture below, so that each configuration $\omega$, compatible with the $\eta$ b.c., is uniquely specified by a finite set of disjoint simple lines called contours of the configuration.
[Picture: deformation rule at a point of the dual lattice belonging to four segments.]
Let $B$ be a set of dual bonds; the boundary $\delta B$ of $B$ is the set of $x \in \mathbb{Z}^{2*}$ such that there is an odd number of bonds of $B$ adjacent to $x$. $B$ is closed if $\delta B = \emptyset$ and open if $\delta B \ne \emptyset$. The contours of a configuration are either closed, or open with end-points on the boundary of $\Lambda^*_L$. The set $V_L(\eta) \subset \Lambda^*_L$ of the end-points of the open contours is uniquely determined by the $\eta$ b.c.; its cardinality is even if $V_L(\eta) \ne \emptyset$. The set of closed contours is written $\underline{\gamma} = \{\gamma_1, \gamma_2, \ldots\}$ and the set of open contours $\underline{\lambda} = \{\lambda_1, \lambda_2, \ldots\}$. We call the open contours the phase-separation lines of the configuration. Conversely, a family of contours $(\underline{\gamma}', \underline{\lambda}')$ is called $\eta$ compatible if there exists $\omega$ compatible with the $\eta$ b.c. such that $\underline{\gamma}(\omega) = \underline{\gamma}'$ and $\underline{\lambda}(\omega) = \underline{\lambda}'$. The probability of $\underline{\lambda}$ can be computed with the Gibbs measure (2.5). It is however more convenient to introduce a non-normalized measure on the set of phase-separation lines, in order to exploit the duality property of the model. The length $|\gamma|$ of a closed contour $\gamma$ is $\sum_{e \in \gamma} J(e)$. The sum of the lengths of the contours of a family $\underline{\gamma}$ is written $|\underline{\gamma}|$. Similar notations hold for open contours. Next we define two (normalized) partition functions, $Z^\eta(\Lambda_L)$ and $Z^\eta(\Lambda_L|\underline{\lambda})$, where $\underline{\lambda}$ and $\eta$ are compatible,
\[ Z^\eta(\Lambda_L) := \sum_{\omega:\ \eta\ \mathrm{comp.}} \exp\{-2\beta|\underline{\gamma}(\omega)|\} \exp\{-2\beta|\underline{\lambda}(\omega)|\}; \tag{2.8} \]
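and
\[ Z^\eta(\Lambda_L|\underline{\lambda}) := \sum_{\substack{\omega:\ \eta\ \mathrm{comp.} \\ \underline{\lambda}(\omega) = \underline{\lambda}}} \exp\{-2\beta|\underline{\gamma}(\omega)|\}. \tag{2.9} \]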
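We define a weight $q^\eta_{\Lambda_L}(\underline{\lambda})$ by setting
\[ q^\eta_{\Lambda_L}(\underline{\lambda}) := \begin{cases} \exp\{-2\beta|\underline{\lambda}|\}\, \dfrac{Z^\eta(\Lambda_L|\underline{\lambda})}{Z^-(\Lambda_L)} & \text{if } \underline{\lambda} \text{ and } \eta \text{ are compatible,} \\ 0 & \text{otherwise.} \end{cases} \tag{2.10} \]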
The weight $q^\eta_{\Lambda_L}(\underline{\lambda})$ does not define a probability measure on the set of $\eta$ compatible phase-separation lines, since in (2.10) we divide by $Z^-(\Lambda_L)$ and not $Z^\eta(\Lambda_L)$.
2.2. Duality. A basic property of the 2D Ising model is self-duality. As a consequence of that property many questions about the model below the critical temperature can be translated into dual questions for the dual model above the critical temperature. For example, questions about the surface tension are translated into questions about the correlation length. We define the dual objects to $\Lambda_L$, $\beta$ and $J(t,t')$. The dual box $\Lambda^*_L$ is defined in (2.7). The dual inverse temperature $\beta^*$ is defined by
\[ \tanh \beta^* := \exp\{-2\beta\}. \tag{2.11} \]
We recall that the critical inverse temperature $\beta_c$ of the Ising model with coupling constants $J(t,t') \equiv 1$ is the fixed point of Eq. (2.11). Let $\langle t,t' \rangle$ be a bond of $\mathbb{Z}^2$ and $\langle t^*,t'^* \rangle$ its dual bond; the dual coupling constant $J^*(t^*,t'^*)$ is defined by
\[ \tanh \beta^* J^*(t^*,t'^*) := \exp\{-2\beta J(t,t')\}. \tag{2.12} \]
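The duality relations (2.11) and (2.12) are easy to evaluate numerically. The Python sketch below is only an illustration: the names `dual_beta`, `dual_coupling` and `BETA_C` are ours, and the closed form $\beta_c = \frac12 \ln(1+\sqrt2)$ is the standard value of the fixed point of (2.11), which is not written out in the text above.

```python
import math

# Fixed point of tanh(beta) = exp(-2 beta); standard closed form (assumption
# of this sketch, not stated explicitly in the text).
BETA_C = 0.5 * math.log(1.0 + math.sqrt(2.0))

def dual_beta(beta):
    """Dual inverse temperature from (2.11); requires beta > 0."""
    return math.atanh(math.exp(-2.0 * beta))

def dual_coupling(beta, J):
    """Dual coupling J* from (2.12): tanh(beta* J*) = exp(-2 beta J)."""
    return math.atanh(math.exp(-2.0 * beta * J)) / dual_beta(beta)

beta = 0.7                      # some beta > BETA_C (illustrative value)
print(dual_beta(beta) < BETA_C)  # the dual temperature is above critical
print(dual_coupling(beta, 1.0))  # J = 1 is mapped to J* = 1
print(dual_coupling(beta, 0.5))  # a weak coupling is mapped to a strong one
```

Applying `dual_beta` twice returns the original $\beta$, since (2.11) is equivalent to the symmetric relation $\sinh 2\beta \, \sinh 2\beta^* = 1$.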
Let
\[ H_{\Lambda^*_L} := -\sum_{\langle t,t' \rangle \subset \Lambda^*_L} J^*(t,t')\,\sigma(t)\,\sigma(t') \tag{2.13} \]
be the Hamiltonian in the dual box $\Lambda^*_L$ with free boundary conditions and dual coupling constants. The expectation value with respect to the corresponding Gibbs measure at the dual temperature $\beta^*$ is written $\langle\,\cdot\,\rangle_{\Lambda^*_L}$. A key dual statement is the following one. Let $\underline{\lambda}$ be a family of phase-separation lines, which are $\eta$ compatible with a given $\eta$ b.c. for $\Lambda_L$. Then
\[ \sum_{\underline{\lambda}} q^\eta_{\Lambda_L}(\underline{\lambda}) = \Big\langle \prod_{t \in V_L(\eta)} \sigma(t) \Big\rangle_{\Lambda^*_L}. \tag{2.14} \]
Formula (2.14) is our starting point for analysing the interfaces of the model. It is proven in Sect. 6. In that section we identify the weight $q^\eta_{\Lambda_L}(\underline{\lambda})$ with the weight $q_{\Lambda^*_L}(\underline{\lambda})$ of $\underline{\lambda}$ in the high-temperature representation of the model defined in the dual box $\Lambda^*_L$ with free boundary conditions (see Lemma 6.2).
2.3. Surface tension and wall free energy. We recall the definition of surface tension as given for example in the review paper [A2] formula (2.14a) (see also [Pf1]), since this is the definition which is mostly used. In Sect. II.D of [A2] other definitions of surface tension are reviewed and compared. The heuristic grounds given on p.10 of [A2] (see also note 12 in [Pf1]) lead to a definition of the surface tension as the logarithm of the ratio of two partition functions with different boundary conditions. The results of Sect. 7 show that this may lead to a wrong definition of the surface tension for an Ising model with modified coupling constants on one part of the boundary. The heuristic grounds give a correct definition only if the walls of the box are in the complete wetting regime, a crucial physical condition, which has been so far overlooked in the literature. See Sect. 7 where we consider explicitly the dual question of equivalence of short and long correlation length, but the results apply to the definition of the surface tension. As explained in the Introduction our working definition of the surface tension is different. The fact that we get the same quantity is a consequence of Proposition 2.2. The ultimate justification for the definition of the surface tension is that it should be equal to the quantity, which enters into the formulation of the variational problem describing the behaviour of the interface at the macroscopic level. This is the subject of this paper.
2.3.1. Surface tension. Consider the model defined in $\Lambda'_L$,
\[ \Lambda'_L := \{ x \in \mathbb{Z}^2 : |x_i| \le L,\ i = 1,2 \}, \tag{2.15} \]
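with coupling constants $J(t,t') \equiv 1$ and inverse temperature $\beta$. Let $n \in \mathbb{R}^2$ be a unit vector; denote by $\mathcal{D}_n$ the straight line perpendicular to $n$ and passing through the origin of $\mathbb{R}^2$. The $\eta_n$ b.c. for $\Lambda'_L$ is defined by
\[ \eta_n(x) := \begin{cases} -1 & \text{if } x \in \partial\Lambda'_L \text{ is below or on } \mathcal{D}_n, \\ +1 & \text{if } x \in \partial\Lambda'_L \text{ is above } \mathcal{D}_n. \end{cases} \tag{2.16} \]
Let $D_n$ be the Euclidean length of the segment $\{x \in \mathbb{R}^2 : |x_i| \le 1\} \cap \mathcal{D}_n$. If $\omega$ is compatible with the $\eta_n$ b.c. there is a unique phase-separation line $\lambda(\omega)$. The limit
\[ \hat\tau(n;\beta) := -\lim_{L \to \infty} \frac{1}{L D_n} \ln \frac{Z^{\eta_n}(\Lambda'_L)}{Z^-(\Lambda'_L)} \tag{2.17} \]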
exists and is called the surface tension at inverse temperature $\beta$. By symmetry of the model we have ($n = (n_1,n_2)$)
\[ \hat\tau(n_1,n_2;\beta) = \hat\tau(-n_1,-n_2;\beta) = \hat\tau(n_2,-n_1;\beta) = \hat\tau(n_2,n_1;\beta). \tag{2.18} \]
We extend the definition of $\hat\tau(n;\beta)$ to $\mathbb{R}^2$ by homogeneity ($\|\cdot\|$ is the Euclidean norm),
\[ \hat\tau(x;\beta) := \|x\|\,\hat\tau(x/\|x\|;\beta). \tag{2.19} \]
Proposition 2.1. Let $J(t,t') \equiv 1$. The surface tension is a uniformly Lipschitz convex function on $\mathbb{R}^2$ such that $\hat\tau(x;\beta) = \hat\tau(-x;\beta)$. It is identically zero if $\beta \le \beta_c$, and strictly positive for all $x \ne 0$ if $\beta > \beta_c$. In the latter case $\hat\tau$ defines a norm on $\mathbb{R}^2$. The main property of $\hat\tau$ is the sharp triangle inequality. For all $\beta > \beta_c$ there exists a strictly positive constant $\kappa = \kappa(\beta)$ such that for any $x, y \in \mathbb{R}^2$,
\[ \hat\tau(x;\beta) + \hat\tau(y;\beta) - \hat\tau(x+y;\beta) \ge \kappa\,(\|x\| + \|y\| - \|x+y\|). \tag{2.20} \]
Let $x(\theta) := (\cos\theta, \sin\theta)$ and $\hat\tau(\theta;\beta) := \hat\tau(x(\theta);\beta)$. Then the best constant $\kappa$ is
\[ \kappa := \inf_\theta \Big( \frac{d^2}{d\theta^2}\hat\tau(\theta;\beta) + \hat\tau(\theta;\beta) \Big) > 0. \tag{2.21} \]
The first part of the proposition is proved in [LP] and [Pf2] (Lemma 6.4). The arguments are not restricted to the 2D Ising model. Ioffe [I1] proved an equivalent form of inequality (2.20), but with a non-optimal value of $\kappa$. Inequality (2.20) as stated here first appeared in [V]. The strict positivity of the optimal constant $\kappa$ given in (2.21) follows from the exact expression of $\hat\tau(\theta;\beta)$ [AA]; it is called the positive stiffness property.
Remark. Geometrically (2.21) means that the curvature of the Wulff shape is bounded above by $1/\kappa$. It is well-known that the surface tension is the support function of the Wulff crystal. The following result of Convex Theory is interesting, and appears to be new as far as we know [V]. It characterizes the compact convex bodies $W$ in $\mathbb{R}^2$ which have a support function $\hat\tau$,
\[ \hat\tau(x) := \sup_{y^* \in W} \langle y^*, x \rangle, \tag{2.22} \]
satisfying the sharp triangle inequality
\[ \hat\tau(x) + \hat\tau(y) - \hat\tau(x+y) \ge K'\,(\|x\| + \|y\| - \|x+y\|). \tag{2.23} \]
In (2.22) $\langle \cdot,\cdot \rangle$ is the Euclidean scalar product. No smoothness of the boundary of $W$ is assumed. Let $W_1$ and $W_2$ be two convex bodies; we say that $\partial W_1$ is tangent to $\partial W_2$ at $x^*$ if $W_1$ and $W_2$ have a common support plane at $x^*$. We recall the notion of radius of curvature of $W$ at $x^*$. Let $U$ be an open neighborhood of $x^*$. Let $T_i(x^*,U)$ be the family of discs $D$ with the following properties:
1. $\partial D$ is tangent to $\partial W$ at $x^*$;
2. $W \cap U \supset D \cap U$.
We allow the degenerate cases where the disc is a single point or a half-plane. Consequently $T_i(x^*,U) \ne \emptyset$. We denote by $\rho(D)$ the radius of the disc $D$ and set
\[ \rho(x^*,U) := \sup\{ \rho(D) : D \in T_i(x^*,U) \}. \tag{2.24} \]
Clearly $\rho(x^*,U_1) \le \rho(x^*,U_2)$ if $U_1 \supset U_2$. The lower radius of curvature at $x^*$ is defined as
\[ \rho(x^*) := \sup\{ \rho(x^*,U) : U \text{ open neighborhood of } x^* \}. \tag{2.25} \]
Theorem 2.1. Let $W$ be a convex compact body and $\hat\tau$ be its support function. Then the following statements are equivalent:
1. The lower radius of curvature of $\partial W$ is uniformly bounded below by $K > 0$.
2. There exists a positive constant $K'$ such that for any $x, y \in \mathbb{R}^2$,
\[ \hat\tau(x) + \hat\tau(y) - \hat\tau(x+y) \ge K'\,(\|x\| + \|y\| - \|x+y\|). \tag{2.26} \]
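Two elementary convex bodies illustrate Theorem 2.1; they are our own examples, not taken from the paper. A disc of radius $R$ has support function $R\|x\|$, so (2.26) holds with $K' = R$. The square $[-1,1]^2$ has corners, where only the degenerate one-point disc fits inside, and its support function $|x_1| + |x_2|$ violates (2.26) for every $K' > 0$. The Python sketch below (all names hypothetical) evaluates both sides of the inequality for $x = (1,0)$, $y = (0,1)$.

```python
import math

def support_polygon(vertices, x):
    """Support function (2.22) of a convex polygon: maximum over its vertices."""
    return max(v[0] * x[0] + v[1] * x[1] for v in vertices)

def norm(x):
    """Euclidean norm on R^2."""
    return math.hypot(x[0], x[1])

def deficit(tau, x, y):
    """tau(x) + tau(y) - tau(x + y), the left-hand side of (2.26)."""
    return tau(x) + tau(y) - tau((x[0] + y[0], x[1] + y[1]))

SQUARE = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
R = 2.0  # radius of the disc, an arbitrary illustrative value

def tau_square(x):
    return support_polygon(SQUARE, x)

def tau_disc(x):
    return R * norm(x)

x, y = (1, 0), (0, 1)
print(deficit(norm, x, y))        # Euclidean deficit, strictly positive
print(deficit(tau_disc, x, y))    # R times the Euclidean deficit
print(deficit(tau_square, x, y))  # zero: no K' > 0 can work for the square
```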
There is a well-known dual relation between the surface tension at inverse temperature $\beta$ and the decay-rate of the two-point function at the dual temperature $\beta^*$, which we recall here. Consider the 2D Ising model on the dual lattice, with coupling constants $J^*(t,t') \equiv 1$, inverse temperature $\beta^*$ and free b.c. The two-point function, or covariance, is
\[ \langle \sigma(t)\sigma(t') \rangle(\beta^*), \qquad t, t' \in \mathbb{Z}^{2*}, \tag{2.27} \]
where $\langle\,\cdot\,\rangle(\beta^*)$ denotes expectation value with respect to the infinite volume free b.c. Gibbs measure at inverse temperature $\beta^*$. The decay-rate of the two-point function is defined for all $t, t' \in \mathbb{Z}^{2*}$ as
\[ \tau(t-t';\beta^*) := -\lim_{\substack{k \to \infty \\ k \in \mathbb{N}}} \frac{1}{k} \ln \langle \sigma(kt)\sigma(kt') \rangle(\beta^*). \tag{2.28} \]
Proposition 2.2. Let $J(t,t') \equiv 1$. The surface tension $\hat\tau(x;\beta)$ of the 2D Ising model and the decay-rate $\tau(x;\beta^*)$ are equal,
\[ \hat\tau(x;\beta) = \tau(x;\beta^*) \quad \forall x. \tag{2.29} \]
Remark. Identity (2.29) has been noticed several times; we refer the reader to [ZA] where a brief historical account with references is given at the beginning of their paper. However, a proof of formula (2.29) does not follow from duality only. There is an exchange of limits, which must be justified (see e.g. [BLP2]). We show in Sect. 7 that there are cases where the exchange of limits is not valid and such a relation does not hold.
2.3.2. Wall free energy. There is another thermodynamical quantity, which enters into the description of the properties of the interface, the wall free energy. In the phase-coexistence regime it depends on the nature of the bulk phase. Only the difference of wall free energies when the bulk phase is either the + phase or the − phase has an intrinsic meaning. In order to have interesting surface phenomena we single out one part of the boundary of the box $\Lambda_L$, the bottom part. (This is the reason for our asymmetrical choice of $\Lambda_L$.) We choose here the coupling constants of the model as follows.
\[ J(t,t') := \begin{cases} h > 0 & \text{if } t_2 = 0 \text{ or } t'_2 = 0, \\ 1 & \text{otherwise.} \end{cases} \tag{2.30} \]
We compare the free energy for two different b.c., one being the $-$ b.c. and the other one the $\eta_\pm$ b.c., defined as
\[ \eta_\pm(x) := \begin{cases} -1 & \text{if } x \in \partial\Lambda_L \text{ and } x_2 = 0, \\ 1 & \text{if } x \in \partial\Lambda_L \text{ and } x_2 > 0. \end{cases} \tag{2.31} \]
We set$^1$
\[ \hat\tau_{bd}(\beta,h) := -\lim_{L \to \infty} \frac{1}{2L+1} \ln \frac{Z^{\eta_\pm}(\Lambda_L)}{Z^-(\Lambda_L)}. \tag{2.32} \]
The quantity $\hat\tau_{bd}(\beta,h)$, which gives the difference of two free energies, verifies the fundamental inequalities (2.34) for any $D \ge 2$, [FP1] and [FP2]. Let $n_w := (0,1)$ and set
\[ \hat\tau(\beta) := \hat\tau(n_w;\beta); \tag{2.33} \]
for any $\beta$ and any $h$,
\[ |\hat\tau_{bd}(\beta,h)| \le \hat\tau(\beta). \tag{2.34} \]
If $\beta > \beta_c$ and $h > 0$, then
\[ 0 < \hat\tau_{bd}(\beta,h) \le \hat\tau(\beta). \tag{2.35} \]
Suppose that $\beta > \beta_c$. The difference between the two free energies, per unit length, is interpreted as the free energy, per unit length, of the horizontal interface produced by the boundary condition $\eta_\pm$. If $\hat\tau_{bd}(\beta,h) = \hat\tau(\beta)$, then this free energy is equal to the surface tension of a horizontal interface. This indicates that the interface produced by the $\eta_\pm$ b.c. is not pinned; or in other terms, we have complete wetting of the wall by the − phase. On the other hand, if $\hat\tau_{bd}(\beta,h) < \hat\tau(\beta)$, then this indicates that the interface is pinned, or in other words, that we have partial wetting. What we just described is Cahn's criterion for the wetting transition: when $h > 0$ there is partial wetting of the wall if and only if $\hat\tau_{bd}(\beta,h) < \hat\tau(\beta)$. In terms of Gibbs states one can prove, [FP1] and [FP2], that near the wall all Gibbs states are identical if and only if $|\hat\tau_{bd}(\beta,h)| = \hat\tau(\beta)$. Intuitively this is easy to understand: at the microscopic level the state of the system near the wall is always the state of the − phase near the wall, since the wall is in the complete wetting regime. By contrast, in the partial wetting regime the state near the wall depends on the nature of the bulk phase. The behaviour of Gibbs states near the wall can be distinguished by the order parameter
\[ \lim_{L \to \infty} \langle \sigma(0,1) \rangle^\eta_{\Lambda_L}(\beta,h). \tag{2.36} \]
$^1$ The definition of $\hat\tau_{bd}$ differs from the analogous quantity used in [PV1] or [PV2], because in these papers the reference bulk phase is the + phase and here it is the − phase.
Fröhlich and Pfister in [FP2] proved that there are several Gibbs states near the bottom wall if and only if
\[ \lim_{L \to \infty} \langle \sigma(0,1) \rangle^-_{\Lambda_L}(\beta,h) \ne \lim_{L \to \infty} \langle \sigma(0,1) \rangle^{\eta_\pm}_{\Lambda_L}(\beta,h). \tag{2.37} \]
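This occurs if and only if $h < h_w$, with $h_w = h_w(\beta)$, a temperature dependent coupling, which is defined by (see (2.27) in [FP2])
\[ h_w(\beta) = \inf\{ h \ge 0 : \lim_{L \to \infty} \langle \sigma(0,1) \rangle^-_{\Lambda_L}(\beta,h) = \lim_{L \to \infty} \langle \sigma(0,1) \rangle^{\eta_\pm}_{\Lambda_L}(\beta,h) \}. \tag{2.38} \]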
Using the results of Fröhlich and Pfister [FP1] and [FP2], and those of Pfister and Penrose [PP] one can show that the surface magnetizations computed by McCoy and Wu (see Chapter VI in [MW]) can be identified with the above quantities $\lim_{L \to \infty} \langle \sigma(0,1) \rangle^-_{\Lambda_L}(\beta,h)$ and $\lim_{L \to \infty} \langle \sigma(0,1) \rangle^{\eta_\pm}_{\Lambda_L}(\beta,h)$.
Therefore $h_w$ can be computed from their work, $h_w$ being given by formula (5.44), p.137 of [MW]; it is not difficult to show that an equivalent form of this expression is (2.39), which is the formula given by Abraham for the value of $h_w$, where the pinning-depinning transition occurs,
\[ \exp\{2\beta\}\{\cosh 2\beta - \cosh 2\beta h_w(\beta)\} = \sinh 2\beta. \tag{2.39} \]
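Formula (2.39) can be solved explicitly: $\cosh(2\beta h_w) = \cosh 2\beta - e^{-2\beta}\sinh 2\beta$. The Python sketch below evaluates $h_w(\beta)$ for $\beta \ge \beta_c$; the function name `h_w` and the clamping of the right-hand side at 1 (to absorb rounding errors near $\beta_c$) are choices of this sketch, and the closed form used for $\beta_c$ is the standard fixed point of (2.11).

```python
import math

BETA_C = 0.5 * math.log(1.0 + math.sqrt(2.0))  # fixed point of (2.11)

def h_w(beta):
    """Solve (2.39) for h_w: cosh(2 beta h_w) = cosh(2 beta) - exp(-2 beta) sinh(2 beta)."""
    rhs = math.cosh(2.0 * beta) - math.exp(-2.0 * beta) * math.sinh(2.0 * beta)
    # Clamp at 1 so that rounding errors near beta_c do not leave the domain of acosh.
    return math.acosh(max(rhs, 1.0)) / (2.0 * beta)

for beta in (BETA_C, 0.6, 1.0, 2.0):
    print(beta, h_w(beta))
```

In this sketch the value vanishes at $\beta_c$, stays below 1 and increases with $\beta$. By the criterion stated before (2.38), a boundary field $h$ lies in the partial wetting (pinned) regime at inverse temperature $\beta$ exactly when $h < h_w(\beta)$.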
An equivalent computation of $h_w$ based on Cahn's criterion is given in [AC].
Remark. At the time when McCoy and Wu discovered this surface phase transition nobody understood what was physically implied: the transition was interpreted as a boundary hysteresis phenomenon. This interpretation is, however, misleading: the transition is not related to any kind of metastability. The plot of the quantities corresponding to $\lim_{L \to \infty} \langle \sigma(0,1) \rangle^-_{\Lambda_L}(\beta,h)$ and $\lim_{L \to \infty} \langle \sigma(0,1) \rangle^{\eta_\pm}_{\Lambda_L}(\beta,h)$ is given in Fig. 6.6, Chapter VI of [MW]. Besides the extensive computations for the semi-infinite Ising model of McCoy and Wu, Abraham, Abraham and coworkers, we also mention [AY] and [AF]; this list is not exhaustive. As for the surface tension there is a dual expression for $\hat\tau_{bd}$. We first introduce the two-point function of the model on the half-infinite lattice
\[ \mathbb{L}^* := \{ x \in \mathbb{Z}^{2*} : x_2 \ge 1/2 \}, \tag{2.40} \]
as
\[ \langle \sigma(t)\sigma(t') \rangle_{\mathbb{L}^*}(\beta^*,h^*) := \lim_{L \to \infty} \langle \sigma(t)\sigma(t') \rangle_{\Lambda^*_L}(\beta^*,h^*). \tag{2.41} \]
Fig. 1. $\hat\tau_{bd}$ as a function of the magnetic field $h$, for $\beta = 1.4\,\beta_c$
Proposition 2.3. Let the coupling constants be given by (2.30), $h > 0$, and let $\beta > \beta_c$. Let $\beta^*$, $h^*$ be the dual coupling constants, $t, t' \in \Lambda^*_L$, $t(2) = t'(2) = 1/2$. Then the limit
$$-\lim_{n\to\infty} \frac{1}{n} \ln \langle \sigma(nt)\sigma(nt') \rangle_{\mathbb{L}^*}(\beta^*,h^*) = |t_1 - t'_1| \cdot \tau_{bd}(\beta^*,h^*) \tag{2.42}$$
exists and $\tau_{bd}(\beta^*,h^*) = \hat\tau_{bd}(\beta,h)$. See [PV1] for a proof.
2.4. Two-point correlation function. There are close relations between surface tension, resp. wall free energy, and decay-rate of the two-point correlation function, resp. boundary two-point correlation function (Propositions 2.2 and 2.3). The next proposition states fundamental estimates about the two-point correlation functions, which we need later on. As in the previous section, see (2.33), $\hat\tau(\beta) := \hat\tau(n_w;\beta)$ and $\tau(\beta^*) = \hat\tau(\beta)$.

Proposition 2.4. Let $J(e) \equiv 1$. Let $\beta^* < \beta_c$.
1. There exist positive constants $K$ and $a_b$ such that for all $x, y \in \mathbb{Z}^{2*}$,
$$\frac{K}{\|x-y\|^{a_b}} \exp\{-\tau(y-x;\beta^*)\} \le \langle \sigma(x)\sigma(y) \rangle(\beta^*) \le \exp\{-\tau(y-x;\beta^*)\}. \tag{2.43}$$
2. Let the coupling constants be given by (2.30), with $h = h^*$, $0 < h^* < \infty$. If $\tau_{bd}(\beta^*,h^*) = \tau(\beta^*)$, then there exists a constant $K'$ such that for all $x, y \in \mathbb{L}^*$, with $x_2 = y_2 = 1/2$,
$$\frac{K'}{\|x-y\|^{3/2}} \exp\{-\tau(\beta^*)|x_1-y_1|\} \le \langle \sigma(x)\sigma(y) \rangle_{\mathbb{L}^*}(\beta^*,h^*) \le \exp\{-\tau(\beta^*)|x_1-y_1|\}. \tag{2.44}$$
3. Let the coupling constants be given by (2.30), with $h = h^*$, $0 < h^* < \infty$. If $\tau_{bd}(\beta^*,h^*) < \tau(\beta^*)$, then there exists a constant $K''$ such that for all $x, y \in \mathbb{L}^*$, with $x_2 = y_2 = 1/2$,
$$K'' \exp\{-\tau_{bd}(\beta^*,h^*)|x_1-y_1|\} \le \langle \sigma(x)\sigma(y) \rangle_{\mathbb{L}^*}(\beta^*,h^*) \le \exp\{-\tau_{bd}(\beta^*,h^*)|x_1-y_1|\}. \tag{2.45}$$
Remarks.
1. The upper bounds are well-known consequences of sub-additivity and GKS inequalities, see e.g. [PV1].
2. The lower bound (2.43) has been proved recently by Alexander [Al]; his method is robust and can be applied to different models of statistical mechanics, e.g. percolation, Potts or random-cluster models. The value of $a_b$ obtained by this method is not optimal (see the next remark).
3. The optimal value in (2.43) is $a_b = 1/2$. Notice that for our purpose the bound (2.43) derived by Alexander is sufficient. However, the determination of the asymptotic behaviour of the two-point function is an important theoretical question. A detailed asymptotic study of the two-point function of the Ising model when $D = 2$ is made in Chapter XII of [MW] (in particular (4.39) therein); see the very informative discussion of their results in Sect. 5 of the same chapter. For dimension $D \ge 2$ the expected behaviour is
$$\langle \sigma(x)\sigma(y) \rangle(\beta^*) = \varphi(n(y-x);\beta^*)\, \frac{\exp\{-\tau(y-x;\beta^*)\}}{\|x-y\|^{\frac{D-1}{2}}}, \tag{2.46}$$
with $n(y-x) = (x-y)/\|x-y\|$. Recently Ioffe [I2] proved such a formula for the simple self-avoiding walk on $\mathbb{Z}^D$, $D \ge 2$, with $\varphi : S^{D-1} \to \mathbb{R}^+$ an analytic function.
4. The lower bound (2.44) follows again from the work of [MW] when $h^* = 1$ (Chapter VII, in particular the discussion pp. 144–145). Using correlation inequalities, it can be extended to the general case as shown in [PV1, Prop. 7.1].
5. The lower bound in (2.45) is proven in [PV1, Prop. 7.1].

3. A Microscopic Model for the Pinning Transition

We define a microscopic model for a system with two coexisting phases, separated by an interface, where we have a reentrant pinning-depinning transition. Our model is inspired by the work of Patrick [Pa1], who showed that there is a reentrant pinning-depinning transition for the SOS model corresponding to our settings. In a recent work, Patrick and Upton [PU] studied in the Ising model questions similar to those investigated here. The interesting fact that we can have a reentrant pinning-depinning transition for an Ising model with ferromagnetic coupling constants only is not new. This is for example proved in [ACD] for a different choice of the coupling constants; in our notations this corresponds to
$$J(t,t') := \begin{cases} c > 0 & \text{if } t_2 = 0 \text{ and } t'_2 = 1 \text{ or vice-versa},\\ 0 & \text{if } t_2 = t'_2 = 0 \text{ or } t_2 = t'_2 = 1,\\ b > 0 & \text{if } t_2 = 1 \text{ and } t'_2 = 2 \text{ or vice-versa},\\ 1 & \text{otherwise}. \end{cases} \tag{3.1}$$
In [ACD] the two boundary conditions $\eta^{\pm}$ are considered
$$\eta^{\pm}(x) := \begin{cases} \pm 1 & \text{if } x \in \Lambda_L,\ x_2 = 0,\\ 1 & \text{otherwise}. \end{cases} \tag{3.2}$$
This model differs from our model; if we integrate over the spins of the row $\{x \in \Lambda_L,\ x_2 = 1\}$, then the resulting Hamiltonian is equivalent to our Hamiltonian defined by the coupling constants (3.3), but with now an effective nonlinear temperature dependent coupling $h = h(T)$ (see formula (11) in [ACD]).

Our method proceeds in two steps. First, we derive a macroscopic variational problem characterizing the typical configurations. This part of the analysis is based on the probabilistic methods developed in Sect. 6 and following. The main advantage we gain is that these methods are robust (see for example the Appendix). In the second step, we solve the variational problem explicitly. It is at that point that we need the exact expressions of the surface tension and wall free energy.

Let $Q$ be the macroscopic box (2.1) and denote by $W_Q := \{ x \in Q : x(2) = 0 \}$ its bottom wall. We want to describe at the macroscopic level an interface going from the point $A := (-1, a)$, $0 < a < 2$, to the point $B := (1, b)$, $0 < b < 2$, which can be pinned by the bottom wall $W_Q$. The idea is to introduce a grid in $Q$ with lattice spacing $1/L$, $L \in \mathbb{N}$, and to consider an Ising model on that grid. When $L$ tends to infinity we hope to have a good microscopic description of the macroscopic physical situation. It is however more convenient to work with a fixed lattice with lattice spacing unity, when we investigate asymptotic properties of the model for $L$ tending to infinity. Therefore we define the model in the box $\Lambda_L$ (see (2.2)). We choose the coupling constants of the model as follows,
$$J(t,t') := \begin{cases} h > 0 & \text{if } t_2 = 0 \text{ or } t'_2 = 0,\\ 1 & \text{otherwise}. \end{cases} \tag{3.3}$$
The boundary conditions specify the end-points of one phase-separation line, which is the microscopic manifestation of the interface. The boundary conditions are $\eta_{ab}$,
$$\eta_{ab}(x) := \begin{cases} +1 & \text{if } x \in \Lambda_L,\ x_2 = 2L,\\ +1 & \text{if } x_1 = -L \text{ and } aL \le x_2 \le 2L,\\ +1 & \text{if } x_1 = L \text{ and } bL \le x_2 \le 2L,\\ -1 & \text{otherwise}. \end{cases} \tag{3.4}$$
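The definitions (3.3) and (3.4) translate directly into code. The following is a minimal sketch, not taken from the paper: sites are integer pairs `(x1, x2)`, the function names are ours, and membership of `x` in the box $\Lambda_L$ (defined in (2.2), which is not reproduced here) is left to the caller.

```python
def coupling(t, t_prime, h):
    """Coupling constant J(t, t') of (3.3) for a nearest-neighbour bond <t, t'>."""
    # Bonds with an end-point on the bottom row x2 = 0 carry the boundary coupling h > 0.
    if t[1] == 0 or t_prime[1] == 0:
        return h
    # All other bonds carry the bulk coupling 1.
    return 1.0

def eta_ab(x, L, a, b):
    """Boundary condition eta_ab(x) of (3.4) for a site x = (x1, x2)."""
    x1, x2 = x
    if x2 == 2 * L:                        # top row
        return +1
    if x1 == -L and a * L <= x2 <= 2 * L:  # left side, above height aL
        return +1
    if x1 == L and b * L <= x2 <= 2 * L:   # right side, above height bL
        return +1
    return -1                              # everything else

if __name__ == "__main__":
    L, a, b = 10, 0.4, 0.4
    print([eta_ab((-L, x2), L, a, b) for x2 in range(0, 2 * L + 1, 2)])
```

The sign change of `eta_ab` along the two vertical sides marks the end-points of the phase-separation line.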
In each spin configuration compatible with $\eta_{ab}$ there is a unique phase-separation line $\lambda$ with end-points in $V_L(\eta_{ab}) := \{u^L, v^L\}$, $u^L_1 = -L$ and $v^L_1 = L$. The normalized partition function is denoted by $Z^{ab}(\Lambda_L) \equiv Z^{\eta_{ab}}(\Lambda_L)$.

Problem. Describe the statistical properties of the phase-separation line $\lambda$ and show that there is a reentrant pinning-depinning transition. Derive the macroscopic theory developed in Sect. 4 from the microscopic theory.

Remark. In [AK] the same model is studied, with similar, but different boundary conditions; the pinning of the interface is used in order to define the contact angle and give an exact derivation of the modified Young equation for partial wetting.
4. The Variational Problem

The interface is a macroscopic deterministic object, whose properties are described by a functional involving the surface tension or the wall free energy. The equilibrium state of the interface is given by the minimum of this functional. In $Q$ the interface is a simple rectifiable curve $\mathcal{C}$ with end-points $A = (-1, a)$, $0 < a < 2$, and $B = (1, b)$, $0 < b < 2$. We denote by $|\mathcal{C} \cap W_Q|$ the length of the portion of the interface in contact with the wall $W_Q$. Suppose that $[0,t] \to Q$, $s \mapsto \mathcal{C}(s) = (u(s), v(s))$, is a parameterization of the interface. The free energy of the interface $\mathcal{C}$ can be written
$$W(\mathcal{C}) := \int_0^t \hat\tau(\dot u(s), \dot v(s))\, ds + |\mathcal{C} \cap W_Q| \cdot \big[ \hat\tau_{bd} - \hat\tau(1,0) \big], \tag{4.1}$$
because the function $\hat\tau(x_1,x_2)$ is positively homogeneous and $\hat\tau(x_1,x_2) = \hat\tau(-x_2,x_1)$. The interface at equilibrium is the minimum of this functional. Therefore we have to solve the

Variational problem. Find the minimum of the functional $W$ among all simple rectifiable open curves in $Q$ with extremities $A = (-1, a)$ and $B = (1, b)$.

Let $\mathcal{D}$ be the straight line from $A$ to $B$ and $\mathcal{W}$ be the curve composed of three straight line segments: from $A$ to a point $P_1 \in W_Q$, from $P_1$ to $P_2 \in W_Q$, and from $P_2$ to $B$. The points $P_1$ resp. $P_2$ are such that the angles between the first segment and the wall resp. between the last segment and the wall are equal to $\theta_Y \in [0, \pi/2]$, which is a solution of the Herring–Young equation
$$\cos\theta_Y\, \hat\tau(\theta_Y) - \sin\theta_Y\, \hat\tau'(\theta_Y) = \hat\tau_{bd}. \tag{4.2}$$
$\mathcal{W}$ is a simple curve in $Q$ if and only if
$$\theta_Y \in \Big[ \arctan\frac{a+b}{2},\ \pi/2 \Big). \tag{4.3}$$
Remarks.
1. The choice $\theta_Y \in [0, \pi/2]$ leads to a different sign at the right-hand side of the Herring–Young equation (4.2) than in [PV2], formulae (1.5) or (4.60); in these latter references we use $\pi - \theta$ instead of $\theta$.
2. For the case under consideration the existence of $\theta_Y$ is an immediate consequence of the Winterbottom construction. In our case we have supposed that $h > 0$, so that $\hat\tau_{bd} > 0$. Since $\hat\tau'(\pi/2) = 0$ the case $\theta_Y = \pi/2$ never occurs.

Proposition 4.1. Let $\theta_Y$ be the solution of the Herring–Young equation (4.2).
1. If $\tan\theta_Y \le \frac{a+b}{2}$, then the minimum of the variational problem is given by the curve $\mathcal{D}$.
2. If $\pi/2 > \theta_Y > \arctan(\frac{a+b}{2})$, then the minimum of the variational problem is given by $\mathcal{D}$ if $W(\mathcal{D}) < W(\mathcal{W})$, by $\mathcal{W}$ if $W(\mathcal{D}) > W(\mathcal{W})$ and by both $\mathcal{D}$ and $\mathcal{W}$ if $W(\mathcal{D}) = W(\mathcal{W})$.

Proof. The proof is an easy consequence of the two following lemmas. Lemma 4.1 states that the minimum is a polygonal line.
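Lemma 4.1. Let $\mathcal{C}$ be some simple rectifiable parameterized curve with initial point $A$ and final point $B$. If $\mathcal{C}$ does not intersect the wall, then
$$W(\mathcal{C}) \ge W(\mathcal{D}) \tag{4.4}$$
with equality if and only if $\mathcal{C} = \mathcal{D}$. If $\mathcal{C}$ intersects the wall, let $t_1$ be the first time $\mathcal{C}$ touches the wall and $t_2$ the last time $\mathcal{C}$ touches the wall. Let $\hat{\mathcal{C}}$ be the curve defined by three segments from $A$ to $\mathcal{C}(t_1)$, from $\mathcal{C}(t_1)$ to $\mathcal{C}(t_2)$ and from $\mathcal{C}(t_2)$ to $B$. Then
$$W(\mathcal{C}) \ge W(\hat{\mathcal{C}}). \tag{4.5}$$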
Equality holds if and only if $\mathcal{C} = \hat{\mathcal{C}}$.

Proof. Since $\hat\tau$ is convex and homogeneous, we have in the first case by Jensen's inequality
$$W(\mathcal{C}) = t\, \frac{1}{t} \int_0^t \hat\tau(\dot u(s), \dot v(s))\, ds \ge t\, \hat\tau\Big( \frac{1}{t} \int_0^t \dot u(s)\, ds,\ \frac{1}{t} \int_0^t \dot v(s)\, ds \Big) = W(\mathcal{D}). \tag{4.6}$$
The inequality is strict if $\mathcal{C} \ne \mathcal{D}$ as is seen using the sharp triangle inequality (2.20). In the second case we apply Jensen's inequality to the part of $\mathcal{C}$ between $A$ and $\mathcal{C}(t_1)$ and between $\mathcal{C}(t_2)$ and $B$ to compare with the corresponding straight segments of $\hat{\mathcal{C}}$. Combining Jensen's inequality and the fact that $\hat\tau_{bd} \le \hat\tau$, we can also compare the part of $\mathcal{C}$ between $\mathcal{C}(t_1)$ and $\mathcal{C}(t_2)$ with the corresponding straight segment of $\hat{\mathcal{C}}$. □

Lemma 4.2. Let $\hat{\mathcal{C}}$ be a polygonal line from $A$ to $\hat P_1 \in W_Q$, then from $\hat P_1$ to $\hat P_2 \in W_Q$, and finally from $\hat P_2$ to $B$. Let $\theta_Y$ be the solution of the Herring–Young equation (4.2). If $\pi/2 > \theta_Y > \arctan(\frac{a+b}{2})$ then
$$W(\hat{\mathcal{C}}) \ge W(\mathcal{W}), \tag{4.7}$$
with equality if and only if $\hat{\mathcal{C}} = \mathcal{W}$. If $\arctan(\frac{a+b}{2}) \ge \theta_Y$,
$$W(\hat{\mathcal{C}}) > W(\mathcal{D}). \tag{4.8}$$
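Proof. Let $\theta_1$ be the angle of the segment of $\hat{\mathcal{C}}$ from $A$ to $\hat P_1$ with the wall $W_Q$, and $\theta_2$ be the angle of the segment from $\hat P_2$ to $B$ with the wall $W_Q$. A necessary and sufficient condition for the polygonal line $\hat{\mathcal{C}}$ to be a simple polygonal line is
$$\frac{a}{\tan\theta_1} + \frac{b}{\tan\theta_2} \le 2. \tag{4.9}$$
In particular, we certainly have $\theta_1 \ge \theta_a$, where $\theta_a := \arctan(a/2)$, and $\theta_2 \ge \theta_b$, where $\theta_b := \arctan(b/2)$. Since we suppose that $a > 0$ and $b > 0$ we have $\theta_a > 0$ and $\theta_b > 0$. We suppose that $\theta_Y \in (0, \pi/2)$, since $\theta_Y = 0$ occurs only if $\hat\tau(0) = \hat\tau_{bd}$, and in that case $W(\hat{\mathcal{C}}) > W(\mathcal{D})$ by Lemma 4.1. We compute
$$W(\hat{\mathcal{C}}) = \hat\tau(\theta_1) \frac{a}{\sin\theta_1} + \hat\tau_{bd} \Big( 2 - \frac{a}{\tan\theta_1} - \frac{b}{\tan\theta_2} \Big) + \hat\tau(\theta_2) \frac{b}{\sin\theta_2} = g(\theta_1, a) + g(\theta_2, b), \tag{4.10}$$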
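where
$$g(\theta, x) := \hat\tau(\theta) \frac{x}{\sin\theta} + \hat\tau_{bd} \Big( 1 - \frac{x}{\tan\theta} \Big). \tag{4.11}$$
Since $\theta_Y$ is a solution of (4.2),
$$\frac{\partial}{\partial\theta} g(\theta_Y, x) = \frac{x}{\sin^2\theta_Y} \Big( \hat\tau'(\theta_Y) \sin\theta_Y - \cos\theta_Y\, \hat\tau(\theta_Y) + \hat\tau_{bd} \Big) = 0. \tag{4.12}$$
The second derivative of $g(\theta, x)$ is
$$\frac{\partial^2}{\partial\theta^2} g(\theta, x) = \frac{x\,(\hat\tau(\theta) + \hat\tau''(\theta))}{\sin\theta} - \frac{2}{\tan\theta} \frac{\partial}{\partial\theta} g(\theta, x). \tag{4.13}$$
Therefore, for $\theta \in (0, \pi/2]$, we have
$$\frac{\partial}{\partial\theta} g(\theta, x) = x \int_{\theta_Y}^{\theta} \exp\Big\{ -\int_{\gamma}^{\theta} \frac{2}{\tan\alpha}\, d\alpha \Big\}\, \frac{\hat\tau(\gamma) + \hat\tau''(\gamma)}{\sin\gamma}\, d\gamma. \tag{4.14}$$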
Since $\hat\tau$ has positive stiffness, i.e. $\hat\tau(\theta) + \hat\tau''(\theta) > 0$, (4.14) implies that $\theta_Y$ is an absolute minimum of $g(\theta, x)$ over the interval $(0, \pi/2]$, and that $g$ is strictly monotone over the intervals $(\theta_Y, \pi/2]$ and $(0, \theta_Y)$. Therefore
$$W(\hat{\mathcal{C}}) \ge g(\theta_Y, a) + g(\theta_Y, b). \tag{4.15}$$
If (4.3) holds, then (4.15) implies $W(\hat{\mathcal{C}}) \ge W(\mathcal{W})$, because in that case
$$g(\theta_Y, a) + g(\theta_Y, b) = W(\mathcal{W}). \tag{4.16}$$
If (4.3) does not hold, $\mathcal{W}$ is not a simple line and is not even necessarily inside $Q$. The two segments from $A$ to the wall and from $B$ to the wall intersect at some point $P \in Q$. Let $\hat{\mathcal{W}}$ be the simple polygonal curve going from $A$ to $P$, then from $P$ to $B$. A simple application of Lemma 4.1, using the fact that $\hat\tau(1,0) \ge \hat\tau_{bd}$, gives
$$g(\theta_Y, a) + g(\theta_Y, b) \ge W(\hat{\mathcal{W}}). \tag{4.17}$$
Applying again Lemma 4.1 we get
$$W(\hat{\mathcal{W}}) > W(\mathcal{D}). \tag{4.18}$$
□
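The case distinction of Proposition 4.1 can be checked numerically once $\hat\tau$ and $\hat\tau_{bd}$ are given. The sketch below is an illustration under our own assumptions, not the authors' procedure: `tau_hat` is any smooth function of the angle between a segment and the wall, even in that angle, with positive stiffness, so that the left-hand side of (4.2) is decreasing in $\theta$ and bisection applies. The derivative is approximated by a central difference, and the example uses the isotropic toy choice $\hat\tau \equiv 1$ rather than the Ising surface tension.

```python
import math

def solve_theta_Y(tau_hat, tau_bd, eps=1e-6, tol=1e-12):
    """Solve the Herring-Young equation (4.2) on [0, pi/2] by bisection."""
    def F(theta):
        # Central-difference approximation of tau_hat'(theta).
        d = (tau_hat(theta + eps) - tau_hat(theta - eps)) / (2.0 * eps)
        return math.cos(theta) * tau_hat(theta) - math.sin(theta) * d - tau_bd
    lo, hi = 0.0, math.pi / 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(mid) > 0.0:   # F is decreasing, so the root lies to the right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def g(theta, x, tau_hat, tau_bd):
    """The function g(theta, x) of (4.11)."""
    return tau_hat(theta) * x / math.sin(theta) + tau_bd * (1.0 - x / math.tan(theta))

def minimizer(a, b, tau_hat, tau_bd):
    """Return (label, W(D), W(W)) following Proposition 4.1; W(W) is None in case 1."""
    theta_Y = solve_theta_Y(tau_hat, tau_bd)
    theta_D = math.atan2(abs(b - a), 2.0)              # angle of the straight line D with the wall
    W_D = tau_hat(theta_D) * math.hypot(2.0, b - a)    # surface tension times length of D
    if math.tan(theta_Y) <= (a + b) / 2.0:
        return "D", W_D, None
    W_W = g(theta_Y, a, tau_hat, tau_bd) + g(theta_Y, b, tau_hat, tau_bd)   # (4.16)
    if W_D < W_W:
        return "D", W_D, W_W
    if W_D > W_W:
        return "W", W_D, W_W
    return "D and W", W_D, W_W

if __name__ == "__main__":
    isotropic = lambda theta: 1.0   # toy surface tension, for illustration only
    print(minimizer(0.1, 0.1, isotropic, 0.5))
    print(minimizer(1.5, 1.5, isotropic, 0.5))
```

In the isotropic toy case (4.2) reduces to $\cos\theta_Y = \hat\tau_{bd}$, which gives a simple check of the root finder.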
5. Reentrance and Pinning Transition

The results of Sect. 4 show that, when the parameters $a$ and $b$ are well-chosen, the system under consideration can undergo a phase transition from a phase in which the interface is pinned to the wall on a macroscopic distance to a phase in which it does not touch the wall. It is interesting to consider the corresponding phase diagram, which is obtained using the explicit expressions for the mass gap of the 2-point function and the mass gap of the boundary 2-point function (by duality this provides exact expressions for the surface tension and wall free energy). The expressions we use are the following:
$$\begin{aligned} \hat\tau(\theta;\beta) &= |\cos\theta| \sinh^{-1}(\alpha|\cos\theta|) + |\sin\theta| \sinh^{-1}(\alpha|\sin\theta|),\\ \alpha &= \frac{2}{\hat b} \Big[ (1 - \hat b^2) \Big/ \Big( 1 + \sqrt{\sin^2 2\theta + \hat b^2 \cos^2 2\theta} \Big) \Big]^{1/2},\\ \hat b &= 2 \sinh 2\beta\, \cosh^{-2} 2\beta, \end{aligned} \tag{5.1}$$
and for $0 \le h < h_w(\beta)$, with $\beta^*$ and $h^*$ the dual coupling constants to $\beta$ and $h$,
$$\cosh \hat\tau_{bd}(\beta,h) = \cosh^2(\beta^*) \coth(2\beta^* h^*) - \sinh^2(\beta^*) \coth[2\beta^*(h^*-1)]. \tag{5.2}$$
They can be found, for example, in [MW] [Eq. (4.39) of Chap. XII and Eq. (5.29) of Chap. VII]. Figure 2 shows a set of phase-transition lines, depending on the parameters $a$ and $b$, in the $T$-$h$ plane ($T = 1/k\beta$ being the temperature). The shaded area corresponds to the set of parameters
$$\{ (T,h) : \hat\tau_{bd}(\beta,h) < \hat\tau((1,0);\beta) \}. \tag{5.3}$$
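The bulk expression (5.1) is straightforward to evaluate. The following sketch is an illustration only (the name `tau_hat` is ours, and $\beta > \beta_c$ with unit bulk coupling is assumed). As a sanity check, along a lattice axis the formula reduces to the classical Onsager value $2\beta + \ln\tanh\beta$.

```python
import math

def tau_hat(theta, beta):
    """Surface tension of (5.1) at angle theta and inverse temperature beta > beta_c."""
    b_hat = 2.0 * math.sinh(2.0 * beta) / math.cosh(2.0 * beta) ** 2
    root = math.sqrt(math.sin(2.0 * theta) ** 2 + b_hat ** 2 * math.cos(2.0 * theta) ** 2)
    alpha = (2.0 / b_hat) * math.sqrt((1.0 - b_hat ** 2) / (1.0 + root))
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    # sinh^{-1} in (5.1) is the inverse hyperbolic sine.
    return c * math.asinh(alpha * c) + s * math.asinh(alpha * s)

if __name__ == "__main__":
    beta = 1.0
    for k in range(5):
        theta = k * math.pi / 16.0
        print(f"theta = {theta:.4f}  tau_hat = {tau_hat(theta, beta):.6f}")
    # Axis direction compared with the Onsager value 2*beta + ln tanh(beta).
    print(tau_hat(0.0, beta), 2.0 * beta + math.log(math.tanh(beta)))
```

The function is symmetric under $\theta \mapsto \pi/2 - \theta$, in agreement with the lattice symmetry $\hat\tau(x_1,x_2) = \hat\tau(-x_2,x_1)$ used in Sect. 4.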
In other words, the boundary of that region is the wetting transition line: If we set $a = b = 0$, then for values of the temperature and boundary magnetic field inside this set, the phase-separation line is pinned to the wall microscopically (partial wetting), while for values of the parameters outside this set it takes off and fluctuates far from the wall (complete wetting). Notice that in the macroscopic limit, the interface always lies along the wall in this case. The four curves i) to iv) in Fig. 2 represent the phase-transition line for various values of the parameters $a > 0$ and $b > 0$. For any value of the parameters $\beta$ and $h$ above the phase-transition line, the system's interface is the straight line, while, for any value of these parameters below the curve, it is pinned. Clearly, since $a$ and $b$ are strictly positive, the phase-transition line must be inside the shaded region (when $\hat\tau_{bd}(\beta,h) = \hat\tau((1,0);\beta)$, Jensen's inequality implies that the interface is always a straight line).

Fig. 2. A sequence of phase-transition lines, separating the phase in which the interface is a straight line and the phase in which it is pinned to the wall. The shaded area corresponds to the values of $(T, h)$ so that $\hat\tau_{bd}(\beta,h) < \hat\tau((1,0);\beta)$. The four curves correspond to: i) $a = 0.1$, $b = 0.1$; ii) $a = 0.1$, $b = 0.2$; iii) $a = 0.1$, $b = 0.4$; iv) $a = 0.4$, $b = 0.4$. Observe that the system in case i) exhibits reentrance (see also Fig. 3). [Axes: $T$ horizontal, with $T_c$ marked; $h$ vertical, with ticks at 0.2 and 0.8.]

Fig. 3. This figure shows part of the phase-transition line for $a = 0.1$, $b = 0.1$ (left), and $a = 0.1$, $b = 0.12$ (right). For values of the parameters $T$ and $h$ below these curves the interface is pinned, while it is a straight line above these curves. Increasing the temperature along the dashed lines, we see that the system exhibits reentrance; this corresponds to the two situations discussed in the introduction.

The phenomenon of reentrance described in the introduction can be seen in Figs. 2 and 3. Suppose $a = b = 0.2$ and $h$ is slightly above 0.8 (this corresponds to the dashed line of the first picture in Fig. 3). At very low temperature, the interface does not touch the wall; if we increase the temperature, then there is a first transition and the interface becomes tied to the wall; if we increase the temperature further, then a second transition takes place and the interface is again the straight line; finally, at $T = T_c$, the system becomes disordered. In fact for slightly different values of $a$ and $b$, there can even be one more transition, as shown in the second picture of Fig. 3.

6. High-Temperature Representation

We give the main results about the high-temperature representation of the Ising model. These results are not restricted to dimension 2, but for simplicity we consider only this case; we also use a definition of contour adapted to this particular case. We stress that the high-temperature representation is a non-perturbative approach; the basic objects in the high-temperature representation are defined for all positive $\beta$ and we apply this representation for all $\beta < \beta_c$.
The results are essential for the rest of our analysis, in particular Lemmas 6.9 and 6.11 about random-line representations of the two-point correlation function, and Lemmas 6.10 and 6.13, which characterize those random-lines, which give the main contribution to the two-point correlation function.
6.1. Ising model on a finite graph. We consider here the high-temperature representation of the Ising model with free boundary conditions, but we could treat + boundary conditions. The correct point of view is to define the model on a graph $G = (V, B)$; to each vertex $t \in V$ of the graph we associate a spin variable $\sigma(t)$ and to each bond $e = \langle t,t' \rangle \in B$ a nonnegative coupling constant $K(e) = K(t,t')$, which takes into account the inverse temperature, so that in the applications $K(e) = \beta^* J^*(e)$. The Gibbs measure on $G$ is
$$\frac{\exp\{ \sum_{e = \langle t,t' \rangle \in B} K(e) \sigma(t) \sigma(t') \}}{\Xi(G)}. \tag{6.1}$$
The constant $\Xi(G)$ is the partition function,
$$\begin{aligned} \Xi(G) :&= \sum_{\sigma(t) = \pm 1,\ t \in V} \exp\Big\{ \sum_{e = \langle t,t' \rangle \in B} K(e) \sigma(t) \sigma(t') \Big\}\\ &= \sum_{\sigma(t) = \pm 1,\ t \in V}\ \prod_{e = \langle t,t' \rangle \in B} \cosh K(e)\, \big( 1 + \sigma(t)\sigma(t') \tanh K(e) \big). \end{aligned} \tag{6.2}$$
Expectation values with respect to the probability measure (6.1) are denoted by $\langle \cdot \rangle_G$. All graphs are subgraphs of $(\mathbb{Z}^{2*}, E^*)$, where $\mathbb{Z}^{2*}$ is the lattice
$$\mathbb{Z}^{2*} := \{ x = (x_1, x_2) \in \mathbb{R}^2 : x + (1/2, 1/2) \in \mathbb{Z}^2 \}; \tag{6.3}$$
$E^*$ is the set of all bonds of $\mathbb{Z}^{2*}$, i.e. the set of all $e = \langle t,t' \rangle$, $\{t,t'\}$ a pair of nearest neighbour points of $\mathbb{Z}^{2*}$. We make the following convention. If $V \subset \mathbb{Z}^{2*}$, then $E(V) := \{ \langle t,t' \rangle \in E^* : t, t' \in V \}$ and the graph generated by $V$ is $G(V) := (V, E(V))$. Similarly, if $B \subset E^*$, then $V(B) := \{ t \in \mathbb{Z}^{2*} : \exists t',\ \langle t,t' \rangle \in B \}$ and the graph generated by $B$ is $G(B) := (V(B), B)$.

Let $G = (V, B)$ be a graph. We need the following geometric notions. Let $B_1 \subset B$. The index of a site $t$ in $B_1$ is the number of bonds of $B_1$ which are adjacent to $t$. The boundary of $B_1$ is the subset of $V$
$$\delta B_1 := \{ t \in V : \text{index of } t \text{ in } B_1 \text{ is odd} \}.$$
A path is an ordered sequence of sites and bonds, $t_0, e_0, t_1, e_1, \ldots, t_n$, where $t_i \in V$ for all $i = 0, \ldots, n$, and $e_j = \langle t_j, t_{j+1} \rangle \in B$, $j = 0, \ldots, n-1$. By definition all bonds of a path are different, but not necessarily all sites of the path. The initial point of the path is $t_0$ and the final point is $t_n$. A path is closed if its final point coincides with its initial point; otherwise it is open. Unoriented paths are called contours. Given $B_1 \subset B$ we can decompose $B_1$ uniquely into a finite number of contours by the following procedure.
1. If $\delta B_1 = \emptyset$, then choose a bond $e = \langle t,t' \rangle$ in $B_1$ and set $t_0 := t$, $e_0 := e$ and $t_1 := t'$. The path is uniquely continued using rule A specified in the picture below and by requiring that it is maximal and that its final point is $t_0$. We have thus defined a closed path; forgetting the orientation this defines uniquely a closed contour. Repeat this construction until all bonds of $B_1$ belong to some contour.
2. If $\delta B_1 \ne \emptyset$, then choose first $t \in \delta B_1$, and set $t_0 := t$. Then choose $e_0$ among the adjacent bonds to $t_0$ according to rules A′ specified in the picture below. Initial points are marked by dots in the picture specifying the rules A′. The path is uniquely continued using rules A and A′ and by requiring that it is maximal and its final point $t_n \in \delta B_1$. We have thus defined an open path, since $t_0 \ne t_n$; forgetting the orientation this defines uniquely an open contour. Repeat this construction starting with a new point of $\delta B_1$ until all points of $\delta B_1$ belong to some open contours; if there are still bonds of $B_1$ which do not belong to some contours, then do Construction 1 above.
[Figure: rule A and rules A′; the dots denote initial points of open paths.]
Let $\underline\theta = \{ \theta_1, \ldots, \theta_n \}$ be a family of contours; we denote by $E(\theta_1, \ldots, \theta_n)$ the set of all bonds of the contours $\theta_1, \ldots, \theta_n$. We say that $\underline\theta$ is compatible if either $E(\theta_1, \ldots, \theta_n) = \emptyset$ or $\{ \theta_1, \ldots, \theta_n \}$ is the decomposition into contours of the set $E(\theta_1, \ldots, \theta_n)$. If we want to stress the condition that each contour is a contour of the graph $G$, then we say that $\underline\theta$ is $G$-compatible. Notice that the notion of compatibility introduced here is purely geometrical; it is different from the notion of compatibility defined in Subsect. 2.1. Let $e$ be a bond and $B(e)$ the set formed by $e$ and all bonds of $E^*$ which are adjacent to $e$. The edge-boundary of $e$ is the set of bonds of the contour $\Delta(e) \ni e$ of the decomposition of $B(e)$ into contours. Let $B_1 \subset E^*$; the edge-boundary $\Delta(B_1)$ of $B_1$ is $\Delta(B_1) := \cup_{e \in B_1} \Delta(e)$. The next lemma is proven in [PV1]; its proof is not difficult.

Lemma 6.1. Let $\underline\theta$ be a family of compatible contours. Then a non-empty compatible family of $n$ closed contours $\underline\gamma = \{ \gamma_1, \ldots, \gamma_n \}$ is compatible with $\underline\theta$, that is $\underline\gamma \cup \underline\theta$ is compatible, if and only if no bond of $\gamma_i$ is a bond of $\Delta(\underline\theta)$, $\forall i = 1, \ldots, n$.

[Figure: two bonds $e$, $e'$ and a contour $\theta$ with their edge-boundaries $\Delta(e)$, $\Delta(e')$, $\Delta(\theta)$.]
We define the high-temperature representation of the model. The partition function $\Xi(G)$ is given in (6.2). We expand the product in (6.2). Each term of the expansion is labeled by a set of bonds $\langle t,t' \rangle$: we specify the bonds corresponding to the factors $\tanh K(e)$. Then we sum over $\sigma(t)$, $t \in V$; after summation only terms labeled by sets of bonds with empty boundary give a non-zero contribution. Any term of the expansion of (6.2) which gives a non-zero contribution can be uniquely labeled by a $G$-compatible family $\underline\gamma$ of closed contours. Let $e$ be a bond, $\theta$ a contour and $\underline\theta$ a compatible family of contours; we set
$$w(e) := \tanh K(e), \qquad w(\theta) := \prod_{e \in \theta} w(e), \qquad w(\underline\theta) := \prod_{\theta \in \underline\theta} w(\theta). \tag{6.4}$$
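If $\underline\theta = \emptyset$, then $w(\underline\theta) := 1$. $\Xi(G)$ can be written as ($|V|$ is the cardinality of $V$)
$$\Xi(G) = 2^{|V|} \prod_{e \in B} \cosh K(e) \sum_{\substack{\underline\gamma :\, \delta\underline\gamma = \emptyset\\ G\text{-comp.}}} w(\underline\gamma) \equiv 2^{|V|} \prod_{e \in B} \cosh K(e) \cdot Z(G), \tag{6.5}$$
with $Z(G)$ the normalized partition function,
$$Z(G) := \sum_{\substack{\underline\gamma :\, \delta\underline\gamma = \emptyset\\ G\text{-comp.}}} w(\underline\gamma). \tag{6.6}$$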
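The identity (6.5) can be verified by brute force on a small graph. The sketch below is an illustration, not part of the paper. It uses the fact stated above that $G$-compatible families of closed contours are in one-to-one correspondence with bond sets of empty boundary, so the sum in (6.6) is taken over such bond sets and no implementation of rule A is needed. The example graph (a 3 × 2 block of sites with arbitrary couplings) is our choice.

```python
import itertools
import math

def xi_direct(vertices, K):
    """Partition function Xi(G): sum over all spin configurations, first line of (6.2)."""
    total = 0.0
    for spins in itertools.product((-1, 1), repeat=len(vertices)):
        s = dict(zip(vertices, spins))
        total += math.exp(sum(k * s[t] * s[u] for (t, u), k in K.items()))
    return total

def boundary(bonds):
    """Set of sites of odd index in a set of bonds."""
    index = {}
    for t, u in bonds:
        index[t] = index.get(t, 0) + 1
        index[u] = index.get(u, 0) + 1
    return {t for t, n in index.items() if n % 2 == 1}

def z_normalized(K):
    """Z(G) of (6.6), summed over bond sets with empty boundary."""
    bonds = list(K)
    total = 0.0
    for r in range(len(bonds) + 1):
        for subset in itertools.combinations(bonds, r):
            if not boundary(subset):
                total += math.prod(math.tanh(K[e]) for e in subset)
    return total

# Example graph: 3 x 2 block of sites with nearest-neighbour bonds (illustrative couplings).
vertices = [(i, j) for i in range(3) for j in range(2)]
K = {}
for (i, j) in vertices:
    if i < 2:
        K[((i, j), (i + 1, j))] = 0.3 + 0.1 * i    # horizontal bonds
    if j < 1:
        K[((i, j), (i, j + 1))] = 0.5 + 0.05 * i   # vertical bonds

lhs = xi_direct(vertices, K)
rhs = 2 ** len(vertices) * math.prod(math.cosh(k) for k in K.values()) * z_normalized(K)

if __name__ == "__main__":
    print(lhs, rhs)
```

The enumeration is exponential in the number of bonds and is meant only as a check on very small graphs.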
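Notice that $Z(G_1) = Z(G_2)$ if the two graphs $G_i = (V_i, B_i)$, $i = 1, 2$, have the same set of closed contours. More generally, given any $G$-compatible family $\underline\theta$ of contours, we set
$$Z(G|\underline\theta) := \sum_{\substack{\underline\gamma :\, \delta\underline\gamma = \emptyset\\ \underline\gamma \cup \underline\theta\ G\text{-comp.}}} w(\underline\gamma). \tag{6.7}$$
We define a weight $q_G(\underline\theta)$ for an arbitrary family $\underline\theta$,
$$q_G(\underline\theta) := \begin{cases} w(\underline\theta)\, \dfrac{Z(G|\underline\theta)}{Z(G)} & \text{if } \underline\theta \text{ is } G\text{-compatible},\\[2mm] 0 & \text{otherwise}. \end{cases} \tag{6.8}$$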
The usefulness of the weights $q_G(\underline\theta)$ comes from the following representation of the correlation function $\langle \prod_{t \in A} \sigma(t) \rangle_G$. If the cardinality of $A$ is odd, then by symmetry $\langle \prod_{t \in A} \sigma(t) \rangle_G = 0$. Suppose that $|A| = 2m$, $m \ge 1$. We expand the numerator of $\langle \prod_{t \in A} \sigma(t) \rangle_G$ as above. The presence of the variables $\sigma(t)$, $t \in A$, implies that the only terms in the expansion of the numerator of $\langle \prod_{t \in A} \sigma(t) \rangle_G$ which give non-zero contributions are those labeled by compatible families of contours containing a subfamily $\underline\lambda = \{ \lambda_1, \ldots, \lambda_m \}$ of $m$ open contours such that $\delta\underline\lambda = A$. Summing over all closed contours for a given family of $m$ open contours $\underline\lambda$, we get a contribution to the numerator equal to $w(\underline\lambda) Z(G|\underline\lambda)$. We can therefore write the key identity, a random-line representation for the even correlation function,
$$\Big\langle \prod_{t \in A} \sigma(t) \Big\rangle_G = \sum_{\underline\lambda :\, \delta\underline\lambda = A} q_G(\underline\lambda). \tag{6.9}$$
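Summed over all families $\underline\lambda$ with $\delta\underline\lambda = A$, the right-hand side of (6.9) equals the ratio of the total weight of bond sets with boundary $A$ to the total weight of bond sets with empty boundary, because every such bond set decomposes uniquely into open and closed contours. This aggregated form can be checked numerically without implementing the decomposition rules. The sketch below is our illustration on a small example graph; the individual weights $q_G(\underline\lambda)$ are not computed.

```python
import itertools
import math

def boundary(bonds):
    """Set of sites of odd index in a set of bonds."""
    index = {}
    for t, u in bonds:
        index[t] = index.get(t, 0) + 1
        index[u] = index.get(u, 0) + 1
    return {t for t, n in index.items() if n % 2 == 1}

def weight_sum(K, A):
    """Total weight (product of tanh K(e)) of all bond sets whose boundary equals A."""
    bonds = list(K)
    total = 0.0
    for r in range(len(bonds) + 1):
        for subset in itertools.combinations(bonds, r):
            if boundary(subset) == A:
                total += math.prod(math.tanh(K[e]) for e in subset)
    return total

def corr_expansion(K, A):
    """Right-hand side of (6.9), summed over all lambda with boundary A."""
    return weight_sum(K, set(A)) / weight_sum(K, set())

def corr_direct(vertices, K, A):
    """Expectation of prod_{t in A} sigma(t) under the Gibbs measure (6.1), by enumeration."""
    num = den = 0.0
    for spins in itertools.product((-1, 1), repeat=len(vertices)):
        s = dict(zip(vertices, spins))
        weight = math.exp(sum(k * s[t] * s[u] for (t, u), k in K.items()))
        num += weight * math.prod(s[t] for t in A)
        den += weight
    return num / den

# Example graph: 3 x 2 block of sites with nearest-neighbour bonds (illustrative couplings).
vertices = [(i, j) for i in range(3) for j in range(2)]
K = {}
for (i, j) in vertices:
    if i < 2:
        K[((i, j), (i + 1, j))] = 0.3 + 0.1 * i
    if j < 1:
        K[((i, j), (i, j + 1))] = 0.5 + 0.05 * i

if __name__ == "__main__":
    A = [(0, 0), (2, 1)]
    print(corr_direct(vertices, K, A), corr_expansion(K, A))
```

The same functions accept any even set $A$, for instance four sites, in which case the bond sets contain two open contours.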
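From now on, if we specify the graph $G$ by its set of vertices $V \subset \mathbb{Z}^{2*}$, then we write $\langle \cdot \rangle_V$ and $q_V(\underline\lambda)$ instead of $\langle \cdot \rangle_{G(V)}$ and $q_{G(V)}(\underline\lambda)$. Our first application of (6.9) is

Lemma 6.2. Let $\Lambda_L$ be the square box (2.2) and $\Lambda^*_L$ its dual box. Let $\eta$ be boundary conditions for $\Lambda_L$ and $V_L(\eta) \subset \mathbb{Z}^{2*}$ the set of end-points of the phase-separation lines of the configurations in $\Lambda_L$ with $\eta$ boundary conditions. Then
$$\sum_{\underline\lambda} q^{\eta}_{\Lambda_L}(\underline\lambda) = \Big\langle \prod_{t \in V_L(\eta)} \sigma(t) \Big\rangle_{\Lambda^*_L}. \tag{6.10}$$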
Proof. Since $\Lambda_L$ is a square box, the set of $\eta$ compatible families of contours in $\Lambda_L$ coincides with the set of compatible families of contours $\underline\theta$ of the graph $G(\Lambda^*_L)$ such that $\delta\underline\theta = V_L(\eta)$. By duality (compare (2.10) and (6.9)),
$$q^{\eta}_{\Lambda_L}(\underline\lambda) = q_{\Lambda^*_L}(\underline\lambda). \tag{6.11}$$
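□

Lemma 6.3. Let $G = (V, B)$ be a graph and $\underline\theta$ a $G$-compatible family of contours. Then $Z(G|\underline\theta)/Z(G)$ is a decreasing function of $K(e)$ for any $e \in B$. If $G' = (V', B')$ and $V \subset V'$, $B \subset B'$, then $q_G(\underline\theta) \ge q_{G'}(\underline\theta)$.

Proof. Let $B_1 := B \setminus \Delta(\underline\theta)$ and $G(B_1)$ be the graph defined by this set $B_1$ of bonds. Let $V(B_1)$ be the set of vertices of $G(B_1)$. By Lemma 6.1 we have
$$Z(G|\underline\theta) = Z(G(B_1)). \tag{6.12}$$
Therefore
$$\ln \frac{Z(G|\underline\theta)}{Z(G)} = \ln \frac{\Xi(G(B_1))}{\Xi(G)} + \ln \prod_{e \in \Delta(\underline\theta)} \cosh K(e) + \ln 2\, (|V| - |V(B_1)|). \tag{6.13}$$
If $e = \langle t,t' \rangle \in B_1$, then
$$\frac{\partial}{\partial K(e)} \ln \frac{Z(G|\underline\theta)}{Z(G)} = \langle \sigma(t)\sigma(t') \rangle_{G(B_1)} - \langle \sigma(t)\sigma(t') \rangle_G \le 0, \tag{6.14}$$
by GKS-inequalities, since $V(B_1) \subset V$. If $e = \langle t,t' \rangle \in \Delta(\underline\theta)$, then
$$\frac{\partial}{\partial K(e)} \ln \frac{Z(G|\underline\theta)}{Z(G)} = -\langle \sigma(t)\sigma(t') \rangle_G + \tanh K(e) \le 0, \tag{6.15}$$
since by GKS-inequalities
$$\langle \sigma(t)\sigma(t') \rangle_G \ge \langle \sigma(t)\sigma(t') \rangle_{\{t,t'\}} = \tanh K(e). \tag{6.16}$$
□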
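We make the following convention. If $\underline\theta_1$ and $\underline\theta_2$ are two compatible families of contours, such that $E(\underline\theta_1) \cap E(\underline\theta_2) = \emptyset$, then the decomposition of $E(\underline\theta_1) \cup E(\underline\theta_2)$ into contours does not necessarily coincide with $\underline\theta_1 \cup \underline\theta_2$. In such a situation we interpret $q_G(\underline\theta_1 \cup \underline\theta_2)$ as the weight of the family of contours of the decomposition of $E(\underline\theta_1) \cup E(\underline\theta_2)$ if necessary.

Lemma 6.4. Let $\underline\theta_1$ and $\underline\theta_2$ be two compatible families of contours of the graph $G = (V, B)$, such that $E(\underline\theta_1) \cap E(\underline\theta_2) = \emptyset$. Let $G'$ be the graph defined by the set of bonds $\big( B \setminus \Delta(\underline\theta_2) \big) \cup \big( \Delta(\underline\theta_2) \cap E(\underline\theta_1) \big)$. If $\Delta(\underline\theta_2) \cap E(\underline\theta_1) = \emptyset$, then
$$q_G(\underline\theta_1 \cup \underline\theta_2) = q_{G'}(\underline\theta_1)\, q_G(\underline\theta_2). \tag{6.17}$$
If $\Delta(\underline\theta_2) \cap E(\underline\theta_1) \ne \emptyset$, then
$$q_G(\underline\theta_1 \cup \underline\theta_2) \ge q_{G'}(\underline\theta_1)\, q_G(\underline\theta_2). \tag{6.18}$$
In both cases
$$q_G(\underline\theta_1 \cup \underline\theta_2) \ge q_G(\underline\theta_1)\, q_G(\underline\theta_2). \tag{6.19}$$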
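Proof. We have
$$q_G(\underline\theta_1 \cup \underline\theta_2) = w(\underline\theta_1) w(\underline\theta_2) \frac{Z(G|\underline\theta_1 \cup \underline\theta_2)}{Z(G)} = w(\underline\theta_1) \frac{Z(G|\underline\theta_1 \cup \underline\theta_2)}{Z(G')}\, w(\underline\theta_2) \frac{Z(G')}{Z(G)}. \tag{6.20}$$
A family of closed contours $\underline\gamma$ of $G$ contributes to $Z(G|\underline\theta_1 \cup \underline\theta_2)$ if and only if
$$\underline\gamma \cap \big( \Delta(\underline\theta_1) \cup \Delta(\underline\theta_2) \big) = \big( \underline\gamma \cap \Delta(\underline\theta_1) \big) \cup \big( \underline\gamma \cap \Delta(\underline\theta_2) \big) = \emptyset. \tag{6.21}$$
This is equivalent to saying that $\underline\gamma$ is a family of closed contours of the graph $G'$ and $\underline\gamma \cap \Delta(\underline\theta_1) = \emptyset$. Therefore (see Lemma 6.1)
$$Z(G|\underline\theta_1 \cup \underline\theta_2) = Z(G'|\underline\theta_1). \tag{6.22}$$
If $\Delta(\underline\theta_2) \cap E(\underline\theta_1) = \emptyset$, then $G'$ is the graph defined by the set of bonds $B \setminus \Delta(\underline\theta_2)$; hence
$$Z(G') = Z(G|\underline\theta_2). \tag{6.23}$$
If $\Delta(\underline\theta_2) \cap E(\underline\theta_1) \ne \emptyset$, then
$$Z(G') \ge Z(G|\underline\theta_2), \tag{6.24}$$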
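since the graph $G'$ contains some bonds of $\Delta(\underline\theta_2)$. The last assertion follows from the above results and Lemma 6.3. □

Let $\lambda_1$ and $\lambda_2$ be two open contours such that $\delta\lambda_1 = \{x, y\}$ and $\delta\lambda_2 = \{u, v\}$. We say that $\lambda_1$ and $\lambda_2$ are disjoint if either they are compatible or $E(\lambda_1) \cap E(\lambda_2) = \emptyset$ and the decomposition of $E(\lambda_1) \cup E(\lambda_2)$ into contours is a single contour. If $\lambda_1$ and $\lambda_2$ are disjoint, then we write $\lambda_1 \sqcup \lambda_2$ for the family $\{ \lambda_1, \lambda_2 \}$ or the single contour of the decomposition into contours of $E(\lambda_1) \cup E(\lambda_2)$. Notice that when $\lambda_1 \sqcup \lambda_2 = \lambda$ is a single contour, then $\{x, y\} \cap \{u, v\} \ne \emptyset$.

Lemma 6.5. Let $\lambda_1$ and $\lambda_2$ be two open contours such that $\delta\lambda_1 = \{x, y\}$ and $\delta\lambda_2 = \{u, v\}$. Then
$$\sum_{\substack{\lambda :\, \lambda = \lambda_1 \sqcup \lambda_2\\ \delta\lambda_1 = \{x,y\},\ \delta\lambda_2 = \{u,v\}}} q_G(\lambda) \le \sum_{\lambda_1 :\, \delta\lambda_1 = \{x,y\}} q_G(\lambda_1) \sum_{\lambda_2 :\, \delta\lambda_2 = \{u,v\}} q_G(\lambda_2). \tag{6.25}$$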
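Proof. The proof is easy if $\lambda_1 \sqcup \lambda_2 = \{ \lambda_1, \lambda_2 \}$. Indeed, from Lemma 6.4, since $E(\lambda_1) \cap \Delta(\lambda_2) = \emptyset$,
$$q_G(\lambda) = q_{G'}(\lambda_1)\, q_G(\lambda_2). \tag{6.26}$$
Summing over $\lambda_1$, keeping $\lambda_2$ fixed, we get from the basic formula (6.9) and GKS inequalities
$$\begin{aligned} \sum_{\lambda_1 :\, \lambda = \lambda_1 \sqcup \lambda_2} q_G(\lambda) &\le \langle \sigma(x)\sigma(y) \rangle_{G'}\, q_G(\lambda_2)\\ &\le \langle \sigma(x)\sigma(y) \rangle_G\, q_G(\lambda_2)\\ &= \sum_{\lambda_1 :\, \delta\lambda_1 = \{x,y\}} q_G(\lambda_1)\, q_G(\lambda_2). \end{aligned} \tag{6.27}$$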
We can now sum over $\lambda_2$. When $\lambda_1 \sqcup \lambda_2$ is a single contour $\lambda$, then the proof is more delicate, since the second case in Lemma 6.4 occurs. However, the proof is similar. For details we refer to the proof of Lemma 5.4 in [PV1]. □

Lemma 6.6. Let $G = (V, B)$ and $B_1 \subset B$. Let $G' = (V_1, B_1)$ be the graph generated by $B_1$. Let $x, y \in V_1$. Then
$$\sum_{\substack{\lambda :\, \delta\lambda = \{x,y\}\\ E(\lambda) \subset B_1}} q_G(\lambda) \le \sum_{\lambda :\, \delta\lambda = \{x,y\}} q_{G'}(\lambda) = \langle \sigma(x)\sigma(y) \rangle_{G'}. \tag{6.28}$$
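Proof. The result follows directly from Lemma 6.3. □

The next lemma gives a concentration result for the random-line representation (6.9). Let $G = (V, B)$ and $V_1 \subset V$. We define
$$\partial_{ext} V_1 := \{ t \in V \setminus V_1 : \exists t' \in V_1,\ \langle t,t' \rangle \in B \}. \tag{6.29}$$
Similarly, if $B_1 \subset B$, then we set
$$\partial_{ext} B_1 := \partial_{ext} V(B_1). \tag{6.30}$$
We say that $B_1$ is connected if for any pair of sites $x, y \in V(B_1)$, there is a path from $x$ to $y$ with all its bonds in $B_1$.

Lemma 6.7. Let $G = (V, B)$, $B_1 \subset B$ be a connected subset and $x, y$ two sites of the bonds of $B_1$. Suppose that all bonds incident to $x$ and $y$ belong to $B_1$. Then
$$\begin{aligned} 0 &\le \langle \sigma(x)\sigma(y) \rangle_G - \sum_{\substack{\lambda :\, \delta\lambda = \{x,y\}\\ E(\lambda) \subset B_1}} q_G(\lambda)\\ &\le \sum_{z \in \partial_{ext} B_1}\ \sum_{\substack{\lambda :\, \delta\lambda = \{x,y\}\\ \lambda \ni z}} q_G(\lambda)\\ &\le \sum_{z \in \partial_{ext} B_1} \langle \sigma(x)\sigma(z) \rangle_G\, \langle \sigma(z)\sigma(y) \rangle_G. \end{aligned} \tag{6.31}$$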
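Proof. Equation (6.9) gives
$$\langle \sigma(x)\sigma(y) \rangle_G = \sum_{\substack{\lambda :\, \delta\lambda = \{x,y\}\\ E(\lambda) \subset B_1}} q_G(\lambda) + \sum_{\substack{\lambda :\, \delta\lambda = \{x,y\}\\ E(\lambda) \not\subset B_1}} q_G(\lambda). \tag{6.32}$$
We estimate the second sum. For any $\lambda$ contributing to this sum, let $z(\lambda)$ be the first point of $\partial_{ext} B_1$ of the path from $x$ to $y$ defined by the contour $\lambda$. Any such path can be decomposed into $\lambda_1$ such that $\delta\lambda_1 = \{x, z\}$ and $\lambda_2$ such that $\delta\lambda_2 = \{z, y\}$ so that $\lambda = \lambda_1 \sqcup \lambda_2$. The result then follows from Lemma 6.5 and (6.9). □

There is a useful formula for the weight $q_G(\underline\theta)$, which is a consequence of the following elementary remarks. Let $K$ denote the function $e \in B \mapsto K(e) \in \mathbb{R}$. Given a compatible family of contours $\underline\theta$, we introduce a new function $K_s$, $0 \le s \le 1$,
$$K_s(e) := \begin{cases} K(e) & \text{if } e \notin \Delta(\underline\theta),\\ sK(e) & \text{if } e \in \Delta(\underline\theta). \end{cases} \tag{6.33}$$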
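Then $Z(G|\underline\theta)(K) = Z(G)(K_s)|_{s=0}$. On the other hand we have
$$\begin{aligned} \ln \Xi(G)(K) - \ln \Xi(G)(K_0) &= \int_0^1 \frac{d}{ds} \ln \Xi(G)(K_s)\, ds\\ &= \sum_{e = \langle t,t' \rangle \in \Delta(\underline\theta)} K(e) \int_0^1 \langle \sigma(t)\sigma(t') \rangle_G(K_s)\, ds. \end{aligned} \tag{6.34}$$
Therefore, for a compatible family of contours $\underline\theta$,
$$q_G(\underline\theta) = w(\underline\theta) \prod_{e \in \Delta(\underline\theta)} \cosh K(e)\, \exp\Big\{ -\sum_{e = \langle t,t' \rangle \in \Delta(\underline\theta)} K(e) \int_0^1 \langle \sigma(t)\sigma(t') \rangle_G(K_s)\, ds \Big\}. \tag{6.35}$$
Formula (6.35) allows us to compare $q_G(\underline\theta)(K)$ for different functions $K$ or different graphs $G'$. For example we get immediately the lower bound
$$q_G(\underline\theta) \ge w(\underline\theta) \prod_{e \in \Delta(\underline\theta)} \frac{1}{2} \big( 1 + e^{-2K(e)} \big). \tag{6.36}$$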
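Lemma 6.8. Let $G = (V, B)$, $V_1 \subset V$ and $G'$ be the graph generated by $V \setminus V_1$. Let $\underline\theta$ be a compatible family of contours of $G$ such that no site of $\underline\theta$ belongs to $\partial_{ext} V_1$. We set for all $t \in \partial_{ext} V_1$,
$$K(t) := \sum_{t' \in V_1 :\, \langle t,t' \rangle \in B} K(\langle t,t' \rangle). \tag{6.37}$$
Then
$$| \ln q_{G'}(\underline\theta) - \ln q_G(\underline\theta) | \le \sum_{e = \langle t,t' \rangle \in \Delta(\underline\theta)} K(e) \sum_{t'' \in \partial_{ext} V_1} K(t'') \Big( \langle \sigma(t)\sigma(t'') \rangle_{G'} + \langle \sigma(t')\sigma(t'') \rangle_{G'} \Big). \tag{6.38}$$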
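Proof. Formula (6.35) gives
$$\ln \frac{q_{G'}(\underline\theta)}{q_G(\underline\theta)} = \sum_{e = \langle t,t' \rangle \in \Delta(\underline\theta)} K(e) \int_0^1 \Big( \langle \sigma(t)\sigma(t') \rangle_G(K_s) - \langle \sigma(t)\sigma(t') \rangle_{G'}(K_s) \Big)\, ds. \tag{6.39}$$
We put a magnetic field $h'$ on each $t \in V_1$ and let $h' \to \infty$. We have
$$\langle \sigma(t)\sigma(t') \rangle_G(K_s) \le \langle \sigma(t)\sigma(t') \rangle^{+}_{G'}(K_s), \tag{6.40}$$
where $\langle \sigma(t)\sigma(t') \rangle^{+}_{G'}(K_s)$ is the expectation with respect to a Gibbs measure on $G'$ with coupling constants given by $K_s$ on the bonds of $G'$ and magnetic field $K(t)$ for $t \in \partial_{ext} V_1$.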
296
C.-E. Pfister, Y. Velenik
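Since $-\sigma(t)\sigma(t') + \sigma(t) + \sigma(t')$ is an increasing function we get by FKG inequalities
\[ \langle \sigma(t)\sigma(t') \rangle^+_{G'}(K_s) - \langle \sigma(t)\sigma(t') \rangle_{G'}(K_s) \le \langle \sigma(t) \rangle^+_{G'}(K_s) - \langle \sigma(t) \rangle_{G'}(K_s) + \langle \sigma(t') \rangle^+_{G'}(K_s) - \langle \sigma(t') \rangle_{G'}(K_s). \tag{6.41} \]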
We define an interpolating magnetic field for $t \in \partial_{\rm ext} V_1$,
\[ K_a(t) := aK(t), \quad 0 \le a \le 1. \tag{6.42} \]
Let $\langle\,\cdot\,\rangle^+_{G'}(K_s; a)$ be the expectation value with respect to this new measure and set
\[ \langle \sigma(t); \sigma(t') \rangle^+_{G'}(K_s; a) := \langle \sigma(t)\sigma(t') \rangle^+_{G'}(K_s; a) - \langle \sigma(t) \rangle^+_{G'}(K_s; a)\, \langle \sigma(t') \rangle^+_{G'}(K_s; a). \tag{6.43} \]
We have $\langle \sigma(t) \rangle_{G'}(K_s) = \langle \sigma(t) \rangle^+_{G'}(K_s; 0)$ and $\langle \sigma(t) \rangle^+_{G'}(K_s) = \langle \sigma(t) \rangle^+_{G'}(K_s; 1)$; therefore
\[ \langle \sigma(t) \rangle^+_{G'}(K_s) - \langle \sigma(t) \rangle_{G'}(K_s) = \int_0^1 \sum_{t'' \in \partial_{\rm ext} V_1} K(t'')\, \langle \sigma(t); \sigma(t'') \rangle^+_{G'}(K_s; a)\, da. \tag{6.44} \]
GHS inequalities imply that $\langle \sigma(t); \sigma(t'') \rangle^+_{G'}(K_s; a)$ is decreasing in $a$; thus
\[ \langle \sigma(t); \sigma(t'') \rangle^+_{G'}(K_s; a) \le \langle \sigma(t); \sigma(t'') \rangle^+_{G'}(K_s; 0) = \langle \sigma(t)\sigma(t'') \rangle_{G'}(K_s), \tag{6.45} \]
since by symmetry $\langle \sigma(t) \rangle^+_{G'}(K_s; 0) = 0$. The lemma follows from (6.39), (6.41), (6.44) and (6.45). $\Box$

6.2. Ising model on $\mathbb{Z}^{2*}$ above $T_c$. We consider the model on $(\mathbb{Z}^{2*}, \mathcal{E}^*)$ and choose as coupling constants $K(e) := \beta^*$ $\forall e$, with $\beta^* < \beta_c$. We recall that the decay-rate $\tau(y - x) = \tau(y - x; \beta^*)$ is strictly positive for such $\beta^*$ and that for any $\Lambda \subset \mathbb{Z}^{2*}$ (see Proposition 2.4)
\[ \langle \sigma(x)\sigma(y) \rangle_\Lambda(\beta^*) \le \exp\{ -\tau(y - x; \beta^*) \}. \tag{6.46} \]
Given any $\Lambda \subset \mathbb{Z}^{2*}$ and a family of compatible contours $\theta$ in $G(\Lambda)$, we define weights $q_\Lambda(\theta)$ (see Lemma 6.3),
\[ q_\Lambda(\theta) := \lim_{\Lambda_n \uparrow \Lambda} q_{\Lambda_n}(\theta), \tag{6.47} \]
where $\Lambda_n$ is an increasing sequence of finite subsets of $\Lambda$ such that every site of $\Lambda$ is eventually contained in some $\Lambda_n$. When $\Lambda = \mathbb{Z}^{2*}$ we write $q(\theta)$ instead of $q_{\mathbb{Z}^{2*}}(\theta)$. Lemmas 6.3 to 6.8 are still valid for the weights $q_\Lambda(\theta)$. On the other hand, the random-line representation does not extend automatically to the infinite case.
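Lemma 6.9. Let $K(e) := \beta^*$ $\forall e$ and $\beta^* < \beta_c$. Then the two-point correlation function of the Ising model has a random-line representation,
\[ \langle \sigma(t)\sigma(t') \rangle = \sum_{\lambda:\ \delta\lambda = \{t,t'\}} q(\lambda). \tag{6.48} \]
A formula similar to (6.9) is true for even correlation functions.

Proof. The hypothesis $\beta^* < \beta_c$ is equivalent to
\[ \sum_{t \in \mathbb{Z}^{2*}} \langle \sigma(0)\sigma(t) \rangle < \infty. \tag{6.49} \]
Let $\Lambda_1 \subset \Lambda_2$ be two finite subsets and suppose that $t, t' \in \Lambda_1$. Let $B_1$ be the set of bonds between sites of $\Lambda_1$; suppose furthermore that $B_1$ is connected. Then formula (6.9) and Lemma 6.7 give
\[ 0 \le \langle \sigma(t)\sigma(t') \rangle_{\Lambda_2} - \sum_{\substack{\lambda:\ \delta\lambda = \{t,t'\} \\ \mathcal{E}(\lambda) \subset B_1}} q_{\Lambda_2}(\lambda) \le \sum_{s \in \partial_{\rm ext} B_1} \langle \sigma(t)\sigma(s) \rangle\, \langle \sigma(s)\sigma(t') \rangle. \tag{6.50} \]
Given $\varepsilon > 0$, we can find $\Lambda_1$ so that the last sum in (6.50) is smaller than $\varepsilon$. Letting $\Lambda_2 \uparrow \mathbb{Z}^{2*}$ we get
\[ 0 \le \langle \sigma(t)\sigma(t') \rangle - \sum_{\substack{\lambda:\ \delta\lambda = \{t,t'\} \\ \mathcal{E}(\lambda) \subset B_1}} q(\lambda) \le \varepsilon. \tag{6.51} \]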
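The result now follows by letting $\Lambda_1 \uparrow \mathbb{Z}^{2*}$. $\Box$

Lemma 6.10. Let $K(e) := \beta^*$ $\forall e$ and $\beta^* < \beta_c$. Set
\[ S(x, y; \rho) := \{ t \in \mathbb{Z}^{2*} : \|x - t\| + \|y - t\| \le \|x - y\| + \rho \}, \tag{6.52} \]
with $\|\cdot\|$ the Euclidean norm. Then
\[ \sum_{\substack{\lambda:\ \delta\lambda = \{x,y\} \\ \mathcal{E}(\lambda) \not\subset \mathcal{E}(S(x,y;\rho))}} q(\lambda) \le \frac{|\partial_{\rm ext} S(x, y; \rho)|}{K}\, \|x - y\|^{1/2}\, e^{-\kappa\rho}\, \langle \sigma(x)\sigma(y) \rangle. \tag{6.53} \]
$K$ is the constant of Proposition 2.4.

Proof. By Lemma 6.7,
\[ \sum_{\substack{\lambda:\ \delta\lambda = \{x,y\} \\ \mathcal{E}(\lambda) \not\subset \mathcal{E}(S(x,y;\rho))}} q(\lambda) \le \sum_{t \in \partial_{\rm ext} S(x,y;\rho)} \langle \sigma(x)\sigma(t) \rangle\, \langle \sigma(t)\sigma(y) \rangle = \sum_{t \in \partial_{\rm ext} S(x,y;\rho)} \frac{\langle \sigma(x)\sigma(t) \rangle\, \langle \sigma(t)\sigma(y) \rangle}{\langle \sigma(x)\sigma(y) \rangle}\, \langle \sigma(x)\sigma(y) \rangle. \tag{6.54} \]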
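We apply the sharp triangle inequality to the numerator of the last expression,
\[ \langle \sigma(x)\sigma(t) \rangle\, \langle \sigma(t)\sigma(y) \rangle \le e^{-\tau(x-t) - \tau(y-t) + \tau(x-y)}\, e^{-\tau(x-y)} \le e^{-\tau(x-y) - \kappa\rho}. \tag{6.55} \]
Finally we apply Proposition 2.4 to the denominator,
\[ e^{-\tau(x-y)} \le \frac{\|x - y\|^{1/2}}{K}\, \langle \sigma(x)\sigma(y) \rangle. \tag{6.56} \]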
$\Box$

Lemma 6.10 characterizes those random lines which give the main contribution to the two-point correlation function. If $\rho \ge C \ln \|x - y\|$, with $C$ large enough, then the coefficient in front of $\langle \sigma(x)\sigma(y) \rangle$ in (6.53) tends to zero when $\|x - y\|$ diverges. The result is sharp.

6.3. Ising model on $\mathbb{L}^*$ above $T_c$. Let $\beta^* < \beta_c$ and $h^* > 0$. We consider the model on subsets $\Lambda_L^* \subset \mathbb{L}^*$ and choose as coupling constants
\[ K(e) := \begin{cases} h^*\beta^* & \forall\, e = \langle t, t' \rangle, \text{ with } t_2 = t'_2 = 1/2, \\ \beta^* & \text{otherwise.} \end{cases} \tag{6.57} \]
We set
\[ \Sigma_L^* := \{ t \in \Lambda_L^* : t_2 = 1/2 \}, \quad \Sigma^* := \{ t \in \mathbb{L}^* : t_2 = 1/2 \}. \tag{6.58} \]
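The choice (6.57) of coupling constants and the boundary row of (6.58) are simple enough to write down directly. The following is a minimal sketch under stated assumptions: sites are represented as pairs of `Fraction` coordinates, and the function names `coupling` and `wall_row`, the sample box and the numerical values of $\beta^*$ and $h^*$ are all hypothetical choices made for the illustration.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def coupling(t, t_prime, beta_star, h_star):
    """K(e) of (6.57) for the bond e = <t, t'>; sites are (t1, t2) pairs."""
    on_wall_row = t[1] == HALF and t_prime[1] == HALF
    return h_star * beta_star if on_wall_row else beta_star


def wall_row(sites):
    """Sites whose second coordinate equals 1/2, as in (6.58)."""
    return {t for t in sites if t[1] == HALF}


# Hypothetical 3 x 2 piece of the dual lattice with half-integer coordinates.
sites = {(Fraction(2 * i + 1, 2), Fraction(2 * j + 1, 2))
         for i in range(3) for j in range(2)}
row = wall_row(sites)

# A horizontal bond inside the row t2 = 1/2 and a vertical bond leaving it.
k_wall = coupling((HALF, HALF), (Fraction(3, 2), HALF), beta_star=0.3, h_star=2.0)
k_bulk = coupling((HALF, HALF), (HALF, Fraction(3, 2)), beta_star=0.3, h_star=2.0)
```

Only bonds with both endpoints in the row $t_2 = 1/2$ receive the modified coupling $h^*\beta^*$; a vertical bond touching that row keeps the bulk value $\beta^*$.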
The weight $q_{\mathbb{L}^*}(\theta)$ is defined by (6.47). Lemma 6.11 establishes the random-line representation for the two-point function; its proof is similar to that of Lemma 6.9.

Lemma 6.11. Let $\beta^* < \beta_c$, $h^* > 0$ and the coupling constants be given by (6.57). Then the two-point correlation function of the Ising model on $\mathbb{L}^*$ has a random-line representation,
\[ \langle \sigma(t)\sigma(t') \rangle_{\mathbb{L}^*} = \sum_{\lambda:\ \delta\lambda = \{t,t'\}} q_{\mathbb{L}^*}(\lambda). \tag{6.59} \]
A formula similar to (6.59) is true for even correlation functions.

Lemma 6.12. Let $\beta^* < \beta_c$, $h^* > 0$, $\Lambda_L^* \subset \mathbb{L}^*$ and $\theta$ be a family of compatible contours. Let $q_{\Lambda_L^*}(\theta)$ be the weight for the model defined on $\Lambda_L^*$ with coupling constants (6.57). Let $q(\theta)$ be the weight for the model on $\mathbb{Z}^{2*}$ with coupling constants $K(e) \equiv \beta^*$.
1. If $h^* \le 1$, then $q_{\Lambda_L^*}(\theta) \ge q(\theta)$.
2. Let $d(\theta) := \min\{ |t_2 - 3/2| : t \in \Delta(\theta) \} \ge 1$. If $h^* \ge 1$, then
\[ \ln \frac{q_{\Lambda_L^*}(\theta)}{q(\theta)} \ge -O(L^2) \exp\{ -O(d(\theta)) \}. \tag{6.60} \]
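Proof. The first case follows directly from Lemma 6.3. The second case follows from Lemma 6.8. By Lemma 6.3, $q_{\Lambda_L^* \setminus \Sigma_L^*}(\theta) \ge q(\theta)$. Since
\[ q_{\Lambda_L^*}(\theta) \ge \frac{q_{\Lambda_L^*}(\theta)}{q_{\Lambda_L^* \setminus \Sigma_L^*}(\theta)}\, q(\theta), \tag{6.61} \]
we must compare $q_{\Lambda_L^*}(\theta)$ and $q_{\Lambda_L^* \setminus \Sigma_L^*}(\theta)$. We apply Lemma 6.8 with $G$ the graph generated by $\Lambda_L^*$ and $G'$ the graph generated by $\Lambda_L^* \setminus \Sigma_L^*$. Notice that $\langle \sigma(t)\sigma(t') \rangle_{G'} \le \langle \sigma(t)\sigma(t') \rangle$; therefore, if $t \in \Delta(\theta)$,
\begin{align*}
\sum_{t' \in \Lambda_L^*:\ t'_2 = 3/2} \langle \sigma(t)\sigma(t') \rangle_{G'} &\le \sum_{t':\ t'_2 = 3/2} \langle \sigma(t)\sigma(t') \rangle \\
&\le \sum_{t':\ t'_2 = 3/2}\ \sum_{\lambda:\ \delta\lambda = \{t,t'\}} q(\lambda) \tag{6.62} \\
&\le \sum_{t':\ t'_2 = 3/2}\ \sum_{s:\ s_2 = 3/2}\ \sum_{\substack{\lambda:\ z(\lambda) = s \\ \delta\lambda = \{t,t'\}}} q(\lambda), \tag{6.63}
\end{align*}
with $z(\lambda)$ the first site $z$ of the path defined by $\lambda$ with initial point $t$, such that $z_2 = 3/2$. To estimate the last sums we use Lemma 6.5. We have
\[ \sum_{t':\ t'_2 = 3/2} \langle \sigma(t)\sigma(t') \rangle \le \sum_{t':\ t'_2 = 3/2}\ \sum_{s:\ s_2 = 3/2} \exp\{ -\tau(t - s) - \tau(s - t') \}. \tag{6.64} \]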
We sum over $t'$ and get a finite contribution independent of $s$; then the sum over $s$ gives a contribution $\exp\{ -O(d(\theta)) \}$. Since $|\Delta(\theta)| \le O(L^2)$ we get (6.60). $\Box$

The next lemma characterizes those random lines which give the main contribution to the boundary two-point correlation function. We consider the case $\beta^* < \beta_c$ and $h^* > h_w(\beta)^*$, when the random lines stick to $\Sigma^*$. In the other cases there is a result similar to that of Lemma 6.10.

Lemma 6.13. Let $\beta^* < \beta_c$, $h^* > h_w(\beta)^*$ and let the coupling constants be given by (6.57). Let $x, y \in \Sigma^*$, $x_1 < y_1$ and $\rho_i \in \mathbb{N}$, $i = 1, 2$; we set
\[ B(x, y; \rho_1, \rho_2) := \{ t \in \mathbb{L}^* : x_1 - \rho_1 \le t_1 \le y_1 + \rho_1,\ 1/2 \le t_2 \le 1/2 + \rho_2 \}. \tag{6.65} \]
Then
\[ \sum_{\substack{\lambda:\ \delta\lambda = \{x,y\} \\ \mathcal{E}(\lambda) \not\subset \mathcal{E}(B)}} q_{\mathbb{L}^*}(\lambda) \le \frac{\langle \sigma(x)\sigma(y) \rangle_{\mathbb{L}^*}}{K''} \Big[ 2\rho_2 \exp\{ -2\rho_1 \hat\tau_{\rm bd} \} + O\big( \rho_2 |x_1 - y_1| \exp\{ -\kappa\rho_2 \} \big) \Big]. \tag{6.66} \]
$K''$ is the constant of Proposition 2.4; $\hat\tau_{\rm bd} = \hat\tau_{\rm bd}(\beta, h)$ with $\beta$ and $h$ the dual values of $\beta^*$ and $h^*$; $\kappa$ is the constant in the sharp triangle inequality and $C := \hat\tau((1, 0)) - \hat\tau_{\rm bd} > 0$.
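Proof. We decompose $\partial_{\rm ext} B$ into two parts:
\[ V_1 := \{ t \in \partial_{\rm ext} B : t_1 = x_1 - \rho_1 - 1 \text{ or } t_1 = y_1 + \rho_1 + 1 \}, \quad V_2 := \partial_{\rm ext} B \setminus V_1. \tag{6.67} \]
We consider $\lambda$ as a unit-speed parametrized curve, $s \in [0, |\lambda|] \mapsto \lambda(s)$, with initial point $\lambda(0) = x$; we suppose that $s^*$ is the first time such that $\lambda(s^*) \in \partial_{\rm ext} B$; we set $t = \lambda(s^*)$. We have
\[ \sum_{\substack{\lambda:\ \delta\lambda = \{x,y\} \\ \mathcal{E}(\lambda) \not\subset \mathcal{E}(B)}} q_{\mathbb{L}^*}(\lambda) \le \sum_{\substack{t \in \partial_{\rm ext} B: \\ t \in V_1}}\ \sum_{\substack{\lambda:\ \delta\lambda = \{x,y\} \\ \lambda \ni t}} q_{\mathbb{L}^*}(\lambda) + \sum_{\substack{t \in \partial_{\rm ext} B: \\ t \in V_2}}\ \sum_{\substack{\lambda:\ \delta\lambda = \{x,y\} \\ \lambda \ni t}} q_{\mathbb{L}^*}(\lambda). \tag{6.68} \]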
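We treat these two sums separately. By Lemma 6.7, symmetry and GKS inequalities,
\begin{align*}
\sum_{\substack{t \in \partial_{\rm ext} B: \\ t \in V_1}}\ \sum_{\substack{\lambda:\ \delta\lambda = \{x,y\} \\ \lambda \ni t}} q_{\mathbb{L}^*}(\lambda) &\le 2 \sum_{\substack{t \in \partial_{\rm ext} B \\ t_1 = x_1 - \rho_1 - 1}} \langle \sigma(x)\sigma(t) \rangle_{\mathbb{L}^*}\, \langle \sigma(t)\sigma(y) \rangle_{\mathbb{L}^*} \tag{6.69} \\
&= 2 \sum_{\substack{t \in \partial_{\rm ext} B \\ t_1 = x_1 - \rho_1 - 1}} \langle \sigma(\bar{x})\sigma(t) \rangle_{\mathbb{L}^*}\, \langle \sigma(t)\sigma(y) \rangle_{\mathbb{L}^*} \\
&\le 2 \sum_{\substack{t \in \partial_{\rm ext} B \\ t_1 = x_1 - \rho_1 - 1}} \langle \sigma(\bar{x})\sigma(y) \rangle_{\mathbb{L}^*} \\
&\le \frac{2\rho_2}{K''} \exp\{ -2\rho_1 \hat\tau_{\rm bd} \}\, \langle \sigma(x)\sigma(y) \rangle_{\mathbb{L}^*},
\end{align*}
where $\bar{x}$ is the image of $x$ under a reflection of axis $\{ u : u_1 = x_1 - \rho_1 - 1 \}$.

Let $t \in V_2$, with $t = \lambda(s^*)$. Let $s_1$ be the last time before $s^*$ such that $\lambda(s_1) \in \Sigma^*$ and $s_2$ the first time after $s^*$ such that $\lambda(s_2) \in \Sigma^*$. We set $u := \lambda(s_1)$ and $v := \lambda(s_2)$; we have $x_1 - \rho_1 \le u_1 \le y_1 + \rho_1$. By definition no bond of $\lambda$ between times $s_1$ and $s^*$ belongs to $\mathcal{E}(\Sigma^*)$. Therefore Lemma 6.6 and GKS inequalities give
\[ \sum_{\substack{\lambda':\ \delta\lambda' = \{u,t\} \\ \mathcal{E}(\lambda') \cap \mathcal{E}(\Sigma^*) = \emptyset}} q_{\mathbb{L}^*}(\lambda') \le \langle \sigma(u)\sigma(t) \rangle. \tag{6.70} \]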
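The hypothesis $h^* > h_w^*$ implies that $C := \hat\tau((1, 0)) - \hat\tau_{\rm bd} > 0$. Using Lemma 6.5, (6.70) and the sharp triangle inequality we get
\begin{align*}
\sum_{\substack{\lambda:\ \lambda \ni t \\ \delta\lambda = \{x,y\}}} q_{\mathbb{L}^*}(\lambda) &\le \sum_{u,v} \langle \sigma(x)\sigma(u) \rangle_{\mathbb{L}^*}\, \langle \sigma(u)\sigma(t) \rangle\, \langle \sigma(t)\sigma(v) \rangle\, \langle \sigma(v)\sigma(y) \rangle_{\mathbb{L}^*} \tag{6.71} \\
&\le \sum_{u,v} \exp\{ -\hat\tau_{\rm bd}(|u_1 - x_1| + |y_1 - v_1|) \} \cdot \exp\{ -\hat\tau(t - u) - \hat\tau(v - t) \} \\
&\le \sum_{u,v} \exp\{ -\hat\tau_{\rm bd}(|u_1 - x_1| + |y_1 - v_1|) \}\, \exp\{ -\hat\tau(u - v) \} \cdot \exp\{ -\kappa(\|u - t\| + \|t - v\| - \|u - v\|) \}.
\end{align*}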
We have $\hat\tau(u - v) = C|u_1 - v_1| + \hat\tau_{\rm bd}|u_1 - v_1|$. Therefore
\[ \exp\{ -\hat\tau_{\rm bd}(|u_1 - x_1| + |y_1 - v_1|) - \hat\tau(u - v) \} \le \frac{\langle \sigma(x)\sigma(y) \rangle_{\mathbb{L}^*}}{K''} \cdot \exp\{ -\hat\tau_{\rm bd}(|u_1 - x_1| + |y_1 - v_1| + |u_1 - v_1| - |x_1 - y_1|) - C|u_1 - v_1| \}. \tag{6.72} \]
We sum over $u$, $v$ and $t$, which are sums over $u_1$, $v_1$ and $t_1$. We set for $s \in \mathbb{R}$ and $[a, b] \subset \mathbb{R}$,
\[ d(s, [a, b]) := \min\{ |t - s| : t \in [a, b] \}. \tag{6.73} \]
First notice that
\[ |u_1 - x_1| + |y_1 - v_1| + |u_1 - v_1| - |x_1 - y_1| \ge 2d(v_1, [x_1, y_1]) \quad \text{if } v_1 \notin [x_1, y_1], \tag{6.74} \]
and
\[ |u_1 - x_1| + |y_1 - v_1| + |u_1 - v_1| - |x_1 - y_1| \ge 2d(u_1, [x_1, y_1]) \quad \text{if } u_1 \notin [x_1, y_1]. \tag{6.75} \]
Let $\alpha := \kappa/(C + \kappa)$; if $|u_1 - v_1| \le \alpha\rho_2$, then
\[ \exp\{ -\kappa(\|u - t\| + \|t - v\| - \|u - v\|) \} \le \exp\{ -\kappa(2 - \alpha)\rho_2 \}. \tag{6.76} \]
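The elementary inequalities (6.74) and (6.75) follow from the triangle inequality, and they are easy to confirm by brute force on integers. The sketch below does this for one hypothetical choice of $x_1 < y_1$ and a finite window of values of $u_1$, $v_1$; the function names and the ranges are illustrative assumptions.

```python
import itertools


def dist_to_interval(s, a, b):
    """d(s, [a, b]) of (6.73): distance from s to the interval [a, b], a <= b."""
    if s < a:
        return a - s
    if s > b:
        return s - b
    return 0


def excess(u1, v1, x1, y1):
    """|u1 - x1| + |y1 - v1| + |u1 - v1| - |x1 - y1|, the left side of (6.74)."""
    return abs(u1 - x1) + abs(y1 - v1) + abs(u1 - v1) - abs(x1 - y1)


x1, y1 = 0, 5  # hypothetical endpoints with x1 < y1
violations = []
for u1, v1 in itertools.product(range(-10, 16), repeat=2):
    value = excess(u1, v1, x1, y1)
    if not (x1 <= v1 <= y1) and value < 2 * dist_to_interval(v1, x1, y1):
        violations.append(("6.74", u1, v1))
    if not (x1 <= u1 <= y1) and value < 2 * dist_to_interval(u1, x1, y1):
        violations.append(("6.75", u1, v1))

# In the ordered case x1 < u1 < v1 < y1 the excess vanishes identically.
ordered_excess = {excess(u1, v1, x1, y1)
                  for u1 in range(x1 + 1, y1) for v1 in range(u1 + 1, y1)}
```

No violation is found in the window, and in the ordered case the excess is exactly zero, which is why that case produces the factor $|x_1 - y_1|$ rather than an exponentially small term.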
If $t_1 \notin [u_1, v_1]$ or $t_1 \notin [v_1, u_1]$, then
\[ \|u - t\| + \|t - v\| - \|u - v\| \ge \rho_2 + \min\{ |u_1 - t_1|, |v_1 - t_1| \}. \tag{6.77} \]
Let $v_1 \notin [x_1, y_1]$. We consider two cases. First suppose that $|u_1 - v_1| \ge \alpha\rho_2$. We sum over $t_1$ using (6.77), getting at most a contribution $O(|u_1 - v_1|)$; then we sum over $u_1$, such that $|u_1 - v_1| \ge \alpha\rho_2$, using the factor $\exp\{ -C|u_1 - v_1| \}$; finally we sum over $v_1$ using (6.74). Thus we get a contribution
\[ O\big( \exp\{ -\kappa(2 - \alpha)\rho_2 \} \big). \tag{6.78} \]
Suppose that $|u_1 - v_1| \le \alpha\rho_2$. We sum over $t_1$, using now (6.77) and (6.76), getting at most a contribution
\[ O(\rho_2 |u_1 - v_1|) \exp\{ -\kappa(2 - \alpha)\rho_2 \}; \tag{6.79} \]
then we sum over $u_1$ using the factor $\exp\{ -C|u_1 - v_1| \}$; finally we sum over $v_1$ using (6.74), getting a contribution (6.78). The case $u_1 \notin [x_1, y_1]$ is similar. It remains to consider the case where $x_1 < u_1 < v_1 < y_1$. We proceed in the same manner, but this time the last sum gives a factor $|x_1 - y_1|$ since in this case
\[ |u_1 - x_1| + |y_1 - v_1| + |u_1 - v_1| - |x_1 - y_1| = 0. \tag{6.80} \]
Therefore we get a contribution
\[ O\big( \rho_2 |x_1 - y_1| \exp\{ -\kappa\rho_2 \} \big). \tag{6.81} \]
$\Box$
7. On the Correlation Length Above $T_c$

Let $\beta^* < \beta_c$ and $0 < h^* < \infty$. The model is defined in the box $\Lambda_L^*$ with free boundary conditions and coupling constants (6.57). We study the influence of the boundary effect on the correlation length due to the coupling constants $K(e) = h^*\beta^*$, $e \in \mathcal{E}(\Sigma_L^*)$. We consider two definitions, which we call short correlation length and long correlation length, following a similar terminology introduced in [SML] about the long-range order. The short correlation length is the standard correlation length. Let $t, t' \in \mathbb{Z}^{2*}$; we define
\[ \frac{1}{\xi_{\rm sh}(t, t'; \beta^*)} := - \lim_{\substack{k \to \infty \\ k \in \mathbb{N}}} \frac{1}{k\|t - t'\|} \ln \langle \sigma(kt)\sigma(kt') \rangle(\beta^*). \tag{7.1} \]
In (7.1) we compute the expectation value with respect to the infinite-volume Gibbs state on $\mathbb{Z}^{2*}$, which is unique. Then we take the limit $k \to \infty$. We have $\xi_{\rm sh}(t, t'; \beta^*) = \xi_{\rm sh}(s, s'; \beta^*)$ if $s - s'$ is a multiple of $t - t'$. In the case of the long correlation length we perform the thermodynamic limit and the limit $k \to \infty$ simultaneously. Let $t, t' \in \Lambda_L^*$; the long correlation length is defined by
\[ \frac{1}{\xi_{\rm lg}(t, t'; \beta^*, h^*)} := - \lim_{\substack{k \to \infty \\ k \in \mathbb{N}}} \frac{1}{k\|t - t'\|} \ln \langle \sigma(kt)\sigma(kt') \rangle_{\Lambda_{kL}^*}(\beta^*, h^*). \tag{7.2} \]
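To see how slowly the limit in (7.1) is approached, one can evaluate the finite-$k$ expression on a model two-point function. The sketch below assumes, purely for illustration, a correlation of the form $A\,r^{-1/2}e^{-\tau r}$ at distance $r$ (an exponential decay with a power-law prefactor of the kind appearing in (6.56)); the values of `TAU` and `AMP` are hypothetical, and logarithms are used throughout to avoid floating-point underflow.

```python
import math

TAU = 0.4   # assumed decay rate (hypothetical value)
AMP = 1.3   # assumed amplitude of the prefactor (hypothetical value)


def log_model_correlation(r):
    """ln of the toy two-point function AMP * r**(-1/2) * exp(-TAU * r)."""
    return math.log(AMP) - 0.5 * math.log(r) - TAU * r


def inverse_length_estimate(k, separation):
    """Finite-k version of (7.1): -(1 / (k * ||t - t'||)) * ln <sigma(kt) sigma(kt')>."""
    r = k * separation
    return -log_model_correlation(r) / r


separation = 5.0  # plays the role of ||t - t'||
estimates = [inverse_length_estimate(k, separation)
             for k in (1, 10, 100, 1000, 10000)]
```

For this toy correlation the estimates decrease towards `TAU`, with a correction of order $(\ln k)/k$ coming from the power-law prefactor.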
$\xi_{\rm lg}(t, t'; \beta^*, h^*)$ depends on the position of the sites $t$ and $t'$ in the box $\Lambda_L^*$. The next lemma contains one of the main estimates of the paper, which we shall use later on, when discussing phase-separation lines.

Lemma 7.1. Let $\beta^* < \beta_c$ and $0 < h^* < \infty$.

(1) There exist constants $c_1$, $c_2$, $c'$, $c''$ with the following property. Let $t, t' \in \Lambda_L^*$; suppose that there exist $p, p' \in \Lambda_L^*$ such that
1. $\|p - t\| \le c_1 \ln L$ and $\|p' - t'\| \le c_2 \ln L$,
2. $S(p, p'; c' \ln L) \subset \Lambda_L^* \cap \{ t \in \mathbb{L}^* : t_2 \ge c'' \ln L \}$ (see (6.53)).
Then there exist $C$ and $L_0$ such that $\forall L \ge L_0$ and $\forall\, t, t'$ as above,
\[ \langle \sigma(t)\sigma(t') \rangle_{\Lambda_L^*}(\beta^*, h^*) \ge \frac{1}{L^C}\, e^{-\tau(p - p'; \beta^*)}. \tag{7.3} \]

(2) Let $h^* > h_w(\beta)^*$. There exist $c_3$, $c_4$ with the following property. Let $m = (m_1, 1/2) \in \Lambda_L^*$ and $n = (n_1, 1/2) \in \Lambda_L^*$; suppose that (see (6.65))
\[ B(m, n; c_3 \ln L, c_4 \ln L) \subset \Lambda_L^*. \tag{7.4} \]
Then there exist $C$ and $L_0$ such that $\forall L \ge L_0$ and $\forall\, m, n$ as above,
\[ \langle \sigma(m)\sigma(n) \rangle_{\Lambda_L^*}(\beta^*, h^*) \ge C e^{-\hat\tau_{\rm bd}(\beta^*, h^*)|n_1 - m_1|}. \tag{7.5} \]

Proof. By GKS inequalities
\[ \langle \sigma(t)\sigma(t') \rangle_{\Lambda_L^*} \ge \langle \sigma(t)\sigma(p) \rangle_{\Lambda_L^*}\, \langle \sigma(p)\sigma(p') \rangle_{\Lambda_L^*}\, \langle \sigma(p')\sigma(t') \rangle_{\Lambda_L^*}. \tag{7.6} \]
From (6.36) we have
\[ \langle \sigma(t)\sigma(p) \rangle_{\Lambda_L^*} \ge \exp\{ -O(\ln L) \} \tag{7.7} \]
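and
\[ \langle \sigma(p')\sigma(t') \rangle_{\Lambda_L^*} \ge \exp\{ -O(\ln L) \}. \tag{7.8} \]
Let $S_L := S(p, p'; c' \ln L)$; by Lemmas 6.12, 6.10, Proposition 2.4 and taking $c'$ and $c''$ large enough, there exists $L_0$ such that $\forall L \ge L_0$,
\begin{align*}
\langle \sigma(p)\sigma(p') \rangle_{\Lambda_L^*} &\ge \sum_{\substack{\lambda:\ \mathcal{E}(\lambda) \subset \mathcal{E}(S_L) \\ \delta\lambda = \{p,p'\}}} q_{\Lambda_L^*}(\lambda) \tag{7.9} \\
&\ge \frac{1}{2} \sum_{\substack{\lambda:\ \mathcal{E}(\lambda) \subset \mathcal{E}(S_L) \\ \delta\lambda = \{p,p'\}}} q(\lambda) \\
&= \frac{1}{2} \sum_{\lambda:\ \delta\lambda = \{p,p'\}} q(\lambda) - \frac{1}{2} \sum_{\substack{\lambda:\ \mathcal{E}(\lambda) \not\subset \mathcal{E}(S_L) \\ \delta\lambda = \{p,p'\}}} q(\lambda) \\
&\ge \frac{1}{4} \langle \sigma(p)\sigma(p') \rangle \\
&\ge \frac{K}{4\|p - p'\|^{1/2}}\, e^{-\tau(p - p')}.
\end{align*}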
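This proves (1), since $\|p - p'\| \le O(L)$. We estimate $\langle \sigma(m)\sigma(n) \rangle_{\Lambda_L^*}$ by Lemmas 6.3, 6.13 and Proposition 2.4. Let $B_L := B(m, n; c_3 \ln L, c_4 \ln L)$; by taking $c_3$ and $c_4$ large enough, there exists $L_0$ such that $\forall L \ge L_0$,
\begin{align*}
\langle \sigma(m)\sigma(n) \rangle_{\Lambda_L^*} &\ge \sum_{\substack{\lambda:\ \mathcal{E}(\lambda) \subset \mathcal{E}(B_L) \\ \delta\lambda = \{m,n\}}} q_{\Lambda_L^*}(\lambda) \tag{7.10} \\
&\ge \sum_{\substack{\lambda:\ \mathcal{E}(\lambda) \subset \mathcal{E}(B_L) \\ \delta\lambda = \{m,n\}}} q_{\mathbb{L}^*}(\lambda) \\
&\ge \frac{1}{2} \sum_{\lambda:\ \delta\lambda = \{m,n\}} q_{\mathbb{L}^*}(\lambda) - \frac{1}{2} \sum_{\substack{\lambda:\ \mathcal{E}(\lambda) \not\subset \mathcal{E}(B_L) \\ \delta\lambda = \{m,n\}}} q_{\mathbb{L}^*}(\lambda) \\
&\ge \frac{1}{4} \langle \sigma(m)\sigma(n) \rangle_{\mathbb{L}^*} \\
&\ge \frac{K''}{4}\, e^{-\hat\tau_{\rm bd}|n_1 - m_1|}.
\end{align*}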
This proves (2). $\Box$

Let $t, t' \in \Lambda_L^*$. Suppose that $h^* > h_w(\beta)^*$. We apply Lemma 7.1 to show that we may have (depending on the choice of $t$ and $t'$)
\[ \xi_{\rm lg}(t, t'; \beta^*, h^*) > \xi_{\rm sh}(t, t'; \beta^*). \tag{7.11} \]
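We assume that $t_1 < t'_1$ and choose $m = (m_1, 1/2)$ and $n = (n_1, 1/2)$, $m_1 < n_1$, $m, n \in \Lambda_L^*$. By GKS inequalities
\[ \langle \sigma(kt)\sigma(kt') \rangle_{\Lambda_{kL}^*} \ge \langle \sigma(kt)\sigma(km) \rangle_{\Lambda_{kL}^*}\, \langle \sigma(km)\sigma(kn) \rangle_{\Lambda_{kL}^*}\, \langle \sigma(kn)\sigma(kt') \rangle_{\Lambda_{kL}^*}. \tag{7.12} \]
If $k$ is large enough, then we can use Lemma 7.1 to estimate (7.12). There exists $\tilde{C}$ such that
\[ \langle \sigma(kt)\sigma(kt') \rangle_{\Lambda_{kL}^*} \ge \frac{1}{O(k^{\tilde{C}})}\, e^{-k(\tau(t - m) + \tau(n - t'))}\, e^{-k\hat\tau_{\rm bd}|n_1 - m_1|}. \tag{7.13} \]
Therefore
\[ \frac{1}{\xi_{\rm lg}(t, t'; \beta^*, h^*)} \le \frac{\tau(t - m) + \tau(n - t') + \hat\tau_{\rm bd}|n_1 - m_1|}{\|t - t'\|}. \tag{7.14} \]
We can optimize this upper bound by taking the minimum over $m$ and $n$. On the other hand
\[ \frac{1}{\xi_{\rm sh}(t, t'; \beta^*)} = \frac{\tau(t - t')}{\|t - t'\|}. \tag{7.15} \]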
The results of Sect. 4 show that there exist $t$, $t'$, when $h^* > h_w(\beta)^*$, such that for suitable $m$ and $n$,
\[ \tau(t - m) + \tau(n - t') + \hat\tau_{\rm bd}|n_1 - m_1| < \tau(t - t'), \tag{7.16} \]
and so
\[ \xi_{\rm lg}(t, t'; \beta^*, h^*) > \xi_{\rm sh}(t, t'; \beta^*). \tag{7.17} \]
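The optimization over $m$ and $n$ mentioned after (7.14) can be illustrated numerically. The sketch below is not the paper's computation: it replaces the true decay rate by an isotropic stand-in $\tau(x) = \tau_0\|x\|$, uses hypothetical values `TAU0` and `TAU_BD` with $\hat\tau_{\rm bd} < \tau_0$, and searches over integer values of $m_1$ and $n_1$ for simplicity.

```python
import math

TAU0 = 1.0      # assumed bulk decay rate per unit Euclidean length (hypothetical)
TAU_BD = 0.6    # assumed boundary decay rate, smaller than TAU0 (hypothetical)


def tau(x):
    """Isotropic stand-in for the bulk decay rate tau(x); an assumption."""
    return TAU0 * math.hypot(x[0], x[1])


def wall_route_cost(t, t_prime, m1, n1):
    """Numerator of (7.14) for m = (m1, 1/2) and n = (n1, 1/2)."""
    m, n = (m1, 0.5), (n1, 0.5)
    return (tau((t[0] - m[0], t[1] - m[1]))
            + tau((n[0] - t_prime[0], n[1] - t_prime[1]))
            + TAU_BD * abs(n1 - m1))


# Two sites at height 2.5, far apart horizontally (hypothetical choice).
t, t_prime = (0.0, 2.5), (40.0, 2.5)
candidates = [(wall_route_cost(t, t_prime, m1, n1), m1, n1)
              for m1 in range(0, 41) for n1 in range(m1 + 1, 41)]
best_cost, best_m1, best_n1 = min(candidates)
direct_cost = tau((t[0] - t_prime[0], t[1] - t_prime[1]))  # numerator of (7.15)
```

With these values the route through the wall costs about 27.26, against 40 for the direct route, so the analogue of (7.16) holds and the bound (7.14) lies strictly below (7.15). Whether this happens for the actual Ising decay rates depends on $t$, $t'$ and $h^*$, as discussed in the text.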
8. From Microscopic to Macroscopic Theory

We show that the phase-separation line $\lambda$ is concentrated in a neighborhood of the solution of the variational problem of Sect. 4, scaled by $L$, with probability tending to 1 when $L \to \infty$. The thickness of the neighborhood is at most $O((L \ln L)^{1/2})$. Consequently, if we do a coarse-grained description of the configurations, using cells of linear size $L^\alpha$, $1/2 < \alpha < 1$, then we see the emergence of an interface, which coincides with the solution of the variational problem. This justifies the macroscopic theory, starting from the microscopic theory. It is possible to consider an even more general situation. Suppose that we prescribe a curve $\mathcal{C} \subset Q$ from $A$ to $B$. We can estimate the probability that the phase-separation line is in a neighborhood of this curve scaled by $L$, the thickness of the neighborhood being at most $O((L \ln L)^{1/2})$. Using the method developed fully in [PV1], this probability is roughly equal to
\[ \exp\{ -L(W(\mathcal{C}) - W^*) \}, \tag{8.1} \]
where $W^*$ is the minimum of the variational problem. We shall not give the details of that estimate here.
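8.1. Main result. The weight of a separation line $\lambda$ in $\Lambda_L^*$, going from $u^L$ to $v^L$, is given by $q_{\Lambda_L^*}(\lambda)$. These weights define a measure on the set of the phase-separation lines, such that the total mass is
\[ \sum_{\substack{\mathcal{E}(\lambda) \subset \mathcal{E}(\Lambda_L^*): \\ \delta\lambda = \{u^L, v^L\}}} q_{\Lambda_L^*}(\lambda) = \langle \sigma(u^L)\sigma(v^L) \rangle_{\Lambda_L^*}. \tag{8.2} \]
Consequently we can introduce the following probability measure:
\[ P_L^{ab}[\lambda] = \frac{q_{\Lambda_L^*}(\lambda)}{\langle \sigma(u^L)\sigma(v^L) \rangle_{\Lambda_L^*}}. \tag{8.3} \]
Let $\mathcal{D}$ and $\mathcal{W}$ be the curves in $Q$ introduced in Sect. 4. We set
\[ I_i^L := \{ x \in \Sigma_L^* : \|x - w_i^L\| \le (ML \log L)^{1/2} \}, \quad i = 1, 2, \tag{8.4} \]
with $w_i^L = (LP_i, 1/2)$ and $[P_1, P_2] = \mathcal{W} \cap W_Q$. We set
\[ \rho_L := M \ln L. \tag{8.5} \]
We define two sets of phase-separation lines. The set $T_{\mathcal{D}}$ contains all $\lambda$, $\mathcal{E}(\lambda) \subset \mathcal{E}(\Lambda_L^*)$, such that
a1. $\delta\lambda = \{u^L, v^L\}$;
a2. $\mathcal{E}(\lambda)$ is inside $\mathcal{E}(S(u^L, v^L; \rho_L))$.
The set $T_{\mathcal{W}}$ contains all $\lambda$, $\mathcal{E}(\lambda) \subset \mathcal{E}(\Lambda_L^*)$, considered as parameterized curves $s \mapsto \lambda(s)$, such that
b1. $\delta\lambda = \{u^L, v^L\}$, $\lambda(0) := u^L$;
b2. $\exists s_1$ such that $\lambda(s_1) \in I_1^L$ and for all $s < s_1$, $\lambda(s) \cap \Sigma_L^* = \emptyset$;
b3. $\lambda_1 := \{ \lambda(s) : s \le s_1 \}$ is inside $S(u^L, \lambda(s_1); \rho_L)$;
b4. $\exists s_2$ such that $\lambda(s_2) \in I_2^L$ and for all $s_2 < s$, $\lambda(s) \cap \Sigma_L^* = \emptyset$;
b5. $\lambda_3 := \{ \lambda(s) : s_2 \le s \}$ is inside $S(\lambda(s_2), v^L; \rho_L)$;
b6. $\lambda_2 := \{ \lambda(s) : s_1 \le s \le s_2 \}$ is inside $\{ x \in \Lambda_L^* : x(2) \le \rho_L,\ \lambda(s_1)(1) - \rho_L \le x(1) \le \lambda(s_2)(1) + \rho_L \}$.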
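Conditions a2, b3 and b5 all ask that a piece of the line stays inside an elliptical set $S(\cdot,\cdot;\rho_L)$ as defined in (6.52). A minimal sketch of such a test follows; it is a simplification in that a line is represented only by a list of sites (the conditions in the text are stated for bonds), and the names `in_ellipse`, `rho_L`, `satisfies_a2` and the sample data are hypothetical.

```python
import math


def in_ellipse(t, x, y, rho):
    """True if t belongs to S(x, y; rho) of (6.52)."""
    return math.dist(x, t) + math.dist(y, t) <= math.dist(x, y) + rho


def rho_L(M, L):
    """rho_L = M ln L as in (8.5)."""
    return M * math.log(L)


def satisfies_a2(path, u, v, M, L):
    """Site-level version of a2: every site of the path lies in S(u, v; rho_L)."""
    rho = rho_L(M, L)
    return all(in_ellipse(t, u, v, rho) for t in path)


u, v = (0, 0), (100, 0)
straight = [(i, 0) for i in range(101)]
# A list of sites making a triangular excursion of height 50 away from the chord.
detour = [(i, min(i, 100 - i)) for i in range(101)]
```

With $M = 2$ and $L = 100$ one has $\rho_L \approx 9.2$, so the straight list of sites passes the test while the excursion of height 50 does not.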
Theorem 8.1. Let $\beta > \beta_c$, $h > 0$, $0 < a < 1$, $0 < b < 1$. There exist $M > 0$ and $L_0 = L_0(h, \beta, M)$ such that, for all $L \ge L_0$, the following statements are true.
1. Suppose that the solution of the variational problem in $Q$ is the curve $\mathcal{D}$. Then
\[ P_L^{ab}[T_{\mathcal{D}}] \ge 1 - L^{-O(M)}. \tag{8.6} \]
2. Suppose that the solution of the variational problem in $Q$ is the curve $\mathcal{W}$. Then
\[ P_L^{ab}[T_{\mathcal{W}}] \ge 1 - L^{-O(M)}. \tag{8.7} \]
3. Suppose that the solution of the variational problem in $Q$ is either the curve $\mathcal{D}$ or the curve $\mathcal{W}$. Then
\[ P_L^{ab}[T_{\mathcal{D}} \cup T_{\mathcal{W}}] \ge 1 - L^{-O(M)}. \tag{8.8} \]
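Comment. The results of Theorem 8.1 are optimal in the following sense: at a finer scale we do not expect the phase-separation line to converge to some non-random set, but rather to some random process. It is known that fluctuations of a phase-separation line of length $O(L)$, which is not in contact with the wall, are $O(L^{1/2})$ (see [Hi2] and [DH]). On the other hand, if the phase-separation line is attracted by the wall on a length $O(L)$, then we expect that its excursions away from the wall have a size typically bounded by $O(\log L)$.

Proof. 1. Suppose that the minimum of the variational problem is given by $\mathcal{D}$, $W(\mathcal{D}) = W^*$. Let $W^{**}$ be the minimum of the functional over all simple curves in $Q$ with endpoints $A$ and $B$ which touch the wall $W_Q$. By hypothesis there exists $\delta > 0$ with $W^{**} = W^* + \delta$. We set $S_1 := S(u^L, v^L; \rho_L)$; for $L$ large enough $S_1 \cap \Sigma_L^* = \emptyset$, since $a > 0$ and $b > 0$. We apply Lemma 7.1. We have
\begin{align*}
P_L^{ab}[\{ \lambda \notin T_{\mathcal{D}} \}] &= \frac{1}{\langle \sigma(u^L)\sigma(v^L) \rangle_{\Lambda_L^*}} \sum_{\lambda \notin T_{\mathcal{D}}} q_{\Lambda_L^*}(\lambda) \tag{8.9} \\
&\le L^C \exp\{ W^* L \} \sum_{\lambda \notin T_{\mathcal{D}}} q_{\Lambda_L^*}(\lambda).
\end{align*}
We estimate the numerator of $P_L^{ab}[\{ \lambda \notin T_{\mathcal{D}} \}]$. There are two cases, either $\lambda \cap \Sigma_L^* \ne \emptyset$ or $\lambda \cap \Sigma_L^* = \emptyset$. The first case is easy to estimate. Consider $\lambda$ as a unit-speed parametrized curve from $u^L$ to $v^L$ and suppose that $z_1(\lambda)$, resp. $z_2(\lambda)$, is the first, resp. last, point of $\lambda \cap \Sigma_L^* \ne \emptyset$. Then by Lemmas 6.5 and 6.6,
\[ \sum_{\lambda \cap \Sigma_L^* \ne \emptyset} q_{\Lambda_L^*}(\lambda) \le \sum_{z_1, z_2 \in \Sigma_L^*} e^{-\hat\tau(z_1 - u^L)}\, e^{-\hat\tau_{\rm bd}(z_2 - z_1)}\, e^{-\hat\tau(v^L - z_2)}. \tag{8.10} \]
We can bound this sum from above by $O(L^2) \exp\{ -LW^{**} \}$. In the second case we have $\lambda \cap \Sigma_L^* = \emptyset$. Using Lemmas 6.7, 6.6, GKS inequalities and Lemma 6.10,
\begin{align*}
\sum_{\substack{\lambda \notin T_{\mathcal{D}} \\ \lambda \cap \Sigma_L^* = \emptyset}} q_{\Lambda_L^*}(\lambda) &\le \sum_{z \in \partial_{\rm ext} S_1}\ \sum_{\substack{z \in \lambda,\ \lambda \cap \Sigma_L^* = \emptyset \\ \delta\lambda = \{u^L, v^L\}}} q_{\Lambda_L^*}(\lambda) \tag{8.11} \\
&\le \sum_{z \in \partial_{\rm ext} S_1} \langle \sigma(u^L)\sigma(z) \rangle_{\Lambda_L^* \setminus \Sigma_L^*}\, \langle \sigma(z)\sigma(v^L) \rangle_{\Lambda_L^* \setminus \Sigma_L^*} \\
&\le \sum_{z \in \partial_{\rm ext} S_1} \langle \sigma(u^L)\sigma(z) \rangle\, \langle \sigma(z)\sigma(v^L) \rangle \\
&\le O(L^{3/2 - \kappa M})\, \langle \sigma(u^L)\sigma(v^L) \rangle \le O(L^{3/2 - \kappa M}) \exp\{ -W^* L \}.
\end{align*}
This proves the first statement.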
2. Suppose that the minimum of the variational problem is given by $\mathcal{W}$, $W(\mathcal{W}) = W^*$. Then there exists $\delta > 0$ such that $W(\mathcal{D}) = W^* + \delta$. We estimate $P_L^{ab}[\{ \lambda \notin T_{\mathcal{W}} \}]$ in several steps. Notice that condition b1 is always satisfied.
1. The probability that condition b2 is satisfied, but not b3, can be estimated as in (8.11) using Lemma 6.6; it is smaller than $O(L^{C+1})/L^{1M}$.
2. The probability that condition b4 is satisfied, but not b5, is estimated in the same way; it is smaller than $O(L^{C+1})/L^{1M}$.
3. The probability that conditions b2 and b4 are satisfied, but not b6, can be estimated by Lemma 6.13; it is smaller than $L^{-O(M)}$.
4. We estimate the probability that condition b2 is not satisfied. The case with condition b5 is similar. If $\lambda$ does not intersect $\Sigma_L^*$, then this probability is smaller than $O(L^C) \exp\{ -\delta L \}$, since $W(\mathcal{D}) = W^* + \delta$. Suppose that there exist $s_1$ and $s_2$, with $\lambda(s_i) \in \Sigma_L^*$, $\lambda(s) \cap \Sigma_L^* = \emptyset$ for all $s < s_1$ and $\lambda(s) \cap \Sigma_L^* = \emptyset$ for all $s_2 < s$. Let $p_i^L := \lambda(s_i)$, $i = 1, 2$. Under these conditions, b2 is not satisfied if and only if $p_1^L \notin I_1^L$. Let $\mathcal{C}(p_1^L, p_2^L)$ be the polygonal curve from $u^L$ to $p_1^L$, then from $p_1^L$ to $p_2^L$ and finally from $p_2^L$ to $v^L$. Then the probability of this event is bounded above by
\[ \sum_{\substack{p_1^L \in \Sigma_L^*: \\ p_1^L \notin I_1^L}}\ \sum_{p_2^L \in \Sigma_L^*} \exp\{ -W(\mathcal{C}(p_1^L, p_2^L)) \} \le O(L^2) \max\{ \exp\{ -W(\mathcal{C}(p_1^L, p_2^L)) \} \mid p_1^L \in \Sigma_L^* \setminus I_1^L,\ p_2^L \in \Sigma_L^* \}. \tag{8.12} \]
Suppose that $\mathcal{C}$ denotes the polygonal line giving the maximum; scaled by $1/L$ we get a polygonal line in $Q$, denoted by $\mathcal{C}^*$, from $A$ to some point $P_1^*$, then from $P_1^*$ to $P_2^*$ and finally from $P_2^*$ to $B$. Let $\theta^*$ be the angle between the straight line from $A$ to $P_1^*$ and the wall. We have
\[ W(\mathcal{C}) = L W(\mathcal{C}^*) \ge L\big( g(\theta^*, a) + g(\theta_Y, b) \big). \tag{8.13} \]
By hypothesis
\[ |\theta^* - \theta_Y| \ge \frac{1}{L^{1/2}}\, O\big( (M \log L)^{1/2} \big). \tag{8.14} \]
Therefore (use a Taylor expansion of $g$ around $\theta_Y$ and the monotonicity of $g(\theta, x)$ on $[0, \theta_Y]$, respectively $[\theta_Y, \pi/2]$) there exists a positive constant $\alpha$ such that
\[ W(\mathcal{C}^*) \ge g(\theta_Y, a) + g(\theta_Y, b) + \frac{\alpha M \log L}{L} = W^* + \frac{\alpha M \log L}{L}. \tag{8.15} \]
We conclude that the probability that condition b2 is not satisfied is bounded above by $O(L^{C+2})/L^{\alpha M}$. If $M$ is large enough, the second statement of the theorem is true.

3. The proof of the third statement of the theorem is similar. $\Box$
9. Appendix: N Phase-Separation Lines

In this appendix we indicate how we can treat problems with $N$ phase-separation lines. We consider the simplest case, in order to illustrate the basic ideas. We reduce the question of finding typical configurations to a similar question for a single phase-separation line. We assume in this section that all coupling constants are equal, $K(e) = \beta$, $\beta > \beta_c$. We fix $2N$ points $A_i$, $i = 1, \ldots, 2N$, on the boundary of $Q$. Then we scale the box $Q$ by $L \in \mathbb{N}$ and get $2N$ points $A_i^L$, $i = 1, \ldots, 2N$. We assume that $A_i^L$, $i = 1, \ldots, 2N$, are at the middle of bonds of the lattice $\mathbb{Z}^2$. Consequently, these points give naturally a partition of $\partial\Lambda_L$ into $2N$ subsets (see Fig. 4), which we denote by $[A_i^L, A_{i+1}^L]$, $i = 1, \ldots, 2N$, with $A_{2N+1}^L \equiv A_1^L$. Let $\eta$ be the boundary conditions for $\Lambda_L$,
\[ \eta(x) = \begin{cases} +1 & \text{if } x \in [A_i^L, A_{i+1}^L] \text{ and } i \text{ is odd}, \\ -1 & \text{if } x \in [A_i^L, A_{i+1}^L] \text{ and } i \text{ is even}. \end{cases} \tag{9.1} \]
This boundary condition defines $N$ phase-separation lines $\lambda_i(\omega)$, $i = 1, \ldots, N$, in any configuration $\omega$ compatible with $\eta$. The set $V_L(\eta) := \{ a_i^L : i = 1, \ldots, 2N \}$ of endpoints of these phase-separation lines is uniquely determined by the points $A_i^L$. Given $\omega$ compatible with $\eta$, the $N$ phase-separation lines $\lambda_i(\omega)$ give a partition of $V_L(\eta)$ into two-point subsets $\delta\lambda_j(\omega) = \{ a_{j_1}^L, a_{j_2}^L \}$. The set of all possible partitions of $V_L(\eta)$ compatible with $N$ phase-separation lines is denoted by $\mathcal{P}(V_L(\eta))$ and an element of $\mathcal{P}(V_L(\eta))$ by $a^L = (a_{1_1}^L, a_{1_2}^L; \ldots; a_{N_1}^L, a_{N_2}^L)$.

Fig. 4. The box $\Lambda_L$, the points $A_i^L$ (white dots) and the points $a_i^L$ (black dots). A family of phase-separation lines is also drawn.
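Lemma 9.1. Let $\eta$ be a b.c. with $N$ phase-separation lines for $\Lambda_L$. Let $b^L = (b_{1_1}^L, b_{1_2}^L; \ldots; b_{N_1}^L, b_{N_2}^L) \in \mathcal{P}(V_L(\eta))$. Then
\[ \big\langle \{ \lambda : \delta\lambda_j = \{ b_{j_1}^L, b_{j_2}^L \},\ j = 1, \ldots, N \} \big\rangle_{\Lambda_L}^{\eta} \le \frac{\prod_{j \ge 1} \langle \sigma(b_{j_1}^L)\sigma(b_{j_2}^L) \rangle_{\Lambda_L}^*}{\max_{a^L \in \mathcal{P}(V_L(\eta))} \prod_{j \ge 1} \langle \sigma(a_{j_1}^L)\sigma(a_{j_2}^L) \rangle_{\Lambda_L}^*}. \tag{9.2} \]

Proof. Let $q_{\Lambda_L}^{\eta}(\lambda)$ be the weight of the compatible family $\lambda$ of $N$ phase-separation lines. We estimate the denominator of the left-hand side of (9.2). Let $a^L = (a_{1_1}^L, a_{1_2}^L; \ldots; a_{N_1}^L, a_{N_2}^L) \in \mathcal{P}(V_L(\eta))$. By Lemma 6.2 and GKS inequalities
\[ \sum_{\lambda} q_{\Lambda_L}^{\eta}(\lambda) = \Big\langle \prod_{t \in V_L(\eta)} \sigma(t) \Big\rangle_{\Lambda_L}^* \ge \prod_{j \ge 1} \langle \sigma(a_{j_1}^L)\sigma(a_{j_2}^L) \rangle_{\Lambda_L}^*. \tag{9.3} \]
We estimate the numerator of the left-hand side of (9.2). By Lemma 6.5,
\[ \sum_{\lambda:\ \delta\lambda_j = \{ b_{j_1}^L, b_{j_2}^L \}} q_{\Lambda_L}^{\eta}(\lambda) \le \prod_{j \ge 1} \langle \sigma(b_{j_1}^L)\sigma(b_{j_2}^L) \rangle_{\Lambda_L}^*. \tag{9.4} \]
$\Box$

When $K(e) \equiv \beta$ it is easy to analyze the right-hand side of (9.2). Let $a^L = (a_{1_1}^L, a_{1_2}^L; \ldots; a_{N_1}^L, a_{N_2}^L) \in \mathcal{P}(V_L(\eta))$; we set
\[ W(a^L) := \frac{1}{L} \sum_{j=1}^{N} \tau(a_{j_2}^L - a_{j_1}^L), \tag{9.5} \]
and
\[ W_\eta := \min\{ W(a^L) : a^L \in \mathcal{P}(V_L(\eta)) \}. \tag{9.6} \]
Then by Proposition 2.4 and Lemma 7.1,
\[ \big\langle \{ \lambda : \delta\lambda_j = \{ b_{j_1}^L, b_{j_2}^L \},\ j = 1, \ldots, N \} \big\rangle_{\Lambda_L}^{\eta} \le L^{O(N)} \exp\{ -L(W(b^L) - W_\eta) \}. \tag{9.7} \]
In the generic case the minimum in (9.6) is attained at a single $b^L \in \mathcal{P}(V_L(\eta))$; there exists $\varepsilon > 0$ such that
\[ W(a^L) \ge W_\eta + \varepsilon, \quad a^L \ne b^L. \tag{9.8} \]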
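For small $N$ the minimization in (9.6) can be carried out by direct enumeration. The sketch below rests on assumptions not taken from the paper: the admissible partitions $\mathcal{P}(V_L(\eta))$ are modelled as the non-crossing perfect matchings of the endpoints listed in cyclic order, the decay rate is replaced by the isotropic stand-in $\tau(x) = \tau_0\|x\|$, and the four sample endpoints are hypothetical.

```python
import math


def noncrossing_pairings(indices):
    """Yield all non-crossing perfect matchings of points listed in cyclic order."""
    if not indices:
        yield []
        return
    first = indices[0]
    # The partner of `first` must leave an even number of points on each side.
    for k in range(1, len(indices), 2):
        inside, outside = indices[1:k], indices[k + 1:]
        for left in noncrossing_pairings(inside):
            for right in noncrossing_pairings(outside):
                yield [(first, indices[k])] + left + right


def total_tension(pairing, points, L, tau0=1.0):
    """W(a^L) of (9.5) with the stand-in tau(x) = tau0 * ||x|| (an assumption)."""
    return sum(tau0 * math.dist(points[i], points[j]) for i, j in pairing) / L


L = 100
# Hypothetical endpoints on the boundary of the box [0, L]^2, in cyclic order.
points = [(20, 0), (80, 0), (80, 100), (20, 100)]
pairings = list(noncrossing_pairings(list(range(len(points)))))
values = sorted((total_tension(p, points, L), p) for p in pairings)
W_eta, best_pairing = values[0]
gap = values[1][0] - W_eta  # plays the role of epsilon in (9.8)
```

For these four points the two horizontal chords win with $W_\eta = 1.2$ against $2.0$ for the two vertical chords, so the minimizer is unique and the gap $\varepsilon = 0.8$ makes the competing partition exponentially unlikely in (9.7).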
We can use Lemma 6.5 to bound from above the denominator of the left-hand side of (9.2),
\[ \sum_{\lambda} q_{\Lambda_L}^{\eta}(\lambda) \le \sum_{p \in \mathcal{P}(V_L(\eta))}\ \prod_{j \ge 1} \langle \sigma(a_{p_{j_1}}^L)\sigma(a_{p_{j_2}}^L) \rangle_{\Lambda_L}^*. \tag{9.9} \]
Notice that this is slightly better than what we would have obtained using the Gaussian inequality. For $L$ large enough only a single term dominates in (9.9), namely the term given by the partition $p$ such that
\[ b^L = (a_{p_{1_1}}^L, a_{p_{1_2}}^L; \ldots; a_{p_{N_1}}^L, a_{p_{N_2}}^L). \tag{9.10} \]
Therefore in the generic case, for fixed $N$ and large $L$,
\[ \prod_{j \ge 1} \langle \sigma(b_{j_1}^L)\sigma(b_{j_2}^L) \rangle_{\Lambda_L}^* \le \sum_{\lambda} q_{\Lambda_L}^{\eta}(\lambda) \le (1 + O(e^{-\varepsilon L})) \prod_{j \ge 1} \langle \sigma(b_{j_1}^L)\sigma(b_{j_2}^L) \rangle_{\Lambda_L}^*. \tag{9.11} \]
Let $\lambda$ be a family of compatible phase-separation lines, such that $\delta\lambda_j = \{ b_{j_1}^L, b_{j_2}^L \}$, $j = 1, \ldots, N$. Formula (6.11) and Lemma 6.4 imply that
\[ q_{\Lambda_L}^{\eta}(\lambda) = q_{\Lambda_L^*}(\lambda) \ge \prod_{j \ge 1} q_{\Lambda_L^*}(\lambda_j). \tag{9.12} \]
Notice that the factor $\langle \sigma(b_{j_1}^L)\sigma(b_{j_2}^L) \rangle_{\Lambda_L}^*$ in (9.11) is equal to
\[ \langle \sigma(b_{j_1}^L)\sigma(b_{j_2}^L) \rangle_{\Lambda_L}^* = \sum_{\lambda:\ \delta\lambda = \{ b_{j_1}^L, b_{j_2}^L \}} q_{\Lambda_L^*}(\lambda). \tag{9.13} \]
We summarize the results obtained so far.
1. In the generic situation described above the typical phase-separation lines $\lambda$ compatible with the b.c. $\eta$ are those such that $\delta\lambda_j = \{ b_{j_1}^L, b_{j_2}^L \}$, $j = 1, \ldots, N$, where $b^L = (b_{1_1}^L, b_{1_2}^L; \ldots; b_{N_1}^L, b_{N_2}^L)$ is the element of $\mathcal{P}(V_L(\eta))$ which minimizes $W(a^L) := \frac{1}{L} \sum_{j=1}^{N} \tau(a_{j_2}^L - a_{j_1}^L)$.
2. The probability of the occurrence of $\lambda$ compatible with the b.c. $\eta$, assuming that $\delta\lambda_j = \{ b_{j_1}^L, b_{j_2}^L \}$, $j = 1, \ldots, N$, is bounded below by
\[ \prod_{j \ge 1} \frac{q_{\Lambda_L^*}(\lambda_j)}{\sum_{\lambda:\ \delta\lambda = \{ b_{j_1}^L, b_{j_2}^L \}} q_{\Lambda_L^*}(\lambda)} \ge \prod_{j \ge 1} \frac{q(\lambda_j)}{\sum_{\lambda:\ \delta\lambda = \{ b_{j_1}^L, b_{j_2}^L \}} q(\lambda)}. \tag{9.14} \]
We suppose that we are in the generic case. Then there are $N$ segments with minimal total length, which do not intersect. Therefore the distance between two segments is at least $\delta L$, $\delta > 0$. We also suppose that for each pair of points $\{ b_{j_1}^L, b_{j_2}^L \}$ we can apply Case 1 of Lemma 7.1. If $L$ is large enough, then the ellipses $S_j := S(b_{j_1}^L, b_{j_2}^L; c' \ln L)$, $j = 1, \ldots, N$, are pairwise disjoint. Consider the event
\[ \{ \lambda : \delta\lambda_j = \{ b_{j_1}^L, b_{j_2}^L \},\ \lambda_j \subset S_j,\ j = 1, \ldots, N \}. \tag{9.15} \]
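We can easily estimate the probability of the event (9.15) using (9.14). Indeed, we can reduce the estimate to an estimate for an event concerning a single interface,
\[ \{ \lambda : \delta\lambda = \{ b_{j_1}^L, b_{j_2}^L \},\ \lambda \subset S_j \}. \tag{9.16} \]
We have, using Lemma 6.7, GKS inequalities and Lemma 6.10,
\begin{align*}
\sum_{\substack{\mathcal{E}(\lambda) \not\subset \mathcal{E}(S_j): \\ \delta\lambda = \{ b_{j_1}^L, b_{j_2}^L \}}} q_{\Lambda_L^*}(\lambda) &\le \sum_{z \in \partial_{\rm ext} S_j}\ \sum_{\substack{\lambda \ni z: \\ \delta\lambda = \{ b_{j_1}^L, b_{j_2}^L \}}} q_{\Lambda_L^*}(\lambda) \\
&\le \sum_{z \in \partial_{\rm ext} S_j} \langle \sigma(b_{j_1}^L)\sigma(z) \rangle_{\Lambda_L^*}\, \langle \sigma(z)\sigma(b_{j_2}^L) \rangle_{\Lambda_L^*} \\
&\le \sum_{z \in \partial_{\rm ext} S_j} \langle \sigma(b_{j_1}^L)\sigma(z) \rangle\, \langle \sigma(z)\sigma(b_{j_2}^L) \rangle \\
&\le O(L^{3/2 - \kappa c'})\, \langle \sigma(b_{j_1}^L)\sigma(b_{j_2}^L) \rangle.
\end{align*}
On the other hand, by Lemma 7.1 and Proposition 2.4,
\begin{align*}
\sum_{\substack{\mathcal{E}(\lambda) \subset \mathcal{E}(\Lambda_L^*): \\ \delta\lambda = \{ b_{j_1}^L, b_{j_2}^L \}}} q_{\Lambda_L^*}(\lambda) &= \langle \sigma(b_{j_1}^L)\sigma(b_{j_2}^L) \rangle_{\Lambda_L^*} \tag{9.17} \\
&\ge L^{-C} e^{-\tau(b_{j_1}^L - b_{j_2}^L)} \\
&\ge L^{-C - 1/2}\, \langle \sigma(b_{j_1}^L)\sigma(b_{j_2}^L) \rangle.
\end{align*}
Choosing $c'$ so large that $3/2 - \kappa c' + C + 1/2 = -\alpha < 0$, the probability of the event (9.16) is larger than $1 - O(L^{-\alpha})$. Therefore, the probability of the event (9.15) is also larger than $1 - O(L^{-\alpha})$.

Acknowledgements. We thank A. Patrick for communicating to us the exact expressions of the mass gaps.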
References

[A1] Abraham, D.B.: Solvable model with a roughening transition for a planar Ising ferromagnet. Phys. Rev. Lett. 44, 1165–1168 (1980)
[A2] Abraham, D.B.: Surface Structures and Phase Transitions – Exact Results. In: Phase Transitions and Critical Phenomena, Vol. 10, Eds. C. Domb, J.L. Lebowitz, London: Academic Press, 1986, pp. 1–74
[Al] Alexander, K.S.: Power-law corrections to exponential decay of connectivities and correlations in lattice models. To appear in Annals of Probability
[Az1] Aizenman, M.: Translation invariance and Instability of phase coexistence in the two dimensional Ising model. Commun. Math. Phys. 73, 83–94 (1980)
[Az2] Aizenman, M.: Geometric Analysis of φ^4 Fields and Ising Models. Parts I and II. Commun. Math. Phys. 86, 1–48 (1982)
[AY] Au Yang, H.: Thermodynamics of an anisotropic boundary of a two-dimensional Ising model. J. Math. Phys. 14, 937–946 (1973)
[AC] Abraham, D.B., de Coninck, J.: Description of phases in a film-thickening transition. J. Phys. 16 A, L333–L337 (1983)
[AK] Abraham, D.B., Ko, L.-F.: Exact derivation of the modified Young equation for partial wetting. Phys. Rev. Lett. 63, 275–278 (1989)
[AR] Abraham, D.B., Reed, P.: Phase separation in the two-dimensional ferromagnet. Phys. Rev. Lett. 33, 377–379 (1974); Interface profile of the Ising ferromagnet in two dimensions. Commun. Math. Phys. 49, 35–46 (1974)
[ACD] Abraham, D.B., de Coninck, J., Dunlop, F.: Contact angle for 2D Ising ferromagnet. Phys. Rev. B 39, 4708–4710 (1989)
[AA] Akutsu, N., Akutsu, Y.: Relationship between the anisotropic interface tension, the scaled interface width and the equilibrium shape in two dimensions. J. Phys. 19 A, 2813–2820 (1986)
[AF] Au-Yang, H., Fisher, M.E.: Bounded and inhomogeneous Ising models II. Specific-heat scaling function for a strip. Phys. Rev. B 11, 3469–3487 (1975)
[ABCP] Alberti, G., Bellettini, G., Cassandro, M., Presutti, E.: Surface tension in Ising systems with Kac potentials. J. Stat. Phys. 82, 743–796 (1996)
[BLP1] Bricmont, J., Lebowitz, J.L., Pfister, C.-E.: On the local structure of the phase separation line in the two-dimensional Ising system. J. Stat. Phys. 26, 313–332 (1981)
[BLP2] Bricmont, J., Lebowitz, J.L., Pfister, C.-E.: On the surface tension of lattice systems. Annals of the New York Academy of Sciences 337, 214–223 (1980)
[D] Dobrushin, R.L.: A statistical behaviour of shapes of boundaries of phases. In: R. Kotecký (ed.) Phase Transitions: Mathematics, Physics, Biology..., Singapore: World Scientific, 1993, pp. 60–70
[DH] Dobrushin, R.L., Hryniv, O.: Fluctuations of the phase boundary in the 2D Ising ferromagnet. Commun. Math. Phys. 189, 395–445 (1997)
[F] Fisher, M.E.: Walks, Walls, Wetting, and Melting. J. Stat. Phys. 34, 667–729 (1984)
[FP1] Fröhlich, J., Pfister, C.-E.: Semi-infinite Ising model I. Thermodynamic functions and phase diagram in absence of magnetic field. Commun. Math. Phys. 109, 493–523 (1987)
[FP2] Fröhlich, J., Pfister, C.-E.: Semi-infinite Ising model II. The wetting and layering transitions. Commun. Math. Phys. 112, 51–74 (1987)
[G] Gallavotti, G.: The phase separation line in the two-dimensional Ising model. Commun. Math. Phys. 27, 103–136 (1972)
[Hi1] Higuchi, Y.: On the absence of non-translation invariant Gibbs states for the two-dimensional Ising model. In: J. Fritz, J.L. Lebowitz and D. Szász (eds) Random Fields, Esztergom, Amsterdam: North-Holland, Vol. I, 1979, pp. 517–534
[Hi2] Higuchi, Y.: On some limit theorems related to the phase separation line in the two-dimensional Ising model. Z. Wahrsch. verw. Geb. 50, 287–315 (1979)
[I1] Ioffe, D.: Large deviations for the 2D Ising model: A lower bound without cluster expansions. J. Stat. Phys. 74, 411–432 (1994)
[I2] Ioffe, D.: Ornstein–Zernike Behaviour and Analyticity of Shapes for Self-Avoiding Walks on Z^d. To appear in Markov Processes and Related Fields
[LP] Lebowitz, J.L., Pfister, C.-E.: Surface tension and phase coexistence. Phys. Rev. Lett. 46, 1031–1033 (1981)
[MW] McCoy, B.M., Wu, T.T.: The Two-dimensional Ising Model. Cambridge, MA: Harvard University Press, 1973
[Pa1] Patrick, A.: The influence of boundary conditions on Solid-On-Solid models. J. Stat. Phys. 90, 389–433 (1998)
[Pf1] Pfister, C.-E.: Interface and surface tension in Ising model. In: Scaling and self-similarity in physics, ed. J. Fröhlich, Basel: Birkhäuser, 1983, pp. 139–161
[Pf2] Pfister, C.-E.: Large deviations and phase separation in the two-dimensional Ising model. Helv. Phys. Acta 64, 953–1054 (1991)
[PU] Patrick, A., Upton, P.J.: Surface phase transitions in two dimensions: Metastability and re-entrance. Dublin Institute for Advanced Studies, Preprint DIAS-STP-96-08 (1996)
[PP] Pfister, C.-E., Penrose, O.: Analyticity properties of the surface free energy of the Ising model. Commun. Math. Phys. 115, 691–699 (1988)
[PV1] Pfister, C.-E., Velenik, Y.: Large deviations and continuum limit in the 2D Ising model. Prob. Th. Rel. Fields 109, 435–506 (1997)
[PV2] Pfister, C.-E., Velenik, Y.: Mathematical theory of the wetting phenomenon in the 2D Ising model. Helv. Phys. Acta 69, 949–973 (1996)
[SML] Schultz, T.D., Mattis, D.C., Lieb, E.H.: Two-dimensional Ising model as a soluble problem of many fermions. Rev. Mod. Phys. 36, 856–871 (1964)
[T] Talagrand, M.: A new look at independence. Ann. Prob. 24, 1–34 (1996)
[V] Velenik, Y.: Phase separation as a large deviations problem: A microscopic derivation of surface thermodynamics for some 2D spin systems. Thèse 1712 EPF-L (1997); available electronically from the author
[ZA] Zia, R.K.P., Avron, J.E.: Total surface energy and equilibrium shapes: Exact results for the d=2 Ising crystal. Phys. Rev. B 25, 2042–2045 (1982)

Communicated by M. E. Fisher
Commun. Math. Phys. 204, 313 – 327 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
A Class of Counterexamples to Jeans’ Theorem for the Vlasov–Einstein System*

Jack Schaeffer

Department of Mathematical Sciences, Carnegie Mellon University, Pittsburgh, PA 15213 USA. E-mail: [email protected]

Received: 2 November 1998 / Accepted: 24 December 1998

Abstract: The Vlasov–Poisson and Vlasov–Einstein systems model the motion of a self gravitating system such as a galaxy. The Vlasov–Poisson system is nonrelativistic. Jeans’ theorem states that every spherically symmetric solution of the Vlasov–Poisson system that is independent of time may be expressed as a function of the two invariants, energy and angular momentum. This paper shows this is not the case for the Vlasov–Einstein system.

1. Introduction

Consider a self gravitating mass system, such as a galaxy, in which collisions may be neglected. Let $x$ denote position, $v$ denote momentum, and consider a static situation which is independent of time. If $f(x, v)$ gives the number density of particles in phase space, then $f$ satisfies the Vlasov equation. We will consider spherical symmetry, that is, we assume $f$ is a function of $r = |x|$, $u = |v|$, $L = |x \wedge v|^2$. Then we are led to consider two systems: For classical Newtonian gravitation we have the Vlasov–Poisson system in which
\[ v \cdot \nabla_x f - \nabla_x U \cdot \nabla_v f = 0, \tag{1.1} \]
\[ r^{-2} \left( r^2 U' \right)' = 4\pi\rho, \tag{1.2} \]
\[ \rho = \rho(r) = \int f(x, v)\, dv. \tag{1.3} \]

* Supported in part by NSF DMS-9731956.
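We assume that all particles have mass one. For relativistic gravitation we are led to the Vlasov–Einstein system in which
\[ \frac{v}{\sqrt{1 + |v|^2}} \cdot \nabla_x f - \sqrt{1 + |v|^2}\, \mu'\, \frac{x}{r} \cdot \nabla_v f = 0, \tag{1.4} \]
\[ e^{-2\lambda} \left( 2r\lambda' - 1 \right) + 1 = 8\pi r^2 \rho, \tag{1.5} \]
\[ e^{-2\lambda} \left( 2r\mu' + 1 \right) - 1 = 8\pi r^2 p, \tag{1.6} \]
\[ \rho(r) = \int \sqrt{1 + |v|^2}\, f(x, v)\, dv, \tag{1.7} \]
\[ p(r) = \int \left( \frac{x \cdot v}{r} \right)^2 f(x, v)\, \frac{dv}{\sqrt{1 + |v|^2}}. \tag{1.8} \]
If $x = (r \sin\sigma \cos\phi,\ r \sin\sigma \sin\phi,\ r \cos\sigma)$ then the space time metric is given by
\[ ds^2 = -e^{2\mu} dt^2 + e^{2\lambda} dr^2 + r^2 \left( d\sigma^2 + \sin^2\sigma\, d\phi^2 \right). \]
As boundary conditions we require asymptotic flatness, that is
\[ \lim_{r \to \infty} \mu = 0, \tag{1.9} \]
\[ \lim_{r \to \infty} \lambda = 0, \tag{1.10} \]
and a regular center, that is
\[ \lambda(0) = 0. \tag{1.11} \]
For the Vlasov–Poisson system we require
\[ \lim_{r \to \infty} U(r) = 0. \tag{1.12} \]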
Consider the Vlasov–Poisson system. The quantities
\[ E = \tfrac12 |v|^2 + U(r), \qquad L = |x \wedge v|^2 \]
(energy and angular momentum squared) are constant on characteristics of (1.1), that is, the curves defined by
\[ \dot x = v, \qquad \dot v = -\nabla_x U(x). \]
Jeans asserted [5,6] that $f$ must be expressible in the form $f(x, v) = \phi(E, L)$ for some $\phi : \mathbb{R}^2 \to \mathbb{R}$. This assertion has been referred to as Jeans’ Theorem [3, 4, 7]. It was established rigorously in [1]. For the Vlasov–Einstein system the quantities
\[ E = e^{\mu} \sqrt{1 + |v|^2}, \qquad L = |x \wedge v|^2 \]
are constant on characteristics of (1.4). It is the purpose of this article to construct a solution of (1.4) through (1.11), where $f$ is not a function of $E$ and $L$. Thus we say Jeans’ Theorem does not hold for the Vlasov–Einstein system.
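The statement that energy and squared angular momentum are constant along the characteristics of (1.1) can be checked numerically. The following is only an illustrative sketch: the Plummer-type potential U(r) = -1/sqrt(1 + r^2), the initial data, the step size, and all function names are assumptions made for the example and do not come from the paper. It integrates the characteristic equations with a classical Runge-Kutta step and compares E and L at the start and at the end of the orbit.

```python
import math

def potential(x):
    # Assumed example potential U(r) = -1/sqrt(1 + r^2) (not from the paper).
    return -1.0 / math.sqrt(1.0 + sum(c * c for c in x))

def grad_potential(x):
    # Gradient of the assumed potential: x * (1 + r^2)^(-3/2).
    k = (1.0 + sum(c * c for c in x)) ** -1.5
    return [k * c for c in x]

def rhs(state):
    # Characteristic equations of (1.1): xdot = v, vdot = -grad U(x).
    x, v = state[:3], state[3:]
    return list(v) + [-g for g in grad_potential(x)]

def rk4_step(state, h):
    # One classical fourth-order Runge-Kutta step.
    def shifted(base, direction, scale):
        return [b + scale * d for b, d in zip(base, direction)]
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, h / 2))
    k3 = rhs(shifted(state, k2, h / 2))
    k4 = rhs(shifted(state, k3, h))
    return [s + h / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def integrate(state, h, steps):
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

def energy(state):
    # E = |v|^2 / 2 + U(r)
    x, v = state[:3], state[3:]
    return 0.5 * sum(c * c for c in v) + potential(x)

def ang_mom_sq(state):
    # L = |x ^ v|^2 = |x|^2 |v|^2 - (x . v)^2
    x, v = state[:3], state[3:]
    xx = sum(c * c for c in x)
    vv = sum(c * c for c in v)
    xv = sum(a * b for a, b in zip(x, v))
    return xx * vv - xv * xv

start = [1.0, 0.0, 0.0, 0.0, 0.5, 0.2]   # assumed initial (x, v)
end = integrate(start, 0.005, 4000)
energy_drift = abs(energy(end) - energy(start))
ang_mom_drift = abs(ang_mom_sq(end) - ang_mom_sq(start))
```

Both drifts should be at the level of the integrator error, which is the numerical counterpart of E and L being invariants. Jeans’ Theorem is the much stronger statement that a static spherically symmetric f depends on (x, v) only through these two quantities.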
Before going further we mention a few other papers on these systems. For the Vlasov–Poisson system global in time existence of smooth solutions has been established [8, 9, 16]. Solutions of the time independent problem are constructed in [1, 2, 10]. For the Vlasov–Einstein system much less is known. Small symmetric initial data leads to global existence, [14]. Also, for a symmetric solution, the first singularity must occur at the center of symmetry, [15]. Time independent solutions are constructed in [11–13]. We mention that [12] especially influenced this article.

2. The Construction

We will use the procedure used in [12] so we begin by outlining this. Let us fix constants $D_0 > 0$, $E_0 > 0$, $L_0 > 0$ and $k$, $\ell$ with
\[ k > 0 \quad \text{and} \quad \ell > -\tfrac12. \]
Then if we make the ansatz
\[ f(x, v) = D_0 \left( E_0 - e^{\mu} \sqrt{1 + |v|^2} \right)_+^{k} \left( |x \wedge v|^2 - L_0 \right)_+^{\ell} \tag{2.1} \]
(where $a_+$ = positive part of $a$) then (1.4) follows automatically and (1.7) and (1.8) become
\[ \rho(r) = r^{2\ell} e^{-(2\ell+4)\mu} g(E), \tag{2.2} \]
\[ p(r) = r^{2\ell} e^{-(2\ell+4)\mu} h(E), \tag{2.3} \]
where
\[ E(r) = e^{\mu(r)} \sqrt{1 + L_0 r^{-2}}, \tag{2.4} \]
\[ g(E) = D_0 c_\ell \int_E^{E_0} (E_0 - \varepsilon)_+^{k}\, \varepsilon^2 \left( \varepsilon^2 - E^2 \right)_+^{\ell + \frac12} d\varepsilon, \tag{2.5} \]
\[ h(E) = \frac{D_0 c_\ell}{2\ell + 3} \int_E^{E_0} (E_0 - \varepsilon)_+^{k} \left( \varepsilon^2 - E^2 \right)_+^{\ell + \frac32} d\varepsilon, \tag{2.6} \]
and
\[ c_\ell = 2\pi \int_0^1 \frac{s^{\ell}}{\sqrt{1 - s}}\, ds. \]
We define
\[ m = \int_0^r 4\pi s^2 \rho(s)\, ds, \tag{2.7} \]
and note that by (1.11), (1.5) is equivalent to
\[ e^{-2\lambda} = 1 - 2r^{-1} m, \tag{2.8} \]
and (1.6) is equivalent to
\[ \mu' = \frac{4\pi r p + r^{-2} m}{1 - 2r^{-1} m}. \tag{2.9} \]
Once (2.9) is solved (with (2.2) through (2.7) substituted) then $f$, $\rho$, $p$, and $\lambda$ may be computed. In [12] solutions are constructed of the form
\[ f(x, v) = \begin{cases} f_1(x, v), & r < R, \\ 0, & r \ge R. \end{cases} \tag{2.10} \]
On $(0, R)$, $f_1$ is given by (2.1) and (2.2) through (2.9) hold, where (2.9) is solved with $\mu(0)$ given. $R$ is taken so that $f_1 = 0$ on an interval of the form $[R, R + \varepsilon]$ ($R$ is shown to exist), so $f$ defined in (2.10) is smooth. On $(R, \infty)$, $f = 0$ so $\rho = p = 0$ and $\lambda$ and $\mu$ are defined by (2.7), (2.8), and (2.9). All conditions are satisfied except for (1.9). Equation (1.9) is ensured as follows: Let
\[ \mu_\infty = \lim_{r \to \infty} \mu \]
and
\[ \bar\mu = \mu - \mu_\infty, \]
then $(f, \rho, p, \lambda, \bar\mu)$ satisfy (1.4) through (1.11). We follow this method also, except we must show that the resulting solution, $f$, cannot be written as a function of $\bar E = e^{\bar\mu} \sqrt{1 + |v|^2}$ and $L = |x \wedge v|^2$. It is equivalent to show $f$ cannot be written as a function of $E = e^{\mu} \sqrt{1 + |v|^2}$ and $L$, so we do this instead.

Consider the computation of $f_1$. Note that by (2.1) $f_1 = 0$ if $E \ge E_0$. Also, by (2.9) $\mu \ge \mu(0)$ so $E \ge e^{\mu(0)} \sqrt{1 + L_0 r^{-2}}$. Taking $\mu(0) < \ln E_0$ we define
\[ r_0 = \sqrt{\frac{L_0}{E_0^2 e^{-2\mu(0)} - 1}}, \tag{2.11} \]
so that $E_0 = e^{\mu(0)} \sqrt{1 + L_0 r_0^{-2}}$. Then on $[0, r_0]$ we have $E \ge E_0$ so $f = 0$ and hence $\mu = \mu(0)$. By [11, Thm. 3.4] we may continue the solution of (2.9) to as large an interval as we wish. Note that
\[ r^{-1} m < \tfrac12 \tag{2.12} \]
follows.

Lemma. Let $D_0$, $E_0$, $L_0$, $k$, and $\ell$ be as above and be fixed. Let $q = k + \ell + \frac32$. Let $\mu$ be the solution of (2.9) (with (2.2) through (2.7) substituted). Then there exist positive constants $D_1$, $D_2$, $D_3$, $D_4$ (independent of $\mu(0)$) such that for $\mu(0) < -D_1$ there is
\[ R \in \left( r_0 + D_2 r_0^{1 + \frac{2}{q+1}},\ r_0 + D_3 r_0^{1 + \frac{2}{q+1}} \right) \tag{2.13} \]
with
\[ E'(R) > 0, \qquad E(R) = E_0, \tag{2.14} \]
and
\[ R^{-1} m(R) < \tfrac12 - D_4. \tag{2.15} \]
The proof of the lemma is postponed to the next section. The lemma provides us with $R$ to be used in (2.10). For $r \ge R$ we have $m = m(R)$, $\rho = p = 0$ so (2.9) becomes
\[ \mu' = \frac{r^{-2} m(R)}{1 - 2r^{-1} m(R)}. \]
It follows that
\[ e^{\mu} = e^{\mu(R)} \sqrt{\frac{1 - 2r^{-1} m(R)}{1 - 2R^{-1} m(R)}} \]
and
\[ E = E_0 \sqrt{\frac{(1 - 2r^{-1} m(R))(1 + L_0 r^{-2})}{(1 - 2R^{-1} m(R))(1 + L_0 R^{-2})}} \]
for $r \ge R$. By (2.13) and (2.15)
\[ E \le E_0 \sqrt{\frac{(1 - 2r^{-1} m(R))(1 + L_0 r^{-2})}{2 D_4 L_0 \left( r_0 + D_3 r_0^{1 + \frac{2}{q+1}} \right)^{-2}}} \]
and
\[ E_0 \sqrt{\frac{(1 - 2r^{-1} m(R))(1 + L_0 r^{-2})}{2 D_4 L_0 \left( r_0 + D_3 r_0^{1 + \frac{2}{q+1}} \right)^{-2}}} \to \frac{E_0 r_0}{\sqrt{2 D_4 L_0}} \left( 1 + D_3 r_0^{\frac{2}{q+1}} \right) \quad \text{as } r \to \infty. \]
But $r_0 \to 0$ as $\mu(0) \to -\infty$ by (2.11), so for $\mu(0)$ sufficiently negative
\[ \frac{E_0 r_0}{\sqrt{2 D_4 L_0}} \left( 1 + D_3 r_0^{\frac{2}{q+1}} \right) < E_0, \]
and $E < E_0$ for $r$ large. Consider a point $(x, v)$ where $f \ne 0$ and
\[ \frac{E_0 r_0}{\sqrt{2 D_4 L_0}} \left( 1 + D_3 r_0^{\frac{2}{q+1}} \right) < E(r) < E_0. \]
Here $r = |x|$. There exists $r_* > R$ with
\[ E(r_*) = E(r). \tag{2.16} \]
We claim that
\[ e^{\mu(r_*)} \sqrt{1 + L r_*^{-2}} < e^{\mu(r)} \sqrt{1 + L r^{-2}}, \tag{2.17} \]
where
\[ L = |x \wedge v|^2. \]
To see this note that $L > L_0$ and
\[ s \mapsto \frac{1 + L s^{-2}}{1 + L_0 s^{-2}} \]
is decreasing on $(0, \infty)$, hence
\[ \frac{1 + L r_*^{-2}}{1 + L_0 r_*^{-2}} < \frac{1 + L r^{-2}}{1 + L_0 r^{-2}}. \]
By (2.16), (2.17) now follows. Now we take $x_*$ with $|x_*| = r_*$. The equation $L = |x_* \wedge v_*|^2$ determines the nonradial component of $v_*$. By (2.17),
\[ E = e^{\mu(r)} \sqrt{1 + L r^{-2} + (x \cdot v)^2 r^{-2}} > e^{\mu(r_*)} \sqrt{1 + L r_*^{-2}}, \]
so we may choose the radial component of $v_*$ so that
\[ E = e^{\mu(r_*)} \sqrt{1 + L r_*^{-2} + (x_* \cdot v_*)^2 r_*^{-2}} = e^{\mu(r_*)} \sqrt{1 + |v_*|^2}. \]
Now $E(x, v) = E(x_*, v_*)$ and $L(x, v) = L(x_*, v_*)$, but $f(x, v) > 0 = f(x_*, v_*)$. Hence $f$ may not be written as a function of $E$ and $L$. We mention that we have not made the assumption that $k < 3\ell + \frac72$, which is used in [11]. The method used in [11] involves comparison with a Newtonian potential and the condition $k < 3\ell + \frac72$ implies that that Newtonian potential has a zero. Here no comparison with a Newtonian potential is made so the restriction on $k$ and $\ell$ is not needed.

3. Proof of the Lemma

Now we consider proving the lemma. Note that we may make $r_0$ as close to zero as desired by taking $\mu(0)$ smaller. We will restrict $r_0$ as needed. Constants may depend on $D_0$, $E_0$, $L_0$, $\ell$, and $k$ but not on $r_0$ or $\mu(0)$. In this section a positive constant will be denoted with $C$ with a subscript related to the line number where it first appears. Given a value of $r$, say $r_1$, we will denote $\mu_1 = \mu(r_1)$, $E_1 = E(r_1)$, etc. We restrict attention to an interval on which
\[ r_0 \le r \le r_0 + r_0^{1 + \frac{1}{q+1}} \]
and restrict $r_0$ so that
\[ r_0^{1 + \frac{2}{q+1}} \ll r_0^{1 + \frac{1}{q+1}} \ll r_0. \]
Let
\[ \tilde R = \sup \{ r > r_0 : E \le E_0 \text{ on } [r_0, r] \}. \]
Except for one place (which will be pointed out) we also restrict $r \le \tilde R$.
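For $\frac12 E_0 \le u \le E_0$ we have by (2.5) and (2.6) that
\[ C_1 (E_0 - u)^q \le g(u) \le \tilde C_1 (E_0 - u)^q \tag{3.1} \]
and
\[ C_2 (E_0 - u)^{q+1} \le h(u) \le \tilde C_2 (E_0 - u)^{q+1}. \tag{3.2} \]
Also by (2.4),
\[ r^{2\ell} e^{-(2\ell+4)\mu} = (L_0 + r^2)^{\ell+2} E^{-(2\ell+4)} r^{-4}, \]
so if $\frac12 E_0 \le E$ then
\[ C_3 r_0^{-4} \le r^{2\ell} e^{-(2\ell+4)\mu} \le \tilde C_3 r_0^{-4}, \tag{3.3} \]
and hence by (2.2) and (2.3),
\[ C_4 r_0^{-4} (E_0 - E)_+^{q} \le \rho \le \tilde C_4 r_0^{-4} (E_0 - E)_+^{q} \tag{3.4} \]
and
\[ C_5 r_0^{-4} (E_0 - E)_+^{q+1} \le p \le \tilde C_5 r_0^{-4} (E_0 - E)_+^{q+1}. \tag{3.5} \]
By (2.4) and (2.9),
\[ E' = \left[ \frac{4\pi r p + r^{-2} m}{1 - 2r^{-1} m} - \frac{L_0}{r (L_0 + r^2)} \right] E, \tag{3.6} \]
so using (3.5)
\[ E' \ge \left[ 4\pi r p - r^{-1} \right] E \ge \left[ C_7 r_0^{-3} (E_0 - E)^{q+1} - r_0^{-1} \right] E. \tag{3.7} \]
Note that by (3.7), $E'$ must be positive if $E < E_0 - \left( C_7^{-1} r_0^2 \right)^{\frac{1}{q+1}}$. We restrict $r_0$ so that
\[ E_0 - \left( C_7^{-1} r_0^2 \right)^{\frac{1}{q+1}} > \tfrac12 E_0, \]
then it follows that
\[ E \ge E_0 - \left( C_7^{-1} r_0^2 \right)^{\frac{1}{q+1}} > \tfrac12 E_0 \tag{3.8} \]
and that (3.3), (3.4), and (3.5) hold. Also using (3.8) in (3.4) and (3.5) yields
\[ \rho \le C_9 r_0^{-2 - \frac{2}{q+1}} \tag{3.9} \]
and
\[ p \le C_{10} r_0^{-2}. \tag{3.10} \]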
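Next we derive another lower bound for $E$ and use it to derive an upper bound. Note that $\mu' \ge 0$ so by the convexity of $r \mapsto \sqrt{1 + L_0 r^{-2}}$ we have
\[ E \ge e^{\mu_0} \sqrt{1 + L_0 r^{-2}} \ge e^{\mu_0} \left[ \sqrt{1 + L_0 r_0^{-2}} - \frac{L_0 (r - r_0)}{r_0^3 \sqrt{1 + L_0 r_0^{-2}}} \right] \ge E_0 - E_0 r_0^{-1} (r - r_0), \tag{3.11} \]
and hence by (3.5)
\[ p \le C_{12} r_0^{-5-q} (r - r_0)^{q+1}. \tag{3.12} \]
Similarly by (3.4)
\[ \rho \le C_{13} r_0^{-4-q} (r - r_0)^{q}, \tag{3.13} \]
so from (2.7)
\[ r^{-1} m \le C_{14} r_0^{-3-q} (r - r_0)^{q+1}. \tag{3.14} \]
On the interval
\[ r_0 \le r \le r_0 + \left( \tfrac14 C_{14}^{-1} r_0^{3+q} \right)^{\frac{1}{q+1}}, \tag{3.15} \]
we have
\[ r^{-1} m \le \tfrac14, \]
so using (3.12) and (3.14) in (3.6) yields
\[ E' \le C_{16} \left[ \tilde C_{16} r_0^{-4-q} (r - r_0)^{q+1} - r_0^{-1} \right]. \tag{3.16} \]
Hence for
\[ r_0 \le r \le r_0 + C_{17} r_0^{1 + \frac{2}{q+1}}, \tag{3.17} \]
(3.15) holds and
\[ E' \le -\tfrac12 C_{16} r_0^{-1}. \tag{3.18} \]
It follows that
\[ E\left( r_0 + C_{17} r_0^{1 + \frac{2}{q+1}} \right) \le E_0 - C_{19} r_0^{\frac{2}{q+1}}. \tag{3.19} \]
Next we claim that $E$ attains a minimum at some point $r_1$ with
\[ r_0 < r_1 \le r_0 + C_{20} r_0^{1 + \frac{2}{q+1}} \tag{3.20} \]
and
\[ E_1 \le E_0 - C_{19} r_0^{\frac{2}{q+1}}. \tag{3.21} \]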
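By (3.18), $E'\left( r_0 + C_{17} r_0^{1 + \frac{2}{q+1}} \right) < 0$, so suppose $E' \le 0$ on
\[ I = \left[ r_0 + C_{17} r_0^{1 + \frac{2}{q+1}},\ r_1 \right]. \]
Then by (3.4) and (3.19)
\[ \rho \ge C_{22} r_0^{-2 - \frac{2}{q+1}} \tag{3.22} \]
on $I$, and hence by (2.12) and (2.7)
\[ \tfrac12 > r_1^{-1} m_1 \ge C_{23} r_0^{-1 - \frac{2}{q+1}}\, \mathrm{length}(I). \tag{3.23} \]
The claim now follows. We will compare $4\pi r p$ and $r^{-2} m$ at $r_1$. Note that from (2.5) and (2.6),
\[ u^{-1} g(u) + h'(u) = u^{-1} D_0 c_\ell \int_u^{E_0} (E_0 - \varepsilon)_+^{k} \left( \varepsilon^2 - u^2 \right)_+^{\ell + \frac32} d\varepsilon \]
so by (3.2) and (3.8)
\[ 0 \le E^{-1} g(E) + h'(E) \le C_{24} r_0^2. \tag{3.24} \]
Also note that by (2.3)
\[ \left( 4\pi r^3 p \right)' = 12\pi r^2 p + 4\pi r^3 \left[ \left( 2\ell r^{-1} - (2\ell + 4) \mu' \right) p + r^{2\ell} e^{-(2\ell+4)\mu} h'(E) E' \right], \tag{3.25} \]
and by (3.6) and (2.2)
\[ \left( 4\pi r^3 p \right)' + 4\pi r^3 \rho \left( \mu' - \frac{L_0}{r (L_0 + r^2)} \right) = 4\pi r^2 p \left( 3 + 2\ell - [2\ell + 4]\, r \mu' \right) + 4\pi r^{3 + 2\ell} e^{-(2\ell+4)\mu} \left( h'(E) + E^{-1} g(E) \right) E'. \tag{3.26} \]
As long as
\[ \mu' \le 3 r^{-1}, \tag{3.27} \]
we have by (3.26), (3.3), (3.24), (3.6), and (3.10),
\[ \left( 4\pi r^3 p \right)' + 4\pi r^3 \rho \left( \mu' - \frac{L_0}{r (L_0 + r^2)} \right) \le 4\pi r^2 \left( 3 + 2\ell + 3[2\ell + 4] \right) p + 4\pi r^3 \tilde C_3 r_0^{-4} C_{24} r_0^2 \left( 3 r^{-1} - \frac{L_0}{r (L_0 + r^2)} \right) E_0 \le C_{28}. \tag{3.28} \]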
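If $E' < 0$ then (3.27) holds so by (3.28),
\[ \left( 4\pi r^3 p \right)' \le C_{28} - 4\pi r^3 \rho \left( \mu' - \frac{L_0}{r (L_0 + r^2)} \right) \le C_{28} + 4\pi r^2 \rho = C_{28} + m'. \tag{3.29} \]
If $E' \ge 0$ then by (3.25), (3.10), and since $h' \le 0$ we have
\[ \left( 4\pi r^3 p \right)' \le 4\pi r^2 p\, (3 + 2\ell) \le C_{30}. \tag{3.30} \]
Combining (3.29) and (3.30) yields
\[ \left( 4\pi r^3 p \right)' \le m' + C_{31}, \tag{3.31} \]
so
\[ 4\pi r^2 p \le r^{-1} m + C_{32} r_0^{-1} (r - r_0). \tag{3.32} \]
Next consider a point, $r_*$, where $E'(r_*) = 0$. By (3.6),
\[ \frac{4\pi r_*^2 p_* + r_*^{-1} m_*}{1 - 2 r_*^{-1} m_*} = \frac{L_0}{L_0 + r_*^2}. \]
Solving for $r_*^{-1} m_*$ yields (with (3.10))
\[ \tfrac13 \left( 1 - 4\pi r_*^2 p_* \right) - C_{33} r_0^2 \le \frac{\dfrac{L_0}{L_0 + r_*^2} - 4\pi r_*^2 p_*}{1 + 2 \dfrac{L_0}{L_0 + r_*^2}} = r_*^{-1} m_* \le \tfrac13 \left( 1 - 4\pi r_*^2 p_* \right) + C_{33} r_0^2. \tag{3.33} \]
Using (3.32) here yields
\[ \tfrac13 \left( 1 - r_*^{-1} m_* - C_{32} r_0^{-1} (r_* - r_0) \right) - C_{33} r_0^2 \le r_*^{-1} m_*. \]
Note that by (3.17) and (3.18)
\[ r_* \ge r_0 + C_{17} r_0^{1 + \frac{2}{q+1}} \]
so
\[ \frac{r_* - r_0}{r_0} \ge C_{17} r_0^{\frac{2}{q+1}} \ge C_{17} r_0^2. \]
Hence, restricting $r_0$, we have
\[ \tfrac14 - C_{34} r_0^{-1} (r_* - r_0) \le r_*^{-1} m_*. \tag{3.34} \]
Using (3.34) in (3.33) yields
\[ 4\pi r_*^2 p_* \le \tfrac14 + C_{35} r_0^{-1} (r_* - r_0). \tag{3.35} \]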
In the case when $r_* = r_1$ we have by (3.5) and (3.21),
\[ p_1 \ge C_{36} r_0^{-2}, \tag{3.36} \]
so by (3.33)
\[ r_1^{-1} m_1 \le \tfrac13 - C_{37}. \tag{3.37} \]
Note also from (3.33) that
\[ 4\pi r_1^2 p_1 \ge 1 - 3 r_1^{-1} m_1 - C_{38} r_0^2. \tag{3.38} \]
$E$ can not stay near its minimal value for long. Let $\sigma \in (0, 1)$ and suppose that
\[ E \le E_0 - \sigma C_{19} r_0^{\frac{2}{q+1}} \]
holds for $r \in [r_1, r_2]$. Then by (3.4),
\[ \rho \ge C_{39} \sigma^{q} r_0^{-2 - \frac{2}{q+1}}, \tag{3.39} \]
and by (2.7)
\[ m \ge C_{40} \sigma^{q} r_0^{-\frac{2}{q+1}} (r - r_1) \tag{3.40} \]
for $r \in [r_1, r_2]$. But then by (2.12),
\[ r_0 > \tfrac12 r_2 > m_2 \ge C_{40} \sigma^{q} r_0^{-\frac{2}{q+1}} (r_2 - r_1). \]
Then there exists $r_\sigma$ with
\[ r_1 < r_\sigma < r_1 + C_{41} \sigma^{-q} r_0^{1 + \frac{2}{q+1}} \tag{3.41} \]
and
\[ E_\sigma > E_0 - \sigma C_{19} r_0^{\frac{2}{q+1}}. \tag{3.42} \]
By (3.5) and (3.41)
\[ 4\pi r_\sigma^2 p_\sigma \le C_{43} \sigma^{q+1}. \tag{3.43} \]
We seek a value of $r$ where
\[ r^{-1} m > \tfrac13 + C, \]
for this will give a positive lower bound on $E'$. To find this value of $r$ we consider two cases, the first being that
\[ \mu' < \tfrac{15}{8} r^{-1} \quad \text{on } (r_1, r_\sigma). \tag{3.44} \]
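With this assumption, (3.28) and (3.9) yield
\[ \left( 4\pi r^3 p + \tfrac78 m \right)' \ge 4\pi r^3 \rho \left( \frac{L_0}{r (L_0 + r^2)} - \frac{15}{8r} \right) + \tfrac78\, 4\pi r^2 \rho - C_{28} \ge -C_{45}, \tag{3.45} \]
so by (3.41)
\[ 4\pi r_\sigma^3 p_\sigma + \tfrac78 m_\sigma \ge 4\pi r_1^3 p_1 + \tfrac78 m_1 - C_{46} \sigma^{-q} r_0^{1 + \frac{2}{q+1}}. \tag{3.46} \]
Now by (3.38), (3.43), and (3.37),
\[ m_\sigma \ge m_1 + \tfrac87 \left[ \left( 1 - 3 r_1^{-1} m_1 - C_{38} r_0^2 \right) r_1 - r_\sigma C_{43} \sigma^{q+1} - C_{46} \sigma^{-q} r_0^{1 + \frac{2}{q+1}} \right] \ge \left[ \tfrac87 - \tfrac{17}{7} \left( \tfrac13 - C_{37} \right) \right] r_1 - C_{47} \left( r_0^3 + r_\sigma \sigma^{q+1} + \sigma^{-q} r_0^{1 + \frac{2}{q+1}} \right). \tag{3.47} \]
We now take
\[ \sigma = \left( C_{37} C_{47}^{-1} \right)^{\frac{1}{q+1}} \]
(independent of $r_0$) and by restricting $r_0$ and using (3.41), (3.47) yields
\[ r_\sigma^{-1} m_\sigma \ge \tfrac13 + C_{48}. \tag{3.48} \]
Next suppose that (3.44) does not hold, then there exists $r_3 \in (r_1, r_\sigma)$ such that
\[ \mu'(r_3) \ge \tfrac{15}{8} r_3^{-1}. \]
By (2.9)
\[ r_3^{-1} m_3 \ge \tfrac{8}{38} \left( \tfrac{15}{8} - 4\pi r_3^2 p_3 \right). \tag{3.49} \]
Note that by (3.6) $E_3' > 0$ and let
\[ r_* = \inf \{ r \le r_3 : E' > 0 \text{ on } (r, r_3) \}, \]
and note that $r_* \ge r_1$. By (3.30), (3.35), and (3.41)
\[ 4\pi r_3^3 p_3 \le 4\pi r_*^3 p_* + C_{30} (r_3 - r_*) \le r_* \left[ \tfrac14 + C_{35} r_0^{-1} (r_* - r_0) \right] + C_{30} (r_3 - r_*) \le r_3 \left[ \tfrac14 + C_{50} r_0^{\frac{2}{q+1}} \right]. \tag{3.50} \]
Using (3.50) in (3.49) and restricting $r_0$ yields (again using (3.41))
\[ r_3^{-1} m_3 \ge \tfrac{8}{38} \left( \tfrac{13}{8} - C_{50} r_0^{\frac{2}{q+1}} \right) \ge \tfrac13 + C_{51}. \tag{3.51} \]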
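Combining (3.51) and (3.48) we have
\[ r_4^{-1} m_4 \ge \tfrac13 + C_{52} \tag{3.52} \]
for some
\[ r_4 \in \left( r_1,\ r_0 + C_{53} r_0^{1 + \frac{2}{q+1}} \right). \tag{3.53} \]
For $r \in \left[ r_4, \frac{1 + 3C_{52}}{1 + C_{52}} r_4 \right]$ we have by (3.6) and (3.52),
\[ E' \ge \left( \frac{r^{-1} m}{1 - 2 r^{-1} m} - \frac{L_0}{L_0 + r^2} \right) r^{-1} E \ge \left( \frac{r^{-1} r_4 \left( \frac13 + C_{52} \right)}{1 - 2 r^{-1} r_4 \left( \frac13 + C_{52} \right)} - 1 \right) r^{-1} E \ge \left( \frac{\frac13 (1 + C_{52})}{1 - \frac23 (1 + C_{52})} - 1 \right) r^{-1} E = \frac{C_{52}}{\frac13 - \frac23 C_{52}}\, r^{-1} E \ge C_{54} r_0^{-1}. \tag{3.54} \]
We point out that the interval $\left[ r_4, \frac{1 + 3C_{52}}{1 + C_{52}} r_4 \right]$ extends past $\tilde R$, but that (3.54) still holds. By (3.8)
\[ E \ge E_0 - \left( C_7^{-1} r_0^2 \right)^{\frac{1}{q+1}} + C_{54} r_0^{-1} (r - r_4) \]
on $\left[ r_4, \frac{1 + 3C_{52}}{1 + C_{52}} r_4 \right]$ and hence
\[ E\left( \frac{1 + 3C_{52}}{1 + C_{52}} r_4 \right) \ge E_0 - \left( C_7^{-1} r_0^2 \right)^{\frac{1}{q+1}} + C_{54} r_0^{-1} \left( \frac{1 + 3C_{52}}{1 + C_{52}} - 1 \right) r_4 \ge E_0 - \left( C_7^{-1} r_0^2 \right)^{\frac{1}{q+1}} + \frac{2 C_{52} C_{54}}{1 + C_{52}}. \]
For $r_0$ sufficiently small this expression exceeds $E_0$, so the existence of $R$ asserted in (2.13) and (2.14) follows by the intermediate value theorem. It remains to show (2.15). First we claim that
\[ R - r_1 \ge C_{55} r_0^{1 + \frac{2}{q+1}}. \tag{3.55} \]
By (3.9)
\[ \left( r^{-1} m \right)' = 4\pi r \rho - r^{-2} m \le C_{56} r_0^{-\left( 1 + \frac{2}{q+1} \right)}, \tag{3.56} \]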
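so by (3.37)
\[ r^{-1} m \le \tfrac13 \quad \text{on } \left[ r_1,\ r_1 + \frac{C_{37}}{C_{56}} r_0^{1 + \frac{2}{q+1}} \right]. \]
Hence by (3.6) and (3.10),
\[ E' \le C_{57} r_0^{-1} \quad \text{on } \left[ r_1,\ r_1 + \frac{C_{37}}{C_{56}} r_0^{1 + \frac{2}{q+1}} \right]. \tag{3.57} \]
Now (3.55) follows by using (3.21). We will need to bound $Q = 1 - 2 r^{-1} m$ away from zero. By (3.6) for $r \in (r_1, R)$
\[ E' \ge \left[ R^{-2} m_1 Q^{-1} - r_1^{-1} \right] E, \]
so by (3.34) and (3.8) (also (3.20), (2.13), and restricting $r_0$)
\[ E' \ge \left( C_{58} Q^{-1} - \tilde C_{58} \right) r_0^{-1}. \tag{3.58} \]
Using (3.8) and (2.14) yields
\[ \left( C_7^{-1} r_0^2 \right)^{\frac{1}{q+1}} \ge E(R) - E_1 \ge \int_{r_1}^{R} \left( C_{58} Q^{-1} - \tilde C_{58} \right) r_0^{-1}\, dr, \]
so by (2.13)
\[ C_{59} r_0^{1 + \frac{2}{q+1}} \ge \int_{r_1}^{R} Q^{-1}\, dr. \tag{3.59} \]
On the other hand by (2.14) and (3.56),
\[ Q(R) - Q = -2 \int_r^R \left( s^{-1} m(s) \right)' ds \ge -2 C_{56} r_0^{-\left( 1 + \frac{2}{q+1} \right)} (R - r). \]
So (3.59) yields
\[ C_{59} r_0^{1 + \frac{2}{q+1}} \ge (2 C_{56})^{-1} r_0^{1 + \frac{2}{q+1}} \ln \left( 1 + 2 C_{56} Q^{-1}(R)\, r_0^{-\left( 1 + \frac{2}{q+1} \right)} (R - r_1) \right). \]
By (3.55)
\[ C_{60} \ge \ln \left( 1 + \tilde C_{60} Q^{-1}(R) \right), \tag{3.60} \]
and (2.15) follows.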
References

1. Batt, J., Faltenbacher, W., Horst, E.: Stationary spherically symmetric models in stellar dynamics. Arch. Rat. Mech. Anal. 93 (2), 159–183 (1986)
2. Batt, J., Pfaffelmoser, K.: On the radius of continuity of the models of polytropic gas spheres which correspond to the positive solutions of the generalized Emden–Fowler equation. Math. Meth. Appl. Sci. 10, 499–516 (1988)
3. Camm, G. L.: Self-gravitating star systems II. Monthly Notices Royal Astro. Soc. 112, 155–176 (1952)
4. Eddington, A. S.: The dynamical equilibrium of the stellar system. Astronom. Nachr. Jubiläumsnummer, 9–10 (1921)
5. Jeans, J. H.: Problems of cosmogony and stellar dynamics. Cambridge: Cambridge University Press, 1919
6. Jeans, J. H.: On the theory of star-streaming and the structure of the universe. Monthly Notices Royal Astro. Soc. 76, 70–84 (1915)
7. Kurth, R.: General theory of spherical self-gravitating star systems in a steady state. Astronom. Nachr. 282, 97–106 (1955)
8. Lions, P. L., Perthame, B.: Propagation of moments and regularity for the 3-dimensional Vlasov–Poisson system. Invent. Math. 105, 415–430 (1991)
9. Pfaffelmoser, K.: Global classical solutions of the Vlasov–Poisson system in three dimensions for general initial data. J. Diff. Eqns. 95, 281–303 (1992)
10. Rein, G.: Stationary and static stellar dynamic models with axial symmetry. Nonlinear Analysis, Theory, Methods and Applications, to appear
11. Rein, G.: Static solutions of the spherically symmetric Vlasov–Einstein system. Math. Proc. Camb. Phil. Soc. 115, 559–570 (1994)
12. Rein, G.: Static shells for the Vlasov–Poisson and Vlasov–Einstein systems. Preprint
13. Rein, G., Rendall, A. D.: Smooth static solutions of the spherically symmetric Vlasov–Einstein system. Ann. de l’Inst. H. Poincaré, Phys. Théor. 59, 383–397 (1993)
14. Rein, G., Rendall, A. D.: Global existence of solutions of the spherically symmetric Vlasov–Einstein system with small initial data. Commun. Math. Phys. 150, 561–583 (1992)
15. Rein, G., Rendall, A. D., Schaeffer, J.: A regularity theorem for solutions of the spherically symmetric Vlasov–Einstein system. Commun. Math. Phys. 168, 467–478 (1995)
16. Schaeffer, J.: Global existence of smooth solutions to the Vlasov–Poisson system in three dimensions. Commun. Part. Diff. Eqns. 16, 1313–1335 (1991)

Communicated by H. Araki
Commun. Math. Phys. 204, 329 – 351 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Breit–Wigner Approximation and the Distribution of Resonances

Vesselin Petkov (1), Maciej Zworski (2)

(1) Département de Mathématiques Appliquées, Université de Bordeaux I, 351, Cours de la Libération, 33405 Talence, France. E-mail: [email protected]
(2) Mathematics Department, University of California, Evans Hall, Berkeley, CA 94720, USA. E-mail: [email protected]

Received: 16 October 1998 / Accepted: 28 January 1999
Abstract: For operators with a discrete spectrum, $\{\lambda_j^2\}$, the counting function of $\lambda_j$’s, $N(\lambda)$, trivially satisfies $N(\lambda + \delta) - N(\lambda - \delta) = \sum_j \delta_{\lambda_j}((\lambda - \delta, \lambda + \delta])$. In scattering situations the natural analogue of the discrete spectrum is given by resonances, $\lambda_j \in \mathbb{C}_+$, and of $N(\lambda)$, by the scattering phase, $s(\lambda)$. The relation between the two is now nontrivial and we prove that
\[ s(\lambda + \delta) - s(\lambda - \delta) = \sum_{|\lambda_j - \lambda| < \epsilon} \omega_{\mathbb{C}_+}(\lambda_j, [\lambda - \delta, \lambda + \delta]) + O(\delta) \lambda^{n-1}, \]
where $\omega_{\mathbb{C}_+}$ is the harmonic measure of the upper half plane and $\delta$ can be taken dependent on $\lambda$. This provides a precise high energy version of the Breit–Wigner approximation, and relates the properties of $s(\lambda)$ to the distribution of resonances close to the real axis.

1. Introduction

The origins of the theory of resonances lie in the work of Breit and Wigner who studied the behaviour of unstable particles. They postulated that if an unstable particle at energy $E_0$ decays according to the law $\exp(-\Gamma t)$ then the energy density should be approximately distributed according to the Breit–Wigner distribution
\[ \frac{\Gamma}{\Gamma^2/4 + (E - E_0)^2}. \tag{1.1} \]
Hence for $\Gamma$ small, that is for long living particles, $E = E_0$ becomes the resonant energy. In scattering experiments we consequently see “blips”. A single “blip” is supposed to correspond to a creation of a particle with rest energy $E_0$ and then to its subsequent decay.
The mathematical content of a scattering experiment is captured by the scattering matrix and we should consequently look for “blips” in expressions derived from it. The most basic one is the scattering phase which measures the averaged phase shift which a wave experiences while passing through the scatterer. If $S(\lambda)$ is the scattering matrix then we put
\[ s(\lambda) \stackrel{\mathrm{def}}{=} i \log \det S(\lambda), \qquad s(0) = 0. \tag{1.2} \]
According to the Breit–Wigner theory, $s'(\lambda)$, which is the distribution function of continuous energy, should behave according to an expression similar to (1.1) with $\lambda_0^2 = E_0 - i\Gamma$, $\lambda^2 = E$:
\[ P(\lambda_0, \lambda) \stackrel{\mathrm{def}}{=} \frac{1}{\pi} \frac{\operatorname{Im} \lambda_0}{|\lambda - \lambda_0|^2}. \tag{1.3} \]
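Integrating the kernel (1.3) over a real interval [a, b] has the closed form (arctan((b - x)/y) - arctan((a - x)/y))/π for a point z = x + iy with y > 0, which is the harmonic measure of the upper half-plane that weights each resonance in the main theorem stated below. The following sketch evaluates it and compares it with a direct quadrature of the kernel. The function names and the sample points are illustrative assumptions, not notation from the paper.

```python
import math

def poisson_kernel(z, lam):
    # P(z, lam) = (1/pi) * Im z / |lam - z|^2, as in (1.3), for Im z > 0.
    return z.imag / (math.pi * abs(lam - z) ** 2)

def harmonic_measure(z, a, b):
    # Closed form of the integral of P(z, .) over [a, b].
    x, y = z.real, z.imag
    return (math.atan((b - x) / y) - math.atan((a - x) / y)) / math.pi

def harmonic_measure_quadrature(z, a, b, n=20000):
    # Midpoint rule for the same integral, used only as a cross-check.
    h = (b - a) / n
    return h * sum(poisson_kernel(z, a + (i + 0.5) * h) for i in range(n))

tau, delta = 5.0, 0.1
interval = (tau - delta, tau + delta)
near_axis = harmonic_measure(complex(tau, 1e-3), *interval)   # close to 1
far_away = harmonic_measure(complex(tau, 1.0), *interval)     # small
closed = harmonic_measure(complex(tau, 0.05), *interval)
numeric = harmonic_measure_quadrature(complex(tau, 0.05), *interval)
```

A resonance whose imaginary part is much smaller than δ contributes almost a full unit, while one at distance of order 1 from the real axis contributes a quantity of order δ. This is the sense in which only resonances close to the real axis produce a visible “blip” in the scattering phase.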
This is partly justified by the Poisson formula for resonances – see (3.1) below. Since there could be many resonances, $\lambda_0$, close to the real axis, the separation between the ones corresponding to a particular “blip” and the other ones is far from apparent. In this paper we give upper and lower bounds on the variation of the scattering phase
\[ s(\lambda + \delta) - s(\lambda - \delta), \tag{1.4} \]
in terms of the number of resonances close to $\lambda$. That provides further justification of the Breit–Wigner approximation, in particular at high energies, and leads to new results on the distribution of resonances. Our main result is

Theorem. For operators satisfying assumptions (1.7)–(1.9) below and for any fixed $0 < \epsilon < 1$ we have
\[ s(\tau + \delta) - s(\tau - \delta) = 2\pi \sum_{|\lambda_j - \tau| < \epsilon} \omega_{\mathbb{C}_+}(\lambda_j, [\tau - \delta, \tau + \delta]) + O_\epsilon(\delta) \tau^{n-1}, \qquad \forall \delta,\ 0 \le \delta < \epsilon/2,\ \forall \tau \ge 1, \tag{1.5} \]
where $\lambda_j$ are resonances included according to their multiplicities and $\omega_{\mathbb{C}_+}$ is the harmonic measure for the upper half-plane:
\[ \omega_{\mathbb{C}_+}(z, E) = \int_E P(z, \lambda)\, d\lambda, \qquad E \subset \mathbb{R} = \partial \mathbb{C}_+, \]
with the Poisson kernel $P(z, \lambda)$ given by (1.3).

We note that we can apply the theorem with $\delta$ which depends on $\tau$. A by-product of the proof, which also justifies the statement of the theorem, is the following estimate closely related to the assumption (1.9) below:
\[ \# \{ \lambda_j : |\lambda_j - \tau| < \epsilon \} = O(\tau^{n-1}), \qquad 0 < \epsilon < 1, \tag{1.6} \]
see Proposition 2 below. The theorem leads to new estimates on the number of resonances near the real axis. Previous results on the Breit–Wigner approximation were obtained in a very specific semi-classical setting by Gérard-Martinez-Robert [3] and their work motivated ours. We consider a general compactly supported perturbation of the Laplacian in $\mathbb{R}^n$, $n \ge 2$. To
avoid discussion of specific aspects of obstacle or metric scattering we adopt the “black box” formalism developed by Sjöstrand and the second author [20]. It is quite possible that much of this work could be generalized to include the non-compact “black box” perturbations [19] but that would involve further development of some tools of scattering theory. We choose to remain in a somewhat less technical situation at this stage. We start by recalling the abstract assumptions of “black box” scattering. Let $\mathcal{H}$ be a complex Hilbert space with an orthogonal decomposition
\[ \mathcal{H} = \mathcal{H}_{R_0} \oplus L^2(\mathbb{R}^n \setminus B(0, R_0)), \qquad n \ge 2, \]
and let $L$ be a self-adjoint operator with domain $\mathcal{D} \subset \mathcal{H}$ such that
\[ 1_{\mathbb{R}^n \setminus B(0, R_0)} \mathcal{D} = H^2(\mathbb{R}^n \setminus B(0, R_0)), \qquad 1_{\mathbb{R}^n \setminus B(0, R_0)} L = -\Delta|_{\mathbb{R}^n \setminus B(0, R_0)}, \tag{1.7} \]
\[ \sigma_{pp}(L) \cap [0, \infty] = \emptyset, \qquad L \ge -C, \quad C > 0. \tag{1.8} \]
We also assume that $L$ is elliptic in the following spectral sense: let $L^\sharp$ be a reference operator of $L$ obtained by considering $L$ on $\mathcal{H}^\sharp = \mathcal{H}_{R_0} \oplus L^2((\mathbb{R}^n / R\mathbb{Z}^n) \setminus B(0, R_0))$, $R \gg R_0$, in the obvious way. We then assume that
\[ \# \{ \lambda : \lambda^2 \in \sigma(L^\sharp),\ |\lambda| \le r \} = C(L) r^n + O(r^{n-1}). \tag{1.9} \]
In other words, the reference operator admits Weyl asymptotics with the Hörmander remainder term. We note that (1.9) implies the trace class properties needed for the Poisson formula for resonances – see [22,19]. A typical example is provided by a metric perturbation of $-\Delta$ on an exterior domain given by
\[ L = -\sum_{i,j=1}^{n} \partial_{x_j} a_{i,j}(x) \partial_{x_i}, \quad \text{on } \mathbb{R}^n \setminus \mathcal{O}, \]
where $\partial \mathcal{O}$ is smooth, $\mathbb{R}^n \setminus \mathcal{O}$ is connected and we take Dirichlet (or Neumann) boundary conditions. The coefficients $a_{i,j}(x)$ are real valued and satisfy the assumptions:
\[ a_{i,j}(x) \in C^\infty(\mathbb{R}^n), \qquad a_{i,j}(x) = a_{j,i}(x), \quad 1 \le i, j \le n, \]
\[ \sum_{i,j}^{n} a_{i,j}(x) \xi_i \xi_j \ge \delta_0 |\xi|^2, \quad \delta_0 > 0, \quad \forall x \in \mathbb{R}^n,\ \forall \xi \in \mathbb{R}^n, \]
there exists $R_0 > 0$ such that $a_{i,j}(x) = \delta_{i,j}$ for $|x| \ge R_0$. In this case $L$ is a self-adjoint elliptic positive operator in $L^2(\mathbb{R}^n \setminus \mathcal{O})$ with the domain given by $H^2(\mathbb{R}^n \setminus \mathcal{O}) \cap H_0^1(\mathbb{R}^n \setminus \mathcal{O})$. As was shown in [20], the resolvent
\[ R(\lambda) = (L - \lambda^2)^{-1} : \mathcal{H} \longrightarrow \mathcal{D}, \qquad \operatorname{Im} \lambda < 0, \quad \lambda^2 \notin \sigma_{pp}(L), \]
admits a meromorphic continuation to $\mathbb{C}$ as an operator $R(\lambda) : \mathcal{H}_{\mathrm{comp}} \longrightarrow \mathcal{D}_{\mathrm{loc}}$ when $n$ is odd and to the logarithmic plane, $\Lambda$, when $n$ is even. The poles, $\lambda_j$, $\operatorname{Im} \lambda_j > 0$, have finite multiplicities. Moreover, for $\lambda \in \mathbb{R}$ we define the scattering operator
\[ S(\lambda) : L^2(S^{n-1}) \longrightarrow L^2(S^{n-1}) \]
following the usual practice in scattering theory – see [31] and [2] for the details in the “black box” setting. The scattering matrix has the form $S(\lambda) = I + A(\lambda)$ with a trace class operator $A(\lambda)$, and it continues meromorphically to $\mathbb{C}$ when $n$ is odd and to $\Lambda$ when $n$ is even. Its poles coincide with the poles of the resolvent with multiplicities, $m_S(\lambda)$, related to the multiplicities of the poles of the resolvent, $m_R(\lambda)$, by the formula $m_S(\lambda) = m_R(\lambda) - m_R(-\lambda)$, see [31] and references given there for more details. The above assumptions imply a sharp polynomial upper bound for the counting function of the scattering poles when $n$ is odd (obtained in successive generality in [9, 30, 20, 25])
\[ \# \{ \lambda_j : |\lambda_j| \le r \} \le A_0 r^n, \qquad r \ge A_1. \tag{1.10} \]
For $n$ even we have a similar estimate
\[ \# \{ \lambda_j : |\lambda_j| \le r,\ |\arg \lambda_j| \le \rho \} \le A_\rho r^n, \qquad r \ge A_1, \tag{1.11} \]
– see [20] for the proof which applies to $\rho < \pi/2$ and [27,28] for the general case and the dependence of $A_\rho$ on $\rho$. Since $A(\lambda)$ is a trace class operator, the definition of the scattering phase, (1.2), makes sense. Because of the assumption (1.9), the scattering phase satisfies a Weyl law:
\[ |s(\lambda) - c_0 \lambda^n| \le C_0 \lambda^{n-1}, \qquad \lambda \longrightarrow +\infty, \tag{1.12} \]
for some constants $c_0 > 0$, $C_0 > 0$ independent of $\lambda$. This was shown by Christiansen in [2] following earlier work on obstacle and metric scattering [10,14,15,11] (some of those papers are concerned with non-compact perturbations not included in the presently discussed set-up). In spite of (1.12), the derivative $s'(\tau)$ may increase exponentially for $\tau = \operatorname{Re} \lambda_j$ if the poles $\lambda_j$ converge exponentially to the real axis (see [3,11]). This of course means that the scattering phase difference
\[ s(\tau + \delta) - s(\tau - \delta), \qquad \delta > 0 \tag{1.13} \]
varies quickly as $\delta \longrightarrow 0$ and $\tau$ is equal to the real part of a scattering pole. This is related to our opening discussion and to the Breit–Wigner approximation. Thus it is interesting and useful to find a link between the behaviour of the scattering phase difference and the existence of poles in small “boxes”
\[ \{ z \in \mathbb{C} : |\operatorname{Re} z - \tau| \le B\delta,\ \operatorname{Im} z \le \delta \}, \qquad B > 0. \tag{1.14} \]
To motivate our results let us recall the case studied in [3] (see also [15] for the long range case). Let $s_h(\lambda)$ be the relative scattering phase associated to two operators $L_j(h) = -h^2 \Delta + V_j(x)$, $j = 1, 2$, $h > 0$, where $V_1(x)$ is a non-trapping potential for the energy level $0 < \lambda_0 = |\xi|^2 + V_1(x)$, while $V_2(x)$ is a “well in the island” trapping potential for the level $\lambda_0 = |\xi|^2 + V_2(x)$. Let $\Gamma(h)$ be the set of poles of the scattering
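operator related to $L_j(h)$, $j = 1, 2$, and let $\lambda(h) \in \Gamma(h)$ be a simple pole. Next let $\delta(h) > 0$ be such that
\[ \lim_{h \searrow 0} h^{-n} \delta(h) = 0, \qquad \lim_{h \searrow 0} \frac{\operatorname{Im} \lambda(h)}{\delta(h)} = 0. \tag{1.15} \]
Assume that $\Omega(h)$ is a complex neighbourhood of $\lambda_0$ such that
\[ \bigcap_{h > 0} \Omega(h) = \{ \lambda_0 \}, \qquad \operatorname{Re} \lambda(h) \pm \delta(h) \in \Omega(h) \cap \mathbb{R}, \quad 0 < h \le h_0, \]
\[ \operatorname{dist}\left( \Gamma(h), \partial \Omega(h) \right) \ge c_\epsilon e^{-\epsilon/h}, \qquad \forall \epsilon > 0 \]
and, moreover, suppose that $\lambda(h)$ is the unique pole in $\Omega(h)$. Under these assumptions Gérard, Martinez and Robert [3] (see also [15] for long range potentials) proved that
\[ \lim_{h \searrow 0} \left[ s(\operatorname{Re} \lambda(h) \pm \delta(h)) - s(\operatorname{Re} \lambda(h)) \right] = \pm\pi. \tag{1.16} \]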
We start by presenting some improved estimates on the scattering determinant in Sect.2. In Sect.3 we obtain a lower bound for the difference (1.13) taking B < 1 , 0 < ≤ 1 in (1.14). As a consequence we obtain an upper bound (1.6) for the variation of the counting function of scattering poles near the real axis. This shows that the distribution near the real axis has to be to some extent regular. Section 4 is devoted to the proof of the main theorem and in Sect. 5 we obtain upper bounds for the variation of the scattering phase (1.13). The estimate involves the number of the poles in a “box” {z ∈ C : | Re z − τ | ≤ δ 1−2γ , Im z ≤ δ 1−γ }, with an error O(δ γ )τ n−1 . Finally, in Sect. 6, we combine the bounds for the scattering differences with the Weyl asymptotics for the scattering phase for metric perturbations [11] to establish the existence of poles λj with Im λj −→ 0. 2. Estimates on the Scattering Determinant We start by recalling results on the factorization of the scattering determinant. For n odd, det S(λ) = eg(λ) P (λ) =
Y
P (−λ) , |g(λ)| ≤ C|λ|n + C, λ ∈ C, P (λ)
E(λ/λj , n), E(z, p) = (1 − z)e
2
p
z+ z2 +···+ zp
.
(2.1)
j
Here, as elsewhere in this paper, the resonances can have multiplicities, that is, we can have λi = λj for i 6 = j . In the study of the relation between the scattering phase and resonances it is therefore natural to use the factorization formula rather than (3.1). In even dimensions we have a weaker factorization given in [32] (see (2.3) and (2.7) there): det S(λ) = egρ (λ) Pρ (λ) =
Y
Pρ (−λ) , |∂λk gρ (λ)| ≤ Cρ 0 ,k |λ|m−k , λ ∈ 3ρ 0 , m ≥ n, ρ 0 < ρ, Pρ (λ)
E(λ/λj , n), 3ρ = {λ ∈ C : | arg ±λ| < ρ < π/2, ± Re λ > 1}.
λj ∈3ρ
(2.2)
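For $\lambda$ in the physical half plane we will give an estimate which generalizes and improves the estimate given in Proposition 2 and (14) of [30]:

Lemma 1. Let $L$ be an operator satisfying the “black box” assumption of Sect. 1 but with (1.9) replaced by a weaker assumption
\[ \#\{\lambda : \lambda^2 \in \sigma(L^\sharp), \ \lambda \le r\} \le C(L)\, r^m, \quad m \ge n. \tag{2.3} \]
Then, for $\operatorname{Im}\lambda \ge 0$, $\operatorname{Re}\lambda \ge 0$, $|\lambda - \tilde\lambda| > \epsilon$, $\tilde\lambda^2 \in \sigma_{pp}(L) \cup \{0\}$ we have
\[ |\det S(-\lambda)| \le C \exp(C \operatorname{Im}\lambda\, |\lambda|^{n-1}). \]
Here we identified the first quadrant in $\mathbb{C}$ with the corresponding subset of $\Lambda$ when $n$ is even.

Proof. Let $\operatorname{Im}\lambda > C_0 \gg 1$. We denote below by $C$ different constants which are independent of $\lambda$. We can then use the representation of $S(-\lambda) = I + A(-\lambda)$ given in the proof of Theorem 4 in [32] (we could also use the representation given in [31], see (2.4) in [32]):
\[ A(-\lambda) = -c_n \lambda^{n-2} E_{\phi_1}(\lambda)\, [\Delta, \chi_2]\, R(-\lambda)\, [\Delta, \chi]\; {}^t E_{\phi_2}(\lambda), \]
where $E_\phi(\lambda) : L^2(\mathbb{R}^n) \to L^2(\mathbb{S}^{n-1})$ has the kernel given by $K(\theta, x) = \exp(i\lambda\langle x, \theta\rangle)\phi(x)$, and $\phi_j, \chi_2, \chi \in C^\infty(\mathbb{R}^n)$, identically 1 near $B(0, R_0)$, are specially chosen. We have for $\operatorname{Im}\lambda > C_0$,
\[ \|R(-\lambda)\|_{H \to H} \le 1/|\operatorname{Im}(\lambda^2)| \quad \text{and} \quad \|R(-\lambda)\|_{H \to H} \le \frac{C}{|\lambda|}, \]
provided $\operatorname{Im}\lambda > C_0$. Using the equation, we obtain
\[ \|[\Delta, \chi_2] R(-\lambda) [\Delta, \chi]\|_{L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)} \le C|\lambda|, \quad \operatorname{Im}\lambda > C_0, \]
which immediately gives
\[ |A(-\lambda)(\theta, \omega)| \le C|\lambda|^{n-1} e^{C \operatorname{Im}\lambda}, \quad \operatorname{Im}\lambda > C_0, \]
and the analyticity of the kernel $A(-\lambda)(\theta, \omega)$ of $A(-\lambda)$ in $\theta$. The Cauchy estimates then yield
\[ |\Delta_\theta^m A(\theta, \omega)| \le C^m (2m)!\, e^{C|\lambda|}, \quad \operatorname{Im}\lambda > C_0, \]
and hence following [9,30] we can estimate characteristic values of $A(-\lambda)$:
\[ \mu_j(A(-\lambda)) \le \mu_j((-\Delta_\theta + I)^{-m})\, \|(-\Delta_\theta + I)^m A(-\lambda)\| \le C^m (2m)!\, j^{-\frac{2m}{n-1}} e^{C|\lambda|} \le C e^{C|\lambda| - j^{\frac{1}{n-1}}/C}, \]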
Breit–Wigner and Resonances
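where the last inequality is achieved through an optimization in $m$. We now observe that
\[ \prod_{j \ge C A^{n-1}} \left(1 + e^{A - j^{\frac{1}{n-1}}/C}\right) \le C e^{C A^{n-1}}, \quad A \ge 0, \]
and consequently, via the standard use of Weyl inequalities (see [30]),
\[ |\det S(-\lambda)| \le \prod_{j=1}^{\infty} (1 + \mu_j(A(-\lambda))) \le C e^{C|\lambda|^{n-1}} \|A(-\lambda)\|^{C|\lambda|^{n-1}} \le C e^{C(\operatorname{Im}\lambda + (n-1)\log|\lambda|)|\lambda|^{n-1}}, \quad \operatorname{Im}\lambda > C_0, \]
since $\|A(-\lambda)\| \le C e^{C \operatorname{Im}\lambda + (n-1)\log|\lambda|}$, $\operatorname{Im}\lambda > C_0$. For $\lambda \in \mathbb{R}$ we have $|\det S(-\lambda)| = 1$ and for $|\lambda - \tilde\lambda| = \epsilon$, $\tilde\lambda^2 \in \sigma_{pp}(L) \cup \{0\}$, we have $|\det S(-\lambda)| \le C_\epsilon$. Also, as in [4,31], we can estimate $\det S(-\lambda)$ globally away from its poles:
\[ |\det S(-\lambda)| \le C e^{C|\lambda|^{mn+\epsilon}}, \quad -\lambda \notin \bigcup_{\lambda_j \neq 0} D(\lambda_j, |\lambda_j|^{-m-\epsilon}), \quad |\lambda| > 1, \ n \text{ odd}. \]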
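For $n$ even the same estimate is true in a conic neighbourhood of the positive real axis as was shown in [32]. Combined with the estimate on the real axis and the maximum principle this gives $|\det S(-\lambda)| \le \exp(C|\lambda|^N)$ for $\operatorname{Im}\lambda \ge 0$, $|\lambda - \tilde\lambda| > \epsilon$, $\tilde\lambda^2 \in \sigma_{pp}(L) \cup \{0\}$ and some fixed $N$. To apply the Phragmén–Lindelöf principle consider the function
\[ g(\lambda) = e^{iB\lambda^n} \det S(-\lambda), \quad \operatorname{Im}\lambda \ge 0, \ |\operatorname{Re}\lambda| > C_2 > 0, \]
with a constant $B > 0$ which we will choose below. On the curve $\gamma = \{\lambda \in \mathbb{C} : \operatorname{Im}\lambda = \log|\lambda| > C_0, \ \operatorname{Re}\lambda \ge C_2 > 0\}$ we have
\[ 1 - \frac{1}{n} \sum_{j=1}^{n-1} C_n^{j+1} \left(\frac{\operatorname{Im}\lambda}{\operatorname{Re}\lambda}\right)^j \ge 1/2, \]
provided we take $C_2 > 0$ sufficiently large. Thus we deduce that
\[ |e^{iB\lambda^n}| \le \exp\left(-\frac{nB}{2} \operatorname{Im}\lambda\, (\operatorname{Re}\lambda)^{n-1}\right), \quad \lambda \in \gamma. \]
Combining this with the estimate for $|\det S(-\lambda)|$, we obtain $|g(\lambda)| \le C$, $\lambda \in \gamma$, assuming $|\lambda|/\operatorname{Re}\lambda \le C_3$ on $\gamma$ and $B = 2CC_3^{n-1}$. On the other hand, $|g(\lambda)| = 1$ for $\operatorname{Im}\lambda = 0$ and an application of the Phragmén–Lindelöf principle yields
\[ |\det S(-\lambda)| \le C|e^{-iB\lambda^n}| \le C_4 e^{C_4 \operatorname{Im}\lambda\, |\lambda|^{n-1}} \quad \text{for } 0 \le \operatorname{Im}\lambda \le \log|\lambda|, \ \operatorname{Re}\lambda \ge C_2. \]
By the maximum principle we complete the proof. □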
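The lower bound used for the weight $e^{iB\lambda^n}$ on the curve $\gamma$ can be checked numerically. The following is only an illustrative sketch, not part of the original argument: the helper name `im_power_ratio` and the sample values of $n$ and $\operatorname{Re}\lambda$ are assumptions made here. It computes the ratio $\operatorname{Im}(\lambda^n) / (n \operatorname{Im}\lambda\, (\operatorname{Re}\lambda)^{n-1})$ for points with $\operatorname{Im}\lambda = \log|\lambda|$; a ratio of at least $1/2$ is what gives $|e^{iB\lambda^n}| = e^{-B\operatorname{Im}(\lambda^n)} \le \exp(-\tfrac{nB}{2}\operatorname{Im}\lambda\,(\operatorname{Re}\lambda)^{n-1})$.

```python
import math

def im_power_ratio(x, n):
    """Ratio Im(lam**n) / (n * Im(lam) * Re(lam)**(n-1)) on the curve Im(lam) = log|lam|.

    x is Re(lam); the imaginary part y is found by fixed-point iteration
    of y = log|x + iy| (hypothetical helper, for illustration only).
    """
    y = math.log(x)
    for _ in range(50):
        y = math.log(math.hypot(x, y))
    lam = complex(x, y)
    return (lam ** n).imag / (n * y * x ** (n - 1))

# Sample dimensions n and sample values of Re(lam) (arbitrary choices).
for n in (2, 3, 5):
    for x in (1e2, 1e3, 1e4):
        print(n, x, im_power_ratio(x, n))
```

For the sampled values the ratio stays close to 1, consistent with the requirement that $C_2$ be taken sufficiently large.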
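As the first application of this estimate we obtain a local factorization of the scattering determinant. The proof is inspired by recent progress in trace formulæ¹ for resonances, [4], [19], [32]:

Lemma 2. If for $|\tau| \gg 0$ we write
\[ \det S(\lambda) = e^{g_{\rho,\tau}(\lambda)} \frac{\overline{P_{\rho,\tau}(\bar\lambda)}}{P_{\rho,\tau}(\lambda)}, \quad |\lambda - \tau| \le |\tau|/A, \qquad P_{\rho,\tau}(\lambda) = \prod_{\substack{|\lambda_j - \tau| \le |\tau|/2 \\ \lambda_j \in \Lambda_\rho}} (\lambda - \lambda_j), \quad A > 2, \]
then for $|\lambda - \tau| \le |\tau|/C_\rho$, we can choose the holomorphic function $g_{\rho,\tau}$ so that
\[ |\partial_\lambda^l g_{\rho,\tau}(\lambda)| \le C_l |\lambda|^{n-l}, \quad l \ge 0, \]
where $C_\rho \ge A > 2$ and $C_l$ are independent of $\tau$.

Proof. Let us take $\tau > 0$. We first notice that $\overline{g_{\rho,\tau}(\bar\lambda)} = -g_{\rho,\tau}(\lambda)$. Next we choose $B_\rho > A$ large enough to arrange
\[ \{\lambda : |\lambda - \tau| \le \tau/B_\rho\} \subset \Lambda_\rho \cap \{\operatorname{Re}\lambda > 1\}. \]
We fix $B_\rho$ and we observe that by Lemma 1, for $|\lambda - \tau| \le \tau/B_\rho$, we have
\[ |\det S(\lambda)| \le C e^{C|\lambda|^n}, \quad \operatorname{Im}\lambda \le 0. \]
Here and below we denote by $C$ different constants independent of $\lambda$ and $\tau$. We note that we can replace $P_{\rho,\tau}(\lambda)$ by
\[ \prod_{\substack{|\lambda_j - \tau| \le |\tau|/2 \\ \lambda_j \in \Lambda_\rho}} \frac{\lambda - \lambda_j}{\tau}, \]
so that the bounds on the number of resonances, (1.10), (1.11), give
\[ |P_{\rho,\tau}(\lambda)| \le C e^{C|\lambda|^n}, \quad |\lambda - \tau| \le \tau/B_\rho. \]
On the other hand, Cartan’s Lemma (see for instance [6], Lemma 6.17) implies that outside of a union of discs with radii summing up to $\epsilon\tau$, $\epsilon > 0$,
\[ |P_{\rho,\tau}(\lambda)| > \exp\left(-C \log\frac{1}{\epsilon}\, \tau^n\right). \]
If $\epsilon \ll 1$ we can find $B > B_\rho$ such that this estimate is valid on $|\tau - \lambda| = \tau/B$. Combining this with the upper bounds on $P_{\rho,\tau}(\lambda)$ and $S(\lambda)$ above, and with the maximum principle we obtain
\[ \operatorname{Re} g_{\rho,\tau}(\lambda) \le C_1 \tau^n, \quad \operatorname{Im}\lambda \le 0, \ |\lambda - \tau| \le \tau/B. \]

¹ As was recently pointed out by Sjöstrand, the underlying feature of these arguments is already present in the classical proof of Lidskii’s theorem.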
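Applying Harnack’s inequality to the positive harmonic function $G(\lambda) = 2C_1\tau^n - \operatorname{Re} g_{\rho,\tau}(\lambda)$ in discs $D(\tilde\lambda, r) \subset D(\hat\lambda, 3r)$, $\operatorname{Im}\hat\lambda < -3r$, and comparing the maximum of $G(\lambda)$ on $D(\tilde\lambda, r)$ with the minimum of $G(\lambda)$ on $D(\tilde\lambda, r)$ gives
\[ -3C_1\tau^n \le \operatorname{Re} g_{\rho,\tau}(\lambda) \le C_1\tau^n, \quad \operatorname{Im}\lambda \le 0, \ |\lambda - \tau| \le \tau/B. \]
Hence $|\operatorname{Re} g_{\rho,\tau}(\lambda)| \le C|\lambda|^n$ for $\operatorname{Im}\lambda \le 0$, $|\lambda - \tau| \le \tau/B$, which, due to the symmetry of $g_{\rho,\tau}$, immediately yields $|\operatorname{Re} g_{\rho,\tau}(\lambda)| \le C|\lambda|^n$ for $|\lambda - \tau| \le \tau/B$. From this we see that $|g_{\rho,\tau}(\lambda)| \le C|\lambda|^n$ for $|\lambda - \tau| \le \tau/2B$. In fact, Carathéodory’s inequality (see for instance [24], 5.5) shows that
\[ \max_{|\lambda - \tau| \le \tau/2B} |g_{\rho,\tau}(\lambda)| \le 2 \max_{|\lambda - \tau| \le \tau/B} \operatorname{Re} g_{\rho,\tau}(\lambda) + 3|g_{\rho,\tau}(\tau)|. \]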
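The constants 2 and 3 in Carathéodory’s inequality are the values of $2r/(R-r)$ and $(R+r)/(R-r)$ at $r = R/2$. As a hedged numerical illustration (the function name `bc_check` and the three test functions are arbitrary choices made here, not taken from the paper), one can compare both sides on circles centred at the origin for a few functions holomorphic in the unit disc:

```python
import cmath
import math

def bc_check(f, R=1.0, samples=2000):
    """Return (max |f| on |z| = R/2, 2 * max Re f on |z| = R + 3 * |f(0)|).

    By the maximum principle the maxima over the closed discs are attained
    on the boundary circles, which are sampled at equally spaced angles.
    """
    angles = [2 * math.pi * k / samples for k in range(samples)]
    lhs = max(abs(f(0.5 * R * cmath.exp(1j * t))) for t in angles)
    rhs = 2 * max(f(R * cmath.exp(1j * t)).real for t in angles) + 3 * abs(f(0))
    return lhs, rhs

# Arbitrary sample functions, chosen only for illustration.
examples = (lambda z: z, lambda z: cmath.exp(z) - 1, lambda z: 1j * z ** 3 + 2 * z)
for f in examples:
    lhs, rhs = bc_check(f)
    print(lhs, "<=", rhs)
```

In the proof this is applied with centre $\tau$, $R = \tau/B$ and $f = g_{\rho,\tau}$.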
We now note that $g_{\rho,\tau}$ is determined up to an integral multiple of $2\pi i$, and hence we can choose $g_{\rho,\tau}$ so that $|g_{\rho,\tau}(\tau)| \le 2\pi$. The Cauchy inequalities then give the symbolic estimates. □

3. Lower Bounds on the Variation of the Scattering Phase

Under the assumptions (1.7)–(1.9) (or in fact under a much weaker assumption than (1.9)) we can define a distribution
\[ u(t) = 2 \operatorname{tr}\left( \cos(t\sqrt{L}) - 1_{\mathbb{R}^n \setminus B(0,R_0)} \cos(t\sqrt{\Delta})\, 1_{\mathbb{R}^n \setminus B(0,R_0)} \right) \in \mathcal{D}'(\mathbb{R}). \]
The basic trace formula of scattering theory, the Birman–Krein formula, relates the wave trace, $u(t)$, and the scattering phase:
\[ u(t) = \frac{1}{2\pi} \widehat{\frac{ds}{d\lambda}}(t) + 2 \sum_{\substack{\lambda_j^2 \in \sigma_{pp}(L) \\ \operatorname{Im}\lambda_j < 0}} \cos(\lambda_j t), \quad t \neq 0, \]
see [2] and references given there. The other trace formula relates the wave trace to resonances and is a non-trivial analogue of the easy spectral trace formula for operators on compact manifolds. For $n$ odd we have
\[ u(t) = \sum_j e^{i\lambda_j |t|}, \quad t \neq 0, \tag{3.1} \]
in the sense of distributions on $\mathbb{R} \setminus \{0\}$ and where the sum is taken over all poles $\lambda_j$ included according to their multiplicities. This was established in successive generality by Bardos–Guillot–Ralston, Melrose and Sjöstrand–Zworski – see [19] and the references given there. As was pointed out in [4,31,32], the Poisson formula is equivalent to the Birman–Krein formula and the factorization results for the scattering determinant recalled in the beginning of Sect. 2. Motivated by the Poisson formula (3.1) we now apply a modification of the idea of Melrose [10] for regularizing $s(\lambda)$. It allows the best available general justification of the Breit–Wigner approximation at high energies.
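Let us take $\chi(t) \in C^\infty(\mathbb{R})$ such that $0 \le \chi(t) \le 1$, $\chi(t) = 1$ for $5/6 \le t \le 7/6$, $\chi(t) = 0$ for $t \ge 4/3$ and for $t \le 2/3$. We then put
\[ \frac{d}{d\lambda} s_{\mathrm{norm}}(\lambda) = 2\pi \sum_{\lambda_j \in \Lambda_\rho} \chi\left(\frac{|\lambda_j|}{\lambda}\right) P(\lambda_j, \lambda), \quad s_{\mathrm{norm}}(0) = 0, \]
where $P(z, \lambda)$ is given by (1.3). Next, we define
\[ s_{\mathrm{reg}}(\lambda) = s(\lambda) - s_{\mathrm{norm}}(\lambda). \tag{3.2} \]
The essential part of Melrose’s argument was showing that $s_{\mathrm{reg}}(\lambda) \in S^n((0, \infty))$, that is, it is a symbol of order $n$. We prove the same result in this greater generality:²

Lemma 3. Let $s_{\mathrm{reg}}$ be given by (3.2). Then
\[ s_{\mathrm{reg}}(\lambda) \in S^n((0, \infty)). \tag{3.3} \]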
Proof. First we will show that for $|\lambda - \tau| \le |\tau|/C_\rho$, we have
\[ \left| \left(\frac{d}{d\lambda}\right)^k s'_{\mathrm{reg}}(\lambda) \right| \le C_k (1 + |\lambda|)^{n-k-1}, \quad |\lambda| \ge 1, \ k \ge 0, \tag{3.4} \]
with constants $C_k > 0$ independent of $\tau$. In the notation of Lemma 2 we have
\[ s'(\lambda) = i g'_{\rho,\tau}(\lambda) + \sum_{\substack{|\lambda_j - \tau| \le |\tau|/2 \\ \lambda_j \in \Lambda_\rho}} \frac{2 \operatorname{Im}\lambda_j}{|\lambda - \lambda_j|^2}, \]
so that
\[ s'_{\mathrm{reg}}(\lambda) = i g'_{\rho,\tau}(\lambda) + \sum_{\substack{|\lambda_j - \tau| \le |\tau|/2 \\ \lambda_j \in \Lambda_\rho}} \left(1 - \chi\left(\frac{|\lambda_j|}{\lambda}\right)\right) \frac{2 \operatorname{Im}\lambda_j}{|\lambda - \lambda_j|^2} - \sum_{\substack{|\lambda_j - \tau| > |\tau|/2 \\ \lambda_j \in \Lambda_\rho}} \chi\left(\frac{|\lambda_j|}{\lambda}\right) \frac{2 \operatorname{Im}\lambda_j}{|\lambda - \lambda_j|^2}. \tag{3.5} \]
We already have symbolic estimates for $g'_{\rho,\tau}(\lambda)$ and it remains to study the remaining terms. In the second term on the right-hand side of (3.5) we have $|\lambda_j - \lambda| \ge |\lambda|/C$ and the estimate (3.4) for this term follows from the upper bound on the number of resonances in the sum, which by (1.10), (1.11) is $C|\tau|^n$. The third term is treated in the same way, once we note that in the sum, $|\lambda - \lambda_j| \ge |\tau|/C$, if $C_\rho$ is large enough. Since the constants $C_k$ are independent of $\tau$, we conclude that $s'_{\mathrm{reg}}(\lambda) \in S^{n-1}((0, \infty))$ and by integration we complete the proof. □

We should remark that we could now give a new proof of (1.12) in even dimensions following the now standard argument of Hörmander, as applied in [10]. The normalized part of $s(\lambda)$, $s_{\mathrm{norm}}(\lambda)$, involves contributions of resonances with moduli comparable with $|\lambda|$ and behaves in the way giving “blips”. The estimates (3.4) show that the variation of $s_{\mathrm{reg}}(\lambda)$ is well behaved: $s_{\mathrm{reg}}(\tau + \delta) - s_{\mathrm{reg}}(\tau - \delta) = O(\delta)\tau^{n-1}$ for $\delta > 0$.

² The argument is different from the strictly odd dimensional argument of [10] based on the exact Poisson formula.
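Proposition 1. Let $C_1 > 1$, $C = 2(1 - C_1^{-1})^{-1}$ be fixed. For each $0 < \epsilon \le 1$, each $\delta > 0$ and $\tau \ge 1$ we have
\[ s(\tau + \delta) - s(\tau) \ge (\pi - C\epsilon)\, \#\{\lambda_j : \tau \le \operatorname{Re}\lambda_j \le \tau + \tfrac{\delta}{C_1}, \ \operatorname{Im}\lambda_j \le \epsilon\delta\} + O(\delta\tau^{n-1}), \tag{3.6} \]
\[ s(\tau) - s(\tau - \delta) \ge (\pi - C\epsilon)\, \#\{\lambda_j : \tau - \tfrac{\delta}{C_1} \le \operatorname{Re}\lambda_j \le \tau, \ \operatorname{Im}\lambda_j \le \epsilon\delta\} + O(\delta\tau^{n-1}), \tag{3.7} \]
where $O(\delta\tau^{n-1})$ is independent of $\epsilon$.

Proof. We shall only treat the inequality (3.6) since the proof of (3.7) is similar. As we already remarked, the integration of the derivative of $s_{\mathrm{reg}}(\lambda)$ over $(\tau, \tau + \delta)$ yields a term $O(\delta\tau^{n-1})$, so it is only necessary to examine the integral
\[ \int_\tau^{\tau+\delta} \frac{d}{d\lambda} s_{\mathrm{norm}}(\lambda)\, d\lambda \ge 2\pi \sum_{\substack{|\operatorname{Re}\lambda_j - \tau| \le \gamma \\ \operatorname{Im}\lambda_j \le \epsilon\delta}} \int_\tau^{\tau+\delta} P(\lambda_j, \lambda)\, d\lambda + 2\pi \sum_{\substack{|\operatorname{Re}\lambda_j - \tau| \le \gamma, \ \operatorname{Im}\lambda_j \le \epsilon\delta \\ 7\tau/6 \le |\lambda_j| \le 4(\tau+\delta)/3 \\ 2\tau/3 \le |\lambda_j| \le 5(\tau+\delta)/6}} \int_\tau^{\tau+\delta} \left(\chi\left(\frac{|\lambda_j|}{\lambda}\right) - 1\right) P(\lambda_j, \lambda)\, d\lambda, \]
where we take $\gamma < \delta$. In the second sum we have $|\lambda_j - \lambda| \ge \lambda/6 \ge \tau/6$ and
\[ \frac{2 \operatorname{Im}\lambda_j}{|\lambda_j - \lambda|^2} \le 72\,\delta\tau^{-2}. \]
Combining this with the upper bound of the number of poles in the disk $|z| \le A\tau$, $O(\tau^n)$, we obtain for this term the bound $O(\delta^2\tau^{n-2})$. To deal with the first sum, we write
\[ 2\pi \int_\tau^{\tau+\delta} P(z, \lambda)\, d\lambda = 2 \int_{(\tau - \operatorname{Re} z)/\operatorname{Im} z}^{(\tau - \operatorname{Re} z + \delta)/\operatorname{Im} z} \frac{1}{1 + t^2}\, dt. \]
We then set
\[ 0 < \frac{1}{y} = \frac{\operatorname{Im} z}{\delta} \le \epsilon, \quad x = \frac{\tau - \operatorname{Re} z}{\operatorname{Im} z} \le 0 \]
and assume that $|x| \le \frac{1}{C_1} y$, which corresponds to assuming that $|\tau - \operatorname{Re} z| \le \delta/C_1$. We obtain
\[ 2\int_x^{x+y} \frac{dt}{1 + t^2} \ge 2\int_0^{y(1 + \frac{x}{y})} \frac{dt}{1 + t^2} \ge 2\int_0^{y(1 - \frac{1}{C_1})} \frac{dt}{1 + t^2} = \pi - 2\int_{y(1 - \frac{1}{C_1})}^{\infty} \frac{dt}{1 + t^2} \ge \pi - \frac{C}{y} \ge \pi - C\epsilon, \quad \frac{2}{C} = 1 - \frac{1}{C_1} > 0. \]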
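The chain of inequalities above can be tested numerically. The sketch below is an illustration under an explicit assumption: the definition (1.3) is not reproduced in this section, so the code takes $P(z, \lambda) = \operatorname{Im} z / (\pi |z - \lambda|^2)$, the Poisson kernel of the upper half-plane, which is consistent with the term $2\operatorname{Im}\lambda_j/|\lambda - \lambda_j|^2$ in (3.5). The function name `poisson_mass` and the sample parameters are hypothetical.

```python
import math

def poisson_mass(tau, delta, re_z, im_z):
    """2*pi * integral of P(z, lam) over [tau, tau + delta], in closed form.

    Assumes P(z, lam) = Im z / (pi * |z - lam|**2), so the integral equals
    2 * (atan(b) - atan(a)) with the limits a, b used in the text.
    """
    a = (tau - re_z) / im_z
    b = (tau - re_z + delta) / im_z
    return 2.0 * (math.atan(b) - math.atan(a))

C1 = 2.0
C = 2.0 / (1.0 - 1.0 / C1)          # C = 2 (1 - 1/C1)^(-1)
tau, delta = 10.0, 0.5              # arbitrary sample values
worst_gap = math.inf
for eps in (1.0, 0.3, 0.05):
    for i in range(11):
        re_z = tau + (delta / C1) * i / 10      # tau <= Re z <= tau + delta/C1
        for frac in (1.0, 0.5, 0.01):
            im_z = frac * eps * delta           # 0 < Im z <= eps * delta
            gap = poisson_mass(tau, delta, re_z, im_z) - (math.pi - C * eps)
            worst_gap = min(worst_gap, gap)
print(worst_gap)  # non-negative if every sampled pole contributes at least pi - C*eps
```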
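The estimate (3.6) now follows by choosing $\gamma = \delta/C_1$ and summing over all poles, $\lambda_j$, which satisfy
\[ \tau \le \operatorname{Re}\lambda_j \le \tau + \frac{\delta}{C_1}, \quad \operatorname{Im}\lambda_j \le \epsilon\delta. \]
□

Remark. The constant $\pi - C\epsilon$ on the right-hand side of (3.6) corresponds to the constant $\pi$ in the semi-classical result (1.16). In fact, assume the existence of scattering poles $\lambda_m$ such that
\[ \operatorname{Im}\lambda_m \le C(\operatorname{Re}\lambda_m)^{-n}, \quad 0 < \operatorname{Re}\lambda_m \to +\infty, \ m \to \infty, \ C > 1. \]
Choose $0 < \nu < 1$, $\delta_m = C(\operatorname{Re}\lambda_m)^{-n+\nu}$, $\epsilon_m = (\operatorname{Re}\lambda_m)^{-\nu}$, $\tau_m = \operatorname{Re}\lambda_m$. Then we deduce from Proposition 1 that
\[ \lim_{m \to \infty} s(\tau_m + \delta_m) - s(\tau_m) \ge \pi. \tag{3.8} \]
We may choose $\delta_m$, $\epsilon_m$ suitably in order that $\lambda_m$ is the unique pole in the “box”
\[ \{z \in \mathbb{C} : |\operatorname{Re} z - \tau_m| \le \frac{\delta_m}{C}, \ \operatorname{Im} z \le \delta_m\epsilon_m\}. \]
We will see in Sect. 5 that if this pole is simple we can get equality in (3.8).

We shall now obtain a bound for the number of poles $\#\{\lambda_j : \tau \le \operatorname{Re}\lambda_j \le \tau + 1, \ \operatorname{Im}\lambda_j \le 2\epsilon\}$. We take $C_1 = 2$, $C = 4$, $\delta = 2$ and $0 < \epsilon \le (2\pi - 1)/8$ in Proposition 1 and we conclude that
\[ \#\{\lambda_j : \tau \le \operatorname{Re}\lambda_j \le \tau + 1, \ \operatorname{Im}\lambda_j \le 2\epsilon\} \le 2\left[ s(\tau + 1) - s(\tau) \right] + A_1\tau^{n-1}. \]
The difference $s(\tau + 1) - s(\tau)$ can be estimated by (1.12). Thus we get
\[ s(\tau + 1) - s(\tau) \le c_0((\tau + 1)^n - \tau^n) + 2C_0\tau^{n-1} \le A_2\tau^{n-1}, \quad \tau \ge 1, \]
where $c_0 > 0$, $C_0 > 0$ and $A_2 > 0$ are independent of $\tau$. Finally, with $A = 2A_2 + A_1$ we obtain the following

Proposition 2. For operators satisfying (1.7)–(1.9) and for $0 < \epsilon \le (2\pi - 1)/8$, there exists a constant $A > 0$ independent of $\tau$ such that
\[ \#\{\lambda_j : \tau \le \operatorname{Re}\lambda_j \le \tau + 1, \ \operatorname{Im}\lambda_j \le 2\epsilon\} \le A\tau^{n-1}, \quad \tau \ge 1. \tag{3.9} \]
In particular we have (1.6).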
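The arithmetic behind the choice of constants leading to Proposition 2 is short enough to verify directly. This is only a bookkeeping sketch; the function name `counting_prefactor` is introduced here for illustration. With $C_1 = 2$ one gets $C = 4$, and the largest admissible $\epsilon = (2\pi - 1)/8$ gives $\pi - C\epsilon = 1/2$, which is the source of the factor 2 in front of $s(\tau + 1) - s(\tau)$.

```python
import math

def counting_prefactor(c1, eps):
    """Return (C, 1 / (pi - C * eps)) with C = 2 * (1 - 1/c1)**(-1), as in Proposition 1."""
    c = 2.0 / (1.0 - 1.0 / c1)
    return c, 1.0 / (math.pi - c * eps)

eps_max = (2 * math.pi - 1) / 8
C, prefactor = counting_prefactor(2.0, eps_max)
print(C, prefactor)  # C is 4.0; the prefactor is 2 up to rounding
```

Smaller values of $\epsilon$ only decrease the prefactor, so the bound with the factor 2 holds on the whole range $0 < \epsilon \le (2\pi - 1)/8$.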
4. Proof of the Main Theorem

To prove the theorem stated in Sect. 1 we will use the representation of the scattering phase given in Sect. 3 and the local bound given in Proposition 2. Thus all we need to prove is
\[ \sum_{\epsilon < |\lambda_j - \tau| < \frac{1}{2}\tau} P(\lambda_j, \lambda) = O_\epsilon(\tau^{n-1}), \quad |\lambda - \tau| \ll 1, \tag{4.1} \]
as then integration over $[\tau - \delta, \tau + \delta]$ will furnish the needed $\delta$ factor. The convenient constant, $\frac{1}{2}$, came from the choice of the cut-off function $\chi$ in the definition of $s_{\mathrm{norm}}$ combined with the choice of a small $\rho$. The sums of the Poisson kernels, $P(\lambda_j, \lambda)$, appear naturally in the Carleman estimate, see [24], 3.7 and 3.71: let $h(z)$ be holomorphic in $\operatorname{Im} z \ge 0$ with zeros $\lambda_j$, none of which lie on the semi-circles $|z - \lambda| = \rho$ and $|z - \lambda| = 2r$. Then for $\lambda \in \mathbb{R}$,
\[ \sum_{\rho < |\lambda - \lambda_j| < r} \frac{\operatorname{Im}\lambda_j}{|\lambda - \lambda_j|^2} \le \frac{4}{3} \left( \frac{1}{2\pi r} \int_0^\pi \log|h(\lambda + 2re^{i\theta})| \sin\theta\, d\theta - \frac{1}{\pi\rho} \int_0^\pi \log|h(\lambda + \rho e^{i\theta})| \sin\theta\, d\theta + \frac{1}{2\pi} \int_\rho^{2r} \left( \frac{1}{y^2} - \frac{1}{4r^2} \right) \log|h(\lambda + y) h(\lambda - y)|\, dy \right). \tag{4.2} \]
The natural function for our problem comes from the determinant of the scattering matrix:
\[ h(\lambda) \overset{\mathrm{def}}{=} \prod_{\substack{\lambda_j^2 \in \sigma_{pp}(L) \\ \operatorname{Im}\lambda_j < 0}} (\lambda + \lambda_j)\, \det S(-\lambda). \tag{4.3} \]
We recall that for $\operatorname{Im}\lambda \le 0$, $\lambda \neq 0$, $S(\lambda)$ is meromorphic with finitely many poles at $\lambda_j \neq 0$ satisfying $\lambda_j^2 \in \sigma_{pp}(L)$ and zeros at $-\lambda_j$, where $\lambda_j$ are resonances. The multiplicity of the resonance $\lambda_j$ is given by the order of the zero of $\det S(\lambda)$ at $-\lambda_j$ (except possibly at the finitely many eigenvalues) – see [31]. We also recall that due to unitarity of $S(\lambda)$ for $\lambda$ real we have $|\det S(\lambda)| = 1$ for $\lambda \in \mathbb{R}$. To apply (4.2) we need some estimates on $\det S(-\lambda)$. The first one was already given in Lemma 1. The difficulty in the application of the Carleman estimate (4.2) lies in estimating the integral over the $\rho$-semicircle, uniformly in $\lambda \sim \tau$. We need to estimate $\log|h(\lambda + \rho e^{i\theta})|$ from below. That is done in

Lemma 4. For any fixed $0 < \rho < 1/2$ and for $\tau \gg 1$ there exists $\rho/2 < \rho(\tau) < \rho$ such that
\[ \log|\det S(-\tau - \rho(\tau)e^{i\theta})| \ge -A_\rho\tau^{n-1}, \quad 0 \le \theta \le \pi, \]
where $A_\rho > 0$ is independent of $\tau$.
Proof. Let $\lambda_j$, $1 \le j \le J(\tau) = O(\tau^{n-1})$, be the resonances in $|\lambda + \tau| < 1$, with the bound on $J(\tau)$ given by Proposition 2. Then we can write
\[ H(\lambda) \overset{\mathrm{def}}{=} \det S(-\lambda) = e^{g_\tau(\lambda)} \frac{\overline{P_\tau(\bar\lambda)}}{P_\tau(\lambda)}, \quad |\lambda - \tau| < 1, \qquad P_\tau(\lambda) \overset{\mathrm{def}}{=} \prod_{j=1}^{J(\tau)} (\lambda + \lambda_j), \]
where $g_\tau$ is holomorphic in $|\lambda - \tau| \le 3/4$ and satisfies $\overline{g_\tau(\bar\lambda)} = -g_\tau(\lambda)$ so that in particular $\operatorname{Re} g_\tau(\lambda) = 0$ for $\lambda$ real. For $|\lambda - \tau| \le 3/4$ we clearly have
\[ |P_\tau(\lambda)| \le C \exp(C|\tau|^{n-1}). \]
We now proceed as in the proof of Lemma 2: using Cartan’s Lemma, we find $\rho/2 < \rho(\tau) < \rho$ and $0 < \epsilon \ll \rho$ independent of $\tau$ so that for $|\tau - \lambda| = \rho(\tau)$,
\[ |P_\tau(\lambda)| > \left(\frac{\epsilon}{2e}\right)^{J(\tau)} > \exp\left(-C\tau^{n-1}\right). \tag{4.4} \]
Combining this with the estimate provided by Lemma 1 and the maximum principle we obtain
\[ \operatorname{Re} g_\tau(\lambda) \le C_1\tau^{n-1}, \quad \operatorname{Im}\lambda \ge 0, \ |\lambda - \tau| \le 3/4. \]
Applying Harnack’s inequality to the positive harmonic function $G(\lambda) = 2C_1\tau^{n-1} - \operatorname{Re} g_\tau(\lambda)$ in discs $D(\tilde\lambda, r) \subset D(\hat\lambda, 3r)$, $\operatorname{Im}\hat\lambda > 3r$, and comparing the maximum of $G(\lambda)$ on $D(\tilde\lambda, r)$ with the minimum of $G(\lambda)$ on $D(\tilde\lambda, r)$ gives
\[ -3C_1\tau^{n-1} \le \operatorname{Re} g_\tau(\lambda) \le C_1\tau^{n-1}, \quad \operatorname{Im}\lambda \ge 0, \ |\lambda - \tau| \le 3/4. \]
Recalling (4.4) and with the choice of $\rho(\tau)$ which followed it, we now see that
\[ \log|H(\lambda)| \ge -C_2\tau^{n-1}, \quad |\lambda - \tau| = \rho(\tau), \ \operatorname{Im}\lambda \ge 0, \]
which is the statement of the lemma. □

The proof of (4.1) based on (4.2) is now immediate as the estimates of Lemmas 1 and 4 hold also for $h(\lambda)$ given by (4.3). The choice of $\chi$ (the constant $\frac{1}{2}$ in (4.1)) guarantees that we stay in the first quadrant outside a neighbourhood of 0 in applying Carleman’s inequality. From this the main theorem follows by an integration over $[\tau - \delta, \tau + \delta]$.

We remark that the use of the Carleman estimate (4.2) in the study of resonances has a long tradition. Selberg used it to obtain a Blaschke factorization of the scattering matrix for finite volume non-compact hyperbolic surfaces (see the references in [4]). An observation of Titchmarsh concerning zeros of Fourier transforms was recalled in [29] to show that potentials in dimension one have $o(\theta)r$ poles in $\theta$-conic neighbourhoods of $\mathbb{R}$. In higher dimensions and for more general scattering problems, Vodev [26] and Petkov–Vodev [12] used (4.2) to show that
\[ \sum_{0 < |\lambda_j| \le r} \frac{\operatorname{Im}\lambda_j}{|\lambda_j|^2} \le Cr^{n-1}. \]
Lemma 1 provides a somewhat more conceptual proof of this result since we can use the scattering determinant rather than determinants of some specially constructed operators. As an immediate consequence of the main theorem we have the following observation which seems new even in the case of a resonance free strip and obstacle scattering.

Proposition 3. Suppose there exists a non-decreasing continuous function $F(\lambda) > 0$ satisfying $F(\lambda + \delta) < CF(\lambda)$ for $0 < \delta \le 1$ and such that there are no resonances in the region
\[ \{\lambda : \operatorname{Re}\lambda > \tau_0 > 0, \ \operatorname{Im}\lambda < C_0(F(\operatorname{Re}\lambda))^{-1}\}. \]
Then
\[ \frac{ds}{d\lambda}(\tau) \le C_1 F(|\tau|)|\tau|^{n-1}, \quad |\tau| \ge 1. \]
Proof. Assuming $\tau_0 > 1$, we have proved that
\[ s'(\lambda) = 2\pi \sum_{|\lambda - \lambda_j| < 1} P(\lambda_j, \lambda) + O(\lambda^{n-1}), \quad \lambda > \tau_0. \]
Since $P(\lambda_j, \lambda) \le \pi^{-1}(\operatorname{Im}\lambda_j)^{-1}$, the assumption of the proposition and Proposition 2 immediately give the claimed estimate. □

We note that using the recent results of Burq [1] this gives the exponential bound on $|s'(\lambda)|$ whenever $L$ is a metric Laplacian on the exterior of an obstacle. Moreover, for every fixed $A > 0$ we can apply the argument of Proposition 3 to the resonances in the region
\[ \{\lambda : \operatorname{Re}\lambda > A, \ \operatorname{Im}\lambda < C_0(F(\operatorname{Re}\lambda))^{-1}\} \]
and this implies the following

Corollary 1. Let $F(\lambda)$ be the function in Proposition 3. Assume there exists a sequence $\tau_j \nearrow \infty$ such that
\[ \limsup_{j \to \infty} \frac{|s'(\tau_j)|}{F(\tau_j)\tau_j^{n-1}} = +\infty. \tag{4.5} \]
Then for every fixed $C_0 > 0$ we have
\[ \#\{\lambda_j : \operatorname{Re}\lambda_j > 0, \ \operatorname{Im}\lambda_j < C_0(F(\operatorname{Re}\lambda_j))^{-1}\} = \infty. \]
This completes the results in [3] and [11] where the existence of resonances $z_j$ close to the real axis was related to the blow-up of $|s'(\tau_j)|$ for $\tau_j = \operatorname{Re} z_j$.
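5. Upper Bounds for the Variation of the Scattering Phase

We have already given some lower bounds on the variation of the scattering phase in Proposition 1. To give upper bounds in terms of the number of resonances we need to understand the sum of harmonic measures
\[ \sum_{|\lambda_j - \tau| < 1} \omega_{\mathbb{C}_+}(\lambda_j, [\tau - \delta, \tau + \delta]), \]
and that is done in

Proposition 4. For any $\epsilon > 0$, $\gamma_1 > \gamma_2 > 0$, $0 < \gamma_2 \le 1$ and $0 < \delta \le \delta(\gamma_1)$, $\tau \ge \tau_0$ we have
\[ |s(\tau + \delta) - s(\tau - \delta)| \le 2\pi\, \#\{\lambda_j : |\operatorname{Re}\lambda_j - \tau| \le \min\{1, \delta^{1-\gamma_1}\}, \ \operatorname{Im}\lambda_j < \delta^{1-\gamma_2}\} + C_\epsilon \delta^{\min(\gamma_1 - \gamma_2, \gamma_2)} \tau^{n-1}, \]
with a constant $C_\epsilon > 0$ independent of $\gamma_1$, $\gamma_2$.

Proof. We only have to show that
\[ \sum_{\substack{\operatorname{Im}\lambda_j > \delta^{1-\gamma_2} \\ |\operatorname{Re}\lambda_j - \tau| < 1}} \omega_{\mathbb{C}_+}(\lambda_j, [\tau - \delta, \tau + \delta]) = O\left(\delta^{\gamma_2}\tau^{n-1}\right), \tag{5.1} \]
\[ \sum_{\substack{\operatorname{Im}\lambda_j \le \delta^{1-\gamma_2} \\ 1 > |\operatorname{Re}\lambda_j - \tau| > \delta^{1-\gamma_1}}} \omega_{\mathbb{C}_+}(\lambda_j, [\tau - \delta, \tau + \delta]) = O\left(\delta^{\gamma_1 - \gamma_2}\tau^{n-1}\right). \tag{5.2} \]
To see (5.1) we put $x_j = \tau - \operatorname{Re}\lambda_j$, $y_j = \operatorname{Im}\lambda_j$ and estimate the sum of $O(\tau^{n-1})$ terms of the form
\[ \int_{(x_j - \delta)/y_j}^{(x_j + \delta)/y_j} \frac{1}{1 + t^2}\, dt \le \frac{2\delta}{y_j} \le 2\delta^{\gamma_2}. \]
For (5.2) we use the same notation and observe that now either $x_j - \delta, x_j + \delta < -\delta^{1-\gamma_1}/2$ or $x_j - \delta, x_j + \delta > \delta^{1-\gamma_1}/2$ (provided $\delta$ is small enough depending on $\gamma_1$). Hence
\[ \int_{(x_j - \delta)/y_j}^{(x_j + \delta)/y_j} \frac{1}{1 + t^2}\, dt \le \int_{\delta^{1-\gamma_1}/(2\delta^{1-\gamma_2})}^{\infty} \frac{1}{1 + t^2}\, dt \le 2\delta^{\gamma_1 - \gamma_2}, \]
completing the proof of the proposition. □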
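Both elementary integral bounds in this proof can be spot-checked numerically. The sketch below is illustrative only; the names `arctan_integral` and `check_bounds`, the grid of sample points, and the sample values of $\delta$ are choices made here, with $\gamma_1 = 2/3$, $\gamma_2 = 1/3$ borrowed from the application in Sect. 6.

```python
import math

def arctan_integral(x, y, delta):
    """Integral of 1/(1+t^2) over [(x - delta)/y, (x + delta)/y]."""
    return math.atan((x + delta) / y) - math.atan((x - delta) / y)

def check_bounds(delta, gamma1, gamma2, steps=200):
    """Check the bounds used for (5.1) and (5.2) on a grid of (x, y) values."""
    ok = True
    y_min = delta ** (1 - gamma2)
    # Case (5.1): y > delta**(1 - gamma2), |x| < 1; bound 2 * delta**gamma2.
    for i in range(steps):
        x = -1 + 2 * (i + 0.5) / steps
        for y in (1.0001 * y_min, 2 * y_min, 1.0, 10.0):
            ok = ok and arctan_integral(x, y, delta) <= 2 * delta ** gamma2
    # Case (5.2): y <= delta**(1 - gamma2), delta**(1 - gamma1) < |x| < 1;
    # bound 2 * delta**(gamma1 - gamma2).
    x_min = delta ** (1 - gamma1)
    for i in range(steps):
        x = x_min + (1 - x_min) * (i + 0.5) / steps
        for y in (y_min, y_min / 2, y_min / 100):
            for sign in (1, -1):
                value = arctan_integral(sign * x, y, delta)
                ok = ok and value <= 2 * delta ** (gamma1 - gamma2)
    return ok

print(check_bounds(0.01, 2 / 3, 1 / 3), check_bounds(0.001, 2 / 3, 1 / 3))
```

Dividing such an integral by $\pi$ gives the harmonic measure of $[\tau - \delta, \tau + \delta]$ seen from a point of the upper half-plane, which is how the sums (5.1) and (5.2) reduce to these one-dimensional estimates.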
6. Distribution of Scattering Poles Close to the Real Axis

In this section we shall apply our results to the case of metric perturbations, that is, to the case where $L$ has the form
\[ L = -\sum_{i,j=1}^{n} \partial_{x_j} a_{i,j}(x) \partial_{x_i} \]
given in Sect. 1. Let $l(x, \xi)$ be the principal symbol of $L$ and let
\[ \Sigma = \{(x, \xi) \in T^*(\mathbb{R}^n) : l(x, \xi) = 1\}. \]
Set $r(x, \xi) = \sqrt{l(x, \xi)}$ and consider the flow $\Phi_t = \exp(tH_r)$ on $\Sigma$ related to the Hamiltonian field $H_r$ of $r$. A point $\nu \in \Sigma$ is called periodic with period $T > 0$ if
\[ \exp(TH_r)\nu = \nu. \tag{6.1} \]
By $T(\nu) > 0$ we denote the minimal $T > 0$ for which (6.1) holds. The set of periodic points in $\Sigma$ will be denoted by $\Pi$. Clearly, $\Pi \subset \{(x, \xi) \in T^*(\mathbb{R}^n) : |x| \le R_0\}$. For fixed $\nu \in \Pi$ let
\[ \gamma(\nu) = \{(x(\sigma), \xi(\sigma)) \in T^*(\mathbb{R}^n) : 0 \le \sigma \le T(\nu)\} \]
be the periodic trajectory passing through $\nu$ for $\sigma = 0$. Denote by $m(\nu) \in \mathbb{Z}_4$ the Maslov index (see Sect. 21.6 of [7]) related to $\gamma(\nu)$ and the Lagrangian manifold
\[ C = \{(t, x, y, \tau, \xi, \eta) \in T^*(\mathbb{R} \times \mathbb{R}^{2n}) : \tau + r(x, \xi) = 0, \ (x, \xi) = \exp(tH_r)(y, \eta)\} \]
and set $q(\nu) = \frac{\pi}{2} m(\nu)$. A point $\nu \in \Pi$ is called absolutely periodic with period $T(\nu) > 0$ if in any local coordinates $z$ in a neighbourhood of $\nu$ with $z(\nu) = 0$ we have
\[ \partial_z^\alpha \left(\Phi_T(z) - z\right)\big|_{z=0} = 0, \quad \forall\alpha. \]
The set of absolutely periodic points $\nu \in \Pi_a$ with period $T$ will be denoted by $\Pi_a(T)$ and $\Pi_a = \cup_{t>0} \Pi_a(t)$ is the set of absolutely periodic points. Notice that $\mu(\Pi) = \mu(\Pi_a)$ (see [17]), where here and in the following we use the Liouville measure $\mu$ on $\Sigma$. Next consider the function
\[ Q(\lambda) = (2\pi)^{-n} \int_{\Pi_a} [\pi - \lambda T(\nu) - q(\nu)]_{2\pi}\, T^{-1}(\nu)\, d\nu \]
introduced in [5,16]. Here $-\pi < [z]_{2\pi} \le \pi$ is such that $z = [z]_{2\pi} + 2k\pi$, $k \in \mathbb{Z}$. Notice that $Q(\lambda)$ is right continuous and
\[ Q(\lambda + 0) - Q(\lambda - 0) = (2\pi)^{-n+1} \int T^{-1}(\nu)\, d\nu, \]
where the integral is taken over the set $\{\nu \in \Pi_a : \lambda T(\nu) + q(\nu) \equiv 0 \bmod (2\pi)\}$.
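The jump formula for $Q$ comes from the behaviour of the reduced phase $[\pi - \lambda T(\nu) - q(\nu)]_{2\pi}$ for a single orbit: it is right continuous in $\lambda$ and jumps by $2\pi$ exactly when $\lambda T(\nu) + q(\nu) \equiv 0 \bmod 2\pi$. The following minimal sketch illustrates this for one orbit; the names `bracket_2pi` and `integrand`, and the sample values $T = 1$, $q = 0$, are assumptions made for the example.

```python
import math

def bracket_2pi(z):
    """Return [z]_{2pi}: the representative of z modulo 2*pi lying in (-pi, pi]."""
    return z - 2.0 * math.pi * math.ceil((z - math.pi) / (2.0 * math.pi))

def integrand(lam, period, q):
    """The reduced phase [pi - lam*T - q]_{2pi} for one orbit with period T and phase q."""
    return bracket_2pi(math.pi - lam * period - q)

T0, q0, h = 1.0, 0.0, 1e-9          # sample orbit data and a small step
at_point = integrand(0.0, T0, q0)   # lam*T + q = 0, a jump point
just_after = integrand(h, T0, q0)
just_before = integrand(-h, T0, q0)
print(at_point, just_after - just_before)
```

Multiplying the jump $2\pi$ by $(2\pi)^{-n} T^{-1}(\nu)$ and integrating over the orbits that satisfy the congruence gives the factor $(2\pi)^{-n+1}$ in the formula above.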
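In [11] the first author established, for $n$ odd, the following estimate for the scattering phase:
\[ \left( Q(\tau + \tfrac{\delta}{2}) - Q(\tau - \tfrac{\delta}{2}) \right)\tau^{n-1} - C_0\delta\tau^{n-1} - o_\delta(\tau^{n-1}) \le s(\tau + \delta) - s(\tau - \delta) \le \left( Q(\tau + \tfrac{3\delta}{2}) - Q(\tau - \tfrac{3\delta}{2}) \right)\tau^{n-1} + C_0\delta\tau^{n-1} + o_\delta(\tau^{n-1}) \tag{6.2} \]
with $C_0 > 0$ independent of $\tau$ and $\delta$, while $o_\delta(\tau^{n-1})$ means that for every fixed $\delta > 0$ we have
\[ \frac{o_\delta(\tau^{n-1})}{\tau^{n-1}} \longrightarrow 0 \quad \text{as } \tau \to \infty. \]
It is easy to generalize the argument of [11] to the even dimensional case. In fact for the analysis in Sect. 4 of [11] we may write $s(\lambda) = s_{\mathrm{norm}}(\lambda) + s_{\mathrm{reg}}(\lambda)$, where $s_{\mathrm{norm}}$ and $s_{\mathrm{reg}}$ are defined in Sect. 3. Therefore the function $s_1(\lambda) = s_{\mathrm{norm}}(\lambda)$ is increasing and a Tauberian type argument can be applied for $s_1$ since we have the following estimate:
\[ \left| \left( \frac{ds_1}{d\lambda} * \varphi \right)(\lambda) \right| \le \left| \left( \frac{ds}{d\lambda} * \varphi \right)(\lambda) \right| + \left| \left( \frac{ds_{\mathrm{reg}}}{d\lambda} * \varphi \right)(\lambda) \right|. \]
Here $\varphi(\lambda) \in \mathcal{S}(\mathbb{R})$ is a function with Fourier transform $\hat\varphi(t)$ supported in a small neighbourhood of 0.

Now suppose there exists $T_0 > 0$ such that $\mu(\Pi_a(T_0)) > 0$. Since $m(\nu)$ takes values in $\mathbb{Z}_4$, without loss of generality we may suppose that for some set $\Pi_0 \subset \Pi_a(T_0)$ with $\mu(\Pi_0) \ge \frac{1}{4}\mu(\Pi_a(T_0))$ we have
\[ T(\nu) = T_0, \quad q(\nu) = q_0, \quad \forall\nu \in \Pi_0. \]
Introduce the numbers
\[ \tau_m = \frac{2m\pi}{T_0} - \frac{q_0}{T_0}, \quad m \in \mathbb{N}, \]
and observe that $\Pi_0$ is contained in the set $\{\nu \in \Pi_a : \tau_m T(\nu) + q(\nu) \equiv 0 \bmod (2\pi)\}$, which therefore has measure at least $\mu(\Pi_0) > 0$. This implies
\[ Q(\tau_m + 0) - Q(\tau_m - 0) \ge (2\pi)^{-n+1} T_0^{-1} \mu(\Pi_0) = \eta_0 > 0. \]
We claim that there exists $\epsilon_0 > 0$ such that for all $0 < \epsilon \le \epsilon_0$ we have
\[ Q(\tau_m + \epsilon) - Q(\tau_m - \epsilon) \ge \eta_0 - 2\epsilon(2\pi)^{-n}\mu(\Pi_a), \quad \forall m \in \mathbb{N}. \tag{6.3} \]
To prove this we need the following representation given by Safarov ([16], Proposition 1)
\[ Q(\lambda + \epsilon) - Q(\lambda - \epsilon) = -2\epsilon(2\pi)^{-n}\mu(\Pi_a) + (2\pi)^{-n+1} \int_{\Pi_a} T^{-1}(\nu) \sum_{k \in \mathbb{Z}} \chi^\epsilon_{\lambda,k}(\nu)\, d\nu, \tag{6.4} \]
where $\chi^\epsilon_{\lambda,k}$ is the characteristic function of the set
\[ \{\nu \in \Pi_a : -\epsilon T(\nu) \le \lambda T(\nu) + q(\nu) - 2\pi k < \epsilon T(\nu)\}. \]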
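Given $\lambda = \tau_m$, we obtain
\[ (2\pi)^{-n+1} \int_{\Pi_a} T^{-1}(\nu) \sum_{k \in \mathbb{Z}} \chi^\epsilon_{\tau_m,k}(\nu)\, d\nu \ge (2\pi)^{-n+1} T_0^{-1} \int_{\Pi_0} \sum_{k \in \mathbb{Z}} \chi^\epsilon_{\tau_m,k}(\nu)\, d\nu \ge \eta_0, \]
since for each $\nu \in \Pi_0$ we have $\chi^\epsilon_{\tau_m,m}(\nu) = 1$, and the claim is proved.

Now we shall apply Proposition 4 with $\gamma_1 = 2/3$ and $\gamma_2 = 1/3$ to prove the existence of scattering poles with arbitrarily small imaginary parts. Fix $\epsilon > 0$ and assume that the error term in Proposition 4 is of modulus less than $C_\epsilon\delta^{1/3}\tau^{n-1}$. Choose $0 < \delta \le \epsilon^3$ small enough to arrange
\[ (2\pi)^{-n}\mu(\Pi_a)\delta + C_0\delta + C_\epsilon\delta^{1/3} \le \frac{\eta_0}{2\pi} \cdot \frac{1}{4}. \]
Next fix $\delta > 0$ and choose $\tau_\delta$ so that
\[ |o_\delta(\tau^{n-1})| \le \frac{\eta_0}{2\pi} \cdot \frac{1}{4}\, \tau^{n-1} \quad \text{for } \tau \ge \tau_\delta. \]
Therefore Proposition 4 combined with (6.3) and the left hand estimate in (6.2) gives the following

Proposition 5. Let $\Pi_0 \subset \Pi_a$ be a subset with $\mu(\Pi_0) > 0$ and let $T(\nu) = T_0$, $q(\nu) = q_0$, $\forall\nu \in \Pi_0$. Then for every fixed $\epsilon > 0$ there exists $\tau_\epsilon > 0$ such that for the sequence $\tau_m = (2\pi m - q_0)T_0^{-1}$, $m \in \mathbb{N}$, we have
\[ \#\{\lambda_j : |\operatorname{Re}\lambda_j - \tau_m| \le \epsilon, \ \operatorname{Im}\lambda_j \le \epsilon^2\} \ge \frac{(2\pi)^{-n}}{2T_0} \mu(\Pi_0)\tau_m^{n-1}, \quad \tau_m \ge \tau_\epsilon. \tag{6.5} \]
This result is similar to clustering properties of the eigenvalues established in [5,16,18] since the constant $(2\pi)^{-n}(2T_0)^{-1}\mu(\Pi_0) > 0$ is independent of $\epsilon > 0$. Proposition 5 is sharper than the results in [21,12] and [13]. In particular, in [12] the existence of scattering poles $\lambda_j$ with $\operatorname{Im}\lambda_j \to 0$ was established under some conditions. Popov [13] used the assumption $\mu(\Pi_a(T_0)) > 0$ to prove that for every $\rho > 0$ we have
\[ \#\{\lambda_j : \operatorname{Im}\lambda_j \le \rho\log|\lambda_j|, \ |\operatorname{Re}\lambda_j| \le r\} \ge \frac{(2\pi)^{-n}}{n} \mu(\Pi_a(T_0))\, r^n (1 - o_\rho(1)), \quad r \to +\infty. \]
A direct application of Proposition 5 implies a lower bound for the counting function
\[ N_\epsilon(r) = \#\{\lambda_j : |\operatorname{Re}\lambda_j| \le r, \ \operatorname{Im}\lambda_j \le \epsilon^2\} \]
and we have the following

Corollary 2. Suppose the assumptions of Proposition 5 are fulfilled. Then for every $\epsilon > 0$ we have
\[ N_\epsilon(r) \ge \frac{(2\pi)^{-n-1}}{n} \mu(\Pi_0)\, r^n \left(1 - O_\epsilon\left(\frac{1}{r}\right)\right), \quad r \to +\infty. \tag{6.6} \]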
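Proof. Fix $0 < \epsilon < \pi/T_0$ and let $\tau_m \ge \tau_\epsilon$ for $m \ge p + 1$. Therefore the small “boxes”
\[ \{\lambda_j : |\operatorname{Re}\lambda_j - \tau_m| \le \epsilon, \ \operatorname{Im}\lambda_j \le \epsilon^2\} \]
are disjoint and we may sum the lower bounds for all “boxes” with $p + 1 \le m \le M$. Thus we get
\[ \sum_{m=p+1}^{M} \tau_m^{n-1} \ge \left(\frac{2\pi}{T_0}\right)^{n-1} \sum_{m=p+1}^{M} m^{n-1} - C_0 \sum_{m=p+1}^{M} m^{n-2} \ge \frac{1}{n} \left(\frac{2\pi}{T_0}\right)^{n-1} M^n \left(1 - O_p\left(\frac{1}{M}\right)\right) \]
with a constant $C_0 > 0$ independent of $M$ and $p$. Given $r \gg 1$, let $M(r) \in \mathbb{N}$ be the biggest integer so that
\[ \frac{2\pi M(r) - q_0}{T_0} \le r - \epsilon, \]
which yields
\[ M(r) \ge \frac{T_0(r - \epsilon) + q_0}{2\pi} - 1 \ge \frac{T_0}{2\pi}\, r \left(1 - O_\epsilon\left(\frac{1}{r}\right)\right). \]
Finally, using the symmetry of the resonances with respect to the imaginary axis, we obtain
\[ N_\epsilon(r) \ge \frac{(2\pi)^{-n}}{T_0} \mu(\Pi_0) \sum_{m=p+1}^{M(r)} \tau_m^{n-1} \ge \frac{(2\pi)^{-n-1}}{n} \mu(\Pi_0)\, r^n \left(1 - O_\epsilon\left(\frac{1}{r}\right)\right). \]
□

Remark. It is important to note that the constant in (6.6) is independent of $T_0$ and $\epsilon$ and that this constant modulo the factor $(2\pi)^{-1}$ coincides with that in the result of Popov cited above. A similar constant appears also in the lower bounds with factor $r^n$ obtained by Stefanov [23] who assumed existence of a suitable elliptic periodic ray. The assumption $\mu(\Pi_a) > 0$ is necessary for the clustering results. More precisely, we have the following

Proposition 6. Assume there exist constants $\kappa > 0$ and $B > 0$ and sequences $\epsilon_m \searrow 0$, $\tau_m \nearrow +\infty$ such that
\[ \#\{\lambda_j : |\operatorname{Re}\lambda_j - \tau_m| \le B\epsilon_m, \ \operatorname{Im}\lambda_j \le \epsilon_m^2\} \ge \kappa\tau_m^{n-1}, \quad \forall m \in \mathbb{N}. \]
Then
\[ 2(2\pi)^{-n} \int_{\Pi_a} T^{-1}(\nu)\, d\nu \ge \kappa. \tag{6.7} \]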
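Proof. Choose $\delta_m = 2B\epsilon_m$ and consider the term
\[ \left[ Q\left(\tau_m + \frac{3\delta_m}{2}\right) - Q\left(\tau_m - \frac{3\delta_m}{2}\right) \right] \tau_m^{n-1} + o_{\delta_m}(\tau_m^{n-1}) \]
involved in the right-hand side of (6.2). To overcome the difficulties related to $o_{\delta_m}(\tau_m^{n-1})$, we shall use the argument of [5,16] replacing the sequence $\delta_m$ by a slowly decreasing sequence $\delta'_m \ge \delta_m$, $\delta'_m \to 0$. Choose $N_1 < N_2 < \cdots < N_m < \ldots$, $N_k \ge k$, so that
\[ |o_{\delta_m}(\tau_j^{n-1})| \le \frac{1}{m} \tau_j^{n-1}, \quad \forall j \ge N_m. \]
Next define $\delta'_k = \delta_m$ for $N_m \le k < N_{m+1}$, $m \in \mathbb{N}$, and note that
\[ \frac{|o_{\delta'_m}(\tau_m^{n-1})|}{\tau_m^{n-1}} \longrightarrow 0 \quad \text{as } m \to \infty. \]
Now we apply Proposition 1 with $C_1 = 2$, $C = 4$, $\epsilon = \epsilon_m(2B)^{-1}$ and deduce
\[ \lim_{m \to \infty} \left[ s(\tau_m + \delta'_m) - s(\tau_m - \delta'_m) \right] \tau_m^{-n+1} \ge \lim_{m \to \infty} \left[ \tau_m^{-n+1} (\pi - 2\epsilon_m B^{-1})\, \#\{\lambda_j : |\operatorname{Re}\lambda_j - \tau_m| \le \frac{\delta'_m}{2}, \ \operatorname{Im}\lambda_j \le \delta'_m \epsilon_m (2B)^{-1}\} - C_1\delta'_m \right] \ge \pi\kappa \]
with $C_1 > 0$ independent of $\epsilon_m$, $\tau_m$ and $\delta'_m$. Now applying the right-hand estimate in (6.2), we get
\[ \lim_{m \to \infty} \left[ Q\left(\tau_m + \frac{3\delta'_m}{2}\right) - Q\left(\tau_m - \frac{3\delta'_m}{2}\right) + C_0\delta'_m + \frac{o_{\delta'_m}(\tau_m^{n-1})}{\tau_m^{n-1}} \right] \ge \pi\kappa. \]
Setting $\epsilon'_m = \frac{3\delta'_m}{2}$ and taking into account the representation (6.4), we conclude that
\[ (2\pi)^{-n+1} \lim_{\epsilon'_m \to 0} \int_{\Pi_a} T^{-1}(\nu) \sum_{k \in \mathbb{Z}} \chi^{\epsilon'_m}_{\tau_m,k}(\nu)\, d\nu \ge \pi\kappa. \tag{6.8} \]
It is easy to see that
\[ \sum_{k \in \mathbb{Z}} \chi^{\epsilon'_m}_{\tau_m,k}(\nu) \le \frac{\epsilon'_m T(\nu)}{\pi} + 1 \]
and (6.7) follows from (6.8). The proof is complete. □

Remark. The above result is similar to Proposition 5 in [16] where the necessary conditions for the clustering of eigenvalues are treated. The clustering of resonances is closely related to the non-uniform continuity of the function $Q(\lambda)$ on $\mathbb{R}_+$. Thus we can generalize Proposition 5 in the following way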
Proposition 7. Assume there exist a constant $\kappa > 0$ and sequences $\epsilon_m \searrow 0$, $\tau_m \nearrow +\infty$ such that
\[ \lim_{m \to \infty} \left[ Q(\tau_m + \epsilon_m) - Q(\tau_m - \epsilon_m) \right] \ge \kappa. \]
Then there exists a sequence $\eta_m \searrow 0$ such that
\[ \#\{\lambda_j : |\operatorname{Re}\lambda_j - \tau_m| \le \eta_m, \ \operatorname{Im}\lambda_j \le \eta_m^2\} \ge \frac{\kappa}{4\pi} \tau_m^{n-1}, \quad m \ge m_0. \]
Proof. First notice that if $\epsilon'_m \ge \epsilon_m$, $\epsilon'_m \to 0$, by using (6.4), we get
\[ Q(\tau_m + \epsilon'_m) - Q(\tau_m - \epsilon'_m) \ge -2(\epsilon'_m - \epsilon_m)(2\pi)^{-n}\mu(\Pi_a) + Q(\tau_m + \epsilon_m) - Q(\tau_m - \epsilon_m), \]
hence
\[ \lim_{m \to \infty} \left[ Q(\tau_m + \epsilon'_m) - Q(\tau_m - \epsilon'_m) \right] \ge \kappa. \]
Now consider the term
\[ \left[ Q(\tau_m + \epsilon_m) - Q(\tau_m - \epsilon_m) \right] \tau_m^{n-1} - o_{2\epsilon_m}(\tau_m^{n-1}) \]
and, as in the proof of Proposition 6, choose a sequence $\epsilon'_m \ge \epsilon_m$, $\epsilon'_m \to 0$ so that
\[ \frac{o_{2\epsilon'_m}(\tau_m^{n-1})}{\tau_m^{n-1}} \longrightarrow 0. \]
An application of Proposition 4 with $\gamma_1 = 2/3$, $\gamma_2 = 1/3$, $\epsilon = 1$ and $\delta = 2\epsilon'_m$ combined with the left-hand estimate in (6.2) yields for $m$ large
\[ \left[ Q(\tau_m + \epsilon'_m) - Q(\tau_m - \epsilon'_m) \right] \tau_m^{n-1} - o_{2\epsilon'_m}(\tau_m^{n-1}) - 2C_0\epsilon'_m\tau_m^{n-1} - C_1(2\epsilon'_m)^{1/3}\tau_m^{n-1} \le 2\pi\, \#\{\lambda_j : |\operatorname{Re}\lambda_j - \tau_m| \le (2\epsilon'_m)^{1/3}, \ \operatorname{Im}\lambda_j \le (2\epsilon'_m)^{2/3}\}. \]
For $m \ge m_0$ the left-hand side of the above inequality can be estimated from below by $\frac{\kappa}{2}\tau_m^{n-1}$, and setting $\eta_m = (2\epsilon'_m)^{1/3}$ we complete the proof. □

Acknowledgements. Part of this work was conducted during our stay at the Erwin Schrödinger Institute. We would like to thank the organizers of the programme in Spectral Geometry for the invitation and the Institute for its warm hospitality. The work of the second author was supported in part by the National Science and Engineering Research Council of Canada. The discussion of the Breit–Wigner approximation is partly based on some unpublished notes of Shlomo Sternberg. The second author would like to thank Professor Sternberg for making these notes available to him in 1989. Both authors are also grateful to András Vasy and Georgi Vodev for helpful comments, and to the referee for discovering a weak point of an earlier version of this paper.
References

1. Burq, N.: Décroissance de l’énergie locale de l’équation des ondes pour le problème extérieur et absence de résonance au voisinage du réel. Acta Math. 180, 1–29 (1998)
2. Christiansen, T.: Spectral asymptotics for general compactly supported perturbations of the Laplacian on R^n. Comm. P.D.E. 23, 933–947 (1998)
3. Gérard, C., Martinez, A. and Robert, D.: Breit–Wigner formulas for the scattering poles and total scattering cross-section in the semi-classical limit. Commun. Math. Phys. 121, 323–336 (1989)
4. Guillopé, L. and Zworski, M.: Scattering asymptotics for Riemann surfaces. Ann. of Math. 129, 597–660 (1997)
5. Guriev, T.E. and Safarov, Yu.S.: Precise asymptotics of the spectrum for the Laplace operator on manifolds with periodic geodesics. Trudy Matem. Inst. Steklov, 179, (1988) (in Russian); English translation in Proc. Steklov Institute of Mathematics, 179, 35–53 (1989)
6. Hayman, W.K.: Subharmonic Functions. Vol. II, London: Academic Press, 1989
7. Hörmander, L.: The Analysis of Linear Partial Differential Operators. Vol. III, Berlin: Springer-Verlag, 1985
8. Melrose, R.B.: Polynomial bounds on the number of the scattering poles. J. Funct. Anal. 53, 287–303 (1983)
9. Melrose, R.B.: Polynomial bounds on the distribution of poles in scattering by an obstacle. Journées EDP, Saint-Jean-de-Monts, 1984
10. Melrose, R.B.: Weyl asymptotics for the phase in obstacle scattering. Comm. P.D.E. 13, 1431–1439 (1988)
11. Petkov, V.: Weyl asymptotic of the scattering phase for metric perturbations. Asymptotic Analysis 10, 245–261 (1995)
12. Petkov, V. and Vodev, G.: Upper bounds on the number of scattering poles and the Lax–Phillips conjecture. Asymptotic Analysis 7, 97–104 (1993)
13. Popov, G.: On the contribution of degenerate periodic trajectories to the wave trace. Commun. Math. Physics 196, 363–383 (1998)
14. Robert, D.: Asymptotique de la phase de diffusion à haute énergie pour des perturbations du seconde ordre du Laplacien. Ann. Sci. Ecole Norm. Sup. Sér. 25, 107–134 (1992)
15. Robert, D.: Relative time-delay for perturbations of elliptic operators and semiclassical asymptotics. J. Funct. Anal. 126, 36–82 (1994)
16. Safarov, Yu.: Asymptotics of the spectrum of pseudodifferential operators with periodic characteristics. Zap. Nauchn. Sem. Leningrad. Otdel Mat. Inst. Steklov 152, 94–104 (1986) (in Russian); English translation in J. Soviet Math. 40, 645–652 (1988)
17. Safarov, Yu. and Vassiliev, D.: Branching Hamiltonian billiards. Dokl. AN SSSR 301, 271–274 (1988); English translation in Sov. Math. Dokl. 38, 64–68 (1989)
18. Safarov, Yu. and Vassiliev, D.: The asymptotic distribution of eigenvalues of partial differential equations. Translations of mathematical monographs, AMS, vol. 155, 1996
19. Sjöstrand, J.: A trace formula and review of some estimates for resonances. In: Microlocal analysis and spectral theory (Lucca, 1996), 377–437, NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci., 490, Dordrecht: Kluwer Acad. Publ., 1997
20. Sjöstrand, J. and Zworski, M.: Complex scaling and the distribution of scattering poles. J. Am. Math. Soc. 4, 729–769 (1991)
21. Sjöstrand, J. and Zworski, M.: Lower bounds on the number of scattering poles. Commun. P.D.E. 18, 847–857 (1993)
22. Sjöstrand, J. and Zworski, M.: Lower bounds on the number of scattering poles. II, J. Funct. Anal. 123, 336–367 (1994)
23. Stefanov, P.: Quasimodes and resonances: Sharp lower bounds. To appear in Duke Math J.
24. Titchmarsh, E.C.: The Theory of Functions. Oxford: Oxford University, 1968
25. Vodev, G.: Sharp bounds on the number of scattering poles for perturbations of the Laplacian. Commun. Math. Phys. 146, 205–216 (1992)
26. Vodev, G.: On the distribution of scattering poles for perturbations of the Laplacian. Ann. Inst. Fourier (Grenoble) 42, 625–635 (1992)
27. Vodev, G.: Sharp bounds on the number of scattering poles in even-dimensional spaces. Duke Math. J. 74, 1–17 (1994)
28. Vodev, G.: Sharp bounds on the number of scattering poles in two dimensional case. Math. Nachr. 170, 287–297 (1994)
29. Zworski, M.: Distribution of poles for scattering on the real line. J. Funct. Anal. 73, 277–296 (1987)
30. Zworski, M.: Sharp polynomial bounds on the number of scattering poles. Duke Math. J. 59, 311–323 (1989)
31. Zworski, M.: Poisson formulae for resonances. Séminaire E.D.P., Ecole Polytechnique, Exposé XIII, 1996–1997
32. Zworski, M.: Poisson formula for resonances in even dimensions. Asian J. Math. 2, 615–624 (1998)

Communicated by B. Simon
Commun. Math. Phys. 204, 353 – 366 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Critical Behavior for 2D Uniform and Disordered Ferromagnets at Self-Dual Points
L. Chayes¹, K. Shtengel²
¹ Department of Mathematics, University of California, Los Angeles, CA 90095-1555, USA.
E-mail: [email protected]
² Department of Physics, University of California, Los Angeles, CA 90095-1547, USA.
E-mail: [email protected]
Received: 16 November 1998 / Accepted: 28 January 1999
Abstract: We consider certain two-dimensional systems with self-dual points including uniform and disordered q-state Potts models. For systems with continuous energy density (such as the disordered versions) it is established that the self-dual point exhibits critical behavior: infinite susceptibility, vanishing magnetization and power law bounds for the decay of correlations.

Introduction

In this note we will consider the Potts ferromagnets and related systems on the square lattice. The Potts models are defined by the Hamiltonian
$$H = -\sum_{\langle x,y\rangle} J_{x,y}\,\delta_{\sigma_x,\sigma_y} \tag{1}$$
with $\sigma_x = 1, 2, \dots, q$ and $\delta_{\sigma_x,\sigma_y}$ the usual Kronecker delta. Here $J_{x,y}$ is non-zero only when x and y are nearest neighbors and it is assumed that these couplings cannot be negative. Interest in the disordered version of these systems has recently been revived, in particular by J. Cardy and coworkers [Ca] who have discovered an apparent close connection between these problems and systems with random fields. For the two-dimensional disordered Potts models (among other 2-d systems) it was established in [AW1,AW2] that for all temperatures, the energy is continuous. Thus, by conventional definitions, the magnetic ordering transition is continuous. However, as pointed out by Cardy in [Ca] – as well as in a number of public forums – for these systems it has not been established that all aspects of the transition meet the conventional criteria of a continuous transition: vanishing of the order parameter, power law decay for correlations and infinite susceptibility. Here we establish that, at least at the self-dual points, these systems behave critically in all of the above-mentioned senses (with a lower bound of a power law for
decay of correlations). Our results apply to a variety of systems under the hypothesis of a continuous energy density. The method is to employ graphical representations (e.g. the random cluster representation for the Potts models) and in fact applies to the non-integer cases provided the model is attractive (q ≥ 1). In essence, the results here are complementary to one recently proved in a paper coauthored by one of us [BC]. There it was shown that if (at some point) the energy density is not continuous then the discontinuity (a) is unique, (b) occurs precisely at the self-dual point and (c) coincides with the magnetic ordering transition. For these cases the picture is, by and large, complete. Unfortunately, for the continuous cases, our methods do not rule out the possibility that the self-dual point is simply a critical point in the interior of a critical phase. (Nor do they rule out the possibility that at the low-temperature edge of this purported phase, the magnetization exhibits a jump akin to the Thouless effect in one-dimensional long-range systems [T, AY, ACCN].) Nevertheless, taken together, the two sets of results imply that the self-dual point is always a point of non-analyticity. This work will be organized along the following lines: We will start with the uniform q ≥ 1 random cluster cases which are the simplest illustration of the basic method. Next we will treat certain straightforward generalizations, e.g. the Ashkin–Teller model, and in the second section we treat the disordered Potts model.

Uniform Systems

The random cluster model. We shall begin by setting notation. Consider the random cluster model on some finite connected $\Lambda \subset \mathbb{Z}^2$. If ω is a configuration of bonds, the probability of ω, in the setup with “free” boundary conditions, is given by
$$\mu^{q,R}_{\Lambda^f}(\omega) \propto R^{N(\omega)}\, q^{C^f(\omega)}, \tag{2}$$
where $N(\omega)$ is the number of “occupied” bonds of the configuration and $C^f(\omega)$ is the number of connected components. If q is an integer, this is the representation of the model described by Eq. (1) – for free boundary conditions – with $J_{i,j} \equiv 1$ and $R = e^{\beta} - 1$. For other boundary conditions, the formula for the weights must be modified. Of primary interest are the wired boundary conditions which, back in the spin-system, correspond to setting each spin on the boundary to the same value. Then the formula for the weights of configurations is the same as in Eq. (2) with $C^f(\omega)$ replaced by $C^w(\omega)$ where the latter counts all sites connected to the boundary as part of the same cluster. In this note we will restrict attention to the random cluster measures that are (weak, possibly subsequential) limits of random cluster measures defined in finite volume. Here the boundary conditions that we will consider – essentially those that are handed down from the spin-systems – are defined as follows: In volume Λ, the boundary ∂Λ is divided into k disjoint sets. All sites of the individual sets are identified as the same site. Thus the boundary consists of k effective sites (components) $v_1, \dots, v_k$. No interior connections between $v_1, \dots, v_k$ are permitted. (I.e. any configuration with such a connection is assigned zero weight.) All interior sites connected to the same boundary component are considered as part of the same connected component. Finally, we will allow couplings between boundary sites and their neighboring interior sites to take arbitrary values in [0, ∞). And, of course, we will also consider arbitrary superpositions of all of the above. For the case of free and wired boundary conditions, infinite volume limits, ergodicity, etc. follow in a straightforward fashion from the monotonicity (FKG) properties of the
2D Disordered Ferromagnets
q ≥ 1 random cluster measures. (In the disordered cases, some of these points must be rediscussed and we will do so at the appropriate time.) We will assume general familiarity on the part of the reader concerning these properties. Most of the relevant material can be found in [ACCN] or [BC]. However, if available at the time of reading, the authors highly recommend the forthcoming article [GHM]. The dual model, defined on the lattice $\Lambda^*$ that is dual to Λ, has weights of the same form as those in Eq. (2) with R replaced by $R^* = qR^{-1}$. The general problem of boundary conditions for the dual model is a little intricate but for the purposes of this work, it is sufficient to note that the free and wired boundary conditions are exchanged under duality. In these models (with integer q) the relationship between the bond density in the random cluster model and energy density in the spin-system is straightforward. In particular, let $e_{x,y}$ denote the event of an occupied bond that connects the neighboring pair ⟨x, y⟩. Then, as shown in [CMI],
$$\langle \delta_{\sigma_x,\sigma_y} \rangle^{q,R}_{\Lambda^\#} = \frac{1+R}{R}\,\mu^{q,R}_{\Lambda^\#}(e_{x,y}), \tag{3}$$
where $\langle - \rangle^{q,R}_{\Lambda^\#}$ denotes thermal expectation in the spin-system in boundary condition # = f, w (or, for that matter, any other boundary condition). Thus, in these cases, continuity in the energy density is manifested as continuity in the bond density. Hereafter, we will focus on q ≥ 1 random cluster models and use continuous bond density for our working hypotheses. Let us finally remark that for almost every R, the bond density is, in fact, a well defined concept. This point (which is fairly standard) has recently been detailed in [BC] so here we will be succinct. Consider the free energy, Φ(R), defined here by
$$\Phi(R) = \lim_{\Lambda \nearrow \mathbb{Z}^2} \frac{1}{|\Lambda|} \log Z_{\Lambda^\#}, \tag{4}$$
where $Z_{\Lambda^\#}$ is the sum of the weights in Eq. (2) and $\Lambda \nearrow \mathbb{Z}^2$ means a thermodynamic limit – a regular sequence of boxes. The function Φ(R) is convex (as a function of log R) and hence has a left and right derivative for every R which agree for almost every R. At points of continuity of the derivative, there is uniqueness among the translation invariant states and the bond density in this state is given by the derivative of Φ (with respect to log R). At points of discontinuity, the upper value for the bond density is achieved in the wired state and the lower value in the free state. We are ready for our first result:

Theorem 1. Consider the 2d random cluster models with parameter q ≥ 1. Then if the self-dual point, $R = R^* = \sqrt{q}$, is a point of continuity of the bond density, the percolation density vanishes.

Proof. This result can in fact be obtained as a consequence of Theorem 2.1 in [BC]. For completeness we will provide a direct proof. Here we will establish the contrapositive statement; i.e. assume that the percolation probability is positive at the self-dual point and show that this implies a discontinuity in the bond density. Percolation is defined in reference to the wired measures (and limits thereof). These measures are ergodic under $\mathbb{Z}^2$ translations, respect the x, y-axis symmetry and have the FKG property. In short, these measures satisfy all the conditions of the theorem in [GKR] which forbids coexisting infinite clusters of the opposite type. Thus, with probability
one, whenever there is percolation, all dual bonds reside in finite clusters. However, if there is percolation (in the wired state) at the self-dual point, the same cannot be said in the limiting free boundary condition measure. Indeed, from the perspective of the dual bonds, this is a wired state. Hence, in this state, the dual bonds percolate and the regular bonds do not. It is thus evident that the limiting free and wired measures are distinct; $\mu^{q,R}_{\Lambda^f}(-)$ is strictly below $\mu^{q,R}_{\Lambda^w}(-)$. By the (corollary to) Strassen’s theorem (see [L] p. 75) this implies that the bond density in the wired state is strictly larger than that of the free state; the self-dual point is thus not a point of continuity for the derivative of Φ. □

Corollary. Under the hypotheses of Theorem 1, there is a unique limiting Gibbs state/random cluster measure at the self-dual point.

Proof. It was proved in [ACCN] (Theorem A.2) that the absence of percolation (for q ≥ 1) implies a unique limiting random cluster measure and (for integer q ≥ 2) a unique Gibbs state in the corresponding spin-system. So this follows immediately. □

Let A and B denote disjoint sets in $\mathbb{Z}^2$. We will denote by $\{A \longleftrightarrow B\}$ the event that some site in A is connected to some site in B by occupied bonds. Further, if D contains both A and B, we let $\{A \longleftrightarrow B\}_D$ denote the event that such a connection occurs by a path that lies entirely in D. For x and y (distinct) points in $\mathbb{Z}^2$, let $g_{x,y} = g^{\#}_{x,y}(R,q) = \mu^{q,R}_{\#}(\{x \longleftrightarrow y\})$ be the probability that x and y belong to the same cluster. We will call this object the connectivity function. For integer q = 2, 3, . . . the connectivity function is equal to (or proportional to) the spin–spin correlation function. In these cases, the susceptibility and the average cluster size are also identified. For non-integer q, the geometric quantities are defined to be the objects of interest. These quantities are the subject of our next theorem which is a direct consequence of Theorem 1. The proof below borrows heavily from the argument in [A].

Theorem 2. For self-dual q ≥ 1 random cluster models with vanishing percolation probability, the function $g_{x,y}$ has a power law lower bound. Explicitly, if 0 is the origin and L is the point (L, 0) on the x axis then $g_{0,L} \ge \frac{1}{8} L^{-2}$. Finally, the average cluster size is infinite.

Proof. Since there is a unique limiting state then, in particular, the limiting free and wired measures coincide. Consider the L × L square centered at the origin which we denote by $S_L$. In every configuration, there is either a left–right crossing by regular bonds or a top–bottom crossing by dual bonds. By duality, in the limiting measure, these probabilities are both one half.¹ Letting $L_L$ denote the left edge and $R_L$ the right edge of the square, this implies that
$$\sum_{x \in L_L,\ y \in R_L} g_{x,y} \ge \frac{1}{2}. \tag{5}$$
Hence, for some (deterministic) $x^* \in L_L$ and $y^* \in R_L$ we have
$$g_{x^*,y^*} \ge \frac{1}{2L^2}. \tag{6}$$
1 To ensure that this is strictly true, one must carefully construct the square so that it is exactly self-dual. However, for the arguments here, it is actually sufficient to observe that one of the probabilities must be greater or equal to one half.
This is, in essence, the bound on the correlation function. For æsthetic purposes we will show that a similar bound holds for $g_{0,L}$ but let us first attend to the susceptibility. Following the logic of Eqs. (5) and (6), there is an $x^{**}$ in $L_L$ that is connected to $R_L$ by a path inside $S_L$ with probability of order $L^{-1}$:
$$\mu^{q,R=\sqrt{q}}(\{x^{**} \longleftrightarrow R_L\}_{S_L}) \ge \frac{1}{2L}. \tag{7}$$
Regarding the point $x^{**}$ as being at the center of a square of side 2L and using translation invariance, we find
$$\mu^{q,R=\sqrt{q}}(\{0 \longleftrightarrow \partial S_{2L}\}) \ge \frac{1}{2L}, \tag{8}$$
i.e. $X_L \equiv \mu^{q,R=\sqrt{q}}(\{0 \longleftrightarrow \partial S_L\}) \ge L^{-1}$. This immediately implies a divergent susceptibility/cluster size. Indeed, writing
$$\mathcal{X} = \sum_{x} g_{0,x} = \sum_{L} \sum_{x \in \partial S_L} g_{0,x} \ge \sum_{L} X_L, \tag{9}$$
this result follows. Finally, let us obtain our bound for the correlation function along the coordinate axes. Consider the event $\{x^{**} \longleftrightarrow R_L\}_{S_L}$ along with its mirror image reflected along the midline of the square, i.e. a connection between $L_L$ and $y^{**}$ by a path inside $S_L$, where $y^{**} = x^{**} + L$. If these two connections occur in tandem with a top–bottom crossing of $S_L$, we achieve a connection between $y^{**}$ and $x^{**}$. These events are all positively correlated, hence
$$g_{0,L} = g_{y^{**},x^{**}} \ge \frac{1}{2} \cdot \frac{1}{2L} \cdot \frac{1}{2L}. \tag{10}$$
□

It is clear that the above generalizes to other random cluster systems. However, since at present there are not too many examples of physically relevant models that satisfy all of the required conditions, we will be content with a small selection.

The [r,s]-cubic (generalized Ashkin–Teller) model. Consider two copies of $\mathbb{Z}^2$ with two sets of Potts spins: $\tau_i \in \{1, \dots, r\}$ and $\kappa_i \in \{1, \dots, s\}$. It is convenient to envision the model as two layers of $\mathbb{Z}^2$, the τ-layer and the κ-layer, with the τ-layer just above the κ-layer. In any case, the Hamiltonian is given by
$$H = -\sum_{\langle i,j\rangle} \left[ a\,\delta_{\tau_i,\tau_j}\delta_{\kappa_i,\kappa_j} + b\,\delta_{\kappa_i,\kappa_j} + c\,\delta_{\tau_i,\tau_j} \right], \tag{11}$$
where, as it turns out, we will be interested in the cases a, b, c ≥ 0. The dual relations for this model (at least for integer r and s) were derived some time ago in [dN,DR] by algebraic methods. (Of course the special case r = s = 2 was derived much earlier starting, in fact, with [AT].) More recently, graphical representations for this model have been discovered [CMI,PfV] (and see also [SS]) in which the duality is manifest. Consider bond configurations $\omega = (\omega_\tau, \omega_\kappa)$, i.e. separate bond configurations in the τ- and κ-layers. As usual, we will start in finite volume. Let $N(\omega_\tau)$ and $N(\omega_\kappa)$ denote the number of occupied bonds in the τ- and κ-layers respectively. Let $N(\omega_\tau \vee \omega_\kappa)$
denote the number of edges where at least one of the τ- or κ-layers has an occupied bond and finally let $N(\omega_\tau \wedge \omega_\kappa)$ denote the number of edges where both the τ- and κ-layers have occupied bonds. The graphical representation is defined by the weights
$$W(\omega) = A^{N(\omega_\tau \vee \omega_\kappa)}\, B^{N(\omega_\tau \wedge \omega_\kappa)}\, C^{[N(\omega_\kappa) - N(\omega_\tau)]}\, r^{C(\omega_\tau)}\, s^{C(\omega_\kappa)}, \tag{12}$$
where $C(\omega_\tau)$ and $C(\omega_\kappa)$ are the numbers of connected components as in the usual random cluster problems. (And typically must be augmented with some boundary conditions.) The relationship between A, B and C and a, b and c is as follows:
$$A = \left[(e^{\beta b} - 1)(e^{\beta c} - 1)\right]^{1/2}, \tag{13a}$$
$$B = \frac{e^{\beta(a+b+c)} - e^{\beta b} - e^{\beta c} + 1}{\left[(e^{\beta b} - 1)(e^{\beta c} - 1)\right]^{1/2}}, \tag{13b}$$
$$C = \left[\frac{e^{\beta b} - 1}{e^{\beta c} - 1}\right]^{1/2}. \tag{13c}$$
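As a quick numerical sanity check of Eqs. (13a)–(13c), the following sketch evaluates A, B and C and confirms the single-edge weights they encode: an edge occupied only in the κ-layer carries AC = e^{βb} − 1, an edge occupied only in the τ-layer carries A/C = e^{βc} − 1, and B ≥ A holds exactly when a ≥ 0. The function name `cubic_weights` and the restriction to b, c, β > 0 are assumptions made for this illustration only; they are not part of the original text.

```python
import math

def cubic_weights(a, b, c, beta):
    """Return (A, B, C) of Eqs. (13a)-(13c).

    Illustrative helper: assumes b, c, beta > 0 so that A and C are
    finite and non-zero. The coupling a may have either sign.
    """
    if b <= 0 or c <= 0 or beta <= 0:
        raise ValueError("this sketch assumes b, c, beta > 0")
    xb = math.expm1(beta * b)  # e^{beta b} - 1
    xc = math.expm1(beta * c)  # e^{beta c} - 1
    A = math.sqrt(xb * xc)
    B = (math.exp(beta * (a + b + c))
         - math.exp(beta * b) - math.exp(beta * c) + 1.0) / A
    C = math.sqrt(xb / xc)
    return A, B, C
```

For instance, `cubic_weights(0.0, 1.0, 2.0, 0.5)` returns B equal to A up to rounding, which is the decoupled case a = 0 of two independent Potts layers.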
In order for the graphical representation to make sense, we require b, c ≥ 0. However, this is not the case with a but it turns out that the FKG property – which we will need – only holds if B ≥ A [BC]; thus we actually require all couplings in Eq. (13) to be ferromagnetic. Under these conditions, for the case r = s, b = c, it was shown in [CMI] that there is a single ordering transition as the temperature is varied. The dual model is defined straightforwardly: edges of the dual lattice in, say, the τ-layer that are transverse to occupied bonds are considered vacant dual bonds, those edges transverse to vacant bonds are the occupied dual bonds and similarly in the κ-layer. The duality conditions are easily obtained from the weights in Eq. (12) (for the simple reason that ∨ ↔ ∧ under duality) and the result is:
$$A^* = \frac{\sqrt{rs}}{B}, \tag{14a}$$
$$B^* = \frac{\sqrt{rs}}{A}, \tag{14b}$$
$$C^* = \sqrt{\frac{r}{s}}\,\frac{1}{C}. \tag{14c}$$
The analogs of Theorems 1 and 2 for this system are readily established:

Theorem 3. Consider the (r, s)-cubic model as described with B ≥ A. Let Φ(A, B, C) denote the free energy similar to that defined in Eq. (4). Suppose that a self-dual point, $(A, B, C) = (A^*, B^*, C^*)$, is a point of continuity for any first derivative of Φ. Then the percolation probability in either layer vanishes. Let $g^{\tau}_{x,y}$ and $\mathcal{X}^{\tau}$ denote the connectivity function and average size of clusters in the τ-layer and similarly for $g^{\kappa}_{x,y}$ and $\mathcal{X}^{\kappa}$. Then $g^{\tau}_{0,L} \ge \frac{1}{8} L^{-2}$, $\mathcal{X}^{\tau}$ is infinite and similarly for $g^{\kappa}_{x,y}$ and $\mathcal{X}^{\kappa}$.

Proof. It is convenient, but not essential, to restrict attention to the “plane” $C = C^*$. Indeed, following the argument below, it can be shown that continuity with respect to A and B actually implies continuity with respect to C; thus, for all intents and purposes, the C-variable is out of play. Continuity of the derivative with respect to B implies that at the point (A, B, C), the density of “doubly occupied” bonds is independent of
state for any translation invariant state. Add to this continuity of the derivative with respect to A and (since $N(\omega_\tau \vee \omega_\kappa) + N(\omega_\tau \wedge \omega_\kappa) = N(\omega_\tau) + N(\omega_\kappa)$) we may conclude that the total bond density is the same in every translation invariant state. However, we claim that this implies the same result for the separate densities. Indeed if the τ-density were discontinuous, to keep the total density continuous would require a compensating discontinuity in the κ-density. But these densities are positively correlated; i.e. the discontinuities must go in the same direction. In particular, we could find a state (at the point (A, B, C)) where both densities achieved their lower value and another where they both obtain the upper value. Constancy of the bond density implies no percolation in either layer at a self-dual point which in turn implies unicity of the state. The rest of the argument follows mutatis mutandis the proof of Theorem 2. □

A loop related model. Our final example appeared in the context of loop models in [CPS]. Let ω denote a bond configuration on $\mathbb{Z}^2$ and let $\tilde\omega$ denote the complementary configuration on the dual lattice: If a bond of ω is occupied then so is the transverse bond and similarly for vacancies. (In other words, the vacant bonds of the dual configuration are the occupied bonds of the complementary configuration.) The weights, in finite volume, are given by
$$V(\omega) = L^{N(\omega)}\, s^{C(\omega)}\, s^{C(\tilde\omega)}. \tag{15}$$
We remark that from a technical perspective, unrelated boundary conditions for ω and $\tilde\omega$ may be implemented. However it is natural to assert that if one is fully wired so is the other and similarly with free boundary conditions. A derivation identical to the one for the usual random cluster model shows that the dual model is the same model with the parameter
$$L^* = \frac{s^2}{L} \tag{16}$$
along with the usual exchange of boundary conditions. For double-free or double-wired (as well as other) boundary conditions, the FKG property follows easily:

Proposition 4. For free or wired boundary conditions, the random cluster models defined by the weights in Eq. (15) with s ≥ 1 have the FKG property. Further, for these boundary conditions, if $R_1 \ge R_2$ and $s_1 \le s_2$ the measure with parameters $(R_1, s_1)$ FKG dominates the one with parameters $(R_2, s_2)$.

Proof. For the usual random cluster model, the FKG property follows from the FKG lattice condition [FKG]: Let $\omega_1$ and $\omega_2$ denote two bond configurations, $\omega_1 \wedge \omega_2$ the configuration of bonds occupied in both $\omega_1$ and $\omega_2$ and $\omega_1 \vee \omega_2$ the configuration of bonds occupied in either. The lattice condition reads: $\mu(\omega_1 \wedge \omega_2)\,\mu(\omega_1 \vee \omega_2) \ge \mu(\omega_1)\,\mu(\omega_2)$ which follows because $C(\omega_1 \wedge \omega_2) + C(\omega_1 \vee \omega_2) \ge C(\omega_1) + C(\omega_2)$ [ACCN]. In the present case, we need only apply this argument twice, once to $C(\omega)$ and once to $C(\tilde\omega)$. The FKG dominance follows by writing the one set of weights as an increasing function times the other. □

Remark. It would appear that the model under discussion is very close to the $q = s^2$ state random cluster model. This follows by noting that for the former, $C(\omega)$ and $C(\tilde\omega)$ are identically distributed. Then, we may write $s^{C(\omega) + C(\tilde\omega)} = s^{2C(\omega)}\, s^{C(\tilde\omega) - C(\omega)}$ and
suppose that the “fluctuations” $C(\tilde\omega) - C(\omega)$ are (thermodynamically) small. However, at present there is no hard evidence of such an equivalence. On the other hand, the self-dual point can in fact be realized as the endpoint of the self-dual line of the symmetric (r = s, C = 1) cubic model corresponding to A → ∞ and B → 0. Here the τ-bonds may be taken to be the occupied bonds and the κ’s to be the vacants. The condition B = 0 ensures that they cannot coincide while A = ∞ implies one bond or the other actually is occupied.

Theorem 5. For the model defined by the weights in Eq. (15), the results of Theorem 3 apply: if R = s is a point of continuity for the bond density then the connectivity function has a power law lower bound and the average cluster size is infinite.

Proof. Follows from the same arguments as the proofs of Theorems 1 and 2. □

Remark. In this model, the results of [BC] also apply: If there is any point of discontinuity for the bond density, that point must be the self-dual point. (The cases of the Potts model and the cubic model for r = s were the explicit subject of [BC]; the identical arguments apply to the current case.) Thus, one way or another in all these systems the self-dual points are points of “phase transitions”. For large q, r, and/or s – at least in the integer cases – it is straightforward to show that discontinuities do occur. (Theorem IV.2 in [CMI] covers all of these cases.) The difficulty is the opposite cases: establishing continuity of the energy/bond density. For independent percolation (q = 1) this form of continuity is trivial. (By contrast, establishing that this is the unique critical point and that the percolation density is continuous involves quite intricate arguments [K,R].) Indeed, to the authors’ knowledge, the only non-trivial uniform system where this has been done with complete rigor is the Ising magnet, here by exact solution [O].
However, the next section features systems where the required continuity has been guaranteed by general arguments.

Quenched Potts Models

For the remainder of this paper, we will deal exclusively with the q-state Potts model on $\mathbb{Z}^2$ as defined by the Hamiltonian in Eq. (1); we will treat the case where the $J_{x,y}$ are (non-negative) independent random variables. (And ultimately to prove theorems along the lines of Theorems 1 and 2, we will need to focus on distributions that are self-dual.) We strongly suspect that with only minimal labor the forthcoming results could be extended to disordered versions of the various other models discussed in the previous section. But here we will focus on the minimal case. The approach in this work will be somewhat different from the usual mathematical studies of disordered systems: rather than looking at properties that are “typical” of configurations of couplings, we will construct, from the outset, the quenched measure – more precisely, the graphical representation thereof. When all the preliminaries are in place, this has the advantage of allowing a derivation that is essentially indistinguishable from the uniform cases. The disadvantage is that many of the “basic preliminaries” will require some attention.

The quenched measure. Let $\Lambda \subset \mathbb{Z}^2$ (or $\mathbb{Z}^d$ for the duration of the preliminaries) denote a finite volume. In what follows, the inverse temperature β as well as the value of q (≥ 1) will be regarded as fixed and hence will be suppressed notationally. Let
$\eta = \{J_{x,y} \mid \langle x, y\rangle \in \Lambda\}$ denote a set of couplings. Let # denote a boundary condition on ∂Λ. In general we will allow the boundary condition to depend on the realization of couplings so we will write #(η). (For the case of continuous variables $J_{x,y}$ we must also stipulate that #(η) is a measurable function.) We let $\langle - \rangle^{\eta}_{\Lambda;\#(\eta)}$ denote the finite volume Gibbs state (for this realization of couplings and this boundary condition). Similarly, we may consider the random cluster measures $\mu^{\eta}_{\Lambda;\#(\eta)}(-)$. Our assumption about the $J_{x,y}$ is that they are i.i.d. non-negative variables. Let b(−) denote the product measure for configurations of couplings and $E_b(-)$ the expectation with respect to this measure. Then the quenched measures are defined as the b-averages of the “thermal” averages according to $\langle - \rangle^{\eta}_{\Lambda;\#(\eta)}$ and $\mu^{\eta}_{\Lambda;\#(\eta)}(-)$. Explicitly, if $F(\sigma_{x_1}, \dots, \sigma_{x_k})$ is a function of spins (with $x_1, \dots, x_k \in \Lambda$) then the quenched average of F is given by
$$\langle F \rangle_{\Lambda;\#} = E_b\big(\langle F \rangle^{\eta}_{\Lambda;\#(\eta)}\big). \tag{17a}$$
Similarly, for a bond event A,
$$\mu_{\Lambda;\#}(A) = E_b\big(\mu^{\eta}_{\Lambda;\#(\eta)}(A)\big). \tag{17b}$$
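To make the definitions in Eqs. (17a) and (17b) concrete, here is a minimal sketch for the smallest possible system: two Potts spins joined by a single bond with free boundary conditions. It computes the thermal average of the Kronecker delta by brute-force enumeration, the bond probability from the random cluster weights of Eq. (2), and then averages both over a discrete coupling distribution. All function names and the two-point coupling distribution in the usage note are hypothetical choices made for illustration.

```python
import itertools
import math

def thermal_delta(J, q, beta=1.0):
    """Thermal average <delta(s1, s2)> for two q-state Potts spins on one bond J."""
    Z = 0.0
    num = 0.0
    for s1, s2 in itertools.product(range(q), repeat=2):
        same = 1.0 if s1 == s2 else 0.0
        w = math.exp(beta * J * same)  # Boltzmann weight for H = -J * delta
        Z += w
        num += w * same
    return num / Z

def bond_probability(J, q, beta=1.0):
    """Random cluster probability that the single bond is occupied (free b.c.)."""
    R = math.expm1(beta * J)   # R = e^{beta J} - 1
    occupied = R * q           # N = 1 occupied bond, C = 1 cluster
    vacant = float(q) ** 2     # N = 0 occupied bonds, C = 2 clusters
    return occupied / (occupied + vacant)

def quenched(f, distribution):
    """b-average of f(J) over a discrete distribution of (J, probability) pairs."""
    return sum(p * f(J) for J, p in distribution)
```

With, say, `dist = [(1.0, 0.5), (0.4, 0.5)]` and `q = 3`, the call `quenched(lambda J: thermal_delta(J, q), dist)` is the single-bond instance of Eq. (17a), and replacing `thermal_delta` by `bond_probability` gives the instance of Eq. (17b). For each fixed J the two quantities are related by the factor (1 + R)/R of Eq. (3); because that factor depends on J, the relation does not carry over to the quenched averages as a single prefactor.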
Most of our attention will be focused on the quenched random cluster measures as defined in Eq. (17b) – or the infinite volume limits thereof. Our first proposition establishes some FKG properties of these quenched measures:

Proposition 6. On finite Λ, let w and f denote the boundary conditions that are, respectively, wired and free for all η. Then the measures $\mu_{\Lambda,w}(-)$ and $\mu_{\Lambda,f}(-)$ have the FKG property (in the sense of positive correlations). Furthermore, for any # = #(η), the wired measure $\mu_{\Lambda,w}(-)$ FKG-dominates the measure $\mu_{\Lambda,\#}(-)$.

Proof. For the FKG property, we will just do the wired case; the free case is identical. Let A and B denote increasing events (defined on the bonds of Λ). By the FKG properties of the measures $\mu^{\eta}_{\Lambda;w}(-)$,
$$\mu^{\eta}_{\Lambda;w}(A \cap B) \ge \mu^{\eta}_{\Lambda;w}(A)\,\mu^{\eta}_{\Lambda;w}(B). \tag{18}$$
Now we observe that for any increasing event C, the quantity $\mu^{\eta}_{\Lambda;w}(C)$ is an increasing function of η. Indeed, if $\eta_1 \succ \eta_2$ (meaning $J^{1}_{x,y} \ge J^{2}_{x,y}$ for each bond in Λ) then the random cluster measure with couplings $\eta_1$ FKG dominates the one with couplings $\eta_2$. But then
$$\mu_{\Lambda,w}(A \cap B) = E_b\big(\mu^{\eta}_{\Lambda;w}(A \cap B)\big) \ge E_b\big(\mu^{\eta}_{\Lambda;w}(A)\,\mu^{\eta}_{\Lambda;w}(B)\big) \ge E_b\big(\mu^{\eta}_{\Lambda;w}(A)\big)\,E_b\big(\mu^{\eta}_{\Lambda;w}(B)\big) = \mu_{\Lambda,w}(A)\,\mu_{\Lambda,w}(B). \tag{19}$$
Here the first inequality is Eq. (18) and the second holds because b is a product measure and both factors are increasing functions of η (the Harris–FKG inequality).
The same works for the free case (and various other boundary conditions that are independent of η). Finally, the stated FKG-domination is an obvious consequence of the fact that for each η, the measure $\mu^{\eta}_{\Lambda,w}(-)$ FKG-dominates the measure $\mu^{\eta}_{\Lambda,\#(\eta)}(-)$. □

As a corollary, we obtain the existence of infinite volume limits as in the usual random cluster cases:

Corollary. The infinite volume limits $\mu_f(-)$ and $\mu_w(-)$ exist in the sense that if $\Lambda_k \nearrow \mathbb{Z}^d$ is any sequence of boxes with $\Lambda_{k+1} \supset \Lambda_k$ then the limits of $\mu_{\Lambda_k,f}(-)$ and $\mu_{\Lambda_k,w}(-)$ exist and are independent of the sequence. These limiting measures are translation invariant and invariant under exchange of coordinate axes.
Proof. The argument is exactly as in the standard proofs and is a consequence of the following observation: If $\Lambda_1 \subset \Lambda_2$ then for any fixed η, the restriction of $\mu^{\eta}_{\Lambda_2,w}(-)$ to $\Lambda_1$ is FKG dominated by the wired measure in $\Lambda_1$. Thus the same statement holds for the quenched average of these measures. A similar sort of domination, but in the opposite direction, is established for the free measures. The remainder of the proof is now identical to the derivations for uniform random cluster models (with occasional use of translation invariance and coordinate symmetry of b(−)). Such proofs have been written in many places (see e.g. [CMI], Theorem 3.3) and need not be repeated here. □

We now demonstrate that absence of percolation is the correct criterion for uniqueness. Our working definition of percolation is fairly standard:

Definition. Let $\Lambda \subset \mathbb{Z}^d$ be a finite set that contains the origin. We define
$$P_\infty = \lim_{\Lambda \nearrow \mathbb{Z}^d} \mu_{\Lambda,w}(0 \longleftrightarrow \partial\Lambda). \tag{20}$$
We say there is percolation if $P_\infty$ is not zero.

Remark. It is obvious, by the considerations of the corollary to Proposition 6, that this limit exists. Further, if $P_\infty$ vanishes, there is no percolation by any other criterion. Finally it is not difficult to show that $P_\infty$ is exactly the spontaneous magnetization in the spin-system. Next we establish the quenched analog of Theorem A.2 in [ACCN].

Proposition 9. If $P_\infty = 0$ there is a unique limiting quenched random cluster measure and a unique limiting quenched Gibbs measure.

Proof. Our proof will in essence be to show that any sequence of finite volume measures converges to the free measure. Let A denote any local increasing event. Let Λ denote a large (finite) box – the bonds of which determine the event A. Now consider a much larger box Ξ along with some boundary condition #(η); the measure $\mu_{\Xi,\#}(-)$ may be thought of as “well along the way” towards the construction of some infinite volume measure. Since the percolation probability is assumed to vanish, it is clear that if Ξ is sufficiently large, then, for ε > 0,
$$\mu_{\Xi,\#}(\partial\Lambda \longleftrightarrow \partial\Xi) \le \varepsilon^2. \tag{21}$$
Thus if $D_{\Lambda,\Xi} = \{\eta \mid \mu^{\eta}_{\Xi,\#(\eta)}(\partial\Lambda \longleftrightarrow \partial\Xi) \le \varepsilon\}$, then by Markov's inequality $b(D_{\Lambda,\Xi}) \ge 1 - \varepsilon$. Now for any $\eta \in D_{\Lambda,\Xi}$, with $\mu^{\eta}_{\Xi,\#(\eta)}$-probability at least 1 − ε, there is a “ring” (separating surface) of vacant bonds in the region between ∂Λ and ∂Ξ. Conditioning to the “outermost” such ring gives us a measure which, in the interior of the ring, is equivalent to free boundary conditions on the ring. For any η, this in turn is dominated by the measure with free boundary conditions on ∂Ξ and dominates (in Λ) the measure with free boundary conditions on ∂Λ. Thus, for $\eta \in D_{\Lambda,\Xi}$,
$$(1 - \varepsilon)\big[\mu^{\eta}_{\Lambda,f}(A)\big] \le \mu^{\eta}_{\Xi,\#(\eta)}(A) \le (1 - \varepsilon)\big[\mu^{\eta}_{\Xi,f}(A)\big] + \varepsilon, \tag{22}$$
and hence
$$(1 - \varepsilon)\big[\mu_{\Lambda,f}(A)\big] \le \mu_{\Xi,\#}(A) \le (1 - \varepsilon)\big[\mu_{\Xi,f}(A)\big] + 2\varepsilon, \tag{23}$$
where the extra ε comes from the $\eta \notin D_{\Lambda,\Xi}$. From Eq. (23) it is easy to see that all sequences of finite volume quenched measures converge to the limiting free measure. The argument for the uniqueness of the quenched Gibbs state follows from the above by noting that the thermal average of any local spin-function can be expressed as expectations of random cluster functions (which themselves are finite combinations of increasing events). This proves (a) the existence of a limiting $\langle - \rangle_f$ (and for that matter a limiting $\langle - \rangle_w$) and (b) that if the magnetization vanishes then this is the unique limiting state. □

The final result we will need is the ergodic property for the free and wired quenched random cluster measures.

Theorem 10. The measures $\mu_w(-)$ and $\mu_f(-)$ are ergodic under $\mathbb{Z}^d$ translations.

Proof. We will do the wired case; the free case is nearly identical. Let A and B denote local events assumed, without loss of generality, to be increasing. Let $r \in \mathbb{Z}^d$ and let $T_r(B)$ denote the event B translated by r. We will show that $\lim_{r \to \infty} \mu_w(A \cap T_r(B)) = \mu_w(A)\,\mu_w(B)$. By FKG and translation invariance we have, for any r,
$$\mu_w(A \cap T_r(B)) \ge \mu_w(A)\,\mu_w(B). \tag{24}$$
Now consider $|r|$ large, far larger than the scale of the regions that determine the events $A$ and $B$. Let $s \le |r|$ be chosen so that $\Lambda_s$, the box of side $s$ centered at the origin, and its translate by $r$, which we denote by $T_r(\Lambda_s)$, are disjoint but within a few lattice spacings of each other. Finally, let us consider an $L$ that is very large compared with $r$; we will approximate $\mu_w(A \cap T_r(B))$ by $\mu_{\Lambda_L,w}(A \cap T_r(B))$. By the FKG property, $\mu_{\Lambda_L,w}(A \cap T_r(B))$ is less than the corresponding probability given that all bonds on the outside of $\Lambda_s$ and $T_r(\Lambda_s)$ are occupied. But given these occupations, the measure inside $\Lambda_s$ is equivalent to wired boundary conditions on $\Lambda_s$, and similarly for $T_r(\Lambda_s)$. Now for each $\eta$, the wirings make these interior measures independent. Thus we have
\[ \mu^{\eta}_{\Lambda_L,w}(A \cap T_r(B)) \le \mu^{\eta}_{\Lambda_s,w}(A)\,\mu^{\eta}_{T_r(\Lambda_s),w}(T_r(B)). \tag{25} \]
However, as functions of $\eta$, the two objects on the right of Eq. (25) are independent: they take place on disjoint sets. It is clear that $\mu^{\eta}_{T_r(\Lambda_s),w}(T_r(B))$ averages to $\mu_{\Lambda_s,w}(B)$ and thus
\[ \mu_{\Lambda_L,w}(A \cap T_r(B)) \le \mu_{\Lambda_s,w}(A)\,\mu_{\Lambda_s,w}(B). \tag{26} \]
Letting $L \to \infty$ we get
\[ \mu_w(A \cap T_r(B)) \le \mu_{\Lambda_s,w}(A)\,\mu_{\Lambda_s,w}(B), \tag{27} \]
and hence
\[ \lim_{r\to\infty} \mu_w(A \cap T_r(B)) \le \lim_{s\to\infty} \mu_{\Lambda_s,w}(A)\,\mu_{\Lambda_s,w}(B) = \mu_w(A)\mu_w(B). \tag{28} \]
This completes the proof for the wired case; the free case works the same way, using decreasing events for $A$ and $B$. □
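The FKG lower bound used in (24) can be illustrated by brute force on a very small graph. The sketch below is only a toy check under stated assumptions: it uses a single, non-random coupling (no disorder), a hypothetical 4-bond cycle, and illustrative values of p and q that are not taken from the paper. It enumerates all bond configurations of the random cluster measure and compares the joint probability of two increasing events with the product of their probabilities.

```python
from itertools import product


def count_clusters(n_vertices, edges, config):
    """Number of connected components when only the occupied bonds are kept."""
    parent = list(range(n_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (u, v), is_open in zip(edges, config):
        if is_open:
            parent[find(u)] = find(v)
    return len({find(x) for x in range(n_vertices)})


def random_cluster_measure(n_vertices, edges, p, q):
    """Return {configuration: probability} for weights p^open (1-p)^closed q^clusters."""
    weights = {}
    for config in product((0, 1), repeat=len(edges)):
        n_open = sum(config)
        w = p ** n_open * (1 - p) ** (len(edges) - n_open)
        weights[config] = w * q ** count_clusters(n_vertices, edges, config)
    z = sum(weights.values())
    return {c: w / z for c, w in weights.items()}


def prob(measure, event):
    """Probability of an event given as a predicate on configurations."""
    return sum(pr for c, pr in measure.items() if event(c))


# Hypothetical toy graph: one plaquette of the square lattice (4 sites, 4 bonds).
EDGES = [(0, 1), (1, 3), (3, 2), (2, 0)]
measure = random_cluster_measure(4, EDGES, p=0.4, q=3.0)

A = lambda c: c[0] == 1  # bond 0 occupied: an increasing event
B = lambda c: c[2] == 1  # bond 2 occupied: an increasing event
p_a, p_b = prob(measure, A), prob(measure, B)
p_ab = prob(measure, lambda c: A(c) and B(c))
print(p_ab, p_a * p_b)  # FKG (q >= 1) predicts p_ab >= p_a * p_b
```

For q = 1 the weights reduce to independent bond percolation, in which case the two numbers coincide.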
Main results. We are ready for the disordered analogs of Theorems 1 and 2. However, in this case, we will not need to hypothesize the required continuity: this is the central subject of [AW1,AW2]. Let us first briefly discuss duality in the disordered case. In the general setup, let
\[ J^*(J) = \log\left[1 + \frac{q}{e^{J} - 1}\right]. \tag{29} \]
(I.e. $e^{J^*} - 1 = q/[e^{J} - 1]$.) Then what is needed, in the discrete case, to have $\beta = 1$ a point of self-duality is that $b(J_{x,y} = J) = b(J_{x,y} = J^*(J))$. (A similar statement holds for continuous or other distributions.) Indeed, if this is the case we see that the probability of bonds and dual bonds of equivalent strength are the same. Then, in finite volume, the probabilities of two coupling configurations that are equivalent under duality (including the usual exchange of boundary conditions) are equal. Sometimes it is convenient to parameterise the distribution and allow $\beta$ to vary. For example, suppose there are two bond values $J_1$ and $J_2$ with $b(J_{x,y} = J_1) = b(J_{x,y} = J_2) = 1/2$. Since temperature is back in the problem, we may assume, without loss of generality, that $J_1 = 1$ and write $J_2 = \lambda$ with $0 \le \lambda \le 1$. Then, in the $\lambda, \beta$ plane it is not hard to see that the model with parameters $\lambda, \beta$ is equivalent under duality to the one with parameters $\lambda^*, \beta^*$, where
\[ \beta^* = \log\left[1 + \frac{q}{e^{\lambda\beta} - 1}\right] \tag{30a} \]
and
\[ \lambda^* = \frac{\log\left[1 + \frac{q}{e^{\beta} - 1}\right]}{\log\left[1 + \frac{q}{e^{\lambda\beta} - 1}\right]}. \tag{30b} \]
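As a numerical sanity check of the map (30a)-(30b), the following short sketch applies it twice and confirms that the starting point is recovered, and also evaluates it at one point that it leaves fixed. The function names and the parameter values (q = 3, λ = 1/2) are illustrative assumptions, not taken from the paper.

```python
import math


def dual_coupling(x, q):
    """Dual of a single bond strength x > 0, as in Eq. (29): log(1 + q / (e^x - 1))."""
    return math.log(1.0 + q / math.expm1(x))


def dual_point(lam, beta, q):
    """Map (lambda, beta) to (lambda*, beta*) following Eqs. (30a)-(30b)."""
    beta_star = dual_coupling(lam * beta, q)        # (30a)
    lam_star = dual_coupling(beta, q) / beta_star   # (30b)
    return lam_star, beta_star


q = 3.0
lam, beta = 0.5, 1.1
lam1, beta1 = dual_point(lam, beta, q)
lam2, beta2 = dual_point(lam1, beta1, q)
print((lam1, beta1), (lam2, beta2))  # the second pair should reproduce (lam, beta)

# With lambda = 1/2 and q = 3, beta = 2 log 2 gives (e^beta - 1)(e^{lambda beta} - 1) = 3 * 1 = q,
# so this point should be mapped to itself.
fixed = dual_point(0.5, 2.0 * math.log(2.0), q)
print(fixed)
```

Because the single-bond map of Eq. (29) is its own inverse, the two-parameter map inherits the same property, which is what the first check exercises.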
The system is self-dual ($\lambda = \lambda^*$, $\beta = \beta^*$) when $(e^{\beta} - 1)(e^{\lambda\beta} - 1) = q$.

Theorem 1′. Consider a disordered Potts model of the type described at a self-dual point. Then there is a unique limiting state with zero magnetization.

Proof. The proof is the same as the proof of Theorem 1, which we will recapitulate for continuity. If the magnetization (percolation probability) were non-zero then in the free boundary state, the percolation probability for dual bonds would be non-vanishing. Proposition 6 and Theorem 10 allow us to use the result in [GKR]. Thus the free and wired states would be distinguished. But these are FKG ordered states so it would follow that the bond (and hence energy) density would differ in the two states, implying a discontinuity in the energy density (or bond density). This, however, is contradicted by the results of [AW1,AW2]. Proposition 9 connects the absence of magnetization to uniqueness. □

Theorem 2′. Under the hypotheses of Theorem 1′, in the limiting state the quenched correlation function satisfies
\[ \left\langle \delta_{\sigma_0,\sigma_L} - \frac{1}{q} \right\rangle \ge \frac{1}{8}\,\frac{1}{L^2}. \]
Further the quenched susceptibility, defined as
\[ X = \sum_{y} \left\langle \delta_{\sigma_0,\sigma_y} - \frac{1}{q} \right\rangle, \]
is infinite.

Proof. Again this follows exactly the proof of Theorem 2 once we can conclude that the probability of a square crossing is one half. Although this follows from self-duality on “general principles” it is comforting to consider the $L \times L$ square $S_L$ in the middle of a finite (but much larger) square with wired boundary conditions on the top and right and with free boundary conditions on the left and bottom. Then the quenched crossing probability is manifestly exactly one half and the thermodynamic limit can be taken, which gets us to our unique state, and we conclude that the probability in the limiting state is one half. □

Acknowledgements. Work supported by the NSA under grant # MDA904-98-1-0518. L. C. takes great pleasure in thanking M. Aizenman; in general for many useful conversations over the years and in particular for some tips on this problem, and for the last reprint of [AW2].
References
[A] Aizenman, M.: On the Slow Decay of O(2) Correlations in the Absence of Topological Excitations: Remark on the Patrascioiu–Seiler Model. J. Stat. Phys. 77, 351–359 (1994)
[ACCN] Aizenman, M., Chayes, J.T., Chayes, L. and Newman, C.M.: Discontinuity of the Magnetization in One-Dimensional 1/|x − y|^2 Ising and Potts Models. J. Stat. Phys. 50, 1–40 (1988)
[AT] Ashkin, J. and Teller, E.: Statistics of Two-Dimensional Lattices with Four Components. Phys. Rev. 64, 178–184 (1943)
[AW1] Aizenman, M. and Wehr, J.: Rounding of First-Order Phase Transitions in Systems with Quenched Disorder. Phys. Rev. Lett. 62, 2503–2506 (1989)
[AW2] Aizenman, M. and Wehr, J.: Rounding Effects of Quenched Randomness on First-Order Phase Transitions. Commun. Math. Phys. 130, 489–528 (1990)
[AY] Anderson, P.W. and Yuval, G.: J. Phys. C 4, 407 (1971)
[BC] Baker, T. and Chayes, L.: On the Unicity of Discontinuous Transitions in the Two-Dimensional Potts and Ashkin-Teller models. J. Stat. Phys. 93, 1–15 (1998)
[Ca] Cardy, J.: Quenched Randomness at First-Order Transitions. Physica A 263, 215–221 (1999); In: Statistical Physics Invited Papers from STATPHYS 20, A. Gervois, D. Iagolinitzer, M. Moreau and Y. Pomeau (eds.), Elsevier, North-Holland
[CMI] Chayes, L. and Machta, J.: Graphical Representations and Cluster Algorithms Part I: Discrete Spin Systems. Physica A 239, 542–601 (1997)
[CPS] Chayes, L., Pryadko, L. and Shtengel, K.: Loop Models on Z^d: Rigorous Results. Unpublished (1998)
[DR] Domany, E. and Riedel, E.: Two-Dimensional Anisotropic N-Vector Models. Phys. Rev. B 19, 5817–5834 (1979)
[FKG] Fortuin, C.M., Kasteleyn, P.W. and Ginibre, J.: Correlation Inequalities on Some Partially Ordered Sets. Commun. Math. Phys. 22, 89–103 (1971)
[GHM] Georgii, H.O., Häggström, O. and Maes, C.: The Random Geometry of Equilibrium Phases. To appear in a future volume of: Phase Transitions and Critical Phenomena, C. Domb and J. L. Lebowitz, eds., London, Boston, Tokyo: Academic Press
[GKR] Gandolfi, A., Keane, M. and Russo, L.: On the Uniqueness of the Infinite Occupied Cluster in Dependent Two-Dimensional Site Percolation. Ann. Prob. 16, 1147–1157 (1988)
[K] Kesten, H.: The Critical Probability of Bond Percolation on the Square Lattice Equals 1/2. Commun. Math. Phys. 74, 41–59 (1980)
[L] Liggett, T.M.: Interacting Particle Systems. Berlin, New York: Springer Verlag, 1985
[dN] den Nijs, M.P.M.: Unpublished
[O] Onsager, L.: Crystal Statistics I. A Two-Dimensional Model with an Order-Disorder Transition. Phys. Rev. 65, 117–149 (1944)
[PfV] Pfister, C.E. and Velenik, Y.: Random-Cluster Representation of the Ashkin–Teller Model. J. Stat. Phys. 88, 1295–1331 (1997)
[R] Russo, L.: On the Critical Percolation Probabilities. Z. Wahrsch. verw. Geb. 56, 229–237 (1981)
[SS] Salas, J. and Sokal, A. D.: Dynamic Critical Behavior of a Swendsen–Wang-Type Algorithm for the Ashkin–Teller Model. J. Stat. Phys. 85, 297–361 (1996)
[T] Thouless, D.: Phys. Rev. 187, 732 (1969)
Communicated by M. E. Fisher
Commun. Math. Phys. 204, 367 – 396 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Characterization of the Spectrum of the Landau Hamiltonian with Delta Impurities
T. C. Dorlas¹, N. Macris², J. V. Pulé³,⁴,*
1 Department of Mathematics, University of Wales Swansea, Singleton Park, Swansea SA2 8PP, Wales, UK.
E-mail: [email protected]
2 Institut de Physique Théorique, Ecole Polytechnique Fédérale de Lausanne, CH-1015 Lausanne,
Switzerland. E-mail: [email protected]
3 Department of Mathematical Physics, National University of Ireland, Dublin, (University College Dublin),
Belfield, Dublin 4, Ireland. E-mail: [email protected]
4 Research Associate, School of Theoretical Physics, Dublin Institute for Advanced Studies, Dublin, Ireland
Received: 1 June 1998 / Accepted: 29 January 1999
* This work was partially supported by the Forbairt (Ireland) International Collaboration Programme 1997.

Abstract: We consider a random Schrödinger operator in an external magnetic field. The random potential consists of delta functions of random strengths situated on the sites of a regular two-dimensional lattice. We characterize the spectrum in the lowest N Landau bands of this random Hamiltonian when the magnetic field is sufficiently strong, depending on N. We show that the spectrum in these bands is entirely pure point, that the energies coinciding with the Landau levels are infinitely degenerate and that the eigenfunctions corresponding to energies in the remainder of the spectrum are localized with a uniformly bounded localization length. By relating the Hamiltonian to a lattice operator we are able to use the Aizenman–Molchanov method to prove localization.

1. Introduction

Recently there has been progress in the theory of Anderson localization for two dimensional continuous models of an electron moving in a random potential and a uniform magnetic field ([1–4]). In these works it is established that the states at the edges of the Landau bands are exponentially localized and the corresponding energies form a pure point spectrum. However, the nature of the generalized eigenfunctions of the Schrödinger operator for energies near the centre of the Landau bands has not been established. A first step towards the resolution of this problem was made in [5] for a Hamiltonian restricted to the first Landau band with a random potential consisting of point impurities with random strength and located on the sites of a square lattice. There it was shown that, for a sufficiently strong magnetic field, all the eigenstates are localized except for a single energy at the centre of the band. This energy is an infinitely degenerate eigenvalue with probability one. In the present paper we extend the results of [5] to a similar model where the restriction to the lowest Landau band is removed. The technique used here is different and yields
much stronger results.

Formally the Hamiltonian of the electron is given by
\[ H = H_0 + \sum_{n} v_n\, \delta(r - n), \tag{1.1} \]
where
\[ H_0 = \frac{1}{2}\,(-i\nabla - A(r))^2, \tag{1.2} \]
$A(r) = \frac{1}{2}(r \times B)$ and $v_n$, the strengths of the impurities which are located on the sites $n$ of a two-dimensional square lattice, are i.i.d. random variables. It is well known that the definition of Hamiltonians with point scatterers in more than one dimension is delicate and requires a renormalization procedure. This is the subject of Sect. 2.

The main results of this paper are the following. Let $E_n = (n + \frac{1}{2})B$, $n = 0, 1, 2, \ldots$, be the Landau levels corresponding to the kinetic part, $H_0$, of the Hamiltonian. Given an integer $N$, there exists $B_0(N)$ such that for $B > B_0(N)$, the spectrum is completely characterized for energies $E < E_N$. We show that for $n = 0, 1, 2, \ldots, N - 1$, the Landau levels $E_n$ are infinitely degenerate eigenvalues of $H$ with probability one. All other energies in this part of the spectrum correspond to exponentially localized eigenfunctions with a localization length which is uniformly bounded as a function of the energy. Thus the localization length does not diverge at the centres of the bands when the magnetic field is strong enough, at least for the lower bands. Our analysis breaks down for energies greater than $E_N$ and in fact we expect a different behaviour for high energies.

There is an extensive literature on the problem of point scatterers with a magnetic field, but it appears that little is known on the rigorous level for the two-dimensional random case considered here. For the periodic case, that is, when all the $v_n$'s are identical, we refer the reader to the review [6] and the references therein. The case when the potential is periodic in the x-direction and random in the y-direction has been discussed recently in [7]. Finally the density of states for models similar to ours with a restriction to the first Landau level has been computed analytically in [8] (see also [9] which deals with the existence of Lifshitz tails). The infinite degeneracy of the Landau levels had already been noticed in various ways in the past ([10,8,11]).
For example in [8] it appears as a delta function in the density of states of the first level. The result suggests that it is in fact macroscopic, in other words, there is a positive density per unit volume. Our results characterize completely the rest of the spectrum and also give information about the localization length.

Let us say a few words about the method used to arrive at these results. The scatterers in (1.1) are similar to rank one perturbations of the kinetic energy so that by using the resolvent identity one can express the Green’s function corresponding to $H$ in terms of the Green’s function of the kinetic energy and a matrix which contains all the randomness. Thus the problem is reduced completely to the study of this random matrix which has random elements on the diagonal and rapidly decaying non-random off-diagonal elements. It turns out that the method invented by Aizenman and Molchanov [12] is very well suited to study the decay of eigenvectors of this matrix. These eigenvectors are related by an explicit formula to the eigenfunctions of $H$ in such a way that exponential decay of the former implies exponential decay of the latter. In fact it follows from the structure of the random matrix that, in the strong magnetic field regime, the off-diagonal elements are much smaller than the diagonal elements, and this is true even for energies near the band centres. Therefore our problem is analogous to the high disorder regime in the usual Anderson model and this is the reason why we have access to the whole spectrum.

It is instructive to discuss the physical implications of our results in the context of the quantum Hall effect. A basic ingredient used to explain the occurrence of plateaux in the Hall conductivity is the localization of electrons due to the random potential. This has been established in a mathematically precise way in [13] (see also [14]), by assuming the existence of localized states. Usually it is difficult to obtain quantitative results on the localization length. The Network Model of Chalker and Coddington [15] and numerical simulations [10] suggest that it is finite except at the band centres where it diverges like $|E - E_n|^{-\nu}$ with $\nu \approx 2.35$ for the first few $n$’s. In the Network Model one must work with smooth equipotential lines of the random potential so that it is difficult to compare to our situation. The model in this paper has been treated numerically only in a regime where the magnetic length, which is of the order of $B^{-1/2}$, is much greater than the average spacing between impurities. The regime covered by our analysis is such that the magnetic length is smaller than the average spacing between impurities, and we prove that there is no divergence in the localization length at least for the first few bands. One might think that this means that there is no quantum Hall effect in this regime. However this is not the case because the energy at the band centre is an infinitely degenerate eigenvalue. One can compute explicitly the eigenprojector associated to each degenerate eigenvalue and check that the corresponding Chern number is equal to unity [16]. From this result and the equivalence between Hall conductivity and Chern number, when the Fermi level lies in the region of localized states or in a spectral gap, we conclude that the Hall conductivity takes a non-zero quantized value equal to the number of Landau levels below the Fermi energy. This has been made mathematically precise in [13] (see also [14] and [17]).
The picture which emerges out of the combination of our analytical results with those of simulations is that in the present model one has to distinguish at least two regimes. In the first one, the magnetic length is much greater than the spacing between impurities: the localization length diverges and there is no degenerate eigenvalue at the band centres. In the second the magnetic length is much smaller than the spacing between impurities: the localization length does not diverge and there is a degenerate eigenvalue at the band centres. Whether there exists one or more intermediate regimes or not is an open question. It is instructive to note that in the model studied in [8], it turns out that, at the level of the density of states, one must also distinguish between various regimes, more than two in fact. Finally, we wish to stress that the quantized Hall plateaux exist in both regimes and that an interesting open question is whether the different behaviour of the localization length is reflected in the transition between two successive Hall plateaux.

The paper is organized as follows. In Sect. 2 we give the precise definition of the model and the Hamiltonian and also collect useful Green’s function identities. Our main theorem (Theorem 2.2) is stated at the end of this section. The infinite degeneracy of the first N (B) Landau levels is proved in Sect. 3 and the spectrum is characterized as a set. The connection between generalized eigenfunctions of $H$ and eigenvectors of the random matrix is established in Sect. 4. Finally, the Aizenman–Molchanov method is applied in Sect. 5, where the proof of our main theorem is completed. The appendices contain more technical material.

2. Definition of the Hamiltonian

In this section we define our Hamiltonian. It is well known that Hamiltonians with δ-function potentials in dimensions greater than one require renormalization. This was first done rigorously in [11]. The magnetic field case was developed in [6]. We refer the reader also to [18] though this does not deal explicitly with the case of a magnetic field.
Let $\omega_n$, $n \in \mathbb{Z}[i] \equiv \{n_1 + i n_2 : (n_1, n_2) \in \mathbb{Z}^2\}$, the Gaussian integers, be i.i.d. random variables. We shall assume that their distribution is given by an absolutely continuous probability measure $\mu_0$ whose support is an interval $X = [-a, a]$ with $0 < a < \infty$. We require that $\mu_0$ is symmetric about the origin and that its density $\rho_0$ is differentiable on $(-a, a)$ and satisfies the following condition:
\[ \sup_{\zeta \in (0,a)} \frac{\rho_0'(\zeta)}{\rho_0(\zeta)} < \infty. \tag{2.1} \]
These conditions on $\mu_0$ can be weakened, but we have chosen the above because they allow us to check the regularity of the distribution of $1/\omega_n$, in the sense of [12], very simply. We let $\Omega = X^{\mathbb{Z}[i]}$ and $\mathbb{P} = \prod_{n \in \mathbb{Z}[i]} \mu_0$. For $m \in \mathbb{Z}[i]$, let $\tau_m$ be the measure preserving automorphism of $\Omega$ defined by
\[ (\tau_m \omega)_n = \omega_{n-m}. \tag{2.2} \]
The group $\{\tau_m : m \in \mathbb{Z}[i]\}$ is ergodic for the probability measure $\mathbb{P}$. Let $\mathcal{H} = L^2(\mathbb{C})$ and let $H_0$ be the operator on $\mathcal{H}$ defined by
\[ H_0 = (1/8\kappa)(-i\nabla - A(z))^2 - 1/2, \tag{2.3} \]
where $A(z) = (-2\kappa\, \Im z,\ 2\kappa\, \Re z)$. Here $\kappa = B/4$ and $H_0$ is the same as the Hamiltonian in (1.2) apart from the multiplicative constant $1/8\kappa$ and the shift by $1/2$, which are inserted for convenience so that the Landau levels coincide with the set of non-negative integers, $\mathbb{N}_0$. Let $\mathcal{H}_m$ be the eigenspace corresponding to the $m$th Landau level of the Hamiltonian $H_0$ defined in (2.3) and let $P_m$ be the orthogonal projection onto $\mathcal{H}_m$. The projection $P_m$ is an integral operator with kernel
\[ P_m(z, z') = L_m(2\kappa|z - z'|^2)\, P_0(z, z'), \tag{2.4} \]
where $L_m$ is the Laguerre polynomial of order $m$ and
\[ P_0(z, z') = \frac{2\kappa}{\pi} \exp[-\kappa|z - z'|^2 - 2i\kappa\, z \wedge z'], \tag{2.5} \]
with $z \wedge z' = \Re z\, \Im z' - \Im z\, \Re z'$, $\Re z$ and $\Im z$ being the real and imaginary parts of $z$ respectively. For $\lambda \in \mathbb{C} \setminus \mathbb{N}_0$, let $G_0^{\lambda} = (H_0 - \lambda)^{-1}$, the resolvent of $H_0$ at $\lambda$. $G_0^{\lambda}$ has kernel (cf. [6])
\[ G_0^{\lambda}(z, z') = \Gamma(-\lambda)\, P_0(z, z')\, U(-\lambda, 1, 2\kappa|z - z'|^2), \tag{2.6} \]
where
\[ U(a, 1, \rho) = -\frac{1}{\Gamma(a)} \left[ M(a, 1, \rho) \ln\rho + \sum_{r=0}^{\infty} \frac{(a)_r}{(r!)^2}\, \rho^r \{\psi(a + r) - 2\psi(1 + r)\} \right] \tag{2.7} \]
is the logarithmic solution of Kummer’s equation ([19, Chap. 13]):
\[ \rho \frac{d^2 U}{d\rho^2} + (1 - \rho)\frac{dU}{d\rho} - aU = 0. \tag{2.8} \]
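Here $\Gamma$ is the Gamma function, $\psi(a) = \Gamma'(a)/\Gamma(a)$ is the Digamma function,
\[ (a)_r = a(a + 1)(a + 2) \cdots (a + r - 1), \qquad (a)_0 = 1, \tag{2.9} \]
and
\[ M(a, 1, \rho) = \sum_{r=0}^{\infty} \frac{(a)_r}{(r!)^2}\, \rho^r \tag{2.10} \]
is Kummer’s function. Let $\mathcal{M} = l^2(\mathbb{Z}[i])$ and for $\lambda \in \mathbb{C} \setminus \mathbb{N}_0$, define $U_\lambda : \mathcal{H} \to \mathcal{M}$ by
\[ \langle n | U_\lambda \phi \rangle = (G_0^{\lambda} \phi)(n). \tag{2.11} \]
From the bounds in Propositions 6.1 and 6.2 in Appendix A one can see that $U_\lambda$ is a bounded operator. Its adjoint $U_\lambda^* : \mathcal{M} \to \mathcal{H}$ is given by
\[ (U_\lambda^* \xi)(z) = \sum_{n \in \mathbb{Z}[i]} G_0^{\bar\lambda}(z, n)\, \langle n | \xi \rangle. \tag{2.12} \]
For $\lambda \in \mathbb{C} \setminus \mathbb{N}_0$ let
\[ c_n^{\lambda} = \frac{2\kappa}{\pi} \left( \psi(-\lambda) - \frac{2\pi}{\omega_n} \right) \tag{2.13} \]
and define the operators $D^\lambda$, $A^\lambda$ and $M^\lambda$ on $\mathcal{M}$ as follows. $D^\lambda$ is diagonal and
\[ \langle n | D^\lambda | n \rangle = c_n^{\lambda}, \tag{2.14} \]
\[ \langle n | A^\lambda | n' \rangle = \begin{cases} 0 & \text{if } n = n', \\ G_0^{\lambda}(n, n') & \text{if } n \ne n', \end{cases} \tag{2.15} \]
and
\[ M^\lambda = D^\lambda - A^\lambda. \tag{2.16} \]
Note that $D^\lambda$ is a closed operator on the domain
\[ D(D^\lambda) = \Big\{ \xi \in \mathcal{M} : \sum_{n \in \mathbb{Z}[i]} |c_n^{\lambda}|^2\, |\langle n | \xi \rangle|^2 < \infty \Big\}, \tag{2.17} \]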
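and $A^\lambda$ is bounded, therefore $M^\lambda$ is closed on $D(M^\lambda) = D(D^\lambda)$. Note also that $(M^\lambda)^* = M^{\bar\lambda}$ and that for $\lambda \in \mathbb{R}$, $M^\lambda$ is self-adjoint. For $\lambda \in \mathbb{C} \setminus \mathbb{N}_0$ such that $0 \notin \sigma(M^\lambda)$ let
\[ \Gamma^\lambda = (M^\lambda)^{-1}. \tag{2.18} \]
To define our Hamiltonian $H$ we use the following lemma:

Lemma 2.1. For each $\kappa > 0$, there exists $\lambda_\kappa \in \mathbb{C} \setminus \mathbb{R}$ such that $0 \notin \sigma(M^{\lambda_\kappa})$ and
\[ |\langle n | \Gamma^{\lambda_\kappa} | n' \rangle| \le K(\kappa)\, e^{-\kappa |n - n'|^{1/2}}. \tag{2.19} \]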
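Proof. Let $\lambda = -r(1 + i)$, with $r > 0$. By Proposition 6.1 in Appendix A, we have for $n, n' \in \mathbb{Z}[i]$, $n \ne n'$,
\[ |G_0^{\lambda}(n, n')| \le C_{r,\kappa}\, e^{-\kappa |n - n'|^2}, \tag{2.20} \]
where
\[ C_{r,\kappa} = C\kappa \left[ \frac{1}{r} + e^{-(2\kappa r)^{1/2}} (1 + |\ln(2\kappa)|) \right], \tag{2.21} \]
$C < \infty$ being a constant. Therefore $\|A^\lambda\| \le C_{r,\kappa} \|S\|$, where $S$ is the operator with matrix
\[ \langle n | S | n' \rangle = e^{-\kappa |n - n'|^2}. \tag{2.22} \]
Let $\tilde\Gamma^\lambda = (D^\lambda)^{-1}$, then
\[ \|A^\lambda \tilde\Gamma^\lambda\| \le \frac{\pi}{2\kappa}\, \frac{C_{r,\kappa} \|S\|}{|\Im\psi(-\lambda)|} < 1/2, \tag{2.23} \]
if $r$ is large enough. Note that by (6.3.18) in [19]
\[ \lim_{r \to \infty} \Im\psi(-\lambda) = \pi/4. \tag{2.24} \]
Then $\sum_{k=1}^{\infty} (A^\lambda \tilde\Gamma^\lambda)^k$ converges and consequently $M^\lambda$ is invertible,
\[ \Gamma^\lambda = \tilde\Gamma^\lambda \Big( I + \sum_{k=1}^{\infty} (A^\lambda \tilde\Gamma^\lambda)^k \Big) \tag{2.25} \]
and $\| I + \sum_{k=1}^{\infty} (A^\lambda \tilde\Gamma^\lambda)^k \| \le 2$. Clearly
\[ \langle n | A^\lambda \tilde\Gamma^\lambda | n' \rangle = \begin{cases} 0 & \text{for } n = n', \\ \dfrac{1}{c_{n'}^{\lambda}}\, G_0^{\lambda}(n, n') & \text{if } n \ne n'. \end{cases} \tag{2.26} \]
Thus
\[ |\langle n | A^\lambda \tilde\Gamma^\lambda | n' \rangle| \le B_{r,\kappa}\, e^{-\kappa |n - n'|^2} \le B_{r,\kappa}\, e^{-\kappa |n - n'|^{1/2}}, \tag{2.27} \]
where $B_{r,\kappa} = \frac{\pi}{2\kappa} \frac{C_{r,\kappa}}{|\Im\psi(-\lambda)|}$. Now, there exists a constant $c_0 < \infty$ such that for $\kappa > 1$ (see Lemma 3.3 in [5]),
\[ \sum_{n'' \in \mathbb{Z}[i]} e^{-\kappa |n - n''|^{1/2}}\, e^{-\kappa |n'' - n'|^{1/2}} \le c_0\, e^{-\kappa |n - n'|^{1/2}}. \tag{2.28} \]
This bound, together with (2.27), gives
\[ |\langle n | (A^\lambda \tilde\Gamma^\lambda)^k | n' \rangle| \le c_0^{k-1} B_{r,\kappa}^{k}\, e^{-\kappa |n - n'|^{1/2}}, \tag{2.29} \]
and thus from (2.25)
\[ |\langle n | \Gamma^\lambda | n' \rangle| \le K e^{-\kappa |n - n'|^{1/2}} \tag{2.30} \]
if $c_0 B_{r,\kappa} < \frac{1}{2}$. □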
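The mechanism of the proof (a dominant diagonal plus small, rapidly decaying off-diagonal elements, inverted by the Neumann series (2.25)) can be mimicked on a finite matrix. The sketch below is an illustration only: the six-site chain, the diagonal entries standing in for $c_n^\lambda$, and the off-diagonal prefactor 0.3 are all hypothetical choices, not values from the paper.

```python
import math


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def identity(n):
    return [[(1.0 + 0j) if i == j else 0j for j in range(n)] for i in range(n)]


def add(a, b):
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]


# Hypothetical toy data: 6 sites on a line, complex diagonal, Gaussian off-diagonal decay.
n_sites, kappa = 6, 1.5
diag = [complex(0.8 * (-1) ** n, 1.0) for n in range(n_sites)]  # stand-in for c_n
A = [[0j if i == j else complex(0.3 * math.exp(-kappa * (i - j) ** 2))
      for j in range(n_sites)] for i in range(n_sites)]
M = [[(diag[i] if i == j else 0j) - A[i][j] for j in range(n_sites)] for i in range(n_sites)]
gamma_tilde = [[(1 / diag[i]) if i == j else 0j for j in range(n_sites)] for i in range(n_sites)]

# Gamma = Gamma~ (I + sum_{k>=1} (A Gamma~)^k), truncated after 60 terms.
K = matmul(A, gamma_tilde)
series, power = identity(n_sites), identity(n_sites)
for _ in range(60):
    power = matmul(power, K)
    series = add(series, power)
gamma = matmul(gamma_tilde, series)

# Largest entry of M * Gamma - I: should be at rounding level if the series converged.
residual = max(abs(x - y)
               for row_m, row_i in zip(matmul(M, gamma), identity(n_sites))
               for x, y in zip(row_m, row_i))
print(residual, abs(gamma[0][1]), abs(gamma[0][5]))
```

In this toy example the row sums of $|A\tilde\Gamma|$ are well below 1/2, so the series converges quickly and the entries of the inverse decay away from the diagonal, in the spirit of (2.30).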
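For $\lambda \in \mathbb{C} \setminus \mathbb{N}_0$, we have the formula ([19] 6.3.16)
\[ \psi(-\lambda) = -\gamma - \sum_{m=0}^{\infty} \frac{(\lambda + 1)}{(m + 1)(m - \lambda)}, \tag{2.31} \]
where $\gamma$ is Euler’s constant. Thus if $\lambda_1, \lambda_2 \in \mathbb{C} \setminus \mathbb{N}_0$,
\[ \psi(-\lambda_1) - \psi(-\lambda_2) = (\lambda_2 - \lambda_1) \sum_{m=0}^{\infty} \frac{1}{(m - \lambda_1)(m - \lambda_2)}. \tag{2.32} \]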
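For the reader's convenience, the step from (2.31) to (2.32) can be written out termwise. This is a routine check using only the series (2.31), not an additional result:

```latex
\begin{align*}
\psi(-\lambda_1) - \psi(-\lambda_2)
 &= \sum_{m=0}^{\infty} \frac{1}{m+1}
    \left[ \frac{\lambda_2 + 1}{m - \lambda_2} - \frac{\lambda_1 + 1}{m - \lambda_1} \right] \\
 &= \sum_{m=0}^{\infty}
    \frac{(\lambda_2 + 1)(m - \lambda_1) - (\lambda_1 + 1)(m - \lambda_2)}
         {(m+1)(m - \lambda_1)(m - \lambda_2)} \\
 &= \sum_{m=0}^{\infty}
    \frac{(\lambda_2 - \lambda_1)(m + 1)}{(m+1)(m - \lambda_1)(m - \lambda_2)}
  = (\lambda_2 - \lambda_1) \sum_{m=0}^{\infty} \frac{1}{(m - \lambda_1)(m - \lambda_2)} .
\end{align*}
```

The constant $-\gamma$ cancels in the difference, and the numerator simplifies because the $\lambda_1\lambda_2$ terms cancel, leaving $m(\lambda_2 - \lambda_1) + (\lambda_2 - \lambda_1)$.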
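On the other hand we have
\[ G_0^{\lambda_1} G_0^{\lambda_2} = \sum_{m=0}^{\infty} \frac{P_m}{(m - \lambda_1)(m - \lambda_2)}, \tag{2.33} \]
and thus
\[ (G_0^{\lambda_1} G_0^{\lambda_2})(n, n) = \sum_{m=0}^{\infty} \frac{P_m(n, n)}{(m - \lambda_1)(m - \lambda_2)} = \frac{2\kappa}{\pi} \sum_{m=0}^{\infty} \frac{1}{(m - \lambda_1)(m - \lambda_2)}. \tag{2.34} \]
Therefore
\[ \langle n | M^{\lambda_1} - M^{\lambda_2} | n \rangle = \frac{2\kappa}{\pi} \{\psi(-\lambda_1) - \psi(-\lambda_2)\} = (\lambda_2 - \lambda_1)(G_0^{\lambda_1} G_0^{\lambda_2})(n, n). \tag{2.35} \]
On the other hand, for $n \ne n'$, using the resolvent identity, we get
\[ \langle n | M^{\lambda_1} - M^{\lambda_2} | n' \rangle = G_0^{\lambda_2}(n, n') - G_0^{\lambda_1}(n, n') = (\lambda_2 - \lambda_1)(G_0^{\lambda_1} G_0^{\lambda_2})(n, n'). \tag{2.36} \]
Therefore combining the two identities (2.35) and (2.36) we obtain
\[ M^{\lambda_1} - M^{\lambda_2} = (\lambda_2 - \lambda_1)\, U_{\lambda_2} U_{\bar\lambda_1}^*. \tag{2.37} \]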
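It is clear from this equation that $U_{\lambda_2} U_{\bar\lambda_1}^* = U_{\lambda_1} U_{\bar\lambda_2}^*$.

Note that $H_0$ is essentially self-adjoint on $\mathcal{S}(\mathbb{C})$ ([20, Theorem X.34]). Define $V_\kappa : \mathcal{S}(\mathbb{C}) \to \mathcal{H}$ by $V_\kappa = U_{\bar\lambda_\kappa}^* \Gamma^{\lambda_\kappa} T$, where $\langle n | T\psi \rangle = \psi(n)$. Let
\[ D(H) = \{\phi = \psi + V_\kappa \psi : \psi \in \mathcal{S}(\mathbb{C})\}, \tag{2.38} \]
and for $\phi \in D(H)$
\[ H\phi = H_0 \psi + \lambda_\kappa V_\kappa \psi. \tag{2.39} \]
This definition implies that $(H - \lambda_\kappa)\phi = (H_0 - \lambda_\kappa)\psi$, and therefore since $H_0$ is essentially self-adjoint on $\mathcal{S}(\mathbb{C})$, $\mathrm{Ran}(H - \lambda_\kappa)$ is dense in $\mathcal{H}$. Let $\psi' \in \mathcal{S}(\mathbb{C})$ and let $\psi = \psi' + (\bar\lambda_\kappa - \lambda_\kappa)\, G_0^{\lambda_\kappa} U_{\lambda_\kappa}^* \Gamma^{\bar\lambda_\kappa} T\psi'$. Then $\psi \in \mathcal{S}(\mathbb{C})$ and $T\psi = M^{\lambda_\kappa} \Gamma^{\bar\lambda_\kappa} T\psi'$. Note that $0 \notin \sigma(M^{\bar\lambda_\kappa})$ and $|\langle n | \Gamma^{\bar\lambda_\kappa} | n' \rangle| = |\langle n' | \Gamma^{\lambda_\kappa} | n \rangle| \le K(\kappa)\, e^{-\kappa |n - n'|^{1/2}}$. Let $\phi = \psi + V_\kappa \psi$. Then
\begin{align*}
(H - \bar\lambda_\kappa)\phi &= (H_0 - \bar\lambda_\kappa)\psi + (\lambda_\kappa - \bar\lambda_\kappa) V_\kappa \psi \\
&= (H_0 - \bar\lambda_\kappa)\psi' - (\lambda_\kappa - \bar\lambda_\kappa)\, G_0^{\lambda_\kappa} (H_0 - \bar\lambda_\kappa) U_{\lambda_\kappa}^* \Gamma^{\bar\lambda_\kappa} T\psi' \\
&\quad + (\lambda_\kappa - \bar\lambda_\kappa) V_\kappa \psi' - (\lambda_\kappa - \bar\lambda_\kappa)^2\, V_\kappa G_0^{\lambda_\kappa} U_{\lambda_\kappa}^* \Gamma^{\bar\lambda_\kappa} T\psi' \\
&= (H_0 - \bar\lambda_\kappa)\psi' - (\lambda_\kappa - \bar\lambda_\kappa)\, U_{\bar\lambda_\kappa}^* \Gamma^{\bar\lambda_\kappa} T\psi' + (\lambda_\kappa - \bar\lambda_\kappa) V_\kappa \psi' \\
&\quad - (\lambda_\kappa - \bar\lambda_\kappa)^2\, U_{\bar\lambda_\kappa}^* \Gamma^{\lambda_\kappa} U_{\lambda_\kappa} U_{\lambda_\kappa}^* \Gamma^{\bar\lambda_\kappa} T\psi' \\
&= (H_0 - \bar\lambda_\kappa)\psi' - (\lambda_\kappa - \bar\lambda_\kappa)\, U_{\bar\lambda_\kappa}^* \Gamma^{\bar\lambda_\kappa} T\psi' + (\lambda_\kappa - \bar\lambda_\kappa) V_\kappa \psi' \\
&\quad - (\lambda_\kappa - \bar\lambda_\kappa)\, U_{\bar\lambda_\kappa}^* \Gamma^{\lambda_\kappa} (M^{\bar\lambda_\kappa} - M^{\lambda_\kappa}) \Gamma^{\bar\lambda_\kappa} T\psi' \\
&= (H_0 - \bar\lambda_\kappa)\psi'.
\end{align*}
Therefore $\mathrm{Ran}(H - \bar\lambda_\kappa)$ is dense in $\mathcal{H}$ and $H$ is essentially self-adjoint on $D(H)$. For $\lambda \in \mathbb{C} \setminus \mathbb{N}_0$ such that $0 \notin \sigma(M^\lambda)$, define
\[ G^\lambda \equiv G_0^{\lambda} + U_{\bar\lambda}^* \Gamma^\lambda U_\lambda. \tag{2.40} \]
One can check using the resolvent identity and identity (2.37) that ([6], see also [18])
\[ G^\lambda (H - \lambda)\phi = \phi, \tag{2.41} \]
so that
\[ G^\lambda = (H - \lambda)^{-1}. \tag{2.42} \]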
We now state the main theorem of this paper. (a) is proved in Lemma 3.2, (c) in Lemma 3.1 and (b) and (d) in Theorem 5.8.

Theorem 2.2. (a) The spectrum of $H$ contains bands around the Landau levels $\mathbb{N}_0$ and an interval extending from $-\infty$ to a finite negative point. For each $N \in \mathbb{N}$ there exists $\kappa_0 > 0$ such that for $\kappa > \kappa_0$, with probability one,
(b) $\sigma_{\mathrm{cont}}(H) \cap (-\infty, N) = \emptyset$,
(c) if $m \in \mathbb{N}_0 \cap (-\infty, N)$, then $m$ is an eigenvalue of $H$ with infinite multiplicity,
(d) if $\lambda \in \sigma(H) \cap (-\infty, N) \setminus \mathbb{N}_0$ is an eigenvalue of $H$ and the corresponding eigenfunction is $\phi_\lambda$, then for any compact subset $B$ of $\mathbb{C}$, $\int_B |\phi_\lambda(z - z')|^2\, dz'$ decays exponentially in $z$ with exponential length less than or equal to $2/\kappa$.

3. The Spectrum

In this section we study the spectrum of the Hamiltonian. We first show that the Landau levels are still infinitely degenerate eigenvalues. We then prove that the spectrum contains bands around the Landau levels and an infinite interval in the negative half-line. Let $\{U_z : z \in \mathbb{C}\}$ be the family of unitary operators on $\mathcal{H}$ corresponding to the magnetic translations:
\[ (U_z f)(z') = e^{2i\kappa\, z \wedge z'} f(z + z'). \tag{3.1} \]
These satisfy $U_{z_1} U_{z_2} = e^{2i\kappa\, z_2 \wedge z_1}\, U_{z_1 + z_2}$. For $n \in \mathbb{Z}[i]$,
\[ U_n\, G^\lambda(\omega)\, U_n^{-1} = G^\lambda(\tau_n \omega). \tag{3.2} \]
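The composition rule for the magnetic translations quoted after (3.1) can be checked numerically at a point. The following sketch does this for one arbitrary test function; the value of κ, the test function and the sample points are illustrative assumptions.

```python
import cmath

KAPPA = 0.7  # illustrative value of kappa (assumption)


def wedge(z, w):
    """z ^ w = Re z Im w - Im z Re w."""
    return z.real * w.imag - z.imag * w.real


def translate(z, f):
    """Return U_z f, with (U_z f)(w) = exp(2 i kappa z^w) f(z + w) as in (3.1)."""
    return lambda w: cmath.exp(2j * KAPPA * wedge(z, w)) * f(z + w)


def f(w):
    """Arbitrary smooth, decaying test function (not from the paper)."""
    return (1 + w + 0.5j * w * w) * cmath.exp(-abs(w) ** 2)


z1, z2, w = 0.3 + 1.2j, -0.8 + 0.4j, 0.5 - 0.25j
lhs = translate(z1, translate(z2, f))(w)                              # (U_{z1} U_{z2} f)(w)
rhs = cmath.exp(2j * KAPPA * wedge(z2, z1)) * translate(z1 + z2, f)(w)
print(abs(lhs - rhs))  # should vanish up to rounding
```

The check relies only on the bilinearity and antisymmetry of the wedge product, so it holds for any choice of test function and sample points.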
The ergodicity of $\{\tau_m : m \in \mathbb{Z}[i]\}$ and Eq. (3.2) together imply that the spectrum of $H(\omega)$ and its components are non random (see for example [21], Theorem V.2.4). We shall first prove that almost surely the lower Landau levels are infinitely degenerate eigenvalues for large $\kappa$. This lemma is a generalization of similar results in [5] and [22]. The main idea of the proof is to construct states in $\mathcal{H}_m$ which vanish at all the impurity sites, so that they are also eigenfunctions of $H$. These states involve the entire function in (3.4) which vanishes at all the points of $\mathbb{Z}[i]$ and consequently grows like $e^{A|z|^2}$ for large $|z|$. The condition that the states are square integrable then requires that the magnetic field be sufficiently large in order to compensate this growth by the factor $e^{-\kappa|z|^2}$.

Lemma 3.1. For each $N \in \mathbb{N}$, there exists $\kappa_0(N) > 0$, such that for $\kappa > \kappa_0$, with probability one, each Landau level $m$, with $m \le N$, is infinitely degenerate.

Proof. The elements of the space $\mathcal{H}_0$ are of the form
\[ \phi(z) = \psi(z)\, e^{-\kappa|z|^2}, \tag{3.3} \]
where $\psi$ is an entire function and, of course, $\phi \in L^2(\mathbb{C})$. Let
\[ \psi_0(z) = z \prod_{n \in \mathbb{Z}[i] \setminus \{0\}} \left(1 - \frac{z}{n}\right) e^{\frac{z}{n} + \frac{z^2}{2n^2}}. \tag{3.4} \]
Then $\psi_0$ is an entire function with zeros at all the points of $\mathbb{Z}[i]$. It follows from the theory of entire functions (see [23, 2.10.1]) that there exists $A > 0$ such that $|\psi_0(z)| \le e^{A|z|^2}$. For $k \in \mathbb{N}_0$, let
\[ \phi_{0,k}(z) = z^k \psi_0(z)\, e^{-\kappa|z|^2}, \tag{3.5} \]
then, if $\kappa > A$, $\phi_{0,k} \in \mathcal{H}_0$ and since $V_\kappa \phi_{0,k} = 0$, $H\phi_{0,k} = 0$. Also if for $M \in \mathbb{N}_0$, $\sum_{k=0}^{M} b_k \phi_{0,k} = 0$, then $\sum_{k=0}^{M} b_k z^k = 0$ for $z \notin \mathbb{Z}[i]$. Therefore $\sum_{k=0}^{M} b_k z^k \equiv 0$ and thus the $b_k$’s are zero, implying that the $\phi_{0,k}$’s are linearly independent. So the $\phi_{0,k}$’s form an infinite linearly independent set of eigenfunctions of $H$ with eigenvalue 0. For the higher levels we modify this argument with the use of the creation and annihilation operators for the Hamiltonian $H_0$, $a^*$ and $a$, defined by
\[ a^* = \frac{1}{\sqrt{2\kappa}} \left( -\frac{\partial}{\partial z} + \kappa \bar z \right) \quad \text{and} \quad a = \frac{1}{\sqrt{2\kappa}} \left( \frac{\partial}{\partial \bar z} + \kappa z \right). \]
These operators satisfy the commutation relation $[a, a^*] = 1$. Also if $\phi \in \mathcal{H}_m$ then $a^*\phi \in \mathcal{H}_{m+1}$ and $a\phi \in \mathcal{H}_{m-1}$ except when $m = 0$, in which case $a\phi = 0$. For $m \le N$ and $k \in \mathbb{N}_0$, let
\[ \tilde\phi_{m,k}(z) = z^k (\psi_0(z))^{m+1}\, e^{-\kappa|z|^2}, \tag{3.6} \]
then, if $\kappa > A(N + 1)$, $\tilde\phi_{m,k} \in \mathcal{H}_0$. Now let $\phi_{m,k} = (a^*)^m \tilde\phi_{m,k}$. Then $\phi_{m,k} \in \mathcal{H}_m$ and $\phi_{m,k}(n) = 0$ for all $n \in \mathbb{Z}[i]$ since $\tilde\phi_{m,k}$ has a zero of order greater than $m$ at each point
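of $\mathbb{Z}[i]$. Therefore since $V_\kappa \phi_{m,k} = 0$, $H\phi_{m,k} = m\phi_{m,k}$. Moreover since $[a, a^*] = 1$ and $a\tilde\phi_{m,k} = 0$, $a^m \phi_{m,k} = m!\, \tilde\phi_{m,k}$. So, if for $M \in \mathbb{N}_0$, $\sum_{k=0}^{M} b_k \phi_{m,k} = 0$, then
\[ \sum_{k=0}^{M} b_k \tilde\phi_{m,k} = (m!)^{-1} a^m \left( \sum_{k=0}^{M} b_k \phi_{m,k} \right) = 0. \tag{3.7} \]
This means that $\sum_{k=0}^{M} b_k z^k = 0$ for $z \notin \mathbb{Z}[i]$ and as for $m = 0$ it follows that the $\phi_{m,k}$’s form an infinite linearly independent set of eigenfunctions of $H$ with eigenvalue $m$. □

In the case of one impurity of strength $\omega$ at the origin, the Green’s function is given by
\[ G^\lambda = G_0^{\lambda} + \frac{1}{c^\lambda}\, G_0^{\lambda}(\cdot, 0)\, G_0^{\lambda}(0, \cdot), \tag{3.8} \]
where
\[ c^\lambda = \frac{2\kappa}{\pi} \left( \psi(-\lambda) - \frac{2\pi}{\omega} \right). \tag{3.9} \]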
It is clear that in this case the spectrum consists of Landau levels and the values of $\lambda$ for which $c^\lambda = 0$. For small $\omega$ the latter correspond to points close to the Landau levels and in the case of $\omega > 0$, there is another point which is negative and of the order of $\exp(2\pi/|\omega|)$. In the next lemma we shall show that in our case these points are also in the spectrum in the sense that the spectrum of our Hamiltonian contains bands around the Landau levels and an interval extending from $-\infty$ to a finite negative point.

Fig. 3.1. $\lambda \mapsto \psi(-\lambda)$
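The location of the zeros of $c^\lambda$ in the one-impurity case can be explored numerically. The sketch below solves $\psi(-\lambda) = 2\pi/\omega$ for real $\lambda$ by bisection, using a hand-written digamma routine (recurrence plus the standard asymptotic expansion) so that only the standard library is needed. The function names, the restriction to $\omega > 0$ and the sample strengths are assumptions made for illustration.

```python
import math


def digamma(x):
    """Digamma function for real x that is not a non-positive integer."""
    result = 0.0
    while x < 8.0:  # recurrence psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)  # asymptotic expansion for large x
    result += math.log(x) - 0.5 / x - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result


def bisect(func, lo, hi, iterations=200):
    """Plain bisection; assumes func(lo) and func(hi) have opposite signs."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def impurity_levels(omega, n_levels=3):
    """Zeros of psi(-lambda) - 2 pi / omega for omega > 0.

    Returns the zeros in (m, m + 1) for m = 0, ..., n_levels - 1 and the negative zero.
    """
    target = 2.0 * math.pi / omega
    g = lambda lam: digamma(-lam) - target
    eps = 1e-9
    near_levels = [bisect(g, m + eps, m + 1 - eps) for m in range(n_levels)]
    negative = bisect(g, -2.0 * math.exp(target), -eps)
    return near_levels, negative


weak_roots, _ = impurity_levels(0.1)
print(weak_roots)                     # each close to m + omega / (2 pi)
_, deep_root = impurity_levels(1.0)
print(deep_root, -math.exp(2.0 * math.pi))
```

For weak impurities the zeros sit just above the Landau levels because $\psi(-\lambda)$ behaves like $1/(\lambda - m)$ near $\lambda = m$; the negative zero follows from $\psi(x) \approx \ln x$ for large $x$.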
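Let $Y = \{2\pi/x : x \in X \setminus \{0\}\}$.

Lemma 3.2. With probability one
\[ -\psi^{-1}(Y) \subset \sigma(H(\omega)). \tag{3.10} \]
Proof. It is sufficient to prove that for each $\lambda \in -\psi^{-1}(Y)$ and for all $\epsilon > 0$, there exists $\Omega'$ with $\mathbb{P}(\Omega') > 0$ and $\psi \in \mathcal{H}$ with $\|\psi\| = 1$ such that for all $\omega \in \Omega'$, $\|\{G^{\lambda_\kappa}(\omega) - (\lambda - \lambda_\kappa)^{-1}\}\psi\| < \epsilon$. Let $\langle v | n \rangle = \delta_{n0}$ and let $\psi = C\, U_{\bar\lambda}^* v$, where $C^{-2} = (2\kappa/\pi) \sum_{m=0}^{\infty} (m - \lambda)^{-2}$. Note that $\psi(z) = C\, G_0^{\lambda}(z, 0)$ and $\|\psi\| = 1$ by (2.34). Then
\begin{align*}
\{G^{\lambda_\kappa} - (\lambda - \lambda_\kappa)^{-1}\}\psi
&= \{G_0^{\lambda_\kappa} + U_{\bar\lambda_\kappa}^* \Gamma^{\lambda_\kappa} U_{\lambda_\kappa} + (\lambda_\kappa - \lambda)^{-1}\}\psi \\
&= (\lambda_\kappa - \lambda)^{-1} C \{(\lambda_\kappa - \lambda)\, G_0^{\lambda_\kappa} U_{\bar\lambda}^* - U_{\bar\lambda_\kappa}^* \Gamma^{\lambda_\kappa} (M^{\lambda_\kappa} - M^{\lambda}) + U_{\bar\lambda}^*\} v \\
&= (\lambda_\kappa - \lambda)^{-1} C \{U_{\bar\lambda_\kappa}^* - U_{\bar\lambda}^* - U_{\bar\lambda_\kappa}^* + U_{\bar\lambda_\kappa}^* \Gamma^{\lambda_\kappa} M^{\lambda} + U_{\bar\lambda}^*\} v \\
&= (\lambda_\kappa - \lambda)^{-1} C\, U_{\bar\lambda_\kappa}^* \Gamma^{\lambda_\kappa} M^{\lambda} v.
\end{align*}
By using (2.37) we get
\begin{align*}
\|\{G^{\lambda_\kappa}(\omega) - (\lambda - \lambda_\kappa)^{-1}\}\psi\|^2
&= C^2 |\lambda - \lambda_\kappa|^{-2} (\Im\lambda_\kappa)^{-1}\, \Im\langle M^{\lambda} v, \Gamma^{\lambda_\kappa} M^{\lambda} v \rangle \\
&\le 2 C^2 |\lambda - \lambda_\kappa|^{-2} |\Im\lambda_\kappa|^{-1}\, \|M^{\lambda} v\|\, \|\tilde\Gamma^{\bar\lambda_\kappa} M^{\lambda} v\|
\end{align*}
by (2.25). Choose $R$ such that $\sum_{|n| > R} |G_0^{\lambda}(n, 0)|^2 < \delta$, and let
\[ \Omega' = \Big\{\omega : |c_0^{\lambda}| < \delta,\ \min_{|n| \le R,\ n \ne 0} |c_n^{\lambda_\kappa}| > 1/\delta \Big\}. \tag{3.11} \]
Since $\psi(-\lambda) \in Y$ and 0 is in the support of $\mu_0$, $\mathbb{P}(\Omega') > 0$. We have
\[ \langle n | M^{\lambda} v \rangle = \begin{cases} c_0^{\lambda}, & \text{if } n = 0, \\ -G_0^{\lambda}(n, 0), & \text{if } n \ne 0. \end{cases} \tag{3.12} \]
Therefore
\[ \|M^{\lambda} v\|^2 \le \delta^2 + \sum_{n \ne 0} |G_0^{\lambda}(n, 0)|^2 \tag{3.13} \]
and
\begin{align*}
\|\tilde\Gamma^{\bar\lambda_\kappa} M^{\lambda} v\|^2
&= |c_0^{\lambda}|^2 |c_0^{\lambda_\kappa}|^{-2} + \sum_{n \ne 0} |c_n^{\lambda_\kappa}|^{-2} |G_0^{\lambda}(n, 0)|^2 \\
&\le \delta^2 (\pi/2\kappa)^2 |\Im\psi(-\lambda_\kappa)|^{-2} + \delta^2 \sum_{|n| \le R} |G_0^{\lambda}(n, 0)|^2 + (\pi/2\kappa)^2 |\Im\psi(-\lambda_\kappa)|^{-2} \sum_{|n| > R} |G_0^{\lambda}(n, 0)|^2 \\
&\le \delta^2 (\pi/2\kappa)^2 |\Im\psi(-\lambda_\kappa)|^{-2} + \delta^2 \sum_{n \ne 0} |G_0^{\lambda}(n, 0)|^2 + \delta (\pi/2\kappa)^2 |\Im\psi(-\lambda_\kappa)|^{-2}.
\end{align*}
Thus $\|\{G^{\lambda_\kappa}(\omega) - (\lambda - \lambda_\kappa)^{-1}\}\psi\| < \epsilon$ if $\delta$ is small enough. □

In the next section we relate the generalized eigenvectors of $H$ with those of $M^\lambda$.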
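4. Generalized Eigenfunctions of H

In this section we show that a generalized eigenfunction of $H$ with eigenvalue $\lambda$, say, which is not a Landau level, is related in a simple way to an eigenvector $v$ of $M^\lambda$ with eigenvalue zero. Furthermore if $v$ decays then so does the corresponding eigenfunction. Since this reduces the problem to a lattice problem, it makes it possible for us to use the Aizenman–Molchanov method.

Proposition 4.1. If $\phi$ is a generalized eigenfunction of $H$ with eigenvalue $\lambda \notin \mathbb{N}_0$, then $v = \Gamma^{\lambda_\kappa} U_{\lambda_\kappa}\phi$ is a generalized eigenvector of $M^\lambda$ with eigenvalue zero and $\phi = (\lambda-\lambda_\kappa)\,U_\lambda^* v$. Moreover if $v$ decays exponentially, then for any compact subset $B$ of $\mathbb{C}$, $\int_B |\phi(z-z')|^2\,dz'$ decays exponentially in $z$.

Proof. Suppose $\phi$ is a generalized eigenvector of $H$ with eigenvalue $\lambda$. Then
\[ G^{\lambda_\kappa}\phi = (\lambda-\lambda_\kappa)^{-1}\phi \tag{4.1} \]
or
\[ G_0^{\lambda_\kappa}\phi + U^*_{\bar\lambda_\kappa}\Gamma^{\lambda_\kappa}U_{\lambda_\kappa}\phi = (\lambda-\lambda_\kappa)^{-1}\phi. \tag{4.2} \]
Thus
\[ U_\lambda G_0^{\lambda_\kappa}\phi + U_\lambda U^*_{\bar\lambda_\kappa}\Gamma^{\lambda_\kappa}U_{\lambda_\kappa}\phi = (\lambda-\lambda_\kappa)^{-1}U_\lambda\phi. \tag{4.3} \]
Using $U_\lambda G_0^{\lambda_\kappa} = (\lambda-\lambda_\kappa)^{-1}(U_\lambda - U_{\lambda_\kappa})$, we get
\[ U_\lambda U^*_{\bar\lambda_\kappa}\Gamma^{\lambda_\kappa}U_{\lambda_\kappa}\phi = (\lambda-\lambda_\kappa)^{-1}U_{\lambda_\kappa}\phi, \tag{4.4} \]
which by (2.37) can be written in the form
\[ M^\lambda\,\Gamma^{\lambda_\kappa}U_{\lambda_\kappa}\phi = 0. \tag{4.5} \]
Therefore if $v = \Gamma^{\lambda_\kappa}U_{\lambda_\kappa}\phi$,
\[ M^\lambda v = 0. \tag{4.6} \]
From (4.2) we get
\[ (\lambda-\lambda_\kappa)\,G_0^{\lambda_\kappa}\phi + (\lambda-\lambda_\kappa)\,U^*_{\bar\lambda_\kappa}v = \phi. \tag{4.7} \]
Thus
\[ (\lambda-\lambda_\kappa)\,G_0^{\lambda}G_0^{\lambda_\kappa}\phi + (\lambda-\lambda_\kappa)\,G_0^{\lambda}U^*_{\bar\lambda_\kappa}v = G_0^{\lambda}\phi. \tag{4.8} \]
By using the resolvent identity we can write this as
\[ U_\lambda^* v = G_0^{\lambda_\kappa}\phi + U^*_{\bar\lambda_\kappa}v, \tag{4.9} \]
and therefore $\phi = (\lambda-\lambda_\kappa)\,U_\lambda^* v$ by (4.7). From Propositions 6.1 and 6.2 in Appendix A we get for $\lambda \notin \mathbb{N}_0$,
\[ |G_0^\lambda(z,z')| < C e^{-\frac{\kappa}{2}|z-z'|^2}\Bigl(1 + 1_{B(0,1/\sqrt{2\kappa})}(|z-z'|)\,\bigl|\ln|z-z'|\bigr|\Bigr), \tag{4.10} \]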
Landau Hamiltonian with Delta Impurities
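where $C$ depends on $\lambda$ and $\kappa$. From the equation
\[ \phi(z) = (\lambda-\lambda_\kappa)\sum_{n\in\mathbb{Z}[i]} G_0^\lambda(z,n)\,\langle n|v\rangle, \tag{4.11} \]
we get, assuming $|\langle n|v\rangle| \le C' e^{-\alpha|n|}$, that
\[
\begin{aligned}
|\phi(z)| &\le |\lambda-\lambda_\kappa|\,C' \sum_{n\in\mathbb{Z}[i]} |G_0^\lambda(z,n)|\,e^{-\alpha|n|} \\
&\le C'' \sum_{n\in\mathbb{Z}[i]} e^{-\frac{\kappa}{2}|z-n|^2} e^{-\alpha|n|}
 + C'' \sum_{n\in\mathbb{Z}[i]} e^{-\frac{\kappa}{2}|z-n|^2} e^{-\alpha|n|}\, 1_{B(0,1/\sqrt{2\kappa})}(|z-n|)\,\bigl|\ln|z-n|\bigr| \\
&= S_1 + S_2 .
\end{aligned}
\]
Now
\[
\begin{aligned}
S_1 &\le C'' \sum_{|n-z|\ge 1} e^{-\frac{\kappa}{2}|z-n|} e^{-\alpha|n|} + C'' e^{\alpha} e^{-\alpha|z|} \\
&\le C'' e^{-\beta|z|} \sum_{n\in\mathbb{Z}[i]} e^{-\beta|n|} + C'' e^{\alpha} e^{-\alpha|z|},
\end{aligned}
\]
where $\beta = \tfrac14 \min(\kappa, 2\alpha)$. Thus, $S_1 \le C''' e^{-\beta|z|}$. Similarly
\[ S_2 \le C''' e^{-\beta|z|} \sum_{n\in\mathbb{Z}[i]} e^{-\beta|n|}\, 1_{B(0,1/\sqrt{2\kappa})}(|z-n|)\,\bigl|\ln|z-n|\bigr|. \tag{4.12} \]
Therefore
\[ |\phi(z)|^2 \le C e^{-2\beta|z|}\Bigl(1 + 3\sum_{n\in\mathbb{Z}[i]} e^{-\beta|n|}\, 1_{B(0,1/\sqrt{2\kappa})}(|z-n|)\,\bigl|\ln|z-n|\bigr|^2\Bigr). \tag{4.13} \]
Let $B \subset \mathbb{C}$ be compact and let $R = \sup\{|z| : z \in B\}$. Then for $z' \in B$,
\[ |\phi(z-z')|^2 \le C e^{2\beta R} e^{-2\beta|z|}\Bigl(1 + 3\sum_{n\in\mathbb{Z}[i]} e^{-\beta|n|}\, 1_{B(0,1/\sqrt{2\kappa})}(|z-z'-n|)\,\bigl|\ln|z-z'-n|\bigr|^2\Bigr). \tag{4.14} \]
Therefore
\[ \int_B |\phi(z-z')|^2\,dz' \le C e^{2\beta R} e^{-2\beta|z|}\Bigl(|B| + 3\sum_{n\in\mathbb{Z}[i]} e^{-\beta|n|} \int_{|z'|<1/\sqrt{2\kappa}} \bigl|\ln|z'|\bigr|^2\,dz'\Bigr). \tag{4.15} \]
□

We do not dwell on the existence of the generalized eigenfunctions. It suffices to say that the arguments of Theorem II.4.5 in [21] can be used with $e^{-tH}$ replaced by $G^{\lambda_\kappa}$ since from the bound in Lemma 2.1 and the bounds in Appendix A for $|G_0^\lambda(z,z')|$ it follows that
\[ \sup_z \int_{\mathbb{C}} |G_0^{\lambda_\kappa}(z,z')|^2\,dz' < \infty. \tag{4.16} \]
The same bounds guarantee also that $v$ is a generalized eigenvector of $M^\lambda$. In the next section we apply the Aizenman–Molchanov method to the lattice operator $M^\lambda$.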
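The decay mechanism behind the bound on $S_1$ can be illustrated numerically. The following is a minimal sketch, not part of the original argument: it evaluates a truncated version of the Gaussian lattice sum $\sum_{n\in\mathbb{Z}[i]} e^{-\frac{\kappa}{2}|z-n|^2}e^{-\alpha|n|}$ and rescales it by $e^{\beta|z|}$ with $\beta = \tfrac14\min(\kappa,2\alpha)$. The values of `kappa` and `alpha`, the truncation `half_width` and the sample points are illustrative assumptions only.

```python
import math

def lattice_sum(z, kappa, alpha, half_width=30):
    """Truncated sum over Gaussian integers n = p + iq of
    exp(-kappa/2 * |z - n|^2) * exp(-alpha * |n|)."""
    total = 0.0
    for p in range(-half_width, half_width + 1):
        for q in range(-half_width, half_width + 1):
            n = complex(p, q)
            total += math.exp(-0.5 * kappa * abs(z - n) ** 2 - alpha * abs(n))
    return total

kappa, alpha = 2.0, 1.0                 # illustrative values, not from the paper
beta = 0.25 * min(kappa, 2 * alpha)     # decay rate used in the text

# Sample points along a ray in the complex plane (hypothetical choice).
points = [complex(0.7 * k, 0.3 * k) for k in range(15)]

# If the sum decays at least like exp(-beta |z|), these values stay bounded.
rescaled = [lattice_sum(z, kappa, alpha) * math.exp(beta * abs(z)) for z in points]
print(max(rescaled))
```

With these sample parameters the rescaled values remain bounded along the ray, which is the behaviour the estimate $S_1 \le C''' e^{-\beta|z|}$ predicts.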
5. An Application of the Aizenman–Molchanov Method

In this section we apply the Aizenman–Molchanov method to $M^\lambda$, where $\lambda$ is not a Landau level. The main ingredient in this method is the Decoupling Principle for $\tau$-regular measures. We start by stating this principle, not in its full generality but in the form in which it will be used here.

Definition. A measure $\mu$ on $\mathbb{R}$ is said to be $\tau$-regular, with $\tau \in (0,1]$, if there exists $\nu > 0$ and $C < \infty$ such that
\[ \mu([x-\delta, x+\delta]) \le C\delta^\tau \mu([x-\nu, x+\nu]) \tag{5.1} \]
for all $x \in \mathbb{R}$ and $0 < \delta < 1$.

Lemma 5.1 (A Decoupling Principle). Let $\mu$ be a $\tau$-regular measure and let $\int |u|^\varepsilon \mu(du) < \infty$ for some $\varepsilon > 0$. Then for all $0 < s < \min(\tau, \varepsilon)$ there exists $\xi_s$, a positive, increasing function on $\mathbb{R}_+$ with $\xi_s(0) > 0$ satisfying
\[ \lim_{x\to\infty} \frac{\xi_s(x)}{x} = 1, \tag{5.2} \]
such that for all $\eta$, $a$ and $b \in \mathbb{C}$,
\[ \int |u-\eta|^s\,|au+b|^{-s}\,\mu(du) \ge (\xi_s(|\eta|))^s \int |au+b|^{-s}\,\mu(du). \tag{5.3} \]

Let $\mu(A) = \mu_0(\{\omega : 1/\omega \in A\})$. In Appendix B we shall show that $\mu$ is 1-regular and $\int |u|^\varepsilon \mu(du) < \infty$ for all $\varepsilon < 1$. Thus the inequality (5.3) is valid for all $s \in (0,1)$. As in [12] we use this lemma to obtain an exponential bound on $\langle n|\Gamma^\lambda(z)|0\rangle$, where $\Gamma^\lambda(z) = (M^\lambda - z)^{-1}$. This bound then allows us to apply the results of [24] to deduce that the spectrum of $M^\lambda$ in a neighbourhood of the origin consists of eigenvalues and that the corresponding eigenvectors decay exponentially. We then combine this result with Proposition 4.1 to translate it into a statement about the properties of the spectrum of $H$. It is convenient here to introduce a notation for the intervals between the Landau levels. We let $I_0 = (-\infty, 0)$ and $I_N = (N-1, N)$ for $N \in \mathbb{N}$.

Lemma 5.2. For all $N \in \mathbb{N}_0$, for all $s \in (\tfrac12, 1)$ and for all $\gamma < s$ there exists $\kappa_0(N,s) < \infty$ such that for all $\kappa > \kappa_0(N,s)$, for all $\lambda \in (-\infty, N) \setminus \mathbb{N}_0$ and for all $z \in \mathbb{C}$ with $\Im z \neq 0$ and $|\Re z| \le 1$,
\[ \mathbb{E}\Bigl\{\sum_{n\in\mathbb{Z}[i]} |\langle n|\Gamma^\lambda(z)|0\rangle|^s\, e^{\gamma\kappa|n|}\Bigr\} \le 1/\{2\kappa(\xi_s(0))^s\}. \tag{5.4} \]
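Proof. The starting point is the following equation: For $z \notin \mathbb{R}$,
\[ \sum_{n'\in\mathbb{Z}[i]} \langle n|M^\lambda - z|n'\rangle\,\langle n'|\Gamma^\lambda(z)|n''\rangle = \delta_{nn''}. \tag{5.5} \]
This becomes using (2.16)
\[ (c_n^\lambda - z)\,\langle n|\Gamma^\lambda(z)|n''\rangle - \sum_{n'\neq n} G_0^\lambda(n,n')\,\langle n'|\Gamma^\lambda(z)|n''\rangle = \delta_{nn''}. \tag{5.6} \]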
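Now we take $n \neq n''$ and $0 < s < 1$ to get
\[
\begin{aligned}
|c_n^\lambda - z|^s\,|\langle n|\Gamma^\lambda(z)|n''\rangle|^s
&= \Bigl|\sum_{n'\neq n} G_0^\lambda(n,n')\,\langle n'|\Gamma^\lambda(z)|n''\rangle\Bigr|^s \\
&\le \sum_{n'\neq n} |G_0^\lambda(n,n')|^s\,|\langle n'|\Gamma^\lambda(z)|n''\rangle|^s .
\end{aligned}
\]
Thus
\[ \mathbb{E}\{|c_n^\lambda - z|^s\,|\langle n|\Gamma^\lambda(z)|n''\rangle|^s\} \le \sum_{n'\neq n} |G_0^\lambda(n,n')|^s\,\mathbb{E}\{|\langle n'|\Gamma^\lambda(z)|n''\rangle|^s\}. \tag{5.7} \]
Now
\[ \mathbb{E}\{|c_n^\lambda - z|^s\,|\langle n|\Gamma^\lambda(z)|n''\rangle|^s\} = \tilde{\mathbb{E}}_n \mathbb{E}_n\{|c_n^\lambda - z|^s\,|\langle n|\Gamma^\lambda(z)|n''\rangle|^s\}, \tag{5.8} \]
where $\mathbb{E}_n$ is the expectation with respect to $\omega_n$ and $\tilde{\mathbb{E}}_n$ is with respect to all other $\omega_{n'}$'s. Let
\[ \langle n'|M_n^\lambda|n''\rangle = \langle n'|M^\lambda|n''\rangle - (4\kappa/\omega_n)\,\delta_{nn'}\delta_{nn''}. \tag{5.9} \]
Then $M_n^\lambda$ is independent of $\omega_n$ and using the resolvent identity
\[ \langle n|\Gamma^\lambda(z)|0\rangle = \frac{A}{1 + (4\kappa/\omega_n)B}, \tag{5.10} \]
where $A = \langle n|(M_n^\lambda - z)^{-1}|0\rangle$ and $B = \langle n|(M_n^\lambda - z)^{-1}|n\rangle$. Then
\[
\begin{aligned}
\mathbb{E}_n\{|c_n^\lambda - z|^s\,|\langle n|\Gamma^\lambda(z)|0\rangle|^s\}
&= \mathbb{E}_n\Bigl\{\frac{|c_n^\lambda - z|^s\,|A|^s}{|1 + (4\kappa/\omega_n)B|^s}\Bigr\} \\
&\ge (4\kappa)^s \int \frac{|u-\eta|^s}{|1 + 4\kappa uB|^s}\,\mu(du)\,|A|^s ,
\end{aligned}
\]
where $u = 1/\omega_n$ and $2\pi\eta = \psi(-\lambda) - \frac{\pi}{2\kappa}E$, $E$ being the real part of $z$. Thus by Lemma 5.1,
\[
\begin{aligned}
\mathbb{E}_n\{|c_n^\lambda - z|^s\,|\langle n|\Gamma^\lambda(z)|0\rangle|^s\}
&\ge (4\kappa)^s (\xi_s(|\eta|))^s\,\mathbb{E}_n\Bigl\{\frac{|A|^s}{|1 + c_n^\lambda B|^s}\Bigr\} \\
&= (4\kappa)^s (\xi_s(|\eta|))^s\,\mathbb{E}_n\{|\langle n|\Gamma^\lambda(z)|0\rangle|^s\}.
\end{aligned}
\]
Using (5.7) this gives
\[ (4\kappa)^s (\xi_s(|\eta|))^s\,\mathbb{E}\{|\langle n|\Gamma^\lambda(z)|0\rangle|^s\} \le \sum_{n'\neq n} |G_0^\lambda(n,n')|^s\,\mathbb{E}\{|\langle n'|\Gamma^\lambda(z)|0\rangle|^s\} \tag{5.11} \]
or
\[ \mathbb{E}\{|\langle n|\Gamma^\lambda(z)|0\rangle|^s\} \le (1/4\kappa)^s (\xi_s(|\eta|))^{-s} \sum_{n'\neq n} |G_0^\lambda(n,n')|^s\,\mathbb{E}\{|\langle n'|\Gamma^\lambda(z)|0\rangle|^s\}. \tag{5.12} \]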
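The rank-one formula (5.10) is a direct consequence of the resolvent identity and can be checked on a small matrix. The sketch below is only an illustration under stated assumptions: the 3×3 symmetric matrix `M_reduced` is a hypothetical stand-in for $M_n^\lambda$, `coupling` plays the role of $4\kappa/\omega_n$, and the helper names `inverse` and `resolvent` are introduced here for the example.

```python
def inverse(mat):
    """Invert a small square complex matrix by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(mat)
    # Augment the matrix with the identity.
    aug = [[complex(v) for v in row] + [1 + 0j if i == j else 0j for j in range(n)]
           for i, row in enumerate(mat)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def resolvent(mat, z):
    """Return (mat - z)^(-1)."""
    n = len(mat)
    return inverse([[mat[i][j] - (z if i == j else 0) for j in range(n)]
                    for i in range(n)])

# Hypothetical stand-in for M_n (the operator with the omega_n term removed).
M_reduced = [[0.3, -0.2, 0.1],
             [-0.2, -0.5, 0.4],
             [0.1, 0.4, 0.7]]
site, target = 2, 0        # play the roles of the sites n and 0
coupling = 1.7             # plays the role of 4*kappa/omega_n
z = 0.25 + 0.5j            # spectral parameter with nonzero imaginary part

# Full operator: add the rank-one diagonal term at the chosen site.
M_full = [row[:] for row in M_reduced]
M_full[site][site] += coupling

R_reduced = resolvent(M_reduced, z)
A = R_reduced[site][target]
B = R_reduced[site][site]

lhs = resolvent(M_full, z)[site][target]   # <n| (M - z)^(-1) |0>
rhs = A / (1 + coupling * B)               # right-hand side of (5.10)
print(abs(lhs - rhs))
```

The printed difference is at the level of rounding error, which is what makes the dependence on the single variable $\omega_n$ explicit enough to apply Lemma 5.1.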
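Let $\gamma > 0$ and define
\[ \Delta(s) = \mathbb{E}\Bigl\{\sum_{n\in\mathbb{Z}[i]} e^{\gamma\kappa|n|}\,|\langle n|\Gamma^\lambda(z)|0\rangle|^s\Bigr\}. \tag{5.13} \]
Then
\[
\begin{aligned}
\Delta(s) &= \mathbb{E}\{|\langle 0|\Gamma^\lambda(z)|0\rangle|^s\} + \sum_{n\neq 0} e^{\gamma\kappa|n|}\,\mathbb{E}\{|\langle n|\Gamma^\lambda(z)|0\rangle|^s\} \\
&\le \mathbb{E}\{|\langle 0|\Gamma^\lambda(z)|0\rangle|^s\} + (1/4\kappa)^s (\xi_s(|\eta|))^{-s} \sum_{n\neq 0}\sum_{n'\neq n} e^{\gamma\kappa|n|}\,|G_0^\lambda(n,n')|^s\,\mathbb{E}\{|\langle n'|\Gamma^\lambda(z)|0\rangle|^s\}.
\end{aligned}
\]
Thus
\[ \Delta(s) \le \mathbb{E}\{|\langle 0|\Gamma^\lambda(z)|0\rangle|^s\} + (1/4\kappa)^s (\xi_s(|\eta|))^{-s} \sum_{n'}\sum_{n\neq n'} e^{\gamma\kappa|n-n'|}\,|G_0^\lambda(n,n')|^s\, e^{\gamma\kappa|n'|}\,\mathbb{E}\{|\langle n'|\Gamma^\lambda(z)|0\rangle|^s\}, \]
so that
\[ \Delta(s) \le \mathbb{E}\{|\langle 0|\Gamma^\lambda(z)|0\rangle|^s\} + (1/4\kappa)^s \sup_{n'} \frac{\sum_{n\neq n'} e^{\gamma\kappa|n-n'|}\,|G_0^\lambda(n,n')|^s}{(\xi_s(|\eta|))^s}\,\Delta(s). \tag{5.14} \]
Let
\[ F(s,\lambda) = (1/4\kappa)^s \sup_{n'} \frac{\sum_{n\neq n'} e^{\gamma\kappa|n-n'|}\,|G_0^\lambda(n,n')|^s}{(\xi_s(|\eta|))^s}. \tag{5.15} \]
If $F(s,\lambda) < 1/2$ then
\[ \Delta(s) \le \frac{\mathbb{E}\{|\langle 0|\Gamma^\lambda(z)|0\rangle|^s\}}{1 - F(s,\lambda)} \le 2\,\mathbb{E}\{|\langle 0|\Gamma^\lambda(z)|0\rangle|^s\}. \tag{5.16} \]
But
\[ \langle 0|\Gamma^\lambda(z)|0\rangle = \frac{\langle 0|(M_0^\lambda - z)^{-1}|0\rangle}{1 + c_0^\lambda\,\langle 0|(M_0^\lambda - z)^{-1}|0\rangle} \tag{5.17} \]
so that
\[ |\langle 0|\Gamma^\lambda(z)|0\rangle| = \frac{1}{4\kappa\,|b + 1/\omega_0|}, \tag{5.18} \]
where $b$ is independent of $\omega_0$. Using this and Lemma 5.1 with $a = 1$ and $\eta = -b$, we get
\[ \mathbb{E}_0\{|\langle 0|\Gamma^\lambda(z)|0\rangle|^s\} \le 1/(4\kappa\,\xi_s(0))^s, \tag{5.19} \]
and therefore
\[ \mathbb{E}\{|\langle 0|\Gamma^\lambda(z)|0\rangle|^s\} \le 1/(4\kappa\,\xi_s(0))^s. \tag{5.20} \]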
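This proves (5.4). To prove that $F(s,\lambda) < 1/2$, first assume that $\lambda \in I_N$ with $N \in \mathbb{N}$. By Proposition 6.2, we have
\[ |G_0^\lambda(n,n')| \le (C\kappa)^{N+1} N^N\,|\Gamma(-\lambda)|\,|n-n'|^{2N}\,e^{-\kappa|n-n'|^2} \tag{5.21} \]
for $n \neq n'$, and $\lambda \in I_N$, $N \in \mathbb{N}$. Therefore
\[ \sum_{n\neq n'} e^{\gamma\kappa|n-n'|}\,|G_0^\lambda(n,n')|^s \le (C\kappa)^{(N+1)s} N^{Ns}\,|\Gamma(-\lambda)|^s \sum_{n\neq 0} e^{\gamma\kappa|n|}\,|n|^{2Ns}\,e^{-\kappa s|n|^2}. \tag{5.22} \]
Let $\gamma < \alpha < s$. Using the bounds
\[ e^{\gamma\kappa x}\,e^{-\alpha\kappa x^2} \le e^{-(\alpha-\gamma)\kappa} \quad\text{for } x \ge 1, \]
\[ x^{2Ns}\,e^{-\kappa(s-\alpha)x^2/2} < \bigl(2sN/((s-\alpha)\kappa)\bigr)^{Ns} \quad\text{for } x \ge 1 \]
and
\[ \sum_{n\in\mathbb{Z}[i]} e^{-t|n|^2} \le K(t), \tag{5.23} \]
where $K(t) = \bigl(1 + e^{-t/4} + (\pi/t)^{1/2}\bigr)^2$ for $t > 0$ (see Lemma 2.1 in [5]), we get
\[ \sum_{n\neq n'} e^{\gamma\kappa|n-n'|}\,|G_0^\lambda(n,n')|^s \le \bigl(K(\kappa(s-\alpha)/2) - 1\bigr)\,(C^{N+1}\kappa)^s\,\bigl(2sN^2/(s-\alpha)\bigr)^{Ns}\,|\Gamma(-\lambda)|^s\,e^{-(\alpha-\gamma)\kappa}. \]
Thus
\[ F(s,\lambda) \le \Bigl(K\Bigl(\frac{\kappa(s-\alpha)}{2}\Bigr) - 1\Bigr)\,(C^{N+1}/4)^s\,\bigl(2sN^2/(s-\alpha)\bigr)^{Ns}\,e^{-(\alpha-\gamma)\kappa}\,\Bigl|\frac{\Gamma(-\lambda)}{\xi_s(|\eta|)}\Bigr|^s. \tag{5.24} \]
Now since for $N \in \mathbb{N}$, the limits $\lim_{\lambda\to N} |\Gamma(-\lambda)/\psi(-\lambda)|$ and $\lim_{\lambda\to N-1} |\Gamma(-\lambda)/\psi(-\lambda)|$ are finite, we have for $\lambda \in I_N$,
\[ |\Gamma(-\lambda)| \le C_N(1 + |\psi(-\lambda)|) \le C_N(1 + (\pi/2\kappa) + 2\pi|\eta|). \tag{5.25} \]
Therefore
\[ \Bigl|\frac{\Gamma(-\lambda)}{\xi_s(|\eta|)}\Bigr| \le C_N\Bigl([\{1 + (\pi/2\kappa)\}/\xi_s(0)] + 2\pi \sup_{x\in\mathbb{R}_+} \{x/\xi_s(x)\}\Bigr). \tag{5.26} \]
Thus there exists $\kappa_0(N_1,s) < \infty$ such that for all $\kappa > \kappa_0(N_1,s)$, $F(s,\lambda) < 1/2$ for all $\lambda \in (0,N_1) \setminus \mathbb{N}$. For $\lambda \in I_0$ we have by Proposition 6.1
\[ |G_0^\lambda(n,n')| \le (C\kappa)\,\frac{1}{|\lambda|}\,e^{-\kappa|n-n'|^2}, \tag{5.27} \]
and therefore
\[ \sum_{n\neq n'} e^{\gamma\kappa|n-n'|}\,|G_0^\lambda(n,n')|^s \le \bigl(K(\kappa(s-\alpha)) - 1\bigr)\,(C\kappa)^s\,\Bigl(\frac{1}{|\lambda|}\Bigr)^s e^{-(\alpha-\gamma)\kappa}, \tag{5.28} \]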
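and thus
\[ F(s,\lambda) \le \bigl(K(\kappa(s-\alpha)) - 1\bigr)\,(C/4)^s\,e^{-(\alpha-\gamma)\kappa}\,\Bigl(\frac{1}{|\lambda|\,\xi_s(|\eta|)}\Bigr)^s. \tag{5.29} \]
Therefore by the same argument $F(s,\lambda) < 1/2$ for all $\lambda \in I_0$ if $\kappa$ is large enough. □

We have from Theorems 8 and 9 in [24] that if for all $E \in (-1,1)$ and a.e. $\omega$,
\[ \lim_{\varepsilon\downarrow 0} \sum_{n\in\mathbb{Z}[i]} |\langle n|\Gamma^\lambda(E + i\varepsilon)|0\rangle|^2 < \infty, \tag{5.30} \]
then $\sigma_{\rm cont}(M^\lambda) \cap (-1,1) = \emptyset$ for a.e. $\omega$. If, furthermore, for a.e. pair $(\omega, E)$, $\omega \in \Omega$ and $E \in (-1,1)$,
\[ \lim_{\varepsilon\downarrow 0} |\langle n|\Gamma^\lambda(E + i\varepsilon)|0\rangle| < C_{\omega,E}\,e^{-m(E)|n|}, \tag{5.31} \]
then with probability one, the eigenvectors $v_E^\lambda$ of $M^\lambda$ with eigenvalue $E \in (-1,1)$ obey
\[ |\langle v_E^\lambda|n\rangle| < D_{\omega,E}\,e^{-m(E)|n|}. \tag{5.32} \]
We shall use the results of [24], Lemma 5.2 and Proposition 4.1 to prove the following lemma.

Lemma 5.3. For each $N \in \mathbb{N}$ there exists $\kappa_0 > 0$ such that for $\kappa > \kappa_0$, for each $\lambda \in (-\infty,N) \setminus \mathbb{N}_0$ with probability one, if $\lambda$ is a generalized eigenvalue of $H$ with corresponding generalized eigenfunction $\phi_\lambda$, then for any compact subset $B$ of $\mathbb{C}$, $\int_B |\phi_\lambda(z-z')|^2\,dz'$ decays exponentially in $z$ with exponential length less than or equal to $2/\kappa$.

Proof. From Lemma 5.2 we have for all $\lambda \in (-\infty,N) \setminus \mathbb{N}_0$, $z \in \mathbb{C}$ with $\Im z \neq 0$ and $|\Re z| \le 1$,
\[
\begin{aligned}
\mathbb{E}\Bigl\{\Bigl[\sum_{n\in\mathbb{Z}[i]} |\langle n|\Gamma^\lambda(z)|0\rangle|^2\,e^{2\gamma\kappa|n|/s}\Bigr]^{s/2}\Bigr\}
&\le \mathbb{E}\Bigl\{\sum_{n\in\mathbb{Z}[i]} |\langle n|\Gamma^\lambda(z)|0\rangle|^s\,e^{\gamma\kappa|n|}\Bigr\} \\
&\le 1/\{(2\kappa)(\xi_s(0))^s\}.
\end{aligned}
\tag{5.33}
\]
Now for a.e. pair $(\omega,E)$, $\omega \in \Omega$ and $E \in (-1,1)$, $\lim_{\varepsilon\downarrow 0}\langle n|\Gamma^\lambda(E + i\varepsilon)|0\rangle$ exists. Therefore by Fatou's Lemma,
\[
\begin{aligned}
\mathbb{E}\Bigl\{\Bigl[\sum_{n\in\mathbb{Z}[i]} \lim_{\varepsilon\downarrow 0} |\langle n|\Gamma^\lambda(E+i\varepsilon)|0\rangle|^2\,e^{2\gamma\kappa|n|/s}\Bigr]^{s/2}\Bigr\}
&\le \mathbb{E}\Bigl\{\liminf_{\varepsilon\downarrow 0}\Bigl[\sum_{n\in\mathbb{Z}[i]} |\langle n|\Gamma^\lambda(E+i\varepsilon)|0\rangle|^2\,e^{2\gamma\kappa|n|/s}\Bigr]^{s/2}\Bigr\} \\
&\le \liminf_{\varepsilon\downarrow 0}\,\mathbb{E}\Bigl\{\Bigl[\sum_{n\in\mathbb{Z}[i]} |\langle n|\Gamma^\lambda(E+i\varepsilon)|0\rangle|^2\,e^{2\gamma\kappa|n|/s}\Bigr]^{s/2}\Bigr\} \\
&\le 1/\{(2\kappa)(\xi_s(0))^s\}.
\end{aligned}
\tag{5.34}
\]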
Thus (5.30) and (5.31) are satisfied. Therefore if $\lambda$ is a generalized eigenvalue of $H$ with corresponding generalized eigenfunction $\phi_\lambda$, then by Proposition 4.1, $v_\lambda = \Gamma^{\lambda_\kappa}U_{\lambda_\kappa}\phi_\lambda$ is a generalized eigenvector of $M^\lambda$ with eigenvalue 0 and must satisfy
\[ |\langle v_\lambda|n\rangle| < D_\omega\,e^{-2\gamma\kappa|n|/s}. \tag{5.35} \]
Then again by Proposition 4.1, for any compact subset $B$ of $\mathbb{C}$, $\int_B |\phi_\lambda(z-z')|^2\,dz'$ decays exponentially in $z$ with exponential length less than or equal to $\max(s/(2\gamma\kappa), 2/\kappa)$. If we choose $\gamma = s/2$, then $\max(s/(2\gamma\kappa), 2/\kappa) = 2/\kappa$. □

By Fubini's Theorem, we can deduce from Lemma 3.5 the result about the decay of eigenfunctions with probability one and a.e. $\lambda$ with respect to Lebesgue measure and therefore with probability one $\sigma_{\rm ac}(H) \cap (-\infty,N) = \emptyset$. However to be able to make a statement about $\sigma_{\rm cont}(H)$ we have to replace a.e. $\lambda$ with respect to Lebesgue measure with a.e. $\lambda$ with respect to the spectral measure of $H(\omega)$. We do this in Lemma 5.7 by using the ideas of [25] and the following four lemmas. We state the first lemma without proof.

Lemma 5.4. Let $\{f_n\}$ be a total countable subset of normalized vectors of a Hilbert space $\mathcal{H}$ and $H$ a self-adjoint operator on $\mathcal{H}$ with spectral projections $E(\cdot)$. Let $c_n > 0$, $\sum_n c_n < \infty$ and $\nu = \sum_n c_n\mu_n$, where $\mu_n = (f_n, E(\cdot)f_n)$. Then $\nu(A) = 0$ implies that $E(A) = 0$.

Lemma 5.5. For each $N \in \mathbb{N}_0$, there exists an open set $J_N \subset \mathbb{C}$, containing $I_N$, such that for $\kappa$ sufficiently large with probability one, $M^\lambda$ is invertible for all $\lambda \in J_N \setminus I_N$.

Proof. Let $\lambda \in I_N$ and $|\varepsilon| < 1$, $\varepsilon \neq 0$. Let
\[ \langle n|X|n'\rangle = \begin{cases} c_n^{\lambda+i\varepsilon} & \text{if } n = n', \\ -G_0^\lambda(n,n') & \text{if } n \neq n'. \end{cases} \tag{5.36} \]
Then $\|X\xi\| \ge \frac{2\kappa}{\pi}\,|\Im\psi(-(\lambda+i\varepsilon))|\,\|\xi\|$. Therefore $X$ is invertible and
\[ \|X^{-1}\| \le \frac{\pi}{2\kappa}\,\frac{1}{|\Im\psi(-(\lambda+i\varepsilon))|}. \tag{5.37} \]
Let
\[ \langle n|Y|n'\rangle = \begin{cases} 0 & \text{if } n = n', \\ -i\varepsilon\,(G_0^\lambda G_0^{\lambda+i\varepsilon})(n,n') & \text{if } n \neq n', \end{cases} \tag{5.38} \]
so that, by the resolvent identity,
\[ M^{\lambda+i\varepsilon} = X + Y = X(1 + X^{-1}Y). \tag{5.39} \]
From Proposition 6.2 in Appendix A we have for $\lambda$ with $\Re\lambda \in I_N$, $N \in \mathbb{N}$, and $|\Im\lambda| \le 1$,
\[ |G_0^\lambda(z,z')| \le C^N\kappa\,N^N\,|\Gamma(-\Re\lambda)|\,\bigl(1 + |\ln(2\kappa|z-z'|^2)|\bigr) \tag{5.40} \]
for $2\kappa|z-z'|^2 < 1$ and
\[ |G_0^\lambda(z,z')| \le C^N\kappa\,N^{2N}\,|\Gamma(-\Re\lambda)|\,e^{-\frac{\kappa}{2}|z-z'|^2} \tag{5.41} \]
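for $2\kappa|z-z'|^2 \ge 1$. Therefore if $\lambda \in I_N$, $N \in \mathbb{N}$, $|\varepsilon| < 1$, $\kappa > 2$ and $n, n' \in \mathbb{Z}[i]$ with $n \neq n'$,
\[
\begin{aligned}
|(G_0^\lambda G_0^{\lambda+i\varepsilon})(n,n')|
&\le C^{2N}\kappa^2\,|\Gamma(-\lambda)|^2 N^{3N} \Bigl\{ \int_{2\kappa|z-n|^2<1} dz\,\bigl(1 + |\ln(2\kappa|z-n|^2)|\bigr)\,e^{-\frac{\kappa}{2}|z-n'|^2} \\
&\qquad + \int_{2\kappa|z-n'|^2<1} dz\,\bigl(1 + |\ln(2\kappa|z-n'|^2)|\bigr)\,e^{-\frac{\kappa}{2}|z-n|^2}
 + N^N \int dz\,e^{-\frac{\kappa}{2}|z-n|^2}\,e^{-\frac{\kappa}{2}|z-n'|^2} \Bigr\} \\
&\le 2\pi C^{2N}\kappa\,|\Gamma(-\lambda)|^2 N^{3N}\,e^{-\frac{\kappa}{8}|n-n'|^2} \Bigl\{ \int_0^1 (1 + |\ln r^2|)\,r\,dr + N^N e^{-\frac{\kappa}{8}} \Bigr\} \\
&\le 2C^{2N}\kappa\,|\Gamma(-\lambda)|^2 N^{4N}\,e^{-\frac{\kappa}{8}|n-n'|^2} \\
&\le e^{-\frac{\kappa}{32}}\,C^{2N}\,|\Gamma(-\lambda)|^2 N^{4N}\,e^{-\frac{1}{8}|n-n'|^2}
\end{aligned}
\tag{5.42}
\]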
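if $\kappa$ is large enough. Therefore
\[ \|Y\| \le e^{-\frac{\kappa}{32}}\,C^{2N}\,|\Gamma(-\lambda)|^2\,\|T\|, \tag{5.43} \]
where $T$ is the operator with matrix $\langle n|T|n'\rangle = e^{-\frac18|n-n'|^2}$. Now take $\lambda \in (N-1, N-\tfrac12]$ and $|\varepsilon| < \lambda - N + 1$. In this interval
\[ |\Gamma(-\lambda)| \le \frac{a_N}{(\lambda - N + 1)}. \tag{5.44} \]
On the other hand by [19] 6.3.16
\[ \Im\psi(-(\lambda+i\varepsilon)) = -\varepsilon \sum_{k=0}^{\infty} \frac{1}{(\lambda-k)^2 + \varepsilon^2}. \tag{5.45} \]
Therefore
\[ |\Im\psi(-(\lambda+i\varepsilon))| = |\varepsilon| \sum_{k=0}^{\infty} \frac{1}{(\lambda-k)^2 + \varepsilon^2} > \frac{|\varepsilon|}{(\lambda-N+1)^2 + \varepsilon^2} > \frac{|\varepsilon|}{2(\lambda-N+1)^2}. \tag{5.46} \]
Thus
\[ \|X^{-1}Y\| \le \|X^{-1}\|\,\|Y\| \le \frac{\pi}{4\kappa}\,a_N^2\,C^{2N}\,e^{-\frac{\kappa}{32}}\,\|T\| < 1 \tag{5.47} \]
if $\kappa$ is sufficiently large. Thus $M^{\lambda+i\varepsilon}$ is invertible. We can use the same argument if $\lambda \in [N-\tfrac12, N)$ and $|\varepsilon| < N - \lambda$. Using the bounds in Proposition 6.1, a similar calculation to the above gives for $\lambda \in I_0$,
\[ \|Y\| \le e^{-\frac{\kappa}{32}}\,C^{2N}\,|\lambda|^{-2}\,\|T\|. \tag{5.48} \]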
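Then using the inequality
\[ |\Im\psi(-(\lambda+i\varepsilon))| > \frac{|\varepsilon|}{|\lambda|^2 + \varepsilon^2}, \tag{5.49} \]
we can show that $M^{\lambda+i\varepsilon}$ is invertible if $|\varepsilon| < |\lambda|$. □

Lemma 5.6. For $n \in \mathbb{Z}[i]$ and $\lambda \in J_N \setminus I_N$, let $\phi_n^\lambda = c_{\lambda,n}\,U_\lambda^*\Gamma^\lambda|n\rangle$, where $c_{\lambda,n} = \|U_\lambda^*\Gamma^\lambda|n\rangle\|^{-1}$ so that $\|\phi_n^\lambda\| = 1$. Then if $[a,b] \subset I_N$, the set $\{\phi_n^\lambda : n \in \mathbb{Z}[i],\ \lambda \in (J_N \setminus I_N) \cap \mathbb{Q}[i]\}$ is total in $E([a,b])\mathcal{H}$.

Proof. For $n \in \mathbb{Z}[i]$ and $\lambda \in J_N$ let $\tilde\phi_n^\lambda = U_\lambda^*|n\rangle$. Then if $\lambda \in J_N \setminus I_N$,
\[ \tilde\phi_n^\lambda = \sum_{n'\in\mathbb{Z}[i]} c_{\lambda,n'}^{-1}\,\langle n'|M^\lambda|n\rangle\,\phi_{n'}^\lambda. \tag{5.50} \]
Also if $\lambda \to \lambda_0$ then $\tilde\phi_n^\lambda \to \tilde\phi_n^{\lambda_0}$. Therefore it is sufficient to prove that the set $\{\tilde\phi_n^\lambda : n \in \mathbb{Z}[i],\ \lambda \in I_N\}$ is total. We do this by showing that the orthogonal complement of this set is in the orthogonal complement of $E([a,b])\mathcal{H}$. Let $f \in \mathcal{H}$ and suppose that $(\tilde\phi_n^\lambda, f) = 0$ for all $n \in \mathbb{Z}[i]$ and all $\lambda \in [a,b]$. Then since $(G_0^\lambda f)(n) = (\tilde\phi_n^\lambda, f)$, $G^\lambda f = G_0^\lambda f$. Therefore $E([a,b])G^\lambda f = E([a,b])G_0^\lambda f$ and thus
\[ \sup_{\lambda\in[a,b]} \|E([a,b])G^\lambda f\| \le \sup_{\lambda\in[a,b]} \|G_0^\lambda\|\,\|f\| < \infty. \tag{5.51} \]
Let $\mu_1(A) = (f, E([a,b] \cap A)f)$. Then
\[ \|E([a,b])G^\lambda f\|^2 = \int_{[a,b]} \frac{\mu_1(d\lambda')}{|\lambda - \lambda'|^2}. \tag{5.52} \]
Let $x_i = a + (b-a)i/M$, $i = 0, \ldots, M$ and $\lambda_i = \tfrac12(x_i + x_{i+1})$. Then
\[ \int_{[a,b]} \frac{\mu_1(d\lambda')}{|\lambda' - \lambda_i|^2} \ge \int_{[x_i,x_{i+1}]} \frac{\mu_1(d\lambda')}{|\lambda' - \lambda_i|^2} \ge \frac{4M^2}{(b-a)^2}\,\mu_1([x_i,x_{i+1}]). \tag{5.53} \]
Therefore
\[ \sup_{\lambda\in[a,b]} \int_{[a,b]} \frac{\mu_1(d\lambda')}{|\lambda' - \lambda|^2} \ge \frac{4M^2}{(b-a)^2}\,\mu_1([x_i,x_{i+1}]) \tag{5.54} \]
for all $i$, and so
\[ \sup_{\lambda\in[a,b]} \int_{[a,b]} \frac{\mu_1(d\lambda')}{|\lambda - \lambda'|^2} \ge \frac{4M^2}{(b-a)^2}\,\frac{1}{M} \sum_{i=0}^{M-1} \mu_1([x_i,x_{i+1}]) \ge \frac{4M}{(b-a)^2}\,\mu_1([a,b]). \tag{5.55} \]
Since $M$ is arbitrary $\sup_{\lambda\in[a,b]} \|E([a,b])G^\lambda f\| = \infty$ unless $\mu_1([a,b]) = 0$. But $\mu_1([a,b]) = \|E([a,b])f\|^2$. □

Let $\mathcal{F}$ be the $\sigma$-algebra generated by $\{\omega_{n'} : n' \in \mathbb{Z}[i]\}$ and let $\mathcal{F}_n$ be the sub $\sigma$-algebra generated by $\{\omega_{n'} : n' \neq n\}$. Let $\mathcal{B}_N$ be the Borel sets of $I_N$.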
Lemma 5.7. Let $B \mapsto E(B)$ be the spectral measure of $H$ and $A \in \cap_{n\in\mathbb{Z}[i]} (\mathcal{F}_n \otimes \mathcal{B}_N)$. If for a.e. $\lambda \in I_N$ with respect to Lebesgue measure $\mathbb{E}\{1_A(\cdot,\lambda)\} = 0$, then $\mathbb{E}\{E(\{\lambda : (\cdot,\lambda) \in A\})\} = 0$.

Proof. If for a.e. $\lambda$ with respect to Lebesgue measure $\mathbb{E}\{1_A(\cdot,\lambda)\} = 0$, then by Fubini's Theorem $\mathbb{E}\bigl\{\int_{I_N} d\lambda\,1_A(\cdot,\lambda)\bigr\} = 0$.

Let $\Lambda$ be a bounded subset of $\mathbb{Z}[i]$ and let $H_\Lambda$ be defined in the same way as $H$ with $\mathcal{M}$ replaced by $\mathcal{M}_\Lambda = l^2(\Lambda)$. By the same argument as in Proposition 4.1, $\lambda \notin \mathbb{N}_0$ is an eigenvalue of $H_\Lambda$ if and only if there exists $v \in \mathcal{M}_\Lambda$ such that $M_\Lambda^\lambda v = 0$, where $M_\Lambda^\lambda$ is the restriction of $M^\lambda$ to $\mathcal{M}_\Lambda$. Then the corresponding eigenfunction is $U_\lambda^* v$. Since in the interval $I_N$, $\psi$ is bijective it is clear that there are $|\Lambda|$ eigenvalues in $I_N$.

Let $\lambda_1, \ldots, \lambda_{|\Lambda|}$ be the eigenvalues in $I_N$, say, and let $v_1, \ldots, v_{|\Lambda|}$ be the corresponding vectors such that $M_\Lambda^{\lambda_k}v_k = 0$. Let $u_n = 1/\omega_n$. Then for $n \in \Lambda$ we get
\[ \frac{d\lambda_k}{du_n} = -\frac{|\langle v_k|n\rangle|^2}{\|U_{\lambda_k}^* v_k\|^2}. \tag{5.56} \]
If $M_\Lambda^{\lambda_k}v_k = 0$ and $\langle v_k|n\rangle = 0$ for a particular value of $u_n$ then $M_\Lambda^{\lambda_k}v_k = 0$ for all values of $u_n$. We shall see later that we can ignore these eigenvalues. We see from Eq. (5.56) that each $\lambda_k$ is a monotonic decreasing function of $u_n$. Moreover as $u_n \to \pm\infty$, the $\lambda_k$'s become identical, except the value of $\lambda_k$ corresponding to the $v_k$ which tends to $|n\rangle$ and this latter value of $\lambda_k$ decreases from $N$ to $N-1$ (respectively $-\infty$ if $N = 0$) as $u_n$ increases from $-\infty$ to $+\infty$. Therefore
\[ \sum_k \int_{-\infty}^{\infty} f(\lambda_k)\,\frac{d\lambda_k}{du_n}\,du_n = -\int_{I_N} f(\lambda)\,d\lambda. \tag{5.57} \]
Let $\psi_k = U_{\lambda_k}^* v_k / \|U_{\lambda_k}^* v_k\|$ so that $H_\Lambda\psi_k = \lambda_k\psi_k$. Let $\lambda \in J_N \setminus I_N$ and $n \in \mathbb{Z}[i]$. For $B \subset I_N$ let $\mu_\Lambda^{n,\lambda}(B) = (\phi_n^\lambda, E_\Lambda(B)\phi_n^\lambda)$, where $E_\Lambda$ is the spectral measure of $H_\Lambda$. Then for $\Lambda$ sufficiently large,
\[
\begin{aligned}
\mu_\Lambda^{n,\lambda}(B) &= \sum_{\lambda_k\in B} |(\phi_n^\lambda, \psi_k)|^2 \\
&= \|\Gamma^\lambda|n\rangle\|^{-2} \sum_{\lambda_k\in B} \frac{1}{|\lambda-\lambda_k|^2}\,\frac{|\langle n|v_k\rangle|^2}{\|U_{\lambda_k}^* v_k\|^2} \\
&= -\|\Gamma^\lambda|n\rangle\|^{-2} \sum_{\lambda_k\in B} \frac{1}{|\lambda-\lambda_k|^2}\,\frac{d\lambda_k}{du_n}.
\end{aligned}
\tag{5.58}
\]
Note that if $\langle n|v_k\rangle = 0$ then the corresponding term in (5.58) is absent. Also if $\lambda_k$ is degenerate, we can choose the corresponding orthogonal set of eigenvectors so that only one satisfies $\langle n|v_k\rangle \neq 0$. Therefore there is only one term corresponding to such $\lambda_k$ in the sum (5.58). From (5.58) and (5.57) we get
\[ \int_{-\infty}^{\infty} du_n\,\mu_\Lambda^{n,\lambda}(B) = \|\Gamma^\lambda|n\rangle\|^{-2} \int_B \frac{d\lambda'}{|\lambda-\lambda'|^2}, \tag{5.59} \]
and thus
\[ \int_{-\infty}^{\infty} du_n\,\rho(u_n)\,\mu_\Lambda^{n,\lambda}(B) \le \|\Gamma^\lambda|n\rangle\|^{-2}\,\|\rho\|_\infty \int_B \frac{d\lambda'}{|\lambda-\lambda'|^2}. \tag{5.60} \]
If $\mu^{n,\lambda}(B) = (\phi_n^\lambda, E(B)\phi_n^\lambda)$ then by the weak convergence of $\mu_\Lambda^{n,\lambda}$ to $\mu^{n,\lambda}$ we have the bound (5.60) for $\mu^{n,\lambda}$. By Kotani's argument [26], we have that
\[ \mathbb{E}\Bigl\{\int_{I_N} d\mu^{n,\lambda}(\lambda')\,1_A(\cdot,\lambda')\Bigr\} = 0. \]
By Lemmas 5.4 and 5.6 we get that $\mathbb{E}\{E(\{\lambda : (\cdot,\lambda) \in A\})\} = 0$. □

By combining Lemmas 5.3 and 5.7 we obtain our final theorem.

Theorem 5.8. For each $N \in \mathbb{N}$ there exists $\kappa_0 > 0$ such that for $\kappa > \kappa_0$, with probability one, $\sigma_{\rm cont}(H) \cap (-\infty,N) = \emptyset$, and if $\lambda \in \sigma(H) \cap (-\infty,N) \setminus \mathbb{N}_0$ is an eigenvalue of $H$ and the corresponding eigenfunction is $\phi_\lambda$, then for any compact subset $B$ of $\mathbb{C}$, $\int_B |\phi_\lambda(z-z')|^2\,dz'$ decays exponentially in $z$ with exponential length less than or equal to $2/\kappa$.

Acknowledgements. This work was supported by the Forbairt (Ireland) International Collaboration Programme 1997. J.V.P. and T.C.D. would like to thank the Institut de Physique Théorique of the Ecole Polytechnique Fédérale de Lausanne for their hospitality and financial support. T.C.D. and N.M. would like to thank University College Dublin for their hospitality.
6. Appendix A. Bounds for the Green's Function

In this appendix we shall obtain bounds on the Green's function $G_0^\lambda(z,z')$. Our basic tools are the integral representation ([19, 13.2.5])
\[ \Gamma(a)\,U(a,b,\rho) = \int_0^\infty dt\,e^{-\rho t}\,t^{a-1}(1+t)^{b-a-1}, \tag{6.1} \]
which is valid for $\Re a > 0$ and $\rho > 0$ and the recurrence relation ([19, 13.4.18])
\[ U(a,b,\rho) = \rho\,U(a+1,b+1,\rho) - (b-a-1)\,U(a+1,b,\rho). \tag{6.2} \]
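The two tools (6.1) and (6.2) are consistent with each other, and this can be checked numerically without any special-function library. The following is a minimal sketch under stated assumptions: $U$ is computed from the integral representation (6.1) by Simpson's rule after the substitution $t = s^2$, the helper names `gamma_times_U` and `U` are introduced here for illustration, and the parameter values are chosen (hypothetically) so that $2a-1$ is a non-negative integer and the transformed integrand is smooth at $s = 0$.

```python
import math

def gamma_times_U(a, b, rho, s_max=8.0, steps=4000):
    """Approximate Gamma(a) * U(a, b, rho) from the integral (6.1).

    The substitution t = s**2 gives the integrand
    2 * s**(2a-1) * exp(-rho*s**2) * (1 + s**2)**(b-a-1),
    integrated over [0, s_max] with composite Simpson's rule
    (steps must be even; the tail beyond s_max is negligible here).
    """
    h = s_max / steps

    def g(s):
        return (2.0 * s ** (2 * a - 1) * math.exp(-rho * s * s)
                * (1.0 + s * s) ** (b - a - 1))

    total = g(0.0) + g(s_max)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3.0

def U(a, b, rho):
    """Confluent hypergeometric function U(a, b, rho) for a > 0, rho > 0."""
    return gamma_times_U(a, b, rho) / math.gamma(a)

# Illustrative parameters (assumptions of this sketch, not from the paper).
a, b, rho = 1.5, 1.0, 2.0

lhs = U(a, b, rho)
rhs = rho * U(a + 1, b + 1, rho) - (b - a - 1) * U(a + 1, b, rho)
print(lhs, rhs)
```

The two printed numbers agree to quadrature accuracy. The same relation follows analytically from (6.1) by one integration by parts, which is why the induction on $N$ below can raise the first argument of $U$ step by step.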
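We first obtain bounds for $|G_0^\lambda(z,z')|$ when $\Re\lambda < 0$.

Proposition 6.1. There exists a constant $C < \infty$, such that for $\Re\lambda \in I_0$,
\[ |G_0^\lambda(z,z')| \le C\kappa\,e^{-\kappa|z-z'|^2}\Bigl(\frac{1}{|\Re\lambda|} + e^{-\sqrt{2\kappa|\Re\lambda|}\,|z-z'|}\bigl(1 + |\ln 2\kappa|z-z'|^2|\bigr)\Bigr), \tag{6.3} \]
if $2\kappa|z-z'|^2 \le 1$, and
\[ |G_0^\lambda(z,z')| \le \frac{C\kappa}{|\Re\lambda|}\,e^{-\kappa|z-z'|^2}, \tag{6.4} \]
if $2\kappa|z-z'|^2 > 1$.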
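Proof. Let $\lambda = x + iy$ with $x < 0$. Then from (6.1) we get
\[
\begin{aligned}
|\Gamma(-\lambda)U(-\lambda,1,\rho)| &\le \int_0^\infty dt\,e^{-\rho t}\,t^{|x|-1}(1+t)^{-|x|} \\
&= \int_0^1 dt\,e^{-\rho t}\,t^{|x|-1}(1+t)^{-|x|} + \int_1^\infty dt\,e^{-\rho t}\,t^{|x|-1}(1+t)^{-|x|} \\
&\le \int_0^1 dt\,t^{|x|-1} + \int_1^\infty dt\,\frac{e^{-\rho t}}{t}\Bigl(\frac{t}{1+t}\Bigr)^{|x|} \\
&\le \frac{1}{|x|} + \int_1^\infty \frac{e^{-(\rho t + \frac{|x|}{2t})}}{t}\,dt .
\end{aligned}
\tag{6.5}
\]
If $\rho \le 1$, we have
\[
\begin{aligned}
|\Gamma(-\lambda)U(-\lambda,1,\rho)| &\le \frac{1}{|x|} + \int_1^\infty \frac{e^{-\frac12\rho t}\,e^{-\frac12(\rho t + \frac{|x|}{t})}}{t}\,dt \\
&\le \frac{1}{|x|} + e^{-(\rho|x|)^{1/2}} \int_1^\infty \frac{e^{-\frac12\rho t}}{t}\,dt \\
&= \frac{1}{|x|} + e^{-(\rho|x|)^{1/2}} \int_{\frac12\rho}^\infty \frac{e^{-t}}{t}\,dt \\
&\le \frac{1}{|x|} + e^{-(\rho|x|)^{1/2}} \int_{\frac12\rho}^1 \frac{dt}{t} + e^{-(\rho|x|)^{1/2}} \int_1^\infty \frac{e^{-t}}{t}\,dt \\
&\le \frac{1}{|x|} - e^{-(\rho|x|)^{1/2}} \ln(\rho/2) + e^{-(\rho|x|)^{1/2}} \int_1^\infty dt\,e^{-t} \\
&= \frac{1}{|x|} + e^{-(\rho|x|)^{1/2}}\,|\ln(\rho/2)| + \frac{e^{-(\rho|x|)^{1/2}}}{e} .
\end{aligned}
\]
Thus,
\[ |\Gamma(-\lambda)U(-\lambda,1,\rho)| \le C\Bigl(\frac{1}{|x|} + (1 + |\ln\rho|)\,e^{-(\rho|x|)^{1/2}}\Bigr). \tag{6.6} \]
If $\rho > 1$, we have
\[
\begin{aligned}
|\Gamma(-\lambda)U(-\lambda,1,\rho)| &\le \frac{1}{|x|} + \int_1^\infty \frac{e^{-(t + \frac{|x|}{2t})}}{t}\,dt \\
&\le \frac{1}{|x|} + \int_1^\infty \frac{e^{-\frac12 t}\,e^{-\frac12(t + \frac{|x|}{t})}}{t}\,dt \\
&\le \frac{1}{|x|} + e^{-|x|^{1/2}} \int_1^\infty dt\,e^{-\frac12 t} \\
&\le \frac{1}{|x|} + 2e^{-|x|^{1/2}} .
\end{aligned}
\]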
Therefore,
\[ |\Gamma(-\lambda)U(-\lambda,1,\rho)| \le \frac{C}{|x|}. \tag{6.7} \]
Inserting the inequalities (6.6) and (6.7) into (2.6) we get Proposition 6.1. □

Now we shall obtain bounds for $\Re\lambda > 0$.

Proposition 6.2. There exists a constant $C < \infty$, such that for $\Re\lambda \in I_N$, $N \in \mathbb{N}$, $|\Im\lambda| \le 1$,
\[ |G_0^\lambda(z,z')| \le \kappa\,C^N N^N\,|\Gamma(-\Re\lambda)|\,\bigl(1 + |\ln(2\kappa|z-z'|^2)|\bigr)\,e^{-\kappa|z-z'|^2}, \tag{6.8} \]
if $2\kappa|z-z'|^2 \le 1$, and
\[ |G_0^\lambda(z,z')| \le (C\kappa)^{N+1} N^N\,|\Gamma(-\Re\lambda)|\,|z-z'|^{2N}\,e^{-\kappa|z-z'|^2}, \tag{6.9} \]
if 2κ|z − z0 |2 > 1. Let λ = x + iy. We shall prove that if N − 1 < x < N, N ∈ N0 , b ∈ N and ρ > 1, then b+N−1 x N 0(−x) ρ (b + N + |y|) |U (−λ, b, ρ)| ≤ 2 0(−λ) (6.10) −(ρ−2) N (b + N )! . (ρ + |y| + 1) +e |0(N − λ)| We shall do this by induction on N . We first prove (6.10) for N = 0 by using (6.1) which gives Z ∞ Z 1 |0(−λ)U (−λ, b, ρ)| ≤ dte−ρt t −(x+1) (1+t)b+x−1 + dte−ρt t −(x+1) (1+t)b+x−1 0
1
= I1 +I2 .
We now take ρ > 1, −1 < x < 0 and b ≥ 1. For I1 , since t < 1, we get Z 1 Z ρ b−1 −ρt −(x+1) b−1 x dte t =2 ρ dte−t t −(x+1) I1 ≤ 2 0 0 Z ∞ ≤ 2b−1 ρ x dte−t t −(x+1) = 2b−1 ρ x 0(−x). 0
On the other hand, using t > 1, we get Z ∞ dte−(ρ−1)t e−t t −(x+1) (1 + t)b+x−1 I2 = 1 Z ∞ ≤ e−(ρ−1) dte−t t −(x+1) (1 + t)b+x−1 Z1 ∞ −(ρ−1) ≤e dte−t (1 + t)b−1 . 1
Therefore I2 ≤ e−(ρ−1)
Z
∞
dse−s+1 s b−1 ≤ e−(ρ−2)
2
=e
−(ρ−2)
0(b) ≤ e
Z
∞
dse−s s b−1
0 −(ρ−2)
(6.11)
b!.
Thus we have |0(−λ)U (−λ, b, ρ)| ≤ 2b−1 ρ x 0(−x) + e−(ρ−2) b!
(6.12)
for ρ > 1 and −1 < x < 0, or
0(−x) + e−(ρ−2) b! . |U (−λ, b, ρ)| ≤ 2b−1 ρ x 0(−λ) |0(−λ)|
(6.13)
Suppose that (6.10) is true for N − 1 < x < N . Then by using the recurrence relation (6.2), we get for N < x < N + 1, |U ( − λ, b, ρ)| n 0(−x + 1) ≤ ρ 2b+N ρ (x−1) (b + N + |y| + 1)N 0(−λ + 1) (b + N + 1)! o +e−(ρ−2) (ρ + |y| + 1)N |0(N − λ + 1)| n b+N−1 (x−1) N 0(−x + 1) ρ (b + N + |y|) +|b + λ − 1| 2 0(−λ + 1) o (b + N )! . +e−(ρ−2) (ρ + 1 + |y|)N |0(N − λ + 1)| The identity 0(x + 1) = x0(x) gives 0(−x + 1) x 0(−x) 0(−x) = ≤ . 0(−λ + 1) λ 0(−λ) 0(−λ) Therefore |U (−λ, b, ρ)| 0(−x) |b + λ − 1| (b + N + |y|)N ≤ 2b+N ρ x (b + N + |y| + 1)N + 2ρ 0(−λ) (b + N + 1)! |b + λ − 1| ρ+ + e−(ρ−2) (ρ + |y| + 1)N |0(N − λ + 1)| b+N +1 0(−x) |b + λ − 1| b+N N x (b + N + |y| + 1) ρ 1 + ≤2 0(−λ) 2ρ (b + N + 1)! . + e−(ρ−2) (ρ + |y| + 1)N+1 0(N − λ + 1) Therefore since 2 ≤ b + N + |y| + 1 and |b + λ − 1| ≤ b + N + |y| we get the required bound 0(−x) |U (−λ, b, ρ)| ≤ 2b+N ρ x (b + N + |y| + 1)N +1 0(−λ) (b + N + 1)! . + e−(ρ−2) (ρ + 1)N +1 0(N − λ + 1)
This gives the following bound on the Green’s function for 2κ|z − z0 |2 ≥ 1 and N − 1 < x < N and |y| ≤ 1, 2κ −κ|z−z0 |2 |0(−λ)| e G|λ0 (z, z0 )| ≤ π n 0(−x) × (2κ)x (2(2 + N))N |z − z0 |2x 0(−λ) (N + 1)! −2κ|z−z0 |2 o e . +e2 (2 + 2κ|z − z0 |2 )N |0(N − λ)|
(6.14)
From (6.14) we get for N − 1 < x < N with N ∈ N0 , 0 2
|Gλ0 (z, z0 )| ≤ (Cκ)N+1 N N |0(−x)||z − z0 |2N e−κ|z−z | ,
(6.15)
since |0(−λ)| ≤ |0(−x)| and 0(N − λ) is bounded below. We shall prove, again by induction, that for N − 1 < x < N , N ∈ N0 , b ∈ N and ρ ≤ 1, 0(−x) (1 + | ln ρ|) . (6.16) |U (−λ, b, ρ)| ≤ 2N+4 (b + N)! 0(−λ) ρ b−1 We first prove (6.16) for N = 0. From (6.7) we have for ρ ≤ 1, and x < 0, |0(−λ)U (−λ, 1, ρ)| ≤ Thus
1 1 + + | ln ρ|. |x| e
0(−x) 1 1 1 + + | ln ρ| . |U (−λ, 1, ρ)| ≤ 0(−λ) |x|0(−x) e0(−x) 0(−x)
(6.17)
(6.18)
Since −1 < x < 0, 0(−x) > 1 and |x|0(−x) = 0(−x + 1) > (e − 1)/e, this gives 0(−x) e2 + e − 1 + | ln ρ| (6.19) |U (−λ, 1, ρ)| ≤ 0(−λ) e2 − e 0(−x) ≤ 24 0(−x) (1 + | ln ρ|). ≤ (2 + | ln ρ|) 0(−λ) 0(−λ) Now we take b ≥ 2,
Z
∞
dte−ρt t −(x+1) (1 + t)b+x−1 Z ∞ 1 = b−1 dte−t t −(x+1) (ρ + t)b+x−1 ρ 0 Z ∞ 1 ≤ b−1 dte−t t −(x+1) (1 + t)b+x−1 , ρ 0
|0(−λ)U (−λ, b, ρ)| ≤
0
since b + x − 1 ≥ 0. Thus we have for b ≥ 2, and −1 < x < 0, |0(−λ)U (−λ, b, ρ)| ≤
1 ρ b−1
0(−x)U (−x, b, 1).
(6.20)
By inserting the bound obtained from (6.12) by letting ρ tend to 1, 0(−x)U (−x, b, 1) ≤ 2b−1 0(−x) + eb!,
(6.21)
into this inequality we get 0(−x) eb! 1 ≤ 24 b! 0(−x) (1 + | ln ρ|) . |U (−λ, b, ρ)| ≤ b−1 2b−1 + ρ 0(−x) 0(−λ) 0(−λ) ρ b−1 (6.22) We can combine this with the inequality (6.19) to get for b ∈ N, −1 < λ < 0 and ρ < 1, 4 0(−x) (1 + | ln ρ|) . (6.23) |U (−λ, b, ρ)| ≤ 2 b! 0(−λ) ρ b−1 Using the recurrence relation (6.2) and the induction hypothesis we get for N < λ < N + 1, 0(−x + 1) (1 + | ln ρ|) |U (−λ, b, ρ)| ≤ ρ ≤ 2N+4 (b + N + 1)! 0(−λ + 1) ρb 0(−x + 1) (1 + | ln ρ|) +|b + λ − 1|2N+4 (b + N )! 0(−λ + 1) ρ b−1 0(−x + 1) (1 + | ln ρ|) ≤ 2N+4 (b + N)! (b + N + 1 + |b + λ − 1|) 0(−λ + 1) ρ b−1 0(−x) (1 + | ln ρ|) . (6.24) ≤ 2N+5 (b + N + 1)! 0(−λ) ρ b−1 In particular with b = 1, (6.16) gives for N − 1 < x < N and ρ ≤ 1, 0(−x) (1 + | ln ρ|). |U (−λ, 1, ρ)| ≤ 2N+4 (N + 1)! 0(−λ)
(6.25)
This gives us the required bound on the Green’s function for N − 1 < x < N, and 2κ|z − z0 |2 ≤ 1, |Gλ0 (z, z0 )| ≤
2κ 0 2 |0(−x)|2N+4 (N + 1)!(1 + | ln(2κ|z − z0 |2 )|)e−κ|z−z | π 0 2
≤ κC N N N |0(−x)|(1 + | ln(2κ|z − z0 |2 )|)e−κ|z−z | .
(6.26)
7. Appendix B. Regularity of µ

Definition. A probability measure $\mu$ on $\mathbb{R}$ is said to be $\tau$-regular, with $\tau \in (0,1]$, if there exists $\nu > 0$ and $C < \infty$ such that
\[ \mu([x-\delta, x+\delta]) \le C\delta^\tau \mu([x-\nu, x+\nu]) \tag{7.1} \]
for all $x \in \mathbb{R}$ and $0 < \delta < 1$.

Note that it is equivalent to requiring that there exists $\nu > 0$ and $C < \infty$ such that
\[ \mu([x-\delta, x+\delta]) \le C\delta^\tau \mu([x-\nu, x+\nu]) \tag{7.2} \]
for all $x \in \mathbb{R}$ and $0 < \delta < \nu$. We shall prove this with $\tau = 1$. Recall that the support of the probability measure $\mu_0$ is an interval $[-a,a]$ with $a < \infty$, that $\mu_0$ is symmetric about the origin, and that its density $\rho_0$ is differentiable on $(-a,a)$ and satisfies the following condition:
\[ A \equiv \sup_{\zeta\in(0,a)} \frac{\rho_0'(\zeta)}{\rho_0(\zeta)} < \infty. \tag{7.3} \]
If $B \subset \mathbb{R}$, $\mu(B) = \mu_0(\{\omega : 1/\omega \in B\})$ and the density of $\mu$, $\rho$, is given by
\[ \rho(x) = \frac{\rho_0(1/x)}{x^2}. \tag{7.4} \]
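As a concrete illustration of (7.2) with $\tau = 1$, one can take $\mu_0$ uniform on $[-a,a]$; this choice is an assumption made only for the example (it satisfies (7.3) with $A = 0$). By (7.4) the density of $\mu$ is then $1/(2ax^2)$ for $|x| \ge 1/a$ and zero otherwise, so interval masses are available in closed form. The sketch below scans a grid of $x \ge 0$ and a few values of $\delta < \nu$ and reports the largest ratio $\mu([x-\delta,x+\delta]) / (\delta\,\mu([x-\nu,x+\nu]))$; the function names and grid parameters are hypothetical.

```python
def mu_cdf(t, a=1.0):
    """CDF of mu when mu_0 is uniform on [-a, a] (illustrative assumption).

    mu has density 1/(2*a*x**2) for |x| >= 1/a and vanishes on (-1/a, 1/a).
    """
    b = 1.0 / a
    if t <= -b:
        return -1.0 / (2.0 * a * t)
    if t < b:
        return 0.5
    return 1.0 - 1.0 / (2.0 * a * t)

def mu_interval(lo, hi, a=1.0):
    """mu([lo, hi]); mu has no atoms, so endpoints do not matter."""
    return mu_cdf(hi, a) - mu_cdf(lo, a)

def worst_ratio(nu=0.5, a=1.0, x_max=20.0, x_steps=4000,
                deltas=(0.4, 0.2, 0.1, 0.05, 0.01)):
    """Largest observed value of mu([x-d, x+d]) / (d * mu([x-nu, x+nu]))."""
    largest = 0.0
    for k in range(x_steps + 1):
        x = x_max * k / x_steps
        big = mu_interval(x - nu, x + nu, a)
        if big == 0.0:
            continue  # both sides of (7.2) vanish in the gap around the origin
        for d in deltas:
            small = mu_interval(x - d, x + d, a)
            largest = max(largest, small / (d * big))
    return largest

worst = worst_ratio()
print(worst)
```

For this example the ratio stays below a fixed constant over the whole grid, so (7.2) holds with $\tau = 1$ and that constant as $C$; the largest values occur just to the right of the edge $x = 1/a$ of the support.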
Since in our case $\mu$ is symmetric about the origin, it is sufficient to prove (7.2) for $x \ge 0$. Also it is easy to see that the following condition is sufficient for (7.2) with $\tau = 1$. There exists $\nu > 0$ and $C < \infty$ such that
\[ \rho(x+t') \le C\rho(x+t) \tag{7.5} \]
for all $x \in \mathbb{R}_+$, $-\nu \le t' \le t \le \nu$. Then
\[
\begin{aligned}
\mu([x-\delta, x+\delta]) &= \frac{\delta}{\nu} \int_{-\nu}^{\nu} \rho\Bigl(x + \frac{\delta}{\nu}t\Bigr)\,dt \\
&\le \frac{\delta(C+1)}{\nu} \int_0^{\nu} \rho\Bigl(x + \frac{\delta}{\nu}t\Bigr)\,dt \\
&\le \frac{\delta(C+1)C}{\nu} \int_0^{\nu} \rho(x+t)\,dt \\
&= \frac{\delta(C+1)C}{\nu}\,\mu([x, x+\nu]).
\end{aligned}
\tag{7.6}
\]
Let $b = 1/a$. If $0 \le x \le b - t'$, then $\rho(x+t') = 0$. If $x > b - t'$, then
\[ \ln\rho_0(1/(x+t')) - \ln\rho_0(1/(x+t)) = \frac{t-t'}{(x+t)(x+t')}\,\frac{\rho_0'(\zeta)}{\rho_0(\zeta)}, \tag{7.7} \]
where $\zeta \in (1/(x+t), 1/(x+t'))$. Thus
\[ \ln\rho_0(1/(x+t')) - \ln\rho_0(1/(x+t)) \le \max\Bigl(0, \frac{2A\nu}{b^2}\Bigr). \tag{7.8} \]
Therefore, with $C' = \exp(\max(0, 2A\nu/b^2))$,
\[ \rho_0(1/(x+t')) \le C'\rho_0(1/(x+t)). \tag{7.9} \]
But
\[ \Bigl(\frac{x+t}{x+t'}\Bigr)^2 \le \Bigl(\frac{b+2\nu}{b}\Bigr)^2. \tag{7.10} \]
Thus the inequality (7.5) is satisfied with $C = C'\bigl(\frac{b+2\nu}{b}\bigr)^2$. Finally note that
\[ \int |x|^\varepsilon\,\mu(dx) = \int |x|^{-\varepsilon}\rho_0(x)\,dx < \infty \tag{7.11} \]
for all $\varepsilon < 1$ since $\rho_0$ is continuous at the origin.

References

1. Dorlas, T.C., Macris, N. and Pulé, J.V.: Helv. Phys. Acta. 68, 330 (1995)
2. Dorlas, T.C., Macris, N. and Pulé, J.V.: J. Math. Phys. 37, 1574 (1996)
3. Combes, J.M. and Hislop, P.D.: Commun. Math. Phys. 177, 603 (1996)
4. Wang, W-M.: J. Funct. Anal. 146, 1 (1997)
5. Dorlas, T.C., Macris, N. and Pulé, J.V.: J. Stat. Phys. 87, 847 (1997)
6. Geiler, V.: St. Petersburg. Math. J. 3, 489 (1992)
7. Gredeskul, S.A., Zusman, M., Avishai, Y. and Azbel', M.Ya.: Phys. Rep. 288, 223 (1997)
8. Brézin, E., Gross, D.J. and Itzykson, C.: Nucl. Phys. B 235 [FS11], 24 (1984)
9. Erdős, L.: Probability and Related Fields 112, 321 (1998)
10. Huckestein, B.: Rev. Mod. Phys. 67, 357 (1995)
11. Berezin, F.A. and Fadeev, L.D.: Soviet Math. Dokl. 2, 372 (1961)
12. Aizenman, M. and Molchanov, S.: Commun. Math. Phys. 157, 245 (1993)
13. Kunz, H.: Commun. Math. Phys. 112, 121 (1987)
14. Bellisard, J., Van Elst, A. and Schulz-Baldes, H.: J. Math. Phys. 35, 5373 (1994)
15. Chalker, J.T. and Coddington, P.D.: J. Phys. C 21, 2665 (1988)
16. Dorlas, T.C., Macris, N. and Pulé, J.V.: Quantum Hall effect without divergence of the localization length. Preprint (1998)
17. Aizenman, M. and Graf, G.M.: J. Phys. A 31, 6783 (1998)
18. Albeverio, S., Gesztesy, F., Hoegh-Krohn, R. and Holden, H.: Solvable Models in Quantum Mechanics. Heidelberg: Springer-Verlag, 1988
19. Abramowitz, M. and Stegun, I.A.: Handbook of Mathematical Functions. New York: Dover Publications, 1965
20. Reed, M. and Simon, B.: Methods of Modern Mathematical Physics, Vol II. New York: Academic Press, 1975
21. Carmona, R. and Lacroix, J.: Spectral Theory of Random Schrödinger Operators. Birkhäuser: Boston, 1990
22. Avishai, Y., Redheffer, R.M. and Band, Y.B.: J. Phys. A 25, 3883 (1992)
23. Boas, R.Ph.: Entire Functions. New York: Academic Press, 1954
24. Simon, B. and Wolff, T.: Commun. Pure Appl. Math. 39, 75 (1986)
25. Delyon, F., Lévy, Y. and Souillard, B.: Commun. Math. Phys. 100, 463 (1985)
26. Kotani, S.: In: Proceedings of the 1984 AMS conference on Random Matrices and their Applications. Contemp. Math. 50. Providence, RI: Am. Math. Soc., 1986
Communicated by D. C. Brydges
Commun. Math. Phys. 204, 397 – 423 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
On the Completeness of the Quasinormal Modes of the Pöschl–Teller Potential

Horst R. Beyer 1,2

1 Max Planck Institute for Gravitational Physics, Albert Einstein Institute, D-14476 Golm, Germany.
E-mail: [email protected]
2 University Stuttgart, Institute A for Mechanics, Pfaffenwaldring 9, D-70569 Stuttgart, Germany.
E-mail: [email protected] Received: 10 March 1998 / Accepted: 1 February 1999
Abstract: The completeness of the quasinormal modes of the wave equation with Pöschl–Teller potential is investigated. A main result is that after a large enough time t0 , the solutions of this equation corresponding to C ∞ -data with compact support can be expanded uniformly in time with respect to the quasinormal modes, thereby leading to absolutely convergent series. Explicit estimates for t0 depending on both the support of the data and the point of observation are given. For the particular case of an “early” time and zero distance between the support of the data and observational point, it is shown that the corresponding series is not absolutely convergent, and hence that there is no associated sum which is independent of the order of summation. 1. General Introduction The description of a compact classical system often leads to the consideration of a “small” perturbation of some special solution of its evolution equations. Expanding around this solution leads to a linear evolution equation for some perturbed quantities which characterise the system. For a system with no explicit time dependence a further step consists of finding the normal mode solutions of the evolution equation satisfying certain physical boundary conditions. To provide a complete description of the system under small perturbations every solution of the linear equation satisfying the boundary conditions must have an expansion in terms of these modes. To my knowledge the only well-developed mathematical framework to date for deciding such a “completeness” question is provided by the spectral theory of linear operators in Hilbert spaces. This is the approach taken in this paper. It is frequently the case (as in this paper) that the linear equation is a wave equation. Then it is well-known that the squares of the normal modes frequencies coincide with the spectrum of that linear self-adjoint operator, which is naturally connected with the equation and the boundary conditions. 
Since this spectrum is real, the normal mode frequencies are purely imaginary or real. Using the so-called functional calculus associated
398
H. R. Beyer
with the operator, a representation in terms of the normal modes can be given for the solution of the initial-value problem for the linear equation.

Quasinormal mode solutions of the linear equation are often displayed by, in some sense, dissipative systems. They satisfy boundary conditions which differ from those for the normal modes, but usually are viewed in the same context as the normal mode solutions. From this point of view it is natural to ask whether they are in any sense complete [11]. On the other hand quasinormal frequencies have, in general, both real and imaginary parts and hence their squares cannot belong to the spectrum of any linear self-adjoint operator. In the special case considered in this paper, the system is initially contained in some finite box in space and is “dissipative” if one considers the energy contained in the box as a function of time. But the system is conservative if one considers the energy distributed in the whole space. It turns out that the quasinormal frequencies of the “finite” system are resonances of the operator corresponding to the “infinite” system. The analogous statement can be seen to be true for many other systems.¹

This paper addresses the completeness question of the resonance modes of the infinite system using the framework of “spectral theory”. The system is described by a wave equation in one-dimensional space (as motivated by astrophysical systems). That the system is initially contained in a finite box is displayed by the fact that only initial values with compact support are considered.

2. Introduction

The decay in time of the solutions of the Einstein field equations linearized around the Schwarzschild metric is governed by quasinormal frequencies (“QNF”) and the corresponding modes (“QNM”) [14]. For perturbing fields of the form

$$\Phi(t, x, \theta, \varphi) := \frac{1}{r}\,\phi(t, x) \cdot Y_{\ell m}(\theta, \varphi), \tag{1}$$

where $Y_{\ell m}$ denotes an appropriate tensor spherical harmonic function; $t, r, \theta, \varphi$ are the usual Schwarzschild coordinate functions; $x := r + \ln(r - 1)$ is the “tortoise” coordinate function and $\ell$ is a natural number, one gets the following wave equation for the scalar function $\phi$:²

$$\frac{\partial^2 \phi}{\partial t^2} + \left(-\frac{\partial^2}{\partial x^2} + U\right)\phi = 0, \tag{2}$$

where

$$U(r) := \left(1 - \frac{1}{r}\right) \cdot \left(\frac{\ell(\ell + 1)}{r^2} - \frac{3}{r^3}\right). \tag{3}$$
¹ Such resonances are known to be important in quantum theory and mathematical methods have been developed to deal with them (Vols. III and IV of [12]). However it is also known that the concept of resonances of an operator is far more delicate than that of the spectrum. In contrast to the spectrum, resonances depend not only on the operator, but also on the choice of dense subspace of the underlying function space. In addition much less is known about resonances than about spectra, in particular concerning their behaviour under perturbations of the operator.
² Here the units are chosen such that the Schwarzschild radius is normalized to 1.

A still open mathematical question [11] is whether, and then in which sense, the solutions of (2) corresponding to $C^\infty$-data with compact support can be represented as
Completeness of Quasinormal Modes of Pöschl–Teller Potential
sums of quasinormal mode solutions of (2). The latter are separated solutions satisfying so-called “purely outgoing” boundary conditions (see e.g. [5]). Since there are infinitely many such modes [2], it is in particular important to find out the type of convergence with respect to which such an expansion may be valid. The answer to these questions is obscured by technical problems: the QNF are not explicitly known and there is no convenient analytical representation for the QNM. In such a situation it is natural to ask whether there is any reason to expect that such a quasinormal mode expansion exists. Or more precisely, is there a wave equation of type (2) having infinitely many quasinormal modes such that each solution corresponding to $C^\infty$-data with compact support has an expansion into quasinormal modes? To my knowledge such a wave equation is not known. Hence it is still unclear whether one should expect such a “quasinormal mode expansion” for (2) to exist. Further, if such a quasinormal mode expansion does exist for (2), a natural next step would be to ask whether this is true also for other wave equations, or in other words, whether the phenomenon is in any sense “stable” against “small perturbations” of the potential. Such points suggest the consideration of wave equations other than (2), and in this paper we now look at the wave equation

$$\frac{\partial^2 \phi}{\partial t^2} + \left(-\frac{\partial^2}{\partial x^2} + V\right)\phi = 0, \tag{4}$$

where the potential $V$ is the Pöschl–Teller potential [10],

$$V(x) := \frac{V_0}{\cosh^2(x/b)}, \qquad x \in \mathbb{R}. \tag{5}$$
Here $V_0$ and $b$ are, respectively, the maximal value and the “width” of $V$ and are nonzero positive real numbers (considered as given in the following).

There are good reasons for working with this special choice of the so-called “Pöschl–Teller” potential $V$ instead of the Schwarzschild potential $U$. First, the QNF and QNM are known analytically [5], and there are an infinite number of QNF which are elementary functions of $V_0$ and $b$ (see (20), (21), (29)). In addition the shapes of $U$ and $V$ are similar (see Fig. 1³) and both potentials are integrable over the real line and decay exponentially for $x \to -\infty$. However the decay of $U$ and $V$ differs for $x \to \infty$, where $U$ decays as $1/r^2$ and $V$ decays exponentially. These similarities have already been used in order to approximate the QNF of the Schwarzschild black hole which have “low” imaginary part by the corresponding QNF for $V$ [5]. A final very important reason for considering this particular wave equation is that the resolvent of the Sturm–Liouville operator corresponding to $V$ (given later in Eq. (10)) can be given explicitly in terms of well-known analytic special functions. This cannot be done, so far, for the Schwarzschild potential $U$, and it is this fact which prevents the analysis in this paper from being carried through for (2).

From such considerations it appears that the use of the wave equation with Pöschl–Teller potential is a good starting point for a mathematical investigation of the completeness of quasinormal modes. One may hope that, despite the different decay as $x \to \infty$, the results have some similarities with those for $U$. This is illustrated in Fig. 2,⁴ where the solutions of (2) and (4) are compared. In both cases the initial data describes a Gaussian pulse which is purely incoming from infinity. In the figure, the lines show the resulting outgoing waves, as seen by a distant observer. The solid line corresponds to the Pöschl–Teller potential and the dotted line corresponds to the Schwarzschild potential. At early times the solutions are very similar although their behaviours differ at late times.

³ This figure was kindly provided by G. Allen from the Max-Planck Institute for Gravitational Physics in Potsdam.
⁴ This figure was kindly provided by G. Allen from the Max-Planck Institute for Gravitational Physics in Potsdam. It displays results from a numerical solution of the wave equations (2) and (4).

Fig. 1. A comparison of the Pöschl–Teller and Schwarzschild potentials. The parameters for the Pöschl–Teller potential are fixed by setting the maximum amplitude and the second derivative at this maximum amplitude to be equal to those for the Schwarzschild potential (with $l = 2$). That is, $V_0 = 0.61$ and $b = 2.75$. In the figure, the solid line shows the Pöschl–Teller, and the dotted line the Schwarzschild potential.

The most difficult and time-consuming part of the calculations for the results on completeness was the derivation of the estimates (32) and (33) on the analytic continuation of the resolvent of the Sturm–Liouville operator with Pöschl–Teller potential. It was not clear a priori, from previous works on quasinormal modes, what form the estimates should take in order to prove or disprove these completeness results. Although the estimates are given here only for the Pöschl–Teller potential, one can hope that their structure is representative for other potentials. If this is the case, the form of the estimates (32) and (33) could provide a basis for further completeness calculations for different potentials.

Section 3, which contains the rigorous basis of this paper, is intended to be partly pedagogical. The results apply to a much more general class than just partial differential operators. Although these results can be found in the mathematical literature, they are not easily accessible, and in this section the relevant results are collected and presented in a manner more convenient for quasinormal mode considerations. A study of the literature on quasinormal modes shows that some of these results (especially (16) and (19)) are already used. However the form used is often not valid for the case considered or the proof of its validity is left open. Formulae (16) and (19) in Sect. 3 offer a rigorous starting point for such considerations in the future.
Further, in some more physically motivated papers dealing with quasinormal mode expansions, mathematical terminology such as “convergence” or “completeness” is used somewhat freely. That is, the terminology is used but corresponding proofs are not given rigorously, or are replaced by “physical” arguments. While important, physical intuition into whether or not an infinite sum converges is very different from, and cannot substitute for, a proof of convergence. Hence in this paper much importance is placed on mathematical rigor.
[Figure 2: two stacked panels, each plotted against $t$ ($t + 5.6$ for PT) over the range 200 to 350; image not reproduced.]

Fig. 2. Comparison of the solutions to Eq. (2) (with $l = 2$) and Eq. (4) (with $V_0 = 0.61$ and $b = 2.75$) from the same initial data ($\phi(0, x) = \exp(-1.5(x - 120)^2)$, $\phi_{,t}(0, x) = \phi_{,x}(0, x)$)
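The parameter matching described in the caption of Fig. 1 can be checked numerically. The following is a minimal sketch, not the procedure used to produce the figure: it locates the maximum of $U$ from (3) on a grid in $r$, converts the curvature to the tortoise coordinate via $d/dx = (1 - 1/r)\,d/dr$, and solves $V''(0) = -2V_0/b^2$ for $b$. The function names, the grid bounds and the step sizes are assumptions made for illustration.

```python
import math


def schwarzschild_potential(r, ell=2):
    """U(r) of Eq. (3); units with Schwarzschild radius equal to 1, r > 1."""
    return (1.0 - 1.0 / r) * (ell * (ell + 1) / r**2 - 3.0 / r**3)


def poschl_teller_potential(x, v0, b):
    """V(x) of Eq. (5)."""
    return v0 / math.cosh(x / b) ** 2


def match_poschl_teller(ell=2, r_min=1.01, r_max=6.0, steps=50_000, h=1e-3):
    """Return (V0, b) matching the height and curvature of U at its maximum.

    The curvature is taken with respect to the tortoise coordinate x,
    using d/dx = (1 - 1/r) d/dr and dU/dr = 0 at the maximum.
    Grid bounds and step sizes are illustrative assumptions.
    """
    dr = (r_max - r_min) / steps
    # Grid search for the location of the maximum of U.
    r_peak = max((r_min + i * dr for i in range(steps + 1)),
                 key=lambda r: schwarzschild_potential(r, ell))
    u_peak = schwarzschild_potential(r_peak, ell)
    # Central difference for d^2U/dr^2 at the peak.
    d2u_dr2 = (schwarzschild_potential(r_peak + h, ell)
               - 2.0 * u_peak
               + schwarzschild_potential(r_peak - h, ell)) / h**2
    d2u_dx2 = (1.0 - 1.0 / r_peak) ** 2 * d2u_dr2
    # For V(x) = V0 / cosh^2(x/b) one has V''(0) = -2 V0 / b^2.
    b = math.sqrt(-2.0 * u_peak / d2u_dx2)
    return u_peak, b


v0_fit, b_fit = match_poschl_teller()
print(f"V0 = {v0_fit:.3f}, b = {b_fit:.3f}")
```

Under these assumptions the fitted values come out close to the rounded values $V_0 = 0.61$ and $b = 2.75$ quoted in the caption.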
Now, for those readers who are not concerned with the details of the various proofs, the main results of the paper are summarised. For this, denote by $q(A)$ the set of quasinormal frequencies of $V$ and for each $\omega \in q(A)$ denote by $u_\omega$ the corresponding quasinormal eigenfunction. In addition let $f$ be some complex-valued $C^\infty$-function with compact support and let $\phi_f$ be the corresponding solution of (4) with initial values

$$\phi_f(0, x) = 0 \quad \text{and} \quad \frac{\partial \phi_f}{\partial t}(0, x) = f(x), \tag{6}$$

for all real $x$. Finally denote by $\phi_{g,f}$ the following averaged function obtained from $\phi_f$,

$$\phi_{g,f}(t) := \begin{cases} \displaystyle\int_{-\infty}^{+\infty} g^*(x) \cdot \phi_f(t, x)\,dx & \text{for } t \geq 0 \\ 0 & \text{for } t < 0, \end{cases} \tag{7}$$

where $g$ is some complex-valued $C^\infty$-function with compact support. The main results of this paper are:
1. The quasinormal modes of $V$ are complete, in the sense that there is a family of complex numbers $c_\omega$, $\omega \in q(A)$ (given explicitly in Sect. 5, see (38)) such that for a large enough $t_0$ and for every $t \in [t_0, \infty)$,

$$\left( c_\omega \cdot \int_{-\infty}^{+\infty} u_\omega(y') f(y')\,dy' \cdot \int_{-\infty}^{+\infty} g^*(x') u_\omega(x')\,dx' \cdot e^{i\omega t} \right)_{\omega \in q(A)} \tag{8}$$

is absolutely summable with sum $\phi_{g,f}(t)$. So the summation of this sequence (using any order of summation) gives the quasinormal mode expansion of $\phi_{g,f}$ for large times.⁵ In addition, estimates for the possible size of such $t_0$ are given depending on the supports of $f$ and $g$. It is shown that $t_0$ can be chosen to be any real number which is greater than some explicitly given real number $M(g, f)$ (see (31) in Sect. 4).

2. $\phi_{g,f}$ has an analytic extension $\bar\phi_{g,f}$ to the strip $(M(g, f), \infty) \times \mathbb{R}$ and the sequence (8) is uniformly absolutely summable on $[t_0, \infty) \times K_0$ with sum $\bar\phi_{g,f}$ for each $t_0 \in (M(g, f), \infty)$ and each compact subset $K_0$ of $\mathbb{R}$. As a consequence the sequence (8) can be termwise differentiated to any order on that strip and the resulting sequence of derivatives is uniformly summable on $[t_0, \infty) \times K_0$ with a sum equal to the corresponding derivative of $\bar\phi_{g,f}$.

3. A result shown in Appendix B indicates that the QNM sum exists only for large enough times. There it is shown that the sequence

$$\left( c_\omega \, [u_\omega(0)]^2 \right)_{\omega \in q(A)}, \tag{9}$$
which one gets formally from (8) by the substitutions $t = 0$, $f$ by $\delta(x)$ and $g$ by $\delta(y)$, is not absolutely summable. Hence for that case there cannot be associated a sum with (9) which is independent of the order of the summation.

⁵ Note that the imaginary parts of the quasinormal frequencies are different from zero and hence that the absolute value of the exponential functions in this sequence does depend on $t$. Here also a remark concerning the role of the test function $g$ might be in order. This test function is mainly for mathematical convenience. Below there is also given a corresponding result on the sum of the sequence which one gets from (8) by formally substituting $f$ by $\delta(x' - x)$ and $g$ by $\delta(y' - y)$, respectively, for some $x \in \mathbb{R}$ and $y \in \mathbb{R}$. The corresponding sum is then a Green’s function (more precisely the so-called “commutator distribution”) which is associated to (4).

The plan for the remaining part of this paper is the following. In Sect. 3 the wave equation (4) is associated with the linear self-adjoint (Sturm–Liouville) operator $A$ (10). A representation of the solution of the initial value problem is given. This representation is found by applying the members of a special family (parameterized by time) of functions of $A$ (which are bounded linear operators) to the data (see e.g. [3]). Using a result of semigroup theory [6] these functions are represented by integrals over the resolvent of $A$ (see (16) or (121) in Appendix B). Because of the analyticity properties of the resolvent the method of contour integration can be used in Sect. 5. Using the residue theorem the quasinormal frequencies (and modes) come in, since they are (in quantum terminology) common poles of the analytic continuations of a set of transition amplitudes of the resolvent (see e.g. [12, Vol. IV, p. 55]). By explicit estimates on these analytic continuations, which are supplied in Sect. 4, it is then shown that the resonance modes are complete for a large enough time $t_0$. In addition estimates for $t_0$ are given. These bounds depend on the support of the data. Section 6 gives a discussion of the results. Appendix A supplies mathematical details to the results of Sects. 3, 4 and 5. Finally, for readers better
acquainted with the “Laplace method” [13] than operator theory, Appendix B gives a (not completely rigorous) derivation for the basic representation (16) used in this paper for the solution of the initial value problem for (4).

3. An Initial Value Formalism for the Wave Equation

In order to give (4) a well-defined meaning one has, of course, to specify the differentiability properties of $\phi$. In the following a standard abstract approach for giving such a specification is used.⁶ The purpose of this section is the derivation of the representations (16) and (19) of the solutions of the initial-value problem of (4). These representations are basic for this paper. The methods for this derivation come from semigroup theory and spectral theory. For the reader not familiar with these methods, this is rederived in Appendix B using the so-called “Laplace method” (e.g. [13]). Define the Sturm–Liouville operator $A$ in $L^2(\mathbb{R})$ by

$$Af := -f'' + Vf, \tag{10}$$
for each $f \in W^2(\mathbb{R})$. Here $L^2(\mathbb{R})$ denotes the Hilbert space of complex-valued square integrable functions on the real line with scalar product $\langle\,\cdot\,|\,\cdot\,\rangle$ defined by

$$\langle f | g \rangle := \int_{-\infty}^{+\infty} f^*(x) \cdot g(x)\,dx, \tag{11}$$

for all $f, g \in L^2(\mathbb{R})$; $W^2(\mathbb{R})$ denotes the dense subspace of $L^2(\mathbb{R})$ consisting of two times distributionally differentiable elements, and the distributional derivative is denoted by a prime. By the Rellich–Kato theorem,⁷ it follows that $A$ is a densely defined linear and self-adjoint operator in $L^2(\mathbb{R})$ which results from perturbing the linear self-adjoint operator $A_0$ defined by

$$A_0 f := -f'' \tag{12}$$

for each $f \in W^2(\mathbb{R})$ by the bounded linear self-adjoint operator of multiplication with the function $V$⁸ (the so-called maximal multiplication operator corresponding to $V$). Further, the spectrum of $A$ consists of all non-negative real numbers. The proof of this, which is not difficult, is not given here. The formulation of (4) used in the following is given by

$$\ddot\phi(t) = -A\phi(t), \tag{13}$$

for each $t \in \mathbb{R}$, where $\phi$ is required to be a $C^2$-map from $\mathbb{R}$ into $L^2(\mathbb{R})$ with values in $W^2(\mathbb{R})$, and a dot denotes time differentiation.⁹

⁶ See for example p. 295 in Vol. II of [12]. Of course there are also other approaches for such a specification. Usually, all approaches turn out to be “equivalent” in that the unique solution of the initial value problem in one approach can be reinterpreted in such a way that it coincides with the corresponding one in another approach. The approach chosen in this paper has the advantage that it leads in a natural way to eigenfunction expansions and/or quasinormal eigenfunction expansions of the solution.
⁷ Theorem X.12 in Vol. II of [12].
⁸ See for example Proposition 1 in Chapter VIII.3, Vol. I of [12].
⁹ Hence (4) is viewed, similarly as in the case of the Schroedinger equation (but with a second order time derivative), as an ordinary differential equation for a curve in a Hilbert space.

Using only abstract properties of $A$, namely its self-adjointness and its positivity, it follows from the proposition on p. 295
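in Vol. II of [12] and Theorem 11.6.1 in [6] (see also Theorem 1 in Appendix B) that for each $f \in W^2(\mathbb{R})$ there is a unique $\phi_f \in C^2(\mathbb{R}, L^2(\mathbb{R}))$ with values in $W^2(\mathbb{R})$, satisfying the initial conditions

$$\phi_f(0) = 0 \quad \text{and} \quad \dot\phi_f(0) = f, \tag{14}$$

and that the solution $\phi_f$ has the following representation. Define

$$\phi_{g,f}(t) := \begin{cases} \langle g | \phi_f(t) \rangle & \text{for } t \geq 0 \\ 0 & \text{for } t < 0. \end{cases} \tag{15}$$

The representation of $\phi_f$ is given by

$$\phi_{g,f}(t) = \frac{1}{\sqrt{2\pi}}\, e^{\varepsilon t} \left[ F^{-1} R_{g,f}(\cdot - i\varepsilon) \right](t), \tag{16}$$

for all $t \in \mathbb{R}$, where $\varepsilon$ is an otherwise arbitrary, strictly positive real number; $g$ is an otherwise arbitrary element of $L^2(\mathbb{R})$; $F$ is the unitary linear Fourier transformation on¹⁰ $L^2(\mathbb{R})$ and $R_{g,f}$ is defined by

$$R_{g,f}(\omega) := \langle g | R(\omega^2) f \rangle, \tag{17}$$

for each $\omega \in \mathbb{C}$ with $\mathrm{Im}(\omega) < 0$. Here $R : \mathbb{C} \setminus [0, \infty) \to L(L^2(\mathbb{R}), L^2(\mathbb{R}))$ is the so-called resolvent of $A$, which associates to each $\lambda \in \mathbb{C} \setminus [0, \infty)$ the inverse of the operator $A - \lambda$. $L(L^2(\mathbb{R}), L^2(\mathbb{R}))$ denotes the linear space of bounded linear operators on $L^2(\mathbb{R})$. Note that $R_{g,f}(\cdot - i\varepsilon)$ is square integrable and also integrable, as can easily be concluded from the bound

$$|R_{g,f}(\omega)| \leq \frac{\|f\|_2 \cdot \|g\|_2}{\max\{2|\omega_2| \cdot |\omega_1|,\ \omega_2^2 - \omega_1^2\}}, \tag{18}$$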
which is valid for each $\omega = \omega_1 + i\omega_2 \in \mathbb{C}$ with $\omega_2 < 0$. This bound requires¹¹ also only the self-adjointness and positivity of $A$. Finally, using Lebesgue’s dominated convergence theorem, it follows from (16) and (18) that for all $t \in [0, \infty)$,

$$\phi_{g,f}(t) = \frac{1}{2\pi} \lim_{\nu \to \infty} \int_{-\nu}^{\nu} e^{it \cdot (\omega - i\varepsilon)} R_{g,f}(\omega - i\varepsilon)\,d\omega. \tag{19}$$
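One way to see where the denominator in (18) comes from (this may differ from the author's own argument) is the standard estimate $\|R(\lambda)\| \leq 1/\mathrm{dist}(\lambda, [0, \infty))$ for a positive self-adjoint operator, combined with the Cauchy–Schwarz inequality. It then suffices that $\max\{2|\omega_2||\omega_1|,\ \omega_2^2 - \omega_1^2\}$ never exceeds the distance of $\lambda = \omega^2$ to the half-line $[0, \infty)$. The sketch below checks that elementary inequality on random points of the lower half-plane; the function names, the sampling box and the tolerance are assumptions made for illustration.

```python
import random


def distance_to_half_line(lam):
    """Distance from the complex number lam to the half-line [0, inf)."""
    return abs(lam.imag) if lam.real >= 0.0 else abs(lam)


def denominator_of_bound(omega):
    """max{2|w2||w1|, w2^2 - w1^2} from Eq. (18), where omega = w1 + i*w2."""
    w1, w2 = omega.real, omega.imag
    return max(2.0 * abs(w2) * abs(w1), w2 * w2 - w1 * w1)


def check_bound(samples=10_000, seed=0):
    """Return True if the denominator of (18) never exceeds dist(omega^2, [0, inf))."""
    rng = random.Random(seed)
    for _ in range(samples):
        # Random point with Im(omega) < 0 (sampling box is an arbitrary choice).
        omega = complex(rng.uniform(-5.0, 5.0), -rng.uniform(1e-3, 5.0))
        lam = omega * omega
        # Small relative tolerance to absorb floating-point rounding.
        if denominator_of_bound(omega) > distance_to_half_line(lam) * (1.0 + 1e-12):
            return False
    return True


print(check_bound())
```

In exact arithmetic the two quantities coincide whenever $\mathrm{Re}(\omega^2) \geq 0$, and the denominator is the smaller one otherwise, which is why the check uses a small relative tolerance.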
The representations (16) and (19) have here been given for the special case of the Pöschl–Teller potential. In fact, as hinted at in the above text, (16) is an application of the abstract Theorem 1 given at the end of Appendix B, which is far more general. The representation (120) given in that theorem is, for instance, also valid for wave equations in arbitrary space dimensions.

¹⁰ For the definition see Chap. IX in Vol. II of [12].
¹¹ Spectral Theorem VIII.5(b) in Vol. I of [12].
4. Analytic Properties of the Resolvent

Formula (19) is the starting point of a contour integration, which is performed in Sect. 5, and eventually leads to the results on the completeness of the quasinormal modes. The basis for that contour integration is provided by the estimates (32), (33) of this section on the analytic continuation of $R_{g,f}$. The purpose of this section is mainly to explain these estimates. A sketch of the proofs of these estimates is given in Appendix A.¹²

Let $f$ and $g$ be arbitrary, considered as given from now on, complex-valued $C^2$-functions on $\mathbb{R}$ with compact supports. Then it follows from general analytic properties of resolvents that the function $R_{g,f}$ defined in Eq. (17) is an analytic function on the open lower half-plane. Now, using for the first time¹³ the special properties of the Pöschl–Teller potential, it will be concluded that $R_{g,f}$ has an analytic extension into the closed upper half-plane. In order to see this the auxiliary function $\bar R_{g,f}$ is now defined. Define the set $q(A)$ of “quasinormal frequencies of $A$” by

$$q(A) := \bigcup_{k \in \mathbb{N}} \{\omega_k^-, \omega_k^+\}, \tag{20}$$

where for each $k \in \mathbb{N}$,

$$\omega_k^- := i \cdot \left(\tfrac{1}{2} - \alpha + k\right)/b, \qquad \omega_k^+ := i \cdot \left(\tfrac{1}{2} + \alpha + k\right)/b, \tag{21}$$

and

$$\alpha := \begin{cases} \sqrt{\tfrac{1}{4} - b^2 V_0} & \text{for } b^2 V_0 \leq \tfrac{1}{4} \\[4pt] i\sqrt{b^2 V_0 - \tfrac{1}{4}} & \text{for } b^2 V_0 > \tfrac{1}{4}. \end{cases} \tag{22}$$
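Definitions (20) to (22) translate directly into a few lines of code. The following is a minimal sketch; the function names are assumptions, and the sample parameters $V_0 = 0.61$, $b = 2.75$ are the ones quoted for Fig. 1. It relies on `cmath.sqrt` returning the root with non-negative imaginary part for a negative real argument, which matches the second case of (22).

```python
import cmath


def alpha(v0, b):
    """alpha of Eq. (22): real for b^2 V0 <= 1/4, purely imaginary otherwise."""
    return cmath.sqrt(0.25 - b * b * v0)


def quasinormal_frequencies(v0, b, k_max):
    """Return [(omega_k^-, omega_k^+) for k = 0..k_max] following Eq. (21)."""
    a = alpha(v0, b)
    return [(1j * (0.5 - a + k) / b, 1j * (0.5 + a + k) / b)
            for k in range(k_max + 1)]


# Parameters quoted for Fig. 1.
for k, (w_minus, w_plus) in enumerate(quasinormal_frequencies(0.61, 2.75, 3)):
    print(k, w_minus, w_plus)
```

For these parameters $b^2 V_0 > 1/4$, so $\alpha$ is purely imaginary: the two families have real parts of equal size and opposite sign (about $\pm 0.76$), and imaginary parts $(k + 1/2)/b$.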
¹² For the reader acquainted with the analogous situation for the Yukawa potential (which, like the Pöschl–Teller potential, is decaying exponentially for large positive values of the parameter) we mention that for that case such a continuation is not possible. In that case a further branch cut would appear in the upper half-plane [8].
¹³ Apart from its positivity, which has already been used in concluding that $A$ is a positive operator.

For each $\omega \in \mathbb{C} \setminus q(A)$ the corresponding $\bar R_{g,f}(\omega)$ is defined by

$$\bar R_{g,f}(\omega) = \iint_{\mathbb{R}^2} g^*(x) K(\omega, x, y) f(y)\,dx\,dy, \tag{23}$$

where for each $x, y \in \mathbb{R}$:

$$K(\omega, x, y) = -\frac{1}{W(\omega)} \begin{cases} u_r(\omega, x)\,u_l(\omega, y) & \text{for } y \leq x \\ u_l(\omega, x)\,u_r(\omega, y) & \text{for } y > x, \end{cases} \tag{24}$$

and for each $\omega \in \mathbb{C}$, $x \in \mathbb{R}$:

$$u_l(\omega, x) := e^{i\omega x} \cdot \bar F\!\left(\tfrac{1}{2} - \alpha,\ \tfrac{1}{2} + \alpha,\ 1 + ib\omega,\ \frac{1}{1 + e^{-2x/b}}\right), \qquad u_r(\omega, x) := u_l(\omega, -x), \tag{25}$$
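and

$$W(\omega) := u_l(\omega, x)\,(u_r(\omega, \cdot))'(x) - u_r(\omega, x)\,(u_l(\omega, \cdot))'(x) = -\frac{2}{b} \cdot \frac{1}{\Gamma\!\left(\tfrac{1}{2} + \alpha + ib\omega\right)} \cdot \frac{1}{\Gamma\!\left(\tfrac{1}{2} - \alpha + ib\omega\right)}. \tag{26}$$

Here $\bar F : \mathbb{C}^3 \times U_1(0) \to \mathbb{C}$ is the analytic extension of the function

$$\mathbb{C}^2 \times (\mathbb{C} \setminus -\mathbb{N}) \times U_1(0) \to \mathbb{C}, \qquad (a, b, c, z) \mapsto F(a, b, c, z)/\Gamma(c), \tag{27}$$

where the hypergeometric function (Gauss series) $F$ and the Gamma function $\Gamma$ are defined according to [1] and $1/\Gamma$ denotes the extension of $(\mathbb{C} \setminus -\mathbb{N} \to \mathbb{C},\ c \mapsto 1/\Gamma(c))$ to an entire analytic function. Note that for each $\omega \in \mathbb{C}$ the corresponding functions $u_l(\omega, \cdot)$, $u_r(\omega, \cdot)$ satisfy

$$\begin{aligned} (u_l(\omega, \cdot))''(x) - (V(x) - \omega^2) \cdot u_l(\omega, x) &= 0, \\ (u_r(\omega, \cdot))''(x) - (V(x) - \omega^2) \cdot u_r(\omega, x) &= 0, \end{aligned} \tag{28}$$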
for each $x \in \mathbb{R}$. In addition, for each $\omega \in \mathbb{C}$ with $\mathrm{Im}(\omega) < 0$, the associated $u_l(\omega, \cdot)$, $u_r(\omega, \cdot)$ is $L^2$ near $-\infty$ and $+\infty$, respectively. Using this, along with general results on “Sturm–Liouville” operators (see e.g. [15]) and differentiation under the integral sign, it follows that $\bar R_{g,f}$ is an analytic function on $\mathbb{C} \setminus q(A)$, which coincides with $R_{g,f}$ on the open lower half-plane. The proof of this is elementary and not given in this paper.

The QNF of $A$, which coincide with the zeros of the Wronskian determinant function $W$, are poles of $\bar R_{g,f}$. These poles are simple for the case $\alpha \neq 0$ and second order for the case $\alpha = 0$. In somewhat misleading, but common mathematical terminology, such poles are often called “second sheet poles of the resolvent (of $A$)” or “resonances” (of $A$) (see for example Vol. IV of [12]). This terminology is somewhat misleading, because they depend not only on $A$, but also on the choice of a dense subspace of $L^2(\mathbb{R})$ (see for example Vol. IV of [12]). Here this subspace is the space where the data for (4) are taken from, namely the space of complex-valued $C^\infty$-functions on the real line with compact support. The QNM corresponding to the QNF of $A$, $u_r(\omega, \cdot)$, $u_l(\omega, \cdot)$, $\omega \in q(A)$, satisfy

$$\begin{aligned} u_r\!\left(\omega_k^\pm, x\right) &= (-1)^k \cdot u_l\!\left(\omega_k^\pm, x\right), \\ u_{\omega_k^\pm}(x) := u_l\!\left(\omega_k^\pm, x\right) &= \frac{1}{\Gamma\!\left(\tfrac{1}{2} \mp \alpha - k\right)} \cdot \left(2\cosh(x/b)\right)^{\frac{1}{2} \pm \alpha + k} \cdot F\!\left(-k,\ -k \mp 2\alpha,\ \tfrac{1}{2} \mp \alpha - k,\ \frac{1}{1 + e^{-2x/b}}\right), \end{aligned} \tag{29}$$
for each $k \in \mathbb{N}$, $x \in \mathbb{R}$. This result is also easy to see and its proof is not given in this paper. In view of the analytic properties of $R_{g,f}$, it is natural to try to evaluate the right hand side of (19) by contour integration. This is done in the next section and that contour integration leads to the completeness results of this paper on the QNM of the Pöschl–Teller potential. The basis for the contour integration is provided by estimates on $\bar R_{g,f}$ which are now given.
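As a consistency check of (21), (22), (28) and (29), the lowest modes $k = 0$ can be verified by hand. For $k = 0$ the Gauss series in (29) reduces to 1, so that, up to the constant factor $1/\Gamma(\tfrac{1}{2} \mp \alpha)$, the mode is a power of $2\cosh(x/b)$. The short computation below is a sketch added for illustration; the abbreviation $s$ is not used in the original text.

```latex
\begin{aligned}
u(x) &= \left(2\cosh(x/b)\right)^{s}, \qquad s := \tfrac{1}{2} \pm \alpha, \qquad \omega = \omega_0^{\pm} = i s / b, \\
u'(x) &= \frac{s}{b}\,\tanh(x/b)\,u(x), \\
u''(x) &= \left[\frac{s^2}{b^2}\tanh^2(x/b) + \frac{s}{b^2}\,\frac{1}{\cosh^2(x/b)}\right] u(x)
        = \left[\frac{s^2}{b^2} + \frac{s(1 - s)}{b^2 \cosh^2(x/b)}\right] u(x), \\
s(1 - s) &= \tfrac{1}{4} - \alpha^2 = b^2 V_0 \quad \text{(both cases of (22))}, \\
u''(x) - \left(V(x) - \omega^2\right) u(x) &= \left[\frac{s^2}{b^2} + V(x) - V(x) - \frac{s^2}{b^2}\right] u(x) = 0 .
\end{aligned}
```

So the $k = 0$ modes solve (28) at $\omega = \omega_0^\pm$, and they grow like $e^{\mathrm{Re}(s)|x|/b}$ for $|x| \to \infty$, which is the exponential growth that the support-dependent times $t_0$ in the completeness results have to compensate.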
The estimates depend on the parameters $d(g, f)$, $m(g, f)$ and $M(g, f)$, which define certain “distances” between the supports of $g$ and $f$. These distances are defined by

$$\begin{aligned} d(g, f) &:= \min\{|x - y| : x \in \mathrm{supp}(g) \text{ and } y \in \mathrm{supp}(f)\}, \\ m(g, f) &:= \max\{|x - y| : x \in \mathrm{supp}(g) \text{ and } y \in \mathrm{supp}(f)\}, \\ M(g, f) &:= \max\{D(x, y) : x \in \mathrm{supp}(g) \text{ and } y \in \mathrm{supp}(f)\} \geq m(g, f), \end{aligned} \tag{30}$$

where

$$D(x, y) := |x - y| + b \cdot \begin{cases} \ln(1 + 2e^{-2x/b}) + \ln(1 + 2e^{2y/b}) & \text{for } y \leq x \\ \ln(1 + 2e^{-2y/b}) + \ln(1 + 2e^{2x/b}) & \text{for } y > x, \end{cases} \tag{31}$$

for each $x \in \mathbb{R}$ and $y \in \mathbb{R}$. Note that the quantities $d(g, f)$ and $m(g, f)$ have an obvious geometrical interpretation.

The following estimates hold for $\bar R_{g,f}$ and each $\omega = \omega_1 + i\omega_2 \in \mathbb{C} \setminus (q(A) \cup -q(A))$:

$$|\bar R_{g,f}(\omega)| \leq C_1(g, f) \cdot \begin{cases} e^{\omega_2 \cdot d(g, f)} & \text{for } \omega_2 < 0 \\ e^{\omega_2 \cdot M(g, f)} & \text{for } \omega_2 \geq 0 \end{cases} \cdot \frac{e^{2\pi b|\omega_1|}}{|\cos(2\pi\alpha) + \cosh(2\pi b\omega)|} \cdot \left(1 + 4b^2\omega_1^2\right)^{-1/2}, \tag{32}$$

and if in addition either $\mathrm{supp}(f) \subset [0, \infty)$ and $\mathrm{supp}(g) \subset (-\infty, 0]$, or $\mathrm{supp}(f) \subset (-\infty, 0]$ and $\mathrm{supp}(g) \subset [0, \infty)$:

$$|\bar R_{g,f}(\omega)| \leq C_2(g, f) \cdot \begin{cases} e^{\omega_2 \cdot d(g, f)} & \text{for } \omega_2 < 0 \\ e^{\omega_2 \cdot m(g, f)} & \text{for } \omega_2 \geq 0 \end{cases} \cdot \frac{e^{2\pi b|\omega_1|}}{|\cos(2\pi\alpha) + \cosh(2\pi b\omega)|} \cdot \left(1 + 4b^2\omega_1^2\right)^{-1/2}, \tag{33}$$

where $C_1(g, f), C_2(g, f) \in [0, \infty)$ are given in Appendix A.

The derivation of (32) and (33) is given in Appendix A.¹⁴ They were obtained by different methods of estimation. Note that, depending on the methods used in their derivation, these estimates are “singular” in the open lower half-plane at the elements of $-q(A)$, although $\bar R_{g,f}$ is analytic there. This will not be relevant in the following. From (32) and (33) it follows, in particular, that the restriction of $\bar R_{g,f}$ to the real axis is square integrable. This is used in the contour integration in the next section. Note that the corresponding statement is false for the operator $A_0$ (see (12) for the definition), although it is only a bounded (i.e. in the operator theoretic sense “very small”) perturbation of $A$. For this case the corresponding $\bar R_{g,f}$ is analytic on $\mathbb{C} \setminus \{0\}$ and has in general a first order pole at $\omega = 0$.

¹⁴ Considering the importance of these estimates for the decision of the completeness of the quasinormal modes, one might ask whether such estimates are also used in the literature on resonances oriented more towards quantum mechanics. Seemingly this is not the case. On the other hand, in private communication, M. Klein from the University of Potsdam kindly pointed out that the estimates on the so-called “Jost functions” might provide the means to derive such estimates also for the more general class of so-called “dilation analytic potentials” (see e.g. [12], Vol. IV, for the definition of this class of potentials). For such estimates compare for instance the paper [7]. This might help to generalize the results of this paper to a larger class of potentials.
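The exceptional set $q(A) \cup -q(A)$ in (32) and (33) can be read off from the denominator $\cos(2\pi\alpha) + \cosh(2\pi b\omega)$. The following short computation is a sketch added for illustration; the auxiliary variable $s$ and the integer $n$ are not part of the original notation.

```latex
\begin{aligned}
&\text{Write } \omega = i s / b \text{ with } s \in \mathbb{C}. \text{ Then } \cosh(2\pi b \omega) = \cosh(2\pi i s) = \cos(2\pi s), \text{ and} \\
&\cos(2\pi\alpha) + \cos(2\pi s) = 2 \cos\!\big(\pi (s + \alpha)\big) \cos\!\big(\pi (s - \alpha)\big). \\
&\text{The right hand side vanishes exactly when } s = \tfrac{1}{2} - \alpha + n \ \text{ or } \ s = \tfrac{1}{2} + \alpha + n \ \text{ for some } n \in \mathbb{Z}. \\
&\text{For } n = k \geq 0 \text{ this gives } \omega = \omega_k^{-} \text{ or } \omega = \omega_k^{+}, \text{ i.e. the elements of } q(A) \text{ in (21).} \\
&\text{For } n = -k - 1 \leq -1 \text{ one finds } s = -\left(\tfrac{1}{2} + \alpha + k\right) \text{ or } s = -\left(\tfrac{1}{2} - \alpha + k\right), \text{ i.e. } \omega = -\omega_k^{+} \text{ or } \omega = -\omega_k^{-}.
\end{aligned}
```

Hence the right hand sides of (32) and (33) are finite precisely on $\mathbb{C} \setminus (q(A) \cup -q(A))$, which is consistent with the remark that the estimates are “singular” at the elements of $-q(A)$ even though $\bar R_{g,f}$ is analytic there.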
5. Consequences

A first implication of the estimates (32) and (33) is, roughly speaking, that for the special case of the Pöschl–Teller potential, the formula (16) is also true for the case $\varepsilon = 0$, making subsequent contour integration easier. This can be seen as follows. The estimates (32) and (33) imply the boundedness of the function which associates the value $\|R_{g,f}(\cdot + i\omega_2)\|_2$ to each $\omega_2 \in (-\infty, 0)$, where $\|\cdot\|_2$ denotes the norm which is induced on $L^2(\mathbb{R})$ by the scalar product $\langle\,\cdot\,|\,\cdot\,\rangle$. Hence it follows by a Paley–Wiener theorem¹⁵ that the sequence $(R_{g,f}(\cdot + i\omega_2))_{\omega_2 \in (-\infty, 0)}$ converges for $\omega_2 \to 0$ in $L^2(\mathbb{R})$ to the restriction $\bar R_{g,f}|_{\mathbb{R}}$ of $\bar R_{g,f}$ to the real axis. Hence (16) and the continuity of the Fourier transformation lead to

$$\phi_{g,f} = \frac{1}{\sqrt{2\pi}} \cdot F^{-1} \bar R_{g,f}|_{\mathbb{R}}. \tag{34}$$

Using a well-known result in the theory of the Fourier transformation,¹⁰ it follows that there exists a subset $N$ of $\mathbb{R}$ having Lebesgue measure zero such that, for each $t \in [0, \infty) \setminus N$,

$$\phi_{g,f}(t) = \frac{1}{2\pi} \cdot \lim_{\nu \to \infty} \int_{-\nu}^{\nu} e^{it\omega} \cdot \bar R_{g,f}(\omega)\,d\omega. \tag{35}$$

In particular this implies that $\phi_{g,f}$ is square integrable, the corresponding statement not being generally true when the operator $A$ is replaced by $A_0$.

Equation (35) can now be contour integrated, using the Cauchy integral theorem and Cauchy integral formula, to give an expansion of $\phi_{g,f}$ with respect to the QNM. In the following, for convenience, the case $\alpha = 0$ is excluded. Then the QNF of $A$ are simple poles of $\bar R_{g,f}$. But with the help of (32) and (33), the same contour integration can also be carried through for the case $\alpha = 0$, leading to similar results. The contours are chosen as the boundaries of the rectangles with corners $(-\nu, 0)$, $(\nu, 0)$, $(\nu, n/b)$, $(-\nu, n/b)$ and $(-\nu, 0)$, $(\nu, 0)$, $(\nu, -n/b)$, $(-\nu, -n/b)$, where $\nu$ is an integer and $n$ is a natural number. Then, following from (32) and (33), the integrals along the paths in the upper and lower half-plane vanish for certain $t$ in the limit when first $\nu \to \infty$ and then $n \to \infty$. The calculations for this are elementary but lengthy and will not be carried through in this paper; only their results will be given in the following. In particular, as demanded by causality, the function $\phi_{g,f}$ vanishes on the interval $[0, d(g, f)]$, as is seen by closing the contour in the lower half-plane. Closing the contour in the upper half-plane leads to two statements concerning the expansion of $\phi_{g,f}$ in the QNM. First define $\mu$ by

$$\mu := \begin{cases} m(g, f) & \text{if either } \mathrm{supp}(f) \subset [0, \infty) \text{ and } \mathrm{supp}(g) \subset (-\infty, 0] \\ & \text{or } \mathrm{supp}(f) \subset (-\infty, 0] \text{ and } \mathrm{supp}(g) \subset [0, \infty) \\ M(g, f) & \text{otherwise.} \end{cases} \tag{36}$$

¹⁵ See for example Theorems 1 and 2 in Sect. 4, Chap. VI of [16].
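Now define for each $n \in \mathbb{N}$ and $t \in \mathbb{C}$ the entire analytic function $s_{g,f,n}$ by

$$\begin{aligned} s_{g,f,n}(t) := \sum_{k=0}^{n} \Bigg[\, & c_{\omega_k^-} \int_{-\infty}^{+\infty} u_{\omega_k^-}(y) f(y)\,dy \int_{-\infty}^{+\infty} g^*(x)\, u_{\omega_k^-}(x)\,dx \; e^{i\omega_k^- t} \\ {}+{} & c_{\omega_k^+} \int_{-\infty}^{+\infty} u_{\omega_k^+}(y) f(y)\,dy \int_{-\infty}^{+\infty} g^*(x)\, u_{\omega_k^+}(x)\,dx \; e^{i\omega_k^+ t} \Bigg], \end{aligned} \tag{37}$$

where for each $k \in \mathbb{N}$,

$$c_{\omega_k^-} := \frac{(-1)^k\, \pi / (2\sin(2\pi\alpha))}{\Gamma(1 + k)\,\Gamma(1 - 2\alpha + k)}, \qquad c_{\omega_k^+} := \frac{(-1)^{k+1}\, \pi / (2\sin(2\pi\alpha))}{\Gamma(1 + k)\,\Gamma(1 + 2\alpha + k)}. \tag{38}$$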
The following statements (i) and (ii) are then true.

(i) For each $t_0 \in (\mu, \infty)$ the sequence $(s_{g,f,n})_{n \in \mathbb{N}}$ converges on $[t_0, \infty)$ in the $L^2$-mean to $\phi_{g,f}$.

(ii) The restriction of $\phi_{g,f}$ to $(\mu, \infty)$ has an extension to an analytic function on the strip $(\mu, \infty) \times \mathbb{R}$. For each $t_0 \in (\mu, \infty)$ and each compact subset $K_0$ of $\mathbb{R}$ the sequence $(s_{g,f,n})_{n \in \mathbb{N}}$ converges uniformly on $[t_0, \infty) \times K_0$ to this extension.

Note that in these results a special order of the summation for the QNM sequence (37) is used, which is induced by the chosen contour in the integration. That this result is independent of this order of summation for $\mu := M(g, f)$ follows from further results on the summability of the sequence

$$\left( c_\omega\, u_\omega(y)\, u_\omega(x)\, e^{i\omega t} \right)_{\omega \in q(A)}, \tag{39}$$

for given $x \in \mathbb{R}$, $y \in \mathbb{R}$ and $t \in [0, \infty)$, which are now stated. The corresponding proofs are given in Appendix B. There, it is shown by direct estimates on the sequence elements that, given $x \in \mathbb{R}$ and $y \in \mathbb{R}$, this sequence is absolutely and uniformly summable on $[t_0, \infty) \times K_0$, where $t_0 > D_s(x, y)$, $K_0$ is any compact subset of $\mathbb{R}$ and

$$D_s(x, y) := b \log\!\left( 2 \left[ \cosh\frac{x + y}{b} + \cosh\frac{x - y}{b} \right] \right). \tag{40}$$

Hence, in particular, the analyticity follows of the function which associates to each complex number $t$ with $\mathrm{Re}(t) > D_s(x, y)$ the value

$$\sum_{\omega \in q(A)} c_\omega\, u_\omega(y)\, u_\omega(x)\, e^{i\omega t}. \tag{41}$$

Further it is shown in Appendix B that

$$\left( c_\omega \cdot \int_{-\infty}^{+\infty} u_\omega(y) f(y)\,dy \cdot \int_{-\infty}^{+\infty} g^*(x)\, u_\omega(x)\,dx \cdot e^{i\omega t} \right)_{\omega \in q(A)} \tag{42}$$
is absolutely and uniformly summable on $[t_0, \infty) \times K_0$, where $t_0 > M_s(g, f)$, $K_0$ is any compact subset of $\mathbb{R}$ and

$$M_s(g, f) := \max\{D_s(x, y) : x \in \mathrm{supp}(g) \text{ and } y \in \mathrm{supp}(f)\}. \tag{43}$$

It is easily seen that

$$D(x, y) \geq D_s(x, y) \geq |x - y|, \tag{44}$$

for all $x \in \mathbb{R}$ and $y \in \mathbb{R}$ and hence that

$$M(g, f) \geq M_s(g, f) \geq m(g, f). \tag{45}$$
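The chain (44) can be spot-checked numerically. The sketch below implements $D$ from (31) and $D_s$ from (40) and tests the inequalities on random points. It assumes that (40) is to be read as $b \log(2[\cosh((x+y)/b) + \cosh((x-y)/b)])$, which equals $b \log(4\cosh(x/b)\cosh(y/b))$; the function names, the value of $b$ and the sampling range are illustrative choices.

```python
import math
import random


def big_d(x, y, b):
    """D(x, y) of Eq. (31)."""
    lo, hi = (y, x) if y <= x else (x, y)
    return abs(x - y) + b * (math.log(1.0 + 2.0 * math.exp(-2.0 * hi / b))
                             + math.log(1.0 + 2.0 * math.exp(2.0 * lo / b)))


def d_s(x, y, b):
    """D_s(x, y) of Eq. (40), read as b*log(2*[cosh((x+y)/b) + cosh((x-y)/b)])."""
    return b * math.log(2.0 * (math.cosh((x + y) / b) + math.cosh((x - y) / b)))


def check_inequalities(b=2.75, samples=10_000, seed=1):
    """Return True if D >= D_s >= |x - y| holds at all sampled points."""
    rng = random.Random(seed)
    for _ in range(samples):
        x, y = rng.uniform(-10.0, 10.0), rng.uniform(-10.0, 10.0)
        if not (big_d(x, y, b) >= d_s(x, y, b) >= abs(x - y)):
            return False
    return True


print(check_inequalities())
```

With this reading, $D_s(x, y) = |x - y| + b\,[\ln(1 + e^{-2\max(x,y)/b}) + \ln(1 + e^{2\min(x,y)/b})]$, so (44) follows term by term from $\ln(1 + 2u) \geq \ln(1 + u) \geq 0$ for $u \geq 0$.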
Using the results on the QNM sequence from this section above, it follows that, for every $t_0 > M(g, f)$ and for every $t \in [t_0, \infty)$, the sequence (42) is absolutely summable with sum $\phi_{g,f}(t)$ and that the sequence (42) is uniformly absolutely summable on $[t_0, \infty) \times K_0$ with sum $\bar\phi_{g,f}$ for each $t_0 \in (M(g, f), \infty)$ and each compact subset $K_0$ of $\mathbb{R}$. As a consequence the sequence (42) can be termwise differentiated to any order on that strip and the resulting sequence of derivatives is uniformly summable on $[t_0, \infty) \times K_0$ with a sum equal to the corresponding derivative of $\bar\phi_{g,f}$.

A further result shown in Appendix B indicates that the QNM sum exists only for large enough times. There it is shown that, for the special case of $x = y = 0$ and $t = 0$ ($< D_s(0, 0)$), the sequence (39) is not absolutely summable because the sum

$$\sum_{k=0}^{n} c_{\omega_k^-} \left[ u_{\omega_k^-}(0) \right]^2 \tag{46}$$

is shown to diverge for $n \to \infty$. Hence, for that case, there can be no associated sum with (39) which is independent of the order of the summation.

6. Discussion and Open Questions

In this paper we gave several results on the completeness of the quasinormal modes of the Pöschl–Teller potential. A main result is that any solution of the wave equation with the Pöschl–Teller potential (4) corresponding to $C^\infty$-data with compact support can be expanded uniformly in time with respect to the quasinormal modes after a large enough time $t_0$. Further the corresponding series are absolutely convergent and hence do not depend on the order of summation. In addition we showed that these series can be arbitrarily often termwise partially differentiated with respect to time, again leading to series which converge absolutely and uniformly in time on $[t_0, \infty)$ to the corresponding time derivatives of the solution. Estimates of $t_0$ were given which depend on the support of the data and on the point of observation. Estimates were also given for the time $t_1$ from when the solution can be expanded uniformly in time with respect to the quasinormal modes, where a special order of summation is assumed. Also for this case, the quasinormal mode series can be arbitrarily often termwise partially differentiated with respect to time, thereby leading to series which converge uniformly in time on $[t_1, \infty)$ to the corresponding time derivatives of the solution of the initial value problem. These estimates have in common that they depend on both the support properties of the data and the point of observation, and that they are greater than or equal to the geometrical distance between the support of the data and the observational point.
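The divergence of the partial sums (46) can be illustrated numerically. The sketch below is not the proof of Appendix B. It is restricted to the real case $0 < b^2 V_0 < 1/4$ of (22), so that the standard-library Gamma function suffices, and the sample values $V_0 = 0.2$, $b = 1$ are arbitrary. It evaluates $u_{\omega_k^-}(0)$ in two ways: from the terminating Gauss series in (29), and from the classical closed form for $F(p, 1 - p; c; 1/2)$ applied to (25), which is an outside ingredient assumed here and cross-checked against the series. All function names are hypothetical.

```python
import math


def alpha_real(v0, b):
    """alpha of Eq. (22), restricted to the real case 0 < b^2 V0 < 1/4."""
    s = 0.25 - b * b * v0
    if not 0.0 < s < 0.25:
        raise ValueError("this sketch assumes 0 < b^2 V0 < 1/4")
    return math.sqrt(s)


def c_minus(k, a):
    """c_{omega_k^-} of Eq. (38)."""
    numerator = (-1) ** k * math.pi / (2.0 * math.sin(2.0 * math.pi * a))
    return numerator / (math.gamma(1 + k) * math.gamma(1 - 2 * a + k))


def u_minus_series(k, a):
    """u_{omega_k^-}(0) from Eq. (29) (lower signs); the argument z is 1/2 at x = 0."""
    c = 0.5 + a - k
    term = total = 1.0
    for n in range(k):  # terminating Gauss series F(-k, -k + 2a; c; 1/2)
        term *= (-k + n) * (-k + 2 * a + n) / ((c + n) * (n + 1)) * 0.5
        total += term
    return 2.0 ** (0.5 - a + k) / math.gamma(c) * total


def u_minus_closed(k, a):
    """Same value from Eq. (25) and the closed form (an assumption of this sketch)
    F(p, 1-p; c; 1/2) = 2^(1-c) sqrt(pi) Gamma(c) / (Gamma((p+c)/2) Gamma((c-p+1)/2))."""
    if k % 2 == 1:
        return 0.0  # 1/Gamma((1-k)/2) vanishes: odd modes are zero at x = 0
    c = 0.5 + a - k
    return (2.0 ** (1.0 - c) * math.sqrt(math.pi)
            / (math.gamma((1 - k) / 2) * math.gamma((1 + 2 * a - k) / 2)))


def partial_sums(a, n_max):
    """Partial sums of Eq. (46) for n = 0..n_max."""
    total, sums = 0.0, []
    for k in range(n_max + 1):
        total += c_minus(k, a) * u_minus_closed(k, a) ** 2
        sums.append(total)
    return sums


a = alpha_real(0.2, 1.0)
sums = partial_sums(a, 60)
print(sums[0], sums[20], sums[40], sums[60])
```

In this real case the terms with odd $k$ vanish and those with even $k = 2j$ are positive. A short Gamma-function computation (not taken from the source) suggests that they behave like $\cot(\pi\alpha)/(2\pi j)$ for large $j$, so the partial sums grow like a harmonic series, in line with the divergence stated for (46).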
Completeness of Quasinormal Modes of Pöschl–Teller Potential
We showed that, for an “early” time and zero distance between the support of the data and observational point, the corresponding quasinormal mode series is not absolutely convergent. Hence there is no associated sum, since in general different orders of summation will give different results. From these results one might suspect for the case of the Schwarzschild potential that the corresponding quasinormal modes are complete (in the analogous sense as described in the paper) for “intermediate” times. For large times this is not to be expected. Numerical calculations (see e.g. Fig. 2) illustrate that, due to the slower polynomial decay of the Schwarzschild potential for large radii, backscattering leads to non-exponential decay of the solutions (corresponding to initial data with compact support) of the wave equation for large times. Such an asymptotic behaviour of the solutions would make an expansion into quasinormal modes impossible for large times. Several open questions remain. The results of this paper suggest a relationship between the convergence of the quasinormal mode sums of the Pöschl–Teller potential and causality. To make this clearer, one would like to have a complete overview of the convergence of the quasinormal mode sums depending on the support of the data as well as the point of observation; possibly depending on whether a special order of summation is assumed or not and possibly depending on whether the series converges to the corresponding solution of the wave equation or not. Acknowledgements. I would like to thank Bernd Schmidt for bringing this problem to my attention, and for many interesting and helpful discussions about the subject. Further I would like to thank Gabrielle Allen for providing the figures for, and the many criticisms on, the paper. Also I am grateful to Rachel Capon for useful discussions and her careful reading of this manuscript and I have benefited from many helpful conversations with Kostas Kokkotas. 
Finally, I would like to thank Markus Klein and an unknown referee for their reading of the paper and their useful suggestions.
A. This appendix gives a derivation of the estimates (32) and (33) as well as an estimate on the members of the sequence (39). All these estimates are crucial for the proof of the expansion formulae in Sect. 4. The definitions and the notation of [1] are used throughout. The derivation uses the following auxiliary estimate.

Lemma 1. Let n ∈ N, a ∈ (0, 1) and s ∈ [0, ∞) be given. Then
$$\int_0^{\pi/2} e^{-st}\sin^{n+a-1}(t)\,dt \le \frac{\pi B_a}{2}\cdot\frac{2^{-n}\,\Gamma(n+1)}{\left(\Gamma(\frac{n}{2}+1)\right)^2}\cdot(1+s^2)^{-a/2}, \tag{47}$$
where
$$B_a := \left(\frac{4}{\pi}\right)^{a}\cdot\max\left\{2^{-a}\cdot\left(\pi+\frac{1}{a}\right),\ \Gamma(a)+\pi\cdot\left(\frac{a}{e}\right)^{a}\right\}. \tag{48}$$
Note that later on, (47) has to provide a proper estimate for the vanishing of the integral both for s → ∞ and n → ∞. This demand excludes, for instance, an application of the method of partial integration, in the following proof.
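As a plausibility check, the estimate (47) can be probed numerically. The following is a minimal sketch, not part of the proof: it assumes the reading of (47) and (48) given above, approximates the integral by a simple midpoint rule (the step count is an arbitrary choice), and reports the largest ratio of left-hand to right-hand side over a small grid of n, a and s. The names `B`, `lhs` and `rhs` are introduced here for illustration only.

```python
from math import e, exp, gamma, pi, sin

def B(a):
    # Constant B_a of (48), as read from the text (an assumption of this sketch).
    return (4 / pi) ** a * max(2 ** (-a) * (pi + 1 / a), gamma(a) + pi * (a / e) ** a)

def lhs(n, a, s, steps=4000):
    # Midpoint-rule approximation of the integral on the left of (47).
    h = (pi / 2) / steps
    return h * sum(exp(-s * t) * sin(t) ** (n + a - 1)
                   for t in (h * (i + 0.5) for i in range(steps)))

def rhs(n, a, s):
    # Right-hand side of (47).
    binom = 2 ** (-n) * gamma(n + 1) / gamma(n / 2 + 1) ** 2
    return (pi * B(a) / 2) * binom * (1 + s * s) ** (-a / 2)

# Largest observed ratio lhs/rhs on a small parameter grid.
worst = max(lhs(n, a, s) / rhs(n, a, s)
            for n in (1, 2, 5, 20)
            for a in (0.1, 0.5, 0.9)
            for s in (0.0, 1.0, 10.0, 100.0))
print("largest observed lhs/rhs ratio:", worst)
```

A ratio below one on the grid is consistent with (47); the sketch says nothing about how sharp the constant B_a is.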
H. R. Beyer
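Proof. First by standard estimates for the sine-function one gets
$$\begin{aligned}
\int_0^{\pi/2} e^{-st}\sin^{n+a-1}(t)\,dt
&\le \left(\frac{2}{\pi}\right)^{a-1}\cdot\int_0^{\pi/2} e^{-st}\,t^{a-1}\sin^{n}(t)\,dt\\
&\le \left(\frac{2}{\pi}\right)^{a-1}\cdot\left[2^{-n}\cdot\int_0^{1/2} e^{-st}\,t^{a-1}\,dt + 2^{1-a}\cdot e^{-s/2}\cdot\int_{1/2}^{\pi/2}\sin^{n}(t)\,dt\right]\\
&\le \left(\frac{2}{\pi}\right)^{a-1}\cdot\frac{1}{\pi}\int_0^{\pi}\sin^{n}(t)\,dt\cdot\left[\int_0^{1/2} e^{-st}\,t^{a-1}\,dt + \pi\cdot 2^{-a}\cdot e^{-s/2}\right]\\
&= \left(\frac{2}{\pi}\right)^{a-1}\cdot\frac{2^{-n}\,\Gamma(n+1)}{\left(\Gamma(\frac{n}{2}+1)\right)^2}\cdot\left[\int_0^{1/2} e^{-st}\,t^{a-1}\,dt + \pi\cdot 2^{-a}\cdot e^{-s/2}\right],
\end{aligned} \tag{49}$$
where in the last equality the identity
$$\int_0^{\pi}\sin^{n}(t)\,dt = \frac{\pi\cdot\Gamma(n+1)}{2^{n}\cdot\left(\Gamma(\frac{n}{2}+1)\right)^2}, \tag{50}$$
was used.16 For the case 0 ≤ s ≤ 1 one has now,
$$\int_0^{1/2} e^{-st}\,t^{a-1}\,dt + \pi\cdot 2^{-a}\cdot e^{-s/2} \le \int_0^{1/2} t^{a-1}\,dt + \pi\cdot 2^{-a} \le \left(\pi+\frac{1}{a}\right)\cdot(1+s^2)^{-a/2}, \tag{51}$$
and for the case s > 1,
$$\begin{aligned}
\int_0^{1/2} e^{-st}\,t^{a-1}\,dt + \pi\cdot 2^{-a}\cdot e^{-s/2}
&\le s^{-a}\cdot\left[\Gamma(a)+\pi\cdot 2^{-a}\cdot s^{a}e^{-s/2}\right]\\
&\le s^{-a}\cdot\left[\Gamma(a)+\pi\cdot\left(\frac{a}{e}\right)^{a}\right]\\
&\le 2^{a}\cdot\left[\Gamma(a)+\pi\cdot\left(\frac{a}{e}\right)^{a}\right]\cdot(1+s^2)^{-a/2}.
\end{aligned} \tag{52}$$
The result (47) then follows from (49), (51) and (52). □

The starting point for the derivation of the formulae (32) and (33) is the following.

Lemma 2. Let ω ∈ C \ (q(A) ∪ −q(A)), x ∈ R and y ∈ R be given, then
$$K(\omega,x,y) = \frac{\pi^{2}\,b\,e^{-i\omega\cdot|x-y|}}{\cos(2\pi\alpha)+\cosh(2\pi b\omega)}\cdot
\begin{cases}
h(\omega,\alpha,(1+e^{2x/b})^{-1})\cdot h(\omega,-\alpha,(1+e^{-2y/b})^{-1}) & \text{for } y\le x\\
h(\omega,-\alpha,(1+e^{-2x/b})^{-1})\cdot h(\omega,\alpha,(1+e^{2y/b})^{-1}) & \text{for } y> x
\end{cases}, \tag{53}$$
where for arbitrary β ∈ (−1/2, 1/2) × R and x′ ∈ R,
$$h(\omega,\beta,x') := \frac{\bar{F}(\frac12-\beta,\ \frac12+\beta,\ 1+ib\omega,\ x')}{\Gamma(\frac12+\beta-ib\omega)}. \tag{54}$$

16 This can be derived, for instance, using Formulae 6.2.1, 6.2.2, 6.1.8 and 6.1.18 of [1].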
Proof. The proof consists of a straightforward calculation starting from (24) and using (26) in addition to Equations 6.1.17 and 15.1.1 of [1]. □

Note that the main reason for representing K(·, x, y) in the form (53) is that only the first elementary factor is singular at the elements of q(A). The price for this is that this factor is singular also at the points of −q(A) in the open upper half-plane. But this will play no role in the following. The function h satisfies the following estimate, which eventually leads to (52).

Lemma 3. Let β ∈ C with −1/2 < Re(β) < 1/2, ω = ω1 + iω2 ∈ C such that ω ≠ in/b for all n ∈ N \ {0} as well as ω ≠ −i · (n + 1/2 + β)/b for all n ∈ N \ {0} and x ∈ (0, 1/2) be given. Then,
$$|h(\omega,\beta,x)| \le C_\beta\cdot(1+4b^{2}\omega_1^{2})^{-\frac12\cdot(\frac12+\mathrm{Re}(\beta))}\cdot e^{\pi b|\omega_1|}\cdot(1-2x)^{-(\frac12-\mathrm{Re}(\beta))}, \tag{55}$$
where
$$C_\beta := \pi^{-1}\cdot 2^{-\frac12+\mathrm{Re}(\beta)}\cdot\Gamma\!\left(\tfrac12-\mathrm{Re}(\beta)\right)\cdot|\cos(\pi\beta)|\cdot e^{\pi\cdot|\mathrm{Im}(\beta)|/2}\cdot B_{\frac12+\mathrm{Re}(\beta)}. \tag{56}$$
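Proof. First, using the power series expansion of the hypergeometric function and Eq. 6.1.22 of [1], one gets
$$\begin{aligned}
|h(\omega,\beta,x)| &= \left|\frac{1}{\Gamma(\frac12+\beta-ib\omega)\cdot\Gamma(1+ib\omega)}\cdot\sum_{n=0}^{\infty}\frac{(\frac12-\beta)_n\cdot(\frac12+\beta)_n}{(1+ib\omega)_n}\cdot\frac{x^{n}}{\Gamma(n+1)}\right|\\
&\le \frac{1}{|\Gamma(\frac12+\beta)|}\cdot\sum_{n=0}^{\infty}\left|\frac{\Gamma(n+\frac12+\beta)}{\Gamma(n+1+ib\omega)\cdot\Gamma(\frac12+\beta-ib\omega)}\right|\cdot\left|\left(\tfrac12-\beta\right)_n\right|\cdot\frac{x^{n}}{\Gamma(n+1)}.
\end{aligned} \tag{57}$$
Using the formula17
$$\int_{-\pi/2}^{\pi/2} e^{iyt}\cdot\cos^{u-1}(t)\,dt = e^{i\pi y/2}\cdot\int_0^{\pi} e^{-iyt}\cdot\sin^{u-1}(t)\,dt = \frac{\pi\cdot 2^{1-u}\cdot\Gamma(u)}{\Gamma(\frac{1+u-y}{2})\cdot\Gamma(\frac{1+u+y}{2})}, \tag{58}$$
which is valid for arbitrary u ∈ C with Re(u) > 0 and y ∈ C (where the expression which includes the Gamma functions is defined by analytic continuation for the cases y = ±(1 + u)), one gets in a second step,
$$\begin{aligned}
\left|\frac{\Gamma(n+\frac12+\beta)}{\Gamma(n+1+ib\omega)\cdot\Gamma(\frac12+\beta-ib\omega)}\right|
&\le \frac{1}{\pi}\cdot 2^{n-\frac12+\mathrm{Re}(\beta)}\cdot\int_{-\pi/2}^{\pi/2} e^{(2b\omega_1-\mathrm{Im}(\beta))\cdot t}\cdot\cos^{n-\frac12+\mathrm{Re}(\beta)}(t)\,dt\\
&\le \frac{1}{\pi}\cdot 2^{n+\frac12+\mathrm{Re}(\beta)}\cdot e^{\pi\cdot|\mathrm{Im}(\beta)|/2}\cdot\int_{-\pi/2}^{0} e^{-2b|\omega_1|\cdot t}\cdot\cos^{n-\frac12+\mathrm{Re}(\beta)}(t)\,dt\\
&= \frac{1}{\pi}\cdot 2^{n+\frac12+\mathrm{Re}(\beta)}\cdot e^{\pi\cdot|\mathrm{Im}(\beta)|/2}\cdot e^{\pi b|\omega_1|}\cdot\int_0^{\pi/2} e^{-2b|\omega_1|\cdot t}\cdot\sin^{n-\frac12+\mathrm{Re}(\beta)}(t)\,dt\\
&\le 2^{n-\frac12+\mathrm{Re}(\beta)}\cdot e^{\pi\cdot|\mathrm{Im}(\beta)|/2}\cdot B_{\frac12+\mathrm{Re}(\beta)}\cdot(1+4b^{2}\omega_1^{2})^{-\frac12\cdot(\frac12+\mathrm{Re}(\beta))}\cdot e^{\pi b|\omega_1|}.
\end{aligned} \tag{59}$$
With help from Eqs. 6.1.22 and 6.1.26 of [1] one has for an arbitrary n ∈ N,
$$\left|\left(\tfrac12-\beta\right)_n\right| = \frac{|\Gamma(n+\frac12-\beta)|}{|\Gamma(\frac12-\beta)|} \le \frac{\Gamma(n+\frac12-\mathrm{Re}(\beta))}{|\Gamma(\frac12-\beta)|} = \frac{\Gamma(\frac12-\mathrm{Re}(\beta))}{|\Gamma(\frac12-\beta)|}\cdot\left(\tfrac12-\mathrm{Re}(\beta)\right)_n. \tag{60}$$

17 See e.g. Eq. 5.25 in [9]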
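Finally, (55) follows from (57), (59), (60) with the help of Formulae 6.1.17 and 3.6.8 of [1]. □

From (53), (55) and the continuity of K one gets now the following estimate:

Lemma 4. Let ω ∈ C \ (q(A) ∪ −q(A)), x ∈ R and y ∈ R be given. Then,
$$|K(\omega,x,y)| \le \pi^{2}\,b\,C_\alpha C_{-\alpha}\cdot\frac{e^{2\pi b|\omega_1|}}{|\cos(2\pi\alpha)+\cosh(2\pi b\omega)|}\cdot(1+4b^{2}\omega_1^{2})^{-\frac12}\cdot e^{\omega_2\cdot|x-y|}\cdot
\begin{cases}
(\tanh(x/b))^{-\frac12-\mathrm{Re}(\alpha)}\cdot(\tanh(-y/b))^{-\frac12+\mathrm{Re}(\alpha)} & \text{if } x>0 \text{ and } y<0\\
(\tanh(-x/b))^{-\frac12+\mathrm{Re}(\alpha)}\cdot(\tanh(y/b))^{-\frac12-\mathrm{Re}(\alpha)} & \text{if } x<0 \text{ and } y>0
\end{cases}. \tag{61}$$
From this one gets easily (33) (compare (33) and in particular the assumptions on f and g), where
$$C_2(g,f) := \pi^{2}\,b\,C_\alpha C_{-\alpha}\cdot\iint_{\mathbb{R}^{2}} g^{*}(x)\,H_2(x,y)\,f(y)\,dx\,dy, \tag{62}$$
and where
$$H_2(x,y) :=
\begin{cases}
(\tanh(x/b))^{-\frac12-\mathrm{Re}(\alpha)}\cdot(\tanh(-y/b))^{-\frac12+\mathrm{Re}(\alpha)} & \text{if } x>0 \text{ and } y<0\\
(\tanh(-x/b))^{-\frac12+\mathrm{Re}(\alpha)}\cdot(\tanh(y/b))^{-\frac12-\mathrm{Re}(\alpha)} & \text{if } x<0 \text{ and } y>0\\
0 & \text{otherwise}
\end{cases}. \tag{63}$$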
A further estimate of the function h uses an integral representation of the hypergeometric function F, which could not be found in the tables on special functions. For this reason that representation and its proof are given now.

Lemma 5. Let a ∈ C, b ∈ C, c ∈ C \ (−N) and z ∈ C with |z| < 1 be given. Then

(i) If in addition Re(c) > Re(b) and b ∉ N \ {0} hold,
$$F(a,b,c,z) = \pi^{-1}\cdot 2^{c-b-1}\cdot e^{i\pi(c+b-1)/2}\cdot\frac{\Gamma(c)\cdot\Gamma(1-b)}{\Gamma(c-b)}\cdot\int_0^{\pi} e^{-i\cdot(c+b-1)\cdot t}\cdot\sin^{c-b-1}(t)\cdot\left(1-ze^{-2it}\right)^{-a}dt. \tag{64}$$

(ii) If in addition Re(b) > 0 and c − b ∉ N \ {0} hold,
$$(1-z)^{c-(a+b)}\cdot F(a,b,c,z) = \pi^{-1}\cdot 2^{b-1}\cdot e^{i\pi(2c-b-1)/2}\cdot\frac{\Gamma(c)\cdot\Gamma(b-c+1)}{\Gamma(b)}\cdot\int_0^{\pi} e^{-i\cdot(2c-b-1)\cdot t}\cdot\sin^{b-1}(t)\cdot\left(1-ze^{-2it}\right)^{a-c}dt. \tag{65}$$
Proof. Part (ii) is a direct consequence of (i) and Formula 15.3.3 in [1]. Hence it remains to prove part (i). For this let Re(c) > Re(b) and b ∉ N \ {0}. First by Formulae 6.1.22 and 6.1.17 in [1] as well as by some elementary reasoning it follows that
$$\frac{(b)_n}{(c)_n} = (-1)^{n}\cdot\frac{\Gamma(c)\cdot\Gamma(1-b)}{\Gamma(c-b)}\cdot\frac{\Gamma(c-b)}{\Gamma(c+n)\cdot\Gamma(1-(b+n))}, \tag{66}$$
where the right-hand side is defined by analytic continuation (and hence by zero) for the cases where b ∈ {−n + 1, −n + 2, ...}. From the definition of F (Formula 15.1.1 in [1]), and using (58) and (66) follows,
$$\begin{aligned}
F(a,b,c,z) &= \frac{\Gamma(c)\cdot\Gamma(1-b)}{\Gamma(c-b)}\cdot\sum_{n=0}^{\infty}(a)_n\cdot\frac{\Gamma(c-b)}{\Gamma(c+n)\cdot\Gamma(1-(b+n))}\cdot\frac{(-z)^{n}}{\Gamma(n+1)}\\
&= \pi^{-1}\cdot 2^{c-b-1}\cdot e^{i\pi(c+b-1)/2}\cdot\frac{\Gamma(c)\cdot\Gamma(1-b)}{\Gamma(c-b)}\cdot\lim_{N\to\infty}\int_0^{\pi} e^{-i\cdot(c+b-1)\cdot t}\cdot\sin^{c-b-1}(t)\cdot\left(\sum_{n=0}^{N}\frac{(a)_n}{\Gamma(n+1)}\cdot(e^{-2it}\cdot z)^{n}\right)dt.
\end{aligned} \tag{67}$$
From this (64) follows using Lebesgue’s dominated convergence theorem and the complex version of Formula 3.6.9 (“binomial series”) of [1]. □

Note that part (i) of the foregoing Lemma 5 gives an integral representation for the hypergeometric series for a larger class of parameter values than Formula 15.3.1 in [1], since it does not assume that Re(b) > 0 holds. This will be essential for the derivation of (32). Actually used in the following is the subsequent corollary of Lemma 5,
Corollary 1. Let a ∈ C, b ∈ (0, ∞) × R, c ∈ C \ (−N) such that c − b ∉ N \ {0} and x ∈ (−1, 1) be given. Then
$$(1-x)^{c-(a+b)}\cdot F(a,b,c,x) = \pi^{-1}\cdot 2^{b-1}\cdot\frac{\Gamma(c)\cdot\Gamma(b-c+1)}{\Gamma(b)}\cdot\int_{-\pi/2}^{\pi/2} e^{-i\cdot(2c-b-1)\cdot t}\cdot\cos^{b-1}(t)\cdot\left(x+e^{2it}\right)^{a-c}dt. \tag{68}$$

Proof. The relation (68) follows from (65) by a straightforward substitution and from the identity
$$\left(1+xe^{-2it}\right)^{a-c} = e^{-2i\cdot(a-c)\cdot t}\cdot\left(x+e^{2it}\right)^{a-c}, \tag{69}$$
for each t ∈ (−π/2, π/2). The latter can easily be shown by analytic continuation. □

Now with the help of these auxiliary results a further estimate for the function h will be proved, which eventually leads to (32).

Lemma 6. Let β ∈ C with −1/2 < Re(β) < 1/2, ω = ω1 + iω2 ∈ C such that ω ≠ in/b for all n ∈ N \ {0} as well as ω ≠ −i · (n + 1/2 + β)/b for all n ∈ N \ {0} and x ∈ (0, 1) be given. Then,
$$|h(\omega,\beta,x)| \le C'_\beta\cdot(1+4b^{2}\omega_1^{2})^{-\frac12\cdot(\frac12+\mathrm{Re}(\beta))}\cdot e^{\pi b|\omega_1|}\cdot(1-x)^{-(\frac12+\mathrm{Re}(\beta))}\cdot
\begin{cases}
\left(\frac{1+x}{1-x}\right)^{b\omega_2} & \text{for } \omega_2>0\\
1 & \text{for } \omega_2\le 0
\end{cases}, \tag{70}$$
where
$$C'_\beta := \frac{2^{-\frac12+\mathrm{Re}(\beta)}\cdot e^{5\pi\cdot|\mathrm{Im}(\beta)|/2}}{|\Gamma(\frac12+\beta)|}\cdot B_{\frac12+\mathrm{Re}(\beta)}. \tag{71}$$
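Proof. First one gets from the definitions and (68),
$$|h(\omega,\beta,x)| = \frac{\pi^{-1}\cdot 2^{\mathrm{Re}(\beta)-\frac12}}{|\Gamma(\frac12+\beta)|}\cdot(1-x)^{-b\omega_2}\cdot\left|\int_{-\pi/2}^{\pi/2} e^{i\cdot(3\beta+\frac12)\cdot t}\cdot\cos^{\beta-\frac12}(t)\cdot\left(x+e^{2it}\right)^{-(\frac12+\beta+ib\omega)}dt\right|. \tag{72}$$
For each t ∈ (−π/2, π/2) one now has
$$x+e^{2it} = |x+e^{2it}|\cdot e^{i\cdot\left[t+\arctan\left(\frac{1-x}{1+x}\cdot\tan(t)\right)\right]} \tag{73}$$
and hence,
$$\begin{aligned}
\left|\left(x+e^{2it}\right)^{-(\frac12+\beta+ib\omega)}\right| &= |x+e^{2it}|^{\,b\omega_2-\frac12-\mathrm{Re}(\beta)}\cdot e^{(b\omega_1+\mathrm{Im}(\beta))\cdot\left[t+\arctan\left(\frac{1-x}{1+x}\cdot\tan(t)\right)\right]}\\
&\le e^{\pi\cdot|\mathrm{Im}(\beta)|}\cdot e^{2b|\omega_1|\cdot|t|}\cdot(1-x)^{-(\frac12+\mathrm{Re}(\beta))}\cdot|x+e^{2it}|^{\,b\omega_2}\\
&\le e^{\pi\cdot|\mathrm{Im}(\beta)|}\cdot e^{2b|\omega_1|\cdot|t|}\cdot(1-x)^{-(\frac12+\mathrm{Re}(\beta))}\cdot
\begin{cases}
(1+x)^{b\omega_2} & \text{for } \omega_2>0\\
(1-x)^{b\omega_2} & \text{for } \omega_2\le 0
\end{cases}.
\end{aligned} \tag{74}$$
From (72), (75) follows,
$$\begin{aligned}
|h(\omega,\beta,x)| &\le \frac{\pi^{-1}\cdot 2^{\mathrm{Re}(\beta)-\frac12}}{|\Gamma(\frac12+\beta)|}\cdot e^{5\pi\cdot|\mathrm{Im}(\beta)|/2}\cdot\int_{-\pi/2}^{\pi/2} e^{2b|\omega_1|\cdot|t|}\cdot\cos^{\mathrm{Re}(\beta)-\frac12}(t)\,dt\cdot(1-x)^{-(\frac12+\mathrm{Re}(\beta))}\cdot
\begin{cases}
\left(\frac{1+x}{1-x}\right)^{b\omega_2} & \text{for } \omega_2>0\\
1 & \text{for } \omega_2\le 0
\end{cases}\\
&= \frac{\pi^{-1}\cdot 2^{\mathrm{Re}(\beta)+\frac12}}{|\Gamma(\frac12+\beta)|}\cdot e^{5\pi\cdot|\mathrm{Im}(\beta)|/2}\cdot e^{\pi b|\omega_1|}\cdot\int_0^{\pi/2} e^{-2b|\omega_1|\cdot t}\cdot\sin^{\mathrm{Re}(\beta)-\frac12}(t)\,dt\cdot(1-x)^{-(\frac12+\mathrm{Re}(\beta))}\cdot
\begin{cases}
\left(\frac{1+x}{1-x}\right)^{b\omega_2} & \text{for } \omega_2>0\\
1 & \text{for } \omega_2\le 0
\end{cases},
\end{aligned} \tag{75}$$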
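and from this (70) by using (47). □

From (53), (70) and the continuity of K a straightforward calculation provides the following estimate.

Lemma 7. Let ω ∈ C \ (q(A) ∪ −q(A)), x ∈ R and y ∈ R be given. Then,
$$|K(\omega,x,y)| \le \pi^{2}\,b\,C'_\alpha C'_{-\alpha}\cdot\frac{e^{2\pi b|\omega_1|}}{|\cos(2\pi\alpha)+\cosh(2\pi b\omega)|}\cdot(1+4b^{2}\omega_1^{2})^{-\frac12}\cdot H_1(x,y)\cdot
\begin{cases}
e^{\omega_2\cdot D(x,y)} & \text{for } \omega_2>0\\
e^{\omega_2\cdot|x-y|} & \text{for } \omega_2\le 0
\end{cases}, \tag{76}$$
where,
$$H_1(x,y) :=
\begin{cases}
(1+e^{-2x/b})^{\frac12+\mathrm{Re}(\alpha)}\cdot(1+e^{2y/b})^{\frac12-\mathrm{Re}(\alpha)} & \text{for } y\le x\\
(1+e^{-2y/b})^{\frac12+\mathrm{Re}(\alpha)}\cdot(1+e^{2x/b})^{\frac12-\mathrm{Re}(\alpha)} & \text{for } y> x
\end{cases} \tag{77}$$
and,
$$D(x,y) := |x-y| + b\cdot
\begin{cases}
\ln(1+2e^{-2x/b})+\ln(1+2e^{2y/b}) & \text{for } y\le x\\
\ln(1+2e^{-2y/b})+\ln(1+2e^{2x/b}) & \text{for } y> x
\end{cases}. \tag{78}$$
Note that the functions H1 and D are symmetric. Obviously (70) implies (32), where,
$$C_1(g,f) := \pi^{2}\,b\,C'_\alpha C'_{-\alpha}\cdot\iint_{\mathbb{R}^{2}} g^{*}(x)\,H_1(x,y)\,f(y)\,dx\,dy. \tag{79}$$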
In the following an estimate is given on the members of the sequence (39). The derivation of this estimate uses the following Lemmata.
Lemma 8. Let β ∈ C with −1/2 < Re(β) < 1/2, z ∈ C with |z| < 1 and k ∈ N. Then the following recursion holds,
$$\begin{aligned}
F\!\left(-(k+2),\,-(k+2)+2\beta,\,\tfrac12+\beta-(k+2),\,z\right) ={}& (1-2z)\,F\!\left(-(k+1),\,-(k+1)+2\beta,\,\tfrac12+\beta-(k+1),\,z\right)\\
&+ \frac{(k+1)(k+1-2\beta)}{\left(k+2-(\tfrac12+\beta)\right)\left(k+1-(\tfrac12+\beta)\right)}\;z(1-z)\,F\!\left(-k,\,-k+2\beta,\,\tfrac12+\beta-k,\,z\right).
\end{aligned} \tag{80}$$

Proof. The proof is a straightforward consequence of Formulae 15.2.2, 15.5.1 and 5.5.3 of [1]. □

Lemma 9. Let β ∈ C with −1/2 < Re(β) < 1/2 and y ∈ [0, 1). Then for every k ∈ N the following estimate holds,
$$\left|F\!\left(-k,\,-k+2\beta,\,\tfrac12+\beta-k,\,y\right)\right| \le 1. \tag{81}$$

Proof. The estimate (81) follows from Lemma 9 using induction along with the following estimate for each k ∈ N,
$$\left|\frac{(k+1)(k+1-2\beta)}{\left(k+2-(\tfrac12+\beta)\right)\left(k+1-(\tfrac12+\beta)\right)}\right| \le 2. \tag{82}$$
□
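The recursion (80) involves only terminating hypergeometric series, so it can be checked exactly with rational arithmetic. The sketch below is an illustration rather than a proof: it assumes that the denominator of the coefficient in (80) reads (k + 2 − (1/2 + β))(k + 1 − (1/2 + β)), restricts β and z to rational values, and uses the hypothetical helper names `hyp_poly` and `recursion_gap`.

```python
from fractions import Fraction

def hyp_poly(m, beta, z):
    # Terminating series F(-m, -m + 2*beta, 1/2 + beta - m, z), summed exactly.
    a, b, c = -m, -m + 2 * beta, Fraction(1, 2) + beta - m
    term, total = Fraction(1), Fraction(1)
    for n in range(m):
        term *= (a + n) * (b + n) * z / ((c + n) * (n + 1))
        total += term
    return total

def recursion_gap(k, beta, z):
    # Left-hand side minus right-hand side of (80); zero when the recursion holds.
    g = Fraction(1, 2) + beta
    coeff = (k + 1) * (k + 1 - 2 * beta) / ((k + 2 - g) * (k + 1 - g))
    return (hyp_poly(k + 2, beta, z)
            - (1 - 2 * z) * hyp_poly(k + 1, beta, z)
            - coeff * z * (1 - z) * hyp_poly(k, beta, z))

# Sample values of beta and z; any rationals with -1/2 < beta < 1/2 would do.
gaps = [recursion_gap(k, Fraction(1, 5), Fraction(1, 3)) for k in range(6)]
print(gaps)
```

For β = 0 and k = 0 the check reduces to the identity 1 − (8/3)z + (8/3)z² = (1 − 2z)² + (4/3) z(1 − z), which can also be verified by hand.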
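The following inequalities for the elements of the quasinormal mode sequence (39) are straightforward consequences of the definitions and Lemma 10 (as well as of Formulae 6.1.17, 6.1.26, 6.1.22 of [1]). For given x ∈ R, y ∈ R, t ∈ R and k ∈ N one gets,
$$\left|c_{\omega_k^\pm}\,u_{\omega_k^\pm}(y)\,u_{\omega_k^\pm}(x)\,e^{i\omega_k^\pm t}\right| \le a_k^\pm\left(e^{-(t-D_s(x,y))/b}\right)^{\frac12\pm\mathrm{Re}(\alpha)+k}, \tag{83}$$
where
$$a_k^\pm := \frac{1}{4\pi}\,|\cot(\pi\alpha)|\cdot\frac{\Gamma(1\pm 2\mathrm{Re}(\alpha))}{|\Gamma(1\pm 2\alpha)|}\cdot\frac{\left[\Gamma\!\left(\tfrac12\pm\mathrm{Re}(\alpha)+k\right)\right]^{2}}{\Gamma(1+k)\,\Gamma(1\pm 2\mathrm{Re}(\alpha)+k)}. \tag{84}$$
Further using 6.1.22 of [1] it is easy to see that there is a positive constant $C_\alpha$ such that
$$\left|a_k^\pm\right| \le C_\alpha. \tag{85}$$
Hence with such a constant $C_\alpha$ one gets for given x ∈ R, y ∈ R, t ∈ R and k ∈ N:
$$\left|c_{\omega_k^\pm}\,u_{\omega_k^\pm}(y)\,u_{\omega_k^\pm}(x)\,e^{i\omega_k^\pm t}\right| \le C_\alpha\left(e^{-(t-D_s(x,y))/b}\right)^{\frac12\pm\mathrm{Re}(\alpha)+k}. \tag{86}$$
For given x ∈ R and y ∈ R from the last estimate follows the absolute and uniform summability of
$$\left(c_\omega\,u_\omega(y)\,u_\omega(x)\,e^{i\omega t}\right)_{\omega\in q(A)}, \tag{87}$$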
on every compact subset of $(D_s(x, y), \infty)$ × R and hence also the analyticity of the function which associates to each t ∈ C with Re(t) > $M_s(x, y)$ the value
$$\sum_{\omega\in q(A)} c_\omega\,u_\omega(y)\,u_\omega(x)\,e^{i\omega t}. \tag{88}$$
A further consequence of the estimate is that
$$\left(c_\omega\cdot\int_{-\infty}^{+\infty}u_\omega(y)f(y)\,dy\cdot\int_{-\infty}^{+\infty}g^{*}(x)u_\omega(x)\,dx\cdot e^{i\omega t}\right)_{\omega\in q(A)} \tag{89}$$
is absolutely and uniformly summable on [t0, ∞) × K0, where t0 > $M_s(g, f)$, K0 is any compact subset of R and
$$M_s(g,f) := \max\{D_s(x,y) : x\in\mathrm{supp}(g) \text{ and } y\in\mathrm{supp}(f)\}, \tag{90}$$
and hence also the analyticity of the function which associates to each t ∈ $(M_s(x, y), \infty)$ × R the value
$$\sum_{\omega\in q(A)} c_\omega\cdot\int_{-\infty}^{+\infty}u_\omega(y)f(y)\,dy\cdot\int_{-\infty}^{+\infty}g^{*}(x)u_\omega(x)\,dx\cdot e^{i\omega t}. \tag{91}$$
The remainder of this appendix considers the sequence
$$\left(c_\omega\,[u_\omega(0)]^{2}\right)_{\omega\in q(A)}, \tag{92}$$
which is a special case of (87) for x = y = 0 and t = 0. This is interesting because for this case t < $D_s(x, y)$, which was not considered up to now. In the following it will be shown that this sequence is not absolutely summable. First, after some computation, which uses Formulae 15.4.19, 8.6.1, 6.1.17, 6.1.18 of [1], it can be seen that
$$u_{\omega_{2k+1}^-}(0) = 0 \tag{93}$$
and that
$$\left|c_{\omega_{2k}^-}\left[u_{\omega_{2k}^-}(0)\right]^{2}\right| = \frac{1}{2\pi}\,|\cot(\pi\alpha)|\,\frac{\Gamma(k+\frac12)\,|\Gamma(k-\alpha+\frac12)|}{\Gamma(k+1)\,|\Gamma(k-\alpha+1)|}, \tag{94}$$
both for each k ∈ N. Further, using Formulae 6.1.17, 6.1.26, 6.2.1 of [1] (as well as Fubini’s theorem and Tonelli’s theorem) one gets for each n ∈ N,
$$\begin{aligned}
\sum_{k=0}^{n}\left|c_{\omega_{2k}^-}\left[u_{\omega_{2k}^-}(0)\right]^{2}\right| &\ge \frac{1}{2\pi}\,\frac{|\cos(\pi\mathrm{Re}(\alpha))|}{|\sin(\pi\alpha)|}\sum_{k=0}^{n}\frac{\Gamma(k+\frac12)\,\Gamma(k-\mathrm{Re}(\alpha)+\frac12)}{\Gamma(k+1)\,\Gamma(k-\mathrm{Re}(\alpha)+1)}\\
&= \frac{1}{2\pi^{2}}\,\frac{|\cos(\pi\mathrm{Re}(\alpha))|}{|\sin(\pi\alpha)|}\iint_{(0,1)^{2}}\frac{1-(ts)^{n+1}}{1-ts}\,[ts(1-t)(1-s)]^{-1/2}\,s^{-\mathrm{Re}(\alpha)}\,dt\,ds.
\end{aligned} \tag{95}$$
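The divergence behind (95) can be made plausible numerically: each summand Γ(k + 1/2)Γ(k − Re(α) + 1/2) / (Γ(k + 1)Γ(k − Re(α) + 1)) behaves like 1/k for large k, so the partial sums grow like a harmonic series. The following sketch assumes the reading of (95) given above and uses the arbitrary sample value Re(α) = 0.25; the function names are introduced here only for illustration.

```python
from math import exp, lgamma, log

def term(k, a):
    # k-th summand of the lower bound in (95), with a standing for Re(alpha).
    return exp(lgamma(k + 0.5) + lgamma(k - a + 0.5)
               - lgamma(k + 1) - lgamma(k - a + 1))

def partial_sum(n, a):
    # Sum of the summands for k = 0, ..., n.
    return sum(term(k, a) for k in range(n + 1))

a = 0.25  # sample value in (-1/2, 1/2), chosen for illustration
growth = partial_sum(10_000, a) - partial_sum(1_000, a)
print("k * term(k) at k = 1000:", 1000 * term(1000, a))
print("growth of the partial sums over one decade:", growth, "vs ln(10) =", log(10))
```

Each further decade in n adds roughly ln(10) to the partial sum, which is the logarithmic divergence that the indirect argument below establishes rigorously.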
The proof that (92) is not absolutely summable proceeds indirectly. From the assumption that it is absolutely summable it follows by (95) and the monotone convergence theorem that the function defined by
$$(1-ts)^{-1}\,[ts(1-t)(1-s)]^{-1/2}\,s^{-\mathrm{Re}(\alpha)}, \tag{96}$$
for each t ∈ (0, 1) and s ∈ (0, 1) is integrable on $(0, 1)^2$. Hence using the substitution
$$t = \sin^{2}(\tau),\quad \tau\in(0,\pi/2), \tag{97}$$
and Fubini’s theorem it follows that the function defined by
$$s^{-(\frac12+\mathrm{Re}(\alpha))}\,(1-s)^{-1}, \tag{98}$$
for each s ∈ (0, 1) is integrable on (0, 1), which is false. Hence (92) is not absolutely summable. Using similar methods it can be shown that the sequence
$$\sum_{k=0}^{n}\left\{c_{\omega_k^-}\left[u_{\omega_k^-}(0)\right]^{2} + c_{\omega_k^+}\left[u_{\omega_k^+}(0)\right]^{2}\right\}, \tag{99}$$
converges for n → ∞. The proof of this is not given here. Note that this does not contradict the fact that (92) is not absolutely summable. The “sum” of (92) just depends on the order of the summation.

B. This appendix gives a (not completely rigorous) derivation for the representation (16) used in this paper for the solution to the initial value problem. The derivation uses the “Laplace method” of [13]. To aid further applications of this general method, the derivation is provided for a more general wave equation (100) than used in this paper. Take as given J, V, and f, where J is a non-empty (bounded or unbounded) interval of R, V is a continuous real-valued function on J, and f is a square integrable complex-valued function on J. Let $\varphi_f$ be a given complex-valued function on R × J, which is two times continuously partially differentiable and which satisfies
$$\frac{\partial^{2}\varphi_f}{\partial t^{2}}(t,x) - \frac{\partial^{2}\varphi_f}{\partial x^{2}}(t,x) + V(x)\,\varphi_f(t,x) = 0, \tag{100}$$
for each t ∈ R and x ∈ J. In addition $\varphi_f$ satisfies the initial conditions
$$\varphi_f(0,x) = 0,\qquad \frac{\partial\varphi_f}{\partial t}(0,x) = f(x) \tag{101}$$
for each t ∈ R and x ∈ J. Finally, let ε be a given strictly positive real (otherwise arbitrary) number. By Laplace transforming (100) and using (101) one gets the representations (110), (111) of $\varphi_f$ below as follows. Defining
$$\psi_f(t,x) := e^{-\epsilon t}\,\varphi_f(t,x), \tag{102}$$
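for each t ∈ R, x ∈ J and assuming the boundedness of
$$\psi_f(\cdot,x),\qquad \frac{\partial\psi_f}{\partial t}(\cdot,x),\qquad \frac{\partial^{2}\psi_f}{\partial t^{2}}(\cdot,x), \tag{103}$$
on each [0, ∞) for each x ∈ J one gets from (100), (101) for arbitrary x ∈ J and s ∈ C with Re(s) > 0,
$$\int_0^{\infty} e^{-st}\cdot\left[-\frac{\partial^{2}\psi_f}{\partial x^{2}}(t,x) + \left[V(x)+(s+\epsilon)^{2}\right]\cdot\psi_f(t,x)\right]dt = f(x). \tag{104}$$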
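The step from (100)–(102) to (104) can be spelled out as follows. This is a sketch under the assumptions stated in the text: ε denotes the strictly positive parameter entering (102) (the symbol is our reading of the source), $\Psi_f$ is shorthand for the Laplace transform of $\psi_f$ in t, and the boundedness of the functions in (103) is what removes the boundary terms at infinity for Re(s) > 0.

```latex
\begin{align*}
\varphi_f = e^{\epsilon t}\psi_f
  \;&\Longrightarrow\;
  \frac{\partial^2\varphi_f}{\partial t^2}
  = e^{\epsilon t}\Bigl(\frac{\partial^2\psi_f}{\partial t^2}
    + 2\epsilon\,\frac{\partial\psi_f}{\partial t} + \epsilon^2\psi_f\Bigr),\\
\text{so (100) becomes}\quad
  &\frac{\partial^2\psi_f}{\partial t^2} + 2\epsilon\,\frac{\partial\psi_f}{\partial t}
   + \epsilon^2\psi_f - \frac{\partial^2\psi_f}{\partial x^2} + V\psi_f = 0,
  \qquad \psi_f(0,x) = 0,\quad \frac{\partial\psi_f}{\partial t}(0,x) = f(x).\\
\text{With } \Psi_f(s,x) := \int_0^\infty e^{-st}\psi_f(t,x)\,dt:\quad
  &\int_0^\infty e^{-st}\,\frac{\partial\psi_f}{\partial t}\,dt = s\,\Psi_f,
  \qquad
  \int_0^\infty e^{-st}\,\frac{\partial^2\psi_f}{\partial t^2}\,dt = s^2\Psi_f - f,\\
\text{hence}\quad
  &\int_0^\infty e^{-st}\Bigl(\frac{\partial^2\psi_f}{\partial t^2}
   + 2\epsilon\,\frac{\partial\psi_f}{\partial t} + \epsilon^2\psi_f\Bigr)dt
   = (s+\epsilon)^2\,\Psi_f - f .
\end{align*}
```

Inserting the last line into the Laplace transform of the transformed equation and moving f to the right-hand side gives (104).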
From this, assuming the uniform boundedness of
$$\psi_f(\cdot,y),\qquad \frac{\partial\psi_f}{\partial t}(\cdot,y),\qquad \frac{\partial^{2}\psi_f}{\partial t^{2}}(\cdot,y), \tag{105}$$
for y from a neighbourhood of x, one concludes
$$-(\Psi_f(s,\cdot))''(x) + \left[V(x) - (-is-i\epsilon)^{2}\right]\Psi_f(s,x) = f(x), \tag{106}$$
where
$$\Psi_f(s,y) := \int_0^{\infty} e^{-st}\,\psi_f(t,y)\,dt, \tag{107}$$
for each y ∈ J. Note that, roughly speaking, the −iε also guarantees the unique solvability of (106) in $L^2(\mathbb{R})$ for the limiting cases where s is purely imaginary. This fact will be used in (107) for inverting the Laplace transform. Formal inversion of (106) leads to
$$\Psi_f(s,x) = G_f\!\left((\omega - i\cdot(\sigma+\epsilon))^{2},\,x\right) := \int_J G\!\left((\omega - i(\sigma+\epsilon))^{2},\,x,\,y\right)f(y)\,dy, \tag{108}$$
where $G((\omega - i(\sigma+\epsilon))^{2}, \cdot, \cdot)$ is a Green’s function for the formal differential operator
$$-\frac{d^{2}}{dx^{2}} + V - \left[\omega - i(\sigma+\epsilon)\right]^{2}, \tag{109}$$
which one arrives at by the method of variation of constants. Here σ, ω denote the real and imaginary parts of s, respectively. Note that for the choice of an appropriate Green’s function it may be necessary to impose further boundary conditions on the solutions of (100). By assuming the square integrability of $\psi_f(\cdot, x)$ on [0, ∞) the inversion of the Laplace transform in (107) can be performed using the Fourier inversion theorem for square integrable functions on the real line. In this way one gets from (107), (108) the representations,
$$F^{-1}G_f\!\left((\cdot - i\epsilon)^{2},\,x\right) = \sqrt{2\pi}\cdot
\begin{cases}
\psi_f(t,x) & \text{for } t\ge 0\\
0 & \text{for } t<0
\end{cases}, \tag{110}$$
where F denotes the unitary linear Fourier transformation on $L^2(\mathbb{R})$ (defined according to [12], Vol. II) as well as for (Lebesgue-) almost all t ∈ [0, ∞),
$$\varphi_f(t,x) = \frac{1}{2\pi}\lim_{\nu\to\infty}\int_{-\nu}^{\nu} e^{it\cdot(\omega-i\epsilon)}\,G_f\!\left((\omega-i\epsilon)^{2},\,x\right)d\omega. \tag{111}$$
Note that the limit in the last formula is essential since from the assumptions made one can only conclude that the integrand is square integrable (but not integrable) over R. Moreover note that the right hand side of (108) and the left hand side of (111) are independent of ε, which reflects the fact that in the inversion of the Laplace transformation there is some freedom in the choice of contour. Starting from (108) one can arrive at (16) in the following way. Let g be an arbitrarily chosen infinitely often differentiable complex-valued function on R having compact support. Assuming the uniform boundedness of $\psi_f$ on R × supp(g) one gets from (107), (108),
$$\Psi_{g,f}(s,x) = G_{g,f}\!\left((\omega - i\cdot\epsilon)^{2}\right), \tag{112}$$
where
$$\begin{aligned}
\Psi_{g,f}(s) &:= \int_0^{\infty} e^{-st}\cdot\langle g|\psi_f(t,\cdot)\rangle\,dt,\\
\langle g|\psi_f(t,\cdot)\rangle &:= \int_J g^{*}(x)\,\psi_f(t,x)\,dx \quad\text{for each } t\in\mathbb{R},\\
G_{g,f}\!\left((\omega - i\cdot\epsilon)^{2}\right) &:= \int_J g^{*}(x)\,G_f\!\left((\omega - i\cdot\epsilon)^{2},\,x\right)dx.
\end{aligned} \tag{113}$$
Assuming the square integrability of the function which associates to each t ∈ [0, ∞) the value of $\langle g|\psi_f(t,\cdot)\rangle$ one gets from (112) by the Fourier inversion theorem,
$$\left[F^{-1}G_{g,f}\!\left((\cdot - i\epsilon)^{2}\right)\right](t) = \sqrt{2\pi}\cdot
\begin{cases}
\langle g|\psi_f(t,\cdot)\rangle & \text{for } t\ge 0\\
0 & \text{for } t<0
\end{cases}, \tag{114}$$
as well as for almost all t ∈ [0, ∞),
$$\langle g|\varphi_f(t,\cdot)\rangle = \frac{1}{2\pi}\lim_{\nu\to\infty}\int_{-\nu}^{\nu} e^{it\cdot(\omega-i\epsilon)}\,G_{g,f}\!\left((\omega-i\epsilon)^{2}\right)d\omega. \tag{115}$$
Formula (115) is easily seen to be a consequence of (114). In the following, sufficient conditions are given for the validity of the Formulae (114), and (115) leading to the formulae (120) and (121), respectively. For the terminology used in the following theorem consult for example Vol. I of [12].

Theorem 1. Let X be a non-trivial complex Hilbert space with the scalar product < | >. Let A : D(A) → X be a densely defined, linear, self-adjoint, semibounded operator in X with spectrum σ(A) and resolvent R, where the latter is defined by $R(\lambda) := (A-\lambda)^{-1}$ for each λ ∈ C \ σ(A). Define
$$\alpha :=
\begin{cases}
0 & \text{for } \min\sigma(A)\ge 0\\
\sqrt{-\min\sigma(A)} & \text{for } \min\sigma(A)<0
\end{cases}. \tag{116}$$
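Also for each ξ, η ∈ X define the analytic function $R_{\xi,\eta}$ by
$$R_{\xi,\eta}(\omega+i\sigma) := \langle\xi|R((\omega+i\sigma)^{2})\eta\rangle, \tag{117}$$
for each ω ∈ R and σ ∈ (−∞, −α). Finally, let ξ and η be arbitrary elements of D(A) and X, respectively and let $\varphi_\xi$ be the unique element of $C^2(\mathbb{R}, X)$ satisfying for each t ∈ R,
$$\varphi_\xi''(t) = -A\,\varphi_\xi(t), \tag{118}$$
and the initial conditions
$$\varphi_\xi(0) = 0,\qquad \varphi_\xi'(0) = \xi. \tag{119}$$
Then for each ε ∈ (α, ∞) and almost all (in the Lebesgue sense) t ∈ [0, ∞),
$$\langle\eta|\varphi_\xi(t)\rangle = \frac{e^{\epsilon t}}{\sqrt{2\pi}}\left[F^{-1}R_{\eta,\xi}(\cdot - i\epsilon)\right](t), \tag{120}$$
and
$$\langle\eta|\varphi_\xi(t)\rangle = \frac{1}{2\pi}\lim_{\nu\to\infty}\int_{-\nu}^{\nu} e^{it\cdot(\omega-i\epsilon)}\,R_{\eta,\xi}(\omega-i\epsilon)\,d\omega. \tag{121}$$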
This theorem is mainly a consequence of Theorem 11.6.1 in [6] and the proposition on p. 295 in Vol. II of [12], and will not be proved here.

References
1. Abramowitz, M. and Stegun, I. A. (eds.): Pocketbook of Mathematical Functions. Thun: Harri Deutsch, 1984
2. Bachelot, A. and Motet-Bachelot, A.: Les Resonances D’un Trou Noir De Schwarzschild. Ann. I.H.P. Phys. Theor. 59, 3–68 (1993)
3. Beyer, H.R.: The spectrum of radial adiabatic stellar oscillations. J. Math. Phys. 36, 4815–4825 (1995)
4. Chandrasekhar, S. and Detweiler, S.: The quasi-normal modes of the Schwarzschild black hole. Proc. R. Soc. Lond. A. 344, 441–452 (1975)
5. Ferrari, V. and Mashhoon, B.: New approach to the quasinormal modes of a black hole. Phys. Rev. D 30, 295–304 (1984)
6. Hille, E. and Phillips, R.S.: Functional Analysis and Semi-Groups. Providence, RI: AMS (1957)
7. Shmuel, A. and Klein, M.: Analytic properties in scattering and spectral theory for Schroedinger operators with long-range radial potentials. Duke M. J. 68, 337–399 (1992)
8. Newton, R.G.: Scattering theory of waves and particles. 2. ed., New York: Springer, 1982
9. Oberhettinger, F.: Fourier Transforms of Distributions and their Inverses. New York: McGraw-Hill, 1973
10. Pöschl, G. and Teller, E.: Bemerkungen zur Quantenmechanik des harmonischen Oszillators. Z. Phys. 83, 143–151 (1933)
11. Price, R.H. and Husain, V.: Model for the completeness of quasinormal modes of relativistic stellar oscillations. Phys. Rev. Lett. 68, 1973–1976 (1992)
12. Reed, M. and Simon, B.: Methods of Mathematical Physics Volume I, II, III, IV. New York: Academic, 1980, 1975, 1979, 1978
13. Schmidt, B.G. and Nollert, H-P.: Quasinormal modes of Schwarzschild black holes: Defined and calculated via Laplace transformation. Phys. Rev. D 45, 2617–2627 (1992)
14. Vishveshwara, C.V.: Stability of the Schwarzschild Metric. Phys. Rev. D 1, 2870–2879 (1970)
15. Weidmann, J.: Lineare Operatoren in Hilberträumen. Teubner: Stuttgart, 1976
16. Yosida, K.: Functional Analysis. Berlin: Springer, 1980

Communicated by H. Nicolai
Commun. Math. Phys. 204, 425 – 437 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Algebraic Entropy
M. P. Bellon¹, C.-M. Viallet¹,²
¹ Laboratoire de Physique Théorique et Hautes Energies, Unité Mixte de Recherche 7589, CNRS et Universités Paris VI et Paris VII, Paris, France
² CERN, Division TH, CH-1211 Genève 23, Switzerland
Received: 20 May 1998 / Accepted: 2 February 1999
Abstract: For any discrete time dynamical system with a rational evolution, we define an entropy, which is a global index of complexity for the evolution map. We analyze its basic properties and its relations to the singularities and the irreversibility of the map. We indicate how it can be exactly calculated.

1. Introduction

Exploring the behaviour of dynamical systems is an old subject of mechanics [1,2]. Turning to discrete systems has triggered a huge activity and the notions of sensitivity to initial conditions, numerical (in)stability, Lyapunov exponents and various entropies remained at the core of the subject (see for example [3]). We describe here the construction of a characterizing number associated to discrete systems having a rational evolution (the state at time t + 1 is expressible rationally in terms of the state at time t): it is defined in an algebraic way and we call it the algebraic entropy of the map. It is linked to global properties of the evolution map, which usually is not everywhere invertible. It is not attached to any particular domain of initial conditions and reflects its asymptotic behaviour. Its definition moreover does not require the existence of any particular object like an ergodic measure.

In previous works [4–7], a link has been observed between the dynamical complexity [8] and the degree of the composed map. The naive composition of n degree-d maps is of degree $d^n$, but common factors can be eliminated without any change to the map on generic points. This lowers the degree of the iterates. For maps admitting invariants, the growth of the degree was observed to be polynomial, while the generic growth is exponential.

We first define the algebraic entropy of a map from the growth of the degrees of its iterates, and give some of its fundamental properties. From the enumeration of the degrees of the first iterates, it is possible to infer the generating function and extract the exact value of the algebraic entropy, even for systems with a large number of degrees of freedom. The reason underlying this calculability is the existence of a finite recurrence relation between the degrees. After reviewing basic properties of birational maps and of their singularities, we prove such recurrences for specific families. The proof relies on the analysis of the singularities. We describe the relations between the factorization process governing the growth of the degrees of the iterates and the geometry of the singularities of the evolution. This sheds new light on the analysis of [9–11]. See also [12].

2. Algebraic Entropy

2.1. Definition. The primary notion we use is the degree of a rational map. In order to assign a well defined degree to a map, we require that all the components of the map are reduced to a common denominator of the smallest possible degree. The maximum degree of the common denominator and the various numerators is called the degree of the rational map and it is the common degree of the homogeneous polynomials describing the map in projective space. From this definition we obtain the two basic properties:
• The degree is invariant under projective transformations of the source and image spaces.
• The degree of the composition of two maps is bounded by the product of the degrees of the maps.
From now on, rational maps will always be defined by homogeneous polynomials acting on homogeneous coordinates of the projective completion of affine space. When calculating the composition of two maps, common factors may appear which lower the degree of the resulting map. We then define a reduced composition φ2 × φ1 of φ1 and φ2 by:
$$\varphi_2\circ\varphi_1 = m(\varphi_2,\varphi_1)\cdot(\varphi_2\times\varphi_1). \tag{1}$$
We denote by $\varphi^{[n]}$ the “true” nth iterate of a map φ, once all factors have been removed. For a transformation φ, we can define the sequence $d_n$ of the degrees of the successive iterates $\varphi^{[n]}$ of φ.

Defining Proposition. The sequence $\frac{1}{n}\log d_n$ always admits a limit as n → ∞. By definition, we call this limit the algebraic entropy of the map φ.

The proof is straightforward and is a consequence of the inequality $d_{n_1+n_2} \le d_{n_1} d_{n_2}$. The algebraic entropy is independent of the particular representation of the rational map φ. Indeed, if we take the conjugation of φ by some birational transformation ψ, $\varphi' = \psi^{-1}\times\varphi\times\psi$, the degree $d'_n$ of $\varphi'^{[n]}$ will satisfy $d'_n \le k\,d_n$ for some constant k depending on the degree of ψ. A similar inequality can be obtained when writing $\varphi = \psi\varphi'\psi^{-1}$. In other words, the entropy is a birational invariant associated to φ. This quantity can be rather easily computed by taking the images of an arbitrary line. The convergence to the asymptotic behavior is quite fast and can be obtained from the first iterates for which the degree can be exactly calculated. The growth of $d_n$ measures the complexity of the evolution, since $d_n$ is the number of intersections of the nth image of a generic line with a fixed hyperplane. It is related to the complexity introduced by Arnol’d [8], with the difference that we are not dealing with homeomorphisms.
The algebraic entropy also has an analytic interpretation. An invariant Kähler metric exists on the complex projective space $P^n$ and the volume of a k-dimensional algebraic variety is given by the integral of the kth power of the Kähler form. This volume is proportional to the degree of the variety [13]. The area of the image by $\varphi^{[n]}$ of a complex line can be expressed as the integral of the squared modulus of the differential of $\varphi^{[n]}$. It is proportional to $d_n$ by the above argument. The algebraic entropy can then be viewed as an averaged exponent: it does not depend on the choice of a starting point and it has the advantage of being of a global nature. The definition of the algebraic entropy can be generalized to sequences of maps $(\varphi_k)_k$ such that the degree of $\varphi_k$ is bounded. We define $\varphi^{[n]}$ to be the regularized map $\varphi_n\times\dots\times\varphi_2\times\varphi_1$. This allows the extension to non-autonomous iterations and to maps which are the product of elementary steps, in which case the sequence $(\varphi_k)_k$ is periodic. In the cases where $d_n$ grows polynomially with n, the algebraic entropy is zero, but we can make use of a new invariant, the degree of this polynomial in n. As the algebraic entropy, it is a birational invariant.

2.2. Entropy of the Hénon map. For a simple confrontation of the algebraic entropy with more usual approaches, let us consider the much studied Hénon map [14]. Since it is a polynomial map, it is usually considered as having no singularities. This is a misconception: using projective space shows that singularities exist and are located on the line at infinity.
$$t \to t^{2}, \tag{2}$$
$$x \to t^{2} + ty - ax^{2}, \tag{3}$$
$$y \to btx. \tag{4}$$
Here t is the homogenizing coordinate. We immediately see on this expression that the line at infinity t = 0 is sent to the point with homogeneous coordinates (t, x, y) = (0, 1, 0). This point is still on the line t = 0, so it is a fixed point of the transformation. It will therefore never be mapped to (0, 0, 0) and there cannot be any factorization. The nth iterate of this map is of degree $2^n$ and the algebraic entropy is log(2). The remarkable thing is that this number is independent of the parameters a and b, contrary to usual dynamical exponents.

3. Birational Maps

Among rational maps, we mainly use birational ones. They are almost everywhere invertible and are therefore quite appropriate for modeling systems possessing a certain amount of reversibility.

3.1. A little bit of algebraic geometry. Rational relations between two algebraic sets X and Y are relations with a graph Z which is an algebraic subset of X × Y. It would be too restrictive to impose that this defines a map from X to Y. In fact, the only rational maps which are defined on the whole space $P^n$ are linear. One therefore only requires that a rational map is one to one on the complement of an algebraic variety, that is a Zariski open set. A birational map defines a bijection from an open subset X0 of X to an open subset Y0 of Y.
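Returning to the Hénon map (2)–(4) of Sect. 2.2, the recipe of Sect. 2.1 (follow the images of a line and read off the degree) can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' implementation: polynomials in the line parameter are coefficient lists over the rationals, the line and the parameter values a = 7/5, b = 3/10 are arbitrary choices assumed to be generic, and all function names are hypothetical.

```python
from fractions import Fraction
from math import log

def trim(p):
    # Drop trailing zero coefficients (lists are low order first).
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return trim([u + v for u, v in zip(p, q)])

def scale(c, p):
    return trim([c * u for u in p])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, u in enumerate(p):
        for j, v in enumerate(q):
            out[i + j] += u * v
    return trim(out)

def divmod_poly(p, q):
    # Quotient and remainder of p by a nonzero polynomial q.
    p = trim(p)
    quo = [Fraction(0)] * max(len(p) - len(q) + 1, 1)
    while len(p) >= len(q):
        shift, c = len(p) - len(q), p[-1] / q[-1]
        quo[shift] = c
        p = add(p, scale(-c, [Fraction(0)] * shift + q))
    return trim(quo), p

def gcd(p, q):
    # Monic greatest common divisor by the Euclidean algorithm.
    while q:
        p, q = q, divmod_poly(p, q)[1]
    return scale(1 / p[-1], p) if p else []

def henon(t, x, y, a, b):
    # Homogeneous map (2)-(4): (t, x, y) -> (t^2, t^2 + t*y - a*x^2, b*t*x).
    tt = mul(t, t)
    return tt, add(add(tt, mul(t, y)), scale(-a, mul(x, x))), scale(b, mul(t, x))

def degrees(n_max, a=Fraction(7, 5), b=Fraction(3, 10)):
    F = Fraction
    # The line (t, x, y) = (1 + u, 2 - u, 3 + 5u), assumed to be generic.
    comps = ([F(1), F(1)], [F(2), F(-1)], [F(3), F(5)])
    out = []
    for _ in range(n_max):
        comps = henon(*comps, a, b)
        g = gcd(gcd(comps[0], comps[1]), comps[2])  # common factor, if any
        comps = tuple(divmod_poly(c, g)[0] for c in comps)
        out.append(max(len(c) for c in comps) - 1)
    return out

d = degrees(4)
print(d, [log(dn) / (n + 1) for n, dn in enumerate(d)])
```

No common factor ever appears along this line, in agreement with the fixed-point argument above, so the degrees double at each step and the estimates (1/n) log d_n equal log 2 from the first iterate on.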
428
M. P. Bellon, C.-M. Viallet
If we call $p_1$ and $p_2$ the projections onto the components of the Cartesian product restricted to $Z$, the point $x$ will correspond to $p_2(p_1^{-1}(x))$. When this subset of $Y$ is not reduced to a point, $x$ is by definition in the singular locus of $Z$. If we solve for the homogeneous coordinates of the image point, we get homogeneous polynomials in the coordinates of $x$. We therefore get a map defined in $C^{n+1}$. Homogeneity makes it compatible with the scale relation defining projective space. Some vector lines in $C^{n+1}$, however, are identically mapped to zero: they are projective points without definite images. The set of these points is exactly the singular locus.

If $\phi$ is the homogeneous polynomial representation of a rational map and $P$ is a homogeneous polynomial in $(n + 1)$ variables, we denote by $\phi^* P$ the pull-back of $P$ by $\phi$. It is simply obtained by the composition $P \circ \phi$. The hypersurface of equation $\phi^* P = 0$ is the image by $\phi^{-1}$ of the hypersurface $P = 0$. If $x_j$ is one of the homogeneous coordinates of $P_n$, $\phi^* x_j$ is simply the index $j$ component of the polynomial function $\phi$. Homogeneous polynomials do not define functions on $P_n$, but sections of a line bundle which only depends on the homogeneity degree.
3.2. Two examples. Let us describe two examples of birational maps. The first one is the generalized Hadamard inverse in $P_n$. Take two copies of $P_n$ with homogeneous coordinates $(x_0, x_1, \dots, x_n)$ and $(y_0, y_1, \dots, y_n)$ and define $Z$ by the $n$ equations in $P_n \times P_n$:
\[ x_i y_i = x_0 y_0, \quad i = 1, \dots, n. \tag{5} \]
On the subset where all the $x_i$ are different from zero, we can use affine coordinates by fixing $x_0 = 1$, and $Z$ defines the map
\[ (x_1, \dots, x_n) \to (1/x_1, \dots, 1/x_n). \tag{6} \]
If any of the $x_i$ is zero, then all the products $x_j y_j$ must be zero. Let $J$ be the set of indices for which $x_i$ is zero. $Z$ induces a correspondence between $\{x\}$ and the linear space $\bigcap_{i \in [0, \dots, n] - J} H_i$ ($H_i$ is the hyperplane $y_i = 0$). Instead of Eqs. (5), it is often more convenient to give a functional definition of this correspondence. A polynomial definition is:
\[ y_i = \prod_{j \neq i} x_j. \tag{7} \]
The $y_i$'s are polynomial functions of the $x_i$'s of degree $n$ (each is a product of the $n$ other coordinates) and they satisfy Eqs. (5). But no formula can give the proper relationship for singular points. For these, at least two of the $x_i$'s are zero and therefore all the $y_i$'s vanish.

The second example is given in two dimensions by
\[ x \to x, \tag{8} \]
\[ y \to f(x) - y, \tag{9} \]
with f (x) any rational function of x. Here again, we can give a homogeneous polynomial formulation. We will have a third variable t which will be multiplied by the denominator of f (x). These two involutions give rise to interesting evolution maps when combined with simple linear transformations. The exchange of the two variables x and y combined
Algebraic Entropy
429
with (8) gives a family of transformations which contains, for suitable $f$, discrete versions of some Painlevé equations [11]. The transformation (5) and its conjugation by the Fourier transformation yield a birational transformation which appears naturally as a symmetry of the $(n + 1)$-state chiral Potts model [15].

3.3. Singularities. The singular points of a birational map are the vector lines of $C^{n+1}$ which are sent to the origin $(0, 0, \dots, 0)$. This singular set is of codimension at least 2. In fact, if there were an algebraic set of codimension 1 sent to the origin, the equation of this set could be factored out of all the components of the image, allowing a reduced description of the map without this singularity.

There is a bigger set where the map is not bijective. Let $\phi$ be a birational map and $\psi$ be its inverse. Then the composition $\psi \circ \phi$ of their representations as polynomial maps in $C^{n+1}$ is a map of degree $d^2$. It is however equivalent to the identity, so that each of the components of the image is of the form $K_\phi x_i$, where $K_\phi$ is a homogeneous polynomial of degree $d^2 - 1$. The set of zeroes of $K_\phi$, $V(K_\phi)$, is a set where the composition $\psi \circ \phi$ is a priori not defined and it plays a fundamental role.

$K_\phi$ is an example of a multiplier. When composing two birational maps $\phi_1$ and $\phi_2$, a common factor $m(\phi_2, \phi_1)$ may appear in the components of $\phi_2 \circ \phi_1$. In the case of inverse birational transformations, $\psi \times \phi$ is the identity and $m(\psi, \phi)$ is $K_\phi$. A fundamental property of $m(\phi_2, \phi_1)$ is that it cannot vanish outside $V(K_{\phi_1})$. Otherwise $\phi_1$ would map an open subset of the set of zeros of $m(\phi_2, \phi_1)$ to a codimension 1 set where $\phi_2$ is singular, since $\phi_1$ is a diffeomorphism outside of $V(K_{\phi_1})$. This gives us a contradiction, since the singular sets of rational maps are of codimension at least 2. Determining the multiplying factor amounts to determining the exponents of the different irreducible components of $K_{\phi_1}$ in $m(\phi_2, \phi_1)$. In fact we obtain a definition of the map on a number of apparently singular hypersurfaces, which is a natural continuous extension of the map.

3.4. The meaning of factorization. Consider the successive iterates $\phi^{[n]}$ of a birational map $\phi$. Suppose we have the following pattern of factorization:
\[ \phi \circ \phi = \phi \times \phi = \phi^{[2]}, \tag{10} \]
\[ \phi \circ \phi^{[2]} = \phi \times \phi^{[2]} = \phi^{[3]}, \tag{11} \]
\[ \phi \circ \phi^{[3]} = \kappa \cdot \phi^{[4]}, \tag{12} \]
with $\kappa$ different from 1. Equation (12) means that the variety $\kappa = 0$ is sent to singular points of $\phi$ by $\phi^{[3]}$. In other words, $\kappa = 0$ is blown down to some variety of codimension higher than one by $\phi$. The latter is non singular for the action of $\phi$ and $\phi^{[2]}$ but is eventually blown up by $\phi^{[3]}$.

Two situations may occur: it may happen that the image by $\phi^{[4]}$ of the variety $\kappa = 0$ is again of codimension 1 and we have a self-regularization of the map. Such a situation was called singularity confinement in [9,10]. We would rather call it resolution of singularities. Reversibility is recovered on the singular set of $\phi$ after a finite number of time steps. The other possibility is that the image of the variety $\kappa = 0$ by $\phi^{[4]}$ remains of codimension larger than one, a situation depicted in Fig. 1. In the scheme of Fig. 1, the equation of $\Sigma$ is $\kappa = 0$, and the factor $\kappa$ appears anew in $\phi \circ \phi^{[4]}$. The fifth iterate $\phi^{[5]}$ is regular on $\Sigma$.
Fig. 1. A possible blow-down blow-up scheme in $P_3$ (figure not reproduced; its labels are the varieties $\Pi$, $\Pi'$, $\Sigma$, $\Sigma'$, $\Delta$, $\Delta'$)
The drop of the degree of the iterates is due to the presence of singularities on the successive images of a generic surface under the repeated action of $\phi$. In other words, these images are less and less generic.

4. Recurrence Relations for the Degree

One of the basic properties of the sequence of degrees is that it seemingly always verifies a finite linear recurrence relation with integer coefficients. If this is true, the algebraic entropy is the logarithm of an algebraic number.

4.1. A simple case in $P_2$. Consider the map $\phi = \phi_2 \phi_1$ with $\phi_1$ and $\phi_2$ given by:
\[ \phi_1 : \begin{cases} t' = xy + 3ty + 3tx \\ x' = xy + \alpha tx + \beta ty \\ y' = xy + \beta tx + \alpha ty \end{cases} \qquad \phi_2 : \begin{cases} t' = xy + 3ty + 3tx \\ x' = xy + \beta tx + \alpha ty \\ y' = xy + \alpha tx + \beta ty \end{cases} \]
with $\alpha$ and $\beta$ the roots $\tfrac{1}{2}(-1 \pm i\sqrt{7})$ of $z^2 + z + 2 = 0$. It was used as an example of chaotic behavior in [15] and its singularities have been studied in [4]. The first few elements of the sequence $d_n$ are:
\[ 1, 2, 4, 7, 12, 20, 33, 54, \dots \tag{13} \]
This sequence can be coded in the generating function:
\[ g(z) = \frac{1}{1 - 2z + z^3}. \tag{14} \]
The rationality of the generating function is equivalent to the existence of a finite linear recurrence relation for the degrees, at least after a finite number of steps. The determination of the entropy is straightforward once the recurrence relation is known.
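The link between the generating function (14) and the sequence (13) can be checked mechanically. The following Python sketch is an illustration only: the helper name `series_coefficients` and the truncation at 40 terms are assumptions of this example, not part of the original computation. It expands a rational function with integer coefficients as a power series and estimates the entropy from the ratio of two successive degrees.

```python
from math import log, sqrt


def series_coefficients(numerator, denominator, count):
    """Return the first `count` Taylor coefficients of numerator(z)/denominator(z).

    Both polynomials are lists of integers in increasing powers of z.
    The constant term of the denominator must be 1, so that every
    coefficient of the series stays an integer.
    """
    assert denominator[0] == 1
    coeffs = []
    for n in range(count):
        value = numerator[n] if n < len(numerator) else 0
        # Linear recurrence read off from denominator(z) * series(z) = numerator(z).
        for k in range(1, min(n, len(denominator) - 1) + 1):
            value -= denominator[k] * coeffs[n - k]
        coeffs.append(value)
    return coeffs


# g(z) = 1 / (1 - 2 z + z^3), formula (14).
degrees = series_coefficients([1], [1, -2, 0, 1], 40)
print(degrees[:8])

# The ratio of successive degrees approximates exp(entropy).
entropy_estimate = log(degrees[-1] / degrees[-2])
print(entropy_estimate, log((1 + sqrt(5)) / 2))
```

Reading the recurrence off the denominator is exactly the equivalence stated above: a rational generating function with denominator $1 - 2z + z^3$ is the same information as $d_{n+1} = 2d_n - d_{n-2}$ together with the first few degrees.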
The iterated map is a product $\phi_2 \circ \phi_1$ of two linearly related transformations $\phi_1$ and $\phi_2$ of degree 2. It is useful in this case to look at the sequence of iterates of $\phi$ as the sequence built from $(\phi_1, \phi_2, \phi_1, \phi_2, \dots)$. When calculating $\phi_1 \times \phi_2 \times \phi_1$, $t$ will appear as a factor, since $\phi_2 \times \phi_1$ sends the line $t = 0$ to a singular point of $\phi_1$. We want to know the degree of the factor $m(\phi^{[n]}, \phi)$ or $m(\phi, \phi^{[n]})$. The former factor can only contain the factor $t$, but the exponent is not readily known, so we rather examine $m(\phi, \phi^{[n]})$. We have to determine the curve which is sent by $\phi^{[n-2]}$ to the line $t = 0$. This is just the first component of the polynomial expression of $\phi^{[n-2]}$ and can be written $(\phi^{[n-2]})^* t$. It has degree $d_{n-2}$. In two steps, the line $t = 0$ is mapped to a singular point of the following $\phi_i$. The curve with equation $(\phi^{[n]})^* t$ is therefore mapped to a singular point by $\phi^{[n+2]}$ and its equation can be factorized in the calculation of $\phi^{[n+3]}$. This gives:
\[ (\phi^{[n-2]})^* t \cdot \phi^{[n+1]} = \phi \circ \phi^{[n]}, \tag{15} \]
and consequently the following recurrence relation for $d_n$:
\[ d_{n+1} = 2 d_n - d_{n-2}. \tag{16} \]
This relation proves formula (14) and yields an exponential growth of the degrees and the value $\log \tfrac{1}{2}(1 + \sqrt{5})$ of the entropy.

4.2. Factors in the factors. The previous analysis is simple because the image of $t$ remains an irreducible polynomial. This cannot be true in general, since the factors in $K_\phi$ generally break into pieces under further transformation by $\phi$ [5]. Let us take two birational transformations $\phi_1$ and $\phi_2$ with respective inverses $\psi_1$ and $\psi_2$ and calculate $\psi_1 \circ \phi_1 \circ \phi_2$ in two different ways:
\[ \psi_1 \circ \phi_1 \circ \phi_2 = (K_{\phi_1} \cdot \mathrm{Id}) \circ \phi_2 = \phi_2^* K_{\phi_1} \cdot \phi_2 = m(\phi_1, \phi_2)^{d_{\psi_1}} \cdot \psi_1 \circ (\phi_1 \times \phi_2). \tag{17} \]
Since the components of $\phi_2$ cannot have any common factor, we deduce that $m(\phi_1, \phi_2)^{d_{\psi_1}}$ divides $\phi_2^* K_{\phi_1}$. Geometrically, $m(\phi_1, \phi_2)$ is the equation of a hypersurface which $\phi_2$ sends to singular points of $\phi_1$. Since $K_{\phi_1}$ vanishes on the singular points of $\phi_1$, its image $\phi_2^* K_{\phi_1}$ vanishes on the zero locus of $m(\phi_1, \phi_2)$.

In the example of the previous section, each new factor appearing in $\phi \circ \phi^{[n]}$ is the equation of a hypersurface which $\phi^{[n]}$ sends to the point $(1, 0, 0)$. The $x$ and $y$ components of $\phi^{[n]}$ therefore have a common factor $m(\phi, \phi^{[n]})$. Consequently, the image $\phi^{[n]*} K_\phi$ of $K_\phi = t\,x\,y$ by $\phi^{[n]}$ contains the expected factor $m(\phi, \phi^{[n]})^2$, while $\phi^{[n]*} t$ does not contain this factor.

4.3. An example in $P_{N-1}$. Consider the algebra of the finite group $Z_N$, its generic element $a(x) = \sum_{q=0}^{N-1} x_q \sigma^q$, with $x = (x_0, x_1, \dots, x_{N-1})$ and $\sigma$ the generator of $Z_N$. The algebra has two homomorphic products:
\[ a(x \circ y) = a(x) \circ a(y), \tag{18} \]
\[ a(x \cdot y) = \sum_{q=0}^{N-1} x_q y_q \sigma^q. \tag{19} \]
The product $\circ$ just comes from the product in $Z_N$, and verifies $\sigma^p \circ \sigma^q = \sigma^{(p+q)}$, while $\sigma^p \cdot \sigma^q = \delta^p_q \sigma^p$. In terms of cyclic matrices, these two products respectively correspond to the matrix product and the element by element (Hadamard) product. The homomorphism between these two products is realized by the Fourier transform. $\phi_1$ and $\phi_2$ will be the two inverses constructed from these products.

The components $(x_0, x_1, \dots, x_{N-1})$ of $x$ are the natural coordinates of projective space and $\phi_1$ and $\phi_2$ are involutions of degree $N - 1$. $K_{\phi_1}$ and $K_{\phi_2}$ are products of linear factors. These linear factors are the equations of hyperplanes which are sent by the corresponding $\phi_i$ into points which we call maximally singular. The important fact is that these maximally singular points are permuted by the other involution. As an example, the maximally singular points of the Hadamard inverse are of the form $\sigma^q$, i.e., with only one non zero component. The matrix inverse permutes such points by $\sigma^q \to \sigma^{N-q}$. If $p$ has one vanishing component, say $x_i$, then $\phi^{[2]}(p)$ will have all its coordinates vanishing except $x_{N-i}$. It follows that $x_i$ is a common factor to all these coordinates. The $j$th coordinate of $\phi^{[2]}(p)$ can be written (see footnote 1):
\[ \phi^2(p)_j = x_j^{[2]} \prod_{i \neq N-j} x_i. \tag{20} \]
The Hadamard inverse is easily calculated on such an expression. The coordinates of $\phi^3(p)$ are given by:
\[ \phi^3(p)_j = \prod_{i \neq j} x_i^{[2]} \; x_{N-j} \Bigl( \prod_{i=0 \dots N-1} x_i \Bigr)^{N-2}. \tag{21} \]
The common factor is simply $K_{\phi_1}$ and this suggests that $\phi^{[3]}$ is a local diffeomorphism on the zeroes of $K_{\phi_1}$.

We now want to determine the structure of the components of $\phi^{[n]}$ for any $n$. The situation for $n$ odd and $n$ even will be similar, since the conjugation by the Fourier transform exchanges the two inverses. From the expression (20) of $\phi^2(p)$, we see that this point is a generic element of the plane $x_j = 0$ if and only if $x_j^{[2]} = 0$. We define polynomials $x_j^{[n+2]}$ generalizing the $x_j^{[2]}$'s appearing in (20) such that (see footnote 2):
\[ \phi^{[n+2]}(p)_j = x_j^{[n+2]} \prod_{i \neq N-j} x_i^{[n]}. \tag{22} \]
If $g_n$ is the degree of the $x_j^{[n]}$'s, then Eq. (22) yields:
\[ d_n = g_n + (N - 1) g_{n-2}. \tag{23} \]
The generalization of (21) gives:
\[ m(\phi, \phi^{[n]}) = \Bigl( \prod_j x_j^{[n-2]} \Bigr)^{N-2}. \tag{24} \]
Footnote 1. As in Sect. 4.1, we write $\phi^n$ (resp. $\phi^{[n]}$) for the composition of alternately the two inverses (resp. the reduced composition).
Footnote 2. In this formula, the coordinates are different according to the parity of $n$. They are always such that the following $\phi$ is the Hadamard inverse in those coordinates.
The factor is therefore of degree $N(N - 2) g_{n-2}$. Finally:
\[ g_{n+1} = d_{n+1} - (N - 1) g_{n-1} = (N - 1) d_n - N(N - 2) g_{n-2} - (N - 1) g_{n-1} = (N - 1) g_n - (N - 1) g_{n-1} + g_{n-2}. \tag{25} \]
It is easy from this recurrence relation to determine that for $N = 3$, $(g_n)$ and therefore $(d_n)$ are periodic sequences of period 6. The sequence of the $\phi^{[n]}$'s is known to have this periodicity. For $N = 4$, $g_n$ is a polynomial of degree 2 in $n$, and for bigger $N$, the sequences are growing like $\beta^n$, with $\beta$ the larger root of $x^2 - (N - 2)x + 1$.

4.4. Another proof. There is another way to prove the previous result, relating directly to the study of the singularities and the blow-down blow-up process. We first need to introduce some notations, using a homogeneous coordinate system for $P_{N-1}$. The Hadamard inverse $\phi_1$ sends $(x_0, x_1, \dots, x_{N-1})$ into $(x'_0, x'_1, \dots, x'_{N-1})$, where $x'_k = \prod_{\alpha \neq k} x_\alpha$. The square of $\phi_1$ is the multiplication by $K_{\phi_1}$. Define $C$ to be the projective linear transformation constructed from the matrix
\[ C = \begin{pmatrix} 1 & 1 & 1 & \dots & 1 \\ 1 & \omega & \omega^2 & \dots & \omega^{N-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & \omega^{N-1} & \omega^{2(N-1)} & \dots & \omega^{(N-1)^2} \end{pmatrix}, \tag{26} \]
with $\omega = \exp(2i\pi/N)$. The inverse of $C$ is its complex conjugate $\bar{C}$. The involution $\phi_2$ is linearly related to $\phi_1$ by
\[ \phi_2 = C \phi_1 \bar{C}. \tag{27} \]
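The product $\phi_2 \circ \phi_1$ may thus be rewritten $\rho_1 \circ \rho_2$, with $\rho_1 = C \circ \phi_1$ and $\rho_2 = \bar{C} \circ \phi_1$. Denote by $\psi_1 = \phi_1 \circ \bar{C}$ and $\psi_2 = \phi_1 \circ C$ the inverses of $\rho_1$ and $\rho_2$ respectively.

The maximally singular points of the Hadamard inverse are the points $P_i$ with $x_i$ the only non vanishing component. They are the blow down by $\phi_1$ of the planes $\Pi_i : \{x_i = 0\}$ for $i = 0, \dots, (N - 1)$. They are singular points of $\rho_1$ and $\rho_2$. Denote by $Q_i$, $i = 0, \dots, (N - 1)$, the points $Q_i = (1, \omega^i, \omega^{2i}, \dots, \omega^{(N-1)i})$. The $Q_i$'s are singular points of $\psi_1$ and $\psi_2$. We have the following straightforward relations:
\[ C P_i = Q_i, \qquad \bar{C} P_i = Q_{-i}, \qquad \bar{C} Q_i = P_i = C Q_{-i}, \tag{28} \]
\[ \phi_1(\Pi_i) = P_i. \tag{29} \]
The relevant singularity structure is entirely described by the two sequences:
\[ \Pi_i \overset{\phi_1}{\rightsquigarrow} P_i \overset{C}{\longmapsto} Q_i \overset{\phi_1}{\longmapsto} Q_{-i} \overset{\bar{C}}{\longmapsto} P_{-i} \overset{\phi_1}{\rightsquigarrow} \Pi_{-i}, \tag{30} \]
\[ \Pi_i \overset{\phi_1}{\rightsquigarrow} P_i \overset{\bar{C}}{\longmapsto} Q_{-i} \overset{\phi_1}{\longmapsto} Q_i \overset{C}{\longmapsto} P_{-i} \overset{\phi_1}{\rightsquigarrow} \Pi_{-i}. \tag{31} \]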
The first squiggly line indicates blow down from hyperplane to point and the last one indicates blow up from point to hyperplane. Consider now a sequence {Sk } of varieties of codimension one, constructed by the successive action of ρ1 , ρ2 , ρ1 , and so on. Suppose the ordering is such that ρ1 acts on
the $S$'s with even index and $\rho_2$ on the $S$'s with odd index. The successive images in the sequence are supposed to be regularized by continuity. We denote by $d_n = d(S_n)$ the degree of the equation of $S_n$. Denote by $\alpha_k(n)$ (resp. $\beta_k(n)$) the order of $P_k$ (resp. $Q_k$) on $S_n$. If $a$ is the running point of $P_{N-1}$, then we have the defining relations
\[ S_{2n}(\rho_2(a)) = S_{2n-1}(a) \cdot \prod_{u=0}^{N-1} x_{-u}^{\beta_u(2n)}(a), \tag{32} \]
\[ S_{2n-1}(\psi_2(a)) = S_{2n}(a) \cdot \prod_{v=0}^{N-1} x_v^{\alpha_v(2n-1)}(C a). \tag{33} \]
Using the fact that $\rho_i$ and $\psi_i$ are inverse of each other, and relations (32, 33), we get by evaluating $S_{2n}(\rho_2 \psi_2(a)) = K_{\psi_2}^{d_{2n}} \cdot S_{2n}(a)$, the following relation on the degrees:
\[ (N - 1)\, d_{2n} = \alpha_v(2n - 1) + \sum_{b \neq -v} \beta_b(2n). \tag{34} \]
Similarly, calculating $S_{2n}(\psi_1 \rho_1(a))$ produces
\[ (N - 1)\, d_{2n} = \beta_u(2n + 1) + \sum_{k \neq u} \alpha_k(2n). \tag{35} \]
Let $\Theta_\alpha(n) = \sum_k \alpha_k(n)$ and $\Theta_\beta(n) = \sum_k \beta_k(n)$. Relations (34, 35), summed over $v$ and $u$ respectively, yield
\[ N(N - 1)\, d_{2n} = \Theta_\alpha(2n - 1) + (N - 1)\, \Theta_\beta(2n), \qquad N(N - 1)\, d_{2n} = \Theta_\beta(2n + 1) + (N - 1)\, \Theta_\alpha(2n). \tag{36} \]
From the singularity pattern (30, 31), we see that $\alpha_i(2n) = \beta_{-i}(2n - 1)$ and $\alpha_i(2n + 1) = \beta_i(2n)$, so that $\Theta_\alpha(2n) = \Theta_\beta(2n - 1)$ and $\Theta_\alpha(2n + 1) = \Theta_\beta(2n)$. It follows that
\[ \Theta_\alpha(k) = \Theta_\beta(k - 1). \tag{37} \]
This combined with (36) yields
\[ d_{n+3} - (N - 1) d_{n+2} + (N - 1) d_{n+1} - d_n = 0, \tag{38} \]
which is the recurrence relation on the degrees of the iterates, with generating function
\[ f_q(z) = \frac{1 + z^2 (N - 1)}{(1 - z)(z^2 - z(N - 2) + 1)}. \tag{39} \]
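The three regimes announced in Sect. 4.3 (period 6 for $N = 3$, quadratic growth for $N = 4$, exponential growth for larger $N$) can be read off numerically from the generating function (39). The Python sketch below is only an illustration under stated assumptions: the helper names `series_coefficients`, `zn_degrees` and `differences` are hypothetical, and the degrees are those predicted by (39), not the result of an explicit composition of the maps.

```python
from math import sqrt


def series_coefficients(numerator, denominator, count):
    """First `count` Taylor coefficients of numerator(z)/denominator(z).

    Polynomials are integer lists in increasing powers of z, with
    denominator[0] == 1 so that the coefficients remain integers.
    """
    assert denominator[0] == 1
    coeffs = []
    for n in range(count):
        value = numerator[n] if n < len(numerator) else 0
        for k in range(1, min(n, len(denominator) - 1) + 1):
            value -= denominator[k] * coeffs[n - k]
        coeffs.append(value)
    return coeffs


def zn_degrees(N, count):
    """Degrees d_n predicted by the generating function (39)."""
    numerator = [1, 0, N - 1]
    # (1 - z)(1 - (N-2) z + z^2) = 1 - (N-1) z + (N-1) z^2 - z^3,
    # which encodes the recurrence (38).
    denominator = [1, -(N - 1), N - 1, -1]
    return series_coefficients(numerator, denominator, count)


def differences(seq):
    """Finite differences of a sequence."""
    return [b - a for a, b in zip(seq, seq[1:])]


d3 = zn_degrees(3, 24)
d4 = zn_degrees(4, 24)
d5 = zn_degrees(5, 40)

# N = 3: the degrees repeat with period 6.
is_periodic_6 = all(d3[n] == d3[n + 6] for n in range(len(d3) - 6))

# N = 4: a polynomial of degree 2 in n has vanishing third differences.
third_differences = differences(differences(differences(d4)))

# N = 5: growth like beta^n, beta the larger root of x^2 - (N-2) x + 1.
beta = (3 + sqrt(5)) / 2
growth = d5[-1] / d5[-2]

print(d3[:8], is_periodic_6)
print(d4[:6], set(third_differences))
print(growth, beta)
```

The denominator of (39) is the characteristic polynomial of (38) written in the variable $z = 1/x$, which is why a single series expansion covers all three cases.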
4.5. Discrete Painlevé I. The discrete Painlevé I system is given by the following transformations:
\[ x \to \frac{c_n}{x} + b - x - y, \qquad y \to x, \tag{40} \]
where $c_n$ depends on three parameters and is given by $c_n = c + an + d(-1)^n$. The transformation is just an involution of the form (8) followed by the exchange of $x$ and $y$. The homogeneous form is $\phi_n$ given by:
\[ (t, x, y) \to (xt,\; c_n t^2 + bxt - x^2 - yx,\; x^2). \tag{41} \]
It is easy to obtain that $K_\phi$ is simply $x^3$, so that $x$ is the only factor which can appear in $m(\psi, \phi)$. The line $x = 0$ is sent to the point of coordinates $(0, 1, 0)$, but this is not sufficient to characterize a possible blowing up. In fact, at leading order in $x$, the image of points approaching this line satisfies the equation $xy = c_n t^2$. We therefore have to follow the image up to second order. $x$ remains a factor in the successive transformations of $t$. In $(\phi^3)^* x$, $x^2$ appears as a factor. This gives a factor of $x^6$ in the transformation of $K_\phi$ and is the signal, according to Sect. 4.2, of the factor $x^3$ appearing in $m(\phi, \phi^3)$. The $x^2$ factor is however not sufficient to guarantee the factorization of $x^3$ in the next composition. The factorization of $x^3$ depends on the relation $c_{n+3} - c_{n+2} - c_{n+1} + c_n = 0$ which characterizes the form of the $c_n$ given in [11].

We may now establish the recurrence relation obeyed by the degrees. We introduce the polynomials $x^{[n]}$ of degree $g_n$ such that the $x$ component of $\phi^{[n]}$ is $x^{[n]} (x^{[n-3]})^2$. As in the preceding case, the factor which will replace $x$ in the successive factorization is $x^{[n]}$. The factors $x^{[n-3]}$ have disappeared in $\phi^{[n+1]}$ and the images of $x^{[n-3]} = 0$ are not singular. Since $d_n = g_n + 2 g_{n-3}$, we have:
\[ d_{n+1} = 2 d_n - 3 g_{n-3} = 2 g_n + g_{n-3}, \tag{42} \]
whose solution is:
\[ g_n = (1 + \tfrac{1}{2} n)^2 - \tfrac{1}{8} (1 - (-1)^n), \tag{43} \]
\[ d_n = \tfrac{3}{4} n^2 + \tfrac{9}{8} - \tfrac{1}{8} (-1)^n. \tag{44} \]
These results agree with the explicit calculations, producing the sequence of degrees:
\[ 1, 2, 4, 8, 13, 20, 28, 38, 49, 62, 76, \dots \tag{45} \]
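The consistency of the closed forms (43) and (44) with relation (42) and with the list (45) is a short exact computation. The Python sketch below is an illustration only; the function names `g` and `d` simply mirror the symbols $g_n$ and $d_n$ of the text, and exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction


def g(n):
    """Closed form (43) for the degree g_n of the polynomial x^[n]."""
    return Fraction(n + 2, 2) ** 2 - Fraction(1 - (-1) ** n, 8)


def d(n):
    """Closed form (44) for the degree d_n of the n-th iterate."""
    return Fraction(3 * n * n, 4) + Fraction(9, 8) - Fraction((-1) ** n, 8)


# The first degrees, to be compared with the list (45).
degrees = [int(d(n)) for n in range(11)]
print(degrees)

# d_n = g_n + 2 g_{n-3}, checked for n >= 3.
relation_holds = all(d(n) == g(n) + 2 * g(n - 3) for n in range(3, 50))

# Relation (42): d_{n+1} = 2 g_n + g_{n-3}.
recurrence_holds = all(d(n + 1) == 2 * g(n) + g(n - 3) for n in range(3, 50))

print(relation_holds, recurrence_holds)
```

Since $d_n$ grows like $\tfrac{3}{4} n^2$, the ratio $\log(d_n)/n$ tends to zero: the algebraic entropy of this map vanishes, and the relevant invariant is the degree 2 of the polynomial growth.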
We can also consider a slight generalization introduced in [6]. The pole part of the transformation of $x$ is replaced by a double pole but we do not use variable coefficients,
\[ x \to \frac{c}{x^2} + b - x - y, \qquad y \to x. \tag{46} \]
This is now a degree 3 birational map, with $K_\phi = x^8$ (a polynomial of degree $d^2 - 1 = 8$, as required by Sect. 3.3). It was shown that we still have the same pattern, but with higher powers of $x$ appearing. In $\phi^{[3]}$, the $x$ component gets an $x^3$ factor and we can factorize $x^8$ from $\phi^4$. Defining similarly $x^{[n]}$ such that the $x$ component of $\phi^{[n]}$ is $x^{[n]} (x^{[n-3]})^3$, we get the following recurrence relation for its degree $g_n$:
\[ g_{n+1} - 3 g_n + 3 g_{n-2} - g_{n-3} = 0. \tag{47} \]
The solution of this equation allows one to recover the results of [6]. The algebraic entropy is given by the logarithm of the largest solution of $x^4 - 3x^3 + 3x - 1 = 0$, which is $\tfrac{1}{2}(3 + \sqrt{5})$, the square of the golden ratio.
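The value of the entropy can be checked numerically from the characteristic polynomial of (47). The Python sketch below is illustrative only: the helper `bisect_root` and the bracket $[2, 3]$ are assumptions of this example. The bracket is justified by the remark that for $x \geq 3$ both $x^4 - 3x^3$ and $3x - 1$ are non-negative and the latter is positive, so no root lies beyond 3.

```python
from math import log, sqrt


def characteristic(x):
    """Characteristic polynomial x^4 - 3 x^3 + 3 x - 1 of recurrence (47)."""
    return x**4 - 3 * x**3 + 3 * x - 1


def bisect_root(f, lo, hi, steps=200):
    """Locate a root of f in [lo, hi] by bisection, assuming f(lo) < 0 < f(hi)."""
    assert f(lo) < 0 < f(hi)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# The polynomial is negative at 2 and positive at 3, and has no root above 3.
largest_root = bisect_root(characteristic, 2.0, 3.0)
golden_ratio = (1 + sqrt(5)) / 2
entropy = log(largest_root)

print(largest_root, golden_ratio**2)
print(entropy, 2 * log(golden_ratio))
```

The agreement reflects the factorization $x^4 - 3x^3 + 3x - 1 = (x^2 - 1)(x^2 - 3x + 1)$, whose largest root is $\tfrac{1}{2}(3 + \sqrt{5})$.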
5. Conclusion and Perspectives

We have not produced the general proof of the existence of a finite recurrence on the degrees. We have however shown that its origin lies in the singularity structure of the evolution and the possible recovery of reversibility. In numerous examples which we will not enumerate, we have been able either to establish recurrence equations or to infer from the first degrees a generating function which successfully predicts the following ones. This supports the following conjecture.

Conjecture. The generating function of the sequence of the degrees is always a rational function with integer coefficients.

This may even be the case for rational transformations which are not birational [16]. The algebraic entropy is in this case the logarithm of an algebraic number and, in the case of vanishing entropy, the sequence of the degrees is of polynomial growth.

There is a keyword which we have not used yet: integrability. Proving integrability in our setting amounts to showing that the motion is a translation on a torus. From the numerous examples we have examined, we believe the algebraic entropy measures a deviation from this type of integrability. We can actually propose the following:

Conjecture 2. If the birational transformation $\phi$ is equivalent to a bijection defined on an algebraic variety $M$ deduced from $P_n$ by a finite sequence of blow-ups, then the sequence of degrees of $\phi^{[n]}$ has at most a polynomial growth.

There is also the question of the relation of the algebraic entropy to other dynamical entropies [3]. The fact that the algebraic maps we study do not necessarily admit an ergodic measure precludes the definition of the Kolmogorov-Sinai entropy in many cases. The most natural correspondence would be with the topological entropy, but this requires more work. We must also stress that in any case, the algebraic entropy is a property of the map in the complex domain.
The special properties of rational maps allow one to characterize the complexity of the dynamics from the study of a single number, the degree, and to control it through the study of the behaviour of the map at a small number of singular points.

References
1. Liouville, J.: Sur l’intégration des équations différentielles de la Dynamique. J. de Mathématiques Pures et Appliquées XX, 137–138 (1855)
2. Poincaré, H.: Les méthodes nouvelles de la mécanique céleste. Paris: Gauthier–Villars, 1892
3. Eckmann, J.-P. and Ruelle, D.: Ergodic theory of chaos. Rev. Mod. Phys. 57(3), 617–656 (1985)
4. Falqui, G. and Viallet, C.-M.: Singularity, complexity, and quasi-integrability of rational mappings. Commun. Math. Phys. 154, 111–125 (1993)
5. Boukraa, S., Maillard, J-M. and Rollet, G.: Integrable mappings and polynomial growth. Physica A 209, 162–222 (1994)
6. Hietarinta, J. and Viallet, C.-M.: Singularity confinement and chaos in discrete systems. Phys. Rev. Lett. 81, 325–328 (1998); solv-int/9711014
7. Veselov, A.P.: Growth and Integrability in the Dynamics of Mappings. Commun. Math. Phys. 145, 181–193 (1992)
8. Arnold, V.I.: Dynamics of complexity of intersections. Bol. Soc. Bras. Mat. 21, 1–10 (1990)
9. Grammaticos, B., Ramani, A. and Papageorgiou, V.: Do integrable mappings have the Painlevé property? Phys. Rev. Lett. 67, 1825–1827 (1991)
10. Ramani, A., Grammaticos, B., and Hietarinta, J.: Discrete versions of the Painlevé equations. Phys. Rev. Lett. 67, 1829–1832 (1991)
11. Grammaticos, B., Nijhoff, F.W. and Ramani, A.: Discrete Painlevé equations. In: R. Conte, ed., The Painlevé property, one century later, 1996 Cargèse Proceedings, to appear in CRM Proceedings and lecture notes
12. Hietarinta, J. and Viallet, C.-M.: Discrete Painlevé and singularity confinement in projective space. Proceeding Bruxelles July 1997 meeting on “Integrability and chaos in discrete systems”, 1997
13. Mumford, D.: Algebraic geometry, Vol. 1: Complex projective varieties. Berlin–Heidelberg–New York: Springer-Verlag, 1995
14. Hénon, M.: A two-dimensional mapping with a strange attractor. Commun. Math. Phys. 50, 69–77 (1976)
15. Bellon, M.P., Maillard, J.-M. and Viallet, C.-M.: Infinite Discrete Symmetry Group for the Yang-Baxter Equations: Spin models. Phys. Lett. A 157, 343–353 (1991)
16. Boukraa S., Abenkova, N., Anglès d’Auriac, J.-C. and Maillard, J.-M.: In preparation (1998)

Communicated by Ya. G. Sinai
Commun. Math. Phys. 204, 439 – 473 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Chiral de Rham Complex
Fyodor Malikov1,2 , Vadim Schechtman3 , Arkady Vaintrob4
1 Max-Planck-Institut für Mathematik, Gottfried-Claren-Straße 26, 53225 Bonn, Germany.
E-mail: [email protected]
2 Department of Mathematics, University of Southern California, Los Angeles, CA 90089, USA.
E-mail: [email protected]
3 Department of Mathematics, University of Glasgow, 15 University Gardens, Glasgow G12 8QW, UK.
E-mail: [email protected]
4 Department of Mathematics, New Mexico State University, Las Cruces, NM 88003-8001, USA.
E-mail: [email protected]
Received: 29 April 1998 / Accepted: 7 February 1999
Abstract: We define natural sheaves of vertex algebras over smooth manifolds which may be regarded as semi-infinite de Rham complexes of certain D-modules over the loop spaces. For Calabi–Yau manifolds they admit N = 2 supersymmetry. Connection with Wakimoto modules is discussed.

Introduction

0.1. The aim of this note is to define certain sheaves of vertex algebras on smooth manifolds. In this note, “vertex algebra” will have the same meaning as in Kac’s book [K]. Recall that these algebras are by definition $Z/(2)$-graded. “Smooth manifold” will mean a smooth scheme of finite type over $C$.

For each smooth manifold $X$, we construct a sheaf $\Omega^{ch}_X$, called the chiral de Rham complex of $X$. It is a sheaf of vertex algebras in the Zariski topology, i.e. for each open $U \subset X$, $\Gamma(U; \Omega^{ch}_X)$ is a vertex algebra, and the restriction maps are morphisms of vertex algebras. It comes equipped with a $Z$-grading by fermionic charge, and the chiral de Rham differential $d^{ch}_{DR}$, which is an endomorphism of degree 1 such that $(d^{ch}_{DR})^2 = 0$. One has a canonical embedding of the usual de Rham complex
\[ (\Omega_X, d_{DR}) \hookrightarrow (\Omega^{ch}_X, d^{ch}_{DR}). \tag{0.1} \]
The sheaf $\Omega^{ch}_X$ has also another $Z_{\geq 0}$-grading, by conformal weight, compatible with the fermionic charge one. The differential $d^{ch}_{DR}$ respects conformal weight, and the subcomplex $\Omega_X$ coincides with the conformal weight zero component of $\Omega^{ch}_X$. The wedge multiplication on $\Omega_X$ may be restored from the operator product in $\Omega^{ch}_X$, see 1.3. The map (0.1) is a quasi-isomorphism, cf. Theorem 4.4. Each component of $\Omega^{ch}_X$ of fixed conformal weight admits a canonical finite filtration whose graded factors are symmetric and exterior powers of the tangent bundle $T_X$ and of the bundle of 1-forms $\Omega^1_X$.
440
F. Malikov, V. Schechtman, A. Vaintrob
Similar sheaves exist in complex-analytic and $C^\infty$ settings. If $X$ is Calabi–Yau, then the sheaf $\Omega^{ch}_X$ has a structure of a topological vertex algebra (i.e. it admits N = 2 supersymmetry, cf. 2.1), cf. 4.5. (For an arbitrary $X$, the obstruction to the existence of this structure is expressible in terms of the first Chern class of $T_X$, cf. Theorem 4.2.) One may hope that the vertex algebra $R\Gamma(X; \Omega^{ch}_X)$ defines the conformal field theory which is Witten’s “A-model” associated with $X$.

The intuitive geometric picture behind our construction is as follows. Let $LX$ be the space of “formal loops” on $X$, i.e. of the maps of the punctured formal disk to $X$. Let $L^+X \subset LX$ be the subspace of loops regular at 0. Note that we have a natural projection $L^+X \to X$ (value at 0). We have a functor $p$ : (Sheaves on $LX$) $\to$ (Sheaves on $X$), namely, if $F$ is a sheaf on $LX$, then $\Gamma(U; p(F)) = \Gamma(LU; F)$ for an open $U \subset X$. Now, the sheaf $\Omega^{ch}_X$ is the image under $p$ of the semi-infinite de Rham complex of the D-module of δ-functions along $L^+X$.

This sheaf is a particular case of a more general construction which associates with every D-module $\mathcal{M}$ over $X$ its “chiral de Rham complex” $\Omega^{ch}_X(\mathcal{M})$, which is a sheaf of vertex modules over the vertex algebra $\Omega^{ch}_X$. Its construction is sketched in §6, cf. 6.11.

0.2. One can also try to define a purely even sheaf $O^{ch}_X$ of vertex algebras, which could be called chiral structure sheaf. Here the situation is more subtle than in the case of $\Omega^{ch}_X$, where “fermions cancel the anomaly”. One can define this sheaf for curves, cf. 5.5. If $\dim(X) > 1$, then there exists a non-trivial obstruction of cohomological nature to the construction of $O^{ch}_X$. This obstruction can be expressed in terms of a certain homotopy Lie algebra, cf. §5 A.

However, one can define $O^{ch}_X$ for the flag manifolds $X = G/B$ ($G$ being a simple algebraic group and $B$ a Borel subgroup), cf. Sect. 5 B, C. This sheaf admits a structure of a $\hat{g}$-module at the critical level (here $g = \mathrm{Lie}(G)$ and $\hat{g}$ is the corresponding affine Lie algebra). The space of global sections $\Gamma(X; O^{ch}_X)$ is the irreducible vacuum $\hat{g}$-module for $g = sl(2)$; conjecturally, it is also true for any $g$, cf. 5.13. The sheaf $O^{ch}_X$ may be regarded as localization of Feigin-Frenkel bosonization.

More generally, if we start from an arbitrary D-module $\mathcal{M}$ on $X = G/B$ corresponding to some $g$-module $M$, then we can define its “chiralization” $\mathcal{M}^{ch}$, which is an $O^{ch}_X$-module. It seems plausible that the space of global sections $\Gamma(X; \mathcal{M}^{ch})$ coincides with the Weyl module over $\hat{g}$ corresponding to $M$ (on the critical level).

0.3. A few words about the plan of the note. In Sect. 1 we recall the basic definitions, to fix terminology and notations. In Sect. 2, some “free field representation” results are described. No doubt, they are all well known (although for some of them we do not know the precise reference). They are a particular case of the construction of $\Omega^{ch}_X$, given in Sect. 3. In Sect. 4, we discuss the topological structure, and in Sect. 5 the chiral structure sheaf. In Sect. 6 we outline another construction of our vertex algebras, and some generalizations.
Chiral de Rham Complex
441
1. Recollections on Vertex Algebras

For more details on what follows, see [K]. For a coordinate-free exposition, see [BD1].

1.1. For a vector space $V$, $V[[z, z^{-1}]]$ will denote the space of all formal sums $\sum_{i \in Z} a_i z^i$, $a_i \in V$. $V((z))$ will denote the subspace of Laurent power series, i.e. the sums as above, with $a_i = 0$ for $i \ll 0$. We denote by $\partial_z$ the operator of differentiation by $z$ acting on $V[[z, z^{-1}]]$. We set $\partial_z^{(k)} := (\partial_z)^k / k!$, for an integer $k \geq 0$. For $a(z) \in V[[z, z^{-1}]]$, we will also use the notation $a(z)'$ for $\partial_z a(z)$, and $\int a(z)$ for the coefficient at $z^{-1}$ (i.e. the residue). $V^*$ will denote the dual space.

In the sequel, we will omit the prefix “super” in the words like “superalgebra”, etc. For example, “a Lie algebra” will mean “a Lie superalgebra”.

1.2. Let us define a vertex algebra following [K], 4.1. Thus, a vertex algebra is the data $(V, Y, L_{-1}, \mathbf{1})$. Here $V$ is a $Z/(2)$-graded vector space $V = V^{ev} \oplus V^{odd}$. The $Z/(2)$-grading is called parity; for an element $a \in V$, its parity will be denoted by $\tilde{a}$. The space of endomorphisms $\mathrm{End}(V)$ inherits the $Z/(2)$-grading from $V$. $\mathbf{1}$ is an even element of $V$, called vacuum. $Y$ is an even linear mapping
\[ Y : V \to \mathrm{End}(V)[[z, z^{-1}]]. \tag{1.1} \]
For $a \in V$, the power series $Y(a)$ will be denoted $a(z)$, and called the field corresponding to $a$. The coefficients of the series $a(z)$ are called Fourier modes. $L_{-1}$ is an even endomorphism of $V$. The following axioms must be satisfied.

Vacuum axiom. $L_{-1}(\mathbf{1}) = 0$; $\mathbf{1}(z) = \mathrm{id}$ (the constant power series). For all $a \in V$, $a(z)(\mathbf{1})|_{z=0} = a$ (in particular, the components of $a(z)$ at the negative powers of $z$ act as zero on the vacuum).

Translation invariance axiom. For all $a \in V$, $[L_{-1}, a(z)] = \partial_z a(z)$.

Locality axiom. For all $a, b \in V$, $(z - w)^N [a(z), b(w)] = 0$ for $N \gg 0$.

The meaning of this equality is explained in [K]. Given two fields $a(z)$, $b(z)$, their operator product expansion is defined as in [K], (2.3.7a)
\[ a(z) b(w) = \sum_{j=1}^{N} \frac{c_j(w)}{(z - w)^j} + {:}a(z) b(w){:}, \]
where
\[ {:}a(z) b(w){:} \; = a(z)_+ b(w) + (-1)^{\tilde{a}\tilde{b}} b(w) a(z)_-. \]
Here
\[ a(z)_- = \sum_{n \geq 0} a_{(n)} z^{-n-1}; \qquad a(z)_+ = \sum_{n < 0} a_{(n)} z^{-n-1} \tag{1.2} \]
for $a(z) = \sum_n a_{(n)} z^{-n-1}$, cf. [K], (2.3.5), (2.3.3).

A morphism of vertex algebras $(V, \mathbf{1}, L_{-1}, Y) \to (V', \mathbf{1}', L'_{-1}, Y')$ is an even linear map $f : V \to V'$ taking $\mathbf{1}$ to $\mathbf{1}'$, such that $f \circ L_{-1} = L'_{-1} \circ f$, and for each $a \in V$, if $a_n$ is a Fourier mode of the field $a(z)$, we have $f(a)_n \circ f = f \circ a_n$.
1.3. A conformal vertex algebra (cf. [K], 4.10) is a vertex algebra $(V, Y, L_{-1}, \mathbf{1})$, together with an even element $L \in V$ such that if we write the field $L(z)$ as
\[ L(z) = \sum_n L_n z^{-n-2}, \]
the endomorphisms $L_n$ satisfy the Virasoro commutation relations
\[ [L_n, L_m] = (n - m) L_{n+m} + \frac{n^3 - n}{12} \cdot c \cdot \delta_{n,-m}. \tag{1.3} \]
Here $c$ is a complex number, to be called the Virasoro central charge of $V$ (in [K] it is called the rank). The component $L_{-1}$ of the series $L(z)$ above must coincide with the endomorphism $L_{-1}$ from the definition of a vertex algebra. The endomorphism $L_0$ must be diagonalizable, with integer eigenvalues. Set $V^{(n)} = \{a \in V \mid L_0(a) = na\}$. We must have $V = \oplus_{n \in Z} V^{(n)}$. For $a \in V^{(n)}$, the number $n$ is called the conformal weight of $a$ and denoted $|a|$. This $Z$-grading induces a $Z$-grading on the space $\mathrm{End}(V)$, also to be called the conformal weight. By definition, an endomorphism has conformal weight $n$ if it maps $V^{(i)}$ to $V^{(i+n)}$ for all $i$.

The conformal weight grading on $V$ should be compatible with the parity grading, i.e. both gradings should come from a $Z \times Z/(2)$-bigrading. We require that if $a$ has conformal weight $n$, then the field $a(z)$ has the form
\[ a(z) = \sum_i a_i z^{-i-n}, \tag{1.4} \]
with $a_i$ having the conformal weight $-i$.

We have the important Borcherds formula. If $a, b$ are elements in a conformal vertex algebra, with $|a| = n$, then
\[ a(z) b(w) = \sum_i \frac{a_i(b)(w)}{(z - w)^{i+n}} + {:}a(z) b(w){:}, \tag{1.5} \]
cf. [K], Theorem 4.6. The Virasoro commutation relations (1.3) are equivalent to the operator product
\[ L(z) L(w) = \frac{c}{2 (z - w)^4} + \frac{2 L(w)}{(z - w)^2} + \frac{L(w)'}{z - w} \tag{1.6} \]
(we omit from now on the regular part).

1.4. Let $V$ be a conformal vertex algebra, such that $V^{(n)} = 0$ for $n < 0$. For $a \in V^{(n)}$, $b \in V^{(m)}$, consider the operator product
\[ a(z) b(w) = \sum_i \frac{a_i(b)(w)}{(z - w)^{i+n}}. \tag{1.7} \]
We have $|a_i(b)| = m - i$. Therefore, $a_m(b)(w) (z - w)^{-n-m}$ is the most singular term in (1.5).
Chiral de Rham Complex
443
If $n = m = 0$ then $a(z)b(w)$ is non-singular at $z = w$. Define the multiplication on the space $V^{(0)}$ by the rule
$$a \cdot b = a(z)b(w)\big|_{z=w}\,\mathbf{1}\,\big|_{z=0}. \tag{1.8}$$
This endows $V^{(0)}$ with a structure of an associative and commutative algebra with unity $\mathbf{1}$.

1.5. Heisenberg algebra. In this subsection, we will define a conformal vertex algebra, to be called Heisenberg vertex algebra. Fix an integer $N > 0$. Let $H_N$ be the Lie algebra which as a vector space has the base $a^i_n$, $b^i_n$, $i = 1, \dots, N$; $n \in \mathbb{Z}$, and $C$, all these elements being even, with the brackets
$$[a^j_m, b^i_n] = \delta_{ij}\delta_{m,-n} C, \tag{1.9}$$
all other brackets being zero. Our vertex algebra, to be denoted $V_N$, as a vector space is the vacuum representation of $H_N$. As an $H_N$-module, it is generated by one vector $\mathbf{1}$, subject to the relations
$$b^i_n \mathbf{1} = 0 \ \text{if } n > 0; \qquad a^i_m \mathbf{1} = 0 \ \text{if } m \geq 0; \qquad C\mathbf{1} = \mathbf{1}.$$
The mapping
$$P(b^i_n, a^j_m) \mapsto P(b^i_n, a^j_m) \cdot \mathbf{1} \tag{1.10}$$
identifies $V_N$ with the ring of commuting polynomials on the variables $b^i_n$, $a^j_m$, $n \leq 0$, $m < 0$, $i, j = 1, \dots, N$. We will use this identification below.

Let us define the structure of a vertex algebra on $V_N$. The $\mathbb{Z}/(2)$-grading is trivial: everything is even. The vacuum vector is $\mathbf{1}$. The fields corresponding to the elements of $V_N$ are defined by induction on the degree of a monomial. First we define the fields for the degree one monomials by setting
$$b^i_0(z) = \sum_{n\in\mathbb{Z}} b^i_n z^{-n}; \qquad a^j_{-1}(z) = \sum_{n\in\mathbb{Z}} a^j_n z^{-n-1}. \tag{1.11}$$
Here $b^i_n$, $a^j_n$ in the right hand side are regarded as operating on $V_N$ by multiplication. We set
$$b^i_{-n}(z) = \partial_z^{(n)} b^i_0(z), \qquad a^j_{-n-1}(z) = \partial_z^{(n)} a^j_{-1}(z) \tag{1.12}$$
for $n > 0$.

The fields corresponding to the monomials of degree $> 1$ are defined by induction, using the normal ordering. Let us call the operators $b^i_n$, $n > 0$, and $a^j_n$, $n \geq 0$, acting on $V_N$, annihilation operators. For $x = b^i_n$ or $a^j_n$, $n \in \mathbb{Z}$, and $b \in \mathrm{End}(V_N)$, the normal ordered product ${:}xb{:}$ is given by
$${:}xb{:} = \begin{cases} bx & \text{if } x \text{ is an annihilation operator,} \\ xb & \text{otherwise.} \end{cases} \tag{1.13}$$
Define by induction
$${:}x_1 \cdot \ldots \cdot x_k{:} = {:}x_1 \cdot ({:}x_2 \cdot \ldots \cdot x_k{:}){:}, \tag{1.14}$$
for $x_p = b^i_n$ or $a^j_n$, $p = 1, \dots, k$.

Given two series $x(z) = \sum_{n\in\mathbb{Z}} x_n z^{-n+p}$ and $y(z) = \sum_{n\in\mathbb{Z}} y_n z^{-n+q}$, with $x_n$ as above, we set
$${:}x(z)y(w){:} = \sum_{n,m\in\mathbb{Z}} {:}x_n y_m{:}\, z^{-n+p} w^{-m+q}. \tag{1.15}$$
For any finite sequence $x_1, \dots, x_p$, where each $x_j$ is equal to one of $a^i_n$ or $b^i_n$, we define the series ${:}x_1(z)x_2(z) \cdot \ldots \cdot x_p(z){:} \in \mathrm{End}(V_N)[[z, z^{-1}]]$ by induction, as in (1.14). This expression does not depend on the order of the $x_i$'s. Given a monomial $x_1 \cdot \ldots \cdot x_p \mathbf{1} \in V_N$, with $x_i$ as above, we define the corresponding field by
$$x_1 \cdot \ldots \cdot x_p(z) = {:}x_1(z) \cdot \ldots \cdot x_p(z){:}. \tag{1.16}$$
Since every element of $V_N$ is a finite linear combination of monomials as above, this completes the definition of the mapping (1.1).

We will use the shorthand notations
$$b^i(z) = b^i_0(z); \qquad a^j(z) = a^j_{-1}(z). \tag{1.17}$$
The operator products of these basic fields are
$$a^j(z)b^i(w) = \frac{\delta_{ij}}{z-w} + (\text{regular}), \tag{1.18a}$$
$$b^i(z)b^j(w) = (\text{regular}); \qquad a^i(z)a^j(w) = (\text{regular}), \tag{1.18b}$$
where "(regular)" means the part regular at $z = w$. These operator products are equivalent to the commutation relations (1.9). Other operator products are computed by differentiation of (1.18), and using the Wick theorem, cf. [K], 3.3. One can say that the vertex algebra $V_N$ is generated by the even fields $b^i(z)$, $a^j(z)$, of conformal weights 0 and 1 respectively, subject to the relations (1.18).

The Virasoro field is given by
$$L(z) = \sum_{i=1}^{N} {:}b^i(z)' \cdot a^i(z){:}. \tag{1.19}$$
The central charge is equal to $2N$. Let us check this. Assume for simplicity that $N = 1$, and let us omit the index 1 at the fields $a$, $b$. Thus, we have $L(z) = {:}b(z)'a(z){:}$. Let us compute the operator product $L(z)L(w)$ using the Wick theorem. We have
$${:}b(z)'a(z){:}\,{:}b(w)'a(w){:} = [b(z)'a(w)][a(z)b(w)'] + [b(z)'a(w)]\,{:}a(z)b(w)'{:} + [a(z)b(w)']\,{:}b(z)'a(w){:}$$
(we have $b(z)'a(w) = a(z)b(w)' = 1/(z-w)^2$)
$$= \frac{1}{(z-w)^4} + \frac{2\,{:}b(w)'a(w){:}}{(z-w)^2} + \frac{{:}b(w)''a(w){:} + {:}b(w)'a(w)'{:}}{z-w}.$$
Hence,
$$L(z)L(w) = \frac{1}{(z-w)^4} + \frac{2L(w)}{(z-w)^2} + \frac{L(w)'}{z-w},$$
which is (1.6) with $c = 2$.
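The passage from the Wick pairings to the final expansion uses a Taylor expansion of the normally ordered products at $z = w$. The following sketch spells this step out for $N = 1$; it relies only on the pairings $b(z)'a(w) = a(z)b(w)' = 1/(z-w)^2$ stated above, and the grouping of terms is ours.

```latex
% Single pairings, expanded at z = w up to terms regular at z = w:
\begin{align*}
\frac{{:}a(z)b(w)'{:}}{(z-w)^2}
  &= \frac{{:}b(w)'a(w){:}}{(z-w)^2} + \frac{{:}b(w)'a(w)'{:}}{z-w} + (\text{regular}),\\
\frac{{:}b(z)'a(w){:}}{(z-w)^2}
  &= \frac{{:}b(w)'a(w){:}}{(z-w)^2} + \frac{{:}b(w)''a(w){:}}{z-w} + (\text{regular}).
\end{align*}
% Add the double pairing 1/(z-w)^4 and use
% L(w)' = :b(w)''a(w): + :b(w)'a(w)':  to obtain
\[
L(z)L(w) = \frac{1}{(z-w)^4} + \frac{2L(w)}{(z-w)^2} + \frac{L(w)'}{z-w} + (\text{regular}),
\]
% so c/2 = 1. For general N the summands of (1.19) with different i
% have regular operator products with each other, hence c = 2N.
```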
1.6. Clifford algebra. Let $Cl_N$ be the Lie algebra which as a vector space has the base $\phi^i_n$, $\psi^i_n$, $i = 1, \dots, N$, $n \in \mathbb{Z}$, and $C$, all these elements being odd, with the brackets
$$[\phi^j_m, \psi^i_n] = \delta_{ij}\delta_{m,-n} \cdot C. \tag{1.20}$$
The Clifford vertex algebra $\Lambda_N$ is defined as in the previous subsection, starting with the odd fields
$$\psi^i(z) = \sum_{n\in\mathbb{Z}} \psi^i_n z^{-n-1}; \qquad \phi^j(z) = \sum_{n\in\mathbb{Z}} \phi^j_n z^{-n} \tag{1.21}$$
and repeating the definitions of loc. cit., with $a$ (resp. $b$) replaced by $\psi$ (resp. $\phi$). One must put the obvious signs in the definition of the normal ordering. Thus, $\Lambda_N$ is generated by the odd fields $\phi^i(z)$, $\psi^i(z)$, subject to the relations
$$\phi^i(z)\psi^j(w) = \frac{\delta_{ij}}{z-w} + \text{regular}; \tag{1.22a}$$
$$\phi^i(z)\phi^j(w) = \text{regular}; \qquad \psi^i(z)\psi^j(w) = \text{regular}. \tag{1.22b}$$
The Virasoro field is given by
$$L(z) = \sum_{i=1}^{N} {:}\phi^i(z)' \cdot \psi^i(z){:}. \tag{1.23}$$
The central charge is equal to $-2N$.

1.7. If $A$ and $B$ are vertex algebras then their tensor product $A \otimes B$ admits a canonical structure of a vertex algebra, cf. [K], 4.3. The Virasoro element is given by
$$L_{A\otimes B} = L_A \otimes \mathbf{1} + \mathbf{1} \otimes L_B. \tag{1.24}$$
We will use in the sequel the tensor product of the Heisenberg and Clifford vertex algebras $\Omega_N = V_N \otimes \Lambda_N$. Its Virasoro central charge is equal to 0.

1.8. Let $A$ be a vertex algebra. A linear map $f : A \longrightarrow A$ is called a derivation of $A$ if for any $a \in A$,
$$f(a)(z) = [f, a(z)]. \tag{1.25}$$
Note that an invertible map $g : A \longrightarrow A$ is an automorphism of $A$ iff
$$g(a)(z) = g\,a(z)\,g^{-1} \tag{1.26}$$
and (1.25) is the infinitesimal version of (1.26).

It follows from the Borcherds formula that for every $a \in A$, the Fourier mode $\int a(z)$ is a derivation, cf. Lemma 1.3 from [LZ]. Consequently, if the endomorphism $\exp(\int a(z))$ is well defined, it is an automorphism of $A$.
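The value $c = -2N$ quoted in 1.6, and with it the vanishing of the central charge of $\Omega_N$ in 1.7, can be checked in the same way as in 1.5. Below is a sketch for $N = 1$, using only (1.22a) and the sign rule for permuting odd fields; the intermediate pairings are derived here and are not quoted from the text.

```latex
% From (1.22a): \phi(z)\psi(w) ~ 1/(z-w). For odd fields this also gives
% \psi(z)\phi(w) ~ 1/(z-w). Differentiating in z, resp. in w:
\[
[\phi(z)'\psi(w)] = -\frac{1}{(z-w)^2}, \qquad
[\psi(z)\phi(w)'] = \frac{1}{(z-w)^2}.
\]
% Double pairing in :\phi(z)'\psi(z): :\phi(w)'\psi(w):
% (the inner pair \psi(z)\phi(w)' is adjacent, so no extra sign appears):
\[
[\phi(z)'\psi(w)]\,[\psi(z)\phi(w)'] = -\frac{1}{(z-w)^4} = \frac{c/2}{(z-w)^4}
\quad\Longrightarrow\quad c = -2 .
\]
% With N copies c(\Lambda_N) = -2N, and by (1.24)
% c(\Omega_N) = c(V_N) + c(\Lambda_N) = 2N - 2N = 0.
```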
2. De Rham Chiral Algebra of an Affine Space

2.1. A topological vertex algebra of rank $d$ is a conformal vertex algebra $A$ of Virasoro central charge 0, equipped with an even element $J$ of conformal weight 1, and two odd elements, $Q$, of conformal weight 1, and $G$, of conformal weight 2. The following operator products must hold:
$$L(z)L(w) = \frac{2L(w)}{(z-w)^2} + \frac{L(w)'}{z-w}, \tag{2.1a}$$
$$J(z)J(w) = \frac{d}{(z-w)^2}; \qquad L(z)J(w) = -\frac{d}{(z-w)^3} + \frac{J(w)}{(z-w)^2} + \frac{J(w)'}{z-w}, \tag{2.1b}$$
$$G(z)G(w) = 0; \qquad L(z)G(w) = \frac{2G(w)}{(z-w)^2} + \frac{G(w)'}{z-w}; \qquad J(z)G(w) = -\frac{G(w)}{z-w}, \tag{2.1c}$$
$$Q(z)Q(w) = 0; \qquad L(z)Q(w) = \frac{Q(w)}{(z-w)^2} + \frac{Q(w)'}{z-w}; \qquad J(z)Q(w) = \frac{Q(w)}{z-w}, \tag{2.1d}$$
$$Q(z)G(w) = \frac{d}{(z-w)^3} + \frac{J(w)}{(z-w)^2} + \frac{L(w)}{z-w}. \tag{2.1e}$$
Note the following consequence of (2.1e),
$$[Q_0, G(z)] = L(z). \tag{2.2}$$
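For the reader's convenience, here is a sketch of how (2.2) follows from (2.1e). It uses the standard fact that the (super)commutator of the residue $\int a(z)$ with a field $b(w)$ equals the coefficient of $(z-w)^{-1}$ in the operator product $a(z)b(w)$.

```latex
% Q has conformal weight 1, so Q(z) = \sum_n Q_n z^{-n-1} and Q_0 = \int Q(z).
\[
[Q_0, G(w)] = \operatorname{Res}_{z=w} Q(z)G(w)
 = \operatorname{Res}_{z=w}\left(\frac{d}{(z-w)^3} + \frac{J(w)}{(z-w)^2}
   + \frac{L(w)}{z-w}\right) = L(w).
\]
```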
2.2. In this subsection, we will introduce a structure of a topological vertex algebra of rank $N$ on the vertex algebra $\Omega_N$ from 1.7. This topological vertex algebra will be called the de Rham chiral algebra of the affine space $\mathbb{A}^N$.

Recall that the Virasoro element is given by
$$L = \sum_{i=1}^{N} \left(b^i_{-1}a^i_{-1} + \phi^i_{-1}\psi^i_{-1}\right). \tag{2.3a}$$
Define the elements $J$, $Q$, $G$ by
$$J = \sum_{i=1}^{N} \phi^i_0\psi^i_{-1}; \qquad Q = \sum_{i=1}^{N} a^i_{-1}\phi^i_0; \qquad G = \sum_{i=1}^{N} \psi^i_{-1}b^i_{-1}. \tag{2.3b}$$
The corresponding fields are
$$L(z) = \sum \left({:}b^i(z)'a^i(z){:} + {:}\phi^i(z)'\psi^i(z){:}\right) \tag{2.4a}$$
and
$$J(z) = \sum {:}\phi^i(z)\psi^i(z){:}, \qquad Q(z) = \sum {:}a^i(z)\phi^i(z){:}, \qquad G(z) = \sum {:}\psi^i(z)b^i(z)'{:}. \tag{2.4b}$$
The relations (2.1) are readily checked using the Wick theorem.
2.3. Let us define the fermionic charge operator acting on $\Omega_N$, by
$$F := J_0 = \sum_i \sum_n {:}\phi^i_n\psi^i_{-n}{:}. \tag{2.5}$$
We have
$$F\mathbf{1} = 0 \tag{2.6}$$
and
$$[F, \phi^i_n] = \phi^i_n; \qquad [F, \psi^i_n] = -\psi^i_n; \qquad [F, a^i_n] = [F, b^i_n] = 0. \tag{2.7}$$
We set
$$\Omega^p_N = \{\omega \in \Omega_N \mid F\omega = p\omega\}. \tag{2.8}$$
Obviously,
$$\Omega_N = \oplus_{p\in\mathbb{Z}}\, \Omega^p_N. \tag{2.9}$$
We define an endomorphism $d$ of the space $\Omega_N$ by
$$d := -Q_0 = -\sum_{i,n} {:}a^i_n\phi^i_{-n}{:} \tag{2.10}$$
(we could omit the normal ordering in the last formula since the letters $a$ and $\phi$ commute anyway). We have $d^2 = 0$. Indeed, by the Wick theorem $Q(z)Q(w) = \text{regular}$, hence all Fourier modes of $Q(z)$ (anti)commute. The map $d$ is called a chiral de Rham differential.

The map $d$ increases the fermionic charge by 1, by (2.7). Thus, the space $\Omega_N$, equipped with the fermionic charge grading and the differential $d$, becomes a complex (infinite in both directions), called a chiral de Rham complex of $\mathbb{A}^N$.

Consider the usual algebraic de Rham complex $\Omega(\mathbb{A}^N) = \oplus_{p=0}^{N}\, \Omega^p(\mathbb{A}^N)$ of the affine space $\mathbb{A}^N$. We identify the coordinate functions with the letters $b^1_0, \dots, b^N_0$, and their differentials with the fermionic variables $\phi^1_0, \dots, \phi^N_0$. Thus, we identify the commutative dg algebras
$$\Omega(\mathbb{A}^N) = \mathbb{C}[b^1_0, \dots, b^N_0] \otimes \Lambda(\phi^1_0, \dots, \phi^N_0), \tag{2.12}$$
the second factor being the exterior algebra. The grading is defined by assigning to the letters $b^i_0$ (resp. $\phi^j_0$) the degree 0 (resp. 1). The usual de Rham differential is given by
$$d_{DR} = \sum_i a^i_0\phi^i_0, \tag{2.13}$$
as follows from the relations (1.9).

Theorem 2.4. The obvious embedding of complexes
$$i : (\Omega(\mathbb{A}^N), d_{DR}) \longrightarrow (\Omega_N, d) \tag{2.14}$$
is compatible with the differentials, and is a quasi-isomorphism.

We identify the space $\Omega_N$ with the space of polynomials in the letters $b^i_n$, $\phi^i_n$ ($n \leq 0$) and $a^i_n$, $\psi^i_n$ ($n < 0$). One sees that on the subspace $\mathbb{C}[b^i_0, \phi^i_0]$, all the summands $a^i_n\phi^i_{-n}$ with $n \neq 0$ act trivially. It follows that the map $i$ is compatible with the differentials.
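The commutation relations (2.7) for the fermionic charge, which are what make $d$ raise the charge by 1, can be recovered from the operator products (1.22). The following is a sketch for a single index $i$ (suppressed in the notation), with the signs coming from the rule for permuting odd fields.

```latex
% J(z) = :\phi(z)\psi(z):, with \phi(z)\psi(w) ~ 1/(z-w) and \psi(z)\phi(w) ~ 1/(z-w).
\begin{align*}
J(z)\phi(w) &= \phi(z)\,[\psi(z)\phi(w)] + (\text{regular})
             = \frac{\phi(w)}{z-w} + (\text{regular}),\\
J(z)\psi(w) &= -[\phi(z)\psi(w)]\,\psi(z) + (\text{regular})
             = -\frac{\psi(w)}{z-w} + (\text{regular}).
\end{align*}
% Taking the residue at z = w, with F = J_0 = \int J(z):
\[
[F,\phi(w)] = \phi(w), \qquad [F,\psi(w)] = -\psi(w),
\]
% i.e. [F,\phi_n] = \phi_n and [F,\psi_n] = -\psi_n for all n, as in (2.7).
% The even fields a(z), b(z) have regular operator products with J(z),
% so [F,a_n] = [F,b_n] = 0.
```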
Proof. To prove that it is a quasi-isomorphism, let us split $d$ in two commuting summands $d = d_+ + d_-$, where
$$d_+ = \sum_i \sum_{n\geq 0} a^i_n\phi^i_{-n}, \qquad d_- = \sum_i \sum_{n<0} a^i_n\phi^i_{-n}. \tag{2.15}$$
We think of the space $\Omega_N$ as of the tensor product $\mathbb{C}[a^i_n, \psi^i_n] \otimes \mathbb{C}[b^j_m, \phi^j_m]$. The differential $d_-$ acts trivially on the second factor, and on the first one it is the Koszul differential. So, the cohomology of $d_-$ is $\mathbb{C}[b^j_m, \phi^j_m]_{m\leq 0}$.

Now, we have to compute the cohomology of this space with respect to $d_+$. For this purpose, split $d_+$ once again as $d_+ = d_{DR} + d_+^{>0}$, and our space as $\mathbb{C}[b^j_0, \phi^j_0] \otimes \mathbb{C}[b^j_m, \phi^j_m]_{m<0}$. The differential $d_+^{>0}$ acts trivially on the first factor, and on the second one, it is the de Rham differential. Hence (by the Poincaré lemma), taking the cohomology of $d_+^{>0}$ kills all non-zero modes, and we are left precisely with the usual de Rham complex.

Alternatively, it follows from (2.2) that
$$[G_0, d] = L_0. \tag{2.16}$$
The operator $G_0$ commutes with $L_0$, and it follows from (2.16) that it gives a homotopy to 0 for the operator $d$ on all the subcomplexes of non-zero conformal weight. Therefore, all cohomology lives in the conformal weight zero subspace. $\square$

2.5. The vertex algebra $\Omega_N$ satisfies the assumptions of 1.4. The subspace $\Omega(\mathbb{A}^N)$ coincides with the conformal weight zero component of it. If we apply the definition of 1.4, we get the structure of a commutative algebra on $\Omega(\mathbb{A}^N)$ which is given by the usual wedge product of differential forms.

3. Localization

3.1. Consider the Heisenberg vertex algebra $V_N$ defined in 1.5. As in loc. cit., we will identify the space $V_N$ with the space of polynomials $\mathbb{C}[b^i_{-n}, a^j_{-m}]_{n\geq 0,\, m>0}$. To simplify the notations below, let us denote the zero mode variables $b^i_0$ by $b^i$.

Let $A_N$ denote the algebra of polynomials $\mathbb{C}[b^1, \dots, b^N]$. The space $A_N$ is identified with the subspace $V_N^{(0)} \subset V_N$ of conformal weight zero. The space $V_N$ has an obvious structure of an $A_N$-module. Let $\hat{A}_N$ denote the algebra of formal power series $\mathbb{C}[[b^1, \dots, b^N]]$. Set
$$\hat{V}_N = \hat{A}_N \otimes_{A_N} V_N. \tag{3.1}$$
We are going to introduce a structure of a conformal vertex algebra on the space $\hat{V}_N$. Let us define the map (1.1).

Let $f(b^1, \dots, b^N)$ be a power series from $\hat{A}_N$. We claim that the expression $f(b^1(z), \dots, b^N(z))$ makes sense as an element of $\mathrm{End}(\hat{V}_N)[[z, z^{-1}]]$. (We are grateful to Boris Feigin who has shown us a particular case of the following construction.) Let us express the power series $b^i(z)$ as
$$b^i(z) = b^i + \Delta b^i(z). \tag{3.2}$$
Thus,
$$\Delta b^i(z) = \sum_{n>0} \left(b^i_n z^{-n} + b^i_{-n} z^n\right). \tag{3.3}$$
Let us define $f(b^1(z), \dots, b^N(z))$ by the Taylor formula
$$f(b^1(z), \dots, b^N(z)) = \sum \Delta b^1(z)^{i_1} \cdot \ldots \cdot \Delta b^N(z)^{i_N}\, \partial^{(i_1,\dots,i_N)} f(b^1, \dots, b^N), \tag{3.4}$$
where
$$\partial^{(i_1,\dots,i_N)} = \frac{\partial_{b^1}^{i_1}}{i_1!} \cdot \ldots \cdot \frac{\partial_{b^N}^{i_N}}{i_N!}. \tag{3.5}$$
We will show that the series (3.4) gives a well-defined element of $\mathrm{End}(\hat{V}_N)[[z, z^{-1}]]$. Let us write
$$\Delta b^1(z)^{i_1} \cdot \ldots \cdot \Delta b^N(z)^{i_N} = \sum_k c_k^{i_1,\dots,i_N} z^{-k}.$$
The coefficient $c_k^{i_1,\dots,i_N}$ is an infinite sum of the monomials
$$b^{j_1}_{k_1} \cdot \ldots \cdot b^{j_I}_{k_I}, \tag{3.6}$$
with $I = i_1 + \ldots + i_N$, $k_1 + \ldots + k_I = k$. Pick an element $v \in \hat{V}_N$. There exists $M$ such that $b^{j_1}_{l_1} \cdot \ldots \cdot b^{j_N}_{l_N} v = 0$ if $\sum l_i > M$. We have
$$k = \sum k_i = \sum\nolimits^{+} |k_i| - \sum\nolimits^{-} |k_i|,$$
where $\sum^{+}$ (resp. $\sum^{-}$) denotes the sum of all positive (resp. negative) summands. If $b^{j_1}_{k_1} \cdot \ldots \cdot b^{j_I}_{k_I} v \neq 0$, then
$$\sum\nolimits^{+} |k_i| \leq M. \tag{3.7a}$$
On the other hand,
$$\sum\nolimits^{-} |k_i| = \sum\nolimits^{+} |k_i| - k \leq M - k. \tag{3.7b}$$
There exists only a finite number of tuples $(k_1, \dots, k_I)$ satisfying (3.7a) and (3.7b). Therefore, $c_k^{i_1,\dots,i_N}$ are well-defined endomorphisms of $\hat{V}_N$. We have
$$f(b^1(z), \dots, b^N(z)) = \sum_k \Big( \sum_{i_1,\dots,i_N} c_k^{i_1,\dots,i_N}\, \partial^{(i_1,\dots,i_N)} f(b^1, \dots, b^N) \Big) z^{-k}. \tag{3.8}$$
All numbers $k_i$ are non-zero. Let $I_+$ (resp. $I_-$) be the number of positive (resp. negative) $k_i$'s. We have $I_- \leq \sum^{-} |k_i| \leq M - k$, hence $M \geq \sum^{+} k_i \geq I_+ = I - I_- \geq I - M + k$. Therefore,
$$I \leq 2M - k. \tag{3.9}$$
Therefore, when we apply the series (3.8) to the element $v$, only a finite number of terms in the sum over $(i_1, \dots, i_N)$ survives. Therefore, the series (3.8) is a well-defined element of $\mathrm{End}(\hat{V}_N)[[z, z^{-1}]]$.

Every element of $\hat{V}_N$ is a finite sum of products $g(a)f(b)$, where $g(a)$ is a polynomial in the letters $a$ and $f(b)$ is a power series as above. We have already defined $f(b)(z)$. The definition of $g(a)(z)$ is the same as in the case of $V_N$. We define $(g(a)f(b))(z)$ by
$$(g(a)f(b))(z) = {:}g(a)(z)f(b)(z){:}, \tag{3.10}$$
where the normal ordering is defined in (1.2). This completes the definition of the mapping
$$Y : \hat{V}_N \longrightarrow \mathrm{End}(\hat{V}_N)[[z, z^{-1}]]. \tag{3.11}$$
The following version of the definition of the map (3.11) is helpful in practice. Every element $c \in \hat{V}_N$ is a limit of elements $c_i \in V_N$ (in the obvious topology). We can regard the fields $c_i(z)$ as elements of $\mathrm{End}(\hat{V}_N)[[z, z^{-1}]]$. The field $c(z)$ is the limit of the fields $c_i(z)$.

We define the vacuum and Virasoro element $\mathbf{1}, L \in \hat{V}_N$ as the images of the corresponding elements of $V_N$ under the natural map $V_N \longrightarrow \hat{V}_N$.

Theorem 3.2. The construction of 3.1 defines a structure of a conformal vertex algebra on the space $\hat{V}_N$.

This follows from [K], Theorem 4.5.

3.3. Let $A^{an}_N \subset \hat{A}_N$ denote the subalgebra of power series convergent in a neighbourhood of the origin. Set
$$V^{an}_N = A^{an}_N \otimes_{A_N} V_N \subset \hat{V}_N. \tag{3.12}$$
It is clear from the inspection of the Taylor formula (3.4) that for $f(a, b) \in V^{an}_N$, the Fourier modes of $f(a, b)(z)$, which belong to $\mathrm{End}(\hat{V}_N)$, respect the subspace $V^{an}_N$. Therefore, the conformal vertex algebra structure on $\hat{V}_N$ defined in 3.1 induces the structure of a conformal vertex algebra on $V^{an}_N$.

In this argument, $A^{an}_N$ can be replaced by any algebra of functions containing $A_N$ and closed under derivations. More precisely, one has the following general statement. Let $A'$ be an arbitrary commutative $A_N$-algebra, given together with an action of the Lie algebra $T = \mathrm{Der}(A_N)$ by derivations, extending the natural action of $T$ on $A_N$. Then the space $V_{A'} := A' \otimes_{A_N} V_N$ admits a natural structure of a vertex algebra. For the details, see 6.9.

For example, let $A^{sm}_N$ denote the algebra of germs of smooth ($C^\infty$) functions. Then we get a vertex algebra
$$V^{sm}_N = A^{sm}_N \otimes_{A_N} V_N. \tag{3.13}$$
Another natural example is that of localization of $A$. It is treated in the next subsection.

3.4. Zariski localization. Let $f \in A_N$ be a nonzero polynomial. Let $A_{N;f}$ denote the localization $A_N[f^{-1}]$. Set
$$V_{N;f} = A_{N;f} \otimes_{A_N} V_N. \tag{3.14}$$
Consider the Taylor formula (3.4) applied to the function $f^{-1}$. We have evidently $\partial^{(i_1,\dots,i_N)} f^{-1}(b^1, \dots, b^N) \in A_{N;f}$.

In more concrete terms, let $f(z) = \sum f_n z^{-n}$ be the field corresponding to $f$; then we want to define the field corresponding to $f^{-1}$ as
$$f(z)^{-1} = (f_0 + f_{-1}z + f_1 z^{-1} + \ldots)^{-1} = f_0^{-1}\left(1 + f_0^{-1}(f_{-1}z + f_1 z^{-1} + \ldots)\right)^{-1}$$
(we use the geometric series)
$$= f_0^{-1}\left(1 + f_0^{-2}(2f_{-1}f_1 + 2f_{-2}f_2 + \ldots) + \ldots\right)$$
(we started to write down the coefficient at $z^0$). Now, in the right hand side, the coefficient at each power of $z$ is an infinite sum, but as an operator acting on $A_{N;f}$ it is well defined since only a finite number of terms act nontrivially. We need only to invert $f_0 = f$. Therefore, the construction 3.1 provides a conformal vertex algebra structure on the space $V_{N;f}$.

Let $X$ denote the affine space $\mathrm{Spec}(A_N)$. Let $\mathcal{O}^{ch}_X$ be the $\mathcal{O}_X$-quasicoherent sheaf corresponding to the $A_N$-module $V_N$. We have just defined the structure of a conformal vertex algebra on the spaces $V_{N;f} = \Gamma(U_f; \mathcal{O}^{ch}_X)$, where $U_f = \mathrm{Spec}(A_{N;f})$. If $U_f \subset U_g$ then the restriction map $V_{N;g} \longrightarrow V_{N;f}$ is a morphism of conformal vertex algebras. If $U \subset X$ is an arbitrary open, we have $U = \bigcup U_f$, and
$$\Gamma(U; \mathcal{O}^{ch}_X) = \mathrm{Ker}\Big(\prod V_{N;f} \rightrightarrows \prod V_{N;fg}\Big).$$
Using this formula, we get a structure of a conformal vertex algebra on the space $\Gamma(U; \mathcal{O}^{ch}_X)$. Therefore, $\mathcal{O}^{ch}_X$ gets a structure of a sheaf of conformal vertex algebras.

3.5. We can add fermions to the picture. Consider the spaces $\hat{\Omega}_N = \hat{A}_N \otimes_{A_N} \Omega_N$, $\Omega^{an}_N = A^{an}_N \otimes_{A_N} \Omega_N \subset \hat{\Omega}_N$, $\Omega^{sm}_N = A^{sm}_N \otimes_{A_N} \Omega_N$. The construction 3.1 provides a structure of a topological vertex algebra on these spaces.

Let $X$ be as in 3.4; let $\Omega^{ch}_X$ denote the $\mathcal{O}_X$-quasicoherent sheaf associated with the $A_N$-module $\Omega_N$. The construction 3.4 provides a structure of a sheaf of topological vertex algebras of rank $N$ on $\Omega^{ch}_X$.

3.6. Now we want to study coordinate changes in our vertex algebras. Let $X$ be the formal scheme $\mathrm{Spf}(\mathbb{C}[[b^1, \dots, b^N]])$. Consider the formal $N|N$-dimensional superscheme $\tilde{X} = \Pi TX$ (here $TX$ is the total space of the tangent bundle, $\Pi$ is the parity change functor). Thus, $\tilde{X}$ has the same underlying space as $X$, and the structure sheaf of $\tilde{X}$ coincides with the de Rham algebra of differential forms $\Omega_X$. On $\tilde{X}$, we have $N$ even coordinates $b^1, \dots, b^N$ and odd ones $\phi^1 = db^1, \dots, \phi^N = db^N$.

To this superscheme $\tilde{X}$, with the above coordinates, we have assigned a (super)vertex algebra $\hat{\Omega}_N$, generated by the fields $b^i(z)$, $a^i(z)$ (even ones) and $\phi^i(z)$, $\psi^i(z)$ (odd ones). The fields $a^i(z)$ (resp. $\psi^i(z)$) correspond to the vector fields $\partial_{b^i}$ (resp. $\partial_{\phi^i}$) on $\tilde{X}$. These fields satisfy the relations (cf. (1.18), (1.22))
$$a^i(z)b^j(w) = \frac{\delta_{ij}}{z-w}, \tag{3.15a}$$
$$b^i(z)b^j(w) = (\text{regular}); \qquad a^i(z)a^j(w) = (\text{regular}), \tag{3.15b}$$
$$\phi^i(z)\psi^j(w) = \frac{\delta_{ij}}{z-w}, \tag{3.15c}$$
$$\phi^i(z)\phi^j(w) = (\text{regular}); \qquad \psi^i(z)\psi^j(w) = (\text{regular}), \tag{3.15d}$$
$$b^i(z)\phi^j(w) = (\text{regular}); \qquad b^i(z)\psi^j(w) = (\text{regular}), \tag{3.15e}$$
$$a^i(z)\phi^j(w) = (\text{regular}); \qquad a^i(z)\psi^j(w) = (\text{regular}). \tag{3.15f}$$
Consider an invertible coordinate transformation on $X$,
$$\tilde{b}^i = g^i(b^1, \dots, b^N); \qquad b^i = f^i(\tilde{b}^1, \dots, \tilde{b}^N), \tag{3.16a}$$
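where $g^i \in \mathbb{C}[[b^j]]$; $f^i \in \mathbb{C}[[\tilde{b}^j]]$. It induces the transformation of the odd coordinates $\phi^i = db^i$,
$$\tilde{\phi}^i = \frac{\partial g^i}{\partial b^j}\phi^j; \qquad \phi^i = \frac{\partial f^i}{\partial \tilde{b}^j}\tilde{\phi}^j \tag{3.16b}$$
(the summation over the repeating indices is tacitly assumed). The vector fields transform as follows,
$$\partial_{\tilde{b}^i} = \frac{\partial f^j}{\partial \tilde{b}^i}(g(b))\,\partial_{b^j} + \frac{\partial^2 f^k}{\partial \tilde{b}^i \partial \tilde{b}^l}(g(b)) \cdot \frac{\partial g^l}{\partial b^r} \cdot \phi^r \partial_{\phi^k} \tag{3.16c}$$
and
$$\partial_{\tilde{\phi}^i} = \frac{\partial f^j}{\partial \tilde{b}^i}(g(b))\,\partial_{\phi^j}. \tag{3.16d}$$
We want to lift the transformation (3.16a) to the algebra $\Omega_N$. Define the tilded fields by
$$\tilde{b}^i(z) = g^i(b)(z), \tag{3.17a}$$
$$\tilde{\phi}^i(z) = \Big(\frac{\partial g^i}{\partial b^j}\phi^j\Big)(z), \tag{3.17b}$$
$$\tilde{a}^i(z) = \Big(a^j \frac{\partial f^j}{\partial \tilde{b}^i}(g(b))\Big)(z) + \Big(\frac{\partial^2 f^k}{\partial \tilde{b}^i \partial \tilde{b}^l}(g(b)) \frac{\partial g^l}{\partial b^r}\phi^r\psi^k\Big)(z), \tag{3.17c}$$
$$\tilde{\psi}^i(z) = \Big(\frac{\partial f^j}{\partial \tilde{b}^i}(g(b))\psi^j\Big)(z). \tag{3.17d}$$

Theorem 3.7. The fields $\tilde{b}^i(z)$, $\tilde{a}^i(z)$, $\tilde{\phi}^i(z)$ and $\tilde{\psi}^i(z)$ satisfy the relations (3.15).

Proof. We will use the relations
$$h(b)(z)a^i(w) = -\frac{\partial h/\partial b^i(w)}{z-w}; \qquad a^i(z)h(b)(w) = \frac{\partial h/\partial b^i(w)}{z-w} \tag{3.18}$$
for each $h \in \hat{A}_N$, which follow from (3.15a) and the Wick theorem. Let us check (3.15a) for the tilded fields. We have
$$\tilde{b}^i(z)\tilde{a}^j(w) = g^i(b)(z)\Big(a^k \frac{\partial f^k}{\partial \tilde{b}^j}(g(b))\Big)(w) - g^i(b)(z)\Big(\frac{\partial^2 f^k}{\partial \tilde{b}^j \partial \tilde{b}^l}(g(b)) \frac{\partial g^l}{\partial b^r}\psi^k\phi^r\Big)(w).$$
By the Wick theorem, the first summand is equal to
$$-\Big(\frac{\partial g^i}{\partial b^k}\frac{\partial f^k}{\partial \tilde{b}^j}(g(b))\Big)(w) \cdot \frac{1}{z-w} = -\frac{\delta_{ij}}{z-w},$$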
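by (3.18). The second summand is zero. Let us check (3.15b). The first identity is clear. We have
$$\begin{aligned}
\tilde{a}^i(z)\tilde{a}^j(w) ={}& \Big(a^k \frac{\partial f^k}{\partial \tilde{b}^i}(g(b))\Big)(z)\Big(a^n \frac{\partial f^n}{\partial \tilde{b}^j}(g(b))\Big)(w) \\
&- \Big(a^k \frac{\partial f^k}{\partial \tilde{b}^i}(g(b))\Big)(z)\Big(\frac{\partial^2 f^n}{\partial \tilde{b}^j \partial \tilde{b}^l}(g(b)) \frac{\partial g^l}{\partial b^r}\psi^n\phi^r\Big)(w) \\
&- \Big(\frac{\partial^2 f^k}{\partial \tilde{b}^i \partial \tilde{b}^l}(g(b)) \frac{\partial g^l}{\partial b^r}\psi^k\phi^r\Big)(z)\Big(a^n \frac{\partial f^n}{\partial \tilde{b}^j}(g(b))\Big)(w) \\
&+ \Big(\frac{\partial^2 f^k}{\partial \tilde{b}^i \partial \tilde{b}^l}(g(b)) \frac{\partial g^l}{\partial b^r}\psi^k\phi^r\Big)(z)\Big(\frac{\partial^2 f^n}{\partial \tilde{b}^j \partial \tilde{b}^p}(g(b)) \frac{\partial g^p}{\partial b^m}\psi^n\phi^m\Big)(w).
\end{aligned}$$
When we compute each term using the Wick theorem, there appear single and double pairings. The part corresponding to the single pairings coincides with the expression of the bracket $[\partial_{\tilde{b}^i}, \partial_{\tilde{b}^j}]$ in the old coordinates $b^p$, so it vanishes. The "anomalous" part comes from the double pairings. One double pairing appears in the first term and is equal to
$$-\frac{\partial}{\partial b^n}\Big(\frac{\partial f^k}{\partial \tilde{b}^i}(g(b))\Big)(z)\,\frac{\partial}{\partial b^k}\Big(\frac{\partial f^n}{\partial \tilde{b}^j}(g(b))\Big)(w) \cdot \frac{1}{(z-w)^2},$$
another one appears in the fourth term and equals
$$\begin{aligned}
&\delta_{km}\delta_{rn}\Big(\frac{\partial^2 f^k}{\partial \tilde{b}^i \partial \tilde{b}^l}(g(b)) \frac{\partial g^l}{\partial b^r}\Big)(z)\Big(\frac{\partial^2 f^n}{\partial \tilde{b}^j \partial \tilde{b}^p}(g(b)) \frac{\partial g^p}{\partial b^m}\Big)(w) \cdot \frac{1}{(z-w)^2} \\
&\quad= \Big(\frac{\partial^2 f^k}{\partial \tilde{b}^i \partial \tilde{b}^l}(g(b)) \frac{\partial g^l}{\partial b^n}\Big)(z)\Big(\frac{\partial^2 f^n}{\partial \tilde{b}^j \partial \tilde{b}^p}(g(b)) \frac{\partial g^p}{\partial b^k}\Big)(w) \cdot \frac{1}{(z-w)^2}.
\end{aligned}$$
We see that these terms cancel out. The remaining relations, (3.15c–f), contain only single pairings, and are easily checked. $\square$

Thus, for each automorphism $g = (g^1, \dots, g^N)$ of $\mathbb{C}[[b^1, \dots, b^N]]$, (3.16a), the formulas (3.17) determine a morphism of vertex algebras
$$\tilde{g} : \hat{\Omega}_N \longrightarrow \hat{\Omega}_N. \tag{3.19}$$
More precisely, each element $c$ of $\hat{\Omega}_N$ is an (infinite) sum of finite products of $a^i_{-n}$, $b^j_{-m}$, $\phi^i_{-n}$, $\psi^j_{-m}$. We have $c = c(z)\mathbf{1}(0)$. By definition,
$$\tilde{g}(c) = \tilde{g}(c(z))\mathbf{1}(0). \tag{3.20}$$
Thus, we have to define the field $\tilde{g}(c(z))$ for each $c \in \hat{\Omega}_N$. If $c$ is one of the generators $a^i_{-1}$, $b^i_0$, $\phi^i_0$, $\psi^i_{-1}$, we define $\tilde{g}(c(z))$ by formulas (3.17). We set
$$\tilde{g}(a^i_{-1-n}(z)) = \partial_z^{(n)}\tilde{g}(a^i_{-1}(z)), \tag{3.21}$$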
and the same with $b$, $\phi$, $\psi$, cf. (1.12). Finally, if $c = c_1 c_2 \cdot \ldots \cdot c_p$, where each $c_i$ is one of the letters $a$, $b$, $\phi$ or $\psi$, we set
$$\tilde{g}(c(z)) = {:}\tilde{g}(c_1(z))\tilde{g}(c_2(z)) \cdot \ldots \cdot \tilde{g}(c_p(z)){:}, \tag{3.22}$$
where the normal ordered product of two factors is defined by (1.2), and if $p > 2$, we use the inductive formula (1.14).

Equivalently, if $c_j = x^j_{k_j}$ where $x^j = a^i$, $b^i$, $\phi^i$ or $\psi^i$, we have
$$\tilde{g}(c_1 \cdot \ldots \cdot c_p \mathbf{1}) = \tilde{g}(x^1(z))_{k_1} \cdot \ldots \cdot \tilde{g}(x^p(z))_{k_p}\mathbf{1}. \tag{3.23}$$
(Here we use the following notation. If $a(z) = \sum_i a_i z^{-i-n}$ is a field corresponding to an element of conformal weight $n$, we denote the Fourier mode $a_i$ by $a(z)_i$.)

Let $G_N$ denote the group of automorphisms (3.16a).

Theorem 3.8. The assignment $g \mapsto \tilde{g}$ defines a group homomorphism $G_N \longrightarrow \mathrm{Aut}(\hat{\Omega}_N)$.

Proof. Let us consider two coordinate transformations, $'b^i = g_1^i(b)$ and $''b^i = g_2^i({}'b)$. Let $f_j$ denote the transformation inverse to $g_j$. We have to check that
$$\widetilde{g_2 g_1} = \tilde{g}_2 \tilde{g}_1. \tag{3.24}$$
By Theorem 4.5 from [K], it suffices to check this equality on the generators. Let us i 1 is expressed in the coordinates 0 a, etc., as follows begin with a i . The element 00 a−1 00 i a−1 1
=
j 0 j ∂f2 a−1 00 i (g2 (0 b0 )) − ∂ b
∂ 2 f2k ∂g2l 0 0 0 k 0 r (g2 ( b0 )) 0 r ( b0 ) ψ−1 φ0 1. ∂ 00 bi ∂ 00 bl ∂b
Expressing it in the coordinates a, etc., we get the element g˜ 2 g˜ 1 (a i ) p p q j ∂f2 ∂ 2 f1 ∂g1 p s p ∂f1 (g g (b))(z)0 1 = a 0 j (g1 (b))(z) + 0 j 0 q (g1 (b)) s ψ φ (z) 00 i 2 1 ∂b ∂b ∂b ∂b −1 ∂ b r p ∂g l ∂f ∂g1 q ∂ 2f k φ (z) 1 − 00 i 200 l (g2 g1 (b)) 0 2r (g1 (b))(z)0 0 1k (g1 (b))ψ p (z) q ∂ b∂ b ∂b ∂b −1 ∂b 0 (3.25) (we have used (3.23)). Now, the action of our group GN on the classical de Rham i complex is associative. It follows that the expression (3.25) is equal to g ] 2 g1 (a ) plus two anomalous terms: p j ∂f2 ∂f1 (g1 (b))(z) (g g (b ))1 00 i 2 1 0 ∂ 0 bj −1 ∂ b p j ∂f1 ∂ ∂f2 = + 0 j (g1 (b))(z) p 00 i (g2 g1 (b0 ))1 ∂b −1 ∂b0 ∂ b
p a0
coming from the first summand, and p ∂g1r ∂g2l ∂f1 ∂ 2 f2k p q (g g (b )) (g (b )) (g (b))(z) (b )ψ φ 1 2 1 0 1 0 1 q 0 0 0 ∂ 00 bi ∂ 00 bl ∂ 0 br ∂ 0 bk ∂b −1 p ∂g1r ∂g2l ∂f1 ∂ 2 f2k (b )1 = − 00 i 00 l (g2 g1 (b0 )) 0 r (g1 (b0 )) 0 k (g1 (b))(z) p 0 ∂ b∂ b ∂b ∂b −1 ∂b
−
coming from the second one. One sees that these two terms cancel out, which proves (3.24) for $a^i$. For the generators $b$, $\psi$, and $\phi$, the anomaly does not appear at all. $\square$

3.9. Theorems 3.7 and 3.8 allow one to define the sheaf of conformal vertex algebras $\Omega^{ch}_X$ for each smooth manifold $X$ in an invariant way, by gluing the sheaves defined in 3.5. This can be done in each of the three settings: algebraic, complex analytic or smooth.

In the complex analytic situation, we have our sheaves of vertex algebras over the coordinate charts, and the formulas (3.17) allow us to glue these sheaves into a sheaf over $X$.

In the algebraic situation, Theorem 3.8 ensures the existence of our sheaves by the standard arguments of "formal geometry" of Gelfand and Kazhdan, cf. [GK]. Consider the formal situation. By 3.8, the vertex algebra $\hat{\Omega}_N$ is a $G_N$-module. Therefore, the Lie algebra $W^0_N = \mathrm{Lie}(G_N)$ of formal vector fields vanishing at 0 acts on $\hat{\Omega}_N$ by derivations. In fact, since the proof of Theorem 3.8 never uses the fact that the automorphisms in question preserve the origin, the infinitesimal version of formulas (3.17) shows that the entire algebra $W_N$ of formal vector fields operates on $\hat{\Omega}_N$. (Alternatively, this can be shown by a computation similar to the one from 5.1 (in the case of $\hat{\Omega}_N$ the anomaly vanishes!).) In other words, $\hat{\Omega}_N$ is a $(W_N, G_N)$-module (cf. [BS, BFM]). Now, the standard result, [GK], says that such a module defines naturally a sheaf $\Omega^{ch}_X$ on each smooth algebraic variety $X$. They are sheaves of vertex algebras since $G_N$ (resp., $W_N$) acts by vertex algebra automorphisms (resp., derivations). A more direct construction of these sheaves is outlined in Sect. 6, see 6.10.

3.10. Consider the formal situation. Let us show that the algebra $\hat{\Omega}_N$ admits a canonical filtration whose graded factors are standard tensor fields.

Placing the formulae (3.16) and (3.17), (3.23) on the desk next to each other, one realizes that the "symbols of fields" transform in the same way as the corresponding geometric quantities: functions, 1-forms, and vector fields. To be more precise, introduce a filtration on $\hat{\Omega}_N$ as follows.

The space $\hat{\Omega}_N$ is a free $\hat{A}_N$-module with a base consisting of monomials in the letters $a^i_n$, $\psi^i_n$ ($n < 0$); $b^i_m$, $\phi^i_m$ ($m \leq 0$). Define a partial ordering on this base by (a) $a > \phi$, $a > \psi$, $a > b$, $\psi > \phi$, $\psi > b$, $\phi > b$; $x^i_n > x^j_m$ if $n < m$, $x$ being $a$, $b$, $\phi$ or $\psi$; (b) extending this order to the whole set of monomials lexicographically. This partial order on the base naturally determines an increasing exhausting filtration on the spaces of fixed conformal weight,
$$F_0\hat{\Omega}^{(i)}_N \subset F_1\hat{\Omega}^{(i)}_N \subset \ldots. \tag{3.27}$$
For example, $F_0\hat{\Omega}^{(0)}_N = \hat{A}_N$, $F_0\hat{\Omega}^{(-1)}_N = \oplus_{i=1}^{N} \hat{A}_N b^i_{-1}$, etc. A glance at (3.17), (3.23) shows that the corresponding graded object $\mathrm{Gr}^F_\bullet \hat{\Omega}^{(i)}_N$ is a direct sum of symmetric powers
of the tangent bundle, symmetric powers of the bundle of 1-forms, and tensor products thereof. For example, the image of $b^i_0$ in $\mathrm{Gr}^F_\bullet \hat{\Omega}^{(0)}_N$ is a function, that of $b^i_{-n}$ ($n > 0$) is a 1-form, that of $a^i_{-n}$ is a vector field, etc.

This filtration is stable under coordinate changes. Therefore all the sheaves $\Omega^{ch}_X$ acquire the natural filtration with graded factors being the bundles of tensor fields.

4. Conformal and Topological Structure

4.1. Let us return to the formal setting 3.6. Recall that we have in our vertex algebra $\hat{\Omega}_N$ the fields $L(z)$, $J(z)$, $Q(z)$ and $G(z)$, defined by the formulas (2.4), which make it a topological vertex algebra. Let us study the effect of the coordinate changes (3.16a) on these fields.

Let us denote by $\tilde{L}(z)$, etc., the field $L(z)$, etc., written down using formulas (2.4) in terms of the tilded fields $\tilde{a}^i(z)$, etc., and then expressed in terms of the old fields $a^i(z)$, etc.

Theorem 4.2. We have
$$\tilde{L}(z) = L(z), \tag{4.1a}$$
$$\tilde{J}(z) = J(z) + \Big(\mathrm{Tr}\log\big(\partial g^i/\partial b^j(z)\big)\Big)', \tag{4.1b}$$
$$\tilde{Q}(z) = Q(z) + \Big(\frac{\partial}{\partial \tilde{b}^r}\Big(\mathrm{Tr}\log\big(\partial f^i/\partial \tilde{b}^j\big)\Big)\tilde{\phi}^r(z)\Big)', \tag{4.1c}$$
$$\tilde{G}(z) = G(z). \tag{4.1d}$$

Proof. We have $J = \phi^i_0\psi^i_{-1}\mathbf{1}$. Therefore, (cf. (3.23)),
k i k ∂f ∂g ∂f ∂g i j k φ (z) ψ (z) 1 = J + δj k (z) (z) 1 j i i ˜ ∂bj ∂b 0 ∂b −1 −1 ∂ b˜ 0 i 0 j ∂f ∂g (z) (z) 1, =J + i ∂bj 0 0 ∂ b˜
J˜ =
i bi 1. Therefore, which implies (4.1b). We have G = ψ−1 −1
j i j i ∂f ∂f ∂g k 0 j j ˜ ψ (z) ψ (z) b (z) 1 g (b)(z) −1 1 = G= k ∂ b˜ i ∂ b˜ i −1 −1 ∂b 0 j
k 1 = G. = δj k ψ−1 b−1 i φ i 1. Therefore, This proves (4.1d). We have Q = a−1 0
i j ∂ 2 f k ∂g l k r ∂g q j ∂f ˜ (z) − ψ φ (z) φ (z) 1. Q= a q ∂ b˜ i ∂ b˜ i ∂ b˜ l ∂br −1 ∂b 0 The classical terms: 2 k j i ∂ f ∂g l ∂f ∂g q j k r − ψ−1 φ0 φ 1 = Q, a−1 r i i l ˜ ˜ ˜ ∂bq ∂b 0 ∂ b ∂ b ∂b 0 0
(4.1a) (4.1b) (4.1c) (4.1d)
k and φ . Quantum since the second summand is zero, due to the anticommutation of ψ−1 0 corrections (anomalous terms):
2 k l i 2 i ∂ f ∂g r ∂ g ∂g ∂f j r φ + φ 1 0 j ∂bq r k i i l ˜ ˜ ˜ ∂b ∂b ∂b ∂ b −1 ∂b b 0 −1 0 2 k i j 2 i ∂ f ∂g l ∂ g ∂g ∂f + φ0r = j r r i i l ∂ b˜ −1 ∂b ∂b 0 ∂ b˜ ∂ b˜ ∂b −1 ∂bk 0 2 k ∂ f ∂g l ∂g i ∂f k ∂g l ∂g i r r + φ−1 1 = φ 1, ∂ b˜ i ∂ b˜ l ∂br ∂bk 0 ∂ b˜ i ∂ b˜ l ∂br ∂bk −1
since
∂ 2 f k ∂g l ∂ b˜ i ∂ b˜ l ∂br
0
∂g i ∂bk
(4.2)
k 2 l i ∂f ∂ g ∂f t ∂g 1=− 1 k l ∂bt ∂br ∂ b˜ i ˜ ∂b ∂ b −1 0 −1 k 2 l t i ∂f ∂g ∂f ∂ g 1 = t r l i ˜ ˜ ∂ b ∂b ∂b 0 ∂ b −1 ∂bk 0 2 i t ∂f ∂ g 1. = ∂bt ∂br 0 ∂ b˜ i −1
Returning to (4.2), we have
$$\frac{\partial^2 f^k}{\partial \tilde{b}^i \partial \tilde{b}^l}\frac{\partial g^l}{\partial b^r}\frac{\partial g^i}{\partial b^k}\phi^r = \frac{\partial^2 f^k}{\partial \tilde{b}^l \partial \tilde{b}^i}\frac{\partial g^i}{\partial b^k}\tilde{\phi}^l = \frac{\partial}{\partial \tilde{b}^l}\Big(\mathrm{Tr}\log\big(\partial f^i/\partial \tilde{b}^j\big)\Big)\tilde{\phi}^l,$$
which proves (4.1c). It follows from (4.1c) that the operator $Q_0$ is invariant. Hence, (4.1a) follows from (4.1d) and (2.2). $\square$

4.3. It follows from (4.1a) that for an arbitrary smooth manifold $X$, the field $L(z)$ is a well-defined global section of the sheaf $\mathrm{End}(\Omega^{ch}_X)[[z, z^{-1}]]$, i.e. $\Omega^{ch}_X$ is canonically a sheaf of conformal vertex algebras.

It follows from (4.1b) and (4.1c) that the Fourier modes $F = J_0$ ("fermionic charge") and $d^{ch}_{DR} = Q_0$ ("BRST charge") are well-defined endomorphisms of the sheaf $\Omega^{ch}_X$. Thus, $(\Omega^{ch}_X, d^{ch}_{DR})$ becomes a complex of sheaves, graded by $F$. This is a localization of Definition 2.3.

Theorem 4.4. For any smooth manifold $X$, the obvious embedding of complexes of sheaves
$$i : (\Omega_X, d_{DR}) \longrightarrow (\Omega^{ch}_X, d^{ch}_{DR}) \tag{4.3}$$
is a quasi-isomorphism.

This is true in the algebraic, analytic and $C^\infty$ settings. Indeed, the problem is local along $X$, and we are done by Theorem 2.4.

4.5. If $X$ is Calabi–Yau, i.e. $c_1(T_X) = 0$, then the fields $J(z)$ and $Q(z)$ are globally well defined, by (4.1b) and (4.1c). Here $T_X$ denotes the tangent bundle. Therefore, in this case the sheaf $\Omega^{ch}_X$ is canonically a sheaf of topological vertex algebras.
F. Malikov, V. Schechtman, A. Vaintrob
5. Chiral Structure Sheaf

A. Obstruction

5.1. Consider the formal setting 3.1, 3.2, 3.6. We have the vertex algebra $\hat V_N$ of "chiral functions" over the formal disk $D_N = \mathrm{Spf}(\hat A_N)$, where $\hat A_N = \mathbb{C}[[b^1, \dots, b^N]]$. Let $W_N$ denote the Lie algebra of vector fields $f^i(b)\partial_{b^i}$ on $D_N$. Let $\Omega^1(\hat A_N)$ denote the module of one-forms $f^i(b)\,db^i$. The spaces $\hat A_N$ and $\Omega^1(\hat A_N)$ are naturally $W_N$-modules. Recall that the action of $W_N$ on $\Omega^1(\hat A_N)$ is given by
$$f^i\partial_{b^i} \cdot g^j db^j = f^i\partial_{b^i}g^j\, db^j + g^j\, df^j. \tag{5.1}$$
The de Rham differential $d : \hat A_N \to \Omega^1(\hat A_N)$ is compatible with the $W_N$-action. Let us define a map
$$\pi : W_N \longrightarrow \mathrm{End}(\hat V_N). \tag{5.2}$$
For a vector field $\tau = f^i(b)\partial_{b^i}$, let $\tau(z)$ denote the field $f^i(b)a^i(z)$ (of conformal weight 1) of our vertex algebra $\hat V_N$. Let $\pi(\tau) \in \mathrm{End}(\hat V_N)$ denote the Fourier mode
$$\pi(\tau) := \int \tau(z) = \tau(z)_0. \tag{5.3}$$
Note that by 1.7, the maps $\pi(\tau)$ are derivations of $\hat V_N$.

The mapping $\pi$ does not respect the Lie bracket. Let us compute the discrepancy. Let $\tau_1 = f^i(b)\partial_{b^i}$, $\tau_2 = g^i(b)\partial_{b^i}$ be two vector fields. We have the operator product
$$\tau_1(z)\tau_2(w) = -\frac{\partial_{b^j}f^i(b(z))\,\partial_{b^i}g^j(b(w))}{(z-w)^2} + \frac{f^i(b(w))\partial_{b^i}g^j(b(w))a^j(w) - g^j(b(w))\partial_{b^j}f^i(b(w))a^i(w)}{z-w}$$
$$= -\frac{\partial_{b^j}f^i(b(w))\,\partial_{b^i}g^j(b(w))}{(z-w)^2} + \frac{[\tau_1,\tau_2](w)}{z-w} - \frac{\bigl(\partial_{b^j}f^i(b(w))\bigr)'\,\partial_{b^i}g^j(b(w))}{z-w}. \tag{5.4}$$
It follows that
$$[\pi(\tau_1), \tau_2(w)] = [\tau_1,\tau_2](w) - \bigl(\partial_{b^j}f^i(b(w))\bigr)'\,\partial_{b^i}g^j(b(w)). \tag{5.5}$$
In particular,
$$[\pi(\tau_1), \pi(\tau_2)] = \pi([\tau_1,\tau_2]) - \int \bigl(\partial_{b^j}f^i(b(w))\bigr)'\,\partial_{b^i}g^j(b(w)). \tag{5.6}$$
5.2. For $\omega = f^i(b)\,db^i \in \Omega^1(\hat A_N)$, let us denote by $\omega(z)$ the field $f^i(b)b^i(z)'$ of our vertex algebra. Denote by $\pi(\omega)$ the Fourier mode $\omega(z)_0 = \int\omega(z)$.

For $f = f(b) \in \hat A_N$, let $f(z)$ denote the corresponding field of $\hat V_N$. Its conformal weight is 0, and $f(z)_0 = f$. We have
$$df(z) = f(z)'. \tag{5.7}$$
Given $\tau = f^i\partial_{b^i} \in W_N$, $\omega = g^j db^j \in \Omega^1(\hat A_N)$, we have the operator product
$$\tau(z)\omega(w) = \frac{f^i(b(z))g^i(b(w))}{(z-w)^2} + \frac{f^i(b(w))\,\partial_{b^i}g^j\, b^j(w)'}{z-w} = \frac{f^i(b(w))g^i(b(w))}{(z-w)^2} + \frac{f^i\partial_{b^i}g^j(w)\,b^j(w)' + f^i(b(w))'\,g^i(b(w))}{z-w}. \tag{5.8}$$
It follows that
$$[\pi(\tau), \omega(z)] = (\tau\omega)(z) \tag{5.9}$$
and
$$[\pi(\tau), \pi(\omega)] = \pi(\tau\omega). \tag{5.10}$$
Let $\tilde W_N$ denote the linear subspace of $\mathrm{End}(\hat V_N)$ generated by the Fourier modes $\pi(\tau)$ ($\tau \in W_N$) and $\pi(\omega)$ ($\omega \in \Omega^1(\hat A_N)$). Let $I_N \subset \tilde W_N$ be the linear subspace generated by the Fourier modes $\pi(\omega)$. It follows from (5.7) that if $\omega$ is exact then $\pi(\omega) = 0$. Thus, $\pi$ induces an epimorphic map
$$\Omega^1(\hat A_N)/d\hat A_N \longrightarrow I_N. \tag{5.11}$$
Lemma 5.3. The map (5.11) is an isomorphism.

This can be proved by writing down the Fourier mode as an infinite sum of monomials in $b^i_n$ and comparing the coefficients of like terms. In fact, a more general statement, namely, that $\int Q(z) = 0$ if and only if $Q = \mathrm{const}\cdot\mathbf{1}$ or $Q(z) = P(z)'$ for some $P$, seems to be valid for a broad class of vertex algebras, cf. a similar statement in [FF3]. From our point of view, this phenomenon has topological nature. It is amusing to exhibit an example of a vertex algebra for which the lemma above is false. Namely, take $b^{-1}b_{-1} \in A_1[b^{-1}]$, see 3.4; then $(b^{-1}b_{-1})(z) = b(z)^{-1}b(z)'$ and $\int b(z)^{-1}b(z)' = 0$, but $b(z)^{-1}b(z)'$ is not a total derivative. △

5.4. Obviously, $\omega_1(z)\omega_2(w) = 0$ for all $\omega_1, \omega_2 \in \Omega^1(\hat A_N)$. It follows from (5.6) and (5.10) that $\tilde W_N$ is a Lie algebra, $I_N$ is its abelian ideal, and we have the canonical extension
$$0 \longrightarrow I_N \longrightarrow \tilde W_N \longrightarrow W_N \longrightarrow 0. \tag{5.12}$$
The action of $W_N$ on $I_N$ arising from this extension coincides with the canonical action of $W_N$ on $\Omega^1(\hat A_N)/d\hat A_N$, by (5.10). Note that we have defined this extension together with its splitting (5.2). It is given by the two-cocycle $c \in Z^2(W_N; \Omega^1(\hat A_N)/d\hat A_N)$ of $W_N$ with values in $\Omega^1(\hat A_N)/d\hat A_N$, read from (5.5),
$$c(f^i\partial_{b^i}, g^j\partial_{b^j}) = -\partial_{b^i}g^j\, d(\partial_{b^j}f^i) \pmod{d\hat A_N}. \tag{5.13}$$
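To make (5.13) concrete, here is a small worked example. The shorthand $x = b^1$, $y = b^2$ and the particular vector fields are chosen here purely for illustration and do not appear in the original text.

```latex
% N = 2, with x = b^1, y = b^2 (illustrative notation).
% Take  tau_1 = xy d/dx  (f^1 = xy, f^2 = 0)  and  tau_2 = xy d/dy  (g^1 = 0, g^2 = xy).
% In c = - \partial_{b^i} g^j \, d(\partial_{b^j} f^i) only the term i = 1, j = 2 survives:
\[
  c(\tau_1,\tau_2) = -\,\partial_x g^2 \; d(\partial_y f^1) = -\,y\,dx \pmod{d\hat A_2}.
\]
% The form y dx is not closed, d(y\,dx) = dy \wedge dx \neq 0, hence it is not exact,
% so this value of the cocycle is non-zero in \Omega^1(\hat A_2)/d\hat A_2.
%
% Exchanging the arguments gives, by the same rule,
\[
  c(\tau_2,\tau_1) = -\,\partial_y f^1 \; d(\partial_x g^2) = -\,x\,dy
  \equiv y\,dx \pmod{d\hat A_2},
  \qquad\text{since } x\,dy + y\,dx = d(xy),
\]
% in agreement with the skew symmetry of c modulo exact forms.
%
% For N = 1 every one-form h(b)\,db in \mathbb{C}[[b]] has a primitive, so
\[
  c(f\partial_b, g\partial_b) = -\,g'(b)\,f''(b)\,db \equiv 0 \pmod{d\hat A_1}.
\]
```

The $N = 1$ line shows why, in one variable, the discrepancy in (5.6) always integrates to zero.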
5.5. Let us consider the truncated and shifted de Rham complex
$$\Omega^\bullet : 0 \longrightarrow \hat A_N \longrightarrow \Omega^1(\hat A_N) \longrightarrow 0, \tag{5.14}$$
where we place $\Omega^1(\hat A_N)$ in degree zero. It is a complex of $W_N$-modules. We have an obvious map of complexes of $W_N$-modules
$$\Omega^\bullet \longrightarrow \Omega^1(\hat A_N)/d\hat A_N, \tag{5.15}$$
where the target is regarded as a complex sitting in degree zero. Let us write down a two-cocycle $\tilde c \in Z^2(W_N; \Omega^\bullet)$ which is mapped to $c$, (5.13), under the map (5.15). Such a cocycle is by definition a pair $(c^2, c^3)$, where $c^2 \in C^2(W_N; \Omega^1(\hat A_N))$ is a two-cochain, and $c^3 \in C^3(W_N; \hat A_N)$ is a three-cochain, such that
$$d_{Lie}(c^2) = d_{DR}(c^3), \tag{5.16a}$$
$$d_{Lie}(c^3) = 0. \tag{5.16b}$$
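Let us define
$$c^2(f^i\partial_i, g^j\partial_j) = \partial_i g^j\, d(\partial_j f^i) - \partial_j f^i\, d(\partial_i g^j) \tag{5.17}$$
and
$$c^3(f^i\partial_i, g^j\partial_j, h^k\partial_k) = \partial_j f^i\,\partial_k g^j\,\partial_i h^k - \partial_k f^i\,\partial_i g^j\,\partial_j h^k. \tag{5.18}$$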
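The relation between the first component (5.17) and the cocycle $c$ of (5.13) can be checked in one line. The following is a sketch of that check, with $\partial_i$ standing for $\partial_{b^i}$ and the abbreviations $A$, $B$ introduced here only for this computation.

```latex
% Abbreviate A = \partial_i g^j and B = \partial_j f^i (summation over i, j understood).
% Then (5.17) reads c^2 = A\,dB - B\,dA and (5.13) reads c = -A\,dB modulo exact forms.
\[
  d(AB) = A\,dB + B\,dA
  \quad\Longrightarrow\quad
  c^2 = A\,dB - B\,dA = 2A\,dB - d(AB) \equiv 2A\,dB = -2c \pmod{d\hat A_N}.
\]
```

So the image of $c^2$ in $\Omega^1(\hat A_N)/d\hat A_N$ is $-2c$, a non-zero multiple of the cocycle defining the extension (5.12).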
We write for brevity $\partial_i$ instead of $\partial_{b^i}$. One checks the compatibilities (5.16) directly. Thus, we have defined $\tilde c$. One sees that $\tilde c$ is mapped to $-2c$ under (5.15).

For $N = 1$ the space $\Omega^1(\hat A_N)/d\hat A_N$ is trivial. This allows one to define the sheaf $\mathcal{O}^{ch}_X$ for curves, acting as in 3.9, and starting from $\hat V_N$ instead of $\hat\Omega_N$.

Assume that $N > 1$. Using the computations of Gelfand–Fuchs, cf. [F], Theorems 2.2.7 and 2.2.4, one can show that the map in cohomology
$$H^2(W_N; \Omega^\bullet) \longrightarrow H^2(W_N; \Omega^1(\hat A_N)/d\hat A_N) \tag{5.19}$$
induced by (5.15) is an isomorphism. We have the canonical short exact sequence
$$0 \longrightarrow H^2(W_N; \Omega^1(\hat A_N)) \longrightarrow H^2(W_N; \Omega^\bullet) \longrightarrow H^3(W_N; \hat A_N) \longrightarrow 0, \tag{5.20}$$
the left- and right-most terms being one-dimensional. Under the second map of this sequence, our cocycle $\tilde c$ is mapped to its second component $c^3$ which is a canonical representative of a generator of the space $H^3(W_N; \hat A_N)$, cf. [F], Theorem 2.2.7′ and Chapter 2, Sect. 1, no. 4. In particular, our cocycle $\tilde c$ is non-trivial. It follows that the cocycle $c$ defining the extension (5.12) is also non-trivial.

What kind of an object does the cocycle $\tilde c$ define? Recall that a homotopy Lie algebra $L^\bullet$ is a complex of vector spaces equipped with a collection of brackets
$$[\ ,\dots,\ ]_i : \Lambda^i L^\bullet \longrightarrow L^\bullet[-i+2], \quad i \ge 2, \tag{5.21}$$
satisfying certain compatibility conditions, cf. for example [HS], Sect. 4. In particular $[\ ,\ ]_2$ is a skew symmetric map, satisfying the Jacobi identity up to the homotopy (given by the third bracket).

Let us define a complex $L^\bullet$ as follows. Set $L^0 = W_N \oplus \Omega^1(\hat A_N)$, $L^{-1} = \hat A_N$, the other components being trivial. The differential $L^{-1} \to L^0$ is the composition of the de Rham differential and the obvious embedding $\Omega^1(\hat A_N) \hookrightarrow L^0$.

Let the second bracket $[\ ,\ ]_2$ be given by the usual bracket of vector fields, and the action of vector fields on $\hat A_N$ and $\Omega^1(\hat A_N)$. Define the third bracket with the only nontrivial component being the three-cocycle $c^3$, (5.18). We set the other brackets equal to zero. This way we get a structure of a homotopy Lie algebra on $L^\bullet$. We have a canonical extension of homotopy Lie algebras
$$0 \longrightarrow \Omega^\bullet \longrightarrow L^\bullet \longrightarrow W_N \longrightarrow 0. \tag{5.22}$$
Here $\Omega^\bullet$ is an abelian ideal in $L^\bullet$ (all brackets are zero). This is a refinement of extension (5.12).

B. Projective Line

5.6. Let $X$ be the projective line $\mathbb{P}^1$. Let us fix a coordinate $b$ on $\mathbb{P}^1$, and consider the open covering $X = U_0 \cup U_1$, where $U_0 = \mathrm{Spec}(\mathbb{C}[b])$, $U_1 = \mathrm{Spec}(\mathbb{C}[b^{-1}])$. Consider the sheaves $\mathcal{O}^{ch}_{U_0}$ on $U_0$ with coordinate $b$, and $\mathcal{O}^{ch}_{U_1}$ on $U_1$ with coordinate $b^{-1}$, which were defined in 3.4. Let us glue them on the intersection $U_{01} = U_0 \cap U_1$ using the transition function
$$\tilde b(z) = b(z)^{-1}, \tag{5.23a}$$
$$\tilde a(z) = b^2 a(z) + 2b(z)'. \tag{5.23b}$$
In this way, we get the sheaf on $X$, to be denoted $\mathcal{O}^{ch}_X$.

Theorem 5.7. The space of global sections $\Gamma(X; \mathcal{O}^{ch}_X)$ admits a natural structure of an irreducible vacuum $\widehat{\mathfrak{sl}}_2$-module on the critical level.

We have to compute $\mathcal{O}^{ch}_{U_0} \cap \mathcal{O}^{ch}_{U_1}$, where both $\mathcal{O}^{ch}_{U_i}$ are regarded as subspaces of $\mathcal{O}^{ch}_{U_{01}}$. It is the essence of the Wakimoto construction, [W], that the fields $a(z)b(z)^2 + 2b(z)'$, $a(z)$ (resp., $\tilde a(z)\tilde b(z)^2 + 2\tilde b(z)'$, $\tilde a(z)$) generate an $\widehat{\mathfrak{sl}}_2$-action on $\mathcal{O}^{ch}_{U_0}$ (resp., on $\mathcal{O}^{ch}_{U_1}$), and under this action, $\mathcal{O}^{ch}_{U_i}$, $i = 0, 1$, become the restricted Wakimoto module with zero highest weight. (Restricted here means that the level is critical, and the Sugawara operators act by zero.) It follows from (5.23) that the $\widehat{\mathfrak{sl}}_2$-action comes from $\Gamma(X; \mathcal{O}^{ch}_X)$ and, therefore, $\mathcal{O}^{ch}_{U_0} \cap \mathcal{O}^{ch}_{U_1}$ is also an $\widehat{\mathfrak{sl}}_2$-module. It follows from [FF1] or [M] that each $\mathcal{O}^{ch}_{U_i}$ contains a unique proper submodule which is isomorphic to the irreducible vacuum representation. To complete the proof, it remains to show that $\mathcal{O}^{ch}_{U_0} \ne \mathcal{O}^{ch}_{U_0} \cap \mathcal{O}^{ch}_{U_1}$, and this is obvious. △

5.8. In fact, the first cohomology space $H^1(X; \mathcal{O}^{ch}_X)$ is also isomorphic to the same irreducible $\widehat{\mathfrak{sl}}_2$-module. To prove this, let us compute the Euler character
$$\mathrm{ch}(X; \mathcal{O}^{ch}_X) = \sum_{N=0}^{\infty} \chi(X; \mathcal{O}^{ch(N)}_X)\cdot q^N$$
in two different ways. First, by definition
$$\mathrm{ch}(X; \mathcal{O}^{ch}_X) = \mathrm{ch}(\Gamma(X; \mathcal{O}^{ch}_X)) - \mathrm{ch}(H^1(X; \mathcal{O}^{ch}_X)).$$
By Theorem 5.7 and [M],
$$\mathrm{ch}(\Gamma(X; \mathcal{O}^{ch}_X)) = (1-q)^{-1}\prod_{N=1}^{\infty}(1-q^N)^{-2}.$$
On the other hand, formulas (5.23) imply that $\mathcal{O}^{ch}_X$ carries a filtration $F$ such that the image of $a_{-n}$ (resp., $b_{-n}$) ($n \ge 1$) in $\mathrm{Gr}^F$ is a vector field (resp., a 1-form). It follows that each monomial $a_{-n_1}\cdot\ldots\cdot a_{-n_r}b_{-m_1}\cdot\ldots\cdot b_{-m_s}$ contributes $2s - 2r + 1$ in $\chi(X; \mathcal{O}^{ch(N)}_X)$, where $N = \sum n_i + \sum m_j$. Therefore,
$$\mathrm{ch}(X; \mathcal{O}^{ch}_X) = \prod_{N=1}^{\infty}(1-q^N)^{-2},$$
hence
$$\mathrm{ch}(H^1(X; \mathcal{O}^{ch}_X)) = q\cdot\mathrm{ch}(\Gamma(X; \mathcal{O}^{ch}_X)).$$
In other words, $H^1(X; \mathcal{O}^{ch}_X)$ has the same (up to the shift by $q$) character as $\Gamma(X; \mathcal{O}^{ch}_X)$. Again by [M], these two spaces are isomorphic as $\hat{\mathfrak{g}}$-modules.
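The two steps of the character computation in 5.8 can be spelled out as follows. This is only a sketch: the symbol $p_2(N)$, for the number of pairs of partitions of total size $N$, is notation introduced here and is not used in the original text.

```latex
% p_2(N) := number of monomials a_{-n_1}...a_{-n_r} b_{-m_1}...b_{-m_s} with
% \sum n_i + \sum m_j = N, i.e. the coefficient of q^N in \prod_{N\ge1}(1-q^N)^{-2}.
%
% Step 1. Exchanging the roles of the n_i and the m_j is an involution on the set of
% monomials of weight N that sends s - r to r - s, so the terms 2(s - r) cancel:
\[
  \chi\bigl(X;\mathcal{O}^{ch(N)}_X\bigr)
  = \sum_{\text{monomials of weight } N} (2s - 2r + 1) = p_2(N),
  \qquad
  \mathrm{ch}(X;\mathcal{O}^{ch}_X) = \prod_{N\ge 1}(1-q^N)^{-2}.
\]
% Step 2. Subtracting from the character of the global sections:
\[
  \mathrm{ch}\,H^1
  = \Bigl(\frac{1}{1-q} - 1\Bigr)\prod_{N\ge 1}(1-q^N)^{-2}
  = \frac{q}{1-q}\prod_{N\ge 1}(1-q^N)^{-2}
  = q\cdot\mathrm{ch}\,\Gamma .
\]
```

Note that Step 1 does not depend on the sign convention for $s - r$, since only the count of monomials survives the summation.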
C. Flag Manifolds

5.9. Let $G$ be a simple algebraic group, $B \subset G$ a Borel subgroup, $N \subset B$ the maximal nilpotent subgroup. The manifold $N$ is isomorphic to the affine space, and is a $(\mathfrak{g}, B)$-scheme, where $B$ acts by conjugation. Consider the Heisenberg vertex algebra $V$ associated with the affine space $N$. According to [FF2], $V$ admits a structure of a $\hat{\mathfrak{g}}$-module (Wakimoto module); in particular, $V$ is a $(\mathfrak{g}, B)$-module. Note that $x \in \mathfrak{g}$ acts on $V$ as $\int X(z)$ for some $X \in V$. Consequently, considered as an affine space, $V$ admits a structure of a $(\mathfrak{g}, B)$-scheme. Let $M$ be the algebra of functions on $V$. Proceeding as in 3.9, with $K = B$, $\hat X = G$, and $X = G/B$, we get the sheaf of ind-schemes $U \mapsto \mathrm{Spec}(H_\nabla(\Delta(M)))$ on $X$. The sheaf of its $\mathbb{C}$-points is called the chiral structure sheaf of $X$ and denoted by $\mathcal{O}^{ch}_X$.

5.10. If $G = SL(n)$ then the sheaf $\mathcal{O}^{ch}_X$ admits a more explicit construction, using charts and gluing functions. In this case $X = GL(n)/(B \times \mathbb{C}^*)$. The Weyl group is identified with the symmetric group $S_n$ and can be realized as the subgroup of $GL(n)$ consisting of permutation matrices. One checks that in terms of the Lie algebra $\mathfrak{gl}(n)$, the simple permutation $r_i$ (interchanging $i$ and $i+1$) can be written as follows:
$$r_i = \exp(\pi\sqrt{-1}\,E_{ii})\exp(E_{i+1,i})\exp(-E_{i,i+1})\exp(E_{i+1,i}), \tag{5.24}$$
where $E_{ij}$ ($1 \le i, j \le n$) form the standard base of $\mathfrak{gl}(n)$.

The manifold $X$ is covered by $|S_n| = n!$ charts, the chart associated with an element $w \in S_n$ being $U_w = wNw_0B$, where $N \subset B$ is the unipotent subgroup consisting of all upper-triangular matrices and $w_0 \in S_n$ is the element of maximal length. Let us identify $U_w$ with $N$ using the bijection $n \mapsto wnw_0B$. Under this identification, if $x \in U_{w_1} \cap U_{w_2}$, then the change from the coordinates determined by $U_{w_1}$ to the ones determined by $U_{w_2}$ is given by $x \mapsto w_2^{-1}w_1x$.

Each $U_w$ may be identified with the affine space $\mathbb{C}^{n(n-1)/2}$. To define $\mathcal{O}^{ch}_X$, we declare that $\mathcal{O}^{ch}_X|_{U_w} = \mathcal{O}^{ch}_{U_w}$, where the last sheaf is defined in 3.4. Now we have to glue these sheaves over the pairwise intersections in a consistent manner.
Let $V$ denote the vertex algebra $V_{n(n-1)/2} = \Gamma(U_w; \mathcal{O}^{ch}_{U_w})$. First, we extend the $\widehat{\mathfrak{sl}}(n)$-module structure on $V$ to a $\widehat{\mathfrak{gl}}(n)$-module structure. For that, in addition to the formulae in [FF1], p. 279, define
$$E_{ii}(z) = -\sum_{j>i} b^{ij}a^{ij}(z) + \sum_{j<i} b^{ji}a^{ji}(z). \tag{5.25}$$
It is easily checked that in this way we indeed get an action of $\widehat{\mathfrak{gl}}(n)$ on $V$.

For any $A \in \mathrm{End}(V)$, introduce the formal exponent
$$\exp(tA) : V \longrightarrow V \otimes \mathbb{C}[[t]], \qquad \exp(tA)(v) = \sum_{i=0}^{\infty} \frac{A^i(v)}{i!}\,t^i. \tag{5.26}$$
Working over $\mathbb{C}[[t]]$, we can easily compose such maps. Motivated by (5.24), set
$$r_i(t) = \exp\Bigl(t\pi\sqrt{-1}\int E_{ii}(z)\Bigr)\cdot\exp\Bigl(t\int E_{i+1,i}(z)\Bigr)\exp\Bigl(-t\int E_{i,i+1}(z)\Bigr)\exp\Bigl(t\int E_{i+1,i}(z)\Bigr). \tag{5.27}$$
Thus, $r_i(t)$ is a map $V \to V[[z, z^{-1}]][[t]]$. Note that $V[[t]]$ is naturally a vertex algebra. It contains the vertex subalgebra "generated by functions, rational in $t$", that is,
$$V(t) = R(N \times \mathbb{C}) \otimes_A V \subset V[[t]]. \tag{5.28}$$
Here $A = \Gamma(N; \mathcal{O}_N)$, and $R(N \times \mathbb{C})$ denotes the ring of rational functions on $N \times \mathbb{C}$, regular at $N \times \{0\}$.

Lemma 5.11. (a) $r_i(t)V \subset V(t)$. Further, $r_i(t)V$ is generated by functions well-defined for any value of $t$.
(b) $r_i(1)$ is well defined (by (a)), and determines a map
$$r_i(1) : \Gamma(U_w; \mathcal{O}^{ch}_{U_w}) \longrightarrow \Gamma(U_w \cap U_{r_iw}; \mathcal{O}^{ch}_{U_w}). \tag{5.29}$$

As $\int X(z)$, $X \in V$, is a derivation, the map $r_i(t)$ is an embedding of vertex algebras. Therefore, it is enough to compute $r_i(t)$ on generators. There are two types of generators in $V$: (i) those coming from the subspace $W \subset V$, canonically isomorphic to $\mathfrak{g} = \mathfrak{sl}(n)$, such that the fields $X(z)$ ($X \in W$) generate the action of $\hat{\mathfrak{g}}$; (ii) those coming from $\mathbb{C}[b^{ij}_n]\mathbf{1} \subset V$. Obviously, the endomorphisms $\int E_{ij}(z)$ ($i \ne j$) act on $W$ as $E_{ij}$ on $\mathfrak{sl}(n)$; in particular, this action is nilpotent, and $r_i(t)$ is polynomial in $t$. As for $\mathbb{C}[b^{ij}_n]\mathbf{1}$, this subspace is identified with the symmetric algebra $S^\bullet(\Omega^1(U_w))$. For any $X \in \mathfrak{gl}(n) \subset \Gamma(X; T_X)$, the element $\int X(z)$ acts on this space as the Lie derivative along $X$. Consequently, $\exp(\int X(z))$ maps $S^\bullet(\Omega^1(U_w))$ into $S^\bullet(\Omega^1(\exp(X)\cdot U_w))$.

Repeated application of this lemma gives for any $v = r_{i_1}\cdot\ldots\cdot r_{i_k} \in S_n$ the map
$$v(1) = r_{i_1}\cdot\ldots\cdot r_{i_k} : \Gamma(U_w; \mathcal{O}^{ch}_{U_w}) \longrightarrow \Gamma(U_w \cap U_{v\cdot w}; \mathcal{O}^{ch}_{U_w}). \tag{5.30}$$
Finally, to complete our construction of $\mathcal{O}^{ch}_X$, glue the sheaves $\mathcal{O}^{ch}_{U_{w_1}}$ and $\mathcal{O}^{ch}_{U_{w_2}}$ using the maps
$$\Gamma(U_{w_1}; \mathcal{O}^{ch}_{U_{w_1}}) \longrightarrow \Gamma(U_{w_1} \cap U_{w_2}; \mathcal{O}^{ch}_{U_{w_2}}) \longleftarrow \Gamma(U_{w_2}; \mathcal{O}^{ch}_{U_{w_2}}), \tag{5.31}$$
where the first (resp. second) arrow is $w_2w_1^{-1}(1)$ (resp., $w_1w_2^{-1}(1)$). Since the gluing maps are induced by the action of $S_n$, they are transitive, and the sheaf $\mathcal{O}^{ch}_X$ is well defined.

Example 5.12. Let $\mathfrak{g} = \mathfrak{sl}(2)$. We have $N = \mathbb{A}^1 \subset X = \mathbb{P}^1$; $b$ is the coordinate on $N$ such that generators of $\mathfrak{gl}(2)$ act as the following vector fields: $E_{21} \mapsto -\partial_b$, $E_{12} \mapsto b^2\partial_b$, $E_{11} \mapsto b\partial_b$. One easily calculates that
$$\exp(E_{21}) : b \mapsto b - 1; \qquad \exp(-E_{12}) : b \mapsto b/(b+1); \qquad \exp(\pi\sqrt{-1}\,E_{11}) : b \mapsto -b.$$
The simple reflection is
$$r = \exp(\pi\sqrt{-1}\,E_{11})\exp(E_{21})\exp(-E_{12})\exp(E_{21}) : b \mapsto b^{-1}. \tag{5.32}$$
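The composition in (5.32) can be followed step by step. In the sketch below the three exponentials are read as algebra automorphisms acting on the coordinate function $b$, applied from right to left; this reading is an assumption made here for the illustration, and composing the same substitutions in the opposite order also ends at $b^{-1}$.

```latex
\begin{align*}
  b \;&\xrightarrow{\ \exp(E_{21})\ }\; b - 1 \\
      &\xrightarrow{\ \exp(-E_{12})\ }\; \frac{b}{b+1} - 1 = -\frac{1}{b+1} \\
      &\xrightarrow{\ \exp(E_{21})\ }\; -\frac{1}{(b-1)+1} = -\frac{1}{b} \\
      &\xrightarrow{\ \exp(\pi\sqrt{-1}\,E_{11})\ }\; -\frac{1}{-b} = \frac{1}{b}.
\end{align*}
% Each one-parameter flow is elementary:
%   -\partial_b integrates to b \mapsto b - t,
%   -b^2\partial_b integrates to b \mapsto b/(1 + tb),
%   b\partial_b integrates to b \mapsto e^{t} b, which at t = \pi\sqrt{-1} is b \mapsto -b.
```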
To do the chiral analogue of this computation, recall (cf. [W]) that $E_{21}(z) = -a(z)$, $E_{12}(z) = b^2a(z) + 2b(z)'$, $E_{11}(z) = ba(z)$. A direct computation using the Wick theorem yields
$$r(1)a_{-1}\mathbf{1} = (b^2a_{-1} + 2b_{-1})\mathbf{1}, \tag{5.33}$$
which coincides with (5.23b). Computation of $r(1)b\mathbf{1}$ does not differ from the classical one, see (5.32).

Theorem 5.13. The space $\Gamma(X; \mathcal{O}^{ch}_X)$ admits a natural structure of a $\hat{\mathfrak{g}}$-module at the critical level, such that it is a submodule of the restricted Wakimoto module, and its conformal weight zero component is 1-dimensional.
Proof is not much different from that of Theorem 5.7. The space of sections $W = \Gamma(N; \mathcal{O}^{ch}_X)$ over the big cell is the restricted Wakimoto module with the zero highest weight. Restricted here again means that the center of $U(\hat{\mathfrak{g}})_{loc}$ acts trivially. As the action of $B$ on $V$ preserves $\hat{\mathfrak{g}} \subset \mathrm{End}(V)$, $\hat{\mathfrak{g}}$ is spanned by the Fourier modes of fields associated with certain elements of $\Gamma(X; \mathcal{O}^{ch}_X)$. Therefore, $\Gamma(X; \mathcal{O}^{ch}_X)$ is a $\hat{\mathfrak{g}}$-module at the critical level, and there arises a $\hat{\mathfrak{g}}$-morphism
$$\Gamma(X; \mathcal{O}^{ch}_X) \longrightarrow W. \tag{5.34}$$
By construction, the sheaf $\mathcal{O}^{ch}_X$ admits a canonical filtration whose associated quotients are free $\mathcal{O}_X$-modules of finite rank. Therefore, the map (5.34) is an injection. By construction, the conformal weight zero component of $\Gamma(X; \mathcal{O}^{ch}_X)$ is equal to $\Gamma(X; \mathcal{O}_X) = \mathbb{C}$. △

This theorem is less precise than 5.7, the reason being that the representation-theoretic result of [FF1, M] mentioned in 5.7 is only available in the $\mathfrak{sl}(2)$-case. However, Theorem 5.1 of [FF1] makes it plausible that this representation-theoretic statement carries over to any $\mathfrak{g}$, and hence that $\Gamma(X; \mathcal{O}^{ch}_X)$ is in fact the irreducible vacuum $\hat{\mathfrak{g}}$-module.

5.14. Note that the whole sheaf $\mathcal{O}^{ch}_X$ admits a structure of a $\hat{\mathfrak{g}}$-module at the critical level.
5.15. Localization for non-zero highest weight. For any integral weight $\lambda$ of $\mathfrak{g}$, there exists a twisted analogue of $\mathcal{O}^{ch}_X$, to be denoted by $\mathcal{L}^{ch}_\lambda$. Its construction repeats word for word 5.9, except that the action of $(\mathfrak{g}, B)$ on $V$ is to be twisted by $\lambda$, see [W, FF1]. If $\lambda$ is a regular dominant weight, then Theorems 5.7 and 5.13, along with their proofs, generalize in the obvious way. For example, the word "vacuum" in the formulation of Theorem 5.7 is to be replaced with "highest weight $\lambda$". Likewise, the claim that "conformal weight zero component is 1-dimensional" should be replaced with the following: "conformal weight zero component is isomorphic to $\Gamma(X; \mathcal{L}_\lambda)$", where $\mathcal{L}_\lambda$ is the standard line bundle on $X$.

One may want to regard $\mathcal{L}^{ch}_\lambda$ as a sheaf of modules over a sheaf of vertex algebras, just as $\mathcal{O}^{ch}_X$ is a sheaf of modules over itself. From this point of view, $\mathcal{O}^{ch}_X$ is a chiral analogue of the sheaf $\mathcal{D}_X$ of differential operators on $X$. Thus, one cannot expect $\mathcal{L}^{ch}_\lambda$ to be a module over $\mathcal{O}^{ch}_X$. We believe (and have checked this for $\mathfrak{g} = \mathfrak{sl}(2)$) that one can define a sheaf of vertex algebras $\mathcal{O}^{ch}_\lambda$ which acts on $\mathcal{L}^{ch}_\lambda$ and is locally isomorphic to $\mathcal{O}^{ch}_X$. Therefore, this sheaf can be regarded as a chiral partner of the sheaf $\mathcal{D}_\lambda$ of twisted differential operators on $X$.

6. Alternative Construction

In this section we will outline another construction of our vertex algebras, and some generalizations. The details will appear in a separate paper, see [MS].

6.1. Recall (cf. [K], 4.9) that a graded vertex algebra is a pair $(V, H)$, where $V$ is a vertex algebra and $H : V \to V$ is a diagonalizable linear operator (Hamiltonian) such that
$$[H, a(z)] = z\partial_z a(z) + (Ha)(z) \tag{6.1.1}$$
for each $a \in V$. For example, a conformal vertex algebra is graded by the Hamiltonian $L_0$. The eigenvalues of $H$ are called conformal weights. By $V^{(\Delta)}$ we will denote the eigenspace of conformal weight $\Delta$. We have
$$a_{(n)}b \in V^{(\Delta+\Delta'-n-1)} \quad\text{for } a \in V^{(\Delta)},\ b \in V^{(\Delta')}. \tag{6.1.2}$$
Let us call a graded vertex algebra restricted if it has no negative integer conformal weights. Let us fix a restricted vertex algebra $V$. To simplify the notations, we assume that $V$ is purely even. We would like to describe the structure which is induced on the space $V^{\le 1} := V^{(0)} \oplus V^{(1)}$ by the vertex algebra structure on $V$. We omit all computations; all the claims below are deduced directly from the Borcherds identity [K], Proposition 4.8 (b) and from op. cit. (4.2.3).

6.2. The operation $a_{(-1)}b$ in $V$ will be denoted simply by $ab$.

(a) The space $V^{(0)}$ is a commutative associative unital $\mathbb{C}$-algebra with respect to the operation $ab$. This algebra will be denoted $A$. The unity is the vacuum, to be denoted by $\mathbf{1}$.

The space $A$ acts by the left multiplication on $V^{(1)}$. However, this does not make $V^{(1)}$ an $A$-module: the multiplication by $A$ is not associative in general.

We have the map $L_{-1} : A \to V^{(1)}$. Let $\Omega \subset V^{(1)}$ denote the subspace spanned by the elements $a\partial b$, $a, b \in A$. Thus, $L_{-1}$ induces the map
$$d : A \longrightarrow \Omega. \tag{6.2.1}$$
(b) The left multiplication by $A$ makes $\Omega$ a left $A$-module. We have
$$a(db) = (db)a \quad (a, b \in A). \tag{6.2.2}$$
(c) The map $d$ is a derivation, i.e.
$$d(ab) = a\,db + b\,da. \tag{6.2.3}$$
Let us denote by $T$ the quotient space $V^{(1)}/\Omega$.

(d) The left multiplication by $A$ makes $T$ a left $A$-module.

Consider the operation
$${}_{(0)} : V^{(1)} \otimes V^{(1)} \longrightarrow V^{(1)}. \tag{6.2.4}$$
(e) The operation (6.2.4) induces a Lie bracket on $T$, to be denoted $[\ ,\ ]$.

Consider the operation
$${}_{(0)} : V^{(1)} \otimes A \longrightarrow A. \tag{6.2.5}$$
(f) The operation (6.2.5) vanishes on the subspace $\Omega \otimes A$, and induces on $A$ a structure of a module over the Lie algebra $T$. This action will be denoted by $\tau(a)$ ($a \in A$, $\tau \in T$).

(g) The Lie algebra $T$ acts on $A$ by derivations,
$$\tau(ab) = \tau(a)b + a\tau(b). \tag{6.2.6}$$
(h) We have
$$(a\tau)(b) = a\tau(b). \tag{6.2.7}$$
The properties (d)–(h) mean that $T$ is a Lie algebroid over $A$.

(i) The operation (6.2.4) induces a structure of a module over the Lie algebra $T$ on the space $\Omega$. This action will be denoted $\tau(\omega)$ or $\tau\omega$ ($\tau \in T$, $\omega \in \Omega$).

(j) We have
$$\tau(a\omega) = a\tau(\omega) + \tau(a)\omega \quad (a \in A,\ \tau \in T,\ \omega \in \Omega). \tag{6.2.8}$$
(k) The differential $d : A \to \Omega$ is compatible with the $T$-module structure.

It follows from (j) and (k) that

(l) We have
$$\tau(a\,db) = \tau(a)\,db + a\,d(\tau(b)) \quad (\tau \in T,\ a, b \in A). \tag{6.2.9}$$
Consider the operation
$${}_{(1)} : V^{(1)} \otimes V^{(1)} \longrightarrow A. \tag{6.2.10}$$
(m) The map (6.2.10) vanishes on the subspace $\Omega \otimes \Omega$. Therefore, it induces the pairing
$$\langle\ ,\ \rangle : \Omega \otimes T \oplus T \otimes \Omega \longrightarrow A. \tag{6.2.11}$$
This pairing is $A$-bilinear and symmetric. We have
$$\langle\tau, a\,db\rangle = a\tau(b) \quad (\tau \in T,\ a, b \in A). \tag{6.2.12}$$
(n) We have
$$(a\tau)(\omega) = a\tau(\omega) + \langle\tau, \omega\rangle\,da \quad (a \in A,\ \tau \in T,\ \omega \in \Omega). \tag{6.2.13}$$
6.3. Let us denote by $\hat T$ the space $V^{(1)}/dA$. The operation (6.2.4) induces a Lie bracket on the space $\hat T$. The subspace $\Omega/dA \subset \hat T$ is an abelian Lie ideal. The adjoint action of $T = \hat T/(\Omega/dA)$ coincides with the action defined by (i) and (l). Thus, we have an extension of Lie algebras
$$0 \longrightarrow \Omega/dA \longrightarrow \hat T \longrightarrow T \longrightarrow 0. \tag{6.3.1}$$
Note that this extension is not central in general.

6.4. Let us denote the space $V^{(1)}$ by $\tilde T$. Thus, we have an exact sequence of vector spaces
$$0 \longrightarrow \Omega \longrightarrow \tilde T \longrightarrow T \longrightarrow 0. \tag{6.4.1}$$
Both arrows are compatible with the left multiplication by $A$. Let $\pi$ denote the projection $\pi : \tilde T \to T$. Let us define the "bracket" $[\ ,\ ] : \Lambda^2\tilde T \to \tilde T$ by
$$[x, y] = \frac{1}{2}\bigl(x_{(0)}y - y_{(0)}x\bigr) \quad (x, y \in \tilde T). \tag{6.4.2}$$
This bracket does not make $\tilde T$ a Lie algebra: the Jacobi identity is in general violated. Set
$$J(x, y, z) = [[x, y], z] + [[y, z], x] + [[z, x], y] \quad (x, y, z \in \tilde T). \tag{6.4.3}$$
Consider the operation (6.2.10).

(a) The operation (6.2.10) is symmetric. It will be denoted by $\langle x, y\rangle$.

Let us define the map $I : \Lambda^3\tilde T \to A$ by
$$I(x, y, z) = \langle x, [y, z]\rangle + \langle y, [z, x]\rangle + \langle z, [x, y]\rangle. \tag{6.4.4}$$
(b) We have
$$J(x, y, z) = \frac{1}{6}\,dI(x, y, z). \tag{6.4.5}$$
6.5. Let us choose a splitting
$$s : T \longrightarrow \tilde T \tag{6.5.1}$$
of the projection $\pi$. Let us define the map
$$\langle\ ,\ \rangle = \langle\ ,\ \rangle_s : S^2T \longrightarrow A \tag{6.5.2}$$
by
$$\langle\tau_1, \tau_2\rangle = \langle s(\tau_1), s(\tau_2)\rangle \tag{6.5.3}$$
(we put the lower index $s$ in the notation if we want to stress the dependence on the splitting $s$). Let us define the map
$$c^2 = c^2_s : \Lambda^2T \longrightarrow \Omega \tag{6.5.4}$$
by
$$c^2(\tau_1, \tau_2) = s([\tau_1, \tau_2]) - [s(\tau_1), s(\tau_2)]. \tag{6.5.5}$$
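The symmetrized bracket (6.4.2) differs from the operation ${}_{(0)}$ by an exact term. The following sketch derives this from the standard skew-symmetry identity for (purely even) vertex algebras, cf. [K]; that identity is not stated in the text above, and its use here, with $L_{-1}$ as the translation operator, is an assumption of this illustration.

```latex
% Skew symmetry (even case):
%   a_{(n)} b = \sum_{j \ge 0} (-1)^{n+j+1} \frac{1}{j!} L_{-1}^{\,j} \bigl( b_{(n+j)} a \bigr).
% For x, y \in \tilde T = V^{(1)} and n = 0:
\[
  x_{(0)}y = -\,y_{(0)}x + L_{-1}\bigl(y_{(1)}x\bigr) - \tfrac{1}{2}L_{-1}^{2}\bigl(y_{(2)}x\bigr) + \dots
\]
% By (6.1.2), y_{(j)}x has conformal weight 1 - j, which is negative for j \ge 2;
% in a restricted vertex algebra these terms vanish. Since y_{(1)}x = \langle y, x\rangle \in A
% and L_{-1} restricted to A is d, we get
\[
  x_{(0)}y + y_{(0)}x = d\langle x, y\rangle ,
  \qquad\text{hence}\qquad
  x_{(0)}y = [x, y] + \tfrac{1}{2}\,d\langle x, y\rangle .
\]
```

The same identity with $n = 1$ gives $x_{(1)}y = y_{(1)}x$, which is the symmetry stated in 6.4 (a).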
Let us define the map $K : \Lambda^3T \to A$ by
$$K(\tau_1, \tau_2, \tau_3) = \langle s(\tau_1), s([\tau_2, \tau_3])\rangle + \langle s(\tau_2), s([\tau_3, \tau_1])\rangle + \langle s(\tau_3), s([\tau_1, \tau_2])\rangle. \tag{6.5.6}$$
Let us define the map $c^3 = c^3_s : \Lambda^3T \to A$ by
$$c^3(\tau_1, \tau_2, \tau_3) = -\frac{1}{2}K(\tau_1, \tau_2, \tau_3) + \frac{1}{3}I(s(\tau_1), s(\tau_2), s(\tau_3)), \tag{6.5.7}$$
cf. (6.4.4). Let us regard $c^2$ (resp. $c^3$) as Lie algebra cochains living in $C^2(T; \Omega)$ (resp., in $C^3(T; A)$).

(a) We have
$$d_{Lie}(c^2) = dc^3. \tag{6.5.8}$$
(b) We have
$$d_{Lie}(c^3) = 0. \tag{6.5.9}$$
The identities (a) and (b) mean that the pair $c = (c^2, c^3)$ forms a 2-cocycle of the Lie algebra $T$ with coefficients in the complex $A \to \Omega$.

(c) We have
$$\langle[\tau_1, \tau_2], \tau_3\rangle + \langle\tau_2, [\tau_1, \tau_3]\rangle = \tau_1(\langle\tau_2, \tau_3\rangle) - \frac{1}{2}\tau_2(\langle\tau_1, \tau_3\rangle) - \frac{1}{2}\tau_3(\langle\tau_1, \tau_2\rangle) - \langle\tau_2, c^2(\tau_1, \tau_3)\rangle - \langle\tau_3, c^2(\tau_1, \tau_2)\rangle. \tag{6.5.10}$$
Let us investigate the effect of the change of a splitting. Let $s' : T \to \tilde T$ be another splitting of $\pi$. The difference $s' - s$ lands in $\Omega$; let us denote it
$$\omega = \omega_{s,s'} : T \longrightarrow \Omega. \tag{6.5.11}$$
We regard $\omega$ as a 1-cochain of $T$ with coefficients in $\Omega$. Let us define a 2-cochain $\alpha = \alpha_{s,s'} \in C^2(T; A)$ by
$$\alpha(\tau_1, \tau_2) = \frac{1}{2}\bigl(\langle\omega(\tau_1), \tau_2\rangle - \langle\tau_1, \omega(\tau_2)\rangle\bigr). \tag{6.5.12}$$
(d) We have
$$c^2_s - c^2_{s'} = d_{Lie}(\omega) - d\alpha. \tag{6.5.13}$$
(e) We have
$$-d_{Lie}(\alpha) = c^3_s - c^3_{s'}. \tag{6.5.14}$$
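The properties (d) and (e) mean that
$$c_s - c_{s'} = d_{Lie}(\beta), \tag{6.5.15}$$
where $\beta = \beta_{s,s'} := (\omega, \alpha) \in C^1(T; A \to \Omega)$. Therefore, we have assigned to our vertex algebra a canonically defined "characteristic class"
$$c(V) = c(V^{\le 1}) \in H^2(T; A \to \Omega) \tag{6.5.16}$$
as the cohomology class of the cocycle $c_s$.

6.6. Let us introduce the mapping
$$\gamma = \gamma_s : A \otimes T \longrightarrow \Omega \tag{6.6.1}$$
by
$$\gamma(a, \tau) = s(a\tau) - as(\tau). \tag{6.6.2}$$
(a) We have
$$\gamma(ab, \tau) = \gamma(a, b\tau) + a\gamma(b, \tau) + \tau(a)\,db + \tau(b)\,da. \tag{6.6.3}$$
(b) We have
$$\langle a\tau_1, \tau_2\rangle = a\langle\tau_1, \tau_2\rangle + \langle\gamma(a, \tau_1), \tau_2\rangle - \tau_1\tau_2(a). \tag{6.6.7}$$
(c) We have
$$c^2(a\tau_1, \tau_2) = ac^2(\tau_1, \tau_2) + \gamma(a, [\tau_1, \tau_2]) - \gamma(\tau_2(a), \tau_1) + \tau_2\gamma(a, \tau_1) - \frac{1}{2}\langle\tau_1, \tau_2\rangle\,da + \frac{1}{2}d(\tau_1\tau_2(a)) - \frac{1}{2}d(\langle\tau_2, \gamma(a, \tau_1)\rangle) \tag{6.6.8}$$
($a \in A$, $\tau_i \in T$).

(d) We have
$$c^3(a\tau_1, \tau_2, \tau_3) = ac^3(\tau_1, \tau_2, \tau_3) + \frac{1}{2}\tau_1[\tau_2, \tau_3](a) - \frac{1}{2}\bigl(\langle\tau_2, \gamma(a, [\tau_3, \tau_1])\rangle - \langle\tau_3, \gamma(a, [\tau_2, \tau_1])\rangle\bigr) + \langle\tau_2, \gamma(\tau_3(a), \tau_1)\rangle - \langle\tau_3, \gamma(\tau_2(a), \tau_1)\rangle + \frac{1}{2}\bigl(\langle\tau_2, \tau_3\gamma(a, \tau_1)\rangle - \langle\tau_3, \tau_2\gamma(a, \tau_1)\rangle\bigr) - \frac{1}{2}\langle[\tau_2, \tau_3], \gamma(a, \tau_1)\rangle. \tag{6.6.9}$$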
6.7. Let us call a prevertex algebra the data (a)–(f) below.

(a) A commutative algebra $A$.

(b) An $A$-module $\Omega$, together with an $A$-derivation $d : A \to \Omega$. We assume that $\Omega$ is generated as a vector space by the elements $a\,db$ ($a, b \in A$), i.e. the canonical map $\Omega^1(A) := \Omega^1_{\mathbb{C}}(A) \to \Omega$ is surjective.
(c) An $A$-Lie algebroid $T$. Define the action of $T$ on $\Omega$ by
$$\tau(a\,db) = \tau(a)\,db + a\,d(\tau(b)), \tag{6.7.1}$$
cf. (6.2.9). We assume that this action is well defined. It follows that $d$ is compatible with the action of $T$. We assume that the formula
$$\langle\tau, a\,db\rangle = a\tau(b) \tag{6.7.2}$$
gives a well defined $A$-bilinear pairing $T \times \Omega \to A$.

(d) A $\mathbb{C}$-bilinear mapping $\gamma : A \times T \to \Omega$ satisfying 6.6 (a).

(e) A $\mathbb{C}$-bilinear symmetric mapping $\langle\ ,\ \rangle : T \times T \to A$ satisfying 6.6 (b).

(f) A $\mathbb{C}$-bilinear skew symmetric mapping $c^2 : T \times T \to \Omega$. This map should satisfy 6.5 (c), 6.6 (c), and the property (6.7.7) below.

Let us define the map
$$[\ ,\ ]' := [\ ,\ ] - c^2 : \Lambda^2T \longrightarrow T \oplus \Omega. \tag{6.7.3}$$
Let us define the map
$$c^3 := -\frac{1}{2}\tilde K + \frac{1}{3}\tilde I : \Lambda^3T \longrightarrow A, \tag{6.7.4}$$
where
$$\tilde K(\tau_1, \tau_2, \tau_3) = \langle\tau_1, [\tau_2, \tau_3]\rangle + \langle\tau_2, [\tau_3, \tau_1]\rangle + \langle\tau_3, [\tau_1, \tau_2]\rangle \tag{6.7.5}$$
and
$$\tilde I(\tau_1, \tau_2, \tau_3) = \langle\tau_1, [\tau_2, \tau_3]'\rangle + \langle\tau_2, [\tau_3, \tau_1]'\rangle + \langle\tau_3, [\tau_1, \tau_2]'\rangle. \tag{6.7.6}$$
In the last formula, we imply the symmetric pairing $\langle\ ,\ \rangle$ to be extended to the whole space $T \oplus \Omega$ using (6.7.2), (e), and setting it equal to zero on $\Omega \times \Omega$. Now, with $c^3$ defined above, we require that
$$d_{Lie}(c^2) = dc^3; \qquad d_{Lie}(c^3) = 0. \tag{6.7.7}$$
Let us call a restricted vertex algebra $V$ split if it is given together with a splitting (6.5.1). We have constructed in 6.2–6.6 a functor
$$\mathcal{P} : (\text{Split restricted vertex algebras}) \longrightarrow (\text{Prevertex algebras}). \tag{6.7.8}$$
Claim. Functor $\mathcal{P}$ admits a left adjoint, to be denoted $\mathcal{V}$.

In other words, given a prevertex algebra $P = (A, \Omega, T, \dots)$, the corresponding vertex algebra $\mathcal{V}(P)$ is defined by its universal property: to give a morphism of vertex algebras from $\mathcal{V}(P)$ to an arbitrary split restricted vertex algebra $V'$ is the same as to give a morphism of prevertex algebras $P \to \mathcal{P}(V')$. (Morphisms of prevertex algebras are defined in the obvious manner.)

The construction of $V = \mathcal{V}(P)$ goes in two steps. First, the components $V_0$, $V_1$ and the operations ${}_{(i)}$, $i = -2, -1, 0, 1$ acting on them, are recovered by inverting the discussion 6.2–6.6. For example,
$$V^{(0)} = A; \qquad V^{(1)} = T \oplus \Omega; \tag{6.7.9}$$
$$\tau_{1(0)}\tau_2 = [\tau_1, \tau_2] - c^2(\tau_1, \tau_2) + d\langle\tau_1, \tau_2\rangle, \qquad \tau_{1(1)}\tau_2 = \langle\tau_1, \tau_2\rangle \tag{6.7.10}$$
($\tau_i \in T$), etc. Now, the components of weights $\ge 2$ are recovered by "bootstrap" from the universal property. Note that the set of conformal weights of $\mathcal{V}(P)$ is equal to $\mathbb{Z}_{\ge 0}$ if $\mathcal{V}(P) \ne \mathbb{C}$.
Example 6.8. Let $T$ be a Lie algebra over $\mathbb{C}$ equipped with an invariant bilinear form $\langle\ ,\ \rangle$. Set $A = \mathbb{C}$, $\Omega = 0$, $c^2 = 0$, $\gamma = 0$. This defines a prevertex algebra $P$. Note that the component $c^3$ defined by the rule 6.7 (f) is not equal to zero, but is given by
$$c^3(\tau_1, \tau_2, \tau_3) = -\frac{1}{2}\langle\tau_1, [\tau_2, \tau_3]\rangle. \tag{6.8.1}$$
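Formula (6.8.1) follows directly from (6.7.4)–(6.7.6); the short computation below is a sketch of that step, using only the invariance and symmetry of the form.

```latex
% With \Omega = 0 and c^2 = 0 we have [ , ]' = [ , ], so \tilde K = \tilde I.
% Invariance \langle [x,y], z\rangle = \langle x, [y,z]\rangle together with symmetry gives
%   \langle\tau_2,[\tau_3,\tau_1]\rangle = \langle[\tau_2,\tau_3],\tau_1\rangle
%                                        = \langle\tau_1,[\tau_2,\tau_3]\rangle,
% and likewise for the third cyclic term. Hence
\[
  \tilde K(\tau_1,\tau_2,\tau_3) = \tilde I(\tau_1,\tau_2,\tau_3)
  = 3\,\langle\tau_1,[\tau_2,\tau_3]\rangle ,
\]
\[
  c^3 = -\tfrac{1}{2}\tilde K + \tfrac{1}{3}\tilde I
      = \Bigl(-\tfrac{3}{2} + 1\Bigr)\langle\tau_1,[\tau_2,\tau_3]\rangle
      = -\tfrac{1}{2}\langle\tau_1,[\tau_2,\tau_3]\rangle .
\]
```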
The vertex algebra $\mathcal{V}(P)$ coincides with the vacuum (level 1) representation of the affine Kac–Moody Lie algebra corresponding to $(T, \langle\ ,\ \rangle)$.

Example 6.9. Let $A$ be a $\mathbb{C}$-algebra; set $\Omega = \Omega^1_{\mathbb{C}}(A)$. Let $T_0$ be an abelian Lie algebra over $\mathbb{C}$ acting by derivations on $A$. Set $T = A \otimes_{\mathbb{C}} T_0$. There is a unique Lie bracket on $T$ making it a Lie algebroid over $A$,
$$[a\tau_1, b\tau_2] = a\tau_1(b)\tau_2 - b\tau_2(a)\tau_1 \quad (\tau_i \in T_0,\ a, b \in A). \tag{6.9.1}$$
We set $\langle\tau_1, \tau_2\rangle = 0$; $\gamma(a, \tau) = 0$; $c^2(\tau_1, \tau_2) = 0$; $c^3(\tau_1, \tau_2, \tau_3) = 0$ for $a \in A$, $\tau_i \in T_0$. Then the formulas 6.6 (a)–(d) define the unique extension of the operations $\gamma$, $\langle\ ,\ \rangle$, $c^2$ and $c^3$ to the whole space $T$. Namely,
$$\gamma(a, b\tau) = -\tau(a)\,db - \tau(b)\,da, \tag{6.9.2}$$
$$\langle a\tau_1, b\tau_2\rangle = -a\tau_2\tau_1(b) - b\tau_1\tau_2(a) - \tau_1(b)\tau_2(a). \tag{6.9.3}$$
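Formulas (6.9.2) and (6.9.3) can be obtained from 6.6 (a) and (b) as follows. This is a sketch in which $\tau, \tau_1, \tau_2 \in T_0$, so that $\gamma(\cdot, \tau_i) = 0$ and $\langle\tau_1, \tau_2\rangle = 0$ by the initial conditions, and $\tau_1\tau_2(a)$ is read as $\tau_1(\tau_2(a))$.

```latex
% (6.9.2): apply (6.6.3) with \tau \in T_0; the left side and a\gamma(b,\tau) vanish:
\[
  0 = \gamma(ab,\tau) = \gamma(a,b\tau) + a\,\gamma(b,\tau) + \tau(a)\,db + \tau(b)\,da
  \;\Longrightarrow\;
  \gamma(a,b\tau) = -\tau(a)\,db - \tau(b)\,da .
\]
% (6.9.3): first, (6.6.7) with \tau_1, \tau_2 \in T_0 gives
\[
  \langle a\tau_1,\tau_2\rangle = -\,\tau_1\tau_2(a).
\]
% Next, apply (6.6.7) to \langle b\tau_2, a\tau_1\rangle, with \gamma(b,\tau_2) = 0,
% using the symmetry of the pairing and (a\tau_1)(b) = a\,\tau_1(b), see (6.2.7):
\begin{align*}
  \langle a\tau_1, b\tau_2\rangle
  &= b\,\langle \tau_2, a\tau_1\rangle - \tau_2\bigl(a\,\tau_1(b)\bigr) \\
  &= -\,b\,\tau_1\tau_2(a) - \tau_2(a)\,\tau_1(b) - a\,\tau_2\tau_1(b).
\end{align*}
```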
It is convenient to write down $c = (c^2, c^3)$ as a sum of a simpler cocycle and a coboundary,
$$c^2(a\tau_1, b\tau_2) = {}'c^2(a\tau_1, b\tau_2) + d\beta(a\tau_1, b\tau_2), \tag{6.9.4}$$
where
$${}'c^2(a\tau_1, b\tau_2) = \frac{1}{2}\bigl(\tau_1(b)\,d\tau_2(a) - \tau_2(a)\,d\tau_1(b)\bigr), \tag{6.9.5}$$
$$\beta(a\tau_1, b\tau_2) = \frac{1}{2}\bigl(b\tau_1\tau_2(a) - a\tau_2\tau_1(b)\bigr), \tag{6.9.6}$$
and
$$c^3(a\tau_1, b\tau_2, c\tau_3) = {}'c^3(a\tau_1, b\tau_2, c\tau_3) + d_{Lie}\beta(a\tau_1, b\tau_2, c\tau_3), \tag{6.9.7}$$
where
$${}'c^3(a\tau_1, b\tau_2, c\tau_3) = \frac{1}{2}\bigl\{\tau_1(b)\tau_2(c)\tau_3(a) - \tau_1(c)\tau_2(a)\tau_3(b)\bigr\}. \tag{6.9.8}$$
This gives the prevertex algebra $P$. Note that the cocycle $({}'c^2, {}'c^3)$ coincides with (minus one half of) the cocycle (5.17), (5.18) if $A$ is the polynomial ring.

For example, let $A$ be smooth, and assume that there exists a base $\{\tau_i\}$ of the left $A$-module $T := \mathrm{Der}_{\mathbb{C}}(A)$ consisting of commuting vector fields. Let $T_0 \subset T$ be the $\mathbb{C}$-vector space spanned by $\{\tau_i\}$. This gives a prevertex algebra $P$. The vertex algebra $A^{ch} := \mathcal{V}(P)$ may be called a "chiralization of $A$". This definition depends on the choice of $\{\tau_i\}$, and this is essential; when we change the basis, we may get a non-isomorphic vertex algebra: here the "anomaly" appears.
Specifying even more, let $A$ be a polynomial algebra $A_N$, cf. 3.1. Let $\tau_i = \partial_{b^i}$. Then the vertex algebra $\mathcal{V}(P)$ coincides with the Heisenberg vertex algebra $V_N$.

If $A'$ is an arbitrary commutative $A$-algebra given together with an action of $T_0$ extending its action on $A$, then the base change $P_{A'} = (A', T_{A'} := A' \otimes_A T, \dots)$ has an obvious structure of a prevertex algebra. We have $\mathcal{V}(P_{A'}) = A' \otimes_A \mathcal{V}(P)$. This explains the remark about the base change in 3.3.

There exists a common generalization of the above two examples.

6.10. All the above considerations have an obvious "super" ($\mathbb{Z}/(2)$-graded) version. Let us consider the super version of Example 6.9. Let us start from the de Rham superalgebra $\Omega_A$ of differential forms over an arbitrary smooth algebra $A$. Let us assume that there exists an étale map $\mathrm{Spec}(A) \to \mathbb{A}^N$ given by coordinate functions $\{b^i\}$ (this is true maybe after some Zariski localization). Lifting the coordinate vector fields $a^i = \partial_{b^i}$ to $A$, we get an abelian base in the Lie algebra $\mathrm{Der}(A)$ which gives rise to an abelian base in the Lie superalgebra $\mathrm{Der}(\Omega_A)$. Now, we proceed as in 6.9 (in its super version), and get a vertex superalgebra $\Omega^{ch}_A = \mathcal{V}(P_{\Omega_A})$. The calculation in the proof of Theorem 3.7 shows that this vertex superalgebra does not depend on the choice of local étale coordinates. This is nothing but $\Gamma(X; \Omega^{ch}_X)$ for $X = \mathrm{Spec}(A)$. This may be viewed as an alternative construction (or a version of the construction) of the chiral de Rham complex.

6.11. Chiral Weyl modules. Let us return to the even situation again. Let $V$ be a restricted vertex algebra, let $\mathcal{P}(V) = (A, \Omega, T, \dots)$ be the corresponding prevertex algebra. Assume that the Lie algebra $T$ coincides with $\mathrm{Der}(A)$. Let $\mathcal{M}$ be a graded vertex module over $V$ (the definition of such an object is an obvious modification of the definition of a graded vertex algebra). Assume that $\mathcal{M}$ is restricted, i.e. has no negative integer conformal weights. Consider the weight zero component $M = \mathcal{M}^{(0)}$.
Then the operations $am = a_{(-1)}m$ ($a \in A$, $m \in M$) and $\tau m = x_{(0)}m$ ($\tau \in T$, $x \in \tilde T$ is any representative of $\tau$, $m \in M$) make $M$ a left $\mathcal{D}$-module over $A$. This way we get a functor
$$(\text{Restricted } V\text{-modules}) \longrightarrow (\mathcal{D}_A\text{-modules}). \tag{6.11.1}$$
This functor admits a left adjoint
$$\mathcal{W} : (\mathcal{D}_A\text{-modules}) \longrightarrow (\text{Restricted } V\text{-modules}). \tag{6.11.2}$$
For a $\mathcal{D}$-module $M$, the vertex module $\mathcal{W}(M)$ is called the chiral Weyl module corresponding to $M$.

It is natural to hope this construction applied to flag spaces $G/B$ gives a functor from $\mathcal{D}$-modules over $G/B$ to modules over $\mathcal{O}^{ch}_{G/B}$ which corresponds to the Weyl module construction in the language of representations.

A similar construction gives for an arbitrary smooth manifold $X$ a functor
$$\Omega^{ch} : (\mathcal{D}_X\text{-mod}) \longrightarrow (\Omega^{ch}_X\text{-mod}) \tag{6.11.3}$$
called the chiral de Rham complex of a $\mathcal{D}$-module.
Acknowledgement. The idea of this note arose from reading the papers [LVW] and [LZ]. We learned the first example of “localization along the target space” from B. Feigin, to whom goes our deep gratitude. We thank the referee for the useful remarks which helped to improve the exposition. We are thankful to V. Gorbounov for catching many misprints. After the submission of this note to the alg-geom server, an interesting preprint [B] appeared, in which a possible application of the chiral de Rham complex to Mirror Symmetry was suggested. The work was done while the authors were visiting the Max-Planck-Institut für Mathematik in Bonn. We are grateful to MPI for the excellent working conditions, and especially to Yu.I. Manin for a highly stimulating environment.
References

[BD1] Beilinson, A., Drinfeld, V.: Chiral algebras I. Preprint
[BD2] Beilinson, A., Drinfeld, V.: Quantization of Hitchin’s integrable system and Hecke eigensheaves. Preprint, 1997
[BFM] Beilinson, A., Feigin, B., Mazur, B.: Introduction to algebraic field theory on curves. Preprint
[BS] Beilinson, A., Schechtman, V.: Determinant bundles and Virasoro algebras. Commun. Math. Phys. 118, 651–701 (1988)
[B] Borisov, L.A.: Vertex Algebras and Mirror Symmetry. math.AG/9809094
[FF1] Feigin, B., Frenkel, E.: Representations of affine Kac–Moody algebras and bosonization. In: V. Knizhnik Memorial Volume, L. Brink, D. Friedan, A.M. Polyakov (Eds.), Singapore: World Scientific, 1990, pp. 271–316
[FF2] Feigin, B., Frenkel, E.: Affine Kac–Moody algebras and semi-infinite flag manifolds. Commun. Math. Phys. 128, 161–189 (1990)
[FF3] Feigin, B., Frenkel, E.: Affine Kac–Moody algebras at the critical level and Gelfand–Dikii algebras. Infinite Analysis, Parts A, B. Kyoto, 1991, Adv. Ser. Math. Phys. 16, River Edge, NJ: World Sci. Publishing, 1992, pp. 197–215
[F] Fuchs, D.B.: Cohomology of infinite-dimensional Lie algebras. Contemp. Sov. Math., New York: Consultants Bureau, 1986
[GK] Gelfand, I.M., Kazhdan, D.A.: Some problems of differential geometry and the calculation of cohomologies of Lie algebras of vector fields. Soviet Math. Dokl. 12 No. 5, 1367–1370 (1971)
[HS] Hinich, V., Schechtman, V.: Homotopy Lie algebras. I.M. Gelfand Seminar, Adv. in Sov. Math. 16, Part 2, 1–28 (1993)
[K] Kac, V.: Vertex algebras for beginners. University Lecture Series 10, Providence, RI: American Mathematical Society, 1997
[LVW] Lerche, W., Vafa, C., Warner, P.: Chiral rings in N = 2 superconformal theories. Nucl. Phys. B324, 427–474 (1989)
[LZ] Lian, B., Zuckerman, G.: New perspectives on the BRST-algebraic structure of string theory. Commun. Math. Phys. 154, 613–646 (1993); hep-th/9211072
[M] Malikov, F.: Verma modules over Kac–Moody algebras of rank 2. Leningrad Math. J. 2 No. 2, 269–286 (1990)
[MS] Malikov, F., Schechtman, V.: Chiral de Rham complex. II, math. AGI; D.B. Fuchs’ 60th Anniversary Volume, 1999, to appear
[W] Wakimoto, M.: Fock representations of the affine Lie algebra $A_1^{(1)}$. Comm. Math. Phys. 104, 604–609 (1986)

Communicated by G. Felder
Commun. Math. Phys. 204, 475 – 492 (1999)
Communications in Mathematical Physics
© Springer-Verlag 1999
Theory of Ordered Spaces II. The Local Differential Structure

H.-J. Borchers¹, R. N. Sen²

¹ Institut für Theoretische Physik, Universität Göttingen, Bunsenstr. 9, D-37073 Göttingen, Germany
² Department of Mathematics and Computer Science, Ben-Gurion University, 84105 Beer Sheva, Israel
Received: 26 November 1997 / Accepted: 10 February 1999
Abstract: In this paper we investigate the conditions under which the ordered spaces defined in [1] are locally diffeomorphic to $\mathbb{R}^N$. In Sect. 1 we give an introduction and an overview of the results. In Sect. 2 we show that the axioms of [1] do not suffice to make light rays locally homeomorphic to $\mathbb{R}$. We introduce this structure via the new connectedness axiom 2.13, and work out some of its immediate consequences. In Sect. 3 we give the (somewhat involved) construction of timelike curves in a D-set, which are basic to everything that follows. They are used in Sect. 4 to prove (i) a nested interval theorem for ordered spaces; (ii) the contractibility of order intervals in D-sets; and (iii) that order intervals in D-sets are star-shaped. The notion of D-countability (meaning that a D-set has a countable base in the subspace topology) is introduced in Sect. 5. The Urysohn lemma shows that a D-countable ordered space is locally metrizable. If this space is also locally compact, then it has finite topological dimension N; these results are established in Sect. 6. The local differential structure now follows from known results: the embedding of such spaces in $\mathbb{R}^{2N+1}$, and the result that an open star-shaped region in $\mathbb{R}^n$ is diffeomorphic to $\mathbb{R}^n$. In conclusion, we exhibit these inclusions in Fig. 3, and suggest the possibility that Wigner’s position on the “Unreasonable effectiveness of mathematics in the natural sciences” may be open to reasonable doubt. The axioms of [1] are given in the Appendix.

1. Introduction and Overview

This paper continues the investigation of the implications of Einstein causality for the local structure of space-time which was begun in [1]. In that paper the basic framework, namely the axiomatization of Einstein causality as a partial order, was set up and its local topological consequences explored. The axiomatization of [1] did not use any number system apart from the natural numbers. One sees from Example 2.8 of the present paper that these axioms are not strong enough to force the emergence of the real number system.
If our natural fully ordered subsets, the light rays, are to look locally like intervals of real numbers, they need to satisfy a condition which can be stated in different ways, for example: i) Cantor’s criterion for a linear continuum [3]; ii) the least upper bound property; iii) connectedness; or iv) completeness. A physical principle which would force the emergence of one of these structures has not yet been identified. We have introduced this structure, in as weak a form as seems possible to us, via the new Axiom 2.13, called the connectedness axiom, which supplements the axiom system of [1]. The axiom demands that every open cover of a light ray segment have an overlapping subcover. Unsurprisingly, this property turns out to be a strong homogeneity requirement. It allows us to transport a variety of natural homeomorphisms from the neighbourhood of any point to the neighbourhood of any other point along l-polygons. It also allows us to construct a timelike curve joining two timelike points $a \ll b$ in a D-set, which in turn allows us to carry out a fairly thorough analysis of the local properties of an ordered space. For example, we find that an analogue of the nested interval theorem for complete metric spaces holds locally, and that every closed order interval in a D-set is homeomorphic to every other closed order interval in the same or another D-set. Locally, in our context, means within a D-set.

We now ask whether or not the space is locally metrizable. An example (albeit a metrizable one) shows that a D-set need not be second countable in the subspace topology. We therefore introduce a countability condition 5.2, called D-countability, which requires D-sets to have countable bases in the subspace topology. Spaces which satisfy this condition fulfill, locally, the requirements of Urysohn’s metrization theorem, and are therefore locally metrizable. The D-countability condition is intermediate in strength between first and second countability. The best-known example of a D-countable space is perhaps the long line. However, such spaces may, or may not, be locally compact. If we pick out the locally compact ones, then, from the homogeneity properties established earlier, it follows rather easily that they have finite topological dimension, and therefore can be embedded in $\mathbb{R}^n$ for some n. Full exploitation of the homeomorphism between two closed order intervals shows that order intervals are contractible to any of their interior points; it then follows that they are homeomorphic to the closed unit ball in some $\mathbb{R}^n$. It is a standard result in differential topology that the interior of such a ball is $C^\infty$ diffeomorphic to $\mathbb{R}^n$.

In the following, definitions, results and formulae of [1] are prefixed by the letter A. For the convenience of the reader, the axioms of [1] and the nontriviality and regularity conditions imposed there are reproduced in the Appendix. It turns out that the definition of D-sets given in [1] is not restrictive enough to exclude certain features which, at the present stage of the investigations, may be regarded as too general; a modified definition which is restrictive enough is given in the Appendix. What it says is that, in a D-set, if a light ray intersects a cone boundary at two different points, then it lies wholly on that boundary and passes through the vertex of the cone. We remind the reader that every D-set is assumed to be regular, as defined in AII.4.2, meaning that it is l-connected and contains two points x, y such that $x \ll y$ (see Appendix).

2. Some Topological Questions

In this section we shall establish certain topological properties of light rays and light cones, introduce the new axiom and investigate some of its consequences.
2.1. The order topology on a light ray.

Definition 2.1. Let l be a light ray. By a cut $\xi$ of l we shall mean a decomposition of l into two sets $\xi^+$ and $\xi^-$ such that
1. $\xi^+ \neq \emptyset$ and $\xi^- \neq \emptyset$;
2. $\xi^+ \cap \xi^- = \emptyset$;
3. $\xi^+ \cup \xi^- = l$;
4. $x \in \xi^-$, $y \in \xi^+ \Rightarrow x < y$.

If $\xi$ is a cut of l, then one of the following must be true:
1. $\xi^-$ has a least upper bound, but $\xi^+$ does not have a greatest lower bound.
2. $\xi^-$ has no least upper bound, but $\xi^+$ has a greatest lower bound.
3. $\xi^-$ has no least upper bound, and $\xi^+$ has no greatest lower bound.

If $\xi_1$ and $\xi_2$ are type 3) cuts of l, we shall say that $\xi_1 < \xi_2$ if $\xi_1^+ \supset \xi_2^+$. The fourth logical possibility, namely that $\xi^-$ has a least upper bound and $\xi^+$ has a greatest lower bound, is ruled out by the density part of the order axiom, AI.1.1 b).

Let $\xi_1$ and $\xi_2$ be type 3) cuts of l, with $\xi_1 < \xi_2$. Then $O = \{x \mid x \in \xi_1^+,\ x \notin \xi_2^+\}$ is a convex subset of l which has no end-points in l. We shall call such a set (as also a set which lacks only one end-point in l) a half-ray. If U is a D-set, then $l \cap U$ is either empty, or an open segment¹ or open half-ray. By definition, the subspace topology on l induced by the topology of M has as base the family of intersections of l with the D-sets of M. This base $\mathcal{B}$ consists of open segments and open half-rays. However, the open half-ray O is the union of open segments contained in O:

$O = \bigcup_{x \in O} l(a_x, b_x), \quad a_x, b_x \in O,\ a_x < x < b_x.$

Therefore one can replace $\mathcal{B}$ by a base which consists entirely of open segments. The latter, however, is the base which defines the standard order topology on l. We have proved that:

Proposition 2.2. The subspace topology on l induced by the order topology of M is the standard order topology on l.

2.2. Homeomorphism of light ray segments. A number of useful results follow rather easily from Proposition 2.2. We begin by adding topological information to Theorem AII.3.3.

Proposition 2.3. Let U be a D-set, $x, y \in U$, $x \ll y$, $l_x \ni x$ and $l_y \ni y$. Set $l_x \cap \beta C_y^- = \{p\}$, and $l_y \cap \beta C_x^+ = \{q\}$. Then the segments $l_x[x, p]$ and $l_y[q, y]$ are homeomorphic. We shall denote this as follows: $l_x[x, p] \overset{\mathrm{hom}}{=} l_y[q, y]$.

¹ Recall that the open segment l(a, b) of the light ray l is defined to be the set $l(a, b) = \{x \mid a, b \in l,\ a < x < b\}$.
Proof. Theorem AII.3.3 establishes that there exists a bijective natural map, mediated by light rays, between these two segments. Additionally, this map is order-preserving, and therefore maps open segments onto open segments. The result now follows trivially from Proposition 2.2. □

We now assume a little bit more:

Assumption 2.4. There are at least three distinct light rays through each point of U.

In the following, Assumption 2.4 will be deemed to hold unless the contrary is explicitly stated. When it holds, there are enough natural maps in a D-set to prove that any two light-ray segments (in a D-set) are homeomorphic to each other. We break up the proof into three propositions.

Proposition 2.5. Let U be a D-set, $x, y \in U$ and $x \ll y$. Let $l_x^1, l_x^2, l_x^3$ be three distinct light rays through x, which are intersected, respectively, by the rays $l_y^1, l_y^2$ and $l_y^3$ through y. Let the points of intersection be $p_1$, $p_2$ and $p_3$ respectively. Then $p_1$, $p_2$ and $p_3$ are distinct, and the closed segments $l_x[x, p_1]$, $l_y[p_2, y]$ and $l_x[x, p_3]$ are homeomorphic to each other.

Proof. The fact that $p_1$, $p_2$ and $p_3$ are distinct is trivial. By Proposition 2.3, $l_y[p_2, y] \overset{\mathrm{hom}}{=} l_x[x, p_1]$, and $l_y[p_2, y] \overset{\mathrm{hom}}{=} l_x[x, p_3]$. The result follows. □

If we call the segments $l_x[x, p]$ and $l_y[p, y]$ boundary segments of the order interval $I[x, y] \subset U$, the above result can be paraphrased as follows: In a D-set, any two boundary segments of an order interval are homeomorphic to each other. The same is true for the open segments obtained by deleting the end-points. Observe that the argument fails if there are only two light rays through each point of M; there are not enough natural maps. However, in this case the differential structure emerges without effort, as we shall see later. Anticipating future results, this case will be called the two-dimensional case.

Proposition 2.6. Let U be a D-set, $x, y \in U$, $x \ll y$. Let $l_x$ and $l_y$ be light rays through x and y respectively which intersect each other, and let $\{p\} = l_x \cap l_y$. Furthermore, let $p' \in l_x$ such that $x < p' < p$. Then $l_x[x, p'] \overset{\mathrm{hom}}{=} l_x[p', p]$.

Proof. Let $l'_x$ and $l'_y$ be two other light rays through x and y respectively which intersect at the point q, and let $\{q'\} = \beta C_y^- \cap l'_y$. By Prop. 2.5,

$l_x[x, p] \overset{\mathrm{hom}}{=} l_y[p, y] \overset{\mathrm{hom}}{=} l_{p'}[p', q'] \overset{\mathrm{hom}}{=} l_x[x, p'] \overset{\mathrm{hom}}{=} l_x[p', p].$ □
Our final result in this direction is the following:

Proposition 2.7. Any two closed light-ray segments in a D-set are homeomorphic to each other.

Proof (outline). If the two segments intersect each other, one sees immediately that the “future” segments are homeomorphic to each other, as are the “past” segments. If they do not intersect each other, then one can construct an l-polygon, lying wholly within the D-set, with the two given segments as its terminal segments. The result follows by repeated application of Prop. 2.5 to adjacent segments. □
2.3. Local structure of light rays. We saw above that in a D-set any two closed segments of the light ray l are homeomorphic to each other. It is natural to ask whether this property holds in general for a light ray. Clearly, Prop. 2.6 can be extended outside the D-set U if there is a D-set V which overlaps U and covers a part of l which is not inside U. However, this cannot be guaranteed. The following example shows this clearly:

Example 2.8. Let M be the space $\mathbb{Q} \times \mathbb{Q}$. We take the light rays to be sets of the form $\{q\} \times \mathbb{Q}$ and $\mathbb{Q} \times \{q\}$. Consider $\mathbb{Q} \times \mathbb{Q}$ as being embedded in $\mathbb{R}^2$, and let the family of D-sets be the restrictions to $\mathbb{Q} \times \mathbb{Q}$ of open order-convex sets in $\mathbb{R}^2$ which do not intersect the line $\pi \times \mathbb{Q}$. Then $M = M^{(l)} \cup M^{(r)}$, where $M^{(l)}$ consists of points of M which lie to the left of the line $x = \pi$, and $M^{(r)}$ of those which lie to the right of this line. $M^{(l)}$ and $M^{(r)}$ are both open, i.e., M, while being l-connected, is not connected.

Definition 2.9. Let M be a regular ordered space, l a light ray in M, and $\mathcal{D} = \{D_\alpha \mid \alpha \in A\}$, where A is some indexing set, be a covering of l by D-sets. We say that $\mathcal{D}$ is an overlapping cover of l if, for any cut $\xi$ of l, there exists $D_\gamma \in \mathcal{D}$ such that $D_\gamma \cap \xi^+ \neq \emptyset$ and $D_\gamma \cap \xi^- \neq \emptyset$.

We may now prove the following:

Theorem 2.10. Let M be a regular ordered space, and l a light ray in it. If l has an overlapping cover $\mathcal{D}$, and $D_\alpha, D_\beta \in \mathcal{D}$, then any closed segment of l in $D_\alpha$ is homeomorphic to any closed segment of l in $D_\beta$.

Proof. Choose $D_1 \in \mathcal{D}$, and define a cut $\xi_1$ of l as follows: $x \in \xi_1^-$ iff $x \in l$, and $x$
Example 2.12. Let L be the long line². Let $M = L \times L$, and define the light rays to be the long lines parallel to the axes. Take as D-sets the open rectangles with sides parallel to the bases which are homeomorphic to open rectangles in $\mathbb{R}^2$. This gives an ordered space which is locally homeomorphic to ordered $\mathbb{R}^2$. However, the homeomorphism is not global, because, unlike $\mathbb{R}^2$, the space $M = L \times L$ does not have a countable base.

We now supplement our axioms with the following:

Axiom 2.13 (Connectedness Axiom). Through every D-set U there passes a light ray l such that every cover of $l \cap U$ has an overlapping subcover.

This axiom states that there is at least one light ray through any D-set which has the local structure of the interval (0, 1) of real numbers. From our earlier results, it follows immediately that:

Theorem 2.14. Every light ray has the local structure of an interval of real numbers.

2.4. Boundaries of double cones. Let U be a D-set, $a, b \in U$, $a \ll b$. We introduce the following notation: $S(a, b) = \beta C_a^+ \cap \beta C_b^-$. The key observation is the following:

Proposition 2.15. Let U be a D-set, $a, b, c \in U$, $a \ll b \ll c$. Then there exists a natural bijective map between S(a, b) and S(a, c), and this map is a homeomorphism.

Proof. The map mediated by light rays through a is a homeomorphism. □

Proposition 2.16. Let U be a D-set, $a, b, b', c \in U$, $a \ll b \ll c$, $a \ll b' \ll c$. Then S(a, b) and S(a, b′) are homeomorphic.

Proof. The map

$S(a, b) \to S(a, c) \to S(a, b')$

composed from the maps of Prop. 2.15 is a homeomorphism. □

2.5. Some topological properties of the space.

Proposition 2.17 (The $T_1$-property). One-point sets are closed in a regular ordered space.

Proof. Let M be a regular ordered space and $a \in M$. If $b \in M$, $b \neq a$, then there exists a D-set $V_b$ such that $b \in V_b$, $a \notin V_b$. Then

$V = \bigcup_{b \in M,\ b \neq a} V_b$

is open, and $M \setminus V = \{a\}$. Therefore {a} is closed. □

Theorem 2.18 (The $T_3$-property). M is regular.

² Let $\Omega$ be the smallest uncountable ordinal, and let S be the minimal uncountable well-ordered set. We define L to be $S \times [0, 1)$ with the dictionary order topology, and the smallest element deleted.
Proof. Let $B \subset M$ be closed, and let $a \in M \setminus B$. We have to prove that there exist open sets $U_a$ and $U_B$ such that $a \in U_a$, $B \subset U_B$, and $U_a \cap U_B = \emptyset$. Since B is closed and $a \notin B$, it follows that $a \notin \mathrm{bd}\, B$. Then there exists a neighbourhood V of a such that $V \cap B = \emptyset$; for, if every neighbourhood of a intersects B, then $a \in \bar{B} = B$, a contradiction. Then there exist $x, y \in V$ such that $x \ll a \ll y$ and $I[x, y] \subset V$. Therefore $a \in I(x, y) \subset I[x, y]$, and the sets $U_B = M \setminus I[x, y]$ and $U_a = I(x, y)$ have the desired properties. □

3. Construction of Timelike Curves

In this section, we shall use the binary representation to denote numbers between 0 and 1; thus, 0.1 = 1/2, etc. A number $r \in [0, 1]$ will be called b-finite if it has a terminating binary representation. It will be called b-infinite if it does not have a terminating binary representation. Thus b-finite numbers are rationals of the form $p/2^k$, $0 < p < 2^k$. All other rationals, and all irrationals, are b-infinite. The b-finite numbers are dense in [0, 1] in its usual topology. The set of b-finite numbers in [0, 1] will be denoted by B.

Let U be a D-set. Given three points $C(0), C(0.1), C(1) \in U$ such that $C(0) \ll C(0.1) \ll C(1)$, we want to construct a continuous “timelike” curve $C : [0, 1] \to U$ connecting these three points, and parametrized by $t \in [0, 1]$. Timelike means the usual: $t_1, t_2 \in [0, 1]$, $t_1 < t_2 \Rightarrow C(t_1) \ll C(t_2)$. As anticipated by the notation, C(0) is the initial point, C(1) the terminal point, and C(0.1) the given point in between the two. We begin by setting up the notations (shown in Fig. 1) which will be used in the following:

Notations 3.1.
1. Fix a forward ray $l^+_{C(0)}$ through C(0), and choose an injective order-preserving map $\xi : [0, 1] \to l^+_{C(0)}$ such that i) $\xi(0) = C(0)$, $\xi(1) = l^+_{C(0)} \cap \beta C^-_{C(1)}$, $\xi(0.1) = l^+_{C(0)} \cap \beta C^-_{C(0.1)}$, and ii) which is a homeomorphism onto its range. By abuse of notation, the forward ray $l^+_{C(0)}$ itself will be denoted by $\xi^+$.
2. There is a unique ray through $\xi(0.1)$ and C(0.1). Denote the forward part of this ray by $\eta^+_{0.1}$. It intersects $\beta C^-_{C(1)}$ at a unique point. Denote this point by $\eta_{0.1}(0.1)$.
3. Denote the ray through C(1) and $\eta_{0.1}(0.1)$ by $\zeta$, and the intersection of $\zeta^-$ with $\beta C^+_{C(0)}$ by $\eta(1)$. Denote the ray through C(0) and $\eta(1)$ by $\eta$. The point C(0) will also be denoted by $\eta(0)$ (as well as by $\xi(0)$).
4. Denote the intersection of $\beta C^-_{C(0.1)}$ with $\eta$ by $\eta(0.1)$, and draw the ray segment connecting C(0.1) with $\eta(0.1)$.

The construction so far and the notations are shown in Fig. 1.
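As an aside, the distinction between b-finite and b-infinite numbers made at the start of this section can be checked mechanically for rationals. The following Python sketch is an illustration only; the helper names `is_b_finite` and `from_binary` are hypothetical and not part of the paper. It relies on the fact that a rational in lowest terms has a terminating binary expansion exactly when its denominator is a power of 2.

```python
from fractions import Fraction

def is_b_finite(r: Fraction) -> bool:
    """Return True if r in [0, 1] has a terminating binary expansion."""
    if not 0 <= r <= 1:
        raise ValueError("r must lie in [0, 1]")
    d = r.denominator  # Fraction always stores r in lowest terms
    return d & (d - 1) == 0  # holds exactly when d is a power of two

def from_binary(digits: str) -> Fraction:
    """Interpret a string such as '0.101' as a binary fraction."""
    whole, _, frac = digits.partition(".")
    value = Fraction(int(whole, 2)) if whole else Fraction(0)
    for k, bit in enumerate(frac, start=1):
        value += Fraction(int(bit), 2 ** k)  # k-th binary digit weighs 2^-k
    return value
```

For instance, `from_binary("0.1")` is the rational 1/2, matching the convention 0.1 = 1/2 used in the text, while 1/3 is b-infinite.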
Fig. 1. Basic construction and notations
3.1. Construction of the b-finite points.

3.1.1. Initiating the bisection procedure. We now use the order intervals $I[C(0), C(0.1)]$ and $I[C(0.1), C(1)]$ to define two new points as follows:
0. Denote the closed segment from $\xi(t_1)$ to $\xi(t_2)$, $t_1 < t_2$, by $\xi[t_1, t_2]$ (and similarly for other rays on which real or b-finite coordinates are defined).
1. Pick the midpoint $\xi(0.01)$ of the segment $\xi[0, 0.1]$. The forward cone from this point intersects the ray $\zeta^-$ at a unique point. Denote by $\eta_{0.01}$ the ray which joins $\xi(0.01)$ with this point (which need not be named). Choose a point in the intersection $\eta^+_{0.01} \cap I(C(0), C(0.1))$ and call it C(0.01). Denote the intersection of $\beta C^-_{C(0.01)}$ with $\eta$ by $\eta(0.01)$ and draw the segment which joins C(0.01) with $\eta(0.01)$.
2. Pick the midpoint $\xi(0.11)$ of the segment $\xi[0.1, 1]$. The forward cone from this point intersects the ray $\zeta^-$ at a unique point. Denote by $\eta_{0.11}$ the ray which joins $\xi(0.11)$ with this point (which need not be named). Choose a point in the intersection $\eta^+_{0.11} \cap I(C(0.1), C(1))$ and call it C(0.11). Denote the intersection of $\beta C^-_{C(0.11)}$ with $\eta$ by $\eta(0.11)$ and draw the segment which joins C(0.11) with $\eta(0.11)$.

We now have the points C(0), C(0.01), C(0.1), C(0.11) and C(1) which satisfy $C(0) \ll C(0.01) \ll C(0.1) \ll C(0.11) \ll C(1)$, and the points $\eta(0)$, $\eta(0.01)$, $\eta(0.1)$, $\eta(0.11)$, $\eta(1)$ on the ray $\eta$ which satisfy $\eta(0) < \eta(0.01) < \eta(0.1) < \eta(0.11) < \eta(1)$.
Fig. 2. After the first bisections
Figure 2 shows these points, rays, and the four closed double cones $I[C(0), C(0.01)]$, $I[C(0.01), C(0.1)]$, $I[C(0.1), C(0.11)]$ and $I[C(0.11), C(1)]$, which provide the starting points for the next step. In the figure, ray crossings do not necessarily imply that the two rays intersect; if they are known to intersect, the point is marked by a larger dot. If a “ray” is shown in two parts, one marked by a solid line and the other by a dotted line, it means that the two lines may represent different rays; they are not known to be continuations of one and the same ray.

3.1.2. Iterating the bisection procedure. We now introduce a special notation for the order intervals with vertices on C:

Notations 3.2 (J-intervals). The order intervals $I[C(t_1), C(t_2)]$ and $I(C(t_1), C(t_2))$, where $t_1 < t_2$, will be denoted by $J[t_1, t_2]$ and $J(t_1, t_2)$ respectively.

We start with the situation as shown in Fig. 2, where we have the $2^2$ closed order intervals J[0, 0.01], J[0.01, 0.1], J[0.1, 0.11] and J[0.11, 1]. We choose an interior point in each of these as follows:
1. Pick the midpoints $\xi(0.001)$, $\xi(0.011)$, $\xi(0.101)$ and $\xi(0.111)$ of the segments $\xi[0, 0.01]$, $\xi[0.01, 0.1]$, $\xi[0.1, 0.11]$ and $\xi[0.11, 1]$ respectively. Consider the rays $\eta_{0.001}$, $\eta_{0.011}$, $\eta_{0.101}$ and $\eta_{0.111}$ which join these points with the ray $\zeta$; these rays are defined uniquely.
2. Arguments which are by now standard show that the intersections $\eta_{0.001} \cap J(0, 0.01)$, $\eta_{0.011} \cap J(0.01, 0.1)$, $\eta_{0.101} \cap J(0.1, 0.11)$, $\eta_{0.111} \cap J(0.11, 1)$ are nonempty; pick a point in each, and denote these points by C(0.001), C(0.011), C(0.101) and C(0.111) respectively.
3. Mark the points of intersection of the backward cones from these points with the ray $\eta$, and name them $\eta(0.001)$, $\eta(0.011)$, $\eta(0.101)$ and $\eta(0.111)$ respectively.

At the end of this step, we have divided the segment $\xi[0, 1]$ into $2^3$ subsegments. Through them we have determined $2^3 + 1$ points C(t), each (except the ones given initially) with a certain amount of arbitrariness. Each of these points determines a unique point on the ray $\eta$; we have $2^3 + 1$ points $\eta(t)$ so far. These points are all in the proper order, that is $t_i < t_j \Rightarrow \xi(t_i)$
$S^{(>)} = \{\eta(r) \mid \eta_b < \eta(r),\ r \in B\}$
be the set of all b-finite points on $\eta[0, 1]$ which follow $\eta_b$, and define $q_b = \inf S^{(>)}$. Define now the $\eta$-height of I[a, b] by

$\nu_\eta(I[a, b]) = |q_b - q_a|.$    (1)

The significance of this definition is the following: If $I_1$ and $I_2$ are closed order intervals such that $I_1 \subset I_2$, $I_1 \neq I_2$, then

$\nu_\eta(I_1) < \nu_\eta(I_2).$    (2)

The $\xi$-height $\nu_\xi$ of I[a, b] can be defined analogously³, using the ray $\xi$ instead of the ray $\eta$. Again, we have that

$I_1 \subset I_2,\ I_1 \neq I_2 \Rightarrow \nu_\xi(I_1) < \nu_\xi(I_2).$    (3)
Finally, we observe that for closed J-intervals the $\xi$-height equals the $\eta$-height. The common value will be called the height of the J-interval and denoted by $\nu(J)$. The construction by repeated bisections shows that there exist J-intervals of arbitrarily small height.

3.3. Definition of the b-infinite points. The tool for defining b-infinite points on the curve C(t) is the Cantor intersection theorem for complete metric spaces. Let $\vartheta$ be a b-infinite number in the interval [0, 1], and $\xi(\vartheta)$ the corresponding point on $\xi[0, 1]$. Consider the family $\mathcal{F}$ of all b-finite segments $\xi[r_1, r_2]$, where $r_1, r_2 \in B$ and $r_1 < \vartheta < r_2$. Denote the length $|r_2 - r_1|$ of the segment $\xi[r_1, r_2]$ by $d(\xi[r_1, r_2])$. Now choose a countable subfamily $\mathcal{F}_0$ of $\mathcal{F}$, $\mathcal{F}_0 = \{F_0, F_1, F_2, \ldots, F_n, \ldots\}$, such that the $F_k$ are ordered by inclusion, i.e., $j < k \Rightarrow F_j \supset F_k$, and $d(F_n) \to 0$. Then, by Cantor’s intersection theorem,

$\bigcap_{n=0}^{\infty} F_n = \{\xi(\vartheta)\}.$
(4)
which are ordered by inclusion, j < k ⇒ Jj ⊃ Jk , and are such that ν(Jn ) → 0. 3 Since the segment ξ [0, 1] carries a numerical coordinate, one may give a direct definition of the ξ -height which does not use the sup or the inf; the two definitions are equivalent. However, the direct definition would not extend to the η-height.
The ray $\eta_\vartheta$ which joins $\xi(\vartheta)$ with $\zeta$ is unique. By construction, it intersects every $J_n$. Since $J_n$ is a closed order interval, $\eta_\vartheta \cap J_n$ is a closed segment, which we shall denote by $\eta_\vartheta[n]$. Using the natural homeomorphism of $\eta_\vartheta[0, 1]$ with $\eta_0[0, 1]$, we assign the length $\delta[n]$ to $\eta_\vartheta[n]$. One sees, by standard arguments, that $\delta[n] = \nu(J_n)$. We thus have a chain of closed segments $\eta_\vartheta[0], \eta_\vartheta[1], \ldots, \eta_\vartheta[n], \ldots$ which are ordered by inclusion and are such that $\delta[n] \to 0$. Therefore, by Cantor’s intersection theorem, the intersection

$\bigcap_{n=1}^{\infty} \eta_\vartheta[n]$

consists of a unique point, say p. By construction, $r_1 < \vartheta < r_2$, $r_1$ and $r_2$ b-finite, implies that $C(r_1) \ll p \ll C(r_2)$. We therefore denote p by $C(\vartheta)$. The above procedure defines the point $C(\vartheta)$ uniquely for any b-infinite $\vartheta \in [0, 1]$. Moreover, it implies that if $\vartheta_1 < \vartheta_2$ and $\vartheta_1, \vartheta_2$ are b-infinite, then $C(\vartheta_1) \ll C(\vartheta_2)$. This completes the construction of the curve C(t) for $0 \le t \le 1$. The b-finite points are chosen arbitrarily, subject to the preservation of (timelike) order; the b-infinite points are then defined uniquely by the requirement of preservation of this order.

3.4. Topology of C(t). By construction, there is an order-preserving bijection between the set $\{C(t) \mid 0 \le t \le 1\}$ furnished with the order $\ll$, and the closed real interval [0, 1]. An open order interval is either disjoint from the curve C, or else intersects it on an open arc. The subspace topology on C is therefore the same as the topology mediated by the order on it; the curve C(t) is continuous.

4. Parametrization of the Double Cone

We shall now abandon the binary representation and revert to decimals. We shall also make the following changes in our notations for the double cone: The point C(0) will now be denoted by C(−1), the point C(0.1) will be denoted by C(0), and the curve C will now be $C : [-1, 1] \to U$. The closed order interval $C^+_{C(-1)} \cap C^-_{C(1)}$ will be denoted by J[−1, 1], in the notation of Sect. 3.1.2. Let $p \in J[-1, 1]$, and let

$\beta C_p^- \cap C = \{a\}, \quad \beta C_p^+ \cap C = \{b\},$    (5)

where the points a and b are specified by their numerical values on C[−1, 1], i.e., $-1 \le a, b \le 1$. We can parametrize p by cylindrical coordinates h, r and $\Omega$, which are introduced as follows:

$h = \frac{a + b}{2}, \quad r = \frac{b - a}{2}.$    (6)

h will be called the level or height variable, and r the radius. Clearly,

$-1 \le h \le 1, \quad 0 \le r \le 1.$    (7)
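To see what the coordinates (6) mean in a familiar case, consider two-dimensional Minkowski space with the curve C taken to be the time axis, parametrized by t. This flat model is an assumption made here purely for illustration and is not part of the construction above; the function names are hypothetical. For a point p = (t, x), the backward and forward cone boundaries of p meet the axis at a = t − |x| and b = t + |x|, so that h recovers the time coordinate and r the spatial distance from the axis.

```python
def cone_parameters(t: float, x: float):
    """Parameters (a, b) at which the light cones of p = (t, x) meet the time axis."""
    a = t - abs(x)  # the backward cone boundary of p meets the axis here
    b = t + abs(x)  # the forward cone boundary of p meets the axis here
    return a, b

def height_and_radius(a: float, b: float):
    """Equation (6): h = (a + b) / 2 and r = (b - a) / 2."""
    return (a + b) / 2, (b - a) / 2
```

In this model a point lies on the curve exactly when r = 0, and the bounds (7) hold whenever a and b both lie in [−1, 1].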
The point p will therefore have the coordinates

$p = \{h, r, \Omega\},$    (8)

where $\Omega$ is an “angle” variable, i.e., a point on $S(-1, 1) = \beta C^-_{C(1)} \cap \beta C^+_{C(-1)}$. We see that:

Proposition 4.1. In a D-set, any two closed order intervals are homeomorphic to each other, as are any two open order intervals.

Proof. In view of Prop. 2.16, this follows immediately from the parametrization (8). □

Next, we use the above parametrization to prove the following result:

Theorem 4.2 (Nested interval theorem for ordered spaces). As in (4), let $J_0, J_1, \ldots, J_n, \ldots$ be a nested sequence of J-intervals such that $\nu(J_n) \to 0$. Then the intersection

$J = \bigcap_{n=0}^{\infty} J_n$

consists of exactly one point.

Proof. We have already established, in Sect. 3.3, that this intersection contains a point a which lies on C[−1, 1]. If it contains a different point b which is timelike to a, then, since $\nu(J_n) \to 0$, we can find a $J_k$ of height less than $|b - a|$, i.e., b cannot belong to J. If b is lightlike or spacelike to a, then $r(b) > 0$, and b cannot be contained in any J-interval of height less than r(b). Therefore b cannot be contained in J. □

Finally, we have the following:

Theorem 4.3 (Contractibility of order intervals). Let U be a D-set, and I[a, b] a closed nonempty order interval contained in U. Let $o \in I(a, b)$. Then I[a, b] is contractible to {o}.
Proof. It suffices to consider a parametrized interval I [−1, 1]. Let p = (h, r, ). For each t in the real interval (0, 1], the map p 7 → tp = (th, tr, )
(9)
is a homeomorphism of I [−1, 1] with I [−t, t]. For t = 0, it is the constant map p 7→ O = (0, 0, ·). This displays the required homotopy. u t It follows immediately from the above that: Theorem 4.4 (Order intervals are star-shaped). The order interval I [a, b] is starshaped with respect to every point o in I (a, b). Proof. It suffices to consider I [−1, 1] and the point O in it. Let p ∈ bd I [−1, 1]. Define the line segment connecting p to O to be the path traced by p under the homotopy (9). Then: • Each line segment carries a natural map to the real interval [0, 1]. • For each p, the entire line segment connecting p to O is contained in I [−1, 1]. • Moreover, if p 6 = q, p, q ∈ bd I [−1, 1], then the only point at which the line segments connecting p and q to O intersect is the point O itself. t u 5. Countability In 2.12, we had an example of an ordered space M in which every point had a neighbourhood which was second countable in the subspace topology, but M itself was not. The following example is of a different kind: Example 5.1. Let H be a nonseparable Hilbert space, and consider the product H × R. This too is a metric space, and it too is nonseparable. Let x ∈ H, t ∈ R. We can define a Lorentz structure on H × R via ||x||2 − t 2 = 0, and light rays are the null lines of this structure. The topological property which interests us is that in this space no neighbourhood of any point is second countable in the subspace topology. We now make the following definition: Definition 5.2 (D-countability). An ordered space M will be called D-countable if for every p ∈ M there exists a D-set Dp such that p ∈ Dp , and Dp is second countable in the subspace topology. From now on we shall work only with D-countable spaces. 6. Metrizability and Topological Dimension Let p ∈ M, where M is a D-countable ordered space. Then there exists a D-set U such that p ∈ U ⊂ M and U has a countable base in the subspace topology. 
We can take U to be an order interval without loss of generality. We shall denote this space by D. Since regularity is a hereditary property [5], D is regular (by Prop. 2.18). Therefore D is a regular second countable space. Therefore Theorem 6.1. D is metrizable.
Theory of Ordered Spaces, II.
Proof. This is just the Urysohn metrization theorem [5]. □

Example 6.2. The Hilbert sequence space $l^2(\mathbb{R})$, $x = \{x_0, x_1, \dots, x_n, \dots\}$, $x_k \in \mathbb{R}$, $k \in \mathbb{N}$, and
$$\sum_{i=0}^{\infty} x_i^2 < \infty$$
becomes an ordered space upon defining a Lorentz structure by
$$s^2 = x_0^2 - \sum_{i=1}^{\infty} x_i^2 .$$
This space has a countable basis, is infinite-dimensional and not locally compact.

Definition 6.3. A locally compact D-space (that is, a regular, second countable, locally compact ordered space) will be called an E-space (after Einstein).

The main result is the following:

Theorem 6.4. An E-space has finite topological dimension.

Proof. In an E-space, a closed order interval is compact. Therefore every open cover of it has a finite subcover, and this subcover automatically refines the cover. Let O be an open cover and let j(O) be the size of a finite subcover of it. Let k = min j(O), the minimum being taken over all O and all finite subcovers of O. Since the space is connected, k ≥ 3. k is clearly finite. Since every closed order interval in E is homeomorphic to every other closed order interval, the number k is independent of the order interval. Therefore N = k − 1 is the covering or topological dimension of E. □

7. The Differential Structure of E

The fact that E has a differential structure follows immediately from two standard results: A metrizable space of finite topological dimension N can be embedded in R^{2N+1} [5]. Then, from the last part of the proof of Theorem 4.4, it follows that E can be embedded as an n-dimensional open ball for some n ≤ 2N + 1. But: Every nonempty open star-shaped subset of R^n is C^∞-diffeomorphic to R^n [4]. This establishes the desired result.

8. Concluding Remarks

Figure 3 displays the logical structure of the scheme developed above. A rectangle B nested inside a rectangle A means that B is a special case of A. The interpretation of Einstein causality as a partial order and the investigation of the local structure of these spaces has revealed the following: There exist ordered spaces in which the connectedness axiom 2.13 does not hold, and those in which D-countability or local compactness fail. An ordered space which satisfies the connectedness axiom, is
H.-J. Borchers, R. N. Sen
[Fig. 3. Inclusions of ordered spaces. Nested rectangles labelled: spaces satisfying the four axioms of [1]; spaces satisfying the connectedness axiom 2.13; D-countable spaces; locally compact spaces; locally R^n.]
D-countable and locally compact is locally smoothly diffeomorphic with R^n for some n. We have not investigated the structure of light rays in ordered spaces in which the connectedness axiom does not hold.

In a well-known article [6], Wigner wrote: “. . . the enormous usefulness of mathematics in the natural sciences is something bordering on the mysterious and that there is no rational explanation for it”. Wigner stated clearly that he understood mathematics to be a creation of the human mind, independent of the physical universe (to the extent that any creation of the human mind can be so independent). This view, or some approximation to it, has enjoyed considerable support among mathematicians since the mid-19th century. (For Cantor’s own views, see [2], Chapter 6, and [3] pp. 108–109 (footnotes)). If, however, the local differential structure of the physical universe (possibly its most basic mathematical structure) is a consequence of Einstein causality, Wigner’s position may be open to doubt.

Appendix

The items below are given with the numbering in which they appear in [1]. l denotes a light ray. (Remarks 1 and 2 have been added in the present paper.)

I.1.1. Axiom (The order axiom). a) b) c) d)
If x, y ∈ l, x ≠ y, then either x
I.1.9. Assumptions.
a) M is l-connected.
b) M does not consist of a single point.
c) M does not consist of a single light ray.

I.1.11. Axiom (The Identification Axiom). If l and l′ are distinct light rays and a ∈ S ≡ l ∩ l′, then there exist p, q ∈ l such that p
References
1. Borchers, H.-J. and Sen, R. N.: Theory of Ordered Spaces. Commun. Math. Phys. 132, 593 (1990)
2. Dauben, J. W.: Georg Cantor: His Mathematics and Philosophy of the Infinite. Princeton, NJ: Princeton University Press, 1979 (paperback, 1990)
3. Fraenkel, A. A.: Abstract Set Theory. Amsterdam: North-Holland, 1953
4. Hirsch, Morris W.: Differential Topology. New York: Springer-Verlag, 1976
5. Munkres, J. R.: Topology: A First Course. Englewood Cliffs, NJ: Prentice-Hall, 1975
6. Wigner, E. P.: The Unreasonable Effectiveness of Mathematics in the Natural Sciences. Comm. Pure and Appl. Math. 13, 1 (1960)

Communicated by H. Araki
Commun. Math. Phys. 204, 493 – 524 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Loop and Path Spaces and Four-Dimensional BF Theories: Connections, Holonomies and Observables⋆

Alberto S. Cattaneo¹, Paolo Cotta-Ramusino¹, Maurizio Rinaldi²

¹ Dipartimento di Matematica, Università di Milano, Via Saldini 50, 20133 Milano, and I.N.F.N., Sezione di Milano, Italy. E-mail: [email protected]; [email protected]
² Dipartimento di Matematica, Università di Trieste, Piazzale Europa 1, 34127 Trieste, Italy. E-mail: [email protected]

Received: 28 March 1998 / Accepted: 12 September 1998
Abstract: We study the differential geometry of principal G-bundles whose base space is the space of free paths (loops) on a manifold M. In particular we consider connections defined in terms of pairs (A, B), where A is a connection for a fixed principal bundle P (M, G) and B is a 2-form on M. The relevant curvatures, parallel transports and holonomies are computed and their expressions in local coordinates are exhibited. When the 2-form B is given by the curvature of A, then the so-called non-abelian Stokes formula follows. For a generic 2-form B, we distinguish the cases when the parallel transport depends on the whole path of paths and when it depends only on the spanned surface. In particular we discuss generalizations of the non-abelian Stokes formula. We study also the invariance properties of the (trace of the) holonomy under suitable transformation groups acting on the pairs (A, B). In this way we are able to define observables for both topological and non-topological quantum field theories of the BF type. In the non-topological case, the surface terms may be relevant for the understanding of the quark-confinement problem. In the topological case the (perturbative) four-dimensional quantum BF-theory is expected to yield invariants of imbedded (or immersed) surfaces in a 4-manifold M.

⋆ This work has been partly supported by research grants of the Ministero dell’Università e della Ricerca Scientifica e Tecnologica (MURST). Part of this work was completed while A.S.C. was at Harvard University supported by I.N.F.N. Grant No. 5565/95 and DOE Grant No. DE-FG02-94ER25228, Amendment No. A003. P.C.–R. developed some of the work related to this paper while participating in the APCTP/PIMS Summer Workshop at the University of Vancouver, B.C.

1. Introduction

In this paper we consider the spaces ℒM and 𝒫M of free loops and paths of a compact manifold M and the principal G-bundles on ℒM and 𝒫M obtained by pulling back, via the evaluation map, a fixed principal bundle P (M, G). We are interested in the
connections on such bundles that are determined by pairs (A, B), where A is a connection on P (M, G) and B is a 2-form of the adjoint type on P (M, G). We study the properties of the curvature and of the holonomy of such connections. The motivations for this study are rooted in the four-dimensional quantum field theories of the BF-type. One of our goals is to understand the relation between those QFTs and the (smooth) invariants of four-manifolds and of surfaces imbedded (or immersed) in four-manifolds. Before discussing the differential geometrical results, we comment briefly on quantum BF-theories.

1.1. Quantum field theory. Four-dimensional BF-theories may become increasingly relevant both to the quantum-field theoretical description of smooth four-manifold invariants and to the understanding of quark-confinement problems. What characterizes BF-theories, and distinguishes them from ordinary gauge theories, is the fact that there are two fundamental fields: a connection A for some principal G-bundle over a four-dimensional manifold M and a 2-form field B that transforms under gauge transformations as the curvature of A. Various actions (and observables) can be constructed with the two fields A and B and, as a result, BF-theories can be both topological and non-topological (see Sect. 8).

One of the relevant non-topological BF-theories is the first-order formalism of the Yang-Mills theory introduced in [1–3] and then modified in [4] by replacing B with B − d_A η, where the extra field η is a 1-form. The resulting theory, which turns out to be a deformation of a topological theory, has been shown in [4] to be equivalent to the Yang-Mills theory. For general topological field theories the reader is referred to [5]. Topological BF-theories have been introduced in [6] (see also [7]) and reviewed in [8]. The inclusion of observables in four-dimensional topological BF-theories is due to [9]. Here we begin a study of the geometry of four-dimensional BF-theories.
The starting point is the observation that the spatial components of B are the conjugate momenta to the spatial components of A. If we formally identify the tangent and cotangent spaces, we may see B as an infinitesimal connection. The natural way to interpret the 2-form B as a tangent vector to a space of connections is to consider a principal G-bundle which has both as the total space and as the base manifold a space of loops or paths. By integrating the 2-form B over a path, one obtains a 1-form. We show in this paper that pairs (A, B) represent connections on a G-bundle over the path or loop space of our manifold M. An ordinary connection A yields the parallel transport along a path of geometrical objects associated to points. Similarly a connection on the path or loop space, represented by a pair (A, B), yields the parallel transport along a path of paths or a path of loops of geometrical objects associated to paths or loops. Surfaces are spanned by paths of paths (though not in a unique way), so the first question one has to ask is when a Stokes-like formula holds for parallel transport. It is easy to see that when the field B is the (opposite of the) curvature FA of A, then the parallel transport along a path of paths spanning a surface S is uniquely determined by the A-parallel transport along the boundary of S. This is a new version of the well-known non-abelian Stokes formula (see [10, 11]). If B is a small deformation of a (small) curvature FA , then a surface term appears in the parallel transport with respect to the connection (A, B). For large deformations the
parallel transport with respect to (A, B) depends on the whole path of paths structure and not only on the spanned surface.

We now recall that the ordinary holonomy along a loop in space–time has a physical interpretation: namely, it represents the contribution of a pair quark–antiquark forced to move along the loop. With our construction we have at our disposal more general objects. For example we may consider the parallel transport with respect to the pair (A, B) along a path of paths. This provides us with a simple generalization of the previous situation: we might think of this case as a pair quark–antiquark with an interaction that is not given just by the A-parallel transport. As a second example we may take the holonomy corresponding to the pair (A, B) of a loop in the path (or loop) space. This should represent the contribution of a pair of open (or closed) strings in interaction. Thus, the equivalence between the Yang–Mills theory and a deformation of the BF theory plus the presence of surface terms associated to the parallel transport involving a non-trivial B-field may be relevant for the problem of quark confinement as formulated by Wilson [12]. A rôle of the BF-theories in the understanding of quark confinement has been discussed in [13] (see also [14]).

As far as topological BF theories are concerned, we notice that one can define, at least in principle, a four-dimensional analogue of the Witten–Chern–Simons theory, whose vacuum expectation values (v.e.v.) of (products of) Wilson loops represent link invariants. In four-dimensional topological BF-theories, one can compute v.e.v.’s of traces of holonomies of imbedded (or immersed) loops of paths (or of loops). Again one of the delicate points is to check when these invariants can be considered invariants of the surfaces spanned by loops of paths (or of loops).
This requirement is close to the parameterization invariance for the surface since (for imbeddings) the loop-of-paths structure yields coordinates on the surface. In this paper we show that when the fields A and B, restricted to the surface spanned by a loop of paths, take values in an abelian subgroup T of G, then the trace of the (A, B)-holonomy depends only on the surface and not on the loop-of-paths structure. This reducibility condition is related to the abelian projection considered in [15, 16]. The actual computation of v.e.v.’s for the topological BF theories will be carried out elsewhere. There is some indication that these v.e.v.’s may be related to the invariants of imbedded (or immersed) surfaces considered by Kronheimer and Mrowka [17]. Even though the original motivations for this paper lie in the development of BF quantum field theories, our main goal here is to study connections, curvatures and holonomies of principal fiber bundles over path and loop spaces. Our language will be therefore the language of differential geometry.
1.2. Geometry. We begin Sect. 2 by considering a fixed principal fiber bundle P (M, G) together with a connection A. The space of A-horizontal paths, denoted by 𝒫_A P, is a principal G-bundle itself which has the space of paths on M as base manifold. We describe explicitly the tangent bundle of 𝒫_A P as a submanifold of the path space of TP. We consider connections on 𝒫_A P and particularly those connections that are defined in terms of a 2-form B. We call these connections on 𝒫_A P special connections.
The curvatures, horizontal distributions and parallel transports corresponding to special connections are computed. The explicit expression for the curvature involves Chen integrals.

In Sect. 3 we discuss the parallel transport of paths of paths with respect to special connections. We single out the case when B is given by the (opposite of the) curvature F_A of A: it is the only case where we have a non-abelian Stokes formula, namely a relation between the parallel transport of a path of paths and the A-holonomy of the loop given by the boundary of such path of paths. More generally we study the possible conditions that force the (trace of the) parallel transport along a path of paths (or of loops) and the (trace of the) holonomy of a loop of paths (or of loops) to depend only on the spanned surface. The first of these conditions is a “perturbative” one: namely we assume that P admits a flat connection and we expand both B and F_A around zero. Then, for imbedded paths of paths, the (trace of the) parallel transport with respect to a special connection (A, B) depends only on the spanned surface, up to second-order terms.

In Sect. 4 we compute the expressions of the special connections and of the parallel transport of paths of paths in local coordinates. After recalling the general transformation properties of the holonomy considered as a function of the space of connections for a generic principal bundle (Sect. 5), we discuss in Sect. 6 the “non-perturbative” conditions that guarantee that the holonomy of a loop of paths is independent of those automorphisms of P that map the spanned surface into itself. Here we require both B and A to be reducible to an abelian subgroup of the structure group G once they are restricted to the image of the spanned surface.

In Sect. 7 we study the action on the space of pairs (A, B) of those transformation groups that happen to be symmetries for the BF-theories. The invariance of the (trace of the) holonomy under those transformations is guaranteed provided that we require again the reducibility conditions for both A and B. It is worth noticing that the group of gauge transformations on 𝒫_A P, which preserves the trace of the holonomy, is not a symmetry group for the BF-theories. More precisely the symmetries of the BF theories are “close” to being gauge transformations on 𝒫_A P, the missing terms being boundary terms and higher-order Chen integrals. The full group of gauge transformations on 𝒫_A P, the space of all connections on 𝒫_A P and the relation between 𝒫_A P and the free loop bundle ℒP (whose structure group is the loop group of G) will be discussed in a forthcoming paper [18].

In Sect. 8 we describe the observables for BF quantum field theories, both in the topological and in the non-topological case.
2. Differential Geometry of Horizontal Paths

We describe here the general setting of this paper. We consider a smooth manifold M that is assumed to be closed, compact, oriented and Riemannian, a compact Lie group G with an Ad-invariant inner product on its Lie algebra 𝔤 and a fixed principal G-bundle P = P (M, G) over M. The group of gauge transformations of P will be denoted by 𝒢, while the space of connections on P will be denoted by 𝒜. Also we denote by Ω*(M, ad P) the graded Lie algebra of forms on M with values in the adjoint bundle ad P = P ×_Ad 𝔤. We will consistently consider the elements of Ω*(M, ad P) also as forms on P that are both of the adjoint type and tensorial [19].
The group 𝒢 acts on 𝒜, and this action is free provided that we restrict 𝒜 to be the space of irreducible connections and divide 𝒢 by its center. We denote this action as follows:
$$\mathcal{A} \times \mathcal{G} \ni (A, g) \mapsto A^g \in \mathcal{A}. \tag{2.1}$$
In the course of this paper we will have to consider other principal G-bundles, say P_X(X, G), over some manifold X, possibly infinite-dimensional. We will then denote by 𝒢(P_X) and by 𝒜(P_X) the relevant group of gauge transformations and the space of connections. If no confusion arises, we use the symbol π to denote the projection of any fiber bundle.

For any manifold X we denote by 𝒫X the space of smooth paths on X. The space of smooth free loops on X will be denoted by the symbol ℒX and the space of x-based loops (x ∈ X) by the symbol ℒ_x X. With some extra work we could consider also piecewise smooth paths and loops, but we do not wish to discuss this problem here. We will also be interested in the space of smooth maps assigning to each point x ∈ X a path or a loop with initial point x. We call such maps path-fields and, respectively, loop-fields. If we denote by Map(X) the space of smooth maps of X to itself, then a path field and a loop field on X are represented respectively by a path or a loop on Map(X) with initial point the identity map.

Most of this paper deals with horizontal lifts of paths on M with respect to a given connection A ∈ 𝒜. We use the following notation for horizontal lifts; for any γ : [0, 1] → M and for any p ∈ P with π(p) = γ(0), the A-horizontal lift of γ with initial point p is denoted by the symbol
$$L(A, \gamma, p). \tag{2.2}$$
Our first task is to study the differential geometry of the space of A-horizontal paths.

2.1. The principal bundle of horizontal paths and its tangent bundle. Let 𝒫_A P denote the space of A-horizontal paths in P. This is a principal G-bundle 𝒫_A P → 𝒫M, where the right G-action is given by the right G-action on the initial points of the horizontal paths.
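The horizontal lift L(A, γ, p) of (2.2) can be made concrete in the simplest abelian case. The following is a minimal numerical sketch, not part of the original construction: it assumes the trivial U(1)-bundle over the plane, the hypothetical connection a = i·α with α = x dy, and the sign convention dg/dt = −a(γ̇(t)) g(t) for the fiber component g of the lift. All function names are illustrative. Under these assumptions the holonomy around the boundary of the unit square should agree with exp(−i ∬ dα) = exp(−i), which is the abelian form of the Stokes formula discussed in the Introduction.

```python
import cmath


def alpha(x, y, vx, vy):
    """Real 1-form alpha = x dy evaluated on the velocity (vx, vy) at (x, y)."""
    return x * vy


def square_loop(t):
    """Boundary of the unit square, counter-clockwise, for t in [0, 1].

    Returns the position (x, y) and the velocity (vx, vy).
    """
    s = 4.0 * t
    if s < 1.0:
        return (s, 0.0), (4.0, 0.0)
    if s < 2.0:
        return (1.0, s - 1.0), (0.0, 4.0)
    if s < 3.0:
        return (3.0 - s, 1.0), (-4.0, 0.0)
    return (0.0, 4.0 - s), (0.0, -4.0)


def horizontal_lift(path, steps=4000):
    """Fiber component at t = 1 of the horizontal lift starting at g(0) = 1.

    Integrates dg/dt = -i * alpha(gamma'(t)) * g with the classical
    fourth-order Runge-Kutta scheme (sign convention assumed, see above).
    """
    def rhs(t, g):
        (x, y), (vx, vy) = path(t)
        return -1j * alpha(x, y, vx, vy) * g

    g = 1.0 + 0.0j
    h = 1.0 / steps
    for n in range(steps):
        t = n * h
        k1 = rhs(t, g)
        k2 = rhs(t + h / 2, g + h / 2 * k1)
        k3 = rhs(t + h / 2, g + h / 2 * k2)
        k4 = rhs(t + h, g + h * k3)
        g += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return g


# Holonomy around the boundary versus the surface integral of the curvature
# d(alpha) = dx ^ dy over the unit square (area 1).
holonomy = horizontal_lift(square_loop)
stokes = cmath.exp(-1j * 1.0)
print(abs(holonomy - stokes))
```

The printed number is the discrepancy between the two sides; it is limited by the step size and by the corners of the square, where the velocity is discontinuous. For a non-abelian structure group the same ordinary differential equation would have to be solved with matrix-valued g, and the right-hand side of the Stokes formula would no longer be a plain surface integral.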
If we consider two distinct connections A, Ā ∈ 𝒜, then we have two distinct and isomorphic principal G-bundles 𝒫_A P and 𝒫_Ā P. They are isomorphic since, for any connection A, the bundle 𝒫_A P is isomorphic to the pulled-back bundle ev_0^* P. By the symbol ev we denote in general the evaluation map and, in this particular case, the map
$$\mathrm{ev} : \mathcal{P}M \times I \to M, \quad I = [0, 1], \qquad \mathrm{ev}_t \triangleq \mathrm{ev}(\cdot, t). \tag{2.3}$$
Let us call J_A the isomorphism between ev_0^* P and 𝒫_A P given by
$$J_A(\gamma, p) \triangleq L(A, \gamma, p), \quad \gamma \in \mathcal{P}M,\ p \in \pi^{-1}\gamma(0). \tag{2.4}$$
We denote by j_A the evaluation map ev_0^* P × I → P given by j_A((γ, p), t) ≡ L(A, γ, p)(t). We have the following bundle morphisms:
$$\mathcal{P}_A P \times I \xrightarrow{\ \mathrm{ev}\ } P; \qquad \mathcal{P}_A P \xrightarrow{\ \mathrm{ev}_t\ } P. \tag{2.5}$$
As a particular case we can consider the loop space ℒM on M, instead of 𝒫M, and the corresponding principal bundle ℒ_A P whose elements are the A-horizontal paths on P whose projections are loops.

We now study the properties of the tangent bundle of the bundle 𝒫_A P. First we identify T𝒫M (the tangent bundle of the path space) with 𝒫(TM) (the path space of the tangent bundle). In other words, given any path γ ∈ 𝒫M, a vector X ∈ T_γ(𝒫M) is given by the assignment for each t ∈ I of a vector X(t) ∈ T_{γ(t)}M. Equivalently the same vector X can be represented by a smooth map Γ : (−ε, ε) × I → M, so that
$$\Gamma(0, t) = \gamma(t), \qquad \Gamma'(0, t) \equiv \left.\frac{\partial \Gamma(s, t)}{\partial s}\right|_{s=0} = X(t).$$
For any horizontal path q on P, let us consider a tangent vector 𝔮 ∈ T_q(𝒫P), defined by a smooth map Q : (−ε, ε) × I → P satisfying the following conditions:
$$Q(0, t) = q(t), \qquad Q_*\left.\frac{\partial}{\partial t}\right|_{Q(0,t)} = \dot q(t), \qquad Q_*\left.\frac{\partial}{\partial s}\right|_{Q(0,t)} = \mathfrak{q}(t). \tag{2.6}$$
The tangent vector 𝔮 belongs to T_q(𝒫_A P) if and only if the following extra requirement is satisfied:
$$\frac{\partial}{\partial s} A(\dot Q(s, t)) = 0, \quad s = 0,\ \forall t \in I. \tag{2.7}$$
Here we used the dot to denote the derivative with respect to the variable t ∈ I. We will use this notation also in the future. Moreover when dealing with two variables (s, t) ∈ I × I we will use the prime to denote the derivative with respect to the variable s ∈ I.

Condition (2.7) is independent of the choice of the map Q(s, t) representing 𝔮. In fact any two such choices Q(s, t) and Q̃(s, t) would satisfy (in local coordinates) the condition Q(s, t) − Q̃(s, t) = s g(s, t) for some map g with g(0, t) = 0.

By considering the Lie derivative L and inner product i operators, condition (2.7) is written as
$$L_{\frac{\partial}{\partial s}}\, i_{\frac{\partial}{\partial t}}\, Q^* A = 0, \quad s = 0,\ \forall t \in I. \tag{2.8}$$
(2.9)
0
where FA denotes the curvature of A.
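Proof of Theorem 2.1. Condition (2.8) and the commutation property $\left[\frac{\partial}{\partial s}, \frac{\partial}{\partial t}\right] = 0$ imply
$$(Q^* dA)\left(\frac{\partial}{\partial t}, \frac{\partial}{\partial s}\right) = L_{\frac{\partial}{\partial t}}\, i_{\frac{\partial}{\partial s}}\, Q^* A, \quad s = 0,\ t \in I. \tag{2.10}$$
Since q is horizontal we have also
$$(Q^* [A, A])\left(\frac{\partial}{\partial t}, \frac{\partial}{\partial s}\right) = 0, \quad s = 0,\ t \in I. \tag{2.11}$$
Equations (2.10) and (2.11) and the structure equation for the curvature imply
$$(Q^* F_A)\left(\frac{\partial}{\partial t}, \frac{\partial}{\partial s}\right) = L_{\frac{\partial}{\partial t}}\, i_{\frac{\partial}{\partial s}}\, Q^* A, \quad s = 0,\ t \in I. \tag{2.12}$$
If we recall (2.6) then (2.12) becomes immediately the first equation of (2.9), namely
$$F_A(\dot q(t), \mathfrak{q}(t)) = \frac{dA(\mathfrak{q}(t))}{dt},$$
while the second equation of (2.9) is obtained by integrating the first. □

To any path (q, 𝔮) in TP and t ∈ I we associate the tangent vectors $(\dot q(t), \dot{\mathfrak{q}}(t)) \in T_{(q(t), \mathfrak{q}(t))} TP$. Altogether the quadruple $(q, \mathfrak{q}, \dot q, \dot{\mathfrak{q}})$ represents a path in TTP.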
To any connection A ∈ 𝒜 we can canonically associate a connection Å on the TG-bundle TP [20, 21], called the tangential connection.

We recall that the tangential connection Å applied to an element of TTP represented by a smooth map Q : (−ε, ε) × (t_0 − ε, t_0 + ε) → P yields
$$\left( A(\dot Q)(0, t_0),\ \left.\frac{\partial A(\dot Q)(s, t_0)}{\partial s}\right|_{s=0} \right) \in \mathfrak{g} \times \mathfrak{g}. \tag{2.13}$$
We have then the following

Remark 2.2. A path (q, 𝔮) in TP represents an element of T𝒫_A P if and only if it is an Å-horizontal path in TP, namely if we have $\mathring{A}(q, \mathfrak{q}, \dot q, \dot{\mathfrak{q}}) = 0$. In other words (q, 𝔮) is the Å-horizontal lift of a path (γ, ρ) in TM with initial point (q(0), 𝔮(0)) ∈ π^{−1}(γ(0), ρ(0)).

A vertical vector 𝔮 ∈ T_q(𝒫_A P) is required to satisfy the extra condition
$$\mathfrak{q}(t) \in V_{q(t)} P, \quad \forall t \in I, \tag{2.14}$$
where V_p P denotes the vertical subspace of T_p P. Finally we have the following

Corollary 2.3. As a consequence of Theorem 2.1 and condition (2.14), vertical vectors in T_q(𝒫_A P) satisfy the equation $\frac{dA(\mathfrak{q}(t))}{dt} = 0$, ∀t ∈ I.
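The step from Theorem 2.1 to Corollary 2.3 can be spelled out as follows. This is only a sketch: it writes 𝔮 for the tangent vector and q̇ for the velocity of the horizontal path (a notational assumption), and it uses the standard fact that the curvature of a connection is a tensorial form, so that it vanishes whenever one of its arguments is vertical.

```latex
\begin{align*}
\mathfrak{q}(t) \in V_{q(t)}P
  &\;\Longrightarrow\; F_A(\mathfrak{q}(t), \dot q(t)) = 0, \\
\frac{dA(\mathfrak{q}(t))}{dt}
  &= -F_A(\mathfrak{q}(t), \dot q(t)) = 0
  \quad \text{by the first equation of (2.9)}, \\
A(\mathfrak{q}(t)) &= A(\mathfrak{q}(0)) \quad \forall t \in I .
\end{align*}
```

In other words, A(𝔮(t)) is a constant element of the Lie algebra along the path, which is what one would expect from a right G-action that is determined by its effect on the initial point.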
2.2. Connections and curvatures on the bundles of horizontal paths. We now consider connections on 𝒫_A P. In particular we are interested here in those connections on 𝒫_A P that are determined by 2-forms in Ω²(M, ad P), as shown in the following

Theorem 2.4. Let B ∈ Ω²(M, ad P) and A, Ā be any pair of connections on P. The form
$$\mathrm{ev}_0^* \bar A + \int_I \mathrm{ev}^* B \tag{2.15}$$
defines a connection on 𝒫_A P.

Proof of Theorem 2.4. The 𝔤-valued 1-form ev_0^* Ā is a connection on 𝒫_A P. Moreover the 1-form $\int_I \mathrm{ev}^* B$ is of the adjoint type and is tensorial, as can be seen by inspecting the explicit expression:
$$\left( \int_I \mathrm{ev}^* B \right)(\mathfrak{q}) = \int_0^1 dt\, B(\mathfrak{q}(t), \dot q(t)), \quad \mathfrak{q} \in T_q(\mathcal{P}_A P). \tag{2.16}$$
□

We will call any connection of the above form a special connection on 𝒫_A P and will denote it by the triple (A, Ā, B). The space of special connections on 𝒫_A P is an affine space modeled on Ω¹(M, ad P) ⊕ Ω²(M, ad P). The reason why we are particularly interested in special connections is that elements of Ω²(M, ad P) and connections are the essential ingredients of four-dimensional BF-theories [8, 9, 22, 4].

The space of special connections is a proper subspace of the space of all smooth connections on 𝒫_A P. A simple example of a connection on 𝒫_A P that is not special is very easy to construct. Let t ↦ B̃_t be a path in Ω²(M, ad P) and A, Ā be any pair of connections on P. Then we have a connection on 𝒫_A P, defined on 𝔮 ∈ T_q(𝒫_A P) as
$$\bar A(\mathfrak{q}(0)) + \int_0^1 dt\, \tilde B_t(\mathfrak{q}(t), \dot q(t)). \tag{2.17}$$
Other examples of connections on 𝒫_A P that are not special will be discussed extensively in a subsequent paper [18].

In (2.15) we often choose Ā = A and denote the triple (A, A, B) simply by a pair (A, B). Here A is kept fixed, so the space of special connections on 𝒫_A P that are represented by pairs (A, B) is an affine space modeled on Ω²(M, ad P). A vector 𝔮 ∈ T_q(𝒫_A P) is, by definition, horizontal with respect to the connection (A, Ā, B) if the following condition is satisfied:
$$\bar A(\mathfrak{q}(0)) + \int_0^1 dt\, [B(\mathfrak{q}(t), \dot q(t))] = 0. \tag{2.18}$$
We consider two particular connections on 𝒫_A P:
1. The trivial connection (A, 0) with curvature ev_0^* F_A. Here condition (2.18) is equivalent to requiring that 𝔮(0) is A-horizontal.
2. The tautological connection (A, −F_A). As a consequence of Theorem 2.1, we have the following
Corollary 2.5. The tautological connection is given by ev_1^* A and its curvature is given by ev_1^* F_A. Condition (2.18) for the tautological connection is the requirement that 𝔮(1) is A-horizontal.

Let us add that on ℒ_A P the tautological and the trivial connections are gauge equivalent, but we refer to [18] for the study of the gauge group of ℒ_A P.

The computation of the curvature for a generic 2-form B involves Chen integrals. These integrals are defined in [23] for scalar forms, but their extension to forms in Ω*(M, ad P) is relatively easy and will be discussed extensively in [18]. Here it is enough to say that the Chen integral of a form w ∈ Ω^{deg(w)}(M, ad P) is just the form given by the ordinary integral
$$\int_{\mathrm{Chen}} w \triangleq \int_I \mathrm{ev}^* w \in \Omega^{\deg(w)-1}(\mathcal{P}M, \mathrm{ad}(\mathcal{P}_A P)),$$
while the Chen bracket of two forms w_1, w_2 ∈ Ω*(M, ad P) is the form in Ω*(𝒫M, ad(𝒫_A P)) of degree deg(w_1) + deg(w_2) − 1 defined as
$$\int_{\mathrm{Chen}} \{w_1; w_2\} \triangleq \int_{0<t_1<t_2<1} \big[ w_1(\cdots, \dot\gamma(t_1))\, dt_1,\ w_2(\cdots, \dot\gamma(t_2))\, dt_2 \big] = (-1)^{\deg(w_2)-1} \int_{0<t_1<t_2<1} \big[ w_1(\cdots, \dot\gamma(t_1)),\ w_2(\cdots, \dot\gamma(t_2)) \big]\, dt_1\, dt_2,$$
where for each value of t_i, w_i(⋯, γ̇(t_i)) are (deg(w_i) − 1)-forms on P to be evaluated at tangent vectors in T_{γ(t_i)} P. Notice that the Chen bracket is bilinear, but neither skew-symmetric nor graded-skew-symmetric. We have the following

Theorem 2.6. The curvature F_{(A,B)} of (A, B) is given by the following 2-form on 𝒫_A P:
$$\mathrm{ev}_0^* F_A - \mathrm{ev}_1^* B + \mathrm{ev}_0^* B + \int_I \mathrm{ev}^* d_A B + \frac{1}{2} \left[ \int_I \mathrm{ev}^* B, \int_I \mathrm{ev}^* B \right] - \int_I \big[ \mathrm{ev}^* A - \mathrm{ev}_0^* A,\ \mathrm{ev}^* B \big]$$
$$= \mathrm{ev}_0^* F_A - \mathrm{ev}_1^* B + \mathrm{ev}_0^* B + \int_I \mathrm{ev}^* d_A B + \int_{\mathrm{Chen}} \{B + F_A; B\}.$$
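As a consistency check on the formula of Theorem 2.6 (a sketch that is not part of the original argument), one can specialize B to the two connections singled out above, the trivial connection (A, 0) and the tautological connection (A, −F_A). The check uses only the bilinearity of the Chen bracket and the Bianchi identity d_A F_A = 0.

```latex
\begin{align*}
B = 0:\quad
F_{(A,0)} &= \mathrm{ev}_0^* F_A , \\
B = -F_A:\quad
F_{(A,-F_A)} &= \mathrm{ev}_0^* F_A + \mathrm{ev}_1^* F_A - \mathrm{ev}_0^* F_A
   - \int_I \mathrm{ev}^* d_A F_A
   + \int_{\mathrm{Chen}} \{0;\, -F_A\} \\
 &= \mathrm{ev}_1^* F_A .
\end{align*}
```

Both values agree with the curvatures stated for the trivial connection and, in Corollary 2.5, for the tautological connection.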
Proof of Theorem 2.6. The curvature of the connection (A, B) is given by:
$$F_{(A,B)} = \mathrm{ev}_0^* F_A + d_{\mathrm{ev}_0^* A} \int_I \mathrm{ev}^* B + \frac{1}{2} \left[ \int_I \mathrm{ev}^* B, \int_I \mathrm{ev}^* B \right].$$
We then recall that we have the following relation between exterior derivatives:
$$d_{\mathcal{P}_A P \times I} = d_{\mathcal{P}_A P} \pm d_I,$$
where the sign is given by the parity of the order of the form on 𝒫_A P. Hence we have the following chain of identities:
$$\int_I \mathrm{ev}^* d_A B = \int_I d_{\mathrm{ev}^* A}\, \mathrm{ev}^* B = \int_I d\, \mathrm{ev}^* B + \int_I [\mathrm{ev}^* A, \mathrm{ev}^* B]$$
$$= d \int_I \mathrm{ev}^* B + \int_I [\mathrm{ev}^* A, \mathrm{ev}^* B] + \mathrm{ev}_1^* B - \mathrm{ev}_0^* B$$
$$= d_{\mathrm{ev}_0^* A} \int_I \mathrm{ev}^* B + \int_I [\mathrm{ev}^* A - \mathrm{ev}_0^* A,\ \mathrm{ev}^* B] + \mathrm{ev}_1^* B - \mathrm{ev}_0^* B.$$
We also have
$$\frac{1}{2} \left[ \int_I \mathrm{ev}^* B, \int_I \mathrm{ev}^* B \right] = \int_{\mathrm{Chen}} \{B; B\},$$
and by taking into account Theorem 2.1, we conclude the proof. □

It is now natural to look for flat connections on 𝒫_A P. If we restrict to special connections (A, B), then (A, 0) is a flat connection if A is flat. In order to find other flat connections, we have to require some reducibility conditions. Let T be an abelian subgroup of G. We use the following

Definition 2.7. We say that a form ω ∈ Ω*(M, ad P) is reducible to T if there exists a T-subbundle of P, such that ω restricted to it takes values in Lie(T).

When we require the reducibility of the connection A and of some forms ω_i ∈ Ω*(M, ad P), it will be understood that there will exist a T-subbundle of P, where the above forms are reducible simultaneously. When we restrict ourselves to considering the bundle ℒ_A P of horizontal paths whose projections are loops, then a sufficient condition for the flatness of (A, B) is given by the following

Theorem 2.8. The curvature of the connection (A, B) on ℒ_A P is zero if the following conditions are satisfied:
1. F_A = 0,
2. d_A B = 0,
3. A and B are reducible to T.

To conclude this section, we recall that the bundles ev_0^* P and 𝒫_A P are isomorphic via (2.4). We denote by A(A, Ā, B) and A(A, B) the connections on ev_0^* P induced respectively by the connection (A, Ā, B) and (A, B) on 𝒫_A P. Namely we set
$$A(A, \bar A, B) \triangleq \mathrm{ev}_0^* \bar A + \int_I j_A^* B, \tag{2.19}$$
$$A(A, B) \triangleq \mathrm{ev}_0^* A + \int_I j_A^* B, \tag{2.20}$$
where j_A has been defined in (2.5). The curvature of (2.19) and (2.20) will be denoted respectively by the symbols F(A, Ā, B) and F(A, B).
3. Horizontal Lift of Paths of Paths and the Non-Abelian Stokes Formula In this section we consider the parallel transport of paths of paths and in particular of imbedded paths of paths. We discuss when the relevant parallel transport is invariant under isotopy. Any connection on PA P defines horizontal lifts of a path of paths 0 in M, namely, of a map 0 : [0, 1] × [0, 1] → M.
(3.1)
Each of these horizontal lifts depends on the choice of the initial path q ∈ PA P , with π(q(t)) = 0(0, t). In turn this initial path q, being A-horizontal, depends only on the choice of a initial point q(0) ∈ π −1 0(0, 0). So we will speak of horizontal lift of paths of paths with respect to an initial point p0 ∈ π −1 0(0, 0). The horizontal lift of a path of paths 0 with respect to a given connection on PA P is, by definition, a path of A-horizontal paths. So we have the following: Theorem 3.1. The horizontal lift of 0 : I × I → M, with respect to a given connection on PA P and an initial point p0 ∈ π −1 0(0, 0), is uniquely determined by the lift of the path of the initial points of the given paths s 0(s, 0) ∈ M. Notice that the lift of the path of initial points considered in Theorem 3.1 coincides with the horizontal lift with respect to a connection A¯ ∈ A only if we choose the special ¯ 0) on PA P . For a general connection on PA P , the lift of the path of connection (A, A, initial points is more general than the horizontal lift: it is still G-equivariant but depends on the whole path of paths 0. It is therefore convenient to consider the following general definition of path-lifting for a principal bundle P (M, G). Definition 3.2. A lift is a smooth map: h : ev0∗ P → PP satisfying the conditions h(p, γ )(0) = p; π (h(p, γ )) = γ , and the G-equivariance h(ph, γ ) = [h(p, γ )]h, ∀h ∈ G. Our definition of lift is a smooth G-equivariant version of the definition of a “connection” for the fibration π : P → M, as given, e.g., in [24]. But we will use the term “lift” instead of “connection” in order to avoid confusion with the ordinary connections on P (M, G). We recall that a path-field is a smooth map M → PM that assigns to each x ∈ M a path beginning at x. Each path-field Z composed with γ yields an element of P(PM). When we lift Z ◦ γ via a connection on PA P , we obtain a path of paths in P whose initial points are a lift of γ in the sense of Definition 3.2. 
Hence Theorem 3.1 can be rephrased as follows:

Theorem 3.3. If we denote by $P(M)$ the space of path-fields on $M$, by $H(P)$ the space of lifts as in Definition 3.2 and by $\mathcal{A}(ev_0^* P)$ the space of connections on $ev_0^* P$, then we have a map
\[ P(M) \times \mathcal{A}(ev_0^* P) \to H(P). \tag{3.2} \]
A. S. Cattaneo, P. Cotta-Ramusino, M. Rinaldi
In particular standard horizontal lifts correspond either to the choice of a special connection of the type $(A, \bar A, 0)$ on $\mathcal{P}_A P$ together with an arbitrary choice of a path-field or to an arbitrary choice of a connection on $\mathcal{P}_A P$ together with the choice of the trivial path-field (i.e. the path-field assigning the constant path to every point of $M$). This shows in particular that (3.2) is far from being injective. Following definition (2.2) we denote the horizontal lift with respect to the connection $\omega$ on $\mathcal{P}_A P$ by the symbol $L(\omega, \Gamma, p_0)$. The comparison of the lift of the initial points of $\Gamma \in \mathcal{P}(\mathcal{P}M)$ with respect to the two connections $\omega$ and $(A, 0)$ defines a path $k_{\Gamma,\omega} \in \mathcal{P}G$ such that
\[ L(\omega, \Gamma, p_0)(s, 0) = L((A, 0), \Gamma, p_0)(s, 0) \cdot k_{\Gamma,\omega}(s). \tag{3.3} \]
Due to the $A$-horizontality of the lifted paths $t \mapsto L(\omega, \Gamma, p_0)(s, t)$, equation (3.3) holds also for a generic $t$, namely we have
\[ L(\omega, \Gamma, p_0)(s, t) = L((A, 0), \Gamma, p_0)(s, t) \cdot k_{\Gamma,\omega}(s). \tag{3.4} \]
When we work with a fixed path of paths $\Gamma$ and a fixed base point $p_0 \in \pi^{-1}\Gamma(0, 0)$ we use a simplified notation, i.e., we set
\[ L_A(s, t) \equiv L((A, 0), \Gamma, p_0)(s, t). \tag{3.5} \]
From (3.3) we conclude that $k_{\Gamma,\omega}$ satisfies the following differential equation:
\[ \frac{dk_{\Gamma,\omega}(s)}{ds}\, k_{\Gamma,\omega}(s)^{-1} = -\omega(L'_A(s, \bullet)), \tag{3.6} \]
where we have to keep in mind that for any $s \in I$, the map $L'_A(s, \bullet)(t) \equiv L'_A(s, t)$ represents a tangent vector in $T\mathcal{P}_A P$. We will use consistently in this paper the notation of (3.6). Namely for any function $f$ of several variables $a, b, c, d, \dots$ we denote by $f(\bullet, b, c, d, \dots)$ the function of one variable ($a$) obtained by evaluating $f$ at $(b, c, d, \dots)$. When we choose $\omega$ to be the special connection $(A, A + \eta, B)$, then equations (3.4) and (2.18) imply the following differential equation:
\[ \frac{dk_{\Gamma,A,\eta,B}(s)}{ds}\, k_{\Gamma,A,\eta,B}(s)^{-1} = -\int_0^1 B(L'_A(s, t), \dot L_A(s, t))\,dt - \eta(L'_A(s, 0)). \tag{3.7} \]
The solution is given by a path-ordered exponential (in the variable $s$)
\[ k_{\Gamma,A,\eta,B}(s) = P\exp\left\{ -\int_{[0,s]} ds_1 \left[ \int_0^1 B(L'_A(s_1, t), \dot L_A(s_1, t))\,dt + \eta(L'_A(s_1, 0)) \right] \right\}. \tag{3.8} \]
If $G$ is an abelian group (e.g., $U(1)^n$), then path-ordering is not needed. Consider now the evaluation map $ev : \mathcal{P}(\mathcal{P}M) \times I \times I \to M$ and the pulled-back bundle $ev_{0,0}^* P$ whose elements are represented precisely by pairs $(\Gamma, p_0)$, where $\Gamma$ is a path of paths on $M$ and $p_0 \in P$ is an element in the fiber over $\Gamma(0, 0)$. We have the following
Loop and Path Spaces and Four-Dimensional BF Theories
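Theorem 3.4. Any connection $\omega$ on $\mathcal{P}_A P$ determines a map
\[ H_\omega : ev_{0,0}^* P \to G \]
of the adjoint type, i.e. satisfying the equation
\[ H_\omega(\Gamma, pg) = \mathrm{Ad}_{g^{-1}}(H_\omega(\Gamma, p)), \quad \forall g \in G. \tag{3.9} \]
In particular when $\omega$ is a special connection $(A, A + \eta, B)$, then the map $H_{(A,A+\eta,B)}$ has the following properties:
\[ H_{(A^\psi,\, A^\psi + \mathrm{Ad}_{\psi^{-1}}\eta,\, \mathrm{Ad}_{\psi^{-1}}B)}(\Gamma, p) = \mathrm{Ad}_{\psi^{-1}(p)}\, H_{(A,A+\eta,B)}(\Gamma, p), \quad \forall \psi \in \mathcal{G}. \tag{3.10} \]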
Proof of Theorem 3.4. We define $H_\omega(\Gamma, p) \equiv k_{\Gamma,\omega}(1)$, where the r.h.s. is in turn defined by (3.3). Equation (3.10) is a consequence of (3.8). □

Theorem 3.4 summarizes the properties of the horizontal lift of paths of paths. Let us now consider the “square” associated to (i.e. the image of) a path of paths $I \times I \to M$. If we compute the map $H$ applied to two different paths of paths with the same image in $M$, is the result the same? The answer is in general no, but some special situations are worth consideration. First we consider the case of the trivial connection on $\mathcal{P}_A P$. It follows from the definition that for any $\Gamma$ and for any $A \in \mathcal{A}$, we have $H_{(A,0)}(\Gamma, p_0) = 1$. Next we consider the tautological connection $(A, -F_A)$. For this connection the non-abelian Stokes formula holds, namely we have the following

Theorem 3.5. For any path of paths $\Gamma \in \mathcal{P}(\mathcal{P}M)$ and for any $p_0 \in \pi^{-1}\Gamma(0, 0)$ we have $H_{(A,-F_A)}(\Gamma, p_0) = \mathrm{Hol}_A(\partial\Gamma, p_0)$. Here $\partial\Gamma : [0, 1] \to M$ denotes the (smooth)¹ loop defined by
\[
(\partial\Gamma)(\tau) =
\begin{cases}
\Gamma(0, 4\tau), & 0 \le \tau \le \tfrac14, \\
\Gamma(4\tau - 1, 1), & \tfrac14 \le \tau \le \tfrac12, \\
\Gamma(1, 3 - 4\tau), & \tfrac12 \le \tau \le \tfrac34, \\
\Gamma(4 - 4\tau, 0), & \tfrac34 \le \tau \le 1.
\end{cases}
\]

Proof of Theorem 3.5. Here the connection on the bundle of horizontal paths is given by $ev_1^* A$. Hence the $(A, -F_A)$-horizontal lift at $p_0 \in P$ of the path of initial points $\Gamma(\bullet, 0)$ is obtained as follows. We first consider the $A$-horizontal lift of $\Gamma(0, \bullet)$ and its end-point $p_1 = \Gamma(0, 1)$. Then we consider the $A$-horizontal lift of the path of end-points $\Gamma(\bullet, 1)$ beginning at $p_1$ and the $A$-horizontal lift of all the paths $\Gamma(s, \bullet)$ for $s \in (0, 1]$ with assigned end-point. The resulting path of initial points is the $(A, -F_A)$-horizontal lift of $\Gamma(\bullet, 0)$. The theorem follows immediately from Theorem 3.4. □

¹ We assume that, when needed, all corners are properly smoothed.
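The piecewise definition of the boundary loop $\partial\Gamma$ in Theorem 3.5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `boundary_loop` and the example map `(s, t) -> (s, t)` are hypothetical and not part of the paper, and no smoothing of the corners is attempted.

```python
def boundary_loop(gamma):
    """Return the loop tau -> (dGamma)(tau) on [0, 1] for a map gamma(s, t).

    The four branches follow the piecewise definition stated in Theorem 3.5.
    Corners are not smoothed in this sketch.
    """
    def loop(tau):
        if not 0.0 <= tau <= 1.0:
            raise ValueError("tau must lie in [0, 1]")
        if tau <= 0.25:
            return gamma(0.0, 4.0 * tau)        # along the initial path Gamma(0, .)
        if tau <= 0.5:
            return gamma(4.0 * tau - 1.0, 1.0)  # along the path of end-points
        if tau <= 0.75:
            return gamma(1.0, 3.0 - 4.0 * tau)  # back along the final path Gamma(1, .)
        return gamma(4.0 - 4.0 * tau, 0.0)      # back along the path of initial points
    return loop


# Hypothetical example: Gamma is the inclusion of the unit square, Gamma(s, t) = (s, t).
square = boundary_loop(lambda s, t: (s, t))
corners = [square(k / 4.0) for k in range(5)]
```

In this example the loop visits the four corners of the square in order and closes up at $\Gamma(0, 0)$, which is the base point used for the holonomy.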
The non-abelian Stokes formula has a long history, starting from [10, 11]. For some more recent papers see [25–27]. The treatment of the problem as a problem of parallel transport in a space of paths is new. We now consider different paths of paths with the same image in $M$ and see if their image with respect to the map given in Theorem 3.4 is the same. In this section, from now on, we limit ourselves to considering imbedded paths (or loops) of paths. We assume, in particular, that we have an isotopy $\Gamma_r : I \times I \to M$, $r \in [0, 1]$, satisfying the following assumptions:

G.1 $\Gamma_r(0, 0) = \Gamma_0(0, 0)$, $\forall r \in [0, 1]$.
G.2 $\mathrm{Im}(\Gamma_r) = \mathrm{Im}(\Gamma_0)$, $\forall r \in [0, 1]$.
G.3 $\mathrm{Im}(\Gamma_r(\bullet, 0)) = \mathrm{Im}(\Gamma_0(\bullet, 0))$, $\forall r \in [0, 1]$.

By taking derivatives of $\Gamma_r$ with respect to the parameter $r$, we define for each $r$ a smooth map $Z_r : I \times I \to TM$, with $Z_r(s, t) \in T_{\Gamma_r(s,t)} M$. The following conditions for $Z_r$ are a consequence of the corresponding conditions for $\Gamma_r$:

Z.1 $Z_r(0, 0) = 0$.
Z.2 $Z_r(s, t)$ is tangent to $\mathrm{Im}(\Gamma_0)$ and the restriction of $Z_r$ to $\partial(I \times I)$ is tangent to $\mathrm{Im}(\partial\Gamma_0)$.
Z.3 $Z_r(s, 0)$ is tangent to $\mathrm{Im}(\Gamma_0(\bullet, 0))$ and $Z_r(1, 0) = 0$.

When do we have $H(\Gamma_r, p_0) = H(\Gamma_0, p_0)$? We have the following partial answer:

Theorem 3.6. If $\Gamma_r : I \times I \to M$ is an isotopy satisfying the conditions G.1 and G.2 above and if, moreover, $F_A = 0$, then for any $B \in \Omega^2(M, \mathrm{ad}\,P)$ we have
\[ H_{(A,\lambda B)}(\Gamma_r, p_0) = H_{(A,\lambda B)}(\Gamma_0, p_0) + o(\lambda), \quad \forall r \in [0, 1]. \tag{3.11} \]
If, in addition, condition G.3 is satisfied, then we have also
\[ H_{(A,A+\lambda\eta,\lambda B)}(\Gamma_r, p_0) = H_{(A,A+\lambda\eta,\lambda B)}(\Gamma_0, p_0) + o(\lambda), \quad \forall r \in [0, 1]. \tag{3.12} \]
Here we have assumed that some representation of the group $G$ has been chosen so that the sum in (3.11) and (3.12) makes sense. Even though we do not have in general a true horizontal lift of squares or of surfaces, the implications of Theorem 3.6 are that in some particular cases such horizontal lifts do exist. This is true if we consider imbeddings or immersions as paths of paths, and small deviations from a flat connection and from $B = 0$.

Proof of Theorem 3.6. Consider $L((A, 0), \Gamma_r, p_0)$, i.e., the $(A, 0)$-horizontal lift of $\Gamma_r$. Since $A$ is flat, the image under $L((A, 0), \Gamma_r, p_0)$ of any curve in $I \times I$ is $A$-horizontal in $P$. We take derivatives with respect to $r$ in $L((A, 0), \Gamma_r, p_0)$ and obtain a map $\bar Z_r : I \times I \to TP$. For each $(s, t)$, $\bar Z_r(s, t)$ is now a horizontal lift of $Z_r(s, t)$. The map $r \mapsto H_{(A,\lambda B)}(\Gamma_r, p_0)$ defines a curve in $G$. By taking the logarithmic derivative of the above map, we obtain an element of $\mathfrak{g}$. If this element is zero, up to terms of order $\lambda^2$ and for any $r$, then the theorem is proved. We use again a simplified notation by setting
\[ L_{A,r}(s, t) \equiv L((A, 0), \Gamma_r, p_0)(s, t). \tag{3.13} \]
We first consider equation (3.11). By taking into account (3.8), we see that the element in $\mathfrak{g}$ we are looking for is
\[ -\frac{d}{dr} \int_{I \times I} L_{A,r}^* B + o(\lambda). \tag{3.14} \]
The integrand above coincides with $L_{A,r}^*\, \mathcal{L}_{\bar Z_r} B$. But the Stokes theorem and property Z.2 imply that the integral of $L_{A,r}^*\, d\, i_{\bar Z_r} B$ vanishes. Moreover we have
\[ \int_{I \times I} L_{A,r}^*\, i_{\bar Z_r}\, dB = \int_{I \times I} L_{A,r}^*\, i_{\bar Z_r}\, d_A B = 0 \]
since, for any $X, Y \in T_{s,t}(I \times I)$, the three vectors $L_{A,r*} X$, $L_{A,r*} Y$, $\bar Z_r(s, t)$ are $A$-horizontal and linearly dependent. As for (3.12), we set
\[ L_{A,r,0}(s) \equiv L((A, 0), \Gamma_r, p_0)(s, 0). \tag{3.15} \]
In order to prove (3.12) we have to show the vanishing of the term
\[ \frac{d}{dr} \int_I L_{A,r,0}^*\, \eta = \int_I L_{A,r,0}^*\, \mathcal{L}_{\bar Z_r} \eta. \tag{3.16} \]
The integral (3.16) vanishes since the Stokes theorem and conditions Z.1-Z.2 imply that $\int_I L_{A,r}^* (d\, i_{\bar Z_r} \eta)$ vanishes, while condition Z.3 implies $\int_I L_{A,r}^* (i_{\bar Z_r} d_A \eta) = 0$. □

Remark 3.7. If the image of $\Gamma_r$ is contained in a submanifold $i : N \hookrightarrow M$, then in order for the conclusions of Theorem 3.6 to remain true, it is enough to require $i^* F_A = 0$. Moreover in (3.11) and (3.12) we may replace $\lambda B$ with any $B(\lambda)$ such that $i^* B = o(\lambda)$.

Let us come back to the non-abelian Stokes formula. This formula implies that $\mathrm{Tr}\, H_{(A,-F_A)}(\Gamma, p_0)$ coincides with the Wilson loop of the boundary $\partial\Gamma$. We recall that the Wilson loop is defined precisely as $\mathrm{Tr}\, \mathrm{Hol}_A(\gamma, p_0)$ for $\gamma \in LM$, $A \in \mathcal{A}$ and $\pi(p_0) = \gamma(0)$. When we consider instead of the tautological connection a generic special connection $(A, B)$, the corresponding generalized Wilson loop $\mathrm{Tr}\, H_{(A,B)}(\Gamma, p_0)$ depends on the path of paths $\Gamma$ and not only on $\partial\Gamma$. This may be relevant for the understanding of the quark-confinement problem in the framework of BF-theories. In particular we are interested in considering generalized Wilson loops represented by deformations of the ordinary Wilson loop, where, up to the second order in the perturbative expansion, $\mathrm{Tr}\, H_{(A,B)}(\Gamma, p_0)$ depends only on the surface $\mathrm{Im}(\Gamma)$ and not on the particular path of paths $\Gamma$. Accordingly we consider a special connection given by a perturbation series in a neighborhood of a flat connection $(A, 0)$, where $F_A = 0$. We may use two different variables $\kappa$ and $\lambda$ to describe respectively the deformation of the connection $A$ and of the 2-form field $B$, i.e. we set:
\[ A(\kappa) \equiv A + \kappa\eta + o(\kappa), \qquad B(\lambda) \equiv \lambda B + o(\lambda). \tag{3.17} \]
We now choose a smooth isotopy of imbeddings $\Gamma_r$ (or smooth homotopy of immersions) as before and set
\[ H(\kappa, \lambda, r, p_0) \equiv H_{(A(\kappa),\, -F_{A(\kappa)} + B(\lambda))}(\Gamma_r, p_0). \tag{3.18} \]
If we have only one parameter $\kappa$, we set
\[ H(\kappa, r, p_0) \equiv H_{(A(\kappa),\, -F_{A(\kappa)} + B(\kappa))}(\Gamma_r, p_0). \tag{3.19} \]
When $\lambda = 0$, then the non-abelian Stokes formula implies that $H(\kappa, \lambda = 0, r, p_0)$ is independent of $r$. In the general case the power series expansions of $H(\kappa, \lambda, r, p_0)$ and $H(\kappa, r, p_0)$ depend on $r$ but satisfy the following:

Theorem 3.8. Let $P(M, G)$ be a principal $G$-bundle admitting a flat connection $A$. Let $\Gamma_r$ satisfy G.1 and G.2 and let $A(\kappa)$ and $B(\lambda)$ be defined as above. For $H(\kappa, \lambda, r, p_0)$ given by (3.18) we have the following equation:
\[ \left. \frac{\partial^2 H(\kappa, \lambda, r, p_0)}{\partial\lambda\, \partial r} \right|_{\kappa=\lambda=0} = 0. \tag{3.20} \]
If we have only one parameter $\kappa = \lambda$, and we assume also G.3, then we have
\[ \left. \frac{\partial^2 H(\kappa, r, p_0)}{\partial\kappa\, \partial r} \right|_{\kappa=0} = 0, \tag{3.21} \]
where Definition (3.19) has been assumed. Theorem 3.8 provides a surface law for the generalized Wilson loop in BF-theories. The main difference between Theorem 3.6 and Theorem 3.8 lies in the fact that in the latter the field $B$ deforms a (non-trivial) tautological connection (for which the non-abelian Stokes formula holds) at any order in $\kappa$.

Proof of Theorem 3.8. As in (3.3) and (3.5) we set $L((A(\kappa), -F_{A(\kappa)} + B(\lambda)), \Gamma_r, p_0)(s, t) \equiv L((A, 0), \Gamma_r, p_0)(s, t)\, k_{(r,\kappa,\lambda)}(s)$, with $k_{(r,\kappa,\lambda)} \in \mathcal{P}G$ and $L_{(r,\kappa)}(s, t) \equiv L((A, 0), \Gamma_r, p_0)(s, t)$. Analogously to (3.7) we have
\[ \frac{dk_{(r,\kappa,\lambda)}(s)}{ds}\, k_{(r,\kappa,\lambda)}(s)^{-1} = -\kappa\eta(L'_{(r,\kappa)}(s, 0)) + \int_0^1 [F_{A(\kappa)} - B(\lambda)](L'_{(r,\kappa)}(s, t), \dot L_{(r,\kappa)}(s, t))\,dt. \tag{3.22} \]
By taking the derivative of (3.22) with respect to $\lambda$ at $\kappa = \lambda = 0$ the r.h.s. of (3.22) becomes
\[ -\int_0^1 B\big(L'_{(A,r)}(s, t), \dot L_{(A,r)}(s, t)\big)\,dt \]
(see (3.13) for the notation) and the proof of Theorem 3.6 applies verbatim to our case. As for (3.21) we notice that we have to replace $\lambda$ with $\kappa$ in (3.22). The derivative with respect to $\kappa$ at $\kappa = 0$ of the r.h.s. of (3.22) becomes
\[ -\eta\big(L'_{(A,r)}(s, 0)\big) + \int_0^1 [-B + d_A\eta]\big(L'_{(A,r)}(s, t), \dot L_{(A,r)}(s, t)\big)\,dt. \]
We differentiate again with respect to $r$. Using G.3 and the same arguments as in Theorem 3.6, we obtain (3.21). □

We end this section by considering the special case of $\Gamma$ being an imbedded loop of paths. The holonomy with respect to the connection $(A, B)$ is then given by
\[ \mathrm{Hol}_{(A,B)}(\Gamma, p_0) = \mathrm{Hol}_A(\Gamma(\bullet, 0), p_0)\, P_s \exp\left\{ -\int_{L_A([0,1]\times[0,1])} B \right\}. \tag{3.23} \]
It is clear that $\mathrm{Hol}_{(A,B)}(\Gamma, p_0)$ is given by the group element $g = g(\Gamma, (A, B), p_0)$ such that $p_0 g$ is the end-point of the $(A, B)$-horizontal lift of the loop of initial points $\Gamma(\bullet, 0)$. If $B = 0$, then the above holonomy is nothing else than the $A$-holonomy of the loop of initial points.

Remark 3.9. If $\Gamma \in L(\mathcal{P}M)$ then we have
\[ H_{(A,B)}(\Gamma, p_0) = \mathrm{Hol}_A^{-1}(\Gamma, p_0)\, \mathrm{Hol}_{(A,B)}(\Gamma, p_0) \]
and if $\Gamma$ is a loop of loops:
\[ H_{(A,-F_A)}(\Gamma, p_0) = \mathrm{Hol}_A^{-1}(\Gamma(\bullet, 0), p_0)\, \mathrm{Hol}_A(\Gamma(0, \bullet), p_0)\, \mathrm{Hol}_A(\Gamma(\bullet, 0), p_0)\, \mathrm{Hol}_A^{-1}(\Gamma(0, \bullet), p_0). \]
4. Local Coordinates

Here we discuss the expressions of the special connections on $\mathcal{P}_A P$ and of the relevant parallel transport in local coordinates. Let $U$ be the domain of a local chart in $M$. We denote by $\mathcal{P}U$ the space of paths in $U$ and by $\mathcal{P}_U M$ the space of paths in $M$ with initial point in $U$. Any section $\sigma : U \subset M \to P$ determines a section
\[ \hat\sigma : \mathcal{P}_U M \to \mathcal{P}_A P, \qquad \gamma \mapsto \hat\sigma(\gamma) := L(A, \gamma, \sigma(\gamma(0))), \tag{4.1} \]
where, as before, $L$ denotes the horizontal lift. So the bundle of horizontal paths is trivial if and only if the bundle $P$ is trivial.

Definition 4.1. For any section $\sigma : U \to P$ we define $h : \mathcal{P}U \times I \to G$ by the equation $\sigma(\gamma(t))\, h(\gamma, t) \equiv [\hat\sigma(\gamma)](t)$.

The map $h$ allows us to compare the $A$-parallel transport with the image of a section $\sigma$ and is given by the standard path-ordered exponential of the integral
\[ h(\gamma, t) = P\exp \int_{[0,t]} \gamma^*(-\sigma^* A). \]
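As an illustration of the formula $h(\gamma, t) = P\exp\int_{[0,t]}\gamma^*(-\sigma^*A)$, the abelian case can be evaluated numerically, since no path-ordering is needed there. The sketch below is not taken from the paper: it assumes $G = U(1)$, writes the local connection form as $\sigma^* A = i\,a$ with $a$ a real 1-form, and picks the hypothetical example $a = x\,dy$ on the plane with $\gamma$ the unit circle. All function names are illustrative.

```python
import cmath
import math


def h_abelian(a, gamma, gamma_dot, t, steps=1000):
    """Approximate exp(-i * int_0^t a(gamma(tau))(gamma'(tau)) dtau) by the midpoint rule.

    Assumption: G = U(1) and sigma^* A = i * a, with `a(point, velocity)` real-valued.
    """
    dt = t / steps
    total = 0.0
    for k in range(steps):
        tau = (k + 0.5) * dt
        total += a(gamma(tau), gamma_dot(tau)) * dt
    return cmath.exp(-1j * total)


def a(point, velocity):
    # Hypothetical local connection form a = x dy.
    return point[0] * velocity[1]


def circle(tau):
    return (math.cos(2 * math.pi * tau), math.sin(2 * math.pi * tau))


def circle_dot(tau):
    return (-2 * math.pi * math.sin(2 * math.pi * tau),
            2 * math.pi * math.cos(2 * math.pi * tau))


h_full = h_abelian(a, circle, circle_dot, 1.0)
```

For this choice $da = dx \wedge dy$, so by the ordinary Stokes theorem the line integral around the unit circle equals the enclosed area $\pi$, and `h_full` is close to $e^{-i\pi} = -1$.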
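4.1. Connections. Given any $X \in T_\gamma(\mathcal{P}M)$, we set $h_\gamma(t) \equiv h(\gamma, t)$, $q = \hat\sigma(\gamma)$ and $\mathfrak{q} \equiv \hat\sigma_* X$. We have
\[ \mathfrak{q} = (\sigma_* X)\, h_\gamma + i\big((h^{-1} dh) X\big), \tag{4.2} \]
where the map $i : \mathfrak{g} \to \mathfrak{X}(P)$ is, by definition, the map yielding fundamental vector fields. Moreover we have
\[ \dot q = (\sigma_* \dot\gamma)\, h_\gamma + (\sigma \circ \gamma)\, \dot h_\gamma. \tag{4.3} \]
The second terms of both (4.2) and (4.3) are vertical vector fields along $q$, so we finally obtain
\[ \left( \hat\sigma^* \int_I q^* B \right)(X) = \int_0^1 dt\, \mathrm{Ad}_{h_\gamma^{-1}(t)}\, \sigma^* B\, (X(t), \dot\gamma(t)). \tag{4.4} \]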
This is the expression in local coordinates of the difference between the connection $(A, B)$ and the connection $(A, 0)$.²

4.2. Horizontal lift of paths of paths. Now we compute the $(A, B)$-horizontal lift of $\Gamma \in \mathcal{P}(\mathcal{P}M)$ in local coordinates. Consider a map $\Gamma : I \times I \to M$, where the first variable ($s$) describes the path of paths, while the second variable ($t$) describes each individual path. We assume that the image of $\Gamma$ is all contained in the domain $U$ of a local section $\sigma : U \subset M \to P$. We consider the section $\hat\sigma$ (4.1) on $\mathcal{P}_A P$. We have explicitly, for each fixed $s \in I$,
\[ (\hat\sigma\Gamma)(s, t) = \sigma(\Gamma(s, t))\, h_{\Gamma(s,\bullet)}(t), \]
where $h$ is as in Definition 4.1 and $\Gamma(s, \bullet)$ denotes the path in $\mathcal{P}M$ given by $t \mapsto \Gamma(s, t)$. The $(A, 0)$-horizontal lift of $\Gamma$ is given, in local coordinates, by
\[ (s, t) \mapsto (\hat\sigma\Gamma)(s, t)\, h_{\Gamma(\bullet,0)}(s) = \sigma(\Gamma(s, t))\, h_{\Gamma(s,\bullet)}(t)\, h_{\Gamma(\bullet,0)}(s). \]
When we consider as in (3.3) the path $k_{\Gamma,A,B} : I \to G$, then the $(A, B)$-horizontal lift of $\Gamma$ is given, in local coordinates, by
\[ (s, t) \mapsto (\hat\sigma\Gamma)(s, t)\, h_{\Gamma(\bullet,0)}(s)\, k_{\Gamma,A,B}(s). \tag{4.5} \]
We now consider the following vectors in $T_{(\hat\sigma\Gamma)(s,t)} P$:
\[ \Gamma_1(s, t) := \frac{\partial \hat\sigma\Gamma}{\partial s}(s, t), \qquad \Gamma_2(s, t) := \frac{\partial \hat\sigma\Gamma}{\partial t}(s, t). \tag{4.6} \]
We also set $K_{\Gamma,A,B}(s) := h_{\Gamma(\bullet,0)}(s)\, k_{\Gamma,A,B}(s)$. The $(A, B)$-horizontality of (4.5) translates into the following equations:
\[ \frac{dK_{\Gamma,A,B}(s)}{ds}\, K_{\Gamma,A,B}^{-1}(s) + A(\Gamma_1(s, 0)) + \int_0^1 dt\, B(\Gamma_1(s, t), \Gamma_2(s, t)) = 0, \tag{4.7} \]
\[ \frac{dh_{\Gamma(\bullet,0)}(s)}{ds}\, h_{\Gamma(\bullet,0)}^{-1}(s) + A(\Gamma_1(s, 0)) = 0. \tag{4.8} \]

² The expression (4.4) was first considered in [19] where the notation $\mathrm{Hol}_A(\gamma)_0^t$ for $h_\gamma(t)$ was employed.
Hence we have
\[ \frac{dk_{\Gamma,A,B}(s)}{ds}\, k_{\Gamma,A,B}^{-1}(s) + \mathrm{Ad}_{h_{\Gamma(\bullet,0)}^{-1}(s)} \int_0^1 dt\, B(\Gamma_1(s, t), \Gamma_2(s, t)) = 0, \tag{4.9} \]
and
\[ \int_0^1 dt\, B(\Gamma_1(s, t), \Gamma_2(s, t)) = \int_0^1 dt\, \mathrm{Ad}_{h_{\Gamma(s,\bullet)}^{-1}(t)}\, (\sigma^* B)\big(\Gamma'(s, t), \dot\Gamma(s, t)\big). \]
The solution of (4.9) is finally given by
\[ k_{\Gamma,A,B}(s_0) = P_{s_0} \exp\left\{ -\int_0^{s_0} ds\, \mathrm{Ad}_{h_{\Gamma(\bullet,0)}^{-1}(s)} \int_0^1 dt\, \mathrm{Ad}_{h_{\Gamma(s,\bullet)}^{-1}(t)}\, (\sigma^* B)\big(\Gamma'(s, t), \dot\Gamma(s, t)\big) \right\}, \tag{4.10} \]
where $P_{s_0}$ denotes path-ordering in the variable $s_0$.

5. Transformation Properties of the Holonomy as a Function of the Connection

In this section we consider a generic manifold $X$ and a principal $G$-bundle $\pi : P_X \to X$ (typically we have in mind either $X = M$ or $X = \mathcal{P}M$) and we recall the main properties of the parallel transport of paths and of the holonomy of loops, both seen as functions on the space of connections $\mathcal{A}(P_X)$. Let $\omega \in \mathcal{A}(P_X)$, $\eta \in T\mathcal{A}(P_X)$, $\gamma \in \mathcal{P}X$, and $u \in \pi^{-1}\gamma(0) \subset P_X$. We consider the horizontal lift $L(\omega, \gamma, u)$ (2.2). To the path of connections given by $\kappa \mapsto \omega + \kappa\eta$, $\kappa \in (-\epsilon, \epsilon)$, we associate the path in $\mathcal{P}G$, $\kappa \mapsto g_\kappa = g_\kappa(\gamma, \omega, \eta, u) \in \mathcal{P}G$, given by the solution of the following equation:
\[ L(\omega + \kappa\eta, \gamma, u)(t) = L(\omega, \gamma, u)(t)\, g_\kappa(t), \qquad g_\kappa(0) = 1. \tag{5.1} \]
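By definition we have
\[ (\omega + \kappa\eta)\left( \frac{dL(\omega + \kappa\eta, \gamma, u)(t)}{dt} \right) = 0, \quad \forall \kappa \in (-\epsilon, \epsilon),\ \forall t. \tag{5.2} \]
In this section we use again a simplified notation for the horizontal lift by setting $L(t) \equiv L(\omega, \gamma, u)(t)$, and we consider the relevant evaluation map $ev : I \times L \to P$. The paths $g_\kappa(t)$ satisfy the following equation in the variable $t$:
\[ g_\kappa^{-1}\, \dot g_\kappa + \kappa\, \mathrm{Ad}_{g_\kappa^{-1}}\, \eta(\dot L) = 0. \tag{5.3} \]
The solution is the path-ordered exponential
\[ g_\kappa(t) = P\exp\left\{ -\kappa \int_0^t d\tau\, \eta(\dot L(\tau)) \right\} = P\exp\left\{ -\kappa \int_{[0,t]} ev^* \eta \right\}. \tag{5.4} \]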
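The path-ordered exponential (5.4) and its first-order expansion in $\kappa$ can be checked numerically. The following is a minimal sketch under stated assumptions, not the authors' construction: the Lie-algebra-valued function $\tau \mapsto \eta(\dot L(\tau))$ is replaced by a hypothetical $2 \times 2$ matrix function `eta`, equation (5.3) is read as $\dot g_\kappa = -\kappa\, \eta\, g_\kappa$ with $g_\kappa(0) = 1$, and the ordered exponential is approximated by a product of short-time exponentials with later times multiplying on the left.

```python
import math


def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_exp(a, terms=12):
    """Truncated Taylor series for exp(a); adequate here because a is O(dt)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[x / n for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result


def ordered_exp(eta, kappa, t_end=1.0, steps=400):
    """Approximate g(t_end), where dg/dt = -kappa * eta(t) * g and g(0) = 1."""
    dt = t_end / steps
    g = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(steps):
        tau = (k + 0.5) * dt
        step = mat_exp([[-kappa * dt * x for x in row] for row in eta(tau)])
        g = mat_mul(step, g)  # later times act on the left
    return g


def eta(t):
    # Hypothetical non-commuting integrand standing in for eta(L'(t)).
    return [[0.0, math.cos(t)], [t, 0.0]]


# Central difference in kappa approximates H(1) = d g_kappa(1) / d kappa at kappa = 0.
kappa = 1e-4
g_plus = ordered_exp(eta, kappa)
g_minus = ordered_exp(eta, -kappa)
H = [[(g_plus[i][j] - g_minus[i][j]) / (2 * kappa) for j in range(2)]
     for i in range(2)]
```

For this choice of `eta`, the derivative `H` agrees to numerical accuracy with minus the plain (unordered) integral of `eta` over $[0, 1]$, whose off-diagonal entries are $\sin 1$ and $1/2$. The ordering only affects terms of order $\kappa^2$ and higher.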
We are interested in
\[ H(t) := \left. \frac{dg_\kappa(t)}{d\kappa} \right|_{\kappa=0}. \]
By differentiating (5.3) at $\kappa = 0$, we get
\[ H(t) = -\int_{[0,t]} dt\, \eta(\dot L(t)) = -\int_{[0,t]} ev^* \eta. \tag{5.5} \]
Thus we have proved the following

Theorem 5.1. For any loop $\gamma \in LX$, the logarithmic exterior derivative of the holonomy, seen as a function of the connection $\omega \in \mathcal{A}_X$, is given by
\[ \mathrm{Hol}_\omega^{-1}(\gamma, u)\, \delta(\mathrm{Hol}_\omega(\gamma, u))(\eta) = -\int_I ev^* \eta. \tag{5.6} \]
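We denote by $\mathrm{Aut}\, P_X$ the group of automorphisms of $P_X$ and by $\mathrm{aut}\, P_X$ the Lie algebra of infinitesimal automorphisms of $P_X$. There is an action of $\mathrm{Aut}\, P_X$ on $\mathcal{A}_X$ and a projection (group homomorphism)
\[ \rho : \mathrm{Aut}\, P_X \to \mathrm{Diff}\, X \tag{5.7} \]
whose kernel is the group of gauge transformations $\mathcal{G}(P_X)$. This projection allows us to define an action of $\mathrm{Aut}\, P_X$ on $\mathcal{P}X$. Hence any $\psi \in \mathrm{Aut}\, P_X$ defines an isomorphism of bundles of horizontal paths
\[ \psi : \mathcal{P}_\omega P_X \to \mathcal{P}_{\psi^*\omega} P_X. \tag{5.8} \]
We now want to discuss the effect of this isomorphism on the parallel transport and the holonomy. The isomorphism (5.8) satisfies the following equation:
\[ L(\psi^*\omega, \rho(\psi^{-1}) \circ \gamma, \psi^{-1}(u)) = \psi^{-1}(L(\omega, \gamma, u)), \quad \gamma \in \mathcal{P}X. \tag{5.9} \]
This implies that the infinitesimal action of $\mathrm{aut}\, P_X$ on $\mathcal{P}_\omega P_X$ is just the opposite of the corresponding action on $\mathcal{A}_X$. For any $Z \in \mathrm{aut}\, P_X$ we compute the corresponding Lie derivative $\mathcal{L}_Z\omega = d_\omega i_Z\omega + i_Z F_\omega$. By setting $\eta = \mathcal{L}_Z\omega$ in (5.6), we get
\[
\begin{aligned}
\left. \frac{d}{ds} \right|_{s=0} \mathrm{Hol}_{\omega + s\mathcal{L}_Z\omega}(\gamma, u)
&= -\mathrm{Hol}_\omega(\gamma, u) \int_I ev^* (i_Z F_\omega + d_\omega i_Z\omega) \\
&= -\mathrm{Hol}_\omega(\gamma, u) \int_I ev^* i_Z F_\omega - \mathrm{Hol}_\omega(\gamma, u)\, \big( i_Z\omega(L(1)) - i_Z\omega(L(0)) \big).
\end{aligned}
\tag{5.10}
\]
If we consider the variation of the trace of the holonomy (in any representation of $G$), we have

Theorem 5.2. Let $\omega \in \mathcal{A}_X$, $\gamma \in LX$, $Z \in \mathrm{aut}\, P_X$, and $u \in \pi^{-1}\gamma(0)$. Then we have
\[ (\delta\, \mathrm{Tr}\, \mathrm{Hol}_\omega(\gamma, u))(\mathcal{L}_Z\omega) = -\mathrm{Tr}\left[ \mathrm{Hol}_\omega(\gamma, u) \int_I ev^* i_Z F_\omega \right]. \tag{5.11} \]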
Proof of Theorem 5.2. We have $i_Z\omega(L(1)) = i_Z\omega(L(0)\, \mathrm{Hol}_\omega(\gamma, u)) = \mathrm{Hol}_\omega(\gamma, u)^{-1}\, (i_Z\omega(L(0)))\, \mathrm{Hol}_\omega(\gamma, u)$ and therefore $\mathrm{Tr}\big( \mathrm{Hol}_\omega(\gamma, u)\, i_Z\omega(L(1)) - \mathrm{Hol}_\omega(\gamma, u)\, i_Z\omega(L(0)) \big) = 0$. The result now follows from (5.10). □

Corollary 5.3. The variation (5.11) vanishes if the restriction of $Z$ to the image of $\gamma$ is proportional to the tangent vector $\dot\gamma$. In particular if the loop is an imbedding, then the corresponding trace of the holonomy is invariant under the action of any $\psi \in \mathrm{Aut}\, P_X$ connected to the identity for which $\rho(\psi) \in \mathrm{Diff}\, X$ maps the image of the loop into itself.

6. Holonomy of Cylinders and the Group of Automorphisms of P

In this and the following section we consider loops of paths and loops of loops and study the corresponding holonomies as functions on the space of (special) connections. We will use the name cylinders to denote the image of loops of paths, even though we are not assuming that such loops of paths are necessarily imbeddings or immersions. In this section we look for the conditions which guarantee that the (trace of the) holonomy of a loop of paths is invariant under those automorphisms of $P$ which project onto diffeomorphisms connected to the identity that map the corresponding image (cylinder) into itself. Since we are considering the action of $\mathrm{Aut}\, P$ on the space of connections $\mathcal{A}$, it is convenient to work primarily with the bundle $ev_0^* P$ instead of $\mathcal{P}_A P$, for which the choice of a fixed connection $A$ is required. We will, though, make constant use of the isomorphism $J_A : ev_0^* P \to \mathcal{P}_A P$ (2.4). Equation (5.9) says that the group $\mathrm{Aut}\, P$ of automorphisms of $P$ acts in a natural way on the bundle $ev_0^* P$. In fact we have
\[ P \times \mathcal{P}M \ni (p, \gamma) \mapsto (\psi(p), \rho(\psi)(\gamma)), \quad \psi \in \mathrm{Aut}\, P, \]
with $p \in \pi^{-1}\gamma(0) \Longrightarrow \psi(p) \in \pi^{-1}\rho(\psi)(\gamma(0))$. The group $\mathrm{Aut}\, P$ can be identified with a subgroup of $\mathrm{Aut}(ev_0^* P)$. The Lie algebra $\mathrm{aut}\, P$ can be accordingly identified with a subalgebra of $\mathrm{aut}(ev_0^* P)$. Given now $Z \in \mathrm{aut}\, P$ and the corresponding element in $\mathrm{aut}(ev_0^* P)$ which we denote by the same symbol, we want to describe $(J_A)_* Z \in \mathrm{aut}(\mathcal{P}_A P)$ explicitly. Consider $q \in \mathcal{P}_A P$. The path
\[ t \mapsto (\pi q(t), \rho_* Z(\pi q(t))) \tag{6.1} \]
is an element of $T\mathcal{P}M$. We now lift (6.1) $A$-horizontally (see Remark 2.2) with initial point $(p, Z(p)) \in TP$. This lifted path is $((J_A)_* Z)(q)$. For any $t$, $((J_A)_* Z)(q)(t)$ is a vector in $T_{q(t)} P$. Notice that in general $((J_A)_* Z)(q)(t)$ is different from $Z(q(t))$ unless $t = 0$. The isomorphism (2.4) $J_A : ev_0^* P \to \mathcal{P}_A P$ and the corresponding evaluation map (2.5) $j_A : ev_0^* P \times I \to P$ allow us to transform forms on $\mathcal{P}_A P$ defined by Chen integrals into forms defined on $ev_0^* P$. The result of performing first Chen integrals and then pulling back the forms to $ev_0^* P$ via $J_A$ will be represented by the symbol $\int_{\mathrm{Chen}(A)}$. In the special case of line-integrals, we have for a generic $k$-form $\varphi$ on $P$
\[ \int_{\mathrm{Chen}(A)} \varphi = \int_I j_A^* \varphi. \]
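Then we have the following

Theorem 6.1. The pullback of the connection $A(A, B)$ (2.20) via $\psi \in \mathrm{Aut}\, P$ is given by
\[ \psi^*(A(A, B)) = ev_0^* \psi^* A + \int_I j_{\psi^* A}^*\, \psi^* B. \tag{6.2} \]
At the infinitesimal level, for any $Z \in \mathrm{aut}\, P$, we have
\[ \mathcal{L}_Z A(A, B) = ev_0^* \mathcal{L}_Z A + \int_{\mathrm{Chen}(A)} \mathcal{L}_Z B + \int_{\mathrm{Chen}(A)} \{\mathcal{L}_Z A; B\}. \tag{6.3} \]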
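Proof of Theorem 6.1. We have
\[ \mathcal{L}_Z A(A, B) = ev_0^* \mathcal{L}_Z A + \int_{\mathrm{Chen}(A)} \mathcal{L}_Z B + \left. \frac{d}{d\kappa} \right|_{\kappa=0} \int_{\mathrm{Chen}(A + \kappa\mathcal{L}_Z A)} B. \tag{6.4} \]
If we are given $\eta \in \Omega^1(M, \mathrm{ad}\,P)$ and $\zeta \in \Omega^*(M, \mathrm{ad}\,P)$ we have
\[ \left. \frac{d}{d\kappa} \right|_{\kappa=0} \int_{\mathrm{Chen}(A+\kappa\eta)} \zeta = \left. \frac{d}{d\kappa} \right|_{\kappa=0} \int_I j_{A+\kappa\eta}^*\, \zeta = \left. \frac{d}{d\kappa} \right|_{\kappa=0} \int_I \mathrm{Ad}_{g_\kappa^{-1}}\, j_A^*\, \zeta, \]
where $g_\kappa$ is defined as in (5.1). Now the proof follows from (5.5). □

The curvature $F(A, B)$ of $A(A, B)$ at $(q, p)$ is given by
\[ ev_0^* F_A - j_A(1)^* B + ev_0^* B + \int_{\mathrm{Chen}(A)} d_A B + \int_{\mathrm{Chen}(A)} \{B + F_A; B\}. \tag{6.5} \]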
A direct consequence of Theorem 5.2 is

Theorem 6.2. Let $\Gamma \in L(\mathcal{P}M)$ and $Z \in \mathrm{aut}\, P$. The trace of the holonomy $\mathrm{Hol}_{A(A,B)}(\Gamma, p)$ in $ev_0^* P$ with respect to $A(A, B)$ transforms as follows:
\[ \delta\, \mathrm{Tr}\, \mathrm{Hol}_{A(A,B)}(\Gamma, p)(Z) = -\mathrm{Tr}\left[ \mathrm{Hol}_{A(A,B)}(\Gamma, p) \int_I ev^* i_Z F(A, B) \right], \tag{6.6} \]
with $ev : I \times L(A(A, B), \Gamma, p) \to P$.

We now compute explicitly (6.6). First we set
\[ L_{A,B}(s, t) \equiv L(A(A, B), \Gamma, p)(s)(t), \qquad Z(s, t) \equiv ((J_A)_* Z)(L_{A,B}(s, \bullet))(t) \in T_{L_{A,B}(s,t)} P. \tag{6.7} \]
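We can write down (6.6) as follows:
\[
\begin{aligned}
\delta\, \mathrm{Tr}\, \mathrm{Hol}_{A(A,B)}(\Gamma, p)(Z) = -\mathrm{Tr}\Bigg[ \mathrm{Hol}_{A(A,B)}(\Gamma, p) \Bigg\{
& \int_0^1 ds\, F_A\big(Z(s, 0), L'_{A,B}(s, 0)\big) \\
& + \int_0^1 ds \int_0^1 dt\, d_A B\big(Z(s, t), L'_{A,B}(s, t), \dot L_{A,B}(s, t)\big) \\
& + \int_0^1 ds \int_0^1 dt \int_0^t d\tau\, \Big[ (B + F_A)\big(Z(s, \tau), \dot L_{A,B}(s, \tau)\big),\ B\big(L'_{A,B}(s, t), \dot L_{A,B}(s, t)\big) \Big] \\
& - \int_0^1 ds \int_0^1 dt \int_0^t d\tau\, \Big[ (B + F_A)\big(L'_{A,B}(s, \tau), \dot L_{A,B}(s, \tau)\big),\ B\big(Z(s, t), \dot L_{A,B}(s, t)\big) \Big]
\Bigg\} \Bigg].
\end{aligned}
\]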
In order to obtain the vanishing of the previous expression, we make some extra assumptions on the vector field $Z$, namely:

A) $\pi_* Z(s, 0)$ is proportional to the tangent vector to the path of initial points $\Gamma(\bullet, 0)$, i.e., $Z(s, 0)$ is proportional to $L'_{A,B}(s, 0)$ up to vertical vectors,
B) $\pi_* Z(s, t)$ is a linear combination of $\Gamma'(s, t)$ and $\dot\Gamma(s, t)$, with coefficients that, in general, are functions of $s$ and $t$.

Moreover let $\Sigma$ be a submanifold of $M$ containing $\mathrm{Im}(\Gamma)$. For the restriction of $P$ to $\Sigma$, we make the following assumptions:

C) the connection $A$ restricted to the bundle $P_\Sigma$ is reducible to an abelian subgroup $T$ of $G$,
D) the form $B \in \Omega^2(M, \mathrm{ad}\,P)$ restricted to $P_\Sigma$ is (simultaneously) reducible to $T$.

We have finally the following

Theorem 6.3. We have
\[ \delta\, \mathrm{Tr}\, \mathrm{Hol}_{A(A,B)}(\Gamma, p)(Z) = 0 \tag{6.8} \]
provided either that conditions A), B), C), D) are satisfied or that conditions A), B) are satisfied together with the extra requirement that on $\Sigma$ we have either $B = -F_A$ or $B = 0$.

7. Invariance Properties of the (Trace of the) (A, B)-Holonomy

The space of connections on $ev_0^* P$ of the type $A(A, B)$ is isomorphic to the affine space $\mathcal{A} \times \Omega^2(M, \mathrm{ad}\,P)$, which is acted upon by some transformation groups that arise in the framework of quantum field theories of the BF type (see below). In this section we want to check under what conditions the trace of the $(A, B)$-holonomy is invariant under those transformation groups. In quantum field theories one considers first of all the gauge group $\mathcal{G}$. If we divide $\mathcal{G}$ by its center and consider only irreducible connections, then $\mathcal{G}$ acts freely on $\mathcal{A} \times \Omega^2(M, \mathrm{ad}\,P)$:
\[ (A, B)g = \big(A^g, \mathrm{Ad}_{g^{-1}} B\big). \tag{7.1} \]
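We have moreover the group $\mathcal{G}_T$ given by the semidirect product $\mathcal{G} \ltimes \Omega^1(M, \mathrm{ad}\,P)$, where $\mathcal{G}$ acts on the abelian group $\Omega^1(M, \mathrm{ad}\,P)$ via the adjoint action. The group $\mathcal{G}_T$ acts non-freely in two ways on $\mathcal{A} \times \Omega^2(M, \mathrm{ad}\,P)$. The first action is given by the transformation
\[ (A, B) \mapsto \big(A^g + \eta,\ \mathrm{Ad}_{g^{-1}} B - d_{A^g}\eta - \tfrac12 [\eta, \eta]\big), \quad (g, \eta) \in \mathcal{G}_T, \tag{7.2} \]
while the second action is given by
\[ (A, B) \mapsto \big(A^g,\ \mathrm{Ad}_{g^{-1}} B - d_{A^g}\eta\big), \quad (g, \eta) \in \mathcal{G}_T. \tag{7.3} \]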
Before seeing how the above transformation groups act on the holonomy, we compute the derivative of the $(A, B)$-holonomy as a function on $\mathcal{A} \times \Omega^2(M, \mathrm{ad}\,P)$ at $(\eta, \beta) \in T\mathcal{A} \times T\Omega^2(M, \mathrm{ad}\,P)$. Under the transformation $A \mapsto A + \eta$, $B \mapsto B + \beta$ the connection $A(A, B) = ev_0^* A + \int_I j_A^* B$ on the bundle $ev_0^* P \to \mathcal{P}M$ transforms into
\[ ev_0^*(A + \kappa\eta) + \int_{\mathrm{Chen}(A+\kappa\eta)} (B + \kappa\beta). \]
The corresponding derivative of the holonomy is given by:
\[ \delta\, \mathrm{Tr}\, \mathrm{Hol}_{A(A,B)}(\Gamma, p)(\eta, \beta) = -\mathrm{Tr}\left[ \mathrm{Hol}_{A(A,B)}(\Gamma, p) \left\{ \int_I ev_0^* \eta + \int_{\mathrm{Chen}(A)} \beta + \int_{\mathrm{Chen}(A)} \{\eta; B\} \right\} \right]. \tag{7.4} \]
The integral in the r.h.s. of (7.4) can be written explicitly as
\[ \int_0^1 ds\, \eta\big(L'_{A,B}(s, 0)\big) + \int_0^1 ds \int_0^1 dt\, \beta\big(L'_{A,B}(s, t), \dot L_{A,B}(s, t)\big) + \int_0^1 ds \int_0^1 dt \int_0^t d\tau\, \Big[ \eta\big(\dot L_{A,B}(s, \tau)\big),\ B\big(L'_{A,B}(s, t), \dot L_{A,B}(s, t)\big) \Big], \]
where the prime denotes, as usual, the derivative with respect to the variable $s$ and the dot the derivative with respect to the variable $t$, and $L_{A,B}(s, t)$ has been defined in (6.7). A direct consequence of Theorem 6.1 is the following

Theorem 7.1. Let $\Gamma : S^1 \times I \to M$ be any loop in $\mathcal{P}M$, let $p$ be such that $\pi(p) = \Gamma(0, 0)$ and let $g \in \mathcal{G}$ be any gauge transformation. The trace of the $(A, B)$-holonomy of $\Gamma$ with initial point $p$ is invariant under the transformation
\[ (A, B) \mapsto \big(A^g, \mathrm{Ad}_{g^{-1}} B\big). \]
Now we can study the transformation properties of the $(A, B)$-holonomy under (7.2) and (7.3) in the special case when we restrict the elements of $\mathcal{G}$ to be the identity. In this case (7.2) becomes the transformation
\[ A \mapsto A + \eta, \qquad B \mapsto B - d_A\eta - \tfrac12 [\eta, \eta], \]
and we have the following:

Theorem 7.2. When $\beta = -d_A\eta - \tfrac12 [\eta, \eta]$ then (7.4) becomes
\[ \delta\, \mathrm{Tr}\, \mathrm{Hol}_{A(A,B)}(\Gamma, p)(\eta, -d_A\eta) = -\mathrm{Tr}\left[ \mathrm{Hol}_{A(A,B)}(\Gamma, p) \left\{ \int_I ev_1^* \eta - \frac12 \int_{\mathrm{Chen}(A)} [\eta, \eta] - \int_{\mathrm{Chen}(A)} \{B + F_A; \eta\} \right\} \right]. \tag{7.5} \]
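Proof of Theorem 7.2. We have
\[
\begin{aligned}
\int_0^1\!\!\int_0^1 ds\,dt\, [A, \eta]\big(L'_{A,B}(s, t), \dot L_{A,B}(s, t)\big)
&= \int_0^1\!\!\int_0^1 ds\,dt\, \Big[ A\big(L'_{A,B}(s, t)\big),\ \eta\big(\dot L_{A,B}(s, t)\big) \Big] \\
&= \int_0^1\!\!\int_0^1 ds\,dt\, \Big[ A\big(L'_{A,B}(s, 0)\big) + \int_0^t d\tau\, F_A\big(\dot L_{A,B}(s, \tau), L'_{A,B}(s, \tau)\big),\ \eta\big(\dot L_{A,B}(s, t)\big) \Big] \\
&= \iint_{I \times I} ds\,dt\, \Big[ \int_0^1 d\tau\, B\big(\dot L_{A,B}(s, \tau), L'_{A,B}(s, \tau)\big) + \int_0^t d\tau\, F_A\big(\dot L_{A,B}(s, \tau), L'_{A,B}(s, \tau)\big),\ \eta\big(\dot L_{A,B}(s, t)\big) \Big],
\end{aligned}
\]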
where we have used (2.18). Therefore Z Z Z [A, η] − − Z
Z
1
=−
Z
1
ds 0
0
Notice also that Z Z − I
1 0
Chen(A)
t
dt
0
Z
I
Z
0
Chen(A)
0 ˙ ˙ dτ (FA + B)(LA,B (s, τ ), LA,B (s, τ )), η(LA,B (s, t)) .
dη =
Chen(A) Z 1
{η; B} =
0
1
˙ A,B (1, t)) + η(L ˙ A,B (0, t)) − dt −η(L
ds η(L0A,B (s, 0)) − η(L0A,B (s, 1)) =
˙ A,B (0, t)) − HolA(A,B) (0, p)−1 η(L ˙ A,B (0, t))HolA(A,B) (0, p) dt η(L Z
1
− 0
ds η(L0A,B (s, 0)) − η(L0A,B (s, 1)) .
Therefore
Z 1 Z 1 ˙ A,B (s, t)) + ds η(L0A,B (s, 0))− dtdA η(L0A,B (s, t), L −Tr HolA(A,B) (0, p) Z
Z
1
ds 0
Z
1
dt 0
0 t
0
0 ˙ ˙ dτ η(LA,B (s, τ )), B(LA,B (s, t), LA,B (s, t)) = 0 Z Z ev1∗ η − {FA + B; η}. t u I
Chen(A)
Finally we take into account Theorem 7.2 and consider the invariance properties under (7.3) of the trace of the holonomy of a loop of paths 0 : S 1 × I → M. The previous discussion yields the following Theorem 7.3. Corresponding to the action (7.3) we have δTr HolA(A,B) (0, p)(0, dA η) = Z Z Z Z {FA ; η} + B, η + Tr HolA(A,B) (0, p) I
Z I
Chen(A)
LA,B (s, 1)∗ η −
Z
I
I
∗ LA,B (s, 0) η .
I
We now consider loops of loops. In this case we have:

Corollary 7.4. Let $T$ be an abelian subgroup of $G$. If conditions C) and D) of the previous section are satisfied and if the restriction of $\eta \in \Omega^1(M, \mathrm{ad}\,P)$ to $\Gamma : S^1 \times S^1 \to M$ is also reducible to $T$, then the trace of the $(A, B)$-holonomy of the loop of loops $\Gamma$ is invariant under (7.3). If, besides the above conditions, we have also
\[ \int_I ev^* \eta = 0, \qquad ev : I \times \Gamma(\bullet, 0)(p) \to P, \]
then the trace of the $(A, B)$-holonomy for loops of loops is also invariant under (7.2).

The conclusion of this section is that the symmetry (7.3), which arises from the BF (quantum) field theory, does not leave the trace of the holonomy invariant, unless some reducibility constraints are imposed on the connection $A$ and on the field $B$. In this sense the transformations (7.3) represent almost a good symmetry for the observable given by the trace of the $(A, B)$-holonomy. A good symmetry for the same observable would certainly be represented by the group of gauge transformations for $\mathcal{P}_A P$. Unfortunately gauge transformations for $\mathcal{P}_A P$ do not map special connections into special connections and hence are not good symmetries for the BF theories. In general gauge transformations for $\mathcal{P}_A P$ map special connections into special connections plus some extra terms given by Chen integrals and boundary terms. By neglecting these extra terms one obtains exactly (7.3). In this sense the transformations (7.3) are almost gauge transformations for $\mathcal{P}_A P$.

8. Observables, Actions and Quantum Field Theories

An application of the ideas developed in this paper is the construction of new observables for quantum field theories (QFT). A QFT is described by an action functional, and by observable one means another functional that is invariant under the same symmetries that leave the action functional unchanged. A weaker requirement for the observables is the invariance only on shell (i.e., upon using the Euler–Lagrange equations); in this case the quantization of the theory requires the use of the Batalin–Vilkovisky formalism [28, 29], but we will discuss this elsewhere. Throughout this section we will restrict ourselves to considering a four-dimensional manifold $M$.
8.1. Non-topological QFT’s. The first QFT we consider is the Yang–Mills theory described by the action functional
\[ S_{YM}[A] = \|F_A\|^2 = -\int_M \mathrm{Tr}\, (F_A \wedge *F_A), \]
where $*$ is the Hodge dual with respect to the Riemannian metric on $M$. The invariance group of the Yang–Mills action functional is the group of gauge transformations $A \to A^g$. In this framework we have two natural elements of $\Omega^2(M, \mathrm{ad}\,P)$ at our disposal, viz., $F_A$ and $*F_A$. Therefore, we may consider the following family of observables:
\[ O_{\alpha\beta}(\Gamma) = \mathrm{Tr}\, H_{(A,\, \alpha F_A + \beta *F_A)}(\Gamma, p), \tag{8.1} \]
where $\Gamma$ is a path of paths or loops. Theorem 3.4 guarantees that this is indeed an observable. Notice that for $\alpha = \beta = 0$ the observable reduces to the trace of the identity, while for $\alpha = -1$ and $\beta = 0$ it yields the trace of the $A$-holonomy along the boundary $\partial\Gamma$. Taking $\alpha = 0$ and $\beta = 1$ ($\beta = -1$) is an interesting choice if the background connection (i.e., the solution of the Euler–Lagrange equations $d_A * F_A = 0$ around which we are working) is anti-self-dual (self-dual); in this case, on shell the observable is the $A$-holonomy along the boundary of $\Gamma$ but off shell it depends on $\Gamma$ (see Theorem 3.8). Another family of observables can be obtained by replacing $H$ by $\mathrm{Hol}$ in the above formula. As discussed in the Introduction, a physical interpretation of these observables may be the following: as the Wilson loop (i.e., the trace of the $A$-holonomy) describes the displacement of a point-like charge, so the observable $O$ describes the displacement of a path-like (or loop-like) charge, namely of an open or closed string. Notice that, even if the image of $\Gamma$ represents a smooth surface, the observable $O$ depends in general on its underlying path-of-paths structure. If, however, we impose assumption C of Sect. 6 as a boundary condition for $A$, then $O$ will depend only on the surface represented by $\Gamma$ and on the loop of initial points. There are other theories that are equivalent to the Yang–Mills theory, like the first order Yang–Mills theory [1, 2],
\[ S_{YM'} = \frac14 \|B\|^2 + i \int_M \mathrm{Tr}\, (B \wedge F_A). \tag{8.2} \]
In this case, however, we have at our disposal a bigger family of observables than those given by (8.1). In fact as our form in $\Omega^2(M, \mathrm{ad}\,P)$, we can take a generic linear combination $\alpha F_A + \beta * F_A + \gamma B + \delta * B$. Another version of (8.2) is the so-called BF-Yang–Mills theory [22, 4], where $B$ is replaced by $B - d_A\eta$, $\eta \in \Omega^1(M, \mathrm{ad}\,P)$, in the above action and, consequently, in the observable. The BF-Yang–Mills theory has been extensively studied in [4] where it has been shown to be equivalent to the Yang–Mills theory. This equivalence makes the appearance of a surface term for Wilson loops more interesting.
A. S. Cattaneo, P. Cotta-Ramusino, M. Rinaldi
8.2. Topological QFT’s. Topological Quantum Field Theories are QFT’s whose action functional does not depend on the Riemannian structure of $M$ and which are therefore expected to yield topological or smooth invariants as their vacuum expectation values. We consider the following TQFT’s:
1) the topological Yang–Mills theory
$$S_{tYM} = \int_M \mathrm{Tr}\,(F_A \wedge F_A),$$
2) the BF theory with a cosmological term
$$S_{BF-BB} = \int_M \mathrm{Tr}\,(B \wedge F_A) + \frac{1}{2}\int_M \mathrm{Tr}\,(B \wedge B),$$
and 3) the pure BF theory
$$S_{BF} = \int_M \mathrm{Tr}\,(B \wedge F_A).$$
We do not have a non-trivial loop-of-loops observable for the topological Yang–Mills theory. As for the BF theory with a cosmological term, we notice that the symmetries read, at the infinitesimal level,
$$\delta A = d_A\xi + \eta, \qquad \delta B = [B, \xi] - d_A\eta,$$
with $\xi \in \Omega^0(M, \mathrm{ad}\,P)$ and $\eta \in \Omega^1(M, \mathrm{ad}\,P)$. These transformations correspond to (7.2). Since the Euler–Lagrange equations are $B + F_A = 0$, the trace of $\mathrm{Hol}_{(A,B)}$ is almost invariant on shell. The problem is the presence of boundary terms in $\eta$, see (7.5). To get a good on-shell observable for loops of paths $\Gamma$, we have to eliminate these boundary terms; so we may consider
$$O(\Gamma) = \mathrm{Tr}\left[\mathrm{Hol}_{(A,-F_A)}(\Gamma, p)^{-1}\, \mathrm{Hol}_{(A,B)}(\Gamma, p)\right].$$
Notice that on shell this observable is trivial. Off shell one must add Batalin–Vilkovisky corrections. Alternatively one can assume conditions C and D of Sect. 6 as boundary conditions. In this case the above observable is automatically invariant both on shell and off shell.

In the case of the pure BF theory, the infinitesimal symmetries are (7.3), i.e.
$$\delta A = d_A\xi, \qquad \delta B = [B, \xi] - d_A\eta.$$
The Euler–Lagrange equations read $F_A = 0$, $d_A B = 0$. These conditions correspond almost to the flatness of the connection for loops of paths, the missing requirement being the reducibility of $B$, see Theorem 2.8. We have then a first observable for pure BF theory, namely, $O(\Gamma) = \mathrm{Tr}\,\mathrm{Hol}_{(A,B)}(\Gamma, p)$. In fact, by Theorem 7.3 we get
$$\delta O(\Gamma) = 0,$$
provided that we assume conditions C and D of Sect. 6 as boundary conditions. In this case the above observable is invariant both on shell and off shell. Another possible choice for pure BF theory is given by the observable
$$\tilde O(\Gamma) = \mathrm{Tr}\,\exp\left(\frac{d}{dt}\Big|_{t=0} \mathrm{Hol}_{(A,tB)}(\Gamma, p)\right). \qquad (8.3)$$
On shell (i.e. when $F_A = 0$) Theorem 3.6 guarantees that (8.3) is an observable that can be rightly associated to the surface spanned by a loop of paths. To compute the transformation properties of this observable, we must consider the transformation of the holonomy and not of its trace, but only up to the first order in $t$. So $\tilde O$ turns out to be invariant on shell if one requires $\eta$ to vanish on the restriction of $P$ over a submanifold $\Sigma$ containing the image of $\partial\Gamma$. To get a good observable also off shell, i.e., also in the case when $A$ is not flat, one must add Batalin–Vilkovisky corrections. Notice that (8.3) is the exact counterpart of the observable for 3-dimensional BF theory considered in [30–31].

Since the BF theories are topological (i.e., do not depend on the choice of the Riemannian metric on $M$) one expects that the vacuum expectation values of the above metric-independent observables will yield smooth invariants of the image of an imbedded (immersed) loop of paths (of loops). When $M$ is a four-dimensional simply connected manifold, we conjecture that these invariants are related to the Kronheimer–Mrowka invariants [17] of imbedded (immersed) surfaces. Both in their theory and in our framework, a special rôle is played by connections that are reducible when restricted to the given surface. Moreover both in [17] and in the preliminary perturbative calculations of the four-dimensional quantum BF theory (see [9, 22]), the reducible connections (“monopoles on the surface”) yield loops and surfaces that are non-trivially linked.

9. Conclusions
The natural geometrical setting for field theories of the BF type is a principal bundle on the space of paths (“open strings”) or loops (“closed strings”) of a (four-dimensional) manifold $M$. The fields $A$ and $B$ of the BF theory describe collectively a connection on such principal bundles. Out of the trace of the corresponding holonomy one can define observables associated to paths (loops) of paths (of loops). These can be seen as associated to imbedded (or immersed) surfaces only if some extra conditions are met and if those extra conditions are taken into account in the calculations of Feynman integrals. The geometrical analysis of BF theories suggests two physically relevant considerations:
1. In those BF theories that are related (equivalent) to the Yang–Mills theory, one can consider $B$-dependent observables associated to paths of paths which, when $B$ is a deformation of the curvature, are a deformation of the Wilson loop along the boundary of the surface spanned by the path of paths. In other words a deviation from the non-abelian Stokes formula appears, and this may be relevant for a correct understanding of the problem of quark confinement.
2. Four-dimensional topological BF theories yield invariants of the four-manifold. When no $B$-dependent observable is included, the invariants to be considered should be related to the Donaldson invariants. When $B$-dependent observables are considered, one expects the corresponding quantum field theory to yield invariants of imbedded (or immersed) surfaces (like the Kronheimer–Mrowka invariants).
Four-dimensional BF theories can then be considered as a sort of gauge theory for loops and paths. The main difference is the fact that the action functional is not integrated over the whole space of paths (loops) but over the original four-manifold M. As a consequence, the action functional is not invariant under the full gauge group of the principal bundle over the path space but is only approximately invariant (i.e. when one neglects boundary terms and higher-order Chen integrals). The full structure of the gauge group, of the space of connections and of the space of gauge orbits for paths and loops, as well as the relation with Hochschild (and cyclic) (co)homology, will be discussed elsewhere.

Acknowledgement. We thank A. Belli, L. Bonora, J. D. S. Jones, M. Martellini for useful discussions. P. C.-R. thanks G. Semenoff for inviting him to Vancouver B.C. (July 1997, APCTP/PIms Summer Workshop).
10. Appendix: Iterated Loop Spaces
Most of the construction in this paper can be easily iterated, namely we can consider principal $G$-bundles on iterated free path and loop spaces. Let us denote those by the symbols $P^nM$ and $L^nM$. We describe here the special connections and the relevant curvatures for iterated path spaces (in the case $n = 2$). If we are given a connection $(A, B)$ on $P_AP$, then we can consider connections on the $G$-principal bundle of $(A, B)$-horizontal paths of paths
$$P^2_{(A,B)}P \xrightarrow{\ \pi\ } P^2M.$$
We have the following diagram
$$\begin{array}{ccccc}
I \times I \times P^2_{(A,B)}P & \xrightarrow{\ ev_{13}\ } & I \times P_AP & \xrightarrow{\ ev\ } & P \\
\Big\downarrow{\scriptstyle id \times id \times \pi} & & \Big\downarrow{\scriptstyle id \times \pi} & & \Big\downarrow{\scriptstyle \pi} \\
I \times I \times P^2M & \xrightarrow{\ ev_{13}\ } & I \times PM & \xrightarrow{\ ev\ } & M.
\end{array}$$
Here $ev_{13}$ acts on the first and the third element of the product. Elements of $P^2_{(A,B)}P$ are maps
$$Q : I \times I \to P, \qquad (s, t) \mapsto Q(s, t),$$
satisfying
$$A\big(\dot Q(s, t)\big) = 0, \quad \forall s, t \in I,$$
$$A\big(Q'(s, 0)\big) = \int_0^1 dt\, B\big(\dot Q(s, t), Q'(s, t)\big), \quad \forall s \in I.$$
Vectors tangent to $P^2_{(A,B)}P$ are maps from $I \times I$ to $TP$, which are in turn defined by maps
$$R : (-\epsilon, \epsilon) \times I \times I \to P$$
so that
$$\frac{\partial}{\partial r}\Big|_{r=0} A\big(R'(r, s, 0)\big) - \frac{\partial}{\partial r}\Big|_{r=0} \int_0^1 dt\, B\big(\dot R(r, s, t), R'(r, s, t)\big) = 0, \quad \forall s \in I.$$
Following the definitions of Sect. 2, in order to define a special connection on $P^2_{(A,B)}P$ we need another connection $(\bar A, \bar B)$ and a form $C \in \Omega^3(M, \mathrm{ad}\,P)$. Here we choose $\bar A = A$, $\bar B = B$. By considering the double evaluation map $Ev : I \times I \times P^2_{(A,B)}P \to P$, the special connection $(A, B, C)$ is explicitly given by:
$$Ev^*_{(0,0)}A + \int_I Ev^*_{(0,\cdot)}B + \int_{I \times I} Ev^*C.$$
The space of special connections considered above is an affine space modeled on $\Omega^3(M, \mathrm{ad}\,P)$. We have:
$$\int_{I \times I} Ev^*C = \int_I ev_{13}^*\left(\int_I ev^*C\right).$$
Any tangent vector $X \in T_QP^2_{(A,B)}P$ is a map $I \times I \ni (s, t) \mapsto X(s, t) \in T_{Q(s,t)}P$. So we get
$$\int_{[0,1]\times[0,1]} ds\, dt\; C\big(X(s, t), Q'(s, t), \dot Q(s, t)\big) = \int_{[0,1]} ds \left(\int_I ev^*C\right)\big(X(s, t), Q'(s, t)\big).$$
The curvature of a special connection $(A, B, C)$ is obtained directly from Theorem 2.6 via the following replacements
$$A \to ev_0^*A + \int_I ev^*_{(\cdot)}B, \qquad B \to \int_I ev^*C.$$

References
1. Halpern, M.B.: Field Strength Formulation of Quantum Chromodynamics. Phys. Rev. D 16, 1798–1801 (1977)
2. Halpern, M.B.: Gauge Invariant Formulation of the Selfdual Sector. Phys. Rev. D 16, 3515–3519 (1977)
3. Reinhardt, H.: Dual Description of QCD. hep-th/9608191
4. Cattaneo, A.S., Cotta-Ramusino, P., Fucito, F., Martellini, M., Rinaldi, M., Tanzini, A., Zeni, M.: Four-Dimensional Yang–Mills Theory as a Deformation of Topological BF Theory. Commun. Math. Phys. 197, 571–621 (1998)
5. Witten, E.: Topological Quantum Field Theory. Commun. Math. Phys. 117, 353–386 (1988)
6. Schwartz, A.S.: The Partition Function of a Degenerate Quadratic Functional and Ray–Singer Invariants. Lett. Math. Phys. 2, 247–252 (1978)
7. Horowitz, G.T.: Exactly Soluble Diffeomorphism Invariant Theories. Commun. Math. Phys. 125, 417–436 (1989)
8. Birmingham, D., Blau, M., Rakowski, M., Thompson, G.: Topological Field Theories. Phys. Rep. 209, 129–340 (1991)
9. Cotta-Ramusino, P., Martellini, M.: BF theories and 2-knots. In: Knots and Quantum Gravity, edited by J. Baez, Oxford: Oxford University Press, 1994, 169–189
10. Halpern, M.B.: Field Strength and dual variable formulation of Gauge theory. Phys. Rev. D 19, 517–530 (1979)
11. Aref’eva, I.Ya.: Non-Abelian Stokes Formula. Teor. Math. Fiz. 43, 111–116 (1980)
12. Wilson: Confinement of Quarks. Phys. Rev. D 10, 2445–2459 (1974)
13. Fucito, F., Martellini, M., Zeni, M.: The BF Formalism for QCD and Quark Confinement. Nucl. Phys. B 496, 259–284 (1997)
14. Kondo, K.: Yang–Mills Theory as a Deformation of Topological Field Theory, Dimensional Reduction and Quark Confinement. hep-th/9801024
15. ’t Hooft, G.: On the Phase Transition towards Permanent Quark Confinement. Nucl. Phys. B 138, 1–25 (1978)
16. ’t Hooft, G.: A Property of Electric and Magnetic Flux in Nonabelian Gauge Theories. Nucl. Phys. B 153, 141–160 (1979)
17. Kronheimer, P.B., Mrowka, T.S.: Gauge theory for embedded surfaces, I, II. Topology 32, 4, 773–826 (1993) and 34, 1, 37–97 (1995)
18. Cotta-Ramusino, P., Rinaldi, M.: in preparation
19. Kobayashi, S., Nomizu, K.: Foundations of Differential Geometry. Vol. I, New York: Interscience Publishers, 1963
20. Kobayashi, S.: Theory of Connections. Ann. Mat. Pura Appl. 43, 119–194 (1967)
21. Cattaneo, A.S., Cotta-Ramusino, P., Rinaldi, M.: BRST symmetries for the tangent gauge group. J. Math. Phys. 39, 1316–1339 (1998)
22. Cattaneo, A.S., Cotta-Ramusino, P., Gamba, A., Martellini, M.: The Donaldson–Witten Invariants and Pure QCD with Order and Disorder ’t Hooft-like Operators. Phys. Lett. B 355, 245–254 (1995)
23. Chen, K.: Iterated integrals of differential forms and loop space homology. Ann. of Math. 97, 2, 217–246 (1973)
24. Whitehead, G.W.: Elements of Homotopy Theory. Berlin–Heidelberg–New York: Springer Verlag, 1979
25. Broda, B.: Non-Abelian Stokes Theorem. In: Advanced Electromagnetism: Foundations, Theory and Application (T. Barrett, D. Grimes eds.), Singapore: World Scientific, 1995, pp. 496–505
26. Diakonov, D., Petrov, V.: Non-Abelian Stokes Theorem and Quark-Monopole Interaction. hep-th/9606104
27. Lunev, F.A.: Pure Bosonic Worldline Path Integral Representation for Fermionic Determinants, Non-Abelian Stokes Theorem, and Quasiclassical Approximation in QCD. Nucl. Phys. B 494, 433–470 (1997)
28. Batalin, I.A., Vilkovisky, G.A.: Relativistic S-Matrix of Dynamical Systems with Boson and Fermion Constraints. Phys. Lett. 69 B, 309–312 (1977)
29. Fradkin, E.S., Fradkina, T.E.: Quantization of Relativistic Systems with Boson and Fermion First- and Second-Class Constraints. Phys. Lett. 72 B, 343–348 (1978)
30. Cattaneo, A.S., Cotta-Ramusino, P., Martellini, M.: Three-Dimensional BF Theories and the Alexander–Conway Invariant of Knots. Nucl. Phys. B 346, 355–382 (1995)
31. Cattaneo, A.S., Cotta-Ramusino, P., Fröhlich, J., Martellini, M.: Topological BF theories in 3 and 4 dimensions. J. Math. Phys. 36, 6137–6160 (1995)
32. Cattaneo, A.S.: Cabled Wilson loops in BF theories. J. Math. Phys. 37, 3684–3703 (1996)

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 204, 525 – 549 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Nonlinear Stability of a Self-Similar 3-Dimensional Gas Flow
Wen-Ching Lien$^{1,\star}$, Tai-Ping Liu$^{2,\star\star}$
1 School of Mathematics, Institute for Advanced Studies, Princeton, NJ 08540, USA
2 Department of Mathematics, Stanford University, Stanford, CA 94305, USA
Received: 23 November 1998 / Accepted: 26 January 1999
Abstract: We show that the 3-dimensional supersonic gas flow past an infinite cone is nonlinearly stable upon the perturbation of the obstacle. The perturbed flow exists globally in space and tends to the self-similar flow downstream. There is a thin layer of concentration of vorticities and entropy variation. Our analysis is based on an approximation scheme using local self-similar solutions as building blocks. This enables us to obtain global estimates of the nonlinear interactions of waves needed for the stability analysis.

1. Introduction
We are concerned with inviscid gas flow in three space dimensions. The compressible Euler equations are:
$$\rho_t + \sum_{i=1}^{3} (\rho u_i)_{x_i} = 0 \quad \text{(Conservation of mass)}, \qquad (1.1)$$
$$(\rho u_i)_t + \sum_{j=1}^{3} (\rho u_i u_j)_{x_j} + P_{x_i} = 0, \quad i = 1, 2, 3 \quad \text{(Conservation of momentum)},$$
$$(\rho E)_t + \sum_{i=1}^{3} (\rho E u_i + P u_i)_{x_i} = 0 \quad \text{(Conservation of energy)}, \qquad (1.2)$$
$$P = f(\rho, S) \quad \text{(Equation of state)}, \qquad (1.3)$$
where $(u_1, u_2, u_3)$ is the velocity, $\rho$ the density, $P$ the pressure, $e$ the internal energy, $E = e + (u_1^2 + u_2^2 + u_3^2)/2$ the total energy, and $S$ the specific entropy. The system is quasilinear hyperbolic and a general flow contains shock waves, which greatly complicate the analysis. However, in the presence of symmetries, the flow becomes self-similar and the system may be reduced to ordinary differential equations. In the present paper, we are interested in steady flow past an infinite cone, with axis in the $x_1$ direction, and its stability with respect to perturbation of the cone. Such flow has cylindrical symmetry: the dependent variables are functions of $x \equiv x_1$ and the distance $y \equiv \sqrt{x_2^2 + x_3^2}$ from the axis. Let $u$ and $v$ represent the axial and radial components of velocity,
$$u_1(x_1, x_2, x_3) \equiv u(x, y), \qquad (u_2, u_3)(x_1, x_2, x_3) \equiv \left(\frac{x_2}{\sqrt{x_2^2 + x_3^2}}, \frac{x_3}{\sqrt{x_2^2 + x_3^2}}\right) v(x, y).$$
With the additional simplification that the flow is isentropic, the Euler equations are reduced to:
$$(\rho u)_x + (\rho v)_y = -\frac{1}{y}(\rho v), \qquad (1.4)$$
$$(\rho u^2 + P)_x + (\rho u v)_y = -\frac{1}{y}(\rho u v), \qquad (1.5)$$
$$(\rho u v)_x + (\rho v^2 + P)_y = -\frac{1}{y}(\rho v^2), \qquad (1.6)$$
$$P = P(\rho). \qquad (1.7)$$

$\star$ The research supported in part by NSF Grant DMS-97-29992.
$\star\star$ The research supported in part by NSF Grant DMS-98-03323.
When a uniform supersonic flow $(u, v) = (q_0, 0)$ hits the obstacle, which is an infinite cone $y/x = \theta_0$, and the angle $\theta_0$ of opening at the vertex is sufficiently small, the conical flow can be constructed by studying self-similar solutions. The flow is deflected by an attached shock front beginning at the vertex and is continued so that the state of the air is constant on each concentric cone behind the shock cone and is parallel to the obstacle cone. The simplification is due to the fact that, between the shock and obstacle cones, the flow is isentropic, $S = S_0$, and irrotational:
$$v_x = u_y, \qquad (1.8)$$
and it follows from (1.4)–(1.7) that
$$\left(1 - \frac{u^2}{c^2}\right) u_x - \frac{2uv}{c^2}\, v_x + \left(1 - \frac{v^2}{c^2}\right) v_y + \frac{v}{y} = 0. \qquad (1.9)$$
Here the sound speed $c$ is a given function of $u$ and $v$ through Bernoulli’s law. The flow depends on $\sigma \equiv x/y$ and the equations are further reduced to a system of ordinary differential equations:
$$v_\sigma + \sigma u_\sigma = 0, \qquad (1.10)$$
$$\left(1 - \frac{u^2}{c^2}\right) u_\sigma - \frac{2uv}{c^2}\, v_\sigma - \left(1 - \frac{v^2}{c^2}\right) \sigma v_\sigma + v = 0. \qquad (1.11)$$
Such self-similar flow was suggested by Busemann [1], who gave a graphical method for obtaining it, see Sect. 2. We consider a more realistic case when the obstacle is a perturbation of the infinite cone. The shock front and the flow behind it are conical until the expansion wave and the shock wave coming from the bendings of the obstacle interact with the conical flow and the shock front. The flow then becomes rotational and, in general, contains infinitely many interacting shock waves. The main questions are: With all these wave interactions, does a solution exist globally? Is it stable with respect to the perturbation, for finite $x$ and also asymptotically as $x \to \infty$? We answer these questions affirmatively and show that the long-range behavior of the flow is self-similar, corresponding to an infinite cone with the asymptotic angle of the perturbation. In particular, the flow between the leading shock and the obstacle tends to be irrotational and isentropic. In fact, there is a boundary layer of high concentration of vorticity and entropy variation. The width of the layer tends to zero as $x \to \infty$. This can occur, of course, only for inviscid flow; for the viscous flow the vorticity would propagate into the flow. Nevertheless, experimental evidence shows that the inviscid flow still accurately represents the actual flow, Courant–Friedrichs [3].

System (1.4)–(1.7) can be regarded as one-dimensional hyperbolic conservation laws with a source. There is a general existence and time-asymptotic theory, Liu [9, 10] and Lien [7], for a system of the form $u_t + f(u)_x = g(u, x)$. The main idea is to recognize that solutions which are functions of $x$ only, $f(u)_x = g(u, x)$, are normal modes. The idea is then to approximate the solution by a piecewise smooth function consisting of these modes and then to study the nonlinear interaction of waves resulting from the resolution of the discontinuities. This is done by modifying the random choice method of Glimm [4] for hyperbolic conservation laws $u_t + f(u)_x = 0$. This approach applies when the source is finite, i.e.
$$\int_{-\infty}^{\infty} |g(u, x)|\, dx = O(1).$$
However, the source term here, $\left(-\frac{1}{y}(\rho v), -\frac{1}{y}(\rho u v), -\frac{1}{y}(\rho v^2)\right)$, is not finite, both at the origin and for large $y$, and the above theory does not apply. (To make the comparison, the independent variables are related by $(x, t) \to (y, x)$.) This simply means that solutions which are functions of $y$ only play no natural role here. Nevertheless, the idea of identifying the normal modes for approximating the general solutions is shown to work in the present situation. We choose instead the self-similar solutions for (1.10) and (1.11) as the building blocks for the construction of solutions to (1.4)–(1.7). Besides the difference between the finite source and the present situation, one also notes that, unlike the finite source case, the system (1.10) and (1.11) is not autonomous. This reflects the fact that a self-similar solution is determined not only by its value at a location $(x, y)$, but also by the origin of the self-similarity. For the perturbation of the infinite cone, the suitable origin $(x_0, 0)$ changes with the location $(x, y)$, $x_0 = x_0(x, y)$, and so does the self-similar variable
$$\sigma = \sigma(x, y) = \frac{x - x_0}{y}.$$
For flow next to the obstacle, there is the clear choice of $\sigma = 1/\theta$, $\theta$ the slope of the obstacle. We allow this choice to propagate into the flow by the 3-waves. The numerical grids are moving along the constancy of the self-similar variable. The dominant shock next to the uniform upstream flow is traced, cf. Chern [2]. The construction of approximate solutions is done in Sect. 3.

Our analysis is based on the estimates of local wave interactions. Besides the interaction of elementary waves for the Riemann problem, one needs also to study the interaction of elementary waves and self-similar solutions, as well as the waves produced due to the changes in the origin $(x_0, 0)$ across a 3-wave. As with other studies of nonlinear waves, stability follows from the decoupling of waves, that is, wave interactions must decay. For hyperbolic conservation laws, this has been extensively studied, starting with Glimm–Lax [5]. A key observation here is that the angle between the self-similar rays, $\sigma = \text{constant}$, and the shock and entropy waves decreases after interaction. These estimates are studied in Sect. 4. The local estimates allow us to introduce a global functional on nonlinear wave interactions to control the variation of the approximate solutions and thus prove the global existence of the solution in Sect. 5.
The functional is defined also to take into account the fact that 1-waves are reflected by the obstacle to become 3-waves, and 3-waves propagate toward and then are combined with the dominant shock. These global wave estimates allow us in Sect. 6 to study the asymptotic behaviour of the flow as x → ∞. The entropy waves, 2-waves, approach the obstacle and form the aforementioned boundary layer. The flow pattern eventually tends to a self-similar solution corresponding to the conical flow for the infinite cone without any deflections.
2. Self-Similar Solutions
In this section, we briefly review the quantitative analysis of shock polars and the construction of conical flow. We refer the readers to Courant and Friedrichs [3] and the references therein for more details. For simplicity, we assume that the flow is isentropic and consider the systems (1.4)–(1.7) for general flow, and (1.10) and (1.11) for self-similar flow. We consider the polytropic gases: $P = A\rho^\gamma$, $\gamma > 1$.

2.1. Shock polars. Consider a shock $S$ in the $(x, y)$-plane with the upstream state of velocity $q_0 = (q_0, 0)$ and the downstream state of velocity $q_1$, which makes angle $\theta$ with the upstream flow. The angle the shock makes with the upstream flow is denoted by $\beta$.

[Figure: a shock S in the (x, y)-plane, with the upstream velocity $q_0$, the shock angle $\beta$, the downstream velocity $q_1$ and the deflection angle $\theta$.]
There is a one-parameter family of possible states, with velocity $q_1$, which can be reached through a shock. These possible states are given by the Rankine–Hugoniot conditions of the conservation of mass, momentum and energy. On the phase space of the velocity $q = (u, v)$, the states $q_1$ lie on a curve, called the shock polar. Let $N$ and $L$ be the components of the velocity $q$ normal and tangential to the shock line $S$ respectively. We have
$$\begin{aligned}
& u_0 = q_0, \quad v_0 = 0, \\
& u_1 = L_1\cos\beta + N_1\sin\beta, \quad v_1 = L_1\sin\beta - N_1\cos\beta, \\
& L_1 = L_0 = q_0\cos\beta \quad \text{(Continuity of tangential component)}, \\
& N_0 = q_0\sin\beta.
\end{aligned} \qquad (2.1)$$
From the conservation laws, Bernoulli’s law for steady flow holds across a shock front:
$$\frac{1}{2} q_0^2 + i_0 = \frac{1}{2} q_1^2 + i_1 = \frac{1}{2}\hat q^2, \qquad (2.2)$$
where $i$ is the specific enthalpy. For a polytropic gas, $P(\rho) = A\rho^\gamma$, $\gamma > 1$, we have
$$i = \frac{c^2}{\gamma - 1}, \qquad (2.3)$$
where $c = \sqrt{P'(\rho)}$ is the sound speed. We set $\mu^2 = \dfrac{\gamma - 1}{\gamma + 1}$ and $c_* = \mu\hat q$. The above identities yield Prandtl’s relation:
$$N_1 = \frac{c_*^2 - \mu^2 L_0^2}{N_0}. \qquad (2.4)$$
By (2.1)–(2.4), we obtain the relations
$$u_1 = (1 - \mu^2)\, q_0\cos^2\beta + \frac{c_*^2}{q_0}, \qquad v_1 = (q_0 - u_1)\cot\beta.$$
Eliminating the angle $\beta$, we find
$$v^2 = (q_0 - u)^2\, \frac{u - \tilde u}{U - u}, \qquad (2.5)$$
where $\tilde u = \dfrac{c_*^2}{q_0}$ and $U = (1 - \mu^2) q_0 + \tilde u$. The curve in the $(u, v)$-plane given by Eq. (2.5), the shock polar, is the Folium of Descartes.

[Figure: the shock polar in the (u, v)-plane.]
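The elimination of $\beta$ leading to (2.5) can be checked numerically. The following is a minimal sketch, not part of the original analysis: it evaluates $(u_1, v_1)$ from (2.1) and Prandtl's relation (2.4) for a few shock angles and compares $v_1^2$ with the right-hand side of (2.5). The function names and the sample values ($\gamma = 1.4$, $q_0 = 1$, upstream sound speed $c_0 = 0.5$) are illustrative assumptions; $\hat q$ is obtained from Bernoulli's law (2.2) together with (2.3).

```python
import math


def downstream_velocity(q0, q_hat, gamma, beta):
    """(u1, v1) behind a shock of angle beta, from (2.1) and Prandtl's relation (2.4)."""
    mu2 = (gamma - 1.0) / (gamma + 1.0)
    c_star2 = mu2 * q_hat ** 2
    L0 = q0 * math.cos(beta)  # tangential component, continuous across the shock
    N0 = q0 * math.sin(beta)  # upstream normal component
    N1 = (c_star2 - mu2 * L0 ** 2) / N0  # Prandtl's relation (2.4)
    u1 = L0 * math.cos(beta) + N1 * math.sin(beta)
    v1 = L0 * math.sin(beta) - N1 * math.cos(beta)
    return u1, v1


def polar_v_squared(u, q0, q_hat, gamma):
    """Right-hand side of the shock polar equation (2.5)."""
    mu2 = (gamma - 1.0) / (gamma + 1.0)
    u_tilde = mu2 * q_hat ** 2 / q0
    U = (1.0 - mu2) * q0 + u_tilde
    return (q0 - u) ** 2 * (u - u_tilde) / (U - u)


# Illustrative upstream state (assumed values): Mach number q0 / c0 = 2.
gamma, q0, c0 = 1.4, 1.0, 0.5
q_hat = math.sqrt(q0 ** 2 + 2.0 * c0 ** 2 / (gamma - 1.0))  # from (2.2) and (2.3)

residuals = []
for beta_deg in (35.0, 40.0, 50.0, 60.0):
    u1, v1 = downstream_velocity(q0, q_hat, gamma, math.radians(beta_deg))
    residuals.append(v1 ** 2 - polar_v_squared(u1, q0, q_hat, gamma))
```

Each entry of `residuals` should vanish up to rounding error, which is one way to confirm that the parametric relations and (2.5) describe the same curve.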
2.2. Conical flow. Now consider a conical body facing a supersonic stream of air at a uniform velocity $q_0 = (u_0, 0)$. Assume that the obstacle is an infinite cone with its vertex located at the origin in the $(x, y)$-plane. A shock wave $S$ is formed and situated on a concentric cone where an abrupt change in density and velocity occurs. Between the shock and the obstacle cones, the flow is self-similar.

[Figure: the obstacle cone and the attached shock cone S, with self-similar rays through the vertex between them.]
The self-similar flow satisfies the ordinary differential equations (1.10) and (1.11). There are two boundary requirements for the solution: The first requirement is that the flow velocity next to the obstacle is parallel to the obstacle. This is the natural condition for the inviscid flow. The second requirement is that the self-similar variable $\sigma = x/y$ next to the shock equals $1/\theta_S$, $\theta_S$ the slope of the shock. This is needed because the flow variables next to the shock are unchanged and the Rankine–Hugoniot condition is always satisfied. Such a solution is constructed by the shooting method. Given a state $q_1$ on the shock polar through the given upstream state $q_0$, we continue it by solving (1.10) and (1.11) with the initial condition $q_1$ at $\sigma = 1/\theta_S$ so that the second requirement is satisfied. In other words, the initial value of (1.10) and (1.11) satisfies, with $v$ regarded as a function of $u$,
$$v_u = -\sigma. \qquad (2.6)$$
Since the shock line $S$ is perpendicular to the straight line joining $(u_0, 0)$ and $(u_1, v_1)$, the initial slope of the curve is given by
$$v_u = \frac{v_1 - v_0}{u_1 - u_0}. \qquad (2.7)$$
The solution to (1.10) and (1.11) is continued so that $\sigma = x/y$ increases till an end state $q_{\mathrm{end}} \equiv (u_e, v_e)$ with the property that $u_e/v_e = \sigma_e$ there, or by (1.10),
$$v_u = -\frac{u}{v}. \qquad (2.8)$$
In other words, the first requirement would be satisfied if the obstacle is x = σe y. The collection of the end states q e , for varying the state q 1 on the shock polar, forms an apple curve, so called because of its shape. Note from (2.6) that the solution to (1.10) and (1.11) at an end point on the apple curve is normal to the line through the origin.
To construct the conical flow with the slope θ0 of the obstacle given, we locate the point on the apple curve which intersects with the ray x/y = 1/θ0 through the origin. In general, there are two intersections of which the one corresponding to the weaker shock is more likely to occur in reality and is our main concern in the present paper.
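The shooting construction described in this subsection can be sketched numerically as follows. This is an illustration under stated assumptions rather than the authors' procedure: the shock state comes from (2.1) and (2.4), the sound speed from Bernoulli's law (2.2) with (2.3), and (1.10) and (1.11) are solved for $u_\sigma$ by substituting $v_\sigma = -\sigma u_\sigma$. The classical fourth-order Runge–Kutta integrator, the step size and the sample upstream state are choices made here for the example only.

```python
import math


def shock_state(q0, q_hat, gamma, beta):
    """Downstream velocity (u1, v1) from (2.1) and Prandtl's relation (2.4)."""
    mu2 = (gamma - 1.0) / (gamma + 1.0)
    L0, N0 = q0 * math.cos(beta), q0 * math.sin(beta)
    N1 = (mu2 * q_hat ** 2 - mu2 * L0 ** 2) / N0
    return (L0 * math.cos(beta) + N1 * math.sin(beta),
            L0 * math.sin(beta) - N1 * math.cos(beta))


def rhs(sigma, u, v, q_hat, gamma):
    """(u_sigma, v_sigma) from (1.10) and (1.11), with c^2 given by (2.2) and (2.3)."""
    c2 = 0.5 * (gamma - 1.0) * (q_hat ** 2 - u * u - v * v)
    # Substituting v_sigma = -sigma * u_sigma into (1.11) leaves D * u_sigma + v = 0.
    D = (1.0 - u * u / c2) + 2.0 * u * v * sigma / c2 + (1.0 - v * v / c2) * sigma ** 2
    du = -v / D
    return du, -sigma * du


def shoot(q0, q_hat, gamma, beta, h=1e-4, max_steps=1_000_000):
    """Integrate from the shock ray sigma = cot(beta) until u / v = sigma (end state)."""
    u, v = shock_state(q0, q_hat, gamma, beta)
    sigma = 1.0 / math.tan(beta)
    for _ in range(max_steps):
        if u - sigma * v <= 0.0:  # velocity is now parallel to the ray x = sigma * y
            return sigma, u, v
        k1u, k1v = rhs(sigma, u, v, q_hat, gamma)
        k2u, k2v = rhs(sigma + h / 2, u + h / 2 * k1u, v + h / 2 * k1v, q_hat, gamma)
        k3u, k3v = rhs(sigma + h / 2, u + h / 2 * k2u, v + h / 2 * k2v, q_hat, gamma)
        k4u, k4v = rhs(sigma + h, u + h * k3u, v + h * k3v, q_hat, gamma)
        u += h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        sigma += h
    raise RuntimeError("end state not reached")


# Illustrative data (assumed): gamma = 1.4, upstream Mach number 2, shock angle 40 degrees.
gamma, q0, c0, beta = 1.4, 1.0, 0.5, math.radians(40.0)
q_hat = math.sqrt(q0 ** 2 + 2.0 * c0 ** 2 / (gamma - 1.0))
u1, v1 = shock_state(q0, q_hat, gamma, beta)
theta_wedge_deg = math.degrees(math.atan2(v1, u1))  # deflection right behind the shock
sigma_e, u_e, v_e = shoot(q0, q_hat, gamma, beta)
theta_cone_deg = math.degrees(math.atan(1.0 / sigma_e))  # obstacle cone x = sigma_e * y
```

Repeating the computation over a range of shock angles would trace out the end states $(u_e, v_e)$, i.e. a numerical version of the apple curve; the cone angle for a prescribed $\theta_0$ could then be matched by a one-dimensional root search in $\beta$.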
3. Construction of Approximate Solutions
The approximate solutions to the system (1.4)–(1.7) will be constructed based on the self-similar solutions and the elementary waves for the homogeneous system
$$(\rho u)_x + (\rho v)_y = 0, \qquad (3.1)$$
$$(\rho u^2 + P)_x + (\rho u v)_y = 0, \qquad (3.2)$$
$$(\rho u v)_x + (\rho v^2 + P)_y = 0, \qquad (3.3)$$
$$P = P(\rho). \qquad (3.4)$$
The self-similar solutions have been considered in the first two sections. We now consider the elementary waves for (3.1)–(3.4). The system (3.1)–(3.4) is strictly hyperbolic and its characteristic speeds are:
$$\lambda_1 = \frac{uv}{u^2 - c^2} - \frac{c\,(q^2 - c^2)^{1/2}}{u^2 - c^2} < 0, \qquad (3.5)$$
$$\lambda_2 = \frac{v}{u} > 0, \qquad (3.6)$$
$$\lambda_3 = \frac{uv}{u^2 - c^2} + \frac{c\,(q^2 - c^2)^{1/2}}{u^2 - c^2} > 0. \qquad (3.7)$$
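As a quick consistency check on (3.5)–(3.7), the sketch below evaluates the three speeds and verifies that $\lambda_1$ and $\lambda_3$ are the two roots of the quadratic $(u^2 - c^2)\lambda^2 - 2uv\lambda + (v^2 - c^2) = 0$, which is algebraically equivalent to (3.5) and (3.7) when $q^2 = u^2 + v^2$. The function names and the sample state are assumptions made for illustration; the state is taken with $u > c > v > 0$, for which the stated signs hold.

```python
import math


def characteristic_speeds(u, v, c):
    """Return (lambda_1, lambda_2, lambda_3) as given in (3.5)-(3.7)."""
    if u <= c:
        raise ValueError("the formulas are used here only for u > c")
    root = c * math.sqrt(u * u + v * v - c * c)
    denom = u * u - c * c
    return (u * v - root) / denom, v / u, (u * v + root) / denom


def quadratic_residual(lam, u, v, c):
    """Residual of (u^2 - c^2) lam^2 - 2 u v lam + (v^2 - c^2) = 0."""
    return (u * u - c * c) * lam * lam - 2.0 * u * v * lam + (v * v - c * c)


# Illustrative supersonic state (assumed values).
u, v, c = 0.864, 0.162, 0.5435
lam1, lam2, lam3 = characteristic_speeds(u, v, c)
res1 = quadratic_residual(lam1, u, v, c)
res3 = quadratic_residual(lam3, u, v, c)
```

For this state the ordering $\lambda_1 < 0 < \lambda_2 < \lambda_3$ is recovered, in line with the strict hyperbolicity claimed in the text.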
Its first and third characteristic fields are genuinely nonlinear and the second characteristic field is linearly degenerate in the sense of Lax [6]. (For the non-isentropic flow, we need to consider the energy equation and the system is not strictly hyperbolic, with double linearly degenerate eigenvalues $v/u$. Nevertheless, the system is completely hyperbolic and our analysis can be easily generalized for it.) In the following, $(\rho, u, v)$ is denoted by $\omega$. Let $S_i(\omega_-)$ and $R_i(\omega_-)$ denote the Rankine–Hugoniot curve and the rarefaction curve for the $i$-characteristic field, respectively. Set
$$\begin{aligned}
R_i^+(\omega_-) &= \{\omega : \omega \in R_i(\omega_-), \text{ and } \lambda_i(\omega) \ge \lambda_i(\omega_-)\}, \\
S_i^-(\omega_-) &= \{\omega : \omega \in S_i(\omega_-), \text{ and } \lambda_i(\omega) < s(\omega_-, \omega) < \lambda_i(\omega_-), \ s \text{ is the shock speed}\}, \\
T_i(\omega_-) &= R_i^+(\omega_-) \cup S_i^-(\omega_-), \quad \text{for } i = 1, 3, \\
T_2(\omega_-) &= R_2(\omega_-)\ (= S_2(\omega_-)).
\end{aligned}$$
By straightforward computations, we obtain that
$$R_2(\omega_-) = \left\{\omega : P = P_-, \ \frac{v}{u} = \frac{v_-}{u_-}\right\} \qquad (3.8)$$
and that $R_i$, $i = 1, 3$, are the integral curves of
$$\frac{du}{dv} = -\lambda_i, \qquad \frac{dP}{dv} = \rho(\lambda_i u - v).$$
The Riemann problem for (3.1)–(3.4) with initial data a single jump can be solved by the elementary waves taking values along the wave curves Ti , i = 1, 2, 3, just described, Lax [6]. To construct approximate solutions to (1.4)–(1.7), we adopt a generalization of the Glimm scheme [4]. The obstacle is a perturbation of the infinite cone y/x = θ0 . In the following figure, we exhibit the construction of the numerical grids to be described below for the simplified case when the cone is perturbed only at one location x = x1 and with the change of angle θ1 .
[Figure: numerical grid in the (x, y)-plane for a cone of angle $\theta_0$ perturbed at $x = x_1$ by a change of angle $\theta_1$, showing the origin O, the center $x_0$, the leading shock S, and the grid points $y_0(k)$, $y_1(k)$, $y_2(k)$ on the line $x = ks$.]
We now define the difference scheme. Choose the grid size $s = \Delta x$ for the variable $x$. Suppose that the obstacle is unperturbed before $x = N_0\Delta x$. For $0 \le k \le N_0$, the grid points are the intersection of $x = k\Delta x$ with the self-similar rays centered at $(0, 0)$,
$$y = x\, \frac{1}{1/\theta_0 + h\Delta\sigma}, \quad h = 0, -1, -2, \ldots.$$
In this region, the approximate solution is the unperturbed conical flow centered at $(0, 0)$. We choose the initial numerical grid on $x = N_0 s$ to satisfy the usual C-F-L condition. The approximate solutions $\omega(x, y) = \omega_\Delta(x, y)$ and the numerical grids are defined inductively in $k$, $x = ks$, $k = N_0, N_0 + 1, \ldots$, as follows: Choose an equidistributed sequence $a_1, a_2, \ldots$ in the unit interval $(0, 1)$. Approximate the obstacle by piecewise linear cones with changes in angle at $x = ks$, $k = N_0, N_0 + 1, \ldots$. Suppose that the approximate solution and the grid points have been defined for $x \le ks$. Let the grid points on $x = ks$ be denoted by $y = y_0(k) < y_1(k) < \ldots$, with $y = y_0(k)$ the location of the obstacle. The approximate solution $\omega(ls + 0, y)$ is a piecewise smooth solution of the self-similar system (1.10) and (1.11) on each vertical grid line $x = ls + 0$. As part of the induction hypothesis, we assume that the center $(x_0, 0) = (x_0(l, h + 1/2), 0)$ of the self-similar variable $\sigma = (x - x_0)/y$ for $y_h(l) < y < y_{h+1}(l)$ has also been defined for $l \le k$.

We now define the approximate solution for the region $ks < x \le (k + 1)s$. For $y_h(k) < y < y_{h+1}(k)$, $\omega(ks + 0, y)$ is the solution of (1.10) and (1.11) with
$$\omega\big(ks + 0,\, y_h(k) + a_k(y_{h+1}(k) - y_h(k))\big) = \omega_\Delta\big(ks - 0,\, y_h(k) + a_k(y_{h+1}(k) - y_h(k))\big), \quad h = 0, 1, \ldots. \qquad (3.9)$$
As noted before, the initial value above does not uniquely determine the solution of the non-autonomous system (1.10) and (1.11) and the center of the self-similar variable needs to be specified. We specify the center to be $(x_0, 0) = (x_0(k, h + 1/2), 0)$, which has been defined inductively, and this yields the self-similar variable $\sigma = (x - x_0)/y$. The discontinuities at the grid points $(k\Delta x, y_h)$, $h = 1, 2, \ldots$, are resolved by solving the Riemann problem for (3.1)–(3.4) with initial data:
$$\big(\omega(ks + 0, y_h(k) - 0),\ \omega(ks + 0, y_h(k) + 0)\big).$$
The solution of the Riemann problem is a function of (x − ks)/(y − yh (k)) and consists of rarefaction waves, shock waves or contact discontinuities.
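The grid and sampling rules above can be illustrated with a short sketch. It builds the grid points on a line $x = \text{const}$ from the self-similar rays $y = x/(1/\theta_0 + h\Delta\sigma)$, $h = 0, -1, -2, \ldots$, and picks in each cell the sampling point $y_h + a_k(y_{h+1} - y_h)$ used in (3.9). The text only requires the sequence $a_k$ to be equidistributed; the van der Corput sequence used here is one common choice and is an assumption of this example, as are the function names and the numerical values.

```python
def van_der_corput(n, base=2):
    """n-th term (n >= 1) of the van der Corput sequence, equidistributed in (0, 1)."""
    value, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        value += digit / denom
    return value


def ray_grid(x, theta0, dsigma, n_rays):
    """Grid points on the line x = const from the rays y = x / (1/theta0 + h*dsigma),
    h = 0, -1, ..., -(n_rays - 1); y_0 lies on the unperturbed cone y = theta0 * x."""
    ys = []
    for m in range(n_rays):
        sigma = 1.0 / theta0 - m * dsigma  # self-similar variable on this ray
        if sigma <= 0.0:
            raise ValueError("ray index leaves the region sigma > 0")
        ys.append(x / sigma)
    return ys


def sample_points(ys, a_k):
    """Random-choice sampling point y_h + a_k * (y_{h+1} - y_h) in each grid cell."""
    return [lo + a_k * (hi - lo) for lo, hi in zip(ys, ys[1:])]


# Illustrative values (assumed): cone slope 0.4, ray spacing 0.25, line x = 1.
grid = ray_grid(x=1.0, theta0=0.4, dsigma=0.25, n_rays=4)
a = [van_der_corput(k) for k in (1, 2, 3)]
samples = sample_points(grid, a[0])
```

Because the grid is defined through $\sigma$ rather than through $y$ directly, the cells widen away from the obstacle, which reflects the statement that the grids move along the constancy of the self-similar variable.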
The approximate solution $\omega(x, y)$, $ks < x \le (k + 1)s$, $y_{h-1/2}(k) < y < y_{h+1/2}(k)$, is defined according to (1.10) and (1.11) along the ray $(y - y_h(k))/(x - ks) = \xi$ with the initial value at $x = ks + 0$ given by the solution of the above Riemann problem. As before, we need to specify the center $x_0(\xi)$ of the self-similar variable. We do it according to the principle that the center propagates away from the obstacle and toward the leading shock. Let the upper edge of the 3-wave of the solution of the Riemann problem at $(ks, y_h(k))$ be $(y - y_h(k))/(x - ks) = a$. Since the 3-wave moves toward the leading shock, we set the center to be $x_0(\xi) = x_0(k, h - 1/2)$ (or $x_0(\xi) = x_0(k, h + 1/2)$) for the region below (or above) the upper edge of the 3-wave, $\xi < a$ (or $\xi > a$). The numerical grids on $x = (k + 1)s$ are defined to be on the self-similar rays through the grids on $x = ks$. The new center on $x = (k + 1)s$ inherits those $x_0(\xi)$ on $x = ks + 0$ through the random choice (3.9). The choice of the centers is natural. The choice of the grid points is motivated by the study of moving sources in that the grids move along the constancy of the underlying self-similar flow.

On the obstacle, $(x - ks)/(y - y_0(k)) = \sigma_0(k)$, a 3-shock (or 3-rarefaction) wave emerges when the obstacle changes angle toward (or away from) the flow. For this, we solve the initial-boundary Riemann problem for (3.1)–(3.4) with initial data:
$$\omega_\Delta(ks + 0, \sigma) = \omega_\Delta(ks, \sigma_0(k)), \quad \sigma < \sigma_0(k),$$
and with a boundary condition posed at $\sigma = \sigma_0(k)$:
$$\frac{u}{v} = \sigma_0(k).$$
The approximate solution is extended to $(x, y)$, $ks < x \le (k + 1)s$, $y_0(k) + (x - ks)/\sigma_0(k) < y < y_0(k) + \frac{1}{2}(y_1(k) - y_0(k))$, as before with center $x_0(k, 0) \equiv ks - \sigma_0(k)\, y_0(k)$.

The leading strong shock cone next to the uniform upstream flow is traced continuously, instead of by the above random scheme. Suppose that the approximate solution is constructed for $0 \le x < ks$, $k \ge N_0$. Let $(x, y_f(x))$ denote the locus of the front of the 3-shock cone $S$.
Suppose that $y_{j_f}(k) < y_f(ks) < y_{j_f+1}(k)$. We call the interval $y_{j_f-1}(k) < y < y_{j_f+1}(k)$ the front region at $x = ks$. Inside the front region, we first solve for the self-similar solution to (1.10) and (1.11) with the initial value
\[ \omega\bigl(ks + 0,\ y_{j_f-1}(k) + a_k(y_{j_f}(k) - y_{j_f-1}(k))\bigr) = \omega_\Delta\bigl(ks - 0,\ y_{j_f-1}(k) + a_k(y_{j_f}(k) - y_{j_f-1}(k))\bigr), \]
and with the same center as the initial value. Denote the solution by $\omega(y)$. Next we solve the Riemann problem for (3.1)–(3.4) so that
\[ \omega(ks, y) = \begin{cases} (\rho_0, u_0, v_0) = \omega_+, & \text{for } y > y_f(x), \\ \omega(y_f(x) - 0), & \text{for } y_f(x) > y > y_{j_f-1}(k). \end{cases} \]
The solution $\omega(x,y)$ thus contains a relatively strong 3-shock wave, $(\omega_+, \omega_-)$, with speed $s$. Solve again Eqs. (1.10) and (1.11) in the interval $y_{j_f-1}(k) < y < y_f(x)$ with the initial value $\omega(y_f(x) - 0) = \omega_-$.
Nonlinear Stability of Self-Similar 3-D Gas Flow
Denote the solution by $\omega_-(y)$. Now we can define the approximate solution in the front region as follows:
\[ \omega_\Delta(x,y) = \begin{cases} \omega_+, & \text{for } y > y_f(x), \\ \omega_-(y), & \text{for } y_f(x) > y > y_{j_f-1}(k), \end{cases} \qquad y_f(x) = s(x - ks) + y_f(ks) \ \text{ for } ks \le x < (k+1)s. \]
The discontinuity at $y = y_{j_f-1}(k)$ is resolved by the same construction as before.

4. Local Interaction Estimates

We first study the interaction among the weak waves between the shock cone $S$ and the obstacle cone. In order to obtain the desired estimates, we consider space-like curves, which are piecewise linear curves consisting of line segments joining $a_{kh}$ to $a_{k+1,h+1}$ or to $a_{k-1,h+1}$, where $a_{kh} = (ks, y_h(k) + a_k(y_{h+1}(k) - y_h(k)))$. The shock cone in the first quadrant is covered by "diamonds," the corners of which are the mesh points $a_{kh}$. Let $\Delta$ denote a diamond centered at $(ks, y_h(k))$. We consider the following case. Suppose that the waves entering $\Delta$ are denoted by $\alpha$ and $\beta$, which are centered at $((k-1)s, y_{h-1}(k-1))$ and $((k-1)s, y_h(k-1))$ respectively. Let $\delta$ denote the set of waves issuing from $(ks, y_h(k))$ and $\delta_i$ the strength of the $i$-wave in $\delta$. Let $\omega_1(\sigma)$, $\omega_2(\bar\sigma)$ and $\omega_3(\bar\sigma)$ represent the self-similar solutions centered at $O_1$, $O_2$ and $O_2$ respectively such that
\[ \alpha = (\omega_2(\bar\sigma_1), \omega_1(\sigma_1)), \quad \beta = (\omega_3(\bar\sigma_2), \omega_2(\bar\sigma_2)), \quad \delta = (\omega_3(\bar\sigma_2), \omega_1(\sigma_2)). \]
Here $\sigma$ and $\bar\sigma$ are the self-similar variables with the corresponding centers $O_1$ and $O_2$ respectively. To measure the potential nonlinear wave interaction, we use the following notations:
\begin{align*}
Q_0(\Delta) &= Q_0(\alpha,\beta) \equiv \sum \{ |\alpha_i||\beta_j| : \alpha_i \text{ and } \beta_j \text{ are approaching} \}, \\
Q_1(\Delta) &\equiv |\alpha_1|\Delta\sigma + |\alpha_3|\Delta\sigma, \\
Q_2(\Delta) &\equiv |\alpha_2|\Delta\sigma, \\
Q^c(\Delta) &\equiv \begin{cases} x_0\,\Delta\sigma, & \text{if } O_1 \neq O_2, \\ 0, & \text{if } O_1 = O_2, \end{cases} \\
Q(\Delta) &\equiv Q_0(\Delta) + Q_1(\Delta) + Q_2(\Delta) + Q^c(\Delta),
\end{align*}
where $\Delta\sigma = |\sigma_2 - \sigma_1|$ and $x_0$ denotes the change of the location for different centers.

Here, $Q_0$ measures the wave interaction between elementary waves, $Q_1$ and $Q_2$ measure the wave interaction between self-similar solutions and elementary waves, and $Q^c$ measures the effect of the change of centers for self-similar solutions. Our interaction estimate is as follows:

Lemma 4.1. For some constant $O(1)$ depending only on system (1.4)–(1.7), and for $1 \le i \le 3$,
\[ \delta_i = \alpha_i + \beta_i + O(1)Q(\Delta). \tag{4.1} \]
Proof. By the interaction estimates of elementary waves for conservation laws [12],
\begin{align*}
\delta_i = (\omega_3(\bar\sigma_2), \omega_1(\sigma_2))_i
 &= (\omega_3(\bar\sigma_2), \omega_2(\bar\sigma_2))_i + (\omega_2(\bar\sigma_2), \omega_1(\sigma_2))_i \\
 &\quad + O(1)Q_0\bigl((\omega_3(\bar\sigma_2), \omega_2(\bar\sigma_2)), (\omega_2(\bar\sigma_2), \omega_1(\sigma_2))\bigr). \tag{4.2}
\end{align*}
It follows from the elementary theory of ordinary differential equations that
\[ \omega_2(\bar\sigma_2) - \omega_1(\sigma_2) = \omega_2(\bar\sigma_1) - \omega_1(\sigma_1) + O(1)(|\alpha| + x_0)(|\sigma_2 - \sigma_1| + |\bar\sigma_2 - \bar\sigma_1|). \tag{4.3} \]
Note that $|\bar\sigma_2 - \bar\sigma_1|$ is equivalent to $|\sigma_2 - \sigma_1|$ when $x_0$ is sufficiently small. Since the solution of the Riemann problem depends continuously on its end states, (4.2) and (4.3) yield
\[ \delta_i = \alpha_i + \beta_i + O(1)Q(\Delta). \]
This completes the proof. □

Remark 4.1. For the other cases, such as when $\alpha$ issues from $((k-1)s, y_{h+1}(k-1))$ or when $\omega_2$ and $\omega_3$ have different centers, the $Q_j$'s can be defined with the same meaning and the interaction estimate (4.1) holds by the same argument.

Remark 4.2. When $h = 0$, that is, when $\Delta$ covers a part of the boundary of the obstacle cone, we need to solve the boundary Riemann problem. Let $\alpha$ and $\beta$ denote the waves issuing from $((k-1)s, y_1(k-1))$ and $((k-1)s, y_0(k-1))$ respectively. By an analogous argument, we have the interaction estimate
\[ \delta = \delta_3 = \beta + C_0\alpha + O(1)Q(\Delta), \tag{4.4} \]
where $C_0$ depends only on system (3.1)–(3.4).

As for the case involving the relatively strong 3-shock wave $S$, the estimate is similar to the above lemma except that, instead of advancing one diamond, we need to advance three diamonds in the front region simultaneously. We still denote these three diamonds by $\Delta$. Let $\Delta_{k,h}$ represent the diamond whose center is $(ks, y_h(k))$. Assume $a_{k+1} \in (0, \frac{1}{2})$. Then $\Delta = \Delta_{k+1,j_f-1} \cup \Delta_{k+1,j_f} \cup \Delta_{k+1,j_f+1}$. The case $a_{k+1} \in [\frac{1}{2}, 1)$ can be treated by the same analysis. Let $\beta_k$ stand for the relatively strong 3-shock wave issuing from $(ks, y_f(ks))$. We denote by $\alpha$ the set of waves issuing from $(ks, y_{j_f-1}(k))$. The waves in $\alpha$ entering $\Delta_{k+1,j_f}$ are denoted by $\alpha^l$, and $\alpha^r$ are the waves entering $\Delta_{k+1,j_f+1}$. Let $\gamma$ be the set of waves issuing from $(ks, y_{j_f-2}(k))$ and entering $\Delta_{k+1,j_f(k)+1}$. Set $\omega_1(y)$ (or $\omega_1(\bar\sigma)$) and $\omega_2(y)$ (or $\omega_2(\sigma)$) to be the self-similar solutions such that $\beta_k$ connects $\omega_0 = (\rho_0, u_0, v_0)$ and $\omega_1(y)$, and $\alpha^l$ connects $\omega_1(y)$ and $\omega_2(y)$ at $x = ks$. Let $\beta_{k+1}$ denote the strong 3-shock issuing from
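$((k+1)s, y_f((k+1)s))$. $\delta$ denotes the wave issuing from $((k+1)s, y_{j_f-1}(k+1))$. In this case, we set
\begin{align*}
Q_0(\Delta) &= Q_0(\beta_k, \alpha^l) + Q_0(\alpha^r, \gamma), \\
Q_1(\Delta) &= |\alpha^l|\Delta\sigma_\alpha + |\gamma_1|\Delta\sigma_\gamma + |\gamma_3|\Delta\sigma_\gamma, \\
Q_2(\Delta) &= |\gamma_2|\Delta\sigma_\gamma, \\
Q^c(\Delta) &= \begin{cases} x_0\,\Delta\sigma_\alpha, & \text{if } \omega_1(y) \text{ and } \omega_2(y) \text{ have different centers}, \\ 0, & \text{if } \omega_1(y) \text{ and } \omega_2(y) \text{ have the same center}. \end{cases}
\end{align*}
Here, $\Delta\sigma$ simply means the change of the self-similar variable for the corresponding wave as it propagates through the self-similar solution. In this case, $\Delta\sigma_\alpha = |\sigma(y_{j_f}(k+1)) - \sigma(y_{j_f-1}(k))|$ and $\Delta\sigma_\gamma = |\sigma(y_{j_f-1}(k+1)) - \sigma(y_{j_f-2}(k))|$. Also $\Delta\sigma_{\beta_k} = |\sigma_f(k+1) - \sigma_f(k)|$, where $\sigma_f(k)$ represents the value of the self-similar variable $\sigma$ for the shock $\beta_k$. In the following, $O(1)$ always represents a constant depending only on system (1.4)–(1.7).

Lemma 4.2. Suppose that $\beta_k$ is sufficiently small. Then there exists a small constant $c_0 = O(1)|\beta_k|$ such that
\begin{align*}
\beta_{k+1} &= \beta_k + \alpha^l_1 + O(1)Q_0(\beta_k, \alpha^l) + O(1)\Delta\sigma_{\beta_k} + O(1)|\alpha^l|\Delta\sigma_\alpha + O(1)Q^c(\Delta), \\
\delta_j &= \alpha^r_j + \gamma_j + O(1)\bigl\{ Q_0(\beta_k, \alpha^l) + Q_0(\alpha^r, \gamma) \bigr\} + O(1)c_0\Delta\sigma_{\beta_k} \\
 &\quad + O(1)|\alpha^l|\Delta\sigma_\alpha + O(1)|\gamma|\Delta\sigma_\gamma + O(1)Q^c(\Delta),
\end{align*}
for $1 \le j \le 3$.

Proof. Owing to the interaction estimates of the elementary waves for conservation laws, we have
\begin{align*}
(\omega_0, \omega_2(\sigma_f(k+1)))_j &= (\omega_0, \omega_1(\sigma_f(k+1)))_j + (\omega_1(\sigma_f(k+1)), \omega_2(\sigma_f(k+1)))_j \\
 &\quad + O(1)Q_0\bigl((\omega_0, \omega_1(\sigma_f(k+1))), (\omega_1(\sigma_f(k+1)), \omega_2(\sigma_f(k+1)))\bigr). \tag{4.5}
\end{align*}
And by (2.7), there exists a small constant $c_0 = O(1)|\beta_k|$ such that
\[ (\omega_0, \omega_1(\sigma_f(k+1)))_j = (\omega_0, \omega_1(\sigma_f(k)))_j + \begin{cases} O(1)c_0\Delta\sigma_{\beta_k}, & j = 1, 2, \\ O(1)\Delta\sigma_{\beta_k}, & j = 3, \end{cases} \tag{4.6} \]
when $\beta_k$ is sufficiently small. Also,
\[ (\omega_1(\sigma_f(k+1)), \omega_2(\sigma_f(k+1)))_j = (\omega_1(y_{j_f-1}(k)), \omega_2(y_{j_f-1}(k)))_j + O(1)|\alpha^l|\Delta\sigma_\alpha + O(1)Q^c(\Delta). \tag{4.7} \]
Thus, (4.5)–(4.7) imply that
\begin{align*}
\beta_{k+1} = (\omega_0, \omega_2(\sigma_f(k+1)))_3 = (\omega_0, \omega_*)
 &= \beta_k + \alpha^l_1 + O(1)Q_0(\beta_k, \alpha^l) \\
 &\quad + O(1)\Delta\sigma_{\beta_k} + O(1)|\alpha^l|\Delta\sigma_\alpha + O(1)Q^c(\Delta). \tag{4.8}
\end{align*}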
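Denote the end states of $\delta$ by $(\delta_-, \delta_+)$. To estimate the strength of $\delta$, we first apply Lemma 4.1 to obtain
\[ (\omega_2(y_{j_f-1}(k)), \delta_+)_j = \alpha^r_j + \gamma_j + O(1)Q_0(\alpha^r, \gamma) + O(1)|\gamma|\Delta\sigma_\gamma. \tag{4.9} \]
By the elementary theory of ordinary differential equations, we have
\[ (\delta_-, \omega_2(y_{j_f-1}(k)))_j = (\omega_*, \omega_2(\sigma_f(k+1)))_j + O(1)|\omega_* - \omega_2(\sigma_f(k+1))|\Delta\sigma_\alpha. \tag{4.10} \]
Hence, (4.5)–(4.10) yield
\begin{align*}
\delta_j = (\delta_-, \delta_+)_j &= \alpha^r_j + \gamma_j + O(1)\bigl\{ Q_0(\beta_k, \alpha^l) + Q_0(\alpha^r, \gamma) \bigr\} \\
 &\quad + O(1)c_0\Delta\sigma_{\beta_k} + O(1)|\alpha^l|\Delta\sigma_\alpha + O(1)|\gamma|\Delta\sigma_\gamma + O(1)Q^c(\Delta).
\end{align*}
This completes the proof. □

We now establish the basic estimates on the change of speed of 3-waves and 2-waves. As 3-waves (or 2-waves) propagate along self-similar solutions, the characteristic speed $\lambda_3$ (or $\lambda_2$) is monotonically increasing with respect to $\sigma$.

Lemma 4.3. Suppose that $\omega(\sigma) = (\rho(\sigma), u(\sigma), v(\sigma))$ is a self-similar solution to (1.10) and (1.11). Then,
(i) $\dfrac{d}{d\sigma}\lambda_2(\sigma) > 0$,
(ii) $\dfrac{d}{d\sigma}\lambda_3(\sigma) > 0$.

Proof. (i) By (1.10),
\[ \frac{d}{d\sigma}\lambda_2(\sigma) = \frac{d}{d\sigma}\Bigl(\frac{v(\sigma)}{u(\sigma)}\Bigr) = \frac{v_\sigma u - v u_\sigma}{u^2} = \frac{-(\sigma u + v)u_\sigma}{u^2} > 0. \]
(ii) To compute $\dfrac{d}{d\sigma}\lambda_3(\sigma)$, we need to know $\dfrac{dc}{d\sigma}$ and $\dfrac{d\rho}{d\sigma}$. Applying Bernoulli's law, we obtain $u\,du + v\,dv = -c^2\,d\rho/\rho$. It thus follows from (1.10) and (i) that
\[ \frac{d\rho}{d\sigma} = \frac{-\rho}{c^2}(u u_\sigma + v v_\sigma) = \frac{-\rho}{c^2}(u - \sigma v)u_\sigma > 0. \]
Hence,
\[ \frac{dc}{d\sigma} = \frac{1}{2c}P''(\rho)\frac{d\rho}{d\sigma} > 0. \]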
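The two chains of equalities in parts (i) and (ii) each skip one substitution. The displayed identities are consistent with (1.10) supplying the relation $v_\sigma = -\sigma u_\sigma$; that relation is inferred here from the proof itself rather than quoted from (1.10), and $c^2 = P'(\rho)$ is assumed as the definition of the sound speed. Under those assumptions the intermediate steps can be sketched as:

```latex
% Assumed (inferred from the proof, not quoted from (1.10)): v_\sigma = -\sigma u_\sigma.
% Assumed: c^2 = P'(\rho).
\begin{align*}
\frac{d}{d\sigma}\lambda_2
  &= \frac{d}{d\sigma}\Bigl(\frac{v}{u}\Bigr)
   = \frac{v_\sigma u - v u_\sigma}{u^2}
   = \frac{(-\sigma u_\sigma)\,u - v u_\sigma}{u^2}
   = \frac{-(\sigma u + v)\,u_\sigma}{u^2}, \\
u u_\sigma + v v_\sigma
  &= u u_\sigma + v\,(-\sigma u_\sigma)
   = (u - \sigma v)\,u_\sigma, \\
2c\,c_\sigma &= P''(\rho)\,\rho_\sigma
  \quad\Longrightarrow\quad
  \frac{dc}{d\sigma} = \frac{1}{2c}\,P''(\rho)\,\frac{d\rho}{d\sigma}.
\end{align*}
```

The signs asserted in the proof then rest on the signs of $u_\sigma$, $\sigma u + v$, $u - \sigma v$ and $P''(\rho)$ along the flow, which this sketch does not re-derive.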
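Since $\lambda_3$ satisfies $(v - \lambda_3 u)^2 - c^2(1 + \lambda_3^2) = 0$, differentiating this equation with respect to $\sigma$ yields
\[ \bigl(c^2\lambda_3 + u(v - \lambda_3 u)\bigr)\frac{d}{d\sigma}\lambda_3(\sigma) = (v - \lambda_3 u)(v_\sigma - \lambda_3 u_\sigma) - c\,c_\sigma(1 + \lambda_3^2). \tag{4.11} \]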
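The coefficient of $d\lambda_3/d\sigma$ in (4.11) can be evaluated by solving the quadratic for $\lambda_3$ explicitly. The sketch below assumes $q^2 = u^2 + v^2$ and $u^2 \neq c^2$, and takes $\lambda_3$ to be the root with the plus sign; that choice is an assumption made here, and it is the one that reproduces the value $-c\sqrt{q^2 - c^2}$ used in the argument.

```latex
% Assumed: q^2 = u^2 + v^2, u^2 \neq c^2, and \lambda_3 is the "+" root.
\begin{align*}
(v - \lambda u)^2 - c^2(1 + \lambda^2) = 0
  &\iff (u^2 - c^2)\lambda^2 - 2uv\,\lambda + (v^2 - c^2) = 0, \\
u^2 v^2 - (u^2 - c^2)(v^2 - c^2)
  &= c^2(u^2 + v^2 - c^2) = c^2(q^2 - c^2), \\
\lambda_3 &= \frac{uv + c\sqrt{q^2 - c^2}}{u^2 - c^2}, \\
c^2\lambda_3 + u(v - \lambda_3 u)
  &= (c^2 - u^2)\lambda_3 + uv
   = -\bigl(uv + c\sqrt{q^2 - c^2}\bigr) + uv
   = -c\sqrt{q^2 - c^2}.
\end{align*}
```

For supersonic flow, $q > c > 0$, the square root is real and the coefficient is strictly negative, so the sign of $d\lambda_3/d\sigma$ is opposite to the sign of the right-hand side of (4.11).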
Substituting $\lambda_3$, we have
\[ c^2\lambda_3 + u(v - \lambda_3 u) = (c^2 - u^2)\lambda_3 + uv = -uv - c\sqrt{q^2 - c^2} + uv = -c\sqrt{q^2 - c^2} < 0. \]
Applying (1.10) again, it is easy to check that the right-hand side of (4.11) is negative. Hence, it follows that
\[ \frac{d}{d\sigma}\lambda_3(\sigma) > 0. \]
This completes the proof. □

We now turn to the interaction between self-similar solutions and elementary waves. To quantitatively measure how the elementary waves weave through self-similar solutions, we estimate the change of the angle between the elementary wave and the ray through the center of the wave itself, as shown in the following figure.

[Figure: the wave $\alpha$ issuing from the point $x$, the center $x_0$, and the angle $\theta_\alpha(x)$ between the wave and the ray through $x_0$.]
$\theta_\alpha(x)$ denotes the angle associated with the wave $\alpha$ issuing from $x$. The self-similar variable $\sigma$ is employed in place of $y$ to describe the coordinate in the $(x,y)$-plane. The following lemmas show that $\theta(x)$ is decreasing with respect to $x$ for 2-waves and the relatively strong 3-shock $S$.

Lemma 4.4. Suppose that $S = (\omega_0, \omega_1)$ at $(x, \sigma_1)$. At the next step, $S = (\omega_0, \omega_2)$ at $(x + \Delta x, \sigma_2)$ by the construction described in Sect. 3. Then we have
\[ \theta_S(x) - \theta_S(x + \Delta x) = |O(1)||\sigma_1 - \sigma_2|. \]
Proof. Let $s_i$ denote the shock speed $s(\omega_0, \omega_i)$, $i = 1, 2$. Assume that $s_i > 1/\sigma_1$. The other case can be proved by analogous arguments. Let $\omega(\sigma)$ denote the self-similar solution with the initial data $\omega(\sigma_1) = \omega_1$.
Due to the construction of approximate solutions in Sect. 3, we need to solve the Riemann problem $(\omega_0, \omega(\sigma_2))$. Hence, $(\omega_0, \omega_2) = (\omega_0, \omega(\sigma_2))_3$. As $\sigma_1$ decreases to $\sigma_2$, $\omega(\sigma)$ moves along the integral curve $(u, v(u))$ to (1.10) and (1.11) and below the shock curve $S_3(\omega_0)$ on the $(u,v)$-plane. By (1.10), $v_u < 0$. This property, together with the fact that $T_1$ and $T_2$ have positive slopes on the $(u,v)$-plane, implies that $s_2 < s_1$ and thus $s_1 - s_2 = |O(1)||\sigma_1 - \sigma_2|$. Therefore, we have
\[ \theta_S(x) - \theta_S(x + \Delta x) = |O(1)||\sigma_1 - \sigma_2|. \]
□
Lemma 4.5. Suppose that $\alpha = (\omega_l, \omega_r)$ is a contact discontinuity at $(x, \sigma_1)$. At the next step, $\alpha = (\hat\omega_l, \hat\omega_r)$ at $(x + \Delta x, \sigma_2)$ by the construction described in Sect. 3. Then we have
\[ \theta_\alpha(x) - \theta_\alpha(x + \Delta x) = |O(1)||\sigma_1 - \sigma_2|. \]
Proof. Set $s_1 = s(\omega_l, \omega_r)$ and $s_2 = s(\hat\omega_l, \hat\omega_r)$. Assume that the wave speed $s_1 > 1/\sigma_1$ and $\sigma_1 > \sigma_2$. The other cases can be proved by analogous arguments. Let $\omega_l(\sigma)$ and $\omega_r(\sigma)$ denote the self-similar solutions to (1.10) and (1.11) with the initial data $\omega_l(\sigma_1) = \omega_l$, $\omega_r(\sigma_1) = \omega_r$, respectively. Due to the construction of approximate solutions, we obtain $(\hat\omega_l, \hat\omega_r) = (\omega_l(\sigma_2), \omega_r(\sigma_2))_2$. It follows from Lemma 4.3 that
\begin{align*}
\lambda_2(\omega_l(\sigma_2)) &< \lambda_2(\omega_l), \tag{4.12} \\
\lambda_2(\omega_r(\sigma_2)) &< \lambda_2(\omega_r). \tag{4.13}
\end{align*}
And by Lemma 4.1, we have
\[ (\omega_l(\sigma_2), \omega_r(\sigma_2))_j = O(1)|\omega_l - \omega_r||\sigma_1 - \sigma_2| \tag{4.14} \]
for $j = 1, 3$. Hence, (4.12)–(4.14) yield that $s_2 < s_1$ and $s_2 - s_1 = O(1)|\sigma_1 - \sigma_2|$. This implies that
\[ \theta_\alpha(x) - \theta_\alpha(x + \Delta x) = |O(1)||\sigma_1 - \sigma_2|. \]
This completes the proof. □
5. Global Existence

In this section, we adopt the difference scheme described in Sect. 3 to prove the global existence of the solution to (1.4)–(1.7). The obstacle is approximated by piecewise linear cones with the change of angle $\theta_i$, $i = 1, \dots, n$, at $x = N_0 s, \dots, (N_0 + n - 1)s$ respectively, and the corresponding centers for these linear cones are $x_0^1, x_0^2, \dots, x_0^n$ respectively. For convenience, we prove the simplified case when the cone is perturbed only at one location $x = x_1$ and, after the perturbation, the obstacle is the infinite cone $(x - x_0)/y = \sigma_0$ with its corresponding center located at $x = x_0^1 = x_0$; hence, the self-similar variable $\sigma$ equals $(x - x_0)/y$. Nevertheless, the functionals constructed below also apply to the general situation.

The proof requires estimates on the total variation of the approximate solutions $\omega_\Delta(x,y)$. Our strategy is to use induction on certain nonlinear functionals constructed to detect global wave interactions. Once this uniform bound is established, with the aid of Helly's theorem, we can extract a convergent subsequence of $\omega_\Delta(x,y)$ in $L^1_{loc}(\mathbb{R}^2)$, and by the consistency theorem (Liu [10]), this subsequence converges to a weak solution $\omega(x,y)$ to the system (1.4)–(1.7).

Let $J$ be a space-like curve. To establish the uniform bound, we define a nonlinear functional $F(J)$ as follows:
\begin{align*}
F(J) &\equiv L(J) + KQ(J), \\
L(J) &\equiv L_0(J) + L_1(J), \\
L_0(J) &\equiv \sum \{ c_\alpha|\alpha| : \alpha \text{ is the strength of any elementary wave crossing } J \text{ and } \alpha \neq S \}, \\
L_1(J) &\equiv \theta_S(J) + \sum \{ \theta_\alpha : \alpha \text{ is a contact discontinuity crossing } J \}, \\
Q(J) &\equiv Q_0(J) + Q_1(J) + Q_3(J) + Q_c(J), \\
Q_0(J) &\equiv \sum \{ |\alpha\beta| : \alpha \text{ and } \beta \text{ are strengths of elementary waves which are approaching, and cross } J \}, \\
Q_1(J) &\equiv \sum \{ |\alpha|(\sigma_0 - \sigma_\alpha) : \alpha \text{ is a 1-wave crossing } J \}, \\
Q_3(J) &\equiv \sum \{ |\alpha|(\sigma_\alpha - \underline{\sigma}) : \alpha \text{ is a 3-wave crossing } J \text{ and } \alpha \neq S \}, \\
Q_c(J) &\equiv \sum_{i=1}^{n} Q_c^i(J), \qquad Q_c^i(J) \equiv (x_0^i - x_0^{i-1})(\sigma_c^i(J) - \underline{\sigma}), \quad (x_0^0 = 0).
\end{align*}
Here
\[ c_\alpha = \begin{cases} C_0, & \text{when } \alpha \text{ is a 1-wave or 2-wave}, \\ 1, & \text{when } \alpha \text{ is a 3-wave}. \end{cases} \]
$C_0$ is the same constant as in Remark 4.2, which depends only on system (1.4)–(1.7). $\sigma_\alpha$ denotes the $\sigma$-coordinate of the center for the wave $\alpha$. $\sigma_c^i(J)$ is the $\sigma$-coordinate of the grid point where the center of the self-similar solutions passing through $J$ changes from $x_0^{i-1}$ to $x_0^i$. If the centers do not change anymore, $Q_c(J) \equiv 0$. And $\sigma_S(J)$ is the
$\sigma$-coordinate of the 3-shock $S$ when $S$ crosses $J$. $\underline{\sigma} \equiv \sigma_S(\Gamma) - \varepsilon$, for some suitably chosen small constant $\varepsilon$. And $K$ is some large number to be determined later.

The terms $Q$ are defined to detect the potential amount of wave interaction in the solution. Since 3-waves and 1-waves between $S$ and the obstacle move upwards and downwards respectively with respect to the $\sigma$ coordinate, $Q_3(J)$ and $Q_1(J)$ are defined according to the domain of influence. $Q_0(J)$ is the amount of the usual wave interaction between elementary waves. And $Q_c(J)$ is defined to measure the effect of the change of centers for self-similar solutions, which also reflects the fact that this effect propagates upwards. As for the 2-waves near the obstacle boundary and the relatively strong shock $S$, we do not know a priori how they move ahead. Consequently, we cannot foresee their potential wave interactions. However, the local analysis gives us a decreasing quantity $\theta$, which constitutes $L_1(J)$. We will show that the decrease in $L_1(J)$ is sufficient to dominate the increase in the remaining parts of $F(J)$.

We now give the global interaction estimates. Let $\Gamma$ stand for the space-like curve in the strip $N_0 s \le x \le (N_0 + 1)s$. $\Lambda$ represents the region between $\Gamma$ and $J$. And $Q(\Lambda)$ is the sum over all $Q(\Delta)$, $\Delta$ any diamond in $\Lambda$.

Lemma 5.1. Suppose that $L(\Gamma)$, $\sigma_0 - \underline{\sigma}$ and $\sum_{i=1}^{n}\theta_i$ are sufficiently small. For sufficiently large $K$, we have
\begin{align*}
F(J) &\le F(\Gamma) - \frac{1}{2}Q(\Lambda) + C_1\sum_{i=1}^{n}\theta_i - c_1\Bigl(\sum_k \Delta\sigma_S(J_k) + \sum_{\alpha_2}\Delta\sigma_{\alpha_2}\Bigr), \tag{5.1} \\
Q(J) &\le Q(\Gamma) - \frac{1}{2}Q(\Lambda) + Q_2(\Lambda) + \sum_{i=1}^{n}\theta_i + \frac{1}{2}\sum_k \Delta\sigma_S(J_k), \tag{5.2}
\end{align*}
where $C_1$ and $c_1$ are positive constants depending only on system (1.4)–(1.7), and the $J_k$'s are all the space-like curves between $\Gamma$ and $J$. $\sum_k \Delta\sigma_S(J_k)$ is the sum taken over the change of $\sigma_S(J_k)$. And $\sum_{\alpha_2}\Delta\sigma_{\alpha_2}$ is the sum over the change of $\sigma$ for all the contact discontinuities in $\Lambda$.

Proof. We choose
\[ \varepsilon \equiv \Bigl(F(\Gamma) + C_1\sum_{i=1}^{n}\theta_i\Bigr)c_1^{-1}. \]
$K$, $C_1$ and $c_1$ will be determined later. We argue by induction. For $J = \Gamma$, we can choose $L(\Gamma)$ and $Q(\Gamma)$ as small as needed. Suppose that (5.1) and (5.2) have been shown for $J = J_1$. It thus follows from (5.1) that $\sigma_S(J_1) > \underline{\sigma}$. Let $J_2$ be an immediate successor and $\Delta$ denote the diamond between $J_1$ and $J_2$. To show that (5.1) and (5.2) hold for $J = J_2$, we divide the proof into three cases.

Case 1. $\Delta$ is between the shock cone $S$ and the obstacle cone. Let us consider the case when $\Delta$ is under the same setting as Lemma 4.1. The other cases can be proved similarly. With the help of Lemmas 4.1 and 4.5, we obtain
\begin{align*}
L_0(J_2) - L_0(J_1) &= O(1)Q(\Delta), \\
L_1(J_2) - L_1(J_1) &= -|O(1)|\Delta\sigma + O(1)Q(\Delta),
\end{align*}
where the term $-|O(1)|\Delta\sigma$ is due to the change of the angle $\theta_{\alpha_2}$ if the contact discontinuity $\alpha_2 \neq 0$ in Lemma 4.1,
\begin{align*}
Q_0(J_2) - Q_0(J_1) &\le O(1)L_0(J_1)Q(\Delta) - Q_0(\Delta), \\
(Q_1 + Q_3)(J_2) - (Q_1 + Q_3)(J_1) &\le O(1)Q(\Delta)(\sigma_0 - \underline{\sigma}) - Q_1(\Delta), \\
Q_c(J_2) - Q_c(J_1) &= -Q^c(\Delta).
\end{align*}
It follows from the above inequalities that
\[ Q(J_2) - Q(J_1) \le O(1)\bigl(L_0(J_1) + (\sigma_0 - \underline{\sigma})\bigr)Q(\Delta) - \bigl(Q(\Delta) - Q_2(\Delta)\bigr), \]
and thus
\[ Q(J_2) - Q(J_1) \le -\frac{3}{4}Q(\Delta) + Q_2(\Delta), \tag{5.3} \]
provided that $L(J_1)$ and $\sigma_0 - \underline{\sigma}$ are sufficiently small. Therefore,
\[ F(J_2) - F(J_1) \le \Bigl(O(1) - \frac{K}{2}\Bigr)Q(\Delta) + KQ_2(\Delta) - |O(1)|\Delta\sigma. \tag{5.4} \]
Note that $Q_2(\Delta)$ is a quadratic term and $Q_2(\Delta) \le L_0(J_1)\Delta\sigma$. Hence, when $F(J_1)$ is sufficiently small, by choosing a suitably large constant $K$, we have
\[ F(J_2) - F(J_1) \le -\frac{1}{2}Q(\Delta) - c_1\Delta\sigma \]
for some positive constant $c_1$. By the induction hypothesis, it thus follows that
\begin{align*}
F(J_2) &\le F(\Gamma) - \frac{1}{2}Q(\Lambda_2) + C_1\sum_{i=1}^{n}\theta_i - c_1\Bigl(\sum_k \Delta\sigma_S(J_k) + \sum_{\alpha_2}\Delta\sigma_{\alpha_2}\Bigr), \\
Q(J_2) &\le Q(\Gamma) - \frac{1}{2}Q(\Lambda_2) + Q_2(\Lambda_2) + \sum_{i=1}^{n}\theta_i + \frac{1}{2}\sum_k \Delta\sigma_S(J_k),
\end{align*}
where $\Lambda_2$ is the region between $\Gamma$ and $J_2$. Thus, (5.1) and (5.2) hold for $J = J_2$.

Case 2. $\Delta$ covers a part of the obstacle boundary. Let us consider the case when $\Delta$ is under the same setting as Remark 4.2. The other cases can be proved similarly. Using Remark 4.2 and Lemma 4.5, we have
\begin{align*}
L_0(J_2) - L_0(J_1) &= O(1)Q(\Delta), \\
L_1(J_2) - L_1(J_1) &= -|O(1)|\Delta\sigma + O(1)Q(\Delta),
\end{align*}
where the term $-|O(1)|\Delta\sigma$ is due to the change of the angle $\theta_{\alpha_2}$ if the contact discontinuity $\alpha_2 \neq 0$ in Remark 4.2. Also,
\begin{align*}
Q_0(J_2) - Q_0(J_1) &\le O(1)L_0(J_1)Q(\Delta) - Q_0(\Delta), \\
(Q_1 + Q_3)(J_2) - (Q_1 + Q_3)(J_1) &\le O(1)(|\alpha| + Q(\Delta))(\sigma_0 - \underline{\sigma}) - Q_1(\Delta), \\
Q_c(J_2) - Q_c(J_1) &= -Q^c(\Delta).
\end{align*}
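Thus,
\[ Q(J_2) - Q(J_1) \le O(1)\bigl(L_0(J_1) + (\sigma_0 - \underline{\sigma})\bigr)Q(\Delta) - \bigl(Q(\Delta) - Q_2(\Delta)\bigr) + O(1)|\alpha||\sigma_0 - \underline{\sigma}|. \]
When $L(J_1)$ and $\sigma_0 - \underline{\sigma}$ are sufficiently small, the last inequality yields
\[ Q(J_2) - Q(J_1) \le -\frac{3}{4}Q(\Delta) + Q_2(\Delta) + O(1)|\alpha||\sigma_0 - \underline{\sigma}|. \tag{5.5} \]
Therefore,
\[ F(J_2) - F(J_1) \le \Bigl(O(1) - \frac{K}{2}\Bigr)Q(\Delta) + KQ_2(\Delta) - |O(1)|\Delta\sigma + O(1)K|\alpha||\sigma_0 - \underline{\sigma}|. \tag{5.6} \]
By telescoping the estimates of the three cases for every step between $\Gamma$ and $J_2$, we obtain from (5.3)–(5.8) (see also Case 3)
\begin{align*}
F(J_2) - F(\Gamma) &\le \Bigl(O(1) - \frac{K}{2}\Bigr)Q(\Lambda_2) + O(1)K\Bigl(\sum_{i=1}^{n}\theta_i + Q(\Lambda_2) + c_0\sum_k \Delta\sigma_S(J_k)\Bigr)|\sigma_0 - \underline{\sigma}| \\
 &\quad + \sum_k \bigl(KQ_2(\Delta_k) - |O(1)|\Delta\sigma_k\bigr) + |O(1)|\bigl(K(\sigma_0 - \underline{\sigma}) + c_0 - 1\bigr)\sum_k \Delta\sigma_S(J_k) \\
 &\quad + \Bigl(\sum_{i=1}^{n}\theta_i + O(1)Q(\Lambda_2) + O(1)c_0\sum_k \Delta\sigma_S(J_k)\Bigr), \\
Q(J_2) - Q(\Gamma) &\le -\frac{3}{4}Q(\Lambda_2) + Q_2(\Lambda_2) + O(1)\Bigl(\sum_{i=1}^{n}\theta_i + Q(\Lambda_2) + c_0\sum_k \Delta\sigma_S(J_k)\Bigr)|\sigma_0 - \underline{\sigma}| + \frac{1}{4}\sum_k \Delta\sigma_S(J_k),
\end{align*}
where $\Delta_k$ is any diamond between $\Gamma$ and $J_2$. When $F(J_1)$ is sufficiently small and $K$ sufficiently large, we obtain
\begin{align*}
F(J_2) &\le F(\Gamma) - \frac{1}{2}Q(\Lambda_2) + C_1\sum_{i=1}^{n}\theta_i - c_1\Bigl(\sum_k \Delta\sigma_S(J_k) + \sum_{\alpha_2}\Delta\sigma_{\alpha_2}\Bigr), \\
Q(J_2) &\le Q(\Gamma) - \frac{1}{2}Q(\Lambda_2) + Q_2(\Lambda_2) + \sum_{i=1}^{n}\theta_i + \frac{1}{2}\sum_k \Delta\sigma_S(J_k),
\end{align*}
where $\Lambda_2$ is the region between $\Gamma$ and $J_2$. $C_1$ and $c_1$ are positive constants depending only on system (1.4)–(1.7). Thus, (5.1) and (5.2) hold for $J = J_2$.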
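Case 3. $\Delta$ is in the shock front region as in Lemma 4.2. Let us consider the case when $\Delta$ is under the same setting as Lemma 4.2. The other cases can be proved similarly. By Lemmas 4.2 and 4.4, we have
\[ L_0(J_2) - L_0(J_1) = O(1)Q(\Delta) + O(1)c_0\Delta\sigma_S(J_1), \]
where $\Delta\sigma_S(J_1) = |\sigma_S(J_2) - \sigma_S(J_1)|$. And
\[ L_1(J_2) - L_1(J_1) = -|O(1)|\Delta\sigma + O(1)Q(\Delta) + O(1)c_0\Delta\sigma_S(J_1) + (\theta_S(J_2) - \theta_S(J_1)), \]
where the term $-|O(1)|\Delta\sigma$ is due to the change of the angle $\theta_{\gamma_2}$ if the contact discontinuity $\gamma_2 \neq 0$ in Lemma 4.2. And
\begin{align*}
Q_0(J_2) - Q_0(J_1) &\le O(1)L_0(J_1)Q(\Delta) - Q_0(\Delta), \\
(Q_1 + Q_3)(J_2) - (Q_1 + Q_3)(J_1) &\le O(1)Q(\Delta)(\sigma_0 - \underline{\sigma}) - Q_1(\Delta) + O(1)\Delta\sigma_\beta(J_1)(\sigma_0 - \underline{\sigma}), \\
Q_c(J_2) - Q_c(J_1) &= -Q^c(\Delta).
\end{align*}
Hence,
\[ Q(J_2) - Q(J_1) \le O(1)\bigl(L_0(J_1) + (\sigma_0 - \underline{\sigma})\bigr)Q(\Delta) - \bigl(Q(\Delta) - Q_2(\Delta)\bigr) + O(1)\Delta\sigma_\beta(J_1)|\sigma_0 - \underline{\sigma}|. \]
If $L(J_1)$ and $\sigma_0 - \underline{\sigma}$ are sufficiently small, we have
\[ Q(J_2) - Q(J_1) \le -\frac{3}{4}Q(\Delta) + Q_2(\Delta) + \frac{1}{4}\Delta\sigma_\beta(J_1). \tag{5.7} \]
Thus,
\begin{align*}
F(J_2) - F(J_1) &\le \Bigl(O(1) - \frac{K}{2}\Bigr)Q(\Delta) + KQ_2(\Delta) - |O(1)|\Delta\sigma \\
 &\quad + O(1)\bigl(K(\sigma_0 - \underline{\sigma}) + c_0\bigr)\Delta\sigma_S(J_1) - |O(1)|\Delta\sigma_S(J_1) + |\alpha^l|. \tag{5.8}
\end{align*}
By telescoping the estimates of the three cases for every step between $\Gamma$ and $J_2$, we have
\begin{align*}
F(J_2) &\le F(\Gamma) + \Bigl(O(1) - \frac{K}{2}\Bigr)Q(\Lambda_2) + \sum_k \bigl(KQ_2(\Delta_k) - |O(1)|\Delta\sigma_k\bigr) \\
 &\quad + |O(1)|\bigl(K(\sigma_0 - \underline{\sigma}) + c_0 - 1\bigr)\sum_k \Delta\sigma_S(J_k) \\
 &\quad + O(1)K\Bigl(\sum_{i=1}^{n}\theta_i + Q(\Lambda_2) + c_0\sum_k \Delta\sigma_S(J_k)\Bigr)|\sigma_0 - \underline{\sigma}| \\
 &\quad + \Bigl(\sum_{i=1}^{n}\theta_i + O(1)Q(\Lambda_2) + O(1)c_0\sum_k \Delta\sigma_S(J_k)\Bigr), \\
Q(J_2) &\le Q(\Gamma) - \frac{3}{4}Q(\Lambda_2) + Q_2(\Lambda_2) + \frac{1}{4}\sum_k \Delta\sigma_S(J_k) \\
 &\quad + O(1)\Bigl(\sum_{i=1}^{n}\theta_i + Q(\Lambda_2) + c_0\sum_k \Delta\sigma_S(J_k)\Bigr)|\sigma_0 - \underline{\sigma}|.
\end{align*}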
Thus,
\begin{align*}
F(J_2) &\le F(\Gamma) - \frac{1}{2}Q(\Lambda_2) + C_1\sum_{i=1}^{n}\theta_i - c_1\Bigl(\sum_k \Delta\sigma_S(J_k) + \sum_{\alpha_2}\Delta\sigma_{\alpha_2}\Bigr), \\
Q(J_2) &\le Q(\Gamma) - \frac{1}{2}Q(\Lambda_2) + Q_2(\Lambda_2) + \sum_{i=1}^{n}\theta_i + \frac{1}{2}\sum_k \Delta\sigma_S(J_k),
\end{align*}
provided that $K$ is large enough. Thus, (5.1) and (5.2) hold for $J = J_2$. Furthermore, by Lemma 4.2, (5.1) and (5.2), we can establish the estimate of the strength of the relatively strong shock:
\[ |S(x)| \le O(1)|S_0|, \tag{5.9} \]
where $|S_0|$ denotes the initial strength. This completes the proof. □

Remark 5.1. It is to be noted that the assumption in Lemma 5.1 can be achieved by choosing the Mach number $M = q_0/c_0$ sufficiently large. For simplicity in the presentation, we assume that the gas is polytropic. As the shock strength $|S_0|$ tends to zero, its corresponding $\sigma_S$ tends to $\sqrt{\dfrac{q_0 - \tilde u}{U - q_0}}$ by (2.5). By direct computation, we obtain
\[ \sqrt{\frac{q_0 - \tilde u}{U - q_0}} = \sqrt{\Bigl(\frac{q_0}{c_0}\Bigr)^2 - 1}. \]
Hence, we can choose $\sigma_0 - \underline{\sigma}$ in Lemma 5.1 sufficiently small by simultaneously requiring $|S_0|$ sufficiently small and the Mach number sufficiently close to $\sigma_0$.

The global existence theorem thus follows from Lemma 5.1 and the consistency theorem [8].

Theorem 5.1. Suppose that the opening angle $\theta_0$ of the obstacle cone and the initial strength $|S_0|$ of the relatively strong shock are sufficiently small and the Mach number $M = q_0/c_0$ is sufficiently close to $\sigma_0$. Then the initial boundary value problem (1.4)–(1.7) as stated in Sect. 3 has a global solution $\omega(x,y)$ satisfying
\[ \text{Total Variation}\,\{\omega(x,y) : 0 < y < \infty\} = O(1)|S_0|, \]
provided that the perturbation is small as compared to the shock strength $|S_0|$.

6. Decay of Solutions

In this section, we study the rate of convergence of the solution $\omega(x,y)$ to a self-similar solution. We use the following notations. $\chi_i$ denotes an $i$-generalized characteristic curve [5], which is a Lipschitz continuous curve traveling either with $i$-shock speed or with $i$-characteristic speed. The one-sided limits of the weak solution exist along any
such curves except possibly for a countable set of $x$, and an $i$-wave may cross $\chi_i$ only due to interactions. We set
\begin{align*}
\chi_3^S &\equiv (x, y_S(x)),\ x \ge x_1 \equiv \text{the 3-generalized characteristic curve issued from } (x_1, \sigma_S), \\
\chi_3^0(x) &\equiv \text{the 3-generalized characteristic curve issued from } (x, \sigma_0), \\
\chi_j^1(x) &\equiv \text{the } j\text{-generalized characteristic curve issued from } (x, y_S(x)).
\end{align*}
Suppose that $\chi_1^1(x)$ ends at $x = \hat x$ when $\sigma = \sigma_0$. We set
\[ \chi_3^2(x) \equiv \text{the 3-generalized characteristic curve issued from } (\hat x, \sigma_0). \]
The Lax entropy condition implies that $\chi_3^0(x)$ and $\chi_3^2(x)$ enter the relatively strong shock $S$ before $O(1)|S|^{-1}x$. To study the decay rate of the solution $\omega(x,y)$, we define the following functions:
\begin{align*}
X(x) &= \sum \{ |\alpha| : \alpha \text{ is a 3-wave or a 1-wave at } x,\ \alpha \neq S \}, \\
\bar Y(x) &= \sum \{ |\alpha| : \alpha \text{ is a 2-wave at } x \}, \\
Y(x) &= \sum \{ |\alpha|\theta_\alpha(x) : \alpha \text{ is a 2-wave at } x \}, \\
Z(x) &= |S(x)|\theta_S(x),
\end{align*}
where $S(x)$ is the strength of the relatively strong shock $S$ at $x$. $Q(\tilde x)$ denotes the limit of $Q(J)$ as the mesh lengths $r$, $s$ tend to zero, where $J$ is a space-like curve approaching $x = \tilde x$. We choose a sufficiently large number $x_2$ such that $Q_c(x) = 0$ for $x > x_2$.

Lemma 6.1. There exist some constants $M > 1$, $k_1$, $k_2$ depending only on system (1.4)–(1.7), and $C = O(1)|S_0|^{-1}$ depending on system (1.4)–(1.7) and the shock strength $|S_0|$ such that for $x > x_2$,
\begin{align*}
X(Cx) &\le MI(x), \tag{6.1} \\
Y(Cx) &\le C^{-k_1\sigma^2}Y(x) + M|S_0|^2(X(x) + I(x)), \tag{6.2} \\
Z(Cx) &\le C^{-k_2\sigma^2}Z(x) + M|S_0|(X(x) + I(x)). \tag{6.3}
\end{align*}
Here $S_0$ is the initial strength of the relatively strong shock, and $I(x)$ is due to wave interactions, defined by
\[ I(x) \equiv X(x)^2 + |S_0|X(x) + Y(x) + Z(x). \]
Proof. According to Lemma 5.1, there exists some constant $C_2$ depending only on system (1.4)–(1.7) such that $F(J) \le C_2|S_0|^2$ for any space-like curve $J$, provided that the hypothesis of Theorem 5.1 holds. This implies that
\[ \bar Y(x) \le C_2|S_0|^2, \tag{6.4} \]
\[ \theta_S(x) + \sum \{ \theta_\alpha(x) : \alpha \text{ is a 2-wave} \} \le C_2|S_0|^2. \tag{6.5} \]
Since $\chi_3^0(x)$ and $\chi_3^2(x)$ enter $\chi_3^S$ before $Cx$, $C = O(1)|S_0|^{-1}$, the 1-waves and 3-waves in $X(Cx)$ are those produced by wave interactions; hence, we have from (6.4), Lemmas 4.1 and 4.2,
\[ X(Cx) \le O(1)\bigl(X(x)^2 + |S_0|X(x) + Y(x) + Z(x)\bigr) \]
for $x > x_2$. Applying Lemma 4.5, we obtain the decay rate of $\theta_\alpha$:
\[ \theta_\alpha(Cx) = \theta_\alpha(x)\Bigl(\frac{x}{Cx}\Bigr)^{k_1\sigma^2}, \tag{6.6} \]
provided that a contact discontinuity $\alpha$ interacts only with a self-similar solution. Here, $k_1$ is a constant depending only on system (1.4)–(1.7). It follows from (6.4)–(6.6) that
\begin{align*}
Y(Cx) &\le C_2|S_0|^2 I(x) + \sum_{\alpha:\ \text{2-wave}} \alpha(x)\theta_\alpha(x)C^{-k_1\sigma^2} + O(1)C_2|S_0|^2(X(x) + I(x)) \\
 &\le C^{-k_1\sigma^2}Y(x) + O(1)|S_0|^2(X(x) + I(x)).
\end{align*}
By Lemma 4.4, we can derive the decay rate of $\theta_S$:
\[ \theta_S(Cx) = \theta_S(x)\Bigl(\frac{x}{Cx}\Bigr)^{k_2\sigma^2}, \tag{6.7} \]
provided that the relatively strong shock $S$ interacts only with a self-similar solution. $k_2$ is a constant depending only on system (1.4)–(1.7). Therefore, (6.7), Lemmas 4.1, 4.2, and 5.1 yield
\begin{align*}
Z(Cx) &\le |S(Cx)|\bigl(\theta_S(x)C^{-k_2\sigma^2} + O(1)X(x) + O(1)I(x)\bigr) \\
 &\le C^{-k_2\sigma^2}Z(x) + O(1)|S_0|(X(x) + I(x)).
\end{align*}
Now, we can choose a sufficiently large number $M$ such that (6.1)–(6.3) hold for $x > x_2$. This completes the proof. □

Theorem 6.1. For given $\varepsilon > 0$, suppose that the hypothesis of Theorem 5.1 holds. Then the solution $\omega(x,y)$ to system (1.4)–(1.7) converges to a self-similar solution at the following rate:
\[ X(x) \le M_1 x^{-\frac{1}{2+\varepsilon}}, \qquad Y(x) \le M_2 x^{-\frac{1}{2+\varepsilon}}, \qquad Z(x) \le M_3 x^{-\frac{1}{2+\varepsilon}}, \tag{6.8} \]
where $M_i$, $i = 1, 2, 3$, are some constants depending on $\varepsilon$, $|S_0|$ and system (1.4)–(1.7).
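The induction in the proof of this theorem closes because the exponent $1/(2+\varepsilon)$ is strictly smaller than $1/2$. The bookkeeping below is a heuristic sketch, not the authors' computation: it treats every $O(1)$ factor as $1$, takes $C = |S_0|^{-1}$, writes $a = 1/(2+\varepsilon)$ (a symbol introduced here), and reads the constants of the proof as $M_1 = C_2|S_0|(Cx_2)^{a}$ and $M_2 = M_3 = |S_0|^{1/2}M_1$. All of these readings are assumptions.

```latex
% Heuristic sketch; a = 1/(2+\varepsilon), C = |S_0|^{-1}, O(1) factors set to 1.
\begin{align*}
\text{Goal for } X:\quad
  & M\bigl(M_1^2 x^{-2a} + |S_0| M_1 x^{-a} + M_2 x^{-a} + M_3 x^{-a}\bigr)
    \le M_1 (Cx)^{-a}. \\
\text{Divide by } M_1 x^{-a}:\quad
  & M\bigl(M_1 x^{-a} + |S_0| + 2|S_0|^{1/2}\bigr) \le C^{-a} = |S_0|^{a}. \\
\text{For } x \ge x_2:\quad
  & M_1 x^{-a} \le M_1 x_2^{-a} = C_2 |S_0|\, C^{a} = C_2 |S_0|^{1-a}. \\
\text{Sufficient condition:}\quad
  & M\bigl(C_2 |S_0|^{1-2a} + |S_0|^{1-a} + 2|S_0|^{\frac{1}{2}-a}\bigr) \le 1, \\
\text{with exponents}\quad
  & 1 - 2a = \frac{\varepsilon}{2+\varepsilon} > 0, \qquad
    1 - a > 0, \qquad
    \frac{1}{2} - a = \frac{\varepsilon}{2(2+\varepsilon)} > 0.
\end{align*}
```

Every exponent on the left of the sufficient condition is positive, so the condition holds once $|S_0|$ is small enough, with the smallness depending on $\varepsilon$. The smallest exponent, $\frac{1}{2} - a$, tends to zero as $\varepsilon \to 0$, which suggests why the rate stops short of $x^{-1/2}$ in this argument.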
Proof. We argue by induction. Set
\[ M_1 = C_2|S_0|(Cx_2)^{\frac{1}{2+\varepsilon}}, \qquad M_2 = M_3 = C_2|S_0|^{\frac{3}{2}}(Cx_2)^{\frac{1}{2+\varepsilon}}, \]
so that (6.8) holds for $x \le Cx_2$. Suppose that (6.8) holds for $x \le C^p x_2$, $p \ge 1$. We want to establish (6.8) for $C^p x_2 < x \le C^{p+1}x_2$. By Lemma 6.1 and the induction hypothesis, for $x \le C^p x_2$,
\begin{align*}
X(Cx) \le MI(x) &\le M\bigl(X(x)^2 + |S_0|X(x) + Y(x) + Z(x)\bigr) \\
 &\le M\bigl(M_1^2 x^{\frac{-2}{2+\varepsilon}} + |S_0|M_1 x^{\frac{-1}{2+\varepsilon}} + M_2 x^{\frac{-1}{2+\varepsilon}} + M_3 x^{\frac{-1}{2+\varepsilon}}\bigr) \\
 &\le M_1 (Cx)^{\frac{-1}{2+\varepsilon}}
\end{align*}
when $|S_0|$ is sufficiently small, where the smallness depends also on $\varepsilon$. By the same argument, we have
\begin{align*}
Y(Cx) &\le C^{-k_1\sigma^2}Y(x) + M|S_0|^2(X(x) + I(x)) \\
 &\le C^{-k_1\sigma^2}M_2 x^{\frac{-1}{2+\varepsilon}} + M|S_0|^2\bigl(M_1 x^{\frac{-1}{2+\varepsilon}} + M_1 (Cx)^{\frac{-1}{2+\varepsilon}}\bigr) \\
 &\le M_2 (Cx)^{\frac{-1}{2+\varepsilon}}, \\
Z(Cx) &\le C^{-k_2\sigma^2}Z(x) + M|S_0|(X(x) + I(x)) \\
 &\le C^{-k_2\sigma^2}M_3 x^{\frac{-1}{2+\varepsilon}} + M|S_0|\bigl(M_1 x^{\frac{-1}{2+\varepsilon}} + M_1 (Cx)^{\frac{-1}{2+\varepsilon}}\bigr) \\
 &\le M_3 (Cx)^{\frac{-1}{2+\varepsilon}}.
\end{align*}
Therefore, (6.8) holds for $C^p x_2 < x \le C^{p+1}x_2$. The proof is complete. □

References

1. Busemann, A.: Drücke auf kegelförmige Spitzen bei Bewegung mit Überschallgeschwindigkeit. Zeits. für angewandte Math. und Mech. 9, No 6, 496–498 (1929)
2. Chern, I.-L.: Stability theorem and truncation error analysis for the Glimm scheme and for a front tracking method for flows with strong discontinuities. Comm. Pure Appl. Math. 42, 815–844 (1989)
3. Courant, R., Friedrichs, K. O.: Supersonic flow and shock waves. New York: Interscience, 1948
4. Glimm, J.: Solutions in the large for nonlinear hyperbolic systems of equations. Comm. Pure Appl. Math. 18, 697–715 (1965)
5. Glimm, J., Lax, P. D.: Decay of solutions of systems of nonlinear hyperbolic conservation laws. Memoirs, Am. Math. Soc. 101, (1970)
6. Lax, P. D.: Hyperbolic system of conservation laws, II. Comm. Pure Appl. Math. 10, 537–566 (1957)
7. Lien, W.-C.: Hyperbolic conservation laws with a moving source. Comm. Pure Appl. Math. (to appear)
8. Liu, T.-P.: The deterministic version of the Glimm scheme. Commun. Math. Phys. 57, 135–148 (1977)
9. Liu, T.-P.: Admissible solutions of hyperbolic conservation laws. Memoirs, Am. Math. Soc. 240, (1981)
10. Liu, T.-P.: Quasilinear hyperbolic systems. Commun. Math. Phys. 68, 141–172 (1979)
11. Maccoll, J. W.: The conical shock wave formed by a cone moving at high speed. Proc. of the Royal Society (A) 159, 459–472 (1937)
12. Smoller, J.: Shock waves and reaction-diffusion equations. New York: Springer Verlag, 1983
13. Xiao, L., Zhang, T.: The Riemann problem and interaction of waves in gas dynamics. Pitman Monographs and Surveys in Pure and Applied Math. 41

Communicated by A. Jaffe
Commun. Math. Phys. 204, 551 – 586 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Nonlinear Stability of Weak Detonation Waves for a Combustion Model

Tai-Ping Liu (1,*), Shih-Hsien Yu (2,**)

1 Department of Mathematics, Stanford University, Stanford, CA 94305, USA. E-mail: [email protected]
2 Department of Mathematics, UCLA, Los Angeles, CA 90095, USA. E-mail: [email protected]

Received: 6 October 1998 / Accepted: 2 February 1999

Abstract: We show that the weak detonation waves for a combustion model of Rosales–Majda are nonlinearly stable. Because of the strongly nonlinear nature of the wave, the usual stability analysis of weakly nonlinear nature does not apply. The chemical switch on-off is the main feature of nonlinearity. In particular, the propagation of the wave depends sensitively on the tail behaviour of the flow in front of it. Unlike the strong detonation waves, a weak detonation is supersonic and there is a separation of the gas waves from the reacting front. As a consequence, the reacting front needs to be traced.

Contents
1. Introduction
2. Structure of Weak Detonation Profiles
2.1 Construction of the profiles
2.2 Hypothesis on the profiles
2.3 Explicit profiles for model with linear flux
3. Evolution Equations and Wave Fronts
4. Stability Analysis I: Linear Flux Model
4.1 Initial step: upper bounds of the wave fronts
4.2 Initial step: rate of the wave fronts
4.3 Initial step: waves carried by detonation wave fronts
4.4 Pointwise convergence to viscous profiles
5. Stability Analysis II: Nonlinear Flux
5.1 Nonlinear front tracking
5.2 Update wave front
6. Remarks
7. Appendix

* The research supported in part by NSF Grant DMS-9803323.
** The research supported in part by NSF Grant DMS-9706827.
T.-P. Liu, S.-H. Yu
1. Introduction

Consider the combustion model of Rosales–Majda [15],
\[ u_t + (f(u) - qz)_x = u_{xx}, \qquad z_x = K\psi(u)z, \tag{1.1} \]
where $u$ represents the lumped gas variable and $z$ the density of the reactant. The unburnt state is $z = 1$ and the burnt state $z = 0$. The positive constants $q$ and $K$ are the released energy and the reaction rate, respectively. We assume that there is an ignition temperature $T^*$ so that the reaction function $\psi(u)$ is given by
\[ \psi(u) \equiv \begin{cases} 1 & \text{if } u > T^*, \\ 0 & \text{if } u \le T^*. \end{cases} \tag{1.2} \]
To model the detonation waves, the flux function $f(u)$ is assumed to satisfy
\[ f''(u) \ge 0, \quad f'(u) > 0 \quad \text{for all } u \text{ under consideration.} \tag{1.3} \]
Our purpose is to study the nonlinear stability of weak detonation waves. The detonation waves are travelling waves $(u, z)(x, t) = (\bar u, \bar z)(x - st)$ of the model. We consider the perturbation of the wave: $\lim_{x\to\infty} z(x, t) = 1$, $\lim_{x\to-\infty} z(x, t) = 0$, $u(x, 0) = u_0(x)$, with $\lim_{x\to\infty} u_0(x) = u_R$, $\lim_{x\to-\infty} u_0(x) = u_L$. The flow is unburnt in front of the wave and burnt behind it, $u_R < T^* < u_L$.

System (1.1) is derived from the reactive Navier–Stokes equations to model the acoustic mode of the flow in the limit of low Mach number. For combustion waves for the reactive Navier–Stokes equations, see [2]. There are two types of detonation waves, the strong and the weak detonations. A strong detonation satisfies the same entropy condition as a gas dynamic shock in that it is supersonic (or subsonic) with respect to the flow in front of (or behind) it. Its stability can be shown by the same technique as that for the viscous conservation laws, [7,9,5,6]. The weak detonation is supersonic with respect to both sides of the wave. This is the classical inviscid Chapman–Jouguet theory, [1]. The weak detonation waves are not inviscid waves and depend on the dissipation parameters. This is the general phenomenon for waves which are either overcompressive, such as intermediate MHD waves, or undercompressive, such as weak detonation waves, cf. [8]. This has two basic implications. A perturbation produces a gas wave leaving the combustion wave. The one conservation law cannot determine both the location of the detonation wave and the amount of gas wave. Thus the weak detonation wave needs to be traced. The situation is similar to the interaction of shock waves with either the boundary, [10,14], or with other nonlinear waves, [17], see also [11], and [12] on discrete waves.

In the study of weak detonation waves in Sect. 2, we fix all physical variables except for the ignition temperature $T^*$, which is allowed to vary. This is done for convenience and is equivalent to the usual practice of varying the energy release $q$, cf. [15].

One thing that distinguishes the weak detonation wave is that it generates strongly nonlinear effects. For instance, there is a sensitive dependence of the propagation of the wave
Nonlinear Stability of Weak Detonation Waves for Combustion Model
on the perturbation in front of it. These factors demand new techniques for the stability analysis. The tracing of the wave requires exact analysis of the chemical nonlinearity, which is done using the Laplace–Fourier transform.

The paper by Szepessy, [16], also studies the problem of stability of weak detonation waves. We have adopted his definition of the detonation wave front $\gamma(t)$ by $u(\gamma(t), t) = T^*$, [16]. The derivation of the integral equation for the wave front in Sect. 3 is also motivated by that paper. On the other hand, [16] does not require the second condition in (1.3) and, as a consequence, the strength of the detonation wave can be assumed to be small. The gas nonlinearity $f(u) = u^2$ is emphasized in [16] and is taken care of through the Hopf–Cole transformation.

A strong detonation, the so-called ZND wave, [1], is a gas dynamic shock followed by a reacting zone. The shock raises the gas temperature through compression and thereby sets up the chemical reaction. Thus its stability mechanism is similar to that of the gas shocks. A weak detonation, on the other hand, runs ahead of the gas waves and decouples from them. Thus the nonlinearity of a weak detonation wave is mainly the chemical nonlinearity. To focus on this, we consider in Sect. 4 the simplified model with linear flux, $f''(u) \equiv 0$. This is the main part of the present paper. The study of the general situation, $f''(u) \ge 0$, requires an iteration scheme to take care of the gas nonlinearity and is done in Sect. 5.

Consistent with the derivation of the model (1.1), we require the strong separation of the detonation front and the gas wave, that is, $\alpha \equiv s - f'(T^*)$ large. The other main assumption is that both the reaction rate $K$ and $\alpha/K$ are large. The precise assumptions, Assumptions 2.1 to 2.7, are listed in Sect. 2. These assumptions are verified either numerically for convex flux in Sect. 2, or analytically for linear flux in Sect. 4.
Under these assumptions, we have the following main theorem:

Theorem 1.1. Suppose that the perturbation $v_0(x) \equiv u(x, 0) - \bar u(x)$ of a weak detonation wave is sufficiently small:
\[ v_0 \in C^1(\mathbb{R}) \cap \mathrm{Lip}^2(\mathbb{R}), \qquad |\partial_x^i v_0(x)| \le \delta\, \alpha^i e^{-5\alpha|x|/8} \quad \text{for } i = 0, 1, 2, \]
for a constant $\delta$ satisfying $\delta < \alpha^{-6}$. Then the solution of (1.1) tends to a translation $\gamma(t)$ of the detonation wave as time $t$ tends to infinity:
\[ u(x, t) - \bar u(x - st - \gamma(t)) = O(1)\,\delta \left[ e^{-Ct} e^{-C|x - st - \gamma(t)|} + \frac{1}{\sqrt{t+1}}\, e^{-\frac{(x - f'(u_-)t)^2}{A(t+1)}} \right], \]
for some positive constants $C$ and $A$; and $\gamma(t)$ tends to its limit exponentially fast.

Note that the convergence is at the exponential rate, except for the algebraic rate of $(t + 1)^{-1/2}$ along the gas acoustic direction $x = f'(u)t$.

2. Structure of Weak Detonation Profiles

It follows from the system (1.1) that the end states and the speed of a detonation wave satisfy the Rankine–Hugoniot condition:
\[ s = \frac{f(u_L) - f(u_R) + q}{u_L - u_R}. \tag{2.1} \]
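As a quick numerical illustration of (2.1), the sketch below assumes the quadratic flux $f(u) = u^2$ and the sample values $u_R = 1$, $q = 100$; the function names are hypothetical and chosen only for this example. It evaluates the speed $s$ for given end states and, conversely, solves (2.1) for $u_L$ at fixed $s$, which for this particular flux reduces to a quadratic equation with two roots.

```python
import math

def flux(u):
    # Assumed sample flux f(u) = u^2 (convex, with f' > 0 for u > 0).
    return u * u

def rh_speed(u_left, u_right, q):
    # Rankine-Hugoniot speed (2.1): s = (f(uL) - f(uR) + q) / (uL - uR).
    return (flux(u_left) - flux(u_right) + q) / (u_left - u_right)

def left_states(s, u_right, q):
    # For f(u) = u^2, (2.1) reads u^2 - s*u + (s*uR - uR^2 + q) = 0.
    c = s * u_right - flux(u_right) + q
    disc = s * s - 4.0 * c
    if disc < 0:
        raise ValueError("no real left state for these parameters")
    root = math.sqrt(disc)
    return (s - root) / 2.0, (s + root) / 2.0

s = rh_speed(2.0, 1.0, 100.0)
u_small, u_large = left_states(s, 1.0, 100.0)
print(s, u_small, u_large)
```

For these sample values the smaller root satisfies $f'(u) = 2u < s$ and the larger one $f'(u) > s$, which is the supersonic/subsonic distinction between the two possible left states.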
[Figure: the curve $y = f(u)$ and the line $y - f(u_R) = s(u - u_R) - q$; marked are the state $u_R = u_+$, the two intersection states $u_L = u_-$ and $u_L = u^-$, and the level $f(u_R) - q$.]
It is clear that we have

Lemma 2.1. Suppose that $u_R$ and $s > 0$ are given such that $s > f'(u_R) > 0$, as well as that $f'' > 0$. Then, $u_L$ in (2.1) has two solutions $\{u_-, u^-\}$ with $u_- < u^-$.

Let $u_+ = u_R$ be given as in Lemma 2.1. The shock $((u_-, 0), (u_+, 1))$ is supersonic: $s > f'(u_-) > f'(u_+) > 0$, and is called a weak detonation wave. The wave $((u^-, 0), (u_+, 1))$ satisfies the usual gas dynamics entropy condition: $f'(u^-) > s > f'(u_+)$, and is called a strong detonation wave.

2.1. Construction of the profiles. So far we have studied the far field states $u_\pm$ of the combustion waves. For the actual existence and structure of these waves we need to study the ODE obtained from (1.1) when the solution is a travelling wave $(u, z)(x, t) = (\bar u, \bar z)(x - st)$:
\[ \begin{cases} -s(\bar u - u_-) + f(\bar u) - f(u_-) - q\bar z = \bar u_x, \\ K\psi(\bar u)\,\bar z = \bar z_x, \\ \lim_{x\to\infty}(\bar u, \bar z)(x) = (u_+, 1), \quad \lim_{x\to-\infty}(\bar u, \bar z)(x) = (u_-, 0), \\ \psi(u) \equiv H(u - T^*), \end{cases} \tag{2.2} \]
where, for definiteness, we have made the normalization $\bar u(0) = T^*$, and $H$ is the Heaviside function:
\[ H(u) = \begin{cases} 0 & \text{if } u \le 0, \\ 1 & \text{else.} \end{cases} \]
Lemma 2.2. For any $s$ and $u_+ \equiv u_R$ given in Lemma 2.1, there is a unique $T^* \in (u_+, u_-)$ such that (2.2) has a monotone solution.

Proof. Consider the following dynamical system:
\[ \begin{cases} \dot X = -s(X - u_-) + f(X) - f(u_-) - qY, \\ \dot Y = KY, \\ \lim_{t\to-\infty}(X, Y)(t) = (u_-, 0). \end{cases} \tag{2.3} \]
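[Figure: phase plane $(X, Y)$ showing the unstable manifold $\Gamma^+$ issuing from $(u_-, 0)$, the curve $F(X, Y) = 0$, the line $Y = 1$, and the points $(u_+, 1)$ and $(T^*, 1)$.]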
The state $(u_-, 0)$ is a fixed point of this dynamical system. At this point the dynamical system has a one-dimensional unstable manifold $\Gamma$. Let $\Gamma^+$ be the branch with positive $Y$ component. Set $\Gamma^+ \equiv \{(\Gamma_1(t), \Gamma_2(t)) : t \in \mathbb{R}\}$. From the second equation of (2.3), $\Gamma^+$ can be normalized such that $\Gamma_2(t) = e^{Kt}$. Hence, $\Gamma^+$ will intersect $Y = 1$. Set
\[ F(X, Y) \equiv -s(X - u_-) + f(X) - f(u_-) - qY. \]
The set $F = 0$ contains $(u_+, 1)$ and $(u_-, 0)$. From the phase diagram of $(X, Y)$, it follows that $F = 0$ never intersects $\Gamma^+$ in $Y > 0$. Thus, $\Gamma^+$ is to the right of $F = 0$ and $\Gamma_1(t)$ is a monotone decreasing function. Hence, $\Gamma_1(0) < u_-$ and $(\Gamma_1(0), 1)$ is to the right of $F = 0$, and so $\Gamma_1(0) \in (u_+, u_-)$. The profiles $(\bar u, \bar z)(x)$ and $(\Gamma_1(x), e^{Kx})$ are identical for $x \le 0$. Set $T^* \equiv \Gamma_1(0)$. From this choice of $T^*$, the solution $(\bar u, \bar z)(x)$ for $x > 0$ is $\bar z(x) \equiv 1$ and $\bar u(x) = X(x)$ solving
\[ -s(X - u_-) + f(X) - f(u_-) - q = \dot X, \qquad X(0) = T^*. \]
Clearly, $X(t)$ is a strictly monotone decreasing function with $\lim_{t\to\infty} X(t) = u_+$. □
The ignition temperature $T^*$ is a function depending on $u_-$, $u_+$, $s$, $q$, and $K$. With the additional constraint of the Rankine–Hugoniot condition, we have
\[ T^* = T^*(u_+, s, q, K). \tag{2.4} \]
2.2. Hypothesis on the profiles. Let $(\bar u, \bar z)(x - st)$ be the normalized weak detonation profile and $\alpha$ the separation of the combustion and the gas speeds: $\alpha \equiv s - f'(T^*)$. For our stability analysis we need the following hypotheses on the combustion wave.

Assumption 2.3. Assume that the following holds for $\bar u$:
\[ Q \equiv \frac{qK}{|\bar u_x(0)|\,\alpha} = O(1), \]
where the bound $O(1)$ is independent of the parameters involved.

Assumption 2.4. Assume that the gradient $-\bar u_x(0)$ is sufficiently large.

Note. When $q, \alpha/K \gg 1$, the quantity $|\bar u_x(0)|$ is proportional to $|u_- - u_+|K$. Thus, this hypothesis is a consequence of assuming $K \gg 1$.

Regarding Assumption 2.3, the quantity $Q$ cannot be made arbitrarily small by arranging the values of $u_+$, $s$, $q$, and $K$. This will be illustrated by a simplified model in the next subsection. For now we show by numerics that both Assumption 2.3 and Assumption 2.4 can be satisfied, by calculating the value $Q$ for the flux function $f(u) = u^2$. By (2.2) with $\bar z = e^{Kx}$, the normalized profile $\bar u$ satisfies
\[ \bar u_x = (2u_- - s)(\bar u - u_-) + (\bar u - u_-)^2 - q e^{Kx} \quad \text{for } x \le 0. \tag{2.5} \]
By repeating Picard iteration eight times, we have the following approximate values for $T^*$, $\bar u_x$, and $Q$ with given $(u_+, u_-)$ and varying $(q, K)$:

  u_+   u_-   s      q      K    T^*      \bar u_x(0)   alpha    Q
  1     2     103    100    1    1.0097   -0.98         100.98   1.0097
  1     2     53     50     1    1.0189   -0.96         50.96    1.0189
  1     2     28     25     1    1.0358   -0.931        25.92    1.036
  1     2     15.5   12.5   1    1.0653   -0.88         13.37    1.0661
  1     2     103    100    2    1.0192   -1.94         100.96   1.0194
  1     2     103    100    4    1.0378   -3.81         100.92   1.04
  1     2     103    100    8    1.0729   -7.36         100.85   1.08
  1     2     103    100    16   1.1361   -13.73        100.72   1.16
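The entries of such a table can be cross-checked by integrating the profile equation numerically. The sketch below is not the Picard iteration used in the text; it is a plain fourth-order Runge–Kutta integration, under the assumption $f(u) = u^2$, of the equation $w' = (2u_- - s)w + w^2 - qe^{Kx}$ for $w = \bar u - u_-$, which is what (2.2) gives for $x \le 0$ with $\bar z = e^{Kx}$. The function name, the starting point `x_start` and the step count are hypothetical choices made only for this illustration.

```python
import math

def ignition_temperature(u_minus, s, q, K, x_start=-20.0, steps=40000):
    # Integrate w' = lam*w + w**2 - q*exp(K*x), w = u - u_minus, from x_start
    # up to x = 0 with classical RK4 (flux f(u) = u^2 is assumed).
    lam = 2.0 * u_minus - s

    def rhs(x, w):
        return lam * w + w * w - q * math.exp(K * x)

    h = -x_start / steps
    x = x_start
    # Start on the linearised unstable manifold: w ~ -q*exp(K*x)/(K - lam).
    w = -q * math.exp(K * x) / (K - lam)
    for _ in range(steps):
        k1 = rhs(x, w)
        k2 = rhs(x + 0.5 * h, w + 0.5 * h * k1)
        k3 = rhs(x + 0.5 * h, w + 0.5 * h * k2)
        k4 = rhs(x + h, w + h * k3)
        w += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += h
    t_star = u_minus + w              # T* = u(0)
    slope = rhs(0.0, w)               # u_x(0)
    alpha = s - 2.0 * t_star          # s - f'(T*)
    Q = q * K / (abs(slope) * alpha)  # quantity of Assumption 2.3
    return t_star, slope, alpha, Q

print(ignition_temperature(2.0, 103.0, 100.0, 1.0))
```

With the parameters of the first row ($u_- = 2$, $s = 103$, $q = 100$, $K = 1$) the computed values should land close to the tabulated ones; the remaining rows can be checked the same way.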
The variables $(u_+, s, q, K)$ in the above table are the basic variables, which uniquely determine the other variables in the table. With $K$, $q$, and $u_+$ fixed one can vary $s$ such that $K/\alpha \ll 1$. One has the following analytic properties of the resulting viscous shock profile $\bar u$.

Lemma 2.5. With $K > 0$, $q > 0$, and $u_+$ given, there exists $S(K, q, u_+) > 0$ such that for any $s > S(K, q, u_+)$ the following holds for $x > 0$:
\[ 0 < \frac{T^* - u_+}{u_- - u_+} < \frac{S(K, q, u_+)}{\alpha}, \]
\[ \frac{|\bar u_x(0)|}{2}\, e^{-(s - f'(u_+))|x|} \le -\bar u_x(x) \le 2\,|\bar u_x(0)|\, e^{-\alpha|x|}; \]
and for x < 0 0 ≤ −u¯ x (x). See the Appendix for the proof. To handle the chemical nonlinearity, we pose the following hypothesis: Assumption 2.6. Assume that 1
α 16 1, |u¯ x (0)| α 1. K On interaction between the fluid nonlinearity and chemical nonlinearity, the following assumption is required. Assumption 2.7. Assume that K 1.
2.3. Explicit profiles for model with linear flux. Consider a simplified system with linear flux
\[ \begin{cases} u_t + m\,u_x - q z_x = u_{xx}, \\ z_x = K\psi(u)z, \\ u_- > u_+, \quad s > m > 0, \quad q \equiv (s - m)(u_- - u_+). \end{cases} \tag{2.6} \]
From the analysis of the profiles in the last section it is easy to see that
\[ T^* \equiv u_- - \frac{q}{K + s - m}, \qquad T^* - u_+ = \frac{K(u_- - u_+)}{K + s - m}. \]
Let $(\bar u, \bar z)(x - st)$ be the travelling wave solution of (2.6), which connects $((u_-, 0), (u_+, 1))$:
\[ \begin{cases} \bar u_x = (m - s)(\bar u - u_-) - q\bar z, \\ \bar z_x = K\psi(\bar u)\bar z, \\ \lim_{x\to\infty}(\bar u, \bar z)(x) = (u_+, 1), \quad \lim_{x\to-\infty}(\bar u, \bar z)(x) = (u_-, 0). \end{cases} \tag{2.7} \]
As before, we make the normalization:
\[ \bar u(0) = T^*, \qquad \bar z(0) = 1. \tag{2.8} \]
From the monotonicity of the profile we have
\[ (\bar u(x) - T^*)\,x < 0 \quad \text{for } x \ne 0. \tag{2.9} \]
From (2.8) and (2.9),
\[ \bar z(x) = \begin{cases} 1 & \text{for } x > 0, \\ e^{Kx} & \text{for } x \le 0. \end{cases} \tag{2.10} \]
Substituting (2.10) into (2.7), we obtain
\[ \bar u_x = (m - s)(\bar u - u_-) - q e^{Kx} \quad \text{for } x \le 0. \tag{2.11} \]
From (2.8) and (2.11) we have
\[ \bar u(x) = u_- - \frac{q e^{Kx}}{s - m + K} \quad \text{for } x \le 0. \]
The profile for $x > 0$ is trivial as in the last section. For this simplified model $\alpha = s - m$. Furthermore, from (2.11) we have
\[ \bar u_x(0) = \frac{K(m - s)(u_- - u_+)}{s - m + K} < 0, \]
\[ Q \equiv \frac{qK}{|\bar u_x(0)|\,\alpha} = \frac{q(s - m + K)}{(s - m)^2(u_- - u_+)} = \frac{\alpha + K}{\alpha}. \]
Therefore Assumption 2.3 is satisfied under a weak version of Assumption 2.6: $\alpha = O(1)K$.

3. Evolution Equations and Wave Fronts

For a small perturbation of a weak detonation wave, the solution $u(x, t)$ assumes the value of the ignition temperature only at one location for each given time, and the wave front $\gamma(t) + st$ is well-defined by:
\[ u(\gamma(t) + st, t) \equiv T^*. \]
Without loss of generality, we assume that $\gamma(0) = 0$. With the change of coordinates
\[ x \to x + st, \qquad t \to t, \]
the system (1.1) becomes
\[ \begin{cases} u_t - s u_x + f(u)_x = u_{xx} + q z_x, \\ z_x = K\psi(u)z. \end{cases} \tag{3.1} \]
Expand $f(u)$ at $T^*$:
\[ N_2(u) \equiv f(u) - f(T^*) - f'(T^*)(u - T^*), \qquad \alpha \equiv s - f'(T^*). \]
Substitute this into (3.1):
\[ \begin{cases} u_t - \alpha u_x - u_{xx} = q z_x - N_2(u)_x, \\ z_x = \psi(u)Kz. \end{cases} \tag{3.2} \]
The perturbation
\[ v(x, t) \equiv u(x, t) - \bar u(x) \]
of the weak detonation satisfies
\[ v_t - \alpha v_x - v_{xx} = q(z - \bar z)_x - N_1(v)_x, \qquad N_1(v) \equiv N_2(v + \bar u) - N_2(\bar u). \tag{3.3} \]
The Green function for the left-hand side of (3.3) is the heat kernel
\[ k(x - y + \alpha(t - \sigma), t - \sigma) = \frac{e^{-\frac{(x - y + \alpha(t - \sigma))^2}{4(t - \sigma)}}}{\sqrt{4\pi(t - \sigma)}}. \]
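By Duhamel's principle,
\[ \begin{aligned} v(x, t) = {} & \int_{\mathbb{R}} k(x - y + \alpha t, t)\,v(y, 0)\,dy \\ & + q\int_0^t\!\!\int_{\mathbb{R}} k(x - y + \alpha(t - \sigma), t - \sigma)\,(z - \bar z)_y\,dy\,d\sigma \\ & - \int_0^t\!\!\int_{\mathbb{R}} k(x - y + \alpha(t - \sigma), t - \sigma)\,N_1(v)_y(y, \sigma)\,dy\,d\sigma. \end{aligned} \tag{3.4} \]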
Set $W(x) \equiv (1 - H(x))\,e^{Kx}$. From the reaction equation of $z$ in (3.2), both $z(y, \sigma)$ and $\bar z(y)$ can be represented in terms of the detonation wave locations:
\[ z_y(y, \sigma) = K\,W(y - \gamma(\sigma)), \qquad \bar z_y(y) = K\,W(y). \tag{3.5} \]
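The representation (3.5) can be made concrete with a few lines of code. The sketch below is only an illustration under the assumption that the flow crosses the ignition temperature exactly once, at the front position, so that the reactant density is the profile $\bar z$ shifted to the front; the helper names and sample values are hypothetical.

```python
import math

def W(x, K):
    # W(x) = (1 - H(x)) * exp(K x): nonzero only behind the front (x <= 0).
    return math.exp(K * x) if x <= 0 else 0.0

def z_profile(x, K):
    # Reactant density of the normalised profile, cf. (2.10).
    return math.exp(K * x) if x <= 0 else 1.0

def z_shifted(y, gamma, K):
    # Density when the front sits at gamma: the profile shifted by gamma.
    return z_profile(y - gamma, K)

def central_difference(func, y, h=1e-6):
    # Symmetric finite-difference approximation of d func / dy.
    return (func(y + h) - func(y - h)) / (2.0 * h)

K, gamma = 2.0, 0.2
behind = central_difference(lambda y: z_shifted(y, gamma, K), -0.3)
ahead = central_difference(lambda y: z_shifted(y, gamma, K), 1.0)
print(behind, K * W(-0.3 - gamma, K), ahead, K * W(1.0 - gamma, K))
```

Behind the front the finite difference reproduces $K\,W(y - \gamma)$, and ahead of it both sides vanish, which is the content of (3.5).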
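Since $u(\gamma(t), t) = T^*$:
\[ v(\gamma(t), t) = u(\gamma(t), t) - \bar u(\gamma(t)) = T^* - \bar u(\gamma(t)) = \bar u(0) - \bar u(\gamma(t)), \]
we obtain, from (3.4) and (3.5), the equation for $\gamma(t)$,
\[ \begin{aligned} \bar u(0) - \bar u(\gamma(t)) = {} & \int_{\mathbb{R}} k(\gamma(t) - y + \alpha t, t)\,v(y, 0)\,dy \\ & + qK\int_0^t\!\!\int_{y<0} \big(k(\gamma(t) - \gamma(\sigma) - y + \alpha(t - \sigma), t - \sigma) \\ & \qquad\qquad - k(\gamma(t) - y + \alpha(t - \sigma), t - \sigma)\big)\,W(y)\,dy\,d\sigma \\ & - \int_0^t\!\!\int_{\mathbb{R}} k(\gamma(t) - y + \alpha(t - \sigma), t - \sigma)\,N_1(v)_y(y, \sigma)\,dy\,d\sigma. \end{aligned} \tag{3.6} \]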
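The same Duhamel's principle applies for the special solution $(\bar u, \bar z)$ of (3.2):
\[ \begin{aligned} \bar u(x - \rho) - \bar u(x) = {} & \int_{\mathbb{R}} k(x - y + \alpha t, t)\,(\bar u(y - \rho) - \bar u(y))\,dy \\ & + qK\int_0^t\!\!\int_{y<0} \big(k(x - \rho - y + \alpha(t - \sigma), t - \sigma) \\ & \qquad\qquad - k(x - y + \alpha(t - \sigma), t - \sigma)\big)\,W(y)\,dy\,d\sigma \\ & - \int_0^t\!\!\int_{\mathbb{R}} k(x - y + \alpha(t - \sigma), t - \sigma)\,N_1(\bar u(y - \rho) - \bar u(y))_y\,dy\,d\sigma. \end{aligned} \]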
Set both $x$ and $\rho$ above equal to $\gamma(t)$, and subtract (3.6) from the resulting identity. We obtain a refined equation for $\gamma(t)$:
\[ \begin{aligned} 0 = {} & \int_{\mathbb{R}} k(\gamma(t) - y + \alpha t, t)\,\{[\bar u(y - \gamma(t)) - \bar u(y)] - v(y, 0)\}\,dy \\ & + qK\int_0^t\!\!\int_{y<0} \big(k(-y + \alpha(t - \sigma), t - \sigma) \\ & \qquad\qquad - k(\gamma(t) - \gamma(\sigma) - y + \alpha(t - \sigma), t - \sigma)\big)\,W(y)\,dy\,d\sigma \\ & - \int_0^t\!\!\int_{\mathbb{R}} k(\gamma(t) - y + \alpha(t - \sigma), t - \sigma)\,\{N_1(\bar u(y - \gamma(t)) - \bar u(y)) - N_1(v)\}_y\,dy\,d\sigma. \end{aligned} \tag{3.7} \]
Equation (3.7) contains not only the front $\gamma(t)$ but also the gradient of the fluid variable $\bar u$. Thus, the study of the qualitative behavior of the wave front requires the global stability of the fluid. The idea is to utilize the local stability to trace the wave fronts in each small time interval. This is then used to show the local stability of the fluid. The time asymptotic stability is studied by repeating the local analysis. In the next section we carry out the analysis for the model with linear flux.
4. Stability Analysis I: Linear Flux Model

To concentrate on the analysis of the switch on-off reaction nonlinearity, we consider the simplified model with linear flux, $f'(u) = m$. In the moving coordinate, the system (3.1) is
\[ u_t - \alpha u_x - u_{xx} = q z_x, \qquad z_x = K\psi(u)z, \]
where $\alpha = s - m > 0$. The integral equation (3.7) of the detonation wave front $\gamma(t)$ is self-contained for this simplified model:
\[ \begin{aligned} 0 = {} & \int_{\mathbb{R}} k(\alpha t - y, t)\,\big([\bar u(y) - \bar u(y + \gamma(t))] - v(y + \gamma(t), 0)\big)\,dy \\ & + qK\int_0^t\!\!\int_{y<0} \{k(\alpha(t - \sigma) - y, t - \sigma) \\ & \qquad\qquad - k(\alpha(t - \sigma) + \gamma(t) - \gamma(\sigma) - y, t - \sigma)\}\,W(y)\,dy\,d\sigma. \end{aligned} \tag{4.1} \]
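4.1. Initial step: upper bounds of the wave fronts. Let $\delta$ be a small positive number:
\[ \delta < \alpha^{-6}, \tag{4.2} \]
and assume that the initial perturbation $v_0(x) \equiv u(x, 0) - \bar u(x)$ satisfies:
\[ \begin{cases} v_0(0) = 0, \quad v_0 \in C^1(\mathbb{R}) \cap \mathrm{Lip}^2(\mathbb{R}), \\ |\partial_x^i v_0(x)| \le \delta\,\alpha^i e^{-5\alpha|x|/8} \quad \text{for } i = 0, 1, 2. \end{cases} \tag{4.3} \]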
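Rewrite (4.1), using $\bar u(y) - \bar u(y + \gamma(t)) = -\gamma(t)\int_0^1 \bar u_y(y + \gamma(t)\theta)\,d\theta$ and the analogous mean value formula for the difference of the kernels, as
\[ \begin{aligned} & \gamma(t)\left\{ -\int_{\mathbb{R}} k(\alpha t - y, t)\int_0^1 \bar u_y(y + \gamma(t)\theta)\,d\theta\,dy \right. \\ & \qquad \left. + \, qK\int_0^t\!\!\int_{y<0} \left[\int_0^1 k(\theta(\gamma(t) - \gamma(\sigma)) + \alpha(t - \sigma) - y, t - \sigma)\,d\theta\right]_y e^{Ky}\,dy\,d\sigma \right\} \\ & \quad = \int_0^t \gamma(\sigma)\,qK\left\{\int_{y<0} \left[\int_0^1 k(\theta(\gamma(t) - \gamma(\sigma)) + \alpha(t - \sigma) - y, t - \sigma)\,d\theta\right]_y e^{Ky}\,dy\right\} d\sigma \\ & \qquad + \int_{\mathbb{R}} k(\alpha t - y, t)\,v_0(y + \gamma(t))\,dy. \end{aligned} \tag{4.4} \]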
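Lemma 4.1. Suppose that $\alpha$ is sufficiently large. Then, there exists a constant $L_0 > 0$ such that for $|\rho| < \delta$,
\[ -L_0^{-1}\,\bar u_x(0)\,e^{-\frac{\alpha^2 t}{4}}\min\left(1, \frac{1}{\alpha\sqrt{t}}\right) \le -\int_{\mathbb{R}} k(\rho + \alpha t - y, t)\,\bar u_y(y)\,dy, \]
\[ \int_{\mathbb{R}} k(\rho + \alpha t - y, t)\,e^{-\frac{5\alpha|y|}{8}}\,dy \le L_0\,e^{-\frac{\alpha^2 t}{4}}\min\left(1, \frac{1}{\alpha\sqrt{t}}\right), \tag{4.5} \]
\[ \int_{y<0} k(\rho + \alpha t - y, t)\,e^{\frac{\alpha|y|}{3}}\,dy \le L_0\,e^{-\frac{\alpha^2 t}{4}}\min\left(1, \frac{1}{\alpha\sqrt{t}}\right). \tag{4.6} \]

Proof. Expand $k(\rho + \alpha t - y, t)$ as follows:
\[ k(\rho + \alpha t - y, t) = \frac{e^{\frac{1}{2}\alpha y}}{e^{\frac{1}{2}\alpha\rho}}\,k(\rho - y, t)\,e^{-\frac{\alpha^2 t}{4}}. \]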
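The factorization of the shifted heat kernel used in the proof of Lemma 4.1 follows from expanding the square $(\rho - y + \alpha t)^2$ in the exponent. As a sanity check, the sketch below compares both sides numerically; the function names and the sample arguments are hypothetical.

```python
import math

def k(x, t):
    # Heat kernel k(x, t) = exp(-x^2 / (4 t)) / sqrt(4 pi t).
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def k_factored(rho, y, t, alpha):
    # exp(alpha*y/2) / exp(alpha*rho/2) * k(rho - y, t) * exp(-alpha^2 t / 4).
    return (math.exp(0.5 * alpha * y) / math.exp(0.5 * alpha * rho)
            * k(rho - y, t) * math.exp(-alpha * alpha * t / 4.0))

rho, y, t, alpha = 0.01, -0.7, 0.3, 5.0
lhs = k(rho + alpha * t - y, t)
rhs = k_factored(rho, y, t, alpha)
print(lhs, rhs)
```

The factor $e^{-\alpha^2 t/4}$ isolated in this way is the source of the exponential decay in time that appears on the right-hand sides of (4.5) and (4.6).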
The lemma follows from plugging Lemma 2.6 and the above expansion into the integrals. t u Lemma 4.2. Suppose that α is sufficiently large and that (4.2) holds. Then, for 0 < t < 2 log α/ α 2 , Z 1 Z t Z Ky k(θ(γ (t) − γ (σ )) + α(t − σ ) − y, t − σ )dθ e dy dσ qK y<0 t− 8δ 0 y α Z Z 1 1 u¯ y (y + γ (t)θ) dθ dy . k(αt − y, t) ≤ 1 R 0 α2 Proof. Exchange the order of the last integrations and apply Lemma 4.1 to the resulting integral to yield Z 1 Z 2 log α |u¯ x (0)| ≤ k(αt − y, t) |u¯ y (y + γ (t)θ )| dθ dy for 0 ≤ t ≤ . (4.7) 3 α2 R 0 4L0 α 2 Set
Z
Z II(t, σ ) ≡
y<0
0
1
k(θ(γ (t) − γ (σ )) + α(t − σ ) − y, t − σ ) dθ
y
eKy dy.
Using integration by parts we have Z 1 k(θ(γ (t) − γ (σ )) + α(t − σ ), t − σ ) dθ II(t, σ ) = 0
−
Z
Z
1
y<0 0
k(θ(γ (t) − γ (σ )) + α(t − σ ) − y, t − σ ) KeKy dθ dy
2 . < √ (t − σ ) This and (4.2) yield r Z t √ √ √ 5 8δ II(t, σ ) dσ ≤ qK4 = 8 2Q|u¯ x (0)| δα ≤ 8 2Q|u¯ x (0)|α − 2 . qK α t− 8δ α (4.8) The lemma follows from combining Assumption 2.3, (4.7), and (4.8) and the assumption that α is sufficiently large. u t Proposition 4.3. Suppose that α > 0 is sufficiently large. Then, 2 log α for t ∈ 0, α 2 , |γ (t)| <
4L20 δ , |u¯ x (0)|
(4.9)
where L0 is the constant given in Lemma 4.1. Proof. We first relax (4.9) and make the a priori assumption 8L20 δ 2 log α for t ∈ 0, . |γ (t)| < |u¯ x (0)| α2 Set
(4.10)
k|γ k|t ≡ sup |γ (σ )|. σ ∈(0,t]
Case 1. 0 ≤ t ≤ 8δ/α. By applying Lemma 4.1, Lemma 4.2, (4.3) and (4.4), one obtains |γ (t)| 1 L0 δ γ (t) |u¯ x (0)|e−2αδ − 3 . ≤ 3 |u¯ x (0)| · k|γ k|t + √ √ L0 min( tα, 1) min( tα, 1) α 2 |u¯ x (0)| α2 (4.11) Due to Assumption 2.4, (4.11), and largeness of α, we have k|γ k|t ≤
4L20 δ for 0 ≤ t ≤ 8δ/α, |u¯ x (0)|
and the proposition holds in this case. Case 2. 8 δ α −1 ≤ t ≤ 2α −2 log α. When t − σ > 8δ/α, due to (4.10) the function ∂y k(θ(γ (t)−γ (σ ))+α(t−σ )−y,t−σ ) is a positive function for y < 0 and θ ∈ [0, 1]. Thus, II(t, σ ) is a positive function for (t − σ ) > 8δ/α. It yields that Z t−8δ/α Z t Z t II(t, σ )γ (σ ) dσ ≤ II(t, σ ) dσ + |II(t, σ )|dσ |kγ k|t . 0
0
t−8δ/α
Applying this, Lemma 4.1, Lemma 4.2, (4.3) and (4.4), we conclude that Z t−8δ/α 2 1 − α4 t |u¯ x (0)| e + qK II(t, σ ) dσ (4.12) |γ (t)| √ L0 min( tα, 1) 0 1 − 5 |u¯ x (0)| · |γ (t)| α2 α2 t Z t−8δ/α L0 e− 4 δ 1 . |u¯ x (0)| + qK II(t, σ ) dσ · k|γ k|t + ≤ √ 5 min( tα, 1) 0 α2 Suppose that |kγ k|t = |γ (τ0 )| for τ0 ∈ [0, t], then from (4.12), Assumption 2.4, and largeness of α: Z t−8δ/α α2 τ 1 |u¯ x (0)| − 40 |u¯ x (0)| e + qK II(t, σ ) dσ − |γ (τ0 )| √ 5 L0 min( tα, 1) 0 α2 2 α τ0 Z t−8δ/α e− 4 δ 1 . |u¯ x (0)| + qK II(t, σ ) dσ |γ (τ0 )| + L0 < √ 5 min( tα, 1) 0 α2 By canceling the integrals with qK coefficients in both sides, it yields the uniform bound t for |γ (τ0 )|. Thus, the estimate of k|γ k|t follows in this case. u Differentiate (4.1) with respect to t to result in the equation for γ 0 (t): Z k(αt, t) u¯ y (y + γ (t)) + vy (y + γ (t), 0) dy (4.13) 0 = −γ 0 (t) R Z d {k(αt − y, t)} {u(y) ¯ ¯ + γ (t)) − v(y + γ (t), 0)} dy − u(y + R dt Z tZ ∂y (k(γ (t) − γ (t − τ ) + ατ − y, τ )) (γ 0 (t) − γ 0 (t − τ ))W(y)dy. +qK 0
y
Applying the same arguments for obtaining the uniform bound of γ (t) to (4.13), one obtains the following proposition about the uniform bound of γ 0 (t). Proposition 4.4. Suppose that α is sufficiently large. Then, there exists a constant L1 > α ], 0 such that for t ∈ [0, 2 log α2 |γ 0 (t)| ≤
δL1 L20 α 2 . |u¯ x (0)|
(4.14)
4.2. Initial step: rate of the wave fronts. In the above we have obtained the uniform bound of the detonation wave location $\gamma(t)$ in a finite time interval $[0, 2\log\alpha/\alpha^2]$. That analysis tries to minimize the chemical effect, that is, the terms in (4.4) with coefficient $qK$. This uniform bound is not refined enough to trace the wave front. We need to obtain sharper estimates of $\gamma'$, using Proposition 4.4, to obtain a refined wave front tracing. This is done by using the Laplace–Fourier transformation, [3], to take full account of the chemical nonlinearity.
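For orientation, it may help to record what the exponential weights become at the end of the time interval $[0, 2\log\alpha/\alpha^2]$. The short computation below is only elementary arithmetic with $t_0 = 2\log\alpha/\alpha^2$ (the notation $t_0$ is the one used for this endpoint later in the section); it shows how decay factors in time turn into negative powers of $\alpha$.

```latex
\[
t_0 = \frac{2\log\alpha}{\alpha^{2}}
\quad\Longrightarrow\quad
e^{-\alpha^{2}t_0/4} = e^{-\frac{1}{2}\log\alpha} = \alpha^{-1/2},
\qquad
e^{-\alpha^{2}t_0/16} = e^{-\frac{1}{8}\log\alpha} = \alpha^{-1/8}.
\]
```

In particular, a gain of the form $\alpha^{-1/8}$ over one such time interval is a small factor when $\alpha$ is large.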
Let $G(t)$ be a function satisfying $|G(t)| < e^{Bt}$. Let $\hat G(s)$ be the Laplace–Fourier transformation of $G$:
\[ \hat G(s) \equiv \int_0^\infty e^{-st}\,G(t)\,dt \quad \text{for } s \in \mathbb{C} \text{ and } \operatorname{Re} s > B. \tag{4.15} \]
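To make the transform (4.15) concrete, the sketch below approximates it by a truncated trapezoidal rule and compares the result with a closed form. The test function $G(t) = e^{t/2}$, the sample point $s = 2 + 3i$, the truncation `t_max` and the helper name are all hypothetical choices for illustration; for this $G$ the exact transform is $1/(s - \tfrac12)$ whenever $\operatorname{Re} s > \tfrac12$.

```python
import cmath
import math

def laplace_transform(G, s, t_max=60.0, n=60000):
    # Composite trapezoidal approximation of integral_0^inf exp(-s t) G(t) dt,
    # truncated at t_max (adequate once exp(-s t) G(t) has decayed by then).
    h = t_max / n
    total = 0.5 * (G(0.0) + cmath.exp(-s * t_max) * G(t_max))
    for j in range(1, n):
        t = j * h
        total += cmath.exp(-s * t) * G(t)
    return total * h

def G(t):
    # Sample function with |G(t)| < exp(B t) for t > 0 when B = 1.
    return math.exp(0.5 * t)

s = 2.0 + 3.0j
approx = laplace_transform(G, s)
exact = 1.0 / (s - 0.5)
print(approx, exact)
```

The requirement $\operatorname{Re} s > B$ is what makes the integrand decay, so the truncation at a finite `t_max` is harmless in this example.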
One has the following Parseval’s relation. Lemma 4.5. For η > B, Z ∞ Z ∞ 1 b + iξ )|2 dξ. e−2ηt |G(t)|2 dt = |G(η 2π −∞ 0 See Appendix A.2 of [3]. Rewrite (4.13) as follows: Z ∞ Z 0 Ky ∂y k(ατ −y, τ )e dy dτ qK γ (t) 0 t
Z
y<0
0
Z
Ky
γ (t −τ ) qK ∂y k(ατ −y, τ )e dy dτ y<0 Z −¯uy (y +γ (t))+vy (y +γ (t), 0) dy = γ 0 (t) R Z d {k(αt −y, t)} {u(y)− ¯ u(y ¯ +γ (t))−v(y +γ (t), 0)} dy + dt R Z t Z + qK ∂y ((k(γ (t)−γ (t −τ )+ατ −y, τ )−k(γ (t)−γ (t −τ )+ατ −y, τ )) −
0
0
y
· (γ 0 (t)−γ 0 (t −τ ))W(y))dydτ Z ∞Z 0 Ky ∂y k(ατ −y, τ )e dydτ +γ (t)qK t
y<0
≡ j1 +j2 +j3 . (4.16) From the uniform bounds of γ and γ 0 in Propositions 4.3 and 4.4, one can see that j1 and j2 in (4.16) satisfy, for t ∈ (0, 2 log α /α 2 ), |j1 | ≤ e−
α2 t 4
|j2 | ≤ 8e−
α 2 δL1 L20 |u¯ x (0)| + α 2 δ , |u¯ x (0)|
α2 t 4
α 2 δ(L20 + 2), α2 t qKα 2 δe− 4 δ2 α2 O(1) + O(1) |j3 | = |u¯ x (0)| α |u¯ x (0)| 2 log α 2 −α 2 t/4 for t ∈ 0, . = O(1)α δe α2
(4.17)
Here we have noticed from (4.14) and (4.9) that Z ∞ Z ∞ qKα 2 III(τ )dτ = O(1) III(τ )dτ γ 0 (t)qK |u¯ x (0)| t t α2 t qKα − α2 t e 4 = O(1)α 2 e− 4 , = O(1) |u¯ x (0)| Z tZ (γ 0 (t) − γ 0 (t − τ ))(γ (t) − γ (t − τ ))kyy (·, ·)W(y)dydτ qK y<0 2 qKδ α 3 0
= O(1)
|u¯ x (0)|2
Set
= O(1)
Z III(τ ) ≡
y<0
δ2 α4 . |u¯ x (0)|
∂y k(ατ − y, τ )eKy dy,
E(τ ) ≡ III(τ ) − k(ατ, τ ), Z k(ατ − y, τ )KeKy dy |E(τ )| =
(4.18)
y<0
α2 τ
K e− 4 ≤ O(1) √ , α+K τ Z ∞ III(τ ) dτ V0 ≡ Z ∞ Z0 ∞ K 1 1 + O(1) , E(τ )dτ + k(ατ, τ )dτ = = α α 0 0 E k(ατ, τ ) , E0 (τ ) ≡ , P (τ ) ≡ V0 V0 G(t) ≡ γ 0 (t).
(4.19)
From the above estimates about j1 , j2 , and j3 , (4.16) can be rewritten as G(t) = (P ∗ G)(t) + (E0 ∗ G)(t) + F(t), Q (j1 + j2 + j3 ) · charh 2 log α i (t), F(t) ≡ O(1) 0, 2 |u¯ x (0)| α
(4.20)
δe−α t/4 α 2 . F(t) = O(1) |u¯ x (0)| 2
Let’s rewrite (4.20) in terms of convolution operators (1 − P) · G = E0 · G + F, P · G(τ ) ≡ P ∗ G, E0 · G(τ ) ≡ E0 ∗ G.
(4.21)
We study (4.21) by formally expressing G as G=
∞ X i=0
[(1 − P)−1 E0 ]i (1 − P)−1 F.
(4.22)
One needs to construct a special functional space such that the operator (1 − P)−1 is a bounded linear operator and that {[(1−P)−1 E0 ]i (1−P)−1 F}i≥0 is a Cauchy’s sequence. Consider the Fourier–Laplace transformation (4.15) of (4.20) with s = −α 2 /8 + iξ, ξ ∈ R. Both P and E0 are convolution operators, and so P(s) · b E0 (s), P[ ◦ E0 (s) = b and (4.22) yields b = G(s)
∞ X i=0
c0 (s)i E b F(s). (1 − b P(s))i+1
The function space is defined by the norm: Z ∞ 2 α t 2 e 4 h(t) dt. k|hk|α = 0
By Lemma 4.5 Z k|Gk|α ≡ ≤
∞
2 α 2 t/4
G(t) e
1/2 dt
(4.23)
0
1/2 Z ∞ b ∞ X α2 1 F(s)|2 |E0 (s)|2i |b dξ with < s = − . 2(i+1) 2π −∞ |1 − b 8 P(s)| i=0
From (4.20), k|Fk|α = O(1)
δα . |u¯ x (0)|
(4.24)
Thus, it is sufficient to show that |1 − b P(s)|−1 is bounded for < s = −α 2 /8. Lemma 4.6. For any ξ ∈ R there is positive constant C0 such that |(1 − b P(−α 2 /8 + iξ ))|−1 < C0 , |b E0 (−α 2 /8 + iξ )| < C0
K . α+K
Proof. Set s = −α 2 /8 + iξ in (4.15),
Z ∞ α2 b b e−(− 8 +iξ )t P (t) dt P(ξ ) ≡ P(s) = 0 Z ∞ 2 −(− α8 +iξ )t k(αt, t) e dt, = V0 0
b P(0) = α
K 1 + O(1) α+K
Z 0
∞
k(ατ, τ ) e
α2 8
dτ = 1 + O(1)
K α+K
√ 2. (4.25)
Differentiate b P(ξ ) with respect to ξ to yield Z ∞ √ α2 τ τ b Pξ (ξ ) = −i √ e− 8 −iξ τ dτ 4π 0 b P(ξ ) . =− 2 2(ξ − i α8 ) From this,
2 iα = 0. b P(ξ ) ξ − 8
s
ξ
Combine this with (4.25) to obtain √ K ) 2 (1 + O α+K b q . P(ξ ) = 1 + i8ξ α2
(4.26)
From (4.26), it follows that 1 for ξ ∈ R |1 − b P(ξ )| > 16 and |(1 − b P(ξ ))−1 | is bounded. c0 in (4.20) and (4.21), From the definition of E 2 y2 Z ∞ Z − α8 t + αy 2 +Ky− 4t −iξ t e 2 c dydt |E0 (−α /8 + iξ )| = K α √ 4π t y<0 0 Z α2 t K 2α ∞ e− 8 . dt = O(1) ≤K √ α + 2K 0 α + K 4π t This proves the lemma. u t From Lemma 4.6, the series in (4.23) converges, and together with (4.24), Z
∞
0
γ (t) e 2
α2 t 4
1 2
dt
0
C2K ≤ O(1) δ 1 − 0 α+K
≡ |kG|kα
!−1
(4.27)
α . |u¯ x (0)|
With this we may improve Proposition 4.3:

Proposition 4.7. For $t \in [t_0/2, t_0]$, $t_0 = 2\log\alpha/\alpha^2$, there exists $C_2 > 0$, which is independent of $\alpha$, such that
\[ |\gamma(t) - \gamma(t_0)| \le C_2\,\frac{\alpha^{-1/8}\,\delta\,\sqrt{|\log\alpha|}}{|\bar u_x(0)|}. \]
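Proof. For $t \in [t_0/2, t_0]$, (4.27) and the Cauchy–Schwarz inequality yield
\[ \begin{aligned} |\gamma(t) - \gamma(t_0)| & \le \int_t^{t_0} |\gamma'(\rho)|\,d\rho \le \sqrt{t_0 - t}\;e^{-\frac{\alpha^2 t_0}{16}}\left(\int_t^{t_0} e^{\frac{\alpha^2\rho}{4}}\,\gamma'(\rho)^2\,d\rho\right)^{1/2} \\ & = O(1)\,\frac{\sqrt{|\log\alpha|}}{\alpha}\cdot\frac{\alpha\,\delta}{|\bar u_x(0)|}\,\alpha^{-1/8} = O(1)\,\frac{\alpha^{-1/8}\,\delta\,\sqrt{|\log\alpha|}}{|\bar u_x(0)|}. \qquad \Box \end{aligned} \]

By introducing a parameter $\beta_0$,
\[ \beta_0 \equiv \frac{-\log\left(\dfrac{C_2\sqrt{|\log\alpha|}\;\alpha^{-1/8}}{|\bar u_x(0)|}\right)}{2\log\alpha}, \]
Proposition 4.7 becomes
\[ |\gamma(t) - \gamma(t_0)| \le \delta\,e^{-\beta_0\alpha^2 t_0} \quad \text{for } t \in \left[\frac{t_0}{2}, t_0\right]. \]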
4.3. Initial step: waves carried by detonation wave fronts. The analytic properties given in Propositions 4.3 and 4.7 are sufficient to obtain a fine wave structure at $t_0 = 2\log\alpha/\alpha^2$. We first update the initial data at time $t = t_0$ as a perturbation of $\bar u(x - \gamma(t_0))$,
\[ \bar v(x, t) \equiv u(x, t) - \bar u(x - \gamma(t_0)). \]
With the front updated, the estimate of the perturbation can be improved when compared with the initial data in (4.3):

Proposition 4.8. There exists a positive constant $C_0$ such that, for $x > 0$,
\[ |\bar v(x, t_0)| < C_0\,\alpha^{-1/8}\sqrt{\log\alpha}\;\delta\,e^{-\frac{5\alpha|x|}{8}}. \]

Proof. Similar to (3.4) for $v(x, t)$, one gets
\[ \begin{aligned} \bar v(x, t) = {} & \int_{\mathbb{R}} k(x - y + \alpha t, t)\,\big[v(y, 0) + (\bar u(y) - \bar u(y - \gamma(t_0)))\big]\,dy \\ & + qK\int_0^t\!\!\int_{y<0} \big(k(x + \alpha\tau - \gamma(t_0 - \tau) - y, \tau) - k(x + \alpha\tau - \gamma(t_0) - y, \tau)\big)\,e^{Ky}\,dy\,d\tau. \end{aligned} \tag{4.28} \]
The proof uses Proposition 4.7 for the integral over $[t_0/2, t_0]$ and the separation of the fluid and the combustion wave speed $s$, $\alpha \gg 1$, for the time interval $[0, t_0/2]$. By integration by parts and by the mean value theorem,
Z qK
t0
Z y<0
0
(k(x + ατ − γ (t0 − τ ) − y, τ ) − k(x + ατ − γ (t0 ) − y, τ )) eKy dydτ
Z
Z
t0
= −qK 0
Z
t0 Z
+ qK 2
0
1
k(x + ατ + θ(γ (t0 − τ ) − γ (t)), τ )(γ (t0 ) − γ (t0 − τ )) dθ dτ Z
y<0 0
0
1
k(x + ατ + θ(γ (t0 − τ ) − γ (t0 )) − y, τ ) · (γ (t0 ) − γ (t0 − τ ))eKy dθ dydτ.
Expand the kernel function k(x − y + ατ, τ ) as follows: (x−y)2 α 2 τ α(x−y) 1 e− 4τ − 4 − 2 k(x + ατ − y, τ ) = √ 4πτ 3(x−y)2 3α 2 τ α(x−y) α|x−y| 1 e− 16τ − 16 − 2 − 8 . ≤ √ 4πτ
Substitute this into the last integral of (4.29) and apply (4.9), together with Assumption 2.6, to yield, for x > 0, Z
t0
Z
(k(x + ατ − γ (τ ) − y, τ ) − k(x + ατ − γ (t0 ) − y, τ )) eKy dydτ Z t0 /2 √ 1 3α 2 τ 5α|x| K δ log α α − 8 e− 16 − 8 dτ qK 1 + O(1) ≤ O(1)e √ α+K 0 |u¯ x (0)| τ 3α 2 τ Z t0 δe− 16 dτ + √ τ t0 /2 ! 1 p 5α|x| 3 α −1− 8 K −1− 16 δ qK log α +α e− 8 log α = O(1) 1 + O(1) α+K |u¯ x (0)| p 5α|x| 1 1 = O(1) log α α − 8 1 + |u¯ x (0)|α − 16 δe− 8
qK
0
y<0
= O(1)α − 8 log α δe− 1
5α|x| 8
.
(4.29)
The last two integrals are obtained by using Proposition 4.7 and the largeness of α. Similarly, for x > 0, Z R
¯ − γ (t0 )) + v(y, 0) ) dy ¯ − u(y k(x − y + αt0 , t0 ) (u(y)
= O(1) (|u¯ x |(0) + 1)δe−
3α 2 t0 16
e−
5α|x| 8
.
Equations (4.30) and (4.30) imply the proposition. u t
(4.30)
As before, the above proposition can be rewritten as
\[ |\bar v(x, t_0)| \le \delta\,e^{-\beta_1\alpha^2 t_0}\,e^{-\frac{5\alpha|x|}{8}} \quad \text{for } x > 0, \qquad \beta_1 \equiv \frac{-\log\left(C_0\sqrt{|\log\alpha|}\;\alpha^{-1/8}\right)}{2\log\alpha}. \]
From the condition $\alpha \gg 1$, $\beta_1 \in (0, \beta_0)$.

Before investigating the situation $x < 0$, we derive an estimate from (4.28). The following lemma will be used to show that, due to the supersonic speed of the combustion wave, the information to the left of the combustion wave does not influence its propagation.

Lemma 4.9. For $x < 0$ and $\alpha > 0$ it holds
Z Z
R
k(x + αt − y, t) e
y<0
− 5α|y| 8
2
− α|x+αt| 4
e e dy = O(1) √ + √ α t α 4π t
k(x + αt − y, t) e−K|y| dy
= O(1) Z
,
(4.31) (4.32)
2 − (x+αt) e √ 4t if x + αt ≥ 0, K 4πt
(x+αt)2 e− √16t +
K 4πt
y<0
− (x+αt) 16t
k(x + αt − y, t) e
K|x+αt|
e− √2 K t
−K|y|
else x + αt < 0, e−
αx 2
α2 t
e− 4 . dy ≤ 2 √ (α + K) 4π t
(4.33)
Proof. The proofs of (4.31) and (4.32) are identical. So, one just needs to prove (4.32) and (4.33). When x + αt > 0, Z y<0
k(x + αt − y, t) e−K|y| dy ≤
When x + αt < 0, Z y<0
=
y<0
k(x + αt, t) e−K|y| dy ≤
k(x + αt − y, t) e−K|y| dy ! Z Z y< x+αt 2 K|x+αt|
≤
Z
+
x+αt 2
1 k(x + αt, t). K
k(x + αt − y, t) e−K|y| dy
2 e− 2 k(x + αt, 4t). + √ K K 4πt
Hence, (4.32) follows: Z y<0
k(x + αt − y, t) e−K|y| dy
Z
=
k(x − y, t) e−
y<0
αx
2 α(x−y) −K|y|− α4 t 2
dy
α2 t
e− 2 − 4 2 , ≤ √ α+K 4πt and (4.33) follows. u t If δ is sufficiently small, then Z |v(x, ¯ t0 )| ≤
R
k(x − y + αt0 , t0 ) |v(y, 0)| + |u(y) ¯ − u(y ¯ − γ (t0 ))| dy
+ δ ω1 (x), where Z ω1 (x) ≡ L0 Kq
t0 t0 2
Z
t0 2
+ 0
k(x + ατ, τ ) + K
! e
−β0 α 2 t0
Z y<0
k(x + ατ − y, τ )e
Ky
dy dτ.
From (4.32) and (4.33) we have Lemma 4.10. The function ω1 (x) satisfies that ω1 (x) = O(1) |u¯ x (0)| for x < 0, ω1 (x) ≤ e
−β1 α 2 t0
e
α|x| 2
(4.34)
for x < 0.
(4.35)
From Lemma 4.10 and Proposition 4.8, we have the following proposition. Proposition 4.11. The wave structure of v(x, ¯ t0 ) is
|v(x, ¯ t0 )| ≤ δ
O(1) ω1 (x) + e−β1 α
2t 0
e−
5|x| 8
e
−
(x+αt0 )2 16t0
−
+e √ α t0
for x > 0.
5α|x+αt0 | 16
for x < 0,
4.4. Pointwise convergence to viscous profiles. Propositions 4.11 and 4.7 provide a decaying structure of a perturbation in front of the detonation wave at time t = t0 as well as the analytic property of the detonation wave location. In fact, the factor δ can be realized as the strength of the perturbation in front the detonation wave, and the updated 2 perturbation in front of the detonation wave is of order e−β1 α t0 δ. Finally, it results in the convergence of the wave front to a fixed location exponentially fast as well as time asymptotic stability of the weak detonation wave. In this subsection, we will generalize both Propositions 4.7 and 4.11 to show this. The above reasoning suggests the a priori assumption on the wave front γ (nt0 ): |γ (nt0 ) − γ (t)| = O(1) δ b(t) for t ∈ [0, nt0 ], hti 2 hh i h i − β α t e t0 1 0 for t ∈ tt t0 , tt + 21 t0 , 0 0 h i b(t) ≡ hh i h i − tt β1 +β0 α 2 t0 t 1 t 0 for t ∈ e t0 + 2 t0 , t0 + 1 t0 , , where [x] is the largest integer less than or equal to x. The solution to the left of the detonation wave front is analyzed by resolving an initial boundary value problem with boundary value bounded by the a priori bound O(1)δb. The consideration of such an initial boundary value problem is motivated by the device in [4] for studying the stability of a viscous shock profile. For this we consider the Green function G∗ , G∗ (x, t; y, σ ) ≡ k(x − y + α(t − σ ), t − σ ) − eαx k(x + y − α(t − σ ), t − σ ),
(4.36)
for the initial-boundary value problems: ut − αux − uxx = 0 for x < 0, t > 0 with homogeneous boundary values u(0, t) = 0 for t ≥ 0, and the solution ∗ of the initial-boundary value problem: ut − αux − uxx = qKW(x), u(0, t) = Lb(t), u(x, 0) = 0, ∗ (x, t; L) ≡ qK −L
Z
t
0 Z t 0
Z b(σ ) · G∗ (x, t; 0, σ ) + K
y<0
G∗ (x, t; y, σ )eKy dy
dσ
b(σ ) · ∂y G∗ (x, t; 0, σ ) dσ.
We introduce comparison functions ωn (x) for the estimate of ∗ (x, t; L): Z nt0 ωn (x) ≡ qK b(σ )· 0 Z Ky k(x − y + α(nt0 − σ ), nt0 − σ )e dy dσ. k(x + α(nt0 − σ ), nt0 − σ ) + K y<0
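Lemma 4.12. For $x < 0$,
\[ |\omega_n(x)| \le O(1)\, Q\,|\bar u_x(0)| \left( \frac{e^{-\frac{(x+\alpha nt_0)^2}{A nt_0}}}{\sqrt{nt_0}} + \frac{1}{K}\,\frac{e^{-\frac{|x+\alpha nt_0|K}{2}}}{\sqrt{nt_0}} \right), \tag{4.37} \]
\[ |\omega_n(x)| \le O(1)\, Q\,|\bar u_x(0)|\, e^{\frac{|x|\alpha}{3}}\, b(nt_0), \tag{4.38} \]
for any $A > \max(16, 32/\beta_1)$.

Proof. From Lemma 4.9, the double integral defining $\omega_n(x)$ can be bounded by a single integral:
\[ \omega_n(x) \le O(1)\,qK \int_0^{nt_0} b(\sigma)\cdot\left( k(x + \alpha(nt_0-\sigma),\, nt_0-\sigma) + \frac{e^{-\frac{K|x+\alpha(nt_0-\sigma)|}{2}}}{\sqrt{nt_0-\sigma}}\, H(-x - \alpha(nt_0-\sigma)) \right) d\sigma. \tag{4.39} \]
Break the integration into two parts: $\sigma \in [0, nt_0/2)$ and $\sigma \in [nt_0/2, nt_0]$. For the first part $\sigma \in [0, nt_0/2)$,
\[ \int_0^{nt_0/2} b(\sigma)\cdot k(x + \alpha(nt_0-\sigma),\, nt_0-\sigma)\,d\sigma \le 4\int_0^{nt_0/2} b(\sigma)\cdot k(x + \alpha(nt_0-\sigma),\, nt_0)\,d\sigma = \frac{O(1)}{\alpha^2}\left( k(x + \alpha nt_0,\, 4nt_0) + \frac{e^{-\frac{\alpha\beta|x+\alpha nt_0|}{2}}}{\sqrt{nt_0}} \right). \tag{4.40} \]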
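When $\sigma \in [nt_0/2, nt_0]$, break $x$ in two cases: $x + nt_0 < 0$ and $0 < x + nt_0 < nt_0$. When $x + nt_0 < 0$, since $k(x + \alpha(t_0 n - \sigma),\, t_0 n - \sigma) < O(1)\,k(x + \alpha t_0 n,\, t_0 n - \sigma)$, we have
\[ \int_{nt_0/2}^{nt_0} b(\sigma)\cdot k(x + \alpha(nt_0-\sigma),\, nt_0-\sigma)\,d\sigma \le 4\int_{nt_0/2}^{nt_0} b(\sigma)\cdot k(x + \alpha nt_0,\, nt_0-\sigma)\,d\sigma \]
\[ = O(1)\,nt_0\, e^{-\frac{n}{2}\beta_1\alpha^2 t_0}\, k(x + \alpha nt_0,\, nt_0) = \frac{O(1)}{\alpha^2}\, e^{-\frac{n}{4}\beta_1\alpha^2 t_0}\, k(x + \alpha nt_0,\, nt_0). \tag{4.41} \]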
When $0 < x + nt_0 < nt_0$, due to choice of $A$, we have
\[ e^{-\frac{\beta_1 n}{8}\alpha^2 t_0} \ll O(1)\,k(x + \alpha nt_0,\, A nt_0). \]
Nonlinear Stability of Weak Detonation Waves for Combustion Model
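Therefore,
\[ \int_{nt_0/2}^{nt_0} b(\sigma)\cdot k(x + \alpha(nt_0-\sigma),\, nt_0-\sigma)\,d\sigma \le 4\int_{nt_0/2}^{nt_0} b(\sigma)\cdot \frac{1}{\sqrt{nt_0-\sigma}}\,d\sigma \]
\[ \le O(1)\, e^{-\frac{n}{4}\beta_1\alpha^2 t_0}\sqrt{nt_0} \le O(1)\,\frac{e^{-\frac{n}{8}\beta_1\alpha^2 t_0}}{\alpha} \le O(1)\,\frac{1}{\alpha}\,k(x + \alpha nt_0,\, A nt_0). \tag{4.42} \]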
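Equations (4.40), (4.41), and (4.42) yield that
\[ \int_{nt_0/2}^{nt_0} b(\sigma)\cdot k(x + \alpha(nt_0-\sigma),\, nt_0-\sigma)\,d\sigma = O(1)\,\frac{1}{\alpha}\left( k(x + \alpha nt_0,\, A nt_0) + \frac{e^{-\frac{\alpha\beta_1|x+\alpha nt_0|}{2}}}{\sqrt{nt_0}} \right). \tag{4.43} \]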
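From a direct calculation one can have
\[ \int_0^{nt_0} b(\sigma)\cdot \frac{e^{-\frac{K|x+\alpha(nt_0-\sigma)|}{2}}}{\sqrt{nt_0-\sigma}}\, H(-x - \alpha(nt_0-\sigma))\,d\sigma \le \frac{O(1)}{\alpha\sqrt{nt_0}}\, e^{-K|x+\alpha nt_0|}. \tag{4.44} \]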
Combining $qK/\alpha = Q|\bar u_x(0)|$ with (4.43) and (4.44), one has (4.37). The estimate (4.38) follows by plugging the inequality k(x − y + α(t − σ ), t − σ ) r αx αy α 2 (t−σ ) 3 3 e− 3 + 3 − 8 k(x − y + α(t − σ ), (t − σ )) ≤ O(1) ≤ √ 2 2 t −σ t into (4.4) with $\beta_1 < 1/8$. □

Lemma 4.13. It holds for $x < 0$ and $L > 1$ that
\[ {}^{*}(x,t;L) = O(L)\,|\bar u_x(0)| \left( k(x + \alpha t,\, At) + \frac{e^{-\frac{K|x+\alpha t|}{2}}}{\sqrt{t}} \right), \tag{4.45} \]
\[ {}^{*}(x,nt_0;L) \le L\,b(nt_0+)\,e^{\frac{\alpha|x|}{3}}, \tag{4.46} \]
where $A > \max(16, 32/\beta_1)$.

Proof. The function ${}^{*}(x,t;L)$ can be identified with the solution $U(x,t)$ of
\[ U_t - \alpha U_x - U_{xx} = qK\big(e^{Kx} + \delta(x)\big)\,b(t) \ \text{ for } x < 0,\ t > 0, \qquad U(0,t) = L\,b(t), \qquad U(x,0) \equiv 0, \]
where $\delta(x)$ is a delta function. Consider
\[ V_t - \alpha V_x - V_{xx} = l_0\,qK\big(W(x) + \delta(x)\big)\,b(t) \ \text{ for } x \in \mathbb{R},\ t > 0, \qquad V(x,0) \equiv 0. \]
Duhamel's representation of $V(x,t)$ is identical to the representation $\omega_n(x)$ in (4.4). Hence, the estimate of $\omega_n(x)$ in Lemma 4.12 is applicable to $V(x,t)$ with $nt_0$ replaced by $t$. According to (4.38) in Lemma 4.12, one can find an $l_0 > L$ such that the solution $V(x,t)$ satisfies $V(0,t) \ge L\,b(t)$. By the maximum principle, one has $U(x,t) \le V(x,t)$ for $x < 0$, $t > 0$. So, (4.45) follows. Equation (4.46) is rather straightforward. Its proof is omitted. □

The perturbation at each updated detonation wave front $\gamma = \gamma(nt_0)$ is:
\[ \bar v_n(x,t) \equiv u(x,t) - \bar u(x - \gamma(nt_0)). \]
The following proposition yields both the convergence of the wave locations and the time asymptotic pointwise convergence to the viscous weak detonation profile.

Proposition 4.14. There is an $L > 1$ such that for all $n \in \mathbb{N}$,
\[ |\gamma(nt_0) - \gamma((n-1)t_0)| \le \frac{4L_0^2}{|\bar u_x(0)|}\, e^{-(n-1)\beta_1\alpha^2 t_0}\,\delta, \tag{4.47} \]
\[ |\bar v_n(x,nt_0)| \le \delta \begin{cases} 2\,{}^{*}(x - \gamma(nt_0),\, nt_0;\, L) + k(x + \alpha nt_0 - \gamma(nt_0),\, 4nt_0) & \text{for } x < \gamma(nt_0), \\[4pt] O(1)\,e^{-n\beta_1\alpha^2 t_0}\, e^{-\frac{5|x-\gamma(nt_0)|}{8}} & \text{for } x > \gamma(nt_0). \end{cases} \tag{4.48} \]
Proof. We will prove Proposition 4.14 by induction. Due to (4.9) and Proposition 4.11, the proposition holds for $n = 1$. Assume that (4.47) and (4.48) hold for $n \le j$. For simplicity in notation, we take $\gamma(jt_0) = 0$. Set
\[ \tau \equiv t - jt_0, \qquad v(x,\tau) \equiv \bar v_j(x, \tau + jt_0), \qquad v_0(x) \equiv \bar v_j(x, jt_0), \qquad \bar\gamma(\tau) \equiv \gamma(jt_0 + \tau). \tag{4.49} \]
The equation for $\bar\gamma(\tau)$ is identical to (4.4) for $\gamma(t)$ with $v_0(x)$ provided by (4.49) instead of (4.3):
\[ \begin{aligned} & v_0(0) = 0, \\ & |\partial_x^i v_0(x)| \le \alpha^i\, e^{-j\beta_1\alpha^2 t_0}\, e^{-\alpha\frac{5|x|}{8}} \quad \text{for } x > 0,\ i = 0, 1, 2, \\ & |\partial_x^i v_0(x)| \le \alpha^i\, e^{-j\beta_1\alpha^2 t_0}\, e^{\alpha\frac{|x|}{3}} \quad \text{for } x < 0,\ i = 0, 1, 2. \end{aligned} \tag{4.50} \]
The conditions involve derivatives up to second order. However, the induction hypotheses of the proposition yield only the information about zeroth order. Nevertheless, since the equation is parabolic, the zeroth order information is enough to recover the higher order derivatives in any positive time. The recovery of the higher order derivatives is routine and is omitted. Here, the third condition is due to Lemma 4.13. Due to (4.5) and (4.6), the consequences of (4.11) and (4.12) remain valid for $\bar\gamma(\tau)$ with $\delta$ replaced by $\delta e^{-j\beta_1\alpha^2 t_0}$ for $\tau \in [0, t_0]$:
\[ |\gamma((j+1)t_0) - \gamma(jt_0)| = |\bar\gamma(t_0)| < \delta\,\frac{4L_0^2}{|\bar u_x(0)|}\, e^{-j\beta_1\alpha^2 t_0}. \]
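Thus, (4.47) holds for $n = j + 1$. All the conditions for Proposition 4.7 are still valid for $\bar\gamma(\tau)$ and one has, for $\tau \in [\tfrac{t_0}{2}, t_0]$,
\[ |\bar\gamma(\tau) - \bar\gamma(t_0)| \le e^{-(j\beta_1 + \beta_0)\alpha^2 t_0} \le \delta\, e^{-(j+1)\beta_1\alpha^2 t_0}. \tag{4.51} \]
In deriving this estimate, one also has the following estimate as a by-product:
\[ \begin{aligned} |v(0,t)| &\le O(1)\,\delta\, e^{-j\beta_1\alpha^2 t_0} && \text{for } t \in [0, t_0/2], \\ |v(0,t)| &\le O(1)\,\delta\, e^{-(j\beta_1 + \beta_0)\alpha^2 t_0} && \text{for } t \in [t_0/2, t_0]. \end{aligned} \tag{4.52} \]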
Now, we change the $x$ coordinate to $x - \gamma((j+1)t_0)$. Therefore, we take $\gamma((j+1)t_0) = 0$ again. Combining (4.48) for $n \le j$, (4.52), $\gamma((j+1)t_0) = 0$, and (4.51) together, we have obtained the information of $\gamma(t)$ and $\bar v_{j+1}(t)$ for $t \in [0, (j+1)t_0]$. This leads to the boundary value problem:
\[ \begin{aligned} & \partial_t \bar v_{j+1} - \alpha\,\partial_x \bar v_{j+1} - \partial_x^2 \bar v_{j+1} = qK\big(W(x - \gamma(t)) - W(x)\big), \\ & |\bar v_{j+1}(t)| \le O(1)\,\delta\, b(t), \qquad |\gamma(t)| \le O(1)\,\delta\, b(t)/|\bar u_x(0)|. \end{aligned} \tag{4.53} \]
By the Duhamel principle, the representation of $\bar v_{j+1}(x, (j+1)t_0)$ is
\[ \begin{aligned} \bar v_{j+1}(x, (j+1)t_0) ={} & \int G^*(x, (j+1)t_0;\, y, 0)\,\bar v_{j+1}(y, 0)\,dy \\ & + qK \int_0^{(j+1)t_0}\!\!\int_{\mathbb{R}} G^*(x, (j+1)t_0;\, y, \sigma)\,\big(W(y - \gamma(\sigma)) - W(y)\big)\,dy\,d\sigma \\ & - \int_0^{(j+1)t_0} \partial_y G^*(x, (j+1)t_0;\, 0, \sigma)\,\bar v_{j+1}(0, \sigma)\,d\sigma. \end{aligned} \]
Substitute the conditions about $\gamma(t)$ and $\bar v_{j+1}(0,t)$ in (4.53) into the above representation. Then, (4.48) is verified for $n = j + 1$ and the proposition follows. □

5. Stability Analysis II: Nonlinear Flux

When the flux is nonlinear, one needs to linearize the problem at the left end state of the detonation wave profile as well as at the ignition point. The first is for studying waves propagating to the left far field, the other is for the purpose of tracing the wave fronts. About the wave travelling to the left far field, we need to consider it as a boundary value problem with the boundary values provided by the front tracing. Similar to the setting of an initial boundary value problem in the previous section, we introduce
\[ B(t) \equiv \begin{cases} e^{-[t/t_0]\frac{\beta_1}{2}\alpha^2 t_0} & \text{if } t \in \big[\,[t/t_0]\,t_0,\ ([t/t_0] + \tfrac12)\,t_0\big), \\[4pt] e^{-\left([t/t_0]\frac{\beta_1}{2} + \frac{\beta_0}{2}\right)\alpha^2 t_0} & \text{if } t \in \big[\,([t/t_0] + \tfrac12)\,t_0,\ ([t/t_0] + 1)\,t_0\big), \end{cases} \]
\[ \alpha_- \equiv s - f'(u_-), \]
\[ G_-(x,t;y,\sigma) \equiv k(x - y + \alpha_-(t-\sigma),\, t-\sigma) - e^{\alpha x}\,k(x + y - \alpha_-(t-\sigma),\, t-\sigma), \]
\[ {}_{-}(x,t;L) \equiv qK\int_0^t B(\sigma)\cdot\left( G_-(x,t;0,\sigma) + K\int_{y<0} G_-(x,t;y,\sigma)\,e^{Ky}\,dy \right) d\sigma \;-\; L\int_0^t B(\sigma)\cdot\partial_y G_-(x,t;0,\sigma)\,d\sigma. \tag{5.1} \]
Note. The estimate for ${}^{*}(x,t;L)$ in Lemma 4.13 can be applied to ${}_{-}$ by replacing $\alpha$ and $b(nt_0)$ in the lemma by $\alpha_-$ and $B(nt_0)$ respectively. Let $v(x,t)$ be the solution of (3.3) with initial values which satisfy (4.3) and (4.2); and $\bar v_n(x,t)$ stands for the same meaning as that in Proposition 4.14.

Proposition 5.1. There is a constant $L > 10$ such that it holds for all $n \in \mathbb{N}$,
\[ |\gamma(nt_0) - \gamma((n-1)t_0)| \le \frac{8L_0^2}{|\bar u_x(0)|}\, e^{-(n-1)\frac{\beta_1}{2}\alpha^2 t_0}\,\delta, \tag{5.2} \]
\[ |\bar v_n(x,nt_0)| \le \delta \begin{cases} 2\,{}_{-}(x - \gamma(nt_0),\, nt_0;\, L) + k(x + \alpha nt_0 - \gamma(nt_0),\, 4nt_0) & \text{for } x < \gamma(nt_0), \\[4pt] O(1)\,e^{-n\frac{\beta_1}{2}\alpha^2 t_0}\, e^{-\frac{5|x-\gamma(nt_0)|}{8}} & \text{for } x > \gamma(nt_0). \end{cases} \tag{5.3} \]
We will also prove this proposition by mathematical induction. The procedure is similar to those in obtaining Proposition 4.14. However, one still needs to modify the wave front tracing and the stability analysis regarding the presence of the fluid nonlinearity. It should be mentioned that for weak detonation the chemical nonlinearities and fluid nonlinearities are decoupled in terms of wave front tracing, because weak detonation wave is faster than any other non-chemical waves.
5.1. Nonlinear front tracking. In order to proceed with the wave front tracing for nonlinear flux, one makes the a priori assumption that
\[ |\partial_x^i \bar v_n(x,t)| < 2\alpha^i\,|\bar v_n(x)|, \quad i = 1, 2, \quad \text{for } t \in (nt_0, (n+1)t_0). \tag{5.4} \]

Proposition 5.2. Under the hypothesis of Proposition 4.3 and under (5.4), it holds for the nonlinear problem that
\[ |\gamma(t) - \gamma(nt_0)| \le \frac{8L_0^2\, e^{-\frac{n\beta_1}{2}\alpha^2 t_0}}{|\bar u_x(0)|}\,\delta \quad \text{for } t \in (nt_0, (n+1)t_0), \tag{5.5} \]
and
\[ \left|\gamma(t) - \gamma\big((n + \tfrac12)t_0\big)\right| \le \frac{8L_0^2\, e^{-\frac{(n\beta_1 + \beta_0)}{2}\alpha^2 t_0}}{|\bar u_x(0)|}\,\delta \quad \text{for } t \in \big((n + \tfrac12)t_0,\ (n+1)t_0\big). \tag{5.6} \]
Proof. The equation of the front $\gamma(t)$ is given by (3.7). The difference between (3.7) and (4.4) is the fluid nonlinearity, which shows up in the last double integral in R.H.S. of (3.7). However, the influence of this nonlinearity can be ignored in deriving the uniformly bound estimate of the wave front, provided that this nonlinearity satisfies, for $t \in (0, t_0)$,
\[ \int_0^t\!\!\int_{\mathbb{R}} \big\{ -k(\alpha(t-\sigma) - y,\, t-\sigma)\, N_1\big(\bar v_n(y, \gamma(nt_0+\sigma))\big) \big\}_y\,dy\,d\sigma \ll \delta\, e^{-\frac{n\beta_1}{2}\alpha^2 t_0} \left( \int_{y<0} k(\alpha t - y,\, t)\;{}_{-}(y, nt_0; L)\,dy + |\bar u_x(0)| \int_{y>0} k(\alpha t - y,\, t)\, e^{-5\alpha y/8}\,dy \right). \tag{5.7} \]
With this, (4.11) and (4.12) are still valid by letting $L_0$ be twice the value of that in Proposition 4.3. This results in (5.5), the uniform bound of the detonation wave location. Similarly, if
\[ \frac{d}{dt}\int_0^t\!\!\int_{\mathbb{R}} k(\alpha(t-\sigma) - y,\, t-\sigma)\, N_1\big(\bar v_n(y, t + nt_0)\big)_y\,dy\,d\sigma \ll \delta\alpha^2\, e^{-\frac{n\beta_1}{2}\alpha^2 t_0} \left( \int_{y<0} k(\alpha t - y,\, t)\;{}_{-}(y, nt_0; L)\,dy + |\bar u_x(0)| \int_{y>0} k(\alpha t - y,\, t)\, e^{-5\alpha y/8}\,dy \right), \tag{5.8} \]
then (4.24) remains valid. Thus, Proposition 4.4 and Proposition 4.7 hold for this nonlinear flux, too. This proves (5.6). It remains to prove (5.7) and (5.8). From the definition $N_1(\bar v_n)$ in (3.3), one can write
\[ \begin{aligned} N_1(\bar v_n)(y, nt_0 + t) &\equiv f\big(\bar v_n(y,t) + \bar u(y - \gamma(nt_0))\big) - f\big(\bar u(y - \gamma(nt_0))\big) - \bar v_n\, f'(\bar u(0)) \\ &= \bar v_n \int_0^1 \big( f'(\theta\bar v_n + \bar u) - f'(\bar u(0)) \big)\,d\theta \\ &= \bar v_n \int_0^1\!\!\int_0^1 f''\big( \phi\{\theta\bar v_n + \bar u - \bar u(0)\} + \bar u(0) \big)\,\big[\theta\bar v_n + \{\bar u - \bar u(0)\}\big]\,d\phi\,d\theta. \end{aligned} \]
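By this identity and (5.4), one has that
\[ \begin{aligned} |N_{1y}(v_n(y, nt_0 + t))| &= O(1)\,\delta\, e^{-\frac{n\beta_1}{2}\alpha^2 t_0}\,|\bar u_x(0) + \delta\alpha|\, e^{-\frac{5\alpha|y - \gamma(nt_0)|}{8}} \\ &= O(1)\,\delta\, e^{-\frac{n\beta_1}{2}\alpha^2 t_0}\,|\bar u_x(0)|\, e^{-\frac{5\alpha|y - \gamma(nt_0)|}{8}} && \text{for } y > \gamma(nt_0), \\ |N_{1y}(v_n(y, nt_0 + t))| &= O(1)\,\delta\, e^{-\frac{n\beta_1}{2}\alpha^2 t_0}\, e^{-\frac{\alpha|y - \gamma(nt_0)|}{2}}\,|\bar u_x(0)| && \text{for } y < \gamma(nt_0), \end{aligned} \tag{5.9} \]
\[ \begin{aligned} |N_{1yy}(v_n(y, nt_0 + t))| &= O(1)\,\delta\alpha\, e^{-\frac{n\beta_1}{2}\alpha^2 t_0}\,|\bar u_x(0)|\, e^{-\frac{5\alpha|y - \gamma(nt_0)|}{8}} && \text{for } y > \gamma(nt_0), \\ |N_{1yy}(v_n(y, nt_0 + t))| &= O(1)\,\delta\alpha\, e^{-\frac{n\beta_1}{2}\alpha^2 t_0}\, e^{-\frac{\alpha|y - \gamma(nt_0)|}{2}}\,|\bar u_x(0)| && \text{for } y < \gamma(nt_0). \end{aligned} \tag{5.10} \]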
Substitute (5.9) into the double integral in the L.H.S. of (5.7), then by a straight calculation (5.7) follows. Similarly, by (5.10) and by applying integration by parts, (5.8) follows. This completes the proof of the proposition. □

5.2. Update wave front. Let $w = w(x,t) + \bar u(x)$ be the solution of
\[ \begin{aligned} & w_t - s\,w_x + f(w)_x - w_{xx} = qKW(x - \mathrm{loc}(t)) \quad \text{for } -x,\ t > 0, \\ & \max\left( |\bar u_x(0)\cdot \mathrm{loc}(t)|,\ \frac{|w(0,t)|}{L} \right) \le \delta B(t), \\ & |w(x,0)| \le \delta e^{-K|x|}. \end{aligned} \tag{5.11} \]
The equation of $w(x,t)$ is
\[ w_t - \alpha_- w_x - w_{xx} = qK\big(W(x - \mathrm{loc}(t)) - W(x)\big) + \big( (s - f'(\bar u(x)) - \alpha_-)\,w \big)_x - N(w)_x, \tag{5.12} \]
where
\[ N(w) \equiv f(\bar u(x) + w) - f(\bar u(x)) - f'(\bar u(x))\,w, \qquad s - f'(\bar u(x)) - \alpha_- = O(1)\,e^{Kx} \ \text{ for } x < 0. \]
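Lemma 5.3. Suppose that Assumption 2.7 holds. Then, there exists constant $C_3$ such that
\[ |w(x,t)| \le 2C_3\,\delta\,\big( {}_{-}(x, nt_0; L) + k(x + \alpha_- nt_0,\, 2nt_0) \big) \quad \text{for } t \in (nt_0, (n+1)t_0). \]

Proof. By Duhamel's principle,
\[ \begin{aligned} w(x,t) ={} & \int_{y<0} G_-(x,t;y,0)\,w(y,0)\,dy \\ & + qK\int_0^t\!\!\int_{y<0} G_-(x,t;y,\sigma)\,\big(W(y - \mathrm{loc}(\sigma)) - W(y)\big)\,dy\,d\sigma \\ & - \int_0^t\!\!\int_{y<0} \partial_y G_-(x,t;y,\sigma)\,\big( O(1)e^{Ky}w + N(w) \big)(y,\sigma)\,dy\,d\sigma \\ & - \int_0^t \partial_y G_-(x,t;0,\sigma)\,w(0,\sigma)\,d\sigma. \end{aligned} \]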
We introduce a standard iteration scheme to construct the solution $w(x,t)$,
\[ \begin{aligned} w^1(x,t) ={} & \int_{y<0} G_-(x,t;y,0)\,w(y,0)\,dy \\ & + qK\int_0^t\!\!\int_{y<0} G_-(x,t;y,\sigma)\,\big(W(y - \mathrm{loc}(\sigma)) - W(y)\big)\,dy\,d\sigma \\ & - \int_0^t \partial_y G_-(x,t;0,\sigma)\,w(0,\sigma)\,d\sigma. \end{aligned} \tag{5.13} \]
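For $j \ge 1$,
\[ \begin{aligned} w^{j+1}(x,t) ={} & \int_{y<0} G_-(x,t;y,0)\,w(y,0)\,dy \\ & + qK\int_0^t\!\!\int_{y<0} G_-(x,t;y,\sigma)\,\big(W(y - \mathrm{loc}(\sigma)) - W(y)\big)\,dy\,d\sigma \\ & - \int_0^t \partial_y G_-(x,t;0,\sigma)\,w(0,\sigma)\,d\sigma \\ & - \int_0^t\!\!\int_{y<0} \partial_y G_-(x,t;y,\sigma)\,\big( O(1)e^{Ky}w^j + N(w^j) \big)(y,\sigma)\,dy\,d\sigma. \end{aligned} \tag{5.14} \]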
From the definition of ${}_{-}(x,t;L)$, there exists $c_0$ such that
\[ |w^1(x,t)| \le A(x,t), \qquad A(x,t) \equiv c_0\,\delta\left( {}_{-}(x,t;L) + \frac{1}{\alpha}\,k(x + \alpha_- t,\, 2t) \right). \]
This leads to a priori assumption on $w^j(x,t)$ for $j \ge 1$:
\[ |w^j(x,t)| \le 2\,\delta\,A(x,t). \]
It is sufficient to show that
\[ \int_0^t\!\!\int_{y<0} \big| \partial_y G_-(x,t;y,\sigma)\,\big( e^{Ky}A + N(A) \big)(y,\sigma) \big|\,dy\,d\sigma \ll A(x,t). \]
Due to the quadratic nonlinearity of $N$, one can easily show that
\[ \int_0^t\!\!\int_{y<0} |\partial_y G_-(x,t;y,\sigma)\,N(A)(y,\sigma)|\,dy\,d\sigma \ll A(x,t). \]
About $\int_0^t\!\int_{y<0} \partial_y G_-(x,t;y,\sigma)\,e^{Ky}A(y,\sigma)\,dy\,d\sigma$, it is sufficient to show that
\[ \delta\int_0^t\!\!\int_{y<0} |\partial_y G_-(x,t;y,\sigma)|\,e^{Ky}\,k(y + \alpha_-\sigma,\, 4\sigma)\,dy\,d\sigma \ll A(x,t). \tag{5.15} \]
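For showing this, first we break the space integral into two parts,
\[ \delta\int_0^t \left( \int_{y < -\frac{\alpha_-\sigma}{2}} + \int_{-\frac{\alpha_-\sigma}{2}}^{0} \right) |\partial_y G_-(x,t;y,\sigma)|\,e^{Ky}\,k(y + \alpha\sigma,\, 4\sigma)\,dy\,d\sigma \equiv r_1 + r_2. \]
From the definition of $G_-$, (5.1), one has
\[ \begin{aligned} |\partial_y G_-(x,t;y,\sigma)| &\le O(1)\left( \frac{k(x - y + \alpha_-(t-\sigma),\, 2(t-\sigma))}{\sqrt{t-\sigma}} + \frac{|x + y - \alpha(t-\sigma)|}{t-\sigma}\, e^{-\alpha x}\, k(x + y - \alpha_-(t-\sigma),\, (t-\sigma)) \right) \\ &\le O(1)\left( \frac{k(x - y + \alpha_-(t-\sigma),\, 2(t-\sigma))}{\sqrt{t-\sigma}} + \alpha\,k(x - y + \alpha_-(t-\sigma),\, (t-\sigma)) \right). \end{aligned} \]
From this,
\[ \begin{aligned} r_1 &\le \delta\,O(1)\int_0^t\!\!\int_{y < -\frac{\alpha_-\sigma}{2}} \left( \frac{k(x - y + \alpha_-(t-\sigma),\, 2(t-\sigma))}{\sqrt{t-\sigma}} + \alpha_-\,k(x - y + \alpha_-(t-\sigma),\, t-\sigma) \right) e^{-K\frac{\alpha_-\sigma}{2}}\cdot k(y + \alpha_-\sigma,\, 4\sigma)\,dy\,d\sigma \\ &= O(1)\,\delta\left( \frac{1}{\sqrt{K}} + \frac{1}{K\alpha_-} \right) k(x + \alpha_- t,\, 4t). \end{aligned} \]
Therefore, when $\alpha, K \gg 1$,
\[ r_1 \ll A(x,t). \tag{5.16} \]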
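When $y \in (-\frac{\alpha\sigma}{2}, 0)$, we have
\[ e^{K(y + \alpha\sigma)}\,k(y + \alpha\sigma,\, 4\sigma) \le \frac{e^{-(\frac{\alpha}{32} - K)|y + \alpha\sigma|}}{\sqrt{\sigma}}. \]
This yields
\[ \int_{-\frac{\alpha\sigma}{2}}^{0} k(x + \alpha t - (y + \alpha\sigma),\, t-\sigma)\,e^{Ky}\,k(y + \alpha\sigma,\, 4\sigma)\,dy \le O(1)\,e^{-K\alpha\sigma}\,\frac{1}{\alpha\sqrt{\sigma}}\left( k(x + \alpha t,\, 4(t-\sigma)) + e^{-(\frac{\alpha}{32} - K)|x + \alpha t|/2} \right). \]
This yields, when $\alpha \gg 1$,
\[ r_2 \ll A(x,t). \tag{5.17} \]
Thus, (5.16) and (5.17) imply (5.15). By a similar calculation as above one can show that the iteration scheme converges. □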
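The pointwise kernel bound used for $r_2$ in the proof above can be checked numerically. The sketch below assumes that $k$ is the standard heat kernel $k(x,t) = e^{-x^2/(4t)}/\sqrt{4\pi t}$, which is not restated in this section; the function names and the sample parameters are hypothetical choices for illustration.

```python
import math

def k(x, t):
    """Standard heat kernel (an assumption of this sketch)."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def bound_holds(alpha, K, sigma, samples=200):
    """Check e^{K z} k(z, 4 sigma) <= e^{-(alpha/32 - K)|z|} / sqrt(sigma)
    with z = y + alpha*sigma, on sample points y in (-alpha*sigma/2, 0)."""
    for i in range(1, samples):
        y = -0.5 * alpha * sigma * i / samples   # interior points of the interval
        z = y + alpha * sigma                    # lies in (alpha*sigma/2, alpha*sigma)
        lhs = math.exp(K * z) * k(z, 4.0 * sigma)
        rhs = math.exp(-(alpha / 32.0 - K) * abs(z)) / math.sqrt(sigma)
        if lhs > rhs:
            return False
    return True
```

Under this assumption the bound follows from $z^2/(16\sigma) \ge \alpha z/32$ for $z \ge \alpha\sigma/2$, so the check is expected to pass for any positive parameters that do not overflow the exponentials.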
Proof of Proposition 5.1. The analysis for the case $n = 1$ is similar to the following and is omitted. Assume the proposition holds for $n \le j$ and suppose that (5.4) is satisfied for $n = j$. Then, Proposition 5.2 yields (5.5) and (5.6). We need to update the detonation wave front and consider $\bar v_{j+1}(x,t)$ in order to justify that (5.4) holds for $n = j$. The verification of the ansatz for $x > \gamma((j+1)t_0)$ is straightforward. We also omit it. For convenience we assume $\gamma((j+1)t_0) = 0$. Thus, $\gamma(t)$ satisfies the same property as $\mathrm{loc}(t)$, given in (5.11), for $t \in (0, (j+1)t_0)$. The value of $\bar v_{j+1}(0,t)$ also shares the same property as $w(0,t)$, given in (5.11), for $t \in [0, jt_0]$. The property of $\bar v_{j+1}(0,t)$ for $t \in [jt_0, (j+1)t_0]$ can be obtained through verifying the ansatz $\bar v_{j+1}(x,t)$ in the region $x > 0$. This results in
\[ \begin{aligned} \bar v_{j+1}(0,t) &\le L\delta\, e^{-\frac{j\beta_1}{2}\alpha^2 t_0} && \text{for } t \in \big[jt_0,\ (j + \tfrac12)t_0\big], \\ \bar v_{j+1}(0,t) &\le L\delta\, e^{-\frac{j\beta_1 + \beta_0}{2}\alpha^2 t_0} && \text{for } t \in \big[(j + \tfrac12)t_0,\ (j+1)t_0\big]; \end{aligned} \]
and also implies (5.4) for $n = j$, $x > 0$. It remains to show that (5.4) holds for $n = j$, $x < 0$. Since $\bar v_{j+1}(x,t)$ satisfies the criterion for $w(x,t)$ in (5.11) for $t \in [0, (j+1)t_0]$, one can apply Lemma 5.3 to $\bar v_{j+1}(x,t)$. So, (5.4) is true for $n = j$. Proposition 5.1 is true for $n = j + 1$. Thus, the proposition follows. □

From Proposition 5.1 we have proved Theorem 1.1.
6. Remarks

In a physical setting, the parameters $q$, $K$, and $T^*$ are given, while the states $s$, $u_-$, and $u_+$ depend on the physical situation. In our setting of a weak detonation profile, (2.4), $T^*$ is a function of $u_+$, $s$, $q$, and $K$. Keeping the ignition temperature $T^*$, $q$, and $K$ as fixed constants, (2.4) gives an implicit function for $s$. Thus, one can write $s$ as a function $s(u_+)$. Then, by (2.1) the left state of the weak detonation wave is given uniquely. We denote this state by $u_+^*$. On the other hand, the weak detonation wave $(u_+^*, u_+)$ seems so special compared to the usual fluid shock wave $(u_-, u_+)$, for which both states can vary independently; for a weak detonation wave only one state can. Thus, it seems very difficult to produce such a weak detonation wave pattern. However, the weak detonation wave pattern $(u_+^*, u_+)$ is generic and is a key wave pattern for a Riemann problem. Consider the Riemann problem $(u_-, u_+)$ with $u_+ \in \{\text{Domain of } s\}$. The decomposed wave pattern is a weak detonation wave followed either by a shock wave or by a rarefaction wave. The decomposed wave pattern is
\[ (u_-, u_+) \Longrightarrow \underbrace{(u_-, u_+^*)}_{\text{Slow}} + \underbrace{(u_+^*, u_+)}_{\text{Fast}}, \]
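where $(u_-, u_+^*)$ is a shock wave or a rarefaction wave depending on whether $u_- > u_+^*$ or $u_- < u_+^*$. See the following diagram.

[Diagram: two panels plotting $u$ against $x$. One shows a rarefaction wave from $u_-$ to $u_+^*$ together with a weak detonation wave from $u_+^*$ to $u_+$; the other shows a shock wave from $u_-$ to $u_+^*$ together with a weak detonation wave from $u_+^*$ to $u_+$.]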
Our analysis can be refined to support such wave patterns by replacing $\bar u(x)$ in (5.11) either by a viscous shock profile or by a viscous rarefaction wave. Thus, a weak detonation profile is generic.

7. Appendix

Proof of Lemma 2.6. Let $\bar u(x)$ be the solution of (2.2). In order to obtain an analytic property of the profile, one needs an analytic property about $s$, $q$, $K$, $(u_-, u_+)$, and $T^*$ in terms of the fluid nonlinearity $f(u)$. We assume $|u_- - u_+| = O(1)$ and $q \gg 1$. From (2.1),
\[ s = \frac{q}{u_- - u_+} + O(1). \tag{7.1} \]
When $x < 0$, the normalized profile $\bar u = \bar v + u_-$ is given by
\[ \bar v_x = (f'(u_-) - s)\cdot\bar v - q e^{Kx} + n(\bar v), \]
where $n(\bar v) \equiv f(\bar v + u_-) - f(u_-) - f'(u_-)\bar v$, $m_- \equiv f'(u_-)$. Transform this into an integral equation and use (7.1),
\[ \begin{aligned} \bar v(x) &= \int_{-\infty}^{x} e^{(-s + m_-)(x - y)}\,\big( -q e^{Ky} + n(\bar v(y)) \big)\,dy \\ &= -\frac{q e^{Kx}}{s - m_- + K} + \int_{-\infty}^{x} e^{(-s + m_-)(x - y)}\,n(\bar v(y))\,dy \\ &= (u_+ - u_-)\left( 1 + O(K + m_-)\,\frac{|u_- - u_+|}{q} \right) e^{Kx} + \int_{-\infty}^{x} e^{(-s + m_-)(x - y)}\,n(\bar v(y))\,dy. \end{aligned} \tag{7.2} \]
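The second equality in (7.2) rests on the elementary integral $\int_{-\infty}^{x} e^{(-s+m_-)(x-y)}(-q e^{Ky})\,dy = -q e^{Kx}/(s - m_- + K)$, valid when $s - m_- + K > 0$. The following Python sketch compares a midpoint-rule approximation with this closed form; the function names, the truncation length and the sample parameters are assumptions made for illustration.

```python
import math

def source_term_integral(x, s, m, K, q, cutoff=20.0, steps=100000):
    """Midpoint-rule approximation of the source-term integral in (7.2),
    truncated to the interval [x - cutoff, x]."""
    h = cutoff / steps
    total = 0.0
    for i in range(steps):
        y = x - cutoff + (i + 0.5) * h           # midpoint of the i-th cell
        total += math.exp((-s + m) * (x - y)) * (-q * math.exp(K * y))
    return total * h

def closed_form(x, s, m, K, q):
    """Closed form -q e^{Kx} / (s - m + K), assuming s - m + K > 0."""
    return -q * math.exp(K * x) / (s - m + K)
```

With $s \approx q/(u_- - u_+)$ from (7.1) and $q \gg 1$, the closed form is approximately $(u_+ - u_-)e^{Kx}$, which is the leading term in the last line of (7.2).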
By a priori assumption $|\bar v(x)| < 2|u_- - u_+|\,e^{Kx}$ for $x < 0$, one obtains
\[ \bar v(0) = (u_+ - u_-) + \frac{O(1)\,|u_- - u_+|^2\,(K + m_-)}{q}. \]
This yields that
\[ T^* - u_+ = \frac{O\big((K + m_-)\,|u_- - u_+|^2\big)}{q}. \tag{7.3} \]
By a straightforward calculation, one can verify this assumption and also (7.3). Next we consider the profile in the unburnt zone. Denote $\bar v^+ \equiv \bar u - u_+$. The equation for $\bar v^+$ is
\[ \bar v^+_x = (-s + f'(u_+))\,\bar v^+ + n^+(\bar v^+), \qquad n^+(\bar v^+) \equiv f(u_+ + \bar v^+) - f(u_+) - f'(u_+)\,\bar v^+. \]
By the smallness of $T^* - u_+$ in (7.3) and by Picard's iteration,
\[ \bar v^+_{n+1}(x) = \bar v^+(0)\,e^{(-s + f'(u_+))x} + \int_0^x e^{(-s + f'(u_+))(x - y)}\,n^+(\bar v^+_n)(y)\,dy, \]
this yields
\[ \bar v^+(x) = \bar v^+(0)\left( 1 + \frac{O(1)}{q^2} \right) e^{(-s + f'(u_+))x}. \]
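The Picard iteration above can be sketched numerically for a model problem. The code below is an illustration only: it assumes a purely quadratic nonlinearity $n^+(v) = v^2$ and writes $a$ for the negative rate $-s + f'(u_+)$; the function names, the grid and the sample values are hypothetical. For this model the limit can be compared with the exact solution of the Bernoulli equation $v' = av + v^2$.

```python
import math

def picard_profile(v0, a, x_max=1.0, steps=1000, iterations=30):
    """Iterate v_{n+1}(x) = v0 e^{a x} + int_0^x e^{a (x - y)} v_n(y)^2 dy
    on a uniform grid, using the trapezoid rule for the integral."""
    h = x_max / steps
    xs = [i * h for i in range(steps + 1)]
    v = [v0 * math.exp(a * x) for x in xs]       # zeroth iterate: linear solution
    for _ in range(iterations):
        # integrand e^{-a y} v_n(y)^2; the factor e^{a x} is applied afterwards
        g = [math.exp(-a * x) * vi * vi for x, vi in zip(xs, v)]
        cum = [0.0]
        for i in range(steps):
            cum.append(cum[-1] + 0.5 * h * (g[i] + g[i + 1]))
        v = [math.exp(a * x) * (v0 + c) for x, c in zip(xs, cum)]
    return xs, v

def exact_profile(x, v0, a):
    """Exact solution of v' = a v + v^2 with v(0) = v0."""
    e = math.exp(a * x)
    return a * v0 * e / ((a + v0) - v0 * e)
```

For small $v_0$ and $a < 0$ the iterates stay close to the linear solution $v_0 e^{ax}$, which mirrors the statement that the nonlinear term only contributes a small relative correction.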
Roughly, this yields, for $x > 0$,
\[ -\bar v^+(x)_x > \frac{s - f'(u_+)}{2}\,\bar v^+(0)\,e^{(-s + f'(u_+))x} = -\frac{1}{2}\left( 1 + O\Big(\frac{1}{q}\Big) \right) \bar u_x(0)\,e^{(-s + f'(u_+))x}. \]
The lemma follows. □

Acknowledgement. The authors would like to thank Professor Anders Szepessy for helpful discussions of [16].
References

1. Courant, R. & Friedrichs, K.O.: Supersonic Flow and Shock Waves. Berlin–Heidelberg–New York: Springer-Verlag, 1948
2. Gasser, I., Szmolyan, P.: A geometric singular perturbation analysis of detonation and deflagration waves. SIAM J. Math. Anal. 24, 968–986 (1993)
3. Gustafsson, B., Kreiss, H.-O., Oliger, J.: Time Dependent Problems and Difference Methods. New York: Wiley-Interscience, 1995
4. Kreiss, G. & Kreiss, H.-O.: Stability of systems of viscous conservation laws. Comm. Pure and Appl. Math. 51, 1397–1424 (1998)
5. Li, T.: Stability of strong detonation waves and rates of convergence. Electronic J. of Differential Equations 1–17 (1998)
6. Li, T.: Rigorous asymptotic stability of a Chapman-Jouguet detonation wave in the limit of small resolved heat release. Combustion Theory Modeling 1, 259–270 (1997)
7. Liu, T.-P.: Pointwise Convergence to Shock Waves for Viscous Conservation Laws. Comm. Pure and Appl. Math. vol. L No 11, 1113–1182 (1997)
8. Liu, T.-P.: Zero dissipation and stability of shocks. Method and Applications of Analysis 5, 81–94 (1998)
9. Liu, T.-P. & Ying, L.-A.: Nonlinear stability of strong detonation for a viscous combustion model. SIAM J. Math. Anal. 26, 519–528 (1995)
10. Liu, T.-P. & Yu, S.-H.: Propagation of a Stationary Shock Layer in the Presence of a Boundary. Arch. Rat. Mech. Anal. 139, 57–82 (1997)
11. Liu, T.-P. & Yu, S.-H.: Continuum Shock Profiles for Discrete Conservation Laws, I. Construction. To appear in Commun. Pure Appl. Math. (1999)
12. Liu, T.-P. & Yu, S.-H.: Continuum Shock Profiles for Discrete Conservation Laws, II: Stability. Submitted to Commun. Pure Appl. Math. (1998)
13. Majda, A.: A qualitative model for dynamics combustion. SIAM J. Appl. Math. 37, 686–699 (1979)
14. Nishibata, S. & Yu, S.-H.: The Asymptotic Behavior of the Hyperbolic Conservation Laws with Relaxation on the Quarter Plane. SIAM J. Appl. Math. 28, 304–321 (1997)
15. Rosales, R. & Majda, A.: Weakly nonlinear detonation waves. SIAM J. Appl. Math. 43, 1086–1118 (1983)
16. Szepessy, A.: Dynamics and Stability of a weak detonation wave. Preprint
17. Yu, S.-H.: Zero Dissipation Limit to Solutions with Shocks for Systems of Hyperbolic Conservation Laws. To appear in Arch. Rat. Mech. Anal.

Communicated by A. Jaffe
Commun. Math. Phys. 204, 587 – 618 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Bosonization and Integral Representation of Solutions of the Knizhnik–Zamolodchikov–Bernard Equations

Gen Kuroki¹, Takashi Takebe²,⋆

¹ Mathematical Institute, Tohoku University, Sendai 980, Japan
² Department of Mathematical Sciences, The University of Tokyo, Komaba, Tokyo 153-8914, Japan
Received: 5 October 1998 / Accepted: 8 February 1999
Abstract: We construct an integral representation of solutions of the Knizhnik–Zamolodchikov–Bernard equations, using the Wakimoto modules.

Contents
0. Introduction
1. Bosonization and Wakimoto Modules
1.1 Notations for the finite dimensional algebra
1.2 Ghosts and free bosons
1.3 Bosonization and Wakimoto modules
1.4 State-operator correspondence
1.5 Screening operators
2. WZW Models on Elliptic Curves
2.1 Space of conformal coinvariants and space of conformal blocks
2.2 N-point functions
3. N-Point Functions from Wakimoto Modules
4. Concluding Remarks
Appendix A. Theta Functions
Appendix B. Method of Coherent States and One-Loop Correlation Functions
References
0. Introduction

The purpose of this paper is to construct an integral representation of solutions of the Knizhnik–Zamolodchikov–Bernard equations, using the Wakimoto modules.

⋆ Present address: Department of Mathematics, Faculty of Science, Ochanomizu University, Otsuka 2-1-1, Bunkyo-ku, Tokyo 112-8610, Japan
Correlation functions of the (chiral) Wess–Zumino–Witten models satisfy a system of differential equations. In the genus zero case, these are the well-known Knizhnik–Zamolodchikov (KZ) equations [KZ,TK]. Bernard [B1] found a system of equations for the genus one case, which is now called the Knizhnik–Zamolodchikov–Bernard (KZB) equations. In general, it is known (cf. [TUY]) that correlation functions are solutions of a holonomic system over the base space of a family of Riemann surfaces with marked points and principal bundles on them. See also [B2] and [F]. There is a vast amount of work on the KZ equations, including studies on integral representations of solutions. There are several different approaches to this subject. One is from the viewpoint of the theory of hypergeometric type integrals; e.g., [DJMM], [Mat,SV1,SV2]. Another approach comes from free field theories on Riemann surfaces and the Wakimoto realization of affine Lie algebras; e.g., [Mar,GMMOS,BeF,ATY,A]. The third one is the off-shell Bethe Ansatz developed in [BaF], the representation theoretical meaning of which was clarified by [FFR]. See also [RV] and [Ch]. The first approach in the genus one case was pursued by Felder and Varchenko in [FV], while Babujian et al. [BLP] study the off-shell Bethe Ansatz approach of the KZB equations (with an additional term). Our goal in this paper is to apply the second approach to the genus one case to obtain an integral representation of solutions of the KZB equations. In the genus zero case, an integral over a suitable (twisted) cycle of a matrix element of a product of vertex operators and screening charges gives a solution of the KZ equations. In the genus one case, we take a twisted trace instead of a matrix element to give a solution of the KZB equations in the integral form. We apply the method of the screening current Ward identity used in [ATY] and [A] mutatis mutandis, and obtain an explicit formula for this integral representation. For the sl(2) case Bernard and Felder found the same result in [BeF] by using Wakimoto modules.

The paper is organized as follows. In Sect. 1, mainly following [Ku], we review several fundamental techniques in the conformal field theory, especially the free field realization of the affine Lie algebra found by Wakimoto [W], Feigin and Frenkel [FF1,FF2]. We state the problem in Sect. 2. Namely we formulate the Wess–Zumino–Witten model on elliptic curves, following [FW], and give a definition of N-point functions. The KZB equations are introduced as a system of equations satisfied by N-point functions. Sect. 3 is the main part of this paper, where we give an integral representation of an N-point function on elliptic curves (Theorem 3.4). If we restrict an N-point function to a certain submodule, it is a solution of the KZB equations. We thus give an integral representation of solutions of the KZB equations (Theorem 3.12). To write down the integrand explicitly, we use the screening current Ward identity on elliptic curves. Appendix A is a table of theta functions used in the paper. In Appendix B we review the method of coherent states well known in string theory. We also compute the one-loop correlation function of vertex operators.
1. Bosonization and Wakimoto Modules In this section we review basic facts about the Wakimoto representations of the affine Lie algebras, following [Ku]. See also [W,FF1,FF2,FFR].
1.1. Notations for the finite dimensional algebra. Here we recall fundamental facts about finite dimensional simple Lie algebras to fix the notations.
Bosonization and KZB Equations
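Let $\mathfrak{g}$ be a finite dimensional simple Lie algebra of rank $l$, $\mathfrak{h}$ its Cartan subalgebra and
\[ \mathfrak{g} = \mathfrak{h} \oplus \bigoplus_{\alpha \in \Delta} \mathfrak{g}_\alpha, \tag{1.1} \]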
the root decomposition, where $\Delta$ is the set of roots. The Cartan-Killing form is denoted by $(\cdot|\cdot)$, through which we identify $\mathfrak{h}$ and its dual space $\mathfrak{h}^*$. We fix the simple roots $\{\alpha_1, \dots, \alpha_l\}$, Chevalley generators $\{H_i, E_i, F_i\}_{i=1,\dots,l}$ and a basis $e_\alpha$ of $\mathfrak{g}_\alpha$, such that $e_{\alpha_i} = E_i$ for $i = 1, \dots, l$ and $(e_\alpha | e_{-\alpha'}) = \delta_{\alpha,\alpha'}$. The sets of positive and negative roots are denoted by $\Delta_+ = \{\beta_1, \dots, \beta_s\}$ and $\Delta_-$ respectively. The Borel and nilpotent subalgebras corresponding to $\Delta_\pm$ are denoted by $\mathfrak{b}_\pm$, $\mathfrak{n}_\pm$ as usual. Let $G$, $B_\pm$ and $N_\pm$ be an algebraic group corresponding to $\mathfrak{g}$, the subgroups corresponding to $\mathfrak{b}_\pm$ and to $\mathfrak{n}_\pm$. As is well known, there exists a Lie algebra homomorphism $R_\lambda$ from $\mathfrak{g}$ to the sheaf of twisted differential operators $\mathcal{D}_\lambda$ on the flag variety $B_-\backslash G$ once one fixes a dual vector $\lambda \in \mathfrak{h}^*$. Denote the base point $[B_-] \in B_-\backslash G$ by $o$. With respect to the coordinate on $oN_+ \cong \mathfrak{n}_+$, a big cell of $B_-\backslash G$, introduced by the exponential map:
\[ \mathbb{C}^{\Delta_+} \ni (x^\alpha)_{\alpha \in \Delta_+} \mapsto o\,\exp(x^{\beta_1} e_{\beta_1}) \cdots \exp(x^{\beta_s} e_{\beta_s}) \in oN_+, \]
the twisted differential operator $R_\lambda(X)$ for $X \in \mathfrak{g}$ is represented by a first order differential operator acting on the space of polynomials $\mathbb{C}[x^\alpha;\ \alpha \in \Delta_+]$:
\[ R_\lambda(X) = R(X; x, \partial_x, \lambda), \tag{1.2} \]
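where $\partial_x = (\partial/\partial x^\alpha)_{\alpha \in \Delta_+}$. The operator $R(X; x, \partial_x, \lambda)$ is a polynomial in $X$, $x$, $\partial_x$ and $\lambda(H_i)$ ($i = 1, \dots, l$). More explicitly, there are polynomials $R_\alpha(X; x)$ in $x$ for $X \in \mathfrak{g}$, $\alpha \in \Delta_+$ such that
\[ \begin{aligned} R_\lambda(E_i) &= \sum_{\alpha \in \Delta_+} R_\alpha(E_i; x)\,\frac{\partial}{\partial x^\alpha}, \\ R_\lambda(F_i) &= \sum_{\alpha \in \Delta_+} R_\alpha(F_i; x)\,\frac{\partial}{\partial x^\alpha} + x^{\alpha_i}\lambda(H_i), \\ R_\lambda(H) &= -\sum_{\alpha \in \Delta_+} \alpha(H)\,x^\alpha\,\frac{\partial}{\partial x^\alpha} + \lambda(H), \end{aligned} \tag{1.3} \]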
for Chevalley generators $E_i$, $F_i$ ($i = 1, \dots, l$) and $H \in \mathfrak{h}$. Note that $R_\lambda(E_i)$ does not depend on $\lambda$. Hence we sometimes omit the suffix and denote it by $R(E_i)$. The nilpotent subgroup $N_+$ acts on the big cell from the left as
\[ n \cdot (oa) = ona \quad \text{for } n, a \in N_+. \]
The infinitesimal action of $\mathfrak{n}_+$ induced by this action is denoted by Scr:
\[ \mathfrak{n}_+ \ni X \mapsto \mathrm{Scr}(X; x, \partial_x) \in \mathbb{C}[x, \partial_x]. \tag{1.4} \]
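For orientation, the shape of (1.3) can be checked in the simplest case $\mathfrak{g} = \mathfrak{sl}(2)$. The sketch below assumes the familiar one-variable operators $E = \partial_x$, $H = -2x\partial_x + \lambda$, $F = -x^2\partial_x + \lambda x$, which fit the pattern of (1.3) but are not written out in this section; polynomials are stored as coefficient lists, and all helper names are hypothetical.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def scale(c, p):
    return trim([c * a for a in p])

def shift(p, n):
    """Multiply the polynomial by x**n."""
    return trim([0] * n + list(p))

def deriv(p):
    return trim([k * p[k] for k in range(1, len(p))])

def make_ops(lam):
    """Assumed sl(2) operators acting on polynomials in one variable x."""
    E = lambda p: deriv(p)
    H = lambda p: add(scale(-2, shift(deriv(p), 1)), scale(lam, p))
    F = lambda p: add(scale(-1, shift(deriv(p), 2)), scale(lam, shift(p, 1)))
    return E, H, F

def commutator(A, B, p):
    """Apply [A, B] = AB - BA to the polynomial p."""
    return add(A(B(p)), scale(-1, B(A(p))))
```

Applied to monomials $x^n$, these operators satisfy $[E,F] = H$, $[H,E] = 2E$ and $[H,F] = -2F$, so under the stated assumption they give a homomorphism from $\mathfrak{sl}(2)$ of the kind described above.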
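1.2. Ghosts and free bosons. Let us introduce the algebra of (bosonic) ghosts, $\widehat{Gh}(\mathfrak{g})$, and the algebra of free bosons, $\widehat{Bos}(\mathfrak{g})$.

The generators of the algebra $\widehat{Gh}(\mathfrak{g})$ are $\beta_\alpha[m]$ and $\gamma^\alpha[m]$ ($\alpha \in \Delta_+$, $m \in \mathbb{Z}$) satisfying the canonical commutation relations:
\[ [\beta_\alpha[m], \gamma^{\alpha'}[n]] = \delta_\alpha^{\alpha'}\,\delta_{m+n,0}\cdot 1, \tag{1.5} \]
for $\alpha, \alpha' \in \Delta_+$ and $m, n \in \mathbb{Z}$. We call the formal generating functions of generators,
\[ \beta_\alpha(z) = \sum_{m \in \mathbb{Z}} z^{-m-1}\beta_\alpha[m], \qquad \gamma^\alpha(z) = \sum_{m \in \mathbb{Z}} z^{-m}\gamma^\alpha[m], \tag{1.6} \]
ghost fields. They satisfy the following operator product expansions:
\[ \beta_\alpha(z)\,\gamma^{\alpha'}(w) \sim \frac{\delta_\alpha^{\alpha'}}{z - w}. \tag{1.7} \]
The ghost Fock space $\mathcal{F}^{gh}$ is defined as a left $\widehat{Gh}(\mathfrak{g})$-module generated by the vacuum vector $|0\rangle^{gh}$, satisfying
\[ \beta_\alpha[m]\,|0\rangle^{gh} = 0, \qquad \gamma^\alpha[n]\,|0\rangle^{gh} = 0 \tag{1.8} \]
for any $\alpha \in \Delta_+$, $m \ge 0$, $n > 0$.

The algebra $\widehat{Bos}(\mathfrak{g})$ is generated by $\phi_i[m]$ ($i = 1, \dots, l$, $m \in \mathbb{Z}$), the defining relation of which is
\[ [\phi_i[m], \phi_j[n]] = \kappa\,(H_i|H_j)\,m\,\delta_{m+n,0}\cdot 1, \tag{1.9} \]
where $\kappa$ is a non-zero complex parameter. We extend the algebra to $\widetilde{Bos}(\mathfrak{g})$ by adding the boost operator $e^{p_i}$ and its logarithm $p_i$ which satisfies the relation:
\[ [\phi_i[m], e^{p_j}] = \kappa\,(H_i|H_j)\,\delta_{m,0}\,e^{p_j}, \qquad [\phi_i[m], p_j] = \kappa\,(H_i|H_j)\,\delta_{m,0}. \tag{1.10} \]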
X
X m∈Zr{0}
z
−m−1
z−m φi [m], −m
(1.11)
φi [m]
m∈Z
are important generating functions of generators of this algebra. The field φi (z) is called P the free boson field. For any H = li=1 ai Hi ∈ h, we also use notations like φ[H ; m] =
l X
ai φi [m],
p[H ] =
l X
i=1
ai pi ,
φ(H ; z) =
i=1
l X
ai φi (z).
i=1
The free boson fields satisfy the following operator product expansion: φ(H ; z)φ(H 0 ; w) ∼ κ(H |H 0 ) log(z − w), ∂φ(H ; z)∂φ(H 0 ; w) ∼
κ(H |H 0 ) , (z − w)2
(1.12)
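for any $H, H' \in \mathfrak{h}$. For a dual vector $\lambda \in \mathfrak{h}^*$, the boson Fock space $\mathcal{F}_\lambda^{bos}$ with momentum $\lambda$ is defined as a left $\widehat{Bos}(\mathfrak{g})$-module generated by the vacuum vector $|\lambda\rangle^{bos}$, satisfying
\[ \phi_i[m]\,|\lambda\rangle^{bos} = 0, \qquad \phi_i[0]\,|\lambda\rangle^{bos} = \lambda(H_i)\,|\lambda\rangle^{bos} \tag{1.13} \]
for any $i = 1, \dots, l$, $m > 0$. The boost operator $e^{p_i}$ acts on the direct sum of Fock spaces $\bigoplus_\lambda \mathcal{F}_\lambda^{bos}$ by shifting the momentum:
\[ e^{p_i}\,|\lambda\rangle^{bos} = |\lambda + H_i\rangle^{bos}. \tag{1.14} \]
The normal ordered product ${:}P{:}$ of a monomial $P$ of $\beta_\alpha[m]$'s, $\gamma^\alpha[m]$'s, $\phi_i[m]$'s and $e^{p_i}$'s is defined by putting annihilation operators of $|0\rangle^{gh}$ and $|\lambda\rangle^{bos}$ ($\beta_\alpha[m]$ ($m \ge 0$), $\gamma^\alpha[m]$ ($m > 0$), $\phi_i[m]$ ($m > 0$)) and $\phi_i[0]$ appearing in $P$ to the right side in the product. For example, the bosonic vertex operator is defined by
\[ V(\lambda; z) := {:}e^{\frac{\phi(\lambda; z)}{\kappa}}{:} = \exp\left( \frac{1}{\kappa}\sum_{m<0} \frac{z^{-m}}{-m}\,\phi[\lambda; m] \right) e^{p[\lambda]}\, z^{\frac{1}{\kappa}\phi[\lambda; 0]}\, \exp\left( \frac{1}{\kappa}\sum_{m>0} \frac{z^{-m}}{-m}\,\phi[\lambda; m] \right). \tag{1.15} \]
We introduce the following notations for later use:
\[ V(\lambda; z) = \tilde V(\lambda; z)\,V_0(\lambda; z), \tag{1.16} \]
\[ \tilde V(\lambda; z) := {:}e^{\frac{\tilde\phi(\lambda; z)}{\kappa}}{:} = \exp\left( \frac{1}{\kappa}\sum_{m<0} \frac{z^{-m}}{-m}\,\phi[\lambda; m] \right) \exp\left( \frac{1}{\kappa}\sum_{m>0} \frac{z^{-m}}{-m}\,\phi[\lambda; m] \right), \tag{1.17} \]
\[ V_0(\lambda; z) := e^{p[\lambda]}\, z^{\frac{1}{\kappa}\phi[\lambda; 0]}, \tag{1.18} \]
where $\tilde\phi(\lambda; z) = \phi(\lambda; z) - \kappa p[\lambda] - \phi[\lambda; 0]$ is the non-zero mode part of $\phi(\lambda; z)$.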
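1.3. Bosonization and Wakimoto modules. Bosonizing the differential operators $R(X; x, \partial_x, \lambda)$ in Sect. 1.1 by ghosts and free bosons in Sect. 1.2 gives the Wakimoto realization of the affine Lie algebra $\hat{\mathfrak{g}}$. Define current operator $X(z)$ ($X \in \mathfrak{g}$) and the energy-momentum tensor $T(z)$ by
\[ X(z) := {:}R(X; \gamma(z), \beta(z), \partial\phi(z)){:} \quad \text{for } X = E_i, H_i \text{ and } i = 1, \dots, l, \tag{1.19} \]
\[ F_i(z) := {:}R(F_i; \gamma(z), \beta(z), \partial\phi(z)){:} + c_i\,\partial\gamma^{\alpha_i}(z) \quad \text{for } i = 1, \dots, l, \tag{1.20} \]
\[ T(z) := T^{gh}(z) + T^{\phi}(z), \tag{1.21} \]
\[ T^{gh}(z) := \sum_{\alpha \in \Delta_+} {:}\partial\gamma^\alpha(z)\,\beta_\alpha(z){:}, \tag{1.22} \]
\[ T^{\phi}(z) := \frac{1}{2\kappa}\sum_{i=1}^{l} {:}\partial\phi(H_i; z)\,\partial\phi(H^i; z){:} - \frac{1}{2\kappa}\,\partial^2\phi(2\rho; z), \tag{1.23} \]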
where {H i }i=1,... ,l is a basis of h dual to {Hi } with respect to (·|·), ρ is the half sum of positive roots (h and h∗ are identified via the inner product) and {ci }i=1,...,l is a set of constants to be determined. More explicitly, we have from (1.3) X :Rα (Ei ; γ (z))βα (z):, (1.24) Ei (z) = α∈1+
Fi (z) =
X
:Rα (Fi ; γ (z))βα (z): + γαi (z)∂φi (z) + ci ∂γ αi (z),
α∈1+
H (z) = H gh (z) + ∂φ(H ; z),
H gh (z) := −
X
α(H ):γ α (z)βα (z):
(1.25) (1.26)
α∈1+
for Chevalley generators $E_i$, $F_i$ ($i = 1, \dots, l$) and $H \in \mathfrak{h}$. We expand these series in the following way: for $X \in \mathfrak{g}$,
$$X(z) = \sum_{m\in\mathbb{Z}} z^{-m-1}X[m], \qquad T(z) = \sum_{m\in\mathbb{Z}} z^{-m-2}T[m]. \tag{1.27}$$
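The coefficients $X[m]$ and $T[m]$ belong to a certain completion of the algebra $\widehat{Gh}(\mathfrak{g}) \otimes \widehat{Bos}(\mathfrak{g})$.

Theorem 1.1 ([W,FF1,FF2,Ku]). There exists a unique set of constants $\{c_i\}_{i=1,\dots,l}$ such that a Lie algebra homomorphism $\omega$ from the affine Lie algebra $\hat{\mathfrak{g}} = \mathfrak{g} \otimes \mathbb{C}[t, t^{-1}] \oplus \mathbb{C}\hat{k}$ to a completion of $\widehat{Gh}(\mathfrak{g}) \otimes \widehat{Bos}(\mathfrak{g})$ can be defined by
$$\omega(X \otimes t^m) = X[m], \qquad \omega(\hat{k}) = \kappa - h^\vee, \tag{1.28}$$
for all $X \in \mathfrak{g}$, $m \in \mathbb{Z}$, where $\hat{k}$ is the center of $\hat{\mathfrak{g}}$ and $h^\vee$ is the dual Coxeter number of $\mathfrak{g}$. Moreover, the energy-momentum tensor $T(z)$ defined by (1.21) coincides with the image of $T_{Sug}(z)$ in $U\hat{\mathfrak{g}}$ defined by the Sugawara construction:
$$T(z) = \omega(T_{Sug}(z)), \qquad T_{Sug}(z) := \frac{1}{2\kappa}\sum_{p=1}^{\dim\mathfrak{g}} {}^\circ_\circ J_p(z)J^p(z) {}^\circ_\circ, \tag{1.29}$$
and the $T[m]$'s generate the Virasoro algebra $Vir$ with the central charge $c_V = \dim\mathfrak{g} - 12(\rho|\rho)/\kappa = k\dim\mathfrak{g}/\kappa$. Here $J_p(z) = \sum_{m\in\mathbb{Z}} z^{-m-1}J_p \otimes t^m$, $\{J_p\}$ is a basis of $\mathfrak{g}$, $\{J^p\}$ is its dual basis with respect to $(\cdot|\cdot)$ and the symbol ${}^\circ_\circ\ {}^\circ_\circ$ is the normal ordered product in $U\hat{\mathfrak{g}}$. Namely, $\omega$ can be extended to a Lie algebra homomorphism from $\hat{\mathfrak{g}} \oplus Vir$ to a completion of $\widehat{Gh}(\mathfrak{g}) \oplus \widehat{Bos}(\mathfrak{g})$ such that
$$T[m] = \omega(T_{Sug}[m]). \tag{1.30}$$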
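The equality of the two expressions for the central charge in Theorem 1.1 can be checked with exact arithmetic. The sketch below is only an illustration: it assumes the Freudenthal-de Vries strange formula $(\rho|\rho) = h^\vee \dim\mathfrak{g}/12$ (valid when long roots have squared length 2, a normalization not restated in this passage), and the table of $(\dim\mathfrak{g}, h^\vee)$ values and the function names are ours, not the paper's.

```python
from fractions import Fraction

# (dim g, dual Coxeter number h^vee) for a few simple Lie algebras.
# Standard values, listed here for illustration only.
ALGEBRAS = {"sl2": (3, 2), "sl3": (8, 3), "so5": (10, 3), "g2": (14, 4), "e8": (248, 30)}

def c_from_rho(dim_g, h_dual, kappa):
    """c_V = dim g - 12 (rho|rho) / kappa, with (rho|rho) taken from the
    strange formula (an assumption about the normalization of (.|.))."""
    kappa = Fraction(kappa)
    rho_sq = Fraction(h_dual * dim_g, 12)
    return dim_g - 12 * rho_sq / kappa

def c_from_level(dim_g, h_dual, kappa):
    """c_V = k dim g / kappa with level k = kappa - h^vee."""
    kappa = Fraction(kappa)
    k = kappa - h_dual
    return k * dim_g / kappa

# Compare both expressions for several values of kappa.
for dim_g, h_dual in ALGEBRAS.values():
    for kappa in (Fraction(3), Fraction(7, 2), Fraction(10)):
        assert c_from_rho(dim_g, h_dual, kappa) == c_from_level(dim_g, h_dual, kappa)
```

Under these assumptions the two formulas agree identically in $\kappa$, since $\dim\mathfrak{g} - h^\vee\dim\mathfrak{g}/\kappa = (\kappa - h^\vee)\dim\mathfrak{g}/\kappa$.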
Therefore the Kac–Moody current operators satisfy the operator product expansions
$$X(z)Y(w) \sim \frac{k(X|Y)}{(z-w)^2} + \frac{[X,Y](w)}{z-w}, \tag{1.31}$$
where $X, Y \in \mathfrak{g}$ and $k = \kappa - h^\vee$, and the energy-momentum tensor satisfies
$$T(z)T(w) \sim \frac{c_V/2}{(z-w)^4} + \frac{2T(w)}{(z-w)^2} + \frac{\partial T(w)}{z-w}, \tag{1.32}$$
$$T(z)X(w) \sim \frac{X(w)}{(z-w)^2} + \frac{\partial X(w)}{z-w}. \tag{1.33}$$
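We can regard $\mathcal{F}^{gh} \otimes \mathcal{F}^{bos}_\lambda$ as a representation of $\hat{\mathfrak{g}}$ of level $k = \kappa - h^\vee$ through $\omega$.

Definition 1.2. Denote $\mathcal{F}^{gh} \otimes \mathcal{F}^{bos}_\lambda$ by $Wak_{\lambda,k}$ and call it a Wakimoto module of level $k$, weight $\lambda$.

There is a $\mathfrak{g}$-submodule generated by $|0\rangle_{gh} \otimes |\lambda\rangle_{bos}$, which is spanned by
$$\prod_{\alpha\in\Delta_+} \gamma^\alpha[0]^{I(\alpha)}\,|0\rangle_{gh} \otimes |\lambda\rangle_{bos}, \qquad (I(\alpha) \in \mathbb{N}),$$
and is isomorphic to the dual Verma module $M^*_\lambda$ of $\mathfrak{g}$ with the highest weight $\lambda$. (See Proposition 4.4 of [Ku].) We denote it by $Wak^0_{\lambda,k}$:
$$Wak^0_{\lambda,k} := \mathrm{Span}_{\mathbb{C}}\Bigl\{ \prod_{\alpha\in\Delta_+} \gamma^\alpha[0]^{I(\alpha)}\,|0\rangle_{gh} \otimes |\lambda\rangle_{bos} \Bigm| (I(\alpha))_{\alpha\in\Delta_+} \in \mathbb{N}^{\Delta_+} \Bigr\} \cong M^*_\lambda. \tag{1.34}$$
It is easy to show that for any $m > 0$ and $X \in \mathfrak{g}$,
$$X[m]\,Wak^0_{\lambda,k} = 0, \tag{1.35}$$
and the quadratic Casimir operator $C_2 = \sum_p J_p J^p$ acts as a multiplication operator:
$$C_2|_{Wak^0_{\lambda,k}} = (\lambda|\lambda + 2\rho)\,\mathrm{id}. \tag{1.36}$$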
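The eigenvalue $(\lambda|\lambda+2\rho)$ in (1.36) can be made concrete in the smallest case. The following sketch is an illustration under assumptions of ours: it takes $\mathfrak{g} = \mathfrak{sl}_2$ with $(X|Y) = \mathrm{tr}(XY)$, so that $C_2 = ef + fe + h^2/2$, and it uses the finite-dimensional irreducible module of highest weight $n$ rather than the dual Verma module of the text (the Casimir eigenvalue depends only on the highest weight). All function names are hypothetical.

```python
from fractions import Fraction

def sl2_irrep(n):
    """Matrices of e, f, h on the (n+1)-dimensional irreducible sl2-module
    with basis v_0, ..., v_n, where h v_i = (n - 2i) v_i and f v_i = v_{i+1}."""
    d = n + 1
    e = [[Fraction(0)] * d for _ in range(d)]
    f = [[Fraction(0)] * d for _ in range(d)]
    h = [[Fraction(0)] * d for _ in range(d)]
    for i in range(d):
        h[i][i] = Fraction(n - 2 * i)
        if i + 1 < d:
            f[i + 1][i] = Fraction(1)                  # f v_i = v_{i+1}
            e[i][i + 1] = Fraction((i + 1) * (n - i))  # e v_{i+1} = (i+1)(n-i) v_i
    return e, f, h

def matmul(a, b):
    d = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(d)) for j in range(d)] for i in range(d)]

def casimir(n):
    """C_2 = e f + f e + h^2 / 2, i.e. sum_p J_p J^p for the trace form."""
    e, f, h = sl2_irrep(n)
    ef, fe, hh = matmul(e, f), matmul(f, e), matmul(h, h)
    d = n + 1
    return [[ef[i][j] + fe[i][j] + hh[i][j] / 2 for j in range(d)] for i in range(d)]

def casimir_eigenvalue(n):
    """(lambda | lambda + 2 rho) for sl2 with highest weight n and (alpha|alpha) = 2."""
    return Fraction(n * (n + 2), 2)

def conformal_weight(n, kappa):
    """Delta_lambda = (lambda | lambda + 2 rho) / (2 kappa)."""
    return casimir_eigenvalue(n) / (2 * Fraction(kappa))
```

With $n = 2j$ and $\kappa = k + 2$ this reproduces the familiar $\mathfrak{sl}_2$ value $j(j+1)/(k+2)$ for the conformal weight $\Delta_\lambda$.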
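1.4. State-operator correspondence. Let us recall the state-operator correspondence in two-dimensional conformal field theories. A primary field generates a highest weight representation of the algebra of symmetries of the theory in a space of operator-valued functions ("local operators"). In our case, the algebra of symmetries is $\widehat{Gh}(\mathfrak{g}) \oplus \widehat{Bos}(\mathfrak{g})$ and the representation space is
$$\mathcal{O}_\lambda := \mathrm{Span}_{\mathbb{C}}\{ \widehat{x_1[m_1]} \cdots \widehat{x_n[m_n]}\,V(\lambda; z) \mid n \in \mathbb{N};\ x_i = \beta_\alpha, \gamma^\alpha \text{ or } \phi_j \text{ for certain } \alpha, j;\ m_i \in \mathbb{Z} \} \subset \widehat{Gh}(\mathfrak{g}) \oplus \widehat{Bos}(\mathfrak{g})((z)), \tag{1.37}$$
where $V(\lambda; z)$ is defined by (1.15) and the action of $x[m]$, denoted by $\widehat{x[m]}$, is defined by
$$\widehat{x[m]}\,\Phi(z) := \mathrm{Res}_{\zeta=z}(\zeta - z)^{m+h-1}x(\zeta)\Phi(z) \quad (x = \beta_\alpha, \gamma^\alpha), \qquad \widehat{\phi_i[m]}\,\Phi(z) := \mathrm{Res}_{\zeta=z}(\zeta - z)^{m}\,\partial\phi_i(\zeta)\Phi(z), \tag{1.38}$$
(cf. (1.6)), where $h$ is the conformal spin of the field $x(z) = \sum_{n\in\mathbb{Z}} x[n]z^{-n-h}$ and $\Phi(z) \in \mathcal{O}_\lambda$. An element of $\mathcal{O}_\lambda$ maps $Wak_{\mu,k}$ to $Wak_{\lambda+\mu,k}$ for any $\mu$:
$$\mathcal{O}_\lambda \subset \mathrm{Hom}_{\mathbb{C}}(Wak_{\mu,k}, Wak_{\lambda+\mu,k}). \tag{1.39}$$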
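Because of the operator product expansions (1.7), (1.12), $\mathcal{O}_\lambda$ is a $\widehat{Gh}(\mathfrak{g}) \oplus \widehat{Bos}(\mathfrak{g})$-module:

Lemma 1.3. (i) For any fields $x, y = \beta_\alpha, \gamma^\alpha, \phi_i$ and any integers $m, n \in \mathbb{Z}$, we have
$$[\widehat{x[m]}, \widehat{y[n]}] = [x[m], y[n]]^\wedge \in \mathrm{End}_{\mathbb{C}}(\mathcal{O}_\lambda).$$
(ii) $\mathcal{O}_\lambda$ is generated by $V(\lambda; z)$, which satisfies
$$\widehat{\beta_\alpha[m]}\,V(\lambda; z) = \widehat{\gamma^\alpha[n]}\,V(\lambda; z) = \widehat{\phi_i[n]}\,V(\lambda; z) = 0, \qquad \widehat{\phi_i[0]}\,V(\lambda; z) = \lambda(H_i)V(\lambda; z), \tag{1.40}$$
for $\alpha \in \Delta_+$, $i = 1, \dots, l$, $m \geq 0$, $n > 0$.

The second statement is due to the operator product expansions
$$\beta_\alpha(z)V(\lambda; w) \sim 0, \qquad \gamma^\alpha(z)V(\lambda; w) \sim 0, \qquad \phi_i(z)V(\lambda; w) \sim \frac{\lambda(H_i)}{z-w}V(\lambda; w). \tag{1.41}$$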
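Hence, the universality of Fock representations implies

Corollary 1.4. There exists a unique surjective homomorphism
$$\Phi_\lambda : Wak_{\lambda,k} \to \mathcal{O}_\lambda, \tag{1.42}$$
which maps $|0\rangle_{gh} \otimes |\lambda\rangle_{bos}$ to $V(\lambda; z)$.

It follows immediately from (1.38) that the $\mathfrak{g}$-submodule $Wak^0_{\lambda,k}$ defined by (1.34) is isomorphically mapped to
$$\mathcal{P}_\lambda := \{ P(\gamma(z))V(\lambda; z) \mid P(x) \text{ is a polynomial of } x = (x^\alpha)_{\alpha\in\Delta_+} \} \tag{1.43}$$
by $\Phi_\lambda$:
$$\Phi_\lambda(P(\gamma[0])|0\rangle_{gh} \otimes |\lambda\rangle_{bos}) = P(\gamma(z))V(\lambda; z). \tag{1.44}$$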
Since $\hat{\mathfrak{g}} \oplus Vir$ is realized in terms of ghosts and bosons through (1.28) and (1.30), we can define the action of $X[m]$ ($X \in \mathfrak{g}$, $m \in \mathbb{Z}$) and $T[m]$ ($m \in \mathbb{Z}$) on $\mathcal{O}_\lambda$ by replacing $\beta_\alpha[n]$, etc. in $\omega(X[m])$, $\omega(T[m])$ with $\widehat{\beta_\alpha[n]}$, etc. defined by (1.38) respectively. In fact, their actions are described more simply, thanks to the following lemma. Assume that $B_i(z) = \sum_{m\in\mathbb{Z}} B_i[m]z^{-m-h_i}$ ($i = 1, 2$) are fields which have the operator product expansion
$$B_1(z)B_2(w) = \sum_{j=1}^{N} \frac{B_{12,j}(w)}{(z-w)^j} + {:}B_1(z)B_2(w){:}, \tag{1.45}$$
where the normal order product $:\ :$ is defined by
$${:}B_1[m]B_2[n]{:} = \begin{cases} B_1[m]B_2[n], & m \leq -h_1, \\ B_2[n]B_1[m], & m > -h_1, \end{cases} \tag{1.46}$$
and the field $:B_1(z)B_2(w):$ has no singularity at $z = w$. We denote its restriction to the diagonal $\{z = w\}$ by $B_3(z)$:
$$B_3(z) := \sum_{m\in\mathbb{Z}} B_3[m]z^{-m-h_3}, \qquad B_3[m] = \sum_{n\in\mathbb{Z}} {:}B_1[n]B_2[m-n]{:}, \tag{1.47}$$
where $h_3 = h_1 + h_2$.

Lemma 1.5. For any $\Phi(z) \in \mathcal{O}_\lambda$, we have
$$\widehat{B_3[m]}\,\Phi(z) = \sum_{n\in\mathbb{Z}} {}^\circ_\circ \widehat{B_1[n]}\,\widehat{B_2[m-n]} {}^\circ_\circ\,\Phi(z),$$
where
$$\widehat{B_i[m]}\,\Phi(z) = \mathrm{Res}_{\zeta=z}(\zeta - z)^{m+h_i-1}B_i(\zeta)\Phi(z),$$
and the normal ordering ${}^\circ_\circ\ {}^\circ_\circ$ is defined by the same rule as in (1.46).
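Corollary 1.6. For $X \in \mathfrak{g}$, $m \in \mathbb{Z}$ and $\Phi(z) \in \mathcal{O}_\lambda$, we have
$$\widehat{X[m]}\,\Phi(z) = \mathrm{Res}_{\zeta=z}(\zeta - z)^{m}X(\zeta)\Phi(z), \tag{1.48}$$
$$\widehat{T[m]}\,\Phi(z) = \mathrm{Res}_{\zeta=z}(\zeta - z)^{m+1}T(\zeta)\Phi(z), \tag{1.49}$$
where the action on the left-hand side is defined by replacing $\beta_\alpha[m]$, etc. in $\omega(X[m])$ and $\omega(T[m])$ by $\widehat{\beta_\alpha[m]}$, etc. defined by (1.38).

Using the operator product expansion
$$T(z)V(\lambda; w) \sim \frac{\Delta_\lambda V(\lambda; w)}{(z-w)^2} + \frac{\partial V(\lambda; w)}{z-w}, \tag{1.50}$$
where $\Delta_\lambda = (\lambda|\lambda + 2\rho)/2\kappa$ is the conformal weight of $V(\lambda; w)$, and the fact
$$[T[-1], x(z)] = \partial x(z), \tag{1.51}$$
for $x = \beta_\alpha$, $\gamma^\alpha$ or $\partial\phi_i$, which is a direct consequence of (1.21), we can prove the following formula by induction.

Lemma 1.7. For any $\Phi(z) \in \mathcal{O}_\lambda$, we have
$$\partial\Phi(z) = [T[-1], \Phi(z)] = \widehat{T[-1]}\,\Phi(z). \tag{1.52}$$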
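1.5. Screening operators. Bosonization of the operator $\mathrm{Scr}(X; x, \partial_x)$ in (1.4) gives the screening operator. The ghost sector of the screening operator is defined for any positive root $\alpha \in \Delta_+$ as follows:
$$\mathrm{Scr}_\alpha(z) := {:}\mathrm{Scr}(e_\alpha; \gamma(z), \beta(z)){:}. \tag{1.53}$$
The screening operator or the screening current is defined for simple roots $\alpha_i$ ($i = 1, \dots, l$) as the product of $\mathrm{Scr}_{\alpha_i}$ and a bosonic vertex operator (see (1.15)):
$$\mathrm{scr}_i(z) := \mathrm{Scr}_{\alpha_i}(z)V(-\alpha_i; z) \in \mathcal{O}_{-\alpha_i}. \tag{1.54}$$
An important property of screening currents is the following operator product expansions:
$$X(z)\,\mathrm{scr}_i(w) \sim 0, \tag{1.55}$$
$$F_j(z)\,\mathrm{scr}_i(w) \sim -\kappa\,\delta_{i,j}\,\frac{\partial}{\partial w}\frac{V(-\alpha_i; w)}{z-w}, \tag{1.56}$$
$$T(z)\,\mathrm{scr}_i(w) \sim \frac{\partial}{\partial w}\frac{\mathrm{scr}_i(w)}{z-w}, \tag{1.57}$$
where $X \in \mathfrak{b}_+$ and $i, j = 1, \dots, l$. The ghost sector of the screening current has the following operator product expansions, which shall be used in computing explicit forms of the integral representations of solutions of the KZB equations:
$$H^{gh}(z)\,\mathrm{Scr}_\alpha(w) \sim \frac{\alpha(H)\,\mathrm{Scr}_\alpha(w)}{z-w}, \tag{1.58}$$
$$T^{gh}(z)\,\mathrm{Scr}_\alpha(w) \sim \frac{\mathrm{Scr}_\alpha(w)}{(z-w)^2} + \frac{\partial\mathrm{Scr}_\alpha(w)}{z-w}, \tag{1.59}$$
$$\mathrm{Scr}_\alpha(z)\,\mathrm{Scr}_\beta(w) \sim \frac{f^{\alpha+\beta}_{\alpha,\beta}\,\mathrm{Scr}_{\alpha+\beta}(w)}{z-w}, \tag{1.60}$$
$$\mathrm{Scr}_\alpha(z)\,P(\gamma(w)) \sim \frac{(\mathrm{Scr}_\alpha P)(\gamma(w))}{z-w}, \tag{1.61}$$
for any $P(x) \in \mathbb{C}[x]$ ($x = (x^\alpha)_{\alpha\in\Delta_+}$), where $f^{\alpha+\beta}_{\alpha,\beta}$ is the structure constant of the Lie algebra $\mathfrak{n}_+$, $[e_\alpha, e_\beta] = f^{\alpha+\beta}_{\alpha,\beta}e_{\alpha+\beta}$, and $(\mathrm{Scr}_\alpha P)(x) \in \mathbb{C}[x]$ is the polynomial $(\mathrm{Scr}_\alpha P)(x) = \mathrm{Scr}(e_\alpha; x, \partial_x)P(x) \in \mathbb{C}[x]$.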
2. WZW Models on Elliptic Curves In this section we recall the definition (or a characterization) of N -point functions on elliptic curves.
2.1. Space of conformal coinvariants and space of conformal blocks. N-point functions of the WZW model take values in the space of conformal blocks, which is the dual of the space of conformal coinvariants. (Strictly speaking, N-point functions are sections of a vector bundle, a fiber of which is the space of conformal blocks. See Sect. 2.2.) To define the spaces of conformal coinvariants and conformal blocks, we first need a Lie algebra bundle over an elliptic curve with marked points. For each $q \in \mathbb{C}^\times$, $|q| < 1$, we define an elliptic curve $X = X_q$ by
$$X_q := \mathbb{C}^\times / q^{\mathbb{Z}}, \tag{2.1}$$
where $q^{\mathbb{Z}} = \{q^n \mid n \in \mathbb{Z}\}$ is a multiplicative group acting on $\mathbb{C}^\times$ by $z \mapsto q^n z$. A Lie algebra bundle $\mathfrak{g}^H$ is defined for each $H \in \mathfrak{h}$ by
$$\mathfrak{g}^H = \mathbb{C}^\times \times \mathfrak{g} / \sim, \tag{2.2}$$
where the equivalence relation $\sim$ is
$$(t, A) \sim (qt, e^{-\mathrm{ad}\,H}A). \tag{2.3}$$
This Lie algebra bundle $\mathfrak{g}^H$ has a natural connection, $\nabla_{d/dt} = t\,d/dt$, and is decomposed into a direct sum of line bundles and a trivial bundle with fiber $\mathfrak{h}$:
$$\mathfrak{g}^H \cong \bigoplus_{\alpha\in\Delta} L_{\alpha(H)} \oplus (\mathfrak{h} \times X). \tag{2.4}$$
Here the line bundle $L_c$ ($c \in \mathbb{C}$) on $X$ is defined by
$$L_c := (\mathbb{C}^\times \times \mathbb{C}) / \approx_c, \tag{2.5}$$
where $\approx_c$ is an equivalence relation defined by
$$(t, x) \approx_c (qt, e^{-c}x). \tag{2.6}$$
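By (2.5) and (2.6), a section of $L_c$ corresponds to a function $f$ on $\mathbb{C}^\times$ with $f(qt) = e^{-c}f(t)$. The sketch below builds one meromorphic function with this property as a ratio of truncated theta products and checks the relation numerically. It is only an illustration: the product `theta0` and its normalization are assumptions of ours and need not coincide with the functions $\theta_{11}$ and $\sigma_c$ fixed in the paper's appendix.

```python
import cmath

def theta0(z, q, terms=200):
    """Truncated product (1 - z) * prod_{n>=1} (1 - q^n z)(1 - q^n / z).
    The full product satisfies theta0(q z) = -(1/z) theta0(z)."""
    prod = 1 - z
    for n in range(1, terms):
        qn = q ** n
        prod *= (1 - qn * z) * (1 - qn / z)
    return prod

def section(t, q, c):
    """A meromorphic function with section(q t) = exp(-c) * section(t):
    the factors -1/(e^c t) and -1/t picked up by numerator and denominator
    under t -> q t combine to exp(-c)."""
    return theta0(cmath.exp(c) * t, q) / theta0(t, q)

q = 0.2
c = 0.7 + 0.3j
t = 0.5 + 0.4j
lhs = section(q * t, q, c)
rhs = cmath.exp(-c) * section(t, q, c)
assert abs(lhs - rhs) < 1e-9 * abs(rhs)
```

The function has simple poles on $q^{\mathbb{Z}}$, so in the language of the text it would be a section of $L_c$ with a pole allowed at the point $t = 1$.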
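As usual, the structure sheaf on $X = X_q$ is denoted by $\mathcal{O}_X$ and the sheaf of meromorphic functions on $X$ by $\mathcal{K}_X$. A stalk of a sheaf $\mathcal{F}$ on $X$ at a point $P \in X$ is denoted by $\mathcal{F}_P$. When $\mathcal{F}$ is an $\mathcal{O}_X$-module, we denote its fiber $\mathcal{F}_P/\mathfrak{m}_P\mathcal{F}_P$ by $\mathcal{F}|_P$, where $\mathfrak{m}_P$ is the maximal ideal of the local ring $\mathcal{O}_{X,P}$. Denote by $\mathcal{F}^\wedge_P$ the $\mathfrak{m}_P$-adic completion of $\mathcal{F}_P$. We shall use the same symbol for a vector bundle and for a locally free coherent $\mathcal{O}_X$-module consisting of its local holomorphic sections. For instance, the invertible sheaf associated to the line bundle $L_c$ is also denoted by the same symbol $L_c$. Denote by $\Omega^1_X$ the sheaf of holomorphic 1-forms on $X$, which is isomorphic to $\mathcal{O}_X$ since $X$ is an elliptic curve. The fiberwise Lie algebra structure of the bundle $\mathfrak{g}^H$ induces that of the associated sheaf $\mathfrak{g}^H$ over $\mathcal{O}_X$. Define the invariant $\mathcal{O}_X$-inner product on $\mathfrak{g}^H$ by
$$(A|B) := \frac{1}{2h^\vee}\,\mathrm{Tr}_{\mathfrak{g}^H}(\mathrm{ad}\,A\ \mathrm{ad}\,B) \in \mathcal{O}_X \quad \text{for } A, B \in \mathfrak{g}^H, \tag{2.7}$$
where the symbol $\mathrm{ad}$ denotes the adjoint representation of the $\mathcal{O}_X$-Lie algebra $\mathfrak{g}^H$. Then the inner product on $\mathfrak{g}^H$ is invariant under the translations with respect to the connection $\nabla : \mathfrak{g}^H \to \mathfrak{g}^H \otimes_{\mathcal{O}_X} \Omega^1_X$:
$$d(A|B) = (\nabla A|B) + (A|\nabla B) \in \Omega^1_X \quad \text{for } A, B \in \mathfrak{g}^H. \tag{2.8}$$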
Under the trivialization of $\mathfrak{g}^H$ defined by the construction (2.2), the connection $\nabla$ coincides with the exterior derivative by $t\,d/dt$. The fiber of $\mathfrak{g}^H$ is isomorphic to $\mathfrak{g}$. For any point $P$ on $X$ with $t(P) = z$, we put
$$\mathfrak{g}^P := (\mathfrak{g}^H \otimes_{\mathcal{O}_X} \mathcal{K}_X)^\wedge_P, \tag{2.9}$$
which is a topological Lie algebra non-canonically isomorphic to the loop algebra $\mathfrak{g}((t - z))$. Its subspace $\mathfrak{g}^P_+ := (\mathfrak{g}^H)^\wedge_P \cong \mathfrak{g}[[t - z]]$ is a maximal linearly compact subalgebra of $\mathfrak{g}^P$ under the $(t - z)$-adic linear topology.

Let us fix mutually distinct points $P_1, \dots, P_N$ on $X$ whose coordinates are $t = z_1, \dots, z_N$ and put $D := \{P_1, \dots, P_N\}$. We shall also regard $D$ as a divisor on $X$ (i.e., $D = P_1 + \cdots + P_N$). Denote $X \smallsetminus D$ by $\dot{X}$. The Lie algebra $\mathfrak{g}^D := \bigoplus_{i=1}^N \mathfrak{g}^{P_i}$ has the natural 2-cocycle defined by
$$c_a(A, B) := \sum_{i=1}^{N} \mathrm{Res}_{t=z_i}(\nabla A_i|B_i), \tag{2.10}$$
where $A = (A_i)_{i=1}^N$, $B = (B_i)_{i=1}^N \in \mathfrak{g}^D$ and $\mathrm{Res}_{t=z}$ is the residue at $t = z$. (The symbol "$c_a$" stands for "Cocycle defining the Affine Lie algebra".) We denote the central extension of $\mathfrak{g}^D$ with respect to this cocycle by $\hat{\mathfrak{g}}^D$:
$$\hat{\mathfrak{g}}^D := \mathfrak{g}^D \oplus \mathbb{C}\hat{k},$$
where $\hat{k}$ is a central element. Explicitly the bracket of $\hat{\mathfrak{g}}^D$ is represented as
$$[A, B] = ([A_i, B_i]^0)_{i=1}^N \oplus c_a(A, B)\hat{k} \quad \text{for } A, B \in \mathfrak{g}^D, \tag{2.11}$$
where $[A_i, B_i]^0$ are the natural brackets in $\mathfrak{g}^{P_i}$. The Lie algebra $\hat{\mathfrak{g}}^P$ for a point $P$ is non-canonically isomorphic to the affine Lie algebra $\hat{\mathfrak{g}}$ and $\hat{\mathfrak{g}}^{P_i}$ can be regarded as a subalgebra of $\hat{\mathfrak{g}}^D$. Put $\mathfrak{g}^P_+ := (\mathfrak{g}^H)^\wedge_P$ as above. Then $\mathfrak{g}^{P_i}_+$ can also be regarded as a subalgebra of $\hat{\mathfrak{g}}^{P_i}$ and $\hat{\mathfrak{g}}^D$.

Let $\mathfrak{g}^{H,D}_{\dot{X}}$ be the space of global meromorphic sections of $\mathfrak{g}^H$ which are holomorphic on $\dot{X}$:
$$\mathfrak{g}^{H,D}_{\dot{X}} := \Gamma(X, \mathfrak{g}^H(*D)).$$
There is a natural linear map from $\mathfrak{g}^{H,D}_{\dot{X}}$ into $\mathfrak{g}^D$ which maps a meromorphic section of $\mathfrak{g}^H$ to its germs at the $P_i$'s. The residue theorem implies that this linear map is extended to a Lie algebra injective homomorphism from $\mathfrak{g}^{H,D}_{\dot{X}}$ into $\hat{\mathfrak{g}}^D$, which allows us to regard $\mathfrak{g}^{H,D}_{\dot{X}}$ as a subalgebra of $\hat{\mathfrak{g}}^D$.

Definition 2.1. The space of conformal coinvariants $CC_H(X_q, D, M)$ and that of conformal blocks $CB_H(X_q, D, M)$ associated to $\hat{\mathfrak{g}}^{P_i}$-modules $M_i$ with the same level $\hat{k} = k$ are defined to be the space of coinvariants of $M := \bigotimes_{i=1}^N M_i$ with respect to $\mathfrak{g}^{H,D}_{\dot{X}}$ and its dual:
$$CC_H(X_q, D, M) := M/\mathfrak{g}^{H,D}_{\dot{X}}M, \qquad CB_H(X_q, D, M) := (M/\mathfrak{g}^{H,D}_{\dot{X}}M)^*. \tag{2.12}$$
(In [TUY], $CC_H(X_q, D, M)$ and $CB_H(X_q, D, M)$ are called the space of covacua and that of vacua respectively.) In other words, the space of conformal coinvariants $CC_H(X_q, D, M)$ is generated by $M$ with relations
$$A_{\dot{X}}v \equiv 0 \quad \text{for all } A_{\dot{X}} \in \mathfrak{g}^{H,D}_{\dot{X}} \text{ and } v \in M, \tag{2.13}$$
and a linear functional $\Phi$ on $M$ belongs to the space of conformal blocks $CB_H(X_q, D, M)$ if and only if it satisfies
$$\Phi(A_{\dot{X}}v) = 0 \quad \text{for all } A_{\dot{X}} \in \mathfrak{g}^{H,D}_{\dot{X}} \text{ and } v \in M. \tag{2.14}$$
These equations (2.13) and (2.14) are called the Ward identities.

2.2. N-point functions. N-point functions are flat sections of a sheaf of conformal blocks over the base space $S$ of a family $\tilde{\mathfrak{X}}$ of pointed curves with marked points defined as follows:
$$S := \{ (z; q; H) = (z_1, \dots, z_N; q; H) \in (\mathbb{C}^\times)^N \times \mathbb{C}^\times \times \mathfrak{h} \mid z_i/z_j \notin q^{\mathbb{Z}} \text{ if } i \neq j \}, \qquad \tilde{\mathfrak{X}} := S \times \mathbb{C}^\times.$$
Let $\tilde\pi = \pi_{\tilde{\mathfrak{X}}/S}$ be the projection from $\tilde{\mathfrak{X}}$ onto $S$ along $\mathbb{C}^\times$, $\tilde\pi(z; q; H; t) = (z; q; H)$, and $\tilde{p}_i$ the section of $\tilde\pi$ given by $z_i$:
$$\tilde{p}_i(z; q; H) := (z; q; H; z_i) \in \tilde{\mathfrak{X}} \quad \text{for } (z; q; H) = (z_1, \dots, z_N; q; H) \in S.$$
A family of N-pointed elliptic curves $\pi : \mathfrak{X} \to S$ is constructed as follows. Define the action of $\mathbb{Z}$ on $\tilde{\mathfrak{X}}$ by
$$m \cdot (z; q; H; t) := (z; q; H; q^m t) \quad \text{for } m \in \mathbb{Z},\ (z; q; H; t) \in \tilde{\mathfrak{X}}. \tag{2.15}$$
Let $\mathfrak{X}$ be the quotient space of $\tilde{\mathfrak{X}}$ by the action of $\mathbb{Z}$:
$$\mathfrak{X} := \mathbb{Z} \backslash \tilde{\mathfrak{X}}. \tag{2.16}$$
Let $\pi_{\tilde{\mathfrak{X}}/\mathfrak{X}}$ be the natural projection from $\tilde{\mathfrak{X}}$ onto $\mathfrak{X}$ and $\pi = \pi_{\mathfrak{X}/S}$ the projection from $\mathfrak{X}$ onto $S$ induced by $\tilde\pi$. We put
$$p_i := \pi_{\tilde{\mathfrak{X}}/\mathfrak{X}} \circ \tilde{p}_i, \qquad P_i := p_i(S), \qquad D := \bigcup_{i=1}^{N} P_i, \qquad \dot{\mathfrak{X}} := \mathfrak{X} \smallsetminus D, \qquad \tilde{D} := \pi^{-1}_{\tilde{\mathfrak{X}}/\mathfrak{X}}(D).$$
Here $p_i$ is the section of $\pi$ induced by $\tilde{p}_i$ and $D$ is also regarded as a divisor $\sum_{i=1}^N P_i$ on $\mathfrak{X}$. The fiber of $\pi$ at $(z; q; H) = (z_1, \dots, z_N; q; H) \in S$ is an elliptic curve with modulus $q$ and marked points $z_1, \dots, z_N$.

We refer to [FW] for the construction of a sheaf of conformal blocks and its flat connection and, using their result, define the N-point functions as follows. (See also [S].) We identify each fiber of $\mathfrak{g}^H$ with $\mathfrak{g}$ via the standard trivialization defined by the construction of $\mathfrak{g}^H$, (2.2). Then the algebra $\mathfrak{g}^P$ defined by (2.9) is identified with $\mathfrak{g}((t - z))$, where $t$ is the coordinate on the complex plane and $z$ is the coordinate of $P$. For
$X \in \mathfrak{g}$, the element $X \otimes (t - z)^m$ of $\mathfrak{g}^P \cong \mathfrak{g}((t - z))$ is denoted by $X[m]$. The Virasoro generator defined by the Sugawara construction (1.29) is denoted by $T[m]$:
$$T[m] = \frac{1}{2\kappa}\sum_{p=1}^{\dim\mathfrak{g}}\sum_{n\in\mathbb{Z}} {}^\circ_\circ J_p[m-n]J^p[n] {}^\circ_\circ. \tag{2.17}$$
Let us denote the representation of $\hat{\mathfrak{g}}^{P_i}$ on the $i$-th component of the tensor product $M = \bigotimes_{i=1}^N M_i$ by $\rho_i$ and its dual by $\rho_i^*$: for $v \in M$, $v^* \in M^*$, $A \in \hat{\mathfrak{g}}^{P_i}$,
$$\langle \rho_i^*(A)v^*, v \rangle = -\langle v^*, \rho_i(A)v \rangle, \tag{2.18}$$
where $\langle\cdot,\cdot\rangle$ is the pairing of $M^*$ and $M$. We assume that the Virasoro algebra with central charge $c_V$ acts on $M_i$ and $M_i^*$ through the Sugawara construction (2.17). For $v \in M$, $v^* \in M^*$, $H \in \mathfrak{h}$ and a multi-valued meromorphic function $f(t)$ on $X_q$, all poles of which belong to $\{z_1, \dots, z_N\}$ (mod $q^{\mathbb{Z}}$), we define
$$\rho\Bigl(T\Bigl\{f(t)\frac{d}{dt}\Bigr\}\Bigr)v := \sum_{i=1}^{N}\sum_{m\in\mathbb{Z}} f_{i,m+1}\,\rho_i(T[m])v, \qquad \rho^*\Bigl(T\Bigl\{f(t)\frac{d}{dt}\Bigr\}\Bigr)v^* := \sum_{i=1}^{N}\sum_{m\in\mathbb{Z}} f_{i,m+1}\,\rho_i^*(T[m])v^*, \tag{2.19}$$
$$\rho(H\{f(t)\})v := \sum_{i=1}^{N}\sum_{m\in\mathbb{Z}} f_{i,m}\,\rho_i(H[m])v, \qquad \rho^*(H\{f(t)\})v^* := \sum_{i=1}^{N}\sum_{m\in\mathbb{Z}} f_{i,m}\,\rho_i^*(H[m])v^*, \tag{2.20}$$
where $f(t) = \sum_{m\in\mathbb{Z}} f_{i,m}(t - z_i)^m$ is the Laurent expansion of $f(t)$ around $t = z_i$.

Fix a meromorphic function $Z(z; q; H; t)$ on $\tilde{\mathfrak{X}}$ with poles only at $\{z_1, \dots, z_N\}$ (mod $q^{\mathbb{Z}}$) (namely $Z(z; q; H; t) \in \Gamma(\tilde{\mathfrak{X}}, \mathcal{O}_{\tilde{\mathfrak{X}}}(*\tilde{D}))$) satisfying
$$Z(z; q; H; qt) = Z(z; q; H; t) - 1. \tag{2.21}$$
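A function with the quasi-periodicity (2.21) can be produced as a logarithmic derivative of a theta-type product. The sketch below is an illustration under our own conventions: `zeta_like` is $t\,\frac{d}{dt}\log$ of the product $(1-t)\prod_{n\ge1}(1-q^nt)(1-q^n/t)$, summed term by term, and its normalization is an assumption; the prefactor appearing in Example 2.2 depends on the convention for $\theta_{11}$ in the paper's appendix, which is not reproduced here.

```python
def zeta_like(t, q, terms=200):
    """t * d/dt log[(1 - t) * prod_{n>=1} (1 - q^n t)(1 - q^n / t)],
    computed as a truncated sum of the logarithmic derivatives of the factors."""
    total = -t / (1 - t)
    for n in range(1, terms):
        qn = q ** n
        total += -qn * t / (1 - qn * t)        # from the factor (1 - q^n t)
        total += (qn / t) / (1 - qn / t)       # from the factor (1 - q^n / t)
    return total

def Z(t, q, z0):
    """Candidate for Z(t): poles only at t in z0 * q^Z, and Z(q t) = Z(t) - 1."""
    return zeta_like(t / z0, q)

q = 0.3
z0 = 1.5 - 0.5j      # plays the role of the marked point z_{i_0}
t = 0.8 + 0.6j
assert abs(Z(q * t, q, z0) - Z(t, q, z0) + 1) < 1e-9
```

The shift by $-1$ comes from the factor $-1/t$ that the product acquires under $t \to qt$; all other factors are permuted among themselves.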
We abbreviate $Z(z; q; H; t)$ as $Z(t)$.

Example 2.2. We may take the following function as $Z(t) = Z(z; q; H; t)$:
$$Z(z; q; H; t) := \frac{1}{2\pi i}\,\frac{t}{z_{i_0}}\,\frac{d}{dt}\log\theta_{11}(t/z_{i_0}; q),$$
for $i_0 \in \{1, \dots, N\}$.

Let us take a coordinate system of $\mathfrak{h}$ as
$$\mathfrak{h} \ni H = \sum_{a=1}^{l} \xi_a H_a, \tag{2.22}$$
where $\{H_a\}_{a=1}^l$ is an orthonormal basis of $\mathfrak{h}$.
Definition 2.3. A multi-valued holomorphic function $\Psi(z; q; H)$ on $S$ with values in $M^*$ is called an N-point function in genus one if it satisfies the following conditions (I), (II), (III) and (IV):

(I) For any $(z, q, H) \in S$, $\Psi(z; q; H) \in CB_H(X_q, D, M)$;

(II) For $j = 1, \dots, N$,
$$\frac{\partial}{\partial z_j}\Psi(z; q; H) = \rho_j^*(T[-1])\Psi(z; q; H); \tag{2.23}$$
(III)
$$\Bigl(q\frac{\partial}{\partial q} + \frac{c_V}{24}\Bigr)\Psi(z; q; H) = \rho^*\Bigl(T\Bigl\{Z(t)\,t\frac{d}{dt}\Bigr\}\Bigr)\Psi(z; q; H); \tag{2.24}$$
(IV) For $r = 1, \dots, l$,
$$\frac{\partial}{\partial\xi_r}\Psi(z; q; H) = -\rho^*(H_r\{Z(t)\})\Psi(z; q; H). \tag{2.25}$$
Assume that each $M_i$ contains a $\mathfrak{g}$-submodule $V_i$ such that for any $m > 0$ and $X \in \mathfrak{g}$,
$$X[m]V_i = 0, \tag{2.26}$$
and the Casimir operator $C_2 = \sum_p J_p J^p$ acts as a multiplication,
$$\rho_i(C_2) = c_2^{(i)}\,\mathrm{id}_{V_i}. \tag{2.27}$$
Let us define a function $\Pi(q; H)$ by
$$\Pi(q; H) := q^{\dim\mathfrak{g}/24}\,(q; q)_\infty^{l} \prod_{\alpha\in\Delta_+} 2\sinh(\alpha(H)/2) \prod_{\alpha\in\Delta} (qe^{\alpha(H)}; q)_\infty. \tag{2.28}$$
Then we can restrict an N-point function $\Psi(z; q; H)$ to $V = \bigotimes_{i=1}^N V_i$:
$$\Psi_V(z; q; H) := \Psi(z; q; H)|_V, \tag{2.29}$$
and $\tilde\Psi_V(z; q; H) = \Pi(q; H)\Psi_V(z; q; H)$ satisfies the following system, which is called the Knizhnik–Zamolodchikov–Bernard (KZB) equations, first found by Bernard [B1]. (See Theorem 4.1 of [FW].) The functions $\sigma_c(z)$ and $\zeta(z)$ below are defined by (A.7) and (A.8).

(I') For any $H \in \mathfrak{h}$,
$$\sum_{i=1}^{N} \rho_i^*(H)\tilde\Psi_V(z; q; H) = 0. \tag{2.30}$$
(II') For $j = 1, \dots, N$,
$$\kappa\Bigl(z_j\frac{\partial}{\partial z_j} + \frac{c_2^{(j)}}{2\kappa}\Bigr)\tilde\Psi_V(z; q; H) = \sum_{r=1}^{l} \rho_j^*(H_r)\frac{\partial}{\partial\xi_r}\tilde\Psi_V(z; q; H) + \sum_{i\neq j} (\rho_i^* \otimes \rho_j^*\,\Omega(z_i, z_j))\tilde\Psi_V(z; q; H), \tag{2.31}$$
where
$$\Omega(z, w) = \Omega(z, w; q; H) := -\sum_{\alpha\in\Delta} \sigma_{-\alpha(H)}\Bigl(\frac{z}{w}\Bigr)\,e_\alpha \otimes e_{-\alpha} - \zeta\Bigl(\frac{z}{w}\Bigr)\sum_{r=1}^{l} H_r \otimes H_r. \tag{2.32}$$
(III’) l X ∂ 2 ∂ ˜ ˜ V (z; q; H ) 9 2κq 9V (z; q; H ) = ∂q ∂ξr r=1
(2.33)
N X
(ρi∗ + i,j =1
˜ V (z; q; H ), ⊗ ρj∗ H (zi , zj ))9
where H (z, w) = H (z, w; q; H ) X
:= −
α∈1
−
l X r=1
ζ
eα(H ) z w
!
! − ζ (eα(H ) ) σ−α(H )
z w
eα ⊗ e−α
z 2 z 0z 1 ζ Hr ⊗ Hr . + ζ 2 w w w
(2.34)
Conversely, the restriction of an N-point function to $V$ is characterized by the KZB equations with additional conditions. For example, Felder and Wieczerkowski [FW] used the automorphic properties and asymptotic behavior of $\tilde\Psi_V$ as the additional conditions, while Suzuki [S] found a holonomic system characterizing $\tilde\Psi_V$ which includes the KZB equations.

3. N-Point Functions from Wakimoto Modules

In this section we construct N-point functions for Wakimoto modules, $M_i = Wak_{\lambda_i,k}$, where $\lambda_i \in \mathfrak{h}^*$ and $\sum_{i=1}^N \lambda_i$ belongs to the positive root lattice of $\mathfrak{g}$. Fix an ordered set of simple roots of $\mathfrak{g}$, $\{\alpha_{i_1}, \dots, \alpha_{i_M}\}$, such that
$$\sum_{j=1}^{M} \alpha_{i_j} = \sum_{i=1}^{N} \lambda_i. \tag{3.1}$$
Denote the following linear map by $\psi(z; t; q; H)$, where $z = (z_1, \dots, z_N)$, $t = (t_1, \dots, t_M)$ and $(z, t)$ belongs to (the universal covering of) $(\mathbb{C}^\times)^{N+M} \smallsetminus \{\text{diagonals}\}$:
$$Wak_{\lambda_1,k} \otimes \cdots \otimes Wak_{\lambda_N,k} \ni v_1 \otimes \cdots \otimes v_N \mapsto \mathrm{Tr}_{Wak_{\mu,k}}\bigl(\Phi_{\lambda_1}(v_1; z_1) \cdots \Phi_{\lambda_N}(v_N; z_N)\,\mathrm{scr}_{i_1}(t_1) \cdots \mathrm{scr}_{i_M}(t_M)\,q^{T[0]-c_V/24}e^{H[0]}\bigr)\,dt_1 \wedge \cdots \wedge dt_M. \tag{3.2}$$
Note that, thanks to (1.39) and (3.1), the operator inside the bracket on the right-hand side is an endomorphism of $Wak_{\mu,k}$.
Proposition 3.1. There is a local system $\mathcal{L}$ of rank one on $\{ (z; t; q; H) \mid (z; t) = (z_1, \dots, z_N; t_1, \dots, t_M) \in X_q^{N+M} \smallsetminus \{\text{diagonals}\},\ q \in \mathbb{C}^\times,\ H \in \mathfrak{h} \}$ such that $\psi(z; t; q; H)(v)$ is a holomorphic section of $\mathcal{L}$ for any $v \in M$. Namely, the monodromy of $\psi(z; t; q; H)(v)$ is independent of $v$.

Proof. First note that $\Phi_{\lambda_i}(v_i; z_i) \in \mathcal{O}_{\lambda_i}$ is of the form
$$\Phi_{\lambda_i}(v_i; z_i) = \sum \widehat{x_1[m_1]} \cdots \widehat{x_n[m_n]}\,V(\lambda_i; z_i) = \sum \oint_{C_{i1}} d\zeta_{i1}\,(\zeta_{i1} - z_i)^{m_1+h_1-1} \cdots \oint_{C_{in}} d\zeta_{in}\,(\zeta_{in} - z_i)^{m_n+h_n-1}\,x_1(\zeta_{i1}) \cdots x_n(\zeta_{in})V(\lambda_i; z_i), \tag{3.3}$$
where $x_j$ is $\beta_\alpha$, $\gamma^\alpha$ or $\partial\phi_j$ and $m_j \in \mathbb{Z}$. The contour $C_{ij}$ encircles $z_i$, lies outside of $C_{ij'}$ ($j' > j$) and does not contain $0$ and $z_{i'}$ ($i' \neq i$) inside it. It follows from (1.54), (1.53) and this expression (3.3) that $\psi(z; t; q; H)(v_1 \otimes \cdots \otimes v_N)$ is a sum of integrals of the form
$$\oint_{C_{i1}} d\zeta_{i1} \cdots \oint_{C_{in}} d\zeta_{in}\,(\text{rational function of } \zeta_{ij}, z_i)\,F^{gh}(\zeta_{ij}, t_i; q; H) \times F^{bos}(\zeta_{ij}, z_i, t_i; q; H)\,dt_1 \wedge \cdots \wedge dt_M, \tag{3.4}$$
$$F^{gh}(\zeta_{ij}, t_i; q; H) := \mathrm{Tr}_{\mathcal{F}^{gh}}\bigl((\text{polynomial of } \beta_\alpha(\zeta_{ij}), \beta_\alpha(t_i), \gamma^\alpha(\zeta_{ij}), \gamma^\alpha(t_i))\,q^{T^{gh}[0]}e^{H^{gh}[0]}\bigr), \tag{3.5}$$
$$F^{bos}(\zeta_{ij}, z_i, t_i; q; H) := \mathrm{Tr}_{\mathcal{F}^{bos}_\mu}\Bigl(\prod_j \partial\phi_{r_{1j}}(\zeta_{1j})V(\lambda_1; z_1) \cdots \prod_j \partial\phi_{r_{Nj}}(\zeta_{Nj})V(\lambda_N; z_N) \times V(-\alpha_{i_1}; t_1) \cdots V(-\alpha_{i_M}; t_M)\,q^{T^{\phi}[0]}e^{\phi[H;0]}\Bigr), \tag{3.6}$$
where "polynomial of $\beta_\alpha$, etc." may contain normal ordered products coming from $\mathrm{Scr}_{\alpha_{i_j}}(t_j)$.

The first trace in the integrand in (3.4), $F^{gh}(\zeta_{ij}, t_i; q; H)$, has singularities at $\zeta_{ij} = \zeta_{i'j'}$ and $\zeta_{ij} = t_{i'}$, which are poles because of the operator product expansions (1.7). The second trace in the integrand in (3.4), $F^{bos}(\zeta_{ij}, z_i, t_i; q; H)$, has singularities at (1) $\zeta_{ij} = \zeta_{i'j'}$; (2) $\zeta_{ij} = z_{i'}$ or $t_{i'}$; (3) $z_i = z_{i'}$, $z_i = t_{i'}$, $t_i = t_{i'}$. The first singularities (1) are poles because of the operator product expansion (1.12). The second singularities (2) are also rational by virtue of the third expansion in (1.41). The formula (cf. [Ku] (5.32))
$$V(\lambda; z)V(\lambda'; w) = {:}V(\lambda; z)V(\lambda'; w){:}\,(z - w)^{(\lambda|\lambda')/\kappa} \tag{3.7}$$
implies that $F^{bos}$ has non-trivial monodromy around the singularities (3):
$$\begin{aligned} F^{bos}(\zeta; z; t; q; H) &\to e^{2\pi i(\lambda_i|\lambda_j)/\kappa}F^{bos}(\zeta; z; t; q; H), && \text{when } z_i \text{ goes around } z_j\ (1 \leq i < j \leq N), \\ F^{bos}(\zeta; z; t; q; H) &\to e^{2\pi i(\lambda_i|-\alpha_{i_j})/\kappa}F^{bos}(\zeta; z; t; q; H), && \text{when } z_i \text{ goes around } t_j\ (1 \leq i \leq N,\ 1 \leq j \leq M), \\ F^{bos}(\zeta; z; t; q; H) &\to e^{2\pi i(\alpha_{i_j}|\alpha_{i_{j'}})/\kappa}F^{bos}(\zeta; z; t; q; H), && \text{when } t_j \text{ goes around } t_{j'}\ (1 \leq j < j' \leq M). \end{aligned} \tag{3.8}$$
Summarizing, we conclude that the integrand in (3.4) is a rational function with respect to the $\zeta_{ij}$'s and also rational with respect to the $z_i$'s and $t_i$'s except at $z_i = z_{i'}$, $z_i = t_{i'}$, $t_i = t_{i'}$, where it has the same monodromies as $F^{bos}$, (3.8).

As a next step, we show that $\psi(z; t; q; H)(v_1 \otimes \cdots \otimes v_N)$ has the same monodromy as the integrand in (3.4). This is proved by applying the following lemma iteratively.

Lemma 3.2. Assume that a function $F(\zeta, z, t)$ is rational with respect to $\zeta$ and has monodromy with respect to $z$ and $t$ around the diagonal $z = t$: $F(\zeta, z, t) \to cF(\zeta, z, t)$ when $z$ goes around $t$, where $c$ is a constant. Then
$$\psi(z, t) = \oint_{C(z)} F(\zeta, z, t)\,d\zeta \tag{3.9}$$
has the same monodromy around $z = t$, where $C(z)$ is a small contour surrounding $z$.

Proof. Fix a small circle $\gamma$ around $t$,
$$\gamma(\theta) = t + \varepsilon\exp(i\theta), \qquad \theta \in [0, 2\pi].$$
When $z$ goes around $t$ along $\gamma$, $F(\zeta, z, t)$ is multiplied by $c$. Since $F$ is rational with respect to $\zeta$, the integration contour $C(z)$ in (3.9) can be replaced with a cycle $\gamma_+ - \gamma_-$, where
$$\gamma_\pm(\theta) = t + \varepsilon_\pm\exp(i\theta), \qquad \theta \in [0, 2\pi],$$
and $\varepsilon_\pm$ are suitable constants satisfying $\varepsilon_- < \varepsilon < \varepsilon_+$. Now that $\gamma_\pm$ do not depend on $z$ and do not intersect with $\gamma$, it is obvious that
$$\psi(z, t) = \oint_{\gamma_+} F(\zeta, z, t)\,d\zeta - \oint_{\gamma_-} F(\zeta, z, t)\,d\zeta$$
is multiplied by $c$ when $z$ goes around $t$ along $\gamma$. $\square$

We can similarly prove that the monodromies of $\psi(z; t; q; H)(v_1 \otimes \cdots \otimes v_N)$ around the cycles of $X_q$ (along the paths $z_i = r\exp(2\pi i\theta)$ ($0 \leq \theta \leq 2\pi$) and $z_i \to qz_i$, and the same for $t_i$) do not depend on $v_1 \otimes \cdots \otimes v_N$. This completes the proof of the proposition. $\square$

Remark 3.3. We can also write down an explicit expression of the second trace in the integrand in (3.4), $\mathrm{Tr}_{\mathcal{F}^{bos}_\mu}\bigl(\prod_{i,j} \partial\phi_{r_{ij}}(\zeta_{ij})V(\lambda_i; z_i)\,q^{T^{\phi}[0]}e^{\phi[H;0]}\bigr)$, by using (B.15) and (B.17). In fact, since
$$\frac{\partial}{\partial z}\frac{\partial}{\partial t}\Bigr|_{t=0}\tilde V(t\lambda; z) = \partial\phi(\lambda; z),$$
applying differential operators of the form $\partial_z\partial_t|_{t=0}$ to (B.17) and combining the result with (B.15), we have a desired expression, which also shows that this trace is rational with respect to $\zeta_{ij}$ and has monodromy around the diagonals $z_i = z_{i'}$, etc.

Theorem 3.4.
$$\Psi(z; q; H) = \int_{C(z,q,H)} \psi(z; t; q; H) \tag{3.10}$$
is an N-point function with values in $(Wak_{\lambda_1,k} \otimes \cdots \otimes Wak_{\lambda_N,k})^*$, where $C(z, q, H)$ is a family of $M$-cycles with coefficients in the local system $\mathcal{L}^*$ dual to $\mathcal{L}$.

Remark 3.5. We refer to [AK] or [FV] for integrals over cycles with coefficients in the local system. Proposition 3.1 guarantees that the integration on the right-hand side of (3.10) is well-defined and that the right-hand sides of (2.23), (2.24) and (2.25) are meaningful.

Proof. This can be shown in almost the same way as Proposition 3.4.1 of [S]. Condition (I) of Definition 2.3 is checked as follows. Fix $(z, q, H) \in S$. Condition (I) means that for any $J(t) \in \mathfrak{g}^{H,D}_{\dot{X}}$,
$$\sum_{i=1}^{N} \rho_i^*(J(t))\Psi(z; q; H) = 0. \tag{3.11}$$
Thanks to the decomposition (2.4), we may assume $J(t) = X \otimes f(t)$, where $X \in \mathfrak{g}_\alpha$ ($\alpha \in \Delta \sqcup \{0\}$, $\mathfrak{g}_0 := \mathfrak{h}$), $f(t) \in \Gamma(X_q, L_{\alpha(H)}(*D))$ ($L_0 = \mathcal{O}_{X_q}$). The left-hand side of (3.11) is equal to
$$\begin{aligned} &\int_{C(z,q)} \sum_{i=1}^{N} \rho_i^*(X \otimes f(t))\,\psi(z; t; q; H)(v_1 \otimes \cdots \otimes v_N) \\ &= -\int_{C(z,q)} \sum_{i=1}^{N} \mathrm{Tr}\bigl(\Phi_{\lambda_1}(v_1; z_1) \cdots (X \otimes f(t))^\wedge_{t=z_i}\Phi_{\lambda_i}(v_i; z_i) \cdots \mathrm{scr}_{i_j}(t_j) \cdots q^{T[0]-c_V/24}e^{H[0]}\bigr)\,dt_1 \wedge \cdots \wedge dt_M \\ &= -\int_{C(z,q)} \sum_{i=1}^{N} \mathrm{Res}_{\zeta=z_i}\,f(\zeta)\,\mathrm{Tr}\bigl(\Phi_{\lambda_1}(v_1; z_1) \cdots X(\zeta)\Phi_{\lambda_i}(v_i; z_i) \cdots \mathrm{scr}_{i_j}(t_j) \cdots q^{T[0]-c_V/24}e^{H[0]}\bigr)\,d\zeta\,dt_1 \wedge \cdots \wedge dt_M, \end{aligned} \tag{3.12}$$
where $\mathrm{Tr}$ is $\mathrm{Tr}_{Wak_{\mu,k}}$ and $(X \otimes f(t))_{t=z_j}$ is the Laurent expansion of $X \otimes f(t)$ at $t = z_j$. The last line is due to the following fact, which is easily checked by (1.48): for any $\Phi(z) \in \mathcal{O}_\lambda$ and $X \in \mathfrak{g}$,
$$(X \otimes f(t))^\wedge_{t=z}\Phi(z) = \mathrm{Res}_{\zeta=z}\,f(\zeta)X(\zeta)\Phi(z)\,d\zeta. \tag{3.13}$$
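By the commutativity of the current $X(\zeta)$, vertex operator $\Phi_\lambda(v; z)$ and screening current $\mathrm{scr}_i(z)$, we have
$$\begin{aligned} &f(\zeta)\,\mathrm{Tr}\bigl(\Phi_{\lambda_1}(v_1; z_1) \cdots X(\zeta)\Phi_{\lambda_i}(v_i; z_i) \cdots \mathrm{scr}_{i_j}(t_j) \cdots q^{T[0]-c_V/24}e^{H[0]}\bigr)\,d\zeta \\ &= f(\zeta)\,\mathrm{Tr}\bigl(X(\zeta)\Phi_{\lambda_1}(v_1; z_1) \cdots \Phi_{\lambda_i}(v_i; z_i) \cdots \mathrm{scr}_{i_j}(t_j) \cdots q^{T[0]-c_V/24}e^{H[0]}\bigr)\,d\zeta \\ &= f(\zeta)\,\mathrm{Tr}\bigl(\Phi_{\lambda_1}(v_1; z_1) \cdots \Phi_{\lambda_i}(v_i; z_i) \cdots \mathrm{scr}_{i_j}(t_j) \cdots q^{T[0]-c_V/24}e^{H[0]}X(\zeta)\bigr)\,d\zeta \\ &= f(q\zeta)\,\mathrm{Tr}\bigl(\Phi_{\lambda_1}(v_1; z_1) \cdots \Phi_{\lambda_i}(v_i; z_i) \cdots \mathrm{scr}_{i_j}(t_j) \cdots X(q\zeta)\,q^{T[0]-c_V/24}e^{H[0]}\bigr)\,q\,d\zeta, \end{aligned} \tag{3.14}$$
where we used
$$e^{H[0]}X(\zeta) = e^{\alpha(H)}X(\zeta)e^{H[0]}, \qquad q^{T[0]}X(\zeta) = qX(q\zeta)q^{T[0]}, \tag{3.15}$$
and $f(qt) = e^{-\alpha(H)}f(t)$ (cf. (2.5)). Therefore,
$$f(\zeta)\,\mathrm{Tr}\bigl(\Phi_{\lambda_1}(v_1; z_1) \cdots X(\zeta)\Phi_{\lambda_i}(v_i; z_i) \cdots \mathrm{scr}_{i_j}(t_j) \cdots q^{T[0]-c_V/24}e^{H[0]}\bigr)\,d\zeta \in \Gamma(X_q, \Omega^1_X(*D)).$$
Hence the sum of its residues at $\zeta = z_j$ ($j = 1, \dots, N$) is zero by the residue theorem. Thus (3.12) implies (3.11).

Condition (II) of Definition 2.3 is a direct consequence of (1.52).

We prove Condition (III), assuming $|q| < |t_M| < \cdots < |t_1| < |z_N| < \cdots < |z_1| < 1$. The general case follows from this case by analytic continuation. By the same argument as (3.12), we have
$$\rho^*\Bigl(T\Bigl\{Z(t)\,t\frac{d}{dt}\Bigr\}\Bigr)\psi(z; q; H) = \sum_{i=1}^{N} \mathrm{Res}_{\zeta=z_i}\,Z(\zeta)\,\mathrm{Tr}\bigl(T(\zeta) \cdots \Phi_{\lambda_i}(v_i; z_i) \cdots \mathrm{scr}_{i_j}(t_j) \cdots q^{T[0]-c_V/24}e^{H[0]}\bigr)\,d^Mt \otimes \zeta\,d\zeta. \tag{3.16}$$
Deforming the integration contour, the right-hand side of (3.16) is rewritten as
$$\frac{1}{2\pi i}\Bigl(\oint_{|\zeta|=1} - \oint_{|\zeta|=|q|} - \sum_{j=1}^{M}\oint_{\zeta=t_j}\Bigr)\,\zeta\,d\zeta\,Z(\zeta) \times \mathrm{Tr}\bigl(T(\zeta) \cdots \Phi_{\lambda_i}(v_i; z_i) \cdots \mathrm{scr}_{i_j}(t_j) \cdots q^{T[0]-c_V/24}e^{H[0]}\bigr)\,d^Mt. \tag{3.17}$$
The first integral in (3.17) is equal to
$$\begin{aligned} &\frac{1}{2\pi i}\oint_{|\zeta|=1} \zeta\,d\zeta\,Z(\zeta)\,\mathrm{Tr}\bigl(\cdots \Phi_{\lambda_i}(v_i; z_i) \cdots \mathrm{scr}_{i_j}(t_j) \cdots q^{T[0]-c_V/24}e^{H[0]}T(\zeta)\bigr)\,d^Mt \\ &= \frac{1}{2\pi i}\oint_{|\zeta|=1} \zeta\,d\zeta\,Z(\zeta)\,\mathrm{Tr}\bigl(\cdots \Phi_{\lambda_i}(v_i; z_i) \cdots \mathrm{scr}_{i_j}(t_j) \cdots q^2T(q\zeta)\,q^{T[0]-c_V/24}e^{H[0]}\bigr)\,d^Mt \\ &= \frac{1}{2\pi i}\oint_{|\zeta|=1} q\zeta\,d(q\zeta)\,Z(\zeta)\,\mathrm{Tr}\bigl(\cdots \Phi_{\lambda_i}(v_i; z_i) \cdots \mathrm{scr}_{i_j}(t_j) \cdots T(q\zeta)\,q^{T[0]-c_V/24}e^{H[0]}\bigr)\,d^Mt \\ &= \frac{1}{2\pi i}\oint_{|\zeta|=|q|} \zeta\,d\zeta\,Z(q^{-1}\zeta)\,\mathrm{Tr}\bigl(\cdots \Phi_{\lambda_i}(v_i; z_i) \cdots \mathrm{scr}_{i_j}(t_j) \cdots T(\zeta)\,q^{T[0]-c_V/24}e^{H[0]}\bigr)\,d^Mt, \end{aligned} \tag{3.18}$$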
where we used
$$e^{H[0]}T(\zeta) = T(\zeta)e^{H[0]}, \qquad q^{T[0]}T(\zeta) = q^2T(q\zeta)q^{T[0]}. \tag{3.19}$$
The second integral in (3.17) is equal to
$$\frac{1}{2\pi i}\oint_{|\zeta|=|q|} \zeta\,d\zeta\,Z(\zeta)\,\mathrm{Tr}\bigl(\cdots \Phi_{\lambda_i}(v_i; z_i) \cdots \mathrm{scr}_{i_j}(t_j) \cdots T(\zeta)\,q^{T[0]-c_V/24}e^{H[0]}\bigr)\,d^Mt. \tag{3.20}$$
The third integral in (3.17) turns into a term of the form $\partial/\partial t_j(\cdots)$ by the operator product expansion (1.57), and therefore the sum of those terms is an exact $M$-form. Hence they do not contribute to the integral over $C$. Thus by summing up (3.16), (3.18) and (3.20) and using the property (2.21) of $Z$, we obtain
$$\rho^*\Bigl(T\Bigl\{Z(t)\,t\frac{d}{dt}\Bigr\}\Bigr)\Psi(z; q; H) = \int_{C(z,q)} \mathrm{Tr}\bigl(\cdots \Phi_{\lambda_i}(v_i; z_i) \cdots \mathrm{scr}_{i_j}(t_j) \cdots T[0]\,q^{T[0]-c_V/24}e^{H[0]}\bigr)\,d^Mt = \Bigl(q\frac{\partial}{\partial q} + \frac{c_V}{24}\Bigr)\Psi(z; q; H), \tag{3.21}$$
which proves Condition (III). Condition (IV) is proved in the same way. $\square$

Recall that the Wakimoto module $Wak_{\lambda,k}$ contains a $\mathfrak{g}$-submodule $Wak^0_{\lambda,k}$, (1.34), which is isomorphic to the dual Verma module $M^*_\lambda$ and satisfies (1.35) and (1.36). As is mentioned at the end of Sect. 2.2, the restriction of $\Psi(z; q; H)$ to $\bigotimes_{i=1}^N Wak^0_{\lambda_i,k}$ satisfies the KZB equations. The simple structure of $Wak^0_{\lambda,k}$ makes it possible to write down the restriction of $\psi(z; t; q; H)$ explicitly.

Each vector $v_i$ in $Wak^0_{\lambda_i,k}$ corresponds to a polynomial $P_i(x) \in \mathbb{C}[x]$ ($x = (x^\alpha)_{\alpha\in\Delta_+}$) and to an operator in $\mathcal{P}_\lambda$ (1.43) as
$$v_i = P_i(\gamma[0])|0\rangle_{gh} \otimes |\lambda_i\rangle_{bos}, \qquad \Phi_{\lambda_i}(v_i; z) = P_i(\gamma(z))V(\lambda_i; z). \tag{3.22}$$
See (1.44). Let us compute $\psi(z; t; q; H)$ for this $v_i$. Inserting the expression (3.22) into the definition (3.2), we obtain
$$\begin{aligned} \psi(z; t; q; H)(v_1 \otimes \cdots \otimes v_N) = q^{-c_V/24} &\times \mathrm{Tr}_{\mathcal{F}^{bos}_\mu}\bigl(V(\lambda_1; z_1) \cdots V(\lambda_N; z_N)V(-\alpha_{i_1}; t_1) \cdots V(-\alpha_{i_M}; t_M)\,q^{T^{\phi}[0]}e^{\phi[H;0]}\bigr) \\ &\times \mathrm{Tr}_{\mathcal{F}^{gh}}\bigl(P_1(\gamma(z_1)) \cdots P_N(\gamma(z_N))\,\mathrm{Scr}_{\alpha_{i_1}}(t_1) \cdots \mathrm{Scr}_{\alpha_{i_M}}(t_M)\,q^{T^{gh}[0]}e^{H^{gh}[0]}\bigr)\,dt_1 \wedge \cdots \wedge dt_M, \end{aligned} \tag{3.23}$$
where $T^{\phi}[0]$, $T^{gh}[0]$ and $H^{gh}[0]$ are the zero mode parts of $T^{\phi}(z)$ (1.23), $T^{gh}(z)$ (1.22) and $H^{gh}(z)$ (1.26), respectively. Since the right-hand side of (3.23) splits into the bosonic sector and the ghost sector, we can calculate each part separately.
The computation of the bosonic sector correlation function reduces to the following lemma. Denote the one-loop correlation function of any element $A$ of $\widehat{Bos}(\mathfrak{g})$ (Sect. 1.2) by
$$\langle A \rangle^{bos}_{\mu,q,H} := \mathrm{Tr}_{\mathcal{F}^{bos}_\mu}(A\,q^{T^{\phi}[0]}e^{\phi[H;0]}), \tag{3.24}$$
when $A|\mu\rangle_{bos} \in \mathcal{F}^{bos}_\mu$.

Lemma 3.6. Let $\mu_i$ ($i = 1, \dots, N$) be weights in $\mathfrak{h}$ satisfying $\sum_{i=1}^N \mu_i = 0$. Then the one-loop correlation function of bosonic vertex operators (1.15) is
$$\langle V(\mu_1; z_1) \cdots V(\mu_N; z_N) \rangle^{bos}_{\mu,q,H} = \ell_{\mu_1,\dots,\mu_N,\mu}(z_1, \dots, z_N; q; H) := (q; q)_\infty^{-l}\,q^{\Delta_\mu}e^{(H|\mu)}\Bigl(\prod_{i=1}^{N} (\sqrt{-1}\,\eta(q)^3)^{(\mu_i|\mu_i)/2\kappa}\,z_i^{(\mu_i|2\mu-\mu_i)/2\kappa}\Bigr) \times \prod_{1\leq i<j\leq N} \theta_{11}(z_i/z_j; q)^{(\mu_i|\mu_j)/\kappa}, \tag{3.25}$$
where $\Delta_\mu = (\mu|\mu + 2\rho)/2\kappa$ is the conformal weight and $\eta(q) = q^{1/24}(q; q)_\infty$ is the Dedekind eta function.

The proof is in Appendix 4. This is shown by the standard method of coherent states (cf. for example, [GSW]).

The ghost sector can be computed in a similar way as Proposition 3.2 of [ATY] and Theorem I of [A]. Let us define the ghost sector one-loop correlation function by
$$\langle A \rangle^{gh}_{q,H} := \mathrm{Tr}_{\mathcal{F}^{gh}}(A\,q^{T^{gh}[0]}e^{H^{gh}[0]}), \tag{3.26}$$
for $A \in \widehat{Gh}(\mathfrak{g})$. The important lemma is the following screening current Ward identity.

Lemma 3.7. For any $P_a(x) \in \mathbb{C}[x]$ ($a = 1, \dots, n$), a root $\alpha$ and a sequence of positive roots $\{\alpha(j)\}_{j=1}^m$, we have
$$\begin{aligned} &\langle P_1(\gamma(z_1)) \cdots P_n(\gamma(z_n))\,\mathrm{Scr}_\alpha(t)\,\mathrm{Scr}_{\alpha(1)}(t_1) \cdots \mathrm{Scr}_{\alpha(m)}(t_m) \rangle^{gh}_{q,H} \\ &= \sum_{a=1}^{n} (-w_{\alpha(H)}(t, z_a))\,\langle P_1(\gamma(z_1)) \cdots (\mathrm{Scr}_\alpha P_a)(\gamma(z_a)) \cdots P_n(\gamma(z_n)) \times \mathrm{Scr}_{\alpha(1)}(t_1) \cdots \mathrm{Scr}_{\alpha(m)}(t_m) \rangle^{gh}_{q,H} \\ &\quad + \sum_{j=1}^{m} (-w_{\alpha(H)}(t, t_j))\,f^{\alpha+\alpha(j)}_{\alpha,\alpha(j)} \times \langle P_1(\gamma(z_1)) \cdots P_n(\gamma(z_n))\,\mathrm{Scr}_{\alpha(1)}(t_1) \cdots \mathrm{Scr}_{\alpha+\alpha(j)}(t_j) \cdots \mathrm{Scr}_{\alpha(m)}(t_m) \rangle^{gh}_{q,H}. \end{aligned} \tag{3.27}$$
Here we used the notations in (1.60) and (1.61).
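Proof. We may assume that $|q| < |t_m| < \cdots < |t_1| < |t| < |z_n| < \cdots < |z_1| < 1$. Because of (A.5), the left-hand side of (3.27) is rewritten as follows:
\[
\begin{aligned}
&\langle P_1(\gamma(z_1)) \cdots P_n(\gamma(z_n))\, \mathrm{Scr}_\alpha(t)\, \mathrm{Scr}_{\alpha(1)}(t_1) \cdots \mathrm{Scr}_{\alpha(m)}(t_m) \rangle^{gh}_{q,H} \\
&\quad = \frac{1}{2\pi i} \oint_{\zeta = t} d\zeta\, w_{\alpha(H)}(t, \zeta)\, \langle P_1(\gamma(z_1)) \cdots P_n(\gamma(z_n))\, \mathrm{Scr}_\alpha(\zeta)\, \mathrm{Scr}_{\alpha(1)}(t_1) \cdots \mathrm{Scr}_{\alpha(m)}(t_m) \rangle^{gh}_{q,H} \\
&\quad = \frac{1}{2\pi i} \left( \oint_{|\zeta| = 1} - \sum_{a=1}^{n} \oint_{\zeta = z_a} - \sum_{j=1}^{m} \oint_{\zeta = t_j} - \oint_{|\zeta| = |q|} \right) d\zeta\, w_{\alpha(H)}(t, \zeta) \\
&\qquad \times \langle P_1(\gamma(z_1)) \cdots P_n(\gamma(z_n))\, \mathrm{Scr}_\alpha(\zeta)\, \mathrm{Scr}_{\alpha(1)}(t_1) \cdots \mathrm{Scr}_{\alpha(m)}(t_m) \rangle^{gh}_{q,H}.
\end{aligned} \tag{3.28}
\]
The first integral in (3.28) is equal to
\[
\begin{aligned}
&\frac{1}{2\pi i} \oint_{|\zeta| = 1} d\zeta\, w_{\alpha(H)}(t, \zeta)\, \mathrm{Tr}_{F^{gh}}\bigl(\mathrm{Scr}_\alpha(\zeta)\, P_1(\gamma(z_1)) \cdots P_n(\gamma(z_n))\, \mathrm{Scr}_{\alpha(1)}(t_1) \cdots \mathrm{Scr}_{\alpha(m)}(t_m)\, q^{T^{gh}[0]} e^{H^{gh}[0]}\bigr) \\
&\quad = \frac{1}{2\pi i} \oint_{|\zeta| = 1} d\zeta\, w_{\alpha(H)}(t, \zeta)\, \mathrm{Tr}_{F^{gh}}\bigl(P_1(\gamma(z_1)) \cdots P_n(\gamma(z_n))\, \mathrm{Scr}_{\alpha(1)}(t_1) \cdots \mathrm{Scr}_{\alpha(m)}(t_m)\, q^{T^{gh}[0]} e^{H^{gh}[0]}\, \mathrm{Scr}_\alpha(\zeta)\bigr) \\
&\quad = \frac{1}{2\pi i} \oint_{|\zeta| = 1} d\zeta\, q\, e^{\alpha(H)} w_{\alpha(H)}(t, \zeta)\, \mathrm{Tr}_{F^{gh}}\bigl(P_1(\gamma(z_1)) \cdots P_n(\gamma(z_n))\, \mathrm{Scr}_{\alpha(1)}(t_1) \cdots \mathrm{Scr}_{\alpha(m)}(t_m)\, \mathrm{Scr}_\alpha(q\zeta)\, q^{T^{gh}[0]} e^{H^{gh}[0]}\bigr) \\
&\quad = \frac{1}{2\pi i} \oint_{|\zeta| = |q|} d\zeta\, e^{\alpha(H)} w_{\alpha(H)}(t, q^{-1}\zeta)\, \langle P_1(\gamma(z_1)) \cdots P_n(\gamma(z_n))\, \mathrm{Scr}_{\alpha(1)}(t_1) \cdots \mathrm{Scr}_{\alpha(m)}(t_m)\, \mathrm{Scr}_\alpha(\zeta) \rangle^{gh}_{q,H},
\end{aligned} \tag{3.29}
\]
where we used the following facts derived from (1.58) and (1.59):
\[
e^{H^{gh}[0]}\, \mathrm{Scr}_\alpha(\zeta) = e^{\alpha(H)}\, \mathrm{Scr}_\alpha(\zeta)\, e^{H^{gh}[0]}, \qquad
q^{T^{gh}[0]}\, \mathrm{Scr}_\alpha(\zeta) = q\, \mathrm{Scr}_\alpha(q\zeta)\, q^{T^{gh}[0]}. \tag{3.30}
\]
Therefore the property (A.4) of the function $w_{\alpha(H)}(t, \zeta)$ and (3.29) imply that the first integral and the last integral in (3.28) cancel.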
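Using the operator product expansions (1.60) and (1.61), the second and the third integrals in (3.28) are rewritten as
\[
\begin{aligned}
&\frac{1}{2\pi i} \oint_{\zeta = z_a} d\zeta\, w_{\alpha(H)}(t, \zeta)\, \langle P_1(\gamma(z_1)) \cdots P_n(\gamma(z_n))\, \mathrm{Scr}_\alpha(\zeta)\, \mathrm{Scr}_{\alpha(1)}(t_1) \cdots \mathrm{Scr}_{\alpha(m)}(t_m) \rangle^{gh}_{q,H} \\
&\quad = w_{\alpha(H)}(t, z_a)\, \langle P_1(\gamma(z_1)) \cdots (\mathrm{Scr}_\alpha P_a)(\gamma(z_a)) \cdots P_n(\gamma(z_n))\, \mathrm{Scr}_{\alpha(1)}(t_1) \cdots \mathrm{Scr}_{\alpha(m)}(t_m) \rangle^{gh}_{q,H}, \\
&\frac{1}{2\pi i} \oint_{\zeta = t_j} d\zeta\, w_{\alpha(H)}(t, \zeta)\, \langle P_1(\gamma(z_1)) \cdots P_n(\gamma(z_n))\, \mathrm{Scr}_\alpha(\zeta)\, \mathrm{Scr}_{\alpha(1)}(t_1) \cdots \mathrm{Scr}_{\alpha(m)}(t_m) \rangle^{gh}_{q,H} \\
&\quad = w_{\alpha(H)}(t, t_j)\, f^{\alpha+\alpha(j)}_{\alpha,\alpha(j)}\, \langle P_1(\gamma(z_1)) \cdots P_n(\gamma(z_n))\, \mathrm{Scr}_{\alpha(1)}(t_1) \cdots \mathrm{Scr}_{\alpha+\alpha(j)}(t_j) \cdots \mathrm{Scr}_{\alpha(m)}(t_m) \rangle^{gh}_{q,H},
\end{aligned}
\]
which proves the lemma. $\square$

Lemma 3.8. For any polynomial $P(x) \in \mathbb{C}[x]$ with constant term $c_P$,
\[
\langle P(\gamma(z)) \rangle^{gh}_{q,H} = c_P\, \mathrm{ch}_{F^{gh}}(q, H),
\]
where $\mathrm{ch}_{F^{gh}}(q, H) = \langle 1 \rangle^{gh}_{q,H}$ is the character of the ghost Fock space. An explicit expression of the character is
\[
\mathrm{ch}_{F^{gh}}(q, H) = \prod_{\alpha \in \Delta_+} (e^{-\alpha(H)}; q)_\infty^{-1}\, (q e^{\alpha(H)}; q)_\infty^{-1}. \tag{3.31}
\]

Proof. It is sufficient to prove that
\[
\mathrm{Tr}_{F^{gh}}\Bigl( \prod_{i=1}^{n} \gamma^{\beta_i}[n_i]\; q^{T^{gh}[0]} e^{H^{gh}[0]} \Bigr) = 0, \tag{3.32}
\]
for any $n \in \mathbb{Z}_{>0}$, $\beta_i \in \Delta_+$, $n_i \in \mathbb{Z}$ ($i = 1, \dots, n$). The Fock space $F^{gh}$ has a basis consisting of vectors of the form
\[
\prod_i \gamma^{\alpha(i)}[-m_i] \prod_j \beta_{\alpha'(j)}[-m'_j]\, |0\rangle^{gh}, \tag{3.33}
\]
where $\alpha(i), \alpha'(j) \in \Delta_+$, $m_i \in \mathbb{Z}_{\ge 0}$, $m'_j \in \mathbb{Z}_{>0}$. The action of $T^{gh}[0]$ and $H^{gh}[0]$ is diagonal with respect to this basis. Hence what we must show is that the action of $\prod_{i=1}^{n} \gamma^{\beta_i}[n_i]$ does not have diagonal components with respect to this basis. This can be shown by an elementary method which uses only the commutation relation (1.5).

The character $\mathrm{ch}_{F^{gh}}(q; H)$ is calculated by factorizing the total Fock space into the Fock spaces $F_{\alpha,m}$ generated by $\beta_\alpha[m]$ and $\gamma^\alpha[-m]$:
\[
F^{gh} = \bigotimes_{\alpha \in \Delta_+,\, m \in \mathbb{Z}} F_{\alpha,m}. \tag{3.34}
\]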
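When $m \ge 0$, $F_{\alpha,m} = \mathbb{C}[\gamma^\alpha[-m]]\,|0\rangle^{gh}$, and when $m < 0$, $F_{\alpha,m} = \mathbb{C}[\beta_\alpha[m]]\,|0\rangle^{gh}$. The character of each space is
\[
\mathrm{Tr}_{F_{\alpha,m}}\bigl(q^{T^{gh}[0]} e^{H^{gh}[0]}\bigr) =
\begin{cases}
(1 - q^{m} e^{-\alpha(H)})^{-1}, & m \ge 0, \\
(1 - q^{-m} e^{\alpha(H)})^{-1}, & m < 0,
\end{cases} \tag{3.35}
\]
which follows from the commutation relations
\[
\begin{aligned}
{}[T^{gh}[0], \beta_\alpha[m]] &= -m\, \beta_\alpha[m], & [T^{gh}[0], \gamma^\alpha[m]] &= -m\, \gamma^\alpha[m], \\
[H^{gh}[0], \beta_\alpha[m]] &= \alpha(H)\, \beta_\alpha[m], & [H^{gh}[0], \gamma^\alpha[m]] &= -\alpha(H)\, \gamma^\alpha[m],
\end{aligned}
\]
and $T^{gh}[0]|0\rangle^{gh} = H^{gh}[0]|0\rangle^{gh} = 0$. Multiplying (3.35) over all $\alpha$ and $m$, we obtain (3.31). $\square$

Corollary 3.9. For $P(x) \in \mathbb{C}[x]$ and a sequence of simple roots $\{\alpha_{i(j)}\}_{j=1}^{n}$, we have
\[
\langle (\mathrm{Scr}_{\alpha_{i(1)}} \cdots \mathrm{Scr}_{\alpha_{i(n)}} P)(\gamma(z)) \rangle^{gh}_{q,H} = \mathrm{ch}_{F^{gh}}(q, H)\, (-1)^n \langle E_{i(n)} \cdots E_{i(1)} P \rangle,
\]
where $\langle\,\cdot\,\rangle : M_\lambda^* \cong \mathrm{Wak}^0_{\lambda,k} \to \mathbb{C}$ is the pairing with the highest weight vector of the Verma module $M_\lambda$ of $\mathfrak{g}$ with the highest weight $\lambda$. (See (1.34).) Explicitly written,
\[
\langle E_{i(n)} \cdots E_{i(1)} P \rangle = R(E_{i(n)}) \cdots R(E_{i(1)}) P(x)|_{x=0}, \tag{3.36}
\]
where $R(E_i)$ is the differential operator corresponding to the Chevalley generator $E_i$ given by (1.3). In particular, $\langle E_{i(n)} \cdots E_{i(1)} P \rangle$ does not depend on $\lambda$.

Proof. According to Lemma 3.3 of [ATY], the constant term of $(\mathrm{Scr}_{\alpha_{i(1)}} \cdots \mathrm{Scr}_{\alpha_{i(n)}} P)(z)$ is given by $(-1)^n \langle E_{i(n)} \cdots E_{i(1)} P \rangle = (-1)^n R(E_{i(n)}) \cdots R(E_{i(1)}) P(x)|_{x=0}$. The statement then follows from Lemma 3.8. $\square$

Lemma 3.10. For any $\alpha(i) \in \Delta_+$ ($i = 1, \dots, m$) and $P_a(x) \in \mathbb{C}[x]$ ($a = 1, \dots, n$), we have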
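\[
\frac{\langle P_1(\gamma(z_1)) \cdots P_n(\gamma(z_n))\, \mathrm{Scr}_{\alpha(1)}(t_1) \cdots \mathrm{Scr}_{\alpha(m)}(t_m) \rangle^{gh}_{q,H}}{\mathrm{ch}_{F^{gh}}(q, H)}
= \sum_{I_1 \sqcup \cdots \sqcup I_n = \{1, \dots, m\}}\; \prod_{a=1}^{n} \frac{\langle P_a(\gamma(z_a)) \prod_{i \in I_a} \mathrm{Scr}_{\alpha(i)}(t_i) \rangle^{gh}_{q,H}}{\mathrm{ch}_{F^{gh}}(q, H)}. \tag{3.37}
\]

Proof. This is a purely combinatorial lemma. We can apply the inductive proof of (5.3) in [A], replacing the screening current Ward identity for genus 0 with that for genus 1, (3.27). The first step of the induction (the case $m = 0$) is assured by Lemma 3.8. $\square$

Lemma 3.11. For any $P(x) \in \mathbb{C}[x]$ and roots $\alpha(i)$ ($i = 1, \dots, m$), we have
\[
\begin{aligned}
&\langle P(\gamma(z))\, \mathrm{Scr}_{\alpha(1)}(t_1) \cdots \mathrm{Scr}_{\alpha(m)}(t_m) \rangle^{gh}_{q,H} \\
&\quad = \sum_{\sigma \in S_m} (-w_{\alpha(\sigma(1))}(t_{\sigma(1)}, t_{\sigma(2)}))\, (-w_{\alpha(\sigma(1)) + \alpha(\sigma(2))}(t_{\sigma(2)}, t_{\sigma(3)})) \cdots \\
&\qquad \times (-w_{\alpha(\sigma(1)) + \cdots + \alpha(\sigma(m))}(t_{\sigma(m)}, z))\, \langle (\mathrm{Scr}_{\alpha(\sigma(1))} \cdots \mathrm{Scr}_{\alpha(\sigma(m))} P)(\gamma(z)) \rangle^{gh}_{q,H},
\end{aligned} \tag{3.38}
\]
where we write $w_{\alpha(H)}$ as $w_\alpha$ for short.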
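The index bookkeeping in (3.38) can be made concrete with a small enumeration. The following is only an illustrative sketch: the function name `ward_chain_terms` and the encoding of a root as a tuple of integer coefficients in some basis of simple roots are assumptions made here, not notation from the paper. For each permutation $\sigma$ it lists the $w$-factors as triples (accumulated root, first argument, second argument), together with the overall sign $(-1)^m$.

```python
from itertools import permutations


def ward_chain_terms(roots):
    """List the w-factors of (3.38) for every permutation sigma.

    `roots` is a hypothetical encoding: roots[i-1] is alpha(i), given as a
    tuple of integer coefficients. Each returned item is
    (sigma, sign, chain), where chain holds triples
    (alpha(sigma(1)) + ... + alpha(sigma(k)), "t<sigma(k)>", next point).
    """
    m = len(roots)
    if m == 0:
        # No screening currents: the sum has a single empty term.
        return [((), 1, [])]
    terms = []
    for sigma in permutations(range(1, m + 1)):
        acc = tuple(0 for _ in roots[0])
        chain = []
        for pos, idx in enumerate(sigma):
            # Accumulate alpha(sigma(1)) + ... + alpha(sigma(pos + 1)).
            acc = tuple(a + b for a, b in zip(acc, roots[idx - 1]))
            # The last factor in the chain ends at the point z.
            target = f"t{sigma[pos + 1]}" if pos + 1 < m else "z"
            chain.append((acc, f"t{idx}", target))
        terms.append((sigma, (-1) ** m, chain))
    return terms


if __name__ == "__main__":
    for sigma, sign, chain in ward_chain_terms([(1, 0), (0, 1)]):
        print(sigma, sign, chain)
```

For two roots the sum has $2! = 2$ terms, and in each of them the last factor carries the total root $\alpha(1) + \alpha(2)$ and ends at $z$, as in (3.38).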
Proof. We prove this statement by induction on $m$, as in the proof of (5.4) in [A]. When $m = 0$, the statement is trivial and when $m = 1$, it is nothing but the screening current Ward identity (3.27). Assume that (3.38) holds for all $m \le n$. Let us regard the left and right hand sides of (3.38) for $m = n + 1$ as functions of $t_0$:
\[
F_1(t_0) := \langle P(\gamma(z))\, \mathrm{Scr}_{\alpha(0)}(t_0)\, \mathrm{Scr}_{\alpha(1)}(t_1) \cdots \mathrm{Scr}_{\alpha(n)}(t_n) \rangle^{gh}_{q,H}, \tag{3.39}
\]
\[
\begin{aligned}
F_2(t_0) := \sum_{\sigma \in S_{n+1}} &(-w_{\alpha(\sigma(0))}(t_{\sigma(0)}, t_{\sigma(1)}))\, (-w_{\alpha(\sigma(0)) + \alpha(\sigma(1))}(t_{\sigma(1)}, t_{\sigma(2)})) \cdots \\
&\times (-w_{\alpha(\sigma(0)) + \cdots + \alpha(\sigma(n))}(t_{\sigma(n)}, z))\, \langle (\mathrm{Scr}_{\alpha(\sigma(0))} \cdots \mathrm{Scr}_{\alpha(\sigma(n))} P)(\gamma(z)) \rangle^{gh}_{q,H},
\end{aligned} \tag{3.40}
\]
where $\sigma$ is a permutation, $\sigma : \{0, 1, \dots, n\} \to \{0, 1, \dots, n\}$. We now show that $F_1(t_0) = F_2(t_0)$. First, note that both functions are meromorphic on $\mathbb{C}^\times$ and poles exist at $t_0 = t_i$ ($i = 1, \dots, n$) and at $t_0 = z$.

(i) Both functions have the same quasi-periodicity,
\[
f(q t_0) = e^{-\alpha(0)(H)} q^{-1} f(t_0). \tag{3.41}
\]
In fact, (3.41) is proved for $f = F_1$ similarly to (3.29). It follows from the property (A.4) of the function $w$ that $F_2(t_0)$ also satisfies the same periodicity property (3.41).

(ii) The principal parts of the pole at $t_0 = z$ are equal to
\[
\frac{1}{t_0 - z}\, \langle (\mathrm{Scr}_{\alpha(0)} P)(\gamma(z))\, \mathrm{Scr}_{\alpha(1)}(t_1) \cdots \mathrm{Scr}_{\alpha(n)}(t_n) \rangle^{gh}_{q,H}. \tag{3.42}
\]
For $F_1(t_0)$, this is a direct consequence of the Ward identity (3.27). The pole of $F_2(t_0)$ at $t_0 = z$ comes from terms in (3.40) such that $\sigma(n) = 0$. Using (A.5) and the induction hypothesis, we can show that its principal part is of the form (3.42).

(iii) The principal parts of the pole at $t_0 = t_i$ are equal to
\[
\frac{f^{\alpha(0)+\alpha(i)}_{\alpha(0),\alpha(i)}}{t_0 - t_i}\, \langle P(\gamma(z))\, \mathrm{Scr}_{\alpha(1)}(t_1) \cdots \mathrm{Scr}_{\alpha(0)+\alpha(i)}(t_i) \cdots \mathrm{Scr}_{\alpha(n)}(t_n) \rangle^{gh}_{q,H}. \tag{3.43}
\]
The Ward identity (3.27) implies (3.43) for $F_1(t_0)$. The pole of $F_2(t_0)$ at $t_0 = t_i$ comes from terms in (3.40) such that $(\sigma(0), \sigma(i)) = (j - 1, j)$ or $(\sigma(0), \sigma(i)) = (j, j - 1)$ ($j = 1, \dots, n$). The principal part becomes
\[
\begin{aligned}
\frac{1}{t_0 - t_i} \sum_{\sigma} \sum_{j=1}^{n} &(-w_{\alpha(\sigma(0))}(t_{\sigma(0)}, t_{\sigma(1)})) \cdots (-w_{\alpha(\sigma(0)) + \cdots + \alpha(\sigma(j-2))}(t_{\sigma(j-2)}, t_i)) \\
&\times (-w_{\alpha(\sigma(0)) + \cdots + \alpha(\sigma(j-2)) + \alpha(0) + \alpha(i)}(t_i, t_{\sigma(j+1)})) \cdots \\
&\times \langle (\mathrm{Scr}_{\alpha(\sigma(0))} \cdots [\mathrm{Scr}_{\alpha(0)}, \mathrm{Scr}_{\alpha(i)}] \cdots \mathrm{Scr}_{\alpha(\sigma(n))} P)(z) \rangle^{gh}_{q,H},
\end{aligned} \tag{3.44}
\]
where $\sigma$ runs through the set of permutations $\sigma : \{0, \dots, j - 2, j + 1, \dots, n\} \to \{1, \dots, i - 1, i + 1, \dots, n\}$. Since $[\mathrm{Scr}_{\alpha(0)}, \mathrm{Scr}_{\alpha(i)}] = f^{\alpha(0)+\alpha(i)}_{\alpha(0),\alpha(i)}\, \mathrm{Scr}_{\alpha(0)+\alpha(i)}$, it follows from the induction hypothesis that (3.44) is equal to (3.43).

Comparing $F_1(t_0)$ and $F_2(t_0)$ by (i), (ii) and (iii), we conclude that $F_1(t_0) = F_2(t_0)$. $\square$

Putting together (3.23), Lemma 3.6, Lemma 3.10, Lemma 3.11, Corollary 3.9, and (3.31), we finally obtain the integral representation of a solution of the KZB equations.

Theorem 3.12. The following integral gives a solution of the KZB equations, (2.30), (2.31) with $c_2^{(j)} = (\lambda_j|\lambda_j + 2\rho)$ and (2.33):
\[
\tilde\Psi^0(z; q; H) = \int_{C(z,q,H)} \ell^0_{-\alpha_{i_1}, \dots, -\alpha_{i_M}, \lambda_1, \dots, \lambda_N, \mu}(t; z; q; H)\, \psi^{gh}(t; z; q; H; P_1, \dots, P_N), \tag{3.45}
\]
where $C(z, q, H)$ is a family of $M$-cycles with coefficients in $L^*$,
\[
\ell^0_{\mu_1, \dots, \mu_N, \mu}(z_1, \dots, z_N; q; H) := q^{(\mu+\rho|\mu+\rho)/2\kappa} e^{\mu(H)} \left( \prod_{i=1}^{N} (\sqrt{-1}\,\eta(q)^3)^{(\mu_i|\mu_i)/2\kappa}\, z_i^{(\mu_i|2\mu-\mu_i)/2\kappa} \right) \prod_{1 \le i < j \le N} \theta_{11}(z_i/z_j; q)^{(\mu_i|\mu_j)/\kappa}, \tag{3.46}
\]
and the $M$-form $\psi^{gh}$ is defined as follows: $P_a(x)$ are polynomials in $x$,
\[
\psi^{gh}(t; z; q; H; P_1, \dots, P_N) = e^{\rho(H)} \sum_{I_1 \sqcup \cdots \sqcup I_N = \{1, \dots, M\}}\; \prod_{a=1}^{N} \frac{\langle P_a(\gamma(z_a)) \prod_{j \in I_a} \mathrm{Scr}_{\alpha_{i_j}}(t_j) \rangle^{gh}_{q,H}}{\mathrm{ch}_{F^{gh}}(q, H)}\; dt_1 \wedge \cdots \wedge dt_M, \tag{3.47}
\]
and the last factor in (3.47) for $a$ ($1 \le a \le N$) is
\[
\begin{aligned}
\frac{\langle P(\gamma(z))\, \mathrm{Scr}_{\alpha_{i(1)}}(t_1) \cdots \mathrm{Scr}_{\alpha_{i(m)}}(t_m) \rangle^{gh}_{q,H}}{\mathrm{ch}_{F^{gh}}(q, H)}
= \sum_{\sigma \in S_m} &w_{\alpha_{i(\sigma(1))}}(t_{\sigma(1)}, t_{\sigma(2)})\, w_{\alpha_{i(\sigma(1))} + \alpha_{i(\sigma(2))}}(t_{\sigma(2)}, t_{\sigma(3)}) \cdots \\
&\times w_{\alpha_{i(\sigma(1))} + \cdots + \alpha_{i(\sigma(m))}}(t_{\sigma(m)}, z)\, \langle E_{i(\sigma(m))} \cdots E_{i(\sigma(1))} P \rangle,
\end{aligned} \tag{3.48}
\]
if $\{i_j \mid j \in I_a\} = \{i(1), \dots, i(m)\}$.

This result is an elliptic analogue of [SV1, SV2, ATY, A] and a generalization of a result for $sl(2)$ in [BeF]. Felder and Varchenko have obtained a similar formula in [FV] from a different standpoint.
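The outer sum in (3.47) runs over all ways of distributing the $M$ integration variables among the $N$ insertion points. A minimal sketch of that index set is given below; the function name `ordered_partitions` is hypothetical, and the sketch assumes that empty blocks $I_a$ are allowed (a block with no screening currents contributes $\langle P_a(\gamma(z_a)) \rangle^{gh}_{q,H} / \mathrm{ch}_{F^{gh}}(q, H) = c_{P_a}$ by Lemma 3.8), so that the sum has $N^M$ terms.

```python
from itertools import product


def ordered_partitions(M, N):
    """Yield tuples (I_1, ..., I_N) of disjoint, possibly empty index tuples
    whose union is {1, ..., M}.

    Assumption of this sketch: empty blocks are allowed, which gives
    N**M terms in total.
    """
    for assignment in product(range(N), repeat=M):
        blocks = [[] for _ in range(N)]
        # assignment[j-1] = a means that t_j is attached to the a-th point.
        for j, a in enumerate(assignment, start=1):
            blocks[a].append(j)
        yield tuple(tuple(block) for block in blocks)


if __name__ == "__main__":
    for blocks in ordered_partitions(2, 2):
        print(blocks)
```

Each block $I_a$ then selects the screening currents that enter the corresponding factor (3.48).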
4. Concluding Remarks

We found an integral representation of $N$-point functions of the WZW model on elliptic curves and gave an explicit expression for a solution of the KZB equations, using the Wakimoto realization. Let us list some of the related problems.

1. Higher genus: Are there similar integral representations of correlation functions of the Wess–Zumino–Witten models on higher genus Riemann surfaces? There are several works in this direction [GMMOS, Ko]. Their formulations are, however, different from ours.
2. Twisted Wess–Zumino–Witten models: In [KT] we formulated "another" Wess–Zumino–Witten model on elliptic curves which we named a "twisted WZW model". Is it possible to give an integral representation of solutions of the KZ type equations for the correlation functions?
3. Critical level: Feigin, Frenkel and Reshetikhin [FFR] found that the Bethe vector of a certain spin chain model is obtained from the Wakimoto realization of the Wess–Zumino–Witten model on the Riemann sphere at the critical level. How about the genus one case?

We shall study the last question in a forthcoming paper. In fact, Felder and Varchenko [FV] have found a relation of their integral representation of solutions of the KZB equations with a solution of a quantum $N$-body system. We shall take the conformal field theoretical approach to this problem.

Acknowledgements. TT is supported by a grant-in-aid of the Ministry of Education and Sciences of Japan, No. 09740009. The authors express their gratitude to Edward Frenkel, Takeshi Ikeda, Akishi Kato, Hitoshi Konno, Atsushi Matsuo, Alexei Morozov, Takeshi Suzuki and Yasuhiko Yamada for comments and discussions.
Appendix A. Theta Functions

We denote the theta function with characteristic $(1/2, 1/2)$ (cf. Chapter I of [Mu]) additively as
\[
\theta_{11}(x; \tau) = \sum_{n \in \mathbb{Z}} e^{(n + 1/2)^2 \pi i \tau + 2\pi i (n + 1/2)(x + 1/2)} \tag{A.1}
\]
and multiplicatively as
\[
\theta_{11}(z; q) = \theta_{11}(x; \tau), \tag{A.2}
\]
where $z = \exp(2\pi i x)$ and $q = \exp(2\pi i \tau)$. The infinite product expansion
\[
\theta_{11}(z; q) = i\, (q; q)_\infty\, q^{1/8} z^{1/2} (z^{-1}; q)_\infty (qz; q)_\infty \tag{A.3}
\]
is also useful, where $(x; q)_\infty = \prod_{n=0}^{\infty} (1 - x q^n)$.

We use a function $w_c(w, z)$ on $\mathbb{C}^\times \times \mathbb{C}^\times$ with parameter $c \in \mathbb{C}^\times$ characterized by the following properties:

1. $w_c(w, z)$ is a meromorphic function of $z$ and $w$.
2. $w_c(w, z)$ has the following (quasi-)periodicity:
\[
w_c(w, qz) = e^{c} w_c(w, z), \qquad w_c(qw, z) = q^{-1} e^{-c} w_c(w, z). \tag{A.4}
\]
3. $w_c(w, z)$ has only one simple pole on the elliptic curve $\mathbb{C}^\times / q^{\mathbb{Z}}$ at $z = w$ as a function of $z$. Its Laurent expansion around $z = w$ is:
\[
w_c(w, z) = \frac{1}{z - w} + \text{regular}. \tag{A.5}
\]

An explicit form of $w_c(w, z)$ is as follows:
\[
w_c(w, z) = \frac{\theta'_{11}(1; q)\, \theta_{11}(e^{-c} z / w; q)}{\theta_{11}(e^{-c}; q)\, w\, \theta_{11}(z / w; q)}, \tag{A.6}
\]
where $\theta'_{11}(z; q) = \frac{d}{dz} \theta_{11}(z; q)$.

To write down the Knizhnik–Zamolodchikov–Bernard equations, we need the following functions:
\[
\sigma_c(z) := \frac{\theta'_{11}(1; q)\, \theta_{11}(e^{-c} z; q)}{\theta_{11}(e^{-c}; q)\, \theta_{11}(z; q)}, \tag{A.7}
\]
\[
\zeta(z) := z\, \frac{\theta'_{11}(z; q)}{\theta_{11}(z; q)}. \tag{A.8}
\]
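The formulas of this appendix lend themselves to a quick numerical sanity check. The sketch below is not part of the paper: it truncates the series (A.1) and the product (A.3), and tests the properties (A.4) and (A.5) for the explicit form (A.6). All function names are hypothetical. To avoid branch ambiguities, points are parametrized additively, $w = e^{2\pi i u}$, $z = e^{2\pi i v}$, and the square roots are fixed as $z^{1/2} = e^{\pi i x}$, $q^{1/8} = e^{\pi i \tau / 4}$.

```python
import cmath

TWO_PI_I = 2j * cmath.pi


def theta11(x, tau, terms=12):
    """Truncated series (A.1) in the additive variable x."""
    return sum(
        cmath.exp(cmath.pi * 1j * ((n + 0.5) ** 2 * tau + 2 * (n + 0.5) * (x + 0.5)))
        for n in range(-terms, terms)
    )


def qpoch(a, q, terms=60):
    """Truncation of (a; q)_infinity = prod_{n >= 0} (1 - a q^n)."""
    result, power = 1 + 0j, 1 + 0j
    for _ in range(terms):
        result *= 1 - a * power
        power *= q
    return result


def theta11_product(x, tau):
    """Product formula (A.3) with z^(1/2) = exp(pi i x), q^(1/8) = exp(pi i tau / 4)."""
    z, q = cmath.exp(TWO_PI_I * x), cmath.exp(TWO_PI_I * tau)
    return (1j * qpoch(q, q) * cmath.exp(1j * cmath.pi * tau / 4)
            * cmath.exp(1j * cmath.pi * x) * qpoch(1 / z, q) * qpoch(q * z, q))


def dtheta11_dz_at_1(tau, terms=12):
    """theta_11'(1; q): d/dz at z = 1, using d/dz = (2 pi i z)^(-1) d/dx on (A.1)."""
    return sum(
        (n + 0.5) * cmath.exp(cmath.pi * 1j * ((n + 0.5) ** 2 * tau + (n + 0.5)))
        for n in range(-terms, terms)
    )


def w_c(c, u, v, tau):
    """Explicit form (A.6) with w = exp(2 pi i u) and z = exp(2 pi i v)."""
    shift = c / TWO_PI_I  # e^{-c} = exp(2 pi i (-shift))
    w_point = cmath.exp(TWO_PI_I * u)
    return (dtheta11_dz_at_1(tau) * theta11(v - u - shift, tau)
            / (theta11(-shift, tau) * w_point * theta11(v - u, tau)))


if __name__ == "__main__":
    tau, c, u, v = 0.15 + 1.0j, 0.4 + 0.3j, 0.11 + 0.05j, 0.37 - 0.08j
    print(abs(theta11(0.3 + 0.1j, tau) - theta11_product(0.3 + 0.1j, tau)))
    print(w_c(c, u, v + tau, tau) / w_c(c, u, v, tau), cmath.exp(c))
```

Shifting $v$ by $\tau$ corresponds to $z \to qz$ and shifting $u$ by $\tau$ to $w \to qw$, so the two ratios can be compared directly with the multipliers in (A.4).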
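Appendix B. Method of Coherent States and One-Loop Correlation Functions

In this appendix we review the method of coherent states, following Chapters 7.A and 8.1 of [GSW], and compute one-loop correlation functions of vertex operators of the free bosons, which proves Lemma 3.6.

Let $a$ and $a^\dagger$ be generators of a Heisenberg algebra $\mathcal{H}$:
\[
[a, a^\dagger] = 1, \tag{B.1}
\]
and let $|0\rangle$ and $\langle 0|$ be the generating vectors of the Fock space representation of $\mathcal{H}$ and of its (restricted) dual, respectively:
\[
F := \mathcal{H}|0\rangle, \qquad a|0\rangle = 0, \tag{B.2}
\]
\[
F^* := \langle 0|\mathcal{H}, \qquad \langle 0|a^\dagger = 0. \tag{B.3}
\]
There are natural bases $\{|n\rangle\}_{n \in \mathbb{N}}$ of $F$ and $\{\langle n|\}_{n \in \mathbb{N}}$ of $F^*$, consisting of eigenvectors of the number counting operator $N_a = a^\dagger a$:
\[
|n\rangle = \frac{(a^\dagger)^n}{\sqrt{n!}} |0\rangle, \quad \langle n| = \langle 0| \frac{a^n}{\sqrt{n!}}, \quad N_a |n\rangle = n |n\rangle, \quad \langle n| N_a = \langle n| n, \quad \langle m|n\rangle = \delta_{mn}. \tag{B.4}
\]
The coherent states are defined by
\[
|\lambda) := \exp(\lambda a^\dagger)|0\rangle = \sum_{n=0}^{\infty} \frac{\lambda^n}{\sqrt{n!}} |n\rangle, \qquad
(\lambda| := \langle 0| \exp(\bar\lambda a) = \sum_{n=0}^{\infty} \langle n| \frac{\bar\lambda^n}{\sqrt{n!}}, \tag{B.5}
\]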
for $\lambda \in \mathbb{C}$. Here $\bar\lambda$ is the complex conjugate of $\lambda$. In particular, $|0) = |0\rangle$ and $(0| = \langle 0|$. They have the following properties:
\[
a|\lambda) = \lambda|\lambda), \qquad (\lambda|a^\dagger = (\lambda|\bar\lambda, \tag{B.6}
\]
\[
(\mu|\lambda) = e^{\bar\mu \lambda}, \tag{B.7}
\]
\[
q^{N_a}|\lambda) = |q\lambda), \tag{B.8}
\]
where $q \in \mathbb{C}^\times$. The trace of an operator $A \in \mathrm{End}_{\mathbb{C}}(F)$ is computed by the following integral:
\[
\mathrm{Tr}_F(A) = \frac{1}{\pi} \int_{\mathbb{C}} d^2\lambda\; e^{-|\lambda|^2} (\lambda|A|\lambda). \tag{B.9}
\]
Using these formulae, we calculate the one-loop correlation function (3.24) of vertex operators of the free boson fields $V(\mu_i; z)$ (1.15):
\[
\langle V(\mu_1; z_1) \cdots V(\mu_N; z_N) \rangle^{bos}_{\mu,q,H} = \mathrm{Tr}_{F^{bos}_\mu}\bigl(V(\mu_1; z_1) \cdots V(\mu_N; z_N)\, q^{T^\phi[0]} e^{\phi[H;0]}\bigr), \tag{B.10}
\]
where $\sum_{i=1}^{N} \mu_i = 0$. Let us fix an orthonormal basis $\{H_r\}_{r=1}^{l}$ of $\mathfrak{h}$. The boson Fock space $F^{bos}_\mu$ is factorized as
\[
F^{bos}_\mu = F_{0,\mu} \otimes \bigotimes_{r=1}^{l} \bigotimes_{n=1}^{\infty} F_{r,n}, \tag{B.11}
\]
where $F_{0,\mu}$ is the zero-mode space $\mathbb{C}|\mu\rangle^{bos}$ and $F_{r,n}$ is a non-zero-mode Fock space generated by $\phi_r[\pm n] := \phi[H_r; \pm n]$. Note that $\phi_r[\pm m]$ and $\phi_{r'}[\pm n]$ ($m, n \in \mathbb{Z}_{>0}$, $r, r' = 1, \dots, l$) commute with each other unless $r = r'$ and $m = n$. Hence, to compute the value of (3.24), we have to compute the trace of $V(\mu_1; z_1) \cdots V(\mu_N; z_N)\, q^{T^\phi[0]} e^{\phi[H;0]}$ over $F_{0,\mu}$ and $F_{r,n}$ and multiply all of them. The vertex operators $V(\mu_i; z_i)$ are factorized into the product of the zero mode part $V_0(\mu_i; z_i)$ (1.18) and the non-zero mode part $\tilde V(\mu_i; z_i)$ (1.17), while the operator $T^\phi(z)$ is decomposed into the sum of the zero mode and the non-zero mode parts as
\[
T^\phi(z) = T^\phi_0(z) + \tilde T^\phi(z), \tag{B.12}
\]
\[
T^\phi_0(z) := \frac{1}{2\kappa} \sum_{i=1}^{l} \phi[H_i; 0]\, \phi[H^i; 0] + \frac{1}{2\kappa} \phi[2\rho; 0], \tag{B.13}
\]
\[
\tilde T^\phi(z) := \frac{1}{\kappa} \sum_{r=1}^{l} \sum_{n=1}^{\infty} \phi_r[-n]\, \phi_r[n]. \tag{B.14}
\]

(I) Zero-mode: It is easy to see that the zero mode part of $V(\mu_1; z_1) \cdots V(\mu_N; z_N)\, q^{T^\phi[0]} e^{\phi[H;0]}$
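acts on $|\mu\rangle^{bos}$ as
\[
V_0(\mu_1; z_1) \cdots V_0(\mu_N; z_N)\, q^{T^\phi_0[0]} e^{\phi[H;0]} |\mu\rangle^{bos}
= \prod_{1 \le i < j \le N} z_i^{(\mu_i|\mu_j)/\kappa} \prod_{i=1}^{N} z_i^{(\mu_i|\mu)/\kappa}\; q^{(\mu+2\rho|\mu)/2\kappa} e^{\mu(H)} |\mu\rangle^{bos}. \tag{B.15}
\]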
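(II) Non-zero-mode: The algebra $\mathcal{H}_{r,n}$ generated by $\phi_r[\pm n]$ is isomorphic to $\mathcal{H}$ through the isomorphism defined by
\[
a = \frac{\phi_r[n]}{\sqrt{\kappa n}}, \qquad a^\dagger = \frac{\phi_r[-n]}{\sqrt{\kappa n}}.
\]
The $\mathcal{H}_{r,n}$ part of $V(\mu_1; z_1) \cdots V(\mu_N; z_N)\, q^{T^\phi[0]} e^{\phi[H;0]}$ is
\[
\exp\!\left( \frac{\mu_1^r}{\kappa} \frac{z_1^{n}}{n} \phi_r[-n] \right) \exp\!\left( \frac{\mu_1^r}{\kappa} \frac{z_1^{-n}}{-n} \phi_r[n] \right) \cdots
\exp\!\left( \frac{\mu_N^r}{\kappa} \frac{z_N^{n}}{n} \phi_r[-n] \right) \exp\!\left( \frac{\mu_N^r}{\kappa} \frac{z_N^{-n}}{-n} \phi_r[n] \right) q^{\phi_r[-n]\phi_r[n]/\kappa},
\]
where $\mu_i = \sum_{r=1}^{l} \mu_i^r H_r$. Its trace over $F_{r,n}$ is computed by means of the formula (B.9). The result is
\[
\frac{1}{1 - q^n} \cdot \exp\!\left( \frac{-1}{\kappa n (1 - q^n)} \sum_{1 \le i < j \le N} \mu_i^r \mu_j^r \left( \frac{z_j}{z_i} \right)^{n} + \frac{-q^n}{\kappa n (1 - q^n)} \sum_{1 \le i \le j \le N} \mu_i^r \mu_j^r \left( \frac{z_i}{z_j} \right)^{n} \right). \tag{B.16}
\]
Multiplying (B.16) for all $r$ and $n$, we have
\[
\mathrm{Tr}_{\bigotimes_{r,n} F_{r,n}}\bigl(\tilde V(\mu_1; z_1) \cdots \tilde V(\mu_N; z_N)\, q^{\tilde T^\phi[0]}\bigr)
= (q; q)_\infty^{-l} \prod_{i=1}^{N} (q; q)_\infty^{(\mu_i|\mu_i)/\kappa} \prod_{1 \le i < j \le N} \bigl( (z_j/z_i; q)_\infty (q z_i/z_j; q)_\infty \bigr)^{(\mu_i|\mu_j)/\kappa}. \tag{B.17}
\]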
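The single-mode trace underlying (B.16) can be checked numerically. With (B.6)-(B.8), formula (B.9) reduces $\mathrm{Tr}_F(e^{A a^\dagger} e^{B a} q^{N_a})$ to a Gaussian integral with value $(1 - q)^{-1} \exp(ABq/(1 - q))$; this closed form is a standard computation stated here as an aid and is not quoted from the paper. The sketch below compares it with a direct sum over number states, using $\langle n| e^{A a^\dagger} e^{B a} |n\rangle = \sum_k \binom{n}{k} (AB)^k / k!$. The function names and the truncation parameter `n_max` are hypothetical.

```python
import cmath
from math import comb, factorial


def diag_element(n, A, B):
    """<n| exp(A a^dagger) exp(B a) |n> = sum_k C(n, k) (A B)^k / k!."""
    return sum(comb(n, k) * (A * B) ** k / factorial(k) for k in range(n + 1))


def trace_truncated(A, B, q, n_max=120):
    """Tr(exp(A a^dagger) exp(B a) q^{N_a}), summed over number states n <= n_max."""
    return sum(q ** n * diag_element(n, A, B) for n in range(n_max + 1))


def trace_closed_form(A, B, q):
    """Value of the Gaussian integral obtained from (B.9) with (B.6)-(B.8)."""
    return cmath.exp(A * B * q / (1 - q)) / (1 - q)


if __name__ == "__main__":
    A, B, q = 0.7 + 0.2j, -0.4 + 0.5j, 0.3
    print(trace_truncated(A, B, q), trace_closed_form(A, B, q))
```

In (B.16) the role of $q$ is played by $q^n$, since $q^{\phi_r[-n]\phi_r[n]/\kappa} = q^{n a^\dagger a}$, and $A$, $B$ are the sums over $i$ of the coefficients of $a^\dagger$ and $a$ in the exponentials above; the remaining factor with $\sum_{i<j}$ arises from normal ordering the product.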
Putting (B.15) and (B.17) together, we obtain the final result (3.25), using the infinite product expansion of the theta function (A.3). This completes the proof of Lemma 3.6.

References

[A] Awata, H.: Screening currents Ward identity and integral formulas for the WZNW correlation functions. In: Recent developments in string and field theory (Kyoto, 1991). Progr. Theor. Phys. Suppl. 110, 303–319 (1992)
[AK] Aomoto, K., Kita, M.: Theory of hypergeometric functions (in Japanese). Tokyo: Springer-Verlag, 1994
[ATY] Awata, H., Tsuchiya, A., Yamada, Y.: Integral formulas for the WZNW correlation functions. Nucl. Phys. B365, 680–698 (1991)
[B1] Bernard, D.: On the Wess–Zumino–Witten models on the torus. Nucl. Phys. B303, 77–93 (1988)
[B2] Bernard, D.: On the Wess–Zumino–Witten models on Riemann surfaces. Nucl. Phys. B309, 145–174 (1988)
[BaF] Babujian, H. M., Flume, R.: Off-shell Bethe Ansatz equation for Gaudin magnets and solutions of Knizhnik–Zamolodchikov equations. Mod. Phys. Lett. A9, 2029–2040 (1994)
[BeF] Bernard, D., Felder, G.: Fock representations and BRST cohomology in SL(2) current algebra. Commun. Math. Phys. 127, 145–168 (1990)
[BLP] Babujian, H., Lima-Santos, A., Poghossian, R. H.: Knizhnik–Zamolodchikov–Bernard equations connected with the eight-vertex model. Preprint UFSCAR-98-04, solv-int/9804015 (1998)
[Ch] Cherednik, I.: Integral solutions of trigonometric Knizhnik–Zamolodchikov equations and Kac–Moody algebras. Publ. Res. Inst. Math. Sci. 27, 727–744 (1991)
[DJMM] Date, E., Jimbo, M., Matsuo, A., Miwa, T.: Hypergeometric type integrals and the sl(2, C) Knizhnik–Zamolodchikov equation. Int. J. Mod. Phys. B4, 1049–1057 (1990)
[F] Felder, G.: The KZB equations on Riemann surfaces. In: Symétries quantiques (Les Houches, 1995). Amsterdam: North-Holland, 1998, pp. 687–725
[FF1] Feigin, B., Frenkel, E.: A family of representations of affine Lie algebras. (Russian) Uspekhi Mat. Nauk 43-5, 227–228 (1988); (English transl.) Russ. Math. Surv. 43-5, 221–222 (1988)
[FF2] Feigin, B., Frenkel, E.: Affine Kac–Moody algebras and semi-infinite flag manifolds. Commun. Math. Phys. 128, 161–189 (1990)
[FFR] Feigin, B., Frenkel, E., Reshetikhin, N.: Gaudin model, Bethe Ansatz and critical level. Commun. Math. Phys. 166, 27–62 (1994)
[FV] Felder, G., Varchenko, A.: Integral representation of solutions of the elliptic Knizhnik–Zamolodchikov–Bernard equations. Internat. Math. Res. Notices 5, 221–233 (1995)
[FW] Felder, G., Wieczerkowski, C.: Conformal blocks on elliptic curves and the Knizhnik–Zamolodchikov–Bernard equations. Commun. Math. Phys. 176, 133–162 (1996)
[GMMOS] Gerasimov, A., Marshakov, A., Morozov, A., Olshanetskii, M., Shatashvili, S.: Wess–Zumino–Witten model as a theory of free fields. Int. J. Mod. Phys. A5, 2495–2590 (1990)
[GSW] Green, M. B., Schwarz, J. H., Witten, E.: Superstring theory, vol. 1, 2. Cambridge Monographs on Mathematical Physics. Cambridge–New York: Cambridge University Press, 1987
[Ko] Konno, H.: A construction of screened multiloop operator for SU(2)_k Kac–Moody algebra. Phys. Rev. D45, 4555–4568 (1992)
[Ku] Kuroki, G.: Fock space representations of affine Lie algebras and integral representations in the Wess–Zumino–Witten models. Commun. Math. Phys. 142, 511–542 (1991)
[KT] Kuroki, G., Takebe, T.: Twisted Wess–Zumino–Witten models on elliptic curves. Commun. Math. Phys. 190, 1–56 (1997)
[KZ] Knizhnik, V. G., Zamolodchikov, A. B.: Current algebra and Wess–Zumino model in two dimensions. Nucl. Phys. B247, 83–103 (1984)
[Mar] Marshakov, A. V.: Bosonization and calculation of correlation functions in the Wess–Zumino–Witten model. JETP Lett. 49, 419–423 (1989)
[Mat] Matsuo, A.: An application of Aomoto–Gelfand hypergeometric functions to the SU(N) Knizhnik–Zamolodchikov equation. Commun. Math. Phys. 134, 65–78 (1990)
[Mu] Mumford, D.: Tata Lectures on Theta I. Basel–Boston: Birkhäuser, 1982
[RV] Reshetikhin, N., Varchenko, A.: Quasiclassical asymptotics of solutions to the KZ equations. In: Geometry, topology and physics, Conf. Proc. Lecture Notes Geom. Topology, VI. Cambridge: Internat. Press, 1995, pp. 293–322
[S] Suzuki, T.: Differential equations associated to the SU(2) WZNW model on elliptic curves. Publ. Res. Inst. Math. Sci. 32, 207–233 (1996)
[SV1] Schechtman, V. V., Varchenko, A. N.: Hypergeometric solutions of Knizhnik–Zamolodchikov equations. Lett. Math. Phys. 20, 279–283 (1990)
[SV2] Schechtman, V. V., Varchenko, A. N.: Arrangements of hyperplanes and Lie algebra homology. Invent. Math. 106, 139–194 (1991)
[TK] Tsuchiya, A., Kanie, Y.: Vertex operators in conformal field theory on P^1 and monodromy representations of braid group. In: Conformal field theory and solvable lattice models (Kyoto, 1986), Adv. Stud. Pure Math. 16, 297–372 (1988); Errata. In: Integrable systems in quantum field theory and statistical mechanics, Adv. Stud. Pure Math. 19, 675–682 (1989)
[TUY] Tsuchiya, A., Ueno, K., Yamada, Y.: Conformal field theory on universal family of stable curves with gauge symmetries. In: Integrable systems in quantum field theory and statistical mechanics, Adv. Stud. Pure Math. 19, 459–566 (1989)
[W] Wakimoto, M.: Fock representations of the affine Lie algebra $A_1^{(1)}$. Commun. Math. Phys. 104, 605–609 (1986)

Communicated by G. Felder
Commun. Math. Phys. 204, 619 – 649 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
The Boltzmann Equation for a One-Dimensional Quantum Lorentz Gas*

R. Esposito¹, M. Pulvirenti², A. Teta²

¹ Dipartimento di Matematica pura ed Applicata, L'Aquila, Italy
² Dipartimento di Matematica, Roma "La Sapienza", Rome, Italy

Received: 2 April 1998 / Accepted: 12 February 1999

Abstract: We study the macroscopic behavior of a quantum particle under the action of randomly distributed scatterers on the real line. Each scatterer generates a δ-potential. We prove that, in the low density limit, the Wigner function of the system converges to a probability distribution satisfying a classical linear Boltzmann equation, with a scattering cross section computed according to the Quantum Mechanical rules.

1. Introduction

The kinetic theory of quantum particles is of interest in a large variety of situations in Statistical Physics and in the applications. However, not too much is known about a logically well founded and mathematically consistent approach explaining how it is possible to recover quantum transport equations from the microscopic dynamics. In particular it is certainly interesting, both from a conceptual as well as from a practical point of view, to understand whether and how quantum macroscopic effects arise in the kinetic description of physical systems and to compare it with the classical theory. We direct the reader to Ref. [Sp 1] and references quoted therein, for a discussion concerning the Quantum Boltzmann equation.

In this paper we want to approach what we consider the simplest problem, namely the motion of a one-dimensional quantum particle under the action of a random distribution of obstacles. We prove that, under a suitable scaling limit, the Wigner function associated to the dynamics of this particle converges to the solution of the simple transport equation:
\[
(\partial_t + v \partial_x) f(t; x, v) = \lambda(|v|) \{ f(x, -v) - f(x, v) \}, \tag{1.1}
\]
where $\lambda(|v|)$ is a suitable real function. At this stage all the memory of the microscopic quantum model is summarized by $\lambda(|v|)$.

* Work partially supported by GNFM-CNR, MURST and CNR contract n. 96.03850.01.
Let us give some more details of the model. Consider the Schrödinger equation:
\[
i \partial_t \psi(x, t) = -\frac{1}{2} \partial_x^2 \psi(x, t) + \alpha \sum_{c \in \mathbf{c}} V(x - c)\, \psi(x, t), \tag{1.2}
\]
describing the motion of a quantum particle in a distribution $\mathbf{c} = \{c_1, \dots, c_N, \dots\}$ of scatterers. $V = V(x)$ is a given positive potential of compact support ($V(x) = 0$ if $|x| > R > 0$) and $\alpha$ is a positive parameter. We assume the Planck constant, the mass of the particle and $\int dx\, V(x)$ equal to 1. The scatterers are not overlapping in the sense that if $c_i, c_j \in \mathbf{c}$, with $i \ne j$, then $|c_i - c_j| > R$. We now pass from the microscopic variables $(x, t)$ to the macroscopic ones: $r = \varepsilon x$, $\tau = \varepsilon t$, where $\varepsilon > 0$ is a small scale parameter. Defining
\[
\psi_\varepsilon(r, \tau) = \frac{1}{\sqrt{\varepsilon}}\, \psi(\varepsilon^{-1} r, \varepsilon^{-1} \tau),
\]
Eq. (1.2) becomes
\[
i \varepsilon \partial_\tau \psi_\varepsilon(r, \tau) = -\frac{\varepsilon^2}{2} \partial_r^2 \psi_\varepsilon(r, \tau) + \varepsilon \alpha \sum_{c \in \mathbf{c}_\varepsilon} V_\varepsilon(r - c)\, \psi_\varepsilon(r, \tau). \tag{1.3}
\]
Here
\[
V_\varepsilon(r) = \frac{1}{\varepsilon} V\!\left( \frac{r}{\varepsilon} \right)
\]
is the rescaled potential and $\mathbf{c}_\varepsilon = \{\varepsilon c_i\}$ denotes the rescaled scatterer configuration.

Note now that, although the Planck constant has not been scaled, the simple change of variables leads us to a semiclassical limit problem. However, the potential we are dealing with is varying on a scale $\varepsilon$ so that we do not expect a regular classical limit. As shown by F. Nier (see [N1, N2]), it is possible to characterize this kind of limiting behavior (as $\varepsilon \to 0$) in terms of the Wigner function of the system and the result is that the Wigner function is going to concentrate on classical paths which, as we shall discuss later on, have a stochastic nature. This point is, in our opinion, of great interest and it is the heart of the macroscopic effect exhibited by the transport equation as a consequence of the microscopic tunneling.

In analogy with the classical case (see Refs. [G1], [G2, Sp2, BBS, BGW, DP] for a rigorous treatment of the Lorentz gas) the scatterers are assumed stochastically distributed. If they have a typical distance of order one at the microscopic level, the macroscopic density turns out to be of order $\varepsilon^{-1}$. This prevents the quantum particle from performing a free flight for most of the time as it does in transport problems. Therefore we dilute the scatterer gas by assuming a density $\mu_\varepsilon$ going to infinity as $\varepsilon \to 0$ at a suitable rate, much slower than $\varepsilon^{-1}$, as we shall specify later on. We only remark here that in dimension $d > 1$ the number of particles in a macroscopic box, in the kinetic limit, is assumed of order $\varepsilon^{-(d-1)}$. In dimension one such an assumption is not sufficient because the number of particles per unit macroscopic volume has to diverge in order to use the law of large numbers. On the other hand, the "amount of scattering" that a single particle suffers in its evolution has to be finite. For this purpose, the interaction has to be suitably scaled so that the effect of a single collision is of order $\mu_\varepsilon^{-1}$.

Such considerations do not single out a specific rate for $\mu_\varepsilon$: any rate slower than $\varepsilon^{-1}$ would fulfill our requirements. We will see however that a $\mu_\varepsilon$ diverging very slowly is technically convenient. We do not know whether this assumption is also necessary to obtain the result. One can also argue that this choice in dimension one is closer to the usual notion of low density limit, while faster rates for $\mu_\varepsilon$ are closer to the notion of weak coupling limit.

One of the most interesting features of our result is that it relies on the tunnel effect, and hence it has no classical counterpart. This is due to the one-dimensionality of the model. In fact a classical one-dimensional Lorentz gas does not have a meaningful kinetic behavior, described by a PDE of the type (1.1). This is probably the most remarkable difference between classical and quantum transport phenomena. On the other hand, in two or more dimensions a low density limit (see [Sp1]) makes sense for both classical and quantum systems. We expect that a procedure analogous to the one presented in this paper will give the linear Boltzmann equation with a scattering cross section computed by the quantum rules. In this context we quote the recent papers [EY1, EY2] where convergence results in the low density and weak coupling limits are presented for the quantum Lorentz gas. We observe that our present result is complementary to the ones in [EY1, EY2] as regards both the range of validity and the approach. We also mention that the related problem of the weak coupling limit has been studied in [Sp 3] and [HLW]. In such papers a linear Boltzmann equation has been derived, for macroscopic short times, starting from a quantum particle under the action of a random potential distributed according to a Gaussian law. We also direct the reader to the review paper [Sp 4] in which Markovian limits for Quantum systems are discussed.

Let us now analyze the scattering mechanism more closely. In the limit we are considering, the action of the obstacles is essentially that of a sum of δ-type potentials (centered at $c \in \mathbf{c}_\varepsilon$), all of them with the same intensity $\varepsilon \alpha$. As mentioned before, in order to have a non-trivial limit, we need to scale $\alpha$ in such a way that $\alpha$ is of order $\mu_\varepsilon^{-1/2}$. In this way the quantum particle has a vanishing probability to be reflected (of order $\mu_\varepsilon^{-1}$, as follows by the asymptotic analysis of a single collision) and, due to the infinite set of obstacles met in a finite macroscopic time (of order $\mu_\varepsilon$), it has a finite probability to be reflected somewhere. Such a probability is computed by quantum rules and gives rise to the jump rate $\lambda(|v|)$ in front of the collision term of the linear kinetic equation (1.1) we want to derive.

To make things easier we shall consider a model of scatterers generating a δ-type potential from the very beginning. This allows us to have an explicit analysis of the single scattering problem. The true evolution of the model is approximated by a simpler evolution where, in each given interval between two successive collisions, the only active scatterer is the one the particle would collide with, if it were subject to the classical motion. The wave function so obtained turns out to be close to a stochastic evolution where the particle proceeds freely and, when it meets an obstacle, is scattered with complex amplitude $B$ and transmitted with amplitude $1 - B$. By iterating such a procedure we construct an evolution that we call semi-classical dynamics. This is indeed the main tool of our method. It is still a quantum evolution in its nature, based, as it is, on a complex wave function rather than on a density on the phase space. The corresponding Wigner function differs in fact from a density distribution on the phase space because of the presence of interference terms. Thanks to the assumed randomness of the scatterers, the interference terms give a negligible (in the limit $\varepsilon \to 0$) contribution, and the limiting density distribution is shown to solve (1.1). The control of the interference terms is the only point in which we really need a slow growth of $\mu_\varepsilon$.
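The order-of-magnitude claim about a single collision can be illustrated with the textbook scattering problem for one δ-scatterer in the microscopic units of (1.2) (Planck constant and mass equal to 1). The sketch below is not taken from the paper: it assumes the stationary equation $-\frac{1}{2}\psi'' + \alpha\,\delta(x)\psi = \frac{k^2}{2}\psi$ with an incoming wave $e^{ikx}$ from the left, and the function name `delta_amplitudes` is hypothetical. The sign and phase conventions of the amplitude $B$ used in the text may differ from those of `r` below.

```python
def delta_amplitudes(k, alpha):
    """Reflection and transmission amplitudes (r, t) for a single delta scatterer.

    Assumed setup: -(1/2) psi'' + alpha * delta(x) psi = (k**2 / 2) psi,
    psi = exp(ikx) + r exp(-ikx) for x < 0 and psi = t exp(ikx) for x > 0.
    Continuity gives 1 + r = t; the jump psi'(0+) - psi'(0-) = 2 alpha psi(0)
    gives t = k / (k + i alpha).
    """
    t = k / (k + 1j * alpha)
    r = t - 1
    return r, t


def reflection_probability(k, alpha):
    """|r|^2 = alpha^2 / (k^2 + alpha^2)."""
    r, _ = delta_amplitudes(k, alpha)
    return abs(r) ** 2


if __name__ == "__main__":
    k = 2.0
    for mu in (1e2, 1e4, 1e6):
        alpha = mu ** -0.5
        # mu * |r|^2 approaches 1 / k^2 as mu grows.
        print(mu, mu * reflection_probability(k, alpha))
```

With $\alpha = \mu^{-1/2}$ the reflection probability per collision is $\mu^{-1}/(k^2 + \mu^{-1})$, so roughly $\mu$ collisions give a total reflection probability of order one, in line with the heuristic picture above.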
We would stress that the homogenization limit we are considering is highly singular: we start from a deterministic, time-reversible dynamics and arrive at a stochastic, irreversible system as for the classical case. However there is an important difference with
respect to the classical case which is worth underlining. For the quantum case there are two kinds of stochasticity gained by the system. The first is due to the "semiclassical limit" which yields the semiclassical dynamics which is stochastic but not Markovian (and not even a process). Indeed the semiclassical dynamics is an approximate, simplified version of the Schrödinger evolution. The homogenization limit on the scatterers finally gives us also the Markov property (which is essential for any reasonable kinetic equation). We remark that the randomness of the obstacles is essential in our approach: as discussed at the end of Sect. 3, if the scatterers were periodic, the contribution of the interference term would not be negligible and the nature of the limiting evolution, if any, would be quite different from the simple transport equation (1.1). On the other hand it is well known (see [GN] and references quoted therein) that the motion of a quantum particle in a periodic potential is asymptotically ballistic.

We finally discuss our choice of the δ-potential. The only point in which we use the explicit form of this potential is in Step 1, Sect. 3 below. This is for the asymptotic analysis of the scattering problem. The same result can be obtained for a suitably rescaled square-well potential in which the generalized eigenfunctions are explicitly known. We do not give here the details for the sake of brevity. Thus we believe that the general case of a reasonable compactly supported positive potential can be handled as well, with some extra technical efforts.

2. Statement of the Problem, Notation and Results

We are interested in the one-dimensional Schrödinger equation (in macroscopic variables):
\[
i \varepsilon \partial_t \psi^\varepsilon = H_\varepsilon^{\mathbf{c}} \psi^\varepsilon, \tag{2.1}
\]
where
\[
H_\varepsilon^{\mathbf{c}} \psi^\varepsilon = -\frac{\varepsilon^2}{2} (\partial_x^2 \psi^\varepsilon) + V_\varepsilon(\cdot\,; \mathbf{c})\, \psi^\varepsilon \tag{2.2}
\]
and $\varepsilon$ is a small positive parameter. Moreover $\mathbf{c} = \{c_j,\ j \in \mathbb{Z}\}$ denotes a sequence of points in the line which we call scatterers or obstacles and
\[
V_\varepsilon(x; \mathbf{c}) = \sum_{c \in \mathbf{c}} \alpha_\varepsilon\, \delta(x - c), \tag{2.3}
\]
where $\alpha_\varepsilon > 0$ will be chosen later to be of order $\varepsilon \mu_\varepsilon^{-1/2}$, where $\mu_\varepsilon$ is a parameter, suitably diverging with $\varepsilon$, which is related to the density of the scatterers distribution. For a given configuration of scatterers $\mathbf{c}$, the precise definition of the operator $H_\varepsilon^{\mathbf{c}}$ (formally defined by (2.2) and (2.3)) is obtained as follows. We start by considering the free operator $\tilde H_\varepsilon = -\frac{\varepsilon^2}{2} \partial_x^2$ on the domain $D(\tilde H_\varepsilon) = \{\psi \mid \psi \in C_0^\infty(\mathbb{R} \setminus \mathbf{c})\}$. For a fixed $\alpha_\varepsilon$, such an operator has a selfadjoint extension in $L^2(\mathbb{R})$, denoted by $H_\varepsilon$, defined on the domain:
\[
D(H_\varepsilon^{\mathbf{c}}) = \Bigl\{ \psi \;\Big|\; \psi \in H^2(\mathbb{R} \setminus \mathbf{c}) \cap H^1(\mathbb{R}),\ \psi'(c^+) - \psi'(c^-) = \frac{2 \alpha_\varepsilon}{\varepsilon^2} \psi(c),\ \forall c \in \mathbf{c} \Bigr\}.
\]
Boltzmann Equation for 1-D Quantum Lorentz Gas
See [A] for details. We omit, for the moment, the explicit dependence of the operator $H_\varepsilon$ on $\mathbf c$. In the sequel we shall sometimes denote this operator by $H^{\mathbf c}_\varepsilon$ when it is necessary to underline this dependence. We are interested in the asymptotic behavior of our system in the limit ε → 0. To formulate our result we need to recall a few facts concerning the Wigner formalism, in which our result is suitably formulated.

For a classical observable F = F(x, v) we construct the operator $Op_\varepsilon F$ according to the Weyl quantization rule:
\[ (Op_\varepsilon F\,\psi)(x) = \frac{1}{2\pi\varepsilon}\int dv\,dy\; F\Big(\frac{x+y}{2},v\Big)\,e^{i\frac{v(x-y)}{\varepsilon}}\,\psi(y). \tag{2.4} \]
The map $F\to Op_\varepsilon F$ is an isometry, up to a constant, between $L^2(\mathbb R^2)$ and the space of the Hilbert–Schmidt operators equipped with the Hilbert–Schmidt norm. Indeed it is easy to verify that
\[ \int dv\,dx\;\bar F\,G = (2\pi\varepsilon)\,\mathrm{Tr}\big[(Op_\varepsilon F)^*\,Op_\varepsilon G\big]. \tag{2.5} \]
Given a density matrix ρ, we define the Wigner transform $W^\varepsilon_\rho$ by
\[ W^\varepsilon_\rho(x,v) = \frac{1}{2\pi}\int dy\; e^{-iyv}\,\rho\Big(x+\frac{\varepsilon}{2}y,\;x-\frac{\varepsilon}{2}y\Big), \tag{2.6} \]
where ρ(x, y) is the kernel of ρ. Notice that
\[ Op_\varepsilon W^\varepsilon_\rho = \frac{1}{2\pi\varepsilon}\,\rho. \tag{2.7} \]
On the contrary, given a (classical) density f0 = f0 (x, v) ∈ L2 (R2 ), the time evolution of its quantum counterpart Opε f0 is given by the solution of the Heisenberg equation, namely: e−
itHε ε
(Opε f0 )e
itHε ε
= Opε (fcε (t)) = Opε (εc (t)f0 ).
(2.8)
Note that fcε (t) = εc (t)f0 denotes also the solution of the Wigner equation: (∂t + v∂x )fcε (t; x, v) =
1 iε iε [V (x − ∂p ) − V (x + ∂p )]fcε (t; x, v) iε 2 2
(2.9)
with initial datum f0 . εc (t) is the one-parameter group of operators solving Eq. (2.9) and the pseudodifferential operator in the right-hand side of Eq. (2.9) is explicitly given by Z 1 ε ε 1 0 (2.10) dydv 0 [V (x − y) − V (x + y)]fcε (t; x, v 0 )eiy(v−v ) . 2π iε 2 2 All this part concerning the Wigner equation must be understood at a formal level. While it makes sense for a smooth potential V , we do not know any rigorous treatment of the Wigner equation for a δ potential. We do not take care of it: in this paper we are concerned with the solution εc (t)f0 = fcε (t) (formal solution of Eq. (2.9)) obtained by solving the Heisenberg equation, which makes perfect sense once we have the generator Hε .
R. Esposito, M. Pulvirenti, A. Teta
By Eq. (2.8) and (2.5) we easily get that εc is an isometry in L2 (R2 ): kεc (t)f0 kL2 (R2 ) = kf0 kL2 (R2 ) .
(2.11)
Coming back to the distribution $\mathbf c$ of obstacles, we assume that they are randomly distributed according to the following law. For a parameter $\mu_\varepsilon$ suitably diverging with ε, $\bar{\mathbf c} = \{\bar c_j,\ j\in\mathbb Z\}$ are the points of a one-dimensional lattice of mesh $\mu_\varepsilon^{-1}$: $\bar c_j = j\mu_\varepsilon^{-1}$, $j\in\mathbb Z$. The generic scatterer positions $c_j\in\mathbf c$ are distributed independently, according to a probability density $g_\varepsilon(c_j-\bar c_j)$, where $g_\varepsilon = g_\varepsilon(x)$ is an even function supported in $[-\mu_\varepsilon^{-1}/4,\ \mu_\varepsilon^{-1}/4]$. We shall assume
\[ \|g_\varepsilon\|_{L^\infty} \le \varepsilon^{-\beta} \tag{2.12} \]
for some β > 0 sufficiently small.

Remark. Such a choice for the scatterers distribution is better suited to describe the behavior of a charged particle in a semiconductor. On the other hand, with a little extra technical effort our result can be extended to a Poisson distribution of parameter $\mu_\varepsilon$.

Denoting by $E_{sc}$ the expectation with respect to the scatterers distribution defined above, we introduce:
\[ f^\varepsilon(t) = E_{sc}\big(f^\varepsilon_{\mathbf c}(t)\big) := \int P_{sc}(d\mathbf c)\, f^\varepsilon_{\mathbf c}(t). \tag{2.13} \]
Our main result is expressed by the following theorem.

Theorem 2.1. Assume that $f_0$, the Wigner function at time t = 0, is a positive normalized function in $C^1\cap L^1\cap L^2(\mathbb R\times\mathbb R)$ and
\[ \mu_\varepsilon = o(|\log\varepsilon|). \tag{2.14} \]
Set
\[ \alpha_\varepsilon = \frac{\varepsilon\alpha}{\sqrt{\mu_\varepsilon}} \tag{2.15} \]
and assume that β (see (2.12)) is sufficiently small. Then, in the limit ε → 0, $f^\varepsilon(t)$ (given by (2.13)) converges, in the sense of distributions, to f(t) solving
\[ (\partial_t + v\partial_x) f(x,v,t) = \lambda(|v|)\,[f(x,-v,t) - f(x,v,t)] \tag{2.16} \]
with initial datum $f(x,v,0) = f_0(x,v)$, where $\lambda(|v|) = \alpha^2|v|^{-1}$.

Remark. Condition (2.14) expresses the low density hypothesis in the present framework. In the sequel we will assume
\[ \mu_\varepsilon = |\log\varepsilon|^d, \qquad 0<d<1. \tag{2.17} \]
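To make the scatterer law concrete, the following Python sketch samples a finite window of a configuration and measures the gaps between neighbouring obstacles. It is only an illustration under stated assumptions: the uniform density used for $g_\varepsilon$, the function names `sample_scatterers` and `spacing_bounds`, and the numerical values of ε and d are choices made here, while the text only requires $g_\varepsilon$ to be even, supported in $[-\mu_\varepsilon^{-1}/4, \mu_\varepsilon^{-1}/4]$ and bounded as in (2.12).

```python
import math
import random


def sample_scatterers(mu, n_sites, rng):
    """Sample c_j = j/mu + xi_j for j = -n_sites..n_sites.

    Assumption: xi_j is uniform on [-1/(4 mu), 1/(4 mu)], one admissible
    choice of an even density g_eps with the required support.
    """
    half_width = 1.0 / (4.0 * mu)
    return [j / mu + rng.uniform(-half_width, half_width)
            for j in range(-n_sites, n_sites + 1)]


def spacing_bounds(points):
    """Return (smallest gap, largest gap) between consecutive scatterers."""
    gaps = [b - a for a, b in zip(points, points[1:])]
    return min(gaps), max(gaps)


eps = 1e-6                       # illustrative value of the small parameter
d = 0.5                          # exponent in mu_eps = |log eps|^d, 0 < d < 1
mu = abs(math.log(eps)) ** d     # lattice density, cf. (2.17)
config = sample_scatterers(mu, n_sites=200, rng=random.Random(0))
gap_min, gap_max = spacing_bounds(config)
```

Because each displacement is at most a quarter of the mesh, consecutive obstacles stay between $\mu_\varepsilon^{-1}/2$ and $3\mu_\varepsilon^{-1}/2$ apart, which is the minimal-distance property used later when counting collisions.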
We shall first prove Theorem 2.1 under some more restrictive assumptions on the initial state and then extend it to the general case. We begin by considering a WKB initial datum of the form:
\[ \psi^\varepsilon_0(x;p) = \sqrt{\rho_0^+(x)}\;e^{i\frac{p}{\varepsilon}x}, \tag{2.18} \]
where p > 0 is fixed and $\rho_0^+\in C_0^1(\mathbb R)$ is a probability distribution. Therefore
\[ \psi^\varepsilon_{\mathbf c}(x,t;p) = \big[e^{-i\frac{tH^{\mathbf c}_\varepsilon}{\varepsilon}}\,\psi^\varepsilon_0(\,\cdot\,;p)\big](x) \]
is the solution of the problem (2.1) with initial datum (2.18). Let $f^\varepsilon_{\mathbf c}(x,v,t;p)$ be the corresponding Wigner function and set $f^\varepsilon(x,v,t;p) = E_{sc}(f^\varepsilon_{\mathbf c}(x,v,t;p))$.

Moreover, for technical reasons, it is convenient to start with WKB states localized on a scale which is intermediate between the microscopic scale ε and the distance between the obstacles, i.e. $\mu_\varepsilon^{-1}$. We call this scale $\sqrt\eta$ and assume that $\varepsilon \ll \sqrt\eta \ll \mu_\varepsilon^{-1}$. Then we set
\[ \rho_0^+(x) = \frac{1}{\sqrt{\pi\eta}}\;e^{-(x-x^+)^2/\eta} \tag{2.19} \]
for some $x^+\in\mathbb R$. We have:

Theorem 2.2. Under the assumptions (2.14), (2.15) and (2.19), there is a > 0 sufficiently small, such that, if $\eta = \varepsilon^a$, in the limit ε → 0, $f^\varepsilon(\,\cdot\,,\,\cdot\,,t;p)$ converges, in the sense of distributions, to $f(\,\cdot\,,\,\cdot\,,t;p)$ given by
\[ f(x,v,t;p) = \rho^+(x,t)\,\delta(v-p) + \rho^-(x,t)\,\delta(v+p) \]
and $\rho^\pm(x,t)$ is a weak solution of
\[ (\partial_t \pm p\,\partial_x)\rho^\pm(x,t) = \lambda(p)\,\{\rho^\mp(x,t) - \rho^\pm(x,t)\}, \tag{2.20} \]
with initial data $\rho^+(x,0) = \rho_0^+(\,\cdot\,) = \delta(\,\cdot\,-x^+)$, $\rho^-(x,0) = 0$.

3. Outline of the Proof

The proof of Theorem 2.2 is organized in various steps which we summarize. We first consider a single scattering problem for the WKB state (2.18) in Step 1. The result is the introduction of the semiclassical dynamics, which is a good approximation of the real quantum problem and easier to handle. Then we show, in Step 2, how to reduce the original multiple scattering problem to a single scattering process. Step 3 is devoted to the study of some properties of the semiclassical dynamics in view of an explicit estimate (Step 4) of the error we make by replacing the true solution of the Schrödinger equation with that given by the semiclassical dynamics. Finally in Steps 5 and 6 we translate the previous results in terms of the Wigner transform and control its asymptotic behavior. In doing this, we have to show that the semiclassical dynamics, on the classical observables, is asymptotically equivalent to a stochastic process which we call the semiclassical process. Finally we show that such a process is asymptotically equivalent to the random flight process which underlies the linear Boltzmann equation (2.20) we want to derive. We begin by considering a single scattering problem.
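Step 1. (Single scattering). We shall use the following version of the stationary phase theorem.

Theorem 3.1. Consider the following integral:
\[ I = \int_{\mathbb R} dx\; f(x)\,e^{i\frac{\theta(x)}{\varepsilon}} \]
with $f\in C_0^\infty(\mathbb R)$ and $\theta\in C^\infty(\mathbb R)$. Suppose that $0\in\operatorname{supp} f$, $\theta'(0) = 0$ and $\theta'(x)\neq 0$ for $x\neq 0$. Then
\[ I = \sqrt{\varepsilon}\;e^{i\frac{\theta(0)}{\varepsilon}}\sqrt{\frac{2\pi i}{\theta''(0)}}\;f(0) + R, \]
where
\[ |R| \le C\,\varepsilon\log\varepsilon^{-1}\,\|f\|_{C^1} \]
and C depends on θ and the support of f. Moreover, if $0\notin\operatorname{supp} f$ (non-stationary phase), we have:
\[ |I| \le C\,\varepsilon^2\,\sup_x\Big|\Big(\frac{1}{\theta'}\Big(\frac{f}{\theta'}\Big)'\Big)'(x)\Big|. \]
For the proof, see [F].

Let us consider now the Schrödinger equation:
\[ i\varepsilon\,\partial_t\psi = H^{c}_\varepsilon\psi, \tag{3.1} \]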
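where
\[ H^{c}_\varepsilon\psi = -\frac{\varepsilon^2}{2}\,\psi'' + \alpha_\varepsilon\,\delta(x-c)\,\psi. \tag{3.2} \]
We have (see [Sc]) an explicit solution to the problem (3.1) given by means of the following Green function:
\[ G(x,y;t) = G_0(|x-y|;t) - \lambda_\varepsilon\int_0^\infty du\; e^{-\lambda_\varepsilon u}\,G_0(|x-c|+|y-c|+u;\,t), \tag{3.3} \]
where $\lambda_\varepsilon = \varepsilon^{-2}\alpha_\varepsilon$ and
\[ G_0(x;t) = \frac{1}{\sqrt{2\pi i\varepsilon t}}\;e^{i\frac{|x|^2}{2\varepsilon t}} \tag{3.4} \]
is the free propagator. We want to investigate the asymptotic behavior of the evolution (according to Eq. (3.1)) of an initial state $\psi_0$ of the particular form (for p > 0):
\[ \psi_0(x) = r_0^+(x)\,e^{ip\frac{x}{\varepsilon}}, \tag{3.5} \]
where $r_0^+\in C^2(L,c)$ is supported in (L, c) with L < c and $|(r_0^+)'| < \tilde\delta^{-1}$, $|(r_0^+)''| < \tilde\delta^{-2}$. For a given t > 0 we introduce the semiclassical approximation:
\[ \psi^{cl}_\varepsilon(x,t) = A^\varepsilon(p)\,e^{i\frac{p^2t}{2\varepsilon}}\,e^{ip\frac{x-pt}{\varepsilon}}\,r_0^+(x-pt) = e^{-i\frac{p^2t}{2\varepsilon}}\,A^\varepsilon(p)\,r_0^+(x-pt)\,e^{ip\frac{x}{\varepsilon}} \tag{3.6} \]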
Figure 1.
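for x > c and
\[ \begin{aligned} \psi^{cl}_\varepsilon(x,t) &= B^\varepsilon(p)\,e^{i\frac{p^2t}{2\varepsilon}}\,e^{-ip\frac{x+pt-2c}{\varepsilon}}\,r_0^+(2c-(x+pt)) + e^{i\frac{p^2t}{2\varepsilon}}\,e^{ip\frac{x-pt}{\varepsilon}}\,r_0^+(x-pt) \\ &= e^{-i\frac{p^2t}{2\varepsilon}}\,B^\varepsilon(p)\,e^{i\frac{2cp}{\varepsilon}}\,r_0^+(2c-(x+pt))\,e^{-ip\frac{x}{\varepsilon}} + e^{-i\frac{p^2t}{2\varepsilon}}\,r_0^+(x-pt)\,e^{ip\frac{x}{\varepsilon}} \end{aligned} \tag{3.7} \]
for x < c. Here $A^\varepsilon$ and $B^\varepsilon$ denote the transmission and reflection coefficients relative to the δ potential, which explicitly read as:
\[ \begin{aligned} B^\varepsilon(p) &= \frac{\alpha_\varepsilon}{i\varepsilon p-\alpha_\varepsilon} = \frac{\alpha}{ip\sqrt{\mu_\varepsilon}} + O(\mu_\varepsilon^{-1}), \\ A^\varepsilon(p) &= 1 + B^\varepsilon(p) = \frac{i\varepsilon p}{i\varepsilon p-\alpha_\varepsilon} = 1 + \frac{\alpha}{ip\sqrt{\mu_\varepsilon}} + O(\mu_\varepsilon^{-1}). \end{aligned} \tag{3.8} \]
Note that $|A^\varepsilon|^2 + |B^\varepsilon|^2 = 1$. Furthermore we remark that the reflection probability is $|B^\varepsilon(p)|^2 \approx \frac{\alpha^2}{p^2\mu_\varepsilon}$ and hence the expected mean free path inverse is $\frac{\alpha^2}{p}$. The stationary phase analysis of the explicit propagator (3.3) allows us to prove:

Proposition 3.1. Denote by $\psi(t) = e^{-iH^c_\varepsilon t/\varepsilon}\psi_0$ the solution of the problem (3.1) with initial condition $\psi_0$ given by (3.5). Then
\[ \|\psi^{cl}_\varepsilon(t) - \psi(t)\|_{L^\infty} \le C\,\tilde\delta^{-1}\,\varphi(\varepsilon)\,\sqrt t, \tag{3.9} \]
where $\varphi(x) = \sqrt{x}\,\log x^{-1}$ for x > 0. Moreover, denoting by d the distance between the interval (L, c) and the two-point set $\{x-pt,\ 2c-(x+pt)\}$, we have
\[ |\psi(x,t)| \le C\,(\varepsilon t)^{3/2}\,\tilde\delta^{-2}\Big(\frac{1}{d^4} + \frac{1}{d^2}\Big). \tag{3.10} \]
Proof in Sect. 4.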
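The algebraic properties of the coefficients (3.8) that are used repeatedly in the sequel can be checked numerically. The Python sketch below evaluates $A^\varepsilon$ and $B^\varepsilon$ for one illustrative parameter choice; the function name `delta_coefficients` and all numerical values are assumptions made here for the example, not quantities fixed in the text.

```python
import math


def delta_coefficients(eps, p, alpha_eps):
    """Return (A, B): transmission A = 1 + B and reflection B, cf. (3.8)."""
    b = alpha_eps / (1j * eps * p - alpha_eps)
    return 1.0 + b, b


eps, p, alpha, d = 1e-4, 1.3, 0.7, 0.5    # illustrative values only
mu = abs(math.log(eps)) ** d              # density scale, cf. (2.17)
alpha_eps = eps * alpha / math.sqrt(mu)   # scaling (2.15)
A, B = delta_coefficients(eps, p, alpha_eps)

unitarity = abs(A) ** 2 + abs(B) ** 2                # equals 1 up to rounding
cross_term = A * B.conjugate() + A.conjugate() * B   # vanishes up to rounding
leading_B = alpha / (1j * p * math.sqrt(mu))         # first-order term in (3.8)
rate = p * mu * abs(B) ** 2    # velocity x scatterer density x reflection probability
```

With these definitions `rate` differs from $\alpha^2/p = \lambda(p)$ by a relative error of order $\mu_\varepsilon^{-1}$, which is the heuristic content of the remark following (3.8).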
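Step 2. (Reduction to a single collision). For a given ordered scatterer configuration $\mathbf c = \{\dots, c_{-1}, c_0, c_1, \dots\}$ and for an initial condition $\psi_0$ as in Step 1, consider the evolution given by the following Schrödinger equation:
\[ i\varepsilon\,\partial_t\psi^\varepsilon = H^{\mathbf c}_\varepsilon\psi^\varepsilon, \tag{3.11} \]
where
\[ [H^{\mathbf c}_\varepsilon\psi](x) = -\frac{\varepsilon^2}{2}\,\psi''(x) + \sum_{c\in\mathbf c}\alpha_\varepsilon\,\delta(x-c)\,\psi(x). \tag{3.12} \]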
Suppose that c is distributed according with the scatterers distribution Psc . As a consequence X 1 = O(µ4ε ). 6= 2 )2 (1 + c j j ∈Z ct
We assume, without loss of generality c0 = 0 and try to compare the evolution e−iHε ε ψ0 0t with e−iHε ε ψ0 analyzed in Step 1 with c = 0. We have: Proposition 3.2. Suppose that (L, 0) ⊂ (c−1 , 0) and consider an initial condition (3.5) supported in (L, 0). Let pt < min(|c−1 |, |c1 |). For such a small time no classical path (that drawn in Fig. 2) hits an obstacle different from that localized in 0. In such hypotheses: k(e
−iHεc εt
−e
−iHε0 εt
)ψ0 k2L2
˜−2
≤ Cδ
1 2
(εαε ) t
5 2
1/2
6
1/2
N + 12 d
! ,
(3.13)
where N1 = card(c ∩ [−1, 1]) and d = min(|pt − c1 |, | − pt − c−1 |). Remark. If c is typical with q respect to Psc , the probability law of the scatterers configuration, t <
µ−1 ε 2p
and αε ≡
λ(p)pµ−1 ε ε, (see Eq. (2.15)), then the left hand-side of
− (3.13) is bounded by C δ˜−2 εµε 4 , 3
Figure 2.
Step 3. (Properties of the semiclassical dynamics). According to Step 1 it is natural to introduce the semiclassical evolution for an initial state of the form:
\[ \psi_0(x) = r_0^+(x)\,e^{i\frac{p}{\varepsilon}x} + r_0^-(x)\,e^{-i\frac{p}{\varepsilon}x}, \tag{3.14} \]
where $r_0^\pm\in L^2$ and p > 0. We fix a scatterer distribution $\mathbf c$. The semiclassical dynamics is defined in the following way (see Fig. 3). For a given x and t > 0 draw back from x all the possible trajectories with velocities ±p, which either cross or are reflected by the obstacles $c\in\mathbf c$. We denote by $\xi_+^+(x)$ ($\xi_+^-(x)$, $\xi_-^+(x)$, $\xi_-^-(x)$) the sets of trajectories starting with positive (negative, positive, negative) momentum at time t and ending, at time zero, with positive (positive, negative, negative) momentum (respectively). For any $\xi\in\xi_+^+(x)$ ($\xi\in\xi_+^-(x)$, $\xi\in\xi_-^+(x)$, $\xi\in\xi_-^-(x)$ respectively) we denote by ξ(s) the corresponding point at time s ∈ [0, t].

Figure 3.

The semiclassical evolution of a wave function $\psi_0(x)$ is defined by:
\[ (E^{cl}_{\mathbf c,t}\psi_0)(x) \equiv \psi^{cl}_\varepsilon(x,t) = \sum_{k=\pm}\ \sum_{j=\pm}\ \sum_{\xi\in\xi_j^k(x)} e^{i\frac{p^2t}{2\varepsilon}}\,D(\xi)\,r_0^j(\xi(0))\,e^{ijp\frac{\xi(0)}{\varepsilon}}. \tag{3.15} \]
Here
\[ D(\xi) = A^h B^s, \]
where h and s denote the numbers of crossings and reflections of the trajectory ξ up to the time 0. Note that Def. (3.15) is consistent with the semiclassical evolution introduced in Step 1 (see formulas (3.6) and (3.7)) for a single collision. Actually formula (3.15) is the iteration of (3.6) and (3.7) for arbitrary times. Observe also that
\[ \psi^{cl}_\varepsilon(x,t) = e^{-i\frac{p^2t}{2\varepsilon}}\Big[r^+(x,t)\,e^{i\frac{p}{\varepsilon}x} + r^-(x,t)\,e^{-i\frac{p}{\varepsilon}x}\Big] \tag{3.16} \]
Figure 4.
for suitable $r^\pm(x,t)$ which can be defined explicitly. To do this it is convenient to proceed by iteration. We choose τ sufficiently small so that the trajectories x ± ps with s ≤ τ intersect at most a single obstacle (see Fig. 4). Then if x − ps intersects the obstacle $c\in\mathbf c$ we have
\[ r^+(x;n\tau) = A\,r^+(x-p\tau;(n-1)\tau) + B\,e^{-\frac{2icp}{\varepsilon}}\,r^-(2c-(x-p\tau);(n-1)\tau) \tag{3.17-1} \]
while, if x + ps hits $c\in\mathbf c$,
\[ r^-(x;n\tau) = A\,r^-(x+p\tau;(n-1)\tau) + B\,e^{\frac{2icp}{\varepsilon}}\,r^+(2c-(x+p\tau);(n-1)\tau). \tag{3.17-2} \]
Obviously, if the trajectories x ± ps do not hit any scatterer, we have the free evolution:
\[ r^\pm(x;n\tau) = r^\pm(x\mp p\tau;(n-1)\tau). \tag{3.18} \]
By iteration we find
\[ r^\pm(x,t) = \sum_{j=\pm}\ \sum_{\xi\in\xi_j^\pm(x)} \tilde D(\xi)\,r_0^j(\xi(0)). \tag{3.19} \]
Here
\[ \tilde D(\xi) = A^h B^s \prod_{r=1}^{s} e^{\theta_r\,2ip\frac{c_r}{\varepsilon}}, \tag{3.20} \]
where h and s denote the numbers of crossings and reflections of the trajectory $\xi\in\xi_j^\pm(x)$ up to the time 0, $c_1,\dots,c_s$ are the positions of the reflecting obstacles (with repetitions if it is the case), and $\theta_r = 1$ (resp. −1) if the reflection happens with positive (resp. negative) forward incoming velocities.

Unfortunately, no matter how regular $r_0^\pm$ is, the semiclassical dynamics creates discontinuities. First, from (3.17) it follows that the points in $\mathbf c$ are discontinuity points for $r^\pm(t)$. Moreover if y is a discontinuity point for $r^+(t)$ then y + pτ and 2c − (y + pτ) are discontinuity points for $r^+(t+\tau)$ and $r^-(t+\tau)$ respectively. In conclusion the lines
Figure 5.
$c_k \pm pt$, their reflections and the obstacles themselves are discontinuity lines for the semiclassical dynamics (see Fig. 5). However it is clear that most of the discontinuity lines can be eliminated by assuming
\[ r_0^\pm\in C_0(\mathbb R\setminus\mathbf c). \tag{3.21} \]
Under this hypothesis the only discontinuity points for $r^\pm(t)$ are just the obstacles $\mathbf c$. Note also that, in general, $r^\pm(t)$ is constant along the characteristics x ± ps outside the obstacles. On the obstacles we have a jump discontinuity which can be computed explicitly by means of (3.17). Indeed, using A − 1 = B, we have:
\[ \frac{d}{dt}\,r^+(x+pt;t) = p\sum_{c\in\mathbf c}\delta(x+pt-c)\,B\,\Big(r^+_{in}(x+pt;t) + e^{-\frac{2icp}{\varepsilon}}\,r^-_{in}(x+pt;t)\Big), \tag{3.22-1} \]
\[ \frac{d}{dt}\,r^-(x-pt;t) = p\sum_{c\in\mathbf c}\delta(x-pt-c)\,B\,\Big(r^-_{in}(x-pt;t) + e^{\frac{2icp}{\varepsilon}}\,r^+_{in}(x-pt;t)\Big). \tag{3.22-2} \]
By $r^\pm_{in}$ we mean the values of $r^\pm$ before the collision, namely the left value of $r^+$ and the right value of $r^-$. We also have
\[ \frac{d}{dt}\,|r^+(x+pt;t)|^2 = p\sum_{c\in\mathbf c}\delta(x+pt-c)\Big[|B|^2\big(|r^-_{in}|^2 - |r^+_{in}|^2\big) + A\bar B\,e^{\frac{2icp}{\varepsilon}}\,r^+_{in}\bar r^-_{in} + B\bar A\,e^{-\frac{2icp}{\varepsilon}}\,\bar r^+_{in}r^-_{in}\Big](x+pt;t) \tag{3.23-1} \]
and
\[ \frac{d}{dt}\,|r^-(x-pt;t)|^2 = p\sum_{c\in\mathbf c}\delta(x-pt-c)\Big[|B|^2\big(|r^+_{in}|^2 - |r^-_{in}|^2\big) + A\bar B\,e^{-\frac{2icp}{\varepsilon}}\,r^-_{in}\bar r^+_{in} + B\bar A\,e^{\frac{2icp}{\varepsilon}}\,\bar r^-_{in}r^+_{in}\Big](x-pt;t). \tag{3.23-2} \]
Using the identity $A\bar B + B\bar A = 0$ (which is a consequence of the identities A − B = 1 and $|A|^2 + |B|^2 = 1$) we readily prove:

Proposition 3.3. For a given scatterer configuration $\mathbf c$ and t > 0, the total number of particles (defined as $\|r^+\|^2_{L^2} + \|r^-\|^2_{L^2}$) is conserved:
\[ \|r^+_{\mathbf c}(t)\|^2_{L^2} + \|r^-_{\mathbf c}(t)\|^2_{L^2} = \|r_0^+\|^2_{L^2} + \|r_0^-\|^2_{L^2}. \tag{3.24} \]
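The conservation law (3.24) can be observed in a small lattice caricature of the iteration (3.17). The sketch below is an illustration under assumptions that are not in the text: space is a ring of integer sites, pτ = 1, scatterers sit at half-integer positions, and the names `step` and `mass` as well as all numerical values are hypothetical. The update at a scatterer is exactly the pair (3.17-1), (3.17-2), so the total mass is preserved because $|A|^2+|B|^2 = 1$ and $A\bar B + B\bar A = 0$.

```python
import cmath
import random


def step(r_plus, r_minus, scatterers, A, B, p_over_eps):
    """One time step of a lattice version of (3.17) with p*tau = 1 on a ring.

    A scatterer labelled i sits at c = i + 1/2, between sites i and i + 1.
    """
    n = len(r_plus)
    new_plus = [0j] * n
    new_minus = [0j] * n
    for i in range(n):
        k = (i + 1) % n
        if i in scatterers:
            phase = cmath.exp(2j * (i + 0.5) * p_over_eps)   # e^{2icp/eps}
            # (3.17-1): transmitted right-mover plus reflected left-mover
            new_plus[k] = A * r_plus[i] + B * phase.conjugate() * r_minus[k]
            # (3.17-2): transmitted left-mover plus reflected right-mover
            new_minus[i] = A * r_minus[k] + B * phase * r_plus[i]
        else:
            new_plus[k] = r_plus[i]      # free transport to the right, (3.18)
            new_minus[i] = r_minus[k]    # free transport to the left, (3.18)
    return new_plus, new_minus


def mass(r_plus, r_minus):
    """Discrete analogue of ||r+||^2 + ||r-||^2."""
    return sum(abs(z) ** 2 for z in r_plus) + sum(abs(z) ** 2 for z in r_minus)


rng = random.Random(1)
n_sites = 64
scatterers = {5, 17, 18, 40, 63}            # arbitrary bonds carrying a scatterer
alpha_eps, eps_p = 0.3, 1.0                 # hypothetical alpha_eps and eps*p
B = alpha_eps / (1j * eps_p - alpha_eps)    # reflection coefficient, cf. (3.8)
A = 1.0 + B                                 # transmission coefficient
r_plus = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_sites)]
r_minus = [0j] * n_sites
m0 = mass(r_plus, r_minus)
for _ in range(200):
    r_plus, r_minus = step(r_plus, r_minus, scatterers, A, B, p_over_eps=7.0)
m1 = mass(r_plus, r_minus)
```

After 200 steps `m1` agrees with `m0` up to rounding, even though mass has been transferred from the right-moving to the left-moving component.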
Our objective is to show that the flows $E^{cl}_{\mathbf c,t}$ and $e^{-iH^{\mathbf c}_\varepsilon\frac{t}{\varepsilon}}$ are close for ε small. In order to prove this, however, we first need to exploit some regularity properties of the semiclassical dynamics. From now on we shall assume:
\[ r_0^\pm\in C_0^2(\mathbb R\setminus\mathbf c). \tag{3.25} \]
In this case $r^\pm(t)\in C^2(\mathbb R\setminus\mathbf c)$. The same argument proving the conservation of the number of particles yields:

Proposition 3.4. Under assumption (3.25), for s = 1, 2, 3:
\[ \|D^s r^+_{\mathbf c}(t)\|^2_{L^2(\mathbb R\setminus\mathbf c)} + \|D^s r^-_{\mathbf c}(t)\|^2_{L^2(\mathbb R\setminus\mathbf c)} = \|D^s r_0^+\|^2_{L^2} + \|D^s r_0^-\|^2_{L^2}, \tag{3.26} \]
where, for notational simplicity, $L^2 = L^2(\mathbb R)$ (in the sequel $H^s = H^s(\mathbb R)$) and $D^s = \partial_x^s$.

With fixed $\mathbf c = \{c_j\}_{j\in\mathbb Z}$, for any $j\in\mathbb Z$, let $\chi_j\in C_0^\infty(c_{j-1},c_j)$ be a positive function such that
\[ \chi_j(x) = 1 \quad\text{for } x\in(c_{j-1}+\delta,\ c_j-\delta) \tag{3.27} \]
and such that $\chi_j\le 1$, $|\chi_j'|\le C\delta^{-1}$, $|\chi_j''|\le C\delta^{-2}$, $|\chi_j'''|\le C\delta^{-3}$. In the following we shall use the notation:
\[ \chi_{\mathbf c} = \sum_j \chi_j. \tag{3.28} \]
We shall also need to introduce, later on, another approximation of the identity, namely $\tilde\chi_{\mathbf c}$, which is defined as $\chi_{\mathbf c}$ with δ replaced by $\tilde\delta\gg\delta$ to be specified.

Proposition 3.5. Assume hypothesis (3.25) and s = 1, 2, 3. Then we have:
\[ \|\chi_{\mathbf c}r^+_{\mathbf c}(t)\|^2_{H^s} + \|\chi_{\mathbf c}r^-_{\mathbf c}(t)\|^2_{H^s} \le C\,\delta^{-2s}\big(\|r_0^+\|^2_{H^s} + \|r_0^-\|^2_{H^s}\big). \tag{3.29} \]
The proof is straightforward and it is given in Sect. 4. In spite of the fact that the semiclassical dynamics is simple we find it difficult to obtain L∞ estimates for r ± (t). However it is easy to get weakly divergent estimates, which are sufficient for our purposes. Proposition 3.6. Assume hypothesis (3.25) and let D the minimal distance between two consecutive obstacles. Then, for any t ∈ [0, T ], we have kr + (t)kL∞ + kr − (t)kL∞ ≤
Tp + C kr0 kL2 kr0+ kH 1 + kr0− kL2 kr0− kH 1 . D
(3.30)
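Step 4. (Convergence of the wave functions). In what follows it is convenient to define the evolution operator $V^t_\varepsilon$, applied on wave functions of the form:
\[ \psi(x) = r^+(x)\,e^{ip\frac{x}{\varepsilon}} + r^-(x)\,e^{-ip\frac{x}{\varepsilon}} \tag{3.31} \]
by the formula:
\[ V^t_\varepsilon\psi = \sum_j\Big[e^{-iH_j\frac{t}{\varepsilon}}\big(r_j^+e^{ip\frac{x}{\varepsilon}}\big) + e^{-iH_{j-1}\frac{t}{\varepsilon}}\big(r_j^-e^{-ip\frac{x}{\varepsilon}}\big)\Big], \tag{3.32} \]
where $r_j^\pm$ are the restrictions of $r^\pm$ on the interval $(c_{j-1},c_j)$ and $H_j$ is the Hamiltonian generated by the obstacle $c_j$:
\[ [H_j\psi](x) = -\frac{\varepsilon^2}{2}\,\psi''(x) + \alpha_\varepsilon\,\delta(x-c_j)\,\psi(x). \tag{3.33} \]
In other words $V^t_\varepsilon$ is the superposition of solutions of the Schrödinger equation with initial conditions $r_j^\pm e^{\pm ip\frac{x}{\varepsilon}}$ and potential given by the first obstacle crossed by the classical trajectory traveling on the right (left).

The aim of this step is to compare the flows $E^{cl}_{\mathbf c,t}\psi_0^\varepsilon$ and $e^{-iH^{\mathbf c}_\varepsilon\frac{t}{\varepsilon}}\psi_0^\varepsilon$, where $\psi_0^\varepsilon(x) = \delta_\eta(x-x^+)^{1/2}\,e^{i\frac{px}{\varepsilon}}$, and $\delta_\eta(x-x^+) = \rho_0^+(x)$ (see (2.19)). However, in view of the application of Step 1 and Step 2, for regularity reasons we will estimate
\[ \Big[e^{-iH^{\mathbf c}_\varepsilon\frac{t}{\varepsilon}} - (E^{cl}_{\mathbf c,\tau})^n\Big]\tilde\chi_{\mathbf c}\psi_0^\varepsilon \]
(see the definition of $\tilde\chi_{\mathbf c}$ after (3.28)). Assume t = nτ with τ sufficiently small to allow only one collision in the semi-classical dynamics, namely $\tau < \frac{\mu_\varepsilon^{-1}}{2p}$. Then we have
\[ \begin{aligned} \Big[e^{-iH^{\mathbf c}_\varepsilon\frac{t}{\varepsilon}} - (E^{cl}_{\mathbf c,\tau})^n\Big]\tilde\chi_{\mathbf c}\psi_0^\varepsilon ={}& e^{-iH^{\mathbf c}_\varepsilon\frac{\tau}{\varepsilon}}\Big[e^{-iH^{\mathbf c}_\varepsilon\frac{(n-1)\tau}{\varepsilon}} - (E^{cl}_{\mathbf c,\tau})^{n-1}\Big]\tilde\chi_{\mathbf c}\psi_0^\varepsilon \\ &+ \Big[e^{-iH^{\mathbf c}_\varepsilon\frac{\tau}{\varepsilon}} - E^{cl}_{\mathbf c,\tau}\Big](1-\chi_{\mathbf c})\,(E^{cl}_{\mathbf c,\tau})^{n-1}\tilde\chi_{\mathbf c}\psi_0^\varepsilon \\ &+ \Big[e^{-iH^{\mathbf c}_\varepsilon\frac{\tau}{\varepsilon}} - V^\tau\Big]\chi_{\mathbf c}\,(E^{cl}_{\mathbf c,\tau})^{n-1}\tilde\chi_{\mathbf c}\psi_0^\varepsilon \\ &+ \Big[V^\tau - E^{cl}_{\tau}\Big]\chi_{\mathbf c}\,(E^{cl}_{\mathbf c,\tau})^{n-1}\tilde\chi_{\mathbf c}\psi_0^\varepsilon, \end{aligned} \tag{3.34} \]
and estimate the above four terms in the right-hand side of (3.34) separately. The $L^2$ norm of the left-hand side will be bounded by the $L^2$ norm of the first term in the right-hand side (which is exactly the same quantity at a previous time) plus the sum of small errors. The last two terms are bounded using Step 2 and Step 1 respectively. Here we have some diverging terms due to the derivatives of $\chi_{\mathbf c}(E^{cl}_{\mathbf c,\tau})^{n-1}\tilde\chi_{\mathbf c}\psi_0^\varepsilon$ (behaving as inverse powers of δ, $\tilde\delta$, and η) which however will be controlled by powers of ε. The second term will be controlled by the particle conservation and the $L^\infty$ estimate (given by Proposition 3.6). In doing this we shall choose δ and $\tilde\delta$ such that $\mu_\varepsilon^{-1}\gg\eta\gg\tilde\delta\gg\delta\gg\varepsilon$. We fix now the scale η by putting $\eta = \varepsilon^a$ with 1 > a > 4β (β is the parameter introduced in (2.12)). We refer to Sect. 4 for the details. As a matter of fact we prove: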
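Theorem 3.2. Assume $\psi_0^\varepsilon$ given by (2.18), (2.19) with $\eta = \varepsilon^a$, $a\le\frac{1}{50}$. Put
\[ \psi_\varepsilon(t) = e^{-iH^{\mathbf c}_\varepsilon\frac{t}{\varepsilon}}\psi_0^\varepsilon \quad\text{and}\quad \psi^{cl}_\varepsilon(t) = E^{cl}_{\mathbf c,t}\psi_0^\varepsilon. \]
Then, for all T > 0, there exists a positive constant C(T, p) such that, for a.a. $\mathbf c$ (with respect to $P_{sc}$), we have:
\[ \sup_{t\in[0,T]}\|\psi_\varepsilon(t) - \psi^{cl}_\varepsilon(t)\|_{L^2} \le C\,\varepsilon^{\frac{1}{100}}. \tag{3.35} \]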
Step 5. (Control of the interferences). We now compute $\rho_\varepsilon^\pm = |r^\pm|^2$ by using (3.19). We have:
\[ \rho_\varepsilon^\pm(x,t) = \sum_{j,j'=\pm}\ \sum_{\xi\in\xi_j^\pm(x)}\ \sum_{\xi'\in\xi_{j'}^\pm(x)} \tilde D(\xi)\,\overline{\tilde D(\xi')}\;r_0^j(\xi(0))\,\bar r_0^{j'}(\xi'(0)). \tag{3.36} \]
We write $\rho_\varepsilon^\pm$ as
\[ \rho_\varepsilon^\pm = R_\varepsilon^\pm + \mathcal I_\varepsilon^\pm, \tag{3.37} \]
where
\[ R_\varepsilon^\pm(x,t) = \sum_{j=\pm}\ \sum_{\xi\in\xi_j^\pm(x)} |\tilde D(\xi)|^2\,|r_0^j(\xi(0))|^2 \tag{3.38} \]
and
\[ \mathcal I_\varepsilon^\pm(x,t) = \sum_{j,j'=\pm}\ \sum_{\substack{\xi\in\xi_j^\pm(x),\ \xi'\in\xi_{j'}^\pm(x)\\ \xi\neq\xi'}} \tilde D(\xi)\,\overline{\tilde D(\xi')}\;r_0^j(\xi(0))\,\bar r_0^{j'}(\xi'(0)). \tag{3.39} \]
The term (3.39), that is the non-diagonal part of the density, is a purely quantum “interference” factor which is vanishing in the limit ε → 0. For proving this we really need that the scatterers are randomly distributed (see the comments at the end of this section) and we conjecture that this hypothesis is truly necessary to obtain the result. We shall show that the average with respect to the scatterers distribution of the $L^1$ norm of the term $\mathcal I$ is negligible when ε → 0. To this end, let us consider a single term of the sum in (3.39). We compute for instance
\[ J = \int_{\mathcal A} dx\;|r_0^+(\xi(0))|\,|\bar r_0^+(\xi'(0))|, \tag{3.40} \]
where ξ and ξ′ are two trajectories in $\xi_+^+(x)$, to fix the ideas. We identify such two trajectories by specifying the number of collisions (say n and n′) and the sequence of the colliding scatterers $a_1, a_2, \dots, a_n$ and $a'_1, a'_2, \dots, a'_{n'}$. Note that $a_j, a'_k\in\mathbf c$ and possibly $a_i = a_j$ for some pair i ≠ j. Finally $\mathcal A$ denotes the set of all $x\in\mathbb R$ which make geometrically possible the two trajectories, namely $x\in\mathcal A$ if
\[ a_1 - x + \sum_{j=1}^{n-1}|a_{j+1}-a_j| \le pt \tag{3.41} \]
and the analogous expression for ξ′ is also verified. It is immediate to check that
\[ \xi(0) = x + \sigma(\mathbf c) - pt, \qquad \xi'(0) = x + \sigma'(\mathbf c) - pt, \]
with $\sigma(\mathbf c) = -2a_1 + 2a_2 - \dots + (-1)^n 2a_n$, $\sigma'(\mathbf c) = -2a'_1 + 2a'_2 - \dots + (-1)^{n'}2a'_{n'}$. Note that n and n′ are even because both the initial and the final velocities are positive and this is compatible only with an even number of collisions in each trajectory. Therefore:
\[ \delta_\eta^{1/2}(x+\sigma(\mathbf c)-pt-x^+)\;\delta_\eta^{1/2}(x+\sigma'(\mathbf c)-pt-x^+) = \delta_\eta\Big(x+\tfrac12\sigma(\mathbf c)+\tfrac12\sigma'(\mathbf c)-pt-x^+\Big)\exp\Big(-\frac{(\sigma(\mathbf c)-\sigma'(\mathbf c))^2}{4\eta}\Big), \]
so that
\[ J \le \exp\Big(-\frac{(\sigma(\mathbf c)-\sigma'(\mathbf c))^2}{4\eta}\Big). \tag{3.42} \]
Denote now by $\mathbf a\subset\mathbf c$ the set of the colliding scatterers of the two trajectories ξ, ξ′. We now observe that, for any $a\in\mathbf a$,
\[ \sigma(\mathbf c) - \sigma'(\mathbf c) = L(\mathbf a\setminus a) + 2(s-s')\,a, \]
where $L(\mathbf a\setminus a)$ is a linear expression in $\mathbf a\setminus a$ and s (resp. s′) is the difference between the numbers of left and right collisions with a in the trajectory ξ (resp. ξ′). Moreover, since the trajectories are different, there exists $a\in\mathbf a$ for which |s − s′| > 0. Therefore, by the obvious estimate:
\[ \int da\; g_\varepsilon(a-\bar a)\,e^{-\frac{(L-2(s-s')a)^2}{4\eta}} \le C\,\|g_\varepsilon\|_{L^\infty}\sqrt\eta, \tag{3.43} \]
we finally obtain
\[ E_{sc}[J] \le C\,\eta^{1/2}\,\|g_\varepsilon\|_{L^\infty} \le C\,\varepsilon^{\frac a2-\beta} \le C\,\varepsilon^{\frac a4}, \tag{3.44} \]
provided that
\[ \beta < a/4. \tag{3.45} \]
The other terms in (3.39) can be bounded in the same way, so we conclude that
\[ E_{sc}[\mathcal I] \le C\,\varepsilon^{a/4}N^2, \tag{3.46} \]
where N is the number of possible histories $\xi_\pm^\pm(x)$ corresponding to a given configuration of scatterers $\mathbf c$ in the time t. Since the scatterers are at a minimal distance $\mu_\varepsilon^{-1}/2$ from each other, the particle moving with velocity of modulus p, in a time t can visit at most $2pt\mu_\varepsilon$ scatterers. At each collision there are two possible outcomes, thus N is bounded by $2^{2p\mu_\varepsilon t}$. Hence
\[ E_{sc}[\mathcal I] \le C\,\varepsilon^{a/4}\,2^{4p\mu_\varepsilon t}, \tag{3.47} \]
which goes to zero as ε → 0 because of assumption (2.14).
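The competition in (3.47) between the power of ε and the exponential in $\mu_\varepsilon$ is easier to see on a logarithmic scale. The following Python sketch evaluates the logarithm of $\varepsilon^{\kappa}\,2^{4p\mu_\varepsilon t}$ with $\mu_\varepsilon = |\log\varepsilon|^d$ as in (2.17). The function name `log_interference_bound`, the exponent name `kappa` and the sample values of p, t and d are assumptions made for the illustration.

```python
import math


def log_interference_bound(log_inv_eps, kappa, p, t, d):
    """Natural log of eps**kappa * 2**(4*p*mu*t), with mu = |log eps|**d.

    The argument log_inv_eps is |log eps|; working with logarithms avoids
    floating-point underflow for very small eps.
    """
    mu = log_inv_eps ** d
    return -kappa * log_inv_eps + 4.0 * p * t * mu * math.log(2.0)


a = 1.0 / 50.0    # exponent in eta = eps**a, cf. Theorem 3.2
values = [log_interference_bound(L, kappa=a / 4.0, p=1.0, t=1.0, d=0.5)
          for L in (1e2, 1e4, 1e6, 1e8)]
```

For these sample values the logarithm is positive at $|\log\varepsilon| = 10^2$ and $10^4$ and negative from $10^6$ on: since d < 1, the linear term in $|\log\varepsilon|$ eventually dominates, but only for extremely small ε. The statement is therefore genuinely asymptotic.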
Step 6. (Convergence of the semi-classical process). Once we have eliminated the interference contribution to $r^\pm$ we are led to consider the convergence of the quantities $R_\varepsilon^\pm$ (see Def. (3.38)). It is natural to introduce a stochastic process, which we call semiclassical. Given a scatterer configuration $\mathbf c$, such that $|c_j-\bar c_j| < \mu_\varepsilon^{-1}/4$, consider a particle initially located in x with velocity v, with $\{x\}\cap\mathbf c = \emptyset$. The particle moves freely up to the first instant in which it hits an obstacle $c\in\mathbf c$. Then it is reflected (v → −v) with probability $|B|^2$ and goes ahead with probability $1-|B|^2$. Then the particle moves freely up to the next collision instant in which the procedure is repeated. We denote such a process by $\gamma^\varepsilon_{\mathbf c}(t;x,v)$.

We are interested in the asymptotic behavior of the quantity $E[u(\gamma^\varepsilon(t;x,v))]$, where u = u(x, v) is a bounded and continuous test function. $\gamma^\varepsilon(t)$ is the stochastic process obtained from $\gamma^\varepsilon_{\mathbf c}$ by making random the scatterers $\mathbf c$, distributed according to the previous distribution whose expectation is denoted by $E_{sc}$. Finally E denotes the expectation with respect to $E_{sc}$ and the process $\gamma^\varepsilon_{\mathbf c}$. To be more explicit
\[ E[u(\gamma^\varepsilon(t;x,v))] = E_{sc}\,E_{proc}[u(\gamma^\varepsilon_{\mathbf c}(t;x,v))], \]
where
\[ E_{proc}[u(\gamma^\varepsilon_{\mathbf c}(t;x,v))] = \sum_{\bar\gamma_{\mathbf c}} P(\bar\gamma_{\mathbf c})\,u(\bar\gamma_{\mathbf c}), \]
and where $\bar\gamma_{\mathbf c}$ is the generic sample of $\gamma^\varepsilon_{\mathbf c}$. Finally
\[ P(\bar\gamma_{\mathbf c}) = |B|^{2n}(1-|B|^2)^m, \tag{3.48} \]
and n and m denote the numbers of reflections and crossings through the obstacles performed by the sample $\bar\gamma_{\mathbf c}$ respectively.

We also introduce the random flight γ(t, x, v) defined as follows. Denoting by $\gamma_i(t)$, i = 1, 2, position and velocity of the process respectively, we have
\[ \frac{d\gamma_1(t;x,v)}{dt} = \gamma_2(t;x,v) \tag{3.49} \]
for all t in which there are no jumps in the velocity variable. At random times t distributed according to a Poisson process of intensity λ(|v|), we have the transition $\gamma_2(t;x,v)\to-\gamma_2(t;x,v)$. Initially we have γ(0; x, v) = (x, v). Defining f(t) according to the formula
\[ \langle f(t),u\rangle = \int dx\,dv\;f_0(x,v)\,E[u(\gamma(t;x,v))] \tag{3.50} \]
(u bounded and continuous), it is easy to verify that f(t) satisfies the transport equation
\[ (\partial_t+v\partial_x)f(t;x,v) = \lambda(|v|)\{f(x,-v)-f(x,v)\}. \tag{3.51} \]
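The random flight just described is straightforward to simulate. The sketch below is a minimal Monte Carlo illustration, not part of the proof: the function names `random_flight` and `forward_fraction`, the seed and the parameter values are choices made here. It compares the empirical fraction of samples that still move in the initial direction at time t with the value $\frac12(1+e^{-2\lambda t})$ obtained by integrating (3.51) in x for data concentrated on v = ±p.

```python
import math
import random


def random_flight(x, v, t, lam, rng):
    """Sample (position, velocity) at time t of the random flight from (x, v).

    The velocity flips sign at the jump times of a Poisson process of
    intensity lam; lam plays the role of lambda(|v|), constant since |v| is.
    """
    elapsed = 0.0
    while True:
        wait = rng.expovariate(lam)          # exponential waiting time
        if elapsed + wait >= t:
            return x + v * (t - elapsed), v  # free flight up to time t
        x += v * wait
        elapsed += wait
        v = -v                               # transition v -> -v


def forward_fraction(v, t, lam, n_samples, seed=0):
    """Empirical probability that the velocity at time t equals the initial one."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples)
               if random_flight(0.0, v, t, lam, rng)[1] == v)
    return hits / n_samples


lam, t, p = 0.8, 1.5, 1.0                        # illustrative values
estimate = forward_fraction(p, t, lam, n_samples=20000)
exact = 0.5 * (1.0 + math.exp(-2.0 * lam * t))   # total mass moving forward
```

With 20000 samples the statistical error of `estimate` is of order $4\cdot10^{-3}$, so the two numbers are expected to agree to about two decimal places.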
Coming back to the process $\gamma^\varepsilon_{\mathbf c}(t;x,v)$, we note that, for fixed $\mathbf c$, a trajectory $\gamma^\varepsilon_{\mathbf c}(t;x,v)$ is completely determined by the number of collisions n and the times at which the collisions take place. We introduce the sets:
\[ S_1 = \{t_1\in(0,t) \mid x+vt_1\in\mathbf c\}, \]
\[ S(t_1) = \{t_2\in(t_1,t) \mid t_1\in S_1,\ x+vt_1-vt_2\in\mathbf c\}, \]
\[ \dots \]
\[ S(t_1,t_2,\dots,t_{n-1}) = \Big\{t_n\in(t_{n-1},t) \;\Big|\; t_1\in S_1,\ \dots,\ t_{n-1}\in S(t_1,\dots,t_{n-2}),\ \ x+vt_1-vt_2\dots(-1)^n vt_n\in\mathbf c\Big\}. \tag{3.52} \]
Therefore:
\[ E_{proc}[u(\gamma^\varepsilon_{\mathbf c}(t;x,v))] = \sum_{n\ge0}\ \sum_{t_1\in S_1}\ \sum_{t_2\in S(t_1)}\cdots\sum_{t_n\in S(t_1,t_2,\dots,t_{n-1})} \chi(t_1,t_2,\dots,t_n)\,P(t_1,t_2,\dots,t_n;x,v;t)\,u(\gamma(t_1,t_2,\dots,t_n;x,v;t)), \tag{3.53} \]
where $\gamma(t_1,t_2,\dots,t_n;x,v;t)$ is the trajectory in which there is a reflection exactly at the times $t_1,t_2,\dots,t_n$, $\chi(t_1,t_2,\dots,t_n)$ is the characteristic function of the event $t_1+\sum_{j=2}^n|t_j-t_{j-1}|\le t$, namely that the trajectory $\gamma(t_1,t_2,\dots,t_n;x,v;t)$ can be realized, and $P(t_1,t_2,\dots,t_n;x,v;t)$ is just given by the expression (3.48). We now observe that
\[ P(t_1,t_2,\dots,t_n;x,v;t) = (1-|B(|v|)|^2)^m\,|B(|v|)|^{2n}. \tag{3.54} \]
But,
\[ \mu_\varepsilon t|v|-1 < m+n < \mu_\varepsilon t|v|+1. \tag{3.55} \]
Hence,
\[ (1-|B(|v|)|^2)^{-n+1}(1-|B(|v|)|^2)^{\mu_\varepsilon|v|t} \le (1-|B(|v|)|^2)^m \le (1-|B(|v|)|^2)^{-n-1}(1-|B(|v|)|^2)^{\mu_\varepsilon|v|t}. \tag{3.56} \]
Moreover:
\[ \begin{aligned} &\sum_{n\ge0}\ \sum_{t_1\in S_1}\ \sum_{t_2\in S(t_1)}\cdots\sum_{t_n\in S(t_1,\dots,t_{n-1})} \chi(t_1,t_2,\dots,t_n)\,(1-|B|^2)^{-n\pm1}(1-|B(|v|)|^2)^{\mu_\varepsilon|v|t}\,|B|^{2n}\,u(\gamma(t_1,t_2,\dots,t_n;x,v;t)) \\ &\qquad = \sum_{n\ge0}\lambda(|v|)^n e^{-\lambda(|v|)t}\int_0^t\!\int_{t_1}^t\!\cdots\!\int_{t_{n-1}}^t dt_1\dots dt_n\;u(\gamma(t_1,t_2,\dots,t_n;x,v;t)) + \varphi_\pm(\varepsilon), \end{aligned} \tag{3.57} \]
where $\varphi_\pm(\varepsilon)$ is vanishing as ε → 0, because $|B(|v|)|^2\sim\lambda(|v|)\,|v|^{-1}\mu_\varepsilon^{-1}$ as ε → 0 (see (3.8)). Therefore
\[ \lim_{\varepsilon\to0}E[u(\gamma^\varepsilon(t;x,v))] = \sum_{n\ge0}e^{-\lambda(|v|)t}\lambda(|v|)^n\int_0^t\!\int_{t_1}^t\!\cdots\!\int_{t_{n-1}}^t dt_1\dots dt_n\;u(\gamma(t_1,t_2,\dots,t_n;x,v;t)). \tag{3.58} \]
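At the level of the number of reflections alone, the passage from (3.54) to (3.58) is the classical convergence of a binomial law to a Poisson law. The Python sketch below illustrates this under simplifying assumptions introduced here: the time ordering of the collisions is ignored, the number of obstacles met up to time t is taken to be exactly $\mu_\varepsilon|v|t$ (cf. (3.55)), and the reflection probability is set equal to $\lambda|v|^{-1}\mu_\varepsilon^{-1}$. The helper names and numerical values are hypothetical.

```python
import math


def binomial_pmf(n, trials, q):
    """Probability of n reflections in `trials` independent encounters."""
    return math.comb(trials, n) * q ** n * (1.0 - q) ** (trials - n)


def poisson_pmf(n, mean):
    """Probability of n jumps for a Poisson law of the given mean."""
    return math.exp(-mean) * mean ** n / math.factorial(n)


lam, v, t = 0.8, 1.0, 2.5
mean = lam * t
errors = []
for mu in (10, 100, 1000, 10000):
    trials = round(mu * v * t)     # obstacles met up to time t, cf. (3.55)
    q = lam / (v * mu)             # |B|^2 ~ lambda |v|^-1 mu^-1
    errors.append(max(abs(binomial_pmf(n, trials, q) - poisson_pmf(n, mean))
                      for n in range(15)))
```

The entries of `errors` measure the largest discrepancy over n = 0, ..., 14 and shrink as the density μ grows, in line with the factor $e^{-\lambda t}\lambda^n$ multiplying the time-ordered integral (of volume $t^n/n!$) in (3.58).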
The last two steps are justified by the fact that the time ordering makes the series
\[ \sum_{n\ge0}\ \sum_{t_1\in S_1}\ \sum_{t_2\in S(t_1)}\cdots\sum_{t_n\in S(t_1,\dots,t_{n-1})}|B|^{2n} \tag{3.59} \]
converge uniformly in ε, so that it is enough to check the term by term convergence. Notice now that the right-hand side of Eq. (3.58) is nothing else than ⟨f(t), u⟩, where f(t) solves (3.51) (weakly) with initial datum $f_0(y,w) = \delta(y-x)\,\delta(w-v)$.
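On the other hand, defining
\[ \tilde f^\varepsilon(x,v,t) = R_\varepsilon^+(x;t)\,\delta(v-p) + R_\varepsilon^-(x;t)\,\delta(v+p), \tag{3.60} \]
we have that
\[ E_{sc}\langle\tilde f^\varepsilon(t),u\rangle = \int dx\;\delta_\eta(x-x^+)\,E(u(\gamma^\varepsilon(t;x,v))) \tag{3.61} \]
and hence we proved

Proposition 3.7. Under the hypotheses of Theorem 2.1,
\[ E_{sc}R_\varepsilon^\pm(t)\to\rho^\pm(t) \tag{3.62} \]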
for all t ∈ [0, T] and in the sense of weak convergence of the measures. Moreover, $\rho^\pm$ solve Eq. (2.20).

Remark. We stress that we used the average on the scatterers only in the statement of Proposition 3.7. What is really proved by the previous argument is a more involved, pointwise (a.a. $\mathbf c$) result, which we do not need below and omit for the sake of simplicity.

Proof of Theorem 2.2. Let us denote by $W^{cl}_{\mathbf c}$ and $W_\varepsilon$ the Wigner transform of $\psi^{cl}_\varepsilon$ and its expectation with respect to $E_{sc}$ respectively. By Theorem 3.2, $W^{cl}_{\mathbf c}$ and $f_{\mathbf c}$ (and hence also $W_\varepsilon$ and $f_\varepsilon$) are asymptotically equivalent as ε → 0, in the distributional sense. Moreover $W^{cl}_{\mathbf c}$ is also asymptotically equivalent to $\rho_\varepsilon^+(x,t)\,\delta(v-p) + \rho_\varepsilon^-(x,t)\,\delta(v+p)$ with $\rho_\varepsilon^\pm = |r^\pm|^2$ given by (3.19). Finally, by using Step 5 we conclude that $E_{sc}(\rho_\varepsilon^\pm)$ are asymptotically equivalent to $E_{sc}(R^\pm)$ and therefore, by Step 6, $E_{sc}(\rho_\varepsilon^\pm)$ converge weakly to the solution $\rho^\pm$ of the random flight equation (2.20). This concludes the proof. □

Proof of Theorem 2.1. Given an initial datum $f_0 = f_0(x,v)$ we construct
\[ f_0^\varepsilon(x,v) = \int dx_0\,dp\;f_0(x_0,p)\,\delta_\eta(x-x_0)\,\delta_{\varepsilon^2/\eta}(v-p). \]
On the other hand, consider the wave function
\[ \psi_0(x) = \sqrt{\delta_\eta(x-x_0)}\;e^{ip\frac{x}{\varepsilon}} \]
and its Wigner transform
\[ W(x,v|x_0,p) = \frac{1}{2\pi}\int dy\;e^{-i(v-p)y}\,\sqrt{\delta_\eta\Big(x-x_0+\varepsilon\frac y2\Big)}\,\sqrt{\delta_\eta\Big(x-x_0-\varepsilon\frac y2\Big)}. \]
A simple Gaussian integration shows that
\[ W(x,v|x_0,p) = \delta_\eta(x-x_0)\,\delta_{\varepsilon^2/\eta}(v-p), \]
and hence
\[ f_0^\varepsilon(x,v) = \int dx_0\,dp\;f_0(x_0,p)\,W(x,v|x_0,p). \]
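For completeness, the Gaussian integration behind the product form of $W(x,v|x_0,p)$ can be spelled out as follows, with $\delta_\eta(x) = (\pi\eta)^{-1/2}e^{-x^2/\eta}$ as in (2.19). The abbreviations $X = x-x_0$ and $k = v-p$ are introduced here only for this computation.

```latex
\begin{align*}
\sqrt{\delta_\eta\!\left(X+\tfrac{\varepsilon y}{2}\right)}\,
\sqrt{\delta_\eta\!\left(X-\tfrac{\varepsilon y}{2}\right)}
 &= \frac{1}{\sqrt{\pi\eta}}\,
    \exp\!\left(-\frac{X^2}{\eta}-\frac{\varepsilon^2y^2}{4\eta}\right),\\
\frac{1}{2\pi}\int dy\; e^{-iky}\,e^{-\varepsilon^2y^2/(4\eta)}
 &= \frac{1}{2\pi}\,\frac{2\sqrt{\pi\eta}}{\varepsilon}\,e^{-k^2\eta/\varepsilon^2}
  = \frac{1}{\sqrt{\pi\varepsilon^2/\eta}}\;e^{-k^2/(\varepsilon^2/\eta)}
  = \delta_{\varepsilon^2/\eta}(k),\\
W(x,v\,|\,x_0,p) &= \delta_\eta(x-x_0)\,\delta_{\varepsilon^2/\eta}(v-p).
\end{align*}
```

The first line uses $(X+s)^2+(X-s)^2 = 2X^2+2s^2$ with $s = \varepsilon y/2$, and the second is the standard Fourier transform of a Gaussian. The state is thus localized on the scale $\sqrt\eta$ in position and $\varepsilon/\sqrt\eta$ in velocity.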
On the basis of our previous analysis, we know that, as ε → 0, ε (t)W ( · , · |x0 , p) (x, v) converges, in the distributional sense to f (x, v; t|x0 , p) ≡ ρ + (x, t|x0 )δ(v − |p|) + ρ − (x, t|x0 )δ(v + |p|) uniformly in x0 and p in compact sets, with p bounded away from the origin. Moreover ρ ± solves the random-flight equation (here we are assuming positive p): (∂t ± p∂x )ρ ± (x, t|x0 ) = λ(p){ρ ∓ (x, t|x0 ) − ρ ± (x, t|x0 )}, with initial condition ρ + (x, 0|x0 ) = δ(x − x0 );
ρ − (x, 0|x0 ) = 0.
Notice that, for any test function u: Z dxdv u(x, v)f (x, v; t|x0 , p) = E[u(γ (t; x0 , p)], where γ (t, x, v) is the random flight process. Therefore, assuming f0 (x0 , v) = 0 if a −1 < |v| < a, |x0 | < a for a > 1, Z ε (ε (t)f0 )(x, v) = dx0 dp f0 (x0 , p) ε (t)W ( · , · |x0 , p)) (x, v) Z → dx0 dp f0 (x0 , p)f (x, v; t|x0 , p) ≡ f (x, v, t). Then f (t) is given by
Z
hf (t), ui =
dxdv f0 (x, v)E[u(γ (t; x, v)]
(u bounded and continuous), and hence f (t) satisfies the transport equation (∂t + v∂x )f (t; x, v) = λ(|v|){f (x, −v) − f (x, v)} with initial condition f0 . In conclusion, if given a > 1, f0 (x0 , v) = 0 for a −1 < |v| < a, |x0 | < a, then (ε (t)f0ε )(x, v) → f (x, v, t) in the sense of distributions. By a standard approximation argument the same result is true also for a general probability density f0 ∈ L1 ∩ L∞ (x, v). Finally by the conservation of the L2 norm we conclude that (ε (t)f0 )(x, v) → f (x, v, t) in distributional sense, because |h(ε (t)f0 − ε (t)f0ε ), ui| ≤ kf0ε − f0 kL2 kukL2 , which goes to zero as ε → 0. u t
Figure 6.
Remark. As we have seen, one of the crucial points in our proof is that the expectation of the quantity
\[ D(\xi)\,\bar D(\xi')\,r_0^+(\xi(0))\,r_0^+(\xi'(0)) \]
vanishes for any pair ξ ≠ ξ′, in the limit ε → 0. This fact, however, is delicate and needs to be better understood. Consider the case of a periodic lattice of mesh $\mu_\varepsilon^{-1}$, which we assume to be a multiple of ε. In this way, setting $c_j = j\mu_\varepsilon^{-1}$ we have that $\exp\big(2i\frac{c_j}{\varepsilon}\big) = 1$. Consider the semiclassical dynamics in this framework, for a datum $r_0^+(x) = a^{-1}\chi_{[1-a/2,\,1+a/2]}$ with $a\approx\mu_\varepsilon^{-1}$. For $t\approx3\mu_\varepsilon^{-1}$ we have only eight trajectories contributing to the solution at time t. We enumerate these according to the collision sequences:

1 − BBA (dash line)
2 − ABB (black line)
3 − BBB (dash line)
4 − ABA (black line)

The other trajectories are as in Fig. 6. Note that 1 and 2 contribute to $r^+$ while 3 and 4 contribute to $r^-$. An easy computation shows that
\[ |r^+(x,t)|^2 = a^{-1}\big(4|A|^2|B|^4\big), \qquad |r^-(-x,t)|^2 = a^{-1}\big(|A|^4|B|^2 + |B|^6 + 2\,\Re[B^3\bar A^2\bar B]\big). \]
Note that the contribution due to the interferences between the trajectories 1 and 2 (namely $a^{-1}(2|A|^2|B|^4)$) is the same as that due to the semiclassical process (the diagonal part of the sum).
Boltzmann Equation for 1-D Quantum Lorentz Gas
Figure 7.
Of course the conservation of the mass is not violated. One readily shows that $|r^+(x,t)|^2 + |r^-(-x,t)|^2 = a^{-1}|B|^2$, while the contribution of the other trajectories is $1 - a^{-1}|B|^2$. This example can be extended to macroscopic times $t$ of order one (see Fig. 7). If one considers all the interference terms due to the trajectories performing two collisions, one finds that they are of order
$$2a^{-1}|B|^4\,(1-|B|^2)^{\mu_\varepsilon t}\,(t\mu_\varepsilon)^2 \approx e^{-ct}\mu_\varepsilon t^2,$$
where the term $(1-|B|^2)^{\mu_\varepsilon t}$ is the contribution due to crossing the non-reflecting obstacles and $(t\mu_\varepsilon)^2$ is the number of possibilities we have to realize the two trajectories. Note that this term does not vanish in the limit $\varepsilon \to 0$. Actually it diverges if $a \approx \mu_\varepsilon^{-1}$. Of course this is not a counterexample showing the failure of our result for a periodic configuration of scatterers. Indeed the terms due to a different number of collisions could compensate exactly this positive divergence (some other negative divergence must arise from the other terms because of the mass conservation) but, if so, the mechanism should be involved and this is not at all clear to us. We finally remark that periodic configurations of scatterers prevent this kind of homogenization for the classical case (in higher dimension), as shown in [BGW].

4. Proofs

Proof of Proposition 3.1. We first notice that, if
$$I = \int_{\mathbb{R}} dx\, f(x)\, e^{i\frac{(x-x_0)^2}{2a\varepsilon}}, \tag{4.1}$$
with $a > 0$ and $f$ of compact support, with $x_0 \in \mathrm{supp}\, f$, then, by the stationary phase theorem, we have
$$I = \sqrt{2\pi i\varepsilon a}\, f(x_0) + R_\varepsilon \tag{4.2}$$
with
$$|R_\varepsilon| \le C\,\|f'\|_\infty\, a\varepsilon \log \varepsilon^{-1}. \tag{4.3}$$
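In order to prove Proposition 3.1, we use the explicit form of the Green function given by (3.3). Therefore, we need to compute the integrals
$$I^0 = \int dy\, G_0(x-y;t)\, r_0^+(y)\, e^{ip\frac{y}{\varepsilon}} \tag{4.4}$$
and
$$I^1 = \int dy\, G_0(|x-c|+|y-c|+u;t)\, r_0^+(y)\, e^{ip\frac{y}{\varepsilon}}. \tag{4.5}$$
As for $I^0$, we write
$$(x-y)^2 + 2pyt = 2pxt + (x-y-pt)^2 - p^2t^2. \tag{4.6}$$
Therefore, setting $y_0 = x - pt$, using (4.2) we get
$$I^0 = e^{ip\frac{x}{\varepsilon}}\, e^{-i\frac{p^2t}{2\varepsilon}} \int_{\mathbb{R}} dy\, r_0^+(y)\, \frac{e^{i\frac{(y-y_0)^2}{2\varepsilon t}}}{\sqrt{2\pi i\varepsilon t}} = e^{ip\frac{x}{\varepsilon}}\, e^{-i\frac{p^2t}{2\varepsilon}}\, r_0^+(x-pt) + R_\varepsilon \tag{4.7}$$
and, by (4.3),
$$|R_\varepsilon| \le C\,\|(r_0^+)'\|_\infty\, \sqrt{\varepsilon t}\, \log\varepsilon^{-1}. \tag{4.8}$$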
In the same way we compute $I^1$. Recall that $\mathrm{supp}\, r_0^+ \subset (L, c)$. Therefore we need to look at the following two cases:

1) $x > c$. Then
$$(|x-c|+|y-c|+u)^2 + 2pyt = (y-(x-pt+u))^2 + 2pt(x+u) - p^2t^2. \tag{4.9}$$
2) $x < c$. Then
$$(|x-c|+|y-c|+u)^2 + 2pyt = (y-(2c-(x+pt)+u))^2 + 2pt(2c-x+u) - p^2t^2. \tag{4.10}$$
Therefore, in case 1) ($x > c$, $y < c$) we get
$$I^1 = e^{ip\frac{x}{\varepsilon}}\, e^{-i\frac{p^2t}{2\varepsilon}} \left[ r_0^+(x-pt+u)\, e^{ip\frac{u}{\varepsilon}} + R_\varepsilon \right]. \tag{4.11}$$
In case 2) ($x < c$, $y < c$) we get
$$I^1 = e^{-ip\frac{x}{\varepsilon}}\, e^{-i\frac{p^2t}{2\varepsilon}} \left[ e^{2ip\frac{c}{\varepsilon}}\, r_0^+(2c-(x+pt)+u)\, e^{ip\frac{u}{\varepsilon}} + R_\varepsilon \right]. \tag{4.12}$$
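The completions of the square in (4.6), (4.9) and (4.10) are purely algebraic, so they can be spot-checked on random inputs. The sketch below is only an illustration; the function names are hypothetical, and the sampling ranges are arbitrary choices that respect $y < c$ and $u \ge 0$.

```python
import random

def lhs_free(x, y, p, t):
    return (x - y) ** 2 + 2 * p * y * t

def rhs_free(x, y, p, t):
    # right-hand side of (4.6)
    return 2 * p * x * t + (x - y - p * t) ** 2 - (p * t) ** 2

def lhs_scattered(x, y, c, u, p, t):
    return (abs(x - c) + abs(y - c) + u) ** 2 + 2 * p * y * t

def rhs_scattered(x, y, c, u, p, t):
    if x > c:
        # case 1), right-hand side of (4.9)
        return (y - (x - p * t + u)) ** 2 + 2 * p * t * (x + u) - (p * t) ** 2
    # case 2), right-hand side of (4.10)
    return (y - (2 * c - (x + p * t) + u)) ** 2 + 2 * p * t * (2 * c - x + u) - (p * t) ** 2

def max_defect(trials=1000, seed=1):
    """Largest absolute mismatch between the two sides over random samples."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        c = rng.uniform(-1.0, 1.0)
        y = c - rng.uniform(0.01, 2.0)                        # y lies to the left of c
        x = c + rng.choice([-1, 1]) * rng.uniform(0.01, 2.0)  # x on either side of c
        u = rng.uniform(0.0, 2.0)
        p = rng.uniform(0.1, 3.0)
        t = rng.uniform(0.1, 3.0)
        worst = max(
            worst,
            abs(lhs_free(x, y, p, t) - rhs_free(x, y, p, t)),
            abs(lhs_scattered(x, y, c, u, p, t) - rhs_scattered(x, y, c, u, p, t)),
        )
    return worst
```

A mismatch at the level of floating-point rounding is what one expects if the identities are transcribed correctly.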
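Now we compute the integral on $u$. It is convenient to change the variable $u$ to $w = u\varepsilon^{-1}$. Therefore, setting $x^* = x - pt$ in case 1) and $x^* = 2c - (x+pt)$ in case 2), we have
$$-\lambda_\varepsilon \int_0^{+\infty} du\, e^{-\lambda_\varepsilon u}\, e^{ip\frac{u}{\varepsilon}}\, r_0^+(x^*+u) = -\varepsilon\lambda_\varepsilon \int_0^{+\infty} dw\, e^{-\varepsilon\lambda_\varepsilon w}\, e^{ipw}\, r_0^+(x^*+\varepsilon w) = -r_0^+(x^*)\,\varepsilon\lambda_\varepsilon \int_0^{+\infty} dw\, e^{(-\varepsilon\lambda_\varepsilon+ip)w} + \tilde R_\varepsilon \tag{4.13}$$
with
$$\tilde R_\varepsilon = -\varepsilon\lambda_\varepsilon \int_0^{+\infty} dw\, e^{(-\varepsilon\lambda_\varepsilon+ip)w}\, [r_0^+(x^*+\varepsilon w) - r_0^+(x^*)]. \tag{4.14}$$
The remainder is bounded by $\varepsilon\mu_\varepsilon^{1/2}\, \|r_0^+\|_{C^1}$, while the main term is given by
$$\frac{\varepsilon\lambda_\varepsilon}{-\varepsilon\lambda_\varepsilon+ip}\, r_0^+(x^*) = B(p)\, r_0^+(x^*), \tag{4.15}$$
with
$$B(p) = \frac{\varepsilon\lambda_\varepsilon}{-\varepsilon\lambda_\varepsilon+ip} = \frac{\alpha_\varepsilon}{i\varepsilon p - \alpha_\varepsilon}. \tag{4.16}$$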
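The coefficient B(p) of (4.16) can be examined numerically. The sketch below rests on two assumptions that are not stated in this passage: that the two forms in (4.16) are related through λ_ε = α_ε/ε² (which is what equating them requires), and that the amplitude A used in the Remark preceding Section 4 is the transmitted amplitude B(p) + 1 that appears in (4.18). Function names are hypothetical. Under these assumptions one can check that |A|² + |B|² = 1 and that the two densities quoted in that Remark add up to |B|² (the common factor a⁻¹ is dropped).

```python
def reflection(p, eps, alpha):
    """B(p) = alpha / (i*eps*p - alpha), the second form in (4.16)."""
    return alpha / (1j * eps * p - alpha)

def reflection_from_lambda(p, eps, lam):
    """B(p) = eps*lam / (-eps*lam + i*p), the first form in (4.16)."""
    return eps * lam / (-eps * lam + 1j * p)

def mass_defects(p, eps, alpha):
    """Return (|A|^2 + |B|^2 - 1, r_plus + r_minus - |B|^2) with the factor 1/a dropped."""
    B = reflection(p, eps, alpha)
    A = 1 + B  # ASSUMPTION: A is the transmitted amplitude B(p) + 1 of (4.18)
    unitarity = abs(A) ** 2 + abs(B) ** 2 - 1
    r_plus = 4 * abs(A) ** 2 * abs(B) ** 4
    r_minus = (abs(A) ** 4 * abs(B) ** 2 + abs(B) ** 6
               + 2 * (B ** 3 * A.conjugate() ** 2 * B.conjugate()).real)
    return unitarity, r_plus + r_minus - abs(B) ** 2
```

Both defects vanish up to rounding because Re B = −|B|² for this B, so that A·B̄ is purely imaginary.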
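In conclusion, we get a global error bounded by
$$|R_\varepsilon| + |\tilde R_\varepsilon| \le C\, \|r_0^+\|_{C^1}\, \big[\sqrt{\varepsilon t}\, \log\varepsilon^{-1} + \varepsilon\mu_\varepsilon^{1/2}\big], \tag{4.17}$$
and
$$\psi(x,t) = \chi(\{x>c\})\,(B(p)+1)\, e^{ip\frac{x}{\varepsilon}}\, e^{-i\frac{p^2t}{2\varepsilon}}\, r_0^+(x-pt) + \chi(\{x<c\}) \left[ B(p)\, e^{-ip\frac{x}{\varepsilon}}\, e^{-i\frac{p^2t}{2\varepsilon}}\, e^{2ip\frac{c}{\varepsilon}}\, r_0^+(2c-(x+pt)) + e^{ip\frac{x}{\varepsilon}}\, e^{-i\frac{p^2t}{2\varepsilon}}\, r_0^+(x-pt) \right] + O(\sqrt{\varepsilon t}\, \log\varepsilon^{-1}). \tag{4.18}$$
Finally, the estimate (3.10) follows from the non-stationary phase theorem, noting that, for $\theta(x) = (x-x_0)^2/2t$,
$$\left( \frac{1}{\theta'} \left( \frac{f}{\theta'} \right)' \right)' = t^2 \left[ \frac{3f(x)}{(x-x_0)^4} - \frac{3f'(x)}{(x-x_0)^3} + \frac{f''(x)}{(x-x_0)^2} \right]. \tag{4.19}$$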
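□

Proof of Proposition 3.2. We first proceed formally by setting:
$$e^{-iH_\varepsilon^0\frac{t}{\varepsilon}}\psi_0 = e^{-iH_\varepsilon^c\frac{t}{\varepsilon}}\psi_0 + i\frac{\alpha_\varepsilon}{\varepsilon} \sum_{j\ne 0} \int_0^t ds\, e^{-iH_\varepsilon^c\frac{(t-s)}{\varepsilon}}\, \delta(\,\cdot - c_j)\, e^{-iH_\varepsilon^0\frac{s}{\varepsilon}}\psi_0. \tag{4.20}$$
Since
$$\begin{aligned}
\big\| e^{-iH_\varepsilon^0\frac{t}{\varepsilon}}\psi_0 - e^{-iH_\varepsilon^c\frac{t}{\varepsilon}}\psi_0 \big\|_2^2
&= 2\left[ 1 - \mathrm{Re}\big( e^{-iH_\varepsilon^0\frac{t}{\varepsilon}}\psi_0,\ e^{-iH_\varepsilon^c\frac{t}{\varepsilon}}\psi_0 \big) \right] \\
&= 2\,\mathrm{Re}\left[ i\frac{\alpha_\varepsilon}{\varepsilon} \sum_{j\ne 0} \int_0^t ds\, \big( e^{-iH_\varepsilon^c\frac{(t-s)}{\varepsilon}}\, \delta(\,\cdot - c_j)\, e^{-iH_\varepsilon^0\frac{s}{\varepsilon}}\psi_0,\ e^{-iH_\varepsilon^c\frac{t}{\varepsilon}}\psi_0 \big) \right] \\
&= 2\,\mathrm{Re}\left[ i\frac{\alpha_\varepsilon}{\varepsilon} \sum_{j\ne 0} \int_0^t ds\, \big( \delta(\,\cdot - c_j)\, e^{-iH_\varepsilon^0\frac{s}{\varepsilon}}\psi_0,\ e^{-iH_\varepsilon^c\frac{s}{\varepsilon}}\psi_0 \big) \right] \\
&= 2\,\mathrm{Re}\left[ i\frac{\alpha_\varepsilon}{\varepsilon} \sum_{j\ne 0} \int_0^t ds\, \psi^0(c_j,s)\, \bar\psi(c_j,s) \right],
\end{aligned} \tag{4.21}$$
where $\psi^0(\,\cdot\,,t) = e^{-iH_\varepsilon^0 t/\varepsilon}\psi_0$ and $\psi(\,\cdot\,,t) = e^{-iH_\varepsilon^c t/\varepsilon}\psi_0$, we get
$$\big\| e^{-iH_\varepsilon^0\frac{t}{\varepsilon}}\psi_0 - e^{-iH_\varepsilon^c\frac{t}{\varepsilon}}\psi_0 \big\|_2^2 \le \frac{2\alpha_\varepsilon}{\varepsilon} \int_0^t ds \left( \sum_{j\ne 0} |\psi(c_j,s)|^2 \right)^{1/2} \left( \sum_{j\ne 0} |\psi^0(c_j,s)|^2 \right)^{1/2}. \tag{4.22}$$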
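We use the conservation of energy
$$(\psi(t), H_\varepsilon^c\psi(t)) = E_0. \tag{4.23}$$
Here
$$(\psi(t), H\psi(t)) = \frac{\varepsilon^2}{2}\,(\psi'(t), \psi'(t)) + \alpha_\varepsilon \sum_j |\psi(c_j,t)|^2 \tag{4.24}$$
and
$$E_0 := (\psi(0), H_\varepsilon^c\psi(0)) = \frac{\varepsilon^2}{2}\, \|(r_0^+)'\|_2^2 + \frac{p^2}{2}\, \|r_0^+\|_2^2. \tag{4.25}$$
As a consequence of the conservation of energy, we have
$$\left( \sum_j |\psi(c_j,s)|^2 \right)^{1/2} \le \sqrt{\frac{E_0}{\alpha_\varepsilon}}. \tag{4.26}$$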
To make the previous argument rigorous it is enough to use the standard approximation of the δ-potential and realize that the bound (4.26) is independent of the regularization. Now we estimate $\big(\sum_{j\ne 0} |\psi^0(c_j,s)|^2\big)^{1/2}$ using the non-stationary phase theorem. We consider separately the $c_j$'s outside a given interval, say $[-1,1]$, and those inside. The contribution coming from the former can be bounded by virtue of (3.10) by $C\varepsilon^{3/2}s^{3/2}\tilde\delta^{-2}\Sigma^{1/2}$, where $\Sigma$ is the sum of the series introduced after (3.12). As for those in $[-1,1]$, we bound it with the worst one, which is given by $C\tilde\delta^{-2}\varepsilon^{3/2}s^{3/2}d^{-2}N_1^{1/2}$, where $N_1$ is the number of $j$'s such that $c_j \in [-1,1]$. Integrating on $s$ we obtain the proof of Proposition 3.2. □
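Proof of Proposition 3.5. Since $c$ is fixed we write $r^\pm$ instead of $r_c^\pm$. The assumption (3.25) ensures that the only discontinuities of $r^\pm$ are at the obstacle positions $c_j \in c$. Therefore the restriction of $r^\pm$ to each of the intervals $(c_{j-1}, c_j)$ is differentiable twice. Hence, we compute ($D^s := \partial_x^s$ and $C_k^s$ are suitable coefficients)
$$\begin{aligned}
\|D^s[\chi_c r^+(t)]\|_{L^2}^2 &= \sum_j \|D^s[\chi_j r^+(t)]\|_{L^2}^2 \\
&= \sum_j \int_{c_{j-1}}^{c_j} dx \sum_{h=0}^{s} \sum_{k=0}^{s} C_h^s C_k^s\, D^{s-h}\chi_j\, D^{s-k}\chi_j\, D^h \bar r^+(t)\, D^k r^+(t) \\
&\le C \sum_j \sum_{h=0}^{s} \sum_{k=0}^{s} \left(\frac{1}{\delta}\right)^{s-h} \left(\frac{1}{\delta}\right)^{s-k} \int_{c_{j-1}}^{c_j} dx\, |D^h \bar r^+(t)\, D^k r^+(t)| \\
&\le C \left(\frac{1}{\delta}\right)^{2s} \sum_j \sum_{h=0}^{s} \sum_{k=0}^{s} \|D^h r^+(t)\|_{L^2(c_{j-1},c_j)}\, \|D^k r^+(t)\|_{L^2(c_{j-1},c_j)} \\
&\le C \left(\frac{1}{\delta}\right)^{2s} \|r^+(t)\|_{H^s(\mathbb{R}\setminus c)}^2 \\
&\le C \left(\frac{1}{\delta}\right)^{2s} \left( \|r_0^+\|_{H^s}^2 + \|r_0^-\|_{H^s}^2 \right).
\end{aligned} \tag{4.27}$$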
The last step follows by using Propositions 3.3 and 3.4. The same argument also works for $r^-$ and the proof of Proposition 3.5 is complete. □

Proof of Proposition 3.6. Denote by $r_{in}^\pm$ and $r_{out}^\pm$ the values of $r^\pm$ before and after a collision. If $r_l^\pm$ and $r_r^\pm$ are the left and right limits of $r^\pm$, we have
$$r_{in}^+ = r_l^+, \quad r_{out}^+ = r_r^+, \quad r_{in}^- = r_r^-, \quad r_{out}^- = r_l^-. \tag{4.28}$$
By the conservation of mass during a collision,
$$|r_{in}^+|^2 + |r_{in}^-|^2 = |r_{out}^+|^2 + |r_{out}^-|^2. \tag{4.29}$$
Hence
$$|r_r^+|^2 - |r_r^-|^2 = |r_l^+|^2 - |r_l^-|^2, \tag{4.30}$$
which implies the continuity of the mass flux
$$\Delta(x,t) = |r^+(x,t)|^2 - |r^-(x,t)|^2 \tag{4.31}$$
also in the discontinuity points of $r^\pm$. Moreover the mass flux is bounded because, by Propositions 3.3 and 3.4,
$$\begin{aligned}
\|\Delta(t)\|_{L^\infty} &= \big\| |r^+(t)|^2 - |r^-(t)|^2 \big\|_{L^\infty} \le \big\| \partial_x(|r^+(t)|^2 - |r^-(t)|^2) \big\|_{L^1} \\
&\le 2\, \|r^+(t)\|_{L^2}\, \|r^+(t)\|_{H^1(\mathbb{R}\setminus c)} + 2\, \|r^-(t)\|_{L^2}\, \|r^-(t)\|_{H^1(\mathbb{R}\setminus c)} \\
&= 2\, \|r_0^+\|_{L^2}\, \|r_0^+\|_{H^1} + 2\, \|r_0^-\|_{L^2}\, \|r_0^-\|_{H^1}.
\end{aligned} \tag{4.32}$$
Figure 8.
We use the previous remark to estimate the L∞ -norm of r ± . Given the configuration of obstacles c, fix x ∈ R. Let cl and cr be the first obstacles, on the left and right of x respectively. Draw the backward trajectory reflected by cl and cr up to time zero and denote by t1 > t2 the collision times with cl and cr respectively (see Fig. 8). By the definition of the semi-classical dynamics we have + (cl , t1 )|2 |r + (x, t)|2 = |rout
− = |rin (cl , t1 )|2 + 1(cl , t1 )
− (cr , t2 )|2 + 1(cl , t1 ) = |rout
= ≤
+ |rin (cr , t2 )|2 + (cr , t2 )|2 |rin
(4.33)
− 1(cr , t2 ) + 1(cl , t1 )
+ 2 k r0+ kL2 k r0+ kH 1 + k r0− kL2 k r0− kH 1 .
Iterating this estimate up to time t = 0 and using the fact that the number of iterations is at most pT /D, with D the minimal distance between two consecutive obstacles, the estimate (3.30) is achieved as well as the proof of Proposition 3.6. u t Proof of Theorem 3.2. Consider the expansion (3.34) and denote by Si , i = 1, . . . , 4 the four terms in the right-hand side. We estimate the L2 norms of such terms. Setting: cl n−1 ) χ˜ c ψ0ε , (4.34) 8 = (Ec,τ we first note that 8 is of the form x
Then: S4 =
X j ∈c(x + ,L)
x
(4.35)
cl V τ − Ec,τ χj 8 +
(4.36)
+ − (x)eip ε + rn−1 (x)e−ip ε . 8(x) = rn−1
X cj ∈c(x / + ,L)
cl V τ − Ec,τ χj 8,
where
c(x + , L) = c ∩ (x + − L, x + + L)
(4.37)
and L > 0 will be fixed later on. Moreover, for cj ∈ c(x + , L),
Z x + cl eip ε k2L2 = χj rn−1 k V τ − Ec,τ
cj
cj −1
+
x 2 + cl dx V τ − Ec,τ (x)eip ε χj rn−1
XZ
ck
ck−1
k6=j
x 2 + cl dx V τ − Ec,τ (x)eip ε . χj rn−1 (4.38)
According to this decomposition and applying Proposition (3.1) we get x + cl eip ε k2L2 χj rn−1 k V τ − Ec,τ
(4.39)
+ + ≤ C k χj rn−1 k2C 1 ε| log ε|2 + Cε 2 k χj rn−1 k2C 2 µ2ε . 3
Furthermore, for s = 1, 2, + + k2C s ≤ 2 k D s+1 χj rn−1 k2L2 k χj rn−1 + ≤ 2 k χc rn−1 k2Hs+1
≤ Cδ −2(s+1) k χ˜ c δη1/2 k2Hs+1
(4.40)
≤ Cδ −2(s+1) δ˜−2(s+1) ≤ Cδ −4(s+1) , having used Proposition 3.5 and the conditions η δ˜ δ in the last two steps. Therefore: x 3 + cl eip ε k2L2 ≤ Cε| log ε|2 δ −8 + Cε 2 µ2ε δ −12 (4.41) χj rn−1 k V τ − Ec,τ and
X cj ∈c(x + ,L)
1 cl k V τ − Ec,τ χj 8 k2L2 ≤ Cε 10 ,
(4.42)
1 . by choosing δ = ε γ with γ ≤ 10 Moreover, using the Cauchy Schwartz inequality: x x cl (r0+ eip ε + r0− e−ip ε k2L2 ≤ 2 k r0+ k2L2 + k r0− k2L2 k Ec,t
and hence cl k Ec,τ
X cj ∈c(x / + ,L)
χj 8 k2L2 ≤ 2
X
cj ∈c(x / + ,L)
Z ≤2
|x−x + |>L −1
e η ≤C√ , η
+ − k2L2 + k χj rn−1 k2L2 k χj rn−1
dx δη (x − x + )
(4.43)
(4.44)
by using the conservation of the mass in the third step and setting L = pt + 1 in the last step. The term X X χj 8 k2L2 =k χj 8 k2L2 (4.45) k Vτ cj ∈c(x / + ,L)
cj ∈c(x / + ,L)
can be handled analogously. Therefore
1
k S4 k2L2 ≤ Cε 10 . We now estimate S3 . Using Proposition 3.2 and the associated remark: −1 + − kC 2 + k χc rn−1 kC 2 k S3 k2L2 ≤ Cεµε 4 k χc rn−1
(4.46)
(4.47)
≤ Cεµ− 4 δ −6 1
1
≤ Cε 10 . Finally, we estimate S2 . Using Schwartz inequality for the first step and (4.43) for the second step, we get: cτ
cl (1 − χc )8 k2L2 k S2 k2L2 ≤ 2 k e−iHε ε (1 − χc )8 k2L2 +2 k Ec,τ + − ≤ 2 k (1 − χc )8 k2L2 +4 k (1 − χc )rn−1 k2L2 +4 k (1 − χc )rn−1 k2L2 + − ≤ 8 k (1 − χc )rn−1 k2L2 + k (1 − χc )rn−1 k2L2 X Z cj +δ + − dx |rn−1 (x)|2 + |rn−1 (x)|2 ≤8 c∈c(x + ,L)
Z
+8
cj −δ
|x−x + |>L
+ − dx |rn−1 (x)|2 + |rn−1 (x)|2 .
(4.48) The last term can be bounded as before by putting L > pt +1 and using the conservation of the mass. The first term is bounded by 1 1 + − k2L∞ + k rn−1 k2L∞ ≤ Cµ2ε δ k χ˜ c δη2 k2H 1 ≤ Cµ2ε δ δ˜−2 ≤ Cε 20 . Cµε δ k rn−1 (4.49) Summarizing all the above estimates, we finally find:
h
h i i 1 cτ
−iHεc τε n cl n cl n−1 ) − (Ec,τ ) χψ ) ˜ 0ε ≤ Cε 40 + (e−iHε ε )n−1 − (Ec,τ χψ ˜ 0ε .
(e L2
L2
(4.50)
Iterating the above inequality we find
h i
−iHεc τε n cl n ) − (Ec,τ ) χ˜ c ψ0ε
(e
L2
1
≤ Cε 40 µε .
Finally, we have:
h i
−iHεc εt cl − Ec,t ψ0ε
e L
h
h
i 2 i ct
−iHεc εt
cl cl ≤ e − Ec,t χ˜ c ψ0ε + e−iHε ε − Ec,t (1 − χ˜ c )ψ0ε L2 L2
1
ε cl ε ≤ Cε 50 + (1 − χ˜ c )ψ0 L + Ec,t (1 − χ˜ c )ψ0 . 2 L2
(4.51)
(4.52)
Moreover:
1 2
(1 − χ˜ c )ψ ε 2 ≤ 4
(1 − χ˜ c )δη ( · − x + ) 2 0 L 2
=4
X cj ∈c(x + ,L)
Z
cj +δ˜
cj −δ˜
dx δη (x − x + ) + 4
Z
L2
|x−x + |>1
dx δη (x − x + )
(4.53)
1 C −1 δ˜ ≤ Cµε √ + √ e η ≤ Cε 50 , η η
after
usual L > pt + 1 and using conservation of the mass. The last term,
cl choosing, as
E (1 − χ˜ c )ψ ε can be handled analogously. This concludes the proof of Theorem c,t 0 L2 3.2. u t Acknowledgements. During the preparation of this work we had the chance to discuss with several colleagues. We thank D. Benedetto, E. Caglioti, P. Gerard, P. Markovich, F. Nier, H. Spohn and H.T. Yau for helpful discussions. We also thank the referees for having suggested improvements to the exposition.
References

[A] Albeverio, S., Gesztesy, F., Högh-Krohn, R., Holden, H.: Solvable models in Quantum Mechanics. New York: Springer-Verlag, 1988
[BBS] Boldrighini, C., Bunimovich, L., Sinai, Y.: On the Boltzmann equation for the Lorentz gas. 32, 477–501 (1983)
[BGW] Bourgain, J., Golse, F., Wennberg, B.: On the distribution of free path lengths for periodic Lorentz gas. Commun. Math. Phys. 190, 491–508 (1997)
[DP] Desvillettes, L., Pulvirenti, M.: The linear Boltzmann equation for long-range forces: A derivation from particle systems. Preprint (1997)
[EY1] Erdos, L., Yau, H.T.: Linear Boltzmann equation as scaling limit of quantum Lorentz gas. Contemp. Math. 217, 137–155 (1997)
[EY2] Erdos, L., Yau, H.T.: Linear Boltzmann Equation as the Weak Coupling Limits of a Random Schrodinger Equation. Preprint (1999)
[F] Fedoryuk, M.V.: Stationary phase method and pseudodifferential operators. Uspekhi Mat. Nauk 26, 67–112 (1971)
[G1] Gallavotti, G.: Divergences and approach to equilibrium in the Lorentz and the Wind-tree models. Phys. Rev. 185, 308–322 (1969)
[G2] Gallavotti, G.: Rigorous theory of the Boltzmann equation in the Lorentz gas. Reprinted in: G. Gallavotti, Meccanica Statistica, Quaderni del CNR n. 50, Nota interna n. 358, Istituto di Fisica, Università di Roma, 1972 (1995), pp. 191–204
[GN] Gérard, C., Nier, F.: The Mourre theory for analytically fibered operators. J. Funct. Anal. 152, 202–219 (1998)
[HLW] Ho, T.G., Landau, L.J., Wilkins, A.J.: On the weak coupling limit for a Fermi gas in a random potential. Rev. Math. Phys. 5, 209–298 (1994)
[N1] Nier, F.: Asymptotic analysis of a scaled Wigner equation and Quantum scattering. Transp. Theor. Stat. Phys. 24, 591–629 (1995)
[N2] Nier, F.: A semiclassical picture of Quantum scattering. Ann. Sci. Ec. Norm. Sup. 4, 148–183 (1996)
[Sc] Schulman, L.S.: Applications of the propagator for the delta function potential. In: Path integrals from meV to MeV, Gutzwiller, M.C., Inomata, A., Klauder, J.K., Streit, L. eds., Singapore: World Scientific, 1986, pp. 302–311
[Sp 1] Spohn, H.: Quantum kinetic equation. In: M. Fannes, C. Maes, A. Verbeure, eds., On Three Levels: Micro-, Meso- and Macro-Approaches in Physics, NATO ASI Series B: Physics 324, (1994) New York and London: Plenum Press, pp. 1–10
[Sp 2] Spohn, H.: The Lorentz process converges to a random flight process. Commun. Math. Phys. 60, 277–290 (1978)
[Sp 3] Spohn, H.: Derivation of the transport equation for electron moving through random impurities. J. Stat. Phys. 27, 385 (1977)
[Sp 4] Spohn, H.: Kinetic equations for Hamiltonian dynamics: Markovian limits. Rev. Mod. Phys. 53, 569–615 (1980)
Communicated by J. L. Lebowitz
Commun. Math. Phys. 204, 651 – 668 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Spectral Analysis of the Disordered Stochastic 1-D Ising Model

S. Albeverio (1,2), R. Minlos (3), E. Scacciatelli (4), E. Zhizhina (3)

1 Institut für Angewandte Mathematik, SFB 256, Universität Bonn, D-53155 Bonn; BiBoS Research Centre, Bielefeld; SFB 237 (Essen–Bochum–Düsseldorf), Germany
2 CERFIM, Locarno, Switzerland
3 Institute for Information Transmission Problems, Bolshoy Karetny per. 19, 101447 Moscow, Russia
4 Dipartimento di Matematica, Università di Roma “La Sapienza”, P. le A. Moro 2, 00185 Roma, Italy
Received: 22 September 1998 / Accepted: 12 February 1999
Abstract: We consider the generator of the Glauber dynamics for a 1-D Ising model with random bounded potential at any temperature. We prove that for any realization of the potential the spectrum of the generator is the union of separate branches (so-called k-particles branches, k = 0, 1, 2, . . . ), and with probability one it is a nonrandom set. We find the location of the spectrum and prove the localization for the one-particle branch of the spectrum. As a consequence we find a lower bound for the spectral gap for any realization of the random potential.

1. Definitions and Results

We consider the Glauber dynamics for a one-dimensional Ising model with random potential. The formal Hamiltonian of the model has the following form:
$$H(\eta,\omega) = -\sum_{\substack{x,y\in\mathbb{Z}^1 \\ |x-y|=1}} \omega_{x,y}\,\eta_x\eta_y, \qquad \eta_x = \pm 1, \tag{1}$$
where $\omega_{x,y} = \omega_{y,x} \equiv \omega_b$ are independent identically distributed random variables marked by bonds $b = (x,y)$, $|x-y| = 1$, of the lattice $\mathbb{Z}^1$. Denote by $p_b = p$ the probability distribution of $\omega_b$. Let $B^1$ be the set of all bonds of the lattice $\mathbb{Z}^1$. Then the family of random variables $\omega = \{\omega_b, b \in B^1\}$ forms a random field on $B^1$ with the space of realizations $\Omega = \mathbb{R}^{B^1}$ and the probability distribution $P = p^{B^1}$. The random field $\omega = \{\omega_b\}$ is ergodic with respect to the group $\{T_j, j \in \mathbb{Z}^1\}$ of automorphisms of the probability space $(\Omega, P)$ generated by the group of space shifts on the lattice $\mathbb{Z}^1$:
$$(T_j\omega)_{x,y} = \omega_{x-j,y-j}, \qquad x, y, j \in \mathbb{Z}^1,\ |x-y| = 1.$$
S. Albeverio, R. Minlos, E. Scacciatelli, E. Zhizhina
We assume that

1) the common probability distribution $p$ is absolutely continuous with respect to the Lebesgue measure,
2) the random variables $\omega_b$ are positive, finite and bounded away from zero: $0 < \gamma_1 \le \omega_b \le \gamma_2 < \infty$, $\gamma_2 = \inf\{C : \Pr(\omega_b > C) = 0\}$.

For any such realization $\omega \in \Omega$ we will define below a stationary Markov process on the configuration space $X = \{-1,+1\}^{\mathbb{Z}^1}$, which is usually called the Glauber dynamics for the Ising model with Hamiltonian (1). The goal of the present paper is to study the spectral properties of the generator of the Glauber dynamics. First we formulate results which are valid for any given bounded sequence $\omega = \{\omega_b, b \in B^1\}$.

Theorem 1. Let $\Omega_0 \subset \Omega$ be the set of all bounded realizations:
$$\Omega_0 = \{\omega : \sup_b |\omega_b| < \infty\}.$$
Then for any $\omega \in \Omega_0$ and for any $\beta$ (inverse temperature) the limit Gibbs distribution $\mu_\omega^\beta(\eta)$ exists and it is defined on the configuration space $X$. $\mu_\omega^\beta$ is the weak limit of the distributions $\mu_{\omega,\Lambda}^\beta$ for finite $\Lambda$:
$$\mu_\omega^\beta = \lim_{\Lambda\nearrow\mathbb{Z}^1} \mu_{\omega,\Lambda}^\beta, \tag{2}$$
$$d\mu_{\omega,\Lambda}^\beta(\eta) = \frac{e^{-\beta H_\Lambda(\eta,\omega)}}{Z_{\beta,\Lambda}(\omega)}\, d\mu^0(\eta), \qquad \eta \in X,$$
where
$$H_\Lambda(\eta,\omega) = -\sum_{\substack{x,y\in\Lambda \\ |x-y|=1}} \omega_{x,y}\,\eta_x\eta_y,$$
and $\mu^0 = \nu_0^{\mathbb{Z}^1}$, $\nu_0$ is a distribution of a single spin:
$$\nu_0(\eta = +1) = \nu_0(\eta = -1) = \frac{1}{2};$$
$Z_{\beta,\Lambda}(\omega)$ is a normalizing factor. The limit Gibbs distribution $\mu_\omega^\beta(\eta)$ determines a non-homogeneous Markov chain on $\mathbb{Z}^1$ with the state space $\{-1, 1\}$, transition probabilities
$$P_\omega(\eta_{x+1}|\eta_x) = \frac{e^{\beta\omega_{x,x+1}\eta_x\eta_{x+1}}}{2\cosh\beta\omega_{x,x+1}}, \qquad x \in \mathbb{Z}^1,\ \eta_x, \eta_{x+1} = \pm 1, \tag{3}$$
and stationary distribution $\nu_0$.

Remark 1. Denote by $\{\tau_j, j \in \mathbb{Z}^1\}$ the group of automorphisms of the space $X$ generated by the shifts on the lattice $\mathbb{Z}^1$:
$$(\tau_j\eta)_x = \eta_{x-j}, \qquad x, j \in \mathbb{Z}^1.$$
Then for the family of measures $\{\mu_\omega^\beta, \omega \in \Omega\}$ we have
$$\mu_{T_j\omega}^\beta(\tau_j A) = \mu_\omega^\beta(A), \qquad A \subset X.$$
Spectral Analysis of Disordered Stochastic 1-D Ising Model
We recall that the Glauber dynamics for the non-homogeneous Ising model with Hamiltonian (1) is a stationary Markov process on the space $X$
$$\eta_\omega(t) = \{\eta_{x,\omega}(t),\ t \ge 0,\ x \in \mathbb{Z}^1\}, \qquad \eta_\omega(t) \in X,\ \omega \in \Omega_0, \tag{4}$$
with invariant measure $\mu_\omega^\beta$ (see [4] for details). This process is determined by the assumption that for any configuration $\eta \in X$ the intensity for the reversal of the spin $\eta_x$ at the point $x$ equals
$$c(x,\eta,\omega) = \frac{1}{1+e^{-\Delta_x(\eta,\omega)}}, \tag{5}$$
with
$$\Delta_x(\eta,\omega) = \beta H(\eta^{(x)},\omega) - \beta H(\eta,\omega) = -2\beta(\omega_{x,x-1}\eta_x\eta_{x-1} + \omega_{x,x+1}\eta_x\eta_{x+1}).$$
Here $\eta^{(x)} \in X$ is a configuration which differs from the configuration $\eta \in X$ only at the point $x$:
$$\eta_y^{(x)} = \begin{cases} \eta_y, & y \ne x, \\ -\eta_x, & y = x. \end{cases}$$
This means that for small $\delta$
$$\Pr\{\eta_\omega(t+\delta) = \eta_\omega^{(x)}(t) \mid \eta_\omega(t)\} = \delta\cdot c(x,\eta_\omega(t),\omega) + o(\delta).$$
From the general constructions of the book [4] it follows that for any sequence $\omega \in \Omega_0$ the unique Glauber dynamics (4) with stationary measure $\mu_\omega^\beta$ given by (2) exists. The generator $L^\beta(\omega)$ of the stochastic semigroup $S(t) = \exp\{tL^\beta(\omega)\}$ of the process (see [4]) acts in the space of functions $L_2(X, d\mu_\omega^\beta)$ in the following way:
$$(L^\beta(\omega)f)(\eta) = \sum_{x\in\mathbb{Z}^1} c(x,\eta,\omega)\big[f(\eta^{(x)}) - f(\eta)\big], \tag{6}$$
$$f(\eta) \in L_2(X, d\mu_\omega^\beta) \equiv L_2(\omega), \qquad \eta \in X.$$
The operator (6) is defined on cylindrical functions in $L_2(\omega)$, and the closure $\bar L^\beta(\omega)$ of (6) in $L_2(\omega)$ is a selfadjoint operator in $L_2(\omega)$ (see [4]). In what follows we preserve the notation $L^\beta(\omega)$ for its closure $\bar L^\beta(\omega)$. We denote by $\{U_j, j \in \mathbb{Z}^1\}$ the unitary group acting in the space of functions on $X$ and generated by the group $\{\tau_j\}$:
$$(U_j f)(\eta) = f(\tau_j^{-1}\eta), \qquad f \in L_2(\omega).$$

Remark 2. For any fixed $\omega \in \Omega$ the spaces $L_2(\omega)$ and $L_2(T_j^{-1}\omega)$, $j \in \mathbb{Z}^1$, are unitarily equivalent by the unitary mapping $U_j$:
$$U_j : L_2(\omega) \to L_2(T_j^{-1}\omega),$$
and the representation (6) implies that
$$U_j L^\beta(\omega) U_j^{-1} = L^\beta(T_j^{-1}\omega), \tag{7}$$
i.e. the family of operators and spaces $\{L_2(\omega), L^\beta(\omega)\}$ is a metrically transitive family with respect to the unitary group of space translations $\{U_j, j \in \mathbb{Z}^1\}$ (see [11] for details).
Theorem 2. I. The space $L_2(X, d\mu_\omega^\beta)$ is decomposed into a direct sum of subspaces invariant with respect to the operator $L^\beta(\omega)$:
$$L_2(X, d\mu_\omega^\beta) = \bigoplus_{k=0}^{\infty} l_k(\omega),$$
with
$$U_j : l_k(\omega) \to l_k(T_j^{-1}\omega), \quad \text{and} \quad U_j L_k^\beta(\omega) U_j^{-1} = L_k^\beta(T_j^{-1}\omega) \tag{8}$$
for every $\omega \in \Omega$ and $k = 0, 1, 2, \dots$. Here
$$L_k^\beta(\omega) = L^\beta(\omega)|_{l_k(\omega)}$$
is the restriction of the operator $L^\beta(\omega)$ to the invariant subspace $l_k(\omega)$. In other words, (8) means that for every $k = 0, 1, \dots$ the family $\{l_k(\omega), L_k^\beta(\omega)\}$ of subspaces and operators forms a metrically transitive family with respect to the unitary group $\{U_j\}$.

II. There is a unitary mapping from $L_2(X, d\mu_\omega^\beta)$ to the antisymmetric Fock space $\mathcal{F}^{as}(l_2(\mathbb{Z}^1))$:
$$V(\omega) : L_2(X, d\mu_\omega^\beta) \to \mathcal{F}^{as}(l_2(\mathbb{Z}^1)), \tag{9}$$
such that every invariant subspace $l_k(\omega)$, $k = 1, 2, \dots$ ($l_0(\omega) = \{\mathrm{const}\}$) transforms to the $k$th antisymmetric tensor product of the space $l_2(\mathbb{Z}^1)$ ($k$-particle subspace of $\mathcal{F}^{as}(l_2(\mathbb{Z}^1))$):
$$V(\omega) : l_k(\omega) \to [(l_2(\mathbb{Z}^1))^{\otimes k}]^{as} \subset \mathcal{F}^{as}(l_2(\mathbb{Z}^1)), \qquad k = 1, 2, \dots,$$
and the operator $L^\beta(\omega)$ transforms to the operator $\tilde L^\beta(\omega)$ in the space $\mathcal{F}^{as}(l_2(\mathbb{Z}^1))$:¹
$$\tilde L^\beta(\omega) = V(\omega) L^\beta(\omega) V^*(\omega) = d\Gamma(\tilde L_1^\beta(\omega)), \tag{10}$$
with
$$\tilde L_1^\beta(\omega) = V(\omega) L_1^\beta(\omega) V^*(\omega). \tag{11}$$
Corollary 1. Let
$$\sigma_1(\omega) = \sigma\big(L^\beta(\omega)|_{l_1(\omega)}\big)$$
be the spectrum of the operator $L^\beta(\omega)$ on the invariant subspace $l_1(\omega)$ (the so-called one-particle branch of the spectrum). Then the spectrum of $L^\beta(\omega)$ has the following structure:
$$\sigma(L^\beta(\omega)) = \{0\} \cup \{\sigma_1(\omega)\} \cup \{\sigma_1(\omega) \dot{+} \sigma_1(\omega)\} \cup \dots \cup \{\underbrace{\sigma_1(\omega) \dot{+} \sigma_1(\omega) \dot{+} \dots \dot{+} \sigma_1(\omega)}_{k\ \text{times}}\} \cup \dots. \tag{12}$$
Here 0 is the eigenvalue corresponding to the invariant subspace $l_0(\omega) = \{\mathrm{const}\}$, and $A \dot{+} B \subset \mathbb{R}$ means the “arithmetic sum” of the sets $A$ and $B$, i.e.
$$A \dot{+} B = \{x \in \mathbb{R} \mid x = y + z,\ y \in A,\ z \in B\}.$$

¹ We recall that for any operator $A$ in the space $l_2(\mathbb{Z}^1)$ we denote by $d\Gamma(A)$ the operator in the space $\mathcal{F}^{as}(l_2(\mathbb{Z}^1))$ such that any subspace $[(l_2(\mathbb{Z}^1))^{\otimes k}]^{as}$, $k = 0, 1, 2, \dots$, is invariant and the restriction of $d\Gamma(A)$ on this subspace equals
$$A \otimes E \otimes \dots \otimes E + E \otimes A \otimes \dots \otimes E + \dots + E \otimes \dots \otimes E \otimes A$$
(each product containing $k$ factors), with $E$ being the identity operator.

In the next section we give the proofs of Theorems 1 and 2, which are based on the methods of [5] and the analysis of the 1-D stochastic Ising model with constant potential from [10]. We emphasize that Theorem 2 is an essential generalization of results from [10] in the case of the non-homogeneous 1-D Ising model, although actually the basic line of our reasoning is the same as in the proofs from [10]. Thus the Glauber dynamics for the non-homogeneous 1-D Ising model is “integrable”, i.e. the corresponding stochastic semigroup of operators in the functional space $L_2(X, d\mu_\omega^\beta)$ has the canonical spectral representation. The key point for the proof of Theorem 2 is the construction of a special multiplicative basis in $L_2(X, d\mu_\omega^\beta)$ and the following explicit expression for matrix elements of the generator in this basis, see Sect. 2. This basis had been introduced in [9] for the first time, and then the construction of the basis has been exploited repeatedly, see [6] and references therein.

Remark 3. Let us consider in $\mathcal{F}^{as}(l_2(\mathbb{Z}^1))$ a unitary representation $\{\tilde U_j, j \in \mathbb{Z}^1\}$ for the group of the space translations on the lattice $\mathbb{Z}^1$, such that every subspace $[(l_2(\mathbb{Z}^1))^{\otimes k}]^{as}$, $k = 0, 1, 2, \dots$, is invariant for the operators $\{\tilde U_j, j \in \mathbb{Z}^1\}$, and $\tilde U_j$ have the following representation on $[(l_2(\mathbb{Z}^1))^{\otimes k}]^{as}$:
$$\tilde U_j \big|_{[(l_2(\mathbb{Z}^1))^{\otimes k}]^{as}} = (\tilde U_j^{(1)})^{\otimes k}, \qquad k = 0, 1, 2, \dots.$$
Here $\tilde U_j^{(1)} : l_2(\mathbb{Z}^1) \to l_2(\mathbb{Z}^1)$ is the usual operator of the space translation in $l_2(\mathbb{Z}^1)$:
$$(\tilde U_j^{(1)} f)(x) = f(x-j), \qquad f \in l_2(\mathbb{Z}^1),\ x, j \in \mathbb{Z}^1.$$
Then we have for $V(T_j^{-1}\omega) : L_2(T_j^{-1}\omega) \to \mathcal{F}^{as}(l_2(\mathbb{Z}^1))$,
$$V(T_j^{-1}\omega) = \tilde U_j V(\omega) U_j^{-1}, \qquad \omega \in \Omega. \tag{13}$$
Remark 4. The relations (7), (13) and (10) imply that U˜ j L˜ β (ω)U˜ j−1 = L˜ β (Tj−1 ω), where the operators {U˜ j , j ∈ Z 1 } in F as (l2 (Z1 )) were defined in Remark 3. Thus the operator L˜ β (ω) is metrically transitive with respect to the unitary group {U˜ j , j ∈ Z 1 }, and from general results (see, for example [11]) it follows that the spectrum of the operator L˜ β (ω) (and Lβ (ω)) is non-random for P-a.e. ω.
The representations (10)–(11) imply that we need only the description of the one-particle branch of the spectrum to obtain the description (nature and location) of the whole spectrum for the operator $L^\beta(\omega)$ (or $\tilde L^\beta(\omega)$). This is the main goal of the present paper, and in Sects. 3–4 we will study the spectral properties of the operator $L_1^\beta(\omega)$ (or $\tilde L_1^\beta(\omega)$), which hold with probability 1 (or for P-a.e. $\omega$). We prove here the localization under our assumptions on the random field $\omega$, and find for any $\beta$ the location of the spectrum (which is non-random for P-a.e. $\omega$). We emphasize that we prove the localization for the operator $\tilde L_1^\beta(\omega)$ acting in the space $l_2(\mathbb{Z}^1)$. We recall that this means the following:

1) the operator $\tilde L_1^\beta(\omega)$ has only pure point spectrum:
$$\sigma(\tilde L_1^\beta(\omega)) = \sigma_{pp}(\tilde L_1^\beta(\omega)),$$
i.e. there is a basis $\{\psi_\lambda(x), \lambda \in \sigma_{pp}\}$ of eigenfunctions for the operator $\tilde L_1^\beta(\omega)$ in the space $l_2(\mathbb{Z}^1)$, and
2) every eigenfunction $\psi_\lambda(x)$ decays exponentially:
$$|\psi_\lambda(x)| < C\, e^{-\alpha|x|} \tag{14}$$
with constants $C = C(\psi_\lambda)$, $\alpha = \alpha(\psi_\lambda) > 0$.

In Sects. 3–4 we give the proof of the following basic theorem.

Theorem 3. I. Under our assumptions 1)–2) on the random field $\omega$ we have localization for the operator $\tilde L_1^\beta(\omega)$ for P-a.e. $\omega$.
II. The spectrum $\sigma(\tilde L_1^\beta(\omega)) = \sigma_{pp}(\tilde L_1^\beta(\omega))$ of the operator $\tilde L_1^\beta(\omega)$ is nonrandom for P-a.e. $\omega$, and it is the same as the following segment:
$$\sigma(\tilde L_1^\beta(\omega)) = [-1-c, -1+c] \tag{15}$$
with $c = \tanh 2\beta\gamma_2 < 1$ for any $\beta$.
III. For any realization $\bar\omega \in \Omega$ of the random field $\omega$ we have for the spectrum $\sigma(\tilde L_1^\beta(\bar\omega))$ of the operator $\tilde L_1^\beta(\bar\omega)$:
$$\sigma(\tilde L_1^\beta(\bar\omega)) \subseteq [-1-c, -1+c]$$
with the same constant $c$.

Corollary 2. I. The spectrum of the operator $L^\beta(\omega)$ for P-a.e. $\omega$ is a nonrandom set and it is the same as the union of the following segments:
$$0,\ [-1-c, -1+c],\ [-2-2c, -2+2c],\ \dots,\ [-k-kc, -k+kc],\ \dots,$$
where $c = \tanh 2\beta\gamma_2 < 1$ for any $\beta$.
II. For any realization of the random field $\bar\omega \in \Omega$ there is a spectral gap $g_{\bar\omega} \ge 1 - \tanh 2\beta\gamma_2$.
The methods of the present paper involve the general analysis of metrically transitive operators from the book [11], the proof of the localization for off-diagonal disorder in the one-dimensional case based on the positiveness of the Lyapunov exponent, see [2, 7], and also some bounds from the papers [3] and [8]. In conclusion we state a conjecture for the ν-dimensional, ν ≥ 2, disordered stochastic Ising model.

Conjecture 1. Under the assumptions 1)–2) on the random potential the upper (the so-called one-particle) branch of the spectrum of the generator for the ν-dimensional (ν ≥ 2) stochastic Ising model at high temperatures is a non-random set with probability 1, and coincides with the segment $[-1-C\beta, -1+C\beta]$, where $\beta$ is small and $C > 0$ is a constant depending on ν and the potential. Moreover for any realization of the random potential the spectral gap $g_{\bar\omega}$ exists and $g_{\bar\omega} \ge 1 - C\beta$.

2. The Case of Any Fixed Realization. Proofs of Theorems 1 and 2

We start with discussions of the results which are valid for any fixed realization $\omega \in \Omega$ of the random field.

Proof of Theorem 1. Repeating the reasoning from the book [5], one can write
$$P_\Lambda(\eta_{x+1}|\hat\eta_x) = \frac{P_\Lambda(\eta_{x+1},\hat\eta_x)}{P_\Lambda(\hat\eta_x)} = \frac{Z_{\beta,\Lambda\setminus\{\hat x,x+1\}}(\eta_{x+1},\omega)\cdot e^{\beta\omega_{x,x+1}\eta_x\eta_{x+1}}}{Z_{\beta,\Lambda\setminus\{\hat x\}}(\eta_x,\omega)}, \tag{16}$$
with $\Lambda = \{-\Lambda, \dots, \Lambda\} \subset \mathbb{Z}^1$, $\hat x = \{\dots, x-2, x-1, x\} \subset \mathbb{Z}^1$, $\hat\eta_x = \{\dots, \eta_{x-2}, \eta_{x-1}, \eta_x\}$;
$$Z_{\beta,A}(\eta_x,\omega) = \sum_{\eta_A} \exp\Big\{\beta \sum_{\substack{y,z\in A\cup\{x\} \\ |y-z|=1}} \omega_{y,z}\eta_y\eta_z\Big\} \qquad (\hat\eta_x \text{ is fixed}).$$
Let us consider the matrix
$$a(\omega_{x,x+1}) = \begin{pmatrix} e^{\beta\omega_{x,x+1}} & e^{-\beta\omega_{x,x+1}} \\ e^{-\beta\omega_{x,x+1}} & e^{\beta\omega_{x,x+1}} \end{pmatrix},$$
with matrix elements $a_{\eta_x,\eta_{x+1}}(\omega_{x,x+1}) = e^{\beta\eta_x\eta_{x+1}\omega_{x,x+1}}$, $\eta_x, \eta_{x+1} = \pm 1$. Then
$$\lambda_1(\omega_{x,x+1}) = 2\cosh\beta\omega_{x,x+1}, \qquad \lambda_2(\omega_{x,x+1}) = 2\sinh\beta\omega_{x,x+1}$$
are the eigenvalues of $a(\omega_{x,x+1})$, and the corresponding orthonormal eigenvectors
$$\psi_1 = \Big(\frac{1}{\sqrt 2}, \frac{1}{\sqrt 2}\Big); \qquad \psi_2 = \Big(\frac{1}{\sqrt 2}, -\frac{1}{\sqrt 2}\Big)$$
are independent of $\omega$ ($\psi_j(\eta_x = -1) = \psi_j^{(1)}$, $\psi_j(\eta_x = +1) = \psi_j^{(2)}$, $j = 1, 2$). Since the matrix elements of any $2\times 2$ symmetric matrix $P = \{p_{i,j}\}$, $i, j = 1, 2$, are known to have the following representation:
$$p_{i,j} = \lambda_1\psi_1(i)\psi_1(j) + \lambda_2\psi_2(i)\psi_2(j), \qquad i, j = 1, 2,$$
where $\lambda_1, \lambda_2$ are eigenvalues for $P$, and $\{\psi_1, \psi_2\}$ is the corresponding orthonormal basis of eigenvectors, we get:
$$\begin{aligned}
Z_{\beta,\Lambda\setminus\{\hat x,x+1\}}(\eta_{x+1},\omega) &= \sum_{\eta_{x+2},\dots,\eta_\Lambda=\pm 1} a_{\eta_{x+1},\eta_{x+2}}(\omega_{x+1,x+2}) \cdots a_{\eta_{\Lambda-1},\eta_\Lambda}(\omega_{\Lambda-1,\Lambda}) \\
&= \sum_{\xi=1,2} \lambda_1(\omega_{x+1,x+2})\cdot\lambda_1(\omega_{x+2,x+3})\cdots\lambda_1(\omega_{\Lambda-1,\Lambda})\,\psi_1(\eta_{x+1})\psi_1(\xi) \\
&\quad + \sum_{\xi=1,2} \lambda_2(\omega_{x+1,x+2})\cdot\lambda_2(\omega_{x+2,x+3})\cdots\lambda_2(\omega_{\Lambda-1,\Lambda})\,\psi_2(\eta_{x+1})\psi_2(\xi).
\end{aligned}$$
Hence
$$Z_{\beta,\Lambda\setminus\{\hat x,x+1\}}(\eta_{x+1},\omega) = \prod_{k=x+1}^{\Lambda-1} \lambda_1(\omega_{k,k+1}) \tag{17}$$
is a function independent of $\eta_{x+1}$, and
$$\begin{aligned}
Z_{\beta,\Lambda\setminus\{\hat x\}}(\eta_x,\omega) &= Z_{\beta,\Lambda\setminus\{\hat x,x+1\}}(\eta_{x+1}=-1)\,e^{-\beta\omega_{x,x+1}\eta_x} + Z_{\beta,\Lambda\setminus\{\hat x,x+1\}}(\eta_{x+1}=1)\,e^{\beta\omega_{x,x+1}\eta_x} \\
&= \prod_{k=x+1}^{\Lambda-1} \lambda_1(\omega_{k,k+1})\cdot 2\cosh\beta\omega_{x,x+1}.
\end{aligned} \tag{18}$$
Inserting (17) and (18) into (16), we obtain:
$$P_\Lambda(\eta_{x+1}|\hat\eta_x) = \frac{e^{\beta\omega_{x,x+1}\eta_x\eta_{x+1}}}{2\cosh\beta\omega_{x,x+1}} = P_\omega(\eta_{x+1}|\eta_x).$$
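The product formula (17) and the resulting conditional probability can be confirmed by brute-force summation over a short chain. The sketch below is only an illustration: the function names are hypothetical, and the list `omegas` stands for the couplings on the bonds to the right of the fixed spin.

```python
import itertools
import math

def right_partition_sum(eta_first, omegas, beta):
    """Sum of exp(beta * sum of omega*eta*eta) over all spins to the right of eta_first.

    omegas[k] couples spin k and spin k+1 of the chain (eta_first, tail...).
    """
    total = 0.0
    for tail in itertools.product((-1, 1), repeat=len(omegas)):
        spins = (eta_first,) + tail
        energy = sum(w * spins[k] * spins[k + 1] for k, w in enumerate(omegas))
        total += math.exp(beta * energy)
    return total

def product_of_lambda1(omegas, beta):
    """Right-hand side of (17): the product of lambda_1 = 2*cosh(beta*omega) over the bonds."""
    return math.prod(2 * math.cosh(beta * w) for w in omegas)

def conditional_probability(eta_next, eta_x, omega_bond, omegas_right, beta):
    """The ratio (16) for a finite chain, computed by direct summation."""
    numerator = (right_partition_sum(eta_next, omegas_right, beta)
                 * math.exp(beta * omega_bond * eta_x * eta_next))
    denominator = right_partition_sum(eta_x, [omega_bond] + list(omegas_right), beta)
    return numerator / denominator

def transition_probability(eta_next, eta_x, omega_bond, beta):
    """The transition probability (3)."""
    return math.exp(beta * omega_bond * eta_x * eta_next) / (2 * math.cosh(beta * omega_bond))
```

The partition sum does not depend on the fixed boundary spin, which is exactly why the conditional law reduces to the one-bond expression (3).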
Theorem 1 is proved. □

Proof of Theorem 2. Denote by $v_x(\eta,\omega)$ the function of the form:
$$v_x(\eta,\omega) = \frac{\eta_x - \langle\eta_x|\eta_{<x}\rangle}{\langle(\eta_x - \langle\eta_x|\eta_{<x}\rangle)^2|\eta_{<x}\rangle^{1/2}}, \qquad x \in \mathbb{Z}^1, \tag{19}$$
where $\langle\,\cdot\,|\eta_{<x}\rangle$ is the conditional mean of the measure $\mu_\omega^\beta$ under the assumption that the values of the configuration $\eta$ at points to the left of $x$ are fixed. It has been shown in [9] that the following functions
$$v_I(\eta,\omega) = \prod_{x\in I} v_x(\eta,\omega), \qquad v_\emptyset \equiv 1, \tag{20}$$
where $I = \{x_1, \dots, x_n\} \subset \mathbb{Z}^1$ are finite subsets of $\mathbb{Z}^1$, form an orthonormal basis in $L_2(\omega)$.
Lemma 1. For any $k = 0, 1, 2, \dots$ the subspaces $l_k(\omega)$ ($l_0(\omega) = \{\mathrm{const}\}$), spanned by the set of functions $\{v_I(\eta,\omega),\ I = \{x_1, x_2, \dots, x_k\} \subset \mathbb{Z}^1,\ |I| = k\}$, are invariant subspaces for the operator $L^\beta(\omega)$. In addition our Remarks 1 and 2 immediately imply that
$$U_j l_k(\omega) = l_k(T_j^{-1}\omega) \quad \text{and} \quad (U_j v)_I(\eta,\omega) = v_{I-j}(\eta, T_j^{-1}\omega).$$

Proof. Let us first consider a one-point set $I = \{x\}$. From (5) we have
$$\begin{aligned}
c(x,\eta,\omega) &= \frac{1}{2} - \frac{1}{2}\,\eta_x\eta_{x-1}\,\frac{\sinh 2\beta\omega_{x,x-1}}{\cosh 2\beta\omega_{x,x-1} + \cosh 2\beta\omega_{x,x+1}} - \frac{1}{2}\,\eta_x\eta_{x+1}\,\frac{\sinh 2\beta\omega_{x,x+1}}{\cosh 2\beta\omega_{x,x-1} + \cosh 2\beta\omega_{x,x+1}} \\
&= \frac{1}{2} - \frac{1}{2}\,\frac{a_x(1-a_{x+1}^2)}{(1-a_x^2a_{x+1}^2)}\,\eta_x\eta_{x-1} - \frac{1}{2}\,\frac{a_{x+1}(1-a_x^2)}{(1-a_x^2a_{x+1}^2)}\,\eta_x\eta_{x+1},
\end{aligned} \tag{21}$$
and
$$v_x \equiv v_x(\eta,\omega) = \frac{\eta_x - a_x\eta_{x-1}}{(1-a_x^2)^{1/2}} \tag{22}$$
with $a_x = a_x(\omega) = \tanh\beta\omega_{x-1,x}$. The representations (6), (21), (22) imply that
$$L^\beta(\omega)v_x = A_{x,x-1}v_{x-1} + A_{x,x}v_x + A_{x,x+1}v_{x+1}, \tag{23}$$
where
$$A_{x,x-1} = \frac{a_x(1-a_x^2)^{1/2}(1-a_{x-1}^2)^{1/2}}{(1-a_x^2a_{x-1}^2)}, \qquad A_{x,x+1} = \frac{a_{x+1}(1-a_x^2)^{1/2}(1-a_{x+1}^2)^{1/2}}{(1-a_x^2a_{x+1}^2)},$$
$$A_{x,x} = -1 - \frac{a_x^2(1-a_{x-1}^2)}{(1-a_x^2a_{x-1}^2)} + \frac{a_{x+1}^2(1-a_x^2)}{(1-a_x^2a_{x+1}^2)}. \tag{24}$$
(25)
S. Albeverio, R. Minlos, E. Scacciatelli, E. Zhizhina
if $|x-y|\ge 2$, and
$$L^\beta(\omega)(v_xv_{x+1}) = \sum_y c(y,\eta,\omega)\Big[(v_x(\eta^{(y)})-v_x(\eta))\cdot v_{x+1}(\eta) + v_x(\eta)\cdot(v_{x+1}(\eta^{(y)})-v_{x+1}(\eta)) + (v_x(\eta^{(y)})-v_x(\eta))\cdot(v_{x+1}(\eta^{(y)})-v_{x+1}(\eta))\Big]$$
$$= v_{x+1}(\eta)\cdot\sum_y c(y,\eta,\omega)\big[v_x(\eta^{(y)})-v_x(\eta)\big] + v_x(\eta)\cdot\sum_y c(y,\eta,\omega)\big[v_{x+1}(\eta^{(y)})-v_{x+1}(\eta)\big] - \frac{4a_{x+1}\,c(x,\eta,\omega)}{(1-a_x^2)^{1/2}(1-a_{x+1}^2)^{1/2}}$$
$$= (L^\beta(\omega)v_x)\cdot v_{x+1} + v_x\cdot(L^\beta(\omega)v_{x+1}) - (A_{x,x+1}v_{x+1}^2 + A_{x+1,x}v_x^2)$$
$$= (A_{x,x}+A_{x+1,x+1})v_xv_{x+1} + A_{x,x-1}v_{x-1}v_{x+1} + A_{x+1,x+2}v_xv_{x+2}. \tag{26}$$
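In the general case, when $I=\{x_1<x_2<\dots<x_k\}\subset\mathbb{Z}^1$, using similar calculations we have:
$$L^\beta(\omega)v_I = \sum_{i=1}^k\Big[(L^\beta(\omega)v_{x_i})\,v_{I\setminus\{x_i\}} - \delta^I_{x_i,x_i+1}\,\frac{4a_{x_i+1}\,c(x_i,\eta,\omega)}{(1-a_{x_i}^2)^{1/2}(1-a_{x_i+1}^2)^{1/2}}\,v_{I\setminus\{x_i,x_i+1\}}\Big],$$
where
$$\delta^I_{x_i,x_i+1} = \begin{cases}1, & \text{if the pair } \{x_i,x_i+1\}\subset \operatorname{supp} I,\\ 0, & \text{if the pair } \{x_i,x_i+1\}\not\subset \operatorname{supp} I.\end{cases}$$
The relation
$$A_{x,x+1}v_{x+1}^2 + A_{x+1,x}v_x^2 - \frac{4a_{x+1}\,c(x,\eta,\omega)}{(1-a_x^2)^{1/2}(1-a_{x+1}^2)^{1/2}} = 0,$$
which is valid for any $x\in\mathbb{Z}^1$, implies that
$$L^\beta(\omega)v_I = \sum_{i=1}^k\Big[(L^\beta(\omega)v_{x_i})\,v_{I\setminus\{x_i\}} - \delta^I_{x_i,x_i+1}\big(A_{x_i,x_i+1}v_{x_i+1}^2 + A_{x_i+1,x_i}v_{x_i}^2\big)\,v_{I\setminus\{x_i,x_i+1\}}\Big]. \tag{27}$$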
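The relation $A_{x,x+1}v_{x+1}^2 + A_{x+1,x}v_x^2 = 4a_{x+1}c(x,\eta,\omega)/\big((1-a_x^2)^{1/2}(1-a_{x+1}^2)^{1/2}\big)$ used to pass to (27) involves only the three spins $\eta_{x-1},\eta_x,\eta_{x+1}$, so it can be checked by brute force over the eight spin configurations. The following is a minimal sketch under that reading; the function name `relation_residual` and the sample values of $a_x$, $a_{x+1}$ are hypothetical.

```python
import itertools
import math


def relation_residual(a_x, a_x1):
    """Max over spin configurations of |A v_{x+1}^2 + A v_x^2 - 4 a_{x+1} c / (s_x s_{x+1})|."""
    s_x = math.sqrt(1.0 - a_x ** 2)
    s_x1 = math.sqrt(1.0 - a_x1 ** 2)
    D = 1.0 - a_x ** 2 * a_x1 ** 2
    A = a_x1 * s_x * s_x1 / D            # A_{x,x+1} = A_{x+1,x}, from (24)

    worst = 0.0
    for e_prev, e_0, e_next in itertools.product((-1, 1), repeat=3):
        # flip rate c(x, eta, omega) in the second form of (21)
        c = (0.5
             - 0.5 * a_x * (1 - a_x1 ** 2) / D * e_0 * e_prev
             - 0.5 * a_x1 * (1 - a_x ** 2) / D * e_0 * e_next)
        v_x = (e_0 - a_x * e_prev) / s_x       # (22) at site x
        v_x1 = (e_next - a_x1 * e_0) / s_x1    # (22) at site x + 1
        residual = A * v_x1 ** 2 + A * v_x ** 2 - 4 * a_x1 * c / (s_x * s_x1)
        worst = max(worst, abs(residual))
    return worst


print(relation_residual(math.tanh(0.3), math.tanh(0.9)))   # expected to be at rounding-error level
```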
It is easy to see that any finite set $I\subset\mathbb{Z}^1$ can be decomposed into $r$ subsets $I_m$, $m=1,\dots,r$; $I=\bigcup_{m=1}^r I_m$, such that each $I_m$ is a series of consecutive points $I_m=\{y_m, y_m+1,\dots,y_m+k_m\}$, and if we denote by $y_m$ and $z_m$ the first and last points of $I_m$, then $|y_{m+1}-z_m|\ge 2$. The representation (27) can be written as
$$L^\beta(\omega)v_I = \Big(\sum_{y\in I}A_{y,y}\Big)v_I + \sum_{m=1}^r\big[A_{y_m,y_m-1}\,v_{(I\setminus y_m)\cup\{y_m-1\}} + A_{z_m,z_m+1}\,v_{(I\setminus z_m)\cup\{z_m+1\}}\big]. \tag{28}$$
For $I=\{x\}$ this reduces to (23), and for $I=\{x,x+1\}$ to (26). Finally, the formulas (23), (28) prove the statement of Lemma 1. □
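The decomposition of $I$ into maximal runs of consecutive points, and the sets that appear on the right-hand side of (28), can be sketched as follows. The helper names `consecutive_runs` and `hopping_targets` are hypothetical; the code only illustrates the combinatorics and does not compute the coefficients $A_{x,y}$.

```python
def consecutive_runs(I):
    """Split a finite set of integers into maximal runs of consecutive points."""
    runs = []
    for x in sorted(set(I)):
        if runs and x == runs[-1][-1] + 1:
            runs[-1].append(x)      # x continues the current run
        else:
            runs.append([x])        # a gap of at least 2: start a new run
    return runs


def hopping_targets(I):
    """Index sets occurring in the off-diagonal part of (28).

    For every run with first point y and last point z, the point y is moved
    to y - 1 and the point z is moved to z + 1.
    """
    I = set(I)
    targets = []
    for run in consecutive_runs(I):
        y, z = run[0], run[-1]
        targets.append(frozenset((I - {y}) | {y - 1}))
        targets.append(frozenset((I - {z}) | {z + 1}))
    return targets


print(consecutive_runs({3, 4, 5, 8, 10, 11}))
print(hopping_targets({3, 4}))
```

Because neighbouring runs are separated by a gap of at least 2, every target set has the same cardinality as $I$, which is the invariance of $l_k(\omega)$ stated in Lemma 1.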
We define a canonical isomorphism $V_k(\omega): l_k(\omega)\to[(l_2(\mathbb{Z}^1))^{\otimes k}]_{as}$, $k=0,1,2,\dots$ by the following transformation of the basis vectors:
$$V_k(\omega): v_I\to e_I\in[(l_2(\mathbb{Z}^1))^{\otimes k}]_{as}, \tag{29}$$
where $I=\{x_1,\dots,x_k\}$, $|I|=k$, and the function $e_I(y_1,\dots,y_k)$, $y_i\in\mathbb{Z}^1$, is given by
$$e_I(y_1,\dots,y_k) = \begin{cases}(-1)^{|\pi|}\dfrac{1}{\sqrt{k!}}, & \text{when } \{y_1,\dots,y_k\}=\{x_1,\dots,x_k\}=I,\\[4pt] 0, & \text{otherwise.}\end{cases}$$
Here $\pi$ is the permutation ordering the sequence $y_1,\dots,y_k$ in ascending order $y_{i_1}<\dots<y_{i_k}$, and $|\pi|$ is the parity of this permutation. Now formulas (23), (27), (28) imply that the operator
$$L^\beta_k(\omega) = L^\beta(\omega)|_{l_k(\omega)}$$
transforms to the operator
$$\tilde L^\beta_k(\omega) = V_k(\omega)L^\beta_k(\omega)V_k^*(\omega): [(l_2(\mathbb{Z}^1))^{\otimes k}]_{as}\to[(l_2(\mathbb{Z}^1))^{\otimes k}]_{as},$$
that has the following representation:
$$\tilde L^\beta_k(\omega) = \tilde L^\beta_1(\omega)\otimes E\otimes\dots\otimes E + \dots + E\otimes\dots\otimes E\otimes\tilde L^\beta_1(\omega),$$
where $\tilde L^\beta_1(\omega) = V_1(\omega)L^\beta_1(\omega)V_1^*(\omega)$ is an operator in $l_2(\mathbb{Z}^1)$. Since the functions $\{v_I(\eta,\omega)\}$ form an orthonormal basis in $L_2(\omega)$, we have defined a unitary mapping $V(\omega)$ from the space $L_2(\omega)$ onto the antisymmetric Fock space $F^{as}(l_2(\mathbb{Z}^1))$:
$$V(\omega): L_2(\omega)\to F^{as}(l_2(\mathbb{Z}^1)),$$
satisfying all conditions of Theorem 2, and proved the representation (10) for the operator $\tilde L^\beta(\omega)$. Theorem 2 is proved. □
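The antisymmetric basis function $e_I$ from (29) is easy to evaluate explicitly. The sketch below is an illustration only: the names `permutation_sign`, `e_I` and `norm_squared` are hypothetical, and the parity is computed by counting inversions, which is one standard way to obtain the sign of the sorting permutation.

```python
import math
from itertools import permutations


def permutation_sign(seq):
    """Sign of the permutation that sorts seq (entries assumed distinct)."""
    inversions = sum(1
                     for i in range(len(seq))
                     for j in range(i + 1, len(seq))
                     if seq[i] > seq[j])
    return -1 if inversions % 2 else 1


def e_I(I, ys):
    """Value of e_I(y_1, ..., y_k) for a finite set I of integers."""
    k = len(I)
    if len(ys) != k or set(ys) != set(I):
        return 0.0                      # the arguments do not exhaust I
    return permutation_sign(ys) / math.sqrt(math.factorial(k))


def norm_squared(I):
    """Sum of e_I^2 over all argument tuples on which e_I does not vanish."""
    return sum(e_I(I, ys) ** 2 for ys in permutations(sorted(I)))


I = {1, 4, 6}
print(e_I(I, (1, 4, 6)), e_I(I, (4, 1, 6)), e_I(I, (1, 4, 7)), norm_squared(I))
```

Swapping two arguments flips the sign, and the $k!$ non-zero values each contribute $1/k!$ to the squared norm, so $e_I$ is a unit vector.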
3. Proof of Theorem 3. Localization

In this section we prove the first statement of Theorem 3. We recall that $\tilde L^\beta_1(\omega)$ has the matrix representation (23) by the Hermitian matrix $A=\{A_{x,y}\}$ with matrix elements (24). However, this representation is not appropriate for our purpose, and we turn here to another one, given again by a Jacobi matrix, but with constant diagonal elements. To do this we consider the subspace in $L_2(\omega)$ spanned by the set of functions $\{h_x(\eta)=\eta_x,\ x\in\mathbb{Z}^1\}$. From the general formula (6) and the representation (22) one sees that this subspace is the same as $l_1(\omega)$, and by (6) and (5) the operator $L^\beta_1(\omega)$ has the following matrix structure in this basis:
$$\big(L^\beta_1(\omega)f\big)_x = -f_x + \frac{a_{x+1}(1-a_{x+2}^2)}{1-a_{x+1}^2a_{x+2}^2}\,f_{x+1} + \frac{a_x(1-a_{x-1}^2)}{1-a_x^2a_{x-1}^2}\,f_{x-1}, \tag{30}$$
$$f=f(\eta)=\sum_{x\in\mathbb{Z}^1}f_xh_x(\eta)\in l_1(\omega),\qquad a_x=\tanh\beta\omega_{x-1,x}.$$
The representation (30) is defined by a non-Hermitian matrix, since the basis $\{h_x\}$ is not orthonormal in $l_1(\omega)$. Therefore we turn to the other basis
$$\hat h_x(\eta) = \eta_x\cdot\sqrt{\cosh 2\beta\omega_{x-1,x}+\cosh 2\beta\omega_{x,x+1}},\qquad x\in\mathbb{Z}^1,$$
where the operator $L^\beta_1(\omega)$ has the following symmetric representation given by a Hermitian Jacobi matrix $B=\{B_{x,x+1}\}$:
$$\big(L^\beta_1(\omega)g\big)_x = B_{x,x-1}g_{x-1} - g_x + B_{x,x+1}g_{x+1}, \tag{31}$$
where
$$g=\sum_{x\in\mathbb{Z}^1}g_x\hat h_x(\eta)\in l_1(\omega),\qquad g_x=\frac{f_x}{\sqrt{\cosh 2\beta\omega_{x-1,x}+\cosh 2\beta\omega_{x,x+1}}},$$
and
$$B_{x,x-1}(\omega)=B_{x-1,x}(\omega)=\frac{\sinh 2\beta\omega_{x-1,x}}{\sqrt{(\cosh 2\beta\omega_{x-2,x-1}+\cosh 2\beta\omega_{x-1,x})(\cosh 2\beta\omega_{x-1,x}+\cosh 2\beta\omega_{x,x+1})}}. \tag{32}$$
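The passage from the non-symmetric representation (30) to the symmetric one (31), (32) is a diagonal change of basis, and this can be checked numerically at a single site. The sketch below is illustrative only: `w[y]` is a hypothetical container for the coupling $\omega_{y-1,y}$, and the function names, the value of `beta` and the random data are assumptions made for the example.

```python
import math
import random


def a(w, beta, y):
    """a_y = tanh(beta * omega_{y-1,y})."""
    return math.tanh(beta * w[y])


def apply_L_h(f, w, beta, x):
    """Right-hand side of (30) at site x (coefficients in the basis h_x = eta_x)."""
    up = a(w, beta, x + 1) * (1 - a(w, beta, x + 2) ** 2) \
        / (1 - a(w, beta, x + 1) ** 2 * a(w, beta, x + 2) ** 2)
    down = a(w, beta, x) * (1 - a(w, beta, x - 1) ** 2) \
        / (1 - a(w, beta, x) ** 2 * a(w, beta, x - 1) ** 2)
    return -f[x] + up * f[x + 1] + down * f[x - 1]


def norm_factor(w, beta, x):
    """(cosh 2 beta omega_{x-1,x} + cosh 2 beta omega_{x,x+1})^{1/2}."""
    return math.sqrt(math.cosh(2 * beta * w[x]) + math.cosh(2 * beta * w[x + 1]))


def B(w, beta, x):
    """B_{x,x-1} = B_{x-1,x} from (32)."""
    return math.sinh(2 * beta * w[x]) / (norm_factor(w, beta, x - 1) * norm_factor(w, beta, x))


def apply_L_hhat(g, w, beta, x):
    """Right-hand side of (31) at site x (coefficients in the basis h-hat)."""
    return B(w, beta, x) * g[x - 1] - g[x] + B(w, beta, x + 1) * g[x + 1]


random.seed(0)
beta = 0.8                                                   # hypothetical value
w = {y: random.uniform(0.5, 1.5) for y in range(-3, 5)}      # bounded positive couplings
f = {y: random.uniform(-1.0, 1.0) for y in range(-2, 3)}
g = {y: f[y] / norm_factor(w, beta, y) for y in f}
print(apply_L_h(f, w, beta, 0) / norm_factor(w, beta, 0) - apply_L_hhat(g, w, beta, 0))
```

For a constant field $\omega_b\equiv\zeta$ the off-diagonal element reduces to $B_{x,x\pm1}=\tfrac12\tanh 2\beta\zeta$, which is consistent with the interval $[-1-\tanh 2\beta\zeta,\,-1+\tanh 2\beta\zeta]$ found in Sect. 4.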
We consider now an operator $\breve L^\beta_1(\omega)$ acting in the space $l_2(\mathbb{Z}^1)$, which has the same matrix elements as the operator (31):
$$\big(\breve L^\beta_1(\omega)g\big)_x = B_{x,x-1}g_{x-1} - g_x + B_{x,x+1}g_{x+1}, \tag{33}$$
with $g=\sum_{x\in\mathbb{Z}^1}g_xe_x\in l_2(\mathbb{Z}^1)$, where $e_x=e_x(y)=\delta_{x,y}$ and $B_{x,y}=B_{x,y}(\omega)$ are defined by (32). Since the operators $\breve L^\beta_1(\omega)$ and $\tilde L^\beta_1(\omega)$ (which is unitarily equivalent to $L^\beta_1(\omega)$) are similar, they have the same spectra, in particular the same pure point components of the spectrum. In addition, the exponential decrease (14) of the eigenfunctions $g_\lambda(x)\in l_2(\mathbb{Z}^1)$ of the operator $\breve L^\beta_1(\omega)$ implies the exponential decrease (14) for the eigenfunctions $\psi_\lambda(x)\in l_2(\mathbb{Z}^1)$ of the operator $\tilde L^\beta_1(\omega)$. In fact, if the eigenfunction $g_\lambda(x)\in l_2(\mathbb{Z}^1)$ of the operator (33) (or the corresponding eigenfunction $g_\lambda(\eta)=\sum_{x\in\mathbb{Z}^1}(g_\lambda)_x\hat h_x(\eta)\in l_1(\omega)$, $(g_\lambda)_x=g_\lambda(x)$, of the operator (31)) decays exponentially, then our assumption on the boundedness of the random variables $\omega_b$ implies the exponential decrease for the corresponding eigenfunction $f_\lambda=\sum_{x\in\mathbb{Z}^1}(f_\lambda)_xh_x(\eta)$ of the operator (30):
$$|(f_\lambda)_x| < C\,e^{-\alpha|x|} \tag{34}$$
with constants $C$ and $\alpha>0$. The function $f_\lambda\in l_1(\omega)$ is written as
$$f_\lambda=\sum_{x\in\mathbb{Z}^1}(f_\lambda)_x\eta_x=\sum_{x\in\mathbb{Z}^1}(\psi_\lambda)_xv_x,$$
where $\{v_x\}$ is the orthonormal basis (22), and by virtue of the canonical isomorphism $V(\omega)$ (9) the vector $\psi_\lambda\in l_2(\mathbb{Z}^1)$, $\psi_\lambda=\{\psi_\lambda(x)=(\psi_\lambda)_x,\ x\in\mathbb{Z}^1\}$
is the corresponding eigenfunction for the operator $\tilde L^\beta_1(\omega)$. Therefore we have
$$(\psi_\lambda)_x=(f_\lambda,v_x)=\sum_{y\in\mathbb{Z}^1}(f_\lambda)_y(\eta_y,v_x), \tag{35}$$
where $(\cdot,\cdot)$ is the scalar product in $l_1(\omega)$ produced by the measure $\mu^\beta_\omega$ given by (2). From the representation (22) for $v_x$ and the explicit expression for correlations
$$(\eta_y,\eta_x)=\prod_{z=y}^{x-1}\tanh\beta\omega_{z,z+1},\qquad y<x$$
(which results from calculations using the formula (3)), we get that
$$(\eta_y,v_x)=0,\quad\text{when } y<x, \tag{36}$$
and
$$|(\eta_y,v_x)|<C_0\,\kappa^{y-x},\quad\text{when } y\ge x, \tag{37}$$
with constants $C_0$ and $0<\kappa<1$. Now (34)–(37) imply that the function $\psi_\lambda=\{(\psi_\lambda)_x\}\in l_2(\mathbb{Z}^1)$ is exponentially decaying:
$$|(\psi_\lambda)_x|<C_1e^{-\alpha_1|x|}$$
with constants $C_1=C_1(\psi_\lambda)$ and $\alpha_1=\alpha_1(\psi_\lambda)>0$.

Thus we reduce our problem to the proof of localization for the operator $\breve L^\beta_1(\omega)$ (33). This is a Jacobi operator with off-diagonal disorder. The new random field $B=\{B_{x-1,x}(\omega)\}$, which is defined on the bonds of the lattice, is not independent, but it is non-deterministic. We recall that the field $\{V_x(\omega)\}$ is called deterministic if $V_x(\omega)$ is a measurable function of $\{V_x(\omega)\}_{x\le 0}$ a.e.

Our further analysis is based on the positivity of the Lyapunov exponent. The Lyapunov exponent $\gamma(\lambda)$ is defined (see, for example, [12]) as follows: for any complex $\lambda\in\mathbb{C}$ and for any solution $u=\{u_x,\ x\in\mathbb{Z}^1\}$ of $L(\omega)u=\lambda u$, for P-a.e. $\omega$ the limit
$$\lim_{x\to\infty}\frac12\,\frac1x\ln\big(|u_x|^2+|u_{x+1}|^2\big)$$
exists, and it is either $\gamma(\lambda)$ or $-\gamma(\lambda)$. In the following we shall use general results from the work of Delyon, Simon and Souillard [2] about localization for off-diagonal disorder, summarized in the following:

Theorem 4 (Delyon, Simon, Souillard). Let the Lyapunov exponent corresponding to the operator (33) be positive: $\gamma(\lambda)>0$ for Lebesgue a.e. $\lambda$, and let the probability distribution of each $B_{x-1,x}(\omega)$ be absolutely continuous with respect to the Lebesgue measure. Then the operator $\breve L^\beta_1(\omega)$ has almost surely only pure point spectrum with exponentially decaying eigenfunctions.
To apply Theorem 4 to our case we have to prove only the positivity of the Lyapunov exponent $\gamma(\lambda)$, because our first assumption on the random field $\omega$ and the representation (32) for $B_{x-1,x}(\omega)$ immediately imply that the probability distribution of each $B_{x-1,x}(\omega)$ is absolutely continuous. To prove that $\gamma(\lambda)>0$ for Lebesgue a.e. $\lambda$ we will reduce Eq. (33) to an equation involving a discrete Sturm–Liouville operator. Using new variables $u_x$: $g_x=K_x(\omega)u_x$, $x\in\mathbb{Z}^1$, where the random field $K=\{K_x(\omega)\}$ will be described below, one can rewrite the equation
$$\big(\breve L^\beta_1(\omega)g\big)_x=\tilde\lambda g_x$$
as
$$B_{x,x-1}K_{x-1}u_{x-1}+B_{x,x+1}K_{x+1}u_{x+1}=(\tilde\lambda+1)K_xu_x,$$
or
$$\frac{B_{x,x-1}K_{x-1}}{K_x}\,u_{x-1}+\frac{B_{x,x+1}K_{x+1}}{K_x}\,u_{x+1}=(\tilde\lambda+1)u_x. \tag{38}$$
We fix a point $\mu_0$ of the spectrum $\sigma(\breve L^\beta_1(\omega)+E)$ (discrete or continuous) of the operator $\breve L^\beta_1(\omega)+E$, where $E$ is the identity operator, and let $K=\{K_x,\ x\in\mathbb{Z}^1\}$ be the corresponding (generalized) eigenfunction, i.e. the solution of the equation
$$B_{x,x-1}K_{x-1}+B_{x,x+1}K_{x+1}=\mu_0K_x. \tag{39}$$
Then we put $M_x=K_x^2$, $P_x=K_{x-1}K_xB_{x-1,x}$, and (38) can be written as
$$\frac{1}{M_x}\big[P_{x+1}(u_{x+1}-u_x)-P_x(u_x-u_{x-1})\big]=(\tilde\lambda-\mu_0+1)u_x. \tag{40}$$
Indeed, multiplying (38) by $K_x^2$ gives $P_xu_{x-1}+P_{x+1}u_{x+1}=(\tilde\lambda+1)M_xu_x$, while (39) multiplied by $K_x$ gives $P_x+P_{x+1}=\mu_0M_x$; subtracting the second identity times $u_x$ from the first yields (40). Operators of the form (40) have been studied in the work of Minami [7], and we use here the main result of [7] about the positivity of the Lyapunov exponent for non-deterministic random fields.

Theorem 5 (Minami). Let us consider the operator
$$(S(\omega)u)_x=\frac{1}{M_x}\big[P_{x+1}(u_{x+1}-u_x)-P_x(u_x-u_{x-1})\big].$$
If the Lebesgue measure of the set $A=\{\lambda\in\mathbb{R}^1:\gamma(\lambda)=0\}$ is positive, then the sequence $\{M_x(\omega),P_x(\omega),\ x\in\mathbb{Z}^1\}$ is deterministic.

In our case the field $K=\{K_x(\omega)\}$ is non-deterministic, and hence the fields $M=\{M_x\}$ and $P=\{P_x\}$ are also non-deterministic. Thus by Theorem 5 the Lyapunov exponent $\gamma(\lambda)$ relative to the operator (40) (with $\lambda=\tilde\lambda-\mu_0+1$) is positive: $\gamma(\lambda)>0$ for Lebesgue a.e. $\lambda$, and by Theorem 4 the solutions $u=\{u_x\}$ of (40) decay exponentially for Lebesgue a.e. $\lambda$ and P-a.e. $\omega$. Therefore the eigenfunctions $g=\{g_x\}=\{K_xu_x\}$ of the operator $\breve L^\beta_1(\omega)$ (33) also decay exponentially for Lebesgue a.e. $\tilde\lambda$ and P-a.e. $\omega$, since $K=\{K_x\}$, as generalized eigenfunctions satisfying (39), increase at most polynomially. Thus we have localization for the operators $\breve L^\beta_1(\omega)$ and $\tilde L^\beta_1(\omega)$.
4. Proof of Theorem 3. Location of the Spectrum

To find the location of the spectrum $\sigma(\tilde L^\beta_1(\omega))$ of the operator $\tilde L^\beta_1(\omega)$ for P-a.e. $\omega$ we consider again the unitarily equivalent operator $L^\beta_1(\omega)$ acting in the space $l_1(\omega)$. We denote by $\sigma_1=\sigma(L^\beta_1(\omega))$ the spectrum of the operator $L^\beta_1(\omega)$, which is non-random for P-a.e. $\omega$ (see Remark 4). We will prove two inclusions:
$$\sigma_1\subseteq[-1-C,\,-1+C],\quad\text{with some } C\le c=\tanh 2\beta\gamma_2, \tag{41}$$
$$[-1-c,\,-1+c]\subseteq\sigma_1,\quad c=\tanh 2\beta\gamma_2. \tag{42}$$
1) Let us consider the random operator $M(\omega)=L^\beta_1(\omega)+E$, where $E$ is the identity operator. Since $M(\bar\omega)$ is a self-adjoint operator for any fixed $\bar\omega\in\Omega$, we have
$$\sigma(M)\subseteq\Big[-\sup_{\bar\omega}\|M(\bar\omega)\|,\ \sup_{\bar\omega}\|M(\bar\omega)\|\Big],$$
where $\|M(\bar\omega)\|$ is the norm of the operator $M(\bar\omega): l_1(\bar\omega)\to l_1(\bar\omega)$ in $l_1(\bar\omega)$, generated by the norm in the space $L_2(X,d\mu^\beta_{\bar\omega})=L_2(\bar\omega)$, and $\sigma(M)=\sigma(M(\omega))=\sigma_1+1$ is the spectrum of the operator $M(\omega)$, which is a non-random set for P-a.e. $\omega$. We denote by $L$ the space of functions of the form
$$L=\Big\{f=\sum_{x\in\mathbb{Z}^1}c_x\eta_x,\quad \|f\|_L=\sum_{x\in\mathbb{Z}^1}|c_x|<\infty\Big\}.$$
Then $L$ is a Banach space which is dense in any space $l_1(\bar\omega)$, and for $f\in L$ we have
$$\|f\|_{L_2(\bar\omega)}=\Big\|\sum c_x\eta_x\Big\|_{L_2(\bar\omega)}\le\sum|c_x|\cdot\|\eta_x\|_{L_2(\bar\omega)}=\|f\|_L.$$
For a bounded operator $B$ in $L$ its norm in $L$ is defined by
$$\|B\|_L=\sup_x\sum_y|B_{x,y}|,$$
where $B_{x,y}$ are the matrix elements of $B$ in the basis $\{\eta_x\}$, i.e.
$$B\eta_x=\sum_yB_{x,y}\eta_y.$$
We use now the following lemma from the paper [8].

Lemma 2 (Minlos). Let $B$ be a self-adjoint operator in $l_1(\bar\omega)$ such that $B: L\to L$ and the restriction $B|_L$ is a bounded operator in $L$. Then $B$ is a bounded operator in $l_1(\bar\omega)$, and
$$\|B\|\le\|B\|_L. \tag{43}$$
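By (30) one can rewrite the representation for the operator $M(\bar\omega)$ as
$$M(\bar\omega)\eta_x=\frac12\big[\tanh(\beta\bar\omega_{x,x-1}+\beta\bar\omega_{x,x+1})+\tanh(\beta\bar\omega_{x,x-1}-\beta\bar\omega_{x,x+1})\big]\eta_{x-1}+\frac12\big[\tanh(\beta\bar\omega_{x,x-1}+\beta\bar\omega_{x,x+1})-\tanh(\beta\bar\omega_{x,x-1}-\beta\bar\omega_{x,x+1})\big]\eta_{x+1}. \tag{44}$$
Now (43) and (44) imply the upper bound on the norm of $M(\bar\omega)$ in $l_1(\bar\omega)$ (for any $\bar\omega$):
$$\|M(\bar\omega)\|\le\|M(\bar\omega)\|_L=\sup_x\tanh(\beta\bar\omega_{x,x-1}+\beta\bar\omega_{x,x+1}).$$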
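The coefficients in (44) are the hopping coefficients of (21) rewritten through the identity $\tanh(A+B)\pm\tanh(A-B)=2\sinh 2A/(\cosh 2A+\cosh 2B)$, respectively $2\sinh 2B/(\cosh 2A+\cosh 2B)$, and for non-negative couplings the sum of their absolute values is $\tanh(A+B)$. A small numerical sketch of both facts follows; the function names are hypothetical, and $A=\beta\bar\omega_{x,x-1}$, $B=\beta\bar\omega_{x,x+1}$ are taken non-negative, which is an assumption made for the illustration.

```python
import math


def coeffs_44(A, B):
    """Coefficients of eta_{x-1} and eta_{x+1} in (44)."""
    left = 0.5 * (math.tanh(A + B) + math.tanh(A - B))
    right = 0.5 * (math.tanh(A + B) - math.tanh(A - B))
    return left, right


def coeffs_21(A, B):
    """The same coefficients in the hyperbolic form appearing in (21)."""
    denom = math.cosh(2 * A) + math.cosh(2 * B)
    return math.sinh(2 * A) / denom, math.sinh(2 * B) / denom


def row_sum(A, B):
    """Sum of absolute values of the matrix elements in the row of M at site x."""
    left, right = coeffs_44(A, B)
    return abs(left) + abs(right)


A, B = 0.4, 1.1   # hypothetical values of beta * omega on the two bonds at x
print(coeffs_44(A, B), coeffs_21(A, B), row_sum(A, B), math.tanh(A + B))
```

Taking the supremum of the row sum over $x$ and over realizations with couplings bounded by $\gamma_2$ gives the value $\tanh 2\beta\gamma_2$ used in the bound on $C$.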
Setting
$$C\equiv\sup_{\bar\omega}\|M(\bar\omega)\|\le\sup_{\bar\omega}\sup_x\tanh(\beta\bar\omega_{x,x-1}+\beta\bar\omega_{x,x+1})=\tanh 2\beta\gamma_2,$$
we have finally $\sigma(M)\subseteq[-C,C]$ and $\sigma_1\subseteq[-1-C,-1+C]$.

2) We consider now the space $\Omega$ of the realizations of the random field $\omega$ as a topological space (in Schwartz topology), and we denote by $\operatorname{supp}P$ the topological support of $P$. It is easy to see that for any constant realization $\zeta\in\Omega$: $\zeta=\{\omega_b\equiv\zeta\in D \text{ for any } b\in B^1\}\in\operatorname{supp}P$.
First we prove that the resolution of the identity of $L^\beta_1(\omega)$ is a weakly continuous function of $\omega$, that is
$$w\text{-}\lim_{\omega_n\to\omega}\big(E_{L^\beta_1(\omega_n)}(d\lambda)\psi,\psi\big)=\big(E_{L^\beta_1(\omega)}(d\lambda)\psi,\psi\big) \tag{45}$$
for any $\psi\in l_2(\mathbb{Z}^1)$. (Due to the canonical isomorphism $V(\omega)$ (29) one can consider the operators $L^\beta_1(\omega_n)$ and $L^\beta_1(\omega)$ as operators in $l_2(\mathbb{Z}^1)$.) By the representations (23), (24) it immediately follows that for any $\psi\in l_2(\mathbb{Z}^1)$:
$$L^\beta_1(\omega_n)\psi\to L^\beta_1(\omega)\psi\quad\text{as }\omega_n\to\omega.$$
Analogously,
$$(L^\beta_1(\omega_n))^p\psi\to(L^\beta_1(\omega))^p\psi\quad\text{as }\omega_n\to\omega$$
for any $\psi\in l_2(\mathbb{Z}^1)$ and any integer $p=1,2,3,\dots$. Since $L^\beta_1(\omega_n)$, $L^\beta_1(\omega)$ are generators of stochastic semigroups in $l_2(\mathbb{Z}^1)$ we have
$$\|e^{tL^\beta_1(\omega_n)}\psi\|\le\|\psi\|,\qquad\|e^{tL^\beta_1(\omega)}\psi\|\le\|\psi\|,$$
for any $t\ge 0$ and $\psi\in l_2(\mathbb{Z}^1)$, and we get
$$s\text{-}\lim_{\omega_n\to\omega}e^{tL^\beta_1(\omega_n)}=e^{tL^\beta_1(\omega)},\qquad t\ge 0.$$
But it is known [1] that for a sequence of self-adjoint operators $\{E_n\}$ on a Hilbert space the strong convergence of semigroups is equivalent to the weak convergence of the measures $(E_n(\cdot)\psi,\psi)\to(E(\cdot)\psi,\psi)$ for any $\psi\in l_2(\mathbb{Z}^1)$. Thus we obtain (45).
This implies that
$$\sigma(L^\beta_1(\bar\omega))\subseteq\sigma_1\quad\text{for any fixed realization }\bar\omega\in\operatorname{supp}P, \tag{46}$$
where $\sigma(L^\beta_1(\bar\omega))$ is the spectrum of the operator $L^\beta_1(\bar\omega)$ corresponding to a fixed realization $\bar\omega$, and $\sigma_1$ is the non-random set described above. In fact, if $\Delta\subset\mathbb{R}$ is an open interval not intersecting $\sigma_1$: $\Delta\not\subseteq\sigma_1$, then for P-almost all $\tilde\omega$ we have
$$\big(E_{L^\beta_1(\tilde\omega)}(\Delta)\psi,\psi\big)=0,\quad\text{for any }\psi\in l_2(\mathbb{Z}^1). \tag{47}$$
Since $\bar\omega\in\operatorname{supp}P$, there is a sequence $\tilde\omega_n\to\bar\omega$ meeting the condition (47). Now (45) implies that $\big(E_{L^\beta_1(\bar\omega)}(\Delta)\psi,\psi\big)=0$, that is $\Delta\not\subseteq\sigma(L^\beta_1(\bar\omega))$.

So for any constant realization $\zeta\in\operatorname{supp}P$:
$$\sigma(L^\beta_1(\zeta))\subseteq\sigma_1,$$
consequently,
$$\bigcup_{\zeta\in D}\sigma(L^\beta_1(\zeta))\subseteq\sigma_1. \tag{48}$$
One can easily find the spectrum $\sigma(L^\beta_1(\zeta))$ for any constant realization $\{\omega_b\equiv\zeta,\ b\in B^1\}$:
$$\sigma(L^\beta_1(\zeta))=[-1-\tanh 2\beta\zeta,\ -1+\tanh 2\beta\zeta],$$
thus by (48)
$$[-1-\tanh 2\beta\gamma_2,\ -1+\tanh 2\beta\gamma_2]\subseteq\sigma_1.$$
Finally (41) and (42) imply (15). We have proven that for almost all $\omega\in\Omega$ the spectral gap is equal to $g=1-\tanh 2\beta\gamma_2$. But by the inclusion (46) we have: $g_\omega\ge 1-\tanh 2\beta\gamma_2$ for any $\omega\in\Omega$.

Acknowledgements. Two authors (R. Minlos and E. Zhizhina) thank the Russian Foundation for Basic Research (Grants N 96-01-00064, N 96-01-10020, N 97-01-00714) and the Swiss National Foundation, Contract N 7SUPJ048214, for financial support. They also gratefully acknowledge the BiBoS Research Center for kind hospitality.
References

1. Carmona, R., Lacroix, J.: Spectral theory of random Schrödinger operators. Berlin: Birkhäuser, 1990
2. Delyon, R., Simon, B., Souillard, B.: Localization for off-diagonal disorder and for continuous Schrödinger operators. Commun. Math. Phys. 109, 157–165 (1987)
3. Kotani, S.: Support theorems for random Schrödinger operators. Commun. Math. Phys. 97, 443–452 (1985)
4. Liggett, Th.: Interacting particle systems. Berlin–Heidelberg–New York: Springer-Verlag, 1985
5. Malyshev, V.A., Minlos, R.A.: Gibbsian random fields. Dordrecht: Kluwer Academic Publishers, 1991
6. Malyshev, V.A., Minlos, R.A.: Linear infinite-particle operators. Providence, RI: AMS, 1995
7. Minami, N.: An extension of Kotani's theorem for random generalized Sturm–Liouville operators. Commun. Math. Phys. 103, 387–402 (1986)
8. Minlos, R.A.: Invariant subspaces of Ising stochastic dynamics (for small β). Markov Processes and Related Fields 2, No 2, 263–284 (1996)
9. Minlos, R.A., Sinai, Ya.G.: Study of the spectra of stochastic operators arising in lattice models of a gas. Teor. Mat. Fizika 2, 230–243 (1970)
10. Minlos, R.A., Trishch, A.G.: The complete spectral decomposition of a generator of Glauber dynamics for one-dimensional Ising model. Uspechi Mathem. Nauk 49, No 6, 209–210 (1994)
11. Pastur, L., Figotin, A.: Spectra of random and almost-periodic operators. Berlin–Heidelberg–New York: Springer-Verlag, 1991
12. Simon, B.: Kotani theory for one dimensional stochastic Jacobi matrices. Commun. Math. Phys. 89, 227–234 (1983)

Communicated by Ya. G. Sinai
Commun. Math. Phys. 204, 669 – 689 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
On Overlapping Divergences

Dirk Kreimer*
Department of Physics, Mainz University, D-55099 Mainz, Germany.
Received: 14 October 1998 / Accepted: 12 February 1999

Abstract: Using set-theoretic considerations, we show that the forest formula for overlapping divergences comes from the Hopf algebra of rooted trees.

Motivation and Introduction

The process of renormalization is governed by the forest formula, as derived for example in [1]. The underlying combinatorics is directly related to the Hopf algebra structure of rooted trees. This is evident in the case of Feynman diagrams which only provide nested or disjoint subdivergences. It is the purpose of this paper to show that the same Hopf algebra appears in the study of overlapping divergences. This was already shown using Schwinger–Dyson equations [2], by explicit considerations of divergent sectors [3], and by differential equations on bare Green functions [4]. At such a level, one obtains a resolution of overlapping divergent graphs into a sum of rooted trees, to which the combinatorics of the Hopf algebra of rooted trees then applies [2,4].

It was suggested to construct a Hopf algebra which directly considers overlapping divergent graphs, without using external input such as Schwinger–Dyson equations [5]. However, as already mentioned in [4], this leads to the same Hopf algebra as for the case of non-overlapping divergences, as we will prove by set-theoretic considerations.

1. The Hopf Algebra $H_R$

In this section we first repeat the definition of the Hopf algebra of decorated rooted trees, as it can be found in [4]. The rooted trees provide sets of vertices connected by edges. The vertices are labelled by decorations.

* Heisenberg Fellow
Each decoration corresponds to an analytic expression with a non-vanishing superficial degree of divergence, but free of subdivergences. Such analytic expressions are typically obtained from general Feynman graphs by shrinking superficially divergent subgraphs to a point. If for example $\Gamma$ is a superficially divergent Feynman graph which contains only one divergent subgraph $\gamma$, then one usually denotes by $\Gamma/\gamma$ the expression in which $\gamma$ is reduced to a point in $\Gamma$. When we speak of Feynman graphs in the following, this includes such quotients $\Gamma/\gamma$.

The Hopf algebra of decorated rooted trees, with vertices labelled by Feynman graphs free of subdivergences, is equivalent to the Hopf algebra on parenthesized words introduced in [2]. In the next section, we embark on some set-theoretic considerations, which will prove useful in the study of overlapping divergences. In particular, we will assign a unique rooted tree to a set $M$ by imposing conditions on its subsets.

We follow Sect. II of [4]. A rooted tree $t$ is a connected and simply-connected set of oriented edges and vertices such that there is precisely one distinguished vertex which has no incoming edge. This vertex is called the root of $t$. Further, every edge connects two vertices and the fertility $f(v)$ of a vertex $v$ is the number of edges outgoing from $v$. The trees being simply-connected, each vertex apart from the root has a single incoming edge.

As in [4], we consider the (commutative) algebra of polynomials over $\mathbb{Q}$ in rooted trees, hence the multiplication $m(t,t')$ of two rooted trees means drawing them next to each other in arbitrary order. Note that for any rooted tree $t$ with root $r$ we have $f(r)$ trees $t_1,\dots,t_{f(r)}$ which are the trees attached to $r$. The unit element of this algebra is 1, corresponding, as a rooted tree, to the empty set. Let $B_-$ be the operator which removes the root $r$ from a tree $t$:
$$B_-: t\to B_-(t)=t_1t_2\dots t_{f(r)}. \tag{1}$$
Figure 1 gives an example. Let $B_+$ be the operation which maps a monomial of $n$ rooted trees to a new rooted tree $t$ which has a root $r$ with fertility $f(r)=n$ which connects to the $n$ roots of $t_1,\dots,t_n$,
$$B_+: t_1\dots t_n\to B_+(t_1\dots t_n)=t. \tag{2}$$
This is clearly the inverse to the action of $B_-$. One has
$$B_+(B_-(t))=B_-(B_+(t))=t \tag{3}$$
for any rooted tree $t$. Figure 2 gives an example. For convenience, we define the rooted trees $t_1, t_2, t_{3_1}, t_{3_2}$ to be the trees with one, two or three vertices, given in Fig. 5 on the lhs from top to bottom. We further set $B_-(t_1)=1$, $B_+(1)=t_1$.

We will introduce a Hopf algebra on such rooted trees by using the possibility to cut such trees in pieces. We start with the most elementary possibility. An elementary cut is a cut of a rooted tree at a single chosen edge, as indicated in Fig. 3. By such a cutting procedure, we will obtain the possibility to define a coproduct, as we can use the resulting pieces on either side of the coproduct. But before doing so we finally introduce the notion of an admissible cut, also called a simple cut. It is any assignment of elementary cuts to a rooted tree $t$ such that any path from any vertex of the tree to the root has at most one elementary cut. Figure 4 gives an example.

Fig. 1. The action of $B_-$ on a rooted tree

Fig. 2. The action of $B_+$ on a monomial of trees

Fig. 3. An elementary cut $c$ splits a rooted tree $t$ into two components, the fall-down $P^c(t)$ and the piece which is still connected to the root, $R^c(t)$

Fig. 4. An admissible cut $C$ acting on a tree $t$. It produces a monomial of trees. One of the factors, $R^C(t)$, contains the root of $t$
Fig. 5. The coproduct. We work it out for the trees $t_1, t_2, t_{3_1}, t_{3_2}$, from top to bottom
An admissible cut $C$ maps a tree to a monomial in trees. If the cut $C$ contains $n$ elementary cuts, it induces a map
$$C: t\to C(t)=\prod_{i=1}^{n+1}t_{j_i}. \tag{4}$$
Note that precisely one of these trees $t_{j_i}$ will contain the root of $t$. Let us denote this distinguished tree by $R^C(t)$. The monomial which is delivered by the $n$ other factors is denoted by $P^C(t)$. The definitions of $C$, $P$, $R$ can be extended to monomials of trees in the obvious manner, by choosing a cut $C^i$ for every tree $t_{j_i}$ in the monomial:
$$C(t_{j_1}\dots t_{j_n}) := C^1(t_{j_1})\dots C^n(t_{j_n}),$$
$$P^C(t_{j_1}\dots t_{j_n}) := P^{C^1}(t_{j_1})\dots P^{C^n}(t_{j_n}),$$
$$R^C(t_{j_1}\dots t_{j_n}) := R^{C^1}(t_{j_1})\dots R^{C^n}(t_{j_n}).$$
Let us now establish the Hopf algebra structure. Following [2,4] we define the counit and the coproduct. The counit $\bar e: A\to\mathbb{Q}$ is simple:
$$\bar e(X)=0\ \text{ for any } X\ne 1,\qquad \bar e(1)=1.$$
The coproduct $\Delta$ is defined by the equations
$$\Delta(1)=1\otimes 1, \tag{5}$$
$$\Delta(t_1\dots t_n)=\Delta(t_1)\dots\Delta(t_n), \tag{6}$$
$$\Delta(t)=t\otimes 1+(\mathrm{id}\otimes B_+)[\Delta(B_-(t))], \tag{7}$$
which defines the coproduct on trees with $n$ vertices iteratively through the coproduct on trees with a lesser number of vertices. The coproduct can be written as [2,4]
$$\Delta(t)=1\otimes t+t\otimes 1+\sum_{\text{adm. cuts } C \text{ of } t}P^C(t)\otimes R^C(t). \tag{8}$$
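The equivalence of the recursive definition of the coproduct through $B_\pm$ and the closed formula over admissible cuts can be illustrated for undecorated trees by a short computation. The sketch below is an assumption-laden illustration rather than the construction of [2,4]: a tree is encoded as the sorted tuple of its child trees, a monomial (forest) as a sorted tuple of trees, so that $B_+$ and $B_-$ act trivially on the encoding, and all function names are hypothetical.

```python
from collections import Counter


def forest(*trees):
    """Commutative monomial of trees; the empty tuple is the unit 1."""
    return tuple(sorted(trees))


def B_plus(f):
    """Graft all trees of the forest f onto a new root."""
    return tuple(sorted(f))     # a tree is the sorted tuple of its children


def B_minus(t):
    """Remove the root of t, leaving the forest of its subtrees."""
    return forest(*t)


def coproduct_forest(f):
    """Delta of a monomial: multiplicative, with Delta(1) = 1 (x) 1."""
    result = Counter({((), ()): 1})
    for t in f:
        new = Counter()
        for (l1, r1), c1 in result.items():
            for (l2, r2), c2 in coproduct(t).items():
                new[(forest(*l1, *l2), forest(*r1, *r2))] += c1 * c2
        result = new
    return result


def coproduct(t):
    """Delta(t) = t (x) 1 + (id (x) B_+) Delta(B_-(t)), as {(left, right): coefficient}."""
    result = Counter({(forest(t), ()): 1})
    for (left, right), c in coproduct_forest(B_minus(t)).items():
        result[(left, forest(B_plus(right)))] += c
    return result


def cuts(t):
    """All pairs (P^C(t), R^C(t)) over admissible cuts C of t, the empty cut included."""
    options = [((), ())]        # (pruned forest, children kept below the root)
    for child in t:
        new = []
        for pruned, kept in options:
            new.append((forest(*pruned, child), kept))         # cut the edge above child
            for p_child, r_child in cuts(child):                # or cut only inside child
                new.append((forest(*pruned, *p_child), forest(*kept, r_child)))
        options = new
    return [(pruned, B_plus(kept)) for pruned, kept in options]


def coproduct_by_cuts(t):
    """The closed formula: t (x) 1 plus the sum of P^C (x) R^C (empty cut gives 1 (x) t)."""
    result = Counter({(forest(t), ()): 1})
    for pruned, root_part in cuts(t):
        result[(pruned, forest(root_part))] += 1
    return result


t1 = ()                # one vertex
t2 = (t1,)             # two vertices
t31 = (t2,)            # chain with three vertices
t32 = (t1, t1)         # root with two leaves
for tree in (t1, t2, t31, t32):
    print(coproduct(tree) == coproduct_by_cuts(tree))
```

In this encoding the term $2\,t_1\otimes t_2$ of $\Delta(t_{3_2})$ arises from the two elementary cuts that each remove one leaf, in agreement with the factor 2 shown in Fig. 5.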
Up to now we have established a bialgebra structure. It is actually a Hopf algebra. Following [2,4] we find the antipode $S$ as
$$S(1)=1, \tag{9}$$
$$S(t)=-t-\sum_{\text{adm. cuts } C \text{ of } t}S[P^C(t)]\,R^C(t). \tag{10}$$
Let us give yet another formula to write the antipode, which one easily derives using induction on the number of vertices [2,4]:
$$S(t)=-\sum_{\text{all cuts } C \text{ of } t}(-1)^{n_C}P^C(t)R^C(t),$$
where $n_C$ is the number of single cuts in $C$. This time, we have a non-recursive expression, summing over all cuts $C$, relaxing the restriction to admissible cuts.

By now we have established a Hopf algebra on rooted trees, using the set of rooted trees, the commutative multiplication $m$ for elements of this set, the unit 1 and counit $\bar e$, the coproduct $\Delta$ and antipode $S$. We call this Hopf algebra $H_R$. Still following [2,4] we allow the vertices of rooted trees to be labelled by Feynman graphs without subdivergences, in the sense described before. Quite generally, if $Y$ is a set of primitive elements providing labels, we call the resulting Hopf algebra $H_R(Y)$. Let us also mention that
$$m[(S\otimes\mathrm{id})\Delta(t)]=\bar e(t)=0=\sum S(t_{(1)})\,t_{(2)}, \tag{11}$$
where we introduced Sweedler's notation $\Delta(t)=:\sum t_{(1)}\otimes t_{(2)}$, and id is the identity map $H_R\to H_R$.

We finally note the following definition: for a rooted tree $t$ let $n_v(t)$ be the number of its vertices. This extends to a monomial of rooted trees in the obvious manner, $n_v(\prod_iT_i)=\sum_in_v(T_i)$. Ultimately, we work in the vector space of finite linear combinations of monomials in rooted trees. Hence for such a linear combination $T:=\sum_iq_iX_i$, $i\in I$, for some index set $I$, we define
$$n_v(T):=\max\{n_v(X_i)\mid i\in I\}. \tag{12}$$
2. A Set Theoretic Approach

2.1. Notation. Let $\#(M)$ be the cardinality of any set $M$. For any given finite set $M$ we let $\mathcal{P}(M)$ be the set of all proper subsets of $M$. Further, we let $\mathcal{P}_X(M)\subset\mathcal{P}(M)$ be the set of all proper subsets of $M$ which fulfill the condition $X$. Thus, if $X$ is the boolean operator which is true when the condition $X$ is satisfied, we have
$$\mathcal{P}_X(M)=\{\gamma\in\mathcal{P}(M)\mid X(\gamma)\}.$$
If we impose no condition we write $X=\emptyset$, hence $\mathcal{P}_\emptyset(M)\equiv\mathcal{P}(M)$. If we want to stress that a subset $\gamma\subset M$ fulfills condition $X$, we write $\gamma\subset_XM$.

Let $\gamma_i,\gamma_j$, $i\ne j$, be two elements of $\mathcal{P}_X(M)$, hence two subsets of $M$. If $\gamma_i\cap\gamma_j=\emptyset$ we call $\gamma_i,\gamma_j$ disjoint. Otherwise, if $\gamma_i\subset\gamma_j$ or $\gamma_j\subset\gamma_i$, we call $\gamma_i,\gamma_j$ nested. Finally, if $\gamma_i,\gamma_j$ are neither disjoint nor nested, we call them overlapping. They then have a nontrivial intersection $U:=\gamma_i\cap\gamma_j\ne\emptyset$, which is a proper subset of each, $\gamma_i\supset U\subset\gamma_j$. If $\gamma_i,\gamma_j$ are not overlapping, we call them tree-related, for reasons which become obvious in a moment. For a given set $X$ of mutually tree-related sets $\gamma_i$, we say that another set $\gamma$ is overlapping with $X$ if $\gamma$ is overlapping with at least one element of $X$.

If a set $\gamma\in\mathcal{P}_X(M)$ can be written as a union of mutually disjoint sets $\gamma_i\in\mathcal{P}_X(M)$, $\gamma=\cup_{i\in I}\gamma_i$ for some index set $I$, we say that $\gamma$ is reducible. Otherwise, we say it is irreducible (w.r.t. $X$). Note that reducibility depends on the chosen condition $X$. Let $M/\gamma$ denote the complement of the set $\gamma\subset M$ with respect to $M$, $M=M/\gamma\cup\gamma$.

2.2. Basic results. It is our task to find all elements $p\in\mathcal{P}(\mathcal{P}_X(M))$ which fulfill the following three conditions:

i) $p$ consists of mutually tree-related sets $\in\mathcal{P}_X(M)$,
ii) all elements of $p$ are irreducible,
iii) $p$ is complete: for all $\gamma\in\mathcal{P}_X(M)$, $\gamma\notin p\Leftrightarrow\gamma$ is overlapping with $p$.

For an irreducible $M$, let the set of all such $p$, that is the set of all complete, irreducible, tree-ordered elements of $\mathcal{P}(\mathcal{P}_X(M))$, be denoted by $\mathcal{P}^{cit}_X(M)$.

Proposition 1. To each such $p\in\mathcal{P}^{cit}_X(M)$, we can assign a rooted tree $T_X(p)$ with $n=(\#(p)+1)$ vertices.

Proof. We draw $n$ points in the plane, which furnish the set of vertices of the rooted tree. To one of these points, we associate the set $M$. It will become the root. To each of the other $n-1$ points we associate one element of $p$. Let $v(\gamma_i)$ denote the vertex which is labelled by the set $\gamma_i\in p$ in this process. Now we can construct the edges. For that, we connect two vertices $v(\gamma_i)$, $v(\gamma_k)$ by an edge pointing from $v(\gamma_k)$ to $v(\gamma_i)$ if and only if the following two conditions are fulfilled:

i) $\gamma_i\subset\gamma_k$,
ii) there is no further set $\gamma_j\in p$ such that $\gamma_i\subset\gamma_j\subset\gamma_k$.

Here, we allow $\gamma_k$ to be the set $M$ itself: $\gamma_k\in\{p\cup M\}$. The resulting tree is simply-connected, due to the fact that all elements of $p$ are mutually tree-ordered. Further, it has a distinguished root. □

For a chosen vertex $v$ of a rooted tree $T_X(p)$ let $\gamma(v)$ be the set associated to that vertex. Further, assume that $f(v)=k$, hence $v$ connects via $k$ outgoing edges to vertices $v_1,\dots,v_k$, say. The corresponding sets $\gamma(v_1),\dots,\gamma(v_k)$ are necessarily mutually disjoint, as $T_X(p)$ is simply-connected. Define
$$\gamma_v:=\cup_{i=1}^{f(v)}\gamma(v_i).$$

Proposition 2. $\mathcal{P}_X(\gamma(v)/\gamma_v)=\emptyset$.

Proof. $\gamma(v)$ is irreducible as it is an element of $p$. Hence $\gamma(v)/\gamma_v\ne\emptyset$. By definition, $\gamma_v$ is the union of all sets $\gamma(v_i)\in p$ which are subsets $\gamma(v_i)\subset\gamma(v)$. If there were an element $\gamma'$ in $\mathcal{P}_X(\gamma(v)/\gamma_v)$, this would imply that $\gamma'$ is a non-overlapping subset of $\gamma(v)$ which is not in $\gamma_v$. Contradiction. □

The linear combination $T_X(M)$ assigned to the irreducible set $M$ is the sum
$$T_X(M):=\sum_{p\in\mathcal{P}^{cit}_X(M)}T_X(p).$$
For a reducible $M$ we can write $M=\cup_iM_i$ for some mutually disjoint irreducible sets $M_i$. We then set $T_X(M)=\prod_iT_X(M_i)$.¹

An example might be in order. Let $M=\{a,b,c\}$. First, choose $X=\emptyset$. All subsets which contain more than one element are reducible. Thus, $T_X(M)$ is the product $t_1(a)t_1(b)t_1(c)$ of three disjoint roots, labelled $\{a\},\{b\},\{c\}$. Next, let $X$ be the condition that $a$ is contained in the subset but not $c$. Then, $\{a\}$ and $\{a,b\}$ are irreducible proper subsets. $\mathcal{P}^{cit}_X(M)$ contains a single set $p=\{\{a\},\{a,b\}\}$, and we obtain $T_X(p)=t_{3_1}$ (see Fig. 5) with the set $M$ labelling the root, which is connected to a vertex labelled by $\{a,b\}$, and finally this vertex is connected to a third one labelled by $\{a\}$. Finally, choose $X$ to be the condition that $a$ is contained in the subset. Then, $\{a\},\{a,b\},\{a,c\}$ are irreducible proper subsets. The latter two are overlapping. $\mathcal{P}^{cit}_X(M)$ consists of two elements $p_1,p_2$, say, where $p_1=\{\{a\},\{a,b\}\}$ and $p_2=\{\{a\},\{a,c\}\}$. $T_X(p_1)$ and $T_X(p_2)$ both realize $t_{3_1}$ with appropriate decorations. Consider Fig. 6 for a visualization of these examples.

¹ If there is more than one possibility to write $M$ as a union of disjoint sets $M_i$, we sum over all trees which we obtain from the consideration of all possibilities how to decompose $M$ into these various disjoint subsets. We will not meet this case in this paper, though.
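The three examples for $M=\{a,b,c\}$ can be reproduced by brute force. The sketch below makes two assumptions that are not spelled out in the text: proper subsets are taken to be non-empty, and the completeness condition iii) is tested only against the irreducible elements of $\mathcal{P}_X(M)$. All function names are hypothetical.

```python
from itertools import combinations


def proper_subsets(M):
    """Non-empty proper subsets of M (non-emptiness is an assumption)."""
    items = sorted(M)
    return [frozenset(c) for r in range(1, len(items)) for c in combinations(items, r)]


def relation(g1, g2):
    """Classify two distinct subsets as disjoint, nested or overlapping."""
    if not (g1 & g2):
        return "disjoint"
    if g1 <= g2 or g2 <= g1:
        return "nested"
    return "overlapping"


def irreducible(g, PX):
    """g is reducible if it is a union of mutually disjoint smaller members of PX."""
    parts = [h for h in PX if h < g]

    def cover(remaining, start):
        if not remaining:
            return True
        return any(parts[i] <= remaining and cover(remaining - parts[i], i + 1)
                   for i in range(start, len(parts)))

    return not cover(g, 0)


def complete_families(M, X):
    """All families p of irreducible, mutually tree-related sets satisfying iii)."""
    PX = [g for g in proper_subsets(M) if X(g)]
    irr = [g for g in PX if irreducible(g, PX)]
    found = []
    for r in range(len(irr) + 1):
        for p in combinations(irr, r):
            tree_related = all(relation(g, h) != "overlapping"
                               for g, h in combinations(p, 2))
            # iii): g is outside p exactly when it overlaps some element of p
            complete = all((g in p) != any(relation(g, h) == "overlapping" for h in p)
                           for g in irr)
            if tree_related and complete:
                found.append(set(p))
    return found


M = {"a", "b", "c"}
print(complete_families(M, lambda g: True))                        # X = empty condition
print(complete_families(M, lambda g: "a" in g and "c" not in g))   # a in gamma, c not in gamma
print(complete_families(M, lambda g: "a" in g))                    # a in gamma
```

Under these assumptions the first condition yields the single family of the three singletons, the second yields $\{\{a\},\{a,b\}\}$, and the third yields the two families $p_1$ and $p_2$ of the text.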
Fig. 6. The examples show how various $T_X(\{a,b,c\})$ are generated by different conditions $X$. From top to bottom, we have i) $X=\emptyset$, ii) $X$: $a\in\gamma$, $c\notin\gamma$, iii) $X$: $a\in\gamma$
2.3. The Hopf algebra structure of $\mathcal{P}_X(M)$. To each set $M$ we can assign a depth $d_X(M)$ as $d_X(M)=n_v(T_X(M))$, according to (12). This gives us a decomposition of the set $\mathcal{M}$ of all irreducible (w.r.t. $X$) finite sets through the grading by depth,
$$\mathcal{M}=\mathcal{M}^{[1]}\cup\mathcal{M}^{[2]}\cup\dots\cup\mathcal{M}^{[k]}\cup\dots,$$
which obviously depends on the condition $X$. Here, $\mathcal{M}^{[1]}$ are all sets $M$ which have no proper subset which fulfills condition $X$, hence of depth $d_X(M)=1$, $\mathcal{M}^{[2]}$ are all sets of depth two, such that all their proper subsets which fulfill $X$ are from $\mathcal{M}^{[1]}$, and in general $\mathcal{M}^{[k]}$ contains all sets of depth $k$, and hence has proper subsets of depth $\le k-1$.

We want to establish a Hopf algebra of rooted trees on $\mathcal{M}$. For this, it is sufficient to study irreducible sets $M$. We will take elements of $\mathcal{M}^{[1]}$ as primitive elements. By definition, $T(M)=t_1(M)$ for $M\in\mathcal{M}^{[1]}$, which justifies this choice. We call a set $M$ non-overlapping if $\mathcal{P}_X(M)$ is tree-ordered, hence if all its subsets which fulfill $X$ are tree-related amongst each other.

Proposition 3. If $M$ is non-overlapping, $\#(\mathcal{P}^{cit}_X(M))=1$.

Proof. All elements of $\mathcal{P}_X(M)$ can be tree-ordered amongst themselves by assumption. As any element $p\in\mathcal{P}^{cit}_X(M)$ is complete and contained in $\mathcal{P}_X(M)$, there can be only one such element $p$. □

Two final definitions: If $X$ is a given condition, $\mathcal{P}_X(M)=\{u\in\mathcal{P}(M)\mid X(u)\}$, then, for $\gamma\in\mathcal{P}_X(M)$, $X_\gamma$ is defined to be the condition
$$\mathcal{P}_{X_\gamma}(M)=\{u\in\mathcal{P}(M)\mid X(u)\text{ and }u\notin\{\gamma\cup\mathcal{P}_X(\gamma)\}\}.$$
We call a condition $X$ an orderly condition if and only if
$$T_{X_\gamma}(M)=T_X(M/\gamma),\quad\forall\gamma\in\mathcal{P}_X(M).$$
This means that checking the condition $X$ and then eliminating all elements of $\mathcal{P}_X(M)$ which belong as well to $\gamma\cup\mathcal{P}_X(\gamma)$ is the same as first eliminating $\gamma$ and checking the condition $X$ on the reduced set $M/\gamma$. Let us give an example of an orderly condition.
Overlapping Divergences
Example. Consider a space $Y$ and a set $\sigma_Y$ of subsets of $Y$. Endow $Y$ with the topology generated by $\sigma_Y$ as a subbasis.² Endow any space $Y/\gamma$, $\gamma \in \sigma_Y$, with its induced topology, which is generated by the subbasis $\{u/\gamma \mid u \in \sigma_Y\}$. Let $X$ be the condition that a subset $\gamma \subset Y$ must fulfill $\gamma \in \sigma_Y$ to be in $P_X(Y)$. Then $X$ is an orderly condition. Indeed, $T_X(Y/\gamma)$ is the forest $T_X(Y)$ in which all vertices decorated by $\gamma$ or its subsets in $P_X(\gamma)$ are deleted, and so is $T_{X_\gamma}(Y)$. On the other hand, note that the examples in Fig. 6 give non-orderly conditions for $X \neq \emptyset$.
We want to establish a Hopf algebra of rooted trees $H_R(\mathcal{M}^{[1]} \cup_{i=2}^{\infty} U^{[i]})$ which assigns to each $M \in \mathcal{M}^{[k]}$ a sum of rooted trees $T_M$ such that its coproduct takes the form
$$\Delta(T_M) = \sum_{\gamma \subset_X M} T_\gamma \otimes T_{M/\gamma}. \tag{13}$$
The sum is over all subsets $\gamma \subset M$ such that $\gamma$ fulfills condition $X$. We do not demand that $\gamma$ is irreducible. It is thus allowed that $\gamma$ is the union of disjoint sets $\gamma_i$ which themselves fulfill condition $X$ and are irreducible.
The notation $\cup_{i=2}^{\infty} U^{[i]}$ refers to the iterative manner in which we will achieve our goal. Achieving it for sets $M$ of depth one is trivial: we take $\mathcal{M}^{[1]}$ as the set of decorations for $H_R$ and are done. Next, we will construct a set of decorations $U^{[2]}$ such that $H_R(\mathcal{M}^{[1]} \cup U^{[2]})$ achieves the desired goal for all sets of depth up to two. Then, we further enlarge this set by $U^{[3]}$ so that the coproduct in $H_R(\mathcal{M}^{[1]} \cup U^{[2]} \cup U^{[3]})$ agrees with (13) for sets $M$ of depth up to three, and so on. In general, we show that if one has succeeded at depth $k$, then there is a set of decorations $U^{[k+1]}$, primitive under the coproduct of $H_R$, such that one obtains the desired form (13). It will turn out that $T_M$ is a sum of rooted trees containing $T_X(M)$. Further, $[T_M - T_X(M)]$ is a sum of rooted trees which fulfills $n_v(T_M - T_X(M)) < n_v(T_X(M))$.
For non-overlapping sets $M$, there is an immediate natural Hopf algebra structure $H_R(\mathcal{M}^{[1]})$. It is natural in the sense that the coproduct assumes the form (13):
Proposition 4. For non-overlapping sets $M$ we have
$$\Delta(T_X(M)) = \sum_{\gamma \subset_X M} T_X(\gamma) \otimes T_{X_\gamma}(M). \tag{14}$$
Proof. For non-overlapping sets $M$, $T_X(M)$ is a single rooted tree $T_X(M) = T_X(p)$. Admissible cuts on this rooted tree and subsets $\gamma$ in the sum are in one-to-one correspondence, by construction. Let $\gamma_C$ be the set corresponding to the chosen admissible cut $C$. By the definition of $T_X(\gamma)$, $T_X(\gamma) = P^C(T_X(M))$. Further, $R^C(T_X(M))$ is the decorated tree which remains connected with the root under the admissible cut. By definition of $X_\gamma$, $T_{X_\gamma}(M) = R^C(T_X(M))$, as both rooted trees are obtained from $T_X(M)$ by eliminating all vertices and edges corresponding to $T_X(\gamma)$. Further, by Proposition 2 we can decorate the rooted tree $T_X(M)$ with elements from $\mathcal{M}^{[1]}$. □
Note that for an orderly condition $X$, (14) takes the form
$$\Delta(T_X(M)) = \sum_{\gamma \subset_X M} T_X(\gamma) \otimes T_X(M/\gamma). \tag{15}$$
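The correspondence between admissible cuts and terms of the coproduct can be made concrete on small decorated trees. The following is a minimal sketch, not the paper's construction: it assumes a tree is stored as a pair (decoration, sorted tuple of subtrees), and the names `tree`, `cuts_keeping_root` and `coproduct` are hypothetical. It enumerates the cuts that remove at most one edge on every path from the root, and collects the pruned part $P^C$ on the left and the remainder $R^C$ on the right, together with the two trivial terms.

```python
from collections import Counter
from itertools import product


def tree(label, *children):
    """Build a decorated rooted tree as (label, sorted tuple of subtrees)."""
    return (label, tuple(sorted(children)))


def cuts_keeping_root(t):
    """Yield (pruned_forest, remainder) for every admissible cut that keeps
    the root of t, including the empty cut (nothing pruned)."""
    label, children = t
    per_child = []
    for child in children:
        # Either cut the edge above the child (the whole subtree is pruned) ...
        options = [((child,), None)]
        # ... or keep the child attached and cut (or not) further down.
        options.extend(cuts_keeping_root(child))
        per_child.append(options)
    for choice in product(*per_child):
        pruned = tuple(sorted(s for forest, _ in choice for s in forest))
        kept = tuple(sorted(r for _, r in choice if r is not None))
        yield pruned, (label, kept)


def coproduct(t):
    """Return a Counter mapping (left forest, right forest) to multiplicity."""
    terms = Counter()
    terms[((t,), ())] += 1                 # T (x) 1
    for pruned, remainder in cuts_keeping_root(t):
        terms[(pruned, (remainder,))] += 1  # empty cut gives 1 (x) T
    return terms


if __name__ == "__main__":
    # Two-vertex tree: root decorated by "w1", one leaf decorated by "v1".
    for (left, right), mult in coproduct(tree("w1", tree("v1"))).items():
        print(mult, left, "(x)", right)
```

Forests are represented as sorted tuples so that equal terms coming from different cuts are counted together, which is how a multiplicity such as a factor 2 would show up.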
² Any set $\sigma$ of subsets of a space generates a topology. The open sets are arbitrary unions of finite intersections of elements of $\sigma$, and $\sigma$ is a subbasis of this topology.
D. Kreimer
Hence we set
$$T_\gamma = T_X(\gamma), \qquad T_{M/\gamma} = T_X(M/\gamma), \tag{16}$$
to obtain the desired form (13) for all non-overlapping $M \in \mathcal{M}_{nol}$, the set of all sets $M$ which are non-overlapping. This is consistent since, if $M$ is non-overlapping, so are all elements in $P_X(M)$. To simplify notation, let us assume in the following that $X$ is an orderly condition. When we come to Feynman graphs in the next section, we will actually find the relevant condition $X$ to be an orderly condition. However, the general case demands not much more than a replacement $T_X(M/\gamma) \to T_{X_\gamma}(M)$ and a slightly refined decomposition of $\mathcal{M}$. So far, we found that all elements in $\mathcal{M}_{nol}$ have the desired form. From now on let $\Delta_1$ be the coproduct of $H_R(\mathcal{M}^{[1]})$. We have just shown that it has the desired form on $\mathcal{M}_{nol}$. We stress that $\Delta_1$ is defined on all rooted trees with decorations in $\mathcal{M}^{[1]}$. We now want to show that for the other elements, the overlapping sets $M$, we can find a Hopf algebra of rooted trees with a coproduct which has the desired form (13), by simply adding more decorations. As an aside, we will gain a systematic decomposition into primitive elements, which corresponds to a skeleton expansion at the level of QFT, as we will see later on. We will proceed by induction on the depth. There are no overlapping sets $M$ in $\mathcal{M}^{[1]}$: $\mathcal{M}^{[1]} \subset \mathcal{M}_{nol}$. Hence we start the induction by considering sets in $\mathcal{M}^{[2]}$. We want to construct a Hopf algebra of rooted trees $H_R(\mathcal{M}^{[1]} \cup U^{[2]})$ such that its coproduct $\Delta_2$ again can be written in the form (13). $U^{[2]}$ is a set of decorations, hence we demand $\Delta_2(u) = u \otimes 1 + 1 \otimes u$, $\forall u \in U^{[2]}$. Let $M \in \mathcal{M}^{[2]}$ be irreducible and overlapping. Then each $p \in P_X^{cit}(M)$ is in $\mathcal{M}^{[1]}$. Let us assign to $M$ an element $T_M$ and set
$$\Delta_2(T_M) = T_M \otimes 1 + 1 \otimes T_M + \sum_{\gamma \subset_X M} T_\gamma \otimes T_{M/\gamma}.$$
Due to the definition of $P_X^{cit}(M)$ this can be written as
$$\Delta_2(T_M) = T_M \otimes 1 + 1 \otimes T_M + \sum_{p \in P_X^{cit}(M)} T_p \otimes T_{M/p}.$$
But $p \in \mathcal{M}^{[1]}$ and $M/p \in \mathcal{M}^{[1]}$, hence $T_p = T_X(p)$, $T_{M/p} = T_X(M/p)$, by (16). Also, the coproduct $\Delta_1$ of $H_R(\mathcal{M}^{[1]})$ is defined on the sum of rooted trees $T_X(M)$ and reads
$$\Delta_1(T_X(M)) = T_X(M) \otimes 1 + 1 \otimes T_X(M) + \sum_{p \in P_X^{cit}(M)} T_X(p) \otimes T_X(M/p).$$
Thus we find that, for $U_M := T_M - T_X(M)$, $\Delta_2(U_M) = U_M \otimes 1 + 1 \otimes U_M$. $U_M$ reveals itself to be a primitive element with respect to $\Delta_2$. This suggests defining $U^{[2]}$ via the union of all elements $U_M = T_M - T_X(M)$, where $M$ is of depth two and overlapping. As $U_M$ is primitive, we identify it with a decoration $u_M$ of the tree $t_1$, $t_1(u_M) = U_M$, and obtain
$$U^{[2]} = \{u_M \mid t_1(u_M) = T_M - T_X(M),\ M \in \mathcal{M}^{[2]} \wedge M \notin \mathcal{M}_{nol}\}. \tag{17}$$
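The primitivity of $U_M$ follows by subtracting the two coproducts written above. The short computation below spells this out as a sketch, under the assumption (implicit in the construction) that $\Delta_2$ agrees with $\Delta_1$ on trees decorated by $\mathcal{M}^{[1]}$ alone, so that $\Delta_2(T_X(M)) = \Delta_1(T_X(M))$:

```latex
\begin{align*}
\Delta_2(U_M) &= \Delta_2(T_M) - \Delta_1(T_X(M)) \\
 &= \Big[ T_M \otimes 1 + 1 \otimes T_M
      + \sum_{p \in P_X^{cit}(M)} T_X(p) \otimes T_X(M/p) \Big] \\
 &\quad - \Big[ T_X(M) \otimes 1 + 1 \otimes T_X(M)
      + \sum_{p \in P_X^{cit}(M)} T_X(p) \otimes T_X(M/p) \Big] \\
 &= (T_M - T_X(M)) \otimes 1 + 1 \otimes (T_M - T_X(M)) \\
 &= U_M \otimes 1 + 1 \otimes U_M .
\end{align*}
```

The sums cancel term by term because, at depth two, both $p$ and $M/p$ lie in $\mathcal{M}^{[1]}$, where (16) identifies $T_p$ with $T_X(p)$ and $T_{M/p}$ with $T_X(M/p)$.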
Hence we find that $\Delta_2$ is the coproduct of the Hopf algebra of rooted trees $H_R(\mathcal{M}^{[1]} \cup U^{[2]})$, where $U^{[2]}$ is the set of decorations corresponding to primitive elements $U_M \equiv T_M - T_X(M)$, $M$ being an overlapping set in $\mathcal{M}^{[2]}$. The primitive elements of this Hopf algebra are $t_1(M)$, $M \in \mathcal{M}^{[1]}$, and the elements $U_M$ defined above. Note that the element $T_M$ is resolved into the linear combination of trees $T_M = T_X(M) + t_1(u_M)$, as desired. Note further that we can write the coproduct $\Delta_2$ as
$$\Delta_2(T_M) = T_M \otimes 1 + 1 \otimes T_M + (\mathrm{id} - E \circ \bar e) \otimes (\mathrm{id} - E \circ \bar e)\,\Delta_1(T_X(M)),$$
where $E : \mathbb{Q} \to H_R$ is given by $E(q) = q1$. Obviously we left the counit $\bar e$ unchanged, $\bar e(1) = 1$, $\bar e(T) = 0$, $\forall T \neq 1$.
At this point the attentive reader might ask why we do not simply set $T_M = T_X(M)$, as this would still deliver the natural form (13). But our point is to show that any attempt to find a Hopf algebra which has the natural form (13) will be a Hopf algebra of rooted trees, with an appropriate set of primitive elements. This puts the combinatorial problem of renormalization completely at rest and settles its algebraic structure as determined by the Hopf algebra structure of rooted trees, which, fascinatingly, describes not only renormalization but also the combinatorics of the diffeomorphism group [4].
Let us continue then. Thus, let $M \in \mathcal{M}^{[k]}$ be irreducible and overlapping. Assume we found a Hopf algebra of rooted trees $H_R(\mathcal{M}^{[1]} \cup_{i=2}^{k} U^{[i]})$ with coproduct $\Delta_k$ such that in this Hopf algebra there is a linear combination $T_M$ of elements such that the coproduct obtains the form (13),
$$\Delta_k(T_M) = T_M \otimes e + e \otimes T_M + \sum_{\gamma \subset_X M} T_\gamma \otimes T_{M/\gamma}.$$
We want to show by induction that the same holds for $M_+ \in \mathcal{M}^{[k+1]}$. Let $\gamma \subset_X M_+$ be given, and let $M_+ \in \mathcal{M}^{[k+1]}$ be overlapping. Then consider all the terms in
$$\Delta_1[T_X(M_+)] = \sum_{p \in P_X^{cit}(M_+)}\ \sum_{\text{adm. cuts } C(p) \text{ of } T_X(p)} P^{C(p)}[T_X(p)] \otimes R^{C(p)}[T_X(p)],$$
which correspond to the set $\gamma$. This is well-defined: any two overlapping sets $\gamma, \gamma' \in P_X(M_+)$ will correspond to branches of different trees $T_X(p)$, $T_X(p')$, as elements $p, p'$ are tree-ordered. Further, each single elementary cut corresponds to some subset $\gamma \subset_X M_+$. We can thus organize the above sum in groups of terms corresponding to $\gamma \subset_X M_+$. Finally, the completeness of all elements of $P_X^{cit}(M_+)$ guarantees that all admissible cuts which correspond to $\gamma$ conspire to give $T_X(\gamma)$, and all terms on the other side of the tensor product conspire to give $T_X(M_+/\gamma)$, for an orderly condition $X$. We get
$$\Delta_1[T_X(M_+)] = \sum_{p \in P_X^{cit}(M_+)}\ \sum_{\text{adm. cuts } C(p) \text{ of } T_X(p)} P^{C(p)}[T_X(p)] \otimes R^{C(p)}[T_X(p)] = \sum_{\gamma \subset_X M_+} T_X(\gamma) \otimes T_X(M_+/\gamma). \tag{18}$$
Now we have to take care of the difference between $T_X(\gamma)$ and $T_\gamma$, and between $T_X(M_+/\gamma)$ and $T_{M_+/\gamma}$. We first take care of all possible differences between $T_X(\gamma)$ and $T_\gamma$. Consider all $\gamma \in P_X(M_+)$. First, we consider all such $\gamma$ which are in $\mathcal{M}^{[2]}$ and overlapping. In the coproduct (18) we find a term $T_X(\gamma)$ on the left-hand side. $T_X(\gamma) \otimes T_X(M_+/\gamma)$ is actually a sum of terms (as both sides are sums of trees in general) which carries a natural product structure indicated by the tensor product. For each term in this sum, there is a well-defined set of edges corresponding to the admissible cut which gives $\gamma$. Gluing both sides, $T_X(\gamma)$ and $T_X(M/\gamma)$, together along these edges gives back $T_X(M_+)$, $T_X(M) = T_X(\gamma) \wedge_\gamma T_X(M/\gamma)$. Here $\wedge_\gamma$ refers to the gluing process along the edges which are cut when we obtain $T_X(\gamma)$ on the left-hand side of the tensor product. Instead, we glue $T_\gamma = T_X(\gamma) + U_\gamma$ back along these edges ("surgery along edges"), for all overlapping $\gamma \in \mathcal{M}^{[2]}$. Call the new sum of trees $T_2(M_+) = T_\gamma \wedge_\gamma T_X(M/\gamma)$. It has the form $T_X(M_+) + T_2$, where $n_v(T_2) = n_v(T_X(M_+)) - 1$. It further has the property that all cuts corresponding to such a $\gamma$ in $T_2(M_+)$ will give $T_\gamma$ on the left-hand side, if we employ $\Delta_2[T_2(M_+)]$. Now we consider all $\gamma \in P_X(M_+)$ which are overlapping and in $\mathcal{M}^{[3]}$. We use the product structure of $T_2(M)$ under $\Delta_2$ and glue back $T_\gamma$ for $T_2(\gamma)$. We continue in this manner for all overlapping $\gamma \in P_X(M_+)$ in ascending order until we reach $\gamma \in \mathcal{M}^{[k]}$. Call the resulting sum of trees $T_k(M)$. In a similar manner, we then replace $T_k(M/\gamma)$ by $T_{M/\gamma}$, starting with $M/\gamma \in \mathcal{M}^{[2]}$. We finally obtain a sum of trees $\breve T(M) = T_X(M) +$ terms of lower depth. By construction, $\Delta_k(\breve T(M) - T_X(M))$ contains all the terms which distinguish $\sum_{\gamma \subset_X M} T_\gamma \otimes T_{M/\gamma}$ from $\Delta(T_X(M))$. Notably, $\Delta_k$ acts on $\breve T(M)$, as it is a sum of rooted trees with decorations in $\mathcal{M}^{[1]} \cup_{i=2}^{k} U^{[i]}$. Hence we get
$$\Delta_k(\breve T(M)) = \breve T(M) \otimes e + e \otimes \breve T(M) + \sum_{\gamma \subset_X M} T_\gamma \otimes T_{M/\gamma}.$$
We now set
$$\Delta_{k+1}(T_M) := T_M \otimes e + e \otimes T_M + (\mathrm{id} - \bar e) \otimes (\mathrm{id} - \bar e)\,\Delta_k(\breve T(M)).$$
Then, again, $U_M := T_M - \breve T(M)$ is a primitive element for $\Delta_{k+1}$, and thus $\Delta_{k+1}$ becomes the coproduct of a Hopf algebra of rooted trees $H_R(\mathcal{M}^{[1]} \cup_{i=2}^{k+1} U^{[i]})$, where $U^{[k+1]}$ is the set of elements $U_M$, with $M$ an overlapping set of depth $k + 1$, thus in $\mathcal{M}^{[k+1]}$. We have, in analogy to the case $k = 2$,
$$U^{[k+1]} = \{u_M \mid t_1(u_M) = T_M - \breve T(M),\ M \in \mathcal{M}^{[k+1]} \wedge M \notin \mathcal{M}_{nol}\}, \tag{19}$$
which works iteratively, as $\breve T$ uses only decorations obtained from trees with degree $\le k$. Hence, algorithmically, one needs to determine all elements $\gamma \in P_X(M)$, and then all elements in the corresponding $P_X(\gamma)$, and so on. Eventually, one ends by considering elements of depth two, whose decorations can be immediately determined by (17). One then works up with the grading. We conclude that the natural coproduct (13) is the coproduct of the Hopf algebra of rooted trees based on an appropriate set of decorations, constructed iteratively starting at depth two and using induction on the grading by depth. Regarding an arbitrary Feynman graph as a set of edges and vertices, one needs powercounting and a determination of (one-particle irreducible) subgraphs to determine the skeleton expansion, and hence all decorations, iteratively, as the examples in the next section will exhibit.
3. Overlapping Divergences
We will apply the notions established in the previous section to sets of propagators and vertices which constitute Feynman graphs. We will order Feynman graphs by the depth of the rooted trees assigned to them. Below, we will define an orderly condition $X$ which can be tested by powercounting. The elementary fact that Feynman integrals allow for a well-defined degree of divergence essentially allows us to use this degree of divergence as the crucial check on subgraphs, regarded as subsets of edges and vertices. One-particle irreducibility is the other demand, which we choose for convenience. For each one-particle irreducible (1PI) superficially divergent Feynman graph $\Gamma$ we denote by $\{\Gamma\}$ the set of its propagators and vertices. Let $X$ be the condition: for any set $\{\gamma\} \subset \{\Gamma\}$ of propagators and vertices, $X(\{\gamma\})$ is true if and only if $\gamma$ constitutes a one-particle irreducible superficially divergent subgraph of $\Gamma$. Further, to $X(\{\Gamma\}/\{\gamma\})$ we associate the graph $\Gamma/\gamma$ which we obtain if we shrink $\gamma$ in $\Gamma$ to a point.
Proposition 5. $X$ is an orderly condition.
Proof. $\#(P_X(\{\Gamma\}/\{\gamma\})) = \#P_{X_{\{\gamma\}}}(M)$. Assume that two elements of either of these two sets correspond to the same two subgraphs of $\Gamma$. Then, if they are overlapping, nested or disjoint in one of these two sets, they are so in the other as well. □
We are interested in the set $P_X(\{\Gamma\})$. For a 1PI Feynman graph $\Gamma$, $T_X(\{\Gamma\})$ is the forest assigned to it in the sense of the previous section. In general, $T_X(\{\Gamma\})$ will be a sum of rooted trees $T_X(p)$, $p \in P_X^{cit}(\{\Gamma\})$. Note further that $\{\Gamma\}$ is irreducible with respect to $X$ for all 1PI graphs $\Gamma$. Define the depth $d(\Gamma)$ as $d(\Gamma) := n_v(T(\{\Gamma\}))$, as before. This depth is well-defined for any Feynman graph. Feynman diagrams without subdivergences thus have depth one, as they correspond to the rooted tree $t_1$ decorated by the set $\{\Gamma\}$.
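The argument above classifies pairs of subgraphs, viewed as sets of propagators and vertices, as overlapping, nested or disjoint. A minimal sketch of that classification follows; the function names and the integer labels are hypothetical, and "tree-ordered" is taken here to mean that every pair is nested or disjoint, which is an assumption about terminology defined earlier in the paper.

```python
from itertools import combinations


def relation(u, v):
    """Classify two subgraphs given as sets of edge/vertex labels."""
    u, v = frozenset(u), frozenset(v)
    if u.isdisjoint(v):
        return "disjoint"
    if u <= v or v <= u:
        return "nested"
    return "overlapping"


def is_non_overlapping(family):
    """True if every pair of sets in the family is nested or disjoint."""
    return all(relation(u, v) != "overlapping"
               for u, v in combinations(family, 2))


if __name__ == "__main__":
    # Hypothetical labels: two subgraphs sharing the single label 3.
    left, right = {1, 2, 3}, {3, 4, 5}
    print(relation(left, right))
    print(is_non_overlapping([{1, 2}, {1, 2, 3}, {4, 5}]))
```

Only set containment and intersection are used, so the same check applies whether the labels stand for propagators, vertices, or both.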
Each Feynman diagram has a well-defined depth, and thus we have a decomposition of the set of all Feynman graphs $\mathcal{FG}$, $\mathcal{FG} = \mathcal{FG}^{[0]} \cup \mathcal{FG}^{[1]} \cup \mathcal{FG}^{[2]} \cup \mathcal{FG}^{[3]} \cup \ldots$. Here $\mathcal{FG}^{[0]}$ corresponds to superficially convergent graphs. We are interested in graphs in $\mathcal{FG}^{[n]}$, $n \ge 1$. To Feynman graphs of depth one we assign the rooted tree $t_1$, decorated by the corresponding element of $\mathcal{FG}^{[1]}$. The elements of this set furnish the set of primitive elements of the Hopf algebra $H_R(\mathcal{FG}^{[1]})$ of decorated rooted trees. The results of the previous section show that for each Feynman graph $\Gamma \in \mathcal{FG}^{[k]}$, we find a sum of associated rooted trees $T_\Gamma$ and a coproduct given by
$$\Delta(T_\Gamma) = 1 \otimes T_\Gamma + T_\Gamma \otimes 1 + \sum_{\gamma \subset_X \Gamma} T_\gamma \otimes T_{\Gamma/\gamma}. \tag{20}$$
Here, $T_\Gamma$ is a sum of rooted trees with decorations in $\mathcal{FG}^{[1]}$ and in $\cup_{i=2}^{k} U^{[i]}$, primitive elements in the Hopf algebra of rooted trees, obtained from Feynman graphs without subdivergences (which, as said earlier in this paper, includes graphs which have other subgraphs reduced to a point in them) and iteratively constructed primitive elements in $U^{[i]}$ as described in the previous section. We will soon see explicit examples which show that the elements so constructed are indeed primitive, hence correspond to analytic expressions without subdivergences. At this stage, we can justify the notation of [2] or [4], where vertices of rooted trees were decorated by elements of $\mathcal{FG}^{[1]}$ [4], which in the same spirit were used as letters of parenthesized words in [2]. In Proposition 2 we labelled each vertex $v$ of $T(\{\Gamma\})$ by a subset $\{\gamma\}$ corresponding to a subgraph $\gamma$ in our context. $\gamma$ itself can have further subdivergences. But then, condition $X$ and Proposition 2 ensure that we could as well label vertices by elements of $\gamma(v)/\gamma_v$, which correspond to graphs without subdivergences. Before we come to examples, let us first make sure that we really get Zimmermann's forest formula from (20).
3.1. Derivation of the forest formula. To the coproduct (20) belongs an antipode given by
$$S(T_\Gamma) = -T_\Gamma - \sum_{\gamma \subset \Gamma} S[T_\gamma]\, T_{\Gamma/\gamma}, \tag{21}$$
as one immediately checks. As it is an antipode in a Hopf algebra of rooted trees, it can be written as a sum over all cuts. Set $T_\Gamma = \sum_i T_i$ for some decorated rooted trees $T_i$. Then
$$S(T_\Gamma) = \sum_i\ \sum_{\text{all cuts } C_i \text{ of } T_i} (-1)^{n_{C_i}} P^{C_i}(T_i)\, R^{C_i}(T_i). \tag{22}$$
Each such cut corresponds to a renormalization forest, which we obtain if we box the corresponding subgraphs in $\Gamma$, and vice versa [4].³
³ Note that we can easily identify maximal forests here (in the sense of renormalization theory), by using the $B_-$ operator on the trees $T_i$.
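As a minimal illustration of the recursion (21), consider a single two-vertex tree $T$ whose root carries a decoration $b$ and whose only other vertex carries a decoration $a$; both decorations are hypothetical and assumed primitive, and $1$ denotes the unit as in (20). The only non-trivial term of the coproduct then comes from the single admissible cut:

```latex
\begin{align*}
\Delta(T) &= T \otimes 1 + 1 \otimes T + t_1(a) \otimes t_1(b), \\
S(t_1(a)) &= -t_1(a), \\
S(T) &= -T - S(t_1(a))\, t_1(b) = -T + t_1(a)\, t_1(b), \\
m[(S \otimes \mathrm{id})\Delta(T)]
  &= S(T) + S(1)\, T + S(t_1(a))\, t_1(b) \\
  &= \big(-T + t_1(a)\, t_1(b)\big) + T - t_1(a)\, t_1(b) = 0 .
\end{align*}
```

The last line is the defining property of the antipode on an element with vanishing counit, so the recursive formula is consistent on this smallest non-primitive case.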
Now, let $\phi$ be a $\mathbb{Q}$-linear map which assigns to $T_\Gamma$ the corresponding Feynman integral. Further, let $\phi_R = \tau_R \circ \phi$ be a map which assigns to $T_\Gamma$ the corresponding Feynman integral, evaluated under some renormalization condition $R$. Hence, from $T_\Gamma$ we obtain via $\phi$ a Feynman integral $\phi(T_\Gamma)$ in need of renormalization. $\tau_R$ modifies this Feynman integral in such a way that the result contains the divergent part of this integral. Essentially, $\tau_R$ extracts the divergences of $\phi(T_\Gamma)$ in a meaningful way [6]. Hence, as $\tau_R$ isolates divergences faithfully, differences $(\mathrm{id} - \tau_R)(\phi(T_\Gamma))$ eliminate divergences in Feynman integrals. Depending on the chosen renormalization scheme $R$, one can adjust finite parts to fulfill renormalization conditions. A detailed study of this freedom from the Hopf algebra viewpoint can be found in [9].
We remind the reader of Sweedler's notation: $\Delta(T_\Gamma) = \sum T_{\Gamma(1)} \otimes T_{\Gamma(2)}$. Let us consider the counit $\bar e(T_\Gamma)$ using Sweedler's notation:
$$0 = \bar e(T_\Gamma) = \sum S(T_{\Gamma(1)})\, T_{\Gamma(2)}.$$
This map vanishes identically. Note that it can also be written as $m[(S \otimes \mathrm{id})\Delta(T_\Gamma)] \equiv \bar e(T_\Gamma) = 0$. But this map gives rise to a much more interesting map, by composition with $\phi$,
$$T_\Gamma \to \Gamma_R := m[(S_R \otimes \mathrm{id})(\phi \otimes \phi)\Delta(T_\Gamma)].$$
This map associates to the Feynman graph $\Gamma$, represented by a unique sum of rooted trees, the renormalized Feynman integral $\Gamma_R$ [2,4]. Its usual definition
$$\Gamma_R = (\mathrm{id} - \tau_R)\Big[\Gamma + \sum_{\gamma \subset \Gamma} Z_\gamma\, \Gamma/\gamma\Big], \tag{23}$$
is recovered if we define
$$S_R[\phi(T_\gamma)] \equiv Z_\gamma = -\tau_R(\gamma) - \tau_R\Big[\sum_{\gamma' \subset \gamma} Z_{\gamma'}\, \gamma/\gamma'\Big]. \tag{24}$$
This map is derived from the antipode
$$S[T_\gamma] = -T_\gamma - \sum_{\gamma' \subset \gamma} S[T_{\gamma'}]\, T_{\gamma/\gamma'}. \tag{25}$$
Using $\phi$ to lift this to Feynman graphs, and using the freedom to alter corresponding analytic expressions according to renormalization schemes $R$, one obtains (24). Note that if one defines $\phi_R = S_R \circ \phi \circ S$, one has $S_R \circ \phi = \phi_R \circ S$ and hence
$$S_R[\phi(T_\gamma)] = \tau_R\Big[-\phi(T_\gamma) - \sum_{\gamma' \subset \gamma} \phi_R(S[T_{\gamma'}])\, \phi(T_{\gamma/\gamma'})\Big]. \tag{26}$$
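For orientation, the recursion can be unfolded in the simplest non-trivial case. Assume, purely for illustration, a graph $\Gamma$ whose only divergent proper subgraph $\gamma$ is itself free of subdivergences. Then the sum in (24) is empty for $\gamma$, and (23) reduces to a single counterterm insertion:

```latex
\begin{align*}
Z_\gamma &= -\tau_R(\gamma), \\
\Gamma_R &= (\mathrm{id} - \tau_R)\big[\Gamma + Z_\gamma\, \Gamma/\gamma\big]
          = (\mathrm{id} - \tau_R)\big[\Gamma - \tau_R(\gamma)\, \Gamma/\gamma\big].
\end{align*}
```

The inner subtraction removes the subdivergence, and the outer factor $(\mathrm{id} - \tau_R)$ removes the remaining overall divergence.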
Hence, in accordance with [2,4], we find the Z-factor of a graph $\gamma$ as derived from the antipode in the Hopf algebra of rooted trees. Above, in (23), we recovered the original forest formula in its recursive form. The non-recursive form is recovered with the same ease, using (22) instead of (21) [2,4]. It reads
$$\Gamma_R = (\mathrm{id} - \tau_R)\Big[\sum_i\ \sum_{\text{all normal cuts } C_i} (-1)^{n_{C_i}}\, \phi_{\tau_R}(P^{C_i}(T_i))\, \phi(R^{C_i}(T_i))\Big]$$
in a form which makes its finiteness obvious when we take into account that the operation $\tau_R$ is defined to leave divergences unaltered. $\phi_{\tau_R}(P^{C_i}(T_i))$ implies an iterative application of $\tau_R$ as governed by the unique boxes (the forests of classical renormalization theory) associated with normal cuts [4]. Explicit realizations will be given elsewhere [9], as well as a more detailed discussion of renormalization schemes, renormalization group equations, operator product expansions and relations to cohomological properties of renormalizations.
3.2. Examples. We start with a simple example. Let $v_1, v_2, \omega_1, \omega_2, \omega_3$ be the Feynman graphs indicated in Fig. 7. We then have, switching to a notation in PW's [2],⁴
$$T_X(\omega_3) = ((v_1)\omega_1) + ((v_2)\omega_2), \tag{27}$$
$$T_{\omega_3} = ((v_1)\omega_1) + ((v_2)\omega_2) + (U_{\omega_3}), \tag{28}$$
$$(U_{\omega_3}) = (T_{\omega_3}) - [((v_1)\omega_1) + ((v_2)\omega_2)], \tag{29}$$
$$\Delta[T_{\omega_3}] = T_{\omega_3} \otimes e + e \otimes T_{\omega_3} + (v_1) \otimes (\omega_1) + (v_2) \otimes (\omega_2). \tag{30}$$
The graphs belong to $\mathcal{FG}^{[2]}$. Note that $U_{\omega_3}$ gives us the skeleton corresponding to this graph. It is a primitive element, and thus free of subdivergences. And indeed, for any choice of momentum transfer and masses in $v_i$, $\phi(U_{\omega_3})$ is an analytic expression free of subdivergences. Any representation of $T_\Gamma$ in terms of Feynman integrals shows that the expressions corresponding to such $U_\Gamma$ are free of subdivergences. An instructive example is given in the appendix of [4], where it is shown how graphs in $\phi^3$ theory explicitly realize the results derived here on general grounds. Similar results can be found in [2,3,7]. Next, in Fig. 8, we consider examples taken from $\mathcal{FG}^{[3]}$. This time, we find the following results:
$$T_X(\omega_4) = (((v_2)v_3)\omega_2) + (((v_3)v_1)\omega_1) + ((v_3)(v_2)\omega_2), \tag{31}$$
$$\breve T(\omega_4) = ((v_2)U_{\omega_5}) + ((v_3)U_{\omega_3}), \tag{32}$$
$$T_{\omega_4} = (((v_2)v_3)\omega_2) + (((v_3)v_1)\omega_1) + ((v_3)(v_2)\omega_2) + ((v_2)U_{\omega_5}) + ((v_3)U_{\omega_3}) + U_{\omega_4}, \tag{33}$$
$$\Delta(T_{\omega_4}) = T_{\omega_4} \otimes e + e \otimes T_{\omega_4} + 2(v_2) \otimes ((v_3)\omega_2) + (v_3) \otimes ((v_1)\omega_1) + (v_3) \otimes ((v_2)\omega_2) + ((v_2)v_3) \otimes (\omega_2) + ((v_3)v_1) \otimes (\omega_1) + (v_3)(v_2) \otimes (\omega_2) + (v_2) \otimes (U_{\omega_5}) + (v_3) \otimes (U_{\omega_3}). \tag{34}$$
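The coproduct (34) can be checked tree by tree: each term of $T_{\omega_4}$ in (33) contributes its admissible cuts, with the $U$-decorated vertices treated as ordinary primitive decorations. The bookkeeping below is a sketch of that check; the shorthand $\Delta'(T) = \Delta(T) - T \otimes e - e \otimes T$ is introduced here only for this purpose and is not notation from the paper.

```latex
\begin{align*}
\Delta'\big((((v_2)v_3)\omega_2)\big) &= (v_2) \otimes ((v_3)\omega_2) + ((v_2)v_3) \otimes (\omega_2), \\
\Delta'\big((((v_3)v_1)\omega_1)\big) &= (v_3) \otimes ((v_1)\omega_1) + ((v_3)v_1) \otimes (\omega_1), \\
\Delta'\big(((v_3)(v_2)\omega_2)\big) &= (v_3) \otimes ((v_2)\omega_2) + (v_2) \otimes ((v_3)\omega_2)
      + (v_3)(v_2) \otimes (\omega_2), \\
\Delta'\big(((v_2)U_{\omega_5})\big) &= (v_2) \otimes (U_{\omega_5}), \\
\Delta'\big(((v_3)U_{\omega_3})\big) &= (v_3) \otimes (U_{\omega_3}), \\
\Delta'\big(U_{\omega_4}\big) &= 0 .
\end{align*}
```

Summing the right-hand sides reproduces the non-trivial terms of (34); the factor 2 arises because $(v_2) \otimes ((v_3)\omega_2)$ is produced once by the chain in the first line and once by the branched tree in the third.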
⁴ For example, in this notation $((v_1)\omega_1)$ corresponds to the tree $t_2$, with its root decorated by $\omega_1$ and the other vertex decorated by $v_1$. Decorated rooted trees and PW's on an alphabet of decorations are in one-to-one correspondence [4].
Fig. 7. A graph from $\mathcal{FG}^{[2]}$ and its subgraphs. We read it as a graph in Yang–Mills theory in four dimensions, say, with straight lines being fermions. In the first row, we see the graph $\omega_3$. Below, we see its two subgraphs $v_1$, $v_2$, and in the bottom row we see the graphs $\omega_1 = \omega_3/v_1$ and $\omega_2 = \omega_3/v_2$
Fig. 8. Graphs from $\mathcal{FG}^{[3]}$ and their subgraphs. At the top, we see the graph $\omega_4$. Apart from the subgraphs in the previous figure, we find two more subgraphs, the vertex $v_3$ and the self-energy $\omega_5 = \omega_4/v_2$, both given in the second row. In the third row, we define the three-loop fermion self-energy $\Sigma_3$. It involves the same subgraphs as before, plus a new graph $\Sigma_1 = \Sigma_3/\omega_3$. Finally, at the bottom, we see the graph $\omega_6$
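Now,
$$2(v_2) \otimes ((v_3)\omega_2) + (v_3) \otimes ((v_1)\omega_1) + (v_3) \otimes ((v_2)\omega_2) + (v_2) \otimes (U_{\omega_5}) + (v_3) \otimes (U_{\omega_3}) = (v_2) \otimes T_{\omega_5} + (v_3) \otimes T_{\omega_3},$$
as $T_{\omega_5} = U_{\omega_5} + 2((v_3)\omega_2)$ and $T_{\omega_3} = U_{\omega_3} + ((v_2)\omega_2) + ((v_1)\omega_1)$, in agreement with (28).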
Fig. 9. Surgery along edges delivers the transition from $T_X(\Gamma)$ to $T_\Gamma$. We first give $T_X(\Gamma)$, where $\Gamma$ is the five-loop graph indicated at the roots. All six rooted trees in this figure have to be added to give $T_X(\Gamma)$
Fig. 10. Now we add the results of replacing $T(\gamma)$ by $T_\gamma$
Fig. 11. Finally, we construct the terms which achieve the transition $T_X(\Gamma/\gamma) \to T_{\Gamma/\gamma}$. The first two rows, if we append the forest $T_\gamma$, give the terms of the previous two figures. The second takes into account the fact that in $\Gamma/\gamma$, $\forall \gamma \in P_X(\{\Gamma\})$, we can find the element $\gamma_2 = \Gamma/\gamma$ itself, by shrinking three loops to this element of $\mathcal{FG}^{[2]}$. The inlay in the first row indicates the graphs $\gamma$ which have to shrink. Note that $\gamma$ is allowed to consist of disjoint graphs. The last row takes into account the primitive element $U_{\Gamma/\gamma_2}$. The inlay defines $T_{\gamma_2}$
For the other graphs in Fig. 8 we find $T_{\Sigma_3} = (((v_1)\omega_1)\Sigma_1) + (((v_2)\omega_2)\Sigma_1) + ((U_{\omega_3})\Sigma_1)$, and $T_{\omega_6} = (((v_2)v_1)\omega_1) + (((v_2)v_2)\omega_2) + ((v_2)U_{\omega_3})$. We invite the reader to confirm that the coproduct on these expressions has the desired form (13). Finally, Figs. 9–11 show how the transition from $T_X(\Gamma)$ to $T_\Gamma$ is achieved in terms of surgery along edges. We start with an example taken from $\phi^3$ theory in six dimensions. We consider a quadratically divergent two-point function as given in the figures. Figure 9 gives $T_X(\Gamma)$. It consists of six decorated rooted trees. In the figure, we give the decorations not by primitive elements, but by full subgraphs. The decoration by primitive elements is obtained, in accordance with Proposition 2, if we divide by the decorations at outgoing vertices. That there are six decorated trees is a consequence of the internal product structure of the graph: there is a subgraph $\gamma_2$ with $\#(P_X^{cit}(\gamma_2)) = 2$, and the complement graph $\Gamma/\gamma_2$ has $\#(P_X^{cit}(\Gamma/\gamma_2)) = 3$. Figure 10 adds the terms for the transition $T_X(\gamma) \to T_\gamma$. This is only non-trivial for the case that $\gamma$ is the indicated overlapping two-loop two-point function $\gamma_2$. Finally, Fig. 11 shows the additional terms generated from the complement graphs $\Gamma/\gamma$.
Let us end this section with a few remarks concerning the various sorts of overlapping divergences. Most prominent and most severe are overlapping quadratic divergences, as one encounters typically in (gauge-)boson propagators, be it in gauge theory in four dimensions or in $\phi^3$ theory in six, considered in the above examples. Typically, the overlapping subdivergences are provided by vertex corrections, and hence we have two sets which overlap. Characteristically, the two overlapping subdivergences can be eliminated by two derivatives with respect to an external momentum, one for each of them. This then generates new decorations of logarithmic degree of divergence. An illuminating example for this situation is given in the appendix of [4]. Overlapping divergences can come with other degrees of divergence, and in other configurations. For example, in non-abelian Yang–Mills theory one can have overlapping divergent Feynman graphs with a logarithmic degree of divergence, where one has three sets which mutually overlap with each other.⁵
4. Conclusions
Starting from set-theoretic notions, we showed how the forest formula underlying renormalization theory is derived ab initio from the Hopf algebra of rooted trees.
At the same time, we constructed a systematic way to obtain the skeleton expansion in any QFT, given by elements $U_M$. We derived the original non-recursive forest formula of Zimmermann from the Hopf algebra of rooted trees, as well as the recursive formulation. The results of [5] are in full accordance with our results and are a specification of the general result presented here. Details for the practitioner of calculational QFT are given elsewhere [9], including remarkable number-theoretic results when investigating the role of the Connes–Moscovici Hopf subalgebra in Feynman diagrams. Some further remarks are in order.
• The methods developed in the first section are sufficiently general to be applied to problems of operator product expansions and asymptotic expansions, with applications to OPE's already being established [9]. Our approach being based on set-theoretic considerations, the remaining challenge for general asymptotic expansions is to find and interpret sensible conditions $X$, and to identify the resulting primitive elements.
• The Hopf algebra of rooted trees has relations to shuffle Hopf algebras [8]. Shuffle products play a role when we start to study the action of the symmetric group on decorations. They appear naturally in the consideration of the sub Hopf algebra generated by trees $B_+^k(e)$, which is the Hopf algebra underlying Chen's iterated integral. The Hopf algebra of rooted trees has this algebra as a sub Hopf algebra. There are interesting generalizations when we study shuffle algebras and iterated integrals from the viewpoint of the Hopf algebra of rooted trees [9]. Especially, the absence of a shuffle product for bare Green functions in the presence of a remaining convolution law points to interesting structures lying ahead [9].
• The general set-theoretic set-up adopted in this paper allows one to study bare Green functions in x-space, and hence will allow one to study them as functions on configuration space (which relies on tree-ordered boundaries in a natural manner, see e.g. [10] and references there). This will hopefully reconcile early work on such functions [11] with more recent developments.
⁵ A tetrahedron formed out of gluons, with three external gluons coupling to three sides of the tetrahedron which form a triangle, is an appealing three-loop example involving only three-gauge-boson vertices.
Acknowledgements. Let me first thank Raymond Stora for interest and discussions, and for the ultimate motivation to write this paper. Also, I thank him for carefully proofreading an earlier version of this paper. I very much enjoyed the opportunity to discuss the intricate structures of the calculus of QFT and to collaborate on its surprising relation to Noncommutative Geometry with Alain Connes. I also thank Alain for generous hospitality on various occasions. As usual, many thanks are due to David Broadhurst for companionship in our long-lasting exploration of patterns and structures in QFT. Support by a Heisenberg fellowship is gratefully acknowledged.
References
1. Zimmermann, W.: Commun. Math. Phys. 15, 208 (1969)
2. Kreimer, D.: Adv. Theor. Math. Phys. 2, 303 (1998); q-alg/9707029
3. Kreimer, D.: J. Knot Th. Ram. 6, 479 (1997); q-alg/9607022
4. Connes, A., Kreimer, D.: Commun. Math. Phys. 199, 203 (1998); hep-th/9808042
5. Krajewski, T., Wulkenhaar, R.: On Kreimer's Hopf algebra structure of Feynman Graphs. CPT-98/P.3639; hep-th/9805098, Eur. Phys. J. C 7, 697 (1999)
6. Collins, J.C.: Renormalization. Cambridge: Cambridge University Press, 1984
7. Broadhurst, D.J., Delbourgo, R., Kreimer, D.: Phys. Lett. B366, 421 (1996); hep-ph/9509296
8. Hoffman, M.E.: Quasi-shuffle products. Preprint, to appear in J. Algebraic Combinatorics; Borwein, J.M., Bradley, D.M., Broadhurst, D.J., Lisonek, P.: Combinatorial aspects of multiple zeta values. Electr. J. Comb. 5, R38 (1998)
9. Broadhurst, D.J., Kreimer, D.: Renormalization automated by Hopf algebras. hep-th/9810087; Kreimer, D.: Chen's iterated integral represents the Operator Product Expansion. hep-th/9901099; Delbourgo, R., Kreimer, D.: Using Hopf algebras to calculate Feynman diagrams. hep-th/9903249
10. Thurston, D.P.: Integral Expressions for the Vassiliev Knot Invariants. math/9901110
11. Epstein, H., Glaser, V., Stora, R.: General Properties of the n-point Functions in Local Quantum Field Theory. In: Balian, R., Iagolnitzer, D. (eds.) Les Houches 1975, Proceedings, Summer School On Structural Analysis Of Collision Amplitudes. Amsterdam 1976, pp. 5–93; Epstein, H., Glaser, V.: Ann. Inst. H. Poincaré 19, 211 (1973)
Communicated by A. Jaffe
Commun. Math. Phys. 204, 691 – 707 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
On the Rigidity Theorem for Spacetimes with a Stationary Event Horizon or a Compact Cauchy Horizon
Helmut Friedrich¹, István Rácz²,⋆, Robert M. Wald³
¹ Max-Planck-Institut für Gravitationsphysik, Albert-Einstein-Institute, Schlaatzweg 1, 14473 Potsdam, Germany. E-mail: [email protected]
² Yukawa Institute for Theoretical Physics, Kyoto University, Kyoto 606-01, Japan. E-mail: [email protected]
³ Enrico Fermi Institute, University of Chicago, 5640 S. Ellis Ave., Chicago, IL 60637-1433, USA. E-mail: [email protected]
Received: 4 November 1998 / Accepted: 13 February 1999
Abstract: We consider smooth electrovac spacetimes which represent either (A) an asymptotically flat, stationary black hole or (B) a cosmological spacetime with a compact Cauchy horizon ruled by closed null geodesics. The black hole event horizon or, respectively, the compact Cauchy horizon of these spacetimes is assumed to be a smooth null hypersurface which is non-degenerate in the sense that its null geodesic generators are geodesically incomplete in one direction. In both cases, it is shown that there exists a Killing vector field in a one-sided neighborhood of the horizon which is normal to the horizon. We thereby generalize theorems of Hawking (for case (A)) and Isenberg and Moncrief (for case (B)) to the non-analytic case.
1. Introduction
A key result in the theory of black holes is a theorem of Hawking [1,2] (see also [3,4]), which asserts that, under certain hypotheses, the event horizon of a stationary, electrovac black hole is necessarily a Killing horizon, i.e., the spacetime must possess a Killing field (possibly distinct from the stationary Killing field) which is normal to the event horizon. The validity of this theorem is of crucial importance in the classification of stationary black holes, since it reduces the problem to the cases covered by the well-known uniqueness theorems for electrovac black holes in general relativity [5–10]. However, an important, restrictive hypothesis in Hawking's theorem is that the spacetime be analytic. A seemingly unrelated theorem of Isenberg and Moncrief [11,12] establishes that in an electrovac spacetime possessing a compact Cauchy horizon ruled by closed null geodesics, there must exist a Killing vector field which is normal to the Cauchy horizon. This result supports the validity of the strong cosmic censorship hypothesis [13] by demonstrating that the presence of such a compact Cauchy horizon is "non-generic".
⋆ Fellow of the Japan Society for the Promotion of Science, on leave of absence from MTA-KFKI Research Institute for Particle and Nuclear Physics
H. Friedrich, I. Rácz, R. M. Wald
The Isenberg–Moncrief theorem also contains the important, restrictive hypothesis that the spacetime be analytic. The main purpose of this paper is to show that the theorems of Hawking and of Isenberg and Moncrief can be proven in the case of a smooth (as opposed to analytic) geometrical setting.1 However, a fundamental limitation of our method is that we are able to prove existence of the Killing field only on a one-sided neighborhood of the relevant horizon. For the Hawking theorem, this one-sided neighborhood corresponds to the interior of the black hole, whereas the existence of a Killing field in the exterior region is what is relevant for the black hole uniqueness theorems. However, for the Isenberg–Moncrief theorem, the one-sided neighborhood corresponds to the original Cauchy development, so our results significantly strengthen their conclusion that the presence of a compact Cauchy horizon ruled by closed null geodesics is an artifact of a spacetime symmetry. This paper is organized as follows: In Sect. 2 we consider stationary black hole spacetimes and establish the existence of a suitable discrete isometry which maps each generator of the event horizon into itself. As seen in Sect. 3, by factoring the spacetime by this isometry, we produce a spacetime having the local geometrical properties of the spacetimes considered by Isenberg and Moncrief. This construction explicitly demonstrates the close mathematical relationship between the Hawking and Isenberg–Moncrief theorems. In Sect. 3, we also review the relevant result of Isenberg and Moncrief, which shows that in suitably chosen Gaussian null coordinates defined in the “unwrapping” of certain local neighborhoods covering the horizon, N , all the fields and their coordinate derivatives transverse to N are independent of the coordinate u on N . In the analytic case, this establishes that (∂/∂u)a is (locally) a Killing field. 
Section 4 contains the key new idea of the paper: We use the methods of [15,16] to extend the region covered by the local Gaussian null coordinates of Isenberg and Moncrief so that the extended spacetime is smooth and possesses a bifurcate null surface. This bifurcate null surface then provides a suitable initial data surface, from which the existence of a Killing field on the extended (and, hence, on the original) spacetime can be established without appealing to analyticity. The results concerning the null initial value formulation that are needed to establish the existence of a Killing field are proven in an appendix.

Throughout this paper a spacetime $(M, g_{ab})$ is taken to be a smooth, paracompact, connected, orientable manifold $M$ endowed with a smooth Lorentzian metric $g_{ab}$ of signature $(+,-,-,-)$. It is assumed that $(M, g_{ab})$ is time orientable and that a time orientation has been chosen. The Latin indices $a, b, c, \ldots$ will be used as abstract tensor indices [17], the Latin indices $i, j, k, \ldots$ will denote tetrad components (used only in Appendix B), and Greek indices will denote coordinate components.

2. Stationary Black Hole Spacetimes

In this section we shall give a mathematically precise specification of the class of stationary, black hole spacetimes to be considered, and we then shall prove existence of a discrete isometry which maps each generator of the horizon into itself. We consider smooth, strongly causal spacetimes $(M, g_{ab})$ which are $(k, \alpha)$-asymptotically stationary as specified in Definition 2.1 of [18]. Thus, we assume that $(M, g_{ab})$ possesses a one-parameter group of isometries, $\phi_t$, generated by a Killing vector field $t^a$, and possesses a smooth acausal slice $\Sigma$ which contains an asymptotically flat “end”,

¹ Further generalizations to allow for the presence of other types of matter fields will be treated by one of us elsewhere [14].
Rigidity Theorem for Spacetimes with a Stationary Event Horizon
$\Sigma_{\rm end}$, on which $t^a$ is timelike and the properties specified in Definition 2.1 of [18] hold. However, the precise asymptotic flatness conditions given in that definition will not be of great importance here, and could be significantly weakened or modified. We further require that if any matter fields (such as an electromagnetic field, $F_{ab}$) are present in the spacetime, then they also are invariant under the action of $\phi_t$. We define $M_{\rm end}$ to be the orbit² of $\Sigma_{\rm end}$ under the isometries,
$$M_{\rm end} = \phi\{\Sigma_{\rm end}\}. \tag{2.1}$$
The black hole region $B$ is defined to be the complement of $I^-[M_{\rm end}]$ and the white hole region $W$ is defined to be the complement of $I^+[M_{\rm end}]$. We require that $(M, g_{ab})$ possess a black hole but no white hole, i.e. $W = \emptyset$, which implies that
$$M = I^+[M_{\rm end}]. \tag{2.2}$$
Note that the domain of outer communications $D$ associated with the asymptotically flat end is, in general, defined to be the intersection of the chronological future and past of $M_{\rm end}$, but, in view of Eq. (2.2), we have simply
$$D = I^-[M_{\rm end}]. \tag{2.3}$$
The (future) event horizon of the spacetime is defined by
$$N = \partial I^-[M_{\rm end}]. \tag{2.4}$$
Our final requirement is that $N$ is smooth and that the manifold of null geodesic generators of $N$ has topology $S^2$ (so that $N$ has topology $\mathbb{R} \times S^2$).

Definition 2.1. Stationary black hole spacetimes which satisfy all of the above assumptions will be referred to as spacetimes of class A.

Remark 2.1. If it is merely assumed that $N$ has topology $\mathbb{R} \times K$, where $K$ is compact, then under some mild additional assumptions, it follows from the topological censorship theorem [19] that each connected component of $N$ has topology $\mathbb{R} \times S^2$ [20–22]; see Remark 2.2 below for a strengthening of this result. However, rather than introduce any additional assumptions here, we have chosen to merely assume that $N$ has topology $\mathbb{R} \times S^2$.

We begin with the following lemma.

Lemma 2.1. Let $(M, g_{ab})$ be a spacetime of class A. Then for all $q \in N$ and all $t \neq 0$ we have $\phi_t(q) \neq q$. In particular, $t^a$ is everywhere non-vanishing on $N$.

Proof. Suppose that for some $q \in N$ and some $t \neq 0$ we had $\phi_t(q) = q$. Since $M = I^+[M_{\rm end}]$, there exists $p \in M_{\rm end}$ such that $p \in I^-(q)$. Since $\phi_{nt}(q) = q$ for all integers $n$, it follows that $\phi_{nt}(p) \in I^-(q)$ for all $n$, from which it follows that $\phi\{p\} \subset I^-(q)$. Therefore, by Lemma 3.1 of [18], we have $I^-(q) \supset M_{\rm end}$ and, hence, $I^-(q) \supset I^-[M_{\rm end}] = D$. However, since $N = \partial I^-[M_{\rm end}]$, it follows that $q$ lies on a future inextendible null geodesic, $\gamma$, contained within $N$. Let $r$ lie to the future of $q$ along $\gamma$. Let $O$ be an open neighborhood of $r$ which does not contain $q$ and let $V$ be any open neighborhood of $r$ with $V \subset O$. Since $r \in N = \partial I^-[M_{\rm end}] = \partial D$, we have $V \cap D \neq \emptyset$. Hence we can find a causal curve which starts in $V \cap D$, goes to $q$ (since $I^-(q) \supset D$) and then returns to $r$ along $\gamma$. This violates strong causality at $r$. $\square$

² The orbit of an arbitrary subset $Q \subset M$ under the action of $\phi_t$ is defined to be $\phi\{Q\} = \bigcup_{t \in \mathbb{R}} \phi_t[Q]$.
Remark 2.2. No assumptions about the topology or smoothness of $N$ were used in the proof of this lemma. It is worth noting that a step in the proof of this lemma can be used to strengthen the results of [20], so as to eliminate the need for assuming existence of an asymptotically flat slice that intersects the null geodesic generators of $N$ in a cross section. First, we note that part (1) of Lemma 2 of [20] can be strengthened to conclude that for any $p \in M_{\rm end}$, each Killing orbit, $\alpha$, on $N$ intersects the achronal $C^{1-}$ hypersurface $C \equiv \partial I^+(p)$ in precisely one point. (Lemma 2 of [20] proved the analogous result for Killing orbits in $D$.) Namely, by Lemma 3.1 of [18], $\alpha$ satisfies either $I^-[\alpha] \cap M_{\rm end} = \emptyset$ or $I^-[\alpha] \supset M_{\rm end}$. The first possibility is excluded by our assumption that $M = I^+[M_{\rm end}]$, so there exists $q \in \alpha \cap I^+(p)$. On the other hand, the proof of the above lemma shows that $I^-(q)$ cannot contain $D$, so there exists $t > 0$ such that $q \notin I^+(\phi_t(p))$. Equivalently, we have $\phi_{-t}(q) \notin I^+(p)$, which implies that the Killing orbit $\alpha$ must intersect $C$. Furthermore, if $\alpha$ intersected $C$ in more than one point there would exist $t > 0$ so that both $r \in \alpha$ and $\phi_t(r)$ lie on $C$. This would imply, in turn, that $r$ lies on the boundary of both $I^+(p)$ and $I^+(\phi_{-t}(p))$, which is impossible since $p \in I^+(\phi_{-t}(p))$. Consequently, each Killing orbit on $N$ intersects $C$ precisely once, i.e., $\varsigma \equiv C \cap N$ is a cross-section for the Killing orbits, as we desired to show.³ In particular, this shows that $N$ has the topology $\mathbb{R} \times \varsigma$. If we now assume, as in [20], that $D$ is globally hyperbolic, that the null energy condition holds, and that $[C \setminus C_{\rm ext}] \cap D$ has compact closure in $M$, then the same argument as used in the proof of Theorem 3 of [20] establishes that each connected component of $\varsigma$ has topology $S^2$, without the need to assume the existence of an achronal slice which intersects the null geodesic generators of $N$ in a cross-section.
Our main result of this section is the following.⁴

Proposition 2.1. Let $(M, g_{ab})$ be a spacetime of class A which satisfies the null energy condition, $R_{ab} k^a k^b \geq 0$ for all null $k^a$. Then there exists a $t_0 \neq 0$ such that $\phi_{t_0}$ maps each null geodesic generator of $N$ into itself. Thus, the Killing orbits on $N$ repeatedly intersect the same generators with period $t_0$.

Proof. By Proposition 9.3.1 of [2], the expansion and shear of the null geodesic generators of $N$ must vanish. By Lemma B.1 of Appendix B, this implies that $\mathcal{L}_k g'_{ab} = 0$ on $N$, where $g'_{ab}$ denotes the pullback of $g_{ab}$ to $N$ and $k^a$ is any smooth vector field normal to $N$ (i.e., tangent to the null geodesic generators of $N$). Since we also have $g'_{ab} k^b = 0$, it follows from the Appendix of [23] that $g'_{ab}$ gives rise to a negative definite metric, $\hat g_{AB}$, on the manifold, $S$, of null geodesic orbits of $N$. By our assumptions, $S$ has topology $S^2$.

³ Furthermore, if $N$ is smooth and $\varsigma$ is compact, then $\varsigma$ also is a cross-section for the null geodesic generators of $N$. Namely, smoothness of $N$ precludes the possibility that a null geodesic generator, $\gamma$, of $N$ has endpoints. (Future endpoints are excluded in any case, since $N$ is a past boundary.) To show that $\gamma$ must intersect $\varsigma$, let $r \in \gamma$ and let $t$ be such that $\phi_{-t}(r) \in \varsigma$. (Such a $t$ exists since $\varsigma$ is a cross section for Killing orbits.) Suppose that $t > 0$. If $\gamma$ failed to intersect $\varsigma$, then the segment of $\gamma$ to the past of $r$ would be a past inextendible null geodesic which is confined to the compact region bounded by $\varsigma$ and $\phi_t[\varsigma]$. This violates strong causality. Similar arguments apply for the case where $t < 0$, thus establishing that $\gamma$ must intersect $\varsigma$. Finally, if $\gamma$ intersected $\varsigma$ at two points, $q, s$, then by achronality of $C$, the segment of $\gamma$ between $q$ and $s$ must coincide with a null geodesic generator, $\lambda$, of $C$. When extended maximally into the past, this geodesic must remain in $N$ (by smoothness of $N$) and in $C$ (since $C$ is a future boundary and $p \notin N$). Thus, we obtain a past inextendible null geodesic which lies in the compact set $\varsigma = N \cap C$, in violation of strong causality.

⁴ Note that the conclusion of this proposition was assumed to be satisfied in the proof of Prop. 9.3.6 of [2], but no justification for it was provided there.
Now, for all $t$, $\phi_t$ maps $N$ into itself and also maps null geodesics into null geodesics. Consequently, $\phi_t$ gives rise to a one parameter group of diffeomorphisms $\hat\phi_t$ on $S$, which are easily seen to be isometries of $\hat g_{AB}$. Let $\hat t^A$ denote the corresponding Killing field on $S$. If $\hat t^A$ vanishes identically on $S$ (corresponding to the case where $t^a$ is normal to $N$), then the conclusion of the Proposition holds for all $t_0 \neq 0$. On the other hand, if $\hat t^A$ does not vanish identically, then since the Euler characteristic of $S$ is non-vanishing, there exists a $p \in S$ such that $\hat t^A(p) = 0$. By the argument given on pp. 119–120 of [24], it follows that there exists a $t_0 \neq 0$ such that $\hat\phi_{t_0}$ is the identity map on $S$. Consequently, $\phi_{t_0}$ maps each null geodesic generator of $N$ into itself. $\square$

3. Isenberg–Moncrief Spacetimes

In this section, we shall consider spacetimes, $(M, g_{ab})$, which contain a compact, orientable, smooth null hypersurface, $N$, which is generated by closed null geodesics.

Definition 3.1. Spacetimes which satisfy the above properties will be referred to as spacetimes of class B.

Spacetimes of class B arise in the cosmological context. In particular, the Taub-NUT spacetime and its generalizations given in Refs. [25–27] provide examples of spacetimes of class B. In these examples, $N$ is a Cauchy horizon which separates a globally hyperbolic region from a region which contains closed timelike curves. However, it should be noted that even among the Kerr–Taub-NUT spacetimes one can find (see Ref. [25]) spacetimes with compact Cauchy horizons for which almost all of the generators of the horizon are not closed. Therefore it should be emphasized that here we restrict consideration to horizons foliated by circles. Since strong causality is violated in all spacetimes of class B, it is obvious that no spacetime of class A can be a spacetime of class B.
Nevertheless, the following proposition shows that there is a very close relationship between spacetimes of class A and spacetimes of class B:

Proposition 3.1. Let $(M, g_{ab})$ be a spacetime of class A. Then there exists an open neighborhood, $O$, of the horizon, $N$, such that $(O, g_{ab})$ is a covering space of a spacetime of class B.

Proof. Let $t_0 > 0$ be as in Proposition 2.1. By Lemma 2.1, $\phi_{t_0}$ has no fixed points on $N$. Since the fixed points of an isometry comprise a closed set, there exists an open neighborhood, $U$, of $N$ which contains no fixed points of $\phi_{t_0}$. Let $O = \phi\{U\}$. Then clearly $O$ also is an open neighborhood of $N$ which contains no fixed points of $\phi_{t_0}$. Moreover, $\phi_{t_0}$ maps $O$ into itself. Let $(\tilde M, \tilde g_{ab})$ be the factor space of $(O, g_{ab})$ under the action of the isometry $\phi_{t_0}$. Then $(\tilde M, \tilde g_{ab})$ is a spacetime of class B, with covering space $(O, g_{ab})$. $\square$

Now, if a spacetime possesses a Killing vector field $\xi^a$, then any covering space of that spacetime possesses a corresponding Killing field $\xi'^a$ that projects to $\xi^a$. Consequently, if the existence of a Killing field is established for spacetimes of class B, it follows immediately from Proposition 3.1 that a corresponding Killing field exists for all spacetimes of class A. In particular, for analytic electrovac spacetimes of class B, Isenberg and Moncrief [11,12] proved existence of a Killing field in a neighborhood of $N$ which is normal to $N$. Consequently, for any analytic, electrovac spacetime of class A, there also exists a
Killing field in a neighborhood of $N$ which is normal to $N$. Thus, Hawking’s theorem [1,2] may be obtained as a corollary of the theorem of Isenberg and Moncrief together with Proposition 3.1.

The main aim of our paper is to extend the theorems of Hawking and of Isenberg and Moncrief to the smooth case. In view of the above remark, it suffices to extend the Isenberg–Moncrief theorem, since the extension of the Hawking theorem will then follow automatically. Thus, in the following, we shall restrict attention to spacetimes of class B.

For spacetimes of class B, $N$ is a compact, orientable 3-manifold foliated by closed null geodesics. To discuss this situation we introduce some terminology (cf. [28] for more details). The “ordinary fibered solid torus” is defined as the set $D^2 \times S^1$ with the circles $\{p\} \times S^1$, $p \in D^2$, as “fibers”. Here $D^2$ denotes the 2-dimensional closed unit disk. A “fibered solid torus” is obtained by cutting an ordinary fibered solid torus along a disk $D^2 \times \{q\}$, for some $q \in S^1$, rotating one of the disks through an angle $\frac{m}{n} 2\pi$, where $m, n$ are integers, and gluing them back again. While the central fiber now still closes after one cycle, the remaining fibers close in general only after $n$ cycles. We note that there exists a fiber preserving $n : 1$ map $\hat\psi$ of the ordinary fibered solid torus onto the fibered solid torus which is a local diffeomorphism that induces an $n : 1$ covering map on the central fiber.

As shown in [12], it follows from Epstein’s theorem [29] that the null geodesics on $N$ represent the fibers of a Seifert fibration. This means that any closed null geodesic has a “fibered neighborhood”, i.e. a neighborhood fibered by closed null geodesics, which can be mapped by a fiber preserving diffeomorphism onto a fibered solid torus. Because $N$ is compact, it can be covered by a finite number of such fibered neighborhoods, $N_i$.
For any neighborhood $N_i$ there is an ordinary fibered solid torus $\hat N_i$ which is mapped by a fiber preserving map $\hat\psi_i$ onto $N_i$ as described above. Further we can choose a tubular spacetime neighborhood, $U_i$, of $N_i$, so that $U_i$ has topology $D^2 \times \mathbb{R} \times S^1$ and the fibration of $N_i$ extends to $U_i$. There exists then a fibered extension $\hat U_i \simeq D^2 \times \mathbb{R} \times S^1$ of $\hat N_i$, with fibers $\{p\} \times S^1$, $p \in D^2 \times \mathbb{R}$, to which $\hat\psi_i$ can be extended to a fiber preserving local diffeomorphism which maps $\hat U_i$ onto $U_i$. We denote the extension again by $\hat\psi_i$. Let $O_i$ denote the universal covering space of $\hat U_i$ (so that $O_i$ has topology $D^2 \times \mathbb{R}^2$). We denote the projection map from $O_i$ onto $\hat U_i$ by $\tilde\psi_i$, set $\psi_i = \hat\psi_i \circ \tilde\psi_i$, and denote the inverse image of $N_i$ under $\psi_i$ by $\tilde N_i$. We will refer to $(O_i, \psi_i^* g_{ab})$ as an elementary spacetime region. Note that for the case of a spacetime of class B constructed from a spacetime of class A in the manner of Proposition 3.1, $O_i$ may be identified with a neighborhood of a portion of the horizon in the original (class A) spacetime.

Our main results will be based upon the following theorem, which may be extracted directly from the analysis of Isenberg and Moncrief [11,12]:

Theorem 3.1 (Moncrief & Isenberg). Let $(M, g_{ab})$ be a smooth electrovac spacetime of class B and let $(O_i, \psi_i^* g_{ab})$ be an elementary spacetime region, as defined above. Then, there exists a Gaussian null coordinate system $(u, r, x^3, x^4)$ (see Appendix A) covering a neighborhood, $O_i'$, of $\tilde N_i$ in $O_i$ so that the following properties hold. (i) The coordinate range of $u$ is $-\infty < u < \infty$ whereas the coordinate range of $r$ is $-\epsilon < r < \epsilon$ for some $\epsilon > 0$, with the surface $r = 0$ being $\tilde N_i$. (ii) In $O_i'$, the projection map $\tilde\psi_i : O_i \to \hat U_i$ is obtained by periodically identifying the coordinate $u$ with some period $P \in \mathbb{R}$. Thus, in particular, the components of $\psi_i^* g_{ab}$ and $\psi_i^* F_{ab}$ in these coordinates are periodic functions of $u$ with period $P$. (iii) We have, writing in the following for convenience $g_{ab}$
and $F_{ab}$ instead of $\psi_i^* g_{ab}$ and $\psi_i^* F_{ab}$,
$$f|_{\tilde N_i} = -2\kappa_\circ \quad\text{and}\quad F_{uA}|_{\tilde N_i} = 0, \tag{3.1}$$
with $\kappa_\circ \in \mathbb{R}$, where $f$ is defined in Appendix A. (iv) On $\tilde N_i$, the $r$-derivatives of the metric and Maxwell field tensor components up to any order are $u$-independent, i.e., in the notation of Appendix A,
$$\frac{\partial}{\partial u}\left[\frac{\partial^n}{\partial r^n}\{f, h_A, g_{AB}; F_{ur}, F_{uA}, F_{rA}, F_{AB}\}\right]\bigg|_{\tilde N_i} = 0, \tag{3.2}$$
for all $n \in \mathbb{N} \cup \{0\}$.

Remark 3.1. Along the null geodesic generators of $\tilde N_i$ the vector field $k^a = (\partial/\partial u)^a$ (which we take to be future directed) satisfies the equation
$$k^a \nabla_a k^b = \kappa_\circ k^b. \tag{3.3}$$
Thus, if $\kappa_\circ > 0$, then all of the null geodesic generators of $\tilde N_i$ are past incomplete but future complete. If $\kappa_\circ < 0$ then all of the null geodesic generators of $\tilde N_i$ are past complete but future incomplete. Similarly, if $\kappa_\circ = 0$ (usually referred to as the “degenerate case”), then $u$ is an affine parameter and all of the null geodesic generators of $\tilde N_i$ are complete in both the past and future directions.

Remark 3.2. In the analytic case, Eq. (3.2) directly implies that $k^a = (\partial/\partial u)^a$ is a Killing vector field in a neighborhood of $\tilde N_i$. Since the projection map $\tilde\psi_i$ is obtained by periodically identifying the coordinate $u$, it follows immediately that $k^a$ projects to a Killing vector field $\hat k^a$ in a neighborhood of $\hat N_i$. Appealing then to the argument of [12], it can be shown that $\hat k^a$ further projects to a Killing vector field under the action of the map $\hat\psi_i$, so that the map $\psi_i = \hat\psi_i \circ \tilde\psi_i$ projects $k^a = (\partial/\partial u)^a$ to a well-defined Killing field in a neighborhood of $N_i$. The arguments of [11,12] then establish that the local Killing fields obtained for each fibered neighborhood can be patched together to produce a global Killing field on a neighborhood of $N$.

In the next section, we shall generalize the result of Remark 3.2 to the smooth case. However, to do so we will need to impose the additional restriction that $\kappa_\circ \neq 0$, and we will prove existence only on a one-sided neighborhood of the horizon.

4. Existence of a Killing Vector Field

The main difficulty encountered when one attempts to generalize the Isenberg–Moncrief theorem to the smooth case is that suitable detailed information about the spacetime metric and Maxwell field is known only on $N$ (see Eq. (3.2) above). If a Killing field $k^a$ exists, it is determined uniquely by the data $k_a$, $\nabla_{[a} k_{b]}$ at one point of $N$, because Eqs. (B.4), (B.6) imply a system of ODE’s for the tetrad components $k_j$, $\nabla_{[i} k_{j]}$ along each $C^1$ curve. But the existence of a Killing field cannot be shown this way.
Thus we will construct the Killing field as a solution to a PDE problem. However, $N$ is a null surface, and thus, by itself, it does not comprise a suitable initial data surface for the relevant hyperbolic equations. We now remedy this difficulty by performing a suitable local extension of a neighborhood of $\tilde N_i$ which is covered by the Gaussian null coordinates of Theorem 3.1. This is achieved via the following proposition:
Proposition 4.1. Let $(O_i, g_{ab}|_{O_i})$ be an elementary spacetime region associated with an electrovac spacetime of class B such that $\kappa_\circ > 0$ (see Eq. (3.1) above). Then, there exists an open neighborhood, $O_i''$, of $\tilde N_i$ in $O_i$ such that $(O_i'', g_{ab}|_{O_i''}, F_{ab}|_{O_i''})$ can be extended to a smooth electrovac spacetime, $(O^*, g^*_{ab}, F^*_{ab})$, that possesses a bifurcate null surface, $\tilde N^*$ – i.e., $\tilde N^*$ is the union of two null hypersurfaces, $N_1^*$ and $N_2^*$, which intersect on a 2-dimensional spacelike surface, $S$ – such that $\tilde N_i$ corresponds to the portion of $N_1^*$ that lies to the future of $S$ and $I^+[S] = O_i'' \cap I^+[\tilde N_i]$. Furthermore, the expansion and shear of both $N_1^*$ and $N_2^*$ vanish.

Proof. It follows from Eq. (3.2) that in $O_i'$, the spacetime metric $g_{ab}$ can be decomposed as
$$g_{ab} = g^{(0)}_{ab} + \gamma_{ab}, \tag{4.1}$$
where, in the Gaussian null coordinates of Theorem 3.1, the components, $g^{(0)}_{\mu\nu}$, of $g^{(0)}_{ab}$ are independent of $u$, whereas the components, $\gamma_{\mu\nu}$, of $\gamma_{ab}$ and all of their derivatives with respect to $r$ vanish at $r = 0$ (i.e., on $\tilde N_i$). Furthermore, taking account of the periodicity of $\gamma_{\mu\nu}$ in $u$ (so that, in effect, the coordinates $(u, x^3, x^4)$ have a compact range of variation), we see that for all integers $j \geq 0$ we have throughout $O_i'$
$$|\gamma_{\mu\nu}| < C_j |r|^j \tag{4.2}$$
for some constants $C_j$. Similar relations hold for all partial derivatives of $\gamma_{\mu\nu}$.

It follows from Eq. (4.2) that there is an open neighborhood of $\tilde N_i$ in $O_i'$ on which $g^{(0)}_{ab}$ defines a Lorentz metric. It is obvious that in this neighborhood, $\tilde N_i$ is a Killing horizon of $g^{(0)}_{ab}$ with respect to the Killing field $k^a = (\partial/\partial u)^a$. Consequently, by the results of [15,16], we may extend an open neighborhood, $O_i''$, of $\tilde N_i$ in $O_i$ to a smooth spacetime $(O^*, g^{(0)*}_{ab})$, that possesses a bifurcate Killing horizon, $\tilde N^*$, with respect to $g^{(0)*}_{ab}$. Furthermore, with respect to the metric $g^{(0)*}_{ab}$, $\tilde N^*$ automatically satisfies all of the properties stated in the proposition. In addition, by Theorem 4.2 of [16], the extension can be chosen so that $k^a$ extends to a Killing field, $k^{*a}$, of $g^{(0)*}_{ab}$ in $O^*$, and $(O^*, g^{(0)*}_{ab})$ possesses a “wedge reflection” isometry (see [16]); we assume that such a choice of extension has been made.

Let $(u_0, r_0, x_0^3, x_0^4)$ denote the Gaussian null coordinates in $O_i''$ associated with $g^{(0)}_{ab}$, such that on $\tilde N_i$ we have $r_0 = r = 0$, $u_0 = u$, $x_0^3 = x^3$, $x_0^4 = x^4$. Since $\gamma_{ab}$ is smooth in $O_i''$ and is periodic in $u$, it follows that each of the coordinates $(u_0, r_0, x_0^3, x_0^4)$ are smooth functions of $(u, r, x^3, x^4)$ which are periodic in $u$. It further follows that the Jacobian matrix of the transformation between $(u_0, r_0, x_0^3, x_0^4)$ and $(u, r, x^3, x^4)$ is uniformly bounded in $O_i''$, and that, in addition, there exists a constant, $c$, such that $|r| \leq c|r_0|$ in $O_i''$. Consequently, the components, $\gamma_{\mu_0\nu_0}$, of $\gamma_{ab}$ in the Gaussian null coordinates associated with $g^{(0)}_{ab}$ satisfy for all integers $j \geq 0$,
$$|\gamma_{\mu_0\nu_0}| < C_j' |r_0|^j. \tag{4.3}$$
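The bounds (4.2) and (4.3) rest on a standard Taylor-remainder argument, which can be sketched as follows. Here $h$ stands for any one component $\gamma_{\mu\nu}$ regarded as a function of $r$ at fixed $(u, x^3, x^4)$, and $K$ denotes a compact set of coordinate values (one period in $u$, a closed $r$-interval about $0$ inside the coordinate range, and a compact range of $x^A$). Both names are introduced here for illustration and are assumptions of this sketch, not notation of the original argument.

```latex
% Sketch: vanishing of all r-derivatives at r = 0 gives decay faster than any power of r.
% Assumption: h(r) = \gamma_{\mu\nu}(u, r, x^3, x^4) is smooth and h^{(n)}(0) = 0 for all n >= 0.
\begin{align*}
h(r) &= \sum_{n=0}^{j-1} \frac{h^{(n)}(0)}{n!}\, r^n
        + \frac{r^j}{(j-1)!} \int_0^1 (1-s)^{j-1}\, h^{(j)}(sr)\, ds \\
     &= \frac{r^j}{(j-1)!} \int_0^1 (1-s)^{j-1}\, h^{(j)}(sr)\, ds ,
        \qquad j \ge 1, \\
|h(r)| &\le \frac{|r|^j}{j!}\, \sup_{K} \bigl| \partial_r^j \gamma_{\mu\nu} \bigr| .
\end{align*}
```

Any constant $C_j$ strictly larger than $\sup_K |\partial_r^j \gamma_{\mu\nu}| / j!$ then gives a bound of the form (4.2) away from $r = 0$; the supremum is finite because of the periodicity in $u$, and the case $j = 0$ is simply boundedness on $K$. The same reasoning applied to the partial derivatives of $\gamma_{\mu\nu}$ gives the corresponding bounds for them.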
Let $(U, V)$ denote the generalized Kruskal coordinates with respect to $g^{(0)}_{ab}$ introduced in [15,16]. In terms of these coordinates, $O_i''$ corresponds to the portion of $O^*$ satisfying $U > 0$ and the wedge reflection isometry mentioned above is given by $U \to -U$, $V \to -V$. The null hypersurfaces $N_1^*$ and $N_2^*$ that comprise the bifurcate Killing horizon, $\tilde N^*$, of $g^{(0)*}_{ab}$ correspond to the hypersurfaces defined by $V = 0$ and $U = 0$, respectively. It follows from Eq. (23) of [15] that within $O_i''$ we have
$$|r_0| < C|UV| \tag{4.4}$$
for some constant $C$. Hence, we obtain for all $j$,
$$|\gamma_{\mu_0\nu_0}| < C_j'' |UV|^j \tag{4.5}$$
with similar relations holding for all of the derivatives of $\gamma_{\mu_0\nu_0}$ with respect to the coordinates $(u_0, r_0, x_0^3, x_0^4)$. Taking account of the transformation between the Gaussian null coordinates and the generalized Kruskal coordinates (see Eqs. (24) and (25) of [15]), we see that the Kruskal components of $\gamma_{ab}$ and all of their Kruskal coordinate derivatives also go to zero uniformly on compact subsets of $(V, x^3, x^4)$ in the limit as $U \to 0$. It follows that the tensor field $\gamma_{ab}$ on $O_i''$ extends smoothly to $U = 0$ – i.e., the null hypersurface $N_2^*$ of $O^*$ – such that $\gamma_{ab}$ and all of its derivatives vanish on $N_2^*$. We now further extend $\gamma_{ab}$ to the region $U < 0$ – thereby defining a smooth tensor field $\gamma^*_{ab}$ on all of $O^*$ – by requiring it to be invariant under the above wedge reflection isometry. In $O^*$, we define
$$g^*_{ab} = g^{(0)*}_{ab} + \gamma^*_{ab}. \tag{4.6}$$
Then $g^*_{ab}$ is smooth in $O^*$ and is invariant under the wedge reflection isometry. Furthermore, since $\gamma^*_{ab}$ vanishes on $\tilde N^*$, it follows that $\tilde N^*$ is a bifurcate null surface with respect to $g^*_{ab}$ and that $k^{*a}$ is normal to $\tilde N^*$. In addition, on $\tilde N^*$ we have $\mathcal{L}_{k^*} g^{(0)*}_{ab} = 0$ (since $k^{*a}$ is a Killing field of $g^{(0)*}_{ab}$) and $\mathcal{L}_{k^*} \gamma^*_{ab} = 0$ (since $\gamma^*_{ab}$ and its derivatives vanish on $\tilde N^*$). Therefore, we have $\mathcal{L}_{k^*} g^*_{ab} = 0$ on $\tilde N^*$. By Lemma B.1, it follows that the expansion and shear of both $N_1^*$ and $N_2^*$ vanish.

Finally, by a similar construction (using the fact that $F_{uA}|_{\tilde N_i} = 0$; see Eq. (3.1)), we can extend the Maxwell field $F_{ab}$ in $O_i''$ to a smooth Maxwell field $F^*_{ab}$ in $O^*$ which is invariant under the wedge reflection isometry. By hypothesis, $(g^*_{ab}, F^*_{ab})$ satisfies the Einstein-Maxwell equations in the region $U > 0$. By invariance under the wedge reflection isometry, $(g^*_{ab}, F^*_{ab})$ also satisfies the Einstein-Maxwell equations in the region $U < 0$. By continuity, the Einstein-Maxwell equations also are satisfied for $U = 0$, so $(g^*_{ab}, F^*_{ab})$ is a solution throughout $O^*$. $\square$
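To see the extension of Proposition 4.1 at work in the simplest possible setting, one may consider a two-dimensional toy model in which the $u$-independent part of the metric is flat and $f = -2\kappa_\circ$ holds exactly. The toy metric and the explicit functions $U$, $V$ below are assumptions made for illustration only; the generalized Kruskal coordinates of [15,16] are constructed differently in general, and only the qualitative features are meant to carry over.

```latex
% Toy model (an assumption of this sketch): a two-dimensional metric of the form (A.1)
% with f = -2\kappa_\circ, h_A = 0, and the x^A directions dropped; \kappa_\circ > 0.
\begin{align*}
ds^2 &= -2\kappa_\circ\, r\, du^2 + 2\, dr\, du , \qquad -\infty < u < \infty . \\
\intertext{Define $U = e^{\kappa_\circ u}$ and $V = r\, e^{-\kappa_\circ u}$, so that}
dU &= \kappa_\circ U\, du , \qquad dV = e^{-\kappa_\circ u}\, (dr - \kappa_\circ r\, du) , \\
dU\, dV &= \kappa_\circ\, (dr\, du - \kappa_\circ r\, du^2) , \\
ds^2 &= \frac{2}{\kappa_\circ}\, dU\, dV , \qquad UV = r , \\
\left( \frac{\partial}{\partial u} \right)^a
     &= \kappa_\circ \left[ U \left( \frac{\partial}{\partial U} \right)^a
        - V \left( \frac{\partial}{\partial V} \right)^a \right] .
\end{align*}
```

In this model the original chart covers only $U > 0$, the horizon $r = 0$ is the half-line $V = 0$, $U > 0$, and the metric extends smoothly to all values of $(U, V)$. The extended vector field is the boost Killing field, which vanishes at $U = V = 0$, and the map $U \to -U$, $V \to -V$ is an isometry; these play the roles of $k^{*a}$, the surface $S$, and the wedge reflection in the proof. On $V = 0$ one finds $k^a \nabla_a k^b = \kappa_\circ^2 U (\partial/\partial U)^b = \kappa_\circ k^b$, in agreement with Eq. (3.3). The relation $UV = r$ is the model counterpart of the bound (4.4), which is why a perturbation that is $O(|r|^j)$ for every $j$ is flat at $U = 0$.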
Remark 4.1. By Remark 3.1, the hypothesis that $\kappa_\circ > 0$ is equivalent to the condition that the null geodesic generators of $\tilde N_i$ are past incomplete. Therefore, it is clear that Proposition 4.1 also holds for $\kappa_\circ < 0$ if we interchange futures and pasts. However, no analog of Proposition 4.1 holds for the “degenerate case” $\kappa_\circ = 0$.

We are now prepared to state and prove our main theorem:

Theorem 4.1. Let $(M, g_{ab})$ be a smooth electrovac spacetime of class B for which the generators of the null hypersurface $N$ are past incomplete. Then there exists an open neighborhood, $V$, of $N$ such that in $J^+[N] \cap V$ there exists a smooth Killing vector field $k^a$ which is normal to $N$. Furthermore, in $J^+[N] \cap V$ the electromagnetic field, $F_{ab}$, satisfies $\mathcal{L}_k F_{ab} = 0$.
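The link between the sign of $\kappa_\circ$ and incompleteness, on which the hypothesis of Theorem 4.1 rests, can be made explicit by solving Eq. (3.3) for an affine parameter. In the sketch below, $\lambda$ denotes an affine parameter along a generator and $\ell^a$ the corresponding affinely parametrized tangent; both symbols are introduced here for illustration and do not appear in the original text.

```latex
% Sketch: affine parameter along a generator on which k^a \nabla_a k^b = \kappa_\circ k^b.
% Assumptions: \ell^a \nabla_a \ell^b = 0 and k^a = (d\lambda/du)\, \ell^a with d\lambda/du > 0.
\begin{align*}
k^a \nabla_a k^b
  &= \frac{d\lambda}{du}\, \ell^a \nabla_a \!\left( \frac{d\lambda}{du}\, \ell^b \right)
   = \frac{d^2\lambda}{du^2}\, \ell^b
   = \frac{d^2\lambda / du^2}{d\lambda / du}\, k^b , \\
\frac{d^2\lambda}{du^2} &= \kappa_\circ\, \frac{d\lambda}{du}
  \quad \Longrightarrow \quad
  \lambda(u) = \frac{e^{\kappa_\circ u}}{\kappa_\circ} \quad (\kappa_\circ \neq 0) ,
  \qquad
  \lambda(u) = u \quad (\kappa_\circ = 0) ,
\end{align*}
```

where the solution is fixed up to an affine transformation $\lambda \to a\lambda + b$ with $a > 0$. Since $u$ ranges over all of $\mathbb{R}$ along each generator (Theorem 3.1 (i)), for $\kappa_\circ > 0$ the affine parameter tends to the finite value $0$ as $u \to -\infty$ and to $+\infty$ as $u \to +\infty$, so the generators are past incomplete and future complete. For $\kappa_\circ < 0$ the roles are reversed, and for $\kappa_\circ = 0$ the affine parameter covers the whole real line.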
Proof. As explained in Sect. 3, we can cover $N$ by a finite number of fibered neighborhoods, $N_i$. Let $O_i$ denote the elementary spacetime region obtained by “unwrapping” a neighborhood of $N_i$, as explained in Sect. 3. By Remark 3.1, the past incompleteness of the null geodesic generators of $N$ implies that $\kappa_\circ > 0$, so Proposition 4.1 holds. We now apply Proposition B.1 to the extended spacetime $O^*$ to obtain existence of a Killing vector field in the domain of dependence of $\tilde N^* = N_1^* \cup N_2^*$. By restriction to $O_i$, we thereby obtain a Killing field $K^a$ (which also Lie derives the Maxwell field) on a one-sided neighborhood of $\tilde N_i$ of the form $J^+[\tilde N_i] \cap \tilde V_i$, where $\tilde V_i$ is an open neighborhood of $\tilde N_i$. Both $K^a$ and $k^a = (\partial/\partial u)^a$ are tangent to the null geodesic generators of $\tilde N_i$, so on $\tilde N_i$ we clearly have $K^a = \varphi k^a$ for some function $\varphi$. Furthermore, on $\tilde N_i$, we have $\mathcal{L}_K g_{ab} = 0$ (since $K^a$ is a Killing field) and $\mathcal{L}_k g_{ab} = 0$ (as noted in the proof of Proposition 4.1). It follows immediately that $\nabla_a \varphi = 0$ on $\tilde N_i$, so we may rescale $K^a$ so that $K^a = k^a$ on $\tilde N_i$. Since the construction of $k^a$ off of $\tilde N_i$ (as described in Appendix A) is identical to that which must be satisfied by a Killing field (as described in Remark B.1 below), it follows that $K^a = k^a$ within their common domain of definition. Thus, the vector field $k^a = (\partial/\partial u)^a$ – which previously had been shown to be a Killing field in the analytic case – also is a Killing field in the smooth case in $J^+[\tilde N_i] \cap \tilde V_i$. By exactly the same arguments as given in [11,12] (see Remark 3.2 above) it then follows that the map $\psi_i = \hat\psi_i \circ \tilde\psi_i$ projects $k^a = (\partial/\partial u)^a$ to a well-defined Killing field in a one-sided neighborhood of $N_i$, and that the local Killing fields obtained for each fibered neighborhood can be patched together to produce a global Killing field on a one-sided neighborhood of $N$ of the form $J^+[N] \cap V$, where $V = \bigcup_i \psi_i[\tilde V_i]$. $\square$

In view of Proposition 3.1, we have the following corollary.

Corollary 4.1. Let $(M, g_{ab})$ be a smooth electrovac spacetime of class A for which the generators of the event horizon $N$ are past incomplete. Then there exists an open neighborhood, $V$, of $N$ such that in $J^+[N] \cap V$ there exists a smooth Killing vector field $k^a$ which is normal to $N$. Furthermore, in $J^+[N] \cap V$ the electromagnetic field, $F_{ab}$, satisfies $\mathcal{L}_k F_{ab} = 0$.

A. Gaussian Null Coordinate Systems

In this appendix the construction of a local Gaussian null coordinate system will be recalled. Let $(M, g_{ab})$ be a spacetime, let $N$ be a smooth null hypersurface, and let $\varsigma$ be a smooth spacelike 2-surface lying in $N$. Let $x^3, x^4$ be coordinates on an open subset $\tilde\varsigma$ of $\varsigma$. On a neighborhood of $\tilde\varsigma$ in $N$, let $k^a$ be a smooth, non-vanishing normal vector field to $N$, so that the integral curves of $k^a$ are the null geodesic generators of $N$. Without loss of generality, we may assume that $k^a$ is future directed. On a sufficiently small open neighborhood, $S$, of $\tilde\varsigma \times \{0\}$ in $\tilde\varsigma \times \mathbb{R}$, let $\psi : S \to N$ be the map which takes $(q, u)$ into the point of $N$ lying at parameter value $u$ along the integral curve of $k^a$ starting at $q$. Then, $\psi$ is $C^\infty$, and it follows from the inverse function theorem that $\psi$ is 1 : 1 and onto from an open neighborhood of $\tilde\varsigma \times \{0\}$ onto an open neighborhood, $\tilde N$, of $\tilde\varsigma$ in $N$. Extend the functions $x^3, x^4$ from $\tilde\varsigma$ to $\tilde N$ by keeping their values constant along the integral curves of $k^a$. Then $u, x^3, x^4$ are coordinates on $\tilde N$.

At each $p \in \tilde N$ let $l^a$ be the unique null vector field on $\tilde N$ satisfying $l^a k_a = 1$ and $l^a X_a = 0$ for all $X^a$ which are tangent to $\tilde N$ and satisfy $X^a \nabla_a u = 0$. On a sufficiently
small open neighborhood, $Q$, of $\tilde N \times \{0\}$ in $\tilde N \times \mathbb{R}$, let $\Psi : Q \to M$ be the map which takes $(p, r) \in Q$ into the point of $M$ lying at affine parameter value $r$ along the null geodesic starting at $p$ with tangent $l^a$. Then $\Psi$ is $C^\infty$ and it follows from the inverse function theorem that $\Psi$ is 1 : 1 and onto from an open neighborhood of $\tilde N \times \{0\}$ onto an open neighborhood, $O$, of $\tilde N$ in $M$. We extend the functions $u, x^3, x^4$ from $\tilde N$ to $O$ by requiring their values to be constant along each null geodesic determined by $l^a$. Then $(u, r, x^3, x^4)$ yields a coordinate system on $O$ which will be referred to as a Gaussian null coordinate system. Note that on $\tilde N$ we have $k^a = (\partial/\partial u)^a$.

Since by construction the vector field $l^a = (\partial/\partial r)^a$ is everywhere tangent to null geodesics we have that $g_{rr} = 0$ throughout $O$. Furthermore, we have that the metric functions $g_{ru}, g_{r3}, g_{r4}$ are independent of $r$, i.e. $g_{ru} = 1$, $g_{r3} = g_{r4} = 0$ throughout $O$. In addition, as a direct consequence of the above construction, $g_{uu}$ and $g_{uA}$ vanish on $\tilde N$. Hence, within $O$, there exist smooth functions $f$ and $h_A$, with $f|_{\tilde N} = (\partial g_{uu}/\partial r)|_{r=0}$ and $h_A|_{\tilde N} = (\partial g_{uA}/\partial r)|_{r=0}$, so that the spacetime metric in $O$ takes the form
$$ds^2 = r \cdot f\, du^2 + 2\, dr\, du + 2r \cdot h_A\, du\, dx^A + g_{AB}\, dx^A dx^B, \tag{A.1}$$
where $g_{AB}$ are smooth functions of $u, r, x^3, x^4$ in $O$ such that $g_{AB}$ is a negative definite $2 \times 2$ matrix, and the uppercase Latin indices take the values 3, 4.

B. The Existence of a Killing Field Tangent to the Horizon

The purpose of this section is to prove the following fact.

Proposition B.1. Suppose that $(M, g, F)$ is a time oriented solution to the Einstein-Maxwell equations with Maxwell field $F$ (without sources). Let $N_1, N_2$ be smooth null hypersurfaces with connected space-like boundary $Z$, smoothly embedded in $M$, which are generated by the future directed null geodesics orthogonal to $Z$. Assume that $N_1 \cup N_2$ is achronal. Then there exists on the future domain of dependence, $D^+$, of $N_1 \cup N_2$ a non-trivial Killing field $K$ which is tangent to the null generators of $N_1$ and $N_2$ if and only if these null hypersurfaces are expansion and shear free. If it exists, the Killing field is unique up to a constant factor and we have $\mathcal{L}_K F = 0$.

Remark B.1. We shall prove Proposition B.1 by first deducing the form that $K$ must have on $N_1 \cup N_2$, then defining $K$ on $D^+$ by evolution from $N_1 \cup N_2$ of a wave equation that must be satisfied by a Killing field, and, finally, proving that the resulting $K$ is indeed a Killing field. An alternative, more geometric, approach to the proof of Proposition B.1 would be to proceed as follows. Suppose that the Killing field $K$ of Proposition B.1 exists. Denote by $\psi_t$ the local 1-parameter group of isometries associated with $K$. Since $\psi_t$ maps $N_1 \cup N_2$ into itself, geodesics into geodesics, and preserves affine parameterization, we can describe for given $t$ the action of $\psi_t$ on a neighborhood of $N_1 \cup N_2$ in $D^+$ in terms of its action on geodesics passing through $N_1 \cup N_2$ into $D^+$ and affine parameters which vanish on $N_1 \cup N_2$. There are various possibilities: one could employ, e.g., the future directed time-like geodesics starting on $Z$ or the null geodesics which generate double null coordinates adapted to $N_1 \cup N_2$.
Changing the point of view, one could try to use such a description to define maps ψt , show that they define a local group of isometries, and then define K as the corresponding Killing field. To show that ψt∗ g = g, one would prove this relation on N1 ∪ N2 and then invoke the uniqueness for the characteristic initial value problem for the Einstein-Maxwell equations to show that this relation holds also in a neighborhood of N1 ∪ N2 in D + . For this to work we would
H. Friedrich, I. Rácz, R. M. Wald
need to show that $\psi_t^* g$ has a certain smoothness ($C^2$ say). This is not so difficult away from $N_1 \cup N_2$ but it is delicate near the initial hypersurface. The discussion would need to take into account the properties of the underlying space-time exhibited in Lemmas B.1, B.2 below and would become quite tedious. For this reason we have chosen not to proceed in this manner.

It will be convenient to use the formalism, notation, and conventions of [30] in a gauge adapted to our geometrical situation. Since we will be using the tetrad formalism, throughout this Appendix we shall omit all abstract indices $a, b, c, \ldots$ on tensors and use the indices $i, j, k, \ldots$ to denote the components of tensors in our tetrad.

We begin by choosing smooth coordinates $u = x^1$, $r = x^2$, $x^A$, $A = 3, 4$, and a smooth tetrad field $z_1 = l$, $z_2 = n$, $z_3 = m$, $z_4 = \bar m$, with $g_{ik} = g(z_i, z_k)$ such that $g_{12} = g_{21} = 1$, $g_{34} = g_{43} = -1$ are the only non-vanishing scalar products. Let $x^A$ be coordinates on a connected open subset $\zeta$ of $Z$ on which also $m, \bar m$ can be introduced such that they are tangent to $\zeta$. On $\zeta$ we set $x^1 = 0$, $x^2 = 0$ and assume that $l$ is tangent to $N_1$, $n$ is tangent to $N_2$, and both are future directed. The possible choices which can be made above will represent the remaining freedom in our gauge. We assume

$$\nabla_{z_2} z_2 = 0, \qquad \langle z_2, dx^\mu \rangle = \delta^\mu{}_1, \qquad \text{on } N_2',$$

and set $\zeta_c = \{x^1 = c\} \subset N_2'$ for $c \geq 0$, where $N_2'$ denotes the subset of $N_2$ generated by the null geodesics starting on $\zeta$. We assume that $m, \bar m$ are tangent to $\zeta_c$. From the transformation law of the spin coefficient $\gamma$ under rotations $m \to e^{i\phi} m$ we find that we can always assume that $\gamma = \bar\gamma$ on $N_2'$. With this assumption $m$, whence also $l$, will be fixed uniquely on $N_2'$ and we have

$$\gamma = 0, \qquad \nu = 0 \qquad \text{on } N_2'.$$

The coordinates and the frame are extended off $N_2'$ such that

$$\nabla_{z_1} z_i = 0, \qquad \langle z_1, dx^\mu \rangle = \delta^\mu{}_2 .$$

On a certain neighborhood, $D$, of $N_1' \cup N_2'$ in $D^+$ (where $N_1'$ is the subset of $N_1$ generated by the null geodesics starting on $\zeta$), we obtain by this procedure a smooth coordinate system and a smooth frame field which has in these coordinates the local expression

$$l_\mu = \delta^1{}_\mu, \qquad l^\mu = \delta^\mu{}_2, \qquad n^\mu = \delta^\mu{}_1 + U\, \delta^\mu{}_2 + X^A\, \delta^\mu{}_A, \qquad m^\mu = \omega\, \delta^\mu{}_2 + \xi^A\, \delta^\mu{}_A .$$

We have $N_1' = \{x^1 = u = 0\}$, $N_2' = \{x^2 = r = 0\}$ and

$$\kappa = 0, \quad \epsilon = 0, \quad \pi = 0, \quad \tau = \bar\alpha + \beta \quad \text{on } D, \qquad U = 0, \quad X^A = 0, \quad \omega = 0 \quad \text{on } N_2'.$$

We shall use alternatively the Ricci rotation coefficients defined by $\nabla_i z_j \equiv \nabla_{z_i} z_j = \gamma_j{}^l{}_i\, z_l$ or their representation in terms of spin coefficients as given in [30]. The gauge above will be used in many local considerations whose results extend immediately to all of $N_1$ and $N_2$. We shall then always state the extended result.

We begin by showing the necessity of the conditions on the null hypersurfaces in Proposition B.1 and some of their consequences.
Lemma B.1. Let $N$ be a smooth null hypersurface of the space-time $(M, g)$ and $X$ a smooth vector field on $M$ which is tangent to the null generators of $N$ and does not vanish there. If $g'$ denotes the pull back of $g$ to $N$, then $\mathcal{L}_X g' = 0$ on $N$ if and only if the null generators of $N$ are expansion and shear free.

Proof. We can assume that $N$ coincides with the hypersurface $N_1$. Then we have in our gauge $X = X^1 z_1$ on $N$ with $X^1 \neq 0$, and $\mathcal{L}_X g' = 0$ translates into

$$0 = \nabla_{(i} X_{j)} = z_{(i}(X^1)\, \delta^2{}_{j)} - \gamma_{(j}{}^2{}_{i)}\, X^1$$

on $N$ with $i, j \neq 2$. By our gauge this is equivalent to

$$0 = \gamma_{(A}{}^2{}_{B)} = -\sigma\, \delta^3{}_A \delta^3{}_B - \mathrm{Re}\,\rho\; \delta^3{}_{(A} \delta^4{}_{B)} - \bar\sigma\, \delta^4{}_A \delta^4{}_B .$$

Since $\rho$ is real on the hypersurface $N$, the assertion follows.

Lemma B.2. If the null hypersurfaces $N_1$, $N_2$ are expansion and shear free, then the frame coefficients, the spin coefficients, the components $\Psi_i$ of the conformal Weyl spinor field, the components $\phi_k$ of the Maxwell spinor field, and the components $\Phi_{ik} = k\, \phi_i \bar\phi_k$ of the Ricci spinor field are uniquely determined in our gauge on $N_1'$ and $N_2'$ by the field equations and the data

$$\phi_1, \qquad \tau, \qquad \xi^A, \ A = 3, 4, \qquad \text{on } Z. \tag{B.1}$$

In particular, we have

$$\Psi_0 = 0, \quad \Psi_1 = 0, \quad \phi_0 = 0, \quad \Phi_{0k} = \Phi_{k0} = 0, \quad D\phi_1 = 0, \quad D\phi_2 = \bar\delta \phi_1, \quad \phi_2 = r\, \bar\delta \phi_1, \quad \omega = -r\, \tau, \quad \mu = r\, \Psi_2, \quad \text{on } N_1', \tag{B.2}$$

$$\Psi_4 = 0, \quad \Psi_3 = 0, \quad \phi_2 = 0, \quad \Phi_{i2} = \Phi_{2i} = 0, \quad \Delta\phi_1 = 0, \quad \Delta\phi_0 = \delta\phi_1 - 2\tau\phi_1, \quad \phi_0 = u\, (\delta\phi_1 - 2\tau\phi_1), \quad \rho = u\, (\bar\delta\tau - 2\alpha\tau - \Psi_2) \quad \text{on } N_2'. \tag{B.3}$$

Proof. In our gauge the relations $\Psi_0 = 0$, $\Phi_{00} = k\, \phi_0 \bar\phi_0 = 0$ on $N_1'$ are an immediate consequence of the NP equations and our assumption that $\rho = 0$, $\sigma = 0$ on $N_1'$. Similarly, the assumptions $\mu = 0$, $\lambda = 0$ on $N_2'$ imply $\Psi_4 = 0$, $\Phi_{22} = k\, \phi_2 \bar\phi_2 = 0$ on $N_2'$. The relation $\Phi_{ij} = k\, \phi_i \bar\phi_j$ implies the other statements on the Ricci spinor on $N_1'$, $N_2'$ and it allows us to determine $\Phi_{11}$ on $Z$ from the data (B.1). The NP equations involving only the operators $\delta$, $\bar\delta$, the data (B.1), and our gauge conditions allow us to calculate the functions $\alpha$, $\beta$, $\Psi_1 = 0$, $\Psi_2$, $\Psi_3 = 0$ on $Z$. Then all metric coefficients, spin coefficients, and the Weyl, Ricci, and Maxwell spinor fields are known on $Z$. The remaining assertions follow by integrating in the appropriate order the NP equations (cf. also the appendix of [30]) involving the operator $D$ on $N_1'$ and the equations involving the operator $\Delta$ on $N_2'$.

Lemma B.3. A Killing field $K$ as considered in Proposition B.1 satisfies, up to a constant factor,

$$K = r\, z_1 \ \text{ on } N_1, \qquad K = -u\, z_2 \ \text{ on } N_2 .$$
Proof. We note, first, that the statement above is reasonable, because the vector fields $z_1$, $z_2$ can be defined globally on $N_1$ and $N_2$ respectively and the form of $K$ given above is preserved under rescalings consistent with our gauge freedom on $Z$. Writing $K = K^i z_i$, we have by our assumptions $K = K^1 z_1$ on $N_1$, $K = K^2 z_2$ on $N_2$ for some smooth functions $K^1$, $K^2$ which vanish on $Z$. To determine their explicit form we use, in addition to the Killing equation

$$\mathcal{L}_K g_{ij} = \nabla_i K_j + \nabla_j K_i = 0, \tag{B.4}$$

the identity

$$\nabla_i \nabla_j K_l + K_m R^m{}_{ilj} = \frac{1}{2} \left\{ \nabla_i (\mathcal{L}_K g_{lj}) + \nabla_j (\mathcal{L}_K g_{li}) - \nabla_l (\mathcal{L}_K g_{ij}) \right\}, \tag{B.5}$$

which holds for arbitrary smooth vector field $K$ and metric $g$ and which implies, together with Eq. (B.4), the integrability condition

$$\nabla_i \nabla_j K_l + K_m R^m{}_{ilj} = 0. \tag{B.6}$$

The restriction of Eq. (B.4) to $\zeta$ gives $\nabla_i K_j = 2 h\, \delta^1{}_{[i} \delta^2{}_{j]}$ with $h = z_1(K^1) = -z_2(K^2)$. Using this expression to evaluate Eq. (B.6) on $\zeta$ for $i = A = 3, 4$, and observing that $\zeta$ is connected we get $z_A(h) = 0$, whence $h = \mathrm{const.}$ on $\zeta$. Since $Z$ is connected the same expression for $\nabla_i K_j$ will be obtained everywhere on $Z$ with the same constant $h$. If $h$ were zero, $K$ would vanish identically by Eqs. (B.4) and (B.6). Since $K$ is assumed to be non-trivial we have $h \neq 0$ and can rescale $K$ to achieve $h = 1$. Equations (B.4), (B.6) imply in our gauge

$$z_1(K_2) = -\nabla_2 K_1, \qquad z_1(\nabla_2 K_1) = 0 \quad \text{on } N_1',$$
$$z_2(K_1) = -\nabla_1 K_2, \qquad z_2(\nabla_1 K_2) = 0 \quad \text{on } N_2',$$

which, together with the value of $\nabla_i K_j$, $i, j = 1, 2$, on $Z$ entail our assertion.

Taking into account $\rho = 0$, $\sigma = 0$ on $N_1$, $\mu = 0$, $\lambda = 0$ on $N_2$, and in particular Eq. (B.2), we immediately get the following.

Lemma B.4. By calculations which involve only inner derivatives on the respective null hypersurface one obtains from Eq. (B.3),

$$\nabla_i K_j = \delta^1{}_i\, \delta^2{}_j, \qquad i \neq 2, \quad \text{on } N_1, \tag{B.7}$$

$$\nabla_i K_j = -\delta^2{}_i\, \delta^1{}_j + u\, \tau\, \delta^3{}_i\, \delta^1{}_j + u\, \bar\tau\, \delta^4{}_i\, \delta^1{}_j, \qquad i \neq 1, \quad \text{on } N_2. \tag{B.8}$$
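As an illustration of Lemmas B.3 and B.4, the following sketch works out the simplest case, which is not taken from the paper: flat Minkowski space in double null coordinates, with the null hyperplanes $u = 0$ and $r = 0$ playing the roles of $N_1$ and $N_2$. The choice of spacetime, the transverse coordinates $x, y$, and the explicit tetrad are assumptions made only for this example. In this case the Killing field is the familiar boost and all spin coefficients (in particular $\tau$) vanish.

```latex
% Assumed example (not from the paper): Minkowski space, signature (+,-,-,-).
\begin{align*}
ds^2 &= 2\,du\,dr - dx^2 - dy^2, \qquad
N_1 = \{u = 0\}, \quad N_2 = \{r = 0\}, \quad Z = \{u = r = 0\},\\
z_1 &= l = \partial_r, \qquad z_2 = n = \partial_u, \qquad
z_3 = m = \tfrac{1}{\sqrt{2}}\,(\partial_x + i\,\partial_y), \qquad z_4 = \bar m,\\
g(l,n) &= 1, \qquad g(m,\bar m) = -1 .
\end{align*}
% The boost Killing field and its restriction to the two null hyperplanes.
\begin{align*}
K &= r\,\partial_r - u\,\partial_u, \qquad
\mathcal{L}_K(2\,du\,dr) = 2\,d(-u)\,dr + 2\,du\,d(r) = 0,\\
K\big|_{N_1} &= r\,z_1, \qquad K\big|_{N_2} = -u\,z_2 .
\end{align*}
% Covariant components and first derivatives (flat coordinates, so nabla = partial).
\begin{align*}
K_\mu\,dx^\mu &= r\,du - u\,dr, \\
\nabla_1 K_2 &= l^\mu n^\nu \partial_\mu K_\nu = \partial_r(r) = 1, \qquad
\nabla_2 K_1 = n^\mu l^\nu \partial_\mu K_\nu = \partial_u(-u) = -1,\\
\nabla_i K_j &= \delta^1{}_i\,\delta^2{}_j - \delta^2{}_i\,\delta^1{}_j
 = 2h\,\delta^1{}_{[i}\,\delta^2{}_{j]}, \qquad h = z_1(K^1) = -z_2(K^2) = 1 .
\end{align*}
```

Restricting the last line to $i \neq 2$ reproduces (B.7), and restricting it to $i \neq 1$ reproduces (B.8) with $\tau = 0$, so the normalization $h = 1$ chosen in the proof of Lemma B.3 corresponds to the unit boost in this flat model.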
Equation (B.6) implies the hyperbolic system

$$\nabla_i \nabla^i K_l - K_m R^m{}_l = 0, \tag{B.9}$$

and the initial data for the Killing field we wish to construct are given by Lemma B.3. Both have an invariant meaning.

Lemma B.5. There exists a unique smooth solution, $K$, of Eq. (B.9) on $D^+$ which takes on $N_1 \cup N_2$ the values given in Lemma B.3.
Proof. The uniqueness of the solution is an immediate consequence of standard energy estimates. The results in [31] or [32] entail the existence of a unique smooth solution of Eq. (B.9) for the data given in Lemma B.3 on an open neighborhood of $\zeta$ in $D \cap D^+(N_1' \cup N_2')$. These local solutions can be patched together to yield a solution in some neighborhood of $Z$ in $D$. Because of the linearity of Eq. (B.9) this solution can be extended (e.g. by a patching procedure) to all of $D^+$.

Lemma B.6. The vector field $K$ of Lemma B.5 satisfies $\mathcal{L}_K g = 0$ and $\mathcal{L}_K F = 0$ on $D^+$.

Proof. The equations above need to be deduced from the structure of the data in Lemma B.3 and from Eq. (B.9). Applying $\nabla_j$ to (B.9) and commuting derivatives we get

$$\nabla_i \nabla^i (\mathcal{L}_K g_{jl}) = 2\, \mathcal{L}_K R_{jl} + 2\, R^i{}_{jl}{}^k\, (\mathcal{L}_K g_{ik}) - 2\, R^i{}_{(j}\, (\mathcal{L}_K g_{l)i}). \tag{B.10}$$

The Einstein equations give

$$\mathcal{L}_K R_{ij} = k' \left\{ 2\, F_{(i}{}^l\, (\mathcal{L}_K F_{j)l}) - \frac{1}{2}\, g_{ij}\, (\mathcal{L}_K F_{kl})\, F^{kl} - (\mathcal{L}_K g_{kl})\, F_i{}^k F_j{}^l - \frac{1}{4}\, (\mathcal{L}_K g_{ij})\, F_{kl} F^{kl} + \frac{1}{2}\, g_{ij}\, (\mathcal{L}_K g_{kl})\, F^{km} F^l{}_m \right\}. \tag{B.11}$$

The identity $d\, \mathcal{L}_K F = d\, (i_K\, dF + d\, i_K F)$ together with Maxwell's equations implies

$$\nabla_{[i} (\mathcal{L}_K F_{jl]}) = 0. \tag{B.12}$$

Applying $\mathcal{L}_K$ to the second part of Maxwell's equations and using the identity (B.5) as well as the fact that $K$ solves Eq. (B.9), we get

$$\nabla^i (\mathcal{L}_K F_{ik}) = F^{jl}\, \nabla_j (\mathcal{L}_K g_{lk}) + (\mathcal{L}_K g_{jl})\, \nabla^j F^l{}_k . \tag{B.13}$$

Substituting Eq. (B.11) in Eq. (B.10), we can view the system (B.10), (B.12), (B.13) as a homogeneous linear system for the unknowns $\mathcal{L}_K g$, $\mathcal{L}_K F$. This system implies a linear symmetric hyperbolic system for the unknowns $\mathcal{L}_K g$, $\nabla \mathcal{L}_K g$, $\mathcal{L}_K F$ (cf. [33]). We shall show now that these unknowns vanish on $N_1 \cup N_2$. The standard energy estimates for symmetric hyperbolic systems then imply that the fields vanish in fact on $D^+$, which will prove our lemma and thus Proposition B.1.

Equation (B.9) restricted to $N_1'$ reads

$$0 = \nabla_i \nabla^i K_l - K_m R^m{}_l = 2\, (\nabla_1 \nabla_2 K_l - \nabla_3 \nabla_4 K_l) - K_m\, (R^m{}_l + R^m{}_{l21} - R^m{}_{l43}).$$

Using this equation together with Eqs. (B.2) and (B.7) and our gauge conditions, we obtain by a direct calculation a system of ODE's of the form

$$\nabla_1 (\nabla_{(i} K_{j)}) = H_{ij}(\nabla_{(k} K_{l)})$$

on the null generators of $N_1'$. Here $H_{ij}$ is a linear function of the indicated argument (suppressing the dependence on the points of $N_1'$). Since $\nabla_{(k} K_{l)} = 0$ on $Z$, we conclude that $\mathcal{L}_K g = 0$ on $N_1$. An analogous argument involving Eqs. (B.3) and (B.8) shows that $\mathcal{L}_K g = 0$ on $N_2$. It follows in particular that $\nabla \mathcal{L}_K g = 0$ on $Z$.
Writing $(\mathcal{L}_K F)_{AA'BB'} = \epsilon_{A'B'}\, p_{AB} + \epsilon_{AB}\, \bar p_{A'B'}$ and using (B.3) we find on $N_1'$ in NP notation

$$p_0 = r\, D\phi_0 + \phi_0, \qquad p_1 = r\, D\phi_1, \qquad p_2 = r\, D\phi_2 - \phi_2 .$$

It follows from Eq. (B.2) that $p_{AB}$, whence $\mathcal{L}_K F$, vanishes on $N_1$. An analogous argument involving Eq. (B.3) shows that $\mathcal{L}_K F$ vanishes on $N_2$.

Observing in Eqs. (B.11) and (B.10) that $\mathcal{L}_K F = 0$, $\mathcal{L}_K g = 0$, whence also $\nabla_l (\nabla_{(i} K_{j)}) = 0$ for $l \neq 2$ on $N_1'$, we obtain there

$$0 = \nabla_k \nabla^k (\nabla_{(i} K_{j)}) = 2\, \nabla_1 (\nabla_2 (\nabla_{(i} K_{j)})).$$

We conclude that $\nabla \mathcal{L}_K g$, which vanishes on $Z$, vanishes on $N_1$. In a similar way it follows that $\nabla \mathcal{L}_K g = 0$ on $N_2$. This completes the proof.

Acknowledgements. This research was supported in part by Monbusho Grant-in-aid No. 96369 and by NSF grant PHY 95-14726 to the University of Chicago. We wish to thank Piotr Chruściel and James Isenberg for reading the manuscript. One of us (IR) wishes to thank the Albert Einstein Institute and the Physics Department of the Tokyo Institute of Technology for their kind hospitality during part of the work on the subject of the present paper.
References

1. Hawking, S.W.: Black holes in general relativity. Commun. Math. Phys. 25, 152–166 (1972)
2. Hawking, S.W. and Ellis, G.F.R.: The large scale structure of space-time. Cambridge: Cambridge University Press, 1973
3. Chruściel, P.T.: On rigidity of analytic black holes. Commun. Math. Phys. 189, 1–7 (1997)
4. Chruściel, P.T.: Uniqueness of stationary, electro-vacuum black holes revisited. Helv. Phys. Acta 69, 529–552 (1996)
5. Israel, W.: Event horizons in static vacuum space-times. Phys. Rev. 164, 1776–1779 (1967)
6. Israel, W.: Event horizons in static electrovac space-times. Commun. Math. Phys. 8, 245–260 (1968)
7. Carter, B.: Axisymmetric black hole has only two degrees of freedom. Phys. Rev. Lett. 26, 331–333 (1971)
8. Carter, B.: Black hole equilibrium states. In: Black Holes, C. de Witt and B. de Witt (eds.), New York, London, Paris: Gordon and Breach, 1973
9. Mazur, P.O.: Proof of uniqueness of the Kerr–Newman black hole solutions. J. Phys. A: Math. Gen. 15, 3173–3180 (1982)
10. Bunting, G.L.: Proof of the uniqueness conjecture for black holes. Ph.D. Thesis, University of New England, Armidale (1987)
11. Moncrief, V. and Isenberg, J.: Symmetries of cosmological Cauchy horizons. Commun. Math. Phys. 89, 387–413 (1983)
12. Isenberg, J. and Moncrief, V.: Symmetries of cosmological Cauchy horizons with exceptional orbits. J. Math. Phys. 26, 1024–1027 (1985)
13. Penrose, R.: Singularities and time asymmetry. In: General relativity; An Einstein centenary survey, eds. S.W. Hawking, W. Israel, Cambridge: Cambridge University Press, 1979
14. Rácz, I.: On further generalization of the rigidity theorem for spacetimes with a stationary event horizon or a compact Cauchy horizon. In preparation
15. Rácz, I. and Wald, R.M.: Extension of spacetimes with Killing horizon. Class. Quant. Grav. 9, 2643–2656 (1992)
16. Rácz, I. and Wald, R.M.: Global extensions of spacetimes describing asymptotic final states of black holes. Class. Quant. Grav. 13, 539–553 (1996)
17. Wald, R.M.: General relativity. Chicago: University of Chicago Press, 1984
18. Chruściel, P.T. and Wald, R.M.: Maximal hypersurfaces in asymptotically flat spacetimes. Commun. Math. Phys. 163, 561–604 (1994)
19. Friedman, J.L., Schleich, K. and Witt, D.M.: Topological censorship. Phys. Rev. Lett. 71, 1486–1489 (1993)
20. Chruściel, P.T. and Wald, R.M.: On the topology of stationary black holes. Class. Quant. Grav. 11, L147–L152 (1994)
21. Galloway, G.J.: On the topology of the domain of outer communication. Class. Quant. Grav. 12, L99–L101 (1995)
22. Galloway, G.J.: A "finite infinity" version of the FSW topological censorship. Class. Quant. Grav. 13, 1471–1478 (1996)
23. Geroch, R.: A method for constructing solutions of Einstein's equations. J. Math. Phys. 12, 918–924 (1971)
24. Wald, R.M.: Quantum field theory on curved spacetimes. Chicago: University of Chicago Press, 1994
25. Miller, J.G.: Global analysis of the Kerr–Taub-NUT metric. J. Math. Phys. 14, 486–494 (1973)
26. Moncrief, V.: Infinite-dimensional family of vacuum cosmological models with Taub-NUT (Newman–Unti–Tamburino)-type extensions. Phys. Rev. D 23, 312–315 (1981)
27. Moncrief, V.: Neighborhoods of Cauchy horizons in cosmological spacetimes with one Killing field. Ann. of Phys. 141, 83–103 (1982)
28. Seifert, H.: Topologie dreidimensionaler gefaserter Räume. Acta Math. 60, 147–238 (1933); English translation in: H. Seifert, W. Threlfall: A Textbook of Topology, New York: Academic Press, 1980
29. Epstein, D.B.A.: Periodic flows on three-manifolds. Ann. Math. 95, 66–81 (1972)
30. Newman, E., Penrose, R.: An Approach to Gravitational Radiation by a Method of Spin Coefficients. J. Math. Phys. 3, 566–578 (1962), 4, 998 (1963)
31. Müller zum Hagen, H.: Characteristic initial value problem for hyperbolic systems of second order differential systems. Ann. Inst. Henri Poincaré 53, 159–216 (1990)
32. Rendall, A.D.: Reduction of the characteristic initial value problem to the Cauchy problem and its applications to the Einstein equations. Proc. R. Soc. Lond. A 427, 221–239 (1990)
33. Friedrich, H.: On the Global Existence and the Asymptotic Behaviour of Solutions to the Einstein–Maxwell–Yang–Mills Equations. J. Diff. Geom. 34, 275–345 (1991)

Communicated by H. Nicolai
Commun. Math. Phys. 204, 709 – 729 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Spacing Between Phase Shifts in a Simple Scattering Problem

Steve Zelditch$^1$, Maciej Zworski$^2$

$^1$ Department of Mathematics, The Johns Hopkins University, Baltimore, MD 21218, USA
$^2$ Mathematics Department, University of California, Berkeley, CA 94720, USA

Received: 15 August 1998 / Accepted: 15 February 1999
Abstract: We prove a scattering theoretical version of the Berry–Tabor conjecture: for almost every surface in a class of cylindrical surfaces of revolution, the large energy limit of the pair correlation measure of the quantum phase shifts is Poisson, that is, it is given by the uniform measure.

1. Introduction and Statement of the Result

The Berry–Tabor conjecture [2] for quantum integrable systems with discrete spectra asserts that the spacings between normalized eigenvalues of a quantum integrable system should exhibit Poisson statistics in the semi-classical limit. In particular, when the eigenvalues are scaled to have unit mean level spacing, the distribution of their differences should be uniform. This conjecture has been verified numerically in many cases [4] and has been rigorously proved in an almost everywhere sense for flat 2-tori (Sarnak [14]), flat 4-tori (Vanderkam [17]), deterministically for almost all flat tori (Eskin–Margulis–Mozes [5]), and for certain integrable quantum maps in one degree of freedom (Rudnick–Sarnak [12], Zelditch [1]).

Smilansky [16] has more recently posed an analogous conjecture for scattering systems with continuous spectra. Since the scattering matrix $S(E)$ at energy level $E$ is, at least heuristically, the quantization of the classical scattering map, he argues that when the scattering map is integrable the eigenvalues of $S(E)$ (known as phase shifts) should exhibit Poisson statistics. In particular, he proposed that the pair correlation function of scaled phase shifts should be uniform for surfaces of revolution with a cylindrical end (see Fig. 1).

The purpose of this paper is to prove (a somewhat modified form of) this conjecture for almost all surfaces in an infinite dimensional family of (pairs of) such surfaces. To explain the modifications and state our results, we need to introduce some notation. The surfaces we consider are topological discs $X$ on which $S^1$ acts freely except for a unique fixed point $m$. The metrics $g$ we consider are invariant under the $S^1$ action and in geodesic polar coordinates centered at $m$ have the form $g = dr^2 + a(r)^2 d\theta^2$, where $a(r)$ defines a short range cylindrical end metric (see Sect. 2). For technical reasons, we are only able to analyse the phase shifts at this time in the case where $g$ has a conic singularity at $m$.
Fig. 1. A surface of revolution with a conic singularity and a cylindrical end
To define the pair correlation measure, we recall that at energy $\lambda^2$ the scattering matrix for a surface of revolution with a cylindrical end is given by a diagonal $(2[\lambda] + 1) \times (2[\lambda] + 1)$ matrix with entries $\exp(2\pi i \delta_k(\lambda))$, $|k| \leq [\lambda]$ (see Sect. 3 for a detailed presentation). The phase shifts are given by $\delta_k(\lambda)$ and are well defined modulo $\mathbb{Z}$. The parameter $k$ corresponds to the angular momentum or, in other words, to the eigenvalues of the Laplacian on the cross-section (the circle in our case). When $|k|$ is close to $\lambda$ we expect no scattering phenomena as the classical motion is close to the bounded motion along the cross-sections (see Fig. 2). At the opposite extreme, when $|k|/|\lambda|$ is close to 0, the classical motion is along geodesics approaching the singularity on the surface. Since the properties of the pair correlation measure are supposed to correspond to the properties of smooth classical motion it is natural, at least at this early stage, to delete the angular momenta corresponding to the neighbourhoods of the singularities.

Based on this discussion we define for any $\epsilon > 0$ the following measure:

$$\rho^\epsilon_\lambda([a, b]) \overset{\mathrm{def}}{=} \frac{1}{1 - 2\epsilon}\, \frac{1}{2\lambda + 1}\; \#\big\{ (l, m, k) : l, m, k \in \mathbb{Z},\ \epsilon < |l/\lambda|, |m/\lambda| < 1 - \epsilon,\ (2\lambda + 1)(1 - 2\epsilon)(\delta_l(\lambda) - \delta_m(\lambda) + k) \in [a, b] \big\}. \tag{1.1}$$

In other words, for $f \in \mathcal{S}(\mathbb{R})$,

$$\int f(x)\, \rho^\epsilon_\lambda(dx) = \frac{1}{1 - 2\epsilon}\, \frac{1}{2\lambda + 1} \sum_{k \in \mathbb{Z}} \ \sum_{\substack{m, l \in \mathbb{Z} \\ \epsilon < |m/\lambda|, |l/\lambda| < 1 - \epsilon}} f\big( (1 - 2\epsilon)(1 + 2\lambda)(\delta_l(\lambda) - \delta_m(\lambda) + k) \big). \tag{1.2}$$
Fig. 2. (a) Geodesics approaching the singularity ($k = 0$); (b) geodesics close to the cross-sections ($k \sim \lambda$)
Although rather cumbersome, this definition follows the standard procedure for defining pair correlation measures (see the references listed above).

Our main result concerns the (modified) pair correlation function for an infinite dimensional set $\mathcal{G}$ of 2-parameter families of surfaces of revolution $(X, g^{\alpha,\beta})$, $(\alpha, \beta) \in (\alpha_0 - \delta, \alpha_0 + \delta) \times (-\delta, \delta) \subset \mathbb{R}^2$, with cylindrical ends. The precise definition of $\mathcal{G}$ will be given in Proposition 6 in Sect. 5. The key property of the metrics is that the leading parts of the phase shifts, $\psi^{\alpha,\beta}$, of the 2-parameter families $(X, g^{\alpha,\beta})$ depend linearly on the parameters $(\alpha, \beta)$. This feature allows us to prove:

Theorem. Let $\{g_{\alpha,\beta}\} \in \mathcal{G}$. Then for almost every pair $(\alpha, \beta)$ (in the sense of Lebesgue measure) and for any sequence $\{\lambda_m\}_{m=0}^\infty$ satisfying

$$\sum_{m=0}^{\infty} \frac{\log^3 \lambda_m}{\lambda_m} < \infty,$$

we have

$$\lim_{m \to \infty} \rho^\epsilon_{\lambda_m}(f) = \int f(x)\, dx + f(0), \qquad f \in \mathcal{S}(\mathbb{R}), \ \epsilon > 0. \tag{1.3}$$
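The two terms on the right-hand side of (1.3) can be traced back to the definition (1.2) by a short heuristic count. The sketch below is not part of the proof. The symbol $\epsilon$ denotes the cutoff parameter of (1.1), the notation $N_\epsilon(\lambda)$ and $c_\lambda$ is introduced here only for illustration, and the statement about the off-diagonal terms is the Poisson heuristic (differences of distinct phase shifts behaving as if uniformly distributed modulo 1), not something derived.

```latex
% Hypothetical notation: N_eps(lambda) counts admissible angular momenta,
% c_lambda is the scaling factor appearing in (1.1)-(1.2).
\begin{align*}
N_\epsilon(\lambda) &:= \#\{\, l \in \mathbb{Z} : \epsilon < |l/\lambda| < 1-\epsilon \,\}
   = 2(1-2\epsilon)\lambda + O(1),
\qquad c_\lambda := (1-2\epsilon)(2\lambda+1).
\end{align*}
% Diagonal terms l = m: only k = 0 survives because f is a Schwartz function.
\begin{align*}
\frac{1}{c_\lambda} \sum_{k\in\mathbb{Z}} N_\epsilon(\lambda)\, f(c_\lambda k)
 &= \frac{N_\epsilon(\lambda)}{c_\lambda}\Big( f(0) + \sum_{k \neq 0} f(c_\lambda k) \Big)
 \;\longrightarrow\; f(0) \qquad (\lambda \to \infty).
\end{align*}
% Off-diagonal terms l != m, under the heuristic that t = delta_l - delta_m
% is uniformly distributed mod 1: the sum over k unfolds the integral over R.
\begin{align*}
\int_0^1 \sum_{k\in\mathbb{Z}} f\big(c_\lambda (t+k)\big)\, dt
 &= \frac{1}{c_\lambda}\int_{\mathbb{R}} f(x)\,dx,
\\
\frac{1}{c_\lambda}\; N_\epsilon(\lambda)\big(N_\epsilon(\lambda)-1\big)\;
 \frac{1}{c_\lambda}\int_{\mathbb{R}} f(x)\,dx
 &\;\longrightarrow\; \int_{\mathbb{R}} f(x)\,dx \qquad (\lambda \to \infty).
\end{align*}
```

The normalization $1/((1-2\epsilon)(2\lambda+1))$ is thus exactly the one for which the diagonal contributes $f(0)$ and uniformly distributed differences contribute $\int f$.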
This theorem proves the Berry–Tabor–Smilansky conjecture for phase shifts for our class of surfaces. The statement can only hold almost everywhere as we can produce one parameter families of surfaces for which the pair correlation measure is not uniform. The proof of the theorem is based on Proposition 5 below which is a somewhat stronger and more precise result.

2. Surfaces of Revolution with Cylindrical Ends

We consider a class of incomplete two dimensional smooth manifolds denoted by $X \setminus \{m\}$, such that $X$ is a topological completion of $X \setminus \{m\}$. We can then consider $X$ as a manifold with a conic singularity. The manifold $X \setminus \{m\}$ is globally parametrized by $(0, \infty) \times S^1$ and we put on it metrics of revolution:

$$g = dr^2 + a(r)^2 d\theta^2, \qquad r \in (0, \infty), \ \theta \in S^1. \tag{2.1}$$

The metric is assumed to be a short range cylindrical end metric, that is, we require that

$$\left| \partial_r^k \big( a(r)^2 - 1 \big) \right| \leq C_k\, r^{-2-k}, \qquad r \longrightarrow \infty. \tag{2.2}$$

At $m$ we assume a conic structure:

$$a(0) = 0, \qquad a'(0) \neq 0. \tag{2.3}$$

We also make a convexity assumption by demanding that

$$a'(r) > 0. \tag{2.4}$$
The metric can be extended to a smooth metric on $X$ (endowed with a natural $C^\infty$ structure coming from polar coordinates, $(r, \theta)$) if and only if

$$a'(0) = 1, \qquad a^{(2p)}(0) = 0, \quad p \geq 0, \tag{2.5}$$

see for instance [3]. We will not assume (2.5) and consequently we allow bullet like surfaces shown in Fig. 1.

The classical dynamics is given by the Hamiltonian flow of the metric:

$$p = |\xi|^2_{g_x} = \rho^2 + a(r)^{-2} t^2, \tag{2.6}$$

where we parametrized $T^*(X \setminus \{m\})$ by $(x, \xi) = (r, \theta; \rho, t)$, with $\rho$ and $t$ dual to $r$ and $\theta$ respectively. As is well known this flow is completely integrable: $\{p, t\} = 0$, and $t = \xi(\partial_\theta)$ is called the Clairaut integral. Abstractly, $\partial_\theta$ is the vector field generating the $S^1$ action on $X \setminus \{m\}$. As in the case of compact simple surfaces of revolution (see [1,6,3]) we have a stronger statement:

Proposition 1. For $(X \setminus \{m\}, g)$ with the metric $g$ satisfying (2.1)–(2.4) there exist global action angle variables on $T^*(X \setminus \{m\})$.

Although it plays no part in the proof, this is worth presenting here as the global action variables are closely related to the asymptotics of the phase shifts.

Proof. The moment map

$$P : T^*(X \setminus \{m\}) \ni (x, \xi) \longmapsto \big( |\xi|_{g_x},\ \xi(\partial_\theta) \big) \in \mathbb{R}_+ \times \mathbb{R}$$

has the range given by the open set $B = \{(b_1, b_2) : |b_2| < b_1\}$. For any $(b_1, b_2) \in B$, $P^{-1}(b_1, b_2)$ consists of an $\mathbb{R} \times S^1$ orbit of a single geodesic in $T^*(X \setminus \{m\})$ (the $\mathbb{R}$-action corresponds to the geodesic flow and the $S^1$-action to the $\theta$-rotation). In the case of a simple surface of revolution, the global action variables, $(I_1, I_2)$, are defined by

$$I_j(b) = \frac{1}{2\pi} \int_{\gamma_j(b)} \alpha, \qquad \alpha = \xi \cdot dx, \tag{2.7}$$

where $(\gamma_1(b), \gamma_2(b))$ is a global trivialization of the bundle $H_1(P^{-1}(b), \mathbb{Z})$ of the homology groups along the fibers of $P$. When $\gamma_1(b)$ is chosen as the orbit of the $S^1$-action, then $I_1 = \xi(\partial_\theta)$.

In the case of non-compact surfaces discussed here the fibers are given by $\mathbb{R} \times S^1$ and not by $S^1 \times S^1$ (except for the degenerate case of the meridians, $t = 0$, where the fiber is $(\mathbb{R} \setminus \{0\}) \times S^1$, where 0 corresponds to the point $m$). Consequently the integral for $I_2$ given by (2.7) diverges (for $\gamma_1$ we can still take the compact orbit of the $S^1$-action). Hence we have to normalize the integral using the fact that the surface is asymptotic to a cylinder with $a(r) \equiv 1$. If we take $\gamma_2(b)$ to correspond to a geodesic in $P^{-1}(b)$, then outside of the turning point $\rho = 0$ (or $r = 0$ for the degenerate case of the meridians) it can be parametrized by $r$. Then $\xi \cdot dx$ becomes $\rho\, dr$ and we can put

$$I_2(b) = \frac{1}{\pi} \lim_{R \to \infty} \left[ \int_0^R \Big( b_1^2 - \frac{b_2^2}{a(r)^2} \Big)_+^{1/2} dr - \int_0^R (b_1^2 - b_2^2)^{1/2}\, dr \right], \tag{2.8}$$

that is, we normalize by subtracting the "free" $\rho\, dr$ defined by $\rho^2 + b_2^2 = b_1^2$. From this we find the angle variables as in [1,6].
3. Review of Scattering Theory

There are many ways of introducing the scattering matrix on a manifold of the type we consider. Since we only assume (2.2), $X$ is not a b-manifold in the sense of Melrose (see [9,8]). It is a manifold with a cusp metric at one end and a conic metric at the other (see [9]). We shall not however use this point of view here. Instead we will proceed more classically and we will define the scattering matrix using the wave operators; see [8] for an indication of the relation between the two approaches.

As in the proof of Proposition 1 we need a free reference problem

$$X_0 \simeq \mathbb{R} \times S^1, \qquad g_0 = dr^2 + d\theta^2. \tag{3.1}$$

On $X$ and $X_0$ we define the wave groups, $U(t)$ and $U_0(t)$:

$$U(t) : C_c^\infty(X) \times C_c^\infty(X) \ni (u_0, u_1) \longmapsto (u(t), D_t u(t)),$$

where

$$(D_t^2 - \Delta_g) u = 0, \qquad u|_{t=0} = u_0, \quad D_t u|_{t=0} = u_1 .$$

The operators $U(t)$ extend as a unitary group to the energy space, $\mathcal{H}(X)$, obtained by taking the closure of $C_c^\infty(X) \times C_c^\infty(X)$ with respect to the norm

$$\|(u_0, u_1)\|_E^2 = \|\nabla u_0\|_{L^2}^2 + \|u_1\|_{L^2}^2 .$$

The definition and properties of $U_0(t)$ are analogous. We then define the Møller wave operators $W_\pm : \mathcal{H}(X_0) \longrightarrow \mathcal{H}(X)$ by

$$W_\pm[w] = \lim_{t \to \pm\infty} U(-t)\, \chi(r)\, U_0(t)\, w, \qquad w \in \mathcal{H}(X_0),$$

where $\chi \in C^\infty([0, \infty); [0, 1])$, $\chi(r) \equiv 0$ for $r < 1$ and $\chi(r) \equiv 1$ for $r > 2$, and where for $r > 1$ we used the obvious identification of the corresponding subsets of $X$ and $X_0$. In the situation we consider the existence of $W_\pm$ is quite straightforward and we choose the wave rather than the Schrödinger picture just for variety. The scattering operator is

$$S \overset{\mathrm{def}}{=} W_-^* W_+ : \mathcal{H}(X_0) \longrightarrow \mathcal{H}(X_0) \tag{3.2}$$

and, as we will see below, it is a unitary operator. When there is no pure point spectrum then the wave operators $W_\pm$ are themselves unitary. In all situations they are partial isometries and $W_\pm^* = \lim_{t \to \pm\infty} U_0(-t)\, \chi(r)\, U(t)$. The null space of $W_\pm^*$ is the span of the $L^2$ eigenfunctions of $\Delta_g$. Under our assumptions there could only be finitely many such eigenfunctions. The wave operators have the intertwining properties:

$$W_\pm \begin{pmatrix} 0 & I \\ \Delta_{g_0} & 0 \end{pmatrix} = \begin{pmatrix} 0 & I \\ \Delta_g & 0 \end{pmatrix} W_\pm \ \Longrightarrow\ \left[ S, \begin{pmatrix} 0 & I \\ \Delta_{g_0} & 0 \end{pmatrix} \right] = 0 .$$
Since all operators commute with the generator of the S1 action, ∂θ , we decompose S using the spectral decompositions of 1g0 and of ∂θ . It is easy to check that Z ∞ 0 I = λdEλ0 , 1g0 0 −∞ where the Schwartz kernel of dEλ0 is given by 1 sgn(λ) X in(θ−θ 0 ) 0 I 2 2 2 0 −1 e eisgn(λ)(λ −n ) (r−r ) (λ2 − n2 )+ 2 dλ. dEλ0 (r, θ; r 0 , θ 0 ) = 2 2 λ 0 (2π) n∈Z
Because $S$ commutes with the generator of the free propagator, $U_0(t)$, we obtain the scattering matrix at fixed energy using the above spectral decomposition:

$$S = \int S(\lambda)\, dE^0_\lambda,$$

and then the decomposition corresponding to the eigenvalues of $\partial_\theta$:

$$S(\lambda) = \frac{1}{2\pi} \sum_{n \in \mathbb{Z}} S_n(\lambda)\, e^{in(\theta - \theta')} .$$

From the structure of $dE^0_\lambda$ it is clear that $S_n(\lambda) \equiv 0$ for $|n| > |\lambda|$. For $|\lambda| = |n|$ we follow [8] and put

$$S_n(\lambda) = \lim_{\tau \to |\lambda|+} S_n(\mathrm{sgn}(\lambda)\, \tau).$$

We also note that $S_n(\lambda) = S_{-n}(\lambda)$. For $|n| \leq |\lambda|$, $S_n(\lambda)$ is a unitary operator, that is, it is given by multiplication by a complex number of unit length:

$$S_n(\lambda) = e^{2\pi i \delta_n(\lambda)}, \tag{3.3}$$

and the number $\delta_n(\lambda)$ is the $n$th phase shift at energy $\lambda^2$. Another way to think about $S(\lambda)$ is as a diagonal unitary $(2n + 1) \times (2n + 1)$ matrix, where $n = [|\lambda|]$:

$$S(\lambda) = \left( e^{2\pi i \delta_k(\lambda)}\, \delta_{kj} \right)_{-n \leq k, j \leq n} .$$

A more "down-to-earth" definition, following the traditional way of introducing phase shifts in one dimensional scattering, is given through asymptotic expansions in (3.6) below. The uniform behaviour as $k$ and $\lambda$ go to infinity and $k \sim \lambda$ is a well understood semi-classical problem. To describe it we separate variables in the eigen-equation of the Laplacian. We remark that this procedure can also provide direct proofs of the general scattering theoretical statements above. The Laplace operator is given by

$$\Delta_g = D_r^2 - i\, \frac{a'(r)}{a(r)}\, D_r + \frac{1}{a(r)^2}\, D_\theta^2$$
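and on the eigenspaces of $D_\theta$ it acts as

$$\Delta_n = D_r^2 - i\, \frac{a'(r)}{a(r)}\, D_r + \frac{n^2}{a(r)^2} = a(r)^{-\frac{1}{2}} \left( D_r^2 + \frac{n^2}{a(r)^2} - \frac{2 a''(r) a(r) - (a'(r))^2}{4 a(r)^2} \right) a(r)^{\frac{1}{2}} . \tag{3.4}$$

The reduced operator appearing in brackets in the second expression above has a self-adjoint realization on $L^2((0, \infty)_r)$ and for large $\lambda$ it can be considered semi-classically:

$$\Delta_n - \lambda^2 = \lambda^2\, a(r)^{-\frac{1}{2}}\, P(x, h)\, a(r)^{\frac{1}{2}}, \qquad h = \frac{1}{|\lambda|}, \quad x = \frac{n}{\lambda},$$

$$P(x, h) = (h D_r)^2 + V(r; x, h) - 1, \tag{3.5}$$

$$V(r; x, h) = \frac{x^2}{a(r)^2} - h^2\, \frac{2 a''(r) a(r) - (a'(r))^2}{4 a(r)^2}, \qquad V_0(r; x) \overset{\mathrm{def}}{=} V(r; x, 0).$$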
The principal symbol of $P(x, h)$ is given by $p = \rho^2 + x^2/a(r)^2 - 1$ and the natural range of $x$ for which semi-classical methods are applicable is given by $0 < \epsilon < |x| < 1 - \epsilon$. In fact, since $a(r)$ is one at infinity, we approach zero energy when $x^2$ is close to 1. On the other hand when $x \to 0$ the characteristic variety of $p$ has a singular limit (see Fig. 3). A detailed analysis of the $x \to 0$ limit has to involve the lower order terms in $V(r; x, h)$. In particular, miraculous cancellations in the expansions due to the interaction between the leading and lower order terms occur when we have product type conic singularities since we can then use the theory of Bessel functions. The general situation is, at least to the authors, unclear at the moment. What is quite clear is that we have a uniform expansion in $h/x$.

Fig. 3. The characteristic variety of $P(x, h)$
S. Zelditch, M. Zworski
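The phase shifts $\delta_n(\lambda)$ are related to the semi-classical phase shifts $\psi(x,h)$ of the operator $P(x,h)$, which are defined by asymptotics of solutions:
$$\begin{aligned}
&P(x,h)u = 0,\qquad u(r) = e^{\frac{i}{h}\sqrt{1-x^2}\,r} + e^{\frac{i}{h}\psi(x,h)}\,e^{-\frac{i}{h}\sqrt{1-x^2}\,r} + O\left(\frac1r\right),\quad r\to\infty,\\
&\delta_n(\lambda) = \frac{|\lambda|}{2\pi}\,\psi\left(\frac{n}{\lambda},\frac{1}{|\lambda|}\right).
\end{aligned} \tag{3.6}$$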
We recall now the essentially standard asymptotic properties of $\psi$ – see [10], Chapter 6 and for a more microlocal discussion [11].

Proposition 2. As $h \to 0$, $\psi(x,h)$ defined by (3.6) has an asymptotic expansion uniform in $\epsilon < |x| < 1-\epsilon$ for any fixed $\epsilon > 0$:
$$\psi(x,h) \sim \psi(x) + h\,\frac{\pi}{2} + h^2\psi_2(x) + \cdots, \tag{3.7}$$
where
$$\psi(x) = \int_0^\infty\left((1 - V_0(r,x))_+^{\frac12} - (1-x^2)^{\frac12}\right)dr. \tag{3.8}$$
We remark that when we translate the asymptotics to the coordinates on $T^*(X\setminus\{m\})$: $x = t/\lambda$, $\lambda^2 = \rho^2 + t^2/a(r)^2$ we obtain the second action variable defined in the proof of Proposition 1.

As mentioned in the introduction, we can describe this connection between the phase shifts and action variables by saying that $S(\lambda)$ is a quantum map on $G(X,g)$, the space of geodesics. We now digress to explain this statement in more detail. For simplicity we will consider $S$ at integral values of $\lambda$ and denote them by $N$. Since it is not needed in the calculation of the limit pair correlation function we give a somewhat sketchy discussion and refer to [18] for background on Toeplitz quantization. See also [15] for a related discussion from a physicist’s point of view.

We can identify $G(X,g)$ with the set $S^*_{\mathrm{in}}(X_{r_0})$ of incoming vectors at the parallel $X_{r_0} \overset{\mathrm{def}}{=} X\cap\{r = r_0\}$. As in Proposition 2 we have to delete $\epsilon$-neighborhoods of the singular set, given by $\{|t/\lambda| < \epsilon\}$, where $t = I_1 = \xi(\partial_\theta)$ is the first action variable and $\lambda^2 = \rho^2 + t^2a(r)^{-2}$ is the energy. We denote the deleted space of geodesics by $G_\epsilon(X,g)$ and identify it with the set $S^*_{\mathrm{in},\epsilon}(X_{r_0})$ of incoming vectors at $X_{r_0}$ with incoming angle satisfying $|\theta| > \epsilon$, $|\theta - \pi| > \epsilon$ and $|\theta - \frac{\pi}{2}| > \epsilon$. If we consider the deleted space of geodesics as a phase space then, on the quantum level, it corresponds to the sequence of truncated Hilbert spaces $H_{N,\epsilon}$, spanned by the eigenfunctions $\{e^{in\theta}\}$ of the quantum action $\frac1N\hat I_1$ with $\epsilon < |\frac{n}{N}| < 1-\epsilon$, where $\hat I_1 = -i\partial_\theta$. Here, $1/N$ plays the role of the Planck constant and we restrict $N$ to integral values. Since $H_{N,\epsilon}$ is invariant under $S(N)$ we may restrict the latter to a unitary scattering matrix $S_\epsilon(N)$ on $H_{N,\epsilon}$. We now state the somewhat informal:
Proposition 3. The sequence $\{S_\epsilon(N)\}$ is a semiclassical quantum map over $G_\epsilon(X,g)$ associated to the classical scattering map
$$\beta : S^*_{\mathrm{in},\epsilon}(X_{r_0}) \to S^*_{\mathrm{in},\epsilon}(X_{r_0}),$$
where $\beta(x,\xi)$ is obtained by following the geodesic $\gamma_{(x,\xi)}$ through $(x,\xi)$ until it intersects $X_{r_0}$ for the last time and reflecting the outgoing tangent vector inward.

Proof. From the explicit formula
$$S_\epsilon(N) = \left(e^{2\pi i\delta_k(N)}\delta_{kj}\right)_{\epsilon\le|k/N|,\,|j/N|\le(1-\epsilon)},\qquad \delta_k(N) = \frac{|N|}{2\pi}\,\psi\left(\frac{k}{N},\frac{1}{N}\right), \tag{3.9}$$
we see that $S_\epsilon(N)$ is the exponential of $N$ times the Hamiltonian
$$\hat H_{N,\epsilon} = \chi\left(\frac{\hat I_1}{N}\right)\psi\left(\frac{\hat I_1}{N},\frac{1}{N}\right)$$
on $H_{N,\epsilon}$ where $\chi$ is a smooth cutoff function defining the truncated Hilbert space.

The truncated phase space $G_\epsilon(X,g)$ is symplectically equivalent to a truncated $S^2$, equipped with its standard area form, with neighbourhoods of the poles and of the equator deleted. Indeed, the equivalence is defined by the identity map between global action-angle charts on the surfaces. This map intertwines the obvious $S^1$ actions which rotate the spaces. The quantization of this chart then defines a unitary equivalence on the quantum level which intertwines the operators $\partial_\theta$ on cylinder and sphere (they can be considered as the angular momentum operators). The equivalence is specified up to a choice of $2N+1$ phases by mapping the spherical harmonic of degree $N$ which transforms under rotation by $\theta$ on $S^2$ by $e^{ik\theta}$ to the exponential $e^{ik\theta}$ with $k \in [-N,\ldots,N]$. The map is completely specified by requiring that the spherical harmonic be real valued along $\theta = 0$.

Thus we may identify $\hat H_{N,\epsilon}$ with a Hamiltonian over the compact phase space $S^2$. Since it is a function of the (Toeplitz) action operator $\hat I_1$ it is necessarily a semiclassical Toeplitz operator of order zero with principal symbol $\chi(I_1/E)\psi(I_1/E)$ on $S^2$. The semiclassical parameter $N$ is identified in the Toeplitz theory with a first order positive elliptic Toeplitz operator with eigenvalue $N$ in $H_N$ – see [18] and references given there. Hence $N\hat H_{N,\epsilon}$ is a first order Toeplitz operator of real principal type. As in the essentially analogous case of pseudodifferential operators, the exponential of a first order Toeplitz operator of real principal type is a Fourier integral Toeplitz operator whose underlying classical map is the Hamilton flow generated by $\psi$.

We now wish to identify this map at time one with the classical scattering (or billiard ball) map on $S^*_{\mathrm{in},\epsilon}(X_{r_0})$, that is, we wish to prove that $\beta = \exp\Xi_\psi$, where $\Xi_\psi$ is the Hamilton vector field of $\psi$.
Indeed, let us work in the symplectic action-angle coordinates $(\theta, I_1)$, where $\theta$ is the angle along $X_{r_0}$. The Hamilton flow of $\psi$ then takes the form
$$\exp t\,\Xi_\psi(\theta,I_1) = (\theta + t\omega, I_1),\qquad \omega = \partial_{I_1}\psi. \tag{3.10}$$
At time $t = 1$ the angle along the parallel $X_{r_0}$ changes by $\omega$. We claim that $\omega$ is also the change in angle along the incoming geodesic through $\theta \in X_{r_0}$ in the direction $I_1$ as it scatters in the bullet head before exiting again along $X_{r_0}$.
To see this, we use Proposition 2 which shows that $\psi$ is closely related to the second action variable: in the notation of the proof of Proposition 1,
$$I_1(b_1,b_2) = b_2,\qquad I_2(b_1,b_2) = b_1\,\psi\left(\frac{b_2}{b_1}\right).$$
Since $b_1 = \lambda$ is preserved by the flow we can fix it at $\lambda = 1$ and then
$$\partial_{I_1}\psi(I_1) = 2I_1\int\frac{dr}{a(r)^2}\left(1 - \frac{I_1^2}{a(r)^2}\right)_+^{-\frac12}. \tag{3.11}$$
On the other hand the equations of motion show that
$$\frac{d\theta}{dr} = \frac{\dot\theta}{\dot r} = \frac{2t\,a(r)^{-2}}{2\rho} = \frac{I_1}{a(r)^2}\left(1 - \frac{I_1^2}{a(r)^2}\right)^{-\frac12}. \tag{3.12}$$
It follows that $\omega(I_1)$ is twice the change in angle as the radial distance changes from $r_0$ to its minimum along the geodesic. The piece of the geodesic lying in the bullet-head consists of two segments: the initial segment beginning on $S_{r_0}$ and ending upon its tangential intersection with the parallel $X_{r_-(I_1)}$ closest to $m$, and the segment beginning at this intersection and ending on $X_{r_0}$. The change in $\theta$-angle along both segments is the same, so that the total change in angle during the scattering is given by the integral (3.12) above. This shows that $\beta$ and $\exp\Xi_\psi$ have precisely the same formula in action-angle variables and completes the proof of the proposition. □

4. Exponential Sums

Following [18] we will reduce the study of (1.2) to a study of certain exponential sums. We first remark that because of symmetries of $\delta_k(\lambda)$ we can study a slightly simpler expression
$$\tilde\rho_\lambda(f) = \frac{1}{(1-2\epsilon)\lambda}\sum_{m\in\mathbb{Z}}\ \sum_{\epsilon<j/\lambda,\,k/\lambda<1-\epsilon} f\big((1-2\epsilon)\lambda\,(\delta_j(\lambda) - \delta_k(\lambda) + m)\big)$$
as one easily checks that
$$\tilde\rho_\lambda(f) = \frac12\,\rho_\lambda\left(f\left(\tfrac12\,\bullet\right)\right).$$
The reduction to exponential sums follows from an application of the Poisson summation formula in $m$:
$$\tilde\rho_\lambda(f) = \frac{1}{[(1-2\epsilon)\lambda]^2}\sum_{\ell\in\mathbb{Z}}\hat f\left(\frac{2\pi\ell}{(1-2\epsilon)\lambda}\right)\left|\sum_k e^{i\ell\delta_k(\lambda)}\right|^2.$$
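As a purely illustrative aside, a pair correlation sum of this type can be evaluated by brute force for a finite list of phases. The sketch below is a toy under stated assumptions: the function names, the tent test function (compactly supported rather than Schwartz), and the model phases $\alpha k + \beta\lambda\Phi(k/\lambda)$ with $\Phi(x) = x^2$ are hypothetical choices, and the scaling factor is taken to be the number $L$ of phases in place of the $\lambda$-dependent normalisation used in the text.

```python
import math


def pair_correlation(deltas, f, support):
    """Return (1/L) * sum over m in Z and all j, k of f(L * (delta_j - delta_k + m)).

    `f` is assumed to vanish outside [-support, support], so only finitely
    many integers m contribute. L is the number of phases (an assumption of
    this sketch, standing in for the normalising factor used in the text).
    """
    L = len(deltas)
    frac = [d % 1.0 for d in deltas]           # phases only matter modulo 1
    m_max = int(math.ceil(support / L)) + 1    # since |delta_j - delta_k| < 1
    total = 0.0
    for dj in frac:
        for dk in frac:
            for m in range(-m_max, m_max + 1):
                total += f(L * (dj - dk + m))
    return total / L


def tent(t):
    """Hypothetical compactly supported test function, supported in [-1, 1]."""
    return max(0.0, 1.0 - abs(t))


def model_phases(lam, alpha, beta, eps=0.1):
    """Toy phases alpha*k + beta*lam*Phi(k/lam), Phi(x) = x**2, eps < k/lam < 1 - eps."""
    ks = [k for k in range(1, int(lam)) if eps < k / lam < 1.0 - eps]
    return [alpha * k + beta * lam * (k / lam) ** 2 for k in ks]


if __name__ == "__main__":
    # Equally spaced phases: only the diagonal j = k contributes.
    print(pair_correlation([j / 8 for j in range(8)], tent, 1.0))
    # Generic parameters in the toy model.
    print(pair_correlation(model_phases(200, math.sqrt(2), math.sqrt(3)), tent, 1.0))
```

The diagonal terms $j = k$ always contribute $f(0)$, which is why the statistic never drops below $f(0)$ for a non-negative test function; the off-diagonal terms are the ones controlled by the exponential sums.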
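From this we see that
$$\tilde\rho_\lambda(f) = \hat f(0) + \int\hat f(2\pi\xi)\,d\xi + o_{\lambda\to\infty}(1) + E_\lambda(f),$$
$$E_\lambda(f) \overset{\mathrm{def}}{=} \frac{1}{[(1-2\epsilon)\lambda]^2}\sum_{\ell\in\mathbb{Z}\setminus\{0\}}\hat f\left(\frac{2\pi\ell}{(1-2\epsilon)\lambda}\right)\sum_{j\ne k}e^{i\ell(\delta_k(\lambda) - \delta_j(\lambda))}. \tag{4.1}$$
Ideally, we would like to show that $E_\lambda(f) \to 0$ as $\lambda \to \infty$. That however seems very hard and for some surfaces is simply not true. As in [14,12,18] we will instead consider families of surfaces and resulting families of scattering phase shifts:
$$(\alpha_0-\gamma,\alpha_0+\gamma)\times(-\gamma,\gamma)\ni(\alpha,\beta)\longmapsto\delta_k^{\alpha,\beta}(\lambda).$$
Replacing $\delta_k(\lambda)$ by $\delta_k^{\alpha,\beta}(\lambda)$ in (4.1) we define
$$(\alpha_0-\gamma,\alpha_0+\gamma)\times(-\gamma,\gamma)\ni(\alpha,\beta)\longmapsto E_\lambda(f;\alpha,\beta). \tag{4.2}$$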
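To see the point of doing this we recall from [14,12] and [18] the following simple

Lemma 1. If for any $f \in \mathcal{S}(\mathbb{R})$
$$\int_{\alpha_0-\gamma}^{\alpha_0+\gamma}\int_{-\gamma}^{\gamma}|E_\lambda(f;\alpha,\beta)|^2\,d\alpha\,d\beta \le C_{\epsilon,f}\,F(\lambda),$$
then for any sequence $\{\lambda_m\}_{m=0}^\infty$ such that
$$\sum_{m=0}^\infty F(\lambda_m) < \infty,$$
we have
$$E_{\lambda_m}(f;\alpha,\beta)\longrightarrow0,\quad m\longrightarrow\infty,\quad\forall f\in\mathcal{S}(\mathbb{R}),$$
almost everywhere in $(\alpha,\beta)\in(\alpha_0-\gamma,\alpha_0+\gamma)\times(-\gamma,\gamma)$.

When $\delta_k^{\alpha,\beta}(\lambda)$ have a somewhat idealized form, the crucial estimate comes from [18], Theorem 5.1.1 where it is loosely based on the Vinogradov method. Since we will need a further development of these estimates we present a slightly modified proof.

Proposition 4. If in (4.1) and (4.2),
$$\delta_k^{\alpha,\beta}(\lambda) = \alpha k + \beta\lambda\,\Phi\left(\frac{k}{\lambda}\right),\qquad \Phi\in C^\infty((0,1)),\quad |\Phi''|_{(\epsilon,1-\epsilon)}| \ge C_\epsilon > 0,$$
then for any $f\in\mathcal{S}(\mathbb{R})$
$$\int_{-1}^{1}\int_{-1}^{1}|E_\lambda(f;\alpha,\beta)|^2\,d\alpha\,d\beta = O_{f,\epsilon}\left(\frac{\log^3\lambda}{\lambda}\right),\quad\lambda\longrightarrow\infty.$$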
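Proof. Let $\rho_\delta\in C^\infty(\mathbb{R})$ have the following properties:
$$\rho_\delta(t)\ge\mathbf{1}_{[-1,1]}(t),\qquad \operatorname{supp}\hat\rho_\delta\subset(-\delta,\delta).$$
The estimate of the proposition will clearly follow from
$$\int_{\mathbb{R}}\int_{\mathbb{R}}\rho_\delta(\alpha)\rho_\delta(\beta)\,|E_\lambda(f;\alpha,\beta)|^2\,d\alpha\,d\beta = O\left(\frac{\log^3\lambda}{\lambda}\right). \tag{4.3}$$
Using the representation of $E_\lambda$, (4.1), the left-hand side of (4.3) can be rewritten as
$$\begin{aligned}
\frac{1}{\lambda^2}\sum_{\ell_1\ne0}\sum_{\ell_2\ne0} g\left(\frac{\ell_1}{\lambda}\right)g\left(\frac{\ell_2}{\lambda}\right)\times\sum_{j_1\ne k_1,\ j_2\ne k_2}&\hat\rho_\delta\big(\ell_1(j_1-k_1) - \ell_2(j_2-k_2)\big)\\
\times\ &\hat\rho_\delta\left(\lambda\left[\ell_1\left(\Phi\left(\frac{j_1}{\lambda}\right) - \Phi\left(\frac{k_1}{\lambda}\right)\right) - \ell_2\left(\Phi\left(\frac{j_2}{\lambda}\right) - \Phi\left(\frac{k_2}{\lambda}\right)\right)\right]\right),
\end{aligned} \tag{4.4}$$
where we dropped the overall factor of $(1-2\epsilon)^{-2}$ and put $g(\xi) \overset{\mathrm{def}}{=} f(2\pi(1-2\epsilon)^{-1}\xi)$. From now on we will drop the parameter $\epsilon$ altogether: we can for instance extend $\Phi$ as a strictly convex or concave function to $[0,1]$ adding additional positive terms to the sums which are being estimated or we can shift and rescale the variables. The support condition on $\hat\rho_\delta$ implies that
$$\ell_1(j_1-k_1) - \ell_2(j_2-k_2) = 0,\qquad \left|\ell_1\left(\Phi\left(\frac{j_1}{\lambda}\right) - \Phi\left(\frac{k_1}{\lambda}\right)\right) - \ell_2\left(\Phi\left(\frac{j_2}{\lambda}\right) - \Phi\left(\frac{k_2}{\lambda}\right)\right)\right| \le \frac{\delta}{\lambda}. \tag{4.5}$$
To understand the second expression we apply the mean value theorem twice to the difference of $\Phi$’s. For that we make a simple observation: if $\phi''$ has a fixed sign on $[a-\epsilon,b+\epsilon]$ then if $(m+h)/2,\ (m-h)/2\in[a,b]$,
$$\phi\left(\frac{m+h}{2}\right) - \phi\left(\frac{m-h}{2}\right) = h\,\phi'(\Xi(m,h)),\qquad \frac12\,\frac{\min|\phi''|}{\max|\phi''|}\le\frac{\partial\Xi}{\partial m}(m,h)\le\frac12\,\frac{\max|\phi''|}{\min|\phi''|}.$$
In our case we put $\phi = \Phi(\bullet/\lambda)$ and $h_i = j_i - k_i \ne 0$ and $m_i = j_i + k_i$. Then with $\xi_{\lambda,h}(m) \overset{\mathrm{def}}{=} \Xi(m,h)$ we have
$$1/C < \partial_m\xi_{\lambda,h}(m) < C, \tag{4.6}$$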
and (4.5) implies
$$\ell_1h_1 = \ell_2h_2,\quad 0<|h_i|\le\lambda,\qquad \left|\ell_1h_1\left(m_2 - \xi_{\lambda,h_2}^{-1}\big(\xi_{\lambda,h_1}(m_1)\big)\right)\right|\le C\delta\lambda,\quad 0\le m_i\le2\lambda, \tag{4.7}$$
where we can invert $\xi_{\lambda,h_2}$ in view of (4.6). Thus we want to study the sets of six integers $(h_i,m_i,\ell_i)$, $i = 1,2$, satisfying (4.7). We first note that, say, $h_2$ is determined by $h_1,\ell_1,\ell_2$. Then we see from the second inequality in (4.7) and from (4.6) that for fixed $(h_1,h_2,m_1,\ell_1)$ there are
$$O(1)\max\left(1,\frac{\lambda}{|\ell_1h_1|}\right)$$
$m_2$’s satisfying (4.7). When $\lambda > |\ell_1h_1|$ the contribution to (4.4) is estimated by
$$\begin{aligned}
\frac{1}{\lambda^4}\sum_{\ell_1\ne0}\sum_{\ell_2\ne0}g\left(\frac{\ell_1}{\lambda}\right)g\left(\frac{\ell_2}{\lambda}\right)\sum_{\substack{-\lambda\le h_1\le\lambda\\ h_1\ne0}}\ \sum_{1\le m_1\le2\lambda}\frac{\lambda}{|\ell_1h_1|}
&\le C\delta\,\frac{\log\lambda}{\lambda}\int_{|\xi|>1/\lambda}|g(\xi)|\frac{d\xi}{|\xi|}\int|g(\xi)|\,d\xi\\
&\le C_g\,\delta\,\frac{\log^2\lambda}{\lambda}.
\end{aligned}$$
When $\lambda\le|\ell_1h_1|$ then the number of $m_2$’s is uniformly bounded for each choice of the other variables. We want to count the triples $(h_1,h_2,\ell_1)$ satisfying the first equation of (4.7) as a function of $\ell_2$ and $\lambda$. Let $F(\lambda,\ell_2)$ denote that number. If $d(n)$ denotes the number of divisors of $n\ne0$, then
$$F(\lambda,\ell_2)\le8\sum_{0<h_2\le\lambda}d(h_2|\ell_2|),$$
since $\ell_1h_1 = \ell_2h_2$ and each factorization into a product has to be counted twice since $\ell_1$ and $h_1$ can be interchanged. Then
$$G(\lambda,N)\overset{\mathrm{def}}{=}\sum_{0\ne|\ell_2|\le N}F(\lambda,\ell_2)\le8\sum_{0\ne|\ell_2|\le N}\ \sum_{0<h_2\le\lambda}d(|\ell_2|h_2)\le C\sum_{1\le n\le N\lambda}d(n)^2\le C'\lambda N(\log\lambda+\log N)^3,$$
by a theorem of Ramanujan – see [7], Sect. 18.2 and references given there. Hence the part of (4.4) corresponding to the bounded number of $m_2$’s is bounded by
$$C\lambda^{-4}\max|g|\sum_{1\le m_1\le2\lambda}\ \sum_{\ell_2\ne0}F(\lambda,\ell_2)\left|g\left(\frac{\ell_2}{\lambda}\right)\right|\le C'\max|g|\,\lambda^{-3}\int\big(G(\lambda,\lambda|\xi|)+1\big)|g'(\xi)|\,d\xi\le C_g\,\frac{\log^3\lambda}{\lambda},$$
where we used summation by parts and then approximation by the Riemann integral. This completes the proof of the proposition. □
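The divisor-sum bound used in the proof can be explored numerically. The following is a minimal sketch, not part of the argument: it tabulates $d(n)$ with a sieve, sums $d(n)^2$, and compares the result with $x\log^3x$, the growth rate appearing in the estimate above. The function names are hypothetical.

```python
import math


def divisor_counts(n_max):
    """Return a list d with d[n] = number of divisors of n, for 0 <= n <= n_max."""
    d = [0] * (n_max + 1)
    for k in range(1, n_max + 1):
        # k divides every multiple of k
        for multiple in range(k, n_max + 1, k):
            d[multiple] += 1
    return d


def sum_d_squared(x):
    """Sum of d(n)^2 over 1 <= n <= x."""
    d = divisor_counts(x)
    return sum(v * v for v in d[1:])


if __name__ == "__main__":
    for x in (10 ** 3, 10 ** 4, 10 ** 5):
        total = sum_d_squared(x)
        # The ratio stays bounded, in line with a bound of the form C x log^3 x.
        print(x, total, total / (x * math.log(x) ** 3))
```

The sieve costs roughly $x\log x$ operations, so values of $x$ up to about $10^6$ are practical in plain Python.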
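We now recall from Proposition 2 that for surfaces we consider we have
$$\delta_k(\lambda) = \lambda\,\psi\left(\frac{k}{\lambda}\right) + \frac14 + \frac{1}{\lambda}\,\psi_2\left(\frac{k}{\lambda},\frac{1}{\lambda}\right),\qquad \epsilon<\frac{k}{\lambda}<1-\epsilon,\quad\lambda\to\infty, \tag{4.8}$$
where for $\epsilon<x<1-\epsilon$, $\partial_x^k\psi_2 = O_{k,\epsilon}(1)$. When the family of surfaces depends on two parameters, $(\alpha,\beta)$, so that (2.1)-(2.4) hold uniformly, then we also have (4.8) uniformly with respect to the parameters. Hence to apply Proposition 4 to our case we need to estimate the contribution of the error terms coming from $\psi_2^{\alpha,\beta}$. That is given in

Proposition 5. Let $\delta_k^{\alpha,\beta}(\lambda)$ be given by (4.8) uniformly in $(\alpha,\beta)\in(\alpha_0-\gamma,\alpha_0+\gamma)\times(-\gamma,\gamma)$ with
$$\psi^{\alpha,\beta}(x) = \alpha x + \beta\Phi(x),\qquad \Phi\in C^\infty((0,1)),\quad |\Phi''|_{(\epsilon,1-\epsilon)}|\ge C_\epsilon>0.$$
Then
$$\int_{\alpha_0-\gamma}^{\alpha_0+\gamma}\int_{-\gamma}^{\gamma}|E_\lambda(f;\alpha,\beta)|^2\,d\alpha\,d\beta = O_{f,\epsilon}\left(\frac{\log^3\lambda}{\lambda}\right).$$

Proof. We observe that $\psi^{\alpha,\beta}$ is defined for all $(\alpha,\beta)$ and that we can extend $\delta_k^{\alpha,\beta}$ to all $(\alpha,\beta)$ by smoothly cutting off the lower order terms for $(\alpha,\beta)\in\mathbb{R}\times\mathbb{R}\setminus(-\gamma-\epsilon,\gamma+\epsilon)\times(\alpha_0-\gamma-\epsilon,\alpha_0+\gamma+\epsilon)$. Using the inequality $|x|^2\le2|y|^2+2|x-y|^2$ and Proposition 4 we see that we need to estimate
$$\int\!\!\int\left|\frac{1}{\lambda^2}\sum_\ell g\left(\frac{\ell}{\lambda}\right)\sum_{j\ne k}\left(e^{i\ell(\delta_k^{\alpha,\beta}(\lambda)-\delta_j^{\alpha,\beta}(\lambda))} - e^{i\ell(\tilde\delta_k^{\alpha,\beta}(\lambda)-\tilde\delta_j^{\alpha,\beta}(\lambda))}\right)\right|^2\rho_\delta(\alpha)\rho_\delta(\beta)\,d\alpha\,d\beta,\qquad \tilde\delta_k^{\alpha,\beta}(\lambda)\overset{\mathrm{def}}{=}\lambda\,\psi^{\alpha,\beta}(k/\lambda),$$
where we use the notation of the proof of Proposition 4. We now introduce
$$\tau(z)\overset{\mathrm{def}}{=}2i\sin\frac{z}{2}\,\exp\left(-i\frac{z}{2}\right),\qquad e^{ix}-e^{iy} = e^{ix}\tau(x-y),\qquad \frac{\tau(z)}{z}\in C^\infty,$$
and put
$$\begin{aligned}
\psi_{\ell,j,k,\lambda}(\alpha,\beta) &= \frac{\lambda}{\ell}\,\tau\Big(\ell\big[\big(\delta_k^{\alpha,\beta}(\lambda)-\delta_j^{\alpha,\beta}(\lambda)\big) - \big(\tilde\delta_k^{\alpha,\beta}(\lambda)-\tilde\delta_j^{\alpha,\beta}(\lambda)\big)\big]\Big)\\
&= \frac{\lambda}{\ell}\,\tau\left(\frac{\ell}{\lambda}\left[\psi_2^{\alpha,\beta}\left(\frac{k}{\lambda},\frac{1}{\lambda}\right) - \psi_2^{\alpha,\beta}\left(\frac{j}{\lambda},\frac{1}{\lambda}\right)\right]\right).
\end{aligned}$$
From Proposition 2 and the obvious properties of $\tau$ we see that $\psi_{\ell,j,k,\lambda}$ is $C^\infty$ and that it satisfies the following estimates:
$$\left|\partial_\alpha^{p_\alpha}\partial_\beta^{p_\beta}\psi_{\ell,j,k,\lambda}(\alpha,\beta)\right|\le C_p\left(1+\frac{|\ell|}{\lambda}\right)^{p_\alpha+p_\beta}. \tag{4.9}$$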
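The elementary identity behind the function $\tau$ can be confirmed numerically. The sketch below is only a sanity check, with hypothetical helper names: it verifies $e^{ix}-e^{iy} = e^{ix}\tau(x-y)$ at a few sample points and that $\tau(z)/z$ stays bounded near $z = 0$, which is what makes $\psi_{\ell,j,k,\lambda}$ smooth.

```python
import cmath
import math


def tau(z):
    """tau(z) = 2i sin(z/2) exp(-i z/2), which equals 1 - exp(-i z)."""
    return 2j * math.sin(z / 2.0) * cmath.exp(-1j * z / 2.0)


def identity_defect(x, y):
    """Absolute difference between e^{ix} - e^{iy} and e^{ix} * tau(x - y)."""
    lhs = cmath.exp(1j * x) - cmath.exp(1j * y)
    rhs = cmath.exp(1j * x) * tau(x - y)
    return abs(lhs - rhs)


if __name__ == "__main__":
    for x, y in ((0.3, 1.7), (-2.0, 5.5), (10.0, 10.001)):
        print(x, y, identity_defect(x, y))
    # tau(z)/z tends to i as z -> 0, so the quotient extends smoothly through 0.
    print(tau(1e-6) / 1e-6)
```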
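Hence we have to look at
$$\int\!\!\int\left|\frac{1}{\lambda^2}\sum_\ell\frac{\ell}{\lambda}\,g\left(\frac{\ell}{\lambda}\right)\sum_{j\ne k}\psi_{\ell,j,k,\lambda}(\alpha,\beta)\,e^{i\ell(\tilde\delta_k^{\alpha,\beta}(\lambda)-\tilde\delta_j^{\alpha,\beta}(\lambda))}\right|^2\rho_\delta(\alpha)\rho_\delta(\beta)\,d\alpha\,d\beta.$$
We would like to proceed as in the proof of Proposition 4 but now taking of Fourier transforms has to be replaced by integration by parts. Using the analysis of the differences of $\tilde\delta_k^{\alpha,\beta}$ presented there we need to estimate
$$\frac{1}{\lambda^2}\sum_{\ell_1\ne0}\sum_{\ell_2\ne0}\frac{\ell_1}{\lambda}\,g\left(\frac{\ell_1}{\lambda}\right)\frac{\ell_2}{\lambda}\,g\left(\frac{\ell_2}{\lambda}\right)\langle\ell_1/\lambda\rangle^2\langle\ell_2/\lambda\rangle^2\,I(\lambda,\ell_1,\ell_2), \tag{4.10}$$
where
$$I(\lambda,\ell_1,\ell_2) = \sum_{m_1=1}^{2[\lambda]}\sum_{m_2=1}^{2[\lambda]}\sum_{\substack{h_1=-[\lambda]\\ h_1\ne0}}^{[\lambda]}\sum_{\substack{h_2=-[\lambda]\\ h_2\ne0}}^{[\lambda]}\langle\ell_1h_1-\ell_2h_2\rangle^{-2}\left\langle\lambda^{-1}\big(\ell_2h_2\,\xi_{\lambda,h_2}(m_2)-\ell_1h_1\,\xi_{\lambda,h_1}(m_1)\big)\right\rangle^{-2},$$
and where, as is usual, we write $\langle x\rangle = (1+x^2)^{\frac12}$. We note that we can absorb the terms $\langle\ell_1/\lambda\rangle^2$ and $\langle\ell_2/\lambda\rangle^2$ into the $g$ terms. We first observe that
$$\sum_{m\in\mathbb{Z}}\langle A-Bm\rangle^{-2} = O(1)\max\left(1,\frac{1}{|B|}\right),$$
uniformly in $A\in\mathbb{R}$. In fact, for $|B|\le1$ this follows from the comparison with the integral using the Euler-MacLaurin formula
$$\sum_{n=-\infty}^{\infty}f(n) = \int_{-\infty}^{\infty}f(x)\,dx + O\left(\int_{-\infty}^{\infty}|f''(x)|\,dx\right),$$
and for $|B|>1$ we can write the sum as $\sum_k\big(1+B^2(A/B-[A/B]-k)^2\big)^{-1} = O(1)$. Using this and (4.6) we see that
$$\sum_{m\ge1}\langle A-B\,\xi_{\lambda,h}(m)\rangle^{-2}\le C\sum_{m\ge1}\langle Bm-B\,\xi_{\lambda,h}^{-1}(A/B)\rangle^{-2}\le C\max\left(1,\frac{1}{|B|}\right).$$
Hence, uniformly in $\ell_2,h_2$,
$$\sum_{m_1}\left\langle\lambda^{-1}\big(\ell_2h_2\,\xi_{\lambda,h_2}(m_2)-\ell_1h_1\,\xi_{\lambda,h_1}(m_1)\big)\right\rangle^{-2} = O(1)\max\left(1,\frac{\lambda}{|h_1\ell_1|}\right). \tag{4.11}$$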
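The uniform bound on the lattice sum $\sum_{m}\langle A-Bm\rangle^{-2}$ is easy to probe numerically. The sketch below is an illustration under stated assumptions rather than a proof: it truncates the sum at a finite number of terms (a hypothetical cutoff, so the values slightly underestimate the full sum) and divides by the claimed size $\max(1,1/|B|)$.

```python
def bracket(x):
    """Japanese bracket <x> = (1 + x^2)^(1/2)."""
    return (1.0 + x * x) ** 0.5


def lattice_sum(A, B, terms=100000):
    """Truncated sum over |m| <= terms of <A - B*m>^(-2)."""
    return sum(bracket(A - B * m) ** -2 for m in range(-terms, terms + 1))


def ratio(A, B):
    """lattice_sum divided by the claimed size max(1, 1/|B|)."""
    return lattice_sum(A, B) / max(1.0, 1.0 / abs(B))


if __name__ == "__main__":
    for B in (0.05, 0.5, 1.0, 3.0, 10.0):
        for A in (0.0, 0.3, 7.7):
            # The ratio stays bounded independently of A and B.
            print(B, A, ratio(A, B))
```

For small $|B|$ the sum behaves like the integral $\int\langle A-Bx\rangle^{-2}dx = \pi/|B|$, while for large $|B|$ only the lattice point nearest to $A/B$ contributes substantially.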
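Proceeding as in the proof of Proposition 4 we introduce $\tilde F(\lambda,\ell_2,p)$ as the number of $(\ell_1,h_1,h_2)$ satisfying $\ell_1h_1 = \ell_2h_2+p$. We now have
$$\tilde F(\lambda,\ell_2,p)\le4\sum_{\substack{0\ne|h_2|\le\lambda\\ \ell_2h_2+p\ne0}}d(|\ell_2h_2+p|),$$
and
$$\begin{aligned}
\tilde G(\lambda,N)&\overset{\mathrm{def}}{=}\sum_{p=-\infty}^{\infty}\ \sum_{|\ell_2|\le N}\tilde F(\lambda,\ell_2,p)\,\langle p\rangle^{-2}\\
&\le C_1\sum_{p=-\infty}^{\infty}\ \sum_{\substack{1\le|n|\le N\lambda\\ n+p\ne0}}d(|n|)\,d(|n+p|)\,\langle p\rangle^{-2}\\
&\le C_2\left(\sum_{n=1}^{N\lambda}d(n)^2\right)^{\frac12}\sum_{p=0}^{\infty}\langle p\rangle^{-2}\left(\sum_{m=1}^{N\lambda+p}d(m)^2\right)^{\frac12}\\
&\le C_3\big(\lambda N(\log\lambda+\log N)^3\big)^{\frac12}\sum_{p=0}^{\infty}\langle p\rangle^{-2}\big((\lambda N+p)\log^3(\lambda N+p)\big)^{\frac12}\\
&\le C\lambda N(\log\lambda+\log N)^3.
\end{aligned}$$
Using (4.11) we can estimate (4.10) by:
$$C\max\big((1+|\xi|)|g(\xi)|\big)\,\lambda^{-3}\int\big(\tilde G(\lambda,\lambda|\xi|)+1\big)\,|\partial_\xi(\xi g)(\xi)|\,d\xi\le C_g\,\frac{\log^3\lambda}{\lambda},$$
completing the proof of the proposition. □

Remark. As was emphasized by the referee, Propositions 4 and 5 are essentially sharp. The “diagonal” solutions $j_1 = j_2$, $k_1 = k_2$, $\ell_1 = \ell_2$ give a lower bound $1/\lambda$. Hence, no essential improvement of the Theorem in Sect. 1 is possible by this method.

5. Construction of the Family of Surfaces

To prove the main theorem we need to construct a family of surfaces, $G = \{g^{\alpha,\beta}\}$, for which, in the expansion of the phase shifts (4.8),
$$\psi^{\alpha,\beta}(x) = \alpha x+\beta\Phi(x),\qquad |\Phi''|_{(\epsilon,1-\epsilon)}|>C_\epsilon>0,\qquad (\alpha,\beta)\in(\alpha_0-\gamma,\alpha_0+\gamma)\times(-\gamma,\gamma), \tag{5.1}$$
so that we can apply Propositions 4 and 5. Recalling Proposition 2, this means that we want to find $a^{\alpha,\beta}$ satisfying (2.1)-(2.4), and such that
$$\psi^{\alpha,\beta}(x) = \frac{1}{\pi}\int_0^\infty\left(\left(1-\frac{x^2}{a^{\alpha,\beta}(r)^2}\right)_+^{\frac12}-(1-x^2)^{\frac12}\right)dr = \alpha x+\beta\Phi(x), \tag{5.2}$$
with $\Phi$ convex or concave.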
We will now skip the indices $\alpha$ and $\beta$. If we write
$$W(r)\overset{\mathrm{def}}{=}\frac{1}{a(r)^2}-1,$$
and
$$\phi(x)\overset{\mathrm{def}}{=}\int_0^\infty\left(\big(1-x^2W(r)\big)_+^{\frac12}-1\right)dr, \tag{5.3}$$
then
$$\psi(x) = \frac{1}{\pi}\sqrt{1-x^2}\ \phi\left(\frac{x}{\sqrt{1-x^2}}\right),$$
and we might study the simpler function $\phi$ instead. From the assumptions on $a$, $W$ is monotonically decreasing and $r^2W(r)$ is smooth and non-zero at $r = 0$. Hence there exists a smooth monotonically increasing function, $y(r)$, such that
$$W(r) = \frac{1}{y(r)^2}.$$
Since we can write $r$ as a function of $y$ we define
$$F(y)\overset{\mathrm{def}}{=}\frac{dr}{dy}(y).$$
That way we can express $\phi(x)$ as a linear transform of $F$:
$$\phi(x) = x\,I(F)(x),\qquad I(F)(x) = \int_0^\infty\left(\left(1-\frac{1}{y^2}\right)_+^{\frac12}-1\right)F(xy)\,dy. \tag{5.4}$$
From this we immediately get a linear model corresponding to $F(x)\equiv t>0$:
$$a(r)^2 = \frac{r^2}{t^2+r^2}\ \Longrightarrow\ \psi(x) = \frac{\sqrt{1-x^2}}{\pi}\,\phi\left(\frac{x}{\sqrt{1-x^2}}\right) = -\frac12\,tx,$$
since
$$\int_0^\infty\left(\big(1-y^{-2}\big)_+^{\frac12}-1\right)dy = -\pi/2.$$
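The value $-\pi/2$ of the last integral can be checked with a few lines of numerics. The sketch below is only a verification aid with a hypothetical function name: on $(0,1)$ the integrand equals $-1$, and on $(1,\infty)$ the substitution $y = 1/u$ turns it into $-1/(1+\sqrt{1-u^2})$ on $(0,1)$, which is integrated with the midpoint rule.

```python
import math


def linear_model_integral(n=200000):
    """Approximate the integral over (0, inf) of (1 - y^-2)_+^(1/2) - 1.

    On 0 < y < 1 the positive part vanishes, so that piece contributes -1.
    On y > 1 substitute y = 1/u; the integrand becomes
    (sqrt(1 - u^2) - 1) / u^2 = -1 / (1 + sqrt(1 - u^2)), written in the
    second form to avoid cancellation for small u.
    """
    du = 1.0 / n
    tail = 0.0
    for i in range(n):
        u = (i + 0.5) * du                      # midpoint of the i-th cell
        tail += -1.0 / (1.0 + math.sqrt(1.0 - u * u)) * du
    return -1.0 + tail


if __name__ == "__main__":
    print(linear_model_integral(), -math.pi / 2.0)
```

The exact value of the tail is $1-\pi/2$, which follows from the antiderivative $\sqrt{y^2-1}-\operatorname{arcsec}y-y$ on $y>1$.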
Remark. The surfaces defined using the linear model do not have uniform pair correlation measures. In that case we can compute the leading contribution to $E_\lambda(f)$ directly. To apply Propositions 4 and 5 we need to have the linear term in $\psi$ and that forces the singularity at 0 for our surfaces: only one value of $\alpha$ corresponds to a smooth surface.

We want to introduce the convex or concave term in $\psi$ by perturbing the case $F\equiv\mathrm{const}$. For that let us first establish some simple properties of the transform $F\mapsto I(F)$. We denote by $C_b^\infty$ spaces of smooth functions with bounded derivatives and by $S^k_{\mathrm{phg}}$ spaces of poly-homogeneous (classical) symbols.
Lemma 2. For $g\in C_b^\infty([0,\infty))$
$$I(g)(x)\in C_b^\infty([0,\infty))+x\log x\,C_b^\infty([0,\infty)).$$
When $g\in S^{-2}_{\mathrm{phg}}([0,\infty))$ then
$$I(g)|_{[1,\infty)}\in S^{-1}_{\mathrm{phg}}([1,\infty)),\qquad I(g)(x)\sim-\frac{1}{x}\int_0^\infty g(y)\,dy+\sum_{k=2}^\infty\frac{g_k}{x^k},\quad x\longrightarrow\infty.$$

Proof. To prove the first part of the lemma we write
$$I(g)(x) = \int_0^C\left(\big(1-y^{-2}\big)_+^{\frac12}-1\right)g(xy)\,dy+\int_C^\infty\left(\big(1-y^{-2}\big)^{\frac12}-1\right)g(xy)\,dy,\qquad C>1,$$
where the first term on the right-hand side is clearly in $C_b^\infty([0,\infty))$. In the second term, the integrand can be rewritten as
$$\left(-\frac12\,\frac{1}{y^2}-\frac18\,\frac{1}{y^4}-\cdots\right)g(xy)\,dy.$$
Thus we are concerned with integrals of the form
$$\int_C^\infty\frac{1}{y^k}\,g(xy)\,dy = x^{k-1}\int_1^\infty Y^{-k}g(Y)\,dY+x^{k-1}\int_{Cx}^1Y^{-k}g(Y)\,dY,$$
where the first term is smooth and uniformly bounded in $k$. To study the second term we write
$$g(Y) = g_0+g_1Y+\cdots+g_{l-1}Y^{l-1}+\tilde g_l(Y)\,Y^l,$$
which gives
$$\frac{1}{k-1}C^{-k+1}g_0+\frac{1}{k-2}C^{-k+2}g_1x+\cdots+\frac{1}{k-l}C^{-k+l}g_{l-1}x^{l-1}+x^lF_{k,l,C}(x),$$
for $k>l$ and
$$\frac{1}{k-1}C^{-k+1}g_0+\frac{1}{k-2}C^{-k+2}g_1x+\cdots+g_{k-1}x^{k-1}\log x+x^{k-1}G_{k,l,C}(x),$$
for finitely many $k\le l$. Since we check that
$$G_{k,l,C}(x)\in C^\infty([0,1/C)),\ k\le l,\qquad |F_{k,l,C}(x)|\le G_l\big(C^{-k+l+1}+|x|^{k-l-1}\big),\ k>l,$$
we can sum up the contributions from different $k$’s ($C>1$, $|x|\le1$, and we use the uniform convergence of $(1-z)^{\frac12}$). Thus for every $l$ we obtain
$$I(g)(x) = h_{1,l}(x)+x\log x\,h_{2,l}(x)+O(x^l),\qquad h_{1,l},\,h_{2,l}\in C^\infty,$$
and consequently $I(g)\in C_b^\infty([0,\infty))+x\log x\,C_b^\infty([0,\infty))$.
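The second part of the lemma is even more clear. If for large $Y$, $g(Y)\sim\sum_{k=2}^\infty G_kY^{-k}$, then
$$\int_0^\infty\left(\big(1-y^{-2}\big)_+^{\frac12}-1\right)g(xy)\,dy\ \sim_{x\to\infty}\ -\frac{1}{x}\int_0^\infty g(Y)\,dY+\sum_{k=2}^\infty\frac{G_k}{x^k}\int_1^\infty\left(1-\frac{1}{y^2}\right)^{\frac12}\frac{dy}{y^k}.\ \Box$$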
The lemma shows that we cannot expect smoothness of $\psi(x)$ at the end points $x = 0,1$ but that the function is very well behaved in the interior, as in any case is implicit in Proposition 2. Having discussed the general properties of the transform $I$ we now state a straightforward

Lemma 3. For $\Phi(x) = x\,I(f)(x/\sqrt{1-x^2})$ and $0<x(z) = z/\sqrt{1+z^2}<1$ we have
$$\Phi''(x(z)) = (1+z^2)^{\frac32}\cdot\int_0^\infty\left(\big(1-y^{-2}\big)_+^{\frac12}-1\right)\Big((2+3z^2)\,y\,f'(yz)+zy^2(1+z^2)\,f''(zy)\Big)\,dy.$$
Guided by the two lemmas we can easily construct a family of surfaces for which $\psi$ has the needed properties. We want to find $f\in C_b^\infty(\mathbb{R})$ such that for $\beta$ small enough and $\alpha$ close to $\alpha_0$, $F^{\alpha,\beta}(y) = \alpha+\beta f(y)>0$, and so that $a(r)$ obtained from inverting the process described above has the properties (2.1)-(2.4). This is easily achieved by demanding that $f$ is a symbol of order $-2$ on $[0,\infty)$. We also want $\Phi''(x)$ described in Lemma 3 to have a fixed sign. From the formula we see that $\Phi$ is concave if
$$a\,f'(y)+y\,f''(y)>0,\quad y>0,\quad a = 2,3. \tag{5.5}$$
In fact, the integrand has the same sign as
$$\Big((2+3z^2)\,f'(Y)+(1+z^2)\,Y f''(Y)\Big)\Big|_{Y=yz},$$
and that is positive for any $z$ if (5.5) holds. We can summarize this discussion in

Proposition 6. For $\alpha$ in a neighbourhood of a fixed $\alpha_0<0$ and for $\beta$ small enough, let $a_f^{\alpha,\beta}(r)$ be obtained from $f\in S^{-2}_{\mathrm{phg}}([0,\infty))$ by the following procedure:
$$F^{\alpha,\beta}(y) = -2\alpha-\beta f(y),\qquad \frac{dy^{\alpha,\beta}}{dr} = \big(F^{\alpha,\beta}(y^{\alpha,\beta}(r))\big)^{-1},\quad y^{\alpha,\beta}(0) = 0,\qquad a^{\alpha,\beta}(r) = y^{\alpha,\beta}(r)\big(1+y^{\alpha,\beta}(r)^2\big)^{-\frac12}.$$
Then $a_f^{\alpha,\beta}$ has the properties (2.1)–(2.5) and for the set of two parameter families of surfaces
$$G = \Big\{(X,g_f^{\alpha,\beta})\ :\ g_f^{\alpha,\beta} = dr^2+a_f^{\alpha,\beta}(r)\,d\theta^2,\ f\in S^{-2}_{\mathrm{phg}}([0,\infty))\text{ satisfies (5.5)},\ |\alpha-\alpha_0|<\delta_f,\ |\beta|<\epsilon_f\Big\}$$
the leading part of the phase shifts depends linearly on $\alpha$ and $\beta$ and (5.1) holds.
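The procedure in Proposition 6 is an initial value problem and can be followed numerically. The sketch below is a minimal illustration under stated assumptions: it integrates $dy/dr = 1/F(y)$, $y(0) = 0$, with a classical fourth-order Runge-Kutta step and then forms $a = y(1+y^2)^{-1/2}$. The function name, the step count, and the sample parameter values $\alpha = -1$, $\beta = 0.1$, $\rho = 0.5$ are hypothetical; no claim is made that these values satisfy the hypotheses of the proposition.

```python
import math


def profile(F, r_max, steps=2000):
    """Integrate dy/dr = 1 / F(y), y(0) = 0, and return a list of (r, y, a).

    `F` must be positive along the solution. The profile function is
    a = y / sqrt(1 + y^2), so that 1/a^2 - 1 = 1/y^2.
    """
    h = r_max / steps
    y = 0.0
    out = [(0.0, 0.0, 0.0)]
    for i in range(steps):
        k1 = 1.0 / F(y)
        k2 = 1.0 / F(y + 0.5 * h * k1)
        k3 = 1.0 / F(y + 0.5 * h * k2)
        k4 = 1.0 / F(y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        r = (i + 1) * h
        out.append((r, y, y / math.sqrt(1.0 + y * y)))
    return out


if __name__ == "__main__":
    # Linear model F = t (constant): a(r) = r / sqrt(t^2 + r^2).
    t = 2.0
    r, y, a = profile(lambda y: t, 3.0)[-1]
    print(a, r / math.sqrt(t * t + r * r))

    # A perturbed F with hypothetical sample parameters.
    alpha, beta, rho = -1.0, 0.1, 0.5
    perturbed = profile(lambda y: -2.0 * alpha - beta / (1.0 + rho * y * y), 3.0)
    print(perturbed[-1])
```

For constant $F$ the computed profile reproduces the closed form of the linear model, which gives a quick consistency check before a perturbation is switched on.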
Combined with Propositions 2 and 5 this provides an infinite dimensional family of perturbations of the linear model each giving a two parameter family of surfaces for which the theorem of Sect. 1 holds. Perhaps the simplest example is obtained by putting
$$F^{\alpha,\beta}(y) = -2\alpha-\beta\,\frac{1}{1+\rho y^2},\qquad 0<\rho\le\frac23.$$

Proof of the Main Theorem. Proposition 6 guarantees that the leading parts of the expansions of the phase shifts of $(X,g^{\alpha,\beta})$ satisfy the assumptions of Proposition 5. Let $\rho_\lambda^{\epsilon,\alpha,\beta}$ denote the pair correlation measure for $(X,g^{\alpha,\beta})$ given by (1.2). Recalling (4.1) and the discussion preceding it we see that for all $f\in\mathcal{S}(\mathbb{R})$,
$$\rho_\lambda^{\epsilon,\alpha,\beta}(f) = \hat f(0)+f(0)+o_{\lambda\to\infty}(1)+E_\lambda(f;\alpha,\beta).$$
Proposition 5 now shows that the assumptions of Lemma 1 are satisfied with $F(\lambda) = \log^3\lambda/\lambda$ and that lemma gives the statement of the Main Theorem. □

Acknowledgements. This note originated in a discussion between U. Smilansky and the first author at the Newton Institute during the program on Quantum Chaos and Disordered Systems in 1997 following a talk on the results of [18]. Smilansky pointed out that the formal WKB formula for phase shifts of a surface of revolution with cylindrical end closely resembled the WKB formula for eigenvalues of an integrable quantum map on $S^2$ and proposed the problem of proving that the scaled phase shifts exhibit Poisson statistics. We would like to thank U. Smilansky for bringing this to our attention. The first author would also like to thank the National Science Foundation for partial support under the grant DMS-9703775. The second author would like to thank the National Science and Engineering Research Council of Canada for partial support. Both authors are grateful for the hospitality of the Erwin Schrödinger Institute where part of this work was done.
References
1. Arnold, V.I.: Mathematical Methods of Classical Mechanics. Graduate Texts in Mathematics 60, Berlin–Heidelberg–New York: Springer-Verlag, 1989
2. Berry, M.V. and Tabor, M.: Level clustering in regular spectrum. Proc. Roy. Soc. Lond. Ser. A 356, 375–394 (1977)
3. Besse, A.: Manifolds all of whose geodesics are closed. Berlin–Heidelberg–New York: Springer-Verlag, 1978
4. Bohigas, O.: Random matrix theories and chaotic dynamics. In: Chaos et Physique Quantique, Les Houches LII, M.-J. Giannoni, A. Voros and J. Zinn-Justin eds., London: Elsevier, 1991, pp. 87–199
5. Eskin, A., Margulis, G. and Mozes, S.: Upper bounds and asymptotics in a quantitative version of the Oppenheim conjecture. Ann. of Math. 147, 93–141 (1998)
6. Colin de Verdière, Y.: Spectre conjoint d’operateurs pseudo-differentiels qui commutent I: Le cas integrable. Math. Zeit. 171, 51–73 (1980)
7. Hardy, G.H. and Wright, E.M.: An Introduction to the Theory of Numbers. Oxford: Oxford University Press, 1979
8. Christiansen, T.: Scattering theory for manifolds with asymptotically cylindrical ends. J. Func. Anal. 131, 499–530 (1995)
9. Melrose, R.B.: Geometric scattering theory. Cambridge: Cambridge University Press, 1995
10. Olver, F.W.J.: Asymptotics and Special Functions. Wellesley, MA: A K Peters, 1997
11. Ramond, Th.: Semiclassical study of quantum scattering on the line. Commun. Math. Phys. 177, 221–254 (1996)
12. Rudnick, Z. and Sarnak, P.: The pair correlation function of fractional parts of polynomials. Preprint, 1997
13. Sarnak, P.: Arithmetic quantum chaos. Schur lectures, Israel Math. Conf. Proc. 8, (1995)
14. Sarnak, P.: Values at integers of binary quadratic forms. C.M.S. Conf. Proc. 21, Providence, R.I.: AMS, 1997, pp. 181–203
15. Smilansky, U.: The classical and quantum theory of chaotic scattering. In: Chaos et Physique Quantique, Les Houches LII, M.-J. Giannoni, A. Voros and J. Zinn-Justin eds., London: Elsevier, 1991, pp. 371–441
16. Smilansky, U.: Private communication. 1997
17. Vanderkam, J.: Pair correlation of four-dimensional flat tori. Duke Math. J. 97 (2), 413–438 (1999)
18. Zelditch, S.: Level spacings for integrable quantum maps in genus zero. Commun. Math. Phys. 196, 289–318 (1998)

Communicated by P. Sarnak