Commun. Math Phys. 184, 1 – 25 (1997)
Communications in Mathematical Physics © Springer-Verlag 1997
Remarks on Quantum Integration
Chryssomalis Chryssomalakos
Laboratoire de Physique Théorique ENSLAPP*, Chemin de Bellevue BP 110, F-74941 Annecy-le-Vieux Cedex, France. E-mail:
[email protected] Received: 2 February 1996 / Accepted: 20 July 1996
* URA 14-36 du CNRS, associée à l'Ecole Normale Supérieure de Lyon et à l'Université de Savoie

Abstract: We give a general integration prescription for finite dimensional braided Hopf algebras, deriving the N-dimensional quantum superplane integral as an example. The transformation properties of the integral on the quantum plane are found. We also discuss integration on quantum group modules that lack a Hopf structure.

1. Introduction

The emergence of Hopf algebras, during the last decade, as a promising framework within which new physical symmetries can be accommodated, has prompted an interest in the theory and techniques of integration on them. Similar remarks hold for braided Hopf algebras which, more recently, have provided a still further extension of the classical concept of a group, marrying quantization with nontrivial statistics. Integrals on (finite dimensional) Hopf algebras have been studied extensively, see for example [12, 19] and references therein. For the braided case see the treatment in [13, 14] for the basics of the theory – some examples appear in [3, 10].

From the point of view of a physicist who is interested in the basic elements of the theory and in applications, the situation presents certain problems. The results are generally scattered and, when they do become available, the disparity of the methods employed in them prevents the formation of a clear picture of the minimum background required to explore the field. When it comes to applications, results are extracted (often ingeniously) from particular properties of individual examples – no general integration prescription seems to be available for the quantum space wanderers. The closest one can come to such a prescription in the literature is perhaps the trace formula of Radford and Larson [11] (and a braided analogue of it in [18]) which, however, occasionally returns trivial (i.e. identically zero) results (there does not seem to exist a description of when exactly it fails either).

Our purpose in this paper is thus basically twofold. On the one hand, we aim at providing simple, self-contained proofs of basic results, using only the Hopf algebra axioms we assume the reader to be familiar with, attempting in this way a demonstration of what can be accomplished with a rather minimal set of tools. On the other hand, addressing the problem of the missing integration prescription, we give an explicit formula for the integral on any finite dimensional braided Hopf algebra (FDBHA) and show that it is always nontrivial, commenting along the way on the conditions under which the trace formula fails.

The paper is structured as follows: in Sect. 2 we present the notation we use and collect some basic formulas we need in subsequent proofs. Section 3 starts with background information on integrals on (nonbraided) Hopf algebras. We then give a modified trace formula for the integral and prove its nontriviality. We also introduce a “vacuum expectation value” approach to integration, discuss properties of right Fourier transforms and prove a number of useful formulas. Section 4 supplies the braided version of the modified trace formula and the vacuum projectors and discusses, as an example, the integral on the N-dimensional quantum superplane. Also included are some comments on the transformation properties of the integral on the N-dimensional quantum plane. The last section provides an integration prescription (with some modest assumptions) for quantum group modules that lack a braided Hopf structure.

2. Hopf Algebras

The language used in the following is predominantly that of Hopf algebras – we refer the reader to [1, 15, 22] for an introduction to the subject. Concerning the notation, we denote by $\Delta$, $\epsilon$, $S$ the coproduct, counit and antipode respectively and by $\Delta_A$, ${}_U\Delta$ the right $A$- and left $U$-coactions respectively. Sweedler-like conventions are employed throughout; thus $\Delta a = a_{(1)} \otimes a_{(2)}$, $(\Delta \otimes \mathrm{id}) \circ \Delta(a) = a_{(1)} \otimes a_{(2)} \otimes a_{(3)}$ etc. Also, $\Delta_A(x) = x^{(1)} \otimes x^{(2')}$, ${}_U\Delta(a) = a^{(\bar 1)} \otimes a^{(2)}$ and, for example, $(\mathrm{id} \otimes \Delta) \circ \Delta_A(x) = x^{(1)} \otimes x^{(2')}{}_{(1)} \otimes x^{(2')}{}_{(2)} = x^{(1)} \otimes x^{(2')} \otimes x^{(3')}$ and so on. By $A$ we will generally denote a function type Hopf algebra (its elements will be denoted by $a$, $b$, etc.); $U$ will stand for its dual Hopf algebra (universal enveloping algebra type) with elements $x$, $y$ etc. The duality is via a nondegenerate inner product $\langle \cdot, \cdot \rangle$ that relates the algebra structure in $A$ with the coalgebra structure in $U$ and vice versa. The universal R-matrix is denoted by $\mathcal{R} = \mathcal{R}^{(1)} \otimes \mathcal{R}^{(2)}$; $\mathcal{R}'$ stands for $\tau(\mathcal{R})$ with $\tau(a \otimes b) = b \otimes a$. $\mathcal{R}$ satisfies

$$\Delta'(x) = \mathcal{R}\,\Delta(x)\,\mathcal{R}^{-1}, \qquad \forall x \in U, \tag{1}$$

as well as

$$(\Delta \otimes \mathrm{id})\mathcal{R} = \mathcal{R}_{13}\mathcal{R}_{23}, \tag{2}$$
$$(\mathrm{id} \otimes \Delta)\mathcal{R} = \mathcal{R}_{13}\mathcal{R}_{12} \tag{3}$$

(with $\mathcal{R}_{13} \equiv \mathcal{R}^{(1)} \otimes 1_U \otimes \mathcal{R}^{(2)}$ etc.). Given a pair of dual Hopf algebras, one can construct their semidirect product $A \rtimes U$ with $A$, $U$ trivially embedded in it and cross relations

$$\begin{aligned}
xa &= a_{(1)} \langle x_{(1)}, a_{(2)} \rangle\, x_{(2)}, \\
ax &= x_{(2)} \langle x_{(1)}, S^{-1}(a_{(2)}) \rangle\, a_{(1)}.
\end{aligned} \tag{4}$$

The above commutation relations guarantee that [24]

$$A \otimes U \ni a \otimes x \neq 0 \;\Rightarrow\; xa \neq 0 \tag{5}$$
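with $xa \in A \rtimes U$. The action $\triangleright$ of $U$ on $A$ is given by $x \triangleright a = a_{(1)} \langle x, a_{(2)} \rangle$. The same symbol will denote the (adjoint) action of $U$ on $U$: $x \triangleright y = x_{(1)}\, y\, S(x_{(2)}) = y^{(1)} \langle x, y^{(2')} \rangle$. The canonical element in $U \otimes A$ is written as $C = e_i \otimes f^i$ with $\{e_i\}$, $\{f^i\}$ dual (in the sense that $\langle e_i, f^j \rangle = \delta_i^j$) linear bases in $U$, $A$ respectively. It holds

$$\begin{aligned}
(\Delta \otimes \mathrm{id})C &= C_{13}C_{23}, & (\mathrm{id} \otimes \Delta)C &= C_{12}C_{13}, \\
(\epsilon \otimes \mathrm{id})C &= 1, & (\mathrm{id} \otimes \epsilon)C &= 1, \\
(S \otimes \mathrm{id})C &= C^{-1}, & (\mathrm{id} \otimes S)C &= C^{-1},
\end{aligned} \tag{6}$$

as well as

$$\begin{aligned}
\Delta_A(a) \equiv \Delta(a) &= C(a \otimes 1)C^{-1}, & \Delta_A(x) &= C(x \otimes 1)C^{-1}, \\
{}_U\Delta(x) \equiv \Delta(x) &= C^{-1}(1 \otimes x)C, & {}_U\Delta(a) &= C^{-1}(1 \otimes a)C.
\end{aligned} \tag{7}$$

Either of (4) encodes the information about the inner product $\langle x, a \rangle$. To make this precise, we introduce $U$- and $A$-right vacua, denoted by $|\Omega_U\rangle$ and $|\Omega_A\rangle$ respectively, which satisfy [3]

$$x|\Omega_U\rangle = \epsilon(x)|\Omega_U\rangle, \qquad a|\Omega_A\rangle = \epsilon(a)|\Omega_A\rangle.$$

Left vacua $\langle\Omega_U|$, $\langle\Omega_A|$ are defined analogously. In terms of these, the inner product $\langle x, a \rangle$ can be given as the “expectation value”

$$\langle\Omega_A| xa |\Omega_U\rangle = \langle\Omega_A| a_{(1)} \langle x_{(1)}, a_{(2)} \rangle x_{(2)} |\Omega_U\rangle = \langle x, a \rangle \tag{8}$$

if we normalize the vacua so that $\langle\Omega_A|\Omega_U\rangle = \langle\Omega_U|\Omega_A\rangle = 1$. Similarly, the adjoint action of $U$ on $A$ can be written as

$$xa|\Omega_U\rangle = x \triangleright a\,|\Omega_U\rangle. \tag{9}$$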
3. Integration on Hopf Algebras
3.1. Background. We list here known results about invariant integrals on Hopf algebras that we will use later – more details can be found in [22, 1]. Some of the proofs are also supplied in order to familiarize the reader with the usage of the tools presented in Sect. 2. To prevent potential divergence problems from distracting the formulation of concepts, we deal throughout with finite dimensional Hopf algebras – some of the results though retain their validity in the infinite dimensional case as well.
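We start with two dually paired Hopf algebras $U$, $A$ with (finite) dual bases $\{e_i\}$, $\{f^j\}$ respectively. We define a right (invariant) integral in $A$ as a map $\langle \cdot \rangle^R : A \to k$ with the property

$$\langle a_{(1)} \rangle^R\, a_{(2)} = \langle a \rangle^R\, 1_A \tag{10}$$

for all $a$ in $A$. We call $\langle \cdot \rangle^R$ trivial if all $\langle f^i \rangle^R$ are zero. Left (invariant) integrals are similarly defined via

$$a_{(1)} \langle a_{(2)} \rangle^L = 1_A \langle a \rangle^L. \tag{11}$$

As we shall soon see, when $\langle 1_A \rangle^R \neq 0$ (or $\langle 1_A \rangle^L \neq 0$), left and right integrals are proportional and can therefore be normalized so that they coincide. One can now introduce the element $\delta_U^R \in U$ (the right delta function in $U$) which implements the right integral in $A$ via

$$\langle \delta_U^R, a \rangle = \langle a \rangle^R \tag{12}$$

(so that $\epsilon(\delta_U^R) = \langle 1_A \rangle^R$). This allows us to write

$$\delta_U^R = \langle f^i \rangle^R e_i \tag{13}$$

(in the mathematics literature $\delta_U^R$ is often called a right integral in $U$ – we will not use this terminology here). For arbitrary $a$ in $A$ we have

$$\begin{aligned}
\langle \delta_U^R x, a \rangle &= \langle \delta_U^R, a_{(1)} \rangle \langle x, a_{(2)} \rangle \\
&= \langle a_{(1)} \rangle^R \langle x, a_{(2)} \rangle \\
&= \langle a \rangle^R \langle x, 1_A \rangle \\
&= \epsilon(x) \langle a \rangle^R \\
&= \langle \delta_U^R \epsilon(x), a \rangle,
\end{aligned}$$

therefore

$$\delta_U^R\, x = \delta_U^R\, \epsilon(x) \qquad \forall x \in U. \tag{14}$$

Taking antipodes in the above equation we find ($\epsilon(x) = \epsilon(S(x))$)

$$S(x) S(\delta_U^R) = \epsilon(S(x)) S(\delta_U^R) \tag{15}$$

which, for invertible $S$, gives

$$x S(\delta_U^R) = \epsilon(x) S(\delta_U^R) \qquad \forall x \in U, \tag{16}$$

in other words, $S(\delta_U^R)$ implements a left integral and we can therefore take $S(\delta_U^R) = \delta_U^L$. Since $\epsilon(S(x)) = \epsilon(x)$, we find that $\langle 1_A \rangle^R = \langle 1_A \rangle^L \equiv \langle 1_A \rangle$ ($\langle \cdot \rangle$ denotes a bi-invariant integral). Consider now the product $\delta_U^R S(\delta_U^R)$. We have

$$\delta_U^R S(\delta_U^R) = \delta_U^R\, \epsilon(S(\delta_U^R)) = \delta_U^R\, \epsilon(\delta_U^R) = \delta_U^R \langle 1_A \rangle$$

and also

$$\delta_U^R S(\delta_U^R) = \epsilon(\delta_U^R) S(\delta_U^R) = \langle 1_A \rangle S(\delta_U^R),$$

therefore, in the unimodular case where $\langle 1_A \rangle \neq 0$ (so that we can normalize $\langle 1_A \rangle = 1$), it holds

$$\delta_U^R = S(\delta_U^R), \tag{17}$$
$$x \delta_U^R = \epsilon(x) \delta_U^R \qquad \forall x \in U, \tag{18}$$
$$\langle a \rangle = \langle S(a) \rangle \qquad \forall a \in A. \tag{19}$$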
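The invariance conditions (10) and (11) can be checked numerically in the simplest classical setting. The sketch below is only an illustration and is not taken from the paper: it assumes $A$ is the algebra of functions on the finite group $S_3$, with coproduct $\Delta(f)(g, h) = f(gh)$ and candidate integral $\langle f \rangle = \sum_g f(g)$. All function names are hypothetical.

```python
from itertools import permutations

# Elements of S3 as tuples: p[i] is the image of i.
G = list(permutations(range(3)))

def compose(p, q):
    """Group product p*q, acting as (p*q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(3))

def integral(f):
    """Candidate invariant integral (an assumption): sum of f over the group."""
    return sum(f[g] for g in G)

def coproduct(f):
    """Delta(f)(g, h) = f(g*h), stored as a dict on pairs of group elements."""
    return {(g, h): f[compose(g, h)] for g in G for h in G}

def right_invariance_defect(f):
    """Largest violation of (10): integrate the first leg of Delta(f)."""
    d, total = coproduct(f), integral(f)
    return max(abs(sum(d[(g, h)] for g in G) - total) for h in G)

def left_invariance_defect(f):
    """Largest violation of (11): integrate the second leg of Delta(f)."""
    d, total = coproduct(f), integral(f)
    return max(abs(sum(d[(g, h)] for h in G) - total) for g in G)

# An arbitrary integer-valued test function on S3.
f = {g: 3 * i * i - 2 * i + 1 for i, g in enumerate(G)}
print(right_invariance_defect(f), left_invariance_defect(f))
```

Both defects vanish because summing over $g$ (or $h$) with the other argument fixed is a relabelling of the group. Here $\langle 1_A \rangle = |G| \neq 0$, so this example sits in the unimodular case in which left and right integrals coincide.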
Concerning uniqueness, assume that a second right integral $\langle \cdot \rangle^{R\prime}$ exists and let $\delta_U^{R\prime}$ be the element of $U$ that implements it. We then get $\delta_U^R S(\delta_U^{R\prime}) = \langle 1_A \rangle S(\delta_U^{R\prime})$ and $\delta_U^R S(\delta_U^{R\prime}) = \delta_U^R \langle 1_A \rangle'$. For $\langle 1_A \rangle = 1$ we conclude that $\langle 1_A \rangle' = 1$ implies $\delta_U^{R\prime} = S(\delta_U^{R\prime}) = \delta_U^R$ while $\langle 1_A \rangle' = 0$ implies that $\langle \cdot \rangle'$ is trivial.

Radford and Larson [11] have shown that

$$\langle a \rangle^R_{\mathrm{tr}} \equiv \langle S^2(e_i), f^i a \rangle \tag{20}$$

defines a right integral on $A$ which though, in some cases (as [11] warns), is trivial. It is this shortcoming of (20) that (among other things) motivated our formula for the integral of the next section, which is shown to be nontrivial for any FDHA. One can derive from (20) that $\langle S^{-2}(e_i), f^i a \rangle$, $\langle S^{-2}(e_i), a f^i \rangle$ define left and right (again, in some cases trivial) integrals respectively. For $\sigma \equiv \langle S^2(e_i), f^i \rangle \neq 0$ and $\sigma' \equiv \langle S^{-2}(e_i), f^i \rangle$ we conclude

$$\langle S^{-2}(e_i), f^i a \rangle = \langle S^{-2}(e_i), a f^i \rangle = \frac{\sigma'}{\sigma} \langle a \rangle^R_{\mathrm{tr}}. \tag{21}$$

We summarize the main points: when $\langle 1_A \rangle = 1$, $\langle \cdot \rangle$ is bi-invariant, unique and Eq. (19) holds. Equation (20) defines a (sometimes trivial) right integral and when $\sigma \neq 0$, (21) holds. Considerably more is known about integrals on Hopf algebras – in the interest of self-containment we have only mentioned above what we can prove here.

3.2. A modified trace formula. We want to present now a modified version of the trace formula (20) that overcomes the limitations mentioned above. It is given by

$$\langle a \rangle^R \delta_A^R = \langle e_j S^{-2}(e_i), a f^i \rangle f^j \tag{22}$$

(for the remainder of this paper, $\langle a \rangle^R \delta_A^R$ is defined by the rhs of (22)). Notice that (22) also defines (for some nonzero $\langle a \rangle^R$ which we can normalize to 1) the right delta function in $A$. By pairing both sides of (22) with $x$ in $U$ we conclude

$$\langle x \rangle^R \langle a \rangle^R = \langle x S^{-2}(e_i), a f^i \rangle. \tag{23}$$
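The way in which the trace formula (20) can fail may be illustrated on a small example. The sketch below is an illustration under stated assumptions, not a computation from the paper: it takes Sweedler's four dimensional Hopf algebra (generators $g$, $x$ with $g^2 = 1$, $x^2 = 0$, $xg = -gx$, and $S^2(g) = g$, $S^2(x) = -x$) in the role of $A$, and reads $\langle S^2(e_i), f^i a \rangle$ as the trace of the linear map $b \mapsto S^2(ba)$. The function names are hypothetical.

```python
# Basis of Sweedler's algebra: (a, b) stands for the monomial g^a x^b.
BASIS = [(0, 0), (1, 0), (0, 1), (1, 1)]  # 1, g, x, gx

def mul(m, n):
    """Product of basis monomials using g^2 = 1, x^2 = 0, x g = -g x.

    Returns (sign, monomial), or (0, None) if the product vanishes.
    """
    (a, b), (c, d) = m, n
    if b + d > 1:
        return 0, None                       # x^2 = 0
    sign = -1 if (b == 1 and c == 1) else 1  # moving x past g costs a sign
    return sign, ((a + c) % 2, b + d)

def s_squared_eigenvalue(m):
    """S^2 is diagonal on monomials: +1 without x, -1 with x (assumed convention)."""
    return -1 if m[1] == 1 else 1

def trace_integral(a):
    """Trace of b -> S^2(b a) over the monomial basis."""
    total = 0
    for b in BASIS:
        sign, prod = mul(b, a)
        if prod == b:  # only diagonal matrix entries contribute to a trace
            total += sign * s_squared_eigenvalue(prod)
    return total

print([trace_integral(a) for a in BASIS])
```

Only $a = 1$ can contribute a diagonal entry, and there the trace reduces to $\sigma = \mathrm{tr}\, S^2 = 1 + 1 - 1 - 1 = 0$, so under these assumptions the trace integral vanishes on every basis element. This is the kind of trivial output that the modified formula (22) is designed to avoid.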
The proof of invariance is quite analogous to that of (20). What is interesting is the following

Lemma 1. The integral $\langle \cdot \rangle^R$ defined by (22) is nontrivial for any FDHA $A$.

Proof. Set $\Theta_l{}^k \equiv \langle e_l S^{-2}(e_i), f^k f^i \rangle$ and compute

$$\begin{aligned}
A \rtimes U \ni S^{-2}(e_i) f^i &= f^i_{(1)} \langle S^{-2}(e_{i(1)}), f^i_{(2)} \rangle S^{-2}(e_{i(2)}) \\
&= f^i_{(1)} f^j_{(1)} \langle S^{-2}(e_i), f^i_{(2)} f^j_{(2)} \rangle S^{-2}(e_j) \\
&= S^{-2}(f^i) f^j \langle e_i S^{-2}(e_k), f^k f^l \rangle S^{-2}(e_j) S^{-2}(e_l) \\
&= \Theta_i{}^l\, S^{-2}(f^i) f^j S^{-2}(e_j) S^{-2}(e_l).
\end{aligned} \tag{24}$$

We now employ (5) to conclude that not all $\Theta_i{}^l$ are zero (since $S^{-2}(e_i) \otimes f^i \neq 0$ in $U \otimes A$). Alternatively, one can compute directly the integral [23]
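$$\begin{aligned}
\langle S^{-1}(\delta_A^R) \rangle^R &= \langle e_i \rangle^R \langle S^{-1}(f^i) \rangle^R \\
&= \langle e_i e_j, S^{-1}(f^i) S^{-2}(f^j) \rangle \\
&= \langle e_i, S^{-1}(f^i_{(1)}) S^{-2}(f^i_{(2)}) \rangle \\
&= \langle e_i, \epsilon(f^i) 1_A \rangle \\
&= \langle 1_U, 1_A \rangle \\
&= 1,
\end{aligned} \tag{25}$$

which shows nontriviality in both $U$ and $A$ (of course, $\langle e_i \rangle^R \langle S^{-1}(f^i) \rangle^R = 1$ is a stronger statement). □

Defining $\Theta = \Theta^{(1)} \otimes \Theta^{(2)} = \Theta_k{}^l\, f^k \otimes e_l$ we get

$$\langle a \rangle^R \delta_A^R = \Theta^{(1)} \langle \Theta^{(2)}, a \rangle, \qquad \langle x \rangle^R \delta_U^R = \langle x, \Theta^{(1)} \rangle \Theta^{(2)}. \tag{26}$$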
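Related to the above proof is

Lemma 2. For $A$ a FDHA, it holds ($a$ in $A$)

$$\langle a f^i \rangle^R = 0 \;\; \forall i \;\Rightarrow\; a = 0. \tag{27}$$

Proof. Assuming $\langle a f^i \rangle^R = 0$ for all $i$ we get

$$\begin{aligned}
0 &= \langle S^{-1}(e_i) \rangle^R \langle a f^i_{(1)} \rangle^R S(f^i_{(2)}) \\
&= \langle S^{-1}(e_i) \rangle^R \langle a_{(1)} f^i_{(1)} \rangle^R a_{(2)} f^i_{(2)} S(f^i_{(3)}) \\
&= \langle S^{-1}(e_i) \rangle^R \langle a_{(1)} f^i \rangle^R a_{(2)} \\
&= \langle S^{-1}(e_i) \rangle^R \langle f^i \rangle^R a = a,
\end{aligned}$$

where, in the last line, use was made of (25). □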
There exist formulas similar to (23) for other combinations of invariance properties:

$$\begin{aligned}
\langle x \rangle^L \langle a \rangle^L &\sim \langle e_i x, S^{-2}(f^i) a \rangle, \\
\langle x \rangle^L \langle a \rangle^R &\sim \langle x S^2(e_i), f^i a \rangle, \\
\langle x \rangle^R \langle a \rangle^L &\sim \langle e_i x, a S^2(f^i) \rangle.
\end{aligned} \tag{28}$$

We could have used any of these formulas as our basic definition of the integral. When we deal with braided Hopf algebras we use in fact the analogue of the second of (28) as our starting point. The coefficient of proportionality in the above formulas depends on which function's integral we normalize to 1. The relation with the trace formula (20) is illuminated by the following

Lemma 3. For $A$, $U$ dually paired FDHAs, it holds ($a$ in $A$, $x$ in $U$)

$$\langle x \rangle^R_{\mathrm{tr}} \langle a \rangle^R_{\mathrm{tr}} = \sigma \langle x e_i, S^2(f^i) a \rangle. \tag{29}$$
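Proof.

$$\begin{aligned}
\langle x \rangle^R_{\mathrm{tr}} \langle a \rangle^R_{\mathrm{tr}}
&= \langle e_i x, S^2(f^i) \rangle \langle e_k, S^2(f^k) a \rangle \\
&= \langle e_i x_{(1)}, S^2(f^i) \rangle \langle x_{(2)} e_k, S^2(f^k) a \rangle \\
&= \langle e_i, S^2(f^i_{(1)}) \rangle \langle x_{(1)}, S^2(f^i_{(2)}) \rangle \langle x_{(2)} e_k, S^2(f^k) a \rangle \\
&= \langle e_i, S^2(f^i_{(1)}) \rangle \langle x_{(1)} e_{k(2)} S^{-1}(e_{k(1)}), S^2(f^i_{(2)}) \rangle \langle x_{(2)} e_{k(3)}, S^2(f^k) a \rangle \\
&= \langle e_i, S^2(f^i_{(1)}) \rangle \langle S^{-1}(e_{k(1)}), S^2(f^i_{(3)}) \rangle \langle x e_{k(2)}, S^2(f^i_{(2)}) S^2(f^k) a \rangle \\
&= \langle e_i e_j e_l, S^2(f^i) \rangle \langle e_{k(1)}, S(f^l) \rangle \langle x e_{k(2)}, S^2(f^j) S^2(f^k) a \rangle \\
&= \langle e_i e_j S(e_{k(1)}), S^2(f^i) \rangle \langle x e_{k(2)}, S^2(f^j) S^2(f^k) a \rangle \\
&= \langle e_i e_{j(1)} S(e_{j(2)}), S^2(f^i) \rangle \langle x e_{j(3)}, S^2(f^j) a \rangle \\
&= \langle e_i, S^2(f^i) \rangle \langle x e_j, S^2(f^j) a \rangle.
\end{aligned}$$

□

As a corollary, we infer that, when $\sigma = 0$, $\langle \cdot \rangle^R_{\mathrm{tr}}$ is trivial in $U$ or $A$ (or both); when $\sigma \neq 0$, it is nontrivial in both $U$ and $A$.

3.3. Vacuum Projectors. We give here a formulation of invariant integration in which the integral of a function is regarded as its “vacuum expectation value”. First, notice that right invariance can also be expressed as ($x$ in $U$, $a$ in $A$)

$$\langle x \triangleright a \rangle = \epsilon(x) \langle a \rangle. \tag{30}$$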
Recall now the $U$- and $A$-vacua introduced in Sect. 2. We could define, in terms of these, our “vacuum” integral via an equation like ($a \in A$) [26]

$$\langle a \rangle_v \sim \langle \Omega_U | a | \Omega_U \rangle; \tag{31}$$

invariance in the form (30) is automatically satisfied. However, as we shall soon see, it is more natural, in this case, to work instead with quantities like $|\Omega_A\rangle\langle\Omega_A|$, $|\Omega_A\rangle\langle\Omega_U|$ etc., i.e. with operators rather than states. The reason is that the former can be realized in $A \rtimes U$ and hence their properties can be derived while those of the latter have to be introduced “by hand”. We aim therefore at a definition like

$$\langle a \rangle_v \sim |\Omega_A\rangle\langle\Omega_U| a |\Omega_U\rangle\langle\Omega_A|. \tag{32}$$

We expect the rhs of (32) to be proportional to $\delta_A$ (at least, under certain conditions), consistent with its property to return the counit when multiplied by functions either from the left or from the right. What we need next is to find quantities in $A \rtimes U$ that represent the operators $|\Omega_A\rangle\langle\Omega_U|$, $|\Omega_U\rangle\langle\Omega_A|$. We recall at this point a result of [2]: the vacuum projectors $E$, $\bar E$ defined by

$$E = S^{-1}(f^i) e_i, \qquad \bar E = S^2(e_i) f^i \tag{33}$$

satisfy
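$$\begin{aligned}
E a &= S^{-1}(f^i) e_i a \\
&= S^{-1}(f^i) a_{(1)} \langle e_{i(1)}, a_{(2)} \rangle e_{i(2)} \\
&= S^{-1}(f^j) S^{-1}(f^i) a_{(1)} \langle e_i, a_{(2)} \rangle e_j \\
&= S^{-1}(f^j) S^{-1}(a_{(2)}) a_{(1)} e_j \\
&= \epsilon(a) E
\end{aligned} \tag{34}$$

for all $a$ in $A$, as well as

$$\begin{aligned}
x E &= x S^{-1}(f^i) e_i \\
&= S^{-1}(f^i_{(2)}) \langle x_{(1)}, S^{-1}(f^i_{(1)}) \rangle x_{(2)} e_i \\
&= S^{-1}(f^j) \langle x_{(1)}, S^{-1}(f^i) \rangle x_{(2)} e_i e_j \\
&= S^{-1}(f^j) x_{(2)} S^{-1}(x_{(1)}) e_j \\
&= \epsilon(x) E
\end{aligned} \tag{35}$$

for all $x$ in $U$, while we can similarly show that

$$\bar E x = \epsilon(x) \bar E \qquad \forall x \in U, \tag{36}$$
$$a \bar E = \epsilon(a) \bar E \qquad \forall a \in A. \tag{37}$$

Furthermore, $E^2 = E$ and $\bar E^2 = \bar E$, which allows us to write

$$E = |\Omega_U\rangle\langle\Omega_A|, \qquad \bar E = |\Omega_A\rangle\langle\Omega_U|. \tag{38}$$

With an eye on (32), we now compute

$$\begin{aligned}
\bar E a E &= S^2(e_i) f^i a E \\
&= f^i_{(1)} a_{(1)} \langle S^2(e_i), f^i_{(2)} a_{(2)} \rangle E \\
&= f^n \langle e_n, f^i_{(1)} a_{(1)} \rangle \langle S^2(e_i), f^i_{(2)} a_{(2)} \rangle E \\
&= f^n \langle e_n S^2(e_i), f^i a \rangle E \\
&= \langle a \rangle^R \delta_A^L E.
\end{aligned}$$

This simplifies further when $\delta_A^L = \delta_A^R \equiv \delta_A$ – we then get

$$\bar E a E = \langle a \rangle \delta_A. \tag{39}$$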
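3.4. Fourier Transforms. We work again with a general (i.e. not necessarily unimodular) FDHA. We define a right Fourier transform $\hat{\cdot}^R : A \to U$ in terms of a right integral as follows:

$$\hat a^R \equiv \langle a S^{-1}(f^i) \rangle^R e_i, \tag{40}$$

so that ($b$ in $A$)

$$\langle \hat a^R, b \rangle = \langle a S^{-1}(b) \rangle^R. \tag{41}$$

We show now that the right Fourier transform is invertible:

$$\begin{aligned}
\langle e_j \hat a^R \rangle^R f^j &= \langle a S^{-1}(f^i) \rangle^R \langle e_j e_i \rangle^R f^j \\
&= \langle a S^{-1}(f^i_{(2)}) \rangle^R \langle e_i \rangle^R f^i_{(1)} \\
&= \langle a_{(1)} S^{-1}(f^i_{(3)}) \rangle^R \langle e_i \rangle^R a_{(2)} S^{-1}(f^i_{(2)}) f^i_{(1)} \\
&= \langle a_{(1)} S^{-1}(f^i) \rangle^R \langle e_i \rangle^R a_{(2)} \\
&= \langle a_{(1)} S^{-1}(\delta_A^R) \rangle^R a_{(2)} \\
&= \langle S^{-1}(\delta_A^R) \rangle^R a = a.
\end{aligned} \tag{42}$$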
In the language of the previous section, when $\sigma' \neq 0$, the Fourier transform allows the switching between $|\Omega_U\rangle$ and $|\Omega_A\rangle$. Indeed, we find

$$\begin{aligned}
a E &= a S^{-1}(f^i) e_i \\
&= e_{i(2)} \langle e_{i(1)}, S^{-1}(a_{(2)} S^{-1}(f^i_{(1)})) \rangle a_{(1)} S^{-1}(f^i_{(2)}) \\
&= e_{i(2)} \langle S^{-2}(e_{i(1)}), f^i_{(1)} S(a_{(2)}) \rangle a_{(1)} S^{-1}(f^i_{(2)}) \\
&= e_j \langle S^{-2}(e_i), f^i_{(1)} f^j_{(1)} S(a_{(2)}) \rangle a_{(1)} S^{-1}(f^j_{(2)}) S^{-1}(f^i_{(2)}) \\
&= e_j \langle S^{-2}(e_i) S^{-2}(e_k), f^i f^j_{(1)} S(a_{(2)}) \rangle a_{(1)} S^{-1}(f^j_{(2)}) S^{-1}(f^k) \\
&= e_j \langle S^{-2}(e_k) \rangle \langle f^j_{(1)} S(a_{(2)}) \rangle a_{(1)} S^{-1}(f^j_{(2)}) S^{-1}(f^k) \\
&= e_j \langle a_{(2)} S^{-1}(f^j_{(1)}) \rangle a_{(1)} S^{-1}(f^j_{(2)}) \langle S^{-3}(e_k) \rangle f^k \\
&= e_j \langle a S^{-1}(f^j) \rangle \langle e_k \rangle f^k \\
&= \hat a\, \delta_A,
\end{aligned} \tag{43}$$

which, in terms of the action on right vacua, corresponds to

$$a |\Omega_U\rangle = \hat a |\Omega_A\rangle. \tag{44}$$

One can easily check that ($a$ in $A$, $x$ in $U$)

$$\widehat{x \triangleright a}^{\,R} = x\, \hat a^R. \tag{45}$$

Another property, familiar from the unimodular case, that survives when $\langle 1_A \rangle = 0$ is

$$\widehat{f \star_R g}^{\,R} = \hat f^R\, \hat g^R, \tag{46}$$

where the right convolution $\star_R$ of $f$, $g$ in $A$ is given by

$$f \star_R g = g_{(1)} \langle f S^{-1}(g_{(2)}) \rangle^R \tag{47}$$

((46), together with the invertibility of $\hat{\cdot}^R$, guarantees the associativity of $\star_R$). On the other hand, the following property, which is easily seen to hold in the unimodular case,

$$\hat{\hat a} = S(a), \tag{48}$$

does not hold, in general, for $\hat{\cdot}^R$ when $\langle 1_A \rangle = 0$.

3.5. Further properties. We give now the proof of a number of interesting formulas, valid for unimodular FDHAs. First, notice that

$$\begin{aligned}
\widehat{S^2(a)} &= e_i \langle S^2(a) S^{-1}(f^i) \rangle \\
&= e_i \langle a S^{-3}(f^i) \rangle \\
&= S^{-2}(e_i) \langle a S^{-1}(f^i) \rangle \\
&= S^{-2}(\hat a),
\end{aligned} \tag{49}$$

where, in the second line, we used (19). Two useful lemmas follow.
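Lemma 4. For $A$ a FDHA and $\sigma' \neq 0$, it holds

$$\langle a^{(\bar 1)}, S^{-2}(a^{(2)}{}_{(2)}) \rangle\, a^{(2)}{}_{(1)} = S^{-2}(a) \tag{50}$$

for all $a$ in $A$.

Proof. We have (the notation is introduced in Sect. 2)

$$\begin{aligned}
a^{(\bar 1)} \otimes a^{(2)}{}_{(1)} \otimes a^{(2)}{}_{(2)} &= (\mathrm{id} \otimes \Delta)(S(e_i) e_j \otimes f^i a f^j) \\
&= S(e_k) S(e_i) e_j e_l \otimes f^i a_{(1)} f^j \otimes f^k a_{(2)} f^l,
\end{aligned}$$

which gives

$$\begin{aligned}
\langle a^{(\bar 1)}, S^{-2}(a^{(2)}{}_{(2)}) \rangle\, a^{(2)}{}_{(1)}
&= \langle S(e_k) S(e_i) e_j e_l, S^{-2}(f^k) S^{-2}(a_{(2)}) S^{-2}(f^l) \rangle f^i a_{(1)} f^j \\
&= \langle S(e_k) S(e_i) e_j \rangle \langle S^{-2}(f^k) S^{-2}(a_{(2)}) \rangle f^i a_{(1)} f^j \\
&= \langle S(e_i) e_j \rangle \langle S^{-2}(f^i_{(2)}) S^{-2}(a_{(2)}) \rangle f^i_{(1)} a_{(1)} f^j \\
&= \langle S(e_i) e_j \rangle \langle f^i a \rangle f^j \\
&= \langle S(e_i) e_j \rangle \langle S^{-1}(a) S^{-1}(f^i) \rangle f^j \\
&= \langle S(\widehat{S^{-1}(a)}) e_j \rangle f^j \\
&= \langle e_j S^2(\widehat{S^{-1}(a)}) \rangle S(f^j) \\
&= \langle e_j \widehat{S^{-3}(a)} \rangle S(f^j) \\
&= S^{-2}(a),
\end{aligned} \tag{51}$$

where, in the last line, we used the formula for the inverse Fourier transform, Eq. (42). □

Lemma 5. For $A$ a FDHA, it holds

$$a^{(2)} S^{-1}(a^{(\bar 1)})\, b = b\, a^{(2)} S^{-1}(a^{(\bar 1)}) \tag{52}$$

for all $a$, $b$ in $A$.

Proof. We have

$$\begin{aligned}
a^{(2)} S^{-1}(a^{(\bar 1)}) b
&= a^{(2)} b_{(1)} \langle S^{-1}(a^{(\bar 1)}{}_{(2)}), b_{(2)} \rangle S^{-1}(a^{(\bar 1)}{}_{(1)}) \\
&= b_{(1)} \langle a^{(2)(\bar 1)}, b_{(2)} \rangle a^{(2)(2)} \langle S^{-1}(a^{(\bar 1)}{}_{(2)}), b_{(3)} \rangle S^{-1}(a^{(\bar 1)}{}_{(1)}) \\
&= b_{(1)} \langle a^{(2)(\bar 1)} S^{-1}(a^{(\bar 1)}{}_{(2)}), b_{(2)} \rangle a^{(2)(2)} S^{-1}(a^{(\bar 1)}{}_{(1)}) \\
&= b_{(1)} \langle a^{(\bar 1)}{}_{(3)} S^{-1}(a^{(\bar 1)}{}_{(2)}), b_{(2)} \rangle a^{(2)} S^{-1}(a^{(\bar 1)}{}_{(1)}) \\
&= b\, a^{(2)} S^{-1}(a^{(\bar 1)}),
\end{aligned} \tag{53}$$

where, in the first and second line, we used the first and second of (4) respectively. □

At this point, we have enough machinery at our disposal to prove the following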
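Proposition 1. For $A$ a FDHA and $\sigma' \neq 0$, it holds

$$\langle ba \rangle = \langle S^{-2}(a) b \rangle \tag{54}$$

for all $a$, $b$ in $A$.

Proof. We have

$$\begin{aligned}
\bar E b a E &= \bar E b\, a^{(2)} \epsilon(S^{-1}(a^{(\bar 1)})) E \\
&= \bar E b\, a^{(2)} S^{-1}(a^{(\bar 1)}) E \\
&= \bar E a^{(2)} S^{-1}(a^{(\bar 1)}) b E \\
&= \bar E S^{-1}(a^{(\bar 1)}{}_{(1)}) \langle S^{-1}(a^{(\bar 1)}{}_{(2)}), S^{-1}(a^{(2)}{}_{(2)}) \rangle a^{(2)}{}_{(1)} b E \\
&= \bar E \langle S^{-1}(a^{(\bar 1)}), S^{-1}(a^{(2)}{}_{(2)}) \rangle a^{(2)}{}_{(1)} b E \\
&= \bar E S^{-2}(a) b E,
\end{aligned}$$

where in the third line we used (52) and in the last one, (50). The proposition follows now from (39). □

It is interesting to compare (54) with the formula one can derive in the presence of a universal R-matrix $\mathcal{R}$ in $U$. The result in this case is contained in

Proposition 2. Let $A$ be a dual quasitriangular Hopf algebra and $\langle \cdot \rangle$ a bi-invariant integral on it. It holds

$$\langle ba \rangle = \langle S^2(a \triangleleft s) b \rangle \tag{55}$$

for all $a$, $b$ in $A$, where

$$s = c u^{-2}, \qquad u = S(\mathcal{R}^{(2)}) \mathcal{R}^{(1)}, \qquad c = u S(u) \tag{56}$$

($s$, $u$, $c$ in $U \sim A^*$) and $a \triangleleft s \equiv \langle s, a_{(1)} \rangle a_{(2)}$.

Proof. The commutation relations in $A$ can be written in the form

$$ba = \langle \mathcal{R}, a_{(1)} \otimes b_{(1)} \rangle\, a_{(2)} b_{(2)}\, \langle \mathcal{R}^{-1}, a_{(3)} \otimes b_{(3)} \rangle \tag{57}$$

(this is the dual version of (1)). It can also be shown that the element $u$ defined above implements the square of the antipode in $U$ according to

$$S^2(x) = u x u^{-1} \tag{58}$$

for all $x$ in $U$ [7] – its inverse is given by $u^{-1} = \mathcal{R}^{(2)} S^2(\mathcal{R}^{(1)})$. We can then write

$$\begin{aligned}
\langle ba \rangle &= \langle \mathcal{R}, a_{(1)} \otimes b_{(1)} \rangle \langle a_{(2)} b_{(2)} \rangle \langle \mathcal{R}^{-1}, a_{(3)} \otimes b_{(3)} \rangle \\
&= \langle \mathcal{R}, a_{(1)} \otimes b_{(1)} S(b_{(2)}) S(a_{(2)}) \rangle \langle a_{(3)} b_{(3)} \rangle \langle \mathcal{R}^{-1}, a_{(5)} \otimes b_{(5)} S^{-1}(b_{(4)}) S^{-1}(a_{(4)}) \rangle \\
&= \langle \mathcal{R}, a_{(1)} \otimes S(a_{(2)}) \rangle \langle a_{(3)} b \rangle \langle \mathcal{R}^{-1}, a_{(5)} \otimes S^{-1}(a_{(4)}) \rangle \\
&= \langle S(u), a_{(1)} \rangle \langle a_{(2)} b \rangle \langle u^{-1}, a_{(3)} \rangle.
\end{aligned} \tag{59}$$

However,

$$S(u) x u^{-1} = S(u) u^{-1}\, u x u^{-1} = s S^2(x).$$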
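Then, for arbitrary $x$ in $U$, we have

$$\langle S(u), a_{(1)} \rangle \langle x, a_{(2)} \rangle \langle u^{-1}, a_{(3)} \rangle = \langle S(u) x u^{-1}, a \rangle = \langle s S^2(x), a \rangle = \langle x, S^2(a \triangleleft s) \rangle,$$

therefore

$$\langle S(u), a_{(1)} \rangle\, a_{(2)}\, \langle u^{-1}, a_{(3)} \rangle = S^2(a \triangleleft s); \tag{60}$$

substituting in (59) we get (55). □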
Comparison with (54) and use of (27) leads to the relation $S^4(a) = a \triangleleft s^{-1}$ which in the dual implies $S^4(x) = s^{-1} x$ and therefore (by taking $x = 1_U$)

$$S^4(x) = x \tag{61}$$

for all $x$ in $U$ (a different proof of this has been given in [20]). Equations (54) and (61) have been proven above only in the finite dimensional case (the latter assuming quasitriangularity as well). On the other hand, (55) and the following versions of it, which are proved similarly, hold for all quasitriangular Hopf algebras with bi-invariant integral:

$$\begin{aligned}
\langle ba \rangle &= \langle a S^2(s^{-1} \triangleright b) \rangle \\
&= \langle S^{-2}(s \triangleright a) b \rangle \\
&= \langle (u^{-1} \triangleright b)(a \triangleleft u) \rangle.
\end{aligned} \tag{62}$$
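Relation (61) can be seen at work on a small example. The sketch below is an illustration under stated assumptions, not part of the paper: it uses Sweedler's four dimensional Hopf algebra (which is known to be quasitriangular) with basis $1, g, x, gx$ and the assumed antipode convention $S(g) = g$, $S(x) = -gx$, so that $S(gx) = S(x)S(g) = x$. The names `S`, `matmul` and `IDENTITY` are hypothetical.

```python
# Basis order: 1, g, x, gx. Column j of S holds the image of basis element j.
# Assumed convention: S(1) = 1, S(g) = g, S(x) = -g x, S(g x) = x.
S = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, -1, 0],
]

IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def matmul(a, b):
    """Plain 4x4 integer matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

S2 = matmul(S, S)    # square of the antipode
S4 = matmul(S2, S2)  # fourth power of the antipode

print(S2)
print(S4 == IDENTITY)
```

In this example $S^2 = \mathrm{diag}(1, 1, -1, -1)$ differs from the identity while $S^4$ is the identity, consistent with (61).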
We close this section with the remark that $\langle ba \rangle = \langle S^2(a) b \rangle$ has been shown to hold for unimodular, finite dimensional ribbon Hopf algebras (see [19]). Using (61) in (54) we conclude that it actually holds for (the wider class of) quasitriangular FDHAs.

4. Integration on Braided Hopf Algebras

We transcribe here the main results of Sect. 3 to the case of FDBHAs, using the N-dimensional quantum superplane as a concrete example.

4.1. Preliminaries.

4.1.1. The quantum superplane. Let us review briefly the basics of the construction of the quantum superplane [25, 27]. As is typical in the study of quantum spaces, one deals with the associative, noncommutative algebra $X$ generated by $1_X$ and the coordinate functions $\xi_i$, $i = 1, \ldots, N$ on the quantum superplane satisfying the commutation relations

$$\xi_2 \xi_1 = -q \hat R_{12}\, \xi_2 \xi_1 \tag{63}$$

(we work with the “$q^{-1}$” version [4] – we remind the reader that $\hat R_{12}(q^{-1}) = \hat R_{21}^{-1}(q)$). The derivatives $\sigma_i$, $i = 1, \ldots, N$, dual to the above coordinates, generate (together with $1_D$) the algebra $D$ with commutation relations

$$\sigma_1 \sigma_2 = -q\, \sigma_1 \sigma_2 \hat R_{12}. \tag{64}$$

The coordinate-derivative duality is encoded in the cross relations

$$\sigma_i \xi_j = \delta_{ij} - q \hat R^{-1}_{mj,ni}\, \xi_n \sigma_m. \tag{65}$$
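We denote the combined coordinate-derivative algebra by $P$. In analogy with the treatment of the quantum plane in [3], one can enlarge $P$ by the introduction of displacements $\eta_i$, $\tau_i$, $i = 1, \ldots, N$ for the coordinates and derivatives respectively, satisfying

$$\eta_2 \eta_1 = -q \hat R_{12}\, \eta_2 \eta_1, \qquad \tau_1 \tau_2 = -q\, \tau_1 \tau_2 \hat R_{12}, \qquad \tau_i \eta_j = \delta_{ij} - q \hat R^{-1}_{mj,ni}\, \eta_n \tau_m, \tag{66}$$

$$\xi_2 \eta_1 = -q \hat R_{12}^{-1}\, \eta_2 \xi_1, \qquad \sigma_i \eta_j = -q^{-1} \hat R_{kj,li}\, \eta_l \sigma_k, \tag{67}$$

$$\xi_i \tau_j = -q^{-1} D_{la} \hat R_{ia,bk} D^{-1}_{bj}\, \tau_l \xi_k, \qquad \sigma_1 \tau_2 = -q\, \tau_1 \sigma_2 \hat R_{12}^{-1}. \tag{68}$$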
As seen from above, the $\eta$'s are taken to be just a second copy of the $\xi$'s but are endowed with nontrivial statistics with both the $\xi$'s and the $\sigma$'s – analogous remarks hold for the $\tau$'s. The remarkable property of (68) is that the displaced coordinates $\xi_i + \eta_i$ and derivatives $\sigma_i + \tau_i$ still satisfy (63) and (64) respectively while the entire enlarged algebra is covariant under the $GL_q(N)$-coaction

$$\xi_i \mapsto \xi_i' = \xi_j \otimes S(A_{ij}), \tag{69}$$
$$\eta_i \mapsto \eta_i' = \eta_j \otimes S(A_{ij}), \tag{70}$$
$$\sigma_i \mapsto \sigma_i' = \sigma_j \otimes S^2(A_{ji}), \tag{71}$$
$$\tau_i \mapsto \tau_i' = \tau_j \otimes S^2(A_{ji}). \tag{72}$$
We will often drop the tensor product sign in the following.

4.1.2. Braiding. Suppose $U$ is a quasitriangular Hopf algebra (with universal R-matrix $\mathcal{R}$) that acts from the left on two algebras $V$, $W$. One can, in this case, form the braided tensor product $W \underline{\otimes} V$ in which $V$, $W$ are trivially embedded as subalgebras but have nontrivial statistics, given by ($v$ in $V$, $w$ in $W$)

$$\begin{aligned}
(1 \underline{\otimes} v)(w \underline{\otimes} 1) \equiv \Psi(v \otimes w) &= \tau \circ (\mathcal{R}^{(1)} \triangleright v \otimes \mathcal{R}^{(2)} \triangleright w) \\
&= w^{(1)} \underline{\otimes} v^{(1)} \langle \mathcal{R}, v^{(2')} \otimes w^{(2')} \rangle.
\end{aligned} \tag{73}$$

We have expressed above the action of $U$ on $V$, $W$ in terms of the dual coaction of $A \sim U^*$. The first line of (73) also defines the braided transposition $\Psi : V \otimes W \to W \otimes V$ for which it holds in general $\Psi^2 \neq \mathrm{id}$ (due to $\mathcal{R}'\mathcal{R} \neq 1$). For a detailed discussion of the properties of $\Psi$, see e.g. [17].

4.1.3. The quantum superplane as a braided Hopf algebra. The concept of braided tensor products provides a natural framework for an elegant description of the quantum superplane as a braided Hopf algebra. We give here, for completeness, an outline of this approach – more details can be found in [17]. Essential to this description is the use of diagrams which encode neatly the braiding information. The maps $\Psi$ and $\Psi^{-1}$ are represented by the diagrams [diagrams not reproduced] respectively. For the algebra $X$, the map $\xi_i \mapsto \xi_i \underline{\otimes} 1 + 1 \underline{\otimes} \xi_i \equiv \eta_i + \xi_i$ is regarded as a braided coproduct $\Delta : X \to X \underline{\otimes} X$ (extended (braided) multiplicatively on the whole $X$). Diagrammatically this appears as
[Diagrams not reproduced: product vertex, coproduct vertex, and braided multiplicativity of $\Delta$, with legs labelled $\xi$, $\eta$.]

where the first two vertices denote the product and coproduct in $X$ and the third diagram expresses the braided multiplicativity of $\Delta$ ($\xi$, $\eta$ etc. denote generic elements of $X$). One also has a matching counit and antipode with $\epsilon$, $S$, $S^2$, $S^{-1}$, $S^{-2}$ denoted respectively by

[Diagrams not reproduced: the symbols for $\epsilon$, $S$, $S^2$, $S^{-1}$, $S^{-2}$.]

and satisfying braided versions of the familiar Hopf algebra identities, e.g.

[Diagram not reproduced.] (74)

A particularly important requirement on the braiding, which $\Psi$ of (73) satisfies, is that one should be able to move crossings past all vertices and boxes, e.g.

[Diagram not reproduced.] (75)

should hold. Exactly analogous treatment is possible for $D$, the algebra of derivatives. The braided coproduct is given by $\Delta(\sigma_i) = \sigma_i \underline{\otimes} 1 + 1 \underline{\otimes} \sigma_i \equiv \tau_i + \sigma_i$ and the corresponding diagrams are an exact copy of those for $X$ (we will use in them the letters $\sigma$, $\tau$ etc. to denote generic elements of $D$). One can now combine $X$ and $D$ to form a braided semidirect product $X \rtimes D$. We need for this a braided action of $D$ on $X$ which is given by the second of the following diagrams

[Diagrams not reproduced: the pairing $\langle \sigma, \xi \rangle$ and the braided action of $\sigma$ on $\xi$.] (76)

while the first one simply depicts the pairing between a derivative and a function (defined as the counit of the derivative of the function). When viewed upside-down, the first diagram stands for the canonical element $\phi^i \underline{\otimes} \epsilon_i \in X \underline{\otimes} D$ (notice the reversal of order in the tensor product) with $\langle \epsilon_i, \phi^j \rangle = \delta_{ij}$. Both the inner product and the canonical element are assumed invariant under $\Delta_A$:
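$$\langle p^{(1)}, x^{(1)} \rangle\, p^{(2')} x^{(2')} = \langle p, x \rangle\, 1_A, \tag{77}$$

$$\phi^{i\,(1)} \underline{\otimes} \epsilon_i^{(1)} \otimes \phi^{i\,(2')} \epsilon_i^{(2')} = \phi^i \underline{\otimes} \epsilon_i \otimes 1_A. \tag{78}$$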
The product-coproduct duality between $X$ and $D$ is taken to be

[Diagrams not reproduced: product-coproduct duality, with legs labelled $\sigma$, $\tau$, $\xi$, $\eta$.] (79)

Notice that this differs from the standard convention in the unbraided case. As a result, to get the unbraided version of any diagrammatic equation that appears in the following, one should translate the diagrams, ignoring the braiding information, into the language of Sect. 2 and then set $\Delta \to \Delta'$, $S \to S^{-1}$. Again, viewing the diagrams upside-down reveals additional (dual) information – in the case of the diagrams above, one discovers two basic properties of the canonical element (compare with the first two of (6)). The commutation relations in the semidirect product (i.e. the braided analogue of (4)) are

[Diagrams not reproduced: braided cross relations between $\sigma$ and $\xi$.] (80)

We close this review with a technical remark. If one computes the $\xi$–$\xi$ braiding given by (73) (with $V = W = X$), using the coaction (69), one fails to reproduce the $\xi$–$\eta$ commutation relations of (67) – the result is off by a $q$ factor (similarly for the rest of (67), (68)). To remedy this, one can enlarge $A$ by a grouplike central element $g$ (the dilaton, see [16]), the inner product of which with $\mathcal{R}$ is given by

$$\langle \mathcal{R}, g^a \otimes g^b \rangle = (-q)^{-ab}, \qquad \langle \mathcal{R}, g^a \otimes A_{ij} \rangle = \langle \mathcal{R}, A_{ij} \otimes g^a \rangle = \delta_{ij}. \tag{81}$$

We will call the enlarged algebra $\tilde A$. Setting $A \to gA$ in the rhs of (69)–(72) gives an $\tilde A$-coaction on the quantum superplane which reproduces, via (73), the commutation relations (67), (68).

4.2. The invariant integral.

4.2.1. First definition and problems. The integral we are looking for is a linear map $\langle \cdot \rangle : X \to \mathbb{C}$ which is translationally invariant in the following sense [25]:

$$\langle \sigma_i f(\xi) \rangle = 0, \quad i = 1, \ldots, N \qquad \forall f \in X. \tag{82}$$
An equivalent formulation of invariance is [3]

$$\langle f(\xi + \eta) \rangle = 1_X \langle f(\xi) \rangle \qquad \forall f \in X \tag{83}$$

or, in braided Hopf algebra language,

$$f_{(1)} \langle f_{(2)} \rangle = 1_X \langle f \rangle \qquad \forall f \in X. \tag{84}$$

Representing the integral with a rhombus, we want it to satisfy

[Diagram not reproduced: the integration rhombus commuting with a crossing, legs labelled $\xi$, $\eta$.] (85)

a requirement which, as one can easily see, cannot, in general, be satisfied. Indeed, we only need consider the classical fermionic line with coordinate $\xi$ and displacement $\eta$ satisfying $\xi^2 = \eta^2 = 0$, $\xi\eta = -\eta\xi$. Taking $\xi$, $\eta$ to stand for themselves in the diagram above, we find that the lhs is $\eta \langle \xi \rangle$ (since $\langle \xi \rangle$, being a number, braids trivially), while the rhs is $\eta \langle -\xi \rangle = -\eta \langle \xi \rangle$, implying $\langle \xi \rangle = 0$ which contradicts the known Berezin result. We conclude that $\langle f(\xi) \rangle$, assumed invariant (and nontrivial), cannot both be a number and satisfy the property expressed in the diagram above.

4.2.2. An improved definition. Our treatment, in Sect. 3, of the integral on FDHAs points to a simple solution to the above problem. We recall that there, the quantity that naturally emerged, in our algebraic formulation in terms of the modified trace formula, was the numerical integral $\langle \cdot \rangle$ times a delta function (as in the lhs of (22)). Motivated by this, we define a new integral $\langle\langle \cdot \rangle\rangle^L : X \to k\,\delta_X^R$ (with $k$, in our case, the complex numbers) as follows:

$$\langle\langle \xi \rangle\rangle^L = \langle \xi \rangle^L \delta_X^R. \tag{86}$$

The output braid of the integration rhombus in our diagrams will stand accordingly for a numerical multiple of $\delta_X^R$. The rhombus' inner workings are exposed in the diagram below

[Diagram not reproduced: definition of the integration rhombus.] (87)

To get the product $\langle \sigma \rangle^R \langle \xi \rangle^L$ one should pair the output braid with $\sigma$. Whether (87), in its present or a suitably modified form, applies to the infinite dimensional case (and under what conditions), is a direction for future work (one can easily see that, in certain such cases, the rhs of (87) diverges).

Lemma 6. The integral $\langle\langle \cdot \rangle\rangle^L$ defined by (87) is nontrivial for every FDBHA.
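Proof. Using $(S^{-1} \otimes \mathrm{id}) \circ \Psi^{-1}(\phi^i \otimes \epsilon_i)$ as input to the braided version of $\Theta$ (shown below), we find

[Diagram not reproduced: evaluation of the braided $\Theta$ on this input, yielding 1.]

from which nontriviality follows. □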
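The classical fermionic line used as a counterexample in Sect. 4.2.1 is small enough to be checked directly. The following sketch is an illustration only, with all names hypothetical: it builds the Grassmann algebra on two generators $\xi$, $\eta$ and the Berezin integral over $\xi$ with $\langle 1 \rangle = 0$, $\langle \xi \rangle = 1$. The convention that $\xi$ is moved to the left before being removed is an assumption of the sketch.

```python
def clean(u):
    """Drop monomials with zero coefficient."""
    return {k: c for k, c in u.items() if c != 0}

def gadd(u, v):
    """Sum of two Grassmann elements stored as {sorted generator tuple: coeff}."""
    out = dict(u)
    for k, c in v.items():
        out[k] = out.get(k, 0) + c
    return clean(out)

def scale(c, u):
    """Multiply a Grassmann element by a number."""
    return clean({k: c * v for k, v in u.items()})

def gmul(u, v):
    """Product with generators 0 (xi) and 1 (eta) anticommuting and squaring to 0."""
    out = {}
    for m, cm in u.items():
        for n, cn in v.items():
            if set(m) & set(n):
                continue  # a repeated generator kills the monomial
            word, sign = list(m + n), 1
            # bubble sort, flipping the sign once per transposition
            for i in range(len(word)):
                for j in range(len(word) - 1 - i):
                    if word[j] > word[j + 1]:
                        word[j], word[j + 1] = word[j + 1], word[j]
                        sign = -sign
            key = tuple(word)
            out[key] = out.get(key, 0) + sign * cm * cn
    return clean(out)

ONE, XI, ETA = {(): 1}, {(0,): 1}, {(1,): 1}

def berezin_xi(u):
    """Berezin integral over xi; monomials are sorted, so xi is already leftmost."""
    out = {}
    for m, c in u.items():
        if 0 in m:
            rest = tuple(g for g in m if g != 0)
            out[rest] = out.get(rest, 0) + c
    return clean(out)

a, b = 3, 5
f = gadd(scale(a, ONE), scale(b, XI))                     # f(xi) = a + b xi
f_shifted = gadd(scale(a, ONE), scale(b, gadd(XI, ETA)))  # f(xi + eta)
print(berezin_xi(f), berezin_xi(f_shifted))
```

The displaced coordinate still squares to zero, and the integral of $f(\xi + \eta)$ equals that of $f(\xi)$, which is the numerical invariance (83). The sign in `gmul(ETA, XI)` relative to `gmul(XI, ETA)` is the one responsible for the contradiction described in Sect. 4.2.1 when a purely numerical integral is pushed through a crossing.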
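4.2.3. Braided vacuum projectors. We denote in the following by $\mathcal{E}$, $\bar{\mathcal{E}}$ the braided analogues of $E$, $\bar E$. They are given by

[Diagrams not reproduced: definitions of $\mathcal{E}$ and $\bar{\mathcal{E}}$.] (88)

Proof. The proof of the (analogue of) (34) is as follows:

[Diagram not reproduced.] □

As before, $\bar{\mathcal{E}} a \mathcal{E}$ is a multiple of $\delta_X^R \mathcal{E}$ – when $\delta_X^R = \delta_X^L \equiv \delta_X$, it becomes a multiple of $\delta_X$. Notice that, with $\delta_X^R$ defined via $\langle \tau, \delta_X^R \rangle = \langle \tau \rangle^R$, one gets $\xi \delta_X^R = \epsilon(\xi) \delta_X^R$ – the difference in the order of multiplication, compared to (14), is due to the second of (79). The explicit computation of $\bar{\mathcal{E}}$ is simplified by the following identity:

$$\Psi^{-1}(\phi^i \otimes \epsilon_i) = S(u^{-1}) \triangleright \epsilon_i \otimes \phi^i, \tag{89}$$

which is easily proved using the invariance of the canonical element. Inspection of our definition reveals the braided version of the trace formula (20) for the numerical integral $\langle \xi \rangle^R_{\mathrm{tr}}$; it is given by

[Diagram not reproduced: braided trace formula for $\langle \xi \rangle^R_{\mathrm{tr}}$.] (90)

Proof. The proof of invariance is as follows: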
[Diagram not reproduced.] □

To find out under what conditions it becomes trivial, we have to derive the braided version of (29). Omitting the somewhat lengthy diagrammatic proof, which parallels that of Lemma 3, we state

Lemma 7. For $X$, $D$ dually paired FDBHAs, it holds

[Diagram not reproduced: braided version of (29), with legs labelled $\sigma$, $\xi$.] (91)

In analogy with the unbraided case, when $\sigma \equiv \langle 1_X \rangle^R_{\mathrm{tr}} = 0$, $\langle \cdot \rangle^R_{\mathrm{tr}}$ is trivial; when $\sigma \neq 0$, $\langle \cdot \rangle^R_{\mathrm{tr}}$ provides a nontrivial integral in $X$. For the existence of integrals on FDBHAs and properties of them, see also [13, 14]. An analogous definition for the numerical braided integral (and a different proof of its invariance) can be found in [18].
4.2.4. Braided Fourier transforms. Transcribing (40), we define the Fourier transform $\hat\xi$ of the element $\xi$ of a FDBHA $X$ by the equation

$$\hat\xi \equiv \langle \xi S(\phi^i) \rangle\, \epsilon_i \tag{92}$$

or, in pictures,

[Diagram not reproduced: braided Fourier transform of $\xi$.] (93)

(this differs from earlier definitions [10] by the use of the nonbosonic integral $\langle\langle \cdot \rangle\rangle$). The output braid on the right stands for what one usually calls the Fourier transform of $\xi$ (an element of $D$, the dual of $X$) while the one on the left stands for the delta function in $X$ that is produced by the integration and which ensures the correct braiding behavior of $\hat{\cdot}$. There is also a notion of braided convolution of functions, defined by

[Diagram not reproduced: braided convolution of $\xi$ and $\eta$.] (94)

Again, the output braid on the left only carries a delta function in $X$. The following basic properties can be shown to hold

[Diagrams not reproduced: braided analogues of the Fourier transform properties, with legs labelled $\xi$, $\eta$, $\sigma$.]
4.3. Integration on the quantum superplane. We apply now the general formalism developed above to the problem of integration on the quantum superplane. Our starting point will be the vacuum projector construction of Sect. 4 – notice that although h1A i = 0 in this case, the integral is nevertheless bi-invariant so we expect (39) to hold. For the canonical element we find (95) φi ⊗i = eq−1 (ξi ⊗σi ),
where
$$ e_q(x) = \sum_{k=0}^{\infty} \frac{1}{[k]_q!}\, x^k, \qquad [k]_q = \frac{1 - q^{2k}}{1 - q^2}, \qquad [k]_q! = [1]_q [2]_q \cdots [k]_q, \qquad [0]_q! \equiv 1 $$
(96) (compare with the vacuum projectors for the quantum plane in [3]). The commutation relations (63), (64) imply ξi2 = σi2 = 0 for i = 1, . . . , N which gives (ξi ⊗σi )N +1 = 0. Using the braiding relations (67), the second of which can also be written as Ψ (σi ⊗ ξj ) = −q −1 Rˆ kj,li ξl ⊗ σk ,
(97)
we can expand (ξi ⊗σi )k in (95) to find i
φ ⊗i =
N X q −k(k−1) k=0
[k]q−1 !
ξi1 . . . ξik ⊗σik . . . σi1 .
(98)
With the antipode being given by S(ξi1 . . . ξik ) = (−1)k q k(k−1) ξi1 . . . ξik
(99)
and using (89) we find E
=
N X (−1)k ξ i . . . ξ i k σ i k . . . σ i1 , [k]q−1 ! 1
(100)
k=0
E¯
=
N X (−1)k q k k=0
[k]q !
D i1 j 1 . . . D ik j k σ ik . . . σ i1 ξ j 1 . . . ξ j k .
(101)
It will be convenient in the following to express $\bar{E}$ in the alternative form:
$$ \bar{E} = \sum_{k=0}^{N} \frac{(-1)^k q^{k(k-2N+1)}}{[k]_q!}\, ([N]_q - \xi\cdot\sigma)([N-1]_q - \xi\cdot\sigma) \cdots ([N-k+1]_q - \xi\cdot\sigma), \tag{102} $$
where $\xi\cdot\sigma \equiv \xi_i\sigma_i$ (this form makes the invariance under $\Delta_A$ evident; a similar expression exists for E). Using the commutation relation $\xi\cdot\sigma\,\xi_j = \xi_j(1 + q^2\,\xi\cdot\sigma)$ and the fact that $\xi\cdot\sigma\,E = 0$, we can now compute the integral of an arbitrary monomial $\xi_{i_1}\cdots\xi_{i_r}$ ($r < N$),
$$ \bar{E}\,\xi_{i_1}\cdots\xi_{i_r} E = \Big( \sum_{k=0}^{A} \frac{(-1)^k q^{k(k-2A+1)}\,[A]_q!}{[k]_q!\,[A-k]_q!} \Big)\, \xi_{i_1}\cdots\xi_{i_r} E, \tag{103} $$
where $A \equiv N - r$. For the sum in parentheses, one can set
$$ S(z) = \sum_{k=0}^{A} \frac{(-1)^k q^{k(k-2A+1)}\,[A]_q!}{[k]_q!\,[A-k]_q!}\, z^k. \tag{104} $$
Introducing a Jackson derivative $\partial_z$, satisfying $\partial_z z = 1 + q^2 z\partial_z$, we find from (104)
$$ \partial_z S(z) = \frac{1}{q^{-2}-1}\, S(z) - \frac{q^{-2A}}{q^{-2}-1}\, S(q^2 z). \tag{105} $$
On the other hand, it holds
$$ \partial_z S(z) = \frac{S(q^2 z) - S(z)}{(q^2 - 1)\,z}; $$
comparison with (105) shows that
$$ \frac{S(q^2 z)}{S(z)} = \frac{1 - q^2 z}{1 - q^{-2(A-1)} z}, $$
from which we find
$$ S(z) = (1 - z)(1 - q^{-2} z) \cdots (1 - q^{-2(A-1)} z), \tag{106} $$
implying that $S(1) = 0$ and therefore that
$$ \langle \xi_{i_1}\cdots\xi_{i_r} \rangle = 0, \qquad 0 \le r < N. \tag{107} $$
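As a sanity check on the step from (104) to (106), the following sketch compares the sum and the product in exact rational arithmetic. The function names and the sample values of q and z are illustrative assumptions, not part of the original text; only the q-number convention of (96) is taken from the paper.

```python
from fractions import Fraction


def q_number(k, q):
    """[k]_q = (1 - q^(2k)) / (1 - q^2), the convention of (96)."""
    return (1 - q ** (2 * k)) / (1 - q ** 2)


def q_factorial(k, q):
    """[k]_q! = [1]_q [2]_q ... [k]_q; the empty product gives [0]_q! = 1."""
    result = Fraction(1)
    for m in range(1, k + 1):
        result *= q_number(m, q)
    return result


def s_sum(z, A, q):
    """S(z) as the finite sum (104)."""
    total = Fraction(0)
    for k in range(A + 1):
        coeff = (-1) ** k * q ** (k * (k - 2 * A + 1))
        coeff *= q_factorial(A, q) / (q_factorial(k, q) * q_factorial(A - k, q))
        total += coeff * z ** k
    return total


def s_product(z, A, q):
    """S(z) as the product (106): (1 - z)(1 - q^-2 z)...(1 - q^-2(A-1) z)."""
    result = Fraction(1)
    for j in range(A):
        result *= 1 - q ** (-2 * j) * z
    return result


if __name__ == "__main__":
    q = Fraction(3, 2)   # arbitrary sample value, assumed generic
    z = Fraction(2, 7)   # arbitrary sample value
    for A in range(1, 6):
        print(A, s_sum(z, A, q) == s_product(z, A, q), s_sum(Fraction(1), A, q))
```

For A ≥ 1 the factor (1 − z) makes S(1) vanish, which is the content of (107); for A = 0 (that is, r = N) both expressions reduce to 1.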
For r = N , the integrand is (a multiple of) a delta function and its numerical integral is evidently (a multiple of) 1. We conclude that the quantum Berezin integral in N dimensions is essentially undeformed. As expected, h·iR tr is trivial in this case, as the reader can easily verify. 4.4. Remarks on the integral on the quantum plane. We make here a few remarks about the transformation properties of the invariant integral on the quantum plane [25]. We denote by xi , i = 1, . . . , N the coordinate functions on it and by ∂i , i = 1, . . . , N the dual derivatives, satisfying x1 x2 = q −1 Rˆ 12 x1 x2 ,
∂2 ∂1 = q −1 Rˆ 12 ∂2 ∂1 ,
∂i xj = δij + q Rˆ jl.ik xk ∂l . (108)
The above algebra is covariant under the transformation $x \to \Delta_A(x) = xA$, $\partial \to \Delta_A(\partial) = \partial S(A^T)$ with A a $GL_q(N)$ matrix (we omit the tensor product symbol). We assume an integral $\langle\langle\cdot\rangle\rangle$ exists, defined on a suitable class of functions, satisfying translational invariance (in the spirit of (82)) and braiding correctly, i.e. according to (85). In the classical case of integration on the N-dimensional plane, one finds the transformation property
$$ \int f(xA)\,dx = \frac{1}{\det(A)} \int f(x)\,dx, \tag{109} $$
where A is a GL(N) matrix. We now show that a similar property holds in the quantum case. We remark first that (85) implies
$$ \Delta_A(\langle\langle f(x)\rangle\rangle) = \langle\langle f(xA)\rangle\rangle = \langle\langle f^{(1)}(x)\rangle\rangle\, f^{(2')}(A), \tag{110} $$
where $f(xA) \equiv f^{(1)}(x) f^{(2')}(A)$. Consider now the dual action of the generators $Y_{ij} \equiv L^+_{im} S(L^-_{mj})$ of $U_q(gl(N))$ on the integrand
$$ \langle\langle Y_{ij} \triangleright f\rangle\rangle = \langle\langle f^{(1)}\rangle\rangle \,\langle Y_{ij}, f^{(2')}\rangle. \tag{111} $$
The above action can be represented in terms of differential operators on the plane as follows [4, 5]:
$$ Y_{ij} \sim q^{-2}\delta_{ij} + q^{-1}\lambda\,\partial_i x_j, \tag{112} $$
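which gives, making use of the invariance of the integral,
$$ \langle\langle Y_{ij} \triangleright f\rangle\rangle = \langle\langle (q^{-2}\delta_{ij} + q^{-1}\lambda\,\partial_i x_j) f\rangle\rangle = q^{-2}\delta_{ij}\,\langle\langle f\rangle\rangle. $$
Repeating the calculation for products of Y's acting on f, we find
$$ \langle\langle f^{(1)}\rangle\rangle \otimes f^{(2')} = \langle\langle f\rangle\rangle \otimes z \tag{113} $$
with $\Delta(z) = z \otimes z$ and $\langle Y_{ij}, z\rangle = q^{-2}\delta_{ij}$, $\langle 1_U, z\rangle = 1$. The above information completely determines z. Since $\langle Y_{ij}, \det_q(A)\rangle = q^{2}\delta_{ij}$, $\epsilon(\det_q(A)) = 1$ and $\det_q(A)$ is grouplike, we conclude
$$ \langle\langle f(xA)\rangle\rangle = \langle\langle f(x)\rangle\rangle\,(\det_q(A))^{-1}. \tag{114} $$
As in the case of the quantum superplane, to obtain the correct q factors for the braiding (so that, for example, $x \to x\otimes 1 + 1\otimes x$ is a homomorphism of the first of (108)) one has to introduce a grouplike, central dilaton g (extending this way A to $\tilde{A}$) with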
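$$ \langle R, g^a \otimes A_{ij}\rangle = \langle R, A_{ij} \otimes g^a\rangle = \delta_{ij}, \qquad \langle R, g^a \otimes g^b\rangle = q^{ab}, \tag{115} $$
and use the $\tilde{A}$ coaction $x \to xAg$, $\partial \to \partial S(A^T)g^{-1}$ in (73). We obtain in this way the braiding relations
$$ \Psi(\langle\langle f(x)\rangle\rangle \otimes x_i) = q^{-(N+1)}\, x_i \otimes \langle\langle f(x)\rangle\rangle, $$
$$ \Psi(\langle\langle f(x)\rangle\rangle \otimes \partial_i) = q^{N+1}\, \partial_i \otimes \langle\langle f(x)\rangle\rangle, $$
$$ \Psi(\langle\langle f(x)\rangle\rangle \otimes \langle\langle g(x)\rangle\rangle) = q^{N(N+1)}\, \langle\langle g(x)\rangle\rangle \otimes \langle\langle f(x)\rangle\rangle. \tag{116} $$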
We point out that a translationally invariant integral on the quantum plane cannot be also invariant under the coacting quantum group; an assumption to the contrary is made in [10].

5. Integration on Quantum Group Modules

We present here an approach to integration on quantum spaces that are covariant under a quantum group transformation, which does not rely on the braided Hopf algebra structure we have assumed so far. The only necessary ingredients are
– a coaction $\Delta_A : X \to X \otimes A$, where A, X are the algebras of functions on the quantum group and the quantum space respectively
– a map $\eta : X \to \mathbb{C}$ that respects the algebra structure of X
– a (left) invariant integral on A.
Given the above data, an invariant (under $\Delta_A$) integral on X can be defined by
$$ \langle\alpha\rangle = \eta(\alpha^{(1)})\,\langle\alpha^{(2')}\rangle \tag{117} $$
(we denote by $\langle\cdot\rangle$ the integral on both X and A). Notice that our notion of invariance in this section is different from the one employed so far in this paper. Indeed, in the absence of a (possibly braided) Hopf structure, no concept of translation exists (as codified by the coproduct) and therefore (10) cannot serve as our starting point. We illustrate the above procedure taking for X the quantum Euclidean space (for detailed treatments of
this case see [8, 21]). An example of a function algebra the integral on which eludes all methods presented so far in this paper appears in [6].

5.1. Integration on the quantum Euclidean space. In the following we use the notation and conventions of [9]. The algebra of functions on the N-dimensional quantum Euclidean space is generated by the coordinates $x_i$, i = 1, . . . , N satisfying
$$ x_1 x_2 P^{(-)}_{12} = 0, \tag{118} $$
where $P^{(-)}$ is the antisymmetric projector in the spectral decomposition of the $SO_q(N)$ $\hat{R}$ matrix. The center of the algebra is generated by 1 and the squared length $L^2 = x^T C x$, where C is the quantum metric. For A an $SO_q(N)$ matrix, it holds
$$ S^2(A) = DAD^{-1}, \qquad D = CC^T, \qquad A = C^T S(A^T)(C^{-1})^T. \tag{119} $$
The algebra of the x's admits the coaction $\Delta_A : x \mapsto xA$ while the map $\eta : x_i \mapsto u_i \equiv u_1\delta_{i1} + u_N\delta_{iN}$, with $u_1$, $u_N$ numbers, respects (118). $L^2$ is $\Delta_A$-invariant. The integral $I_{i_1\ldots i_m} \equiv \langle x_{i_1}\cdots x_{i_m}\rangle$ is given by
$$ I_{i_1\ldots i_m} = \eta(x_{j_1}\cdots x_{j_m})\,\langle A_{j_1 i_1}\cdots A_{j_m i_m}\rangle. \tag{120} $$
Notice that the function $x_{j_1}\cdots x_{j_m}\langle A_{j_1 i_1}\cdots A_{j_m i_m}\rangle$ is $\Delta_A$-invariant. Assuming that all $\Delta_A$-invariant functions are functions of the invariant length, we conclude
$$ x_{j_1}\cdots x_{j_m}\langle A_{j_1 i_1}\cdots A_{j_m i_m}\rangle = \begin{cases} f_{i_1\ldots i_m}(L^m) & m \text{ even} \\ 0 & m \text{ odd} \end{cases}, \tag{121} $$
and therefore
$$ I_{i_1\ldots i_m} = \begin{cases} \eta(f_{i_1\ldots i_m}(L^2)) & m \text{ even} \\ 0 & m \text{ odd} \end{cases}. \tag{122} $$
We treat, as an example, the case m = 2. Setting
$$ F_{ni,mj} \equiv \langle S(A_{im})A_{nj}\rangle \tag{123} $$
and invoking invariance in the form $S(a_{(1)})\langle a_{(2)}\rangle = 1\langle a\rangle$, we get
$$ A_{rb}F_{bi,aj} = A_{rb}S(A_{bm})S^2(A_{ni})F_{ma,nj} \quad\Rightarrow\quad A_1 F_{12} D_1 = F_{12} D_1 A_1. \tag{124} $$
Since only multiples of the identity matrix commute with A, we conclude
$$ F_{12} = M_2 D_1^{-1}. \tag{125} $$
However,
$$ A_{ki}F_{bi,aj} = A_{ki}S(A_{in})A_{rj}F_{bn,ar} \quad\Rightarrow\quad A_2 F_{12} = F_{12} A_2 \quad\Rightarrow\quad M_2 = \rho I_2, \tag{126} $$
with ρ a number. With the normalization $\langle 1_A\rangle = 1$, we find $\rho = (\mathrm{Tr}\, D^{-1})^{-1}$ and therefore
$$ F_{bi,aj} = \frac{1}{\mathrm{Tr}\, D^{-1}}\, D^{-1}_{ba}\,\delta_{ij}. \tag{127} $$
Using now the third of (119) in (120) and substituting the above result, we find, with the help of (119),
$$ I_{i_1 i_2} = \frac{\eta(L^2)}{\mathrm{Tr}\, D^{-1}}\, C_{i_1 i_2}. \tag{128} $$
Acknowledgement. Many people have helped me understand the subject matter treated here. I thank Joris Van der Jeught for discussions and computations that convinced me to treat the infinite dimensional case elsewhere. While this paper was being written up, A. Van Daele kindly pointed out reference [23] which contains essentially (23). Also, the alternative proof of Lemma 1 quoted here (ending in Eq. (25)) motivated the proof of Lemma 6 – I thank him for his comments during the discussions we had in Warsaw. I also thank for discussions Shahn Majid and Volodimir Lyubashenko (who also drew my attention to references [13, 14]). Thanks go to Daniel Arnaudon for his help with all things binary. Special thanks go to Stanislaw Zakrzewski for his interest in this paper and for the organization of (and invitation to) the minisemester on Quantum Groups and Quantum Spaces (November – December 1995, Warsaw) during which many aspects of the subject treated here were illuminated. Bruno Zumino had a decisive impact on this work through numerous discussions, contributions and his constant support – my warmest thanks go to him.
References 1. Abe, E.: Hopf Algebras. Cambridge: Cambridge Univ. Press, 1980 2. Chryssomalakos, C., Schupp, P., Watts, P.: The Role of the Canonical Element in the Quantized Algebra of Differential Operators A×U . Preprint LBL-33274, UCB-PTH-92/42 3. Chryssomalakos, C., Zumino, B.: Translations, Integrals And Fourier Transforms In The Quantum Plane. In: Ali, A., Ellis, J., Randjbar-Daemi, S. (eds.) Salamfestschrift. Proceedings of the “Conference On Highlights Of Particle And Condensed Matter Physics”, ICTP, Trieste, Italy 1993 4. Chryssomalakos, C., Schupp, S., Zumino, B.: Induced Extended Calculus On The Quantum Plane. Alg. i Anal. 6, 252 (1994) 5. Chu, C.S., Zumino, B.: Realization of Vector Fields for Quantum Groups as Pseudodifferential Operators on Quantum Spaces. Preprint LBL-36746, UCB-PTH-95/04, q-alg/9502005 6. Chu, C.S., Ho, P.M., Zumino, B.: The Quantum 2-Sphere as a Complex Quantum Manifold. To appear in Z. Phys. 7. Drinfel’d, V.G.: Quantum Groups. In Gleason, A. (ed.) Proceedings of the ICM, Providence, Rhode Island AMS, 1987 8. Fiore, G.: The SOq (N, R)-Symmetric Harmonic Oscillator on the Quantum Euclidean Space RqN and its Hilbert Space Structure. Int. J. Mod. Phys. A8, 4679-4729 (1993) 9. Faddeev, L.D., Reshetikhin, N.Yu., Takhtadzhyan, L.A.: Quantization of Lie Groups and Lie Algebras. Leningrad Math. J. 1, 193–225 (1990) 10. Kempf, A., Majid, S.: Algebraic q-integration and Fourier Theory on Quantum and Braided Spaces. Preprint DAMTP/94-7 11. Larson, R.G., Radford, D.E.: Semisimple Cosemisimple Hopf Algebras. Am. J. Math. 109, 187 (1988) 12. Larson, R.G., Sweedler, M.E.: An Associative Orthogonal Bilinear Form for Hopf Algebras. Am. J. Math. 91, 75–94 (1969) 13. Lyubashenko, V.: Tangles and Hopf Algebras In Braided Categories. J. Pur. Appl. Alg 98, 245–278 (1995) 14. Lyubashenko, V.: Modular Transformations for Tensor Categories. J. Pur. Appl. Alg. 98, 279–327 (1995) 15. Majid, S.: Quasitriangular Hopf Algebras and Yang-Baxter Equations. Int. J. 
Mod. Phys. A5, 1–91 (1990) 16. Majid, S.: Braided Momentum in the q-Poincare Group. J. Math. Phys. 34, 2045–2058 (1993) 17. Majid, S.: Beyond Supersymmetry and Quantum Symmetry (An introduction to braided groups and braided matrices). In: Ge, M.L., de Vega, H.J. (eds.), Quantum Groups, Integrable Statistical Models and Knot Theory. Singapore: World Sci., 1993, pp. 231–282 18. Majid, S.: Lie Algebras And Braided Geometry. Adv. Appl. Cliff. Alg. 4, (S1), 61–77 (1994) 19. Radford, D.E.: The Trace Function And Hopf Algebras. Preprint 20. Radford, D. E.: The Order of the Antipode of a Finite Dimensional Hopf Algebra is Finite. Am. J. Math. 98, (2), 333–355 (1973) 21. Steinacker, H.: Integration on Quantum Euclidean Space and Sphere in N Dimensions. Preprint LBL37431, UCB-PTH-95/21; q-alg 9506020 22. Sweedler, M.E.: Hopf Algebras. New York: Benjamin, 1969 23. Van Daele, A.: The Haar Measure On Finite Quantum Groups. Preprint K. U. Leuven, November 1992 24. Van Daele, A.: Private communication
25. Wess, J., Zumino, B.: Covariant Differential Calculus on the Quantum Hyperplane. Nucl. Phys. B (Proc. Suppl.) 18B, 302–312 (1990) 26. Zumino, B.: Private communication 27. Zumino, B.: Deformation of the Quantum Mechanical Phase Space with Bosonic or Fermionic Coordinates. Mod. Phys. Lett. A13, 1225–1235 (1991) Communicated by R.H. Dijkgraaf
Commun. Math Phys. 184, 27 – 50 (1997)
Communications in
Mathematical Physics c Springer-Verlag 1997
New Classes of Toda Soliton Solutions F. Gesztesy, W. Renger Department of Mathematics, University of Missouri, Columbia, MO 65211, USA. E-mail:
[email protected],
[email protected] Received: 16 October 1995 / Accepted: 23 July 1996
Abstract: We provide a detailed investigation of limits of N –soliton solutions of the Toda lattice as N tends to infinity. Our principal results yield new classes of Toda solutions including, in particular, new kinds of soliton–like (i.e., reflectionless) solutions. As a byproduct we solve an inverse spectral problem for one–dimensional Jacobi operators and explicitly construct tri–diagonal matrices that yield a purely absolutely continuous spectrum in (−1, 1) and give rise to an eigenvalue spectrum that includes any prescribed countable and bounded subset of R\[−1, 1].
1. Introduction

Our principal aim in this paper is a detailed study of N-soliton solutions $(a_N(t,n), b_N(t,n))$ of the Toda lattice equations
$$ \frac{d}{dt}a(t,n) = a(t,n)[b(t,n) - b(t,n+1)], \qquad \frac{d}{dt}b(t,n) = 2[a(t,n-1)^2 - a(t,n)^2], \qquad (t,n) \in \mathbb{R}\times\mathbb{Z} \tag{1.1} $$
in the limit N → ∞. In particular, if $(a_\infty(t,n), b_\infty(t,n))$ denotes the limit of $(a_N(t,n), b_N(t,n))$ as N → ∞, we shall undertake a careful investigation of the spectral (and scattering) properties of the associated Jacobi operator $H_\infty(t) = a_\infty(t)S^+ + S^- a_\infty(t) + b_\infty(t)$ in $\ell^2(\mathbb{Z})$ (where $(S^\pm f)(n) = f(n \pm 1)$, $f \in \ell^\infty(\mathbb{Z})$ are the usual shift operators). Our principal techniques are based on a Hilbert space approach to Toda systems (as studied in detail in [5]), the use of Fredholm determinants in the context of discrete evolution equations (see also [1, 25]), and Weyl-Titchmarsh spectral theory (see, e.g., [11]).
In order to describe our approach in some detail we briefly recall some of the basic facts of N-soliton solutions $(a_N(t,n), b_N(t,n))$ of the Toda system (1.1). They can be represented as (cf. Sect. 2 for more details)
$$ a_N(t,n) = \frac{\{\det[1_N + C_N(t,n+1)]\,\det[1_N + C_N(t,n-1)]\}^{1/2}}{2\det[1_N + C_N(t,n)]}, \qquad n \in \mathbb{Z}, \tag{1.2} $$
$$ b_N(t,n) = \frac12(k + k^{-1}) - \frac{k}{2}\,\frac{\det[1_N + D_N(k,t,n+1)]\,\det[1_N + C_N(t,n-1)]}{\det[1_N + D_N(k,t,n)]\,\det[1_N + C_N(t,n)]} - \frac{1}{2k}\,\frac{\det[1_N + D_N(k,t,n-1)]\,\det[1_N + C_N(t,n)]}{\det[1_N + D_N(k,t,n)]\,\det[1_N + C_N(t,n-1)]}, \qquad n \in \mathbb{Z}, $$
$$ \lambda = \frac12(k + k^{-1}) \le \inf\{\lambda_j = \tfrac12(\kappa_j + \kappa_j^{-1})\}_{j=1}^{N}, \tag{1.3} $$
where
$$ C_N(t,n) = \Big[ c_j(t)c_l(t)\,\frac{(\kappa_j\kappa_l)^{n+1}}{1 - \kappa_j\kappa_l} \Big]_{j,l=1}^{N}, \tag{1.4} $$
$$ A_N(k) = \Big[ \frac{\kappa_j - k}{\kappa_j^{-1} - k}\,\delta_{j,l} \Big]_{j,l=1}^{N}, \tag{1.5} $$
$$ D_N(k,t,n) = A_N(k)\,C_N(t,n-1). \tag{1.6} $$
The corresponding self-adjoint Jacobi operator in $\ell^2(\mathbb{Z})$ defined by
$$ H_N(t) = a_N(t)S^+ + S^- a_N(t) + b_N(t), \qquad D(H_N(t)) = \ell^2(\mathbb{Z}), \qquad t \in \mathbb{R} \tag{1.7} $$
then has the t-independent spectrum
$$ \sigma(H_N(t)) = \{\tfrac12(\kappa_j + \kappa_j^{-1})\}_{j=1}^{N} \cup [-1,1], \tag{1.8} $$
the essential spectrum [−1, 1] of $H_N(t)$ being purely absolutely continuous. The scattering matrix $S_N(\lambda)$ associated with the pair $(H_N(t), H_0)$, $H_0 = \frac12(S^+ + S^-)$, is t-independent and reflectionless, i.e.,
$$ S_N(\lambda) = \begin{pmatrix} T_N(k) & 0 \\ 0 & T_N(k) \end{pmatrix}, \qquad T_N(k) = \prod_{j=1}^{N} \mathrm{sgn}(\kappa_j)\,\frac{1 - k\kappa_j}{\kappa_j - k}, \qquad \lambda = \frac12(k + k^{-1}) \in [-1,1]. \tag{1.9} $$
In order to illustrate the limit N → ∞ one first observes that $C_N(t,n) > 0$ and hence $\mathrm{Tr}[C_N(t,n)] = \|C_N(t,n)\|_1$ ($\|\cdot\|_1$ denotes the trace norm). Upon embedding $\mathbb{C}^N$ into $\ell^2(\mathbb{N})$ (viewing $C_N(t,n)$ as an operator in $\ell^2(\mathbb{N})$) one then shows that
$$ \{\kappa_j\}_{j\in\mathbb{N}} \in \ell^\infty(\mathbb{N}), \qquad 0 < \kappa_0 \le |\kappa_j| < 1, \qquad \{c_j^2(1 - \kappa_j^2)^{-1}\}_{j\in\mathbb{N}} \in \ell^1(\mathbb{N}) \tag{1.10} $$
is a natural hypothesis such that $C_N(t,n)$ converges to $C_\infty(t,n)$ in $B_1(\ell^2(\mathbb{N}))$-norm ($B_1(\cdot)$ the set of trace class operators) and hence the determinants $\det[1_N + C_N(t,n)]$ converge to Fredholm determinants $\det_1[1_\infty + C_\infty(t,n)]$ as N → ∞. Given Hypotheses (1.10) we shall prove in our principal Theorem 4.1 that the corresponding Jacobi limit operator $H_\infty(t)$ in $\ell^2(\mathbb{Z})$ is bounded and self-adjoint with t-independent spectrum,
$$ \sigma_{ess}(H_\infty(t)) = \{\tfrac12(\kappa_j + \kappa_j^{-1})\}'_{j\in\mathbb{N}} \cup [-1,1], \tag{1.11} $$
$$ \sigma_{ac}(H_\infty(t)) = [-1,1], \tag{1.12} $$
$$ \{\sigma_p(H_\infty(t)) \cup \sigma_{sc}(H_\infty(t))\} \cap (-1,1) = \emptyset, \tag{1.13} $$
$$ \{\tfrac12(\kappa_j + \kappa_j^{-1})\}_{j\in\mathbb{N}} \subseteq \sigma_p(H_\infty(t)) \subseteq \overline{\{\tfrac12(\kappa_j + \kappa_j^{-1})\}_{j\in\mathbb{N}}}. \tag{1.14} $$
That is, the spectrum of $H_\infty(t)$ in (−1, 1) is purely absolutely continuous and its point spectrum contains the bounded set $\{\frac12(\kappa_j + \kappa_j^{-1})\}_{j\in\mathbb{N}}$. This set, however, besides being bounded (and of course countable), needs to satisfy no further restrictions if the constants $c_j > 0$ satisfy the conditions in (1.10). In particular, these eigenvalues $\{\lambda_j = \frac12(\kappa_j + \kappa_j^{-1})\}_{j\in\mathbb{N}}$ of $H_\infty(t)$ can be dense in any subinterval of $(-\frac12(\kappa_0 + \kappa_0^{-1}), -1) \cup (1, \frac12(\kappa_0 + \kappa_0^{-1}))$. Hence Theorem 4.1 can be viewed as a solution of the following inverse spectral problem: Given any bounded and countable subset $\{\lambda_j\}_{j\in\mathbb{N}}$ of (−∞, −1) ∪ (1, ∞), construct a tri-diagonal matrix whose absolutely continuous spectrum equals [−1, 1] and whose set of eigenvalues includes the prescribed set $\{\lambda_j\}_{j\in\mathbb{N}}$. (In (1.11)–(1.14) $\sigma_{ess}(\cdot)$, $\sigma_{ac}(\cdot)$, $\sigma_{sc}(\cdot)$ and $\sigma_p(\cdot)$ denote the essential, absolutely continuous, singularly continuous and point spectrum (i.e., the set of eigenvalues), respectively, and $\Sigma'$ denotes the derived set of $\Sigma \subset \mathbb{R}$, i.e., the set of accumulation points of Σ.) By inspection, the product in the transmission coefficient $T_N(k)$ in (1.9) converges absolutely as N → ∞ if and only if
$$ \{1 - |\kappa_j|\}_{j\in\mathbb{N}} \in \ell^1(\mathbb{N}). \tag{1.15} $$
Therefore, assuming (1.15) in addition to (1.10) allows one to study scattering theory for the pair (H∞ (t), H0 (t)). Theorem 4.2 is devoted to the detailed treatment of this case. In Sect. 2 we briefly review the necessary prerequisites on N –soliton solutions of the Toda lattice. Except, perhaps, for our representation of bN (t, n) in (2.7), this material is standard. Since the spectrum of HN (t) is t–independent we restrict ourselves to the stationary (i.e., t–independent) case in the following Sects. 3 and 4. Section 3 contains the main technical results on convergence properties of various quantities as N → ∞. Section 4 contains our principal results on spectral properties of H∞ outlined above. Section 5 finally returns to the t–dependent case and yields the construction of new classes of Toda soliton solutions associated with H∞ (t). We emphasize that our techniques are by no means restricted to the Toda system but apply equally well to integrable systems of the AKNS class. The particular case of the Korteweg de Vries (KdV) equation has been worked out in detail in [11] (see also [12]). Further results on one–dimensional, generalized reflectionless potentials can be found in [4, 16–19], and [22]. 2. Reflectionless Jacobi Operators and Toda N –Soliton Solutions This section briefly summarizes reflectionless short–range Jacobi operators and Toda N – soliton solutions. Since a complete bibliography on this subject is nearly impossible due
to the extensive literature available, we restrict ourselves to some of the key references from which the following material has been taken [3, 6–10, 13, 24]. We start with the stationary (i.e., t-independent) case and consider Toda solutions at the end of this section. Define the matrices
$$ C_N(n) = \Big[ c_j c_l\,\frac{(\kappa_j\kappa_l)^{n+1}}{1 - \kappa_j\kappa_l} \Big]_{j,l=1}^{N}, \tag{2.1} $$
$$ A_N(k) = \Big[ \frac{\kappa_j - k}{\kappa_j^{-1} - k}\,\delta_{j,l} \Big]_{j,l=1}^{N}, \tag{2.2} $$
$$ D_N(k,n) = A_N(k)\,C_N(n-1), \tag{2.3} $$
where
$$ 0 < |\kappa_j| < 1, \quad \kappa_j \in \mathbb{R}, \quad c_j > 0, \quad 1 \le j \le N, \quad N \in \mathbb{N}, \quad k \in \mathbb{C}\setminus\{\kappa_j^{-1}\}_{j=1}^{N}. \tag{2.4} $$
Reflectionless Jacobi operators $H_N$ in $\ell^2(\mathbb{Z})$ are then defined by
$$ H_N = a_N S^+ + S^- a_N + b_N, \qquad D(H_N) = \ell^2(\mathbb{Z}), \tag{2.5} $$
where
$$ a_N(n) = \frac{\{\det[1_N + C_N(n+1)]\,\det[1_N + C_N(n-1)]\}^{1/2}}{2\det[1_N + C_N(n)]}, \qquad n \in \mathbb{Z}, \tag{2.6} $$
$$ b_N(n) = \frac12(k + k^{-1}) - \frac{k}{2}\,\frac{\det[1_N + D_N(k,n+1)]\,\det[1_N + C_N(n-1)]}{\det[1_N + D_N(k,n)]\,\det[1_N + C_N(n)]} - \frac{1}{2k}\,\frac{\det[1_N + D_N(k,n-1)]\,\det[1_N + C_N(n)]}{\det[1_N + D_N(k,n)]\,\det[1_N + C_N(n-1)]}, \qquad n \in \mathbb{Z}, $$
$$ \lambda = \tfrac12(k + k^{-1}) \le \inf\{\lambda_j = \tfrac12(\kappa_j + \kappa_j^{-1})\}_{j=1}^{N}, \tag{2.7} $$
$S^\pm$ denote the shift operators in $\ell^\infty(\mathbb{Z})$,
$$ (S^\pm f)(n) = f(n \pm 1), \qquad n \in \mathbb{Z}, \quad f = \{f(n)\}_{n\in\mathbb{Z}} \in \ell^\infty(\mathbb{Z}), \tag{2.8} $$
and
$$ 1_N = \{\delta_{j,l}\}_{j,l=1}^{N}. \tag{2.9} $$
Since
$$ (f, C_N(n)f)_{\mathbb{C}^N} = \sum_{j,l=1}^{N} \overline{f(j)}\,c_j c_l\,\frac{(\kappa_j\kappa_l)^{n+1}}{1 - \kappa_j\kappa_l}\,f(l) = \sum_{m=n+1}^{\infty} \Big| \sum_{j=1}^{N} f(j)\,c_j\kappa_j^{m} \Big|^2 > 0, \qquad f = \{f(j)\}_{j=1}^{N} \in \mathbb{C}^N, \tag{2.10} $$
one infers
$$ C_N(n) > 0, \tag{2.11} $$
and (2.6), (2.7) are well-defined. One verifies the spectral properties of the self-adjoint operator $H_N$,
$$ \sigma_{ess}(H_N) = \sigma_{ac}(H_N) = [-1,1], \qquad \sigma_{sc}(H_N) = \emptyset, \qquad \sigma_d(H_N) = \sigma_p(H_N) = \{\lambda_j = \tfrac12(\kappa_j + \kappa_j^{-1})\}_{j=1}^{N}, \tag{2.12} $$
where $\sigma_{ess}(\cdot)$, $\sigma_{ac}(\cdot)$, $\sigma_{sc}(\cdot)$, $\sigma_d(\cdot)$, and $\sigma_p(\cdot)$ denote the essential, absolutely continuous, singularly continuous, discrete, and point spectrum (i.e., the set of eigenvalues), respectively. While the representation (2.6) for $a_N(n)$ is quite standard, the one in (2.7) follows, for instance, by inserting the solution $f_{N,+}(k,n)$ from (2.22) into $b_N(n) = z - (a_N(n)f_{N,+}(k,n+1) + a_N(n-1)f_{N,+}(k,n-1))\,f_{N,+}(k,n)^{-1}$ (using the difference equation (2.21)). One can show that
$$ |a_N(n)| \le \|H_N\| = \tfrac12(\kappa_0 + \kappa_0^{-1}), \tag{2.13} $$
$$ |b_N(n)| \le \|H_N\| = \tfrac12(\kappa_0 + \kappa_0^{-1}), \tag{2.14} $$
where
$$ \kappa_0 = \inf\{|\kappa_j|\}_{j=1}^{N}. \tag{2.15} $$
The normalized eigenfunctions $\psi_{N,j}$ are explicitly given by
$$ H_N\psi_{N,j} = \lambda_j\psi_{N,j}, \qquad \|\psi_{N,j}\|_2 = 1, \qquad 1 \le j \le N, \tag{2.16} $$
where
$$ \Psi_N(n) = (\psi_{N,1}(n), \ldots, \psi_{N,N}(n))^T, \qquad \Psi_N^0(n) = (c_1\kappa_1^n, \ldots, c_N\kappa_N^n)^T, $$
$$ \Psi_N(n) = \Big\{ \frac{\det[1_N + C_N(n)]}{\det[1_N + C_N(n-1)]} \Big\}^{1/2} \big[1_N + C_N(n)\big]^{-1}\,\Psi_N^0(n), \qquad n \in \mathbb{Z}. \tag{2.17} $$
In addition to (2.7), $b_N$ also admits the trace formula representation [9],
$$ b_N(n) = -\frac12 \sum_{j=1}^{N} (\kappa_j - \kappa_j^{-1})\,\psi_{N,j}(n)^2, \qquad n \in \mathbb{Z}. \tag{2.18} $$
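The formulas (2.1), (2.6), (2.17) and (2.18) can be exercised numerically. The sketch below is an illustration under stated assumptions, not the authors' procedure: the helper names, the small Gaussian-elimination routine, the sample parameters and the truncation of the sum over n to a finite window are all choices made here. It evaluates the eigenvalue equation (2.16) pointwise and the Gram matrix of the eigenfunctions.

```python
import math


def solve_and_det(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting; also return det."""
    n = len(matrix)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det
        det *= aug[col][col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x, det


def tau_and_phi(n, kappas, cs):
    """Return det[1 + C_N(n)] and [1 + C_N(n)]^{-1} Psi^0_N(n), with C_N(n) as in (2.1)."""
    N = len(kappas)
    M = [[(1.0 if j == l else 0.0)
          + cs[j] * cs[l] * (kappas[j] * kappas[l]) ** (n + 1) / (1 - kappas[j] * kappas[l])
          for l in range(N)] for j in range(N)]
    psi0 = [cs[j] * kappas[j] ** n for j in range(N)]
    phi, tau = solve_and_det(M, psi0)
    return tau, phi


def tau(n, kappas, cs):
    return tau_and_phi(n, kappas, cs)[0]


def eigenfunctions(n, kappas, cs):
    """Psi_N(n) from (2.17)."""
    tau_n, phi = tau_and_phi(n, kappas, cs)
    scale = math.sqrt(tau_n / tau(n - 1, kappas, cs))
    return [scale * p for p in phi]


def a_coeff(n, kappas, cs):
    """a_N(n) from (2.6)."""
    return math.sqrt(tau(n + 1, kappas, cs) * tau(n - 1, kappas, cs)) / (2 * tau(n, kappas, cs))


def b_coeff(n, kappas, cs):
    """b_N(n) from the trace formula (2.18)."""
    psi = eigenfunctions(n, kappas, cs)
    return -0.5 * sum((k - 1 / k) * p * p for k, p in zip(kappas, psi))


def residual(n, j, kappas, cs):
    """(H_N psi_j)(n) - lambda_j psi_j(n), which (2.16) says should vanish."""
    lam = 0.5 * (kappas[j] + 1 / kappas[j])
    up = eigenfunctions(n + 1, kappas, cs)[j]
    mid = eigenfunctions(n, kappas, cs)[j]
    down = eigenfunctions(n - 1, kappas, cs)[j]
    h_psi = a_coeff(n, kappas, cs) * up + a_coeff(n - 1, kappas, cs) * down + b_coeff(n, kappas, cs) * mid
    return h_psi - lam * mid


def gram(kappas, cs, span=40):
    """Sum over n in [-span, span] of psi_i(n) psi_j(n); a truncation of the l^2(Z) inner product."""
    N = len(kappas)
    G = [[0.0] * N for _ in range(N)]
    for n in range(-span, span + 1):
        psi = eigenfunctions(n, kappas, cs)
        for i in range(N):
            for j in range(N):
                G[i][j] += psi[i] * psi[j]
    return G
```

For N = 1 the residual can be checked by hand to vanish identically. The same `residual` function can be evaluated for N = 2; if (2.6), (2.17) and (2.18) have been transcribed correctly here, it should again vanish up to rounding.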
Next, we consider Jost and scattering wavefunctions of $H_N$ defined by
$$ f_{N,+}(k,n) = \Big( \Big\{ \frac{\det[1_N + C_N(n)]}{\det[1_N + C_N(n-1)]} \Big\}^{1/2} - \sum_{j=1}^{N} \frac{k\kappa_j}{1 - k\kappa_j}\,c_j\kappa_j^{n}\,\psi_{N,j}(n) \Big) k^n, \qquad |k| < 1, \tag{2.19} $$
$$ \psi_{N,-}(k,n) = f_{N,+}(k^{-1},n), \qquad |k| < 1, \tag{2.20} $$
satisfying
$$ H_N f_{N,+}(k) = z f_{N,+}(k), \qquad H_N\psi_{N,-}(k) = z\psi_{N,-}(k), \qquad z = \frac12(k + k^{-1}) \tag{2.21} $$
in the weak sense. One verifies
$$ f_{N,+}(k,n) = k^{n}\,\frac{\det[1_N + D_N(k,n)]}{\{\det[1_N + C_N(n)]\,\det[1_N + C_N(n-1)]\}^{1/2}}, \qquad \psi_{N,-}(k,n) = k^{-n}\,\frac{\det[1_N + D_N(k^{-1},n)]}{\{\det[1_N + C_N(n)]\,\det[1_N + C_N(n-1)]\}^{1/2}}, \tag{2.22} $$
and
$$ f_{N,+}(k,n) = \begin{cases} k^{n}, & n \to +\infty \\ T_N(k)^{-1}k^{n}, & n \to -\infty \end{cases}, \qquad \psi_{N,-}(k,n) = \begin{cases} k^{-n}, & n \to +\infty \\ T_N(k)\,k^{-n}, & n \to -\infty \end{cases}, \tag{2.23} $$
where $T_N(k)$ denotes the corresponding transmission coefficient
$$ T_N(k) = \prod_{j=1}^{N} \mathrm{sgn}(\kappa_j)\,\frac{1 - k\kappa_j}{\kappa_j - k} \tag{2.24} $$
(note that $T_N(k)^{-1} = T_N(k^{-1})$). The asymptotic relations (2.23) prove the reflectionless property of $H_N$,
$$ R_N^{r,l}(k) = 0, \tag{2.25} $$
(with $R^r(\cdot)$, $R^l(\cdot)$ denoting the reflection coefficient from right and left incidence, respectively) and hence yield the following unitary scattering matrix $S_N(\lambda)$ in $\mathbb{C}^2$ associated with the pair $(H_N, H_0)$, $H_0 = \frac12(S^+ + S^-)$,
$$ S_N(\lambda) = \begin{pmatrix} T_N(k) & 0 \\ 0 & T_N(k) \end{pmatrix}, \qquad \lambda = \frac12(k + k^{-1}) \in [-1,1]. \tag{2.26} $$
We also note that
$$ f_{N,+}(\kappa_j,n) = c_j^{-1}\psi_{N,j}(n), \tag{2.27} $$
$$ \mathop{\mathrm{Res}}_{k=\kappa_j}\,\psi_{N,-}(k,n) = -c_j\kappa_j\,\psi_{N,j}(n), \qquad 1 \le j \le N. \tag{2.28} $$
Finally, we briefly consider N-soliton solutions of the Toda lattice (in Flaschka's variables) given by
$$ \frac{d}{dt}a(t,n) = a(t,n)[b(t,n) - b(t,n+1)], \qquad \frac{d}{dt}b(t,n) = 2[a(t,n-1)^2 - a(t,n)^2], \qquad (t,n) \in \mathbb{R}\times\mathbb{Z}. \tag{2.29} $$
Replacing
$$ c_j \to c_j(t) = c_j e^{\beta_j t}, \qquad \beta_j = \frac12(\kappa_j - \kappa_j^{-1}), \qquad 1 \le j \le N, \quad t \in \mathbb{R} \tag{2.30} $$
in (2.6) and (2.7), denoting the result by $a_N(t,n)$, $b_N(t,n)$, $H_N(t)$, etc., then yields the N-soliton solutions of (2.29). Using the standard Lax pair for the Toda lattice equations (2.29) then proves that $H_N(t)$ is unitarily equivalent to $H_N(0)$ for all $t \in \mathbb{R}$.
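3. Convergence Results as N → ∞

This is the main technical section in which we investigate various limits of $a_N(n)$, $b_N(n)$, $H_N$, $C_N(n)$, $f_{N,+}(k,n)$, $\psi_{N,-}(k,n)$, and $T_N(k)$ as N → ∞. Throughout the remainder of this paper we shall consider sequences $\{\kappa_j\}_{j\in\mathbb{N}}$, $\{c_j\}_{j\in\mathbb{N}}$ subject to the following hypothesis:

(H.3.1) Assume that $\{\kappa_j\}_{j\in\mathbb{N}}$ satisfies $\kappa_0 \le |\kappa_j| < 1$, $\kappa_j \in \mathbb{R}$, $j \in \mathbb{N}$ for some $\kappa_0 > 0$, and that $\{c_j\}_{j\in\mathbb{N}}$ satisfies $c_j > 0$, $j \in \mathbb{N}$, $\{c_j^2(1 - \kappa_j^2)^{-1}\}_{j\in\mathbb{N}} \in \ell^1(\mathbb{N})$.

Whenever we are interested in scattering theory we shall assume the stronger hypothesis

(H.3.2) In addition to (H.3.1) suppose that $\{1 - |\kappa_j|\}_{j\in\mathbb{N}} \in \ell^1(\mathbb{N})$,

motivated by the absolute convergence of the product for $T_N(k)$ as N → ∞ in (2.24). We first derive general convergence results assuming (H.3.1) only. The scattering case in connection with (H.3.2) will be treated at the end of this section. Define
$$ C_\infty(n) = \Big[ c_j c_l\,\frac{(\kappa_j\kappa_l)^{n+1}}{1 - \kappa_j\kappa_l} \Big]_{j,l\in\mathbb{N}}, \qquad n \in \mathbb{Z}, \tag{3.1} $$
$$ A_\infty(k) = \Big[ \frac{\kappa_j - k}{\kappa_j^{-1} - k}\,\delta_{j,l} \Big]_{j,l\in\mathbb{N}}, \qquad k \in \mathbb{C}\setminus\{\kappa_j^{-1}\}_{j\in\mathbb{N}}, \tag{3.2} $$
$$ D_\infty(k,n) = A_\infty(k)\,C_\infty(n-1), \qquad k \in \mathbb{C}\setminus\{\kappa_j^{-1}\}_{j\in\mathbb{N}}, \quad n \in \mathbb{Z}. \tag{3.3} $$
Basic properties of $C_\infty(n)$ and $D_\infty(k,n)$ are listed in the following

Lemma 3.3. Assume (H.3.1) and $n \in \mathbb{Z}$, $N \in \mathbb{N}$, $k \in \mathbb{C}\setminus\{\kappa_j^{-1}\}_{j\in\mathbb{N}}$. Then for all $M \in \mathbb{N}\cup\{\infty\}$,
$$ 0 \le C_M(n) \in B_1(\ell^2(\mathbb{N})), \qquad \|C_M(n)\|_1 \le \|C_\infty(n)\|_1 = \sum_{j\in\mathbb{N}} c_j^2\,\frac{\kappa_j^{2n+2}}{1 - \kappa_j^2}, \tag{3.4} $$
$$ \det[1_N + C_N(n)] \underset{N\to\infty}{\longrightarrow} \det{}_1[1_\infty + C_\infty(n)], \tag{3.5} $$
$$ \det{}_1[1_M + C_M(n)] \underset{n\to+\infty}{\longrightarrow} 1 \quad \text{uniformly with respect to } M, \tag{3.6} $$
$$ D_M(k,n) \in B_1(\ell^2(\mathbb{N})), \qquad \|D_M(k,n)\|_1 \le \|D_\infty(k,n)\|_1 \le \mathrm{const.}(k)\,\|C_\infty(n)\|_1, \tag{3.7} $$
$$ \det[1_N + D_N(k,n)] \underset{N\to\infty}{\longrightarrow} \det{}_1[1_\infty + D_\infty(k,n)], \tag{3.8} $$
$$ \det{}_1[1_M + D_M(k,n)] \underset{n\to+\infty}{\longrightarrow} 1 \quad \text{uniformly with respect to } M. \tag{3.9} $$
(Here $B_1(\cdot)$ denotes the set of trace class operators, $\|\cdot\|_1$ the corresponding trace norm, $\det_1[1_\infty + \cdot\,]$ the associated Fredholm determinant, and $1_\infty$ the identity in $\ell^2(\mathbb{N})$.)

Proof. Let $f = \{f(p)\}_{p\in\mathbb{N}} \in \ell^2(\mathbb{N})$; then
$$ (f, C_M(n)f)_{\ell^2(\mathbb{N})} = \sum_{m=n+1}^{\infty} \Big| \sum_{j=1}^{M} f(j)\,c_j\kappa_j^{m} \Big|^2 \ge 0, \tag{3.10} $$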
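and thus $\|C_M\|_1 = \mathrm{Tr}[C_M(n)] \le \mathrm{Tr}[C_\infty(n)]$ yield (3.4) given (H.3.1). Since $(1_\infty - P_N)C_\infty(n)(1_\infty - P_N) \ge 0$, we infer
$$ \|C_\infty(n) - P_N C_\infty(n) P_N\|_1 \le 2\,\|C_\infty(n)\|_1^{1/2} \Big( \sum_{j=N+1}^{\infty} c_j^2\,\frac{\kappa_j^{2n+2}}{1 - \kappa_j^2} \Big)^{1/2} \underset{N\to\infty}{\longrightarrow} 0, \tag{3.11} $$
and hence (3.5) since
$$ \big| \det{}_1[1_\infty + C_\infty(n)] - \det[1_N + C_N(n)] \big| \le \|C_\infty(n) - P_N C_\infty(n) P_N\|_1 \exp\{\|C_\infty(n)\|_1 + 1\} \tag{3.12} $$
by [23], p.66. Here $P_N$, $N \in \mathbb{N}$ denotes the projection
$$ P_N : \ell^2(\mathbb{N}) \to \ell^2(\mathbb{N}), \qquad f = (f_1, \ldots, f_N, f_{N+1}, \ldots) \mapsto (f_1, \ldots, f_N, 0, 0, \ldots) \tag{3.13} $$
which embeds $\mathbb{C}^N$ into $\ell^2(\mathbb{N})$ and $P_\infty = 1_\infty$ denotes the identity operator in $\ell^2(\mathbb{N})$. The fact that $\|P_M C_\infty(n) P_M\|_1 \le \|C_\infty(n)\|_1 \to 0$ as $n \to +\infty$ by the Weierstrass test for all $M \in \mathbb{N}\cup\{\infty\}$ together with
$$ \det{}_1[1_\infty + P_M C_\infty(n) P_M] \le \exp\{\|C_\infty(n)\|_1\} \tag{3.14} $$
(cf. [23], p.47) then proves (3.6). The corresponding results for $D_M(k,n)$, $M \in \mathbb{N}\cup\{\infty\}$ then follow from (3.3) since $A_\infty(k)$ is a bounded operator in $\ell^2(\mathbb{N})$ with
$$ \|A_\infty(k)\| = \sup_{j\in\mathbb{N}} \Big| \frac{\kappa_j - k}{\kappa_j^{-1} - k} \Big| < \infty, \qquad k \in \mathbb{C}\setminus\{\kappa_j^{-1}\}_{j\in\mathbb{N}}, \tag{3.15} $$
and
$$ \|A_M(k)C_M(n-1)\|_1 \le \|A_M(k)\|\,\|C_M(n-1)\|_1, $$
$$ P_N A_\infty(k) P_N C_\infty(n-1) P_N = A_\infty(k) P_N C_\infty(n-1) P_N, $$
$$ \|D_\infty(k,n) - P_N D_\infty(k,n) P_N\|_1 \le \|A_\infty(k)\|\,\|C_\infty(n-1) - P_N C_\infty(n-1) P_N\|_1. $$

Lemma 3.4. Assume (H.3.1) and $n \in \mathbb{Z}$, $N \in \mathbb{N}$, $k \in \mathbb{C}\setminus\{\kappa_j^{-1}\}_{j\in\mathbb{N}}$. Then
$$ \det[C_N(n)] = \prod_{j=1}^{N} c_j^2\kappa_j^{2n+2} \prod_{\substack{l,m=1 \\ m>l}}^{N} (\kappa_l - \kappa_m)^2 \prod_{r,s=1}^{N} (1 - \kappa_r\kappa_s)^{-1}, \tag{3.16} $$
$$ \det[D_N(k,n)] = \prod_{j=1}^{N} \frac{\kappa_j - k}{\kappa_j^{-1} - k}\,\det[C_N(n-1)]. \tag{3.17} $$
For $M \in \mathbb{N}\cup\{\infty\}$,
$$ \det{}_1[1_M + C_M(n)] = \sum_{I\in P_M} a_I\,x_I^{n}, \tag{3.18} $$
$$ \det{}_1[1_M + D_M(k,n)] = \sum_{I\in P_M} a_I\,p_I(k)\,x_I^{n-1}, \tag{3.19} $$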
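where $P_N$ is the power set of {1, . . . , N}, $P_\infty$ is the set of all finite subsets of $\mathbb{N}$,
$$ x_I = \prod_{j\in I} \kappa_j^2 > 0, \qquad p_I(k) = \prod_{j\in I} \frac{\kappa_j - k}{\kappa_j^{-1} - k}, $$
$$ a_I = \prod_{j\in I} c_j^2\kappa_j^2 \prod_{\substack{l,m\in I \\ m>l}} (\kappa_l - \kappa_m)^2 \prod_{r,s\in I} (1 - \kappa_r\kappa_s)^{-1} > 0, \qquad I \subset \mathbb{N}. \tag{3.20} $$
In particular, this yields the monotonicity property
$$ 1 < \det{}_1[1_M + C_M(n+1)] < \det{}_1[1_M + C_M(n)], \qquad M \in \mathbb{N}\cup\{\infty\}. \tag{3.21} $$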
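The subset expansion (3.18) with the coefficients (3.20) can be compared against a directly computed determinant in exact rational arithmetic. The sketch below does this for a small N; the function names and the sample values of $\kappa_j$ and $c_j$ are assumptions made for illustration (indices run from 0 in the code), and the Leibniz-formula determinant is only sensible for such tiny matrices.

```python
from fractions import Fraction
from itertools import combinations, permutations


def det_exact(matrix):
    """Determinant via the Leibniz formula (exact, adequate for very small matrices)."""
    n = len(matrix)
    total = Fraction(0)
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
        term = Fraction((-1) ** inversions)
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total


def one_plus_c(n, kappas, cs):
    """The matrix 1_N + C_N(n) with C_N(n) as in (2.1)."""
    N = len(kappas)
    return [[(1 if j == l else 0)
             + cs[j] * cs[l] * (kappas[j] * kappas[l]) ** (n + 1) / (1 - kappas[j] * kappas[l])
             for l in range(N)] for j in range(N)]


def a_coefficient(I, kappas, cs):
    """a_I from (3.20)."""
    value = Fraction(1)
    for j in I:
        value *= cs[j] ** 2 * kappas[j] ** 2
    for l, m in combinations(I, 2):
        value *= (kappas[l] - kappas[m]) ** 2
    for r in I:
        for s in I:
            value /= 1 - kappas[r] * kappas[s]
    return value


def x_coefficient(I, kappas):
    """x_I from (3.20)."""
    value = Fraction(1)
    for j in I:
        value *= kappas[j] ** 2
    return value


def subset_expansion(n, kappas, cs):
    """Right-hand side of (3.18): the sum of a_I x_I^n over all subsets I of {0, ..., N-1}."""
    N = len(kappas)
    total = Fraction(0)
    for size in range(N + 1):
        for I in combinations(range(N), size):
            total += a_coefficient(I, kappas, cs) * x_coefficient(I, kappas) ** n
    return total


if __name__ == "__main__":
    kappas = [Fraction(1, 2), Fraction(-1, 3), Fraction(2, 3)]   # sample distinct values
    cs = [Fraction(1), Fraction(1, 2), Fraction(2)]              # sample positive weights
    for n in range(-2, 4):
        print(n, det_exact(one_plus_c(n, kappas, cs)) == subset_expansion(n, kappas, cs))
```

Since every $a_I$ is positive and $0 < x_I < 1$ for nonempty I, the expansion also makes the monotonicity (3.21) visible term by term.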
Proof. Equation (3.16) follows from [20], p.92 and (3.17) is then clear from (3.2) and (3.3). For $N \in \mathbb{N}$ we expand
$$ \det[1_N + C_N(n)] = 1 + \det[C_N(n)] + \sum_{j_1=1}^{N} \det[C_N^{j_1}(n)] + \sum_{\substack{j_1,j_2=1 \\ j_1<j_2}}^{N} \det[C_N^{j_1,j_2}(n)] + \cdots + \sum_{\substack{j_1,\ldots,j_{N-1}=1 \\ j_1<\cdots<j_{N-1}}}^{N} \det[C_N^{j_1,\ldots,j_{N-1}}(n)], \tag{3.22} $$
where $C_N^{j_1,\ldots,j_k}(n)$ denotes the $(N-k)\times(N-k)$ matrix obtained by deleting the $j_1, \ldots, j_k$ rows and columns of $C_N(n)$. Each of these matrices is of the same form as the original matrix $C_N(n)$; in particular, the analog of (3.16) applies to all $C_N^{j_1,\ldots,j_k}(n)$. Since each summand in (3.22) corresponds to one element $I \in P_N$, we proved (3.18) for $M \in \mathbb{N}$. Since all terms in (3.22) are positive,
$$ \det[1_N + C_N(n)] = \sum_{I\in P_N} a_I\,x_I^{n} \le \sum_{I\in P_{N+1}} a_I\,x_I^{n} = \det[1_{N+1} + C_{N+1}(n)] \le \exp[\|C_\infty(n)\|_1], \tag{3.23} $$
and hence
$$ \det{}_1[1_\infty + C_\infty(n)] = \lim_{N\to\infty} \det[1_N + C_N(n)] = \lim_{N\to\infty} \sum_{I\in P_N} a_I\,x_I^{n} \tag{3.24} $$
proves (3.18). Equation (3.21) is an immediate consequence of (3.16), $a_I > 0$ and $0 < x_I < 1$. Equation (3.19) is proved similarly. (Even though the terms in (3.19) are not necessarily positive, the $p_I(k)$ are uniformly bounded with respect to I by a constant depending on k.)

Next we define
$$ a_\infty(n) = \frac{\{\det{}_1[1_\infty + C_\infty(n+1)]\,\det{}_1[1_\infty + C_\infty(n-1)]\}^{1/2}}{2\det{}_1[1_\infty + C_\infty(n)]}, \qquad n \in \mathbb{Z}, \tag{3.25} $$
$$ b_\infty(n) = \frac12(k + k^{-1}) - \frac{k}{2}\,\frac{\det{}_1[1_\infty + D_\infty(k,n+1)]\,\det{}_1[1_\infty + C_\infty(n-1)]}{\det{}_1[1_\infty + D_\infty(k,n)]\,\det{}_1[1_\infty + C_\infty(n)]} - \frac{1}{2k}\,\frac{\det{}_1[1_\infty + D_\infty(k,n-1)]\,\det{}_1[1_\infty + C_\infty(n)]}{\det{}_1[1_\infty + D_\infty(k,n)]\,\det{}_1[1_\infty + C_\infty(n-1)]}, \tag{3.26} $$
$$ n \in \mathbb{Z}, \qquad 0 < |k| \le \kappa_0, \quad k \in \mathbb{R}. $$
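The convergence of the finite determinants behind these definitions, (3.5) together with the monotone bound (3.23), can be observed numerically. The sketch below is only an illustration: the sequences $\kappa_j = 0.3 + 0.5/(j+1)$ and $c_j = 2^{-j}$ are hypothetical choices made here so that (H.3.1) holds with $\kappa_0 = 0.3$, and the elimination routine is a plain floating-point helper.

```python
import math


def det_no_pivot(matrix):
    """Determinant by Gaussian elimination without pivoting (adequate for 1 + C with C >= 0)."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def kappa(j):
    """Hypothetical sequence with 0.3 <= kappa_j < 1 and distinct entries (j = 1, 2, ...)."""
    return 0.3 + 0.5 / (j + 1)


def weight(j):
    """Hypothetical weights c_j = 2^-j, so that sum_j c_j^2 / (1 - kappa_j^2) converges."""
    return 2.0 ** (-j)


def det_truncated(N, n):
    """det[1_N + C_N(n)] for the sample sequences above."""
    idx = range(1, N + 1)
    M = [[(1.0 if j == l else 0.0)
          + weight(j) * weight(l) * (kappa(j) * kappa(l)) ** (n + 1) / (1 - kappa(j) * kappa(l))
          for l in idx] for j in idx]
    return det_no_pivot(M)


def trace_norm(N, n):
    """Tr[C_N(n)], which equals the trace norm since C_N(n) >= 0, cf. (3.4)."""
    return sum(weight(j) ** 2 * kappa(j) ** (2 * n + 2) / (1 - kappa(j) ** 2)
               for j in range(1, N + 1))


if __name__ == "__main__":
    for N in range(1, 9):
        print(N, det_truncated(N, 0), math.exp(trace_norm(N, 0)))
```

The printed determinants increase with N and stay below the exponential of the trace, in line with (3.23); the increments shrink roughly like $c_N^2$ for this choice of weights.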
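We note that the conditions on k in (3.26) imply $\frac{\kappa_j - k}{\kappa_j^{-1} - k} \ge 0$ and hence $p_I \ge 0$ in (3.20), implying
$$ \det{}_1[1_M + D_M(k,n)] \ge 1 \tag{3.27} $$
(cf. (3.19)).

Theorem 3.5. Assume (H.3.1). Then for $n \in \mathbb{Z}$,
$$ a_N(n) \underset{N\to\infty}{\longrightarrow} a_\infty(n), \qquad b_N(n) \underset{N\to\infty}{\longrightarrow} b_\infty(n), \tag{3.28} $$
$$ \frac12 \le a_\infty(n) \le \frac12(\kappa_0 + \kappa_0^{-1}), \qquad |b_\infty(n)| \le \frac12(\kappa_0 + \kappa_0^{-1}). \tag{3.29} $$
In addition, the operator $H_\infty$ on $\ell^2(\mathbb{Z})$ defined by
$$ H_\infty = a_\infty S^+ + S^- a_\infty + b_\infty, \qquad D(H_\infty) = \ell^2(\mathbb{Z}) \tag{3.30} $$
is self-adjoint and
$$ \mathop{\text{s-lim}}_{N\to\infty} H_N = H_\infty, \tag{3.31} $$
or equivalently, $H_N$ converges to $H_\infty$ in the strong resolvent sense.

Proof. Equation (3.28) follows from (3.5) and (3.8) and the fact that $\det_1[1_M + D_M(k,n)] \ge 1$ for all $n \in \mathbb{Z}$, $0 < |k| < \kappa_0$, $k \in \mathbb{R}$, $M \in \mathbb{N}\cup\{\infty\}$. Together with (2.13) and (2.14) this yields the upper bounds in (3.29). In order to prove $a_\infty(n) \ge \frac12$ one considers
$$ 4a_\infty(n)^2 = \Big( \sum_{I\in P_\infty} a_I x_I^{n+1} \Big) \Big( \sum_{K\in P_\infty} a_K x_K^{n-1} \Big) \Big( \sum_{L\in P_\infty} a_L x_L^{n} \Big)^{-2} \tag{3.32} $$
$$ = \sum_{I,K\in P_\infty} a_I a_K (x_I x_K)^{n-1} x_I^2 \Big( \sum_{R,S\in P_\infty} a_R a_S x_R^{n} x_S^{n} \Big)^{-1} $$
$$ = \frac12 \sum_{I,K\in P_\infty} a_I a_K (x_I x_K)^{n-1} (x_I^2 + x_K^2) \Big( \sum_{R,S\in P_\infty} a_R a_S x_R^{n} x_S^{n} \Big)^{-1} $$
$$ = 1 + \frac12 \sum_{I,K\in P_\infty} a_I a_K (x_I x_K)^{n-1} (x_I - x_K)^2 \Big( \sum_{R,S\in P_\infty} a_R a_S x_R^{n} x_S^{n} \Big)^{-1} \ge 1. $$
(The reordering of the terms in (3.32) is justified by the absolute convergence of the sums involved.) Equations (3.28) and (3.29) then prove (3.31).

The convergence properties in Lemmas 3.3, 3.4 and Theorem 3.5 immediately imply the following result for the wave functions of $H_\infty$.

Lemma 3.6. Assume (H.3.1) and $k \in \mathbb{C}\setminus\{\kappa_j^{-1}\}_{j\in\mathbb{N}}$. Then
$$ f_{N,+}(k,n) \underset{N\to\infty}{\longrightarrow} f_{\infty,+}(k,n) = k^{n}\,\frac{\det{}_1[1_\infty + D_\infty(k,n)]}{\{\det{}_1[1_\infty + C_\infty(n)]\,\det{}_1[1_\infty + C_\infty(n-1)]\}^{1/2}}, \tag{3.33} $$
$$ \psi_{N,-}(k,n) \underset{N\to\infty}{\longrightarrow} \psi_{\infty,-}(k,n) = f_{\infty,+}(k^{-1},n) \tag{3.34} $$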
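pointwise with respect to $n$, in particular,
$$\psi_{N,j}(n) = c_j f_{N,+}(\kappa_j, n) \underset{N\to\infty}{\longrightarrow} \psi_{\infty,j}(n) = c_j f_{\infty,+}(\kappa_j, n) \tag{3.35}$$
pointwise and
$$H_\infty f_{\infty,+}(k) = z f_{\infty,+}(k), \quad H_\infty \psi_{\infty,-}(k) = z \psi_{\infty,-}(k), \quad z = \frac12 (k + k^{-1}), \tag{3.36}$$
$$H_\infty \psi_{\infty,j} = \lambda_j \psi_{\infty,j}, \quad \lambda_j = \frac12 (\kappa_j + \kappa_j^{-1}) \tag{3.37}$$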
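in the weak sense.

Unfortunately, Lemma 3.6 sheds no light on whether or not $\psi_{\infty,j} \in \ell^2(\mathbb{Z})$ and $\psi_{\infty,j} \not\equiv 0$, i.e., whether or not $\lambda_j \in \sigma_p(H_\infty)$. In order to clarify this point we need to pursue a more sophisticated strategy. We introduce some notation first. Let
$$\hat{\mathcal{H}}_a = \ell^2(\mathbb{N}) \otimes \ell^2([a,\infty)), \quad a \in \mathbb{Z} \tag{3.38}$$
and introduce the following identity operators:
$$1_N \text{ in } \mathbb{C}^N, \quad 1_\infty \text{ in } \ell^2(\mathbb{N}), \quad \hat{1} \text{ in } \ell^2([a,\infty)), \quad \hat{1}_N = 1_N \otimes \hat{1} \text{ in } \mathbb{C}^N \otimes \ell^2([a,\infty)), \quad \hat{1}_\infty = 1_\infty \otimes \hat{1} \text{ in } \ell^2(\mathbb{N}) \otimes \ell^2([a,\infty)) = \hat{\mathcal{H}}_a. \tag{3.39}$$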
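Moreover, embed $\mathbb{C}^N$ and $\mathbb{C}^N \otimes \ell^2([a,\infty))$ into $\ell^2(\mathbb{N})$ and $\hat{\mathcal{H}}_a$ in the obvious way and let $\hat{P}_N$ be the projection from $\hat{\mathcal{H}}_a$ onto $\mathbb{C}^N \otimes \ell^2([a,\infty))$, i.e.,
$$\hat{\mathcal{H}}_a = [\mathbb{C}^N \otimes \ell^2([a,\infty))] \oplus (\hat{1}_\infty - \hat{P}_N)\hat{\mathcal{H}}_a, \qquad \hat{P}_N = \begin{pmatrix} \hat{1}_N & 0 \\ 0 & 0 \end{pmatrix}, \tag{3.40}$$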
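and denote by $\hat{C}_N$ and $\hat{C}_\infty$ the operators in $\mathbb{C}^N \otimes \ell^2([a,\infty))$ and $\hat{\mathcal{H}}_a$ induced by $C_N(\cdot)$ and $C_\infty(\cdot)$, i.e.,
$$(\hat{C}_M f)_j(n) = \sum_{l=1}^{M} c_j c_l\,\frac{(\kappa_j \kappa_l)^{n+1}}{1 - \kappa_j \kappa_l}\, f_l(n), \quad n \in \mathbb{Z},\ 1 \le j \le M,\ M \in \mathbb{N} \cup \{\infty\}. \tag{3.41}$$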
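Lemma 3.7. Assume (H.3.1). Then
$$\hat{C}_\infty \ge 0, \tag{3.42}$$
$$\|\hat{C}_\infty\| < \infty, \tag{3.43}$$
$$\operatorname*{s-lim}_{N\to\infty} \begin{pmatrix} (\hat{1}_N + \hat{C}_N)^{-1} & 0 \\ 0 & 0 \end{pmatrix} = (\hat{1}_\infty + \hat{C}_\infty)^{-1}. \tag{3.44}$$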
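Proof. Equation (3.42) is shown as in the proof of Lemma 3.3. Equation (3.43) follows from
$$\begin{aligned}
\|\hat{C}_\infty f\|^2_{\hat{\mathcal{H}}_a} &= \sum_{n=a}^{\infty} \sum_{j=1}^{\infty} \Big| \sum_{l=1}^{\infty} c_j c_l\,\frac{(\kappa_j \kappa_l)^{n+1}}{1 - \kappa_j \kappa_l}\, f_l(n) \Big|^2 \\
&\le \sum_{n=a}^{\infty} \sum_{j=1}^{\infty} \sum_{l=1}^{\infty} c_j^2 c_l^2\,\frac{(\kappa_j \kappa_l)^{2n+2}}{(1 - \kappa_j \kappa_l)^2} \sum_{k=1}^{\infty} |f_k(n)|^2 \\
&\le \sum_{n=a}^{\infty} \Big[ \sum_{j=1}^{\infty} c_j^2\,\frac{\kappa_j^{2n+2}}{1 - \kappa_j^2} \Big]^2 \sum_{k=1}^{\infty} |f_k(n)|^2 \le \Big[ \sum_{j=1}^{\infty} c_j^2\,\frac{\kappa_j^{2a+2}}{1 - \kappa_j^2} \Big]^2 \|f\|^2_{\hat{\mathcal{H}}_a}.
\end{aligned} \tag{3.45}$$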
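Finally, to prove (3.44), we first note that $\|(\hat{1}_\infty + \hat{C}_\infty) f\|_{\hat{\mathcal{H}}_a} \ge \|f\|_{\hat{\mathcal{H}}_a}$, and hence it suffices to estimate
$$\begin{aligned}
&\Big\| (\hat{1}_\infty + \hat{C}_\infty) \Big[ (\hat{1}_\infty + \hat{C}_\infty)^{-1} g - \begin{pmatrix} (\hat{1}_N + \hat{C}_N)^{-1} & 0 \\ 0 & 0 \end{pmatrix} g \Big] \Big\|_{\hat{\mathcal{H}}_a} \\
&\quad = \big\| g - \hat{P}_N g - (\hat{1}_\infty - \hat{P}_N) \hat{C}_\infty \hat{P}_N f_N \big\|_{\hat{\mathcal{H}}_a} \\
&\quad \le \| g - \hat{P}_N g \|_{\hat{\mathcal{H}}_a} + \| (\hat{1}_\infty - \hat{P}_N) \hat{C}_\infty \hat{P}_N f_N \|_{\hat{\mathcal{H}}_a},
\end{aligned} \tag{3.46}$$
where we abbreviated $f_N = \begin{pmatrix} (\hat{1}_N + \hat{C}_N)^{-1} & 0 \\ 0 & 0 \end{pmatrix} g$ (note that $\|f_N\|_{\hat{\mathcal{H}}_a} \le \|g\|_{\hat{\mathcal{H}}_a}$). Since $\operatorname{s-lim}_{N\to\infty} \hat{P}_N = \hat{1}_\infty$, the first term on the right-hand side of (3.46) tends to zero. As for the second term one estimates
$$\begin{aligned}
\| (\hat{1}_\infty - \hat{P}_N) \hat{C}_\infty \hat{P}_N f_N \|^2_{\hat{\mathcal{H}}_a} &= \sum_{n=a}^{\infty} \sum_{j=N+1}^{\infty} \Big| \sum_{l=1}^{N} c_j c_l\,\frac{(\kappa_j \kappa_l)^{n+1}}{1 - \kappa_j \kappa_l}\, f_{N,l}(n) \Big|^2 \\
&\le \sum_{j=N+1}^{\infty} \sum_{l=1}^{\infty} c_j^2 c_l^2\,\frac{\kappa_j^{2a+2}}{1 - \kappa_j^2}\,\frac{\kappa_l^{2a+2}}{1 - \kappa_l^2}\, \|f_N\|^2_{\hat{\mathcal{H}}_a} \underset{N\to\infty}{\longrightarrow} 0.
\end{aligned} \tag{3.47}$$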
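Hence defining
$$\Psi^0_\infty(n) = (\{c_j \kappa_j^n\}_{j\in\mathbb{N}})^T, \quad \Psi_\infty(n) = (\{\psi_{\infty,j}(n)\}_{j\in\mathbb{N}})^T, \quad \Psi_\infty(n) = \Big\{ \frac{{\det}_1[1_\infty + C_\infty(n)]}{{\det}_1[1_\infty + C_\infty(n-1)]} \Big\}^{1/2} \big[ 1_\infty + C_\infty(n) \big]^{-1} \Psi^0_\infty(n), \tag{3.48}$$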
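one obtains the following result.

Theorem 3.8. Assume (H.3.1) and let $a \in \mathbb{Z}$. Then
$$\|\Psi_\infty - \Psi_N\|_{\hat{\mathcal{H}}_a} \underset{N\to\infty}{\longrightarrow} 0, \tag{3.49}$$
in particular,
$$\|\psi_{\infty,j} - \psi_{N,j}\|_{\ell^2([a,\infty))} \underset{N\to\infty}{\longrightarrow} 0 \tag{3.50}$$
implying
$$\|\psi_{\infty,j}\|_2 \le 1, \quad j \in \mathbb{N}. \tag{3.51}$$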
Proof. By Lemma 3.3, the ratio of determinants in (2.17) converges to that in (3.48). Moreover, (3.11) shows that the convergence is uniform for $n \in [a,\infty)$. By Lemma 3.7 one then concludes (3.49) and (3.50). Equation (3.51) then follows from $\|\psi_{\infty,j}\|_{\ell^2([a,\infty))} = \lim_{N\to\infty} \|\psi_{N,j}\|_{\ell^2([a,\infty))} \le 1$ for all $a \in \mathbb{Z}$ by taking $a \to -\infty$. The fact that the $\psi_{\infty,j}$ defined in (3.48) coincide with the ones in (3.35) is clear from the pointwise convergence in (3.35).

It remains to prove that $\psi_{\infty,j} \not\equiv 0$. In order to achieve this it suffices to study the asymptotics of $\psi_{\infty,j}(n)$ as $n \to +\infty$.

Lemma 3.9. Assume (H.3.1) and $k \in \mathbb{C}\setminus\{\kappa_j^{-1}\}_{j\in\mathbb{N}}$, $j \in \mathbb{N}$. Then
$$\lim_{n\to+\infty} k^{-n} f_{\infty,+}(k,n) = 1, \tag{3.52}$$
$$\lim_{n\to+\infty} c_j^{-1} \kappa_j^{-n} \psi_{\infty,j}(n) = 1. \tag{3.53}$$

Proof. This is clear from (3.6), (3.9), (3.33) and (3.35).
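The finite-$N$ objects behind Theorem 3.8 and Lemma 3.9 can be explored numerically. The following is only a sketch for $N = 2$: it assumes that the finite-$N$ formulas (2.17) and (2.18), which are not reproduced in this section, have the same shape as (3.48) and (3.54) with $\infty$ replaced by $N$, and the values in `KAPPAS` and `CS` are hypothetical sample data. It evaluates $a_N(n)$ as in (3.25), $\Psi_N(n)$ as in (3.48) and $b_N(n)$ via the trace formula, and measures how well $\psi_{N,j}$ satisfies $H_N \psi_{N,j} = \lambda_j \psi_{N,j}$.

```python
import math

KAPPAS = [0.5, -0.3]   # hypothetical sample values with 0 < |kappa_j| < 1
CS = [1.0, 0.7]        # hypothetical positive norming constants c_j


def c_matrix(kappas, cs, n):
    """C_N(n) with entries c_j c_l (kappa_j kappa_l)^(n+1) / (1 - kappa_j kappa_l), cf. (3.41)."""
    return [[cj * cl * (kj * kl) ** (n + 1) / (1.0 - kj * kl)
             for kl, cl in zip(kappas, cs)]
            for kj, cj in zip(kappas, cs)]


def tau(kappas, cs, n):
    """det[1_N + C_N(n)] for N = 2, written out explicitly."""
    C = c_matrix(kappas, cs, n)
    return (1.0 + C[0][0]) * (1.0 + C[1][1]) - C[0][1] * C[1][0]


def a_coeff(kappas, cs, n):
    """a_N(n), i.e. (3.25) with infinity replaced by N (assumption)."""
    num = math.sqrt(tau(kappas, cs, n + 1) * tau(kappas, cs, n - 1))
    return num / (2.0 * tau(kappas, cs, n))


def psi(kappas, cs, n):
    """Psi_N(n) = {tau(n)/tau(n-1)}^(1/2) [1 + C(n)]^(-1) Psi0(n), cf. (3.48)."""
    C = c_matrix(kappas, cs, n)
    m00, m01, m10, m11 = 1.0 + C[0][0], C[0][1], C[1][0], 1.0 + C[1][1]
    det = m00 * m11 - m01 * m10
    p0 = [c * k ** n for k, c in zip(kappas, cs)]
    # Cramer's rule for the 2x2 system [1 + C(n)] phi = Psi0(n)
    phi = [(m11 * p0[0] - m01 * p0[1]) / det,
           (m00 * p0[1] - m10 * p0[0]) / det]
    scale = math.sqrt(tau(kappas, cs, n) / tau(kappas, cs, n - 1))
    return [scale * v for v in phi]


def b_coeff(kappas, cs, n):
    """b_N(n) from the trace formula, cf. (3.54) with N terms (assumption)."""
    return -0.5 * sum((k - 1.0 / k) * p * p
                      for k, p in zip(kappas, psi(kappas, cs, n)))


def residual(kappas, cs, j, n):
    """(H psi_j)(n) - lambda_j psi_j(n) for H = a S+ + S- a + b."""
    lam = 0.5 * (kappas[j] + 1.0 / kappas[j])
    return (a_coeff(kappas, cs, n) * psi(kappas, cs, n + 1)[j]
            + a_coeff(kappas, cs, n - 1) * psi(kappas, cs, n - 1)[j]
            + (b_coeff(kappas, cs, n) - lam) * psi(kappas, cs, n)[j])


if __name__ == "__main__":
    for n in range(-3, 4):
        print(n, a_coeff(KAPPAS, CS, n), b_coeff(KAPPAS, CS, n),
              [residual(KAPPAS, CS, j, n) for j in range(2)])
```

In such an experiment one would expect the residuals to be of the order of rounding errors, $a_N(n) \ge \frac12$ as in (3.29), and $c_j^{-1}\kappa_j^{-n}\psi_{N,j}(n)$ close to $1$ for large positive $n$, in line with (3.53).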
Finally we study the trace formula (2.18) in the limit $N \to \infty$.

Lemma 3.10. Assume (H.3.1) and $n \in \mathbb{Z}$. Then
$$b_\infty(n) = -\frac12 \sum_{j\in\mathbb{N}} (\kappa_j - \kappa_j^{-1})\, \psi_{\infty,j}(n)^2. \tag{3.54}$$
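Proof. This follows from (2.18), $b_N(n) \to b_\infty(n)$ as $N \to \infty$ in (3.28) and
$$\begin{aligned}
\Big| \sum_{j=1}^{N} (\kappa_j - \kappa_j^{-1})\, \psi_{N,j}(n)^2 - \sum_{j=1}^{\infty} (\kappa_j - \kappa_j^{-1})\, \psi_{\infty,j}(n)^2 \Big| &\le \frac12 (\kappa_0^{-1} - \kappa_0) \sum_{j=1}^{\infty} |\psi_{N,j}(n)^2 - \psi_{\infty,j}(n)^2| \\
&\le \frac12 (\kappa_0^{-1} - \kappa_0)\, \|\Psi_N(n) + \Psi_\infty(n)\|_{\ell^2(\mathbb{N})}\, \|\Psi_N(n) - \Psi_\infty(n)\|_{\ell^2(\mathbb{N})}.
\end{aligned} \tag{3.55}$$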
The first factor on the right-hand side of (3.55) is bounded by (3.49) and the second tends to zero, again by (3.49).

This completes the technical convergence part under the general hypothesis (H.3.1). In the remainder of this section we assume the stronger hypothesis (H.3.2) motivated by scattering theory. In this case we will be able to prove convergence of the $a_N$ and $b_N$ in $\ell^1(\mathbb{Z})$-norm. To do this, we first have to get a handle on the behavior of various ratios of determinants as $n \to -\infty$.

Lemma 3.11. Assume (H.3.2) and $k \in \mathbb{C}\setminus\{\kappa_j^{-1}\}_{j\in\mathbb{N}}$. For all $M \in \mathbb{N} \cup \{\infty\}$, $l \le M$, $l \in \mathbb{N}$ and for all $n \in \mathbb{Z}$,
$$\frac{{\det}_1[1_M + C_M(n)]}{{\det}_1[1_M + C_M(n-1)]} > x_{\mathbb{N}}, \tag{3.56}$$
$$\Big| \frac{{\det}_1[1_M + D_M(\kappa_l, n)]}{{\det}_1[1_M + C_M(n-1)]} \Big| \le \text{const.}(l)\, \kappa_l^{-2n+2} \tag{3.57}$$
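with the constant being independent of $M$ and $n$. Furthermore, for all $\varepsilon > 0$ there exist $N_\varepsilon \in \mathbb{N}$, $n_\varepsilon \in \mathbb{Z}$ such that for all $M > N_\varepsilon$, $M \in \mathbb{N} \cup \{\infty\}$ and all $n < n_\varepsilon$, $n \in \mathbb{Z}$,
$$\Big| \frac{{\det}_1[1_M + C_M(n)]}{{\det}_1[1_M + C_M(n-1)]} - x_{\mathbb{N}} \Big| < \varepsilon, \tag{3.58}$$
$$\Big| \frac{{\det}_1[1_M + D_M(k, n)]}{{\det}_1[1_M + C_M(n-1)]} - p_{\mathbb{N}}(k) \Big| < \varepsilon, \tag{3.59}$$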
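where
$$x_{\mathbb{N}} = \prod_{j\in\mathbb{N}} \kappa_j^2 > 0, \qquad p_{\mathbb{N}}(k) = \prod_{j\in\mathbb{N}} \frac{\kappa_j - k}{\kappa_j^{-1} - k} \ne 0. \tag{3.60}$$

Proof. By (H.3.2), $x_{\mathbb{N}}$ and $p_{\mathbb{N}}(k)$ exist since $x_{\mathbb{N}} = \prod_{j\in\mathbb{N}} [1 - (1 - \kappa_j^2)]$ and $p_{\mathbb{N}}(k) = \prod_{j\in\mathbb{N}} \big[1 - \frac{1 - \kappa_j^2}{1 - \kappa_j k}\big]$.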
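Equation (3.56) is obvious from (3.18) and $x_I > x_{\mathbb{N}}$ for $I \ne \mathbb{N}$. In order to prove (3.57) we first note that $p_I(k)$ as given by (3.20) is equal to zero if $k = \kappa_l$ and $l \in I$. Equation (3.20) also implies that $a_{I\cup\{l\}} \ge \tilde{c}_1(l)\, a_I$ with some constant $\tilde{c}_1(l) > 0$ independent of $I$. Similarly, $|p_I(\kappa_l)| \le \tilde{c}_2(l)$ with another constant $\tilde{c}_2(l)$. Finally, observing $x_{I\cup\{l\}} = x_I \kappa_l^2$, we get
$$\begin{aligned}
\Big| \frac{{\det}_1[1_M + D_M(\kappa_l, n)]}{{\det}_1[1_M + C_M(n-1)]} \Big| &= \Big( \sum_{K\in\mathcal{P}_M} a_K x_K^{n-1} \Big)^{-1} \Big| \sum_{I\in\mathcal{P}_M} a_I\, p_I(\kappa_l)\, x_I^{n-1} \Big| \\
&\le \Big( \sum_{K\in\mathcal{P}_M,\ l\notin K} a_{K\cup\{l\}}\, x_{K\cup\{l\}}^{n-1} \Big)^{-1} \Big| \sum_{I\in\mathcal{P}_M,\ l\notin I} a_I\, p_I(\kappa_l)\, x_I^{n-1} \Big| \\
&\le \text{const.}(l)\, \kappa_l^{-2n+2}.
\end{aligned} \tag{3.61}$$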
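In order to prove (3.58), choose $N_\varepsilon \in \mathbb{N}$ such that
$$\Big| \Big( \prod_{j=1}^{M} \kappa_j^2 \Big) - x_{\mathbb{N}} \Big| < \frac{\varepsilon}{4} \quad \text{for all } M > N_\varepsilon.$$
By (3.18),
$$\frac{{\det}_1[1_M + C_M(n)]}{{\det}_1[1_M + C_M(n-1)]} - x_{\mathbb{N}} = \Big( \sum_{K\in\mathcal{P}_M} a_K x_K^{n-1} \Big)^{-1} \sum_{I\in\mathcal{P}_M} a_I x_I^{n-1} (x_I - x_{\mathbb{N}}).$$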
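Next we split $\mathcal{P}_M$ into two disjoint parts, $\mathcal{P}_M = \mathcal{P}_{M,1,\varepsilon} \cup \mathcal{P}_{M,2,\varepsilon}$,
$$\mathcal{P}_{M,1,\varepsilon} = \{ I \in \mathcal{P}_M \mid (x_I - x_{\mathbb{N}}) > \tfrac{\varepsilon}{2} \}, \qquad \mathcal{P}_{M,2,\varepsilon} = \{ I \in \mathcal{P}_M \mid (x_I - x_{\mathbb{N}}) \le \tfrac{\varepsilon}{2} \}, \tag{3.62}$$
and estimate
$$\begin{aligned}
\Big( \sum_{K\in\mathcal{P}_M} a_K x_K^{n-1} \Big)^{-1} \sum_{I\in\mathcal{P}_M} a_I x_I^{n-1} (x_I - x_{\mathbb{N}}) &\le \frac{\varepsilon}{2} + \Big( \sum_{K\in\mathcal{P}_{M,2,\varepsilon}} a_K x_K^{n-1} \Big)^{-1} \sum_{I\in\mathcal{P}_{M,1,\varepsilon}} a_I x_I^{n-1} \\
&\le \frac{\varepsilon}{2} + \Big( \sum_{K\in\mathcal{P}_{M,2,\varepsilon}} a_K \big[ x_K (x_{\mathbb{N}} + \tfrac{\varepsilon}{2})^{-1} \big]^{n-1} \Big)^{-1} \times \sum_{I\in\mathcal{P}_{M,1,\varepsilon}} a_I \big[ x_I (x_{\mathbb{N}} + \tfrac{\varepsilon}{2})^{-1} \big]^{n-1}.
\end{aligned} \tag{3.63}$$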
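For $I \in \mathcal{P}_{M,1,\varepsilon}$, $[x_I (x_{\mathbb{N}} + \frac{\varepsilon}{2})^{-1}] > 1$ and hence $[x_I (x_{\mathbb{N}} + \frac{\varepsilon}{2})^{-1}]^{n-1} \to 0$ as $n \to -\infty$. Similarly, for $K \in \mathcal{P}_{M,2,\varepsilon}$, $[x_K (x_{\mathbb{N}} + \frac{\varepsilon}{2})^{-1}] \le 1$ implies $[x_K (x_{\mathbb{N}} + \frac{\varepsilon}{2})^{-1}]^{n-1} \ge 1$ as $n \to -\infty$. Thus the denominator on the right-hand side of (3.63) is bounded from below by a positive constant independently of $M$ since $\mathcal{P}_{M+1,2,\varepsilon} \supseteq \mathcal{P}_{M,2,\varepsilon}$. For the numerator in (3.63) one infers
$$\sum_{I\in\mathcal{P}_{M,1,\varepsilon}} a_I \big[ x_I (x_{\mathbb{N}} + \tfrac{\varepsilon}{2})^{-1} \big]^{n-1} \le \sum_{I\in\mathcal{P}_{\infty,1,\varepsilon}} a_I \big[ x_I (x_{\mathbb{N}} + \tfrac{\varepsilon}{2})^{-1} \big]^{n-1} \underset{n\to-\infty}{\longrightarrow} 0 \tag{3.64}$$
by the Weierstrass test since $\sum_{I\in\mathcal{P}_\infty} a_I = {\det}_1[1_\infty + C_\infty(0)] < \infty$ by Lemma 3.4. This proves (3.58). Since $x_I - x_{\mathbb{N}} \le \delta$ implies $|p_I(k) - p_{\mathbb{N}}(k)| \le \text{const.}(k)\,\delta$, (3.59) is proven along the same lines.

Lemma 3.12. Assume (H.3.2), $M \in \mathbb{N} \cup \{\infty\}$, $j \in \mathbb{N}$, $j \le M$. Then for all $n \in \mathbb{Z}$,
$$|\psi_{M,j}(n)| \le \text{const.}(j)\, |\kappa_j|^{|n|}, \tag{3.65}$$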
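with the constant being independent of $M$ and $n$.

Proof. This is obvious from (3.33) and (3.35) (respectively (2.22) and (2.27) for finite $M$) and Lemma 3.3 (respectively Lemma 3.11) for $n \to +\infty$ (respectively $n \to -\infty$).

This then allows us to prove

Lemma 3.13. Assume (H.3.2). Then for all $M \in \mathbb{N} \cup \{\infty\}$,
$$\Big\| a_M - \frac12 \Big\|_1 \le \text{const.}, \qquad \|b_M\|_1 \le \text{const.}, \tag{3.66}$$
with constants being independent of $M$. Moreover, for all $\varepsilon > 0$ there is an $n_\varepsilon \in \mathbb{N}$ such that for all $M \in \mathbb{N} \cup \{\infty\}$,
$$\Big\| a_M - \frac12 \Big\|_{\ell^1((-\infty,-n_\varepsilon])} < \varepsilon, \qquad \|b_M\|_{\ell^1((-\infty,-n_\varepsilon])} < \varepsilon, \tag{3.67}$$
$$\Big\| a_M - \frac12 \Big\|_{\ell^1([n_\varepsilon,\infty))} < \varepsilon, \qquad \|b_M\|_{\ell^1([n_\varepsilon,\infty))} < \varepsilon. \tag{3.68}$$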
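Proof. In order to estimate $a_M$ we note that $a_M(n) > \frac12$ by (3.29) and
$$\frac{{\det}_1[1_M + C_M(n)]}{{\det}_1[1_M + C_M(n+1)]} > 1.$$
Thus
$$\begin{aligned}
\sum_{n=n_1}^{n_2} \Big| a_M(n) - \frac12 \Big| &\le \sum_{n=n_1}^{n_2} \Big( a_M(n) - \frac12 \Big) \Big\{ \frac{{\det}_1[1_M + C_M(n)]}{{\det}_1[1_M + C_M(n+1)]} \Big\}^{1/2} \\
&= \frac12 \sum_{n=n_1}^{n_2} \Big[ \Big\{ \frac{{\det}_1[1_M + C_M(n-1)]}{{\det}_1[1_M + C_M(n)]} \Big\}^{1/2} - \Big\{ \frac{{\det}_1[1_M + C_M(n)]}{{\det}_1[1_M + C_M(n+1)]} \Big\}^{1/2} \Big] \\
&= \frac12 \Big[ \Big\{ \frac{{\det}_1[1_M + C_M(n_1-1)]}{{\det}_1[1_M + C_M(n_1)]} \Big\}^{1/2} - \Big\{ \frac{{\det}_1[1_M + C_M(n_2)]}{{\det}_1[1_M + C_M(n_2+1)]} \Big\}^{1/2} \Big], \quad n_1, n_2 \in \mathbb{Z}.
\end{aligned} \tag{3.69}$$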
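By Lemma 3.3,
$$\frac{{\det}_1[1_M + C_M(n)]}{{\det}_1[1_M + C_M(n+1)]} \underset{n\to+\infty}{\longrightarrow} 1$$
uniformly in $M$ and by Lemma 3.11
$$x_{\mathbb{N}}^{-1} = \prod_{j\in\mathbb{N}} \kappa_j^{-2} > \frac{{\det}_1[1_M + C_M(n)]}{{\det}_1[1_M + C_M(n+1)]} > (x_{\mathbb{N}} + \varepsilon)^{-1}$$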
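uniformly in $M > N_\varepsilon$ for some $N_\varepsilon$, which proves the first part of (3.66) and both statements in (3.67) for all $M$ sufficiently large. But for any fixed finite $M$, (3.67) clearly holds as well. In order to estimate the norm of $b_M$ we use the trace formula (2.18), respectively (3.54),
$$\begin{aligned}
\sum_{n=n_1}^{n_2} |b_M(n)| &\le \frac12 \sum_{n=n_1}^{n_2} \sum_{j=1}^{\infty} \big( |\kappa_j^{-1}| - |\kappa_j| \big)\, \psi_{M,j}(n)^2 \\
&\le \frac12 (\kappa_0^{-1} - \kappa_0) \sum_{n=n_1}^{n_2} \sum_{j=1}^{\tilde{N}} \psi_{M,j}(n)^2 + \frac12 \sum_{j=\tilde{N}+1}^{M} \big( |\kappa_j^{-1}| - |\kappa_j| \big),
\end{aligned} \tag{3.70}$$
$n_1, n_2 \in \mathbb{Z}$, $\tilde{N} \in \mathbb{N}$. The first inequality in (3.70) together with $\|\psi_{M,j}\|_2 \le 1$ (cf. (3.51)) proves the second part of (3.66). The last sum in (3.70) can be made arbitrarily small by choosing $\tilde{N}$ large enough, thus leaving just a finite sum over the $\ell^2([n_1,n_2])$-norm of $\psi_{M,j}(n)$. Together with Lemma 3.12 this completes the proof.

Theorem 3.14. Assume (H.3.2). Then for all $j \in \mathbb{N}$
$$\|\psi_{N,j} - \psi_{\infty,j}\|_2 \underset{N\to\infty}{\longrightarrow} 0, \qquad \|\psi_{\infty,j}\|_2 = 1, \tag{3.71}$$
$$\|a_N - a_\infty\|_1 \underset{N\to\infty}{\longrightarrow} 0, \qquad \|b_N - b_\infty\|_1 \underset{N\to\infty}{\longrightarrow} 0, \tag{3.72}$$
$$\sum_{n\in\mathbb{Z}} b_\infty(n) = -\frac12 \sum_{j\in\mathbb{N}} (\kappa_j - \kappa_j^{-1}). \tag{3.73}$$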
Proof. Equation (3.71) follows from
$$\|\psi_{N,j} - \psi_{\infty,j}\|_2^2 \le \sum_{n=-\infty}^{a-1} \big( \psi_{N,j}(n)^2 + \psi_{\infty,j}(n)^2 \big) + \|\psi_{N,j} - \psi_{\infty,j}\|^2_{\ell^2([a,\infty))}, \tag{3.74}$$
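Theorem 3.8, and Lemma 3.12. In order to prove (3.72) we use
$$\begin{aligned}
\|a_N - a_\infty\|_1 &\le \sum_{n=-\infty}^{n_1} \Big( \Big| a_N(n) - \frac12 \Big| + \Big| a_\infty(n) - \frac12 \Big| \Big) + \sum_{n=n_1+1}^{n_2} |a_N(n) - a_\infty(n)| \\
&\quad + \sum_{n=n_2+1}^{\infty} \Big( \Big| a_N(n) - \frac12 \Big| + \Big| a_\infty(n) - \frac12 \Big| \Big), \quad n_1, n_2 \in \mathbb{Z},
\end{aligned} \tag{3.75}$$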
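the pointwise convergence (Theorem 3.5), and Lemma 3.13. $\|b_N - b_\infty\|_1$ can be estimated analogously. Equation (3.73) follows from (3.54) and (3.71).

This yields

Theorem 3.15. Assume (H.3.2). Then $H_N$ converges to $H_\infty$ in trace norm resolvent sense,
$$\big\| (H_N - z)^{-1} - (H_\infty - z)^{-1} \big\|_1 \underset{N\to\infty}{\longrightarrow} 0, \quad z \in \mathbb{C}\setminus\mathbb{R}. \tag{3.76}$$

Proof. This follows immediately from
$$\big\| (H_N - z)^{-1} - (H_\infty - z)^{-1} \big\|_1 \le \big\| (H_N - z)^{-1} \big\|\, \|H_N - H_\infty\|_1\, \big\| (H_\infty - z)^{-1} \big\| \tag{3.77}$$
$$\le \big\| (H_N - z)^{-1} \big\|\, \big\| (H_\infty - z)^{-1} \big\|\, \{ 2 \|a_N - a_\infty\|_1 + \|b_N - b_\infty\|_1 \} \underset{N\to\infty}{\longrightarrow} 0. \tag{3.78}$$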
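(We emphasize that $H_N$ and $H_\infty$ are not trace class separately, but their difference $H_N - H_\infty$ is.)

Lemma 3.11 also implies

Lemma 3.16. Assume (H.3.2) and $k \in \mathbb{C}\setminus\{\pm 1\}$, $|k| = 1$, $n \in \mathbb{Z}$. Then
$$f_{\infty,+}(k,n) = \begin{cases} k^n, & n \to +\infty, \\ T_\infty(k)^{-1} k^n, & n \to -\infty, \end{cases} \qquad \psi_{\infty,-}(k,n) = \begin{cases} k^{-n}, & n \to +\infty, \\ T_\infty(k)\, k^{-n}, & n \to -\infty \end{cases} \tag{3.79}$$
with transmission coefficient
$$T_\infty(k) = \prod_{j=1}^{\infty} \operatorname{sgn}(\kappa_j)\, \frac{1 - k\kappa_j}{\kappa_j - k} = \prod_{j=1}^{\infty} |\kappa_j|\, \frac{\kappa_j^{-1} - k}{\kappa_j - k}. \tag{3.80}$$
Moreover, $H_\infty$ is reflectionless, i.e.,
$$R^{r,l}_\infty(k) = 0. \tag{3.81}$$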
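The product defining $T_\infty(k)$ can be examined on a finite truncation. The sketch below is only an illustration with hypothetical sample values in `KAPPAS`; it compares the $\operatorname{sgn}(\kappa_j)$ form of the factors in (3.80) with the $|\kappa_j|$ form that appears in (4.25), and evaluates the modulus on the unit circle, where each factor has modulus one for real $\kappa_j$.

```python
import cmath

KAPPAS = [0.5, -0.3, 0.2]   # hypothetical sample values, 0 < |kappa_j| < 1


def t_sgn_form(k, kappas):
    """Finite product of sgn(kappa_j) (1 - k kappa_j) / (kappa_j - k), cf. (3.80)."""
    out = 1.0 + 0.0j
    for kap in kappas:
        sgn = 1.0 if kap > 0 else -1.0
        out *= sgn * (1.0 - k * kap) / (kap - k)
    return out


def t_abs_form(k, kappas):
    """Finite product of |kappa_j| (1/kappa_j - k) / (kappa_j - k), cf. (4.25)."""
    out = 1.0 + 0.0j
    for kap in kappas:
        out *= abs(kap) * (1.0 / kap - k) / (kap - k)
    return out


K_UNIT = cmath.exp(0.7j)    # a point on the unit circle, different from +1 and -1

if __name__ == "__main__":
    print(t_sgn_form(K_UNIT, KAPPAS), t_abs_form(K_UNIT, KAPPAS))
    print(abs(t_sgn_form(K_UNIT, KAPPAS)))
```

The two forms agree factor by factor because $\operatorname{sgn}(\kappa_j)(1 - k\kappa_j) = |\kappa_j|(\kappa_j^{-1} - k)$.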
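Proof. It suffices to combine Lemma 3.3 (for $n \to +\infty$), Lemma 3.11 (for $n \to -\infty$), and Lemma 3.6.

4. Spectral Properties of $H_\infty$

This section describes our principal results concerning spectral properties of the limit operator $H_\infty$. As in Sect. 3 we distinguish two cases governed by hypotheses (H.3.1) and (H.3.2), respectively. We start by assuming (H.3.1).

Theorem 4.1. Assume (H.3.1). Then $H_\infty$ defined in $\ell^2(\mathbb{Z})$ by
$$H_\infty = a_\infty S^+ + S^- a_\infty + b_\infty, \qquad D(H_\infty) = \ell^2(\mathbb{Z}) \tag{4.1}$$
is self-adjoint, and
$$\sigma_{ess}(H_\infty) = [-1,1] \cup \big\{ \tfrac12 (\kappa_j + \kappa_j^{-1}) \big\}'_{j\in\mathbb{N}} \tag{4.2}$$
(here $\Sigma'$ denotes the derived set of $\Sigma \subset \mathbb{R}$, i.e., the set of accumulation points of $\Sigma$),
$$\sigma_{ac}(H_\infty) = [-1,1], \tag{4.3}$$
$$\{ \sigma_p(H_\infty) \cup \sigma_{sc}(H_\infty) \} \cap (-1,1) = \emptyset, \tag{4.4}$$
$$\big\{ \tfrac12 (\kappa_j + \kappa_j^{-1}) \big\}_{j\in\mathbb{N}} \subseteq \sigma_p(H_\infty) \subseteq \overline{\big\{ \tfrac12 (\kappa_j + \kappa_j^{-1}) \big\}_{j\in\mathbb{N}}}. \tag{4.5}$$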
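The spectral multiplicity of $H_\infty$ on $(-1,1)$ is two while $\sigma(H_\infty)\setminus[-1,1]$ has multiplicity one. In addition,
$$H_\infty \psi_{\infty,j} = \frac12 (\kappa_j + \kappa_j^{-1})\, \psi_{\infty,j}, \qquad 0 \not\equiv \psi_{\infty,j} \in \ell^2(\mathbb{Z}), \quad j \in \mathbb{N} \tag{4.6}$$
and
$$H_\infty f_{\infty,+}(k^{\pm 1}) = \frac12 (k + k^{-1})\, f_{\infty,+}(k^{\pm 1}), \qquad k \in \mathbb{C}\setminus\{\kappa_j^{\pm 1}\}_{j\in\mathbb{N}} \tag{4.7}$$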
in the weak sense. If $\{\frac12 (\kappa_j + \kappa_j^{-1})\}_{j\in\mathbb{N}}$ is a discrete subset of $(-\infty,-1) \cup (1,\infty)$ (i.e., if $\pm 1$ are its only limit points), then
$$\sigma_{sc}(H_\infty) = \emptyset, \tag{4.8}$$
$$\sigma(H_\infty) \cap [(-\infty,-1) \cup (1,\infty)] = \sigma_d(H_\infty) = \big\{ \tfrac12 (\kappa_j + \kappa_j^{-1}) \big\}_{j\in\mathbb{N}}. \tag{4.9}$$
More generally, if $\{\kappa_j\}'_{j\in\mathbb{N}}$ is countable, then (4.8) holds.

Proof. We shall use Weyl m-function techniques to prove Theorem 4.1. In the notation of [13], the Weyl m-functions associated with $H_\infty$ are given by
$$m_{\infty,\pm}(z) = -f_{\infty,+}(k^{\pm 1}, 1)\, [a_\infty(0)\, f_{\infty,+}(k^{\pm 1}, 0)]^{-1}, \qquad z = \frac12 (k + k^{-1}),\ |k| < 1. \tag{4.10}$$
($m_{\infty,\pm}(z)$ are unique since $H_\infty$ is in the limit point case at $\pm\infty$ (i.e., self-adjoint) by Theorem 3.5.) Equation (4.10) is obtained from the corresponding Weyl m-functions $m_{N,\pm}(z)$ of $H_N$ by
$$m_{\infty,\pm}(z) = \lim_{N\to\infty} m_{N,\pm}(z), \qquad z \in \mathbb{C}\setminus\mathbb{R} \tag{4.11}$$
with uniform convergence for $z$ in compact subsets of $\mathbb{C}\setminus\mathbb{R}$. Equation (4.11) is clear for $m_{\infty,+}(z)$ since $f_{\infty,+}(k,\cdot), f_{N,+}(k,\cdot) \in \ell^2((a,\infty))$, $a \in \mathbb{Z}$. In order to obtain the corresponding result (4.11) for $m_{\infty,-}(z)$ one utilizes $\psi_{N,-}(k,\cdot) = f_{N,+}(k^{-1},\cdot) \in \ell^2((-\infty,a))$, $N \in \mathbb{N}$, $a \in \mathbb{Z}$ and the fact that
$$\psi_{N,-} = \text{const.}(z)\, (H^D_{N,-} - z)^{-1} \delta_{n,1} \in \ell^2((-\infty,a)), \qquad a \in \mathbb{Z} \tag{4.12}$$
($H^D_{N,-}$ the operator $H_N$ in $\ell^2((-\infty,1))$ with a Dirichlet boundary condition at $n = 1$) together with strong resolvent convergence of $H_N$ to $H_\infty$ (cf. Theorem 3.5). Combining (2.19), (3.33) and (4.11) yields the fundamental property
$$m_{\infty,+}(\lambda + i0) = \overline{m_{\infty,-}(\lambda + i0)}, \qquad \lambda \in (-1,1), \tag{4.13}$$
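where we used the obvious notation $m_\pm(\lambda + i0) = \lim_{\varepsilon\downarrow 0} m_\pm(\lambda + i\varepsilon)$ in connection with the branch cut $k = z + (z^2 - 1)^{1/2}$, $|k| \le 1$ approaching $|k| = 1$ nontangentially from inside the unit $k$-disk. The corresponding Weyl $M$-matrix $M_\infty(z)$ and (self-adjoint and right-continuous) spectral matrix $\rho_\infty(\lambda)$ of $H_\infty$ then read
$$M_\infty(z) = \frac{1}{a_\infty^2(0)\, [m_{\infty,+}(z) - m_{\infty,-}(z)]} \times \begin{pmatrix} 1 & -a_\infty(0)\, m_{\infty,+}(z) \\ -a_\infty(0)\, m_{\infty,+}(z) & a_\infty^2(0)\, m_{\infty,+}(z)\, m_{\infty,-}(z) \end{pmatrix}, \qquad z \in \mathbb{C}\setminus\sigma(H_\infty), \tag{4.14}$$
$$\rho_{\infty,p,q}(\lambda) - \rho_{\infty,p,q}(\mu) = -\frac{1}{\pi} \lim_{\delta\downarrow 0} \lim_{\varepsilon\downarrow 0} \int_{\mu+\delta}^{\lambda+\delta} d\nu\, \mathrm{Im}[M_{\infty,p,q}(\nu + i\varepsilon)], \tag{4.15}$$
$\lambda, \mu \in \mathbb{R}$, $1 \le p, q \le 2$. One computes
$$\begin{aligned}
m_{\infty,+}(\lambda + i0) - m_{\infty,-}(\lambda + i0) &= \frac{W(f_{\infty,+}(k), f_{\infty,+}(k^{-1}))(0)}{a_\infty(0)^2\, f_{\infty,+}(k,0)\, f_{\infty,+}(k^{-1},0)} \\
&= \lim_{n\to+\infty} \frac{W(f_{\infty,+}(k), f_{\infty,+}(k^{-1}))(n)}{a_\infty(0)^2\, f_{\infty,+}(k,0)\, f_{\infty,+}(k^{-1},0)} \\
&= \frac12\, \frac{k - k^{-1}}{a_\infty(0)^2\, f_{\infty,+}(k,0)\, f_{\infty,+}(k^{-1},0)} \ne 0, \qquad \lambda = \frac12 (k + k^{-1}) \in (-1,1),
\end{aligned} \tag{4.16}$$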
where $W(f,g)(n) = a_\infty(n)[f(n)g(n+1) - f(n+1)g(n)]$ denotes the (modified) Wronskian of $f$ and $g$. Given these preliminaries we can now follow the strategy of proof in Theorem 5.9 in [11]. (4.6) and (4.7) follow from (3.51), (3.53) and the strong resolvent convergence of $H_N$ to $H_\infty$. Applying Lemma 5.8 of [11] yields that
$$\lim_{\varepsilon\downarrow 0} m_{\infty,-}(\lambda + i\varepsilon) \text{ exists and is real-valued for a.e. } \lambda \in \mathbb{R}\setminus[-1,1]. \tag{4.17}$$
Since $m_{\infty,+}(z)$ also shares property (4.17), one concludes that
$$\sigma_{ac}(H_\infty) \cap [(-\infty,-1) \cup (1,\infty)] = \emptyset. \tag{4.18}$$
Moreover, (4.13), (4.15), and (4.16) show that $\rho_\infty(\lambda)$ has rank two on $(-1,1)$ implying
$$(-1,1) \subseteq \sigma_{ac}(H_\infty) \tag{4.19}$$
and spectral multiplicity two of $H_\infty$ in $(-1,1)$. Together with (4.6) and the strong resolvent convergence of $H_N$ to $H_\infty$ this proves (4.2)–(4.5). (4.8) and (4.9) are just special cases and multiplicity one of $\sigma(H_\infty)\setminus[-1,1]$ follows from simplicity of $\sigma_p(H_\infty)$ (since $H_\infty$ is in the limit point case at $\pm\infty$) and of $\sigma_{sc}(H_\infty)$, if any (cf. [14], [15]).

The analogous result in the scattering case governed by hypothesis (H.3.2) then reads as follows.

Theorem 4.2. Assume (H.3.2). Then
$$\sigma_p(H_\infty) \cap (-1,1) = \sigma_{sc}(H_\infty) = \emptyset, \tag{4.20}$$
$$\sigma_{ess}(H_\infty) = \sigma_{ac}(H_\infty) = [-1,1], \qquad \sigma_d(H_\infty) = \big\{ \tfrac12 (\kappa_j + \kappa_j^{-1}) \big\}_{j\in\mathbb{N}}. \tag{4.21}$$
$H_\infty$ has spectral multiplicity two on $(-1,1)$, and $\lambda_j = \frac12 (\kappa_j + \kappa_j^{-1})$, $j \in \mathbb{N}$ are simple eigenvalues of $H_\infty$ with (normalized) eigenfunctions $\psi_{\infty,j}$,
$$H_\infty \psi_{\infty,j} = \lambda_j \psi_{\infty,j}, \qquad \|\psi_{\infty,j}\|_2 = 1, \quad j \in \mathbb{N}. \tag{4.22}$$
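The weak solutions $f_{\infty,+}(k,n)$, $\psi_{\infty,-}(k,n) = f_{\infty,+}(k^{-1},n)$ of $H_\infty$ satisfy
$$H_\infty f_{\infty,+}(k^{\pm 1}) = \frac12 (k + k^{-1})\, f_{\infty,+}(k^{\pm 1}), \qquad k \in \mathbb{C}\setminus\{+1,-1\},\ |k| \le 1 \tag{4.23}$$
and
$$f_{\infty,+}(k,n) = \begin{cases} k^n, & n \to +\infty, \\ T_\infty(k)^{-1} k^n, & n \to -\infty, \end{cases} \qquad \psi_{\infty,-}(k,n) = \begin{cases} k^{-n}, & n \to +\infty, \\ T_\infty(k)\, k^{-n}, & n \to -\infty, \end{cases} \qquad k \in \mathbb{C}\setminus\{+1,-1\},\ |k| \le 1, \tag{4.24}$$
where the transmission coefficient $T_\infty(k)$ of $H_\infty$ holomorphically extends to
$$T_\infty(k) = \prod_{j=1}^{\infty} |\kappa_j|\, \frac{\kappa_j^{-1} - k}{\kappa_j - k}, \qquad k \in \mathbb{C}\setminus\{\kappa_j\}_{j\in\mathbb{N}}. \tag{4.25}$$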
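The wave operators
$$\Omega_\pm(H_M, H_0) = \operatorname*{s-lim}_{t\to\pm\infty} e^{itH_M} e^{-itH_0}, \qquad M \in \mathbb{N} \cup \{\infty\} \tag{4.26}$$
exist in $\ell^2(\mathbb{Z})$ and are strongly asymptotically complete, i.e.,
$$\mathrm{Ran}\, \Omega_\pm(H_M, H_0) = \mathcal{H}_{ac}(H_M) = \mathrm{Ran}\, E_{H_M}((0,\infty)), \qquad M \in \mathbb{N} \cup \{\infty\}. \tag{4.27}$$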
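(Here $\mathcal{H}_{ac}(\cdot)$, $E_H(\cdot)$ denote the absolutely continuous spectral subspace and the family of spectral projections of a self-adjoint operator $H$.) The scattering operators in $\ell^2(\mathbb{Z})$,
$$S(H_M, H_0) = \Omega_+(H_M, H_0)^* \Omega_-(H_M, H_0), \qquad M \in \mathbb{N} \cup \{\infty\} \tag{4.28}$$
are unitary, and $\Omega_\pm(H_N, H_0)$ and $S(H_N, H_0)$ are strongly continuous as $N \to \infty$,
$$\operatorname*{s-lim}_{N\to\infty} \Omega_\pm(H_N, H_0) = \Omega_\pm(H_\infty, H_0), \qquad \operatorname*{s-lim}_{N\to\infty} S(H_N, H_0) = S(H_\infty, H_0). \tag{4.29}$$
In addition, the fibers $S_N(\lambda)$, $\lambda \in (-1,1)$ in $\mathbb{C}^2$ of $S(H_N, H_0)$ converge pointwise to the fibers $S_\infty(\lambda)$ of $S(H_\infty, H_0)$,
$$\lim_{N\to\infty} \|S_N(\lambda) - S_\infty(\lambda)\| = 0, \qquad \lambda \in (-1,1). \tag{4.30}$$
In particular, $H_\infty$ is reflectionless, i.e.,
$$S_\infty(\lambda) = \begin{pmatrix} T_\infty(\lambda) & 0 \\ 0 & T_\infty(\lambda) \end{pmatrix}, \qquad R^{r,l}(\lambda) = 0, \qquad \lambda \in (-1,1). \tag{4.31}$$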
Proof. The spectral properties (4.20)–(4.23) are a special case of Theorem 4.1; (4.24) and (4.25) have been discussed in Lemma 3.16. Trace norm resolvent convergence of $H_N$ to $H_\infty$ (cf. Theorem 3.15) then yields continuity of the wave and scattering operators along the lines of [2], [21], p. 27, 387.

Remark 4.1. Under the additional condition $\{c_j^2 (1 \mp \kappa_j)^{-2}\}_{j\in\mathbb{N}} \in \ell^1(\mathbb{N})$ or $\{c_j (1 \mp \kappa_j)^{-1} (1 - |\kappa_j|)^{-1/2}\}_{j\in\mathbb{N}} \in \ell^1(\mathbb{N})$ one can prove that there is no square summable solution $\psi_\infty$ of $H_\infty \psi_\infty = \pm\psi_\infty$, i.e., either one of these hypotheses guarantees that $\pm 1 \notin \sigma_p(H_\infty)$.

5. A new Class of Toda Soliton Solutions

In our final section we return to the Toda lattice equations (2.29) and construct a new class of Toda soliton solutions obtained from $N$-soliton solutions as $N \to \infty$. Assuming (H.3.1) in this section, we introduce the substitution (cf. (2.30))
$$c_j \to c_j(t) = c_j e^{\beta_j t}, \qquad \beta_j = \frac12 (\kappa_j - \kappa_j^{-1}), \quad (j,t) \in \mathbb{N} \times \mathbb{R}, \tag{5.1}$$
which renders all quantities in Sects. 3 and 4 $t$-dependent. In obvious notation we denote the resulting objects by $a_\infty(t,n)$, $b_\infty(t,n)$, $H_\infty(t)$, $\psi_{\infty,j}(t,n)$, $C_\infty(t,n)$, $C_N(t,n)$, etc. Our main result then reads as follows.

Theorem 5.1. Assume (H.3.1). Then $H_\infty(t)$ is unitarily equivalent to $H_\infty(0)$ for all $t \in \mathbb{R}$ and $(a_\infty(t,n), b_\infty(t,n))$ satisfies the Toda lattice equations (2.29), i.e.,
$$\frac{d}{dt} a_\infty(t,n) = a_\infty(t,n)\, [b_\infty(t,n) - b_\infty(t,n+1)], \qquad \frac{d}{dt} b_\infty(t,n) = 2\, [a_\infty(t,n-1)^2 - a_\infty(t,n)^2], \qquad (t,n) \in \mathbb{R} \times \mathbb{Z}. \tag{5.2}$$
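Proof. The standard Lax representation of (2.29) proves unitary equivalence of $H_\infty(t)$ and $H_\infty(0)$, $t \in \mathbb{R}$. (5.2) will follow from the results of Sects. 2 and 3 if we can prove that
$$\lim_{N\to\infty} \frac{d}{dt} \det[1_N + C_N(t,n)] = \frac{d}{dt} {\det}_1[1_\infty + C_\infty(t,n)], \qquad (t,n) \in \mathbb{R} \times \mathbb{Z}. \tag{5.3}$$
Indeed, (5.3) together with the obvious convergence of
$$\det[1_N + C_N(t,n)] \underset{N\to\infty}{\longrightarrow} {\det}_1[1_\infty + C_\infty(t,n)], \qquad (t,n) \in \mathbb{R} \times \mathbb{Z}$$
then yields
$$\frac{d^m}{dt^m} a_N(t,n) \underset{N\to\infty}{\longrightarrow} \frac{d^m}{dt^m} a_\infty(t,n), \tag{5.4}$$
$$\frac{d^m}{dt^m} [1_N + C_N(t,n)]^{-1} \underset{N\to\infty}{\overset{s}{\longrightarrow}} \frac{d^m}{dt^m} [1_\infty + C_\infty(t,n)]^{-1}, \tag{5.5}$$
$$\Big\| \frac{d^m}{dt^m} \Psi_N(t,n) - \frac{d^m}{dt^m} \Psi_\infty(t,n) \Big\|_{\ell^2(\mathbb{N})} \underset{N\to\infty}{\longrightarrow} 0, \tag{5.6}$$
$$\frac{d^m}{dt^m} b_N(t,n) \underset{N\to\infty}{\longrightarrow} \frac{d^m}{dt^m} b_\infty(t,n) \tag{5.7}$$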
for all $(t,n) \in \mathbb{R} \times \mathbb{Z}$, $m = 0, 1$ and hence (5.2) using the fact that $(a_N(t,n), b_N(t,n))$ satisfy (2.29). In order to prove (5.3) one expands
$$\frac{d}{dt} \det[1_N + C_N(t,n)] = \sum_{j=1}^{N} \det\big[ \widetilde{(1_N + C_N(t,n))}^{(j)} \big] = \det[1_N + C_N(t,n)]\, \mathrm{Tr}\{ [1_N + C_N(t,n)]^{-1} [B_N C_N(t,n) + C_N(t,n) B_N] \}, \tag{5.8}$$
where $\widetilde{(1_N + C_N(t,n))}^{(j)}$ results by replacing the $j$-th column by its derivative and
$$B_N = \{\beta_j \delta_{j,l}\}_{j,l=1}^{N}, \tag{5.9}$$
$$\|B_N\| \le \frac12 (\kappa_0^{-1} - \kappa_0). \tag{5.10}$$
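Using $\|[1_N + C_N(t,n)]^{-1}\| \le 1$ one infers from (5.8),
$$\Big| \frac{d}{dt} \det[1_N + C_N(t,n)] \Big| \le (\kappa_0^{-1} - \kappa_0)\, \det[1_N + C_N(t,n)]\, \|C_N(t,n)\|_1. \tag{5.11}$$
In particular, (5.11) extends to $N = \infty$ and
$$\begin{aligned}
&\Big| \frac{d}{dt} \det[1_N + C_N(t,n)] - \frac{d}{dt} {\det}_1[1_\infty + C_\infty(t,n)] \Big| \\
&\quad \le \det[1_N + C_N(t,n)] \sum_{j=1}^{\infty} \Big| \Big( \Big\{ \begin{pmatrix} [1_N + C_N(t,n)]^{-1} & 0 \\ 0 & 0 \end{pmatrix} - [1_\infty + C_\infty(t,n)]^{-1} \Big\} \phi^{(j)}(n) \Big)_j \Big| \\
&\qquad + \big| \det[1_N + C_N(t,n)] - {\det}_1[1_\infty + C_\infty(t,n)] \big| \times \mathrm{Tr}\{ [1_\infty + C_\infty(t,n)]^{-1} [B_\infty C_\infty(t,n) + C_\infty(t,n) B_\infty] \},
\end{aligned} \tag{5.12}$$
where
$$\phi^{(j)}(n) = \Big( \Big\{ (\beta_j + \beta_l)\, c_j(t) c_l(t)\, \frac{(\kappa_j \kappa_l)^{n+1}}{1 - \kappa_j \kappa_l} \Big\}_{l=1}^{\infty} \Big)^T \in \ell^2(\mathbb{N}), \qquad j \in \mathbb{N}.$$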
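While the second term in (5.12) clearly tends to zero as $N \to \infty$ the first can be estimated by
$$\tilde{c}(t,n) \Big\{ \sum_{j=1}^{N_1} \Big| \Big( \Big\{ \begin{pmatrix} [1_N + C_N(t,n)]^{-1} & 0 \\ 0 & 0 \end{pmatrix} - [1_\infty + C_\infty(t,n)]^{-1} \Big\} \phi^{(j)}(n) \Big)_j \Big| + \sum_{j=N_1+1}^{\infty} \Big| \Big( \Big\{ \begin{pmatrix} [1_N + C_N(t,n)]^{-1} & 0 \\ 0 & 0 \end{pmatrix} - [1_\infty + C_\infty(t,n)]^{-1} \Big\} \phi^{(j)}(n) \Big)_j \Big| \Big\} \tag{5.13}$$
for some constant $\tilde{c}(t,n) > 0$ since $\det[1_N + C_N(t,n)]$ is bounded with respect to $N$. Next, choosing $N_1 \in \mathbb{N}$ sufficiently large, the second term in (5.13) can be made arbitrarily small. Since for a given $N_1$ the first term in (5.13) converges to zero as $N \to \infty$ by the strong convergence of
$$\begin{pmatrix} [1_N + C_N(t,n)]^{-1} & 0 \\ 0 & 0 \end{pmatrix} \underset{N\to\infty}{\overset{s}{\longrightarrow}} [1_\infty + C_\infty(t,n)]^{-1},$$
the expression in (5.13) and hence in (5.12) tends to zero as $N \to \infty$. Since the whole $t$-dependence comes from the constants $c_j(t) = c_j e^{\beta_j t}$, and $e^{-\frac12 (\kappa_0^{-1} - \kappa_0) s}\, c_j(t) \le c_j(t+s) = c_j(t)\, e^{\beta_j s} \le e^{\frac12 (\kappa_0^{-1} - \kappa_0) s}\, c_j(t)$, the sum $\sum_{j\in\mathbb{N}} \frac{c_j(t)^2}{1 - \kappa_j^2}$ converges uniformly with respect to $t$ for $t$ in compact intervals. Thus $\frac{d}{dt} \det[1_N + C_N(t,n)]$ converges locally uniformly in $t$ and hence (5.3) follows.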
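The trace form of the derivative in (5.8) is an instance of Jacobi's formula and can be compared with a finite-difference derivative. The sketch below does this for $N = 2$ only; `KAPPAS` and `CS` are hypothetical sample values, and the time dependence is inserted through $c_j(t) = c_j e^{\beta_j t}$ as in (5.1), so that the $t$-derivative of the entry $C_{jl}$ is $(\beta_j + \beta_l) C_{jl}$, which is the $(j,l)$ entry of $B_N C_N + C_N B_N$.

```python
import math

KAPPAS = [0.5, -0.3]                             # hypothetical sample values
CS = [1.0, 0.7]                                  # hypothetical constants c_j
BETAS = [0.5 * (k - 1.0 / k) for k in KAPPAS]    # beta_j as in (5.1)


def c_matrix(t, n):
    """C_N(t, n) for N = 2 with c_j(t) = c_j exp(beta_j t)."""
    cs_t = [c * math.exp(b * t) for c, b in zip(CS, BETAS)]
    return [[cj * cl * (kj * kl) ** (n + 1) / (1.0 - kj * kl)
             for kl, cl in zip(KAPPAS, cs_t)]
            for kj, cj in zip(KAPPAS, cs_t)]


def det_one_plus_c(t, n):
    """det[1_N + C_N(t, n)] for N = 2."""
    C = c_matrix(t, n)
    return (1.0 + C[0][0]) * (1.0 + C[1][1]) - C[0][1] * C[1][0]


def trace_side(t, n):
    """det[1 + C] Tr{[1 + C]^(-1) [B C + C B]}, the right-hand side of (5.8)."""
    C = c_matrix(t, n)
    m = [[1.0 + C[0][0], C[0][1]], [C[1][0], 1.0 + C[1][1]]]
    # Adjugate of the 2x2 matrix, i.e. det * inverse.
    adj = [[m[1][1], -m[0][1]], [-m[1][0], m[0][0]]]
    # (B C + C B)_{jl} = (beta_j + beta_l) C_{jl}
    d = [[(BETAS[j] + BETAS[l]) * C[j][l] for l in range(2)] for j in range(2)]
    return sum(adj[j][l] * d[l][j] for j in range(2) for l in range(2))


def finite_difference(t, n, h=1e-6):
    """Central difference approximation of d/dt det[1_N + C_N(t, n)]."""
    return (det_one_plus_c(t + h, n) - det_one_plus_c(t - h, n)) / (2.0 * h)


if __name__ == "__main__":
    for n in (-1, 0, 1):
        print(n, trace_side(0.3, n), finite_difference(0.3, n))
```

The two columns printed for each $n$ should agree up to the discretization error of the central difference.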
Since Theorem 5.1 trivially extends to all equations of the Toda hierarchy we omit further details at this point. Moreover, utilizing the commutation formalism developed in [10] (see also [3]), all results of this paper can be transferred to the modified Toda lattice or Kac–van Moerbeke lattice in a straightforward manner. Acknowledgement. We are indebted to Gerald Teschl for numerous discussions on this topic. F.G. would like to thank Helge Holden and the Department of Mathematical Sciences, NTH, University of Trondheim, Norway for the extraordinary hospitality extended to him during the final steps of this work at a month long stay in the summer of 1995. Financial support by the Norwegian Research Council is gratefully acknowledged.
References
1. Bauhardt, W. and Pöppe, C.: The Fredholm determinant method for discrete integrable evolution equations. Lett. Math. Phys. 13, 167–178 (1987)
2. Brüning, E. and Gesztesy, F.: Continuity of wave and scattering operators with respect to interactions. J. Math. Phys. 24, 1516–1528 (1983)
3. Bulla, W., Gesztesy, F., Holden, H. and Teschl, G.: Algebro-geometric quasi-periodic finite-gap solutions of the Toda and Kac–van Moerbeke hierarchies. Memoirs Amer. Math. Soc., to appear
4. Degasperis, A. and Shabat, A.: Construction of reflectionless potentials with infinite discrete spectrum. Theoret. Math. Phys. 100, 970–984 (1994)
5. Deift, P., Li, L. C. and Tomei, C.: Toda flows with infinitely many variables. J. Funct. Anal. 64, 358–402 (1985)
6. Eilenberger, G.: Solitons. Berlin: Springer, 1983
7. Faddeev, L. D. and Takhtajan, L. A.: Hamiltonian Methods in the Theory of Solitons. Berlin: Springer, 1987
8. Flaschka, H.: On the Toda lattice II. Progr. Theoret. Phys. 51, 703–716 (1974)
9. Gesztesy, F. and Holden, H.: Trace formulas and conservation laws for nonlinear evolution equations. Rev. Math. Phys. 6, 51–95 (1994)
10. Gesztesy, F., Holden, H., Simon, B. and Zhao, Z.: On the Toda and Kac–van Moerbeke systems. Trans. Amer. Math. Soc. 339, 849–868 (1993)
11. Gesztesy, F., Karwowski, W. and Zhao, Z.: Limits of soliton solutions. Duke Math. J. 68, 101–150 (1992)
12. Gesztesy, F., Karwowski, W. and Zhao, Z.: New types of soliton solutions. Bull. Amer. Math. Soc. 27, 266–272 (1992)
13. Gesztesy, F. and Teschl, G.: Commutation methods for Jacobi operators. J. Diff. Eqs. 128, 252–299 (1996)
14. Kac, I.: On the multiplicity of the spectrum of a second-order differential operator. Sov. Math. Dokl. 3, 1035–1039 (1962)
15. Kac, I.: On the multiplicity of the spectrum of a second-order differential operator. Izv. Akad. Nauk. SSSR 27, 1081–1112 (1963) [in Russian]
16. Lundina, D. S.: Compactness of sets of reflectionless potentials. Teor. Funktsiĭ Funktsional. Anal. i Prilozhen. 44, 57–66 (1985) [in Russian]
17. Lundina, D. S. and Marchenko, V. A.: Limits of the reflectionless Dirac operator. Adv. Sov. Math. 19, 1–25 (1994)
18. Marchenko, V. A.: The Cauchy problem for the KdV equation with non-decreasing initial data. In: What is Integrability?, V. E. Zakharov (ed.), Berlin: Springer, 1991, pp. 273–318
19. Novokshenov, V. Yu.: Reflectionless potentials and soliton series of the KdV equation. Theoret. Math. Phys. 93, 1279–1291 (1992)
20. Polya, G. and Szegő, G.: Problems and Theorems in Analysis, Volume II. Berlin: Springer, 1976
21. Reed, M. and Simon, B.: Methods of Modern Mathematical Physics, Volume III, Scattering Theory. New York: Academic Press, 1979
22. Shabat, A.: The infinite-dimensional dressing dynamical system. Inverse Problems 8, 303–308 (1992)
23. Simon, B.: Trace Ideals and Their Application. Cambridge: Cambridge University Press, 1979
24. Toda, M.: Theory of Nonlinear Lattices. 2nd enl. Ed., Berlin: Springer, 1989
25. Venakides, S., Deift, P. and Oba, R.: The Toda shock problem. Commun. Pure Appl. Math. 44, 1171–1242 (1991)

Communicated by M. Jimbo
Commun. Math. Phys. 184, 51–63 (1997)
Communications in Mathematical Physics © Springer-Verlag 1997

Regular Weyl-Systems and Smooth Structures on Heisenberg Groups

Günther Hörmann
Institut für Mathematik, Universität Wien, Wien, Austria. Email: [email protected]

Received: 10 May 1996 / Accepted: 30 July 1996
Abstract: We study representation theory of the Weyl relations for infinitely many degrees of freedom. Differentiability of regular representations along rays in the parameter space E suggests considering smooth structures on E. Switching from representations of CCR to group representations of the associated Heisenberg group over E, we develop a framework for smooth representations of the Heisenberg group as an infinite dimensional Lie group. After careful inspection and translation of the necessary differential geometric input for Kirillov's orbit method we are able to construct a large class of smooth representations. These reproduce the Schrödinger representation if E is finite dimensional.
1. Remarks on the CCR Algebra

In 1972 Slawny ([24]) published a proof of the existence of a unique minimal $C^*$-algebra $\mathcal{W}$ generated by unitary elements $W(x)$ subject to the Weyl relations (or CCR in Weyl form)
$$W(x)W(y) = b(x,y)\, W(x+y), \tag{1}$$
where $W$ is a map $E \to \mathcal{W}$ and $E$ an abelian group with non-degenerate bicharacter $b: E \times E \to \mathbb{T}$ ($\mathbb{T}$ denoting the complex numbers of modulus 1). Moreover $\mathcal{W}$ is universal with respect to unitary representations of the Weyl relations in the sense that each such algebra is given by composition of $W$ with a $*$-representation of $\mathcal{W}$. (Four years earlier Manuceau had used a different set-up to construct an isomorphic object for the case where $E$ is a symplectic vector space and the results were subsequently extended and sharpened for presymplectic spaces ([15, 16]).) In typical applications to the description of Bosonic quantum systems, $(E, b)$ is given as Hilbert space (we use the notation (H)-space, and similarly (F)-space in the case of Fréchet spaces henceforth) with
$b(x,y) = e^{i\,\mathrm{Im}\langle x|y\rangle}$ defining the bicharacter. Or, slightly more generally, $E$ is some (real) function space with a (weak) symplectic form $\sigma: E \times E \to \mathbb{R}$ replacing the role of $\mathrm{Im}\langle\,\cdot\,|\,\cdot\,\rangle$. A well written introduction and formulation of these results inside this framework can be found for example in lecture notes of Petz (cf. [19], Chaps. 1 and 2). However, the more general set-up of Slawny can be justified by at least three observations:

(a) A pure $C^*$-algebraic point of view (i.e. without reference to continuity properties of representations with respect to $E$) of CCR necessarily reflects (only) an underlying discrete structure on $E$: the existence of the tracial state $\tau(W(x)) = \delta_{x0}$ on $\mathcal{W}$ implies that $\|W(x) - W(y)\| \ge \sqrt{2}$ if $x \ne y$ (cf. [24, 19]).

(b) Several applications in the context of solid state physics and quantum ergodic theory involved a version of Weyl relations over a parameter space $E$ of integers (usually together with an additional deformation constant, cf. [2, 1]).

(c) As Slawny has pointed out one can also bring the Fermionic CAR into Weyl form by constructing $E$ as a certain set of indices with the symmetric set difference as a group operation ([24], 3.10).

In this context it might be of interest to show that in this general setting there exists another method of defining $\mathcal{W}$ as a subalgebra of the group $C^*$-algebra of the corresponding Heisenberg group over $(E, b)$. We sketch the construction after the following

Definition 1. Let $(E, b)$ be an abelian group with non-degenerate bicharacter $b$, i.e. $b: E \times E \to \mathbb{T}$ is a character when restricted to one component and satisfies the following condition:
$$\forall x \in E: \quad \big( b(x,y)\, b(y,x)^{-1} = 1 \ \ \forall y \in E \big) \ \Rightarrow \ x = 0.$$
The Heisenberg group $H(E, b)$ is the set $E \times \mathbb{T}$ endowed with the multiplication $(x, \lambda) \circ (y, \mu) = (x + y, \lambda\mu \cdot b(x,y))$ and identity element $(0, 1)$.

Since $b(x,0) = b(0,x) = 1$ and $\overline{b(x,x)} = b(-x,x) = b(x,-x)$ for all $x \in E$ (use the character property in each component for $x + 0$ and $x - x$) we have the inversion formula $(x, \lambda)^{-1} = (-x, \bar{\lambda}\, b(x,x))$.

Now equip $H = H(E, b)$ with the product topology $\tau_0$ of the discrete topology on $E$ and the usual one on $\mathbb{T}$. Then $(H, \tau_0)$ is a locally compact topological group with Haar measure being the product of the counting measure on $E$ and the Lebesgue measure on $\mathbb{T}$. Subsequent measure theoretic terms are understood to refer to this measure. Note that the Haar measure is both left and right invariant in this case. Moreover $H$ is amenable and hence the group $C^*$-algebra $C^*(H)$ can be realized in the regular representation acting on $L^2(H)$ by convolution (cf. [18], Sect. 7.2 and 7.3). It is generated by the set $C_c(H, \mathbb{C})$ of complex valued continuous functions with compact support under multiplication (stemming from convolution on the group)
$$(F * G)(x, \lambda) = \sum_{y\in E} \int_{\mathbb{T}} F(y, \mu)\, G((y, \mu)^{-1} (x, \lambda))\, d\mu, \qquad F, G \in C_c(H, \mathbb{C})$$
and involution
$$F^*(x, \lambda) = \overline{F(-x, \bar{\lambda}\, b(x,x))}.$$
Consider the subset $\mathcal{W}_0 = C(E) \otimes \{c\}$, where $C(E) = \{f: E \to \mathbb{C} \mid \mathrm{supp}(f) \text{ is finite}\}$ and $c: \mathbb{T} \to \mathbb{T}$ denotes complex conjugation on $\mathbb{T}$. Then for elements of the form $F = f \otimes c$ and $G = g \otimes c$ we have the following equation:
$$(f \otimes c) * (g \otimes c)(x, \lambda) = \Big( \sum_{y\in E} f(y)\, g(x - y)\, \underbrace{\overline{b(-y,x)\, b(y,y)}}_{\overline{b(-y,\,x-y)}} \Big)\, c(\lambda) =: ((f \star g) \otimes c)(x, \lambda).$$
In particular for the base vectors $(\delta_z)_{z\in E}$ in $C(E)$ we derive
$$(\delta_{z_1} \otimes c) * (\delta_{z_2} \otimes c) = (\delta_{z_1} \star \delta_{z_2}) \otimes c = \overline{b(-z_1, z_2)}\, (\delta_{z_1 + z_2} \otimes c) = b(z_1, z_2)\, \delta_{z_1 + z_2} \otimes c$$
by direct insertion and using the simple (bi)character property $\overline{b(-x,y)} = b(x,y)$. Therefore by setting $W(x) := \delta_x \otimes c \in \mathcal{W}_0$ and completion of $\mathcal{W}_0$ in the $C^*$-norm of $C^*(H)$ we get an isomorphic realization of the CCR algebra over $(E, b)$ by Slawny's theorem ([24], Th. 3.7). We may state the following

Proposition 1. The (unique) CCR algebra over the abelian group $E$ with non-degenerate bicharacter $b$ is isomorphic to the $C^*$-subalgebra of the group $C^*$-algebra $C^*(H)$ which is generated by the functions $(\delta_x \otimes c)_{x\in E}$.

Remark 1. The above construction is in the spirit of Folland's book ([4]) where the corresponding finite dimensional setting (and far reaching further results) are presented in the context of harmonic analysis. Kastler and Mebkhout have given similar constructions of the CCR algebra in terms of convolution algebras of measures on (pre)symplectic vector spaces. I am indebted to P. Michor for a helpful discussion on the above construction ([8]).

2. Regular Representations and Smooth Structure

By construction, representations of the Weyl relations, i.e. continuous $*$-homomorphisms $\alpha$ of $\mathcal{W}(E, b)$ into a set of bounded operators on a (H)-space, and group representations $\pi$ of the Heisenberg group $H(E, b)$ with one-dimensional center, i.e. factor representations, are in one-to-one correspondence by $\pi(x, \lambda) = \lambda\, \alpha(W(x))$. Therefore it seems natural to try to relate continuity conditions on Weyl representations to obstructions for the adequate topological structure of the Heisenberg group. The discrete structure on the abelian group $(E, b)$ suggested by the algebraic constructions of Sect. 1 is not appropriate for applications to quantum systems where one usually wants to implement the Heisenberg uncertainty principle in the form of the canonical commutator relations.
These relations are derived for the generators of one-parameter subgroups in $\mathcal{W}(E, b)$ resp. $H(E, b)$ if $E$ is actually a vector space. The necessary and sufficient condition for the existence of such generators in the form of densely defined self-adjoint operators with common dense domain is the following regularity condition on a representation $\alpha : \mathcal{W}(E, b) \to B(\mathcal{H})$:
$$\forall x \in E : \ t \mapsto \alpha(W(tx)) \ \text{is strongly continuous} \ \mathbb{R} \to U(\mathcal{H}). \tag{R}$$
Here strongly continuous means that $\forall \xi \in \mathcal{H}$: $\lim_{t \to 0} \| \alpha(W(tx))\xi - \xi \| = 0$. As I.E. Segal pointed out ([23], II.1) in 1958, this condition suggests that $E$ carries a topology induced by all finite-dimensional subspaces, which coincides with the finest locally convex topology on $E$.
G. H¨ormann
In view of the main examples and for simplicity, from now on we consider the case where
• $E$ is a vector space over $\mathbb{R}$ (esp. any complex vector space) and
• $b$ is given by a weak symplectic form $\sigma : E \times E \to \mathbb{R}$, i.e. $\sigma$ is bilinear, antisymmetric, and $\check\sigma : E \to E^*$, $\check\sigma(x) = \sigma(x, \cdot)$ is injective, according to the formula $b(x, y) = e^{i\sigma(x, y)}$.
We call $(E, \sigma)$ a weak symplectic space, and symplectic space in the case where $\check\sigma$ is an isomorphism (we use the terms of [3]).

Remark 2. For $E = \mathbb{C}^{(\mathbb{N})}$ with $\sigma$ weak symplectic one can actually prove equivalence of regularity and continuity of the corresponding group homomorphism $H(E) \to U(\mathcal{H})$ ([8], I.2). (Observe that this requires a non-trivial argument since, by non-linearity of the group homomorphism, continuity does not follow directly from the definition of the finest locally convex topology.)

We use the notation $H(E, b_\sigma)$ to indicate the assumptions specified above. For $H(E, b_\sigma)$ to be a topological group with respect to the product topology of the finest locally convex topology $\varphi$ on $E$ and the usual one on $\mathbb{T}$ it is necessary and sufficient that $\sigma : E \times E \to \mathbb{R}$ is continuous. Since on $(E, \varphi)$ each linear mapping is continuous, $\sigma$ is at least separately continuous. Moreover, since the bounded subsets of $E$, resp. $E \times E$, are exactly those which are bounded and contained in some finite-dimensional subspace, each bilinear form is bounded and (B, B)-hypocontinuous (cf. [21], II.6 and III.5).

Example 1.
(i) Let $E$ be a complex (H)-space considered as a real vector space with the symplectic form $\sigma(x, y) = \mathrm{Im}\langle x \mid y \rangle$. (Then $\sigma$ is continuous with respect to the $\|\cdot\|_2$-topology on $E$ and therefore also continuous on $(E, \varphi)$.) For example, in Fock representations this is the so-called one-particle (H)-space. More generally, each continuously embedded vector space which is symplectic as a subspace of $E$ provides an example (this includes function spaces like $\mathcal{S}(\Omega)$ or $\mathcal{D}(\Omega)$ as subspaces of $L^2(\Omega)$, $\Omega$ an open subset of $\mathbb{R}^n$).
(ii) Let $E = \mathbb{R}^{(\mathbb{N})} \times \mathbb{R}^{(\mathbb{N})} \cong \mathbb{C}^{(\mathbb{N})}$, where $\mathbb{R}^{(\mathbb{N})} = \{f : \mathbb{N} \to \mathbb{R} \mid \mathrm{supp}(f) \text{ is finite}\}$. On $\mathbb{R}^{(\mathbb{N})}$ each bilinear form is continuous, since $\mathbb{R}^{(\mathbb{N})} = (\mathbb{R}^{\mathbb{N}})'_\beta$ (by [21], IV.9.9, Cor. 3 and IV.5, Example 3). For example, if $(\cdot \mid \cdot)$ denotes the standard scalar product on $\ell^2(\mathbb{N}, \mathbb{R}) \supseteq \mathbb{R}^{(\mathbb{N})}$ then $\sigma((x_1, x_2), (y_1, y_2)) = (x_1 \mid y_2) - (x_2 \mid y_1)$ defines a weak symplectic form on $E$. This example corresponds to the setting of Gårding and Wightman in their classification program of CCR in 1954 (cf. [6]).
Apart from mere topological considerations, a different point of view is suggested by the following observation. If $\alpha$ is a regular Weyl-representation and $\mathcal{H}_\infty$ is the common dense domain (of smooth vectors) of the generators for the one-parameter subgroups $\alpha(W(tx))$, then for each $\xi \in \mathcal{H}_\infty$ and $x \in E$ the mapping $t \mapsto \alpha(W(tx))\xi$ is differentiable at 0 by assumption. By the group property it is therefore differentiable at each value $t \in \mathbb{R}$. If $\phi(x)$ denotes the generator of $(\alpha(W(tx)))_{t \in \mathbb{R}}$ then $\forall \xi \in \mathcal{H}_\infty$ we have $\phi(sx)\xi = s\phi(x)\xi$, and therefore $t \mapsto \alpha(W(tx))\xi$ is actually infinitely often differentiable
Regular Weyl-Systems and Smooth Structures
$\mathbb{R} \to \mathcal{H}$, or in other words: $\alpha \circ W$ maps the “special curves” $c(t) = tx$, $c : \mathbb{R} \to E$, into the smooth curves $\tilde c(t) = \alpha \circ W(tx) \cdot \xi$. Since $c_x(t) = tx$ defines a smooth curve for each locally convex topology on $E$ (in a sense stated precisely in the next definition) we can once again reformulate and state that
$$\forall \xi \in \mathcal{H}_\infty : \ \check\alpha(\xi) : E \to \mathcal{H}_\infty, \ \check\alpha(\xi)(y) = \alpha(W(y))\xi, \ \text{maps the smooth curves } (c_x)_{x \in E} \text{ of } E \text{ to smooth curves into } \mathcal{H}_\infty. \tag{2}$$
If we generalize this statement and consider, instead of the special curves $(c_x)_{x \in E}$, arbitrary smooth curves into $E$, we arrive at the basic definition of a smooth mapping $E \to \mathcal{H}_\infty$ in the sense of the differential calculus of A. Frölicher and A. Kriegl (cf. [5, 13] and [14] for basic definitions and results). The idea of the following is to recognize the Heisenberg group $H(E, b_\sigma)$ in rather general situations as a Lie group, in the sense that group operations are smooth, and then to apply differential geometric methods to its (smooth) representation theory. First we recall the basic definitions and cite some results of Frölicher-Kriegl differential calculus (cf. [13], Chap. I, for more details and proofs). We do not state the most general form of the concepts.

Definition 2.
(i) A map $c : \mathbb{R} \to E$ is called differentiable if $\forall t_0 \in \mathbb{R}$: $\exists \lim_{t \to t_0} \frac{c(t) - c(t_0)}{t - t_0}$ in $E$. In this case $c'(t_0) := \lim_{t \to t_0} \frac{c(t) - c(t_0)}{t - t_0}$ defines a map $c' : \mathbb{R} \to E$. $c$ is called a smooth curve if all iterated derivatives exist. We denote by $C^\infty(\mathbb{R}, E)$ the space of all smooth curves.
(ii) Let $E$, $F$ be locally convex spaces. A map $f : E \to F$ is called smooth if it maps smooth curves into smooth curves, i.e. $\forall c \in C^\infty(\mathbb{R}, E) : f \circ c \in C^\infty(\mathbb{R}, F)$. (Instead of $E$ one can consider more general sets as the domain of $f$.) We use the notation $C^\infty(E, F) = \{f : E \to F \mid f \text{ smooth}\}$.
There exist smooth mappings which are not continuous! (Smoothness is a bornological concept.) In particular, for linear and multilinear mappings we have the following characterization (cf. [13] 1.19 and 3.5).

Proposition 2.
(i) A linear map $l : E \to F$ between locally convex spaces (lcvs) is smooth iff $l$ is bounded.
(ii) A bilinear map $b : E \times F \to G$ of lcvs is smooth iff $b$ is bounded.
One of the main questions at the starting point of the calculus is to provide convenient methods for testing smoothness of curves in order to check smoothness of more general maps. In Rn one can always look at the components of a “time-dependent” vector c(t) and differentiability is equivalent to differentiability of each component function. In arbitrary locally convex vector spaces one hopes to get a large class of examples by transferring this componentwise testing to the testing with arbitrary continuous functionals: Definition 3.
(i) A lcvs $E$ is said to be $c^\infty$-complete (or convenient) if for an arbitrary curve $c : \mathbb{R} \to E$ one has:
$$c \in C^\infty(\mathbb{R}, E) \iff \forall f \in E' : f \circ c \in C^\infty(\mathbb{R}, \mathbb{R}),$$
where $E'$ denotes the topological dual of $E$.
(ii) The $c^\infty$-topology on a lcvs $E$ is the final topology with respect to all smooth curves into $E$, i.e. the finest topology such that all smooth curves are continuous. In general this does not have to define a topological vector space structure on $E$ (cf. [13], I.2 for examples).

We collect some important results in the following theorem (the proofs can be found in [13], 1.22.(3) and (4), 2.9.(1), 2.9.(2)).

Theorem 1.
(i) A lcvs $E$ is $c^\infty$-complete if $E$ is $c^\infty$-closed in some lcvs.
(ii) If $E$ is a metrizable lcvs then its topology is equivalent to the $c^\infty$-topology.
(iii) If a bornological lcvs $E$ is the dual of a (F)-Schwartz space then its topology is equivalent to the $c^\infty$-topology.

Example 2. By the above theorem each (H)-space is $c^\infty$-complete. Moreover every (F)-space, in particular the function spaces $\mathcal{S}$, $\mathcal{E}$, and $\mathrm{Hol}(\Omega)$ (cf. [21], III.8, 3-5), is $c^\infty$-complete by part (ii). Also $\mathbb{R}^{\mathbb{N}}$, the space of all real sequences with the topology of convergence in each component, is $c^\infty$-complete by the same argument. Its strong dual is $\mathbb{R}^{(\mathbb{N})}$ (cf. [21], IV.5, Example 3), which is $c^\infty$-complete by part (iii) of the theorem. In contrast, $\mathbb{R}^{\mathbb{R}}$ and $\mathbb{R}^{(\mathbb{R})}$ are no longer $c^\infty$-complete ([13], 2.24 (iv)) and therefore a lcvs $E$ with finest locally convex topology is not $c^\infty$-complete if its algebraic dimension is uncountable.

Remark 3. This concept reproduces the classical case if $E$ is a finite-dimensional vector space. Furthermore, differentiation is a linear operator on $C^\infty(E, F)$ and the chain rule holds ([13], 1.41).

In view of Observation 2 and the concepts introduced above we now consider the Heisenberg group $H(E, b_\sigma)$ where $E$ is a $c^\infty$-complete lcvs and $\sigma$ is bounded.

Lemma 1. If $E$ is a $c^\infty$-complete locally convex vector space and $\sigma$ is a bounded weak symplectic form then the group operations in $H(E, b_\sigma)$ are smooth. Moreover $H(E, b_\sigma)$ is locally diffeomorphic to the $c^\infty$-complete lcvs $E \times \mathbb{R}$, showing that it is a $c^\infty$-complete locally convex Lie group.

Proof. Smoothness of the group operations follows by boundedness of $\sigma$ from Prop. 2; if $\psi$ is a local diffeomorphism $\mathbb{T} \to \mathbb{R}$ we get a local diffeomorphism $E \times \mathbb{T} \to E \times \mathbb{R}$ simply by forming the product with the identity map $I_E$, which clearly maps smooth curves into smooth curves in both directions.

Remark 4. The automorphism group of $H(E, b_\sigma)$ can be studied completely analogously to the finite-dimensional case since smooth homogeneous mappings are linear by [13], 1.42. The statement would be exactly the same as in [4], Th. 1.22, with obvious transfers to the more general set-up.
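The group operations whose smoothness Lemma 1 asserts can be made concrete in a finite-dimensional model. The sketch below assumes $E = \mathbb{R}^2$ with $\sigma(x, y) = x_1 y_2 - x_2 y_1$ and the multiplication $(x, \lambda)(y, \mu) = (x + y, \lambda\mu\, e^{i\sigma(x, y)})$. This product is inferred from the left-translation formula used later in the proof of Lemma 2, so it should be read as an assumption, as should all function names.

```python
import cmath
import random

def sigma(x, y):
    """Assumed standard symplectic form on R^2."""
    return x[0] * y[1] - x[1] * y[0]

def mul(g, h):
    """Assumed product (x, lam)(y, mu) = (x + y, lam * mu * exp(i sigma(x, y)))."""
    (x, lam), (y, mu) = g, h
    return ((x[0] + y[0], x[1] + y[1]), lam * mu * cmath.exp(1j * sigma(x, y)))

def inv(g):
    """Inverse (x, lam)^(-1) = (-x, conj(lam)), using sigma(x, x) = 0."""
    x, lam = g
    return ((-x[0], -x[1]), lam.conjugate())

def close(g, h, tol=1e-9):
    return (abs(g[0][0] - h[0][0]) < tol and abs(g[0][1] - h[0][1]) < tol
            and abs(g[1] - h[1]) < tol)

def random_element(rng):
    x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    return (x, cmath.exp(1j * rng.uniform(0, 2 * cmath.pi)))

rng = random.Random(0)
g, h, k = (random_element(rng) for _ in range(3))
identity = ((0.0, 0.0), 1 + 0j)

assoc_ok = close(mul(mul(g, h), k), mul(g, mul(h, k)))
inverse_ok = close(mul(g, inv(g)), identity) and close(mul(inv(g), g), identity)

# Conjugation only twists the circle component:
# g h g^(-1) = (y, mu * exp(2 i sigma(x, y))) for g = (x, lam), h = (y, mu).
(x, _), (y, mu) = g, h
conj_expected = (y, mu * cmath.exp(2j * sigma(x, y)))
conj_ok = close(mul(mul(g, h), inv(g)), conj_expected)
```

Both operations are built from the linear structure of $E$, the bilinear map $\sigma$ and the exponential on $\mathbb{T}$, which is why boundedness of $\sigma$ is the only analytic input the lemma needs.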
3. Smooth Representations and Coadjoint Orbits

The space $\mathcal{H}_\infty$ used in Observation 2 is itself a metrizable lcvs and therefore $c^\infty$-complete by Theorem 1, part (ii). This motivates the following

Definition 4. Let $E$ be $c^\infty$-complete and $\sigma$ a bounded weak symplectic form on $E$. Then a smooth representation of the Heisenberg group $H(E, b_\sigma)$ is a group homomorphism $\pi : H(E, b_\sigma) \to GL_b(V)$, where $V$ is some $c^\infty$-complete lcvs and $GL_b(V)$ denotes the set of all bounded invertible linear maps with bounded inverse on $V$, with the property that $\hat\pi : H(E, b_\sigma) \times V \to V$, $\hat\pi((x, \lambda), \xi) = \pi(x, \lambda)\xi$, is smooth.

Remark 5.
1. $c^\infty$-completeness of $E$ and $V$ is assumed from the beginning in order to guarantee appropriate (smooth) behavior in switching from $\pi$ to $\hat\pi$ and vice versa, i.e. $\hat\pi$ is smooth iff $\pi$ is (see proofs later).
2. If $c_1, c_2 : \mathbb{R} \to E$ are smooth curves with $c_1(0) = c_2(0) = 0$, $c_1'(0) = c_2'(0) = x \in E$ we have $\forall \xi \in V$:
$$\frac{d}{dt}\Big|_{t=0} \pi(c_1(t), 1)\xi = d\pi_{(0,1)}(x, 0) \cdot \xi = \frac{d}{dt}\Big|_{t=0} \pi(c_2(t), 1)\xi .$$
Therefore the generators $\phi(x) = -i\, d\pi_{(0,1)}(x, 0)$ of the regular Weyl-representations exist in this setting and are uniquely determined by differentiation along the rays $c(t) = tx$ in $E$. This amounts to the corresponding Lie algebra representation on the domain of smooth vectors in a unitary representation.
3. Clearly, if $V$ is a dense subspace of some (H)-space $\mathcal{H}$ and $\mathrm{im}(\pi) \hookrightarrow U(\mathcal{H})$ the above definition yields a regular Weyl-representation.

Example 3.
(i) If $E$ is finite-dimensional the irreducible regular representations are essentially unique (due to von Neumann's result, [25]) and are equivalent to the Schrödinger representation, which can be realized in $\mathcal{H} = L^2(\mathbb{R}^n)$ with $\mathcal{H}_\infty = \mathcal{S}(\mathbb{R}^n)$. If $c \in C^\infty(\mathbb{R}, E)$ the smoothness of $\hat\pi((c(t), 1), f)$ has to be tested with tempered distributions $\chi \in \mathcal{S}'(\mathbb{R}^n)$: but by definition of the Schrödinger representation as a mixture of modulation and translation operators, for each $f \in \mathcal{S}(\mathbb{R}^n)$ the function $\hat\pi((c(t), 1), f)$ is also smooth in $t$ and yields a smooth function $\pi_f^c : \mathbb{R} \times \mathbb{R}^n \to \mathbb{C}$, $\pi_f^c(t, x) = (\pi((c(t), 1))f)(x)$, i.e. $\pi_f^c \in C^\infty(\mathbb{R} \times \mathbb{R}^n, \mathbb{C})$. If we consider on $C^\infty(\mathbb{R}^n, \mathbb{C}) \supseteq \mathcal{S}(\mathbb{R}^n)$ the topology of uniform convergence on compact sets in each derivative then we have the equivalence (cf. [13], 1.33 or 1.35)
$$\pi_f^c \in C^\infty(\mathbb{R} \times \mathbb{R}^n, \mathbb{C}) \iff \check\pi_f^c \in C^\infty(\mathbb{R}, C^\infty(\mathbb{R}^n, \mathbb{C})), \quad \check\pi_f^c(t)(x) = \pi_f^c(t, x),$$
which shows that $\pi_f^c(t, \cdot)$ is a smooth curve into $C^\infty(\mathbb{R}^n, \mathbb{C})$; since the Schrödinger operators can easily be extended to act on the whole space $C^\infty(\mathbb{R}^n, \mathbb{C})$ we have a smooth representation at least if we leave the surrounding Hilbert space $L^2(\mathbb{R}^n)$ and extend to $C^\infty(\mathbb{R}^n, \mathbb{C})$. This is also the natural representation space emerging from geometric quantization and the orbit method in constructing representations of Lie groups (cf. [11, 10]). We will recover this sort of representation in the next section. By using the fact that a tempered distribution $\chi \in \mathcal{S}'(\mathbb{R}^k)$ can be written as a derivative of a continuous function $F$ (cf. [22], Ch. 7, Th. VI), i.e. $\chi = \partial^\alpha F$ (for some multiindex $\alpha$), one can prove smoothness of $\pi$ directly even for the representation space $\mathcal{S}(\mathbb{R}^n)$.
(ii) As a second prominent example consider the Bosonic Fock representations. For $(E, \sigma)$ a subspace of a real (H)-space this can be realized as the subspace of polynomials over $E$ ($\hat{=}$ symmetric algebra $S(E) = \bigoplus_{n=0}^{\infty} s^n E$, where $s^n E$ denotes the $n$-fold symmetric tensor product in the category of (pre-)Hilbert spaces) in $\mathrm{Hol}(\bar E')$, the space of holomorphic functions on a subspace of the complexification $E_{\mathbb{C}}$ of $E$. For example in [20] this is called the standard representation for the Heisenberg group. In [8] this representation is constructed as a holomorphic induction from orbit theory in spaces of sections of a line bundle for $E = \mathbb{C}^{(\mathbb{N})}$. We will not recall the details here but note that once again we have extended the framework of Hilbert spaces to a surrounding space of smooth/holomorphic functions on $E$ (or subspaces thereof) and got a smooth representation (for proofs cf. [8]).
(iii) Only recently Narnhofer and Thirring pointed out that for relativistic quantum fields, in particular quantum electrodynamics, it seems to be necessary to give up regularity of representations on certain subspaces of $E$ corresponding to the gauge conditions. In [17] they construct Weyl-representations where $E$ can be chosen to be $\mathcal{S}(\mathbb{R}^2)^3 / E_0$ with E0 = {(( + e2 )ϕ1 , ϕ2 , −e 2 ϕ2 ) | ϕ1 , ϕ2 ∈ S(R2 )} and the weak symplectic form $\sigma$ given by integration with the field propagators as kernels. We mention this example in order to show certain limitations on the scope of our approach.

Against the background of Lemma 1 and Definition 4, and with the main examples (i) and (ii) at hand, we start to inspect the possibilities of orbit theory in the general case where $E$ is a $c^\infty$-complete locally convex vector space with $\sigma$ a bounded weak symplectic form. In other words, we study orbit methods for the $c^\infty$-complete locally convex Lie group $H(E, b_\sigma)$.
First we have to investigate the notion of tangent space, because for general infinite-dimensional $c^\infty$-complete locally convex spaces $E$ the two notions of
(a) kinematic tangent vectors: for $a \in E$ let $T_a E$ be the space of all derivatives $c'(0)$ of smooth curves $c : \mathbb{R} \to E$ with $c(0) = a$; $T_a E \cong E$, and
(b) operational tangent vectors: at $a \in E$ consider bounded derivations $\delta : C^\infty_a(E, \mathbb{R}) \to \mathbb{R}$ of function germs at $a$, i.e. $\delta$ linear with $\delta(fg)(a) = \delta(f)(a) \cdot g(a) + f(a) \cdot \delta(g)(a)$, where functions $E \to \mathbb{R}$ are identified if they agree on some neighborhood of $a \in E$; denote by $D_a E$ the space of all operational tangent vectors,
are no longer equivalent, in contrast to finite-dimensional differential geometry. Actually, at this point a second hint comes up indicating that the concept of (H)-space is in some sense too narrow: if $E$ is an infinite-dimensional (H)-space, $T_a E$ is always a proper subspace of $D_a E$ (cf. [13], 2.24 and 2.26). However, if $E$ is $\mathbb{C}^{(\mathbb{N})}$ or a nuclear (F)-space, in particular any of the function spaces $C^\infty(\mathbb{R}^n, \mathbb{R})$, $\mathcal{S}(\mathbb{R}^n)$, $\mathrm{Hol}(\Omega)$, then it can be shown that the two notions coincide (cf. [13], 2.26 and [21], IV.5) and we can compute most of the geometric objects as usual. So from now on let us assume that $E = \mathbb{C}^{(\mathbb{N})}$ or $E$ is a nuclear (F)-space. We collect some immediate consequences in the next statement.

Lemma 2. Let $H(E, b_\sigma)$ be the Heisenberg group for $(E, \sigma)$ as above. Then the Lie algebra $\mathfrak{h}$ is isomorphic to $E \times \mathbb{R}$ with Lie bracket $[(u, r), (v, s)] = (0, 2\sigma(u, v))$. The left invariant vector fields $\xi_{(u,r)}$ ($(u, r) \in \mathfrak{h}$) are given by the formula
$$\xi_{(u,r)}(x, \lambda) = (u, r + \sigma(x, u))$$
and the exponential map is simply $\exp(u, r) = (u, e^{ir})$, $\exp : \mathfrak{h} \to H(E, b_\sigma)$.

Proof. The formula for $\xi_{(u,r)}$ is derived by differentiating the left translations $(x, \lambda) \mapsto (x + u, \lambda e^{ir} e^{i\sigma(u, x)})$ along curves $(tx, e^{its})$ at $t = 0$; then the Lie bracket is computed as a commutator of left invariant fields (where we need the equivalence of operational and kinematic tangent vectors), exactly as in the finite-dimensional situation; and $\exp$ is simply computed by the trivial flow of $\xi_{(u,r)}$.

Now we are ready to compute the orbits of the coadjoint action of $H(E, b_\sigma)$ on the dual $\mathfrak{h}' \cong E' \times \mathbb{R}$ of its Lie algebra $\mathfrak{h} = E \times \mathbb{R}$. The coadjoint representation $\mathrm{Ad}^* : H(E, b_\sigma) \to GL(\mathfrak{h}')$ is defined by $\mathrm{Ad}^*_{(x,\lambda)} f^* = f^* \circ \mathrm{Ad}^{-1}_{(x,\lambda)}$ in terms of the adjoint representation $\mathrm{Ad} : H(E, b_\sigma) \to GL(\mathfrak{h})$, where $\mathrm{Ad}_{(x,\lambda)}(u, r) = (u, r + 2\sigma(x, u))$ is the derivative of the conjugation action $\mathrm{conj}_{(x,\lambda)}(y, \mu) = (x, \lambda)(y, \mu)(x, \lambda)^{-1}$ on $H(E, b_\sigma)$. From now on we simply write $H$ instead of $H(E, b_\sigma)$.

Lemma 3. $\mathrm{Ad}^* : H \to GL(\mathfrak{h}')$ is given by the formula $\mathrm{Ad}^*_{(x,\lambda)}(u^*, a) = (u^* - a\check\sigma(x), a)$ and therefore the orbits $O_{(u^*,a)} = \{\mathrm{Ad}^*_{(x,\lambda)}(u^*, a) \mid (x, \lambda) \in H\}$ of the coadjoint action fall exactly into two classes:
(i) if $a = 0$ then $O_{(u^*,0)} = \{(u^*, 0)\}$,
(ii) if $a \neq 0$ then $O_{(u^*,a)} = (\{u^*\} + \check\sigma(E)) \times \{a\}$.

Proof. The formula for $\mathrm{Ad}^*_{(x,\lambda)}$ is directly computed by action on elements of $\mathfrak{h}$; the classification of orbits is obvious by this formula.

Remark 6. If $E$ is finite-dimensional then $O_{(u^*,a)} = E \times \{a\}$ for $a \neq 0$, which corresponds to the Schrödinger representation with $\hbar \mathrel{\hat{=}} a$ (cf. [12]). For general $E$ we only have $O_{(u^*,a)}$ as affine subspaces in $E'$ ($a \neq 0$).

4. Generalized Schrödinger Representations
The basic idea of Kirillov's orbit method is to choose an arbitrary orbit $O_{(u^*,a)}$ and to look for a maximal (closed) Lie subalgebra $\mathfrak{k} \subseteq \mathfrak{h}$ subordinated to $(u^*, a)$, i.e. such that $(u, r) \mapsto \langle (u^*, a), (u, r) \rangle$ is a Lie algebra homomorphism $\mathfrak{k} \to \mathbb{R}$. For this it is necessary and sufficient that
$$\langle (u^*, a), [(u, r), (v, s)] \rangle = \langle (u^*, a), (0, 2\sigma(u, v)) \rangle = 2a\sigma(u, v) = 0 \quad \forall (u, r), (v, s) \in \mathfrak{k},$$
which in turn corresponds to the choice of a maximal (closed) isotropic subspace $L \subseteq E$ and to put $\mathfrak{k} = L \times \mathbb{R}$. If $K \cong L \times \mathbb{T}$ in $H$ denotes the corresponding closed subgroup then we get a one-dimensional smooth representation $\rho : L \times \mathbb{T} \to \mathbb{T}$ by setting $\rho(x, e^{is}) = e^{i\langle (u^*, a), (x, s) \rangle} = e^{i\langle u^*, x \rangle + isa}$ for $x \in L$, $s \in \mathbb{R}$. (Here we have used the particularly simple form of $\exp : \mathfrak{h} \to H$.) The last step then is to construct the induced representation $\pi = \mathrm{ind}_K^H \rho$, which can be realized in the space $C^\infty(H, \mathbb{C})^K$ of $K$-equivariant smooth functions, i.e. functions $f \in C^\infty(H, \mathbb{C})$ with the property
$$f(hk) = \rho(k^{-1}) f(h) \quad \forall h \in H, \ \forall k \in K$$
(this space can be considered as a space of sections of the associated line bundle over $H/K$ with projection $p : H \to H/K$). The action of $H$ is then given by translation, i.e.
$$(\pi(h)f)(h_1) = f(h^{-1} h_1) \quad \forall h, h_1 \in H .$$
On $C^\infty(H, \mathbb{C}) \supseteq C^\infty(H, \mathbb{C})^K$ we consider the following topology: first equip $C^\infty(\mathbb{R}, \mathbb{C})$ with the topology of uniform convergence on compact sets of each derivative (this is a $c^\infty$-complete topology), then for each $c \in C^\infty(\mathbb{R}, H)$ define the linear map $c^* : C^\infty(H, \mathbb{C}) \to C^\infty(\mathbb{R}, \mathbb{C})$, $c^*(f) = f \circ c$. Now define the locally convex topology on $C^\infty(H, \mathbb{C})$ to be the initial topology with respect to the family $(c^*)_{c \in C^\infty(\mathbb{R}, H)}$; this defines a $c^\infty$-complete topology on $C^\infty(H, \mathbb{C})$ (cf. [13], 1.23, 1.34).
Theorem 2. $\pi = \mathrm{ind}_K^H \rho$ defines a smooth representation.

Proof. We have to show that $\hat\pi : H \times C^\infty(H, \mathbb{C})^K \to C^\infty(H, \mathbb{C})^K$ is smooth; first we note that $C^\infty(H, \mathbb{C})^K$ is a closed subspace of $C^\infty(H, \mathbb{C})$ and therefore $c^\infty$-complete. The following proof is exactly along the lines of a corresponding proposition in ([8], 6.4):

Step 1: $\forall h \in H$: $\pi(h)$ is smooth $C^\infty(H, \mathbb{C})^K \to C^\infty(H, \mathbb{C})^K$. Since $\pi(h)$ is linear it is sufficient to show continuity (even boundedness would suffice). $C^\infty(H, \mathbb{C})^K$ carries an initial topology, so continuity can be checked by composition with smooth curves $c : \mathbb{R} \to H$. If $\lambda_g$ denotes left translation with $g$ on $H$, each map $\varphi \mapsto c^*(\pi(h)\varphi) = c^* \circ \lambda^*_{h^{-1}}(\varphi) = (\lambda_{h^{-1}} \circ c)^* \varphi$ should be continuous $C^\infty(H, \mathbb{C})^K \to C^\infty(\mathbb{R}, \mathbb{C})$. But $(\lambda_{h^{-1}} \circ c)^* \varphi = \varphi \circ (\lambda_{h^{-1}} \circ c)$ and $t \mapsto (\lambda_{h^{-1}} \circ c)(t) = h^{-1} c(t)$ is clearly a smooth curve into $H$; denote this curve by $d$. Now $c^* \circ \pi(h) = d^*$ is continuous by definition of the initial topology. Since $c$ was arbitrary we have proved continuity and hence smoothness of $\pi(h)$. So $\pi : H \to GL(C^\infty(H, \mathbb{C})^K) \subseteq C^\infty(C^\infty(H, \mathbb{C})^K, C^\infty(H, \mathbb{C})^K)$ and smoothness of $\hat\pi$ can be proved using Cartesian closedness, i.e. $\hat\pi$ is smooth if and only if $\pi$ is smooth (cf. [13], 1.35).

Step 2: $\forall \varphi \in C^\infty(H, \mathbb{C})^K$: $h \mapsto \pi(h)\varphi$ is smooth $H \to C^\infty(H, \mathbb{C})^K$ (therefore $C^\infty(H, \mathbb{C})^K$ consists entirely of smooth vectors). $C^\infty(H, \mathbb{C})^K$ is a closed subspace, therefore it is enough to show smoothness of $h \mapsto \pi(h)\varphi$, $H \to C^\infty(H, \mathbb{C})$. But this is, again by Cartesian closedness, equivalent to showing smoothness of $(h, g) \mapsto (\pi(h)\varphi)(g)$, $H \times H \to \mathbb{C}$. Now finally, if $m$ and $\mathrm{inv}$ denote the (smooth) group operations multiplication and inversion in $H$,
$$(\pi(h)\varphi)(g) = \varphi(h^{-1} g) = (\varphi \circ m)(\mathrm{inv}(h), g)$$
is clearly smooth in $(h, g)$.

Step 3: $\pi : H \to GL(C^\infty(H, \mathbb{C})^K)$ is smooth. Since $\pi$ takes values in the closed subspace $\mathrm{End}(C^\infty(H, \mathbb{C})^K)$ of smooth linear mappings it suffices to show that $\mathrm{ev}_\varphi \circ \pi : H \to \mathrm{End}(C^\infty(H, \mathbb{C})^K) \to C^\infty(H, \mathbb{C})^K$ (where $\mathrm{ev}_\varphi \circ \pi(h) = \pi(h)\varphi$) is smooth for all $\varphi \in C^\infty(H, \mathbb{C})^K$ (according to [13], 3.25, “the smooth uniform boundedness principle”). But this requires exactly $h \mapsto \pi(h)\varphi$ to be smooth, i.e. $\varphi$ to be a smooth vector, which was shown in Step 2.

Remark 7.
1. One can work out the same constructions with the complexified Heisenberg group $H_{\mathbb{C}} = E_{\mathbb{C}} \times \mathbb{C} \setminus \{0\}$, replacing the conditions of smoothness by holomorphy. In this way one obtains representations in a space of holomorphic functions on $H_{\mathbb{C}}$ (or subspaces thereof) which in some cases make possible the construction of the Bosonic Fock representation exactly as in the standard representation according to [20], 9.6. The details for the case $E = \mathbb{C}^{(\mathbb{N})}$ are given in [8], II.8. Furthermore, the so-called momentum mapping for the Fock representation is investigated there and it is shown that its image contains the original orbit of the whole construction.
2. The dependence of $\pi$ on the choice of the orbit or the functional $(u^*, a) \in \mathfrak{h}'$ is not explicitly seen in this general form.

Our next purpose is to give a more explicit formula for the action according to $\pi$, and we will thereby change the representation space into an isomorphic one. If the closed maximal isotropic subspace $L \subseteq E$ is topologically complemented by another subspace $\bar L$, i.e. $E = L \oplus \bar L$, then we can identify $E/L$ with $\bar L$ (as locally convex spaces) and a fortiori the quotient group $H/K$ with $\bar L$, which is a commutative normal subgroup of $H$.

Remark 8. Such isotropic subspaces are usually constructed by use of complex structures (cf. [20], 9.5) and in most examples are not difficult to achieve: for example, in a complex (H)-space $E$ one has $E_{\mathbb{R}} \cong E \oplus_{\mathbb{R}} E$ and therefore a natural isotropic decomposition with respect to $\sigma = \mathrm{Im}\langle \cdot \mid \cdot \rangle$.

Theorem 3. If $E = L \oplus \bar L$ is a decomposition into closed maximal isotropic subspaces with respect to $\sigma$ then we have the following equivalent realization of $\pi = \mathrm{ind}_K^H \rho$. There exists an isomorphism $\Psi : C^\infty(H, \mathbb{C})^K \to C^\infty(\bar L, \mathbb{C})$ such that the representation $\theta : H \to GL(C^\infty(\bar L, \mathbb{C}))$ defined by
$$\theta(h) = \Psi \circ \pi(h) \circ \Psi^{-1} \quad \forall h \in H$$
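is given by the formula (using the unique decomposition $x = x_1 \oplus x_2$ with $x_1 \in L$, $x_2 \in \bar L$)
$$(\theta(x, \lambda)f)(\bar z) = \lambda^a e^{i\langle u^*, x_1 \rangle - ia\sigma(x_1, x_2)} e^{2ia\sigma(x_1, \bar z)} f(\bar z - x_2) .$$
We call $\theta$ a generalized Schrödinger representation.

Proof. Define $\Psi : C^\infty(H, \mathbb{C})^K \to C^\infty(\bar L, \mathbb{C})$ simply by restriction to $\bar L$:
$$\Psi(\varphi) = \varphi|_{\{0\} \times \bar L \times \{1\}} \quad \forall \varphi \in C^\infty(H, \mathbb{C})^K .$$
We assert that the inverse is given by the formula
$$\Psi^{-1}(f)(x, \lambda) = \lambda^{-a} e^{-i\langle u^*, x_1 \rangle - ia\sigma(x_1, x_2)} f(x_2) \quad \forall f \in C^\infty(\bar L, \mathbb{C}) .$$
First we show that $\varphi = \Psi^{-1}(f)$ is always $K$-equivariant: let $(y_1, \mu) \in K$, then
\begin{align*}
\varphi((x, \lambda) \circ (y_1, \mu)) &= \varphi((x_1 + y_1) \oplus x_2, \ \lambda\mu e^{i\sigma(x_2, y_1)}) \\
&= \lambda^{-a} \mu^{-a} e^{-ia\sigma(x_2, y_1)} e^{-i\langle u^*, x_1 + y_1 \rangle} e^{-ia\sigma(x_1 + y_1, x_2)} f(x_2) \\
&= \underbrace{\mu^{-a} e^{-i\langle u^*, y_1 \rangle}}_{\rho(-y_1, \bar\mu)} \ \underbrace{\lambda^{-a} e^{-i\langle u^*, x_1 \rangle} e^{-ia\sigma(x_1, x_2)} f(x_2)}_{\varphi(x, \lambda)}
\end{align*}
by definition of $\rho$ and the decomposition $E = L \oplus \bar L$.

Now for $\varphi \in C^\infty(H, \mathbb{C})^K$,
$$\Psi^{-1}(\Psi(\varphi))(x, \lambda) = \lambda^{-a} e^{-i\langle u^*, x_1 \rangle - ia\sigma(x_1, x_2)} \varphi(x_2, 1) = \varphi(x_1 \oplus x_2, \lambda)$$
by $K$-equivariance of $\varphi$, and similarly
$$\Psi(\Psi^{-1}(f))(x_2, 1) = 1^{-a} e^{-i\langle u^*, 0 \rangle - ia\sigma(0, x_2)} f(x_2) = f(x_2)$$
for arbitrary $f \in C^\infty(\bar L, \mathbb{C})$. So we have shown that $\Psi$ is linear and invertible. $\Psi$ and $\Psi^{-1}$ are continuous (hence also smooth by linearity): for $\Psi^{-1}$ this is seen directly by its defining formula and for $\Psi$ it follows by definition of the initial topology on $C^\infty(H, \mathbb{C})^K$ and testing on smooth curves.

The last thing to do is the computation of $\theta(h)f$. We have
\begin{align*}
(\theta(x, \lambda)f)(y_2) &= (\Psi \circ \pi(x, \lambda) \circ \Psi^{-1})f(y_2) \\
&= \pi(x, \lambda)\big(\mu^{-a} e^{-i\langle u^*, y_1 \rangle - ia\sigma(y_1, y_2)} f(y_2)\big)\Big|_{y_1 = 0,\ \mu = 1} \\
&= e^{ia\sigma(x, y)} \lambda^a \mu^{-a} e^{-i\langle u^*, y_1 - x_1 \rangle - ia\sigma(y_1 - x_1, y_2 - x_2)} f(y_2 - x_2)\Big|_{y_1 = 0,\ \mu = 1} \\
&= \lambda^a e^{ia\sigma(x_1, y_2)} e^{i\langle u^*, x_1 \rangle - ia\sigma(-x_1, y_2 - x_2)} f(y_2 - x_2) \\
&= \lambda^a e^{2ia\sigma(x_1, y_2)} e^{i\langle u^*, x_1 \rangle - ia\sigma(x_1, x_2)} f(y_2 - x_2) .
\end{align*}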
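The formula for $\theta$ can be checked numerically in the simplest case. The sketch below assumes $E = \mathbb{R}^2$ with $L = \mathbb{R} \times \{0\}$, $\bar L = \{0\} \times \mathbb{R}$, $\sigma(x, y) = x_1 y_2 - x_2 y_1$, the group law $(x, \lambda)(y, \mu) = (x + y, \lambda\mu\, e^{i\sigma(x, y)})$, an integer value of $a$ and an arbitrary value of $\langle u^*, e_1 \rangle$; these choices and all names are illustrative assumptions. It tests only the homomorphism property $\theta(g)\theta(h) = \theta(gh)$ on sample points.

```python
import cmath
import random

A = 2      # assumed integer value of a, so that lam**A is unambiguous on the circle
U = 0.7    # assumed value of <u*, e_1>; u* enters only through its restriction to L

def sigma(x, y):
    return x[0] * y[1] - x[1] * y[0]

def mul(g, h):
    """Assumed product (x, lam)(y, mu) = (x + y, lam * mu * exp(i sigma(x, y)))."""
    (x, lam), (y, mu) = g, h
    return ((x[0] + y[0], x[1] + y[1]), lam * mu * cmath.exp(1j * sigma(x, y)))

def theta(g, f):
    """(theta(x,lam) f)(z) = lam^a e^{i<u*,x1> - i a s(x1,x2)} e^{2 i a s(x1,z)} f(z - x2)."""
    (p, q), lam = g   # x = x1 + x2 with x1 = (p, 0) in L and x2 = (0, q) in L-bar
    def image(z):
        phase = cmath.exp(1j * U * p - 1j * A * p * q) * cmath.exp(2j * A * p * z)
        return lam ** A * phase * f(z - q)
    return image

def f(z):
    """Arbitrary smooth test function on L-bar, identified with R."""
    return cmath.exp(-z * z) * (1 + 0.5j * z)

rng = random.Random(1)

def random_element():
    x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    return (x, cmath.exp(1j * rng.uniform(0, 2 * cmath.pi)))

g, h = random_element(), random_element()
points = [rng.uniform(-2, 2) for _ in range(20)]
lhs = theta(g, theta(h, f))   # theta(g) theta(h) f
rhs = theta(mul(g, h), f)     # theta(gh) f
max_defect = max(abs(lhs(z) - rhs(z)) for z in points)
```

In this model the cross term $e^{2ia\sigma(x_1, \bar z)}$ is what compensates the cocycle $e^{ia\sigma(x, y)}$ of the group law; dropping it makes the defect visibly nonzero.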
Remark 9.
1. The elements of the Lie algebra act according to the derived representation $\theta' = d\theta_{(0,1)} : \mathfrak{h} \to \mathrm{End}(C^\infty(\bar L, \mathbb{C}))$ and the field operators $-i\theta'(x, 0)$ are bounded operators on $C^\infty(\bar L, \mathbb{C})$ given by
$$(\theta'(x, 0)f)(\bar z) = (i\langle u^*, x_1 \rangle + 2ia\sigma(x_1, \bar z)) f(\bar z) - \langle df(\bar z), x_2 \rangle ,$$
which can be verified by directional derivatives.
2. One can now define a subspace of $C^\infty(\bar L, \mathbb{C})$ which is generated by polynomials $p(\bar z) = \sigma(x_1, \bar z) \cdots \sigma(x_n, \bar z)$ and then develop further the construction of Fock representations. But it is possibly more interesting (and more difficult?) to look for new ways of constructing unitary representations starting from such generalized Schrödinger representations.
3. There are stronger and more detailed results by Gelfand and Vilenkin in the case where the locally convex topology on $E$ is generated by countably many positive-definite Hermitian forms and the representations are unitary and strongly continuous maps $E \times \mathbb{T} \to U(\mathcal{H})$. Their results are stated in terms of measures on the dual space $E'$ and yield similar formulas for the action of the group (cf. [7], IV.5.4). It would be of great interest to study (infinite-dimensional) measure theory on $E'$ in the context of orbit theory for the construction of unitary representations.
References
1. Benatti, F., Narnhofer, H., Sewell, G.L.: A Non-Commutative version of the Arnold Cat Map. Lett. Math. Phys. 21, 157-172 (1991)
2. Bellissard, J.: K-Theory of C*-Algebras in Solid State Physics. Springer LNP 257, Berlin-Heidelberg-New York: Springer-Verlag, 1986, pp. 99-156
3. Choquet-Bruhat, Y., DeWitt-Morette, C., Dillard-Bleick, M.: Analysis, Manifolds, and Physics. New York: Elsevier Sci. Publ., 1982
4. Folland, G.B.: Harmonic Analysis in Phase Space. Princeton, NJ: Princeton University Press, 1989
5. Frölicher, A., Kriegl, A.: Linear Spaces and Differentiation Theory. Pure and Applied Mathematics, Chichester: J. Wiley, 1988
6. Gårding, L., Wightman, A.: Representations of the Commutation Relations. Proc. Nat. Acad. Sci. 40, 622-626 (1954)
7. Gel'fand, I.M., Vilenkin, N.Y.: Generalized Functions, Volume IV. New York-London: Academic Press, 1964
8. Hörmann, G.: Representations of the Infinite Dimensional Heisenberg Group. Dissertation, University of Vienna, 1993
9. Kastler, D., Mebkhout, M.: Revisiting the Mackey-Stone-von Neumann Theorem (The C*-Algebra of a Presymplectic Space). Nucl. Phys. B (Proc. Suppl.) 18 B, 200-211 (1990)
10. Kostant, B.: Quantization and Unitary Representations, Part I. Springer LNM 170, Berlin-Heidelberg-New York: Springer-Verlag, 1970, pp. 87-208
11. Kirillov, A.A.: Elements of the Theory of Representations. Berlin-Heidelberg: Springer-Verlag, 1976
12. Kirillov, A.A.: Unitary Representations of Nilpotent Lie Groups. Russ. Math. Survey 17, 53-104 (1962)
13. Kriegl, A., Michor, P.W.: Foundations of Global Analysis. Book in preparation, version 01/96 (available via http://radon.mat.univie.ac.at/People/michor/michor.html)
14. Kriegl, A., Michor, P.W.: Aspects of the Theory of Infinite Dimensional Manifolds. Diff. Geom. Appl. 1, 159-176 (1991)
15. Manuceau, J.: C*-algebre de relations de commutations. Ann. Inst. Henri Poincaré 8, 139-161 (1968)
16. Manuceau, J., Sirugue, M., Testard, D., Verbeure, A.: The Smallest C*-Algebra for Canonical Commutation Relations. Commun. Math. Phys. 32, 231-243 (1973)
17. Narnhofer, H., Thirring, W.: Covariant QED without Indefinite Metric. UWThPh-1991-63, 1993
18. Pedersen, Gert K.: C*-Algebras and their Automorphism Groups. London: Academic Press, 1979
19. Petz, D.: An Invitation to the Algebra of Canonical Commutation Relations. Leuven: Leuven University Press, 1990
20. Pressley, A., Segal, G.: Loop Groups. New York: Oxford University Press, 1986
21. Schaefer, H.H.: Topological Vector Spaces. Springer GTM 3, fifth printing, Berlin-Heidelberg-New York: Springer-Verlag, 1986
22. Schwartz, L.: Théorie des distributions. Paris: Hermann, 1966
23. Segal, I.E.: Distributions in Hilbert Spaces and Canonical Systems of Operators. Trans. Am. Math. Soc. 88, 12-41 (1958)
24. Slawny, J.: On Factor Representations and the C*-Algebra of Canonical Commutation Relations. Commun. Math. Phys. 24, 151-170 (1972)
25. von Neumann, J.: Die Eindeutigkeit der Schrödingerschen Operatoren. Math. Ann. 104, 570-578 (1931)
26. Woodhouse, N.: Geometric Quantization. Oxford: Oxford University Press, 1980

Acknowledgement. The author wants to thank Michael Kunzinger for his encouragement to reinvestigate some of the results obtained in the doctoral thesis (cf. [8]) and for many helpful remarks on the final versions.
Communicated by H. Araki This article was processed by the author using the LaTEX style file Pljour1 from Springer-Verlag.
Commun. Math Phys. 184, 65 – 93 (1997)
Communications in
Mathematical Physics c Springer-Verlag 1997
Vertex Operator Algebras Associated to Admissible ˆ2 Representations of sl Chongying Dong? , Haisheng Li, Geoffrey Mason?? Department of Mathematics, University of California, Santa Cruz, CA 95064, USA Received: 4 April 1996 / Accepted: 1 August 1996
ˆ 2 are studied from the point Abstract: The Kac-Wakimoto admissible modules for sl of view of vertex operator algebras. It is shown that the vertex operator algebra L(l, 0) associated to irreducible highest weight modules at admissible level l = pq − 2 is not rational if l is not a positive integer. However, a suitable change of the Virasoro algebra makes L(l, 0) a rational vertex operator algebra whose irreducible modules are exactly ˆ 2 and for which the fusion rules are calculated. It is these admissible modules for sl also shown that the q-dimensions with respect to the new Virasoro algebra are modular functions. 1. Introduction Let {e, f, h} be a standard basis for g = sl2 such that [e, f ] = h, [h, e] = 2e, [h, f ] = −2f , gˆ the corresponding affine Lie algebra and L(`, j) the irreducible highest weight gˆ module of level ` with highest weight j. It is well known that the vacuum representation L(`, 0) has a natural vertex operator algebra (or chiral algebra) structure for any ` 6= −2 (cf. [FZ]). If ` is a positive integer, the chiral algebra L(`, 0) of the WZNW models in the context of conformal field theory has been well understood. For example, the fusion rules are obtained by using primary field decompositions (cf. [GW, TK]) or by the Verlinde formula (cf. [K, V]), and n-point functions are calculated [KZ]. In the context of vertex operator algebras, it has been proved (cf. [DL, FL, Li1]) that any Z+ -graded weak L(`, 0)-module is completely reducible and the set of equivalence classes of irreducible L(`, 0)-modules is the set of equivalence classes of standard gˆ modules of level `. Thus L(`, 0) is rational (defined in Sect. 2). (It has been proved ? Supported by NSF grant DMS-9303374 and a +research grant from the Committee on Research, UC Santa Cruz. ?? Supported by NSF grant DMS-9401272 and a research grant from the Committee on Research, UC Santa Cruz.
recently in [DLiM2] that any weak $L(\ell,0)$-module is completely reducible and the set of equivalence classes of irreducible weak $L(\ell,0)$-modules is the set of equivalence classes of standard $\hat{\mathfrak{g}}$-modules of level $\ell$.) The modular invariance of the vector space linearly spanned by the characters $\mathrm{tr}_{L(\ell,j)}\, e^{2\pi i\tau (L(0) - \frac{c_\ell}{24})}$ for all standard modules of level $\ell$ is obtained in [KP] by using explicit character formulas and also follows from a general theorem of Zhu [Z]. The fusion rules are computed in [FZ] by studying certain associative algebras and bimodules associated to $L(\ell,0)$ and its irreducible modules.

If $\ell$ is rational such that $\ell + 2 = \frac{p}{q}$ for some coprime positive integers $p \geq 2$ and $q$, Kac and Wakimoto [KW1–KW2] found finitely many distinguished irreducible representations, called admissible (or modular invariant) representations. In this case the fusion rules among admissible modules have been calculated in the context of conformal field theory (cf. [AY, BF, MW]) by employing various methods, but different methods sometimes give different results. In particular, the Verlinde formula gives negative fusion rules. If $j$ is not an integer, $\mathrm{tr}_{L(\ell,j)}\, e^{2\pi i\tau (L(0) - \frac{c_\ell}{24})}$ does not exist (because the homogeneous subspaces are infinite-dimensional), so the character $\mathrm{tr}_{L(\ell,j)}\, e^{2\pi i\tau (L(0) - \frac{1}{2} z h(0) - \frac{c_\ell}{24})}$ was considered in [KW1–KW2], where $z$ is a positive rational number less than 1. In [KW1], a formula for $\mathrm{tr}_{L(\ell,j)}\, e^{2\pi i\tau (L(0) - \frac{1}{2} z h(0) - \frac{c_\ell}{24})}$ in terms of theta functions was given and a transformation law under $S(\tau, z) = (-\tau^{-1}, -z\tau)$ was found. Later, the transformation law was corrected by adding an extra factor [KW2]. After this correction, it is not clear that the space linearly spanned by all $\mathrm{tr}_{L(\ell,j)}\, e^{2\pi i\tau (L(0) - \frac{1}{2} z h(0) - \frac{c_\ell}{24})}$ for admissible weights $j$ is invariant under the action of the modular group $PSL(2,\mathbb{Z})$, where $S(\tau, z) = (-\tau^{-1}, -z\tau)$.
Our purpose in this paper is to study these admissible representations from the point of view of vertex operator algebras. We show that $L(\ell,0)$ is a $\mathbb{Q}$-graded rational vertex operator algebra under a new Virasoro algebra and that its irreducible modules are exactly the admissible modules for $\hat{\mathfrak{g}}$. We extend Zhu's $A(V)$-theory ([FZ, Z]) to $\mathbb{Q}$-graded vertex operator algebras and apply this theory to $L(\ell,0)$ to calculate the fusion rules. The new Virasoro algebra also gives a natural interpretation of the characters $\mathrm{tr}_{L(\ell,j)}\, e^{2\pi i\tau (L(0) - \frac{1}{2} z h(0) - \frac{c_\ell}{24})}$. We explain these results in detail in the following.

In the first part of the paper, we prove that, among all the highest-weight irreducible $\hat{\mathfrak{g}}$-modules of level $\ell$, the admissible representations of level $\ell$ constitute the set of irreducible $\mathbb{Z}_+$-graded weak $L(\ell,0)$-modules. This is implicit in references such as [AY, BF, FM]. It follows from this result and a complete reducibility theorem of Kac-Wakimoto [KW2] that any weak $L(\ell,0)$-module from category $\mathcal{O}$ is completely reducible. Let $N_+$ be the sum of all positive root spaces of $\hat{\mathfrak{g}}$. Let $\mathcal{E}$ be the category of weak $L(\ell,0)$-modules $W$ on which $N_+$ is locally nilpotent, i.e., for any $u \in W$, there is a positive integer $k$ such that $N_+^k u = 0$. Then we prove that any weak $L(\ell,0)$-module from category $\mathcal{E}$ is completely reducible. Since some admissible weights are not integers if $\ell$ is not integral, Zhu's algebra $A(L(\ell,0))$ [Z] has infinite-dimensional irreducible modules. This implies that $L(\ell,0)$ is not rational and that Zhu's $C_2$-condition (another crucial condition for Zhu's theorem of modular invariance) does not hold either.

In the second part of the paper, we study the vertex operator algebra $L(\ell,0)$ under a new Virasoro algebra. Let $\omega$ be the original Segal-Sugawara Virasoro vector of $L(\ell,0)$. Set $\omega_z = \omega + \frac{1}{2} z h(-2)\mathbf{1} \in L(\ell,0)$, where $z$ is a complex number. Then $\omega_z$ is a Virasoro vector of $L(\ell,0)$ with central charge $c_{\ell,z} = c_\ell - 6\ell z^2$ and $L_z(0) = (\omega_z)_1 = L(0) - \frac{1}{2} z h(0)$.
If we choose $z = 0, \frac{1}{2}$, we obtain the homogeneous grading and the rescaled
principal grading (cf. [K, LW]), respectively. Let $z$ be a positive rational number less than 1. Note that the vertex operator algebra $(L(\ell,0), Y, \mathbf{1}, \omega_z)$ is $\mathbb{Q}$-graded instead of $\mathbb{Z}$-graded. We extend Zhu's $A(V)$-theory of one-to-one correspondence [Z] between the set of equivalence classes of irreducible admissible $V$-modules and the set of equivalence classes of irreducible $A(V)$-modules and Frenkel-Zhu's $A(M)$-theory [FZ] for fusion rules to any $\mathbb{Q}$-graded vertex operator algebra. It follows from our complete reducibility theorem in the first part that any $\mathbb{Q}_+$-graded weak $L(\ell,0)$-module under the new Virasoro vector $\omega_z$ is completely reducible. That is, $(L(\ell,0), \omega_z)$ is rational. By using the Malikov-Feigin-Fuchs singular vector expressions [MFF] and Fuchs' projection formula [F] we find all the fusion rules and prove that $(L(\ell,0), \omega_z)$ satisfies the $C_2$-finiteness condition. Our results on the fusion rules agree with the corresponding results in [BF].

It is natural for us to consider the modified characters $\mathrm{tr}\, e^{2\pi i\tau (L_z(0) - \frac{1}{24} c_{\ell,z})}$, that is, $\mathrm{tr}\, e^{2\pi i\tau (L(0) - \frac{1}{2} z h(0) - \frac{1}{24}(c_\ell - 6\ell z^2))}$. Using the Kac-Wakimoto character formula [KW1] we find that these modified characters are modular functions, so that $c_{\ell,z}$ is the modular anomaly rather than $c_\ell$. (Then the characters $\mathrm{tr}\, e^{2\pi i\tau (L(0) - \frac{1}{2} z h(0) - \frac{1}{24} c_\ell)}$ are obviously not modular functions.) One may ask: Is the space linearly spanned by our new characters invariant under the transformation $S(\tau) = -\tau^{-1}$ with $z$ being fixed? This will be discussed in another paper.

There is some overlap between Sect. 2 of this paper and [AM]. In particular, the result in our Remark 2.21 was obtained independently in [AM] (Theorem 3.5.3). We should also mention that the vertex operator algebras associated to irreducible highest weight representations of certain rational levels for the affine Lie algebra $C_n^{(1)}$ have been studied in [A].

2.
Vertex Operator Algebras $L(\ell,0)$ Associated to $\widehat{sl}_2$

A vertex operator algebra, or briefly a VOA, is a $\mathbb{Z}$-graded vector space $V = \oplus_{n\in\mathbb{Z}} V_n$ satisfying a number of axioms. We refer the reader to [B, FLM and FHL] for the details of the definition. However, we would like to give the definitions of weak modules and modules in detail. A weak $V$-module is a pair $(W, Y_W)$, where $W$ is a vector space and $Y_W(\cdot, z)$ is a linear map from $V$ to $(\mathrm{End}\, W)[[z, z^{-1}]]$ satisfying the following axioms: (1) $Y_W(\mathbf{1}, z) = \mathrm{id}_W$; (2) $Y_W(a, z)u \in W((z))$ for any $a \in V$, $u \in W$; (3) $Y_W(L(-1)a, z) = \frac{d}{dz} Y_W(a, z)$ for $a \in V$; (4) the Jacobi identity:

$$z_0^{-1}\delta\!\left(\frac{z_1 - z_2}{z_0}\right) Y_W(a, z_1) Y_W(b, z_2)u - z_0^{-1}\delta\!\left(\frac{z_2 - z_1}{-z_0}\right) Y_W(b, z_2) Y_W(a, z_1)u = z_2^{-1}\delta\!\left(\frac{z_1 - z_0}{z_2}\right) Y_W(Y(a, z_0)b, z_2)u \tag{2.1}$$

for $a, b \in V$, $u \in W$. A weak $V$-module $(W, Y_W)$ is called a $V$-module if $L(0)$ acts semisimply on $W$ with the decomposition into $L(0)$-eigenspaces $W = \oplus_{h\in\mathbb{C}} W_h$ such that for any $h \in \mathbb{C}$, $\dim W_h < \infty$ and $W_{h+n} = 0$ for $n \in \mathbb{Z}$ sufficiently small.

A $\mathbb{Z}_+$-graded weak $V$-module [FZ] is a weak $V$-module $W$ together with a $\mathbb{Z}_+$-gradation $W = \oplus_{n=0}^{\infty} W(n)$ such that

$$a_m W(n) \subseteq W(k + n - m - 1) \quad \text{for } a \in V_k,\ m, n \in \mathbb{Z}, \tag{2.2}$$
where $W(n) = 0$ by definition for $n < 0$. One may define the notions of “submodule” and “irreducible submodule” accordingly. A VOA $V$ is said to be rational if any $\mathbb{Z}_+$-graded weak $V$-module is a direct sum of irreducible $\mathbb{Z}_+$-graded weak $V$-modules. It was proved
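in [DLiM1] that if $V$ is rational, there are only finitely many irreducible $\mathbb{Z}_+$-graded weak $V$-modules up to equivalence and any irreducible weak $V$-module is a module.

Let $\{e, f, h\}$ be the standard basis for $\mathfrak{g} = sl_2$ with the commutation relations $[e, f] = h$, $[h, e] = 2e$, $[h, f] = -2f$. We fix the normalized Killing form on $\mathfrak{g}$ such that $\langle h, h\rangle = 2$. Let $\tilde{\mathfrak{g}} = \widetilde{sl}_2 = \mathbb{C}[x, x^{-1}] \otimes \mathfrak{g} + \mathbb{C}c$ be the affine Lie algebra and identify $\mathfrak{g}$ with $x^0 \otimes \mathfrak{g}$. Set $a(n) = a \otimes x^n$ for $a \in \mathfrak{g}$ and $n \in \mathbb{Z}$ for convenience. Define subalgebras

$$N_+ = \mathbb{C}e + x\mathbb{C}[x] \otimes \mathfrak{g}, \qquad N_- = \mathbb{C}f + x^{-1}\mathbb{C}[x^{-1}] \otimes \mathfrak{g}, \tag{2.3}$$

$$B = N_+ \oplus \mathbb{C}h \oplus \mathbb{C}c, \qquad P = \mathbb{C}[x] \otimes \mathfrak{g} \oplus \mathbb{C}c. \tag{2.4}$$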
Then $\widetilde{sl}_2 = N_+ \oplus \mathbb{C}h \oplus \mathbb{C}c \oplus N_-$. Let $\hat{\mathfrak{g}} = \tilde{\mathfrak{g}} \oplus \mathbb{C}d$ be the extended affine algebra [K], where $[d, c] = 0$ and $[d, x^n \otimes a] = n(x^n \otimes a)$ for $a \in \mathfrak{g}$, $n \in \mathbb{Z}$. Let $H = \mathbb{C}h \oplus \mathbb{C}c \oplus \mathbb{C}d$ be the Cartan subalgebra of $\hat{\mathfrak{g}}$, let $\alpha_0, \alpha_1$ be the simple roots of $\hat{\mathfrak{g}}$, let $\Gamma_+ = \mathbb{Z}_+\alpha_0 \oplus \mathbb{Z}_+\alpha_1$ and let $\Lambda_0, \Lambda_1$ be the fundamental weights of $\hat{\mathfrak{g}}$ [K]. Let $\bar\rho$ be half of the sum of positive roots of $\mathfrak{g}$ and set $\rho = \bar\rho + 2\Lambda_0$ [K]. For any $\lambda \in H^*$, denote by $M(\lambda)$ (resp. $L(\lambda)$) the Verma (resp. the irreducible highest weight) $\hat{\mathfrak{g}}$-module. When restricted to $\tilde{\mathfrak{g}}$, $L(\lambda)$ is an irreducible $\tilde{\mathfrak{g}}$-module [K]. It is clear that $L(\lambda)$ and $L(\mu)$ are isomorphic $\tilde{\mathfrak{g}}$-modules if and only if $\lambda \in \mu + \mathbb{C}\delta$. As in many references, we use the notation $L(\ell, j)$ for the $\tilde{\mathfrak{g}}$-module $L(\lambda)$, where $\ell = \langle\lambda, c\rangle$ and $j = \langle\lambda, \alpha_1\rangle$. Conversely, let $M$ be a restricted $\tilde{\mathfrak{g}}$-module of level $\ell \neq -2$. Then we extend $M$ to a $\hat{\mathfrak{g}}$-module by letting $d$ act on $M$ as $-L(0)$. In this paper we shall consider any restricted $\tilde{\mathfrak{g}}$-module as a $\hat{\mathfrak{g}}$-module in this way.

For a complex number $\ell$ and a $\mathbb{C}h$-module $U$ which can be regarded as a $B$-module by letting $N_+$ act trivially and $c$ act as $\ell$, let $M(\ell, U)$ be the generalized Verma $\tilde{\mathfrak{g}}$-module $U(\tilde{\mathfrak{g}}) \otimes_{U(B)} U$ [Le] of level $\ell$, or Weyl module. If $U = \mathbb{C}$ is the one-dimensional $\mathbb{C}h$-module on which $h$ acts as a fixed complex number $j$, the corresponding module is an ordinary Verma module denoted by $M(\ell, j)$. Note that $U$ can be identified with the subspace $1 \otimes_{U(B)} U$ of $M(\ell, U)$. Then $M(\ell, j)$ has a unique maximal submodule which intersects trivially with $\mathbb{C}$, and $L(\ell, j)$ is isomorphic to the corresponding irreducible quotient. Similarly, one can define the generalized Verma $\tilde{\mathfrak{g}}$-module $V(\ell, U) = U(\tilde{\mathfrak{g}}) \otimes_{U(P)} U$ for any $\mathfrak{g}$-module $U$ which can be extended to a $P$-module by letting $x\mathbb{C}[x] \otimes \mathfrak{g}$ act trivially and $c$ act as $\ell$. Note that if $U = \mathbb{C}$ is the trivial $\mathfrak{g}$-module then $V(\ell, \mathbb{C})$ is a quotient of $M(\ell, 0)$ and $L(\ell, 0)$ is the irreducible quotient of $V(\ell, \mathbb{C})$ modulo the unique maximal submodule which intersects $\mathbb{C}$ trivially.
It is well known that $V(\ell, \mathbb{C})$ and $L(\ell, 0)$ have natural vertex operator algebra structures for any $\ell \neq -2$ and that any $M(\ell, U)$ is a weak module for the vertex operator algebra $V(\ell, \mathbb{C})$ (cf. [FZ] and [Li1]). We recall the following Kac-Kazhdan reducibility criterion [KK]:

Proposition 2.1. The Verma module $M(\ell, j)$ is reducible if and only if there are some positive integers $n, k$ such that one of the following three conditions holds:

$$\text{(I)}\ j = n - 1 - (k - 1)t; \qquad \text{(II)}\ j = -n + kt; \qquad \text{(III)}\ \ell + 2 = 0, \tag{2.5}$$
where $t = \ell + 2$.

Remark 2.2. Since any restricted $\tilde{\mathfrak{g}}$-module of level $\ell$ is a weak $V(\ell, \mathbb{C})$-module ([FZ], [Li1]), $V(\ell, \mathbb{C})$ is always irrational. If $t = \ell + 2 \notin \mathbb{Q}_+$, then it follows from Proposition 2.1 that $V(\ell, \mathbb{C}) = L(\ell, 0)$. Therefore, $L(\ell, 0)$ is an irrational vertex operator algebra.
Recall from [KW1] that a weight $\lambda \in H^*$ is said to be admissible if the following conditions hold: (i) $\langle\lambda + \rho, \alpha\rangle > 0$ for all but finitely many positive roots $\alpha$ of $\hat{\mathfrak{g}}$; (ii) $\langle\lambda + \rho, \alpha\rangle \notin \{-1, -2, \cdots\}$ for any positive root $\alpha$ of $\hat{\mathfrak{g}}$; (iii) the set of positive roots $\alpha$ satisfying $\langle\lambda + \rho, \alpha\rangle \in \mathbb{Z}_+$ spans a 2-dimensional subspace of $H^*$. A complex number $\ell$ is called an admissible level if there is an admissible weight $\lambda$ such that $\langle\lambda, c\rangle = \ell$. It was proved in [KW1] that $\ell$ is an admissible level if and only if $\ell = -2 + \frac{p}{q}$, where $p$ and $q$ are coprime positive integers with $p \geq 2$, and that $j$ is an admissible weight of level $\ell$ if and only if $j = n - k\frac{p}{q}$ for some $n, k \in \mathbb{Z}_+$, $n \leq p - 2$, $k \leq q - 1$. From now on we will assume that $t = \ell + 2 = \frac{p}{q}$, where $p$ and $q$ are coprime positive integers with $p \geq 2$.

Remark 2.3. Let $j = n - kt$ be an admissible weight. Then $j = -(p - n) + (q - k)t$. Since $p$ and $q$ are relatively prime, $r - st = 0$ for $r, s \in \mathbb{Z}$, $0 \leq s \leq q - 1$ if and only if $s = 0$, $r = 0$. Consequently, the expression $j = n - kt$ of an admissible weight $j$ with $n, k \in \mathbb{Z}_+$, $n \leq p - 2$, $k \leq q - 1$ is unique.

A vector $w$ in a highest weight module $M$ for $\tilde{\mathfrak{g}}$ is called a singular vector if $w$ is a highest weight vector which generates a proper submodule. It is well known that the singular vectors of $M(\ell, j)$ give the key information for determining the module structure of $L(\ell, j)$ and the fusion rules. In [MFF] an expression for singular vectors in terms of non-integral powers of elements of $\hat{\mathfrak{g}}$ was found as follows (see [MFF] for details):

Proposition 2.4. [MFF]. Let $j = n - 1 - (k - 1)t$, where $n$ and $k$ are positive integers satisfying $1 \leq n \leq p - 1$, $1 \leq k \leq q$, and let $v$ be a highest weight vector of the Verma module $M(\ell, j)$. Set

$$F_1(n, k) = f(0)^{n+(k-1)t} e(-1)^{n+(k-2)t} f(0)^{n+(k-3)t} e(-1)^{n+(k-4)t} \cdots e(-1)^{n-(k-2)t} f(0)^{n-(k-1)t}, \tag{2.6}$$

$$F_2(n, k) = e(-1)^{p-n+(q-k)t} f(0)^{p-n+(q-k-1)t} e(-1)^{p-n+(q-k-2)t} f(0)^{p-n+(q-k-3)t} \cdots f(0)^{p-n-(q-k-1)t} e(-1)^{p-n-(q-k)t}. \tag{2.7}$$

Then $v_{j,1} = F_1(n, k)v$ and $v_{j,2} = F_2(n, k)v$ are singular vectors of $M(\ell, j)$ of degrees $n((k - 1)\alpha_0 + k\alpha_1)$ and $(p - n)((q + 1 - k)\alpha_0 + (q - k)\alpha_1)$, respectively. Moreover, the maximal proper submodule of $M(\ell, j)$ is generated by $v_{j,1}$ and $v_{j,2}$.

Remark 2.5. Note that $v_{0,2} = F_2(1, 1)\mathbf{1}$ generates the maximal proper submodule of $V(\ell, \mathbb{C})$.

For any complex number $\alpha$, following [F] and [FM] we set $H_\alpha = fe - \alpha h - \alpha(\alpha + 1)$. Then

$$H_\alpha H_\beta = H_\beta H_\alpha, \qquad e^m H_\alpha = H_{\alpha - m} e^m, \qquad f^m H_\alpha = H_{\alpha + m} f^m, \tag{2.8}$$

$$f^m e^m = H_0 H_1 \cdots H_{m-1}, \qquad e^m f^m = H_{-1} H_{-2} \cdots H_{-m}, \tag{2.9}$$

$$h^m e^n = e^n (h + 2n)^m, \qquad h^m f^n = f^n (h - 2n)^m \tag{2.10}$$
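for any complex numbers $\alpha, \beta$ and for any positive integers $m, n$. Let $\sigma$ be the anti-automorphism of $U(\mathfrak{g})$ such that $\sigma(a) = -a$ for any $a \in \mathfrak{g}$. Then $\sigma(H_\alpha) = H_{-(\alpha+1)}$ for any complex number $\alpha$. Let $P_1$ be the projection from $\tilde{\mathfrak{g}}$ onto $\mathfrak{g}$ such that $P_1(x^n \otimes a) = a$ for any $a \in \mathfrak{g}$ and $P_1(c) = 0$.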
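Since (2.8)-(2.10) are identities in $U(\mathfrak{g})$, they must hold in every representation of $sl_2$. The following is a minimal sanity check, not taken from the paper, that evaluates them in the $d$-dimensional irreducible representation using exact rational arithmetic. The helper names (`sl2_rep`, `H`, `check_identities`) and the choice of basis are our own assumptions; a passing check in one representation is evidence for the identities, not a proof.

```python
from fractions import Fraction


def sl2_rep(d):
    """e, f, h on the d-dimensional irreducible sl2-module (assumed basis v_0..v_{d-1}):
    h v_k = (d-1-2k) v_k,  f v_k = v_{k+1},  e v_k = k(d-k) v_{k-1}.
    Column k of each matrix holds the image of v_k."""
    e = [[Fraction(0)] * d for _ in range(d)]
    f = [[Fraction(0)] * d for _ in range(d)]
    h = [[Fraction(0)] * d for _ in range(d)]
    for k in range(d):
        h[k][k] = Fraction(d - 1 - 2 * k)
        if k + 1 < d:
            f[k + 1][k] = Fraction(1)
        if k >= 1:
            e[k - 1][k] = Fraction(k * (d - k))
    return e, f, h


def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def power(a, m):
    n = len(a)
    out = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(m):
        out = mul(out, a)
    return out


def shift(a, c):
    """Return a + c * identity."""
    n = len(a)
    return [[a[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]


def H(alpha, e, f, h):
    """H_alpha = f e - alpha h - alpha (alpha + 1)."""
    n = len(h)
    fe = mul(f, e)
    core = [[fe[i][j] - alpha * h[i][j] for j in range(n)] for i in range(n)]
    return shift(core, -alpha * (alpha + 1))


def check_identities(d, alpha, beta, m, n):
    """Check (2.8), (2.9) and (2.10) in the d-dimensional representation."""
    e, f, h = sl2_rep(d)
    Ha, Hb = H(alpha, e, f, h), H(beta, e, f, h)
    em, fm = power(e, m), power(f, m)
    en, fn, hm = power(e, n), power(f, n), power(h, m)
    up = down = power(h, 0)
    for i in range(m):
        up = mul(up, H(Fraction(i), e, f, h))           # H_0 H_1 ... H_{m-1}
        down = mul(down, H(Fraction(-i - 1), e, f, h))  # H_{-1} ... H_{-m}
    return all([
        mul(Ha, Hb) == mul(Hb, Ha),
        mul(em, Ha) == mul(H(alpha - m, e, f, h), em),
        mul(fm, Ha) == mul(H(alpha + m, e, f, h), fm),
        mul(fm, em) == up,
        mul(em, fm) == down,
        mul(hm, en) == mul(en, power(shift(h, 2 * n), m)),
        mul(hm, fn) == mul(fn, power(shift(h, -2 * n), m)),
    ])


if __name__ == "__main__":
    print(check_identities(6, Fraction(1, 3), Fraction(-5, 2), 3, 2))
```

Exact `Fraction` arithmetic is used so that non-integral values of $\alpha$ and $\beta$, which are the case of interest at admissible levels, are compared without rounding error.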
Proposition 2.6. [F]. The following projection formulas hold:

$$P_1(F_1(n, k)) = \left(\prod_{r=0}^{n-1}\prod_{s=1}^{k-1} H_{r+st}\right) f^n, \tag{2.11}$$

$$P_1(F_2(n, k)) = \left(\prod_{r=1}^{p-n}\prod_{s=1}^{q-k} H_{-r-st}\right) e^{p-n}. \tag{2.12}$$
Let $B_0 = \mathbb{C}(f(-1) + f(0)) + \mathbb{C}[x^{-1}](x^{-2} + x^{-1}) \otimes \mathfrak{g}$. Then $B_0$ is an ideal of $N_-$ such that $N_-/B_0 = \mathbb{C}T_+ + \mathbb{C}T_0 + \mathbb{C}T_-$, denoted by $L_0$, where $T_+ = e(-1) + B_0$, $T_0 = h(-1) + B_0$, $T_- = f + B_0$ satisfy the following commutation relations:

$$[T_+, T_-] = T_0, \qquad [T_0, T_+] = -2T_+, \qquad [T_0, T_-] = 2T_-. \tag{2.13}$$
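Let $P$ be the natural quotient map from $U(N_-)$ onto $U(L_0)$. For any complex number $\alpha$, we define $G_\alpha = T_- T_+ - \alpha T_0 + \alpha(\alpha + 1)$. Then

$$G_\alpha G_\beta = G_\beta G_\alpha, \qquad T_+^m G_\alpha = G_{\alpha - m} T_+^m, \qquad T_-^m G_\alpha = G_{\alpha + m} T_-^m, \tag{2.14}$$

$$T_-^m T_+^m = G_0 G_1 \cdots G_{m-1}, \qquad T_+^m T_-^m = G_{-1} G_{-2} \cdots G_{-m} \tag{2.15}$$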
for any complex numbers $\alpha, \beta$ and for any positive integer $m$. Using the same method as suggested in [F] we obtain

Proposition 2.7. The following formulas hold:

$$P(F_1(n, k)) = \left(\prod_{r=0}^{n-1}\prod_{s=1}^{k-1} G_{r+st}\right) T_-^n, \tag{2.16}$$

$$P(F_2(n, k)) = \left(\prod_{r=1}^{p-n}\prod_{s=1}^{q-k} G_{-r-st}\right) T_+^{p-n}. \tag{2.17}$$
Recall that $N_- = \mathbb{C}f + x^{-1}\mathbb{C}[x^{-1}] \otimes \mathfrak{g}$. Set $B_2 = \mathbb{C}f(-1) + x^{-2}\mathbb{C}[x^{-1}] \otimes \mathfrak{g}$. Then it is clear that $B_2$ is an ideal of $N_-$. Let $L_2 = N_-/B_2$ be the quotient Lie algebra. Then $L_2$ is a three-dimensional Heisenberg Lie algebra with relations $[\bar e, \bar f] = \bar h$, $[\bar h, \bar e] = [\bar h, \bar f] = 0$, where $\bar e = e(-1) + B_2$, $\bar f = f + B_2$, $\bar h = h(-1) + B_2$. Let $P_2$ be the natural quotient map from $U(N_-)$ to $U(L_2)$. Then

Proposition 2.8. [F]. For any positive integers $1 \leq n \leq p - 1$, $1 \leq k \leq q$, we have

$$P_2(F_1(n, k)) = \left(\prod_{r=0}^{n-1}\prod_{s=1}^{k-1} \bar H_{r+st}\right) \bar f^{\,n}, \tag{2.18}$$

$$P_2(F_2(n, k)) = \left(\prod_{r=1}^{p-n}\prod_{s=1}^{q-k} \bar H_{-r-st}\right) \bar e^{\,p-n}, \tag{2.19}$$

where $\bar H_\alpha = \bar e \bar f - \alpha \bar h$.
For a $\mathbb{C}h$-module $U$ define a linear functional on $U^* \otimes M(\ell, U)$ as follows:

$$\langle u', u\rangle = u'(P(u)) \quad \text{for } u' \in U^*,\ u \in M(\ell, U), \tag{2.20}$$

where $P$ is the projection of $M(\ell, U)$ onto the subspace $U$. Define

$$I = \{u \in M(\ell, U) \mid \langle u', xu\rangle = 0 \ \text{for any } u' \in U^*,\ x \in U(\tilde{\mathfrak{g}})\}. \tag{2.21}$$
It is clear that $I$ is the unique maximal submodule which intersects $U$ trivially. Set $L(\ell, U) = M(\ell, U)/I$ and regard $U$ as a subspace of $L(\ell, U)$ in the natural way. Then $P$ induces a projection of $L(\ell, U)$ onto $U$, which is still denoted by $P$, and the formula (2.20) also defines a linear functional on $U^* \otimes L(\ell, U)$. Then (see [FZ] or [Li2]) $M(\ell, U)$ and $L(\ell, U)$ are weak modules for the vertex operator algebra $V(\ell, \mathbb{C})$. Let $Y(\cdot, z)$ be the vertex operators defining the $V(\ell, \mathbb{C})$-module structure on $L(\ell, U)$. It is clear that $Y(\cdot, z)$ is an intertwining operator of type $\binom{L(\ell, U)}{V(\ell, \mathbb{C})\ \ L(\ell, U)}$ (see [FHL] for the definition of intertwining operator). Let $\mathcal{Y}(\cdot, z)$ be the intertwining operator of type $\binom{L(\ell, U)}{L(\ell, U)\ \ V(\ell, \mathbb{C})}$ defined by $\mathcal{Y}(u, z)v = e^{zL(-1)} Y(v, -z)u$ (cf. [FHL]).

Lemma 2.9. The $\tilde{\mathfrak{g}}$-module $L(\ell, U)$ is a weak module for the vertex operator algebra $L(\ell, 0)$ if and only if

$$\langle u', \mathcal{Y}(u, z)v_{0,2}\rangle = 0 \quad \text{for any } u' \in U^*,\ u \in U(\mathfrak{g})U \subset L(\ell, U). \tag{2.22}$$
Proof. It is clear that the condition is necessary. Now we assume that (2.22) holds. Let $J$ be the maximal submodule of $V(\ell, \mathbb{C})$ which intersects $\mathbb{C}$ trivially. Then $J = U(N_-)v_{0,2}$. From the definition of the bilinear form we get

$$\langle u', a\mathcal{Y}(u, z)w\rangle = 0 \quad \text{for } u' \in U^*,\ u \in L(\ell, U),\ a \in N_- U(N_-),\ w \in L(\ell, 0). \tag{2.23}$$

By using the commutator formula

$$[a(m), \mathcal{Y}(u, z)] = \sum_{j \geq 0} \binom{m}{j} \mathcal{Y}(a(j)u, z) z^{m-j} \tag{2.24}$$

for $a \in \mathfrak{g}$, $m \in \mathbb{Z}$ and $u \in L(\ell, U)$ together with (2.22) we get

$$\langle u', \mathcal{Y}(u, z)J\rangle = 0 \quad \text{for any } u' \in U^*,\ u \in U(\mathfrak{g})U \subset L(\ell, U). \tag{2.25}$$

From the Jacobi identity for the vertex operators against the intertwining operator we have

$$\mathcal{Y}(a(n)u, z) = \sum_{j \geq 0} \binom{n}{j} a(n - j)\mathcal{Y}(u, z) z^{j} - (-1)^n \sum_{j \geq 0} \binom{n}{j} \mathcal{Y}(u, z) a(j) z^{n-j} \tag{2.26}$$

for $u \in L(\ell, U)$, $a \in \mathfrak{g}$ and $n \in \mathbb{Z}$. Note that $L(\ell, U)$ is generated by $U$ as a $\tilde{\mathfrak{g}}$-module. Combining (2.23), (2.25) and (2.26) gives

$$\langle u', \mathcal{Y}(u, z)J\rangle = 0 \quad \text{for any } u' \in U^*,\ u \in L(\ell, U). \tag{2.27}$$

By the commutator formula (2.24) again, we obtain

$$\langle u', x\mathcal{Y}(u, z)J\rangle = 0 \quad \text{for any } u' \in U^*,\ x \in U(\tilde{\mathfrak{g}}),\ u \in L(\ell, U). \tag{2.28}$$
By the definition of $L(\ell, U)$ we have $\mathcal{Y}(u, z)v = 0$ for any $v \in J$, $u \in L(\ell, U)$. Thus, $\mathcal{Y}(\cdot, z)$ induces an intertwining operator of type $\binom{L(\ell, U)}{L(\ell, U)\ \ L(\ell, 0)}$. This proves that $L(\ell, U)$ is a weak module for $L(\ell, 0)$.

Proposition 2.10. The module $L(\ell, U)$ is a weak $L(\ell, 0)$-module if and only if $f(h)U = 0$, where

$$f(h) = \prod_{r=0}^{p-2}\prod_{s=0}^{q-1} (h - r + st).$$
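Proof. Recall that $\mathcal{Y}(\cdot, z)$ is the corresponding nonzero intertwining operator of type $\binom{L(\ell, U)}{L(\ell, U)\ \ V(\ell, \mathbb{C})}$. For $n \in \mathbb{Z}$, $a \in \mathfrak{g}$ we define $\deg(x^n \otimes a) = n$. By (2.24) we obtain

$$\langle u', \mathcal{Y}(u, z)av\rangle = \langle u', z^{\deg a}\, \mathcal{Y}(\sigma P_1(a)u, z)v\rangle \tag{2.29}$$

for $u' \in U^*$, $u \in U(\mathfrak{g})U \subseteq L(\ell, U)$, $a \in U(N_-)$, $v \in L(\ell, 0)$. Let $a = F_2(1, 1)$. Then $v_{0,2} = a\mathbf{1}$. By Lemma 2.9 and (2.29), $L(\ell, U)$ is a weak $L(\ell, 0)$-module if and only if

$$\langle u', \mathcal{Y}(\sigma P_1(a)u, z)\mathbf{1}\rangle = 0. \tag{2.30}$$

By Proposition 2.6, we have

$$P_1(a) = \prod_{r=1}^{p-1}\prod_{s=1}^{q-1} H_{-r-st}\, e^{p-1}. \tag{2.31}$$

Then from (2.8),

$$\sigma P_1(a) = (-1)^{p-1} e^{p-1} \prod_{r=1}^{p-1}\prod_{s=1}^{q-1} H_{r-1+st} = (-1)^{p-1} \prod_{r=1}^{p-1}\prod_{s=1}^{q-1} H_{-p+r+st}\, e^{p-1}. \tag{2.32}$$

Note that $\langle u', \mathcal{Y}(\sigma P_1(a)u, z)\mathbf{1}\rangle = \langle u', e^{zL(-1)}(\sigma P_1(a))u\rangle = \langle u', (\sigma P_1(a))u\rangle$. Thus $L(\ell, U)$ is a weak $L(\ell, 0)$-module if and only if

$$\left\langle u', \prod_{r=1}^{p-1}\prod_{s=1}^{q-1} H_{-p+r+st}\, e^{p-1}\, U(\mathfrak{g})U \right\rangle = 0 \quad \text{for any } u' \in U^*.$$

From the grading restriction on the bilinear pairing, the latter is equivalent to

$$\left(\prod_{r=1}^{p-1}\prod_{s=1}^{q-1} H_{-p+r+st}\right) e^{p-1} f^{p-1} U = 0. \tag{2.33}$$

By (2.9) and the fact that $eU = 0$ we have

$$\left(\prod_{r=1}^{p-1}\prod_{s=1}^{q-1} H_{-p+r+st}\right)\left(\prod_{i=1}^{p-1} H_{-i}\right) U = \left(\prod_{r=1}^{p-1}\prod_{s=1}^{q-1} (p - r - st)(h - p + r + 1 + st)\right)\left(\prod_{i=1}^{p-1} i(h - i + 1)\right) U = 0. \tag{2.34}$$

Since $p - r - st \neq 0$ for any $1 \leq r \leq p - 1$, $1 \leq s \leq q - 1$ (from Remark 2.3), we obtain

$$\left(\prod_{r=1}^{p-1}\prod_{s=1}^{q-1} (h - p + 1 + r + st)\right)\left(\prod_{i=1}^{p-1} (h - i + 1)\right) U = 0. \tag{2.35}$$

Replacing $r$ by $p - 1 - r$ in the first product, and noting that the second product is the $s = 0$ case, we get

$$\prod_{r=0}^{p-2}\prod_{s=0}^{q-1} (h - r + st)U = 0. \tag{2.36}$$

This finishes the proof.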
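The passage from (2.35) to (2.36) is a reindexing of linear factors in $h$. The sketch below, which is not part of the paper, compares the multisets of roots of the two polynomials for given coprime $p, q$. It reads (2.35) as the product over $(r, s)$ of $(h - p + 1 + r + st)$ times the product over $i$ of $(h - i + 1)$, which is our interpretation of the printed formula; the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def roots_2_35(p, q):
    """Roots (with multiplicity) of the polynomial in (2.35), with t = p/q."""
    t = Fraction(p, q)
    roots = Counter()
    for r in range(1, p):
        for s in range(1, q):
            roots[p - 1 - r - s * t] += 1  # root of (h - p + 1 + r + s t)
    for i in range(1, p):
        roots[Fraction(i - 1)] += 1        # root of (h - i + 1)
    return roots


def roots_2_36(p, q):
    """Roots (with multiplicity) of f(h) in (2.36), with t = p/q."""
    t = Fraction(p, q)
    return Counter(r - s * t for r in range(p - 1) for s in range(q))


if __name__ == "__main__":
    for p, q in [(2, 1), (2, 3), (3, 2), (5, 3), (7, 4)]:
        same = roots_2_35(p, q) == roots_2_36(p, q)
        simple = max(roots_2_36(p, q).values()) == 1
        print(p, q, same, simple)
```

For coprime $p$ and $q$ the roots of $f(h)$ are pairwise distinct, in line with Remark 2.3, so $f(h)U = 0$ forces $h$ to act semisimply on $U$ with eigenvalues among the admissible weights.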
Corollary 2.11. The highest weight $\widehat{sl}_2$-module $L(\ell, j)$ is a weak $L(\ell, 0)$-module if and only if $j = r - st$ for $0 \leq r \leq p - 2$, $0 \leq s \leq q - 1$. That is, the $\hat{\mathfrak{g}}$-module $L(\ell, j)$ is a weak $L(\ell, 0)$-module if and only if $j$ is admissible.

Let $j$ be an admissible weight so that $L(\ell, j)$ is a weak $L(\ell, 0)$-module. It follows from [FHL] that $L(\ell, j)'$ is a weak $L(\ell, 0)$-module. But $(L(\ell, j)')' \neq L(\ell, j)$ because $L(\ell, j)$ has infinite-dimensional homogeneous subspaces in general. By using the well-known principal grading (cf. [K]), any $L(\ell, j) = \oplus_{m,n\in\mathbb{Z}} L(\ell, j)_{(m,n)}$ becomes a $\mathbb{Z}^2$-graded space such that each homogeneous subspace is finite-dimensional. Let $L(\ell, j)^c = \oplus_{(m,n)\in\mathbb{Z}\times\mathbb{Z}} L(\ell, j)^*_{(m,n)}$ be the restricted dual of $L(\ell, j)$ with respect to this $\mathbb{Z}^2$-grading. Then it is clear that $L(\ell, j)^c$ is an irreducible weak $L(\ell, 0)$-module satisfying $(L(\ell, j)^c)^c = L(\ell, j)$. But the lowest $L(0)$-weight subspace of $L(\ell, j)^c$ is a lowest weight $\mathfrak{g}$-module with $-j$ as its lowest weight. Then there is a non-trivial intertwining operator of type $\binom{L(\ell, 0)}{L(\ell, j)\ \ L(\ell, j)^c}$, so that $L(\ell, j)$ and $L(\ell, j)^c$ are conjugates of each other from the physical point of view.

Corollary 2.12. Let $j$ be an admissible weight. Then both $L(\ell, j)$ and $L(\ell, j)^c$ are irreducible weak $L(\ell, 0)$-modules.

Remark 2.13. If $\ell$ is not a nonnegative integer, there are also other types of irreducible weak $L(\ell, 0)$-modules. For a positive integral level $\ell$, it was proved [DLiM2] that any weak module is completely reducible and any irreducible weak $L(\ell, 0)$-module is an irreducible integrable highest weight $\hat{\mathfrak{g}}$-module of level $\ell$. This distinguishes $L(\ell, 0)$ for a positive integral level $\ell$ from all the rational levels.

Remark 2.14. It follows immediately from Proposition 2.10 and a complete reducibility theorem of Kac-Wakimoto (Theorem 4.1 of [KW2]) that any weak $L(\ell, 0)$-module $M$ which is a $\hat{\mathfrak{g}}$-module of level $\ell$ from the category $\mathcal{O}$ is a direct sum of irreducible modules $L(\ell, j)$ with admissible weight $j$.
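For concreteness, the admissible weights of Corollary 2.11 can be listed for small $p$ and $q$. The following sketch is illustrative only; the function name `admissible_weights` is our own, and it simply enumerates $j = r - st$ with $t = p/q$ in exact arithmetic.

```python
from fractions import Fraction
from math import gcd


def admissible_weights(p, q):
    """Sorted weights j = r - s*t, 0 <= r <= p-2, 0 <= s <= q-1, with t = p/q."""
    if p < 2 or q < 1 or gcd(p, q) != 1:
        raise ValueError("need coprime positive integers p >= 2 and q")
    t = Fraction(p, q)
    return sorted(r - s * t for r in range(p - 1) for s in range(q))


if __name__ == "__main__":
    # Level l = p/q - 2 = -4/3: three admissible weights, only one of them integral.
    print(admissible_weights(2, 3))
    # Integral level l = 3 (p = 5, q = 1): the familiar weights 0, 1, 2, 3.
    print(admissible_weights(5, 1))
```

When $q = 1$ the list reduces to the integers $0, \ldots, \ell$. For $q > 1$ non-integral weights appear, and these are the weights for which the homogeneous subspaces of $L(\ell, j)$ are infinite-dimensional.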
Next, we shall prove a complete reducibility theorem for a category much bigger than the category $\mathcal{O}$. Recall the following theorem from [KK]:

Theorem 2.15. Let $\lambda, \mu \in H^*$. Then $L(\mu)$ is isomorphic to a subquotient module of $M(\lambda)$ if and only if the ordered pair $\{\lambda, \mu\}$ satisfies the following condition: There exist a sequence $\beta_1, \cdots, \beta_k$ of positive roots and a sequence $n_1, \cdots, n_k$ of positive integers such that (i) $\lambda - \sum_{i=1}^{k} n_i\beta_i = \mu$; (ii) $2(\lambda + \rho - n_1\beta_1 - \cdots - n_{j-1}\beta_{j-1}, \beta_j) = n_j(\beta_j, \beta_j)$ for $1 \leq j \leq k$.

Lemma 2.16. Let $\lambda, \mu$ be two distinct admissible weights. Then $L(\mu)$ is not isomorphic to any subquotient module of $M(\lambda)$.

Proof. Otherwise, by Theorem 2.15 we have a sequence $\beta_1, \cdots, \beta_k$ of positive roots and a sequence $n_1, \cdots, n_k$ of positive integers satisfying (i)–(ii). From [DGK] each $\beta_i$ is real. Then we obtain

$$\langle\mu + \rho, \beta_k\rangle = \left\langle\lambda + \rho - \sum_{i=1}^{k} n_i\beta_i, \beta_k\right\rangle = \frac{2(\lambda + \rho - \sum_{i=1}^{k} n_i\beta_i, \beta_k)}{(\beta_k, \beta_k)} = -n_k. \tag{2.37}$$

This contradicts the admissibility of $\mu$.
Lemma 2.17. Let $M$ be a weak $L(\ell, 0)$-module such that $M$ is a highest weight $\tilde{\mathfrak{g}}$-module. Then $M$ is irreducible.

Proof. Let $\lambda$ be the highest weight of $M$. If $M$ contains a proper submodule $W$, there is a highest weight vector $u$ in $W$ of weight $\mu$ such that $\mu < \lambda$. Then both $\lambda$ and $\mu$ are admissible by Corollary 2.11. This contradicts Lemma 2.16. Hence $M$ is irreducible.

Recall from [K] that $\hat\omega$ is the involutory antiautomorphism of $\hat{\mathfrak{g}}$, which is the negative Chevalley involution. Let $M$ be a $\hat{\mathfrak{g}}$-module of level $\ell$ such that $H$ acts locally finitely on $M$ with finite-dimensional generalized $H$-eigenspaces. We define [DGK] $M^{\hat\omega} = \oplus_{\lambda\in H^*} M_\lambda^*$ with the following action: $(af)(u) = f(\hat\omega(a)u)$ for any $f \in M^{\hat\omega}$, $a \in \hat{\mathfrak{g}}$, $u \in M$. Then $(M^{\hat\omega})^{\hat\omega} \simeq M$ and $L(\lambda)^{\hat\omega} \simeq L(\lambda)$ for any $\lambda \in H^*$.

Proposition 2.18. Let $\lambda_1, \lambda_2$ be admissible weights of level $\ell$. Then any short exact sequence

$$0 \to L(\lambda_1) \to M \to L(\lambda_2) \to 0 \tag{2.38}$$

of weak $L(\ell, 0)$-modules splits.

Proof. First, since $H$ acts semisimply on $L(\lambda_1)$ and $L(\lambda_2)$, $H$ acts locally finitely on $M$. Let $M = \oplus_{\lambda\in H^*} M_\lambda$ be the generalized $H$-eigenspace decomposition. Then the sequence (2.38) splits if and only if the following sequence splits:

$$0 \to L(\lambda_2) \to M^{\hat\omega} \to L(\lambda_1) \to 0. \tag{2.39}$$

Without loss of generality we may assume that $\lambda_1 \not> \lambda_2$. Let $u \in M_{\lambda_2}$ be such that $u \notin L(\lambda_1)$. Then $N_+u \subseteq L(\lambda_1)$. If $N_+u \neq 0$, there is a $\beta \in \mathbb{Z}_+\alpha_0 + \mathbb{Z}_+\alpha_1$ such that $\lambda_2 + \beta = \lambda_1$. This contradicts the assumption $\lambda_1 \not> \lambda_2$. Thus $N_+u = 0$. Set $U = U(\mathfrak{g})u$. Let $W$ be the submodule generated by $U$. Since $L(\ell, U)$ as a $\hat{\mathfrak{g}}$-module is isomorphic to some quotient module of $W$, $L(\ell, U)$ is a weak $L(\ell, 0)$-module. From Proposition 2.10, $H$ acts semisimply on $U$. Then $u$ is a highest weight vector. By Lemma 2.17, $W$ is irreducible. Then we obtain $M = W \oplus L(\lambda_1)$. That is, the sequence (2.38) splits.

From Proposition 2.18 we have
Corollary 2.19. Let $\lambda, \lambda_1, \cdots, \lambda_k$ be admissible weights of level $\ell$. Then any short exact sequence $0 \to L(\lambda_1) \oplus \cdots \oplus L(\lambda_k) \to M \to L(\lambda) \to 0$ of weak $L(\ell, 0)$-modules splits.

Theorem 2.20. Let $M$ be any weak $L(\ell, 0)$-module such that for any $u \in M$, there exists a positive integer $k$ such that $(N_+)^k u = 0$. Then $M$ is a direct sum of irreducible modules $L(\ell, j)$ with admissible weight $j$.

Proof. Set $\Omega(M) = \{m \in M \mid (x\mathbb{C}[x] \otimes \mathfrak{g}) \cdot m = 0\}$. Then the proof of Theorem 3.7 of [DLiM2] shows that $\Omega(M) \neq 0$. Since $e$ is locally nilpotent on $\Omega(M)$ we conclude that there exist nonzero vectors $m \in M$ such that $N_+m = 0$. From the proof of Proposition 2.18 we see that $H$ acts semisimply on $U(\mathfrak{g})m$. Thus $M$ contains a highest weight vector. It follows from Lemma 2.17 that $M$ contains an irreducible weak $L(\ell, 0)$-module $L(\lambda)$. Let $W$ be the sum of all irreducible weak $L(\ell, 0)$-submodules of $M$. We have to prove $M = W$. If $M \neq W$, there is a submodule $E$ of $M$ such that $W \subseteq E$ and $E/W \simeq L(\lambda)$ for some admissible weight $\lambda$. Let $u + W$ be a highest weight vector of $E/W$. Since $\hat{\mathfrak{g}}$ is finitely generated, $N_+u \subseteq L(\lambda_1) \oplus L(\lambda_2) \oplus \cdots \oplus L(\lambda_r)$ for some $\lambda_1, \cdots, \lambda_r$. Set $W^o = L(\lambda_1) \oplus L(\lambda_2) \oplus \cdots \oplus L(\lambda_r)$. It follows from Corollary 2.19 that the submodule generated by $u$ and $W^o$ is completely reducible. Then $u \in W$. This contradicts the choice of $u$. Thus $M = W$. This finishes the proof.

Remark 2.21. From Corollary 2.11 and Theorem 2.20 the set of equivalence classes of irreducible $L(\ell, 0)$-modules consists of $L(\ell, j)$ with $j \in \mathbb{Z}$, $0 \leq j \leq p - 2$, and any (ordinary) module is completely reducible.

Remark 2.22. In [Z], an associative algebra $A(V)$ was introduced for any vertex operator algebra $V$ such that there is a natural one-to-one correspondence between the set of equivalence classes of irreducible $\mathbb{Z}_+$-graded weak $V$-modules and the set of equivalence classes of irreducible $A(V)$-modules.
If $\ell$ is not a nonnegative integer, then $A(L(\ell, 0))$ has infinite-dimensional irreducible modules so that $A(L(\ell, 0))$ is infinite-dimensional. Therefore (from [DLiM1]), $L(\ell, 0)$ is not rational. Because Zhu's $C_2$-finiteness condition implies that $A(L(\ell, 0))$ is finite-dimensional, $L(\ell, 0)$ does not satisfy the $C_2$-finiteness condition.

3. Q-Graded Vertex Operator Algebras and the Rationality of $(L(\ell, 0), \omega_z)$

If $j$ is not a nonnegative integer, the homogeneous subspaces of $L(\ell, j)$ are infinite-dimensional, so that the character $\mathrm{tr}_{L(\ell,j)}\, q^{L(0)}$ is not well-defined. In [KW1–KW2], the modified characters $\mathrm{tr}_{L(\ell,j)}\, q^{L(0) - \frac{1}{2} z h(0) - \frac{c_\ell}{24}}$ were considered, where $c_\ell = \frac{3\ell}{\ell + 2}$ and $z$ is a positive rational number less than 1. Noticing that $L(0) - \frac{1}{2} z h(0)$ can be considered as the degree-zero component of a Virasoro vector $\omega_z = \omega + \frac{1}{2} z h(-2)\mathbf{1}$ whose central charge is $c_{\ell,z} = c_\ell - 6\ell z^2$, we study $L(\ell, 0)$ with respect to the new Virasoro element $\omega_z$ in this section. We denote the new vertex operator algebra by $(L(\ell, 0), \omega_z)$. Note that $(L(\ell, 0), \omega_z)$ is $\mathbb{Q}$-graded instead of $\mathbb{Z}$-graded. This leads us to the study of $\mathbb{Q}$-graded vertex operator algebras. In particular we extend Zhu's $A(V)$-theory and Frenkel-Zhu's fusion rule formula to any $\mathbb{Q}$-graded vertex operator algebra. That is, we construct an associative
algebra $A(V)$ for any $\mathbb{Q}$-graded VOA $V$ and establish the one-to-one correspondence between the set of equivalence classes of irreducible $\mathbb{Q}_+$-graded weak $V$-modules and the set of equivalence classes of irreducible $A(V)$-modules. If $V$ is $\frac{1}{2}\mathbb{Z}$-graded, our construction of $A(V)$ and related results coincide with those for the vertex operator superalgebra as developed in [KWa]. We also use the complete reducibility result, Theorem 2.20, to show that $(L(\ell, 0), \omega_z)$ is rational.

A $\mathbb{Q}$-graded vertex operator algebra $V$ satisfies all the axioms for a vertex operator algebra except that $V$ is $\mathbb{Q}$-graded by weights instead of $\mathbb{Z}$-graded. In particular, a $\mathbb{Q}$-graded vertex operator algebra is a generalized vertex operator algebra in the sense of [DL]. The definitions of weak module and ordinary module are as before. In the definition of $\mathbb{Z}_+$-graded module for a $\mathbb{Z}$-graded vertex operator algebra, replacing $\mathbb{Z}$ by $\mathbb{Q}$ gives a $\mathbb{Q}_+$-graded module for a $\mathbb{Q}$-graded vertex operator algebra.

Definition 3.1. A $\mathbb{Q}$-graded vertex operator algebra $V$ is called rational if any $\mathbb{Q}_+$-graded weak $V$-module is completely reducible.

Let $V = \oplus_{\alpha\in\mathbb{Q}} V_\alpha$ be a $\mathbb{Q}$-graded vertex operator algebra. Then $V_{\mathbb{Z}} = \oplus_{n\in\mathbb{Z}} V_n$ is a $\mathbb{Z}$-graded (ordinary) vertex operator algebra. Just as in [FFR] and [Li2], one obtains a $\mathbb{Q}$-graded Lie algebra $G(V) = \oplus_{\alpha\in\mathbb{Q}} G(V)_\alpha$ as the quotient space of $\mathbb{C}[x, x^{-1}] \otimes V$ modulo $(\frac{d}{dx} \otimes 1 + 1 \otimes L(-1))(\mathbb{C}[x, x^{-1}] \otimes V)$. Here the Lie bracket is induced from

$$[x^m \otimes u, x^n \otimes v] = \sum_{i=0}^{\infty} \binom{m}{i} x^{m+n-i} \otimes u_i v$$

for $u, v \in V$, and the degree of $x^n \otimes u + (\frac{d}{dx} \otimes 1 + 1 \otimes L(-1))(\mathbb{C}[x, x^{-1}] \otimes V)$ is $\mathrm{wt}\, u - n - 1$ for homogeneous $u$. Set

$$G(V)_\pm = \oplus_{\alpha > 0}\, G(V)_{\pm\alpha}. \tag{3.1}$$
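As a small numerical illustration of the $\mathbb{Q}$-grading of $(L(\ell, 0), \omega_z)$ introduced at the start of this section, the sketch below (not from the paper; the function names are hypothetical) computes $c_\ell = 3\ell/(\ell + 2)$ and $c_{\ell,z} = c_\ell - 6\ell z^2$ for $\ell = p/q - 2$, together with the $L_z(0) = L(0) - \frac{1}{2} z h(0)$ eigenvalues of the generators $e(-1)\mathbf{1}$, $h(-1)\mathbf{1}$, $f(-1)\mathbf{1}$. It assumes only that these vectors have $L(0)$-weight 1 and $h(0)$-eigenvalues $2$, $0$, $-2$.

```python
from fractions import Fraction


def central_charges(p, q, z):
    """Return (c_l, c_{l,z}) for level l = p/q - 2 and shift parameter z."""
    level = Fraction(p, q) - 2
    c = 3 * level / (level + 2)
    return c, c - 6 * level * z * z


def shifted_weight(l0_weight, h_eigenvalue, z):
    """Eigenvalue of L_z(0) = L(0) - (1/2) z h(0) on a joint eigenvector."""
    return l0_weight - Fraction(1, 2) * z * h_eigenvalue


if __name__ == "__main__":
    z = Fraction(1, 2)
    print(central_charges(2, 3, z))
    for name, h_val in (("e(-1)1", 2), ("h(-1)1", 0), ("f(-1)1", -2)):
        print(name, shifted_weight(1, h_val, z))
```

With a rational $z$ that is not an even integer, the generators $e(-1)\mathbf{1}$ and $f(-1)\mathbf{1}$ acquire non-integral weights $1 - z$ and $1 + z$, which is why the grading is by $\mathbb{Q}$ rather than $\mathbb{Z}$.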
Let $U$ be any $G(V)_0$-module. Then we form the following induced module:

$$M(U) = U(G(V)) \otimes_{U(G(V)_0 \oplus G(V)_-)} U, \tag{3.2}$$

where $G(V)_-$ acts trivially on $U$. Then $M(U)$ is a lower-truncated $\mathbb{Q}$-graded $G(V)$-module generated by the lowest-degree subspace $U$. Let $U^*$ be the dual space of $U$ and extend $U^*$ to $M(U)$ by letting $U^*$ annihilate $\oplus_{n>0} M(U)(n)$. We denote this pairing by $\langle u', v\rangle$ for $u' \in U^*$ and $v \in M(U)$. Set

$$I = \{v \in M(U) \mid \langle u', av\rangle = 0 \ \text{for any } u' \in U^*,\ a \in U(G(V))\}. \tag{3.3}$$
Then it is clear that $I$ is a $G(V)$-submodule of $M(U)$. Let $L(U)$ be the quotient module of $M(U)$ modulo $I$.

Let $V$ be a $\mathbb{Q}$-graded vertex operator algebra. First we define a function $\varepsilon$ for all homogeneous elements of $V$ as follows: $\varepsilon(a) = 1$ if $\mathrm{wt}\, a \in \mathbb{Z}$, $\varepsilon(a) = 0$ if $\mathrm{wt}\, a \notin \mathbb{Z}$. For any homogeneous element $a \in V$, we define:

$$a * b = \varepsilon(a)\, \mathrm{Res}_x \frac{(1 + x)^{[\mathrm{wt}\, a]}}{x} Y(a, x)b \quad \text{for any } b \in V, \tag{3.4}$$

where $[\cdot]$ denotes the greatest-integer function. Then extend “$*$” to a bilinear product on $V$. Let $O(V)$ be the subspace of $V$ linearly spanned by

$$\mathrm{Res}_x \frac{(1 + x)^{[\mathrm{wt}\, a]}}{x^{1+\varepsilon(a)}} Y(a, x)b \tag{3.5}$$

for any homogeneous element $a \in V$ and for any $b \in V$. Using $(1 + x)^m = \sum_{i=0}^{m} \binom{m}{i} x^i$ one can prove

$$\mathrm{Res}_x \frac{(1 + x)^{[\mathrm{wt}\, a]+m}}{x^{n+1+\varepsilon(a)}} Y(a, x)b \in O(V) \tag{3.6}$$
for $n \geq m \geq 0$. Let $M$ be any weak $V$-module. Then we define $\Omega(M) = \{u \in M \mid a_m u = 0 \text{ for } a \in V,\ m > \mathrm{wt}\, a - 1\}$. Define $o$ to be the linear map from $V$ to $\mathrm{End}\,\Omega(M)$ such that $o(a) = \varepsilon(a)\, a_{[\mathrm{wt}\, a]-1}$ for any homogeneous element $a$ of $V$. Generalizing Theorems 2.1.1 and 2.1.2 of [Z] we obtain

Theorem 3.2. (a) The subspace $O(V)$ is a two-sided ideal of $V$ with respect to the product “$*$” and $A(V) = V/O(V)$ is an associative algebra with identity $\mathbf{1} + O(V)$. Moreover, $\omega + O(V)$ lies in the center of $A(V)$. (b) For any weak $V$-module $M$, $\Omega(M)$ is an $A(V)$-module with $a$ acting as $o(a)$.

The proof is the same as in the twisted case (see the proofs of Proposition 2.3 and Theorem 5.3 in [DLiM1]). Similarly, for a weak $V$-module $M$, we define $O(M)$ to be the subspace of $M$ linearly spanned by

$$\mathrm{Res}_x \frac{(1 + x)^{[\mathrm{wt}\, a]}}{x^{1+\varepsilon(a)}} Y(a, x)u \tag{3.7}$$
for any homogeneous element $a \in V$ and for any $u \in M$. The following theorem is an analogue of Theorems 1.5.1 and 1.5.2 of [FZ] (also see [KWa] and [Li2]):

Theorem 3.3. (a) The quotient space $A(M) = M/O(M)$ is an $A(V)$-bimodule with the following left and right actions:

$$a * u = \varepsilon(a)\, \mathrm{Res}_x \frac{(1 + x)^{[\mathrm{wt}\, a]}}{x} Y(a, x)u, \tag{3.8}$$

$$u * a = \varepsilon(a)\, \mathrm{Res}_x \frac{(1 + x)^{[\mathrm{wt}\, a]-1}}{x} Y(a, x)u \tag{3.9}$$

for any homogeneous $a \in V$ and for any $u \in M$. (b) Let $W_1, W_2, W_3$ be irreducible $V$-modules and suppose $V$ is rational¹. Then there is a linear isomorphism from the space $\mathrm{Hom}_{A(V)}(A(W_1) \otimes_{A(V)} W_2(0), W_3(0))$ to the space of intertwining operators of type $\binom{W_3}{W_1\ \ W_2}$.

¹ It was pointed out in [Li2] that this condition is necessary and a proof was supplied.

Proof. Let $I(M)$ be the subspace of $O(M)$ linearly spanned by

$$\mathrm{Res}_x \frac{(1 + x)^{\mathrm{wt}\, a}}{x^2} Y(a, x)u$$

for any homogeneous element $a \in V_{\mathbb{Z}}$ and for any $u \in M$. Then $A_{V_{\mathbb{Z}}}(M) = M/I(M)$ is the $A(V_{\mathbb{Z}})$-bimodule defined in [FZ]. Thus it is enough for us to prove that the subspace
78
C. Dong, H. Li, G. Mason
O(M )/I(M ) is a sub-bimodule of AVZ (M ). Since the proof of this is parallel to the proof of Theorem 3.2 we omit the proof. The proof of (b) is similar to that for the Z-graded vertex operator algebra as in [Li2]. By definition A(V ) is a quotient algebra of A(VZ ). It is clear that A(V )Lie is a quotient algebra of G(V )0 . Then for any A(V )-module U , we may naturally view U as a G(V )0 -module. Proposition 3.4. For any A(V )-module U , L(U ) is a weak V -module. Proof. The proof is the same as in the ordinary case (see [Li2] and [Z]) or the twisted case (see the proof of Theorem 6.3 of [DLiM1]). The following is a generalization of Theorem 2.2.2 of [Z]. See [Li2] or [DLiM1] for a similar proof. Theorem 3.5. The functor Ω gives rise to a one-to-one correspondence between the set of equivalence classes of irreducible Q+ -graded weak V -modules and the set of equivalence classes of irreducible A(V )-modules. As in the case of the Z-graded vertex operator algebra, we have (see the proof of Theorem 8.1 of [DLiM1]): Proposition 3.6. If V is rational, A(V ) is semisimple and any Q+ -graded weak V module is a direct sum of irreducible ordinary V -modules. Let V be a Q-graded vertex operator algebra. Then it is clear that exp(2πiL(0)) is an automorphism of V . Let M = ⊕h∈C M(h) be a V -module. Following [FHL], let M 0 = ⊕h∈C Mh∗ be the restricted dual of M and define hY (a, x)f, ui = hf, Y (exL(1) (eπi x−2 )L(0) a, x−1 )ui
(3.10)
0
for any $f \in M'$, $a \in V$, $u \in M$. The following proposition is essentially proved in [Li3].

Proposition 3.7. The pair $(M', Y(\cdot, x))$ gives rise to a $\sigma^2$-twisted V-module, where $\sigma = \exp(2\pi i L(0))$.

Remark 3.8. If V is $\frac{1}{2}\mathbb{Z}$-graded, then $M'$ is a V-module because $\sigma^2 = \mathrm{id}_V$. Therefore, we obtain a new functor from V-modules to V-modules. It is important to notice that the vertex operator algebra V may not be isomorphic to its own contragredient dual.

Let V be a Q-graded vertex operator algebra and let M be any weak V-module. Define $C_2(M)$ to be the subspace linearly spanned by $a_{-2}M$ for $a \in V_{\mathbb{Z}}$ and by $a_{-1}M$ for $a \in V_n$, $n \notin \mathbb{Z}$. Define bilinear products "·" and "◦" on V as follows: for $a \in V_m$, $b \in V_n$ we define $a \cdot b = a_{-1}b$ and $a \circ b = a_0 b$ if $m, n \in \mathbb{Z}$; otherwise we define $a \cdot b = 0$ and $a \circ b = 0$.

Lemma 3.9. The subspace $C_2(V)$ is a two-sided ideal for both $(V, \cdot)$ and $(V, \circ)$.

Proof. Let $a \in V_m$, $b \in V_n$, $c \in V_k$. If $m \notin \mathbb{Z}$ or $n + k \notin \mathbb{Z}$, by definition we have $a \cdot b_{-r}c = 0$ and $a \circ b_{-r}c = 0$ for r = 1 or 2. If $m, n + k \in \mathbb{Z}$, we get

$$a_{-j}(b_{-r}c) = b_{-r}a_{-j}c + \sum_{i=0}^{\infty}\binom{-j}{i}(a_i b)_{-j-r-i}\,c$$

for j = 1 or 0. Then $a \cdot (b_{-r}c),\ a \circ (b_{-r}c) \in C_2(V)$. Then the proof is complete.

Set $A_2(M) = M/C_2(M)$. By Lemma 3.9 we obtain a quotient algebra $A_2(V) = V/C_2(V)$. Similarly to [Z] we have:

Proposition 3.10. The quotient algebra $(A_2(V), \cdot)$ is a commutative associative algebra with the vacuum vector $\mathbf{1}$ as its identity and $(A_2(V), \circ)$ is a Lie algebra such that $(a \cdot b) \circ c = a \cdot (b \circ c) + (a \circ c) \cdot b$ for any $a, b, c \in A_2(V)$. Therefore $(A_2(V), \cdot, \circ)$ is a Poisson Lie algebra.

Definition 3.11. If $A_2(V)$ is finite-dimensional, we say V is $C_2$-finite or V satisfies the $C_2$-finiteness condition. If V as a Virasoro algebra module is generated by primary vectors, we say that V satisfies the Virasoro condition. If V as a vertex operator algebra is generated by $\omega$ and all primary vectors, we say that V satisfies the primary-field condition.

Remark 3.12. It has been proved in [Z] that if V is a rational vertex operator algebra with integral weights satisfying the $C_2$-finiteness condition and the Virasoro condition, then the space linearly spanned by $\mathrm{tr}_M\, q^{L(0) - \frac{c}{24}}$, where M runs through all irreducible V-modules, is modular invariant. If one replaces the Virasoro condition by the primary-field condition, one can check that Zhu's theorem also holds.

Recall the following proposition from [DLinM].

Proposition 3.13. Let $(V, Y, \mathbf{1}, \omega)$ be a vertex operator algebra of rank r and let $h \in V$ satisfy the following conditions:

$$L(n)h = \delta_{n,0}\,h, \qquad h_n h = \delta_{n,1}\,\lambda\mathbf{1} \quad \text{for } n \in \mathbb{Z}_+, \tag{3.11}$$
where $\lambda$ is a complex number. Then $(V, Y, \mathbf{1}, \omega + h_{-2}\mathbf{1})$ is a vertex algebra of rank $r - 12\lambda$.

Now we go back to the vertex operator algebra $L(\ell, 0)$. For any $z \in \mathbb{Q}$, we set $\omega_z = \omega + \frac{1}{2}z\,h(-2)\mathbf{1}$. Then it follows from Proposition 4.1 of [DLinM] that $\omega_z$ is a new Virasoro vector of $L(\ell, 0)$ with a central charge $\frac{3\ell}{\ell+2} - 6\ell z^2$. Thus $L_z(0) = (\omega_z)_1 = L(0) - \frac{1}{2}z\,h(0)$ so that

$$[L_z(0),\, x^m \otimes h] = -m\,(x^m \otimes h); \tag{3.12}$$

$$[L_z(0),\, x^m \otimes e] = (-m - z)(x^m \otimes e); \tag{3.13}$$

$$[L_z(0),\, x^m \otimes f] = (-m + z)(x^m \otimes f) \tag{3.14}$$

for any $m \in \mathbb{Z}$. In general, V is Q-graded by weights with respect to $L_z(0) = L(0) - \frac{1}{2}z\,h(0)$ instead of Z-graded. Consequently, we obtain a Q-grading for $\widetilde{sl}_2$ satisfying the conditions:

$$\deg(x^n \otimes e) = -n - z; \quad \deg(x^n \otimes f) = -n + z, \quad \deg(x^n \otimes h) = -n \quad \text{for } n \in \mathbb{Z}. \tag{3.15}$$
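The shifted central charge and the grading (3.15) are easy to tabulate with exact rational arithmetic. The following is only a minimal sketch: the function names `central_charge` and `shifted_degree` are hypothetical helpers introduced for illustration, not notation from the paper.

```python
from fractions import Fraction


def central_charge(level, z):
    """Central charge of the shifted Virasoro vector omega_z:
    3*l/(l+2) - 6*l*z^2, as stated in the text above."""
    level, z = Fraction(level), Fraction(z)
    return 3 * level / (level + 2) - 6 * level * z * z


def shifted_degree(generator, n, z):
    """Degree of x^n (tensor) generator with respect to L_z(0), following (3.15).

    generator is one of "e", "f", "h" (hypothetical string labels)."""
    shift = {"e": -1, "f": 1, "h": 0}[generator]
    return -Fraction(n) + shift * Fraction(z)


if __name__ == "__main__":
    z = Fraction(1, 2)
    # Integral level 1 with z = 1/2.
    print(central_charge(1, z))
    # Weights of e(-1), f(-1), h(-1), to be compared with (4.1).
    print([shifted_degree(g, -1, z) for g in "efh"])
```

For $z = 0$ the first function reduces to the usual central charge $3\ell/(\ell+2)$, and for $n = -1$ the second reproduces the weights $1 - z$, $1 + z$, $1$ of $e(-1)$, $f(-1)$, $h(-1)$.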
For a positive integral level $\ell$, all irreducible $L(\ell, 0)$-modules are integral modules. For a general rational level $\ell$, the admissible weight j may be non-integral. To make the graded spaces of $L(\ell, j)$ finite-dimensional, we assume $z \in \mathbb{Q}$, $0 < z < 1$. Let $M = \oplus_{n \in \mathbb{Q}_+} M(n)$ be any $\mathbb{Q}_+$-graded weak $(L(\ell, 0), \omega_z)$-module. Since $x^n \otimes e$, $x^{n+1} \otimes f$, $x^{n+1} \otimes h$ for $n \in \mathbb{Z}_+$ have negative degrees with respect to the operator $L_z(0)$, it is clear that M satisfies the condition of Proposition 2.20 so that M is completely reducible. Then we obtain
Theorem 3.14. The Q-graded vertex operator algebra $(L(\ell, 0), \omega_z)$ is rational and all irreducible modules (up to equivalence) are $L(\ell, j)$ for the admissible weights j.

Remark 3.15. It is easy to check that each eigenspace for $L_z(0)$ in $L(\ell, j)$ is finite-dimensional for admissible weight j. Thus $\mathrm{tr}_{L(\ell,j)}\, q^{L_z(0)} = \mathrm{tr}_{L(\ell,j)}\, q^{L(0) - \frac{1}{2}z h(0)}$ is well defined and is equal to $\mathrm{tr}_{L(\ell,j)^c}\, q^{L_z(0)}$. In fact they are convergent in the upper half plane (see Sect. 5).

Remark 3.16. Since $L_z(n) = L(n) - \frac{1}{2}(n+1)z\,h(n)$, e and f are primary vectors in $(L(\ell, 0), \omega_z)$. Because $L(\ell, 0)$ as a vertex operator algebra is generated by e, f, $(L(\ell, 0), \omega_z)$ satisfies the primary-field condition.

4. Fusion Rules and $C_2$-Finiteness of $(L(\ell, 0), \omega_z)$

The main goal of this section is to calculate the fusion rules and prove that $(L(\ell, 0), \omega_z)$ is $C_2$-finite. Throughout this section we assume that $\ell = -2 + \frac{p}{q}$, where p and q are coprime positive integers with $p \ge 2$ and that z is a fixed rational number satisfying $0 < z < 1$ (under which certain traces converge in some domain [KW1-2]). Let M be any weak $V(\ell, \mathbb{C})$-module. Since

$$\mathrm{wt}\,h(-1) = 1, \quad \mathrm{wt}\,e(-1) = 1 - z, \quad \mathrm{wt}\,f(-1) = 1 + z, \tag{4.1}$$

we have

$$\mathrm{Res}_x \frac{(1+x)^{[\mathrm{wt}\,f]}}{x^{m}}\, Y(f,x)u = (f(-m) + f(1-m))u; \tag{4.2}$$

$$\mathrm{Res}_x \frac{(1+x)^{[\mathrm{wt}\,e]}}{x^{m}}\, Y(e,x)u = e(-m)u; \tag{4.3}$$

$$\mathrm{Res}_x \frac{(1+x)^{\mathrm{wt}\,h}}{x^{m+1}}\, Y(h,x)u = (h(-m-1) + h(-m))u \tag{4.4}$$
for any positive integer m and for $u \in M$. By definition all those elements in (4.2)–(4.4) are in O(M).

Proposition 4.1. Let M be any weak $V(\ell, \mathbb{C})$-module. Then the space O(M) is spanned by all the elements in (4.2)–(4.4).

Proof. Let W be the subspace linearly spanned by all the elements in (4.2)–(4.4). Set

$$C = \mathbb{C}[x^{-1}](x^{-1} + 1) \otimes f + \mathbb{C}[x^{-1}]x^{-1} \otimes e + \mathbb{C}[x^{-1}](x^{-2} + x^{-1}) \otimes h. \tag{4.5}$$

Then $W = C \cdot M$. Since $[h(-k), C] \subseteq C$ for any positive integer k, we get $h(-k)W \subseteq W$. Let L be the linear span of homogeneous elements a of $V(\ell, \mathbb{C})$ such that for any positive integer n,

$$\mathrm{Res}_x \frac{(1+x)^{[\mathrm{wt}\,a]}}{x^{n+\varepsilon(a)}}\, Y(a,x)M \subseteq W. \tag{4.6}$$
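We shall prove that L is equal to $V(\ell, \mathbb{C})$. For any homogeneous element a of L and for any nonnegative integers $m \ge n$, we have

$$\mathrm{Res}_x \frac{(1+x)^{[\mathrm{wt}\,a]+n}}{x^{m+1+\varepsilon(a)}}\, Y(a,x)M \subseteq W \tag{4.7}$$

because

$$\frac{(1+x)^{[\mathrm{wt}\,a]+n}}{x^{m+1+\varepsilon(a)}} = \sum_{i=0}^{\infty}\binom{n}{i}\frac{(1+x)^{[\mathrm{wt}\,a]}}{x^{m-i+1+\varepsilon(a)}}.$$

Let a be any homogeneous element of L and let k be any positive integer. Then for any $n \in \mathbb{N}$, $u \in M$, we have:

$$\begin{aligned}
&\mathrm{Res}_{z_2}\frac{(1+z_2)^{[\mathrm{wt}\,h(-k)a]}}{z_2^{\,n+\varepsilon(h(-k)a)}}\, Y(h(-k)a, z_2)u \\
&= \mathrm{Res}_{z_0}\mathrm{Res}_{z_2}\frac{(1+z_2)^{[\mathrm{wt}\,a]+k}}{z_2^{\,n+\varepsilon(a)}}\, z_0^{-k}\, Y(Y(h,z_0)a, z_2)u \\
&= \mathrm{Res}_{z_1}\mathrm{Res}_{z_2}\frac{(1+z_2)^{[\mathrm{wt}\,a]+k}}{z_2^{\,n+\varepsilon(a)}}\,(z_1 - z_2)^{-k}\, Y(h,z_1)Y(a,z_2)u \\
&\quad - \mathrm{Res}_{z_2}\mathrm{Res}_{z_1}\frac{(1+z_2)^{[\mathrm{wt}\,a]+k}}{z_2^{\,n+\varepsilon(a)}}\,(-z_2 + z_1)^{-k}\, Y(a,z_2)Y(h,z_1)u \\
&= \mathrm{Res}_{z_2}\sum_{i=0}^{\infty}\binom{-k}{i}(-z_2)^{i}\,\frac{(1+z_2)^{[\mathrm{wt}\,a]+k}}{z_2^{\,n+\varepsilon(a)}}\, h(-k-i)Y(a,z_2)u \\
&\quad - \mathrm{Res}_{z_2}\sum_{i=0}^{\infty}\binom{-k}{i}(-1)^{k+i}\,\frac{(1+z_2)^{[\mathrm{wt}\,a]+k}}{z_2^{\,n+k+i+\varepsilon(a)}}\, Y(a,z_2)h(i)u \\
&\equiv \mathrm{Res}_{z_2}\sum_{i=0}^{\infty}\binom{-k}{i} z_2^{\,i}\,\frac{(1+z_2)^{[\mathrm{wt}\,a]+k}}{z_2^{\,n+\varepsilon(a)}}\, h(-k)Y(a,z_2)u \mod W \\
&= h(-k)\,\mathrm{Res}_{z_2}\frac{(1+z_2)^{[\mathrm{wt}\,a]}}{z_2^{\,n+\varepsilon(a)}}\, Y(a,z_2)u \\
&\equiv 0 \mod W.
\end{aligned} \tag{4.8}$$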
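Here we used the relation $h(-k-i)w \equiv (-1)^i h(-k)w \pmod{W}$, which follows from (4.4). Therefore $h(-k)L \subseteq L$ for any $k \in \mathbb{N}$. Similarly, we have

$$\begin{aligned}
&\mathrm{Res}_{z_2}\frac{(1+z_2)^{[\mathrm{wt}\,e(-k)a]}}{z_2^{\,n+\varepsilon(e(-k)a)}}\, Y(e(-k)a, z_2)u \\
&= \mathrm{Res}_{z_2}\sum_{i=0}^{\infty}\binom{-k}{i}(-z_2)^{i}\,\frac{(1+z_2)^{[\mathrm{wt}\,e(-k)a]}}{z_2^{\,n+\varepsilon(e(-k)a)}}\, e(-k-i)Y(a,z_2)u \\
&\quad - \mathrm{Res}_{z_2}\sum_{i=0}^{\infty}\binom{-k}{i}(-1)^{k+i}\,\frac{(1+z_2)^{[\mathrm{wt}\,e(-k)a]}}{z_2^{\,n+k+i+\varepsilon(e(-k)a)}}\, Y(a,z_2)e(i)u \\
&\equiv -\mathrm{Res}_{z_2}\sum_{i=0}^{\infty}\binom{-k}{i}(-1)^{k+i}\,\frac{(1+z_2)^{[\mathrm{wt}\,e(-k)a]}}{z_2^{\,n+k+i+\varepsilon(e(-k)a)}}\, Y(a,z_2)e(i)u \mod W.
\end{aligned} \tag{4.9}$$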
If $\mathrm{wt}\,a \in \mathbb{Z}$, then $[\mathrm{wt}\,e(-k)a] = \mathrm{wt}\,a + k - 1$ and $\varepsilon(e(-k)a) = 0$. Then the last formula in (4.9) is equal to

$$-\mathrm{Res}_{z_2}\sum_{i=0}^{\infty}\binom{-k}{i}(-1)^{k+i}\,\frac{(1+z_2)^{\mathrm{wt}\,a+k-1}}{z_2^{\,n+k+i}}\, Y(a,z_2)e(i)u$$
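which is in W by (4.7) as $\mathrm{wt}\,e = 1 - z < 1$ and $[\mathrm{wt}\,a] + k - 1 \le [\mathrm{wt}\,e(-k)a] \le [\mathrm{wt}\,a] + k$. A similar discussion using (4.7) shows that the last expression of (4.9) is also in W if $\mathrm{wt}\,a \notin \mathbb{Z}$. Thus $\mathrm{Res}_{z_2}\frac{(1+z_2)^{[\mathrm{wt}\,e(-k)a]}}{z_2^{\,n+\varepsilon(e(-k)a)}}\, Y(e(-k)a, z_2)u \in W$. Analogously, writing $\varepsilon' = \varepsilon(f(-k)a)$ for brevity,

$$\begin{aligned}
&\mathrm{Res}_{z_2}\frac{(1+z_2)^{[\mathrm{wt}\,f(-k)a]}}{z_2^{\,n+\varepsilon'}}\, Y(f(-k)a, z_2)u \\
&= \mathrm{Res}_{z_2}\sum_{i=0}^{\infty}\binom{-k}{i}(-z_2)^{i}\,\frac{(1+z_2)^{[\mathrm{wt}\,a+z]+k}}{z_2^{\,n+\varepsilon'}}\, f(-k-i)Y(a,z_2)u \\
&\quad - \mathrm{Res}_{z_2}\sum_{i=0}^{\infty}\binom{-k}{i}(-1)^{k+i}\,\frac{(1+z_2)^{[\mathrm{wt}\,a+z]+k}}{z_2^{\,n+k+i+\varepsilon'}}\, Y(a,z_2)f(i)u \\
&\equiv \mathrm{Res}_{z_2}\sum_{i=0}^{\infty}\binom{-k}{i}(-1)^{k} z_2^{\,i}\,\frac{(1+z_2)^{[\mathrm{wt}\,a+z]+k}}{z_2^{\,n+\varepsilon'}}\, f(0)Y(a,z_2)u \\
&\quad - \mathrm{Res}_{z_2}\sum_{i=0}^{\infty}\binom{-k}{i}(-1)^{k+i}\,\frac{(1+z_2)^{[\mathrm{wt}\,a+z]+k}}{z_2^{\,n+k+i+\varepsilon'}}\, Y(a,z_2)f(i)u \mod W \\
&= \mathrm{Res}_{z_2}(-1)^{k}\,\frac{(1+z_2)^{[\mathrm{wt}\,a+z]}}{z_2^{\,n+\varepsilon'}}\, f(0)Y(a,z_2)u
 - \mathrm{Res}_{z_2}(-1)^{k}\,\frac{(1+z_2)^{[\mathrm{wt}\,a+z]+k}}{z_2^{\,n+k+\varepsilon'}}\, Y(a,z_2)f(0)u \\
&\quad - \mathrm{Res}_{z_2}\sum_{i=1}^{\infty}\binom{-k}{i}(-1)^{k+i}\,\frac{(1+z_2)^{[\mathrm{wt}\,a+z]+k}}{z_2^{\,n+k+i+\varepsilon'}}\, Y(a,z_2)f(i)u \mod W \\
&\equiv \mathrm{Res}_{z_2}(-1)^{k}\,\frac{(1+z_2)^{[\mathrm{wt}\,a+z]}}{z_2^{\,n+\varepsilon'}}\,\bigl(f(0)Y(a,z_2) - Y(a,z_2)f(0)\bigr)u \\
&\quad - \mathrm{Res}_{z_2}\sum_{i=1}^{k}\binom{k}{i}(-1)^{k}\,\frac{(1+z_2)^{[\mathrm{wt}\,a+z]}}{z_2^{\,n+i+\varepsilon'}}\, Y(a,z_2)f(0)u \\
&\quad - \mathrm{Res}_{z_2}\sum_{i=1}^{\infty}\binom{-k}{i}(-1)^{k+i}\,\frac{(1+z_2)^{[\mathrm{wt}\,a+z]+k}}{z_2^{\,n+k+i+\varepsilon'}}\, Y(a,z_2)f(i)u \\
&\equiv \mathrm{Res}_{z_2}(-1)^{k}\,\frac{(1+z_2)^{[\mathrm{wt}\,f(0)a]}}{z_2^{\,n+\varepsilon(f(0)a)}}\, Y(f(0)a, z_2)u \\
&\quad - \mathrm{Res}_{z_2}\sum_{i=1}^{k}\binom{k}{i}(-1)^{k}\,\frac{(1+z_2)^{[\mathrm{wt}\,a+z]}}{z_2^{\,n+i+\varepsilon'}}\, Y(a,z_2)f(0)u \\
&\quad - \mathrm{Res}_{z_2}\sum_{i=1}^{\infty}\binom{-k}{i}(-1)^{k+i}\,\frac{(1+z_2)^{[\mathrm{wt}\,a+z]+k}}{z_2^{\,n+k+i+\varepsilon'}}\, Y(a,z_2)f(i)u \mod W.
\end{aligned} \tag{4.10}$$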
Since $\deg f(0)a = \deg a$, by the induction hypothesis, we have

$$\mathrm{Res}_{z_2}\frac{(1+z_2)^{[\mathrm{wt}\,f(0)a]}}{z_2^{\,n+\varepsilon(f(0)a)}}\, Y(f(0)a, z_2)u \in W.$$
Notice that $[\mathrm{wt}\,a + z] = [\mathrm{wt}\,a]$ or $[\mathrm{wt}\,a] + 1$. If $[\mathrm{wt}\,a + z] = [\mathrm{wt}\,a]$, by (4.7) the last two terms in (4.10) are in W no matter what $\varepsilon(f(-k)a)$ is. If $[\mathrm{wt}\,a + z] = [\mathrm{wt}\,a] + 1$, then $\mathrm{wt}\,a \notin \mathbb{Z}$ so that $\varepsilon(a) = 0$, and it is clear that the last two terms in (4.10) are in W again by (4.7). Since $\mathbf{1} \in L$ and $V(\ell, \mathbb{C}) = U(x^{-1}\mathbb{C}[x^{-1}] \otimes g)\mathbf{1}$ we get $L = V(\ell, \mathbb{C})$. Thus $O(M) \subseteq W$. Therefore $O(M) = W$.

Proposition 4.2. The associative algebra $A(V(\ell, \mathbb{C}))$ for the Q-graded vertex operator algebra $(V(\ell, \mathbb{C}), \omega_z)$ is isomorphic to the polynomial algebra $\mathbb{C}[x]$.

Proof. Define a linear map $\psi$ from $\mathbb{C}[x]$ to $A(V(\ell, \mathbb{C}))$ as follows:

$$\psi(g(x)) = g(h(-1))\mathbf{1} + O(V(\ell, \mathbb{C})) \tag{4.11}$$

for $g(x) \in \mathbb{C}[x]$. Since $[h(-1), h(0)] = 0$ and $h(0)\mathbf{1} = 0$, we get $g(h(-1))\mathbf{1} = g(h(-1) + h(0))\mathbf{1}$ for any $g(x) \in \mathbb{C}[x]$. Since $\mathrm{wt}\,h(-1) = 1$ it follows from Definition (3.4) that $h(-1) * u = (h(-1) + h(0))u$ for $u \in V(\ell, \mathbb{C})$. Thus $\psi$ is an algebra homomorphism. Recall $N_-$ and C from (2.3) and (4.5). Then $N_- = B \oplus \mathbb{C}f(0) \oplus \mathbb{C}h(-1)$. We have $U(N_-) = U(C)\,U(\mathbb{C}h(-1))\,U(\mathbb{C}f(0))$. By Proposition 4.1, $O(V(\ell, \mathbb{C})) = C\,V(\ell, \mathbb{C}) = C\,U(N_-)\mathbf{1} \simeq C\,U(C)\,U(\mathbb{C}h(-1))$. Therefore $\psi$ is an isomorphism.
Proposition 4.3. The A(V)-bimodule $A(M(\ell, j))$ is isomorphic to $\mathbb{C}[x, y]$ with the biaction as follows:

$$x * f(x,y) = \Bigl(x + j - 2y\frac{\partial}{\partial y}\Bigr)f(x,y), \qquad f(x,y) * x = x f(x,y) \tag{4.12}$$

for any $f(x, y) \in \mathbb{C}[x, y]$.

Proof. Let v be a (nonzero) lowest weight vector of $M(\ell, j)$. Then as in the proof of Proposition 4.2 we have $O(M(\ell, j)) = C\,U(C)\,U(\mathbb{C}h(-1))\,U(\mathbb{C}f(0))v \simeq C\,U(C)\,U(\mathbb{C}h(-1))\,U(\mathbb{C}f(0))$. Then

$$A(M(\ell, j)) = \bigoplus_{m,n \in \mathbb{Z}_+} \mathbb{C}\bigl(h(-1)^m f(0)^n v + O(M(\ell, j))\bigr).$$

By the definition of the left and right actions of $A(V(\ell, \mathbb{C}))$ on $A(M(\ell, j))$ in Theorem 3.3, we have

$$h(-1) * (h(-1)^m f(0)^n v) = (h(-1) + h(0))h(-1)^m f(0)^n v = (h(-1) + j - 2n)h(-1)^m f(0)^n v \tag{4.13}$$

and

$$(h(-1)^m f(0)^n v) * h(-1) = h(-1)(h(-1)^m f(0)^n v) = h(-1)^{m+1} f(0)^n v. \tag{4.14}$$

The proposition follows immediately if we set $x = h(-1) + O(M(\ell, j))$, $y = f(0) + O(M(\ell, j))$.

As a corollary of Propositions 2.10, 3.6 and Theorem 3.5 we obtain
Corollary 4.4. The associative algebra $A(L(\ell, 0))$ is semisimple and isomorphic to the quotient algebra $\mathbb{C}[x]/\langle f(x)\rangle$ of the polynomial algebra $\mathbb{C}[x]$ in x, where

$$f(x) = \prod_{r=0}^{p-2}\prod_{s=0}^{q-1}(x - r + st). \tag{4.15}$$
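The roots of $f(x)$ in (4.15) can be listed explicitly, which makes the semisimplicity claim concrete: the quotient of a polynomial algebra by a polynomial with pairwise distinct roots is a direct sum of copies of the ground field. The sketch below is illustrative only. It assumes $t = \ell + 2 = p/q$ (the symbol t is defined in an earlier section not reproduced here, so this value is an assumption), and `corollary_roots` is a hypothetical helper name.

```python
from fractions import Fraction


def corollary_roots(p, q):
    """Roots r - s*t of f(x) = prod_{r=0}^{p-2} prod_{s=0}^{q-1} (x - r + s*t).

    Assumption: t = l + 2 = p/q, with p >= 2 and gcd(p, q) = 1."""
    t = Fraction(p, q)
    return [r - s * t for r in range(p - 1) for s in range(q)]


if __name__ == "__main__":
    # Integral level l = 2 (p = 4, q = 1).
    print(corollary_roots(4, 1))
    # Fractional level l = -1/2 (p = 3, q = 2).
    print(corollary_roots(3, 2))
```

Under the stated assumption the $(p-1)q$ roots are pairwise distinct whenever p and q are coprime, and they coincide with the set $P_\ell = \{n - kt\}$ of admissible weights used in Sect. 5.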
The following lemma is useful for calculating $A(L(\ell, j))$. The reader can refer to [FZ] for a proof.

Lemma 4.5. (a) Let V be a vertex operator algebra and let M be a V-module with a submodule W. Set $\bar{M} = M/W$. Then as an A(V)-bimodule $A(\bar{M}) \simeq M/(O(M) + W)$. (b) If I is an ideal of V then $(I + O(V))/O(V)$ is a two-sided ideal of A(V) and $A(V/I)$ is isomorphic to $A(V)/((I + O(V))/O(V))$. (c) If I is an ideal of V, and $I \cdot M \subset W$ ($I \cdot M$ means the linear span of elements $v_n w$ for $v \in I$, $n \in \mathbb{Z}$ and $w \in M$), then $I * A(M) \subset (W + O(M))/O(M)$, $A(M) * I \subset (W + O(M))/O(M)$, and $A(M)/((W + O(M))/O(M))$ is isomorphic to $A(M/W)$ as $A(V/I)$-bimodules.

Proposition 4.6. Let $j = n - 1 - (k-1)t$ be an admissible weight. Then the $A(L(\ell, 0))$-bimodule $A(L(\ell, j))$ is isomorphic to the quotient space of $\mathbb{C}[x, y]$ modulo the subspace

$$\mathbb{C}[x,y]y^n + \mathbb{C}[x]f_{j,0}(x,y) + \mathbb{C}[x]f_{j,1}(x,y) + \cdots + \mathbb{C}[x]f_{j,n-1}(x,y)$$

where

$$f_{j,i}(x,y) = y^i\prod_{r=0}^{p-n-1}\prod_{s=0}^{q-k}(x - r - i + st).$$

The left and right actions of $A(L(\ell, 0))$ on $A(L(\ell, j))$ are given by (4.12).

Proof. First, $M(\ell, j) \simeq U(N_-)$ as a vector space. Recall that $B_0 = \mathbb{C}(x^{-1} + 1) \otimes f + (x^{-2} + x^{-1})\mathbb{C}[x^{-1}] \otimes g$. Since $C = B_0 \oplus \mathbb{C}x^{-1} \otimes e$, by Proposition 4.1 $O(M(\ell, j)) = C\,M(\ell, j) \simeq B_0 U(N_-) + e(-1)U(N_-)$. Since $B_0$ is an ideal of $N_-$, $U(N_-)B_0 = B_0 U(N_-)$ is an ideal of $U(N_-)$. Set $L_0 = N_-/B_0$. Recall from Sect. 2 that $T_+ = e(-1) + B_0$, $T_- = f + B_0$ and $T_0 = h(-1) + B_0$. Then $L_0$ is a Lie algebra spanned by $T_+, T_-, T_0$ and isomorphic to g (see (2.13)). Recall from Proposition 2.4 that $v_{j,1}, v_{j,2}$ are the two singular vectors of $M(\ell, j)$. Then by Lemma 4.5 and Proposition 4.1 we have

$$\begin{aligned}
A(L(\ell, j)) &\simeq M(\ell, j)/(C\,M(\ell, j) + U(N_-)v_{j,1} + U(N_-)v_{j,2}) \\
&\simeq U(N_-)/(B_0 U(N_-) + e(-1)U(N_-) + U(N_-)F_1(n,k) + U(N_-)F_2(n,k))
\end{aligned} \tag{4.16}$$

as $A(L(\ell, 0))$-bimodules. Note that $U(N_-)/B_0 U(N_-) \cong U(L_0)$. Thus

$$A(L(\ell, j)) \simeq U(L_0)/(U(L_0)P(F_1(n,k)) + U(L_0)P(F_2(n,k)) + T_+ U(L_0)). \tag{4.17}$$

For any nonnegative integers a, b, d, using Proposition 2.7, (2.14) and the fact that $G_\alpha = T_+T_- - (\alpha + 1)T_0 + \alpha(\alpha + 1)$ we obtain
$$\begin{aligned}
T_+^a T_0^b T_-^d P(F_1(n,k))
&= T_+^a T_0^b T_-^d \Bigl(\prod_{r=0}^{n-1}\prod_{s=1}^{k-1} G_{r+st}\Bigr) T_-^{n} \\
&= T_+^a \Bigl(\prod_{r=0}^{n-1}\prod_{s=1}^{k-1} G_{r+d+st}\Bigr) T_0^b T_-^{d+n} \\
&= T_+^a \Bigl(\prod_{r=0}^{n-1}\prod_{s=1}^{k-1} \bigl(T_+T_- - (r+d+1+st)T_0 + (r+d+st)(r+d+1+st)\bigr)\Bigr) T_0^b T_-^{n+d} \\
&\equiv T_+^a \Bigl(\prod_{r=0}^{n-1}\prod_{s=1}^{k-1} (-r-d-1-st)(T_0 - r - d - st)\Bigr) T_0^b T_-^{n+d} \mod T_+U(L_0).
\end{aligned} \tag{4.18}$$
Noticing that $-r - d - 1 - st \ne 0$ for any $0 \le r \le n-1$, $1 \le s \le k-1$, $d \in \mathbb{Z}_+$ we obtain

$$U(L_0)P(F_1(n,k)) + T_+U(L_0) = T_+U(L_0) + \sum_{d=0}^{\infty}\mathbb{C}[T_0]\Bigl(\prod_{r=0}^{n-1}\prod_{s=1}^{k-1}(T_0 - r - d - st)\Bigr)T_-^{n+d}. \tag{4.19}$$
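Similarly, let a, b, d be any nonnegative integers. If $d < p - n$, we have

$$\begin{aligned}
T_+^a T_0^b T_-^d P(F_2(n,k))
&= T_+^a T_0^b T_-^d\, T_+^{p-n}\prod_{r=1}^{p-n}\prod_{s=1}^{q-k} G_{p-n-r-st} \\
&= T_+^a T_0^b \Bigl(\prod_{i=0}^{d-1} G_i\Bigr) T_+^{p-n-d}\prod_{r=1}^{p-n}\prod_{s=1}^{q-k} G_{p-n-r-st} \\
&= T_+^a T_0^b\, T_+^{p-n-d}\prod_{r=1}^{p-n}\prod_{s=1}^{q-k} G_{p-n-r-st}\prod_{i=0}^{d-1} G_{i+p-n-d} \\
&= T_+^{a+p-n-d}\,(T_0 - 2(a+p-n-d))^b\prod_{r=1}^{p-n}\prod_{s=1}^{q-k} G_{p-n-r-st}\prod_{i=0}^{d-1} G_{i+p-n-d} \\
&\equiv 0 \mod T_+U(L_0).
\end{aligned} \tag{4.20}$$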
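If $d = m + p - n$ for some $m \in \mathbb{Z}_+$, we have

$$\begin{aligned}
T_+^a T_0^b T_-^d P(F_2(n,k))
&= T_+^a T_0^b T_-^m \Bigl(\prod_{i=0}^{p-n-1} G_i\Bigr)\prod_{r=1}^{p-n}\prod_{s=1}^{q-k} G_{p-n-r-st} \\
&= T_+^a T_0^b T_-^m \prod_{r=1}^{p-n}\prod_{s=0}^{q-k} G_{p-n-r-st} \\
&= T_+^a \Bigl(\prod_{r=1}^{p-n}\prod_{s=0}^{q-k} G_{p+m-n-r-st}\Bigr) T_0^b T_-^m \\
&= T_+^a \Bigl(\prod_{r=0}^{p-n-1}\prod_{s=0}^{q-k} G_{m+r-st}\Bigr) T_0^b T_-^m \\
&\equiv T_+^a \Bigl(\prod_{r=0}^{p-n-1}\prod_{s=0}^{q-k} (-m-r-1+st)(T_0 - m - r + st)\Bigr) T_0^b T_-^m \mod T_+U(L_0).
\end{aligned} \tag{4.21}$$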
Since $-r - m - 1 + st \ne 0$ for any $0 \le r \le p-n-1$, $0 \le s \le q-k$, we obtain

$$U(L_0)P(F_2(n,k)) + T_+U(L_0) = T_+U(L_0) + \sum_{m=0}^{\infty}\mathbb{C}[T_0]\Bigl(\prod_{r=0}^{p-n-1}\prod_{s=0}^{q-k}(T_0 - m - r + st)\Bigr)T_-^{m}. \tag{4.22}$$
Thus

$$U(L_0)P(F_1(n,k)) + U(L_0)P(F_2(n,k)) + T_+U(L_0) \subset T_+U(L_0) + U(L_0)T_-^{n} + \sum_{i=0}^{n-1}\mathbb{C}[T_0]\Bigl(\prod_{r=0}^{p-n-1}\prod_{s=0}^{q-k}(T_0 - i - r + st)\Bigr)T_-^{i}.$$

On the other hand, since $r + d + st \ne m + r' - s't$ for any $0 \le r \le n-1$, $1 \le s \le k-1$, $0 \le r' \le p-n-1$, $0 \le s' \le q-k$, $d, m \in \mathbb{Z}_+$, the polynomials $\prod_{r=0}^{n-1}\prod_{s=1}^{k-1}(x - r - d - st)$ and $\prod_{r=0}^{p-n-1}\prod_{s=0}^{q-k}(x - m - r + st)$ are relatively prime. Then we obtain $\mathbb{C}[T_0]T_-^{n+i} \subseteq U(L_0)P(F_1(n,k)) + U(L_0)P(F_2(n,k)) + T_+U(L_0)$ for any $i \in \mathbb{Z}_+$. This shows that

$$U(L_0)P(F_1(n,k)) + U(L_0)P(F_2(n,k)) + T_+U(L_0) \supset T_+U(L_0) + U(L_0)T_-^{n} + \sum_{i=0}^{n-1}\mathbb{C}[T_0]\Bigl(\prod_{r=0}^{p-n-1}\prod_{s=0}^{q-k}(T_0 - i - r + st)\Bigr)T_-^{i}.$$
Set $x = T_0$, $y = T_-$. Then the proposition follows from Proposition 4.3 and Lemma 4.5.

Theorem 4.7. For admissible weights $j_i = n_i - 1 - (k_i - 1)t$ (i = 1, 2), the fusion rules are given as follows:

$$L(\ell, j_1) \times L(\ell, j_2) = \sum_{i=\max\{0,\,n_1+n_2-p\}}^{\min\{n_1-1,\,n_2-1\}} L(\ell, j_1 + j_2 - 2i) \tag{4.23}$$

if $0 \le k_2 - 1 \le q - k_1$, and $L(\ell, j_1) \times L(\ell, j_2) = 0$ otherwise.
Proof. For any admissible weight j, let $\mathbb{C}v_j$ be the one-dimensional module for the Lie algebra $\mathbb{C}h$ such that $hv_j = jv_j$. Then $\mathbb{C}v_j$ is the lowest weight space of $L(\ell, j)$. By Theorem 3.3 we need to calculate the $A(L(\ell, 0))$-module $A(L(\ell, j_1)) \otimes_{A(L(\ell,0))} \mathbb{C}v_{j_2}$. Using Proposition 4.6 we get $A(L(\ell, j_1)) \otimes_{A(L(\ell,0))} \mathbb{C}v_{j_2} \simeq \mathbb{C}[x, y]/J$ where J is the subspace of $\mathbb{C}[x, y]$ spanned by

$$\{x - j_2,\ \mathbb{C}[x,y]y^{n_1},\ f_{j_1,i}(j_2, 1)\,\mathbb{C}[x]y^i,\ i = 0, 1, \ldots, n_1 - 1\}. \tag{4.24}$$

If $j_2$ does not satisfy the relation $0 \le k_2 - 1 \le q - k_1$, then

$$f_{j_1,i}(j_2, 1) = \prod_{r=0}^{p-n_1-1}\prod_{s=0}^{q-k_1}(j_2 - r - i + st) \ne 0$$

for $0 \le i \le n_1 - 1$. Thus $A(L(\ell, j_1)) \otimes_{A(L(\ell,0))} \mathbb{C}v_{j_2} = 0$ so that all the corresponding fusion rules are zero.

Suppose $0 \le k_2 - 1 \le q - k_1$. As before $\mathbb{C}[x]y^i = 0$ in $\mathbb{C}[x, y]/J$ if $f_{j_1,i}(j_2, 1) \ne 0$. Notice that $f_{j_1,i}(j_2, 1) = \prod_{r=0}^{p-n_1-1}\prod_{s=0}^{q-k_1}(j_2 - r - i + st) = 0$ if and only if $j_2 - r - i + st = 0$ for some $0 \le r \le p - n_1 - 1$, $0 \le s \le q - k_1$. This implies that $0 \le r + i \le p - 2$. It follows from Remark 2.3 that $r + i = n_2 - 1$. That is, $n_1 + n_2 - p \le i \le n_2 - 1$. Therefore $\max\{0, n_1 + n_2 - p\} \le i \le \min\{n_1 - 1, n_2 - 1\}$. If $n_1 + n_2 - p \le i \le n_2 - 1$, then $\mathbb{C}[x]y^i$ is not zero in $\mathbb{C}[x, y]/J$. Thus $\mathbb{C}[x, y]/J \cong \oplus_{\max\{0,\,n_1+n_2-p\} \le i \le \min\{n_1-1,\,n_2-1\}} \mathbb{C}y^i$. From (4.12) we get $x * y^i = (j_2 + j_1 - 2i)y^i$, as required.
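The fusion rules of Theorem 4.7 are purely combinatorial and can be evaluated mechanically. The following is a minimal sketch under stated assumptions: it takes $t = \ell + 2 = p/q$ and the parameter ranges $1 \le n \le p-1$, $1 \le k \le q$ for admissible weights (both inferred from the description of $P_\ell$ in Sect. 5, not quoted from the definition in Sect. 2), and the names `admissible_weight` and `fusion` are hypothetical.

```python
from fractions import Fraction


def admissible_weight(n, k, p, q):
    """Weight j = n - 1 - (k - 1) * t, assuming t = p/q."""
    return n - 1 - (k - 1) * Fraction(p, q)


def fusion(n1, k1, n2, k2, p, q):
    """Weights j with L(l, j) appearing in L(l, j1) x L(l, j2), per (4.23).

    Returns an empty list when the condition 0 <= k2 - 1 <= q - k1 fails.
    Assumed ranges: 1 <= n_i <= p - 1 and 1 <= k_i <= q."""
    for n, k in ((n1, k1), (n2, k2)):
        if not (1 <= n <= p - 1 and 1 <= k <= q):
            raise ValueError("parameters outside the assumed admissible range")
    if not (0 <= k2 - 1 <= q - k1):
        return []
    j1 = admissible_weight(n1, k1, p, q)
    j2 = admissible_weight(n2, k2, p, q)
    lower = max(0, n1 + n2 - p)
    upper = min(n1 - 1, n2 - 1)
    return [j1 + j2 - 2 * i for i in range(lower, upper + 1)]


if __name__ == "__main__":
    # Integral level l = 2 (p = 4, q = 1): weights 1 x 1 and 2 x 2.
    print(fusion(2, 1, 2, 1, 4, 1))
    print(fusion(3, 1, 3, 1, 4, 1))
    # Fractional level l = -1/2 (p = 3, q = 2).
    print(fusion(1, 2, 1, 2, 3, 2))
```

For $q = 1$ the output agrees with the familiar truncated Clebsch-Gordan rule, and taking $n_1 = k_1 = 1$ (the weight $j_1 = 0$) returns the single weight $j_2$, as expected for the vacuum module.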
Remark 4.8. (a) Since $L_z(-1) = L(-1)$, the fusion rules among the admissible modules with respect to the two different vertex operator algebra structures of $L(\ell, 0)$ are the same. Thus the fusion rules obtained in Theorem 4.7 are also those with respect to the old vertex operator algebra structure. (b) After changing the notations one immediately sees that our results agree with Bernard and Felder's results [BF] on fusion rules obtained by using BRST cohomology. (c) Suppose that $\ell$ is an integer. That is, q = 1 and $p = \ell + 2$. Since $j_i = n_i - 1$, we have $n_1 + n_2 - p = j_1 + j_2 - \ell$. Since $k_i = 1$ for any i, $0 \le k_2 - 1 \le q - k_1$ holds automatically. Then

$$\begin{aligned}
L(\ell, j_1) \times L(\ell, j_2)
&= \sum_{i=\max\{0,\,n_1+n_2-p\}}^{\min\{n_1-1,\,n_2-1\}} L(\ell, j_1 + j_2 - 2i) \\
&= \sum_{\substack{i=0 \\ i \ge j_1+j_2-\ell}}^{\min\{j_1,\,j_2\}} L(\ell, j_1 + j_2 - 2i) \\
&= \sum_{\substack{j=|j_1-j_2| \\ j+j_1+j_2 \le 2\ell}}^{j_1+j_2} L(\ell, j).
\end{aligned} \tag{4.25}$$

This is exactly the well-known fusion formula (cf. [GW, TK]).
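Proposition 4.9. Let M be any $V(\ell, \mathbb{C})$-module. Then $C_2(M) = (\mathbb{C}x^{-1} \otimes e + \mathbb{C}x^{-1} \otimes f + x^{-2}\mathbb{C}[x^{-1}] \otimes g)M$.

Proof. Since $\mathrm{wt}\,h = 1$ and $\mathrm{wt}\,e, \mathrm{wt}\,f \notin \mathbb{Z}$, by the definition of $C_2(M)$ we get

$$(\mathbb{C}x^{-1} \otimes e + \mathbb{C}x^{-1} \otimes f + x^{-2}\mathbb{C}[x^{-1}] \otimes g)M \subseteq C_2(M). \tag{4.26}$$

Set $B_1 = \mathbb{C}x^{-1} \otimes e + \mathbb{C}x^{-1} \otimes f + x^{-2}\mathbb{C}[x^{-1}] \otimes g$. Let a be a homogeneous element of $V(\ell, \mathbb{C})$ such that

$$\mathrm{Res}_{z_2}\, z_2^{-n-\varepsilon(a)}\, Y(a, z_2)M \subseteq B_1 M \tag{4.27}$$

for any positive integer n. Then for any $k, n \in \mathbb{N}$, $u \in M$, $b \in \{e, f, h\}$, we have

$$\begin{aligned}
&\mathrm{Res}_{z_2}\, z_2^{-n-\varepsilon(b(-k)a)}\, Y(b(-k)a, z_2)u \\
&= \mathrm{Res}_{z_0}\mathrm{Res}_{z_2}\, z_2^{-n-\varepsilon(b(-k)a)}\, z_0^{-k}\, Y(Y(b, z_0)a, z_2)u \\
&= \mathrm{Res}_{z_1}\mathrm{Res}_{z_2}\, z_2^{-n-\varepsilon(b(-k)a)}\,(z_1 - z_2)^{-k}\, Y(b, z_1)Y(a, z_2)u \\
&\quad - \mathrm{Res}_{z_1}\mathrm{Res}_{z_2}\, z_2^{-n-\varepsilon(b(-k)a)}\,(-z_2 + z_1)^{-k}\, Y(a, z_2)Y(b, z_1)u \\
&= \mathrm{Res}_{z_2}\sum_{i=0}^{\infty}\binom{-k}{i}(-z_2)^{i}\, z_2^{-n-\varepsilon(b(-k)a)}\, b(-k-i)Y(a, z_2)u \\
&\quad - \mathrm{Res}_{z_2}\sum_{i=0}^{\infty}\binom{-k}{i}(-1)^{k+i}\, z_2^{-n-k-i-\varepsilon(b(-k)a)}\, Y(a, z_2)b(i)u \\
&\equiv \mathrm{Res}_{z_2}\, z_2^{-n-\varepsilon(b(-k)a)}\, b(-k)Y(a, z_2)u \mod B_1 M \\
&\equiv 0 \mod B_1 M.
\end{aligned} \tag{4.28}$$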
Clearly (4.27) holds for $a = \mathbf{1}$. Note that $V(\ell, \mathbb{C}) = U(x^{-1}\mathbb{C}[x^{-1}] \otimes g)\mathbf{1}$. It follows from (4.28) that (4.27) holds for all $a \in V(\ell, \mathbb{C})$. The proof is complete.

Theorem 4.10. The commutative associative algebra $A_2(L(\ell, 0), \omega_z)$ is isomorphic to the quotient algebra $\mathbb{C}[x]/\langle x^{(p-1)q}\rangle$. Consequently, $(L(\ell, 0), \omega_z)$ is $C_2$-finite.

Proof. First, notice that the Verma module $M(\ell, 0)$ is linearly isomorphic to $U(N_-)$. Recall from Sect. 2 that $B_2 = \mathbb{C}x^{-1} \otimes f + x^{-2}\mathbb{C}[x^{-1}] \otimes g$ is an ideal of $N_-$, and that $L_2 = N_-/B_2$ is the corresponding quotient Lie algebra spanned by $\bar{e} = e(-1) + B_2$, $\bar{f} = f(0) + B_2$, $\bar{h} = h(-1) + B_2$ and with the commutation relations

$$[\bar{e}, \bar{f}] = \bar{h}, \qquad [\bar{h}, \bar{e}] = [\bar{h}, \bar{f}] = 0. \tag{4.29}$$

By Proposition 4.9, we get $C_2(M(\ell, 0)) = B_2 M(\ell, 0) + e(-1)M(\ell, 0) \simeq B_2 U(N_-) + e(-1)U(N_-)$. One easily verifies that

$$A_2(L(\ell, 0)) \simeq U(L_2)/(\bar{e}\,U(L_2) + U(L_2)\bar{f} + U(L_2)P_2(F_2(1,1))).$$

For any $a, b, m \in \mathbb{Z}_+$ with $m \ge p - 1$ we obtain from Proposition 2.8 that

$$\begin{aligned}
\bar{e}^a \bar{h}^b \bar{f}^m P_2(F_2(1,1))
&= \bar{e}^a \bar{h}^b \bar{f}^m\, \bar{e}^{\,p-1}\prod_{r=1}^{p-1}\prod_{s=1}^{q-1} \bar{H}_{p-1-r-st} \\
&= \bar{e}^a \bar{h}^b \bar{f}^{m-p+1}\prod_{i=0}^{p-2}\bar{H}_i \prod_{r=1}^{p-1}\prod_{s=1}^{q-1} \bar{H}_{p-1-r-st} \\
&= \bar{e}^a \bar{h}^b \Bigl(\prod_{i=0}^{p-2}\bar{H}_{m-p+1+i}\prod_{r=1}^{p-1}\prod_{s=1}^{q-1} \bar{H}_{m-r-st}\Bigr)\bar{f}^{m-p+1} \\
&= \bar{e}^a \bar{h}^b \Bigl(\prod_{r=0}^{p-2}\prod_{s=0}^{q-1} \bar{H}_{r+m-p+1-st}\Bigr)\bar{f}^{m-p+1}.
\end{aligned} \tag{4.30}$$
Here we used the relations $\bar{H}_\alpha \bar{e} = \bar{e}\bar{H}_{\alpha+1}$, $\bar{H}_\alpha \bar{f} = \bar{f}\bar{H}_{\alpha+1}$ and $\bar{f}^s \bar{e}^s = \bar{H}_0 \cdots \bar{H}_{s-1}$, which follow from the definition of $\bar{H}_\alpha$ and the commutator relations (4.29). Thus if $a > 0$ or $m > p - 1$, then $\bar{e}^a \bar{h}^b \bar{f}^{m} P_2(F_2(1,1)) \in \bar{e}\,U(L_2) + U(L_2)\bar{f}$. If $a = 0$ and $m = p - 1$ we have

$$\bar{h}^b \bar{f}^{p-1} P_2(F_2(1,1)) = \bar{h}^b\prod_{r=0}^{p-2}\prod_{s=0}^{q-1}\bar{H}_{r-st} \equiv \bar{h}^b\prod_{r=0}^{p-2}\prod_{s=0}^{q-1}(-r-1+st)\,\bar{h} \mod \bar{e}\,U(L_2). \tag{4.31}$$

Similarly, if $m < p - 1$, for any $a, b \in \mathbb{Z}_+$ we get $\bar{e}^a \bar{h}^b \bar{f}^m P_2(F_2(1,1)) \in \bar{e}\,U(L_2)$. Since $-r - 1 + st \ne 0$ for any $0 \le r \le p - 2$, $0 \le s \le q - 1$, we obtain

$$\bar{e}\,U(L_2) + U(L_2)\bar{f} + U(L_2)P_2(F_2(1,1)) = \bar{e}\,U(L_2) + U(\bar{h})\bar{f} + U(L_2)\bar{h}^{(p-1)q}.$$

Then the theorem follows if we set $x = \bar{h}$.
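5. Modular Invariance Property

In this section we study the modular invariance property of the space linearly spanned by all characters $\mathrm{tr}_{L(\ell,j)}\, e^{2\pi i\tau\left(L(0) - \frac{1}{2}z h(0) - \frac{1}{24}\left(\frac{3\ell}{\ell+2} - 6\ell z^2\right)\right)}$, where $\mathrm{Im}\,\tau > 0$, $z \in \mathbb{Q}$, $0 < z < 1$. We shall first find a modular transformation formula for the modified characters for admissible modules. Following [K] or [KW1–KW2], for $m, n \in \mathbb{Z}$, $m > 0$ we define

$$\theta_{n,m}(\tau, z) = \sum_{j \in \mathbb{Z} + \frac{n}{2m}} e^{2m\pi i\tau(j^2 + jz)}, \quad z \in \mathbb{C}. \tag{5.1}$$

Set

$$\Theta_{n,m}(\tau) = \sum_{j \in \mathbb{Z} + \frac{n}{2m}} e^{2m\pi i\tau j^2}. \tag{5.2}$$

Then

$$\theta_{n,m}(\tau, z) = e^{2m\pi i\tau(-\frac{1}{4}z^2)}\sum_{j \in \mathbb{Z} + \frac{n}{2m}} e^{2m\pi i\tau(j + \frac{1}{2}z)^2} = e^{-\frac{1}{2}mz^2\pi i\tau}\sum_{j \in \mathbb{Z} + \frac{n+mz}{2m}} e^{2m\pi i\tau j^2}. \tag{5.3}$$

Suppose that $z = \frac{v}{u}$ is a rational number with $u > 0$. Then

$$\theta_{n,m}(\tau, z) = e^{-\frac{1}{2}mz^2\pi i\tau}\sum_{j \in \mathbb{Z} + \frac{nu+mv}{2mu}} e^{2m\pi i\tau j^2} = e^{-\frac{1}{2}mz^2\pi i\tau}\,\Theta_{nu+mv,\,mu}\Bigl(\frac{\tau}{u}\Bigr). \tag{5.4}$$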
As in Sect. 4, we let $\ell = -2 + \frac{p}{q}$ be a fixed admissible level, where $p \ge 2$ and q are relatively prime positive integers. Let $P_\ell$ be the set of all admissible weights (mod $\mathbb{C}\delta$) of level $\ell$. Then $P_\ell = \{j = n - kt \mid n, k \in \mathbb{Z}_+,\ n \le p - 2,\ k \le q - 1\}$. Set $c_\ell = \frac{3\ell}{\ell+2}$. For any rational number z, we set $c_{\ell,z} = c_\ell - 6\ell z^2$. In Sect. 4 we have studied the vertex operator algebra or chiral algebra $L(\ell, 0)$ under a different Virasoro vector $\omega_z$ which has a central charge $c_{\ell,z}$. That is, the rank of $(L(\ell, 0), \omega_z)$ is $c_{\ell,z}$. With this motivation we define the following characters:

$$\chi_j(\tau, z) := \mathrm{tr}_{L(\ell,j)}\, e^{2\pi i\tau(L_z(0) - \frac{1}{24}c_{\ell,z})} = \mathrm{tr}_{L(\ell,j)}\, e^{2\pi i\tau(L(0) - \frac{1}{2}z h(0) - \frac{1}{24}c_{\ell,z})}. \tag{5.5}$$
For an admissible weight $j = n - kt \in P_\ell$, set $a = pq$, $b_j^{\pm} = q(\pm(n + 1) - kt)$. Now we restrict z to be a positive rational number less than 1.

Remark 5.1. In [KW1–KW2], the following character has been considered:

$$\bar{\chi}_j(\tau, z) = \mathrm{tr}_{L(\ell,j)}\, e^{2\pi i\tau(L(0) - \frac{1}{2}z h(0) - \frac{1}{24}c_\ell)}, \tag{5.6}$$

and it was proved that

$$\bar{\chi}_j(\tau, z) = \frac{\theta_{b_j^+,a}(\tau, q^{-1}z) - \theta_{b_j^-,a}(\tau, q^{-1}z)}{\theta_{1,2}(\tau, z) - \theta_{-1,2}(\tau, z)}. \tag{5.7}$$
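Using KW's character formula we obtain

$$\begin{aligned}
\chi_j(\tau, z)
&= e^{\frac{1}{2}\ell z^2\pi i\tau}\,\bar{\chi}_j(\tau, z) \\
&= e^{\frac{1}{2}\ell z^2\pi i\tau}\,\frac{\theta_{b_j^+,a}(\tau, q^{-1}z) - \theta_{b_j^-,a}(\tau, q^{-1}z)}{\theta_{1,2}(\tau, z) - \theta_{-1,2}(\tau, z)} \\
&= e^{\frac{1}{2}\ell z^2\pi i\tau}\, e^{-\frac{1}{2}aq^{-2}z^2\pi i\tau}\, e^{z^2\pi i\tau}\,\frac{\Theta_{qub_j^+ + av,\,aqu}\bigl(\frac{\tau}{qu}\bigr) - \Theta_{qub_j^- + av,\,aqu}\bigl(\frac{\tau}{qu}\bigr)}{\Theta_{u+2v,\,2u}\bigl(\frac{\tau}{u}\bigr) - \Theta_{-u+2v,\,2u}\bigl(\frac{\tau}{u}\bigr)} \\
&= e^{\frac{1}{2}z^2\pi i\tau(\ell + 2 - aq^{-2})}\,\frac{\Theta_{qub_j^+ + av,\,aqu}\bigl(\frac{\tau}{qu}\bigr) - \Theta_{qub_j^- + av,\,aqu}\bigl(\frac{\tau}{qu}\bigr)}{\Theta_{u+2v,\,2u}\bigl(\frac{\tau}{u}\bigr) - \Theta_{-u+2v,\,2u}\bigl(\frac{\tau}{u}\bigr)} \\
&= \frac{\Theta_{qub_j^+ + av,\,aqu}\bigl(\frac{\tau}{qu}\bigr) - \Theta_{qub_j^- + av,\,aqu}\bigl(\frac{\tau}{qu}\bigr)}{\Theta_{u+2v,\,2u}\bigl(\frac{\tau}{u}\bigr) - \Theta_{-u+2v,\,2u}\bigl(\frac{\tau}{u}\bigr)}.
\end{aligned} \tag{5.8}$$

The last step uses $aq^{-2} = p/q = \ell + 2$. Then $\chi_j$ is a modular function with $c_{\ell,z}$ as the modular anomaly rather than $c_\ell$.

Remark 5.2. In [KW1] the following transformation law was given: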
$$\bar{\chi}_j(-\tau^{-1}, \tau z) = \frac{1}{2i}\sqrt{\frac{2}{a}}\sum_{j' \in P_\ell}\Bigl(e^{-i\pi b^+ b'^-/a} - e^{-i\pi b^+ b'^+/a}\Bigr)\bar{\chi}_{j'}(\tau, z). \tag{5.9}$$

Later in [KW2], a correction was made by adding the factor $e^{\frac{1}{2}\ell z^2\pi i\tau}$ on the right-hand side of (5.9). That is,

$$\bar{\chi}_j(-\tau^{-1}, \tau z) = \frac{1}{2i}\sqrt{\frac{2}{a}}\; e^{\frac{1}{2}\ell z^2\pi i\tau}\sum_{j' \in P_\ell}\Bigl(e^{-i\pi b^+ b'^-/a} - e^{-i\pi b^+ b'^+/a}\Bigr)\bar{\chi}_{j'}(\tau, z). \tag{5.10}$$

Based on the modular transformation law ((5.9) without the factor $e^{\frac{1}{2}\ell z^2\pi i\tau}$), the fusion rules have been calculated in [KS] and [MW] by using the Verlinde formula [V]. Unfortunately, some of them are negative. On the other hand, the correct formula (5.10) cannot be used to compute the fusion rules because the coefficients in (5.10) involve the variable $\tau$. This puzzles both mathematicians and physicists. For a Z-graded rational vertex operator algebra satisfying the $C_2$ condition and the Virasoro condition it is proved in [Z] that the space spanned by $\mathrm{tr}_M\, q^{L(0) - \frac{c}{24}}$ for all irreducible modules M is modular invariant. If the Virasoro condition is replaced by the primary-field condition (cf. Remark 3.16) one still has the modular invariance of the space by modifying Zhu's proof. Now we have a Q-graded rational vertex operator algebra $(L(\ell, 0), \omega_z)$ satisfying the $C_2$ condition (see Theorem 4.10) and the primary-field condition (see Remark 3.16). Unfortunately Zhu's modular invariance theorem [Z] does not apply to a Q-graded vertex operator algebra. This raises a question: is the space linearly spanned by the $\chi_j(\tau, z)$ modular invariant under the transformation $\tau \mapsto -\tau^{-1}$ with z being fixed? This question will be discussed in another paper.
References

[A] Adamovic, D.: Some rational vertex algebras. Preprint
[AM] Adamovic, D. and Milas, A.: Vertex operator algebras associated to modular invariant representations for $A_1^{(1)}$. Math. Research Lett. 2, 563–575 (1995)
[AY] Awata, H. and Yamada, Y.: Fusion rules for the fractional level $\widetilde{sl}_2$ algebra. Mod. Phys. Lett. A7, 1185 (1992)
[BF] Bernard, D. and Felder, G.: Fock representations and BRST cohomology in SL(2) current algebra. Commun. Math. Phys. 127, 145–168 (1990)
[DGK] Deodhar, V.V., Gabber, O. and Kac, V.: Structure of some categories of representations of infinite-dimensional Lie algebras. Adv. Math. 45, 92–116 (1982)
[DL] Dong, C. and Lepowsky, J.: Generalized Vertex Algebras and Relative Vertex Operators. Progress in Math. Vol. 112, Boston: Birkhäuser, 1993
[DLiM1] Dong, C., Li, H. and Mason, G.: Twisted representations of vertex operator algebras. Preprint, q-alg/9509005
[DLiM2] Dong, C., Li, H. and Mason, G.: Regularity of rational vertex operator algebras. Adv. in Math., to appear, q-alg/9508018
[DLinM] Dong, C., Lin, Z. and Mason, G.: On vertex operator algebras as $sl_2$-modules. In: Groups, Difference Sets, and the Monster, Proc. of a Special Research Quarter at The Ohio State University, Spring 1993, ed. by K.T. Arasu, J.F. Dillon, K. Harada, S. Sehgal and R. Solomon, Berlin-New York: Walter de Gruyter, 1996, pp. 349–362
[FM] Feigin, B. and Malikov, F.: Fusion algebra at a rational level and cohomology of nilpotent subalgebras of $\widetilde{sl}_2$. Preprint
[FFR] Feingold, A.J., Frenkel, I.B. and Ries, J.F.X.: Spinor construction of vertex operator algebras, triality and $E_8^{(1)}$. Contemp. Math. Vol. 121, Providence: Amer. Math. Soc., 1991
[FHL] Frenkel, I.B., Huang, Y.-Z. and Lepowsky, J.: On axiomatic approaches to vertex operator algebras and modules. Mem. Am. Math. Soc. 104, 1993
[FLM] Frenkel, I.B., Lepowsky, J. and Meurman, A.: Vertex Operator Algebras and the Monster. Pure and Applied Math., Vol. 134, New York: Academic Press, 1988
[FZ] Frenkel, I.B. and Zhu, Y.-C.: Vertex operator algebras associated to representations of affine and Virasoro algebras. Duke Math. J. 66, 123–168 (1992)
[F] Fuchs, D.B.: Two projections of singular vectors of Verma modules over the affine Lie algebra $A_1^{(1)}$. Funct. Anal. Appl. Vol. 23, No. 2, 81–83 (1989)
[GW] Gepner, D. and Witten, E.: String theory on group manifold. Nucl. Phys. B 287, 493–549 (1986)
[K] Kac, V.: Infinite-dimensional Lie Algebras, 3rd ed. Cambridge: Cambridge Univ. Press, 1990
[KK] Kac, V. and Kazhdan, D.: Highest weight representations for affine Lie algebras. Adv. Math. 34, 97–108 (1979)
[KW1] Kac, V.G. and Wakimoto, M.: Modular invariant representations of infinite-dimensional Lie algebras and superalgebras. Proc. Natl. Acad. Sci. USA Vol. 85, 4956–4960 (1988)
[KW2] Kac, V. and Wakimoto, M.: Classification of modular invariant representations of affine algebras. In: Infinite Dimensional Lie Algebras and Groups, Proceedings of the conference held at CIRM, Luminy, edited by Victor G. Kac, 1988
[KWa] Kac, V. and Wang, W.: Vertex operator superalgebras and representations. Contemporary Math. Vol. 175, 161–191 (1994)
[KZ] Knizhnik, V.G. and Zamolodchikov, A.B.: Current algebra and Wess-Zumino model in two dimensions. Nucl. Phys. B274, 83–103 (1984)
[KS] Koh, I.G. and Sorba, P.: Fusion rules and (sub)modular invariant partition functions in non-unitary theories. Phys. Lett. Vol. 215, 723–739 (1988)
[Le] Lepowsky, J.: Generalized Verma modules, loop space cohomology and Macdonald type identities. Ann. Sci. Ecole Norm. Sup. 12, 169–234 (1979)
[LW] Lepowsky, J. and Wilson, R.: Construction of the affine Lie algebra $A_1^{(1)}$. Commun. Math. Phys. 62, 43–53 (1978)
[Li1] Li, H.-S.: Local systems of vertex operators, vertex superalgebras and modules. J. Pure and Appl. Alg. 109, 143–195 (1996)
[Li2] Li, H.-S.: Representation theory and tensor product theory of vertex operator algebras. Ph.D. thesis, Rutgers University, 1994
[MFF] Malikov, F.G., Feigin, B.L. and Fuchs, D.B.: Singular vectors in Verma modules over Kac-Moody algebras. J. Funct. Anal. Appl. 20, No. 2, 25–37 (1986)
[MW] Mathieu, P. and Walton, A.A.: Fractional-level Kac-Moody algebras and nonunitary coset conformal theories. Prog. of Theor. Phys. Suppl. No. 102, 229–254 (1990)
[TK] Tsuchiya, A. and Kanie, Y.: Vertex operators in conformal field theory on $P^1$ and monodromy representations of braid group. In: Conformal Field Theory and Solvable Lattice Models, Advanced Studies in Pure Math. Vol. 16, Tokyo: Kinokuniya Company Ltd., 1988, pp. 297–372
[V] Verlinde, E.: Fusion rules and modular transformations in 2D conformal field theory. Nucl. Phys. B 300, 360–376 (1988)
[Z] Zhu, Y.: Modular invariance of characters of vertex operator algebras. J. Am. Math. Soc. 9, 237–302 (1996)
Communicated by G. Felder
Commun. Math. Phys. 184, 95 – 117 (1997)
Communications in Mathematical Physics © Springer-Verlag 1997

Geometrical Meaning of R-Matrix Action for Quantum Groups at Roots of 1

Fabio Gavarini

Dipartimento di Matematica, Istituto G. Castelnuovo, Università degli studi di Roma "La Sapienza", Piazzale Aldo Moro 5, 00185 Roma, Italy. E-mail: [email protected]

Received: 23 April 1996 / Accepted: 12 August 1996
Abstract: The present work splits into two parts: first, we perform a straightforward generalization of results from [Re], proving that quantum groups $U_q^M(\mathfrak{g})$ and their unrestricted specializations at roots of 1, in particular the function algebra $F[H]$ of the Poisson group $H$ dual of $G$, are braided; second, as a main contribution, we prove the convergence of the (specialized) R-matrix action to a birational automorphism of a $2\ell$-fold ramified covering of $\mathrm{Spec}\,(U_\varepsilon^M(\mathfrak{g}))^{\times 2}$ when $\varepsilon$ is a primitive $\ell$-th root of 1, and of a 2-fold ramified covering of $H$, thus giving a geometric content to the notion of braiding for quantum groups at roots of 1.

1. Definitions

1.1. Cartan data. Let $\mathfrak{g}$ be a complex finite dimensional semisimple Lie algebra of rank $n$, with Cartan matrix $A := (a_{ij})_{i,j=1,\dots,n}$; let $R$ be its root system, $Q$, resp. $P$, be its root lattice, resp. weight lattice; we fix a subset $R^+$ ($\subset R$) of positive roots, a basis $\{\alpha_1,\dots,\alpha_n\}$ ($\subset R^+$) of simple roots, and we let $\{\omega_1,\dots,\omega_n\}$ be the dual basis of $P$. We denote by $W$ the Weyl group of $\mathfrak{g}$, with generators $s_1,\dots,s_n$ (namely the reflections associated with simple roots), and we set $N := \#(R^+)$. Finally, we let $(d_1,\dots,d_n)$ be the (unique) $n$-tuple of relatively prime positive integers such that $(d_ia_{ij})_{i,j=1,\dots,n}$ is a symmetric positive definite matrix.

1.2. Quantum enveloping algebras. We briefly recall some definitions. The quantized universal enveloping algebra $U_h(\mathfrak{g})$ is the associative algebra with 1 over $\mathbb{C}[[h]]$ generated by $Y_1,\dots,Y_n$, $H_1,\dots,H_n$, $X_1,\dots,X_n$ with relations (for $i,j = 1,\dots,n$)
\[ H_iH_j - H_jH_i = 0, \qquad H_iY_j - Y_jH_i = -a_{ij}Y_j, \qquad H_iX_j - X_jH_i = a_{ij}X_j, \]
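\[ X_iY_j - Y_jX_i = \delta_{ij}\,\frac{\exp(hd_iH_i) - \exp(-hd_iH_i)}{\exp(hd_i) - \exp(-hd_i)}, \]
\[ \sum_{k=0}^{1-a_{ij}} (-1)^k \genfrac{[}{]}{0pt}{}{1-a_{ij}}{k}_{q_i} X_i^{1-a_{ij}-k} X_j X_i^{k} = 0 \qquad \forall\, i \neq j, \]
\[ \sum_{k=0}^{1-a_{ij}} (-1)^k \genfrac{[}{]}{0pt}{}{1-a_{ij}}{k}_{q_i} Y_i^{1-a_{ij}-k} Y_j Y_i^{k} = 0 \qquad \forall\, i \neq j, \]
where $q := \exp(h)$, $q_i := q^{d_i} = \exp(hd_i)$, and the Gaussian binomial $\genfrac{[}{]}{0pt}{}{m}{n}_q$ is defined by
\[ \genfrac{[}{]}{0pt}{}{m}{n}_q := \frac{[m]_q!}{[m-n]_q!\,[n]_q!}, \qquad [k]_q! := \prod_{s=1}^{k} [s]_q, \qquad [s]_q := \frac{q^s - q^{-s}}{q - q^{-1}} \]
for all $m, n, k, s \in \mathbb{N}_+$, $n \le m$, with $[s]_q$, $[k]_q!$, $\genfrac{[}{]}{0pt}{}{m}{n}_q \in \mathbb{C}[q,q^{-1}]$.

It is known that $U_h(\mathfrak{g})$ has a Hopf algebra structure, given by ($i = 1,\dots,n$)
\[ \Delta(Y_i) := Y_i \otimes \exp(-hd_iH_i) + 1 \otimes Y_i, \qquad S(Y_i) := -Y_i\exp(hd_iH_i), \qquad \epsilon(Y_i) := 0, \]
\[ \Delta(H_i) := H_i \otimes 1 + 1 \otimes H_i, \qquad S(H_i) := -H_i, \qquad \epsilon(H_i) := 0, \]
\[ \Delta(X_i) := X_i \otimes 1 + \exp(hd_iH_i) \otimes X_i, \qquad S(X_i) := -\exp(-hd_iH_i)X_i, \qquad \epsilon(X_i) := 0. \]

Let $M$ be a lattice such that $Q \le M \le P$: the quantized universal enveloping algebra $U_q^M(\mathfrak{g})$ (cf. [DP], §9) is the associative algebra with 1 over $\mathbb{C}(q)$ generated by $F_1,\dots,F_n$, $L_\mu$ ($\forall\, \mu \in M$), $E_1,\dots,E_n$ with relations ($i,j = 1,\dots,n$; $\mu,\nu \in M$)
\[ L_\mu L_\nu = L_{\mu+\nu} = L_\nu L_\mu, \qquad L_0 = 1, \]
\[ L_\mu E_j = q^{\langle\mu|\alpha_j\rangle} E_j L_\mu, \qquad L_\mu F_j = q^{-\langle\mu|\alpha_j\rangle} F_j L_\mu, \qquad E_iF_j - F_jE_i = \delta_{ij}\,\frac{L_{\alpha_i} - L_{-\alpha_i}}{q_i - q_i^{-1}}, \]
\[ \sum_{k=0}^{1-a_{ij}} (-1)^k \genfrac{[}{]}{0pt}{}{1-a_{ij}}{k}_{q_i} E_i^{1-a_{ij}-k} E_j E_i^{k} = 0 \qquad \forall\, i \neq j, \]
\[ \sum_{k=0}^{1-a_{ij}} (-1)^k \genfrac{[}{]}{0pt}{}{1-a_{ij}}{k}_{q_i} F_i^{1-a_{ij}-k} F_j F_i^{k} = 0 \qquad \forall\, i \neq j. \]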
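A Hopf algebra structure on $U_q^M(\mathfrak{g})$ is defined by ($i = 1,\dots,n$; $\mu \in M$)
\[ \Delta(F_i) := F_i \otimes L_{-\alpha_i} + 1 \otimes F_i, \qquad S(F_i) := -F_iL_{\alpha_i}, \qquad \epsilon(F_i) := 0, \]
\[ \Delta(L_\mu) := L_\mu \otimes L_\mu, \qquad S(L_\mu) := L_{-\mu}, \qquad \epsilon(L_\mu) := 1, \]
\[ \Delta(E_i) := E_i \otimes 1 + L_{\alpha_i} \otimes E_i, \qquad S(E_i) := -L_{-\alpha_i}E_i, \qquad \epsilon(E_i) := 0. \]
It is clear that $U_q^M(\mathfrak{g}) \hookrightarrow U_q^{M'}(\mathfrak{g})$ whenever $Q \le M \le M' \le P$, this being a Hopf algebra embedding. In the sequel we shall also use the notation $L_i := L_{\omega_i}$, $K_i := L_{\alpha_i}$ (for all $i = 1,\dots,n$).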
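The Gaussian binomials entering the quantum Serre relations are easy to tabulate. The following is a minimal sketch, not taken from the paper: it evaluates the symmetric q-numbers at a sample rational value of q (an assumption made only so that exact arithmetic is available), and the function names are hypothetical.

```python
from fractions import Fraction


def q_int(s, q):
    """Symmetric q-number [s]_q = (q^s - q^-s) / (q - q^-1)."""
    return (q**s - q**(-s)) / (q - 1 / q)


def q_factorial(k, q):
    """[k]_q! = [1]_q [2]_q ... [k]_q, with the empty product equal to 1."""
    result = Fraction(1)
    for s in range(1, k + 1):
        result *= q_int(s, q)
    return result


def gaussian_binomial(m, n, q):
    """[m choose n]_q = [m]_q! / ([m-n]_q! [n]_q!) for 0 <= n <= m."""
    return q_factorial(m, q) / (q_factorial(m - n, q) * q_factorial(n, q))


def serre_coefficients(a_ij, q_i):
    """Signed coefficients (-1)^k [1-a_ij choose k]_{q_i} for k = 0..1-a_ij."""
    top = 1 - a_ij
    return [(-1) ** k * gaussian_binomial(top, k, q_i) for k in range(top + 1)]


# Sample value of q (an assumption for illustration; any q with q^2 != 1 works).
q = Fraction(3, 2)
# For a_ij = -1 the Serre relation has three terms with these coefficients.
coeffs = serre_coefficients(-1, q)
```

For $a_{ij} = -1$ this reproduces the familiar coefficients $1, -(q + q^{-1}), 1$ of the relation $E_i^2E_j - (q_i + q_i^{-1})E_iE_jE_i + E_jE_i^2 = 0$.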
The very definitions imply the existence of a Hopf algebra monomorphism $j\colon U_q^Q(\mathfrak{g}) \hookrightarrow U_h(\mathfrak{g})$ given by $q \mapsto \exp(h)$, $F_i \mapsto Y_i$, $K_i \mapsto \exp(hd_iH_i)$, $E_i \mapsto X_i$; still from definitions it is also clear that this uniquely extends to an embedding $j\colon U_q^M(\mathfrak{g}) \hookrightarrow U_h(\mathfrak{g})$ for all lattices $M$; in particular $U_q^P(\mathfrak{g}) \hookrightarrow U_h(\mathfrak{g})$. Finally, we shall denote by $U_q^M(\mathfrak{b}_+)$, resp. $U_q^M(\mathfrak{b}_-)$, the Hopf subalgebra of $U_q^M(\mathfrak{g})$ generated by the $L_\mu$'s and $E_i$'s, resp. $F_i$'s: these are called quantum Borel subalgebras.

An interesting property that Hopf algebras can enjoy is quasitriangularity:

Definition 1.1. (cf. [Dr].) A Hopf algebra $H$ is called quasitriangular if there exists an invertible element $R \in H \otimes H$ (or an element of an appropriate completion of $H \otimes H$) such that
\[ R \cdot \Delta(a) \cdot R^{-1} = \mathrm{Ad}(R)(\Delta(a)) = \Delta^{op}(a) \qquad \forall\, a \in H, \tag{1.1} \]
\[ (\Delta \otimes \mathrm{id})(R) = R_{13}R_{23}, \tag{1.2} \]
\[ (\mathrm{id} \otimes \Delta)(R) = R_{13}R_{12}, \tag{1.3} \]
where $\Delta^{op}$ is the opposite comultiplication, i.e. $\Delta^{op}(a) = \sigma \circ \Delta(a)$ with $\sigma\colon H^{\otimes 2} \to H^{\otimes 2}$, $a \otimes b \mapsto b \otimes a$, and $R_{12}, R_{13}, R_{23} \in H^{\otimes 3}$ (or the appropriate completion of $H^{\otimes 3}$), $R_{12} = R \otimes 1$, $R_{23} = 1 \otimes R$, $R_{13} = (\sigma \otimes \mathrm{id})(R_{23}) = (\mathrm{id} \otimes \sigma)(R_{12})$.

As a corollary of this definition, $R$ satisfies the Yang-Baxter equation in $H^{\otimes 3}$ (cf. [Ta])
\[ R_{12}R_{13}R_{23} = R_{23}R_{13}R_{12}. \]
The quantum universal enveloping algebra $U_h(\mathfrak{g})$ is quasitriangular (for any Kac-Moody algebra $\mathfrak{g}$ indeed; cf. [Dr], [LS], [KR]): its R-matrix is
\[ R = \prod_{\alpha \in R^+} \exp_{q_\alpha}\!\Big( \big(q_\alpha^{-1} - q_\alpha\big)\, X_\alpha \otimes Y_\alpha \Big) \cdot \exp\Big( -h \sum_{i,j=1}^{n} B_{ij}\, H_i \otimes H_j \Big), \]
where $\prod_{\alpha \in R^+}$ denotes an ordered product (with respect to a fixed convex ordering of $R^+$), $q_\alpha := q^{d_\alpha}$ (where $d_\alpha$ is one-half the square length of the root $\alpha$; in particular $d_{\alpha_i} = d_i$ for all $i$), $(B_{ij})_{i,j=1,\dots,n} := (d_ia_{ij})^{-1}_{i,j=1,\dots,n}$, and $X_\alpha$, $Y_\alpha$ are q-analogues of root vectors (not unique, however) attached to the roots $\alpha$, $-\alpha$. On the other hand, this is not true, strictly speaking, for the $\mathbb{C}(q)$–algebras $U_q^M(\mathfrak{g})$: to be precise we need a slight modification of the notion of quasitriangularity, suggested by Reshetikhin, as follows:

Definition 1.2. (cf. [Re], Definition 2). A Hopf algebra $H$ is called braided if there exists an automorphism $R$ of $H \otimes H$ (or of an appropriate completion of $H \otimes H$) distinct from $\sigma\colon a \otimes b \mapsto b \otimes a$ such that
\[ R \circ \Delta = \Delta^{op}, \tag{1.4} \]
\[ (\Delta \otimes \mathrm{id}) \circ R = R_{13} \circ R_{23} \circ (\Delta \otimes \mathrm{id}), \tag{1.5} \]
\[ (\mathrm{id} \otimes \Delta) \circ R = R_{13} \circ R_{12} \circ (\mathrm{id} \otimes \Delta). \tag{1.6} \]
Here $R_{12}, R_{13}, R_{23}$ are the automorphisms of $H \otimes H \otimes H$ defined by $R_{12} = R \otimes \mathrm{id}$, $R_{23} = \mathrm{id} \otimes R$, $R_{13} = (\sigma \otimes \mathrm{id}) \circ (\mathrm{id} \otimes R) \circ (\sigma \otimes \mathrm{id})$.
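It follows from this definition that $R$ satisfies the Yang-Baxter equation in $\mathrm{End}(H^{\otimes 3})$:
\[ R_{12} \circ R_{13} \circ R_{23} = R_{23} \circ R_{13} \circ R_{12}. \tag{1.7} \]
Furthermore, it is clear that if $(H, R)$ is quasitriangular, then $(H, \mathrm{Ad}(R))$ is braided. Again from [Re] we recall another definition (but notice that, for later convenience, the present definition is slightly different from the original one in [Re]).

Definition 1.3. (cf. [Re], Definition 3) Let $H$ be a Hopf algebra, let $R^{(0)}$ be an algebra automorphism of $H \otimes H$ and $R^{(1)} \in H \otimes H$ an invertible element such that
\[ \mathrm{Ad}\big(R^{(1)}\big) \circ R^{(0)} \circ \Delta = \Delta^{op}, \tag{1.8} \]
\[ (\Delta \otimes \mathrm{id}) \circ R^{(0)} = R^{(0)}_{13} \circ R^{(0)}_{23} \circ (\Delta \otimes \mathrm{id}), \tag{1.9} \]
\[ (\mathrm{id} \otimes \Delta) \circ R^{(0)} = R^{(0)}_{13} \circ R^{(0)}_{12} \circ (\mathrm{id} \otimes \Delta), \tag{1.10} \]
\[ (\Delta \otimes \mathrm{id})\big(R^{(1)}\big) = R^{(1)}_{13} \cdot R^{(0)}_{13}\big(R^{(1)}_{23}\big), \tag{1.11} \]
\[ (\mathrm{id} \otimes \Delta)\big(R^{(1)}\big) = R^{(1)}_{13} \cdot R^{(0)}_{13}\big(R^{(1)}_{12}\big); \tag{1.12} \]
then $\big(H, \mathrm{Ad}(R^{(1)}) \circ R^{(0)}\big)$ is a braided Hopf algebra, and the element $R^{(1)}$ is called the universal R-matrix of $\big(H, \mathrm{Ad}(R^{(1)}) \circ R^{(0)}\big)$.

Finally we recall from [Ta] the strictly related notion below:

Definition 1.4. (cf. [Ta], §4) Let $H$ be a Hopf algebra, let $\Phi$ be an algebra automorphism of $H \otimes H$ and $C \in H \otimes H$ be an invertible element such that
\[ C^{-1} \cdot \Phi(\Delta^{op}(a)) \cdot C = \Delta(a), \tag{1.13} \]
\[ (\Phi_{23} \circ \Phi_{13})(C_{12}) = C_{12}, \tag{1.14} \]
\[ (\Phi_{12} \circ \Phi_{13})(C_{23}) = C_{23}, \tag{1.15} \]
\[ (\Delta \otimes \mathrm{id})(C) = \Phi_{23}(C_{13}) \cdot C_{23}, \tag{1.16} \]
\[ (\mathrm{id} \otimes \Delta)(C) = \Phi_{12}(C_{13}) \cdot C_{12}; \tag{1.17} \]
then we will say that $(H, C, \Phi)$ is a pretriangular Hopf algebra.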
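As a concrete sanity check of the Yang-Baxter equation of this section, one can test it on a small matrix. The sketch below is an illustration under stated assumptions and not part of the paper: it uses the standard 4×4 R-matrix of the two-dimensional representation of quantum $sl(2)$ (up to normalization), evaluated at a sample rational q, and builds $R_{12}$, $R_{23}$, $R_{13} = (\mathrm{id} \otimes \sigma)(R_{12})$ by hand. All function names are hypothetical.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    """Kronecker product, i.e. the matrix of a (x) b in the product basis."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def r_matrix(q):
    """Standard R-matrix on C^2 (x) C^2 in the basis 11, 12, 21, 22 (assumed form)."""
    d = q - 1 / q
    return [[q, 0, 0, 0],
            [0, 1, d, 0],
            [0, 0, 1, 0],
            [0, 0, 0, q]]


IDENTITY_2 = [[1, 0], [0, 1]]
SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]  # sigma


def yang_baxter_sides(q):
    """Return (R12 R13 R23, R23 R13 R12) as 8x8 matrices."""
    r = r_matrix(q)
    r12 = kron(r, IDENTITY_2)
    r23 = kron(IDENTITY_2, r)
    p23 = kron(IDENTITY_2, SWAP)
    r13 = matmul(matmul(p23, r12), p23)  # (id (x) sigma) applied to R12
    lhs = matmul(matmul(r12, r13), r23)
    rhs = matmul(matmul(r23, r13), r12)
    return lhs, rhs


lhs, rhs = yang_baxter_sides(Fraction(3, 2))
```

The two sides agree exactly because the arithmetic is rational; the same check with $R$ replaced by the flip $\sigma$ also passes, which is why Definition 1.2 explicitly excludes $\sigma$.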
2. Some q-Calculus

2.1. In this section we introduce some material to be used in the sequel; as standard references for q-special functions and related matters we quote [Ex] and [GR]. Let us introduce some q-symbols. We have the q-numbers
\[ (s)_q := \frac{q^s - 1}{q - 1}, \qquad (k)_q! := \prod_{s=1}^{k} (s)_q, \qquad \binom{m}{n}_q := \frac{(m)_q!}{(m-n)_q!\,(n)_q!} \]
for $s, k, m, n \in \mathbb{N}_+$, $n \le m$, with $(s)_q$, $(k)_q!$, $\binom{m}{n}_q \in \mathbb{C}[q,q^{-1}]$, and the symbol $(a;q)_n := \prod_{k=0}^{n-1} (1 - aq^k)$, for $n \in \mathbb{N}$, $a \in \mathbb{C}$. Now consider the function of $z$
\[ (z;q)_\infty := \prod_{n=0}^{\infty} (1 - zq^n) = \lim_{n \to \infty} (z;q)_n; \]
we regard it as an element of $\mathbb{C}(q)[[z]]$. The infinite product expressing $(z;q)_\infty$ converges to an analytic function of $z$ in any finite part of $\mathbb{C}$ if $q$ is a complex number such that $|q| < 1$; its Taylor series is then
\[ (z;q)_\infty = \sum_{n=0}^{\infty} \frac{(-1)^n q^{\binom{n}{2}}}{(q;q)_n}\, z^n. \]
Also the series
\[ e_q(z) := \sum_{n=0}^{\infty} \frac{1}{(q;q)_n}\, z^n, \qquad E_q(z) := \sum_{n=0}^{\infty} \frac{q^{\binom{n}{2}}}{(q;q)_n}\, z^n \]
both converge to analytic functions of $z$; moreover, one has
\[ e_q(z) = (z;q)_\infty^{-1}, \qquad E_q(z) = (-z;q)_\infty, \]
so that $E_q(-z)\,e_q(z) = 1$. Finally
\[ \exp_q(z) := \sum_{n=0}^{\infty} \frac{1}{(n)_{q^2}!}\, z^n, \]
thus one has
\[ \exp_q(z) = e_{q^2}\big( (1 - q^2)\, z \big). \]
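The identities of this section can be checked numerically by truncating the series and products. The following is a minimal sketch under assumptions of my own (real $q$ with $|q| < 1$, small $|z|$, fixed truncation orders); the function names are hypothetical and floating-point agreement is of course only evidence, not a proof.

```python
def q_pochhammer(a, q, n):
    """(a; q)_n = prod_{k=0}^{n-1} (1 - a q^k)."""
    result = 1.0
    for k in range(n):
        result *= 1.0 - a * q**k
    return result


def e_q(z, q, terms=80):
    """Truncation of e_q(z) = sum_n z^n / (q; q)_n."""
    return sum(z**n / q_pochhammer(q, q, n) for n in range(terms))


def big_e_q(z, q, terms=80):
    """Truncation of E_q(z) = sum_n q^(n choose 2) z^n / (q; q)_n."""
    return sum(q ** (n * (n - 1) // 2) * z**n / q_pochhammer(q, q, n)
               for n in range(terms))


def exp_q(z, q, terms=80):
    """Truncation of exp_q(z) = sum_n z^n / (n)_{q^2}!, (s)_p = (p^s - 1)/(p - 1)."""
    p = q * q
    total, factorial = 0.0, 1.0
    for n in range(terms):
        if n > 0:
            factorial *= (p**n - 1.0) / (p - 1.0)
        total += z**n / factorial
    return total


# Sample values (assumptions for illustration only).
q, z = 0.5, 0.3
inverse_check = e_q(z, q) * q_pochhammer(z, q, 200)     # e_q(z) (z; q)_inf
product_check = big_e_q(-z, q) * e_q(z, q)              # E_q(-z) e_q(z)
rescaling_gap = exp_q(z, q) - e_q((1 - q * q) * z, q * q)
```

Both `inverse_check` and `product_check` should be close to 1, and `rescaling_gap` close to 0, matching $e_q(z) = (z;q)_\infty^{-1}$, $E_q(-z)e_q(z) = 1$ and $\exp_q(z) = e_{q^2}((1-q^2)z)$.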
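We claimed above that $(z;q)_\infty$ is an analytic function of $z$ for $|q| < 1$; the following lemma describes the behavior of this function for $q \to \varepsilon$, $\varepsilon$ a root of 1.

Lemma 2.1. ([Re], Lemma 3.4.1) Let $\varepsilon$ be a primitive $\ell$-th root of 1, with $\ell$ odd. The asymptotic behavior of the function (of $q$) $(z;q)_\infty$ for $q \to \varepsilon$ is given by
\[ (z;q)_\infty = \exp\!\Big( \frac{-1}{q^{\ell^2} - 1} \int_0^{z^\ell} \frac{\log(1-t)}{t}\, dt \Big) \cdot \prod_{k=0}^{\ell-1} \big(1 - \varepsilon^k z\big)^{k/\ell} \cdot \big(1 + O(q - \varepsilon)\big) = \exp\!\Big( \frac{1}{q^{\ell^2} - 1} \sum_{n=1}^{\infty} \frac{1}{n^2}\, z^{\ell n} \Big) \cdot \prod_{k=0}^{\ell-1} \big(1 - \varepsilon^k z\big)^{k/\ell} \cdot \big(1 + O(q - \varepsilon)\big). \tag{2.1} \]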
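Proof. Taylor expansion of $\log(1-t)$ shows that the two expressions on the right-hand side of (2.1) are equivalent. Now, the function $(z;q)_\infty$ satisfies the difference equation
\[ \big(zq^\ell; q\big)_\infty = \frac{1}{(z;q)_\ell}\, (z;q)_\infty, \tag{2.2} \]
and it is uniquely determined by this property along with the condition $(0;q)_\infty = 1$. But
\[ \psi_z(q) := \exp\!\Big( \frac{1}{q^{\ell^2} - 1} \sum_{n=1}^{\infty} \frac{1}{n^2}\, z^{\ell n} \Big) \cdot \prod_{k=0}^{\ell-1} \big(1 - \varepsilon^k z\big)^{k/\ell} \]
has the asymptotic behavior, for $q \to \varepsilon$, of the solution of (2.2); in fact we have
\[ \frac{\psi_z(q)}{(z;q)_\ell \cdot \psi_{zq^\ell}(q)} = \frac{1}{\prod_{k=0}^{\ell-1} (1 - zq^k)} \cdot \frac{\exp\!\Big( \frac{1}{q^{\ell^2}-1} \sum_{n=1}^{\infty} \frac{1}{n^2} z^{\ell n} \Big) \cdot \prod_{k=0}^{\ell-1} (1 - \varepsilon^k z)^{k/\ell}}{\exp\!\Big( \frac{1}{q^{\ell^2}-1} \sum_{n=1}^{\infty} \frac{1}{n^2} q^{\ell^2 n} z^{\ell n} \Big) \cdot \prod_{k=0}^{\ell-1} (1 - \varepsilon^k zq^\ell)^{k/\ell}} \]
\[ = \prod_{k=0}^{\ell-1} \Big( \frac{1 - \varepsilon^k z}{1 - \varepsilon^k zq^\ell} \Big)^{k/\ell} \cdot \frac{1}{\prod_{k=0}^{\ell-1} (1 - zq^k)} \cdot \exp\!\Big( -\sum_{n=1}^{\infty} \frac{1}{n^2} \cdot \frac{q^{\ell^2 n} - 1}{q^{\ell^2} - 1}\, z^{\ell n} \Big) \]
\[ = \prod_{k=0}^{\ell-1} \Big( \frac{1 - \varepsilon^k z}{1 - \varepsilon^k zq^\ell} \Big)^{k/\ell} \cdot \frac{1}{\prod_{k=0}^{\ell-1} (1 - zq^k)} \cdot \exp\!\Big( -\sum_{n=1}^{\infty} \frac{1}{n^2}\, (n)_{q^{\ell^2}}\, z^{\ell n} \Big); \]
when $q \to \varepsilon$ we have
\[ \lim_{q \to \varepsilon} (n)_{q^{\ell^2}} = n, \qquad \lim_{q \to \varepsilon} \prod_{k=0}^{\ell-1} \Big( \frac{1 - \varepsilon^k z}{1 - \varepsilon^k zq^\ell} \Big)^{k/\ell} = 1, \]
\[ \lim_{q \to \varepsilon} \frac{1}{\prod_{k=0}^{\ell-1} (1 - zq^k)} = \Big( \prod_{k=0}^{\ell-1} \big(1 - z\varepsilon^k\big) \Big)^{-1} = \Big( \prod_{k=0}^{\ell-1} \varepsilon^k \cdot \prod_{k=0}^{\ell-1} \big(\varepsilon^{-k} - z\big) \Big)^{-1} = \big(1 - z^\ell\big)^{-1}, \]
thus we get
\[ \lim_{q \to \varepsilon} \frac{\psi_z(q)}{(z;q)_\ell \cdot \psi_{zq^\ell}(q)} = \frac{1}{1 - z^\ell} \cdot \exp\!\Big( -\sum_{n=1}^{\infty} \frac{1}{n}\, z^{\ell n} \Big) = \frac{\exp\big(\log(1 - z^\ell)\big)}{1 - z^\ell} = \frac{1 - z^\ell}{1 - z^\ell} = 1, \]
i.e. $\psi_z(q)$ satisfies (2.2) asymptotically for $q \to \varepsilon$. Moreover from the definition $\psi_0(q) = 1$. The claim follows.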
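The elementary facts used in the proof of Lemma 2.1 lend themselves to a quick numerical check. The sketch below is illustrative only: the sample values of $\ell$, $z$, $q$ and the truncation order are my own assumptions, and the helper names are hypothetical. It tests the factorization $\prod_{k=0}^{\ell-1}(1 - z\varepsilon^k) = 1 - z^\ell$, the difference equation (2.2) on truncated products, and the limit $(n)_{q^{\ell^2}} \to n$ as $q \to \varepsilon$.

```python
import cmath


def q_pochhammer(a, q, n):
    """(a; q)_n = prod_{k=0}^{n-1} (1 - a q^k)."""
    result = 1.0
    for k in range(n):
        result *= 1.0 - a * q**k
    return result


def root_of_unity(ell):
    """A primitive ell-th root of 1."""
    return cmath.exp(2j * cmath.pi / ell)


def cyclotomic_product(z, ell):
    """prod_{k=0}^{ell-1} (1 - z eps^k) for eps a primitive ell-th root of 1."""
    eps = root_of_unity(ell)
    result = 1.0
    for k in range(ell):
        result *= 1.0 - z * eps**k
    return result


def q_number(n, p):
    """(n)_p = (p^n - 1) / (p - 1)."""
    return (p**n - 1.0) / (p - 1.0)


# 1) Factorization over the ell-th roots of unity (ell odd).
z, ell = 0.3 + 0.2j, 5
factorization_gap = cyclotomic_product(z, ell) - (1.0 - z**ell)

# 2) Difference equation (2.2), tested on products truncated at order N.
q_real, z_real, ell_small, N = 0.6, 0.4, 3, 300
lhs = q_pochhammer(z_real * q_real**ell_small, q_real, N)
rhs = q_pochhammer(z_real, q_real, N + ell_small) / q_pochhammer(z_real, q_real, ell_small)

# 3) Limit (n)_{q^(ell^2)} -> n, with q taken close to eps.
q_near = root_of_unity(3) * (1.0 + 1e-6)
limit_gap = q_number(4, q_near ** 9) - 4.0
```

Here `factorization_gap` and `lhs - rhs` vanish up to rounding, while `limit_gap` is of the order of the distance between `q_near` and $\varepsilon$.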
3. Braiding of Quantum Enveloping Algebras
3.1. As we said, it is well known that quantum algebras $U_h(\mathfrak{g})$ are quasitriangular; this is proved by means of Drinfeld's method of the "quantum double" (cf. [Dr] and others). On the other hand, for the $\mathbb{C}(q)$-algebras $U_q^M(\mathfrak{g})$ the correct statement is that they are braided; for $\mathfrak{g} = sl(2)$, this is proved in [Re]: here we quickly perform the (straightforward) generalization.

To begin with we define a suitable completion of $U_q^M(\mathfrak{g})^{\otimes 2}$, namely
\[ U_q^M(\mathfrak{g})^{\widehat{\otimes} 2} := \Big\{ \sum_{n=0}^{+\infty} E_n \cdot P_n^- \otimes P_n^+ \cdot F_n \Big\}, \]
where $P_n^- \in U_q^M(\mathfrak{b}_-)$, $P_n^+ \in U_q^M(\mathfrak{b}_+)$ ($U_q^M(\mathfrak{b}_\pm)$ being opposite quantum Borel subalgebras), $E_n \in \sum_{|\beta| = n} \big(U_q^M(\mathfrak{g})\big)_\beta$, $F_n \in \sum_{|\beta| = -n} \big(U_q^M(\mathfrak{g})\big)_\beta$. It is clear that $U_q^M(\mathfrak{g})^{\widehat{\otimes} 2}$ is a completion of $U_q^M(\mathfrak{g})^{\otimes 2}$ as a Hopf algebra. From now on, as in [DD, DP], we set $E_\alpha := X_\alpha$, $F_\alpha := Y_\alpha$.
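Theorem 3.1. Let $R^{(0)}$ be the algebra automorphism of $U_q^M(\mathfrak{g})^{\widehat{\otimes} 2}$ defined by
\[ R^{(0)}(L_\mu \otimes 1) := L_\mu \otimes 1, \qquad R^{(0)}(1 \otimes L_\mu) := 1 \otimes L_\mu, \]
\[ R^{(0)}(E_i \otimes 1) := E_i \otimes L_{-\alpha_i}, \qquad R^{(0)}(1 \otimes E_i) := L_{-\alpha_i} \otimes E_i, \]
\[ R^{(0)}(F_i \otimes 1) := F_i \otimes L_{\alpha_i}, \qquad R^{(0)}(1 \otimes F_i) := L_{\alpha_i} \otimes F_i \]
($i = 1,\dots,n$; $\mu \in M$) and let $R^{(1)} \in U_q^M(\mathfrak{g})^{\widehat{\otimes} 2}$ be defined by
\[ R^{(1)} := \prod_{\alpha \in R^+} \exp_{q_\alpha}\!\Big( \big(q_\alpha^{-1} - q_\alpha\big)\, E_\alpha \otimes F_\alpha \Big). \]
Then $\big(U_q^M(\mathfrak{g}), \mathrm{Ad}(R^{(1)}) \circ R^{(0)}\big)$ is a braided Hopf algebra, with $R^{(1)}$ as R-matrix.

Proof. We just outline the main steps, details being trivial. First of all, direct computation on generators shows that (1.9) and (1.10) hold. Then define $C \in U_q^Q(\mathfrak{g})^{\widehat{\otimes} 2} \subset U_q^M(\mathfrak{g})^{\widehat{\otimes} 2}$ by
\[ C := \sum_{\beta \in Q_+} q^{(\beta,\beta)} \cdot K_\beta^{-1} \otimes K_\beta \cdot C_\beta, \]
where $Q_+ := Q \cap P_+$ is the positive root lattice and $C_\beta$ is the canonical element of the bilinear pairing $U_q^Q(\mathfrak{b}_+)_\beta \times U_q^Q(\mathfrak{b}_-)_{-\beta} \to \mathbb{C}(q)$ among quantum Borel algebras; let also $\Phi := \big(R^{(0)}\big)^{-1}$; then it is proved in [Ta], Theorem 4.3.3 that $\big(U_q^Q(\mathfrak{g}), C, \Phi\big)$ is a pretriangular Hopf algebra; the same proof also works for $U_q^M(\mathfrak{g})$ instead of $U_q^Q(\mathfrak{g})$. Now trivial checking yields $R^{(1)} = \Phi^{-1}(C)$ (more or less by definition); therefore $\big(U_q^M(\mathfrak{g}), C, \Phi\big)$ being pretriangular implies that $\big(U_q^M(\mathfrak{g}), \mathrm{Ad}(R^{(1)}) \circ R^{(0)}\big)$ is braided.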
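Remark 3.1. Applying the remarks in Sect. 2 we can provide a multiplicative formula for the universal R-matrix $R^{(1)}$ of $U_q^M(\mathfrak{g})$, namely
\[ R^{(1)} = \prod_{\alpha \in R^+} \exp_{q_\alpha}\!\Big( \big(q_\alpha^{-1} - q_\alpha\big) \cdot E_\alpha \otimes F_\alpha \Big) = \prod_{\alpha \in R^+} e_{q_\alpha^2}\!\Big( \big(q_\alpha^{-1} - q_\alpha\big) \cdot \big(1 - q_\alpha^2\big) \cdot E_\alpha \otimes F_\alpha \Big) \]
\[ = \prod_{\alpha \in R^+} e_{q_\alpha^2}\!\Big( \big(q_\alpha^{-1} - q_\alpha\big) \cdot q_\alpha \big(q_\alpha^{-1} - q_\alpha\big) \cdot E_\alpha \otimes F_\alpha \Big) = \prod_{\alpha \in R^+} e_{q_\alpha^2}\!\big( q_\alpha\, \overline{E}_\alpha \otimes \overline{F}_\alpha \big) = \prod_{\alpha \in R^+} \big( q_\alpha \cdot \overline{E}_\alpha \otimes \overline{F}_\alpha ;\, q_\alpha^2 \big)_\infty^{-1}, \]
where $\overline{E}_\alpha := \big(q_\alpha - q_\alpha^{-1}\big) E_\alpha$ and $\overline{F}_\alpha := \big(q_\alpha - q_\alpha^{-1}\big) F_\alpha$ denote modified root vectors; in other words
\[ R^{(1)} = \prod_{\alpha \in R^+} \big( q_\alpha \cdot \overline{E}_\alpha \otimes \overline{F}_\alpha ;\, q_\alpha^2 \big)_\infty^{-1}. \tag{3.1} \]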
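Definition 3.1. We let $\mathcal{U}_q^M(\mathfrak{g})$ be the $\mathbb{C}[q,q^{-1}]$–subalgebra of $U_q^M(\mathfrak{g})$ generated by $\big\{ \overline{F}_\alpha, L_\mu, \overline{E}_\alpha \,\big|\, \alpha \in R^+, \mu \in M \big\}$. Furthermore, for any $c \in \mathbb{C}$ we let
\[ U_c^M(\mathfrak{g}) := \mathcal{U}_q^M(\mathfrak{g}) \big/ (q - c)\, \mathcal{U}_q^M(\mathfrak{g}) \cong \mathcal{U}_q^M(\mathfrak{g}) \otimes_{\mathbb{C}[q,q^{-1}]} \mathbb{C} \]
(with $\mathbb{C} \cong \mathbb{C}[q,q^{-1}] / (q - c)$) be the corresponding specialized algebra.

Remark 3.2. The previous definition is different from but equivalent to the original one in [DP], §12, the equivalence arising from the very description of $\mathcal{U}_q^M(\mathfrak{g})$ made therein. It is also proved in [DP] that $\mathcal{U}_q^M(\mathfrak{g})$ is a $\mathbb{C}[q,q^{-1}]$–integer form of $U_q^M(\mathfrak{g})$.

3.2. Our goal now is to show that $\mathcal{U}_q^M(\mathfrak{g})$ is braided: to be precise, we could say that the autoquasitriangular structure of $U_q^M(\mathfrak{g})$ gives by restriction a braiding structure for $\mathcal{U}_q^M(\mathfrak{g})$. To begin with, we define a suitable completion of $\mathcal{U}_q^M(\mathfrak{g})^{\otimes 2}$ (mimicking Sect. 3.1), namely
\[ \mathcal{U}_q^M(\mathfrak{g})^{\widehat{\otimes} 2} := \Big\{ \sum_{n=0}^{+\infty} \overline{E}_n \cdot P_n^- \otimes P_n^+ \cdot \overline{F}_n \Big\}, \]
where $P_n^- \in \mathcal{U}_q^M(\mathfrak{b}_-)$, $P_n^+ \in \mathcal{U}_q^M(\mathfrak{b}_+)$, $\overline{E}_n \in \sum_{|\beta| = n} \big(\mathcal{U}_q^M(\mathfrak{g})\big)_\beta$, $\overline{F}_n \in \sum_{|\beta| = -n} \big(\mathcal{U}_q^M(\mathfrak{g})\big)_\beta$. It is clear that $\mathcal{U}_q^M(\mathfrak{g})^{\widehat{\otimes} 2}$ is a completion of $\mathcal{U}_q^M(\mathfrak{g})^{\otimes 2}$ as a Hopf algebra; moreover we have $\mathcal{U}_q^M(\mathfrak{g})^{\widehat{\otimes} 2} \subseteq U_q^M(\mathfrak{g})^{\widehat{\otimes} 2}$ via the natural embedding $\mathcal{U}_q^M(\mathfrak{g}) \hookrightarrow U_q^M(\mathfrak{g})$.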
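Proposition 3.1. Let $\mathcal{U}_q^M(\mathfrak{g})$ be the integer form of Definition 3.1. The restriction of $R^{(0)}$ (cf. Theorem 3.1) to $\mathcal{U}_q^M(\mathfrak{g})^{\widehat{\otimes} 2}$ is defined by
\[ \widetilde{R}^{(0)}(L_\mu \otimes 1) := L_\mu \otimes 1, \qquad \widetilde{R}^{(0)}(1 \otimes L_\mu) := 1 \otimes L_\mu, \]
\[ \widetilde{R}^{(0)}(\overline{E}_\alpha \otimes 1) := \overline{E}_\alpha \otimes L_{-\alpha}, \qquad \widetilde{R}^{(0)}(1 \otimes \overline{E}_\alpha) := L_{-\alpha} \otimes \overline{E}_\alpha, \]
\[ \widetilde{R}^{(0)}(\overline{F}_\alpha \otimes 1) := \overline{F}_\alpha \otimes L_\alpha, \qquad \widetilde{R}^{(0)}(1 \otimes \overline{F}_\alpha) := L_\alpha \otimes \overline{F}_\alpha \]
($\mu \in M$, $\alpha \in R^+$) so that $R^{(0)}$ restricts to an algebra automorphism $\widetilde{R}^{(0)}$ of $\mathcal{U}_q^M(\mathfrak{g})^{\widehat{\otimes} 2}$. Moreover, let $R^{(1)} \in U_q^M(\mathfrak{g})^{\widehat{\otimes} 2}$ be defined (as in Theorem 3.1) by
\[ R^{(1)} := \prod_{\alpha \in R^+} \exp_{q_\alpha}\!\Big( \big(q_\alpha^{-1} - q_\alpha\big)\, E_\alpha \otimes F_\alpha \Big) = \prod_{\alpha \in R^+} \big( q_\alpha \cdot \overline{E}_\alpha \otimes \overline{F}_\alpha ;\, q_\alpha^2 \big)_\infty^{-1}. \]
Then $\mathrm{Ad}\big(R^{(1)}\big)$ restricts to an automorphism $\widetilde{R}^{(1)}$ of $\mathcal{U}_q^M(\mathfrak{g})^{\widehat{\otimes} 2}$, and $\big(\mathcal{U}_q^M(\mathfrak{g}), \widetilde{R}\big)$, with $\widetilde{R} := \widetilde{R}^{(1)} \circ \widetilde{R}^{(0)}$, is a braided Hopf algebra.

Proof. The first part of the statement is trivial. As for the second, we must recall that the specialization $U_1^M(\mathfrak{g}) := \mathcal{U}_q^M(\mathfrak{g}) \big/ (q - 1)\, \mathcal{U}_q^M(\mathfrak{g})$ is a commutative $\mathbb{C}$-algebra (cf. [DP], §12). Now from (3.1) we have
\[ R^{(1)} = \prod_{\alpha \in R^+} \big( q_\alpha \cdot \overline{E}_\alpha \otimes \overline{F}_\alpha ;\, q_\alpha^2 \big)_\infty^{-1} = \prod_{\alpha \in R^+} R^{(1)}_\alpha, \]
letting $R^{(1)}_\alpha := \big( q_\alpha \cdot \overline{E}_\alpha \otimes \overline{F}_\alpha ;\, q_\alpha^2 \big)_\infty^{-1}$ for all $\alpha \in R^+$, and Lemma 2.1 (for $\varepsilon = 1$) gives
\[ R^{(1)}_\alpha = \exp\!\Big( \frac{1}{q - 1} \cdot \frac{1}{2d_\alpha} \cdot \varphi\big( q_\alpha \cdot \overline{E}_\alpha \otimes \overline{F}_\alpha \big) \Big) \cdot \big( 1 - q_\alpha \cdot \overline{E}_\alpha \otimes \overline{F}_\alpha \big)^{-1/2} \cdot \big( 1 + O(q - 1) \big), \]
where we set $\varphi(z) := \sum_{n=1}^{\infty} \frac{1}{n^2} z^n$, as usual. Therefore we fall within the framework of [Re], §3, hence we can apply Reshetikhin's trick to conclude: namely, applying Lemma 3.2.2 of [Re] we get
\[ \mathrm{Ad}\big(R^{(1)}_\alpha\big)(a) = R^{(1)}_\alpha \cdot a \cdot \big(R^{(1)}_\alpha\big)^{-1} \in \mathcal{U}_q^M(\mathfrak{g})^{\widehat{\otimes} 2} \]
for all $\alpha \in R^+$ and all $a \in \mathcal{U}_q^M(\mathfrak{g})^{\widehat{\otimes} 2}$, i.e. $\mathrm{Ad}\big(R^{(1)}_\alpha\big)$ restricts to an automorphism $\widetilde{R}^{(1)}_\alpha$ of $\mathcal{U}_q^M(\mathfrak{g})^{\widehat{\otimes} 2}$; thus also $\mathrm{Ad}\big(R^{(1)}\big) = \mathrm{Ad}\big( \prod_{\alpha \in R^+} R^{(1)}_\alpha \big) = \prod_{\alpha \in R^+} \mathrm{Ad}\big(R^{(1)}_\alpha\big) = \prod_{\alpha \in R^+} \widetilde{R}^{(1)}_\alpha$ does restrict to an automorphism $\widetilde{R}^{(1)}$ of $\mathcal{U}_q^M(\mathfrak{g})^{\widehat{\otimes} 2}$, as claimed. Then Theorem 3.1 yields the claim.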
Corollary 3.1. For any $c \in \mathbb{C}$, let $R_c$ be the algebra automorphism of $U_c^M(\mathfrak{g})^{\widehat{\otimes} 2}$ given by specialization of $\widetilde{R}$ at $q = c$. Then $\big(U_c^M(\mathfrak{g}), R_c\big)$ is a braided Hopf algebra.
4. The Geometrical Meaning of Braiding at Roots of 1

4.1. Geometric framework. In this section we turn to geometry: our aim is to show that the series describing the adjoint action of the R-matrix of a quantum group are more than formal objects, for they do converge, in a proper sense, so that such action does yield well-defined automorphisms of geometric objects.

Let $G$ be a connected simply connected semisimple Poisson algebraic group over $\mathbb{C}$ with $\mathfrak{g}$ as tangent Lie bialgebra; then there exists a uniquely defined connected simply connected semisimple affine algebraic Poisson group $H$ over $\mathbb{C}$ with tangent Lie bialgebra $\mathfrak{g}^*$ and algebra of polynomial functions $F[H]$, which is called the Poisson group dual of $G$ (cf. e.g. [DP], §11). Let $\ell \in \mathbb{N}$ be odd, $\ell > d := \max_i\{d_i\}$, or $\ell = 1$; then let $\varepsilon \in \mathbb{C}$ be a primitive $\ell$-th root of 1. As a matter of notation, let $U_\varepsilon := U_\varepsilon^P(\mathfrak{g})$, $Z_\varepsilon := Z(U_\varepsilon)$ (the centre of $U_\varepsilon$). Everything in the sequel can then be suitably extended to the case of the quantum group $U_q^M(\mathfrak{g})$ with general lattice $M$. From the analysis in [DP] (cf. also [DK, DKP]) we recall the following results:

(a) The subalgebra $Z_0$ of $U_\varepsilon$ generated by $\overline{E}_\alpha^{\,\ell}$, $\overline{F}_\alpha^{\,\ell}$, $L_i^\ell$ ($\alpha \in R^+$, $i = 1,\dots,n$) is central, i.e. $Z_0 \subseteq Z_\varepsilon$.
(b) $Z_\varepsilon$ and $Z_0$ inherit (from $U_q^P(\mathfrak{g})$) canonical structures of Poisson algebras; in particular, $Z_0$ is a Poisson Hopf algebra.
(c) There exists an isomorphism $Z_0 \cong F[H]$ as Poisson Hopf algebras (with respect to a suitable normalization of the Poisson bracket on $Z_0$), hence $\mathrm{Spec}\, Z_0 \cong H$ as Poisson (complex affine algebraic) groups. In particular (for $\ell = 1$) $\mathrm{Spec}\, U_1^P(\mathfrak{g}) \cong H$.

Recall that in [DK, DKP, DP] the spectra $\mathrm{Spec}\, U_\varepsilon$, $\mathrm{Spec}\, Z_\varepsilon$, and $\mathrm{Spec}\, Z_0$ are introduced as the sets of isomorphism classes of finite dimensional representations of the corresponding algebras $U_\varepsilon$, $Z_\varepsilon$, and $Z_0$; in particular $\mathrm{Spec}\, Z_\varepsilon$ and $\mathrm{Spec}\, Z_0$ can be identified with usual geometric objects, namely complex affine algebraic varieties describing the maximal spectrum of $Z_\varepsilon$ and $Z_0$; since $Z_0 \cong F[H]$ as Poisson Hopf algebras, we also have $\mathrm{Spec}\, Z_0 \cong H$ as Poisson affine algebraic groups (over $\mathbb{C}$); thus in the sequel we will also set $H_\varepsilon := \mathrm{Spec}\, Z_\varepsilon$ and $S_\varepsilon := \mathrm{Spec}\, U_\varepsilon$. The analysis in [DP] describes $\mathrm{Spec}\, U_\varepsilon$ as (the espace étalé of) a sheaf, or a fibre bundle, of algebras over $\mathrm{Spec}\, Z_0$ or $\mathrm{Spec}\, Z_\varepsilon$; in particular we can think of $U_\varepsilon$ as the algebra of global sections of this sheaf. Now set
\[ z_\lambda := L_\lambda^\ell \big|_{q=\varepsilon}, \qquad x_\alpha := \overline{E}_\alpha^{\,\ell} \big|_{q=\varepsilon}, \qquad y_\alpha := \overline{F}_\alpha^{\,\ell} \big|_{q=\varepsilon} \qquad \forall\, \alpha \in R^+,\ \lambda \in P, \]
and in particular $y_i := y_{\alpha_i}$, $z_i := z_{\omega_i}$, $x_i := x_{\alpha_i}$ ($i = 1,\dots,n$). Following [DK], §3.5, we denote by $\widehat{Z}_0$ the algebra of all formal power series in the $y_\alpha$'s, $z_i^{\pm 1}$'s, $x_\alpha$'s which converge to meromorphic functions for all complex values of the $y_\alpha$'s, $x_\alpha$'s, and all non-zero complex values of the $z_i$'s; then let $\widehat{U}_\varepsilon := \widehat{Z}_0 \otimes_{Z_0} U_\varepsilon$, $\widehat{Z}_\varepsilon := \widehat{Z}_0 \otimes_{Z_0} Z_\varepsilon$. In other words we can think of $\widehat{U}_\varepsilon$ as the algebra of global meromorphic sections of the corresponding bundle of algebras over $\mathrm{Spec}\, Z_0 \cong H$. Similar notations and definitions will be used when dealing with square tensor powers, like $Z_0^{\otimes 2}$, $Z_\varepsilon^{\otimes 2}$, and so on. Notice also that $\mathrm{Spec}(Z_0 \otimes Z_0) = \mathrm{Spec}\, Z_0 \times \mathrm{Spec}\, Z_0 = H \times H$, $\mathrm{Spec}(Z_\varepsilon \otimes Z_\varepsilon) = \mathrm{Spec}\, Z_\varepsilon \times \mathrm{Spec}\, Z_\varepsilon = H_\varepsilon \times H_\varepsilon$, $\mathrm{Spec}(U_\varepsilon \otimes U_\varepsilon) = \mathrm{Spec}(U_\varepsilon) \times \mathrm{Spec}(U_\varepsilon) = S_\varepsilon \times S_\varepsilon$.
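Warning: When dealing with cross-product spaces like $X \times Y$, we shall use left subscripts to denote functions of either of the two spaces, viz. ${}_2x := 1 \otimes x$, ${}_1\overline{E}_\alpha := \overline{E}_\alpha \otimes 1$, etc.

Let $H^{(N)}$ be any fixed ramified $N$-fold covering (for $N \in \mathbb{N} \cup \{\infty\}$) of $H$ (so that $H^{(N)} \times H^{(N)}$ is an $N$-fold covering of $H \times H$); then we denote by $H_\varepsilon^{(N)}$ and $S_\varepsilon^{(N)}$ the fiber products $H_\varepsilon^{(N)} := H^{(N)} \times_H H_\varepsilon$ and $S_\varepsilon^{(N)} := H^{(N)} \times_H S_\varepsilon$. Notice that $H_\varepsilon^{(N)} = \mathrm{Spec}\, \widehat{Z}_\varepsilon$ and $S_\varepsilon^{(N)} = \mathrm{Spec}\, \widehat{U}_\varepsilon$; furthermore, $H^{(N)}$ and $H_\varepsilon^{(N)}$ clearly have a unique Poisson structure compatible with the covering map, so that $H_\varepsilon^{(N)}$ is a (complex analytic) Poisson variety and $H^{(N)}$ is a (complex analytic) Poisson group. Finally, $\tau := \sigma^*\colon H \times H \to H \times H$ ($\sigma$ being defined in Sect. 1.3) is given by $(x,y) \mapsto (y,x)$; then $\tau^{(N)}\colon H^{(N)} \times H^{(N)} \to H^{(N)} \times H^{(N)}$, also given by $(x,y) \mapsto (y,x)$, is a lifting of $\tau$ to $H^{(N)} \times H^{(N)}$.

Fix now $\ell > 1$: we are ready for the next result, which claims that the "formal automorphism" $R_\varepsilon$ giving the braiding structure of $U_\varepsilon$ actually does converge in a proper sense.

Proposition 4.1. The algebra automorphism $R_\varepsilon\colon U_\varepsilon^{\widehat{\otimes} 2} \to U_\varepsilon^{\widehat{\otimes} 2}$ defines a meromorphic automorphism $R^*_{\varepsilon,\infty}$ of $S_\varepsilon^{(\infty)} \times S_\varepsilon^{(\infty)}$, which restricts to meromorphic Poisson automorphisms $R^*_{\varepsilon,\infty}\colon H_\varepsilon^{(\infty)} \times H_\varepsilon^{(\infty)} \to H_\varepsilon^{(\infty)} \times H_\varepsilon^{(\infty)}$ and $R^*_{\varepsilon,\infty}\colon H^{(\infty)} \times H^{(\infty)} \to H^{(\infty)} \times H^{(\infty)}$. Moreover, $R^*_{\varepsilon,\infty}$ and its restrictions enjoy the dual properties of (1.4–6); in particular, $R^*_{\varepsilon,\infty} \neq \tau$, and $m\big(R^*_{\varepsilon,\infty}(x,y)\big) = m(y,x) = y \cdot x$ for all elements $x, y$ of the Poisson group $H^{(\infty)}$ ($m$ and "$\cdot$" denoting the product of $H^{(\infty)}$).

Proof. The first step in the proof amounts to showing that the series $R_\varepsilon(x \otimes y)$ do converge almost everywhere on a suitable covering $S_\varepsilon^{(\infty)} \times S_\varepsilon^{(\infty)}$. Recall (cf. Proposition 3.1 and its proof) that $\widetilde{R} := \widetilde{R}^{(1)} \circ \widetilde{R}^{(0)}$, thus $R_\varepsilon := R^{(1)}_\varepsilon \circ R^{(0)}_\varepsilon$ with $R^{(0)}_\varepsilon := \widetilde{R}^{(0)} \bmod (q - \varepsilon)$ and $R^{(1)}_\varepsilon := \widetilde{R}^{(1)} \bmod (q - \varepsilon)$. For $R^{(0)}_\varepsilon$ the very definition implies that no problem of convergence (nor of domain of definition) occurs. For $R^{(1)}_\varepsilon$, recall that $\widetilde{R}^{(1)} := \mathrm{Ad}\big(R^{(1)}\big)\big|_{\mathcal{U}_q^P(\mathfrak{g})^{\widehat{\otimes} 2}}$ (with $\mathcal{U}_q^P(\mathfrak{g})$ the integer form of Definition 3.1) and
\[ R^{(1)} := \prod_{\alpha \in R^+} \big( q_\alpha \cdot \overline{E}_\alpha \otimes \overline{F}_\alpha ;\, q_\alpha^2 \big)_\infty^{-1} = \prod_{\alpha \in R^+} R^{(1)}_\alpha, \tag{4.1} \]
where $R^{(1)}_\alpha := \big( q_\alpha \cdot \overline{E}_\alpha \otimes \overline{F}_\alpha ;\, q_\alpha^2 \big)_\infty^{-1}$, like in the proof of Proposition 3.1; therefore
\[ \mathrm{Ad}\big(R^{(1)}\big) = \prod_{\alpha \in R^+} \mathrm{Ad}\big(R^{(1)}_\alpha\big) = \prod_{\alpha \in R^+} \mathrm{Ad}\Big( \big( q_\alpha \cdot \overline{E}_\alpha \otimes \overline{F}_\alpha ;\, q_\alpha^2 \big)_\infty^{-1} \Big). \tag{4.2} \]
Now again we apply Reshetikhin's trick: from [Re], Lemma 3.2.2 and formulas (3.2.10–11), and from our Lemma 2.1 we get
\[ \mathrm{Ad}\big(R^{(1)}_\alpha\big) \bmod (q - \varepsilon) = \mathrm{Ad}\Big( \big( q_\alpha \cdot \overline{E}_\alpha \otimes \overline{F}_\alpha ;\, q_\alpha^2 \big)_\infty^{-1} \Big) \bmod (q - \varepsilon) = \mathrm{Ad}\Big( \exp\Big( \frac{\Phi_\alpha}{q - \varepsilon} \Big) \Big) \bmod (q - \varepsilon) \]
\[ = \exp\big( \mathrm{ad}_{\{\,,\,\}}(\Phi_\alpha) \big) \bmod (q - \varepsilon) = \exp\Big( \mathrm{ad}_{\{\,,\,\}}\Big( \frac{\varepsilon}{2d_\alpha \ell^2} \cdot \varphi\big( q_\alpha \overline{E}_\alpha \otimes \overline{F}_\alpha \big) \Big) \Big) \bmod (q - \varepsilon) \tag{4.3} \]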
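with
\[ \Phi_\alpha := \Big( \frac{q - \varepsilon}{q_\alpha^{2\ell^2} - 1} \cdot \varphi\big( q_\alpha \overline{E}_\alpha \otimes \overline{F}_\alpha \big) - (q - \varepsilon) \cdot \log\Big( \prod_{k=0}^{\ell-1} \big( 1 - \varepsilon^k q_\alpha \overline{E}_\alpha \otimes \overline{F}_\alpha \big)^{k/\ell} \Big) \Big) \cdot \big( 1 + O(q - \varepsilon) \big) = \frac{q - \varepsilon}{q_\alpha^{2\ell^2} - 1} \cdot \varphi\big( q_\alpha \overline{E}_\alpha \otimes \overline{F}_\alpha \big) \bmod (q - \varepsilon) \]
and
\[ \varphi(t) := \int_0^{t^\ell} \frac{\log(1 - \tau)}{\tau}\, d\tau = -\sum_{n=1}^{\infty} \frac{t^{\ell n}}{n^2} \qquad \text{(by Taylor expansion)}. \]
Notice that
\[ \mathrm{ad}_{\{\,,\,\}}\big( t \cdot \varphi(x) \big)(y) = \{ t \cdot \varphi(x), y \} = -t \cdot \Big\{ \sum_{n=1}^{\infty} \frac{x^{\ell n}}{n^2},\, y \Big\} = -t \cdot \sum_{n=1}^{\infty} \frac{1}{n^2} \cdot \big\{ (x^\ell)^n, y \big\} \]
\[ = -t \cdot \sum_{n=1}^{\infty} \frac{1}{n^2} \cdot n\, (x^\ell)^{n-1} \cdot \{ x^\ell, y \} = -\sum_{n=1}^{\infty} \frac{(x^\ell)^{n-1}}{n} \cdot t \cdot \{ x^\ell, y \} \]
(because of Leibniz' rule: $\{\,\cdot\,, y\} = -\mathrm{ad}_{\{\,,\,\}}(y)$ is a derivation!), hence
\[ \mathrm{ad}_{\{\,,\,\}}\big( t \cdot \varphi(x) \big) = \psi\big(x^\ell\big) \cdot \mathrm{ad}_{\{\,,\,\}}\big( t \cdot x^\ell \big) \]
with $\psi(y) := \frac{\log(1-y)}{y} = -\sum_{n=0}^{\infty} \frac{y^n}{n+1}$ (by Taylor expansion again), and then
\[ \exp\Big( \mathrm{ad}_{\{\,,\,\}}\big( t \cdot \varphi(x) \big) \Big) = \exp\Big( \psi\big(x^\ell\big) \cdot \mathrm{ad}_{\{\,,\,\}}\big( t \cdot x^\ell \big) \Big); \]
together with (4.3) this gives
\[ \mathrm{Ad}\big(R^{(1)}_\alpha\big) \bmod (q - \varepsilon) = \exp\Big( \mathrm{ad}_{\{\,,\,\}}\Big( \frac{\varepsilon}{2d_\alpha \ell^2} \cdot \varphi\big( q_\alpha \overline{E}_\alpha \otimes \overline{F}_\alpha \big) \Big) \Big) \bmod (q - \varepsilon) \]
\[ = \exp\Big( \psi\big( q_\alpha^\ell\, \overline{E}_\alpha^{\,\ell} \otimes \overline{F}_\alpha^{\,\ell} \big) \cdot \mathrm{ad}_{\{\,,\,\}}\Big( \frac{\varepsilon}{2d_\alpha \ell^2}\, q_\alpha^\ell \cdot \overline{E}_\alpha^{\,\ell} \otimes \overline{F}_\alpha^{\,\ell} \Big) \Big) \bmod (q - \varepsilon) = \exp\Big( \frac{\varepsilon}{2d_\alpha \ell^2}\, \psi(x_\alpha \otimes y_\alpha) \cdot \mathrm{ad}_{\{\,,\,\}}(x_\alpha \otimes y_\alpha) \Big) \bmod (q - \varepsilon). \]
Therefore we have to show that the formal series
\[ \exp\Big( \frac{\varepsilon}{2d_\alpha \ell^2} \cdot \psi(x_\alpha \otimes y_\alpha) \cdot \mathrm{ad}_{\{\,,\,\}}(x_\alpha \otimes y_\alpha) \Big)(x \otimes y) \]
for $x, y$ generators of $U_\varepsilon$ (that is $x, y \in \{ 1, \overline{F}_\alpha, L_i, \overline{E}_\alpha \bmod (q - \varepsilon) \mid i = 1,\dots,n,\ \alpha \in R^+ \}$) does converge (to a meromorphic function on $S_\varepsilon^{(\infty)} \times S_\varepsilon^{(\infty)}$). But notice that the following obvious identity holds (for all $n \in \mathbb{N}$)
\[ \Big( \frac{\varepsilon}{2d_\alpha \ell^2} \cdot \psi(x_\alpha \otimes y_\alpha) \cdot \mathrm{ad}_{\{\,,\,\}}(x_\alpha \otimes y_\alpha) \Big)^n = \psi(x_\alpha \otimes y_\alpha)^n \cdot \Big( \frac{\varepsilon}{2d_\alpha \ell^2} \cdot \mathrm{ad}_{\{\,,\,\}}(x_\alpha \otimes y_\alpha) \Big)^n \]
because of Leibniz' rule and $\mathrm{ad}_{\{\,,\,\}}(x_\alpha \otimes y_\alpha)\big( \frac{\varepsilon}{2d_\alpha \ell^2}\, \psi(x_\alpha \otimes y_\alpha) \big) = 0$; moreover, $\psi(x_\alpha \otimes y_\alpha)$ is a meromorphic function on the $\infty$-fold ramified covering $H^{(\infty)} \times H^{(\infty)}$ of $H \times H$. Now recall that $[x \otimes y, z \otimes w] = [x,z] \otimes yw + zx \otimes [y,w]$; then set ${}_1e_\alpha := \mathrm{ad}_{[\,,\,]}\big( {}_1E_\alpha^{(\ell)} \big)\big|_{q=\varepsilon}$, ${}_2f_\alpha := \mathrm{ad}_{[\,,\,]}\big( {}_2F_\alpha^{(\ell)} \big)\big|_{q=\varepsilon}$, and observe that
\[ {}_1e_\alpha := \mathrm{ad}_{[\,,\,]}\big( {}_1E_\alpha^{(\ell)} \big)\Big|_{q=\varepsilon} = \Big( \frac{1}{q_\alpha^{2\ell^2} - 1} \cdot \mathrm{ad}_{[\,,\,]}\big( {}_1\overline{E}_\alpha^{\,\ell} \big) \Big)\Big|_{q=\varepsilon} = \frac{\varepsilon}{2d_\alpha \ell^2} \cdot \mathrm{ad}_{\{\,,\,\}}({}_1x_\alpha), \]
\[ {}_2f_\alpha := \mathrm{ad}_{[\,,\,]}\big( {}_2F_\alpha^{(\ell)} \big)\Big|_{q=\varepsilon} = \Big( \frac{1}{q_\alpha^{2\ell^2} - 1} \cdot \mathrm{ad}_{[\,,\,]}\big( {}_2\overline{F}_\alpha^{\,\ell} \big) \Big)\Big|_{q=\varepsilon} = \frac{\varepsilon}{2d_\alpha \ell^2} \cdot \mathrm{ad}_{\{\,,\,\}}({}_2y_\alpha), \tag{4.4} \]
and let $m(x)\colon y \mapsto xy$ (left multiplication by $x$); then formula (4.4) gives
\[ \frac{\varepsilon}{2d_\alpha \ell^2} \cdot \mathrm{ad}_{\{\,,\,\}}(x_\alpha \otimes y_\alpha) = {}_1e_\alpha \otimes m({}_2y_\alpha) + m({}_1x_\alpha) \otimes {}_2f_\alpha; \tag{4.5} \]
one trivially checks that ${}_1e_\alpha \otimes m({}_2y_\alpha)$ and $m({}_1x_\alpha) \otimes {}_2f_\alpha$ are operators which commute with each other, thus (4.5) gives
\[ \exp\Big( \frac{\varepsilon}{2d_\alpha \ell^2} \cdot \mathrm{ad}_{\{\,,\,\}}(x_\alpha \otimes y_\alpha) \Big) = \exp\big( {}_1e_\alpha \otimes m({}_2y_\alpha) \big) \circ \exp\big( m({}_1x_\alpha) \otimes {}_2f_\alpha \big); \tag{4.6} \]
for $x, y$ generators of $U_\varepsilon$ we have
\[ \exp\big( \psi(x_\alpha \otimes y_\alpha) \cdot m({}_1x_\alpha) \otimes {}_2f_\alpha \big)(x \otimes y) = \exp\Big( \frac{\log(1 - {}_1x_\alpha \cdot {}_2y_\alpha)}{{}_1x_\alpha \cdot {}_2y_\alpha} \cdot {}_1x_\alpha \cdot {}_2f_\alpha \Big)(x \otimes y) \]
\[ = \exp\Big( \frac{\log(1 - {}_1x_\alpha \cdot {}_2y_\alpha)}{{}_2y_\alpha} \cdot {}_2f_\alpha \Big)(x \otimes y) = \exp\Big( \frac{\log(1 - {}_1x_\alpha \cdot {}_2y_\alpha)}{{}_2y_\alpha} \cdot {}_2f_\alpha \Big)(y) \cdot x \otimes 1. \tag{4.7} \]
It is proved in [DK], Sect. 3, that $\exp(t \cdot f_\alpha)$ converges to a holomorphic automorphism of the algebra of global holomorphic sections of $S_\varepsilon$ (as a bundle over $H$), for all $t \in \mathbb{C}$; when $t$ is replaced with any meromorphic function on $H$, the series we get does converge to an automorphism of the algebra of meromorphic sections (cf. formulas in the proof of Proposition 3.5 of [DK]); since $\frac{\log(1 - {}_1x_\alpha \cdot {}_2y_\alpha)}{{}_2y_\alpha}$ is meromorphic on the $\infty$-fold covering $H^{(\infty)} \times H^{(\infty)}$, we conclude that $\exp\big( \psi(x_\alpha \otimes y_\alpha) \cdot m({}_1x_\alpha) \otimes {}_2f_\alpha \big)(x \otimes y)$ is a meromorphic section of $S_\varepsilon^{(\infty)} \times S_\varepsilon^{(\infty)}$; the same holds for $\exp\big( \psi(x_\alpha \otimes y_\alpha) \cdot {}_1e_\alpha \otimes m({}_2y_\alpha) \big)$, and finally for
\[ \mathrm{Ad}\big(R^{(1)}_\alpha\big)\Big|_{q=\varepsilon} = \exp\Big( \frac{\varepsilon}{2d_\alpha \ell^2} \cdot \psi(x_\alpha \otimes y_\alpha) \cdot \mathrm{ad}_{\{\,,\,\}}(x_\alpha \otimes y_\alpha) \Big)\Big|_{q=\varepsilon}, \]
q.e.d.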
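Two scalar facts used in the first part of the proof can be tested numerically: the constant $\varepsilon/(2d_\alpha\ell^2)$ is the limit of $(q - \varepsilon)/(q_\alpha^{2\ell^2} - 1)$ as $q \to \varepsilon$, and the derivative of $u \mapsto \int_0^u \log(1-\tau)/\tau\, d\tau$ is $\log(1-u)/u$, which is the function $\psi$ appearing above. The sketch below is illustrative only; the sample values of $\ell$, $d_\alpha$, $u$ and the step sizes are assumptions, the dilogarithm-type integral is represented by its Taylor series $-\sum_{n \ge 1} u^n/n^2$, and the function names are hypothetical.

```python
import cmath


def specialization_ratio(q, eps, d_alpha, ell):
    """(q - eps) / (q_alpha^(2 ell^2) - 1), with q_alpha = q^d_alpha."""
    return (q - eps) / (q ** (2 * d_alpha * ell**2) - 1)


def phi_series(u, terms=400):
    """Taylor series of int_0^u log(1 - tau)/tau dtau, i.e. -sum_{n>=1} u^n / n^2."""
    return -sum(u**n / n**2 for n in range(1, terms))


def psi(u):
    """psi(u) = log(1 - u) / u."""
    return cmath.log(1 - u) / u


# Limit of the specialization ratio as q -> eps (sample: ell = 3, d_alpha = 1).
ell, d_alpha = 3, 1
eps = cmath.exp(2j * cmath.pi / ell)
q_near = eps * (1.0 + 1e-7)
ratio_gap = specialization_ratio(q_near, eps, d_alpha, ell) - eps / (2 * d_alpha * ell**2)

# Central-difference derivative of the series, compared with psi.
u, step = 0.3, 1e-5
derivative = (phi_series(u + step) - phi_series(u - step)) / (2 * step)
derivative_gap = derivative - psi(u)
```

Both gaps should be small, of the order of the chosen perturbation and step size respectively.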
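For the second part, notice that $R^{(0)}_\varepsilon$ clearly leaves invariant both $\widehat{Z_\varepsilon^{\otimes 2}}$ and $\widehat{Z_0^{\otimes 2}}$, hence its dual leaves invariant $H_\varepsilon^{(\infty)} \times H_\varepsilon^{(\infty)}$ and $H^{(\infty)} \times H^{(\infty)}$; moreover, since $R^{(1)}_\varepsilon$ is a product of terms
\[ \mathrm{Ad}\big(R^{(1)}_\alpha\big)\Big|_{q=\varepsilon} = \exp\Big( \frac{\varepsilon}{2d_\alpha \ell^2} \cdot \psi(x_\alpha \otimes y_\alpha) \cdot \mathrm{ad}_{\{\,,\,\}}(x_\alpha \otimes y_\alpha) \Big)\Big|_{q=\varepsilon}, \]
since $\widehat{Z_\varepsilon^{\otimes 2}}$ and $\widehat{Z_0^{\otimes 2}}$ are closed for the Poisson bracket, and since $x_\alpha \otimes y_\alpha \in \widehat{Z_0^{\otimes 2}} \subseteq \widehat{Z_\varepsilon^{\otimes 2}}$, we have that the dual of $R^{(1)}_\varepsilon$ leaves $H_\varepsilon^{(\infty)} \times H_\varepsilon^{(\infty)}$ and $H^{(\infty)} \times H^{(\infty)}$ invariant; thus we conclude that $R^*_\varepsilon$ leaves $H_\varepsilon^{(\infty)} \times H_\varepsilon^{(\infty)}$ and $H^{(\infty)} \times H^{(\infty)}$ invariant. Finally, it clearly preserves the Poisson structure because $R_\varepsilon$ is defined by specializing an algebra automorphism $\widetilde{R}$ of the completed tensor square of the integer form of $U_q^P(\mathfrak{g})$, whence
\[ R_\varepsilon\big( \{x_0, y_0\} \big) = \widetilde{R}\Big( \frac{[x,y]}{q - \varepsilon} \Big)\Big|_{q=\varepsilon} = \frac{\big[ \widetilde{R}(x), \widetilde{R}(y) \big]}{q - \varepsilon}\Big|_{q=\varepsilon} = \{ R_\varepsilon(x_0), R_\varepsilon(y_0) \}. \]
The proof of the last part of the statement is completely trivial, by functoriality.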
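A deeper analysis yields an improvement of the previous result, proving that the convergence already holds on finite ramified coverings, as the following shows.

Theorem 4.1. The meromorphic automorphism $R^*_\varepsilon\colon S_\varepsilon^{(\infty)} \times S_\varepsilon^{(\infty)} \to S_\varepsilon^{(\infty)} \times S_\varepsilon^{(\infty)}$ pushes down to a birational automorphism $R^*_{\varepsilon,\ell}\colon S_\varepsilon^{(2\ell)} \times S_\varepsilon^{(2\ell)} \to S_\varepsilon^{(2\ell)} \times S_\varepsilon^{(2\ell)}$; moreover, $R^*_{\varepsilon,\ell} \neq \tau^{(2\ell)}$, and $R^*_{\varepsilon,\ell}$ enjoys the dual properties of (1.4–6). The same holds with $H_\varepsilon$, resp. $H$, instead of $S_\varepsilon$, with a birational Poisson automorphism $R^*_{\varepsilon,\ell}\colon H_\varepsilon^{(2\ell)} \times H_\varepsilon^{(2\ell)} \to H_\varepsilon^{(2\ell)} \times H_\varepsilon^{(2\ell)}$, resp. $R^*_{\varepsilon,\ell}\colon H^{(2\ell)} \times H^{(2\ell)} \to H^{(2\ell)} \times H^{(2\ell)}$: in particular, $m\big(R^*_{\varepsilon,\ell}(x,y)\big) = m(y,x) = y \cdot x$ for all elements $x, y$ of the Poisson group $H^{(2\ell)}$ (where $m$ and "$\cdot$" denote the product of $H^{(2\ell)}$).

Proof. It is clear that for $R^{(0)}_\varepsilon$ everything is o.k. As for $R^{(1)}_\varepsilon$, from the proof of Proposition 4.1 we see that it is enough to show that
\[ \exp\Big( \frac{\varepsilon}{2d_\alpha \ell^2}\, \psi(x_\alpha \otimes y_\alpha) \cdot \mathrm{ad}_{\{\,,\,\}}(x_\alpha \otimes y_\alpha) \Big)(x \otimes y) \tag{4.8} \]
(for any $x, y$ in $U_\varepsilon$) is a rational section of the bundle $S_\varepsilon^{(2\ell)}$ on a $2\ell$-fold ramified covering $H^{(2\ell)}$ of $H$: this again amounts to performing some computations. In particular (cf. (4.4–7)) we are reduced to checking the same for the functions
\[ \exp\Big( \frac{\log(1 - {}_1x_\alpha \cdot {}_2y_\alpha)}{{}_1x_\alpha} \cdot {}_1e_\alpha \Big)({}_1x \cdot {}_2y), \qquad \exp\Big( \frac{\log(1 - {}_1x_\alpha \cdot {}_2y_\alpha)}{{}_2y_\alpha} \cdot {}_2f_\alpha \Big)({}_1x \cdot {}_2y). \tag{4.9} \]
We deal with the first function above, the proof for the second following by symmetry. Since $\frac{\log(1 - {}_1x_\alpha \cdot {}_2y_\alpha)}{{}_1x_\alpha} \cdot {}_1e_\alpha$ is a derivation of $\widehat{U_\varepsilon^{\otimes 2}}$, its exponential is an automorphism of $\widehat{U_\varepsilon^{\otimes 2}}$; now $\frac{\log(1 - {}_1x_\alpha \cdot {}_2y_\alpha)}{{}_1x_\alpha} \cdot {}_1e_\alpha (1 \otimes y) = 0$, whence $\exp\big( \frac{\log(1 - {}_1x_\alpha \cdot {}_2y_\alpha)}{{}_1x_\alpha} \cdot {}_1e_\alpha \big)(1 \otimes y) = 1 \otimes y$, for all $y \in U_\varepsilon$; therefore we have only to compute $\exp\big( \frac{\log(1 - {}_1x_\alpha \cdot {}_2y_\alpha)}{{}_1x_\alpha} \cdot {}_1e_\alpha \big)(x \otimes 1)$ for $x = {}_1x \in U_\varepsilon$: in particular, it is enough to take $x$ to be a generator of $U_\varepsilon$, namely $x \in \{ \overline{F}_i, L_\lambda, \overline{E}_j \mid i,j = 1,\dots,n;\ \lambda \in P \}$. Like in the proof of [DK], Proposition 3.5, exploiting the braid group action we can restrict to the case of simple roots $\alpha = \alpha_i$, $i = 1,\dots,n$ (thus we set ${}_1\overline{E}_i := {}_1\overline{E}_{\alpha_i}$, ${}_1e_i := {}_1e_{\alpha_i}$, and so on), using the formulas
\[ {}_1\overline{E}_\alpha = T_w\big({}_1\overline{E}_i\big), \qquad {}_1\overline{F}_\alpha = T_w\big({}_1\overline{F}_i\big) \qquad \text{for } \alpha = w(\alpha_i), \]
\[ {}_1e_\alpha = T_w \circ {}_1e_i \circ T_w^{-1}, \qquad {}_1f_\alpha = T_w \circ {}_1f_i \circ T_w^{-1} \qquad \text{for } \alpha = w(\alpha_i) \tag{4.10} \]
(cf. [DK], §3.4), where $T_w$ denotes the unique element of the braid group associated to $w \in W$. Moreover, from direct computation or recalling formulas in the proof of [DK], Proposition 3.5, we get, mutatis mutandis,
\[ \exp(t \cdot {}_1e_i)({}_1L_\lambda) = e^{(\langle\alpha_i|\lambda\rangle / 2\ell) \cdot t \cdot {}_1x_i} \cdot {}_1L_\lambda, \]
\[ \exp(t \cdot {}_1e_i)\big({}_1\overline{F}_j\big) = {}_1\overline{F}_j - \delta_{ij} \cdot \Big( \frac{e^{-t \cdot {}_1x_i/\ell} - 1}{{}_1x_i} \cdot \varepsilon^{d_i} L_{\alpha_i} + \frac{e^{t \cdot {}_1x_i/\ell} - 1}{{}_1x_i} \cdot \varepsilon^{-d_i} L_{-\alpha_i} \Big) \cdot {}_1\overline{E}_i^{\,\ell-1} \]
for any indeterminate $t$ which commutes with ${}_1\overline{E}_i$ (where $\langle\alpha_i|\lambda\rangle := 2(\alpha_i|\lambda)/(\alpha_i|\alpha_i) \in \mathbb{Z}$); when instead of $t$ we have the meromorphic function $\frac{\log(1 - {}_1x_i \cdot {}_2y_i)}{{}_1x_i}$ (which does commute with ${}_1\overline{E}_i$!) the previous formulas give
\[ \exp\Big( \frac{\log(1 - {}_1x_i \cdot {}_2y_i)}{{}_1x_i} \cdot {}_1e_i \Big)({}_1L_\lambda) = (1 - {}_1x_i \cdot {}_2y_i)^{\langle\alpha_i|\lambda\rangle / 2\ell} \cdot {}_1L_\lambda, \]
\[ \exp\Big( \frac{\log(1 - {}_1x_i \cdot {}_2y_i)}{{}_1x_i} \cdot {}_1e_i \Big)\big({}_1\overline{F}_j\big) = {}_1\overline{F}_j - \delta_{ij} \cdot \Big( \frac{(1 - {}_1x_i \cdot {}_2y_i)^{-1/\ell} - 1}{{}_1x_i} \cdot \varepsilon^{d_i} L_{\alpha_i} + \frac{(1 - {}_1x_i \cdot {}_2y_i)^{1/\ell} - 1}{{}_1x_i} \cdot \varepsilon^{-d_i} L_{-\alpha_i} \Big) \cdot {}_1\overline{E}_i^{\,\ell-1}, \]
and both these are rational functions on $S_\varepsilon^{(2\ell)} \times S_\varepsilon^{(2\ell)}$.

Now we are left with the case $x = \overline{E}_j$, $j = 1,\dots,n$. Consider ${}_1e_i\big({}_1\overline{E}_j\big)$; if $a_{ij} = 2$ (i.e. $i = j$) or $a_{ij} = 0$ we have ${}_1e_i\big({}_1\overline{E}_j\big) = 0$, hence
\[ \exp\Big( \frac{\log(1 - {}_1x_i \cdot {}_2y_i)}{{}_1x_i} \cdot {}_1e_i \Big)\big({}_1\overline{E}_j\big) = {}_1\overline{E}_j. \]
Therefore we are reduced to making computations in the connected rank 2 case. To this end, we will follow the conventions and notations of [DP], Appendix, and skip for a while the left bottom indices "1" (i.e. ${}_1\overline{E}_i = \overline{E}_i$, etc.). We develop the $A_2$ case; the procedure is the same in the remaining cases but the computations are longer (cf. also the Remark after the proof).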
In this case we have d1 = 1 = d2 . Define the root vector E12 := Eα1 +α2 ∈ UqP(g) as Eα12 ≡ E12 := T1 (E2 ) = −E1 E2 + q −1 E2 E1 , then we have E2 E1 = qE1 E2 + qE12 ,
(4.11)
E12 E1 = q −1 E1 E12 .
(4.12)
Let C(q)(E1 ) be the field of rational functions in the indeterminate E1 with coefficients in C(q); let M be the C(q)(E1 )–vector space with basis {E 2 , E 12 }: then (4.12) tells us that the operation ρE1 of right multiplication by E1 yields an endomorphism of M defined by the matrix (with respect to the ordered C(q)(E1 )–basis {E 2 , E 12 } ) 0 qE1 q q −1 E1 n therefore multiplication by E1 yields the endomorphism of M defined by the matrix n 0 0 n qE1 (qE1 )n = . (4.13) q −1 E1 q q −1 E1 q[n] q · E1n−1 Thus for e1 E 2 we have i h (`) e1 E 2 := E1 , E 2 = q=ε
E1` E 2 − E 2 E1` = = [`] q ! q=ε
E1` E 2 − q ` E1` E 2 − q[`] q E1`−1 E 12 = [`] q ! q=ε q − q −1 1 − q` q `−1 ` · E1 E 2 − · E1 E 12 = = · ` −` q −q [` − 1] q ! [` − 1] q ! q=ε q=ε `
ε `−1 x1 ε `−1 E = − 1 · E 2 − E 1 · E 12 = − · E 2 − E 1 · E 12 2` ` 2` ` `−1 `−1 −1 −1 ) 1 (because [`−1] = Q(q−q = (ε−ε ` ) ); on the other hand, for `−1 s q ! q=ε (q −q −s ) q=ε s=0 e1 E 12 , (4.12) gives i h E ` E 12 − E 12 E1` E1` E 2 − q −` E1` E 12 = e1 (E 12 ) := E1(`) , E 12 = 1 = [`] ! [`] ! =
q=ε
−`
1−q q ` − q −`
q
q=ε
q
` q − q −1 x1 E1 ` · E1 E 12 = · E 12 = · E 12 , · [` − 1] q ! 2` 2` q=ε
q=ε
therefore we conclude that e1 restricts to an endomorphism of M defined by the matrix ! ` E − x2`1 0 0 − 2`1 = , `−1 ` x1 `−1 E1 − ε` · E 1 − ε` · E 1 2` 2` n hence en1 = e1 is given by the matrix M
M
Geometrical Meaning of R-Matrix Action for Quantum Groups
− x2`1
0 `−1
− ε` · E 1
n =
x1 2`
111
− x2`1 −δn∈(2N+1) ·
2ε x1
·
n x1 n 2`
0 `−1
· E1
x1 2`
n
(for all n ∈ N , where δx∈Y := 1 for x ∈ Y and δx∈Y := 0 for x ∈ / Y ), so that t − 2` ·x1 0 t e `−1 t t exp(t · e1 ) = − ε · e 2` ·x1 − e− 2` ·x1 · E e 2` ·x1 , , M
1
x1
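where $t$ denotes any indeterminate which commutes with $E_1$; in particular for $t = \frac{\log(1-w_1)}{{}_1\overline{E}_1^{\,\ell}}$, with $w_1 := {}_1\overline{E}_1^{\,\ell} \cdot {}_2\overline{F}_1^{\,\ell} = {}_1x_1 \cdot {}_2y_1$, we get
$$\exp\bigg(\frac{\log(1-w_1)}{{}_1\overline{E}_1^{\,\ell}} \cdot e_1\bigg)\bigg|_M = \begin{pmatrix} (1-w_1)^{-\frac{1}{2\ell}} & 0 \\ -\frac{\varepsilon}{{}_1\overline{E}_1^{\,\ell}} \cdot \Big((1-w_1)^{\frac{1}{2\ell}} - (1-w_1)^{-\frac{1}{2\ell}}\Big) \cdot {}_1\overline{E}_1^{\,\ell-1} & (1-w_1)^{\frac{1}{2\ell}} \end{pmatrix},$$
thus
$$\exp\bigg(\frac{\log(1-{}_1x_1 \cdot {}_2y_1)}{{}_1x_1} \cdot {}_1e_1\bigg)\big({}_1\overline{E}_2\big) = (1-{}_1x_1 \cdot {}_2y_1)^{-\frac{1}{2\ell}} \cdot {}_1\overline{E}_2 - \frac{\varepsilon}{{}_1x_1} \cdot \Big((1-{}_1x_1 \cdot {}_2y_1)^{\frac{1}{2\ell}} - (1-{}_1x_1 \cdot {}_2y_1)^{-\frac{1}{2\ell}}\Big) \cdot {}_1\overline{E}_1^{\,\ell-1} \cdot {}_1\overline{E}_{12},$$
which is a rational section of $S_\varepsilon^{(2\ell)} \times S_\varepsilon^{(2\ell)}$, q.e.d.
As for $\exp\Big(\frac{\log(1-{}_1x_2 \cdot {}_2y_2)}{{}_1x_2} \cdot {}_1e_2\Big)$, everything comes from above by symmetry, namely because $\alpha_2 = s_1s_2(\alpha_1)$ implies ${}_1\overline{E}_2 = T_1T_2\big({}_1\overline{E}_1\big)$, ${}_1\overline{F}_2 = T_1T_2\big({}_1\overline{F}_1\big)$, and ${}_1e_2 = (T_1T_2) \circ {}_1e_1 \circ (T_1T_2)^{-1}$; on the other hand, in the other cases of rank 2 (that is $B_2$ and $G_2$) such a symmetric situation does not occur, hence we must perform direct computation for $\exp\Big(\frac{\log(1-{}_1x_2 \cdot {}_2y_2)}{{}_1x_2} \cdot {}_1e_2\Big)$ too (this is entirely similar to the previous one, although longer). Finally, it is clear that restricting to the subalgebras $Z_\varepsilon$ and $Z_0$ we get (bi)rational Poisson automorphisms of their spectra, by the same argument as at the end of the proof of Proposition 4.1.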
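The closed forms used above for the powers and the exponential of the lower-triangular matrix of $e_1$ on $M$ can be checked numerically. The following is only a sketch under a simplifying assumption: the entries $x_1/(2\ell)$ and $(\varepsilon/\ell)\,\overline{E}_1^{\,\ell-1}$ are replaced by commuting complex numbers `a` and `c` (hypothetical sample values), and all function names are illustrative.

```python
import cmath

def matmul2(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matpow2(A, n):
    """n-th power of a 2x2 matrix by repeated multiplication (n >= 0)."""
    result = [[1, 0], [0, 1]]
    for _ in range(n):
        result = matmul2(result, A)
    return result

def e1_matrix(a, c):
    """Matrix [[-a, 0], [-c, a]]; a stands for x_1/(2l), c for (eps/l) * E_1^(l-1)."""
    return [[-a, 0], [-c, a]]

def power_closed(a, c, n):
    """Closed form of the n-th power: the off-diagonal entry vanishes for even n."""
    lower = -(c / a) * a**n if n % 2 == 1 else 0
    return [[(-a)**n, 0], [lower, a**n]]

def exp_series(A, t, terms=60):
    """Truncated power series for exp(t*A)."""
    result = [[1, 0], [0, 1]]
    term = [[1, 0], [0, 1]]
    for n in range(1, terms):
        term = [[t * x / n for x in row] for row in matmul2(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def exp_closed(a, c, t):
    """Closed form of exp(t*A) for A = [[-a, 0], [-c, a]]."""
    lower = -(c / (2 * a)) * (cmath.exp(t * a) - cmath.exp(-t * a))
    return [[cmath.exp(-t * a), 0], [lower, cmath.exp(t * a)]]

def max_diff(A, B):
    """Largest entrywise absolute difference between two 2x2 matrices."""
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))
```

With `a` $= x_1/(2\ell)$ and `c` $= (\varepsilon/\ell)\,\overline{E}_1^{\,\ell-1}$ one has `c/(2*a)` $= \varepsilon\,\overline{E}_1^{\,\ell-1}/x_1$, which is the prefactor of the off-diagonal entry of $\exp(t \cdot e_1)|_M$ in the text.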
Remark 4.1. The very (theoretical) reason why computations do work in all rank two cases, so that Theorem 4.1 does hold, lies in the availability of the commutation formulas for quantum root vectors (the so-called Levendorskij-Soibel'man formulas, cf. [DP], Theorem 9.3), strictly related to the existence of a convex ordering on the set of positive roots.
The previous result can be still improved when considering the central Hopf subalgebra $Z_0$, hence the Poisson group $H$, as the following shows:
Theorem 4.2. The birational Poisson automorphism $R^*_{\varepsilon,\ell}: H^{(2\ell)} \times H^{(2\ell)} \to H^{(2\ell)} \times H^{(2\ell)}$ pushes down to a birational Poisson automorphism $R^*_{\varepsilon,\ell}: H^{(2)} \times H^{(2)} \to H^{(2)} \times H^{(2)}$, independent of $\ell$, of a 2-fold ramified covering $H^{(2)} \times H^{(2)}$ of $H \times H$; moreover, $R^*_{\varepsilon,\ell} \neq \tau_{H^{(2)}}$ (the "twist" map of $H^{(2)} \times H^{(2)}$), and $R^*_{\varepsilon,\ell}$ enjoys the dual properties of (1.4–6): in particular, $m\big(R^*_{\varepsilon,\ell}(x, y)\big) = y \cdot x$ for all elements $x$, $y$ of the Poisson group $H^{(2)}$ (where $m$ and "$\cdot$" denote the product of $H^{(2)}$).
Proof. As for Theorem 4.1, the proof amounts to checking that some series do converge on an appropriate covering. Namely, we have to check that
$$\exp\bigg(\frac{-\varepsilon}{2d_\alpha\ell^2}\,\psi(x_\alpha \otimes y_\alpha) \cdot \mathrm{ad}_{\{\,,\,\}}(x_\alpha \otimes y_\alpha)\bigg)({}_1w \otimes {}_2w)$$
does converge to a rational function on a covering $H^{(2)} \times H^{(2)}$ as claimed, for all $\alpha \in R^+$ and for all ${}_iw \in \{\, 1, {}_ix_\beta, {}_iz_\lambda, {}_iy_\gamma \mid \beta, \gamma \in R^+;\ \lambda \in P \,\}$, $i = 1, 2$. This again amounts to proving the same for the functions
$$\exp\bigg(\frac{\log(1-{}_1x_\alpha \cdot {}_2y_\alpha)}{{}_1x_\alpha} \cdot {}_1e_\alpha\bigg)({}_1w), \qquad \exp\bigg(\frac{\log(1-{}_1x_\alpha \cdot {}_2y_\alpha)}{{}_2y_\alpha} \cdot {}_2f_\alpha\bigg)({}_2w)$$
for all $\alpha$ and ${}_iw$ as above. As for Theorem 4.1, we deal with the first function, the proof for the second one following by symmetry.
By the braid group action we can again reduce to the case of simple roots $\alpha = \alpha_i$. Furthermore (cf. [DK], § 3.4, and [DP], § 19), with respect to the coordinates $x_\gamma := \overline{E}_\gamma^{\,\ell}$, $z_\lambda := L_\lambda^{\ell}$, $y_\gamma := \overline{F}_\gamma^{\,\ell}$, the formulas for the derivations $e_\alpha$ are independent of $\ell$: therefore we can fix $\ell = 1$ and perform computations in $U_1$. Again direct computation (or formulas in the proof of [DK], Proposition 3.5) gives
$$\exp(t \cdot {}_1e_i)({}_1z_\lambda) = e^{(\langle\alpha_i|\lambda\rangle/2)\cdot t\cdot {}_1x_i} \cdot {}_1z_\lambda \tag{4.14}$$
for any indeterminate $t$ which commutes with ${}_1\overline{E}_i = {}_1x_i$; then for $t = \frac{\log(1-{}_1x_i \cdot {}_2y_i)}{{}_1x_i}$ we have
$$\exp\bigg(\frac{\log(1-{}_1x_i \cdot {}_2y_i)}{{}_1x_i} \cdot {}_1e_i\bigg)({}_1z_\lambda) = (1-{}_1x_i \cdot {}_2y_i)^{\frac{\langle\alpha_i|\lambda\rangle}{2}} \cdot {}_1z_\lambda$$
which is a rational function on a 2-fold ramified covering $H^{(2)} \times H^{(2)}$ of $H \times H$.
Now consider $\exp\Big(\frac{\log(1-{}_1x_i \cdot {}_2y_i)}{{}_1x_i} \cdot {}_1e_i\Big)({}_1x_\gamma)$, with $\gamma \in R^+$ (notice that now the simple root vectors $\overline{E}_j = x_j$ ($j = 1, \dots, n$) are not enough to generate $U_1^+$ (the "positive part" of $U_1$): we do need all root vectors $\overline{E}_\gamma = x_\gamma$, $\gamma \in R^+$). For any fixed pair $(\alpha, \gamma)$ of positive roots, let us denote by $R^+_{\alpha,\gamma}$ the rank 2 root system spanned by $\{\alpha, \gamma\}$ in $R^+$. The following is well known (cf. e.g. [DP], first Lemma of § 15.4):
Claim. For any fixed pair $(\alpha_i, \gamma)$ of positive roots with $\alpha_i$ simple, there exist $w \in W$ and $\alpha_1, \alpha_2 \in R^+$ such that $w\big(R^+_{\alpha_1,\alpha_2}\big) = R^+_{\alpha_i,\gamma}$ and $w(\alpha_1) = \alpha_i$.
Thanks to the Claim and (4.10) we are reduced to making computations in the rank 2 case; the same holds when considering $\exp\Big(\frac{\log(1-{}_1x_i \cdot {}_2y_i)}{{}_1x_i} \cdot {}_1e_i\Big)({}_1y_\gamma)$, with $\gamma \in R^+$ (now again the negative simple root vectors $\overline{F}_j = y_j$ ($j = 1, \dots, n$) are not enough to generate $U_1^-$ (the "negative part" of $U_1$): we do need all negative root vectors $\overline{F}_\gamma = y_\gamma$, $\gamma \in R^+$). We denote by $T$ the type of a root system of rank 2 (hence $T \in \{A_1 \times A_1, A_2, B_2, G_2\}$).
$T = A_1 \times A_1$: First of all, since $e_j(x_j) = e_j(\overline{E}_j) = 0$ ($j = 1, 2$), we have
$$\exp\bigg(\frac{\log(1-{}_1x_j \cdot {}_2y_j)}{{}_1x_j} \cdot {}_1e_j\bigg)({}_1x_j) = {}_1x_j \qquad (j = 1, 2);$$
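second, since $a_{12} = 0$, we have $e_i(x_j) = 0$ (for $i, j \in \{1, 2\}$, $i \neq j$), whence
$$\exp\bigg(\frac{\log(1-{}_1x_i \cdot {}_2y_i)}{{}_1x_i} \cdot {}_1e_i\bigg)({}_1x_j) = {}_1x_j \qquad (i, j \in \{1, 2\},\ i \neq j),$$
thus we are done with the generators $x_\alpha$. As for the negative root vectors $y_\alpha = \overline{F}_\alpha$, we have $e_i(y_j) = \delta_{ij} \cdot (z_{\alpha_i} - z_{-\alpha_i})$, whence
$$e_i^n(y_j) = \delta_{ij} \cdot \big(e_i^{n-1}(z_{\alpha_i}) - e_i^{n-1}(z_{-\alpha_i})\big) = \delta_{ij} \cdot \big((-x_i)^{n-1} \cdot z_{\alpha_i} - x_i^{n-1} \cdot z_{-\alpha_i}\big)$$
(thanks to (4.14)) for all $n \in \mathbb{N}_+$, thus
$$\exp(t \cdot {}_1e_i)({}_1y_j) = {}_1y_j - \delta_{ij} \cdot \bigg(\frac{e^{-t\cdot {}_1x_i} - 1}{{}_1x_i} \cdot z_{\alpha_i} + \frac{e^{t\cdot {}_1x_i} - 1}{{}_1x_i} \cdot z_{-\alpha_i}\bigg)$$
for any indeterminate $t$ which commutes with ${}_1x_i$; for $t = \frac{\log(1-{}_1x_i \cdot {}_2y_i)}{{}_1x_i}$ we get
$$\exp\bigg(\frac{\log(1-{}_1x_i \cdot {}_2y_i)}{{}_1x_i} \cdot {}_1e_i\bigg)({}_1y_j) = {}_1y_j - \delta_{ij} \cdot \bigg(\frac{(1-{}_1x_i \cdot {}_2y_i)^{-1} - 1}{{}_1x_i} \cdot z_{\alpha_i} + \frac{(1-{}_1x_i \cdot {}_2y_i) - 1}{{}_1x_i} \cdot z_{-\alpha_i}\bigg)$$
which is a rational function on a 2-fold ramified covering $H^{(2)} \times H^{(2)}$ of $H \times H$ ($= \mathrm{Spec}(Z_0) \times \mathrm{Spec}(Z_0)$). Since for $T = A_1 \times A_1$ we have $R^+ = \{\alpha_1, \alpha_2\}$, we are done.
$T = A_2$: We follow again the conventions and notations of [DP], Appendix. In the present case we have $d_1 = 1 = d_2$, and $R^+ = \{\alpha_1, \alpha_{12} := \alpha_1 + \alpha_2, \alpha_2\}$, and we define the root vector $E_{12} := -E_1E_2 + q^{-1}E_2E_1$ (cf. (4.11)). For $\gamma = \alpha_1$ we have as above
$$\exp\bigg(\frac{\log(1-{}_1x_1 \cdot {}_2y_1)}{{}_1x_1} \cdot {}_1e_1\bigg)({}_1x_1) = {}_1x_1.$$
Then let $M$ be the $\mathbb{C}(q)(E_1)$–vector space with basis $\{x_2, x_{12}\} = \{\overline{E}_2, \overline{E}_{12}\}$: then (4.12) tells us that the operation of right multiplication by $E_1$ yields an endomorphism of $M$ defined by the matrix (with respect to $\{\overline{E}_2, \overline{E}_{12}\}$)
$$\begin{pmatrix} qE_1 & 0 \\ q & q^{-1}E_1 \end{pmatrix}.$$
Thus for $e_1(x_2)$ we have
$$e_1(x_2) := \big[E_1, \overline{E}_2\big]\Big|_{q=1} = \big(E_1\overline{E}_2 - \overline{E}_2E_1\big)\Big|_{q=1} = \big(E_1\overline{E}_2 - qE_1\overline{E}_2 - q\overline{E}_{12}\big)\Big|_{q=1} =$$
$$= \bigg(\frac{1-q}{q-q^{-1}} \cdot \overline{E}_1\overline{E}_2 - q \cdot \overline{E}_{12}\bigg)\bigg|_{q=1} = -\frac{\overline{E}_1}{2} \cdot \overline{E}_2 - \overline{E}_{12} = -\frac{x_1}{2} \cdot x_2 - x_{12};$$
on the other hand, for $e_1(x_{12})$, (4.12) gives
$$e_1(x_{12}) := \big[E_1, \overline{E}_{12}\big]\Big|_{q=1} = \big(E_1\overline{E}_{12} - \overline{E}_{12}E_1\big)\Big|_{q=1} = \big(E_1\overline{E}_{12} - q^{-1}E_1\overline{E}_{12}\big)\Big|_{q=1} =$$
$$= \frac{1-q^{-1}}{q-q^{-1}} \cdot \overline{E}_1\overline{E}_{12}\bigg|_{q=1} = \frac{\overline{E}_1}{2} \cdot \overline{E}_{12} = \frac{x_1}{2} \cdot x_{12}.$$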
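Therefore we conclude that $e_1$ restricts to an endomorphism of $M$ defined by the matrix
$$\begin{pmatrix} -\frac{x_1}{2} & 0 \\ -1 & \frac{x_1}{2} \end{pmatrix},$$
hence $e_1^n\big|_M = \big(e_1\big|_M\big)^n$ is given by the matrix
$$\begin{pmatrix} -\frac{x_1}{2} & 0 \\ -1 & \frac{x_1}{2} \end{pmatrix}^{n} = \begin{pmatrix} \big(-\frac{x_1}{2}\big)^n & 0 \\ -\delta_{n\in(2\mathbb{N}+1)} \cdot \frac{2}{x_1} \cdot \big(\frac{x_1}{2}\big)^n & \big(\frac{x_1}{2}\big)^n \end{pmatrix}$$
(for all $n \in \mathbb{N}$), so that
$$\exp(t \cdot e_1)\Big|_M = \begin{pmatrix} e^{-\frac{t}{2}\cdot x_1} & 0 \\ -\frac{1}{x_1} \cdot \Big(e^{\frac{t}{2}\cdot x_1} - e^{-\frac{t}{2}\cdot x_1}\Big) & e^{\frac{t}{2}\cdot x_1} \end{pmatrix}$$
for any $t$ which commutes with $E_1$, and for $t = \frac{\log(1-{}_1x_1 \cdot {}_2y_1)}{{}_1x_1}$ we have
$$\exp\bigg(\frac{\log(1-{}_1x_1 \cdot {}_2y_1)}{{}_1x_1} \cdot {}_1e_1\bigg)\bigg|_M = \begin{pmatrix} (1-{}_1x_1 \cdot {}_2y_1)^{-\frac{1}{2}} & 0 \\ -\frac{1}{{}_1x_1} \cdot \Big((1-{}_1x_1 \cdot {}_2y_1)^{\frac{1}{2}} - (1-{}_1x_1 \cdot {}_2y_1)^{-\frac{1}{2}}\Big) & (1-{}_1x_1 \cdot {}_2y_1)^{\frac{1}{2}} \end{pmatrix}.$$
Or, in other words,
$$\exp\bigg(\frac{\log(1-{}_1x_1 \cdot {}_2y_1)}{{}_1x_1} \cdot {}_1e_1\bigg)({}_1x_2) = (1-{}_1x_1 \cdot {}_2y_1)^{-\frac{1}{2}} \cdot {}_1x_2 - \frac{1}{{}_1x_1} \cdot \Big((1-{}_1x_1 \cdot {}_2y_1)^{\frac{1}{2}} - (1-{}_1x_1 \cdot {}_2y_1)^{-\frac{1}{2}}\Big) \cdot {}_1x_{12},$$
$$\exp\bigg(\frac{\log(1-{}_1x_1 \cdot {}_2y_1)}{{}_1x_1} \cdot {}_1e_1\bigg)({}_1x_{12}) = (1-{}_1x_1 \cdot {}_2y_1)^{\frac{1}{2}} \cdot {}_1x_{12},$$
and these are rational functions on a 2-fold ramified covering $H^{(2)} \times H^{(2)}$ of $H \times H$, q.e.d.
For the negative root vectors $y_\alpha = \overline{F}_\alpha$, define $F_{\alpha_{12}} \equiv F_{12} := T_1(F_2) = -F_2F_1 + qF_1F_2$; then we have again $e_1(y_j) = \delta_{1j} \cdot (z_{\alpha_1} - z_{-\alpha_1})$ ($j = 1, 2$), whence
$$\exp\bigg(\frac{\log(1-{}_1x_1 \cdot {}_2y_1)}{{}_1x_1} \cdot {}_1e_1\bigg)({}_1y_j) = {}_1y_j - \delta_{1j} \cdot \bigg(\frac{(1-{}_1x_1 \cdot {}_2y_1)^{-1} - 1}{{}_1x_1} \cdot z_{\alpha_1} + \frac{(1-{}_1x_1 \cdot {}_2y_1) - 1}{{}_1x_1} \cdot z_{-\alpha_1}\bigg) \qquad (j = 1, 2)$$
which is a rational function on the proper covering; this takes care of $\gamma = \alpha_1$ and $\gamma = \alpha_2$. At last, for $\gamma = \alpha_{12} := \alpha_1 + \alpha_2$, we have
$$e_1(y_{12}) = \big[E_1, \overline{F}_{12}\big]\Big|_{q=1} = \Big(-F_2\big[E_1, \overline{F}_1\big] + q\big[E_1, \overline{F}_1\big]F_2\Big)\Big|_{q=1} = \Big(-\overline{F}_2\big(L_{\alpha_1} - L_{-\alpha_1}\big) + q\big(L_{\alpha_1} - L_{-\alpha_1}\big)\overline{F}_2\Big)\Big|_{q=1} =$$
$$= \frac{q^2-1}{q-q^{-1}} \cdot \overline{F}_2L_{\alpha_1}\bigg|_{q=1} = z_{\alpha_1} \cdot y_2.$$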
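Now, since for all $n \in \mathbb{N}$ we have $e_1^n(z_{\alpha_1} \cdot y_2) = e_1^n(z_{\alpha_1}) \cdot y_2 = (-x_1)^n \cdot z_{\alpha_1} \cdot y_2$, we get, for all $n \in \mathbb{N}_+$,
$$e_1^n(y_{12}) = e_1^{n-1}(z_{\alpha_1} \cdot y_2) = e_1^{n-1}(z_{\alpha_1}) \cdot y_2 = (-x_1)^{n-1} \cdot z_{\alpha_1}y_2 = -\frac{(-x_1)^n}{x_1} \cdot z_{\alpha_1}y_2,$$
whence $\exp(t \cdot e_1)(y_{12}) = y_{12} - \frac{e^{-t\cdot x_1} - 1}{x_1} \cdot z_{\alpha_1}y_2$ and finally
$$\exp\bigg(\frac{\log(1-{}_1x_1 \cdot {}_2y_1)}{{}_1x_1} \cdot {}_1e_1\bigg)({}_1y_{12}) = {}_1y_{12} - \frac{(1-{}_1x_1 \cdot {}_2y_1)^{-1} - 1}{{}_1x_1} \cdot {}_1z_{\alpha_1} \cdot {}_1y_2,$$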
the latter being a rational function on the covering $H^{(2)} \times H^{(2)}$ of $H \times H$.
As for $\exp\Big(\frac{\log(1-{}_1x_2 \cdot {}_2y_2)}{{}_1x_2} \cdot {}_1e_2\Big)$, everything follows by symmetry; on the other hand, in cases $B_2$ and $G_2$ such a symmetric situation does not occur, hence we must perform direct computation for $\exp\Big(\frac{\log(1-{}_1x_2 \cdot {}_2y_2)}{{}_1x_2} \cdot {}_1e_2\Big)$ too (which is completely similar to the previous one, although considerably longer).
We stress the fact that the proof of Theorem 4.2 above also contains the proof of the following one, which means that the adjoint action of the R-matrix does specialize for $q \to 1$ to something more than formal, with a very precise geometric meaning:
Theorem 4.3. For the braided Hopf algebra $(F[H], R_1) = (U_1, R_1)$ the "formal automorphism" $R_1$ is in fact an effective Poisson automorphism of the field of rational functions on $H^{(2)} \times H^{(2)}$. In other words, $R_1$ defines a Poisson birational automorphism $R_1^*$ of $H^{(2)} \times H^{(2)}$ which enjoys the dual properties of (1.4–6), in particular $m\big(R_1^*(x, y)\big) = y \cdot x$ for all $x, y \in H^{(2)}$.
4.2. Recall that, by general theory, one has Spec(A ⊗ B) = Spec(A) × Spec(B) for all associative unital algebras A and B; moreover, if M and N are Poisson manifolds then M × N is a Poisson manifold too, whose symplectic leaves are all the products of symplectic leaves of M and N. In our context this implies that the symplectic leaves of H×2 (resp. H×2) are all the products of symplectic leaves of H (resp. H). Let N ∈ {2ℓ, ∞}. Let us denote by Z0 the algebra of meromorphic functions on H(N). As we said, we can look at Sε as a sheaf of algebras over H; similarly, we can look at Sε(N) as a sheaf of algebras over H: its algebra of meromorphic sections Uε is then Uε = Z0 ⊗Z0 Uε. Now represent the elements of H(N) = Spec(Z0) as maximal ideals of Z0, and let m ∈ H(N): the fibre over m of our sheaf is then Uε / m Uε. Similarly, the fibre over (m, n) ∈ H(N) × H(N) (of the sheaf of algebras Sε(N) × Sε(N) over H(N) × H(N)) is (Uε ⊗ Uε) / (m Uε ⊗ Uε + Uε ⊗ n Uε).
Proposition 4.2. $R^*_{\varepsilon,N}$ is a meromorphic automorphism of $S_\varepsilon^{(N)} \times S_\varepsilon^{(N)}$ as a fibre bundle over $H^{(N)} \times H^{(N)}$ with respect to the meromorphic automorphism $R^*_{\varepsilon,N}\big|_{H^{(N)} \times H^{(N)}}$ of the base space $H^{(N)} \times H^{(N)}$; in other words, the following diagram is commutative:
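$$\begin{array}{ccc}
S_\varepsilon^{(N)} \times S_\varepsilon^{(N)} & \xrightarrow{\ R^*_{\varepsilon,N}\ } & S_\varepsilon^{(N)} \times S_\varepsilon^{(N)} \\
\pi \downarrow & & \downarrow \pi \\
H^{(N)} \times H^{(N)} & \xrightarrow{\ R^*_{\varepsilon,N}\ } & H^{(N)} \times H^{(N)}
\end{array}$$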
where $\pi: S_\varepsilon^{(N)} \times S_\varepsilon^{(N)} \to H^{(N)} \times H^{(N)}$ is the projection map of the fibre bundle. In particular $R^*_{\varepsilon,N}$ leaves invariant the fibres of $S_\varepsilon^{(N)} \times S_\varepsilon^{(N)}$ over symplectic leaves of $H^{(N)} \times H^{(N)}$ (i.e. the preimages, with respect to $\pi$, of symplectic leaves of $H^{(N)} \times H^{(N)}$).
Proof. This is more or less trivial, by construction. Let $s = (m, n) \in \mathrm{Spec}\big(Z_0^{\otimes 2}\big) = H^{(N)} \times H^{(N)}$ be a maximal ideal of $Z_0^{\otimes 2}$; then its fibre $U_\varepsilon^{\otimes 2} \big/ s\,U_\varepsilon^{\otimes 2}$ is mapped by $R^*_{\varepsilon,N}$ onto $R^*_{\varepsilon,N}\big(U_\varepsilon^{\otimes 2} \big/ s\,U_\varepsilon^{\otimes 2}\big) = U_\varepsilon^{\otimes 2} \big/ R_\varepsilon^{-1}(s)\,U_\varepsilon^{\otimes 2}$, whence everything easily follows.
Note added in proof. Let $H$ be a quasitriangular Hopf algebra, with $R$ as R-matrix. Then it is well known that, if $V_1, \dots, V_n$ ($n \in \mathbb{N}_+$) are representations of $H$, then an action of the braid group $B_n$ on $V_1 \otimes \cdots \otimes V_n$ is defined by means of $R$. Similarly, if $(H, R)$ is braided, then an action of $B_n$ is defined on $H^{\otimes n}$ by means of $R$. Therefore, by duality, Theorem 4.1 and Theorem 4.2 imply that the braid group $B_n$ acts – by birational automorphisms, which are also Poisson in the second and third case – on $\big(S_\varepsilon^{(2\ell)}\big)^{\times n}$, on $\big(H_\varepsilon^{(2)}\big)^{\times n}$ and on $\big(H^{(2)}\big)^{\times n}$.
References
[DD] Damiani, I., De Concini, C.: Quantum groups and Poisson groups. In: W. Baldoni, M. Picardello (eds.), Representations of Lie groups and quantum groups. London: Longman Scientific & Technical, 19??
[DK] De Concini, C., Kac, V. G.: Representations of Quantum Groups at Roots of 1. In: Colloque Dixmier 1989, Progr. in Math. 92, 471–506 (1990)
[DKP] De Concini, C., Kac, V. G., Procesi, C.: Quantum coadjoint action. J. Am. Math. Soc. 5, 151–189 (1992)
[DL] De Concini, C., Lyubashenko, V.: Quantum function algebra at roots of 1. Adv. in Math. 108, 205–262 (1994)
[DP] De Concini, C., Procesi, C.: Quantum Groups. In: L. Boutet de Monvel, C. De Concini, C. Procesi, P. Schapira, M. Vergne (eds.), D-modules, Representation Theory, and Quantum Groups. Lecture Notes in Mathematics 1565, Berlin–Heidelberg–New York: Springer-Verlag, 1993
[Dr] Drinfeld, V. G.: Quantum Groups. In: Proc. Intern. Congress of Math., Berkeley, 1986. Providence, RI: AMS, 1987, pp. 798–820
[Ex] Exton, H.: q-Hypergeometric Functions and Applications. Ellis Horwood Series Mathematics and its Applications, London: Ellis Horwood Limited, 1983
[GR] Gasper, G., Rahman, M.: Basic hypergeometric series. Encyclopedia of Mathematics and its Applications 35, Cambridge: Cambridge University Press, 1990
[Ji] Jimbo, M.: A q-difference analogue of U(g) and the Yang-Baxter equation. Lett. Math. Phys. 10, 63–69 (1985)
[KR] Kirillov, A. N., Reshetikhin, N.: q-Weyl Group and a Multiplicative Formula for Universal R-Matrices. Comm. Math. Phys. 134, 421–431 (1990)
[LS] Levendorskij, S. Z., Soibel'man, Ya. S.: Some applications of the quantum Weyl groups. J. Geom. Physics 7, 241–254 (1990)
[Re] Reshetikhin, N.: Quasitriangularity of quantum groups at roots of 1. Commun. Math. Phys. 170, 79–99 (1995)
[Ta] Tanisaki, T.: Killing forms, Harish-Chandra Isomorphisms, and Universal R-Matrices for Quantum Algebras. Internat. J. Modern Phys. A 7, Suppl. 1 B, 941–961 (1992)
Communicated by G. Felder
Commun. Math. Phys. 184, 119 – 141 (1997)
Communications in Mathematical Physics © Springer-Verlag 1997
The Affine Sugawara Generators at Arbitrary Level
R.W. Gebert*, K. Koepsell, H. Nicolai
II. Institut für Theoretische Physik, Universität Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany
Received: 2 May 1996 / Accepted: 15 August 1996
* Supported by Deutsche Forschungsgemeinschaft under Contract No. DFG Ni 290/3-1. Correspondence to: H. Nicolai, E-mail: [email protected]

Abstract: We construct an explicit representation of the affine Sugawara generators for arbitrary level in terms of the homogeneous Heisenberg subalgebra, which generalizes the well-known expression at level 1. This is achieved by employing a physical vertex operator realization of the affine algebra at arbitrary level, in contrast to the Frenkel–Kac–Segal construction which uses unphysical oscillators and is restricted to level 1. At higher level, the new operators are transcendental functions of DDF "oscillators" unlike the quadratic expressions for the level-1 generators. An essential new feature of our construction is the appearance, beyond level 1, of new types of poles in the operator product expansions in addition to the ones at coincident points, which entail (controllable) non-localities in our formulas. We demonstrate the utility of the new formalism by explicitly working out some higher-level examples. Our results have important implications for the problem of constructing explicit representations for higher-level root spaces of hyperbolic Kac–Moody algebras, and $E_{10}$ in particular.

1. Introduction
The affine Sugawara construction [31, 30, 5, 6, 1, 8, 7, 26, 18, 32, 28, 10, 20] and the coset construction [16] have both come to play a prominent role in string theory and in the theory of Kac–Moody algebras (see e.g. [23, 27] for the general theory). As is well known, the affine Sugawara construction extends a representation of an affine Lie algebra g to that of its semidirect product with the Virasoro algebra $\mathrm{Vir}_g$. The coset construction, in turn, is based on the Sugawara construction: given an affine subalgebra p ⊂ g, there always exists another Virasoro algebra corresponding to the difference of Virasoro operators associated with g and p, respectively, such that the resulting coset Virasoro algebra $\mathrm{Vir}_{g,p}$ commutes with the affine subalgebra p. It is for this reason that the coset construction, whose main original application was the explicit description of c < 1 Virasoro modules [16], has acquired great importance in the representation theory of affine algebras [25]. More specifically, every highest weight representation L(Λ) of g can be decomposed w.r.t. the direct sum $p \oplus \mathrm{Vir}_{g,p}$ as follows:
$$L(\Lambda) = \bigoplus_{\lambda \in P_+^{g,p}} L_p(\lambda) \otimes U(\Lambda, \lambda). \tag{1.1}$$
The relevant specialization of this formula for tensor products of g modules is obtained by taking p to be the diagonal subalgebra of g ⊗ 1 ⊕ 1 ⊗ g. Unfortunately, it is not so easy in general to compute the products (1.1) in practice. For simple examples, where the central charge of the Virasoro module U (Λ, λ) obeys c < 1 and unitarity restricts the conformal weights h to a finite set of allowed values, one can work out the product explicitly. However, at higher level the right-hand side of (1.1) will contain many terms; furthermore, these will in general correspond to Virasoro Verma modules with central charge c > 1 where the values of h are unrestricted (apart from h ≥ 0). A more serious problem arises when one tries to exploit (1.1) to analyze the structure of hyperbolic Kac–Moody algebras (as was done for the level-2 sector in [9, 24]): apart from the fact that it is not clear which representations will occur at a given level, the division by the multilinear Serre relations in general eliminates infinitely many states of U (Λ, λ), such that the resulting infinite superposition of affine representations no longer has the structure of a Virasoro module. Purely representation theoretic means may thus not be sufficient to make further progress in this direction. Rather, we believe that what is needed at this point is a more explicit and manageable representation of the relevant operators which mediate between the different affine modules, and which will differ from the standard coset generators after imposition of the Serre relations. From our previous work all indications are that a complete solution of this problem will necessitate the incorporation of longitudinal DDF operators, but an indispensable first step towards this goal is the construction of an explicit representation for the higher-level Sugawara operators themselves. This is the problem which we address and solve in the present paper. 
The famous FKS vertex operator realization [12, 28] of nontwisted affine Lie algebras corresponds to a spatially compactified bosonic string whose momentum lattice is taken to be the (Euclidean) root lattice of a finite-dimensional simple Lie algebra of ADE type. The Laurent coefficients ("modes") of the tachyon vertex operators together with the string oscillators then constitute a basis of the affine algebra. This basis, however, is not physical in the sense of string theory since, except for the zero mode, these mode operators do not commute with the Virasoro constraints. Furthermore, the FKS construction has the drawback of being restricted to affine algebras at level 1. In [11, 17], it was noticed that if the momentum lattice of the string is enlarged by a two-dimensional Minkowski lattice then the zero mode operators by themselves already lead to a basis of the affine algebra. Our starting point was the observation that apart from being manifestly physical, this construction is applicable to affine Lie algebras at arbitrary level and thus more general than the FKS construction (this fact is apparently not widely known). Exploiting this vertex operator construction we arrive at the new formula (3.15) which elegantly comprises the known features of the Sugawara operators and enjoys additional remarkable properties. The usual vertex operator formalism relies on the evaluation of the singular terms arising in the expansion of certain operator products at coincident points. By contrast, we here find that, at level $\ell$, additional poles appear at the non-coincident points $z = w_p := \zeta^p w$ in the expansion of the product of two conformal operators supported at $z$ and $w$, where $\zeta$ is a primitive $\ell$-th root of unity. This is still visible in our new formula which is non-local in the sense that the integrand in (3.15) depends both on $w$ and $w_p$. Nevertheless, these non-localities remain controllable as there is always only a finite number of them present. The final expression for the general level-$\ell$ Sugawara operators is a transcendental function of DDF oscillators forming a Heisenberg algebra. A new "transversal coordinate" field occurs, which has the form of the old Fubini-Veneziano field, but for which the usual string oscillators are replaced by the level-$\ell$ DDF operators. The new formula involves an exponential dependence on this new field; since the DDF operators themselves are defined by exponentials of the string oscillators, our construction may thus even be termed "doubly transcendental." For the special case of level 1 we immediately recover the well-known result [10] that the Sugawara operators become quadratic expressions in the Heisenberg oscillators. We would like to stress that the affine Sugawara generators obtained in this way act on the affine modules which arise as subspaces of the full DDF Fock space; since, however, we are here not primarily concerned with the representation theoretic applications of our techniques to the theory of Kac–Moody algebras, this point will be elaborated in a subsequent paper [15]. In [23, Lemma 12.8.b)], the eigenvalue of the Sugawara zero mode $L_0^{[\ell]}$ at arbitrary level on the vacuum vector was computed by using the properties of the affine Casimir operator. With the help of our new formula, this result can be rederived by a short calculation (cf. Eq. (4.7)).
Apart from the new structural insights afforded by the new formula, our results explicitly illustrate the increasing "anisotropy" of the higher-level root spaces of hyperbolic Kac–Moody algebras which was already observed in [14], and which can be explained by the symmetry breaking of the full (affine) Weyl group down to a finite (and generically trivial) subgroup called "little Weyl group" in [14]. From a more technical perspective this phenomenon is due to the appearance of certain weighted sums of tensor products of roots (see (5.4)) which have not yet appeared in the literature to the best of our knowledge. We would like to emphasize that these and other special features of hyperbolic Kac–Moody algebras cannot be explained by string compactification alone. In other words, such algebras reveal an enigma beyond the string vertex operator construction. By contrast, many of the generalized Kac–Moody (super)algebras which have recently received attention (see e.g. [3, 21, 22]) can presumably be realized as untruncated, hence bona fide, algebras of physical vertex operators of some compactified string. For instance, the so-called fake monster Lie algebra is just the algebra of all transversal physical states of the bosonic string compactified on $II_{25,1}$; its root spaces are therefore perfectly "isotropic" and the associated root multiplicities are simply given by the number of physical string states.
2. Preliminaries
We consider a nontwisted affine Lie algebra $g$ with underlying simple finite-dimensional Lie algebra $\bar g$ of rank $d-2$ ($d \geq 3$). The associated hyperbolic Kac–Moody algebra $\hat g$ of rank $d$ is obtained by adjoining to the affine Dynkin diagram another node which is related to the over-extended simple root $r_{-1}$. The extended (Minkowskian) affine root lattice is defined as $\hat Q := Q \oplus \mathbb{Z}\Lambda_0$ (viewed as the even sublattice of the affine weight lattice), where $Q$ denotes the affine root lattice and $\Lambda_0 := r_{-1} + \delta$ is a null vector conjugate to the affine null root $\delta$. Clearly, $\hat Q$ is just the hyperbolic root lattice. For any element $\Lambda$ of $\hat Q$, the level $\ell$ is defined by
$$\ell := -\Lambda\cdot\delta. \tag{2.1}$$
Now suppose that $\Lambda \in \hat Q$ is a root of $\hat g$ of nonzero level. The DDF decomposition of $\Lambda$ [14] is defined by
$$\Lambda = a - n k_\ell, \tag{2.2}$$
where we have put
$$k_\ell := -\frac{1}{\ell}\,\delta \in Q_{\mathbb{Q}} := Q \otimes_{\mathbb{Z}} \mathbb{Q}, \tag{2.3}$$
and where the vector $a$ is uniquely determined by requiring $a^2 = 2$, i.e., $n = 1 - \frac{1}{2}\Lambda^2$. Note that $n$ is always a non-negative integer since $\Lambda^2 \leq 2$ for any root. We will refer to $a$ as the "tachyonic level-$\ell$ vector" and to $|a\rangle$ as the "tachyonic level-$\ell$ state" associated to $\Lambda$; beyond level 1, it need not be an element of the lattice $\hat Q$ but only of its rational extension. Let us furthermore introduce the orthonormal polarization vectors $\xi^i \equiv \xi^i(a)$ satisfying $\xi^i\cdot\xi^j = \delta^{ij}$ and $\xi^i\cdot\delta = \xi^i\cdot\Lambda = \xi^i\cdot a = 0$. They constitute a basis of the complex vector space $\bar h^*$ dual to the Cartan subalgebra $\bar h$ of $\bar g$. Then we define the operators
$${}^{[\ell]}\!A^i_m(a) := \oint \frac{dz}{2\pi i}\, \xi^i(a)\cdot P(z)\, e^{imk_\ell\cdot X(z)}, \tag{2.4}$$
$${}^{[\ell]}\!E^r_m := \oint \frac{dz}{2\pi i}\, {:}e^{i(r+mk_\ell)\cdot X(z)}{:}, \tag{2.5}$$
for $m \in \mathbb{Z}$, $1 \leq i \leq d-2$, $r \in \bar\Delta$ (roots of $\bar g$). Here we have used the well-known Fubini–Veneziano coordinate and momentum fields, respectively,
$$X^\mu(z) := q^\mu - ip^\mu \ln z + i\sum_{m\neq 0} \frac{1}{m}\,\alpha^\mu_m z^{-m}, \tag{2.6}$$
$$P^\mu(z) := i\frac{d}{dz}X^\mu(z) = \sum_{m\in\mathbb{Z}} \alpha^\mu_m z^{-m-1}, \tag{2.7}$$
($m \in \mathbb{Z}$, $1 \leq \mu \leq d$), expressed in terms of the string oscillators $\alpha^\mu_m$
$$[\alpha^\mu_m, \alpha^\nu_n] = m\,\eta^{\mu\nu}\,\delta_{m+n,0}. \tag{2.8}$$
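As a quick consistency check on (2.6) and (2.7), one can differentiate the coordinate field term by term. The sketch below assumes the usual convention $\alpha^\mu_0 \equiv p^\mu$ for the zero mode, which the text above does not state explicitly:

```latex
i\frac{d}{dz}X^\mu(z)
  = i\frac{d}{dz}\Big(q^\mu - ip^\mu\ln z
      + i\sum_{m\neq 0}\frac{1}{m}\,\alpha^\mu_m z^{-m}\Big)
  = p^\mu z^{-1} + \sum_{m\neq 0}\alpha^\mu_m z^{-m-1}
  = \sum_{m\in\mathbb{Z}}\alpha^\mu_m z^{-m-1},
  \qquad \alpha^\mu_0 \equiv p^\mu .
```

The logarithm contributes the $m = 0$ term, and each oscillator term picks up a factor $i \cdot i \cdot (-m)/m = 1$.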
The shift of any polarization vector $\xi^i(a)$ along the $\delta$ direction leaves the associated DDF operator ${}^{[\ell]}\!A^i_m(a)$ unchanged for $m \neq 0$, because the residue of a total derivative always vanishes. Since the polarization vectors of two tachyonic level-$\ell$ states are related by
$$\xi^i(a') = \xi^i(a) + \frac{1}{\ell}\big(\xi^i(a)\cdot a'\big)\,\delta, \tag{2.9}$$
we are thus effectively dealing with a single set of DDF operators for $m \neq 0$,
$${}^{[\ell]}\!A^i_m \equiv {}^{[\ell]}\!A^i_m(a) = {}^{[\ell]}\!A^i_m(a'); \tag{2.10}$$
so we can suppress the labels $a$, $a'$ in the remainder, i.e. write ${}^{[\ell]}\!A^i_m \equiv {}^{[\ell]}\!A^i_m(a)$. Let us stress, however, that the zero mode operators do differ for different $a$, viz.
$${}^{[\ell]}\!A^i_0(a) = \xi^i(a)\cdot p = \xi^i(a')\cdot p - \frac{1}{\ell}\big(\xi^i(a)\cdot a'\big)\,\delta\cdot p = {}^{[\ell]}\!A^i_0(a') - \frac{1}{\ell}\big(\xi^i(a)\cdot a'\big)\,\delta\cdot p. \tag{2.11}$$
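The total-derivative argument behind (2.10) can be spelled out as follows. This is a sketch for $m \neq 0$; it uses $\delta = -\ell k_\ell$ from (2.3), $P = i\,dX/dz$ from (2.7), and $k_\ell^2 = 0$ (so that no normal ordering is needed), and it assumes that the integrand is single-valued on the states considered:

```latex
\frac{d}{dz}\,e^{imk_\ell\cdot X(z)}
  = im\,\Big(k_\ell\cdot\frac{dX}{dz}\Big)\,e^{imk_\ell\cdot X(z)}
  = m\,k_\ell\cdot P(z)\,e^{imk_\ell\cdot X(z)}
\quad\Longrightarrow\quad
\oint\frac{dz}{2\pi i}\,\delta\cdot P(z)\,e^{imk_\ell\cdot X(z)}
  = -\frac{\ell}{m}\oint\frac{dz}{2\pi i}\,\frac{d}{dz}\,e^{imk_\ell\cdot X(z)}
  = 0 .
```

Hence replacing $\xi^i(a)$ by $\xi^i(a) + c\,\delta$ in (2.4), as in (2.9), changes nothing for $m \neq 0$, while for $m = 0$ the extra term $c\,\delta\cdot p$ survives, in agreement with (2.11).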
For definiteness, we choose the polarization vectors to be $\xi^i(\Lambda_0)$ throughout this paper, where $\Lambda_0$ denotes the above fundamental dominant weight of level 1 with tachyonic level-1 vector $a_0 = \Lambda_0 - \delta \in \hat Q$. This vector $a_0$ plays the role of the simple root $r_{-1}$ occurring in the canonical extension of $g$ to the hyperbolic Kac–Moody algebra $\hat g$ with $\hat Q$ as root lattice. Last but not least we have the obvious relation
$${}^{[1]}\!A^i_m(a) = {}^{[\ell]}\!A^i_{\ell m}(a). \tag{2.12}$$
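Both the value of $n$ in the DDF decomposition (2.2) and the relation (2.12) follow from short computations. The sketch below uses only $\delta^2 = 0$, the definition (2.1) of the level, and (2.3); the example $\Lambda = \Lambda_0$ is the one quoted in the text:

```latex
a^2 = (\Lambda + n k_\ell)^2
    = \Lambda^2 + 2n\,\Lambda\cdot k_\ell + n^2 k_\ell^2
    = \Lambda^2 - \frac{2n}{\ell}\,\Lambda\cdot\delta
    = \Lambda^2 + 2n
\quad\Longrightarrow\quad
a^2 = 2 \iff n = 1 - \tfrac{1}{2}\Lambda^2 ;
\\[4pt]
\Lambda = \Lambda_0:\quad \ell = 1,\ \Lambda_0^2 = 0
\ \Longrightarrow\ n = 1,\quad
a_0 = \Lambda_0 + k_1 = \Lambda_0 - \delta ;
\\[4pt]
k_1 = -\delta = \ell\,k_\ell
\ \Longrightarrow\
e^{imk_1\cdot X(z)} = e^{i(\ell m)k_\ell\cdot X(z)}
\ \Longrightarrow\
{}^{[1]}\!A^i_m(a) = {}^{[\ell]}\!A^i_{\ell m}(a).
```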
The above operators obey the commutation relations
$$[{}^{[\ell]}\!A^i_m, {}^{[\ell]}\!A^j_n] = m\,\delta^{ij}\,\delta_{m+n,0}\,{}^{[\ell]}\!K_0, \tag{2.13}$$
$$[{}^{[\ell]}\!A^i_m, {}^{[\ell]}\!E^r_n] = (\xi^i\cdot r)\,{}^{[\ell]}\!E^r_{m+n}, \tag{2.14}$$
$$[{}^{[\ell]}\!E^r_m, {}^{[\ell]}\!E^s_n] = \begin{cases} 0 & \text{if } r\cdot s \geq 0, \\ \epsilon(r,s)\,{}^{[\ell]}\!E^{r+s}_{m+n} & \text{if } r\cdot s = -1, \\ -{}^{[\ell]}\!A^r_{m+n} - m\,\delta_{m+n,0}\,{}^{[\ell]}\!K_0 & \text{if } r\cdot s = -2, \end{cases} \tag{2.15}$$
where ${}^{[\ell]}\!K_0 := k_\ell\cdot p = -\frac{1}{\ell}\,\delta\cdot p$ denotes the operator realization of the central element of the affine algebra and
$${}^{[\ell]}\!A^r_m := \oint \frac{dz}{2\pi i}\, r\cdot P(z)\, e^{imk_\ell\cdot X(z)} = \sum_i (\xi^i\cdot r)\,{}^{[\ell]}\!A^i_m \qquad \forall r \in \bar\Delta.$$
As usual, we have to extend the Cartan subalgebra by an exterior derivative which we choose to be ${}^{[\ell]}\!d_0 := \ell\Lambda_0\cdot p$ for the basic (level 1) fundamental weight $\Lambda_0$ (note that $\Lambda_0^2 = 0$).
The operators ${}^{[1]}\!K_0$, ${}^{[1]}\!d_0$, ${}^{[1]}\!A^i_m$, ${}^{[1]}\!E^r_m$ ($1 \leq i \leq d-2$, $r \in \bar\Delta$, $m \in \mathbb{Z}$) establish a realization of the affine Lie algebra $g$, the level being given by the eigenvalue of the operator ${}^{[1]}\!K_0$. Note that in contrast to the FKS construction this vertex operator realization works for arbitrary level and is physical in the sense of string theory, i.e.
$$[L_m, {}^{[1]}\!K_0] = [L_m, {}^{[1]}\!d_0] = [L_m, {}^{[1]}\!A^i_n] = [L_m, {}^{[1]}\!E^r_n] = 0,$$
for all $m, n \in \mathbb{Z}$, $1 \leq i \leq d-2$, $r \in \bar\Delta$, where the operators
$$L_m := \frac{1}{2}\sum_{n\in\mathbb{Z}} {:}\alpha_n\cdot\alpha_{m-n}{:} \tag{2.16}$$
satisfy the standard Virasoro algebra with central charge $c = d$. There is yet another realization of the affine Lie algebra which is, however, restricted to level 1. Namely, on states with eigenvalue $\ell$ for ${}^{[1]}\!K_0$, the operators ${}^{[\ell]}\!K_0$, ${}^{[\ell]}\!d_0$, ${}^{[\ell]}\!A^i_m$, ${}^{[\ell]}\!E^r_m$ ($1 \leq i \leq d-2$, $r \in \bar\Delta$, $m \in \mathbb{Z}$) form a level-1 realization of $g$ which is also physical.
Since we are working with the so-called homogeneous vertex operator construction we will refer to the algebra of operators ${}^{[\ell]}\!A^i_m$ as the homogeneous Heisenberg subalgebra of the affine algebra. The crucial observation for our analysis is that these operators not only occur as part of the affine algebra but also as part of the spectrum generating
algebra for the physical string states. In this context, they are nothing but the well-known transversal DDF operators. A crucial new feature of our analysis is the appearance of the level-$\ell$ transversal coordinate field
$${}^{[\ell]}\!X^i(z) := Q^i - ip^i \ln z + i\sum_{m\neq 0} \frac{1}{m}\,{}^{[\ell]}\!A^i_m z^{-m}, \tag{2.17}$$
and the level-$\ell$ transversal momentum field
$${}^{[\ell]}\!P^i(z) := iz\frac{d}{dz}\,{}^{[\ell]}\!X^i(z) = \sum_{m\in\mathbb{Z}} {}^{[\ell]}\!A^i_m z^{-m}, \tag{2.18}$$
respectively, neither of which has appeared in the literature so far. The "center of mass coordinate" $Q^i$ in (2.17) is not the usual one; however, its explicit form is not needed here and will be discussed in [15]. Evidently, the above fields are transcendental expressions in terms of the standard oscillator basis. The momentum field (2.18) is physical because it commutes with the Virasoro constraints term by term. The same is true for the fields
$${}^{[\ell]}Y^i_p(z) := {}^{[\ell]}\!X^i(z_p) - {}^{[\ell]}\!X^i(z), \qquad p = 1, \dots, \ell-1, \tag{2.19}$$
where $z_p := \zeta^p z$ and $\zeta$ denotes a primitive $\ell$-th root of unity.
The affine Sugawara generators built from the affine Cartan–Weyl basis (2.13)-(2.15) are
$$L^{[\ell]}_m := \frac{1}{2(\ell+h^\vee)} \sum_{n\in\mathbb{Z}} \bigg( \sum_{i=1}^{d-2} {}^{\times}_{\times}\,{}^{[1]}\!A^i_n\,{}^{[1]}\!A^i_{m-n}\,{}^{\times}_{\times} + \sum_{r\in\bar\Delta} {}^{\times}_{\times}\,{}^{[1]}\!E^r_n\,{}^{[1]}\!E^{-r}_{m-n}\,{}^{\times}_{\times} \bigg), \tag{2.20}$$
where $h^\vee$ denotes the dual Coxeter number of $\bar g$. Normal-ordering is defined by
$${}^{\times}_{\times}\,{}^{[1]}\!A^i_m\,{}^{[1]}\!A^j_n\,{}^{\times}_{\times} := \begin{cases} {}^{[1]}\!A^i_m\,{}^{[1]}\!A^j_n & \text{for } m \leq n, \\ {}^{[1]}\!A^j_n\,{}^{[1]}\!A^i_m & \text{for } m > n, \end{cases} \tag{2.21}$$
$${}^{\times}_{\times}\,{}^{[1]}\!E^r_m\,{}^{[1]}\!E^s_n\,{}^{\times}_{\times} := \begin{cases} {}^{[1]}\!E^r_m\,{}^{[1]}\!E^s_n & \text{for } m \leq n, \\ {}^{[1]}\!E^s_n\,{}^{[1]}\!E^r_m & \text{for } m > n. \end{cases} \tag{2.22}$$
It is well known that the operators $L^{[\ell]}_m$, $m \in \mathbb{Z}$, form a Virasoro algebra (see e.g. [19] and references therein),
$$[L^{[\ell]}_m, L^{[\ell]}_n] = (m-n)\,L^{[\ell]}_{m+n} + \frac{c(\ell)}{12}\,(m^3-m)\,\delta_{m+n,0}\,{}^{[\ell]}\!K_0, \tag{2.23}$$
with central charge
$$c(\ell) := \frac{\ell\,\dim\bar g}{\ell+h^\vee}. \tag{2.24}$$
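Formula (2.24) is easy to evaluate for concrete algebras. The snippet below is only an illustrative sketch: the table of dimensions and dual Coxeter numbers lists standard values for a few simply-laced algebras, and the names `ADE_DATA` and `sugawara_central_charge` are hypothetical helpers, not notation from the paper.

```python
from fractions import Fraction

# (dim, dual Coxeter number, rank) for some simply-laced simple Lie algebras.
# These are standard textbook values, listed here for illustration only.
ADE_DATA = {
    "A1": (3, 2, 1),
    "A2": (8, 3, 2),
    "D4": (28, 6, 4),
    "E6": (78, 12, 6),
    "E7": (133, 18, 7),
    "E8": (248, 30, 8),
}

def sugawara_central_charge(level, dim, dual_coxeter):
    """Central charge c(l) = l * dim / (l + h^vee) of Eq. (2.24), as an exact fraction."""
    return Fraction(level * dim, level + dual_coxeter)

if __name__ == "__main__":
    for name, (dim, h_dual, rank) in ADE_DATA.items():
        charges = [sugawara_central_charge(level, dim, h_dual) for level in (1, 2, 3)]
        print(name, "rank", rank, "c(1), c(2), c(3) =", ", ".join(str(c) for c in charges))
```

For each algebra in this table the level-1 value equals the rank, i.e. the number $d-2$ of transversal directions, which matches the fact that the level-1 generators are quadratic in $d-2$ sets of Heisenberg oscillators; at higher level the value is no longer the rank.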
These operators act as outer derivations on the affine algebra so that one obtains a semidirect product $\mathrm{Vir}_{L^{[\ell]}} \ltimes g$:
$$[L^{[\ell]}_m, {}^{[1]}\!A^i_n] = -n\,{}^{[1]}\!A^i_{m+n}, \qquad [L^{[\ell]}_m, {}^{[1]}\!E^r_n] = -n\,{}^{[1]}\!E^r_{m+n}. \tag{2.25}$$
By construction, the Sugawara generators are physical, viz.
$$[L^{[\ell]}_m, L_n] = 0 \qquad \forall m, n \in \mathbb{Z}. \tag{2.26}$$
Thus the above semidirect product is a symmetry of the physical string spectrum, whereas in the FKS approach only the full Fock space carries a (level-1) representation of the affine algebra. It should be mentioned that in addition to the operators $L^{[\ell]}_m$, there is another infinity of "physical" Virasoro algebras (but with uniform central charge $c = 26 - d$) generated by the longitudinal DDF operators ${}^{[\ell]}\!A^-_m(a)$, all of which commute with the Sugawara generators (2.20). However, we will not elaborate on this point here; for further information, the interested reader may consult [14].
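3. The Main Formula
Our aim is to rewrite the Sugawara generators $L^{[\ell]}_m$ in terms of the homogeneous Heisenberg subalgebra spanned by the ${}^{[\ell]}\!A^i_m$'s. This will be the generalization of the well-known result
$$L^{[1]}_m = \frac{1}{2}\sum_{n\in\mathbb{Z}}\sum_{i=1}^{d-2} {}^{\times}_{\times}\,{}^{[1]}\!A^i_n\,{}^{[1]}\!A^i_{m-n}\,{}^{\times}_{\times}, \tag{3.1}$$
which is referred to in the literature as "the equivalence of the Sugawara and the Virasoro construction at level 1."¹ For this purpose, we wish to evaluate the operator products occurring in the second part of the Sugawara generators. We start from the well-known formulas
$${:}e^{i(r+n\delta)\cdot X(z)}{:}\ {:}e^{-i(r+(m+n)\delta)\cdot X(w)}{:} = (z-w)^{-2}\, {:}e^{ir\cdot[X(z)-X(w)]}{:}\; e^{-im\delta\cdot X(w)}\, e^{in\delta\cdot[X(z)-X(w)]}$$
and
$${:}e^{-i(r+(m-n)\delta)\cdot X(w)}{:}\ {:}e^{i(r-n\delta)\cdot X(z)}{:} = (z-w)^{-2}\, {:}e^{ir\cdot[X(z)-X(w)]}{:}\; e^{-im\delta\cdot X(w)}\, e^{-in\delta\cdot[X(z)-X(w)]},$$
where the exponentials involving $\delta$ need not be normal ordered since $\delta^2 = \delta\cdot r = 0$. Performing the contour integrals, we get
$$\sum_{r\in\bar\Delta}\sum_{n\in\mathbb{Z}} {}^{\times}_{\times}\,{}^{[1]}\!E^r_n\,{}^{[1]}\!E^{-r}_{m-n}\,{}^{\times}_{\times}$$
$$= \sum_{r\in\bar\Delta} \bigg\{ \oint_0 \frac{dw}{2\pi i} \oint_{|z|>|w|} \frac{dz}{2\pi i} \sum_{n\geq 0} {:}e^{i(r+n\delta)\cdot X(z)}{:}\ {:}e^{-i(r+(m+n)\delta)\cdot X(w)}{:} + \oint_0 \frac{dw}{2\pi i} \oint_{|z|<|w|} \frac{dz}{2\pi i} \sum_{n>0} {:}e^{-i(r+(m-n)\delta)\cdot X(w)}{:}\ {:}e^{i(r-n\delta)\cdot X(z)}{:} \bigg\}$$
$$= \sum_{r\in\bar\Delta} \oint_0 \frac{dw}{2\pi i} \sum_{w_p\in\{\text{poles}\}} \oint_{z=w_p} \frac{dz}{2\pi i}\, (z-w)^{-2}\, {:}e^{ir\cdot[X(z)-X(w)]}{:}\; e^{-im\delta\cdot X(w)} \times \sum_{n\geq 0} e^{in\delta\cdot[X(z)-X(w)]}, \tag{3.2}$$
¹ Although, strictly speaking, this statement has only been proven in the framework of the FKS construction.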
R.W. Gebert, K. Koepsell, H. Nicolai
where the second sum runs over all poles of the integrand in the region
$$ C_w := \lim_{\epsilon\to0}\{z \mid |w|-\epsilon \le |z| \le |w|+\epsilon\} = \{z \mid |z| = |w|\}, $$
i.e., on a circle of radius $|w|$ in the $z$ plane. A crucial observation for our construction is that besides the obvious pole at $z = w$, there will be extra poles for level $|\ell| \ge 2$ in the operator-valued function
$$ Y(z,w) := \sum_{n\ge0} e^{in\delta\cdot[X(z)-X(w)]}. $$
These are due to the replacement of $(w/z)^n$ by $(w/z)^{\ell n}$ in the momentum mode contributions when the infinite sum defining $Y(z,w)$ acts on a level-$\ell$ state $|a\rangle$. More specifically, we shall see that, when acting on such states, this operator gives rise to poles of arbitrary order located at (see Fig. 1)
$$ z = w_p := \zeta^p w, \qquad 1 \le p \le \ell, \tag{3.3} $$
where $\zeta := e^{2\pi i/\ell}$ and $\ell$ denotes the eigenvalue of ${}^{[1]}\!K_0$. These extra poles will lead to non-local (in the sense of quantum field theory) integrands in our final expressions.
Fig. 1. Location of poles for level $\ell = 7$
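Let us first analyze the pole that $Y(z,w)$ gives rise to at $z = w \equiv w_\ell$; expansion around $z = w$ yields
$$ Y(z,w) = -\frac{1}{(z-w)\,f_\ell(z,w)}, \tag{3.4} $$
where the function $f_\ell(z,w)$ does not vanish at $z = w$; explicitly,
$$ \begin{aligned} f_\ell(z,w) &= \sum_{k\ge1}\frac{1}{k!}\,(z-w)^{k-1}\,\frac{\partial^k}{\partial z^k}\,e^{i\delta\cdot[X(z)-X(w)]}\Big|_{z=w} \\ &= \delta\cdot P(w) + \tfrac12(z-w)\bigl[\delta\cdot P'(w) + (\delta\cdot P(w))^2\bigr] \\ &\quad + \tfrac16(z-w)^2\bigl[\delta\cdot P''(w) + 3\,\delta\cdot P(w)\,\delta\cdot P'(w) + (\delta\cdot P(w))^3\bigr] + \ldots, \end{aligned} \tag{3.5} $$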
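with the momentum field $P(z)$ already defined in Eq. (2.7). When we insert the expansion of $Y(z,w)$ back into Eq. (3.2), we observe that the integrand has a pole of third order at $z = w_\ell \equiv w$. Application of Cauchy's theorem therefore yields
$$ \begin{aligned} \sum_{r\in\bar\Delta}\sum_{n\in\mathbb{Z}} {}^\times_\times\,{}^{[1]}\!E^r_n\,{}^{[1]}\!E^{-r}_{m-n}\,{}^\times_\times = \sum_{r\in\bar\Delta}\oint_0\frac{dw}{2\pi i}\,\Bigl\{ &-\frac12\,\frac{\partial^2}{\partial z^2}\,\frac{:e^{ir\cdot[X(z)-X(w)]}:}{f_\ell(z,w)}\Big|_{z=w} \\ &+ \sum_{p=1}^{\ell-1}\oint_{z=w_p}\frac{dz}{2\pi i}\,\frac{:e^{ir\cdot[X(z)-X(w)]}:}{(z-w)^2}\,Y(z,w)\Bigr\}\,e^{-im\delta\cdot X(w)}. \end{aligned} \tag{3.6} $$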
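The first term may be further simplified by noting that the sum over both the positive and negative roots of $\bar{\mathfrak g}$ cancels expressions linear in $r$. Hence
$$ \sum_{r\in\bar\Delta}\frac{\partial^2}{\partial z^2}\,\frac{:e^{ir\cdot[X(z)-X(w)]}:}{f_\ell(z,w)}\Big|_{z=w} = \sum_{r\in\bar\Delta}\Bigl\{\frac{:(r\cdot P(w))^2:}{\delta\cdot P(w)} + \frac{\delta\cdot P(w)}{6} - \frac{\delta\cdot P''(w)}{3(\delta\cdot P(w))^2} + \frac{(\delta\cdot P'(w))^2}{2(\delta\cdot P(w))^3}\Bigr\}, \tag{3.7} $$
where $[\delta\cdot P(w)]^{-1}$ is well defined on states of level $\ell \ne 0$, viz.
$$ [\delta\cdot P(w)]^{-1} = -w\,\Bigl[{}^{[1]}\!K_0 - \sum_{m\ne0}(\delta\cdot\alpha_m)\,w^{-m}\Bigr]^{-1} = -\frac{w}{\ell}\sum_{n\ge0}\Bigl[-\sum_{m\ne0}(k_\ell\cdot\alpha_m)\,w^{-m}\Bigr]^n. $$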
Next recall that the physical states of a subcritical bosonic string are finite linear combinations of states of the form
$$ {}^{[\ell]}\!A^{i_1}_{-m_1}\cdots{}^{[\ell]}\!A^{i_M}_{-m_M}\;{}^{[\ell]}\!A^-_{-n_1}\cdots{}^{[\ell]}\!A^-_{-n_N}\,|a\rangle, \tag{3.8} $$
where $|a\rangle$ is any tachyonic state with ${}^{[1]}\!K_0$-eigenvalue $\ell$ and the operators ${}^{[\ell]}\!A^i_m$ and ${}^{[\ell]}\!A^-_m$ denote the transversal and the longitudinal DDF operators, respectively [4]. In order to know the action of the Sugawara operators on arbitrary physical states, it is therefore sufficient to work out explicitly the action of the $L^{[\ell]}_m$'s on a tachyonic ground state and then to determine their commutation relations with the DDF operators. So let us consider a state $|a\rangle$ satisfying $a^2 = 2$ and ${}^{[1]}\!K_0|a\rangle = -(\delta\cdot a)|a\rangle = \ell|a\rangle$ for some $\ell \in \mathbb{N}$. Evidently, $|a\rangle$ is a highest weight state for $\mathrm{Vir}_{L^{[\ell]}}$,
$$ L^{[\ell]}_m|a\rangle = 0 \qquad \forall m > 0, $$
because $(a - m\delta)^2 = 2(1 + \ell m) > 2$ for $m > 0$, but there is no physical string state below the tachyon. For $m \ge 0$, we first have to evaluate $Y(z,w)|a\rangle$. We find that
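$$ \begin{aligned} Y(z,w)|a\rangle &= \sum_{n\ge0} e^{in\delta\cdot[X(z)-X(w)]}\,|a\rangle \\ &= \sum_{k\ge0}\sum_{n\ge0}\frac{n^k}{k!}\Bigl(\frac{w}{z}\Bigr)^{\ell n}\bigl[i\delta\cdot X_<(z) - i\delta\cdot X_<(w)\bigr]^k\,|a\rangle \\ &= \sum_{k\ge0}\frac{1}{k!}\,\frac{z^{\ell(k+1)}\,p_k\bigl((w/z)^\ell\bigr)}{(z^\ell - w^\ell)^{k+1}}\bigl[i\delta\cdot X_<(z) - i\delta\cdot X_<(w)\bigr]^k\,|a\rangle, \end{aligned} \tag{3.9} $$
where
$$ i\delta\cdot X_<(z) := \sum_{n>0}\frac1n\,(\delta\cdot\alpha_{-n})\,z^n, $$
and
$$ p_0 \equiv 1, \qquad p_{k+1}(x) := x\bigl[(1-x)\,p_k'(x) + (k+1)\,p_k(x)\bigr] \qquad \forall k \ge 0. \tag{3.10} $$
The latter recursion relation follows from the formula
$$ \sum_{n\ge0} n^k x^n = \Bigl(x\frac{d}{dx}\Bigr)^k\frac{1}{1-x} = \frac{p_k(x)}{(1-x)^{k+1}} \qquad (|x| < 1). \tag{3.11} $$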
The polynomials $p_k(x)$ only have positive coefficients. Indeed, the above recursion relations translate into
$$ p_{k+1,i} = i\,p_{k,i} + (k-i+2)\,p_{k,i-1} \qquad \forall k > 0,\ 0 \le i \le k+1, $$
where²
$$ p_k(x) = \sum_{i=0}^{k} p_{k,i}\,x^i. $$
Hence, in particular, the polynomials cannot vanish at $x = 1$, which proves that the term
$$ \frac{z^{\ell(k+1)}\,p_k\bigl((w/z)^\ell\bigr)}{(z^\ell - w^\ell)^{k+1}} $$
contains $\ell$ poles at $z = w_p \equiv \zeta^p w$, each of order $k+1$. On the other hand, the expression $[i\delta\cdot X_<(z) - i\delta\cdot X_<(w)]^k$ is a sum of terms of the form
$$ (z^{n_1} - w^{n_1})\cdots(z^{n_k} - w^{n_k})\,(\delta\cdot\alpha_{-n_1})\cdots(\delta\cdot\alpha_{-n_k}), \qquad n_i > 0\ \ \forall i, $$
each of them having a zero at $z = w$ of order $k$. In total, $Y(z,w)|a\rangle$ always has a simple pole at $z = w \equiv w_\ell$, which was already evaluated in (3.7), but exhibits a much more complicated pattern at the other poles. For example, if $(n_i, \ell) = m > 0$ (highest common divisor) then the poles at $z = e^{2\pi ik/m}$, $1 \le k \le m$, in $(z^\ell - w^\ell)^{-1}$ cancel against the zeros in $z^{n_i} - w^{n_i}$. Up to oscillator number two, for instance, one has the explicit formula
$$ \begin{aligned} Y(z,w)|a\rangle = \Bigl\{ &\frac{z^\ell}{z^\ell - w^\ell} + \frac{z^\ell w^\ell}{(z^\ell - w^\ell)^2}\Bigl[(z-w)(\delta\cdot\alpha_{-1}) + \tfrac12(z^2-w^2)(\delta\cdot\alpha_{-2}) + \ldots\Bigr] \\ &+ \frac{z^\ell w^\ell(z^\ell + w^\ell)}{2(z^\ell - w^\ell)^3}\Bigl[(z-w)^2(\delta\cdot\alpha_{-1})^2 + \ldots\Bigr] + \ldots \Bigr\}\,|a\rangle. \end{aligned} \tag{3.12} $$

² By induction, it is easy to show that the coefficients are symmetric in the sense that $p_{k,i} = p_{k,k-i+1}$ $\forall k > 0$, $0 \le i \le k$. In particular, $p_{k,k} = p_{k,1} = 1$ $\forall k > 0$.
It is obvious from this result that a direct evaluation of $L^{[\ell]}_{-m}|a\rangle$ quickly becomes unfeasible with increasing $m$. There is, however, an elegant argument which allows us to shortcut this calculation and to read off the result directly from the expression (3.6). We recall that the leading oscillator contribution of a DDF operator is
$$ {}^{[\ell]}\!A^i_{-m} \sim \xi_i\cdot\alpha_{-m} + \ldots, \qquad {}^{[\ell]}\!A^-_{-m} \sim -a\cdot\alpha_{-m} + \ldots\,. $$
Since these oscillators are linearly independent we can immediately rewrite a given physical state in terms of DDF states simply by identifying the leading oscillators. An important assumption here, without which this argument would be invalid, is that there must not be any null physical state present, because their appearance would spoil the nice oscillator structure. Now a glance at Eq. (3.6) shows that longitudinal oscillators are absent altogether. This means that the Sugawara generators when applied to any physical state neither produce null physical states nor additional longitudinal excitations (apart from those already contained in the initial state (3.8)). We conclude that the Sugawara generators can be rewritten in terms of the transversal DDF operators alone and that the result can be obtained by isolating those terms which do not contain $\delta\cdot\alpha_{-n}$ oscillators. For the second term in the Sugawara generators we find in this way that
$$ \begin{aligned} \sum_{r\in\bar\Delta}\sum_{n\in\mathbb{Z}} {}^\times_\times\,{}^{[1]}\!E^r_n\,{}^{[1]}\!E^{-r}_{m-n}\,{}^\times_\times = \sum_{r\in\bar\Delta}\oint_0\frac{dw}{2\pi i}\,w^{\ell m}\,\Bigl\{ &\frac{w}{2\ell}\,:(r\cdot P(w))^2: + \frac{\ell^2-1}{12\,\ell\,w} \\ &+ \sum_{p=1}^{\ell-1}\oint_{z=w_p}\frac{dz}{2\pi i}\,\frac{:e^{ir\cdot[X(z)-X(w)]}:\;z^\ell}{(z-w)^2\,(z^\ell - w^\ell)}\Bigr\} \\ &+ \text{terms containing } \delta\cdot\alpha_{-n}\text{'s}. \end{aligned} \tag{3.13} $$
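Note that the integrals around $z = w_p$ for the displayed terms can be immediately performed since the integrands have only simple poles. The above reasoning ensures that all terms involving $\delta$'s in (3.13) must combine with the other terms precisely in such a way that the ordinary string oscillators are replaced by DDF oscillators. After this “leap of faith” we arrive at
$$ \begin{aligned} \sum_{r\in\bar\Delta}\sum_{n\in\mathbb{Z}} {}^\times_\times\,{}^{[1]}\!E^r_n\,{}^{[1]}\!E^{-r}_{m-n}\,{}^\times_\times = \sum_{r\in\bar\Delta}\oint_0\frac{dw}{2\pi i}\,\Bigl\{ &\frac{1}{2\ell w}\,{}^\times_\times\bigl({}^{[\ell]}\!P^r(w)\bigr)^2\,{}^\times_\times + \frac{\ell^2-1}{12\,\ell\,w} \\ &+ \sum_{p=1}^{\ell-1}\frac{w_p}{\ell\,(w_p - w)^2}\;{}^\times_\times\,e^{ir\cdot[{}^{[\ell]}\!X(w_p) - {}^{[\ell]}\!X(w)]}\,{}^\times_\times\Bigr\}\,w^{\ell m}. \end{aligned} \tag{3.14} $$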
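The coefficient $w_p/[\ell(w_p - w)^2]$ in (3.14) is the residue of $z^\ell/[(z-w)^2(z^\ell - w^\ell)]$ at the simple pole $z = w_p$, and it can be rewritten as $-1/(\ell\,w\,|\zeta^p - 1|^2)$, which is where the weights $|\zeta^p - 1|^{-2}$ in the final formula come from. The sketch below checks both statements numerically for the c-number prefactor only (the operator-valued exponential is left out). The function names, the sample point `w`, and the contour radius are illustrative assumptions.

```python
import cmath


def integrand(z, w, ell):
    """c-number prefactor z^ell / ((z - w)^2 (z^ell - w^ell)) from (3.13)."""
    return z**ell / ((z - w) ** 2 * (z**ell - w**ell))


def numerical_residue(f, z0, radius=0.05, samples=512):
    """(1 / 2 pi i) times the integral of f over a small circle around z0.

    Uses the trapezoid rule in the angle; the circle must not enclose
    any other pole of f.
    """
    total = 0j
    for s in range(samples):
        phase = cmath.exp(2j * cmath.pi * s / samples)
        total += f(z0 + radius * phase) * radius * phase
    return total / samples


def residue_closed_form(w, ell, p):
    """w_p / (ell (w_p - w)^2) with w_p = zeta^p w."""
    w_p = cmath.exp(2j * cmath.pi * p / ell) * w
    return w_p / (ell * (w_p - w) ** 2)


def residue_via_modulus(w, ell, p):
    """The same quantity written as -1 / (ell w |zeta^p - 1|^2)."""
    zeta_p = cmath.exp(2j * cmath.pi * p / ell)
    return -1 / (ell * w * abs(zeta_p - 1) ** 2)


if __name__ == "__main__":
    w, ell = 0.8 + 0.3j, 7
    for p in range(1, ell):
        w_p = cmath.exp(2j * cmath.pi * p / ell) * w
        num = numerical_residue(lambda z: integrand(z, w, ell), w_p)
        print(p, abs(num - residue_closed_form(w, ell, p)))
```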
Our main result is thus the following new realization of the Sugawara generators at arbitrary level:
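Theorem 1. The operators
$$ \begin{aligned} L^{[\ell]}_m = {}&\frac{1}{2(\ell + h^\vee)}\sum_{n\in\mathbb{Z}}\sum_{i=1}^{d-2} {}^\times_\times\,{}^{[1]}\!A^i_n\,{}^{[1]}\!A^i_{m-n}\,{}^\times_\times + \frac{h^\vee}{2\ell(\ell + h^\vee)}\sum_{n\in\mathbb{Z}}\sum_{i=1}^{d-2} {}^\times_\times\,{}^{[\ell]}\!A^i_n\,{}^{[\ell]}\!A^i_{\ell m-n}\,{}^\times_\times \\ &- \frac{1}{2\ell(\ell + h^\vee)}\sum_{r\in\bar\Delta}\sum_{p=1}^{\ell-1}\frac{1}{|\zeta^p - 1|^2}\oint_0\frac{dw}{2\pi i}\,\Bigl\{w^{\ell m-1}\;{}^\times_\times\,e^{ir\cdot{}^{[\ell]}\!Y_p(w)}\,{}^\times_\times\Bigr\} \\ &+ \frac{(\ell^2-1)(d-2)\,h^\vee}{24\,\ell\,(\ell + h^\vee)}\,\delta_{m,0}, \end{aligned} \tag{3.15} $$
generate a Virasoro algebra with central charge
$$ c(\ell) := \frac{\ell\,\dim\bar{\mathfrak g}}{\ell + h^\vee}. $$
In deriving this result we have made use of the identity
$$ \sum_{r\in\bar\Delta} {}^\times_\times\,{}^{[\ell]}\!P^r(z)\,{}^{[\ell]}\!P^r(z)\,{}^\times_\times = 2h^\vee\sum_{i=1}^{d-2} {}^\times_\times\,{}^{[\ell]}\!P^i(z)\,{}^{[\ell]}\!P^i(z)\,{}^\times_\times. $$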
Observe that (3.15) is “doubly transcendental” as a function of the ordinary string oscillators because the new coordinate field (2.17), which itself is already a transcendental function of the string oscillators, appears in the exponential. Moreover, this expression is manifestly physical as it depends only on the difference of the coordinate field. Equation (3.15) contains the well-known formula (3.1) as a special case for $\ell = 1$. With the above formula, the level-$\ell$ energy-momentum tensor
$$ L^{[\ell]}(z) := \sum_{m\in\mathbb{Z}} L^{[\ell]}_m\,z^{-\ell m}, \tag{3.16} $$
becomes nonlocal, to wit
$$ \begin{aligned} L^{[\ell]}(z) = {}&\frac{1}{2(\ell + h^\vee)}\sum_{i=1}^{d-2} {}^\times_\times\,{}^{[1]}\!P^i(z^\ell)\,{}^{[1]}\!P^i(z^\ell)\,{}^\times_\times + \frac{h^\vee}{2\ell(\ell + h^\vee)}\sum_{i=1}^{d-2} {}^\times_\times\,{}^{[\ell]}\!P^i(z)\,{}^{[\ell]}\!P^i(z)\,{}^\times_\times \\ &- \frac{1}{2\ell(\ell + h^\vee)}\sum_{r\in\bar\Delta}\sum_{p=1}^{\ell-1}\frac{1}{|\zeta^p - 1|^2}\;{}^\times_\times\,e^{ir\cdot[{}^{[\ell]}\!X(z_p) - {}^{[\ell]}\!X(z)]}\,{}^\times_\times + \frac{(\ell^2-1)(d-2)\,h^\vee}{24\,\ell\,(\ell + h^\vee)}. \end{aligned} \tag{3.17} $$
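4. General Properties and Proof of Theorem

Before we prove that the expressions (3.15) for the level-$\ell$ Sugawara operators satisfy the Virasoro algebra, we would like to discuss some general features of our new formula. In the next section we will work out some explicit examples. Since the operators (3.15) are purely transversal in terms of DDF oscillators, we immediately see that they commute with the longitudinal DDF operators,
$$ [{}^{[\ell]}\!A^-_n, L^{[\ell]}_m] = 0 \qquad \forall m, n \in \mathbb{Z}. \tag{4.1} $$
The commutation relations with the transversal DDF operators can be verified as follows. A straightforward calculation yields
$$ \bigl[{}^{[\ell]}\!A^j_n,\ {}^\times_\times\,e^{ir\cdot{}^{[\ell]}\!Y_p(w)}\,{}^\times_\times\bigr] = (r\cdot\xi_j)\,(\zeta^{pn} - 1)\,w^n\;{}^\times_\times\,e^{ir\cdot{}^{[\ell]}\!Y_p(w)}\,{}^\times_\times. \tag{4.2} $$
Similarly, one finds that
$$ \Bigl[{}^{[\ell]}\!A^j_n,\ \sum_{k\in\mathbb{Z}} {}^\times_\times\,{}^{[\ell]}\!A^i_k\,{}^{[\ell]}\!A^i_{m-k}\,{}^\times_\times\Bigr] = 2n\,\delta^{ij}\;{}^{[\ell]}\!A^j_{m+n}. $$
Inserting these results into formula (3.15) we obtain
$$ \begin{aligned} [{}^{[\ell]}\!A^j_n, L^{[\ell]}_m] = {}&\frac{n}{\ell + h^\vee}\sum_{k\in\mathbb{Z}}\delta_{\ell k+n,0}\;{}^{[1]}\!A^j_{m-k} + \frac{n\,h^\vee}{\ell(\ell + h^\vee)}\;{}^{[\ell]}\!A^j_{\ell m+n} \\ &- \frac{1}{2\ell(\ell + h^\vee)}\sum_{r\in\bar\Delta}\sum_{p=1}^{\ell-1}(r\cdot\xi^j)\,\frac{\zeta^{pn} - 1}{|\zeta^p - 1|^2}\oint_0\frac{dw}{2\pi i}\,\Bigl\{w^{\ell m+n-1}\;{}^\times_\times\,e^{ir\cdot{}^{[\ell]}\!Y_p(w)}\,{}^\times_\times\Bigr\}, \end{aligned} \tag{4.3} $$
and therefore recover the formula $[{}^{[1]}\!A^j_n, L^{[\ell]}_m] = n\,{}^{[1]}\!A^j_{m+n}$ in agreement with Eq. (2.25). It is instructive to have a closer look at the new expression for $L^{[\ell]}_0$, which reads
$$ L^{[\ell]}_0|a\rangle = \Bigl\{\frac{1}{2\ell}\,\bar a^2 + \frac{(\ell^2-1)(d-2)\,h^\vee}{24\,\ell\,(\ell + h^\vee)} - \frac{1}{2\ell(\ell + h^\vee)}\sum_{r\in\bar\Delta}\sum_{p=1}^{\ell-1}\frac{\zeta^{p\,r\cdot a}}{|\zeta^p - 1|^2}\Bigr\}\,|a\rangle, \tag{4.4} $$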
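where $\bar a \equiv \bar\Lambda$ denotes the projection of $a$ (resp. $\Lambda$) onto $\bar{\mathfrak h}^*$. Let us focus on the last term. We have the following

Lemma 1. Let $\zeta$ be a primitive $\ell$-th root of unity and let $k = 0, \ldots, \ell-1$. Then
$$ \sum_{p=1}^{\ell-1}\frac{\zeta^{pk}}{|\zeta^p - 1|^2} = \tfrac{1}{12}(\ell^2 - 1) - \tfrac12\,k(\ell - k). $$

Proof.³ With the elementary algebraic identity (for $p \not\equiv 0 \bmod \ell$)
$$ \frac{1}{1 - \zeta^p} = -\frac{1}{\ell}\sum_{j=1}^{\ell-1} j\,\zeta^{pj}, $$
we immediately obtain
$$ \sum_{p=1}^{\ell-1}\frac{\zeta^{pk}}{|\zeta^p - 1|^2} = \frac{1}{\ell^2}\sum_{i,j,p=1}^{\ell-1} ij\,\zeta^{p(k+i-j)} = \frac{1}{\ell^2}\sum_{i,j=1}^{\ell-1} ij\sum_{p=0}^{\ell-1}\zeta^{p(k+i-j)} - \tfrac14(\ell-1)^2. \tag{4.5} $$
Invoking the following well-known property of sums of roots of unity,
$$ \sum_{p=0}^{\ell-1}\zeta^{pn} = \begin{cases} \ell & \text{if } n \equiv 0 \bmod \ell, \\ 0 & \text{else}, \end{cases} \tag{4.6} $$
we find that
$$ \begin{aligned} \sum_{i,j=1}^{\ell-1} ij\sum_{p=0}^{\ell-1}\zeta^{p(k+i-j)} &= \ell\sum_{i=1}^{\ell-1-k} i(i+k) + \ell\sum_{i=\ell-k}^{\ell-1} i(i+k-\ell) \\ &= \ell\sum_{i=1}^{\ell-1} i(i+k-\ell) + \ell^2\sum_{i=1}^{\ell-1-k} i \\ &= \ell\,\Bigl[\tfrac16(\ell-1)\ell(2\ell-1) + \tfrac12(k-\ell)\ell(\ell-1) + \tfrac12\,\ell(\ell-1-k)(\ell-k)\Bigr], \end{aligned} $$
and thus
$$ \begin{aligned} \sum_{p=1}^{\ell-1}\frac{\zeta^{pk}}{|\zeta^p - 1|^2} &= \tfrac16(\ell-1)(2\ell-1) - \tfrac12\,k(\ell-k) - \tfrac14(\ell-1)^2 \\ &= \tfrac{1}{12}(\ell^2-1) - \tfrac12\,k(\ell-k). \end{aligned} $$

³ We would like to thank H. Samtleben for the crucial idea.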
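Lemma 1 and the elementary identity used in its proof lend themselves to a quick floating-point check. The following Python sketch is only an illustration: the function names are hypothetical, and $\zeta$ is taken to be $e^{2\pi i/\ell}$, the primitive root used in (3.3).

```python
import cmath


def root_of_unity(ell, n):
    """zeta^n for zeta = exp(2 pi i / ell)."""
    return cmath.exp(2j * cmath.pi * n / ell)


def lemma_lhs(ell, k):
    """sum_{p=1}^{ell-1} zeta^{pk} / |zeta^p - 1|^2, evaluated numerically."""
    return sum(
        root_of_unity(ell, p * k) / abs(root_of_unity(ell, p) - 1) ** 2
        for p in range(1, ell)
    )


def lemma_rhs(ell, k):
    """(ell^2 - 1)/12 - k (ell - k)/2, the closed form of Lemma 1."""
    return (ell**2 - 1) / 12 - k * (ell - k) / 2


def identity_gap(ell, p):
    """|1/(1 - zeta^p) + (1/ell) sum_{j=1}^{ell-1} j zeta^{pj}| for p not 0 mod ell."""
    lhs = 1 / (1 - root_of_unity(ell, p))
    rhs = -sum(j * root_of_unity(ell, p * j) for j in range(1, ell)) / ell
    return abs(lhs - rhs)


if __name__ == "__main__":
    for ell in range(2, 8):
        for k in range(ell):
            print(ell, k, abs(lemma_lhs(ell, k) - lemma_rhs(ell, k)))
```

Note that the closed form is only claimed for $0 \le k \le \ell-1$; other values of $k$ have to be reduced modulo $\ell$ first, since the left-hand side is periodic in $k$ while the right-hand side is not.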
By use of this result and some well-known facts about the finite root system, we may rewrite the last term of the above formula for $L^{[\ell]}_0$ as
$$ \sum_{r\in\bar\Delta}\sum_{p=1}^{\ell-1}\frac{\zeta^{p\,r\cdot a}}{|\zeta^p - 1|^2} = \sum_{r\in\bar\Delta_+}\Bigl[\tfrac16(\ell^2-1) - (r\cdot a)\bigl(\ell - (r\cdot a)\bigr)\Bigr] = \frac{(\ell^2-1)(d-2)\,h^\vee}{12} - 2\ell\,\bar\rho\cdot\bar a + h^\vee\,\bar a^2, $$
where $\bar\rho$ denotes the Weyl vector for the finite subalgebra.⁴ If we insert this into (4.4) we arrive at the formula
$$ L^{[\ell]}_0|a\rangle = \frac{(\bar a + 2\bar\rho)\cdot\bar a}{2(\ell + h^\vee)}\,|a\rangle, \tag{4.7} $$
in agreement with [23, Lemma 12.8 b)]. Note that we have not employed any properties of the affine Casimir operator in our calculation.

⁴ Note that the term linear in $k = r\cdot a$ does not drop out upon summation over the roots but rather reproduces the Weyl vector. This is due to the fact that the lemma is valid only for $0 \le k \le \ell-1$ and thus different values of $r\cdot a$ have to be transported into this range by multiples of $\ell$.

It remains to verify that the operators (3.15) really do satisfy the Virasoro algebra with the correct central charge. To this aim we split the Sugawara operators and introduce the following operators:
$$ \tilde L^{[\ell]}_m = L^{(1)}_m + L^{(2)}_m + L^{(3)}_m, \tag{4.8} $$
with
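$$ \begin{aligned} L^{(1)}_m &:= \frac{1}{2\ell}\sum_{n\in\mathbb{Z}}\sum_{i=1}^{d-2} {}^\times_\times\,{}^{[1]}\!A^i_n\,{}^{[1]}\!A^i_{m-n}\,{}^\times_\times, \\ L^{(2)}_m &:= \frac{h^\vee}{2\ell(\ell + h^\vee)}\sum_{\substack{n\in\mathbb{Z} \\ n\not\equiv0\,(\ell)}}\sum_{i=1}^{d-2} {}^\times_\times\,{}^{[\ell]}\!A^i_n\,{}^{[\ell]}\!A^i_{\ell m-n}\,{}^\times_\times, \\ L^{(3)}_m &:= -\frac{1}{2\ell(\ell + h^\vee)}\sum_{r\in\bar\Delta}\sum_{p=1}^{\ell-1}\frac{1}{|\zeta^p - 1|^2}\oint_0\frac{dw}{2\pi i}\,\Bigl\{w^{\ell m-1}\;{}^\times_\times\,e^{ir\cdot{}^{[\ell]}\!Y_p(w)}\,{}^\times_\times\Bigr\}. \end{aligned} \tag{4.9} $$
Observe that we have absorbed all terms involving ${}^{[1]}\!A^i_n$ into $L^{(1)}_m$ so that the prefactor is “renormalized” to $(2\ell)^{-1}$ with respect to (3.15), because these operators commute with the DDF oscillators ${}^{[\ell]}\!A^i_n$ with $n \not\equiv 0\,(\ell)$. We obviously have (with ${}^{[1]}\!K_0 = \ell$)
$$ \begin{aligned} [L^{(1)}_m, L^{(1)}_n] &= (m-n)\,L^{(1)}_{m+n} + \frac{d-2}{12}\,(m^3 - m)\,\delta_{m+n,0}, \\ [L^{(1)}_m, L^{(2)}_n] &= [L^{(1)}_m, L^{(3)}_n] = 0. \end{aligned} \tag{4.10} $$
It is equally straightforward to show that
$$ [L^{(2)}_m, L^{(2)}_n] = (m-n)\,\frac{h^\vee}{\ell + h^\vee}\,L^{(2)}_{m+n} + \frac{(d-2)(\ell-1)(h^\vee)^2}{12\,(\ell + h^\vee)^2}\Bigl(m^3 + \frac{m}{\ell}\Bigr)\delta_{m+n,0} \tag{4.11} $$
and
$$ [L^{(2)}_m, L^{(3)}_n] = (m-n)\,\frac{h^\vee}{\ell + h^\vee}\,L^{(3)}_{m+n}. \tag{4.12} $$
The remaining commutator requires more work. For its evaluation we need the operator product
$$ {}^\times_\times\,e^{ir\cdot{}^{[\ell]}\!Y_p(z)}\,{}^\times_\times\;{}^\times_\times\,e^{is\cdot{}^{[\ell]}\!Y_q(w)}\,{}^\times_\times = \Bigl[\frac{(z_p - w_q)(z - w)}{(z_p - w)(z - w_q)}\Bigr]^{r\cdot s}\;{}^\times_\times\,e^{ir\cdot{}^{[\ell]}\!Y_p(z) + is\cdot{}^{[\ell]}\!Y_q(w)}\,{}^\times_\times. \tag{4.13} $$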
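We write
$$ [L^{(3)}_m, L^{(3)}_n] = \sum_{r,s\in\bar\Delta}\ \sum_{p,q=1}^{\ell-1} I(r,s,p,q), $$
with
$$ \begin{aligned} I(r,s,p,q) := {}&\frac{1}{4\ell^2(\ell + h^\vee)^2}\,\frac{1}{|\zeta^p - 1|^2\,|\zeta^q - 1|^2}\oint_0\frac{dw}{2\pi i}\sum_{a=1}^{\ell}\oint_{w_a}\frac{dz}{2\pi i}\;w^{\ell n-1}\,z^{\ell m-1} \\ &\times\Bigl\{\Bigl[\frac{(z_p - w_q)(z - w)}{(z_p - w)(z - w_q)}\Bigr]^{r\cdot s}\;{}^\times_\times\,e^{ir\cdot{}^{[\ell]}\!Y_p(z) + is\cdot{}^{[\ell]}\!Y_q(w)}\,{}^\times_\times\Bigr\}. \end{aligned} $$
Inspection of the term in curly brackets reveals poles of order 1, 2 and 4, depending on the values of $r\cdot s$ and $p, q \in \{1, \ldots, \ell-1\}$. As usual, there is no contribution for $r\cdot s = 0$. Another useful observation is that
$$ I(r,s,p,q) = I(r,-s,p,\ell-q) \qquad \forall r, s, p, q. \tag{4.14} $$
It is therefore sufficient to consider the cases $r\cdot s = -1$ ($\Leftrightarrow r + s \in \bar\Delta$) and $r\cdot s = -2$ ($\Leftrightarrow s = -r$); these lead to the following results: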
1. If $r\cdot s = -1$ and $p = q$ (pole of order 2), then
$$ I(r,s,p,p) = \frac{m-n}{8\ell(\ell + h^\vee)^2}\,\frac{1}{|\zeta^p - 1|^2}\oint_0\frac{dw}{2\pi i}\;w^{\ell(m+n)-1}\;{}^\times_\times\,e^{i(r+s)\cdot{}^{[\ell]}\!Y_p(w)}\,{}^\times_\times $$
after partial integration. For simply laced algebras and given $r \in \bar\Delta$, there are always $2h^\vee - 4$ roots $s$ such that $r\cdot s = -1$ (or $+1$).⁵ Hence
$$ \sum_{\substack{r,s\in\bar\Delta \\ r\cdot s=-1}}\ \sum_{p=1}^{\ell-1} I(r,s,p,p) = (m-n)\,\frac{2 - h^\vee}{2(\ell + h^\vee)}\,L^{(3)}_{m+n}. $$
2. If $r\cdot s = -1$ and $p \ne q$ (poles of order 1), then $I(r,s,p,q) = -I(s,r,q,p)$, which vanishes upon (symmetric) summation over $r, s, p, q$.
3. If $r\cdot s = -2$ and $p = q$ (pole of order 4), then
$$ \sum_{r\in\bar\Delta}\sum_{p=1}^{\ell-1} I(r,-r,p,p) = (m-n)\,\frac{\ell}{2(\ell + h^\vee)}\,L^{(2)}_{m+n} + \frac{(d-2)\,\ell(\ell-1)\,h^\vee}{24\,(\ell + h^\vee)^2}\Bigl(m^3 + \frac{m}{\ell}\Bigr)\delta_{m+n,0}, $$
after partial integration and use of Lemma 1.
4. If $r\cdot s = -2$ and $p \ne q$ (poles of order 2), then
$$ \sum_{r\in\bar\Delta}\ \sum_{\substack{p,q=1 \\ p\ne q}}^{\ell-1} I(r,-r,p,q) = (m-n)\,\frac{\ell-2}{2(\ell + h^\vee)}\,L^{(3)}_{m+n}, $$
after partial integration.

Hence
$$ [L^{(3)}_m, L^{(3)}_n] = (m-n)\,\frac{\ell}{\ell + h^\vee}\,L^{(2)}_{m+n} + (m-n)\,\frac{\ell - h^\vee}{\ell + h^\vee}\,L^{(3)}_{m+n} + \frac{(d-2)\,\ell(\ell-1)\,h^\vee}{12\,(\ell + h^\vee)^2}\Bigl(m^3 + \frac{m}{\ell}\Bigr)\delta_{m+n,0}. \tag{4.15} $$
Adding up all contributions we get
$$ [\tilde L^{[\ell]}_m, \tilde L^{[\ell]}_n] = (m-n)\,\tilde L^{[\ell]}_{m+n} + \Bigl(\frac{c}{12}\,m^3 + b\,m\Bigr)\delta_{m+n,0}, $$
with
$$ b := -\frac{d-2}{12} + \frac{(d-2)(\ell-1)\,h^\vee}{12\,\ell\,(\ell + h^\vee)} = -\frac{(d-2)(\ell^2 + h^\vee)}{12\,\ell\,(\ell + h^\vee)}; $$
the central charge $c$ given by
$$ c = d-2 + \frac{(d-2)(\ell-1)(h^\vee)^2}{(\ell + h^\vee)^2} + \frac{(d-2)\,\ell(\ell-1)\,h^\vee}{(\ell + h^\vee)^2} = \frac{(d-2)\,\ell\,(1 + h^\vee)}{\ell + h^\vee} = \frac{\ell\,\dim\bar{\mathfrak g}}{\ell + h^\vee}, $$
in agreement with (2.24). Since this Virasoro algebra has not yet the standard form, we have to shift $\tilde L^{[\ell]}_0$. Doing this we arrive at the desired result,
$$ L^{[\ell]}_m := \tilde L^{[\ell]}_m + \frac{c + 12b}{24}\,\delta_{m,0} = \tilde L^{[\ell]}_m + \frac{(\ell^2-1)(d-2)\,h^\vee}{24\,\ell\,(\ell + h^\vee)}\,\delta_{m,0}. $$

⁵ By Weyl invariance it is sufficient to prove the statement for the highest root $\theta$. From the definition of the Coxeter number and the Weyl vector we have
$$ 2(h^\vee - 1) = 2\bar\rho\cdot\theta = \sum_{s\in\bar\Delta_+} s\cdot\theta. $$
The only contributions in the sum arise from the terms with $s\cdot\theta = 1$ (whose number we wish to compute) and with $s\cdot\theta = 2$. However, the only positive root for which $s\cdot\theta = 2$ is $s = \theta$, whence the result.
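The bookkeeping of central terms can be reproduced with exact rational arithmetic. The Python sketch below is an illustration rather than part of the proof: it takes the $m^3$ and $m$ coefficients exactly as they appear in (4.10), (4.11) and (4.15), adds them up, and compares with the closed forms for $c$, $b$ and the shift of $\tilde L^{[\ell]}_0$. The function names are hypothetical, `rank` stands for $d-2$, and $\dim\bar{\mathfrak g} = (d-2)(1 + h^\vee)$ is assumed (simply laced case).

```python
from fractions import Fraction


def central_terms(rank, h, ell):
    """Return (c, b) assembled from the individual commutators.

    rank = d - 2, h = dual Coxeter number, ell = level.
    c/12 multiplies m^3 and b multiplies m in the central term.
    """
    rank, h, ell = Fraction(rank), Fraction(h), Fraction(ell)
    c1 = rank                                        # from (4.10)
    c2 = rank * (ell - 1) * h**2 / (ell + h) ** 2    # from (4.11)
    c3 = rank * ell * (ell - 1) * h / (ell + h) ** 2 # from (4.15)
    c = c1 + c2 + c3
    # (4.10) contributes -(d-2)/12 * m; (4.11) and (4.15) contribute c_i/(12 ell) * m
    b = -c1 / 12 + (c2 + c3) / (12 * ell)
    return c, b


def expected_c(rank, h, ell):
    """ell * dim(g) / (ell + h) with dim(g) = rank * (1 + h)."""
    return Fraction(ell * rank * (1 + h), ell + h)


def expected_b(rank, h, ell):
    return -Fraction(rank * (ell**2 + h), 12 * ell * (ell + h))


def shift(rank, h, ell):
    """(c + 12 b)/24, the constant added to L~_0."""
    c, b = central_terms(rank, h, ell)
    return (c + 12 * b) / 24


def expected_shift(rank, h, ell):
    return Fraction((ell**2 - 1) * rank * h, 24 * ell * (ell + h))


if __name__ == "__main__":
    # E8: rank 8, dual Coxeter number 30
    for ell in range(1, 5):
        print(ell, central_terms(8, 30, ell), shift(8, 30, ell))
```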
Finally, we would like to mention that, analogous to (3.15), the step operators ${}^{[1]}\!E^r_m$ are also expressible in terms of DDF oscillators [15].
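5. Examples

In this section we present some examples. As already mentioned we will restrict attention to the Lie algebras $E_8$, $E_9$ and $E_{10}$. When expanding the exponential operator in the new formula (3.15), we notice that $L^{[\ell]}_{-n}$ involves linear combinations of the form
$$ \sum_{\substack{m_1,\ldots,m_M\not\equiv0\,(\ell) \\ m_1+\ldots+m_M=\ell n}}\frac{1}{m_1\cdots m_M}\;T_{j_1\ldots j_M}(a;\langle m_1\rangle,\ldots,\langle m_M\rangle)\;{}^{[\ell]}\!A^{j_1}_{-m_1}\cdots{}^{[\ell]}\!A^{j_M}_{-m_M}, \tag{5.1} $$
with
$$ T_{j_1\ldots j_M}(a;\langle m_1\rangle,\ldots,\langle m_M\rangle) := \sum_{p=1}^{\ell-1}\sum_{r\in\bar\Delta}\frac{\zeta^{p\,r\cdot a}\,(\zeta^{pm_1} - 1)\cdots(\zeta^{pm_M} - 1)}{|\zeta^p - 1|^2}\;r_{j_1}\cdots r_{j_M}; \tag{5.2} $$
here $\langle m\rangle$ is a coset representative for $m$, i.e., $m = \langle m\rangle + k\ell$ for some $k \in \mathbb{Z}$, $\langle m\rangle \in \{1, \ldots, \ell-1\}$, and $r_j$ denotes the $j$-th component of the root $r$ with respect to some basis $\{e_j \mid 1 \le j \le d-2\}$ of $\bar{\mathfrak h}^*$. Since $M \ge 2$, the tensors can be simplified by writing
$$ \begin{aligned} N(r\cdot a;\langle m_1\rangle,\ldots,\langle m_M\rangle) &:= \sum_{p=1}^{\ell-1}\frac{\zeta^{p\,r\cdot a}\,(\zeta^{pm_1} - 1)\cdots(\zeta^{pm_M} - 1)}{|\zeta^p - 1|^2} \\ &= -\sum_{p=1}^{\ell-1}\ \sum_{k_1=0}^{\langle m_1\rangle-1}\ \sum_{k_2=0}^{\langle m_2\rangle-1}\zeta^{p(r\cdot a+k_1+k_2+1)}\,(\zeta^{pm_3} - 1)\cdots(\zeta^{pm_M} - 1). \end{aligned} \tag{5.3} $$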
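For $M = 2$ the second form of (5.3) can be evaluated in integer arithmetic, because (4.6) implies $\sum_{p=1}^{\ell-1}\zeta^{pn} = \ell-1$ if $\ell$ divides $n$ and $-1$ otherwise. The sketch below is restricted to $M = 2$ by assumption and uses hypothetical function names; the argument `c` plays the role of $r\cdot a$. It compares this integer evaluation with a direct floating-point evaluation of the defining sum.

```python
import cmath


def n_direct(c, m1, m2, ell):
    """Defining sum of N(c; m1, m2) in (5.3) for M = 2, as a complex float."""
    total = 0j
    for p in range(1, ell):
        z = cmath.exp(2j * cmath.pi * p / ell)
        total += z**c * (z**m1 - 1) * (z**m2 - 1) / abs(z - 1) ** 2
    return total


def n_integer(c, m1, m2, ell):
    """Second form of (5.3) for M = 2, in exact integer arithmetic.

    Requires coset representatives 1 <= m1, m2 <= ell - 1.
    """
    total = 0
    for k1 in range(m1):
        for k2 in range(m2):
            n = c + k1 + k2 + 1
            # sum_{p=1}^{ell-1} zeta^{p n}: ell - 1 if ell divides n, else -1
            root_sum = ell - 1 if n % ell == 0 else -1
            total -= root_sum
    return total


if __name__ == "__main__":
    ell = 3
    for c in range(-2, 3):
        for m1 in range(1, ell):
            for m2 in range(1, ell):
                print(c, m1, m2, n_integer(c, m1, m2, ell))
```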
Invoking (4.6) we conclude that the numbers $N(a\cdot r;\langle m_1\rangle,\ldots,\langle m_M\rangle)$ are always real integers. Hence further evaluation of the tensors $T_{j_1\ldots j_M}$ necessitates the computation of weighted sums over tensor products of real roots of the following type:
$$ \sum_{r\in\bar\Delta} N(r\cdot a)\;r\otimes\ldots\otimes r, \tag{5.4} $$
with $N(r\cdot a) \in \mathbb{Z}$. Such sums have not been considered in the literature so far, except in the simplest situation where $N = \text{const}$. In this case the sums become invariant tensors w.r.t. the full Weyl group of the finite Lie algebra under consideration. E.g., for $E_8$ we have the following formula for the unweighted sums over tensor products up to six tensor factors:
$$ \sum_{r\in\bar\Delta} r_{j_1}\cdots r_{j_{2k}} = 2^{4-k}\bigl(7 + 2^{2k-3}\bigr)\,\delta_{(j_1j_2}\cdots\delta_{j_{2k-1}j_{2k})} \qquad \text{for } k = 1, 2, 3, \tag{5.5} $$
where $(\ldots)$ denotes symmetrization with strength one and tensors with an odd number of indices vanish (this is no longer true for the weighted sums (5.4)). The simple result (5.5) is explained by the absence of invariant tensors other than $\delta_{ij}$ for $k \le 3$ of the Weyl group $W(E_8) = D_4(2)\otimes\mathbb{Z}_2^2$. For $k \ge 4$, new invariants appear in accordance with the general theory since the exponents of $E_8$ are 1, 7, 11, 13, 17, 19, 23, 29 [29]. It is clear that the presence of the factor $N(r\cdot a)$ in $T(a)$ breaks the symmetry under the full affine Weyl group down to that subgroup which preserves $a$; this is just the (finite) little Weyl group $W(a,\delta)$ introduced in [14]. As a consequence, the results will be the more cumbersome the smaller $W(a,\delta)$ becomes. For the examples to be presented below we have therefore evaluated the relevant sums (5.4) on the computer. Inspection of the explicit examples suggests that it may be difficult to find closed (or at least more elegant) general expressions for them, and we have not tried to do so.

Let us illustrate the new formula and the above remarks with some explicit examples. With the future application in mind, we will consider only the exceptional Lie algebra $\bar{\mathfrak g} = E_8$ with affine extension $\mathfrak g = E_9$ and hyperbolic extension $\hat{\mathfrak g} = E_{10}$. In this (unique) case the finite root lattice is selfdual, and consequently the extended affine root lattice coincides with the weight lattice of $E_{10}$ which is just the unique even selfdual Lorentzian lattice $II_{9,1}$. As a partial check on our results, we have recalculated (and reobtained) (5.6) below directly by means of formula (4.60) in [14] (the Lie algebra analog of (2.20)), i.e., without use of (3.15). Doing the calculation in this “old” way is impossible without massive use of algebraic computer programs, whereas the new formula requires substantially less effort.⁶ In fact, knowing the tensors (5.2) (for this we must still rely on the computer), the calculation can be done by hand.
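The unweighted sums (5.5) are small enough to check by brute force. The following Python sketch is an illustration under stated assumptions, not the authors' program: it uses the standard coordinate realization of the 240 roots of $E_8$ (all vectors with two entries $\pm1$ and six zeros, plus all vectors with entries $\pm\tfrac12$ and an even number of minus signs), stores them with doubled integer coordinates to keep the arithmetic exact, and evaluates the strength-one symmetrized product of Kronecker deltas by counting index pairings. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement, product
from math import prod


def e8_roots_doubled():
    """The 240 roots of E8 in an orthonormal basis, multiplied by 2."""
    roots = []
    for i, j in combinations(range(8), 2):
        for si, sj in product((2, -2), repeat=2):
            v = [0] * 8
            v[i], v[j] = si, sj
            roots.append(tuple(v))
    for signs in product((1, -1), repeat=8):
        if signs.count(-1) % 2 == 0:
            roots.append(signs)
    return roots


def count_pairings(idx):
    """Number of ways to pair up the entries of idx such that paired entries agree."""
    if not idx:
        return 1
    first, rest = idx[0], idx[1:]
    return sum(
        count_pairings(rest[:n] + rest[n + 1:])
        for n, other in enumerate(rest)
        if other == first
    )


def odd_double_factorial(k):
    """(2k - 1)!!, the total number of pairings of 2k indices."""
    result = 1
    for n in range(1, 2 * k, 2):
        result *= n
    return result


def coefficient(k):
    """Prefactor 2^(4-k) (7 + 2^(2k-3)) of (5.5)."""
    return Fraction(2) ** (4 - k) * (7 + Fraction(2) ** (2 * k - 3))


def check_tensor_sum(k, roots):
    """Compare both sides of (5.5) on every independent component."""
    scale = 4**k  # undo the doubling of the 2k root components
    norm = odd_double_factorial(k)
    for idx in combinations_with_replacement(range(8), 2 * k):
        total = sum(prod(v[i] for i in idx) for v in roots)
        if Fraction(total, scale) != coefficient(k) * Fraction(count_pairings(idx), norm):
            return False
    return True


if __name__ == "__main__":
    roots = e8_roots_doubled()
    for k in (1, 2, 3):
        print(k, coefficient(k), check_tensor_sum(k, roots))
```

Since both sides of (5.5) are totally symmetric, it suffices to test index tuples in non-decreasing order, which is what `combinations_with_replacement` enumerates. The same root list can be used to count the roots $s$ with $s\cdot\theta = 1$ for a fixed root $\theta$, which should equal $2h^\vee - 4 = 56$.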
⁶ For example, the CPU time used by a MapleV program for the calculation of the state $L^{[2]}_{-3}|a_1\rangle$ below could be cut by a factor of $5\cdot10^4$ with the use of our new formula.

As our first example, we choose the fundamental dominant weight $\Lambda_1 = 2r_{-1} + r_0 + 3\delta$ of level $\ell = 2$ with associated tachyonic vector $a_1 = \Lambda_1 - 2\delta$. We identify the polarization vectors $\xi^i$ with the orthonormal basis vectors $e_i$. The little Weyl group is $W(\Lambda_1,\delta) = W(E_7)\otimes\mathbb{Z}_2 = C_3(2)\otimes\mathbb{Z}_2^2$. An exceptional property of the level-2 sector is the vanishing of all tensors with an odd number of indices; this feature will be lost for higher levels $|\ell| > 2$. With the notation $A^i_{-m} \equiv {}^{[2]}\!A^i_{-m}$ we find that
L[2] −1 |a1 i
3 16
=
11 64
=
Ai−1 Ai−1
+
8 7 8 16 A−1 A−1
7 X
Ai−3 Ai−1 +
11 256
i=1
7 X
(A8−1 )2 (Ai−1 )2 +
i=1
√ 7
+ 16 7 16
=
7 X
2A8−2 (A8−1 )2 Ai−3 Ai−1
+
+
3 32
i=1
+ 41 =
|a1 i,
2A8−2
(Ai−1 )2 (Ak−1 )2 +
7 √ X 2 A8−2 (Ai−1 )2
3 16
i=1
1 2
√
7 X
8 35 8 64 A−3 A−1
+ 21 (A8−2 )2 +
|a1 i,
2A8−4
−
(A8−1 )2 (Ai−1 )2
1 64
i=1
8 X
Ai−2 Ai−2
+
8 4 35 256 (A−1 )
7 X
(Ai−1 )2 (Ak−1 )2
i,k=1
8 29 8 48 A−3 A−1
8 4 5 192 (A−1 )
+
+
1 2
√
2A8−4
|a1 i,
i=1
L[2] −3 |a1 i
+
√
i,k=1
7 X
15 + 128
L[2] −2 |a1 i
1 2
i=1
[2] L[2] −1 L−1 |a1 i
7 X
9 20
7 X
Ai−5 Ai−1 +
1 2
8 X
i=1
Ai−3 Ai−1 (Ak−1 )2 +
1 16
7 X
6 X
7 X
Ai−3 Ai−1 (A8−1 )2 −
1 64
6 X
(Ai−1 )2 (Ak−1 )2 (A8−1 )2 +
6 X
(Ai−1 )2 (Ak−1 )2 (A7−1 )2
1 32
6 X
(Ai−1 )2 (A7−1 )2 (A8−1 )2
i=1
(Ai−1 )2 (Ak−1 )4 +
1 48
6 X
6 X
(Ai−1 )4 (A7−1 )2
i=1
i,k=1 1 − 192
A8−3 A8−1 (Ak−1 )2
i,k=1
i,k=1 1 − 96
Ai−3 Ai−3
k=1
i=1 1 + 64
7 X i=1
i,k=1 1 + 16
11 48
i=1
7 X
1 − 48
Ai−4 Ai−2 +
(Ai−1 )2 (A7−1 )4 +
1 64
i=1
(Ai−1 )2 (A8−1 )4 +
7 2 8 4 1 64 (A−1 ) (A−1 )
i=1
1 + 64 (A8−1 )2 (A7−1 )4 + 37 8 + 144 A−3 A8−3
6 X
+
1 − 2880 (A8−1 )6 −
1 120
6 X
(Ai−1 )6 +
8 11 8 20 A−5 A−1
i=1 8 8 3 7 6 5 1 144 A−3 (A−1 ) − 320 (A−1 ) 6 Y √ 8 i 1 1 A + 2A −1 −6 |a1 i. 4 2 i=1
(5.6)
Our second example is the fundamental dominant weight Λ8 of level ` = 3 with associated tachyonic vector a8 = Λ8 −2δ and little Weyl group W(Λ8 , δ) = W(A8 ) = S8 . In terms of our standard basis of orthonormal polarization vectors we find the results
L[3] −1 |a8 i
=
X 8 1 6
Ai−3
+
i=1
L[3] −2 |a8 i
=
1 6
8 X
(1 − 6δij + 12δij δjk δki )Ai−1 Aj−1 Ak−1 |a8 i,
i,j,k=1
Ai−6 +
17 55
8 X
i=1
−
Ai−2 Ai−1
i=1
8 X
1 − 264
X 8
7 22
i=1 8 X
1 352
Ai−5 Ai−1 +
27 88
8 X
Ai−4 Ai−2 +
i=1
1 6
8 X
(Ai−3 )2
i=1
(1 − 4δij − 2δjk + 12δij δjk )Ai−4 Aj−1 Ak−1
i,j,k=1
+
8 X
1 2112
(1 − 6δij + 12δij δjk )Ai−2 Aj−2 Ak−2
i,j,k=1
−
8 X
1 704
(δij + δkl + 4δjk − 12δij δjk − 12δjk δkl
i,j,k,l=1
+ 24δij δjk δkl − 4δij δkl − 8δik δjl )Ai−2 Aj−2 Ak−1 Al−1 +
8 X
1 2816
(1 − 8δij − 12δjk + 24δij δjk + 16δjk δkl
i,j,k,l,m=1
− 32δij δjk δkl − 8δjk δkl δlm − 16δij δjk δkl δlm + 64δij δkl + 16δjk δlm + 72δij δjk δlm + 48δij δkl δlm )Ai−2 Aj−1 Ak−1 Al−1 Am −1 −
1 42240
8 X
(1 − 15δij + 40δij δjk − 60δij δjk δkl
i,j,k,l,m,n=1
$+\,144\,\delta_{ij}\delta_{jk}\delta_{kl}\delta_{lm} - 144\,\delta_{ij}\delta_{jk}\delta_{kl}\delta_{lm}\delta_{mn} + 80\,\delta_{ij}\delta_{kl}\delta_{mn} + 320\,\delta_{ij}\delta_{jk}\delta_{lm}\delta_{mn}\bigr)\,A^i_{-1}A^j_{-1}A^k_{-1}A^l_{-1}A^m_{-1}A^n_{-1}\,|a_8\rangle,$ (5.7)

where now $A^i_{-m} \equiv {}^{[3]}\!A^i_{-m}$. This example illustrates that terms with an odd number of DDF oscillators need not vanish in general. The invariance under the little Weyl group $S_8$ can be made manifest by switching to a non-orthonormal and $S_8$-invariant basis of polarization vectors. This leads to slightly simpler expressions. However, this simplicity is an artefact caused by the size of the little Weyl group $S_8$; generically, the little Weyl group becomes much smaller. In addition we note that, by (5.1), even the simplest operator $L^{[\ell]}_{-1}$ will involve oscillator number $m_1 + \ldots + m_M = \ell$ and thus an exponentially growing number of terms with increasing level $\ell$. Finally, we would like to point out that the above physical states are elements of $E_{10}$. For instance, $L^{[2]}_{-2}|a_1\rangle \in E_{10}^{(\Lambda_1)}$, $L^{[2]}_{-3}|a_1\rangle \in E_{10}^{(\Lambda_1+\delta)}$, and $L^{[3]}_{-2}|a_8\rangle \in E_{10}^{(\Lambda_8)}$.
6. Outlook As we have already indicated in the introduction, the present work is mainly motivated by and continues our previous investigation of hyperbolic Kac–Moody algebras corresponding to the canonical extensions of affine algebras by an over-extended root r−1 [14, 13]. There, an attempt was made to understand the structure of such algebras, and in particular the maximally extended algebra E10 , via a novel realization in terms of DDF states which enabled us to give a simple and explicit representation for a nontrivial level-2 root space of E10 corresponding to a 75-fold multiple commutator of the Chevalley–Serre generators for the first time (meanwhile, further examples have been worked out). These results explicitly demonstrate the occurrence of longitudinal states for levels |`| ≥ 2 and the simultaneous decoupling of certain transversal states, whereas the level ±1 sectors can be simply realized as the set of purely transversal states [14, 11] (the level-0 sector is just the affine subalgebra). Let us recall that the higher-level elements of the algebra can be recursively defined as multiple commutators of level-1 elements. A first difficulty here is that one must discard all those multiple commutators, and hence the corresponding affine representations, which contain the Serre relations somewhere inside. This difficulty is invisible in the string vertex algebra realization [2], which takes automatic care of the Serre relations (since there are no physical string states below the tachyon), but the tribute to this convenience is the phenomenon of “missing” (or “decoupled”) states, i.e., physical string states that can not be reached by multiple commutation of the Chevalley–Serre generators [14]. The second difficulty is that there is no general method for efficiently computing the relevant products of representations in practice. 
Although the challenge of finding explicit formulas for the coset generators remains, we believe that the present results bring us one step closer towards the ambitious goal of finding a concrete realization of hyperbolic Kac–Moody algebras, because concrete examples have shown the realization of root spaces in terms of DDF states to be far more efficient than any other description. Moreover, the above examples nicely display the increasing “anisotropy” of hyperbolic Kac–Moody algebras with increasing level, a feature which we have already stressed before and which can be traced to the decrease (and eventual triviality) of the little Weyl group at higher level. While the further exploration of higher-level root spaces by direct methods as in [14] seems prohibitively difficult, prospects are much brighter with our new formula (3.15). What is still missing at this point is an analogous and similarly explicit expression for the full coset generators
$$ K^{[\ell]}_m := (L^{[1]}_m\otimes1\otimes\cdots\otimes1) + \ldots + (1\otimes\cdots\otimes1\otimes L^{[1]}_m) - L^{[\ell]}_m, \tag{6.1} $$
which would eventually allow us to write down the root space elements directly in terms of suitable creation operators acting on a “master state” in a given level-` sector. However, we cannot expect the solution of this problem to be simple in the context of hyperbolic Kac–Moody algebras, because, as we pointed out already in the introduction, the division by the Serre relations destroys the Virasoro module structure. Therefore, the coset generators may have to be modified so as to take this fact into account. While the last term of (6.1) involves transversal DDF operators only by (3.15), preliminary checks show that in the other terms longitudinal DDF operators will emerge. Since the resulting operators commute with the affine subalgebra this might also shed some light on the long-standing problem of finding explicit expressions for the (non-polynomial) higher-order Casimir invariants of affine algebras. We hope to come back soon to these issues in another publication.
Acknowledgement. We are very grateful to P. Slodowy for sharing with us his expertise on Weyl groups and their invariant tensors. We would also like to thank J. Fuchs for discussions about the representation theory of affine Lie algebras.
References 1. Bardak¸ci, K., and Halpern, M. B.: New dual quark models. Phys. Rev. D3, 2493–2506 (1971) 2. Borcherds, R. E.: Vertex algebras, Kac-Moody algebras, and the monster. Proc. Nat. Acad. Soc. USA 83, 3068–3071 (1986) 3. Borcherds, R. E.: Automorphic forms on Os+2,2 (R) and infinite products. Invent. Math. 120, 161–213 (1995) 4. Brower, R. C.: Spectrum-generating algebra and no-ghost theorem for the dual model. Phys. Rev. D6, 1655–1662 (1972) 5. Callan, C. G., Dashen, R. F., and Sharp, D. H.: Solvable two-dimensional field theory based on currents. Phys. Rev. 165, 1883–1886 (1968) 6. Coleman, S., Gross, D., and Jackiw, R.: Fermion avatars of the Sugawara model. Phys. Rev. 180, 1359–1366 (1969) 7. Dashen, R., and Frishman, Y.: Four fermion interactions and scale invariance. Phys. Rev. D11, 2781–2802 (1975) 8. Dell’Antonio, G. F., Frishman, Y., and Zwanziger, D.: Thirring model in terms of currents: Solution and light-cone expansions. Phys. Rev. D6, 988–1007 (1972) 9. Feingold, A. J., and Frenkel, I. B.: A hyperbolic Kac-Moody algebra and the theory of Siegel modular forms of genus 2. Math. Ann. 263, 87–144 (1983) 10. Frenkel, I. B.: Two constructions of affine Lie algebra representations and boson-fermion correspondence in quantum field theory. J. Funct. Anal. 44, 259–327 (1981) 11. Frenkel, I. B.: Representations of Kac-Moody algebras and dual resonance models. In: Applications of Group Theory in Theoretical Physics, Providence, RI: American Mathematical Society, Lect. Appl. Math., Vol. 21, 1985, pp. 325–353 12. Frenkel, I. B., and Kac, V. G.: Basic representations of affine Lie algebras and dual models. Invent. Math. 62, 23–66 (1980) 13. Gebert, R. W., and Nicolai, H.: E10 for beginners. In: G. Akta¸s, C. Sa¸clioˇglu, and M. Serdaroˇglu (eds.), Strings and Symmetries, Proceedings of the G¨ursey Memorial Conference I, Istanbul, 6-10 June 1994, New York: Springer, 1995, pp. 197–210 14. Gebert, R. W., and Nicolai, H.: On E10 and the DDF construction. Commun. 
Math. Phys. 172, 571–622 (1995) 15. Gebert, R. W., and Nicolai, H.: An affine string vertex operator construction at arbitrary level. Preprint DESY 96-166, hep-th/9608014 16. Goddard, P., Kent, A., and Olive, D.: Virasoro algebras and coset space models. Phys. Lett. 152B, 88–92 (1985) 17. Goddard, P., and Olive, D.: Algebras, lattices and strings. In: J. Lepowsky, S. Mandelstam, and I. M. Singer (eds.), Vertex Operators in Mathematics and Physics – Proceedings of a Conference November 10-17, 1983, Publications of the Mathematical Sciences Research Institute #3, New York: Springer, 1985, pp. 51–96 18. Goddard, P., and Olive, D.: Kac-Moody algebras, conformal symmetry and critical exponents. Nucl. Phys. B257, 226–252 (1985) 19. Goddard, P., and Olive, D.: Kac-Moody and Virasoro algebras in relation to quantum physics. Int. J. Mod. Phys. A1, 303–414 (1986) 20. Goodman, R., and Wallach, N.: Structure and unitary cocycle representations of loop groups and the group of diffeomorphisms of the circle. J. reine u. angew. Math. 347, 69–133 (1984) 21. Gritsenko, V. A., and Nikulin, V. V.: Siegel automorphic form corrections of some Lorentzian Kac–Moody Lie algebras. Schriftenreihe des SFB “Geometrie und Analysis” Heft 17, Mathematica Gottingensis (1995). Preprint alg-geom/9504006 22. Harvey, J. A., and Moore, G.: Algebras, BPS states, and strings. Nucl. Phys. B463, 315–368 (1996) 23. Kac, V. G.: Infinite dimensional Lie algebras. Cambridge University Press, Cambridge, third edn., 1990
24. Kac, V. G., Moody, R. V., and Wakimoto, M.: On E10 . In: K. Bleuler and M. Werner (eds.), Differential geometrical methods in theoretical physics. Proceedings, NATO advanced research workshop, 16th international conference, Como, Amsterdam: Kluwer, 1988, pp. 109–128 25. Kac, V. G., and Wakimoto, M.: Modular and conformal invariance constraints in representation theory of affine algebras. Adv. in Math. 70, 156–236 (1988) 26. Knizhnik, V. G., and Zamolodchikov, A. B.: Current algebra and Wess-Zumino model in two dimensions. Nucl. Phys. B247, 83–103 (1984) 27. Moody, R. V., and Pianzola, A.: Lie Algebras With Triangular Decomposition. New York: John Wiley & Sons, 1995 28. Segal, G.: Unitary representations of some infinite dimensional groups. Commun. Math. Phys. 80, 301–342 (1981) 29. Slodowy, P.: Private communication 30. Sommerfield, C.: Currents as dynamical variables. Phys. Rev. 176, 2019–2025 (1968) 31. Sugawara, H.: A field theory of currents. Phys. Rev. 170, 1659–1662 (1968) 32. Todorov, I. T.: Current algebra approach to conformal invariant two-dimensional models. Phys. Lett. 153B, 77–81 (1985) Communicated by R. H. Dijkgraaf This article was processed by the author using the LaTEX style file pljour1 from Springer-Verlag.
Commun. Math Phys. 184, 143 – 171 (1997)
Communications in Mathematical Physics © Springer-Verlag 1997

Pair Correlations and Exchange Phenomena in the Free Electron Gas

G. Friesecke

Departement Mathematik, ETH Zürich, Rämistr. 101, CH-8092 Zürich, Switzerland.

Received: 22 March 1996 / Accepted: 20 August 1996
Abstract: We present a rigorous derivation of the formulae of Dirac-Bloch and Wigner-Seitz for the quantum mechanical exchange energy and the 'exchange hole' of the free electron gas. More precisely, we establish that for arbitrary determinantal ground states of the underlying finite system of $N$ free electrons in a box, subject to periodic or zero boundary conditions, the formulae are accurate to order $N^{-1/3}$ (per electron) and $N^{-1/2}$ respectively.
1. Introduction

The goal of this article is to present a rigorous derivation of the formulae of Dirac-Bloch [Di30, Bl29]^1 and Wigner-Seitz [WS33] for the exchange energy (quantum minus classical Coulomb repulsion energy) and the 'exchange hole' of the free electron gas, and to estimate the accuracy of the formulae for the underlying system of $N$ free electrons in a finite box which the gas is meant to be approximating. The Dirac-Bloch formula, which expresses the exchange energy in terms of the single-particle density of the system, plays an important role in numerical "ab initio" electronic structure calculations based on density functional theory.^2

^1 A large part of the literature attributes the formula to Dirac but, as emphasized recently in [Se95], it was suggested independently and at about the same time by Bloch. In fact, unlike Bloch, Dirac never stated the formula explicitly, but derived a corresponding exchange potential as a correction to the Thomas-Fermi equation.

^2 Textbook accounts of DFT as initiated in [HK64, KS65] are [PY89, DG90, KL90]; for a rigorous discussion of various aspects see [Li83, Fr97].

The derivations of the formulae customary in the density functional theory and quantum chemistry literature (such as the insightful account in [PY89]) concern periodic
boundary conditions and pure plane-wave states^3 and are not mathematically rigorous: they rest on approximating a lattice sum in momentum space by an integral even though the integrand exhibits oscillations on the length scale of the lattice. As explained below, basic error estimates on this continuum approximation^4 are insufficient, for instance, to decide on the correct order of magnitude of the exchange energy in terms of the volume of the system.

Hitherto disconnected from the enormous body of DFT literature, a few rigorous results on exchange phenomena can be found in the mathematical physics literature: for periodic plane-wave determinants the Dirac-Bloch formula for the free electron gas is justified to leading order in [Th80, Ch. 4.3], while Graf & Solovej [GS94] have shown it to be asymptotically correct for the interacting electron gas in the high density limit (i.e., in the double limit $N\to\infty$, $\bar\rho\to\infty$, where $\bar\rho$ denotes the number of particles per unit volume). Finally we mention the monumental and by now almost completed series of papers [FS92+] (partially simplified in the beautiful papers [Ba93, GS94]) devoted to establishing an asymptotic expansion, accurate enough to account for exchange, of the ground state energy of heavy atoms as the nuclear charge tends to infinity.^5

In [Th80] (and [Ba93, GS94] for more intricate systems) control of discreteness effects is achieved on the spatially averaged level of the exchange energy, by exploiting the regularizing effect of its integral kernel $1/|r-r'|$. To extend the result in [Th80] to general ground states and establish error bounds (Theorem 1.1) we do not follow this approach but show instead (Theorem 1.2) that discreteness effects are in fact already small on the pointwise level of the pair correlation function, where one faces oscillations genuinely on the length scale of the discretization.
In particular our decay estimate on pair correlations (Theorem 1.2) displays the localization of the exchange energy near the diagonal in two-body configuration space, thereby allowing a simple and conceptually appealing explanation of the separation of scales between mean field and exchange part of the interelectron repulsion energy (which grow like $N^{5/3}$ and $N$ respectively). Mathematically, these decay properties in the finite systems are caused by cancellation effects in certain exponential sums. To quantify these effects we rely on the stationary phase method [So93, St93], in a variant invented originally [La15, Co23, Ha15] to understand some questions in analytic number theory.

In order to recall the precise definition of exchange energy, we try to follow the notation in standard quantum chemistry texts such as Szabo & Ostlund [SO82], Parr & Yang [PY89].^6 The quantum-mechanical state of an $N$-electron system confined to a region $\Lambda\subseteq\mathbb{R}^3$ is described by an $N$-electron wave function, that is, a square-integrable function 'of space and spin' $\psi:(\Lambda\times\{\pm 1/2\})^N\to\mathbb{C}$ which is normalized, $\langle\psi|\psi\rangle_\Lambda = 1$, where

$$\langle\phi|\psi\rangle_\Lambda = \sum_{s_1,\ldots,s_N}\int_{\Lambda^N}\phi(r_1,s_1,\ldots,r_N,s_N)\,\psi(r_1,s_1,\ldots,r_N,s_N)^*\,dr_1\ldots dr_N, \tag{1.1}$$
^3 Below, arbitrary determinantal ground states of the free electron energy functional, as well as Dirichlet or periodic boundary conditions, are admitted; for interesting boundary layer effects observed in the Dirichlet case see Theorems 1.2 and 5.1.

^4 Like those [LS77 Thm. III.13, RS78 Ch. XIII.15] sufficient to establish the similar-looking but mathematically much simpler Thomas-Fermi kinetic energy formula [Th27, Fe27].

^5 I thank J. Fröhlich for bringing references [Th80, Ba93, GS94] to my attention.

^6 In particular, atomic units $\hbar = m_e = |e| = 1$ are employed throughout.
and obeys the antisymmetry principle (or Pauli exclusion principle, or Fermi statistics)

$$\psi(\ldots,r_i,s_i,\ldots,r_j,s_j,\ldots) = -\psi(\ldots,r_j,s_j,\ldots,r_i,s_i,\ldots) \qquad (i\neq j). \tag{1.2}$$
Important examples of $N$-electron wave functions are determinantal wave functions (or Slater determinants, or single configurations)

$$\psi(r_1,\ldots,s_N) = \frac{1}{\sqrt{N!}}\det\begin{pmatrix}\psi_1(r_1,s_1) & \cdots & \psi_1(r_N,s_N)\\ \vdots & & \vdots\\ \psi_N(r_1,s_1) & \cdots & \psi_N(r_N,s_N)\end{pmatrix},$$

where $\{\psi_1,\ldots,\psi_N\}$ is an orthonormal set ($\langle\psi_i|\psi_j\rangle = \delta_{ij}$) of one-electron wave functions. Now the inconspicuous antisymmetric structure induces the following inequality between the quantum mechanical interelectron energy $E_{ee}$ and the 'classical' electrostatic self-repulsion energy $J$ of the electronic charge cloud:

$$E_{ee}(\psi) = \Big\langle\psi\Big|\,\frac12\sum_{\substack{i,j=1\\ i\neq j}}^{N}\frac{1}{|r_i-r_j|}\,\psi\Big\rangle_\Lambda \;<\; \frac12\int_{\Lambda^2}\frac{\rho(r)\rho(r')}{|r-r'|}\,dr\,dr' = J(\rho), \tag{1.3}$$
valid for any determinantal $N$-electron wave function with associated one-body density (total electronic charge density)

$$\rho(r) = \Big\langle\psi\Big|\Big(\sum_{i=1}^{N}\delta_{r_i-r}\Big)\psi\Big\rangle_\Lambda = N\sum_{s_1,\ldots,s_N}\int_{\Lambda^{N-1}}|\psi(r,s_1,r_2,s_2,\ldots,r_N,s_N)|^2\,dr_2\ldots dr_N. \tag{1.4}$$
This fascinating many-body effect is measured by the energy difference

$$E_x(\psi) = E_{ee}(\psi) - J(\rho) \tag{1.5}$$

(exchange energy).^7 There is associated with these energy functionals a simple intuition in terms of pair correlations which is crucial for an understanding of the mathematical arguments that follow. Introducing the two-body spin density^8

$$\rho_2^{\rm spin}(r,s,r',s') = \frac{N(N-1)}{2}\sum_{s_3,\ldots,s_N}\int_{\Lambda^{N-2}}|\psi(r,s,r',s',r_3,s_3,\ldots,r_N,s_N)|^2\,dr_3\ldots dr_N \tag{1.6}$$

and the two-body density

$$\rho_2(r,r') = \sum_{s,s'}\rho_2^{\rm spin}(r,s,r',s'), \tag{1.7}$$

we may rewrite

$$E_{ee}(\psi) = \int_{\Lambda^2}\frac{\rho_2(r,r')}{|r-r'|}\,dr\,dr'.$$

^7 The Coulomb integral $J(\rho)$, unlike $E_{ee}(\psi)$, contains a positive and unphysical self-interaction contribution, $J_{\rm self}(\psi) = \sum_{i=1}^{N}J(\rho^{(i)})$, where $\rho^{(i)}(r) = \sum_s|\psi_i(r,s)|^2$ is the charge contributed by the $i$th electron (note $\rho(r) = \sum_{i=1}^{N}\rho^{(i)}(r)$ for determinantal wave functions). But (1.3) remains valid with $J(\rho)$ replaced by the self-interaction-corrected Coulomb integral $J_{\rm sic}(\psi) = J(\rho) - J_{\rm self}(\psi)$, with the only caveat that one has equality if all the one-electron orbitals $\psi_i$ have disjoint support. (To see this, use (1.9), (2.4), (2.5) together with the fact that $1/|r-r'|$ is a positive kernel.)

^8 The normalization factor, whose denominator will reappear in (1.8), is determined by the convention that $\rho_2$ integrate to the number of pairs in the system.

$J(\rho)$ is then the approximation to $E_{ee}$ obtained by assuming the electrons to be independent,

$$\rho_2(r,r') \approx \frac12\rho(r)\rho(r'), \tag{1.8}$$

while $E_x(\psi)$ reflects by how much the independent electron assumption fails: it is a weighted average of the pair correlation function $C(r,r') = \rho_2(r,r') - \frac12\rho(r)\rho(r')$,^9

$$E_x(\psi) = \int_{\Lambda^2}\frac{C(r,r')}{|r-r'|}\,dr\,dr', \tag{1.9}$$
with the contributions of nearby points $r\approx r'$ being dominant. Note that for nearby points the independent electron assumption (1.8) has no hope of being valid for any state $\psi$, since the antisymmetry principle (or Pauli exclusion principle, or Fermi statistics) enforces $\rho_2^{\rm spin}(r,s,r,s)\equiv 0$.

The simplest, and classical, setting for studying these pair correlations and exchange phenomena induced by the Pauli exclusion principle is that of the free electron gas. This limiting system is obtained from a large number $N$ of electrons moving freely in a cubical box of volume $V$ by letting $N\to\infty$, $V\to\infty$ with the density $\bar\rho = N/V$ remaining finite. Studying the asymptotics of $N$-body ground states for such a system reduces, of course, to studying the asymptotics of eigenfunctions of the one-body Laplacian. For the convenience of the reader we recall this basic algebraic fact as

Lemma 1.1. Let $\Lambda$ be a box $[0,L]^3$ or in fact any bounded domain in $\mathbb{R}^3$, and let $\lambda_1 = \lambda_2 \le \lambda_3 = \lambda_4 \le \ldots$ be an ordered listing of eigenvalues, accounting for multiplicity, of $-\frac12\Delta_r$ operating on one-body functions $\psi(r,s)$ ($r\in\Lambda$, $s\in\{\pm\frac12\}$, $\psi\in\mathbb{C}$) subject to zero boundary conditions. For a determinantal $N$-electron wave function $\psi$ the following are equivalent:
(i) $\psi$ is a ground state of the free electron gas energy $E_{N,\Lambda}(\psi) = \frac12\sum_{i=1}^{N}\langle\nabla_{r_i}\psi|\nabla_{r_i}\psi\rangle_\Lambda$ subject to zero boundary conditions^10,
(ii) $\psi$ is a determinant of $N$ orthonormal eigenfunctions $\psi_i$ of the above one-body problem, corresponding to the lowest $N$ eigenvalues.

It is instructive to look, for a moment, at the special case of an even number of electrons and doubly occupied spinless one-body orbitals $\psi_{2i-1}(r,s) = \phi_i(r)\delta_{-1/2}(s)$, $\psi_{2i}(r,s) = \phi_i(r)\delta_{1/2}(s)$.
By well-known identities from Hartree-Fock theory^11, ground state density and pair correlation function are

$$\rho_{N,\Lambda}(r) = 2\sum_{i=1}^{N/2}|\phi_i(r)|^2, \tag{1.10}$$

$$C_{N,\Lambda}(r,r') = -\Big|\sum_{i=1}^{N/2}\phi_i(r)\phi_i(r')^*\Big|^2. \tag{1.11}$$

^9 The normalization, here, is not standard, but the above $C(r,r')$ will be more convenient notationally than the standard pair correlation function $h(r,r') = C(r,r')/(\frac12\rho(r)\rho(r'))$ or the so-called exchange-correlation hole $\rho_{xc}(r,r') = C(r,r')/(\frac12\rho(r))$.

^10 I.e. mathematically: a minimizer of $E_{N,\Lambda}$ not just among determinantal wave functions, but on the full set of antisymmetric wave functions $\{\psi:(\Lambda\times\{\pm 1/2\})^N\to\mathbb{C},\ \psi(\cdot,s_1,\ldots,\cdot,s_N)\in H_0^1(\Lambda^N,\mathbb{C})$ for all $(s_1,\ldots,s_N)\in\{\pm 1/2\}^N$, (1.1) and (1.2) hold$\}$, where $H_0^1$ denotes the usual Sobolev space of square-integrable functions with square-integrable gradient.

^11 Which may be verified by means of straightforward calculations from the definitions.
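The closed-shell formulae (1.10) and (1.11) can be checked numerically on a small example. The sketch below is only an illustration: it assumes the standard Dirichlet sine eigenfunctions on the cube $[0,L]^3$ as the orbitals $\phi_i$, and the function names and the choice of the four lowest orbitals ($N = 8$) are hypothetical. It evaluates the density and the pair correlation function and compares $C(r,r)$ with $-\frac14\rho(r)^2$.

```python
import math

def dirichlet_orbital(n, r, L):
    """Spinless Dirichlet eigenfunction on [0, L]^3 (assumed form):
    (2/L)^(3/2) * prod_j sin(pi * n_j * r_j / L)."""
    value = (2.0 / L) ** 1.5
    for n_j, r_j in zip(n, r):
        value *= math.sin(math.pi * n_j * r_j / L)
    return value

def density(orbitals, r, L):
    """One-body density (1.10): rho(r) = 2 * sum_i |phi_i(r)|^2."""
    return 2.0 * sum(dirichlet_orbital(n, r, L) ** 2 for n in orbitals)

def pair_correlation(orbitals, r, r_prime, L):
    """Pair correlation (1.11): C(r, r') = -|sum_i phi_i(r) phi_i(r')^*|^2.
    The orbitals are real here, so no complex conjugate is needed."""
    overlap = sum(dirichlet_orbital(n, r, L) * dirichlet_orbital(n, r_prime, L)
                  for n in orbitals)
    return -overlap ** 2

# Illustrative closed shell: the 4 lowest spatial orbitals, i.e. N = 8 electrons.
ORBITALS = [(1, 1, 1), (2, 1, 1), (1, 2, 1), (1, 1, 2)]

if __name__ == "__main__":
    L = 1.0
    r = (0.3, 0.4, 0.7)
    rho = density(ORBITALS, r, L)
    print("rho(r)        =", rho)
    print("C(r, r)       =", pair_correlation(ORBITALS, r, r, L))
    print("-rho(r)^2 / 4 =", -rho ** 2 / 4.0)
```

The last two printed numbers should agree up to rounding, which is the on-diagonal statement $C_{N,\Lambda}(r,r) = -\frac14\rho(r)\rho(r)$.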
As inferred earlier from more general considerations, one sees again that correlations are large for $r\approx r'$, $C_{N,\Lambda}(r,r) = -\frac14\rho(r)\rho(r)$. Now one expects for generic sequences $\phi_i$, and in more general situations [Sh74, CV85, HMR87, GL93] than cubical boxes, that the faster and faster oscillations of eigenfunctions lead to ergodic behaviour $|\phi_i|^2 \rightharpoonup 1/{\rm vol}(\Lambda)$ as $i\to\infty$ (where the half-arrow denotes weak convergence, for example in $L^1(\Lambda)$), whence the ergodic sum $\rho_{N,\Lambda}(r)\approx N/{\rm vol}(\Lambda) = \bar\rho$ ($r\in\Lambda$). So if there were no decorrelating effects for $r\neq r'$, or mathematically: cancellation effects in the oscillatory sum $\sum_{i=1}^{N/2}\phi_i(r)\phi_i(r')^*$ not present for $r = r'$, one would infer in the thermodynamic limit

$$E_x(\psi_{N,\Lambda}) \sim \int_{\Lambda\times\Lambda}\frac{1}{|r-r'|}\,dr\,dr' \sim ({\rm vol}(\Lambda))^{5/3} \quad (= L^5 \text{ for the box}).$$

The true scaling is rather different, illustrating the presence and strength of cancellation effects.

Theorem 1.1. (Exchange energy). For $N\in\mathbb{N}$ and $L>0$ let $Q(L) = [0,L]^3$ and let $\psi_{N,L}$ be any determinantal $N$-electron wave function which minimizes the free electron gas energy $E_{N,Q(L)}$ (as defined in Lemma 1.1), subject to either zero or periodic boundary conditions. Let $E_x^{\rm LDA}(\rho) = -c_x\int_Q\rho(r)^{4/3}\,dr$ denote the Dirac-Bloch-Slater functional, where $c_x = \frac34\big(\frac3\pi\big)^{1/3}$. In the thermodynamic limit $N\to\infty$, $L\to\infty$, $N/L^3\equiv\bar\rho\in(0,\infty)$,

$$E_x(\psi_{N,L}) = -c_x\bar\rho^{4/3}L^3 + O(L^2), \tag{1.12}$$

$$E_x^{\rm LDA}(\rho_{N,L}) = -c_x\bar\rho^{4/3}L^3 + O(L^2), \tag{1.13}$$

where $\rho_{N,L}$ is the one-body density of $\psi_{N,L}$. In particular, the quotient $E_x(\psi_{N,L})/E_x^{\rm LDA}(\rho_{N,L})$ converges to 1.

We emphasize that the celebrated approximation $E_{ee}(\psi)\approx J(\rho) + E_x^{\rm LDA}(\rho)$ is justified here in situations not covered by the classical calculation (going back to [Di30, Bl29] and justified in [Th80]) for plane wave orbitals and a homogeneous one-body density $\rho_{N,L}\equiv\bar\rho$: Theorem 1.1 shows that both open-shell effects and the long-range density oscillations induced by imposing zero boundary conditions, reminiscent of the Friedel oscillations in realistic systems, contribute only a lower order term to $E_x$, at most of the order of magnitude of the surface area of the box. The astonishing accuracy of numerical calculations employing this approximation (or small modifications meant to account for correlation effects) in situations where density homogeneity is violated much more strongly than above^12 remains one of the unsolved mysteries of DFT.

The powers of $L$ in the error estimates can of course be converted, via the thermodynamic relation $N = \bar\rho L^3$, into powers of the particle number $N$, yielding errors of order $N^{2/3}$.

^12 For instance, cohesive energies, lattice parameters, and elastic constants for solid metals are typically predicted to within a few percent of experimental values. Current developments in the computational literature may be traced through the recent conference proceedings volumes [El95, GD95, SP95].
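To make the leading term of Theorem 1.1 concrete, the following sketch evaluates the constant $c_x$ and the Dirac-Bloch-Slater functional for a homogeneous density $\rho\equiv\bar\rho$ in the box $[0,L]^3$. It is only an illustration under that homogeneity assumption; the function names are hypothetical and not taken from the paper.

```python
import math

# c_x = (3/4) * (3/pi)^(1/3), the constant in the Dirac-Bloch-Slater functional
C_X = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)

def exchange_lda_uniform(rho_bar, L):
    """E_x^LDA for the constant density rho_bar on [0, L]^3:
    -c_x * rho_bar^(4/3) * L^3 (the leading term in (1.12) and (1.13))."""
    return -C_X * rho_bar ** (4.0 / 3.0) * L ** 3

def exchange_per_electron(rho_bar, L):
    """Leading exchange energy per electron, using N = rho_bar * L^3.
    The result, -c_x * rho_bar^(1/3), does not depend on L."""
    n_electrons = rho_bar * L ** 3
    return exchange_lda_uniform(rho_bar, L) / n_electrons

def relative_error_scale(rho_bar, L):
    """Order of the relative error O(L^2) / L^3 = O(1/L), written as N^(-1/3)."""
    n_electrons = rho_bar * L ** 3
    return n_electrons ** (-1.0 / 3.0)

if __name__ == "__main__":
    print("c_x =", C_X)
    for L in (10.0, 100.0):
        print(L, exchange_lda_uniform(1.0, L), exchange_per_electron(1.0, L),
              relative_error_scale(1.0, L))
```

The per-electron value stays fixed while the relative error scale shrinks like $N^{-1/3}$, which is the "accurate to order $N^{-1/3}$ (per electron)" statement of the abstract.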
Formula (1.12) admits an interesting mathematical variant which we state, for simplicity, in the case of an even number of electrons and doubly occupied one-body orbitals. Fix $N\in\mathbb{N}$ and pick two indices $i,j\in\{1,\ldots,N\}$ at random. Consider the product of the $i$th and $j$th eigenfunctions (extended by zero to all of $\mathbb{R}^3$) of the Laplacian in the 3D unit cube. How large, then, is the expected value of the negative Sobolev norm $\|\phi_i\phi_j^*\|_{H^{-1}(\mathbb{R}^3)}$ squared?^13

Theorem 1.1′. Let $Q$ denote the three-dimensional unit cube, let $\{\phi_i\}_{i\in\mathbb{N}}$ be any orthonormal basis of $L^2(Q,\mathbb{C})$ of eigenfunctions of the Laplacian subject to zero (or periodic) boundary conditions, and assume the $\phi_i$ are ordered according to the size of their eigenvalues. Then as $N\to\infty$,

$$\frac{\mathbb{E}\big(\|\phi_i\phi_j^*\|^2_{H^{-1}(\mathbb{R}^3)} : i,j\le N\big)}{\frac{1}{16}\big(\frac{6}{\pi}\big)^{4/3}N^{-2/3}} \longrightarrow 1,$$

where $\mathbb{E}(a_{ij} : (i,j)\in S\subset\mathbb{N}^2) = |S|^{-1}\sum_{(i,j)\in S}a_{ij}$ denotes the average of a collection of real numbers.

(Let us see how this follows from Theorem 1.1. Note first that the $2N$-electron Slater determinant with spin orbitals $\phi_i(r)\delta_{s=-1/2}$, $\phi_i(r)\delta_{s=1/2}$ ($i = 1,\ldots,N$) is a ground state of $E_{2N,Q}$. Call this determinant $\psi_{2N,1}$ and rewrite formulae (1.9), (1.11) as

$$E_x(\psi_{2N,1}) = -\sum_{i,j=1}^{N}\int_{Q^2}\frac{\phi_i(r)\phi_j(r)^*\phi_i(r')^*\phi_j(r')}{|r-r'|}\,dr\,dr' = -4\pi\sum_{i,j=1}^{N}\|\phi_i\phi_j^*\|^2_{H^{-1}(\mathbb{R}^3)}. \tag{1.14}$$

Now use the scaling $E_x(\psi_{M,L}) = L^{-1}E_x(\psi_{M,1})$ and apply (1.12).)

Theorem 1.1′ may be regarded as an ergodic theorem, indicating a delocalization and decorrelation of eigenfunctions with growing $i$ and describing the rate of this process. We elaborate on this point in Sect. 8.

Before proceeding to the proof of Theorem 1.1 let us comment on the exponents 4/3 and 3 appearing in (1.12). The exponent of $\bar\rho$ can be derived rigorously from a simple scaling argument if the product structure of the leading order term and the exponent of $L$ are known. Namely, suppose that instead of (1.12) one only knew

$$-E_x(\psi_{N,L}) = f(\bar\rho)L^3 + o(L^3) \tag{1.15}$$
for some unknown, finite, nonzero function $f(\bar\rho)$. For any ground state $\psi$ of the $(N, L, \bar\rho)$ system, the scaled state $\psi_\lambda(r_1,s_1,\ldots,r_N,s_N) = \lambda^{3N/2}\psi(\lambda r_1,s_1,\ldots,\lambda r_N,s_N)$ ($\lambda>0$) is a ground state of the $(N, L/\lambda, \bar\rho\lambda^3)$ system. Using the scaling of the exchange energy, $E_x(\psi_\lambda) = \lambda E_x(\psi)$, dividing by $L^3$ and letting $L\to\infty$ gives $f(\bar\rho\lambda^3)\lambda^{-3} = \lambda f(\bar\rho)$, so $f(\bar\rho) = f(1)\bar\rho^{4/3} = {\rm const}\cdot\bar\rho^{4/3}$.

^13 Here $\|f\|^2_{H^{-1}(\mathbb{R}^3)} = \int_{\mathbb{R}^3}((-\Delta_{\mathbb{R}^3})^{-1}f)^*f$, where $(-\Delta_{\mathbb{R}^3})^{-1}$ is the inverse Laplacian with zero boundary conditions at infinity: $((-\Delta_{\mathbb{R}^3})^{-1}f)(r) = (4\pi)^{-1}\int_{\mathbb{R}^3}|r-r'|^{-1}f(r')\,dr'$, or alternatively $(-\Delta_{\mathbb{R}^3})^{-1} = F^{-1}|\xi|^{-2}F$, where $F$ is the Fourier transform (see Sect. 4).

Nonrigorous variants of this argument are well known but we emphasize that their validity rests on the physically and mathematically nontrivial assumption (formalized here as (1.15)) that the exchange energy scales at fixed particle
density like the volume of the system. (For instance, the total interelectronic energy scales like volume to the 5/3.)

To justify assumption (1.15) and to derive the correct exponent of $L$ is more subtle. That the "$\le$" part of (1.15) must be true is physically expected from the deeper fact that the total binding energy of the finite systems approximating the interacting electron gas, and in fact of any collection of nuclei and electrons, cannot exceed a constant times the number of particles in the system ([LN75]; Dyson-Lenard theorem). At least for plane-wave determinants this part of (1.15) follows rigorously from formula (1.10) and a nontrivial inequality of E. Lieb [Li79] related to the proof in [LT75] of the Dyson-Lenard theorem: for arbitrary $N$-electron wave functions, $E_x(\psi)\ge -C\int_{\mathbb{R}^3}\rho(r)^{4/3}\,dr$ for some constant $C$.^14 For general ground states this argument does not work, due to possible concentration effects in $\rho$.

The desired upper bound on $-E_x$ is contained, via (1.9), in the following much finer result on pair correlations. In case of zero boundary data, a certain role is played by the group $G = \{\sigma\in M^{3\times 3} : \sigma = {\rm diag}(\sigma_1,\sigma_2,\sigma_3) \text{ for some } \sigma_j\in\{\pm 1\}\}\cong(\mathbb{Z}_2)^3$ of reflections at the planes parallel to the faces of the cube.

Theorem 1.2. (Pair correlations). Let $\psi_{N,L}$ be a determinantal ground state of the free electron gas energy, let $C_{N,L}$ be its pair correlation function (as defined above (1.9)), and let $N/L^3\equiv{\rm const} = \bar\rho$. Then for all $r, r'\in[0,L]^3$, in case of periodic boundary conditions

$$C_{N,L}(r,r') = -\frac{\bar\rho^2}{4}\Big(h(p_F|r-r'|_{T(L)}) + A_{N,L}\Big)^2, \tag{1.16}$$

$$|C_{N,L}(r,r')| \le c\bar\rho^2\Big(N^{-1} + (1 + p_F|r-r'|_{T(L)})^{-4}\Big), \tag{1.17}$$

and in case of Dirichlet boundary conditions

$$C_{N,L}(r,r') = -\frac{\bar\rho^2}{4}\Big(\sum_{\sigma\in G}(\det\sigma)\,h\big(p_F|r-\sigma r'|_{T(2L)}\big) + B_{N,L}\Big)^2, \tag{1.18}$$

$$|C_{N,L}(r,r')| \le c\bar\rho^2\Big(N^{-1} + (1 + p_F|r-r'|)^{-4}\Big), \tag{1.19}$$

where $h(s) = 3(\sin s - s\cos s)/s^3$, $p_F = (3\bar\rho\pi^2)^{1/3}$, and the error terms satisfy

$$|A_{N,L}| \le c\Big(N^{-1} + N^{-1/2}(1 + p_F|r-r'|_{T(L)})^{-2}\Big), \tag{1.20}$$

$$|B_{N,L}| \le c\Big(N^{-1} + N^{-1/2}(1 + p_F|r-r'|)^{-2}\Big). \tag{1.21}$$

Here $c$ denotes a universal constant independent of $r$, $r'$, $N$, $L$, and $\bar\rho$, and $|r-r'|_{T(L)}$ is the natural distance function on the torus $\mathbb{R}^3/L\mathbb{Z}^3$ inherited from the euclidean norm on $\mathbb{R}^3$, $|r-r'|_{T(L)} = \min\{|r-(r'+k)| : k\in L\mathbb{Z}^3\}$.

Estimates (1.17), (1.19)^15 show that despite the nonlocality of the Pauli exclusion principle, statistical independence (1.8) is a valid long range law. The fundamental separation of scales between electrostatic mean field energy ($J = O(N^{5/3})$) and exchange energy ($E_x = O(N)$) emerges as an immediate consequence, by multiplying the right-hand side of (1.17), (1.19) by $1/|r-r'|$ and integrating over $r, r'\in[0,L]^3$.

^14 The best constant $C$ is not known, but it must be bigger than $c_x$.

^15 Which will be proved without appeal to the explicit formulae for $C_{N,L}$ given above.
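The function $h$ and the Fermi momentum $p_F$ of Theorem 1.2 are easy to evaluate. The sketch below is an illustration with hypothetical function names: it implements $h(s) = 3(\sin s - s\cos s)/s^3$ (using a Taylor expansion near $s = 0$ to avoid cancellation) and the leading term of (1.16) with the error term $A_{N,L}$ dropped. It also tabulates $h(s)^2(1+s)^4$, which stays bounded, in line with the decay exponent $-4$ in (1.17).

```python
import math

def h(s):
    """h(s) = 3 (sin s - s cos s) / s^3, continued by h(0) = 1."""
    if abs(s) < 1e-3:
        # Taylor expansion h(s) = 1 - s^2/10 + s^4/280 - ...
        return 1.0 - s * s / 10.0 + s ** 4 / 280.0
    return 3.0 * (math.sin(s) - s * math.cos(s)) / s ** 3

def fermi_momentum(rho_bar):
    """p_F = (3 * rho_bar * pi^2)^(1/3)."""
    return (3.0 * rho_bar * math.pi ** 2) ** (1.0 / 3.0)

def leading_pair_correlation(rho_bar, distance):
    """Leading term of (1.16): -(rho_bar^2 / 4) * h(p_F * distance)^2,
    with the error term A_{N,L} set to zero (an assumption of this sketch)."""
    return -(rho_bar ** 2 / 4.0) * h(fermi_momentum(rho_bar) * distance) ** 2

if __name__ == "__main__":
    for s in (0.0, 1.0, 5.0, 20.0, 100.0):
        print(s, h(s), h(s) ** 2 * (1.0 + s) ** 4)
    print("C at zero distance, rho_bar = 1:", leading_pair_correlation(1.0, 0.0))
```

Since $|h(s)|\le 3(1+s)/s^3$ for $s>0$, the tabulated product cannot grow with $s$; the code only samples this bound and does not prove it.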
The decay exponent $-4$ in (1.17), (1.19) is optimal, in the sense that $(1 + p_F|r-r'|)^{-4}$ cannot be replaced by any function $g(r-r')$ with $g(s)/(1+|s|)^{-4}\to 0$ ($|s|\to\infty$). See Corollary 4.1.

The finer results of Theorem 1.1 on the exchange energy (explicit identification of the leading order term and error estimates) require the finer results (1.16), (1.18), (1.20), (1.21) of Theorem 1.2 (see Sect. 6).

The leading order term in (1.16) is well known in the physics literature and appeared first in a paper by Wigner & Seitz [WS33]. We recall its beautiful physical interpretation [e.g. WS33, Sl51, PY89]: Since $h(0) = 1$, $\lim_{N,L\to\infty}C_{N,L}(r,r)$ equals $-1/2$ times the two-body density $\bar\rho^2/2$ of a statistically independent sample. So since the one-body density $\rho_{N,L}(r) = \bar\rho(1 + O(N^{-1/2}))$ (see Theorem 5.1), the two-body density $\lim_{N,L\to\infty}(\rho_2)_{N,L}(r,r')$ approaches $\bar\rho^2/2$ as $|r-r'|\to\infty$ but only $\bar\rho^2/4$ as $|r-r'|\to 0$.^16 The length scale of this 'exchange hole' is given by the Fermi wavelength $1/p_F$; note $p_F$ is the Fermi momentum of a free electron gas at density $\bar\rho$, i.e. the limiting momentum as $N\to\infty$ of the highest occupied eigenstate in the finite system.

In case of zero boundary data, the 7 additional terms $\sigma\neq{\rm id}$ in the sum in (1.18)^17 represent a boundary layer effect, and may be visualized as correlations between an electron at $r$ and 'virtual electrons' at the positions obtained from $r'$ by reflection at the faces of the box. See Fig. 1.

Fig. 1. a Pair correlations, periodic boundary conditions; b Pair correlations, zero boundary conditions
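The picture of 'virtual electrons' can be made concrete by evaluating the leading term of (1.18), that is, the sum over the eight reflections $\sigma\in G$ with the error term $B_{N,L}$ dropped. The following sketch is illustrative only; the helper names are hypothetical. It shows two features: the leading term vanishes when one of the points lies on a face of the box, and deep in the interior it reduces to the periodic Wigner-Seitz term.

```python
import itertools
import math

def h(s):
    """h(s) = 3 (sin s - s cos s) / s^3, continued by h(0) = 1."""
    if abs(s) < 1e-3:
        return 1.0 - s * s / 10.0 + s ** 4 / 280.0
    return 3.0 * (math.sin(s) - s * math.cos(s)) / s ** 3

def torus_distance(d, period):
    """|d|_{T(period)}: euclidean norm after reducing each component
    to the interval [-period/2, period/2)."""
    reduced = [((x + period / 2.0) % period) - period / 2.0 for x in d]
    return math.sqrt(sum(x * x for x in reduced))

def dirichlet_leading_term(r, r_prime, L, rho_bar):
    """Leading term of (1.18) with B_{N,L} = 0 (assumption of this sketch):
    -(rho_bar^2/4) * ( sum_sigma det(sigma) h(p_F |r - sigma r'|_{T(2L)}) )^2."""
    p_f = (3.0 * rho_bar * math.pi ** 2) ** (1.0 / 3.0)
    total = 0.0
    for sigma in itertools.product((1, -1), repeat=3):
        det_sigma = sigma[0] * sigma[1] * sigma[2]
        d = [r_j - s_j * rp_j for r_j, s_j, rp_j in zip(r, sigma, r_prime)]
        total += det_sigma * h(p_f * torus_distance(d, 2.0 * L))
    return -(rho_bar ** 2 / 4.0) * total ** 2

if __name__ == "__main__":
    # A point on the face r_1 = 0: the reflected terms cancel in pairs.
    print(dirichlet_leading_term((0.0, 3.0, 4.0), (2.0, 5.0, 1.0), 10.0, 1.0))
    # Two nearby points deep inside a large box: only sigma = id matters.
    print(dirichlet_leading_term((100.0, 100.0, 100.0), (100.5, 100.0, 100.0), 200.0, 1.0))
```

The cancellation on the boundary reflects the fact that the Dirichlet density, and hence the pair correlation function, vanishes there.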
Notice that if ${\rm dist}(r,\partial[0,L]^3)$, ${\rm dist}(r',\partial[0,L]^3)\ge L'$ then $|r-\sigma r'|_{T(2L)}\ge 2L'$ for all $\sigma\neq{\rm id}$, so

$$C_{N,L}(r,r') = -\frac14\bar\rho^2\Big((h(p_F|r-r'|))^2 + O(N^{-1/2}) + O((p_FL')^{-2})\Big).$$

In particular, away from a boundary layer of thickness $\sim L' = L^{3/4}$ ($\ll L$ for large $L$) the classical pair correlation function is correct to order $L^{-3/2}\sim N^{-1/2}$.

^16 In a fermion system with $q$ spin states the prefactor for $C_{N,L}$ would become $-\frac1q(\frac12\bar\rho^2)$ while $(\rho_2)_{N,L}(r,r)$ would approach $\frac{q-1}{q}(\frac12\bar\rho^2)$, as any given particle is not exchange-correlated to particles of different spin, whose fraction equals $\frac{q-1}{q}$.

^17 Which we have not previously seen in the literature.

The starting point for our proof of Theorem 1.2 is the fact that for closed-shell ground states, the pair correlation function is unique and may be expressed as an exponential sum, but as a first novelty, changing to zero boundary data makes the reflection group $G$ appear:
Lemma 1.2. Let $N$ satisfy the closed-shell condition $\lambda_{N+1} > \lambda_N$, where the $\lambda_i$ are as in Lemma 1.1. Then for every determinantal ground state of $E_{N,Q(L)}$ subject to zero boundary conditions and all $r, r'\in Q(L)$

$$C_{N,L}(r,r') = -\Big|\frac{1}{8L^3}\sum_{\sigma\in G}\det\sigma\sum_{k\in G\mathcal{L}_N}e^{\frac{i\pi k}{L}\cdot(r-\sigma r')}\Big|^2, \tag{1.22}$$

where $\mathcal{L}_N$ is the set of the $N/2$ positive integer lattice points $k\in\mathbb{N}^3$ with smallest euclidean distance to the origin, and $G$ is the reflection group defined above Theorem 1.2. In case of periodic boundary conditions, if $N$ satisfies the above closed-shell condition with the $\lambda_i$ denoting the analogous periodic one-body eigenvalues, every determinantal ground state satisfies instead

$$C_{N,L}(r,r') = -\Big|\frac{1}{L^3}\sum_{k\in\mathcal{L}_N}e^{\frac{2i\pi k}{L}\cdot(r-r')}\Big|^2, \tag{1.23}$$
where $\mathcal{L}_N$ is the set of the $N/2$ integer lattice points $k\in\mathbb{Z}^3$ with smallest euclidean distance to the origin. (For the convenience of the reader the elementary calculations are detailed in Sect. 2 below.)

At this point, the customary derivations of the Wigner-Seitz formula and Dirac-Bloch formula proceed by assuming that the sum over $k\in\mathcal{L}_N$ is well approximated by the corresponding integral. But is it? Notice that the function-to-be-summed oscillates in $k$ with amplitude one and with period of order $(r-\sigma r')/L$, i.e. for typical $r, r'\in[0,L]^3$: with period of order one. Thus the function-to-be-summed oscillates on the length scale of the lattice and the obvious error estimate on the continuum approximation only yields the trivial estimate $C(r,r') = O(1)$. Since the summation over one of the components of $k$ may be carried out explicitly, simple error estimates on the remaining double sum would allow one to calculate $C(r,r')$ up to an error of order $L^{-1}\sim N^{-1/3}$, still insufficient to even predict the order of magnitude in $L$ of the exchange energy correctly. Sufficient (in fact remarkable) error estimates can be obtained through methods of Harmonic Analysis ('method of stationary phase').

Lemma 1.3. In any space dimension $n$ there exists a constant $c_0(n)$ such that for all $R>0$,

$$\Big|\sum_{k\in\mathbb{Z}^n\cap B(R)}e^{ik\cdot z} - \int_{\mathbb{R}^n\cap B(R)}e^{ik\cdot z}\,dk\Big| \le c_0\Big(1 + R^{\,n-1-\frac{n-1}{n+1}}\Big), \tag{1.24}$$

for all $|z|_{\max}\le\pi$.^18

^18 Here and below $|z|_{\max} = \max\{|z_1|,\ldots,|z_n|\}$, and $B(R)$ denotes the closed euclidean ball of radius $R$ centered at the origin.

The lemma establishes a uniform bound in one periodic cell of the discrete sum. For a proof see Sect. 4 below. Estimates of this kind are well known in analytic number theory. By setting $z = 0$ the lemma reduces to the following classical result on the distribution of lattice points, due to W. Sierpiński [Si06] in dimension $n = 2$ and due to E. Landau [La15] in higher dimensions:
Corollary 1.1. [Si06, La15]. Let $A_n(R)$ be the number of integer points in the ball $B(R)\subset\mathbb{R}^n$, and let $\tau_n$ be the volume of the unit ball in $\mathbb{R}^n$. Then as $R\to\infty$, $A_n(R) = \tau_nR^n + O\big(R^{\,n-1-\frac{n-1}{n+1}}\big)$.^19

From Lemma 1.3 together with the fact that the decay at infinity of the continuous term, the Fourier transform of the characteristic function of a ball, is known (Lemma 4.2), it is then not difficult to deduce our error bounds on the Wigner-Seitz formula and the Dirac-Bloch approximation (Theorems 1.2, 1.1).

The various technical details that remain to be supplied are discussed in Sects. 2–6. These are: the elementary calculations leading to the closed-shell result of Lemma 1.2; control of open-shell effects via Corollary 1.1; the demonstration of Lemma 1.3 and Theorem 1.2; control of heterogeneities in the one-body density (Theorem 5.1); and the passage to the limit in the Dirac-Bloch and the exchange energy functional. Section 7 is devoted to an issue not discussed in this Introduction: the spurious self-interaction contribution contained in the mean field electrostatic energy $J(\rho)$ (proven to be of lower order than the exchange energy in Theorem 7.1), while Sect. 8 elaborates on the connection of our work with ergodic theorems for partial differential operators.

2. Pair Correlations: Closed-Shell Ground States

Proof of Lemma 1.1. This follows from the fact that the one-body eigenfunctions span the one-body Hilbert space $L^2(\Lambda\times\{\pm\frac12\},\mathbb{C})$, while their Slater determinants span the $N$-electron Hilbert space $\{\psi\in L^2((\Lambda\times\{\pm\frac12\})^N,\mathbb{C}) : \text{(1.2) holds}\}$.

Proof of Lemma 1.2. The finite-dimensional vector space spanned by the eigenfunctions of $-\frac12\Delta$ with eigenvalues $\le\lambda_N$ has a canonical basis

$$\{\psi_1,\ldots,\psi_N\} = \bigcup_{n\in\mathcal{L}_N}\{\phi_n(r)\delta_{s=-\frac12},\ \phi_n(r)\delta_{s=\frac12}\}, \tag{2.1}$$

where in the periodic case

$$\phi_n(r) = L^{-3/2}e^{\frac{2\pi in}{L}\cdot r}, \qquad -\frac12\Delta\phi_n = \frac12\Big(\frac{2\pi}{L}\Big)^2|n|^2\phi_n \tag{2.2}$$

and $\mathcal{L}_N = \mathcal{L}_N^{\rm per}$ is the set of the $N/2$ integer lattice points $n\in\mathbb{Z}^3$ closest to the origin, and in the Dirichlet case

$$\phi_n(r) = \Big(\frac2L\Big)^{3/2}\prod_{j=1}^{3}\sin\Big(\frac\pi Ln_jr_j\Big), \qquad -\frac12\Delta\phi_n = \frac12\Big(\frac\pi L\Big)^2|n|^2\phi_n, \tag{2.3}$$

and $\mathcal{L}_N = \mathcal{L}_N^{\rm Dir}$ is the set of the $N/2$ positive integer lattice points $n\in\mathbb{N}^3$ closest to the origin.

^19 It is almost trivial (and was known to Gauss [Ga63]) that the error is at most of order $R^{n-1}$. However the precise nature of the error term, especially in dimensions 2 and 3, is a fascinating and largely unsolved problem. Note that $A_n(R)R^{-2}$ gives the average number of representations of integers $\le R^2$ as a sum of $n$ squares. For $n = 3$ Corollary 1.1 seems to entail all that is known about the magnitude of the error. For $n = 2$ the above exponent 2/3 was improved by many workers beginning with van der Corput [Co23] (to $2/3 - \epsilon$ for some $\epsilon > 0$); the most recent results are 7/11 [IM88] and $46/73 + \epsilon$ for arbitrarily small $\epsilon > 0$ [Hu91], while it is an old result of Hardy that the optimal exponent cannot be lower than 1/2 [Ha15].
By the closed shell condition $\lambda_{N+1} > \lambda_N$, the ground state is unique up to multiplication by a phase factor $\alpha\in\{z\in\mathbb{C} : |z| = 1\} = S^1\cong U(1)$, as is well known and easy to see arguing as in the proof of Lemma 1.1. In particular all $k$-body densities, density matrices, and correlation functions are unique. By Lemma 1.1 the Slater determinant of the canonical basis given by (2.1) and (2.2) resp. (2.3) is a minimizer and thus it suffices to compute its pair correlation function. To do so we use the following basic formulae from Hartree-Fock theory. For any Slater determinant of one-body orbitals $\psi_1,\ldots,\psi_N$,

$$C(r,r') = -\frac12\sum_{s,s'=\pm\frac12}|\gamma^{\rm spin}(r,s,r',s')|^2, \tag{2.4}$$

$$\gamma^{\rm spin}(r,s,r',s') = \sum_{i=1}^{N}\psi_i(r,s)\psi_i(r',s')^*, \tag{2.5}$$

where $\gamma^{\rm spin}$ is the one-body spin density matrix, while in case (2.1) of doubly occupied spinless one-body orbitals

$$C(r,r') = -\frac14|\gamma(r,r')|^2, \tag{2.6}$$

$$\gamma(r,r') = 2\sum_{i=1}^{N/2}\phi_i(r)\phi_i(r')^*, \tag{2.7}$$

where $\gamma(r,r') = \sum_{s=\pm 1/2}\gamma^{\rm spin}(r,s,r',s)$ is the spinless one-body density matrix. The statement in the periodic case follows immediately by substituting (2.2). In the Dirichlet case, substituting (2.3) and using the trigonometric formula $2\sin\alpha\sin\beta = \cos(\alpha-\beta) - \cos(\alpha+\beta)$ gives

$$\frac12\gamma(r,r') = \frac{1}{L^3}\sum_{n\in\mathcal{L}_N}\prod_{j=1}^{3}\Big[\cos\Big(\frac\pi Ln_j(r_j-r_j')\Big) - \cos\Big(\frac\pi Ln_j(r_j+r_j')\Big)\Big]. \tag{2.8}$$

The right-hand side can be simplified by invoking the group $G$ twice, through the elementary identity

$$\prod_{j=1}^{3}\Big[\cos\Big(\frac\pi Ln_j(r_j-r_j')\Big) - \cos\Big(\frac\pi Ln_j(r_j+r_j')\Big)\Big] = \sum_{\sigma\in G}\det\sigma\prod_{j=1}^{3}\cos\Big(\frac\pi Ln_j(r-\sigma r')_j\Big)$$

and the trigonometric formula

$$\prod_{j=1}^{3}\cos\alpha_j\beta_j = \frac18\sum_{\tau\in G}e^{i\alpha\cdot\tau\beta}.$$

This establishes (1.22).
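Both identities used in the last step of the proof can be spot-checked numerically. The sketch below is a plain sanity check, not part of the proof; the function names and the sample values of $n$, $r$, $r'$ are arbitrary choices. It enumerates the eight diagonal reflections of $G$ and compares the two sides of each identity.

```python
import cmath
import itertools
import math

# The group G of diagonal reflections diag(s1, s2, s3) with s_j = +1 or -1.
G = list(itertools.product((1, -1), repeat=3))

def det(sigma):
    """Determinant of a diagonal matrix: the product of its diagonal entries."""
    return sigma[0] * sigma[1] * sigma[2]

def product_of_cos_differences(n, r, r_prime, L):
    """Left-hand side: prod_j [cos(pi n_j (r_j - r'_j)/L) - cos(pi n_j (r_j + r'_j)/L)]."""
    return math.prod(
        math.cos(math.pi * n_j * (r_j - rp_j) / L) - math.cos(math.pi * n_j * (r_j + rp_j) / L)
        for n_j, r_j, rp_j in zip(n, r, r_prime))

def signed_reflection_sum(n, r, r_prime, L):
    """Right-hand side: sum_sigma det(sigma) prod_j cos(pi n_j (r - sigma r')_j / L)."""
    return sum(
        det(sigma) * math.prod(
            math.cos(math.pi * n_j * (r_j - s_j * rp_j) / L)
            for n_j, r_j, rp_j, s_j in zip(n, r, r_prime, sigma))
        for sigma in G)

def cos_product(alpha, beta):
    """prod_j cos(alpha_j * beta_j)."""
    return math.prod(math.cos(a * b) for a, b in zip(alpha, beta))

def exponential_average(alpha, beta):
    """(1/8) sum_tau exp(i alpha . (tau beta)) over tau in G."""
    return sum(cmath.exp(1j * sum(a * t * b for a, t, b in zip(alpha, tau, beta)))
               for tau in G) / 8.0

if __name__ == "__main__":
    n, r, rp, L = (2, 1, 3), (0.2, 0.5, 0.9), (0.7, 0.1, 0.4), 1.0
    print(product_of_cos_differences(n, r, rp, L), signed_reflection_sum(n, r, rp, L))
    print(cos_product(n, r), exponential_average(n, r))
```

The first identity is just the expansion of a product of three differences, with the sign of each term recorded by $\det\sigma$; the second follows from $\cos x = \frac12(e^{ix} + e^{-ix})$.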
We conclude this section by noting, for further reference, the following property of the eigenbases introduced above:

Lemma 2.1. There exists a universal constant $c$ such that for every member $\psi_i$ of the periodic basis (2.1), (2.2) or the Dirichlet basis (2.1), (2.3), $\sup_{r,s}|\psi_i(r,s)|^2\le cL^{-3}$.
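With the closed-shell formula (1.23) in hand, one can compare the exact lattice sum with the Wigner-Seitz leading term of (1.16) on a moderately sized example. The sketch below is an illustration under stated assumptions: it fills all lattice points with $|k|^2\le 36$ (a closed shell), fixes $\bar\rho = 1$ by choosing $L = N^{1/3}$, and uses hypothetical function names. The remaining discrepancy plays the role of the error term $A_{N,L}$; the code does not verify the bound (1.20), whose constant is not specified.

```python
import cmath
import math

def h(s):
    """h(s) = 3 (sin s - s cos s) / s^3, continued by h(0) = 1."""
    if abs(s) < 1e-3:
        return 1.0 - s * s / 10.0 + s ** 4 / 280.0
    return 3.0 * (math.sin(s) - s * math.cos(s)) / s ** 3

def closed_shell_points(radius_sq):
    """All k in Z^3 with |k|^2 <= radius_sq (a closed shell of spatial orbitals)."""
    m = math.isqrt(radius_sq)
    return [(a, b, c)
            for a in range(-m, m + 1)
            for b in range(-m, m + 1)
            for c in range(-m, m + 1)
            if a * a + b * b + c * c <= radius_sq]

def pair_correlation_periodic(points, d, L):
    """Formula (1.23): -| L^-3 * sum_k exp(2 pi i k . d / L) |^2 with d = r - r'."""
    total = sum(cmath.exp(2j * math.pi * (k[0] * d[0] + k[1] * d[1] + k[2] * d[2]) / L)
                for k in points)
    return -abs(total / L ** 3) ** 2

def wigner_seitz(rho_bar, distance):
    """Leading term of (1.16): -(rho_bar^2 / 4) * h(p_F * distance)^2."""
    p_f = (3.0 * rho_bar * math.pi ** 2) ** (1.0 / 3.0)
    return -(rho_bar ** 2 / 4.0) * h(p_f * distance) ** 2

POINTS = closed_shell_points(36)
N_ELECTRONS = 2 * len(POINTS)            # two spin states per lattice point
BOX_LENGTH = N_ELECTRONS ** (1.0 / 3.0)  # chosen so that rho_bar = N / L^3 = 1

if __name__ == "__main__":
    for d in ((0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.5, 0.3), (2.0, 1.0, 0.0)):
        dist = math.sqrt(sum(x * x for x in d))
        print(d, pair_correlation_periodic(POINTS, d, BOX_LENGTH), wigner_seitz(1.0, dist))
```

At $d = 0$ both columns equal $-\bar\rho^2/4$ exactly; for $d\neq 0$ they differ by a small amount that shrinks as more shells are filled.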
3. Pair Correlations: Open-Shell Effects For open-shell determinants, the ground state and pair correlation function are no longer unique. The regular structure in the exponential sums (1.22), (1.23) (in which no direction within the integer lattice in k-space is preferred) can become contaminated, and it does not seem obvious that these contaminations are negligible when computing the pair correlation function and the exchange energy. To quantify these effects we combine the lattice point estimate from Corollary 1.1 with an observation that the orthonormality of spin orbitals, although only a restriction on averages (L2 inner products), yields some pointwise smallness of the deviation of the one-body spin density matrix from its closed-shell behaviour. For either choice of boundary condition (Dirichlet or periodic), let 0 < λ1 = λ2 ≤ λ3 = λ4 ≤ ... be the corresponding one-body eigenvalues of − 21 ∆ in [0, L]3 , and let {ψ1 , ψ2 , ...} be the canonical basis defined in (2.1), (2.3) or (2.1), (2.2). For N ∈ N define the number of closed-shell electrons, N− = max{n ≤ N : λn < λn+1 }, the number of electrons in the next closed shell state, N+ = min{n ≥ N : λn < λn+1 }, and the degeneracy of the open shell, d(N ) = N+ − N− . These three quantities, while independent of L, of course depend on the boundary conditions, and when necessary this will by indicated by superscripts ( )Dir or ( )per . ¯ Then for some universal constant c and Lemma 3.1. Let N ∈ N, L > 0, N/L3 = ρ. every one-body spin density matrix γN,L of a determinantal ground state ψN,L of EN,Q(L) , spin supr,s,r0 ,s0 γN,L (r, s, r0 , s0 ) − γNspin (r, s, r0 , s0 ) ≤ cρN ¯ −1/2 , −,L is the spin density matrix of the ground state of EN−,L . where γNspin −,L The following lemma will be used in the proof of Lemma 3.1. Lemma 3.2. There exists a universal constant c such that dDir (N ) ≤ cN 1/2 , dper (N ) ≤ cN 1/2 for all N ∈ N. Proof. 
By Corollary 1.1 on lattice points in $\mathbb{R}^3$,
\[ |S(R) \cap \mathbb{Z}^3| = \inf_{R' > R} A_3(R') - \sup_{R' < R} A_3(R') \le c(1 + R^{3/2}) \]
for all $R > 0$ and some universal constant $c$. Here $S(R)$ denotes the sphere of radius $R$ in $\mathbb{R}^3$ centered at the origin. The assertion now follows from the formulae $d^{\mathrm{Dir}}(N) = 2|S(R_N^{\mathrm{Dir}}) \cap \mathbb{N}^3|$, $d^{\mathrm{per}}(N) = 2|S(R_N^{\mathrm{per}}) \cap \mathbb{Z}^3|$, with the respective Fermi radii of the discrete systems given by
\[ R_N^{\mathrm{Dir}} = \max\{|n| : n \in \mathcal{L}^{\mathrm{Dir}}_{N_+^{\mathrm{Dir}}/2}\}, \tag{3.1} \]
\[ R_N^{\mathrm{per}} = \max\{|n| : n \in \mathcal{L}^{\mathrm{per}}_{N_+^{\mathrm{per}}/2}\}, \tag{3.2} \]
and the elementary estimates $R_N^{\mathrm{per}} \le cN^{1/3}$, $R_N^{\mathrm{Dir}} \le cN^{1/3}$.
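The quantities $N_-$, $N_+$ and $d(N)$ can be tabulated directly for small systems. The following is a minimal sketch for the periodic case only, under the assumption that orbitals are labelled by $k \in \mathbb{Z}^3$ with eigenvalue proportional to $|k|^2$ and two spin states per $k$. The function names `closed_shell_counts` and `open_shell_data` are illustrative and do not come from the paper, and the convention $N_- = 0$ when no shell closes at or below $N$ is an assumption of the sketch.

```python
from itertools import product
from math import sqrt


def closed_shell_counts(kmax):
    """Cumulative electron numbers at which a shell closes (periodic case).

    Assumption: orbitals are labelled by k in Z^3, the eigenvalue is
    proportional to |k|^2, and each k carries two spin states.
    Only shells with |k| <= kmax are enumerated completely.
    """
    shell = {}
    for k in product(range(-kmax, kmax + 1), repeat=3):
        r2 = sum(c * c for c in k)
        if r2 <= kmax * kmax:
            shell[r2] = shell.get(r2, 0) + 2  # factor 2 for spin
    counts, total = [0], 0
    for r2 in sorted(shell):
        total += shell[r2]
        counts.append(total)
    return counts


def open_shell_data(N, counts):
    """Return (N_minus, N_plus, d) with d = N_plus - N_minus."""
    n_minus = max(c for c in counts if c <= N)  # 0 if no shell closes below N
    n_plus = min(c for c in counts if c >= N)
    return n_minus, n_plus, n_plus - n_minus


counts = closed_shell_counts(8)
for N in (1, 2, 5, 20, 100):
    print(N, open_shell_data(N, counts))

# Largest observed ratio d(N) / N^(1/2) over a range of particle numbers,
# to be compared with the bound d(N) <= c N^(1/2) of Lemma 3.2.
print(max(open_shell_data(N, counts)[2] / sqrt(N) for N in range(1, 1001)))
```

The ratio printed at the end stays bounded over the sampled range, in line with Lemma 3.2. The sketch says nothing about the value of the universal constant.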
Proof of Lemma 3.1. By Lemma 1.1 and (2.5), if $\psi_{N,L}$ is any determinantal ground state, its one-body spin density matrix is
\[ \gamma^{\mathrm{spin}}(x,x') = \sum_{i=1}^{N} \psi_i'(x)\psi_i'^{*}(x') = \sum_{i=1}^{N_-} \psi_i'(x)\psi_i'^{*}(x') + \sum_{i=N_-+1}^{N} \psi_i'(x)\psi_i'^{*}(x') \]
for some collection of orthonormal eigenfunctions satisfying $-\frac{1}{2}\Delta\psi_i' = \lambda_i\psi_i'$.

Consider first the first sum on the right-hand side. Since $N_-$ corresponds to a closed shell, the vector space over $\mathbb{C}$ spanned by the $\psi_i'$ ($i \in \{1,\dots,N_-\}$) coincides with the space spanned by the canonical eigenfunctions $\psi_i$ ($i \in \{1,\dots,N_-\}$), so $\psi_i' = \sum_{j=1}^{N_-} U_{ij}\psi_j$ for some $U_{ij} \in \mathbb{C}$. By the $L^2$-orthonormality of the $\psi_i$ and the $\psi_i'$, $U = (U_{ij})$ must be unitary (footnote 20), i.e. $U(U^*)^T = (U^*)^T U = \mathrm{id}$. Consequently
\[ \sum_{i=1}^{N_-} \psi_i'(x)\psi_i'^{*}(x') = \sum_{i=1}^{N_-} \psi_i(x)\psi_i^{*}(x') = \gamma^{\mathrm{spin}}_{N_-,L}(x,x'). \]

Consider now the second sum on the right-hand side. The space spanned by the $\psi_i'$ ($i \in \{N_-+1,\dots,N\}$) is a subspace of the eigenspace with eigenvalue $\lambda_{N_-+1} = \dots = \lambda_N = \lambda_{N_+}$. Thus $\psi_i' = \sum_{j=N_-+1}^{N_+} U_{ij}\psi_j$ for some $U_{ij} \in \mathbb{C}$ ($i \in \{N_-+1,\dots,N\}$, $j \in \{N_-+1,\dots,N_+\}$). $U$ can, of course, never be unitary for an open shell ($N_+ > N$), but the $L^2$-orthonormality yields $U(U^*)^T = \mathrm{id}_{(N-N_-)\times(N-N_-)}$, and consequently $(U^*)^T U$ is a projection operator; in particular $|(U^*)^T U v| \le |v|$ for all $v \in \mathbb{C}^{N_+-N_-}$. This observation leads to a pointwise estimate:
\[ \left| \sum_{i=N_-+1}^{N} \psi_i'(x)\psi_i'^{*}(x') \right| = \left| \sum_{j,k=N_-+1}^{N_+} \big((U^*)^T U\big)_{kj}\, \psi_j(x)\psi_k^{*}(x') \right| \le \left( \sum_{j=N_-+1}^{N_+} |\psi_j(x)|^2 \right)^{1/2} \left( \sum_{k=N_-+1}^{N_+} |\psi_k(x')|^2 \right)^{1/2} \le d(N) \sup_{j,x} |\psi_j(x)|^2. \]
One concludes by applying Lemmas 3.2 and 2.1.
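The key algebraic step in the proof above, that orthonormal rows of a non-square $U$ make $(U^*)^T U$ a projection with $|(U^*)^T U v| \le |v|$, can be checked numerically. The sketch below is illustrative only: it assumes a particular $U$ (the first $m$ rows of the unitary discrete Fourier matrix, chosen here purely for convenience), and the helper names are hypothetical.

```python
import cmath
from math import sqrt


def partial_isometry(m, d):
    """m x d matrix with orthonormal rows (first m rows of the unitary DFT matrix).

    This specific choice of U is an assumption made for illustration.
    """
    return [[cmath.exp(2j * cmath.pi * i * j / d) / sqrt(d) for j in range(d)]
            for i in range(m)]


def gram(U):
    """P = (U^*)^T U, with entries P[k][j] = sum_i conj(U[i][k]) * U[i][j]."""
    m, d = len(U), len(U[0])
    return [[sum(U[i][k].conjugate() * U[i][j] for i in range(m))
             for j in range(d)] for k in range(d)]


def apply(P, v):
    """Matrix-vector product P v."""
    return [sum(P[k][j] * v[j] for j in range(len(v))) for k in range(len(P))]


def norm(v):
    """Euclidean norm of a complex vector."""
    return sqrt(sum(abs(x) ** 2 for x in v))


U = partial_isometry(3, 5)      # plays the role of an open shell: 3 of 5 states occupied
P = gram(U)
v = [1.0, -2.0, 0.5j, 3.0, 1.0 + 1.0j]
Pv = apply(P, v)
PPv = apply(P, Pv)

print(norm(Pv) <= norm(v) + 1e-12)                          # contraction property
print(max(abs(a - b) for a, b in zip(Pv, PPv)) < 1e-12)     # P(Pv) = Pv
```

Here $m = 3$ and $d = 5$ stand in for $N - N_-$ and $N_+ - N_-$. Any other matrix with orthonormal rows would serve equally well.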
4. Pair Correlations: Discreteness Effects

This section touches on the heart of pair correlations and the Dirac-Bloch formula. It shows how the lack of strong a-priori bounds on the oscillating field $e^{(i\pi/L)(r-r')\cdot k}$ can be overcome, by using the special exponential structure, to justify the continuum approximation
\[ \sum_{k \in \mathbb{Z}^n \cap B(R)} e^{i(\pi/L)k\cdot(r-r')} \approx \int_{\mathbb{R}^n \cap B(R)} e^{i(\pi/L)k\cdot(r-r')}\, dk. \]
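The continuum approximation can be probed numerically in three dimensions. The sketch below compares the lattice sum over $\mathbb{Z}^3 \cap B(R)$ with the closed form (6.4) for the Fourier transform of the ball, rescaled to radius $R$, with $z$ standing for $(\pi/L)(r-r')$. The function names and the sample values of $R$ and $z$ are assumptions chosen for illustration, and the ball is taken open ($|k| < R$) as in the text.

```python
from itertools import product
from math import sin, cos, sqrt, pi, ceil


def lattice_sum(R, z):
    """Sum of exp(i k.z) over k in Z^3 with |k| < R.

    The sum is real because the index set is symmetric under k -> -k,
    so only the cosine part is accumulated.
    """
    n = int(ceil(R))
    total = 0.0
    for k in product(range(-n, n + 1), repeat=3):
        if k[0] ** 2 + k[1] ** 2 + k[2] ** 2 < R * R:
            total += cos(k[0] * z[0] + k[1] * z[1] + k[2] * z[2])
    return total


def ball_transform(R, z):
    """Integral of exp(i k.z) over the ball B(R) in R^3 (formula (6.4), rescaled)."""
    t = R * sqrt(sum(c * c for c in z))
    if t < 1e-6:
        return 4.0 * pi * R ** 3 / 3.0  # limit t -> 0: volume of the ball
    return 4.0 * pi * R ** 3 * (sin(t) - t * cos(t)) / t ** 3


R = 6.5
for z in ((0.0, 0.0, 0.0), (0.3, 0.2, 0.1), (1.0, 2.0, 3.0)):
    s, c = lattice_sum(R, z), ball_transform(R, z)
    # Discrepancy measured in units of R^(3/2), the error scale derived in this section for n = 3.
    print(z, s, c, abs(s - c) / R ** 1.5)
```

The last column reports the discrepancy in units of $R^{3/2}$, the error scale that the proof of Lemma 1.3 produces for $n = 3$. A single radius is of course no substitute for the proof.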
Proof of Lemma 1.3. Up to uniformity in $z$, the result is due to E. Landau [La15]. Our proof is an adaptation of the analysis in [So93, Thm 1.2.3], where the lemma is proved for $z = 0$.

Footnote 20. Beware that in our notation $(\ )^*$ denotes complex conjugation, so the adjoint of $U$ is $(U^*)^T$, not $U^*$.
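We use the Fourier transform $\mathcal{F} : L^2(\mathbb{R}^n;\mathbb{C}) \to L^2(\mathbb{R}^n;\mathbb{C})$ normalized so that for functions $u \in L^1(\mathbb{R}^n;\mathbb{C}) \cap L^2(\mathbb{R}^n;\mathbb{C})$,
\[ \hat u(k) = (\mathcal{F}u)(k) = \int_{\mathbb{R}^n} e^{-ik\cdot z} u(z)\, dz. \]
Its inverse $\mathcal{F}^{-1} : L^2(\mathbb{R}^n;\mathbb{C}) \to L^2(\mathbb{R}^n;\mathbb{C})$ is then $(\mathcal{F}^{-1}v)(x) = (2\pi)^{-n}\hat v(-x)$, and convolutions $(u * v)(z) = \int_{\mathbb{R}^n} u(z-y)v(y)\, dy$ behave as
\[ \widehat{(u * v)}(k) = \hat u(k)\hat v(k), \qquad \widehat{(uv)}(k) = (2\pi)^{-n}(\hat u * \hat v)(k). \tag{4.1} \]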
If $u$ is smooth and decays sufficiently fast to zero at infinity, its Fourier transform is linked to the discrete Fourier series of a certain natural $2\pi$-periodic function associated to $u$:

Lemma 4.1. (Poisson summation formula) [e.g. So93, Thm 0.1.16]. If, for instance, $u$ or $\hat u \in C_0^\infty(\mathbb{R}^n;\mathbb{C})$, then
\[ (2\pi)^{-n} \sum_{k \in \mathbb{Z}^n} \hat u(k)\, e^{ik\cdot z} = \sum_{k \in \mathbb{Z}^n} u(z + 2\pi k). \]

Poisson's formula cannot be applied directly to the discontinuous function $\hat u(k) = \chi_{B(R)}(k) = 1$ if $|k| < R$ and $0$ elsewhere (in fact the sum on the right-hand side would diverge, providing a nice example that the formula fails if $\hat u$ is not smooth enough). Instead one first smooths $\chi_{B(R)}$ by local averaging over a small length scale $\varepsilon$, later to be adjusted carefully as a suitable negative power of $R$ depending on dimension. Take $\eta \in C_0^\infty(\mathbb{R}^n;\mathbb{R})$, $\eta(z) = \eta(-z)$, $\int_{\mathbb{R}^n}\eta = 1$, $\eta \ge 0$, $\eta = 0$ outside $B(R_0)$ for some $R_0 < 1$, and let $\eta_\varepsilon = \varepsilon^{-n}\eta(\varepsilon^{-1}\cdot)$. Apply Poisson's formula to $(2\pi)^{-n}\hat u = \chi_{B(R)} * \eta_\varepsilon$, noting $u = \widehat{\chi_{B(R)}}(-\cdot)\,\hat\eta_\varepsilon(-\cdot) = \widehat{\chi_{B(R)}}\,\hat\eta_\varepsilon$ by (4.1):
\[ \sum_{k \in \mathbb{Z}^n} (\chi_{B(R)} * \eta_\varepsilon)(k)\, e^{ik\cdot z} = \sum_{k \in \mathbb{Z}^n} \big(\widehat{\chi_{B(R)}}\,\hat\eta_\varepsilon\big)(z + 2\pi k). \]
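By the scaling of the Fourier transform under dilatation of the domain,
\[ \mathcal{F}(u(\lambda\cdot)) = \lambda^{-n}(\mathcal{F}u)(\lambda^{-1}\cdot), \tag{4.2} \]
one has $\hat\eta_\varepsilon = \hat\eta(\varepsilon\cdot)$. Splitting off the $k = 0$ term on the right-hand side (note $\hat\eta(0) = 1$),
\[ \sum_{k \in \mathbb{Z}^n} (\chi_{B(R)} * \eta_\varepsilon)(k)\, e^{ik\cdot z} - \widehat{\chi_{B(R)}}(z) \tag{4.3} \]
\[ = \sum_{k \in \mathbb{Z}^n \setminus \{0\}} \widehat{\chi_{B(R)}}(z + 2\pi k)\, \hat\eta(\varepsilon(z + 2\pi k)) \tag{4.4} \]
\[ \quad + \widehat{\chi_{B(R)}}(z)\big(\hat\eta(\varepsilon z) - \hat\eta(0)\big). \tag{4.5} \]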
Lemma 4.2. [e.g. St93 VIII 1.4.1; So93 Cor. 1.2.2]. In any space dimension $n$ the Fourier transform of the characteristic function of the unit ball has the following decay behaviour (footnote 21): $|\widehat{\chi_{B(1)}}(k)| \le c(n)(1 + |k|)^{-(n+1)/2}$.

Footnote 21. This statement on the volume measure $\chi_{B(1)}\, d\mathcal{L}^n$ is closely related to the decay of the Fourier transform of surface-carried measures like $\mu = \chi_{S(1)}\, d\mathcal{H}^{n-1}$, $|\hat\mu(k)| \le c(n)(1 + |k|)^{-(n-1)/2}$ [St93 VIII.3, So93, Thm. 1.2.1], for the truth of which radial symmetry could be replaced by appropriate curvature properties, and which underlies the remarkable restriction properties for Fourier transforms of $L^p$ functions onto surfaces [e.g. St93, Thm. VIII.3].
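(For $n = 3$, from the explicit formula (6.4) below it is not hard to see that $c = 32\pi$ will do.) The term (4.5) is thus in absolute value bounded above by
\[ c_1 R^n (1 + |Rz|)^{-(n+1)/2}\, |\varepsilon z| \sup_{\mathbb{R}^n} |\nabla\hat\eta| \le c_2\, \varepsilon R^{n-1} |Rz| (1 + |Rz|)^{-(n+1)/2}, \]
where here and below $c_1$, $c_2$ etc. denote constants independent of $R$ and $z$. Since $\eta \in C_0^\infty(\mathbb{R}^n;\mathbb{R})$, for any $m \in \mathbb{N}$, $|\hat\eta(z)| \le c(m)(1 + |z|)^{-m}$. The term (4.4) is thus in absolute value bounded above by
\[ c_3 \sum_{k \in \mathbb{Z}^n \setminus \{0\}} R^n \Big(1 + 2\pi R\big|\tfrac{z}{2\pi} + k\big|\Big)^{-(n+1)/2} \Big(1 + 2\pi\varepsilon\big|\tfrac{z}{2\pi} + k\big|\Big)^{-m} \le c_4 \sum_{k \in \mathbb{Z}^n \setminus \{0\}} R^n \big(1 + |Rk|_{\max}\big)^{-(n+1)/2} \big(1 + |\varepsilon k|_{\max}\big)^{-m} \quad \forall\, |z|_{\max} \le \pi, \tag{4.6} \]
where $|z|_{\max} = \max\{|z_1|,\dots,|z_n|\}$ and we have used that $z$ is restricted to one periodic cell, $|z/(2\pi)|_{\max} \le 1/2$, whence
\[ \big|\tfrac{z}{2\pi} + k\big| \ge \big|\tfrac{z}{2\pi} + k\big|_{\max} \ge |k|_{\max} - \tfrac{1}{2} \ge \tfrac{1}{2}|k|_{\max} \ge \tfrac{1}{2\sqrt{n}}|k|. \]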
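Now introduce $\Lambda_k = \{k' \in \mathbb{R}^n : k_i - 1 < k_i' \le k_i \ \forall\, k_i \ge 0,\ k_i \le k_i' < k_i + 1 \ \forall\, k_i < 0\}$, the unit cube with edges parallel to the coordinate axes with $k$ being the farthest corner from the origin. Since every $k' \in \mathbb{R}^n$ is contained in at most $2^n$ such cubes, (4.6) is bounded by
\[ c_4 \sum_{k \in \mathbb{Z}^n \setminus \{0\}} \int_{\Lambda_k} R^n \big(1 + |Rk'|_{\max}\big)^{-\frac{n+1}{2}} \big(1 + |\varepsilon k'|_{\max}\big)^{-m}\, dk' \]
\[ \le 2^n c_4 \int_{\mathbb{R}^n} R^n \big(1 + |Rk'|_{\max}\big)^{-\frac{n+1}{2}} \big(1 + |\varepsilon k'|_{\max}\big)^{-m}\, dk' = 2^n c_4 \left( \int_{\{|k'| < \varepsilon^{-1}\}} + \int_{\{|k'| > \varepsilon^{-1}\}} \right) \]
\[ \le 2^n c_4 \left( \tau_n R^n \int_0^{1/\varepsilon} r^{n-1} (1 + Rr)^{-\frac{n+1}{2}}\, dr + \frac{(R\varepsilon^{-1})^n}{(1 + R\varepsilon^{-1})^{(n+1)/2}} \int_{\mathbb{R}^n} (1 + |k'|)^{-m}\, dk' \right). \tag{4.7} \]
Choosing $m > n$, the second integral on the right-hand side converges. Assuming without loss of generality $n \ge 2$, the first integral in the last expression is bounded by
\[ \int_{r=0}^{\varepsilon^{-1}} R^{-(n+1)/2}\, r^{(n-3)/2}\, dr = \frac{2}{n-1}\, R^{-(n+1)/2}\, \varepsilon^{-(n-1)/2}, \]
hence (4.7) does not exceed $c_5 (R\varepsilon^{-1})^{(n-1)/2}$. Summarizing the bounds on (4.4), (4.5),
\[ \left| \sum_{k \in \mathbb{Z}^n} (\chi_{B(R)} * \eta_\varepsilon)(k)\, e^{ik\cdot z} - \widehat{\chi_{B(R)}}(z) \right| \le c_6 \Big( (R\varepsilon^{-1})^{(n-1)/2} + \varepsilon R^{n-1} |Rz| (1 + |Rz|)^{-(n+1)/2} \Big) \quad \forall\, |z| \le \pi. \tag{4.8} \]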
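Next we turn to the error made by smoothing $\chi_{B(R)}$. Recall the notation $A_n(R) = \sum_{k \in \mathbb{Z}^n} \chi_{B(R)}(k)$. Since $\eta_\varepsilon = 0$ outside $B(\varepsilon)$ and $0 \le \eta_\varepsilon * \chi_{B(R)} \le 1$,
\[ \left| \sum_{k \in \mathbb{Z}^n} (\chi_{B(R)} * \eta_\varepsilon)(k)\, e^{ik\cdot z} - \sum_{k \in \mathbb{Z}^n} \chi_{B(R)}(k)\, e^{ik\cdot z} \right| \le \sum_{k \in \mathbb{Z}^n} \big| (\chi_{B(R)} * \eta_\varepsilon)(k) - \chi_{B(R)}(k) \big| \le A_n(R+\varepsilon) - A_n(R-\varepsilon) \tag{4.9} \]
and
\[ \sum_{k \in \mathbb{Z}^n} (\chi_{B(R-\varepsilon)} * \eta_\varepsilon)(k) \le A_n(R) \le \sum_{k \in \mathbb{Z}^n} (\chi_{B(R+\varepsilon)} * \eta_\varepsilon)(k). \tag{4.10} \]
By (4.8) with $z = 0$,
\[ \left| \sum_{k \in \mathbb{Z}^n} (\chi_{B(R')} * \eta_\varepsilon)(k) - \widehat{\chi_{B(R')}}(0) \right| \le c_6 (R'\varepsilon^{-1})^{(n-1)/2}, \]
and hence
\[ \sum_{k \in \mathbb{Z}^n} (\chi_{B(R')} * \eta_\varepsilon)(k) = \tau_n (R')^n + O\big((R'\varepsilon^{-1})^{(n-1)/2}\big). \]
But now, assuming without loss of generality $\varepsilon \le 1$,
\[ \sum_{k \in \mathbb{Z}^n} (\chi_{B(R+\varepsilon)} * \eta_\varepsilon)(k) - \sum_{k \in \mathbb{Z}^n} (\chi_{B(R-\varepsilon)} * \eta_\varepsilon)(k) = \tau_n\big((R+\varepsilon)^n - (R-\varepsilon)^n\big) + O\big((R\varepsilon^{-1})^{(n-1)/2}\big) = O\big(\varepsilon R^{n-1} + (R\varepsilon^{-1})^{(n-1)/2}\big), \]
whence by (4.10)
\[ A_n(R) = \tau_n R^n + O\big(\varepsilon R^{n-1} + (R\varepsilon^{-1})^{(n-1)/2}\big). \tag{4.11} \]
Finally, applying this to $R+\varepsilon$ and $R-\varepsilon$ and substituting into (4.9) yields
\[ \left| \sum_{k \in \mathbb{Z}^n} (\chi_{B(R)} * \eta_\varepsilon)(k)\, e^{ik\cdot z} - \sum_{k \in \mathbb{Z}^n} \chi_{B(R)}(k)\, e^{ik\cdot z} \right| \le c_7 \big( \varepsilon R^{n-1} + (R\varepsilon^{-1})^{(n-1)/2} \big). \tag{4.12} \]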
Combining (4.8), (4.12) and the fact that $\varepsilon R^{n-1} + (R\varepsilon^{-1})^{(n-1)/2}$ is minimized when $\varepsilon = ((n-1)/2)^{2/(n+1)} R^{-(n-1)/(n+1)}$, the assertion of Lemma 1.3 follows.

We proceed next to convert our knowledge gathered in Lemma 1.3 about exponential sums into information about one-body density matrices $\gamma(r,r')$ and pair correlation functions $C(r,r')$. We begin with the case of zero boundary conditions and the closed-shell case $N = N_-$. To understand why away from a boundary layer these quantities closely resemble their periodic counterparts, introduce the set of projections onto the median planes parallel to the faces of the cube $[0,L]^3$, $P = \{\pi_1, \pi_2, \pi_3\}$, where $\pi_i = \sum_{j \in \{1,2,3\} \setminus \{i\}} e_j \otimes e_j$. We may rewrite formula (2.8) as
\[ \tfrac{1}{2}\gamma_{N,L}(r,r') = \sum_{\sigma \in G} (\det\sigma)\, a_{N,L}(r - \sigma r') \quad \text{for all } r, r' \in [0,L]^3, \tag{4.13} \]
where
\[ a_{N,L}(y) = (2L)^{-3} \sum_{k \in G\mathcal{L}^{\mathrm{Dir}}_N} e^{i(\pi/L)k\cdot y} = (2L)^{-3} \sum_{k \in (\mathbb{Z}\setminus\{0\})^3 \cap B(R_N^{\mathrm{Dir}})} e^{i(\pi/L)k\cdot y} \]
\[ = (2L)^{-3} \left( \sum_{k \in \mathbb{Z}^3 \cap B(R_N^{\mathrm{Dir}})} e^{ik\cdot(\pi/L)y} - \sum_{k' \in \mathbb{Z}^2 \cap B(R_N^{\mathrm{Dir}}),\ \pi_j \in P} e^{i(\pi/L)k'\cdot\pi_j(y)} + \sum_{k'' \in \mathbb{Z} \cap B(R_N^{\mathrm{Dir}}),\ j \in \{1,2,3\}} e^{i(\pi/L)k''\cdot y_j} - 1 \right). \tag{4.14} \]
Now the last three terms on the right-hand side of (4.14), when evaluated on $y = r - \sigma r'$ and summed over $\sigma$, vanish: indeed $\sum_{\sigma \in G} (\det\sigma) f(\pi_j(r - \sigma r'))$, $\sum_{\sigma \in G} (\det\sigma) g((r - \sigma r')_j)$, $\sum_{\sigma \in G} \det\sigma$ are zero for any functions $f$, $g$, since each term appears twice, with opposite sign. So $a_{N,L}$ may and shall be redefined as
\[ a_{N,L}(y) = (2L)^{-3} \sum_{k \in \mathbb{Z}^3 \cap B(R_N^{\mathrm{Dir}})} e^{ik\cdot(\pi/L)y} \tag{4.15} \]
without affecting the validity of (4.13). We would like to apply Lemma 1.3 but a little care is needed regarding the range of $y = r - \sigma r'$ for $r, r' \in [0,L]^3$. If $\sigma = \mathrm{id}$ then $y \in [-L,L]^3$, or equivalently $z = (\pi/L)y$ satisfies $|z|_{\max} \le \pi$, i.e. lies in the domain where the continuum approximation of the exponential sum is valid. But if $\sigma_{ii} = -1$, then $y_i$ ranges instead over $[0,2L]$, so we need to introduce the periodically extended continuum approximation
\[ a^{\mathrm{cts}}_{N,L}(y) = \left( \frac{R_N^{\mathrm{Dir}}}{2L} \right)^3 \widehat{\chi_{B^3(1)}}\!\left( \frac{R_N^{\mathrm{Dir}}\pi}{L} (y \bmod 2L) \right) = (2L)^{-3}\, \widehat{\chi_{B^3(R_N^{\mathrm{Dir}})}}\!\left( \frac{\pi}{L} (y \bmod 2L) \right), \tag{4.16} \]
where $B^n(1)$ stands for the unit ball in $\mathbb{R}^n$, and for any $y \in \mathbb{R}^3$ we denote by $y \bmod 2L$ the unique element $y' \in [-L,L)^3$ such that $y \in y' + 2L\mathbb{Z}^3$. (With the norm introduced in Theorem 1.2, $|y|_{T(2L)} = |y'|$.) Now lift the restriction to closed-shell ground states and define for arbitrary $N$:
\[ a^{\mathrm{cts}}_{N,L}(y) = a^{\mathrm{cts}}_{N_-,L}(y). \tag{4.17} \]
Applying Lemma 1.3 to the right-hand side of (4.15), denoting $\bar\rho = N/L^3$ and using $R_N^{\mathrm{Dir}} \le cN^{1/3}$,
\[ \big| a_{N_-,L}(r - \sigma r') - a^{\mathrm{cts}}_{N,L}(r - \sigma r') \big| \le c\bar\rho N^{-1/2} \tag{4.18} \]
for all $r, r' \in [0,L]^3$ and some universal constant $c$. Since the long range behaviour of the Fourier transform $\widehat{\chi_{B^3(1)}}$ is known (Lemma 4.2) and open-shell effects can be controlled (Lemma 3.1) one infers the following version of Theorem 1.2.

Theorem 4.1. (Continuum approximation, zero boundary conditions) Let $N \in \mathbb{N}$, $L > 0$, $N/L^3 = \bar\rho$, $a^{\mathrm{cts}}_{N,L}$ as in (4.16), (4.17). Let $\psi_{N,L}$ be any determinantal ground state of the free electron gas energy (subject to zero boundary conditions). Then its one-body density matrix and pair correlation function satisfy
\[ \Big| \tfrac{1}{2}\gamma_{N,L}(r,r') - \sum_{\sigma \in G} (\det\sigma)\, a^{\mathrm{cts}}_{N,L}(r - \sigma r') \Big| \le c\bar\rho N^{-1/2}, \tag{4.19} \]
\[ |\gamma_{N,L}(r,r')| \le c\bar\rho \Big( N^{-1/2} + \big(1 + \bar\rho^{1/3}|r - r'|\big)^{-2} \Big), \tag{4.20} \]
\[ \Big| C_{N,L}(r,r') + \Big| \sum_{\sigma \in G} (\det\sigma)\, a^{\mathrm{cts}}_{N,L}(r - \sigma r') \Big|^2 \Big| \le c\bar\rho^2 \Big( N^{-1} + N^{-1/2}\big(1 + \bar\rho^{1/3}|r - r'|\big)^{-2} \Big), \tag{4.21} \]
\[ |C_{N,L}(r,r')| \le c\bar\rho^2 \Big( N^{-1} + \big(1 + \bar\rho^{1/3}|r - r'|\big)^{-4} \Big) \tag{4.22} \]
for all $r, r' \in [0,L]^3$ and some universal constant $c$.

Proof. The first two inequalities follow immediately from (4.13), (4.15), (4.18), Lemma 3.1, Lemma 4.2, the elementary inequality $c_1\bar\rho^{1/3} \le R_{N_-}/L \le c_2\bar\rho^{1/3}$, and the fact that $|r - \sigma r' \bmod 2L| \ge |r - r'|$ for all $\sigma \in G$, $r, r' \in [0,L]^3$. The fourth inequality follows by squaring (4.20). Finally, to prove the third inequality, introduce $\gamma^{\mathrm{cts}}(r,r') = 2\sum_{\sigma \in G} (\det\sigma)\, a^{\mathrm{cts}}_{N,L}(r - \sigma r')$, rewrite the pair correlation function as
\[ C = -\big|\tfrac{1}{2}\gamma\big|^2 = -\tfrac{1}{4}\,\mathrm{Re}\big( (\gamma + \gamma^{\mathrm{cts}})(\gamma - \gamma^{\mathrm{cts}})^* \big) - \big|\tfrac{1}{2}\gamma^{\mathrm{cts}}\big|^2 \]
and note that the absolute value of the first term on the right-hand side cannot exceed a constant times the product of the right-hand sides of (4.19) and (4.20).

In case of periodic boundary conditions the group $G$ collapses to the identity, but the subtleties regarding the domain of validity of the continuum approximation are slightly different. Begin, again, with the closed shell situation $N = N_-$. The ground state density matrix is then
\[ \tfrac{1}{2}\gamma_{N,L}(r,r') = L^{-3} \sum_{k \in \mathcal{L}^{\mathrm{per}}_N} e^{(2i\pi/L)k\cdot(r-r')} = L^{-3} \sum_{k \in \mathbb{Z}^3 \cap B(R_N^{\mathrm{per}})} e^{(2i\pi/L)k\cdot(r-r')}. \]
As $r, r'$ vary over one periodic cell $[0,L]^3$, the exponent $z = (2\pi/L)(r - r')$ now ranges over $[-2\pi,2\pi]^3$; that is, over two periods (in each coordinate) of the exponential sum we wish to approximate. So this time the appropriate periodically extended continuum approximation reads
\[ b^{\mathrm{cts}}_{N,L}(y) = \left( \frac{R^{\mathrm{per}}_{N_-}}{L} \right)^3 \widehat{\chi_{B^3(1)}}\!\left( \frac{2\pi R^{\mathrm{per}}_{N_-}}{L} (y \bmod L) \right). \tag{4.23} \]
Note $z' = (2\pi/L)((r - r') \bmod L) \in [-\pi,\pi)^3$. Lemmas 1.3, 3.1 and 4.2 then yield

Theorem 4.2. (Continuum approximation, periodic boundary conditions). Let $N \in \mathbb{N}$, $L > 0$, $N/L^3 = \bar\rho$, and $b^{\mathrm{cts}}_{N,L}$ as defined in (4.23). Let $\psi_{N,L}$ be any determinantal ground state of the free electron gas energy (subject to periodic boundary conditions). Then its one-body density matrix and pair correlation function satisfy
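\[ \big| \tfrac{1}{2}\gamma_{N,L}(r,r') - b^{\mathrm{cts}}_{N,L}(r - r') \big| \le c\bar\rho N^{-1/2}, \tag{4.24} \]
\[ |\gamma_{N,L}(r,r')| \le c\bar\rho \Big( N^{-1/2} + \big(1 + \bar\rho^{1/3}|r - r'|_{T(L)}\big)^{-2} \Big), \tag{4.25} \]
\[ \big| C_{N,L}(r,r') + |b^{\mathrm{cts}}_{N,L}(r - r')|^2 \big| \le c\bar\rho^2 \Big( N^{-1} + N^{-1/2}\big(1 + \bar\rho^{1/3}|r - r'|_{T(L)}\big)^{-2} \Big), \tag{4.26} \]
\[ |C_{N,L}(r,r')| \le c\bar\rho^2 \Big( N^{-1} + \big(1 + \bar\rho^{1/3}|r - r'|_{T(L)}\big)^{-4} \Big) \tag{4.27} \]
for all $r, r' \in [0,L]^3$ and some universal constant $c$.

The long-range decay results (1.17), (1.19) in Theorem 1.2 are thus proved without appeal to the explicit formulae (1.16), (1.18) for the continuum limits. To justify these explicit expressions and complete the proof of Theorem 1.2, it suffices to establish the following

Lemma 4.3. If $N/L^3 \equiv \bar\rho$ and $h$, $p_F$ are as in Theorem 1.2, then
\[ \big| a^{\mathrm{cts}}_{N,L}(y) - \tfrac{1}{2}\bar\rho\, h(p_F|y|_{T(2L)}) \big| \le c\bar\rho N^{-1/2}, \qquad \big| b^{\mathrm{cts}}_{N,L}(y) - \tfrac{1}{2}\bar\rho\, h(p_F|y|_{T(L)}) \big| \le c\bar\rho N^{-1/2} \]
for some universal constant $c$.

Proof. By Lemmas 1.3, 3.2 on lattice points in $\mathbb{R}^3$,
\[ \frac{N}{2} = \frac{N_-}{2} + O(N^{1/2}) = |\mathbb{Z}^3 \cap B(R^{\mathrm{per}}_{N_-})| + O(N^{1/2}) = \frac{4}{3}\pi (R^{\mathrm{per}}_{N_-})^3 + O\big((R^{\mathrm{per}}_{N_-})^{3/2}\big) + O(N^{1/2}), \]
and thus
\[ (R^{\mathrm{per}}_{N_-})^3 = \frac{1}{8}\frac{3}{\pi} N + O(N^{1/2}), \qquad \left( \frac{R^{\mathrm{per}}_{N_-}}{L} \right)^3 = \frac{1}{8}\frac{3}{\pi}\bar\rho\,(1 + O(N^{-1/2})). \tag{4.28} \]
Similarly
\[ (R^{\mathrm{Dir}}_{N_-})^3 = \frac{3}{\pi} N + O(N^{1/2}), \qquad \left( \frac{R^{\mathrm{Dir}}_{N_-}}{L} \right)^3 = \frac{3}{\pi}\bar\rho\,(1 + O(N^{-1/2})). \tag{4.29} \]
By applying the elementary inequality $|a - b| \le |a^3 - b^3| / \max\{a^2, b^2\}$ ($a, b > 0$),
\[ \frac{R^{\mathrm{Dir}}_{N_-}}{L} = \left( \frac{3\bar\rho}{\pi} \right)^{1/3} \big(1 + O(N^{-1/2})\big), \qquad \frac{R^{\mathrm{per}}_{N_-}}{L} = \frac{1}{2}\left( \frac{3\bar\rho}{\pi} \right)^{1/3} \big(1 + O(N^{-1/2})\big). \tag{4.30} \]
The Fourier transform entering the definitions of $a^{\mathrm{cts}}_{N,L}$ and $b^{\mathrm{cts}}_{N,L}$ is calculated explicitly in Lemma 6.1 below:
\[ a^{\mathrm{cts}}_{N,L}(y) = \frac{\pi}{6}\left( \frac{R^{\mathrm{Dir}}_{N_-}}{L} \right)^3 h\!\left( \frac{R^{\mathrm{Dir}}_{N_-}\pi}{L}\, |y|_{T(2L)} \right), \qquad b^{\mathrm{cts}}_{N,L}(y) = \frac{\pi}{6}\left( \frac{2R^{\mathrm{per}}_{N_-}}{L} \right)^3 h\!\left( \frac{2R^{\mathrm{per}}_{N_-}\pi}{L}\, |y|_{T(L)} \right). \]
In the above expressions discreteness effects are still present, through the discrete Fermi radii $R^{\mathrm{Dir}}_{N_-}$, $R^{\mathrm{per}}_{N_-}$. By (4.28), replacing the cubic factors in front of $h$ by their limit $\bar\rho/2$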
produces an error not exceeding $c\bar\rho N^{-1/2}$. Finally, a moment's thought shows that if $h$ is any differentiable function on $[0,\infty)$ with $|h'(s)| \le C(1+s)^{-2}$, as is the case here, one has $\sup_{s \ge 0} |h(\alpha s) - h(\beta s)| \le C \max\{|\alpha - \beta|, |\alpha^{-1} - \beta^{-1}|\}$, for any $\alpha, \beta > 0$. This observation, together with (4.30) and the choices $\alpha = R^{\mathrm{Dir}}_{N_-}\pi L^{-1} p_F^{-1}$, $\beta = 1$, $s = p_F|y|_{T(2L)}$ respectively $\alpha = 2R^{\mathrm{per}}_{N_-}\pi L^{-1} p_F^{-1}$, $\beta = 1$, $s = p_F|y|_{T(L)}$ (to ensure $\bar\rho$-independence of $\alpha$, $\beta$), establishes the lemma.

Finally we remark that the decay exponent $-4$ in Theorem 1.2 is optimal.

Corollary 4.1. Let $N/L^3 \equiv \mathrm{const} = \bar\rho$. Let $\psi_{N,L}$ be any determinantal ground state of the free electron gas energy (subject to zero or periodic boundary conditions), with pair correlation function $C_{N,L}$. Let $f : N \to \mathbb{R}$ be a positive function such that
\[ \frac{f(y)}{(1 + \|y\|)^{-4}} \to 0 \quad (\|y\| \to \infty), \tag{4.31} \]
where $\|\cdot\| = |\cdot|$ in case of zero boundary data and $\|\cdot\| = |\cdot|_{T(L)}$ in the periodic case. Then
\[ \sup_{N \in \mathbb{N}}\ \sup_{r,r' \in [0,L]^3} \frac{|C_{N,L}(r,r')|}{N^{-1} + f(r - r')} = \infty. \tag{4.32} \]

Proof. For instance, in case of zero boundary conditions, pick $\alpha \in (0, 3/4)$ and let $r_{N,L} = (L/2, L/2, (L + L^\alpha)/2)$, $r'_{N,L} = (L/2, L/2, (L - L^\alpha)/2)$. Then since $|r_{N,L} - \sigma r'_{N,L}| \ge L$ for all $\sigma \in G \setminus \{\mathrm{id}\}$ and $|h(s)| \le c(1+s)^{-2}$, (1.18) implies $C_{N,L}(r_{N,L}, r'_{N,L}) = -(\bar\rho^2/2)(h(p_F L^\alpha))^2 + o(L^{-4\alpha})$ as $L \to \infty$, while by hypothesis $N^{-1} + f(r_{N,L} - r'_{N,L}) = \bar\rho^{-1}L^{-3} + f((0,0,L^\alpha)) = o(L^{-4\alpha})$ ($L \to \infty$). One concludes since by inspection $\limsup_{s \to \infty} (h(s))^2 / s^{-4} > 0$.
5. One-body Density and Dirac-Bloch-Slater Functional

The analysis of the previous section allows one to establish, with little effort, the asymptotic behaviour of the Dirac-Bloch-Slater functional on ground state densities. Recall that in the periodic case these finite-system densities may be heterogeneous (due to open-shell effects), while in the case of zero boundary conditions they must be heterogeneous (since the wave-functions must decay continuously to zero toward the boundary of the box).

Theorem 5.1. Let $N/L^3 \equiv \mathrm{const} = \bar\rho$, and let $\psi_{N,L}$ be any determinantal ground state of the free electron gas energy, with one-body density $\rho_{N,L}$. In case of periodic boundary conditions
\[ |\rho_{N,L}(r) - \bar\rho| \le c\bar\rho N^{-1/2} \quad \text{for all } r \in [0,L]^3, \tag{5.1} \]
\[ \left| \int_{[0,L]^3} \rho_{N,L}(r)^{4/3}\, dr - \bar\rho^{4/3} L^3 \right| \le c\bar\rho^{1/3} N^{1/2} = c\bar\rho^{5/6} L^{3/2}, \tag{5.2} \]
and in case of zero boundary conditions
\[ \Big| \rho_{N,L}(r) - \bar\rho \sum_{\sigma \in G} (\det\sigma)\, h\big(p_F|(\mathrm{id} - \sigma)r|_{T(2L)}\big) \Big| \le c\bar\rho N^{-1/2} \quad \text{for all } r \in [0,L]^3, \tag{5.3} \]
\[ \left| \int_{[0,L]^3} \rho_{N,L}(r)^{4/3}\, dr - \bar\rho^{4/3} L^3 \right| \le c\bar\rho^{1/3} N^{2/3} = c\bar\rho L^2, \tag{5.4} \]
for some universal constant $c$, with $G$, $h$, $p_F$ and $|\cdot|_{T(2L)}$ as in Theorem 1.2.

Proof. Recall that $\rho(r) = \gamma(r,r)$. Deal first with the periodic case. By Theorem 4.2,
\[ |\rho_{N,L}(r) - 2b^{\mathrm{cts}}_{N,L}(0)| \le c\bar\rho N^{-1/2}, \qquad |\rho_{N,L}(r)| \le c\bar\rho. \tag{5.5} \]
Substituting definitions,
\[ 2b^{\mathrm{cts}}_{N,L}(0) = 2\left( \frac{R^{\mathrm{per}}_{N_-}}{L} \right)^3 \widehat{\chi_{B^3(1)}}(0) = \frac{8\pi}{3}\left( \frac{R^{\mathrm{per}}_{N_-}}{L} \right)^3. \]
The estimate (5.1) now follows from (4.28). To prove (5.2), apply the elementary inequality $|a^p - b^p| \le p|a - b| \max\{a^{p-1}, b^{p-1}\}$ ($a \ge 0$, $b \ge 0$, $p \ge 1$) to $a = \rho_{N,L}(r)$, $b = \bar\rho$, $p = 4/3$, estimate the right-hand side through (5.1) and the second inequality in (5.5), and integrate over $r$.

For zero boundary conditions, Theorem 4.1 and Lemma 4.3 yield (5.3). Without needing Lemma 4.3, by Theorem 4.1 and the decay estimate presented as Lemma 4.2,
\[ \Big| \rho_{N,L}(r) - 2\sum_{\sigma \in G} (\det\sigma)\, a^{\mathrm{cts}}_{N,L}((\mathrm{id} - \sigma)r) \Big| \le c\bar\rho N^{-1/2}, \qquad |\rho_{N,L}(r)| \le c\bar\rho, \]
\[ a^{\mathrm{cts}}_{N,L}(0) = \tfrac{1}{2}\bar\rho\,(1 + O(N^{-1/2})), \qquad |a^{\mathrm{cts}}_{N,L}((\mathrm{id} - \sigma)r)| \le c\bar\rho\big(1 + \bar\rho^{1/3}|(\mathrm{id} - \sigma)r|_{T(2L)}\big)^{-2}. \]
This yields
\[ |\rho_{N,L}(r)^{4/3} - \bar\rho^{4/3}| \le c\bar\rho^{4/3} \Big( N^{-1/2} + \sum_{\sigma \in G \setminus \{\mathrm{id}\}} \big(1 + \bar\rho^{1/3}|(\mathrm{id} - \sigma)r|_{T(2L)}\big)^{-2} \Big). \]
For each $\sigma \in G \setminus \{\mathrm{id}\}$ there exists $i \in \{1,2,3\}$ such that $\sigma_{ii} = -1$. Consequently
\[ \int_0^L \big(1 + \bar\rho^{1/3}|((\mathrm{id} - \sigma)r)_i|_{T(2L)}\big)^{-2}\, dr_i = 2\int_0^{L/2} \big(1 + \bar\rho^{1/3}|2r_i|\big)^{-2}\, dr_i \le \bar\rho^{-1/3} \int_{-\infty}^{\infty} (1 + |s|)^{-2}\, ds, \]
and the last assertion (5.4) follows.
6. The Exchange Energy Functional

Throughout this section $\bar\rho > 0$ is fixed, and the thermodynamic relation $N/L^3 \equiv \mathrm{const} = \bar\rho$ is assumed. We write $Q(L) = [0,L]^3$ and denote by $c$ any constant which may depend on $\bar\rho$ but not on $N$ or $L$. The value of $c$ may change from line to line.

Proof of Theorem 1.1 (1.12), periodic boundary conditions. By Theorem 4.2 and abbreviating $\iint (\ )\, dr\, dr' = \int (\ )$,
\[ \left| \int_{Q(L)^2} \frac{C_{N,L}(r,r')}{|r - r'|} + \int_{Q(L)^2} \frac{|b^{\mathrm{cts}}_{N,L}(r - r')|^2}{|r - r'|} \right| \le c\left( L^{-3} \int_{Q(L)^2} \frac{1}{|r - r'|} + L^{-3/2} \sum_{\{d \in L\mathbb{Z}^3 :\, |d|_{\max} \le L\}} \int_{Q(L)^2} \frac{(1 + |r - r' - d|)^{-2}}{|r - r'|} \right) \tag{6.1} \]
\[ \le c\big( L^2 + L^{3/2}\log(2 + L) \big). \]
To deal with $b^{\mathrm{cts}}_{N,L}$ we need to decode the information hidden in the torus co-ordinate $y \bmod L$. Let $\widetilde{b}^{\mathrm{cts}}_{N,L}$ denote the nonperiodic function obtained from the right-hand side of (4.23) by replacing $y \bmod L$ with $y$. Then
\[ \left| \int_{Q(L)^2} \frac{|b^{\mathrm{cts}}_{N,L}(r - r')|^2}{|r - r'|} - \int_{Q(L)^2} \frac{|\widetilde{b}^{\mathrm{cts}}_{N,L}(r - r')|^2}{|r - r'|} \right| = \left| \int_{Q(L)^2 \cap \{|r - r'|_{\max} \ge L/2\}} \frac{|b^{\mathrm{cts}}_{N,L}(r - r')|^2 - |\widetilde{b}^{\mathrm{cts}}_{N,L}(r - r')|^2}{|r - r'|} \right| \]
\[ \le \frac{c}{L} \sum_{\{d \in L\mathbb{Z}^3 :\, |d| \le L\}} \int_{Q(L)^2} (1 + |r - r' - d|)^{-4} \le cL^2. \tag{6.2} \]
We can now isolate the leading contribution to the exchange energy. Changing variables $y = r - r'$, $y' = r' - \frac{L}{2}(1,1,1)^T$,
\[ \left| \int_{Q(L)^2} \frac{|\widetilde{b}^{\mathrm{cts}}_{N,L}(r - r')|^2}{|r - r'|} - L^3 \int_{\mathbb{R}^3} \frac{|\widetilde{b}^{\mathrm{cts}}_{N,L}(y)|^2}{|y|}\, dy \right| = \int_{y' \in [-\frac{L}{2},\frac{L}{2}]^3} \int_{|y' + y|_{\max} > \frac{L}{2}} \frac{|\widetilde{b}^{\mathrm{cts}}_{N,L}(y)|^2}{|y|}\, dy\, dy' \]
\[ \le c \int_{y' \in [-\frac{L}{2},\frac{L}{2}]^3} \int_{|y| > \frac{L}{2} - |y'|_{\max}} \frac{(1 + |y|^2)^{-2}}{|y|}\, dy\, dy' = 2\pi c \int_{y' \in [-\frac{L}{2},\frac{L}{2}]^3} \frac{1}{1 + (\frac{L}{2} - |y'|_{\max})^2}\, dy' \quad \text{(with the same constant } c\text{)} \]
\[ \le cL^2. \tag{6.3} \]
To find the exchange constant $c_x$ it remains to make explicit the factor of $L^3$ in the first line of (6.3). This amounts to evaluating the integral
\[ I(R,L) = \int_{\mathbb{R}^3} \frac{|L^{-3}\,\widehat{\chi_{B(R)}}(\frac{\pi y}{L})|^2}{|y|}\, dy, \]
which encodes all relevant information about the decay and the quantum oscillations of the pair correlation function $C(r,r') \sim |\widehat{\chi_{B(R)}}(\frac{2\pi}{L}(r - r'))|^2$. For the sake of completeness we sketch a derivation of the (elementary) result.

Lemma 6.1. In three dimensions the Fourier transform of the characteristic function of the unit ball is
\[ \widehat{\chi_{B(1)}}(k) = 4\pi\, \frac{\sin|k| - |k|\cos|k|}{|k|^3}, \tag{6.4} \]
and
\[ I(R,L) = 16\pi \left( \frac{R}{L} \right)^4. \tag{6.5} \]

Proof. Introducing polar coordinates with the $z_3$-axis pointing in the direction of $k$, $(z_1, z_2, z_3) = (r\cos\phi\sin\theta, r\sin\phi\sin\theta, r\cos\theta)$, one has
\[ \widehat{\chi_{B(1)}}(k) = \int_{B(1)} e^{ik\cdot z}\, dz = \int_{r=0}^{1} \int_{\phi=0}^{2\pi} \int_{\theta=0}^{\pi} r^2 \sin\theta\, e^{ir|k|\cos\theta}\, d\theta\, d\phi\, dr, \]
and one easily obtains (6.4). Substitution into the definition of $I(R,L)$ and the changes of variables $y' = \frac{\pi R}{L}y$, $s = |y'|$ yield
\[ I(R,L) = (4\pi)^2 \left( \frac{R}{L} \right)^6 \int_{\mathbb{R}^3} \frac{1}{|y|} \left. \frac{(\sin t - t\cos t)^2}{t^6} \right|_{t = \frac{\pi R|y|}{L}} dy = 64\pi \left( \frac{R}{L} \right)^4 \int_0^\infty \frac{(\sin s - s\cos s)^2}{s^5}\, ds. \]
An elegant evaluation of this last integral can be found in [PY, Sect. 6.1]: set $t = (\sin s)/s$, then $dt/ds = -(\sin s - s\cos s)/s^2$, $d^2t/ds^2 = -t - (2/s)\,dt/ds$, and so
\[ \int_0^\infty \frac{(\sin s - s\cos s)^2}{s^5}\, ds = \int_0^\infty \frac{1}{s}\frac{dt}{ds}\frac{dt}{ds}\, ds = \int_0^\infty \left( -\frac{t}{2} - \frac{1}{2}\frac{d^2t}{ds^2} \right) \frac{dt}{ds}\, ds = \frac{1}{4}. \]
This proves the lemma.
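The value $1/4$ of the radial integral, and the resulting exchange constant, can be confirmed numerically. The sketch below is a hedged illustration and not part of the proof: it uses composite Simpson quadrature on a truncated interval, where the cutoff and step count are arbitrary choices, replaces the integrand by its Taylor expansion near $s = 0$ to avoid cancellation, and then checks that $4\pi(3/(8\pi))^{4/3}$ coincides with $\frac{3}{4}(3/\pi)^{1/3}$. All function names are hypothetical.

```python
from math import sin, cos, pi


def integrand(s):
    """(sin s - s cos s)^2 / s^5, with a Taylor expansion near s = 0."""
    if s < 1e-2:
        # sin s - s cos s = s^3/3 - s^5/30 + ..., so the integrand is about s (1/3 - s^2/30)^2
        return s * (1.0 / 3.0 - s * s / 30.0) ** 2
    num = sin(s) - s * cos(s)
    return num * num / s ** 5


def simpson(f, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


# Truncate at S = 200; the neglected tail is O(S^-2) because the integrand decays like s^-3.
value = simpson(integrand, 0.0, 200.0, 200000)
print(value)  # should be close to 0.25

# Exchange constant: 4 pi (R/L)^4 with (R/L)^3 = 3 rho / (8 pi), per unit rho^(4/3).
c_from_fermi_radius = 4.0 * pi * (3.0 / (8.0 * pi)) ** (4.0 / 3.0)
c_x = 0.75 * (3.0 / pi) ** (1.0 / 3.0)
print(c_from_fermi_radius, c_x)
```

The two constants printed on the last line agree to rounding error, which is the arithmetic behind the leading term $\frac{3}{4}(3/\pi)^{1/3}\bar\rho^{4/3}L^3$.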
By (6.5), (4.28) and the elementary inequality $|a^4 - b^4| \le (4/3)|a^3 - b^3| \max\{a, b\}$,
\[ L^3 \int_{\mathbb{R}^3} \frac{|\widetilde{b}^{\mathrm{cts}}_{N,L}(y)|^2}{|y|}\, dy = \frac{L^3}{4}\, I(R^{\mathrm{per}}_{N_-}, L) = 4\pi L^3 \left( \frac{R^{\mathrm{per}}_{N_-}}{L} \right)^4 = \frac{3}{4}\left( \frac{3}{\pi} \right)^{1/3} \bar\rho^{4/3} L^3 + O(L^{3/2}). \tag{6.6} \]
Assertion (1.12) in Theorem 1.1 now follows by combining (6.1), (6.2), (6.3) and (6.6).

Proof of Theorem 1.1 (1.12), zero boundary conditions. Altering the boundary conditions produces the same leading order exchange energy, but for somewhat subtle reasons. By multiplying (4.21) by $1/|r - r'|$ and integrating over $Q(L)^2$,
\[ \left| \int_{Q(L)^2} \frac{C_{N,L}(r,r')}{|r - r'|} + \sum_{\sigma,\tau \in G} \det(\sigma\tau) \int_{Q(L)^2} \frac{a^{\mathrm{cts}}_{N,L}(r - \sigma r')\, a^{\mathrm{cts}}_{N,L}(r - \tau r')^*}{|r - r'|} \right| \le c\big( L^2 + L^{3/2}\log(2 + L) \big). \tag{6.7} \]
The terms in the sum on the left hand side of (6.7) are at most of the order of magnitude of the surface area of the box, unless $(\sigma,\tau) = (\mathrm{id},\mathrm{id})$:
\[ \left| \sum_{\sigma,\tau \in G} \det(\sigma\tau) \int_{Q(L)^2} \frac{a^{\mathrm{cts}}_{N,L}(r - \sigma r')\, a^{\mathrm{cts}}_{N,L}(r - \tau r')^*}{|r - r'|} - \int_{Q(L)^2} \frac{|a^{\mathrm{cts}}_{N,L}(r - r')|^2}{|r - r'|} \right| \le c \sum_{\substack{\sigma,\tau \in G \\ (\sigma,\tau) \ne (\mathrm{id},\mathrm{id})}} J_{\sigma,\tau}(L), \tag{6.8} \]
where
\[ J_{\sigma,\tau}(L) = \int_{Q(L)^2} \frac{(1 + |r - \sigma r'|_{T(2L)})^{-2} (1 + |r - \tau r'|_{T(2L)})^{-2}}{|r - r'|} \ge \frac{1}{c} \int_{Q(L)^2} \frac{|a^{\mathrm{cts}}_{N,L}(r - \sigma r')\, a^{\mathrm{cts}}_{N,L}(r - \tau r')^*|}{|r - r'|}, \]
and we have

Lemma 6.2. If $(\sigma,\tau) \ne (\mathrm{id},\mathrm{id})$ then $J_{\sigma,\tau}(L) \le cL^2$. (By contrast $J_{\mathrm{id},\mathrm{id}}$ is of order $L^3$.)

Proof. What is surprising at first sight is that for $\mathrm{tr}\,\sigma = 1$ the term $J_{\sigma,\mathrm{id}}$ is of lower order than the geometric mean of the orders of $J_{\sigma,\sigma}$ and $J_{\mathrm{id},\mathrm{id}}$. So we begin with this case. There is exactly one $i \in \{1,2,3\}$ with $\sigma_{ii} = -1$. We pick $p \in (1,2)$ and $j \in \{1,2,3\} \setminus \{i\}$ and estimate as follows:
\[ J_{\sigma,\tau}(L) \le \int_{Q(L)^2} \frac{(1 + |\pi_i(r - r')|)^{p-2}\, (1 + |(r + r')_i|_{T(2L)})^{-p}}{(1 + |\pi_i(r - r')|)\,|\pi_i(r - r')|^{p/2}\, (1 + |(r - r')_i|)\,|(r - r')_i|^{1-p/2}} \le cL^2. \]
Next, consider the case $\mathrm{tr}\,\sigma \le -1$, $\tau = \mathrm{id}$. Then pick $i$ such that $\sigma_{jj} = -1$ for all $j \ne i$, and calculate
\[ J_{\sigma,\tau}(L) \le c \int_{Q(L)^2} \frac{(1 + |\pi_i(r + r')|_{T(2L)})^{-2}}{(1 + |r - r'|)^2\, |r - r'|} \le cL\log(2 + L)^2. \]
Next, assume there exist $i, j$, $i \ne j$, such that $\sigma_{ii} = -1$, $\tau_{jj} = -1$. Choosing $p, q \in (1, \frac{3}{2})$ and letting $\{k\} = \{1,2,3\} \setminus \{i,j\}$, the integrand of $J_{\sigma,\tau}(L)$ is bounded by
\[ c\, \frac{(1 + |(r + r')_i|_{T(2L)})^{-p}\, (1 + |(r + r')_j|_{T(2L)})^{-q}}{|(r - r')_i|^{1/2}\, |(r - r')_j|^{1/2}} \cdot (1 + |(r - \sigma r')_k|_{T(2L)})^{p-2}\, (1 + |(r - \tau r')_k|_{T(2L)})^{q-2}, \]
hence $J_{\sigma,\tau}(L) \le cL^2$, with the integrals over $(r - r')_i$, $(r - r')_j$, $(r + r')_i$, $(r + r')_j$, and $(r + r')_k$, $(r - r')_k$ contributing, respectively, a multiple of $L^{1/2}$, $L^{1/2}$, $1$, $1$, and $L$.

The remaining case is $\sigma = \tau$, $\mathrm{tr}\,\sigma = 1$. We may assume $\sigma = \tau = \mathrm{diag}(-1,1,1)$. A moment's thought shows that the seemingly equivalent expression obtained by switching signs in the differences $r - \sigma r'$, $r - r'$,
\[ \int_{Q(L)^2} \frac{(1 + |((r - r')_1, (r + r')_2, (r + r')_3)|_{T(2L)})^{-4}}{|r + r'|}, \]
is of order $L^2 \log L$. But the domain of integration is asymmetric with respect to this switching operation. In fact, changing variables $y = r - r'$, $y_1' = r_1 + r_1'$, $y_2' = r_2$, $y_3' = r_3$, we have $|y_1| \le \min\{y_1', 2L - y_1'\}$ and thus (abbreviating $\Lambda(L) = [0,2L] \times [-L,L]^2$)
\[ J_{\sigma,\tau}(L) \le c \int_{(y_2',y_3') \in [0,L]^2} \int_{(y_1',y_2,y_3) \in \Lambda(L)} \int_{|y_1| \le \min\{y_1', 2L - y_1'\}} \frac{(1 + \min_{d \in \{0,2L\}} |(y_1' - d, y_2, y_3)|)^{-4}}{|y_1| + |\pi_1 y|}\, dy_1\, d(\pi_1 y)\, dy' \]
\[ = 2cL^2 \int_{(y_1',y_2,y_3) \in \Lambda(L)} \frac{\log(\min\{y_1', 2L - y_1'\} + |\pi_1 y|) - \log|\pi_1 y|}{(1 + \min_{d \in \{0,2L\}} |(y_1' - d, y_2, y_3)|)^4}\, d(y_1', y_2, y_3) \]
\[ = 4cL^2 \int_{[0,L] \times [-L,L]^2} \frac{\log(z_1 + |\pi_1 z|) - \log|\pi_1 z|}{(1 + |z|)^4}\, dz. \]
Since the integrand lies in $L^1(\mathbb{R}^3)$, the lemma follows.
It remains to look at the leading order term $\sigma = \tau = \mathrm{id}$ in (6.8). Writing $\widetilde{a}^{\mathrm{cts}}_{N,L}$ for the function obtained from $a^{\mathrm{cts}}_{N,L}$ by replacing $y \bmod 2L$ in (4.16) by $y$, one has
\[ \left| \int_{Q(L)^2} \frac{|a^{\mathrm{cts}}_{N,L}(r - r')|^2}{|r - r'|} - L^3 \int_{\mathbb{R}^3} \frac{|\widetilde{a}^{\mathrm{cts}}_{N,L}(y)|^2}{|y|}\, dy \right| = \int_{y' \in [-\frac{L}{2},\frac{L}{2}]^3} \int_{|y + y'|_{\max} > \frac{L}{2}} \frac{|\widetilde{a}^{\mathrm{cts}}_{N,L}(y)|^2}{|y|}\, dy\, dy' \]
\[ \le c \int_{y' \in [-\frac{L}{2},\frac{L}{2}]^3} \int_{|y| > \frac{L}{2} - |y'|_{\max}} \frac{(1 + |y|^2)^{-2}}{|y|}\, dy\, dy' \le cL^2, \tag{6.9} \]
where the last inequality follows from (6.3). The second integral on the left hand side of (6.9) can be evaluated with the help of Lemma 6.1 and (4.29):
\[ L^3 \int_{\mathbb{R}^3} \frac{|\widetilde{a}^{\mathrm{cts}}_{N,L}(y)|^2}{|y|}\, dy = \frac{L^3}{64}\, I(R^{\mathrm{Dir}}_{N_-}, L) = \frac{L^3}{4}\,\pi \left( \frac{R^{\mathrm{Dir}}_{N_-}}{L} \right)^4 = \frac{3}{4}\left( \frac{3}{\pi} \right)^{1/3} \bar\rho^{4/3} L^3 + O(L^{3/2}). \tag{6.10} \]
By combining (6.7), (6.8), Lemma 6.2, (6.9), and (6.10) one obtains Theorem 1.1 (1.12).

7. Self-interaction

In the density functional theory literature, the success of the Dirac-Bloch-Slater approximation $E_{ee}(\psi) \approx E_x^{\mathrm{LDA}}(\rho) + J(\rho)$ applied to atomic or molecular systems is sometimes attributed to an anticipated ability of $E_x^{\mathrm{LDA}}$ to cancel the bulk of the spurious self-interaction energy contained in $J(\rho)$. It is then interesting to note that such virtues of the local density approximation, if true, must be accidental: I prove below that the self-interaction contribution to $E_x$ (see Footnote 7 in the Introduction) is a lower-order effect which disappears in the thermodynamic limit and contributes nothing to the exchange constant $c_x$. Mathematically, my proof relies on Lemma 3.2 (which was a consequence of the lattice point estimate in Corollary 1.1) and the Hardy-Littlewood-Sobolev inequality from the theory of fractional integration.

Theorem 7.1. Under the assumptions of Theorem 1.1, both for zero and periodic boundary conditions, there exists a universal constant $c$ such that
\[ J_{\mathrm{self}}(\psi_{N,L}) \le c\bar\rho^{7/6} L^{5/2} = c\bar\rho^{1/3} N^{5/6}. \tag{7.1} \]
In particular, the limit theorem that $E_x(\psi_{N,L}) / E_x^{\mathrm{LDA}}(\rho_{N,L})$ tends to 1 remains true with $E_x$ replaced by proper exchange $E_x^{\mathrm{sic}} = E_x - J_{\mathrm{self}}$, with the same exchange constant $c_x$.

Proof. For a ground state with one-body spin orbitals $\psi_1', \dots, \psi_N'$, recall its self-interaction energy, $J_{\mathrm{self}}(\psi_{N,L}) = \sum_{i=1}^{N} J(\rho^{(i)})$, where $\rho^{(i)}(r) = \sum_s |\psi_i'(r,s)|^2$. The $\psi_i'$ must be eigenfunctions of the one-body Laplacian (see Lemma 1.1) and are as usual assumed to be ordered by size of eigenvalue. Hence for any $i \in \{1,\dots,N\}$ we may write $\psi_i' = \sum_{j=i_-+1}^{i_+} \alpha_{ij}\psi_j$ for some $\alpha_{ij} \in \mathbb{C}$, $\sum_{j=i_-+1}^{i_+} |\alpha_{ij}|^2 = 1$, where the $\psi_j$ are the canonical eigenfunctions from Sect. 2. Thus by Lemmas 2.1, 3.2,
\[ |\psi_i'(r,s)|^2 \le \sum_{j=i_-+1}^{i_+} |\alpha_{ij}|^2 \sum_{j=i_-+1}^{i_+} |\psi_j(r,s)|^2 \le c\bar\rho N^{-1/2}. \tag{7.2} \]
This $L^\infty$-estimate alone does not suffice to infer (7.1). I use Hölder's inequality and the Hardy-Littlewood-Sobolev inequality [e.g. So93 0.2.3, St93 VIII 4.2]
\[ \left\| \frac{1}{|\cdot|^{n/\alpha}} * f \right\|_{L^q(\mathbb{R}^n)} \le c(p,q,n)\, \|f\|_{L^p(\mathbb{R}^n)} \]
($n \in \mathbb{N}$, $\alpha > 1$, $\frac{1}{\alpha} = 1 - \frac{1}{p} + \frac{1}{q}$, $1 < p < q < \infty$) with $\alpha = n = 3$, $p = 6/5$, $q = 6$. Extending the $\rho^{(i)}$ by zero to all of $\mathbb{R}^3$, for $f = \rho^{(i)}$
\[ J(f) = \int_{\mathbb{R}^3} f(r) \left( \frac{1}{|\cdot|} * f \right)(r)\, dr \le \|f\|_{L^{6/5}} \left\| \frac{1}{|\cdot|} * f \right\|_{L^6} \le c\|f\|_{L^{6/5}}^2 \le c\, \|f\|_{L^1}^{5/3}\, \|f\|_{L^\infty}^{1/3}. \]
To infer (7.1), use $\|\rho^{(i)}\|_{L^1} = 1$ and the $L^\infty$-bound (7.2).

8. Concluding Remarks

This article by no means exhausts the study of pair correlations and exchange phenomena even in noninteracting systems. A main shortcoming is that I do not know under which changes of domain the pair correlation function away from the boundary, the 'correlation exponent' (3 in Theorem 1.1 and $-2/3$ in Theorem 1.1') and the exchange constant $c_x$ would survive. Studying more general domains would seem to require a strategy of investigation which bypasses the explicit calculations in Sect. 2 and could use the differential information on the one-body orbitals $\{\phi_i\}$ more directly. One step in this direction would be to devise, without resorting to explicit calculation of eigenfunctions, a proof of the following consequence of Theorem 1.1':

Corollary 8.1. Under the assumptions of Theorem 1.1', there exists a set $S \subseteq \mathbb{N}^2$ of asymptotic density one (that is, $N^{-2}|S \cap [1,N]^2| \to 1$ as $N \to \infty$), such that $\phi_i\phi_j^* \to 0$ in $H^{-1}(\mathbb{R}^3)$ ($(i,j) \in S$, $|(i,j)| \to \infty$).

Inspection of the diagonal terms $i = j$ shows that the restriction to a subset of asymptotic density one is essential: $\langle (-\Delta_{\mathbb{R}^3})^{-1}|\phi_i|^2, |\phi_i|^2 \rangle_Q \ge (4\pi)^{-1} 3^{-1/2} \langle 1, |\phi_i|^2 \otimes |\phi_i|^2 \rangle_{Q \times Q} = (4\pi)^{-1} 3^{-1/2}$. (If $\{\phi_i\}$ is the standard basis (2.3), $\langle (-\Delta_{\mathbb{R}^3})^{-1}|\phi_i|^2, |\phi_i|^2 \rangle_Q = \langle \hat C 1, |\phi_i|^2 \otimes |\phi_i|^2 \rangle_{Q \times Q} \to \langle \hat C 1, 1 \rangle_{Q \times Q}$ with $\hat C$ as below, since $|\phi_i|^2 \otimes |\phi_j|^2 \rightharpoonup^* 1$ in $L^\infty(Q \times Q)$ and $\hat C 1 \in L^1(Q \times Q)$. But note that if the diagonal terms were the only terms not to converge to zero, the expected value in Theorem 1.1' should behave as $\mathbb{E} \sim N^{-1}$, not $\mathbb{E} \sim N^{-2/3}$.)

While recent advances in weak convergence methods for partial differential equations [Ta90, Ge91] as well as classical ergodic theorems for eigenfunctions of the Laplacian
[Sh74, CV85, HMR87, GL93] concern the asymptotic behaviour of quadratic forms $\langle L_0\phi_i, \phi_i \rangle_\Lambda$ associated with pseudodifferential operators $L_0$ of degree zero, the situation here appears to be slightly different. When working in one-body configuration space, $\|\phi_i\phi_j^*\|^2_{H^{-1}(\mathbb{R}^3)} = \langle (-\Delta_{\mathbb{R}^3})^{-1}(\phi_i\phi_j^*), \phi_i\phi_j^* \rangle_Q$, so we are dealing with a pseudodifferential operator of degree $-2$ and a sequence not known to converge weakly to zero in $L^2(Q)$. Alternatively, when working in two-body configuration space, we may write $\|\phi_i\phi_j^*\|^2_{H^{-1}(\mathbb{R}^3)} = \langle \hat C(\phi_i \otimes \phi_j), \phi_i \otimes \phi_j \rangle_{Q \times Q}$, where $\hat C$ is a nonlocal operator which switches arguments and multiplies by a weight factor making contributions of nearby points dominant,
\[ (\hat C\psi)(r,r') = \frac{1}{4\pi|r - r'|}\, \psi(r',r), \]
but at least the argument of the quadratic form, $\phi_i \otimes \phi_j$, then converges weakly in $L^2(Q \times Q, \mathbb{C})$ to zero as $|(i,j)| \to \infty$. Keeping [Sh74, CV85, HMR87, GL93] in mind, what may play a role in Corollary 8.1 is that the underlying classical Hamiltonian system, the geodesic flow in $Q$ (augmented, in case of zero boundary conditions, by reflection at the boundary according to the law of geometrical optics), $(p^{(t)}, q^{(t)}) : \mathbb{R}^3 \times [0,L]^3 \to \mathbb{R}^3 \times [0,L]^3$, is ergodic at least in the position variable: for almost every $(p_0, q_0) \in \mathbb{R}^3 \times [0,L]^3$ and every $f \in C([0,L]^3)$,
\[ \lim_{T \to \infty} \frac{1}{T} \int_0^T f\big(q^{(t)}(p_0, q_0)\big)\, dt = \frac{1}{L^3} \int_{[0,L]^3} f\, dq. \]

Acknowledgement. The work reported here forms part of a wider research project [Fr97], the bulk of which was carried out and presented in a series of lectures during the 1995/96 programme 'Mathematical methods in materials science' at the Institute for Mathematics and Its Applications, University of Minnesota, Minneapolis. I am greatly indebted to my hosts A. Friedman, R. Gulliver & R. D. James for their hospitality and their enthusiastic support of this project. Also, it is a pleasure to thank the participants of the lectures, in particular F. Dulles, R. D. James, S. Müller, for stimulating feedback, and M. Struwe and the referee for careful reading of the manuscript.
References
[Ba93] Bach, V.: Accuracy of mean field approximations for atoms and molecules. Commun. Math. Phys. 155, 295–310 (1993)
[Bl29] Bloch, F.: Bemerkung zur Elektronentheorie des Ferromagnetismus und der elektrischen Leitfähigkeit. Z. Physik 57, 545–555 (1929)
[BP89] Bombieri, E., Pila, J.: The number of integral points on arcs and ovals. Duke Math. J. 59, 337–357 (1989)
[Co23] van der Corput, J.G.: Neue zahlentheoretische Abschätzungen. Math. Annalen 89, 215–254 (1923)
[CV85] Colin de Verdière, Y.: Ergodicité et fonctions propres du laplacien. Commun. Math. Phys. 102, 497–502 (1985)
[DG90] Dreizler, R.M., Gross, E.K.U.: Density Functional Theory. Berlin–Heidelberg–New York: Springer-Verlag, 1990
[Di30] Dirac, P.A.M.: Note on exchange phenomena in the Thomas atom. Proc. Cambridge Philosophical Society 26, 376–385 (1930)
[El95] Ellis, D.E. (ed.): Density Functional Theory of Molecules, Clusters, and Solids. Amsterdam: Kluwer, 1995
[Fe27] Fermi, E.: Un metodo statistico per la determinazione di alcune proprietà dell’atome. Rend. Accad. Naz. Lincei 6, 602–607 (1927)
[Fr97] Friesecke, G.: From Quantum Mechanics to Density Functional Theory. Berlin–Heidelberg–New York: Springer-Verlag (Series: IMA Volumes in Mathematics and Its Applications), to appear
[FS92+] Fefferman, C., Seco, L.: Adv. Math. 95, 145–305 (1992); Rev. Math. Iberoamericana 9, 409–551 (1993); Adv. Math. 107, 1–185 (1994); 107, 187–364 (1994); 108, 263–335 (1994); 111, 88–161 (1995)
[Ga63] Gauss, C.F.: De nexu inter multitudinem classium, in quas formae binariae secundi gradus distribuntur, earumque determinantem. Werke, Zweiter Band, Göttingen: Königl. Gesellschaft d. Wissenschaften zu Göttingen (ed.), 1863, pp. 269–291
[GD95] Gross, E.K.U., Dreizler, R.M. (eds.): Density Functional Theory. New York: Plenum Press, 1995
[Gé91] Gérard, P.: Microlocal defect measures. Comm. Partial Differential Equations 16, 1761–1794 (1991)
[GL93] Gérard, P., Leichtnam, E.: Ergodic properties of eigenfunctions for the Dirichlet problem. Duke Math. J. 71, 559–607 (1993)
[GS94] Graf, G.M., Solovej, J.P.: A correlation estimate with applications to quantum systems with Coulomb interactions. Rev. Math. Phys. 6 No. 5a, 977–997 (1994)
[Ha15] Hardy, G.H.: On the expression of a number as the sum of two squares. Quarterly J. Math. 46, 263–283 (1915)
[HK64] Hohenberg, P., Kohn, W.: Inhomogeneous electron gas. Phys. Rev. B 136, 864–871 (1964)
[HMR87] Helffer, B., Martinez, A., Robert, D.: Ergodicité et limite semi-classique. Commun. Math. Phys. 109, 313–326 (1987)
[Hu93] Huxley, M.N.: Exponential sums and lattice points II. Proc. London Math. Soc. 66, 279–301 (1993)
[IM88] Iwaniec, H., Mozzochi, C.J.: On the divisor and circle problems. J. Number Theory 29, 60–93 (1988)
[KL90] Kryachko, E.S., Ludeña, E.V.: Energy Density Functional Theory of Many-Electron Systems. Amsterdam: Kluwer Academic Publishers, 1990
[KS65] Kohn, W., Sham, L.J.: Self-consistent equations including exchange and correlation effects. Phys. Rev. A 140, 1133–1138 (1965)
[La15] Landau, E.: Zur analytischen Zahlentheorie der definiten quadratischen Formen. Sitzungsberichte d. Königlich Preußischen Akademie d. Wissenschaften. 1. Halbband, 1915, pp. 458–476
[Li79] Lieb, E.H.: A lower bound for Coulomb energies. Phys. Lett. 70A, 444–446 (1979)
[Li83] Lieb, E.H.: Density functionals for Coulomb systems. Int. J. Quantum Chemistry XXIV, 243–277 (1983)
[LN75] Lieb, E.H., Narnhofer, H.: The thermodynamic limit for Jellium. J. Stat. Phys. 12 No. 4, 291–310 (1975)
[LS77] Lieb, E.H., Simon, B.: The Thomas-Fermi theory of atoms, molecules and solids. Adv. Math. 23, 22–116 (1977)
[LT75] Lieb, E.H., Thirring, W.E.: Bound for the kinetic energy of fermions which proves the stability of matter. Phys. Rev. Lett. 35 (11), 687–689 (1975)
[PY89] Parr, R.G., Yang, W.: Density-Functional Theory of Atoms and Molecules. Oxford: Oxford University Press, 1989
[RS78] Reed, M., Simon, B.: Methods of Modern Mathematical Physics IV: Analysis of Operators. New York: Academic Press, 1978
[Se95] Seminario, J.M.: An introduction to density functional theory in chemistry. In [SP95], 1995, pp. 1–27
[Sh74] Shnirelman, A.: Ergodic properties of eigenfunctions. Uspekhi Mat. Nauk. 29 (6), 181–182 (1974)
[Si06] Sierpiński, W.: O pewnem zagadnieniu z rachunku funkcyj asymptotycznych. Prace Mat.-Fiz. 17, 77–118 (1906). Reprinted in French under the title ‘Sur un problème du calcul des fonctions asymptotiques’. In Oeuvres Choisies, S. Hartman, A. Schinzel (eds.), Warszawa: Editions Scientifiques de Pologne, 1974
[Sl51] Slater, J.C.: A simplification of the Hartree-Fock method. Phys. Rev. 81, 385–390 (1951)
[So93] Sogge, Ch.D.: Fourier integrals in classical analysis. Cambridge: Cambridge University Press, 1993
[SO82] Szabo, A., Ostlund, N.S.: Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory. New York: Macmillan Publishing Co., 1982
[SP95] Seminario, J.M., Politzer, P. (eds.): Modern Density Functional Theory – A Tool for Chemistry. London: Elsevier, 1995
[St93] Stein, E.M.: Harmonic Analysis. Princeton: Princeton University Press, 1993
[Ta90] Tartar, L.: H-measures, a new approach for studying homogenisation, oscillations and concentration effects in partial differential equations. Proc. Roy. Soc. Edinburgh 115A, 193–230 (1990)
[Th80] Thirring, W.: Lehrbuch der Mathematischen Physik 4: Quantenmechanik großer Systeme. Berlin–Heidelberg–New York: Springer-Verlag, 1980
[Th27] Thomas, L.H.: The calculation of atomic fields. Proc. Cambr. Phil. Society 23, 542–548 (1927)
[WS33] Wigner, E., Seitz, F.: On the constitution of metallic sodium. Phys. Rev. 43, 804–810 (1933)
Communicated by D. Brydges
Commun. Math Phys. 184, 173 – 202 (1997)
Communications in Mathematical Physics © Springer-Verlag 1997
Some Functions that Generalize the Askey–Wilson Polynomials
F. Alberto Grünbaum (1,*), Luc Haine (2,**)
(1) Department of Mathematics, University of California, Berkeley, CA 94720–3840, USA.
(2) Department of Mathematics, Université Catholique de Louvain, 1348 Louvain-la-Neuve, Belgium.
Received: 7 May 1996 / Accepted: 30 August 1996
Abstract: We determine all biinfinite tridiagonal matrices for which some family of eigenfunctions are also eigenfunctions of a second order q-difference operator. The solution is described in terms of an arbitrary solution of a q-analogue of Gauss' hypergeometric equation depending on five free parameters and extends the four dimensional family of solutions given by the Askey-Wilson polynomials. There is some evidence that this bispectral problem, for an arbitrary order q-difference operator, is intimately related with some q-deformation of the Toda lattice hierarchy and its Virasoro symmetries. When tridiagonal matrices are replaced by the Schroedinger operator, and q = 1, this statement holds with Toda replaced by KdV. In this context, this paper determines the analogs of the Bessel and Airy potentials.
Table of Contents
1 Introduction
2 An Operator Identity and its Solution
3 The q-Riccati Equation for f1(k)
4 The Operator B
5 Linearizing the q-Riccati Equation
6 The Gauss-Askey-Wilson Equation & Proof of Theorem 1
7 Solving the Gauss-Askey-Wilson Equation in Terms of Basic Hypergeometric Series
8 The Case of the Classical Orthogonal Polynomials
9 The Case v = 0 in the q-Riccati Equation
References
(*) The first author was supported in part by NSF Grant # DMS94-00097 and by AFOSR under Contract AFO F49629-92.
(**) The second author is a Research Associate for FNRS.
1. Introduction
In [4] Askey and Wilson introduced a family of polynomials depending on four parameters a, b, c, d that satisfy a three term recursion relation and a second order q-difference equation. As the title of [4] clearly states, these polynomials generalize those of Jacobi, and in fact most of the polynomials used in mathematical physics and harmonic analysis can be seen to be special or limiting cases of the Askey–Wilson polynomials. It is fair to say that the main properties of all these polynomials are indeed closely related to the two mentioned above. In the last few years these polynomials have become a working tool in many developing areas of mathematics and physics, among them quantum groups, see [8, 20, 22]. A sample of other applications is given in [10, 12, 23–25]. In [14] and particularly in [1] and [15] some instances were uncovered where these polynomials play a role in relation to nonlinear evolution equations connected with the Toda lattice and its Virasoro symmetries. We defer to a later paper for a more detailed look into this issue.
In [3] Andrews and Askey proposed that a family of orthogonal polynomials should be named “classical” exactly when the two properties mentioned above, namely a three term recursion relation and a second order q-difference equation, hold, i.e. we have
\[ f_n(k) = (s(k) - b_n) f_{n-1}(k) - a_{n-1} f_{n-2}(k) \quad \text{for } n \ge 1, \qquad f_{-1} = 0, \quad f_0 = 1, \tag{1.1} \]
and
\[ B f_n(k) \equiv a(k)\,(f_n(qk) - f_n(k)) + b(k)\,(f_n(k/q) - f_n(k)) = \theta_n f_n(k), \tag{1.2} \]
for some appropriate choice of the “spectral parameter” $s(k)$. When (1.2) is replaced by a differential equation in $k$ the choice of “spectral parameter” is immaterial, as discussed in [14]. Within the context of q-difference equations the choice of spectral parameter cannot be swept under the rug. We limit ourselves to the case
\[ s(k) = \gamma(k + \varepsilon/k), \tag{1.3} \]
with $\gamma$ and $\varepsilon$ arbitrary parameters. The choices $s(k) = k$ and $s(k) = (k + 1/k)/2$ are up to scaling the only ones that deserve attention. These two familiar cases are connected with the big q-Jacobi and the Askey–Wilson polynomials respectively. We make an effort to push both cases in a unified fashion. At times we abuse the standard convention by referring to the case $s(k) = \gamma(k + \varepsilon/k)$ as the Askey–Wilson case.
In [14] a proof is given of the fact that the Askey–Wilson (and the big q-Jacobi) polynomials are indeed the only ones that satisfy these two properties. In [7] Bochner had proved the corresponding statement when one is dealing with ordinary second order differential equations (i.e. $q = 1$), namely the only families of polynomials satisfying a three term recursion relation and a second order differential equation are those of Jacobi, Laguerre, Hermite and Bessel. In [15] this result of Bochner is “revisited” by doing away with the requirement that the functions $f_n(k)$ should be polynomials in the spectral variable $k$. This is achieved by removing the condition $f_{-1}(k) = 0$. The main result is that all the instances when the two properties hold are given by an arbitrary choice of five parameters which after scaling and translation in $k$ can be reduced to three. For any such choice of parameters one considers an arbitrary solution of Gauss’ hypergeometric equation (which contains
three free parameters), and builds $f_1(k)$ out of it. The rest of the family is then determined since we can still assume $f_0(k) = 1$.
The main motivation for “revisiting” this result of Bochner is our interest in the “bispectral property” studied in the continuous-continuous case in [9] and then analyzed in [13–15]. A specially simple case of the “bispectral property” is given by the existence of nontrivial simultaneous solutions to (1.1) and (1.2) above. In the context of this problem one sees that the restriction to polynomials is not natural and that starting with the largest possible family of “second order” bispectral situations is crucial in obtaining the “higher order” ones. If one had imposed in [9] an initial condition of the type $\phi(0, k) = 0$ (in line with the condition $f_{-1}(k) = 0$ in the present setup) one would have missed completely the “Korteweg-deVries” cases, and would have only found the Virasoro cases. These considerations could also be relevant in providing new examples in the theory of random matrices, see for instance [1, 2, 23, 26].
The purpose of this paper is to obtain the general ($q \neq 1$) version of the results obtained in [15], i.e. we remove the condition $f_{-1}(k) = 0$ in (1.1) and retain (1.2). More precisely (1.1) is replaced by (2.1). We can now state the main result in this paper.
Theorem 1. All the instances of families $f_n(k)$, $n \in \mathbb{Z}$, $f_0(k) = 1$, satisfying a three term recursion (1.1) with spectral parameter $s(k) = \gamma(k + \varepsilon/k)$ and a second order q-difference equation (1.2) are obtained as follows:
(1) Choose arbitrarily five parameters $a, b, c, d, z$ and use them to define $a_n$ and $b_n$ in the three term recursion relation by
\[ a_n = A_{n-1} C_n, \qquad b_n = \gamma\Big(\varepsilon a + \frac{1}{a}\Big) - (A_{n-1} + C_{n-1}), \tag{1.4} \]
with
\[ A_n = \frac{\gamma\,(q^{-n}z - \varepsilon ab)(q^{-n}z - ac)(q^{-n}z - ad)(q^{1-n}z - abcd)}{a\,(q^{1-2n}z^2 - abcd)(q^{-2n}z^2 - abcd)}, \]
\[ C_n = \frac{\gamma a\,(q^{-n}z - 1)(q^{1-n}z - bc)(q^{1-n}z - bd)(\varepsilon q^{1-n}z - cd)}{(q^{2-2n}z^2 - abcd)(q^{1-2n}z^2 - abcd)}. \]
(2) For each such choice consider the second order q-difference equation
\[ A(k)\,(y(qk) - y(k)) + A(\varepsilon/k)\,(y(k/q) - y(k)) = (1 - z)\Big(\frac{abcd}{qz} - 1\Big)\, y(k) \tag{1.5} \]
with
\[ A(k) = \frac{(1-ka)(1-kb)(\varepsilon-kc)(\varepsilon-kd)}{(\varepsilon - k^2)(\varepsilon - k^2 q)} \equiv \frac{y_4 k^4 - y_3 k^3 + y_2 k^2 - \varepsilon y_1 k + \varepsilon^2}{(\varepsilon - k^2)(\varepsilon - k^2 q)}, \]
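and with an arbitrary solution $y(k)$ of this equation construct $f_1(k)$ via the “Rodrigues type formula”
\[ f_1(k) = s(k) - b_1 + \frac{\gamma q z}{y_4 - q z^2}\, A(k)\Big(k - \frac{\varepsilon}{k}\Big)\Big(\frac{y(qk)}{y(k)} - 1\Big) + \frac{\gamma(z - 1)}{y_4 - q z^2}\Big( y_4 k + \frac{qz\,(qz y_3 - y_1 y_4)}{y_4 - q^2 z^2} + \frac{\varepsilon q z}{k} \Big). \tag{1.6} \]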
Then, the functions $f_n(k)$, $n \in \mathbb{Z}$, defined by the three term recursion relation
\[ f_n(k) = (s(k) - b_n) f_{n-1}(k) - a_{n-1} f_{n-2}(k), \qquad n \in \mathbb{Z}, \]
with $a_n$, $b_n$ as in (1.4), $f_0(k) = 1$ and $f_1(k)$ as in (1.6), are eigenfunctions of a second order q-difference operator
\[ B f_n(k) := a(k)\,(f_n(qk) - f_n(k)) + b(k)\,(f_n(k/q) - f_n(k)) = \theta_n f_n(k), \]
with coefficients $a(k)$ and $b(k)$ defined in terms of $a, b, c, d, z$ and $f_1(k)$ and eigenvalues $\theta_n$ given by
\[ \theta_n = (1 - q^{-n})(abcd\, q^{n-1} - z^2). \tag{1.7} \]
Furthermore, different choices of the function $f_1(k)$ (corresponding to different solutions of (1.5)) result in second order q-difference operators that are conjugate to each other.
Notice that Eq. (1.5) is strikingly similar to the one satisfied by the Askey-Wilson polynomials, but with an arbitrary value of $z$. In the special case with $z = 1$, one can pick a solution of the q-difference equation (1.5) to be a constant and then the construction gives $f_1(k) = \gamma(k + \varepsilon/k) - b_1$ and we are in the case when all the $f_n(k)$, $n \ge 0$, are polynomials in the spectral parameter $\gamma(k + \varepsilon/k)$, i.e. the Askey-Wilson case according to [14]. For a more complete picture, see Sect. 8.
For generic values of $z$ in what one may want to call the “Gauss-Askey-Wilson” equation we are led to functions $f_n(k)$ that are not polynomials, but generalize the Askey-Wilson polynomial situation as the title of this paper indicates. One can consider the parameter $z$ of the Gauss-Askey-Wilson equation as the extra degree of freedom that moves us away from the polynomial case. The equation that we are calling the Gauss-Askey-Wilson equation converges, as $q$ approaches 1, to the classical equation of Gauss (see Sect. 6). This happens for any choice of the spectral parameter $s(k)$. For a discussion of the relation between (1.5) and the equation that is usually referred to as the q-hypergeometric equation, [11, 17, 20], one can see [16].
A final remark is in order. The appearance of the Gauss-Askey-Wilson equation in the case of a general $z$ may be confusing, but it reflects the central role played by this equation.
Its solutions do not give f1 (k) directly, rather f1 (k) is obtained through the Rodrigues type formula (1.6). Exactly the same situation arises in the case of q = 1, see [15]. In the polynomial case the Askey-Wilson equation plays a second role, namely it appears in the bispectral operator B in (1.2).
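The two-sided recursion in Theorem 1 can be illustrated with a minimal Python sketch. It assumes only the relation $f_n = (s - b_n)f_{n-1} - a_{n-1}f_{n-2}$ evaluated at one fixed $k$; the function name `extend_family` and the toy coefficients in the usage note are hypothetical and are not the coefficients (1.4). Stepping downwards divides by $a_{n-1}$, which is why nonzero $a_n$'s are needed.

```python
from fractions import Fraction


def extend_family(s, a, b, f1, n_min, n_max):
    """Values f_n for n_min <= n <= n_max at one fixed k, where s = s(k).

    a and b are callables n -> a_n, b_n (every a_n used must be nonzero);
    the starting values are f_0 = 1 and f_1 = f1.
    """
    f = {0: Fraction(1), 1: Fraction(f1)}
    # Upwards: f_n = (s - b_n) f_{n-1} - a_{n-1} f_{n-2}.
    for n in range(2, n_max + 1):
        f[n] = (s - b(n)) * f[n - 1] - a(n - 1) * f[n - 2]
    # Downwards: solve the same relation for f_{n-2}.
    for n in range(1, n_min + 1, -1):
        f[n - 2] = ((s - b(n)) * f[n - 1] - f[n]) / a(n - 1)
    return f
```

With the polynomial-type start $f_1 = s - b_1$ the downward step returns $f_{-1} = 0$, as in (1.1); any other choice of $f_1$ gives a family that does not terminate on the negative side.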
2. An Operator Identity and its Solution
Rewrite (1.1) and (1.2) as
\[ Lf = s(k) f, \qquad f = (\dots, f_{-1}(k), f_0(k), f_1(k), f_2(k), \dots)^t, \tag{2.1} \]
\[ Bf = \Theta f, \tag{2.2} \]
with
\[ L = \begin{pmatrix}
\ddots & \ddots & \ddots & & & & \\
a_{-2} & b_{-1} & 1 & & & & \\
 & a_{-1} & b_0 & 1 & & & \\
 & & a_0 & b_1 & 1 & & \\
 & & & a_1 & b_2 & 1 & \\
 & & & & a_2 & b_3 & 1 \\
 & & & & \ddots & \ddots & \ddots
\end{pmatrix}, \tag{2.3} \]
$s(k)$ as in (1.3) and $\Theta$ the diagonal matrix $\Theta = \mathrm{diag}(\dots, \theta_{-1}, \theta_0, \theta_1, \theta_2, \dots)$. The next lemma was already derived in [14]. It provides the q-version of a lemma used in [15] in the case $q = 1$, following a basic observation in [9].
Lemma 1. Any solution of (2.1) and (2.2) satisfies the matrix identity
\[ (L^3\Theta - \Theta L^3) + x\,(L^2\Theta L - L\Theta L^2) + \tilde{x}\,(L\Theta - \Theta L) = 0 \tag{2.4} \]
with
\[ x = -\frac{1 + q + q^2}{q}, \qquad \tilde{x} = \varepsilon\gamma^2\, \frac{(q-1)^2 (q+1)^2}{q^2}. \]
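The constants $x$ and $\tilde{x}$ in Lemma 1 are tied to the choice (1.3) of $s(k)$: the proof uses that $s^2(k) + s^2(qk) + (1+x)s(k)s(qk) + \tilde{x}$ vanishes, and likewise with $q^{-1}k$ in place of $qk$. The sketch below checks this in exact rational arithmetic; the helper name `bracket` and the sample values in the usage note are illustrative assumptions.

```python
from fractions import Fraction


def bracket(k, q, gamma, eps, shift):
    """s(k)^2 + s(shift*k)^2 + (1 + x) s(k) s(shift*k) + x_tilde,
    for s(k) = gamma * (k + eps / k) and x, x_tilde as in Lemma 1.

    Pass Fractions so that all divisions stay exact.
    """
    s = lambda t: gamma * (t + eps / t)
    x = -(1 + q + q**2) / q
    x_tilde = eps * gamma**2 * (q - 1)**2 * (q + 1)**2 / q**2
    return s(k)**2 + s(shift * k)**2 + (1 + x) * s(k) * s(shift * k) + x_tilde
```

For instance, `bracket(Fraction(2), Fraction(3), Fraction(2), Fraction(1), Fraction(3))` evaluates the bracket attached to $f(qk)$ at $k = 2$, $q = 3$, $\gamma = 2$, $\varepsilon = 1$.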
Proof. From (2.1) and (2.2) one obtains immediately that
\[ \begin{aligned}
&(L^3\Theta - \Theta L^3)f + x\,(L^2\Theta L - L\Theta L^2)f + \tilde{x}\,(L\Theta - \Theta L)f \\
&\quad = (B(s^3 f) - s^3 Bf) + x\,(sB(s^2 f) - s^2 B(sf)) + \tilde{x}\,(B(sf) - sBf) \\
&\quad = a(k)\,(s(qk) - s(k))\,[s^2(k) + s^2(qk) + (1 + x)s(k)s(qk) + \tilde{x}]\, f(qk) \\
&\qquad + b(k)\,(s(q^{-1}k) - s(k))\,[s^2(k) + s^2(q^{-1}k) + (1 + x)s(k)s(q^{-1}k) + \tilde{x}]\, f(q^{-1}k).
\end{aligned} \]
Notice that the choice of $s(k)$ given in (1.3) cancels the two terms in the square brackets. As $k$ varies, the $f(k)$ are linearly independent vectors, so that the finite band operator $(L^3\Theta - \Theta L^3) + x\,(L^2\Theta L - L\Theta L^2) + \tilde{x}\,(L\Theta - \Theta L)$ has infinite dimensional kernel; hence it must vanish identically, proving our lemma.
We can now exploit the lemma above to derive necessary conditions that $L$ and $\Theta$ should satisfy if (2.1) and (2.2) are to hold. We restrict ourselves to the case where all $a_n$’s are nonzero. The same requirement is needed to derive the classical result of Bochner. In the sequel we shall denote by
\[ [\alpha] = \frac{q^{\alpha} - 1}{q - 1} \]
the q-analogue of $\alpha$. In order to solve (2.4) for $L$ and $\Theta$, we proceed along the lines of [14]. Equating the diagonals of (2.4) to zero, starting with the upper one, we obtain at the $(n, n+3)$th entry the equations:
\[ \theta_{n+2} - \frac{[3]}{q}\,\theta_{n+1} + \frac{[3]}{q}\,\theta_n - \theta_{n-1} = 0, \qquad n \in \mathbb{Z}, \tag{2.5} \]
whose general solution is given by:
\[ \theta_n = q^{1-n}[n]([n]u + v) + w, \tag{2.6} \]
with $u, v, w$ free parameters. Since we can shift the $\theta_n$’s by an arbitrary constant, we may always assume that $w = 0$. For our purposes it is clear that only the ratio $u/v$ (or $v/u$) plays an important role in (2.4). Equating the $(n, n+2)$th and $(n, n+1)$th entries to zero, we obtain
\[ (\theta_{n+1} - \theta_{n+2})\, b_{n+2} + (\theta_{n+1} - \theta_{n-1})\, b_{n+1} + (\theta_{n-2} - \theta_{n-1})\, b_n = 0 \tag{2.7} \]
and
\[ \begin{aligned}
&(\theta_n - \theta_{n+2})\, a_{n+1} + (\theta_{n+1} + \theta_n - \theta_{n-1} - \theta_{n-2})\, a_n + (\theta_{n-3} - \theta_{n-1})\, a_{n-1} \\
&\quad + (\theta_n - \theta_{n-1})\, \frac{(b_{n+1}q - b_n)(b_{n+1} - b_n q)}{q} + \varepsilon\gamma^2\, \frac{(q-1)^2(q+1)^2}{q^2}\, (\theta_n - \theta_{n-1}) = 0,
\end{aligned} \tag{2.8} \]
where the $\theta_n$’s, $n \in \mathbb{Z}$, are given by (2.6). Using (2.6) we see that the general solution of (2.7) depends on two free parameters $b_1$ and $r = b_2 - b_1$, explicitly:
\[ b_n = b_1 + \frac{[n-1]\, z_{n-1}}{z_{2n-3}\, z_{2n-1}} \Big( r q^{n-2}([3]u + v) + b_1 (q^{n-2} - 1)\big((1 - q)v + (1 - q^n)u\big) \Big) \tag{2.9} \]
with $z_n = v + [n]u$. Going now into Eq. (2.8) one sees, after some labor, that the general solution for $a_n$ depends on two free parameters, $a_0$ and $a_1$, and is given by
\[ \begin{aligned}
a_n ={}& \frac{q^{n-1}[n-1][n]\, z_{n-1} z_n\, \tilde{a}_n \tilde{\tilde{a}}_n}{(q+1)^2\, z_{2n-2}\, z_{2n-1}^2\, z_{2n}} + a_1\, \frac{q^{n-1}[n](v + [2]u)\, z_{n-1}}{z_{2n-2}\, z_{2n}} \\
& - a_0\, \frac{q^{n-1}[n-1](q^2 v - [2]u)\, z_n}{z_{2n-2}\, z_{2n}} + \varepsilon\gamma^2\, \frac{(q-1)^2 [n-1][n]\, z_{n-1} z_n}{z_{2n-2}\, z_{2n}}
\end{aligned} \tag{2.10} \]
with
\[ \begin{aligned}
\tilde{a}_n &= -r(v + [3]u) + b_1 v(q - 1) + b_1 u(q^{n+1} + q^n - q^2 - 1), \\
\tilde{\tilde{a}}_n &= -q^{n-1} r(v + [3]u) + b_1 v(q^n - q^{n-1} - q^2 + 1) + b_1 u(1 + q - q^{n-1} - q^{n+1}).
\end{aligned} \]
In summary, Θ is given by (2.6) and L is determined by (2.9) and (2.10). The expressions above make it clear that it is safer to assume that v + [n]u is nonzero for all integers n, see however the remark at the end of the section for a discussion of these special cases. For further use, it will be crucial to observe that Eqs. (2.7) and (2.8) can both be integrated once.
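The statement that (2.6) solves (2.5) can be checked directly: the characteristic roots of (2.5) are $1$, $q$ and $q^{-1}$, and (2.6) is a combination of $1$, $q^n$ and $q^{-n}$. A small Python sketch in exact arithmetic follows; the function names are hypothetical, and $q$ should be passed as a `Fraction`.

```python
from fractions import Fraction


def q_number(n, q):
    """[n] = (q^n - 1) / (q - 1), for any integer n and q != 1."""
    return (q**n - 1) / (q - 1)


def theta(n, q, u, v, w=0):
    """Eigenvalue formula (2.6)."""
    return q**(1 - n) * q_number(n, q) * (q_number(n, q) * u + v) + w


def residual_2_5(n, q, u, v, w=0):
    """Left-hand side of the recurrence (2.5); zero when (2.6) is inserted."""
    c = q_number(3, q) / q
    t = lambda m: theta(m, q, u, v, w)
    return t(n + 2) - c * t(n + 1) + c * t(n) - t(n - 1)
```

For $q = 2$ and $u = v = 1$, `theta` returns $0, 2, 6, 14$ for $n = 0, 1, 2, 3$.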
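Indeed we can write (2.7) as
\[ [(\theta_{n+1} - \theta_{n+2})\, b_{n+2} + (\theta_n - \theta_{n-1})\, b_{n+1}] - [(\theta_n - \theta_{n+1})\, b_{n+1} + (\theta_{n-1} - \theta_{n-2})\, b_n] = 0, \]
which is equivalent to
\[ (\theta_n - \theta_{n+1})\, b_{n+1} + (\theta_{n-1} - \theta_{n-2})\, b_n = \beta \tag{2.11} \]
with
\[ \beta = -\frac{1}{q}\,\big[(r + b_1(1 - q^2))\,v + ((q+1)^2 b_1 + [3]r)\,u\big]. \tag{2.12} \]
Then using (2.5), (2.6) and (2.11), (2.8) becomes
\[ \begin{aligned}
&[(\theta_n - \theta_{n+2})\, a_{n+1} + (\theta_n - \theta_{n-2})\, a_n + x_{n+1} b_{n+1}^2 - \beta b_{n+1} + \tilde{x}\theta_n] \\
&\quad - [(\theta_{n-1} - \theta_{n+1})\, a_n + (\theta_{n-1} - \theta_{n-3})\, a_{n-1} + x_n b_n^2 - \beta b_n + \tilde{x}\theta_{n-1}] = 0,
\end{aligned} \tag{2.13} \]
with
\[ \tilde{x} = \varepsilon\gamma^2\, \frac{(q-1)^2(q+1)^2}{q^2}, \qquad x_n = q^{1-n}\big((q-1)v - (1 + q^{2n-2})u\big), \]
or equivalently
\[ (\theta_{n-1} - \theta_{n+1})\, a_n + (\theta_{n-1} - \theta_{n-3})\, a_{n-1} + x_n b_n^2 - \beta b_n + \tilde{x}\theta_{n-1} = \alpha, \tag{2.14} \]
with
\[ \alpha = \frac{\big(r b_1 + (1-q) b_1^2 + [2](a_0 q^2 - a_1)\big)\,v + \big((1+q^2) b_1^2 + [3] r b_1 - (1+q)^2 (a_0 + a_1)\big)\,u}{q}. \tag{2.15} \]
Observe that one can solve (2.12) and (2.15) for $r$ and $a_0$,
\[ r = \frac{b_1\big((q^2 - 1)v - (q+1)^2 u\big) - q\beta}{(q^2 + q + 1)u + v}, \qquad a_0 = \frac{(q+1)((q+1)u + v)\,a_1 + q\big((2u - (q-1)v)\,b_1^2 + \beta b_1 + \alpha\big)}{(q+1)(q^2 v - (q+1)u)}, \tag{2.16} \]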
and therefore the set of parameters $(u, v, a_1, b_1, \alpha, \beta)$ is equivalent to the set of parameters $(u, v, a_0, a_1, b_1, r)$. From now on up to the end of Sect. 5, it will be more convenient for us to work with the set of parameters $(u, v, a_1, b_1, \alpha, \beta)$ and all our formulas will be written in terms of these equivalent parameters.
Notation. Equations (2.11) and (2.14) will be the crucial ingredient in Sect. 4 in order to establish the existence of $B$ in (2.2) under the condition (2.4). We shall abbreviate them by $V_{a_n} = V_{b_n} = 0$ with
\[ \begin{aligned}
V_{a_n} &= \beta - (\theta_{n-1} - \theta_{n-2})\, b_n - (\theta_n - \theta_{n+1})\, b_{n+1}, \\
V_{b_n} &= \alpha - \tilde{x}\theta_{n-1} + \beta b_n - x_n b_n^2 - (\theta_{n-1} - \theta_{n-3})\, a_{n-1} - (\theta_{n-1} - \theta_{n+1})\, a_n.
\end{aligned} \tag{2.17} \]
When $q = 1$, we showed in [15] that the vector field $\dot{a}_n = 2 a_n V_{a_n}$, $\dot{b}_n = V_{b_n}$ is a combination of the Toda lattice vector field and the sl(2) part of its Virasoro symmetries.
Remark. As long as $v + [n]u \neq 0$ for all $n \in \mathbb{Z}$, one can re-express the solution (2.9) and (2.10) to Eqs. (2.7) and (2.8) using as free parameters $u, v, b_{k-1}, r_k = b_k - b_{k-1}, a_{k-2}, a_{k-1}$ instead of $u, v, b_1, r = b_2 - b_1, a_0, a_1$ for an arbitrary choice of $k \in \mathbb{Z}$. One checks easily that in this way one obtains formulas for $a_n$ and $b_n$ for which the limits $v + [2k-1]u = 0$ and $v + [2k-4]u = 0$ make sense and provide the solution of (2.7) and (2.8) corresponding to these special choices.
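As a sanity check on the closed forms of this section, the following sketch evaluates (2.9) and (2.10) in exact arithmetic and tests them against the integrated relations (2.11) and (2.14). The parameter values are arbitrary assumptions (chosen so that $z_n = 2^n$ never vanishes), and the names `at1`, `at2` for the two auxiliary factors in (2.10) are hypothetical.

```python
from fractions import Fraction as F

# Arbitrary sample parameters (an assumption); here z_n = 2**n, never zero.
q, u, v = F(2), F(1), F(1)
b1, r, a0, a1 = F(1), F(0), F(1), F(1)
gam, eps = F(1), F(1)


def br(n):                      # q-number [n]
    return (q**n - 1) / (q - 1)


def z(n):                       # z_n = v + [n] u
    return v + br(n) * u


def theta(n):                   # (2.6) with w = 0
    return q**(1 - n) * br(n) * (br(n) * u + v)


def b(n):                       # closed form (2.9)
    inner = (r * q**(n - 2) * (br(3) * u + v)
             + b1 * (q**(n - 2) - 1) * ((1 - q) * v + (1 - q**n) * u))
    return b1 + br(n - 1) * z(n - 1) * inner / (z(2*n - 3) * z(2*n - 1))


def a(n):                       # closed form (2.10)
    at1 = (-r * (v + br(3) * u) + b1 * v * (q - 1)
           + b1 * u * (q**(n + 1) + q**n - q**2 - 1))
    at2 = (-q**(n - 1) * r * (v + br(3) * u)
           + b1 * v * (q**n - q**(n - 1) - q**2 + 1)
           + b1 * u * (1 + q - q**(n - 1) - q**(n + 1)))
    d = z(2*n - 2) * z(2*n)
    return (q**(n - 1) * br(n - 1) * br(n) * z(n - 1) * z(n) * at1 * at2
            / ((q + 1)**2 * d * z(2*n - 1)**2)
            + a1 * q**(n - 1) * br(n) * (v + br(2) * u) * z(n - 1) / d
            - a0 * q**(n - 1) * br(n - 1) * (q**2 * v - br(2) * u) * z(n) / d
            + eps * gam**2 * (q - 1)**2 * br(n - 1) * br(n) * z(n - 1) * z(n) / d)


x_tilde = eps * gam**2 * (q - 1)**2 * (q + 1)**2 / q**2
beta = -((r + b1 * (1 - q**2)) * v + ((q + 1)**2 * b1 + br(3) * r) * u) / q   # (2.12)
alpha = ((r * b1 + (1 - q) * b1**2 + br(2) * (a0 * q**2 - a1)) * v
         + ((1 + q**2) * b1**2 + br(3) * r * b1 - (1 + q)**2 * (a0 + a1)) * u) / q  # (2.15)


def x(n):
    return q**(1 - n) * ((q - 1) * v - (1 + q**(2*n - 2)) * u)


def lhs_2_11(n):
    return (theta(n) - theta(n + 1)) * b(n + 1) + (theta(n - 1) - theta(n - 2)) * b(n)


def lhs_2_14(n):
    return ((theta(n - 1) - theta(n + 1)) * a(n)
            + (theta(n - 1) - theta(n - 3)) * a(n - 1)
            + x(n) * b(n)**2 - beta * b(n) + x_tilde * theta(n - 1))
```

With these sample values `lhs_2_11(n)` and `lhs_2_14(n)` should equal `beta` and `alpha` for every integer `n`, which is the content of (2.11) and (2.14).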
3. The q-Riccati Equation for $f_1(k)$
In the last section we obtained expressions for the entries in the matrices $L$ and $\Theta$ (in terms of the free parameters $u, v, a_1, b_1, \alpha, \beta$) which follow from the existence of a nontrivial family of functions $f_n(k)$ that satisfy both (2.1) and (2.2). In this section we will obtain a further equation that the function $f_1(k)$ has to satisfy if (2.1) and (2.2) are to hold. Section 4 will be devoted to proving that the conditions derived in Sect. 2 coupled with the requirement that $f_1(k)$ should satisfy the equation derived in the present section are not only necessary but also sufficient for (2.1) and (2.2) to hold.
From $Bf_1 = \theta_1 f_1$ and $Bf_2 = \theta_2 f_2$ one can solve for the coefficients $a(k)$ and $b(k)$ in $B$ except in the case when the determinant of the corresponding system of equations is zero. This determinant is seen to be zero exactly when $f_1(k)$ satisfies the nonlinear q-difference equation of order two given by
\[ f_1(qk) f_1(k/q)\,(s(k/q) - s(qk)) + f_1(qk) f_1(k)\,(s(qk) - s(k)) + f_1(k) f_1(k/q)\,(s(k) - s(k/q)) = 0. \tag{3.1} \]
If $f_1(k)$ is (locally) identically zero, the equation $Bf_2 = \theta_2 f_2$ reduces to $a_1\theta_2 = 0$, and under our assumption that $v + [n]u \neq 0$ for all integers this means $a_1 = 0$. This case gives then $f_i = 0$ for all $i \ge 1$, and we get (directly from (2.1)) for $f_{-i}$ a family of polynomials of degree $i$ in $s$ and we are back into familiar territory.
If we assume that $f_1$ is not (locally) identically zero, by introducing the ratio $g(k) = f_1(k)/f_1(k/q)$, Eq. (3.1) can be rewritten as
\[ g(qk) = \frac{Q(k)}{g(k) R(k) + S(k)} \tag{3.2} \]
with $Q(k) = s(k/q) - s(k)$, $R(k) = s(qk) - s(k)$, $S(k) = s(k/q) - s(kq)$. Notice that this equation is of the same form as the equation for $f_1(k)$ given later in (3.4) with $P(k) = 0$. This equation will be of fundamental importance for our development, and Sect. 5 is devoted to solving it.
We now solve the simpler Eq. (3.2) by adapting the method to be described in Sect. 5. By putting
\[ g(k) = \frac{S(k)}{R(k)} \Big( \frac{w(qk)}{w(k)} - 1 \Big), \]
the nonlinear equation above becomes the second order linear q-difference equation
\[ w(q^2 k) - w(qk) - \frac{R(kq)\, Q(k)}{S(kq)\, S(k)}\, w(k) = 0, \tag{3.3} \]
and the coefficient of $w(k)$ is $q/(q+1)^2$ when $\varepsilon = 0$ and is given by
\[ \frac{(k^2 - q)(k^2 q^3 - 1)}{(k^2 - 1)(k^2 q^2 - 1)(q+1)^2} \]
in the case $\varepsilon = 1$. Appropriate substitutions allow one to proceed. The general solution of (3.3) is a linear combination of two independent solutions with coefficients $c(k)$ arbitrary q-periodic functions, i.e., satisfying $c(kq) = c(k)$. See [6]. In the case $\varepsilon = 0$ this gives for the first order equation for $g$ the general solution
\[ g(k) = \frac{(-q - 1)\big((k^{n_2} q^{n_2} - k^{n_2})\,t + k^{n_1} q^{n_1} - k^{n_1}\big)}{q\,(k^{n_2} t + k^{n_1})} \]
with $t$ an arbitrary q-periodic function and $n_1$, $n_2$ given by $q^{n_1} = q/(q+1)$ and $q^{n_2} = 1/(q+1)$, and one concludes that $f_1(k)$ is given by $f_1(k) = a/(k + t)$ with $a$ (and $t$) an arbitrary q-periodic function. In the case $\varepsilon = 1$ the first order q-difference equation for $g$ has a general solution given by
\[ g(k) = \frac{(q^2 + k^2)\,t + kq}{q\,(t(k^2 + 1) + k)} \]
with $t$ an arbitrary q-periodic function. In this case, once again, one concludes that $f_1(k)$ has the form $c_1/(s(k) - c_2)$, where $c_1$, $c_2$ are arbitrary q-periodic functions.
To complete this discussion we need to see what is the sequence of functions $f_n(k)$ that one obtains when $f_1(k)$ is given by the choice above. The equations $Bf_i = \theta_i f_i$, $i = 1, 2$, force the following quantities to vanish:
\[ \gamma(q+1)(v + (q+1)u)(a_1 - c_1) \qquad \text{and} \qquad a_1 (v + [3]u)(b_2 - c_2). \]
The first condition pins down $c_1$. Since we already assumed that $f_1(k)$ is not (locally) identically zero, the second condition pins down $c_2$ and we get $f_1(k) = a_1/(s(k) - b_2)$. By using (2.1) this means that $f_2(k) = 0$ and $f_3(k) = -a_2 f_1(k)$. From $Bf_3 = \theta_3 f_3$ we conclude then, since $\theta_1 \neq \theta_3$, that $a_2 = 0$ and from (2.1) we get that $f_i(k) = 0$ for $i \ge 3$. Furthermore $f_1, f_0, f_{-1}, \dots, f_{-i}, \dots$ are given by $f_1(k)$ times a polynomial of degree $i + 1$. Conjugating by $f_1$ we are back in the polynomial case.
Assume from now on that the $2 \times 2$ system of equations $Bf_i = \theta_i f_i$, $i = 1, 2$, has a nonzero determinant. We proceed now to explore the conditions under which $Bf_i(k) = \theta_i f_i(k)$ for $i = 1, 2, 3, \dots$.
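Two of the claims in this degenerate case lend themselves to a quick exact check: that $f_1 = c_1/(s(k) - c_2)$ satisfies (3.1), and that the coefficient of $w(k)$ in (3.3) has the stated form for $\varepsilon = 0$ and $\varepsilon = 1$. The sketch below is only illustrative; the function names are hypothetical, $c_1$ and $c_2$ are taken constant, and all inputs should be `Fraction`s.

```python
from fractions import Fraction as F


def s(k, gam, eps):
    """Spectral parameter (1.3)."""
    return gam * (k + eps / k)


def lhs_3_1(f1, k, q, gam, eps):
    """Left-hand side of (3.1) for a callable f1."""
    sm, s0, sp = s(k / q, gam, eps), s(k, gam, eps), s(q * k, gam, eps)
    fm, f0, fp = f1(k / q), f1(k), f1(q * k)
    return fp * fm * (sm - sp) + fp * f0 * (sp - s0) + f0 * fm * (s0 - sm)


def w_coefficient(k, q, gam, eps):
    """Coefficient of w(k) in (3.3): -R(kq) Q(k) / (S(kq) S(k)),
    with Q, R, S as defined after (3.2)."""
    Q = lambda t: s(t / q, gam, eps) - s(t, gam, eps)
    R = lambda t: s(q * t, gam, eps) - s(t, gam, eps)
    S = lambda t: s(t / q, gam, eps) - s(q * t, gam, eps)
    return -R(k * q) * Q(k) / (S(k * q) * S(k))
```

Since $\gamma$ cancels in the ratio, `w_coefficient` does not depend on `gam`; for $\varepsilon = 0$ it reduces to the constant $q/(q+1)^2$.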
It will turn out that we can always conclude that $f_1(k)$ has to satisfy a certain first order nonlinear q-difference equation of the form
\[ f_1(kq) = \frac{P(k) f_1(k) + Q(k)}{R(k) f_1(k) + S(k)} \tag{3.4} \]
with $P, Q, R, S$ given by the expressions
\[ \begin{aligned}
P(k) &= q\alpha + q[\beta + b_1(q-1)(qv - u)]\, s(kq) + q(q+1)[u - (q-1)v]\, s(kq)^2 + \gamma k q\,(qv - u)(q^2 - 1)(s(kq) - b_1), \\
Q(k) &= -a_1 (q+1)((q+1)u + v)(s(kq) - s(k)), \\
R(k) &= q(q+1)v\,(s(kq) - s(k)), \\
S(k) &= q\alpha + q[\beta + b_1(q-1)(qv - u)]\, s(k) + q(q+1)[u - (q-1)v]\, s(k)^2 + \frac{\varepsilon\gamma}{k}\,(qv - u)(q^2 - 1)(s(k) - b_1).
\end{aligned} \tag{3.5} \]
We will devote Sect. 5 to a discussion of this equation including a method that allows us to solve it explicitly in terms of “classical functions” for the $P, Q, R, S$ that appear in our problem. We observe for later use that
\[ S(\varepsilon/kq) = P(k), \qquad Q(\varepsilon/kq) = -Q(k) \qquad \text{and} \qquad R(\varepsilon/kq) = -R(k). \tag{3.6} \]
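The coefficients (3.5) and the symmetry (3.6) are easy to mistype, so a small exact-arithmetic sketch may help. It builds $P, Q, R, S$ as Python closures and provides one step of the map (3.4); `make_coeffs` and `riccati_step` are hypothetical names, the parameter values in the usage note are arbitrary, and (3.6) is only meaningful for $\varepsilon \neq 0$.

```python
from fractions import Fraction as F


def make_coeffs(q, u, v, a1, b1, alpha, beta, gam, eps):
    """Return s, P, Q, R, S as in (1.3) and (3.5). Pass Fractions throughout."""
    s = lambda k: gam * (k + eps / k)
    lin = beta + b1 * (q - 1) * (q * v - u)
    quad = (q + 1) * (u - (q - 1) * v)
    tail = (q * v - u) * (q**2 - 1)

    def P(k):
        t = s(k * q)
        return q * alpha + q * lin * t + q * quad * t**2 + gam * k * q * tail * (t - b1)

    def Q(k):
        return -a1 * (q + 1) * ((q + 1) * u + v) * (s(k * q) - s(k))

    def R(k):
        return q * (q + 1) * v * (s(k * q) - s(k))

    def S(k):
        t = s(k)
        return q * alpha + q * lin * t + q * quad * t**2 + (eps * gam / k) * tail * (t - b1)

    return s, P, Q, R, S


def riccati_step(f1_at_k, k, P, Q, R, S):
    """f_1(kq) obtained from f_1(k) via the q-Riccati equation (3.4)."""
    return (P(k) * f1_at_k + Q(k)) / (R(k) * f1_at_k + S(k))
```

A typical use is `s, P, Q, R, S = make_coeffs(F(2), F(1), F(3), F(1), F(2), F(5), F(-1), F(1), F(3, 5))`, after which `S(eps / (k * q))` can be compared with `P(k)` at any rational `k`.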
We proceed now to explain the method that allows one to trap $f_1(k)$ into a relation as the one given in (3.4). Recall that we can read off the coefficients $a(k)$ and $b(k)$ in the operator $B$ in terms of $f_1(k)$ and $s(k)$ as well as their forward and backward q-shifted versions. Insisting on $Bf_3 = \theta_3 f_3$ gives an expression for $f_1(k/q)$ in terms of $f_1(k)$ and $f_1(kq)$, in the form
\[ f_1(k/q) = F(k, f_1(k), f_1(kq)) \tag{3.7} \]
with $F$ a rational function. Insisting on $Bf_4 = \theta_4 f_4$ and using the above expression to eliminate $f_1(k/q)$ we get for $f_1(kq)$ two possible expressions in terms of $k$ and $f_1(k)$. The first one is the expression (3.4) mentioned above. The other possible expression for $f_1(kq)$ has the form in (3.4) except for an extra multiplicative factor $f_1(k)$ in the right-hand side. Indicate this last relation by
\[ f_1(kq) = G(k, f_1(k)). \tag{3.8} \]
If one replaces $k$ by $k/q$ this becomes
\[ f_1(k) = G(k/q, f_1(k/q)). \tag{3.9} \]
This expression can be combined with relation (3.7) to give
\[ f_1(k/q) = F(k, f_1(k), G(k, f_1(k))) \tag{3.10} \]
and if this one is finally plugged into (3.9) we get the equation in $f_1(k)$ given by
\[ f_1(k) = G(k/q, F(k, f_1(k), G(k, f_1(k)))). \tag{3.11} \]
It turns out that this equation has two possible solutions: one is $f_1(k) = 0$ and the other one is given by $f_1(k) = a_1/(s(k) - b_2)$. In both cases the determinant of the two-by-two system discussed above vanishes, contradicting our assumption. Thus we are forced to rule out the case (3.8) and to keep the condition (3.4). We call this Eq. (3.4) the q-Riccati equation, since it appears in our development at exactly the point where the usual Riccati equation appears in [15] (see case b) in Sect. 3).
Now something amazing happens: as long as $f_1(k)$ satisfies (3.4) all the remaining conditions $Bf_i = \theta_i f_i$, $i = 5, 6, \dots$ are automatically met and (3.4) is not only necessary but in fact, in conjunction with the conditions derived in Sect. 2, it is sufficient to ensure the existence of an operator $B$ so that (2.1) and (2.2) should hold. This will be seen in the next section.
We close this section with the observation that Eq. (3.4) can be put in a form that allows one to see the classical Riccati equation emerging in the limit $q \to 1$. One first rewrites (3.4) as
\[ (Rf_1 + S)\, \frac{f_1(kq) - f_1(k)}{s(kq) - s(k)} = -\frac{R}{s(kq) - s(k)}\, f_1^2 + \frac{P - S}{s(kq) - s(k)}\, f_1 + \frac{Q}{s(kq) - s(k)}. \tag{3.12} \]
Using the notation
\[ D_s^+ f = \frac{f(kq) - f(k)}{s(kq) - s(k)} \tag{3.13} \]
and the definitions of $P, Q, R, S$ one can express (3.12) in the form
\[ (Rf_1 + S)\, D_s^+ f_1 = -q(q+1)v f_1^2 + [q(u+v)(s(k) + s(kq)) + q\beta + 2q b_1 (u - qv)]\, f_1 - (q+1) a_1 ((q+1)u + v). \tag{3.14} \]
As $q \to 1$, the equation becomes
\[ (\alpha + \beta s(k) + 2u s^2(k))\, \frac{df}{ds} = -2v f_1^2 + [2(u+v)s(k) + \beta + 2b_1(u - v)]\, f_1 - 2a_1(2u + v). \]
When $s(k) = k$, this equation agrees with the one obtained in [15].
4. The Operator B
Substituting $f_2 + b_2 f_1 + a_1 f_0$ for $s(k) f_1$, the q-Riccati equation as written in (3.14) becomes
\[ A^+ f_1(k) = q(u+v) f_2(k) + (q(u+v) b_2 + q\beta + 2q b_1 (u - qv))\, f_1(k) - a_1((1 + q + q^2)u + v)\, f_0(k) + q s(kq)(u+v) f_1(k) \tag{4.1} \]
with
\[ A^+ = (R(k) f_1(k) + S(k))\, D_s^+ + q(q+1)v f_1(k). \]
Rewriting (3.4) as
\[ f_1(k/q) = \frac{S(k/q) f_1(k) - Q(k/q)}{-R(k/q) f_1(k) + P(k/q)}, \]
we could of course have obtained a similar formula in terms of the operator
\[ D_s^- f(k) = \frac{f(k/q) - f(k)}{s(k/q) - s(k)}, \]
namely
\[ A^- f_1(k) = q(u+v) f_2(k) + (q(u+v) b_2 + q\beta + 2q b_1 (u - qv))\, f_1(k) - a_1((1 + q + q^2)u + v)\, f_0(k) + q s(k/q)(u+v) f_1(k) \tag{4.2} \]
with
\[ A^- = (-R(k/q) f_1(k) + P(k/q))\, D_s^- + q(q+1)v f_1(k). \]
In this section we show that formulas (4.1) and (4.2) generalize into differentiation formulas for all $f_n(k)$. This will follow by induction from the integrated form (2.11) and (2.14) of the operator identity (2.4). The difference between these two differentiation formulas will give the bispectral operator $B$ in (2.2), therefore establishing that the operator identity (2.4) is not only necessary but is also sufficient to guarantee a solution of (2.1) and (2.2). Let $T$ be the tridiagonal matrix defined by
\[ \begin{aligned}
T_{n,n-1} &= -a_n [n]\, q^{1-n}([n+2]u + v), \\
T_{n,n} &= q(u - qv) b_1 + q^{1-n}\big((q - [n])v - ([2n+1] + q^2 [n][n-2])u\big)\, b_{n+1}, \\
T_{n,n+1} &= -[n-2]\, q^{3-n}([n]u + v).
\end{aligned} \tag{4.3} \]
With $f = (\dots, f_{-1}(k), f_0(k) = 1, f_1(k), \dots)^t$, we have the
Lemma 2. (Differentiation formulas)
\[ A^+ f = Tf + q s(kq)\,\Theta f, \tag{4.4} \]
\[ A^- f = Tf + q s(k/q)\,\Theta f \tag{4.5} \]
with $T$ as in (4.3) and $\Theta$ the diagonal matrix of the eigenvalues $\theta_n$ as in (2.6).

Remark. When $s(k) = k$, $s(kq) = qs(k)$ and, using the three term recursion relation, formulas (4.4) and (4.5) can be rewritten as
$$A_\pm f = Q_\pm f, \tag{4.6}$$
for some tridiagonal matrices Q± . When q → 1, these formulas reduce to a differentiation formula which we already used in [15] and which generalizes the standard differentiation formulas satisfied by the Hermite, Laguerre, Jacobi and Bessel polynomials. If we now express the compatibility between (4.6) and Lf = kf , we obtain a q-version of a “string-like” equation
$$LQ_+ - qQ_+L = S(L) \quad\text{and}\quad LQ_- - q^{-1}Q_-L = P(q^{-1}L), \tag{4.7}$$
with $S(k)$ and $P(k)$ the polynomials of degree 2 in $k$ defined in (3.5) with $\gamma = 1$ and $\varepsilon = 0$. In [15] (see also [1] for a version in the context of orthogonal polynomials) we have shown that when $q \to 1$ Eqs. (4.7) can be interpreted as saying that the solutions to our problem are fixed points of an arbitrary linear combination of the Toda lattice vector field and the sl(2) part of its Virasoro symmetries. A similar interpretation remains to be worked out when $q \neq 1$. Before proving Lemma 2, we deduce

Theorem 2. Let
$$B = \frac{A_+ - A_-}{q\,\big(s(kq) - s(k/q)\big)}. \tag{4.8}$$
Then
$$Bf = \Theta f. \tag{4.9}$$
More explicitly, the coefficients $a(k)$ and $b(k)$ in (1.2) are given by
$$a(k) = \frac{R(k)f_1(k) + S(k)}{q\,(s(kq)-s(k))(s(kq)-s(k/q))} \tag{4.10}$$
and
$$b(k) = \frac{R(k/q)f_1(k) - P(k/q)}{q\,(s(k/q)-s(k))(s(kq)-s(k/q))}, \tag{4.11}$$
with $f_1(k)$ a solution of the q-Riccati equation (3.4). Moreover, a choice of a different solution $\tilde f_1(k)$ in (3.4) leads to $\tilde B$ conjugate to $B$ as follows:
$$\tilde B = g(k)\,B\,g(k)^{-1}, \tag{4.12}$$
with $g(k)$ a solution of the equation
$$\frac{1}{g(k)}\,D_s^+ g(k) = \frac{q(q+1)v\,(f_1 - \tilde f_1)}{R(k)\tilde f_1 + S(k)}.$$

Proof. Formula (4.9) follows by taking the difference between (4.4) and (4.5). The conjugation formula (4.12) follows from the explicit expressions (4.10) and (4.11) for $a(k)$ and $b(k)$ if one uses Eq. (3.4) satisfied by $f_1(k)$ and $\tilde f_1(k)$.

Remark. Note that the form of $B$ is independent of $f_1(k)$ exactly when $v = 0$. The meaning of the conjugation result above is that one can get the same $B$ for different choices of $f_1$ if one is willing to abandon the normalization $f_0 = 1$.
Proof of Lemma 2. We only establish (4.4), since (4.5) can be proved in a similar way. Rewrite (4.4) as
$$\big(R(k)f_1(k) + S(k)\big) f_n(kq) = S(k) f_n(k) + \big(s(kq) - s(k)\big)\big(T_{nn+1}f_{n+1}(k) + T_{nn}f_n(k) + T_{nn-1}f_{n-1}(k) + q\,s(kq)\,\theta_n f_n(k)\big). \tag{4.13}$$
The case $n = 0$ of this identity is trivially satisfied, using the definition of $R(k)$ in (3.5), and the case $n = 1$ has been established in (4.1). Assume that we have proved (4.13) for $0 \le j \le n$, $n \ge 1$; we establish it for $j = n + 1$. Let us denote by $(RHS)_n$ the right-hand side of (4.13). Since
$$s(kq) = q\,s(k) + \frac{\varepsilon\gamma(1 - q^2)}{kq},$$
by using the three term recursion relation defining the $f_n$'s and the definitions of $T$ and $\Theta$, $(RHS)_n$ can be rewritten as
$$\begin{aligned}
(RHS)_n ={}& q^{n+1}(q+1)u\,f_{n+2}(k)\\
&+ \big[q^{n+1}(q+1)u\,b_{n+2} + q^{1-n}\big((q^{2n}+1)u-(q-1)v\big)b_{n+1} + q\beta\big] f_{n+1}(k)\\
&+ \big[q^{1-n}\big((q^{2n}+1)u-(q-1)v\big)b_{n+1}^2 + q\beta b_{n+1} + q\alpha + q^{n+1}(q+1)u\,a_{n+1} + q^{1-n}(q+1)\big(u-(q-1)v\big)a_n\big] f_n(k)\\
&+ \big[q^{1-n}\big((q^{2n}+1)u-(q-1)v\big)b_{n+1} + q^{1-n}(q+1)\big(u-(q-1)v\big)b_n + q\beta\big] a_n f_{n-1}(k)\\
&+ q^{1-n}(q+1)\big(u-(q-1)v\big)a_{n-1}a_n f_{n-2}(k)\\
&- \frac{\varepsilon\gamma}{kq^n}\Big[P_{nn+1}f_{n+1}(k) + \Big(b_{n+1} - \frac{\varepsilon\gamma(q+1)}{k}\Big)P_{nn}f_n(k) + a_nP_{nn-1}f_{n-1}(k)\Big]
\end{aligned}$$
with
$$\begin{aligned}
P_{nn+1} &= (2q^{2n+1} + q^{2n} - 2q^{n+1} - 2q^n + q)u + (q-1)(q^{n+1} + q^n - q)v,\\
P_{nn} &= (q+1)(q^n-1)\big((q^n-1)u + (q-1)v\big),\\
P_{nn-1} &= (q^{2n+1} - 2q^{n+1} - 2q^n + 2q + 1)u + (q-1)(q^{n+1} + q^n - 2q - 1)v.
\end{aligned}$$
Since $f_{n+1} = (s - b_{n+1})f_n - a_n f_{n-1}$, using the induction hypothesis that (4.13) holds for $n - 1$ and $n$, and remembering the definitions of $Va_n$ and $Vb_n$ in (2.17), we obtain
$$\begin{aligned}
&\big(R(k)f_1(k) + S(k)\big)f_{n+1}(kq) - (RHS)_{n+1}\\
&\quad= \big(s(kq)-b_{n+1}\big)(RHS)_n - a_n(RHS)_{n-1} - (RHS)_{n+1}\\
&\quad= q(q-1)(Va_{n+1})f_{n+2}(k)\\
&\qquad+ \Big[q(q-1)\big(b_{n+2}Va_{n+1} + Vb_{n+1}\big) - \frac{\varepsilon\gamma}{k}(q-1)(q+1)Va_{n+1}\Big]f_{n+1}(k)\\
&\qquad+ \Big[q(q-1)\big(a_{n+1}Va_{n+1} + a_nVa_n + b_{n+1}Vb_{n+1}\big) - \frac{\varepsilon\gamma}{k}(q-1)(q+1)Vb_{n+1}\Big]f_n(k)\\
&\qquad+ \Big[q(q-1)\big(b_nVa_n + Vb_{n+1}\big) - \frac{\varepsilon\gamma}{k}(q-1)(q+1)Va_n\Big]a_nf_{n-1}(k)\\
&\qquad+ q(q-1)(Va_n)\,a_{n-1}a_nf_{n-2}(k)\\
&\qquad- \frac{\varepsilon\gamma}{kq^n}(q-1)(q+1)^2(q^n-1)\big((q-1)v + (q^n-1)u\big)\\
&\qquad\quad\times\Big[a_n\big(f_n + b_nf_{n-1} + a_{n-1}f_{n-2} - sf_{n-1}\big) + \Big(b_{n+1} - \frac{\varepsilon\gamma}{kq}(q+1)\Big)\big(f_{n+1} + b_{n+1}f_n + a_nf_{n-1} - sf_n\big)\\
&\qquad\qquad+ f_{n+2} + b_{n+2}f_{n+1} + a_{n+1}f_n - sf_{n+1}\Big].
\end{aligned}$$
Since $Va_n = Vb_n = 0$ for all $n$ (see (2.11), (2.14) and (2.17)), using the three term recursion relation satisfied by the $f_n(k)$'s, the last expression is identically zero, which establishes (4.13) for $n + 1$. By a similar argument one shows that, assuming that (4.13) is true for $n - 1 \le j \le 1$, $n \le 1$, it is also true for $j = n - 2$, which completes the proof.

5. Linearizing the q-Riccati Equation

When $v = 0$, Eq. (3.4) satisfied by $f_1$ becomes a linear first order non-homogeneous q-difference equation. We will consider this case in Sect. 9. Our objective in this section is to linearize (3.4) when $v \neq 0$, by a q-analogue of the "log derivative trick" which is used to solve a standard Riccati equation. In this way we will reduce the solution of this equation to solving a second order linear q-difference equation which is strikingly similar to the celebrated second order q-difference equation satisfied by the Askey-Wilson polynomials.

The following trick converts the q-Riccati equation (3.4) into a second order linear equation. Put
$$f_1(k) = \frac{S(k)}{R(k)}\,\frac{w(kq) - w(k)}{w(k)}, \tag{5.1}$$
then $w(k)$ satisfies the second order linear q-difference equation
$$R(k)S(k)S(kq)\,w(kq^2) - \big(R(kq)P(k) + R(k)S(kq)\big)S(k)\,w(kq) + \big(P(k)S(k) - Q(k)R(k)\big)R(kq)\,w(k) = 0. \tag{5.2}$$
From the definition of $P(k)$, $R(k)$ and $S(k)$, see (3.5), it follows that
$$R(kq) = R(k)\,\frac{s(kq^2) - s(kq)}{s(kq) - s(k)}$$
and
$$\big(s(kq^2) - s(kq)\big)P(k) + \big(s(kq) - s(k)\big)S(kq) = \big(s(kq^2) - s(k)\big)U(kq)$$
with
$$U(k) = q\alpha + q\beta s(k) + q\big(2u - (q-1)v\big)s(k)^2. \tag{5.3}$$
Substituting the above identities into (5.2) one finds
$$\big(s(kq) - s(k)\big)S(k)S(kq)\,w(kq^2) + \big(s(k) - s(kq^2)\big)U(kq)S(k)\,w(kq) + \big(s(kq^2) - s(kq)\big)\big(P(k)S(k) - Q(k)R(k)\big)w(k) = 0. \tag{5.4}$$
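The passage from (5.1) to (5.2) is a purely algebraic identity and can be spot-checked with exact arithmetic. The sketch below assumes that (3.4) has the form $(R f_1 + S) f_1(kq) = P f_1 + Q$, which is the form consistent with the inverted relation quoted in Sect. 4 and with the $v = 0$ case of Sect. 9; the sample values and the helper names `riccati_residual` and `linear_lhs` are arbitrary illustrative choices.

```python
from fractions import Fraction as F


def riccati_residual(P, Q, R, S, f_k, f_kq):
    """Residual of the assumed form (R f1 + S) f1(kq) - P f1 - Q of (3.4)."""
    return (R * f_k + S) * f_kq - P * f_k - Q


def linear_lhs(P, Q, R, S, R_q, S_q, w0, w1, w2):
    """Left-hand side of (5.2); w0, w1, w2 stand for w(k), w(kq), w(kq^2)."""
    return (R * S * S_q * w2
            - (R_q * P + R * S_q) * S * w1
            + (P * S - Q * R) * R_q * w0)


# Arbitrary rational sample values for the coefficients at k and at kq.
P, Q, R, S = F(2, 3), F(-5, 7), F(3, 4), F(11, 5)
R_q, S_q = F(-2, 9), F(7, 3)
w0, w1, w2 = F(1, 2), F(5, 3), F(-4, 7)

# Substitution (5.1) at k and at kq.
f_k = S / R * (w1 - w0) / w0
f_kq = S_q / R_q * (w2 - w1) / w1

lhs = linear_lhs(P, Q, R, S, R_q, S_q, w0, w1, w2)
scaled = riccati_residual(P, Q, R, S, f_k, f_kq) * w0 * R * R_q
```

Here `lhs` and `scaled` coincide, so $w$ solves (5.2) exactly when $f_1$ defined by (5.1) solves the q-Riccati equation, provided $w(k)$, $R(k)$ and $R(kq)$ are nonzero.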
When $q \to 1$, we have seen in Sect. 3 that the q-Riccati equation (3.4) becomes
$$S(s)\frac{df_1}{ds} + 2vf_1^2 - \big(2(u+v)s + \beta + 2b_1(u-v)\big)f_1 + 2a_1(2u+v) = 0,$$
and (5.1) reduces to
$$f_1 = \frac{S(s)}{2v}\,\frac{d\log w}{ds}$$
with $S(s) = \alpha + \beta s + 2us^2$, leading to the second order linear equation
$$S(s)^2\frac{d^2w}{ds^2} + 2(u-v)(s-b_1)S(s)\frac{dw}{ds} + 4a_1v(2u+v)\,w = 0. \tag{5.5}$$
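For the reader's convenience, the step from the Riccati equation to (5.5) can be written out as follows; this is only the standard log derivative computation, with primes denoting $d/ds$.

```latex
\begin{aligned}
f_1 &= \frac{S\,w'}{2v\,w}
\quad\Longrightarrow\quad
S f_1' + 2v f_1^2
  = \frac{S S' w' + S^2 w''}{2v\,w}
    - \frac{S^2 w'^2}{2v\,w^2}
    + \frac{S^2 w'^2}{2v\,w^2}
  = \frac{S S' w' + S^2 w''}{2v\,w},\\[4pt]
0 &= 2v\,w\Big[S f_1' + 2v f_1^2
      - \big(2(u+v)s + \beta + 2b_1(u-v)\big) f_1 + 2a_1(2u+v)\Big]\\
  &= S^2 w'' + S\big[S' - 2(u+v)s - \beta - 2b_1(u-v)\big] w'
     + 4a_1 v(2u+v)\,w,\\[4pt]
S' &= \beta + 4us
\quad\Longrightarrow\quad
S' - 2(u+v)s - \beta - 2b_1(u-v) = 2(u-v)(s-b_1).
\end{aligned}
```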
Notice that in this case, since our equation does not depend explicitly on $k$, there is no difference between the cases $\varepsilon = 0$ and $\varepsilon \neq 0$. When $q \neq 1$, the explicit dependence on $k$ cannot be eliminated and the situation becomes richer. We refer the reader to our previous paper [15] for a detailed study of the case $q = 1$. As long as the roots $p_1$ and $p_2$ of $S(s)$ are distinct, (5.5) is a Fuchsian differential equation with three regular singular points at $p_1$, $p_2$ and infinity and can therefore be reduced to the standard form of the Gauss hypergeometric equation by putting $w = (s - p_1)^{r_1}(s - p_2)^{r_2}y$, with $r_1$ (resp. $r_2$) a root of the indicial equation at $p_1$ (resp. $p_2$). Our strategy to solve (5.4) will be to mimic this approach, using the fact that the q-analogue of $(1 - k)^{-r}$ is provided by the q-hypergeometric series
$${}_1\phi_0(r; -; q, k) = \sum_{n=0}^{\infty}\frac{(r;q)_n}{(q;q)_n}\,k^n,$$
where $(r;q)_n$ denotes the q-shifted factorial $(r;q)_0 = 1$, $(r;q)_n = (1-r)(1-rq)\cdots(1-rq^{n-1})$, $n = 1, 2, \ldots$. Denoting by $h_r(k)$ the above series, we recall that
$$h_r(qk) = \frac{1-k}{1-rk}\,h_r(k) \tag{5.6}$$
from which, with $|q| < 1$, one derives immediately Cauchy's formulation of the q-binomial theorem
$$h_r(k) = \frac{(rk;q)_n}{(k;q)_n}\,h_r(q^nk) = \frac{(rk;q)_\infty}{(k;q)_\infty}, \tag{5.7}$$
where $(r;q)_\infty$ denotes the infinite product
$$(r;q)_\infty = \prod_{n=0}^{\infty}(1 - rq^n).$$
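Relations (5.6) and (5.7) are easy to probe numerically. The following Python sketch truncates both the series $h_r(k)$ and the infinite products; the truncation orders, the sample values of $q$, $r$, $k$ and the function names are illustrative assumptions, and floating point only gives agreement up to rounding.

```python
def h_series(r, k, q, terms=200):
    """Truncated series h_r(k) = sum_n (r; q)_n / (q; q)_n * k**n, for |k| < 1."""
    total, coeff = 0.0, 1.0
    for n in range(terms):
        total += coeff
        # Ratio of consecutive terms: (1 - r q^n) / (1 - q^(n+1)) * k.
        coeff *= (1.0 - r * q**n) / (1.0 - q**(n + 1)) * k
    return total


def q_infinity(a, q, factors=200):
    """Truncated infinite product (a; q)_infinity for |q| < 1."""
    prod = 1.0
    for n in range(factors):
        prod *= 1.0 - a * q**n
    return prod


q, r, k = 0.4, 0.3, 0.5

# Functional equation (5.6).
gap_56 = h_series(r, q * k, q) - (1.0 - k) / (1.0 - r * k) * h_series(r, k, q)

# q-binomial theorem (5.7).
gap_57 = h_series(r, k, q) - q_infinity(r * k, q) / q_infinity(k, q)
```

Both gaps are of the order of machine precision for these sample values.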
Observe from (3.6) that the Laurent polynomial $P(k)S(k) - Q(k)R(k)$ is invariant under the change $k \mapsto \varepsilon/(kq)$ and therefore it can be factorized as
$$P(k)S(k) - Q(k)R(k) = V(k)\,V(\varepsilon/(kq)), \tag{5.8}$$
with
$$V(k) \equiv x_4k^2 - x_3k + x_2 - \frac{\varepsilon x_1}{k} + \frac{\varepsilon^2x_0}{k^2}.$$
One can easily see that $x_0 = 0$ only if $u = 0$ or $(q-1)v = u$. This case can be treated separately with the same technique that is used below for $x_0 \neq 0$. Denote by $p_i$ ($1 \le i \le 4$) the four roots of $k^2V(k)$ and put
$$w(k) = k^{\rho}\prod_{i=1}^{4}{}_1\phi_0\Big(r_i; -; q, \frac{k}{p_i}\Big)\,y(k). \tag{5.9}$$
From the assumption $x_0 \neq 0$ it follows that $p_i \neq 0$. Then, using (5.6), one sees easily that by picking $r_i = p_i/k_i$, with $k_i$ ($1 \le i \le 4$) the roots of $k^2S(k)$, and choosing $\rho$ such that $q^{\rho} = \lim_{k\to0}V(k)/S(k)$, after changing $k$ to $k/q$, Eq. (5.4) simplifies to
$$\frac{V(k)}{(s(k/q)-s(kq))(s(kq)-s(k))}\,y(kq) + \frac{U(k)}{(s(k)-s(k/q))(s(kq)-s(k))}\,y(k) + \frac{V(\varepsilon/k)}{(s(k)-s(k/q))(s(k/q)-s(kq))}\,y(k/q) = 0. \tag{5.10}$$
Notice that (5.10) can be written as
$$W(k)y(kq) + T(k)y(k) + W(\varepsilon/k)y(k/q) = 0, \tag{5.11}$$
with $W(k)$ the coefficient in front of $y(kq)$ and $T(k)$ the coefficient in front of $y(k)$ in (5.10). We now establish the following crucial

Lemma 3. There are eight ways to perform the factorization (5.8) so that
$$W(k) + T(k) + W(\varepsilon/k) = x_5, \tag{5.12}$$
for some constant $x_5$ (independent of $k$). As a consequence (5.11) can be written as
$$W(k)\big(y(kq) - y(k)\big) + W(\varepsilon/k)\big(y(k/q) - y(k)\big) + x_5\,y(k) = 0. \tag{5.13}$$
The expert reader will immediately notice the striking resemblance between this equation and the second order q-difference equation which is satisfied by the Askey-Wilson polynomials, see for example [19]. Indeed, in Sect. 7, we will show that solutions of this equation can be given in terms of some basic hypergeometric series, which in general are not polynomials.
Proof of Lemma 3. Clearing the denominators in (5.12), one obtains a Laurent polynomial running from $k^{-3}$ to $k^3$. Since $U(k)$ in (5.3) is invariant under the change $k \mapsto \varepsilon/k$, one sees easily that this Laurent polynomial changes sign under the substitution $k \mapsto \varepsilon/k$, and therefore (5.12) amounts to three independent equations, which can be solved for $x_4$, $x_3$, $x_2$:
$$\begin{aligned}
x_4 &= -\frac{\gamma^2(q-1)^2(q+1)}{q}\,x_5 - qx_0 - \gamma^2q(q+1)\big((q-1)v - 2u\big),\\
x_3 &= -q\big(x_1 + \gamma(q+1)\beta\big),\\
x_2 &= \frac{\varepsilon\gamma^2(q-1)^2(q+1)}{q^2}\,x_5 - \varepsilon(q-1)x_0 - \varepsilon\gamma^2q(q+1)\big((q-1)v - 2u\big) + q\alpha.
\end{aligned} \tag{5.14}$$
Since (5.8) is a Laurent polynomial running from $k^{-4}$ to $k^4$, which is invariant under the substitution $k \mapsto \varepsilon/(kq)$, this relation amounts to a system of five independent equations in the five unknowns $x_0, x_1, x_2, x_3, x_4$. Substituting (5.14) into these equations, one discovers that two of these equations, corresponding to the coefficients of $k^{-3}$ and $k^{-1}$, become proportional, and thus we only have four independent equations. One can solve the equation given by the coefficient of $k^{-4}$ for $x_5$:
$$x_5 = -\frac{q^2\big(x_0 - \gamma^2(q+1)u\big)\big(x_0 + \gamma^2(q+1)((q-1)v - u)\big)}{\gamma^2(q-1)^2(q+1)\,x_0}. \tag{5.15}$$
Substituting (5.15) into the three remaining independent equations (corresponding to $k^{-2}$, $k^{-1}$ and $k^0$), one obtains that the equations given by the coefficients of $k^{-2}$ and $k^0$ become proportional, and thus we have two independent equations for $x_0$ and $x_1$. The equation given by the coefficient of $k^{-1}$ can be solved for $x_1$:
$$x_1 = -\frac{\gamma(q+1)x_0\big[q\beta\big(x_0 + \gamma^2((q-1)v - 2u)\big) - \gamma^2b_1(q-1)^2(u+v)(qv-u)\big]}{qx_0^2 + \gamma^4u(q+1)^2\big((q-1)v - u\big)}, \tag{5.16}$$
and, by substituting (5.16) into the coefficient of $k^0$, one gets that $x_0$ must be a root of the following degree 8 polynomial:
$$\begin{aligned}
&a_1\gamma^2(q-1)^2(q+1)^2v\big(v + (q+1)u\big)x_0^2\,p_3(x_0)^2
+ \alpha q\,x_0\,p_1(-x_0)p_2(qx_0)p_3(x_0)^2\\
&\quad- \beta b_1\gamma^2(q-1)^2q(qv-u)\,x_0^2\,p_1(-x_0)p_2(qx_0)p_4(x_0)
+ \beta^2\gamma^2q^2x_0^2\,p_1(-x_0)p_1(-qx_0)p_2(x_0)p_2(qx_0)\\
&\quad- b_1^2\gamma^2(q-1)^2q(qv-u)^2x_0^2\,p_1(x_0)p_1(-x_0)p_2(qx_0)p_2(-qx_0)
+ \varepsilon\,p_1(-x_0)p_2(x_0)p_1(-qx_0)p_2(qx_0)p_3(x_0)^2 = 0,
\end{aligned} \tag{5.17}$$
with
$$\begin{aligned}
p_1(x_0) &= x_0 + \gamma^2(q+1)u,\\
p_2(x_0) &= x_0 + \gamma^2(q+1)\big((q-1)v - u\big),\\
p_3(x_0) &= qx_0^2 + \gamma^4(q+1)^2u\big((q-1)v - u\big),\\
p_4(x_0) &= qx_0^2 - 2\gamma^2q(q+1)v\,x_0 + \gamma^4(q+1)^2u\big(u - (q-1)v\big).
\end{aligned}$$
This completes the proof of the lemma.

The results of this section allow us to express the solution of the q-Riccati equation (3.4) in terms of an arbitrary solution of the linear second order q-difference equation (5.13), as explained below. From (5.9), using property (5.6), we get that
$$\frac{w(kq)}{w(k)} = q^{\rho}\prod_{i=1}^{4}\frac{1 - k/p_i}{1 - k/k_i}\;\frac{y(kq)}{y(k)},$$
or equivalently, using the definition of $\rho$, $p_i$ and $k_i$,
$$\frac{w(kq)}{w(k)} = \frac{V(k)}{S(k)}\,\frac{y(kq)}{y(k)}.$$
If we substitute this last expression into (5.1) we obtain that $f_1(k)$ is given by the Rodrigues type formula
$$f_1(k) = \frac{1}{R(k)}\Big(V(k)\frac{y(kq)}{y(k)} - S(k)\Big), \tag{5.18}$$
with $y(k)$ an arbitrary solution of (5.13). Of course both $V(k)$, defined by (5.8) with the extra requirement that (5.12) should hold, and Eq. (5.13) defining $y(k)$ depend on one of the eight choices for $x_0$ in (5.17). However, any choice for $x_0$ will lead to the same family of solutions for the q-Riccati equation (3.4).

6. The Gauss-Askey-Wilson Equation & Proof of Theorem 1

In view of Eq. (5.13), to which we have reduced the solution of the q-Riccati equation satisfied by $f_1(k)$, it is natural to try to describe the solution of our problem in terms of the parameters $x_i$, $0 \le i \le 5$, defining $V(k)$ in (5.8) in such a way that (5.12) holds. This is easy to do with $0 \le i \le 4$ but $x_5$ requires a different treatment. Put
$$y_i = \frac{x_i}{x_0}, \qquad 1 \le i \le 4,$$
and define
$$z = -\frac{\gamma^2(q+1)\big((q-1)v - u\big)}{x_0}.$$
Observe from (5.15) that $z = 1$ implies that $x_5 = 0$. A better motivation for this choice will be given in Sect. 8, where we shall see that $z = 1$ precisely pins down the case where the functions $f_n(k)$, $n = 0, 1, 2, \ldots$, are polynomials in the spectral parameter $s(k)$. Using (5.14), (5.15), (5.16) and (5.17) one sees easily that the set of parameters $(x_0, y_1, y_2, y_3, y_4, z)$ is equivalent to the set of parameters $(u, v, a_1, b_1, \alpha, \beta)$ and thus it is also equivalent to $(u, v, a_0, a_1, b_1, r)$. Explicitly:
$$\begin{aligned}
u &= \frac{x_0y_4}{\gamma^2q(q+1)z}, \qquad v = \frac{x_0(y_4 - qz^2)}{\gamma^2q(q^2-1)z},\\[4pt]
b_1 &= \frac{\gamma z\big[q(y_3 + qy_1)z^2 - (q+1)(y_1y_4 + qy_3)z + (y_3 + qy_1)y_4\big]}{(z^2 - y_4)(q^2z^2 - y_4)},\\[4pt]
r &= \frac{\gamma(q-1)z\big[(y_3 + qy_1)\big(qz^4 + 2(q^2+q+1)y_4z^2 + qy_4^2\big) - (q+1)^2(y_1y_4 + qy_3)z(z^2 + y_4)\big]}{(z^2 - y_4)(z^2 - q^2y_4)(q^2z^2 - y_4)},\\[4pt]
a_0 &= \frac{\gamma^2(z-1)(q^2z - y_4)\big[(q^2z^2 - y_4)^2\big(\varepsilon(q^2z^2 + y_4) - qy_2z\big) + q^2z^2(qy_1z - y_3)(qy_3z - y_1y_4)\big]}{(qz^2 - y_4)(q^2z^2 - y_4)^2(q^3z^2 - y_4)},\\[4pt]
a_1 &= \frac{\gamma^2(z-q)(qz - y_4)\big[(z^2 - y_4)^2\big(\varepsilon(z^2 + y_4) - y_2z\big) + z^2(y_1z - y_3)(y_3z - y_1y_4)\big]}{(z^2 - qy_4)(z^2 - y_4)^2(qz^2 - y_4)}.
\end{aligned} \tag{6.1}$$
Introduce now $a, b, c, d$ by means of the relation
$$\frac{k^2V(k)}{x_0} = y_4k^4 - y_3k^3 + y_2k^2 - \varepsilon y_1k + \varepsilon^2 = (1-ka)(1-kb)(\varepsilon-kc)(\varepsilon-kd), \tag{6.2}$$
and substitute (6.1) into (2.9) and (2.10) to obtain:
$$a_n = A_{n-1}C_n, \qquad b_n = \gamma\Big(\varepsilon a + \frac{1}{a}\Big) - (A_{n-1} + C_{n-1}) \tag{6.3}$$
with
$$A_n = \frac{\gamma(q^{-n}z - \varepsilon ab)(q^{-n}z - ac)(q^{-n}z - ad)(q^{1-n}z - abcd)}{a\,(q^{1-2n}z^2 - abcd)(q^{-2n}z^2 - abcd)},$$
$$C_n = \frac{\gamma a\,(q^{-n}z - 1)(q^{1-n}z - bc)(q^{1-n}z - bd)(\varepsilon q^{1-n}z - cd)}{(q^{2-2n}z^2 - abcd)(q^{1-2n}z^2 - abcd)}.$$
In terms of these new parameters, one finds from (5.15) that
$$\frac{x_5}{x_0} = \frac{q(1-z)(abcd - qz)}{\gamma^2(q-1)^2(q+1)z}, \tag{6.4}$$
and Eq. (5.13) takes on the form
$$A(k)\big(y(qk) - y(k)\big) + A(\varepsilon/k)\big(y(k/q) - y(k)\big) = (1-z)\Big(\frac{abcd}{qz} - 1\Big)y(k) \tag{6.5}$$
with
$$A(k) = \frac{(1-ka)(1-kb)(\varepsilon-kc)(\varepsilon-kd)}{(\varepsilon - k^2)(\varepsilon - k^2q)}, \tag{6.6}$$
and its consequence
$$A(\varepsilon/k) = \frac{(k - \varepsilon a)(k - \varepsilon b)(k - c)(k - d)}{(k^2 - \varepsilon)(k^2 - \varepsilon q)}.$$
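The change of parameters can be cross-checked with exact arithmetic. The sketch below assumes the reading $u = x_0y_4/(\gamma^2q(q+1)z)$, $v = x_0(y_4 - qz^2)/(\gamma^2q(q^2-1)z)$ of (6.1) and the formulas (5.14), (5.15), (6.4) as stated in the text; the sample values and variable names are arbitrary. It also checks the closed form of the denominator $(s(k/q) - s(kq))(s(kq) - s(k))$ that turns $W(k)$ in (5.10) into a multiple of $A(k)$.

```python
from fractions import Fraction as F

# Arbitrary rational sample values (illustrative only).
gamma, eps, q = F(1, 2), 1, F(3, 5)
x0, y4, z = F(7, 3), F(5, 11), F(2, 9)

# u and v as read from (6.1).
u = x0 * y4 / (gamma**2 * q * (q + 1) * z)
v = x0 * (y4 - q * z**2) / (gamma**2 * q * (q**2 - 1) * z)

# Definition of z in terms of u, v, x0.
z_back = -gamma**2 * (q + 1) * ((q - 1) * v - u) / x0

# x5 from (5.15) and from (6.4), using abcd = y4.
x5_515 = -(q**2 * (x0 - gamma**2 * (q + 1) * u)
           * (x0 + gamma**2 * (q + 1) * ((q - 1) * v - u))
           ) / (gamma**2 * (q - 1)**2 * (q + 1) * x0)
x5_64 = x0 * q * (1 - z) * (y4 - q * z) / (gamma**2 * (q - 1)**2 * (q + 1) * z)

# x4 from the first line of (5.14); it should equal x0 * y4.
x4 = (-gamma**2 * (q - 1)**2 * (q + 1) / q * x5_515 - q * x0
      - gamma**2 * q * (q + 1) * ((q - 1) * v - 2 * u))


def s(k):
    """Spectral variable s(k) = gamma * (k + eps / k)."""
    return gamma * (k + F(eps) / k)


k = F(4, 3)
denominator = (s(k / q) - s(k * q)) * (s(k * q) - s(k))
closed_form = (-gamma**2 * (q - 1)**2 * (q + 1)
               * (k**2 - eps) * (k**2 * q - eps) / (q**2 * k**2))
```

With these identities, $W(k) = -x_0q^2A(k)/(\gamma^2(q-1)^2(q+1))$, and dividing (5.13) by this prefactor gives (6.5).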
Replacing $k$ by $kq$, we can rewrite Eq. (6.5) in terms of $D_s^+$ and $(D_s^+)^2$, with $D_s^+$ as in (3.13). Put then
$$(1-z)\Big(\frac{abcd}{qz} - 1\Big) = (q-1)^2t, \qquad y_4 = (q-1)y_4' + 1, \qquad y_3 = (q-1)y_3' + y_1,$$
with $y_i$ as in (6.2). The limit $q = 1$ of (6.5) becomes
$$\big(s(k)^2 - \gamma y_1s(k) + \gamma^2(y_2 - 2\varepsilon)\big)\frac{d^2}{ds^2}y(k) + \big(y_4's(k) - \gamma y_3'\big)\frac{d}{ds}y(k) - t\,y(k) = 0. \tag{6.7}$$
Notice that in the limit $y(k)$ can be considered as a function of $s(k)$. By translation and scaling in $s$, (6.7) can be brought to the standard form of the Gauss hypergeometric equation. Since for the special choice $z = q^{-n}$, Eq. (6.5) is nothing but the celebrated equation satisfied by the Askey-Wilson polynomials, we propose to call this equation the Gauss-Askey-Wilson equation.

Since (6.5) has the form
$$c_2(k)y(q^2k) + c_1(k)y(qk) + c_0(k)y(k) = 0$$
for appropriate polynomials $c_i$, $i = 0, 1, 2$, it is known, see [6], that the existence of solutions of the form
$$y(k) = k^r\sum_{i\ge0}x_ik^i \tag{6.8}$$
is controlled by the "indicial equation"
$$c_2(0)q^{2r} + c_1(0)q^r + c_0(0) = 0.$$
In our case this equation becomes, for any nonzero $\varepsilon$ (which we take to be 1),
$$(q^r - z)\Big(q^{r+1} - \frac{abcd}{z}\Big) = 0.$$
For generic values of $abcd$ and $z$ the two values of $r$ given by this equation will not differ by an integer and the general theory guarantees the existence of two convergent power series of the form (6.8). For $|q| > 1$ one can see that the two series converge for $|k| < \min\big(|q|, \frac{|q|}{\max(a,b,c,d)}\big)$. If $0 < |q| < 1$ one obtains two convergent power series for $|k| < \min\big(|q|^{-1}, |q|^{-1}\min(a,b,c,d)\big)$.

For $\varepsilon = 0$ we get as "indicial equation" the expression
$$(q^r - 1)(q^r - q) = 0.$$
This gives rise to a pair of solutions of the form
$$y(k) = 1 + \sum_{i\ge1}x_ik^i, \qquad y(k) = k\sum_{i\ge0}x_ik^i,$$
which converge when $|q| > 1$ as long as $|k| < \frac{|q|}{\max(a,b)}$ and when $0 < |q| < 1$ as long as $|k| < |q|^{-1}\min(c,d)$.

These "power series" solutions do not make explicit, even when $z = q^{-n}$, the role of the basic hypergeometric functions. In the next section we will see that for any $z$, the solutions of the Gauss-Askey-Wilson equation can be written in terms of basic hypergeometric series in a way that extends the representation of the well known Askey-Wilson polynomials. We can now summarize the results of this section and the previous ones and give the

Proof of Theorem 1. Formula (1.4) giving the explicit form of the coefficients entering the three term recursion relation satisfied by the functions $\{f_n(k)\}_{n\in\mathbb{Z}}$ has just been established in (6.3). We now explain how to obtain formula (1.6), which determines the function $f_1(k)$ in terms of an arbitrary solution of the Gauss-Askey-Wilson equation (1.5), by combining the results of this section with those of Sects. 3 and 5. The linearization of the q-Riccati equation (3.4) satisfied by $f_1(k)$ led us to the Rodrigues type formula (5.18), with $y(k)$ an arbitrary solution of Eq. (5.13) which, when expressed in terms of the new parameters $a, b, c, d$ and $z$ introduced at the beginning of this section, becomes the Gauss-Askey-Wilson equation (6.5). Substituting into $V(k)$ as defined in (5.8) the expressions (5.14) and (5.16) for $x_1, x_2, x_3$ and $x_4$, with $x_5$ given by (5.15) and $x_0$ replaced by
$$x_0 = -\frac{\gamma^2(q+1)\big((q-1)v - u\big)}{z},$$
one obtains by a straightforward computation using the definitions of $S(k)$ and $R(k)$ in (3.5) that
$$\frac{1}{R(k)}\big(V(k) - S(k)\big) = s - b_1 + (z-1)\Big(\delta + \frac{\gamma u}{(q-1)v}\,k - \frac{\varepsilon\gamma\big((q-1)v - u\big)}{(q-1)vz}\,\frac{1}{k}\Big), \tag{6.9}$$
with
$$\delta = \frac{(u+v)(uz - q^2v + qv + qu)}{(q+1)v\,(uz^2 + q^2v - qv - qu)}\,b_1 + \frac{q(uz + qv - v - u)}{(q^2-1)v\,(uz^2 + q^2v - qv - qu)}\,\beta.$$
Replace now $S(k)/R(k)$ in (5.18) by the expression obtained from (6.9), express $V(k)$ in terms of $A(k)$ from (6.2) and (6.6) and use the definition (3.5) of $R(k)$ to get
$$f_1(k) = s - b_1 + \frac{x_0}{\gamma(q^2-1)v}\,A(k)\Big(k - \frac{\varepsilon}{k}\Big)\Big(\frac{y(qk)}{y(k)} - 1\Big) + (z-1)\Big(\delta + \frac{\gamma u}{(q-1)v}\,k - \frac{\varepsilon\gamma\big((q-1)v - u\big)}{(q-1)vz}\,\frac{1}{k}\Big).$$
From formulas (2.12) and (6.1), we can express $\delta$, $x_0/v$, $u/v$ in terms of $y_i$ ($1 \le i \le 4$) and $z$, which leads to formula (1.6). In Sect. 4, Theorem 2, we have established the existence of the bispectral operator $B$ and the fact that different choices of the function $f_1(k)$ result in operators that are conjugate to each other. Finally, substituting the expressions for $u$ and $v$ in (6.1) into (2.6) gives (up to an inessential scaling factor) formula (1.7) for the eigenvalues $\theta_n$. This completes the proof of Theorem 1.
7. Solving the Gauss-Askey-Wilson Equation in Terms of Basic Hypergeometric Series

The purpose of this section is to obtain solutions of the equation
$$A(k)\big(y(qk) - y(k)\big) + A(\varepsilon/k)\big(y(k/q) - y(k)\big) = (1-z)\Big(\frac{abcd}{qz} - 1\Big)y(k) \tag{7.1}$$
with
$$A(k) = \frac{(1-ka)(1-kb)(\varepsilon-kc)(\varepsilon-kd)}{(\varepsilon - k^2)(\varepsilon - k^2q)},$$
and its consequence
$$A(\varepsilon/k) = \frac{(k - \varepsilon a)(k - \varepsilon b)(k - c)(k - d)}{(k^2 - \varepsilon)(k^2 - \varepsilon q)}$$
in terms of basic hypergeometric series. Introduce the standard q-shifted factorials
$$(a;q)_n = \begin{cases}1 & \text{for } n = 0,\\ (a;q)_{n-1}(1 - aq^{n-1}) & \text{for } n \ge 1,\end{cases}$$
and define
$$(a;q)_\infty = \prod_{k=0}^{\infty}(1 - aq^k),$$
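for $|q| < 1$. Whenever $(a;q)_\infty$ appears in a formula, we shall assume that $|q| < 1$. When products of q-shifted factorials occur, we shall use the more compact notations
$$(a_1, a_2, \ldots, a_m; q)_n = (a_1;q)_n(a_2;q)_n\cdots(a_m;q)_n, \qquad (a_1, a_2, \ldots, a_m; q)_\infty = (a_1;q)_\infty(a_2;q)_\infty\cdots(a_m;q)_\infty.$$
The next theorem can be extracted from [5]. For the convenience of the reader we shall present our "own proof" of this result, most of which was developed before we became aware of [5]. We shall follow [11] for the definition and the notation of the basic hypergeometric series.

Theorem 3. Let $r = 1$, $q/\varepsilon ab$, $q/ac$ or $q/ad$. Then the function
$$y(k) = \frac{(ak, a\varepsilon/k; q)_\infty}{(ark, ar\varepsilon/k; q)_\infty}\;{}_4\phi_3\!\left(\begin{matrix}\tilde z,\ \dfrac{\tilde a\tilde b\tilde c\tilde d}{q\tilde z},\ \tilde ak,\ \dfrac{\tilde a\varepsilon}{k}\\[6pt] \tilde a\tilde b\varepsilon,\ \tilde a\tilde c,\ \tilde a\tilde d\end{matrix};\ q, q\right) \tag{7.2}$$
solves the inhomogeneous equation
$$A(k)\big(y(kq) - y(k)\big) + A(\varepsilon/k)\big(y(k/q) - y(k)\big) - (1-z)\Big(\frac{abcd}{qz} - 1\Big)y(k) = \frac{1}{r}\,\frac{\big(rz,\ r\frac{abcd}{qz},\ ak,\ \frac{a\varepsilon}{k};\ q\big)_\infty}{(rab\varepsilon, rac, rad, rq; q)_\infty}, \tag{7.3}$$
where
$$\tilde a = ra, \qquad \tilde z = rz, \tag{7.4}$$
and, depending on the choice of $r$ above, one must pick
$$\begin{aligned}
&\tilde b = b,\ \tilde c = c,\ \tilde d = d, \quad\text{or}\quad \tilde b = q/\varepsilon a,\ \tilde c = c,\ \tilde d = d,\\
&\text{or}\quad \tilde b = b,\ \tilde c = q/a,\ \tilde d = d, \quad\text{or}\quad \tilde b = b,\ \tilde c = c,\ \tilde d = q/a.
\end{aligned} \tag{7.5}$$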
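The case $r = 1$ of Theorem 3 can be probed numerically. The following Python sketch takes $\varepsilon = 1$, sums the ${}_4\phi_3$ series of (7.2) by building its terms recursively, and compares the two sides of (7.3). The parameter values, truncation orders and function names are illustrative assumptions. The second sample uses $z = q^{-2}$, for which the series terminates and the right-hand side of (7.3) vanishes, so that $y$ solves the homogeneous equation (7.1).

```python
def q_product(a, q, n):
    """(a; q)_n as a float product; a large n approximates (a; q)_infinity."""
    result = 1.0
    for j in range(n):
        result *= 1.0 - a * q**j
    return result


def y_series(k, a, b, c, d, z, q, terms=150):
    """Partial sum of the 4phi3 series in (7.2) for r = 1 and eps = 1."""
    m = a * b * c * d / (q * z)
    total, x_n, z_n = 0.0, 1.0, 1.0
    for n in range(terms):
        total += x_n * z_n
        # z_{n+1} = z_n (1 - a k q^n)(1 - a q^n / k)
        z_n *= (1.0 - a * k * q**n) * (1.0 - a * q**n / k)
        # Ratio of consecutive 4phi3 coefficients with argument q.
        x_n *= (q * (1.0 - z * q**n) * (1.0 - m * q**n)
                / ((1.0 - a * b * q**n) * (1.0 - a * c * q**n)
                   * (1.0 - a * d * q**n) * (1.0 - q**(n + 1))))
    return total


def A(k, a, b, c, d, q):
    """Coefficient A(k) of (7.1) with eps = 1."""
    return ((1 - k * a) * (1 - k * b) * (1 - k * c) * (1 - k * d)
            / ((1 - k * k) * (1 - k * k * q)))


def residual(k, a, b, c, d, z, q, big=200):
    """Left-hand side minus right-hand side of (7.3) for r = 1, eps = 1."""
    m = a * b * c * d / (q * z)
    y0 = y_series(k, a, b, c, d, z, q)
    lhs = (A(k, a, b, c, d, q) * (y_series(k * q, a, b, c, d, z, q) - y0)
           + A(1 / k, a, b, c, d, q) * (y_series(k / q, a, b, c, d, z, q) - y0)
           - (1 - z) * (m - 1) * y0)
    numerator = 1.0
    for arg in (z, m, a * k, a / k):
        numerator *= q_product(arg, q, big)
    denominator = 1.0
    for arg in (a * b, a * c, a * d, q):
        denominator *= q_product(arg, q, big)
    return lhs - numerator / denominator


generic = residual(1.3, 0.3, 0.4, 0.2, 0.6, 0.7, 0.5)
terminating = residual(1.3, 0.3, 0.4, 0.2, 0.6, 4.0, 0.5)   # z = q**-2
```

Both residuals vanish up to rounding for these sample values; the other admissible choices of $r$ are not covered by this sketch.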
Remark. Clearly if we pick $r = q/\varepsilon ab$ above, we must assume that $\varepsilon \neq 0$. The other choices of $r$ are valid for any $\varepsilon$.

Proof. First we observe that it suffices to establish the case $r = 1$. Indeed, assuming the result for $r = 1$, denoting by $L(a,b,c,d,z)y(k)$ the left-hand side of (7.3) and putting
$$y(k) = \frac{(ak, a\varepsilon/k; q)_\infty}{(ark, ar\varepsilon/k; q)_\infty}\,\tilde y(k),$$
one finds from Cauchy's formulation of the binomial theorem (5.6) and (5.7) that
$$L(a,b,c,d,z)y(k) = \frac{1}{r}\,\frac{(ak, a\varepsilon/k; q)_\infty}{(ark, ar\varepsilon/k; q)_\infty}\,L(\tilde a, \tilde b, \tilde c, \tilde d, \tilde z)\tilde y(k) = \frac{1}{r}\,\frac{\big(\tilde z,\ \frac{\tilde a\tilde b\tilde c\tilde d}{q\tilde z},\ ak,\ \frac{a\varepsilon}{k};\ q\big)_\infty}{(\tilde a\tilde b\varepsilon, \tilde a\tilde c, \tilde a\tilde d, q; q)_\infty},$$
which coincides with (7.3), using the definitions (7.4) and (7.5) of $\tilde a, \tilde b, \tilde c, \tilde d$ and $\tilde z$ corresponding to the possible choices of $r$ distinct from 1. It remains to establish the case $r = 1$ of (7.3). Define
$$z_n \equiv (ak;q)_n\Big(\frac{\varepsilon a}{k};q\Big)_n.$$
We first collect some useful properties, which are easily proved by induction:

1. $(1-a)(aq;q)_{n-1} = (a;q)_n$,
2. $z_n(kq) - z_n(k) = \dfrac{1-q^n}{qk}\,(ak^2q - \varepsilon a)\,(akq;q)_{n-1}\Big(\dfrac{\varepsilon a}{k};q\Big)_{n-1}$,
3. $z_n(k) - z_n(k/q) = \dfrac{1-q^n}{qk}\,(ak^2 - a\varepsilon q)\,(ak;q)_{n-1}\Big(\dfrac{\varepsilon aq}{k};q\Big)_{n-1}$,
4. $(1-bk)(\varepsilon-ck)(\varepsilon-dk)\,(ak;q)_n\Big(\dfrac{a\varepsilon}{k};q\Big)_{n-1} - (k-\varepsilon b)(k-c)(k-d)\,k\,(ak;q)_{n-1}\Big(\dfrac{a\varepsilon}{k};q\Big)_n = k(k^2-\varepsilon)\,(ak;q)_{n-1}\Big(\dfrac{a\varepsilon}{k};q\Big)_{n-1}\Big[\gamma_n\Big(k + \dfrac{\varepsilon}{k}\Big) + \delta_n\Big]$,

with $\gamma_n = abcdq^{n-1} - 1$ and $\delta_n = aq^{n-1}\big(\varepsilon(1 - bc - bd) - cd\big) + b\varepsilon + c + d - bcd$.

We now look for a solution of (7.3) in the form
$$y(k) = \sum_{n\ge0}x_nz_n.$$
It will turn out that the unknown coefficients $x_n$ satisfy a first order recursion relation which makes $y(k)$ into a basic hypergeometric series. Put
$$y_p(k) = \sum_{n=0}^{p}x_nz_n.$$
Using 2) and then 1) obtain
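$$\begin{aligned}
A(k)\big(y_p(kq) - y_p(k)\big) &= A(k)\,\frac{ak^2q - \varepsilon a}{qk}\sum_{n=0}^{p}x_n(1-q^n)\,(akq;q)_{n-1}\Big(\frac{a\varepsilon}{k};q\Big)_{n-1}\\
&= \frac{a(1-kb)(\varepsilon-kc)(\varepsilon-kd)}{(k^2-\varepsilon)\,qk}\sum_{n=0}^{p}x_n(1-q^n)\,(ak;q)_n\Big(\frac{a\varepsilon}{k};q\Big)_{n-1}.
\end{aligned}$$
Similarly, using 3) and then 1) obtain
$$\begin{aligned}
A(\varepsilon/k)\big(y_p(k/q) - y_p(k)\big) &= -A(\varepsilon/k)\,\frac{ak^2 - a\varepsilon q}{qk}\sum_{n=0}^{p}x_n(1-q^n)\,(ak;q)_{n-1}\Big(\frac{\varepsilon aq}{k};q\Big)_{n-1}\\
&= -a\,\frac{(k-\varepsilon b)(k-c)(k-d)\,k}{(k^2-\varepsilon)\,qk}\sum_{n=0}^{p}x_n(1-q^n)\,(ak;q)_{n-1}\Big(\frac{\varepsilon a}{k};q\Big)_n.
\end{aligned}$$
Adding these two expressions and using 4) we get
$$\begin{aligned}
A(k)\big(y_p(qk) - y_p(k)\big) + A(\varepsilon/k)\big(y_p(k/q) - y_p(k)\big)
&= \frac{a}{(k^2-\varepsilon)\,qk}\sum_{n=0}^{p}x_n(1-q^n)\,k(k^2-\varepsilon)\,(ak;q)_{n-1}\Big(\frac{a\varepsilon}{k};q\Big)_{n-1}\Big(\gamma_n\Big(k + \frac{\varepsilon}{k}\Big) + \delta_n\Big)\\
&= \frac{a}{q}\sum_{n=0}^{p}x_n(1-q^n)\,z_{n-1}\Big(\gamma_n\Big(k + \frac{\varepsilon}{k}\Big) + \delta_n\Big).
\end{aligned}$$
If we make use of the property

5. $\Big(k + \dfrac{\varepsilon}{k}\Big)z_n = \dfrac{1}{aq^n}\big((1 + a^2\varepsilon q^{2n})z_n - z_{n+1}\big)$,

we obtain that the left-hand side of (7.3) with $y(k)$ replaced by $y_p(k)$ is equal to
$$\sum_{n=0}^{p-1}\Big[q\,x_n(1 - zq^n)\Big(1 - \frac{abcd}{qz}q^n\Big) - (1 - ab\varepsilon q^n)(1 - acq^n)(1 - adq^n)(1 - q\,q^n)\,x_{n+1}\Big]\frac{z_n}{q^{n+1}} - x_p(1 - q^p)\gamma_p\,\frac{z_p}{q^p} - (1-z)\Big(\frac{abcd}{qz} - 1\Big)x_pz_p. \tag{7.6}$$
If we determine $x_n$ inductively by
$$x_{n+1} = \frac{(1 - zq^n)\big(1 - \frac{abcd}{qz}q^n\big)\,q}{(1 - acq^n)(1 - adq^n)(1 - ab\varepsilon q^n)(1 - q\,q^n)}\,x_n,$$
with $x_0 = 1$, the first $p$ terms in the expression above vanish and we find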
$$x_p = \frac{\big(z,\ \frac{abcd}{qz};\ q\big)_p\,q^p}{(ab\varepsilon, ac, ad, q; q)_p}.$$
Taking the limit $p \to \infty$, by definition of the basic hypergeometric series, $y_p(k)$ tends to the expression given in (7.2) with $r = 1$ and, since $\lim_{p\to\infty}x_pz_p = 0$, the expression (7.6) reduces to
$$\frac{\big(z,\ \frac{abcd}{qz},\ ak,\ \frac{a\varepsilon}{k};\ q\big)_\infty}{(ab\varepsilon, ac, ad, q; q)_\infty},$$
as desired, which concludes the proof of the theorem.
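The elementary properties 1 to 5 used in the proof can be spot-checked exactly. The sketch below evaluates the difference between the two sides of each property with rational arithmetic, for $\varepsilon = 0$ and $\varepsilon = 1$ and a few values of $n$; the sample parameters and the helper names are arbitrary illustrative choices.

```python
from fractions import Fraction as F


def qpoch(x, q, n):
    """Finite q-shifted factorial (x; q)_n, with (x; q)_0 = 1."""
    out = F(1)
    for j in range(n):
        out *= 1 - x * q**j
    return out


def z_term(n, k, a, q, e):
    """z_n = (ak; q)_n (e*a/k; q)_n."""
    return qpoch(a * k, q, n) * qpoch(e * a / k, q, n)


def property_gaps(n, k, a, b, c, d, q, eps):
    """LHS - RHS of properties 1-5 for one n >= 1; each entry should be 0."""
    e = F(eps)
    gamma_n = a * b * c * d * q**(n - 1) - 1
    delta_n = (a * q**(n - 1) * (e * (1 - b * c - b * d) - c * d)
               + b * e + c + d - b * c * d)
    pref = (1 - q**n) / (q * k)
    p1 = (1 - a) * qpoch(a * q, q, n - 1) - qpoch(a, q, n)
    p2 = (z_term(n, k * q, a, q, e) - z_term(n, k, a, q, e)
          - pref * (a * k * k * q - e * a)
          * qpoch(a * k * q, q, n - 1) * qpoch(e * a / k, q, n - 1))
    p3 = (z_term(n, k, a, q, e) - z_term(n, k / q, a, q, e)
          - pref * (a * k * k - a * e * q)
          * qpoch(a * k, q, n - 1) * qpoch(e * a * q / k, q, n - 1))
    p4 = ((1 - b * k) * (e - c * k) * (e - d * k)
          * qpoch(a * k, q, n) * qpoch(a * e / k, q, n - 1)
          - (k - e * b) * (k - c) * (k - d) * k
          * qpoch(a * k, q, n - 1) * qpoch(a * e / k, q, n)
          - k * (k * k - e) * qpoch(a * k, q, n - 1) * qpoch(a * e / k, q, n - 1)
          * (gamma_n * (k + e / k) + delta_n))
    p5 = ((k + e / k) * z_term(n, k, a, q, e)
          - ((1 + a * a * e * q**(2 * n)) * z_term(n, k, a, q, e)
             - z_term(n + 1, k, a, q, e)) / (a * q**n))
    return [p1, p2, p3, p4, p5]


all_gaps = [
    property_gaps(n, F(5, 3), F(2, 7), F(1, 3), F(3, 5), F(4, 9), F(2, 3), eps)
    for eps in (0, 1)
    for n in range(1, 5)
]
```

Every entry of `all_gaps` is exactly zero for these samples, which is a consistency check of the formulas rather than a proof.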
It is clear that by taking appropriate linear combinations of the functions given in (7.2), for different choices of $r$, we shall obtain a basis of solutions of the Gauss-Askey-Wilson equation (7.1) expressed in terms of basic hypergeometric series. An interesting consequence of these explicit formulas is

Theorem 4. If in Theorem 1 one builds $f_1(k)$ from (1.6) using any linear combination of the functions (7.2) giving rise to a solution of the Gauss-Askey-Wilson equation (1.5), $f_1(k)$ and the resulting family $f_n(k)$, $n \in \mathbb{Z}$, are functions of the variable $s(k) = \gamma(k + \varepsilon/k)$.

Proof. Clearly it suffices to consider the case $\varepsilon = 1$. Observe that any function that is meromorphic at $k = 0$ and is invariant under the change $k \mapsto 1/k$ is a function of $k + 1/k$. It is clear that for any solution of (1.5) which is obtained by taking a linear combination of the functions given in (7.2), the expression for $f_1(k)$ given in (1.6) is meromorphic at $k = 0$, and thus it is enough to show that it is invariant under the change $k \mapsto 1/k$. Since, when $\varepsilon = 1$, a solution $y(k)$ of (1.5) which is a linear combination of the functions (7.2) is invariant under the change $k \mapsto 1/k$, it satisfies also the equation
$$A(k)\Big(\frac{y(qk)}{y(k)} - 1\Big) + A(k^{-1})\Big(\frac{y(qk^{-1})}{y(k^{-1})} - 1\Big) = (1-z)\Big(\frac{y_4}{qz} - 1\Big).$$
From this identity it follows that checking the invariance of the right-hand side of (1.6) under the change $k \mapsto 1/k$ amounts to checking the identity
$$\gamma\Big(k - \frac{1}{k}\Big)\Big[(z-1)(y_4 - qz) - qz(1-z)\Big(1 - \frac{y_4}{qz}\Big)\Big] = 0,$$
which is trivially satisfied. Thus $f_1(k)$ and therefore all the other $f_n(k)$'s are functions of $s(k)$, proving the theorem.

8. The Case of the Classical Orthogonal Polynomials

In Sect. 6 we have shown that the solution of our bispectral problem can be nicely parametrized by five parameters $a, b, c, d$ and $z$ which are equivalent to our original parametrization of the solution in (2.9) and (2.10) in terms of $u/v$ (or $v/u$), $a_0$, $a_1$, $b_1$ and $r$ (see formulas (6.1), (6.2) and (6.3)). In this section we show that when $z = 1$, we get back the classical orthogonal polynomials in the sense of Andrews and Askey [3], so that $z$ is the extra parameter which moves us away from the polynomial case. In fact we get a little bit more: the classical orthogonal polynomials are part of a (nontrivial) biinfinite
sequence of functions $\{f_n(s(k))\}_{n\in\mathbb{Z}}$ which are eigenfunctions of the celebrated Askey-Wilson second order q-difference operator; among those only the $f_n$ with $n \ge 0$ are polynomials in $s(k)$. To see this pick $y(k)$ to be a solution of the Gauss-Askey-Wilson equation (7.1) given by
$$\begin{aligned}
y(k) ={}& {}_4\phi_3\!\left(\begin{matrix}z,\ \dfrac{abcd}{qz},\ ak,\ \dfrac{a\varepsilon}{k}\\[6pt] ab\varepsilon,\ ac,\ ad\end{matrix};\ q, q\right)\\
&- r\,\frac{\big(z,\ \frac{abcd}{qz},\ rab\varepsilon,\ rac,\ rad,\ rq,\ ak,\ a\varepsilon/k;\ q\big)_\infty}{\big(rz,\ \frac{rabcd}{qz},\ ab\varepsilon,\ ac,\ ad,\ q,\ ark,\ ar\varepsilon/k;\ q\big)_\infty}\;{}_4\phi_3\!\left(\begin{matrix}\tilde z,\ \dfrac{\tilde a\tilde b\tilde c\tilde d}{q\tilde z},\ \tilde ak,\ \dfrac{\tilde a\varepsilon}{k}\\[6pt] \tilde a\tilde b\varepsilon,\ \tilde a\tilde c,\ \tilde a\tilde d\end{matrix};\ q, q\right),
\end{aligned} \tag{8.1}$$
corresponding to the appropriate combination of the functions in (7.2) with $r = 1$ and any other admissible choice of $r$. Clearly we can write
$$\frac{y(kq)}{y(k)} - 1 = (z-1)\,g(k), \tag{8.2}$$
and from (1.6) we obtain that
$$\lim_{z\to1}f_1(k) = s(k) - b_1,$$
and therefore all the $f_n(k)$, $n \ge 2$, become (monic) polynomials of degree $n$ in the variable $s(k)$. Also, since $a_0$ is divisible by $z - 1$ (see (6.1)), formulas (1.6) and (8.2) show that the limit
$$\lim_{z\to1}f_{-1}(k) = \lim_{z\to1}\frac{s(k) - b_1 - f_1(k)}{a_0}$$
exists. Since $\lim_{z\to1}g(k)$ is not a polynomial, $f_{-1}(k)$ and consequently all the $f_n(k)$, $n \le -2$, are not polynomials. Substituting $s(k) - b_1$ for $f_1(k)$ into (4.10) and (4.11) and using (6.1) one computes that when $z = 1$ the bispectral equation (4.9) becomes:
$$A(k)\big(f_n(kq) - f_n(k)\big) + A(\varepsilon/k)\big(f_n(k/q) - f_n(k)\big) = (1 - q^{-n})(abcdq^{n-1} - 1)\,f_n(k), \tag{8.3}$$
with $A(k)$ as in (6.6), which is nothing but the celebrated Askey-Wilson equation. To summarize, the special solutions of the q-Riccati equation corresponding to the solutions (8.1) of the Gauss-Askey-Wilson equation when $z \to 1$ lead to a biinfinite sequence $\{f_n(k)\}_{n\in\mathbb{Z}}$, with $f_n(k)$ polynomials in the variable $s(k)$ for $n \ge 0$. In this case the $f_n(k)$ themselves are eigenfunctions of the Askey-Wilson second order q-difference operator. When $\gamma = \frac{1}{2}$ and $\varepsilon = 1$, one sees from (6.3) that the three term recursion relation satisfied by the polynomials $f_n(k)$ ($n \ge 0$) coincides with the standard relation satisfied by the (monic) Askey-Wilson polynomials, see for example [19]. When $\gamma = 1$ and $\varepsilon = 0$, the change of parameters
$$\tilde a = q^{-1}ac, \qquad \tilde b = q^{-1}bd, \qquad \tilde c = c, \qquad \tilde d = -d,$$
brings (6.3) to one of the standard forms of the three term recursion relation satisfied by the (monic) big q-Jacobi polynomials as given in [20] or [19, p. 58].

9. The Case v = 0 in the q-Riccati Equation

As we remarked at the beginning of Sect. 5, the q-Riccati equation (3.4) becomes linear when $v = 0$. We devote this section to the study of some explicit solutions that can be obtained in this case. In the simpler case of $q = 1$ this is all reduced to a question of integrating a first order linear inhomogeneous differential equation, a rather "trivial" task for the cases at hand, see Sect. 5.1 in [15]. For a general $q$ this last step is not necessarily trivial given our present knowledge about explicit evaluation of q-integrals. Equation (3.4) takes the form
$$f_1(kq) = A(k)f_1(k) + B(k)$$
with $A(k) = P(k)/S(k)$ and $B(k) = Q(k)/S(k)$. This equation can be handled in a way that is similar to the simpler case $q = 1$. Let $g(k)$ be a particular solution of the equation
$$g(kq) = A(k)g(k),$$
and notice that $f_1(k) = C(k)g(k)$ solves the original equation if $C(k)$ is chosen to satisfy the first order q-difference equation
$$C(qk) - C(k) = \frac{B(k)}{A(k)g(k)},$$
whose solution is determined up to the addition of an arbitrary q-periodic function. As an illustration we do the explicit integration in the case $\varepsilon = 0$ with the further condition $a_0 + a_1 = 0$, which is the simplest case considered in [15]. Using (2.16) this condition becomes $\alpha = -b_1(2b_1u + \beta)$, and $A(k)$ and $B(k)$ are given by the expressions
$$A(k) = \frac{kq - b_1}{k - b_1}, \qquad B(k) = -\frac{a_1k(q-1)(q+1)^2u}{(k - b_1)\,q\,(kqu + ku + 2b_1u + \beta)}.$$
This results in
$$g(k) = k - b_1.$$
We get that the q-difference ratio $(C(qk) - C(k))/(k(q-1))$ is given by
$$-\frac{a_1(q+1)^2u}{(k - b_1)\,q\,(kq - b_1)(kqu + ku + 2b_1u + \beta)},$$
which can conveniently be expressed as
$$\frac{w_3}{(k - b_1)(kq - b_1)} + \frac{w_2}{k - b_1} + \frac{w_1}{kqu + ku + 2b_1u + \beta}$$
with
$$\begin{aligned}
w_1 &= -\frac{a_1(q+1)^4u^3}{q\,(b_1qu + 3b_1u + \beta)(3b_1qu + b_1u + \beta q)},\\
w_2 &= \frac{a_1(q+1)^3u^2}{q\,(b_1qu + 3b_1u + \beta)(3b_1qu + b_1u + \beta q)},\\
w_3 &= -\frac{a_1(q+1)^2u}{3b_1qu + b_1u + \beta q}.
\end{aligned}$$
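The partial fraction decomposition above, and the fact that $g(k) = k - b_1$ solves $g(kq) = A(k)g(k)$, can be confirmed with exact rational arithmetic. In the sketch below the numerical values of $a_1$, $b_1$, $u$, $\beta$, $q$ and the sample points are arbitrary illustrative choices, and the function names are not from the paper.

```python
from fractions import Fraction as F

# Arbitrary rational sample values (illustrative only).
a1, b1, u, beta, q = F(2, 3), F(5, 7), F(3, 4), F(-1, 5), F(3, 2)

D = b1 * q * u + 3 * b1 * u + beta
E = 3 * b1 * q * u + b1 * u + beta * q

w1 = -a1 * (q + 1)**4 * u**3 / (q * D * E)
w2 = a1 * (q + 1)**3 * u**2 / (q * D * E)
w3 = -a1 * (q + 1)**2 * u / E


def last_factor(k):
    """Third linear factor k q u + k u + 2 b1 u + beta of the denominator."""
    return k * q * u + k * u + 2 * b1 * u + beta


def difference_ratio(k):
    """(C(qk) - C(k)) / (k (q - 1)) as given in the text."""
    return -a1 * (q + 1)**2 * u / ((k - b1) * q * (k * q - b1) * last_factor(k))


def decomposition(k):
    """Partial fraction form with numerators w3, w2, w1."""
    return (w3 / ((k - b1) * (k * q - b1))
            + w2 / (k - b1)
            + w1 / last_factor(k))


def A(k):
    return (k * q - b1) / (k - b1)


def g(k):
    return k - b1


samples = [F(2), F(3), F(1, 3), F(-5, 2)]
gaps = [difference_ratio(k) - decomposition(k) for k in samples]
g_gaps = [g(k * q) - A(k) * g(k) for k in samples]
```

All entries of `gaps` and `g_gaps` are exactly zero at the sample points, which avoid the three poles of the ratio.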
From here we obtain for $C(k)$ the expression
$$C(k) = \tilde w_3/(k - b_1) + \tilde w_2\log_q k/b_1 + \tilde w_1\log_q k/c_1$$
with $\tilde w_3, \tilde w_2, \tilde w_1$ simply related to $w_3, w_2, w_1$ and $c_1$ the root (in $k$) of the denominator in the last summand above. We are using the notation $\log_q z = \sum_{n=1}^{\infty}z^n/(1 - q^n)$ from [21]. Finally we can use this $C(k)$, as observed above, to get
$$f_1(k) = (k - b_1)C(k).$$

Acknowledgement. We thank R. Askey, M. Ismail, D. Masson and S. Suslov for help with the statement of Theorem 3.
References
1. Adler, M., van Moerbeke, P.: Matrix integrals, Toda symmetries, Virasoro constraints and orthogonal polynomials. Duke Math. J. 80, 3, 863–911 (1995)
2. Adler, M., Shiota, T., van Moerbeke, P.: Random matrices, vertex operators and the Virasoro algebra. Phys. Lett. A 208, 101–112 (1995)
3. Andrews, G.E., Askey, R.: Classical orthogonal polynomials. In: C. Brezinski et al., editors, Polynômes Orthogonaux et Applications, Lecture Notes in Math. 1171, New York: Springer, pp. 36–62 (1985)
4. Askey, R., Wilson, J.: Some basic hypergeometric orthogonal polynomials that generalize Jacobi polynomials. Mem. Am. Math. Soc. 319, (1985)
5. Atakishiyev, N.M., Suslov, S.K.: Difference hypergeometric functions. In: Progress in Approximation Theory, A.A. Gonchar and E.B. Saff, eds., Berlin–Heidelberg–New York: Springer-Verlag, pp. 1–35 (1992)
6. Batchelder, P.M.: An Introduction to Linear Difference Equations. Cambridge, Mass.: Harvard University Press (1927)
7. Bochner, S.: Über Sturm-Liouvillesche Polynomsysteme. Math. Z. 29, 730–736 (1929)
8. Chari, V., Pressley, A.: A Guide to Quantum Groups. Cambridge: Cambridge University Press (1994)
9. Duistermaat, J.J., Grünbaum, F.A.: Differential equations in the spectral parameter. Commun. Math. Phys. 103, 177–240 (1986)
10. Freund, P.G.O., Zabrodin, A.V.: The spectral problem for the q-Knizhnik–Zamolodchikov equation and continuous q-Jacobi polynomials. Commun. Math. Phys. 173, 17–42 (1995)
11. Gasper, G., Rahman, M.: Basic hypergeometric series, Encyclopedia of Mathematics and Its Applications 35, Cambridge: Cambridge University Press (1990)
12. Gorsky, A.S., Zabrodin, A.V.: Degenerations of Sklyanin algebra and Askey–Wilson polynomials. J. Physics A: Math & Gen. 26, no. 15, L635–L639 (1993)
13. Grünbaum, F.A., Haine, L.: Orthogonal polynomials satisfying differential equations: The role of the Darboux transformation. In: D. Levi, L. Vinet and P. Winternitz, editors, Symmetries and Integrability of Difference Equations, CRM Proceedings & Lecture Notes 9, Providence: American Mathematical Society, 143–154 (1996)
14. Grünbaum, F.A., Haine, L.: The q-version of a theorem of Bochner. J. Comp. & Appl. Math. 68, 103–114 (1996)
15. Grünbaum, F.A., Haine, L.: A theorem of Bochner, revisited. In: A. S. Fokas and I. M. Gel'fand, eds., Algebraic Aspects of Integrable Systems, in memory of I. Dorfman, Progress in Nonlinear Differential Equations, Vol. 26, Boston: Birkhäuser, pp. 143–172 (1996)
16. Grünbaum, F.A., Haine, L.: On a q-analogue of Gauss equation and some q-Riccati equations. Proceedings of the Workshop on q-Special Functions, Fields Institute, 1995, eds: M. Ismail, D. Masson, M. Rahman, to appear
17. Hahn, W.: Beiträge zur Theorie der Heineschen Reihen, Die 24 Integrale der hypergeometrischen q-Differenzengleichung, das q-Analogon der Laplace-Transformation. Math. Nachr. 2, 340–379 (1949)
18. Heine, E.: Untersuchungen über die Reihe $1 + \frac{(q^{\alpha}-1)(q^{\beta}-1)}{(q-1)(q^{\gamma}-1)}\,x + \frac{(q^{\alpha}-1)(q^{\alpha+1}-1)(q^{\beta}-1)(q^{\beta+1}-1)}{(q-1)(q^{2}-1)(q^{\gamma}-1)(q^{\gamma+1}-1)}\,x^{2} + \ldots$. J. reine Angew. Math. 34, 285–328 (1847)
19. Koekoek, R., Swarttouw, R. F.: The Askey-scheme of hypergeometric orthogonal polynomials and its q-analogue. Report 94-05, Technical University Delft, Faculty of Technical Mathematics & Informatics, 1994
20. Koornwinder, T. H.: Compact quantum groups and q-special functions. In: V. Baldoni and M. A. Picardello, editors, Representations of Lie Groups and Quantum Groups, Pitman Research Notes in Mathematics Series 311, London: Longman Scientific & Technical, 1994, pp. 46–128
21. Koornwinder, T. H.: Special functions and q-commuting variables. To appear in Special Functions, q-Series and Related Topics, The Fields Institute Communications Series
22. Koornwinder, T. H.: Askey-Wilson polynomials as zonal spherical functions on the SU(2) quantum group. SIAM J. of Math. Anal. 24, no. 3, 795–813 (1993)
23. Nijhoff, F. W.: On a q-deformation of the discrete Painlevé I equation and q-orthogonal polynomials. Letters in Mathematical Physics 30, 327–336 (1994)
24. Spiridonov, V. and Zhedanov, A.: Discrete reflectionless potentials, quantum algebras, and q-orthogonal polynomials. Annals of Phys. 237, 126–146 (1995)
25. Spiridonov, V., Vinet, L. and Zhedanov, A.: Periodic reduction of the factorization chain and the Hahn polynomials. Jour. of Physics A: Math & General 27, no. 18, L669–L675 (1994)
26. Tracy, C. and Widom, H.: Fredholm determinants, differential equations and matrix models. Commun. Math. Phys. 163, 33–72 (1994)
Communicated by T. Miwa
Commun. Math Phys. 184, 203 – 232 (1997)
Communications in Mathematical Physics © Springer-Verlag 1997
The Andrews–Gordon Identities and q-Multinomial Coefficients
S. Ole Warnaar
Department of Mathematics, University of Melbourne, Parkville, Victoria 3052, Australia. E-mail:
[email protected] Received: 22 January 1996 / Accepted: 4 September 1996
Abstract: We prove polynomial boson-fermion identities for the generating function of the number of partitions of $n$ of the form $n = \sum_{j=1}^{L-1} j f_j$, with $f_1 \le i-1$, $f_{L-1} \le i'-1$ and $f_j + f_{j+1} \le k$. The bosonic side of the identities involves q-deformations of the coefficients of $x^a$ in the expansion of $(1 + x + \cdots + x^k)^L$. A combinatorial interpretation for these q-multinomial coefficients is given using Durfee dissection partitions. The fermionic side of the polynomial identities arises as the partition function of a one-dimensional lattice-gas of fermionic particles. In the limit $L \to \infty$, our identities reproduce the analytic form of Gordon's generalization of the Rogers–Ramanujan identities, as found by Andrews. Using the $q \to 1/q$ duality, identities are obtained for branching functions corresponding to cosets of type $(A_1^{(1)})_k \times (A_1^{(1)})_\ell / (A_1^{(1)})_{k+\ell}$ of fractional level $\ell$.

1. Introduction

The Rogers–Ramanujan identities can be stated as the following q-series identities.

Theorem 1 (Rogers–Ramanujan). For $a = 0, 1$ and $|q| < 1$,
\[ \sum_{n=0}^{\infty} \frac{q^{n(n+a)}}{(1-q)(1-q^2)\cdots(1-q^n)} = \prod_{j=0}^{\infty} (1-q^{5j+1+a})^{-1}(1-q^{5j+4-a})^{-1}. \tag{1.1} \]
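As a quick sanity check of (1.1), both sides can be expanded as power series in $q$ and compared coefficient by coefficient. The following is a minimal Python sketch and not part of the original paper; the function names and the truncation order `N` are assumptions chosen for illustration.

```python
# Compare both sides of (1.1) as power series in q, truncated at order N.
N = 30  # truncation order (arbitrary choice for this sketch)

def mul(a, b):
    """Product of two truncated series given as coefficient lists of length N + 1."""
    out = [0] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j in range(N + 1 - i):
                out[i + j] += ai * b[j]
    return out

def inv_one_minus_q_pow(m):
    """Series of 1 / (1 - q^m) = 1 + q^m + q^(2m) + ..."""
    return [1 if e % m == 0 else 0 for e in range(N + 1)]

def sum_side(a):
    """Left-hand side of (1.1): sum over n of q^(n(n+a)) / ((1-q)...(1-q^n))."""
    total = [0] * (N + 1)
    denom_inv = [1] + [0] * N              # 1 / ((1-q)...(1-q^n)) for n = 0
    n = 0
    while n * (n + a) <= N:
        shift = n * (n + a)
        for e in range(N + 1 - shift):
            total[e + shift] += denom_inv[e]
        n += 1
        denom_inv = mul(denom_inv, inv_one_minus_q_pow(n))
    return total

def product_side(a):
    """Right-hand side of (1.1): only parts congruent to 1+a or 4-a mod 5 contribute."""
    prod = [1] + [0] * N
    for m in range(1, N + 1):
        if m % 5 in ((1 + a) % 5, (4 - a) % 5):
            prod = mul(prod, inv_one_minus_q_pow(m))
    return prod
```

Agreement of `sum_side(a)` and `product_side(a)` up to order `N` is of course only numerical evidence for the identity, not a proof.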
Since their independent discovery by Rogers [1–3], Ramanujan [4] and also Schur [5], many beautiful generalizations have been found, mostly arising from partition-theoretic or Lie-algebraic considerations, see refs. [6, 7] and references therein. Most surprisingly, in 1981 Baxter rediscovered the Rogers–Ramanujan identities (1.1) in his calculation of the order parameters of the hard-hexagon model [8], a lattice gas of hard-core particles of interest in statistical mechanics. It took however another ten years to fully realize the power of the (solvable) lattice model approach to finding q-series identities. In particular, based on a numerical study of the eigenspectrum of the critical three-state Potts model [9, 10] (yet another lattice model in statistical mechanics), the Stony Brook group found an amazing variety of new q-series identities of Rogers–Ramanujan type [11, 12]. Almost none of these identities had been encountered previously in the context of either partition theory or the theory of infinite dimensional Lie algebras.

More specifically, in the work of refs. [11, 12] expressions for Virasoro characters were given through systems of fermionic quasi-particles. Equating these fermionic character forms with the well-known Rocha-Caridi type bosonic expressions [13] led to many q-series identities for Virasoro characters, generalizing the Rogers–Ramanujan identities (which are associated to the $M(2, 5)$ minimal model).

The proof of the Rogers–Ramanujan identities by means of an extension to polynomial identities whose degree is determined by a fixed integer $L$ was initiated by Schur [5]. Before we elaborate on this approach, we need the combinatorial version of the Rogers–Ramanujan identities stating that

Theorem 2 (Rogers–Ramanujan). For $a = 0, 1$, the partitions of $n$ into parts congruent to $1 + a$ or $4 - a$ (mod 5) are equinumerous with the partitions of $n$ in which the difference between any two parts is at least 2 and 1 occurs at most $1 - a$ times.

Denoting the number of occurrences of the part $j$ in a partition by $f_j$, the second type of partitions in the above theorem are those partitions of $n = \sum_{j \ge 1} j f_j$ which satisfy the following frequency conditions:
\[ f_j + f_{j+1} \le 1 \quad \forall j \qquad \text{and} \qquad f_1 \le 1 - a. \tag{1.2} \]
Schur notes that imposing the additional condition $f_j = 0$ for $j \ge L + 1$, the generating function of the “frequency partitions” satisfies the recurrence
\[ g_L = g_{L-1} + q^L g_{L-2}. \tag{1.3} \]
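The recurrence (1.3) can be checked against a direct enumeration of the frequency partitions. The sketch below is illustrative only: the function names are hypothetical, and the initial conditions $g_0 = 1$ and $g_1 = 1 + q$ (for $a = 0$) or $g_1 = 1$ (for $a = 1$) are an assumption read off from (1.2), since the text does not spell them out.

```python
from itertools import combinations

def g_recurrence(L, a):
    """g_L from (1.3); polynomials in q are dicts {exponent: coefficient}."""
    g_prev2 = {0: 1}                                  # assumed g_0: empty partition only
    g_prev1 = {0: 1, 1: 1} if a == 0 else {0: 1}      # assumed g_1: part 1 allowed iff a = 0
    if L == 0:
        return g_prev2
    for n in range(2, L + 1):
        g_new = dict(g_prev1)                         # g_{n-1}
        for e, c in g_prev2.items():                  # + q^n g_{n-2}
            g_new[e + n] = g_new.get(e + n, 0) + c
        g_prev2, g_prev1 = g_prev1, g_new
    return g_prev1

def g_bruteforce(L, a):
    """Enumerate partitions with parts <= L obeying (1.2): distinct parts,
    no two consecutive integers, and the part 1 excluded when a = 1."""
    out = {}
    parts = range(1 + a, L + 1)
    for size in range(len(parts) + 1):
        for subset in combinations(parts, size):
            if all(y - x >= 2 for x, y in zip(subset, subset[1:])):
                out[sum(subset)] = out.get(sum(subset), 0) + 1
    return out
```

For example, `g_recurrence(4, 0)` encodes $1 + q + q^2 + q^3 + 2q^4 + q^5 + q^6$, matching the eight admissible subsets of $\{1, 2, 3, 4\}$.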
Together with the appropriate initial conditions, Schur was able to solve these recurrences, to obtain an alternating-sign type solution, now called a bosonic expression. Taking $L \to \infty$ in these bosonic polynomials yields (after use of Jacobi's triple product identity) the right-hand side of (1.1). Since this indeed corresponds to the generating function of the “(mod 5)” partitions, this proves Theorem 2. Much later, Andrews [14] obtained a solution to the recurrence relation as a finite q-series with manifestly positive integer coefficients, now called a fermionic expression. Taking $L \to \infty$ in these fermionic polynomials yields the left-hand side of (1.1).

Recently much progress has been made in proving the boson-fermion identities of [11, 12] (and generalizations thereof), by following the Andrews–Schur approach. That is, for many of the Virasoro-character identities, finitizations to polynomial boson-fermion identities have been found, which could then be proven either fully recursively (à la Andrews) or one side combinatorially and one side recursively (à la Schur), see refs. [15–30].

In this paper we consider polynomial identities which imply the Andrews–Gordon generalization of the Rogers–Ramanujan identities. First, Gordon's Theorem [31], which provides a combinatorial generalization of the Rogers–Ramanujan identities, reads

Theorem 3 (Gordon). For all $k \ge 1$, $1 \le i \le k + 1$, let $A_{k,i}(n)$ be the number of partitions of $n$ into parts not congruent to 0 or $\pm i$ (mod $2k + 3$) and let $B_{k,i}(n)$ be the number of partitions of $n$ of the form $n = \sum_{j \ge 1} j f_j$, with $f_1 \le i - 1$ and $f_j + f_{j+1} \le k$ (for all $j$). Then $A_{k,i}(n) = B_{k,i}(n)$.
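Gordon's theorem lends itself to a direct numerical check for small $n$. The following is a minimal sketch under the stated definitions of $A_{k,i}(n)$ and $B_{k,i}(n)$; the function names `count_A` and `count_B` and the dynamic-programming layout are assumptions made for this illustration, not constructions from the paper.

```python
def count_A(k, i, N):
    """A_{k,i}(n) for n = 0..N: partitions into parts not congruent to 0, +i, -i mod 2k+3."""
    mod = 2 * k + 3
    counts = [1] + [0] * N
    for part in range(1, N + 1):
        if part % mod in (0, i % mod, (-i) % mod):
            continue
        for n in range(part, N + 1):          # allow the part any number of times
            counts[n] += counts[n - part]
    return counts

def count_B(k, i, N):
    """B_{k,i}(n) for n = 0..N: frequencies with f_1 <= i-1 and f_j + f_{j+1} <= k."""
    # state[(f, n)] = number of choices of f_1..f_j with f_j = f and sum of j*f_j equal to n
    state = {(f, f): 1 for f in range(0, i) if f <= N}
    for j in range(2, N + 1):
        new_state = {}
        for (f_prev, n), ways in state.items():
            for f in range(0, k - f_prev + 1):
                total = n + j * f
                if total > N:
                    break
                key = (f, total)
                new_state[key] = new_state.get(key, 0) + ways
        state = new_state
    counts = [0] * (N + 1)
    for (_, n), ways in state.items():
        counts[n] += ways
    return counts
```

For $k = 1$, $i = 2$ both functions reproduce the Rogers–Ramanujan counts $1, 1, 1, 1, 2, 2, 3, \ldots$ of Theorem 2 with $a = 0$.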
Subsequently the following analytic counterpart of this result was obtained by Andrews [32], generalizing the analytic form (1.1) of the Rogers–Ramanujan identities.

Theorem 4 (Andrews). For all $k \ge 1$, $1 \le i \le k + 1$ and $|q| < 1$,
\[ \sum_{n_1, n_2, \ldots, n_k \ge 0} \frac{q^{N_1^2 + \cdots + N_k^2 + N_i + \cdots + N_k}}{(q)_{n_1} (q)_{n_2} \cdots (q)_{n_k}} = \prod_{\substack{j=1 \\ j \not\equiv 0, \pm i \ (\mathrm{mod}\ 2k+3)}}^{\infty} (1 - q^j)^{-1} \tag{1.4} \]
with
\[ N_j = n_j + \cdots + n_k \tag{1.5} \]
and $(q)_a = \prod_{k=1}^{a} (1 - q^k)$ for $a > 0$ and $(q)_0 = 1$.
Application of Jacobi's triple product identity allows the right-hand side of (1.4) to be rewritten as
\[ \frac{1}{(q)_\infty} \sum_{j=-\infty}^{\infty} (-)^j q^{j\left((2k+3)(j+1) - 2i\right)/2}. \tag{1.6} \]
Equating (1.6) and the left-hand side of (1.4) gives an example of a boson-fermion identity.

Here we consider, in the spirit of Schur, a “natural” finitization of Gordon's frequency condition such that this boson-fermion identity is a limiting case of polynomial identities. In particular, we are interested in the quantity $B_{k,i,i';L}(n)$, counting the number of partitions of $n$ of the form
\[ n = \sum_{j=1}^{L-1} j f_j \tag{1.7} \]
with frequency conditions
\[ f_1 \le i - 1, \qquad f_{L-1} \le i' - 1 \qquad \text{and} \qquad f_j + f_{j+1} \le k \quad \text{for } j = 1, \ldots, L - 2. \tag{1.8} \]
If we denote the generating function of partitions counted by $B_{k,i,i';L}(n)$ by $G_{k,i,i';L}(q)$, then clearly $\lim_{L \to \infty} G_{k,i,i';L}(q) = G_{k,i}(q)$, with $G_{k,i}$ the generating function associated with $B_{k,i}(n)$ of Theorem 3. Also note that $G_{k,i,1;L} = G_{k,i,k+1;L-1}$. Our main results can be formulated as the following two theorems for $G_{k,i,i';L}$. Let $\left[{L \atop a}\right]$ be the Gaussian polynomial or q-binomial coefficient defined by
\[ \left[{L \atop a}\right] = \left[{L \atop a}\right]_q = \begin{cases} \dfrac{(q)_L}{(q)_a (q)_{L-a}} & 0 \le a \le L \\[2mm] 0 & \text{otherwise.} \end{cases} \tag{1.9} \]
Further, let $I_k$ be the incidence matrix of the Dynkin diagram of $A_k$ with an additional tadpole at the $k$th node:
\[ (I_k)_{j,\ell} = \delta_{j,\ell-1} + \delta_{j,\ell+1} + \delta_{j,\ell}\,\delta_{j,k}, \qquad j, \ell = 1, \ldots, k, \tag{1.10} \]
and let $C_k$ be the corresponding Cartan-type matrix, $(C_k)_{j,\ell} = 2\delta_{j,\ell} - (I_k)_{j,\ell}$. Finally let $\vec n$, $\vec m$ and $\vec e_j$ be $k$-dimensional (column) vectors with entries $\vec n_j = n_j$, $\vec m_j = m_j$ and $(\vec e_j)_\ell = \delta_{j,\ell}$. Then
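Theorem 5. For all $k \ge 1$, $1 \le i, i' \le k + 1$ and $kL \ge 2k - i - i' + 2$,
\[ G_{k,i,i';L}(q) = \sum_{n_1, n_2, \ldots, n_k \ge 0} q^{\,\vec n^T C_k^{-1} (\vec n + \vec e_k - \vec e_{i-1})} \prod_{j=1}^{k} \left[{n_j + m_j \atop n_j}\right], \tag{1.11} \]
with $(m, n)$-system [16] given by
\[ \vec m + \vec n = \tfrac{1}{2} \left( I_k \vec m + (L-2)\, \vec e_k + \vec e_{i-1} + \vec e_{i'-1} \right). \tag{1.12} \]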
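Theorem 5 can be tested numerically for small parameters by solving the $(m,n)$-system (1.12) for $\vec m$ and comparing the fermionic sum (1.11) with a brute-force enumeration of the partitions defined by (1.7)–(1.8). The sketch below is illustrative only. It assumes $(C_k^{-1})_{j,\ell} = \min(j,\ell)$, reads $\vec e_0$ as the zero vector, and builds the q-binomials from the standard q-Pascal rule; all function names are hypothetical.

```python
from functools import lru_cache
from itertools import product

def add(a, b, sign=1, shift=0):
    """a + sign * q^shift * b for polynomials stored as {exponent: coefficient}."""
    out = dict(a)
    for e, c in b.items():
        out[e + shift] = out.get(e + shift, 0) + sign * c
    return {e: c for e, c in out.items() if c}

def mul(a, b):
    out = {}
    for ea, ca in a.items():
        for eb, cb in b.items():
            out[ea + eb] = out.get(ea + eb, 0) + ca * cb
    return {e: c for e, c in out.items() if c}

@lru_cache(maxsize=None)
def qbinom(L, a):
    """Gaussian polynomial (1.9) via the q-Pascal rule [L;a] = [L-1;a] + q^(L-a) [L-1;a-1]."""
    if a < 0 or a > L:
        return {}
    if a == 0 or a == L:
        return {0: 1}
    return add(qbinom(L - 1, a), qbinom(L - 1, a - 1), shift=L - a)

def G_fermionic(k, i, ip, L):
    """Right-hand side of (1.11); ip stands for i'."""
    total = {}
    for n in product(range(L + 1), repeat=k):            # n = (n_1, ..., n_k)
        # (1.12) solved for m: m = C^{-1} (-2 n + (L-2) e_k + e_{i-1} + e_{i'-1})
        m = [(L - 2) * j + min(j, i - 1) + min(j, ip - 1)
             - 2 * sum(min(j, l) * n[l - 1] for l in range(1, k + 1))
             for j in range(1, k + 1)]
        if min(m) < 0:
            continue                                     # a q-binomial vanishes
        expo = sum(min(j, l) * n[j - 1] * n[l - 1]
                   for j in range(1, k + 1) for l in range(1, k + 1))
        expo += sum((j - min(j, i - 1)) * n[j - 1] for j in range(1, k + 1))
        term = {expo: 1}
        for nj, mj in zip(n, m):
            term = mul(term, qbinom(nj + mj, nj))
        total = add(total, term)
    return total

def G_bruteforce(k, i, ip, L):
    """Enumerate f_1, ..., f_{L-1} subject to (1.8); intended for L >= 2."""
    out = {}
    def rec(j, prev, weight):
        if j == L:
            out[weight] = out.get(weight, 0) + 1
            return
        hi = k - prev
        if j == 1:
            hi = min(hi, i - 1)
        if j == L - 1:
            hi = min(hi, ip - 1)
        for f in range(hi + 1):
            rec(j + 1, f, weight + j * f)
    rec(1, 0, 0)
    return out
```

For instance, `G_fermionic(2, 3, 3, 3)` encodes $1 + q + 2q^2 + q^3 + q^4$, the generating function of the six frequency pairs $(f_1, f_2)$ with $f_1 + f_2 \le 2$.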
We note that $(C_k^{-1})_{j,\ell} = \min(j, \ell)$ and hence that, using the variables $N_j$ of (1.5), we can rewrite the quadratic exponent of $q$ in (1.11) as $N_1^2 + \cdots + N_k^2 + N_i + \cdots + N_k$. For $k \ge 2$, the “finitization” (1.11)–(1.12) of the left-hand side of (1.4) is new. For $k = 1$ it is the already mentioned fermionic solution to the recurrence (1.3) as found by Andrews [14]. Another finitization, which does not seem to be related to a finitization of Gordon's frequency conditions, has recently been proposed in refs. [17–19] (see also ref. [21]). A more general expression, which includes (1.11)–(1.12) and that of [17–19] as special cases, will be discussed in Sect. 6.

Our second result, which is maybe of more interest mathematically since it involves new generalizations of the Gaussian polynomials, can be stated as follows. Let $\left[{L \atop a}\right]_k^{(p)}$ be the q-multinomial coefficient defined in Eq. (2.5) of the subsequent section. Also, define
\[ r = k - i' + 1 \tag{1.13} \]
and
\[ s = \begin{cases} i & \text{for } i = 1, 3, \ldots, 2\lfloor \tfrac{k}{2} \rfloor + 1 \\ 2k + 3 - i & \text{for } i = 2, 4, \ldots, 2\lfloor \tfrac{k+1}{2} \rfloor, \end{cases} \tag{1.14} \]
so that $r = 0, 1, \ldots, k$ and $s = 1, 3, \ldots, 2k + 1$. Then

Theorem 6. For all $k > 0$, $1 \le i, i' \le k + 1$ and $kL \ge 2k - i - i' + 2$,
\[ G_{k,i,i';L}(q) = \sum_{j=-\infty}^{\infty} \left\{ q^{\,j\left((2j+1)(2k+3) - 2s\right)} \left[{L \atop \frac{1}{2}(kL + k - s - r + 1) + (2k+3)j}\right]_k^{(r)} - q^{\,(2j+1)\left((2k+3)j + s\right)} \left[{L \atop \frac{1}{2}(kL + k + s - r + 1) + (2k+3)j}\right]_k^{(r)} \right\} \tag{1.15} \]
for $r \equiv k(L + 1)$ (mod 2) and
\[ G_{k,i,i';L}(q) = \sum_{j=-\infty}^{\infty} \left\{ q^{\,j\left((2j+1)(2k+3) - 2s\right)} \left[{L \atop \frac{1}{2}(kL - k + s - r - 2) - (2k+3)j}\right]_k^{(r)} - q^{\,(2j+1)\left((2k+3)j + s\right)} \left[{L \atop \frac{1}{2}(kL - k - s - r - 2) - (2k+3)j}\right]_k^{(r)} \right\} \tag{1.16} \]
for $r \not\equiv k(L + 1)$ (mod 2).
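The bosonic formulas (1.15) and (1.16) can likewise be compared with a brute-force enumeration for small $k$ and $L$. The sketch below is a hedged illustration, not the paper's method: it evaluates the q-multinomials directly from Definition 1 (Eq. (2.5)), stores Laurent polynomials in $q$ as dictionaries because negative powers can occur for $p > 0$, and truncates the sum over $j$ to a window outside which every q-multinomial vanishes. All function names are hypothetical.

```python
from functools import lru_cache
from itertools import product

def add(a, b, sign=1, shift=0):
    """a + sign * q^shift * b for Laurent polynomials stored as {exponent: coefficient}."""
    out = dict(a)
    for e, c in b.items():
        out[e + shift] = out.get(e + shift, 0) + sign * c
    return {e: c for e, c in out.items() if c}

def mul(a, b):
    out = {}
    for ea, ca in a.items():
        for eb, cb in b.items():
            out[ea + eb] = out.get(ea + eb, 0) + ca * cb
    return {e: c for e, c in out.items() if c}

@lru_cache(maxsize=None)
def qbinom(L, a):
    """Gaussian polynomial (1.9) via the q-Pascal rule."""
    if a < 0 or a > L:
        return {}
    if a == 0 or a == L:
        return {0: 1}
    return add(qbinom(L - 1, a), qbinom(L - 1, a - 1), shift=L - a)

@lru_cache(maxsize=None)
def qmult(L, a, k, p):
    """q-multinomial [L; a]_k^(p) of (2.5), summed over j_1 >= j_2 >= ... >= j_k."""
    total = {}
    if L < 0 or p < 0 or a < 0 or a > k * L:
        return total
    for js in product(range(L + 1), repeat=k):
        if sum(js) != a or any(js[t] < js[t + 1] for t in range(k - 1)):
            continue
        expo = sum((L - js[t]) * js[t + 1] for t in range(k - 1)) - sum(js[k - p:])
        term, prev = {expo: 1}, L
        for j in js:
            term, prev = mul(term, qbinom(prev, j)), j
        total = add(total, term)
    return total

def G_bosonic(k, i, ip, L):
    """Right-hand side of (1.15) or (1.16), selected by the parity of r - k(L+1)."""
    r = k - ip + 1
    s = i if i % 2 == 1 else 2 * k + 3 - i
    total = {}
    for j in range(-(L + 2), L + 3):          # all terms vanish outside this window
        if (r - k * (L + 1)) % 2 == 0:        # case (1.15)
            a1 = (k * L + k - s - r + 1) // 2 + (2 * k + 3) * j
            a2 = (k * L + k + s - r + 1) // 2 + (2 * k + 3) * j
        else:                                  # case (1.16)
            a1 = (k * L - k + s - r - 2) // 2 - (2 * k + 3) * j
            a2 = (k * L - k - s - r - 2) // 2 - (2 * k + 3) * j
        e1 = j * ((2 * j + 1) * (2 * k + 3) - 2 * s)
        e2 = (2 * j + 1) * ((2 * k + 3) * j + s)
        total = add(total, qmult(L, a1, k, r), shift=e1)
        total = add(total, qmult(L, a2, k, r), sign=-1, shift=e2)
    return total

def G_bruteforce(k, i, ip, L):
    """Enumerate f_1, ..., f_{L-1} subject to (1.8); intended for L >= 2."""
    out = {}
    def rec(j, prev, weight):
        if j == L:
            out[weight] = out.get(weight, 0) + 1
            return
        hi = k - prev
        if j == 1:
            hi = min(hi, i - 1)
        if j == L - 1:
            hi = min(hi, ip - 1)
        for f in range(hi + 1):
            rec(j + 1, f, weight + j * f)
    rec(1, 0, 0)
    return out
```

The direct summation in `qmult` costs $(L+1)^k$ steps per call, which is adequate for spot checks but not for large $L$; the recurrences of Sect. 2 would be the natural way to scale this up.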
For $k \ge 3$, the finitizations (1.15) and (1.16) of the right-hand side of (1.4) are new. For $k = 1$ (1.15) and (1.16) are Schur's bosonic polynomials. For $k = 2$, $\left[{L \atop a}\right]_2^{(p)}$ being a q-trinomial coefficient, (1.15) and (1.16) were (in a slightly different representation) first obtained in ref. [33]. An altogether different alternating-sign expression for $G_{k,i,i';L}$ in terms of q-binomials has been found in ref. [34]. A different finitization of the right-hand side of (1.4) involving q-binomials has been given in [14, 35]. A more general expression, which includes (1.15), (1.16) and that of ref. [14, 35] as special cases, will be discussed in Sect. 6.

Equating (1.11) and (1.15)–(1.16) leads to non-trivial polynomial identities, which in the limit $L \to \infty$ reduce to Andrews' analytic form of Gordon's identity. For $k = 1$ these are the polynomial identities featuring in the Andrews–Schur proof of the Rogers–Ramanujan identities (1.1) [14].

The remainder of the paper is organized as follows. In the next section we introduce the q-multinomial coefficients and list some q-multinomial identities needed for the proof of Theorem 6. Then, in Sect. 3, a combinatorial interpretation of the q-multinomials is given using Andrews' Durfee dissection partitions. In Sect. 4 we give a recursive proof of Theorem 6 and in Sect. 5 we prove Theorem 5 combinatorially, interpreting the restricted frequency partitions as configurations of a one-dimensional lattice-gas of fermionic particles. We conclude this paper with a discussion of our results, a conjecture generalizing Theorems 5 and 6, and some new identities for the branching functions of cosets of type $(A_1^{(1)})_k \times (A_1^{(1)})_\ell / (A_1^{(1)})_{k+\ell}$ with fractional level $\ell$. Finally, proofs of some of the q-multinomial identities are given in the appendix.

2. q-Multinomial Coefficients

Before introducing the q-multinomial coefficients, we first recall some facts about ordinary multinomials. Following ref. [33], we define $\binom{L}{a}_k$ for $a = 0, \ldots, kL$ as
\[ (1 + x + \cdots + x^k)^L = \sum_{a=0}^{kL} \binom{L}{a}_k x^a. \tag{2.1} \]
Multiple use of the binomial theorem yields
\[ \binom{L}{a}_k = \sum_{j_1 + \cdots + j_k = a} \binom{L}{j_1} \binom{j_1}{j_2} \cdots \binom{j_{k-1}}{j_k}, \tag{2.2} \]
where $\binom{L}{a} = \binom{L}{a}_1$ is the usual binomial coefficient. Some readily established properties of $\binom{L}{a}_k$ are the symmetry relation
\[ \binom{L}{a}_k = \binom{L}{kL - a}_k \tag{2.3} \]
and the recurrence
\[ \binom{L}{a}_k = \sum_{m=0}^{k} \binom{L-1}{a-m}_k. \tag{2.4} \]
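For our subsequent work it will be convenient to define $k + 1$ different q-deformations of the multinomial coefficient (2.2).

Definition 1. For $p = 0, \ldots, k$ we set
\[ \left[{L \atop a}\right]_k^{(p)} = \sum_{j_1 + \cdots + j_k = a} q^{\,\sum_{\ell=1}^{k-1} (L - j_\ell)\, j_{\ell+1} \;-\; \sum_{\ell=k-p}^{k-1} j_{\ell+1}} \left[{L \atop j_1}\right] \left[{j_1 \atop j_2}\right] \cdots \left[{j_{k-1} \atop j_k}\right], \tag{2.5} \]
with $\left[{L \atop a}\right]$ the standard q-binomial coefficients of (1.9).

Note that $\left[{L \atop a}\right]_k^{(p)}$ is unequal to zero for $a = 0, \ldots, kL$ only. Also note the initial condition
\[ \left[{0 \atop a}\right]_k^{(p)} = \delta_{a,0}. \tag{2.6} \]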
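Definition 1 translates directly into code. The following minimal sketch evaluates (2.5) by brute-force summation and checks it against the $q \to 1$ limit (2.1); it is an illustration under stated assumptions rather than an efficient implementation. Laurent polynomials in $q$ are stored as `{exponent: coefficient}` dictionaries, and all function names are hypothetical.

```python
from functools import lru_cache
from itertools import product

def add(a, b, sign=1, shift=0):
    """a + sign * q^shift * b for Laurent polynomials stored as {exponent: coefficient}."""
    out = dict(a)
    for e, c in b.items():
        out[e + shift] = out.get(e + shift, 0) + sign * c
    return {e: c for e, c in out.items() if c}

def mul(a, b):
    out = {}
    for ea, ca in a.items():
        for eb, cb in b.items():
            out[ea + eb] = out.get(ea + eb, 0) + ca * cb
    return {e: c for e, c in out.items() if c}

@lru_cache(maxsize=None)
def qbinom(L, a):
    """Gaussian polynomial (1.9) via the q-Pascal rule."""
    if a < 0 or a > L:
        return {}
    if a == 0 or a == L:
        return {0: 1}
    return add(qbinom(L - 1, a), qbinom(L - 1, a - 1), shift=L - a)

@lru_cache(maxsize=None)
def qmult(L, a, k, p):
    """q-multinomial [L; a]_k^(p) of (2.5); js[t] plays the role of j_{t+1}."""
    total = {}
    if L < 0 or p < 0 or a < 0 or a > k * L:
        return total
    for js in product(range(L + 1), repeat=k):
        if sum(js) != a or any(js[t] < js[t + 1] for t in range(k - 1)):
            continue                      # the q-binomials vanish unless j_1 >= ... >= j_k
        expo = sum((L - js[t]) * js[t + 1] for t in range(k - 1)) - sum(js[k - p:])
        term, prev = {expo: 1}, L
        for j in js:
            term, prev = mul(term, qbinom(prev, j)), j
        total = add(total, term)
    return total

def ordinary_multinomials(L, k):
    """Coefficients of (1 + x + ... + x^k)^L, cf. (2.1)."""
    coeffs = [1]
    for _ in range(L):
        new = [0] * (len(coeffs) + k)
        for e, c in enumerate(coeffs):
            for m in range(k + 1):
                new[e + m] += c
        coeffs = new
    return coeffs
```

As a small worked case, `qmult(2, 2, 2, 0)` gives $1 + q + q^2$ and `qmult(2, 2, 2, 1)` gives $2 + q$; both reduce to the trinomial coefficient $\binom{2}{2}_2 = 3$ at $q = 1$.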
In the following we state a number of q-deformations of (2.3) and (2.4). Although our list is certainly not exhaustive, we have restricted ourselves to those identities which in our view are simplest, and to those needed for proving Theorem 6. Most of these identities are generalizations of known q-binomial and q-trinomial identities which, for example, can be found in refs. [33, 36, 25, 28]. First we put some simple symmetry properties, generalizing (2.3), in a lemma.

Lemma 1. For $p = 0, \ldots, k$ the following symmetries hold:
\[ \left[{L \atop a}\right]_k^{(p)} = q^{(k-p)L - a} \left[{L \atop kL - a}\right]_k^{(k-p)} \qquad \text{and} \qquad \left[{L \atop a}\right]_k^{(0)} = \left[{L \atop kL - a}\right]_k^{(0)}. \tag{2.7} \]
The proof of this lemma is given in the appendix. To our mind the simplest way of q-deforming (2.4) (which was communicated to us by A. Schilling) is

Proposition 1 (Fundamental recurrences; Schilling). For $p = 0, \ldots, k$, the q-multinomials satisfy
\[ \left[{L \atop a}\right]_k^{(p)} = \sum_{m=0}^{k-p} q^{m(L-1)} \left[{L-1 \atop a-m}\right]_k^{(m)} + \sum_{m=k-p+1}^{k} q^{L(k-p) - m} \left[{L-1 \atop a-m}\right]_k^{(m)}. \tag{2.8} \]
In the next section we give a combinatorial proof of this important result for the $p = 0$ case. An analytic proof for general $p$ has been given by Schilling in ref. [29]. We now give some equations, proven in the appendix, which all reduce to the tautology $1 = 1$ in the $q \to 1$ limit.

Proposition 2. For all $p = -1, \ldots, k - 1$, we have
\[ \left[{L \atop a}\right]_k^{(p)} + q^L \left[{L \atop kL - a - p - 1}\right]_k^{(p+1)} = \left[{L \atop kL - a - p - 1}\right]_k^{(p)} + q^L \left[{L \atop a}\right]_k^{(p+1)}, \tag{2.9} \]
with $\left[{L \atop a}\right]_k^{(-1)} = 0$.
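The symmetry (2.7), the fundamental recurrences (2.8) and the tautologies (2.9) can all be spot-checked by machine for small $k$ and $L$. The sketch below does this with q-multinomials computed by brute force from (2.5); it is illustrative only, the function names are hypothetical, and passing these checks is numerical evidence rather than a proof.

```python
from functools import lru_cache
from itertools import product

def add(a, b, sign=1, shift=0):
    """a + sign * q^shift * b for Laurent polynomials stored as {exponent: coefficient}."""
    out = dict(a)
    for e, c in b.items():
        out[e + shift] = out.get(e + shift, 0) + sign * c
    return {e: c for e, c in out.items() if c}

def mul(a, b):
    out = {}
    for ea, ca in a.items():
        for eb, cb in b.items():
            out[ea + eb] = out.get(ea + eb, 0) + ca * cb
    return {e: c for e, c in out.items() if c}

@lru_cache(maxsize=None)
def qbinom(L, a):
    if a < 0 or a > L:
        return {}
    if a == 0 or a == L:
        return {0: 1}
    return add(qbinom(L - 1, a), qbinom(L - 1, a - 1), shift=L - a)

@lru_cache(maxsize=None)
def qmult(L, a, k, p):
    """[L; a]_k^(p) of (2.5); returns the zero polynomial for p = -1, as in (2.9)."""
    total = {}
    if L < 0 or p < 0 or a < 0 or a > k * L:
        return total
    for js in product(range(L + 1), repeat=k):
        if sum(js) != a or any(js[t] < js[t + 1] for t in range(k - 1)):
            continue
        expo = sum((L - js[t]) * js[t + 1] for t in range(k - 1)) - sum(js[k - p:])
        term, prev = {expo: 1}, L
        for j in js:
            term, prev = mul(term, qbinom(prev, j)), j
        total = add(total, term)
    return total

def symmetry_holds(L, a, k, p):
    """First relation of (2.7)."""
    rhs = add({}, qmult(L, k * L - a, k, k - p), shift=(k - p) * L - a)
    return qmult(L, a, k, p) == rhs

def fundamental_rhs(L, a, k, p):
    """Right-hand side of (2.8)."""
    total = {}
    for m in range(k + 1):
        expo = m * (L - 1) if m <= k - p else L * (k - p) - m
        total = add(total, qmult(L - 1, a - m, k, m), shift=expo)
    return total

def tautology_holds(L, a, k, p):
    """Relation (2.9) for p = -1, ..., k-1."""
    b = k * L - a - p - 1
    lhs = add(qmult(L, a, k, p), qmult(L, b, k, p + 1), shift=L)
    rhs = add(qmult(L, b, k, p), qmult(L, a, k, p + 1), shift=L)
    return lhs == rhs
```

For $k = 1$, `fundamental_rhs` reduces to the familiar q-Pascal rule, since $\left[{L \atop a}\right]_1^{(1)} = q^{-a}\left[{L \atop a}\right]$.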
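The power of these (q-deformed) tautologies is that they allow for an endless number of different rewritings of the fundamental recurrences. In particular, as shown in the appendix, they allow for the non-trivial transformation of (2.8) into

Proposition 3. For all $p = 0, \ldots, k$, we have
\[
\begin{aligned}
\left[{L \atop a}\right]_k^{(p)} ={}& \sum_{\substack{m=0 \\ m \equiv p+k \ (\mathrm{mod}\ 2)}}^{k-p} q^{m(L-1)} \left[{L-1 \atop a - \frac{1}{2}(m - p + k)}\right]_k^{(m)} \\
&+ \sum_{\substack{m=0 \\ m \not\equiv p+k \ (\mathrm{mod}\ 2)}}^{k-p-1} q^{m(L-1)} \left[{L-1 \atop kL - a - \frac{1}{2}(m + p + k + 1)}\right]_k^{(m)} \\
&+ \sum_{\substack{m=k-p+2 \\ m \equiv p+k \ (\mathrm{mod}\ 2)}}^{k} q^{\frac{1}{2}\left((2L-1)(k-p) - m\right)} \left[{L-1 \atop a - \frac{1}{2}(m - p + k)}\right]_k^{(m)} \\
&+ \sum_{\substack{m=k-p+1 \\ m \not\equiv p+k \ (\mathrm{mod}\ 2)}}^{k} q^{\,kL + \frac{1}{2}\left((2L+1)(k-p) - m + 1\right) - 2a} \left[{L-1 \atop kL - a - \frac{1}{2}(m + p - k - 1)}\right]_k^{(m)}.
\end{aligned}
\tag{2.10}
\]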
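Because (2.10) is easy to mistranscribe, a numerical cross-check against Definition 1 is useful. The sketch below evaluates the four sums of Proposition 3 term by term and compares with the brute-force q-multinomial; it is an illustration only, with hypothetical function names, and the half-integer expressions are integers precisely because of the parity restrictions on $m$.

```python
from functools import lru_cache
from itertools import product

def add(a, b, sign=1, shift=0):
    """a + sign * q^shift * b for Laurent polynomials stored as {exponent: coefficient}."""
    out = dict(a)
    for e, c in b.items():
        out[e + shift] = out.get(e + shift, 0) + sign * c
    return {e: c for e, c in out.items() if c}

def mul(a, b):
    out = {}
    for ea, ca in a.items():
        for eb, cb in b.items():
            out[ea + eb] = out.get(ea + eb, 0) + ca * cb
    return {e: c for e, c in out.items() if c}

@lru_cache(maxsize=None)
def qbinom(L, a):
    if a < 0 or a > L:
        return {}
    if a == 0 or a == L:
        return {0: 1}
    return add(qbinom(L - 1, a), qbinom(L - 1, a - 1), shift=L - a)

@lru_cache(maxsize=None)
def qmult(L, a, k, p):
    """[L; a]_k^(p) of (2.5) by direct summation."""
    total = {}
    if L < 0 or p < 0 or a < 0 or a > k * L:
        return total
    for js in product(range(L + 1), repeat=k):
        if sum(js) != a or any(js[t] < js[t + 1] for t in range(k - 1)):
            continue
        expo = sum((L - js[t]) * js[t + 1] for t in range(k - 1)) - sum(js[k - p:])
        term, prev = {expo: 1}, L
        for j in js:
            term, prev = mul(term, qbinom(prev, j)), j
        total = add(total, term)
    return total

def rhs_2_10(L, a, k, p):
    """The four sums on the right-hand side of (2.10), in the order they appear."""
    total = {}
    for m in range(k + 1):
        same = (m - p - k) % 2 == 0                  # m congruent to p + k mod 2
        if same and m <= k - p:
            total = add(total, qmult(L - 1, a - (m - p + k) // 2, k, m),
                        shift=m * (L - 1))
        if not same and m <= k - p - 1:
            total = add(total, qmult(L - 1, k * L - a - (m + p + k + 1) // 2, k, m),
                        shift=m * (L - 1))
        if same and m >= k - p + 2:
            total = add(total, qmult(L - 1, a - (m - p + k) // 2, k, m),
                        shift=((2 * L - 1) * (k - p) - m) // 2)
        if not same and m >= k - p + 1:
            total = add(total, qmult(L - 1, k * L - a - (m + p - k - 1) // 2, k, m),
                        shift=k * L + ((2 * L + 1) * (k - p) - m + 1) // 2 - 2 * a)
    return total
```

For $k = 1$, $p = 0$ only the first two sums contribute and the result is again the q-Pascal rule, with one term rewritten through the symmetry $\left[{L-1 \atop L-a-1}\right] = \left[{L-1 \atop a}\right]$.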
It is thanks to these rather unappealing recurrences that we can prove Theorem 6.

Before concluding this section on the q-multinomial coefficients let us make some further remarks. First, for $k = 1$ and $k = 2$ we reproduce the well-known q-binomial and q-trinomial coefficients. In particular,
\[ \left[{L \atop a}\right]_1^{(0)} = \left[{L \atop a}\right] \tag{2.11} \]
and
\[ \left[{L \atop a}\right]_2^{(p)} = \binom{L;\ L - a - p;\ q}{L - a}_2 \qquad \text{for } p = 0, 1, \tag{2.12} \]
where on the right-hand side of (2.12) we have used the q-trinomial notation introduced by Andrews and Baxter [33].

Second, in [33], several recurrences involving q-trinomials with just a single superscript $(p)$ are given. We note that such recurrences follow from (2.8) by taking the difference between various values of $p$. In particular we have for all $r = 0, \ldots, p$
\[
\begin{aligned}
\left[{L \atop a}\right]_k^{(p)} ={}& \left[{L \atop a}\right]_k^{(p-r)} + q^{L(k-p) - a} \left(1 - q^{rL}\right) \sum_{m=0}^{p-r-1} q^{m(L-1)} \left[{L-1 \atop kL - a - m}\right]_k^{(m)} \\
&+ q^{L(k-p) - a} \sum_{m=p-r}^{p-1} \left(1 - q^{(p-m)L}\right) q^{m(L-1)} \left[{L-1 \atop kL - a - m}\right]_k^{(m)}.
\end{aligned}
\tag{2.13}
\]
This can be used to eliminate all multinomials $\left[{\cdots \atop \cdots}\right]_k^{(m)}$ for $m = 0, \ldots, p - 1, p + 1, \ldots, k$ in favour of $\left[{\cdots \atop \cdots}\right]_k^{(p)}$. The price to be paid for this is that the resulting expressions tend to get very complicated if $k$ gets large.
A further remark we wish to make is that to our knowledge the general q-deformed multinomials as presented in (2.5) are new. The multinomial $\left[{L \atop a}\right]_k^{(0)}$ however was already suggested as a “good” q-multinomial by Andrews in [36], where the following generating function for q-multinomials was proposed for all $k > 1$:
\[ p_{k,L}(x) = \sum_{a=0}^{L} x^a q^{\binom{a}{2}} \left[{L \atop a}\right] p_{k-1,a}(x q^L), \tag{2.14} \]
with $p_{0,L}(x) = 1$. Clearly,
\[ p_{k,L}(x) = \sum_{a=0}^{kL} x^a q^{\binom{a}{2}} \left[{L \atop a}\right]_k^{(0)}. \tag{2.15} \]
Also in the work of Date et al. the $\left[{L \atop a}\right]_k^{(0)}$ makes a brief appearance, see ref. [34], Eq. (3.29). The more general q-multinomials of Eq. (2.5) have been introduced independently by Schilling [29]. (The notation used in ref. [29] and that of the present paper is almost identical apart from the fact that $\left[{L \atop a}\right]_k^{(p)}$ is replaced by $\left[{L \atop kL/2 - a}\right]_k^{(p)}$.)
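3. Combinatorics of q-Multinomial Coefficients

In this section a combinatorial interpretation of the q-multinomial coefficients is given using Andrews' Durfee dissections [37]. We then show how the fundamental recurrences (2.8) with $p = 0$ follow as an immediate consequence of this interpretation. As a first step it is convenient to change variables from $q$ to $1/q$. Using the elementary transformation property of the Gaussian polynomials
\[ \left[{L \atop a}\right]_{1/q} = q^{-a(L-a)} \left[{L \atop a}\right]_q, \tag{3.1} \]
we set

Definition 2. For $p = 0, \ldots, k$
\[
\begin{aligned}
\left\{{L \atop a}\right\}_k^{(p)} :={}& \left. \left( q^{-aL} \left[{L \atop a}\right]_k^{(p)} \right) \right|_{q \to 1/q} \\
={}& \sum_{N_1 + \cdots + N_k = a} q^{N_1^2 + \cdots + N_k^2 + N_{k-p+1} + \cdots + N_k} \left[{L \atop N_1}\right] \left[{N_1 \atop N_2}\right] \cdots \left[{N_{k-1} \atop N_k}\right]
\end{aligned}
\tag{3.2}
\]
\[ \phantom{\left\{{L \atop a}\right\}_k^{(p)} :} = \sum_{\vec n^T C_k^{-1} \vec e_k = a} q^{\,\vec n^T C_k^{-1} (\vec n + \vec e_k - \vec e_{k-p})} \, \frac{(q)_L}{(q)_{L - \vec n^T C_k^{-1} \vec e_1} (q)_{n_1} (q)_{n_2} \cdots (q)_{n_k}}. \tag{3.3} \]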
3.1. Successive Durfee squares and Durfee dissections. As a short intermezzo, we review some of the ideas introduced by Andrews in ref. [37], needed for our interpretation of (3.2). Those already familiar with such concepts as “(k, a)-Durfee dissection of a partition” and “(k, a)-admissible partitions” may wish to skip the following and resume in Sect. 3.2. Throughout the following a partition and its corresponding Ferrers graph are identified.

Definition 3. The Durfee square of a partition is the maximal square of nodes (including the upper-leftmost node). The size of the Durfee square is the number of rows for which $r_\ell \ge \ell$, labelling the rows (= parts) of a partition by $r_1 \ge r_2 \ge \ldots$.

Copying the example from ref. [37], the Ferrers graph and Durfee square of the partition $\pi_{\rm ex} = 8 + 7 + 5 + 4 + 4 + 3 + 1 + 1$ is shown in Fig. 1(a). The portion of a partition of $n$ below its Durfee square defines a partition of $m < n$. For this “smaller” partition one can again draw the Durfee square. Continuing this process of drawing squares, we end up with the successive Durfee squares of a partition. For the partition $\pi_{\rm ex}$ this is shown in Fig. 1(b). If a partition $\pi$ has $k$ successive Durfee squares, with $N_\ell$ the size of the $\ell$th square, then $\pi$ has exactly $N_1 + \cdots + N_k$ parts with $N_1 + \cdots + N_\ell$ parts $\ge N_\ell$ for all $\ell = 1, \ldots, k$. Following Andrews we now slightly generalize the previous notions.

Definition 4 (Durfee rectangle). The Durfee rectangle of a partition is the maximal rectangle of nodes whose height exceeds its width by precisely one row.

The Durfee rectangle of the partition $\pi_{\rm ex}$ is shown in Fig. 1(c). The size of the Durfee rectangle is its width. One can now combine the Durfee squares and rectangles to define

Definition 5 (Durfee dissection). The $(k, i)$-Durfee dissection of a partition is obtained by drawing $i - 1$ successive Durfee squares followed by $k - i + 1$ successive Durfee rectangles.
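The successive Durfee squares and the Durfee rectangle of Definitions 3 and 4 are straightforward to compute from the list of parts. The following minimal sketch uses the example partition $\pi_{\rm ex}$ from the text; the function names are hypothetical, and partitions are assumed to be given as lists of positive integers.

```python
PI_EX = [8, 7, 5, 4, 4, 3, 1, 1]   # the example partition used in the text

def durfee_square_size(parts):
    """Largest d such that the d-th largest part is at least d (parts sorted non-increasingly)."""
    d = 0
    while d < len(parts) and parts[d] >= d + 1:
        d += 1
    return d

def successive_durfee_squares(parts):
    """Sizes N_1 >= N_2 >= ... of the successive Durfee squares."""
    parts = sorted(parts, reverse=True)
    sizes = []
    while parts:
        d = durfee_square_size(parts)
        sizes.append(d)
        parts = parts[d:]            # continue with the portion below the square
    return sizes

def durfee_rectangle_size(parts):
    """Width d of the largest rectangle of height d + 1 and width d in the Ferrers graph."""
    parts = sorted(parts, reverse=True)
    d = 0
    while d + 1 < len(parts) and parts[d + 1] >= d + 1:
        d += 1
    return d
```

For `PI_EX` this yields four successive Durfee squares of sizes 4, 2, 1, 1 (their sum equals the number of parts, 8) and a Durfee rectangle of size 4.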
In the following it will be convenient to adopt a slightly unconventional labelling. In particular, we label the Durfee squares from 1 to $i - 1$ and the rectangles from $i$ to $k$. Correspondingly, $N_\ell$ is the size of the Durfee square or rectangle labelled by $\ell$. We note that in the $(k, i)$-dissection of a partition corresponding to Durfee squares and rectangles of respective sizes $N_1 \ge N_2 \ge \ldots \ge N_k$, all the $N_\ell$ beyond some fixed $\ell'$ may actually be zero. Finally we come to the most important definition of this section.

Definition 6 ((k, i)-admissible). Let $N_1 \ge N_2 \ge \ldots \ge N_k$ be the respective sizes of the Durfee squares and rectangles in the $(k, i)$-Durfee dissection of a partition $\pi$. Then $\pi$ is $(k, i)$-admissible if
– $\pi$ has no parts below its last successive Durfee rectangle (or square if $i = k + 1$).
– For $\ell = i, \ldots, k$, the last row of the Durfee rectangle labelled by $\ell$ has $N_\ell$ nodes.

The first condition is equivalent to stating that the number of parts of $\pi$ equals $N_1 + \cdots + N_k + \max(\ell' - i + 1, 0)$, where $\ell'$ labels the number of Durfee squares and rectangles of non-zero size; $N_\ell > 0$ for $\ell \le \ell'$ and $N_\ell = 0$ for $\ell > \ell'$. The second condition is equivalent to stating that the last row of each Durfee rectangle is actually a part of $\pi$.

3.2. (k, i; L, a)-admissible partitions and q-multinomial coefficients. Using the previous definitions we are now prepared for the combinatorial interpretation of (3.2).
Fig. 1. (a) Durfee square of the partition $\pi_{\rm ex} = 8 + 7 + 5 + 4 + 4 + 3 + 1 + 1$. (b) The four successive Durfee squares of $\pi_{\rm ex}$. (c) The Durfee rectangle of $\pi_{\rm ex}$.
Definition 7 ((k, i; L, a)-admissible). Let $N_1 \ge N_2 \ge \ldots \ge N_k$ be the respective sizes of the Durfee squares and rectangles of a $(k, i)$-admissible partition $\pi$. Then $\pi$ is said to be $(k, i; L, a)$-admissible if the largest part of $\pi$ is less than or equal to $L$ and $N_1 + \cdots + N_k = a$.

For a $(k, i; L, a)$-admissible partition $\pi$, the portion $\pi_\ell$ of $\pi$ to the right of the Durfee square or rectangle labelled by $\ell$ (and below the Durfee square or rectangle labelled by $\ell - 1$), is a partition with largest part $\le N_{\ell-1} - N_\ell$ (where $N_0 = L$) and number of parts $\le N_\ell$. Recalling that the Gaussian polynomial (1.9) is the generating function of partitions with largest part $\le L - a$ and number of parts $\le a$ [6], we thus find that the generating function of $(k, i; L, a)$-admissible partitions is given by
\[ \sum_{N_1 + \cdots + N_k = a} q^{N_1^2} \left[{L \atop N_1}\right] \cdots q^{N_{i-1}^2} \left[{N_{i-2} \atop N_{i-1}}\right] q^{N_i(N_i+1)} \left[{N_{i-1} \atop N_i}\right] \cdots q^{N_k(N_k+1)} \left[{N_{k-1} \atop N_k}\right] = \left\{{L \atop a}\right\}_k^{(k-i+1)}. \tag{3.4} \]
Denoting an arbitrary partition of $n$ with largest part $\le L$ and number of parts $\le a$ by a rectangle of width $L$ and height $a$, the $(k, i; L, a)$-admissible partitions can be represented graphically as shown in Fig. 2 for the case $k = 2$.

Equipped with the above interpretation we return to the recurrence relation (2.8) for $p = 0$. Using Definition 2 to rewrite this in terms of $\left\{{L \atop a}\right\}_k^{(p)}$ gives
\[ \left\{{L \atop a}\right\}_k^{(0)} = q^a \sum_{m=0}^{k} \left\{{L-1 \atop a-m}\right\}_k^{(m)}. \tag{3.5} \]
This is obviously true if the following combinatorial statements hold.

Lemma 2.
– Adding a column of $a$ nodes to the left of a $(k, k - m + 1; L - 1, a - m)$-admissible partition with $m \in \{0, 1, \ldots, k\}$, yields a $(k, k + 1; L, a)$-admissible partition.
– Removing the first column (of $a$ nodes) from a $(k, k + 1; L, a)$-admissible partition yields a $(k, k - m + 1; L - 1, a - m)$-admissible partition for some $m \in \{0, 1, \ldots, k\}$.
Fig. 2. Graphical representation of the $(2, i; L, a)$-admissible partitions, generated by $\left\{{L \atop a}\right\}_2^{(3-i)}$ (three panels, for $i = 3$, $i = 2$ and $i = 1$). The respective values of $N_1$ and $N_2$ are free to vary, only their sum taking the fixed value $a$. Note that the number of parts in the second and third figure are actually not fixed, but vary between $a$ and $a - i + 3$, depending on the number of Durfee rectangles of non-zero size.
To show the first statement, we note that a partition is $(k, k + 1; L, a)$-admissible if it has exactly $a$ parts, has largest part $\le L$ and has at most $k$ successive Durfee squares. A $(k, k - m + 1; L - 1, a - m)$-admissible partition has at most $a$ parts and has largest part $\le L - 1$. Hence adding a column of $a$ nodes to the left of such a partition yields a partition $\pi$ which has $a$ parts and largest part $\le L$. It remains to show that $\pi$ has at most $k$ successive Durfee squares. To see this first assume that the $(k, k - m + 1; L - 1, a - m)$-admissible partition only consists of Durfee squares and rectangles. That is, we have a partition of $N_1^2 + \cdots + N_k^2 + N_{k-m+1} + \cdots + N_k$, with $N_1 + \cdots + N_k = a - m$. Adding a column of $a$ dots trivially yields a partition $\pi$ with $k$ successive Durfee squares with respective sizes
\[ N_1 \ge N_2 \ge \ldots \ge N_{k-m} \ge N_{k-m+1} + 1 \ge \ldots \ge N_k + 1 > 0, \tag{3.6} \]
with $\pi$ having a column of $N_\ell$ nodes to the right of the $\ell$th successive Durfee square for each $\ell \le k - m$. Now note that we in fact have treated the “worst” possible cases. All other $(k, k - m + 1; L - 1, a - m)$-admissible partitions can be obtained from the “bare” ones just treated by adding partitions with largest part $\le N_{\ell-1} - N_\ell$ (where $N_0 = L$) and number of parts $\le N_\ell$ to the right of the Durfee square or rectangle labelled by $\ell$ for all $\ell$. Let $\pi$ be such a “dressed” partition, obtained from a bare $(k, k - m + 1; L - 1, a - m)$-admissible partition $\pi_b$, and let the images of $\pi$ and $\pi_b$ after adding a column of $a$ dots be $\pi'$ and $\pi_b'$. Further, let $N_\ell$ and $M_\ell$ be the size of the $\ell$th successive Durfee square of $\pi_b'$ and $\pi'$, respectively. Since $\pi$ is obtained from $\pi_b$ by adding additional nodes to its rows, we have $M_1 + \cdots + M_\ell \ge N_1 + \cdots + N_\ell$ for all $\ell$. From the fact that $\pi_b'$ has at most $k$ successive Durfee squares it thus follows that this is also true for $\pi'$.

To show the second statement of the lemma, note that from (2.4) we see that the map implied by the first statement is in fact a map onto the set of $(k, k + 1; L, a)$-admissible partitions. Since for $m \ne m'$, the set of $(k, k - m + 1; L - 1, a - m)$-admissible partitions is distinct from the set of $(k, k - m' + 1; L - 1, a - m')$-admissible partitions, the second statement immediately follows.

To prove (2.8) is true for general $p$, we need to establish
\[ \left\{{L \atop a}\right\}_k^{(p)} = q^a \sum_{m=0}^{k-p} \left\{{L-1 \atop a-m}\right\}_k^{(m)} + q^a \sum_{m=k-p+1}^{k} q^{L(p-k+m)} \left\{{L-1 \atop a-m}\right\}_k^{(m)}. \tag{3.7} \]
Unfortunately, a generalization of Lemma 2 which would imply this more general result has so far eluded us.

Before concluding our discussion of q-multinomial coefficients we note that if the restriction on $L$ is dropped in the $(k, i; L, a)$-admissible partitions, their generating function reduces to
\[ \lim_{L \to \infty} \left\{{L \atop a}\right\}_k^{(k-i+1)} = \sum_{\substack{N_1 + \cdots + N_k = a \\ n_1, \ldots, n_k \ge 0}} \frac{q^{N_1^2 + \cdots + N_k^2 + N_i + \cdots + N_k}}{(q)_{n_1} (q)_{n_2} \cdots (q)_{n_k}}, \tag{3.8} \]
which, up to a factor $(q)_a$, is the representation of the Alder polynomials [38] as found in ref. [32].
4. Proof of Theorem 6

With the results of the previous two sections, proving Theorem 6 is elementary. First we define $S_{k,i,i';L}$ as the set of partitions of $n$ of the form $n = \sum_{j=1}^{L-1} j f_j$ satisfying the frequency conditions $f_1 \le i - 1$, $f_{L-1} \le i' - 1$ and $f_j + f_{j+1} \le k$ for $j = 1, \ldots, L - 2$. Let $\pi$ be a partition in $S_{k,i,i';L}$, with $\ell$ rows of length $L - 1$. Using the frequency condition this implies $f_{L-2} \le k - \ell$. Hence, by removing the first $\ell$ rows, $\pi$ maps onto a partition in $S_{k,i,k-\ell+1;L-1}$. Conversely, by adding $\ell$ rows at the top to a partition in $S_{k,i,k-\ell+1;L-1}$, we obtain a partition in $S_{k,i,i';L}$. Since in the above $\ell$ can take the values $\ell = 0, \ldots, i' - 1$, the following recurrences hold:
\[ G_{k,i,i';L}(q) = \sum_{\ell=0}^{i'-1} q^{\ell(L-1)} G_{k,i,k-\ell+1;L-1}(q) \qquad \text{for } i' = 1, \ldots, k + 1. \tag{4.1} \]
In addition to this we have the initial condition
\[ G_{k,i,i';2}(q) = \sum_{\ell=0}^{\min(i'-1,\, i-1)} q^\ell. \tag{4.2} \]
Using the recurrence relations, it is in fact an easy matter to verify that this is consistent with the condition
\[ G_{k,i,i';0}(q) = \delta_{i,i'}. \tag{4.3} \]
It remains to verify that (1.15) and (1.16) satisfy the recurrence (4.1) and initial condition (4.3). Since in these two equations we have used the variables $r$ and $s$ instead of $i'$ and $i$, let us first rewrite (4.1) and (4.3). Suppressing the $k$, $s$ and $q$ dependence, setting $G_{k,i,i';L}(q) = G_L(r)$, we get
\[ G_L(r) = \sum_{\ell=0}^{k-r} q^{\ell(L-1)} G_{L-1}(\ell) \qquad \text{for } r = 0, \ldots, k \tag{4.4} \]
and
\[ G_0(r) = \begin{cases} \delta_{s+r,\,k+1} & \text{for } s = 1, 3, \ldots, 2\lfloor \tfrac{k}{2} \rfloor + 1 \\ \delta_{s-r,\,k+2} & \text{for } s = 2\lfloor \tfrac{k}{2} \rfloor + 3, \ldots, 2k + 1. \end{cases} \tag{4.5} \]
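The recurrence (4.1), started from the formal initial condition (4.3), can be iterated mechanically and compared with a direct enumeration of the partitions in $S_{k,i,i';L}$. The sketch below is an illustration under those two equations only; the function names are hypothetical and polynomials in $q$ are stored as `{exponent: coefficient}` dictionaries.

```python
def G_from_recurrence(k, L):
    """Table {(i, i'): G_{k,i,i';L}} obtained by iterating (4.1) from (4.3)."""
    G = {(i, ip): ({0: 1} if i == ip else {})           # (4.3): G at L = 0
         for i in range(1, k + 2) for ip in range(1, k + 2)}
    for length in range(1, L + 1):
        new = {}
        for i in range(1, k + 2):
            for ip in range(1, k + 2):
                poly = {}
                for l in range(ip):                      # l = 0, ..., i' - 1
                    for e, c in G[(i, k - l + 1)].items():
                        key = e + l * (length - 1)       # factor q^{l (L-1)}
                        poly[key] = poly.get(key, 0) + c
                new[(i, ip)] = poly
        G = new
    return G

def G_bruteforce(k, i, ip, L):
    """Enumerate f_1, ..., f_{L-1} subject to (1.8); intended for L >= 2."""
    out = {}
    def rec(j, prev, weight):
        if j == L:
            out[weight] = out.get(weight, 0) + 1
            return
        hi = k - prev
        if j == 1:
            hi = min(hi, i - 1)
        if j == L - 1:
            hi = min(hi, ip - 1)
        for f in range(hi + 1):
            rec(j + 1, f, weight + j * f)
    rec(1, 0, 0)
    return out
```

Two iterations from (4.3) reproduce the initial condition (4.2), which is the consistency statement made in the text.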
To verify that (1.15) and (1.16) satisfy the initial condition (4.5), we set $L = 0$ and use the fact that $r = 0, 1, \ldots, k$ and $s = 1, 3, \ldots, 2k + 1$. From this and Eq. (2.6) one immediately sees that the only non-vanishing term in (1.15) is given by $\left[{0 \atop (k - s - r + 1)/2}\right]_k^{(r)} = \delta_{s+r,\,k+1}$. Similarly, the only non-vanishing term in (1.16) is $\left[{0 \atop (-k + s - r - 2)/2}\right]_k^{(r)} = \delta_{s-r,\,k+2}$. Now recall that (1.15) with $L = 0$ is $G_0(r)$ for $r \equiv k$ (mod 2). From the allowed range of $r$ this implies $s = 1, 3, \ldots, 2\lfloor \tfrac{k}{2} \rfloor + 1$, in accordance with the top line of (4.5). Also, since (1.16) with $L = 0$ is $G_0(r)$ for $r \not\equiv k$ (mod 2), and because of the range of $r$, we get $s = 2\lfloor \tfrac{k}{2} \rfloor + 3, \ldots, 2k + 1$, in accordance with the second line in (4.5).

Checking that (1.15) and (1.16) solve the recurrence relation (4.4) splits into several cases due to the parity dependence of $G_L(r)$ and of the q-multinomial recurrences (2.10). All of these cases are completely analogous and we restrict our attention to $k$ and $r$ being even, so that $G_L(r)$ is given by Eq. (1.15). Substituting recurrences (2.10), the first and second sum in (2.10) immediately give the right-hand side of (4.4). Consequently, the other two terms in (2.10) give rise to unwanted terms that have to cancel in order for (4.4) to be true. Dividing out the common factor $q^{(2L-1)(k-r)/2}$ and making the change of variables $m \to m - 1$ in the last sum of (2.10), the unwanted terms read
\[
\begin{aligned}
\sum_{\substack{m=k-r+2 \\ m \ \mathrm{even}}}^{k} q^{-\frac{1}{2} m} \sum_{j=-\infty}^{\infty} \Biggl\{ & q^{\,j\left((2j+1)(2k+3) - 2s\right)} \left[{L-1 \atop \frac{1}{2}(kL - s - m + 1) + (2k+3)j}\right]_k^{(m)} \\
& - q^{\,(2j+1)\left((2k+3)j + s\right)} \left[{L-1 \atop \frac{1}{2}(kL + s - m + 1) + (2k+3)j}\right]_k^{(m)} \\
& + q^{\,(2j-1)\left((2k+3)j - s\right)} \left[{L-1 \atop \frac{1}{2}(kL + s - m + 1) - (2k+3)j}\right]_k^{(m-1)} \\
& - q^{\,j\left((2j-1)(2k+3) + 2s\right)} \left[{L-1 \atop \frac{1}{2}(kL - s - m + 1) - (2k+3)j}\right]_k^{(m-1)} \Biggr\}.
\end{aligned}
\tag{4.6}
\]
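After changing the summation variable $j \to -j$ in the second and fourth term, this becomes
\[
\begin{aligned}
\sum_{\substack{m=k-r+2 \\ m \ \mathrm{even}}}^{k} q^{-\frac{1}{2} m} \sum_{j=-\infty}^{\infty} q^{\,j\left((2j+1)(2k+3) - 2s\right)} \times \Biggl\{ & \left[{L-1 \atop \frac{1}{2}(kL - s - m + 1) + (2k+3)j}\right]_k^{(m)} \\
& - q^{\,s - 2(2k+3)j} \left[{L-1 \atop \frac{1}{2}(kL + s - m + 1) - (2k+3)j}\right]_k^{(m)} \\
& + q^{\,s - 2(2k+3)j} \left[{L-1 \atop \frac{1}{2}(kL + s - m + 1) - (2k+3)j}\right]_k^{(m-1)} \\
& - \left[{L-1 \atop \frac{1}{2}(kL - s - m + 1) + (2k+3)j}\right]_k^{(m-1)} \Biggr\}.
\end{aligned}
\tag{4.7}
\]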
We now show that the term within the curly braces vanishes for all $m$ and $j$. To establish this, we apply the symmetry (2.7) to all four q-multinomials within the braces and divide by $q^{(k-m)(L-1)-\frac12(kL-s-m+1)-(2k+3)j}$. After replacing $L$ by $L+1$ and $m$ by $k-p$, this gives
\[
\begin{aligned}
& \begin{bmatrix} L \\ \frac12(kL+s-p-1)-(2k+3)j \end{bmatrix}_k^{(p)} - \begin{bmatrix} L \\ \frac12(kL-s-p-1)+(2k+3)j \end{bmatrix}_k^{(p)} \\
& \quad + q^{L} \begin{bmatrix} L \\ \frac12(kL-s-p-1)+(2k+3)j \end{bmatrix}_k^{(p+1)} - q^{L} \begin{bmatrix} L \\ \frac12(kL+s-p-1)-(2k+3)j \end{bmatrix}_k^{(p+1)}.
\end{aligned} \tag{4.8}
\]
Recalling the tautology (2.9) with $a = \frac12(kL+s-p-1)-(2k+3)j$, this indeed gives zero.
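As a numerical sanity check of the cancellation in (4.8), the following Python sketch tests the relation $[L;a]_k^{(p)} + q^L [L;kL-a-p-1]_k^{(p+1)} = [L;kL-a-p-1]_k^{(p)} + q^L [L;a]_k^{(p+1)}$ for small parameters. It rests on two assumptions that are not spelled out in this section: that this is the form of the tautology (2.9) used above (it is the form that makes (4.8) vanish), and that the q-multinomial of Definition (2.5) is $\sum_{j_1+\cdots+j_k=a} q^{\sum_{\ell=1}^{k-1}(L-j_\ell)j_{\ell+1}-\sum_{\ell=k-p}^{k-1}j_{\ell+1}}$ times the chain of q-binomials $[L;j_1][j_1;j_2]\cdots[j_{k-1};j_k]$, as inferred from the exponent manipulations in Appendix A. All function names are illustrative.

```python
from functools import lru_cache


def add(p1, p2):
    """Sum of two Laurent polynomials in q, stored as {exponent: coefficient}."""
    out = dict(p1)
    for e, c in p2.items():
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}


def mul(p1, p2):
    """Product of two Laurent polynomials in q."""
    out = {}
    for e1, c1 in p1.items():
        for e2, c2 in p2.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}


def shift(p, s):
    """Multiply a Laurent polynomial by q**s."""
    return {e + s: c for e, c in p.items()}


@lru_cache(maxsize=None)
def q_binomial(n, m):
    """Gaussian polynomial [n; m]; taken to be zero unless 0 <= m <= n."""
    if m < 0 or m > n:
        return {}
    if m == 0 or m == n:
        return {0: 1}
    return add(q_binomial(n - 1, m - 1), shift(q_binomial(n - 1, m), m))


def chains(top, length):
    """Yield all tuples top >= j_1 >= ... >= j_length >= 0."""
    if length == 0:
        yield ()
        return
    for j in range(top + 1):
        for rest in chains(j, length - 1):
            yield (j,) + rest


def q_multinomial(L, a, k, p):
    """Assumed form of the q-multinomial [L; a]_k^(p) (see lead-in)."""
    total = {}
    for js in chains(L, k):
        if sum(js) != a:
            continue
        exponent = sum((L - js[l]) * js[l + 1] for l in range(k - 1))
        exponent -= sum(js[k - p:])  # j_{k-p+1} + ... + j_k
        term = {exponent: 1}
        top = L
        for j in js:
            term = mul(term, q_binomial(top, j))
            top = j
        total = add(total, term)
    return total


def tautology_holds(L, a, k, p):
    """Check the assumed form of (2.9) for one choice of parameters."""
    b = k * L - a - p - 1
    lhs = add(q_multinomial(L, a, k, p), shift(q_multinomial(L, b, k, p + 1), L))
    rhs = add(q_multinomial(L, b, k, p), shift(q_multinomial(L, a, k, p + 1), L))
    return lhs == rhs
```

Calling `tautology_holds(L, a, k, p)` over a grid of small values gives a quick consistency test of the reconstruction; it does not replace the inductive proof given in Appendix A.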
5. Proof of Theorem 5
5.1. From partitions to paths. To prove expression (1.11) of Theorem 5, we reformulate the problem of calculating the generating function $G_{k,i,i';L}(q)$ into a lattice path problem. To this end we represent each partition $\pi$ in $S_{k,i,i';L}$ as a restricted lattice path $p(\pi)$, similar in spirit to the lattice path formulation of the left-hand side of (1.4) by Bressoud [39].¹

To map a partition $\pi$ of $n = \sum_{j=1}^{L-1} j f_j$ onto a lattice path $p(\pi)$, draw a horizontal line-segment in the $(x,y)$-plane from $(j-\frac12, f_j)$ to $(j+\frac12, f_j)$ for each $j = 1, \dots, L-1$. Also draw vertical line-segments from $(j+\frac12, f_j)$ to $(j+\frac12, f_{j+1})$ for all $j = 0, \dots, L-1$, where $f_0 = f_L = 0$. As a result $\pi$ is represented by a lattice path (or histogram) from $(\frac12, 0)$ to $(L-\frac12, 0)$. The frequency condition $f_j + f_{j+1} \le k$ translates into the condition that the sum of the heights of a path at $x$-positions $j$ and $j+1$ does not exceed $k$. The restrictions $f_1 \le i-1$ and $f_{L-1} \le i'-1$ correspond to the restrictions that the heights at $x = 1$ and $x = L-1$ are less than $i$ and $i'$, respectively. An example of a lattice path for $k \ge 8$, $i \ge 3$ and $i' \ge 1$ is shown in Fig. 3.

The above map clearly is reversible, and any lattice path satisfying the above height conditions maps onto a partition in $S_{k,i,i';L}$. From now on we let $P_{k,i,i';L}$ denote the set of restricted lattice paths corresponding to the set of partitions $S_{k,i,i';L}$. From the map of partitions onto paths, the problem of calculating the generating function $G_{k,i,i';L}(q)$ can be reformulated as
\[
G_{k,i,i';L}(q) = \sum_{p \in P_{k,i,i';L}} W(p) \tag{5.1}
\]
with Boltzmann weight $W(p) = \prod_{j=1}^{L-1} q^{j f_j}$. Before we actually compute the above sum, we remark that in the following $k$, $i$ and $i'$ will always be fixed. Hence, to simplify notation, we use $G_L$ and $P_L$ to denote $G_{k,i,i';L}$ and $P_{k,i,i';L}$, respectively.

¹ Finitizing Bressoud's lattice paths by fixing the length of his paths to $L$ results in the left-hand side of (6.1) of the next section. Hence the lattice paths introduced here are intrinsically different from those of ref. [39] and in fact correspond to a finitization of the paths of ref. [40].
Fig. 3. A lattice path of the partition $(f_1, \dots, f_{L-1})$ = (2, 4, 3, 3, 5, 3, 2, 4, 1, 0, 1, 3, 0, 0, 7, 0, 1, 1, 2, 4, 3, 3, 0, 1, 2, 1, 1, 3, 4, 4, 2, 2, 0, 0). The shaded regions correspond to the two particles with largest charge (= 8), as described below.
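The height conditions of Sect. 5.1 are easy to test mechanically. The Python sketch below (function names are illustrative, not from the paper) checks whether a frequency sequence $(f_1, \dots, f_{L-1})$ belongs to $S_{k,i,i';L}$, computes the exponent $\sum_j j f_j$ of its Boltzmann weight, and builds $G_{k,i,i';L}(q)$ by brute-force enumeration for small $L$. It is meant only as an illustration of the definitions, using the sequence of Fig. 3 as input.

```python
from itertools import product

# Frequency sequence (f_1, ..., f_{L-1}) of the path shown in Fig. 3 (L = 35).
FIG3 = [2, 4, 3, 3, 5, 3, 2, 4, 1, 0, 1, 3, 0, 0, 7, 0, 1, 1, 2, 4,
        3, 3, 0, 1, 2, 1, 1, 3, 4, 4, 2, 2, 0, 0]


def is_admissible(f, k, i, i_prime):
    """True if f satisfies f_j + f_{j+1} <= k, f_1 <= i - 1, f_{L-1} <= i' - 1."""
    if not f:
        return True
    if any(h < 0 for h in f):
        return False
    if f[0] > i - 1 or f[-1] > i_prime - 1:
        return False
    return all(f[j] + f[j + 1] <= k for j in range(len(f) - 1))


def weight_exponent(f):
    """Exponent of q in the Boltzmann weight W(p) = prod_j q^(j f_j)."""
    return sum(j * h for j, h in enumerate(f, start=1))


def generating_function(k, i, i_prime, L):
    """Brute-force G_{k,i,i';L}(q) as {exponent: coefficient}; small L only."""
    gf = {}
    for f in product(range(k + 1), repeat=L - 1):
        if is_admissible(f, k, i, i_prime):
            e = weight_exponent(f)
            gf[e] = gf.get(e, 0) + 1
    return gf
```

For instance, `is_admissible(FIG3, 8, 3, 1)` holds while `is_admissible(FIG3, 7, 3, 1)` does not, in line with the statement that Fig. 3 is an example for $k \ge 8$, $i \ge 3$ and $i' \ge 1$.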
5.2. Fermi-gas partition function; $i = i' = k+1$. To perform the sum (5.1) over the restricted lattice paths, we follow a procedure similar to the one employed in our proof of Virasoro-character identities for the unitary minimal models [23, 27]. That is, the sum (5.1) is interpreted as the grand-canonical partition function of a one-dimensional lattice-gas of fermionic particles.

The idea of this approach is to view each lattice path as a configuration of particles on a one-dimensional lattice. Since not all lattice paths correspond to the same particle content $\vec n$, this gives rise to a natural decomposition of (5.1) into
\[
G_L(q) = \sum_{\vec n} Z_L(\vec n; q), \tag{5.2}
\]
with $Z_L$ the canonical partition function,
\[
Z_L(\vec n; q) = \sum_{p \in P_L(\vec n)} W(p). \tag{5.3}
\]
Here $P_L(\vec n) \subset P_L$ is the set of paths corresponding to a particle configuration with content $\vec n$.

To avoid making the following description of the lattice gas unnecessarily complicated, we assume $i = i' = k+1$ in the remainder of this section. Subsequently we will briefly indicate how to modify the calculations to give results for general $i$ and $i'$.

To describe how to interpret each path in $P_L = P_{k,k+1,k+1;L}$ as a particle configuration, we first introduce a special kind of path from which all other paths can be constructed.

Definition 8 (minimal paths). The path shown in Fig. 4 is called the minimal path of content $\vec n$.

Definition 9 (charged particle). In a minimal path, each column with non-zero height $t$ corresponds to a particle of charge $t$.

Note that in the minimal path the particles are ordered according to their charge and that adjacent particles are separated by a single empty column. The number of particles of charge $t$ is denoted $n_t$ and $\vec n = (n_1, \dots, n_k)^T$. For later use it will be convenient to give each particle a label, $p_{t,\ell}$ denoting the $\ell$th particle of charge $t$, counted from the right.
Fig. 4. The minimal path of content $\vec n = (n_1, \dots, n_k)^T$
Since the length of a path is fixed by $L$, there are only a finite number of minimal paths. In particular, we have $2(n_1 + \cdots + n_k) \le L$, so that there are $\binom{\lfloor L/2 \rfloor + k}{k}$ different minimal paths. In the following we show that all non-minimal paths in $P_L$ can be constructed out of one (and only one) minimal path using a set of elementary moves. To this end we first describe how various local configurations may be changed by moving a particle.² To suit the eye, the particle being moved in each example has been shaded.

To describe the moves we first consider the simplest type of motion, when the two columns immediately to the right of a particle are empty.

Definition 10 (free motion). The following sequence of moves is called free motion:
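[Figure: the sequence of local configurations of free motion, with column heights $t$, $(t-1, 1)$, $(t-2, 2)$, ..., $t$ at positions $j$ and $j+1$.]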
Clearly, a particle of charge $t$ in free motion takes $t$ moves to fully shift position by one unit.

Now assume that in moving a particle of charge $t$, we at some stage encounter the local configuration shown in Fig. 5(a). We then allow the particle to make $t-s$ more moves following the rules of free motion, to end up with the local configuration shown in (b). If instead of (a) we encounter the configurations (c) or (d), the particle can make no further moves.

In case of the configuration of Fig. 5(b), there are three possibilities. Either we have one of the configurations shown in Fig. 6(a) and (b), in which case the particle cannot move any further, or we have the configuration shown in Fig. 6(c) (with $0 \le u < t-s$), in which case we can make $t-u-s$ moves, going from (c) to (e). Ignoring the column immediately to the left of the particle, which is irrelevant for our rules, the configuration of Fig. 6(e) is essentially the same as that of Fig. 5(b). To further move the particle we can thus refer to the rules given in Fig. 6. That is, if the column immediately to the right of the white column of height $u$ has height $t-u$ (corresponding to the configuration of Fig. 6(a) with $s \to u$) the particle cannot move any further. Similarly, if the column immediately to the right of the white column of height $u$ has height $v > t-u$ (corresponding to the configuration of Fig. 6(b) with $u \to v$ and $s \to u$) the particle cannot move any further. However, if the height of the column immediately to the right of the white column of height $u$ is $0 \le v < t-u$ (corresponding to the configuration of Fig. 6(c) with $u \to v$ and $s \to u$) we can make another $t-u-v$ moves.

² In moving a particle we always mean motion from left to right.

Fig. 5. [Panels (a)–(d): local configurations of a particle of charge $t$ next to columns of heights $s$ and $t-s$.]

Fig. 6. [Panels (a)–(e): local configurations with column heights $s$, $t-s$, $u$, $t-s-1$, $s+1$ and $t-u$.]

Having introduced all the necessary moves we come to the main propositions of this section.

Proposition 4 (rules of motion). Each non-minimal configuration can be obtained from one and only one minimal configuration by letting the particles carry out elementary moves in the following order:
– Particle $p_{t,\ell}$ moves prior to $p_{s,\ell'}$ if $t < s$.
– Particle $p_{t,\ell'}$ moves prior to $p_{t,\ell}$ if $\ell' < \ell$.

To prove this let us assume that we have completed the motion of all particles of charge less than $t$ and all particles $p_{t,\ell'}$ with $\ell' < \ell$, and that we are currently moving the particle $p_{t,\ell}$. Therefore, for $x \le 2n_k + \cdots + 2n_{t+1} + 2(n_t - \ell) := x_{\min}$, the lattice path still corresponds to the minimal path. Now note that in moving the particle $p_{t,\ell}$, we never create a local configuration in which the sum of the heights of two consecutive columns is greater than $t$, see the free motion and Figs. 5(a), (b) and 6(c)–(e). Since we have not moved any of the particles of charge greater than $t$, this means that to the right of $x_{\min}$ no two consecutive columns have summed heights greater than $t$.³ Also note that as soon as $p_{t,\ell}$ meets two columns immediately to its right whose summed heights equal $t$, $p_{t,\ell}$ cannot move any further, see Figs. 5(c) and 6(a). Consequently, $p_{t,\ell}$ always corresponds to the leftmost two consecutive columns right of $x_{\min}$ whose summed height equals $t$. (In fact, it corresponds to the leftmost two consecutive columns right of $x_{\min}$ with maximal summed heights.)

³ This also means that in moving a particle of charge $t$, the configurations of Fig. 5(d) and Fig. 6(b) in fact never arise.

Now we define reversed moves by reading all the previous figures in a mirror. Using this motion we can move $p_{t,\ell}$ all the way back to its minimal position but not any further. To see this we note that the only situation in which $p_{t,\ell}$ cannot be moved further back is if it meets two consecutive columns to its left whose summed heights are greater than or equal to $t$. Since we have just argued that such a configuration cannot occur between $x_{\min}$ and $p_{t,\ell}$ we can indeed move $p_{t,\ell}$ back to its minimal position using the reversed moves. Once it is back in its minimal position we either have the mirror image of Fig. 5(c) (in case $\ell < n_t$) or (d) (in case $\ell = n_t$). Neither of these configurations allows for further reversed moves.

The above, however, gives a general procedure for reducing each non-minimal path to a minimal one by simply reversing the rules of motion in the proposition. That is, we first scan the path for all particles of charge $k$, by locating all occurrences of two consecutive columns of summed heights $k$. From left to right these label the particles $p_{k,n_k}$ to $p_{k,1}$. Applying the previous reasoning with $t = k$, we can first move $p_{k,n_k}$ back using reversed moves, then $p_{k,n_k-1}$, et cetera, until all particles of charge $k$ have taken their "minimal position". Repeating this for the particles of charge $k-1$, then the particles of charge $k-2$, et cetera, each non-minimal path reduces to a unique minimal path.

As an example of the above, for the path of Fig. 3 the shaded regions mark the (two) particles with largest charge (= 8). Moving them back using the reversed motion, the leftmost particle being moved first, we end up with the path shown in Fig. 7, in which now the (three) particles with next-largest charge have been marked. We leave it to the reader to further reduce the path to obtain the minimal path of content $(n_1, n_2, \dots, n_8) = (2, 1, 2, 1, 3, 1, 3, 2)$.

Fig. 7. The lattice path obtained from Fig. 3 after moving the largest particles to their minimal position. The shaded regions mark the three next-largest particles

The elementary moves and the reversed moves are clearly reversible. If a particle of charge $t$ has made an elementary move changing a path from $p$ to $p'$, we can always carry out a reversed move going from $p'$ back to $p$. Since each path can be reduced to a unique minimal path using the reversed moves by carrying out the rules of motion of Proposition 4 in reversed order, we have thus established that using the rules of motion we can generate each non-minimal path uniquely from a minimal path. Hence the proposition is proven.

We now have established the decomposition of the sum (5.1) into (5.2), where $Z_L(\vec n; q)$ is the generating function of the paths generated by the minimal path labelled by $\vec n$, or, in other words, $Z_L(\vec n; q)$ is the partition function of a lattice gas of fermions
with particle content $\vec n$. The fermionic nature is that, unlike particles of different charge, particles of equal charge cannot exchange position. Our next result concerns the actual computation of the partition function.

Proposition 5. The partition function $Z_L$ is given by
\[
Z_L(\vec n; q) = q^{\vec n^T C_k^{-1} \vec n} \prod_{t=1}^{k} \begin{bmatrix} n_t + m_t \\ n_t \end{bmatrix}, \tag{5.4}
\]
with $\vec m + \vec n = \frac12 (I_k \vec m + L \vec e_k)$.

To prove this we first determine the contribution to $Z_L$ of the minimal path of content $\vec n$,
\[
\begin{aligned}
\ln W(p_{\min}) / \ln q &= \sum_{t=1}^{k} \sum_{\ell=1}^{n_t} t \Bigl( 2\ell - 1 + 2 \sum_{s=t+1}^{k} n_s \Bigr) = \sum_{t=1}^{k} t n_t \Bigl( n_t + 2 \sum_{s=t+1}^{k} n_s \Bigr) \\
&= \sum_{r=1}^{k} \sum_{t=r}^{k} n_t \Bigl( n_t + 2 \sum_{s=t+1}^{k} n_s \Bigr) = \sum_{r=1}^{k} N_r^2 = \vec n^T C_k^{-1} \vec n.
\end{aligned} \tag{5.5}
\]
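The evaluation (5.5) can be checked directly by building the minimal path. The following Python sketch (illustrative names, assuming the layout of Fig. 4: particles of largest charge leftmost, occupying the odd positions $x = 1, 3, 5, \dots$, and $N_r = n_r + \cdots + n_k$) compares the weight exponent of the minimal path with $\sum_r N_r^2$ and with the quadratic form built from $(C_k^{-1})_{s,t} = \min(s,t)$.

```python
from itertools import product


def minimal_path(n):
    """Column heights (f_1, f_2, ...) of the minimal path of content n = (n_1, ..., n_k)."""
    heights = []
    for t in range(len(n), 0, -1):      # largest charge first (leftmost)
        for _ in range(n[t - 1]):
            heights += [t, 0]           # a particle followed by one empty column
    return heights


def weight_exponent(f):
    """Exponent of q in W(p) = prod_j q^(j f_j)."""
    return sum(x * h for x, h in enumerate(f, start=1))


def sum_of_squares(n):
    """N_1^2 + ... + N_k^2 with N_r = n_r + ... + n_k."""
    return sum(sum(n[r:]) ** 2 for r in range(len(n)))


def quadratic_form(n):
    """n^T C_k^{-1} n with (C_k^{-1})_{s,t} = min(s, t)."""
    k = len(n)
    return sum(min(s, t) * n[s - 1] * n[t - 1]
               for s in range(1, k + 1) for t in range(1, k + 1))
```

Note that the minimal path of content $\vec n$ only fits on the lattice when $2(n_1 + \cdots + n_k) \le L$; the sketch does not enforce this.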
To obtain the contribution to $Z_L$ of the non-minimal configurations, we apply the rules of motion of Proposition 4. If $e_\ell$ denotes the number of elementary moves carried out by $p_{t,\ell}$, the generating function of moving the particles of charge $t$ reads
\[
\sum_{e_1=0}^{m_t} \sum_{e_2=0}^{e_1} \cdots \sum_{e_{n_t}=0}^{e_{n_t-1}} q^{e_1 + e_2 + \cdots + e_{n_t}} = \begin{bmatrix} m_t + n_t \\ n_t \end{bmatrix}. \tag{5.6}
\]
Here we have used the fact that each elementary move generates a factor $q$ and that $p_{t,\ell}$ cannot carry out more elementary moves than $p_{t,\ell-1}$. (If $p_{t,\ell}$ has made as many moves as $p_{t,\ell-1}$ we obtain either the local configuration of Fig. 5(c) or Fig. 6(a), prohibiting any further moves.) The number $m_t$ in (5.6) is the maximal number of elementary moves $p_{t,1}$ can make and remains to be determined.

If the content of the minimal path is $\vec n$, the $x$-position of $p_{t,1}$ in $p_{\min}$ is $2(n_k + \cdots + n_t) - 1 := x_0$. To fix $m_t$, let us assume that after the motion of the particles of charge less than $t$ has been completed, the nontrivial part of the lattice path is encoded by the sequence of heights $(f_{x_0}, \dots, f_{L-1})$, with $f_{x_0} = t$, $f_{x_0+1} = 0$ and $f_j + f_{j+1} < t$ for $j > x_0$. The particle $p_{t,1}$ can now move all the way to $x = L-1$ making
\[
\begin{aligned}
m_t &= (t - f_{x_0+2}) + (t - f_{x_0+2} - f_{x_0+3}) + (t - f_{x_0+3} - f_{x_0+4}) + \cdots + (t - f_{L-2} - f_{L-1}) + (t - f_{L-1}) \\
&= t(L - x_0 - 1) - 2 \sum_{j=x_0+2}^{L-1} f_j
\end{aligned} \tag{5.7}
\]
elementary moves. To simplify this, note that the sum on the right-hand side is nothing but twice the sum of the heights of the columns right of $x = x_0$, which is $2\sum_{s=1}^{t-1} s n_s$. Substituting this in (5.7) and using the definition of $x_0$, results in
\[
m_t = tL - 2\sum_{s=1}^{t-1} s n_s - 2t \sum_{s=t}^{k} n_s = tL - 2 \sum_{s=1}^{k} \min(s,t)\, n_s = L (C_k^{-1})_{t,k} - 2 \sum_{s=1}^{k} (C_k^{-1})_{t,s} n_s \tag{5.8}
\]
in accordance with Proposition 5. Putting together the results (5.5), (5.6) and (5.8) completes the proof of Proposition 5. Substituting the form (5.4) of the partition function into (5.2) proves expression (1.11) of Theorem 5, for $i = i' = k+1$.

5.3. Fermi-gas partition function; general $i$ and $i'$. Modifying the proof of Theorem 5 for $i = i' = k+1$ to all $i$ and $i'$ is straightforward and few details will be given. It is in fact interesting to note that unlike our proof for the character identities of the unitary minimal models [23, 27], the general case here does not require the introduction of additional "boundary particles".

First let us consider the general $i'$ case, with $i = k+1$. This implies that the height $f_{L-1}$ of the last column of the lattice paths is no longer free to take any of the values $1, \dots, k$, but is bound by $f_{L-1} \le i'-1$. For the particles of charge less than or equal to $i'-1$ this does not impose any new restrictions on the maximal number of moves $p_{t,1}$ can make. For $t > i'-1$ however, $m_t$ in (5.8) has to be decreased by $t - i' + 1$. Thus we find that $m_t$ of (5.8) has to be replaced by $m_t - \max(0, t-i'+1) = m_t - t + \min(t, i'-1)$. Recalling $(C_k^{-1})_{s,t} = \min(s,t)$, this yields
\[
m_t = (L-1)(C_k^{-1})_{t,k} + (C_k^{-1})_{t,i'-1} - 2 \sum_{s=1}^{k} (C_k^{-1})_{t,s} n_s \tag{5.9}
\]
and therefore, $m_t + n_t = \frac12 \bigl( \sum_{s=1}^{k} (I_k)_{t,s} m_s + (L-1)\delta_{t,k} + \delta_{t,i'-1} \bigr)$, which is in accordance with Proposition 5, for $i = k+1$.

Second, consider $i$ general, but $i' = k+1$, so that $f_1 \le i-1$, $f_{L-1} \le k$. Now the modification is slightly more involved since the actual minimal paths change from those of Fig. 4 to those of Fig. 8. This leads to a change in the calculation of $W(p_{\min})$ to
\[
\begin{aligned}
\ln W(p_{\min}) / \ln q &= \sum_{t=1}^{k} \sum_{\ell=1}^{n_t} \Bigl( 2\ell t - \min(t, i-1) + 2t \sum_{s=t+1}^{k} n_s \Bigr) \\
&= \sum_{t=1}^{k} t n_t \Bigl( n_t + 2 \sum_{s=t+1}^{k} n_s \Bigr) + \sum_{t=1}^{k} t n_t - \sum_{t=1}^{k} \min(t, i-1)\, n_t \\
&= \vec n^T C_k^{-1} \vec n + \sum_{t=i}^{k} (t - i + 1)\, n_t \\
&= \vec n^T C_k^{-1} (\vec n + \vec e_k - \vec e_{i-1}),
\end{aligned} \tag{5.10}
\]
which is indeed the general form of the quadratic exponent of $q$ in (1.11). Also $m_t$ again requires modification, which is in fact similar to the previous case: $m_t \to m_t - \max(0, t-i+1)$. To see this note that it takes $\max(0, t-i+1)$ elementary moves to move $p_{t,1}$ from its minimal position in Fig. 4 to its minimal position in Fig. 8.

Finally, combining the previous two cases, and using the fact that the modifications of $m_t$ due to $f_1 \le i-1$ and $f_{L-1} \le i'-1$ are independent, we immediately arrive at the general form of (1.11) with $(m,n)$-system (1.12).
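For $i = i' = k+1$ the result of Sect. 5.2 can be confirmed by brute force for small $k$ and $L$. The sketch below (Python, illustrative names) enumerates all restricted paths to obtain $G_L(q)$ and compares it with the sum over $\vec n$ of (5.4), using $m_t = tL - 2\sum_s \min(s,t) n_s$ from (5.8). It assumes the convention that a q-binomial vanishes unless its lower entry lies between 0 and its upper entry, so that contents $\vec n$ with some $m_t < 0$ do not contribute.

```python
from itertools import product


def add(p1, p2):
    """Sum of two polynomials in q, stored as {exponent: coefficient}."""
    out = dict(p1)
    for e, c in p2.items():
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}


def mul(p1, p2):
    """Product of two polynomials in q."""
    out = {}
    for e1, c1 in p1.items():
        for e2, c2 in p2.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}


def q_binomial(n, m):
    """Gaussian polynomial [n; m]; zero unless 0 <= m <= n (assumed convention)."""
    if m < 0 or m > n:
        return {}
    if m == 0 or m == n:
        return {0: 1}
    shifted = {e + m: c for e, c in q_binomial(n - 1, m).items()}
    return add(q_binomial(n - 1, m - 1), shifted)


def path_side(k, L):
    """G_L(q) for i = i' = k + 1 by enumerating all paths (f_1, ..., f_{L-1})."""
    gf = {}
    for f in product(range(k + 1), repeat=L - 1):
        if all(f[j] + f[j + 1] <= k for j in range(L - 2)):
            e = sum(j * h for j, h in enumerate(f, start=1))
            gf[e] = gf.get(e, 0) + 1
    return gf


def fermionic_side(k, L):
    """Sum over particle contents n of Z_L(n; q) as given by (5.4) and (5.8)."""
    total = {}
    for n in product(range(L // 2 + 1), repeat=k):
        m = [t * L - 2 * sum(min(s, t) * n[s - 1] for s in range(1, k + 1))
             for t in range(1, k + 1)]
        if min(m) < 0:
            continue  # the minimal path of this content does not fit
        exponent = sum(min(s, t) * n[s - 1] * n[t - 1]
                       for s in range(1, k + 1) for t in range(1, k + 1))
        term = {exponent: 1}
        for t in range(k):
            term = mul(term, q_binomial(n[t] + m[t], n[t]))
        total = add(total, term)
    return total
```

Comparing `path_side(k, L)` with `fermionic_side(k, L)` for, say, $k \le 3$ and $L \le 6$ is a cheap consistency check; the enumeration grows like $(k+1)^{L-1}$, so it is not meant for large $L$.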
Fig. 8. The minimal path of content $\vec n = (n_1, \dots, n_k)^T$ for general $i$. The dashed lines are drawn to mark the different particles
6. Discussion

In this paper we have presented polynomial identities which arise from finitizing Gordon's frequency partitions. The bosonic side of the identities involves q-deformations of the coefficient of $x^a$ in the expansion of $(1 + x + \cdots + x^k)^L$. The fermionic side follows from interpreting the generating function of the frequency partitions as the grand-canonical partition function of a one-dimensional lattice gas.

Interestingly, in recent publications, Foda and Quano, and Kirillov, have given different polynomial identities which imply (1.4) [17-19]. In the notation of Sect. 1 these identities can be expressed as

Theorem 7 (Foda–Quano, Kirillov). For all $k \ge 1$, $1 \le i \le k+1$ and $L \ge k-i+1$,
\[
\sum_{n_1,\dots,n_k \ge 0} q^{N_1^2 + \cdots + N_k^2 + N_i + \cdots + N_k} \prod_{j=1}^{k} \begin{bmatrix} L - N_j - N_{j+1} - 2\sum_{\ell=1}^{j-1} N_\ell - \alpha_{i,j} \\ n_j \end{bmatrix}
= \sum_{j=-\infty}^{\infty} (-)^j q^{j((2k+3)(j+1)-2i)/2} \begin{bmatrix} L \\ \bigl\lfloor \frac{L-k+i-1-(2k+3)j}{2} \bigr\rfloor \end{bmatrix}, \tag{6.1}
\]
with $N_{k+1} = 0$ and $\alpha_{i,j} = \max(0, j-i+1)$.

An explanation for this different finitization of (1.4) can be found in a theorem due to Andrews [41]:

Theorem 8 (Andrews). Let $Q_{k,i}(n)$ be the number of partitions of $n$ whose successive ranks lie in the interval $[2-i, 2k-i+1]$. Then $Q_{k,i}(n) = A_{k,i}(n)$.

It turns out that it is the (natural) finitization of these successive rank partitions which gives rise to the above alternative polynomial finitization. That is, (6.1) is an identity for the generating function of partitions with largest part $\le \lfloor (L+k-i+2)/2 \rfloor$, number of parts $\le \lfloor (L-k+i-1)/2 \rfloor$, whose successive ranks lie in the interval $[2-i, 2k-i+1]$.

Let us now reexpress (6.1) into a form similar to Eqs. (1.11), (1.15) and (1.16). Thus we eliminate $i$ in the right-hand side of (6.1) in favour of the variable $s$ of Eq. (1.14) and split the result into two cases. This gives
\[
\text{RHS}(6.1) = \sum_{j=-\infty}^{\infty} \Biggl\{ q^{j((2j+1)(2k+3)-2s)} \begin{bmatrix} L \\ \frac12(L+k-s+1)+(2k+3)j \end{bmatrix}_1^{(0)} - q^{(2j+1)((2k+3)j+s)} \begin{bmatrix} L \\ \frac12(L+k+s+1)+(2k+3)j \end{bmatrix}_1^{(0)} \Biggr\} \tag{6.2}
\]
for $L+k$ even, and
\[
\text{RHS}(6.1) = \sum_{j=-\infty}^{\infty} \Biggl\{ q^{j((2j+1)(2k+3)-2s)} \begin{bmatrix} L \\ \frac12(L-k+s-2)-(2k+3)j \end{bmatrix}_1^{(0)} - q^{(2j+1)((2k+3)j+s)} \begin{bmatrix} L \\ \frac12(L-k-s-2)-(2k+3)j \end{bmatrix}_1^{(0)} \Biggr\} \tag{6.3}
\]
for $L+k$ odd. This we recognize to be exactly (1.15) and (1.16) with $r = 0$, $kL$ replaced by $L$ and $[\,\cdots]_k^{(0)}$ replaced by $[\,\cdots]_1^{(0)}$. Similarly, if we express the left-hand side of (6.1) through an $(n,m)$-system, we find precisely (1.11) but with
\[
\vec m + \vec n = \tfrac12 \bigl( I_k \vec m + L \vec e_1 + \vec e_{i-1} - \vec e_k \bigr). \tag{6.4}
\]
This is just (1.12) with $r = 0$ and $L \vec e_k$ replaced by $L \vec e_1$.

From the above observations it does not require much insight to propose more general polynomial identities which have (6.1) and those implied by Theorems 5 and 6 as special cases. In particular, we have confirmed the following conjecture by extensive series expansions.

Conjecture 1. For all $k \ge 1$, $1 \le \ell \le k$, $1 \le i \le k+1$, $1 \le i' \le \ell+1$ and $\ell L \ge k + \ell - i - i' + 2$,
\[
\sum_{n_1,n_2,\dots,n_k \ge 0} q^{\vec n^T C_k^{-1} (\vec n + \vec e_k - \vec e_{i-1})} \prod_{j=1}^{k} \begin{bmatrix} n_j + m_j \\ n_j \end{bmatrix} \tag{6.5}
\]
with $(m,n)$-system given by
\[
\vec m + \vec n = \tfrac12 \bigl( I_k \vec m + (L-1) \vec e_\ell + \vec e_{i-1} + \vec e_{i'-1} - \vec e_k \bigr) \tag{6.6}
\]
equals
\[
\sum_{j=-\infty}^{\infty} \Biggl\{ q^{j((2j+1)(2k+3)-2s)} \begin{bmatrix} L \\ \frac12(\ell L+k-s-r+1)+(2k+3)j \end{bmatrix}_\ell^{(r)} - q^{(2j+1)((2k+3)j+s)} \begin{bmatrix} L \\ \frac12(\ell L+k+s-r+1)+(2k+3)j \end{bmatrix}_\ell^{(r)} \Biggr\} \tag{6.7}
\]
for $r \equiv \ell L + k \pmod 2$ and
\[
\sum_{j=-\infty}^{\infty} \Biggl\{ q^{j((2j+1)(2k+3)-2s)} \begin{bmatrix} L \\ \frac12(\ell L-k+s-r-2)-(2k+3)j \end{bmatrix}_\ell^{(r)} - q^{(2j+1)((2k+3)j+s)} \begin{bmatrix} L \\ \frac12(\ell L-k-s-r-2)-(2k+3)j \end{bmatrix}_\ell^{(r)} \Biggr\} \tag{6.8}
\]
for $r \not\equiv \ell L + k \pmod 2$. Here $s$ is defined as in (1.14) and
\[
r = \ell - i' + 1 \tag{6.9}
\]
so that $r = 0, \dots, \ell$.
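The polynomial identity (6.1) of Theorem 7 quoted earlier in this section lends itself to the same kind of machine check. The Python sketch below (illustrative names) evaluates both sides of (6.1) as polynomials in $q$ for small $k$, $i$ and $L$. It assumes the convention that a q-binomial is zero unless its lower entry lies between 0 and its upper entry, which in particular sets $\begin{bmatrix} N \\ 0 \end{bmatrix} = 0$ for $N < 0$; the hand computations for $k \le 2$ are consistent with this choice.

```python
from itertools import product


def add(p1, p2):
    """Sum of two polynomials in q, stored as {exponent: coefficient}."""
    out = dict(p1)
    for e, c in p2.items():
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}


def mul(p1, p2):
    """Product of two polynomials in q."""
    out = {}
    for e1, c1 in p1.items():
        for e2, c2 in p2.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}


def q_binomial(n, m):
    """Gaussian polynomial [n; m]; zero unless 0 <= m <= n (assumed convention)."""
    if m < 0 or m > n:
        return {}
    if m == 0 or m == n:
        return {0: 1}
    shifted = {e + m: c for e, c in q_binomial(n - 1, m).items()}
    return add(q_binomial(n - 1, m - 1), shifted)


def theorem7_lhs(k, i, L):
    """Fermionic side of (6.1)."""
    total = {}
    for n in product(range(L + 1), repeat=k):
        N = [sum(n[r:]) for r in range(k)] + [0]   # N[r] is N_{r+1}; N_{k+1} = 0
        exponent = sum(x * x for x in N) + sum(N[i - 1:k])
        term = {exponent: 1}
        for j in range(1, k + 1):
            top = L - N[j - 1] - N[j] - 2 * sum(N[:j - 1]) - max(0, j - i + 1)
            term = mul(term, q_binomial(top, n[j - 1]))
        total = add(total, term)
    return total


def theorem7_rhs(k, i, L):
    """Bosonic side of (6.1)."""
    total = {}
    bound = L + k + 1   # beyond this range the q-binomial vanishes
    for j in range(-bound, bound + 1):
        exponent = j * ((2 * k + 3) * (j + 1) - 2 * i) // 2
        lower = (L - k + i - 1 - (2 * k + 3) * j) // 2   # floor division
        sign = -1 if j % 2 else 1
        term = {exponent + e: sign * c for e, c in q_binomial(L, lower).items()}
        total = add(total, term)
    return total
```

For $k = 1$, $i = 2$ both sides reduce to the classical polynomials $\sum_n q^{n^2} \begin{bmatrix} L-n \\ n \end{bmatrix}$, which gives a convenient spot check.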
For later reference, let us denote these more general polynomials as $G^{(\ell)}_{k,i,i';L}(q)$. Then $\ell = k$ corresponds to the polynomials considered in this paper and $\ell = 1$ to those of Foda, Quano and Kirillov. The above conjecture leads one to wonder whether there are in fact (at least) $k$ different partition theoretical interpretations of (1.4), each of which has a natural finitization corresponding to the polynomials $G^{(\ell)}_{k,i,i';L}(q)$ with $\ell = 1, \dots, k$.

Intimately related to the conjecture and perhaps even more surprising is the following observation, originating from the work of Andrews and Baxter [33]. For $k \ge 0$ and $1 \le i \le k+1$, define a $k$-variable generating function
\[
f(x_1, \dots, x_k) = \sum_{n_1,n_2,\dots,n_k \ge 0} \frac{q^{N_1^2 + \cdots + N_k^2 + N_i + \cdots + N_k}\, x_1^{2N_1} x_2^{2(N_1+N_2)} \cdots x_k^{2(N_1+\cdots+N_k)}}{(x_1)_{n_1+1} (x_2)_{n_2+1} \cdots (x_k)_{n_k+1}}, \tag{6.10}
\]
where $(x)_n = \prod_{k=0}^{n-1} (1 - x q^k)$. Obviously, $(1-x_1) \cdots (1-x_k) f(1, \dots, 1)$ corresponds to the left-hand side of (1.4). Now define the polynomials $P(\ell_1, \dots, \ell_k) := P(\vec\ell)$ as the coefficients in the series expansion of $f$,
\[
f(x_1, \dots, x_k) = \sum_{\ell_1,\dots,\ell_k} P(\vec\ell)\, x_1^{\ell_1} \cdots x_k^{\ell_k}. \tag{6.11}
\]
From the readily derived functional equations for $f$ and the recurrences (4.1) with (4.3) one can deduce that
\[
P(\vec m + 2 C_k^{-1} \vec n) = G_{k,i,i';L}(q) \tag{6.12}
\]
with $\vec m$ and $\vec n$ given by (1.12). Similarly the polynomials of Foda, Quano and Kirillov arise again as $P(\vec m + 2 C_k^{-1} \vec n)$, where $\vec m$ and $\vec n$ now satisfy (6.4). Again we found numerically that also the polynomials featuring in the conjecture appear naturally. That is,
\[
P(\vec m + 2 C_k^{-1} \vec n) = G^{(\ell)}_{k,i,i';L}(q), \tag{6.13}
\]
where now the generalized $(m,n)$-system (6.6) should hold (so that $\vec m + 2 C_k^{-1} \vec n = C_k^{-1} ((L-1) \vec e_\ell + \vec e_{i-1} + \vec e_{i'-1} - \vec e_k)$).

Although all the polynomial identities implied by Conjecture 1 reduce to Andrews' identity (1.4) in the $L$ to infinity limit, they still provide a powerful tool for generating new q-series results. That is, if we first replace $q \to 1/q$ and then take $L \to \infty$, new identities arise. To state these, we need some more notation. The inverse Cartan matrix of the Lie algebra $A_{\ell-1}$ is denoted by $B_{\ell-1}$, and $\vec\mu$ and $\vec\varepsilon_j$ are $(\ell-1)$-dimensional (column) vectors with entries $(\vec\mu)_j = \mu_j$ and $(\vec\varepsilon_j)_m = \delta_{j,m}$. Furthermore, we need the $k$-dimensional vector
\[
\vec Q_{i,i',\ell} = (\vec e_i + \vec e_{i+2} + \cdots) + (\vec e_{i'} + \vec e_{i'+2} + \cdots) + (\vec e_{\ell+1} + \vec e_{\ell+3} + \cdots), \tag{6.14}
\]
with $\vec e_j = \vec 0$ for $j \ge k+1$. Using this notation, we are led to the following conjecture.

Conjecture 2. For all $k \ge 1$, $1 \le \ell \le k$, $1 \le i \le k+1$, $1 \le i' \le \ell+1$ and $|q| < 1$, the q-series
\[
q^{(i'+i-2)/4} \sum_{\substack{m_1,m_2,\dots,m_k \ge 0 \\ m_j \equiv (\vec Q_{i,i',\ell})_j \ (\mathrm{mod}\ 2)}} \frac{q^{\frac14 \vec m^T (C_k \vec m + 2\vec e_k - 2\vec e_{i-1})}}{(q)_{m_\ell}} \times \prod_{\substack{j=1 \\ j \ne \ell}}^{k} \begin{bmatrix} \frac12 \bigl( I_k \vec m + \vec e_{i-1} + \vec e_{i'-1} - \vec e_k \bigr)_j \\ m_j \end{bmatrix} \tag{6.15}
\]
equals
\[
\begin{aligned}
& q^{(k+r-s+1)(k-r-s+1)/(4\ell)}\, \frac{1}{(q)_\infty} \sum_{n=0}^{\ell-1} \sum_{\substack{\mu_1,\dots,\mu_{\ell-1} \ge 0 \\ n + \ell (B_{\ell-1} \vec\mu)_1 \equiv 0 \ (\mathrm{mod}\ \ell)}} \frac{q^{\vec\mu^T B_{\ell-1} (\vec\mu - \vec\varepsilon_r)}}{(q)_{\mu_1} \cdots (q)_{\mu_{\ell-1}}} \\
& \quad \times \Biggl\{ \sum_{\substack{j=-\infty \\ n + (k-s-r+1)/2 + (2k+3)j \equiv 0 \ (\mathrm{mod}\ \ell)}}^{\infty} q^{j((2k-2\ell+3)(2k+3)j + (2k+3)(k-\ell+1) - (2k-2\ell+3)s)/\ell} \\
& \qquad - \sum_{\substack{j=-\infty \\ n + (k+s-r+1)/2 + (2k+3)j \equiv 0 \ (\mathrm{mod}\ \ell)}}^{\infty} q^{((2k-2\ell+3)j + (k-\ell+1))((2k+3)j+s)/\ell} \Biggr\}
\end{aligned} \tag{6.16}
\]
for $r \equiv k \pmod 2$, and equals
\[
\begin{aligned}
& q^{(k+r-s+2)(k-r-s+2)/(4\ell)}\, \frac{1}{(q)_\infty} \sum_{n=0}^{\ell-1} \sum_{\substack{\mu_1,\dots,\mu_{\ell-1} \ge 0 \\ n + \ell (B_{\ell-1} \vec\mu)_1 \equiv 0 \ (\mathrm{mod}\ \ell)}} \frac{q^{\vec\mu^T B_{\ell-1} (\vec\mu - \vec\varepsilon_r)}}{(q)_{\mu_1} \cdots (q)_{\mu_{\ell-1}}} \\
& \quad \times \Biggl\{ \sum_{\substack{j=-\infty \\ n - (k-s+r+2)/2 - (2k+3)j \equiv 0 \ (\mathrm{mod}\ \ell)}}^{\infty} q^{j((2k-2\ell+3)(2k+3)j + (2k+3)(k-\ell+2) - (2k-2\ell+3)s)/\ell} \\
& \qquad - \sum_{\substack{j=-\infty \\ n - (k+s+r+2)/2 - (2k+3)j \equiv 0 \ (\mathrm{mod}\ \ell)}}^{\infty} q^{((2k-2\ell+3)j + (k-\ell+2))((2k+3)j+s)/\ell} \Biggr\}
\end{aligned} \tag{6.17}
\]
for $r \not\equiv k \pmod 2$.

Since Conjecture 1 is proven for $\ell = 1$ and $k$, we can for these particular values claim the above as theorem. In fact, for $\ell = 1$, the above was first conjectured in ref. [12] and proven in [42]. In refs. [43, 44] expressions for the branching functions of the $(A_1^{(1)})_M \times (A_1^{(1)})_N / (A_1^{(1)})_{M+N}$ coset conformal field theories were given similar to (6.16) and (6.17). This similarity suggests that (6.15), (6.16) and (6.17) correspond to the branching functions of the coset $(A_1^{(1)})_\ell \times (A_1^{(1)})_{k-\ell-1/2} / (A_1^{(1)})_{k-1/2}$ of fractional level.

A very last comment we wish to make is that there exist other polynomial identities than those discussed in this paper which imply the Andrews–Gordon identity (1.4) and which involve the q-multinomial coefficients.

Theorem 9. For all $k \ge 1$ and $1 \le i \le k+1$,
\[
\sum_{a=0}^{kL} \begin{bmatrix} L \\ a \end{bmatrix}_k^{(k-i+1)} = \sum_{j=-L}^{L} (-)^j q^{j((2k+3)(j+1)-2i)/2}\, \frac{(q)_L}{(q)_{L-j} (q)_{L+j}}. \tag{6.18}
\]
Note that for $i = k+1$ the left-hand side is the generating function of partitions with at most $k$ successive Durfee squares and with largest part $\le L$. The proof of Theorem 9 follows readily using the Bailey lattice of ref. [45]. For $k = 1$, (6.18) was first obtained by Rogers [1]. For other $k$ it is implicit in refs. [45, 46].

Acknowledgement. I thank Anne Schilling for helpful and stimulating discussions on the q-multinomial coefficients. Especially her communication of Eq. (2.8) has been indispensable for proving Proposition 3. I thank Alexander Berkovich for very constructive discussions on the nature of the Fermi gas of Sect. 4. I wish to thank Professor G. E. Andrews for drawing my attention to the relevance of Eq. (6.10) and Barry McCoy for electronic lectures on the history of the Rogers–Ramanujan identities. Finally, helpful and interesting discussions with Omar Foda and Peter Forrester are gratefully acknowledged. This work is supported by the Australian Research Council.
A. Proof of q-Multinomial Relations

In this section we prove the various claims concerning the q-multinomial coefficients made in Sect. 2.

Let us start proving the symmetry properties (2.7) of Lemma 1. First we take Definition (2.5) and make the change of variables $j_\ell \to L - j_{k-\ell+1}$ for all $\ell = 1, \dots, k$. This changes the restriction on the sum to $j_1 + \cdots + j_k = kL - a$, changes the exponent of $q$ to
\[
\sum_{\ell=1}^{k-1} (L - j_{k-\ell})\, j_{k-\ell+1} - \sum_{\ell=k-p}^{k-1} (L - j_{k-\ell}), \tag{A.1}
\]
but leaves the product over the q-binomials invariant. We now perform a simple rewriting of (A.1) as follows
\[
\begin{aligned}
(\text{A.1}) &= \sum_{\ell=1}^{k-1} (L - j_\ell)\, j_{\ell+1} + \sum_{\ell=1}^{p} j_\ell - pL && (\text{by } \ell \to k-\ell) \\
&= \sum_{\ell=1}^{k-1} (L - j_\ell)\, j_{\ell+1} - \sum_{\ell=p}^{k-1} j_{\ell+1} + (k-p)L - a && (\text{by } j_1 + \cdots + j_k = kL - a),
\end{aligned} \tag{A.2}
\]
which proves the first claim of the lemma. The second statement in the lemma follows, for example, by noting that $\begin{bmatrix} L \\ a \end{bmatrix}_k^{(k)} = q^{-a} \begin{bmatrix} L \\ a \end{bmatrix}_k^{(0)}$.

The proof of the tautologies (2.9) of Proposition 2 is somewhat more involved and we proceed inductively. For $L = 0$ (2.9) is obviously correct, thanks to
\[
\begin{bmatrix} 0 \\ a \end{bmatrix}_k^{(p)} = \delta_{a,0}. \tag{A.3}
\]
Now assume (2.9) holds true for all $L' = 0, \dots, L$. To show that this implies (2.9) for $L' = L+1$, we substitute the fundamental recurrence (2.8) into (2.9) with $L$ replaced by $L+1$. After some cancellation of terms and division by $(1 - q^{L+1})$, this simplifies to
\[
\sum_{m=0}^{M} q^{mL} \begin{bmatrix} L \\ a-m \end{bmatrix}_k^{(m)} = \sum_{m=0}^{M} q^{mL} \begin{bmatrix} L \\ kL-a-m+M \end{bmatrix}_k^{(m)}, \tag{A.4}
\]
where we have replaced $k-p-1$ by $M$. Since in (2.9) we have $p = -1, \dots, k-1$, (A.4) should hold for $M = 0, \dots, k$. A set of equations equivalent to this is obtained by taking (A.4)$_{M=0}$ and (A.4)$_M$ $-$ (A.4)$_{M-1}$ for $M = 1, \dots, k$. In formula this new set of equations reads
\[
\sum_{m=0}^{M-1} q^{mL} \Biggl\{ \begin{bmatrix} L \\ kL-a-m+M-1 \end{bmatrix}_k^{(m)} - q^{L} \begin{bmatrix} L \\ kL-a-m+M-1 \end{bmatrix}_k^{(m+1)} \Biggr\} = \begin{bmatrix} L \\ kL-a+M \end{bmatrix}_k^{(0)} - q^{ML} \begin{bmatrix} L \\ a-M \end{bmatrix}_k^{(M)}, \tag{A.5}
\]
for $M = 0, \dots, k$. Now we use the induction assumption on the term within the curly braces, and the second symmetry relation of (2.7) on the first term of the right-hand side. This yields
\[
\sum_{m=0}^{M-1} q^{mL} \Biggl\{ \begin{bmatrix} L \\ a-M \end{bmatrix}_k^{(m)} - q^{L} \begin{bmatrix} L \\ a-M \end{bmatrix}_k^{(m+1)} \Biggr\} = \begin{bmatrix} L \\ a-M \end{bmatrix}_k^{(0)} - q^{ML} \begin{bmatrix} L \\ a-M \end{bmatrix}_k^{(M)}. \tag{A.6}
\]
Expanding the sum, all but two terms on the left-hand side cancel, yielding the right-hand side.

Finally we have to show Eq. (2.11) of Proposition 3 to be true. We approach this problem indirectly and will in fact show that the right-hand side of (2.11) can be transformed into the right-hand side of (2.8) by multiple application of the tautologies (2.9) and the symmetries (2.7). For the sake of convenience, we restrict our attention to the case $k$ and $p$ even, and replace $L$ in (2.8) and (2.11) by $L+1$. The other choices for the parity of $k$ and $p$ follow in analogous manner, and the details will be omitted.

Rewriting the right-hand side of (2.11) by replacing $L$ by $L+1$, using the even parity of $k$ and $p$, and replacing $p$ by $k-M$, gives
\[
\begin{aligned}
& \sum_{\substack{m=0 \\ m \text{ even}}}^{M} q^{mL} \begin{bmatrix} L \\ a - \frac12(m+M) \end{bmatrix}_k^{(m)} + \sum_{\substack{m=1 \\ m \text{ odd}}}^{M-1} q^{mL} \begin{bmatrix} L \\ kL - a - \frac12(m-M+1) \end{bmatrix}_k^{(m)} \\
& \quad + \sum_{\substack{m=M+2 \\ m \text{ even}}}^{k} q^{\frac12((2L+1)M-m)} \begin{bmatrix} L \\ a - \frac12(m+M) \end{bmatrix}_k^{(m)} \\
& \quad + \sum_{\substack{m=M+1 \\ m \text{ odd}}}^{k-1} q^{k(L+1) + \frac12((2L+3)M-m+1) - 2a} \begin{bmatrix} L \\ k(L+1) - a - \frac12(m-M-1) \end{bmatrix}_k^{(m)}.
\end{aligned} \tag{A.7}
\]
The proof that this equals the right-hand side of (2.8) (with $L$ replaced by $L+1$ and $p$ by $k-M$) breaks up into two independent steps, both of which will be given as a lemma. First, we have

Lemma 3. The top line of Eq. (A.7) equals
\[
\sum_{m=0}^{M} q^{mL} \begin{bmatrix} L \\ a-m \end{bmatrix}_k^{(m)}. \tag{A.8}
\]
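Second,

Lemma 4. The bottom two lines of Eq. (A.7) equal
\[
\sum_{m=M+1}^{k} q^{(L+1)M-m} \begin{bmatrix} L \\ a-m \end{bmatrix}_k^{(m)}. \tag{A.9}
\]
Clearly, application of these two lemmas immediately establishes the wanted result. At the core of the proof of both lemmas is yet another result, which can be stated as

Lemma 5. For $M$ even and $\ell = 0, \dots, \frac12 M$, the following function is independent of $\ell$:
\[
\begin{aligned}
F_\ell(M,a) = {} & \sum_{m=M-\ell}^{M} q^{mL} \begin{bmatrix} L \\ a-m \end{bmatrix}_k^{(m)} + \sum_{m=0}^{\ell-1} q^{mL} \begin{bmatrix} L \\ kL-a-m+\frac12 M-1 \end{bmatrix}_k^{(m)} \\
& + \sum_{\substack{m=\ell \\ m \equiv \ell \ (\mathrm{mod}\ 2)}}^{M-\ell-2} q^{mL} \Biggl\{ \begin{bmatrix} L \\ a - \frac12(m+M-\ell) \end{bmatrix}_k^{(m)} + q^{L} \begin{bmatrix} L \\ kL - a - \frac12(m-M+\ell) - 1 \end{bmatrix}_k^{(m+1)} \Biggr\}.
\end{aligned}
\]
The proof of this is simple. First we apply the tautology (2.9) to the term within the curly braces, yielding
\[
\begin{aligned}
F_\ell(M,a) = {} & \sum_{m=M-\ell}^{M} q^{mL} \begin{bmatrix} L \\ a-m \end{bmatrix}_k^{(m)} + \sum_{m=0}^{\ell-1} q^{mL} \begin{bmatrix} L \\ kL-a-m+\frac12 M-1 \end{bmatrix}_k^{(m)} \\
& + \sum_{\substack{m=\ell \\ m \equiv \ell \ (\mathrm{mod}\ 2)}}^{M-\ell-2} q^{mL} \Biggl\{ \begin{bmatrix} L \\ kL - a - \frac12(m-M+\ell) - 1 \end{bmatrix}_k^{(m)} + q^{L} \begin{bmatrix} L \\ a - \frac12(m+M-\ell) \end{bmatrix}_k^{(m+1)} \Biggr\}.
\end{aligned} \tag{A.10}
\]
After separating the $m = \ell$ term in the first and the $m = M-\ell-2$ term in the second term within the curly braces, we change the summation variable $m \to m-2$ in the sum over the second term within the braces. This results in
\[
\begin{aligned}
F_\ell(M,a) = {} & \sum_{m=M-\ell-1}^{M} q^{mL} \begin{bmatrix} L \\ a-m \end{bmatrix}_k^{(m)} + \sum_{m=0}^{\ell} q^{mL} \begin{bmatrix} L \\ kL-a-m+\frac12 M-1 \end{bmatrix}_k^{(m)} \\
& + \sum_{\substack{m=\ell+1 \\ m \equiv \ell+1 \ (\mathrm{mod}\ 2)}}^{M-\ell-3} q^{mL} \Biggl\{ \begin{bmatrix} L \\ a - \frac12(m+M-\ell-1) \end{bmatrix}_k^{(m)} + q^{L} \begin{bmatrix} L \\ kL - a - \frac12(m-M+\ell+1) - 1 \end{bmatrix}_k^{(m+1)} \Biggr\} \\
= {} & F_{\ell+1}(M,a).
\end{aligned} \tag{A.11}
\]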
The proof of Lemmas 3 and 4 readily follows from Lemma 5. To prove Lemma 3, note that the top-line of (A.9) is nothing but $F_0(M)$. Since this is equal to $F_{\frac{1}{2}M}(M)$, we get
\[
\text{top-line of (A.9)} = \sum_{m=\frac{1}{2}M}^{M} q^{mL} \begin{bmatrix} L \\ a-m \end{bmatrix}_k^{(m)}
+ \sum_{m=0}^{\frac{1}{2}M-1} q^{mL} \begin{bmatrix} L \\ kL-a-m+\frac{1}{2}M-1 \end{bmatrix}_k^{(m)}. \tag{A.12}
\]
Applying Eq. (A.5), with $M$ replaced by $\frac{1}{2}M-1$, to the second sum, we simplify to Eq. (A.9), thus proving our lemma.

To prove Lemma 4, we apply the first symmetry relation of (2.7) to all q-multinomials in the bottom-two lines of (A.9). After changing $m \to k-m$ in the second line and $m \to k-m-1$ in the third line, the last two lines of (A.9) combine to
\[
f_a \sum_{\substack{m=0 \\ m \ \mathrm{even}}}^{k-M-2} q^{mL} \left\{ \begin{bmatrix} L \\ kL-a-\frac{1}{2}(m-M-k) \end{bmatrix}_k^{(m)} + q^{L} \begin{bmatrix} L \\ a-\frac{1}{2}(m+M+k)-1 \end{bmatrix}_k^{(m+1)} \right\}, \tag{A.13}
\]
with $f_a = q^{(L+1)M-a}$. This we recognize as
\[
f_a \left\{ F_0(k-M,\, kL+k-a) - q^{(k-M)L} \begin{bmatrix} L \\ kL-a+M \end{bmatrix}_k^{(k-M)} \right\}. \tag{A.14}
\]
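Replacing the first term by $F_{\frac{1}{2}(k-M)}(k-M,\, kL+k-a)$ gives
\[
f_a \sum_{m=0}^{\frac{1}{2}(k-M)-1} q^{mL} \begin{bmatrix} L \\ a-\frac{1}{2}(M+k)-m-1 \end{bmatrix}_k^{(m)}
+ f_a \sum_{m=\frac{1}{2}(k-M)}^{k-M-1} q^{mL} \begin{bmatrix} L \\ kL-a-m+k \end{bmatrix}_k^{(m)}. \tag{A.15}
\]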
Applying Eq. (A.5), with $M$ replaced by $\frac{1}{2}(k-M)-1$ and $a$ by $a-\frac{1}{2}(M+k)-1$, to the first sum, this simplifies to
\[
f_a \sum_{m=0}^{k-M-1} q^{mL} \begin{bmatrix} L \\ kL-a-m+k \end{bmatrix}_k^{(m)}. \tag{A.16}
\]
Finally, using the symmetry (2.7), recalling the definition of $f_a$ and changing $m \to k-m$, we get Eq. (A.10).

Note added in proof. A generalization of Conjecture 1, involving q-supernomial coefficients, has been proven in [46]. Conjecture 2 has been established in both [47] and [48].
References

1. Rogers, L.J.: Second memoir on the expansion of certain infinite products. Proc. Lond. Math. Soc. 25, (1894) 318–343
2. Rogers, L.J.: On two theorems of combinatory analysis and some allied identities. Proc. Lond. Math. Soc. (2) 16, (1917) 315–336
3. Rogers, L.J.: Proof of certain identities in combinatory analysis. Proc. Cambridge Phil. Soc. 19, (1919) 211–214
4. Ramanujan, S.: Proof of certain identities in combinatory analysis. Proc. Cambridge Phil. Soc. 19, (1919) 214–216
5. Schur, I.J.: Ein Beitrag zur additiven Zahlentheorie und zur Theorie der Kettenbrüche. S.-B. Preuss. Akad. Wiss. Phys.-Math. Kl. (1917) 302–321
6. Andrews, G.E.: The Theory of Partitions. Reading, Massachusetts: Addison-Wesley, 1976
7. Lepowsky, J., Primc, M.: Structure of the standard modules for the affine Lie algebra $A_1^{(1)}$. Contemp. Math. 46, Providence: AMS, 1985
8. Baxter, R.J.: Rogers–Ramanujan identities in the hard hexagon model. J. Stat. Phys. 26, (1981) 427–452
9. Kedem, R., McCoy, B.M.: Construction of modular branching functions from Bethe’s equations in the 3-state Potts chain. J. Stat. Phys. 71, (1993) 865–901
10. Dasmahapatra, S., Kedem, R., McCoy, B.M., Melzer, E.: Virasoro characters from Bethe’s equations for the critical ferromagnetic three-state Potts model. J. Stat. Phys. 74, (1994) 239–274
11. Kedem, R., Klassen, T.R., McCoy, B.M., Melzer, E.: Fermionic quasiparticle representations for characters of $G_1^{(1)} \times G_1^{(1)} / G_2^{(1)}$. Phys. Lett. 304B, (1993) 263–270
12. Kedem, R., Klassen, T.R., McCoy, B.M., Melzer, E.: Fermionic sum representations for conformal field theory characters. Phys. Lett. 307B, (1993) 68–76
13. Rocha-Caridi, A.: Vacuum vector representation of the Virasoro algebra. In: Lepowsky, J., Mandelstam, S., Singer, I.M. (eds.) Vertex Operators in Mathematics and Physics. Berlin: Springer, 1985
14. Andrews, G.E.: A polynomial identity which implies the Rogers–Ramanujan identities. Scripta Math. 28, (1970) 297–305
15. Melzer, E.: Fermionic character sums and the corner transfer matrix. Int. J. Mod. Phys. A 9, (1994) 1115–1136
16. Berkovich, A.: Fermionic counting of RSOS-states and Virasoro character formulas for the unitary minimal series $M(\nu, \nu+1)$. Exact results. Nucl. Phys. B 431, (1994) 315–348
17. Foda, O., Quano, Y.-H.: Polynomial identities of the Rogers–Ramanujan type. Int. J. Mod. Phys. A 10, (1995) 2291–2315
18. Foda, O.: On a polynomial identity which implies the Gordon–Andrews identities: a bijective proof. Preprint University of Melbourne No. 27–94
19. Kirillov, A.N.: Dilogarithm identities. Prog. Theor. Phys. Suppl. 118, (1995) 61–142
20. Warnaar, S.O., Pearce, P.A.: Exceptional structure of the dilute $A_3$ model: $E_8$ and $E_7$ Rogers–Ramanujan identities. J. Phys. A: Math. Gen. 27, (1994) L891–L897
21. Berkovich, A., McCoy, B.M.: Continued fractions and fermionic representations for characters of $M(p, p')$ minimal models. Lett. Math. Phys. 37, (1996) 49–66
22. Foda, O., Warnaar, S.O.: A bijection which implies Melzer’s polynomial identities: the $\chi_{1,1}^{(p,p+1)}$ case. Lett. Math. Phys. 36, (1996) 145–155
23. Warnaar, S.O.: Fermionic solution of the Andrews–Baxter–Forrester model I: unification of CTM and TBA methods. J. Stat. Phys. 82, (1996) 657–685
24. Foda, O., Okado, M., Warnaar, S.O.: A proof of polynomial identities of type $\widehat{sl(n)}_1 \otimes \widehat{sl(n)}_1 / \widehat{sl(n)}_2$. J. Math. Phys. 37, (1996) 965–986
25. Berkovich, A., McCoy, B.M., Orrick, W.P.: Polynomial identities, indices, and duality for the $N = 1$ superconformal model $SM(2, 4\nu)$. J. Stat. Phys. 83, (1996) 795–837
26. Schilling, A.: Polynomial fermionic forms for the branching functions of the rational coset conformal field theories $\widehat{su(2)}_M \times \widehat{su(2)}_N / \widehat{su(2)}_{N+M}$. Nucl. Phys. B 459, (1996) 393–436
27. Warnaar, S.O.: Fermionic solution of the Andrews–Baxter–Forrester model II: proof of Melzer’s polynomial identities. J. Stat. Phys. 84, (1996) 49–83
28. Berkovich, A., McCoy, B.M.: Generalizations of the Andrews–Bressoud identities for the $N = 1$ superconformal model $SM(2, 2\nu)$. Preprint BONN-TH-95-15, ITPSB 95-29, hep-th/9508110. To appear in Int. J. of Math. and Comp. Modelling
29. Schilling, A.: Multinomials and polynomial bosonic forms for the branching functions of the $\widehat{su(2)}_M \times \widehat{su(2)}_N / \widehat{su(2)}_{N+M}$ conformal coset models. Nucl. Phys. B 467, (1996) 247–271
30. Dasmahapatra, S.: On the combinatorics of row and corner transfer matrices of the $A_{n-1}^{(1)}$ restricted face models. Preprint CMPS/95-114, hep-th/9512095
31. Gordon, B.: A combinatorial generalization of the Rogers–Ramanujan identities. Am. J. Math. 83, (1961) 393–399
32. Andrews, G.E.: An analytic generalization of the Rogers–Ramanujan identities for odd moduli. Proc. Nat. Acad. Sci. USA 71, (1974) 4082–4085
33. Andrews, G.E., Baxter, R.J.: Lattice gas generalization of the hard hexagon model. III. q-trinomial coefficients. J. Stat. Phys. 47, (1987) 297–330
34. Date, E., Jimbo, M., Kuniba, A., Miwa, T., Okado, M.: Exactly solvable SOS models. Local height probabilities and theta function identities. Nucl. Phys. B 290 [FS20], (1987) 231–273
35. Andrews, G.E., Baxter, R.J., Forrester, P.J.: Eight-vertex SOS model and generalized Rogers–Ramanujan-type identities. J. Stat. Phys. 35, (1984) 193–266
36. Andrews, G.E.: Schur’s theorem, Capparelli’s conjecture and q-trinomial coefficients. Contemp. Math. 166, (1994) 141–154
37. Andrews, G.E.: Partitions and Durfee dissection. Am. J. Math. 101, (1979) 735–742
38. Alder, H.L.: Partition identities – from Euler to the present. Am. Math. Monthly 76, (1969) 733–764
39. Bressoud, D.: Lattice paths and the Rogers–Ramanujan identities. Lect. Notes in Math. 1395, Berlin-Heidelberg-New York: Springer Verlag, 1987, pp. 140–172
40. Rösgen, M., Varnhagen, R.: Steps towards lattice Virasoro algebras: su(1,1). Phys. Lett. 350B, (1995) 203–211
41. Andrews, G.E.: Sieves in the theory of partitions. Am. J. Math. 94, (1972) 1214–1230
42. Foda, O., Quano, Y.-H.: Virasoro character identities from the Andrews–Bailey construction. Preprint University of Melbourne No. 26–94, hep-th/9408086. To appear in Int. J. Mod. Phys. A
43. Kastor, D., Martinec, E., Qiu, Z.: Current algebra and conformal discrete series. Phys. Lett. 200B, (1988) 434–440
44. Bagger, J., Nemeshansky, D., Yankielowicz, S.: Virasoro algebras with central charge $c > 1$. Phys. Rev. Lett. 60, (1988) 389–392
45. Agarwal, A.K., Andrews, G.E., Bressoud, D.: The Bailey lattice. J. Indian Math. Soc. 51, (1987) 57–73
46. Andrews, G.E.: Multiple series Rogers–Ramanujan type identities. Pac. J. Math. 114, (1984) 267–283
47. Schilling, A., Warnaar, S.O.: Supernomial coefficients, polynomial identities and q-series. Preprint ITPSB-97-03, University of Melbourne No. 01-97, q-alg/9701007
48. Berkovich, A., McCoy, B.M., Schilling, A., Warnaar, S.O.: Bailey flows and Bose–Fermi identities for the conformal coset models $(A_1^{(1)})_N \times (A_1^{(1)})_{N'} / (A_1^{(1)})_{N+N'}$. Preprint ITP-SB-97-02, University of Melbourne preprint 04-97, hep-th/9702026

Communicated by M. Jimbo
Commun. Math Phys. 184, 233 – 250 (1997)
Communications in Mathematical Physics
© Springer-Verlag 1997

Phase Space Bounds for Quantum Mechanics on a Compact Lie Group

Brian C. Hall
McMaster University, Department of Mathematics, Hamilton, ON, Canada L8S-4K1. E-mail: [email protected]

Received: 9 July 1996 / Accepted: 9 September 1996
Abstract: Let $K$ be a compact, connected Lie group and $K_{\mathbb{C}}$ its complexification. I consider the Hilbert space $\mathcal{H}L^2(K_{\mathbb{C}}, \nu_t)$ of holomorphic functions introduced in [H1], where the parameter $t$ is to be interpreted as Planck’s constant. In light of [L-S], the complex group $K_{\mathbb{C}}$ may be identified canonically with the cotangent bundle of $K$. Using this identification I associate to each $F \in \mathcal{H}L^2(K_{\mathbb{C}}, \nu_t)$ a “phase space probability density.” The main result of this paper is Theorem 1, which provides an upper bound on this density which holds uniformly over all $F$ and all points in phase space. Specifically, the phase space probability density is at most $a_t (2\pi t)^{-n}$, where $n = \dim K$ and $a_t$ is a constant which tends to one exponentially fast as $t$ tends to zero. At least for small $t$, this bound cannot be significantly improved. With $t$ regarded as Planck’s constant, the quantity $(2\pi t)^{-n}$ is precisely what is expected on physical grounds. Theorem 1 should be interpreted as a form of the Heisenberg uncertainty principle for $K$, that is, a limit on the concentration of states in phase space. The theorem supports the interpretation of the Hilbert space $\mathcal{H}L^2(K_{\mathbb{C}}, \nu_t)$ as the phase space representation of quantum mechanics for a particle with configuration space $K$. The phase space bound is deduced from very sharp pointwise bounds on functions in $\mathcal{H}L^2(K_{\mathbb{C}}, \nu_t)$ (Theorem 2). The proofs rely on precise calculations involving the heat kernel on $K$ and the heat kernel on $K_{\mathbb{C}}/K$.

Table of Contents
1 Introduction
2 Preliminaries
3 The Complex Structure on Phase Space
4 Phase Space Bounds
5 Appendix
1. Introduction

The classical Segal-Bargmann space [B, Se1-3] is the space of holomorphic functions $F$ on $\mathbb{C}^n$ satisfying
\[
\|F\|_t^2 \equiv \int_{\mathbb{C}^n} |F(z)|^2 \, \nu_t(z) \, dz < \infty,
\]
where
\[
\nu_t(z) = (\pi t)^{-n/2} e^{-(\operatorname{Im} z)^2/t}.
\]
Here $t$ is a positive parameter. (This is the “invariant” form of the Segal-Bargmann space in which the measure is constant in the real directions. See the appendix in [H1] for its relationship to other forms.) We will denote this space $\mathcal{H}L^2(\mathbb{C}^n, \nu_t)$, where $\mathcal{H}$ indicates holomorphic.

I wish to interpret $\mathcal{H}L^2(\mathbb{C}^n, \nu_t)$ as the “phase space Hilbert space” for quantum mechanics of a particle moving in $\mathbb{R}^n$. In this case, $t$ is to be interpreted as Planck’s constant ($\hbar$). There is a natural unitary map, called the Segal-Bargmann transform, which connects this phase space Hilbert space to the customary “configuration space Hilbert space” $L^2(\mathbb{R}^n, dx)$. However, the transform is not directly relevant to the present paper. A phase space Hilbert space is a natural and useful setting for semiclassical analysis [V, P-U, G-P, T-W, C].

If we normalize $F \in \mathcal{H}L^2(\mathbb{C}^n, \nu_t)$ so that $\|F\|_t = 1$, then
\[
\int_{\mathbb{C}^n} |F(z)|^2 \, \nu_t(z) \, dz = 1.
\]
The quantity $|F(z)|^2 \nu_t(z)$ is to be interpreted as a sort of “phase space probability density.” Although other definitions of the phase density are possible, this one is natural in many respects. (See [H3].) The results of Bargmann [B, (1.7)], adapted to our normalization, show that
\[
|F(z)|^2 \, \nu_t(z) \le (2\pi t)^{-n} \tag{1}
\]
for all $F \in \mathcal{H}L^2(\mathbb{C}^n, \nu_t)$ with $\|F\|_t = 1$ and for all $z \in \mathbb{C}^n$. The quantity $(2\pi t)^n = (2\pi\hbar)^n$ is the volume of a semiclassical cell in phase space. Thus (1) tells us that if $E$ is a region of phase space whose volume is $p$ times the volume of a cell, then the particle has probability at most $p$ of being in $E$. This is a form of the Heisenberg uncertainty principle. The fact that the right side of (1) is independent of $z$ reflects the fact that the group of translations of $\mathbb{C}^n$ acts in a projective unitary fashion on $\mathcal{H}L^2(\mathbb{C}^n, \nu_t)$. (See [B, (3.5)].)

The purpose of this paper is to prove a similar result for a particle whose configuration space is an arbitrary connected compact Lie group $K$. In [H1] I construct an analog on $K$ of the Segal-Bargmann transform. (See also [H2, D, D-G, G-M, A, Hi1-2].) Let $K_{\mathbb{C}}$ denote the complexification of $K$ (Sect. 2). The range of the generalized Segal-Bargmann transform is $\mathcal{H}L^2(K_{\mathbb{C}}, \nu_t)$, that is, the space of holomorphic functions $F$ on $K_{\mathbb{C}}$ for which
\[
\|F\|_t^2 \equiv \int_{K_{\mathbb{C}}} |F(g)|^2 \, \nu_t(g) \, dg < \infty.
\]
Here $dg$ is Haar measure on $K_{\mathbb{C}}$ and $\nu_t$ is (Sect. 2) the heat kernel on $K_{\mathbb{C}}/K$, viewed as a $K$-invariant function on $K_{\mathbb{C}}$. (More precisely, this space is the image of the $K$-invariant form $C_t$ of the generalized Segal-Bargmann transform [H1, Thm. 2].) I wish to interpret
$\mathcal{H}L^2(K_{\mathbb{C}}, \nu_t)$ as the phase space Hilbert space for a quantum particle with configuration space $K$.

The usual phase space for a particle with configuration space $K$ is the cotangent bundle of $K$, $T^*(K)$. In Sect. 3, we will discover a canonical diffeomorphism $\Phi$ between $T^*(K)$ and the complex group $K_{\mathbb{C}}$, obtained by means of the results of Lempert and Szöke [L-S, Sz1-2] or the largely equivalent results of Guillemin and Stenzel [G-S1-2]. For each $F \in \mathcal{H}L^2(K_{\mathbb{C}}, \nu_t)$ with $\|F\|_t = 1$, the associated phase space probability density is
\[
|F(g)|^2 \, \nu_t(g) \, \sigma(g),
\]
where $\sigma$ is the “Jacobian” of $\Phi$. Let $n = \dim K$. The main result of this paper (Theorem 1) is that for any $F$ in $\mathcal{H}L^2(K_{\mathbb{C}}, \nu_t)$ with $\|F\|_t = 1$, the phase space probability density satisfies
\[
|F(g)|^2 \, \nu_t(g) \, \sigma(g) \le a_t (2\pi t)^{-n}, \tag{2}
\]
where $a_t$ is a constant that tends to one exponentially fast as $t$ tends to zero. In particular, for each fixed $t$ there is a bound on the phase space probability density that holds uniformly over all $F$ and all points in phase space. I prove that, at least for small $t$, the bound (2) cannot be substantially improved.

The optimal bound for the left side of (2) is a non-constant function of $g$, given in (6) below. This non-constancy reflects the fact that $T^*(K)$ is less symmetric than $\mathbb{C}^n$. Unless $K$ is commutative there is no obvious transitive group of canonical transformations of $T^*(K)$; in particular, the symplectic structure on $K_{\mathbb{C}}$ obtained via $\Phi$ is neither left- nor right-invariant. Nevertheless, the right side of (6) is nearly constant. According to Theorem 1, it is bounded above by a constant for all $t$ and bounded below by a constant for small $t$, and the ratio of the upper and lower bounds tends to one exponentially fast as $t$ tends to zero.

Theorem 1 supports the view that $\mathcal{H}L^2(K_{\mathbb{C}}, \nu_t)$ is the “right” phase space Hilbert space for a quantum particle with configuration space $K$. This view is also supported by the inversion formula in [H2], which says (roughly) that the configuration space wave function can be obtained from the phase space wave function by integrating over the momentum variables.

As explained in Sect. 4, the phase space density is bounded by the product of three quantities: the function $\nu_t$, the “Jacobian” of the map $\Phi$, and a certain analytic continuation of the heat kernel on $K$. Gangolli [G] gives an exact formula for $\nu_t$, the Jacobian of $\Phi$ can be computed exactly, and the analytic continuation of the heat kernel on $K$ can be estimated by analyzing the Poisson summation formula of Urakawa [U]. When we multiply a miracle occurs: everything cancels except for the physically expected quantity $(2\pi t)^{-n}$, times a function which tends to one uniformly as $t$ tends to zero.
The miraculous nature of these cancellations suggests that some more general principle is at work. The results of [H1] and [L-S] carry over to the case of compact symmetric spaces. (See [H1, Sect. 11] and [Sz1, Thm. 2.5].) However, the present paper relies on heat kernel formulas which hold only in the group case. I conjecture that some analog of Theorem 1 holds for general compact symmetric spaces. I thank Ping Feng for helping me to understand the map Φ and Chris Herald for inspiring me to use the Fourier transform in the proof of Proposition 3.
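As a concrete illustration of the flat-space bound (1) in the case $n = 1$, one can take the one-parameter family of Gaussians $F_s(z) = c_s e^{-z^2/(2s)}$ with $s > t$. This family is chosen here purely for illustration and is not taken from the references. Assuming $dz$ denotes Lebesgue measure $dx\,dy$, a direct Gaussian integration gives $c_s^2 = s^{-1}\sqrt{(s-t)/\pi}$, so the normalized density $|F_s(z)|^2 \nu_t(z)$ peaks at $z = 0$ with value $\pi^{-1}\sqrt{(s-t)/(t s^2)}$. The sketch below (the function names `peak_density` and `bargmann_bound` are hypothetical) evaluates this closed form and compares it with $(2\pi t)^{-1}$.

```python
import math


def peak_density(s: float, t: float) -> float:
    """Peak of |F_s(z)|^2 nu_t(z) for the unit-norm Gaussian F_s(z) = c_s exp(-z^2/(2s)), n = 1.

    Closed form obtained by Gaussian integration; s > t is needed for F_s to have finite norm.
    """
    if not s > t > 0:
        raise ValueError("need s > t > 0")
    return math.sqrt((s - t) / (t * s * s)) / math.pi


def bargmann_bound(t: float) -> float:
    """Right side of (1) for n = 1, i.e. 1 / (2 pi t)."""
    return 1.0 / (2.0 * math.pi * t)


t = 0.5
for s in (0.6, 0.8, 1.0, 1.5, 3.0):
    ratio = peak_density(s, t) / bargmann_bound(t)
    print(f"s = {s:4.1f}   peak density / bound = {ratio:.6f}")
```

Within this family the ratio equals $2\sqrt{t(s-t)}/s$, which never exceeds one and reaches one exactly at $s = 2t$, so the bound (1) is attained there.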
2. Preliminaries

The setup is as follows. We let $K$ be an arbitrary compact connected Lie group with Lie algebra $\mathfrak{k}$. We fix an inner product $\langle\,\cdot\,,\cdot\,\rangle$ on $\mathfrak{k}$ which is invariant under the adjoint action of $K$. This inner product determines a bi-invariant Riemannian metric on $K$. We will let $dx$ denote Haar measure on $K$ normalized to coincide with Riemannian volume measure. With this normalization the volume of $K$ need not equal one. Let $\Delta$ denote the Laplace-Beltrami operator associated with this Riemannian metric. The heat kernel $\rho_t$ at the identity on $K$ is defined by the conditions that $\rho_t$ satisfy the heat equation
\[
\frac{d\rho_t}{dt} = \frac{1}{2} \Delta \rho_t
\]
and that
\[
\lim_{t \to 0} \int_K f(x) \, \rho_t(x) \, dx = f(e)
\]
for all continuous functions $f$ on $K$. For each $t > 0$, the heat kernel is a $C^\infty$, strictly positive function on $K$ which satisfies $\int_K \rho_t(x) \, dx = 1$.

Let $K_{\mathbb{C}}$ be the complexification of $K$. (See [H1, Sect. 3] for the definition.) Then $K_{\mathbb{C}}$ is a connected complex Lie group whose Lie algebra $\mathfrak{k}_{\mathbb{C}}$ is the complexification of $\mathfrak{k}$, and which contains $K$ as a subgroup. For example, if $K = SU(n)$, then $K_{\mathbb{C}} = SL(n; \mathbb{C})$. The inner product on $\mathfrak{k}$ extends to a real-valued inner product on $\mathfrak{k}_{\mathbb{C}}$ satisfying
\[
\langle X_1 + iY_1, X_2 + iY_2 \rangle = \langle X_1, X_2 \rangle + \langle Y_1, Y_2 \rangle
\]
for $X_k, Y_k \in \mathfrak{k}$. This inner product determines a left-invariant Riemannian metric on $K_{\mathbb{C}}$. We will let $dg$ denote Haar measure on $K_{\mathbb{C}}$ normalized to coincide with Riemannian volume measure. As proved in [H1, Sect. 4], the heat kernel $\rho_t$ has a unique analytic continuation from $K$ to $K_{\mathbb{C}}$. The “reproducing kernel” described in Sect. 4 is expressed in terms of the analytic continuation of $\rho_t$.

The quotient space $K_{\mathbb{C}}/K$ is a manifold with a transitive left action of $K_{\mathbb{C}}$. The tangent space to $K_{\mathbb{C}}/K$ at the identity coset can be thought of as $i\mathfrak{k} \subset \mathfrak{k}_{\mathbb{C}}$. There exists a unique $K_{\mathbb{C}}$-invariant Riemannian structure on $K_{\mathbb{C}}/K$ which at the identity agrees with our inner product on $i\mathfrak{k} \subset \mathfrak{k}_{\mathbb{C}}$. We will let $\nu_t$ be the solution to the equation
\[
\frac{d\nu_t}{dt} = \frac{1}{4} \Delta \nu_t
\]
subject to the condition that
\[
\lim_{t \to 0} \int_{K_{\mathbb{C}}/K} f(m) \, \nu_t(m) \, dm = f([e])
\]
for all continuous functions $f$ of compact support. Here $dm$ denotes Riemannian volume measure and $\Delta$ the Laplace-Beltrami operator on $K_{\mathbb{C}}/K$. The function $\nu_t$ is positive and $C^\infty$ and satisfies $\int \nu_t(m) \, dm = 1$. We will think of $\nu_t$ as a right-$K$-invariant function on $K_{\mathbb{C}}$, one which turns out to be left-$K$-invariant as well. The normalization of $\nu_t$ as a function on $K_{\mathbb{C}}$ is
\[
\int_{K_{\mathbb{C}}} \nu_t(g) \, dg = \operatorname{Vol}(K).
\]
This is proved in Lemma 6 in Sect. 4. This normalization guarantees that $\nu_t$ as defined in this paper coincides with $\nu_t$ as defined in [H1, Thm. 2], since the function $\mu_t$ in [H1] integrates to one. An explicit formula for $\nu_t(g)$, due to Gangolli, is given in (11) below. The space $\mathcal{H}L^2(K_{\mathbb{C}}, \nu_t)$ will denote the space of holomorphic functions $F$ on $K_{\mathbb{C}}$ satisfying
\[
\int_{K_{\mathbb{C}}} |F(g)|^2 \, \nu_t(g) \, dg < \infty.
\]
The norm in this space will be denoted $\|F\|_t$. An explicit formula for the measure $\nu_t(g) \, dg$ in natural coordinates is given in Lemma 5.

We will use the standard machinery for compact Lie groups. (See [B-D].) Let $T$ be a maximal torus in $K$, and $\mathfrak{t}$ its Lie algebra. Using the inner product on $\mathfrak{k}$ (restricted to $\mathfrak{t}$) we will identify $\mathfrak{t}^*$ with $\mathfrak{t}$. Let $R \subset \mathfrak{t}$ be the real roots, that is, the non-zero $\alpha$ in $\mathfrak{t}$ for which there exists a non-zero $X \in \mathfrak{k}_{\mathbb{C}}$ with $[H, X] = i \langle \alpha, H \rangle X$ for all $H \in \mathfrak{t}$. Let $R^+$ be a set of positive roots and let $\rho$ be half the sum of the positive roots. Let $W$ be the Weyl group. Let $\Gamma \subset \mathfrak{t}$ be the kernel of the exponential mapping for $\mathfrak{t}$. Let $\pi$ denote the polynomial on $\mathfrak{t}$ given by
\[
\pi(H) = \prod_{\alpha \in R^+} \langle \alpha, H \rangle.
\]
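To make the root-theoretic notation concrete, the following sketch works in the rank-two case $K = SU(3)$, where $\mathfrak{t}$ may be identified with $\mathbb{R}^2$. The coordinates and the normalization $|\alpha|^2 = 2$ are assumptions chosen for illustration (the length of the roots depends on the choice of invariant inner product), and the helper names are hypothetical. It computes $\pi(H)$ and the half-sum $\rho$, and checks numerically that $\pi$ changes sign under the reflection in each root hyperplane.

```python
import math

# Positive roots of SU(3) (type A2) in t ~ R^2, normalized so that |alpha|^2 = 2 (an assumption).
A1 = (math.sqrt(2.0), 0.0)
A2 = (-math.sqrt(2.0) / 2.0, math.sqrt(6.0) / 2.0)
A3 = (A1[0] + A2[0], A1[1] + A2[1])  # alpha_1 + alpha_2
POSITIVE_ROOTS = (A1, A2, A3)


def dot(u, v):
    """Euclidean inner product on R^2, standing in for <.,.> restricted to t."""
    return u[0] * v[0] + u[1] * v[1]


def pi_poly(H):
    """pi(H) = product over the positive roots of <alpha, H>."""
    out = 1.0
    for alpha in POSITIVE_ROOTS:
        out *= dot(alpha, H)
    return out


def reflect(alpha, H):
    """Weyl reflection of H in the hyperplane <alpha, H> = 0."""
    c = 2.0 * dot(alpha, H) / dot(alpha, alpha)
    return (H[0] - c * alpha[0], H[1] - c * alpha[1])


# rho = half the sum of the positive roots; for A2 this equals alpha_1 + alpha_2.
RHO = tuple(0.5 * sum(a[i] for a in POSITIVE_ROOTS) for i in range(2))

H = (0.37, 1.21)
for alpha in POSITIVE_ROOTS:
    # Each printed value should vanish up to rounding: pi(s_alpha H) = -pi(H).
    print(pi_poly(reflect(alpha, H)) + pi_poly(H))
```

Since the Weyl group is generated by these reflections, the sign change under each of them is exactly the statement that $\pi$ is alternating.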
In light of [B, V.(4.10)], $\pi$ is alternating with respect to the action of the Weyl group.

We will use the polar decomposition for $K_{\mathbb{C}}$, which states that every $g \in K_{\mathbb{C}}$ can be written uniquely in the form $g = x e^{iY}$, with $x \in K$ and $Y \in \mathfrak{k}$. In fact, the map $\Phi : K \times \mathfrak{k} \to K_{\mathbb{C}}$ given by $\Phi(x, Y) = x e^{iY}$ is a diffeomorphism. (See the proof of Lemma 12 in [H1, Sect. 11].) Since every $Y \in \mathfrak{k}$ can be moved into $\mathfrak{t}$ by the adjoint action of $K$, every $g \in K_{\mathbb{C}}$ can be written as $g = x e^{iH} y$, with $x, y \in K$ and $H \in \mathfrak{t}$. While this decomposition is not unique, $H$ is unique up to the action of the Weyl group.

3. The Complex Structure on Phase Space

A phase space probability density should be a positive function on phase space (that is, on the cotangent bundle $T^*(K)$) which integrates to one with respect to the natural phase volume measure. In Sect. 4 we will associate such a probability density to each $F \in \mathcal{H}L^2(K_{\mathbb{C}}, \nu_t)$ with $\|F\|_t = 1$. The probability density depends on an identification of $K_{\mathbb{C}}$ with $T^*(K)$. In this section we will discover the “right” such identification.

We may identify the cotangent bundle $T^*(K)$ with $K \times \mathfrak{k}^*$ by means of left-translation, and then with $K \times \mathfrak{k}$ by means of the inner product on $\mathfrak{k}$. (Under this identification, the phase volume measure is simply Haar measure on $K$ times Lebesgue measure on $\mathfrak{k}$. See Lemma 4.) We then use the diffeomorphism $\Phi : K \times \mathfrak{k} \to K_{\mathbb{C}}$ of Sect. 2 given by
\[
\Phi(x, Y) = x e^{iY}, \qquad x \in K, \ Y \in \mathfrak{k}.
\]
Physically, $x$ represents position and $Y$ momentum. Since we are identifying $T^*(K)$ with $K \times \mathfrak{k}$, we will regard $\Phi$ as a map from $T^*(K)$ to $K_{\mathbb{C}}$.

The map $\Phi$ is natural in several respects. First, it takes the obvious copy of $K$ in $T^*(K)$ to the obvious copy of $K$ in $K_{\mathbb{C}}$, and it intertwines the action of $K \times K$ on $T^*(K)$ with the action of $K \times K$ on $K_{\mathbb{C}}$. Second, if you use $\Phi$ to transfer the complex structure
of $K_{\mathbb{C}}$ back to $T^*(K)$, this complex structure fits together with the symplectic structure of $T^*(K)$ to give you a Kähler manifold. (More on this below.) These two conditions already severely constrain what $\Phi$ can be. Third, there is a canonical “adapted” complex structure $J$ on $T^*(K)$, and, as explained below, $\Phi$ is the unique biholomorphism of $(T^*(K), J)$ with $K_{\mathbb{C}}$ which restricts to the identity map of $K \subset T^*(K)$ onto $K \subset K_{\mathbb{C}}$. Last, the Jacobian of $\Phi$ comes out in precisely the right way to give the physically natural bounds on the phase space probability density.

The map $\Phi$ is a diffeomorphism of the symplectic manifold $T^*(K)$ with the complex manifold $K_{\mathbb{C}}$. We may use $\Phi$ to transfer the complex structure of $K_{\mathbb{C}}$ to a complex structure $J$ on $T^*(K)$. The resulting complex symplectic manifold is in fact a Kähler manifold. This means that $\omega(JX, JY) = \omega(X, Y)$ and that $\omega(X, JX) \ge 0$ for all tangent vectors $X$ and $Y$, where $\omega$ is the canonical 2-form on $T^*(K)$. While the Kählerness of $(T^*(K), J, \omega)$ can be proved directly by differentiating $\Phi$ as in Sect. 4, the result also follows from the results of Lempert-Szöke and Guillemin-Stenzel [L-S, Sz1-2, G-S1-2] which I now recap briefly.

Let $M$ be a real-analytic Riemannian manifold and $T(M)$ its tangent bundle. Since $M$ is Riemannian, the tangent and cotangent bundles are identified. A complex structure on $T(M)$ is said to be adapted if for each geodesic $\gamma$ in $M$ the map $\tau + i\sigma \mapsto (\gamma(\tau), \sigma \dot{\gamma}(\tau))$ is a holomorphic mapping of $\mathbb{C}$ into $T(M)$. If an adapted complex structure exists, then it is unique. Moreover, in this case if we identify $T^*(M)$ and $T(M)$ using the Riemannian structure, then the symplectic structure of $T^*(M)$ and the adapted complex structure of $T(M)$ fit together to give a Kähler manifold. In general, an adapted complex structure may not exist on all of $T(M)$. If $M$ is compact, then an adapted complex structure exists at least on a tube of some radius. If $M$ is a compact Lie group with a bi-invariant metric, then an adapted complex structure exists on all of $T(M)$.

Now, the geodesics in $K$ (with a bi-invariant metric) are precisely the curves of the form $\gamma(\tau) = x e^{\tau Y}$, for $x \in K$, $Y \in \mathfrak{k}$. If we identify $T(K)$ with $K \times \mathfrak{k}$ via left-translation, then $(\gamma(\tau), \sigma \dot{\gamma}(\tau)) = (x e^{\tau Y}, \sigma Y)$. Thus
\[
\Phi(\gamma(\tau), \sigma \dot{\gamma}(\tau)) = x e^{\tau Y} e^{i\sigma Y} = x e^{(\tau + i\sigma) Y}.
\]
This last expression clearly depends holomorphically on $z = \tau + i\sigma$. Thus the complex structure on $T(K)$ induced by the map $\Phi$ is adapted. Equivalently, if $J$ is the unique adapted complex structure on $T^*(K)$, then $\Phi$ is a biholomorphism of $(T^*(K), J)$ with $K_{\mathbb{C}}$. So the fact that $\Phi$ makes $K_{\mathbb{C}}$ into a Kähler manifold follows from, say, Cor. 5.5 and Thm. 5.6 of [L-S]. (See also [Sz2, Sect. 4].)

4. Phase Space Bounds

I would like to interpret $F \in \mathcal{H}L^2(K_{\mathbb{C}}, \nu_t)$ as the phase space wave function for a quantum particle with configuration space $K$. Such an interpretation would be impossible if $F$ were an arbitrary element of $L^2(K_{\mathbb{C}}, \nu_t)$, for then $F$ could be supported in an arbitrarily small region of phase space, violating the uncertainty principle. Fortunately, $F$ is required to be holomorphic, which, as we shall see, imposes very precise conditions on how concentrated $F$ can be in phase space.

The natural “reference measure” on $K_{\mathbb{C}}$ is not Haar measure but rather the Liouville phase volume measure, which can be thought of as a measure on $K_{\mathbb{C}}$ by means of
the diffeomorphism $\Phi$ between $T^*(K)$ and $K_{\mathbb{C}}$. In terms of the position-momentum coordinates $(x, Y)$, phase volume measure is simply $dx \, dY$, that is, Haar measure in $x$ times Lebesgue measure in $Y$ (Lemma 4). Let $\sigma(g)$ denote the density of Haar measure with respect to phase volume measure (Lemma 5). Then for $F$ with $\|F\|_t = 1$, the quantity
\[
|F(g)|^2 \, \nu_t(g) \, \sigma(g) \tag{3}
\]
is the phase space probability density and integrates to one with respect to phase volume measure.

As on any reasonable $L^2$-space of holomorphic functions, the pointwise evaluation maps $F \mapsto F(g)$ are bounded linear functionals on $\mathcal{H}L^2(K_{\mathbb{C}}, \nu_t)$. Estimates on the norms of these functionals will give us bounds on the density (3). Now, as a consequence of [H1, Thm. 6], “evaluation at $g$” may be computed as
\[
F(g) = \int_{K_{\mathbb{C}}} \overline{\rho_{2t}(h g^*)} \, F(h) \, \nu_t(h) \, dh. \tag{4}
\]
Here $\rho_{2t}$ refers to the analytic continuation of $\rho_{2t}$ from $K$ to $K_{\mathbb{C}}$, and the map $g \mapsto g^*$ is the unique antiholomorphic antiautomorphism of $K_{\mathbb{C}}$ with the property that $g^* = g^{-1}$ for $g \in K$. (In the notation of [H1, Sect. 3], $g^* = \bar{g}^{-1}$.) The function $\rho_{2t}(h g^*)$ is called the reproducing kernel or Bergman kernel. For each $g$, $\rho_{2t}(h g^*)$ is [H1, Thm. 6] holomorphic and square-integrable with respect to $h$. So (4) tells us that the norm of “evaluation at $g$” is equal to the $L^2$ norm of $\rho_{2t}(h g^*)$. But by (4),
\[
\int_{K_{\mathbb{C}}} \overline{\rho_{2t}(h g^*)} \, \rho_{2t}(h g^*) \, \nu_t(h) \, dh = \rho_{2t}(g g^*)
\]
because $\rho_{2t}(h g^*)$ is holomorphic and square-integrable with respect to $h$. So we obtain the bound
\[
|F(g)|^2 \le \rho_{2t}(g g^*) \, \|F\|_t^2. \tag{5}
\]
This bound is sharp in the sense that for each $g$ there is a non-zero $F$ for which equality holds. We will obtain explicit upper bounds (for all $t$) and lower bounds (for small $t$) on the function $\rho_{2t}(g g^*)$.

The pointwise bounds (5) lead immediately to sharp bounds on the phase space probability density (3):
\[
|F(g)|^2 \, \nu_t(g) \, \sigma(g) \le \rho_{2t}(g g^*) \, \nu_t(g) \, \sigma(g) \tag{6}
\]
for all $g$ and all $F$ with $\|F\|_t = 1$. The bound in Theorem 1 follows from the estimates for $\rho_{2t}(g g^*)$ in Theorem 2 together with explicit formulas for $\nu_t$ and $\sigma$.

Theorem 1. Let $n = \dim K$. For each $t > 0$, there exists a constant $a_t$ such that for all $F \in \mathcal{H}L^2(K_{\mathbb{C}}, \nu_t)$ with $\|F\|_t = 1$ the phase space probability density satisfies
\[
|F(g)|^2 \, \nu_t(g) \, \sigma(g) \le a_t (2\pi t)^{-n}
\]
for all $g \in K_{\mathbb{C}}$. For all sufficiently small $t > 0$, there exists a positive constant $b_t$ such that for each $g \in K_{\mathbb{C}}$ there is $F \in \mathcal{H}L^2(K_{\mathbb{C}}, \nu_t)$ with $\|F\|_t = 1$ such that
\[
|F(g)|^2 \, \nu_t(g) \, \sigma(g) \ge b_t (2\pi t)^{-n}.
\]
The optimal constants $a_t$ and $b_t$ satisfy
\[
\lim_{t \to 0} a_t = \lim_{t \to 0} b_t = 1,
\]
and the convergence is exponentially fast.

Theorem 2. Let $n = \dim K$. For each $g \in K_{\mathbb{C}}$, write $g$ in the form $g = x e^{iH} y$, with $x, y \in K$ and $H \in \mathfrak{t}$. Then for each $t > 0$ there exists a constant $a_t$ such that for all $F \in \mathcal{H}L^2(K_{\mathbb{C}}, \nu_t)$ with $\|F\|_t = 1$
\[
|F(g)|^2 \le \rho_{2t}(g g^*) \le a_t \, e^{|\rho|^2 t} (4\pi t)^{-n/2} e^{|H|^2/t} \prod_{\alpha \in R^+} \frac{\langle \alpha, H \rangle}{\sinh \langle \alpha, H \rangle}.
\]
Here $R^+$ is the set of positive roots, and $\rho$ is half the sum of the positive roots. For all sufficiently small $t > 0$, there exists a positive constant $b_t$ such that for each $g \in K_{\mathbb{C}}$ there is $F \in \mathcal{H}L^2(K_{\mathbb{C}}, \nu_t)$ with $\|F\|_t = 1$ such that
\[
|F(g)|^2 = \rho_{2t}(g g^*) \ge b_t \, e^{|\rho|^2 t} (4\pi t)^{-n/2} e^{|H|^2/t} \prod_{\alpha \in R^+} \frac{\langle \alpha, H \rangle}{\sinh \langle \alpha, H \rangle}.
\]
The optimal constants $a_t$ and $b_t$ satisfy
\[
\lim_{t \to 0} a_t = \lim_{t \to 0} b_t = 1,
\]
and the convergence is exponentially fast.

Remarks. 1) If $K$ is commutative or if $K = SU(2)$, then the constant $b_t$ in the preceding theorems exists not just for small times, but for all times $t$. I will point out how this is proved after the end of the proof of Theorem 2. It is reasonable to conjecture that this holds for all $K$.

2) The proof of Theorem 1 relies on a strong similarity between the formula (8) for $\rho_{2t}(g g^*)$ and the formula (11) for $\nu_t(g)$. This similarity is not coincidental. As a consequence of [H2, Thm. 5], $\rho_{2t}(g g^*)$, viewed as a function on $K_{\mathbb{C}}/K$, satisfies the inverse heat equation. In fact it is possible to show that each term in (8) satisfies the inverse heat equation. The $\gamma_0 = 0$ term is (up to a constant) the solution to the inverse heat equation obtained by formally replacing $t$ by $-t$ in the formula (11) for $\nu_t$.

3) The “averaging lemma” [H1, Lem. 11], together with Theorem 2, gives pointwise bounds on functions in the space $\mathcal{H}L^2(K_{\mathbb{C}}, \mu_t)$ of [H1]. The bounds are the same as in Theorem 2, except that $a_t$ and $b_t$ do not tend to one as $t$ tends to zero. These bounds for $\mathcal{H}L^2(K_{\mathbb{C}}, \mu_t)$ are stronger than the bounds of Driver and Gross [D-G, Cor. 3.10], both because $|H| \le |g|$, and because of the exponentially decaying factors $\langle \alpha, H \rangle / \sinh \langle \alpha, H \rangle$ in Theorem 2. On the other hand, the bounds of Driver and Gross hold in much greater generality.

Proof of Theorem 2. We will use an extension of Urakawa’s [U] Poisson summation formula for the restriction of the heat kernel $\rho_t$ to the maximal torus $T$. Recall that $\Gamma$ denotes the kernel of the exponential mapping for $\mathfrak{t}$, $R^+$ denotes the set of positive roots, and $\rho$ denotes half the sum of the positive roots. For $\gamma \in \Gamma$, let $\epsilon(\gamma) = \exp i \langle \rho, \gamma \rangle$, so that $\epsilon(\gamma) = \pm 1$. Then
\[
\rho_t(e^H) = (2\pi t)^{-n/2} e^{|\rho|^2 t/2} \left( \prod_{\alpha \in R^+} \frac{1}{2 \sin \frac{1}{2} \langle \alpha, H \rangle} \right) \sum_{\gamma \in \Gamma} \epsilon(\gamma) \, \pi(H - \gamma) \, e^{-|H - \gamma|^2/2t} \tag{7}
\]
Q for all H ∈ t for which eH is regular. Here n = dim K and π (H) = α∈R+ hα, Hi. If K is simply connected then (γ) ≡ 1 and (7) reduces essentially to the formula in [U]. (There is a question about the overall constant, which will be addressed below.) The general result can be reduced to the simply connected case as follows. If K is commutative then R+ is empty, so ρ = 0, (γ) ≡ 1, and π (H) ≡ 1; thus (7) reduces to the usual Poisson summation formula for the heat kernel on a torus. A general compact connected Lie group is of the form K = (K1 × S) /N , where K1 is simply connected, S is a torus, and N is a finite subgroup of the center of K1 × S. The Lie algebras of K1 and S are automatically orthogonal with respect to any invariant inner product, and so the heat kernel on K1 × S factors, establishing (7) on K1 × S. To get the heat kernel on K one simply periodizes over the action of N . But it is not hard to see that if γ is in the kernel of the exponential mapping for K then Y α∈R+
sin
Y 1 1 hα, H − γi = (γ) sin hα, Hi . 2 2 + α∈R
From this it is straightforward to see that periodization over N yields (7) for K. The formula in [U] contains an overall constant which is not computed explicitly. However, because we are normalizing Haar measure on K to coincide with Riemannian volume measure, we are able to pin down this constant. To see that the constant in (7) is correct, note that by Minakshisundaram’s expansion [U, (1.2), (1.7)] and my normalization of the heat equation, ρt must satisfy Vol (K) ρt (e) = (2πt)−n/2 [Vol (K) + O (t)] . So ρt (e) ∼ (2πt)−n/2 . But as proved in detail below, ρt (e) is well approximated for small t by the limit as H → 0 of just the γ = 0 term in (7), which goes as (2πt)−n/2 . (Let H → 0 in (8) using Prop. 3.) We wish to estimate ρ2t (gg ∗ ). As in Sect. 2, we write g = xeiH y, with x, y ∈ K and continued H ∈ t. Then gg ∗ = xeiH yy −1 eiH x−1 = xe2iH x−1 . Since the analytically heat kernel is a class function, this means that ρ2t (gg ∗ ) = ρ2t e2iH . It is not hard to show that (7) can be analytically continued term by term, so that we may simply replace 2 H by 2iH (and t by 2t). The analytic continuation of |H − γ| = hH − γ, H − γi is accomplished by taking a complex bilinear extension of h , i, giving “ h2iH − γ, 2iH − γi ” = −4 hH, Hi − 4i hH, γi + hγ, γi . Now, every γ ∈ Γ is contained in the orbit under W of a unique γ0 in the closed fundamental Weyl chamber C. Letting W · γ0 denote the orbit of γ0 and doing the algebra gives ! Y hα, Hi −n/2 |H|2 /t 2iH |ρ|2 t (4πt) ρ2t e e = e sinh hα, Hi α∈R+ ihH,γi/t P 1 X γ∈W ·γ0 π H − 2i γ e −|γ0 |2 /4t × (γ0 ) e . (8) π (H) γ0 ∈Γ ∩C
We have used the easily verified fact that $\epsilon(w\cdot\gamma_0) = \epsilon(\gamma_0)$ for all $w \in W$, and we have multiplied and divided each term by $\pi(H) = \prod_{\alpha\in R^+}\langle\alpha,H\rangle$.
B.C. Hall
Strictly speaking this formula is valid only on the complement of the hyperplanes $\langle\alpha,H\rangle = 0$. However, the complement of the hyperplanes is dense, so bounds that apply there continue to hold for all $H$. We will show directly that the right side of (8) extends to a smooth function on all of $\mathfrak{t}$. We now need to estimate the sum
\[ \sum_{\gamma_0\in\Gamma\cap C} \epsilon(\gamma_0)\, e^{-|\gamma_0|^2/4t}\, \frac{1}{\pi(H)} \sum_{\gamma\in W\cdot\gamma_0} \pi\!\left(H - \tfrac{1}{2i}\gamma\right) e^{i\langle H,\gamma\rangle/t} \tag{9} \]
in (8). We will show that (9) is a bounded function of $H$ for all $t$, and that this function tends to one uniformly in $H$ as $t$ tends to zero. Note that the $\gamma_0 = 0$ term is identically equal to one and that all of the other terms are small for small $t$. So it is easy to see that (9) tends to one for each fixed $H$ not in any hyperplane. But because of the factor of $\pi(H)$ in the denominator, we will have to work much harder to get uniform estimates.

Proposition 3. There exists a polynomial $P$, whose degree is equal to twice the number of positive roots, such that
\[ \left| \frac{1}{\pi(H)} \sum_{\gamma\in W\cdot\gamma_0} \pi\!\left(H - \tfrac{1}{2i}\gamma\right) e^{i\langle H,\gamma\rangle/t} \right| \le P\!\left(\frac{|\gamma_0|}{\sqrt{t}}\right) \tag{10} \]
for all $H$ and $\gamma_0$ in $\mathfrak{t}$ and all $t > 0$.

This proposition is the key technical result in the proof of Theorem 2. Its proof is deferred to an appendix. Using Proposition 3 we see easily that the sum (9) is a bounded function of $H$ for each $t$. If $a_t$ is the supremum over $H$ of this sum, then (8) gives us the first part of Theorem 2. Furthermore, the $\gamma_0 = 0$ term in (9) is one and all the other terms are uniformly small for small $t$ because of Proposition 3 and the factor $\exp(-|\gamma_0|^2/4t)$. It is easy to see, then, that (9) tends to one uniformly in $H$ as $t \to 0$. Thus the infimum $b_t$ over $H$ will be positive for all sufficiently small $t$, giving the second part of Theorem 2. The constants $a_t$ and $b_t$ tend to one as $t$ tends to zero, and it is not hard to see that the convergence is exponentially fast, essentially because $\exp(-|\gamma_0|^2/4t)$ tends to zero exponentially fast for each non-zero $\gamma_0$. This gives the last part of the theorem.

If $K$ is commutative then $R^+$ is empty, $\pi(H) \equiv 1$, $\epsilon(\gamma) \equiv 1$, and $\Gamma\cap C = \Gamma$. Thus the sum (9) is periodic. But $\rho_{2t}(e^{2iH})$ must be strictly positive, since it is the norm squared of the "evaluation at $g$" functional, which is non-zero (e.g., with $F \equiv 1$). So the sum (9) is a strictly positive continuous periodic function, which must therefore be bounded away from zero.
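If $K = SU(2)$ then $\epsilon(\gamma) \equiv 1$, $\Gamma$ may be identified with the integer lattice in $\mathbb{R}$, and $\pi$ is linear. The Weyl group is $\{1,-1\}$ and $C = [0,\infty)$. So if $y$ is a suitable linear coordinate on $\mathfrak{t}$, the sum (9) becomes
\[ \left[ 1 + 2\sum_{n=1}^{\infty} e^{-n^2/4t} \cos\frac{ny}{t} \right] - \sum_{n=1}^{\infty} e^{-n^2/4t}\, \frac{n}{y} \sin\frac{ny}{t} . \]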
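As a numerical sanity check, the SU(2) expression above can be evaluated directly. The following is a minimal sketch, not part of the original argument: the function name `su2_sum`, the truncation `n_max`, and the sampling grid are illustrative choices, and the normalization of the coordinate $y$ is taken exactly as in the displayed sum. For small $t$ the value stays close to one, in line with the claim that (9) tends to one uniformly in $H$.

```python
import math


def su2_sum(y, t, n_max=200):
    """Evaluate the SU(2) form of the sum (9) at y != 0.

    periodic part:  1 + 2 * sum_n exp(-n^2/4t) * cos(n*y/t)
    decaying part:      sum_n exp(-n^2/4t) * (n/y) * sin(n*y/t)
    The series is truncated at n_max (an illustrative choice).
    """
    periodic = 1.0
    decaying = 0.0
    for n in range(1, n_max + 1):
        weight = math.exp(-n * n / (4.0 * t))
        periodic += 2.0 * weight * math.cos(n * y / t)
        decaying += weight * n * math.sin(n * y / t) / y
    return periodic - decaying


if __name__ == "__main__":
    # Sample the sum on a grid of y values for a small and a moderate t.
    grid = [0.01 + 0.05 * j for j in range(400)]
    for t in (0.05, 0.5):
        values = [su2_sum(y, t) for y in grid]
        print(f"t = {t}: min = {min(values):.6f}, max = {max(values):.6f}")
```

Since $|n \sin(ny/t)/y| \le n^2/t$, each term with $n \ge 1$ is bounded by $e^{-n^2/4t}(2 + n^2/t)$, which is what makes the deviation from one small when $t$ is small.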
The first term is periodic and is essentially a heat kernel on the circle. It is therefore strictly positive. The second goes to zero as $y \to \infty$. So $\rho_{2t}(e^{2iH})$ is a strictly positive
Phase Space Bounds for Quantum Mechanics on a Compact Lie Group
continuous function which is the sum of a strictly positive continuous periodic function and a function which goes to zero at infinity. A simple compactness argument then shows that $\rho_{2t}(e^{2iH})$ must be bounded away from zero. Thus the constant $b_t$ in Theorem 2, and so also in Theorem 1, exists for all $t$ if $K$ is commutative or if $K = SU(2)$.

Proof of Theorem 1. Recall from Sect. 2 that each $g \in K_{\mathbb{C}}$ can be written in the form $xe^{iH}y$, with $x, y \in K$ and $H \in \mathfrak{t}$, and that $\nu_t$ is bi-$K$-invariant. The formula for $\nu_t$ is the following:
\[ \nu_t\left(xe^{iH}y\right) = e^{-|\rho|^2 t}\,(\pi t)^{-n/2}\, e^{-|H|^2/t} \prod_{\alpha\in R^+} \frac{\langle\alpha,H\rangle}{\sinh\langle\alpha,H\rangle} . \tag{11} \]
(See also Lemma 5.) If $K$ is semisimple, then this is (up to a constant) a formula of Gangolli [G, Prop. 3.2], where $\nu_t$ is $g_{t/4}$ in Gangolli's notation. What Gangolli calls $|\rho_*|$ is $2|\rho|$ in our notation; see the expression for $\rho(H)$ near the top of p. 159 in [G]. If $K$ is commutative, then $K_{\mathbb{C}}/K$ is isometric to $\mathbb{R}^n$, and (11) is the usual Gaussian heat kernel. In general, $K = (K_1\times S)/N$ with $K_1$ semisimple, $S$ a torus, and $N$ a finite central subgroup. It follows from the polar decomposition that $K_{\mathbb{C}}/K$ is isometric to $(K_{1,\mathbb{C}}/K_1) \times (S_{\mathbb{C}}/S)$. So (11) holds for $K$.

As in the compact case, there is a question about the overall constant in (11). The constant can be verified as follows. By Lemma 5 below and our normalization of $\nu_t$,
\[ \mathrm{Vol}(K) = \int_{K_{\mathbb{C}}} \nu_t(g)\,dg = \mathrm{Vol}(K) \int_{\mathfrak{k}} \nu_t\left(e^{iY}\right) \sigma(Y)\,dY , \]
where $\sigma$ is given explicitly in the lemma. Cancelling $\mathrm{Vol}(K)$ and letting $t \to 0$, we see that $\nu_t$ should satisfy
\[ \lim_{t\to 0} \int_{\mathfrak{k}} \nu_t\left(e^{iY}\right) \sigma(Y)\,dY = 1 . \tag{12} \]
The limit may be computed by making the change of variable $Z = Y/\sqrt{t}$ and moving the limit inside the integral. Since $\sigma(0) = 1$ and $\lim_{H\to 0} \langle\alpha,H\rangle/\sinh\langle\alpha,H\rangle = 1$, (12) becomes
\[ \pi^{-n/2} \int_{\mathfrak{k}} e^{-|Z|^2}\,dZ = 1 , \]
which is true. So the constant in (11) must be correct.

In the next three lemmas we will give an explicit formula for phase volume measure, compute the Jacobian factor $\sigma$, and verify that the normalization of $\nu_t$ in this paper is consistent with that in [H1]. This last point is necessary because we are using the formula from [H1] for the reproducing kernel. Then to prove Theorem 1 we will simply put everything together.

Lemma 4. Identify $T^*(K)$ with $K \times \mathfrak{k}$ via left-translation and the inner product on $\mathfrak{k}$. Then the integral of a function $f$ with respect to phase volume measure is given by
\[ \int_K \int_{\mathfrak{k}} f(x,Y)\,dx\,dY , \]
where dx is Haar measure on K normalized to coincide with Riemannian volume measure and dY is Lebesgue measure on k normalized by means of the inner product.
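Lemma 5. If $f$ is a continuous function of compact support, then
\[ \int_{K_{\mathbb{C}}} f(g)\,dg = \int_K \int_{\mathfrak{k}} f\left(xe^{iY}\right) dx\,\sigma(Y)\,dY , \]
where $\sigma$ is an Ad-$K$-invariant function on $\mathfrak{k}$ which satisfies
\[ \sigma(H) = \prod_{\alpha\in R^+} \left( \frac{\sinh\langle\alpha,H\rangle}{\langle\alpha,H\rangle} \right)^{2} \]
for $H \in \mathfrak{t}$. The measure $\nu_t(g)\,dg$ is given in $(x,Y)$ coordinates by
\[ \nu_t(g)\,dg = e^{-|\rho|^2 t}\,(\pi t)^{-n/2}\, e^{-|Y|^2/t}\, \eta(Y)\,dx\,dY , \tag{13} \]
where $\eta(Y)$ is the Ad-$K$-invariant function given by
\[ \eta(H) = \prod_{\alpha\in R^+} \frac{\sinh\langle\alpha,H\rangle}{\langle\alpha,H\rangle} \]
for $H \in \mathfrak{t}$.

Lemma 6. Normalizing things as in Sect. 2 we have
\[ \int_{K_{\mathbb{C}}} \nu_t(g)\,dg = \mathrm{Vol}(K) . \]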
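Lemmas 5 and 6 together say that $\int_{\mathfrak{k}} \nu_t(e^{iY})\sigma(Y)\,dY = 1$ for every $t$. The sketch below checks this numerically in a rank-one model. It rests on assumptions that are not taken from the text: $\mathfrak{k}$ is three-dimensional with a single positive root $\alpha$ normalized so that $\langle\alpha,H\rangle = |H|$, hence $|\rho|^2 = 1/4$, and the function name `total_mass` is illustrative. In polar coordinates the integrand from (13) is $4\pi r^2 \cdot e^{-t/4}(\pi t)^{-3/2} e^{-r^2/t}\,\sinh r / r$.

```python
import math


def total_mass(t, n_steps=20000):
    """Trapezoid-rule value of the integral of nu_t(e^{iY}) sigma(Y) over R^3.

    Rank-one model (an assumption for illustration): one positive root with
    <alpha, H> = |H|, so |rho|^2 = 1/4 and eta(Y) = sinh|Y| / |Y|.
    """
    rho_squared = 0.25
    # The integrand peaks near r = t/2 with width of order sqrt(t).
    r_max = 0.5 * t + 12.0 * math.sqrt(t)
    h = r_max / n_steps
    prefactor = 4.0 * math.pi * math.exp(-rho_squared * t) * (math.pi * t) ** (-1.5)
    total = 0.0
    for j in range(n_steps + 1):
        r = j * h
        # r^2 * (sinh r / r) * exp(-r^2/t), written without dividing by r.
        value = r * math.sinh(r) * math.exp(-r * r / t)
        weight = 0.5 if j in (0, n_steps) else 1.0
        total += weight * value
    return prefactor * h * total


if __name__ == "__main__":
    for t in (0.1, 0.5, 2.0):
        print(t, total_mass(t))
```

In this model the integral can also be done in closed form, $\int_0^\infty r\sinh(r)\,e^{-r^2/t}\,dr = \tfrac{1}{4}\sqrt{\pi}\,t^{3/2}e^{t/4}$, so the factor $e^{t/4}$ is cancelled exactly by $e^{-|\rho|^2 t}$.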
Proof of Lemma 4. If $M$ is any Riemannian manifold, the phase volume on $T^*(M)$ may be computed by integrating over the cotangent spaces with respect to Lebesgue measure (normalized by the inner product) and then integrating over $M$ with respect to Riemannian volume measure. To see this note that the phase volume measure is given by integrating the Liouville $2n$-form $dq^1\wedge\cdots\wedge dq^n\wedge dp_1\wedge\cdots\wedge dp_n$, where the $q$'s are local coordinates on $M$ and the $p$'s are the associated coordinates on the cotangent spaces. But this is equal to
\[ \sqrt{g}\,dq^1\wedge\cdots\wedge dq^n \wedge \frac{1}{\sqrt{g}}\,dp_1\wedge\cdots\wedge dp_n , \]
which corresponds to volume measure on $M$ times normalized Lebesgue measure on the cotangent spaces. If we use the metric to identify $T^*(M)$ and $T(M)$, we get a similar statement on $T(M)$. The lemma is then just a special case of this general result, in which all the tangent spaces to $K$ are identified isometrically with $\mathfrak{k}$.

Proof of Lemma 5. We have to compute the "Jacobian" of the map $\Phi : K\times\mathfrak{k} \to K_{\mathbb{C}}$ given by $\Phi(x,Y) = xe^{iY}$. Now
\[ \frac{d}{ds}\bigg|_{s=0} \Phi\left(xe^{sX}, Y\right) = \frac{d}{ds}\bigg|_{s=0} xe^{iY} e^{-iY} e^{sX} e^{iY} = \left(L_{xe^{iY}}\right)_* e^{-i\,\mathrm{ad}Y}(X) = \left(L_{xe^{iY}}\right)_* \left( \cos \mathrm{ad}Y\,(X) - i \sin \mathrm{ad}Y\,(X) \right) . \]
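Using the formula for the differential of the exponential mapping [He, Thm. II.1.7],
\[ \frac{d}{ds}\bigg|_{s=0} \Phi(x, Y+sX) = \left(L_{xe^{iY}}\right)_* \frac{1 - e^{-i\,\mathrm{ad}Y}}{i\,\mathrm{ad}Y}\,(iX) = \left(L_{xe^{iY}}\right)_* \left( \frac{1-\cos \mathrm{ad}Y}{\mathrm{ad}Y}\,(X) + i\,\frac{\sin \mathrm{ad}Y}{\mathrm{ad}Y}\,(X) \right) . \]
Using left-translation on $K$ we think of the tangent space at each point of $K\times\mathfrak{k}$ as $\mathfrak{k}\oplus\mathfrak{k}$. Using left-translation on $K_{\mathbb{C}}$, we think of the tangent space at each point of $K_{\mathbb{C}}$ as $\mathfrak{k}_{\mathbb{C}} = \mathfrak{k}\oplus\mathfrak{k}$. Thus the differential of $\Phi$ at the point $(x,Y)$ is represented by the block matrix
\[ \Phi_*(x,Y) = \begin{pmatrix} \cos \mathrm{ad}Y & \dfrac{1-\cos \mathrm{ad}Y}{\mathrm{ad}Y} \\[2mm] -\sin \mathrm{ad}Y & \dfrac{\sin \mathrm{ad}Y}{\mathrm{ad}Y} \end{pmatrix} . \tag{14} \]
The cotangent space at each point of $K\times\mathfrak{k}$ is $\mathfrak{k}^*\oplus\mathfrak{k}^*$, which we identify with $\mathfrak{k}\oplus\mathfrak{k}$ via the inner product. Let $\{e_j\}$ be an orthonormal basis for the first copy of $\mathfrak{k}$ and $\{f_j\}$ an orthonormal basis for the second copy of $\mathfrak{k}$. By Lemma 4, the Liouville form on $K\times\mathfrak{k}$ is
\[ e_1\wedge\cdots\wedge e_n\wedge f_1\wedge\cdots\wedge f_n . \tag{15} \]
The cotangent space at each point of $K_{\mathbb{C}}$ is similarly identified with $\mathfrak{k}\oplus\mathfrak{k}$, and the $2n$-form that gives Haar measure on $K_{\mathbb{C}}$ is also given by (15). Thus the density $\sigma$ of Haar measure with respect to phase volume measure will be given by the determinant of the matrix in (14), which is evidently a function of $Y$ only. Since the blocks of (14) commute, its determinant as a $2n\times 2n$ matrix may be computed by first taking the blockwise "determinant," which comes out to be $\sin \mathrm{ad}Y/\mathrm{ad}Y$, and then taking the determinant of the result as an $n\times n$ matrix. So
\[ \sigma(Y) = \det\left( \frac{\sin \mathrm{ad}Y}{\mathrm{ad}Y} \right) . \]
It is clear from this expression that $\sigma(Y)$ is Ad-$K$-invariant, so it suffices to compute $\sigma$ for $Y = H \in \mathfrak{t}$. Now, $\sin\theta/\theta = 1$ when $\theta = 0$, so only the non-zero eigenvalues of $\mathrm{ad}H$ contribute to the determinant. But the non-zero eigenvalues of $\mathrm{ad}H$ are of the form $i\langle\alpha,H\rangle$, with $\alpha \in R$. Since $\sin i\theta/i\theta = \sinh\theta/\theta$ we have
\[ \sigma(H) = \prod_{\alpha\in R} \frac{\sinh\langle\alpha,H\rangle}{\langle\alpha,H\rangle} = \prod_{\alpha\in R^+} \left( \frac{\sinh\langle\alpha,H\rangle}{\langle\alpha,H\rangle} \right)^{2} . \]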
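The determinant computation can be checked numerically in the smallest non-commutative case. The sketch below is an illustration under stated assumptions, not the author's computation: it uses NumPy, identifies $\mathfrak{k} = \mathfrak{so}(3)$ with $\mathbb{R}^3$ so that $\mathrm{ad}Y$ is the cross-product matrix with eigenvalues $0, \pm i|Y|$, and the helper names (`ad_so3`, `matrix_function`, `jacobian_block`, `sigma`) are hypothetical. It builds the $6\times 6$ block matrix (14) and compares its determinant with $(\sinh\theta/\theta)^2$, $\theta = |Y|$.

```python
import numpy as np


def ad_so3(y):
    """Matrix of ad_Y on so(3) ~ R^3, i.e. the map v -> y x v."""
    a, b, c = y
    return np.array([[0.0, -c, b],
                     [c, 0.0, -a],
                     [-b, a, 0.0]])


def matrix_function(A, func, func_near_zero):
    """Apply an entire function to a diagonalizable matrix via eigen-decomposition.

    func_near_zero supplies the limiting expression for eigenvalues close to 0,
    which avoids 0/0 in sin(z)/z and (1 - cos z)/z.
    """
    eigenvalues, V = np.linalg.eig(A)
    values = np.array([func_near_zero(z) if abs(z) < 1e-6 else func(z)
                       for z in eigenvalues])
    return V @ np.diag(values) @ np.linalg.inv(V)


def jacobian_block(y):
    """The block matrix (14) for Y = y."""
    A = ad_so3(y)
    cos_a = matrix_function(A, np.cos, np.cos)
    sin_a = matrix_function(A, np.sin, np.sin)
    sinc_a = matrix_function(A, lambda z: np.sin(z) / z, lambda z: 1 - z * z / 6)
    g_a = matrix_function(A, lambda z: (1 - np.cos(z)) / z, lambda z: z / 2)
    return np.block([[cos_a, g_a], [-sin_a, sinc_a]])


def sigma(y):
    """Determinant of (14); the imaginary part is rounding noise."""
    return np.linalg.det(jacobian_block(y)).real


if __name__ == "__main__":
    y = np.array([0.3, -1.1, 0.7])
    theta = np.linalg.norm(y)
    print(sigma(y), (np.sinh(theta) / theta) ** 2)
```

The eigen-decomposition is used only because $\mathrm{ad}Y$ is a real antisymmetric matrix with distinct eigenvalues when $Y \neq 0$; a power series would serve equally well.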
This is the formula we want. Meanwhile, to get the formula (13) for the measure νt (g) dg, we take the formula for the function νt (g) and multiply by σ (g). Note that the exponentially growing factor sinh hα, Hi is in the numerator in the formula for the measure νt (g) dg.
Proof of Lemma 6. The Riemannian volume measure on $K_{\mathbb{C}}/K$ is invariant under the action of $K_{\mathbb{C}}$. Haar measure on $K_{\mathbb{C}}$ pushed forward under the quotient map to $K_{\mathbb{C}}/K$ is also invariant under the action of $K_{\mathbb{C}}$. It follows automatically that pushed-forward Haar measure equals a constant times Riemannian volume measure. To establish the lemma, we need to show that this constant is $\mathrm{Vol}(K)$. Now, the quotient map takes the set $P = \exp i\mathfrak{k}$ diffeomorphically onto $K_{\mathbb{C}}/K$; we may thus identify $K_{\mathbb{C}}/K$ with $P$. Lemma 5 works just as well with $xe^{iY}$ replaced by $e^{iY}x$, and so integration with respect to pushed-forward Haar measure amounts to
\[ \mathrm{Vol}(K) \int_{\mathfrak{k}} f\left(e^{iY}\right) \sigma(Y)\,dY . \]
Meanwhile, under the identification of $K_{\mathbb{C}}/K$ with $P$, the map $Y \to e^{iY}$ is the geometric exponential mapping for $K_{\mathbb{C}}/K$, which is a diffeomorphism in this case. It follows that integration with respect to Riemannian volume measure is given by
\[ \int_{\mathfrak{k}} f\left(e^{iY}\right) \phi(Y)\,dY , \]
where $\phi$ is a positive density equal to one at the origin. The functions $\phi$ and $\sigma$ must differ at most by a multiplicative constant; since $\phi(0) = \sigma(0) = 1$, the constant is one.

We are now ready to put everything together. We use formula (11) for $\nu_t$, the formula in Lemma 5 for $\sigma$, and the pointwise estimates in Theorem 2. In the resulting bounds on $|F(g)|^2\,\nu_t(g)\,\sigma(g)$, everything miraculously cancels, except for the constant, a factor of $(4\pi t)^{-n/2}$ from Theorem 2, and a factor of $(\pi t)^{-n/2}$ from $\nu_t$. These combine to give you a constant ($a_t$ or $b_t$) times $(2\pi t)^{-n}$, which is Theorem 1.
5. Appendix

Proof of Proposition 3. We may write $\mathfrak{k}$ as $\mathfrak{k} = \mathfrak{k}_1\oplus\mathfrak{a}$, where $\mathfrak{k}_1$ is semisimple and $\mathfrak{a}$ is abelian, in which case $\mathfrak{t} = \mathfrak{t}_1\oplus\mathfrak{a}$, where $\mathfrak{t}_1$ is a maximal abelian subalgebra of $\mathfrak{k}_1$. If we identify $\mathfrak{t}^*$ with $\mathfrak{t}$, then all the roots lie in $\mathfrak{t}_1$. Furthermore, we may write $\gamma_0$ as $\gamma_1 + \gamma_2$, with $\gamma_1 \in \mathfrak{t}_1$ and $\gamma_2 \in \mathfrak{a}$. Then the contribution of $\gamma_2$ to the expression in the proposition is just a multiplicative factor of absolute value one. Since $|\gamma_1| \le |\gamma_0|$, there is no harm in assuming $\mathfrak{k}$ is semisimple.

We will proceed by computing the Fourier transform, in the sense of tempered distributions, of the fraction in the proposition. The Fourier transform of the numerator is easily computed as a linear combination of derivatives of $\delta$-functions. To compute the Fourier transform of the fraction we will compute the Fourier transform of the numerator and then integrate, in a sense to be described below. The key result will be that the Fourier transform of the fraction has compact support. (See Lemma 9.)

Let a cone over $R^+$ denote a set of the form
\[ \{ x_0 + a_1\alpha_1 + \cdots + a_k\alpha_k \mid a_j \ge 0 \} \tag{16} \]
with $x_0 \in \mathfrak{t}$, where $R^+ = \{\alpha_1,\ldots,\alpha_k\}$ is the set of positive roots. Analogously define a cone over $R^-$ to be a set of the same form but with $a_j \le 0$. The set (16) is the
same as $\{ x_0 + a_1\alpha_1 + \cdots + a_m\alpha_m \mid a_j \ge 0 \}$, where $\alpha_1,\ldots,\alpha_m$ are the positive simple roots, which (since we assume $\mathfrak{k}$ is semisimple) form a basis for $\mathfrak{t}$. Every compact set is contained in a cone over $R^+$ and in a cone over $R^-$. The intersection of a cone over $R^+$ and a cone over $R^-$ is compact.

Definition 7. Suppose $f \in C^\infty(\mathfrak{t})$ and $f$ is supported in some cone over $R^+$. Then for $\alpha \in R^+$ define
\[ I_\alpha f(x) = \int_0^\infty f(x - t\alpha)\,dt . \]
The condition on $f$ guarantees that the integral exists, since for all sufficiently large $t$, $x - t\alpha$ will be outside the cone supporting $f$. Note also that if $x$ is not in the cone supporting $f$, then neither is $x - t\alpha$ ($t > 0$). Thus $I_\alpha f$ will again be supported in a cone over $R^+$. It is easy to verify that $I_\alpha f$ is $C^\infty$ and that $D_\alpha I_\alpha f = f$, where $D_\alpha$ denotes the directional derivative in the $\alpha$ direction. It is also true that $I_\alpha D_\alpha f = f$, since $I_\alpha D_\alpha f - f$ must be constant along each line of the form $\{x - t\alpha\}$, and is zero when $t$ is large. If $f$ is supported in a cone over $R^+$, then for $\alpha, \beta \in R^+$, $I_\alpha I_\beta f$ and $I_\beta I_\alpha f$ both make sense, and must be equal because $I_\alpha$ and $I_\beta$ are two-sided inverses of $D_\alpha$ and $D_\beta$, which commute. Of course, by reversing signs we can define $I_{-\alpha}f$ for $f$ supported on a cone over $R^-$. Integration by parts shows that if $f$ is supported on a cone over $R^+$ and $g$ is supported on a cone over $R^-$, then
\[ \int_{\mathfrak{t}} I_\alpha f(x)\,g(x)\,dx = \int_{\mathfrak{t}} f(x)\,I_{-\alpha}g(x)\,dx . \tag{17} \]
The integrals make sense because in both cases the integrand is supported on the intersection of a cone over $R^+$ and a cone over $R^-$.

Definition 8. Let $T$ be a distribution supported on a cone over $R^+$. Then for $\alpha \in R^+$, define a distribution $I_\alpha T$ by
\[ (I_\alpha T, f) = (T, I_{-\alpha}f) \]
for all $f \in C_c^\infty(\mathfrak{t})$.

Note that $T$ is supported on a cone over $R^+$ and $I_{-\alpha}f$ is supported on a cone over $R^-$. The expression $(T, I_{-\alpha}f)$ really means $(T, \phi I_{-\alpha}f)$, where $\phi$ is any $C^\infty$ function of compact support which is equal to one in a neighborhood of $\mathrm{supp}(T)\cap\mathrm{supp}(I_{-\alpha}f)$. If $f$ is supported outside a cone over $R^+$, then so is $I_{-\alpha}f$. Thus the distribution $I_\alpha T$ will again be supported on a cone over $R^+$. If $T$ is a $C^\infty$ function, then by (17) $I_\alpha T$ defined as a distribution coincides with $I_\alpha T$ defined as a function. The results $I_\alpha D_\alpha T = D_\alpha I_\alpha T = T$ and $I_\alpha I_\beta T = I_\beta I_\alpha T$ follow from the corresponding results for functions.
Lemma 9. Let T be a compactly supported distribution which is alternating with respect to the action of the Weyl group. Let R+ = {α1 , · · · , αk } be the set of positive roots. Then S = Iα1 Iα2 · · · Iαk T has compact support, and the convex hull of the support of S is contained in the convex hull of the support of T .
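The content of Definition 7 and Lemma 9 is easiest to see in a one-dimensional model. The following sketch assumes $\mathfrak{t} = \mathbb{R}$ with a single positive root $\alpha = 1$, so that $I_\alpha f(x) = \int_{-\infty}^{x} f(s)\,ds$, the Weyl group is $\{\pm 1\}$, and "alternating" means odd. The test function and all names are illustrative, and the integral is approximated by a cumulative trapezoid rule with NumPy. For an odd, compactly supported $f$ the result $I_\alpha f$ vanishes again to the right of the support, as Lemma 9 asserts, and $D_\alpha I_\alpha f = f$ holds up to discretization error.

```python
import numpy as np


def bump(x):
    """Smooth bump supported in [-1, 1]."""
    out = np.zeros_like(x)
    inside = np.abs(x) < 1.0
    out[inside] = np.exp(-1.0 / (1.0 - x[inside] ** 2))
    return out


def odd_test_function(x):
    """An alternating (odd) smooth function with support in [-1, 1]."""
    return x * bump(x)


def integrate_along_root(values, h):
    """Cumulative trapezoid rule for I f(x) = int_0^inf f(x - t) dt on a grid.

    The grid must start to the left of the support of f, so that I f = 0 there.
    """
    increments = 0.5 * (values[1:] + values[:-1]) * h
    return np.concatenate(([0.0], np.cumsum(increments)))


xs = np.linspace(-2.0, 2.0, 4001)
h = xs[1] - xs[0]
f = odd_test_function(xs)
If = integrate_along_root(f, h)

if __name__ == "__main__":
    print("I f at x = 0:", If[2000])
    print("I f at x = 2:", If[-1])
    print("max |D I f - f|:", np.max(np.abs(np.gradient(If, h) - f)))
```

If the same computation is run with the even function `bump` in place of `odd_test_function`, the value of $I_\alpha f$ to the right of the support is the total integral of the bump and does not vanish, which is why the alternating hypothesis is needed.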
Proof. The distribution $T$ can be approximated, in the sense of distributions, by alternating $C^\infty$ functions $T_\epsilon$ such that every point in the support of $T_\epsilon$ is within $\epsilon$ of a point in the support of $T$. It suffices, then, to prove the lemma under the assumption that $T$ is an alternating $C^\infty$ function of compact support.

Let $E$ denote the convex hull of the support of $T$. If $f$ is any $C^\infty$ function supported in $E$, then it is easy to see that $I_\alpha f$ will be supported in $E$ if and only if
\[ \int_{-\infty}^{\infty} f(x + t\alpha)\,dt = 0 \tag{18} \]
for all $x$. Let $\alpha$ and $\beta$ be distinct elements of $R^+$, and suppose $f$, $I_\alpha f$, and $I_\beta f$ are all supported in $E$. Then
\[ \int_{-\infty}^{\infty} I_\beta f(x + t\alpha)\,dt = \int_{-\infty}^{\infty}\!\int_0^{\infty} f(x + t\alpha - s\beta)\,ds\,dt = \int_0^{\infty}\!\int_{-\infty}^{\infty} f(x + t\alpha - s\beta)\,dt\,ds = 0 . \]
Here Fubini applies because $\alpha$ and $\beta$ are distinct (hence non-parallel) elements of $R^+$, so that $f(x + t\alpha - s\beta)$ is zero for all sufficiently large $s$ and $t$. Thus we see that $I_\alpha I_\beta f$ also is supported in $E$. Applying this argument repeatedly we see that if $f$ is supported in $E$, and $I_\alpha f$ is supported in $E$ for each $\alpha \in R^+$, then $I_{\alpha_1}\cdots I_{\alpha_k}f$ is supported in $E$.

Since $T$ is alternating, $T(s_\alpha x) = -T(x)$, where $s_\alpha$ is the reflection about the hyperplane perpendicular to $\alpha$. It follows that for any $\alpha \in R^+$, condition (18) holds, so $I_\alpha T$ will be supported in $E$. But then by the preceding paragraph, $I_{\alpha_1}\cdots I_{\alpha_k}T$ will be supported in $E$.

Now, since $\pi$ is homogeneous, the expression on the left side of Proposition 3 may be written as
\[ \frac{ \sum_{\gamma\in W\cdot\gamma_0} \pi\!\left( \frac{H}{\sqrt{t}} - \frac{1}{2i}\frac{\gamma}{\sqrt{t}} \right) \exp\!\left( i\left\langle \frac{H}{\sqrt{t}}, \frac{\gamma}{\sqrt{t}} \right\rangle \right) }{ \pi\!\left( \frac{H}{\sqrt{t}} \right) } . \]
Thus the supremum over $H$ of this expression will be a function of $\gamma_0/\sqrt{t}$. So it suffices to prove the proposition with $t = 1$. Since $\pi$ is alternating, the inner product is Weyl invariant, and $\gamma$ ranges over a Weyl invariant set, we see that the numerator in the proposition,
\[ \sum_{\gamma\in W\cdot\gamma_0} \pi\!\left(H - \tfrac{1}{2i}\gamma\right) \exp\left( i\langle H,\gamma\rangle \right) , \tag{19} \]
is alternating. Let $T$ denote the Fourier transform of (19), in the sense of tempered distributions. Then $T$ is also alternating. Now (19) can be expanded as a linear combination of at most $2^k|W|$ terms of the form
\[ \langle\alpha_{i_1},\gamma\rangle\cdots\langle\alpha_{i_l},\gamma\rangle\, \langle\alpha_{i_{l+1}},H\rangle\cdots\langle\alpha_{i_k},H\rangle\, e^{i\langle H,\gamma\rangle} , \]
with coefficients independent of $\gamma$ and $H$. (Here $k$ is the number of positive roots.) Taking the Fourier transform of this gives an irrelevant constant times
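\[ \langle\alpha_{i_1},\gamma\rangle\cdots\langle\alpha_{i_l},\gamma\rangle\, D_{\alpha_{i_{l+1}}}\cdots D_{\alpha_{i_k}}\delta_\gamma , \]
where $\delta_\gamma$ denotes a $\delta$-function at $\gamma$. Thus $S = I_{\alpha_1}\cdots I_{\alpha_k}T$ is a linear combination of terms of the form
\[ \langle\alpha_{i_1},\gamma\rangle\cdots\langle\alpha_{i_l},\gamma\rangle\, I_{\alpha_{i_1}}\cdots I_{\alpha_{i_l}}\delta_\gamma . \]
Now, $I_{\alpha_{i_1}}\cdots I_{\alpha_{i_l}}\delta_\gamma$ is a positive measure, so $S$ is a complex measure. By Lemma 9, $S$ is supported on $E$, where $E$ is the convex hull of the support of $T$; that is, $E$ is the convex hull of $W\cdot\gamma_0$. Let $C_1$ be the smallest cone over $R^+$ containing $E$, $C_2$ the smallest cone over $R^-$ containing $E$, and $P = C_1\cap C_2$, so that $P$ is a parallelepiped. There exists a constant $c$, independent of $\gamma_0$, so that $\mathrm{diam}(P) \le c\,\mathrm{diam}(E) \le 2c|\gamma_0|$. It is a straightforward calculation to see that the measure of the set $E$ with respect to the measure $I_{\alpha_{i_1}}\cdots I_{\alpha_{i_l}}\delta_\gamma$ is at most $\mathrm{diam}(P)^l \le (2c|\gamma_0|)^l$. Taking into account the factors $\langle\alpha_{i_1},\gamma\rangle\cdots\langle\alpha_{i_l},\gamma\rangle$ and the fact that $l \le k$, we see that the total variation norm of $S$ will be bounded by
\[ \mathrm{const.}\,(1 + |\gamma_0|)^{2k} . \tag{20} \]
But if $\mathcal{F}$ denotes the Fourier transform, then
\[ \pi(H)\,\mathcal{F}^{-1}(S) = \mathrm{const.}\,\mathcal{F}^{-1}(T) = \mathrm{const.} \sum_{\gamma\in W\cdot\gamma_0} \pi\!\left(H - \tfrac{1}{2i}\gamma\right) \exp\left( i\langle H,\gamma\rangle \right) . \]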
But both $\mathcal{F}^{-1}(S)$ and $\mathcal{F}^{-1}(T)$ are $C^\infty$ functions, so
\[ \frac{ \sum_{\gamma\in W\cdot\gamma_0} \pi\!\left(H - \tfrac{1}{2i}\gamma\right) \exp\left( i\langle H,\gamma\rangle \right) }{ \pi(H) } = \mathrm{const.}\,\mathcal{F}^{-1}(S) , \tag{21} \]
where all the constants are independent of $\gamma_0$. While the left side of (21) is defined only when $\pi(H) \ne 0$, we see that it extends to a $C^\infty$ function on all of $\mathfrak{t}$. The expression (21) together with the bound (20) on the total variation of $S$ gives the desired estimate.
References

[A] Ashtekar, A., Lewandowski, J., Marolf, D., Mourão, J., Thiemann, T.: Coherent state transforms for spaces of connections. J. Funct. Anal. 135, 519–551 (1996)
[B] Bargmann, V.: On a Hilbert space of analytic functions and an associated integral transform, Part I. Comm. Pure Appl. Math. 14, 187–214 (1961)
[B-D] Bröcker, T., tom Dieck, T.: Representations of compact Lie groups. New York: Springer-Verlag, 1985
[C] Carlen, E.: Some integral identities and inequalities for entire functions and their application to the coherent state transform. J. Funct. Anal. 97, 231–249 (1991)
[D] Driver, B.: On the Kakutani-Itô-Segal-Gross and Segal-Bargmann-Hall isomorphisms. J. Funct. Anal. 133, 69–128 (1995)
[D-G] Driver, B., Gross, L.: Hilbert spaces of holomorphic functions on complex Lie groups. To appear in: Proceedings of the 1994 Taniguchi Symposium
[G] Gangolli, R.: Asymptotic behaviour of spectra of compact quotients of certain symmetric spaces. Acta Math. 121, 151–192 (1968)
[G-P] Graffi, S., Paul, T.: The Schrödinger equation and canonical perturbation theory. Commun. Math. Phys. 108, 25–40 (1987)
[G-M] Gross, L., Malliavin, P.: Hall's transform and the Segal-Bargmann map. In: Fukushima, M., Ikeda, N., Kunita, H., Watanabe, S. (eds.) Itô's stochastic calculus and probability theory. New York: Springer-Verlag, 1996, pp. 73–116
[G-S1] Guillemin, V., Stenzel, M.: Grauert tubes and the homogeneous Monge-Ampère equation. J. Diff. Geom. 34, 561–570 (1991)
[G-S2] Guillemin, V., Stenzel, M.: Grauert tubes and the homogeneous Monge-Ampère equation. II. J. Diff. Geom. 35, 627–641 (1992)
[H1] Hall, B.: The Segal-Bargmann "coherent state" transform for compact Lie groups. J. Funct. Anal. 122, 103–151 (1994)
[H2] Hall, B.: The inverse Segal-Bargmann transform for compact Lie groups. J. Funct. Anal. 143, 98–116 (1997)
[H3] Hall, B.: Quantum mechanics in phase space. Preprint, 1996. To appear in: Coburn, L., Rieffel, M. (eds.) Proceedings of the summer research conference on quantization. Providence, Rhode Island: American Mathematical Society, 1997
[He] Helgason, S.: Differential geometry, Lie groups, and symmetric spaces. Boston: Academic Press, 1978
[Hi1] Hijab, O.: Hermite functions on compact Lie groups, I. J. Funct. Anal. 125, 480–492 (1994)
[Hi2] Hijab, O.: Hermite functions on compact Lie groups, II. J. Funct. Anal. 133, 41–49 (1995)
[L-S] Lempert, L., Szöke, R.: Global solutions of the homogeneous complex Monge-Ampère equation and complex structures on the tangent bundle of Riemannian manifolds. Math. Ann. 290, 689–712 (1991)
[P-U] Paul, T., Uribe, A.: A construction of quasimodes using coherent states. Ann. Inst. Henri Poincaré 59, 357–381 (1993)
[Se1] Segal, I.: Mathematical problems of relativistic physics, Chap. VI. In: Kac, M. (ed.) Lectures in applied mathematics: Proceedings of the Summer Seminar, Boulder, Colorado, 1960, Vol. II. Providence, Rhode Island: American Mathematical Society, 1963
[Se2] Segal, I.: Mathematical characterization of the physical vacuum for a linear Bose-Einstein field. Illinois J. Math. 6, 500–523 (1962)
[Se3] Segal, I.: The complex wave representation of the free Boson field. In: Gohberg, I., Kac, M. (eds.) Topics in functional analysis: Essays dedicated to M.G. Krein on the occasion of his 70th birthday. Advances in Mathematics Supplementary Studies, Vol. 3. New York: Academic Press, 1978, pp. 321–343
[Sz1] Szöke, R.: Complex structures on tangent bundles of Riemannian manifolds. Math. Ann. 291, 409–428 (1991)
[Sz2] Szöke, R.: Automorphisms of certain Stein manifolds. Math. Z. 219, 357–385 (1995)
[T-W] Thomas, L., Wassell, S.: Semiclassical approximation for Schrödinger operators on a two-sphere at high energy. J. Math. Phys. 36, 5480–5505 (1995)
[U] Urakawa, H.: The heat equation on compact Lie group. Osaka J. Math. 12, 285–297 (1975)
[V] Voros, A.: Wentzel-Kramers-Brillouin method in the Bargmann representation. Phys. Rev. A 40, 6814–6825 (1989)

Communicated by D. Brydges
Commun. Math. Phys. 184, 251– 272 (1997)
On the Stability of Double Homoclinic Loops

Clodoaldo Grotta Ragazzo 1,2

1 (On leave from) Instituto de Matemática e Estatística, Universidade de São Paulo, CP 66281, 05315-970 São Paulo, SP, Brazil
2 Department of Mathematics, Princeton University, Fine Hall-Washington Road, Princeton, NJ 08544-1000, USA
E-mail:
[email protected] Received: 10 November 1995 / Accepted: 5 June 1996
Abstract: We consider 2-degrees of freedom Hamiltonian systems with an involutive symmetry and a pair of orbits bi-asymptotic (homoclinic) to a saddle-center equilibrium (related to pairs of pure real, $\pm\nu$, and pure imaginary eigenvalues, $\pm\omega i$). We show that the stability of this double homoclinic loop is determined by the reflection coefficient of a one-dimensional scattering problem and $\omega/\nu$. We also show that the mechanism for losing stability is the creation of an infinite heteroclinic chain connecting a sequence of periodic orbits that accumulates at the double loop.
1. Introduction and Main Results

The orbits of a simple pendulum are essentially of two types: those which oscillate around a stable equilibrium and those which rotate indefinitely. In between them we find a pair of orbits bi-asymptotic (or homoclinic) to an unstable equilibrium. We call the union of this pair of homoclinic orbits and the unstable equilibrium a double homoclinic loop. It is clear from the pendulum phase portrait that all its periodic orbits are orbitally stable. We can also say that the pendulum double homoclinic loop (understood as a set in phase space) is "stable," since any orbit that is close to it at some point is close to it everywhere. In this paper we are interested in the stability question of sets that are generalizations of the simple pendulum double homoclinic loop.

Before presenting any general statement, we consider the problem in the context of a simple example. Consider the 2-degrees of freedom system (or the "2-pendulum") defined by a particle of mass $m$ in a uniform gravitational field, constrained to move in the surface of a 2-torus, see Fig. 1. Gravity $g$ acts in the negative $z$-direction. In the $(\theta,\phi)$-coordinates shown in Fig. 1 the Hamiltonian function of the system is
\[ \frac{1}{2m}\left[ \frac{p_\theta^2}{l_1^2} + \frac{p_\phi^2}{l^2} \right] - mgl\cos\phi , \tag{1} \]

Most of this work was done while the author was visiting the Courant Institute of Mathematical Sciences, New York University.
Fig. 1. Coordinates $(\theta,\phi)$ used to describe the "2-pendulum" system
where $l = l_2 + l_1\cos\theta$. This system has several invariant manifolds, in particular the cylinder $\mathcal{C} = \{(p_\theta, p_\phi, \theta, \phi) \mid p_\theta = 0,\ \theta = \pi\}$. The dynamics in $\mathcal{C}$ is the one of a simple pendulum of length $l_2 - l_1 > 0$, described by the $(p_\phi,\phi)$-variables. We denote by $\Lambda$ the set (in the 4-dimensional phase space) given by the double homoclinic loop of this simple pendulum. The question we are interested in is: "is $\Lambda$ stable, in the usual sense that any orbit of system (1) that is sufficiently close to $\Lambda$ at some point is close to $\Lambda$ everywhere?"

We point out that the vector field of system (1) linearized at the equilibrium point $(p_\theta, p_\phi, \theta, \phi) = (0, 0, \pi, \pi) \in \Lambda$ is related to a pair of pure-real and a pair of pure-imaginary eigenvalues (the equilibrium is said to be of saddle-center type). This is crucial in order to have some hope of stability for $\Lambda$. If the equilibrium were hyperbolic then $\Lambda$ could not be stable. In this case, perturbations contained in the unstable manifold of the equilibrium but not in $\Lambda$ would provide the instability (systems with a stable double homoclinic loop containing a hyperbolic equilibrium must have 1-degree of freedom).

More generically, in this paper we consider real analytic four-dimensional Hamiltonian systems $(M,\Omega,H)$ where: $M$ is a four-dimensional manifold, $\Omega$ is a symplectic form and $H$ is a Hamiltonian function. We assume that there exists a symplectic analytic involutive map $S$, different from the identity, acting on $(M,\Omega)$ and that $(M,\Omega,H)$ satisfies the following hypotheses:

H1) $(M,\Omega,H)$ is symmetric with respect to $S$, namely $H\circ S = H$;
H2) $(M,\Omega,H)$ has a symmetric equilibrium point $r$, $S(r) = r$, of saddle-center type (the eigenvalues related to it are: $\pm\nu \ne 0$ and $\pm\omega i \ne 0$);
H3) $(M,\Omega,H)$ has a pair, $\Gamma$, $\Gamma'$, of orbits homoclinic to $r$, such that $S(\Gamma) = \Gamma'$.

We call the set $\Gamma\cup\Gamma'\cup r$ a "double saddle-center loop." Let $\phi^t$ denote the flow associated to $(M,\Omega,H)$. We say that $\Gamma\cup\Gamma'\cup r$ is stable if for any neighborhood $U$ of $\Gamma\cup\Gamma'\cup r$ it is possible to find another neighborhood $V$ of it, such that $x \in V \Rightarrow \phi^t(x) \in U$, for all $t \in \mathbb{R}$ (this is the usual definition of stability of sets with respect to flows).

In order to present our theorems we have to outline the general idea we use to analyze the stability of double saddle-center loops. First of all, let us consider the single saddle-center loop $\Gamma\cup r$. This set is a closed curve in phase space (it is homeomorphic to a periodic orbit). As we usually do to analyze the dynamics
near periodic orbits, let us define a section transversal to $\Gamma$ (or a Poincaré section) and restrict this section to the constant energy level of $H$ containing $\Gamma$. We denote the restricted two dimensional section as $\Sigma$. The presence of the equilibrium in the saddle-center loop implies that a first return map (or Poincaré map) to this section is not always defined (in contrast to what happens in the periodic orbit case) unless we consider the presence of the second loop $\Gamma'\cup r$. Thus, possibly using the second loop, we show that a Poincaré map to $\Sigma$ is always well defined, except for the point where $\Gamma$ intersects $\Sigma$ ([12, 15]). In analogy to the periodic orbit case, the image of $\Gamma\cap\Sigma$ is defined to be itself. In a convenient coordinate system we write the Poincaré map to $\Sigma$ as (this mapping was originally obtained by Lerman [12] and Mielke et al. [15])
\[ y \to A R(-2\gamma\ln\|y\|)\,y + \mathcal{R}(y) \stackrel{\mathrm{def}}{=} F(y) + \mathcal{R}(y) , \tag{2} \]
where: $y \in \mathbb{R}^2$ has sufficiently small norm; $\gamma = |\omega/\nu|$; $\|\mathcal{R}(y)\| < K\|y\|^2$, $K > 0$, and
\[ R(\theta) \stackrel{\mathrm{def}}{=} \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} , \qquad A \stackrel{\mathrm{def}}{=} \begin{pmatrix} \alpha & 0 \\ 0 & 1/\alpha \end{pmatrix} . \]
The origin, $y = (0,0)$, represents the intersection of $\Gamma$ and $\Sigma$ and it is, by definition, a fixed point of the Poincaré map. The dominant part of the Poincaré map, given by $F$, is a composition of a twist map and a linear stretching map. The twist part, given by $R(-2\gamma\ln\|y\|)$, comes from the passage of solutions near the saddle-center equilibrium. There, solutions rotate with frequency $\omega$ a number of times that is proportional to the time, $-(2/\nu)\ln\|y\|$, they spend near the equilibrium. The stretching part, given by $A$, is due to the travel of solutions near $\Gamma$. The map $F$ depends on two parameters $\gamma = |\omega/\nu|$ and $\alpha$, with $\alpha > 1$ (both numbers are symplectic invariants related to the saddle-center loop [9]).

In order to calculate $\alpha$ we linearize $(M,\Omega,H)$ at $\Gamma$ and consider the normal variation components of the linear system at the energy level of $\Gamma$. We end by having to analyze a scattering problem similar to those appearing in 1-dimensional quantum mechanics (if we think of $\Gamma$ as a "periodic orbit of infinite period," then the calculation of $\alpha$ is similar to the calculation of Floquet multipliers). More details on how to calculate $\alpha$ are given in [15, 6 and 9]. Here, as an illustration, we just present the derivation of $\alpha$ for the 2-pendulum presented above. In this case $\Gamma$ is given by
\[ p_\theta = 0, \quad \theta = \pi, \quad \phi(t) = 2\arctan(\sinh(\nu t)), \quad p_\phi = m(l_2-l_1)^2\,\dot\phi , \tag{3} \]
where $\nu = \sqrt{g/(l_2-l_1)}$. The scattering equation is given by
\[ \ddot\xi = -\omega^2\left(1 - 6\,\mathrm{sech}^2(\nu t)\right)\xi , \]
where $\omega = \sqrt{g/l_1}$ and $\gamma = \omega/\nu = \sqrt{(l_2-l_1)/l_1}$. This well-known equation can be explicitly solved. It has a complex solution with asymptotic behavior $\xi(t) \to Ce^{i\omega t} + Be^{-i\omega t}$ as $t \to -\infty$ and $\xi(t) \to e^{i\omega t}$ as $t \to \infty$, where $|B|$ is given by, either
\[ \frac{\cos\left(\frac{\pi}{2}\sqrt{-24\gamma^2+1}\right)}{\sinh(\pi\gamma)} \quad\text{or}\quad \frac{\cosh\left(\frac{\pi}{2}\sqrt{24\gamma^2-1}\right)}{\sinh(\pi\gamma)} , \]
depending on either $\gamma^2 \le 1/24$ or $\gamma^2 > 1/24$, respectively. Finally, we can show that $\alpha = |B| + \sqrt{|B|^2+1} \ge 1$ or, equivalently, $2|B| = \alpha - \alpha^{-1}$ (see [6, 9]).

Our goal in this paper is to prove the following two theorems relating the stability of the double saddle-center loop $\Gamma\cup\Gamma'\cup r$ to its coefficients $\gamma = \omega/\nu$ and $\alpha$:

Theorem 1 (Stability). Assume that $(M,\Omega,H)$ satisfies hypotheses H1, H2, H3. Then there exists a function $Z : (0,\infty) \to (1,\infty]$ such that if $1 \le \alpha < Z(\gamma)$, then the double saddle-center loop $\Gamma\cup\Gamma'\cup r$ is stable.

The proof of Theorem 1 is based on the following ideas. For $\alpha = 1$, map $F$, defined in (2), is integrable (it leaves invariant all circles centered at the origin). This implies that for $\alpha = 1$, $y = (0,0)$ is stable under iterations of $F$. If $F$ were sufficiently smooth at the origin (it is not even differentiable) then we could immediately apply the KAM theorem to show the stability of $y = (0,0)$ under $F$ for $\alpha \approx 1$. In order to overcome this nonsmoothness problem we have to use the important discrete dilation symmetry of $F$ given by $F(e^{-k\pi/\gamma}y) = e^{-k\pi/\gamma}F(y)$, $k \in \mathbb{Z}$. This symmetry allows us to apply the KAM theorem far from the origin and then pull back invariant circles arbitrarily close to it. The symmetry is also crucial in showing that, for $\alpha \approx 1$, the origin is stable under iterations of the full map $F + \mathcal{R}$, Eq. (2), and that the saddle-center loop is stable under perturbations with energies different from that of $\Gamma$.

Our second main result concerns the instability of the double saddle-center loop.

Theorem 2 (Instability). Assume that $(M,\Omega,H)$ satisfies hypotheses H1, H2, H3. If
\[ \gamma(\alpha - \alpha^{-1}) > 1 , \]
then the double saddle-center loop $\Gamma\cup\Gamma'\cup r$ is unstable.
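The discrete dilation symmetry used in both theorems is easy to verify numerically for the dominant part $F$ of the Poincaré map (2). The sketch below is illustrative only: the function name `leading_map` and the parameter values are assumptions, with `alpha` denoting the stretching coefficient in $A = \mathrm{diag}(\alpha, 1/\alpha)$ and `gamma` the ratio $|\omega/\nu|$. Scaling $y$ by $e^{-k\pi/\gamma}$ shifts the rotation angle $-2\gamma\ln\|y\|$ by $2k\pi$, so $F$ commutes with that scaling; for $\alpha = 1$ the map preserves every circle centered at the origin.

```python
import math


def leading_map(y, alpha, gamma):
    """Dominant part of the Poincare map: F(y) = A R(-2*gamma*ln|y|) y.

    R(theta) is the rotation by theta and A = diag(alpha, 1/alpha).
    y must be non-zero.
    """
    y1, y2 = y
    radius = math.hypot(y1, y2)
    theta = -2.0 * gamma * math.log(radius)
    c, s = math.cos(theta), math.sin(theta)
    rotated = (c * y1 - s * y2, s * y1 + c * y2)
    return (alpha * rotated[0], rotated[1] / alpha)


if __name__ == "__main__":
    alpha, gamma, k = 1.4, 0.7, 3
    y = (0.3, 0.1)
    scale = math.exp(-k * math.pi / gamma)
    lhs = leading_map((scale * y[0], scale * y[1]), alpha, gamma)
    rhs = tuple(scale * component for component in leading_map(y, alpha, gamma))
    print("F(scale * y) =", lhs)
    print("scale * F(y) =", rhs)
```

In particular, any fixed point $p \ne 0$ of $F$ generates the whole family of fixed points $e^{-k\pi/\gamma}p$, $k \in \mathbb{Z}$, which accumulates at the origin.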
The main property of $F$ that allows us to prove Theorem 2 is again the invariance under discrete dilation. Notice that if $F$ has a fixed point $p$ distinct from $y = (0,0)$, then it has an infinite family of fixed points given by $e^{-k\pi/\gamma}p$, $k \in \mathbb{Z}$. This and some explicit computation imply that, for $\alpha \ne 1$, $(M,\Omega,H)$ has a sequence of unstable periodic orbits $\Gamma_k$, $k = 1, 2, 3, \ldots$, that accumulates on the double saddle-center loop. Moreover, if the condition of Theorem 2 is verified, then we show that the unstable and stable manifolds of all periodic orbits in this sequence intersect transversally, inside the energy level of $\Gamma$, forming an infinite heteroclinic chain. Since $\Gamma_k$ approaches $\Gamma\cup\Gamma'\cup r$ as $k \to \infty$ we conclude that the double saddle-center loop is unstable. In particular, $(M,\Omega,H)$ is not integrable.

Numerical studies of map $F$ and a model Hamiltonian (see [8]) and the two theorems above lead us to conjecture that there exists a critical curve on the $(\gamma,\alpha)$-space, approximately given by
\[ \gamma(\alpha - \alpha^{-1}) = \frac{1}{\sqrt{2}} , \]
such that if the pair $(\gamma,\alpha)$ related to a given double saddle-center loop is above this curve then the double saddle-center loop is unstable, otherwise it is stable. If we apply this conjecture to the 2-pendulum presented above, we conclude that its double saddle-center loop related to the homoclinic orbit (3) is stable if $c < l_1/l_2 < 1$, where $c \approx 22/23$, otherwise it is unstable. The stability part of this statement is not very intuitive. In order to show its plausibility we point out that
255
Stability of Double Homoclinic Loops
in the limit l1 =l2 → 1 the coecient in Eq. (3) tends to in nity which implies that the -component of tends to a step function. Therefore, when l1 =l2 ≈ 1 the solution related to stays a very short period of time near = 0 where perturbations in the -direction tend to destabilize the double saddle-center loop. Numerical simulations with the 2-pendulum support our conjecture.
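As an illustration only, the relation between B and the coefficient alpha, the instability threshold of Theorem 2 and the conjectured critical curve can be collected in a few lines of Python. The function names `alpha_from_B` and `classify` are hypothetical helpers introduced for this sketch; the stability region of Theorem 1 is not covered because the function Z is not given explicitly.

```python
import math

def alpha_from_B(B):
    """alpha = |B| + sqrt(|B|^2 + 1), so that alpha - 1/alpha = 2|B|."""
    b = abs(B)
    return b + math.sqrt(b * b + 1.0)

def classify(alpha, gamma):
    """Classify a loop by s = gamma * (alpha - 1/alpha).

    s > 1               : unstable by Theorem 2
    1/sqrt(2) < s <= 1  : unstable only according to the conjecture
    s <= 1/sqrt(2)      : stable according to the conjecture
    """
    s = gamma * (alpha - 1.0 / alpha)
    if s > 1.0:
        return "unstable (Theorem 2)"
    if s > 1.0 / math.sqrt(2.0):
        return "unstable (conjectured)"
    return "stable (conjectured)"

if __name__ == "__main__":
    a = alpha_from_B(0.75)
    print(a, a - 1.0 / a)      # alpha and 2|B|
    print(classify(a, 1.0))
```

Only the first label rests on a proved statement; the other two merely restate the conjecture and should be read as such.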
2. Poincaré Maps to Double Homoclinic Loops

We start this section with a normal form result essentially due to Moser [16], with a supplement of Rüssmann [19]. The result concerning the form of the involution S is given in [7].

Theorem 3. Let $(M, \Omega, H)$ be a Hamiltonian system and S be an involution satisfying hypotheses H1 and H2. Then there exists a neighborhood U of r with symplectic coordinates $(p_1, q_1, p_2, q_2) := x$ and symplectic form $\Omega = dp_1 \wedge dq_1 + dp_2 \wedge dq_2$ such that, in these coordinates, the Hamiltonian function H is given by
$$h(I_1, I_2) = -\nu I_1 + \omega I_2 + R(I_1, I_2), \qquad R(I_1, I_2) = O(I_1^2 + I_2^2), \qquad (I_1 = p_1 q_1,\ I_2 = (p_2^2 + q_2^2)/2),$$
and S is given by one of the following matrices:
$$-\begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix}, \qquad \begin{pmatrix} -I & 0 \\ 0 & I \end{pmatrix}, \qquad \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}, \qquad (4)$$
where I is the 2 × 2 identity matrix.

Remark. With a possible time reversion ($t \to -t$) and a canonical transformation ($q_1 \to -p_1$, $p_1 \to q_1$) we can always make $\nu > 0$ and $\omega > 0$. For simplicity we will assume this normalization.

The procedure we follow below is related to those in Conley [2, 3], Churchill and Rod [4], Llibre et al. [11], and is essentially the same as the one in Lerman [12] and Mielke et al. [15]. Let us denote by $\phi : w \mapsto \phi^t(w)$ the flow related to $(M, \Omega, H)$. In the coordinate system of Theorem 3, $\phi^t$ is written as
$$\begin{pmatrix} p_1(t) \\ q_1(t) \\ p_2(t) \\ q_2(t) \end{pmatrix} = \begin{pmatrix} e^{-t\partial_{I_1}h_0} & 0 & 0 & 0 \\ 0 & e^{t\partial_{I_1}h_0} & 0 & 0 \\ 0 & 0 & \cos(t\partial_{I_2}h_0) & -\sin(t\partial_{I_2}h_0) \\ 0 & 0 & \sin(t\partial_{I_2}h_0) & \cos(t\partial_{I_2}h_0) \end{pmatrix} \begin{pmatrix} p_1(0) \\ q_1(0) \\ p_2(0) \\ q_2(0) \end{pmatrix}, \qquad (5)$$
where $\partial_{I_1}h_0 = \partial_{I_1}h(I_1(0), I_2(0))$, $\partial_{I_2}h_0 = \partial_{I_2}h(I_1(0), I_2(0))$, $I_1(0) = p_1(0)q_1(0)$, $I_2(0) = [p_2^2(0) + q_2^2(0)]/2$.

Using the coordinate system of Theorem 3 and hypothesis H3 we define four Poincaré sections $\Sigma_k$, $k = 1, 2, 3, 4$, as
$$\Sigma_1 = \{x \mid q_1 = \delta\}, \qquad \Sigma_2 = \{x \mid p_1 = \delta\}, \qquad \Sigma_3 = S(\Sigma_1), \qquad \Sigma_4 = S(\Sigma_2),$$
where $\delta$ is some positive real number to be fixed later. Sections $(\Sigma_1, \Sigma_3)$ and $(\Sigma_2, \Sigma_4)$ are transversal to the stable and unstable manifolds of r, respectively. The fact that $\partial_{I_1}h(0, 0) = -\nu \neq 0$ allows us to solve the equation $E = h(I_1, I_2)$ for $I_1$ to obtain
$$I_1 = v(I_2, E) = -\frac{1}{\nu}E + \frac{\omega}{\nu}I_2 + \tilde{v}(I_2, E), \qquad (6)$$
where $\tilde{v}$ is an analytic function with $\tilde{v}(I_2, E) = O(E^2 + I_2^2)$. The invertibility of $h(\,\cdot\,, I_2)$ and the fact that $I_1 = p_1 q_1$ also imply that we can take $(p_2, q_2, E)$ as coordinates in $\Sigma_k$, $k = 1, 2, 3, 4$. Expression (5) for the flow allows us to define, and explicitly write, Poincaré maps $L_1 : \Sigma_1 \to \Sigma_2 \cup \Sigma_4$ and $L_3 : \Sigma_3 \to \Sigma_2 \cup \Sigma_4$. Maps $L_1$ and $L_3$ are defined in a similar way, so in the following we just consider $L_1$.

The coordinates of Theorem 3 are very convenient because they allow us to think of the flow as the Cartesian product of very simple flows in the $(p_1, q_1)$ and $(p_2, q_2)$ planes. If we look at the $(p_1, q_1)$ part of the flow we easily conclude that for $I_1 > 0$ points in $\Sigma_1$ are mapped into $\Sigma_2$ and for $I_1 < 0$ points in $\Sigma_1$ are mapped into $\Sigma_4$. Points with $I_1 = 0$ converge to $(p_1, q_1) = (0, 0)$ as either $t \to +\infty$ or $t \to -\infty$. This implies that $L_1$ must have a discontinuity whenever $I_1 = 0 = v(I_2, E)$. Since $\partial_E v(0, 0) \neq 0$ we can solve the equation $v(I_2, E) = 0$ for E to describe the set of discontinuities of $L_1$ as the graph of a function $E_c : I_2 \to E$. We remark that this discontinuity set corresponds to the intersection of the stable manifolds of the periodic orbits in the center manifold of r and $\Sigma_1$ (these periodic orbits are explicitly given by $p_1 = q_1 = 0$, $I_2 = \text{constant} > 0$). The function $E_c$ can be written as
$$E_c(I_2) = \omega I_2 + O(I_2^2).$$
For $I_2$ small, $E_c$ is an increasing function.

Remark. In the rest of this paper we restrict our attention to a sufficiently small neighborhood of r such that the monotonicity of $E_c$ holds for all values of $I_2$ we consider. Thus, if $E < E_c(I_2)$ then $I_1 > 0$ and $L_1(\,\cdot\,, E) : \Sigma_1 \to \Sigma_2$, and if $E > E_c(I_2)$ then $I_1 < 0$ and $L_1(\,\cdot\,, E) : \Sigma_1 \to \Sigma_4$.

Using the explicit representation (5) for the flow, we obtain that the time T that a point in $\Sigma_1$ takes to reach either $\Sigma_2$ or $\Sigma_4$ is given by
$$T = \frac{1}{\partial_{I_1}h} \log \frac{|v(I_2, E)|}{\delta^2}.$$
Using the time T and the flow expression (5) we can write $L_1$ explicitly.
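The transit-time formula can be checked in the simplest situation, a minimal sketch assuming the remainder R in the normal form vanishes, so that the derivative of h with respect to I1 is the constant -nu. The names `transit_time` and `flow_p1q1` are hypothetical and only the (p1, q1) part of the flow (5) is modelled.

```python
import math

def transit_time(I1, delta, nu):
    """T = (1 / d_{I1} h) * log(|I1| / delta^2) with d_{I1} h = -nu (assumes R = 0)."""
    return math.log(abs(I1) / delta**2) / (-nu)

def flow_p1q1(p1, q1, t, nu):
    """(p1, q1) block of the flow (5) when d_{I1} h = -nu."""
    return p1 * math.exp(nu * t), q1 * math.exp(-nu * t)

if __name__ == "__main__":
    delta, nu, I1 = 0.1, 1.5, 1e-4
    # A point of the section q1 = delta with p1 * q1 = I1 > 0.
    p1_0, q1_0 = I1 / delta, delta
    T = transit_time(I1, delta, nu)
    p1_T, q1_T = flow_p1q1(p1_0, q1_0, T, nu)
    print(T, p1_T, q1_T)   # p1_T should equal delta, i.e. the point reached p1 = delta
```

For I1 < 0 the same formula gives the time at which p1 reaches -delta; in this sketch that is taken to correspond to the section reached on the other branch. The product p1*q1 stays equal to I1 along the flow, which is why (p2, q2, E) suffice as coordinates on the sections.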
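If $x \in \Sigma_1$ has coordinates $(p_2, q_2, E) := (y, E)$ (we denote the pair $(p_2, q_2)$ as y and recall that $I_2 := \|y\|^2/2$) with $E \neq E_c(I_2)$ and if $L_1(x) \in \Sigma_2 \cup \Sigma_4$ has coordinates $(p_2', q_2', E') := (y', E')$ then
$$E' = E \quad \text{and} \quad y' = \ell_1(y, E) := R(\theta(y, E))\,y, \qquad (7)$$
where
$$\theta(y, E) := T\,\partial_{I_2}h = -\partial_{I_2}v(I_2, E) \log \frac{|v(I_2, E)|}{\delta^2}, \qquad I_2 = \frac{|y|^2}{2},$$
and
$$R(\theta) := \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}. \qquad (8)$$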
For $E = E_c(I_2)$ we define $L_1(y, E) := (y', E') = (y, E) \in \Sigma_2$. The map $L_3$ is defined in a similar way, with the difference that $E \le E_c(I_2)$ implies $L_3(\,\cdot\,, E) : \Sigma_3 \to \Sigma_4$, and $E > E_c(I_2)$ implies $L_3(\,\cdot\,, E) : \Sigma_3 \to \Sigma_2$.

Our goal in the rest of this section is to define a Poincaré map to describe the dynamics near the double homoclinic loop. So the next step is to use the flow near compact parts of $\Gamma$ and $\Gamma'$ to define a Poincaré map from $\Sigma_2 \cup \Sigma_4$ to $\Sigma_1 \cup \Sigma_3$. At this point we have to distinguish two topological cases:

Case 1) the orbit $\Gamma$ coincides locally with the semi-axes $p_1 > 0$ and $q_1 > 0$;
Case 2) the orbit $\Gamma$ coincides locally with the semi-axes $p_1 > 0$ and $q_1 < 0$.
In Case 1 we define Poincaré maps $G_{21} : \Sigma_2 \to \Sigma_1$ and $G_{43} : \Sigma_4 \to \Sigma_3$. In Case 2 we define Poincaré maps $G_{23} : \Sigma_2 \to \Sigma_3$ and $G_{41} : \Sigma_4 \to \Sigma_1$. We will consider Cases 1 and 2 separately.

2.1. Case 1. In this case we define an invertible map $F : \Sigma_1 \cup \Sigma_3 \to \Sigma_1 \cup \Sigma_3$ as (see Fig. 2):

a) If $x \in \Sigma_1$ has coordinates $(y, E)$ with $E \le E_c(I_2)$, then $F(x) := G_{21} \circ L_1(x) \in \Sigma_1$;
b) If $x \in \Sigma_3$ has coordinates $(y, E)$ with $E \le E_c(I_2)$, then $F(x) := G_{43} \circ L_3(x) \in \Sigma_3$;
c) If $x \in \Sigma_1$ has coordinates $(y, E)$ with $E > E_c(I_2)$, then $F(x) := G_{43} \circ L_1(x) \in \Sigma_3$;
d) If $x \in \Sigma_3$ has coordinates $(y, E)$ with $E > E_c(I_2)$, then $F(x) := G_{21} \circ L_3(x) \in \Sigma_1$.
Fig. 2. Diagram showing several possible trajectories related to the construction of the Poincaré map F in Case 1. The trajectory labeled by “a” corresponds to the part of F described in the text in Case 1, item “a)”. The same is valid for “b, c, and d”.
Using the symmetry S we can re-define F in terms only of $L_1$, $G_{21}$ and S. Indeed, using that $S \circ \phi^t = \phi^t \circ S$ we obtain (the symbol S in the expressions below must be understood as representing the restriction of S to different sections $\Sigma_k$; the composition with other maps determines, with no ambiguity, the meaning of S):
$$L_1 \circ S = S \circ L_3, \qquad G_{43} \circ S = S \circ G_{21}. \qquad (9)$$
The symmetry of all maps appearing in the definition of F allows us to say that F is reversible with respect to S (or, more precisely, “with respect to the restriction of S to $\Sigma_1 \cup \Sigma_3$”), namely
$$S \circ F = F \circ S.$$
In the following sections an important role will be played by F restricted to the part of $\Sigma_1$ where $E \le E_c(I_2)$. In this case, if $x \in \Sigma_1$ has coordinates $(y, E)$ and $F(x) \in \Sigma_1$ has coordinates $(y', E')$, then
$$E' = E, \qquad y' = f(y, E) := g(\ell_1(y, E), E), \qquad (10)$$
where $\ell_1$ is defined in (7) and g represents the y-components of $G_{21}$. Function g is analytic, satisfies $g(0, 0) = 0$ ($y = 0$, $E = 0$, $y' = 0$, $E' = 0$ correspond to points on the homoclinic orbit $\Gamma$), and its leading order expansion is given by
$$g(y, E) = AR(K)y + Eu + \tilde{g}(y, E), \qquad (11)$$
where: $\partial_y g(0, 0) := AR(K)$ is decomposed as a symmetric positive matrix A and a rotation matrix $R(K)$, $\partial_E g(0, 0) := u \in \mathbb{R}^2$, and $\tilde{g}$ is an analytic function with $\tilde{g}(y, E) = O(|y|^2 + E^2)$. The flow expression (5) implies that if we vary the number $\delta$ in the definition of $\Sigma_1$, then $\partial_y g(0, 0)$ changes as $\partial_y g(0, 0) \to R(K')\,\partial_y g(0, 0)\,R(K')$, where $K' \in (0, 2\pi]$ depends on $\delta$. Thus, choosing $\delta$ appropriately we can make
$$A = \begin{pmatrix} \alpha & 0 \\ 0 & 1/\alpha \end{pmatrix}, \qquad \alpha > 0. \qquad (12)$$
The symplectic character of the flow implies $\det(A) = 1$.

2.2. Case 2. In this case we define an invertible map $F : \Sigma_1 \cup \Sigma_3 \to \Sigma_1 \cup \Sigma_3$ as (see Fig. 3):

a) If $x \in \Sigma_1$ has coordinates $(y, E)$ with $E \le E_c(I_2)$, then $F(x) := G_{23} \circ L_1(x) \in \Sigma_3$;
b) If $x \in \Sigma_3$ has coordinates $(y, E)$ with $E \le E_c(I_2)$, then $F(x) := G_{41} \circ L_3(x) \in \Sigma_1$;
c) If $x \in \Sigma_1$ has coordinates $(y, E)$ with $E > E_c(I_2)$, then $F(x) := G_{41} \circ L_1(x) \in \Sigma_1$;
d) If $x \in \Sigma_3$ has coordinates $(y, E)$ with $E > E_c(I_2)$, then $F(x) := G_{23} \circ L_3(x) \in \Sigma_3$.
As in Case 1, the symmetry S allows us to write F in terms only of $L_1$, $G_{23}$ and S. Here, the relations analogous to (9) are:
$$L_3 = S \circ L_1 \circ S, \qquad G_{41} = S \circ G_{23} \circ S.$$
Fig. 3. Diagram showing several possible trajectories related to the construction of the Poincaré map F in Case 2. The trajectory labeled by “a” corresponds to the part of F described in the text in Case 2, item “a)”. The same is valid for “b, c, and d”.
Again we can say that F is reversible with respect to S, namely $S \circ F = F \circ S$. Consider the set of points $x \in \Sigma_1$ that have coordinates $(y, E)$ with $E \le E_c(I_2)$, and such that $x' = G_{23} \circ L_1(x) \in \Sigma_3$ has coordinates $(y', E)$ with $E \le E_c(I_2')$. In this case, $F \circ F(x) \in \Sigma_1$ can be written as
$$F \circ F(x) = G_{41} \circ L_3 \circ G_{23} \circ L_1(x) = S \circ G_{23} \circ L_1 \circ S \circ G_{23} \circ L_1(x).$$
This suggests the definition of a map
$$\hat{F}(x) := S \circ G_{23} \circ L_1(x) = S \circ F(x) \in \Sigma_1, \qquad (13)$$
for all $x \in \Sigma_1$ that have coordinates $(y, E)$ with $E \le E_c(I_2)$ and such that $x' = G_{23} \circ L_1(x) \in \Sigma_3$ has coordinates $(y', E)$ with $E \le E_c(I_2')$. In this case we have
$$F \circ F(x) = \hat{F} \circ \hat{F}(x).$$
Map $\hat{F}$ will play an important role in the next sections. So, let us write it more explicitly. The involution S does not change the value of E. From Theorem 3 we also know that S either does not change $(p_2, q_2)$ or changes it to $(-p_2, -q_2)$. Thus, if $x \in \Sigma_1$ has coordinates $(y, E)$ such that $E \le E_c(I_2)$ and $\hat{F}(x) \in \Sigma_1$ has coordinates $(y', E')$, then
$$E' = E, \qquad y' = \pm f(y, E) := \pm g(\ell_1(y, E), E),$$
where $\ell_1$ is defined in (7), g represents the y-components of $G_{23}$ and the choice of sign in ± depends on S. The leading order expansion of g is again given by an expression of the form (11).
3. Proof of the Stability Theorem

The results in the preceding section imply that any orbit sufficiently close to the double homoclinic loop intersects either $\Sigma_1$ or $\Sigma_3$. Successive intersections determine successive iterations of map F. Let us denote by $\Lambda$ the union of the point $x \in \Sigma_1$ with coordinates $(y, E) := (0, 0, 0) = (0, 0)$ and the point $x \in \Sigma_3$ with the same coordinates ($\Lambda$ represents the intersection of $\Gamma \cup \Gamma'$ with $\Sigma_1 \cup \Sigma_3$). In both Cases 1 and 2 (Sects. 2.1 and 2.2) $\Lambda$ is invariant under F. Moreover, the definition of F implies that if $\Lambda$ is stable under iterations of F then the double homoclinic loop $\Gamma \cup \Gamma' \cup r$ is stable under the action of the flow (namely, for any neighborhood $U \subset M$ of $\Gamma \cup \Gamma' \cup r$ it is possible to find another neighborhood $V \subset M$ of it, such that $x \in V \Rightarrow \phi^t(x) \in U$, for all $t \in \mathbb{R}$). Therefore, in order to prove Theorem 1 we just have to show that the hypotheses of the theorem imply that $\Lambda$ is stable under iterations of F. This will be done in several steps. The first one is to reduce the problem to the analysis of a one parameter family of twist maps.

Lemma 1. Suppose that for any $\eta > 0$ it is possible to find $\bar{E}$, $0 < \bar{E} < \eta$, such that for every $|E| < \bar{E}$ either the map $f(\,\cdot\,, E)$, given by (10), (Case 1) or the map $\pm f(\,\cdot\,, E)$ (Case 2), has a rotational invariant circle $C(E)$ (or, a closed invariant curve encircling the origin) that is contained in the disc $|y| < \eta$. Then the set $\Lambda$, defined above, is stable under iterations of F.

Proof. Let $V_\eta$ denote the union of the set of points in $\Sigma_1$ with coordinates $(y, E)$ such that $|E| < \eta$, $|y| < \eta$, and the set of points in $\Sigma_3$ with coordinates $(y, E)$ such that $|E| < \eta$, $|y| < \eta$. In order to prove that $\Lambda$ is stable we have to show that for any given $V_\eta$ we can find a neighborhood U of $\Lambda$ such that $x \in U \Rightarrow F^n(x) \in V_\eta$, for any $n \in \mathbb{Z}$. Let us denote by $B_1$ the set of points in $\Sigma_1$ with coordinates $(y, E)$ such that (see Fig. 4): $E \le E_c(I_2)$, $|E| \le \bar{E}$ and, for each value of E, y is inside $C(E)$ ($C(E)$ is the closed invariant curve whose existence was assumed).
Let us denote by $C_1$ the set of points in $\Sigma_1$ with coordinates $(y, E)$ such that $E > E_c(I_2)$ and $0 < E \le \bar{E}$ (see Fig. 4). The set $D_1 = B_1 \cup C_1$ is a finite cylinder with boundary $\partial D_1$ given by
$$\{(y, E) \mid y \in C(E),\ |E| \le \bar{E}\} \cup \{(y, \pm\bar{E}) \mid y \text{ is inside } C(\pm\bar{E})\}.$$
We define sets $B_3$, $C_3$ and $D_3$ in $\Sigma_3$ as: $B_3 = S(B_1)$, $C_3 = S(C_1)$ and $D_3 = S(D_1)$. Notice that $D_1 \cup D_3 \subset V_\eta$ since $\bar{E} < \eta$ and $C(E)$ is contained in the disc $|y| < \eta$ for every $|E| < \bar{E}$. So, the lemma will be proved if we show that $F(D_1 \cup D_3) \subset D_1 \cup D_3$.

Let us first consider Case 1. We recall that if $x \in \Sigma_1$ has coordinates $(y, E)$ such that $E \le E_c(I_2)$, then $F(x) \in \Sigma_1$ has coordinates $(y', E)$ with $y' = f(y, E)$. So, the hypothesis on the invariance of $C(E)$, the continuity of F inside $B_1$, and the fact that F does not change E imply that $F(B_1) \subset D_1$. This and the symmetry of F with respect to S imply that $F(B_3) \subset D_3$. Now, the invariance of $C(E)$, the continuity of F inside $C_1$, the fact that F does not change E and that $F(C_1) \cap D_3 \neq \emptyset$ (which can be easily seen using the decomposition $F = G_{43} \circ L_1$) imply that $F(C_1) \subset D_3$. This and the symmetry of F with respect to S imply that $F(C_3) \subset D_1$. Since $D_1 \cup D_3 = B_1 \cup C_1 \cup B_3 \cup C_3$, the lemma is proved in Case 1.

Now, let us consider Case 2. We recall that if $x \in \Sigma_1$ has coordinates $(y, E)$ such that $E \le E_c(I_2)$, then $\hat{F}(x) = S \circ F(x) \in \Sigma_1$ has coordinates $(y', E)$ with $y' = \pm f(y, E)$, where the sign ± depends on S. So, the hypothesis on the invariance of $C(E)$, the continuity of F inside $B_1$, the fact that F does not change E, and that $S(B_1) = B_3$, imply that $F(B_1) \subset D_3$. This and the symmetry of F with respect to S imply that $F(B_3) \subset D_1$. Now, the invariance of $C(E)$, the continuity of F inside $C_1$, the fact that F does not change E and that $F(C_1) \cap D_1 \neq \emptyset$ (which can be easily seen using the decomposition $F = G_{41} \circ L_1$) imply that $F(C_1) \subset D_1$. This and the symmetry of F with respect to S imply that $F(C_3) \subset D_3$. Since $D_1 \cup D_3 = B_1 \cup C_1 \cup B_3 \cup C_3$, the lemma is also proved in Case 2.

Fig. 4. Diagram representing the set $D_1 = B_1 \cup C_1$ described in the text. The set $C_1$ is given by the interior of the paraboloid defined by $E = E_c(I_2) = E_c[(p_2^2 + q_2^2)/2]$. The set $B_1$ is given by the complement of $C_1$. The outer boundary of $B_1$ is foliated by invariant curves $C(E)$. Two of them, $C(E_1)$ and $C(E_2)$, are represented in the figure.

Now, let us define new variables z and $\varepsilon$ as
$$y = \mu z, \qquad E = \varepsilon \mu^2, \qquad \mu := e^{-k\pi/\gamma}, \qquad k \in \mathbb{Z},$$
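where $\gamma = \omega/\nu$. We want to write f in the new variables. In order to do this we first write the several factors of f (see Eqs. (6), (7), (8), (10)) in the new variables:
$$G(z, \varepsilon, \mu) := \frac{1}{\mu}\, g(\mu z, \varepsilon\mu^2) = AR(K)z + \mu\varepsilon u + \frac{1}{\mu}\,\tilde{g}(\mu z, \varepsilon\mu^2),$$
$$b(|z|^2/2, \varepsilon, \mu) := \frac{1}{\mu^2}\, v(\mu^2|z|^2/2, \varepsilon\mu^2) = -\frac{\varepsilon}{\nu} + \frac{\omega}{\nu}\frac{|z|^2}{2} + \frac{1}{\mu^2}\,\tilde{v}(\mu^2|z|^2/2, \varepsilon\mu^2),$$
$$\partial_{I_2} b(|z|^2/2, \varepsilon, \mu) := \partial_{I_2} v(\mu^2|z|^2/2, \varepsilon\mu^2) = \gamma + \partial_{I_2}\tilde{v}(I_2, E)\big|_{I_2 = \mu^2|z|^2/2,\ E = \varepsilon\mu^2} := \gamma + v'(\mu^2|z|^2/2, \varepsilon\mu^2),$$
$$L(z, \varepsilon, \mu) := \frac{1}{\mu}\,\ell_1(\mu z, \varepsilon\mu^2) = R\!\left(-\partial_{I_2} b \,\log\frac{\mu^2 |b|}{\delta^2}\right) z$$
$$= R\!\left(-\left[\gamma + v'(\mu^2|z|^2/2, \varepsilon\mu^2)\right] \log\left[\frac{1}{\delta^2}\left| -\frac{\varepsilon}{\nu} + \frac{\omega}{\nu}\frac{|z|^2}{2} + \frac{1}{\mu^2}\,\tilde{v}(\mu^2|z|^2/2, \varepsilon\mu^2)\right|\right] - v'(\mu^2|z|^2/2, \varepsilon\mu^2)\log\mu^2\right) z,$$
where $\tilde{g}(y, E) = O(|y|^2 + E^2)$, $\tilde{v}(I_2, E) = O(E^2 + I_2^2)$ and $v'(I_2, E) = O(\sqrt{E^2 + I_2^2})$. The important point in these expressions is that the term $\gamma\log\mu^2 = -2k\pi$ does not appear in the argument of $R(\,\cdot\,)$ in the definition of L. Finally we write
$$\mathcal{F}(z, \varepsilon, \mu) := \pm\frac{1}{\mu}\, f(\mu z, \varepsilon\mu^2) = \pm G(L(z, \varepsilon, \mu), \varepsilon, \mu), \qquad (14)$$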
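where in Case 1 we have the + sign and in Case 2 we choose the ± sign according to S. Taking the limit as $\mu \to 0$ (or $k \to \infty$) we get (this map was first obtained by Lerman [12] and Mielke et al. [15])
$$\mathcal{F}(z, \varepsilon, 0) = AR\!\left(c - \gamma\log\left|\omega\frac{|z|^2}{2} - \varepsilon\right|\right) z, \qquad (15)$$
where either $c = K + \gamma\log(\nu\delta^2)$ or $c = K + \gamma\log(\nu\delta^2) + \pi$ depending on the sign ± appearing in Eq. (14). We remark that the change of variables $z \to R(\pi/2)z$ is equivalent to the transformation $\alpha \to \alpha^{-1}$ (see Eq. (12)), so we can assume that $\alpha \ge 1$. Equation (15) implies that
$$\mathcal{F}(z, 0, 0) = AR\!\left(c - \gamma\log\left(\omega\frac{|z|^2}{2}\right)\right) z. \qquad (16)$$
From the way we defined function $\mathcal{F}$ we expect it to describe well the dynamics of f when $|y|$ and E are small. Let us consider $\mathcal{F}(\,\cdot\,, 0, 0)$ as a mapping from the annulus
$$\mathcal{A} := \{e^{-\pi/\gamma} \le |z| \le 1\}$$
to $\mathbb{R}^2$. If we define polar coordinates as
$$z_1 = \sqrt{2I}\cos(\theta - c + \gamma\log(\omega I)), \qquad z_2 = \sqrt{2I}\sin(\theta - c + \gamma\log(\omega I)),$$
then we can write $\mathcal{F}(\,\cdot\,, 0, 0) : (I, \theta) \to (I', \theta')$ as
$$I' = I J(\theta), \qquad \theta' = c - \gamma\log\omega + u(\theta) - \gamma\log I - \gamma\log J(\theta), \qquad (17)$$
where
$$J(\theta) = \alpha^2\cos^2\theta + \alpha^{-2}\sin^2\theta, \qquad u(\theta) = \arctan\left(\frac{\tan\theta}{\alpha^2}\right), \qquad \theta \in (0, \pi/2) \Rightarrow u(\theta) \in (0, \pi/2).$$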
Stability of Double Homoclinic Loops
Map F( · ; 0; 0) is a twist map that depends analytically on two parameters = 1 and ¿ 0. If = 1 then F( · ; 0; 0) just rotates each circle I = constant (in this case we denote F( · ; 0; 0) as F( · ; 0; 0)=1 ). So, by the KAM theorem, F( · ; 0; 0) has rotational invariant circles for ≈ 1. These invariant circles, after re-scaling, will allow us to prove the stability theorem using Lemma 1. In order to be more precise at this point let j = 5 be some xed integer and : A → R2 be any j-times continuously dierentiable mapping. Denoting as (I ; ) the (I; ) coordinates of , we de ne m+n m+n @ I @ (18) kkj = sup m n + sup m n : m+n 5 j @I @ m+n 5 j @I @ A consequence of Moser’s twist map theorem ([17], Theorem 2.11) is the following lemma. Lemma 2. For a xed j there exists a function → ( ); with ( ) ¿ 0; such that if k − F( · ; 0; 0)=1 k j ¡ ( ) ; then has a rotational invariant circle in A: Due to Lemma 1, in order to nish the proof of the stability Theorem 1 we just have to prove the following result. Lemma 3. There exists a function Z : (0; ∞) → (1; ∞] such that if 1 5 ¡ Z( ); 0 ¡ E ¡ ; such that for every then for any given ¿ 0 it is possible to nd E; |E| ¡ E either the map f( · ; E) de ned in (10) (Case 1); or the map ±f( · ; E) (Case 2); has a rotational invariant circle (E) that is contained in the disc |y| ¡ . Proof. Let 0 ¿ 0 and 0 ¿ be small enough numbers such that F; given by expression (14), is de ned for all z; ; and ; that verify |z| = 1; || ¡ 0 and || ¡ 0 . Here we assume that does not only assume discrete values, e−k= ; k ∈ Z; but all possible real values. We de ne 1 = min{!e−2= =4; 0 } and set 1 ¿ 0; 1 5 0 ; such that 1 2 −2= !e−2= + 2 v( e =2; 2 ) ¿ 0 4 for || ¡ 1 and || ¡ 1 (here we used that v (I2 ; E) = O(I22 + E 2 )). This choice of 1 and 1 ensures that b(|z|2 =2; ; ) = v (2 |z|2 =2; 2 ) ¿ 0 for z ∈ A; || ¡ 1 and || ¡ 1 . So, F( · ; ; ) can be considered as a function from A to R2 for all || ¡ 1 and || ¡ 1 . 
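All functions that are present in the definition of $\mathcal{F}$, see (14), are analytic with respect to $(z, \varepsilon, \mu)$, except
$$v'(\mu^2|z|^2/2, \varepsilon\mu^2)\log\mu^2 := \psi(z, \varepsilon, \mu)$$
for $\mu = 0$ (this function appears in the argument of the rotation matrix in the definition of L). Since $v'(I_2, E) = O(|I_2| + |E|)$ is analytic, $\psi(z, \varepsilon, \,\cdot\,)$ is differentiable with respect to $\mu$ at $\mu = 0$. Moreover, $\psi(\,\cdot\,, \,\cdot\,, \mu)$ is infinitely many times differentiable for all $|\mu| < \mu_1$ and all of its derivatives tend to zero as $\mu \to 0$. This implies that we can write $\mathcal{F}$ as
$$\mathcal{F}(z, \varepsilon, \mu) = \mathcal{F}(z, 0, 0) + \tilde{\mathcal{F}}(z, \varepsilon, \mu),$$
where $\mathcal{F}(\,\cdot\,, 0, 0)$ is given by (16) and all the derivatives of $\tilde{\mathcal{F}}(\,\cdot\,, \varepsilon, \mu) : \mathcal{A} \to \mathbb{R}^2$, for $|\varepsilon| < \varepsilon_1$ and $|\mu| < \mu_1$, tend to zero as $|\varepsilon| + |\mu| \to 0$.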
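We point out that in this proof we do not distinguish between Cases 1 and 2. This is because $\mathcal{F}(\,\cdot\,, 0, 0)$, Eq. (16), has the same expression in both cases (except for the constant c) and the properties of $\tilde{\mathcal{F}}$ that we are going to use do not depend on whether we consider Case 1 or 2 (although the expression for $\tilde{\mathcal{F}}$ does depend on the case).

Let $\Delta$ be the function of $\alpha \ge 1$ and $\gamma > 0$ defined by
$$\Delta(\alpha, \gamma) := \|\mathcal{F}(\,\cdot\,, 0, 0) - \mathcal{F}(\,\cdot\,, 0, 0)_{\alpha=1}\|_j,$$
where $\|\cdot\|_j$ is defined in (18). Function $\Delta(\,\cdot\,, \gamma)$ is continuous and $\Delta(1, \gamma) = 0$. Now, we define the function Z that appears in Lemma 3 and Theorem 1 as
$$Z(\gamma) := \sup\{\bar{\alpha} \mid \Delta(\alpha, \gamma) < \kappa(\gamma) \text{ for } 1 \le \alpha < \bar{\alpha}\} > 1,$$
where $\kappa(\gamma) > 0$ is defined in Lemma 2. The hypothesis $1 \le \alpha < Z(\gamma)$ and the definition of Z imply that
$$\|\mathcal{F}(\,\cdot\,, 0, 0) - \mathcal{F}(\,\cdot\,, 0, 0)_{\alpha=1}\|_j := K' < \kappa(\gamma). \qquad (19)$$
Since all the derivatives of $\tilde{\mathcal{F}}(\,\cdot\,, \varepsilon, \mu)$ tend to zero as $|\varepsilon| + |\mu| \to 0$ we can choose $\varepsilon_2 > 0$, $\varepsilon_2 \le \varepsilon_1$, and $\mu_2 > 0$, $\mu_2 \le \mu_1$, such that
$$\|\tilde{\mathcal{F}}(\,\cdot\,, \varepsilon, \mu)\|_j < \kappa(\gamma) - K', \qquad (20)$$
for $|\varepsilon| < \varepsilon_2$ and $|\mu| < \mu_2$. From Lemma 2 with $\Phi = \mathcal{F}(\,\cdot\,, \varepsilon, \mu)$ we know that if
$$\|\mathcal{F}(\,\cdot\,, \varepsilon, \mu) - \mathcal{F}(\,\cdot\,, 0, 0)_{\alpha=1}\|_j < \kappa(\gamma),$$
then $\mathcal{F}(\,\cdot\,, \varepsilon, \mu)$ has a rotational invariant circle in $\mathcal{A}$. This and inequalities (19) and (20) imply that $\mathcal{F}(\,\cdot\,, \varepsilon, \mu)$ has a rotational invariant circle in $\mathcal{A}$ provided $|\varepsilon| < \varepsilon_2$ and $|\mu| < \mu_2$. Now, let $K''$ be the smallest integer such that
$$\exp[-K''\pi/\gamma] < \mu_2.$$
Then the definition of $\mathcal{F}$ implies that for every value of $\mu$ such that $\mu = e^{-k\pi/\gamma}$, $k \ge K''$, and for $E = \varepsilon e^{-2k\pi/\gamma}$, $|\varepsilon| < \varepsilon_2$, map $f(\,\cdot\,, E)$ has a rotational invariant circle $C(E)$ such that $y \in C(E) \Rightarrow e^{-(k+1)\pi/\gamma} < |y| < e^{-k\pi/\gamma}$ (the same is valid for map $-f(\,\cdot\,, E)$ that appears in Case 2). Finally, let $\bar{K} > K''$ be the smallest integer such that $\eta > e^{-\bar{K}\pi/\gamma}$ and $\eta > \varepsilon_2 e^{-2\bar{K}\pi/\gamma}$. Choosing $\bar{E} = \varepsilon_2 e^{-2\bar{K}\pi/\gamma}$ we conclude that for any $|E| < \bar{E}$ map $f(\,\cdot\,, E)$ has a rotational invariant circle contained in the disc $|y| < e^{-\bar{K}\pi/\gamma} < \eta$, which concludes the proof of the lemma and also of the stability Theorem 1.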
4. Proof of the Instability Theorem

In this section we prove Theorem 2 on the instability of double saddle-center loops. From our discussion in the beginning of Sect. 3, in order to prove the theorem, it is sufficient to show that the set $\Lambda$ is unstable under iterations of F. Since F preserves the energy E, we can restrict our attention to the energy of the double homoclinic loop $E = 0$.

Remark. In this section we always assume $E = 0$ and all elements we consider, like $\Sigma_1$, $\Sigma_3$, F, are understood as restricted to the energy level $E = 0$.

In Case 1, Sect. 2.1, if $x \in \Sigma_1$ then $F(x) \in \Sigma_1$ and, if $x \in \Sigma_3$ then $F(x) \in \Sigma_3$. So, in order to show that $\Lambda$ is unstable we just have to consider F restricted to $\Sigma_1$. In this case, if x has coordinates y and $F(x)$ has coordinates $y'$ then expressions (7), (10) and (11) imply that
$$y' = f(y) = g(\ell_1(y)) = AR\!\left(c - \gamma\log\left(\omega\frac{|y|^2}{2}\right)\right) y + \tilde{f}(y), \qquad (21)$$
where: $\tilde{f}$ is a real analytic function for $|y| \neq 0$, $\|\tilde{f}(y)\| < K'\|y\|^2$, $K' > 0$, and c is some constant. Thus Theorem 2 will be proved if we show that, under its hypotheses, $y = (0, 0) := 0$ is unstable under iterations of f.

In Case 2, Sect. 2.2, if $x \in \Sigma_1$ then $F(x) \in \Sigma_3$ (remember that $E = 0$) and, if $x \in \Sigma_3$ then $F(x) \in \Sigma_1$. Thus, if $x \in \Sigma_1$ then $F \circ F(x) = \hat{F} \circ \hat{F}(x) \in \Sigma_1$, where $\hat{F}$ is defined in (13). Since $\Lambda$ is invariant under F the instability of $\Lambda$ under iterations of $\hat{F}$ implies the instability of $\Lambda$ under iterations of F. Therefore, we can restrict our attention to map $\hat{F}$. If $x \in \Sigma_1$ has coordinates y, then $\hat{F}(x) \in \Sigma_1$ has coordinates $y'$ with $y' = \pm f(y)$, where f is given by (21) and the sign ± depends on S. Indeed, the sign in front of f is not relevant since it can be removed with a redefinition of the constant c and the map $\tilde{f}$ in Eq. (21). Therefore, in order to prove Theorem 2 in Case 2 we again just have to show that $y = 0$ is unstable under iterations of f.

In this section, as in the previous one, a key role is played by
$$\mathcal{F}(y) = AR\!\left(c - \gamma\log\left(\omega\frac{|y|^2}{2}\right)\right) y, \qquad (22)$$
here considered as a mapping from $\mathbb{R}^2$ to $\mathbb{R}^2$.
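The scaling behaviour of the model map (22) can be probed numerically. The sketch below is an illustration under stated assumptions: A is taken as diag(alpha, 1/alpha) as in (12), R is the counter-clockwise rotation (8), and the names `model_map` and `dilation_residual` are hypothetical. It measures how far the map is from commuting with a dilation y -> lam*y.

```python
import math

def model_map(y, alpha, gamma, c=0.0, omega=2.0):
    """Map (22): y -> A R(c - gamma*log(omega*|y|^2/2)) y, with A = diag(alpha, 1/alpha)."""
    y1, y2 = y
    phi = c - gamma * math.log(omega * (y1 * y1 + y2 * y2) / 2.0)
    r1 = math.cos(phi) * y1 - math.sin(phi) * y2
    r2 = math.sin(phi) * y1 + math.cos(phi) * y2
    return (alpha * r1, r2 / alpha)

def dilation_residual(y, lam, alpha, gamma):
    """Relative size of model_map(lam*y) - lam*model_map(y)."""
    a = model_map((lam * y[0], lam * y[1]), alpha, gamma)
    b = model_map(y, alpha, gamma)
    num = math.hypot(a[0] - lam * b[0], a[1] - lam * b[1])
    return num / (lam * math.hypot(b[0], b[1]))

if __name__ == "__main__":
    alpha, gamma, y = 1.7, 0.9, (0.3, -0.2)
    print(dilation_residual(y, math.exp(math.pi / gamma), alpha, gamma))  # round-off level
    print(dilation_residual(y, 2.0, alpha, gamma))                        # order one
```

The residual vanishes up to round-off only when gamma*log(lam^2) is a multiple of 2*pi, that is for lam = exp(k*pi/gamma) with integer k; a generic factor such as 2 changes the rotation angle and the residual is of order one.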
For $|y|$ small we expect the dynamics of f to be similar to the dynamics of $\mathcal{F}$ since $f(y) = \mathcal{F}(y) + \tilde{f}(y)$ and $\|\tilde{f}(y)\| \ll \|\mathcal{F}(y)\|$. Map $\mathcal{F}$ has among other symmetries a remarkable one: it is invariant under discrete dilation $y \to e^{\pi/\gamma}y$ (namely, $\mathcal{F}(e^{\pi/\gamma}y) = e^{\pi/\gamma}\mathcal{F}(y)$). This has strong dynamical consequences. Suppose $\mathcal{F}$ has a hyperbolic fixed point $r_0$, $r_0 \neq 0$. Then it has an infinite family of hyperbolic fixed points given by $r_k = e^{k\pi/\gamma}r_0$, $k \in \mathbb{Z}$. Moreover, if the unstable and stable manifolds of $r_0$ intersect transversally the stable and unstable manifolds of $r_1$, respectively, then the discrete symmetry implies that the unstable and stable manifolds of $r_k$ intersect transversally the stable and unstable manifolds of $r_{k+1}$, respectively, for all $k \in \mathbb{Z}$. As a consequence of the “$\lambda$-lemma” [18], we conclude that the stable manifold of $r_j$, for any fixed $j \in \mathbb{Z}$, intersects transversally the unstable manifold of $r_k$ for all $k \in \mathbb{Z}$. Since $r_k \to 0$ as $k \to -\infty$, this implies that 0 is unstable under iterations of $\mathcal{F}$. Indeed, in any neighborhood of 0 we find points in the stable manifold of $r_j$, for arbitrarily large values of j. We are interested in proving the instability of 0 under iterations of f and not $\mathcal{F}$, though. The following two lemmas, proved in Sect. 4 of [7], show that the existence of the above infinite heteroclinic chain for $\mathcal{F}$ implies the existence of a similar chain for f inside some disc around $y = 0$. This reduces the problem of proving the instability of $y = 0$ under iterations of f to showing the existence of a heteroclinic chain for $\mathcal{F}$ as that described above.

Lemma 4. Suppose $\mathcal{F}$ has a fixed point $r_*$, $r_* \neq 0$, of elliptic or hyperbolic type. Then f has infinitely many fixed points $y_n$, $n = N, N + 1, \ldots$, of the same type and period as $r_*$, where N is a sufficiently large integer. The points $y_n$ are approximately given by
$$y_n = r_*\mu_n + O(\mu_n^2), \qquad \text{where } \mu_n = e^{-n\pi/\gamma}.$$

Lemma 5. Let $r_*$ and $r_*'$ be two hyperbolic fixed points of $\mathcal{F}$ (they may coincide) different from $y = 0$. Suppose that the unstable manifold of $r_*$ and the stable manifold of $r_*'$ intersect transversally at w. Let $y_n$ and $y_n'$, $n = N', N' + 1, \ldots$, denote periodic points of f, given by Lemma 4, related to $r_*$ and $r_*'$, respectively. Then there exists an integer $N > N'$, sufficiently large, such that for every pair $y_n$, $y_n'$, $n = N, N + 1, \ldots$, the unstable manifold of $y_n$ intersects transversally the stable manifold of $y_n'$ at some point given approximately by
$$\mu_n w + O(\mu_n^2), \qquad \text{where } \mu_n = e^{-n\pi/\gamma}.$$
These lemmas are proved using scaling ideas similar to those in the proof of Lemma 3 and the implicit function theorem.

In the rest of the section we consider only map $\mathcal{F}$ defined in Eq. (22). In this definition we can assume $c = 0$, $\omega = 2$ (since the transformation $y \to Ky$ is equivalent to $c \to c - 2\gamma\log K$) and $\alpha \ge 1$ (since $y \to R(\pi/2)y$ is equivalent to $\alpha \to \alpha^{-1}$). Map $\mathcal{F}$ is reversible with respect to the linear involutions $z \to Bz$ and $z \to -Bz$ (namely $\mathcal{F}(\pm Bz) = \pm B\mathcal{F}^{-1}(z)$) where
$$B := \begin{pmatrix} 0 & \alpha \\ \alpha^{-1} & 0 \end{pmatrix}.$$
For $\alpha \neq 1$, $\mathcal{F}$ has four families of fixed points. We denote the elements of the first family by $r_k$, $k \in \mathbb{Z}$, where $r_k = e^{k\pi/\gamma}r_0$ and $r_0$ has coordinates
$$r_{01} = \exp\left(-\frac{\pi - 4\theta}{4\gamma}\right)\cos\theta, \qquad r_{02} = \exp\left(-\frac{\pi - 4\theta}{4\gamma}\right)\sin\theta,$$
with $\theta = \arctan(1/\alpha) \in (0, \pi/4)$. Notice that $r_k$ is reversible with respect to B, namely $Br_k = r_k$, for all $k \in \mathbb{Z}$. The elements of the second family of fixed points are given by $-r_k$. The linearization of $\mathcal{F}$ around $r_0$ and $-r_0$ is associated to the eigenvalues $(\lambda, \lambda^{-1})$, determined by $\lambda + \lambda^{-1} = 2 + 2\gamma(\alpha - \alpha^{-1})$. This implies that all $\pm r_k$ are hyperbolic. Let us denote by $W_k^u$ and $W_k^s$ the unstable and stable manifolds of $r_k$, respectively. The reversibility of $\mathcal{F}$ and $r_k$ with respect to B implies $W_k^u = BW_k^s$. The reversibility of $\mathcal{F}$ with respect to $-B$ implies that the unstable, $W_k^{u-}$, and the stable, $W_k^{s-}$, manifolds of $-r_k$ are given by $W_k^{u-} = -BW_k^s$ and $W_k^{s-} = -BW_k^u$. A simple computation (see [7], Sect. 5) shows that $W_0^u$ has a component that lies at one side of the line $y_1 - \alpha y_2 = 0$ (the set of fixed points of B) and crosses the line $y_1 + \alpha y_2 = 0$ (the set of fixed points of $-B$). At the crossing point both $W_0^u$ and $y_1 + \alpha y_2 = 0$ may have the same tangent space. Let us denote by a (see Fig. 5) the first point of intersection between this component of $W_0^u$ and the line $y_1 + \alpha y_2 = 0$ (the sub-arc of $W_0^u$ in between $r_0$ and a does not intersect the lines $y_1 - \alpha y_2 = 0$ and $y_1 + \alpha y_2 = 0$ at any point). The reversibility of $\mathcal{F}$ with respect to B and $-B$ implies that we can define similar sub-arcs of $W_0^s$, $W_0^{u-}$, and $W_0^{s-}$, such that these sub-arcs connect the points $r_0 \to -a$, $-r_0 \to -a$ and $-r_0 \to a$, respectively. We denote by $\beta_0$ the closed curve encircling the origin defined by the sub-arcs of $W_0^u$, $W_0^{s-}$, $W_0^{u-}$ and $W_0^s$ that connect the points $r_0 \to a$, $a \to -r_0$, $-r_0 \to -a$ and $-a \to r_0$, respectively (see Fig. 5). The discrete dilation symmetry of $\mathcal{F}$ implies that the curves $\beta_k := e^{k\pi/\gamma}\beta_0$, $k \in \mathbb{Z}$, contain the fixed points $\pm r_k$ and are formed by parts of $W_k^u$, $W_k^{s-}$, $W_k^{u-}$ and $W_k^s$. The following lemma reduces the problem of proving the instability of 0 under iterations of $\mathcal{F}$ to showing that some iterate of $\beta_0$ crosses $\beta_1$.

Lemma 6. Suppose that there exists an integer n such that $\mathcal{F}^n(\beta_0)$ crosses $\beta_1$. Then $W_0^u$ intersects transversally: $W_0^s$, $W_0^{s-}$, $W_1^s$ and $W_1^{s-}$.

Proof. First of all, we remember that the discrete symmetry of $\mathcal{F}$ implies that the stable and unstable manifolds of $\pm r_1$ coincide with the stable and unstable manifolds of $\pm r_0$ after a rescaling by the factor $e^{-\pi/\gamma}$. The crossing of $\mathcal{F}^n(\beta_0)$ and $\beta_1$ and the reversibility properties of $\mathcal{F}$, $\pm r_0$ and $\pm r_1$ with respect to $\pm B$ imply that $W_0^u$ crosses either $W_1^s$ or $W_1^{s-}$. This has several consequences. At first, $W_0^{s-}$ cannot coincide with $W_0^u$. Otherwise, $W_0^u$ would cross either $W_1^u$ or $W_1^{u-}$, which is impossible, since $\mathcal{F}$ is invertible. Thus, since $W_0^u$ crosses the line $y_1 + \alpha y_2 = 0$ (of fixed points of $-B$) at some point a, $W_0^u$ crosses $W_0^{s-}$ at a and $W_0^{u-}$ crosses $W_0^s$ at $-a$.
Fig. 5. Diagram showing the several invariant manifolds of $r_0$ and $-r_0$. The symmetry of the figure is a consequence of the relations $W_0^s = B(W_0^u)$, $W_0^{s-} = -B(W_0^u)$, $W_0^{u-} = B(W_0^{s-})$ and the definition of B (in particular, the fact that B acts as the identity on the line $y_1 - \alpha y_2 = 0$ and acts as $y \to -y$ on the line $y_1 + \alpha y_2 = 0$).
In order to finish the proof we need the following proposition, which is a simplification of a result asserted by C. Conley (for the proof see Churchill and Rod [4], Theorem 1.1, Remarks 1.8(b) and (c)).

Proposition 1. Let $W^s$ be the stable and $W^u$ be the unstable manifold of a hyperbolic fixed point of a real analytic symplectic planar mapping T. Let $\xi_1$ be an analytic curve that crosses $W^s$ and $\xi_2$ be an analytic curve that crosses $W^u$ (at the crossing point the tangent spaces to $\xi_1$ and to $W^s$ may coincide; the same being true with respect to $\xi_2$ and $W^u$). Then for n sufficiently large the iterates $T^n(\xi_1)$ will have a transverse intersection with $\xi_2$.

This proposition, the analyticity of $\mathcal{F}$, the fact that $W_0^u$ crosses $W_0^{s-}$, and that $W_0^s$ crosses $W_0^{u-}$, imply that $W_0^u$ intersects $W_0^s$ transversally. Proposition 1, the fact that $W_0^u$ intersects $W_0^s$ transversally, and that $W_0^{s-}$ crosses $W_0^u$, imply that $W_0^u$ intersects $W_0^{s-}$ transversally. We know that $W_0^u$ crosses either $W_1^s$ or $W_1^{s-}$. Let us assume that $W_0^u$ crosses $W_1^s$ (the other case is similar to this one). Then, Proposition 1, the fact that $W_0^u$ intersects $W_0^s$ transversally, and that $W_1^s$ crosses $W_0^u$, imply that $W_0^u$ intersects $W_1^s$ transversally. Now, Proposition 1, the fact that $W_0^u$ intersects $W_1^s$ transversally, and that $W_1^{s-}$ crosses $W_1^u$, imply that $W_0^u$ intersects $W_1^{s-}$ transversally.
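The fixed points of the model map used in this section, and their hyperbolicity, can be verified numerically. The following is a sketch under the normalization c = 0, omega = 2 with A = diag(alpha, 1/alpha); the closed form used for the fixed point is the one obtained by solving A R(phi) y = y with tan(theta) = 1/alpha, and the names `model_map`, `fixed_point_r0` and `trace_jacobian` are hypothetical. The trace of the linearization is estimated by central differences and compared with 2 + 2*gamma*(alpha - 1/alpha).

```python
import math

def model_map(y, alpha, gamma):
    """Map (22) with c = 0, omega = 2: y -> A R(-gamma*log|y|^2) y, A = diag(alpha, 1/alpha)."""
    y1, y2 = y
    phi = -gamma * math.log(y1 * y1 + y2 * y2)
    r1 = math.cos(phi) * y1 - math.sin(phi) * y2
    r2 = math.sin(phi) * y1 + math.cos(phi) * y2
    return (alpha * r1, r2 / alpha)

def fixed_point_r0(alpha, gamma):
    """Candidate fixed point with polar angle theta = arctan(1/alpha)."""
    theta = math.atan(1.0 / alpha)
    rho = math.exp(-(math.pi - 4.0 * theta) / (4.0 * gamma))
    return (rho * math.cos(theta), rho * math.sin(theta))

def trace_jacobian(y, alpha, gamma, h=1e-6):
    """Trace of the derivative of model_map at y, by central differences."""
    y1, y2 = y
    d11 = (model_map((y1 + h, y2), alpha, gamma)[0]
           - model_map((y1 - h, y2), alpha, gamma)[0]) / (2 * h)
    d22 = (model_map((y1, y2 + h), alpha, gamma)[1]
           - model_map((y1, y2 - h), alpha, gamma)[1]) / (2 * h)
    return d11 + d22

if __name__ == "__main__":
    alpha, gamma = 1.5, 0.8
    r0 = fixed_point_r0(alpha, gamma)
    print(r0, model_map(r0, alpha, gamma))           # the two points should agree
    print(trace_jacobian(r0, alpha, gamma),
          2.0 + 2.0 * gamma * (alpha - 1.0 / alpha)) # the two traces should agree
```

Since the map is area preserving, a trace larger than 2 in absolute value means real eigenvalues lambda and 1/lambda off the unit circle, which is the hyperbolicity used in the text whenever alpha > 1.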
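Our goal in the rest of this section is to show that the hypothesis $\gamma(\alpha - \alpha^{-1}) > 1$ of Theorem 2 implies the hypothesis of Lemma 6. In order to do this we use the fact that $\mathcal{F}$ is a twist map. First of all we recall the expression for $\mathcal{F}$ in polar coordinates $I$, $\theta$, Eq. (17) (here $c = 0$ and $\omega = 2$):
$$I' = \mathcal{F}_I(I, \theta) := I J(\theta), \qquad \theta' = \mathcal{F}_\theta(I, \theta) := u(\theta) - \gamma\log 2 - \gamma\log I - \gamma\log J(\theta),$$
where
$$J(\theta) = \alpha^2\cos^2\theta + \alpha^{-2}\sin^2\theta, \qquad u(\theta) = \arctan\left(\frac{\tan\theta}{\alpha^2}\right), \qquad \theta \in (0, \pi/2) \Rightarrow u(\theta) \in (0, \pi/2).$$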
Map $\mathcal{F}$ restricted to the set $(I, \theta) \in (0, \infty) \times [0, 2\pi)$ can be understood as an analytic mapping of the half open cylinder $\mathbb{R}_+ \times S^1$ to itself. Moreover, $\mathcal{F}$ preserves the area form $dI \wedge d\theta$, maps each end of $\mathbb{R}_+ \times S^1$ to itself, preserves orientation, and is a twist map because
$$\partial_I \mathcal{F}_\theta(I, \theta) = -\frac{\gamma}{I} < 0.$$
This implies that we can use Birkhoff’s theorem (see [1, 10, 13, 14]) to obtain the following result (the statement below is adapted from [13]).

Theorem 4. Let U be an open subset of $\mathbb{R}_+ \times S^1$, invariant under $\mathcal{F}$, homeomorphic to $\mathbb{R} \times S^1$, and such that
$$(0, c_-] \times S^1 \subset U \subset (0, c_+) \times S^1,$$
for some $c_-, c_+ \in \mathbb{R}$ with $0 < c_- < c_+$. Then the frontier of U in $\mathbb{R}_+ \times S^1$ is the graph of a Lipschitz function $\chi : S^1 \to \mathbb{R}_+$ ($I = \chi(\theta)$).
Proof. Here, the only reason we cannot immediately apply the version of Birkhoff's theorem as stated in [13] is that F is not defined in the whole cylinder R × S 1 but only in R+ × S 1 . Nevertheless, it is easy to extend our map F to the whole cylinder in the following way. Let v : R → R be a C ∞ function with strictly positive derivative and such that v(x) = log(x) if x > −2 c− . Let $\tilde F$ : R × S 1 → R × S 1 be the following C ∞ area preserving twist map: I 0 = IJ () ; 0 = u() − log 2 − v(IJ ()) : Since inf ∈[0; 2) {J ()} = −2 , $\tilde F$(I; ) coincides with F(I; ) if I > c− . In particular, this implies that $\tilde U$ = U ∪ (−∞; c− ) × S 1 is invariant under $\tilde F$. So, we can apply Birkhoff's theorem as stated in [13] to $\tilde F$, $\tilde U$ to conclude that the frontier of $\tilde U$, which coincides with the frontier of U inside R+ × S 1 , is the graph of a Lipschitz function : S 1 → R+ .
Corollary 1. Suppose Fn (0 ) does not cross 1 for any n ∈ Z. Then F has a rotational invariant circle (an invariant closed curve encircling R+ × S 1 ).
Proof. By construction 0 is a closed non-self-intersecting piecewise analytic curve encircling R+ × S 1 . So, 0 divides R+ × S 1 into two open sets homeomorphic to cylinders. Let us denote by V the set of points "below" 0 (the boundary of V is given by I = 0 and 0 ). The iterates of all points of V form an invariant set that we denote by $\tilde V = \bigcup_{n\in\mathbb{Z}} F^n(V)$. By hypothesis, the set $\tilde V$ lies below the circle I = K, where K > sup∈[0; 2) {I | (I; ) ∈ 1 }. Set $\tilde V$, although connected, is not necessarily homeomorphic to a cylinder (it may contain "holes"). Let V c be the component of the complement of $\tilde V$ in R+ × S 1 that contains the circle I = K. The complement of V c is a set that satisfies the hypotheses of Theorem 4. So, its boundary is a rotational invariant circle.
Finally, due to Corollary 1, in order to show that Fn (0 ) crosses 1 it is sufficient to show that F has no rotational invariant circles. This is the content of the following lemma.
Lemma 7. Suppose ( − −1 ) > 1.
Then F has no rotational invariant circles.
Proof. This proof is similar to the one found in [13] for nonexistence of rotational invariant circles for the "standard map." Let us assume, in contradiction to the thesis, that F has a rotational invariant circle . Due to Theorem 4 we can describe as the graph of a Lipschitz function → () = I . Now, we define a homeomorphism h : S 1 → S 1 as 0 = h() = F ((); ). Since is a Lipschitz function and F is analytic, h is also a Lipschitz function. A Lipschitz function is differentiable almost everywhere, so we denote by ∂ h the derivative of h whenever it exists. Since F is orientation preserving and maps each end of R+ × S 1 to itself, h is also orientation preserving, which means that ∂ h() > 0 for almost all . Now, let (I 00 ; 00 ), (I 0 ; 0 ) be two consecutive iterates of (I; ) ∈ by map F. Then the definition of F implies 00 + u() = 0 + u(0 ) − log[J (0 )] :
Using that 00 = h(0 ) and = h−1 (0 ), differentiating the above expression with respect to 0 and using that ∂ u() = 1/J (), we get (omitting the prime in 0 ) ∂ h() +
1 + J () − @ J () 1 @ h−1 () = : J [h−1 ()] J ()
Since @ h(), @ h−1 () and J () are positive for all possible ’s, we conclude that the right-hand side of the above equation is positive, namely def
1 + J () − @ J () = G() ¿ 0 :
(23)
Using the definition of J we get G() = 1 + and
2 − −2 2 + −2 + [cos(2) + 2 sin(2)] 2 2
p @ G() = 0 ⇒ 2 = arctan(2 ) ⇒ cos(2) + 2 sin(2) = ± 1 + 4 2 :
This implies that inf
∈[0; 2)
G() =
p + −1 [ + −1 − ( − −1 ) 1 + 4 2 ] : 2
Substituting this result in inequality (23) we get
( − −1 ) < 1 , which contradicts the hypothesis of the lemma. Therefore, map F under the hypothesis of the lemma cannot have an invariant rotational circle.

5. Final Remarks

Many 2-degrees of freedom Hamiltonian systems with a saddle-center loop and a discrete symmetry have continuous one-parameter families of periodic orbits that accumulate on the double saddle-center loop. This happens, for instance, for the 2-pendulum presented in the introduction. In that case, Q : (p ; p ; ; ) → (p ; −p ; ; −) is a symmetry of the system. The cylinder = {(p ; p ; ; ) | p = 0; = } is invariant under Q and also under the flow of the system. The phase portrait of the system in is just the one of a simple pendulum with Hamiltonian function H =
1 p2 − mg(l2 − l1 ) cos : 2m(l2 − l1 )2
The double saddle-center loop ∪ 0 ∪ r related to the solution (3) is the double homoclinic loop of this simple pendulum. Let us denote by E∗ the energy of ∪ 0 ∪ r. The periodic orbits of H can be labeled in the following way: i) E represents the librating periodic orbit with energy E,
ii) wE represents the rotating periodic orbit with energy E and momentum p < 0, iii) zE represents the rotating periodic orbit with energy E and momentum p > 0. Notice that E ; wE ; zE → ∪ 0 ∪ r as E → E∗ . In view of our stability results for the double saddle-center loop, it is natural to inquire about the stability of the periodic orbits E ; wE ; zE as E → E∗ . In order to answer this question we consider the Poincaré maps associated to these orbits. For E sufficiently close to E∗ , these maps can be obtained from F : 1 ∪ 3 → 1 ∪ 3 defined in Sect. 2.2 (Case 2). Using that E ; wE ; zE approach continuously ∪ 0 ∪ r as E → E∗ , that all these orbits are in the cylinder , and analyzing the flow near the saddle-center equilibrium r, we conclude that wE and zE intersect only once either 1 or 3 (each section intersects only one of these orbits) and E intersects once both sections 1 and 3 . So, wE and zE are represented by fixed points of F (one in 1 the other in 3 ) and is represented by a periodic orbit of F with period two (one point of this orbit is in 1 the other in 3 ). In the (y; E)-coordinates defined in Sect. 2 the action of Q on the Poincaré sections 1 and 3 may be written as (y; E) → (−y; E). Since all the orbits E ; wE ; zE are invariant under Q, we conclude that E ; wE ; zE are always represented by points in 1 ∪ 3 with coordinates in (y; E) = ( 0; E). Using this information and explicit expressions for the several components of F in (y; E)-coordinates (namely, ‘1 and g) we conclude that the general form of the Poincaré map associated to E , wE and zE is (see also Eq. (15)) |y|2 − (E − E∗ ) y + O(|y| + |E − E∗ |) ; y → AR c − log ! 2 where c is some number. From this expression we obtain that E , wE and zE are linearly stable if the absolute value of the trace of the matrix AR(c − log |E − E∗ |) is less than two, i.e.
if |( + −1 ) cos(c − log |E − E∗ |)| < 2 : Therefore, if ≠ 1 then the families of periodic orbits E ; wE ; zE go through infinite sequences of transitions in stability type (between ellipticity and hyperbolicity) as E → E∗ . This result was previously obtained by Churchill et al. [5]. It is worth mentioning that = 1 is a necessary condition for the integrability of (M; ; H ) (see [12, 15 and 6]) and this is the only case where the infinite sequence of stability transitions cannot occur (the integrable case was studied by Lerman [12]). This result on the stability of the periodic orbits E , wE and zE shows that, except for the case = 1, double saddle-center loops may be at most isolated stable sets, in the sense that in any neighborhood of them we find an infinite number of unstable periodic orbits. The "region of instability" of these periodic orbits has to be small, though. On the other hand, in any neighborhood of an unstable saddle-center loop we find infinitely many stable periodic orbits. Again, the size of the domain of stability of these orbits has to be small. We finish by pointing out that this scenario, rather than the exception, is the rule for 2-degrees of freedom Hamiltonian systems. For instance, arbitrarily near generic elliptic periodic orbits, which are always stable, we find "resonance zones" that contain infinitely many unstable periodic orbits.
Acknowledgements. I am very grateful to E. Sere, for many discussions, comments, and suggestions (especially concerning the way to use the KAM theorem to prove the stability Theorem 1) and to J. Shatah, for his support and discussions on the subject of this paper and on related problems. I would also like to thank J. Mather for listening to oral expositions of these results and for his support.
References
1. Birkhoff, G.D.: Surface transformations and their dynamical applications. Acta Math. 43, 1–119 (1922)
2. Conley, C.: Low energy transit orbits in the restricted three-body problem. SIAM J. Appl. Math. 16, 732–746 (1968)
3. Conley, C.: On the ultimate behavior of orbits with respect to an unstable critical point I. Oscillating, asymptotic, and capture orbits. J. Diff. Eq. 5, 136–158 (1969)
4. Churchill, R.C., Rod, D.L.: Pathology in dynamical systems III, Analytic Hamiltonians. J. Diff. Eqns. 37, 351–373 (1980)
5. Churchill, R.C., Pecelli, G., Rod, D.L.: Stability transitions for periodic orbits in Hamiltonian systems. Arch. Rat. Mech. Anal. 73, 313–347 (1980)
6. Grotta Ragazzo, C.: Nonintegrability of some Hamiltonian systems, scattering and analytic continuation. Commun. Math. Phys. 166, 255–277 (1994)
7. Grotta Ragazzo, C.: Irregular Dynamics and Homoclinic orbits to Hamiltonian saddle-centers. Comm. Pure Appl. Math. 50, 105–147 (1997)
8. Grotta Ragazzo, C.: Stability of homoclinic orbits and diffusion in phase space. Submitted to Phys. Lett. A
9. Grotta Ragazzo, C.: Symplectic invariants of orbits homoclinic to saddle-center equilibria. In preparation (1996)
10. Herman, M.: Introduction à l'étude des courbes invariantes par les difféomorphismes de l'anneau. Astérisque 103–104 (1983)
11. Llibre, J., Martínez, R., Simó, C.: Transversality of the invariant manifolds associated to the Liapunov family of periodic orbits near L2 in the restricted three-body problem. J. Diff. Eq. 58, 104–156 (1985)
12. Lerman, L.M.: Hamiltonian systems with loops of a separatrix of a saddle-center. Sel. Math. Sov. 10, 297–306 (1991)
13. Mather, J.: Non-existence of invariant circles. J. Erg. Th. Dyn. Syst. 4, 301–309 (1984)
14. Meiss, J.D.: Symplectic maps, variational principles, and transport. Rev. Mod. Phys. 64, 795–848 (1992)
15. Mielke, A., Holmes, P., O'Reilly, O.: Cascades of homoclinic orbits to, and chaos near, a Hamiltonian saddle center. J. Dyn. Diff. Eqns. 4, 95–126 (1992)
16. Moser, J.: On the generalization of a theorem of Liapunoff. Comm. Pure Appl. Math. 11, 257–271 (1958)
17. Moser, J.: Stable and Random Motions in Dynamical Systems. Princeton, NJ: Princeton University Press (1973)
18. Palis, J.: On Morse Smale Dynamical Systems. Topology 8, 385–405 (1969)
19. Rüssmann, H.: Über das Verhalten analytischer Hamiltonscher Differentialgleichungen in der Nähe einer Gleichgewichtslösung. Math. Ann. 154, 285–300 (1964)
Communicated by Ya.G. Sinai
Commun. Math. Phys. 184, 273–300 (1997)
Communications in
Mathematical Physics © Springer-Verlag 1997
Analyticity in Time and Smoothing Effect of Solutions to Nonlinear Schrödinger Equations
Nakao Hayashi 1,*, Keiichi Kato 2,**
1 Department of Mathematics, Faculty of Engineering, Gunma University, Kiryu 376, Japan
2 Department of Mathematics, Faculty of Science, Osaka University, Toyonaka 560, Japan
Received: 1 March 1994 / Accepted: 16 May 1996
Dedicated to Professor Haruo Shizuka on his sixtieth birthday

Abstract: In this paper we consider analyticity in time and smoothing effect of solutions to nonlinear Schrödinger equations
$$
\begin{cases}
i\partial_t u + \frac12\Delta u = \lambda|u|^{2p}u, & (t,x)\in\mathbb{R}\times\mathbb{R}^n,\\
u(0,x) = \phi, & x\in\mathbb{R}^n,
\end{cases}
\tag{1}
$$
where $\lambda\in\mathbb{C}$, $p\in\mathbb{N}$. We prove that if $\phi$ satisfies
$$
\bigl\|e^{|x|^2}\phi\bigr\|_{H^{[n/2]+1}} < \infty, \tag{2}
$$
then there exists a unique solution $u(t,x)$ of (1) and positive constants $T$, $C_0$, $C_1$ such that $u(t,x)$ is analytic in time and space variables for $t\in[-T,T]\setminus\{0\}$ and $x\in\Omega=\{x;\ |x|<R\}$ and has an analytic continuation $U(z_0,z)$ on $\{z_0=t+i\tau;\ -C_0t^2<\tau<C_0t^2,\ t\in[-T,T]\setminus\{0\}\}$ and $\{z=x+iy;\ -C_1|t|<y<C_1|t|,\ (t,x)\in[-T,T]\setminus\{0\}\times\Omega\}$. In the case $n=1,2,3$ the condition (2) can be relaxed as follows:
$$
\bigl\|e^{|x|^2}\phi\bigr\|_{H^{m}} < \infty,
$$
where $m=0$ if $n=1$, $p=1$; $m=1$ if $n=2$, $p\in\mathbb{N}$; and $m=1$ if $n=3$, $p=1$.

* Present address: Department of Applied Mathematics, Science University of Tokyo, 1-3, Kagurazaka, Shinjuku-ku, Tokyo 162, Japan
** Present address: Department of Mathematics, Science University of Tokyo, Wakamiya, Shinjuku-ku, Tokyo 162, Japan
1. Introduction

In this paper we consider analyticity in time and smoothing effect of solutions to nonlinear Schrödinger equations
$$
\begin{cases}
i\partial_t u + \frac12\Delta u = \lambda|u|^{2p}u, & (t,x)\in\mathbb{R}\times\mathbb{R}^n,\\
u(0,x) = \phi(x), & x\in\mathbb{R}^n,
\end{cases}
\tag{1.1}
$$
where $\lambda\in\mathbb{C}$, $p\in\mathbb{N}$. Equation (1.1) appears in various physical applications, such as plasma physics, nonlinear optics, and nonrelativistic quantum physics. There have been many works on global existence of solutions, and on asymptotic behavior of solutions (see [Ca] and references cited therein). The analyticity in space and smoothing effect of solutions to (1.1) were studied in [H-Sai 1]. In particular, [H-Sai 1] showed that the solution $u$ of (1.1) with $p=1$ has an analytic continuation on the strip
$$
S(|t|) = \{z=x+iy;\ -|t|<y_j<|t|,\ j=1,\dots,n\}
$$
for any time provided that $\bigl\|\prod_{j=1}^n(\cosh x_j)\phi\bigr\|_{H^m}$ is sufficiently small with $m>n$ and $n\ge 2$. If we restrict our attention to the local existence of solutions, the method used in [H-Sai 1] is applicable to (1.1) when $n=1$.

The analyticity in time of solutions to (1.1) with $\lambda|u|^{2p}u$ replaced by $F(u,\bar u)$ was first proved in [H-K.K] under an analyticity condition on the data, where $F(u,\bar u)$ is a polynomial with respect to $u$ and $\bar u$ (see [H-K.K, Theorem 1.1]). Our purpose in this paper is to prove analyticity in time of solutions to (1.1) without regularity assumption on the data. Our main tool is the operator $K = |x|^2 + nit + 2it\,x\cdot\nabla + 2it^2\partial_t$, which almost commutes with $L = i\partial_t + \frac12\Delta$. Indeed we have $[L,K] = 4itL$, which yields $LK^lu = (K+4it)^lLu$. The theorems are obtained through Propositions 3.1–3.3, which state the existence of solutions in analytic function spaces involving the operator $K$. In order to prove Propositions 3.1–3.3 we need the multiplication lemmas (Lemma 2.7, Lemmas 2.12–2.14). Lemma 2.7 is used to prove Proposition 3.2, Lemma 2.12 is used to prove Proposition 3.1 and Lemmas 2.13–2.14 are used to prove Proposition 3.3, respectively.

The main tool in the previous work [H-K.K] was the operator $P = x\cdot\nabla + 2t\partial_t$, which has the commutation relation $[L,P] = 2L$. Differences between the proof in this paper and the previous one follow from the facts that the operator $K$ does not commute with the time variable $t$ and the operator $x\cdot\nabla$, and that $K$ is not a first order differential operator. The fact that $K$ does not commute with the time variable $t$ means that we cannot use the Leibniz rule in $(K+4it)^l$, and so we need to prepare Lemma 2.3, which prevents us from considering the analyticity in time of solutions for large time. On the other hand, the fact that the operator $P$ commutes with the constant 2 appearing in $[L,P]=2L$ enables us to use the Leibniz rule in $(P+2)^l$. Furthermore $P$ is a first order differential operator which commutes with the operator $x\cdot\nabla$.
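The commutation relation $[L,K]=4itL$ can be checked term by term. The following is a sketch of that computation, using only the definitions of $L$ and $K$ given above together with the elementary identities $[\Delta,|x|^2]=2n+4x\cdot\nabla$ and $[\Delta,x\cdot\nabla]=2\Delta$; all pairs of terms not listed commute.

```latex
\begin{align*}
[i\partial_t,\ nit] &= -n, \\
[i\partial_t,\ 2it\,x\cdot\nabla] &= -2\,x\cdot\nabla, \\
[i\partial_t,\ 2it^2\partial_t] &= -4t\,\partial_t, \\
[\tfrac12\Delta,\ |x|^2] &= n + 2\,x\cdot\nabla, \\
[\tfrac12\Delta,\ 2it\,x\cdot\nabla] &= 2it\,\Delta .
\end{align*}
% Summing, the terms \pm n and \pm 2x\cdot\nabla cancel:
\begin{align*}
[L,K] &= -4t\,\partial_t + 2it\,\Delta
       = 4it\bigl(i\partial_t + \tfrac12\Delta\bigr) = 4itL, \\
LK &= KL + 4itL = (K+4it)L, \\
LK^{l+1} &= (LK^{l})K = (K+4it)^{l}LK = (K+4it)^{l+1}L
\quad\text{(induction on } l\text{)} .
\end{align*}
```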
Since $K$ is not a first order differential operator, we introduce the multiplication term $e^{-\frac{i|x|^2}{2t}}$ to prove the multiplication lemmas. By making use of $e^{-\frac{i|x|^2}{2t}}$, we easily see that
$$
\tilde K = |x|^2 + 2it\,x\cdot\nabla + 2it^2\partial_t = 2it\,e^{\frac{i|x|^2}{2t}}(x\cdot\nabla + t\partial_t)\,e^{-\frac{i|x|^2}{2t}}.
$$
The operator $\tilde K$ is considered as a first order differential operator for nonlinear terms satisfying the gauge condition, and we see that $(1/2it)\tilde K$ commutes with the operator $e^{\frac{i|x|^2}{2t}}\,x\cdot\nabla\,e^{-\frac{i|x|^2}{2t}}$ although $\tilde K$ does not commute with $x\cdot\nabla$. We also know that $\tilde K$ is almost equivalent to $K$ through Lemma 2.3. We give the strategy of the proofs of Theorems 1.1–1.2. The desired results are established by showing
$$
\Bigl\|e^{\frac{i|x|^2}{2t}}u\Bigr\|_{G^{b_1}\left(t\partial;\,G^{b_2t^2}(\partial_t;\,L^2(\Omega))\right)} < \infty
$$
for some constants $b_1$, $b_2$ and sufficiently small $t$, where the norm is defined below. We show the above estimate by combining Propositions 3.1–3.3 and Lemmas 4.1–4.2. We note that Lemma 4.1 also prevents us from proving the analyticity in time for large time $t$. Lemma 4.1 gives the relation between $\tilde K^l$ and
$$
t^l\Bigl(\frac{1}{2it}\tilde K\Bigr)^l = t^l\,e^{\frac{i|x|^2}{2t}}(x\cdot\nabla + t\partial_t)^l\,e^{-\frac{i|x|^2}{2t}}.
$$
The operator $P$ was also used to prove the Gevrey smoothing effect in the space variable in [B-H-K.K]. Roughly speaking, it was shown that if the data $\phi$ belongs to a Gevrey class of order 2, then solutions of some nonlinear Schrödinger equations become analytic in the space variable for $t\neq 0$. The Korteweg-de Vries version of the operator $P$, written as $x\partial_x + 3t\partial_t$, is also useful to study analyticity in time and Gevrey regularizing effect in space variables for solutions (see [B-H-K.K] for details). Local smoothing effects of solutions to linear Schrödinger equations were studied by [Co-Sau, Sj and V] for the homogeneous case and by [Ke-P-V] for the inhomogeneous case. Kenig, Ponce and Vega [Ke-P-V] applied them to the proof of local existence results for nonlinear Schrödinger equations with nonlinearities involving derivatives of the unknown function.

This paper is organized as follows. In Sect. 2 we prove multiplication lemmas (Lemma 2.7, Lemmas 2.12–2.14) which are needed to prove local existence of analytic solutions of (1.1), which is established in Sect. 3. Section 4 is devoted to proving Theorems 1.1 and 1.2. In Sect. 5 we give some applications.

Theorem 1.1. We let $\Omega$ be a ball in $\mathbb{R}^n$ with radius $R$ centered at the origin and assume that
$$
\bigl\|e^{|x|^2}\phi\bigr\|_{H^{[n/2]+1}} < \infty.
$$
Then for any $R$, there exists a unique solution $u$ of (1.1) and positive constants $T$, $C_0$, $C_1$ such that $u$ is analytic in time and space variables for $(t,x)\in[-T,T]\setminus\{0\}\times\Omega$ and has an analytic continuation $U(z_0,z)$ on $\{z_0=t+i\tau;\ -C_0t^2<\tau<C_0t^2,\ t\in[-T,T]\setminus\{0\}\}$ and $\{z=x+iy;\ -C_1|t|<y_j<C_1|t|,\ (t,x)\in[-T,T]\setminus\{0\}\times\Omega,\ j=1,\dots,n\}$.

Theorem 1.2. We assume that
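$$
\bigl\|e^{|x|^2}\phi\bigr\|_{L^2} < \infty \quad\text{for } p=1,\ n=1,
$$
$$
\bigl\|e^{|x|^2}\phi\bigr\|_{H^1} < \infty \quad\text{for }
\begin{cases} p\in\mathbb{N}, & n=2,\\ p=1, & n=3.\end{cases}
$$
Then the same results as in Theorem 1.1 hold.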
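In the case of the linear Schrödinger equation
$$
\begin{cases}
i\partial_t u + \frac12\Delta u = 0, & (t,x)\in\mathbb{R}\times\mathbb{R}^n,\\
u(0,x) = \phi(x), & x\in\mathbb{R}^n,
\end{cases}
\tag{1.2}
$$
we have the following result, in the same way as in the proof of [H-Sai2, Theorem 1].

Proposition 1.1. We assume that
$$
\bigl\|e^{|x|^2}\phi\bigr\|_{L^2} < \infty.
$$
Then there exists a unique solution $u(t,x)$ of (1.2) such that $u(t,x)$ has an analytic continuation $U(t,z)$ to $S(\infty)=\{z;\ z\in\mathbb{C}^n\}$ and
$$
e^{-\frac{iz^2}{2t}}U(t,z)\in A(\infty,2t^2),
$$
where $A(\infty,\alpha)$ is the set of all analytic functions $f(z)$ on $S(\infty)$ such that
$$
\frac{1}{(\alpha\pi)^{n/2}}\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} e^{-\frac{y^2}{\alpha}}|f(z)|^2\,dx\,dy < \infty
$$
for each $\alpha$.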
Furthermore $U(t,z)$ satisfies
$$
\frac{1}{(2t^2\pi)^{n/2}}\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} e^{-\frac12\left(\frac{y}{t}+2x\right)^2+2|x|^2}\,|U(t,x+iy)|^2\,dx\,dy = \int_{\mathbb{R}^n} e^{2|x|^2}|\phi(x)|^2\,dx. \tag{1.3}
$$
From Proposition 1.1 and the same argument as in Sect. 4 the result about analyticity in time follows. However, we cannot expect the estimate (1.3) in the case of nonlinear Schrödinger equations because the function space $A(\infty,2t^2)$ does not work for (1.1). We notice that Proposition 1.1 follows from the use of the operator $J = x + it\nabla$.

Notation and function spaces. Let $X$ be a Banach space with norm $\|\cdot\|_X$ and $\partial^\alpha = \partial_1^{\alpha_1}\cdots\partial_n^{\alpha_n}$, where $|\alpha|=\sum_{j=1}^n\alpha_j$, $\partial_j=\partial/\partial x_j$ and $\alpha_j\in\mathbb{N}\cup\{0\}$. We define analytic function spaces as follows:
$$
G^a(\partial;X) = \Bigl\{f\in X;\ \|f\|_{G^a(\partial;X)} = \sum_{\alpha\in(\mathbb{N}\cup 0)^n}\frac{a^{|\alpha|}}{\alpha!}\|\partial^\alpha f\|_X < \infty\Bigr\},
$$
where
$$
\sum_{\alpha\in(\mathbb{N}\cup 0)^n}\frac{a^{|\alpha|}}{\alpha!}\|\partial^\alpha f\|_X = \sum_{\alpha\in(\mathbb{N}\cup 0)^n}\frac{a^{|\alpha|}}{(\alpha_1!)\cdots(\alpha_n!)}\|\partial_1^{\alpha_1}\cdots\partial_n^{\alpha_n}f\|_X.
$$
In the following we denote the infinite sum $\sum_{\alpha\in(\mathbb{N}\cup 0)^n}$ by $\sum_\alpha$. We also define
$$
G^a(A;X) = \Bigl\{f\in X;\ \|f\|_{G^a(A;X)} = \sum_N\frac{a^N}{N!}\|A^Nf\|_X < \infty\Bigr\},
$$
where
$$
A = K = |x|^2 + nit + 2it\,x\cdot\nabla + 2it^2\partial_t,
$$
or
$$
A = \tilde K = |x|^2 + 2it\,x\cdot\nabla + 2it^2\partial_t
$$
with $x\cdot\nabla = \sum_{j=1}^n x_j\partial_j$. We note that
$$
\tilde K = 2it\,e^{\frac{i|x|^2}{2t}}(x\cdot\nabla + t\partial_t)\,e^{-\frac{i|x|^2}{2t}}.
$$
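The conjugation formula for $\tilde K$ can be verified directly. The sketch below writes $M = e^{-i|x|^2/(2t)}$ for the multiplication term (a local abbreviation used only for this check) and applies the product rule to $Mf$ for a smooth function $f$.

```latex
\begin{align*}
(x\cdot\nabla)M &= -\frac{i|x|^2}{t}\,M, \qquad
t\,\partial_t M = \frac{i|x|^2}{2t}\,M, \\
(x\cdot\nabla + t\partial_t)(Mf)
  &= M\Bigl(x\cdot\nabla + t\partial_t - \frac{i|x|^2}{2t}\Bigr)f, \\
2it\,M^{-1}(x\cdot\nabla + t\partial_t)(Mf)
  &= \bigl(2it\,x\cdot\nabla + 2it^2\partial_t + |x|^2\bigr)f
   = \tilde K f .
\end{align*}
```

Since $M^{-1} = e^{i|x|^2/(2t)}$, the last line is exactly the stated identity.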
The usual Sobolev space $H^{m,p}$ is defined by
$$
H^{m,p} = \Bigl\{f\in L^p;\ \|f\|_{H^{m,p}} = \sum_{|\alpha|\le m}\|\partial^\alpha f\|_{L^p} < \infty\Bigr\},
$$
and we let $H^m = H^{m,2}$. We let, with $J=(J_j)_{1\le j\le n}$, $J_j = x_j + it\partial_j$,
$$
R^{m,p}(t) = \Bigl\{f\in L^p;\ \|f\|_{R^{m,p}(t)} = \sum_{|\alpha|+|\beta|\le m}\|J^\alpha\partial^\beta f\|_{L^p} < \infty\Bigr\}
$$
and $R^m(t) = R^{m,2}(t)$. For simplicity we write $G^a(A,B;X) = G^a(B;G^a(A;X))$.

2. Preliminary Estimates

Lemma 2.1. We assume that operators $A$ and $B$ satisfy the commutation relations
$$
[A,B] = -\beta A^2, \qquad [A,\gamma] = [B,\gamma] = 0,
$$
where $\beta,\gamma\in\mathbb{C}$. Then we have
$$
(A+B)^l = \sum_{1\le k\le l}\binom{l}{k}\prod_{j=0}^{k-1}(1+\beta j)\,A^kB^{l-k} + B^l. \tag{2.1}
$$
Proof. We prove (2.1) by induction. When $l=1$, it is clear that (2.1) holds. We assume that (2.1) holds for some $l$. Then we have by the assumption
$$
(A+B)^{l+1} = \sum_{1\le k\le l}\binom{l}{k}\prod_{j=0}^{k-1}(1+\beta j)\,(A+B)A^kB^{l-k} + (A+B)B^l. \tag{2.2}
$$
We next prove by induction
$$
[B,A^k] = \beta kA^{k+1}. \tag{2.3}
$$
The case $k=1$ follows from the assumption. Assume (2.3) for some $k$; then
$$
[B,A^{k+1}] = BA^{k+1} - A^{k+1}B = (BA^k - A^kB)A + A^k(BA - AB) = [B,A^k]A + A^k[B,A] = \beta(k+1)A^{k+2}.
$$
This implies (2.3). We apply (2.3) to (2.2) to obtain
$$
\begin{aligned}
(A+B)^{l+1}
&= \sum_{1\le k\le l}\binom{l}{k}\prod_{j=0}^{k-1}(1+\beta j)\bigl((A^{k+1} + A^kB + \beta kA^{k+1})B^{l-k}\bigr) + (A+B)B^l\\
&= \sum_{1\le k\le l}\binom{l}{k}\prod_{j=0}^{k-1}(1+\beta j)\bigl((1+\beta k)A^{k+1}B^{l-k} + A^kB^{l+1-k}\bigr) + (A+B)B^l\\
&= \sum_{2\le k'\le l+1}\binom{l}{k'-1}\prod_{j=0}^{k'-2}(1+\beta j)\,(1+\beta(k'-1))\,A^{k'}B^{l+1-k'}\\
&\quad + \sum_{1\le k\le l}\binom{l}{k}\prod_{j=0}^{k-1}(1+\beta j)\,A^kB^{l+1-k} + (A+B)B^l\\
&= \sum_{2\le k\le l}\left[\binom{l}{k-1}+\binom{l}{k}\right]\prod_{j=0}^{k-1}(1+\beta j)\,A^kB^{l+1-k}\\
&\quad + \prod_{j=0}^{l}(1+\beta j)\,A^{l+1} + \binom{l}{1}AB^l + (A+B)B^l\\
&= \sum_{2\le k\le l}\binom{l+1}{k}\prod_{j=0}^{k-1}(1+\beta j)\,A^kB^{l+1-k}
 + \binom{l+1}{l+1}\prod_{j=0}^{l}(1+\beta j)\,A^{l+1} + \binom{l+1}{1}AB^l + B^{l+1}\\
&= \sum_{1\le k\le l+1}\binom{l+1}{k}\prod_{j=0}^{k-1}(1+\beta j)\,A^kB^{l+1-k} + B^{l+1}.
\end{aligned}
\tag{2.4}
$$
This implies (2.1).
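Formula (2.1) can also be sanity-checked on a concrete pair of operators. The sketch below is an illustration only and is not part of the proof: it assumes the realization $A = a\,t$ (multiplication) and $B = c\,t^2\,d/dt$ acting on polynomials in $t$, for which $[A,B] = -(c/a)A^2$, so that $\beta = c/a$. All function names are hypothetical helpers introduced for this check.

```python
from fractions import Fraction
from math import comb

# Polynomials in t are stored as dicts {degree: coefficient}.

def clean(p):
    """Drop zero coefficients so that equal polynomials compare equal."""
    return {d: v for d, v in p.items() if v != 0}

def add(p, q):
    out = dict(p)
    for d, v in q.items():
        out[d] = out.get(d, 0) + v
    return clean(out)

def scale(s, p):
    return clean({d: s * v for d, v in p.items()})

def apply_A(p, a):
    """A = multiplication by a*t."""
    return clean({d + 1: a * v for d, v in p.items()})

def apply_B(p, c):
    """B = c * t^2 * d/dt, so t^d is sent to c*d*t^(d+1)."""
    return clean({d + 1: c * d * v for d, v in p.items()})

def repeat(op, p, n, param):
    """Apply the operator op(., param) n times."""
    for _ in range(n):
        p = op(p, param)
    return p

def lhs(p, l, a, c):
    """(A + B)^l applied to p, computed by direct iteration."""
    for _ in range(l):
        p = add(apply_A(p, a), apply_B(p, c))
    return p

def rhs(p, l, a, c):
    """Right-hand side of (2.1) applied to p, with beta = c/a."""
    beta = Fraction(c) / Fraction(a)
    total = repeat(apply_B, p, l, c)                  # the B^l term
    for k in range(1, l + 1):
        weight = Fraction(comb(l, k))
        for j in range(k):
            weight *= 1 + beta * j                    # prod_{j<k} (1 + beta*j)
        term = repeat(apply_A, repeat(apply_B, p, l - k, c), k, a)  # A^k B^(l-k) p
        total = add(total, scale(weight, term))
    return total

def formula_holds(p, l, a, c):
    return lhs(p, l, a, c) == rhs(p, l, a, c)

if __name__ == "__main__":
    p = {0: Fraction(2), 3: Fraction(1)}              # p(t) = 2 + t^3
    print(all(formula_holds(p, l, Fraction(3), Fraction(2)) for l in range(7)))
```

Exact rational arithmetic is used so that the comparison of the two sides is not affected by rounding.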
Lemma 2.2. We have for $\tilde K_d = \tilde K + idt$,
$$
\|f(t)\|_{G^a(\tilde K_d+\alpha t;X(t))} \le \frac{C}{1-ab|t|}\,\|f(t)\|_{G^a(\tilde K_d;X(t))},
$$
provided that $ab|t|<1$, $\|t\,\cdot\|_{X(t)}\le |t|\,\|\cdot\|_{X(t)}$, where $b>2$.

Proof. Since $[\alpha t,\tilde K_d] = -\frac{2i}{\alpha}(\alpha t)^2$, we have by Lemma 2.1 with $A=\alpha t$, $B=\tilde K_d$, $\beta = 2i/\alpha$,
$$
(\tilde K_d+\alpha t)^l = \sum_{1\le k\le l}\binom{l}{k}\prod_{j=0}^{k-1}\Bigl(1+\frac{2i}{\alpha}j\Bigr)(\alpha t)^k\tilde K_d^{\,l-k} + \tilde K_d^{\,l}. \tag{2.5}
$$
By (2.5)
$$
\begin{aligned}
\|f(t)\|_{G^a(\tilde K_d+\alpha t;X(t))}
&= \sum_l\frac{a^l}{l!}\|(\tilde K_d+\alpha t)^lf(t)\|_{X(t)}\\
&\le \sum_{l\ge 1}\frac{a^l}{l!}\sum_{1\le k\le l}\binom{l}{k}\prod_{j=0}^{k-1}\Bigl(1+\frac{2}{|\alpha|}j\Bigr)|\alpha|^k|t|^k\|\tilde K_d^{\,l-k}f(t)\|_{X(t)}
 + \sum_l\frac{a^l}{l!}\|\tilde K_d^{\,l}f(t)\|_{X(t)}\\
&\le \sum_{l\ge 1}\sum_{1\le k\le l}\frac{a^{l-k}}{(l-k)!}\,\frac{(2a|t|)^k}{k!}\prod_{j=0}^{k-1}\Bigl(\frac{|\alpha|}{2}+j\Bigr)\|\tilde K_d^{\,l-k}f(t)\|_{X(t)} + \|f(t)\|_{G^a(\tilde K_d;X(t))}\\
&\le \sum_{k\ge 1}(2a|t|)^k\,\frac{1}{k!}\prod_{j=0}^{k-1}\Bigl(\frac{|\alpha|}{2}+j\Bigr)\|f(t)\|_{G^a(\tilde K_d;X(t))} + \|f(t)\|_{G^a(\tilde K_d;X(t))}.
\end{aligned}
\tag{2.6}
$$
It is clear that
$$
\frac{1}{k!}\prod_{j=0}^{k-1}\Bigl(\frac{|\alpha|}{2}+j\Bigr) \le C\tilde a^{\,k} \quad\text{for } \tilde a>1.
$$
Hence we have by (2.6),
$$
\|f(t)\|_{G^a(\tilde K_d+\alpha t;X(t))} \le C\Bigl(\sum_k(ab|t|)^k\Bigr)\|f(t)\|_{G^a(\tilde K_d;X(t))},
$$
which implies the lemma.

Lemma 2.3. We have for $\tilde K_d = \tilde K + idt$,
$$
\|f\|_{G^a(\tilde K_d+\alpha t;X(T))} \le \frac{C}{1-abT}\,\|f\|_{G^a(\tilde K_d;X(T))},
$$
provided that $abT<1$, $\|t\,\cdot\|_{X(T)}\le |T|\,\|\cdot\|_{X(T)}$, where $b>2$.

Proof. In the same way as in the proof of Lemma 2.2 we have the lemma.

In what follows we use the notation
$$
M(t) = e^{-\frac{i|x|^2}{2t}}, \qquad \hat f = M(t)f,
$$
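and the relation $J^\alpha = M(-t)(it\partial)^\alpha M(t)$.

Lemma 2.4. We have for $p_1\ge 1$,
$$
\|f_1f_2\bar f_3\|_{G^a(J;L^{p_1})} \le \prod_{j=1}^3\|f_j\|_{G^a(J;L^{p_{j+1}})},
$$
where $1/p_1 = \sum_{j=1}^3(1/p_{j+1})$.

Proof. We have by Leibniz' rule,
$$
\begin{aligned}
\|f_1f_2\bar f_3\|_{G^a(J;L^{p_1})}
&= \sum_\alpha\frac{a^{|\alpha|}}{\alpha!}\bigl\|J^\alpha(f_1f_2\bar f_3)\bigr\|_{L^{p_1}}
 = \sum_\alpha\frac{a^{|\alpha|}}{\alpha!}\bigl\|(it\partial)^\alpha(\hat f_1\hat f_2\overline{\hat f_3})\bigr\|_{L^{p_1}}\\
&= \sum_\alpha\frac{(a|t|)^{|\alpha|}}{\alpha!}\Bigl\|\sum_{\beta\le\alpha}\sum_{\gamma\le\beta}\binom{\alpha}{\beta}\binom{\beta}{\gamma}(\partial^{\alpha-\beta}\hat f_1)(\partial^{\beta-\gamma}\hat f_2)(\partial^{\gamma}\overline{\hat f_3})\Bigr\|_{L^{p_1}}\\
&\le \sum_\alpha\sum_{\beta\le\alpha}\sum_{\gamma\le\beta}\frac{(a|t|)^{|\alpha|}}{(\alpha-\beta)!(\beta-\gamma)!\gamma!}\bigl\|(\partial^{\alpha-\beta}\hat f_1)(\partial^{\beta-\gamma}\hat f_2)(\partial^{\gamma}\overline{\hat f_3})\bigr\|_{L^{p_1}}\\
&\le \sum_\alpha\sum_{\beta\le\alpha}\sum_{\gamma\le\beta}\frac{(a|t|)^{|\alpha|}}{(\alpha-\beta)!(\beta-\gamma)!\gamma!}\|\partial^{\alpha-\beta}\hat f_1\|_{L^{p_2}}\|\partial^{\beta-\gamma}\hat f_2\|_{L^{p_3}}\|\partial^{\gamma}\hat f_3\|_{L^{p_4}}
 \quad\text{(by Hölder's inequality)}\\
&\le \prod_{j=1}^3\|f_j\|_{G^a(J;L^{p_{j+1}})}.
\end{aligned}
$$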
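The last step of the proof of Lemma 2.4 is a Cauchy-product rearrangement. The following sketch spells it out under the abbreviations $X_\mu = \|\partial^\mu\hat f_1\|_{L^{p_2}}$, $Y_\nu = \|\partial^\nu\hat f_2\|_{L^{p_3}}$, $Z_\gamma = \|\partial^\gamma\hat f_3\|_{L^{p_4}}$, which are introduced here only for this remark; all terms are non-negative, so the order of summation may be changed freely.

```latex
% Substitute \mu = \alpha - \beta and \nu = \beta - \gamma, so that
% |\alpha| = |\mu| + |\nu| + |\gamma| and (\mu,\nu,\gamma) ranges over all triples.
\begin{align*}
\sum_{\alpha}\sum_{\beta\le\alpha}\sum_{\gamma\le\beta}
  \frac{(a|t|)^{|\alpha|}}{(\alpha-\beta)!\,(\beta-\gamma)!\,\gamma!}\,
  X_{\alpha-\beta}\,Y_{\beta-\gamma}\,Z_{\gamma}
&= \sum_{\mu,\nu,\gamma}
  \frac{(a|t|)^{|\mu|}}{\mu!}X_\mu\;
  \frac{(a|t|)^{|\nu|}}{\nu!}Y_\nu\;
  \frac{(a|t|)^{|\gamma|}}{\gamma!}Z_\gamma \\
&= \Bigl(\sum_\mu \frac{a^{|\mu|}}{\mu!}\,\|(it\partial)^\mu \hat f_1\|_{L^{p_2}}\Bigr)
   \Bigl(\sum_\nu \frac{a^{|\nu|}}{\nu!}\,\|(it\partial)^\nu \hat f_2\|_{L^{p_3}}\Bigr)
   \Bigl(\sum_\gamma \frac{a^{|\gamma|}}{\gamma!}\,\|(it\partial)^\gamma \hat f_3\|_{L^{p_4}}\Bigr).
\end{align*}
% Since |M(\pm t)| = 1 and J^\mu = M(-t)(it\partial)^\mu M(t), each factor equals
% \|f_j\|_{G^a(J;L^{p_{j+1}})}.
```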
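Lemma 2.5. We have for $p_1,q_1\ge 1$,
$$
\|f_1f_2\bar f_3\|_{G^a(J;L^{q_1}(-T,T;L^{p_1}))} \le \prod_{j=1}^3\|f_j\|_{G^a(J;L^{q_{j+1}}(-T,T;L^{p_{j+1}}))},
$$
where $1/p_1 = \sum_{j=1}^3(1/p_{j+1})$ and $1/q_1 = \sum_{j=1}^3(1/q_{j+1})$.

Proof. By the definition
$$
\|g\|_{G^a(J;L^{q_1}(-T,T;L^{p_1}))} = \sum_\alpha\frac{a^{|\alpha|}}{\alpha!}\Bigl(\int_{-T}^{T}\|J^\alpha g(t)\|_{L^{p_1}}^{q_1}\,dt\Bigr)^{1/q_1}.
$$
In the same way as in the proof of Lemma 2.4,
$$
\begin{aligned}
\|f_1f_2\bar f_3\|_{G^a(J;L^{q_1}(-T,T;L^{p_1}))}
&\le \sum_\alpha\sum_{\beta\le\alpha}\sum_{\gamma\le\beta}\frac{a^{|\alpha|}}{(\alpha-\beta)!(\beta-\gamma)!\gamma!}\Bigl(\int_{-T}^{T}\|(J^{\alpha-\beta}f_1)(J^{\beta-\gamma}f_2)(J^{\gamma}\bar f_3)\|_{L^{p_1}}^{q_1}\,dt\Bigr)^{1/q_1}\\
&\le \sum_\alpha\sum_{\beta\le\alpha}\sum_{\gamma\le\beta}\frac{a^{|\alpha|}}{(\alpha-\beta)!(\beta-\gamma)!\gamma!}\|J^{\alpha-\beta}f_1\|_{L^{q_2}(-T,T;L^{p_2})}\\
&\qquad\times\|J^{\beta-\gamma}f_2\|_{L^{q_3}(-T,T;L^{p_3})}\|J^{\gamma}f_3\|_{L^{q_4}(-T,T;L^{p_4})},
\end{aligned}
$$
which gives the lemma.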
Lemma 2.6. We have for $p_1\ge 1$,
$$
\|f_1f_2\bar f_3\|_{G^a(J,\tilde K;L^{p_1})} \le \prod_{j=1}^3\|f_j\|_{G^a(J,\tilde K;L^{p_{j+1}})},
$$
where $1/p_1 = \sum_{j=1}^3(1/p_{j+1})$.

Proof. We have
$$
\|f_1f_2\bar f_3\|_{G^a(J,\tilde K;L^{p_1})} = \sum_l\frac{a^l}{l!}\|\tilde K^l(f_1f_2\bar f_3)\|_{G^a(J;L^{p_1})}. \tag{2.7}
$$
Since
$$
\tilde K^l = M(-t)(2it\,x\cdot\nabla + 2it^2\partial_t)^lM(t),
$$
we easily see that
$$
\tilde K^l(f_1f_2\bar f_3) = M(-t)(2it\,x\cdot\nabla + 2it^2\partial_t)^l(\hat f_1\hat f_2\overline{\hat f_3}).
$$
The operator $2it\tilde P = 2it(x\cdot\nabla + t\partial_t)$ is a first order differential operator, and so in the same way as in the proof of Lemma 2.4 we find that the right-hand side of (2.7) is bounded from above by
$$
\sum_l\sum_{k\le l}\sum_{j\le k}\frac{a^l}{(l-k)!(k-j)!j!}\Bigl\|(2it\tilde P)^{l-k}\hat f_1\,(2it\tilde P)^{k-j}\hat f_2\,(2it\tilde P)^{j}\overline{\hat f_3}\Bigr\|_{G^a(J;L^{p_1})}. \tag{2.8}
$$
We apply Lemma 2.4 to (2.8) to get the lemma.

Lemma 2.7. We have for $p_1,q_1\ge 1$,
$$
\|f_1f_2\bar f_3\|_{G^a(J,\tilde K;L^{q_1}(-T,T;L^{p_1}))} \le \prod_{j=1}^3\|f_j\|_{G^a(J,\tilde K;L^{q_{j+1}}(-T,T;L^{p_{j+1}}))},
$$
where $1/p_1 = \sum_{j=1}^3(1/p_{j+1})$ and $1/q_1 = \sum_{j=1}^3(1/q_{j+1})$.

Proof. In the same way as in the proof of Lemma 2.6 we have the lemma by using Lemma 2.5.

Lemma 2.8. We have
$$
\|f_1f_2\bar f_3\|_{Y(T)} \le C\prod_{j=1}^3\|f_j\|_{Y(T)},
$$
where $Y(T) = L^\infty(-T,T;R^m(t))$ and $m\ge[n/2]+1$.
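Proof. By integration by parts and the commutation relation $[\partial_k,J_j]=\delta_{jk}$ we obtain
$$
\begin{aligned}
\|f_1f_2\bar f_3\|_{R^m(t)} \le C\sum_{|\alpha|\le m}\Bigl(&\bigl\|J^\alpha(f_1f_2\bar f_3)\bigr\|_{L^2} &&(2.9)\\
&+\bigl\|\partial^\alpha(f_1f_2\bar f_3)\bigr\|_{L^2}\Bigr). &&(2.10)
\end{aligned}
$$
Hence we have with $\hat f = M(t)f$,
$$
\begin{aligned}
\sum_{|\alpha|\le m}\bigl\|J^\alpha(f_1f_2\bar f_3)\bigr\|_{L^2}
&= \sum_{|\alpha|\le m}\bigl\|(it\partial)^\alpha(\hat f_1\hat f_2\overline{\hat f_3})\bigr\|_{L^2}\\
&\le \sum_{|\alpha|\le m}|t|^{|\alpha|}\Bigl\|\sum_{\beta\le\alpha}\sum_{\gamma\le\beta}\binom{\alpha}{\beta}\binom{\beta}{\gamma}(\partial^{\alpha-\beta}\hat f_1)(\partial^{\beta-\gamma}\hat f_2)(\partial^{\gamma}\overline{\hat f_3})\Bigr\|_{L^2}\\
&\le C\sum_{\substack{|\alpha|=|\alpha_1|+|\alpha_2|+|\alpha_3|\\ |\alpha|\le m}}\bigl\|(it\partial)^{\alpha_1}\hat f_1\bigr\|_{L^{p_1}}\bigl\|(it\partial)^{\alpha_2}\hat f_2\bigr\|_{L^{p_2}}\bigl\|(it\partial)^{\alpha_3}\hat f_3\bigr\|_{L^{p_3}}
 \quad\text{(by Hölder's inequality)}\\
&\le C\sum_{|\alpha|\le m}\prod_{j=1}^3\bigl\|(it\partial)^{\alpha}\hat f_j\bigr\|_{L^2}^{a_j}\|f_j\|_{L^\infty}^{1-a_j}
 \quad\text{(by Sobolev's inequality)},
\end{aligned}
$$
where
$$
\frac{1}{p_j} = \frac{|\alpha_j|}{n} + a_j\Bigl(\frac12-\frac{m}{n}\Bigr), \qquad \sum_{j=1}^3\frac{1}{p_j} = \frac12.
$$
Hence we have
$$
\sum_{|\alpha|\le m}\bigl\|J^\alpha(f_1f_2\bar f_3)\bigr\|_{L^2} \le C\sum_{|\alpha|\le m}\prod_{j=1}^3\|J^\alpha f_j\|_{L^2}^{a_j}\|f_j\|_{L^\infty}^{1-a_j}.
$$
We again apply Sobolev's inequality to get
$$
\sum_{|\alpha|\le m}\bigl\|J^\alpha(f_1f_2\bar f_3)\bigr\|_{L^2} \le C\prod_{j=1}^3\|f_j\|_{R^m(t)}. \tag{2.11}
$$
In the same way as in the proof of (2.11) we have
$$
\sum_{|\alpha|\le m}\bigl\|\partial^\alpha(f_1f_2\bar f_3)\bigr\|_{L^2} \le C\prod_{j=1}^3\|f_j\|_{R^m(t)}. \tag{2.12}
$$
From (2.10)–(2.12) the lemma follows.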
Lemma 2.9. We have
$$
\|f_1f_2\bar f_3\|_{G^a(J;Y(T))} \le C\prod_{j=1}^3\|f_j\|_{G^a(J;Y(T))},
$$
where $Y(T) = L^\infty(-T,T;R^m(t))$ and $m\ge[n/2]+1$.
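Proof. We have by Lemma 2.8,
$$
\begin{aligned}
\|f_1f_2\bar f_3\|_{G^a(J;Y(T))}
&= \sum_\alpha\frac{a^{|\alpha|}}{\alpha!}\bigl\|J^\alpha(f_1f_2\bar f_3)\bigr\|_{Y(T)}\\
&\le \sum_\alpha\sum_{\beta\le\alpha}\sum_{\gamma\le\beta}\frac{a^{|\alpha|}}{(\alpha-\beta)!(\beta-\gamma)!\gamma!}\bigl\|(J^{\alpha-\beta}f_1)(J^{\beta-\gamma}f_2)(J^{\gamma}f_3)\bigr\|_{Y(T)}\\
&\le C\sum_\alpha\sum_{\beta\le\alpha}\sum_{\gamma\le\beta}\frac{a^{|\alpha|}}{(\alpha-\beta)!(\beta-\gamma)!\gamma!}\bigl\|J^{\alpha-\beta}f_1\bigr\|_{Y(T)}\bigl\|J^{\beta-\gamma}f_2\bigr\|_{Y(T)}\|J^{\gamma}f_3\|_{Y(T)},
\end{aligned}
$$
which gives the lemma.

Lemma 2.10. We have
$$
\|f_1f_2\bar f_3\|_{G^a(J,\tilde K;Y(T))} \le C\prod_{j=1}^3\|f_j\|_{G^a(J,\tilde K;Y(T))},
$$
where $Y(T) = L^\infty(-T,T;R^m(t))$ and $m\ge[n/2]+1$.

Proof. We have by the definition
$$
\|f_1f_2\bar f_3\|_{G^a(J,\tilde K;Y(T))} = \sum_l\frac{a^l}{l!}\bigl\|\tilde K^l(f_1f_2\bar f_3)\bigr\|_{G^a(J;Y(T))}.
$$
In the same way as in the proof of (2.8) we obtain
$$
\|f_1f_2\bar f_3\|_{G^a(J,\tilde K;Y(T))} \le \sum_l\sum_{k\le l}\sum_{j\le k}\frac{a^l}{(l-k)!(k-j)!j!}\bigl\|(\tilde K^{l-k}f_1)(\tilde K^{k-j}f_2)(\tilde K^{j}f_3)\bigr\|_{G^a(J;Y(T))}.
$$
We apply Lemma 2.9 to the right-hand side of the above to get the lemma.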
Lemma 2.11. We have
$$
\bigl\||f|^{2p}f - |g|^{2p}g\bigr\|_{G^a(J,\tilde K;Y(T))} \le C\Bigl(\|f\|^{2p}_{G^a(J,\tilde K;Y(T))} + \|g\|^{2p}_{G^a(J,\tilde K;Y(T))}\Bigr)\|f-g\|_{G^a(J,\tilde K;Y(T))},
$$
where $Y(T) = L^\infty(-T,T;R^m(t))$ and $m\ge[n/2]+1$.

Proof. We prove the lemma by induction with respect to $p$. When $p=1$ we have the lemma by Lemma 2.10. We assume that the lemma holds for some $p$. From the equality
$$
|f|^{2p+2}f - |g|^{2p+2}g = |f|^2\bigl(|f|^{2p}f - |g|^{2p}g\bigr) + |g|^{2p}g\bigl(\bar f(f-g) + g(\bar f-\bar g)\bigr)
$$
and Lemma 2.10 it follows that
$$
\begin{aligned}
\bigl\||f|^{2p+2}f - |g|^{2p+2}g\bigr\|_{G^a(J,\tilde K;Y(T))}
\le C\Bigl(&\|f\|^2_{G^a(J,\tilde K;Y(T))}\bigl\||f|^{2p}f - |g|^{2p}g\bigr\|_{G^a(J,\tilde K;Y(T))}\\
&+ \bigl\||g|^{2p}g\bigr\|_{G^a(J,\tilde K;Y(T))}\bigl(\|f\|_{G^a(J,\tilde K;Y(T))} + \|g\|_{G^a(J,\tilde K;Y(T))}\bigr)\|f-g\|_{G^a(J,\tilde K;Y(T))}\Bigr).
\end{aligned}
$$
Thus the lemma for the case $p+1$ follows from the assumption. This completes the proof of Lemma 2.11.

Lemma 2.12. We have
$$
\bigl\||f|^{2p}f - |g|^{2p}g\bigr\|_{G^a(J,K+4it;Y(T))} \le \frac{C}{(1-abT)^{2p+2}}\Bigl(\|f\|^{2p}_{G^a(J,K;Y(T))} + \|g\|^{2p}_{G^a(J,K;Y(T))}\Bigr)\|f-g\|_{G^a(J,K;Y(T))},
$$
provided that $abT<1$, where $b>2$, $Y(T) = L^\infty(-T,T;R^m(t))$ and $m\ge[n/2]+1$.

Proof. Lemma 2.3 with $d=0$, $\alpha=(4+n)i$ and $X(T) = G^a(J;Y(T))$ gives
$$
\|\cdot\|_{G^a(J,K+4it;Y(T))} \le \frac{C}{1-abT}\,\|\cdot\|_{G^a(J,\tilde K;Y(T))}, \tag{2.13}
$$
since
$$
\|t\,\cdot\|_{X(T)} \le T\,\|\cdot\|_{X(T)}.
$$
We again use Lemma 2.3 with $d=n$, $\alpha=-ni$ to obtain
$$
\|\cdot\|_{G^a(J,\tilde K;Y(T))} \le \frac{C}{1-abT}\,\|\cdot\|_{G^a(J,K;Y(T))}. \tag{2.14}
$$
The lemma follows from (2.13), (2.14) and Lemma 2.11.
Lemma 2.13. We assume that $n=2$ or $3$. Then we have
$$
\Bigl\|\prod_{j=1}^p f_j\bar f_{j+p}\,f_{2p+1}\Bigr\|_{G^a(J,\tilde K;L^r(-T,T;R^{1,r'}(t)))}
\le C\sum_{k=1}^{2p+1}\prod_{\substack{j=1\\ j\ne k}}^{2p+1}\|f_j\|_{G^a(J,\tilde K;L^\infty(-T,T;R^1(t)))}\,\|f_k\|_{G^a(J,\tilde K;L^r(-T,T;R^{1,r}(t)))},
$$
where $r = 2+(4/n)$, $(1/r)+(1/r')=1$ and $p=1$ if $n=3$, $p\in\mathbb{N}$ if $n=2$.

Proof. We have by Hölder's inequality
$$
\begin{aligned}
\Bigl\|\prod_{j=1}^p f_j\bar f_{j+p}\,f_{2p+1}\Bigr\|_{L^r(-T,T;R^{1,r'}(t))}
&= \sum_{|\alpha|+|\beta|\le 1}\Bigl\|J^\alpha\partial^\beta\Bigl(\prod_{j=1}^p f_j\bar f_{j+p}\,f_{2p+1}\Bigr)\Bigr\|_{L^r(-T,T;L^{r'})}\\
&\le C\sum_{k=1}^{2p+1}\prod_{\substack{j=1\\ j\ne k}}^{2p+1}\|f_j\|_{L^\infty(-T,T;L^{p(n+2)})}\,\|f_k\|_{L^r(-T,T;R^{1,r}(t))}.
\end{aligned}
\tag{2.15}
$$
By Sobolev's inequality,
$$
\Bigl\|\prod_{j=1}^p f_j\bar f_{j+p}\,f_{2p+1}\Bigr\|_{L^r(-T,T;R^{1,r'}(t))}
\le C\sum_{k=1}^{2p+1}\prod_{\substack{j=1\\ j\ne k}}^{2p+1}\|f_j\|_{L^\infty(-T,T;R^1(t))}\,\|f_k\|_{L^r(-T,T;R^{1,r}(t))}. \tag{2.16}
$$
In the same way as in the proof of Lemma 2.9 we obtain by (2.16),
$$
\Bigl\|\prod_{j=1}^p f_j\bar f_{j+p}\,f_{2p+1}\Bigr\|_{G^a(J;L^r(-T,T;R^{1,r'}(t)))}
\le C\sum_{k=1}^{2p+1}\prod_{\substack{j=1\\ j\ne k}}^{2p+1}\|f_j\|_{G^a(J;L^\infty(-T,T;R^1(t)))}\,\|f_k\|_{G^a(J;L^r(-T,T;R^{1,r}(t)))}. \tag{2.17}
$$
From (2.17) and an argument similar to the proof of Lemma 2.10 the lemma follows.

Lemma 2.14. We assume that $n=2$ or $3$. Then we have
$$
\begin{aligned}
\Bigl\|\prod_{j=1}^p f_j\bar f_{j+p}\,f_{2p+1}\Bigr\|_{G^a(J,\tilde K;L^1(-T,T;R^1(t)))}
\le CT^{\frac35}\sum_{\substack{j,k,l=1\\ j\ne k\ne l}}^{3}&\|f_j\|_{G^a(J,\tilde K;L^\infty(-T,T;R^1(t)))}\,\|f_k\|_{G^a(J,\tilde K;L^r(-T,T;R^{1,r}(t)))}\\
&\times\|f_l\|_{G^a(J,\tilde K;L^r(-T,T;R^{1,r}(t)))} \quad\text{for } n=3,
\end{aligned}
$$
$$
\Bigl\|\prod_{j=1}^p f_j\bar f_{j+p}\,f_{2p+1}\Bigr\|_{G^a(J,\tilde K;L^1(-T,T;R^1(t)))}
\le CT^{\frac34}\sum_{k=1}^{2p+1}\prod_{\substack{j=1\\ j\ne k}}^{2p+1}\|f_j\|_{G^a(J,\tilde K;L^\infty(-T,T;R^1(t)))}\,\|f_k\|_{G^a(J,\tilde K;L^r(-T,T;R^{1,r}(t)))} \quad\text{for } n=2,
$$
where $r = 2+(4/n)$, $p=1$ if $n=3$ and $p\in\mathbb{N}$ if $n=2$.

Proof. We have by Hölder's inequality
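$$
\begin{aligned}
\Bigl\|\prod_{j=1}^p f_j\bar f_{j+p}\,f_{2p+1}\Bigr\|_{L^1(-T,T;R^1(t))}
&= \sum_{|\alpha|+|\beta|\le 1}\Bigl\|J^\alpha\partial^\beta\Bigl(\prod_{j=1}^p f_j\bar f_{j+p}\,f_{2p+1}\Bigr)\Bigr\|_{L^1(-T,T;L^2)}\\
&\le C\sum_{k=1}^{2p+1}\prod_{\substack{j=1\\ j\ne k}}^{2p+1}\|f_j\|_{L^{q_j}(-T,T;L^{2p(n+2)})}\,\|f_k\|_{L^{q_k}(-T,T;R^{1,r}(t))},
\end{aligned}
\tag{2.18}
$$
where $\sum_{j=1}^{2p+1}1/q_j = 1$. On the other hand, by Sobolev's inequality we have for $n=3$,
$$
\|f\|_{L^{2p(n+2)}} = \|f\|_{L^{10}} \le C\|f\|_{H^1}^{1/2}\|f\|_{H^{1,r}}^{1/2}, \tag{2.19}
$$
and for $n=2$,
$$
\|f\|_{L^{2p(n+2)}} \le C\|f\|_{H^1}. \tag{2.20}
$$
We use (2.19) and (2.20) in the right-hand side of (2.18) to get
$$
\begin{aligned}
\Bigl\|\prod_{j=1}^p f_j\bar f_{j+p}\,f_{2p+1}\Bigr\|_{L^1(-T,T;R^1(t))}
&\le C\sum_{\substack{j,k,l=1\\ j\ne k\ne l}}^{3}\|f_j\|_{L^{\frac{n+2}{2}}(-T,T;R^1(t))}\,\|f_k\|_{L^r(-T,T;R^{1,r}(t))}\,\|f_l\|_{L^r(-T,T;R^{1,r}(t))}\\
&\le CT^{\frac{n}{n+2}}\sum_{\substack{j,k,l=1\\ j\ne k\ne l}}^{3}\|f_j\|_{L^\infty(-T,T;R^1(t))}\,\|f_k\|_{L^r(-T,T;R^{1,r}(t))}\,\|f_l\|_{L^r(-T,T;R^{1,r}(t))}
\quad\text{for } n=3,
\end{aligned}
$$
and
$$
\begin{aligned}
\Bigl\|\prod_{j=1}^p f_j\bar f_{j+p}\,f_{2p+1}\Bigr\|_{L^1(-T,T;R^1(t))}
&\le C\sum_{k=1}^{2p+1}\prod_{\substack{j=1\\ j\ne k}}^{2p+1}\|f_j\|_{L^{\frac{8p}{3}}(-T,T;R^1(t))}\,\|f_k\|_{L^r(-T,T;R^{1,r}(t))}\\
&\le CT^{\frac34}\sum_{k=1}^{2p+1}\prod_{\substack{j=1\\ j\ne k}}^{2p+1}\|f_j\|_{L^\infty(-T,T;R^1(t))}\,\|f_k\|_{L^r(-T,T;R^{1,r}(t))}
\quad\text{for } n=2.
\end{aligned}
$$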
The rest of the proof is done in the same way as in the proof of Lemma 2.13 and so we leave it to the reader.

Define $U(t)$ by
$$
U(t)\phi = \mathcal{F}^{-1}e^{-i|\xi|^2t/2}\,\hat\phi,
$$
which is the fundamental solution of the linear Schrödinger equation
$$
\begin{cases}
i\partial_t u + \frac12\Delta u = 0, & (t,x)\in\mathbb{R}\times\mathbb{R}^n,\\
u(0,x) = \phi(x), & x\in\mathbb{R}^n.
\end{cases}
$$
For $U(t)$ we have

Lemma 2.15. (1) We let $2\le p\le\infty$ and $(1/p)+(1/p')=1$. Then for any $\phi\in L^{p'}$,
$$
\|U(t)\phi\|_{L^p} \le C|t|^{-\frac n2\left(1-\frac2p\right)}\|\phi\|_{L^{p'}}. \tag{2.21}
$$
(2) We let $r = 2+(4/n)$. Then for any $\phi\in L^2$,
$$
\|U(\cdot)\phi\|_{L^r(\mathbb{R};L^r)} \le C\|\phi\|_{L^2}. \tag{2.22}
$$
Lemma 2.15 (2.21) is well known. Lemma 2.15 (2.22) is due to Strichartz [St].
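The exponent bookkeeping behind (2.22) and the Hölder steps in Lemmas 2.13–2.14 can be checked with exact rational arithmetic. The sketch below is illustrative only; the function names are hypothetical, and the admissibility relation $2/q = n(1/2 - 1/r)$ is the standard Strichartz condition for the Schrödinger group, which is not stated explicitly in the text above.

```python
from fractions import Fraction

def strichartz_r(n):
    """The exponent r = 2 + 4/n used in Lemma 2.15 (2)."""
    return Fraction(2) + Fraction(4, n)

def is_admissible(q, r, n):
    """Standard Schroedinger admissibility: 2/q = n (1/2 - 1/r)."""
    return Fraction(2) / q == n * (Fraction(1, 2) - 1 / r)

def space_exponents_lemma_2_13(n, p):
    """Hoelder in x for (2.15): 1/r' = 2p * 1/(p(n+2)) + 1/r."""
    r = strichartz_r(n)
    return 1 - 1 / r == 2 * p * Fraction(1, p * (n + 2)) + 1 / r

def space_exponents_lemma_2_14(n, p):
    """Hoelder in x for (2.18): 1/2 = 2p * 1/(2p(n+2)) + 1/r."""
    r = strichartz_r(n)
    return Fraction(1, 2) == 2 * p * Fraction(1, 2 * p * (n + 2)) + 1 / r

def time_exponents_lemma_2_14_n2(p):
    """Hoelder in t for n = 2: 2p factors in L^{8p/3} and one in L^r, r = 4."""
    r = strichartz_r(2)
    return 2 * p * Fraction(3, 8 * p) + 1 / r == 1

if __name__ == "__main__":
    for n in (1, 2, 3):
        r = strichartz_r(n)
        print(n, r, is_admissible(r, r, n))
```

The pair $(q,r)=(r,r)$ with $r=2+4/n$ is the diagonal admissible pair, which is why the same exponent appears in time and in space in (2.22).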
3. Existence of Analytic Solutions
Proposition 3.1. We assume that $\phi\in G^a\big(x,|x|^2;R^m(0)\big)$ with $m\ge[\frac n2]+1$. Then there exists a unique solution $u(t,x)$ of (1.1) and a positive constant $T$ such that
$$u(t,x)\in G^a\big(J,K;L^\infty(-T,T;R^m(t))\big)\quad\text{for } t\in[-T,T].$$
Proof. We only treat the case of positive time, since the negative time is treated similarly. To prove Proposition 3.1 we introduce the function space
$$X_T=\{f\in L^\infty(0,T;L^2);\ \|f\|_{X_T}=\|f\|_{G^a(J,K;Y(T))}<\infty\},$$
where $Y(T)=L^\infty(0,T;R^m(t))$. We consider the linearized equation of (1.1),
$$i\partial_tu+\tfrac12\Delta u=\lambda|v|^{2p}v,\quad (t,x)\in\mathbb{R}\times\mathbb{R}^n,\qquad u(0,x)=\phi(x),\quad x\in\mathbb{R}^n,\tag{3.1}$$
where $v\in X_T$. We define $M$ by $u=Mv$. It is sufficient to prove that $M$ is a contraction mapping from a closed ball $X_{T,\rho}=\{f\in X_T;\ \|f\|_{X_T}\le\rho\}$ into itself for some time $T$. Applying $J^\beta\partial^\gamma J^\alpha K^l$ to both sides of (3.1), we obtain
$$i\partial_tJ^\beta\partial^\gamma J^\alpha K^lu+\tfrac12\Delta J^\beta\partial^\gamma J^\alpha K^lu=\lambda J^\beta\partial^\gamma J^\alpha(K+4it)^l|v|^{2p}v,\tag{3.2}$$
where we have used the commutation relations
$$[L,J]=0,\qquad [L,K]=4itL,\qquad\text{with } L=i\partial_t+\tfrac12\Delta.$$
From (3.2) it follows that
$$\|J^\beta\partial^\gamma J^\alpha K^lu\|_{L^2}\le\|x^\beta\partial^\gamma x^\alpha|x|^{2l}\phi\|_{L^2}+C\int_0^t\|J^\beta\partial^\gamma J^\alpha(K+4i\tau)^l|v|^{2p}v\|_{L^2}\,d\tau.\tag{3.3}$$
Multiplying both sides of (3.3) by $a^{|\alpha|+l}/\alpha!\,l!$ and summing with respect to $\alpha$, $l$, $\beta$, $\gamma$, we get
$$\|u(t)\|_{G^a(J,K;Y(T))}\le\|\phi\|_{G^a(x,|x|^2;R^m(0))}+C\int_0^T\big\||v|^{2p}v(\tau)\big\|_{G^a(J,K+4i\tau;Y(T))}\,d\tau.\tag{3.4}$$
By Lemma 2.3 with $d=n$, $\alpha=(4+n)i$, $X(T)=G^a(J;Y(T))$ we see that
$$\|\cdot\|_{G^a(J,K+4it;Y(T))}\le\frac{C}{1-abT}\,\|\cdot\|_{G^a(J,K;Y(T))}.\tag{3.5}$$
We apply (3.5) and Lemma 2.12 to the second term of the right-hand side of (3.4) to obtain
$$\|u(t)\|_{G^a(J,K;Y(T))}\le\|\phi\|_{G^a(x,|x|^2;R^m(0))}+C\int_0^T\frac{1}{(1-abT)^{2p+3}}\,\|v\|^{2p+1}_{G^a(J,K;Y(T))}\,d\tau,$$
from which it follows that
$$\|u\|_{X_T}\le\|\phi\|_{G^a(x,|x|^2;R^m(0))}+CT\rho^{2p+1}\tag{3.6}$$
provided that $T<\frac{1}{2ab}$. We take
$$\|\phi\|_{G^a(x,|x|^2;R^m(0))}\le\frac\rho2\quad\text{and}\quad C\rho^{2p+1}T\le\frac\rho2.$$
Then (3.6) gives us
$$\|u\|_{X_T}\le\rho.\tag{3.7}$$
In the same way as in the proof of (3.7) we have by Lemma 2.12,
$$\|Mv_1-Mv_2\|_{X_T}\le CT\rho^{2p}\|v_1-v_2\|_{X_T}\le\frac12\|v_1-v_2\|_{X_T},\tag{3.8}$$
provided that $CT\rho^{2p}\le\frac12$. From (3.7) and (3.8) we see that there exists a $T$ such that $M$ is a contraction mapping from $X_{T,\rho}$ into itself. This completes the proof of Proposition 3.1.
Proposition 3.2. We assume that $p=1$, $n=1$ and $\phi\in G^a(x,|x|^2;L^2)$. Then there exists a unique solution $u(t,x)$ of (1.1) and a positive constant $T$ such that
$$u(t,x)\in G^a(J,K;L^2)\quad\text{for } t\in[-T,T].$$
Proof. To prove Proposition 3.2 we introduce the function space $X_T=\{f\in L^\infty(0,T;L^2);\ \|f\|_{X_T}<\infty\}$, where
$$\|f\|_{X_T}=\|f\|_{G^a(J,K;L^\infty(0,T;L^2))}+\|f\|_{G^a(J,K;L^6(0,T;L^6))}.$$
We also define a closed ball $X_{T,\rho}=\{f\in X_T;\ \|f\|_{X_T}\le\rho\}$. We now prove that there exists a $T$ such that $M$ defined by $u=Mv$ is a contraction mapping from $X_{T,\rho}$ into itself. In the same way as in the proof of (3.2) we have by (3.1),
$$i\partial_tJ^\alpha K^lu+\tfrac12\Delta J^\alpha K^lu=\lambda J^\alpha(K+4it)^l|v|^2v,$$
which can be written as
$$J^\alpha K^lu(t)=U(t)x^\alpha|x|^{2l}\phi-i\int_0^tU(t-\tau)\lambda J^\alpha(K+4i\tau)^l|v|^2v(\tau)\,d\tau.\tag{3.9}$$
By virtue of Lemma 2.13 we get
$$\|J^\alpha K^lu(t)\|_{L^6}\le\|U(t)x^\alpha|x|^{2l}\phi\|_{L^6}+C\int_0^t(t-\tau)^{-\frac13}\|J^\alpha(K+4i\tau)^l|v|^2v(\tau)\|_{L^{6/5}}\,d\tau.$$
Taking the $L^6$ norm in time and using Lemma 2.15, we obtain
$$\|J^\alpha K^lu\|_{L^6(0,T;L^6)}\le C\|x^\alpha|x|^{2l}\phi\|_{L^2}+C\Big\|\int_0^t(t-\tau)^{-\frac13}\|J^\alpha(K+4i\tau)^l|v|^2v(\tau)\|_{L^{6/5}}\,d\tau\Big\|_{L^6(0,T)}.\tag{3.10}$$
By Hölder's inequality
$$\int_0^t(t-\tau)^{-\frac13}g(\tau)\,d\tau\le\Big(\int_0^t(t-\tau)^{-\frac23}\,d\tau\Big)^{\frac12}\Big(\int_0^t|g(\tau)|^2\,d\tau\Big)^{\frac12}\le Ct^{\frac16}\|g\|_{L^2(0,T)}.\tag{3.11}$$
We use (3.11) with $g(\tau)=\|J^\alpha(K+4i\tau)^l|v|^2v\|_{L^{6/5}}$ in (3.10) to have
$$\|J^\alpha K^lu\|_{L^6(0,T;L^6)}\le C\|x^\alpha|x|^{2l}\phi\|_{L^2}+CT^{\frac13}\|J^\alpha(K+4it)^l|v|^2v\|_{L^2(0,T;L^{6/5})}.\tag{3.12}$$
Multiplying both sides of (3.12) by $a^{|\alpha|+l}/\alpha!\,l!$ and summing with respect to $\alpha$, $l$, we get
$$\|u\|_{G^a(J,K;L^6(0,T;L^6))}\le C\Big(\|\phi\|_{G^a(x,|x|^2;L^2)}+T^{\frac13}\big\||v|^2v\big\|_{G^a(J,K+4it;L^2(0,T;L^{6/5}))}\Big).\tag{3.13}$$
We have by Lemma 2.3 with $X(T)=G^a\big(J;L^2(0,T;L^{6/5})\big)$, $d=0$, $\alpha=(n+4)i$,
$$\begin{aligned}\big\||v|^2v\big\|_{G^a(J,K+4it;L^2(0,T;L^{6/5}))}&=\big\||v|^2v\big\|_{G^a(K+4it;G^a(J;L^2(0,T;L^{6/5})))}\\ &\le\frac{C}{1-abT}\big\||v|^2v\big\|_{G^a(J,K;\tilde L^2(0,T;L^{6/5}))}\\ &\le\frac{C}{1-abT}\,\|v\|_{G^a(J,K;\tilde L^6(0,T;L^2))}\,\|v\|^2_{G^a(J,K;\tilde L^6(0,T;L^6))}\quad\text{(by Lemma 2.7)}.\end{aligned}$$
We again use Lemma 2.3 to obtain
$$\big\||v|^2v\big\|_{G^a(J,K+4it;L^2(0,T;L^{6/5}))}\le\frac{C}{(1-abT)^4}\,T^{\frac16}\rho^3.\tag{3.14}$$
Hence by (3.13) and (3.14),
$$\|u\|_{G^a(J,K;L^6(0,T;L^6))}\le C\Big(\|\phi\|_{G^a(x,|x|^2;L^2)}+T^{\frac12}\rho^3\Big),\tag{3.15}$$
provided that $T\le\frac{1}{2ab}$.
In the same way as (3.4) we have
$$\begin{aligned}\|u\|_{G^a(J,K;L^\infty(0,T;L^2))}&\le\|\phi\|_{G^a(x,|x|^2;L^2)}+C\int_0^T\big\||v|^2v\big\|_{G^a(J,K+4i\tau;L^2)}\,d\tau\\ &\le\|\phi\|_{G^a(x,|x|^2;L^2)}+C\big\||v|^2v\big\|_{G^a(J,K+4it;L^1(0,T;L^2))}.\end{aligned}\tag{3.16}$$
We have
$$\begin{aligned}\big\||v|^2v\big\|_{G^a(J,K+4it;L^1(0,T;L^2))}&\le\frac{C}{1-abT}\big\||v|^2v\big\|_{G^a(J,K;\tilde L^1(0,T;L^2))}\quad\text{(by Lemma 2.3)}\\ &\le\frac{C}{1-abT}\,\|v\|^3_{G^a(J,K;\tilde L^3(0,T;L^6))}\quad\text{(by Lemma 2.7)}\\ &\le\frac{C}{(1-abT)^4}\,\|v\|^3_{G^a(J,K;L^3(0,T;L^6))}\quad\text{(by Lemma 2.3)}.\end{aligned}\tag{3.17}$$
We use (3.17) in the right-hand side of (3.16) to get
$$\|u(t)\|_{G^a(J,K;L^2)}\le\|\phi\|_{G^a(x,|x|^2;L^2)}+CT^{\frac12}\rho^3.\tag{3.18}$$
From (3.15) and (3.18) it follows that
$$\|u\|_{X_T}\le C\Big(\|\phi\|_{G^a(x,|x|^2;L^2)}+T^{\frac12}\rho^3\Big).\tag{3.19}$$
In the same way as in the proof of (3.19) we have by Lemma 2.6 and Lemma 2.7,
$$\|Mv_1-Mv_2\|_{X_T}\le CT^{\frac12}\rho^2\|v_1-v_2\|_{X_T}.\tag{3.20}$$
We take
$$C\|\phi\|_{G^a(x,|x|^2;L^2)}\le\frac\rho2,\qquad CT^{\frac12}\rho^2\le\frac12.$$
Then (3.19) and (3.20) show that there exists a $T$ such that $M$ is a contraction mapping from $X_{T,\rho}$ into itself. This completes the proof of Proposition 3.2.
Proposition 3.3. We assume that $n=2$ or $3$ and $p=1$ when $n=3$, $p\in\mathbb{N}$ when $n=2$, and $\phi\in G^a\big(x,|x|^2;R^1(0)\big)$. Then there exists a unique solution $u(t,x)$ of (1.1) and a positive constant $T$ such that
$$u(t,x)\in G^a\big(J,K;R^1(t)\big)\quad\text{for } t\in[-T,T].$$
Proof. We define the function space as follows: $X_T=\{f\in L^\infty(0,T;L^2);\ \|f\|_{X_T}<\infty\}$, where
$$\|f\|_{X_T}=\|f\|_{G^a(J,K;L^\infty(0,T;R^1(t)))}+\|f\|_{G^a(J,K;L^r(0,T;R^{1,r}(t)))},$$
and $r=2+4/n$. We also define a closed ball $X_{T,\rho}=\{f\in X_T;\ \|f\|_{X_T}\le\rho\}$. We let $M$ be defined by $u=Mv$, where $v\in X_{T,\rho}$. In the same way as in the proof of (3.10),
$$\|J^\alpha K^lu\|_{L^r(0,T;R^{1,r}(t))}\le C\|x^\alpha|x|^{2l}\phi\|_{R^1(0)}+C\Big\|\int_0^t(t-\tau)^{-\frac n2\left(1-\frac2r\right)}\big\|J^\alpha(K+4i\tau)^l|v|^{2p}v(\tau)\big\|_{R^{1,r'}(\tau)}\,d\tau\Big\|_{L^r(0,T)}.\tag{3.21}$$
Arguments similar to those in (3.13) and (3.21) give
$$\|u\|_{G^a(J,K;L^r(0,T;R^{1,r}(t)))}\le C\Big(\|\phi\|_{G^a(x,|x|^2;R^1(0))}+T^{\frac{2n}{2n+4}}\big\||v|^{2p}v\big\|_{G^a(J,K+4it;L^r(0,T;R^{1,r'}(t)))}\Big).\tag{3.22}$$
We have
$$\begin{aligned}\big\||v|^{2p}v\big\|_{G^a(J,K+4it;L^r(0,T;R^{1,r'}(t)))}&\le\frac{C}{1-abT}\big\||v|^{2p}v\big\|_{G^a(J,K;\tilde L^r(0,T;R^{1,r'}(t)))}\quad\text{(by Lemma 2.3)}\\ &\le\frac{C}{1-abT}\,\|v\|_{G^a(J,K;\tilde L^r(0,T;R^{1,r}(t)))}\,\|v\|^{2p}_{G^a(J,K;\tilde L^\infty(0,T;R^1(t)))}\quad\text{(by Lemma 2.13)}\\ &\le\frac{C}{(1-abT)^{2p+2}}\,\|v\|_{G^a(J,K;L^r(0,T;R^{1,r}(t)))}\times\|v\|^{2p}_{G^a(J,K;L^\infty(0,T;R^1(t)))}\quad\text{(by Lemma 2.3)}.\end{aligned}\tag{3.23}$$
From (3.22) and (3.23) it follows that
$$\|u\|_{G^a(J,K;L^r(0,T;R^{1,r}(t)))}\le C\Big(\|\phi\|_{G^a(x,|x|^2;R^1(0))}+T^{\frac{2n}{2n+4}}\rho^{2p+1}\Big),\tag{3.24}$$
provided that $T\le\frac{1}{2ab}$. In the same way as in the proof of (3.16)
$$\|u\|_{G^a(J,K;L^\infty(0,T;R^1(t)))}\le\|\phi\|_{G^a(x,|x|^2;R^1(0))}+C\big\||v|^{2p}v\big\|_{G^a(J,K+4it;L^1(0,T;R^1(t)))}.\tag{3.25}$$
We have by Lemma 2.3,
$$\big\||v|^{2p}v\big\|_{G^a(J,K+4it;L^1(0,T;R^1(t)))}\le\frac{C}{1-abT}\big\||v|^{2p}v\big\|_{G^a(J,K;\tilde L^1(0,T;R^1(t)))}.\tag{3.26}$$
We apply Lemma 2.14 to (3.26) to see that the right-hand side of (3.26) is bounded from above by
$$\frac{CT^{3/5}}{1-abT}\,\|v\|_{G^a(J,K;\tilde L^\infty(0,T;R^1(t)))}\,\|v\|^2_{G^a(J,K;\tilde L^r(0,T;R^{1,r}(t)))}\quad\text{for } n=3\tag{3.27}$$
and
$$\frac{CT^{3/4}}{1-abT}\,\|v\|_{G^a(J,K;\tilde L^r(0,T;R^{1,r}(t)))}\,\|v\|^{2p}_{G^a(J,K;\tilde L^\infty(0,T;R^1(t)))}\quad\text{for } n=2.\tag{3.28}$$
We use Lemma 2.3 in (3.27) and (3.28) to get
$$\big\||v|^{2p}v\big\|_{G^a(J,K+4it;L^1(0,T;R^1(t)))}\le C\begin{cases}T^{3/5}\rho^3&\text{for } n=3,\\ T^{3/4}\rho^{2p+1}&\text{for } n=2,\end{cases}\tag{3.29}$$
provided that $T\le\frac{1}{2ab}$. Hence by (3.25) and (3.29),
$$\|u\|_{G^a(J,K;L^\infty(0,T;R^1(t)))}\le\|\phi\|_{G^a(x,|x|^2;R^1(0))}+C\max\big\{T^{\frac35},T^{\frac34}\big\}\rho^{2p+1}.\tag{3.30}$$
From (3.24) and (3.30) we see that there exists a $T$ such that
$$\|u\|_{X_T}\le\rho.\tag{3.31}$$
In the same way as in the proof of (3.31) we have by Lemma 2.3, Lemma 2.13 and Lemma 2.14
$$\|Mv_1-Mv_2\|_{X_T}\le\frac\rho2\|v_1-v_2\|_{X_T}\tag{3.32}$$
for a sufficiently small $T$. Proposition 3.3 follows from (3.31) and (3.32).
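The way the radius $\rho$ and the time $T$ are chosen in the contraction arguments of this section, in the form used for (3.6)-(3.8), can be sketched as follows. The function name and the numerical inputs `C`, `a`, `b` are hypothetical placeholders: the paper does not give values for these constants, and the factor one half in the bound on $T$ is only a convenient way to enforce the strict inequality $T<1/(2ab)$.

```python
def contraction_parameters(phi_norm, C, a, b, p):
    """Return (rho, T) satisfying the smallness conditions behind (3.6)-(3.8).

    phi_norm : norm of the initial datum (must be positive)
    C, a, b  : positive constants from the estimates (assumed known here)
    p        : power in the nonlinearity |v|^(2p) v
    """
    if min(phi_norm, C, a, b) <= 0:
        raise ValueError("all inputs must be positive")
    # ||phi|| <= rho/2 holds with equality for this choice of radius
    rho = 2.0 * phi_norm
    # C * rho^(2p) * T <= 1/2 gives both C*rho^(2p+1)*T <= rho/2 and (3.8)
    t_nonlinear = 1.0 / (2.0 * C * rho ** (2 * p))
    # strict inequality T < 1/(2ab): stay at half of the threshold
    t_gevrey = 0.5 / (2.0 * a * b)
    return rho, min(t_nonlinear, t_gevrey)


if __name__ == "__main__":
    rho, T = contraction_parameters(phi_norm=1.0, C=3.0, a=0.5, b=2.0, p=1)
    print(rho, T)
```

The same pattern applies to Propositions 3.2 and 3.3, with $T$ replaced by the corresponding power of $T$ from (3.19)-(3.20) or (3.24) and (3.30).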
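4. Proofs of Theorems
We first prove
Lemma 4.1. We let $\tilde P=x\cdot\nabla+t\partial_t$. Then we have for any $k\in\mathbb{N}$,
$$\sum_{0\le l\le m}\frac{a^{l+k}}{(l+k)!}\big\|t^l\tilde P^l(t\tilde P)^kf\big\|_{X(t)}\le\sum_{0\le l\le m}\frac{a^k}{(l+k)!}\Big(\frac{a}{2-e^{a|t|}}\Big)^l\big\|(t\tilde P)^{l+k}f\big\|_{X(t)},$$
provided that $\|t\,\cdot\|_{X(t)}\le|t|\,\|\cdot\|_{X(t)}$ and $2-e^{a|t|}>0$.
Proof. We prove the lemma by induction. It is clear that the lemma holds for $m=0$ and any $k$. We assume that the lemma holds for $m$ and any $k$. We have
$$\begin{aligned}\sum_{0\le l\le m+1}\frac{a^{l+k}}{(l+k)!}\big\|t^l\tilde P^l(t\tilde P)^kf\big\|_{X(t)}&=\frac{a^k}{k!}\big\|(t\tilde P)^kf\big\|_{X(t)}+\sum_{1\le l\le m+1}\frac{a^{l+k}}{(l+k)!}\Big\|\Big(t^{l-1}[t,\tilde P^{l-1}]\tilde P+t^{l-1}\tilde P^{l-1}(t\tilde P)\Big)(t\tilde P)^kf\Big\|_{X(t)}\\ &\le\frac{a^k}{k!}\big\|(t\tilde P)^kf\big\|_{X(t)}+\sum_{1\le l\le m+1}\frac{a^{l+k}}{(l+k)!}\big\|t^{l-1}\tilde P^{l-1}(t\tilde P)^{k+1}f\big\|_{X(t)}\\ &\quad+\sum_{2\le l\le m+1}\frac{a^{l+k}}{(l+k)!}\big\|t^{l-1}[t,\tilde P^{l-1}]\tilde P(t\tilde P)^kf\big\|_{X(t)}.\end{aligned}\tag{4.1}$$
We prove by induction
$$\tilde P^lt=t(\tilde P+1)^l.\tag{4.2}$$
In case $l=0$, (4.2) is valid. We assume that (4.2) holds for $l$. Then we have
$$\tilde P^{l+1}t=\tilde Pt(\tilde P+1)^l\ \text{(by assumption)}=\big(t\tilde P+[\tilde P,t]\big)(\tilde P+1)^l=t(\tilde P+1)^{l+1}.\tag{4.3}$$
This completes the proof of (4.2). From (4.2) we have
$$\tilde P^lt=t\sum_{1\le j\le l}\binom lj\tilde P^{l-j}+t\tilde P^l.$$
Hence
$$[t,\tilde P^l]=-t\sum_{1\le j\le l}\binom lj\tilde P^{l-j}.\tag{4.4}$$
From (4.4) it follows that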
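$$\begin{aligned}\sum_{2\le l\le m+1}\frac{a^{l+k}}{(l+k)!}\big\|t^{l-1}[t,\tilde P^{l-1}]\tilde P(t\tilde P)^kf\big\|_{X(t)}&\le\sum_{2\le l\le m+1}\ \sum_{1\le j\le l-1}\binom{l-1}{j}\frac{a^{l+k}}{(l+k)!}\,|t|^l\big\|\tilde P^{l-j}(t\tilde P)^kf\big\|_{X(t)}\\ &=\sum_{2\le l\le m+1}\ \sum_{1\le j\le l-1}\frac{a^{l+k}(l-1)!}{(l+k)!\,(l-1-j)!\,j!}\,|t|^l\big\|\tilde P^{l-j}(t\tilde P)^kf\big\|_{X(t)}.\end{aligned}\tag{4.5}$$
We have for $k\in\mathbb{N}\cup\{0\}$,
$$\frac{(l-1)!}{(l+k)!\,(l-1-j)!}\le\frac{1}{(l+k-j)!}.$$
Hence the right-hand side of (4.5) is bounded from above by
$$\sum_{2\le l\le m+1}\ \sum_{1\le j\le l-1}\frac{a^{l+k}|t|^j}{(l+k-j)!\,j!}\big\|t^{l-j}\tilde P^{l-j}(t\tilde P)^kf\big\|_{X(t)}.\tag{4.6}$$
Since
$$\sum_{2\le l\le m+1}\ \sum_{1\le j\le l-1}a_{l-j}b_j\le\sum_{1\le l\le m}a_l\sum_{1\le l\le m}b_l,$$
we obtain by (4.6), if we put
$$a_l=\frac{a^{l+k}}{(l+k)!}\big\|t^l\tilde P^l(t\tilde P)^kf\big\|_{X(t)},\qquad b_l=\frac{(a|t|)^l}{l!},$$
that (4.6) is bounded from above by
$$\sum_{1\le l\le m}\frac{(a|t|)^l}{l!}\sum_{1\le l\le m}\frac{a^{l+k}}{(l+k)!}\big\|t^l\tilde P^l(t\tilde P)^kf\big\|_{X(t)}\le\big(e^{a|t|}-1\big)\sum_{1\le l\le m}\frac{a^{l+k}}{(l+k)!}\big\|t^l\tilde P^l(t\tilde P)^kf\big\|_{X(t)}.\tag{4.7}$$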
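From (4.1) and (4.7) it follows that
$$\begin{aligned}\sum_{0\le l\le m+1}\frac{a^{l+k}}{(l+k)!}\big\|t^l\tilde P^l(t\tilde P)^kf\big\|_{X(t)}&\le\frac{a^k}{k!}\big\|(t\tilde P)^kf\big\|_{X(t)}+\sum_{0\le l\le m}\frac{a^{l+1+k}}{(l+1+k)!}\big\|t^l\tilde P^l(t\tilde P)^{k+1}f\big\|_{X(t)}\\ &\quad+\big(e^{a|t|}-1\big)\sum_{1\le l\le m}\frac{a^{l+k}}{(l+k)!}\big\|t^l\tilde P^l(t\tilde P)^kf\big\|_{X(t)}\\ &\le\frac{a^k}{k!}\big\|(t\tilde P)^kf\big\|_{X(t)}+\sum_{0\le l\le m}\frac{a^{1+k}}{(l+1+k)!}\Big(\frac{a}{2-e^{a|t|}}\Big)^l\big\|(t\tilde P)^{l+k+1}f\big\|_{X(t)}\\ &\quad+\big(e^{a|t|}-1\big)\sum_{1\le l\le m}\frac{a^{l+k}}{(l+k)!}\big\|t^l\tilde P^l(t\tilde P)^kf\big\|_{X(t)}\quad\text{(by assumption)}\\ &=\frac{a^k}{k!}\big\|(t\tilde P)^kf\big\|_{X(t)}+a\sum_{1\le l\le m+1}\frac{a^k}{(l+k)!}\Big(\frac{a}{2-e^{a|t|}}\Big)^{l-1}\big\|(t\tilde P)^{l+k}f\big\|_{X(t)}\\ &\quad+\big(e^{a|t|}-1\big)\sum_{1\le l\le m}\frac{a^{l+k}}{(l+k)!}\big\|t^l\tilde P^l(t\tilde P)^kf\big\|_{X(t)}.\end{aligned}$$
Therefore
$$\sum_{1\le l\le m+1}\frac{a^{l+k}}{(l+k)!}\big\|t^l\tilde P^l(t\tilde P)^kf\big\|_{X(t)}\le\sum_{1\le l\le m+1}\frac{a^k}{(l+k)!}\Big(\frac{a}{2-e^{a|t|}}\Big)^l\big\|(t\tilde P)^{l+k}f\big\|_{X(t)},$$
which means the lemma holds for $m+1$. This completes the proof of the lemma.
We next prove the multi-dimensional version of [H-K.K, Lemma 2.4 (2.2)].
Lemma 4.2. We have for any $k\in\mathbb{N}$,
$$\sum_{0\le l\le m}\frac{a^{l+k}}{(l+k)!}\big\|(x\cdot\nabla)^{l+k}f\big\|_{X(t)}\le\sum_{|\alpha|\le m}\frac{a^k}{(\alpha+k)!}\Big(\frac{a}{1-a}\Big)^{|\alpha|}\big\|(x\cdot\nabla)^kx^\alpha\partial^\alpha f\big\|_{X(t)},$$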
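provided that $0<a<1$.
Proof. We prove by induction. It is clear that the lemma holds for $m=0$ and any $k$. We assume that the lemma holds for $m$ and any $k$. Then we have by the assumption
$$\begin{aligned}\sum_{0\le l\le m+1}\frac{a^{l+k}}{(l+k)!}\big\|(x\cdot\nabla)^{l+k}f\big\|_{X(t)}&=\frac{a^k}{k!}\big\|(x\cdot\nabla)^kf\big\|_{X(t)}+\sum_{0\le l\le m}\frac{a^{l+1+k}}{(l+1+k)!}\big\|(x\cdot\nabla)^{k+1}(x\cdot\nabla)^lf\big\|_{X(t)}\\ &\le\frac{a^k}{k!}\big\|(x\cdot\nabla)^kf\big\|_{X(t)}+\sum_{|\alpha|\le m}\frac{a^{1+k}}{(\alpha+1+k)!}\Big(\frac{a}{1-a}\Big)^{|\alpha|}\big\|(x\cdot\nabla)^{k+1}x^\alpha\partial^\alpha f\big\|_{X(t)}.\end{aligned}\tag{4.8}$$
Since
$$(x\cdot\nabla)x^\alpha\partial^\alpha=\sum_{1\le j\le n}\big(\alpha_jx^\alpha\partial^\alpha+x_jx^\alpha\partial_j\partial^\alpha\big),$$
we have by (4.8),
$$\begin{aligned}\sum_{0\le l\le m+1}\frac{a^{l+k}}{(l+k)!}\big\|(x\cdot\nabla)^{l+k}f\big\|_{X(t)}&\le\frac{a^k}{k!}\big\|(x\cdot\nabla)^kf\big\|_{X(t)}\\ &\quad+\sum_{|\alpha|\le m}\frac{a^{k+1}}{(\alpha+k+1)!}\Big(\frac{a}{1-a}\Big)^{|\alpha|}\sum_{1\le j\le n}\Big(\alpha_j\big\|(x\cdot\nabla)^kx^\alpha\partial^\alpha f\big\|_{X(t)}+\big\|(x\cdot\nabla)^kx_jx^\alpha\partial_j\partial^\alpha f\big\|_{X(t)}\Big).\end{aligned}\tag{4.9}$$
By a simple calculation
$$\sum_{1\le j\le n}\frac{\alpha_j}{(\alpha+1)!}=\frac{\alpha_1+\alpha_2+\cdots+\alpha_n}{(\alpha_1+1)!\,(\alpha_2+1)!\cdots(\alpha_n+1)!}\le\frac{1}{\alpha!}.$$
Hence by (4.9),
$$\begin{aligned}\sum_{0\le l\le m+1}\frac{a^{l+k}}{(l+k)!}\big\|(x\cdot\nabla)^{l+k}f\big\|_{X(t)}&\le\frac{a^k}{k!}\big\|(x\cdot\nabla)^kf\big\|_{X(t)}+a\sum_{1\le|\alpha|\le m}\frac{a^k}{(\alpha+k)!}\Big(\frac{a}{1-a}\Big)^{|\alpha|}\big\|(x\cdot\nabla)^kx^\alpha\partial^\alpha f\big\|_{X(t)}\\ &\quad+(1-a)\sum_{1\le|\alpha|\le m+1}\frac{a^k}{(\alpha+k)!}\Big(\frac{a}{1-a}\Big)^{|\alpha|}\big\|(x\cdot\nabla)^kx^\alpha\partial^\alpha f\big\|_{X(t)}\\ &\le\sum_{|\alpha|\le m+1}\frac{a^k}{(\alpha+k)!}\Big(\frac{a}{1-a}\Big)^{|\alpha|}\big\|(x\cdot\nabla)^kx^\alpha\partial^\alpha f\big\|_{X(t)}.\end{aligned}$$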
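This completes the proof of the lemma.
Proof of Theorems 1.1 and 1.2. Theorems 1.1 and 1.2 are obtained if we prove that there exist constants $b_1$, $b_2$, and $T$ such that
$$\|M(t)u\|_{G^{b_1}(t\partial;\,G^{b_2t^2}(\partial_t;\,L^2(\Omega)))}<\infty,\qquad M(t)=e^{-\frac{i|x|^2}{2t}},\tag{4.10}$$
for $t\in(-T,T)\setminus\{0\}$. For simplicity we assume that $t>0$ since the negative time is treated similarly. By [H-K.K, Lemma 2.4 (2.1)]
$$\begin{aligned}\|M(t)u\|_{G^{b_1}(t\partial;\,G^{b_2t^2}(\partial_t;\,L^2(\Omega)))}&=\sum_\alpha\frac{(b_1)^{|\alpha|}}{\alpha!}\sum_l\frac{(b_2t^2)^l}{l!}\big\|\partial_t^l(it\partial)^\alpha M(t)u\big\|_{L^2(\Omega)}\\ &\le\sum_\alpha\frac{(b_1)^{|\alpha|}}{\alpha!}\sum_l\frac{1}{l!}\Big(\frac{b_2t}{1-b_2t}\Big)^l\big\|(t\partial_t)^lv_\alpha\big\|_{L^2(\Omega)},\end{aligned}\tag{4.11}$$
where $v_\alpha=(it\partial)^\alpha M(t)u$, and positive constants $b_1$ and $b_2$ are determined later. By Leibniz' rule we see that for $\tilde P=x\cdot\nabla+t\partial_t$,
$$(t\partial_t)^l=(\tilde P-x\cdot\nabla)^l=\sum_{0\le k\le l}\binom lk(-1)^k\tilde P^{l-k}(x\cdot\nabla)^k,$$
since $[x\cdot\nabla,\tilde P]=0$. We use the above equation in the right-hand side of (4.11) to obtain
$$\begin{aligned}\|M(t)u\|_{G^{b_1}(t\partial;\,G^{b_2t^2}(\partial_t;\,L^2(\Omega)))}&\le\sum_{\alpha,l}\frac{(b_1)^{|\alpha|}}{\alpha!}\Big(\frac{b_2t}{1-b_2t}\Big)^l\sum_{k\le l}\frac{1}{(l-k)!\,k!}\big\|(x\cdot\nabla)^k\tilde P^{l-k}v_\alpha\big\|_{L^2(\Omega)}\\ &\le\sum_{\alpha,l_1,l_2}\frac{(b_1)^{|\alpha|}}{\alpha!}\Big(\frac{b_2t}{1-b_2t}\Big)^{l_1+l_2}\frac{1}{l_1!\,l_2!}\big\|(x\cdot\nabla)^{l_1}\tilde P^{l_2}v_\alpha\big\|_{L^2(\Omega)}.\end{aligned}$$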
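We use Lemma 4.2 to the above to get
$$\begin{aligned}\|M(t)u\|_{G^{b_1}(t\partial;\,G^{b_2t^2}(\partial_t;\,L^2(\Omega)))}&\le\sum_\alpha\frac{(b_1)^{|\alpha|}}{\alpha!}\sum_{l_1,\beta}\Big(\frac{b_2t}{1-b_2t}\Big)^{l_1}\Big(\frac{b_2t}{1-2b_2t}\Big)^{|\beta|}\frac{1}{l_1!\,\beta!}\big\|x^\beta\partial^\beta\tilde P^{l_1}v_\alpha\big\|_{L^2(\Omega)}\\ &\le\sum_\alpha\frac{(b_1)^{|\alpha|}}{\alpha!}\sum_{l_1,\beta}\frac{(2b_2t)^{l_1}(2Rb_2t)^{|\beta|}}{l_1!\,\beta!}\big\|\partial^\beta\tilde P^{l_1}v_\alpha\big\|_{L^2(\Omega)}\\ &=\sum_\alpha\frac{(b_1)^{|\alpha|}}{\alpha!}\sum_{l_1,\beta}\frac{(2b_2t)^{l_1}(2Rb_2)^{|\beta|}}{l_1!\,\beta!}\Big\|J^\beta\big(M(-t)\tilde PM(t)\big)^{l_1}J^\alpha u\Big\|_{L^2(\Omega)},\end{aligned}\tag{4.12}$$
provided that $t<\frac{1}{2b_2}$. By Lemma 4.1 and (4.12)
$$\begin{aligned}\|M(t)u\|_{G^{b_1}(t\partial;\,G^{b_2t^2}(\partial_t;\,L^2(\Omega)))}&\le\sum_{\alpha,l_1,\beta}\frac{1}{\alpha!\,l_1!\,\beta!}(b_1)^{|\alpha|}\Big(\frac{b_2}{2-e^{2b_2|t|}}\Big)^{l_1}(2Rb_2)^{|\beta|}\big\|J^\beta\tilde K^{l_1}J^\alpha u\big\|_{L^2}\\ &\le\sum_{\alpha,l_1,\beta}\frac{1}{\alpha!\,l_1!\,\beta!}(b_1)^{|\alpha|}(2b_2)^{l_1}(2Rb_2)^{|\beta|}\big\|\tilde K^{l_1}J^{\alpha+\beta}u\big\|_{L^2},\end{aligned}\tag{4.13}$$
provided that $t<\frac{1}{2b_2}\log\frac32$, where we have used the commutation relation
$$[\tilde K,J]=0.\tag{4.14}$$
Hence we have by (4.13),
$$\|M(t)u\|_{G^{b_1}(t\partial;\,G^{b_2t^2}(\partial_t;\,L^2(\Omega)))}\le\sum_{\alpha,\beta}\frac{b_1^{|\alpha|}(2Rb_2)^{|\beta|}}{\alpha!\,\beta!}\big\|J^{\alpha+\beta}u\big\|_{G^{2b_2}(\tilde K;L^2(\Omega))}\le C\|u\|_{G^{(b_1+2Rb_2)}(J;\,G^{2b_2}(\tilde K;L^2(\Omega)))}.\tag{4.15}$$
We put $a=\max\{b_1+2Rb_2,\,2b_2\}$ with $b_2\le\frac{1}{2t}\log\frac32$. Then (4.14) and (4.15) imply that
$$\|M(t)u\|_{G^{b_1}(t\partial;\,G^{b_2t^2}(\partial_t;\,L^2(\Omega)))}\le C\|u\|_{G^a(\tilde K,J;L^2(\Omega))}\le C\|u\|_{G^a(J,\tilde K;L^2)}.\tag{4.16}$$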
N. Hayashi, K. Kato
From Propositions 3.1–3.3 we see that the right-hand side of (4.16) is bounded if
2
(4.17) kφkGa (x,kx|2 ;Rm (0)) ≤ C e|x| φ m . H We have kφkGa (x,|x|2 ;Rm (0)) =
X a|α|+k α!k!
α,k
X
≤
X
β γ α 2k
x ∂ x |x| φ
L2
|β|+|γ|≤m
X a|α|+k
xα |x|2k xβ ∂ γ φ 2 , L α!k!
|β|+|γ|≤m α,k
from which it follows that kφk2Ga (x,|x|2 ;Rm (0)) ≤
X
X
|β|+|γ|≤m α,k
√
2(|α|+k) 2a
(α!)2 (k!)2
α 2k β γ 2
x |x| x ∂ φ 2 L
√ 2(|α|+k) 2|α| X 2 2a
α 2k β γ 2
x |x| x ∂ φ 2 (by 1 ≤ 2 ) ≤ 2 L (2a)!(2k)! (a!) (2α)! |β|+|γ|≤m α,k
2
b √ X
Y β γ 2
≤C
cosh 2 2a(xj + |x| ) x ∂ φ
2 |β|+|γ|≤m j=1 L
X 2 √ b|x|2 β γ ≤C x ∂ φ for b > 2 2a. (4.18)
e X
|β|+|γ|≤m
L2
From (4.18) it follows that
2
kφkGa (x,|x|2 ;Rm (0)) ≤ C e|x| φ
Hm
1 for a < √ . 2 2
This completes the proof of Theorem 1.1–1.2.
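Two elementary factorial bounds carry part of the bookkeeping in this section: the bound $(l-1)!/((l+k)!(l-1-j)!)\le 1/(l+k-j)!$ used between (4.5) and (4.6), and the bound $1/(\alpha!)^2\le 2^{2|\alpha|}/(2\alpha)!$ quoted in (4.18). The sketch below checks both in exact arithmetic over a finite range. The function names are illustrative, and the second check is done for a single index $n$; the multi-index case follows by taking the product over components.

```python
from fractions import Fraction
from math import comb, factorial


def bound_45_holds(l, k, j):
    """Check (l-1)! / ((l+k)! (l-1-j)!) <= 1/(l+k-j)! for 1 <= j <= l-1, k >= 0."""
    lhs = Fraction(factorial(l - 1), factorial(l + k) * factorial(l - 1 - j))
    rhs = Fraction(1, factorial(l + k - j))
    return lhs <= rhs


def bound_418_holds(n):
    """Check 1/(n!)^2 <= 2^(2n)/(2n)!, equivalently binom(2n, n) <= 4^n."""
    return comb(2 * n, n) <= 4 ** n


if __name__ == "__main__":
    ok_45 = all(
        bound_45_holds(l, k, j)
        for l in range(2, 15)
        for k in range(0, 8)
        for j in range(1, l)
    )
    ok_418 = all(bound_418_holds(n) for n in range(0, 40))
    print(ok_45, ok_418)
```

Both bounds reduce to comparing products factor by factor: in the first, each of the $j$ factors of $(l-1)!/(l-1-j)!$ is at most the matching factor of $(l+k)!/(l+k-j)!$; in the second, $\binom{2n}{n}$ is one term of the binomial expansion of $(1+1)^{2n}$.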
5. Applications
Our proof of the theorems is applicable to
$$i\partial_tu+\tfrac12\Delta u=\lambda|u|^{2p}u+V(x)u,\quad (t,x)\in\mathbb{R}\times\mathbb{R}^n,\ p\in\mathbb{N},\ \lambda\in\mathbb{C},\qquad u(0,x)=\phi(x),\quad x\in\mathbb{R}^n\tag{5.1}$$
and
$$i\partial_tu+\partial_x^2u+2i\delta\partial_x(|u|^2u)=0,\quad (t,x)\in\mathbb{R}\times\mathbb{R},\ \delta\in\mathbb{R},\qquad u(0,x)=\phi(x),\quad x\in\mathbb{R}.\tag{5.2}$$
We have for (5.1)
Proposition 5.1. In addition to the assumptions of Theorems 1.1–1.2 we assume that
$$\sum_\alpha\frac{a^{|\alpha|}}{\alpha!}\|x^\alpha\partial^\alpha V\|_{G^b(\partial;H^m)}<\infty,$$
where $0<a<1$, $0<b$. Then the same results as in Theorems 1.1–1.2 are valid for the solutions of (5.1).
Proof. In the same way as in the proofs of Theorems 1.1–1.2 we have the result if we prove that there exists a positive constant $a_1$ such that $V\in G^{a_1}\big(x\cdot\nabla;G^b(\partial;H^m)\big)$. By Lemma 4.2 with $X=G^b(\partial;H^m)$ and $k=0$ we find that
$$\|V\|_{G^{a_1}(x\cdot\nabla;X)}\le\sum_\alpha\frac{1}{\alpha!}\Big(\frac{a_1}{1-a_1}\Big)^{|\alpha|}\|x^\alpha\partial^\alpha V\|_X<\infty,$$
provided that $\frac{a_1}{1-a_1}\le a$. This completes the proof of Proposition 5.1.
The condition on $V$ given in Proposition 5.1 is satisfied if $V(x)$ has an analytic continuation $V(z)$ on the complex domain
$$\Gamma_{\sqrt2a,\sqrt2b}=\Big\{z\in\mathbb{C}^n;\ z_j=x_j+iy_j;\ -\infty<x_j<+\infty,\ -\sqrt2b-(\tan\alpha)|x_j|<y_j<\sqrt2b+(\tan\alpha)|x_j|,\ j=1,2,\ldots,n,\ 0<\alpha=\sin^{-1}\sqrt2a<\pi/2\Big\}$$
and
$$\int_{\Gamma_{\sqrt2a,\sqrt2b}}|V(z)|^2\,dx\,dy<\infty$$
(see the proof of Theorem 1.1 in [H-K.K]). Hence
$$V(x)=\frac{1}{(1+|x|^2)^m},\qquad\sqrt2b<1,\qquad\frac12\Big[\frac n2\Big]<m\in\mathbb{N},$$
can be considered as the typical example satisfying the condition in Proposition 5.1.
The derivative nonlinear Schrödinger equation (5.2) can be translated into a system of nonlinear Schrödinger equations whose nonlinear terms contain no derivatives of the unknown functions by using a gauge transformation. Indeed putting $u_1=E^2u$ and $u_2=E\partial_x(Eu)$ with
$$E(t,x)=\exp\Big(i\delta\int_{-\infty}^x|u(t,y)|^2\,dy\Big),$$
we have
$$i\partial_tu_1+\partial_x^2u_1=2i\delta u_1^2u_2,\qquad i\partial_tu_2+\partial_x^2u_2=-2i\delta u_2^2u_1$$
(see [H] for details). Hence in the same way as in the proof of Theorem 1.2 we obtain for (5.2)
Proposition 5.2. We assume that $\|e^{|x|^2}\phi\|_{H^1}<\infty$. Then the same result as in Theorem 1.1 is valid for the solutions of (5.2).
Acknowledgement. The first author is partially supported by the Gunma University Foundation for Science and Technology.
References
[B-H-K.K] De Bouard, A., Hayashi, N., Kato, K.: Gevrey regularizing effect for the (generalized) Korteweg-de Vries equation and nonlinear Schrödinger equations. Ann. Inst. Henri Poincaré, Analyse non linéaire 12, 673–725 (1995)
[Ca] Cazenave, T.: An introduction to nonlinear Schrödinger equations. Textos de Matematicas, Rio de Janeiro, 1989
[Co-Sau] Constantin, P., Saut, J.C.: Local smoothing properties of dispersive equations. J. Am. Math. Soc. 1, 413–439 (1988)
[H] Hayashi, N.: The initial value problem for the derivative nonlinear Schrödinger equation in the energy space. J. Nonlinear Anal. T.M.A. 20, 823–833 (1993)
[H-K.K] Hayashi, N., Kato, K.: Regularity in time of solutions to nonlinear Schrödinger equations. J. Funct. Anal. 128, 253–277 (1995)
[H-Sai 1] Hayashi, N., Saitoh, S.: Analyticity and global existence of small solutions to some nonlinear Schrödinger equations. Commun. Math. Phys. 129, 27–42 (1990)
[H-Sai 2] Hayashi, N., Saitoh, S.: Analyticity and smoothing effect for the Schrödinger equation. Ann. Inst. Henri Poincaré, Physique Théorique 52, 163–173 (1990)
[Ke-P-V] Kenig, C.E., Ponce, G., Vega, L.: Small solutions to nonlinear Schrödinger equations. Ann. Inst. Henri Poincaré, Analyse non linéaire 10, 255–288 (1993)
[Sj] Sjölin, P.: Regularity of solutions to the Schrödinger equations. Duke Math. J. 55, 699–715 (1987)
[St] Strichartz, R.S.: Restriction of Fourier transforms to quadratic surfaces and decay of solutions of wave equations. Duke Math. J. 44, 705–714 (1977)
[V] Vega, L.: Schrödinger equations: Pointwise convergence to the initial data. Proc. Am. Math. Soc. 102, 874–878 (1988)
Communicated by H. Araki
Commun. Math. Phys. 184, 301 – 365 (1997)
Communications in Mathematical Physics © Springer-Verlag 1997
Theory of Tensor Invariants of Integrable Hamiltonian Systems. II. Theorem on Symmetries and Its Applications
Oleg I. Bogoyavlenskij*
Department of Mathematics and Statistics, Queen's University, Kingston, Canada, K7L 3N6
Received: 16 January 1996 / Accepted: 3 July 1996
* Supported by the Natural Sciences and Engineering Research Council of Canada
Abstract: The theorem on symmetries is proved that states that a Liouville-integrable Hamiltonian system is non-degenerate in Kolmogorov's sense and has compact invariant submanifolds if and only if the corresponding Lie algebra of symmetries $S$ is abelian. The theorem on symmetries has applications to the characterization problem, to the integrable hierarchies problem, to the necessary conditions for the strong dynamical compatibility problem, and to the problem on master symmetries. The invariant necessary conditions for the non-degenerate C-integrability in Kolmogorov's sense of a given dynamical system $V$ are derived. It is proved that the C-integrable Hamiltonian system is non-degenerate in the iso-energetic sense if and only if the corresponding Lie algebra of the iso-energetic conformal symmetries $S_{ec}$ is abelian. An extended concept of integrability of Hamiltonian systems on the symplectic manifolds $M^n$, $n = 2k$, is introduced. The concept of integrability describes the Hamiltonian systems that have quasi-periodic dynamics on tori $T^q$ or on toroidal cylinders $T^m \times R^{q-m}$ of an arbitrary dimension $q = n, n-1, \cdots, 2, 1$. This concept includes, as a particular case, all Hamiltonian systems that are integrable in Liouville's classical sense, for which $q \le n/2 = k$. The A-B-C-cohomologies are introduced for dynamical systems on smooth manifolds.
Contents
1 Introduction
Chapter I. Theorem on Symmetries
2 Lemma on Symmetries
3 Theorem on Symmetries
4 Necessary Conditions for the Non-Degenerate C-Integrability in the Kolmogorov Sense
5 The Second Characterization Problem
6 An Extended Concept of Integrability of Hamiltonian Systems
7 The A-B-C-Cohomologies for Dynamical Systems
Chapter II. Applications of Theorem on Symmetries
8 Applications to the Integrable Hierarchies Problem
9 Applications to the Necessary Conditions for Strong Dynamical Compatibility Problem
10 Tensor Invariants of Two Degenerate Poisson Structures
Chapter III. Structure of Dynamical Systems (2.1)
11 An Alternative for the Analytic Dynamical Systems (2.1)
12 Theorem on the Structure of General Dynamical Systems (2.1)
13 Theorem on the Structure of the Liouville-Integrable Hamiltonian Systems
Chapter IV. Master Symmetries and Their Applications
14 Conformal Symmetries Form a Lie Algebra $S_c$
15 Master Symmetries of Any Dynamical System (2.1) Form a Lie Algebra $M$
16 Criteria for the Lie Subalgebra of Symmetries $S$ to be an Ideal in $M$
17 Applications of Theorem 1 to the Master Symmetries
18 Form of the Hamiltonian in the Action-Angle Coordinates
19 Concluding Remarks on the Theorem on Symmetries
20 Conformal Symmetries, KAM Theory, and Rings of Cohomologies $H_B^*(f, M^n)$ of Smooth Mappings
1. Introduction
I.
This paper continues the study initiated in our papers [5, 6, 9] where we derived the complete classification of all invariant Poisson structures for the Liouville-integrable Hamiltonian systems non-degenerate in the sense of Kolmogorov [24]. This class of dynamical systems with compact invariant submanifolds is exactly the starting point for the Kolmogorov-Arnold-Moser theory [2, 24, 25, 31] that studies small Hamiltonian perturbations of the integrable systems with compact invariant submanifolds. The Kolmogorov non-degeneracy condition [24] and the iso-energetic non-degeneracy condition [2, 3] are formulated in the action-angle coordinates and therefore depend explicitly upon the choice of these coordinates. One of the well-known problems in the theory of integrable systems is
The Characterization Problem. What invariant properties characterize the Liouville-integrable Hamiltonian systems $V$ with compact invariant submanifolds which are non-degenerate in the sense of Kolmogorov?
In this paper, we prove the following theorem that solves the characterization problem.
Theorem 1. The Liouville-integrable Hamiltonian system
$$\dot x^\alpha = P^{\alpha\beta}H_{,\beta} = V^\alpha\tag{1.1}$$
is non-degenerate in Kolmogorov's sense and all its invariant submanifolds are compact if and only if one of the following three conditions is satisfied:
a) The Lie algebra $S$ of symmetries of system (1.1) is abelian,
b) The equation $U(F) = 0$ holds for any symmetry $U$ and any first integral $F$ of system (1.1),
c) For any symmetry $U$, the dynamical system $\dot x^\tau = U^\tau(x)$ is integrable, has quasi-periodic dynamics, and preserves all invariant $k$-dimensional submanifolds of system (1.1) on the manifold $M^{2k}$.
Recall that the Lie algebra $S$ of symmetries of any Hamiltonian system (1.1) is infinite-dimensional because the vector fields $U = f(H)V$ are symmetries of system (1.1) for arbitrary functions $f(H)$. Theorem 1 has applications to the characterization problem, to the necessary conditions for the non-degenerate C-integrability problem (Section 4), to the integrable hierarchies problem (Section 8), to the necessary conditions for the strong dynamical compatibility problem (Sections 9 and 10), and to the problem on master symmetries (Section 17).
Definition 1. The Hamiltonian system (1.1) is C-integrable in a domain $O \subset M^{2k}$ if it is completely integrable in the Liouville sense and in the domain $O$ all invariant submanifolds of constant level of the $k$ involutive first integrals are compact.
Liouville's classical Theorem [3, 26] implies that these invariant submanifolds are tori $T^k$ and the dynamics is quasi-periodic in the action-angle coordinates.
Definition 2. An $(r, s)$ tensor $T$ on the manifold $M^{2k}$ is called C-invariant if it is invariant with respect to a C-integrable non-degenerate Hamiltonian system (1.1).
II. The notion of iso-energetic non-degeneracy [2, 3] is connected with the following well-known problem.
The Second Characterization Problem. What invariant properties characterize the C-integrable Hamiltonian systems (1.1) which are non-degenerate in the iso-energetic sense?
In Sect. 5, we prove Theorem 2 that solves the second characterization problem. In Theorem 2, we use the concept of conformal symmetry.
The general master symmetries of a dynamical system $V$ were introduced by Fuchssteiner [19] as vector fields $X$ that satisfy the equation $[V, [V, X]] = 0$.
Definition 3. A vector field $X$ is a conformal symmetry of system (1.1) if it satisfies the equation
$$[X, V] = c(x)V,\tag{1.2}$$
where $c(x)$ is a first integral of system (1.1).
In Sect. 14, we prove that the conformal symmetries of any dynamical system $V$ form a Lie algebra $S_c$. The general master symmetries do not.
Definition 4. A vector field $X$ is iso-energetic with respect to the Hamiltonian system (1.1) if its flow preserves the Hamiltonian function $H$: $X(H) = 0$.
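As a small illustration of Definition 3, and not an example taken from the paper, consider the free particle $H = p^2/2$ on the $(q,p)$ plane, so that $V = p\,\partial_q$, together with the linear field $X = p\,\partial_p$. For linear vector fields $X = Ax$ and $V = Bx$, and under the assumed sign convention $[X,V]^i = X^j\partial_jV^i - V^j\partial_jX^i$, the bracket is again linear with matrix $BA - AB$. The sketch below uses this to check that $[X,V] = V$, so (1.2) holds with the constant first integral $c = 1$, and that $[V,[V,X]] = 0$.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def bracket(A, B):
    """Matrix of [X, V] for linear fields X = A x, V = B x.

    Assumed convention: [X, V]^i = X^j d_j V^i - V^j d_j X^i, which gives B A - A B.
    """
    BA, AB = matmul(B, A), matmul(A, B)
    n = len(A)
    return [[BA[i][j] - AB[i][j] for j in range(n)] for i in range(n)]


# Coordinates x = (q, p). Hypothetical example: free particle H = p^2 / 2.
V = [[0, 1], [0, 0]]      # V = p d/dq  (dq/dt = p, dp/dt = 0)
X = [[0, 0], [0, 1]]      # X = p d/dp
SCALE = [[1, 0], [0, 1]]  # q d/dq + p d/dp

if __name__ == "__main__":
    print(bracket(X, V))              # equals V: conformal symmetry with c = 1
    print(bracket(V, bracket(V, X)))  # zero matrix: X is also a master symmetry
    print(bracket(SCALE, V))          # zero matrix: the scaling field commutes with V
```

Here $X(H) = p^2 \neq 0$, so this $X$ is not iso-energetic in the sense of Definition 4; the example only illustrates equation (1.2).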
Let $S_{ec} \subset S_c$ be the Lie algebra of iso-energetic conformal symmetries. For the Lie algebra $S_e$ of the iso-energetic symmetries we have $S_e = S \cap S_{ec}$.
Theorem 2. The C-integrable Hamiltonian system (1.1) is non-degenerate in the iso-energetic sense if and only if the corresponding Lie algebra $S_{ec}$ of the iso-energetic conformal symmetries is abelian.
Theorem 1 and Theorem 2 imply that the properties of a Hamiltonian system (1.1) to be C-integrable and non-degenerate in the Kolmogorov sense or in the iso-energetic sense are independent of any system of action-angle coordinates. These theorems provide an insight into the relationships between the concept of the non-degeneracy in the Kolmogorov sense and the concept of the non-degeneracy in the iso-energetic sense. Namely, the assumption that the Lie algebra $S$ of symmetries of the Hamiltonian system (1.1) is abelian does not entail that the Lie algebra $S_{ec}$ of the iso-energetic conformal symmetries is abelian. Conversely, the assumption that the Lie algebra $S_{ec}$ is abelian does not imply that the Lie algebra $S$ of symmetries of system (1.1) is abelian. Therefore these two concepts of non-degeneracy of the C-integrable Hamiltonian systems are independent. This fact was first established in [2, 31] by constructing concrete examples.
III. The first question that always arises about any dynamical system $V$ is whether or not it is integrable. Thus we arrive at the problem
What are the necessary conditions for the non-degenerate C-integrability in Kolmogorov's sense of a given dynamical system $V$?
We derive the necessary conditions from Theorem 1. If a dynamical system $V$ is a C-integrable non-degenerate Hamiltonian system then the Lie algebra $S$ of its symmetries is necessarily abelian and the equation $dF(U) = U(F) = 0$ is necessarily satisfied for any symmetry $U$ and any first integral $F$ of system $V$.
These criteria prove that any Hamiltonian system that possesses two non-commuting symmetries either is not C-integrable or is degenerate in the original phase space. The fact that the Lie algebra of symmetries $S$ of a Hamiltonian system is abelian does not entail that the system is integrable in the Liouville sense. An analysis of dynamical systems with abelian Lie algebras of symmetries led us to an extended concept of integrability of Hamiltonian systems that includes the Liouville integrability as a particular case. These results are presented in Sect. 6. Theorem 2 provides a solution to the following problem.
What are the necessary conditions for the non-degenerate C-integrability in the iso-energetic sense of a given Hamiltonian system (1.1)?
Theorem 2 implies that if a Hamiltonian system (1.1) is C-integrable and iso-energetically non-degenerate then the Lie algebra $S_{ec}$ of its iso-energetic conformal symmetries is necessarily abelian. Another necessary condition is that any iso-energetic conformal symmetry (1.2) must be a symmetry.
IV. In his 1855 paper [26], Liouville proved that a Hamiltonian system on a symplectic manifold $M^n$, $n = 2k$, which possesses $k$ involutive and functionally independent first integrals is integrable. Surprisingly enough, in the 140 years since Liouville's classical paper, the converse statement has been treated as if it were true: the integrability of a Hamiltonian system depends on the existence of a sufficiently large number of involutive, functionally-independent first integrals.
However, this converse statement is not true and integrability is not necessarily connected with either the existence of many first integrals or their involutiveness. Thus we arrive at the following problem.
What is the most general concept of integrability of Hamiltonian systems?
In Sect. 6, we introduce an extended concept of integrability of Hamiltonian systems on symplectic manifolds $M^{2k}$. The new class of integrable Hamiltonian systems contains all systems that are integrable by Liouville's Theorem as a particular case. Our key idea about the properties of the general integrable Hamiltonian systems consists of the following:
a) The integrable Hamiltonian system must possess functionally independent first integrals $F_1(x), \cdots, F_p(x)$; they can be non-involutive and their number $p$ can be arbitrary: $0 \le p < 2k$,
b) The integrable Hamiltonian system must possess an abelian $(n - p)$-dimensional Lie subalgebra $S_a \subset S$ of symmetries that preserve the first integrals $F_j(x)$, but the symmetries may be non-symplectic.
In Liouville's definition [26] the independent conditions (a) and (b) were incorporated into the single condition of the involutiveness of first integrals. It is this hidden overlap of ideas that is probably the reason why Liouville's definition has served as the most general characterization of integrability since that time. In Sect. 6, we also introduce a general concept of integrability of a dynamical system $\dot x^i = V^i(x^1, \cdots, x^n)$ on a smooth manifold $M^n$ of an arbitrary dimension $n$. The Hamiltonian systems with coisotropic invariant tori $T^q$ for $q > k$ were discovered by Parasyuk [38]. These systems and their links with KAM theory were studied by Herman [21, 22], Parasyuk [39] and Moser [32]. These authors investigated the dynamics of perturbations of integrable Hamiltonian systems that are invariant with respect to a symplectic action of a torus $T^q$ on the symplectic manifold $M^{2k}$.
Therefore, the integrability of the original Hamiltonian system is an initial assumption of the papers by Parasyuk [38, 39], Herman [21, 22] and Moser [32]. In the present paper we require much less than integrability or invariance with respect to a toral action. In Theorem 3, we derive from our Definition 8 the integrability of the Hamiltonian systems under investigation and obtain a set of $k(k+1)/2$ canonical forms which contains the systems with Lagrangian, isotropic and coisotropic tori as special cases. For general integrable Hamiltonian systems, the invariant tori $\mathbb{T}^q$ are not necessarily any one of these three types. For each of the derived canonical forms, we prove in Corollary 3 that the corresponding Lie algebra of symmetries $S$ is abelian if and only if the invariant tori $\mathbb{T}^q$ are either Lagrangian or coisotropic, $q = k, k+1, \ldots, 2k$, and a non-degeneracy condition is met.

V. In [5] we introduced the concepts of dynamical compatibility and strong dynamical compatibility of two Poisson structures which are incompatible in Magri's sense [27]. In [9] we proved that if two Poisson structures $P$ and $Q$ are strongly dynamically compatible and are in the general position then any dynamical system $V$ that preserves both of them is integrable and generates a hierarchy of integrable dynamical systems
\[ \dot x^\alpha = (A^m V)^\alpha. \tag{1.3} \]
Here $A = PQ^{-1}$ is the recursion operator associated with the pair of Poisson structures $P$ and $Q$, and $m$ is an arbitrary integer.
O.I. Bogoyavlenskij
In Sect. 8 we study the more general problem that is connected with an arbitrary C-invariant (1,1) tensor $A^\alpha_\beta(x)$. This (1,1) tensor is not necessarily the recursion operator for any two Poisson structures.

The Integrable Hierarchies Problem. Suppose the C-integrable non-degenerate Hamiltonian system (1.1) preserves a (1,1) tensor $A^\alpha_\beta(x)$. Are the dynamical systems (1.3) integrable?

In Sect. 8 we solve this problem. In Theorem 4 we prove that even more general dynamical systems
\[ \dot x^\alpha = \Big( \sum_{m=-\ell}^{\ell} a_m(x) A^m V \Big)^\alpha \tag{1.4} \]
are completely integrable. Here $a_m(x)$ are arbitrary smooth functions of the eigenvalues of the (1,1) tensor $A$. We prove that all dynamical systems (1.4) commute with each other and have the same invariant submanifolds as system (1.1). The proof of Theorem 4 follows from Theorem 1 practically without calculations.

VI. In Sect. 9 we continue to investigate

The Necessary Conditions for Strong Dynamical Compatibility Problem. What are the necessary conditions for strong dynamical compatibility of two incompatible Poisson structures?

In Theorem 5 we present several necessary conditions assuming that at least one of the two Poisson structures $P$ and $Q$ is non-degenerate. We construct a rich family of tensor invariants using the recursion operator $A = PQ^{-1}$. The proof of Theorem 5 is a purely logical consequence of Theorem 1 on symmetries.

VII. When both Poisson structures $P$ and $Q$ are degenerate, the recursion operator $A$ does not exist. In this case, we develop new constructions of tensor invariants which are applicable to two arbitrary Poisson structures. The derived constructions generalize those first presented in our paper [9]. Using the Schouten bracket $[P, Q]$ and an arbitrary non-degenerate differential n-form of volume $\omega_n$ on the manifold $M^n$, $n = 2k$, we define in Sect. 10 two distributions $B_1 \subset B_2 \subset T(M^n)$. These distributions are connected with the distribution $B \subset T(M^n)$ introduced in [9] by the inclusions
\[ B_1 \subset B_2 \subset T(\mathbb{T}^k) \subset B \tag{1.5} \]
for any two strongly dynamically compatible Poisson structures $P$ and $Q$. Here $T(\mathbb{T}^k)$ is the distribution of the tangent spaces of the invariant tori of any C-integrable non-degenerate Hamiltonian system that preserves $P$ and $Q$. The inclusions (1.5) imply the following necessary condition for strong dynamical compatibility of the two Poisson structures: $\dim (B_2)_x \le k$ at all points $x \in M^{2k}$.

We construct $2k$ (1,1) tensors $A_r$ from the two Poisson structures $P$ and $Q$ and the n-form of volume $\omega_n$. The tensors $A_r$ are defined uniquely up to a common factor $f(x)$ because the n-form $\omega_n$ is defined uniquely up to a factor $f(x)$. We define the functions
\[ f_{rm}(x) = \big( \operatorname{Tr}\, (A_r(x))^m \big)^{1/m}, \]
where $r = 0, 1, \ldots, 2k-1$ and $m = 1, \ldots, k$. The classes of proportionality of these functions specify the maps
\[ F_m : M^{2k} \longrightarrow \mathbb{RP}^{2k-1}, \qquad F_m(x) = f_{0m}(x) : f_{1m}(x) : \cdots : f_{2k-1,m}(x) \in \mathbb{RP}^{2k-1} \]
of the manifold $M^{2k}$ into the real projective space $\mathbb{RP}^{2k-1}$. The maps $F_m$ and other maps $F$ constructed in Sect. 10 are first integrals of any dynamical system that preserves the two Poisson structures $P$ and $Q$. The maps $F$ annihilate the distributions $B_1 \subset B_2 \subset T(\mathbb{T}^k)$: $dF(T(\mathbb{T}^k)) = 0$, and therefore provide the second necessary condition for strong dynamical compatibility: $\operatorname{rank} dF \le k$.

For any two Poisson structures $P$ and $Q$ which are compatible in Magri's sense [27], the two distributions $B_1$ and $B_2$ are empty and the distribution $B$ is maximal: $B = T(M^{2k})$. For any two Poisson structures $P$ and $Q$ which are strongly dynamically compatible and are in the general position, the four distributions coincide: $B_1 = B_2 = T(\mathbb{T}^k) = B$.

VIII. The master symmetries and the more general symmetries of order $p$ were investigated in [19, 33–35] for many concrete dynamical systems and partial differential equations, and in [15, 17] for the Toda lattice. In Sect. 17, we study

The Master Symmetries Problem. What are the invariant properties of the master symmetries of the C-integrable non-degenerate Hamiltonian systems?

We prove in Proposition 13 that for the systems under investigation the master symmetries form a Lie algebra $M$ and that the Lie algebra of symmetries $S \subset M$ is an abelian ideal in $M$. These results are derived without any calculations as logical consequences of Theorem 1 and the definition of the master symmetries.

In Theorem 10 of Sect. 15, we prove that the master symmetries of any dynamical system with quasi-periodic dynamics form a Lie algebra $M$, that symmetries of order $p \ge 3$ do not exist for such systems and that the Lie algebras $M$ and $S$ coincide if and only if the system has constant coefficients. The proof is based on Theorem 7 on the structure of general systems with quasi-periodic dynamics.
We present an application of Theorem 10 to the necessary conditions for the B-integrability of a Hamiltonian system with compact invariant submanifolds.

The iso-energetic conformal symmetries of the Hamiltonian systems (1.1) have an unexpected application to the second characterization problem that is studied in Sect. 5. In Theorem 12, we analyze the C-integrable Hamiltonian systems which possess conformal symmetries with some scaling properties. We prove that the corresponding Hamiltonian function $H(I)$ satisfies a first-order partial differential equation in the action-angle coordinates. This equation implies that the Hamiltonian function $H(I_1 + c_1, \ldots, I_k + c_k)$ is homogeneous if the original conformal symmetry is the standard scaling transform (here $c_j$ are some constants). As an application of Theorem 12, we derive without any calculations the Poincaré formula [40] for the Hamiltonian of the Kepler problem in the action-angle coordinates.
Chapter I. Theorem on Symmetries

2. Lemma on Symmetries

I. In this section we study the general systems with quasi-periodic dynamics
\[ \dot I_1 = 0, \ \ldots, \ \dot I_p = 0, \qquad \dot\varphi_1 = \omega_1(I), \ \ldots, \ \dot\varphi_q = \omega_q(I) \tag{2.1} \]
in the toroidal domains $O = B_a \times \mathbb{T}^q \subset M^n$ for arbitrary dimensions $p$ and $q$, $p + q = n$. Here $B_a$ is the ball
\[ B_a : \quad \sum_{j=1}^{p} (I_j - I_{j0})^2 < a^2. \tag{2.2} \]
Definition 5. A trajectory of the dynamical system (2.1) is called $\mathbb{T}^q$-dense if it is everywhere dense on a torus $\mathbb{T}^q$ for $I_j = c_j = \mathrm{const}$.

The frequencies $\omega_1(I), \ldots, \omega_q(I)$ corresponding to a $\mathbb{T}^q$-dense trajectory are incommensurable over the integers (or rationally independent). That means that for arbitrary integers $m_\alpha$, not all equal to zero, we have
\[ m_1\omega_1(I) + \cdots + m_q\omega_q(I) \neq 0, \qquad m_\alpha \in \mathbb{Z}. \]
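As a small illustration of Definition 5 (a sketch with frequencies chosen here purely for the example, not taken from the paper), consider $q = 2$ and two constant frequency vectors:

```latex
% Illustrative frequencies (assumed for this example only).
\[
\omega = (1, \sqrt{2}): \qquad m_1 + m_2\sqrt{2} = 0,\ m_\alpha \in \mathbb{Z}
\ \Longrightarrow\ m_1 = m_2 = 0,
\]
% because \sqrt{2} is irrational; the frequencies are rationally independent
% and every trajectory is dense on the torus T^2.
\[
\omega = (1, 2): \qquad 2\cdot\omega_1 + (-1)\cdot\omega_2 = 0,
\]
% a non-trivial integer relation; every trajectory is closed, so it is not T^2-dense.
```

The first choice gives the quasi-periodic winding on the torus; the second gives periodic motion.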
Definition 6. The dynamical system (2.1) is called $\mathbb{T}^q$-dense in the toroidal domain $O = B_a \times \mathbb{T}^q \subset M^n$ if the set $X$ of points $I \in B_a$ for which the trajectories are $\mathbb{T}^q$-dense is everywhere dense in the ball $B_a$.

Any continuous first integral $F(I_j, \varphi_\beta)$ of a $\mathbb{T}^q$-dense dynamical system (2.1) is constant on all tori $\mathbb{T}^q$ and hence any first integral $F$ is a function of the variables $I_j$ only:
\[ \frac{dF}{dt} = 0 \ \Longrightarrow\ F = F(I_1, \ldots, I_p). \tag{2.3} \]
A vector field $U$ on the manifold $M^n$ is called a symmetry of a dynamical system $V$ if it commutes with the corresponding vector field $V$: $[U, V] = 0$. In this case the vector field $U$ is invariant with respect to the dynamical system $V$.

Lemma 1. A vector field $U$ is a symmetry of the $\mathbb{T}^q$-dense dynamical system (2.1) if and only if it has the form
\[ U = \sum_{j=1}^{p} U^j(I) \frac{\partial}{\partial I_j} + \sum_{\alpha=1}^{q} U^{\alpha+p}(I) \frac{\partial}{\partial\varphi_\alpha}, \tag{2.4} \]
where the smooth functions $U^{\alpha+p}(I)$ are arbitrary and the smooth functions $U^j(I)$ satisfy the linear system of equations
\[ \sum_{j=1}^{p} \frac{\partial\omega_\alpha(I)}{\partial I_j}\, U^j(I) = 0, \qquad \alpha = 1, \ldots, q. \tag{2.5} \]
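To see what the linear system (2.5) looks like in practice, here is a minimal worked example. The choice $p = 2$, $q = 1$ and the frequency $\omega_1(I) = I_1^2 + I_2^2$ (on a ball $B_a$ that does not contain the origin) are assumptions made only for illustration.

```latex
% Assumed example: p = 2, q = 1, \omega_1(I) = I_1^2 + I_2^2, with 0 \notin B_a.
\[
V = \omega_1(I)\,\frac{\partial}{\partial\varphi_1}, \qquad
\text{(2.5):}\quad 2 I_1 U^1(I) + 2 I_2 U^2(I) = 0 .
\]
% General solution, with f(I) and U^3(I) arbitrary smooth functions:
\[
U = f(I)\Big( I_2 \frac{\partial}{\partial I_1} - I_1 \frac{\partial}{\partial I_2} \Big)
  + U^{3}(I)\,\frac{\partial}{\partial\varphi_1} .
\]
% Direct check: the components of U depend on I only, so V(U^\tau) = 0 and
\[
[U, V] = U(\omega_1)\,\frac{\partial}{\partial\varphi_1}
       = f(I)\,(2 I_1 I_2 - 2 I_1 I_2)\,\frac{\partial}{\partial\varphi_1} = 0 .
\]
```

In this sketch the action components of $U$ are tangent to the level circles of $\omega_1$, which is exactly what (2.5) requires.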
Proof. For the dynamical system (2.1), the components of the vector field $V$ have the form
\[ V^j = 0, \qquad V^{\alpha+p} = \omega_\alpha(I). \tag{2.6} \]
Here $j = 1, \ldots, p$, and $\alpha = 1, \ldots, q$. For any vector field $U$ (2.4), the equation $[U, V] = 0$ is equivalent to the system of equations (2.5). Therefore any vector field $U$ (2.4) that satisfies Eqs. (2.5) is a symmetry of the system (2.1).

For any symmetry $U$, the Lie derivative $L_V U$ vanishes: $L_V U = [V, U] = 0$. Thus any symmetry $U$ satisfies the invariance equation
\[ (L_V U)^\tau = \dot U^\tau - V^\tau_{,\mu} U^\mu = 0, \tag{2.7} \]
where $\tau, \mu = 1, \ldots, p + q$. Formulae (2.6) imply
\[ V^\tau_{,\mu} = \delta^\tau_{\alpha+p}\, \frac{\partial\omega_\alpha(I)}{\partial I_j}\, \delta^j_\mu, \tag{2.8} \]
where summation with respect to the indices $\alpha = 1, \ldots, q$ and $j = 1, \ldots, p$ is understood. Therefore the invariance Eq. (2.7) takes the form
\[ \dot U^j = 0, \qquad \dot U^{\alpha+p} = \frac{\partial\omega_\alpha(I)}{\partial I_j}\, U^j. \tag{2.9} \]
Using the key property of first integrals (2.3) for any $\mathbb{T}^q$-dense dynamical system (2.1), we obtain that all solutions to (2.9) have the form
\[ U^j(t) = \tilde U^j(I), \qquad U^{\alpha+p}(t) = \frac{\partial\omega_\alpha(I)}{\partial I_j}\, \tilde U^j(I)\, t + \tilde U^{\alpha+p}(I), \tag{2.10} \]
where $\tilde U^j(I)$ and $\tilde U^{\alpha+p}(I)$ are some smooth functions of the variables $I_1, \ldots, I_p$. The components $U^\tau(I, \varphi)$ of any smooth vector field $U$ are bounded on any torus $\mathbb{T}^q$. Solutions (2.10) are bounded for all $t$ if and only if Eqs. (2.5) are satisfied for $\alpha = 1, \ldots, q$. Hence using (2.10) and the fact that the general trajectories of system (2.1) are everywhere dense on the tori $\mathbb{T}^q$ we obtain formula (2.4).

Lemma 1 is a generalization of a result by Brouzet [13] about the explicit form of the symmetries of the Liouville-integrable non-degenerate Hamiltonian systems in the action-angle coordinates. In [13] it is supposed that $p = q$ and the frequencies have the form $\omega_j(I) = \partial H(I)/\partial I_j$. For the symmetries $U$ of these Hamiltonian systems Brouzet proved that $U^j(I) = 0$ for $j = 1, \ldots, p$ and presented an application of this result to the theory of compatible pairs of Poisson structures. In Sects. 3, 5, 9 and 10 we develop applications of Lemma 1 to problems that are not connected with compatible Poisson structures.

II. Proposition 1. The Lie algebra $S$ of symmetries of a $\mathbb{T}^q$-dense dynamical system (2.1) for $q \ge p$ is abelian if and only if the non-degeneracy condition
\[ \operatorname{rank} \left\| \frac{\partial\omega_\alpha(I)}{\partial I_j} \right\| = p \tag{2.11} \]
is met in a dense open domain in the ball $B_a$.
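The rank condition (2.11) can be illustrated by two toy systems with $p = q = 2$. Both frequency maps below are assumptions chosen for this sketch; in the second one the ball $B_a$ is assumed to avoid $I_1 = 0$.

```latex
% (i) Assumed frequencies \omega = (I_1, I_2): the system is T^2-dense
%     (I_1/I_2 is irrational on a dense set) and the Jacobian is the identity.
\[
\left\| \frac{\partial\omega_\alpha}{\partial I_j} \right\|
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \operatorname{rank} = 2 = p ,
\]
% so (2.5) forces U^1 = U^2 = 0 and all symmetries point along the angles.
%
% (ii) Assumed frequencies \omega = (I_1, \sqrt{2}\, I_1): still T^2-dense, but
\[
\left\| \frac{\partial\omega_\alpha}{\partial I_j} \right\|
= \begin{pmatrix} 1 & 0 \\ \sqrt{2} & 0 \end{pmatrix}, \qquad \operatorname{rank} = 1 < p .
\]
% U_0 = \partial/\partial I_2 solves (2.5), and so does I_2 U_0; they do not commute:
\[
\Big[ \frac{\partial}{\partial I_2},\ I_2 \frac{\partial}{\partial I_2} \Big]
= \frac{\partial}{\partial I_2} \neq 0 .
\]
```

Case (i) gives an abelian symmetry algebra and case (ii) a non-abelian one, in line with the statement of Proposition 1.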
Proof. If condition (2.11) is satisfied then Eq. (2.5) has the zero solution only. In this case all symmetries $U$ have the form
\[ U = \sum_{\alpha=1}^{q} U^{\alpha+p}(I) \frac{\partial}{\partial\varphi_\alpha}. \tag{2.12} \]
Evidently, any two vector fields (2.12) commute. Therefore the Lie algebra $S$ is abelian.

If condition (2.11) is not satisfied in any dense open domain then there exists a ball $B_0 \subset B_a$, where $\operatorname{rank} \| \omega_{\alpha,j}(I) \| = r < p$. In this case Eq. (2.5) has a non-trivial solution
\[ U_0 = \sum_{j=1}^{p} U^j(I) \frac{\partial}{\partial I_j}. \tag{2.13} \]
Let $U_0^j(I) \neq 0$ in some ball $B_1 \subset B_0$ and let $a(I)$ be a smooth function that is equal to 1 in the ball $B_1$ and is equal to zero outside of the ball $B_0$. The smooth vector field $U_1 = a(I) U_0$ is a global symmetry of system (2.1). For the two symmetries $U_1$ and $I_j U_1$ we have
\[ [U_1, I_j U_1] = U_1^j(I)\, U_1 \neq 0 \tag{2.14} \]
at all points $I \in B_1$. Therefore if condition (2.11) is not satisfied then the Lie algebra of symmetries $S$ is non-abelian.

Remark 1. For $q < p$ and arbitrary differentiable functions $\omega_1(I), \ldots, \omega_q(I)$, the linear system of equations (2.5) has a non-trivial solution $U^1(I), \ldots, U^p(I)$. Hence a symmetry $U_0$ (2.13) does exist and Eq. (2.14) holds. Therefore for $q < p$ the Lie algebra of symmetries $S$ of any dynamical system (2.1) is non-abelian.

III. Recall that the rank $r$ of any rectangular $q \times p$ matrix satisfies the inequalities $r \le q$ and $r \le p$.

Proposition 2. Suppose the dynamical system (2.1) is $\mathbb{T}^q$-dense. Then the condition
\[ \operatorname{rank} \left\| \frac{\partial\omega_\alpha(I)}{\partial I_j} \right\| = r = \mathrm{const} \tag{2.15} \]
is satisfied in a domain $D \subset B_a$ if and only if the following two conditions hold at all points $I \in D$:

a) There exist $p - r$ symmetries $U_i$ and first integrals $F_j$ such that
\[ \det \| A_{ij} \| \neq 0, \qquad A_{ij} = U_i(F_j), \qquad i, j = 1, \ldots, p - r. \tag{2.16} \]
b) For any $p - r + 1$ symmetries $U_\ell$ and first integrals $F_m$ the equation $\det \| B_{\ell m} \| = 0$ holds, where
\[ B_{\ell m} = U_\ell(F_m), \qquad \ell, m = 1, \ldots, p - r + 1. \tag{2.17} \]
Proof. Applying Lemma 1, we obtain that any symmetry $U$ of the $\mathbb{T}^q$-dense dynamical system (2.1) has form (2.4). Any first integral $F$ of system (2.1) has form (2.3). Hence we obtain
\[ U(F) = \sum_{j=1}^{p} U^j(I) \frac{\partial F(I)}{\partial I_j}. \tag{2.18} \]
The condition (2.15) is satisfied if and only if the linear space of solutions to Eq. (2.5) has dimension $d_0 = p - r$ at all points $I \in D$. In view of (2.18), the condition $d_0 = p - r$ is equivalent to the two conditions (2.16) and (2.17) because the functions $F_\ell(I)$ are arbitrary.

Remark 2. Proposition 2 proves that the maximal value in the ball $B_a$ of the integer-valued function
\[ \operatorname{rank} \left\| \frac{\partial\omega_\alpha(I)}{\partial I_j} \right\| \]
has an invariant meaning that does not depend on any choice of toroidal coordinates $I_1, \ldots, I_p, \varphi_1, \ldots, \varphi_q$ in the domain $O = B_a \times \mathbb{T}^q$.

Corollary 1. The $q$ functions $\omega_\alpha(I)$ are constant if and only if there exist $p$ symmetries $U_j$ of system (2.1) and $p$ first integrals $F_\ell(I)$ such that
\[ \det \| U_j(F_\ell)(I) \| \neq 0 \tag{2.19} \]
in a dense open set $D \subset B_a$. Then system (2.1) is constant in any toroidal coordinates $\tilde I_j, \tilde\varphi_\ell$ in the domain $O = B_a \times \mathbb{T}^q$.

Proof. Applying Proposition 2, we obtain that condition (2.19) is equivalent to the condition that the matrix $\| \partial\omega_\alpha(I)/\partial I_j \|$ has rank zero at all points $I \in D$. Hence all entries of this matrix are equal to zero for all $I \in B_a$ because the set $D$ is dense in $B_a$. Therefore the frequencies $\omega_\alpha$ are constant. Eq. (2.19) does not depend on a choice of local coordinates. Hence the frequencies $\omega_\alpha$ are constant in any toroidal coordinates $\tilde I_j, \tilde\varphi_\ell$.

3. Theorem on Symmetries

I. Let $M^{2k}$ be a manifold with a non-degenerate Poisson structure $P^{\alpha\beta}$. The Hamiltonian systems on the manifold $M^{2k}$ have the form
\[ \dot x^\alpha = V^\alpha(x) = P^{\alpha\beta} H_{,\beta}, \qquad H_{,\beta} = \partial H/\partial x^\beta. \tag{3.1} \]
The Hamiltonian system (3.1) is called completely integrable in Liouville's sense if it possesses $k$ functionally independent involutive first integrals $F_1(x), \ldots, F_k(x)$:
\[ \{F_j, F_\ell\} = P^{\alpha\beta} F_{j,\alpha} F_{\ell,\beta} = 0. \]
Liouville's classical Theorem [3, 26] on completely integrable Hamiltonian systems implies that almost all points of the manifold $M^{2k}$ (excluding a set $S \subset M^{2k}$, $\dim S \le 2k - 1$) are covered by a system of open toroidal domains $O_m \subset M^{2k}$ with the action-angle coordinates $I_1, \ldots, I_k, \varphi_1, \ldots, \varphi_k$. The symplectic structure $\omega = P^{-1}$ has the canonical form
\[ \omega = \sum_{j=1}^{k} dI_j \wedge d\varphi_j \tag{3.2} \]
in the action-angle coordinates. The Hamiltonian $H$ of the completely integrable system (3.1) depends upon the action variables $I_j(x)$ only and therefore system (3.1) takes the form
\[ \dot I_j = 0, \qquad \dot\varphi_j = \frac{\partial H(I)}{\partial I_j} = \omega_j(I). \tag{3.3} \]
It is evident that this system is a special case of the general system (2.1) with quasi-periodic dynamics. The action coordinates $I_1, \ldots, I_k$ are defined in a ball $B_a$ (2.2) for $p = k$. The angle coordinates $\varphi_1, \ldots, \varphi_k$ run over a torus $\mathbb{T}^k$, $0 \le \varphi_j \le 2\pi$, in the compact case or over a toroidal cylinder $\mathbb{T}^m \times \mathbb{R}^{k-m}$, $0 \le m < k$, if the manifold $I_j(x) = I_{j0}$ is non-compact.

Definition 7. The Hamiltonian system (3.1) is called C-integrable in an invariant domain $O \subset M^{2k}$ if it is completely integrable in the Liouville sense and the invariant submanifolds of constant level of the $k$ involutive first integrals are compact.

II. The completely integrable Hamiltonian system (3.1), (3.3) is called non-degenerate if Kolmogorov's condition [24] for the Hessian
\[ \det \left\| \frac{\partial^2 H(I)}{\partial I_j\, \partial I_\ell} \right\| \neq 0 \tag{3.4} \]
is met in a dense open domain in the action-angle coordinates
\[ I_1, \ldots, I_k, \qquad \varphi_1, \ldots, \varphi_k, \qquad \varphi_\ell = \varphi_\ell \bmod (2\pi). \tag{3.5} \]
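The Kolmogorov condition (3.4) is easy to probe numerically for $k = 2$. The following is a minimal sketch, not part of the paper: the three sample Hamiltonians, the finite-difference step `h` and the tolerance `tol` are all assumptions chosen for illustration, and a small determinant at one point only suggests (it does not prove) degeneracy.

```python
def hessian_det_2d(H, I1, I2, h=1e-3):
    """Central-difference estimate of det(d^2 H / dI_j dI_l) for k = 2."""
    h11 = (H(I1 + h, I2) - 2.0 * H(I1, I2) + H(I1 - h, I2)) / h**2
    h22 = (H(I1, I2 + h) - 2.0 * H(I1, I2) + H(I1, I2 - h)) / h**2
    h12 = (H(I1 + h, I2 + h) - H(I1 + h, I2 - h)
           - H(I1 - h, I2 + h) + H(I1 - h, I2 - h)) / (4.0 * h**2)
    return h11 * h22 - h12 * h12


def quadratic(I1, I2):
    # Hypothetical example: H = (I1^2 + I2^2)/2, Hessian is the identity.
    return 0.5 * (I1 * I1 + I2 * I2)


def linear(I1, I2):
    # Hypothetical example: H linear in the actions (constant frequencies).
    return I1 + 2.0 ** 0.5 * I2


def kepler_like(I1, I2):
    # Hypothetical example: H depends on I1 + I2 only, so the Hessian has rank 1.
    return -0.5 / (I1 + I2) ** 2


def is_nondegenerate(H, I1, I2, tol=1e-4):
    """Heuristic test of condition (3.4) at a single point (tol is an assumption)."""
    return abs(hessian_det_2d(H, I1, I2)) > tol
```

With these sample Hamiltonians only the quadratic one passes the check at a generic point; the linear and Kepler-like ones are degenerate in the sense of (3.4) because their frequencies do not vary independently with the actions.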
The following theorem clarifies the invariant geometric meaning of the Kolmogorov condition (3.4).

Theorem 1, Part 1. The Hamiltonian system (3.1), integrable in the Liouville sense, is C-integrable and non-degenerate if and only if the Lie algebra $S$ of symmetries of (3.1) is abelian.

Proof. For any C-integrable non-degenerate Hamiltonian system (3.1), formula (2.12) yields that any two symmetries $U$ and $U_1$ commute. Hence the Lie algebra $S$ of symmetries of the system (3.1) is abelian.

Let us prove that if the integrable system is degenerate or if its invariant submanifolds are non-compact then this system has non-commuting symmetries. Any integrable system has the form (3.3) in the action-angle coordinates. The invariance equations (2.9) imply that the vector field
\[ U_0 = U_0^\ell(I) \frac{\partial}{\partial I_\ell} \tag{3.6} \]
is a symmetry of system (3.3) if its components satisfy the linear system of equations
\[ \frac{\partial^2 H(I)}{\partial I_j\, \partial I_\ell}\, U_0^\ell(I) = 0. \tag{3.7} \]
Equations (3.7) imply that for an arbitrary smooth function $F(I)$ the vector field $F(I) U_0$ is a symmetry simultaneously with the vector field $U_0$ (3.6).
Suppose the Hessian matrix is degenerate at all points $I$:
\[ \det \left\| \frac{\partial^2 H(I)}{\partial I_j\, \partial I_\ell} \right\| = 0. \]
Then Eq. (3.7) possesses a non-trivial solution $U_0^\ell(I)$ that is smooth in some ball $B_1 \subset B_a$ (2.2). Let $a(I)$ be a smooth function that is equal to 1 in a ball $B_2 \subset B_1$ and $a(I) = 0$ outside of the ball $B_1$. The smooth vector field
\[ U_1 = a(I)\, U_0^\ell(I) \frac{\partial}{\partial I_\ell} \tag{3.8} \]
is defined everywhere on the manifold $M^{2k}$ and is a global symmetry of the integrable system (3.1). Let $U_1^j(I) \neq 0$ inside the ball $B_2$. For the two symmetries $U_1$ and $I_j U_1$ we have
\[ [U_1, I_j U_1] = U_1(I_j)\, U_1 = U_1^j(I)\, U_1 \neq 0. \]
Therefore any degenerate integrable system (3.3) has the non-commuting symmetries $U_1$ and $I_j U_1$.

Suppose the integrable system (3.3) is non-degenerate and its invariant submanifolds are non-compact. Then at least one coordinate $\varphi_m = \rho_m$ runs over the real line $\mathbb{R}^1$. The invariance equations (2.9) imply that the vector field
\[ U_2 = U_2^\ell(I) \frac{\partial}{\partial I_\ell} + \rho_m \frac{\partial}{\partial\rho_m} \]
is a symmetry of system (3.3) if its components $U_2^\ell(I)$ satisfy the linear system of equations
\[ \frac{\partial^2 H(I)}{\partial I_j\, \partial I_\ell}\, U_2^\ell(I) = \delta^j_m\, \frac{\partial H(I)}{\partial I_m}. \tag{3.9} \]
This system has a non-trivial solution $U_2^\ell(I)$ because the Hessian matrix (3.4) is non-degenerate. Let the vector field $U_2$ satisfy Eqs. (3.9) and be smooth in some ball $B_1 \subset B_a$ (2.2). Let $a(I)$ be a smooth function that is equal to 1 in a ball $B_2 \subset B_1$ and $a(I) = 0$ outside of the ball $B_1$. The smooth vector field
\[ U_3 = a(I)\, U_2^\ell(I) \frac{\partial}{\partial I_\ell} + a(I)\, \rho_m \frac{\partial}{\partial\rho_m} \tag{3.10} \]
is defined everywhere on $M^{2k}$ and is a global symmetry of the integrable system (3.1). Let $U_3^j(I) \neq 0$ inside the ball $B_2$. For the two symmetries $U_3$ and $I_j U_3$ we obtain
\[ [U_3, I_j U_3] = U_3(I_j)\, U_3 = U_3^j(I)\, U_3 \neq 0. \]
Thus any non-degenerate integrable system (3.3) that has non-compact invariant submanifolds $\mathbb{T}^m \times \mathbb{R}^{k-m}$ possesses the non-commuting symmetries $U_3$ and $I_j U_3$. Therefore if the Lie algebra of symmetries $S$ is abelian then the integrable system (3.3) must be non-degenerate and must have compact invariant submanifolds.
Remark 3. Theorem 1, Part 1 implies that if the Kolmogorov condition (3.4) is true in one system of action-angle coordinates then it is true in all other systems because this condition is equivalent to the fact that the Lie algebra $S$ of symmetries of system (3.1) is abelian. The Lie algebra of symmetries $S$ is a global invariant of a dynamical system that does not depend upon any system of local coordinates or upon any Hamiltonian structure. The invariance of the Kolmogorov condition was first proved by Markus & Meyer [28]. Their proof requires exact integration of the Liouville-integrable Hamiltonian system and is based on the investigation of periodic trajectories with a prescribed period.

4. Necessary Conditions for the Non-Degenerate C-Integrability in the Kolmogorov Sense

I. Let us consider a C-integrable Hamiltonian system (3.1) on a symplectic manifold $M^{2k}$ with a symplectic structure $\omega$.

Theorem 1, Part 2. The Hamiltonian system (3.1), integrable in the Liouville sense, is C-integrable and non-degenerate if and only if the equation
\[ dF(U) = U(F) = 0 \tag{4.1} \]
holds for any symmetry $U$ and any first integral $F$ of this system.

Proof. Any C-integrable non-degenerate Hamiltonian system (3.1) is $\mathbb{T}^k$-dense in any action-angle coordinates $I_j, \varphi_j$. Therefore any first integral $F(I, \varphi)$ of this system depends upon the action variables $I_j$ only and has the form (2.3). Any symmetry $U$ of this system has form (2.12). The two formulae (2.3) and (2.12) satisfy Eq. (4.1) identically.

Let us prove that if the integrable system is degenerate or if its invariant submanifolds are non-compact then there exist a symmetry $U_*$ and a first integral $F_*$ for which $U_*(F_*) \neq 0$. If the integrable system (3.3) is degenerate then we take $U_* = U_1$ (3.8). Let the coordinate $U_1^j(I) \neq 0$ for some $j$ in the ball $B_2$, where the function $a(I) = 1$. The symmetry $U_* = U_1$ and the first integral $F_* = a(I) I_j$ satisfy the inequality $U_*(F_*) = U_1^j(I) \neq 0$ inside the ball $B_2$.

If the integrable system (3.3) is non-degenerate and its invariant submanifolds are non-compact then we take $U_* = U_3$ (3.10). Let the coordinate $U_3^j(I) \neq 0$ for some $j$ in the ball $B_2$, where the function $a(I) = 1$. The symmetry $U_* = U_3$ and the first integral $F_* = a(I) I_j$ satisfy the inequality $U_*(F_*) = U_3^j(I) \neq 0$ inside the ball $B_2$. Therefore if Eq. (4.1) is satisfied for any symmetry $U$ and any first integral $F$ then the integrable system (3.1) must be non-degenerate and its invariant submanifolds must be compact.

Theorem 1, Part 3. The Hamiltonian system (3.1), integrable in the Liouville sense, is C-integrable and non-degenerate if and only if the dynamical system
\[ \dot x^\tau = U^\tau(x) \tag{4.2} \]
is integrable for any symmetry $U$, has quasi-periodic dynamics and preserves all invariant $k$-dimensional submanifolds of system (3.1) in the manifold $M^{2k}$.

Proof. If the Hamiltonian system (3.1) is C-integrable and non-degenerate then formula (2.12) for $q = k$ implies that the dynamical system (4.2) has the form
\[ \dot I_j = 0, \qquad \dot\varphi_\ell = U^{\ell+k}(I) \tag{4.3} \]
in the toroidal coordinates $I_1, \ldots, I_k, \varphi_1, \ldots, \varphi_k$. Evidently, this system is integrable and has the same invariant tori $\mathbb{T}^k$ as system (3.1). System (4.3) preserves the closed 2-form
\[ \omega_2 = \sum_{\ell=1}^{k} dU^{\ell+k}(I) \wedge d\varphi_\ell. \tag{4.4} \]
The dynamical system (4.2), (4.3) has the Hamiltonian form
\[ \dot U^{\ell+k} = 0, \qquad \dot\varphi_\ell = \frac{\partial H_0(U)}{\partial U^{\ell+k}}, \qquad 2H_0(U) = (U^{1+k})^2 + \cdots + (U^{2k})^2, \]
if the $k$ functions $U^{1+k}(I), \ldots, U^{2k}(I)$ are functionally independent. Hence we obtain that in the non-degenerate case the dynamical system (4.2) is completely integrable in the Liouville sense with respect to the symplectic structure (4.4) in a toroidal neighbourhood $O = B_r \times \mathbb{T}^k$ of any invariant torus $\mathbb{T}^k$.

If either the integrable Hamiltonian system (3.1) is degenerate or its invariant submanifolds are non-compact then this system possesses symmetries $U_1$ (3.8) or $U_3$ (3.10) respectively. For these symmetries, the dynamical systems (4.2) do not preserve the $k$-dimensional invariant submanifolds of system (2.1). This completes the proof of the three parts of Theorem 1.

II. Theorem 1 is one of the key theorems in the theory of tensor invariants of integrable Hamiltonian systems that is being developed in the series of our papers initiated by [5–9]. Theorem 1 has applications to the characterization problem, to the integrable hierarchies problem, to the necessary conditions for the strong dynamical compatibility problem, and to the problem on master symmetries.

Corollary 2. Let a dynamical system preserve two non-degenerate Poisson structures $P_1$ and $P_2$ and therefore have two Hamiltonian forms
\[ \dot x^\alpha = P_1^{\alpha\beta} H_{1,\beta} = P_2^{\alpha\gamma} H_{2,\gamma}. \tag{4.5} \]
Suppose the system (4.5) is Liouville-integrable and non-degenerate with respect to the Poisson structure $P_1$ and has compact invariant submanifolds. Then this bi-Hamiltonian system is Liouville-integrable and non-degenerate with respect to the Poisson structure $P_2$ as well.

Proof. The Liouville-integrability of the Hamiltonian system (4.5) with respect to the Poisson structure $P_1$ implies that there exist $k$ $P_1$-involutive functionally independent first integrals $f_1(x), \ldots, f_k(x)$. The vector fields
\[ U_\ell = P_2\, df_\ell \]
are symmetries of system (4.5), because this system preserves the Poisson structure $P_2$ and the functions $f_1(x), \ldots, f_k(x)$. Applying Theorem 1, Part 2 for the Liouville-integrable non-degenerate system (4.5), we obtain
\[ \{f_j, f_\ell\}_{P_2} = P_2^{\alpha\beta} f_{j,\alpha} f_{\ell,\beta} = U_\ell(f_j) = 0. \]
Therefore the $k$ functionally independent first integrals $f_1(x), \ldots, f_k(x)$ are in involution with respect to the Poisson structure $P_2$. Hence the bi-Hamiltonian system (4.5) is Liouville-integrable with respect to the Poisson structure $P_2$ as well.

Using the non-degeneracy of the Liouville-integrable system (4.5) with respect to the Poisson structure $P_1$ and applying Theorem 1, Part 1, we get that the Lie algebra $S$ of symmetries of system (4.5) is abelian. Therefore, applying this theorem again we obtain that the Liouville-integrable system (4.5) is non-degenerate with respect to the Poisson structure $P_2$.

Corollary 2 was first proved in Theorem 1 of our paper [9]. The first proof was based on the complete classification of all invariant Poisson structures for the C-integrable non-degenerate Hamiltonian systems.

III. Theorem 1 leads to the following

Necessary conditions for the non-degenerate C-integrability. If a dynamical system $V$ on a smooth manifold $M^{2k}$ is a C-integrable non-degenerate Hamiltonian system then the following necessary conditions must be satisfied: 1) The Lie algebra $S$ of symmetries of system $V$ is abelian; 2) Equation (4.1) holds for any symmetry $U$ and any first integral $F$ of system $V$.

These necessary conditions are very easy to verify for many concrete dynamical systems. We present here a simple application to the n-body problem of celestial mechanics. This problem studies the dynamics of $n$ attracting material points with masses $m_\alpha$, $\alpha = 1, \ldots, n$, in the Euclidean space $\mathbb{R}^3$. The Hamiltonian of this problem has the form
\[ H(p, r) = \sum_{\alpha=1}^{n} \frac{p_\alpha^2}{2m_\alpha} - \sum_{\alpha \neq \beta} \frac{G m_\alpha m_\beta}{|r_\alpha - r_\beta|}, \]
where the vectors $r_\alpha$ and $p_\alpha \in \mathbb{R}^3$ define the position and momentum of the $\alpha$th material point, and $G$ is the gravitational constant. The corresponding Hamiltonian system has the form
\[ \dot p_\alpha = G m_\alpha \sum_{\beta \neq \alpha} m_\beta\, \frac{r_\alpha - r_\beta}{|r_\alpha - r_\beta|^3}, \qquad \dot r_\alpha = \frac{1}{m_\alpha}\, p_\alpha \tag{4.6} \]
in the phase space $\mathbb{R}^{6n}$. The Hamiltonian system (4.6) possesses the non-abelian Lie algebra of symmetries $so(3) + \mathbb{R}^3$ connected with the first integrals of the total angular momentum and total momentum
\[ M_i = \sum_{\alpha=1}^{n} \epsilon_{ijk}\, p_{\alpha j}\, r_{\alpha k}, \qquad P_j = \sum_{\alpha=1}^{n} p_{\alpha j}, \qquad i, j, k = 1, 2, 3, \]
where $\epsilon_{ijk}$ is the alternating 3-tensor, $\epsilon_{123} = 1$. Therefore, using the Necessary Condition 1 we obtain that any C-integrable case of system (4.6) is degenerate in the phase space $\mathbb{R}^{6n}$.
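The non-commutativity behind the $so(3)$ part of this symmetry algebra can be checked directly on the angular momentum components. The sketch below is illustrative only: it treats a single particle, uses the text's definition $M_i = \epsilon_{ijk} p_j r_k$, and assumes the canonical bracket $\{f, g\} = \sum_k (\partial f/\partial r_k\, \partial g/\partial p_k - \partial f/\partial p_k\, \partial g/\partial r_k)$; the function names are hypothetical and the overall sign of the bracket depends on that assumed convention.

```python
def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def angular_momentum(p, r):
    """M_i = eps_ijk p_j r_k for one particle, following the text's definition."""
    return [sum(eps(i, j, k) * p[j] * r[k] for j in range(3) for k in range(3))
            for i in range(3)]


def grad_r(i, p):
    """Gradient of M_i with respect to the position r."""
    return [sum(eps(i, j, k) * p[j] for j in range(3)) for k in range(3)]


def grad_p(i, r):
    """Gradient of M_i with respect to the momentum p."""
    return [sum(eps(i, j, k) * r[k] for k in range(3)) for j in range(3)]


def bracket_M(a, b, p, r):
    """Canonical Poisson bracket {M_a, M_b} (sign convention assumed above)."""
    ra, pa = grad_r(a, p), grad_p(a, r)
    rb, pb = grad_r(b, p), grad_p(b, r)
    return sum(ra[k] * pb[k] - pa[k] * rb[k] for k in range(3))


# Sample phase-space point (arbitrary integers, so the arithmetic is exact).
p0, r0 = [4, 5, 6], [1, 2, 3]
M0 = angular_momentum(p0, r0)
```

At a generic point the bracket of two different components is a non-zero multiple of the third one, so the corresponding Hamiltonian vector fields do not commute; this is the obstruction that Necessary Condition 1 detects.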
Remark 4. The Necessary Conditions 1 and 2 are not sufficient for the Liouville-integrability. Indeed, Propositions 1 and 2 prove that these conditions hold for any Tq -dense dynamical system (2.1), that satisfies the non-degeneracy condition (2.11) for p < q. The invariant tori Tq of such systems have dimensions q ≥ k +1, if p < q and p+q = 2k. This invariant property contradicts the Liouville Theorem. Therefore the dynamical system (2.1) for p < q is not a Hamiltonian system integrable in the sense of Liouville. An extended concept of integrability of Hamiltonian systems that generalizes Liouville’s classical concept is introduced in Sect. 6.
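A concrete system of the kind described in Remark 4 can be written down explicitly. The following sketch uses frequencies chosen here for illustration (they are an assumption, not an example from the paper), with $p = 1$, $q = 3$, $k = 2$ and a ball $B_a$ that does not contain $I = 0$.

```latex
% Assumed example: p = 1, q = 3, so p + q = 2k with k = 2.
\[
\dot I = 0, \qquad
\dot\varphi_1 = I, \quad \dot\varphi_2 = \sqrt{2}\, I, \quad \dot\varphi_3 = \sqrt{3}\, I .
\]
% 1, \sqrt{2}, \sqrt{3} are rationally independent, so every trajectory with I \neq 0
% is dense on its torus T^3. The non-degeneracy condition (2.11) holds:
\[
\operatorname{rank} \left\| \frac{\partial\omega_\alpha}{\partial I} \right\|
= \operatorname{rank} \begin{pmatrix} 1 \\ \sqrt{2} \\ \sqrt{3} \end{pmatrix} = 1 = p .
\]
% Hence (2.5) gives U^1 = 0, all symmetries have the form (2.12) and commute,
% and U(F) = 0 for every first integral F = F(I).
% The invariant tori have dimension q = 3 = k + 1 > k.
```

Both necessary conditions are satisfied here, yet the invariant tori are too large for Liouville's Theorem.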
5. The Second Characterization Problem

I. In Theorem 9 of Sect. 14 below, we prove that the conformal symmetries (1.2) of any dynamical system $V$ form a Lie algebra $S_c$. For any Hamiltonian system $V$ (3.1), we define four Lie algebras: the Lie algebra of symmetries $S$, the Lie algebra of iso-energetic symmetries $S_e$, the Lie algebra of conformal symmetries $S_c$, and the Lie algebra of iso-energetic conformal symmetries $S_{ec}$. There are the following general relations between these Lie algebras:
\[ S_e \subset S \subset S_c, \qquad S_e \subset S_{ec} \subset S_c, \qquad S_e = S \cap S_{ec}. \tag{5.1} \]
A Liouville-integrable Hamiltonian system (3.3) is called non-degenerate in the iso-energetic sense [2, 3] if in the action-angle coordinates (3.5) the map $f$ of any submanifold $M_H \subset B_a$ of constant energy $H(I) = \mathrm{const}$ to the real projective space $\mathbb{RP}^{k-1}$
\[ f : M_H \to \mathbb{RP}^{k-1}, \qquad f(I_1, \ldots, I_k) = \Big( \frac{\partial H}{\partial I_1} : \cdots : \frac{\partial H}{\partial I_k} \Big) \tag{5.2} \]
is a local diffeomorphism in a dense open domain $D_H \subset M_H$.

The following theorem presents the invariant properties that completely characterize the non-degenerate C-integrability in the iso-energetic sense.

Theorem 2. The C-integrable Hamiltonian system (3.1) is non-degenerate in the iso-energetic sense if and only if the corresponding Lie algebra $S_{ec}$ of the iso-energetic conformal symmetries is abelian.

Proof. 1) Suppose the Hamiltonian system (3.1) is C-integrable and the map $f : M_H \to \mathbb{RP}^{k-1}$ is a local diffeomorphism. These conditions imply that the C-integrable system (3.1), (3.3) is $\mathbb{T}^k$-dense. Let us prove that the Lie algebra $S_{ec}$ is abelian. The map $f$ (5.2) is the composition of two smooth maps $h$ and $g$:
\[ h : M_H \to \mathbb{R}^k, \qquad h(I_1, \ldots, I_k) = \Big( \frac{\partial H}{\partial I_1}, \ldots, \frac{\partial H}{\partial I_k} \Big), \tag{5.3} \]
\[ g : \mathbb{R}^k \to \mathbb{RP}^{k-1}, \qquad g(x_1, \ldots, x_k) = (x_1 : \cdots : x_k). \tag{5.4} \]
Therefore if the map $f = g \circ h$ is a local diffeomorphism then the map $h$ is a local embedding of the $(k-1)$-dimensional submanifold $M_H$ ($H(I) = \mathrm{const}$) into the Euclidean space $\mathbb{R}^k$. Hence the differential map $dh$ is an embedding of the tangent spaces $T_I(M_H)$ into $\mathbb{R}^k$ at the regular points $I \in M_H$.
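As a concrete illustration of the maps $f$, $g$ and $h$ in (5.2)–(5.4), here is a small worked example for $k = 2$. The two Hamiltonians are assumptions chosen for this sketch; the first one shows that iso-energetic non-degeneracy does not require the Kolmogorov condition (3.4).

```latex
% Assumed example 1: H = I_1 + \tfrac12 I_2^2, k = 2.
\[
h(I) = (1,\ I_2), \qquad f(I) = (1 : I_2), \qquad
M_H = \{ I_1 = E - \tfrac12 I_2^2 \} .
\]
% M_H is parametrized by I_2, and I_2 is also the affine coordinate of (1 : I_2)
% in RP^1, so f is a local diffeomorphism, although
\[
\det \left\| \frac{\partial^2 H}{\partial I_j\, \partial I_\ell} \right\|
= \det \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = 0 .
\]
% Assumed example 2: H = I_1 + I_2.
\[
h(I) = (1,\ 1), \qquad f(I) = (1 : 1) \quad \text{for all } I \in M_H ,
\]
% so f is constant and the system is degenerate in the iso-energetic sense.
```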
Applying Lemma 3 of Sect. 14 for the Tk -dense system (3.3), we obtain that any its master symmetry X has the form X=
k X
(X j (I)
j=1
∂ ∂ + X j+k (I) ). ∂Ij ∂ϕj
(5.5)
Pk Therefore, the equation X(H) = 0 takes the form j=1 X j (I)∂H(I)/∂Ij = 0 in ˜ the action-angle coordinates (3.5). This equation means that the vector field X(I) = X j (I)∂/∂Ij is tangent to the submanifolds MH (H(I) = const). Equation (1.2) for the iso-energetic conformal symmetry X : [X, V ] = c(x)V is equivalent to the equations k X
X j (I)
j=1
∂ 2 H(I) ∂H(I) = c(I) . ∂Ij ∂Im ∂Im
(5.6)
Using formulae (5.3), we find that Eqs. (5.6) coincide with the equation ˜ dh(X(I)) = c(I)h(I). This remarkable coincidence is the key point of our proof. Using formulae (5.4), we obtain ˜ ˜ df (X(I)) = dg ◦ dh(X(I)) = c(I)dg(h(I)) = 0. ˜ The map f = g ◦ h (5.2) is a local diffeomorphism. Hence we obtain X(I) = 0, X j (I) = 0. Therefore any iso-energetic conformal symmetry (5.5) has the form Xe =
k X
Xe`+k (I)
`=1
∂ . ∂ϕ`
(5.7)
Hence the Lie algebra of the iso-energetic conformal symmetries $S_{ec}$ is abelian.

2) Suppose the Lie algebra $S_{ec}$ is abelian. Let us prove by contradiction that the C-integrable Hamiltonian system (3.1) is $\mathbb{T}^k$-dense. Suppose not. Applying Proposition 9 of Sect. 12, we obtain that there exists a ball $B_0 \subset B_a$ where the frequencies $\omega_\alpha(I) = \partial H(I)/\partial I_\alpha$ are rationally dependent. Therefore, Theorem 8, Part 2 implies that there exist action-angle coordinates $I_j^\tau, \varphi_j^\tau$ (13.2) in the toroidal domain $O^\tau = B_0 \times \mathbb{T}^k$, where the Hamiltonian function has the form $H(I) = H(I_1^\tau, \dots, I_\ell^\tau)$ for $\ell < k$. Hence the Hamiltonian system (3.3) takes the form
\[ \dot I_m^\tau = 0, \qquad \dot\varphi_j^\tau = \frac{\partial H(I^\tau)}{\partial I_j}, \qquad \dot\varphi_{\ell+1}^\tau = 0, \ \dots, \ \dot\varphi_k^\tau = 0. \tag{5.8} \]
For an arbitrary $2\pi$-periodic smooth function $U_e^{\ell+i}(\varphi_{\ell+1}^\tau, \dots, \varphi_k^\tau) \neq 0$, the vector field
\[ U_e^\tau = a(I) \sum_{i=1}^{k-\ell} U_e^{\ell+i}(\varphi_{\ell+1}^\tau, \dots, \varphi_k^\tau) \frac{\partial}{\partial \varphi_{\ell+i}^\tau} \tag{5.9} \]
is an iso-energetic symmetry of system (5.8). Here $a(I)$ is a smooth function that is equal to zero outside of the ball $B_0 \subset B_a$ and that is equal to 1 inside a ball $B_1 \subset B_0$. The general iso-energetic symmetries (5.9) do not commute. This result contradicts the assumption that the Lie algebra $S_e \subset S_{ec}$ is abelian. The contradiction derived proves that the system (3.1)-(3.3) is $\mathbb{T}^k$-dense.
Let us prove that if the Lie algebra of the iso-energetic conformal symmetries Sec is abelian then the map h is a local embedding and the equation rank dh = k − 1
(5.10)
holds in a dense open domain $D_H \subset M_H$. Applying Lemma 3 of Sect. 14 to the $\mathbb{T}^k$-dense C-integrable Hamiltonian system (3.3), we obtain that all its conformal symmetries $X$ have the form (5.5). It is easy to prove that all iso-energetic conformal symmetries $X_e$ (5.5) of system (3.3) (and therefore all its iso-energetic symmetries $U_e$) have the form (5.7). Indeed, if $X_e^j(I) \neq 0$ then the two iso-energetic conformal symmetries $X_e$ and $I_j X_e$ do not commute: $[X_e, I_j X_e] = X_e^j(I) X_e \neq 0$. Applying Lemma 1, we obtain that the fact just proven, namely that all iso-energetic symmetries
\[ U_e = \sum_{j=1}^{k} U_e^j(I) \frac{\partial}{\partial I_j} + \sum_{\ell=1}^{k} U_e^{\ell+k}(I) \frac{\partial}{\partial \varphi_\ell} \]
of system (3.3) have the form (5.7), is equivalent to the fact that the equations
\[ \sum_{j=1}^{k} \frac{\partial^2 H(I)}{\partial I_m \partial I_j} U_e^j(I) = 0, \qquad U_e(H) = \sum_{j=1}^{k} U_e^j(I) \frac{\partial H(I)}{\partial I_j} = 0 \]
have no non-zero solutions $U_e^j(I)$. Therefore, for $I \in D_H$, the Hessian matrix $\partial^2 H(I)/\partial I_m \partial I_j$ is non-degenerate on the tangent spaces $T_I(M_H)$. Here $D_H$ is a dense open domain in the submanifold $M_H$. This Hessian matrix coincides with the differential $dh$ of the map $h$ (5.3). Hence we obtain that the map $h$ is a local embedding. Therefore Eq. (5.10) holds in the domain $D_H$. The map $g$ (5.4) satisfies the equation $\operatorname{rank} dg = k - 1$ everywhere but the origin. This equation and Eq. (5.10) imply that the map $f = g \circ h$ satisfies the equation
\[ \operatorname{rank} df = \operatorname{rank}(dg \circ dh) \geq k - 2 \tag{5.11} \]
in the domain $D_H \subset M_H$. We prove now by contradiction that if the Lie algebra $S_{ec}$ of the iso-energetic conformal symmetries of system (3.3) is abelian then system (3.3) is non-degenerate in the iso-energetic sense. Suppose not. Then there exists a domain $O_1 \subset M_H$ where $\operatorname{rank} df < k - 1$. Applying Eq. (5.11), we obtain that there exists a domain $O_2 \subset O_1$ where $\operatorname{rank} df(I) = k - 2$, $I \in O_2$. Let $x \in f(O_2)$. Then $f^{-1}(x)$ is a smooth curve $I(s)$, where $s$ is a real parameter. We consider the tangent vector field
\[ Y = \sum_{j=1}^{k} Y^j(I) \frac{\partial}{\partial I_j} = a(I) \frac{dI(s)}{ds} \neq 0, \qquad Y^j(I) = a(I) \frac{dI_j(s)}{ds}. \tag{5.12} \]
Here $a(I)$ is a smooth function that is equal to zero outside of the domain $O_2$ and $a(I) = 1$ for $I \in B_2$, where $B_2$ is a ball $B_2 \subset O_2$. By definition, we have
\[ Y(H) = dH(Y) = 0, \qquad df(Y) = dg \circ dh(Y) = 0. \tag{5.13} \]
The first Eq. (5.13) and Eq. (5.10) yield $dh(Y) \neq 0$. Therefore, the second Eq. (5.13) and Eq. (5.4) imply
\[ dh(Y(I)) = c(I)h(I), \tag{5.14} \]
where $c(I) \neq 0$ is some smooth function. Substituting the formulae (5.3), we obtain the coordinate form of Eq. (5.14):
\[ \sum_{j=1}^{k} \frac{\partial^2 H(I)}{\partial I_m \partial I_j} Y^j(I) = c(I) \frac{\partial H(I)}{\partial I_m}. \tag{5.15} \]
The function $c(I)$ is a first integral of system (3.3) because it depends upon the action variables only. Equations (5.15) are equivalent to the equation $[Y, V] = c(I)V$. The first Eq. (5.13) and Eq. (5.15) mean that the vector field $Y$ (5.12) is an iso-energetic conformal symmetry (1.2) of the system (3.3). We have proved that any iso-energetic conformal symmetry of system (3.3) has the form (5.7). Hence $Y^j(I) = 0$. These equations contradict the first Eq. (5.12). The contradiction derived proves that system (3.3) is non-degenerate in the iso-energetic sense.

Remark 5. We have proved in Theorem 2 that if the Hamiltonian system (3.1) is iso-energetically non-degenerate then the two Lie algebras $S_{ec}$ and $S_e$ coincide, in view of Eq. (5.7).

II. Theorems 1 and 2 provide an insight into the relationships between the concept of the non-degeneracy in the Kolmogorov sense and the concept of the non-degeneracy in the iso-energetic sense. Namely, the assumption that the Lie algebra $S$ of symmetries of the Hamiltonian system (3.1) is abelian does not entail that the Lie algebra $S_{ec}$ of the iso-energetic conformal symmetries is abelian. Conversely, the assumption that the Lie algebra $S_{ec}$ is abelian does not imply that the Lie algebra $S$ of symmetries of system (3.1) is abelian. Therefore these two concepts of non-degeneracy of the C-integrable Hamiltonian systems are independent. This fact was first established in [2, 31] by constructing concrete examples. The methods used in Theorem 2 prove that the non-degeneracy in the Kolmogorov sense implies the inequality (5.11), and the non-degeneracy in the iso-energetic sense yields the inequality
\[ \operatorname{rank} \left\| \frac{\partial^2 H(I)}{\partial I_j \partial I_m} \right\| \geq k - 1. \]
An Example. Let a smooth function $H_0(I_1, \dots, I_k)$ be homogeneous of degree zero. Applying the Euler Theorem on homogeneous functions, we obtain the equations
\[ \sum_{j=1}^{k} \frac{\partial H_0(I)}{\partial I_j} I_j = 0, \qquad \sum_{j=1}^{k} I_j \frac{\partial^2 H_0(I)}{\partial I_j \partial I_m} = -\frac{\partial H_0(I)}{\partial I_m}, \tag{5.16} \]
because the functions $\partial H_0(I)/\partial I_m$ are homogeneous of degree $-1$. Equations (5.16) mean that the vector field $X = I_j\,\partial/\partial I_j$ is an iso-energetic conformal symmetry of the Hamiltonian system (3.3) with the Hamiltonian function $H_0(I_1, \dots, I_k)$. The two iso-energetic conformal symmetries $X$ and $I_m X$ do not commute: $[X, I_m X] = I_m X$. Therefore, Theorem 2 implies that the system (3.3) is degenerate in the iso-energetic sense. However, this system can be non-degenerate in the Kolmogorov sense, if $\det \| \partial^2 H_0(I)/\partial I_j \partial I_m \| \neq 0$.
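The identities (5.16) can be checked on a concrete function. The sketch below takes the illustrative choice $H_0 = I_1/I_2$ with $k = 2$, which is not taken from the paper, and evaluates everything in exact rational arithmetic. It also computes the Hessian determinant and a bordered Hessian determinant. Treating the vanishing of the bordered determinant as the test for iso-energetic degeneracy is the usual textbook criterion and is an assumption here, since the text defines that notion through the map $f = g \circ h$.

```python
from fractions import Fraction as Fr


def grad_H0(I1, I2):
    """Gradient of the illustrative function H0 = I1 / I2 (degree zero)."""
    return [1 / I2, -I1 / I2**2]


def hess_H0(I1, I2):
    """Hessian matrix of H0 = I1 / I2."""
    return [[Fr(0), -1 / I2**2],
            [-1 / I2**2, 2 * I1 / I2**3]]


def euler_residuals(I1, I2):
    """Left-hand side minus right-hand side of the two identities (5.16)."""
    I = [I1, I2]
    g, h = grad_H0(I1, I2), hess_H0(I1, I2)
    first = sum(g[j] * I[j] for j in range(2))
    second = [sum(I[j] * h[j][m] for j in range(2)) + g[m] for m in range(2)]
    return first, second


def det2(h):
    """Determinant of a 2x2 matrix (Kolmogorov-type test)."""
    return h[0][0] * h[1][1] - h[0][1] * h[1][0]


def bordered_det(g, h):
    """Determinant of [[h, g], [g^T, 0]] for k = 2 (assumed iso-energetic test)."""
    m = [[h[0][0], h[0][1], g[0]],
         [h[1][0], h[1][1], g[1]],
         [g[0], g[1], Fr(0)]]
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


if __name__ == "__main__":
    I1, I2 = Fr(3), Fr(2)
    print(euler_residuals(I1, I2))                             # residuals of (5.16)
    print(det2(hess_H0(I1, I2)))                               # Hessian determinant
    print(bordered_det(grad_H0(I1, I2), hess_H0(I1, I2)))      # bordered determinant
```

For this choice the Hessian determinant equals $-1/I_2^4$ and is non-zero, while the bordered determinant vanishes identically, which matches the behaviour described in the example.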
III. The relationships between the concept of the non-degeneracy in the Kolmogorov sense of the C-integrable Hamiltonian systems and the concept of the non-degeneracy in the iso-energetic sense are reflected in the relationships between the corresponding Lie algebras $S_e$, $S$, $S_{ec}$ and $S_c$.

Proposition 3. The two concepts of the non-degeneracy of the C-integrable Hamiltonian systems (3.1) on a symplectic manifold $M^{2k}$ imply the following interrelationships between the Lie algebras $S_e$, $S$, $S_{ec}$ and $S_c$:
1) If system (3.1) is non-degenerate in the Kolmogorov sense and is degenerate in the iso-energetic sense everywhere in the manifold $M^{2k}$ then $S_e = S \neq S_{ec} = S_c$.
2) If system (3.1) is degenerate in the Kolmogorov sense everywhere in the manifold $M^{2k}$ and is non-degenerate in the iso-energetic sense then $S_e = S_{ec} \neq S = S_c$.
3) If system (3.1) is non-degenerate in the Kolmogorov sense and is non-degenerate in the iso-energetic sense then $S_e = S = S_{ec} \neq S_c$.

Proof. 1) The equality $S_e = S$ follows from Theorem 1. Any conformal symmetry $X$ (5.5) satisfies Eqs. (5.6). In view of the non-degeneracy of the Hessian matrix $\partial^2 H(I)/\partial I_j \partial I_m$, all solutions to Eqs. (5.6) for different functions $c(I)$ are proportional to each other. In view of the degeneracy of system (3.1) in the iso-energetic sense, Eqs. (5.6) have a non-zero solution $\tilde X = X^j(I)\,\partial/\partial I_j$ that satisfies the equation $\tilde X(H) = 0$. All proportional solutions $a(I)\tilde X$ satisfy the equation $a(I)\tilde X(H) = 0$ as well. Therefore any conformal symmetry $X$ (5.5) is iso-energetic. Hence we get $S_{ec} = S_c$. The inequality $S \neq S_{ec}$ follows from Theorems 1 and 2 because the Lie algebra $S$ is abelian and the Lie algebra $S_{ec}$ is not.

2) The equality $S_e = S_{ec}$ follows from Theorem 2. The non-degeneracy of system (3.1) in the iso-energetic sense implies that Eqs. (5.6) have no non-zero solutions $\tilde X = X^j(I)\,\partial/\partial I_j$ that satisfy the equation $\tilde X(H) = 0$. The degeneracy of system (3.1) in the Kolmogorov sense yields
\[ \operatorname{rank} \left\| \frac{\partial^2 H(I)}{\partial I_j \partial I_m} \right\| \leq k - 1. \]
In view of Eq. (5.10), we obtain $dh(T_I(\mathbb{R}^k)) = dh(T_I(M_H))$. Therefore, Eqs. (5.6) have non-zero solutions $X^j(I)$ only for $c(I) = 0$. That means that any conformal symmetry of system (3.1) is a symmetry. This result and the general inclusions (5.1) prove the equality $S = S_c$. The inequality $S_{ec} \neq S$ follows from Theorems 1 and 2 because the Lie algebra $S_{ec}$ is abelian and the Lie algebra $S$ is not.

3) Applying Theorems 1 and 2, we obtain that the Lie algebras $S_e$, $S$ and $S_{ec}$ are abelian and coincide. The Lie algebra $S_c$ is non-abelian because Eqs. (5.6) have solutions for arbitrary functions $c(I) \neq 0$, in view of the non-degeneracy of the Hessian matrix $\partial^2 H(I)/\partial I_j \partial I_m$.
In the case when the C-integrable Hamiltonian system (3.1) is degenerate in the Kolmogorov sense and is degenerate in the iso-energetic sense, the general inclusions (5.1) are strict.

Remark 6. Statement (2) of Proposition 3 implies that all conformal symmetries (1.2) of the C-integrable Hamiltonian system (3.1) are symmetries if the system is degenerate in the Kolmogorov sense everywhere in the manifold $M^{2k}$ and is non-degenerate in the iso-energetic sense. In Theorem 10 of Sect. 15 below, we prove that all master symmetries of an integrable system (2.1) are symmetries if and only if the system has constant coefficients.

6. An Extended Concept of Integrability of Hamiltonian Systems

I. According to the terminology we adopt here, a Hamiltonian system
\[ \dot x^\tau = V^\tau = P^{\tau\mu}\theta_\mu, \qquad P = \omega^{-1}, \qquad d\omega = 0, \qquad d\theta = 0 \tag{6.1} \]
on a symplectic manifold $M^{2k}$ is called integrable in the A-sense (or just A-integrable) if Liouville's condition is satisfied: the system (6.1) has $k$ functionally independent first integrals $F_1(x), \dots, F_k(x)$ that are in involution,
\[ \{F_i, F_j\} = P^{\tau\mu} F_{i,\tau} F_{j,\mu} = 0. \tag{6.2} \]
The Liouville condition implies that system (6.1) has an abelian $k$-dimensional Lie algebra $S_a$ of Hamiltonian symmetries $U_i^\tau = P^{\tau\mu} F_{i,\mu}$. This abelian Lie algebra of symplectic symmetries $S_a$ is isomorphic to the abelian Lie algebra of first integrals $F_j(x)$ with respect to the Poisson brackets (6.2).

Our key idea about the properties of the general integrable Hamiltonian systems (6.1) consists of the following:
a) The integrable Hamiltonian system must possess functionally independent first integrals $F_1(x), \dots, F_p(x)$ but they can be non-involutive and their number $p$ can be arbitrary: $0 \leq p < n$.
b) The integrable Hamiltonian system must possess an abelian $(n - p)$-dimensional Lie subalgebra of symmetries $S_a \subset S$ that preserve the first integrals $F_j(x)$ but the symmetries may be non-symplectic.

Definition 8. The Hamiltonian system (6.1) on a symplectic manifold $M^{2k}$ is said to be integrable in the B-sense (or just B-integrable) if it possesses: (a) $p$ functionally independent first integrals $F_1(x), \dots, F_p(x)$, where $0 \leq p < n$, (b) an abelian $(n - p)$-dimensional Lie subalgebra of symmetries $S_a$ that preserve the first integrals $F_j(x)$.

In Liouville's 1855 definition [26] the independent Conditions (a) and (b) were incorporated into the single condition of the involutiveness of first integrals. It is this hidden overlap of ideas that is probably the reason why Liouville's definition has served as the most general characterization of integrability since that time.

Proposition 4. The Hamiltonian system (6.1) is B-integrable if the following conditions are satisfied:
1) The system (6.1) preserves $s$ linearly independent closed 1-forms $\theta_1, \dots, \theta_s$ on $M^{2k}$ and has $p$ functionally independent first integrals $F_1(x), \dots, F_p(x)$.
2a) The equations
\[ \{\theta_\alpha, \theta_\beta\} = P^{\tau\mu} \theta_{\alpha,\tau} \theta_{\beta,\mu} = c_{\alpha\beta}, \qquad \{F_j, \theta_\alpha\} = P^{\tau\mu} F_{j,\tau} \theta_{\alpha,\mu} = 0 \]
hold. Here $c_{\alpha\beta} = -c_{\beta\alpha}$ are arbitrary constants.
2b) There are $r \leq p$ first integrals $F_\ell(x)$, $1 \leq \ell \leq r$, which are in involution with all other first integrals: $\{F_\ell, F_j\} = P^{\tau\mu} F_{\ell,\tau} F_{j,\mu} = 0$ for $1 \leq j \leq p$. The equality $p + r + s = 2k$ holds.
3) The $r + s$ 1-forms $dF_1, \dots, dF_r, \theta_1, \dots, \theta_s$ are linearly independent on each component of the submanifolds $M^q$:
\[ F_1(x) = c_1, \ \dots, \ F_p(x) = c_p \]
for almost all values of the constants $c_1, \dots, c_p$. Here $q = 2k - p = r + s$.

Proof. Condition (1) implies that the vector fields
\[ U_\alpha^\tau = P^{\tau\mu} \theta_{\alpha,\mu}, \qquad U_{s+i}^\tau = P^{\tau\mu} F_{i,\mu} \tag{6.3} \]
are symmetries of the Hamiltonian system (6.1). Here $\alpha = 1, \dots, s$ and $i = 1, \dots, r$. Conditions (2a) and (2b) imply that the symmetries (6.3) preserve the $p$ first integrals $F_1(x), \dots, F_p(x)$ and form an abelian Lie algebra $S_a$. Condition (3) yields that the dimension of the Lie algebra $S_a$ is equal to $n - p$. Therefore the two Conditions (a) and (b) of the Definition of the B-integrable Hamiltonian systems are satisfied.

The Hamiltonian systems that are integrable in the A-sense or in Liouville's sense are described by the very restrictive conditions $s = 0$, $p = q = r = k$.

An Example. Let the symplectic manifold $M^4$ be the cotangent bundle of a torus $\mathbb{T}^2$, $M^4 = T^*(\mathbb{T}^2)$, with coordinates $p_1, p_2, \varphi_1, \varphi_2$, and with the non-standard symplectic structure
\[ \omega = \frac{p_1\,dp_2 - p_2\,dp_1}{p_1^2 + p_2^2} \wedge (a_1\,d\varphi_2 - a_2\,d\varphi_1) + (p_1\,dp_1 + p_2\,dp_2) \wedge (a_1\,d\varphi_1 + a_2\,d\varphi_2) + a_0\,dp_1 \wedge dp_2 - a_0\,d\varphi_1 \wedge d\varphi_2. \tag{6.4} \]
We consider the Hamiltonian system $\dot x^\tau = (\omega^{-1})^{\tau\mu} H_{,\mu}$ with a Hamiltonian function $H(I)$, $I = p_1^2 + p_2^2$. The system has the form
\[ \dot p_1 = -a_0 f p_2, \qquad \dot p_2 = a_0 f p_1, \qquad \dot\varphi_1 = a_1 f, \qquad \dot\varphi_2 = a_2 f, \tag{6.5} \]
where $f = dH(I)/dI$. It is easy to verify that system (6.5) preserves the symplectic structure $\omega$ (6.4). The invariant submanifolds of this system are the 3-dimensional tori
\[ \mathbb{T}^3: \quad p_1^2 + p_2^2 = c, \qquad 0 \leq \varphi_1 \leq 2\pi, \qquad 0 \leq \varphi_2 \leq 2\pi. \]
The trajectories of system (6.5) have the form ($i = 1, 2$)
\[ p_1(t) = p_1^0 \cos(a_0 f t + t_0), \qquad p_2(t) = p_2^0 \sin(a_0 f t + t_0), \qquad \varphi_i(t) = a_i f t + \varphi_i^0. \tag{6.6} \]
The trajectories (6.6) are quasi-periodic and everywhere dense on the 3-dimensional tori $\mathbb{T}^3$ provided that the constants $a_0, a_1, a_2$ are rationally independent. Therefore the Hamiltonian system (6.5) is not integrable in the A-sense.
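As a numerical illustration of how (6.6) solves (6.5), the following sketch integrates system (6.5) with a classical Runge-Kutta step and compares the result with the closed form. The Hamiltonian $H(I) = I^2/2$ (so that $f = I$), the constants `A0`, `A1`, `A2` and the initial point are hypothetical choices made only for this check. The closed form uses a single amplitude $\sqrt{I}$ for both momenta, which is the reading of (6.6) required by the conservation of $I$; that reading is an assumption.

```python
import math

# Hypothetical constants a0, a1, a2 for the illustration.
A0, A1, A2 = 1.0, math.sqrt(2.0), math.sqrt(3.0)


def f(I):
    """f = dH/dI for the illustrative choice H(I) = I**2 / 2."""
    return I


def rhs(x):
    """Right-hand side of system (6.5) for x = (p1, p2, phi1, phi2)."""
    p1, p2, _, _ = x
    fI = f(p1 * p1 + p2 * p2)
    return (-A0 * fI * p2, A0 * fI * p1, A1 * fI, A2 * fI)


def rk4_step(x, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(xi + h * ki for xi, ki in zip(x, k))
    k1 = rhs(x)
    k2 = rhs(shifted(k1, dt / 2))
    k3 = rhs(shifted(k2, dt / 2))
    k4 = rhs(shifted(k3, dt))
    return tuple(xi + dt / 6 * (a + 2 * b + 2 * c + d)
                 for xi, a, b, c, d in zip(x, k1, k2, k3, k4))


def integrate(x0, t_end, steps):
    """Integrate (6.5) from t = 0 to t_end with a fixed step."""
    x, dt = tuple(x0), t_end / steps
    for _ in range(steps):
        x = rk4_step(x, dt)
    return x


def closed_form(x0, t):
    """Trajectory (6.6) with common amplitude sqrt(I) and phase t0."""
    p1, p2, phi1, phi2 = x0
    amp, t0 = math.hypot(p1, p2), math.atan2(p2, p1)
    fI = f(amp * amp)
    return (amp * math.cos(A0 * fI * t + t0),
            amp * math.sin(A0 * fI * t + t0),
            phi1 + A1 * fI * t,
            phi2 + A2 * fI * t)


if __name__ == "__main__":
    x0 = (0.6, 0.8, 0.1, 0.2)
    print(integrate(x0, 2.0, 2000))
    print(closed_form(x0, 2.0))
```

The angles are left unreduced modulo $2\pi$ so that the two outputs can be compared component by component.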
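However, the Hamiltonian system (6.5) is integrable in the B-sense. Indeed, the system has one first integral $I = p_1^2 + p_2^2$ and possesses a 3-dimensional Lie algebra $S_a$ of symmetries
\[ U_1 = p_1 \frac{\partial}{\partial p_2} - p_2 \frac{\partial}{\partial p_1}, \qquad U_2 = \frac{\partial}{\partial \varphi_1}, \qquad U_3 = \frac{\partial}{\partial \varphi_2} \]
that preserve the first integral $I$. The system (6.5) also satisfies the conditions (1)-(3) of Proposition 4. Indeed, the system preserves two closed 1-forms
\[ \theta_1 = a_2\,d\varphi_1 - a_1\,d\varphi_2, \qquad \theta_2 = a_1 \frac{p_1\,dp_2 - p_2\,dp_1}{p_1^2 + p_2^2} - a_0\,d\varphi_1. \]
These 1-forms satisfy the equations
\[ (\omega^{-1})^{\tau\mu} \theta_{1,\tau} \theta_{2,\mu} = -a_1, \qquad (\omega^{-1})^{\tau\mu} H_{,\tau} \theta_{1,\mu} = 0, \qquad (\omega^{-1})^{\tau\mu} H_{,\tau} \theta_{2,\mu} = 0. \]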
Therefore system (6.5) is integrable in the B-sense and $p = r = 1$, $q = 3$, $s = 2$.

A direct product of $k$ copies of the symplectic manifold $T^*(\mathbb{T}^2)$ with the Hamiltonian systems (6.5) for different functions $H(I)$ and constants $a_0, a_1, a_2$ provides the B-integrable Hamiltonian systems on the symplectic manifolds $M^{4k} = T^*(\mathbb{T}^{2k})$. These Hamiltonian systems have invariant coisotropic tori $\mathbb{T}^{3k}$ and are $\mathbb{T}^{3k}$-dense in general. For these systems, we have $p = r = k$, $q = 3k$, $s = 2k$.

II. Recall that a torus $\mathbb{T}^k$ ($q = k$) is a Lagrangian submanifold if its tangent bundle coincides with the $\omega$-orthogonal bundle: $T_x(\mathbb{T}^k) = T_x(\mathbb{T}^k)^\perp$. A torus $\mathbb{T}^q$ is isotropic if
\[ T_x(\mathbb{T}^q) \subset T_x(\mathbb{T}^q)^\perp. \]
It is coisotropic if
\[ T_x(\mathbb{T}^q) \supset T_x(\mathbb{T}^q)^\perp. \]
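These three definitions can be tested pointwise with linear algebra. The sketch below is an illustration only: the helper names `rank`, `wedge`, `omega_64`, `restricted_form` and `classify` are hypothetical, and the method rests on the standard fact that for a non-degenerate form the intersection $W \cap W^\perp$ is the kernel of the form restricted to $W$, with $\dim W^\perp = 2k - \dim W$. It assumes that the supplied basis vectors are linearly independent. The function `omega_64` evaluates the matrix of the 2-form (6.4) at a point.

```python
from fractions import Fraction as Fr


def rank(matrix):
    """Rank of a rational matrix by Gaussian elimination."""
    m = [[Fr(v) for v in row] for row in matrix]
    rk, cols = 0, len(m[0]) if m else 0
    for c in range(cols):
        pivot = next((r for r in range(rk, len(m)) if m[r][c] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            factor = m[r][c] / m[rk][c]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk


def wedge(u, v):
    """Matrix of the 2-form u ^ v for covectors u and v."""
    n = len(u)
    return [[u[i] * v[j] - u[j] * v[i] for j in range(n)] for i in range(n)]


def omega_64(p1, p2, a0, a1, a2):
    """Matrix of the 2-form (6.4) in the coordinates (p1, p2, phi1, phi2)."""
    p1, p2 = Fr(p1), Fr(p2)
    I = p1 * p1 + p2 * p2
    terms = [
        wedge([-p2 / I, p1 / I, 0, 0], [0, 0, -a2, a1]),
        wedge([p1, p2, 0, 0], [0, 0, a1, a2]),
        wedge([a0, 0, 0, 0], [0, 1, 0, 0]),
        wedge([0, 0, -a0, 0], [0, 0, 0, 1]),
    ]
    return [[sum(t[i][j] for t in terms) for j in range(4)] for i in range(4)]


def restricted_form(omega, basis):
    """Gram matrix G[a][b] = omega(basis[a], basis[b])."""
    n = len(omega)
    return [[sum(u[i] * omega[i][j] * v[j] for i in range(n) for j in range(n))
             for v in basis] for u in basis]


def classify(omega, basis):
    """Classify the span W of `basis` with respect to a non-degenerate omega."""
    q = len(basis)
    kernel = q - rank(restricted_form(omega, basis))  # dim of W ∩ W^⊥
    isotropic = kernel == q                  # W inside its orthogonal complement
    coisotropic = kernel == len(omega) - q   # orthogonal complement inside W
    if isotropic and coisotropic:
        return "Lagrangian"
    if isotropic:
        return "isotropic"
    if coisotropic:
        return "coisotropic"
    return "symplectic" if kernel == 0 else "none of these"


if __name__ == "__main__":
    # Tangent space of the torus p1^2 + p2^2 = 25 at the point (3, 4).
    torus_basis = [[-4, 3, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    print(classify(omega_64(3, 4, 1, 2, 3), torus_basis))
```

With the hypothetical constants $a_0, a_1, a_2 = 1, 2, 3$ the restricted form on the tangent space of the torus has rank 2, so the kernel is one-dimensional and equals $2k - q = 1$, which is the coisotropic case.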
The Lagrangian and isotropic tori appear in the A-integrable Hamiltonian systems. These systems have been studied in numerous papers and books over the 140 years since the Liouville paper [26]. The Hamiltonian systems with coisotropic invariant tori were discovered by Parasyuk [38]. These systems and their links with KAM theory were studied by Herman [21,22], Parasyuk [38,39] and Moser [32]. For general integrable systems, the invariant tori $\mathbb{T}^q$ are not necessarily Lagrangian, isotropic or coisotropic.

III. Theorem 3. If a Hamiltonian system (6.1) is B-integrable, then the following properties are realized:
1) The components of the invariant submanifolds $M^q$, $q = 2k - p$, $F_j(x) = c_j$, are tori $\mathbb{T}^q$ if they are compact and toroidal cylinders $\mathbb{T}^m \times \mathbb{R}^{q-m}$ if they are noncompact. In a toroidal neighbourhood $O = B_a \times \mathbb{T}^m \times \mathbb{R}^{q-m}$ of any toroidal cylinder there exist local coordinates $I_1, \dots, I_p, \varphi_1, \dots, \varphi_m, \rho_{m+1}, \dots, \rho_q$. The coordinates $I_j$ run over a ball $B_a$ (2.2). The angular coordinates $\varphi_1, \dots, \varphi_m$ run over the torus $\mathbb{T}^m$, $0 \leq \varphi_j \leq 2\pi$, and the coordinates $\rho_{m+1}, \dots, \rho_q$ run over the Euclidean space $\mathbb{R}^{q-m}$. In these coordinates, the Hamiltonian system (6.1) has the form
\[ \dot I_j = 0, \qquad \dot\varphi_\alpha = \omega_\alpha(I), \qquad \dot\rho_\gamma = \omega_\gamma(I), \tag{6.7} \]
and therefore it is integrable (in the conventional sense).
2) For each compact component $\mathbb{T}^q$ of the invariant submanifold $M^q$, we assume that system (6.7) is $\mathbb{T}^q$-dense. This assumption does not cause any loss of generality. The invariant symplectic structure $\omega = P^{-1}$ is reduced to the canonical form
\[ \omega_c = \sum_{\alpha=1}^{q} \sum_{\ell=1}^{r} a_\ell^\alpha\, dI_\ell \wedge d\varphi_\alpha + \sum_{\alpha,\beta=1}^{q} c_{\alpha\beta}\, d\varphi_\alpha \wedge d\varphi_\beta + \sum_{j=1}^{(p-r)/2} dI_{r+j} \wedge dI_{h+j} \tag{6.8} \]
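in the new coordinates $I_1, \dots, I_p, \varphi_1, \dots, \varphi_q$ in the toroidal domain $O = B_a \times \mathbb{T}^q$. Here $p + q = n = 2k$, $0 \leq r \leq p$, $h = (p + r)/2$. The vectors $a_\ell \in \mathbb{R}^q$ are orthonormal null-vectors of the $q \times q$ constant, skew-symmetric matrix $c_{\alpha\beta}$:
\[ c_{\alpha\beta} a_\ell^\beta = 0, \qquad (a_\ell, a_j) = \delta_{j\ell}, \qquad r = q - \operatorname{rank} \| c_{\alpha\beta} \|, \qquad j, \ell = 1, \dots, r. \]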
There are $k(k+1)/2$ canonical forms (6.8) that are non-isomorphic to Liouville's classical form (3.2). There are $k - 1$ canonical forms for which the tori $\mathbb{T}^q$ are coisotropic. There are $k$ canonical forms with a non-degenerate inherited symplectic structure on the tori $\mathbb{T}^q$ (for $q = 2k$ we have $M^{2k} = \mathbb{T}^{2k}$). For the remaining $(k-1)(k-2)/2$ canonical forms (6.8) the invariant tori $\mathbb{T}^q$ are not Lagrangian, isotropic, coisotropic, or non-degenerate.
3) A $\mathbb{T}^q$-dense Hamiltonian system that preserves the symplectic structure (6.8) has the canonical form
\[ \dot I_1 = 0, \ \dots, \ \dot I_p = 0, \qquad \dot\varphi_\alpha = \sum_{\ell=1}^{r} \frac{\partial H(I)}{\partial I_\ell} a_\ell^\alpha + b_0^\alpha. \tag{6.9} \]
Here $H(I)$ is an arbitrary smooth function of the $r$ variables $I_1, \dots, I_r$ and the vector $b_0$ is orthogonal to the vectors $a_\ell$. System (6.9) has the Hamiltonian form $\dot x^\tau = (\omega_c^{-1})^{\tau\mu} \theta_\mu$, $\theta = dH(I) + c_{\alpha\beta} b_0^\beta\, d\varphi_\alpha$. For a general function $H(I)$, system (6.9) is $\mathbb{T}^q$-dense if and only if the image space $C \subset \mathbb{R}^q$ of the $q \times q$ skew-symmetric matrix $c_{\alpha\beta}$ contains no vectors $m = (m_1, \dots, m_q)$ with integer coordinates $m_\alpha$ that are orthogonal to the vector $b_0$, $(m, b_0) = 0$.
4) Any Hamiltonian system (6.9) is integrable in the B-sense. The variables $\varphi_1, \dots, \varphi_q$ are supposed to run over a torus $\mathbb{T}^q$ or over a toroidal cylinder $\mathbb{T}^m \times \mathbb{R}^{q-m}$, where $m = 0, 1, \dots, q - 1$.

The proof of Theorem 3 is split into five parts which will be completely published in paper [12]. The main stages of the proof are:
(i) The proof of the existence of coordinates $I_1, \dots, I_p, \varphi_1, \dots, \varphi_m, \rho_{m+1}, \dots, \rho_q$, where the Hamiltonian system (6.1) has the form (6.7) or (2.1), provided that the system is integrable in the B-sense.
(ii) The derivation of a complete classification of all closed 2-forms $\omega_c$ that are invariant with respect to the $\mathbb{T}^q$-dense dynamical system (2.1). Hence we obtain the general form of the original symplectic structure $\omega = P^{-1}$ (6.1) in the coordinates $I_1, \dots, I_p, \varphi_1, \dots, \varphi_q$.
(iii) The construction of a sequence of transformations of coordinates $I_1, \dots, I_p, \varphi_1, \dots, \varphi_q$ that transforms the symplectic structure $\omega$ to one of the canonical forms (6.8). These transformations preserve the general form of the dynamical system (2.1).
(iv) The derivation of the canonical form (6.9) for the dynamical system (2.1) in the newly-constructed coordinates where the symplectic structure $\omega$ has the canonical form (6.8).
(v) The verification of the invariance of the $q$ closed differential 1-forms
\[ \theta_\alpha = c_{\alpha 1}\, d\varphi_1 + \dots + c_{\alpha q}\, d\varphi_q, \tag{6.10} \]
with respect to the Hamiltonian systems (6.9). Here $\alpha = 1, \dots, q$. There are $s$ ($= \operatorname{rank} \| c_{\alpha\beta} \|$) linearly independent 1-forms $\theta_\alpha$ (6.10). These invariant 1-forms $\theta_\alpha$ and first integrals $I_1, \dots, I_p$ satisfy the conditions of Proposition 4. For the Hamiltonian system (6.9), we have $r = q - \operatorname{rank} \| c_{\alpha\beta} \|$.

In paper [9], we derived a complete classification of symplectic and Poisson structures that are invariant with respect to a Liouville-integrable non-degenerate Hamiltonian system. Theorem 3 is a further development of Theorem 1 of that paper. The classical Liouville canonical forms (3.2) and (3.3) are the particular cases of the general canonical forms (6.8) and (6.9) for
\[ a_\ell^\alpha = \delta_\ell^\alpha, \qquad b_0^\alpha = 0, \qquad c_{\alpha\beta} = 0, \qquad p = q = r = k. \]
The constructed $k(k+1)/2$ canonical forms of integrable Hamiltonian systems (6.9) cannot be integrated through Liouville's Theorem because the maximal number of their independent involutive first integrals is equal to
\[ r + \frac{1}{2}(p - r) = \frac{1}{2}\left(p + q - \operatorname{rank} \| c_{\alpha\beta} \|\right) < k. \]
The integrable Hamiltonian systems (6.9) are invariant with respect to the action of the torus $\mathbb{T}^q$. This action preserves the symplectic structure $\omega_c$ (6.8) because it has constant coefficients. This action is not, however, a Poisson action as defined by Souriau [42] and by Marsden & Weinstein [29] and studied by Atiyah [4], Cartier [14] and Flaschka [18]. The torus $\mathbb{T}^q$ action has the Hamiltonian form (6.1) where the closed 1-forms $\theta$ are not exact in general.

IV. Corollary 3. Suppose a Hamiltonian system (6.1) on a symplectic manifold $M^{2k}$ is integrable in the B-sense and is $\mathbb{T}^q$-dense. Then the Lie algebra $S$ of symmetries of (6.1) is abelian if and only if the tori $\mathbb{T}^q$ either are Lagrangian or coisotropic, $q = k, k+1, \dots, 2k$, $r = p = 2k - q$, and the condition
\[ \det \left\| \frac{\partial^2 H(I)}{\partial I_j \partial I_\ell} \right\| \neq 0 \tag{6.11} \]
holds for the canonical form (6.9) in a dense open set $D \subset B_a$. Here $j, \ell = 1, \dots, p$, $p = 2k - q$.

Proof. Theorem 3 implies that the B-integrable Hamiltonian system (6.1) in any toroidal domain $O = B_a \times \mathbb{T}^q$ is diffeomorphically equivalent to one of the canonical forms (6.9). For the systems (6.9), we have
\[ \omega_\alpha(I) = \sum_{\ell=1}^{r} \frac{\partial H(I)}{\partial I_\ell} a_\ell^\alpha + b_0^\alpha, \tag{6.12} \]
where $\alpha = 1, \dots, q$ and $r \leq p$. Applying Proposition 1, we obtain that the Lie algebra of symmetries $S$ is abelian if and only if the non-degeneracy condition (2.11) is satisfied. This is possible only if $r = p$ and $p \leq q$. Therefore the tori $\mathbb{T}^q$ either are Lagrangian ($q = p = k$) or coisotropic ($q = k+1, \dots, 2k$, $p = 2k - q$). For the functions $\omega_\alpha(I)$ (6.12), the non-degeneracy condition (2.11) takes the form (6.11) because the constant vectors $a_1, \dots, a_r \in \mathbb{R}^q$ are linearly independent. The corollary follows.

V. The introduced concept of B-integrability of Hamiltonian systems is a particular case of the following general concept of integrability of dynamical systems $\dot x^i = V^i(x)$ on manifolds $M^n$ of arbitrary dimension $n$.

Definition 9. A dynamical system $\dot x^i = V^i(x^1, \dots, x^n)$ on a manifold $M^n$ is integrable in the B-sense (or just B-integrable) if it possesses: $p$ functionally independent first integrals $F_1(x), \dots, F_p(x)$, where $0 \leq p < n$, and an abelian $(n - p)$-dimensional Lie subalgebra $S_a \subset S$ of symmetries that preserve the first integrals $F_j(x)$.

Proposition 5. Suppose a dynamical system $\dot x^i = V^i(x)$ on a manifold $M^n$ is B-integrable. Then the general submanifolds $M_c \subset O$ of constant level of first integrals $F_j(x) = c_j$ are tori $\mathbb{T}^{n-m}$ or toroidal cylinders $\mathbb{T}^q \times \mathbb{R}^{n-m-q}$. The dynamical system $\dot x^i = V^i(x)$ has the form (6.7) in the corresponding toroidal coordinates and therefore it is integrable in the conventional sense.

The proof is obtained by the same methods as the proof of Theorem 3.

7. The A-B-C-Cohomologies for Dynamical Systems

I. In paper [7], we introduced the cohomology $H^*(V, M^n)$ of the dynamical systems
\[ \dot x^i = V^i(x^1, \dots, x^n) \tag{7.1} \]
on the smooth manifolds $M^n$. In this section we generalize that construction and consider a representation of the Lie algebra of symmetries $S$ of the dynamical system $V$ (7.1) in the newly-constructed A-B-C-cohomologies.

Let $\Lambda_V^m$ be the linear space of the smooth differential $m$-forms on the manifold $M^n$ that are invariant with respect to the dynamical system $V$ (7.1). The operator $i_V$ of the interior product and the operator $d$ of the exterior derivation act on the $V$-invariant differential forms. Any $V$-invariant $m$-form $\omega_m$ is annihilated by the Lie derivative [1] $L_V = i_V \circ d + d \circ i_V$, $L_V \omega_m = 0$. Therefore the two operators $i_V$ and $d$ satisfy the equations
\[ i_V^2 = 0, \qquad d^2 = 0, \qquad i_V \circ d = -d \circ i_V \tag{7.2} \]
on the $V$-invariant differential forms. In view of Eqs. (7.2), the operator $d_V = i_V \circ d = -d \circ i_V$ satisfies the equations
\[ d_V^2 = 0, \qquad d_V \circ i_V = i_V \circ d_V = 0, \qquad d_V \circ d = d \circ d_V = 0. \tag{7.3} \]
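Each identity in (7.3) follows from (7.2) in one line. The derivation below is a sketch of that step, written out under the assumption that every operator acts on $V$-invariant forms, where the anticommutation relation in (7.2) is available:

```latex
\begin{align*}
d_V^2 &= i_V\,(d\, i_V)\, d = -\,i_V^2\, d^2 = 0, \\
d_V \circ i_V &= i_V\,(d\, i_V) = -\,i_V^2\, d = 0, &
i_V \circ d_V &= i_V^2\, d = 0, \\
d_V \circ d &= i_V\, d^2 = 0, &
d \circ d_V &= (d\, i_V)\, d = -\,i_V\, d^2 = 0.
\end{align*}
```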
The linear spaces Λm V of the V -invariant differential forms with the three operators iV , d and dV = iV ◦ d form the A-, B- and C-complexes respectively – see Fig. 1. These complexes are invariant with respect to a shift in the vertical direction. Therefore their cohomologies have only one index. II. We define three different rings of the A-, B- and C-cohomologies:
\[ H_A^*(V, M^n), \qquad H_B^*(V, M^n), \qquad H_C^*(V, M^n). \tag{7.4} \]

Fig. 1. The A-B-C-complexes of $V$-invariant differential forms
The ring structures in the cohomologies (7.4) are induced by the wedge product of the $V$-invariant differential forms. The rings $H_A^*(V, M^n)$, $H_B^*(V, M^n)$ and $H_C^*(V, M^n)$ are the cohomologies with respect to the operators $i_V$, $d$ and $d_V$ respectively:
\[ H_A^*(V, M^n) = \operatorname{Ker} i_V / \operatorname{Im} i_V, \qquad H_B^*(V, M^n) = \operatorname{Ker} d / \operatorname{Im} d, \qquad H_C^*(V, M^n) = \operatorname{Ker} d_V / \operatorname{Im} d_V. \]
The linear spaces (7.4) inherit the ring structure with respect to the wedge product of the $V$-invariant differential forms because the operators $i_V$ and $d$ are skew-derivations and the operator $d_V = i_V \circ d$ is a derivation: $d_V(\omega \wedge \eta) = d_V\omega \wedge \eta + \omega \wedge d_V\eta$. Here $\omega$ and $\eta$ are arbitrary differential forms.

Recall that the B-cohomologies $H_B^*(V, M^n)$ coincide with those introduced in our paper [7]. Probably, the B-cohomologies are the most important among the others (7.4) because of their interrelationships with the de Rham cohomologies [7,9,16].

Remark 7. For any constant $c$, the operator $d_c = d + c\, i_V$ satisfies the equation $d_c^2 = 0$ on the $V$-invariant differential forms. Therefore there exist the cohomologies with respect to the operator $d_c$ as well.

III. An example. Let us consider the dynamical system
\[ V: \quad \dot J_i = 0, \qquad \dot\varphi_i = J_i \tag{7.5} \]
in the toroidal domain O = Ba ×Tk , where Ba is a k-dimensional ball, and Ji ∈ Ba , ϕi ∈ Tk . The system (7.5) is the canonical form [9] of all Liouville-integrable Hamiltonian systems that are non-degenerate in the Kolmogorov sense and have compact invariant submanifolds. In paper [9], we demonstrated that the V -invariant 1-, 2-, and 3-forms have the form
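\[ \omega_1 = \theta_i(J)\, dJ_i, \tag{7.6} \]
\[ \omega_2 = a_{i\ell}(J)\, dJ_i \wedge dJ_\ell + b_{i\ell}(J)\, dJ_i \wedge d\varphi_\ell, \tag{7.7} \]
\[ \omega_3 = b_{i\ell m}(J)\, dJ_i \wedge dJ_\ell \wedge d\varphi_m + c_{i\ell m}(J)\, dJ_i \wedge dJ_\ell \wedge dJ_m. \tag{7.8} \]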
Here the coefficients $a_{i\ell}(J)$ and $c_{i\ell m}(J)$ are alternating and the coefficients $b_{i\ell}(J)$ and $b_{i\ell m}(J)$ satisfy the equations $b_{i\ell}(J) = b_{\ell i}(J)$, $b_{i\ell m}(J) + b_{\ell m i}(J) + b_{m i\ell}(J) = 0$.

(a) For the $V$-invariant differential forms (7.6)-(7.8), we have
\[ i_V\omega_1 = 0, \qquad i_V\omega_2 = -b_{i\ell}(J) J_\ell\, dJ_i, \qquad i_V\omega_3 = b_{i\ell m}(J) J_m\, dJ_i \wedge dJ_\ell. \tag{7.9} \]
Hence we find $H_A^1(V, O) = 0$, $H_A^2(V, O) = \mathbb{R}^\infty$. The elements of the cohomology group $H_A^2(V, O)$ are represented by the 2-forms
\[ \omega_{2a} = b_{i\ell}(J)\, dJ_i \wedge d\varphi_\ell, \tag{7.10} \]
where $b_{i\ell}(J) = b_{\ell i}(J)$ and $b_{i\ell}(J) J_\ell = 0$.

(b) In papers [7,9], we proved that $H_B^1(V, O) = 0$, $H_B^2(V, O) = \mathbb{R}^\infty$, $H_B^3(V, O) = 0$. The elements of the cohomology group $H_B^2(V, O)$ are represented by the closed 2-forms
\[ \omega_{2b} = \frac{\partial^2 B(J)}{\partial J_i \partial J_\ell}\, dJ_i \wedge d\varphi_\ell, \tag{7.11} \]
where $B(J)$ is an arbitrary smooth function.

(c) Eqs. (7.7) and (7.9) yield $d_V\omega_2 = -d\, i_V\omega_2 = d(b_{i\ell}(J) J_\ell) \wedge dJ_i$. Therefore the cohomologies $H_C^2(V, M^n) = \mathbb{R}^\infty$ are represented by the 2-forms
\[ \omega_{2c} = a_{i\ell}(J)\, dJ_i \wedge dJ_\ell + b_{i\ell}(J)\, dJ_i \wedge d\varphi_\ell, \qquad b_{i\ell}(J) J_\ell = \frac{\partial C(J)}{\partial J_i}, \tag{7.12} \]
where $C(J)$ is an arbitrary smooth function and $b_{i\ell}(J) = b_{\ell i}(J)$.

The formulae (7.10)-(7.12) imply that the three cohomology groups $H_A^2(V, O)$, $H_B^2(V, O)$, $H_C^2(V, O)$ are different and that the B-cohomology group $H_B^2(V, O)$ is the "smallest" one.
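The computations behind (7.9) and (7.12) are short enough to write out. The following lines are a sketch for the system (7.5), for which $dJ_i(V) = 0$ and $d\varphi_\ell(V) = J_\ell$. The last line is an added observation rather than a statement from the text: under these formulas a form (7.11) satisfies the condition in (7.12) with the hypothetical choice $C = J_\ell\, \partial B/\partial J_\ell - B$.

```latex
\begin{align*}
i_V(dJ_i \wedge d\varphi_\ell)
  &= dJ_i(V)\, d\varphi_\ell - d\varphi_\ell(V)\, dJ_i = -J_\ell\, dJ_i
  \quad\Longrightarrow\quad
  i_V\omega_2 = -b_{i\ell}(J) J_\ell\, dJ_i, \\
d_V\omega_{2c}
  &= -d\, i_V\omega_{2c}
   = d\!\left(\frac{\partial C}{\partial J_i}\right) \wedge dJ_i
   = \frac{\partial^2 C}{\partial J_m \partial J_i}\, dJ_m \wedge dJ_i = 0, \\
\frac{\partial^2 B}{\partial J_i \partial J_\ell}\, J_\ell
  &= \frac{\partial}{\partial J_i}\!\left( J_\ell \frac{\partial B}{\partial J_\ell} - B \right).
\end{align*}
```

The second line vanishes because the second derivatives of $C$ are symmetric in $(m, i)$ while the wedge product is antisymmetric.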
IV. Proposition 6. The Lie derivative operator $L_U$ defines the representations $R_m$ of the Lie algebra $S$ of symmetries of a dynamical system $V$ in the cohomologies $H_A^m(V, M^n)$, $H_B^m(V, M^n)$, $H_C^m(V, M^n)$. The representations $R_m$ in the B-cohomologies $H_B^m(V, M^n)$ are trivial.

Proof. For any symmetry $U \in S$ and any $V$-invariant $(\ell, k)$ tensor $T_k^\ell$, we have
\[ L_V L_U T_k^\ell = (L_U L_V - L_{[U,V]}) T_k^\ell = 0. \tag{7.13} \]
Hence the $(\ell, k)$ tensor $L_U T_k^\ell$ is $V$-invariant.

(a) Any element of the group $H_A^m(V, M^n)$ has the form $\omega_m + i_V\omega_{m+1}$, where $i_V\omega_m = 0$. The two operators $i_V$ and $L_U$ commute in view of the identity $L_U i_V - i_V L_U = i_{[U,V]}$ [1] and the equation $[U, V] = 0$. Hence we obtain
\[ L_U(\omega_m + i_V\omega_{m+1}) = L_U\omega_m + i_V L_U\omega_{m+1}, \qquad i_V L_U\omega_m = L_U i_V\omega_m = 0. \]
Therefore, the representation $R_m$ is defined correctly.

(b) Any element of the group $H_B^m(V, M^n)$ has the form $\omega_m + d\omega_{m-1}$, where $d\omega_m = 0$. Applying the Cartan formula $L_U = d \circ i_U + i_U \circ d$, we find
\[ L_U(\omega_m + d\omega_{m-1}) = d(i_U\omega_m + L_U\omega_{m-1}). \]
This element represents zero in the group $H_B^m(V, M^n)$.

(c) Any element of the group $H_C^m(V, M^n)$ has the form $\omega_m + d_V\tilde\omega_m$, where $d_V\omega_m = 0$. Applying Eqs. (7.2) and the equation $i_U \circ i_V = -i_V \circ i_U$, we obtain
\[ L_U(\omega_m + d_V\tilde\omega_m) = L_U\omega_m + d_V L_U\tilde\omega_m, \]
\[ d_V L_U\omega_m = i_V d(d \circ i_U + i_U \circ d)\omega_m = d\, i_U i_V d\omega_m = d\, i_U d_V\omega_m = 0. \]
Therefore the representation $R_m$ is defined correctly.

Proposition 6 implies that any one-parameter group of symmetries of a dynamical system $V$ defines the identity representation of $\mathbb{R}^1$ in the B-cohomologies $H_B^m(V, M^n)$. Therefore, any connected Lie group of symmetries $G$ preserves the B-cohomologies $H_B^*(V, M^n)$ and does not preserve in general the cohomologies $H_A^*(V, M^n)$ and $H_C^*(V, M^n)$.

Remark 8. In his 1986 paper [33], Oevel proved the validity of Eq. (7.13) provided the vector field $U$ is a scaling master symmetry that satisfies the equation $[U, V] = cV$, where $c = \mathrm{const}$. Therefore Proposition 6 is true also when the Lie algebra of symmetries $S$ is replaced by the Lie algebra of scaling master symmetries $M_{sc}$.

V. We introduce further cohomologies using Eqs. (7.2) and (7.3). The A-cohomologies $H_A^*(V, M^n)$ form a differential complex with respect to the operator $d$:
\[ 0 \longrightarrow H_A^0(V, M^n) \xrightarrow{\ d\ } H_A^1(V, M^n) \xrightarrow{\ d\ } \cdots \xrightarrow{\ d\ } H_A^n(V, M^n) \longrightarrow 0. \tag{7.14} \]
Denote by $H_{AB}^*(V, M^n)$ the cohomology ring of the complex (7.14).

The B-cohomologies $H_B^*(V, M^n)$ form a differential complex with respect to the operator $i_V$:
\[ 0 \longleftarrow H_B^0(V, M^n) \xleftarrow{\ i_V\ } H_B^1(V, M^n) \xleftarrow{\ i_V\ } \cdots \xleftarrow{\ i_V\ } H_B^n(V, M^n) \longleftarrow 0. \tag{7.15} \]
Denote by $H_{BA}^*(V, M^n)$ the cohomology ring of the complex (7.15). Recall that the operator $d_V$ annihilates the cohomologies $H_A^*(V, M^n)$ and $H_B^*(V, M^n)$.

The C-cohomologies $H_C^*(V, M^n)$ form two differential complexes with respect to the operators $i_V$ and $d$:
\[ 0 \longleftarrow H_C^0(V, M^n) \xleftarrow{\ i_V\ } H_C^1(V, M^n) \xleftarrow{\ i_V\ } \cdots \xleftarrow{\ i_V\ } H_C^n(V, M^n) \longleftarrow 0, \]
\[ 0 \longrightarrow H_C^0(V, M^n) \xrightarrow{\ d\ } H_C^1(V, M^n) \xrightarrow{\ d\ } \cdots \xrightarrow{\ d\ } H_C^n(V, M^n) \longrightarrow 0. \]
Denote by $H_{CA}^*(V, M^n)$ and $H_{CB}^*(V, M^n)$ the cohomology rings of these two complexes respectively.

Remark 9. The Lie derivative operator $L_U$ defines the representations $R_m$ of the Lie algebra $S$ of symmetries of the dynamical system $V$ in the cohomologies $H_{AB}^m(V, M^n)$, $H_{BA}^m(V, M^n)$, $H_{CA}^m(V, M^n)$, $H_{CB}^m(V, M^n)$. The same methods as in Proposition 6 prove that the representations $R_m$ in the cohomologies $H_{AB}^m(V, M^n)$ and $H_{BA}^m(V, M^n)$ are trivial.
Chapter II. Applications of Theorem on Symmetries

8. Applications to the Integrable Hierarchies Problem

Theorem 1 has the following applications to the integrable hierarchies problem.

Theorem 4. Suppose system (3.1) is a C-integrable non-degenerate Hamiltonian system. If this system preserves a (1,1) tensor $A^\alpha_\beta(x)$ then the following statements are true:
1) Dynamical systems
\[ \dot x^\alpha = (L(A, x)V)^\alpha = U_L^\alpha \tag{8.1} \]
are C-integrable. Here $L(A, x)$ is a Laurent polynomial
\[ L(A, x) = \sum_{m=-N}^{N} a_m(x) A^m(x), \tag{8.2} \]
where $a_m(x)$ are arbitrary first integrals of the Hamiltonian system (3.1). If the (1,1) tensor $A(x)$ is degenerate then $m \geq 0$ in (8.2).
2) Dynamical systems
\[ \dot x^\alpha = (L_1(A, x) P_1 L_2(A, x)\, dF(x))^\alpha = U_F^\alpha \tag{8.3} \]
are C-integrable. Here F (x) is an arbitrary first integral of the Hamiltonian system (3.1) and L1 (A, x) and L2 (A, x) are arbitrary Laurent polynomials (8.2). 3) Flows of all dynamical systems (8.1) and (8.3) pairwise commute. 4) All vectors UL (x) and UF (x) belong to an integrable k-dimensional distribution L ⊂ T (M 2k ). The integral submanifolds of this distribution are tori Tk that are invariant with respect to all integrable systems (3.1), (8.1) and (8.3). Proof. The invariance of the (1,1) tensor Aij (x) means that LV A = 0,
(8.4)
where LV is the Lie derivative with respect to the C-integrable flow (3.1). Equation (8.4) yields (8.5) LV (Am ) = 0 for all integers m. The vector field V , the Poisson structure P1 and first integrals am (x), F (x) satisfy the invariance equations LV V = 0,
LV P1 = 0,
LV am (x) = 0,
LV F (x) = 0.
(8.6)
Equations (8.5) and (8.6) imply LV UL = [V, UL ] = 0,
LV UF = [V, UF ] = 0,
(8.7)
where UL and UF are the vector fields corresponding to the dynamical systems (8.1) and (8.3) respectively. Equations (8.7) mean that the vector fields UL and UF are symmetries of the C-integrable non-degenerate Hamiltonian system (3.1). Therefore, applying Theorem 1, we obtain that the corresponding dynamical systems (8.1) and (8.3) are Cintegrable and their flows pairwise commute and are tangent to the integrable distribution Lx = Tx (Tk ).
Remark 10. In [9] we solved the integrable hierarchies problem for the special case when the C-integrable Hamiltonian system (3.1) preserves two incompatible non-degenerate Poisson structures $P_1$ and $P_2$. For this case the (1,1) tensor $A^\alpha_\beta(x)$ is the recursion operator $A = P_1 P_2^{-1}$. Theorem 4 is applicable in the more general situation when $A^\alpha_\beta(x)$ is an arbitrary C-invariant (1,1) tensor. The remarkable fact is that the proof of Theorem 4 does not use any properties of the (1,1) tensor $A^\alpha_\beta(x)$ except its C-invariance.

9. Applications to the Necessary Conditions for Strong Dynamical Compatibility Problem

I. Let $P^{\alpha\beta}$ and $Q^{\alpha\beta}$ be two alternating (2,0) tensors. Their Schouten bracket is the alternating (3,0) tensor
\[ [P, Q]^{\alpha\beta\gamma} = P^{\alpha\beta}_{,\tau} Q^{\tau\gamma} + P^{\beta\gamma}_{,\tau} Q^{\tau\alpha} + P^{\gamma\alpha}_{,\tau} Q^{\tau\beta} + Q^{\alpha\beta}_{,\tau} P^{\tau\gamma} + Q^{\beta\gamma}_{,\tau} P^{\tau\alpha} + Q^{\gamma\alpha}_{,\tau} P^{\tau\beta}. \tag{9.1} \]
Recall the following definition [5, 9].

Definition 10. Two Poisson structures $P$ and $Q$ on a manifold $M^n$, $n = 2k$, are called strongly dynamically compatible if there exists a dynamical system
\[ \dot x^\alpha = V^\alpha(x^1, \cdots, x^n) \tag{9.2} \]
that preserves both of them,
\[ L_V P = 0, \qquad L_V Q = 0, \tag{9.3} \]
and that is C-integrable and non-degenerate with respect to some non-degenerate Poisson structure $P_1$ on the manifold $M^n$.

II. If one of the Poisson structures $P$ and $Q$ is non-degenerate (for example $Q$) then there exists the recursion operator
\[ A^\alpha_\beta = P^{\alpha\gamma} (Q^{-1})_{\gamma\beta}. \tag{9.4} \]
We define the vector fields $U_0$ and $U_\ell$:
\[ U_0^\alpha = [P, Q]^{\alpha\beta\gamma} (Q^{-1})_{\gamma\beta}, \qquad U_\ell^\alpha = (A^\ell U_0)^\alpha, \tag{9.5} \]
the invariant functions
\[ H_m(x) = \operatorname{Tr} A^m(x), \tag{9.6} \]
and the vector fields $X_\tau$:
\[ X_\tau^\alpha = (A^\ell)^\alpha_\beta P^{\beta\gamma} H_{m,\gamma}, \tag{9.7} \]
where $\tau = (\ell, m)$ is a multi-index.

Theorem 5. If a Poisson structure $P$ and a non-degenerate Poisson structure $Q$ are strongly dynamically compatible then the following necessary conditions are satisfied:

1) All vector fields $U_\ell$ (9.5) and $X_\tau$ (9.7) pairwise commute.

2) The distribution $B_1 \subset T(M^n)$ that is generated by the vector fields (9.5) is integrable. The closure of any integral submanifold of the distribution $B_1$ is a $d$-dimensional torus $\mathbb{T}^d$, $d \le k$.

3) All vector fields $U_\ell$ and $X_\tau$ and functions $H_m(x)$ satisfy the equations
\[ U_\ell(H_m) = 0, \qquad X_\tau(H_m) = 0 \tag{9.8} \]
for all positive integers $\ell$ and $m$.

4) All functions $H_m(x)$ are in involution with respect to both Poisson structures $P$ and $Q$:
\[ \{H_\ell, H_m\}_P = P^{\alpha\beta} H_{\ell,\alpha} H_{m,\beta} = 0, \qquad \{H_\ell, H_m\}_Q = Q^{\alpha\beta} H_{\ell,\alpha} H_{m,\beta} = 0. \]
Any scalar invariants of the two Poisson structures $P$ and $Q$ are in involution.

Proof. 1)-2) If the two Poisson structures $P$ and $Q$ are invariant with respect to the C-integrable Hamiltonian system (9.2) then all tensors which can be constructed from these Poisson structures are also invariant. Hence the vector fields $Y_\mu = U_\ell$ or $X_\tau$ commute with the vector field $V$ (9.2). That means that the vector fields $Y_\mu$ are symmetries of the C-integrable system (9.2). Applying Theorem 1, Part 1, and the general Lemma 1 on symmetries, we obtain that all vector fields $Y_\mu$ pairwise commute and have the form (2.12) in the action-angle coordinates $I_j$, $\varphi_\ell$. Hence the distribution $B_1$ is integrable and the closures of its fibres are tori $\mathbb{T}^d \subset \mathbb{T}^k$.

3) Eqs. (9.8) follow from Theorem 1, Part 2.

4) All vector fields
\[ X_m^\alpha = P^{\alpha\beta} H_{m,\beta}, \qquad \tilde X_m^\alpha = Q^{\alpha\beta} H_{m,\beta} \]
are symmetries of the C-integrable non-degenerate Hamiltonian system (9.2) because they are contractions of the C-invariant tensors. Therefore, applying Theorem 1, Part 2, we obtain
\[ \{H_\ell, H_m\}_P = X_m(H_\ell) = 0, \qquad \{H_\ell, H_m\}_Q = \tilde X_m(H_\ell) = 0. \]
Arbitrary scalar invariants of the two Poisson structures $P$ and $Q$ are first integrals of the C-integrable dynamical system (9.2). Therefore they are in involution for the same reasons as the first integrals $H_m$.

If both Poisson structures $P$ and $Q$ are degenerate then the recursion operator $A$ (9.4) does not exist in general [36,37]. Therefore, to derive the necessary conditions for strong dynamical compatibility of two degenerate Poisson structures we need another construction of tensor invariants. Such invariant geometric objects were first presented in our paper [9]. In the following section we develop the new geometric constructions.

10. Tensor Invariants of Two Degenerate Poisson Structures

I. Let $\Lambda(T(M^n))$ be the exterior algebra of the tangent bundle $T(M^n)$, $\Lambda(T(M^n)) = \Lambda^0 \oplus \Lambda^1 \oplus \cdots \oplus \Lambda^n$. Let $R_\ell \in \Lambda^{n-2}(T(M^n))$ be the wedge product of $\ell$ factors $P$ and $k - 1 - \ell$ factors $Q$, $\ell = 0, 1, \cdots, k-1$:
\[ R_\ell = P \wedge \cdots \wedge P \wedge Q \wedge \cdots \wedge Q. \]
Let $\omega_n$ be an arbitrary non-degenerate $n$-form on $M^n$. We define $k$ 2-forms $\eta_\ell$ by the formulae
\[ (\eta_\ell)_{\beta\gamma} = (\omega_n)_{\beta\gamma\mu_1\cdots\mu_{n-2}} R_\ell^{\mu_1\cdots\mu_{n-2}}. \tag{10.1} \]
Using the $k$ 2-forms $\eta_\ell$ we define a large variety of geometric objects that contains the $2k$ (1,1) tensors $A_\ell$, $A_{\ell+k}$, the functions $f_{rm}(x)$, the $k$ (2,1) tensors $S_\ell$ and the $k$ vector fields $U_\ell$. These tensors are defined by the formulae
\[ (A_\ell)^\alpha_\beta = P^{\alpha\gamma} (\eta_\ell)_{\gamma\beta}, \qquad (A_{\ell+k})^\alpha_\beta = Q^{\alpha\gamma} (\eta_\ell)_{\gamma\beta}, \tag{10.2} \]
\[ f_{rm}(x) = \left(\operatorname{Tr}(A_r(x))^m\right)^{1/m}, \tag{10.3} \]
\[ (S_\ell)^{\alpha\beta}_\gamma = [P, Q]^{\alpha\beta\tau} (\eta_\ell)_{\tau\gamma}, \tag{10.4} \]
\[ (U_\ell)^\alpha = [P, Q]^{\alpha\beta\gamma} (\eta_\ell)_{\gamma\beta} = (S_\ell)^{\alpha\beta}_\beta. \tag{10.5} \]
The 2-forms $\eta_\ell$, the tensors $A_\ell$, $A_{\ell+k}$, $S_\ell$, $U_\ell$ and the functions $f_{rm}(x)$ are defined uniquely up to a common factor $f(x)$ because the $n$-form $\omega_n$ is defined uniquely up to a factor $f(x)$.

II. Let us define the invariant maps $f_r$, $F_m$ and $F_0$ of the manifold $M^{2k}$ into the real projective spaces $\mathbb{RP}^N$:
\[ f_r : M^{2k} \longrightarrow \mathbb{RP}^{k-1}, \qquad f_r(x) = f_{r1}(x) : f_{r2}(x) : \cdots : f_{rk}(x) \in \mathbb{RP}^{k-1}. \tag{10.6} \]
The map $f_r$ completely characterizes the class of proportionality of the eigenvalues of the (1,1) tensor $A_r(x)$. Indeed, all eigenvalues of the (1,1) tensor $A_r(x)$ (10.2) have even multiplicities. Therefore the $k$ functions $f_{rm}(x)$ (10.3) for $m = 1, \cdots, k$ contain all information about these eigenvalues. We define the map
\[ F_m : M^{2k} \longrightarrow \mathbb{RP}^{2k-1}, \qquad F_m(x) = f_{0m}(x) : f_{1m}(x) : \cdots : f_{2k-1,m}(x) \in \mathbb{RP}^{2k-1}, \tag{10.7} \]
where $m = 1, \cdots, k$. The most general map $F_0$ has the form
\[ F_0 : M^{2k} \longrightarrow \mathbb{RP}^{2k^2-1}, \qquad F_0(x) = f_{01}(x) : f_{02}(x) : \cdots : f_{2k-1,k}(x) \in \mathbb{RP}^{2k^2-1}. \tag{10.8} \]
The maps $F$: $f_r$, $F_m$ and $F_0$ are not defined at the points $x$ where all functions $f_{rm}(x) = 0$.

III. Let a function $f_{ij}(x)$ be non-zero in an open domain $O \subset M^{2k}$. We define the functions
\[ F_\mu(x) = \frac{f_{rm}(x)}{f_{ij}(x)}, \qquad x \in O, \tag{10.9} \]
where $\mu = (i, j, r, m)$ is a multi-index.

Proposition 7. The functions $F_\mu(x)$ and all maps $f_r$ (10.6), $F_m$ (10.7) and $F_0$ (10.8) are first integrals of any dynamical system (9.2) that preserves the two Poisson structures $P$ and $Q$.

Proof. Let $\varphi_t : M^n \to M^n$ be the one-parameter group of diffeomorphisms defined by the system (9.2). For each $t$, the diffeomorphism $\varphi_t$ preserves the Poisson structures $P$ and $Q$:
\[ \varphi_t^* P = P, \qquad \varphi_t^* Q = Q. \tag{10.10} \]
For the $n$-form $\omega_n(x)$ we have
\[ \varphi_t^* \omega_n(x) = g_t(x)\,\omega_n(x), \tag{10.11} \]
because the space of alternating $n$-forms $\omega_n(x)$ is one-dimensional. Here $g_t(x) > 0$ is some smooth function. Equations (10.10) and (10.11) imply
\[ \varphi_t^* f_{rm}(x) = g_t(x) f_{rm}(x) \]
for the function $f_{rm}(x)$ (10.3). Therefore formula (10.9) yields $\varphi_t^* F_\mu(x) = F_\mu(x)$. Hence the functions $F_\mu(x)$ and all maps $f_r$, $F_m$ and $F_0$ are first integrals of any dynamical system (9.2), (9.3).

Definition 11. The distribution $B_1 \subset T(M^n)$ is generated by the $k$ vector fields $U_\ell$. The distribution $B_2 \subset T(M^n)$ is generated by all vector fields
\[ U_{\sigma\ell} = A_{\sigma_1} \cdots A_{\sigma_p} U_\ell, \qquad 0 \le \sigma_j \le 2k - 1, \tag{10.12} \]
where $\sigma = (\sigma_1, \cdots, \sigma_p)$ is a multi-index, $|\sigma| = p$, $p \ge 0$.

It is evident that the distributions $B_1$ and $B_2$ are determined uniquely by the two alternating (2,0) tensors $P$ and $Q$. The distribution $B_2$ is the minimal distribution that contains the distribution $B_1$ and that is invariant with respect to all operators (10.2).

Proposition 8. The distributions $B_1$ and $B_2$ are invariant with respect to any dynamical system (9.2) that preserves the two Poisson structures $P$ and $Q$.

Proof. The formulae (10.10) and (10.11) imply
\[ \varphi_t^* A_\sigma(x) = g_t(x) A_\sigma(x), \qquad \varphi_t^* U_\ell(x) = g_t(x) U_\ell(x) \tag{10.13} \]
for any diffeomorphism $\varphi_t$ defined by the dynamical system (9.2). Here $A_\sigma(x)$ and $U_\ell(x)$ are the (1,1) tensors (10.2) and the vector fields (10.5) respectively. The formulae (10.13) prove that the distributions $B_1$ and $B_2$ generated by the vector fields $U_\ell$ and $U_{\sigma\ell}$ (10.12) are invariant with respect to any dynamical system (9.2), (9.3).

Remark 11. Two Poisson structures $P$ and $Q$ are called compatible in Magri's sense [27] if their Schouten bracket (9.1) vanishes: $[P, Q] = 0$. For compatible pairs of Poisson structures all (2,1) tensors $S_\ell$ (10.4) and vector fields $U_\ell$ (10.5) vanish and therefore the distributions $B_1$ and $B_2$ are zero.
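IV. Theorem 6. Suppose two incompatible Poisson structures $P$ and $Q$ are strongly dynamically compatible on a smooth manifold $M^{2k}$. Then the following necessary conditions are satisfied:

1) All functions $F_\mu(x)$ (10.9) are in involution with respect to both Poisson structures $P$ and $Q$:
\[ \{F_\mu, F_\nu\}_P = P^{\alpha\beta} F_{\mu,\alpha} F_{\nu,\beta} = 0, \qquad \{F_\mu, F_\nu\}_Q = Q^{\alpha\beta} F_{\mu,\alpha} F_{\nu,\beta} = 0, \]
where $\alpha, \beta = 1, \cdots, 2k$.

2) Any vector field $U_{\sigma\ell}$ (10.12) and any two non-zero functions $f_{rm}(x)$ and $f_{ij}(x)$ (10.3) satisfy the equation
\[ U_{\sigma\ell}(\log|f_{rm}(x)|) = U_{\sigma\ell}(\log|f_{ij}(x)|). \tag{10.14} \]

3) Any two vector fields $U_{\sigma\ell}$ and $U_{\tau j}$ (10.12) satisfy the equation
\[ [U_{\sigma\ell}, U_{\tau j}] = (q + 1)\,U_{\sigma\ell}(h)\,U_{\tau j} - (p + 1)\,U_{\tau j}(h)\,U_{\sigma\ell}, \tag{10.15} \]
where $|\sigma| = p \ge 0$ and $|\tau| = q \ge 0$, and $h(x) = \log|f_{rm}(x)|$, where $f_{rm}(x) \ne 0$.

4) The maps $F$: $f_r$, $F_m$ and $F_0$ satisfy the condition
\[ \operatorname{rank} dF \le k \tag{10.16} \]
and annihilate the distributions $B_1$ and $B_2$:
\[ dF(B_1) = 0, \qquad dF(B_2) = 0. \tag{10.17} \]

5) The distributions $B_1$ and $B_2$ are integrable. The closure of any integral submanifold of the distribution $B_1$ or of the distribution $B_2$ is a $d$-dimensional torus $\mathbb{T}^d$, $d \le k$.

Proof. 1) If two Poisson structures $P$ and $Q$ are strongly dynamically compatible then there exists a dynamical system (9.2) that preserves both of them and that is C-integrable and non-degenerate with respect to some non-degenerate Poisson structure $P_1$. That means the equations
\[ L_V P = 0, \qquad L_V Q = 0, \qquad L_V \omega_1 = 0 \tag{10.18} \]
hold, where $\omega_1 = P_1^{-1}$. Let $\bar\omega_n$ be the C-invariant $n$-form of volume:
\[ L_V \bar\omega_n = 0, \qquad \bar\omega_n = \omega_1 \wedge \cdots \wedge \omega_1. \tag{10.19} \]
Let $\bar\eta_\ell$ be the 2-forms
\[ (\bar\eta_\ell)_{\beta\gamma} = (\bar\omega_n)_{\beta\gamma\mu_1\cdots\mu_{n-2}} R_\ell^{\mu_1\cdots\mu_{n-2}}. \tag{10.20} \]
We define the tensors $\bar A_\ell$, $\bar S_\ell$, $\bar U_\ell$, $\bar U_{\sigma\ell}$ and the functions $\bar f_{rm}(x)$ as in (10.2)–(10.5) and in (10.12):
\[ (\bar A_\ell)^\alpha_\beta = P^{\alpha\gamma} (\bar\eta_\ell)_{\gamma\beta}, \qquad (\bar A_{\ell+k})^\alpha_\beta = Q^{\alpha\gamma} (\bar\eta_\ell)_{\gamma\beta}, \tag{10.21} \]
\[ \bar f_{rm} = \left(\operatorname{Tr}(\bar A_r)^m\right)^{1/m}, \qquad (\bar S_\ell)^{\alpha\beta}_\gamma = [P, Q]^{\alpha\beta\tau} (\bar\eta_\ell)_{\tau\gamma}, \tag{10.22} \]
\[ (\bar U_\ell)^\alpha = [P, Q]^{\alpha\beta\gamma} (\bar\eta_\ell)_{\gamma\beta}, \tag{10.23} \]
\[ \bar U_{\sigma\ell} = \bar A_{\sigma_1} \cdots \bar A_{\sigma_p} \bar U_\ell, \qquad |\sigma| = p. \tag{10.24} \]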
Equations (10.18) and (10.19) imply the invariance equations
\[ L_V \bar\eta_\ell = 0, \quad L_V \bar A_\ell = L_V \bar A_{\ell+k} = 0, \quad L_V \bar S_\ell = 0, \quad L_V \bar U_\ell = 0, \quad L_V \bar f_{rm}(x) = 0, \quad L_V \bar U_{\sigma\ell} = 0. \tag{10.25} \]
The functions $(\bar f_{rm}(x))^m$ are first integrals of the dynamical system (9.2) because they are contractions of invariant tensors. Any non-degenerate $n$-form $\omega_n$ has the form $\omega_n = f(x)\,\bar\omega_n$, where $f(x)$ is some smooth function on the manifold $M^n$. Therefore the tensors (10.1)–(10.5) are connected with the C-invariant tensors (10.20)–(10.23) by the formulae
\[ \eta_\ell = f \bar\eta_\ell, \qquad A_\ell = f \bar A_\ell, \qquad S_\ell = f \bar S_\ell, \qquad U_\ell = f \bar U_\ell. \tag{10.26} \]
Using the formulae (10.12), (10.24) and (10.26), we obtain
\[ U_{\sigma\ell} = f^{p+1}\, \bar U_{\sigma\ell}, \qquad p = |\sigma|. \tag{10.27} \]
The formulae (10.3), (10.22) and (10.26) yield
\[ f_{rm} = f \bar f_{rm}. \tag{10.28} \]
Equations (10.28) imply that the functions $F_\mu(x)$ (10.9) have the form
\[ F_\mu(x) = \frac{\bar f_{rm}(x)}{\bar f_{ij}(x)}, \qquad \mu = (i, j, r, m). \]
Therefore these functions are first integrals of the C-integrable non-degenerate Hamiltonian system (9.2). Hence the vector fields
\[ X_\mu^\alpha = P^{\alpha\beta} F_{\mu,\beta}, \qquad \tilde X_\mu^\alpha = Q^{\alpha\beta} F_{\mu,\beta} \]
are symmetries of system (9.2) because they are contractions of the C-invariant tensors. Applying Theorem 1, Part 2, we find
\[ \{F_\mu, F_\nu\}_P = X_\mu(F_\nu) = 0, \qquad \{F_\mu, F_\nu\}_Q = \tilde X_\mu(F_\nu) = 0. \]
2) The invariance Eqs. (10.25) mean that the vector fields $\bar U_\ell$, $\bar U_{\sigma\ell}$ are symmetries of the C-integrable non-degenerate Hamiltonian system (9.2). Therefore, applying Theorem 1, Part 2, we obtain the equation $\bar U_{\sigma\ell}(\bar f_{rm}) = 0$. Using this equation and formulae (10.27) and (10.28), we derive
\[ U_{\sigma\ell}(f_{rm}) = f^{p+2}\, \bar U_{\sigma\ell}(h)\, \bar f_{rm} + f^{p+2}\, \bar U_{\sigma\ell}(\bar f_{rm}) = U_{\sigma\ell}(h)\, f_{rm}, \tag{10.29} \]
where the function $h(x)$ has the form
\[ h(x) = \log|f(x)|. \tag{10.30} \]
Equation (10.29) yields
\[ U_{\sigma\ell}(h) = U_{\sigma\ell}(\log|f_{rm}|) \tag{10.31} \]
if $f_{rm} \ne 0$. This equality proves Eqs. (10.14).

3) Using formulae (10.27) and (10.30), we derive
\[ [U_{\sigma\ell}, U_{\tau j}] = [f^{p+1} \bar U_{\sigma\ell}, f^{q+1} \bar U_{\tau j}] = f^{p+q+2}\left((q + 1)\,\bar U_{\sigma\ell}(h)\,\bar U_{\tau j} - (p + 1)\,\bar U_{\tau j}(h)\,\bar U_{\sigma\ell} + [\bar U_{\sigma\ell}, \bar U_{\tau j}]\right). \tag{10.32} \]
Applying Theorem 1 to the symmetries $\bar U_\ell$, $\bar U_{\tau j}$, we obtain
\[ [\bar U_\ell, \bar U_m] = 0, \qquad [\bar U_\ell, \bar U_{\tau j}] = 0, \qquad [\bar U_{\sigma\ell}, \bar U_{\tau j}] = 0. \tag{10.33} \]
Equations (10.32) and (10.33) imply Eq. (10.15). The equality (10.31) proves that Eq. (10.15) is satisfied if the function $h(x)$ in (10.15) is equal to any function $\log|f_{rm}(x)|$, where $|f_{rm}(x)| \ne 0$.

4) The constructions of the maps $F$: $f_r$, $F_m$ and $F_0$ imply that they are defined in some open domains $O \subset M^n$ which are invariant with respect to any dynamical system that preserves the Poisson structures $P$ and $Q$. If the two Poisson structures $P$ and $Q$ are strongly dynamically compatible then such a dynamical system (9.2) does exist. Every map $F$ is a first integral of this system. Therefore, every map $F$ is constant on each trajectory of system (9.2) in the invariant open domain $O$ where $F$ is defined. Hence $F$ is constant on the tori $\mathbb{T}^k$ and the condition (10.16) follows. Definition 11 and formulae (10.26) and (10.27) imply that the distributions $B_1$ and $B_2$ are generated by the C-invariant vector fields $\bar U_\ell$ and $\bar U_{\sigma\ell}$ respectively. Applying Theorem 1, Part 2 to these symmetries and the first integrals $F_\mu$ (10.9), we obtain
\[ dF_\mu(\bar U_\ell) = 0, \qquad dF_\mu(\bar U_{\sigma\ell}) = 0. \tag{10.34} \]
Equations (10.34) prove Eqs. (10.17) because the functions $F_\mu(x)$ are components of the mappings $F$ into the projective spaces $\mathbb{RP}^N$.

5) The integrability of the distributions $B_1$ and $B_2$ follows from Eq. (10.15). Applying the general Lemma 1 and formulae (2.12) for the symmetries $\bar U_\ell$ and $\bar U_{\sigma\ell}$, we obtain that the closure of any integral submanifold of the distributions $B_1$ and $B_2$ is a torus $\mathbb{T}^d \subset \mathbb{T}^k$. Hence $d \le k$.

Remark 12. We introduced in [9] a distribution $B \subset T(M^n)$. For any two strongly dynamically compatible Poisson structures $P$ and $Q$ the inclusions $B_1 \subset B_2 \subset B$ hold. Indeed, these inclusions follow from the inclusions
\[ B_1 \subset B_2 \subset T(\mathbb{T}^k), \qquad T(\mathbb{T}^k) \subset B, \]
which are proved in Theorem 6 and in our paper [9] respectively. For any two Poisson structures $P$ and $Q$ which are compatible in Magri's sense, the two distributions $B_1$ and $B_2$ are zero and the distribution $B$ is maximal: $B = T(M^n)$. Theorem 6 and the results of paper [9] imply that the four distributions coincide, $B_1 = B_2 = T(\mathbb{T}^k) = B$, if the two Poisson structures $P$ and $Q$ are strongly dynamically compatible and are in general position.
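The commutator formula (10.32) used in part 3) of the proof of Theorem 6 is an instance of the Leibniz rule for the Lie bracket. The following short derivation is a sketch for two generic vector fields $U$, $W$ and a non-vanishing function $f$ with $h = \log|f|$; the names $U$, $W$, $a$, $b$ are placeholders introduced here and do not appear in the text.

```latex
\begin{align*}
[aU, bW] &= a\,U(b)\,W - b\,W(a)\,U + ab\,[U, W]
  && \text{(Leibniz rule, } a, b \text{ functions)} \\
U(f^{q+1}) &= (q+1)\,f^{q}\,U(f) = (q+1)\,f^{q+1}\,U(h)
  && \text{(since } U(f) = f\,U(\log|f|)\text{)} \\
[f^{p+1}U, f^{q+1}W]
  &= f^{p+1}(q+1)f^{q+1}\,U(h)\,W - f^{q+1}(p+1)f^{p+1}\,W(h)\,U + f^{p+q+2}[U, W] \\
  &= f^{p+q+2}\bigl((q+1)\,U(h)\,W - (p+1)\,W(h)\,U + [U, W]\bigr).
\end{align*}
```

Setting $U = \bar U_{\sigma\ell}$ and $W = \bar U_{\tau j}$ reproduces (10.32). Since $f^{p+1}\bar U_{\sigma\ell} = U_{\sigma\ell}$ and $f^{q+1}\bar U_{\tau j} = U_{\tau j}$ by (10.27), the first two terms are exactly the right-hand side of (10.15) once the last bracket vanishes by (10.33).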
Chapter III. Structure of Dynamical Systems (2.1)

11. An Alternative for the Analytic Dynamical Systems (2.1)

Let $X \subset B_a$ be the set of all points $I \in B_a$ for which the trajectories of the dynamical system (2.1) are $\mathbb{T}^q$-dense or, equivalently, the set of all points $I \in B_a$ for which the $q$ frequencies $\omega_1(I), \cdots, \omega_q(I)$ are incommensurable over the integers.

An Alternative. 1) If an analytic dynamical system (2.1) has one $\mathbb{T}^q$-dense trajectory then the set $X$ of the $\mathbb{T}^q$-dense trajectories is everywhere dense in the ball $B_a$ and its complement is a set of measure zero.

2) If there are no $\mathbb{T}^q$-dense trajectories then the dynamical system (2.1) is reduced to the form
\[ \dot I_1 = 0, \cdots, \dot I_p = 0, \qquad \dot\varphi^1_q = 0, \qquad \dot\varphi^1_1 = \omega^1_1(I), \cdots, \dot\varphi^1_{q-1} = \omega^1_{q-1}(I) \tag{11.1} \]
after a certain unimodular transformation of the angular coordinates
\[ \varphi^1_i = \sum_{j=1}^{q} b_{ij}\varphi_j, \qquad \det\|b_{ij}\| = 1, \tag{11.2} \]
where $b_{ij}$ are some integers and $\omega^1_\ell(I) = \sum_{j=1}^{q} b_{\ell j}\,\omega_j(I)$.

Proof. 1) Let $v = (k_1, \cdots, k_q) \in \mathbb{Z}^q$ be an arbitrary nonzero vector with integer coordinates. Let $S_v \subset B_a$ be the set of all points $I$ which satisfy the equation
\[ k_1\omega_1(I) + \cdots + k_q\omega_q(I) = 0. \tag{11.3} \]
If there exists a $\mathbb{T}^q$-dense trajectory then the corresponding point $I_0$ does not belong to any set $S_v$. In this case each set $S_v$ has measure zero because the functions $\omega_\alpha(I)$ are analytic. The countable union of all sets $S_v \subset B_a$ also has measure zero. Therefore the complement to the set $X$, that is $B_a \setminus X = \cup\, S_v$, is a set of measure zero.

2) If there are no $\mathbb{T}^q$-dense trajectories then we have
\[ B_a = \cup\, S_v. \tag{11.4} \]
In this case at least one equation (11.3) is true identically in the ball $B_a$, because otherwise each set $S_v$ would have measure zero and we would get a contradiction with the equality (11.4). Hence the functions $\omega_\alpha(I)$ identically satisfy an equation
\[ m_1\omega_1(I) + \cdots + m_q\omega_q(I) = 0, \qquad m_\alpha \in \mathbb{Z}, \tag{11.5} \]
for all points $I \in B_a$. This equation is reduced to one with relatively prime integers $m_1, \cdots, m_q$. It is well known [20] that there exists a unimodular $q \times q$ matrix with integer entries $b_{ij}$ such that its $q$th row coincides with the integers $m_1, \cdots, m_q$. We define new angle coordinates $\varphi^1_1, \cdots, \varphi^1_q$ on the torus $\mathbb{T}^q$ by the formulae (11.2). Equations (2.1) imply
\[ \dot\varphi^1_1 = \omega^1_1(I), \cdots, \dot\varphi^1_{q-1} = \omega^1_{q-1}(I), \qquad \dot\varphi^1_q = m_1\omega_1(I) + \cdots + m_q\omega_q(I) = 0, \]
where $\omega^1_\ell(I) = \sum_{j=1}^{q} b_{\ell j}\,\omega_j(I)$. Therefore the $q$th angle coordinate $\varphi^1_q = m_1\varphi_1 + \cdots + m_q\varphi_q$ is an additional first integral of the system (2.1), and therefore this system takes the form (11.1).
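The existence result quoted from [20] in the proof above is constructive. The following minimal sketch completes a relatively prime integer row to a unimodular matrix by running the Euclidean algorithm on pairs of entries. The function name `unimodular_with_last_row` and the particular row-operation scheme are assumptions of this illustration, not the construction of [20].

```python
from functools import reduce
from math import gcd


def unimodular_with_last_row(m):
    """Return an integer matrix B with det B = 1 whose last row equals m.

    Hypothetical helper for illustration. Invariant kept throughout:
    sum_k v[k] * B[k] == m (as row vectors), starting from v = m, B = identity.
    For len(m) == 1 the determinant is m[0] itself and cannot be adjusted.
    """
    q = len(m)
    if reduce(gcd, m, 0) != 1:
        raise ValueError("entries must be relatively prime")
    v = list(m)
    B = [[int(i == j) for j in range(q)] for i in range(q)]
    det = 1  # determinant of B, tracked through the row operations
    last = q - 1
    for i in range(last):
        # Euclidean algorithm on the pair (v[i], v[last]) until v[i] == 0.
        while v[i] != 0:
            t = v[last] // v[i]
            v[last] -= t * v[i]
            B[i] = [a + t * b for a, b in zip(B[i], B[last])]  # keeps invariant
            v[i], v[last] = v[last], v[i]
            B[i], B[last] = B[last], B[i]  # a row swap flips the determinant
            det = -det
    if v[last] == -1:  # normalise the remaining entry to +1
        B[last] = [-a for a in B[last]]
        det = -det
    if det == -1 and q > 1:  # fix the sign without touching the last row
        B[0] = [-a for a in B[0]]
    return B


B = unimodular_with_last_row([6, 10, 15])
```

With `B` playing the role of $b_{ij}$ in (11.2), the last new angle is the resonant combination $m_1\varphi_1 + \cdots + m_q\varphi_q$, as in the proof.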
We apply the Alternative to prove the following theorem.

Theorem 7, Part 1. Suppose the dynamical system (2.1) is analytic and the maximal dimension of the closures of its trajectories is equal to $\ell \le q$. Then there exists a unimodular transformation of the angular coordinates
\[ \varphi^1_i = \sum_{j=1}^{q} a_{ij}\varphi_j, \qquad \det\|a_{ij}\| = 1, \tag{11.6} \]
that transforms system (2.1) to the form
\[ \dot I_1 = 0, \cdots, \dot I_p = 0, \qquad \dot\varphi^1_{\ell+1} = 0, \cdots, \dot\varphi^1_q = 0, \qquad \dot\varphi^1_1 = \omega^1_1(I), \cdots, \dot\varphi^1_\ell = \omega^1_\ell(I), \qquad \omega^1_\ell(I) = \sum_{j=1}^{q} a_{\ell j}\,\omega_j(I). \tag{11.7} \]
The set $X_\ell \subset B_a$ of the $\mathbb{T}^\ell$-dense trajectories of system (11.7) is everywhere dense in the ball $B_a$ and its complement is a set of measure zero.

Proof. Applying the Alternative to the given analytic dynamical system (2.1), we reduce this system to the form (11.1). Then we repeat this reduction by applying the Alternative to the derived system (11.1), and so on for as long as it is possible. After $q - \ell$ transformations the system (2.1) takes the form (11.7). The unimodular transform $a_{ij}$ (11.6) is the product of the unimodular transforms $b_{ij}$ (11.2) that appear at each step of the reductions indicated. At the last stage of this construction we apply the first statement of the Alternative and obtain that the set $X_\ell$ of the $\mathbb{T}^\ell$-dense trajectories is everywhere dense in the ball $B_a$ and its complement is a set of measure zero.

12. Theorem on the Structure of General Dynamical Systems (2.1)

Proposition 9. Suppose the functions $\omega_\alpha(I)$ are continuous. The set $X$ of points $I \in B_a$ for which the trajectories of system (2.1) are $\mathbb{T}^q$-dense is everywhere dense in the ball $B_a$ if and only if the functions $\omega_\alpha(I)$ are rationally independent in any ball $B_0 \subset B_a$.

Proof. In the direct way (only if), the statement is obvious, because if the functions $\omega_\alpha(I)$ were rationally dependent in a ball $B_0 \subset B_a$ then $X \cap B_0 = \emptyset$. Let us prove by contradiction that if the functions $\omega_\alpha(I)$ are rationally independent then the set $X$ is everywhere dense. Suppose not. Then there exists a ball $B_0 \subset B_a$ such that $X \cap B_0 = \emptyset$. The values of the functions $\omega_\alpha(I)$ are commensurable over the integers for all points $I \in B_0$. Therefore, applying Lemma 2 (see below), we obtain that there exists a ball $B_1 \subset B_0$ such that Eq. (11.5) holds for all points $I \in B_1$. Hence the functions $\omega_\alpha(I)$ are rationally dependent in the ball $B_1$. This contradiction proves that the set $X$ is everywhere dense.

The following lemma is a consequence of the classical Baire Theorem.

Lemma 2. Suppose the values of the continuous functions $\omega_1(I), \cdots, \omega_q(I)$ are commensurable over the integers at all points $I \in B_0$, where $B_0$ is some ball $B_0 \subset B_a$. Then there exist a ball $B_1 \subset B_0$ and integers $m_1, \cdots, m_q$ such that the linear equation
\[ m_1\omega_1(I) + \cdots + m_q\omega_q(I) = 0 \tag{12.1} \]
is true identically for all points $I \in B_1$.
Proof. Let us consider the equation
\[ (v, \omega(I)) = k_1\omega_1(I) + \cdots + k_q\omega_q(I) = 0, \tag{12.2} \]
where $v \in \mathbb{Z}^q$ is a vector with arbitrary integer coordinates $k_1, \cdots, k_q \in \mathbb{Z}$. Let $S_v$ be the set of all points $I \in B_0$ which satisfy Eq. (12.2). The continuity of the functions $\omega_j(I)$ implies that the set $S_v$ is closed. Each point $I \in B_0$ belongs to some closed subset $S_v$, $v \ne 0$, because at least one non-trivial Eq. (12.2) with integer coefficients $k_\alpha(I)$ holds for any point $I \in B_0$. Hence the ball $B_0$ is a union of the countable family of closed subsets $S_v \subset B_0$. Therefore we can apply Baire's Theorem [23]: Suppose a closed ball $B \subset \mathbb{R}^k$ is represented as a union $B = \cup\, S_\ell$ of a countable family of closed subsets $S_\ell \subset B$. Then at least one closed subset $S_\ell$ contains a ball $B_1 \subset S_\ell$.

The Baire Theorem yields that at least one closed subset $S_v$ contains a ball $B_1 \subset S_v$. The definition of the subset $S_v$ implies that Eq. (12.1) is satisfied identically in the ball $B_1$.

Proposition 9 implies the following consequence.

Corollary 4. The continuous dynamical system (2.1) is $\mathbb{T}^q$-dense in the toroidal domain $O = B_a \times \mathbb{T}^q \subset M^n$ if and only if the functions $\omega_\alpha(I)$ are rationally independent in any ball $B_0 \subset B_a$.

The foregoing Alternative (see Sect. 11) implies the following consequence.

Corollary 5. Any analytic dynamical system (2.1) either is $\mathbb{T}^q$-dense or possesses a reduction to the form (11.1).

Theorem 7, Part 2. For arbitrary continuous functions $\omega_1(I), \cdots, \omega_q(I)$ there exists a family of balls $B_\tau \subset B_a$ such that the union $B = \cup\, B_\tau$ is everywhere dense in the ball $B_a$ and the following properties are realized. The maximal dimension of the closures of trajectories of the dynamical system (2.1) in the toroidal domain $O_\tau = B_\tau \times \mathbb{T}^q$ is equal to $\ell(\tau) \le q$. There exists a unimodular transformation of the action-angle coordinates
\[ \varphi^\tau_i = \sum_{j=1}^{q} a^\tau_{ij}\varphi_j, \qquad \det\|a^\tau_{ij}\| = 1, \qquad a^\tau_{ij} \in \mathbb{Z}, \tag{12.3} \]
that transforms system (2.1) in the domain $O_\tau$ to the form
\[ \dot I_1 = 0, \cdots, \dot I_p = 0, \qquad \dot\varphi^\tau_{\ell(\tau)+1} = 0, \cdots, \dot\varphi^\tau_q = 0, \qquad \dot\varphi^\tau_1 = \omega^\tau_1(I), \cdots, \dot\varphi^\tau_{\ell(\tau)} = \omega^\tau_{\ell(\tau)}(I), \qquad \omega^\tau_\ell(I) = \sum_{j=1}^{q} a^\tau_{\ell j}\,\omega_j(I). \tag{12.4} \]
The set $X_\tau \subset B_\tau$ of all points $I \in B_\tau$ for which the trajectories of the dynamical system (12.4) in the domain $O_\tau$ are $\mathbb{T}^{\ell(\tau)}$-dense is everywhere dense in the ball $B_\tau$.
Proof. Let $B_0 \subset B_a$ be an arbitrary ball. If the set $X_0 \subset B_0$ of the $\mathbb{T}^q$-dense trajectories is everywhere dense in the ball $B_0$ then Theorem 7, Part 2 is true in the toroidal domain $O_0 = B_0 \times \mathbb{T}^q$. If the set $X_0$ is not everywhere dense in $B_0$ then there exists a ball $B_1 \subset B_0$ such that $X_0 \cap B_1 = \emptyset$. The values of the continuous functions $\omega_1(I), \cdots, \omega_q(I)$ are commensurable over the integers for all points $I \in B_1$. Applying Lemma 2 we obtain that there exists a ball $B_2 \subset B_1$ such that the linear equation (12.1) is true identically for all points $I \in B_2$. This equation is reduced to one with relatively prime integers $m_1, \cdots, m_q$. Let $a^2_{ij}$ be a unimodular $q \times q$ matrix such that its $q$th row coincides with the relatively prime integers $m_1, \cdots, m_q$. We define new angle coordinates $\varphi^2_1, \cdots, \varphi^2_q$ on the torus $\mathbb{T}^q$ by the formulae (12.3). In view of Eqs. (2.1), we obtain
\[ \dot I_1 = 0, \cdots, \dot I_p = 0, \qquad \dot\varphi^2_q = 0, \qquad \dot\varphi^2_1 = \omega^2_1(I), \cdots, \dot\varphi^2_{q-1} = \omega^2_{q-1}(I). \tag{12.5} \]
If the set $X_2 \subset B_2$ of all $\mathbb{T}^{q-1}$-dense trajectories of system (12.5) is everywhere dense in the ball $B_2$ then Theorem 7, Part 2 is true in the toroidal domain $O_2 = B_2 \times \mathbb{T}^q$. If the set $X_2$ is not everywhere dense in $B_2$ then we repeat the previous construction for as long as it is possible. After a finite number of steps $m < q$ we construct a ball $B_\nu \subset B_0$ such that Theorem 7, Part 2 is true in the toroidal domain $O_\nu = B_\nu \times \mathbb{T}^q$. Indeed, at the last possible stage of this repeated construction we can obtain the dynamical system
\[ \dot I_1 = 0, \cdots, \dot I_p = 0, \qquad \dot\varphi^\nu_2 = 0, \cdots, \dot\varphi^\nu_q = 0, \qquad \dot\varphi^\nu_1 = \omega^\nu_1(I). \tag{12.6} \]
The set $X_\nu \subset B_\nu$ of all $\mathbb{T}^1$-dense trajectories of system (12.6) is defined by the condition $\omega^\nu_1(I) \ne 0$. If the set $X_\nu$ is everywhere dense in the ball $B_\nu$ then Theorem 7, Part 2 is true in the toroidal domain $O_\nu = B_\nu \times \mathbb{T}^q$. If the set $X_\nu$ is not everywhere dense in $B_\nu$ then there exists a ball $B_{\nu+1} \subset B_\nu$ where $\omega^\nu_1(I) = 0$ identically for all points $I \in B_{\nu+1}$. All points in the toroidal domain $O_{\nu+1} = B_{\nu+1} \times \mathbb{T}^q$ are critical points of the dynamical system (2.1). Therefore Theorem 7, Part 2 is true for the toroidal domain $O_{\nu+1}$. The initial ball $B_0$ is arbitrary. Therefore the union $B = \cup\, B_\tau$ of the constructed family of balls $B_\tau \subset B_a$ is everywhere dense in the ball $B_a$.

Remark 13. Let us present the toroidal domain $O_\tau = B_\tau \times \mathbb{T}^q$ in the form $O_\tau = B_\tau \times \mathbb{T}^{q-\ell(\tau)} \times \mathbb{T}^{\ell(\tau)}$. In this domain, the dynamical system (12.4) is $\mathbb{T}^{\ell(\tau)}$-dense. This remark is used in Sects. 13, 15 and 16 where we present the applications of Theorem 7.

13. Theorem on the Structure of the Liouville-Integrable Hamiltonian Systems

Theorem 8, Part 1. Suppose the Hamiltonian function $H(I_1, \cdots, I_k)$ is analytic and the maximal dimension of the closures of trajectories of the integrable Hamiltonian system (3.3) is equal to $\ell \le k$. Then there exists a unimodular canonical transformation
\[ J_i = \sum_{j=1}^{k} (a^{-1})_{ji} I_j, \qquad \varphi^1_i = \sum_{j=1}^{k} a_{ij}\varphi_j, \qquad \det\|a_{ij}\| = 1, \tag{13.1} \]
that transforms the Hamiltonian function $H(I_1, \cdots, I_k)$ to the form $H = H(J_1, \cdots, J_\ell)$ in the new action-angle coordinates $J_i$, $\varphi^1_i$ (13.1). The set $X_\ell$ of points $I \in B_a$ for which the trajectories of the Hamiltonian system (3.3) are $\mathbb{T}^\ell$-dense is everywhere dense in the space of coordinates $J_1, \cdots, J_\ell$ and its complement is a set of measure zero.
Proof. Theorem 7, Part 1 implies that there exists a unimodular transform (11.6) that reduces system (3.3) to the form (11.7). The corresponding unimodular transform (13.1) preserves the symplectic form $\omega$ (3.2):
\[ \sum_{i=1}^{k} dJ_i \wedge d\varphi^1_i = \sum_{i,j,m}^{k} (a^{-1})_{ji}\, a_{im}\, dI_j \wedge d\varphi_m = \sum_{j=1}^{k} dI_j \wedge d\varphi_j. \]
Therefore system (3.3) after the canonical transformation (13.1) takes the Hamiltonian form
\[ \dot J_i = 0, \qquad \dot\varphi^1_i = \frac{\partial H(J)}{\partial J_i} = \omega^1_i(J) = \sum_{j=1}^{k} a_{ij}\,\omega_j(I). \]
Theorem 7, Part 1 implies that the $k - \ell$ functions $\omega^1_{\ell+1}(J), \cdots, \omega^1_k(J)$ are identically equal to zero. Therefore the equalities
\[ \frac{\partial H(J)}{\partial J_{\ell+m}} = \omega^1_{\ell+m}(J) = 0, \qquad m = 1, \cdots, k - \ell, \]
prove that the Hamiltonian function $H = H(J)$ depends upon the $\ell$ variables $J_1, \cdots, J_\ell$ only. The other statements follow from Theorem 7, Part 1.

Suppose the Hamiltonian system (3.3) is defined in a toroidal domain $O = B_a \times \mathbb{T}^k$ and the function $H(I_1, \cdots, I_k)$ belongs to class $C^1$.

Theorem 8, Part 2. For an arbitrary differentiable function $H(I_1, \cdots, I_k)$, there exists a family of balls $B_\tau \subset B_a$ such that the union $B = \cup\, B_\tau$ is everywhere dense in the ball $B_a$ and the following properties are realized. The maximal dimension of the closures of trajectories of the dynamical system (3.3) in the toroidal domain $O_\tau = B_\tau \times \mathbb{T}^k$ is equal to $\ell(\tau) \le k$. There exists a unimodular canonical transformation
\[ I^\tau_j = \sum_{m=1}^{k} (a^\tau)^{-1}_{mj} I_m, \qquad \varphi^\tau_j = \sum_{m=1}^{k} a^\tau_{jm}\varphi_m \tag{13.2} \]
that transforms the Hamiltonian function $H(I_1, \cdots, I_k)$ to the form
\[ H = H(I^\tau_1, \cdots, I^\tau_{\ell(\tau)}) \tag{13.3} \]
in the new action-angle coordinates $I^\tau_j$, $\varphi^\tau_j$ in the domain $O_\tau$. The set $X_\tau \subset B_\tau$ of all points $I \in B_\tau$ for which trajectories of the Hamiltonian system (3.3) in the domain $O_\tau$ are $\mathbb{T}^{\ell(\tau)}$-dense is everywhere dense in the ball $B_\tau$.

Proof. For the differentiable function $H(I_1, \cdots, I_k)$, the frequencies $\omega_j(I)$ (3.3) are continuous functions. Therefore, applying Theorem 7, Part 2, we obtain a family of balls $B_\tau \subset B_a$. Their union $B = \cup\, B_\tau$ is everywhere dense in the ball $B_a$. In each toroidal domain $O_\tau = B_\tau \times \mathbb{T}^k$, the Hamiltonian system (3.3) takes the form (12.4) after the unimodular transformation (12.3) of the angular coordinates $\varphi_j$. The corresponding unimodular transformation (13.2) preserves the symplectic structure $\omega$:
\[ \sum_{j=1}^{k} dI^\tau_j \wedge d\varphi^\tau_j = \sum_{i,j,m}^{k} (a^\tau)^{-1}_{mj}\, a^\tau_{ji}\, dI_m \wedge d\varphi_i = \sum_{m=1}^{k} dI_m \wedge d\varphi_m. \]
Therefore the Hamiltonian system (3.3) takes the form
\[ \dot I^\tau_j = 0, \qquad \dot\varphi^\tau_j = \frac{\partial H(I^\tau)}{\partial I^\tau_j} = \omega^\tau_j(I^\tau) = \sum_{m=1}^{k} a^\tau_{jm}\,\omega_m(I) \tag{13.4} \]
in the new action-angle coordinates $I^\tau_j$, $\varphi^\tau_j$. Theorem 7 implies that the functions $\omega^\tau_{\ell(\tau)+1}(I^\tau), \cdots, \omega^\tau_k(I^\tau)$ vanish identically in the ball $B_\tau$. Therefore Eqs. (13.4) yield that the Hamiltonian function $H(I^\tau)$ depends upon the variables $I^\tau_1, \cdots, I^\tau_{\ell(\tau)}$ only. The last statement of Theorem 8 follows from Theorem 7, Part 2.
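The invariance of the symplectic form under the transformations (13.1) and (13.2) reduces to the matrix identity $\sum_i (a^{-1})_{ji} a_{im} = \delta_{jm}$. The sketch below checks this identity and the reduction of the Hamiltonian on one small example. The matrix `a`, the sample Hamiltonian $H = (I_1 + I_2)^2/2$ and all function names are assumptions chosen for illustration, not taken from the text.

```python
from fractions import Fraction


def inverse(a):
    """Exact inverse by Gauss-Jordan elimination over the rationals.

    Illustrative helper; assumes `a` is square and non-singular.
    """
    n = len(a)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def form_coefficients(a):
    """Coefficients C[j][m] of dI_j ^ dphi_m in sum_i dJ_i ^ dphi1_i, cf. (13.1)."""
    n = len(a)
    a_inv = inverse(a)
    return [[sum(a_inv[j][i] * a[i][m] for i in range(n)) for m in range(n)]
            for j in range(n)]


def new_actions(a, I):
    """J_i = sum_j (a^{-1})_{ji} I_j, the action part of (13.1)."""
    n = len(a)
    a_inv = inverse(a)
    return [sum(a_inv[j][i] * I[j] for j in range(n)) for i in range(n)]


# Sample data: H = (I_1 + I_2)^2 / 2 has frequencies omega_1 = omega_2,
# so m = (1, -1) is a resonance; `a` is unimodular with last row m.
a = [[0, -1], [1, -1]]
I = [3, 4]
J = new_actions(a, I)
H_old = Fraction((I[0] + I[1]) ** 2, 2)
H_new = J[0] ** 2 / 2  # depends on J_1 only
```

In this example $J_1 = -(I_1 + I_2)$ and $J_2 = I_1$, so the Hamiltonian becomes $H = J_1^2/2$, which is the form $H = H(J_1, \cdots, J_\ell)$ with $\ell = 1$.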
Chapter IV. Master Symmetries and Their Applications

14. Conformal Symmetries Form a Lie Algebra $S_c$

I. The master symmetries were studied in [19,33–35] for many concrete dynamical systems and partial differential equations and in [15,17] for the Toda lattice. By definition [19], a vector field $X$ is a master symmetry of the dynamical system
\[ \dot x^\tau = V^\tau(x), \tag{14.1} \]
if $X$ satisfies the equation
\[ L_V^2 X = [V, [V, X]] = 0. \tag{14.2} \]
A vector field $X_r$ is called [19] a symmetry of order $r$ if
\[ (\mathrm{ad}_V)^r X_r = 0, \qquad (\mathrm{ad}_V)^{r-1} X_r \ne 0, \tag{14.3} \]
where $\mathrm{ad}_V Y = L_V Y = [V, Y]$. Any symmetry of the dynamical system $V$ (14.1) is a symmetry of order 1. Any master symmetry is a symmetry of order 2. For any master symmetry $X$, the vector field $U = \mathrm{ad}_V X = [V, X]$ is a symmetry of the dynamical system (14.1). For two master symmetries $X_1$ and $X_2$, the algebraic identity
\[ \mathrm{ad}_V^2 [X_1, X_2] = [\mathrm{ad}_V^2 X_1, X_2] + [X_1, \mathrm{ad}_V^2 X_2] + 2[\mathrm{ad}_V X_1, \mathrm{ad}_V X_2] \tag{14.4} \]
implies
\[ \mathrm{ad}_V^2 [X_1, X_2] = 2[\mathrm{ad}_V X_1, \mathrm{ad}_V X_2]. \tag{14.5} \]
Therefore, the vector field $[X_1, X_2]$ is not a master symmetry if the two symmetries $\mathrm{ad}_V X_1$ and $\mathrm{ad}_V X_2$ do not commute. Hence the master symmetries in general do not form a Lie algebra.

Theorem 9. The conformal symmetries (1.2) of any dynamical system $V$ (14.1) form a Lie algebra $S_c$.
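The identity (14.4) preceding Theorem 9 follows from the Jacobi identity alone, so it holds in any Lie algebra. The sketch below checks it, together with the conclusion drawn from (14.5), in the Lie algebra of $3 \times 3$ integer matrices under the matrix commutator. This is a stand-in chosen for illustration, not the vector-field setting of the text, and the matrices `V`, `X1`, `X2` and all helper names are assumptions.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(A, B, scale=1):
    """Entrywise A + scale * B."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def bracket(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    return add(matmul(A, B), matmul(B, A), scale=-1)


def ad2(V, X):
    """ad_V^2 X = [V, [V, X]]."""
    return bracket(V, bracket(V, X))


def unit(i, j, n=3):
    """Matrix unit with a single 1 in row i, column j (0-based)."""
    return [[int((r, c) == (i, j)) for c in range(n)] for r in range(n)]


V, X1, X2 = unit(0, 1), unit(1, 2), unit(2, 0)
zero = [[0] * 3 for _ in range(3)]

# Left- and right-hand sides of (14.4).
lhs = ad2(V, bracket(X1, X2))
rhs = add(add(bracket(ad2(V, X1), X2), bracket(X1, ad2(V, X2))),
          bracket(bracket(V, X1), bracket(V, X2)), scale=2)

# X1 and X2 satisfy the master-symmetry condition (14.2) for this V,
# but their commutator does not, because ad_V X1 and ad_V X2 do not commute.
both_master = ad2(V, X1) == zero and ad2(V, X2) == zero
commutator_is_master = lhs == zero
```

Here `both_master` is true while `commutator_is_master` is false, which mirrors the statement that master symmetries need not close under the bracket.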
Proof. Any two conformal symmetries $X_1$ and $X_2$ satisfy the equations
$$[X_1, V] = c_1(x)V, \qquad [X_2, V] = c_2(x)V, \tag{14.6}$$
where $c_i(x)$ are first integrals of system (14.1). Using the algebraic identity $\mathrm{ad}_V [X_1, X_2] = [\mathrm{ad}_V X_1, X_2] + [X_1, \mathrm{ad}_V X_2]$, we derive
$$[[X_1, X_2], V] = -\mathrm{ad}_V [X_1, X_2] = (X_1(c_2) - X_2(c_1))V.$$
Applying the identity $V(X(c)) = X(V(c)) - [X, V](c)$ and Eqs. (14.6), we obtain that the function $X_1(c_2) - X_2(c_1)$ is a first integral of system (14.1). Therefore, the commutator vector field $[X_1, X_2]$ is a conformal symmetry. Hence, the linear space of all conformal symmetries $S_c$ is a Lie algebra.

II. Let us consider the general dynamical system (2.1) with quasi-periodic dynamics in the toroidal domain $O = B_a \times \mathbb{T}^q \subset M^n$ for arbitrary dimensions $p$ and $q$ that satisfy the condition $p + q = n$.

Lemma 3. A vector field $X$ is a master symmetry of the $\mathbb{T}^q$-dense dynamical system (2.1) if and only if it has the form
$$X = \sum_{j=1}^{p} X^j(I)\frac{\partial}{\partial I_j} + \sum_{\alpha=1}^{q} X^{\alpha+p}(I)\frac{\partial}{\partial \varphi_\alpha}. \tag{14.7}$$
Here $X^j(I)$ and $X^{\alpha+p}(I)$ are arbitrary smooth functions.

Proof. Using formulae (2.6) and (14.7), we obtain the equalities
$$[V, X] = -X^j(I)\frac{\partial \omega_\alpha(I)}{\partial I_j}\frac{\partial}{\partial \varphi_\alpha}, \qquad [V, [V, X]] = 0. \tag{14.8}$$
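The equalities (14.8) can be checked componentwise. The sketch below assumes that formula (2.6), which is not reproduced in this section, gives the vector field the form $V = \sum_\alpha \omega_\alpha(I)\,\partial/\partial\varphi_\alpha$; this is an assumption, chosen because it is consistent with (14.8) and (14.11).

```latex
\begin{align*}
% Assumed form of (2.6): V has no I-components, and its phi-components depend on I only.
V &= \sum_{\alpha=1}^{q} \omega_\alpha(I)\,\frac{\partial}{\partial\varphi_\alpha}, \\
% General rule: [V, X]^\tau = V(X^\tau) - X(V^\tau).
% V(X^\tau) = 0, because V differentiates along \varphi and X^\tau depends on I only.
[V, X] &= -\sum_{\alpha=1}^{q} X(\omega_\alpha)\,\frac{\partial}{\partial\varphi_\alpha}
        = -\sum_{\alpha=1}^{q}\sum_{j=1}^{p} X^j(I)\,
          \frac{\partial\omega_\alpha(I)}{\partial I_j}\,\frac{\partial}{\partial\varphi_\alpha}, \\
% U = [V, X] again has I-dependent coefficients and no I-components,
% so V(U^\tau) = 0 and U(\omega_\alpha) = 0.
[V, [V, X]] &= 0.
\end{align*}
```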
Therefore any vector field (14.7) is a master symmetry of system (2.1). Now we prove that any master symmetry of system (2.1) has form (14.7). Applying Lemma 1, we obtain that any symmetry $U$ of the $\mathbb{T}^q$-dense dynamical system (2.1) has the form
$$U = \sum_{j=1}^{p} \tilde{U}^j(I)\frac{\partial}{\partial I_j} + \sum_{\alpha=1}^{q} \tilde{U}^{\alpha+p}(I)\frac{\partial}{\partial \varphi_\alpha}. \tag{14.9}$$
Here smooth functions $\tilde{U}^j(I)$ satisfy Eq. (2.5) and functions $\tilde{U}^{\alpha+p}(I)$ are arbitrary. The definition of master symmetry (14.2) means that the vector field
$$U^\tau = [V, X]^\tau = (L_V X)^\tau = \dot{X}^\tau - V^\tau_{,\mu} X^\mu \tag{14.10}$$
is a symmetry $U$ of the dynamical system (2.1). Substituting formulae (14.9) into (14.10) and applying formulae (2.8), we obtain the system of equations
$$\dot{X}^j = \tilde{U}^j(I), \qquad \dot{X}^{\alpha+p} = \frac{\partial \omega_\alpha(I)}{\partial I_\ell} X^\ell + \tilde{U}^{\alpha+p}(I). \tag{14.11}$$
346
O.I. Bogoyavlenskij
Using the fact that any first integral of the $\mathbb{T}^q$-dense dynamical system (2.1) has form (2.3), we derive the formulae
$$X^j(t) = \tilde{U}^j(I)\,t + \tilde{X}^j(I),$$
$$X^{\alpha+p}(t) = \frac{1}{2}\frac{\partial \omega_\alpha(I)}{\partial I_\ell}\tilde{U}^\ell(I)\,t^2 + \Bigl(\frac{\partial \omega_\alpha(I)}{\partial I_\ell}\tilde{X}^\ell(I) + \tilde{U}^{\alpha+p}(I)\Bigr)t + \tilde{X}^{\alpha+p}(I) \tag{14.12}$$
for the solutions to the system (14.11). Here $\tilde{X}^j(I)$ and $\tilde{X}^{\alpha+p}(I)$ are arbitrary smooth functions of $I_1, \dots, I_p$. Components $X^\tau(I, \varphi)$ of the smooth vector field $X$ are bounded on any torus $\mathbb{T}^q$ defined by conditions $I_1 = c_1, \dots, I_p = c_p$. Solutions (14.12) are bounded for all $t$ if and only if the equations
$$\tilde{U}^j(I) = 0, \qquad \frac{\partial \omega_\alpha(I)}{\partial I_\ell}\tilde{X}^\ell(I) + \tilde{U}^{\alpha+p}(I) = 0 \tag{14.13}$$
hold. Using formulae (14.12) and (14.13) and the fact that general trajectories of the dynamical system (2.1) are everywhere dense on the tori $\mathbb{T}^q$, we obtain that the coordinates of the vector field $X$ depend upon the variables $I_j$ only. Therefore any master symmetry $X$ has form (14.7).

Corollary 6. A vector field $X$ is a conformal symmetry of the $\mathbb{T}^q$-dense dynamical system (2.1) if and only if it has the form (14.7) where the smooth functions $X^{\alpha+p}(I)$ are arbitrary and functions $X^j(I)$ satisfy the equations
$$\sum_{j=1}^{p} X^j(I)\frac{\partial \omega_\alpha(I)}{\partial I_j} = c(I)\,\omega_\alpha(I). \tag{14.14}$$
Here $c(I)$ is an arbitrary smooth function and $\alpha = 1, \dots, q$. Indeed, Eqs. (14.14) follow from Lemma 3 and Eq. (1.2).

15. Master Symmetries of Any Dynamical System (2.1) Form a Lie Algebra M

I. Proposition 10. 1) The master symmetries of the $\mathbb{T}^q$-dense dynamical system (2.1) form a Lie algebra M. 2) A $\mathbb{T}^q$-dense dynamical system (2.1) has no symmetries of order $r \geq 3$.

Proof. 1) Lemma 3 implies that all master symmetries $X$ of system (2.1) have the form (14.7). It is evident that the linear space M of master symmetries (14.7) is closed with respect to the commutation of vector fields. Therefore M is a Lie algebra.

2) Let $X_r$ be a symmetry of order $r \geq 3$. Equations (14.3) have the form
$$\mathrm{ad}_V^3 Y = 0, \qquad \mathrm{ad}_V^2 Y \neq 0, \tag{15.1}$$
where
$$Y = \mathrm{ad}_V^{r-3} X_r = \sum_{j=1}^{p} Y^j(I, \varphi)\frac{\partial}{\partial I_j} + \sum_{\alpha=1}^{q} Y^{\alpha+p}(I, \varphi)\frac{\partial}{\partial \varphi_\alpha}.$$
Let us prove that the first equation (15.1) implies the equation
$$\mathrm{ad}_V^2 Y = 0 \tag{15.2}$$
and therefore Equations (15.1) are incompatible. Indeed, we have
$$\mathrm{ad}_V^3 Y = [V, [V, Z]] = 0, \tag{15.3}$$
where
$$Z = [V, Y] = L_V Y. \tag{15.4}$$
Equation (15.3) means that vector field $Z$ is a master symmetry. In view of Lemma 3, this master symmetry has the form
$$Z = \sum_{j=1}^{p} Z^j(I)\frac{\partial}{\partial I_j} + \sum_{\alpha=1}^{q} Z^{\alpha+p}(I)\frac{\partial}{\partial \varphi_\alpha}. \tag{15.5}$$
Equation (15.4) is equivalent to the system of equations
$$\dot{Y}^\tau = V^\tau_{,\mu} Y^\mu + Z^\tau.$$
Substituting here formulae (2.8) and (15.5), we obtain
$$\dot{Y}^j = Z^j(I), \qquad \dot{Y}^{\alpha+p} = \frac{\partial \omega_\alpha(I)}{\partial I_\ell} Y^\ell + Z^{\alpha+p}(I).$$
Solutions to this system of equations have the form
$$Y^j(t) = Z^j(I)\,t + \tilde{Y}^j(I),$$
$$Y^{\alpha+p}(t) = \frac{1}{2}\frac{\partial \omega_\alpha(I)}{\partial I_\ell} Z^\ell(I)\,t^2 + \Bigl(\frac{\partial \omega_\alpha(I)}{\partial I_\ell}\tilde{Y}^\ell(I) + Z^{\alpha+p}(I)\Bigr)t + \tilde{Y}^{\alpha+p}(I). \tag{15.6}$$
All components $Y^\tau(I, \varphi)$ of the smooth vector field $Y$ are bounded on the tori $\mathbb{T}^q$. The solutions (15.6) are bounded only if $Z^j(I) = 0$. These equations and Eq. (15.5) yield the equality $[V, Z] = 0$. Substituting here $Z = [V, Y]$, we derive Eq. (15.2): $[V, [V, Y]] = 0$. This equation contradicts the second Eq. (15.1). The contradiction derived proves that the symmetries of order $r \geq 3$ do not exist.

Theorem 10. 1) The master symmetries of any dynamical system (2.1) form a Lie algebra M. 2) Any dynamical system (2.1) has no symmetries of order $r \geq 3$. 3) The Lie algebra of master symmetries M of the dynamical system (2.1) coincides with the Lie algebra of symmetries S if and only if $\omega_\alpha(I) = \mathrm{const}$.

Proof. 1) – 2) Let $X_1$ and $X_2$ be two arbitrary master symmetries of the dynamical system (2.1) in the toroidal domain $O = B_a \times \mathbb{T}^q$ and $X_r$ be a symmetry of order $r \geq 3$. Let us prove that the equations
$$\mathrm{ad}_V^2 [X_1, X_2] = 0, \qquad \mathrm{ad}_V^{r-1} X_r = 0 \tag{15.7}$$
are true in a set $O_1 \subset O = B_a \times \mathbb{T}^q$ that is everywhere dense in the toroidal domain $O$. Let $B \subset B_a$ be the union $B = \cup B_\tau$ of sets $B_\tau$ constructed in Theorem 7, Part 2. The set $B$ is everywhere dense in the ball $B_a$. Therefore the set $O_1 = B \times \mathbb{T}^q$ is everywhere dense in the toroidal domain $O = B_a \times \mathbb{T}^q$.
Theorem 7, Part 2 proves that dynamical system (2.1) is $\mathbb{T}^{\ell(\tau)}$-dense in each toroidal domain
$$O_\tau = B_\tau \times \mathbb{T}^q = B_\tau \times \mathbb{T}^{q-\ell(\tau)} \times \mathbb{T}^{\ell(\tau)}. \tag{15.8}$$
Therefore, applying Proposition 10 for system (2.1) in the domains $O_\tau$ (15.8), we obtain that Eqs. (15.7) are true identically in all domains $O_\tau$. Hence these equations are true in the set $O_1 = \cup O_\tau$. The set $O_1$ is everywhere dense in the toroidal domain $O$. Hence we obtain by continuity that Eqs. (15.7) are true identically in the domain $O$. The first Eq. (15.7) means that the commutator $[X_1, X_2]$ of two arbitrary master symmetries is a master symmetry. The second Eq. (15.7) implies that symmetries of order $r \geq 3$ do not exist.

3) If $\omega_\alpha = \mathrm{const}$ then the corresponding vector field $V$ (2.6) is constant. Therefore all trajectories of system (2.1) are isomorphic. These trajectories are $\mathbb{T}^\ell$-dense for some $\ell \leq q$. In view of Theorem 7, Part 2, system (2.1) takes the form (12.4) after a unimodular transformation (12.3). The constant vector field $V$ (2.6) takes the form
$$V = \sum_{\alpha=1}^{\ell}\Bigl(\sum_{\beta=1}^{q} a^\tau_{\alpha\beta} c_\beta\Bigr)\frac{\partial}{\partial \varphi^\tau_\alpha}. \tag{15.9}$$
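In view of Lemma 3, any master symmetry $X$ has the form
$$X = \sum_{j=1}^{p} X^j(I, \varphi^\tau_\gamma)\frac{\partial}{\partial I_j} + \sum_{\alpha=1}^{q} X^{p+\alpha}(I, \varphi^\tau_\gamma)\frac{\partial}{\partial \varphi^\tau_\alpha} \tag{15.10}$$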
in the coordinates $I_j$, $\varphi^\tau_\alpha$ (12.3). Here $\gamma = \ell + 1, \dots, q$. The constant vector field $V$ (15.9) commutes with any master symmetry $X$ (15.10). Hence any master symmetry $X$ is a symmetry and therefore M = S.

Suppose M = S. Applying Theorem 7, Part 2, we obtain that there exists a family of balls $B_\tau \subset B_a$ such that the dynamical system (2.1) in the toroidal domain $O_\tau = B_\tau \times \mathbb{T}^{q-\ell(\tau)} \times \mathbb{T}^{\ell(\tau)}$ is $\mathbb{T}^{\ell(\tau)}$-dense and has the form (12.4). The union $B = \cup B_\tau$ is everywhere dense in the ball $B_a$. The vector field $V$ (2.6) has the form
$$V = \sum_{\alpha=1}^{\ell(\tau)} \omega^\tau_\alpha(I)\frac{\partial}{\partial \varphi^\tau_\alpha}$$
in the coordinates $I_j$, $\varphi^\tau_\alpha$. Applying Lemma 3, we obtain that any master symmetry $X$ of system (12.3) has the form (15.10), where $\gamma = \ell(\tau) + 1, \dots, q$. Hence we derive
$$[X, V] = \sum_{j=1}^{p}\sum_{\alpha=1}^{\ell(\tau)} X^j(I, \varphi^\tau_\gamma)\frac{\partial \omega^\tau_\alpha(I)}{\partial I_j}\frac{\partial}{\partial \varphi^\tau_\alpha}.$$
If any master symmetry of the system (2.1) is a symmetry then $[X, V] = 0$. This equation for arbitrary functions $X^j(I, \varphi^\tau_\gamma)$ implies the equations $\partial \omega^\tau_\alpha(I)/\partial I_j = 0$. Hence, after the inverse unimodular transform (12.3), we find
$$\frac{\partial \omega_\alpha(I)}{\partial I_j} = \sum_{\beta=1}^{q} (a^\tau)^{-1}_{\alpha\beta}\frac{\partial \omega^\tau_\beta(I)}{\partial I_j} = 0. \tag{15.11}$$
Thus, Eqs. (15.11) are true in the set $B = \cup B_\tau$ that is everywhere dense in the ball $B_a$. Therefore, these equations are true everywhere in the ball $B_a$ by continuity. Hence $\omega_\alpha(I) = \mathrm{const}$ for all $I \in B_a$.
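In the $\mathbb{T}^q$-dense case, part 3 of Theorem 10 can be read off directly from Lemma 3. The following sketch assumes, as an illustration only, that formula (2.6) has the form $V = \sum_\alpha \omega_\alpha(I)\,\partial/\partial\varphi_\alpha$, and it relies on (14.7) and (14.8) as stated in Sect. 14.

```latex
\begin{align*}
% By (14.8), for any master symmetry X of the form (14.7):
[V, X] &= -\sum_{\alpha=1}^{q}\sum_{j=1}^{p} X^j(I)\,
          \frac{\partial\omega_\alpha(I)}{\partial I_j}\,\frac{\partial}{\partial\varphi_\alpha}. \\
% M = S means [V, X] = 0 for every choice of the smooth functions X^j(I).
% Taking X^j = \delta^j_m for each fixed m = 1, ..., p gives
\frac{\partial\omega_\alpha(I)}{\partial I_m} &= 0
   \qquad (\alpha = 1, \dots, q; \; m = 1, \dots, p), \\
% and, since the ball B_a is connected,
\omega_\alpha(I) &= \mathrm{const} \quad \text{for all } I \in B_a.
\end{align*}
```

Conversely, constant $\omega_\alpha$ make the right-hand side of the first line vanish for every $X$ of the form (14.7), so every master symmetry is then a symmetry.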
An Example. For the harmonic oscillator [1,3], all frequencies $\omega_\alpha$ are constant. Applying Theorem 10, we obtain that any master symmetry of the harmonic oscillator is a symmetry.

II. We apply Theorem 10 to the Hamiltonian systems integrable in the B-sense (see Sect. 6).

Proposition 11. If a Hamiltonian system $V$ is integrable in the B-sense with compact invariant submanifolds, then the following necessary conditions must be satisfied: 1) For any two master symmetries $X_1$ and $X_2$ of the system $V$, the vector fields $[X_1, V]$ and $[X_2, V]$ commute. 2) The Hamiltonian system $V$ has no symmetries of order $k \geq 3$.

Proof. Applying Theorem 3, we obtain that any B-integrable Hamiltonian system $V$ has form (2.1) in the appropriate toroidal coordinates $I_j$, $\varphi_\alpha$. In Theorem 10, we have proved that the master symmetries of any dynamical system (2.1) form a Lie algebra M. Hence, Eq. (14.5) implies $[[X_1, V], [X_2, V]] = 0$. The absence of the symmetries of order $k \geq 3$ follows from Theorem 10.

III. Let us consider the second application of the master symmetries to the B-integrable Hamiltonian systems.

Proposition 12. A B-integrable Hamiltonian system $V$ with Hamiltonian function $H(x)$ has non-compact invariant submanifolds if and only if the system $V$ possesses a Hamiltonian master symmetry. That means there exists a smooth function $F(x)$ that satisfies the equations
$$\{H, \{H, F\}\} = 0, \qquad \{H, F\} \neq 0. \tag{15.12}$$

Proof. 1) If the B-integrable Hamiltonian system $V$ has non-compact invariant submanifolds then such a function $F(x)$ does exist. Indeed, in this case in the coordinates $I_1, \dots, I_p, \varphi_1, \dots, \varphi_m, \rho_{m+1}, \dots, \rho_q$ (6.7), there is at least one variable $\rho_q \in \mathbb{R}^1$ that satisfies the equations
$$\dot{\rho}_q = \omega_q(I) = \{H, \rho_q\} \neq 0, \qquad \ddot{\rho}_q = \{H, \{H, \rho_q\}\} = 0.$$
The variable $\rho_q$ is an example of the Hamiltonian master symmetry $F(x)$ that satisfies Eqs. (15.12).

2) If the Hamiltonian system is B-integrable and has only compact invariant submanifolds, then such a function $F(x)$ does not exist. Indeed, if it were to exist then for some trajectory $x(t)$ of the Hamiltonian system $V$ one would have
$$\ddot{F}(x(t)) = 0, \qquad \dot{F}(x(t)) = c \neq 0, \qquad F(x(t)) = ct + c_0. \tag{15.13}$$
However, the last equation (15.13) is impossible for $c \neq 0$ because the smooth function $F(x)$ is bounded on any invariant torus $\mathbb{T}^q$ of the B-integrable Hamiltonian system $V$.

An Example. The Calogero-Moser system has the Hamiltonian
$$H(p, q) = \frac{1}{2}\sum_{i=1}^{n} p_i^2 + \alpha \sum_{i \neq j}^{n} (q_i - q_j)^{-2} \tag{15.14}$$
and is Liouville-integrable. This system possesses the Hamiltonian conformal symmetry
$$F(p, q) = p_1 q_1 + \cdots + p_n q_n. \tag{15.15}$$
For the two functions $H(p, q)$ and $F(p, q)$, the equations
$$\{H, \{H, F\}\} = 0, \qquad \{H, F\} = 2H \neq 0$$
hold. Therefore the Calogero-Moser system has non-compact invariant submanifolds.

16. Criteria for the Lie Subalgebra of Symmetries S to be an Ideal in M

In this section, we investigate when the Lie subalgebra of symmetries S of the dynamical system (2.1) is an ideal in the Lie algebra of master symmetries M. This property is equivalent to the condition that the equation
$$\mathrm{ad}_V [X, U] = 0 \tag{16.1}$$
is true for an arbitrary master symmetry $X$ and for an arbitrary symmetry $U$ of system (2.1).

Theorem 11. The Lie algebra of symmetries of a dynamical system (2.1) is an ideal in the Lie algebra of master symmetries M if and only if the following condition B is satisfied.

Condition B: In any ball $B_0 \subset B_a$, where
$$\mathrm{rank}\,\Bigl\|\frac{\partial \omega_\alpha(I)}{\partial I_j}\Bigr\| = r = \mathrm{const}, \tag{16.2}$$
one has either $r = 0$ or $r = p$. In the latter case system (2.1) must be $\mathbb{T}^q$-dense in the toroidal domain $O_0 = B_0 \times \mathbb{T}^q$.

Proof. 1) Suppose Condition B is satisfied. Let us prove that Eq. (16.1) is true for any given symmetry $U$ and any master symmetry $X$. The rank of any matrix $\omega_{\alpha,j}(I)$ is a locally non-decreasing function. Therefore there exists a family of balls $B_\tau \subset B_a$ such that condition (16.2) is true in each ball $B_\tau$ for some $r = r(\tau)$ and the union $\cup B_\tau$ is everywhere dense in the ball $B_a$.

Suppose $r = 0$ in a ball $B_\tau$. Then $\omega_\alpha(I) = c_\alpha = \mathrm{const}$ in the toroidal domain $O_\tau = B_\tau \times \mathbb{T}^q$. In this case, we have proved in Theorem 10 that all master symmetries $X$ of system (2.1) are symmetries. Hence, the vector field $[X, U]$ is a symmetry and Eq. (16.1) follows.

Suppose $r = p$ in a ball $B_\tau$ and system (2.1) is $\mathbb{T}^q$-dense in the toroidal domain $O_\tau = B_\tau \times \mathbb{T}^q$. Then any symmetry $U$ and any master symmetry $X$ have the forms (2.12) and (14.7) in $O_\tau$ respectively. Their commutator has the form
$$[X, U] = \sum_{j=1}^{p} X^j(I)\frac{\partial U^{\alpha+k}(I)}{\partial I_j}\frac{\partial}{\partial \varphi_\alpha}.$$
Evidently, this vector field satisfies the equation $\mathrm{ad}_V [X, U] = 0$. Thus if the condition B is satisfied then Eq. (16.1) is true in all toroidal domains $O_\tau$, whose union is everywhere dense in the toroidal domain $O = B_a \times \mathbb{T}^q$. Hence Eq. (16.1) is true everywhere in $O$ by continuity. Therefore the Lie subalgebra S is an ideal in the Lie algebra M.
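For the case $r = p$ in part 1 of the proof, the claim that the commutator satisfies $\mathrm{ad}_V [X, U] = 0$ can be verified in one line. The sketch assumes that formula (2.6) reads $V = \sum_\alpha \omega_\alpha(I)\,\partial/\partial\varphi_\alpha$ (consistent with (14.8)); the symbol $W^\alpha(I)$ is shorthand introduced here for the coefficients of $[X, U]$ and does not appear in the original.

```latex
\begin{align*}
% Shorthand (not in the original): W^\alpha(I) denotes the coefficients of [X, U].
[X, U] &= \sum_{\alpha=1}^{q} W^\alpha(I)\,\frac{\partial}{\partial\varphi_\alpha}, \\
% [V, W]^\alpha = V(W^\alpha) - W(\omega_\alpha). Both terms vanish, because
% V and W differentiate along \varphi only, while W^\alpha and \omega_\alpha depend on I only.
\mathrm{ad}_V [X, U] &= \sum_{\alpha=1}^{q}
   \bigl( V(W^\alpha) - W(\omega_\alpha) \bigr)\frac{\partial}{\partial\varphi_\alpha} = 0.
\end{align*}
```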
2) Suppose S is an ideal in M. Let us prove that if the condition (16.2) is true for $r < p$ in some ball $B_0 \subset B_a$ then $r = 0$. If $r < p$ then Eq. (2.5) has a non-trivial solution $U_0$ (2.13). Multiplying this vector field with a smooth function $a(I)$ that is equal to zero outside of the ball $B_0$ and is equal to 1 inside a ball $B_1 \subset B_0$, we obtain a global symmetry $U_1 = a(I)U_0$. Suppose a component $U_1^\ell(I) \neq 0$ in the ball $B_1$. Let $Y_m = \partial/\partial I_m$, where $m = 1, \dots, p$. The $p$ one-component vector fields
$$U_1^\ell(I)\,Y_m = I_\ell [Y_m, U_1] - [I_\ell Y_m, U_1] \tag{16.3}$$
are symmetries of system (2.1) for $m = 1, \dots, p$. Indeed, the vector fields $Y_m$ and $I_\ell Y_m$ are master symmetries and the vector field $U_1$ is a symmetry. We have assumed that the Lie subalgebra S is an ideal in M. Therefore the right hand side of Eq. (16.3) is a symmetry. Hence the $p$ linearly independent one-component vector fields (16.3) satisfy Eq. (2.5). This is possible only if the $q \times p$ matrix $\partial \omega_\alpha(I)/\partial I_j$ is equal to zero identically in the ball $B_1 \subset B_0$. Hence we obtain $r = 0$.

Let us prove by contradiction that if the condition (16.2) is true for $r = p$ in some ball $B_0 \subset B_a$ and the Lie subalgebra S is an ideal in the Lie algebra M then system (2.1) is $\mathbb{T}^q$-dense in the toroidal domain $B_0 \times \mathbb{T}^q$. Indeed, if it is not $\mathbb{T}^q$-dense then applying Theorem 7, Part 2, we obtain that there exists a ball $B_\tau \subset B_0$, where system (2.1) is $\mathbb{T}^{\ell(\tau)}$-dense and $\ell(\tau) < q$. After a unimodular transformation (12.3), system (2.1) takes form (12.4) in the new angle coordinates $\varphi^\tau_1, \dots, \varphi^\tau_q$. Let $a(I)$ be some smooth function that is equal to 1 for $I \in B_1 \subset B_\tau$ and is equal to zero outside of the ball $B_\tau$. The two vector fields
$$X = a(I)\sin\varphi^\tau_q\,\frac{\partial}{\partial I_1}, \qquad U = a(I)\frac{\partial}{\partial \varphi^\tau_q}$$
are a master symmetry and a symmetry of system (12.4) respectively. Their commutator has the form
$$[X, U] = a(I)\frac{\partial a(I)}{\partial I_1}\sin\varphi^\tau_q\,\frac{\partial}{\partial \varphi^\tau_q} - a^2(I)\cos\varphi^\tau_q\,\frac{\partial}{\partial I_1}. \tag{16.4}$$
Calculating the commutator of (16.4) with vector field $V$ (12.4), we obtain
$$[V, [X, U]] = a^2(I)\cos\varphi^\tau_q \sum_{\alpha=1}^{\ell(\tau)} \frac{\partial \omega^\tau_\alpha(I)}{\partial I_1}\frac{\partial}{\partial \varphi^\tau_\alpha} \neq 0.$$
This vector field is not equal to zero because the rank of the matrix $\omega_{\alpha,j}(I)$ (16.2) is equal to $p$. Therefore the vector field $[X, U]$ is not a symmetry of the dynamical system (2.1). This fact contradicts the assumption that the Lie subalgebra of symmetries S is an ideal in the Lie algebra of master symmetries M. The contradiction obtained proves that if $r = p$ then system (2.1) must be $\mathbb{T}^q$-dense in the toroidal domain $O = B_0 \times \mathbb{T}^q$.

Using Theorem 11, we define the quotient Lie algebra M/S for the dynamical system (2.1) that satisfies Condition B. Theorem 10 implies the following consequence: The quotient Lie algebra M/S is zero if and only if $\omega_\alpha(I) = \mathrm{const}$.
17. Applications of Theorem 1 to the Master Symmetries

I. The proof of the following proposition is based on Theorem 1 on symmetries.

Proposition 13. The master symmetries of the C-integrable non-degenerate Hamiltonian system (3.1) form a Lie algebra M. The Lie subalgebra of symmetries S ⊂ M is an abelian ideal in the Lie algebra of master symmetries M.

Proof. Let $X_1$ and $X_2$ be two master symmetries of the Hamiltonian system (3.1). The vector fields $\mathrm{ad}_V X_1$ and $\mathrm{ad}_V X_2$ are symmetries of (3.1). Theorem 1 states that all symmetries of the C-integrable non-degenerate Hamiltonian system (3.1) commute. Therefore the identities (14.4) – (14.5) yield $\mathrm{ad}_V^2 [X_1, X_2] = 0$. Hence the vector field $[X_1, X_2]$ is a master symmetry of system (3.1). That proves that the master symmetries of the C-integrable non-degenerate Hamiltonian system (3.1) form a Lie algebra M. Let us consider the algebraic identity
$$\mathrm{ad}_V [X, U] = [\mathrm{ad}_V X, U] + [X, \mathrm{ad}_V U]. \tag{17.1}$$
Any symmetry $U$ of system (3.1) satisfies the equation $\mathrm{ad}_V U = 0$. Theorem 1 implies that the two symmetries $\mathrm{ad}_V X$ and $U$ commute. Therefore identity (17.1) yields $\mathrm{ad}_V [X, U] = 0$. This equality proves that vector field $[X, U]$ is a symmetry of system (3.1). Therefore the Lie algebra of symmetries S ⊂ M is an abelian ideal in the Lie algebra of master symmetries M.

II. Let $M^n$ be a smooth manifold, $P$ be a Poisson structure on $M^n$, and $H(x)$ be a smooth function on $M^n$. Suppose a vector field $X$ on the manifold $M^n$ satisfies the equations
$$L_X P = cP, \qquad L_X H = f(H), \tag{17.2}$$
where $f(H)$ is a smooth function of a single variable and $c$ is a constant. The vector field $X$ is a conformal symmetry of the Hamiltonian system
$$\dot{x}^\alpha = P^{\alpha\beta} H_{,\beta} = V^\alpha(x). \tag{17.3}$$
Indeed, Eqs. (17.2) and (17.3) imply the equation
$$[X, V] = L_X (P\,dH) = (L_X P)\,dH + P\,d(L_X H) = (f'(H) + c)V, \tag{17.4}$$
which has the form (1.2).

Corollary 7. If on a manifold $M^n$ some Poisson structure $P$, function $H$ and vector field $U$ satisfy the equations
$$L_U P = -P, \qquad L_U H = H, \tag{17.5}$$
then the Hamiltonian system (17.3) is not a C-integrable non-degenerate Hamiltonian system with respect to any non-degenerate Poisson structure $P_c$ in any invariant domain $O \subset M^n$.
Proof. Applying formulae (17.4) for the vector field $X = U$, we obtain $[U, V] = 0$. Therefore the vector field $U$ is a symmetry of the Hamiltonian system (17.3). The second Eq. (17.5) yields $U(H) = H \neq 0$. Therefore the Hamiltonian system (17.3) does not satisfy the necessary condition (4.1) for the non-degenerate C-integrability presented in Theorem 1, Part 2.

III. An Example. Let the Poisson structure $P$ on the phase space $\mathbb{R}^{2k}$ have the standard form
$$P = \omega^{-1}, \qquad \omega = \sum_{i=1}^{k} dp_i \wedge dq_i. \tag{17.6}$$
Let the Hamiltonian function $H_1(p, q)$ be a sum of two homogeneous functions
$$H_1(p, q) = T(p) + V(q), \qquad T(\lambda p) = \lambda^\ell T(p), \qquad V(\lambda q) = \lambda^r V(q). \tag{17.7}$$
The corresponding Hamiltonian system
$$\dot{p}_i = -\frac{\partial H_1}{\partial q_i}, \qquad \dot{q}_i = \frac{\partial H_1}{\partial p_i} \tag{17.8}$$
possesses the following conformal symmetry
$$X = \Bigl(\frac{r}{r + \ell}\Bigr) p_i \frac{\partial}{\partial p_i} + \Bigl(\frac{\ell}{r + \ell}\Bigr) q_i \frac{\partial}{\partial q_i}. \tag{17.9}$$
Indeed, vector field $X$ and symplectic structure $\omega$ (17.6) satisfy the equation $L_X \omega = \omega$. Hence the equation $L_X P = -P$ follows. Applying the Euler Theorem on homogeneous functions, we obtain
$$L_X H_1 = X(H_1) = mH_1, \qquad m = \frac{r\ell}{r + \ell}. \tag{17.10}$$
Therefore formula (17.4) yields
$$[X, V] = (m - 1)V. \tag{17.11}$$
Thus vector field $X$ is a conformal symmetry of system (17.8).

Remark 14. Equation (17.11) implies that for $m = 1$ the vector field $X$ is a symmetry of the Hamiltonian system (17.8). This symmetry satisfies Eq. (17.10). Therefore, applying Theorem 1, Part 2, we obtain that the dynamical system (17.8) for $\ell = r(r - 1)^{-1}$ or $m = 1$ is not a C-integrable non-degenerate Hamiltonian system with respect to any non-degenerate Poisson structure $P_c$ in any invariant domain $O \subset \mathbb{R}^{2k}$.

Remark 15. The vector field $X$ (17.9) does not exist if $r + \ell = 0$. In this case applying the classical Virial Theorem [1,3], we obtain that system (17.8) has no invariant compact domain in the Euclidean phase space $\mathbb{R}^{2k}$. For $r = -\ell$, system (17.8) has a monotonic function $F = p_1 q_1 + \cdots + p_k q_k$ that satisfies the equation $\dot{F} = rH_1$. Therefore all trajectories of system (17.8) for $H_1(p, q) \neq 0$ are unbounded.
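The two facts used for the vector field (17.9), namely $L_X \omega = \omega$ and $X(H_1) = mH_1$, follow from a short computation. The sketch below introduces the shorthand $a = r/(r+\ell)$ and $b = \ell/(r+\ell)$ (not in the original) for the two coefficients of $X$; here $V(q)$ denotes the potential of (17.7), not the Hamiltonian vector field.

```latex
\begin{align*}
% Shorthand for the coefficients of (17.9); note that a + b = 1.
a &= \frac{r}{r+\ell}, \qquad b = \frac{\ell}{r+\ell}, \qquad
X = a\,p_i\frac{\partial}{\partial p_i} + b\,q_i\frac{\partial}{\partial q_i}, \\
% L_X dp_i = d(X(p_i)) = a dp_i and L_X dq_i = b dq_i, hence
L_X\omega &= \sum_{i=1}^{k}\bigl(a\,dp_i\wedge dq_i + b\,dp_i\wedge dq_i\bigr) = (a+b)\,\omega = \omega, \\
% Euler's theorem for the homogeneous functions of (17.7):
p_i\frac{\partial T}{\partial p_i} &= \ell\,T, \qquad q_i\frac{\partial V}{\partial q_i} = r\,V, \\
% so both terms of H_1 pick up the same factor:
X(H_1) &= a\,\ell\,T + b\,r\,V = \frac{r\ell}{r+\ell}\,(T + V) = m\,H_1 .
\end{align*}
```

Inserting $c = -1$ and $f(H) = mH$ into (17.4) then gives the factor $f'(H) + c = m - 1$ of (17.11).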
IV. The classical harmonic oscillator has Hamiltonian
$$2H_0(p, q) = \sum_{i=1}^{k} \frac{1}{m_i} p_i^2 + \sum_{i=1}^{k} a_i q_i^2, \qquad a_i > 0, \quad m_i > 0. \tag{17.12}$$
This is the special case of the Hamiltonian function (17.7) for $r = \ell = 2$ and $m = 1$. The harmonic oscillator dynamics is described by the degenerate integrable system. This fact follows from the form of the Hamiltonian function (17.12) in the action-angle coordinates [1,3]:
$$H_0(p, q) = H_0(I) = \omega_1 I_1 + \cdots + \omega_k I_k, \qquad \omega_\ell = \sqrt{\frac{a_\ell}{m_\ell}}. \tag{17.13}$$
Hence the corresponding Hessian matrix is identically equal to zero. Let us consider a Hamiltonian perturbation (17.8) of the harmonic oscillator whose Hamiltonian function is homogeneous of degree 2:
$$H(p, q) = H_0(p, q) + \sum_{s=1}^{N}\Bigl(\sum_{|\mu|=s} c_\mu q^\mu\Bigr)^{2/s} + \sum_{s=1}^{N} p_i \Bigl(\sum_{|\mu|=s} c_{i\mu} q^\mu\Bigr)^{1/s}. \tag{17.14}$$
Here $c_\mu$, $c_{i\mu}$ are arbitrary constants and $\mu$ is a multi-index. The vector field $X$ (17.9) for $r = \ell = 2$ is a symmetry of the corresponding Hamiltonian system (17.8) because we get $m = 1$ in Eq. (17.11). The Euler Theorem on homogeneous functions yields $X(H) = H$. Therefore the Hamiltonian system (17.8) does not satisfy the necessary condition (4.1) of Theorem 1, Part 2. Hence we obtain that any perturbation with Hamiltonian (17.14) is not a C-integrable non-degenerate Hamiltonian system in any invariant domain $O \subset \mathbb{R}^{2k}$.
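The homogeneity of (17.14) and the resulting identity $X(H) = H$ can be checked by a scaling argument. The sketch below restricts to $\lambda > 0$ and to points where the fractional powers are defined; this restriction is an assumption added here for the illustration.

```latex
\begin{align*}
% A monomial q^\mu with |\mu| = s is homogeneous of degree s in q:
(\lambda q)^\mu &= \lambda^{s} q^\mu \qquad (|\mu| = s,\ \lambda > 0), \\
% so each perturbation term of (17.14) has total degree 2 in (p, q):
\Bigl(\sum_{|\mu|=s} c_\mu (\lambda q)^\mu\Bigr)^{2/s}
   &= \lambda^{2}\Bigl(\sum_{|\mu|=s} c_\mu q^\mu\Bigr)^{2/s}, \\
(\lambda p_i)\Bigl(\sum_{|\mu|=s} c_{i\mu} (\lambda q)^\mu\Bigr)^{1/s}
   &= \lambda^{2}\, p_i \Bigl(\sum_{|\mu|=s} c_{i\mu} q^\mu\Bigr)^{1/s}, \\
% H_0 is quadratic, hence H(\lambda p, \lambda q) = \lambda^2 H(p, q), and Euler's theorem gives
p_i\frac{\partial H}{\partial p_i} + q_i\frac{\partial H}{\partial q_i} &= 2H, \\
% With X from (17.9) for r = \ell = 2, i.e. both coefficients equal to 1/2:
X(H) &= \tfrac{1}{2}\Bigl(p_i\frac{\partial H}{\partial p_i} + q_i\frac{\partial H}{\partial q_i}\Bigr) = H .
\end{align*}
```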
18. Form of the Hamiltonian in the Action-Angle Coordinates

I. Let us consider the Hamiltonian system
$$\dot{x}^\alpha = P^{\alpha\beta} H_{,\beta} = V^\alpha(x^1, \dots, x^n) \tag{18.1}$$
on a manifold $M^n$, $n = 2k$, with the non-degenerate Poisson structure $P^{\alpha\beta}$.

Lemma 4. A master symmetry $X$ of a Liouville-integrable $\mathbb{T}^k$-dense Hamiltonian system (18.1) satisfies the equation
$$L_X P = -P \tag{18.2}$$
if and only if it has the form
$$X = (I_j + c_j)\frac{\partial}{\partial I_j} + \frac{\partial B(I)}{\partial I_j}\frac{\partial}{\partial \varphi_j} \tag{18.3}$$
in the action-angle coordinates. Here $P$ is the non-degenerate Poisson structure (18.1), $B(I)$ is a smooth function and $c_j$ are arbitrary constants.
Proof. 1) In the action-angle coordinates $I_j$, $\varphi_j$, the symplectic structure $\omega = P^{-1}$ has the form
$$\omega = \sum_{j=1}^{k} dI_j \wedge d\varphi_j. \tag{18.4}$$
Therefore Eq. (18.2) takes the equivalent form
$$L_X\Bigl(\sum_{j=1}^{k} dI_j \wedge d\varphi_j\Bigr) = \sum_{j=1}^{k} dI_j \wedge d\varphi_j. \tag{18.5}$$
A direct calculation proves that vector field $X$ (18.3) satisfies Eq. (18.5).

2) Applying Lemma 3 from Sect. 14, we obtain that any master symmetry of the integrable $\mathbb{T}^k$-dense Hamiltonian system (18.1) has the form
$$X = \sum_{j=1}^{k}\Bigl(X^j(I)\frac{\partial}{\partial I_j} + X^{k+j}(I)\frac{\partial}{\partial \varphi_j}\Bigr). \tag{18.6}$$
Using the classical properties of the Lie derivative and formula (18.6), we transform Eq. (18.5) to the form
$$\sum_{j=1}^{k}\bigl(dX^j(I) \wedge d\varphi_j + dI_j \wedge dX^{j+k}(I)\bigr) = \sum_{j=1}^{k} dI_j \wedge d\varphi_j.$$
This equation is equivalent to the $k + 1$ equations
$$d(X^j(I) - I_j) = 0, \qquad j = 1, \dots, k, \tag{18.7}$$
$$d\Bigl(\sum_{j=1}^{k} X^{j+k}(I)\,dI_j\Bigr) = 0. \tag{18.8}$$
Solutions to Eqs. (18.7) have the form
$$X^j(I) = I_j + c_j, \qquad c_j = \mathrm{const}. \tag{18.9}$$
Applying Poincaré's Lemma to Eq. (18.8), we obtain
$$\sum_{j=1}^{k} X^{j+k}(I)\,dI_j = dB(I), \qquad X^{j+k}(I) = \frac{\partial B(I)}{\partial I_j} \tag{18.10}$$
for some smooth function $B(I)$. The derived formulae (18.9) and (18.10) prove Eq. (18.3).

II. Suppose system (18.1) has a conformal symmetry $X$ that satisfies the equations
$$L_X P = -P, \qquad L_X H = f(H), \tag{18.11}$$
where $f(H)$ is a smooth function of a single variable. The classical Hamiltonian systems (17.8) with the homogeneous Hamiltonians (17.7) and $m = r\ell(r + \ell)^{-1} \neq 1$ provide important applications for the following theorem.
Theorem 12. Suppose the C-integrable Hamiltonian system (18.1) either is $\mathbb{T}^k$-dense or analytic and possesses a conformal symmetry $X$ that satisfies Eqs. (18.11). Then the Hamiltonian function $H(I)$ satisfies the equation
$$\sum_{j=1}^{k} (I_j + c_j)\frac{\partial H(I)}{\partial I_j} = f(H(I)) \tag{18.12}$$
in the action-angle coordinates $I_j$, $\varphi_j$. Here $c_j$ are some constants.

Proof. 1) If the C-integrable system (18.1) is $\mathbb{T}^k$-dense then applying Lemma 4, we obtain that the conformal symmetry $X$ (18.11) has the form (18.3) in the action-angle coordinates $I_j$, $\varphi_\ell$. Therefore the second equation (18.11) yields Eq. (18.12).

2) Suppose the C-integrable system (18.1) is analytic and the maximal dimension of the closures of its trajectories is equal to $\ell \leq k$. Applying Theorem 8, Part 1, we obtain that the Hamiltonian system (18.1) takes the form
$$\dot{J}_i = 0, \qquad \dot{J}_\alpha = 0, \qquad \dot{\varphi}^1_i = \frac{\partial H(J)}{\partial J_i}, \qquad \dot{\varphi}^1_\alpha = 0, \tag{18.13}$$
$$H(J) = H(J_1, \dots, J_\ell) \tag{18.14}$$
after the unimodular canonical transformation (13.1) of the original action-angle coordinates $I_j$, $\varphi_j$. Here $i = 1, \dots, \ell$ and $\alpha = \ell + 1, \dots, k$. System (18.13) is $\mathbb{T}^\ell$-dense. Therefore applying Lemma 3 we get that any master symmetry $X$ of this system has the form ($\alpha, \beta = \ell + 1, \dots, k$)
$$X = \sum_{i=1}^{\ell}\Bigl(X^i(J, \varphi^1_\beta)\frac{\partial}{\partial J_i} + X^{k+i}(J, \varphi^1_\beta)\frac{\partial}{\partial \varphi^1_i}\Bigr) + \sum_{\alpha=\ell+1}^{k}\Bigl(X^\alpha(J, \varphi^1_\beta)\frac{\partial}{\partial J_\alpha} + X^{k+\alpha}(J, \varphi^1_\beta)\frac{\partial}{\partial \varphi^1_\alpha}\Bigr). \tag{18.15}$$
The symplectic structure $\omega$ (18.4) has the canonical form
$$\omega = \sum_{i=1}^{\ell} dJ_i \wedge d\varphi^1_i + \sum_{\alpha=\ell+1}^{k} dJ_\alpha \wedge d\varphi^1_\alpha.$$
The first equation (18.11) implies $L_X \omega = \omega$. In view of the formulae (18.15), this equation takes the form
$$\sum_{i=1}^{\ell}\bigl(dX^i \wedge d\varphi^1_i + dJ_i \wedge dX^{k+i}\bigr) + \sum_{\alpha=\ell+1}^{k}\bigl(dX^\alpha \wedge d\varphi^1_\alpha + dJ_\alpha \wedge dX^{k+\alpha}\bigr) = \sum_{i=1}^{\ell} dJ_i \wedge d\varphi^1_i + \sum_{\alpha=\ell+1}^{k} dJ_\alpha \wedge d\varphi^1_\alpha. \tag{18.16}$$
The differentials $d\varphi^1_i$ for $1 \leq i \leq \ell$ enter only into the first sums in both sides of Eq. (18.16). Therefore the equality of these two expressions implies
$$X^i(J, \varphi^1_\alpha) = J_i + c_i, \tag{18.17}$$
where $c_i$ are some constants. After the substitution of the formulae (18.14), (18.15) and (18.17), the second equation in (18.11) takes the form
$$\sum_{m=1}^{k} (J_m + c_m)\frac{\partial H(J)}{\partial J_m} = f(H). \tag{18.18}$$
Here all summands for $m = \ell + 1, \dots, k$ are equal to zero because $H(J)$ depends upon the $\ell$ coordinates $J_1, \dots, J_\ell$ only. Applying the inverse unimodular transform (13.1) to Eq. (18.18), we obtain Eq. (18.12) in the original action-angle coordinates $I_j$, $\varphi_j$, where $c_j = a_{mj} c_m$.

Remark 16. If the integrable Hamiltonian system (18.1) is not analytic but differentiable then Theorem 8, Part 2 implies that the Hamiltonian function $H(I)$ has form (13.3) in the special action-angle coordinates (13.2) for each ball $B_\tau \subset B_a$. The integrable system (18.1) is $\mathbb{T}^{\ell(\tau)}$-dense for $I \in B_\tau$. Therefore if Eqs. (18.11) are satisfied then the same proof as above is applicable for $I \in B_\tau$. Hence we obtain that the Hamiltonian function $H(I)$ satisfies the equation
$$\sum_{j=1}^{k} (I_j + c^\tau_j)\frac{\partial H(I)}{\partial I_j} = f(H(I))$$
in each ball $B_\tau$ in the original action coordinates $I_1, \dots, I_k$. However, in general the constants $c^\tau_j$ are different in the different balls $B_\tau$. The union of all balls $B_\tau$ is everywhere dense in the ball $B_a$.

Remark 17. If $f(H) = mH$ then Eq. (18.12) means that the Hamiltonian $H(\tilde{I}_1, \dots, \tilde{I}_k)$ is a degree $m$ homogeneous function of the $k$ variables $\tilde{I}_j = I_j + c_j$.

Example 1. We first apply Theorem 12 to the harmonic oscillator that corresponds to the Hamiltonian $H_0(I)$ (17.12). Here $r = 2$ and $\ell = 2$, hence $m = 1$. The conformal symmetry (17.9) has the form
$$X = \frac{1}{2} p_i \frac{\partial}{\partial p_i} + \frac{1}{2} q_i \frac{\partial}{\partial q_i}$$
and satisfies Eqs. (18.11) for $f(H) = H$. Applying Theorem 12, we obtain that the Hamiltonian $H_0(I)$ is a degree 1 homogeneous function in the action-angle coordinates. This result is confirmed by the explicit formula (17.13).

Example 2. Let us apply Theorem 12 to the classical Kepler problem that corresponds to the Hamiltonian function
$$H(p, q) = \frac{1}{2m}(p_1^2 + p_2^2 + p_3^2) - \frac{G M_0 m}{\sqrt{q_1^2 + q_2^2 + q_3^2}}. \tag{18.19}$$
Here $\ell = 2$ and $r = -1$, hence the homogeneity degree (17.10) is $m = -2$. The conformal symmetry (17.9) has the form
$$X = -p_i \frac{\partial}{\partial p_i} + 2 q_i \frac{\partial}{\partial q_i}$$
and satisfies Eqs. (18.11) for $f(H) = -2H$. All orbits of the Kepler problem for $H(p, q) < 0$ are closed curves. Therefore, applying Theorem 8, Part 1, we obtain that there exist the action-angle coordinates $I_1, I_2, I_3, \varphi_1, \varphi_2, \varphi_3$ where Hamiltonian function (18.19) has the form $H = H(I_1)$. Applying Theorem 12 for $f(H) = -2H$, we obtain
$$H(I) = -\frac{K}{I_1^2}. \tag{18.20}$$
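The passage from Eq. (18.12) to formula (18.20) amounts to integrating one separable ordinary differential equation. The sketch below assumes that the constant $c_1$ of Theorem 12 is absorbed by the shift $I_1 \to I_1 + c_1$ of the action variable, which is how the form (18.20) without $c_1$ is read here; $K$ is the constant of integration.

```latex
\begin{align*}
% Eq. (18.12) with H = H(I_1) and f(H) = -2H: only the j = 1 term survives.
(I_1 + c_1)\,\frac{dH}{dI_1} &= -2H, \\
% Separate the variables and integrate:
\frac{dH}{H} &= -2\,\frac{dI_1}{I_1 + c_1}
   \quad\Longrightarrow\quad \ln|H| = -2\ln|I_1 + c_1| + \mathrm{const}, \\
% For the bound orbits H < 0, so the integration constant is written as -K with K > 0:
H &= -\frac{K}{(I_1 + c_1)^2}, \\
% Assumed shift of the action variable, I_1 + c_1 \to I_1, gives (18.20):
H(I) &= -\frac{K}{I_1^2}.
\end{align*}
```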
Thus we have derived without any calculations one of Poincaré's classical results [40]. Using the Delaunay variables, Poincaré constructed the action-angle coordinates $I_1, I_2, I_3, \varphi_1, \varphi_2, \varphi_3$, where the Hamiltonian $H(p, q)$ (18.19) has the form (18.20) for $K = G^2 M_0^2 m^3/2$.

19. Concluding Remarks on the Theorem on Symmetries

(i) In the present paper we have proved Theorem 1 on symmetries that states that the Liouville-integrable Hamiltonian system
$$\dot{x}^\alpha = P^{\alpha\beta} H_{,\beta} = V^\alpha \tag{19.1}$$
is non-degenerate in the Kolmogorov sense and all its invariant submanifolds are compact if and only if the Lie algebra S of symmetries of system (19.1) is abelian. The second necessary and sufficient condition states that any symmetry $U$ and any first integral $F$ satisfy the equation $dF(U) = U(F) = 0$. Each one of these two invariant properties completely characterizes the C-integrable and non-degenerate Hamiltonian systems among the general integrable systems.

(ii) Theorem 1 on symmetries implies the necessary conditions for the non-degenerate C-integrability of a given dynamical system $V$. One of these necessary conditions entails that any Hamiltonian system that possesses two non-commuting symmetries either is not C-integrable or is degenerate.

(iii) In Theorem 2, we have proved that the C-integrable Hamiltonian system (19.1) is non-degenerate in the iso-energetic sense if and only if the corresponding Lie algebra $S_{ec}$ of the iso-energetic conformal symmetries is abelian.

(iv) In Sect. 6, we have introduced a concept of B-integrability of the Hamiltonian systems on symplectic manifolds $M^{2k}$. There are $k(k + 1)/2$ canonical forms (6.9) of the B-integrable Hamiltonian systems that are non-isomorphic to Liouville's classical form (3.3). The Lie algebra S of symmetries of the canonical form (6.9) is abelian if and only if its invariant tori $\mathbb{T}^q$ either are Lagrangian or coisotropic, $q = k, k + 1, \dots, 2k$, and the non-degeneracy condition (6.11) is met.

(v) In Sect. 7, we have defined the A-B-C-cohomologies
$$H^m_A(V, M^n), \qquad H^m_B(V, M^n), \qquad H^m_C(V, M^n) \tag{19.2}$$
for a dynamical system $V$ on a smooth manifold $M^n$. In Proposition 6, we have constructed the representations $R_m$ of the Lie algebra S of symmetries of the dynamical systems $V$ in the cohomologies (19.2). The invariance of the B-cohomologies $H^m_B(V, M^n)$ with respect to any connected Lie group of symmetries of system $V$ has been established.

(vi) In Theorem 4, we have proved that if the C-integrable non-degenerate Hamiltonian system (19.1) preserves a (1,1) tensor $A^\alpha_\beta(x)$ then the general dynamical systems
$$\dot{x}^\alpha = \Bigl(\sum_{m=-\ell}^{\ell} a_m(x) A^m V\Bigr)^\alpha \tag{19.3}$$
are completely integrable. Here $a_m(x)$ are arbitrary first integrals of system (19.1). We have proved that all dynamical systems (19.3) commute with each other and have the same invariant submanifolds as system (19.1). The proof of Theorem 4 follows from Theorem 1 practically without calculations.

(vii) In Theorem 5, we have presented several necessary conditions for strong dynamical compatibility of two incompatible Poisson structures $P$ and $Q$ assuming that at least one of them is non-degenerate. We construct a rich family of tensor invariants using the recursion operator $A = PQ^{-1}$. The proof of Theorem 5 is a purely logical consequence of Theorem 1 on symmetries.

(viii) When both Poisson structures $P$ and $Q$ are degenerate, the recursion operator $A$ does not exist. Therefore we have developed new constructions of tensor invariants which are applicable for two arbitrary Poisson structures. We have introduced the distributions $B_1 \subset B_2 \subset T(M^{2k})$ and maps $F$ of the manifold $M^{2k}$ into the real projective spaces $\mathbb{RP}^{N(k)}$. These invariants are constructed from the Schouten bracket $[P, Q]$ and an arbitrary non-degenerate volume form $\omega_{2k}$ on the manifold $M^{2k}$. The maps $F$ are first integrals of any dynamical system that preserves the two Poisson structures $P$ and $Q$. The maps $F$ annihilate the distributions $B_1 \subset B_2$: $dF(B_2) = 0$. The simplest necessary conditions for the strong dynamical compatibility have the form
$$\mathrm{rank}\,dF \leq k, \qquad \dim B_2 \leq k.$$

(ix) In Theorem 10, we have proved that: (a) the master symmetries of any dynamical system (2.1) with quasi-periodic dynamics form a Lie algebra M, (b) the symmetries of order $p \geq 3$ do not exist for such systems, (c) the Lie algebras M and S coincide if and only if the system (2.1) has constant coefficients. In Proposition 11, we have presented the applications of Theorem 10 to the necessary conditions for the B-integrability of a Hamiltonian system with compact invariant submanifolds.

(x) In Proposition 13, we have presented the second proof of the fact that the master symmetries of the C-integrable non-degenerate Hamiltonian systems form a Lie algebra M and that the Lie algebra of symmetries S ⊂ M is an abelian ideal in M. These results are derived as the logical consequences of Theorem 1 on symmetries.

Note added in proof

20. Conformal Symmetries, KAM Theory, and Rings of Cohomologies $H^*_B(f, M^n)$ of Smooth Mappings

I. Let us investigate the geometrical meaning of the conformal symmetries (1.2).

Definition 12. A diffeomorphism $f : M^n \to M^n$ is called a conformal symmetry of a dynamical system $\dot{x}^i = V^i(x^1, \dots, x^n)$ if $df(V_{f^{-1}x}) = a(x)V_x$, where $a(x) > 0$ is a first integral of the dynamical system $V$: $V(a(x)) = 0$.
A conformal symmetry $f$ maps trajectories of the system $V$ into trajectories and introduces a new parametrization which is related to the old one by a scalar factor $a(x)$ that is constant along each trajectory.
Examples. For the Calogero–Moser system (15.14), the function $F$ (15.15) defines a flow of the Hamiltonian conformal symmetries. The vector field $X$ (17.9) defines a flow of the conformal symmetries of the Hamiltonian systems (17.7)–(17.8).
Definition 13. A diffeomorphism $f : M^n \to M^n$ is called a general conformal symmetry of a dynamical system $V$ if $df(V_{f^{-1}x}) = b(x) V_x$ at all points $x \in M^n$. Here $b(x) > 0$ is an arbitrary smooth function on $M^n$.
A general conformal symmetry $f$ transforms trajectories of the dynamical system $V$ into trajectories and introduces an arbitrary new parametrization. It is obvious that the conformal symmetries and the general conformal symmetries form Lie groups which contain the Lie group of symmetries.
Theorem 13. 1) A vector field $X$ generates a flow $\varphi_t$ of the conformal symmetries if and only if it satisfies the equation
$$[X, V] = c(x) V, \qquad (20.1)$$
where $c(x)$ is a first integral of the dynamical system $V$. 2) A vector field $X$ generates a flow $\varphi_t$ of the general conformal symmetries of the dynamical system $V$ if and only if the function $c(x)$ in Eq. (20.1) is an arbitrary smooth function.
Proof. We suppose that all critical points of the system $V$ form a submanifold $L \subset M^n$ of dimension $d \le n-1$. The proof of Theorem 13 readily follows in the local coordinates $x^1, \dots, x^n$ where the respective vector field $V$ has the form $V = \partial/\partial x^1$. An atlas of such local coordinates covers the open manifold $M^n \setminus L$. Therefore Theorem 13 on the entire manifold $M^n$ follows by continuity in view of the invariant context of Definitions 12 and 13 and Eq. (20.1).
The same method as in Theorem 9 of Sect. 14 proves that the vector fields $X$ satisfying Eq. (20.1) with arbitrary smooth functions $c(x)$ form a Lie algebra.
We refer to it as the Lie algebra $S_{gc}$ of general conformal symmetries. The Lie algebra $S_{gc}$ contains a non-abelian ideal of vector fields $h(x)V$, where $h(x)$ is an arbitrary smooth function. A general conformal symmetry $X$ is a master symmetry of the dynamical system $V$ if and only if the function $c(x)$ in Eq. (20.1) is a first integral of the system $V$: $V(c(x)) = 0$, see Sect. 14. Therefore the Lie algebra $S_c$ of conformal symmetries satisfies the relation $S_c = S_{gc} \cap M$. Here $M$ is the linear space of master symmetries (14.2).
II. The Kolmogorov non-degeneracy condition [24,25] and the iso-energetic non-degeneracy condition [2] are sufficient for the applicability of the Kolmogorov-Arnold-Moser theory to any small Hamiltonian perturbations of a C-integrable Hamiltonian system $V$. Other sufficient conditions were derived by Rüssmann in [43]. Theorem 1 proves that the Kolmogorov non-degeneracy condition has an equivalent invariant formulation (1). Theorem 2 proves that the iso-energetic non-degeneracy condition possesses the following equivalent invariant formulation (2).
Two Sufficient Conditions for Applicability of the KAM Theory. 1) The Lie algebra of symmetries $S$ of the C-integrable Hamiltonian system $V$ is abelian. 2) The Lie algebra of the iso-energetic conformal symmetries $S_{ec}$ of the system $V$ is abelian.
In this section we consider the dynamics of small Hamiltonian perturbations of the C-integrable systems which possess non-commutative Hamiltonian symmetries.
Theorem 14. Suppose that an analytic C-integrable Hamiltonian system (3.3) possesses two first integrals $F_1(x)$ and $F_2(x)$ for which the Poisson bracket
$$\{F_1, F_2\}(x_0) = P^{\alpha\beta} F_{1,\alpha} F_{2,\beta}(x_0) \ne 0 \qquad (20.2)$$
at a point $x_0$ in the toroidal domain $O = B_a \times \mathbb{T}^k$. Then there are Hamiltonian perturbations
$$\dot{I}_j = -\varepsilon \frac{\partial H_1(I, \varphi)}{\partial \varphi_j}, \qquad \dot{\varphi}_j = \frac{\partial H(I)}{\partial I_j} + \varepsilon \frac{\partial H_1(I, \varphi)}{\partial I_j} \qquad (20.3)$$
with an arbitrary parameter $\varepsilon \ne 0$ that have no invariant tori $\mathbb{T}^q$, $1 \le q \le 2k-1$, outside of an invariant submanifold $M^{k+\ell} \subset O$, $\dim M^{k+\ell} \le 2k-1$. The submanifold $M^{k+\ell}$ is foliated by the $\ell$-dimensional invariant tori $\mathbb{T}^\ell$, $\ell < k$. For the Hamiltonian system (20.3) with $\varepsilon \ne 0$, the Kolmogorov set $K$ of the invariant tori $\mathbb{T}^k$ is empty.
The proof uses the Alternative of Sect. 11 and Theorem 8, part 1. The proof will be published elsewhere.
Let us consider perturbations with the Hamiltonian functions $H(I) + \varepsilon H_1(I, \varphi) + \varepsilon_2 H_2(I, \varphi)$, where $H_2(I_1, \dots, I_k, \varphi_1, \dots, \varphi_k)$ is an arbitrary analytic (or smooth, or of class $C^1$) function of all variables $I_i$, $\varphi_j$. Let $V(\varepsilon, \varepsilon_2)$ be the corresponding Hamiltonian system.
Corollary 8. For an arbitrary $\varepsilon$ and $\varepsilon_2 \ll \varepsilon$, $\varepsilon_2 \to 0$, the Hamiltonian system $V(\varepsilon, \varepsilon_2)$ has no invariant tori $\mathbb{T}^q$, $1 \le q \le 2k-1$, outside of a small neighbourhood $O_2(\varepsilon, \varepsilon_2)$ of the invariant submanifold $M^{k+\ell} \subset O$. The measure $\operatorname{mes} O_2(\varepsilon, \varepsilon_2) \to 0$ for $\varepsilon_2 \to 0$. For the Kolmogorov set $K$ of the invariant tori $\mathbb{T}^k$, one has $\operatorname{mes} K \to 0$ for $\varepsilon_2 \to 0$.
The proof follows from Theorem 14 and the theorem on the continuous dependence of the solutions of a dynamical system on its right-hand side.
Theorem 14 and Corollary 8 show that the small Hamiltonian perturbations (20.3) of the C-integrable Hamiltonian systems with non-commuting Hamiltonian symmetries enjoy a dynamics that is different from that described by the KAM theory. Thus we arrive at
The Necessary Condition for Applicability of the KAM Theory. The Lie algebra $H$ of first integrals of the C-integrable Hamiltonian system $V$ must be abelian.
Remark 18. Any analytic C-integrable Lagrangian system on a manifold $M^k$ that possesses two non-commuting symmetries possesses two Noether first integrals which have a non-zero Poisson bracket.
Therefore, the respective C-integrable Hamiltonian system on the cotangent bundle $T^*(M^k)$ does not satisfy the necessary condition for applicability of the KAM theory. Any such system possesses small Hamiltonian perturbations (20.3) which have no $k$-dimensional invariant tori $\mathbb{T}^k$.
Proposition 14. Suppose that for an analytic C-integrable Hamiltonian system $V$ the Lie algebra of master symmetries $M$ coincides with the Lie algebra of symmetries $S$: $M = S$. Then in a toroidal domain $O = B_a \times \mathbb{T}^k \subset M^{2k}$ there exist Hamiltonian perturbations (20.3) which have no invariant tori $\mathbb{T}^q$, $1 \le q \le 2k-1$, for $\varepsilon \ne 0$ outside of an invariant submanifold $M^{k+1} \subset O$. All trajectories of system (20.3) on the submanifold $M^{k+1}$ are closed curves.
The proof is based on Theorem 10, which implies for $M = S$ that the frequencies $\omega_j$ of the C-integrable Hamiltonian system $V$ are constant in the toroidal domain $O$.
Proposition 15. Suppose that an analytic C-integrable Hamiltonian system $V$ on a symplectic manifold $M^{2k}$ possesses $k$ symmetries $U_1, \dots, U_k$ and $k$ first integrals $F_1(x), \dots, F_k(x)$ which satisfy the condition
$$\det \| U_j(F_\ell(x_0)) \| \ne 0 \qquad (20.4)$$
at a point $x_0 \in M^{2k}$. Then there are Hamiltonian perturbations (20.3) in a toroidal domain $O = B_a \times \mathbb{T}^k \subset M^{2k}$ that have no invariant tori $\mathbb{T}^q$, $1 \le q \le 2k-1$, for $\varepsilon \ne 0$ outside of an invariant submanifold $M^{k+1} \subset O$. All trajectories of system (20.3) on the submanifold $M^{k+1}$ are closed curves.
The proof uses Corollary 1 of Sect. 2, which implies that the frequencies $\omega_j$ of the C-integrable Hamiltonian system $V$ are constant provided that the condition (20.4) holds.
Propositions 14 and 15 lead to the following
Additional Necessary Condition for Applicability of the KAM Theory. 1) The C-integrable Hamiltonian system $V$ must possess a master symmetry $X$ that is not a symmetry:
$$[V, [V, X]] = 0, \qquad [V, X] \ne 0.$$
2) Any $k$ symmetries $U_j$ and any $k$ first integrals $F_\ell(x)$ of the system $V$ must satisfy the equation $\det \| U_j(F_\ell(x)) \| = 0$ at all points $x$ of the symplectic manifold $M^{2k}$.
We have presented here two independent formulations (1) and (2) of the additional necessary condition. Their equivalence follows from Theorem 10 and Corollary 1.
III. In Sect. 7 we have introduced the A-B-C-cohomologies for dynamical systems. Among them only the B-cohomologies admit a discretization. Let $f$ be a smooth mapping of a manifold $M^n$ into itself. Let $\Lambda_f^k$ be the space of $f$-invariant $k$-forms $\omega_k$ which satisfy the invariance equation $f^* \omega_k = \omega_k$. We consider the complex of $f$-invariant differential forms
$$0 \longrightarrow \Lambda_f^0 \xrightarrow{\ d\ } \Lambda_f^1 \xrightarrow{\ d\ } \cdots \xrightarrow{\ d\ } \Lambda_f^{n-1} \xrightarrow{\ d\ } \Lambda_f^n \longrightarrow 0.$$
In paper [8] we defined the B-cohomologies of the mapping $f$ by the formula $H_B^*(f, M^n) = \operatorname{Ker} d / \operatorname{Im} d$. The cohomologies $H_B^*(f, M^n)$ have a ring structure that is induced by the wedge product of the $f$-invariant differential forms.
If a diffeomorphism $f_\tau$ is a shift for time $\tau$ along trajectories of a dynamical system $V$, then the ring homomorphism $H_B^*(V, M^n) \to H_B^*(f_\tau, M^n)$ is defined. Therefore
the cohomologies $H_B^*(f_\tau, M^n)$ form a discretization of the cohomologies $H_B^*(V, M^n)$ when $\tau \to 0$.
There is a ring homomorphism $\alpha : H_B^*(f, M^n) \to H^*(M^n)$ that converts a cohomology class of $f$-invariant closed $q$-forms into the corresponding de Rham cohomology class of general closed $q$-forms.
Remark 19. In Sect. 7, we have proved that any group $G$ of symmetries of a dynamical system $V$ possesses linear representations in the corresponding rings of the A-B-C-cohomologies (7.4). Analogously, any group $\Gamma$ of diffeomorphisms of a manifold $M^n$ which commute with a smooth mapping $f : M^n \to M^n$ has linear representations in the cohomologies $H_B^m(f, M^n)$, $m = 1, \dots, n$.
Let the $(n+1)$-dimensional manifold $S_f M^n$ be the suspension of the manifold $M^n$ corresponding to a diffeomorphism $f : M^n \to M^n$. Let $t \in [0, 1]$ be the "vertical" coordinate on $S_f M^n$ that defines the vector field $V = \partial/\partial t$. The mapping $f$ induces a diffeomorphism $f_1$ of the suspension $S_f M^n$: $f_1(x, t) = (f(x), t)$. It is evident that the diffeomorphism $f_1$ is the shift for time 1 along the trajectories of the dynamical system $V = \partial/\partial t$ on $S_f M^n$. Any $f$-invariant $k$-form $\omega_k$ on $M^n$ induces a $V$-invariant $k$-form $E\omega_k$ on $S_f M^n$. The $k$-form $E\omega_k$ is defined by the embeddings $E_t : M^n \to S_f M^n$, $E_t(x) = (x, t)$, and satisfies the relations $E_t^*(E\omega_k) = \omega_k$, $i_V E\omega_k = 0$.
Theorem 15. For any diffeomorphism $f : M^n \to M^n$, the following two rings of cohomologies are isomorphic:
$$H_B^*(V, S_f M^n) = H_B^*(f, M^n) \otimes H^*(S^1).$$
Here $H^*(S^1) = \mathbb{R}^1 \oplus \mathbb{R}^1$ is the ring of de Rham cohomologies of $S^1$.
The proof readily follows from the fact that any $V$-invariant $(k+1)$-form $\eta_{k+1}$ on $S_f M^n$ has the form $\eta_{k+1} = dt \wedge E\omega_k + E\omega_{k+1}$, where $\omega_k$ and $\omega_{k+1}$ are $f$-invariant forms on $M^n$.
Let the manifold $M^{2k}$ be a toroidal domain $O = B_a \times \mathbb{T}^k$, where $B_a \subset \mathbb{R}^k$ is a ball (2.2). We define a diffeomorphism $f : O \to O$ by the formulae
$$f(I_j, \varphi_j) = (I_j, \varphi_j + \alpha_j(I)), \qquad j = 1, \dots, k, \qquad (20.5)$$
where $\alpha_j(I)$ are arbitrary smooth functions.
Theorem 16. Suppose a diffeomorphism $f$ (20.5) satisfies the non-degeneracy condition $\det \| \partial \alpha_\ell(I) / \partial I_j \| \ne 0$, $\ell, j = 1, \dots, k$. Then the first five cohomology groups have the form
$$H_B^0(f, O) = \mathbb{R}^1, \quad H_B^1(f, O) = 0, \quad H_B^2(f, O) = \mathbb{R}^\infty, \quad H_B^3(f, O) = 0, \quad H_B^4(f, O) = \mathbb{R}^\infty.$$
The proof is based on the explicit constructions of the invariant differential forms in the local coordinates $\varphi_j$, $J_\ell = \alpha_\ell(I)$. The proof will be published elsewhere.
Acknowledgement. The author would like to thank the referees for their suggestions.
References
1. Abraham, R., Marsden, J. E.: Foundations of Mechanics. London: The Benjamin/Cummings Publishing Co., 1978
2. Arnold, V. I.: Proof of A. N. Kolmogorov's theorem on the preservation of quasi-periodic motions under small perturbation of the Hamiltonian. Uspekhi Mat. Nauk. U.S.S.R. 18, Ser. 5, 13–40 (1963)
3. Arnold, V. I.: Mathematical methods of classical mechanics. Berlin, Heidelberg, New York: Springer Verlag, 1978
4. Atiyah, M. F.: Convexity and commuting Hamiltonians. Bull. London Math. Soc. 14, 1–15 (1982)
5. Bogoyavlenskij, O. I.: Invariant incompatible Poisson structures. Proc. R. Soc. Lond. A 450, 723–730 (1995)
6. Bogoyavlenskij, O. I.: Incompatible Poisson structures and integrable Hamiltonian systems. C. R. Math. Rep. Acad. Sci. Canada 17, 123–128 (1995)
7. Bogoyavlenskij, O. I.: A cohomology for dynamical systems. C. R. Math. Rep. Acad. Sci. Canada 17, 253–258 (1995)
8. Bogoyavlenskij, O. I.: The A-B-C-cohomologies for dynamical systems. C. R. Math. Rep. Acad. Sci. Canada 18, 199–204 (1996)
9. Bogoyavlenskij, O. I.: Theory of tensor invariants of integrable Hamiltonian systems. I. Incompatible Poisson structures. Commun. Math. Phys. 180, 529–586 (1996)
10. Bogoyavlenskij, O. I.: A concept of integrability of dynamical systems. C. R. Math. Rep. Acad. Sci. Canada 18, 163–168 (1996)
11. Bogoyavlenskij, O. I.: The Lie algebraic criteria for two concepts of non-degeneracy of integrable Hamiltonian systems. Submitted for publication (1996)
12. Bogoyavlenskij, O. I.: An extended concept of integrability and its applications. I. Two topological invariants of integrable Hamiltonian systems. Submitted for publication (1996)
13. Brouzet, R.: About the existence of recursion operators for completely integrable Hamiltonian systems near a Liouville torus. J. Math. Phys. 34, 1309–1313 (1993)
14. Cartier, P.: Some fundamental techniques in the theory of integrable systems. In: Integrable systems. The Verdier Memorial Conference: Actes du Colloque International de Luminy. O. Babelon, P. Cartier, and Y. Kosmann-Schwarzbach (Eds.). Progress in Math. 115. Singapore: World Scientific Publishing Company, 1994, pp. 2–41
15. Damianou, P. A.: Master symmetries and R-Matrices for the Toda lattice. Lett. Math. Phys. 20, 101–112 (1990)
16. De Rham, G.: Sur l'Analysis situs des variétés à n dimensions. J. de Math. Pures et Appl. 10, 115–200 (1931)
17. Fernandes, R. L.: On the master symmetries and bi-Hamiltonian structure of the Toda lattice. J. Phys. A. 26, 3797–3803 (1993)
18. Flaschka, H.: Integrable systems and torus actions. In: Integrable systems. The Verdier Memorial Conference: Actes du Colloque International de Luminy. O. Babelon, P. Cartier, and Y. Kosmann-Schwarzbach (Eds.). Progress in Math. 115. Singapore: World Scientific Publishing Company, 1994, pp. 43–101
19. Fuchssteiner, B.: Mastersymmetries, higher order time-dependent symmetries and conserved densities of nonlinear evolution equations. Progr. Theor. Phys. 70, 1508–1522 (1983)
20. Gruber, P. M., Lekkerkerker, C. G.: Geometry of numbers, 2nd edition. Amsterdam, New York, Oxford, Tokyo: North-Holland, 1991
21. Herman, M. R.: Exemples de flots hamiltoniens dont aucune perturbation en topologie $C^\infty$ n'a d'orbites périodiques sur un ouvert de surfaces d'énergies. C. R. Acad. Sci. Paris 312, S. I., 989–994 (1991)
22. Herman, M. R.: Différentiabilité optimale et contre-exemples à la fermeture en topologie $C^\infty$ des orbites récurrentes de flots hamiltoniens. C. R. Acad. Sci. Paris 313, S. I., 49–51 (1991)
23. Kelley, J. L.: General Topology. Princeton, New Jersey: D. Van Nostrand Co., 1968
24. Kolmogorov, A. N.: On conservation of conditionally periodic motions under small variations of the Hamiltonian. Dokl. Akad. Nauk SSSR 98, 527–530 (1954)
25. Kolmogorov, A. N.: The general theory of dynamical systems and classical mechanics. In: Proc. Intern. Congr. Math. 1954, 1. Amsterdam: North-Holland Publ. Co., 1957, pp. 315–333
26. Liouville, J.: Note sur l'intégration des équations différentielles de la dynamique. J. Math. Pures Appl. 20, 137–138 (1855)
27. Magri, F.: A simple model of an integrable Hamiltonian system. J. Math. Phys. 19, 1156–1162 (1978)
28. Markus, L., Meyer, K. R.: Generic Hamiltonian dynamical systems are neither integrable nor ergodic. Memoirs of AMS 144 (1974)
29. Marsden, J., Weinstein, A.: Reduction of symplectic manifolds with symmetry. Rep. Math. Phys. 5, 121–130 (1974)
30. Marsden, J., Ratiu, T., Weinstein, A.: Semi-direct products and reduction in mechanics. Trans. Am. Math. Soc. 281, 147–177 (1984)
31. Moser, J.: On invariant curves of area-preserving mappings of an annulus. Nachr. Akad. Wiss. Göttingen, Math.-Phys. 1, 1–20 (1962)
32. Moser, J.: Old and new applications of KAM theory. NATO Conference "Hamiltonian systems of 3 and more degrees of freedom". Barcelona, August 1995
33. Oevel, W.: A geometric approach to integrable systems admitting time dependent invariants. In: Topics in soliton theory and exactly solvable nonlinear equations, M. Ablowitz, B. Fuchssteiner, and M. Kruskal (Eds.). Singapore: World Scientific, 1986, pp. 108–124
34. Oevel, W., Falck, M.: Master symmetries for finite dimensional integrable systems. Progr. Theor. Phys. 75, 1328–1341 (1986)
35. Oevel, W., Zhang, H., Fuchssteiner, B.: Mastersymmetries and multi-Hamiltonian formulations for some integrable lattice systems. Progr. Theor. Phys. 81, 294–308 (1989)
36. Olver, P. J.: BiHamiltonian systems. In: Pitman Research Notes in Mathematics Series 157, B. D. Sleeman, R. J. Jarvis (Eds.). New York: Longman Scientific & Technical, 1987, pp. 176–193
37. Olver, P. J.: Applications of Lie groups to differential equations, 2nd edition. Berlin, Heidelberg, New York: Springer Verlag, 1993
38. Parasyuk, I. O.: Conservation of multidimensional invariant tori of Hamiltonian systems. Ukr. Math. J. 36, 380–385 (1984)
39. Parasyuk, I. O.: Reduction and coisotropic invariant tori of Hamiltonian systems with non-Poisson commutative symmetries. I, II. Ukr. Math. J. 46, 572–580, 994–1002 (1994)
40. Poincaré, H.: Les méthodes nouvelles de la mécanique céleste. T. 1. Paris: Gauthier-Villars, 1892
41. Siegel, C. L., Moser, J. K.: Lectures on celestial mechanics. Berlin, Heidelberg, New York: Springer Verlag, 1971
42. Souriau, J. M.: Structure des systèmes dynamiques. Paris: Dunod, 1970
43. Rüssmann, H.: Non-degeneracy on the perturbation theory of integrable dynamical systems. In: London Mathematical Society Lecture Note Series 134, M. M. Dodson, J. A. G. Vickers (Eds.). Cambridge: Cambridge University Press, 1989, pp. 5–18
Communicated by M. Jimbo
Commun. Math. Phys. 184, 367 – 385 (1997)
Communications in Mathematical Physics © Springer-Verlag 1997
The Electron Density in Intermediate Scales
Alexei Iantchenko
Institut de Mathématiques, IRMAR, Campus de Beaulieu, Université de Rennes 1, 35042 Rennes, Cedex, France
Received: 2 January 1996 / Accepted: 24 July 1996
Abstract: The electron density of the neutral atoms at the distances $Z^{-\gamma}$, $\gamma \in (1/3, 1)$, from the nucleus, in the limit when the charge of the nucleus $Z$ tends to infinity, is well approximated by the function $\mathrm{const}\, Z^{3/2} |x|^{-3/2}$, which is a common limiting value for both the Thomas-Fermi density at the origin and the hydrogen density at infinity. This conjecture by Lieb is proved in some weak sense by using the Ivrii and Sigal method.
1. Introduction
We consider here the ground states of large atoms within the framework of the nonrelativistic Schrödinger equation with fixed, i.e., infinitely massive, nuclei. A powerful tool for the investigation of such systems is the Thomas-Fermi theory, created in 1927 by Thomas and Fermi. In 1977 Lieb and Simon justified mathematically that the leading term in the expansion of the ground state energy in powers of the nuclear charge $Z$ is given exactly by the Thomas-Fermi theory, see [9] and [7]. This leading term in the energy is proportional to $Z^{7/3}$, with the proportionality constant depending on the ratio $N/Z$, which is assumed to be held fixed as $Z \to \infty$. Here $N$ is the electron number. The system is not required to be neutral, i.e., $N = Z$, even though the neutral case is of primary physical interest. The characteristic length scale for the electron density (in the sense that all the electrons can be found on this scale in the limit $Z \to \infty$) is $Z^{-1/3}$. Lieb and Simon supplemented their result for the energy by one for the electron density. Namely, in [9] they proved that the true quantum-mechanical electron density, $\rho_d$, converges (after suitable scaling) to the Thomas-Fermi density, $\rho^{TF}$, as $Z \to \infty$ with $N/Z$ fixed.
It was conjectured in 1952 by Scott [10] and proved later by Siedentop and Weikard [13, 14] in the atomic case, by Bach [1] in the ionic case and by Ivrii and Sigal [5] in the molecular case that the first correction is of order $Z^2$ and arises from extreme quantum mechanical effects on the innermost electrons, which are at a distance $Z^{-1}$ from the
nucleus. It was proved that the energy is $E^{TF} + \frac{q}{8} Z^2 + o(Z^2)$ for fixed $N/Z \ne 0$, where $q$ is the number of spin states per electron (of course $q = 2$ in nature). For different proofs and extensions of this result we refer the reader to the list of references in the paper by Lieb, Siedentop and the author [4].
In [7], Eq. (5.34), Lieb extended the Scott conjecture for the energy by one for the density. His idea was that the electron density $\rho_d$ in the ball of radius $1/Z$ around the origin is determined by the unscreened (bare) Coulomb potential. He stated that in the limit $Z \to \infty$ a suitably scaled $\rho_d$ converges to the sum of the squares of all the hydrogenic bound states, $\rho^H$ (defined below). In [4], this conjecture about the density related to the Scott correction was proved in several senses, one of which is a "pointwise" convergence on spheres. We shall call the region at distances of order $Z^{-1}$ from the nucleus the "Scott region".
In the present article we discuss the behavior of the electron density at distances of order $Z^{-\gamma}$, where $\gamma$ is a constant between 1/3 and 1, i.e., the region in the atom between the "Scott region" and the region of the characteristic length for the electron density, which is of order $Z^{-1/3}$ (the "Thomas-Fermi region"). We shall call this region the "intermediate region". We prove that here the limit $Z \to \infty$ of the suitably scaled $\rho_d$ averaged with a smooth test function coincides with the asymptotic value at the origin of the averaged Thomas-Fermi density. This result was also conjectured in [7].
Our main proof strategy is the same as in [4], used already by Lieb and Simon in [9]. We add $\varepsilon$ times a one-body test potential, scaled appropriately for the region in question, to our Hamiltonian and then differentiate the ground state energy with respect to $\varepsilon$ at $\varepsilon = 0$ in order to find $\rho_d$. To obtain weak convergence the test potential is an integrable, locally bounded function, which in the intermediate region must be taken smooth.
To control the energy we shall rely on the results and methods in [5] (see also [17]). In the following part we shall state and prove our theorem only for the neutral case $N = Z$. It is straightforward, however, by using the Bach method [1] to generalize our results to $N/Z = \mathrm{const} \ne 1$.
The plan of this work is as follows: In Sect. 2 we shall recall the necessary definitions from [4] and formulate our result. In Sect. 3 we shall prove the weak convergence of the ground state density of atoms in the intermediate region. In Sect. 4 we shall give an alternative proof of the weak version of the conjecture by Lieb about atomic cores related to Scott's correction proved by a different method in [4]. In Sect. 5 we shall formulate the extension of our result to the molecular case.
2. Definitions and Main Results
The Hamiltonian of an atom of $N$ electrons with $q$ spin states each and a fixed nucleus of charge $Z$ located at the origin is given by
$$H_{N,Z} = \sum_{\nu=1}^{N} \left( -\Delta_\nu - \frac{Z}{|x_\nu|} \right) + \sum_{1 \le \mu < \nu \le N} \frac{1}{|x_\mu - x_\nu|}, \qquad (1)$$
in units in which $\hbar^2/2m = 1$ and $|e| = 1$. It is self-adjoint in the Hilbert space $\mathcal{H}_N := \bigwedge_{\nu=1}^{N} \left( L^2(\mathbb{R}^3) \otimes \mathbb{C}^q \right)$, i.e., antisymmetric functions of space and spin. A general ground state density matrix, denoted by $d$, can be written as
$$d = \sum_{\nu=1}^{M} w_\nu\, |\Psi_\nu\rangle\langle\Psi_\nu|, \qquad (2)$$
where the $\Psi_\nu$ constitute an orthonormal basis for the ground state eigenspace and where the $w_\nu$ are nonnegative weights such that $\sum_{\nu=1}^{M} w_\nu = 1$. It is well known that the ground state can be degenerate, as it is, e.g., for the carbon atom. In the following we shall discuss only the case $N = Z$, i.e., neutral atoms. The corresponding one-electron density is the diagonal part of the one-electron density matrix and is, by definition,
$$\rho_d(x) = N \sum_{\nu=1}^{M} w_\nu \sum_{\sigma_1, \dots, \sigma_N = 1}^{q} \int_{\mathbb{R}^{3(N-1)}} |\Psi_\nu(x, x_2, \dots, x_N; \sigma_1, \dots, \sigma_N)|^2 \, dx_2 \cdots dx_N. \qquad (3)$$
Throughout the paper we will write $\phi_Z^{TF}(r)$ for the Thomas-Fermi potential of electron number $N = Z$ and nuclear charge $Z$, i.e.,
$$\phi_Z^{TF}(x) = \frac{Z}{|x|} - \int_{\mathbb{R}^3} \frac{\rho_Z^{TF}(y)}{|x - y|} \, dy,$$
where $\rho_Z^{TF}$ is the nonnegative minimizer of the Thomas-Fermi functional
$$\int_{\mathbb{R}^3} \left( \frac{3}{5} (6\pi^2/q)^{2/3} \rho^{5/3}(x) - \frac{Z}{|x|} \rho(x) \right) dx + D(\rho, \rho)$$
on the space $\{ \rho \ge 0 \mid \rho \in L^{5/3} \cap L^1, \int \rho = N \}$ with
$$D(\rho, \rho) := \frac{1}{2} \int_{\mathbb{R}^6} \frac{\rho(x)\rho(y)}{|x - y|} \, dx\, dy.$$
Let $E_Z^{TF} = E(\rho_Z^{TF})$ be the Thomas-Fermi energy (see [9, 7]). Both $\phi_Z^{TF}$ and $\rho_Z^{TF}$ are spherically symmetric, i.e., they depend only on $r = |x|$. There is a scaling relation $\phi_Z^{TF}(r) = Z^{4/3} \phi_1^{TF}(Z^{1/3} r)$, where $\phi_1^{TF}$ is the Thomas-Fermi potential for $Z = 1$ and "electron number" equal to 1. Similarly, $\rho_Z^{TF}(r) = Z^2 \rho_1^{TF}(Z^{1/3} r)$ and $E_Z^{TF} = Z^{7/3} E_1^{TF}$. In [9, 7] it is shown that the Thomas-Fermi energy $E_Z^{TF}$ is the leading term in the asymptotic expansion, for $Z$ tending to infinity, of the ground state energy $E_{N,Z}$. Thus scaling shows that the "natural" length in an atom is $Z^{-1/3}$. Note that the Thomas-Fermi functional has a unique minimizer.
The Scott conjecture, on the other hand, concerns the length scale $Z^{-1}$, where we expect the density to be of order $Z^3$ instead of $Z^2$. In terms of the "true" density defined in (3), we now define
$$\hat\rho_Z(x) := Z^{-3} \rho_d(x/Z). \qquad (4)$$
In [4], following Lieb [7], the quantum density for a Bohr atom with an infinite number of electrons, $\rho_Z^H$, was considered. This density is defined in terms of the normalized bound-state eigenfunctions $\psi_{n,l,m,Z}(x)$ for the hydrogenic atom with nuclear charge $Z$ and Hamiltonian
$$h_Z^H := -\Delta - \frac{Z}{|x|}. \qquad (5)$$
The hydrogen density, $\rho_Z^H$, is then defined as the sum
$$\rho_Z^H(x) = q \sum_{n,l,m} |\psi_{n,l,m,Z}(x)|^2. \qquad (6)$$
This sum was studied and tabulated by Heilmann and Lieb [3]. We note the following fact, proved in [3]: the sum over $n$, $l$ and $m$ defining $\rho_Z^H(x)$ is pointwise convergent for all $x$. Clearly, it is spherically symmetric. Moreover, Hamiltonian (5) has a scaling property: it is unitarily equivalent to the operator $Z^2 h_1^H$,
$$h_1^H = -\Delta - \frac{1}{|y|}.$$
Under the scaling transform $y = Zx$, we have the scaling property
$$\rho_Z^H(x) = Z^3 \rho_1^H(Zx). \qquad (7)$$
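As a small numerical illustration of the scaling property (7), the following sketch checks it for a single term of the sum (6), the $n = 1$ state. It assumes the standard closed form of the normalized ground state of $-\Delta - Z/|x|$ in the units $\hbar^2/2m = 1$, namely $\psi(x) = \sqrt{Z^3/(8\pi)}\, e^{-Z|x|/2}$ with eigenvalue $-Z^2/4$; the function names are hypothetical and not taken from the paper, and the full sum over $n$, $l$, $m$ is not evaluated.

```python
import math

def psi_1s_sq(r, Z):
    """|psi_{1,0,0,Z}|^2 at radius r for h = -Laplacian - Z/|x| (hbar^2/2m = 1).

    Assumed closed form: psi(r) = sqrt(Z^3 / (8 pi)) * exp(-Z r / 2).
    """
    return Z**3 / (8.0 * math.pi) * math.exp(-Z * r)

def scaled_from_unit_charge(r, Z):
    """Right-hand side of the scaling law (7), applied to this single term."""
    return Z**3 * psi_1s_sq(Z * r, 1.0)

def norm_1s(Z, r_max_factor=60.0, n=20000):
    """Midpoint-rule approximation of the integral of |psi|^2 over R^3."""
    r_max = r_max_factor / Z          # the integrand is negligible beyond this radius
    dr = r_max / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * dr
        total += 4.0 * math.pi * r * r * psi_1s_sq(r, Z) * dr
    return total
```

The same change of variables acts on every term of (6), which is why the identity carries over to the whole sum once its convergence is known.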
In [4] the following theorem was proved:
Theorem 1. 1. Let $W$ be a bounded (not necessarily constant) function on the unit sphere and $r$ positive. Then, as $Z \to \infty$,
$$\int_{S^2} W(\omega) \hat\rho_Z(r\omega) \, d\omega \to \rho_1^H(r) \int_{S^2} W(\omega) \, d\omega \qquad (8)$$
(pointwise convergence of spherical averages).
2. Let $V$ be a locally bounded, integrable function on $\mathbb{R}^3$. Then, as $Z \to \infty$,
$$\int_{\mathbb{R}^3} |x| V(x) \hat\rho_Z(x) \, dx \to \int_{\mathbb{R}^3} |x| V(x) \rho_1^H(|x|) \, dx. \qquad (9)$$
In this article we are interested in the asymptotic value of the ground state density $\rho_d(x)$, corresponding to the sequence of ground state density matrices $d$ of $H_{N,Z}$, at the distances $r$, $r \in (Z^{-1}, Z^{-1/3})$, from the nucleus. In this region, called the "intermediate region", the Thomas-Fermi density $\rho_Z^{TF}$ behaves like
$$\frac{q Z^{3/2}}{6\pi^2 |x|^{3/2}} \qquad (10)$$
(see [9]). In [3] it was shown that the asymptotic behavior at infinity of the scaled hydrogen density $\rho_1^H$ coincides with that of $\rho_1^{TF}$ at zero, i.e., $\rho_1^H$ decays asymptotically for large $|x|$ as
$$\frac{q}{6\pi^2 |x|^{3/2}}. \qquad (11)$$
As a result, E. Lieb noted in [7] that in the inner core of an atom, at the distances $r$ from the nucleus for which $Z^{-1} \ll r \ll Z^{-1/3}$, the ground state density $\rho_d$ should be well approximated by the expression in (10). In this article we confirm his conjecture.
Let $\gamma$ be a scaling parameter, $1/3 < \gamma < 1$. We introduce the scaled density $\hat\rho_Z(x) := Z^{-3/2 - 3\gamma/2} \rho_Z(Z^{-\gamma} x)$. Note that $\int \hat\rho_Z(x) \, dx = Z^{-1/2 + 3\gamma/2}$. Thus for the boundary values we have $\int \hat\rho_Z = Z$ for $\gamma = 1$ and $\int \hat\rho_Z = 1$ for $\gamma = 1/3$. In what follows we denote by $C_\nu$ various positive constants whose precise value is of no importance. Our main result is the following
Theorem 2. Assume $N = Z$ and pick a sequence of ground state density matrices $d$ of $H_{N,Z}$ with densities $\rho_d$. Define $\hat\rho_Z(x) := \rho_d(Z^{-\gamma} x) Z^{-3/2 - 3\gamma/2}$ for $1/3 < \gamma < 1$. Assume $U$ to be any real function, smooth outside the point $x = 0$,
$$U \in C^\infty(\mathbb{R}^3 \setminus \{0\}) \cap L^\infty(\mathbb{R}^3),$$
which obeys
$$|\partial^\nu U(x)| \le C_\nu |x|^{-|\nu|-1} \langle |x| \rangle^{-3}, \quad |\nu| \ge 0, \qquad (12)$$
for $|x| \ge 2$, where $\langle |x| \rangle = \sqrt{1 + |x|^2}$, and
$$|\partial^\nu U(x)| \le C_\nu |x|^{-|\nu|}, \quad |\nu| \ge 0, \qquad (13)$$
for $|x| \le 2$. Here $\nu$ stands for a 3-tuple of non-negative integers: $\nu = (\nu_1, \nu_2, \nu_3)$, $|\nu| = \nu_1 + \nu_2 + \nu_3$. Then, as $Z \to \infty$,
$$\int_{\mathbb{R}^3} \hat\rho_Z(x) U(x) \, dx \to \frac{q}{6\pi^2} \int_{\mathbb{R}^3} U(x) |x|^{-3/2} \, dx.$$
We would like to remark that it is not really necessary to take a sequence of ground state density matrices. We could take just a sequence of states, $d_{N,Z}$, that is an approximate ground state in the sense that
$$\frac{\operatorname{tr}(H_{N,Z} d_{N,Z}) - E_{N,Z}}{Z^{\frac{5}{2} - \frac{1}{2}\gamma}} \to 0$$
as $Z \to \infty$. Here $E_{N,Z}$ is the bottom of the spectrum of $H_{N,Z}$.
3. Proof of Theorem 2
The proof of this theorem is achieved by applying the technique of Ivrii and Sigal [5] to the case of neutral atoms ($K = 1$). A clear presentation of an elementary approach due to Ivrii, a multiscale analysis, can also be found in a later paper by Sobolev [17] (see also [15, 16, 18]), where Sobolev generalizes the same technique to the problem with an external magnetic field.
Following [5], we denote by $H_{N,Z}^{\mathrm{ind},\varepsilon}$ the Schrödinger operator of $N$ independent fermions moving in the effective external potential $-\phi_Z^{TF}(x) - \varepsilon U_Z(x)$, $\varepsilon \in \mathbb{R}$,
$$H_{N,Z}^{\mathrm{ind},\varepsilon} = \sum_{\nu=1}^{N} \left( -\Delta_\nu - \phi_Z^{TF}(x_\nu) - \varepsilon U_Z(x_\nu) \right) - D(\rho_Z^{TF}, \rho_Z^{TF}), \qquad (14)$$
self-adjointly realized in $\mathcal{H}_N$. Here we define the scaled perturbation as $U_Z(x) = Z^{1+\gamma} U(Z^\gamma x)$ for any value of the parameter $\gamma$ (we are interested in $\gamma \in (1/3, 1)$), where $U$ is a real function satisfying the hypotheses of Theorem 2. Let
$$E^{\mathrm{ind},\varepsilon}(N, Z) = \inf \sigma(H_{N,Z}^{\mathrm{ind},\varepsilon}) \qquad (15)$$
be the bottom of the spectrum of $H_{N,Z}^{\mathrm{ind},\varepsilon}$. Denote by $E_i^\varepsilon$ the eigenvalues of the operator
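$h_Z^\varepsilon = -\Delta - \phi_Z^{TF}(x) - \varepsilon U_Z(x)$, labeled in increasing order and counted with multiplicity, if $i$ is not greater than the total number of negative eigenvalues of $h_Z^\varepsilon$, and set equal to zero otherwise. Since $H_{N,Z}^{\mathrm{ind},\varepsilon}$ acts on $\bigwedge_{\nu=1}^{N} \left( L^2(\mathbb{R}^3) \otimes \mathbb{C}^q \right)$ and since the variables $x_1, x_2, \dots, x_N$ in it can be separated, we have
$$E^{\mathrm{ind},\varepsilon}(N, Z) = \sum_{i=1}^{N} E_i^\varepsilon - D(\rho_Z^{TF}, \rho_Z^{TF}). \qquad (16)$$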
First, we suitably rescale the operator $h_Z^\varepsilon$. We set $h = Z^{-1/3}$; then by scaling $\phi_Z^{TF}(x) = h^{-4} \phi_1^{TF}(h^{-1} x)$. Let $\mathcal{U}(h)$ be the unitary group of scaling $x \to hx$ defined by $(\mathcal{U}(h)\psi)(x) = h^{3/2} \psi(hx)$. Then
$$\mathcal{U}(h)\, h_Z^\varepsilon\, \mathcal{U}(h)^{-1} = h^{-4} K_h^\varepsilon, \qquad (17)$$
where
$$K_h^\varepsilon = -h^2 \Delta - \phi_1^{TF}(x) - \varepsilon U_h(x). \qquad (18)$$
Here $U_h(x) = h^{-3\gamma+1} U(h^{-3\gamma+1} x)$. We denote $\gamma' = 3\gamma - 1$. Let $e_{i,h}(\varepsilon)$ denote the $i$th eigenvalue of the operator $K_h^\varepsilon$ (counting the multiplicity) with the corresponding eigenfunction $\psi_{i,h}^\varepsilon$. By virtue of relation (17),
$$E_i^\varepsilon = h^{-4} e_{i,h}(\varepsilon). \qquad (19)$$
The following uniform bound was obtained by Ivrii and Sigal (see the proof of Theorem 1.8 in [5]); we formulate it as a lemma:
Lemma 1 (Ivrii, Sigal). Uniformly in $h$,
$$\sum_{i > N} |e_{i,h}(0)| \le \mathrm{const}\; h^{-2/3}. \qquad (20)$$
From this lemma and from formula (16) for $\varepsilon = 0$ there follows
$$E^{\mathrm{ind},0}(N, Z) = \sum_{i=1}^{\infty} E_i^0 - D(\rho_Z^{TF}, \rho_Z^{TF}) + O(Z^{14/9}). \qquad (21)$$
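The exponent $14/9$ in (21) can be traced through the scaling as follows. This is a short worked step using only relation (19) at $\varepsilon = 0$, the bound (20), and $h = Z^{-1/3}$; it is a reading aid, not an additional estimate.

```latex
\sum_{i>N} |E_i^0|
  = h^{-4} \sum_{i>N} |e_{i,h}(0)|
  \le \mathrm{const}\; h^{-4} h^{-2/3}
  = \mathrm{const}\; h^{-14/3}
  = \mathrm{const}\; Z^{14/9},
\qquad h = Z^{-1/3}.
% Hence the finite sum in (16) can be extended to all i at the cost of O(Z^{14/9}):
\sum_{i=1}^{N} E_i^0 = \sum_{i=1}^{\infty} E_i^0 + O(Z^{14/9}).
```

Since $14/9 < 5/3$, this remainder is dominated by the $O(Z^{5/3})$ errors that enter in the subsequent comparison with the interacting energy.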
By formula (16) the lower bound obviously follows:
$$E^{\mathrm{ind},\varepsilon}(N, Z) \ge \sum_{i=1}^{\infty} E_i^\varepsilon - D(\rho_Z^{TF}, \rho_Z^{TF}). \qquad (22)$$
Next we cite Theorem 2.1 from [5] ($K = 1$):
Theorem 3 (Ivrii, Sigal). As $Z \to \infty$ the following relation holds:
$$E^0(N, Z) = E^{\mathrm{ind},0}(N, Z) + O(Z^{5/3}). \qquad (23)$$
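By the correlation inequality (see [6] and [8]) we have
$$E_{N,Z}^\varepsilon \ge E^{\mathrm{ind},\varepsilon}(N, Z) - \mathrm{const}\; Z^{5/3}. \qquad (24)$$
We define
$$H_{N,Z}^\varepsilon := H_{N,Z} - \varepsilon \sum_{\mu=1}^{N} \underbrace{1 \otimes \cdots \otimes 1}_{\mu-1 \text{ factors}} \otimes\, U_Z \otimes \underbrace{1 \otimes \cdots \otimes 1}_{N-\mu \text{ factors}}. \qquad (25)$$
Then
$$\int_{\mathbb{R}^3} U \hat\rho_Z = Z^{\frac{1}{2}\gamma - \frac{5}{2}} \int_{\mathbb{R}^3} U_Z \rho_Z = \frac{\operatorname{tr}(H_{N,Z} d) - \operatorname{tr}(H_{N,Z}^\varepsilon d)}{\varepsilon Z^{-\frac{1}{2}\gamma + \frac{5}{2}}}. \qquad (26)$$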
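Denote $g(\lambda) = \lambda$ for $\lambda \le 0$ and $g(\lambda) = 0$ for $\lambda > 0$. To obtain an upper bound we pick $\varepsilon$ positive and use relations (21), (22) and (23), (24):
$$\liminf_{\varepsilon \searrow 0} \limsup_{Z \to \infty} \frac{\operatorname{tr}(H_{N,Z} d) - \operatorname{tr}(H_{N,Z}^\varepsilon d)}{\varepsilon Z^{-\frac{1}{2}\gamma + \frac{5}{2}}} \le \liminf_{\varepsilon \searrow 0} \limsup_{Z \to \infty} \frac{\operatorname{tr}\{g(h_Z^0)\} - \operatorname{tr}\{g(h_Z^\varepsilon)\}}{\varepsilon Z^{-\frac{1}{2}\gamma + \frac{5}{2}}} = \liminf_{\varepsilon \searrow 0} \limsup_{h \searrow 0} \frac{\operatorname{tr}\{g(K_h^0)\} - \operatorname{tr}\{g(K_h^\varepsilon)\}}{\varepsilon h^{\frac{1}{2}\gamma' - 3}}, \qquad (27)$$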
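where in the last line we used the property of the unitary scaling transform (19). Following [5], we introduce a smooth partition of unity $\hat\theta_1, \hat\theta_2$: $\hat\theta_1 + \hat\theta_2 = 1$, $\hat\theta_1 \in C_0^\infty(\mathbb{R}^3)$, such that $|\partial^\nu\hat\theta_i(x)| \le C_\nu|x|^{-|\nu|}$, $x \ne 0$, $i = 1, 2$. Suppose that $\hat\theta_1$ is supported in $B(0,2)$, the ball of radius 2 centered at the origin, and $\hat\theta_1 = 1$ in $B(0,1)$. For any $r > 0$ such that $h^{2/3-\delta_1} \le r \le \mathrm{const}$ for some $\delta_1 > 0$ we introduce a scaled partition of unity $\theta_1(x) = \hat\theta_1(x/r)$ and $\theta_2(x) = \hat\theta_2(x/r)$, which satisfies $|\partial^\nu\theta_i(x)| \le C_\nu|x|^{-|\nu|}$, $x \ne 0$, $i = 1, 2$; $\theta_1$ is supported in $B(0,2r)$ and $|\partial^\nu\theta_1(x)| \le C_\nu r^{-|\nu|}$. We have
\[ \operatorname{tr}\{g(K^{\varepsilon}_h)\} = \sum_{i=1}^{2}\operatorname{tr}\{\theta_i\,g(K^{\varepsilon}_h)\}. \]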
In order to apply the method of Ivrii and Sigal it is crucial that the Thomas-Fermi potential has the following properties:

1. It is smooth outside the origin and obeys the estimate (see Theorem 1.1 in [5])
\[ |\partial^\nu\phi^{TF}_1(x)| \le C_\nu|x|^{-|\nu|-1}\langle|x|\rangle^{-3} \tag{28} \]
for any multi-index $\nu$, $|\nu| \ge 0$. Here $\langle|x|\rangle = \sqrt{1+|x|^2}$.

2. It can be written in the form (see [9])
\[ \phi^{TF}_1(x) = \frac{1}{|x|} - \phi_{reg}(x), \tag{29} \]
where $\phi_{reg} \ge 0$ is a real function, smooth outside $x = 0$, obeying
\[ |\partial^\nu\phi_{reg}(x)| \le C_\nu|x|^{-|\nu|}, \quad |\nu| \ge 0, \tag{30} \]
for $|x| \le 2$. Note that $U$ satisfies the same conditions as the regular part of the Thomas-Fermi potential, $\phi_{reg}$. From hypothesis (12) it also follows that $|\partial^\nu U_h(x)| \le C_\nu|x|^{-|\nu|-1}\langle|x|\rangle^{-3}$, $|\nu| \ge 0$, for $C_\nu$ independent of $h$, as
\[ |h^{-\gamma'}\partial^\nu U(h^{-\gamma'}x)| \le C_\nu h^{-\gamma'-|\nu|\gamma'}\frac{1}{|h^{-\gamma'}x|^{|\nu|+1}}\langle|h^{-\gamma'}x|\rangle^{-3}. \]
Therefore, we can use Theorem 7.8 from [5], which in our case claims

Theorem 4 (Ivrii, Sigal). Let $K^{\varepsilon}_h$ be the operator defined in Eq. (18), $\varepsilon \in \mathbb{R}$. Let $g$ and $\theta_2$ be the functions defined above, $|\partial^\nu\theta_2(x)| \le C_\nu|x|^{-|\nu|}$, and let, besides, $\theta_2$ be supported in $\{x \mid |x| \ge r\}$ with $r \ge 0$. Suppose $U$ obeys (12). Then
\[ \operatorname{tr}\{\theta_2\,g(K^{\varepsilon}_h)\} = q(2\pi h)^{-3}\iint g(k^{\varepsilon})\theta_2\,dx\,d\xi + O(h^{-1}r^{-1/2}), \tag{31} \]
where $k^{\varepsilon} = \xi^2 - \phi^{TF}_1 - \varepsilon U_h$.

Now, if $U_h$ had satisfied (30), we could have finished the proof applying the method of Ivrii and Sigal directly. But this is not the case. Contrary to hypothesis (12), from the fact that $U$ satisfies (13) it does not follow that $U_h$ also satisfies this condition for $\gamma' \ne 0$. From (13) it follows only that
\[ h^{-\gamma'}|\partial^\nu U(h^{-\gamma'}x)| \le h^{(-1-|\nu|)\gamma'}C_\nu|h^{-\gamma'}x|^{-|\nu|} = h^{-\gamma'}C_\nu|x|^{-|\nu|} \]
for $|x| \le 2$. Therefore, for $\gamma' \ne 0$ we cannot use Theorem 8.1 from [5] directly to calculate $\operatorname{tr}\{\theta_1 g(K^{0}_h) - \theta_1 g(K^{\varepsilon}_h)\}$, as it requires that $U_h$ satisfy both hypotheses (12) and (13). Thus we shall proceed in two steps. Near the singularity $x = 0$ we consider the perturbed "bare" Hamiltonian
\[ K^{H,\varepsilon}_h = -h^2\Delta - \frac{1}{|x|} - \varepsilon U_h(x) \tag{32} \]
with symbol
\[ k^{H,\varepsilon} = \xi^2 - \frac{1}{|x|} - \varepsilon U_h(x). \tag{33} \]
Following [5], we reduce the problem in the ball of radius $2r$ to the "bare" one using Theorem 8.1 from [5], which in our case claims

Theorem 5 (Ivrii, Sigal). Let $K^{\varepsilon}_h$ be defined as in Eq. (18) with $U_h$ obeying formula (12), $\varepsilon \in \mathbb{R}$. Let $h^{2/3-\delta_1} \le r \le \mathrm{const}$ for some $\delta_1 > 0$. Let $\theta_1 \in C_0^\infty$ be supported in $\{x \mid |x| \le 2r\}$, equal to 1 on $\{x \mid |x| \le r\}$ and obey $\partial^\nu\theta_1 = O(r^{-|\nu|})$. Then
\[ \operatorname{tr}\{\theta_1 g(K^{\varepsilon}_h) - \theta_1 g(K^{H,\varepsilon}_h)\} = q(2\pi h)^{-3}\iint\theta_1\bigl[g(k^{\varepsilon}) - g(k^{H,\varepsilon})\bigr]dx\,d\xi + O(h^{-1}r^{-1/2} + h^{-2}r). \tag{34} \]
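From Theorems 4 and 5 it follows that
\[
\begin{aligned}
\operatorname{tr}\{g(K^{0}_h)\} - \operatorname{tr}\{g(K^{\varepsilon}_h)\} ={}& \operatorname{tr}\{\theta_1 g(K^{H}_h)\} - \operatorname{tr}\{\theta_1 g(K^{H,\varepsilon}_h)\} \\
&+ q(2\pi h)^{-3}\iint\theta_1\bigl[g(k^{0}) - g(k^{H,0})\bigr]dx\,d\xi \\
&- q(2\pi h)^{-3}\iint\theta_1\bigl[g(k^{\varepsilon}) - g(k^{H,\varepsilon})\bigr]dx\,d\xi \\
&+ q(2\pi h)^{-3}\iint\theta_2\bigl[g(k^{0}) - g(k^{\varepsilon})\bigr]dx\,d\xi \\
&+ O(h^{-1}r^{-1/2} + h^{-2}r).
\end{aligned}
\tag{35}
\]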
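To continue we need the following lemma:

Lemma 2. Denote $K^{H,\varepsilon}_h = -h^2\Delta - \frac{1}{|x|} - \varepsilon U_h(x)$ and $k^{H,\varepsilon} = \xi^2 - \frac{1}{|x|} - \varepsilon U_h(x)$. Then for $0 < \gamma' < 2$
\[ \operatorname{tr}\{\theta_1[g(K^{H}_h) - g(K^{H,\varepsilon}_h)]\} = q(2\pi h)^{-3}\iint\theta_1(x)\bigl[g(k^{H,0}) - g(k^{H,\varepsilon})\bigr]dx\,d\xi + O\bigl(h^{-\frac{2-\gamma'}{2}(\frac43-\frac12\delta_1)-\gamma'}\bigr) \tag{36} \]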
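for some $\delta_1 > 0$.

Proof. As on the left-hand side of (36) we have just "bare" operators, we can use their scaling properties and apply the unitary scaling transform $\mathcal{U}(h^{\gamma'})$ defined as before, $(\mathcal{U}(h^{\gamma'})\psi)(x) = h^{\frac32\gamma'}\psi(h^{\gamma'}x)$, to the operator $K^{H,\varepsilon}_h$, using that
\[ -h^2\Delta_x - \frac{1}{|x|} - \varepsilon h^{-\gamma'}U(h^{-\gamma'}x) = h^{-\gamma'}\,\mathcal{U}(h^{\gamma'})^{-1}\Bigl[-h^{2-\gamma'}\Delta_y - \frac{1}{|y|} - \varepsilon U(y)\Bigr]\mathcal{U}(h^{\gamma'}). \]
We denote a new quasiclassical parameter $h^{(2-\gamma')/2}$ by $\beta$ (as $0 < \gamma' < 2$, we have $0 < (2-\gamma')/2 < 1$) and
\[ \tilde K^{H,\varepsilon}_\beta = -\beta^2\Delta - \frac{1}{|x|} - \varepsilon U(x). \tag{37} \]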
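Inserting $K^{H,\varepsilon}_h = h^{-\gamma'}\mathcal{U}(h^{\gamma'})^{-1}\tilde K^{H,\varepsilon}_\beta\,\mathcal{U}(h^{\gamma'})$, using the definition of the function $g$ and the cyclic property of the trace, we get
\[
\begin{aligned}
\operatorname{tr}\{\theta_1[g(K^{H}_h) - g(K^{H,\varepsilon}_h)]\}
&= h^{-\gamma'}\operatorname{tr}\{\theta_1[g(\mathcal{U}(h^{\gamma'})^{-1}K^{H}_\beta\,\mathcal{U}(h^{\gamma'})) - g(\mathcal{U}(h^{\gamma'})^{-1}\tilde K^{H,\varepsilon}_\beta\,\mathcal{U}(h^{\gamma'}))]\} \\
&= h^{-\gamma'}\operatorname{tr}\{\theta_1\,\mathcal{U}(h^{\gamma'})^{-1}[g(K^{H}_\beta) - g(\tilde K^{H,\varepsilon}_\beta)]\,\mathcal{U}(h^{\gamma'})\} \\
&= h^{-\gamma'}\operatorname{tr}\{\mathcal{U}(h^{\gamma'})\,\theta_1\,\mathcal{U}(h^{\gamma'})^{-1}[g(K^{H}_\beta) - g(\tilde K^{H,\varepsilon}_\beta)]\} \\
&= h^{-\gamma'}\operatorname{tr}\{\tilde\theta_1[g(K^{H}_\beta) - g(\tilde K^{H,\varepsilon}_\beta)]\},
\end{aligned}
\tag{38}
\]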
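where $\tilde\theta_1 = \mathcal{U}(h^{\gamma'})\theta_1\mathcal{U}(h^{\gamma'})^{-1}$ is the operator of multiplication by the function $\tilde\theta_1(x) = \theta_1(h^{\gamma'}x)$. As $\theta_1$ is supported in the ball of radius $2r$ and equal to 1 on $\{x \mid |x| \le r\}$, and as $h = \beta^{2/(2-\gamma')}$, the function $\tilde\theta_1$ is supported in $\{x \mid |x| \le 2rh^{-\gamma'} = 2r\beta^{-2\gamma'/(2-\gamma')}\}$ and equal to 1 on $\{x \mid |x| \le rh^{-\gamma'} = r\beta^{-2\gamma'/(2-\gamma')}\}$. As $h^{2/3-\delta_1} \le r \le \mathrm{const}$, we have that $\beta^{\frac{2}{2-\gamma'}(\frac23-\delta_1)} \le r \le \mathrm{const}$ and consequently
\[ \beta^{\frac{2}{2-\gamma'}(\frac23-\delta_1-\gamma')} \le r\beta^{-\frac{2\gamma'}{2-\gamma'}} \le \mathrm{const}\,\beta^{-\frac{2\gamma'}{2-\gamma'}}. \]
Let $\tilde r$ satisfy the inequality
\[ \beta^{\frac23-\delta_1} \le \tilde r \le \min\Bigl\{\mathrm{const},\ \tfrac12\beta^{\frac{2}{2-\gamma'}(\frac23-\delta_1-\gamma')}\Bigr\}. \tag{39} \]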
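We introduce a smooth partition of unity $\chi_1 + \chi_2 = 1$ such that $\chi_1$ is supported in $\{x \mid |x| \le 2\tilde r\}$, equal to 1 on $\{x \mid |x| \le \tilde r\}$ and $|\partial^\nu\chi_i(x)| \le C_\nu|x|^{-|\nu|}$, $x \ne 0$, $i = 1, 2$. From these properties and from the following bound on $\tilde\theta_1$:
\[ |\partial^\nu\tilde\theta_1| = |\partial^\nu_x\theta_1(h^{\gamma'}x)| \le C_\nu h^{|\nu|\gamma'}|h^{\gamma'}x|^{-|\nu|} = C_\nu|x|^{-|\nu|} \]
for $x \ne 0$, which follows from the property of $\theta_1$, it follows that
\[ |\partial^\nu(\chi_2\tilde\theta_1)| \le C_\nu|x|^{-|\nu|}. \tag{40} \]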
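Therefore, inserting $1 = \chi_1 + \chi_2$ on the right-hand side of (38),
\[
\begin{aligned}
h^{-\gamma'}\operatorname{tr}\{\tilde\theta_1[g(K^{H}_\beta) - g(\tilde K^{H,\varepsilon}_\beta)]\}
={}& h^{-\gamma'}\operatorname{tr}\{\tilde\theta_1\chi_1[g(K^{H}_\beta) - g(\tilde K^{H,\varepsilon}_\beta)]\} \\
&+ h^{-\gamma'}\operatorname{tr}\{\tilde\theta_1\chi_2[g(K^{H}_\beta) - g(\tilde K^{H,\varepsilon}_\beta)]\},
\end{aligned}
\tag{41}
\]
using that $\chi_1\tilde\theta_1 = \chi_1$, which ensues from the choice of the upper bound in (39), and using (40) together with the properties of $U$, hypotheses (12) and (13), we can apply Theorem 5 to the first term and Theorem 4 to the last term on the right-hand side of (41), thus obtaining
\[
\begin{aligned}
h^{-\gamma'}\operatorname{tr}\{\tilde\theta_1[g(K^{H}_\beta) - g(\tilde K^{H,\varepsilon}_\beta)]\}
={}& h^{-\gamma'}q(2\pi\beta)^{-3}\iint\tilde\theta_1(x)\bigl[g(k^{H}) - g(\tilde k^{H,\varepsilon})\bigr]dx\,d\xi \\
&+ O(h^{-\gamma'}\beta^{-1}\tilde r^{-1/2} + h^{-\gamma'}\beta^{-2}\tilde r).
\end{aligned}
\tag{42}
\]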
Here we denoted
\[ \tilde k^{H,\varepsilon} = \xi^2 - \frac{1}{|x|} - \varepsilon U(x). \]
The lower bound in (39) assures that the second term in the remainder in (42) is of lower order than $h^{-\gamma'}\beta^{-1}\tilde r^{-1/2}$. We choose $\tilde r = \beta^{2/3-\delta_1}$. Inserting $\beta = h^{(2-\gamma')/2}$, $\tilde\theta_1(x) = \theta_1(h^{\gamma'}x)$ and changing the variables in the integral on the right-hand side of (42), $h^{\gamma'}x = y$, $\xi = h^{\gamma'/2}p$, we obtain
\[ \operatorname{tr}\{\theta_1[g(K^{H}_h) - g(K^{H,\varepsilon}_h)]\} = q(2\pi h)^{-3}\iint\theta_1(x)\bigl[g(k^{H}) - g(k^{H,\varepsilon})\bigr]dx\,d\xi + O\bigl(h^{-\frac{2-\gamma'}{2}(\frac43-\frac12\delta_1)-\gamma'}\bigr). \]
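Consequently, from (36) and (35) it follows that
\[ \operatorname{tr}\{g(K^{0}_h)\} - \operatorname{tr}\{g(K^{\varepsilon}_h)\} = q(2\pi h)^{-3}\iint\bigl[g(k^{0}) - g(k^{\varepsilon})\bigr]dx\,d\xi + O\bigl(h^{-\frac{2-\gamma'}{2}(\frac43-\frac12\delta_1)-\gamma'} + h^{-1}r^{-1/2} + h^{-2}r\bigr). \tag{43} \]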
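The next lemma gives the upper bound on the first term on the right-hand side in (43).

Lemma 3. Suppose $\varepsilon \in \mathbb{R}$, $|\varepsilon| \le \varepsilon_0$ with some $\varepsilon_0 > 0$. Then there is a positive constant $C$ independent of $h$ and $\varepsilon$ such that we have the following bound:
\[ q(2\pi h)^{-3}\iint\bigl[g(k^{0}) - g(k^{\varepsilon})\bigr]dx\,d\xi \le \varepsilon\frac{q}{6\pi^2h^3}\int U_h(x)\bigl(\phi^{TF}_1(x) + \varepsilon U_h(x)\bigr)^{3/2}dx + C\varepsilon^2h^{-3+\frac12\gamma'}. \tag{44} \]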
Note. It is easy to see that $\phi^{TF}_1(x) + \varepsilon U_h(x) > 0$ for $h$ and $\varepsilon$ small enough, using the lower bounds on $\phi^{TF}_1$ which can be obtained from the estimates on the function $L_1(r)$ defined in [13], $\phi^{TF}_1(r) = L_1^2(r)/r^2$, $r = |x|$. We use the following bounds from Appendix B in [13]:
\[ L_1 \ge C\sqrt{r}\ \text{ for } r \le r_m, \qquad L_1 \ge Cr^{-1}\ \text{ for } r \ge r_m, \]
where $r_m$ is some constant (the maximum of $L_1(r)$). Thus, using hypotheses (12), (13), we have the following bounds: for $|x| \le 2h^{\gamma'}$
\[ \phi^{TF}_1(x) + \varepsilon h^{-\gamma'}U(h^{-\gamma'}x) \ge \frac{C}{|x|} - |\varepsilon|C_0h^{-\gamma'} > 0, \]
for $2h^{\gamma'} \le |x| \le r_m$
\[ \phi^{TF}_1(x) + \varepsilon h^{-\gamma'}U(h^{-\gamma'}x) \ge \frac{C}{|x|} - |\varepsilon|\frac{C_0}{|x|}\langle|h^{-\gamma'}x|\rangle^{-3} > 0, \]
and for $r_m \le |x|$
\[ \phi^{TF}_1(x) + \varepsilon h^{-\gamma'}U(h^{-\gamma'}x) \ge \frac{C}{|x|^4} - |\varepsilon|\frac{C_0h^{3\gamma'}}{|x|^4} > 0. \]
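Proof of Lemma 3. Denote $D_+ = \{x \mid \varepsilon U_h(x) \ge 0\}$ and $D_- = \{x \mid \varepsilon U_h(x) \le 0\}$. Using the definition of $g$, we have
\[
\begin{aligned}
q(2\pi h)^{-3}\iint\bigl[g(k^{0}) - g(k^{\varepsilon})\bigr]dx\,d\xi
={}& q(2\pi h)^{-3}\iint_{D_+}\varepsilon U_h(x)\,\theta(-\xi^2 + \phi^{TF}_1(x))\,d\xi\,dx \\
&- q(2\pi h)^{-3}\iint_{S_+}\{\xi^2 - \phi^{TF}_1(x) - \varepsilon U_h(x)\}\,d\xi\,dx \\
&+ q(2\pi h)^{-3}\iint_{D_-}\varepsilon U_h(x)\,\theta(-\xi^2 + \phi^{TF}_1(x) + \varepsilon U_h(x))\,d\xi\,dx \\
&+ q(2\pi h)^{-3}\iint_{S_-}\{\xi^2 - \phi^{TF}_1(x)\}\,d\xi\,dx.
\end{aligned}
\tag{45}
\]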
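Here $\theta(x) = 1$ for $x \ge 0$, $\theta(x) = 0$ otherwise, and
\[ S_+ = \{(\xi,x) \mid \varepsilon U_h(x) \ge 0,\ \xi^2 - \phi^{TF}_1(x) - \varepsilon U_h(x) \le 0 \le \xi^2 - \phi^{TF}_1(x)\}, \]
\[ S_- = \{(\xi,x) \mid \varepsilon U_h(x) \le 0,\ \xi^2 - \phi^{TF}_1(x) \le 0 \le \xi^2 - \phi^{TF}_1(x) - \varepsilon U_h(x)\}. \]
We bound the second term on the right-hand side of (45), which is positive, from above by
\[
\begin{aligned}
q(2\pi h)^{-3}\iint_{S_+}\varepsilon U_h(x)\,d\xi\,dx
&= \mathrm{const}\,h^{-3}\,\varepsilon\int_{D_+}dx\,U_h(x)\int_{\sqrt{\phi^{TF}_1(x)}}^{\sqrt{\phi^{TF}_1(x)+\varepsilon U_h(x)}}|\xi|^2\,d|\xi| \\
&= \tfrac13\,\mathrm{const}\,h^{-3}\,\varepsilon\int_{D_+}dx\,U_h(x)\bigl\{(\phi^{TF}_1(x) + \varepsilon U_h(x))^{3/2} - (\phi^{TF}_1(x))^{3/2}\bigr\} \\
&\le \tfrac12\varepsilon^2\cdot\mathrm{const}\,h^{-3}\int dx\,(\phi^{TF}_1(x) + \varepsilon U_h(x))^{1/2}U_h^2(x) \\
&\le \tfrac12\varepsilon^2\cdot\mathrm{const}\,h^{-3}h^{-2\gamma'}\int dx\,\frac{1}{|x|^{1/2}}\bigl(1 + \varepsilon C_0\langle|x|\rangle^{-3}\bigr)^{1/2}U^2(h^{-\gamma'}x) \\
&= \tfrac12\varepsilon^2\cdot\mathrm{const}\,h^{-3+\frac12\gamma'}\int dx\,\frac{1}{|x|^{1/2}}U^2(x),
\end{aligned}
\tag{46}
\]
using that $\phi^{TF}_1(x) \le \frac{1}{|x|}$. The last integral in (46) is some finite constant, as follows from hypothesis (12) on $U$. For the first term on the right-hand side of (45) we have an obvious estimate
\[
\begin{aligned}
q(2\pi h)^{-3}\iint_{D_+}\varepsilon U_h(x)\,\theta(-\xi^2 + \phi^{TF}_1(x))\,d\xi\,dx
&= q(2\pi h)^{-3}\frac{4\pi}{3}\,\varepsilon\int_{D_+}h^{-\gamma'}U(h^{-\gamma'}x)(\phi^{TF}_1(x))^{3/2}dx \\
&\le q(2\pi h)^{-3}\frac{4\pi}{3}\,\varepsilon\int_{D_+}h^{-\gamma'}U(h^{-\gamma'}x)(\phi^{TF}_1(x) + \varepsilon U_h)^{3/2}dx.
\end{aligned}
\tag{47}
\]
Similarly for the third term on the right-hand side of (45) we have:
\[
q(2\pi h)^{-3}\iint_{D_-}\varepsilon U_h(x)\,\theta(-\xi^2 + \phi^{TF}_1(x) + \varepsilon U_h(x))\,d\xi\,dx
= q(2\pi h)^{-3}\frac{4\pi}{3}\,\varepsilon\int_{D_-}h^{-\gamma'}U(h^{-\gamma'}x)(\phi^{TF}_1(x) + \varepsilon U_h(x))^{3/2}dx.
\tag{48}
\]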
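Two elementary facts carry the estimates (46)-(48): the momentum integral over a ball of radius $\sqrt{\phi}$ has volume $\frac{4\pi}{3}\phi^{3/2}$, which turns $q(2\pi h)^{-3}$ into $q/(6\pi^2h^3)$, and the mean-value bound $(a+b)^{3/2} - a^{3/2} \le \frac32 b(a+b)^{1/2}$ for $a, b \ge 0$. The following is a minimal numerical sketch of both; the function names are illustrative and do not come from the paper.

```python
import math

def semiclassical_prefactor(q, h):
    """q (2 pi h)^(-3) times the volume 4 pi / 3 of the unit ball in xi-space."""
    return q * (2 * math.pi * h) ** -3 * (4 * math.pi / 3)

def closed_form(q, h):
    """The constant q / (6 pi^2 h^3) multiplying the integral in (44)."""
    return q / (6 * math.pi ** 2 * h ** 3)

def power_gap(a, b):
    """(a + b)^(3/2) - a^(3/2), the difference estimated in (46)."""
    return (a + b) ** 1.5 - a ** 1.5

def power_gap_bound(a, b):
    """Mean-value bound (3/2) b (a + b)^(1/2), valid for a, b >= 0."""
    return 1.5 * b * math.sqrt(a + b)
```

With the factor $\frac13$ from the radial integral, the bound gives the $\frac12\varepsilon^2$ prefactor that appears in the third line of (46).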
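The last term on the right-hand side of (45) is negative. Summing (47) and (48) together with the estimation of the remainder (46), we complete the proof. □

Choosing $r = h^{\frac23-\delta_1}$, we obtain that the error terms in (43) and (44) are bounded by
\[ \mathrm{const}\bigl(h^{-\frac{2-\gamma'}{2}(\frac43-\frac12\delta_1)-\gamma'} + h^{-\frac43+\frac12\delta_1} + \varepsilon^2h^{-3+\frac12\gamma'}\bigr). \]
As $-\frac{2-\gamma'}{2}\bigl(\frac43 - \frac12\delta_1\bigr) - \gamma' - \frac12\gamma' + 3 = \bigl(\frac53 + \frac12\delta_1\bigr)\bigl(1 - \frac12\gamma'\bigr) > 0$ for $\gamma' < 2$, we obtain from (43) and Lemma 3 for positive $\varepsilon$ that the right-hand side of (27) is bounded from above by
\[
\begin{aligned}
\limsup_{h\searrow 0}\liminf_{\varepsilon\searrow 0}\frac{q}{6\pi^2}h^{-\frac12\gamma'}\int U_h(x)\bigl(\phi^{TF}_1(x) + \varepsilon U_h(x)\bigr)^{3/2}dx
&= \limsup_{h\searrow 0}\liminf_{\varepsilon\searrow 0}\frac{q}{6\pi^2}\int U(y)\bigl(h^{\gamma'}\phi^{TF}_1(h^{\gamma'}y) + \varepsilon U(y)\bigr)^{3/2}dy \\
&= \frac{q}{6\pi^2}\int_{\mathbb{R}^3}U(x)|x|^{-3/2}dx.
\end{aligned}
\]
Therefore, we obtain
\[ \limsup_{Z\to\infty}\int_{\mathbb{R}^3}U\hat\rho_Z \le \frac{q}{6\pi^2}\int_{\mathbb{R}^3}U(x)|x|^{-3/2}dx. \]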
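The exponent bookkeeping behind the claim that the first error term of (43) is negligible after division by $\varepsilon h^{\frac12\gamma'-3}$ can be checked with exact rational arithmetic. This is only a sketch; the function names are illustrative, and the inputs are assumed to be `Fraction` values with $0 < \gamma' < 2$.

```python
from fractions import Fraction

def remainder_exponent(gamma_prime, delta1):
    """Exponent of h in the first error term of (43)."""
    return -(2 - gamma_prime) / 2 * (Fraction(4, 3) - delta1 / 2) - gamma_prime

def margin(gamma_prime, delta1):
    """Exponent of h left after dividing that error by h^(gamma'/2 - 3)."""
    return remainder_exponent(gamma_prime, delta1) - gamma_prime / 2 + 3

def factored_margin(gamma_prime, delta1):
    """The factored form (5/3 + delta1/2)(1 - gamma'/2) quoted in the text."""
    return (Fraction(5, 3) + delta1 / 2) * (1 - gamma_prime / 2)
```

Both expressions agree identically, and the margin is positive exactly when $\gamma' < 2$, so the remainder vanishes as $h \searrow 0$.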
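Taking $\varepsilon$ negative and reversing the sign in inequality (27), we obtain in a similar way the lower bound
\[
\begin{aligned}
\liminf_{Z\to\infty}\int_{\mathbb{R}^3}U\hat\rho_Z
&\ge \limsup_{h\searrow 0}\liminf_{\varepsilon\searrow 0}\frac{\operatorname{tr}\{g(K^{0}_h)\} - \operatorname{tr}\{g(K^{\varepsilon}_h)\}}{\varepsilon h^{\frac12\gamma'-3}} \\
&\ge \limsup_{h\searrow 0}\liminf_{\varepsilon\searrow 0}\frac{q}{6\pi^2}h^{-\frac12\gamma'}\int U_h(x)\bigl(\phi^{TF}_1(x) + \varepsilon U_h(x)\bigr)^{3/2}dx \\
&= \frac{q}{6\pi^2}\int_{\mathbb{R}^3}U(x)|x|^{-3/2}dx.
\end{aligned}
\]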
By combining the upper and lower bounds, we conclude the proof of Theorem 2.
4. An Alternative Proof of the Weak Version of the Conjecture by Lieb about Atomic Cores Related to Scott's Correction

In this section we give an alternative proof of the "weak" convergence of the total density to the hydrogen density in the Scott region, that is, the weak version of the conjecture by Lieb about atomic cores related to Scott's correction. In [4], this conjecture was proved by a different method in several senses. We will deduce the result from the proof of Theorem 2 in Sect. 3. We prove the following theorem:

Theorem 6. For $U$ such that $\sup\{|U(x)||x|^2\} < \infty$ and for $\hat\rho_Z(x) = Z^{-3}\rho_d(Z^{-1}x)$, $\rho_d$ defined in (3), and for $\rho^H_1$ defined in (6), (7),
\[ \int_{\mathbb{R}^3}\hat\rho_Z(x)U(x)\,dx \to \int_{\mathbb{R}^3}\rho^H_1(x)U(x)\,dx \tag{49} \]
as $Z \to \infty$.

Proof. First, we suppose that $U$ satisfies conditions (12) and (13). We follow the proof of Theorem 2, allowing the parameter $\gamma$ there to be equal to one (respectively $\gamma' = 2$), until formula (35). We decompose the first term in (35) again using $\theta_2 = 1 - \theta_1$:
\[
\begin{aligned}
\operatorname{tr}[\theta_1 g(K^{H,0}_h) - \theta_1 g(K^{H,\varepsilon}_h)]
&= \operatorname{tr}[g(K^{H,0}_h) - g(K^{H,\varepsilon}_h)] - \operatorname{tr}[\theta_2 g(K^{H,0}_h) - \theta_2 g(K^{H,\varepsilon}_h)] \\
&= h^{-2}\operatorname{tr}[g(K^{H,0}_1) - g(\tilde K^{H,\varepsilon}_1)] - \operatorname{tr}[\theta_2 g(K^{H,0}_h) - \theta_2 g(K^{H,\varepsilon}_h)],
\end{aligned}
\tag{50}
\]
where we used the scaling property of $K^{H,\varepsilon}_h$. Here
\[ \tilde K^{H,\varepsilon}_1 = -\Delta - \frac{1}{|x|} - \varepsilon U(x). \]
Though $U$ is not spherically symmetric, we can also index the eigenvalues according to angular momentum decomposition by the $l$-value they have when $\varepsilon$ tends to zero, using the analyticity of eigenvalues in $\varepsilon$. For each $k$ pick $n_k, l_k, m_k$ such that
\[ \lim_{\varepsilon\searrow 0}e^H_k(\varepsilon) = e^H_{n_k,l_k,m_k}(0) = e^H_{n_k,l_k,0}(0), \]
which may be done in a way such that the map $k \mapsto (n_k, l_k, m_k)$ is one-to-one. Here $e^H_{n,l,m}(0)$ are the eigenvalues of $K^{H,0}_1$. For any $L \in \mathbb{N}$ define
\[ K_L := \inf\{k \mid e^H_k(\varepsilon) \ge e^H_{1,L,0}(\varepsilon)\}. \tag{51} \]
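To proceed further we need the following lemma.

Lemma 4. For any $L \in \mathbb{N}$, $\varepsilon \le 1/(4C_0)$ and $K_L$ defined above we have
\[ \operatorname{tr}[g(K^{H,0}_1) - g(\tilde K^{H,\varepsilon}_1)] = \sum_{k=0}^{K_L-1}\bigl[e^H_k(0) - e^H_k(\varepsilon)\bigr] + O\Bigl(\frac{\varepsilon}{L}\Bigr). \tag{52} \]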
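Proof. From the condition $U(x) \le C_0\frac{1}{|x|}\frac{1}{(1+|x|)^3}$ it follows that $U(x) \le C_0\frac{1}{|x|^2}$. We fix $\varepsilon > 0$; then
\[ \tilde K^{H,\varepsilon}_1 = -\Delta - \frac{1}{|x|} - \varepsilon U(x) \ge -\Delta - \frac{1}{|x|} - \frac{\varepsilon C_0}{|x|^2}. \]
Indexing the eigenvalues of $\tilde K^{H,\varepsilon}_1$, $e^H_k(\varepsilon)$, as above and calculating the eigenvalues of the operator $-\Delta - 1/|x| - \varepsilon C_0/|x|^2$, which are equal to $e^H_{n,l',0}(0)$, where $l' = \sqrt{(l+1/2)^2 - \varepsilon C_0} - 1/2$ (see also [11, 12]), we get
\[ e^H_k(0) - e^H_k(\varepsilon) = e^H_{n,l,0}(0) - e^H_{n,l,0}(\varepsilon) \le e^H_{n,l,0}(0) - e^H_{n,l',0}(0). \]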
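The size of the shift $e^H_{n,l,0}(0) - e^H_{n,l',0}(0)$ can be illustrated numerically. The sketch below assumes the standard hydrogen levels $-1/(4(n+l)^2)$, $n = 1, 2, \dots$, for $-\Delta - 1/|x|$ and writes $c$ for the product $\varepsilon C_0$; the function names are illustrative only. To first order in $c$ the shift is $c/\bigl(4(n+l)^3(l+1/2)\bigr)$, the quantity that is later summed over $n$ and $l$.

```python
import math

def hydrogen_level(n, l_eff):
    """Level -1/(4 (n + l_eff)^2) of -Delta - 1/|x| with effective angular momentum l_eff."""
    return -1.0 / (4.0 * (n + l_eff) ** 2)

def effective_l(l, c):
    """l' solving l'(l' + 1) = l(l + 1) - c, i.e. sqrt((l + 1/2)^2 - c) - 1/2."""
    return math.sqrt((l + 0.5) ** 2 - c) - 0.5

def level_shift(n, l, c):
    """e_{n,l}(0) - e_{n,l'}(0): how far the extra -c/|x|^2 term lowers the level."""
    return hydrogen_level(n, l) - hydrogen_level(n, effective_l(l, c))

def first_order_shift(n, l, c):
    """Leading Taylor term c / (4 (n + l)^3 (l + 1/2)) of the shift."""
    return c / (4.0 * (n + l) ** 3 * (l + 0.5))
```

For small $c$ the exact shift exceeds the first-order term only by a contribution of order $c^2$, which is what the Lagrange remainder controls.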
Then, expanding l0 = l0 () in a Taylor series at = 0 until the second term with the remainder in Lagrange’s form, we get H eH k (0) − ek () ≤
=
1 4
1 1 1 1 p − 4 (n + (l + 1/2)2 − C0 − 1/2)2 4 (n + l)2 C0
p
p (n + (l + − 1/2)3 (l + 1/2)2 " 2 C02 3 p + 2 4 (n + (l + 1/2) − ϑ − 1/2)4 ((l + 1/2)2 − ϑ) #! 1 p + (n + (l + 1/2)2 − ϑ − 1/2)3 ((l + 1/2)2 − ϑ)3/2 1/2)2
for some $\vartheta$ with $0 < \vartheta < \varepsilon C_0$. We obtain an upper bound by replacing $\vartheta$ by any number larger than or equal to $\varepsilon C_0$ and smaller than $1/4$. Note that the sum over $n$ is convergent for each term separately. Moreover, for the last two terms the summation over $l$, when extended to infinity, is convergent, too. We prove the statement for $\varepsilon > 0$ by calculating
\[
\begin{aligned}
\sum_{l=L}^{\infty}(2l+1)\sum_{n=1}^{\infty}\frac{\varepsilon C_0}{4(n+l)^3(l+1/2)}
&= \frac{\varepsilon C_0}{4}\sum_{l=L}^{\infty}\frac{2l+1}{l+1/2}\Bigl[\frac{1}{2(l+1)^2} + O((l+1)^{-3})\Bigr] \\
&= \frac{\varepsilon C_0}{4}\Bigl[\frac{1}{L+1} + O((L+1)^{-2})\Bigr],
\end{aligned}
\]
twice using the following asymptotic expansion as $x \to \infty$ of the Riemann zeta function (see Erdélyi et al. [2], formula 1.18(9)):
\[ \zeta(s,x) = \sum_{n=0}^{\infty}(n+x)^{-s} = \frac{\Gamma(s-1)}{\Gamma(s)}x^{1-s} + \frac12x^{-s} + O(x^{-1-s}). \]
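The two leading terms of this expansion can be compared with a direct evaluation of the sum. The sketch below is a plain numerical check, not the derivation in [2]; the truncation length `terms` and the function names are assumptions of the example, and it uses $\Gamma(s-1)/\Gamma(s) = 1/(s-1)$.

```python
def hurwitz_zeta(s, x, terms=20_000):
    """Sum_{n>=0} (n + x)^(-s) for s > 1: direct partial sum plus an integral tail."""
    head = sum((n + x) ** -s for n in range(terms))
    y = terms + x
    # Euler-Maclaurin estimate of the remaining sum over n >= terms
    tail = y ** (1 - s) / (s - 1) + 0.5 * y ** -s
    return head + tail

def leading_terms(s, x):
    """x^(1-s)/(s-1) + x^(-s)/2, the first two terms of the expansion."""
    return x ** (1 - s) / (s - 1) + 0.5 * x ** -s
```

For $s = 3$ this reproduces $\sum_{n\ge1}(n+l)^{-3} = \zeta(3, l+1) \approx \frac{1}{2(l+1)^2}$, and for $s = 2$ it reproduces $\sum_{l\ge L}(l+1)^{-2} = \zeta(2, L+1) \approx \frac{1}{L+1}$, the two uses made in the calculation.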
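For $\varepsilon < 0$ we have an upper bound
\[ \tilde K^{H,\varepsilon}_1 = -\Delta - \frac{1}{|x|} - \varepsilon U(x) \le -\Delta - \frac{1}{|x|} - \frac{\varepsilon C_0}{|x|^2}. \]
Therefore,
\[ e^H_k(\varepsilon) - e^H_k(0) = e^H_{n,l,0}(\varepsilon) - e^H_{n,l,0}(0) \le e^H_{n,l',0}(0) - e^H_{n,l,0}(0), \]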
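and again by the Taylor expansion we obtain the statement for $\varepsilon < 0$.

Applying Lemma 4, formula (52), to the first trace on the right-hand side of (50) and Theorem 4 to the last trace on the right-hand side of (50), we have from (35)
\[
\begin{aligned}
\operatorname{tr}\{g(K^{0}_h)\} - \operatorname{tr}\{g(K^{\varepsilon}_h)\}
={}& h^{-2}\sum_{k=0}^{K_L-1}\bigl[e^H_k(0) - e^H_k(\varepsilon)\bigr] \\
&- q(2\pi h)^{-3}\iint\theta_2\bigl[g(k^{H,0}) - g(k^{H,\varepsilon})\bigr]dx\,d\xi \\
&+ q(2\pi h)^{-3}\iint\theta_1\bigl[g(k^{0}) - g(k^{H,0})\bigr]dx\,d\xi \\
&- q(2\pi h)^{-3}\iint\theta_1\bigl[g(k^{\varepsilon}) - g(k^{H,\varepsilon})\bigr]dx\,d\xi \\
&+ q(2\pi h)^{-3}\iint\theta_2\bigl[g(k^{0}) - g(k^{\varepsilon})\bigr]dx\,d\xi + O(h^{-1}r^{-1/2} + h^{-2}r + h^{-2}\varepsilon L^{-1}) \\
={}& h^{-2}\sum_{k=0}^{K_L-1}\bigl[e^H_k(0) - e^H_k(\varepsilon)\bigr] - q(2\pi h)^{-3}\iint\bigl[g(k^{H,0}) - g(k^{H,\varepsilon})\bigr]dx\,d\xi \\
&+ q(2\pi h)^{-3}\iint\bigl[g(k^{0}) - g(k^{\varepsilon})\bigr]dx\,d\xi + O(h^{-1}r^{-1/2} + h^{-2}r + h^{-2}\varepsilon L^{-1}).
\end{aligned}
\tag{53}
\]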
In the limit $\lim_{L\to\infty}\lim_{\varepsilon\searrow 0}\lim_{h\searrow 0}$ the sum of the second and the third term in (53) multiplied by $h^2\varepsilon^{-1}$ gives zero, which easily follows from Lemma 3, putting $\gamma' = 2$ there, and from the analogous bound from below:
\[ q(2\pi h)^{-3}\iint\bigl[g(k^{0}) - g(k^{\varepsilon})\bigr]dx\,d\xi \ge \varepsilon\frac{q}{6\pi^2h^3}\int U_h(x)\bigl(\phi^{TF}_1(x)\bigr)^{3/2}dx - C\varepsilon^2h^{-3+\frac12\gamma'}. \]
Choosing $r = h^{\frac23-\delta_1}$ and taking the limit $\lim_{L\to\infty}\lim_{\varepsilon\searrow 0}\lim_{h\searrow 0}$ on the right- and left-hand sides of (53), we obtain (49) by perturbation theory.

We will now weaken the conditions on $U$ by approximating the general $U$, $\sup\{|U(x)||x|^2\} < \infty$, by a sequence of $U_n$ satisfying conditions (12) and (13) for all $n$. Let $U_n$ be such a sequence, i.e., $\lim_{n\to\infty}\sup\{|U_n - U||x|^2\} = 0$. It is enough to prove that
\[ \lim_{Z\to\infty}\int_{\mathbb{R}^3}\hat\rho_ZU = \lim_{n\to\infty}\lim_{Z\to\infty}\int_{\mathbb{R}^3}\hat\rho_ZU_n. \tag{54} \]
The last property follows from the following estimates:
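\[
\begin{aligned}
\Bigl|\int_{\mathbb{R}^3}\hat\rho_ZU - \int_{\mathbb{R}^3}\hat\rho_ZU_n\Bigr|
&\le \int_{\mathbb{R}^3}\hat\rho_Z|U - U_n| \\
&\le \sup\{|U - U_n||x|^2\}\cdot\int_{\mathbb{R}^3}\frac{Z^{-3}\rho_Z(Z^{-1}x)}{|x|^2}dx \\
&= \sup\{|U - U_n||x|^2\}\cdot Z^{-2}\int_{\mathbb{R}^3}\frac{\rho_Z(x)}{|x|^2}dx \\
&\le \mathrm{const}\cdot\sup\{|U - U_n||x|^2\},
\end{aligned}
\]
where in the last step we used the bound $\int_{\mathbb{R}^3}\frac{\rho_Z(x)}{|x|^2}dx \le \mathrm{const}\,Z^2$ proved in [11, 12].

5. Extension to Molecules

In this section we shall formulate the extension of Theorem 2 to the molecular case in the spirit of Sect. 4 of [4]. The ground state energy of a neutral molecule with nuclear charges $Z_1 = \lambda z_1, \dots, Z_K = \lambda z_K$ and positions of the nuclei at $R_1, \dots, R_K$ is given as
\[ E(N,Z) = \inf\{\inf\sigma(H_{N,Z,R}) \mid R \in \mathbb{R}^{3K}\}, \tag{55} \]
where
\[ H_{N,Z,R} = \sum_{\nu=1}^{N}\Bigl(-\Delta_\nu - \sum_{\kappa=1}^{K}\frac{Z_\kappa}{|x_\nu - R_\kappa|}\Bigr) + \sum_{\substack{\mu,\nu=1\\ \mu<\nu}}^{N}\frac{1}{|x_\mu - x_\nu|} + \sum_{\substack{\kappa,\kappa'=1\\ \kappa<\kappa'}}^{K}\frac{Z_\kappa Z_{\kappa'}}{|R_\kappa - R_{\kappa'}|} \tag{56} \]
is self-adjointly realized in $\mathfrak{H}_N$. Here $Z$ denotes the $K$-tuple $(Z_1, \dots, Z_K)$ and $R$ the $3K$-tuple $(R_1, \dots, R_K)$. We also set $z := (z_1, \dots, z_K)$. Solovej [19] recently showed that for arbitrary but fixed $z$ and $N = Z_1 + \dots + Z_K$,
\[ E(N,Z) = \sum_{\kappa=1}^{K}E(Z_\kappa, Z_\kappa) + o(\lambda^{5/3}) \tag{57} \]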
holds as $\lambda$ tends to infinity and that the minimizing inter-nuclear distances are of order $\lambda^{-5/21}$ or bigger. These results imply, among other things, that the atomic Scott and Schwinger correction for the energy and the extension of the Scott conjecture to the density (see [4]) imply the molecular ones, and they allow us to generalize Theorem 2 as well:

Theorem 7. Assume that the minimizing inter-nuclear distances satisfy
\[ E(N,Z) = \inf\{\inf\sigma(H_{N,Z,R}) \mid R \in \mathbb{R}^{3K},\ \forall\,1 \le \kappa < \kappa' \le K:\ |R_\kappa - R_{\kappa'}| \ge R := \mathrm{const}\,\lambda^{\iota}\} \tag{58} \]
with $\iota > -1/4$. Assume $N = Z_1 + \dots + Z_K$, $Z_1 = \lambda z_1, \dots, Z_K = \lambda z_K$ with given fixed $z_1, \dots, z_K$. Furthermore fix $\kappa_0 \in \{1, \dots, K\}$ and pick a sequence of ground state density matrices $d$ of $H_{N,Z,R}$ with densities $\rho_d$. Define $\hat\rho_{\lambda,\kappa_0}(x) := \rho_\lambda((x - R_{\kappa_0})\lambda^{-\gamma})\lambda^{-3/2-3/2\gamma}$ for $1/3 < \gamma < 1$. Assume $U$ to be any real function, smooth outside the points $R_\kappa$, $\kappa = 1, 2, \dots, K$, which obeys
\[ |\partial^\nu U(x)| \le C_\nu\,l(x)^{-|\nu|-1}\langle l(x)\rangle^{-3}, \quad |\nu| \ge 0, \]
for $l(x) \ge 2$, where $l(x) = \min_\kappa|x - R_\kappa|$ and $\langle l\rangle = \sqrt{1 + l^2}$, and
\[ |\partial^\nu U(x)| \le C_\nu\,l(x)^{-|\nu|}, \quad |\nu| \ge 0, \]
for $l(x) \le 2$. Here $\nu$ stands for a 3-tuple of non-negative integers: $\nu = (\nu_1, \nu_2, \nu_3)$, $|\nu| = \nu_1 + \nu_2 + \nu_3$. Then, as $\lambda \to \infty$,
\[ \int_{\mathbb{R}^3}\hat\rho_{\lambda,\kappa_0}(x)U(x)\,dx \to \frac{q}{6\pi^2}\int_{\mathbb{R}^3}U(x)|x|^{-3/2}dx. \]

The proof of this theorem is achieved by applying the Ivrii and Sigal technique [5] to the case of neutral atoms ($K = 1$) and generalizing to molecules using the result of Solovej [19], as in Sect. 4 of [4]. Note that in Theorem 2 the condition on $\iota$ can be weakened to $\iota > -5/9 + \varepsilon$, with $\varepsilon > 0$, by applying the method from [5] directly to molecules.

Acknowledgement. The author thanks H. Siedentop for his constant support during discussions of the occurring problems and A. Sobolev for his remarks and suggestions.
References
1. Bach, V.: A proof of Scott's conjecture for ions. Rep. Math. Phys. 28 (2), 213–248 (October 1989)
2. Erdélyi, A., Magnus, W., Oberhettinger, F., Tricomi, F.G.: Higher Transcendental Functions, Volume 1. New York: McGraw-Hill, 1 edition, 1953
3. Heilmann, O.J., Lieb, E.H.: The electron density near the nucleus of a large atom. Phys. Rev. A 52 (5), 3628–3643 (1995)
4. Iantchenko, A., Lieb, E.H., Siedentop, H.: Proof of a conjecture about atomic and molecular cores related to Scott's correction. J. für die Reine und Angewandte Math. 472, 177–195 (March 1996)
5. Ivrii, V.Ja., Sigal, I.M.: Asymptotics of the ground state energies of large Coulomb systems. Annals of Math. 138 (2), 243–335 (1993)
6. Lieb, E.H.: A lower bound for Coulomb energies. Phys. Lett. 70A, 444–446 (1979)
7. Lieb, E.H.: Thomas-Fermi and related theories of atoms and molecules. Rev. Mod. Phys. 53 (4), 603–641 (October 1981)
8. Lieb, E.H., Oxford, S.: Improved lower bound on the indirect Coulomb energy. Intern. J. Quantum Chem. 19, 427–439 (1981)
9. Lieb, E.H., Simon, B.: The Thomas-Fermi theory of atoms, molecules and solids. Adv. Math. 23, 22–116 (1977)
10. Scott, J.M.C.: The binding energy of the Thomas-Fermi atom. Phil. Mag. 43, 859–867 (1952)
11. Siedentop, H.: An upper bound for the atomic ground state density at the nucleus. Lett. Math. Phys. 32 (3), 221–229 (November 1994)
12. Siedentop, H.: Bound for the atomic ground state density at the nucleus. In: R. Froese, J. Feldman and L.M. Rosen (eds.), Proceedings of the Canadian Mathematical Society Annual Seminar held in Vancouver, BC, August 4–14, 1993. Providence, Rhode Island: American Mathematical Society, 1995
13. Siedentop, H., Weikard, R.: On the leading energy correction for the statistical model of the atom: Interacting case. Commun. Math. Phys. 112, 471–490 (1987)
14. Siedentop, H., Weikard, R.: On the leading correction of the Thomas-Fermi model: Lower bound, with an appendix by A.M.K. Müller. Invent. Math. 97, 159–193 (1989)
15. Sobolev, A.V.: The sum of eigenvalues for Schrödinger operator with Coulomb singularities in a homogeneous magnetic field. Université de Nantes, preprint, 1993
16. Sobolev, A.V.: The quasi-classical asymptotics of local Riesz means for the Schrödinger operator in a strong homogeneous magnetic field. Duke Math. J. 74 (2), 319–429 (1994)
17. Sobolev, A.V.: The quasi-classical asymptotics of local Riesz means for the Schrödinger operator in a moderate magnetic field. Ann. Inst. Henri Poincaré 62 (4), 325–359 (1995)
18. Sobolev, A.V.: Discrete spectrum asymptotics for the Schrödinger operator with a singular potential and a magnetic field. In: Proceedings of London Mathematical Society, accepted for publication
19. Solovej, J.P.: In preparation.

Communicated by B. Simon
Commun. Math. Phys. 184, 387 – 395 (1997)
Communications in
Mathematical Physics c Springer-Verlag 1997
On the Decay of Correlations in Non-Analytic SO(n)-Symmetric Models

Ali Naddaf*,**
Department of Mathematics, The University of British Columbia, Vancouver, BC V6T 1Z2, Canada

* Research supported in part by the Natural Science and Engineering Research Council of Canada
** Present address: The University of Michigan, Department of Mathematics, Ann Arbor, MI 48109-1109, USA

Received: 13 May 1996 / Accepted: 31 July 1996

Abstract: We extend the method of complex translations which was originally employed by McBryan-Spencer [2] to obtain a decay rate for the two point function in two-dimensional SO(n)-symmetric models with non-analytic Hamiltonians for $n \ge 2$.

1. Introduction

The Mermin-Wagner theorem [3] rules out the possibility of having a spontaneous symmetry breaking at non-zero temperatures for the two dimensional Heisenberg model. As a result, the two point function for this model at non-zero temperature decays to zero at large spatial separations. This theorem, however, does not give any rate for the decay of the two point function. McBryan and Spencer [1, 2] employed the method of complex translation to get the expected power law decay (for low temperatures). Although the analyticity of the interaction seems to be crucial in their argument, it is the purpose of the present paper to show that a proper modification of their method gives similar bounds under much weaker assumptions: we will obtain a power law decay for the two point function by assuming that the interaction is only smooth (see Theorem 2). The main goal of this paper is to replace the analyticity assumption with some smoothness assumption; therefore we do not attempt to obtain the weakest smoothness requirement. A more careful treatment can reduce the degree of the required smoothness.

To have a better idea about the necessary modifications, we shall now briefly review the original argument of McBryan-Spencer. Let $n = 2$ and consider an SO(2)-symmetric ferromagnet on $\mathbb{Z}^2$. To each site $i \in \mathbb{Z}^2$, assign a spin $s(i)$ of unit length. Using the representation $s = (\cos\theta, \sin\theta)$, $-\pi \le \theta \le \pi$, for spins, the spin-spin correlation function at inverse temperature $\beta = T^{-1}$ can be written as
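\[ \langle s_0\cdot s_x\rangle = \langle\cos(\theta_0 - \theta_x)\rangle = Z^{-1}\prod_i\int_{-\pi}^{\pi}\cos(\theta_0 - \theta_x)\,e^{\beta H(\theta)}\,d\theta_i, \tag{1.1} \]
where
\[ H(\theta) = \sum_{\|i-j\|=1}\cos(\theta_i - \theta_j). \]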
Theorem 1 (McBryan-Spencer). For any $\varepsilon > 0$ and $\beta \ge \beta_0(\varepsilon)$ sufficiently large,
\[ \langle s_0\cdot s_x\rangle \le \|x\|^{-(1-\varepsilon)/(2\pi\beta)}. \tag{1.2} \]
Proof (taken from [2]). Fix a large periodic box of size $L$. The following estimates are uniform in $L$, hence we omit any reference to it. We first note that $\langle\sin(\theta_0 - \theta_x)\rangle = 0$ implies that
\[ \langle\cos(\theta_0 - \theta_x)\rangle = \langle e^{i(\theta_0 - \theta_x)}\rangle. \]
Using the periodicity and analyticity of the integrand, we make complex translations
\[ \theta(y) \to \theta(y) + i\tilde a(y), \qquad \tilde a(y) \overset{\mathrm{def.}}{=} \beta^{-1}[C(y) - C(y - x)], \tag{1.3} \]
in the numerator of (1.1). Here $C(y)$ is the fundamental solution of Laplace's equation on the finite lattice: $-\Delta C(y) = \delta_{0,y}$, $C(0) = 0$ (see (2.10)). The above translation means that we deform the path of integration and use the periodicity of the cosine to cancel the lateral contours. Then we get
\[
\begin{aligned}
\langle s_0\cdot s_x\rangle
&\le e^{-(\tilde a_0 - \tilde a_x)}Z^{-1}\prod_i\int\exp\Bigl\{\beta\sum_{\|i-j\|=1}\cos(\theta_i - \theta_j)\cosh(\tilde a_i - \tilde a_j)\Bigr\}d\theta_i \\
&\le \exp\Bigl\{-(\tilde a_0 - \tilde a_x) + \beta\sum_{\|i-j\|=1}[\cosh(\tilde a_i - \tilde a_j) - 1]\Bigr\}.
\end{aligned}
\tag{1.4}
\]
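From the properties of the fundamental solution $C(y)$, it follows that (see (2.11) below)
\[ |\tilde a(i) - \tilde a(j)| \le 4\beta^{-1} \quad\text{if } |i - j| = 1, \text{ uniformly in } x, i, j. \]
Thus for any $\varepsilon > 0$, we can find $\beta_0(\varepsilon)$ such that for $\beta \ge \beta_0(\varepsilon)$,
\[
\begin{aligned}
\sum_{\|i-j\|=1}[\cosh(\tilde a_i - \tilde a_j) - 1]
&\le \frac12(1+\varepsilon)\sum_{\|i-j\|=1}(\tilde a_i - \tilde a_j)^2 \\
&= \frac12(1+\varepsilon)(\tilde a, -\Delta\tilde a) \\
&= (1+\varepsilon)(2\beta)^{-1}(\tilde a_0 - \tilde a_x).
\end{aligned}
\]
Noting that $\tilde a_0 - \tilde a_x = -2\beta^{-1}C(x)$, we obtain the bound of Theorem 1.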
Theorem 1 also holds for $n > 2$ if one multiplies the given upper bound by $n/2$ in (1.2). To see this, parameterize the $n$-sphere by angles $\phi^{(1)}, \dots, \phi^{(n-2)}, \theta$, $|\phi^{(r)}| \le \pi/2$, $|\theta| \le \pi$, in such a way that only the components $s^{(1)}, s^{(2)}$ of a unit spin vector $s$ involve $\theta$. Then one can treat $\langle s^{(1)}_0s^{(1)}_x + s^{(2)}_0s^{(2)}_x\rangle = (2/n)\langle s_0\cdot s_x\rangle$ as for the case $n = 2$, translating only the variables $\theta_i$.

Now we shall consider a more general, possibly non-analytic, Hamiltonian. For the very same reason, it suffices to consider the $n = 2$ case. Therefore we introduce the Hamiltonian
\[ H(\theta) = \sum_b f(\nabla_b\theta), \]
where the sum is over all bonds of a finite periodic box $\Lambda(L)$ of $\mathbb{Z}^2$, of size $(2L+1)\times(2L+1)$, centered at the origin. For each bond $b$ with end-points $i$ and $j$, $\nabla_b\theta \overset{\mathrm{def.}}{=} \theta(i) - \theta(j)$. Here $f$ is an even, smooth periodic function of its argument, with period $2\pi$, with the following Fourier series expansion:
\[ f(x) = \sum_{n\ge 0}c_n\cos nx. \tag{1.5} \]
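We would like to obtain an upper bound similar to what we saw in Theorem 1 for the two point function
\[ \langle s_0\cdot s_x\rangle_\Lambda = \langle\cos(\theta_0 - \theta_x)\rangle_\Lambda = \langle e^{i(\theta_0 - \theta_x)}\rangle_\Lambda = \frac{1}{Z(\Lambda)}\int e^{i(\theta_0 - \theta_x) + H_\Lambda(\theta)}\,D_\Lambda\theta, \]
uniformly in the size of the box $\Lambda(L)$. Here
\[ \int D_\Lambda\theta = \prod_{i\in\Lambda}\int_{-\pi}^{\pi}d\theta_i, \]
and $Z(\Lambda)$ is the partition function in the finite box $\Lambda(L)$: $Z(\Lambda) = \int\exp(H_\Lambda)\,D_\Lambda\theta$.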
Theorem 2. Let $L \gg \|x\|$ and assume that for some constants $p > 3$ and $C > 0$,
\[ |c_m| \le \frac{C}{m^p} \quad\text{for all } m \ge 1, \tag{1.6} \]
where $c_m$ is given in (1.5). Then there is a constant $B > 0$, independent of the size of the box $\Lambda(L)$, such that
\[ \langle e^{i(\theta_0 - \theta_x)}\rangle_\Lambda \le B\exp\Bigl(\frac{-1}{\beta}\ln\|x\|\Bigr), \]
where
\[ \beta \overset{\mathrm{def.}}{=} \frac43\Bigl(\sum_m m^2|c_m|\Bigr). \tag{1.7} \]

Remark 1. We will not establish the existence of infinite volume quantities here; however, our upper bound is uniform in the size of the box $\Lambda(L)$.

Remark 2. The coefficient $\frac43$ in (1.7) can be easily pushed down to anything greater than 1.
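One way to see where the coefficient $\frac43$ comes from, and why it can approach 1, is through the elementary bound $\cosh t - 1 \le \kappa(T)\,t^2$ for $|t| \le T$ with $\kappa(T) = (\cosh T - 1)/T^2$. The sketch below assumes, hypothetically, that the constraint on the translation is $|n(b)\nabla_bA| \le T$ for a threshold $T$ (the proof uses $T = 1$ and the cruder constant $\frac23$); the names `cosh_ratio` and `beta_coefficient` are illustrative only.

```python
import math

def cosh_ratio(t):
    """(cosh t - 1) / t^2, which increases from 1/2 as |t| grows."""
    return (math.cosh(t) - 1.0) / (t * t)

def beta_coefficient(T):
    """Coefficient c for which beta = c * sum_m m^2 |c_m| suffices when |n(b) grad_b A| <= T."""
    return 2.0 * cosh_ratio(T)
```

Under this assumption, `beta_coefficient(1.0)` already lies below $\frac43$, and the value tends to 1 as $T \to 0$, at the cost of a stricter constraint on the translation.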
2. Proof of Theorem 2

We shall drop the subscript $\Lambda(L)$ in what follows, while keeping in mind that we are working in a large but finite box. We also warn the reader that the same letter may represent different constants at different places. We expand $f$ in a Fourier series and then truncate the series in a position-dependent fashion:
\[ f(\nabla_b\theta) \overset{\mathrm{def.}}{=} M_b(\nabla_b\theta) + R_b(\nabla_b\theta), \]
where
\[ M_b(\nabla\theta) \overset{\mathrm{def.}}{=} \sum_{m=0}^{n(b)}c_m\cos m\nabla\theta, \]
and
\[ R_b(\nabla\theta) \overset{\mathrm{def.}}{=} \sum_{m>n(b)}c_m\cos m\nabla\theta. \]
Integers $n(b)$ will be chosen later. Note that $M_b$ is an analytic function of its argument. We then write
\[ \langle e^{i(\theta_0 - \theta_x)}\rangle = \frac{1}{Z}\int e^{i(\theta_0 - \theta_x) + \sum_bM_b(\nabla_b\theta)}\prod_b\bigl[\bigl(e^{R_b(\nabla_b\theta)} - 1\bigr) + 1\bigr]D\theta = \sum_{k\ge 0}I^{(k)}, \]
where
\[ I^{(k)} \overset{\mathrm{def.}}{=} \sum_{[b_1,\dots,b_k]}\frac{1}{Z}\int e^{i(\theta_0 - \theta_x) + \sum_bM_b(\nabla_b\theta)}D(b_1)\cdots D(b_k)\,D\theta, \qquad D(b) \overset{\mathrm{def.}}{=} e^{R_b(\nabla_b\theta)} - 1. \]
Here the sum over $[b_1, \dots, b_k]$ is understood as the sum over all mutually different bonds in $\Lambda(L)$. $D(b)$ will be called a defect at the bond $b$. As we will see, for large values of $k$, $|I^{(k)}|$ is small due to the presence of $k$ small defects. On the other hand, for small $k$, we shall estimate $|I^{(k)}|$ by complex translations similar to the ones we saw in the introduction. A combination of these two different estimates will give us the claimed upper bound. We have the following estimate for $I^{(k)}$:
\[
\begin{aligned}
|I^{(k)}| &\le \sum_{[b_1,\dots,b_k]}\frac{1}{Z}\int\exp\Bigl\{\sum_bM_b(\nabla_b\theta)\Bigr\}|D(b_1)|\cdots|D(b_k)|\,D\theta \\
&= \sum_{[b_1,\dots,b_k]}\Bigl\langle e^{-\sum_bR_b(\nabla_b\theta)}|D(b_1)|\cdots|D(b_k)|\Bigr\rangle.
\end{aligned}
\]
Moreover
\[ |R_b(\nabla_b\theta)| \le \sum_{m>n(b)}|c_m| \le \frac{C}{n(b)^{p-1}}, \]
which in turn implies:
\[ \exp\Bigl\{-\sum_bR_b(\nabla_b\theta)\Bigr\} \le \exp\Bigl\{C\sum_bn(b)^{1-p}\Bigr\}. \]
On the other hand, if $n(b)$, for all $b$'s, is chosen to be large enough, then we have
\[ \Bigl|\sum_bD(b)\Bigr| \le \sum_b\frac{C}{n(b)^{p-1}} \le \delta < 1. \tag{Hyp-1} \]
These together imply
\[ |I^{(k)}| \le e^{\delta}\delta^k. \tag{2.8} \]
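The tail bound $\sum_{m>n}|c_m| \le C/n^{p-1}$ used for $R_b$ is an instance of the integral test: under (1.6), $\sum_{m>n}m^{-p} \le \int_n^\infty t^{-p}dt = n^{1-p}/(p-1)$, which is at most $n^{1-p}$ once $p \ge 2$. The following sketch checks this numerically; the finite `cutoff` and the function names are assumptions of the example.

```python
def tail_sum(p, n, cutoff=50_000):
    """Partial sum of sum_{m>n} m^(-p), truncated at a finite cutoff (a lower estimate)."""
    return sum(m ** -p for m in range(n + 1, cutoff))

def tail_bound(p, n):
    """Integral-test bound n^(1-p) / (p - 1) for sum_{m>n} m^(-p), p > 1."""
    return n ** (1 - p) / (p - 1)
```

Multiplying by the constant $C$ of (1.6) gives the estimate for $|R_b|$; the decay $n(b)^{1-p}$ is what makes the sum over bonds in (Hyp-1) controllable.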
To summarize, if we choose the $n(b)$'s large enough so that (Hyp-1) is satisfied, then the estimate (2.8) shows that $|I^{(k)}|$ is exponentially small for large $k$. For small $k$, we estimate $I^{(k)}$ differently, using the appropriate complex translation. Consider the level $k = 0$:
\[ I^{(0)} = \frac{1}{Z}\int e^{i(\theta_0 - \theta_x) + \sum_bM_b(\nabla_b\theta)}\,D\theta. \]
Since $M_b$ is analytic and periodic, we can translate the $\theta$ variables by $\theta(y) \to \theta(y) + iA(y)$. In other words, using the Cauchy theorem, we can deform the path of each individual $\theta$-integration and use the periodicity of $M_b$ to cancel the lateral contours (note that there are only finitely many $\theta$-variables enclosed in the box $\Lambda(L)$). The values of the real variables $A(y)$ will be set later. Correspondingly,
\[
M_b(\nabla_b\theta) = \sum_{m\le n(b)}c_m\cos m\nabla_b\theta
\ \to\ \sum_{m\le n(b)}c_m\cos m\nabla_b\theta\,\cosh m\nabla_bA - i\sum_{m\le n(b)}c_m\sin m\nabla_b\theta\,\sinh m\nabla_bA.
\]
Therefore
\[
\begin{aligned}
|I^{(0)}| &\le \frac{1}{Z}e^{-(A(0) - A(x))}\int\exp\Bigl\{\sum_b\sum_{m\le n(b)}c_m\cos m\nabla_b\theta\,\cosh m\nabla_bA\Bigr\}D\theta \\
&= \frac{1}{Z}e^{-(A(0) - A(x))}\int\exp\Bigl\{\sum_bM_b(\nabla_b\theta) + \sum_b\sum_{m\le n(b)}c_m\cos m\nabla_b\theta\,[\cosh m\nabla_bA - 1]\Bigr\}D\theta \\
&\le e^{-(A(0) - A(x))}\exp\Bigl\{\sum_b\sum_{m\le n(b)}|c_m|[\cosh m\nabla_bA - 1]\Bigr\}\times\Bigl\langle e^{-\sum_bR_b(\nabla_b\theta)}\Bigr\rangle.
\end{aligned}
\]
Assume that the $A$'s are chosen so that for all $b$'s
\[ |n(b)\nabla_bA| \le 1, \tag{Hyp-2} \]
hence
\[ [\cosh n(b)\nabla_bA - 1] \le \frac23(n(b)\nabla_bA)^2. \]
Then
\[ \sum_b\sum_{m\le n(b)}c_m[\cosh m\nabla_bA - 1] \le \frac23\Bigl(\sum_m m^2|c_m|\Bigr)\sum_b(\nabla_bA)^2 = \frac{\beta}{2}(A, -\Delta A), \]
where $\beta$ is defined by (1.7), $(\cdot,\cdot)$ denotes the standard inner product of $\ell^2(\mathbb{Z}^2)$ and $\Delta$ is the nearest-neighbor Laplacian on the lattice $\mathbb{Z}^2$. These together give the following upper bound for $I^{(0)}$:
\[ |I^{(0)}| \le \exp\Bigl\{-(A(0) - A(x)) + \frac{\beta}{2}(A, -\Delta A)\Bigr\}e^{\delta}. \]
We define $\tilde a$ as the minimizer of the following functional:
\[ \mathcal{F}(a) \overset{\mathrm{def.}}{=} -(a(0) - a(x)) + \frac{\beta}{2}(a, -\Delta a). \tag{2.9} \]
Let $C(x) \equiv C_L(x)$ denote the fundamental solution of Laplace's equation on the box $\Lambda(L)$:
\[ C_L(x) \overset{\mathrm{def.}}{=} (2L+1)^{-2}\sum_{\substack{k\in\Lambda^*(L)\\ k\ne 0}}\frac{\cos k\cdot x - 1}{4 - 2\cos k_1 - 2\cos k_2}, \tag{2.10} \]
where
\[ \Lambda^*(L) = \bigl\{k = (k_1, k_2) : k_i = 2\pi(2L+1)^{-1}r_i,\ r_i \text{ integers},\ |r_i| \le L\bigr\}. \]
Then one can easily see (cf. (1.3)) that
\[ \tilde a(y) = \frac{1}{\beta}[C(y) - C(y - x)] \]
minimizes the functional (2.9) and gives:
\[ \mathcal{F}(\tilde a) = -\frac12(\tilde a(0) - \tilde a(x)) = -\frac{1}{\beta}[C(0) - C(x)]. \]
We shall also need the following estimate:
\[ \beta|\tilde a(i) - \tilde a(j)| \le 4, \quad\text{whenever } \|i - j\| = 1, \tag{2.11} \]
which follows easily from the following uniform estimate when $\|i - j\| = 1$:
\[ |C_L(i) - C_L(j)| \le \frac{\pi^2}{(2L+1)^2}\sum_{\substack{k\in\Lambda^*(L)\\ k\ne 0}}\frac{|k_1|}{k_1^2 + k_2^2} < 2. \]
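The function $C_L$ of (2.10) is easy to evaluate by direct summation, which gives a quick check of the properties used above: $C_L(0) = 0$, bounded nearest-neighbor differences, and $-\Delta C_L = \delta_{0,\cdot}$ up to the constant $-(2L+1)^{-2}$ that arises on the periodic box because the $k = 0$ mode is excluded. The sketch below is illustrative only; the function names are not taken from the paper.

```python
import math

def lattice_green(L, x):
    """C_L(x) from (2.10) on the periodic (2L+1) x (2L+1) box, by direct summation."""
    N = 2 * L + 1
    total = 0.0
    for r1 in range(-L, L + 1):
        for r2 in range(-L, L + 1):
            if r1 == 0 and r2 == 0:
                continue  # the k = 0 mode is excluded from the sum
            k1 = 2 * math.pi * r1 / N
            k2 = 2 * math.pi * r2 / N
            numerator = math.cos(k1 * x[0] + k2 * x[1]) - 1.0
            total += numerator / (4 - 2 * math.cos(k1) - 2 * math.cos(k2))
    return total / N ** 2

def minus_laplacian(L, y):
    """(-Delta C_L)(y) for the nearest-neighbor lattice Laplacian."""
    nbrs = [(y[0] + 1, y[1]), (y[0] - 1, y[1]), (y[0], y[1] + 1), (y[0], y[1] - 1)]
    return 4 * lattice_green(L, y) - sum(lattice_green(L, p) for p in nbrs)

def max_neighbour_gap(L):
    """Largest |C_L(i) - C_L(j)| over nearest-neighbor pairs in the box."""
    gap = 0.0
    for y1 in range(-L, L + 1):
        for y2 in range(-L, L + 1):
            here = lattice_green(L, (y1, y2))
            for e in ((1, 0), (0, 1)):
                there = lattice_green(L, (y1 + e[0], y2 + e[1]))
                gap = max(gap, abs(there - here))
    return gap
```

By lattice symmetry the value at a nearest neighbor of the origin is $-(1 - (2L+1)^{-2})/4$, so the gaps found this way sit well inside the uniform bound quoted for (2.11).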
For reasons that will become clear shortly, we need to modify this classical choice, according to the number and location of defects. Fix a large positive constant R kxk, whose exact value will be set later. On the circumference of the R-circle centered at the origin, e a is almost constant. Let C1 (resp. C2 ) be the largest “equipotential” level curve for e a contained in the R-circle around the origin (resp. x). We define:
On Decay of Correlations
393
e a(y) e a(C1 ) a(y) = e a(C2 )
if y is outside the C1 and C2 , inside C1 , inside C2 .
(2.12)
Therefore, $a$ is different from $\tilde a$ only inside $C_1$, $C_2$, where it is defined to take a constant value, i.e. $\nabla a(y) = 0$ for y inside $C_1$ or $C_2$. Therefore
$$(a, -\Delta a) \le (\tilde a, -\Delta \tilde a). \tag{2.13}$$
On the other hand,
$$a(0) = \tilde a(0) + \delta_0(R), \qquad a(x) = \tilde a(x) + \delta_x(R),$$
where, using (2.11), we have $|\delta_0(R)|, |\delta_x(R)| \le 4R/\beta$. As a result,
$$F(a) \le F(\tilde a) + 4R/\beta.$$
The proper complex translations A at different levels will be built out of the $a$ defined at (2.12). At defect-free level $I^{(0)}$, we make the following complex translation:
$$\theta(y) \to \theta(y) + iA^{\emptyset}(y) \quad \text{where } A^{\emptyset}(y) \overset{\text{def.}}{=} a(y) \text{ (defined in (2.12))}.$$
This gives the following bound:
$$|I^{(0)}| \le e^{4R/\beta + \delta}\, e^{F(\tilde a)}.$$
Now let us consider $I^{(1)}$:
$$I^{(1)} = \sum_{b'} \frac{1}{Z} \int e^{\,i(\theta_0 - \theta_x) + \sum_b M_b(\nabla_b \theta)} \left(e^{R_{b'}(\nabla_{b'}\theta)} - 1\right) D\theta.$$
Fix $b'$ in the above sum. Then the integrand is an analytic function of all $\nabla_b \theta$ in the box $\Lambda(L)$, except for $b = b'$. Therefore, we can make a new complex translation
$$\theta(y) \to \theta(y) + iA^{\{b'\}}(y), \quad \text{where} \quad \nabla_b A^{\{b'\}} \overset{\text{def.}}{=} \begin{cases} \nabla_b a, & b \neq b',\\ 0, & b = b'. \end{cases}$$
In other words, if we have a defect at bond $b'$, we only modify $a$ by imposing the extra condition $\nabla_{b'} A = 0$. We have to pay a penalty for this modification but it will not be big since this modification is only needed when $b'$ is outside of $C_1$ and $C_2$, and there, $\nabla a$ is already very small. More precisely, outside of $C_1$ and $C_2$, $\beta \nabla_b \tilde a$ decays as the difference between the inverse of the distances between the bond b and the points 0 and x. Therefore,
$$(A^{\{b'\}}, -\Delta A^{\{b'\}}) \le (a, -\Delta a) + \frac{6}{\beta R}.$$
This gives the following upper bound for $I^{(1)}$:
$$|I^{(1)}| \le \delta\, e^{\delta + 4R/\beta + 6/(\beta R)}\, e^{F(\tilde a)}.$$
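In general, when we have k defects, located at $b_1, \dots, b_k$, we perform the following complex translation:
$$\theta(y) \to \theta(y) + iA^{\{b_1,\dots,b_k\}}(y), \quad \text{where} \quad \nabla_b A^{\{b_1,\dots,b_k\}} = \begin{cases} \nabla_b a, & \text{if } b \notin \{b_1, \dots, b_k\},\\ 0, & \text{otherwise.} \end{cases}$$
Then we have the following estimate:
$$|I^{(k)}| \le e^{4R/\beta + \delta} \left(\delta e^{6/\beta R}\right)^k e^{F(\tilde a)}. \tag{2.14}$$
Now we make our choice of n(b). To do so, let us list our two main hypotheses here:
$$\sum_b \frac{C}{n(b)^{p-1}} \le \delta < 1, \tag{Hyp-1}$$
$$|n(b)\nabla_b A| \le 1, \quad \text{for all } b \text{ and at all levels.} \tag{Hyp-2}$$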
Since at all levels, $\nabla_b A = 0$ for b in $C_1$ and $C_2$, (Hyp-2) is trivially satisfied in these two regions, independent of the choice of n(b). We choose n(b) to grow like
$$d(b) \overset{\text{def.}}{=} \beta \min\{\|b - 0\|, \|b - x\|\}$$
outside of $C_{1,2}$ (and very large inside), where $\|b - z\|$ is the smallest distance between the bond b and the point z. Then for p > 3, the hypothesis (Hyp-1) will be fulfilled if we choose R large enough. Note that this choice of R only depends on the function f and not on anything else. On the other hand, we know that $\nabla_b \tilde a$, and hence $\nabla_b A$, decays like the inverse of $\beta d(b)$ so (Hyp-2) will also be satisfied everywhere. Putting all the pieces together, we have the following estimate:
$$\begin{aligned} |I| &\le e^{4R/\beta + \delta} e^{F(\tilde a)} \left[1 + \delta e^{6/\beta R} + \cdots + \left(\delta e^{6/\beta R}\right)^k\right] + e^{\delta}\left[\delta^{k+1} + \delta^{k+2} + \cdots\right]\\ &= e^{4R/\beta + \delta} e^{F(\tilde a)}\, \frac{1 - \left(\delta e^{6/\beta R}\right)^{k+1}}{1 - \delta e^{6/\beta R}} + \frac{e^{\delta}\delta^{k+1}}{1 - \delta}. \end{aligned} \tag{2.15}$$
Here we have estimated $|I^{(0)}|, \dots, |I^{(k)}|$ by the appropriate complex translations, as explained above, and the rest of the terms $|I^{(k+1)}|, \dots$, using the estimate (2.8). Minimizing the right-hand side with respect to k, we obtain:
$$|I| \le B\, e^{F(\tilde a)} = B \exp\left[\frac{-1}{\beta'} \ln |x|\right],$$
where B is a constant, which only depends upon the function f. This completes the proof.

Acknowledgement. I would like to thank Tom Spencer, who should be considered as the co-author of this article, for his many valuable suggestions.
References
1. Glimm, J. and Jaffe, A.: Quantum Physics. Second Edition. Berlin–Heidelberg–New York: Springer-Verlag, 1987
2. McBryan, O.A. and Spencer, T.: On the decay of correlations in SO(n)-symmetric ferromagnets. Commun. Math. Phys. 53, 299–302 (1977)
3. Mermin, N.D. and Wagner, H.: Phys. Rev. Lett. 17, 1133 (1966)

Communicated by D.C. Brydges
Commun. Math. Phys. 184, 397 – 410 (1997)
Communications in Mathematical Physics © Springer-Verlag 1997
Bogomol’nyi Bounds for Two-Dimensional Lattice Systems
R. S. Ward
Department of Mathematical Sciences, University of Durham, Durham DH1 3LE, UK
Received: 29 February 1996 / Accepted: 5 August 1996
Abstract: The O(3) sigma model and abelian Higgs model in two space dimensions admit topological (Bogomol’nyi) lower bounds on their energy. This paper proposes lattice versions of these systems which maintain the Bogomol’nyi bounds. One consequence is that instantons/solitons/vortices on the lattice then have a high degree of stability.
1. Introduction

In systems where there are topological configurations (instantons, vortices, Skyrmions, monopoles, etc.) classified by an integer k, one often has a Bogomol’nyi bound E ≥ α|k|, where E is the energy (or action) of the system, and α is a universal constant. The stability of the topological objects in question is often related to the existence of such a lower bound on E.

Lattice versions of these systems have been much studied (for purposes of numerical computation, or regularization of the quantum field theory, etc.), but the Bogomol’nyi bound has generally not been preserved on the lattice. Although one can identify topological objects in the lattice systems, and the details of how to do this are well-known, these topological objects are often unstable.

Given a continuum field theory, there are many different lattice systems which reduce to it in the continuum limit. The object of this paper is to present lattice versions of two systems in which the Bogomol’nyi bound is maintained. The systems in question are the O(3) sigma model and the abelian Higgs model, both in two space dimensions. Because the Bogomol’nyi bound is satisfied, topological objects will be well-behaved even when their size is not much greater than the lattice spacing.

A similar study was undertaken a few years ago for systems in one space dimension [16, 17]. This concentrated on the sine-Gordon system, but analogous results hold for other systems (such as phi-fourth) which admit kink solutions. In these one-dimensional cases, the Bogomol’nyi bound can be attained: there exists a one-real-parameter family of static kink solutions on the lattice, with energy equal to the topological minimum.
The parameter describes the position of the kink on the one-dimensional lattice: in particular, its energy is the same irrespective of where it is in relation to the lattice (there is no Peierls-Nabarro energy barrier). These features are not present in the two-dimensional cases discussed in this paper; the energy of a topological object will be minimized only when it is located at the centre of a plaquette, and even then is greater than the topological minimum.

If one imposes radial symmetry in the plane, then in effect one obtains a one-dimensional system, and one can try to set up a lattice version of this which maintains the Bogomol’nyi bound. This was done for the O(3) sigma-model [5], and used to study the stability of single (radially-symmetric) solitons [7]. The case of semilocal vortices in an abelian Higgs model (with Higgs doublet) was dealt with in the same way [6]. But the intention in this paper is to work with the full two-dimensional systems, so that one is not restricted to radially-symmetric configurations.

One purpose of all this is to set up systems in which topological objects have a high degree of stability, even when the lattice is quite coarse. So the primary aim is not to simulate the continuum system, but rather to define an alternative, “genuine” lattice system with, it is hoped, similar properties (and more convenient to study numerically).

The two cases dealt with in this paper provide examples of two different kinds of topological soliton. In the O(3) sigma-model, we have a “texture”, in which the field wraps the whole of two-dimensional space around the target space $S^2$. By contrast, the vortex in the abelian Higgs model is a “monopole”, where the topology arises purely from the behaviour of the field at spatial infinity. (But of course the Bogomol’nyi bound on its energy involves the field throughout space.)
2. The Lattice O(3) Sigma Model

Let us begin with a brief review of the continuum O(3) sigma model in two space dimensions. The field ϕ is a unit 3-vector field on $\mathbb{R}^2$ (i.e. a smooth function from $\mathbb{R}^2$ to the unit sphere $S^2$), with the boundary condition $\varphi \to \varphi_0$ as $r \to \infty$ in $\mathbb{R}^2$. Here $\varphi_0$ is some given (fixed) point on the image sphere $S^2$. Any such field ϕ is labelled by its winding number k, which represents the number of times $\mathbb{R}^2$ is wrapped around $S^2$. The energy of ϕ is
$$E_{\text{cont}} = \int_{\mathbb{R}^2} \tfrac{1}{2}\,(\partial_j \varphi)\cdot(\partial_j \varphi)\, dx\, dy, \tag{2.1}$$
and the appropriate Bogomol’nyi argument [3] gives the bound $E_{\text{cont}} \ge 4\pi|k|$. There are fields which attain this lower bound (such minimum-energy fields will be called solitons in what follows). Since $E_{\text{cont}}$ is invariant under the scaling transformation $\varphi(x_j) \mapsto \varphi(\lambda x_j)$, these configurations are metastable rather than stable (their size is not fixed); and this absence of stability shows up when one studies the dynamics of solitons [7]. One can modify the system in order to achieve stability: for example, by adding to $E_{\text{cont}}$ a Skyrme term which prevents the soliton from becoming too localized, and a potential term which prevents it from spreading out [15]. This, of course, raises the energy significantly above the Bogomol’nyi bound.

Another way of preventing the soliton from spreading out is to put it in a box; in other words, we take the domain of ϕ to be a bounded region such as |x| ≤ L, |y| ≤ L. The boundary condition is now $\varphi = \varphi_0$ around the edge of this region. The topological classification and the Bogomol’nyi bound work just as before. This time, however, the bound cannot be attained (except in the trivial case k = 0), and all configurations are
unstable to shrinking. The addition to the energy of a term which opposes shrinking (such as a Skyrme term) will ensure stability; the size of the soliton is then determined by the balance between the expansionary effect of this term and the containing effect of the walls of the region.

Let us now replace $\mathbb{R}^2$ by a (bounded or unbounded) square lattice, where the spatial variables x and y run over integer values. The field ϕ(x, y) at the lattice site (x, y) is a unit 3-vector, as before. Instead of the partial derivative $\partial_x \varphi$, we work with the inner product $\varphi \cdot \varphi_x$, where $\varphi_x$ denotes the nearest neighbour in the positive x-direction [i.e. $\varphi_x(x, y) = \varphi(x + 1, y)$]; and similarly for $\partial_y \varphi$. Take the energy of ϕ to have the form (which couples only nearest-neighbour spins)
$$E = \sum_{x,y} \left[ f(\varphi \cdot \varphi_x) + f(\varphi \cdot \varphi_y) \right], \tag{2.2}$$
where f is a suitable function, to be described presently. The boundary condition, as before, is $\varphi = \varphi_0$ around the rim of the lattice (which might be at infinity). The function f : [−1, 1] → R is chosen to satisfy the following two conditions:

(a) The energy should be positive-definite. So we want f(ξ) > 0 for ξ < 1, and f(1) = 0 (the trivial solution $\varphi \equiv \varphi_0$ then has zero energy).

(b) The lattice energy (2.2) reduces to (2.1) in the continuum limit. There is no scale in this system (that is why, without loss of generality, the lattice spacing was taken to be unity), and “continuum limit” means $\varphi \cdot \varphi_x \to 1$ and $\varphi \cdot \varphi_y \to 1$ at all lattice sites. Using $\frac{1}{2}(\partial_x \varphi)^2 \approx 1 - \varphi \cdot \varphi_x$ reveals that we need $f'(1) = -1$ in order to get the correct continuum limit.

The most obvious function f with these two properties is $f_{\text{Heis}}(\xi) = 1 - \xi$, which gives the well-known Heisenberg model. But in this case there is no stable (or even metastable) configuration other than the trivial one $\varphi \equiv \varphi_0$. Any “topologically non-trivial” configuration will “unwind”, essentially because it is not costly enough for the directions of neighbouring spins to be very different. In order to achieve topological stability, one has to use a different f, in particular making it more expensive for neighbouring spins to be different. This aspect is discussed further below. The immediate aim, though, is to have a lattice Bogomol’nyi bound; it turns out that this also ensures a measure of topological stability. So we add a further condition.

(c) If the lattice configuration has winding number k, then E ≥ 4π|k|.

For this to make sense, the winding number has to be well-defined, which requires some restriction on the field ϕ. The idea here is to say that ϕ is continuous if the angle between nearest-neighbour spins is less than π/2 (in other words, $\varphi \cdot \varphi_x > 0$ and $\varphi \cdot \varphi_y > 0$ at all lattice sites).
For a continuous ϕ, the winding number k is well-defined, and corresponds to the number of times the lattice is wrapped around $S^2$; see Appendix A for more details. A function f which possesses all three properties (a), (b), (c) is
$$f(\xi) = \pi - 4\tan^{-1}\sqrt{\xi}\,; \tag{2.3}$$
this is proved in Appendix A. Actually, (2.3) defines f only for continuous fields, where ξ ∈ (0, 1], these being our main concern here. One may obtain an f for the full range ξ ∈ [−1, 1], and therefore for all fields ϕ, by extending the above f monotonically (for example, take the argument of the arctan to be $-\sqrt{-\xi}$ if ξ ≤ 0).
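Properties (a) and (b) of the function (2.3), and the monotone extension to ξ ≤ 0, are easy to check numerically. The sketch below is only an illustration: the function names and the finite-difference step are assumptions made here, not anything prescribed in the text.

```python
import math

def f(xi):
    """f of (2.3), extended monotonically to the full range [-1, 1]."""
    if xi > 0:
        return math.pi - 4 * math.atan(math.sqrt(xi))
    # For xi <= 0 the arctan argument is -sqrt(-xi), as suggested in the text.
    return math.pi + 4 * math.atan(math.sqrt(-xi))

def f_heis(xi):
    """Heisenberg choice, for comparison."""
    return 1 - xi

def left_slope(func, xi, eps=1e-6):
    """One-sided difference quotient, so that xi = 1 stays inside the domain."""
    return (func(xi) - func(xi - eps)) / eps

if __name__ == "__main__":
    print("f(1)  =", f(1.0))               # condition (a): zero energy at xi = 1
    print("f'(1) ~", left_slope(f, 1.0))   # condition (b): slope -1 at xi = 1
    for xi in (0.9, 0.5, 0.1, 0.0, -0.5, -1.0):
        print(xi, f(xi), f_heis(xi))
```

On (0, 1] the function (2.3) is convex with tangent line 1 − ξ at ξ = 1, so it lies above the Heisenberg choice; this is one way to read the remark that neighbouring spins must be made more expensive to misalign.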
Note that if one starts with a random configuration (for example, in a lattice-statistical study), then this configuration is unlikely to be continuous according to the definition given above. But when the field is relaxed (cooled), then a “steep” function f (such as the one described above) will rapidly drive it towards continuity. This is certainly not the case for the Heisenberg choice $f_{\text{Heis}}$; in fact, the opposite is true, as is illustrated in more detail below.

So if ϕ is a continuous field of spins on the lattice, with winding number k, then its energy E (defined by (2.2), (2.3)) satisfies E ≥ 4π|k|. Unlike in the continuum case, this lower bound cannot be attained (see Appendix A): the energy of any configuration with winding number k ≠ 0 is strictly greater than 4π|k|. On the infinite lattice $\mathbb{Z}^2$, such a configuration is unlikely to be stable. By spreading out in space, it can approach (but never reach) the continuum limit, where the energy equals 4π|k|. In the continuum system, spreading only occurs if one puts in something like a Skyrme term; but for this lattice system, no such extra term is present. The function f itself provides a spreading force, and a potential barrier against topological decay (which is caused by shrinking).

The crucial feature is the behaviour of f(ξ) as ξ → 0 (in other words, as the field ϕ approaches a discontinuity): the slope of f(ξ) tends to −∞. The corresponding force drives the field away from the potential discontinuity and topological decay. In the Heisenberg case, by contrast, the slope is −1, and this is not enough to prevent discontinuity and decay: a continuous configuration will evolve into a discontinuous one [8]. This can be illustrated by looking at the energy of a one-parameter family of configurations in which ϕ is fixed at all lattice sites except the four around a particular plaquette. Around this plaquette (cf. Fig. 3), take
$$\begin{aligned} \varphi(1) &= \tfrac{1}{\sqrt{2}}\,(\nu, \nu, -2\eta),\\ \varphi(2) &= \tfrac{1}{\sqrt{2}}\,(-\nu, \nu, -2\eta),\\ \varphi(3) &= \tfrac{1}{\sqrt{2}}\,(\nu, -\nu, -2\eta),\\ \varphi(4) &= \tfrac{1}{\sqrt{2}}\,(-\nu, -\nu, -2\eta), \end{aligned}$$
where η ≥ 0 is a small parameter and $\nu = 1 - \eta^2$. Discontinuity occurs if η = 0, for then $\varphi(1) \cdot \varphi(2) = 0$, etc. The boundary value of ϕ is $\varphi_0 = (0, 0, 1)$, and $\varphi \cdot \varphi_0 > 0$ for all other lattice sites. Then
$$\left.\frac{dE}{d\eta}\right|_{\eta=0} = 8\sqrt{2}\, \lim_{\xi\to 0} \sqrt{\xi}\, f'(\xi) + M, \tag{2.4}$$
where M is a positive constant (depending on the values of ϕ at the eight lattice sites linked to the four variable ones). For stability, we want (2.4) to be negative, so that the field flows away from the danger point η = 0. In the Heisenberg case, the opposite will clearly be the case, since $f'(0)$ is finite. One needs $f'(\xi) \to -\infty$ at least as fast as $1/\sqrt{\xi}$, and the function (2.3) has this property.

With our choice of f, a soliton does not shrink; but on an infinite lattice, it will spread out indefinitely. In order to stabilize its size, we can (as in the continuum case) either confine it to a lattice of bounded extent, or introduce a potential term into the expression for the energy. More details of the latter possibility are given in [17]. In that paper, the choice f(ξ) = − log ξ was made. This function is greater than (2.3), and so the
Bogomol’nyi bound is still valid; in addition, the energy tends to infinity as ϕ approaches discontinuity, so there is a higher degree of stability.

In order to stay as close to the Bogomol’nyi bound as possible, let us use the “minimal” choice (2.3), and avoid an extra term in the energy. Rather, we restrict to a finite lattice 1 ≤ x ≤ n, 1 ≤ y ≤ n. If n ≥ 6, then there exist continuous configurations with winding number k = 1. Starting with such configurations, the energy E was minimized numerically in order to find the corresponding solitons, for a range of values of n (from 6 to 82). The energy E(n) of the soliton depends on n, and is greater than 4π, with (as one would expect) E(n) → 4π as n → ∞. More precisely, the numerical result is $E(n) \approx 4\pi(1 + 2.1/n + 4/n^2)$ for n > 20.

Since the field maps each lattice site to a point on the unit sphere $S^2$, one way of visualizing it is to think of a square net made of “elastic” fibres, each with natural length zero, wrapped around $S^2$. The boundary of the net is gathered together at the single point $\varphi_0$. Topological decay occurs if a plaquette of the net becomes a hemisphere – the net then slips off and collapses to the single point $\varphi_0$. If the fibres obeyed Hooke’s law, then the net would indeed slip off in this way: the restoring force in the fibres would not be strong enough to prevent them stretching too much. A “Heisenberg” net is even worse – the restoring force is weaker. The potential energy stored in a fibre of length d is f(cos d); with f as in (2.3), this is larger than the Hooke’s law value of $\frac{1}{2}d^2$, and large enough to stabilize the net. The total energy E is minimized (locally) by a non-trivial configuration, and the corresponding net is depicted in Fig. 1, for n = 12. (The picture should, strictly speaking, have the vertices joined by segments of great circle, rather than by straight lines as shown.)
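The comparisons made in this section between the three candidate functions can be tabulated directly. The following sketch is an illustration under stated assumptions: the helper names are invented here, the derivatives are written out by hand, and the small value of ξ used to probe the limit in (2.4) is arbitrary. It evaluates the plaquette contribution $8\sqrt{2}\sqrt{\xi} f'(\xi)$ of (2.4) near ξ = 0, and checks that − log ξ lies above (2.3) and that f(cos d) lies above the Hooke value $\frac{1}{2}d^2$.

```python
import math

def f_min(xi):
    """The "minimal" choice (2.3), for xi in (0, 1]."""
    return math.pi - 4 * math.atan(math.sqrt(xi))

def f_log(xi):
    """The choice f(xi) = -log(xi) mentioned in the text."""
    return -math.log(xi)

# Hand-computed derivatives of the three candidate functions.
def d_f_min(xi):
    return -2.0 / (math.sqrt(xi) * (1.0 + xi))

def d_f_log(xi):
    return -1.0 / xi

def d_f_heis(xi):
    return -1.0

def plaquette_term(df, xi):
    """8 sqrt(2) sqrt(xi) f'(xi): the first term of (2.4) before the limit."""
    return 8 * math.sqrt(2) * math.sqrt(xi) * df(xi)

if __name__ == "__main__":
    xi = 1e-12  # arbitrary small value standing in for the limit xi -> 0
    print("(2.3)     :", plaquette_term(d_f_min, xi))
    print("Heisenberg:", plaquette_term(d_f_heis, xi))
    print("-log      :", plaquette_term(d_f_log, xi))
    for d in (0.25, 0.5, 1.0, 1.5):
        print(d, f_min(math.cos(d)), 0.5 * d * d)
```

For (2.3) the limit is finite and negative, so the sign of (2.4) then depends on the size of M; for the Heisenberg choice the plaquette term vanishes and (2.4) reduces to M > 0.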
3. A Lattice Abelian Higgs System

As in Sect. 2, we begin with a brief review of the continuum case. The fields consist of a U(1) gauge potential $A_j$, and a complex-valued scalar field ψ, both smooth on $\mathbb{R}^2$. The covariant derivative of ψ is $D_j \psi = \partial_j \psi - iA_j \psi$, and the magnetic field strength is $B = \varepsilon_{jk}\partial_j A_k = \partial_x A_y - \partial_y A_x$. The energy of the field $(A_j, \psi)$ is
$$E_{\text{cont}} = \int_{\mathbb{R}^2} \left[ \tfrac{1}{2}|D_x \psi|^2 + \tfrac{1}{2}|D_y \psi|^2 + \tfrac{1}{2}B^2 + \tfrac{1}{8}(|\psi|^2 - 1)^2 \right] dx\, dy, \tag{3.1}$$
and the boundary conditions are |ψ| → 1, $D_j \psi \to 0$ (which imply B → 0) as x, y → ∞ in $\mathbb{R}^2$. All units have been fixed, and the factor of $\frac{1}{8}$ in the potential term means that we are restricting to the case of critical coupling.

The boundary conditions lead to a topological classification as follows. If C is a large circle $r = r_0 \gg 1$ in $\mathbb{R}^2$, then |ψ| ≈ 1 on C; so $\psi|_C$ is effectively a mapping from C to the unit circle in the complex plane, and it has a winding number k. Secondly, the boundary condition $D_j \psi \to 0$ implies (via Stokes’s theorem) that the total magnetic flux $\Phi = \int B\, dx\, dy$ satisfies
$$\Phi = 2\pi k. \tag{3.2}$$
Finally, the Bogomol’nyi argument [1] implies that $E \ge \frac{1}{2}|\Phi|$. Combining this with (3.2) gives E ≥ π|k|.
Fig. 1. Two views of a 1-soliton of the lattice sigma model, depicted as a 12 × 12 net wrapped around $S^2$
This lower bound on E is attained for static (multi-)vortex configurations. The vortex is stable; in particular, its size is fixed, and is (a few times) unity. So this system (unlike the sigma model) has a scale, and the lattice version will consequently contain a parameter h denoting the lattice spacing.

The most natural way to set up the system on a lattice is to use lattice gauge theory. The variables x and y now run over integer multiples of h, and label the lattice sites. The complex scalar field ψ is defined at each lattice site. We want the system to be invariant under gauge transformations $\psi \mapsto \hat\psi = \Lambda\psi$, where Λ is a function from the lattice to U(1) (a phase at each lattice site). This is achieved in the standard way, in which a phase is associated with each link of the lattice. So on the link joining (x, y) to (x + h, y), we have a phase $U^x(x, y) \in U(1)$; and similarly $U^y(x, y)$ lives on the link from (x, y) to (x, y + h). The covariant derivative of ψ is represented by the gauged forward difference
$$\begin{aligned} D_x \psi &= h^{-1}(\psi_x - U^x \psi),\\ D_y \psi &= h^{-1}(\psi_y - U^y \psi), \end{aligned}$$
where subscripts denote forward shifts on the lattice: $\psi_x(x, y) = \psi(x + h, y)$, etc. Given that $U^x$ transforms under a gauge transformation as $U^x \mapsto \hat U^x = \Lambda_x \Lambda^{-1} U^x$, the covariant derivative transforms as $D_x \psi \mapsto \Lambda_x D_x \psi$. The magnetic field B is defined in terms of the gauge-invariant product
$$\exp(iB) = U^x\, U^y_x\, (U^x_y)^{-1}\, (U^y)^{-1} \tag{3.3}$$
of four phases around a plaquette (in an anticlockwise sense).

Many lattice gauge theory studies of vortices have been made. Usually, the $\int \frac{1}{2}B^2$ term in (3.1) is replaced by the Wilson action $\sum h^{-2}(1 - \cos B)$ on the lattice; the Bogomol’nyi bound is then no longer valid (cf. Appendix B). For example, studies of vortex scattering were carried out in this way [11, 12]. In those cases, the lattice parameter h was taken to be relatively small (typically h = 0.15, about $\frac{1}{20}$ the size of a vortex), since the authors were modelling the continuum system; consequently, problems of vortex instability did not arise. By contrast, in cases where h is larger, vortices become unstable and can disappear: see, for example, ref. [4], where h is taken to be $\sqrt{2}$ or 1.

In order to have a well-defined topological charge, we need the quantity B defined by (3.3) to be unambiguous [9, 13, 14]. Accordingly, we require that for each plaquette, the product of the U-factors around it should not equal −1, and then take B to lie in the range (−π, π). A gauge field with this property is said to be continuous (cf. [9, 13, 14]). Note that in the continuum limit h → 0, we have B → 0, provided that the continuum gauge potential is continuous in the usual sense.

Let us now derive the analogue of (3.2) for the lattice case. For convenience, we use a finite lattice: x and y range from −L to L, where $L \gg 1$ (bearing in mind that the size of a vortex is of order unity). The boundary condition on the field is taken to be analogous to the continuum case, namely:
• |ψ| = 1 for all vertices on the boundary;
• $D_x \psi = 0$ for all x-links on the boundary, i.e. if y = ±L;
• $D_y \psi = 0$ for all y-links on the boundary, i.e. if x = ±L.
Define the total magnetic flux Φ to be the sum of B over all plaquettes, namely $\Phi = \sum_{x,y} B$, where x and y range, in integer multiples of h, from −L to L − h.
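The gauge covariance of the forward difference, the gauge invariance of the plaquette product (3.3), and the fact that exp(iΦ) depends only on the boundary links can all be verified on a small random configuration. The sketch below is illustrative only: the array layout, the function names and the use of random phases are assumptions made here, and no boundary condition is imposed, since none is needed for these three identities.

```python
import cmath
import math
import random

def random_phase(rng):
    return cmath.exp(1j * rng.uniform(-math.pi, math.pi))

def make_fields(n, rng):
    """Random psi on n x n sites; Ux[i][j] lives on the link (i,j)->(i+1,j),
    Uy[i][j] on the link (i,j)->(i,j+1)."""
    psi = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
           for _ in range(n)]
    Ux = [[random_phase(rng) for _ in range(n)] for _ in range(n - 1)]
    Uy = [[random_phase(rng) for _ in range(n - 1)] for _ in range(n)]
    return psi, Ux, Uy

def random_gauge(n, rng):
    return [[random_phase(rng) for _ in range(n)] for _ in range(n)]

def plaquette(Ux, Uy, i, j):
    """exp(iB) of (3.3): anticlockwise product around the plaquette at (i, j)."""
    return Ux[i][j] * Uy[i + 1][j] / Ux[i][j + 1] / Uy[i][j]

def total_flux(Ux, Uy):
    """Phi = sum of B over all plaquettes, each B taken as the principal phase."""
    n = len(Uy)
    return sum(cmath.phase(plaquette(Ux, Uy, i, j))
               for i in range(n - 1) for j in range(n - 1))

def boundary_product(Ux, Uy):
    """Product of link phases around the rim of the lattice, anticlockwise."""
    n = len(Uy)
    prod = 1 + 0j
    for i in range(n - 1):
        prod *= Ux[i][0]          # bottom edge, left to right
    for j in range(n - 1):
        prod *= Uy[n - 1][j]      # right edge, upwards
    for i in range(n - 1):
        prod /= Ux[i][n - 1]      # top edge, traversed backwards
    for j in range(n - 1):
        prod /= Uy[0][j]          # left edge, traversed backwards
    return prod

def Dx(psi, Ux, i, j, h):
    """Gauged forward difference in the x-direction."""
    return (psi[i + 1][j] - Ux[i][j] * psi[i][j]) / h

def gauge_transform(psi, Ux, Uy, lam):
    n = len(psi)
    psi2 = [[lam[i][j] * psi[i][j] for j in range(n)] for i in range(n)]
    Ux2 = [[lam[i + 1][j] / lam[i][j] * Ux[i][j] for j in range(n)]
           for i in range(n - 1)]
    Uy2 = [[lam[i][j + 1] / lam[i][j] * Uy[i][j] for j in range(n - 1)]
           for i in range(n)]
    return psi2, Ux2, Uy2

if __name__ == "__main__":
    rng = random.Random(0)
    psi, Ux, Uy = make_fields(5, rng)
    print(cmath.exp(1j * total_flux(Ux, Uy)), boundary_product(Ux, Uy))
```

The cancellation of interior links happens at the level of the phases exp(iB), which is why the comparison is made between exp(iΦ) and the boundary product rather than between Φ and a sum of link angles.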
So exp(iΦ) is a product of U -factors, and it is clear that for any two adjacent plaquettes, the U factors associated with their common link cancel. Consequently, exp(iΦ) is a boundary expression
$$\exp(i\Phi) = \prod_{x=-L}^{L-h} U^x\big|_{y=-L}\ \prod_{y=-L}^{L-h} U^y\big|_{x=L}\ \prod_{x=-L}^{L-h} (U^x)^{-1}\big|_{y=L}\ \prod_{y=-L}^{L-h} (U^y)^{-1}\big|_{x=-L}. \tag{3.4}$$
Now for an x-link on the boundary, we have $0 = hD_x \psi = \psi_x - U^x \psi$, and hence $U^x = \psi_x/\psi$; similarly for $U^y$. So in fact the right-hand side of (3.4) equals 1, by virtue of the single-valuedness of ψ. Therefore Φ = 2πk for some integer k, which we define to be the winding number of the field. If the values of ψ at all pairs of neighbouring lattice sites on the boundary are not antipodal (i.e. $\psi_x \neq -\psi$ or $\psi_y \neq -\psi$), then the winding number of ψ can be defined more directly, and is equal to k. Namely, one adds up the angles $-i\log(\psi_x/\psi)$, as one traverses the boundary in an anticlockwise direction.

The next task is to devise an expression for the energy E of the lattice system which reduces to (3.1) in the continuum limit, and which satisfies the Bogomol’nyi bound $E \ge \frac{1}{2}|\Phi|$. In Appendix B, it is proved that the following expression has those properties:
$$E = h^2 \sum_{x,y=-L}^{L-h} \left[ \tfrac{1}{2}|D_x \psi|^2 + \tfrac{1}{2}|D_y \psi|^2 + \tfrac{1}{2}h^{-4} g(h) B^2 + \tfrac{1}{8}|\Psi^2 - 1|^2 \right], \tag{3.5}$$
where $g(h) = \sqrt{1 + h^4/4}$ and $\Psi^2 = U^x \psi_x (U^y)^{-1} \psi_y$. One’s first guess might have been the expression (3.5) with $\Psi^2$ replaced by $|\psi|^2$; and with $\sum \frac{1}{2}h^{-2} g(h) B^2$ replaced by the Wilson action mentioned previously, or by the Manton [10] action $\sum \frac{1}{2}h^{-2} B^2$. Studies such as [4, 11, 12] used the Wilson form of this first guess. The given formula represents a slight modification of that (a modification which vanishes in the continuum limit). Note that $\Psi^2$ is gauge-invariant, as indeed is E.

The conclusion, then, is that if the gauge-Higgs system satisfies the stated boundary conditions, and the gauge field is continuous, then the energy E is bounded below by π|k|, where the integer k is the winding number of the system. This statement remains true if L → ∞, i.e. for a lattice of infinite extent. In practice, vortices are exponentially localized with unit size, and so the parameter L is effectively irrelevant as long as $L \gg 1$. The system depends only on the dimensionless parameter h, with h → 0 being the continuum limit.

Minimum-energy configurations (vortices) with k = 1 were found numerically, for a number of values of h in the range from 0.5 to 1. This involved minimizing the function E (using a conjugate-gradient method), after fixing the gauge – the “radial” gauge $U^x = \exp(-iy\gamma)$, $U^y = \exp(ix\gamma)$ was used, with γ(x, y) being a real-valued function on the lattice. The variables x and y were taken to range over odd-integer multiples of h/2, with the vortex being “located” at x = y = 0 (which is at the centre of a plaquette). The energy density, in the case h = 1, is depicted in Fig. 2. The expression (3.5) for E is a sum over plaquettes; the lattice function plotted in Fig. 2 is the average, over the four plaquettes containing the lattice site (x, y), of the summand e(x, y) of (3.5). Presenting the picture in this way illustrates the fact that the vortex is located near the centre of a particular plaquette (the Higgs field winds once around that plaquette).
Also apparent is an $(x, y) \mapsto (-x, -y)$ asymmetry: this arises because we are using forward (as opposed to backward) differences on the lattice. One would get a more symmetric picture if one replaced (3.5) by its average with the corresponding backward-difference expression; the Bogomol’nyi bound would remain valid if one did so. The energy E of this lattice vortex depends on h. To within numerical uncertainties, it has the quadratic form
Fig. 2. The energy density of the 1-vortex, with lattice parameter h = 1
$E = \pi[1 + 0.070h^2 + 0.018h^4]$ for 0.5 ≤ h ≤ 1. The Bogomol’nyi bound E ≥ π is not attained, but of course E(h) → π in the continuum limit h → 0.

The space of all gauge-Higgs fields on the lattice is connected, and so one does not have absolute topological stability. For example, one could start with a k = 1 vortex and deform it (so that the value of B on the central plaquette increased towards 2π), ending up at a configuration which was gauge-equivalent to the trivial field, with zero energy. Along this path in configuration space, the central B would have to pass through the disallowed value π. In fact, the space of continuous fields (where B never equals π) is disconnected, and its components are labelled by the winding number k. So the question of stability of the lattice vortex amounts to asking whether B gets driven towards a discontinuity. This will indeed happen if h becomes too large. In the h = 1 example illustrated above, the maximum value of B (namely B for the central plaquette) is about 0.17π. But as h increases, the central plaquette grows to encompass more and more of the vortex, and eventually B reaches the forbidden value π. By contrast, if h is not too large, then it costs energy to increase B to a point of discontinuity, and the vortex is (relatively) stable. The point to be emphasized is that the Bogomol’nyi lattice system introduced above allows h to be larger than the “naive” lattice version.
4. Concluding Remarks

We have seen examples of ways of maintaining useful topological features for systems on the two-dimensional lattice. It seems likely that one could do something similar in three dimensions, and obtain lattice Bogomol’nyi versions of (say) the Skyrme model and the Yang-Mills-Higgs (monopole) system.
If one is close to the continuum limit, in the sense that the lattice spacing is small compared to the size of the topological solitons, then there may not be much difference between various lattice versions of the continuum system. An advantage of the versions described in this paper is that the lattice spacing can be relatively large, without compromising the stability of the solitons. This should lead to different results in lattice-statistical studies such as those of [4, 8]. In the context of classical soliton dynamics, one may study the interactions of topological solitons on relatively coarse lattices, a computationally simpler task than numerical simulation of the corresponding continuum systems. There are many different ways of introducing time dependence. If one uses equations which are second-order in time, and compares with relativistic dynamics such as the vortex-scattering studies of [11, 12], then an important difference is that moving solitons will radiate energy and gradually slow down. In [11, 12] this effect was negligible, because the lattice spacing h was small. One expects it to increase with h, but the relevant time-scale may still be large compared to that of the scattering process. For example, this is the case in the topological lattice sine-Gordon system [16]. In that system, however, there is no Peierls-Nabarro energy barrier, as mentioned previously. By contrast, there will be such a barrier in the two-dimensional systems of this paper, which will have some effect. Detailed dynamical studies are needed to provide more precise answers.
Appendix A

This appendix describes how the winding number k of a continuous lattice O(3) field is defined, and establishes the Bogomol’nyi bound E ≥ 4π|k| on the energy (2.2, 2.3) of such a field.
Fig. 3. The image of a plaquette in the lattice sigma model
The definition of the winding number is that of [2], namely as follows. Let p be a plaquette of the lattice, made up of two triangles $p_a$ and $p_b$ (see Fig. 3). The image of $p_a$ is a spherical triangle on $S^2$; let $A(p_a)$ denote the signed area of this spherical triangle. In other words, $A(p_a)$ is positive if the sequence ϕ(1), ϕ(2), ϕ(3) goes round anticlockwise as depicted in Fig. 3, and negative if the sequence is clockwise. Let $A(p_b)$ be defined similarly. Then the winding number k is defined by adding up these signed areas and dividing by the area of the sphere, i.e.
$$k = \frac{1}{4\pi} \sum_p \left[ A(p_a) + A(p_b) \right].$$
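The definition above, together with the energy (2.2), (2.3), can be tried out on a simple trial configuration. The sketch below makes several assumptions that are not in the text: the trial field is an inverse stereographic projection with an arbitrary width `lam`, clamped to $\varphi_0 = (0, 0, 1)$ on the rim (it is not an energy minimizer); each plaquette is split along one fixed diagonal; and the signed area is computed from the standard triple-product formula $\tan(\Omega/2) = a\cdot(b\times c)/(1 + a\cdot b + b\cdot c + c\cdot a)$ rather than from the angles of Fig. 3.

```python
import math

def f(xi):
    """f of (2.3); the clamp only guards against rounding slightly above 1."""
    return math.pi - 4 * math.atan(math.sqrt(min(1.0, xi)))

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def triple(a, b, c):
    """Scalar triple product a . (b x c)."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            + a[1] * (b[2] * c[0] - b[0] * c[2])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def signed_area(a, b, c):
    """Signed area of the spherical triangle with unit-vector vertices a, b, c."""
    return 2 * math.atan2(triple(a, b, c),
                          1 + dot(a, b) + dot(b, c) + dot(c, a))

def trial_soliton(n, lam):
    """Hypothetical trial field on 1 <= x, y <= n with phi = phi_0 on the rim."""
    c = (n + 1) / 2
    field = {}
    for x in range(1, n + 1):
        for y in range(1, n + 1):
            if x in (1, n) or y in (1, n):
                field[x, y] = (0.0, 0.0, 1.0)
            else:
                u, v = (x - c) / lam, (y - c) / lam
                r2 = u * u + v * v
                field[x, y] = (2 * u / (1 + r2), -2 * v / (1 + r2),
                               (r2 - 1) / (1 + r2))
    return field

def links(n):
    for x in range(1, n + 1):
        for y in range(1, n + 1):
            if x < n:
                yield (x, y), (x + 1, y)
            if y < n:
                yield (x, y), (x, y + 1)

def is_continuous(field, n):
    """Continuity in the sense of the text: all neighbouring spins at acute angles."""
    return all(dot(field[s], field[t]) > 0 for s, t in links(n))

def energy(field, n):
    """Lattice energy (2.2) with f from (2.3)."""
    return sum(f(dot(field[s], field[t])) for s, t in links(n))

def winding_number(field, n):
    total = 0.0
    for x in range(1, n):
        for y in range(1, n):
            p00, p10 = field[x, y], field[x + 1, y]
            p11, p01 = field[x + 1, y + 1], field[x, y + 1]
            total += signed_area(p00, p10, p11) + signed_area(p00, p11, p01)
    return total / (4 * math.pi)

if __name__ == "__main__":
    n = 12
    field = trial_soliton(n, lam=2.0)
    print("continuous:", is_continuous(field, n))
    print("winding number:", winding_number(field, n))
    print("E / (4 pi):", energy(field, n) / (4 * math.pi))
```

Because the rim is collapsed to a single point, the triangulated lattice is topologically a sphere, and the signed areas sum to an integer multiple of 4π up to rounding.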
The covering of the planar lattice by triangles gives a |k|-fold covering of $S^2$ by the corresponding spherical triangles; in other words, k is an integer, and |k| is the number of times the lattice is wrapped around the image sphere. This definition of k breaks down if there is a triangle ($p_a$, say) for which the sign of $A(p_a)$ is ambiguous. This would happen if the three points ϕ(1), ϕ(2), ϕ(3) lay on a great circle (so that the spherical triangle became a hemisphere); in [2] such “exceptional configurations” are excluded. In this paper, a stronger exclusion is used: we restrict to continuous fields ϕ, meaning that for each link ⟨1−2⟩, the angle between ϕ(1) and ϕ(2) is acute (so the segment of great circle joining ϕ(1) to ϕ(2) has length less than π/2). For a continuous field, the ambiguity described above cannot occur, and so the winding number k of such a field is well-defined.

Let us now apply some spherical trigonometry to obtain a bound on $A(p_a)$, namely

$$A(p_a) \le |A(p_a)| \le \tfrac12 f(\cos B) + \tfrac12 f(\cos C), \tag{A1}$$

where $f(\xi) = \pi - 4\tan^{-1}\sqrt{\xi}$, and where B and C are the lengths of two of the sides of the spherical triangle. [More specifically, cos B = ϕ(1) · ϕ(2) and cos C = ϕ(1) · ϕ(3).] Let α, β and γ be the internal angles, as depicted in Fig. 3; the area $A := |A(p_a)|$ is given by A = α + β + γ − π. In order to establish (A1), we start with

$$\sin\tfrac12 A = -\cos\tfrac12(\alpha+\beta+\gamma) = \sin\tfrac12\alpha\,\sin\tfrac12(\beta+\gamma) - \cos\tfrac12\alpha\,\cos\tfrac12(\beta+\gamma).$$

Now use the identity $\tan\tfrac12(\beta+\gamma) = K\cot\tfrac12\alpha$, where

$$K = \frac{\cos\tfrac12(B-C)}{\cos\tfrac12(B+C)} > 1;$$

this leads to

$$\sin\tfrac12 A = \frac{(K-1)\cos\tfrac12\alpha}{\sqrt{1 + K^2\cot^2\tfrac12\alpha}}. \tag{A2}$$
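As a sanity check, (A2) can be compared numerically with the area of an explicitly constructed spherical triangle. The sketch below is not part of the original argument. It assumes the standard formula tan(A/2) = |v1·(v2×v3)| / (1 + v1·v2 + v2·v3 + v3·v1) for the area of a spherical triangle with unit-vector vertices, places ϕ(1) at the north pole, and uses the arbitrary values B = 0.7 and C = 0.5. It also scans α on a grid to compare the largest value of (A2) with tan(B/2) tan(C/2).

```python
import math

def sph_point(colat, azim):
    """Unit vector at colatitude `colat` and azimuth `azim`."""
    return (math.sin(colat) * math.cos(azim),
            math.sin(colat) * math.sin(azim),
            math.cos(colat))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triple(a, b, c):
    """Scalar triple product a . (b x c)."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            + a[1] * (b[2] * c[0] - b[0] * c[2])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def area(a, b, c):
    """Unsigned area of the spherical triangle with unit-vector vertices."""
    return 2.0 * math.atan2(abs(triple(a, b, c)),
                            1.0 + dot(a, b) + dot(b, c) + dot(c, a))

def sin_half_area_direct(B, C, alpha):
    """sin(A/2) from vertices: sides B, C meet at the pole with angle alpha."""
    v1 = (0.0, 0.0, 1.0)
    v2 = sph_point(B, 0.0)
    v3 = sph_point(C, alpha)
    return math.sin(0.5 * area(v1, v2, v3))

def sin_half_area_A2(B, C, alpha):
    """Right-hand side of (A2)."""
    K = math.cos(0.5 * (B - C)) / math.cos(0.5 * (B + C))
    cot = math.cos(0.5 * alpha) / math.sin(0.5 * alpha)
    return (K - 1.0) * math.cos(0.5 * alpha) / math.sqrt(1.0 + K * K * cot * cot)

B, C = 0.7, 0.5  # arbitrary side lengths below pi/2

# Pointwise comparison of the two expressions for sin(A/2).
identity_error = max(
    abs(sin_half_area_direct(B, C, al) - sin_half_area_A2(B, C, al))
    for al in (0.3, 1.1, 2.5)
)

# Grid search for the maximum of (A2) over 0 < alpha < pi.
n = 20000
max_value = max(sin_half_area_A2(B, C, math.pi * j / n) for j in range(1, n))
predicted_max = math.tan(0.5 * B) * math.tan(0.5 * C)
print(identity_error, max_value, predicted_max)
```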
For fixed K, the function (A2) has a maximum when $\cos^2\tfrac12\alpha = 1/(K+1)$; and its maximum value is $(K-1)/(K+1) = \tan(\tfrac12 B)\tan(\tfrac12 C)$. The conclusion so far, therefore, is that

$$A \le 2\sin^{-1}\bigl[\tan(\tfrac12 B)\tan(\tfrac12 C)\bigr]. \tag{A3}$$

The final step uses the following lemma (proof later):
Lemma 4.1. $2\sin^{-1}(xy) \le \sin^{-1}(x^2) + \sin^{-1}(y^2)$ for all x, y ∈ [0, 1].

So (A3) becomes

$$A \le \sin^{-1}(\tan^2\tfrac12 B) + \sin^{-1}(\tan^2\tfrac12 C);$$

and the identity $2\sin^{-1}(\tan^2\tfrac12 B) = \pi - 4\tan^{-1}\sqrt{\cos B}$ completes the proof of the inequality (A1). Summing (A1), and the corresponding inequality for $p_b$, over all plaquettes, then gives the bound E ≥ 4πk.

Proof of Lemma. The function $F(x, y) = \sin^{-1}(x^2) + \sin^{-1}(y^2) - 2\sin^{-1}(xy)$ is continuous on the square x, y ∈ [0, 1]. The gradient of F vanishes only if x = y, and F is zero on this line. So we need only check that F is non-negative on the boundary of the square, for then min F = 0. Clearly F(x, 0) and F(0, y) are non-negative. And F(x, 1) = G(x) is non-negative as well, as is F(1, y) = G(y): the function $G(x) = \pi/2 + \sin^{-1}(x^2) - 2\sin^{-1}x$ has a negative slope, and G(1) = 0.

The above derivation of a Bogomol’nyi bound also reveals how the function f arises: it is the smallest function (giving the lowest energy) for which the argument works. To see this, suppose that for a particular plaquette p, we have $\beta = \gamma = \tfrac12\alpha$ (and so B = C). Then the bound (A3) is attained, and hence $|A(p_a)| = f(\cos B)$. So for the given choice of f, the bound for a single plaquette is attained if the image of that plaquette is a spherical square. Of course, the image sphere $S^2$ cannot be tiled with spherical squares, and so the total Bogomol’nyi bound E ≥ 4πk can never be attained.

Appendix B

The aim in this appendix is to prove that the lattice abelian-Higgs energy (3.5) satisfies a Bogomol’nyi bound. The proof is analogous to that of the continuum case, and indeed reduces to it as h → 0. The starting point is to consider a single plaquette, and to study the quantity

$$E = \tfrac14|D_x\psi|^2 + \tfrac14|D_y\psi|^2 + \tfrac14|(D_x\psi)_y|^2 + \tfrac14|(D_y\psi)_x|^2 + 2h^{-4}\sin^2(\tfrac12 B) + \tfrac18|\Psi^2 - 1|^2, \tag{B1}$$

which involves the Higgs field $\psi, \psi_x, \psi_y, \psi_{xy}$ at the four vertices of the plaquette, and the gauge field $U^x, U^y, U^x_y, U^y_x$ on the four links around the plaquette.
Note that $(D_x\psi)_y$ denotes the y-shifted version of $D_x\psi$, namely $(D_x\psi)_y = h^{-1}(\psi_x - U^x\psi)_y = h^{-1}(\psi_{xy} - U^x_y\psi_y)$. Clearly E is closely related to the summand in (3.5). Note, however, that $2h^{-4}\sin^2(\tfrac12 B)$ corresponds to the Wilson action. We now observe that E can be rewritten as

$$E = \tfrac14\bigl|(U^x)^{-1}D_x\psi + i(U^y)^{-1}D_y\psi\bigr|^2 + \tfrac14\bigl|(D_x\psi)_y + i(D_y\psi)_x\bigr|^2 + \tfrac12\bigl|h^{-2}[\exp(iB) - 1] + \tfrac12 i(\Psi^2 - 1)\bigr|^2 + \tfrac14 ih^{-1}\bigl[(Y_x - Y) - (X_y - X)\bigr] + \tfrac12 h^{-2}\sin B, \tag{B2}$$

where

$$Y = \psi\,U^y\,\overline{D_y\psi} - \bar\psi\,(U^y)^{-1}D_y\psi, \qquad X = \psi\,U^x\,\overline{D_x\psi} - \bar\psi\,(U^x)^{-1}D_x\psi,$$
with $Y_x$ and $X_y$ being, respectively, the x-shifted and y-shifted versions of these. To check that (B1) and (B2) are equal is straightforward algebra. The first three terms in (B2) are non-negative. The fourth, when summed over all plaquettes, gives zero; for example,

$$\sum_{x=-L}^{L-h}(Y_x - Y) = Y|_{x=L} - Y|_{x=-L} = 0,$$

since $D_y\psi$ vanishes for x = ±L. So we deduce that

$$h^2\sum_{x,y=-L}^{L-h} E \ \ge\ \tfrac12\sum_{x,y=-L}^{L-h}\sin B. \tag{B3}$$
This “first attempt” at a Bogomol’nyi bound does not quite work: the right-hand side is not topological. One should replace sin B by B, so as to get $\tfrac12\sum B = \tfrac12\Phi$; and one should replace $2h^{-4}\sin^2(\tfrac12 B)$ on the left-hand side by something which achieves the inequality $h^2\sum E_{\rm new} \ge \tfrac12\sum B$. Let us use the modified Manton action $\tfrac12 h^{-4}g(h)B^2$ on the left-hand side, where g(h) is some function. The desired inequality will be true as long as

$$\Xi(B) := \tfrac12 h^{-2}g(h)B^2 - 2h^{-2}\sin^2(\tfrac12 B) - \tfrac12 B + \tfrac12\sin B \ \ge\ 0. \tag{B4}$$

Indeed, adding the summed version of (B4) to (B3) gives $E \ge \tfrac12\Phi$, where E is defined by (3.5). It remains only to check (B4). The second derivative

$$\Xi''(B) = h^{-2}g(h) - h^{-2}\cos B - \tfrac12\sin B = h^{-2}\Bigl[g(h) - \sqrt{1 + h^4/4}\,\sin(B + B_0)\Bigr]$$

is non-negative for all B, provided that $g(h) = \sqrt{1 + h^4/4}$. Since Ξ(0) = 0 = Ξ′(0), we then conclude that (B4) holds for all B. The given function g is not the smallest for which (B4) is true. But certainly g has to be greater than 1, with $g = 1 + O(h^4)$ for small h.

References

1. Bogomol’nyi, E. B.: The stability of classical solutions. Sov. J. Nucl. Phys. 24, 449–454 (1976)
2. Berg, B., Lüscher, M.: Definition and statistical distributions of a topological number in the lattice O(3) σ-model. Nucl. Phys. B 190, 412–424 (1981)
3. Belavin, A. A., Polyakov, A. M.: Metastable states of two-dimensional isotropic ferromagnets. JETP Lett. 22, 245–247 (1975)
4. Grunewald, S., Ilgenfritz, E.-M., Müller-Preussker, M.: Lattice vortices in the two-dimensional abelian Higgs model. Zeit. für Physik C 33, 561–568 (1987)
5. Leese, R.: Discrete Bogomolny equations for the nonlinear O(3) σ-model in (2+1) dimensions. Phys. Rev. D 40, 2004–2013 (1989)
6. Leese, R. A.: The stability of semilocal vortices at critical coupling. Phys. Rev. D 46, 4677–4684 (1992)
7. Leese, R. A., Peyrard, M., Zakrzewski, W. J.: Soliton stability in the O(3) σ-model in (2+1) dimensions. Nonlinearity 3, 387–412 (1990)
8. Lüscher, M.: Does the topological susceptibility in lattice σ models scale according to the perturbative renormalization group? Nucl. Phys. B 200, 61–70 (1982)
9. Lüscher, M.: Topology of lattice gauge fields. Commun. Math. Phys. 85, 39–48 (1982)
10. Manton, N. S.: An alternative action for lattice gauge fields. Phys. Lett. B 96, 328–330 (1980)
11. Moriarty, K. J. M., Myers, E., Rebbi, C.: Dynamical interactions of cosmic strings and flux vortices in superconductors. Phys. Lett. B 207, 411–418 (1988)
12. Myers, E., Rebbi, C., Strilka, R.: Study of the interaction and scattering of vortices in the abelian Higgs (or Ginzburg-Landau) model. Phys. Rev. D 45, 1355–1364 (1992)
13. Panagiotakopoulos, C.: Topology of 2D lattice gauge fields. Nucl. Phys. B 251, 61–76 (1985)
14. Phillips, A.: Characteristic numbers of $U_1$-valued lattice gauge fields. Ann. Phys. 161, 399–422 (1985)
15. Piette, B. M. A. G., Schroers, B. J., Zakrzewski, W. J.: Multisolitons in a two-dimensional Skyrme model. Zeit. für Physik C 65, 165–174 (1995)
16. Speight, J. M., Ward, R. S.: Kink dynamics in a novel sine-Gordon system. Nonlinearity 7, 475–484 (1994)
17. Ward, R. S.: Stable topological Skyrmions on the 2D lattice. Lett. Math. Phys. 35, 385–393 (1995)
18. Zakrzewski, W. J.: A modified discrete sine-Gordon model. Nonlinearity 8, 517–540 (1995)

Communicated by A. Jaffe
Commun. Math. Phys. 184, 411 – 441 (1997)
Communications in
Mathematical Physics © Springer-Verlag 1997
Localization of Classical Waves II: Electromagnetic Waves
Alexander Figotin 1,*, Abel Klein 2,**
1 Department of Mathematics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA. E-mail: [email protected]
2 Department of Mathematics, University of California at Irvine, Irvine, CA 92697-3875, USA. E-mail: [email protected]
Received: 1 July 1996 / Accepted: 15 August 1996
Abstract: We consider electromagnetic waves in a medium described by a position dependent dielectric constant ε(x). We assume that ε(x) is a random perturbation of a periodic function $\varepsilon_0(x)$ and that the periodic Maxwell operator $M_0 = \nabla^\times \frac{1}{\varepsilon_0(x)} \nabla^\times$ has a gap in the spectrum, where $\nabla^\times \Psi = \nabla \times \Psi$. We prove the existence of localized waves, i.e., finite energy solutions of Maxwell’s equations with the property that almost all of the wave’s energy remains in a fixed bounded region of space at all times. Localization of electromagnetic waves is a consequence of Anderson localization for the self-adjoint operators $M = \nabla^\times \frac{1}{\varepsilon(x)} \nabla^\times$. We prove that, in the random medium described by ε(x), the random operator M exhibits Anderson localization inside the gap in the spectrum of $M_0$. This is shown even in situations when the gap is totally filled by the spectrum of the random operator; we can prescribe random environments that ensure localization in almost the whole gap.

Table of Contents
1 Introduction
1.1 Maxwell’s equations and localization
1.2 Statement of results
1.3 Notation
2 Properties of Maxwell Operators
2.1 An interior estimate
2.2 A Combes-Thomas argument
2.3 Generalized eigenfunctions
2.4 Estimates on traces
3 Periodic Maxwell Operators and Periodic Boundary Condition
3.1 Periodic boundary condition
3.2 A Combes-Thomas argument for the torus
3.3 Floquet theory and the spectrum of periodic operators
4 Location of the Spectrum of Random Operators
5 Dirichlet Boundary Condition for Maxwell Operators
6 A Wegner-Type Estimate
7 Localization
7.1 The basic technical tools
7.2 The proofs of localization

* This author was supported by the U.S. Air Force Grant F49620-94-1-0172.
** This author was supported in part by the NSF Grant DMS-9500720.
1. Introduction This is the second of a series of papers on the localization of classical waves. In the first paper we discussed some general aspects of the localization of classical waves, and proved the existence of localized acoustic waves in appropriate random media [FK3]. The present paper is concerned with the localization of electromagnetic waves. This phenomenon arises from coherent multiple scattering and interference, when the scale of the coherent multiple scattering reduces to the wavelength itself. It has numerous potential applications (e.g., [DE, J1, J2, VP, JMW]), for instance, the optical transistor, which explain the recent interest in the localization of light. Although the localization of light has a lot in common with the localization of acoustic waves, the vector nature of electromagnetic waves poses additional problems for the appropriate arguments, let alone their numerical implementation. (For a discussion of the failure of standard arguments to work for classical waves see [An2].) In this paper we develop adequate tools in order to prove the localization of electromagnetic waves, in a randomly perturbed, lossless periodic dielectric medium with a gap in the spectrum. These tools include interior estimates for the intensity of the electromagnetic field components, properties of an electromagnetic analog of Dirichlet problems in finite domains, bounds on traces of the Green’s functions associated with the relevant Maxwell operators, existence of polynomially bounded generalized eigenfunctions, exponential decay of the Green’s functions of the underlying periodic medium if the frequency falls in a spectral gap, Wegner-type estimates of the density of states, and more. After all these preparations the proof of localization goes along the same guidelines as in the case of acoustic waves [FK3]. 
The multiscale analysis developed in [FK3], based on studies of Anderson localization for random Schrödinger operators [FS, FMSS, DK, Sp, CH], is extended to the case of electromagnetic waves, using the new technical tools. As far as the essence of the localization phenomenon is concerned, it remains the same. As in the case of electron waves, a strong enough single defect in a periodic dielectric medium with a spectral gap generates exponentially localized eigenmodes [FK4]. If we have a random array of such defects then, under some natural conditions, the electromagnetic wave tunneling becomes inefficient (that is the main result of the multiscale analysis) and, hence, Anderson localization of electromagnetic waves occurs in spectral gaps of the underlying periodic medium. To create an environment which would favor localization, one considers first a perfectly periodic dielectric medium (a “photonic crystal”, e.g., [JMW]), such that the associated spectrum has band gap structure; the most significant manifestation of coherent multiple scattering is the rise of a gap in the spectrum (“photonic band gaps”). If such a periodic medium with a gap in the spectrum is slightly randomized, eigenvalues with exponentially localized eigenfunctions should arise in the gap. If the disorder is increased further within some limits the localized states can fill the gap completely. This
is exactly the medium in which we study electromagnetic waves; we assume an underlying periodic dielectric medium with a gap in the spectrum. We will slightly randomize such periodic media with a gap in the spectrum and show that, under reasonable hypotheses, Anderson localization occurs in a vicinity of the edges of the gap. (The existence of periodic dielectric media exhibiting gaps in the spectrum has been proved rigorously for 2D-periodic dielectric structures [FKu1, FKu2].) We previously considered these questions and media in a lattice approximation, both for classical waves [FK2] and for Schrödinger operators [FK1]. The strategy of this paper and of [FK3] is the same one we used in [FK2]; the main differences are technical and are due to working on the continuum instead of the lattice. Acoustic waves were similarly treated in [FK3]. Localization created by (non-random) local defects was studied in [FK4].

1.1. Maxwell’s equations and localization. In a linear, lossless dielectric medium Maxwell’s equations are given by

$$\mu\frac{\partial}{\partial t}H = -\nabla\times E, \quad \nabla\cdot\mu H = 0, \qquad \varepsilon\frac{\partial}{\partial t}E = \nabla\times H, \quad \nabla\cdot\varepsilon E = 0, \tag{1}$$
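where E = E(x, t) is the electric field, H = H(x, t) is the magnetic field, ε = ε(x) is the position dependent dielectric constant, and µ = µ(x) is the magnetic permeability. We use the Giorgi system of units. The energy density $\mathcal{E}(x,t) = \mathcal{E}_{H,E}(x,t)$ and the (conserved) energy $\mathcal{E} = \mathcal{E}_{H,E}$ of a solution (H, E) of the Maxwell’s equations (1) are given by

$$\mathcal{E}(x,t) = \frac12\left\{\varepsilon(x)|E(x,t)|^2 + \mu(x)|H(x,t)|^2\right\}, \qquad \mathcal{E} = \int_{\mathbb{R}^3}\mathcal{E}(x,t)\,dx. \tag{2}$$

Maxwell’s equations can be recast as a Schrödinger-like equation (i.e., a first order conservative linear equation):

$$-i\frac{\partial}{\partial t}\Psi_t = \mathbb{M}\Psi_t, \tag{3}$$

with

$$\Psi_t = \begin{pmatrix} H_t \\ E_t \end{pmatrix} \in \mathcal{H}, \qquad \mathbb{M} = \begin{pmatrix} 0 & \frac{i}{\mu}\nabla^\times \\ -\frac{i}{\varepsilon}\nabla^\times & 0 \end{pmatrix}, \tag{4}$$

where $\mathcal{H} = S_\mu \oplus S_\varepsilon$ is the Hilbert space of finite energy solutions; for a given ϱ = ϱ(x) > 0, bounded from above and away from 0, we set $S_\varrho$ to be the closure in $L^2(\mathbb{R}^3; \mathbb{C}^3, \varrho(x)dx)$ of the linear subset of functions Ψ with $\varrho\Psi \in C_0^1(\mathbb{R}^3; \mathbb{C}^3)$, ∇·ϱΨ = 0. The matrix operator $\mathbb{M}$, where $\nabla^\times$ denotes the operator given by $\nabla^\times\Psi = \nabla\times\Psi = \operatorname{curl}\Psi$, has a natural definition as a self-adjoint operator on $\mathcal{H}$. The solution to (3) is then given by $\Psi_t = e^{it\mathbb{M}}\Psi_0$; it has energy

$$\mathcal{E} = \frac12\|\Psi_t\|_{\mathcal{H}}^2 = \frac12\|\Psi_0\|_{\mathcal{H}}^2. \tag{5}$$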
A localized electromagnetic wave can be characterized as a finite energy solution of Maxwell’s equations with the property that almost all of the wave’s energy remains in a fixed bounded region of space at all times, e.g.,
$$\lim_{R\to\infty}\,\inf_t\ \frac{1}{\mathcal{E}}\int_{|x|\le R}\mathcal{E}(x,t)\,dx = 1. \tag{6}$$
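If the operator $\mathbb{M}$ has an eigenvalue ω with eigenmode $\Psi_\omega$, i.e., $\mathbb{M}\Psi_\omega = \omega\Psi_\omega$, with $\Psi_\omega \in \mathcal{H}$, $\Psi_\omega \ne 0$, then $\Psi_{\omega,t} = e^{it\omega}\Psi_\omega$ is a localized electromagnetic wave, i.e., it satisfies (3) and (6). Notice that in this case −ω is also an eigenvalue of $\mathbb{M}$ with eigenmode $\overline{\Psi_\omega}$, so $\overline{\Psi}_{\omega,t} = e^{-it\omega}\overline{\Psi_\omega}$ is also a localized wave, since if J denotes the antiunitary involution corresponding to complex conjugation on $\mathcal{H}$, i.e., $J\Psi = \overline{\Psi}$, we have $J\mathbb{M}J = -\mathbb{M}$. It also follows that the spectrum of $\mathbb{M}$ is symmetric, i.e., $\sigma(\mathbb{M}) = -\sigma(\mathbb{M})$, with $J\mathbb{M}_+J = \mathbb{M}_-$, $\mathbb{M}_\pm$ being the positive and negative parts of $\mathbb{M}$. In addition, linear combinations of eigenmodes of $\mathbb{M}$ give rise to localized electromagnetic waves.

If $\Psi_t$ is a solution of Eq. (3), it must satisfy the second order equation $\frac{\partial^2}{\partial t^2}\Psi_t = -\mathbb{M}^2\Psi_t$, so the magnetic and electric fields satisfy the second order equations

$$\frac{\partial^2}{\partial t^2}H_t = -\frac1\mu\nabla^\times\frac1\varepsilon\nabla^\times H_t, \quad H_t \in S_\mu, \tag{7}$$

$$\frac{\partial^2}{\partial t^2}E_t = -\frac1\varepsilon\nabla^\times\frac1\mu\nabla^\times E_t, \quad E_t \in S_\varepsilon. \tag{8}$$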
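The Maxwell operators $M_H = \frac1\mu\nabla^\times\frac1\varepsilon\nabla^\times$ and $M_E = \frac1\varepsilon\nabla^\times\frac1\mu\nabla^\times$ have natural definitions as nonnegative self-adjoint operators on $S_\mu$ and $S_\varepsilon$, respectively. The two Maxwell operators are unitarily equivalent, more precisely

$$M_E = U M_H U^*, \tag{9}$$

where $U : S_\mu \to S_\varepsilon$ is the unitary operator given by

$$UH = -\frac{i}{\varepsilon}\nabla^\times M_H^{-\frac12}H, \quad H \in \operatorname{Ran} M_H^{\frac12}. \tag{10}$$

Thus $\sigma(\mathbb{M}) = \sigma(M_H^{\frac12}) \cup [-\sigma(M_H^{\frac12})]$. We obtain solutions of (3) by setting

$$\Psi_{\pm,t} = \left(e^{\pm itM_H^{1/2}}H_0,\ \pm U e^{\pm itM_H^{1/2}}H_0\right), \quad H_0 \in S_\mu. \tag{11}$$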
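Conversely, any solution of (3) can be written as a linear combination of at most four solutions of this form. It follows that to find all eigenvalues and eigenmodes for $\mathbb{M}$, it is necessary and sufficient to find all eigenvalues and eigenmodes for $M_H$. For if $M_H H_{\omega^2} = \omega^2 H_{\omega^2}$, with ω > 0, $H_{\omega^2} \in S_\mu$, $H_{\omega^2} \ne 0$, we have

$$U H_{\omega^2} = -\frac{i}{\omega\varepsilon}\nabla^\times H_{\omega^2} \tag{12}$$

and

$$\mathbb{M}\left(H_{\omega^2},\ \pm\frac{-i}{\omega\varepsilon}\nabla^\times H_{\omega^2}\right) = \pm\omega\left(H_{\omega^2},\ \pm\frac{-i}{\omega\varepsilon}\nabla^\times H_{\omega^2}\right). \tag{13}$$

Conversely, if $\mathbb{M}(H_{\pm\omega}, E_{\pm\omega}) = \pm\omega(H_{\pm\omega}, E_{\pm\omega})$, with ω > 0, $(H_{\pm\omega}, E_{\pm\omega}) \in \mathcal{H}$, not 0, it follows that $M_H H_{\pm\omega} = \omega^2 H_{\pm\omega}$ and $E_{\pm\omega} = \pm U H_{\pm\omega} = \pm\frac{-i}{\omega\varepsilon}\nabla^\times H_{\pm\omega}$.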
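The algebra behind (12) and (13) can be checked formally on a plane wave in a homogeneous medium. The sketch below is only an illustration under stated assumptions: ε and µ are taken constant (the values 2 and 1 are arbitrary), the field is H(x) = h exp(ik·x) with k·h = 0, and the curl then acts on the amplitude as h ↦ i k × h. A plane wave has infinite energy, so it is not an element of the Hilbert space of finite energy solutions; the check concerns the eigenvalue relations only, and the helper names are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors (entries may be complex)."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def curl(k, amp):
    """Amplitude of curl(amp * exp(i k.x)), i.e. i k x amp."""
    return [1j * c for c in cross(k, amp)]

def apply_M(k, H, E, eps, mu):
    """Matrix operator of (4) acting on plane-wave amplitudes (H, E)."""
    new_H = [(1j / mu) * c for c in curl(k, E)]
    new_E = [(-1j / eps) * c for c in curl(k, H)]
    return new_H, new_E

def max_diff(u, v):
    return max(abs(x - y) for x, y in zip(u, v))

eps, mu = 2.0, 1.0          # constant medium (assumption)
k = (0.0, 0.0, 3.0)         # wave vector
H = [1.0, 0.0, 0.0]         # transverse amplitude: k . H = 0
omega = math.sqrt(sum(c * c for c in k) / (eps * mu))

# Eigenvalue relation (13) for both signs, with E given by (12).
residuals = {}
for sign in (+1, -1):
    E = [sign * (-1j / (omega * eps)) * c for c in curl(k, H)]
    MH, ME = apply_M(k, H, E, eps, mu)
    residuals[sign] = max(max_diff(MH, [sign * omega * h for h in H]),
                          max_diff(ME, [sign * omega * e for e in E]))

# Second order operator of (7): (1/mu) curl (1/eps) curl H = omega^2 H.
inner = [c / eps for c in curl(k, H)]
MH_H = [c / mu for c in curl(k, inner)]
second_order_residual = max_diff(MH_H, [omega ** 2 * h for h in H])
print(residuals, second_order_residual)
```

Both signs in (13) are covered by the loop, and the last lines confirm that the same amplitude satisfies the second order relation with eigenvalue ω².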
Our strategy for proving the existence of localized electromagnetic waves is the following: first the operator $M_H$ is shown to have pure point spectrum in some closed interval I ⊂ (0, ∞), with all the corresponding eigenfunctions being exponentially decaying (in the sense of having exponentially decaying local $L^2$-norms). For this operator we prove that the curl of an exponentially decaying eigenfunction is also exponentially decaying, so it follows from (9) and (12) that the operator $M_E$ also has pure point spectrum in the closed interval I, with all the corresponding eigenfunctions being exponentially decaying. In addition, it ensues from (13) that the operator $\mathbb{M}$ has pure point spectrum in $\{\omega \in \mathbb{R};\ \omega^2 \in I\}$, with all the corresponding eigenfunctions being exponentially decaying, so the energy densities of the corresponding solutions of (3) are also exponentially decaying, uniformly in the time t. If $\chi_I(M_H)$ is the corresponding spectral projection, then any solution of (3) given by (11), with $H_0$ in the range of $\chi_I(M_H)$, satisfies (6). The localization of electromagnetic waves is thus a consequence of Anderson localization for operators $M_H = \frac1\mu\nabla^\times\frac1\varepsilon\nabla^\times$ on $S_\mu$, i.e., the existence of closed intervals where these operators have pure point spectrum with exponentially decaying eigenfunctions.

1.2. Statement of results. In this article we study electromagnetic waves in a linear, lossless dielectric medium described by a position dependent dielectric constant ε = ε(x). For most dielectric materials of interest, the magnetic permeability µ(x) is close to one (e.g., [JMW]), so we set µ(x) ≡ 1. We always assume that ε(x) is a measurable real valued function satisfying

$$0 < \varepsilon_- \le \varepsilon(x) \le \varepsilon_+ < \infty \ \text{ a.e. for some constants } \varepsilon_- \text{ and } \varepsilon_+. \tag{14}$$

Such general conditions on ε(x), particularly the lack of smoothness, are required on physical grounds. In practice only a few materials are used in the fabrication of periodic and disordered media, in which case ε(x) takes just a finite number of values, so ε(x) is piecewise constant, hence discontinuous. The abrupt changes in the medium produce discontinuities in ε(x), which favor and enhance multiscattering and, hence, localization. In such a medium electromagnetic waves are described by the formally self-adjoint Maxwell operator

$$M = M(\varepsilon) = M_H = \nabla^\times\frac1\varepsilon\nabla^\times, \tag{15}$$

acting on the Hilbert space

$$S = \overline{\{\Psi \in L^2(\mathbb{R}^3; \mathbb{C}^3);\ \Psi \in C_0^1(\mathbb{R}^3; \mathbb{C}^3) \text{ with } \nabla\cdot\Psi = 0\}}. \tag{16}$$

For the rigorous definition, we start by defining the unrestricted Maxwell operator

$$\mathbf{M} = \mathbf{M}(\varepsilon) = \nabla^\times\frac1\varepsilon\nabla^\times, \tag{17}$$

as the nonnegative self-adjoint operator on $L^2(\mathbb{R}^3; \mathbb{C}^3)$, uniquely defined by the nonnegative quadratic form given as the closure of

$$\mathcal{M}(\Psi, \Phi) = \Bigl\langle\nabla\times\Psi,\ \frac1\varepsilon\nabla\times\Phi\Bigr\rangle, \quad \Psi, \Phi \in C_0^1(\mathbb{R}^3; \mathbb{C}^3). \tag{18}$$

By Weyl’s decomposition (see [BS]), we have
$$L^2(\mathbb{R}^3; \mathbb{C}^3) = S \oplus G, \tag{19}$$

where G, the space of potential fields, is the closure in $L^2(\mathbb{R}^3; \mathbb{C}^3)$ of the linear subset $\{\Psi \in C_0^1(\mathbb{R}^3; \mathbb{C}^3);\ \Psi = \nabla\varphi \text{ with } \varphi \in C_0^1(\mathbb{R}^3)\}$. The spaces S and G are left invariant by $\mathbf{M}$, with $G \subset \mathcal{D}(\mathbf{M})$ and $\mathbf{M}|_G = 0$. We define M as the restriction of $\mathbf{M}$ to S, i.e., $\mathcal{D}(M) = \mathcal{D}(\mathbf{M}) \cap S$ and $M = \mathbf{M}|_{\mathcal{D}(\mathbf{M})\cap S}$. Thus

$$M = P_S\,\mathbf{M}\,I_S = \mathbf{M}\,I_S, \tag{20}$$

with $P_S$ the orthogonal projection onto S and $I_S : S \to L^2(\mathbb{R}^3; \mathbb{C}^3)$ the restriction of the identity map. Notice that $\mathbf{M} = M \oplus 0_G$ and 0 ∈ σ(M), so

$$\sigma(M) = \sigma(\mathbf{M}). \tag{21}$$

We can thus work with $\mathbf{M}$ to answer questions about the spectrum of M. In the special case of a homogeneous medium with ε(x) ≡ 1, we will use the notation

$$\boldsymbol{\Xi} = \mathbf{M}(1) = (\nabla^\times)^2, \qquad \Xi = M(1) = (\nabla^\times)^2\big|_{\mathcal{D}((\nabla^\times)^2)\cap S}. \tag{22}$$

In this article we consider electromagnetic waves in random media obtained by random perturbations of a periodic medium. The properties of the medium are described by the position dependent quantity ε(x), which we will take to always satisfy the following assumptions.

Assumption 1 (The Random Media). $\varepsilon_g(x) = \varepsilon_{g,\omega}(x)$ is a random function of the form

$$\varepsilon_{g,\omega}(x) = \varepsilon_0(x)\,\gamma_{g,\omega}(x), \quad\text{with}\quad \gamma_{g,\omega}(x) = 1 + g\sum_{i\in\mathbb{Z}^3}\omega_i u_i(x), \tag{23}$$

where

(i) $\varepsilon_0(x)$ is a measurable real valued function which is q-periodic for some q ∈ N, i.e., $\varepsilon_0(x) = \varepsilon_0(x + qi)$ for all $x \in \mathbb{R}^3$ and $i \in \mathbb{Z}^3$, with

$$0 < \varepsilon_{0,-} \le \varepsilon_0(x) \le \varepsilon_{0,+} < \infty \ \text{ for a.e. } x \in \mathbb{R}^3, \tag{24}$$

for some constants $\varepsilon_{0,-}$ and $\varepsilon_{0,+}$.

(ii) $u_i(x) = u(x - i)$ for each $i \in \mathbb{Z}^3$, u being a nonnegative measurable real valued function with compact support, say u(x) = 0 if $\|x\|_\infty \ge r_u$ for some $r_u < \infty$, such that

$$0 < U_- \le U(x) \equiv \sum_{i\in\mathbb{Z}^3}u_i(x) \le U_+ < \infty \ \text{ for a.e. } x \in \mathbb{R}^3, \tag{25}$$

for some constants $U_-$ and $U_+$.

(iii) $\omega = \{\omega_i;\ i \in \mathbb{Z}^3\}$ is a family of independent, identically distributed random variables taking values in the interval [−1, 1], whose common probability distribution µ has a bounded density ρ > 0 a.e. in [−1, 1].

(iv) g, satisfying $0 \le g < \frac{1}{U_+}$, is the disorder parameter.
For electromagnetic waves $\varepsilon_{g,\omega}(x)$ is the random position dependent dielectric constant of the medium. Notice that Assumption 1 implies that each $\varepsilon_{g,\omega}$ satisfies (14) with

$$\varepsilon_\pm = \varepsilon_{g,\pm} = \varepsilon_{0,\pm}(1 \pm gU_+). \tag{26}$$

For later use we set

$$\delta_\pm(g) = \frac{U_\pm}{1 \mp gU_+}, \quad\text{with } 0 \le g < \frac{1}{U_+}. \tag{27}$$

The periodic operators associated with the coefficient $\varepsilon_0(x)$ will carry the subscript 0, i.e., $\mathbf{M}_0 = \mathbf{M}(\varepsilon_0)$, $M_0 = M(\varepsilon_0)$. We will study the random operators (see [FK3, Appendix A] for the definition)

$$\mathbf{M}_g = \mathbf{M}_{g,\omega} = \mathbf{M}(\varepsilon_{g,\omega}); \qquad M_g = M_{g,\omega} = M(\varepsilon_{g,\omega}). \tag{28}$$

It is a consequence of ergodicity (measurability follows from [FK3, Theorem 38]) that there exists a nonrandom set $\Sigma_g$, such that $\sigma(\mathbf{M}_{g,\omega}) = \sigma(M_{g,\omega}) = \Sigma_g$ with probability one. In addition, the decompositions of $\sigma(\mathbf{M}_{g,\omega})$ and $\sigma(M_{g,\omega})$ into pure point spectrum, absolutely continuous spectrum and singular continuous spectrum are also independent of the choice of ω with probability one [KM1, PF]. In this article we are interested in the phenomenon of localization. According to the philosophy of Anderson localization we will assume that the operator $M_0$ has at least one gap in the spectrum.

Assumption 2 (The gap in the spectrum). There is a gap in the spectrum of the operator $M_0$. More precisely, there exist 0 ≤ â a ≤ b ≤ b̂ such that

$$\sigma(M_0) \cap [\hat a, \hat b] = [\hat a, a] \cup [b, \hat b],$$

so the interval (a, b) is a gap in $\sigma(M_0)$.

The following theorem gives information on the location of $\Sigma_g$, the (nonrandom) spectrum of the random Maxwell operator $M_g$.

Theorem 3 (Location of the spectrum). Let the random operator $M_g$ defined by (28) satisfy Assumptions 1 and 2. There exists $g_0$, with

$$\frac{1}{U_+}\left(1 - \left(\frac{a}{b}\right)^{\frac12}\right) \le g_0 \le \frac{1}{U_+}\min\left\{1,\ \left(\frac{b}{a}\right)^{\frac{U_+}{2U_-}} - 1\right\}, \tag{29}$$

and strictly increasing, Lipschitz continuous real valued functions a(g) and −b(g) on the interval $[0, \frac{1}{U_+})$, with a(0) = a, b(0) = b and a(g) ≤ b(g), such that:

(i)
$$\Sigma_g \cap [\hat a, \hat b] = [\hat a, a(g)] \cup [b(g), \hat b]. \tag{30}$$

(ii) For $g < g_0$, we have a(g) < b(g) and (a(g), b(g)) is a gap in the spectrum of the random operator $M_g$, located inside the gap (a, b) of the unperturbed periodic operator $M_0$. Moreover, we have

$$a \le a(1 + gU_+)^{\frac{U_-}{U_+}} \le a(g) \le \frac{a}{1 - gU_+} \tag{31}$$

and

$$b(1 - gU_+) \le b(g) \le \frac{b}{(1 + gU_+)^{\frac{U_-}{U_+}}} \le b. \tag{32}$$
(iii) If $g_0 < \frac{1}{U_+}$, we have a(g) = b(g) for all $g \in [g_0, \frac{1}{U_+})$, and the random operator $M_g$ has no gap inside the gap (a, b) of the unperturbed periodic operator $M_0$, i.e., $[\hat a, \hat b] \subset \Sigma_g$.

Definition 4 (Exponential localization). We say that the random operator $M_g$ exhibits localization in an interval $I \subset \Sigma_g$, if $M_g$ has only pure point spectrum in I with probability one. We have exponential localization in I if we have localization and, with probability one, all the eigenfunctions corresponding to eigenvalues in I are exponentially decaying (in the sense of having exponentially decaying local $L^2$-norms).

Remark 5. The curls of exponentially decaying eigenfunctions of $M_g$ always have exponentially decaying local $L^2$-norms (Corollary 14). Thus the corresponding energy densities (see (2)) also have exponentially decaying local $L^2$-norms, uniformly in the time t.

Our main results show that random perturbations create exponentially localized eigenfunctions near the edges of the gap. Our method requires low probability of extremal values for the random variables; the following two theorems achieve this in different ways. The results are formulated for the left edge of the gap, with similar results holding at the right edge.

Theorem 6 (Localization at the edge). Let the random operator $M_g$ defined by (28) satisfy Assumptions 1 and 2, with

$$\mu\{(1 - \gamma, 1]\} \le K\gamma^\eta \ \text{ for } 0 \le \gamma \le 1, \tag{33}$$

where K < ∞ and η > 3. For any $g < g_0$ there exists δ(g) > 0, depending only on the constants g, q, $\varepsilon_{0,\pm}$, $U_\pm$, $r_u$, K, η, an upper bound on $\|\rho\|_\infty$, and on a, b, such that the random operator $M_g$ exhibits exponential localization in the interval [a(g) − δ(g), a(g)].

Theorem 7 (Localization in a specified interval). Let the random operator $M_g$ defined by (28) satisfy Assumptions 1 and 2. For any $g < g_0$, given $a < a_1 < a_2 < a(g)$, with $a(g) - a_1 \le b(g) - a(g)$, there exists $p_1 > 0$, depending only on the constants g, q, $\varepsilon_{0,\pm}$, $U_\pm$, $r_u$, a, an upper bound on $\|\rho\|_\infty$ and on the given $a_1$, $a_2$, such that if

$$\mu\left\{\left(\frac{g_1}{g}, 1\right]\right\} < p_1, \tag{34}$$

where $g_1$ is defined by $a(g_1) = a_1$, the random operator $M_g$ exhibits exponential localization in the interval $[a_2, a(g)]$.

Theorems 6 and 7 can be extended to the situation when the gap is totally filled by the spectrum of the random operator; we then establish the existence of an interval (inside the original gap) where the random Maxwell operator exhibits exponential localization. Notice that the extension of Theorem 7 says that we can arrange for localization in as much of the gap as we want.

Theorem 8 (Localization at the meeting of the edges). Let the random operator $M_g$ defined by (28) satisfy Assumptions 1 and 2, with

$$\mu\{(1 - \gamma, 1]\},\ \mu\{[-1, -1 + \gamma)\} \le K\gamma^\eta \ \text{ for } 0 \le \gamma \le 1, \tag{35}$$
where K < ∞ and η > 3. Suppose $g_0 < \frac{1}{U_+}$ (e.g., if $\left(\frac{b}{a}\right)^{\frac{U_+}{2U_-}} < 2$), so the random operator $M_g$ has no gap inside (a, b) for $g \in [g_0, \frac{1}{U_+})$. Then there exist $0 < \epsilon < \frac{1}{U_+} - g_0$ and δ > 0, depending only on the constants q, $\varepsilon_{0,\pm}$, $U_\pm$, $r_u$, K, η, an upper bound on $\|\rho\|_\infty$, and on a, b, such that the random operator $M_g$ exhibits exponential localization in the interval $[a(g_0) - \delta, a(g_0) + \delta]$ for all $g_0 \le g < g_0 + \epsilon$.

Theorem 9 (Localization in a specified interval in the closed gap). Let the random operator $M_g$ defined by (28) satisfy Assumptions 1 and 2. Suppose $g_0 < \frac{1}{U_+}$ (e.g., if $\left(\frac{b}{a}\right)^{\frac{U_+}{2U_-}} < 2$), so the random operator $M_g$ has no gap inside (a, b) for $g \in [g_0, \frac{1}{U_+})$. Let $a < a_1 < a_2 < a(g_0) = b(g_0) < b_2 < b_1 < b$ be given. For any $g \in [g_0, \frac{1}{U_+})$ there exist $p_1, p_2 > 0$, depending only on the constants g, q, $\varepsilon_{0,\pm}$, $U_\pm$, $r_u$, a, b, an upper bound on $\|\rho\|_\infty$ and on the given $a_1$, $a_2$, $b_1$, $b_2$, such that if

$$\mu\left\{\left(\frac{g_1}{g}, 1\right]\right\} < p_1, \qquad \mu\left\{\left[-1, -\frac{g_2}{g}\right)\right\} < p_2, \tag{36}$$

where $g_1$ and $g_2$ are defined by $a(g_1) = a_1$ and $b(g_2) = b_1$ (notice $0 < g_1, g_2 < g_0 \le g$), the random operator $M_g$ exhibits exponential localization in the interval $[a_2, b_2]$.

Theorems 8 and 9 are proved exactly as Theorems 6 and 7, respectively, taking into account both edges of the gap.

Remark 10. Theorems 6 and 8 should be true without the extra hypotheses (33) and (35). They are used in conjunction with a Combes-Thomas argument to obtain the starting hypothesis for the multiscale analysis, in the proof of localization. One may expect estimates similar to Lifshitz tails (e.g., [PF]) for the density of states inside the gap, which would replace (33) and (35) in the proofs. This is how the starting hypothesis is obtained for random Schrödinger operators at the bottom of the spectrum [HM]. Combes and Hislop have announced an improved Combes-Thomas argument inside a gap; they obtain a decay rate proportional to the square root of the product of the distances to the edges of the gap. With this result we would only need $\eta > \frac32$ in Theorem 6, but we would still need to require η > 3 in Theorem 8.

Theorem 3 is proved in Sect. 4; the proof requires periodic operators and periodic boundary condition, studied in Sect. 3. Theorems 6 and 7 are proved in Sect. 7 by multiscale analyses. Dirichlet boundary condition, used in the proofs, is discussed in Sect. 5. The required Wegner-type estimate is in Sect. 6. The starting hypotheses are proved first for finite volume Maxwell operators with periodic boundary condition, using a Combes-Thomas argument for operators with periodic boundary condition (Subsect. 3.2) and Theorem 3. We collect properties of Maxwell operators needed for the proof of localization in Sect. 2; they include an interior estimate for curls and existence of polynomially bounded generalized eigenfunctions.

1.3. Notation. We adopt the following definitions and notations:
– For $x = (x_1, x_2, x_3) \in \mathbb{R}^3$ we let $|x|_p = (|x_1|^p + |x_2|^p + |x_3|^p)^{1/p}$ for 1 ≤ p < ∞, and $|x|_\infty = \max_{1\le j\le 3}|x_j|$. We set $|x| = |x|_2$ and $\|x\| = |x|_\infty$.
– $\Lambda_L(x) = \{y \in \mathbb{R}^3;\ \|y - x\| < \frac{L}{2}\}$ is the (open) cube of side L centered at $x \in \mathbb{R}^3$; $\bar\Lambda_L(x)$ is the closed cube, and $\breve\Lambda_L(x) = \{y \in \mathbb{R}^3;\ -\frac{L}{2} \le y_i - x_i < \frac{L}{2},\ i = 1, 2, 3\}$ the half-open/half-closed cube.
– $\chi_\Lambda$ is the characteristic function of the set Λ; we write $\chi_{x,L} = \chi_{\Lambda_L(x)}$.
– A function f on $\mathbb{R}^3$ is called q-periodic for some q > 0 if f(x + qi) = f(x) for all $x \in \mathbb{R}^3$ and $i \in \mathbb{Z}^3$.
– A domain Ω is an open connected subset of $\mathbb{R}^3$; its boundary is denoted by ∂Ω.
– $L^p(\Omega; \mathbb{C}^d)$ is the space of measurable functions $u : \Omega \to \mathbb{C}^d$ with the norm $\|u\|_p = \|u\|_{p,\Omega} = \left(\int_\Omega |u(x)|^p\,dx\right)^{1/p}$. We will often use the space $L^2(\Omega; \mathbb{C}^d)$ and in this case we will write $\|u\|_\Omega$ for $\|u\|_{2,\Omega}$. If $\Omega = \mathbb{R}^d$ we may omit it from the subscript. We write $L^p(\Omega)$ if d = 1.
– $C^n(\Omega; \mathbb{C}^d)$ is the linear space of n-times continuously differentiable functions $u : \Omega \to \mathbb{C}^d$; $C_0^n(\Omega; \mathbb{C}^d)$ is the subspace of functions with compact support. We write $C^n(\Omega)$ if d = 1.
– The domain, spectrum and adjoint of a linear operator A are denoted by $\mathcal{D}(A)$, σ(A) and $A^*$, respectively.
– If $\mathcal{A}$ is the quadratic form associated with an operator A, its domain will be denoted by either $Q(\mathcal{A})$ or $Q(A)$. We also write $\mathcal{A}[\Psi]$ for $\mathcal{A}(\Psi, \Psi)$.
– $B(\mathcal{X}, \mathcal{Y})$ is the Banach space of bounded operators from the normed space $\mathcal{X}$ to the normed space $\mathcal{Y}$; $B(\mathcal{X}) = B(\mathcal{X}, \mathcal{X})$.
– For a complex number z its conjugate is denoted by $z^*$.
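2. Properties of Maxwell Operators

2.1. An interior estimate. Let us consider the first order linear differential operator $D = \{D_{\alpha,\beta}\}_{\alpha,\beta=1,\dots,\nu}$, with each $D_{\alpha,\beta} = a_{\alpha,\beta} \cdot \nabla$ for some fixed $a_{\alpha,\beta} \in \mathbb{C}^d$. $D$ is a closed densely defined operator on $L^2(\mathbb{R}^d; \mathbb{C}^\nu)$, whose domain, $\mathcal{D}(D)$, is the closure of $C_0^\infty(\mathbb{R}^d; \mathbb{C}^\nu)$ in the norm $\left(\|\Psi\|_2^2 + \|D\Psi\|_2^2\right)^{1/2}$. Given an open set $\Omega \subset \mathbb{R}^d$, we define $D_\Omega$ as the closed densely defined operator on $L^2(\Omega; \mathbb{C}^\nu)$, defined in the obvious way for $\Psi \in C^\infty(\Omega; \mathbb{C}^\nu)$ with $\{a_{\alpha,\beta} \cdot \nabla\}_{\alpha,\beta=1,\dots,\nu} \Psi \in L^2(\Omega; \mathbb{C}^\nu)$. If $\Omega' \subset \Omega$, it is easy to see that if $u \in \mathcal{D}(D_\Omega)$, then $u|_{\Omega'} \in \mathcal{D}(D_{\Omega'})$ with $D_{\Omega'} u|_{\Omega'} = (D_\Omega u)|_{\Omega'}$, so we will simply write $Du$ to denote the function $D_\Omega u$.

Let $\Gamma = \Gamma(x)$ be a measurable function on $\mathbb{R}^d$ whose values are $\nu \times \nu$ complex matrices with
$$0 \le \Gamma(x) \le \Gamma_+ I_\nu \quad \text{a.e. for some constant } \Gamma_+ < \infty, \tag{37}$$
$I_\nu$ being the $\nu \times \nu$ identity matrix. We say that a function $u \in \mathcal{D}(D_\Omega)$ is a weak solution for the equation $D^* \Gamma D u = f$ in $\Omega$, where $f \in L^2(\Omega; \mathbb{C}^\nu)$, if
$$\langle D\Psi, \Gamma Du \rangle_\Omega = \langle \Psi, f \rangle_\Omega \quad \text{for all } \Psi \in C_0^\infty(\Omega; \mathbb{C}^\nu). \tag{38}$$

Theorem 11. Let $D$ and $\Gamma$ be as above. For any $\delta > 0$ there exists a constant $\xi_\delta = \xi(d, \nu, \{a_{\alpha,\beta}\}_{\alpha,\beta=1,\dots,\nu}, \delta) < \infty$, depending only on the indicated parameters, such that if $u \in \mathcal{D}(D_\Omega)$ is a weak solution for the equation $D^* \Gamma D u = f$ in an open subset $\Omega$ of $\mathbb{R}^d$, with $f \in L^2(\Omega; \mathbb{C}^\nu)$, we have
$$\langle Du, \Gamma Du \rangle_{\Omega'} \le \xi_\delta \left( \Gamma_+ \|u\|_\Omega^2 + \frac{1}{\Gamma_+} \|f\|_\Omega^2 \right) \tag{39}$$
for any $\Omega' \subset \Omega$ with $\operatorname{dist}(\Omega', \partial\Omega) \ge \delta$.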
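Proof. We consider first the case when $\Omega$ and $\Omega'$ are open cubes, say $\Omega = \Lambda_L(x_0)$, $\Omega' = \Lambda_{L-2\delta}(x_0)$, for some $x_0 \in \mathbb{R}^d$, $L > 2\delta$. We fix $\phi \in C_0^\infty(\mathbb{R}^d)$ such that $0 \le \phi(x) \le 1$, $\phi(x) \equiv 1$ in $\Omega'$, $\phi(x) \equiv 0$ in $\mathbb{R}^d \setminus \Lambda_{L-\frac{\delta}{2}}(x_0)$, and $|(\nabla\phi)(x)| \le \frac{2}{\delta}\sqrt{d}$. (Such a function always exists.) We set $D\phi = \{D_{\alpha,\beta}\phi\}_{\alpha,\beta=1,\dots,\nu} = \{a_{\alpha,\beta} \cdot \nabla\phi\}_{\alpha,\beta=1,\dots,\nu}$. Since $\phi^2 u \in \mathcal{D}(D_\Omega)$ with compact support, it follows from (38) that
$$\langle D\phi^2 u, \Gamma Du \rangle_\Omega = \langle \phi^2 u, f \rangle_\Omega, \tag{40}$$
so we have
$$0 \le \langle Du, \phi^2 \Gamma Du \rangle_\Omega = \langle \phi^2 u, f \rangle_\Omega - 2\langle (D\phi)u, \phi\Gamma Du \rangle_\Omega \tag{41}$$
$$\le \|u\|_\Omega \|f\|_\Omega + 2 \langle (D\phi)u, \Gamma (D\phi)u \rangle_\Omega^{1/2} \langle Du, \phi^2 \Gamma Du \rangle_\Omega^{1/2} \tag{42}$$
$$\le \frac{\Gamma_+}{2} \|u\|_\Omega^2 + \frac{1}{2\Gamma_+} \|f\|_\Omega^2 + 2\Gamma_+ C_\delta \|u\|_\Omega^2 + \frac{1}{2} \langle Du, \phi^2 \Gamma Du \rangle_\Omega, \tag{43}$$
where we used the elementary inequality $ab \le r^2 a^2 + s^2 b^2$, for any $a, b \ge 0$, $r, s > 0$ with $2rs = 1$, and $C_\delta = C(d, \nu, \{a_{\alpha,\beta}\}_{\alpha,\beta=1,\dots,\nu}, \delta) < \infty$ is a constant depending only on the indicated parameters. Thus,
$$\langle Du, \Gamma Du \rangle_{\Omega'} \le \langle Du, \phi^2 \Gamma Du \rangle_\Omega \le \frac{1}{\Gamma_+} \|f\|_\Omega^2 + (1 + 4C_\delta)\Gamma_+ \|u\|_\Omega^2, \tag{44}$$
which implies (39) when $\Omega$ is an open cube.

We now consider the general case: let $\Omega$ and $\Omega'$ be as in the theorem, with $\operatorname{dist}(\Omega', \partial\Omega) \ge \delta$ (we use the norm $|\ |_\infty$), and let
$$\Omega'_\delta = \left\{ x \in \tfrac{\delta}{2}\mathbb{Z}^d;\ \Lambda_{\frac{\delta}{2}}(x) \cap \Omega' \ne \emptyset \right\}. \tag{45}$$
Using (44), we get
$$\langle Du, \Gamma Du \rangle_{\Omega'} \le \sum_{x \in \Omega'_\delta} \langle Du, \Gamma Du \rangle_{\Lambda_{\frac{\delta}{2}}(x)} \tag{46}$$
$$\le \sum_{x \in \Omega'_\delta} \left[ \frac{1}{\Gamma_+} \|f\|_{\Lambda_\delta(x)}^2 + (1 + 4C_{\frac{\delta}{4}})\Gamma_+ \|u\|_{\Lambda_\delta(x)}^2 \right] \tag{47}$$
$$\le (2^d + 1) \left[ \frac{1}{\Gamma_+} \|f\|_\Omega^2 + (1 + 4C_{\frac{\delta}{4}})\Gamma_+ \|u\|_\Omega^2 \right], \tag{48}$$
from which (39) follows.

Theorem 11 has the following immediate corollaries for Maxwell operators. In this case $\nu = 3$, $D = \nabla^\times$ (i.e., $D\Psi = \nabla \times \Psi$), $D^* = D$, $D^* D = \Xi$, and $\Gamma = \frac{1}{\varepsilon} I_3$. We write $\nabla^\times|_\Omega$ for $(\nabla^\times)_\Omega$. If $M$ on $L^2(\mathbb{R}^3; \mathbb{C}^3)$ is given by (17) with (14), we have $M = D^* \Gamma D$.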
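The passage from (42) to (43) in the proof of Theorem 11 applies the elementary inequality $ab \le r^2 a^2 + s^2 b^2$ (with $2rs = 1$) twice, with two different choices of $r$. The following is only a small numerical sanity check of that step; the function names, the sample grid and the value chosen for $\Gamma_+$ are illustrative assumptions and do not come from the paper.

```python
import math

def young_gap(a, b, r):
    """Return r^2 a^2 + s^2 b^2 - a b, with s fixed by the constraint 2 r s = 1."""
    s = 1.0 / (2.0 * r)
    return (r * a) ** 2 + (s * b) ** 2 - a * b

def min_gap(r, samples):
    """Smallest gap over all sample pairs; nonnegative when the inequality holds."""
    return min(young_gap(a, b, r) for a in samples for b in samples)

samples = [0.0, 0.1, 0.5, 1.0, 2.0, 7.5, 100.0]

# Choice behind the bound on ||u|| ||f||: r^2 = Gamma_+ / 2, s^2 = 1 / (2 Gamma_+).
gamma_plus = 3.0  # hypothetical value of the upper bound Gamma_+ in (37)
r_first = math.sqrt(gamma_plus / 2.0)

# Choice behind the bound on the cross term: r^2 = 1, s^2 = 1/4.
r_second = 1.0

gap_first = min_gap(r_first, samples)
gap_second = min_gap(r_second, samples)
```

Both gaps are nonnegative because the gap equals $(ra - sb)^2$; equality occurs exactly when $ra = sb$.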
Corollary 12. Let the operator M on L2 (R3 ; C3 ) be given by (17) with (14). For any δ > 0 there exists Θδ < ∞, depending only on δ, such that if 9 ∈ D(∇× |Ω ) is a weak solution for the equation M 9 = F in an open subset Ω of R3 , with F ∈ L2 (Ω; C3 ), we have √ 1 √ (49) k∇ × 9kΩ 0 ≤ Θδ ε+ √ k9kΩ + ε− kF kΩ ε− for any Ω 0 ⊂ Ω with dist(Ω 0 , ∂Ω) ≥ δ. Corollary 13. Let the operator M be given by (17) with (14). Let 9 ∈ L2 (R3 ; C3 ) be such that ∇ × 9 is locally in L2 , i.e., 9|Ω ∈ D(∇× |Ω ) for any bounded open Ω ⊂ R3 . Then, if 9 is a weak solution for the equation M9 = F in R3 , with F ∈ L2 (R3 ; C3 ), we have √ 1 √ k∇ × 9k ≤ Θ∞ ε+ √ k9k + ε− kF k , (50) ε− with Θ∞ = inf δ>0 Θδ . Corollary 12 gives exponential decay for the curl of an exponentially decaying eigenfunction of a Maxwell operator. Corollary 14. Let M be an operator of the form (15) satisfying the bounds (14), and let 9 be an eigenfunction for M. Suppose 9 has exponentially decaying local L2 -norms, i.e., kχx,` 9k2 decays exponentially as kxk → ∞ for some ` > 0. Then ∇ × 9 also has exponentially decaying local L2 -norms. 2.2. A Combes-Thomas argument. Let the operator M be given by (17). If z ∈ / σ(M ), we write R(z) = (M − z)−1 . Lemma 15. Let the operator M be given by (17) with (14). Then for any z ∈ / σ(M ), n ∈ N and ` > 0 we have n √ 9 n e( 3`/4) e−mz |x−y| for all x, y ∈ R3 , (51) kχx,` R(z) χy,` k ≤ η with mz =
η , 4 ε−1 + |z| + η −
(52)
where η = dist(z, σ(M )). Proof. The lemma is proved in the same way as [FK3, Lemma 12], with the obvious modifications to take into account that in this lemma we have curls instead of gradients. The next lemma gives an exponential estimate for the curl of the resolvent. Lemma 16. Let the operator M be given by (17) with (14), and let z ∈ / σ(M ) with 3 × 2 η, mz as in Lemma 1. Then ∇ R(z) is a bounded operator on L (R , C3 ) with
×
(1 + |z|)
∇ R(z) ≤ Θ1 √ε+ √ε− + √1 +1 , (53) ε− η
where Θ1 is given in (49). Furthermore, for each ` > 0 we have
9 √
χx,` ∇× R(z)χy,` ≤ Θ1 √ε+ √ε− + √1 (1 + |z|) e(3 3`/4) e−mz |x−y| ε− η
(54)
for all $x, y \in \mathbb{R}^3$ with $|x - y| \ge 2\ell$.

Proof. This lemma is proven in the same way as [FK3, Lemma 13], using Corollaries 12 and 13, and Lemma 15.

2.3. Generalized eigenfunctions. Let $M$ be an operator of the form (17) satisfying the bounds (14). Given $z \in \mathbb{C}$, a measurable function $\Psi : \mathbb{R}^3 \to \mathbb{C}^3$ will be called a generalized eigenfunction for $z$ if both $\Psi$ and $\nabla \times \Psi$ are locally in $L^2$, i.e., $\Psi|_\Omega \in \mathcal{D}(\nabla^\times|_\Omega)$ for all open bounded subsets $\Omega$ of $\mathbb{R}^3$, and $\Psi$ is a weak solution for the equation $M\Psi = z\Psi$ on $\mathbb{R}^3$, i.e.,
$$\left\langle \nabla \times \Phi, \frac{1}{\varepsilon} \nabla \times \Psi \right\rangle = z \langle \Phi, \Psi \rangle \quad \text{for all } \Phi \in C_0^\infty(\mathbb{R}^3; \mathbb{C}^3). \tag{55}$$

Theorem 17. Let $M$ be an operator of the form (17) satisfying the bounds (14), $\rho(d\lambda)$ its spectral measure. Let $w(x) = \left(|x|^p + 1\right)^{-1}$ with $p > 3$. Then, for $\rho(d\lambda)$-almost all $\lambda > 0$, $M$ has a generalized eigenfunction $\Psi_\lambda$ satisfying
$$\int_{\mathbb{R}^3} |\Psi_\lambda(x)|^2 w(x)\,dx < \infty, \tag{56}$$
so for any $\ell \in \mathbb{N}$ we have
$$\|\chi_{x,\ell} \Psi_\lambda\| \le C_\ell \left(|x|^p + 1\right) \quad \text{for all } x \in \ell\mathbb{Z}^3, \tag{57}$$
for some constant $C_\ell < \infty$ depending only on $\ell$, $\varepsilon_\pm$ and the LHS of (56).

Proof. Let
$$F(t) = \begin{cases} (t + 1)^{-1}, & \text{if } t > 0; \\ 0, & \text{if } t \le 0. \end{cases} \tag{58}$$
$F$ is a bounded measurable function on the real line, continuous on $(0, \infty)$, such that
$$F(M) = (\mathbf{M} + I_S)^{-1} \oplus 0_G \tag{59}$$
with respect to Weyl's decomposition (19). The operator $F(M) W^{1/2}$ is Hilbert-Schmidt by Theorem 18 below, $W$ being the operator given by multiplication by the function $w(x)$. The existence of generalized eigenfunctions satisfying (56), for $\rho(d\lambda)$-almost all $\lambda > 0$, now follows from [B, Subsects. V.4.1–V.4.2]. The estimate (57) is an immediate consequence of (56).

2.4. Estimates on traces.
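Theorem 18. Let $M$ be an operator of the form (17) with (14), and let $V$ denote the bounded operator given by multiplication by the bounded measurable function $v(x)$, with $v(x) \ge 0$ and
$$\sum_{x \in \mathbb{Z}^3} \|\chi_{x,1} v^2\|_\infty < \infty. \tag{60}$$
Then the operator
$$P_S (M + I)^{-1} V = \left[ (\mathbf{M} + I_S)^{-1} \oplus 0_G \right] V \tag{61}$$
is Hilbert-Schmidt.

Theorem 18 was used in the proof of Theorem 17 with $v(x) = [w(x)]^{1/2} = \left(|x|^p + 1\right)^{-1/2}$.

To prove the theorem we will introduce a modified Maxwell operator $\widetilde{M}$ which is elliptic. Formally,
$$\widetilde{M} = \widetilde{M}(\varepsilon) = M + Y, \tag{62}$$
with $Y = -\nabla \frac{1}{\varepsilon} \nabla\cdot$, i.e., $Y\Psi = -\nabla\left\{ \frac{1}{\varepsilon} [\nabla \cdot \Psi] \right\}$. $\widetilde{M}$ is rigorously defined as the nonnegative self-adjoint operator on $L^2(\mathbb{R}^3; \mathbb{C}^3)$ given by the closure of the nonnegative quadratic form
$$\widetilde{\mathcal{M}}[\Psi] = \mathcal{M}[\Psi] + \int_{\mathbb{R}^3} \frac{1}{\varepsilon(x)} \left| [\nabla \cdot \Psi](x) \right|^2 dx, \quad \Psi \in C_0^1(\mathbb{R}^3; \mathbb{C}^3). \tag{63}$$
The operator $\widetilde{M}$ is diagonal with respect to Weyl's decomposition (19), with $\widetilde{M} = \mathbf{M} \oplus \mathbf{Y}$ for the appropriate operator $\mathbf{Y}$ on $G$. If $\varepsilon(x) \equiv 1$, we have
$$\widetilde{\Xi} \equiv \widetilde{M}(1) = -\Delta \otimes I_3, \tag{64}$$
where $\Delta$ is the Laplacian in $L^2(\mathbb{R}^3)$ and $I_3$ is the identity operator on $\mathbb{C}^3$. Since
$$(\mathbf{M} + I_S)^{-2} \oplus 0_G \le (\mathbf{M} + I_S)^{-2} \oplus (\mathbf{Y} + I_G)^{-2} = \left( \widetilde{M} + I \right)^{-2}, \tag{65}$$
Theorem 18 is an immediate consequence of the following theorem.

Theorem 19. Let $\widetilde{M}$ be as in (62) with (14), and let $v(x)$ and $V$ be as in Theorem 18. Then the operator $\left( \widetilde{M} + I \right)^{-1} V$ is Hilbert-Schmidt.

Proof. We set $\chi_x = \chi_{x,1}$ for $x \in \mathbb{R}^3$ and
$$\widetilde{R} = \left( \widetilde{M} + 1 \right)^{-1}, \quad \widetilde{S}(\mu) = \left( \widetilde{\Xi} + \mu \right)^{-1} \ \text{for } \mu > 0; \tag{66}$$
$$\widetilde{R}_{x,y} = \chi_x \widetilde{R} \chi_y, \quad \widetilde{S}(\mu)_{x,y} = \chi_x \widetilde{S}(\mu) \chi_y \ \text{for } x, y \in \mathbb{R}^3. \tag{67}$$
It follows from (14) and (75) that
$$\varepsilon_- \widetilde{S}(\varepsilon_-) \le \widetilde{R} \le \varepsilon_+ \widetilde{S}(\varepsilon_+). \tag{68}$$
We let
$$\widehat{S} = \varepsilon_+ \widetilde{S}(\varepsilon_+), \quad \widehat{S}_{x,y} = \chi_x \widehat{S} \chi_y. \tag{69}$$
We also set $\chi_{x,y} = \max\{\chi_x, \chi_y\}$ for $x, y \in \mathbb{R}^3$; notice $\chi_{x,y}^2 = \chi_{x,y}$ and $\chi_{x,x} = \chi_x$.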
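The two-sided bound (68) can be illustrated in a scalar caricature: if a number $m \ge 0$ (standing in for a spectral value of the modified operator) satisfies $\xi/\varepsilon_+ \le m \le \xi/\varepsilon_-$ for some $\xi \ge 0$ (standing in for a spectral value of the $\varepsilon \equiv 1$ operator), then $1/(m+1)$ lies between $\varepsilon_-/(\xi + \varepsilon_-)$ and $\varepsilon_+/(\xi + \varepsilon_+)$. The sketch below checks only this scalar statement; the sandwich assumption on $m$, the sample values and the function name are illustrative assumptions and are not a substitute for the operator argument via (14) and (75).

```python
def resolvent_bounds_hold(xi, m, eps_minus, eps_plus, tol=1e-12):
    """Scalar analogue of (68): eps_- S(eps_-) <= R <= eps_+ S(eps_+)."""
    lower = eps_minus / (xi + eps_minus)   # eps_- * (xi + eps_-)^{-1}
    upper = eps_plus / (xi + eps_plus)     # eps_+ * (xi + eps_+)^{-1}
    value = 1.0 / (m + 1.0)                # (m + 1)^{-1}
    return lower - tol <= value <= upper + tol

eps_minus, eps_plus = 0.5, 4.0             # hypothetical bounds on epsilon
xis = [0.0, 0.01, 1.0, 10.0, 1e4]
fractions = [0.0, 0.25, 0.5, 1.0]          # interpolate m between xi/eps_+ and xi/eps_-

all_ok = all(
    resolvent_bounds_hold(
        xi,
        xi / eps_plus + t * (xi / eps_minus - xi / eps_plus),
        eps_minus,
        eps_plus,
    )
    for xi in xis
    for t in fractions
)
```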
Lemma 20. Let $p > \frac{3}{2}$ and $\mu > 0$. Then there exists a constant $c_1 = c_1(p, \mu) < \infty$, depending only on the indicated parameters, such that
$$\operatorname{Tr}\left\{ \chi_{x,y} \left[ \widetilde{S}(\mu) \right]^p \chi_{x,y} \right\} \le c_1 \tag{70}$$
for all $x, y \in \mathbb{R}^3$.

Proof. It follows from (64) that it suffices to show that
$$\operatorname{Tr}\left\{ \chi_{x,y} (-\Delta + \mu)^{-p} \chi_{x,y} \right\} \le c \tag{71}$$
for all $x, y \in \mathbb{R}^3$, the trace now being calculated in $L^2(\mathbb{R}^3)$. But this is a consequence of the fact that the operator $(-\Delta + \mu)^{-p}$ has a bounded kernel; it is taken into multiplication by an integrable function by the Fourier transform.

We recall some general results. Given a compact operator $A$ on a Hilbert space, we set $s_j(A) = \lambda_j(|A|)$, where $\lambda_1(|A|) \ge \lambda_2(|A|) \ge \dots$ are the strictly positive eigenvalues of $|A|$, repeated according to their multiplicity. For such $A$ we have (e.g., [GK]):
$$\|A\|_p^p \equiv \operatorname{Tr} |A|^p = \sum_j [s_j(A)]^p, \quad 1 \le p < \infty; \tag{72}$$
$$s_j(A) = s_j(A^*) \ \text{for any } j, \ \text{so } \|A\|_p = \|A^*\|_p; \tag{73}$$
$$s_j(BA),\ s_j(AB) \le \|B\|\, s_j(A) \ \text{for any bounded operator } B. \tag{74}$$
If $A$ and $B$ are self-adjoint operators and $A \ge B \ge 0$, we have
$$\operatorname{Tr} A^2 \ge \operatorname{Tr} B^2; \quad A^{-1} \le B^{-1}; \quad A^\beta \ge B^\beta,\ 0 < \beta \le 1. \tag{75}$$
We will also need the following general statement.

Lemma 21. Let $A$ be a nonnegative bounded operator and $P$ an orthogonal projection on a Hilbert space $\mathcal{H}$. For any $\gamma \ge 1$ we have
$$\operatorname{Tr}\, [PAP]^\gamma \le \operatorname{Tr}\, P A^\gamma P. \tag{76}$$

Proof. Let $B$ be a nonnegative compact operator on $\mathcal{H}$. By the mini-max principle, we get
$$\lambda_j(B) = \max_{\{F \subset \mathcal{H};\ \dim F = j\}}\ \min_{\{\varphi \in F;\ \|\varphi\| = 1\}} \langle \varphi, B\varphi \rangle. \tag{77}$$
If $\gamma \ge 1$, it follows from Jensen's inequality that for any $\varphi \in \mathcal{H}$ with $\|\varphi\| = 1$ we have
$$\langle \varphi, B\varphi \rangle^\gamma \le \langle \varphi, B^\gamma \varphi \rangle. \tag{78}$$
Without loss of generality we can assume $\operatorname{Tr} P A^\gamma P < \infty$. In this case we claim that
$$[\lambda_j(PAP)]^\gamma \le \lambda_j(P A^\gamma P) \ \text{for any } j, \tag{79}$$
so (76) follows. Indeed, using (78) and (77) we obtain, with $\mathcal{F} = P\mathcal{H}$,
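$$[\lambda_j(PAP)]^\gamma = \left[ \max_{\{F \subset \mathcal{F};\ \dim F = j\}}\ \min_{\{\varphi \in F;\ \|\varphi\| = 1\}} \langle \varphi, A\varphi \rangle \right]^\gamma = \max_{\{F \subset \mathcal{F};\ \dim F = j\}}\ \min_{\{\varphi \in F;\ \|\varphi\| = 1\}} \langle \varphi, A\varphi \rangle^\gamma$$
$$\le \max_{\{F \subset \mathcal{F};\ \dim F = j\}}\ \min_{\{\varphi \in F;\ \|\varphi\| = 1\}} \langle \varphi, A^\gamma \varphi \rangle = \lambda_j(P A^\gamma P).$$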
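As a finite-dimensional sanity check of (76), one can compare both traces for a small positive definite matrix $A$ and a coordinate projection $P$. The sketch below is only an illustration for integer exponents $\gamma$; the particular matrices and the helper names are assumptions made for this example and do not appear in the paper.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(X, gamma):
    """X raised to a positive integer power gamma."""
    n = len(X)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(gamma):
        result = matmul(result, X)
    return result

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

# Hypothetical data: A is symmetric positive definite, P projects onto e1, e2.
A = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]
P = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0]]

def both_sides(gamma):
    """Return (Tr [PAP]^gamma, Tr P A^gamma P) for integer gamma >= 1."""
    PAP = matmul(matmul(P, A), P)
    lhs = trace(matpow(PAP, gamma))
    rhs = trace(matmul(matmul(P, matpow(A, gamma)), P))
    return lhs, rhs
```

For $\gamma = 1$ the two sides coincide; for $\gamma = 2$ this example gives 10 on the left and 11 on the right.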
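Lemma 22. There exists a constant $c_2 = c_2(\varepsilon_+) < \infty$, depending only on $\varepsilon_+$, such that
$$\left\| \widetilde{R}_{x,y} \right\|_2^2 = \operatorname{Tr}\, \widetilde{R}_{x,y}^* \widetilde{R}_{x,y} \le c_2 \quad \text{for all } x, y \in \mathbb{R}^3. \tag{80}$$
In particular, the operators $\widetilde{R}_{x,y}$ are compact.

Proof. We have
$$\operatorname{Tr}\, \widetilde{R}_{x,y}^* \widetilde{R}_{x,y} = \operatorname{Tr}\, \chi_y \widetilde{R} \chi_x \widetilde{R} \chi_y \le \operatorname{Tr}\, \chi_y \widetilde{R} \chi_{x,y} \widetilde{R} \chi_y = \operatorname{Tr}\, \chi_{x,y} \widetilde{R} \chi_y \widetilde{R} \chi_{x,y} \le \operatorname{Tr}\, \chi_{x,y} \widetilde{R} \chi_{x,y} \widetilde{R} \chi_{x,y} = \operatorname{Tr} \left[ \chi_{x,y} \widetilde{R} \chi_{x,y} \right]^2. \tag{81}$$
On the other hand, using (75), (68), (69) and (70) we obtain
$$\operatorname{Tr} \left[ \chi_{x,y} \widetilde{R} \chi_{x,y} \right]^2 \le \operatorname{Tr} \left[ \chi_{x,y} \widehat{S} \chi_{x,y} \right]^2 = \operatorname{Tr}\, \chi_{x,y} \widehat{S} \chi_{x,y} \widehat{S} \chi_{x,y} \le \operatorname{Tr}\, \chi_{x,y} \widehat{S}^2 \chi_{x,y} \le \varepsilon_+^2\, c_1(2, \varepsilon_+). \tag{82}$$
The inequalities (81) and (82) imply (80).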
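Lemma 23. There exists a constant $c_3 = c_3(\varepsilon_\pm) < \infty$, depending only on $\varepsilon_\pm$, such that
$$\operatorname{Tr}\, \chi_x \widetilde{R}^2 \chi_x \le c_3 \quad \text{for all } x \in \mathbb{R}^3. \tag{83}$$

Proof. We have
$$\operatorname{Tr}\, \chi_x \widetilde{R}^2 \chi_x = \sum_{y \in \mathbb{Z}^3} \operatorname{Tr}\, \chi_x \widetilde{R} \chi_y \widetilde{R} \chi_x = \sum_{y \in \mathbb{Z}^3} \operatorname{Tr} \left| \widetilde{R}_{x,y} \right|^2. \tag{84}$$
In addition, if $0 \le \alpha < 1$, we also have
$$\operatorname{Tr} \left| \widetilde{R}_{x,y} \right|^2 = \operatorname{Tr} \left\{ \left| \widetilde{R}_{x,y} \right|^{1 - \frac{\alpha}{2}} \left| \widetilde{R}_{x,y} \right|^{\alpha} \left| \widetilde{R}_{x,y} \right|^{1 - \frac{\alpha}{2}} \right\} \le \left\| \widetilde{R}_{x,y} \right\|^\alpha \operatorname{Tr} \left| \widetilde{R}_{x,y} \right|^{2 - \alpha}, \tag{85}$$
so
$$\operatorname{Tr}\, \chi_x \widetilde{R}^2 \chi_x \le \sum_{y \in \mathbb{Z}^3} \left\| \widetilde{R}_{x,y} \right\|^\alpha \operatorname{Tr} \left| \widetilde{R}_{x,y} \right|^{2 - \alpha} \le \left[ \sup_{y \in \mathbb{Z}^3} \operatorname{Tr} \left| \widetilde{R}_{x,y} \right|^{2 - \alpha} \right] \sum_{y \in \mathbb{Z}^3} \left\| \widetilde{R}_{x,y} \right\|^\alpha. \tag{86}$$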
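From Lemma 15, which holds exactly as stated with $\widetilde{M}$ substituted for $M$, we get that
$$\sum_{y \in \mathbb{Z}^3} \left\| \widetilde{R}_{x,y} \right\|^\alpha \le b(\varepsilon_-, \alpha) \tag{87}$$
for some constant $b(\varepsilon_-, \alpha) < \infty$, which depends only on $\varepsilon_-$ and $\alpha$. To estimate the other term, notice that using (73), (74), (75), (68) and (69), we obtain
$$\left[ s_j\!\left( \widetilde{R}_{x,y} \right) \right]^2 = s_j\!\left( \left| \widetilde{R}_{x,y} \right|^2 \right) = s_j\!\left( \chi_y \widetilde{R} \chi_x \widetilde{R} \chi_y \right) \le s_j\!\left( \chi_y \widetilde{R} \chi_{x,y} \widetilde{R} \chi_y \right) = s_j\!\left( \chi_y \chi_{x,y} \widetilde{R} \chi_{x,y} \widetilde{R} \chi_{x,y} \chi_y \right)$$
$$\le s_j\!\left( \chi_{x,y} \widetilde{R} \chi_{x,y} \widetilde{R} \chi_{x,y} \right) = s_j\!\left( \left[ \chi_{x,y} \widetilde{R} \chi_{x,y} \right]^2 \right) = \left[ s_j\!\left( \chi_{x,y} \widetilde{R} \chi_{x,y} \right) \right]^2 \le \left[ s_j\!\left( \chi_{x,y} \widehat{S} \chi_{x,y} \right) \right]^2. \tag{88}$$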
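Taking $\alpha \in \left(0, \frac{1}{2}\right)$, so $2 - \alpha > \frac{3}{2}$, we use (76) and (70) to get
$$\operatorname{Tr} \left| \widetilde{R}_{x,y} \right|^{2 - \alpha} = \sum_j \left[ s_j\!\left( \widetilde{R}_{x,y} \right) \right]^{2 - \alpha} \le \sum_j \left[ s_j\!\left( \chi_{x,y} \widehat{S} \chi_{x,y} \right) \right]^{2 - \alpha} = \operatorname{Tr} \left[ \chi_{x,y} \widehat{S} \chi_{x,y} \right]^{2 - \alpha} \le \operatorname{Tr}\, \chi_{x,y} \widehat{S}^{2 - \alpha} \chi_{x,y} \le c_1(2 - \alpha, \varepsilon_+). \tag{89}$$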
The lemma is proved, since (83) follows from (86), (87) and (89).

We can now finish the proof of Theorem 19. Using (83) and (60), we get
$$\operatorname{Tr}\, V \widetilde{R}^2 V = \operatorname{Tr}\, \widetilde{R} V^2 \widetilde{R} = \sum_{x \in \mathbb{Z}^3} \operatorname{Tr}\, \widetilde{R} \chi_x V^2 \widetilde{R} \le \sum_{x \in \mathbb{Z}^3} \|\chi_x v^2\|_\infty \operatorname{Tr}\, \widetilde{R} \chi_x \widetilde{R} = \sum_{x \in \mathbb{Z}^3} \|\chi_x v^2\|_\infty \operatorname{Tr}\, \chi_x \widetilde{R}^2 \chi_x \le c_3 \sum_{x \in \mathbb{Z}^3} \|\chi_x v^2\|_\infty < \infty, \tag{90}$$
so $\widetilde{R} V$ is a Hilbert-Schmidt operator.

3. Periodic Maxwell Operators and Periodic Boundary Condition

The (non-random) spectrum of a random Maxwell operator can be represented as the union of the spectra of relevant periodic Maxwell operators, which in turn are given as the union of the spectra of finite volume Maxwell operators with periodic boundary condition. This is analogous to the situation for random Schrödinger operators [KM2] and random acoustic operators [FK3]. In this section we study Maxwell operators in periodic media. We say that the operators $\mathbf{M}$, $M$, given by (15), (17) with (14), are $q$-periodic for some $q > 0$, if
$\varepsilon(x)$ is a $q$-periodic function. In this section we work with a given period $q > 0$ and $q$-periodic operators $\mathbf{M}$ and $M$.

3.1. Periodic boundary condition. We start by defining the restriction of such $M$ to a cube with periodic boundary condition. Given a cube $\Lambda = \Lambda_\ell(x)$, where $x \in \mathbb{R}^3$ and $\ell > 0$, we will denote by $\mathring{\Lambda}$ the torus we obtain by identifying the edges of the closed cube $\bar\Lambda$ in the usual way. We introduce the usual distance in the torus:
$$\mathring{d}(x, y) \equiv \min_{m \in \ell\mathbb{Z}^3} |x - y + m| \le \frac{\sqrt{3}\,\ell}{2} \quad \text{for all } x, y \in \bar\Lambda. \tag{91}$$
We will identify functions on $\mathring{\Lambda}$ with their $\ell$-periodic extensions to $\mathbb{R}^3$; for example, $C^1(\mathring{\Lambda}; \mathbb{C}^3)$ will be identified with the space of continuously differentiable, $\ell$-periodic, $\mathbb{C}^3$-valued functions on $\mathbb{R}^3$. We define $\mathring{W}^{1,2}(\mathring{\Lambda}; \mathbb{C}^3)$ as the closure of $C^1(\mathring{\Lambda}; \mathbb{C}^3)$ in $W^{1,2}(\Lambda; \mathbb{C}^3)$.
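The torus distance (91) can be computed coordinate by coordinate, because the Euclidean norm is minimized over $m \in \ell\mathbb{Z}^3$ by choosing the nearest periodic image along each axis separately. The following sketch illustrates this; the function name and argument conventions are assumptions made for the example.

```python
import math

def torus_distance(x, y, side):
    """Distance (91): minimum over m in side * Z^3 of the Euclidean norm |x - y + m|."""
    total = 0.0
    for xi, yi in zip(x, y):
        d = (xi - yi) % side      # representative of the offset in [0, side)
        d = min(d, side - d)      # nearest periodic image along this axis
        total += d * d
    return math.sqrt(total)
```

The largest possible value is attained when every coordinate offset equals half the side, which reproduces the bound $\sqrt{3}\,\ell/2$ stated in (91).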
We will always take ` ∈ q N and define M Λ , the restriction of M to Λ with ◦ periodic 2 boundary condition, as the unique nonnegative self-adjoint operator on L Λ; C3 ∼ = 3 2 L Λ; C , defined by the nonnegative densely defined closed quadratic form ◦ ◦ 1 3 1,2 , (92) MΛ (9, 8) = h∇ × 9, ∇ × 8i, with 9, 8 ∈ W Λ; C ε Λ the inner product being in L2 Λ; C3 . We also have a corresponding Weyl’s decomposition in the torus: L2 Λ; C3 = ◦
◦
SΛ ⊕ GΛ , where n
◦
◦
SΛ = 9 ∈ L2 Λ; C3 ; 9 ∈ C 1 Λ; C3 n
◦
o with ∇ · 9 = 0 , ◦ o
GΛ = 9 ∈ L2 Λ; C3 ; 9 = ∇ϕ with ϕ ∈ C 1 Λ
(93)
.
(94)
◦ ◦ ◦ ◦ ◦ ◦ The spaces SΛ and GΛ are left invariant by M Λ , with GΛ ⊂ D M Λ and M Λ ◦ = 0. ◦ G◦ Λ ◦ ◦ ◦ ◦ We define MΛ as the restriction of M Λ to SΛ , i.e., D MΛ = D M Λ ∩ SΛ and ◦ ◦ ◦ ◦ ◦ MΛ = M Λ ◦ ◦ . Thus MΛ = P◦ M Λ I◦ =M Λ I◦ , with P◦ the orthogonal SΛ
D M Λ ∩ SΛ ◦
projection onto SΛ and I◦ ◦
◦
SΛ
◦
SΛ
SΛ
SΛ
: SΛ → L2 Λ; C3 the restriction of the identity map. ◦
Notice that M Λ = MΛ ⊕ 0 ◦ , and 0 is easily seen to be an eigenvalue of MΛ with GΛ multiplicity three, so ◦ ◦ (95) σ MΛ = σ M Λ .
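If $\varepsilon(x) \equiv 1$ we write $\mathring{\Xi}_\Lambda$, $\mathring{\boldsymbol{\Xi}}_\Lambda$ for $\mathring{M}_\Lambda$, $\mathring{\mathbf{M}}_\Lambda$, respectively. Since $\mathring{\boldsymbol{\Xi}}_\Lambda$ has compact resolvent (its eigenvalues and eigenfunctions can be explicitly computed), and $\mathring{\mathbf{M}}_\Lambda \ge \frac{1}{\varepsilon_+} \mathring{\boldsymbol{\Xi}}_\Lambda$ by (14), we can conclude that $\mathring{\mathbf{M}}_\Lambda$ has compact resolvent.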
◦
◦
3.2. A Combes-Thomas argument for the torus. If z ∈ / σ(M Λ ), we write RΛ (z) = (M Λ −z)−1 . Lemma 24. Let the operator M given by (17) with (14) be q-periodic, and let Λ = Λ` (x0 ) for some x0 ∈ R3 and ` ∈ q N, ` > 2r + 8, where r > 0. Then for any ◦
z∈ / σ(M Λ ) and n ∈ N we have kχx,r
n √3rm◦ ◦ ◦ ◦ z,r,` 9 2 e e−mz,r,` d(x,y) for all x, y ∈Λ, RΛ (z) χy,r k ≤ η ◦
n
with
◦
mz,r,` = 4
√ 2 3 1− 2r+8 `
η
+1
ε−1 − + |z| + η
,
(96)
(97)
◦
where η = dist(z, σ(M Λ )). Proof. The lemma is proved in the same way as [FK3, Lemma 18], with the obvious modifications to take into account that in this lemma we have curls instead of gradients. 3.3. Floquet theory and the spectrum of periodic operators. If k, n ∈ N, we say that k n if n ∈ k N and that k ≺ n if k n and k 6= n. The main result of this section is the following theorem. Theorem 25. Suppose the operator M given by (15) with (14) is q-periodic. Let {`n ; n = 0, 1, 2, . . .} be a sequence in N such that `0 = q and `n ≺ `n+1 for each n = 0, 1, 2, . . .. Then ◦ ◦ σ MΛ`n (0) ⊂ σ MΛ`n +1 (0) ⊂ σ(M) for all n = 0, 1, 2, . . . , (98) and σ(M) =
[
◦ σ MΛ`n (0) .
(99)
n≥1
Related results for periodic Schr¨odinger operators can be found in [Ea], where Floquet theory is used. For the nonsmooth coefficients we are interested in some aspects of the Floquet theory have to be revised. Periodic acoustic operators are treated in [FK3, Theorem 14], with a proof that does not use Floquet theory. In this subsection we will develop an appropriate Floquet theory for our Maxwell operators, and use it to prove Theorem 25. We refer to [RS4, Sect. XIII.6] for the definitions and notations of direct integrals of Hilbert spaces. Let Q = Λ˘q (0) be the basic period cell, Q˜ = Λ˘ 2π (0) the dual basic cell. We define q the Floquet transform
F: L
2
R ;C 3
3
Z
→
⊕ ˜ Q
˜ L2 Q; C3 L2 Q; C3 dk ≡ L2 Q;
(100)
by (F 9)(k, x) =
q 23 X ˜ eik·(x−m) 9(x − m), x ∈ Q, k ∈ Q, 2π 3
(101)
m∈q Z
if 9 has compact support; it extends by continuity to a unitary operator. The q-periodic operator M is decomposable in this direct integral representation, more precisely, Z ⊕ ◦ (102) FMF∗ = M Q (k) dk, ˜ Q
◦
where for each k ∈ R3 we define M Q (k) to be the operator (∇ − ik)× ε1 (∇ − ik)× ◦ on L2 Q; C3 with periodic boundary condition; M Q (k) is rigorously defined as a ◦
self-adjoint operator by the appropriate quadratic form MQ (k) as in (92). As before (∇ − ik)× denotes the operator (∇ − ik)× 8 = (∇ − ik) × 8. We also have Weyl’s ◦ ◦ decompositions for each k ∈ R3 : L2 Q; C3 = SQ (k) ⊕ GQ (k), where
◦
SQ (k) = 9 ∈
◦
L2
Q; C
3
;9∈
C1
◦
Q; C
3
with (∇ − ik) · 9 = 0 , (103)
◦
GQ (k) = 9 ∈ L2 Q; C3 ; 9 = (∇ − ik)ϕ with ϕ ∈ C 1 Q
.
(104)
◦ ◦ ◦ ◦ ◦ The spaces SQ (k) and GQ (k) are left invariant by M Q (k), with GQ (k) ⊂ D M Q (k) ◦ ◦ ◦ ◦ = 0. We define MQ (k) as the restriction of M Q (k) to SQ (k), i.e., and M Q (k) ◦ G (k) ◦ Q ◦ ◦ ◦ ◦ ◦ D MQ (k) = D M Q k ∩ SQ (k) and MQ (k) = M Q (k) ◦ . Thus D M Q (k) ∩SQ (k)
◦
MQ (k) = P◦
◦
◦
=M Q (k)I◦ , with P◦ the orthogonal projection SQ (k) SQ (k) ◦ : SQ (k) → L2 Q; C3 the restriction of the identity map. onto SQ (k) and I◦ SQ (k) ◦ ◦ ◦ ◦ ◦ Notice M Q (k) = MQ (k) ⊕ 0 ◦ , so σ MQ (k) = σ M Q (k) . Each MQ (k) has SQ (k)
M Q (k)I◦
SQ (k) ◦
GQ (k)
compact resolvent. We have Z ⊕◦ SQ (k) dk, FS = ˜ Q
F MF =
Z
⊕ ◦ ˜ Q
MQ (k) dk.
(105)
2π 3 2 Q; C3 q Z we let Up denote the unitary operator on L 3 the function e−ip·x , then for all k ∈ Rd and p ∈ 2π q Z we
In addition, if for each p ∈ given by multiplication by have
∗
∗ M Q (k + p) = Up M Q (k)Up , ◦
(106)
◦
and, since Up SQ (k + p) = SQ (k), we can also think of Up as a unitary operator from ◦
◦
SQ (k + p) to SQ (k), with ◦
◦
∗ MQ (k + p) = Up MQ (k)Up .
(107)
Lemma 26. (i) The mapping ◦ −1 ◦ ∈ L L2 Q; C3 k ∈ R3 7−→ RQ (k) ≡ M Q (k) + I is operator norm continuous. (ii) We have [ ◦ σ M Q (k) σ(M ) =
σ(M) =
and
˜ k∈Q
[
◦ σ MQ (k) .
(108)
(109)
˜ k∈Q
Proof. Let k, h ∈ R3 , 9 ∈ L2 Q; C3 , we have ◦
◦
(110) MQ (k + h)[9]− MQ (k)[9] = 1 1 1 hh × 9, h × 9iQ + ihh × 9, (∇ − ik) × 9iQ − ih(∇ − ik) × 9, h × 9iQ . ε ε ε Using the Cauchy-Schwarz inequality and (14) we get (see [FK3, Proof of Lemma 12] for a similar argument) ◦ 1 ◦ ◦ k9k2Q . (111) MQ (k + h)[9]− MQ (k)[9] ≤ |h| MQ (k)[9] + |h| 1 + |h| ε− If |h| < 1 we have ◦
◦
k(|h|(1 + |h|)ε−1 − + |h| M Q (k)) RQ (k)k 1 1 +2 ≤2 + 1 |h|. (112) ≤ |h| 1 + |h| ε− ε− If we now require 2 ε1 − + 1 |h| ≤ 21 , we can use [Ka, Theorem VI.3.9] to conclude that ◦ ◦ 1 + 1 |h|. (113) k RQ (k + h)− RQ (k)k ≤ 32 ε− Part (i) of the lemma is proved; part (ii) follows from (i) by standard arguments. ◦
◦
If ` ∈ q Z3 , similar considerations apply to the operators M Λ` (0) and MΛ` (0) , which ◦
are q-periodic on the torus Λ` (0). The Floquet transform ◦ M F` : L2 Λ` (0); C3 → L2 Q; C3 3 ˜ k∈ 2π ` Z ∩Q
(114)
is a unitary operator now defined by (F` 9)(k, x) = where x ∈ Q, k ∈ interpreted in the torus
q 23 `
X
eik·(x−m) 9(x − m),
(115)
m∈q Z3 ∩Λ˘ ` (0)
2π 3 2 ˜ ` Z ∩ Q, 9 ∈ L ◦ Λ` (0). We also have
◦ 3 Λ` (0); C , 9(x − m) being properly M
◦
F` M Λ` (0) F`∗ =
◦
M Q (k),
(116)
3 ˜ k∈ 2π ` Z ∩Q
and ◦
F` SΛ` (0) =
M
◦
SQ (k),
◦
F` MΛ` (0) F` ∗ =
3 ˜ k∈ 2π ` Z ∩Q
Thus we have [
◦
σ(M Λ` (0) ) =
M
◦
MQ (k).
(117)
3 ˜ k∈ 2π ` Z ∩Q
◦ σ M Q (k)
and
[
◦
σ(MΛ` (0) ) =
3 ˜ k∈ 2π ` Z ∩Q
◦ σ MQ (k) .
3 ˜ k∈ 2π ` Z ∩Q
(118) Theorem 25 is an immediate consequence of (118) and Lemma 26. 4. Location of the Spectrum of Random Operators In this section we prove Theorem 3. Since we already proved Theorem 25, the proof proceeds almost exactly as in [FK3, Sect. 4], so we will only outline the key steps. In order to investigate the samples of the random quantity εg,ω (x), for a fixed g, we set (119) Tg = {τ : τ = {τi , i ∈ Z3 }, −g ≤ τi ≤ g}, Tg(n) = {τ ∈ T : τi+nj = τi for all i, j ∈ Z3 }, n ∈ N, and
Tg(∞) =
[
Tg(n) .
(120) (121)
nq
For τ ∈ Tg we let
ετ (x) = ε0 (x) 1 +
X
τi u(x − i)
(122)
i∈Z3
and M (τ ) = M (ετ ), M(τ ) = M(ετ ).
(123)
We recall (21). To approximate Maxwell operators by periodic operators, given τ ∈ Tg , n ∈ N and x ∈ R3 , we specify τΛn (x) ∈ Tg(n) by requiring τΛn (x) i = τi for all i ∈ Λ˘n (x) ∩ Z3 , and define (124) MΛn (x) (τ ) = M (τΛn (x) ).
The following lemma shows that the (nonrandom) spectrum of the random Maxwell operator Mg is determined by the spectra of the periodic Maxwell operators M (τ ), τ ∈ Tg(∞) . The analogous result for random Schr¨odinger operators was proven by Kirsch and Martinelli [KM2, Theorem 4]. Lemma 27. Let the random operator Mg defined by (28) satisfy Assumption 1, and let [
Σg =
σ (M (τ )).
(125)
τ ∈Tg(∞)
Then σ(Mg ) = Σg with probability one. Proof. Same proof as [FK3, Lemma 19]. Given a real number h; |h| <
1 U+ ,
let
M (h) = M (εh ) , M(h) = M(εh ) with εh (x) = ε0 (x) [1 + hU (x)] .
(126)
If |h| ≤ g, and we define τ (h) ∈ Tg by τ (h)i = h for all i ∈ Z3 , we have εh = ετ (h) and M (h) = M (τ (h)), M(h) = M(τ (h)). Lemma 28. Let M (h), |h| <
1 U+ ,
be given by (126), with ε0 and U given in Assumption
1. Let Λ = Λ` (x0 ) for some x0 ∈ R3 and ` q. The positive self-adjoint operator ◦
M(h)Λ has compact resolvent and 0 as an eigenvalue, so let 0 < µ1 (h) ≤ µ2 (h) ≤ . . . be its nonzero eigenvalues, repeated according to their (finite) multiplicity. Then each µj (h), j = 1, 2, . . ., is a Lipschitz continuous, strictly decreasing function of h, with µj (h1 ) − µj (h2 ) ≤ δ+ (g) min{µj (hl )} l=1,2 h2 − h1
δ− (g) max{µj (hl )} ≤ l=1,2
for any h1 , h2 ∈ (−g, g), 0 < g <
1 U+ ,
(127)
where δ± (g) are given in (27).
Proof. Same proof as [FK3, Lemma 20]. The following corollary follows immediately from Theorem 25, Lemmas 27 and 28, and the min-max principle. Corollary 29. Let the random operator Mg defined by (28) satisfy Assumption 1, and let {`n ; n = 0, 1, 2, . . .} be a sequence in N such that `0 = q and `n ≺ `n+1 for each n = 0, 1, 2, . . .. Then Σg =
[
σ (M (h)) =
h∈[−g,g]
[
[
◦ σ M (h)Λ`n (0) .
(128)
h∈[−g,g] n≥1
In particular, Σg is increasing in g. Theorem 3 is now proven as in [FK3, Subsect. 4.2], using Theorem 25, Lemma 28 and Corollary 29, and taking (21) and (95) into account.
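5. Dirichlet Boundary Condition for Maxwell Operators

Given an open cube $\Lambda$ in $\mathbb{R}^3$ and $M$ as in (17), we will denote by $M_\Lambda$ the restriction of $M$ to $\Lambda$ with Dirichlet boundary condition, i.e., $M_\Lambda$ is the nonnegative self-adjoint operator on $L^2(\Lambda; \mathbb{C}^3)$, uniquely defined by the nonnegative quadratic form given as the closure of
$$\mathcal{M}_\Lambda(\Psi, \Phi) = \left\langle \nabla \times \Psi, \frac{1}{\varepsilon} \nabla \times \Phi \right\rangle, \quad \Psi, \Phi \in C_0^1(\Lambda; \mathbb{C}^3), \tag{129}$$
the inner product being in $L^2(\Lambda; \mathbb{C}^3)$. If $\varepsilon(x) \equiv 1$, we write $\Xi_\Lambda$ for $M(1)_\Lambda$. $\Xi_\Lambda$ has an operator core consisting of functions which are $C^2$ up to $\partial\Lambda$ and whose tangential component vanishes on $\partial\Lambda$. (For a discussion of boundary conditions for Maxwell operators in bounded domains see [BS].) We will need this last description to find all eigenvalues for $\Xi_\Lambda$. This is all given in the next theorem.

Some notation. If $\Psi \in C(\bar\Lambda; \mathbb{C}^3)$, we use $\Psi_\nu$ and $\Psi_\tau$ to denote its (outer) normal and tangential components on $\partial\Lambda$.

Theorem 30. Let $\Lambda$ be an open cube of side $L$ in $\mathbb{R}^3$.
(i) The dense linear subset
$$D_\Lambda^D = \left\{ \Psi \in C^2(\bar\Lambda; \mathbb{C}^3);\ \Psi_\tau \equiv 0 \right\} \tag{130}$$
is an operator core for $\Xi_\Lambda$, with $\Xi_\Lambda \Psi = \nabla \times \nabla \times \Psi$ for $\Psi \in D_\Lambda^D$.
(ii) The operator $\Xi_\Lambda$ has an orthogonal basis of eigenfunctions
$$\boldsymbol{\Psi} = \left\{ \Psi_{\mu,j} \in D_\Lambda^D;\ \mu \in \frac{\pi}{L} \left\{ \mathbb{N}^3 \cup \left[ \{0\} \times \mathbb{N}^2 \right] \cup \left[ \mathbb{N} \times \{0\} \times \mathbb{N} \right] \cup \left[ \mathbb{N}^2 \times \{0\} \right] \right\},\ j = 0, 1, 2 \right\}, \tag{131}$$
with
$$\nabla \times \Psi_{\mu,0} = 0, \quad \Psi_{\mu,0} = \nabla \varphi_{\mu,0} \ \text{with } \varphi_{\mu,0} \in C_0^\infty(\bar\Lambda); \tag{132}$$
$$\nabla \times \nabla \times \Psi_{\mu,j} = |\mu|^2 \Psi_{\mu,j}, \quad \nabla \cdot \Psi_{\mu,j} = 0, \quad j = 1, 2. \tag{133}$$
More precisely, if $\Lambda = \Lambda_L(x_0)$, we can take
$$\Psi_{\mu,j}(x) = \Phi_{\mu,j}\!\left( x - x_0 + \frac{L}{2}(1, 1, 1) \right), \quad \Phi_{\mu,j}(x) = \begin{pmatrix} a_1^{(\mu,j)} \cos(\mu_1 x_1) \sin(\mu_2 x_2) \sin(\mu_3 x_3) \\ a_2^{(\mu,j)} \sin(\mu_1 x_1) \cos(\mu_2 x_2) \sin(\mu_3 x_3) \\ a_3^{(\mu,j)} \sin(\mu_1 x_1) \sin(\mu_2 x_2) \cos(\mu_3 x_3) \end{pmatrix}, \tag{134}$$
where for each $\mu \in \frac{\pi}{L} \left\{ \mathbb{N}^3 \cup \left[ \{0\} \times \mathbb{N}^2 \right] \cup \left[ \mathbb{N} \times \{0\} \times \mathbb{N} \right] \cup \left[ \mathbb{N}^2 \times \{0\} \right] \right\}$ we set $a^{(\mu,0)} = \mu$ and pick $a^{(\mu,1)}, a^{(\mu,2)} \in \mathbb{R}^3$ such that $\{a^{(\mu,j)};\ j = 0, 1, 2\}$ is an orthogonal basis for $\mathbb{R}^3$.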
D Proof. Let the operator 0Λ be defined by 0Λ 9 = ∇ × ∇ × 9 for 9 ∈ DΛ . To see that 3 1 ¯ it is a symmetric operator on its domain, notice that for 8, 9 ∈ C (Λ; C ) we have Z Z ¯ × 9)d3 x = ¯ × 9)ν dS, ∇ · (8 (8 (135) h∇ × 8, 9i − h8, ∇ × 9i = Λ
∂Λ
where the inner products are in L2 Λ; C3 , dS is the surface measure, and we used ¯ × 9)ν ≡ 0, so we can conclude that the Gauss’ Theorem. If 8τ ≡ 0, we must have (8 surface integral in (135) equals 0. We proceed as in [RS4, Proof of Proposition 1 in Sect. XIII.15]. To show that the symmetric operator 0Λ is essentially self-adjoint,it suffices to exhibit an orthogo π D . Since cos(nx); n ∈ L ({0} ∪ N) and nal basis of eigenfunctions in its domain DΛ π sin(nx); n ∈ L N are both orthogonal bases for L2 ((0, L)), it follows that 9 = {9µ,j }, given in (134), is an orthogonal basis for L2 (Λ; C3 ). Since 8µ,0 = ∇[sin(µ1 x1 ) sin(µ2 x2 ) sin(µ3 x3 )],
(136)
D we clearly have (132). It is straightforward to check that 9 ⊂ DΛ and 9 also satisfies (133), so it is an orthogonal basis of eigenfunctions for the operator 0Λ . To finish the proof of the theorem, it suffices to show that 4Λ is the closure 0Λ D of 0Λ . To do that, notice that C02 (Λ; C3 ) ⊂ DΛ ⊂ Q(0Λ ), where for a self-adjoint operator A we use Q(A) to denote the domain of the corresponding quadratic form. As quadratic forms, we clearly have 4Λ [9] = 0Λ [9] for 9 ∈ C02 (Λ; C3 ), which is a form D core for 4Λ as a quadratic form, hence Q(4Λ ) ⊂ Q(0Λ ). Since DΛ is a form core for 0Λ as a quadratic form, to finish the proof of the theorem, it is enough to show that D ⊂ Q(4Λ ), so Q(0Λ ) ⊂ Q(4Λ ). DΛ D Thus, given 9 ∈ DΛ , it suffices to find 9n ∈ C01 (Λ; C3 ) such that
k9 − 9n k + k∇ × (9 − 9n )k → 0.
(137)
Translating and scaling, if necessary, we can assume that Λ = Λ2 (0) = (−1, 1)3 . For each n = 1, 2, . . . we select a function ηn ∈ C 2 ([−1, 1]), 0 ≤ ηn (t) ≤ 1, such that n+ 1 n ηn (t) = 1 for |t| ≤ n+1 and ηn (t) = 0 for n+12 ≤ |t| ≤ 1. We set 8n (x) = η¯n (x)Θn (x), where η¯n (x) = ηn (x1 )ηn (x2 )ηn (x3 ) and n+1 n 9( n x), if |x1 |, |x2 |, |x3 | ≤ n+1 ; n n 9(x1 , x2 , ±1), if |x1 |, |x2 | ≤ n+1 , n+1 < ±x3 ≤ 1; n n (138) Θn (x) = 9(x1 , ±1, x3 ), if |x1 |, |x3 | ≤ n+1 , n+1 < ±x2 ≤ 1; n n 9(±1, x , x ), if |x |, |x | ≤ , < ±x ≤ 1; 2 3 2 3 1 n+1 n+1 0, otherwise. We have 8n ∈ C0 (Λ; C3 ), and 8n is piecewise C 1 with bounded partial derivatives, so ∇ × 8n ∈ L2 (Λ; C3 ). In addition, ∇ × 8n = η¯n (∇ × Θn ) + (∇η¯n ) × Θn = η¯n (∇ × Θn ),
(139)
since (∇η¯n ) × Θn = 0 by our construction as 9τ ≡ 0. If each 8n was a C 1 -function, instead of only piecewise C 1 , we would be done, since 9n = 8n clearly satisfies (137). To repair that we set 9n = γn ∗ 8n , where {γn } is a suitably chosen approximate
3 identity, i.e., positive C ∞ function γ on R3 with support on R γn (x) = n γ(nx) for some Λ1 (0) and γ(x)dx = 1, so 9n ∈ C01 (Λ; C3 ), ∇ × 9n = γn ∗ (∇ × 9n ), and (137) is satisfied.
The Weyl decomposition corresponding to Dirichlet boundary condition is given by L (Λ; C3 ) = SΛ ⊕ GΛ , where GΛ and SΛ are the closed subspaces spanned by {9µ,0 } and {9µ,j , j = 1, 2}, respectively, where {9µ,j , j = 0, 1, 2} is the orthogonal basis given in (131). It is easy to see that 2
GΛ = {9 ∈ C01 (Λ; C3 ); 9 = ∇ϕ with ϕ ∈ C01 (Λ)},
(140)
SΛ = {9 ∈ L (Λ; C ); ∇ · 9 = 0 weakly }
(141)
2
3
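The spaces $S_\Lambda$ and $G_\Lambda$ are left invariant by $M_\Lambda$, with $G_\Lambda \subset \mathcal{D}(M_\Lambda)$ and $M_\Lambda|_{G_\Lambda} = 0$. We define $\mathbf{M}_\Lambda$ as the restriction of $M_\Lambda$ to $S_\Lambda$, i.e., $\mathcal{D}(\mathbf{M}_\Lambda) = \mathcal{D}(M_\Lambda) \cap S_\Lambda$ and $\mathbf{M}_\Lambda = M_\Lambda|_{\mathcal{D}(M_\Lambda) \cap S_\Lambda}$. Notice $M_\Lambda = \mathbf{M}_\Lambda \oplus 0_{G_\Lambda}$, $0 \notin \sigma(\mathbf{M}_\Lambda)$, so $\sigma(\mathbf{M}_\Lambda) = \sigma(M_\Lambda) \setminus \{0\}$. $M_\Lambda$ and $\mathbf{M}_\Lambda$ will be called Dirichlet Maxwell operators. We write $\boldsymbol{\Xi}_\Lambda$ for $\mathbf{M}_\Lambda(1)$. Notice that $\boldsymbol{\Xi}_\Lambda$ is a strictly positive operator with discrete spectrum; the same being true of $\mathbf{M}_\Lambda$ in view of (14).

Corollary 31. Let $M$ be as in (17) with (14), and let $\Lambda$ be an open cube in $\mathbb{R}^3$. Then
(i) $\mathbf{M}_\Lambda$ has compact resolvent; in fact $\operatorname{Tr}\left\{ (\mathbf{M}_\Lambda + I)^{-p} \right\} < \infty$ for any $p > \frac{3}{2}$.
(ii) For any $E > 0$ let $n_{\varepsilon,\Lambda}(E)$ denote the number of eigenvalues of $\mathbf{M}_\Lambda$ less than $E$, each eigenvalue counted as many times as its multiplicity. There exists a finite constant $C_0$, independent of $\Lambda$ and $\varepsilon$, such that
$$n_{\varepsilon,\Lambda}(E) \le C_0\, \varepsilon_+^{3/2}\, |\Lambda|\, E^{3/2}. \tag{142}$$

Proof. We clearly have $\mathbf{M}_\Lambda \ge \frac{1}{\varepsilon_+} \boldsymbol{\Xi}_\Lambda$, so it suffices to prove the corollary for $\boldsymbol{\Xi}_\Lambda$. It follows from Theorem 30(ii) that the spectrum of $\boldsymbol{\Xi}_\Lambda$ consists of eigenvalues whose multiplicity can be read from (131), so an explicit calculation gives $\operatorname{Tr}\left\{ (\boldsymbol{\Xi}_\Lambda + I)^{-p} \right\} < \infty$ for any $p > \frac{3}{2}$. A similar calculation gives (142).

Remark 32. $n_{\varepsilon,\Lambda}(E)$ is also equal to the number of strictly positive eigenvalues of $M_\Lambda$ less than $E$, each eigenvalue counted as many times as its multiplicity.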
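The $|\Lambda| E^{3/2}$ scaling in (142) can be seen from a lattice point count. The sketch below is a simplified illustration for $\varepsilon \equiv 1$, under stated assumptions: it counts only the indices $\mu \in \frac{\pi}{L}\mathbb{N}^3$ of (131) with $|\mu|^2 < E$, ignoring the indices with one vanishing component and the multiplicity coming from $j = 1, 2$, and compares the count with the volume of the positive octant of a ball. The function names are hypothetical.

```python
import math

def count_interior_modes(side, energy):
    """Count mu in (pi/side) * N^3 (all components >= 1) with |mu|^2 < energy."""
    kmax = int(math.sqrt(energy) * side / math.pi) + 1
    scale = (math.pi / side) ** 2
    count = 0
    for a in range(1, kmax + 1):
        for b in range(1, kmax + 1):
            for c in range(1, kmax + 1):
                if scale * (a * a + b * b + c * c) < energy:
                    count += 1
    return count

def octant_ball_bound(side, energy):
    """Volume of the positive octant of the ball of radius sqrt(E) L / pi,
    which equals |Lambda| E^{3/2} / (6 pi^2) with |Lambda| = L^3."""
    return side ** 3 * energy ** 1.5 / (6.0 * math.pi ** 2)
```

Each counted index $(a, b, c)$ is the far corner of a unit cube contained in that octant, so the count never exceeds the volume; this is the mechanism behind a bound of the form $C\,|\Lambda|\,E^{3/2}$.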
6. A Wegner-Type Estimate Given an open cube Λ in R3 , we will denote by Mg,Λ = Mg,ω,Λ the restriction of the random operator Mg,ω to Λ with Dirichlet boundary condition. Notice that Mg,ω,Λ is a random operator on L2 (Λ), measurability follows from [FK3, Theorem 38]. Each Mg,ω,Λ has compact resolvent by Corollary 31(i). For any E > 0 we define ng,Λ (E) = ng,ω,Λ (E) as the number of strictly positive eigenvalues of Mg,ω,Λ less than E. Notice that ng,ω,Λ (E) is the distribution function of the measure ng,ω,Λ (dE) on (0, ∞) given by Z h(E)ng,ω,Λ (dE) = Tr(h(Mg,ω,Λ )) = Tr(h(Mg,ω,Λ )) (143) for positive continuous functions h with compact support in (0, ∞).
We will say that the random operator $M_g$ defined by (28) satisfies Assumption 1′, if it satisfies all of Assumption 1 with the exception of the requirement that $\varepsilon_0(x)$ be a $q$-periodic function. We have the following "a priori" estimate, which is an immediate consequence of Corollary 31(ii), (26) and Assumption 1(iv).

Lemma 33. Let the random operator $M_g$ defined by (28) satisfy Assumption 1′. There exists a finite constant $C_1$, depending only on $\varepsilon_{0,+}$, such that we have
$$n_{g,\omega,\Lambda}(E) \le C_1 |\Lambda| E^{3/2} \tag{144}$$
for all $\omega \in [-1, 1]^{\mathbb{Z}^3}$, for all $E > 0$ and all open cubes $\Lambda$ in $\mathbb{R}^3$.

Theorem 34 (Wegner-type estimate). Let the random operator $M_g$ defined by (28) satisfy Assumption 1′. There exists a constant $Q < \infty$, depending only on the constants $r_u$ and $\varepsilon_{0,+}$, such that
$$\mathbb{P}\left\{ \operatorname{dist}(\sigma(M_{g,\omega,\Lambda}), E) \le \eta \right\} \le Q\, \frac{U_- + 2U_+}{g U_+ (1 - g U_+) U_-}\, \|\rho\|_\infty\, |E|^{1/2}\, \eta\, |\Lambda|^2 \tag{145}$$
for all $E > 0$, open cubes $\Lambda$ in $\mathbb{R}^3$, and all $\eta \in [0, E)$.

Proof. The proof is exactly the same as the proof of [FK3, Theorem 23], with the proviso that we only integrate $n_{g,\omega,\Lambda}(E)$ against positive continuous functions with compact support in $(0, \infty)$.
7. Localization
Theorems 6 and 7 are proved exactly as in [FK3], applying a multiscale analysis appropriate for random perturbations of periodic operators on $\mathbb{R}^3$ [FK3, Theorems 29 and 35] to operators $M_g$ as in (28).
Let the operator $M$ be as in (17) with (14). Given an open cube $\Lambda$ in $\mathbb{R}^3$, $M_\Lambda$ is the restriction of $M$ to $\Lambda$ with Dirichlet boundary condition (see Sect. 5). Each $M_\Lambda$ is a nonnegative self-adjoint operator on $L^2(\Lambda;\mathbb{C}^3)$ with compact resolvent $R_\Lambda(z) = (M_\Lambda - z)^{-1}$. If $\Lambda = \Lambda_L(x)$, we will write $M_{x,L} = M_{\Lambda_L(x)}$ and $R_{x,L}(z) = R_{\Lambda_L(x)}(z)$. The norm in $L^2(\Lambda;\mathbb{C}^3)$ and also the corresponding operator norm will both be denoted by $\|\ \|_{x,L}$. If $\Lambda_1 \subset \Lambda_2$ are open cubes, $J_{\Lambda_1}^{\Lambda_2}: L^2(\Lambda_1;\mathbb{C}^3) \to L^2(\Lambda_2;\mathbb{C}^3)$ is the canonical injection. If $\Lambda_i = \Lambda_{L_i}(x_i)$, $i = 1, 2$, we write $\|\ \|_{x_1,L_1}^{x_2,L_2}$ for the (operator) norm in $B\bigl(L^2(\Lambda_{L_1}(x_1);\mathbb{C}^3), L^2(\Lambda_{L_2}(x_2);\mathbb{C}^3)\bigr)$ and $J_{x_1,L_1}^{x_2,L_2} = J_{\Lambda_{L_1}(x_1)}^{\Lambda_{L_2}(x_2)}$. If $\varphi \in L^\infty(\Lambda)$, we also use $\varphi$ to denote the operator on $L^2(\Lambda;\mathbb{C}^3)$ given by multiplication by $\varphi$; if $\Phi \in L^\infty(\Lambda;\mathbb{C}^3)$ we write $\Phi^\times$ for the operator $\Phi\times$, i.e., $\Phi^\times\Psi = \Phi \times \Psi$.
7.1. The basic technical tools. The results of [FK3, Subsects. 6.1 and 6.3] are valid for the Maxwell operator $M$, with the obvious modifications. We state the key results for completeness. We start with the smooth resolvent identity (SRI), which is used to relate resolvents in different scales.
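Lemma 35 (SRI). Let the operator $M$ be given by (17) with (14), let $\Lambda_1 \subset \Lambda_2$ be open cubes in $\mathbb{R}^3$, and let $\varphi_1 \in C_0^1(\Lambda_1)$. Then, for any $z \notin \sigma(M_{\Lambda_1}) \cup \sigma(M_{\Lambda_2})$ we have
\[ R_{\Lambda_2}(z) J_{\Lambda_1}^{\Lambda_2} \varphi_1 = J_{\Lambda_1}^{\Lambda_2} \varphi_1 R_{\Lambda_1}(z) + R_{\Lambda_2}(z)\Bigl[ -J_{\Lambda_1}^{\Lambda_2} (\nabla\varphi_1)^\times \tfrac{1}{\varepsilon} \nabla^\times + \nabla^\times \tfrac{1}{\varepsilon} J_{\Lambda_1}^{\Lambda_2} (\nabla\varphi_1)^\times \Bigr] R_{\Lambda_1}(z) \tag{146} \]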
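as quadratic forms on $L^2(\Lambda_2;\mathbb{C}^3) \times L^2(\Lambda_1;\mathbb{C}^3)$.
Proof. The lemma follows immediately from [FK3, Lemma 24] and the definition of Dirichlet boundary condition.
To take into account the periodicity of the background medium, $q \in \mathbb{N}$ being the period (see Assumption 1), we work with boxes $\Lambda_L(x)$ with $x \in q\mathbb{Z}^3$ and $L \in 2q\mathbb{N}$, so the background is the same in all boxes in a given scale $L$. For such boxes (with $L \ge 4q$) we set
\[ \Upsilon_L(x) = \{ y \in q\mathbb{Z}^3;\ \|y - x\| = \tfrac{L}{2} - q \} \tag{147} \]
and
\[ \tilde\Upsilon_L(x) = \Lambda_{L-q}(x) \setminus \Lambda_{L-3q}(x), \qquad \hat\Upsilon_L(x) = \Lambda_{L-\frac{3q}{2}}(x) \setminus \Lambda_{L-\frac{5q}{2}}(x). \tag{148} \]
We also set
\[ \chi_x = \chi_{x,q} \quad\text{and}\quad \Gamma_{x,L} = \chi_{\tilde\Upsilon_L(x)}, \quad \hat\Gamma_{x,L} = \chi_{\hat\Upsilon_L(x)}. \tag{149} \]
Notice
\[ \Gamma_{x,L} = \sum_{y \in \Upsilon_L(x)} \chi_y \quad \text{a.e.} \tag{150} \]
and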
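\[ |\Upsilon_L(x)| \le 3(L - 2q + 1)^2. \tag{151} \]
In addition each $\Lambda_L(x)$ will be equipped with a function $\Phi_{x,L}$ constructed in the following way: we fix an even function $\xi \in C_0^1(\mathbb{R})$ with $0 \le \xi(t) \le 1$ for all $t \in \mathbb{R}$, such that $\xi(t) = 1$ for $|t| \le \frac{q}{4}$, $\xi(t) = 0$ for $|t| \ge \frac{3q}{4}$, and $|\xi'(t)| \le \frac{3}{q}$ for all $t \in \mathbb{R}$. (Such a function always exists.) We define
\[ \xi_L(t) = \begin{cases} 1, & \text{if } |t| \le \frac{L}{2} - \frac{5q}{4} \\ \xi\bigl(|t| - (\frac{L}{2} - \frac{3q}{2})\bigr), & \text{if } |t| \ge \frac{L}{2} - \frac{3q}{2} \end{cases} \tag{152} \]
and set $\Phi_{x,L}(y) = \Phi_L(y - x)$ for $y \in \mathbb{R}^3$, with
\[ \Phi_L(y) = \prod_{i=1}^{3} \xi_L(y_i). \tag{153} \]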
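The existence claim for the cutoff can be made concrete. The following is a minimal sketch, assuming one particular choice of $\xi$ (a cubic "smoothstep" blend, which is not necessarily the function the authors have in mind); the names `xi`, `xi_L` and `phi_L` are hypothetical. The blend goes from 1 to 0 over an interval of length $q/2$ with maximal slope $1.5/(q/2) = 3/q$, so it meets the stated bound on the derivative.

```python
def xi(t: float, q: float) -> float:
    """Even C^1 cutoff (assumed choice): 1 for |t| <= q/4, 0 for |t| >= 3q/4."""
    u = (abs(t) - q / 4.0) / (q / 2.0)  # rescale the transition zone to [0, 1]
    if u <= 0.0:
        return 1.0
    if u >= 1.0:
        return 0.0
    # 1 - smoothstep(u); its derivative vanishes at u = 0 and u = 1, so xi is C^1
    return 1.0 - (3.0 * u * u - 2.0 * u ** 3)


def xi_L(t: float, L: float, q: float) -> float:
    """Eq. (152): plateau at 1, then the shifted cutoff near the boundary of the box."""
    a = abs(t)
    if a <= L / 2.0 - 5.0 * q / 4.0:
        return 1.0
    return xi(a - (L / 2.0 - 3.0 * q / 2.0), q)


def phi_L(y, L: float, q: float) -> float:
    """Eq. (153): product of xi_L over the three coordinates of y."""
    out = 1.0
    for yi in y:
        out *= xi_L(yi, L, q)
    return out
```

With this choice, `phi_L` equals 1 on the inner cube and vanishes once some coordinate satisfies $|y_i| \ge \frac{L}{2} - \frac{3q}{4}$.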
We have $\Phi_{x,L} \in C_0^1(\Lambda_L(x))$, $0 \le \Phi_{x,L} \le 1$,
\[ \chi_{x,\frac{L}{2}-\frac{5q}{4}}\,\Phi_{x,L} = \chi_{x,\frac{L}{2}-\frac{5q}{4}}, \qquad \chi_{x,\frac{L}{2}-\frac{3q}{4}}\,\Phi_{x,L} = \Phi_{x,L}, \tag{154} \]
and
\[ \hat\Gamma_{x,L}\,\nabla\Phi_{x,L} = \nabla\Phi_{x,L}, \qquad |\nabla\Phi_{x,L}| \le \frac{3\sqrt{3}}{q}. \tag{155} \]
We can now state a Simon-Lieb-type inequality (SLI); it is used to obtain decay in a larger scale from decay in a given scale.
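Lemma 36 (SLI). Let the operator $M$ be given by (17) with (14). Then for any $\ell, L \in 2q\mathbb{N}$ with $4q \le \ell < L - 3q$, $x, y \in q\mathbb{Z}^3$ with $2\|y - x\| \le L - \ell - 3q$ (so $\Lambda_\ell(y) \subset \Lambda_{L-3q}(x)$), and $z \notin \sigma(M_{x,L}) \cup \sigma(M_{y,\ell})$, we have
\[ \|\Gamma_{x,L} R_{x,L}(z) \chi_y\|_{x,L} \le \gamma_z\,\ell^2\,\|\Gamma_{y,\ell} R_{y,\ell}(z) \chi_y\|_{y,\ell}\,\|\Gamma_{x,L} R_{x,L}(z) \chi_{y'}\|_{x,L} \tag{156} \]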
for some y 0 ∈ Υy,` , with √ √ 18 3 1 √ γz = Θ q4 ε+ ε− + √ (1 + |z|), qε− ε−
(157)
where $\Theta_{q/4}$ is the constant given in Corollary 12.
Proof. The lemma is proved as [FK3, Lemma 26], using Lemma 35 and Corollary 12.
The eigenfunction decay inequality (EDI) is used to obtain decay for generalized eigenfunctions from decay of local resolvents.
Lemma 37 (EDI). Let the operator $M$ be given by (17) with (14), and let $\psi$ be a generalized eigenfunction for a given $z \in \mathbb{C}$. For any $x \in q\mathbb{Z}^3$ and $\ell \in 2q\mathbb{N}$ with $\ell \ge 4q$, such that $z \notin \sigma(M_{x,\ell})$, we have
\[ \|\chi_x \psi\| \le \gamma_z\,\ell^2\,\|\Gamma_{x,\ell} R_{x,\ell}(z^*) \chi_x\|_{x,\ell}\,\|\chi_y \psi\| \tag{158} \]
for some $y \in \Upsilon_\ell(x)$, with $\gamma_z$ as in (157).
Proof. Same proof as [FK3, Lemma 27].
The starting hypothesis for the multiscale analysis [FK3, (P1) in Theorem 29 and (H1) in Theorem 35] is formulated for operators with Dirichlet boundary condition. But under the hypotheses of Theorems 6 and 7 the natural starting hypothesis is the analogue of either (P1) or (H1) for periodic boundary condition. The following lemma enables us to go from periodic boundary condition to Dirichlet boundary condition. For $M_g$ as in (28) satisfying Assumption 1, $x \in q\mathbb{Z}^3$ and $L \in 2q\mathbb{N}$, we set (with the notation of (124))
\[ M^{\circ}_{g,\omega,x,L} = \bigl(M((g\omega)_{\Lambda_L(x)})\bigr)^{\circ}_{\Lambda_L(x)}, \tag{159} \]
which is a random operator by [FK3, Theorem 38]. We write $R^{\circ}_{g,\omega,x,L}(z)$ for its resolvent.
Lemma 38. Let $M_g$ be as in (28) satisfying Assumption 1. Let $E > 0$, $x \in q\mathbb{Z}^3$ and $L \in 2q\mathbb{N}$, $L \ge 4q$; set $\hat L = L + [2r_u]_{2q} + 2q$. If $\omega$ is such that $E \notin \sigma(M_{g,\omega,x,L}) \cup \sigma(M^{\circ}_{g,\omega,x,\hat L})$, then
\[ \|\Gamma_{x,L} R_{g,\omega,x,L}(E) \chi_x\|_{x,L} \le \Bigl(1 + \frac{3\sqrt{3}}{q\varepsilon_-}\bigl(1 + 2(1 + E)\|R_{g,\omega,x,L}(E)\|_{x,L}\bigr)\Bigr)\,\|\Gamma_{x,L} R^{\circ}_{g,\omega,x,\hat L}(E) \chi_x\|_{x,\hat L}. \tag{160} \]
Proof. Same proof as [FK3, Lemma 37]. 7.2. The proofs of localization. Theorems 6 and 7 can now be proved exactly as in [FK3], using Theorems 3, 25, 34, and Lemmas 24, 27, 36, 37, 38, so we refer the reader to [FK3, Subsects. 6.4 and 6.5]. Acknowledgement. Effort sponsored by the Air Force Office of Scientific Research, Air Force Materials Command, USAF, under grant F49620-94-1-0172, and by the Division of Mathematical Sciences of the National Science Foundation, under grant DMS-9500720. The US Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Air Force Office of Scientific Research, the National Science Foundation, or the US Government.
References
[An2] Anderson, P. W.: A Question of Classical Localization. A Theory of White Paint. Philos. Mag. B 53, 505–509 (1958)
[B] Berezanskii, Iu. M.: Expansions in Eigenfunctions of Selfadjoint Operators. Providence, RI: AMS, 1968
[BS] Birman, M. Sh., Solomyak, M. Z.: L2-Theory of the Maxwell Operator in Arbitrary Domains. Russ. Math. Surv. 42:6, 75–96 (1987)
[CH] Combes, J., Hislop, P.: Localization for some Continuous, Random Hamiltonians in d-dimensions. J. Funct. Anal. 124, 149–180 (1994)
[DE] Development and Applications of Materials Exhibiting Photonic Band Gaps. J. Optical Soc. of Am. B 10, 280–413 (1993)
[DK] Dreifus, H., Klein, A.: A new proof of localization in the Anderson tight binding model. Commun. Math. Phys. 124, 285–299 (1989)
[Ea] Eastham, M.: The Spectral Theory of Periodic Differential Equations. Edinburgh: Scottish Academic Press, 1973
[FK1] Figotin, A., Klein, A.: Localization Phenomenon in Gaps of the Spectrum of Random Lattice Operators. J. Stat. Phys. 75, 997–1021 (1994)
[FK2] Figotin, A., Klein, A.: Localization of Electromagnetic and Acoustic Waves in Random Media. Lattice Model. J. Stat. Phys. 76, 985–1003 (1994)
[FK3] Figotin, A., Klein, A.: Localization of Classical Waves I: Acoustic Waves. Commun. Math. Phys. 180, 439–487 (1996)
[FK4] Figotin, A., Klein, A.: Localized Classical Waves Created by Defects. J. Stat. Phys. 86, 165–177 (1997)
[FKu1] Figotin, A., Kuchment, P.: Band-Gap Structure of Spectra of Periodic Dielectric and Acoustic Media. I. Scalar Model. SIAM J. Appl. Math. 56, 68–88 (1996)
[FKu2] Figotin, A., Kuchment, P.: Band-Gap Structure of Spectra of Periodic Dielectric and Acoustic Media. II. 2D Photonic Crystals. SIAM J. Appl. Math. 56, 1561–1670 (1996)
[FMSS] Fröhlich, J., Martinelli, F., Scoppola, E., Spencer, T.: Constructive proof of localization in the Anderson tight binding model. Commun. Math. Phys. 101, 21–46 (1985)
[FS] Fröhlich, J., Spencer, T.: Absence of diffusion in the Anderson tight binding model for large disorder or low energy. Commun. Math. Phys. 88, 151–184 (1983)
[GK] Gohberg, I. C., Krein, M. G.: Introduction to the Theory of Nonselfadjoint Operators. Providence, RI: AMS, 1969
[HM] Holden, H., Martinelli, F.: On Absence of Diffusion near the Bottom of the Spectrum for a Random Schrödinger Operator on $L^2(\mathbb{R}^\nu)$. Commun. Math. Phys. 93, 197–217 (1984)
[JMW] Joannopoulos, J., Meade, R., Winn, J.: Photonic Crystals. Princeton, NJ: Princeton University Press, 1995
[J1] John, S.: Localization of Light. Phys. Today, May 1991
[J2] John, S.: The Localization of Light. In "Photonic Band Gaps and Localization", NATO ASI Series B: Physical 308, 1993
[Ka] Kato, T.: Perturbation Theory for Linear Operators. Berlin–Heidelberg–New York: Springer-Verlag, 1976
[KM1] Kirsch, W., Martinelli, F.: On the Ergodic Properties of the Spectrum of General Random Operators. J. Reine Angew. Math. 334, 141–156 (1982)
[KM2] Kirsch, W., Martinelli, F.: On the Spectrum of Schrödinger Operators with a Random Potential. Commun. Math. Phys. 85, 329–350 (1982)
[Ku] Kuchment, P.: Floquet Theory for Partial Differential Equations. Basel: Birkhäuser Verlag, 1993
[PF] Pastur, L., Figotin, A.: Spectra of Random and Almost-periodic Operators. Berlin–Heidelberg–New York: Springer-Verlag, 1991
[RS4] Reed, M., Simon, B.: Methods of Modern Mathematical Physics, Vol. IV, Analysis of Operators. New York: Academic Press, 1978
[Sp] Spencer, T.: Localization for Random and Quasiperiodic Potentials. J. Stat. Phys. 51, 1009–1019 (1988)
[VP] Villeneuve, P. R., Piché, M.: Photonic Band Gaps in Periodic Dielectric Structures. Prog. Quant. Electr. 18, 153–200 (1994)
Communicated by A. Kupiainen
Commun. Math. Phys. 184, 443 – 455 (1997)
Communications in
Mathematical Physics c Springer-Verlag 1997
Remarks on Singularities, Dimension and Energy Dissipation for Ideal Hydrodynamics and MHD
Russel E. Caflisch*, Isaac Klapper**, Gregory Steele***
Mathematics Department, UCLA, Los Angeles, CA 90095-1555, USA.
Received: 21 March 1995 / Accepted: 6 August 1996
* E-mail: [email protected]. Research supported in part by the ARPA under URI grant number # N00014092-J-1890.
** E-mail: [email protected]. Current address: Mathematics Department, Montana State University, Bozeman, MT 59717, USA. Research supported in part by an NSF postdoctoral fellowship.
*** E-mail: [email protected]. Current address: Lockheed Martin Western Development Laboratories, 3200 Zanker Rd. MS X-20, San Jose CA 95134, USA. Research supported in part by the NSF under grant # DMS-9306488.
Abstract: For weak solutions of the incompressible Euler equations, there is energy conservation if the velocity is in the Besov space $B^s_3$ with $s$ greater than 1/3. $B^s_p$ consists of functions that are Lip($s$) (i.e., Hölder continuous with exponent $s$) measured in the $L^p$ norm. Here this result is applied to a velocity field that is Lip($\alpha_0$) except on a set of co-dimension $\kappa_1$ on which it is Lip($\alpha_1$), with uniformity that will be made precise below. We show that the Frisch-Parisi multifractal formalism is valid (at least in one direction) for such a function, and that there is energy conservation if $\min_\alpha (3\alpha + \kappa(\alpha)) > 1$. Analogous conservation results are derived for the equations of incompressible ideal MHD (i.e., zero viscosity and resistivity) for both energy and helicity. In addition, a necessary condition is derived for singularity development in ideal MHD generalizing the Beale-Kato-Majda condition for ideal hydrodynamics.
1. Introduction
In turbulent flow at high Reynolds number, the energy dissipation rate is observed to be approximately independent of the coefficient of viscosity. If the Euler equations for ideal hydrodynamics are to correctly describe the infinite Reynolds number limit for turbulent flow, which is a major open question of fluid mechanics, then energy dissipation and singularities must occur in their solutions. The situation is similar for magneto-hydrodynamics (MHD) at high Reynolds and magnetic Reynolds number [2]. Although the available evidence is not as clear-cut,
energy dissipation is apparently constant in the ideal limit. In contrast, according to the Taylor conjecture, magnetic helicity does not dissipate in the ideal limit. If the ideal MHD equations are to allow reasonable limits of incompressible MHD, these two observations must be reflected in properties of the solutions.
In 1949 Onsager [12] stated that energy is conserved for weak solutions $u \in$ Lip($\alpha$) with $\alpha > 1/3$. This result is contained in a famous paper that initiated the statistical theory of point vortices, and received little attention until the work of Eyink [7], which gave it a rigorous mathematical proof in a certain function class. The proof was considerably simplified and extended to the Besov function space $B^s_p$ ($= B^{s,\infty}_p$) in subsequent work of Constantin, E and Titi [5].
In this note, we shall specialize the result of [5] to explicitly show the dependence on both the degree of singularity of the velocity and the dimension of the singular set. In particular, we consider a velocity which is Lip($\alpha_0$) everywhere (i.e. with co-dimension $\kappa_0 = 0$) except on a set of co-dimension $\kappa_1$ on which it is Lip($\alpha_1$). Our main result for ideal hydrodynamics, which is stated formally in Corollary 3.1 below, is that there is energy conservation for weak solutions of the Euler equations if
\[ \inf_\alpha\,(3\alpha + \kappa(\alpha)) > 1 \tag{1.1} \]
in which the inf is taken over $\alpha = \alpha_0, \alpha_1$. As shown below, this criterion is valid for negative, as well as positive, values of $\alpha$.
In fact, we show that for this class of functions, the multifractal formalism of Frisch-Parisi [9] is valid (at least in one direction), and that the functions are in the Besov space $B^s_p$ for any $s < s_p = \inf_\alpha\,(\alpha + \kappa(\alpha)/p)$ and for all $1 \le p < \infty$. So the energy conservation criterion (1.1), using $p = 3$, then follows from [5]. In fact, the criterion (1.1) is correct whenever the Frisch-Parisi formalism is valid.
The energy conservation criterion (1.1) is implicit in the work of Eyink [8] on multifractals and Besov spaces. Nevertheless, we believe that an explicit statement of this criterion and its validation for a particular class of velocities is noteworthy. In particular, it should be helpful in predicting the type of singularities for Euler flows, and in assessing their fluid dynamic significance if they do occur. Note, however, that there is no proof that the Euler velocity field will have the smoothness described above.
We then present two results on singularities and energy dissipation for ideal incompressible MHD. First, we derive criteria for energy conservation and helicity conservation for weak solutions of ideal MHD. Second, we show that if smooth initial data for the ideal MHD equations leads to a singularity at a finite time $t_*$, then
\[ \int_0^{t_*} \|\omega\|_\infty + \|J\|_\infty \, dt = \infty \tag{1.2} \]
in which ω = ∇ × u is the fluid vorticity, J = ∇ × B is the electrical current, and || ||∞ is the L∞ norm in space. This result is analogous to the theorem of Beale-Kato-Majda [1] for singularity formation in ideal hydrodynamics.
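The Besov exponent $s_p$ quoted in the Introduction can be checked numerically on a toy example. The sketch below is an illustration under stated assumptions, not part of the paper's argument: it takes the one-dimensional model $f(x) = |x|^{\alpha}$ on $(-1, 1)$, for which the singular set $\{0\}$ has co-dimension $\kappa_1 = 1$, $\alpha_1 = \alpha$ and $\alpha_0 = 1$, so the predicted exponent is $\min(1, \alpha + 1/p)$. The function names and the grid parameters are hypothetical choices.

```python
import math


def lp_increment_norm(alpha: float, p: float, y: float, n: int = 200_000) -> float:
    """Midpoint-rule value of ||f(. + y) - f(.)||_{L^p(-1,1)} for f(x) = |x|**alpha, alpha > 0."""
    h = 2.0 / n
    total = 0.0
    for k in range(n):
        x = -1.0 + (k + 0.5) * h  # midpoint of the k-th cell
        total += abs(abs(x + y) ** alpha - abs(x) ** alpha) ** p * h
    return total ** (1.0 / p)


def observed_exponent(alpha: float, p: float, y_big: float = 0.02, y_small: float = 0.002) -> float:
    """Slope of log(norm) against log|y| between two shift sizes."""
    n_big = lp_increment_norm(alpha, p, y_big)
    n_small = lp_increment_norm(alpha, p, y_small)
    return math.log(n_big / n_small) / math.log(y_big / y_small)
```

For `alpha = 0.2` and `p = 3` the measured slope should be close to $0.2 + 1/3$, which is larger than the pointwise Hölder exponent 0.2: the small co-dimension-one singular set is discounted by the $L^p$ average.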
2. Singularity and Dimension
Consider a function $f$, defined on a set $D \subset \mathbb{R}^m$, and assume that $f$ is smooth except on a manifold $S_0$ of co-dimension $\kappa$ (an integer) on which it is Lip($\alpha$); e.g., $f(x) = \mathrm{dist}(x, S_0)^\alpha$. Define sets $S(r)$ consisting of points in $D$ within distance $r$ of $S_0$. Then
\[ |S(r)| \equiv \mathrm{vol}(S(r)) \le a r^{\kappa} \tag{2.1} \]
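for some constant $a$, which will be adjusted for use in subsequent bounds. Next consider the difference of $f(x)$ and $f(x + y)$ for two points $x$ and $x + y$ that are at least distance $r$ from $S_0$, i.e. with $x, x + y \in D - S(r)$. Since the derivative of $f$ blows up like $r^{-(1-\alpha)}$ then
\[ |f(x) - f(x + y)| \le a r^{-(1-\alpha)} |y|. \tag{2.2} \]
Alternatively, $f$ is everywhere Lip($\alpha$) if $\alpha \ge 0$, while $f$ is of size $r^\alpha$ if $\alpha < 0$; i.e.
\[ |f(x) - f(x + y)| \le \begin{cases} a|y|^{\alpha} & \text{if } \alpha \ge 0 \\ a r^{\alpha} & \text{if } \alpha < 0 \end{cases} \tag{2.3} \]
for $x, x + y \in D - S(r)$. This can be generalized to a function that is Lip($\alpha_0$) in $D - S_0$ and Lip($\alpha_1$) on $S_0$, with $\alpha_0 > \alpha_1$, in which case the bounds can be combined as
\[ |f(x) - f(x + y)| \le \Delta(r, \alpha_0, \alpha_1) \quad \text{if } x, x + y \in D - S(r), \tag{2.4} \]
in which
\[ \Delta(r, \alpha_0, \alpha_1) = \begin{cases} a|y|^{\alpha_0} r^{-(\alpha_0 - \alpha_1)} & \text{if } |y| \le r \\ a|y|^{\alpha_1} & \text{if } r < |y| \text{ and } \alpha_1 \ge 0 \\ a r^{\alpha_1} & \text{if } r < |y| \text{ and } \alpha_1 < 0 \end{cases} \tag{2.5} \]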
Definition. A function $f$ satisfying the bounds (2.4) with $\alpha_0 > \alpha_1$ and $0 < \kappa$ will be said to be in class Lip($\alpha_0, \alpha_1, 0, \kappa$).
Next we derive $L^p$ estimates for any function in Lip($\alpha_0, \alpha_1, 0, \kappa$). These estimates show that such functions are in Besov space.
Lemma 2.1. Let $f \in$ Lip($\alpha_0, \alpha_1, 0, \kappa_1$), let $1 \le p \le \infty$, and denote $\kappa_0 = 0$. Define
\[ s_p = \min_{i=0,1}\,(\alpha_i + \kappa_i/p) \tag{2.6} \]
and assume that $s_p > 0$. Then for any $s_p > s > 0$ there is a constant $b$ (depending on $s_p - s$) such that
\[ \|f(\cdot + y) - f(\cdot)\|_{L^p} < b|y|^{s}. \tag{2.7} \]
Proof of Lemma 2.1. First assume that $\alpha_1 \ge 0$ and rewrite the defining inequality (2.4) in a smooth way as
\[ |f(x + y) - f(x)| \le \Delta(r) \equiv a(r + |y|)^{-\alpha_0 + \alpha_1} |y|^{\alpha_0} \tag{2.8} \]
for $x, x + y \in D - S(r)$. Also denote
\[ V(r) = \mathrm{vol}(S(r)) \le a(r + |y|)^{\kappa_1}, \qquad \tilde V(r) = \mathrm{vol}(S(r) \cup (S(r) - y)) \le 2V(r). \tag{2.9} \]
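Write the integral of the Hölder difference as a Stieltjes integral over $r$, then integrate by parts to estimate (omitting constant factors)
\[ \begin{aligned} \int_D |f(x + y) - f(x)|^p \, dx &\le \int_D \Delta(r)^p \, dx \\ &= \int_0^1 \Delta(r)^p \, d\tilde V(r) \\ &= -\int_0^1 \frac{\partial}{\partial r}\bigl(\Delta(r)^p\bigr)\,\tilde V(r)\, dr + \Delta(1)^p \tilde V(1) \\ &\le |y|^{\alpha_0 p} \left\{ \int_0^1 (r + |y|)^{-1 - p(\alpha_0 - \alpha_1) + \kappa_1} \, dr + 1 \right\} \\ &\le |y|^{sp} \begin{cases} \log |y| & \text{if } \alpha_1 + \kappa_1/p = \alpha_0 \\ 1 & \text{otherwise} \end{cases} \end{aligned} \tag{2.10} \]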
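in which $s = \min(\alpha_0, \alpha_1 + \kappa_1/p)$. This proves (2.7) for $\alpha_1 \ge 0$. On the other hand, if $\alpha_1 < 0$ then
\[ \Delta(r) = \min\bigl(r^{-(\alpha_0 - \alpha_1)} |y|^{\alpha_0},\ r^{\alpha_1}\bigr). \tag{2.11} \]
Then, repeating the first few steps of the previous estimation, the bound becomes
\[ \begin{aligned} \int_D |f(x + y) - f(x)|^p \, dx &\le -2\int_0^1 \frac{\partial}{\partial r}\bigl(\Delta(r)^p\bigr)\, V(r)\, dr + 2\Delta(1)^p V(1) \\ &\le \int_0^{|y|} r^{-1 + p\alpha_1 + \kappa_1} \, dr + |y|^{\alpha_0 p} \int_{|y|}^1 r^{-1 - p(\alpha_0 - \alpha_1) + \kappa_1} \, dr + a|y|^{\alpha_0 p} \\ &\le |y|^{sp} \begin{cases} \log |y| & \text{if } \alpha_1 + \kappa_1/p = \alpha_0 \\ 1 & \text{otherwise} \end{cases} \end{aligned} \tag{2.12} \]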
in which $s = \min(\alpha_0, \alpha_1 + \kappa_1/p) > 0$. This proves (2.7) for $f \in$ Lip($\alpha_0, \alpha_1, 0, \kappa_1$).
The Besov spaces are characterized by the $L^p$ bounds proved in Lemma 2.1, which leads to the following result:
Corollary 2.1. Assume that function $f \in$ Lip($\alpha_0, \alpha_1, 0, \kappa_1$) and that $1 \le p < \infty$. Define
\[ s_p = \min_\alpha\,(\alpha + \kappa(\alpha)/p). \tag{2.13} \]
If $s_p > 0$, then $f \in B^s_p$ for any $s_p \ge s > 0$.
This is exactly the formula for $s_p$ in the Frisch-Parisi formalism, which shows one-sided validity of the Frisch-Parisi formalism for this function class.
3. Energy Conservation for Ideal Hydrodynamics
For simplicity assume that $D = [0, 1]^3$ with periodic boundary conditions. A weak solution of the incompressible Euler equation is a function $u = (u_1, u_2, u_3)$ satisfying
\[ \int_0^T\!\!\int_D \bigl[ u_j \partial_t \psi_j + (\partial_i \psi_j) u_i u_j + (\partial_j \psi_j) p \bigr] \, dx \, dt = \int_D u_j \psi_j (t = 0) \, dx, \qquad \int_D u_j (\partial_j \varphi) \, dx = 0 \tag{3.1} \]
for all test functions $\psi = (\psi_1, \psi_2, \psi_3) \in C^\infty(D \times \mathbb{R}^+)$ and $\varphi \in C^\infty(D)$ with compact support. Energy is conserved for an Euler solution if
\[ \int_D |u(x, t)|^2 \, dx = \int_D |u(x, 0)|^2 \, dx \tag{3.2} \]
for $t \in [0, T]$. The following energy conservation theorem for ideal hydrodynamics is a consequence of Corollary 2.1 and the theorem of [5].
Corollary 3.1. (Energy Conservation for Euler). Let $u$ be a weak solution of the Euler equations on $D = [0, 1]^3$. Suppose that $u \in C([0, T], B(D))$ in which $B(D) =$ Lip($\alpha_0, \alpha_1, 0, \kappa_1$). Then energy is conserved if
\[ \min_i\,(3\alpha_i + \kappa_i) > 1. \tag{3.3} \]
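The criterion (3.3) and the exponent (2.6) are simple enough to evaluate directly. The following sketch is only an illustration; the function names are hypothetical, and each singularity class is passed as a pair `(alpha_i, kappa_i)`.

```python
def besov_exponent(classes, p):
    """s_p = min_i (alpha_i + kappa_i / p) over pairs (alpha_i, kappa_i), cf. (2.6)."""
    return min(alpha + kappa / p for alpha, kappa in classes)


def conserves_energy(classes):
    """Criterion (3.3): energy is conserved if min_i (3 * alpha_i + kappa_i) > 1."""
    return min(3.0 * alpha + kappa for alpha, kappa in classes) > 1.0


def dissipation_threshold(kappa):
    """Largest alpha for which (3.3) fails on a set of co-dimension kappa, i.e. 3 * alpha + kappa <= 1."""
    return (1.0 - kappa) / 3.0
```

For instance, a velocity that is Lipschitz away from a surface ($\kappa_1 = 1$) on which it is Lip(0.1) satisfies the criterion, while one that behaves like $r^{-1/2}$ near a curve ($\kappa_1 = 2$) does not. The thresholds returned for $\kappa = 1, 2, 3$ are $0$, $-1/3$ and $-2/3$, the values quoted in the Conclusions.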
Note that here and in the next section, the function space $C([0, T], B(D))$ could be replaced by $L^3([0, T], B(D)) \cap C([0, T], L^2(D))$ or something similar, as in [5].
4. Energy Conservation for Ideal MHD
The energy conservation results of [5] can be extended to ideal MHD in a straightforward manner. The equations for ideal MHD are
\[ \begin{aligned} (\partial_t + u \cdot \nabla) u &= -\nabla p - \tfrac{1}{2} \nabla b^2 + b \cdot \nabla b \\ (\partial_t + u \cdot \nabla) b &= b \cdot \nabla u \\ \nabla \cdot u &= \nabla \cdot b = 0. \end{aligned} \tag{4.1} \]
Actually, incompressibility of $b$ ($\nabla \cdot b = 0$) need only be required at $t = 0$, and it then holds for all $t$. Let $u = (u_1, u_2, u_3)$ and $b = (b_1, b_2, b_3)$ be functions satisfying the weak form of the ideal MHD equations, namely,
\[ \int_0^T\!\!\int_D \Bigl[ u_j \partial_t \psi_j^{(1)} - (b_i b_j - u_i u_j) \partial_i \psi_j^{(1)} + (p + b^2/2) \partial_i \psi_i^{(1)} \Bigr] dx \, dt = \int_{D,\,t=0} u_j \psi_j^{(1)} \, dx \]
\[ \int_0^T\!\!\int_D \Bigl[ b_j \partial_t \psi_j^{(2)} + (\epsilon_{jkl} u_k b_l)(\epsilon_{jmn} \partial_m \psi_n^{(2)}) \Bigr] dx \, dt = \int_{D,\,t=0} b_j \psi_j^{(2)} \, dx \]
\[ \int_D u_j \partial_j \xi^{(1)} \, dx = 0, \qquad \int_D b_j \partial_j \xi^{(2)} \, dx = 0 \]
for all test functions $\psi^{(\beta)} = (\psi_1^{(\beta)}, \psi_2^{(\beta)}, \psi_3^{(\beta)}) \in C^\infty(D \times \mathbb{R}^+)$ and $\xi^{(\beta)} \in C^\infty(D)$, with $\beta = 1, 2$. Again, the incompressibility condition on $b$ need only be imposed at $t = 0$ and it then follows for all $t$. In analogy to the conservation of energy for the Euler equations, energy conservation for ideal MHD holds if
\[ \int_D |u(x, t)|^2 + |b(x, t)|^2 \, dx = \int_D |u(x, 0)|^2 + |b(x, 0)|^2 \, dx \tag{4.2} \]
for $t \in [0, T]$. For simplicity we assume that $D = [0, 1]^n$. Whereas singularity formation and energy dissipation are only possible for three-dimensional hydrodynamics, for MHD they are a possibility for dimension $n = 2$ or $n = 3$.
Theorem 4.1. (Energy Conservation for Ideal MHD). Let $u$ and $b$ be a weak solution of the ideal MHD equations in $D = [0, 1]^n$. Suppose that $u \in C([0, T], B_3^{\alpha_1})$ and $b \in C([0, T], B_3^{\alpha_2})$. If
\[ \alpha_1 > 1/3, \qquad \alpha_1 + 2\alpha_2 > 1, \tag{4.3} \]
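then (4.2) holds.
Proof. The proof follows that of [5] but will be briefly repeated here. Define $\varphi^\epsilon(x) = (1/\epsilon^n)\varphi(x/\epsilon)$ to be a positive, smooth mollifier with support in $B(0, 1)$ and total mass 1; as in [5], $f^\epsilon = \varphi^\epsilon * f$ denotes the mollification of $f$. We make use of the definitions
\[ r_\epsilon(f, g)(x) = \int \varphi^\epsilon(y)\,(\delta_y f(x) \otimes \delta_y g(x)) \, dy, \]
\[ q_\epsilon(f, g)(x) = \int \varphi^\epsilon(y)\,(\delta_y f(x) \times \delta_y g(x)) \, dy, \]
where $\delta_y h(x) = h(x - y) - h(x)$. The proof relies critically on the following identities (first observed in [5]):
\[ (f \otimes g)^\epsilon = f^\epsilon \otimes g^\epsilon + r_\epsilon(f, g) - (f - f^\epsilon) \otimes (g - g^\epsilon) \tag{4.4} \]
\[ (f \times g)^\epsilon = f^\epsilon \times g^\epsilon + q_\epsilon(f, g) - (f - f^\epsilon) \times (g - g^\epsilon). \tag{4.5} \]
In addition the following estimates hold for functions in $B_3^\alpha$:
\[ \|f(\cdot + y) - f(\cdot)\|_{L^3} \le c|y|^{\alpha}, \tag{4.6} \]
\[ \|\nabla f^\epsilon\|_{L^3} \le C \epsilon^{\alpha - 1} \|f\|_{L^3}, \tag{4.7} \]
\[ \|f - f^\epsilon\|_{L^3} \le C \epsilon^{\alpha} \|f\|_{L^3}. \tag{4.8} \]
Using $\psi^{(1)}(x) = \int \varphi^\epsilon(y - x)\, u^\epsilon(y, t) \, dy$ and $\psi^{(2)}(x) = \int \varphi^\epsilon(y - x)\, b^\epsilon(y, t) \, dy$ as test functions results in the equations
\[ \int_D |u^\epsilon(x, t)|^2 \, dx - \int_D |u^\epsilon(x, 0)|^2 \, dx = \int_0^t\!\!\int_D \mathrm{Tr}\bigl[ (u \otimes u)^\epsilon \nabla u^\epsilon - (b \otimes b)^\epsilon \nabla u^\epsilon \bigr](x, t) \, dx \, dt \]
\[ \int_D |b^\epsilon(x, t)|^2 \, dx - \int_D |b^\epsilon(x, 0)|^2 \, dx = \int_0^t\!\!\int_D (u \times b)^\epsilon \cdot \nabla \times b^\epsilon (x, t) \, dx \, dt. \]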
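The identities (4.4), (4.5) and the estimates (4.6), (4.7) and (4.8) then yield the estimates
\[ \begin{aligned} &\left| \int_D |u^\epsilon(x, t)|^2 + |b^\epsilon(x, t)|^2 \, dx - \int_D |u^\epsilon(x, 0)|^2 + |b^\epsilon(x, 0)|^2 \, dx \right| \\ &\quad \le \int_0^t\!\!\int_D \Bigl| \mathrm{Tr}\Bigl[ \bigl( r_\epsilon(u, u) - r_\epsilon(b, b) - (u - u^\epsilon) \otimes (u - u^\epsilon) + (b - b^\epsilon) \otimes (b - b^\epsilon) \bigr) \nabla u^\epsilon \Bigr] \Bigr| \, dx \, d\tau \\ &\qquad + \int_0^t\!\!\int_D \Bigl| \bigl[ q_\epsilon(u, b) - (u - u^\epsilon) \times (b - b^\epsilon) \bigr] \cdot \nabla \times b^\epsilon \Bigr| \, dx \, d\tau \\ &\quad \le \int_0^t \Bigl[ \bigl( \|r_\epsilon(u, u)\|_{3/2}^{2/3} + \|r_\epsilon(b, b)\|_{3/2}^{2/3} + \|u - u^\epsilon\|_{3/2}^{2/3} + \|b - b^\epsilon\|_{3/2}^{2/3} \bigr) \|\nabla u^\epsilon\|_3^{1/3} \\ &\qquad\qquad + \bigl( \|q_\epsilon(u, b)\|_{3/2}^{2/3} + \|u - u^\epsilon\|_{3/2}^{1/3} \|b - b^\epsilon\|_{3/2}^{1/3} \bigr) \|\nabla u^\epsilon\|_3^{1/3} \Bigr] d\tau \\ &\quad \le C_1 \epsilon^{3\alpha_1 - 1} + C_2 \epsilon^{\alpha_1 + 2\alpha_2 - 1}. \end{aligned} \]
The result (4.2) follows in the limit $\epsilon \to 0$, which finishes the proof of Theorem 4.1.
A similar theorem for magnetic helicity can be proven. The time evolution of the magnetic helicity for smooth ideal MHD is given by
\[ \begin{aligned} \int_D [a_t \cdot b + a \cdot b_t] \, dx &= \int_D [b \cdot (u \times b) + b \cdot \nabla\alpha + a \cdot \nabla \times (u \times b)] \, dx \\ &= \int_D [b \cdot \nabla\alpha + a \cdot \nabla \times (u \times b)] \, dx \\ &= 0 \end{aligned} \]
where $\alpha$ is some smooth function and $b = \nabla \times a$. Then for $\psi \in C^\infty(D \times \mathbb{R}^+)$,
\[ \int_0^T\!\!\int_D (\nabla \times \psi(x, t)) \cdot (u(x, t) \times b(x, t)) \, dx \, dt = 0 \tag{4.9} \]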
implies weak conservation of helicity. Using arguments identical to those of the previous proof we obtain
Theorem 4.2. (Magnetic Helicity Conservation for Ideal MHD). Let $u$ and $b$ be a weak solution of the ideal MHD equations in $D = [0, 1]^n$. Suppose that $u \in C([0, T], B_3^{\alpha_1})$ and $b \in C([0, T], B_3^{\alpha_2})$. If $\alpha_1 + 2\alpha_2 > 0$, then (4.9) holds.
In 2 dimensions the magnetic helicity vanishes identically. In its place the quantity $\int_D a^2 \, dx$ serves as a non-trivial invariant. In 2 dimensions, $a$ satisfies (up to a gradient)
\[ a_t + u \cdot \nabla a = 0 \]
and we have
Theorem 4.3. Let $u$ and $b$ be a weak solution of the ideal MHD equations in $D = [0, 1]^2$. Suppose that $u \in C([0, T], B_3^{\alpha_1})$ and $a \in C([0, T], B_3^{\alpha_2 + 1})$. If $\alpha_1 + 2\alpha_2 > -1$, then $\int_D a^2 \, dx$ is conserved.
We remark that Theorems 4.1, 4.2, and 4.3 specialize easily to functions $u$ and $b$ in Lip($\alpha_0, \alpha_1, 0, \kappa_1$), as in Corollary 3.1. In these cases the bounds of Theorem 4.1 become $s_1 > 1/3$, $s_1 + 2s_2 > 1$, the bound for Theorem 4.2 becomes $s_1 + 2s_2 > 0$, and the bound for Theorem 4.3 becomes $s_1 + 2s_2 > -1$. Here
\[ s_1 = \min_{\alpha_1}\,(\alpha_1 + \kappa_1(\alpha_1)/3), \qquad s_2 = \min_{\alpha_2}\,(\alpha_2 + \kappa_2(\alpha_2)/3), \]
where $\kappa_1$, $\kappa_2$ are defined as in the introduction. For the commonly observed phenomenon of codimension 1 current sheets, $\kappa_2 = 1$ so that $s_1 + 2\alpha_2 > 1/3$ implies energy conservation and $s_1 + 2\alpha_2 > -2/3$ implies helicity conservation ($-5/3$ in 2D).
We also remark that while the fluid result (1.1) picks out the Kolmogorov exponent 1/3 naturally, the classical MHD exponent (namely 1/4 [10, 11]), while consistent with the bounds of Theorems 4.1 and 4.2, does not drop out as naturally. This should not be a surprise since important non-local MHD effects are not included in the argument. Additionally, Theorems 4.1 and 4.2 are consistent with recent intermittency models (see, e.g., [3]).
Analogous results can be obtained in terms of the Elsasser (characteristic) variables $z^\pm = u \pm b$ for the MHD equations. The system (4.1) can be rewritten as
\[ \begin{aligned} (\partial_t + z^+ \cdot \nabla) z^- &= -\nabla\Pi, \\ (\partial_t + z^- \cdot \nabla) z^+ &= -\nabla\Pi, \\ \nabla \cdot z^\pm &= 0 \end{aligned} \tag{4.10} \]
in which $\Pi = p + \frac{1}{2} b^2$. The following theorem gives two variants of the previous energy conservation result for MHD.
Theorem 4.4. (Energy Conservation for Ideal MHD in Characteristic Variables). For a weak solution of the MHD equations in $[0, 1]^n$, there is energy conservation if either of the following conditions is satisfied:
(i) For some $p, q$ with values in $(1, \infty)$ and with $1/p + 2/q = 1$,
\[ u \in C([0, T], B_3^{\alpha_0} \cap B_p^{\alpha_1}), \qquad b \in C([0, T], B_q^{\alpha_2}), \tag{4.11} \]
in which
\[ 3\alpha_0 > 1, \qquad \alpha_1 + 2\alpha_2 > 1. \tag{4.12} \]
(ii) For some $p_i, q_i$ ($i = 1, 2$) with values in $(1, \infty)$ and with $1/p_1 + 2/q_1 = 2/p_2 + 1/q_2 = 1$,
\[ z^+ \in C([0, T], B_{p_1}^{\alpha_1} \cap B_{p_2}^{\alpha_2}), \qquad z^- \in C([0, T], B_{q_1}^{\beta_1} \cap B_{q_2}^{\beta_2}) \tag{4.13} \]
in which
\[ \alpha_1 + 2\beta_1 > 1, \qquad 2\alpha_2 + \beta_2 > 1. \tag{4.14} \]
Similar statements can be made with regard to magnetic helicity.
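For readers who want the intermediate step, the passage from (4.1) to the Elsasser form (4.10) is a short computation. The following worked derivation is added for convenience and uses only the definitions $z^\pm = u \pm b$ and $\Pi = p + \frac{1}{2} b^2$.

```latex
% With \Pi = p + b^2/2 the momentum equation of (4.1) reads
%   (\partial_t + u\cdot\nabla) u = -\nabla\Pi + b\cdot\nabla b .
% Adding and subtracting the induction equation (\partial_t + u\cdot\nabla) b = b\cdot\nabla u:
\begin{align*}
(\partial_t + u\cdot\nabla)(u + b) &= -\nabla\Pi + b\cdot\nabla(u + b), \\
(\partial_t + u\cdot\nabla)(u - b) &= -\nabla\Pi - b\cdot\nabla(u - b).
\end{align*}
% Moving the b\cdot\nabla terms to the left and using u - b = z^-, u + b = z^+:
\begin{align*}
(\partial_t + z^-\cdot\nabla)\, z^+ &= -\nabla\Pi, \\
(\partial_t + z^+\cdot\nabla)\, z^- &= -\nabla\Pi,
\end{align*}
% while \nabla\cdot z^\pm = \nabla\cdot u \pm \nabla\cdot b = 0.
```

Each Elsasser field is thus transported by the other one, which is why the conditions in Theorem 4.4(ii) couple the regularity of $z^+$ with that of $z^-$.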
5. Singularity Formation for Ideal MHD
We will show the analogue of the Beale-Kato-Majda theorem for ideal MHD.
Theorem 5.1. For the system (4.1) with initial data $u_0, b_0 \in H^s$, with $s \ge 3$, the solution $u(t), b(t)$ is in the class $C([0, T], H^s) \cap C^1([0, T], H^{s-1})$ as long as
\[ \int_0^T |\omega(t)|_\infty + |j(t)|_\infty \, dt < \infty, \]
and
\[ \int_0^T |\nabla \times z^+|_\infty + |\nabla \times z^-|_\infty \, dt < \infty. \]
(The two inequalities are in fact equivalent.) Here $j = \nabla \times b$ and $H^s$ is the $L^2$ Sobolev space. The approach closely follows that of [1]. Assume that
\[ \int_0^T |\nabla \times z^+|_\infty + |\nabla \times z^-|_\infty \, dt = M < \infty. \tag{5.1} \]
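The equivalence of the two integral conditions in Theorem 5.1 follows from the triangle inequality. A short worked verification, writing $\zeta^\pm = \nabla \times z^\pm$ (the notation used in Sect. 5.2):

```latex
% Since z^\pm = u \pm b, taking the curl gives
%   \zeta^\pm = \omega \pm j, \qquad \omega = \tfrac12(\zeta^+ + \zeta^-), \qquad j = \tfrac12(\zeta^+ - \zeta^-).
\begin{align*}
|\zeta^+|_\infty + |\zeta^-|_\infty &\le 2\bigl(|\omega|_\infty + |j|_\infty\bigr), \\
|\omega|_\infty + |j|_\infty &\le |\zeta^+|_\infty + |\zeta^-|_\infty .
\end{align*}
% Integrating over [0, T], either time integral is finite if and only if the other is.
```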
The proof consists of three parts: First, we derive energy estimates on $|z^\pm|_s$ in terms of $|\nabla z^\pm|_\infty$. Second, we estimate $|\nabla \times z^\pm|_{L^2}$. Finally, we utilize an inequality derived in [1] and Gronwall's lemma to bound $|z^\pm|_s$.
5.1. Energy estimates. We begin by deriving energy estimates for the system (4.10) with $t \in [0, T]$. Let $\alpha$ be a multi-index with $|\alpha| \le s$. Let $\eta = D_x^\alpha z^+$. Apply $D_x^\alpha$ to the second equation in (4.10) to obtain
\[ (\partial_t + z^- \cdot \nabla)\eta = -\nabla\Pi' - F \]
in which $\Pi' = D_x^\alpha \Pi$ and $F = D^\alpha[(z^- \cdot \nabla z^+)] - z^- \cdot D^\alpha \nabla z^+$. A bound on $F$ in the $L^2$ norm can be based on the general inequality
\[ |D^\alpha(fg) - f D^\alpha g|_{L^2} \le c\,(|f|_s |g|_\infty + |\nabla f|_\infty |g|_{s-1}), \]
which was derived in [1] based on the Gagliardo-Nirenberg inequalities. Application of this to $F$ yields
\[ |F|_{L^2} \le c\,(|z^-|_s |\nabla z^+|_\infty + |\nabla z^-|_\infty |\nabla z^+|_{s-1}). \]
This leads to the following bound on $\eta$:
\[ \frac{d}{dt} |\eta|_{L^2}^2 \le c\,(|z^-|_s |\nabla z^+|_\infty + |\nabla z^-|_\infty |\nabla z^+|_{s-1})\,|\eta|_{L^2}. \tag{5.2} \]
Summing over $\alpha$ leads to
\[ \frac{d}{dt} |z^+|_s^2 \le c\,(|z^-|_s |\nabla z^+|_\infty + |\nabla z^-|_\infty |z^+|_s)\,|z^+|_s. \tag{5.3} \]
There is a similar result for $z^-$; i.e.,
\[ \frac{d}{dt} |z^-|_s^2 \le c\,(|z^+|_s |\nabla z^-|_\infty + |\nabla z^+|_\infty |z^-|_s)\,|z^-|_s. \tag{5.4} \]
Add these two inequalities to obtain
\[ \frac{d}{dt} (|z^-|_s^2 + |z^+|_s^2) \le c\,(|\nabla z^+|_\infty + |\nabla z^-|_\infty)(|z^+|_s^2 + |z^-|_s^2), \tag{5.5} \]
and thus
\[ |z^+|_s^2 + |z^-|_s^2 \le (|z_0^+|_s^2 + |z_0^-|_s^2) \exp\left( C \int_0^t (|\nabla z^+|_\infty + |\nabla z^-|_\infty) \, d\tau \right). \tag{5.6} \]
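The two steps leading from (5.3) and (5.4) to (5.6) can be spelled out as follows. This is a routine worked version, with the shorthand $Y$ and $g$ introduced here only for the computation.

```latex
% Shorthand (introduced for this computation only):
%   Y(t) = |z^+|_s^2 + |z^-|_s^2, \qquad g(t) = |\nabla z^+|_\infty + |\nabla z^-|_\infty .
% Step 1: the cross terms in (5.3) and (5.4) are controlled by
%   |z^+|_s |z^-|_s \le \tfrac12 \bigl(|z^+|_s^2 + |z^-|_s^2\bigr),
% so adding the two inequalities gives (5.5), i.e. Y'(t) \le c\, g(t)\, Y(t).
% Step 2 (differential form of Gronwall's lemma):
\begin{align*}
\frac{d}{dt}\Bigl[ Y(t)\, \exp\Bigl(-c \int_0^t g(\tau)\, d\tau\Bigr) \Bigr]
  &= \bigl(Y'(t) - c\, g(t)\, Y(t)\bigr) \exp\Bigl(-c \int_0^t g(\tau)\, d\tau\Bigr) \le 0, \\
\text{hence}\quad Y(t) &\le Y(0)\, \exp\Bigl(c \int_0^t g(\tau)\, d\tau\Bigr),
\end{align*}
% which is (5.6).
```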
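5.2. $L^2$ bounds on $\nabla \times z^\pm$. Take the curl of (4.10) to obtain
\[ \begin{aligned} (\partial_t + z^+ \cdot \nabla)\zeta^- &= \nabla z^+ A \nabla z^-, \\ (\partial_t + z^- \cdot \nabla)\zeta^+ &= \nabla z^- A \nabla z^+, \end{aligned} \tag{5.7} \]
where $\zeta^\pm = \nabla \times z^\pm$ and $A$ is a constant matrix. Multiplying the first equation in (5.7) by $\zeta^-$ and integrating gives
\[ \begin{aligned} \frac{d}{dt} |\zeta^-|_{L^2}^2 &\le C \int |\nabla z^+|\,|\nabla z^-|\,|\zeta^-| \, dx \\ &\le C |\zeta^-|_\infty (|\nabla z^+|_{L^2} |\nabla z^-|_{L^2}) \\ &\le C |\zeta^-|_\infty (|\nabla z^+|_{L^2}^2 + |\nabla z^-|_{L^2}^2). \end{aligned} \tag{5.8} \]
Since $\nabla \cdot z^\pm = 0$, $z^\pm$ and $\zeta^\pm$ are related by
\[ z^\pm = -\nabla \times (\Delta^{-1} \zeta^\pm) \]
and their Fourier transforms are related by $\widehat{(\nabla z^\pm)}(k) = S(k)\,\hat\zeta^\pm(k)$ where $S(k)$ is bounded independent of $k$. Thus $|\nabla z^\pm|_{L^2} \le C |\zeta^\pm|_{L^2}$, so that (5.8) leads to
\[ \frac{d}{dt} |\zeta^-|_{L^2}^2 \le C |\zeta^-|_\infty (|\zeta^+|_{L^2}^2 + |\zeta^-|_{L^2}^2). \]
We obtain a similar result for $\zeta^+$; that is
\[ \frac{d}{dt} |\zeta^+|_{L^2}^2 \le C |\zeta^+|_\infty (|\zeta^+|_{L^2}^2 + |\zeta^-|_{L^2}^2). \]
Add these two equations to obtain
\[ \frac{d}{dt} (|\zeta^+|_{L^2}^2 + |\zeta^-|_{L^2}^2) \le c\,(|\zeta^+|_\infty + |\zeta^-|_\infty)(|\zeta^+|_{L^2}^2 + |\zeta^-|_{L^2}^2) \]
so that
\[ |\zeta^+|_{L^2}^2 + |\zeta^-|_{L^2}^2 \le (|\zeta_0^+|_{L^2}^2 + |\zeta_0^-|_{L^2}^2) \exp\left( C \int_0^t (|\zeta^+(\tau)|_\infty + |\zeta^-(\tau)|_\infty) \, d\tau \right). \]
By Assumption (5.1) we have
\[ |\zeta^+|_{L^2}^2 + |\zeta^-|_{L^2}^2 \le \widetilde{M}\,(|\zeta_0^+|_{L^2}^2 + |\zeta_0^-|_{L^2}^2), \tag{5.9} \]
where $\widetilde{M} = \exp(CM)$.
5.3. Final estimates. In [1] it was proved, via the Biot-Savart law, that
\[ |\nabla f|_\infty \le C\{1 + (1 + \log^+ |f|_3)\,|\nabla \times f|_\infty + |\nabla \times f|_{L^2}\} \tag{5.10} \]
where
\[ \log^+ a = \begin{cases} \log a & \text{if } a \ge 1 \\ 0 & \text{otherwise.} \end{cases} \tag{5.11} \]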
Thus
\[ |\nabla z^+|_\infty + |\nabla z^-|_\infty \le C\{1 + (1 + \log^+ |z^+|_3)\,|\zeta^+|_\infty + |\zeta^+|_{L^2} + (1 + \log^+ |z^-|_3)\,|\zeta^-|_\infty + |\zeta^-|_{L^2}\}. \]
Using the result from Section 5.2, we have
\[ |\nabla z^+|_\infty + |\nabla z^-|_\infty \le C\{1 + (|\zeta^+|_\infty + |\zeta^-|_\infty)(\log^+ |z^+|_3 + \log^+ |z^-|_3 + 2)\}. \]
Combining this with the result (5.6) from Sect. 5.1 gives
\[ |z^+|_s + |z^-|_s \le c\,(|z_0^+|_s + |z_0^-|_s) \exp\left( C \int_0^t \bigl(1 + (|\zeta^+|_\infty + |\zeta^-|_\infty)\bigr)\bigl(\log(|z^+|_3 + e) + \log(|z^-|_3 + e)\bigr) \, d\tau \right). \]
Let $y^\pm(t) = \log(|z^\pm|_s + e)$; then
\[ y^+(t) + y^-(t) \le \log c\,(|z_0^+|_s + |z_0^-|_s) + C \int_0^t \bigl(1 + (|\zeta^+|_\infty + |\zeta^-|_\infty)\bigr)\bigl(y^+(\tau) + y^-(\tau)\bigr) \, d\tau. \]
Application of Gronwall’s lemma then shows that y + (t) + y − (t) is bounded by a constant which depends only on M, T and |z ± (0, ·)|s . This concludes the proof of Theorem 5.1.
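The final Gronwall step can be made explicit. In the sketch below the shorthand $Y$, $A$ and $h$ is introduced only for this computation, and the integrand of the last inequality is read as $(1 + |\zeta^+|_\infty + |\zeta^-|_\infty)(y^+ + y^-)$, which is legitimate since $y^\pm \ge 1$.

```latex
% Shorthand: Y(t) = y^+(t) + y^-(t), \quad A = \log c\,(|z^+_0|_s + |z^-_0|_s), \quad
%            h(\tau) = 1 + |\zeta^+(\tau)|_\infty + |\zeta^-(\tau)|_\infty .
% The last inequality then reads Y(t) \le A + C \int_0^t h(\tau)\, Y(\tau)\, d\tau,
% and the integral form of Gronwall's lemma together with (5.1) gives
\begin{align*}
Y(t) \;\le\; A \exp\Bigl( C \int_0^t h(\tau)\, d\tau \Bigr) \;\le\; A\, e^{C (T + M)} , \qquad 0 \le t \le T .
\end{align*}
% Since |z^\pm|_s \le e^{y^\pm} \le e^{Y}, the Sobolev norms satisfy the double-exponential bound
\begin{align*}
|z^\pm(t)|_s \;\le\; \exp\bigl( A\, e^{C (T + M)} \bigr).
\end{align*}
```

The bound depends only on $M$, $T$ and the initial norms, as stated in the proof.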
6. Conclusions At present, there are only a few analytical results on singularities in ideal hydrodynamics: The Beale-Kato-Majda theorem is a necessary condition for the formation of singularities from smooth initial data. Constantin [4] and Constantin & Fefferman [6] have obtained additional necessary conditions in terms of the geometry of the vorticity field. Finally, Onsager’s energy conservation criterion provides a necessary condition for energy dissipation due to singularities in an ideal fluid. The first part of this paper has refined Onsager’s criterion by explicitly showing the effect of singularity type and dimension on the necessary condition for energy dissipation. The result is an example of the Frisch-Parisi multi-fractal formalism, which has been proved to be valid for functions in the class Lip(α0 , α1 , 0, κ1 ). In the remaining parts of the paper two analytical results–the Beale-Kato-Majda theorem and Onsager’s energy conservation theorem–have been extended to ideal MHD. Since energy dissipation but helicity conservation are expected, this suggests a limited range of values for the uniform singularity spectrum in MHD. The appearance of the Elsasser variables z + and z − in the extension of the Beale-Kato-Majda inequality should also be noted. We expect these results to be useful in two ways: First, as a sufficient condition for regularity of ideal hydrodynamic and MHD solutions. They should also serve as a guide in investigation of possible singularities and their physical significance. For example in 3D hydrodynamics with singularities of type α on a smooth set S, nonzero energy dissipation requires α ≤ 0 for a 2D singularity surface (κ = 1), α ≤ −1/3 for a curve of singularities (κ = 2), and α ≤ −2/3 for a point singularity (κ = 3). In particular, in the point and curve cases, infinite velocities are required. These results also help to indicate the relation between the smoothness of b and that of u . 
Theorem 5.1 suggests that $b$ and $u$ should have the same degree of smoothness, while Theorems 4.1, 4.2, and 4.3 suggest a tradeoff between the smoothness of $u$ and that of $b$.
References
1. Beale, J.T., Kato, T., Majda, A.: Remarks on the breakdown of smooth solutions for the 3-D Euler equations. Commun. Math. Phys. 94, 61–66 (1984)
2. Biskamp, D.: Nonlinear Magnetohydrodynamics. Cambridge: Cambridge Univ. Press, 1991
3. Carbone, V.: Cascade model for intermittency in fully developed magnetohydrodynamic turbulence. Phys. Review Letters 71, 1546–1548 (1993)
4. Constantin, P.: Geometric statistics in turbulence. SIAM Rev. 36, 73–98 (1994)
5. Constantin, P., Weinan E., Titi, E.S.: Onsager’s conjecture on the energy conservation for solutions of Euler’s equation. Commun. Math. Phys. 165, 207–209 (1994)
6. Constantin, P., Fefferman, C.: Direction of vorticity and the problem of global regularity for the Navier-Stokes equations. Indiana U. Math. J. 42, 775–789 (1994)
7. Eyink, G.: Energy dissipation without viscosity in ideal hydrodynamics. 1. Fourier analysis and local energy transfer. Physica D 78, 222–240 (1994)
8. Eyink, G.: Besov spaces and the multifractal hypothesis. J. Stat. Phys. 78, 353–375 (1995)
9. Frisch, U., Parisi, G.: On the singularity structure of fully developed turbulence. In M. Ghil, R. Benzi, and G. Parisi, editors, Turbulence and Predictability in Geophysical Fluid Dynamics and Climate Dynamics. Proc. International Summer School of Physics “Enrico Fermi”, Amsterdam: North-Holland, 1985, pp. 84–87
10. Iroshnikov, P.S.: Turbulence of a conducting fluid in a strong magnetic field. Soviet Astronomy 7, 566–571 (1964)
11. Kraichnan, R.H.: Inertial range spectrum in hydromagnetic turbulence. Physics of Fluids A 8, 1385–1387 (1965)
12. Onsager, L.: Statistical hydrodynamics. Nuovo Cimento (Supplemento) 6, 279 (1949)
Communicated by J.L. Lebowitz
Commun. Math. Phys. 184, 457 – 474 (1997)
Communications in Mathematical Physics © Springer-Verlag 1997
Free Hypercontractivity
Philippe Biane
CNRS, Laboratoire de Probabilités, Tour 56, 3e Étage, Université Pierre et Marie Curie, Place Jussieu, 75252 Paris Cedex 05, France, and DMI, École Normale Supérieure, 45, rue d’Ulm, 75005 Paris, France
Received: 23 July 1996 / Accepted: 29 August 1996
Abstract: We prove the analog of Nelson’s optimal hypercontractivity inequality, for the free second quantization (in the sense of Voiculescu) of a contraction between two real Hilbert spaces. The proof extends also to the case of the q-quantization functors of Bożejko, Kümmerer and Speicher, for $q \in [-1, 1]$, interpolating between the bosonic and the fermionic functors which correspond to $q = +1$ and $q = -1$ respectively.
Introduction
Segal’s bosonic second quantization functor (which we shall denote by $\Gamma_{+1}$ for reasons which will appear later) assigns to every real Hilbert space $H$ a probability space $(\Omega, \mathcal{F}, P)$ (or equivalently a commutative von Neumann algebra $\Gamma_{+1}(H) = L^\infty_{\mathbb{C}}(\Omega, \mathcal{F}, P)$), carrying a gaussian measure with covariance modelled on $H$, and to every contraction $A : H \to K$ between Hilbert spaces, a unital positivity preserving map $\Gamma_{+1}(A) : \Gamma_{+1}(H) \to \Gamma_{+1}(K)$. Such a map satisfies $\|\Gamma_{+1}(A)f\|_{L^p} \le \|f\|_{L^p}$ for $f \in L^\infty_{\mathbb{C}}(\Omega, \mathcal{F}, P)$, $1 \le p \le \infty$, and thus extends to a contraction between $L^p$ spaces. In his famous paper [N], E. Nelson proved that in fact a better contraction property holds and one has, for $1 \le p, r < \infty$,
$$\|\Gamma_{+1}(A)\|_{L^p \to L^r} = 1 \quad \text{if and only if} \quad \|A\|^2 \le \frac{p-1}{r-1}. \tag{N}$$
This property plays a fundamental role in some questions of quantum field theory, and has been widely extended to other contexts. The study of hypercontractivity of Markov operators has now developed into a new branch of analysis and probability theory, and we refer to [G3] and [Ba] for recent surveys of the theory and applications of hypercontractivity. In subsequent work, with the help of the logarithmic Sobolev inequalities techniques he had developed in [G1], L. Gross investigated in [G2] the analogue of Nelson’s inequalities when the bosonic second quantization functor is replaced by its fermionic analog
$\Gamma_{-1}$, which assigns to every real Hilbert space $H$ its complex Clifford algebra, with its canonical von Neumann algebra structure, and to every contraction $A : H \to K$ a unital completely positive map $\Gamma_{-1}(A) : \Gamma_{-1}(H) \to \Gamma_{-1}(K)$. He conjectured that the inequalities (N) hold, with $\Gamma_{-1}$ instead of $\Gamma_{+1}$ and with respect to the non-commutative $L^p$ norms on the Clifford algebra, and obtained a weaker result. After some partial results by J. M. Lindsay and P. A. Meyer (see [L] and [LM]), Gross’s conjecture was finally proven by E. Carlen and E.H. Lieb [CL].
In his investigations on the II$_1$ factors of free groups, D. Voiculescu has developed a theory of free probability, in which a third functor $\Gamma_0$, from the category of real Hilbert spaces with the contractions as morphisms, to the category of von Neumann algebras with a finite trace, and unital completely positive maps as morphisms, plays the role of the functor $\Gamma_{+1}$ in classical probability theory. In particular, for every real Hilbert space $H$, the von Neumann algebra $\Gamma_0(H)$ is isomorphic to the von Neumann algebra generated by the regular representation of a free group with $n$ generators, $n$ being the dimension of $H$, which is a II$_1$ factor if $n \ge 2$. This theory, introduced in [V], is developed in the book [VDN] and has some remarkable connections with the theory of random matrices.
In this paper we shall prove that Nelson’s inequalities (N) still hold when the functor $\Gamma_{+1}$ is replaced by the functor $\Gamma_0$, the $L^p$ norms being the non-commutative $L^p$ norms on the von Neumann algebra $\Gamma_0(H)$. The proof consists in first extending the above quoted result of E. Carlen and E. Lieb, on fermionic hypercontractivity, to systems of spins with mixed commutation and anti-commutation relations, and then deducing the result for the free second quantization functor by using a central limit argument, which relies on R. Speicher’s central limit theorem [S].
This is very much in the spirit of Gross’s proof of Nelson’s inequalities in [G1]. In fact the argument that we give works equally well to include the functors $\Gamma_q$; $-1 \le q \le 1$ of M. Bożejko, B. Kümmerer and R. Speicher [BKS], which are a one parameter family of functors interpolating between the bosonic, fermionic, and free second quantization functors, and we shall give the proof of Nelson’s inequalities in this general setting.
This paper is organized as follows. In the first part, we describe the functors $\Gamma_q$, following [BKS], and state the problem of Nelson’s inequalities. In the second part, we extend the result of Carlen and Lieb on fermionic hypercontractivity to systems of spins with mixed commutation and anti-commutation relations. Although this extension is rather straightforward, the result in itself may be of some interest. We recall R. Speicher’s central limit theorem in Sect. 3, and finally we use it to deduce Nelson’s inequalities for the functors $\Gamma_q$; $-1 < q < 1$.
1. The Functors $\Gamma_q$
In this section we shall describe the functors $\Gamma_q$, following [BKS] to which we refer for details. The cases $q = \pm 1$ correspond to the bosonic and fermionic second quantization functors, while the case $q = 0$ (free second quantization functor) was treated by D. Voiculescu in [V].
1.1. q-Fock Spaces. Let $H$ be a real Hilbert space with complexification $H_{\mathbb{C}}$ (in the sequel we shall always assume that our Hilbert spaces are separable), denote by $\mathcal{F}(H_{\mathbb{C}})$ the complex vector space $\mathcal{F}(H_{\mathbb{C}}) = \bigoplus_{n=0}^{\infty} H_{\mathbb{C}}^{\otimes n}$, where the direct sum and the tensor product are algebraic and, by definition, $H_{\mathbb{C}}^{\otimes 0} = \mathbb{C}\Omega$ for some non zero vector $\Omega$,
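called “the vacuum”. For every $q \in [-1, 1]$, define a hermitian form $\langle \cdot, \cdot \rangle_q$ on $\mathcal{F}(H)$ by sesquilinear extension of $\langle \Omega, \Omega \rangle_q = 1$,
$$\langle f_1 \otimes f_2 \otimes \cdots \otimes f_m,\; g_1 \otimes g_2 \otimes \cdots \otimes g_n \rangle_q = \delta_{mn} \sum_{\pi \in S_m} q^{i(\pi)} \langle f_1, g_{\pi(1)} \rangle \cdots \langle f_m, g_{\pi(m)} \rangle$$
for $f_1, \dots, f_m, g_1, \dots, g_n \in H_{\mathbb{C}}$, where the sum is over the group $S_m$ of all permutations of $\{1, \dots, m\}$, and $i(\pi)$ denotes the number of inversions of the permutation $\pi$, namely
$$i(\pi) = \mathrm{card}\{(i, j) \in \{1, \dots, m\}^2 \mid i < j \text{ and } \pi(i) > \pi(j)\}.$$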
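As a small illustration of the definition above, the following Python sketch evaluates the form on elementary tensors by summing over permutations. The function names (`inversions`, `q_inner`) and the restriction to real coordinate vectors are assumptions made for this example; they are not taken from the paper.

```python
from itertools import permutations

def inversions(perm):
    """Number of pairs i < j with perm[i] > perm[j]."""
    m = len(perm)
    return sum(1 for i in range(m) for j in range(i + 1, m) if perm[i] > perm[j])

def dot(u, v):
    """Euclidean inner product of two real vectors (stand-in for <.,.> on H)."""
    return sum(a * b for a, b in zip(u, v))

def q_inner(fs, gs, q):
    """<f_1 (x) ... (x) f_m, g_1 (x) ... (x) g_n>_q for lists of real vectors."""
    if len(fs) != len(gs):          # the Kronecker delta in m and n
        return 0.0
    total = 0.0
    for perm in permutations(range(len(fs))):
        term = q ** inversions(perm)
        for i, p in enumerate(perm):
            term *= dot(fs[i], gs[p])
        total += term
    return total

e = (1.0, 0.0)
# Squared norm of e (x) e (x) e: the inversion generating function of S_3 at q.
norm_sq = q_inner([e, e, e], [e, e, e], 0.5)
```

For a unit vector repeated three times the sum reduces to $1 + 2q + 2q^2 + q^3 = (1+q)(1+q+q^2)$, and at $q = -1$ the tensor $e \otimes e$ has norm zero, which is the degeneracy that makes the "separating" step necessary at the endpoints.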
Proposition 1 ([BS, Gre, Z]). The hermitian form $\langle \cdot, \cdot \rangle_q$ is non-negative on $\mathcal{F}(H_{\mathbb{C}})$; furthermore, it is positive definite if $-1 < q < 1$.
Denote by $\mathcal{F}_q(H)$ the complex Hilbert space obtained by separating and completing $\mathcal{F}(H)$ with respect to $\langle \cdot, \cdot \rangle_q$. Note that for $q = \pm 1$ we obtain the classical symmetric and antisymmetric Fock spaces associated with $H_{\mathbb{C}}$, while for $q = 0$ one has $\mathcal{F}_0(H_{\mathbb{C}}) = \bigoplus_{n=0}^{\infty} H_{\mathbb{C}}^{\otimes n}$, with the Hilbert space tensor product and direct sum; $\mathcal{F}_0(H_{\mathbb{C}})$ is the free Fock space of Voiculescu [V]. For $f \in H_{\mathbb{C}}$ and $-1 < q < 1$, one defines the creation operator $l^*(f)$ on $\mathcal{F}_q(H)$ as the bounded linear extension of $l^*(f) f_1 \otimes \cdots \otimes f_n = f \otimes f_1 \otimes \cdots \otimes f_n$. The annihilation operator $l(f)$ is defined as the adjoint of $l^*(f)$ and satisfies
$$l(f) f_1 \otimes \cdots \otimes f_n = \sum_{j=1}^{n} q^{j-1} \langle f_j, f \rangle\, f_1 \otimes \cdots \otimes f_{j-1} \otimes f_{j+1} \otimes \cdots \otimes f_n.$$
For $q = \pm 1$ the creation and annihilation operators are defined by similar formulas involving symmetric or antisymmetric tensors, but they are unbounded in the symmetric case. Since the cases $q = \pm 1$ are well known, in the sequel we shall only give proofs of statements for $-1 < q < 1$. It can be verified that these operators fulfill the relations
$$l(g) l^*(f) - q\, l^*(f) l(g) = \langle f, g \rangle I_{\mathcal{F}_q(H)} \quad \text{for } f, g \in H_{\mathbb{C}},$$
which interpolate between the canonical commutation and anticommutation relations (obtained for $q = +1$ and $q = -1$ respectively). Such relations were first introduced by Frisch and Bourret in [BF], in order to describe particles obeying statistics intermediate between Bose and Fermi statistics, but the mathematical proof of existence of such operators on a Hilbert space waited until the beginning of the 90’s.
Definition 1. Let $H$ be a real Hilbert space, then $\Gamma_q(H)$ is the von Neumann algebra of operators on $\mathcal{F}_q(H)$ generated by the self-adjoint operators $\omega(f) = l(f) + l^*(f)$, for $f \in H$.
The vacuum vector $\Omega$ is a unit vector in $\mathcal{F}_q(H)$, hence it defines a state $\tau_q$ on $\Gamma_q(H)$ by $\tau_q(X) = \langle X\Omega, \Omega \rangle_q$ for $X \in \Gamma_q(H)$.
Proposition 2 ([BKS]). The state $\tau_q$ is a faithful normal trace on $\Gamma_q(H)$.
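The q-commutation relation stated above can be checked by hand in the simplest situation, a one dimensional $H$ spanned by a unit vector $e$. In the (non-normalized) basis $e^{\otimes n}$ the creation operator is a shift, and the annihilation formula gives $l(e) e^{\otimes n} = (1 + q + \cdots + q^{n-1}) e^{\otimes (n-1)}$. The sketch below works on a truncated Fock space; the truncation level and all function names are assumptions of this example, and the relation can only be expected to hold below the top level.

```python
def q_number(n, q):
    """[n]_q = 1 + q + ... + q^(n-1)."""
    return sum(q ** k for k in range(n))

def create(vec):
    """l*(e): moves the coefficient of e^{(x)n} to e^{(x)(n+1)}; the top level is cut off."""
    return [0.0] + vec[:-1]

def annihilate(vec, q):
    """l(e): sends e^{(x)n} to [n]_q e^{(x)(n-1)} and kills the vacuum."""
    return [q_number(n + 1, q) * vec[n + 1] for n in range(len(vec) - 1)] + [0.0]

def q_commutator(vec, q):
    """(l l* - q l* l) applied to a coefficient vector."""
    a = annihilate(create(vec), q)
    b = create(annihilate(vec, q))
    return [x - q * y for x, y in zip(a, b)]

def basis(n, size):
    """Coefficient vector of e^{(x)n} in a Fock space truncated to `size` levels."""
    return [1.0 if k == n else 0.0 for k in range(size)]
```

Applying `q_commutator` to `basis(n, size)` with `n < size - 1` returns the same basis vector, reflecting the identity $[n+1]_q - q[n]_q = 1$.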
We shall denote by $L^p(\Gamma_q(H), \tau_q)$, for $1 \le p \le \infty$, the non-commutative $L^p$ spaces associated with the trace $\tau_q$; namely, $L^p(\Gamma_q(H), \tau_q)$ is the completion of $\Gamma_q(H)$ for the Banach space norm $\|X\|_{L^p} = (\tau_q[(X^*X)^{p/2}])^{1/p}$ for $1 \le p < \infty$, and $L^\infty(\Gamma_q(H)) = \Gamma_q(H)$. It follows from the preceding proposition that the map $X \mapsto X\Omega$ is a continuous imbedding of $\Gamma_q(H)$ into $\mathcal{F}_q(H)$, which extends to a unitary isomorphism of $L^2(\Gamma_q(H), \tau_q)$ with $\mathcal{F}_q(H)$.
1.2. Second quantization and Nelson’s inequality. Let $T : H \to K$ be a contraction between real Hilbert spaces, with complexification $T_{\mathbb{C}}$, then the linear map defined on elementary tensors by $\mathcal{F}_q(T)(f_1 \otimes \cdots \otimes f_n) = T_{\mathbb{C}} f_1 \otimes \cdots \otimes T_{\mathbb{C}} f_n$ extends to a contraction from $\mathcal{F}_q(H)$ to $\mathcal{F}_q(K)$.
Theorem 1. ([B-S-P] Theorem 2.1.1) Let $T : H \to K$ be a contraction between real Hilbert spaces, then there exists a unique map $\Gamma_q(T) : \Gamma_q(H) \to \Gamma_q(K)$ such that $\Gamma_q(T)(X)\Omega = \mathcal{F}_q(T)(X\Omega)$ for every $X \in \Gamma_q(H)$. The map $\Gamma_q(T)$ is bounded, normal, unital, completely positive, and trace preserving.
We remark here that $\Gamma_q$ is a functor, namely if $S : H \to K$ and $T : K \to J$ are contractions then $\Gamma_q(TS) = \Gamma_q(T)\Gamma_q(S)$. When $q = \pm 1$ the result of Theorem 1 is a classical property of the bosonic or fermionic second quantization. In the case $q = 0$, this was proved by Voiculescu [V]. Since $\Gamma_q(T)$ is completely positive and unital, it extends to a contraction on all the $L^p$ spaces, as shown in Sect. 3 of [DL]. We are now ready to state the main result of this paper.
Theorem 2. Let $T : H \to K$ be a contraction between real Hilbert spaces, and $1 \le p, r < \infty$, then one has
$$\|\Gamma_q(T)\|_{L^p \to L^r} = 1 \quad \text{if and only if} \quad \|T\|^2 \le \frac{p-1}{r-1}.$$
As in the bosonic or fermionic cases, we shall, as a first step in the proof of Theorem 2, show how to reduce it to a simpler case, namely that where $T$ is a multiple of the identity.
Definition 2. Let $H$ be a real Hilbert space, and $T_t = e^{-t} I_H$ for $t \ge 0$, then the completely positive maps $P_t^q = \Gamma_q(T_t)$; $t \ge 0$, on $\Gamma_q(H)$, form a semigroup, called the q-Ornstein-Uhlenbeck semigroup.
The q-Ornstein-Uhlenbeck semigroup extends to a semigroup of contractions of the non-commutative $L^p$ spaces, which are symmetric on $L^2$. Its generator on $L^2$ is the number operator $N^q$, i.e. $P_t^q = e^{-tN^q}$, where $N^q$ is the unbounded self-adjoint operator given by
$$N^q \Omega = 0, \qquad N^q f_1 \otimes \cdots \otimes f_n = n\, f_1 \otimes \cdots \otimes f_n \quad \text{for } f_1, \dots, f_n \in H_{\mathbb{C}}.$$
The following property of the q-Ornstein-Uhlenbeck semigroup is a direct consequence of Theorem 2.
Theorem 3. For every real Hilbert space $H$, one has
$$\|P_t^q\|_{L^p \to L^r} = 1 \quad \text{if and only if} \quad e^{-2t} \le \frac{p-1}{r-1}.$$
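The condition in Theorem 3 can be read as a threshold on the time $t$: the semigroup becomes a contraction from $L^p$ to $L^r$ once $t \ge \frac{1}{2}\log\frac{r-1}{p-1}$. The helper below is only a convenience sketch of that arithmetic; the function name is hypothetical and it assumes $p > 1$ and $r > 1$ so that the ratio is well defined.

```python
import math

def nelson_time(p, r):
    """Smallest t >= 0 with exp(-2 t) <= (p - 1) / (r - 1), assuming p > 1 and r > 1."""
    if p <= 1 or r <= 1:
        raise ValueError("this sketch assumes p > 1 and r > 1")
    return max(0.0, 0.5 * math.log((r - 1) / (p - 1)))
```

For instance, going from $L^2$ to $L^4$ requires $t \ge \frac{1}{2}\log 3$, while for $r \le p$ no waiting time is needed.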
It turns out that this particular case implies the more general result.
Proof that Theorem 3 implies Theorem 2. Let $T : H \to K$ be a non zero contraction, then $T = (T/\|T\|)\,\|T\| I_H$, and since $T/\|T\|$ is a contraction, $\Gamma_q(T/\|T\|)$ induces a contraction between $L^p$ spaces, hence $\|\Gamma_q(T)\|_{L^p \to L^r} \le \|\Gamma_q(\|T\| I_H)\|_{L^p \to L^r}$. Applying the “if” part of Theorem 3 to $\Gamma_q(\|T\| I_H) = P^q_{\log(1/\|T\|)}$ yields the “if” part of Theorem 2.
For the “only if” part, consider $r$, $p$ and $\rho$ such that $\|T\|^2 > \rho^2 > \frac{p-1}{r-1}$. The operator $T^*T/\|T\|$ is a positive self-adjoint contraction of $H$, of norm $\|T\|$; let $\Pi_\rho$ denote its spectral projection corresponding to the interval $[\rho, \infty[$, then there exists a contraction $S$ of $H$ such that $S T^*T/\|T\| = \rho \Pi_\rho$. The restriction of the map $\Gamma_q(\rho \Pi_\rho)$ to $\Gamma_q(\Pi_\rho(H))$ belongs to the Ornstein-Uhlenbeck semigroup on $\Gamma_q(\Pi_\rho(H))$, hence by the “only if” part of Theorem 3, there exists $X \in \Gamma_q(\Pi_\rho(H)) \subset \Gamma_q(H)$ such that $\|\Gamma_q(\rho \Pi_\rho) X\|_{L^r} > \|X\|_{L^p}$. Since $\Gamma_q(S T^*/\|T\|)$ is a contraction on $L^r$, it follows that
$$\|\Gamma_q(T) X\|_{L^r} \ge \|\Gamma_q(S T^*T/\|T\|) X\|_{L^r} = \|\Gamma_q(\rho \Pi_\rho) X\|_{L^r} > \|X\|_{L^p},$$
hence $\|\Gamma_q(T)\|_{L^p \to L^r} > 1$.
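We shall now prove the easy half of Theorem 3.
Proof of the necessity condition in Theorem 3. Let $h \in H$ be a unit vector, and let $\varphi_\epsilon = 1 + \epsilon\,\omega(h)$, for $\epsilon > 0$, then $\omega(h)$ is a bounded operator, and $P_t^q(\varphi_\epsilon) = \varphi_{\epsilon e^{-t}}$. Since $\tau_q(\omega(h)) = 0$ and $\tau_q(\omega(h)^2) = 1$, for small $\epsilon$ one has
$$\|\varphi_\epsilon\|_{L^p} = \Big(\tau_q\Big[1 + p\,\epsilon\,\omega(h) + \frac{p(p-1)}{2}\,\epsilon^2 \omega(h)^2 + o(\epsilon^2)\Big]\Big)^{1/p} = 1 + (p-1)\frac{\epsilon^2}{2} + o(\epsilon^2).$$
If $p$ and $r$ are such that $\|P_t^q\|_{L^p \to L^r} \le 1$, one must have $\|P_t^q \varphi_\epsilon\|_{L^r} = \|\varphi_{\epsilon e^{-t}}\|_{L^r} \le \|\varphi_\epsilon\|_{L^p}$, that is,
$$1 + e^{-2t}(r-1)\frac{\epsilon^2}{2} + o(\epsilon^2) \le 1 + (p-1)\frac{\epsilon^2}{2} + o(\epsilon^2),$$
so that if we let $\epsilon \to 0$ we get $e^{-2t} \le \frac{p-1}{r-1}$.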
The proof of Theorem 3 will be given in Sect. 3, after some preliminary results in Sect. 2.
1.3. The one dimensional case. When $H = \mathbb{R}e$ is one dimensional, generated by a unit vector $e$, the von Neumann algebra $\Gamma_q(H)$ is generated by the single self-adjoint element $\omega(e) = l(e) + l^*(e)$, hence by functional calculus we have an isomorphism $f \mapsto f(\omega(e))$ from $L^\infty(\mu_q(dx))$ to $\Gamma_q(H)$, for some probability measure $\mu_q$ with compact support on $\mathbb{R}$, the trace $\tau_q$ being given by integration with respect to $\mu_q$. The one dimensional subspaces $H_{\mathbb{C}}^{\otimes n}$ of $L^2(\Gamma_q(H), \tau_q)$ are obtained by orthogonalization from the sequence $\omega(e)^n$, so that they correspond in $L^2(\mu_q)$ to the spaces spanned by the orthogonal polynomials with respect to $\mu_q$. Using the explicit form of the operators $l(e)$ and $l^*(e)$, one checks that these polynomials $H_n^q$ satisfy the recursion relations
$$x H_n^q(x) = H_{n+1}^q(x) + \frac{q^n - 1}{q - 1} H_{n-1}^q(x).$$
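The recursion above determines the polynomials once $H_0^q = 1$ and $H_{-1}^q = 0$ are fixed, which is the normalization assumed in the sketch below (monic polynomials, coefficients stored lowest degree first). The coefficient $(q^n - 1)/(q - 1)$ is computed as $1 + q + \cdots + q^{n-1}$ so that $q = 1$ causes no division by zero; the function names are illustrative only.

```python
def q_number(n, q):
    """(q^n - 1) / (q - 1), written as 1 + q + ... + q^(n-1)."""
    return sum(q ** k for k in range(n))

def q_hermite(n, q):
    """Coefficients (lowest degree first) of H_n^q from x H_k = H_{k+1} + [k]_q H_{k-1}."""
    prev, cur = [0.0], [1.0]                      # H_{-1} = 0, H_0 = 1
    for k in range(n):
        shifted = [0.0] + cur                     # coefficients of x * H_k
        padded = prev + [0.0] * (len(shifted) - len(prev))
        nxt = [s - q_number(k, q) * c for s, c in zip(shifted, padded)]
        prev, cur = cur, nxt
    return cur
```

With this normalization `q_hermite(3, 1.0)` gives $x^3 - 3x$ (the classical Hermite case) and `q_hermite(3, 0.0)` gives $x^3 - 2x$, the degree 3 Tchebychev polynomial of the second kind in the variable $x/2$.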
These polynomials are q-deformations of the Hermite polynomials, which are obtained for $q \to 1$. The orthogonalizing measure $\mu_q$ can be computed; indeed, for $-1 < q < 1$ the measure has support in $\left[-\frac{2}{\sqrt{1-q}}, \frac{2}{\sqrt{1-q}}\right]$, and is given by
$$\mu_q(dx) = \frac{\sqrt{1-q}}{\pi} \sin\theta \prod_{n=1}^{\infty} (1 - q^n)\,|1 - q^n e^{2i\theta}|^2\, dx,$$
where $x = \frac{2}{\sqrt{1-q}} \cos\theta$ with $\theta \in [0, \pi]$. This result goes back to Szegő [Sz]. The q-Ornstein-Uhlenbeck semigroup on $L^\infty(\mu_q)$ is a Markov semigroup, acting by multiplication by $e^{-nt}$ on the polynomial $H_n^q(x)$. It has an explicit kernel representation
$$P_t^q f(x) = \int f(y)\, p_t^q(x, y)\, \mu_q(dy)$$
with the q-Mehler kernel (see e.g. [Br])
$$p_t^q(x, y) = \prod_{n=0}^{\infty} \frac{1 - e^{-2t} q^n}{|(1 - e^{-t + i\phi + i\psi} q^n)(1 - e^{-t + i\phi - i\psi} q^n)|^2},$$
where $x = \frac{2}{\sqrt{1-q}} \cos\phi$ and $y = \frac{2}{\sqrt{1-q}} \cos\psi$.
In dimension 1, the question of hypercontractivity of the q-Ornstein-Uhlenbeck semigroup is thus reduced to a question on multipliers for the q-Hermite polynomials. In the case $q = 0$, this question can be solved directly; indeed, the 0-Hermite polynomials are just Tchebychev polynomials of the second kind,
$$H_n^0(2\cos\theta) = \frac{\sin(n+1)\theta}{\sin\theta},$$
which are orthogonal with respect to the semi-circle distribution
$$\mu_0(dx) = \frac{1}{2\pi} \sqrt{4 - x^2}\, dx$$
on $[-2, 2]$. The Mehler kernel is equal to
$$p_t^{(0)}(x, y) = \sum_{n=0}^{\infty} e^{-nt} H_n^0(x) H_n^0(y) = \frac{1 - e^{-2t}}{(1 - e^{-2t})^2 - e^{-t}(1 + e^{-2t})\,xy + e^{-2t}(x^2 + y^2)}.$$
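The closed form of the 0-Mehler kernel can be compared numerically with a partial sum of the series. The sketch below uses the closed form with cross term coefficient $e^{-t}(1 + e^{-2t})$, which is what the series sums to and what reproduces the value $(1 + e^{-t})/(1 - e^{-t})^3$ at $x = y = 2$; the function names and the truncation at 200 terms are choices made for this illustration.

```python
import math

def mehler_series(t, x, y, terms=200):
    """Partial sum of sum_n e^{-nt} H_n^0(x) H_n^0(y), with x H_n = H_{n+1} + H_{n-1}."""
    r = math.exp(-t)
    hx_prev, hx = 0.0, 1.0        # H_{-1}(x), H_0(x)
    hy_prev, hy = 0.0, 1.0        # H_{-1}(y), H_0(y)
    total, weight = 0.0, 1.0
    for _ in range(terms):
        total += weight * hx * hy
        hx_prev, hx = hx, x * hx - hx_prev
        hy_prev, hy = hy, y * hy - hy_prev
        weight *= r
    return total

def mehler_closed(t, x, y):
    """Closed form of the 0-Mehler kernel for x, y in [-2, 2] and t > 0."""
    r = math.exp(-t)
    denom = (1 - r * r) ** 2 - r * (1 + r * r) * x * y + r * r * (x * x + y * y)
    return (1 - r * r) / denom
```

For $t$ not too close to zero the two functions agree to high accuracy on $[-2, 2]^2$; near $t = 0$ more terms of the series would be needed.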
The semi-circle distribution is the image of the uniform measure on a sphere $S^3 \subset \mathbb{R}^4$, of radius 2, by projection onto the first coordinate. The nth Tchebychev polynomial evaluated on the first coordinate is a spherical harmonic of degree $n$, hence the 0-Ornstein-Uhlenbeck semigroup is nothing but the restriction of the Poisson semigroup on the sphere $S^3$ to the space of functions depending only on the first coordinate. The hypercontractivity estimate for the 0-Ornstein-Uhlenbeck semigroup given by Theorem 3 (or at least the “if” part of the theorem) is thus a consequence of that for the Poisson
semigroup on the sphere, which was obtained by Beckner [Be]. We shall not use this result, so that we will obtain another proof of the hypercontractivity of the kernel $p_t^{(0)}$.
1.4. Sobolev inequalities and ultracontractivity properties. In this section we discuss some inequalities which are consequences of, or are related to, the hypercontractivity property of the q-Ornstein-Uhlenbeck semigroup. Since this is not our main subject in this paper, we shall only outline proofs. As usual in the theory of hypercontractivity, one can derive from Theorem 3 a logarithmic Sobolev inequality for the number operator $N^q$ (see [G1, G3, Ba]). Note that the unbounded quadratic form $Q_q(X) = \tau_q(X N^q X^*)$ on $L^2(\Gamma_q(H), \tau_q)$ is a noncommutative completely Dirichlet form (on an appropriate domain) in the sense of [DL]. From Theorem 3 we can infer the following
Corollary 1. (Logarithmic Sobolev inequality) For all $X$ in the domain of $Q_q$, one has
$$\tau_q(|X|^2 \log |X|^2) - \|X\|_{L^2}^2 \log \|X\|_{L^2}^2 \le 2\, Q_q(X).$$
Proof. Take the derivative at $t = 0$ in the inequality $\|P_t^q(X)\|_{L^2} \le \|X\|_{L^{1+e^{-2t}}}$.
Also there are $L^p$ estimates for the eigenvectors of $N^q$.
Corollary 2. Let $X$ be in the eigenspace of $N^q$, with eigenvalue $n$, and $p \ge 2$, then $\|X\|_{L^p} \le (p-1)^{n/2} \|X\|_{L^2}$.
Proof. By Theorem 3, for $e^{2t} = p - 1$, one has $e^{-nt} \|X\|_{L^p} = \|P_t^q X\|_{L^p} \le \|X\|_{L^2}$.
In particular, all $L^p$ norms for $1 < p < \infty$ are equivalent on an eigenspace of $N^q$. In the free case (i.e. $q = 0$) one can prove more.
Theorem 4. ([Bo]) Let $X$ be in the eigenspace of $N^0$, with eigenvalue $n$, then $\|X\|_{L^\infty} \le (n+1) \|X\|_{L^2}$.
Proof. This estimate follows from [Bo]. The similarity between this inequality and Haagerup’s inequality on free groups (see Lemma 4 in [Ha]) is not fortuitous; in fact it is possible to derive Theorem 4 from Haagerup’s inequality by using a central limit argument based on Voiculescu’s central limit theorem for free random variables (see [VDN]), but we shall not do this here.
In fact it is presumably true that an analogous estimate holds also for the case $-1 < q < 1$, but we have not been able to prove this. Theorem 4 allows one to compute the norm $\|P_t^0\|_{L^1 \to L^\infty}$.
Corollary 3. For $t > 0$ one has $\|P_t^0\|_{L^1 \to L^\infty} = \frac{1 + e^{-t}}{(1 - e^{-t})^3}$.
Proof. For the lower bound it is enough to consider the one dimensional case, but then the bound follows from the explicit form of the 0-Mehler kernel,
$$\|P_t^0\|_{L^1 \to L^\infty} = \sup_{x, y \in [-2, 2]} p_t^{(0)}(x, y) = \frac{1 + e^{-t}}{(1 - e^{-t})^3}.$$
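On the other hand, let $X = \sum_{n \ge 0} X^{(n)}$ be the decomposition of $X \in \Gamma_0(H)$ according to the eigenspaces of $N^0$, then
$$\begin{aligned}
\|P_t^0(X)\|_{L^\infty} &\le \sum_{n \ge 0} \|P_t^0(X^{(n)})\|_{L^\infty} \\
&= \sum_{n \ge 0} e^{-nt} \|X^{(n)}\|_{L^\infty} \\
&\le \sum_{n \ge 0} e^{-nt} (n+1) \|X^{(n)}\|_{L^2} \\
&\le \Big(\sum_{n \ge 0} e^{-2nt} (n+1)^2\Big)^{1/2} \Big(\sum_{n \ge 0} \|X^{(n)}\|_{L^2}^2\Big)^{1/2} \\
&= \sqrt{\frac{1 + e^{-2t}}{(1 - e^{-2t})^3}}\; \|X\|_{L^2}.
\end{aligned}$$
This implies that $\|P_t^0\|_{L^2 \to L^\infty} \le \sqrt{\frac{1 + e^{-2t}}{(1 - e^{-2t})^3}}$. Since $P_t^0$ is symmetric we can dualize and get $\|P_t^0\|_{L^1 \to L^2} \le \sqrt{\frac{1 + e^{-2t}}{(1 - e^{-2t})^3}}$. Using the semigroup property, one has
$$\|P_t^0\|_{L^1 \to L^\infty} \le \|P_{t/2}^0\|_{L^1 \to L^2}\, \|P_{t/2}^0\|_{L^2 \to L^\infty} \le \frac{1 + e^{-t}}{(1 - e^{-t})^3},$$
which gives the upper bound.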
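The only computation hidden in the chain of inequalities above is the series $\sum_{n \ge 0} s^n (n+1)^2 = (1+s)/(1-s)^3$ with $s = e^{-2t}$, followed by the substitution $t \to t/2$ in both half-time bounds. A short numerical sanity check, with illustrative function names and a truncation level chosen for this sketch:

```python
import math

def weighted_square_sum(t, terms=2000):
    """Partial sum of sum_n e^{-2nt} (n + 1)^2."""
    s = math.exp(-2.0 * t)
    return sum(s ** n * (n + 1) ** 2 for n in range(terms))

def closed_form(t):
    """(1 + e^{-2t}) / (1 - e^{-2t})^3, the value of the full series for t > 0."""
    s = math.exp(-2.0 * t)
    return (1 + s) / (1 - s) ** 3

def l1_to_linf_bound(t):
    """Product of the two half-time bounds sqrt(closed_form(t / 2)) ** 2."""
    return closed_form(t / 2.0)
```

Here `l1_to_linf_bound(t)` coincides with $(1 + e^{-t})/(1 - e^{-t})^3$, the value stated in Corollary 3.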
It is now standard, in the case of Markov semigroups, that an estimate such as that of Corollary 3 implies a corresponding Sobolev inequality, see for example [Ba, D or CSV] for more information. We shall see that results of non-commutative integration theory allow one to prove analogous results.
Corollary 4. (Sobolev inequality) There exists a constant $c > 0$ such that for all $X$ in the domain of $Q_0$ one has $\|X\|_{L^3}^2 \le c\,(|\tau_0(X)|^2 + Q_0(X))$.
Proof. For an operator $X$, denote by $X^*$ its decreasing rearrangement, i.e. $X^*(\lambda) = \tau_0(E_{[\lambda, \infty[})$ for $\lambda \ge 0$, where $E$ is the family of spectral projections associated to $|X|$. According to [Y], Proposition 2.4 (iii), one has $(X + Y)^*(\alpha + \beta) \le X^*(\alpha) + Y^*(\beta)$ for all $\alpha, \beta \ge 0$. It follows that the proofs of Theorem 2.4.2 and Corollary 2.4.3 in [D] can be carried over, almost word for word, to the non-commutative case, where $e^{-Ht}$ is a symmetric non-commutative Markov semigroup. One needs to replace the Riesz-Thorin and Marcinkiewicz interpolation theorems by their extensions to non-commutative $L^p$ and Lorentz spaces (see [K and PS]). From the proof of Corollary 3, we obtain the estimate $\|P_t^0\|_{L^2 \to L^\infty} \le c\, t^{-3/2}$ for $0 < t < 1$, and applying the non-commutative extension of Corollary 2.4.3 to $P_t^0$, we get the inequality $\|X\|_{L^3}^2 \le c\,(\|X\|_{L^2}^2 + Q_0(X))$. We just remark that $\|X\|_{L^2}^2 \le |\tau_0(X)|^2 + Q_0(X)$ and obtain the required inequality.
We do not know the best constant $c$ in this inequality.
2. Spin Systems with Mixed Commutation and Anti-commutation Relations
We shall now introduce some spin systems, i.e. families of self-adjoint unitary operators, such that any two of them either commute or anticommute. When all the spins anticommute, they generate a Clifford algebra, whereas, when they all commute, they generate an algebra isomorphic with an algebra of independent Bernoulli random variables. The general case is between these two extremes. We shall develop a Fock model for representing such a spin system with creation and annihilation operators, and derive a generalized Jordan-Wigner transformation which will be fundamental for our purposes. The content of the first two sections might be well known, but we have not found adequate references.
2.1. Definition and Ornstein-Uhlenbeck semigroups. Let $I$ be a finite set and $\varepsilon$ be a function on $I \times I$ with values in $\{-1, +1\}$ such that $\varepsilon(i, j) = \varepsilon(j, i)$ and $\varepsilon(i, i) = -1$ for all $(i, j) \in I \times I$. Consider the complex unital algebra $A(I, \varepsilon)$, with generators $(x_i)_{i \in I}$ and relations
$$x_i x_j - \varepsilon(i, j)\, x_j x_i = 2\,\delta_{ij} \quad \text{for } i, j \in I. \tag{S}$$
Suppose that $I$ is totally ordered, then this algebra has a vector space basis consisting of the elements $x_A$; $A \subset I$, where $x_A = x_{i_1} \cdots x_{i_p}$ if $A = \{i_1, \dots, i_p\} \subset I$ with $i_1 < i_2 < \cdots < i_p$ and $x_\emptyset = 1$. Moreover, if the order of $I$ is changed, the basis vectors $x_A$ are just multiplied by some $\pm 1$. We endow $A(I, \varepsilon)$ with the antilinear involution such that $x_i^* = x_i$ for $i \in I$, and denote by $\tau^\varepsilon$ the tracial linear map such that $\tau^\varepsilon(x_A) = \delta_{A\emptyset}$ for $A \subset I$. Clearly, $\tau^\varepsilon$ is independent of the ordering chosen on $I$. There is a positive definite hermitian form $\langle X, Y \rangle = \tau^\varepsilon(XY^*)$ on $A(I, \varepsilon)$ for which the basis $x_A$ is orthonormal. Letting $A(I, \varepsilon)$ act by left multiplication on the Hilbert space $(A(I, \varepsilon), \langle \cdot, \cdot \rangle)$, we get a faithful ∗-representation of $A(I, \varepsilon)$. We shall endow $A(I, \varepsilon)$ with the von Neumann algebra structure induced by this representation, and denote by $L^p(A(I, \varepsilon), \tau^\varepsilon)$ the associated $L^p$ spaces. The representation of $A(I, \varepsilon)$ by left multiplication operators is just the GNS representation associated with the tracial state $\tau^\varepsilon$.
We shall now introduce the $\varepsilon$-Ornstein-Uhlenbeck semigroup on such an algebra. Let $I$ and $\varepsilon$ be as above, define a new set $\tilde{I} = I \times \{0, 1\}$, a function $\tilde{\varepsilon}$ on $\tilde{I} \times \tilde{I}$ by $\tilde{\varepsilon}((i, u), (j, v)) = \varepsilon(i, j)$, and consider the algebra $A(\tilde{I}, \tilde{\varepsilon})$. The algebra $A(I, \varepsilon)$ can be embedded as a ∗-subalgebra of $A(\tilde{I}, \tilde{\varepsilon})$, by sending its generator $x_i$ to $x_{(i,0)}$. Denote by $E^\varepsilon$ the conditional expectation, with respect to $\tau^{\tilde{\varepsilon}}$, from $A(\tilde{I}, \tilde{\varepsilon})$ onto $A(I, \varepsilon)$. One has $E^\varepsilon[x_{A \times \{0\}}\, x_{B \times \{1\}}] = x_{A \times \{0\}}\, \delta_{B\emptyset}$, for $A, B \subset I$. For $t \ge 0$, it is plain, from the defining relations of $A(\tilde{I}, \tilde{\varepsilon})$, that the elements $y_i^t = e^{-t} x_{(i,0)} + \sqrt{1 - e^{-2t}}\, x_{(i,1)}$ for $i \in I$ satisfy the same relations as the $x_i$; $i \in I$, hence there is an embedding ∗-homomorphism $R_t : A(I, \varepsilon) \to A(\tilde{I}, \tilde{\varepsilon})$ such that $R_t(x_i) = y_i^t$. It follows that $P_t^\varepsilon = E^\varepsilon \circ R_t$ is a unital, completely positive map from $A(I, \varepsilon)$ to itself.
Furthermore, this map satisfies $P_t^\varepsilon x_A = e^{-|A| t} x_A$ for $A \subset I$, hence the maps $P_t^\varepsilon$ form a semigroup.
Definition 3. The semigroup $P_t^\varepsilon$; $t \ge 0$ will be called the $\varepsilon$-Ornstein-Uhlenbeck semigroup on $A(I, \varepsilon)$.
By the properties of the conditional expectations, we know that the maps $P_t^\varepsilon$ are unital, completely positive, and they extend to contractions of the non-commutative $L^p$ spaces of $A(I, \varepsilon)$. The semigroup $P_t^\varepsilon$ is symmetric on $L^2(A(I, \varepsilon), \tau^\varepsilon)$ and has a self-adjoint generator $N^\varepsilon$, the number operator, given by $N^\varepsilon(x_A) = |A|\, x_A$ for $A \subset I$.
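2.2. Fock model. One can define creation and annihilation operators $\beta_i^*$ and $\beta_i$ for $i \in I$, on $L^2(A(I, \varepsilon), \tau^\varepsilon)$ by the formulas
$$\beta_i^*(x_A) = \begin{cases} x_i x_A & \text{if } i \notin A, \\ 0 & \text{if } i \in A, \end{cases} \qquad \beta_i(x_A) = \begin{cases} 0 & \text{if } i \notin A, \\ x_i x_A & \text{if } i \in A, \end{cases}$$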
where the basis $x_A$ is as in Sect. 2.1, for some total ordering of $I$. It is easy to see that the operators $\beta_i$ and $\beta_i^*$ do not depend on the particular order chosen. From the formula $\tau^\varepsilon(x_i x_A x_B^*) = \tau^\varepsilon(x_A (x_i x_B)^*)$ one sees that the operators $\beta_i^*$ and $\beta_i$ are adjoint of each other. They satisfy the relations
$$(\beta_i^*)^2 = \beta_i^2 = 0, \qquad \beta_i \beta_i^* + \beta_i^* \beta_i = \mathrm{Id}, \qquad \beta_i^{\eta} \beta_j^{\eta'} - \varepsilon(i, j)\, \beta_j^{\eta'} \beta_i^{\eta} = 0 \quad \text{for } i \ne j, \tag{F}$$
where $\beta_i^{\eta}$ (resp. $\beta_j^{\eta'}$) denotes either $\beta_i$ or $\beta_i^*$ (resp. $\beta_j$ or $\beta_j^*$), and Id is the identity operator on $L^2(A(I, \varepsilon), \tau^\varepsilon)$ (beware that 1 in Sect. 2.1 and below denotes the unit in $A(I, \varepsilon)$, which is identified with a vector in $L^2(A(I, \varepsilon), \tau^\varepsilon)$). Furthermore, $\beta_i + \beta_i^* = x_i$, where $x_i$ acts on $L^2(A(I, \varepsilon), \tau^\varepsilon)$ by left multiplication. In the fermionic case, i.e. $\varepsilon \equiv -1$, the operators $\beta_i$ and $\beta_i^*$ are the classical fermionic creation and annihilation operators, while in the bosonic case ($\varepsilon \equiv 1$), they are the “Bébé Fock” creation and annihilation operators (see Sect. II.2 of [M]). We have the following formula for the number operator acting on $L^2(A(I, \varepsilon), \tau^\varepsilon)$:
$$N^\varepsilon = \sum_{i \in I} \beta_i^* \beta_i.$$
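Note that the ∗-algebra generated by $\beta_i$ is isomorphic with $M_2(\mathbb{C})$, with $\beta_i$ identified with $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$. For each $j \in I$ define $\chi_j = \beta_j \beta_j^* - \beta_j^* \beta_j$, then from the relations (F) one gets
$$\chi_j = \chi_j^*, \qquad \chi_j(1) = 1, \qquad \chi_j^2 = \mathrm{Id}, \qquad \chi_j \beta_j = -\beta_j \chi_j \quad \text{for } j \in I,$$
and
$$\beta_i \chi_j = \chi_j \beta_i, \qquad \chi_j \chi_i = \chi_i \chi_j \quad \text{for } i \ne j.$$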
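Let us suppose $I$ is totally ordered, e.g. $I = \{1, \dots, n\}$, and define, for $j \in I$,
$$v_j = \chi_{1, \varepsilon(1,j)}\, \chi_{2, \varepsilon(2,j)} \cdots \chi_{j-1, \varepsilon(j-1,j)}, \qquad \gamma_j = v_j \beta_j,$$
where $\chi_{j,+1} = 1$ and $\chi_{j,-1} = \chi_j$, so that $v_j$ belongs to the ∗-algebra generated by $\beta_1, \dots, \beta_{j-1}$. One has
$$v_j(1) = 1, \qquad v_j = v_j^*, \qquad v_j^2 = \mathrm{Id}, \qquad v_i \gamma_j = \gamma_j v_i \quad \text{for } i \ne j.$$
It is clear from the relations above that for each $j$ the ∗-algebra generated by $\gamma_j$ is isomorphic with $M_2(\mathbb{C})$ and
$$\gamma_j \gamma_i - \gamma_i \gamma_j = \gamma_i \gamma_j^* - \gamma_j^* \gamma_i = 0 \quad \text{for all } i \ne j.$$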
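It follows that the ∗-algebra generated by the operators $\gamma_j$; $j \in I$ is isomorphic with $M_2(\mathbb{C})^{\otimes n} \cong M_{2^n}(\mathbb{C})$, hence from dimension considerations, it is equal to $B(L^2(A(I, \varepsilon), \tau^\varepsilon))$, and it coincides with the algebra generated by the $\beta_i$ and $\beta_i^*$ for $i \in I$. The transformation from $\beta$ to $\gamma$ operators is a generalized Jordan-Wigner transform (see e.g. [CL] for the fermionic case). We shall use this transform to prove a factorization property for expectations with respect to the pure state 1.
Proposition 3. For all $i \in I$, let $\alpha_i \in A(\beta_i, \beta_i^*)$, the ∗-algebra generated by $\beta_i$, and let $i_1, \dots, i_r \in I$ be $r$ distinct elements, then
$$\langle \alpha_{i_1} \cdots \alpha_{i_r} 1, 1 \rangle = \langle \alpha_{i_1} 1, 1 \rangle \cdots \langle \alpha_{i_r} 1, 1 \rangle,$$
the inner product being taken in $L^2(A(I, \varepsilon), \tau^\varepsilon)$.
Proof. It is enough to consider the case where $I = \{1, \dots, n\}$ is totally ordered and to prove that $\langle X\alpha 1, 1 \rangle = \langle X 1, 1 \rangle \langle \alpha 1, 1 \rangle$ whenever $\alpha$ belongs to $A(\beta_n, \beta_n^*)$ and $X$ to the ∗-algebra generated by $\beta_1, \dots, \beta_{n-1}$. By linearity, one needs only to consider the four cases $\alpha = \beta_n$, $\alpha = \beta_n^*$, $\alpha = \beta_n^* \beta_n$, $\alpha = \beta_n \beta_n^*$. If $\alpha = \beta_n$ or $\alpha = \beta_n^* \beta_n$, then $\alpha 1 = 0$ and the result follows. If $\alpha = \beta_n \beta_n^*$ one has $\alpha 1 = 1$, and again the result is easy. Finally if $\alpha = \beta_n^*$, then on one hand $\langle \alpha 1, 1 \rangle = 0$ and on the other hand
$$\begin{aligned}
\langle X\alpha 1, 1 \rangle &= \langle X v_n v_n \beta_n^* 1, 1 \rangle \\
&= \langle v_n \beta_n^* X v_n 1, 1 \rangle \quad \text{since } v_n \beta_n^* = \gamma_n^* \text{ commutes with } \beta_j;\ j < n \\
&= \langle X v_n 1, \beta_n v_n 1 \rangle \\
&= 0 \quad \text{since } v_n 1 = 1 \text{ and } \beta_n 1 = 0.
\end{aligned}$$
Another crucial property of our algebra is the following. As we have seen, we can embed $A(I, \varepsilon)$ into $B(L^2(A(I, \varepsilon), \tau^\varepsilon))$ by the left regular (or GNS) representation; this yields the formula $\tau^\varepsilon(X) = \langle X 1, 1 \rangle$ for the trace $\tau^\varepsilon$ evaluated on some element of $A(I, \varepsilon)$. Let now tr denote the normalized trace on the algebra $B(L^2(A(I, \varepsilon), \tau^\varepsilon))$.
Proposition 4. For all $X \in A(I, \varepsilon)$ one has $\tau^\varepsilon(X) = \mathrm{tr}(X) = \langle X 1, 1 \rangle$.
Proof. Let us consider the orthonormal basis $x_A$; $A \subset I$ of $L^2(A(I, \varepsilon), \tau^\varepsilon)$, then for all $A, B \subset I$, one has $\langle x_A x_B, x_B \rangle = 0$, unless $A = \emptyset$, so that
$$\mathrm{tr}(x_A) = 2^{-|I|} \sum_{B \subset I} \langle x_A x_B, x_B \rangle = \delta_{A\emptyset} = \tau^\varepsilon(x_A).$$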
Using the generalized Jordan-Wigner transform, a concrete representation of the operators $\beta_i$ and $\beta_i^*$ can be obtained. Let
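$$\sigma_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \sigma_{-1} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad b = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad b^* = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$$
and $I = \{1, \dots, n\}$, then the matrices
$$\beta_j = \sigma_{\varepsilon(1,j)} \otimes \sigma_{\varepsilon(2,j)} \otimes \cdots \otimes \sigma_{\varepsilon(j-1,j)} \otimes b \otimes \sigma_1 \otimes \cdots \otimes \sigma_1$$
provide a ∗-representation of the operators $\beta_i$; $i \in I$ in $M_2(\mathbb{C})^{\otimes n}$. The pure state $X \mapsto \langle X 1, 1 \rangle$ on $B(L^2(A(I, \varepsilon), \tau^\varepsilon))$ is induced by the pure state $e_1^{\otimes n}$ on $M_2(\mathbb{C})^{\otimes n}$, where $e_1 = (1, 0)$ is the first basis vector in $\mathbb{C}^2$.
2.3. Nelson’s inequality.
Theorem 5. Let $1 \le p, r < +\infty$, then for every $I$ and $\varepsilon$ one has
$$\|P_t^\varepsilon\|_{L^p \to L^r} = 1 \quad \text{if and only if} \quad e^{-2t} \le \frac{p-1}{r-1}.$$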
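The matrix model of Sect. 2.2 lends itself to a direct machine check. The sketch below builds the matrices $\beta_j$ as Kronecker products, forms $x_j = \beta_j + \beta_j^*$, and verifies the relations (S) entry by entry. It assumes the orientation $b = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ (so that $b e_1 = 0$) and encodes $\varepsilon$ as a dictionary on pairs $(i, j)$ with $i < j$; both conventions, and all function names, are choices made for this example.

```python
from functools import reduce

SIGMA = {1: [[1, 0], [0, 1]], -1: [[1, 0], [0, -1]]}
B = [[0, 1], [0, 0]]              # annihilation matrix, kills e_1 = (1, 0)

def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    """Ordinary matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def beta(j, n, eps):
    """beta_j for j in 1..n, with eps[(i, j)] in {+1, -1} given for i < j."""
    factors = [SIGMA[eps[(k, j)]] for k in range(1, j)] + [B] + [SIGMA[1]] * (n - j)
    return reduce(kron, factors)

def spin(j, n, eps):
    """x_j = beta_j + beta_j^* (the matrices are real, so the adjoint is the transpose)."""
    b = beta(j, n, eps)
    return [[b[r][c] + b[c][r] for c in range(len(b))] for r in range(len(b))]

def satisfies_relations(n, eps):
    """True if x_i x_j - eps(i, j) x_j x_i = 2 delta_ij for all i, j, with eps(i, i) = -1."""
    dim = 2 ** n
    xs = {j: spin(j, n, eps) for j in range(1, n + 1)}
    for i in xs:
        for j in xs:
            sign = -1 if i == j else eps[(min(i, j), max(i, j))]
            xij, xji = matmul(xs[i], xs[j]), matmul(xs[j], xs[i])
            for r in range(dim):
                for c in range(dim):
                    expected = 2 if (i == j and r == c) else 0
                    if xij[r][c] - sign * xji[r][c] != expected:
                        return False
    return True
```

A mixed choice such as `eps = {(1, 2): -1, (1, 3): 1, (2, 3): -1}` passes the check, as do the purely fermionic and purely commuting choices. The first column of every `beta(j, n, eps)` vanishes, which is the matrix form of $\beta_j 1 = 0$.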
Proof. This theorem is an extension of Theorem 4 in [CL], which deals with the case $\varepsilon \equiv -1$, and can be derived by an almost straightforward modification of the proof of [CL]. We shall only recall the main steps of the argument and point out the modifications required in order to include the more general setting of the theorem. The “only if” part is easily settled by restricting $P_t^\varepsilon$ to the algebra generated by a single spin $x_i$, $i \in I$. Let us now prove the “if” part. One has $P_t^\varepsilon 1 = 1$ so that the norm of $P_t^\varepsilon$ between $L^p$ spaces is always $\ge 1$, and we need only prove that
$$\|P_t^\varepsilon X\|_{L^r} \le \|X\|_{L^p} \quad \text{for all } X \in A(I, \varepsilon)$$
as soon as $e^{-2t} \le \frac{p-1}{r-1}$. We first note that Theorems 2 and 3 of [CL] are valid in our situation without any change, and this implies that we can restrict to proving $\|P_t^\varepsilon X\|_{L^r} \le \|X\|_{L^p}$ only for positive $X$.
Lemma 1. If $A$, $B$ are $m \times m$ matrices, then for $1 \le p \le 2$ one has
$$\left(\frac{\mathrm{Tr}(|A + B|^p) + \mathrm{Tr}(|A - B|^p)}{2}\right)^{2/p} \ge (\mathrm{Tr}(|A|^p))^{2/p} + (p - 1)(\mathrm{Tr}(|B|^p))^{2/p}.$$
Proof. This is Theorem 1 in [CL], see also [BCL].
Lemma 2. Let $1 < p \le 2$ and $e^{-t} \le \sqrt{p - 1}$, then one has $\|P_t^\varepsilon\|_{L^p \to L^2}^2 = 1$.
Proof. The proof is by induction on the number of elements of $I$. It is true for $|I| = 1$, by the two-points space Nelson’s inequality (see [G2] or [CL]). Assume the inequality of the lemma is proved for all $I$ with $|I| \le n - 1$ and all functions $\varepsilon$, and consider $A(I, \varepsilon)$ with $I = \{1, \dots, n\}$. Any element $X \in A(I, \varepsilon)$ can be written in a unique way as $X = A + x_n B$, where $A$ and $B$ belong to the algebra generated by $x_1, \dots, x_{n-1}$.
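One has $P_t^\varepsilon(A + x_n B) = P_t^\varepsilon(A) + e^{-t} x_n P_t^\varepsilon(B)$, and the terms $P_t^\varepsilon A$ and $P_t^\varepsilon(x_n B)$ are orthogonal in $L^2(A(I,\varepsilon), \tau^\varepsilon)$; hence one has, using the induction hypothesis,
\[
\begin{aligned}
\|P_t^\varepsilon(A + x_n B)\|_{L^2}^2 &= \|P_t^\varepsilon(A)\|_{L^2}^2 + \|P_t^\varepsilon(x_n B)\|_{L^2}^2 \\
&= \|P_t^\varepsilon(A)\|_{L^2}^2 + e^{-2t}\|P_t^\varepsilon(B)\|_{L^2}^2 \\
&\le \|A\|_{L^p}^2 + e^{-2t}\|B\|_{L^p}^2 \le \|A\|_{L^p}^2 + (p-1)\|B\|_{L^p}^2 .
\end{aligned}
\]
In order to estimate $\|X\|_{L^p}$ we use the Fock model developed in the preceding section. Let $x' = x_n v_n$ and $C = v_n B$; then $A$ and $C$ belong to the $*$-algebra generated by the $\beta_i$, $i < n$, hence they commute with $x'$. Since $\chi_n$ and $v_n$ commute, and $\chi_n^2 = 1$, one has $\mathrm{tr}(x') = \mathrm{tr}(x_n \chi_n v_n \chi_n) = \mathrm{tr}(\chi_n x_n \chi_n v_n) = \mathrm{tr}(-x_n v_n) = -\mathrm{tr}(x')$, hence $\mathrm{tr}(x') = 0$. Using Proposition 4, we get
\[ \|X\|_{L^p}^p = \mathrm{tr}(|X|^p) = \frac{1}{2}\big(\mathrm{tr}(|A+C|^p) + \mathrm{tr}(|A-C|^p)\big), \]
where $\mathrm{tr}$ is the normalized trace on $B(L^2(A(I,\varepsilon), \tau^\varepsilon))$. Applying Lemma 1 we get
\[ \|X\|_{L^p}^2 \ge \big(\mathrm{tr}(|A|^p)\big)^{2/p} + (p-1)\big(\mathrm{tr}(|C|^p)\big)^{2/p}, \]
and since $|C|^2 = C^*C = B^* v_n^* v_n B = B^*B = |B|^2$,
\[ \|X\|_{L^p}^2 \ge \big(\mathrm{tr}(|A|^p)\big)^{2/p} + (p-1)\big(\mathrm{tr}(|B|^p)\big)^{2/p} \ge \|P_t^\varepsilon(X)\|_{L^2}^2 . \]
This concludes the proof of the lemma.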
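The base case $|I| = 1$ of the induction can be checked numerically. The following Python sketch is an illustration only, not part of the original argument: it assumes the two-point space $\{-1,+1\}$ with the uniform measure, writes a real function as $f = a + bx$, and tests $\|P_t f\|_{L^2} \le \|f\|_{L^p}$ at the critical time $e^{-t} = \sqrt{p-1}$. The function names and the sample points are hypothetical choices made for the example.

```python
import math

def lp_norm(a, b, p):
    """L^p norm of f = a + b*x on the two-point space {-1, +1}, uniform measure."""
    return ((abs(a + b) ** p + abs(a - b) ** p) / 2) ** (1 / p)

def semigroup_l2_norm(a, b, t):
    """L^2 norm of P_t f = a + exp(-t) * b * x."""
    return math.sqrt(a * a + math.exp(-2 * t) * b * b)

def check_two_point(p, samples):
    """True if ||P_t f||_2 <= ||f||_p for all samples at e^{-t} = sqrt(p - 1)."""
    t = -0.5 * math.log(p - 1)
    return all(
        semigroup_l2_norm(a, b, t) <= lp_norm(a, b, p) + 1e-12
        for a, b in samples
    )

# Arbitrary sample coefficients (a, b), chosen only for illustration.
samples = [(1.0, 0.3), (0.5, -2.0), (-1.2, 0.7), (0.0, 1.0), (2.0, 2.0)]
print(check_two_point(1.5, samples))
```

For comparison, taking $t = 0$ with $p = 1.5$ and $f = 1 + 0.3x$ violates the inequality, which is consistent with the “only if” part of Theorem 5.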
Lemma 2 establishes the required inequality for $r = 2$ and $1 < p \le 2$. In order to obtain the general case, one makes use of a logarithmic Sobolev inequality, which we state as a theorem because of its own interest.

Theorem 6. For all $X \in A(I,\varepsilon)$ one has
\[ \tau^\varepsilon(|X|^2 \log |X|^2) - \|X\|_{L^2}^2 \log \|X\|_{L^2}^2 \le 2\,\tau^\varepsilon(X N^\varepsilon X^*). \]
Proof. As in [CL], one just has to take the derivative at zero in the inequality $\|P_t^\varepsilon(X)\|_{L^2} \le \|X\|_{L^{1+e^{-2t}}}$, which is an equality for $t = 0$.
Lemma 3. For all positive $X$ in $A(I,\varepsilon)$, and $1 < p < \infty$, one has
\[ \langle X^{p/2}, N^\varepsilon X^{p/2} \rangle \le \frac{(p/2)^2}{p-1} \langle X, N^\varepsilon X^{p-1} \rangle . \]
Proof. The fermionic case of this lemma is proved by Gross in [G2]. Here we shall follow the simplified version of the proof given by Hu in [Hu]. As remarked before, one has $N^\varepsilon = \sum_{i \in I} \beta_i^* \beta_i$, hence it is enough to prove that
\[ \langle X^{p/2}, N_i^\varepsilon X^{p/2} \rangle \le \frac{(p/2)^2}{p-1} \langle X, N_i^\varepsilon X^{p-1} \rangle \quad \text{for all } i \in I, \]
where Niε = βi∗ βi , and even to assume I = {1, . . . , n}, and to prove this inequality for Nnε . Let X = A + xn B, so that A and B are in the algebra generated by x1 , . . . xn−1 , then Nnε (X) = xn B. Assume that X ≥ for some > 0, then χn Xχn = A − xn B, so that one has also A − xn B ≥ . Let X(s) = A + sxn B, then X(s) ≥ for s ∈ [−1, 1]. The relations (S) show that for all integer k ≥ 0, one has X(s)k = Uk (s) + xn Vk (s), where Uk is an even function of s and Vk is an odd function, with values in the ∗-algebra generated by x1 , . . . , xn−1 . Since z 7→ z p is holomorphic in a neighbourghood of [, ∞[, it follows that X(s)p = Up (s) + xn Vp (s) for some analytic functions Up (even) and Vp (odd) with values in the algebra generated by x1 , . . . , xn−1 , and thus Z 1 1 d X(s)p/2 ds. Nnε X = xn Vp (1) = 2 −1 ds The proof now goes on as in [Hu], pp. 94–95, without any change. Applying Theorem 6 to X
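raised to the power $p/2$, and then Lemma 3, we get
\[ \tau(X^p \log X) - \|X\|_{L^p}^p \log \|X\|_{L^p} \le \langle X^{p/2}, N^\varepsilon X^{p/2} \rangle \le \frac{(p/2)^2}{p-1} \langle X, N^\varepsilon X^{p-1} \rangle, \]
which is the right logarithmic Sobolev inequality. The proof is now finished exactly as in [CL].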
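Corollary 5. There exist constants $C_{p,m} > 0$ such that for all algebras $A(I,\varepsilon)$, all $p \ge 1$, all integers $m \ge 0$ and all polynomials $P$, in $|I|$ non-commuting indeterminates, of degree $m$, one has
\[ \|P((x_i)_{i \in I})\|_{L^p} \le C_{p,m}\, \|P((x_i)_{i \in I})\|_{L^2} . \]
Proof. Let $P$ be of degree $m$, expand $P((x_i)_{i \in I}) = \sum_{A \subset I} \alpha_A x_A$, and put $H_r = \sum_{|A| = r} \alpha_A x_A$; then, using the fact that $\|P^\varepsilon_{\log\sqrt{p-1}}\|_{L^2 \to L^p} = 1$,
\[
\begin{aligned}
\|P((x_i)_{i \in I})\|_{L^p} &\le \sum_{r=0}^{m} \|H_r\|_{L^p} \\
&\le \sum_{r=0}^{m} (p-1)^{r/2}\, \|(p-1)^{-r/2} H_r\|_{L^p} \\
&\le \sum_{r=0}^{m} (p-1)^{r/2}\, \|H_r\|_{L^2} \\
&\le \left( \sum_{r=0}^{m} (p-1)^{r} \right)^{1/2} \left( \sum_{r=0}^{m} \|H_r\|_{L^2}^2 \right)^{1/2} \\
&\le C_{p,m}\, \|P((x_i)_{i \in I})\|_{L^2},
\end{aligned}
\]
where $C_{p,m}$ depends only on $p$ and $m$.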
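The chain of inequalities suggests an explicit admissible value of the constant, namely $C_{p,m} = \big(\sum_{r=0}^{m} (p-1)^r\big)^{1/2}$, at least for $p \ge 2$, where the time $\log\sqrt{p-1}$ is nonnegative. The short Python sketch below evaluates this expression; the function name is a hypothetical choice, and the formula is read off from the proof rather than stated in the text.

```python
import math

def hypercontractive_constant(p, m):
    """Candidate C_{p,m} = (sum_{r=0}^{m} (p-1)^r)^(1/2), read off from the proof."""
    return math.sqrt(sum((p - 1) ** r for r in range(m + 1)))

# For p = 2 the sum has m + 1 terms equal to 1, so the constant is sqrt(m + 1).
print(hypercontractive_constant(2, 3))
```

The constant grows geometrically in the degree $m$ when $p > 2$, and it does not depend on $I$ or on $\varepsilon$, which is the point of the corollary.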
3. Speicher's Central Limit Theorem and the Proof of Theorem 2

4.1. Choose a real $q \in\, ]-1, 1[$. Let $H$ be a real Hilbert space of dimension $k$, with orthonormal basis $e_1, \dots, e_k$, and consider the operators $l_i \equiv l_{e_i}$, $l_i^* \equiv l_{e_i}^*$ and $\omega_i = l_i + l_i^*$ on $F_q(H)$, as defined in Sect. 1.1. Let $\varepsilon$ be a function on $\mathbb{N}^* \times \mathbb{N}^*$ with values in $\{-1, +1\}$, such that $\varepsilon(i,j) = \varepsilon(j,i)$ and $\varepsilon(i,i) = 1$ for all $(i,j) \in \mathbb{N}^* \times \mathbb{N}^*$. For each integer $n \ge 1$ consider the algebra $A_n \equiv A(\{1, \dots, n\}, \varepsilon|_{\{1,\dots,n\} \times \{1,\dots,n\}})$ and the operators $\beta_i^\varepsilon$ and $(\beta_i^\varepsilon)^*$ for $1 \le i \le n$, constructed in Sect. 2.2 (to which we have added an index in order to keep track of the dependence on the function $\varepsilon$). We now choose functions $\varepsilon$ as above randomly, such that the values $\varepsilon(i,j)$ for $i < j$ form a family of independent random variables, identically distributed, with
\[ P(\varepsilon(i,j) = -1) = (1-q)/2, \qquad P(\varepsilon(i,j) = 1) = (1+q)/2 . \]
Theorem 7. Let $a_j^{\varepsilon,n} = \frac{1}{\sqrt{n}} \sum_{l = nj+1}^{n(j+1)} \beta_l^\varepsilon$ for $j = 1, \dots, k$; then for every polynomial $Q$, in $2k$ non-commuting indeterminates, one has
\[ \lim_{n \to \infty} \langle Q(a_1^{\varepsilon,n}, \dots, a_k^{\varepsilon,n}, (a_1^{\varepsilon,n})^*, \dots, (a_k^{\varepsilon,n})^*) 1, 1 \rangle = \langle Q(l_1, \dots, l_k, l_1^*, \dots, l_k^*) \Omega, \Omega \rangle_q \]
almost surely in $\varepsilon$.

Proof. This is a consequence of Theorem 2, Sect. 3, in [S], applied to the operators $\beta_i$ and the state $\varphi = \langle \,\cdot\, 1, 1 \rangle$. The hypotheses of Speicher's theorem are fulfilled by the operators $\beta_j$: the conditions on covariance are easily verified, and the factorization of ordered moments comes from Proposition 3 above.

Lemma 4. Let $w_j^{(n)} = a_j^{(n)} + (a_j^{(n)})^*$, and let $P$ be a polynomial in $k$ non-commuting indeterminates; then for every $p \ge 1$ one has
\[ \lim_{n \to \infty} \|P(w_1^{(n)}, \dots, w_k^{(n)})\|_{L^p(A_{nk})} = \|P(\omega_1, \dots, \omega_k)\|_{L^p(\Gamma_q(H))} \quad \text{a.s.} \]
Proof. There is a polynomial $Q$ such that $Q(l_1, \dots, l_k, l_1^*, \dots, l_k^*) = P(\omega_1, \dots, \omega_k)^* P(\omega_1, \dots, \omega_k)$, hence applying Theorem 7 to $Q^m$, we get that for every integer $m \ge 1$ one has
\[ \lim_{n \to \infty} \tau^\varepsilon[|P(w_1^{(n)}, \dots, w_k^{(n)})|^{2m}] = \tau_q[|P(\omega_1, \dots, \omega_k)|^{2m}] \quad \text{a.s.} \]
Let $\nu_n^\varepsilon$ (resp. $\nu$) be the probability measure with compact support on $\mathbb{R}_+$ such that $\int x^m\, \nu_n^\varepsilon(dx) = \tau^\varepsilon[|P(w_1^{(n)}, \dots, w_k^{(n)})|^{2m}]$, resp. $\int x^m\, \nu(dx) = \tau_q[|P(\omega_1, \dots, \omega_k)|^{2m}]$, for all integers $m \ge 1$; then almost surely in $\varepsilon$, $\nu_n \to \nu$ weakly as $n \to \infty$, hence
\[ \tau^\varepsilon[|P(w_1^{(n)}, \dots, w_k^{(n)})|^p] = \int x^{p/2}\, \nu_n(dx) \;\to_{n \to \infty}\; \int x^{p/2}\, \nu(dx) = \tau_q[|P(\omega_1, \dots, \omega_k)|^p] . \]
Indeed, let $C > 0$ be such that the support of $\nu$ is included in $[0, C-1]$; then, by weak convergence,
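\[ \int_0^C x^{p/2}\, \nu_n(dx) \to \int_0^C x^{p/2}\, \nu(dx) \quad \text{a.s. as } n \to \infty, \]
\[ \int_C^\infty x^{p/2}\, \nu_n(dx) \le \left( \int_C^\infty \nu_n(dx) \right)^{1/2} \left( \int_0^\infty x^{p}\, \nu_n(dx) \right)^{1/2} \to 0, \]
and the result follows.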
Lemma 5. Let $P$ be a polynomial in $k$ non-commuting indeterminates; then for every $p \ge 1$ one has
\[ \|P_t^\varepsilon(P(w_1^{(n)}, \dots, w_k^{(n)}))\|_{L^p(A_{nk})} \to \|P_t^q(P(\omega_1, \dots, \omega_k))\|_{L^p(\Gamma_q(H))} \quad \text{a.s. as } n \to \infty . \]
Proof. Let us first remark that the map which assigns to a polynomial $P$, with $k$ non-commuting indeterminates $X_1, \dots, X_k$, the operator $P(\omega_1, \dots, \omega_k)$, is injective. Indeed, let $r$ be the degree of $P$ and let $\sum_{1 \le i_1, \dots, i_r \le k} a_{i_1 \dots i_r} X_{i_1} \cdots X_{i_r}$ be the homogeneous component of $P$ of highest degree. Consider the vector $P(\omega_1, \dots, \omega_k)\Omega$ in $F_q(H)$; its component in the subspace $H_{\mathbb{C}}^{\otimes r}$ is equal to
\[ \sum_{1 \le i_1, \dots, i_r \le k} a_{i_1 \dots i_r}\, e_{i_1} \otimes \dots \otimes e_{i_r}, \]
as can be verified by expanding $P(\omega_1, \dots, \omega_k)$ with respect to $\omega_j = l_j + l_j^*$. Since the operators $l_j$ decrease the degree by one, they do not occur in the evaluation of the highest term of $P(\omega_1, \dots, \omega_k)\Omega$. The vectors $e_{i_1} \otimes \dots \otimes e_{i_r}$ for $1 \le i_1, \dots, i_r \le k$ are linearly independent in $H_{\mathbb{C}}^{\otimes r}$, so that $P(\omega_1, \dots, \omega_k) = 0$ implies $P = 0$. By induction on the degree, one can see that every basis vector $e_{i_1} \otimes \dots \otimes e_{i_r}$ is of the form $P^{i_1, \dots, i_r}(\omega_1, \dots, \omega_k)\Omega$ for some (unique) polynomial $P^{i_1, \dots, i_r}$ in $k$ indeterminates. For any polynomial $P$ in $k$ indeterminates, we denote by $P_t^q(P)$ the unique polynomial such that $P_t^q(P(\omega_1, \dots, \omega_k)) = P_t^q(P)(\omega_1, \dots, \omega_k)$. One has $P_t^q(P^{i_1, \dots, i_r}) = e^{-rt} P^{i_1, \dots, i_r}$, and from Theorem 7,
\[ \lim_{n \to \infty} \|P^{i_1, \dots, i_r}(w_1^{(n)}, \dots, w_k^{(n)}) 1 - (a_{i_1}^{\varepsilon,n})^* \cdots (a_{i_r}^{\varepsilon,n})^* 1\|_{L^2(A_{nk})} = \|P^{i_1, \dots, i_r}(\omega_1, \dots, \omega_k)\Omega - l_{i_1}^* \cdots l_{i_r}^* \Omega\|_{L^2(\Gamma_q(H))} = 0 \quad \text{a.s.} \]
The vector $(a_{i_1}^{\varepsilon,n})^* \cdots (a_{i_r}^{\varepsilon,n})^* 1$ is in the eigenspace of $N^\varepsilon$ of eigenvalue $r$, so that, since $P_t^\varepsilon$ is a contraction,
\[ \lim_{n \to \infty} \|P_t^\varepsilon(P^{i_1, \dots, i_r}(w_1^{(n)}, \dots, w_k^{(n)})) 1 - e^{-rt} (a_{i_1}^{\varepsilon,n})^* \cdots (a_{i_r}^{\varepsilon,n})^* 1\|_{L^2(A_{nk})} = 0 \quad \text{a.s.,} \]
hence
\[ \|P_t^\varepsilon(P^{i_1, \dots, i_r}(w_1^{(n)}, \dots, w_k^{(n)})) - P_t^q(P^{i_1, \dots, i_r})(w_1^{(n)}, \dots, w_k^{(n)})\|_{L^2(A_{nk})} \to_{n \to \infty} 0 \quad \text{a.s.} \]
Since the polynomials $P^{i_1, \dots, i_r}$ form a basis of the space of all polynomials, it follows that
\[ \lim_{n \to \infty} \|P_t^\varepsilon(P(w_1^{(n)}, \dots, w_k^{(n)})) - P_t^q(P)(w_1^{(n)}, \dots, w_k^{(n)})\|_{L^2(A_{nk})} = 0 \quad \text{a.s.} \]
for all polynomials $P$. By the estimate in Corollary 5, we get
\[ \lim_{n \to \infty} \|P_t^\varepsilon(P(w_1^{(n)}, \dots, w_k^{(n)})) - P_t^q(P)(w_1^{(n)}, \dots, w_k^{(n)})\|_{L^p(A_{nk})} = 0 \quad \text{a.s.} \]
for all $p \ge 1$. Since, by Lemma 4,
\[ \lim_{n \to \infty} \|P_t^q(P)(w_1^{(n)}, \dots, w_k^{(n)})\|_{L^p(A_{nk})} = \|P_t^q(P)(\omega_1, \dots, \omega_k)\|_{L^p(\Gamma_q(H))} \quad \text{a.s.,} \]
one has the required limit.
End of the proof of Theorem 3. By standard arguments it is enough to prove the theorem for a finite dimensional Hilbert space $H$ of dimension $k$. It is also enough to prove that for any polynomial $P$ in $k$ indeterminates one has
\[ \|P_t^q(P(\omega_1, \dots, \omega_k))\|_{L^r(\Gamma_q(H))} \le \|P(\omega_1, \dots, \omega_k)\|_{L^p(\Gamma_q(H))} \quad \text{if } e^{-2t} \le \frac{p-1}{r-1} . \]
By Lemmas 4 and 5 we have
\[ \|P(\omega_1, \dots, \omega_k)\|_{L^p(\Gamma_q(H))} = \lim_{n \to \infty} \|P(w_1^{(n)}, \dots, w_k^{(n)})\|_{L^p(A_{nk})}, \]
\[ \|P_t^q(P(\omega_1, \dots, \omega_k))\|_{L^r(\Gamma_q(H))} = \lim_{n \to \infty} \|P_t^\varepsilon(P(w_1^{(n)}, \dots, w_k^{(n)}))\|_{L^r(A_{nk})}, \]
and by Theorem 5 one has
\[ \|P_t^\varepsilon(P(w_1^{(n)}, \dots, w_k^{(n)}))\|_{L^r(A_{nk})} \le \|P(w_1^{(n)}, \dots, w_k^{(n)})\|_{L^p(A_{nk})}, \]
so that in the limit
\[ \|P_t^q(P(\omega_1, \dots, \omega_k))\|_{L^r(\Gamma_q(H))} \le \|P(\omega_1, \dots, \omega_k)\|_{L^p(\Gamma_q(H))} . \]
Acknowledgement. I would like to thank Q. Xu for indicating to me the reference [PS], and J.L. Sauvageot for some useful conversations.
References
[Ba] Bakry, D.: L'hypercontractivité et son utilisation en théorie des semigroupes. In: Lectures on Probability Theory, Lecture Notes in Mathematics 1581, Berlin–Heidelberg–New York: Springer, 1994, pp. 1–114
[Be] Beckner, W.: Sobolev inequalities, the Poisson semigroup and analysis on the sphere S^n. Proc. Nat. Acad. Sci. 89, 4816–4819 (1992)
[Bo] Bozejko, M.: A q-deformed probability, Nelson's inequality and central limit theorems. In: Nonlinear fields, classical, random, semiclassical, P. Garbaczewski and Z. Popowicz, eds., Singapore: World Scientific, 1991, pp. 312–335
[Br] Bressoud, D.M.: A simple proof of Mehler's formula for q-Hermite polynomials. Indiana Univ. Math. J. 29, 577–580 (1980)
[BCL] Ball, K., Carlen, E.A., Lieb, E.H.: Sharp uniform convexity and smoothness inequalities for trace norms. Invent. Math. 115, 463–482 (1994)
[BF] Bourret, R., Frisch, U.: Parastochastics. J. Math. Phys. 11, 364–390 (1970)
[BKS] Bozejko, M., Kümmerer, B., Speicher, R.: q-gaussian processes: non-commutative and classical aspects. Preprint (1995)
[BS] Bozejko, M., Speicher, R.: An example of a generalized brownian motion. Commun. Math. Phys. 137, 519–531 (1991)
[CL] Carlen, E.A., Lieb, E.H.: Optimal hypercontractivity for Fermi fields and related non-commutative integration inequalities. Commun. Math. Phys. 155, 27–46 (1993)
[CSV] Coulhon, T., Saloff-Coste, L., Varopoulos, N.Th.: Analysis and Geometry on Groups. Cambridge Tracts in Mathematics Vol. 100, Cambridge: Cambridge University Press, 1992
[D] Davies, E.B.: Heat Kernels and Spectral Theory. Cambridge Tracts in Mathematics Vol. 92, Cambridge: Cambridge University Press, 1989
[DL] Davies, E.B., Lindsay, J.M.: Non-commutative symmetric Markov semigroups. Math. Zeit. 210, 379–411 (1992)
[G1] Gross, L.: Logarithmic Sobolev inequalities. Am. J. Math. 97, 1061–1083 (1975)
[G2] Gross, L.: Hypercontractivity and logarithmic Sobolev inequalities for the Clifford-Dirichlet form. Duke Math. J. 42, 383–396 (1975)
[G3] Gross, L.: Logarithmic Sobolev inequalities and contractivity properties of semigroups. In: Dirichlet Forms, Lecture Notes in Mathematics Vol. 1569, Berlin–Heidelberg–New York: Springer, 1993, pp. 54–82
[Gre] Greenberg, O.W.: Particles with small violation of Fermi or Bose statistics. Phys. Rev. D 43, 4111–4120 (1991)
[Ha] Haagerup, U.: An example of a non nuclear C∗ algebra which has the metric approximation property. Invent. Math. 50, 279–293 (1979)
[Hu] Hu, Y.Z.: Hypercontractivité pour les fermions, d'après Carlen et Lieb. In: Séminaire de probabilités XXVII, Lecture Notes in Mathematics Vol. 1557, Berlin–Heidelberg–New York: Springer, 1993, pp. 86–96
[K] Kunze, R.A.: Lp Fourier transforms on locally compact unimodular groups. Trans. Am. Math. Soc. 89, 519–540 (1958)
[L] Lindsay, J.M.: Gaussian hypercontractivity revisited. J. Funct. Anal. 92, 313–324 (1990)
[LM] Lindsay, J.M., Meyer, P.A.: Fermionic hypercontractivity. In: Quantum probability VII, Singapore: World Scientific, 1992, pp. 211–220
[M] Meyer, P.A.: Quantum Probability for Probabilists, Second edition, Lecture Notes in Mathematics Vol. 1538, Berlin–Heidelberg–New York: Springer, 1995
[N] Nelson, E.: The free Markoff field. J. Funct. Anal. 12, 211–227 (1973)
[PS] Peetre, J., Sparr, G.: Interpolation and non commutative integration. Ann. Mat. Pura Appl. 104, 187–207 (1975)
[S] Speicher, R.: A non commutative central limit theorem. Math. Zeit. 209, 55–66 (1992)
[Sz] Szegö, G.: Ein Beitrag zur Theorie der Thetafunktionen. In: Collected papers, R. Askey, ed., Vol. 1, Basel: Birkhäuser, 1981, pp. 793–805
[V] Voiculescu, D.V.: Symmetries of some reduced free product C∗ algebras. In: Operator Algebras and their Connection with Topology and Ergodic Theory, Lecture Notes in Mathematics Vol. 1132, Berlin–Heidelberg–New York: Springer, 1985, pp. 556–588
[VDN] Voiculescu, D.V., Dykema, K., Nica, A.: Free random variables. CRM Monograph Series No. 1, Providence, RI: Am. Math. Soc., 1992
[Y] Yeadon, F.J.: Non-commutative Lp spaces. Math. Proc. Cambridge Philos. Soc. 77, 91–102 (1975)
[Z] Zagier, D.: Realizability of a model in infinite statistics. Commun. Math. Phys. 147, 199–210 (1992)

Communicated by A. Jaffe
Commun. Math. Phys. 184, 475–491 (1997)
Communications in Mathematical Physics © Springer-Verlag 1997
Generalized Goodman–Harpe–Jones Construction of Subfactors, I Feng Xu Department of Mathematics, University of California, Los Angeles, CA 90024, USA Received: 24 November 1995 / Accepted: 14 May 1996
Abstract: We construct some exceptional finite depth subfactors and determine their principal graphs from some exceptional integrable lattice models. Some of these subfactors are conjectured to be the same as those coming from certain conformal embeddings ($SU(3)_5 \subset SU(6)$, $SU(3)_9 \subset E_6$, and $SU(3)_{21} \subset E_7$) for which the principal graphs were previously unknown.
1. Introduction

This work starts with an observation that the integrable lattice models constructed in [1 and 2] (generalizing the work of [8] on A-D-E lattice models) can be used to construct periodic commuting squares in the sense of H. Wenzl [3] (see also Chap. 4 of [4]). The motivation was to derive higher relative commutants of subfactors coming from conformal embeddings [9]. A simple example may help to understand the issue. The famous Goodman-Harpe-Jones (GHJ) subfactor with index $3 + \sqrt{3}$ [4] can be constructed in two ways. One is the original GHJ construction. For our purposes, we explain this construction in the following way. One constructs Boltzmann weights associated with the $E_6$ graph [8] which satisfy the Temperley-Lieb relations:
\[ U_i^2 = \beta U_i, \qquad U_i U_j = U_j U_i \ (|i-j| \ge 2), \qquad U_i U_{i+1} U_i - U_i = 0, \]
\[ (U_i)_{abcd} = \delta_{ac}\, (\psi_b^{(1)} \psi_d^{(1)})^{1/2} / \psi_a^{(1)}, \]
where $\psi^{(1)}$ is the eigenvector of $E_6$ that corresponds to the largest eigenvalue $\beta = 2\cos(\pi/12)$.

[Figure: the $E_6$ graph, with vertices labeled 1 to 6.]

One then builds string algebras [10] starting from the point 1. One obtains a sequence of algebras $B_1 \subset B_2 \subset B_3 \subset \dots$. Let $A_n$ be the subalgebra of $B_n$ generated by $1, U_1, \dots, U_{n-1}$. Then [4]:
\[ \begin{array}{ccc} A_{n+1} & \subset & B_{n+1} \\ \cup & & \cup \\ A_n & \subset & B_n \end{array} \]
is a periodic commuting square. One constructs a subfactor $A = \cup A_n \subset B = \cup B_n$. The index of this subfactor is given by Wenzl's index formula, and the subfactor is irreducible, which can be proved by using either Skau's lemma [4] or Wenzl's estimate on the dimension of higher relative commutants [3]. In fact, one can use Wenzl's estimate and the Temperley-Lieb relations to determine the principal graph [6]. On the other hand, the above subfactor is the same as the subfactor coming from the conformal embedding $SU(2)_{10} \subset SO(5)_1$ [9]. (There are only 2 finite depth subfactors with index $3 + \sqrt{3}$, a fact which uses the special property of the index value $3 + \sqrt{3}$ [11].) However, there is no direct proof of this relation with conformal embeddings without using the special property of the index value $3 + \sqrt{3}$. The Temperley-Lieb algebra is associated with the Lie group $SU(2)$, and the Hecke algebra is associated with $SU(N)$, $N > 2$ (more precise statements will be presented in the following). One is thus led to consider an “$SU(N)$, $N > 2$” generalization or “Hecke algebra” version of the above story. The orbifold construction is such a generalization, corresponding to D graphs in the GHJ case (see [5 and 7]). In [1 and 2], three integrable lattice models are constructed, associated with some exceptional graphs coming from $SU(3)$. The explicit graphs and Boltzmann weights are given in [1 and 2]. By using these and a technical Lemma 3.1, we are able to check that each of these three exceptional lattice models gives rise to periodic commuting squares, hence we can construct some subfactors. By using Hecke algebra relations one can determine the principal graphs of these subfactors.
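The value $\beta = 2\cos(\pi/12)$ quoted above for $E_6$ can be checked numerically. The following Python sketch is an illustration only: the vertex labeling (a chain 1-2-3-4-5 with vertex 6 attached to vertex 3) is an assumption made for the example, and the Perron-Frobenius eigenvector is approximated by power iteration on $G + 1$, the shift being used because the graph is bipartite.

```python
import math

# E6 Dynkin diagram; the labeling below is an assumption made for this example.
EDGES = [(1, 2), (2, 3), (3, 4), (4, 5), (3, 6)]
VERTICES = [1, 2, 3, 4, 5, 6]

def neighbours(v):
    """Vertices joined to v by an edge."""
    return [b if a == v else a for (a, b) in EDGES if v in (a, b)]

def perron_frobenius(iterations=2000):
    """Power iteration on G + 1; returns (largest eigenvalue of G, eigenvector)."""
    psi = {v: 1.0 for v in VERTICES}
    for _ in range(iterations):
        new = {v: psi[v] + sum(psi[w] for w in neighbours(v)) for v in VERTICES}
        norm = math.sqrt(sum(x * x for x in new.values()))
        psi = {v: x / norm for v, x in new.items()}
    # Eigenvalue read off at the leaf vertex 1, whose only neighbour is 2.
    beta = sum(psi[w] for w in neighbours(1)) / psi[1]
    return beta, psi

beta, psi = perron_frobenius()
print(beta, 2 * math.cos(math.pi / 12))
```

With these components, the diagonal weights $(U_i)_{abab} = \psi_b^{(1)}/\psi_a^{(1)}$ sum to $\beta$ over the neighbours $b$ of any vertex $a$, which is just the eigenvector equation.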
Some of these subfactors are conjectured to be the same as the following subfactors coming from conformal embeddings: $SU(3)_5 \subset SU(6)$, $SU(3)_9 \subset E_6$, and $SU(3)_{21} \subset E_7$ of [9]. In [5], a framework has been set up to study the integrable lattice models from [1] (in particular, the role of the intertwining Yang-Baxter equation is emphasized). However, it is in general very difficult to find examples that satisfy all the axioms of [5]. Orbifold subfactors are notable exceptions. Our constructions seem to fit in the framework of [5], and it will be interesting to find the complete invariants (in the sense of [5]) for our examples.

2. Properties of the Known Lattice Models

Let us start by describing the integrable models of [1 and 2]. The system consists of a square lattice with an orientation which we fix once and for all as, say, from left to right.
[Figure: the oriented square lattice, with the space and time directions indicated.]
The spins live at the nodes of the lattice and take their values in a graph. What we mean by a graph is a set of vertices and a set of directed links between these vertices. We can summarize all the information about the graph in its incidence matrix $G_{ab}$ = (number of arrows that point from $a$ to $b$). The admissible configurations are those where, for every two sites $i$ and $j$ on the lattice that are connected by an arrow from $i$ to $j$, the spins $\sigma_i$ and $\sigma_j$ are vertices in the graph which are connected by an arrow from $\sigma_i$ to $\sigma_j$. Moreover, the case where several links connect two vertices is interpreted as a situation where additional degrees of freedom are attached to links: one has to sum over all spins and over all spin variables. We attach to each face of the lattice a family of Boltzmann weights $W(\sigma_1, \sigma_2, \sigma_3, \sigma_4 | u)$ (the link variables are implicit in this notation) that depend on the four spins that surround it (and on the links between them) and on one complex parameter called the spectral parameter. The partition function reads
\[ Z = \sum_{\sigma} \prod_{\text{all faces}} W(\text{face} | u). \tag{1} \]
Let us define a face transfer matrix by
\[ X_i(u) = 1^{\otimes i-1} \otimes W(\sigma_{i-1}, \sigma_i, \sigma_{i+1}, \sigma_i' | u) \otimes 1^{\otimes N-i}. \tag{2} \]
A sufficient condition for the integrability of the model is that $X(u)$ satisfy the Yang-Baxter equation:
\[ X_i(u) X_{i+1}(u+v) X_i(v) = X_{i+1}(v) X_i(u+v) X_{i+1}(u). \tag{3} \]
We take the following ansatz for the $X_i$'s:
\[ X_i(u) = \sin(\pi(\hat\lambda - u))\, 1 + \sin(\pi u)\, U_i. \tag{4} \]
Then, by inserting (4) in Eq. (3), we find that the $U_i$'s are generators of the Hecke algebra:
\[ U_i^2 = \beta U_i, \qquad U_i U_j = U_j U_i \ (|i-j| \ge 2), \qquad U_i U_{i+1} U_i - U_i = U_{i+1} U_i U_{i+1} - U_{i+1}, \tag{5} \]
where $\beta = 2\cos(\pi\hat\lambda)$. Another constraint comes from the requirement that the generators $U$ should be in the commutant of $U_q SU(N)$ [12]. The commutant is obtained by imposing the vanishing of the rank $N+1$ antisymmetrizer:
\[ \sum_{\sigma \in S_{N+1}} (-q)^{|I_\sigma|} X_\sigma = 0, \tag{6} \]
where $\sigma \in S_{N+1}$, $\sigma = \prod_{i \in I_\sigma} \tau_{i,i+1}$, $\tau_{i,i+1}$ is a transposition, $X_\sigma = \prod_{i \in I_\sigma} X_i$ and $X_i = 2i \times \lim_{u \to i\infty} X_i(u) \exp(i\pi u)$. For $SU(2)$ it reads:
\[ U_i U_{i+1} U_i = U_i. \tag{7} \]
So the Hecke algebra reduces to the Temperley-Lieb algebra. For $SU(3)$ it is:
\[ (U_i - U_{i+2} U_{i+1} U_i + U_{i+1})(U_{i+1} U_{i+2} U_{i+1} - U_{i+1}) = 0. \tag{8} \]
Several classes of solutions which satisfy the above conditions are known, first for the graph that describes the fusion of representations of the $SU(N)$ Kac-Moody algebra at level $k = n - N$ with the fundamental representation. We remind the reader that the integrable representations of the $SU(N)$ Kac-Moody algebra at level $k$ are indexed by the finite set (Weyl alcove)
\[ P_{++}^{(n)} = \Big\{ \lambda \in P \;\Big|\; \lambda = \sum_{i=1}^{N-1} \lambda_i \Lambda_i,\ \lambda_i > 0,\ \sum_{i=1}^{N-1} \lambda_i < n \Big\}, \tag{9} \]
where $P$ is the weight lattice of $SU(N)$ and the $\Lambda_i$ are the weights of the fundamental representations. We take these points as the vertices of our graph. We represent the $N$ linearly dependent vectors $e_i$ by
\[ e_1 = \Lambda_1, \qquad e_i = \Lambda_i - \Lambda_{i-1},\ 2 \le i \le N-1, \qquad e_N = -\Lambda_{N-1}. \tag{10} \]
The bonds between the above vertices are directed along the $e_i$'s. This graph describes the fusion rules of representations $R_\lambda$ of the Kac-Moody algebra with the fundamental representation $R_f$:
\[ R_f \times R_\lambda = \sum_{\mu} N_{f\lambda}^{\mu} R_\mu, \tag{11} \]
where the $N_{f\lambda}^{\mu}$ are non-negative integers. We take $A_{\lambda\mu} = N_{f\lambda}^{\mu}$ to be the adjacency matrix of our graph. The generators for the Hecke algebra in this representation were found by Wenzl [3]:
\[ (U_i)_{\lambda,\, \lambda+e_i,\, \lambda+e_i+e_j,\, \lambda+e_k} = (1 - \delta_{ij}) \cdot \big( S_{ij}(\lambda + e_i)\, S_{ij}(\lambda + e_k) / S_{ij}(\lambda)^2 \big)^{1/2}, \tag{12} \]
where $S_{ij}(x) = \sin\big(\tfrac{\pi}{n}(e_i - e_j, x)\big)$ and $\beta = 2\cos(\pi/n)$. This class of solutions will be referred to in the following as the “regular” representation, a term which is justified in [1]. A remark concerning notation: we denote by $A$, Greek letters and $\phi$ the regular graph, its vertices and its eigenvectors respectively, while $G$, Roman letters and $\psi$ denote the non-regular graph (orbifold or exceptional graph), its vertices and eigenvectors. All the eigenvectors are normalized, i.e. $\sum_{\mu} \psi_a^{(\mu)} \psi_b^{(\mu)*} = \delta_{ab}$, where the sum over the multiplicity of eigenvectors corresponding to the same eigenvalue (if any) is assumed. A second class of solutions corresponds in the continuous limit to the coset models $SU(2)_l \times SU(2)_1 / SU(2)_{l+1}$ [8]. The requirements on the graphs are to be unoriented (thus $G$ is symmetric) and to have a largest eigenvalue less than 2. This problem is closely connected with the problem of classifying simple Lie algebras. The solutions are the
ADE Dynkin diagrams that correspond to simply laced Lie algebras. The Boltzmann weights are given by:
\[ (U_i)_{abcd} = \delta_{ac}\, (\psi_b^{(1)} \psi_d^{(1)})^{1/2} / \psi_a^{(1)}, \tag{13} \]
where $\psi^{(1)}$ is the eigenvector of $G$ that corresponds to the largest eigenvalue. We will concentrate on relations which do not depend on the graph $G$ for fixed rank and level, in order to impose these relations or their direct generalizations on new representations. We start with the class of “regular” graphs. The fusion coefficients for the $SU(N)$ Kac-Moody algebra at level $k$ are:
\[ R_\lambda \times R_\mu = \sum_{\gamma \in P_{++}^{(n)}} N_{\lambda\mu}^{\gamma} R_\gamma. \tag{14} \]
They are nonnegative integers, and by the Perron-Frobenius theorem, there exists for any such matrix an eigenvector $\phi^{(1)}$ of the largest eigenvalue $\gamma^{(1)}$, such that all its components $\phi^{(1)}_\mu$ are real and non-negative. In fact we have (Verlinde formula) [13]:
\[ N_{\lambda\mu}^{\gamma} = \sum_{\rho \in P_{++}^{(n)}} S_\lambda^{(\rho)} S_\mu^{(\rho)} S_\gamma^{(\rho)*} / S_1^{(\rho)}, \tag{15} \]
where 1 refers to the apex of the Weyl alcove, and $S_\lambda^{(\rho)}$ is given by the Kac-Peterson formula:
\[ S_\lambda^{(\rho)} = C \sum_{\omega \in S_N} \epsilon_\omega \exp(i\, \omega\rho \cdot \lambda\, 2\pi/n), \tag{16} \]
where $\epsilon_\omega = \det(\omega)$ and $C$ is a normalization constant fixed by the requirement that $S^{(\rho)}$ is an orthonormal system. The proof of the above formula can be found in [13]. Our graph $A_{\mu\gamma}$ is then given by:
\[ A_{\mu\gamma} = N_{f\mu}^{\gamma} = \sum_{\rho \in P_{++}^{(n)}} S_f^{(\rho)} S_\mu^{(\rho)} S_\gamma^{(\rho)*} / S_1^{(\rho)}. \tag{17} \]
We see that the $S^{(\rho)}$ are the eigenvectors of the graph $A$ with eigenvalues:
\[ \gamma^{(\rho)} = S_f^{(\rho)} / S_1^{(\rho)}. \tag{18} \]
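To make formulas (15) and (17) concrete, the following Python sketch evaluates the Verlinde formula in the simplest case $SU(2)$, which is an illustrative substitute for the $SU(3)$ setting of the text. It assumes the standard $SU(2)$ S-matrix $S_{ab} = \sqrt{2/n}\,\sin(\pi ab/n)$ with shifted labels $a, b = 1, \dots, n-1$ (level $k = n-2$); the function names are hypothetical. Fusion with the fundamental representation (label 2) should then reproduce the adjacency matrix of the chain $A_{n-1}$.

```python
import math

def su2_s_matrix(n):
    """S-matrix of SU(2) at level n - 2, indexed by shifted weights 1..n-1."""
    return {
        (a, b): math.sqrt(2.0 / n) * math.sin(math.pi * a * b / n)
        for a in range(1, n)
        for b in range(1, n)
    }

def verlinde(n):
    """Fusion coefficients N[(a, b, c)] obtained from the Verlinde formula."""
    S = su2_s_matrix(n)
    labels = range(1, n)
    N = {}
    for a in labels:
        for b in labels:
            for c in labels:
                value = sum(S[a, r] * S[b, r] * S[c, r] / S[1, r] for r in labels)
                N[a, b, c] = round(value)
                # The formula must return an integer up to rounding error.
                assert abs(value - N[a, b, c]) < 1e-9
    return N

N = verlinde(6)
# Fusion with the fundamental representation (label 2): adjacency matrix of A_5.
adjacency = [[N[2, b, c] for c in range(1, 6)] for b in range(1, 6)]
print(adjacency)
```

The S-matrix here is real, so the complex conjugate in (15) plays no role; for $SU(N)$ with $N > 2$ it does.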
We will denote $\gamma_\lambda^{(\rho)} = S_\lambda^{(\rho)} / S_1^{(\rho)}$. The largest eigenvalue, which corresponds to the apex of the Weyl alcove, is $\gamma^{(1)} = P_{N-1}(\beta/2)$, where $P_N(x)$ is the Chebyshev polynomial of the second kind of degree $N$, and the Perron-Frobenius eigenvector that corresponds to it is (up to normalization):
\[ \phi^{(1)}_\mu = \prod_{\alpha > 0} \sin\big(\tfrac{\pi}{n}(\alpha \cdot \mu)\big), \tag{19} \]
where the product is over all the positive roots. Let $x$ be a word in the $U_i$'s. We use $a_1 = a, a_2, \dots, a_{L-1}, a_L = b$ to denote a path of length $L-1$ on $A$ starting from $a$ and ending at $b$. As in [2], we use $\langle a_1 = a, a_2, \dots, a_{L-1}, a_L = b\,|\,x\,|\,a_1 = a, a_2, \dots, a_{L-1}, a_L = b \rangle$ to denote the diagonal elements of $x$. In the following we fix the initial point $a$. Define
\[ Z_{ab}^{(G)}(x) = \sum_{a_2, \dots, a_{L-1}} \langle a_1 = a, a_2, \dots, a_{L-1}, a_L = b\,|\,x\,|\,a_1 = a, a_2, \dots, a_{L-1}, a_L = b \rangle . \tag{20} \]
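The statement that the largest eigenvalue of the regular graph is $P_{N-1}(\beta/2)$ can be tested numerically for $SU(3)$. The Python sketch below is an illustration under stated assumptions: weights are written in Dynkin-label coordinates $(\lambda_1, \lambda_2)$, the steps $e_1 = (1,0)$, $e_2 = (-1,1)$, $e_3 = (0,-1)$ are the $SU(3)$ reading of (10), and the eigenvalue is approximated by power iteration on $A + 1$ (the shift is needed because the graph is 3-colorable). All function names are hypothetical.

```python
import math

def weyl_alcove(n):
    """Shifted SU(3) weights (l1, l2) with l1, l2 >= 1 and l1 + l2 < n."""
    return [(l1, l2) for l1 in range(1, n) for l2 in range(1, n - l1)]

# e_1, e_2, e_3 in Dynkin-label coordinates (assumed reading of (10) for N = 3).
STEPS = [(1, 0), (-1, 1), (0, -1)]

def largest_eigenvalue(n, iterations=2000):
    """Largest eigenvalue of the regular SU(3) graph, by power iteration on A + 1."""
    vertices = weyl_alcove(n)
    inside = set(vertices)
    phi = {v: 1.0 for v in vertices}
    for _ in range(iterations):
        new = {}
        for (l1, l2) in vertices:
            total = phi[(l1, l2)]  # contribution of the identity shift
            for (d1, d2) in STEPS:
                w = (l1 + d1, l2 + d2)
                if w in inside:
                    total += phi[w]
            new[(l1, l2)] = total
        norm = max(new.values())
        phi = {v: x / norm for v, x in new.items()}
    # At the apex (1, 1) the only outgoing bond leads to (2, 1).
    return phi[(2, 1)] / phi[(1, 1)]

def chebyshev_second_kind(k, x):
    """P_k(x) with P_0 = 1, P_1 = 2x, P_{k+1} = 2x P_k - P_{k-1}."""
    previous, current = 1.0, 2.0 * x
    if k == 0:
        return previous
    for _ in range(k - 1):
        previous, current = current, 2.0 * x * current - previous
    return current

n = 8
beta = 2 * math.cos(math.pi / n)
print(largest_eigenvalue(n), chebyshev_second_kind(2, beta / 2))
```

For $n = 8$ the alcove has $(n-1)(n-2)/2 = 21$ vertices, and both numbers printed should agree with $1 + 2\cos(2\pi/n)$.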
By using the cyclicity of the trace and the defining relations of the Hecke algebra, we can write:
\[ Z_{ab}^{(G)}(x) = \sum_{l = 0}^{M}\ \sum_{0 \le j_1 < \dots < j_l} f_{j_1, \dots, j_l}(x, \beta) \cdot \langle a_1 = a, a_2, \dots, a_{L-1}, a_L = b\,|\,U_{j_1} \cdots U_{j_l}\,|\,a_1 = a, a_2, \dots, a_{L-1}, a_L = b \rangle, \tag{21} \]
where $f_{j_1, \dots, j_l}(x, \beta)$ is some function depending on $x$ and $\beta$, and so is $M$. However, $f_{j_1, \dots, j_l}(x, \beta)$ and $M$ are independent of the graph $G$, since in deriving (21) we only use the Hecke algebra relations (they depend on the level $n$ and the rank $N$). Let us define a modified partition function:
\[ Z_{\mathrm{mod}}^{(\mu)}(x) = \sum_{b} Z_{ab}^{(G)}(x)\, \psi_b^{(\mu)} / \psi_a^{(\mu)} \tag{22} \]
and
\[ \mathrm{tr}^{(\mu)}(x) = \sum_{a_2, \dots, a_{L-1}, a_L} \langle a, a_2, \dots, a_{L-1}, a_L\,|\,x\,|\,a, a_2, \dots, a_{L-1}, a_L \rangle\, \psi_{a_L}^{(\mu)} / \psi_a^{(\mu)} . \tag{23} \]
For $\mu = 1$, we define the normalized trace $\mathrm{Tr}^{(1)}$ by $\mathrm{Tr}^{(1)}(x) = \frac{1}{(\gamma^{(1)})^{L-1}}\, \mathrm{tr}^{(1)}(x)$. One checks that ([3])
\[ \mathrm{tr}^{(1)}(x U_{n+1}) = P_{N-2}(\beta/2)\, \mathrm{tr}^{(1)}(x). \tag{24} \]
For any word $x$ in $1, U_1, \dots, U_n$ for the graph $A$, $\mathrm{Tr}^{(1)}$ is thus the Markov trace [3], and (24) follows from:
\[ \sum_{\lambda} U_{\mu,\gamma,\lambda,\gamma}\, \phi_\lambda^{(1)} = P_{N-2}(\beta/2)\, \phi_\gamma^{(1)}, \tag{25} \]
which has been verified in [3]. Based on the properties of the known models (regular graphs and their orbifolds, ADE lattice models), the following conditions on the graph $G$ are imposed in [1]:

(1) The largest eigenvalue of $G$ will be the same as that of $A$, i.e.
\[ \sum_{c} G_{bc}\, \psi_c^{(1)} = P_{N-1}(\beta/2)\, \psi_b^{(1)}, \tag{26} \]
and the following Markov property is satisfied:
\[ \sum_{c} U_{abcb}\, \psi_c^{(1)} = P_{N-2}(\beta/2)\, \psi_b^{(1)}. \tag{27} \]
(2) The graph $G$ is $N$-colorable, oriented, and the adjacency matrix $G$ is diagonalizable.
(3) Denote by $\exp(G)$ the set of eigenvalues of the graph $G$; then $\exp(G) \subset \exp(A)$, possibly up to multiplicities.
(4) There exists an involution $a \to a'$ such that $G_{ab} = G_{b'a'}$.

These conditions, and especially condition (3), are very restrictive. In [1] a list of graphs satisfying the above conditions is given for $SU(3)$. It is important to note, however, that these conditions are not restrictive enough, in the sense that the list in [1] contains graphs that cannot give rise to a representation of the Hecke algebra. We will be interested in the following 3 graphs in [1] and [2], where explicit Boltzmann weights which satisfy (5), (8) and (27) are given. The first graph is denoted by $E^{(8)}$:
[Figure: the oriented graph $E^{(8)}$.]
It has an obvious Z3 symmetry. One can encode the E (8) by its Z3 quotient:
[Figure: the $Z_3$ quotient of $E^{(8)}$, with vertices labeled 1 to 4.]
As in [1], we use squares to denote the entries of matrix U in the following way:
[Figure: the entry with indices $a, b, c, d$ is drawn as an oriented square with corners $a$, $b$, $c$, $d$.]
Then the nonzero Boltzmann weights which satisfy (5), (8), (27) are:
(c = cos(π/8), s = sin(π/8))
1 2 3 4 R R R R 2R 2 = 3 R 3 = 2 R 2 = 3R 3 = β = 2c 3
1
4
2 √ R 2c 2, 3R 2, 3 = 2−1/4 2 3 √ R 2c 3, 2R 3, 2 = −2−1/4
−1/4 2√ 2s
2
−1/4 −2 √ 2s
3 2 3 R R 2, 3, 4R 2, 3, 4 = 3, 2, 1R 3, 2, 1 = 3
2
√ √2s 2s 23/4 s
The other two graphs are E1(12) (n=12):
[Figure: the oriented graph $E_1^{(12)}$.]
√ √2s 2s 23/4 s
23/4 s 23/4 s 2s
!
and E (24) (n=24):
[Figure: the oriented graph $E^{(24)}$.]
The Boltzmann weights for $E_1^{(12)}$ and $E^{(24)}$ are given in [2]. The Boltzmann weights for the $E^{(8)}$ graph are obtained in [1] by intertwining its trivial $Z_3$ quotient with that of $A^{(8)}$ (see [1] for details). The Boltzmann weights for $E_1^{(12)}$, $E^{(24)}$ are obtained in [2] by imposing two additional conditions on the Boltzmann weights:
\[ \sum_{b} U_{abcb} = G_{ca}\, \beta, \tag{28} \]
\[ \sum_{b,c} (U_1)_{abcb} (U_2)_{bcdc} = (\beta^2 - 1)\, \delta_{ad} + \sum_{c} G_{ca} G_{dc}. \tag{29} \]
An explicit calculation using (28), (29) can be found in the appendix of [2]. In the rest of this section, let us check that the Boltzmann weights for $E^{(8)}$ given above satisfy (28) and (29). These are simple calculations. For (28), a somewhat nontrivial check is:
\[ \sum_{a = 2,3} U_{2a2a} = \sqrt{2}\,c + \sqrt{2}\,s = \beta, \qquad \sum_{x = 2,3,4} U_{2x3x} = (2 + 2\sqrt{2})\,s = \beta \]
(the rest are either simpler or follow by symmetry). For (29), one nontrivial check is:
(U1 )2434 (U2 )4323 + (U1 )2313 (U2 )3121 + (U1 )2323 (U2 )3222 + (U1 )2333 (U2 )3323 + (U1 )2222 (U2 )2222 + (U1 )2232 (U2 )2323 = 2βs + 4s2 + 3 √ = 2 = 4s2 + 3 = β2 + 2 X G2d G2d = (β 2 − 1)δ2,2 + d
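(the rest are either simpler or follow by symmetry). We will also use the following identities, which follow directly from (12) (for $SU(3)$):
\[ \sum_{j} U_{1, e_1, e_1+e_j, e_1}\, \phi^{(\mu)}_{e_1+e_j} / \phi_1^{(\mu)} = \gamma^{(\mu)} \beta, \tag{30} \]
\[ \sum_{\lambda, \gamma} (U_1)_{1, e_1, \lambda, e_1} (U_2)_{e_1, \lambda, \gamma, \lambda}\, \phi_\gamma^{(\mu)} / \phi_1^{(\mu)} = \beta^2 + (\gamma^{(\mu)})^2 - 1. \tag{31} \]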
3. Periodic Commuting Squares

Let $G$ be one of the three graphs $E^{(8)}$, $E_1^{(12)}$ or $E^{(24)}$. Choose a vertex $a$ of $G$. Let us construct $B_0 \subset B_1 \subset \dots \subset B_n \subset \dots$, the path algebras on $G$ starting from the initial point $a$ of $G$ ([10]), where $B_0$ has dimension one. Let $A_n$ be the subalgebra of $B_n$ generated by $1, U_1, \dots, U_{n-1}$, where the matrix elements of the action of $U_i$ are given by the Boltzmann weights which satisfy (5), (8), (27), (28), and (29) of Sec. 2.

Proposition 3.1. For $m$ big enough,
\[ \begin{array}{ccc} A_{m+1} & \subset & B_{m+1} \\ \cup & & \cup \\ A_m & \subset & B_m \end{array} \]
is a periodic commuting square in the sense of [3] with Markov trace $\mathrm{Tr}^{(1)}$ of Sec. 2 (see also Chapter 4 in [4]).

Proof. Since the $A_i$'s are the same as the $\pi^{(3,n)}$ representation of the Hecke algebra (see [3]), if $y \in B_m$ and $x \in A_{m+1}$ is such that $x = a U_m b$ (any element of $A_{m+1}$ can be written as a linear combination of elements of the form $a U_m b$ for $a, b \in A_m$ or $c$ for $c \in A_m$), it follows from (5) and (26) that
\[ \mathrm{Tr}^{(1)}(x E_{A_{m+1}}(y)) = \mathrm{Tr}^{(1)}(xy) = \mathrm{Tr}^{(1)}(a U_m b y) = P_1(\beta)\, \mathrm{Tr}^{(1)}(aby) = P_1(\beta)\, \mathrm{Tr}^{(1)}(ab E_{A_m}(y)) = \mathrm{Tr}^{(1)}(x E_{A_m}(y)). \]
Hence $E_{A_{m+1}}(y) = E_{A_m}(y)$, where $E$ is the conditional expectation; thus the above is a commuting square. Let $h$ be the longest distance on $G$ or $A$ from the point $a$. Let $m > h$. The inclusion matrices $A_m \subset A_{m+1}$, $B_m \subset B_{m+1}$ are periodic (with period 3, since both $A$ and $G$ are 3-colorable) and primitive for $m > h$. Hence the proposition follows from Corollary 3.1 below.

Lemma 3.1.
\[ \sum_{b} Z_{ab}^{(G)}(x)\, \psi_b^{(\mu)} / \psi_a^{(\mu)} = \sum_{\lambda} Z_{1\lambda}^{(A)}(x)\, \phi_\lambda^{(\mu)} / \phi_1^{(\mu)} . \]
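The simplest instance of Lemma 3.1, the case $x = 1$, can be illustrated numerically in an $SU(2)$ analogue, which is an assumption made for the example rather than the $SU(3)$ setting of the text. The Python sketch below takes the regular graph to be the chain $A_5$ and the non-regular graph to be $D_4$ (both with largest eigenvalue $\sqrt{3}$), uses only the Perron-Frobenius eigenvectors, and compares the eigenvector-weighted path sums from an end vertex. The function and variable names are hypothetical.

```python
import math

def weighted_path_sum(adjacency, psi, start, steps):
    """Sum over paths of `steps` bonds from `start` of psi[end] / psi[start]."""
    counts = {v: 0 for v in adjacency}
    counts[start] = 1
    for _ in range(steps):
        new = {v: 0 for v in adjacency}
        for v, c in counts.items():
            for w in adjacency[v]:
                new[w] += c
        counts = new
    return sum(c * psi[v] for v, c in counts.items()) / psi[start]

# Chain A_5 with Perron-Frobenius vector sin(j * pi / 6).
A5 = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
PSI_A5 = {j: math.sin(j * math.pi / 6) for j in A5}

# D_4: centre "c" joined to three leaves; Perron-Frobenius vector (1, 1, 1, sqrt(3)).
D4 = {"a": ["c"], "b": ["c"], "d": ["c"], "c": ["a", "b", "d"]}
PSI_D4 = {"a": 1.0, "b": 1.0, "d": 1.0, "c": math.sqrt(3)}

steps = 6
print(weighted_path_sum(A5, PSI_A5, 1, steps), weighted_path_sum(D4, PSI_D4, "a", steps))
```

Both sums equal $\gamma^{\text{steps}}$ with $\gamma = \sqrt{3}$, although the two graphs have different numbers of paths; the weighting by the eigenvector is what makes the two sides agree.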
Proof. By the discussion immediately following (20), it is enough to prove the lemma for x of the form $x = U_{i_1}U_{i_2}\cdots U_{i_s}$, $0 \le i_1 < i_2 < \dots < i_s$, $U_0 = 1$. We will prove this by induction on s. Fix $a_1 = a$. If x = 1:
$$\begin{aligned}
\sum_b Z^{(G)}_{ab}(1)\,\psi_b^{(\mu)}/\psi_a^{(\mu)}
&= \sum_b \sum_{a_2,\dots,a_m=b} \langle a_1,\dots,a_m|1|a_1,\dots,a_m\rangle\,\psi_{a_m}^{(\mu)}/\psi_{a_1}^{(\mu)} \\
&= \sum_{a_2,\dots,a_{m-1}} \langle a_1,\dots,a_{m-1}|1|a_1,\dots,a_{m-1}\rangle \sum_b G_{a_{m-1}b}\,\psi_{a_m}^{(\mu)}/\psi_{a_1}^{(\mu)} \\
&= \sum_{a_2,\dots,a_{m-1}} \langle a_1,\dots,a_{m-1}|1|a_1,\dots,a_{m-1}\rangle\,\gamma^{(\mu)}\,\psi_{a_{m-1}}^{(\mu)}/\psi_{a_1}^{(\mu)} \\
&= \dots = (\gamma^{(\mu)})^{m-1}.
\end{aligned}$$
Since $\gamma^{(\mu)}$ is the same for both the A and G graphs, we have proved that the lemma is true for x = 1.

From the above calculation, one notices the following property. Denote the RHS and LHS of the lemma by $f^A(x)$ and $f^G(x)$ respectively. If $x = U_{i_1}\cdots U_{i_s}U_{j_1}\cdots U_{j_t}$, $0 \le i_1 < \dots < i_s < j_1 < \dots < j_t$ and $j_1 > i_s + 1$, then:
$$\begin{aligned}
f^G(U_{i_1}\cdots U_{i_s}U_{j_1}\cdots U_{j_t})
&= \sum_{a_2,\dots,a_m=b} \langle a_1,\dots,a_m|U_{i_1}\cdots U_{i_s}U_{j_1}\cdots U_{j_t}|a_1,\dots,a_m\rangle\,\psi_{a_m}^{(\mu)}/\psi_{a_1}^{(\mu)} \\
&= \sum_{a_2,\dots,a_m=b} \langle a_1,\dots,a_{i_s+1}|U_{i_1}\cdots U_{i_s}|a_1,\dots,a_{i_s+1}\rangle \cdot \langle a_{i_s+1},\dots,a_{j_1-1}|1|a_{i_s+1},\dots,a_{j_1-1}\rangle \\
&\qquad\cdot \langle a_{j_1-1},a_{j_1},\dots,a_m|U_{j_1}\cdots U_{j_t}|a_{j_1-1},a_{j_1},\dots,a_m\rangle\,\psi_{a_m}^{(\mu)}/\psi_{a_1}^{(\mu)} \\
&= \sum_{a_2,\dots,a_m=b} \langle a_1,\dots,a_{i_s+1}|U_{i_1}\cdots U_{i_s}|a_1,\dots,a_{i_s+1}\rangle\,\psi_{a_{i_s+1}}^{(\mu)}/\psi_{a_1}^{(\mu)} \cdot (\gamma^{(\mu)})^{j_1-i_s-2} \\
&\qquad\cdot \langle a_{j_1-1},a_{j_1},\dots,a_m|U_{j_1}\cdots U_{j_t}|a_{j_1-1},a_{j_1},\dots,a_m\rangle\,\psi_{a_m}^{(\mu)}/\psi_{a_{j_1-1}}^{(\mu)} \\
&= f^G(U_{i_1}\cdots U_{i_s})\,f^G(U_{j_1}\cdots U_{j_t})\,(\gamma^{(\mu)})^{j_1-i_s-2}.
\end{aligned}$$
Hence if $x = U_{i_1}\cdots U_{i_s}U_{j_1}\cdots U_{j_t}$, $0 \le i_1 < \dots < i_s < j_1 < \dots < j_t$ and $j_1 > i_s + 1$, and $f^G(U_{i_1}\cdots U_{i_s}) = f^A(U_{i_1}\cdots U_{i_s})$, $f^G(U_{j_1}\cdots U_{j_t}) = f^A(U_{j_1}\cdots U_{j_t})$, then $f^G(x) = f^A(x)$. We call this property the Factorization property.

Let us continue our induction on s. Let $x = U_{i-1}$. Then
$$\begin{aligned}
f^G(U_{i-1})
&= \sum_{a_2,\dots,a_m=b} \langle a_1,\dots,a_m|U_{i-1}|a_1,\dots,a_m\rangle\,\psi_{a_m}^{(\mu)}/\psi_{a_1}^{(\mu)} \\
&= \sum_{a_2,\dots,a_m=b} \langle a_1,\dots,a_{i-2}|1|a_1,\dots,a_{i-2}\rangle\,(U_{i-1})_{a_{i-2},a_{i-1},a_i,a_{i-1}} \cdot \langle a_i,\dots,a_m|1|a_i,\dots,a_m\rangle\,\psi_{a_m}^{(\mu)}/\psi_{a_1}^{(\mu)} \\
&= \sum_{a_2,\dots,a_m=b} \langle a_1,\dots,a_{i-2}|1|a_1,\dots,a_{i-2}\rangle\,G_{a_i,a_{i-2}}\,\beta\,(\gamma^{(\mu)})^{m-i}\,\psi_{a_i}^{(\mu)}/\psi_{a_1}^{(\mu)} \\
&= \sum_{a_2,\dots,a_{i-2}} \langle a_1,\dots,a_{i-2}|1|a_1,\dots,a_{i-2}\rangle\,\beta P_1(\beta/2)\,(\gamma^{(\mu)})^{m-i}\,\psi_{a_{i-2}}^{(\mu)}/\psi_{a_1}^{(\mu)} \\
&= \beta P_2(\beta/2)\,(\gamma^{(\mu)})^{m-3}.
\end{aligned}$$
Notice that we use (28) in deriving the fourth identity. One can prove similarly that $f^A(U_{i-1}) = \beta P_2(\beta/2)(\gamma^{(\mu)})^{m-3}$.

Now let $x = U_{i-1}U_i$:
$$\begin{aligned}
f^G(U_{i-1}U_i)
&= \sum_{a_2,\dots,a_m=b} \langle a_1,\dots,a_m|U_{i-1}U_i|a_1,\dots,a_m\rangle\,\psi_{a_m}^{(\mu)}/\psi_{a_1}^{(\mu)} \\
&= \sum_{a_2,\dots,a_m=b} \langle a_1,\dots,a_{i-2}|1|a_1,\dots,a_{i-2}\rangle\,(U_{i-1})_{a_{i-2},a_{i-1},a_i,a_{i-1}} \cdot (U_i)_{a_{i-1},a_i,a_{i+1},a_i}\,\langle a_{i+1},\dots,a_m|1|a_{i+1},\dots,a_m\rangle\,\psi_{a_m}^{(\mu)}/\psi_{a_1}^{(\mu)} \\
&= \sum_{a_2,\dots,a_m=b} \langle a_1,\dots,a_{i-2}|1|a_1,\dots,a_{i-2}\rangle \cdot (\gamma^{(\mu)})^{m-i-1}\bigl(\beta^2 + (\gamma^{(\mu)})^2 - 1\bigr)\,\psi_{a_{i-2}}^{(\mu)}/\psi_{a_1}^{(\mu)} \\
&= (\gamma^{(\mu)})^{m-4}\bigl(\beta^2 + (\gamma^{(\mu)})^2 - 1\bigr),
\end{aligned}$$
where we use (29) in the third identity. One can show similarly that $f^A(U_{i-1}U_i) = (\gamma^{(\mu)})^{m-4}(\beta^2 + (\gamma^{(\mu)})^2 - 1)$ by using (31).

Assume the lemma is true for s < k + 2 (k > 0). Let us show it is true for s = k + 2. Let $x = U_{i_1}\cdots U_{i_{k-1}}U_pU_qU_r$, $0 \le i_1 < \dots < i_{k-1} < p < q < r$. There are three cases to consider:
(i) q > p + 1: $f^G(x) = f^A(x)$ follows by the induction hypothesis and the Factorization property;
(ii) r > q + 1: $f^G(x) = f^A(x)$ follows by the induction hypothesis and the Factorization property;
(iii) hence we are left with the case q = p + 1, r = p + 2.
We will use (8) to reduce the last case. Denote $Y = U_{i_1}\cdots U_{i_{k-1}}$. From (8):
$$Y(U_p - U_{p+2}U_{p+1}U_p + U_{p+1})(U_{p+1}U_{p+2}U_{p+1} - U_{p+1}) = 0,$$
i.e.:
$$YU_pU_{p+1}U_{p+2}U_{p+1} - YU_pU_{p+1} - YU_{p+2}U_{p+1}U_pU_{p+1}U_{p+2}U_{p+1} + YU_{p+2}U_{p+1}U_pU_{p+1} + YU_{p+1}U_{p+1}U_{p+2}U_{p+1} - YU_{p+1}U_{p+1} = 0.$$
By using the cyclic property of the trace and the Hecke algebra relations (5) of Sec. 1, we have:
$$\begin{aligned}
f^A(YU_pU_{p+1}U_{p+2}U_{p+1}) &= f^A\bigl(YU_p(U_{p+1} + U_{p+2}U_{p+1}U_{p+2} - U_{p+2})\bigr) \\
&= f^A(YU_pU_{p+1}) + f^A(YU_pU_{p+2}U_{p+1}U_{p+2}) - f^A(YU_pU_{p+2}) \\
&= f^A(YU_pU_{p+1}) - f^A(YU_pU_{p+2}) + \beta f^A(x).
\end{aligned}$$
To simplify notation, from now on we will drop those terms which follow from the induction hypothesis and the Factorization property. For example, we will write $f^A(YU_pU_{p+1}U_{p+2}U_{p+1}) = \beta f^A(x) + \dots$, since $f^A(YU_pU_{p+1}) - f^A(YU_pU_{p+2}) = f^G(YU_pU_{p+1}) - f^G(YU_pU_{p+2})$ by the induction hypothesis and the Factorization property. Similarly we have:
$$\begin{aligned}
f^A(YU_{p+2}U_{p+1}U_pU_{p+1}U_{p+2}U_{p+1}) &= f^A(YU_pU_{p+1}U_{p+2}U_{p+1}U_{p+2}U_{p+1}) \\
&= f^A\bigl(YU_pU_{p+1}U_{p+2}(U_{p+1} + U_{p+2}U_{p+1}U_{p+2} - U_{p+2})\bigr) \\
&= f^A(YU_pU_{p+1}U_{p+2}U_{p+1}) + \beta f^A(YU_pU_{p+1}U_{p+2}U_{p+1}U_{p+2}) - \beta f^A(x) \\
&= \beta f^A(YU_pU_{p+1}U_{p+2}U_{p+1}U_{p+2}) + \dots \\
&= \beta f^A\bigl(YU_p(U_{p+1} + U_{p+2}U_{p+1}U_{p+2} - U_{p+2})U_{p+2}\bigr) + \dots \\
&= \beta f^A(x) + \beta f^A(YU_pU_{p+2}U_{p+1}U_{p+2}^2) + \dots \\
&= \beta f^A(x) + \beta^2 f^A(YU_pU_{p+2}U_{p+1}U_{p+2}) + \dots \\
&= \beta f^A(x) + \beta^3 f^A(YU_pU_{p+1}U_{p+2}) + \dots \\
&= (\beta + \beta^3) f^A(x) + \dots.
\end{aligned}$$
Putting everything together, one has: $\beta(1 - \beta^2) f^A(x) + \dots = 0$. Since $\beta(1 - \beta^2) \ne 0$, $f^A(x)$ can be expressed as a linear combination of those terms (...), to which we can apply our induction hypothesis and the Factorization property. One can go through the same calculation as above with $f^G$. Hence $f^A(x) = f^G(x)$. By induction, the lemma is proved.

Corollary 3.1. For m ≥ h, $A_m \subset B_m$ is periodic with period 3.

Proof. By abuse of notation, let us assume $Z^G_{ab} = \oplus_\lambda V^\lambda_{ab} Z_{1\lambda}$, where $Z^G_{ab}$ stands for the string components of $B_m$ with endpoint b, $Z_{1\lambda}$ stands for the components of the Hecke algebra representation corresponding to Young diagrams λ, and $V^\lambda_{ab}$ are the branching coefficients. By Lemma 3.1:
$$\sum_b Z^{(G)}_{ab}(x)\,\psi_b^{(\mu)}/\psi_a^{(\mu)} = \sum_b \sum_\lambda V^\lambda_{ab} Z_{1\lambda}(x)\,\psi_b^{(\mu)}/\psi_a^{(\mu)} = \sum_\lambda Z_{1\lambda}(x)\,\phi_\lambda^{(\mu)}/\phi_1^{(\mu)}.$$
Choosing $x = P_\lambda$ to be the minimal idempotent corresponding to λ as in [3], we have $Z_{1\lambda'}(P_\lambda) = \delta_{\lambda',\lambda}$. Therefore $\sum_b V^\lambda_{ab}\,\psi_b^{(\mu)}/\psi_a^{(\mu)} = \phi_\lambda^{(\mu)}/\phi_1^{(\mu)}$. Now using the fact that $\sum_\mu \psi_b^{(\mu)}\psi_{b'}^{(\mu)*} = \delta_{b,b'}$, we get $V^\lambda_{ab} = \sum_\mu \phi_\lambda^{(\mu)}/\phi_1^{(\mu)}\,\psi_a^{(\mu)}\psi_b^{(\mu)*}$. Since $V^\lambda_{ab}$ only depends on the graphs G and A, and G and A are 3-colorable, it follows that for m ≥ h, $A_m \subset B_m$ is periodic with period 3.

4. The Principal Graphs

By Prop 3.1 and [3] or Chapter 4 of [4], we can construct finite-index subfactors. Using Wenzl's index formula in [3], one can easily calculate the index: for $E^{(8)}$, with initial point 1, index = $4 + 2\sqrt{2}$; for $E_1^{(12)}$, with initial point 1, index = $12 + 5\sqrt{3}$. We will now present a proof similar to [3] and [6] to determine the principal graphs. Denote by $A = \cup_i A_i \subset B = B^{(0)} = \cup_i B_i \subset B^{(1)} \subset \dots \subset B^{(k)} \subset \dots$, where $B^{(1)} \subset \dots \subset B^{(k)} \subset \dots$ is the tower obtained from the basic constructions [4]. Let 2m ≥ h, and let $P_1$ be the minimal idempotent corresponding to 1 (the apex of the Weyl alcove) in $A_{2m}$. $B_m^{(k)}$ is generated by $B_m^{(k-1)}$ and the Jones projection $e_k$; $B_m^{(0)} = B_m$.
Lemma 4.2. $P_1(A_{2m}' \cap B_{2m}^{(k)}) \subset P_1(A' \cap B^{(k)})$.

Proof. First notice that $B_{2m}^{(k)}$ is generated by $B_{2m}^{(0)}$ and $e_1, \dots, e_k$. By our construction, the operators $U_{2m+n}$ for all n ≥ 1 act on the paths on G with a fixed initial point a and do not change the part of a path starting from a with length 2m. Since $B_{2m}^{(0)}$ is the matrix algebra constructed out of the paths of length 2m on G with initial point a, it follows that $B_{2m}^{(0)}$ commutes with $U_{2m+n}$ for all n ≥ 1. On the other hand $e_1, \dots, e_k$ commute with A, which contains $U_{2m+n}$ for all n ≥ 1; therefore $B_{2m}^{(k)}$ commutes with $U_{2m+n}$ for all n ≥ 1. Let $x \in P_1(A_{2m}' \cap B_{2m}^{(k)})$; then x commutes with $P_1$ and $P_1U_{2m+n}$ for all n ≥ 1. Let us show that $(A_{2m+n})_{P_1} = \langle P_1, P_1U_{2m+1}, \dots, P_1U_{2m+n-1}\rangle$, n ≥ 0. Clearly the left side contains the right side. But the Bratteli diagram for $((A_{2m+n})_{P_1})_{n \ge 0}$ is given by the A graph as in Sec. 1, with initial point 1, the same as that of $(\langle P_1, P_1U_{2m+1}, \dots, P_1U_{2m+n-1}\rangle)_{n \ge 0}$. Hence $(A_{2m+n})_{P_1} = \langle P_1, P_1U_{2m+1}, \dots, P_1U_{2m+n-1}\rangle$, n ≥ 0. So x commutes with $(A_{2m+n})_{P_1}$ for all n ≥ 0. Therefore $x \in P_1(A_{P_1}' \cap B^{(k)}) = P_1(A' \cap B^{(k)})$.

Corollary 4.2. $P_1(A_{2m}' \cap B_{2m}^{(k)}) = P_1(A' \cap B^{(k)})\ (\simeq A' \cap B^{(k)})$, for 2m ≥ h.

Proof. By Wenzl's estimation [3], $\dim P_1(A_{2m}' \cap B_{2m}^{(k)}) \ge \dim P_1(A' \cap B^{(k)})$ for 2m ≥ h; it follows from Lemma 4.1 that $P_1(A_{2m}' \cap B_{2m}^{(k)}) = P_1(A' \cap B^{(k)})\ (\simeq A' \cap B^{(k)})$, for 2m ≥ h.
Hence the principal graph for $A \subset B$ is given by the Bratteli diagram for $A_{2m} \subset B_{2m}$ for 2m ≥ h, with its distinguished point corresponding to the central support of $P_1$ in $A_{2m}$. The Bratteli diagram for $A_{2m} \subset B_{2m}$ is given by Corollary 3.1. Notice that $V^1_{ab} = \delta_{a,b}$, so $A \subset B$ is irreducible. By [1], we have the following recursion formula to determine $V^\lambda_{ab}$ completely. First, since
$$\phi_\lambda^{(\mu)}/\phi_1^{(\mu)} \cdot \phi_\gamma^{(\mu)}/\phi_1^{(\mu)} = \sum_\rho N_{\lambda\gamma\rho}\,\phi_\rho^{(\mu)}/\phi_1^{(\mu)}$$
(see Sec. 1), it follows from Corollary 3.1 that
$$\sum_b V^\lambda_{ab}V^\mu_{bc} = \sum_\gamma N_{\lambda\mu\gamma}V^\gamma_{ac}.$$
Thinking of $V^\lambda_{ab}$ as a matrix with indices a, b, one obtains the remarkable fact (see [1]) that $\lambda \to V^\lambda$ is a representation of the fusion rule algebra! Notice that $V^f_{ab} = G_{ab}$, where f is the vector representation, and similarly $V^{\bar f}_{ab} = G_{ba}$, where $\bar f$ is the complex conjugate of f. For SU(3), the fusion rules may be generated recursively by repeated application of fusion by f and $\bar f$. In fact, let $\gamma = (\gamma_1 - 1)\Lambda_1 + (\gamma_2 - 1)\Lambda_2$; then
$$f^{\gamma_1-1} \cdot \bar f^{\gamma_2-1} = \gamma + \sum_{\gamma',\ \gamma_1' + \gamma_2' < \gamma_1 + \gamma_2} \gamma'.$$
Hence for any γ there exists a polynomial $P_\gamma(x, y)$ with integer coefficients such that $\gamma = P_\gamma(f, \bar f)$. Since $\lambda \to V^\lambda$ is a representation of the fusion rule algebra, we have $V^\lambda_{ab} = (P_\lambda(G, G^t))_{ab}$, thus determining $V^\lambda$ recursively from G.

Let us use the above recursion formula to determine the principal graphs of subfactors associated with the $E^{(8)}$ graph with initial point $1_1$. Although the recursion formula determines the principal graph completely, it is in general a very tedious calculation. (The same is true for the calculations of general fusion coefficients of Kac-Moody algebras.) We will use the obvious $Z_3$ symmetry of the graph $E^{(8)}$ to simplify our calculation. We will write $\gamma = (\gamma_1 - 1)\Lambda_1 + (\gamma_2 - 1)\Lambda_2$. Then one can read from the $A^{(8)}$ graph the following recursion relations:
$$V^{(\gamma_1+2,1)} = V^{(\gamma_1+1,1)}G - V^{(\gamma_1,1)}G^t + V^{(\gamma_1-1,1)};$$
$$V^{(\gamma_1,\gamma_2)} = V^{(\gamma_1+1,\gamma_2-1)}G - V^{(\gamma_1+2,\gamma_2-1)} - V^{(\gamma_1+1,\gamma_2-2)}.$$
In the above formula, if γ is outside the Weyl alcove, $V^\gamma$ is set to be 0. Due to the $Z_3$ symmetry of the $E^{(8)}$ graph, we have $V^\gamma_{1b} = V^{\sigma(\gamma)}_{1\sigma(b)}$, where σ is the anti-clockwise rotation around the center of the graphs by 120 degrees. By using the recursion formula, we have:
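$$V^{(1,1)} = I,\quad V^{(2,1)} = G,\quad V^{(3,1)} = G^2 - G^t,\quad V^{(4,1)} = G^3 - 2G^tG + I,$$
$$V^{(2,2)} = G^tG - I,\quad V^{(3,2)} = G^2G^t - G - (G^t)^2,$$
where I stands for the identity matrix. Hence:
$$\begin{aligned}
V^{(2,1)}_{1b} &= G_{1b} = \delta_{b,2_1}, \\
V^{(3,1)}_{1b} &= G_{1_12_1}G_{2_1b} - G_{b1_1}, \\
V^{(4,1)}_{1b} &= \sum_c G_{2_1c}G_{cb} - 2G_{3_11_1}G_{3_1b} + \delta_{1_1,b}, \\
V^{(2,2)}_{1b} &= G_{3_1b} - \delta_{1_1,b}, \\
V^{(3,2)}_{1b} &= \sum_c G_{2_1c}G_{bc} - G_{1_1b} - G_{b,3_1}.
\end{aligned}$$
One can use the $Z_3$ symmetry to determine $V^{(5,1)}_{1b}$ as follows:
$$V^{(5,1)}_{1b} = \sum_c V^{(6,1)}_{1c}V^{\bar f}_{cb} = \sum_c V^{\sigma(1,1)}_{1c}G_{bc} = \sum_c V^{(1,1)}_{1\sigma(c)}G_{bc} = G_{b1_3} = \delta_{3_3,b}.$$
Putting all these together and using the $Z_3$ symmetry, we obtain the principal graph:

[Figure: principal graph. Recoverable vertex labels: (1,1), (1,2), (2,1), (2,3), (4,1), (4,3), (1,5) and $1_1$, $2_3$, $3_2$, $4_2$.]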
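The closed forms for $V^{(3,1)}$, $V^{(4,1)}$, $V^{(2,2)}$ and $V^{(3,2)}$ listed in this section can be reproduced mechanically from the two recursion relations. The following Python sketch does this with polynomials in two commuting symbols standing for G and $G^t$. It rests on assumptions that should be stated loudly: G and $G^t$ are treated as commuting (which is what the closed form $G^3 - 2G^tG + I$ presupposes), only the lower boundary of the Weyl alcove ($\gamma_1, \gamma_2 \ge 1$) is imposed, and the function names are hypothetical.

```python
# A polynomial in the commuting symbols G and Gt is a dict {(i, j): coeff},
# where the key (i, j) stands for the monomial G**i * Gt**j.

def poly_add(p, q, sign=1):
    """Return p + sign * q, dropping monomials whose coefficient becomes zero."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + sign * c
        if out[mono] == 0:
            del out[mono]
    return out

def poly_times(p, di, dj):
    """Multiply p by the monomial G**di * Gt**dj."""
    return {(i + di, j + dj): c for (i, j), c in p.items()}

_cache = {}

def V(g1, g2):
    """Polynomial for V^(g1,g2) obtained from the two recursion relations."""
    if g1 < 1 or g2 < 1:          # assumption: only the lower alcove walls are imposed
        return {}
    key = (g1, g2)
    if key in _cache:
        return _cache[key]
    if key == (1, 1):
        res = {(0, 0): 1}         # V^(1,1) = I
    elif key == (2, 1):
        res = {(1, 0): 1}         # V^(2,1) = G
    elif g2 == 1:
        # V^(g1,1) = V^(g1-1,1) G - V^(g1-2,1) Gt + V^(g1-3,1)
        res = poly_times(V(g1 - 1, 1), 1, 0)
        res = poly_add(res, poly_times(V(g1 - 2, 1), 0, 1), -1)
        res = poly_add(res, V(g1 - 3, 1))
    else:
        # V^(g1,g2) = V^(g1+1,g2-1) G - V^(g1+2,g2-1) - V^(g1+1,g2-2)
        res = poly_times(V(g1 + 1, g2 - 1), 1, 0)
        res = poly_add(res, V(g1 + 2, g2 - 1), -1)
        res = poly_add(res, V(g1 + 1, g2 - 2), -1)
    _cache[key] = res
    return res
```

For instance, `V(2, 2)` comes out as `{(1, 1): 1, (0, 0): -1}`, that is $G^tG - I$. Substituting an actual adjacency matrix for G would then give the matrices $V^\lambda$, subject to the upper alcove boundary that this sketch does not model.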
5. Conclusions and Questions

We have demonstrated how to construct some finite-depth subfactors from certain integrable lattice models. This may be viewed as a generalized GHJ construction of finite-depth subfactors. We also determine the principal graphs of these subfactors. It is remarkable that the principal graphs can be determined recursively. Since the integrable lattice models associated to $E^{(8)}$, $E_1^{(12)}$, $E^{(24)}$ are conjectured to renormalize to conformal models described by the conformal embeddings $SU(3)_5 \subset SU(6)$, $SU(3)_9 \subset E_6$, and $SU(3)_{21} \subset E_7$, it is tempting to conjecture that some of the subfactors we constructed (with initial point 1) are exactly the same as the subfactors coming from the conformal embeddings above [9]. If so, our results determine the principal graphs of these subfactors, which were previously unknown. A direct proof of the above statements would fully validate the statement that “On the finite system (i.e. the periodic commuting square), the chiral symmetry is replaced by quantum group symmetry ($U_qSU(2)$ or Temperley-Lieb algebra in the GHJ case, $U_qSU(3)$ or Hecke algebra in our case).”

It is straightforward to construct $E^{(8)}$, $E_1^{(12)}$, $E^{(24)}$ subfactors (the analogue of the $E_6$ and $E_8$ subfactors in the GHJ case) with index values $\sin^2(3\pi/8)/\sin^2(\pi/8)$, $\sin^2(3\pi/12)/\sin^2(\pi/12)$ and $\sin^2(3\pi/24)/\sin^2(\pi/24)$ respectively. Based on the empirical relations between subfactors and type I modular invariants (see [1], [5]), it is easy to write down the conjectured principal graphs. For example, for the $E^{(8)}$ case, it is:

[Figure: conjectured principal graph for the $E^{(8)}$ case.]

Such subfactors can be viewed as “quotients of Wenzl's Hecke-algebra subfactors” by the quantum symmetry provided by the conformal embedding (see [5]). It is also an interesting question to determine the dual principal graphs of these subfactors. It is clear that one should be able to generalize our ideas in several directions: consider the N > 3 cases, replacing the Hecke algebra by the Birman-Wenzl algebra and, for exceptional Lie groups, using the R-matrices of quantum groups at roots of unity along the lines of [7]. One will get generalized GHJ subfactors, of which many subfactors coming from conformal embeddings will be (conjecturally) an important part. We will consider these questions and related questions in another paper.

Acknowledgement. I would like to thank the referee for suggestions, especially for pointing out a mistake in the calculation of an example.
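The three index values quoted in this section are all of the form $\sin^2(3\pi/n)/\sin^2(\pi/n)$ with n = 8, 12, 24. As a small numerical illustration, the following Python sketch evaluates them; the function name is a hypothetical choice, and the closed forms in the comments are elementary trigonometric identities rather than statements taken from the paper.

```python
import math

def index_value(n):
    """Evaluate sin^2(3*pi/n) / sin^2(pi/n), the index formula quoted in the text."""
    return (math.sin(3.0 * math.pi / n) / math.sin(math.pi / n)) ** 2

# n = 8:  sin(3*pi/8) = cos(pi/8), so the value is cot^2(pi/8) = 3 + 2*sqrt(2)
# n = 12: sin^2(pi/4) / sin^2(pi/12) = 4 + 2*sqrt(3)
INDEX_VALUES = {n: index_value(n) for n in (8, 12, 24)}
```

The n = 24 value has no comparably short closed form here, so the sketch simply leaves it as a floating-point number.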
References
1. Di Francesco, P. and Zuber, J.-B.: Integrable lattice models associated with SU(N). Nucl. Phys. B338, 602 (1990)
2. Sochen, N.: Nucl. Phys. B360, 613 (1991)
3. Wenzl, H.: Subfactors from Hecke algebras. Invent. Math. 121, 44–80 (1988)
4. Goodman, F.M., de la Harpe, P., Jones, V.: Towers of algebras and Coxeter graphs. MSRI publications, no. 14
5. Kawahigashi, Y.: Classifications of paragroup actions on subfactors. Preprint 1993, to appear
6. Okamoto, S.: Invariants for subfactors arising from Coxeter graphs. Current topics in operator algebras, Singapore: World Scientific Publishing, 1991, pp. 84–103
7. Xu, F.: Orbifold construction in subfactors. Comm. Math. Phys. 166, 237–253 (1994) and Ph.D thesis, Berkeley, 1995
8. Pasquier, V.: Two dimensional critical systems labeled by Dynkin diagrams. Nucl. Phys. B285, 167–172 (1987)
9. Wassermann, A.: Subfactors from representations of Loop group. To appear in ICM 1994
10. Ocneanu, A.: Quantum symmetry, differential geometry of finite graphs and classification of subfactors. University of Tokyo Seminary Notes 45, recorded by Y. Kawahigashi, 1991
11. Izumi, M. and Rehren, K.H.: Personal communications, 1993
12. Reshetikhin, N.: Quantized universal enveloping algebras, Yang-Baxter equations and invariants of links, I and II. LOMI Preprints, E-4-87
13. Kac, V.: Kac-Moody algebras. 1993, 3rd Edition
14. Finkerberg, E.: Ph.D thesis, Harvard 1993

Communicated by H. Araki
Commun. Math. Phys. 184, 493–508 (1997)
Communications in
Mathematical Physics © Springer-Verlag 1997
Generalized Goodman–Harpe–Jones Construction of Subfactors, II Feng Xu Department of Mathematics, UCLA, Los Angeles, CA 90024, USA Received: 16 February 1996 / Accepted: 25 July 1996
Abstract: We build some exceptional representations of Birman–Wenzl algebras from the data of certain conformal embeddings. As a result we construct new finite depth subfactors whose principal graphs are completely determined.
1. Introduction

The motivation of this paper is to describe the subfactors associated with the conformal embeddings $SO(5)_3 \subset SO(10)$, $SO(5)_7 \subset SO(14)_1$. As promised in [1], we use Birman-Wenzl algebras ([2]) to describe a construction of subfactors conjecturally associated with $SO(5)_3 \subset SO(10)$ and $SO(5)_7 \subset SO(14)$ ([5]). Moreover, the principal graphs of these subfactors are determined completely. The content of this paper is as follows:

In Sect. 2 we introduce some basic properties of Birman-Wenzl algebras and their regular representations. We then outline a general procedure to search for some graphs closely related to the data of conformal embeddings. These graphs are used to give some “exceptional” representations of Birman-Wenzl algebras. (These representations are built on paths on graphs.)

In Sect. 3 we give the complete details of the calculations of the matrix elements (Boltzmann weights) corresponding to the $SO(5)_3 \subset SO(10)$ case.

In Sect. 4 we construct periodic commuting squares (see Chapter 4 of [6]) and give a formula which determines the principal graphs of some subfactors completely, as in [1].

In Sect. 5 we present a cell system associated with $SL(5)_3 \subset SL(10)$ following [3], which gives rise to some subfactors closely related to the $SL(3)_5 \subset SL(6)$ subfactors constructed in [1]. We also present some evidence suggesting that the subfactors associated with $SL(n)_m \oplus SL(m)_n \subset SL(mn)$ should be the central sequence subfactors ([10]) of the Hecke-algebra subfactors constructed by Wenzl.

In Sect. 6 we give conclusions and some further questions.
2. The Birman-Wenzl algebras

The Birman-Wenzl algebra associated with $SO(5)_k$ (k is a positive integer) is a complex algebra denoted by $C_f(q^{-5}, q)$ with $q = \exp\frac{\pi i}{2(k+3)}$ (see [2]). It is given by generators $1, g_1, \dots, g_{f-1}$, which are assumed to be invertible, and relations:
$$\begin{aligned}
g_ig_{i+1}g_i &= g_{i+1}g_ig_{i+1}, && (1)\\
g_ig_j &= g_jg_i \quad \text{if } |i - j| \ge 2, && (2)\\
e_ig_i &= r^{-1}e_i, && (3)\\
e_ig_{i-1}^{\pm1}e_i &= r^{\pm1}e_i, && (4)\\
(q - q^{-1})(1 - e_i) &= g_i - g_i^{-1}, && (5)
\end{aligned}$$
where $r = q^{-5}$. It follows from the above defining relations that:
$$\begin{aligned}
g_{i\pm1}^{-1}g_ig_{i\pm1} &= g_ig_{i\pm1}g_i^{-1}, \\
e_{i\pm1}g_ig_{i\pm1} &= g_ig_{i\pm1}e_i, \\
g_ie_{i\pm1}g_i^{-1} &= g_{i\pm1}^{-1}e_ig_{i\pm1}, \\
e_ig_{i+1}^{\pm1}e_i &= r^{\pm1}e_i, \\
g_ie_{i\pm1}e_i &= g_{i\pm1}^{-1}e_i, \\
e_ie_j &= e_je_i \quad \text{if } |i - j| \ge 2, \qquad\qquad (6)\\
e_i^2 &= xe_i, \\
e_ie_{i\pm1}e_i &= e_i, \\
e_{i-1}g_ig_{i-1} &= e_{i-1}e_i, \\
g_i^2 &= (q - q^{-1})(g_i - r^{-1}e_i) + 1,
\end{aligned}$$
where $x = \frac{r - r^{-1}}{q - q^{-1}} + 1$. (The last relation is obtained by multiplying (5) by $g_i$ and using (3).) The relation $e_{i-1}g_ig_{i-1} = e_{i-1}e_i$ will be crucial for our argument in Sect. 4. The following proposition is an easy consequence of the above relations (see Prop. 3.2 of [2]).
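As a quick sanity check on the parameters just introduced, the following Python sketch evaluates q, r and x numerically and compares x with the quantum integer $[5] = (q^5 - q^{-5})/(q - q^{-1})$. Since $r = q^{-5}$, the definition of x gives $x = 1 - [5]$ for every level k; at k = 3 (the case treated in Sect. 3, where $q = \exp(\pi i/12)$) this is $-(1 + \sqrt{3})$. The function names are hypothetical choices for illustration.

```python
import cmath
import math

def bw_parameters(k):
    """Return (q, r, x) with q = exp(pi*i / (2(k+3))), r = q**-5, x = (r - 1/r)/(q - 1/q) + 1."""
    q = cmath.exp(1j * math.pi / (2 * (k + 3)))
    r = q ** -5
    x = (r - 1 / r) / (q - 1 / q) + 1
    return q, r, x

def quantum_integer(n, q):
    """The q-number [n] = (q**n - q**-n) / (q - q**-1)."""
    return (q ** n - q ** -n) / (q - 1 / q)

# At level k = 3: x = 1 - [5] = -(1 + sqrt(3)), up to floating-point rounding.
Q3, R3, X3 = bw_parameters(3)
```

The returned values are complex floats; the imaginary part of `X3` is zero only up to rounding error, so comparisons should use a tolerance.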
Proposition 2.1. (a) Any element of $C_{f+1}$ can be written as a linear combination of elements of the form axb with $x \in \{1, g_f, e_f\}$ and $a, b \in C_f$. In particular, it follows from this by induction on f that $C_{f+1}$ is finite dimensional.
(b) There exists for all elements $a \in C_{f+1}$ a unique element $E_f(a) \in C_f$ such that $e_{f+1}ae_{f+1} = E_f(a)e_{f+1}$. The map $a \mapsto E_f(a)$ is linear and has the so-called bimodule property, i.e. $E_f(b_1ab_2) = b_1E_f(a)b_2$ for $b_1, b_2 \in C_f$ and $a \in C_{f+1}$.

We will be interested in representations of $C_f(q^{-5}, q)$ built from paths on certain finite unoriented graphs. To introduce the so-called “regular” representations of $C_f(q^{-5}, q)$, we remind the reader that the integrable representations of the $\widehat{so(5)}$ Kac-Moody algebra at level k are indexed by the finite set (Weyl alcove):
$$P_{++}^{(k)} = \{\lambda \in p \mid \lambda = b_1e_1 + b_2e_2,\ b_1 \ge b_2 \ge 0,\ b_1 + b_2 \le k + 3\},$$
where p is the weight lattice of SO(5) and $e_1, e_2$ are a basis of $R^2$. We take these points as the vertices of our “regular” graph, denoted by A. The bonds between the vertices of A are along the $e_i$'s. This unoriented regular graph A describes the fusion rules of the representation $R_\lambda$ with the spin representation $R_{e_1}$:
$$R_{e_1} \times R_\lambda = \sum_\mu N_{e_1\lambda\mu}R_\mu,$$
where the $N_{e_1\lambda\mu}$ are non-negative integers. We take $A_{\lambda\mu} = N_{e_1\lambda\mu}$ to be the adjacency matrix of the graph A. The regular representation of $C_f(q^{-5}, q)$ is a representation of $C_f(q^{-5}, q)$ built on paths on A. The generators for $C_f(q^{-5}, q)$ in this representation were given in [8] (strictly speaking, the following expressions are obtained from (2.5) of [8] by taking $u \to i\infty$ and suitable normalization):
$$\begin{aligned}
(g^{-1})_{\lambda,\lambda+e_\mu,\lambda+2e_\mu,\lambda+e_\mu} &= -q, \\
(g^{-1})_{\lambda,\lambda+e_\mu,\lambda+e_\mu+e_\nu,\lambda+e_\mu} &= \frac{-q^{a_\nu-a_\mu}}{[a_\nu - a_\mu]}, \\
(g^{-1})_{\lambda,\lambda+e_\mu,\lambda+e_\mu+e_\nu,\lambda+e_\nu} &= -\left(\frac{[a_\mu - a_\nu + 1][a_\mu - a_\nu + 1]}{[a_\mu - a_\nu]^2}\right)^{1/2}, \\
(g^{-1})_{\lambda,\lambda+e_\mu,\lambda,\lambda+e_\nu} &= -q^{-1-a_\mu-a_\nu}(G_{a_\mu}G_{a_\nu})^{1/2}, \quad \mu \ne \nu, \\
(g^{-1})_{\lambda,\lambda+e_\mu,\lambda,\lambda+e_\mu} &= \frac{q^{-2a_\mu-1}}{[2a_\mu + 1]} - \frac{q^{-2a_\mu-1}}{[2a_\mu + 1]}G_{a_\mu},
\end{aligned}$$
where $\lambda + 2e_1 + e_2 = a_1e_1 + a_2e_2$; $\mu, \nu = 1, 2$; $[x] = \frac{q^x - q^{-x}}{q - q^{-1}}$;
$$G_{a_\mu} = -\frac{[2a_\mu + 2]}{[2a_\mu]}\prod_{k \ne \pm\mu,\ k \ne 0}\frac{[a_\mu - a_k + 1]}{[a_\mu - a_k]} \quad \text{for } \mu \ne 0; \qquad G_{a_0} = 1.$$
It is easy to check from the above formulas that
$$(g^{-1})_{0,e_1,2e_1,e_1} = -q, \qquad (g^{-1})_{0,e_1,0,e_1} = q^{-5}, \qquad (g^{-1})_{0,e_1,e_1+e_2,e_1} = q^{-1}.$$
The second formula follows from $G_{a_1} = -d = 1 - [5]$.

The fusion coefficients for the $\widehat{so(5)}$ Kac–Moody algebra at level k are given by:
$$R_\lambda \times R_\mu = \sum_{\nu \in P_{++}^{(k)}} N_{\lambda\mu\nu}R_\nu.$$
They are nonnegative integers, and by the Perron-Frobenius theorem there exists for any such matrix an eigenvector $\psi^{(1)}$ of the largest eigenvalue $\nu^{(1)}$ such that all its components $\psi_a^{(1)}$ are real and non-negative. In fact we have the Verlinde formula [12]:
$$N_{\lambda\mu\nu} = \sum_{\rho \in P_{++}^{(k)}} S_\lambda^{(\rho)}S_\mu^{(\rho)} \cdot S_\nu^{(\rho)*}/S_1^{(\rho)},$$
where 1 refers to the apex of the Weyl alcove, and $S_\lambda^{(\rho)}$ is given by the Kac-Peterson formula:
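$$S_\lambda^{(\rho)} = C\sum_{\omega \in S_2} E_\omega \exp\bigl(i\,\omega\rho \cdot \lambda\,2\pi/(k + 3)\bigr),$$
where $E_\omega = \det(\omega)$ and C is a normalization constant fixed by the requirement that $S^{(\rho)}$ is an orthonormal system. The proof of the above formula can be found in [12]. The regular graph $A_{\mu\nu}$ is given by:
$$A_{\mu\nu} = N_{e_1\mu\nu} = \sum_{\rho \in P_{++}^{(k)}} S_{e_1}^{(\rho)}S_\mu^{(\rho)}S_\nu^{(\rho)*}/S_1^{(\rho)}.$$
We see that the $S^{(\rho)}$ are the eigenvectors of the graph A with eigenvalues $\nu^{(\rho)} = S_{e_1}^{(\rho)}/S_1^{(\rho)}$. We will denote $\nu_\lambda^{(\rho)} = S_\lambda^{(\rho)}/S_1^{(\rho)}$. Let x be a word in the $g_i^{\pm1}$'s, $i = 1, \dots, L - 1$. Define:
$$Z_{ab}(x) = \sum_{a_2,\dots,a_{L-1}} \langle a_1 = a, a_2, \dots, a_{L-1}, a_L = b\,|\,x\,|\,a_1 = a, a_2, \dots, a_{L-1}, a_L = b\rangle.$$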
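The explicit weights of the regular representation quoted earlier in this section can be spot-checked numerically. The Python sketch below evaluates $G_{a_\mu}$ and the diagonal weight $(g^{-1})_{\lambda,\lambda+e_\mu,\lambda,\lambda+e_\mu}$ at λ = 0, where $a_1 = 2$, $a_2 = 1$, and confirms the two statements made in the text: $G_{a_1} = 1 - [5]$ and $(g^{-1})_{0,e_1,0,e_1} = q^{-5}$. Two things are assumptions and not taken from the paper: the product over k is read as running over $k \in \{\pm1, \pm2\}$ with $a_{-k} = -a_k$, and all function names are hypothetical.

```python
import cmath
import math

def q_at_level(k):
    """q = exp(pi*i / (2(k+3))) as in the definition of C_f(q^-5, q)."""
    return cmath.exp(1j * math.pi / (2 * (k + 3)))

def qnum(n, q):
    """The q-number [n] = (q**n - q**-n) / (q - q**-1)."""
    return (q ** n - q ** -n) / (q - 1 / q)

def G_weight(mu, a, q):
    """G_{a_mu} = -[2a_mu+2]/[2a_mu] * prod_{k != +-mu, k != 0} [a_mu - a_k + 1]/[a_mu - a_k].

    Assumption: k ranges over {+-1, +-2} and a_{-k} = -a_k.
    """
    extended = {1: a[1], 2: a[2], -1: -a[1], -2: -a[2]}
    value = -qnum(2 * a[mu] + 2, q) / qnum(2 * a[mu], q)
    for k, a_k in extended.items():
        if abs(k) == mu:
            continue
        value *= qnum(a[mu] - a_k + 1, q) / qnum(a[mu] - a_k, q)
    return value

def diagonal_weight(mu, a, q):
    """(g^-1)_{lam, lam+e_mu, lam, lam+e_mu} = q^(-2a_mu-1)/[2a_mu+1] * (1 - G_{a_mu})."""
    return q ** (-2 * a[mu] - 1) / qnum(2 * a[mu] + 1, q) * (1 - G_weight(mu, a, q))

A_AT_ZERO = {1: 2, 2: 1}   # lambda = 0 gives a_1 e_1 + a_2 e_2 = 2 e_1 + e_2
```

Under these assumptions the check works for any level k, because $[6][2]/[3] = [5] - 1$ holds identically; k = 3 and k = 7 are the two levels discussed in the paper.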
Proposition 2.2.
$$Z_{ab}(x) = \sum_{\ell=0,\dots,M}\ \sum_{0 \le j_1 < \dots < j_\ell} f_{j_1^\pm \cdots j_\ell^\pm}(x)\,\langle a_1 = a, \dots, a_\ell = b\,|\,g_{j_1}^{\pm1}\cdots g_{j_\ell}^{\pm1}\,|\,a_1 = a, \dots, a_\ell = b\rangle,$$
where $f_{j_1^\pm \cdots j_\ell^\pm}(x)$ is some universal function depending only on x and the Birman–Wenzl algebra $C_f(q^{-5}, q)$.

Proof. Let us prove it by induction on f. For f = 1, the proposition is trivial. Assume now that the proposition is true for f ≤ n. By Prop 2.1, any element of $C_{n+1}$ can be written as a linear combination of elements of the form $a'xb'$ with $x \in \{1, g_n, g_n^{-1}\}$ and $a', b' \in C_n$. Since $Z_{ab}(x)$ is a linear functional of x, all we need to show is that the proposition is true for $Z_{ab}(a'xb')$ with $x \in \{1, g_n, g_n^{-1}\}$. By using the cyclicity of the trace, $Z_{ab}(a'xb') = Z_{ab}(b'a'x)$. Since $b'a' \in C_n$, $b'a'$ can be written as a linear combination of $c'x_1d'$ with $x_1 \in \{1, g_{n-1}^{\pm1}\}$ and $c', d' \in C_{n-1}$. Notice that x commutes with $c', d'$. Hence
$$Z_{ab}(b'a'x) = Z_{ab}(c'x_1d'x) = Z_{ab}(d'c'x_1x).$$
Continuing in this fashion and using our induction hypothesis, we get
$$Z_{ab}(x) = \sum_{\ell=0,\dots,M}\ \sum_{0 \le j_1 < \dots < j_\ell} f_{j_1^\pm j_2^\pm \cdots j_\ell^\pm}(x) \cdot \langle a_1 = a, \dots, a_\ell = b\,|\,g_{j_1}^{\pm1}\cdots g_{j_\ell}^{\pm1}\,|\,a_1 = a, \dots, a_\ell = b\rangle.$$
Since in the proof we only use the structure property of the Birman–Wenzl algebra and the cyclicity of the trace, f depends only on x and $C_f(q^{-5}, q)$.
Set $d = [5] - 1$ and let x be a word in $g_i^{\pm1}$, $i = 1, \dots, L - 1$. Let us define a modified partition function as in [1],
$$Z^{(\mu)}_{\rm mod}(x) = \sum_b Z_{ab}(x)\,\frac{\phi_b^{(\mu)}}{\phi_a^{(\mu)}} \times \frac{1}{d^L} \overset{\Delta}{=} Tr^{(\mu)}(x).$$
One checks that
$$Tr^{(1)}(ag_f^{\pm1}) = \frac{-r^{\pm1}}{d}\,Tr^{(1)}(a) \quad \text{for } a \in C_f.$$
$Tr^{(1)}$ is thus the Markov trace ([23]). The above property follows from
$$\sum_\lambda g^{\pm1}_{\mu,\nu,\lambda,\nu}\,\phi_\lambda^{(1)} = (-r^{\pm1}) \cdot \phi_\gamma^{(1)}. \qquad (7)$$
We will need one more property of the regular representation:
$$\sum_{\delta,\gamma}(g^{\pm1})_{\lambda,\delta,\gamma,\delta}\,\frac{\phi_\gamma^{(\mu)}}{\phi_\lambda^{(\mu)}} = \sum_\gamma (g^{\pm1})_{0,e_1,\gamma,e_1}\,\frac{\phi_\gamma^{(\mu)}}{\phi_0^{(\mu)}} \overset{\Delta}{=} x_\pm^{(\mu)}. \qquad (8)$$
Equation (8) can be checked directly by using the explicit expression for $g^{\pm1}$. We are now ready to construct some exceptional graphs in order to construct exceptional representations of $C_f(q^{-5}, q)$.

A remark concerning the notation. We denote by A, Greek letters and φ the regular graph, its vertices and its eigenvectors respectively, while G, Roman letters and ψ denote the exceptional graph, its vertices and eigenvectors. All the eigenvectors are normalized, i.e. $\sum_\mu \psi_a^{(\mu)}\psi_b^{(\mu)*} = \delta_{ab}$, where the sum over the multiplicity of eigenvectors corresponding to the same eigenvalue (if any) is assumed.

The exceptional graphs G associated with certain conformal embeddings (we study $SO(5)_3 \subset SO(10)$ and $SO(5)_7 \subset SO(14)$) are determined according to some empirical relations of [3]. We will list below the properties which will help us to determine G. The reader may find more examples in [3].

(1) The number of vertices of G is the same as the number of fields in the partition function Z associated with the conformal embedding. For example, in the $SO(5)_3 \subset SO(10)$ case, $Z = |x_1 + x_{35}|^2 + |x_{10} + x_{30}|^2 + 2|x_{16}|^2$. Here the numbers indicate the dimensions of irreducible representations of SO(5). There are 6 fields ($x_{16}$ appears with multiplicity 2), hence G has 6 vertices. Similarly, in the $SO(5)_7 \subset SO(14)$ case, $Z = |x_1 + x_{81} + x_{91} + x_{231}|^2 + |x_{14} + x_{154} + x_{84} + x_{204}|^2 + 2|x_{64} + x_{256}|^2$, and the corresponding G has 12 vertices.

(2) The number of extreme vertices of G (namely, the vertices of G which connect to only one other vertex) is the same as the number of blocks of Z. In the $SO(5)_3 \subset SO(10)$ case, there are 4 blocks: $x_1 + x_{35}$, $x_{10} + x_{30}$, and $x_{16}$ with multiplicity 2. Hence G has 4 extreme vertices. In the same way, the G associated with $SO(5)_7 \subset SO(14)$ has 4 extreme vertices.
(3) The largest eigenvalue of G will be the same as that of A; moreover, denoting by exp(G) the set of eigenvalues of G, we have exp(G) ⊂ exp(A), possibly up to multiplicities. (Unlike the Hecke-algebra case as in [1], our graphs in the Birman–Wenzl algebra case are all unoriented.) G is also connected.

These conditions are very restrictive. The unique graph G corresponding to $SO(5)_3 \subset SO(10)$ is given by:

[Figure: the graph G for $SO(5)_3 \subset SO(10)$, with vertices labeled 1 to 6.]

This graph also appears as the principal graph of the reduced inclusion of $E_6$ subfactors (see [14]). The unique graph G corresponding to $SO(5)_7 \subset SO(14)$ is:

[Figure: the graph G for $SO(5)_7 \subset SO(14)$, with vertices labeled 1 to 12.]

As in [1], we introduce $V^\lambda_{ab}$ as follows:
$$V^\lambda_{ab} = \sum_\mu \phi_\lambda^{(\mu)}/\phi_1^{(\mu)} \cdot \psi_a^{(\mu)}\psi_b^{(\mu)*}.$$
Notice that $\lambda \to V^\lambda$ is a representation of the fusion rule algebra and $V^{e_1}_{ab} = G_{ab}$, where $e_1$ is the spin representation of SO(5). Since the fusion rule algebra of $\widehat{so(5)}_k$ is generated by $e_1$ and $e_1 + e_2$ (which corresponds to the vector representation), we only need to determine $V^{e_1+e_2}_{ab}$ to completely determine $V^\lambda_{ab}$. Let $V^{e_1+e_2}_{ab} = G'_{ab}$. The graph $G'$ is determined in a similar way as G, except that $G'$ may not be connected. The right choice of $G'$ for $SO(5)_3 \subset SO(10)$ is:

[Figure: the graph $G'$ for $SO(5)_3 \subset SO(10)$, with vertices labeled 1 to 6.]

The graph $G'$ corresponding to $SO(5)_7 \subset SO(14)$ is:

[Figure: the graph $G'$ for $SO(5)_7 \subset SO(14)$, with vertices labeled 1 to 12.]
We will build representations of $C_f(q^{-5}, q)$ on paths on G in the next section.

3. The Boltzmann–Weights

We will present in detail the calculation of the matrix elements $(g^{-1})_{a,b,c,d}$ in the $SO(5)_3 \subset SO(10)$ case. These elements are called Boltzmann weights as in [1]. In addition to the constraints (1), (2), (3), (4), (5) and the Markov property (7) (remember that one has to replace φ by ψ according to our notation), we impose that $(g^{-1})_{a,b,c,d}$ is invariant under the symmetry of G:

[Figure: the graph G for $SO(5)_3 \subset SO(10)$ (vertices 1 to 6) with its symmetry indicated.]
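For the double edges between 2 and 4, we label the upper part by 1 and the lower part by 2 and use $24_1$ ($24_2$) or $4_12$ ($4_22$) to refer to the upper part (lower part). A unique solution is found:
$$g_2^{-1}:\quad
\begin{array}{c|cccccc}
 & 1212 & 1232 & 124_14_12 & 124_14_22 & 124_24_12 & 124_24_22 \\ \hline
1212 & -\frac{1}{d}q^5 & \frac{1}{d}q^{-1} & b & c & c & b \\
1232 & \frac{1}{d}q^{-1} & -\frac{1}{d}q^5 & b & c & c & b \\
124_14_12 & b & b & a_1 & a_2 & a_2 & a_3 \\
124_14_22 & c & c & a_2 & a_4 & a_5 & a_2 \\
124_24_12 & c & c & a_2 & a_5 & a_4 & a_2 \\
124_24_22 & b & b & a_3 & a_2 & a_2 & a_1
\end{array}$$
$$e_2:\quad
\begin{array}{c|cccccc}
 & 1212 & 1232 & 124_14_12 & 124_14_22 & 124_24_12 & 124_24_22 \\ \hline
1212 & \frac{x}{d^2} & \frac{x}{d^2} & \frac{x}{d^{3/2}} & 0 & 0 & \frac{x}{d^{3/2}} \\
1232 & \frac{x}{d^2} & \frac{x}{d^2} & \frac{x}{d^{3/2}} & 0 & 0 & \frac{x}{d^{3/2}} \\
124_14_12 & \frac{x}{d^{3/2}} & \frac{x}{d^{3/2}} & \frac{x}{d} & 0 & 0 & \frac{x}{d} \\
124_14_22 & 0 & 0 & 0 & 0 & 0 & 0 \\
124_24_12 & 0 & 0 & 0 & 0 & 0 & 0 \\
124_24_22 & \frac{x}{d^{3/2}} & \frac{x}{d^{3/2}} & \frac{x}{d} & 0 & 0 & \frac{x}{d}
\end{array}$$
$$g_1^{-1}:\quad
\begin{array}{c|cccccc}
 & 1212 & 1232 & 124_14_12 & 124_14_22 & 124_24_12 & 124_24_22 \\ \hline
1212 & q^{-5} & 0 & 0 & 0 & 0 & 0 \\
1232 & 0 & -q & 0 & 0 & 0 & 0 \\
124_14_12 & 0 & 0 & q^{-1} & 0 & 0 & 0 \\
124_14_22 & 0 & 0 & 0 & q^{-1} & 0 & 0 \\
124_24_12 & 0 & 0 & 0 & 0 & -q & 0 \\
124_24_22 & 0 & 0 & 0 & 0 & 0 & -q
\end{array}$$
where $d = -x = 1 + \sqrt{3}$; $W = \sin\frac{\pi}{12}$; $b = d^{-1/2}\bigl(\frac{\sqrt{2}}{2} + iW\bigr)$; $c = d^{-1/2} \cdot \frac{3^{1/4}}{2}$; $q = \exp\frac{\pi i}{12}$.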
The above table gives the matrix elements of the operators $g_2^{-1}$, $e_2$ and $g_1^{-1}$ acting on paths on G starting from 1 with length 4 and endpoint 2 (the length of a path is defined to be the number of nodes of G on the path). Notice that the first two nodes of such a path are 1 and 2. For example, 1232 denotes a path of length 4 on G starting from 1, passing through 2 and 3 and ending at 2, while $124_14_22$ denotes a path of length 4 on G starting from 1, passing through 2, reaching 4 through the upper part (recall that for the double edges between 2 and 4, $24_1$ ($24_2$) or $4_12$ ($4_22$) denotes the upper part (lower part)), and ending at 2 through the lower part. The operators $g_2^{-1}$, $e_2$ and $g_1^{-1}$ preserve the first two nodes and the last node. The matrix elements of the operators $g_2^{-1}$, $e_2$ depend on the whole paths, while the matrix elements of $g_1^{-1}$ depend only on the first length 3 part of the paths. Since we impose that $(g^{-1})_{a,b,c,d}$ is invariant under the symmetry of G, the above table gives all the matrix elements of $(g^{-1})_{a,b,c,d}$.

In the rest of this section, we will explain how to determine the unknowns in the above table. It follows from the Markov property (7) that:
$$(a_1 + a_4) \cdot d + q^{-1} - q = (-q^5) \cdot d.$$
From $e_2g_2^{-1} = re_2$ we have:
$$\begin{aligned}
a_1 - a_1^* &= 2(q^{-1} - q), \\
a_4 - a_4^* &= q^{-1} - q, \\
a_5 - a_5^* &= 0, \\
a_2 - a_2^* &= 0, \\
a_3 - a_3^* &= q^{-1} - q.
\end{aligned}$$
From $g_1g_2g_1 = g_2g_1g_2$ we have:
$$\begin{aligned}
a_2 &= \frac{1}{2d}\,3^{1/4}, \\
a_1 + a_3 &= \frac{q - q^{-1}}{d} - q^5, \\
a_5 &= a_4 + q.
\end{aligned}$$
By using $g_1^* = g_1^{-1}$ we can solve all the equations; the results are:
$$\begin{aligned}
a_1 &= \frac{1}{2\sqrt{2}} - i \cdot \frac{\sqrt{3} - 1}{\sqrt{2}}, \\
a_2 &= \frac{3^{1/4}}{2(1 + \sqrt{3})}, \\
a_3 &= \frac{\sqrt{3} - 2}{2\sqrt{2}} + i \cdot \frac{1 - \sqrt{3}}{2\sqrt{2}}, \\
a_4 &= -\frac{\sqrt{3}}{4\sqrt{2}} + \frac{1 - \sqrt{3}}{2\sqrt{2}}\,i, \\
a_5 &= \frac{2 + \sqrt{3}}{4\sqrt{2}}.
\end{aligned}$$
It should be mentioned that there are more equations than unknowns, though many of them are redundant (for example those equations from $g_1g_2g_1 = g_2g_1g_2$). Hence it is really a miracle that a solution is found to all these equations. We have checked that all equations are satisfied by our solution. In addition, it is straightforward to check that our solution satisfies one more equation:
$$\sum_c g^{\pm1}_{a,c,b,c} = \sum_\lambda V^\lambda_{ab} \cdot g^{\pm1}_{0,e_1,\lambda,e_1}. \qquad (9)$$
It follows from (8) and (9) that:
$$\sum_{b,c} g^{\pm1}_{a,c,b,c}\,\frac{\psi_b^{\mu}}{\psi_a^{\mu}} = \sum_{\delta,\gamma}(g^{\pm1})_{\lambda,\delta,\gamma,\delta}\,\frac{\phi_\gamma^{(\mu)}}{\phi_\lambda^{(\mu)}} = \sum_\gamma (g^{\pm1})_{0,e_1,\gamma,e_1}\,\frac{\phi_\gamma^{(\mu)}}{\phi_0^{(\mu)}} \overset{\Delta}{=} x_\pm^{(\mu)}. \qquad (10)$$
One can also determine the Boltzmann weights corresponding to $SO(5)_7 \subset SO(14)$ in exactly the same way and check that all the equations are satisfied. However, the calculation is tedious and not very illuminating, so we omit the details. The readers are invited to do the same kind of calculations for the $SO(5)_{12} \subset E_8$ case, which seems to be more complicated.
4. Periodic Commuting Squares

Choose a vertex a of G. Let us construct $B_0 \subset B_1 \subset \dots \subset B_n \subset \dots$, the path algebras on G starting from the point a, with $B_0 = \mathbb{C}1$. Let $A_n$ be the subalgebra of $B_n$ generated by $1, g_1^{-1}, \dots, g_{n-1}^{-1}$, where the matrix elements of $g_i^{-1}$ are given as in Sect. 3 and satisfy (1), (2), (3), (4), (5), (7) and (10). The following lemma is an analogue of Lemma 3.1 in [1].

Lemma 4.1. $\sum_b Z^{(G)}_{ab}(x)\,\psi_b^{(\mu)}/\psi_a^{(\mu)} = \sum_\lambda Z^{(A)}_{1\lambda}(x)\,\phi_\lambda^{(\mu)}/\phi_1^{(\mu)}$.
Proof. By Prop. 2.2, it is enough to prove the lemma for x of the form $x = g_{i_1}^{\pm1}\cdots g_{i_s}^{\pm1}$, $0 \le i_1 < i_2 < \dots < i_s$ ($g_0 \equiv 1$). We will prove this by induction on $i_s$. If x = 1,
$$\sum_b Z^{(G)}_{ab}\,\frac{\psi_b^{(\mu)}}{\psi_a^{(\mu)}} = \sum_b \sum_{a_1=a;\,a_m=b} \langle a_1\cdots a_m|1|a_1\cdots a_m\rangle\,\psi_{a_m}^{(\mu)}/\psi_{a_1}^{(\mu)} = \dots = (\gamma^{(\mu)})^m.$$
Since $\gamma^{(\mu)}$ is the same for both A and G, we have proved the lemma for x = 1 (exactly the same as in [1]). By the same argument as in [1], one can show the following Factorization property: if we denote
$$f^{(G)}(x) = \sum_b Z^{(G)}_{ab}(x)\,\frac{\psi_b^{(\mu)}}{\psi_a^{(\mu)}}, \qquad f^{(A)}(x) = \sum_\lambda Z^{(A)}_{1\lambda}(x)\,\frac{\phi_\lambda^{(\mu)}}{\phi_1^{(\mu)}},$$
and if $x = g_{i_1}^{\pm1}\cdots g_{i_s}^{\pm1}g_{j_1}^{\pm1}\cdots g_{j_t}^{\pm1}$ with $0 \le i_1 < \dots < i_s < j_1 < \dots < j_t$ and $j_1 > i_s + 1$, then
$$f^{(G)}(g_{i_1}^{\pm1}\cdots g_{i_s}^{\pm1}g_{j_1}^{\pm1}\cdots g_{j_t}^{\pm1}) = f^{(G)}(g_{i_1}^{\pm1}\cdots g_{i_s}^{\pm1}) \cdot f^{(G)}(g_{j_1}^{\pm1}\cdots g_{j_t}^{\pm1}),$$
and the same is true if we replace $f^{(G)}$ by $f^{(A)}$.

Let us continue our induction on s. Let $x = g_i$. Then
$$\begin{aligned}
f^{(G)}(g_i) &= \sum_{a_1=a,\dots,a_m=b} \langle a_1,\dots,a_m|g_i|a_1,\dots,a_m\rangle\,\frac{\psi_{a_m}^{(\mu)}}{\psi_{a_1}^{(\mu)}} \\
&= \sum_{a_1=a,\dots,a_m=b} \langle a_1,\dots,a_{i-2}|1|a_1,\dots,a_{i-2}\rangle\,(g_i)_{a_{i-2},a_{i-1},a_i,a_{i-1}}\,\langle a_i,\dots,a_m|1|a_i,\dots,a_m\rangle\,\frac{\psi_{a_m}^{(\mu)}}{\psi_{a_1}^{(\mu)}} \\
&= \sum_{a_1=a,\dots,a_m=b} \langle a_1,\dots,a_{i-2}|1|a_1,\dots,a_{i-2}\rangle\,x^{(\mu)} \cdot (\gamma^{(\mu)})^{m-i+1} \cdot \frac{\psi_{a_{i-2}}^{(\mu)}}{\psi_{a_1}^{(\mu)}} \\
&= (\gamma^{(\mu)})^{m-1} \cdot x^{(\mu)},
\end{aligned}$$
where from the second identity to the third we use (10). Similarly one can show that $f^{(A)}(g_i) = (\gamma^{(\mu)})^{m-1} \cdot x^{(\mu)}$. Hence $f^{(G)}(g_i) = f^{(A)}(g_i)$. The identity $f^{(G)}(g_i^{-1}) = f^{(A)}(g_i^{-1})$ is proved in exactly the same way.

To proceed, notice that the structure of the algebra $A_n$ is uniquely determined by the Birman-Wenzl algebraic relations ((1) to (5)) and the Markov trace property (7). The Bratteli diagram for the inclusions $A_n \subset A_{n+1}$ is given by the graph A (see p. 424 of [2] for a similar example). Since the vertices of A are indexed by Young diagrams with at most 2 columns, it follows from [7] that P = 0.$^1$ Here P is the minimal projection associated to the node . (See [7].) The explicit form of P is given as follows ([7]):
$$P = (1 - Z_3)\bigl((qg_1 + 1)(qg_2 + 1)(qg_1 + 1) + qg_1 + 1\bigr) = 0,$$
where $Z_3$ is the central projection corresponding to the ideal $I_3 = C_2e_2C_2$. (Recall that $C_2$ is generated by $1, g_1^{\pm1}$.) Hence we have:
$$(qg_1 + 1)(qg_2 + 1)(qg_1 + 1) + qg_1 + 1 - c_2'e_2c_2'' = 0, \qquad (11)$$

$^1$ We call the following argument a “null-vector” argument.
(12)
000 where c000 2 , c2 ∈ C2 since w1 ∈ C2 . Since b is nonzero, it follows from (12) that
w1 w2 w1 ∈ C2 e2 C2 + Cw1 .
(13)
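On the other hand, let $\Delta_{1,f+2}$ be the element of $C(q^{-5}, q)$ as defined in [2] such that for $1 \le i \le f + 1$,
$$\Delta_{1,f+2}\, g_i\, \Delta_{1,f+2}^{-1} = g_{f+2-i}.$$
Notice
$$\Delta_{1,f+2}\, g_1\, \Delta_{1,f+2}^{-1} = g_{f+1}, \qquad \Delta_{1,f+2}\, g_2\, \Delta_{1,f+2}^{-1} = g_f.$$
It follows from (13) that
$$w_{f+1} w_f w_{f+1} \in \langle 1, g_{f+1}^{\pm 1}\rangle\, e_f\, \langle 1, g_{f+1}^{\pm 1}\rangle + \mathbb{C} w_{f+1}, \qquad (14)$$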
where $\langle 1, g_{f+1}^{\pm 1}\rangle$ is the subalgebra generated by $1, g_{f+1}^{\pm 1}$. Let us continue our induction on $i_s$. Suppose the lemma is true for $i_s \le f$. We need to show it is true for $i_s = f + 1$. There are two cases to consider:
(1) $i_{s-1} < f$:
$$\begin{aligned}
f^{(G)}\bigl(g_{i_1}^{\pm 1}\cdots g_{i_{s-1}}^{\pm 1} g_{f+1}^{\pm 1}\bigr) &= f^{(G)}\bigl(g_{i_1}^{\pm 1}\cdots g_{i_{s-1}}^{\pm 1}\bigr)\cdot f^{(G)}\bigl(g_{f+1}^{\pm 1}\bigr) \\
&= f^{(A)}\bigl(g_{i_1}^{\pm 1}\cdots g_{i_{s-1}}^{\pm 1}\bigr)\cdot f^{(A)}\bigl(g_{f+1}^{\pm 1}\bigr) \\
&= f^{(A)}\bigl(g_{i_1}^{\pm 1}\cdots g_{i_{s-1}}^{\pm 1}\cdot g_{f+1}^{\pm 1}\bigr),
\end{aligned}$$
where we use our induction hypothesis, the Factorization property and $f^{(G)}(g_{f+1}^{\pm 1}) = f^{(A)}(g_{f+1}^{\pm 1})$.
(2) $i_{s-1} = f$. Let $x_1 \in C_{f+1}$. Notice
$$f^{(G)}(x_1 e_{f+1}) = \frac{1}{x} f^{(G)}(x_1 e_{f+1}^2) = \frac{1}{x} f^{(G)}(e_{f+1} x_1 e_{f+1}) = \frac{1}{x} f^{(G)}(E_f(x_1) e_{f+1}).$$
Since $E_f(x_1) \in C_f$, it follows from the induction hypothesis and the Factorization property that $f^{(G)}(E_f(x_1) e_{f+1}) = f^{(A)}(E_f(x_1) e_{f+1})$. Therefore
$$f^{(G)}(x_1 e_{f+1}) = f^{(A)}(x_1 e_{f+1}) \quad \text{for } x_1 \in C_{f+1}. \qquad (15)$$
We will use $y_1$ to denote $g_{i_1}^{\pm 1}\cdots g_{i_{s-1}}^{\pm 1} \in C_f$.
F. Xu
Since $e_f g_{f+1} = e_f e_{f+1} g_f^{-1}$ (see (6) with index $i$ replaced by $f + 1$),
$$f^{(G)}(y_1 e_f g_{f+1}) = f^{(G)}(y_1 e_f e_{f+1} g_f^{-1}) = f^{(G)}(g_f^{-1} y_1 e_f e_{f+1}) = f^{(A)}(g_f^{-1} y_1 e_f e_{f+1}) = f^{(A)}(y_1 e_f g_{f+1}),$$
which follows from (15) and $g_f^{-1} y_1 e_f \in C_{f+1}$. Since $\{g_{f+1}^{\pm 1}\}$ span the same vector space as $\{e_{f+1}, g_{f+1}\}$, it follows that
$$f^{(G)}(y_1 e_f g_{f+1}^{\pm 1}) = f^{(A)}(y_1 e_f g_{f+1}^{\pm 1}). \qquad (16)$$
We are left with the $y_1 w_f g_{f+1}^{\pm 1}$ case. Since (15) holds, we are left with only one case, $y_1 w_f w_{f+1}$:
$$f^{(G)}(y_1 w_f w_{f+1}) = f^{(G)}(y_1 w_f w_{f+1}^2) = f^{(G)}(w_{f+1} y_1 w_f w_{f+1}) = f^{(G)}(y_1 w_{f+1} w_f w_{f+1}).$$
By (14), $y_1 w_{f+1} w_f w_{f+1} \in \sum_i y_1 a_i e_f b_i + \mathbb{C}\, y_1 w_{f+1}$ with $a_i, b_i \in \langle 1, g_{f+1}^{\pm 1}\rangle$.
Notice $f^{(G)}(y_1 a_i e_f b_i) = f^{(G)}(a_i y_1 e_f b_i) = f^{(G)}(y_1 e_f b_i a_i) = f^{(A)}(y_1 e_f b_i a_i) = f^{(A)}(y_1 a_i e_f b_i)$ by (16) since $b_i a_i \in \langle 1, g_{f+1}^{\pm 1}\rangle$, and $f^{(G)}(y_1 w_{f+1}) = f^{(A)}(y_1 w_{f+1})$ since $y_1 \in C_f$. It follows that $f^{(G)}(y_1 w_f w_{f+1}) = f^{(A)}(y_1 w_f w_{f+1})$. By the induction hypothesis, the lemma is proved.
Let $h$ be the longest distance on $G$ from the point $a$. It follows by exactly the same argument as Cor. 3.1 of [1] that the following holds:
Corollary 4.1. For $n \ge h$, $A_n \subset B_n$ is periodic with period 2. Moreover, the branching coefficients are given by $V^{\lambda}_{ab}$ as in Sect. 2.
By exactly the same argument as Prop. 3.1 and Sect. 4 of [1], we have the following:
Corollary 4.2. For $n \ge h$,
$$\begin{array}{ccc} A_{n+1} & \subset & B_{n+1} \\ \cup & & \cup \\ A_n & \subset & B_n \end{array}$$
is a periodic commuting square in the sense of [6] with Markov trace $\mathrm{Tr}^{(1)}$ of Sect. 2. Moreover, the principal graph for the corresponding subfactors $A \subset B$ is given by the Bratteli diagram for $A_{2m} \subset B_{2m}$ with $2m \ge h$.
The subfactors (associated with $SO(5)_3 \subset SO(10)$) having index values $3 + \sqrt{3}$, $18 + 10\sqrt{3}$ seem to be the same as some of the original GHJ subfactors associated to $E_6$. The smallest index of the subfactor associated with $SO(5)_7 \subset SO(14)$ is about $28.35\cdots$.
As in [1], for each $\lambda \in P_{++}^{(k)}$, there is a polynomial $P_\lambda(x, y)$ such that $V^{\lambda}_{ab} = P_\lambda(G, G')$ with $G, G'$ the adjacency matrices of the graphs $G, G'$ described in Sect. 2. It
follows from Cor. 4.2 and Cor. 4.1 that the principal graphs of the subfactors are completely determined. In the following we give an example of calculating the principal graph of the subfactor (associated with $SO(5)_3 \subset SO(10)$) with initial point 1 in the construction of $B$ as in Cor. 4.2. This subfactor has index value $3 + \sqrt{3}$, and its principal graph was determined by U. Haagerup by different methods. First notice that the "regular" graph $A$ as in Sect. 2 is given by:
[Figure: graph $A$ with vertices (0, 0), (1, 0), (1, 1), (2, 0), (2, 1), (2, 2), (3, 0), (3, 1), (3, 2), (3, 3)]
The spin representation of $SO(5)$ corresponds to (1, 0) in $A$, and the vector representation of $SO(5)$ corresponds to (1, 1) in $A$. Since $A$ describes the fusion of $\lambda \in P_{++}^{(k)}$ with the spin representation of $SO(5)$, which corresponds to (1, 0) in $A$, we can determine $V^{\lambda}$ from $G = V^{(1,0)}$ and $G' = V^{(1,1)}$ as follows:
$$\begin{aligned}
V^{(2,0)} &= V^{(1,0)} \times V^{(1,0)} - 1 - V^{(1,1)}, \\
V^{(2,1)} &= V^{(1,0)} \times V^{(1,1)} - V^{(1,0)}, \\
V^{(3,0)} &= V^{(2,0)} \times V^{(1,0)} - V^{(2,1)} - V^{(1,0)}, \\
V^{(3,1)} &= V^{(3,0)} \times V^{(1,0)} - V^{(2,0)}, \\
V^{(2,2)} &= V^{(1,0)} \times V^{(2,1)} - V^{(1,1)} - V^{(2,0)} - V^{(3,1)}, \\
V^{(3,2)} &= V^{(2,2)} \times V^{(1,0)} - V^{(2,1)}, \\
V^{(3,3)} &= V^{(3,2)} \times V^{(1,0)} - V^{(2,2)} - V^{(3,1)}.
\end{aligned}$$
Here 1 in the first formula above stands for the identity matrix. Since $G = V^{(1,0)}$ and $G' = V^{(1,1)}$ are given as in Sect. 2, the above formulas determine $V^{\lambda}$ completely. Together with Cor. 4.1 and Cor. 4.2, the principal graph is found to be:
[Figure: principal graph; vertex labels (0, 0), (3, 3), (3, 1), (1, 1), (2, 2), (2, 0) and 1, 4, 3]
as one expected. The structure of bimodules of this subfactor is determined in [14]. It is an interesting question to see if one can determine such a structure from our approach.
5. Level-Rank Duality
In the Hecke-algebra case, level-rank duality says that the Hecke algebras associated with $\widehat{SL(n)}_m$ and $\widehat{SL(m)}_n$ are isomorphic (see [13] for the precise statement). There are similar statements for other Lie groups as well. In [3], the following graphs corresponding to the conformal embedding $SL(5)_3 \subset SL(10)$ are constructed:
[Figure: graph with vertices labelled 1 to 20]
It is interesting to see that $G_1$ is associated to the graph $E^{(8)}$ ([1]) corresponding to $SL(3)_5 \subset SL(6)$ in the following way: the trivial (meaning no fixed point under the action of the symmetry) $\mathbb{Z}_5$ quotient of $G_1$ is the same as the trivial $\mathbb{Z}_3$ quotient of $E^{(8)}$. In fact, both $G_1/\mathbb{Z}_5$ and $E^{(8)}/\mathbb{Z}_3$ are given by:
[Figure: quotient graph with vertices labelled 1 to 4]
Hence it is straightforward to build Boltzmann weights associated with $G_1$ from those of $E^{(8)}$. (See also [1].) By an argument similar to that in [1] one can construct subfactors from $G_1$. It is tempting to conjecture that the subfactors associated with $SL(5)_3 \subset SL(10)$ should be related to those of $SL(3)_5 \subset SL(6)$ in a simple way. More precisely, they should be either the same or dual to each other. In general, subfactors associated with $SL(n)_{n-2} \subset SL\bigl(\tfrac{n(n-1)}{2}\bigr)$ should be related to those associated with $SL(n-2)_n \subset SL\bigl(\tfrac{(n-2)(n-1)}{2}\bigr)$ ($n \ge 5$) in a similar fashion as the above case. Level-rank duality seems to play a role.
Based on level-rank duality, we conjecture that the subfactors coming from conformal embeddings $SL(n)_m \oplus SL(m)_n \subset SL(mn)$ should be the same as the central sequence subfactors ([10]) of Wenzl's Hecke algebra subfactors. In fact, following the spirit of [1], we should look for representations of algebras $A$ associated with $SL(n)_m \oplus SL(m)_n$. More precisely, let $A_{n,m}$ ($A_{m,n}$) be the Hecke algebras associated with $SL(n)_m$ ($SL(m)_n$); then $A_{n,m}$, $A_{m,n}$ should be subalgebras of $A$ which commute with each other. Since $A_{n,m}$, $A_{m,n}$ are related by level-rank duality as in [13], a natural choice of $A$ will be $A_k = \{g_{-k}, \cdots, g_{-1}, 1, g_1, \cdots, g_k\}$ with $\{g_{-k}, \cdots, g_{-1}, 1\} = A_{n,m;k}$, $\{1, g_1, \cdots, g_k\} = A_{m,n;k}$. In fact it is easy to check that $\{g_{-k}, \cdots, g_{-1}, 1\}$ is related to $\{1, g_1, \cdots, g_k\}$ exactly as in the level-rank duality of [13]. The natural periodic commuting squares we are searching for should be:
$$\begin{array}{ccl} A_{k+1} & \subset & \{g_{-k-1}, \cdots, g_{-1}, g_0, 1, g_1, \cdots, g_{k+1}\} \stackrel{\Delta}{=} B_{k+1} \\ \cup & & \cup \\ A_k & \subset & \{g_{-k}, \cdots, g_{-1}, g_0, 1, g_1, \cdots, g_k\} \stackrel{\Delta}{=} B_k. \end{array}$$
It is easy to check that the Bratteli diagram for $B_k \subset B_{k+1}$ satisfies our conditions for graphs associated with $SL(n)_m \oplus SL(m)_n \subset SL(nm)$ as in [1]. The periodic commuting squares above give rise to central sequence subfactors of Wenzl's Hecke algebra subfactors (see [10]).
6. Conclusions and Questions
We have demonstrated how to construct some finite-depth subfactors by studying exceptional representations of Birman-Wenzl algebras. The principal graphs of these subfactors are determined completely by a recursive formula as in [1]. Some of these subfactors are conjectured to be the same as the subfactors coming from the conformal embeddings $SO(5)_3 \subset SO(10)$, $SO(5)_7 \subset SO(14)$. The structure theory of Birman-Wenzl algebras plays a fundamental role in our construction (see the "null vector" argument in Sect. 4). It will be interesting to extend the "null vector" argument to other cases. (However, the expression for the null vector gets more complicated as the rank of the group grows.) The empirical relations between the graphs $G$ and conformal embeddings should be better understood. These graphs not only help us to determine the principal graphs of subfactors conjecturally associated with conformal embeddings, they are also closely related to the principal graphs of subfactors which can be viewed as "quotients of Wenzl's subfactors" by the symmetry provided by the conformal embeddings. (See [14].) For example, the graphs ($G$) of Sect. 1 are conjectured to be the principal graphs of some subfactors. The representation of the fusion rule algebras $\lambda \mapsto V^{\lambda}$ of Sect. 1 (also see [1]) seems to be quite interesting. Notice that the $V^{\lambda}_{ab}$ are non-negative integers and contain all the information about the principal graphs of some subfactors. It is interesting to study all possible such representations. It is also interesting to prove the conjectures stated in Sect. 5. We hope to consider these questions and related questions in the future.
Acknowledgement. I'd like to thank the referee for some useful suggestions.
References
1. Xu, F.: Generalized Goodman-Harpe-Jones construction of subfactors, I. Commun. Math. Phys.
2. Wenzl, H.: Quantum groups and subfactors of type B, C, and D. Commun. Math. Phys. 133, 383–432 (1990)
3. Petkova, V. B. and Zuber, J.-B.: From CFT to graphs. hep-th/9510198
4. Goddard, P., Nahm, W. and Olive, D.: Symmetric spaces, Sugawara's energy momentum tensor in two dimensions and free fermions. Phys. Lett. 160B, 111–116 (1985)
5. Wassermann, A.: Subfactors from infinite dimensional representations of Loop groups. To appear
6. Goodman, F. M., de la Harpe, P. and Jones, V.: Towers of algebras and Coxeter graphs. MSRI publications, no. 14
7. Ram, A. and Wenzl, H.: Matrix units for centralizer algebra. Journal of Algebra 145, 378–395 (1992)
8. Jimbo, M., Miwa, T. and Okado, M.: Solvable lattice models related to the vector representations of classical simple Lie algebras. Commun. Math. Phys. 116, 507–525 (1988)
9. Saleur, H.: Level-rank duality. Nucl. Phys. B 363, 177–192 (1992)
10. Ocneanu, A.: Quantum symmetry, differential geometry of finite graphs and classification of subfactors. University of Tokyo Seminary Notes 45, recorded by Y. Kawahigashi, 1991
11. Kawahigashi, Y.: Central sequences and asymptotic inclusions. To appear
12. Kac, V.: Infinite dimensional Lie algebras. 3rd edition, 1990
13. Goodman, F. M. and Wenzl, H.: Littlewood Richardson coefficients for Hecke algebras at roots of unity. Adv. Math., 1990
14. Kawahigashi, Y.: Classification of paragroup actions on subfactors. Preprint 1993. To appear
Communicated by H. Araki
Commun. Math. Phys. 184, 509–531 (1997)
Communications in
Mathematical Physics © Springer-Verlag 1997
Models of Local Relativistic Quantum Fields with Indefinite Metric (in All Dimensions)
S. Albeverio¹,²,³, H. Gottschalk¹, J.-L. Wu¹,²,⁴
¹ Fakultät und Institut für Mathematik der Ruhr-Universität Bochum, D-44780 Bochum, Germany
² SFB237 Essen-Bochum-Düsseldorf, Germany
³ BiBoS Research Centre, Bielefeld-Bochum, Germany; and CERFIM, Locarno, Switzerland
⁴ Probability Laboratory, Institute of Applied Mathematics, Academia Sinica, Beijing 100080, P.R. China
Received: 25 April 1996 / Accepted: 29 July 1996
Abstract: A condition on a set of truncated Wightman functions is formulated and shown to permit the construction of the Hilbert space structure included in the Morchio– Strocchi modified Wightman axioms. The truncated Wightman functions which are obtained by analytic continuation of the (truncated) Schwinger functions of Euclidean scalar random fields and covariant vector (quaternionic) random fields constructed via convoluted generalized white noise, are then shown to satisfy this condition. As a consequence such random fields provide relativistic models for indefinite metric quantum field theory, in dimension 4 (vector case), respectively in all dimensions (scalar case).
Introduction Since the appearance of gauge theories, it became natural to consider (local) quantum field theory (abbr. QFT) in which not all of the Wightman axioms are satisfied. Such a consideration was in particular natural and also necessary for the study of “charged" fields interacting with gauge fields, because their description conflicts either with locality or with positivity (positive definiteness of the set of Wightman functions [12, 18] and [7]). The physical reason for this is that in such theories one must use observables of the charged type which obey a Gauss’ law (see e.g. Morchio and Strocchi [13]), instead of using the usual local observables. Actually, from the study of fields such as, e.g., α-gauge type Higgs models which do not satisfy positivity (see e.g. [11] and references therein), it turned out that it is better in general to keep the locality condition and to give up the positivity condition. This leads to the so-called “modified Wightman axioms" of the indefinite metric QFT (see [19]). The difference between indefinite metric QFT and standard (i.e. positive metric) QFT is that the axiom of positivity in the latter is replaced by the so-called “Hilbert space structure condition (HSSC)" in the former which permits the construction of Hilbert spaces associated to the given collection of Wightman functions.
In recent years models of Euclidean random fields of scalar and vector type have been constructed via convolution from generalized white noise, see e.g. [1–4] and references therein. Furthermore, by analytic continuation, one can get Wightman functions from the Schwinger functions of such Euclidean models. The corresponding Wightman functions satisfy the relativistic postulates on invariance, spectral property, locality and cluster property. The positivity condition does not hold, in general, for the Wightman functions, in fact in [2] (see also [1]) a counterexample was given to show that the reflection positivity does not hold for the associated Schwinger functions, if the non-Gaussian component in the generalized white noise is sufficiently strong. Hence it is very interesting to see whether the Wightman functions of such models satisfy the modified Wightman axioms for indefinite QFT’s. The aim of this paper is to prove that the Wightman functions associated with the above mentioned Euclidean models indeed satisfy the Hilbert space structure condition. Such Euclidean fields provide thus the first known (non trivial) local relativistic models for (indefinite metric) QFT’s. The technique required to achieve this is based on explicit formulae for the truncated Wightman functions. The paper is organized as follows. In Sect. 1, we introduce majorant Hilbert topologies, a necessary and sufficient condition called Hilbert space structure condition for the existence of a majorant Hilbert topology, and modified Wightman axioms. In Sect. 2, we present a sufficient Hilbert space structure condition for truncated Wightman functions which implies the Hilbert space structure condition for Wightman functions. In Sect. 3, we introduce both scalar and vector Euclidean random fields as convoluted generalized white noise. We give explicit formulae for their truncated Wightman functions (we remark that the formulae for the vector models we give in Sect. 
3 are written in a way which is different, although equivalent, from the one used in [3]). We derive them by following the procedure for the scalar models in [2], which makes it possible to prove the temperedness of the truncated Wightman functions (which is a point left open in [3]). Sect. 4 is devoted to the verification of the Hilbert space structure condition for the models introduced in Sect. 3.
1. Majorant Hilbert Topologies and Modified Wightman Axioms
In this section, we introduce a majorant Hilbert topology structure associated with Wightman functions. For an extensive mathematical account of such topologies as well as of indefinite inner product spaces, we refer to the monograph of Bognár [6]. Here we follow the presentation of [19] and [13].
Let $d \in \mathbb{N}$ be a fixed space-time dimension and $q \in \mathbb{N}$ be a fixed number. For any $n \in \mathbb{N}$, let us denote by $S(\mathbb{R}^{dn}, \mathbb{C}^{q^n})$ the Schwartz space of all rapidly decreasing $\mathbb{C}^{q^n}$-valued $C^\infty$-functions on $\mathbb{R}^{dn}$ with the Schwartz topology. Let $S'(\mathbb{R}^{dn}, \mathbb{C}^{q^n})$ denote its topological dual. Let us begin by introducing the following axioms for Wightman functions $\{W_n\}_{n \in \mathbb{N}_0}$ (with $W_0 = 1$ for simplicity):
Axiom I (Temperedness). For any $n \in \mathbb{N}$, the $n$-point function $W_n(x_1, \cdots, x_n)$, $x_1, \cdots, x_n \in \mathbb{R}^d$, is a tempered distribution, i.e., $W_n \in S'(\mathbb{R}^{dn}, \mathbb{C}^{q^n})$.
Axiom II (Poincaré invariance). There is a representation $T$ of the proper, orthochronous Lorentz group $L_+^\uparrow(\mathbb{R}^d)$ (which can be assumed to be irreducible) acting on $\mathbb{R}^q$, such that for any $n \in \mathbb{N}$ and any Poincaré transformation $\{a, \Lambda\} \in P_+^\uparrow(\mathbb{R}^d)$, the $n$-point function $W_n(x_1, \cdots, x_n)$ is invariant under $\{a, \Lambda\}$:
$$T(\Lambda)^{\otimes n}\, W_n(\Lambda^{-1}(x_1 - a), \cdots, \Lambda^{-1}(x_n - a)) = W_n(x_1, \cdots, x_n),$$
which should be understood component-wise as follows:
$$W_n^{j_1, \cdots, j_n}(x_1, \cdots, x_n) = \sum_{l_1, \cdots, l_n = 1}^{q} T(\Lambda)^{j_1}_{l_1} \cdots T(\Lambda)^{j_n}_{l_n}\; W_n^{l_1, \cdots, l_n}(\Lambda^{-1}(x_1 - a), \cdots, \Lambda^{-1}(x_n - a)).$$
We remark that by Axiom II, every $W_n$ is actually a distribution in the difference variables, i.e. there is a tempered distribution $w_n \in S'(\mathbb{R}^{d(n-1)}, \mathbb{C}^{q^n})$ defined as $w_n(y_1, \cdots, y_{n-1}) := W_n(x_1, \cdots, x_n)$, where $y_j := x_j - x_{j+1}$, $1 \le j \le n - 1$. For $r \in \mathbb{N}$ we adopt the conventions in Vol. II of [16] for the (component-wise) definition of the Fourier transform $\hat{\ }$ on $S(\mathbb{R}^{dn}, \mathbb{C}^r)$ and $S'(\mathbb{R}^{dn}, \mathbb{C}^r)$, respectively.
Axiom III (Spectral condition). For any $n \in \mathbb{N}$, the Fourier transform $\hat{w}_n(q_1, \cdots, q_{n-1})$ is supported in the backward cones $\{(q_1, \cdots, q_{n-1}) \in \mathbb{R}^{d(n-1)} : q_j^2 \ge 0,\ q_j^0 < 0,\ 1 \le j \le n - 1\}$, where $q_j = (q_j^0, \vec{q}_j) \in \mathbb{R} \times \mathbb{R}^{d-1}$, and $q_j^2 := |q_j^0|^2 - |\vec{q}_j|^2$ is in Minkowski metric. (A different sign convention on the Fourier transform in most of the physical literature leads to the interchange of forward and backward cones.)
Axiom IV (Locality). For $n \ge 2$, if $(x_{j+1} - x_j)^2 < 0$ for some $j \in \{1, \cdots, n - 1\}$, then
$$W_n(x_1, \cdots, x_j, x_{j+1}, \cdots, x_n) = \pm\, t_{(j,j+1)}\, W_n(x_1, \cdots, x_{j+1}, x_j, \cdots, x_n).$$
Here $+$ corresponds to integer spin of $T$, whereas $-$ corresponds to half-integer spin [18]. $t_{(j,j+1)}$ acts on $W_n = (W_n^{l_1, \ldots, l_n})_{l_1, \ldots, l_n = 1, \ldots, q}$ by transposing the indices $l_j$ and $l_{j+1}$.
Let $\underline{S}$ be the Borchers algebra over $S(\mathbb{R}^d, \mathbb{C}^q)$, namely,
$$\underline{S} := \{F = (f_0, f_1, \cdots) : f_0 \in \mathbb{C},\ f_n \in S(\mathbb{R}^{dn}, \mathbb{C}^{q^n}),\ n \in \mathbb{N}\}$$
with addition and multiplication given as follows:
$$F + G = (f_0 + g_0, f_1 + g_1, \cdots), \qquad F \otimes G = ((F \otimes G)_0, (F \otimes G)_1, \cdots),$$
where $(F \otimes G)_n := \sum_{j+l=n} f_j \otimes g_l$, $n = 0, 1, 2, \cdots$. The topology on $\underline{S}$ is the direct sum topology induced by the Schwartz topology of $S(\mathbb{R}^d, \mathbb{C}^q)$. Setting
$$W(F) := \sum_{n=0}^{\infty} W_n(f_n),$$
where $W_0(f_0) = 1 \cdot f_0$ is the product of the complex numbers 1 and $f_0$, then $W$ is a linear functional on $\underline{S}$, called the Wightman functional. Furthermore, for $F = (f_0, f_1, \cdots) \in \underline{S}$, we define its involution by $F^* = (f_0^*, f_1^*, \cdots)$, where $f_n^*(x_1, \cdots, x_n) := \overleftarrow{r}\, \overline{f_n(x_n, \cdots, x_1)}$,
where $\overleftarrow{r}$ acts on $f = (f^{l_1, \ldots, l_n})_{(l_1, \ldots, l_n) \in \{1, \ldots, q\}^n}$ by reversing the order of the indices and the bar denotes complex conjugation. Then $W$ determines a sesquilinear form on $\underline{S}$ as follows:
$$\langle F, G \rangle_W := W(F^* \otimes G), \quad F, G \in \underline{S}. \qquad (1)$$
Clearly, $\langle \cdot, \cdot \rangle_W$ is hermitian if the Wightman functions $W_n$, $n \in \mathbb{N}$, satisfy the hermiticity condition
$$W_n(x_1, \cdots, x_n) = \overleftarrow{r}\, \overline{W_n(x_n, \cdots, x_1)}, \quad n \in \mathbb{N}.$$
Hereafter we assume this condition for simplicity. Now set $N_W := \{F \in \underline{S} : \langle F, G \rangle_W = 0,\ \forall G \in \underline{S}\}$, which is the kernel of $\langle \cdot, \cdot \rangle_W$; then the quotient space $D := \underline{S}/N_W$ is well defined as an indefinite inner product space (cf. [6] for this notion) with respect to the indefinite inner product induced by $\langle \cdot, \cdot \rangle_W$ (we denote the induced product by the same notation). In general, $(D, \langle \cdot, \cdot \rangle_W)$ can not be a pre-Hilbert space. However, we may specify some Hilbert inner product which dominates $\langle \cdot, \cdot \rangle_W$. To this end, we introduce the following notion:
Definition 1.1. By a majorant Hilbert topology $\tau$ of $\langle \cdot, \cdot \rangle_W$ on $D$ we mean a topology determined by a Hilbert inner product $(\cdot, \cdot)$ on $D$ such that
$$|\langle F, G \rangle_W| \le (F, F)^{\frac{1}{2}} (G, G)^{\frac{1}{2}}, \quad F, G \in D. \qquad (2)$$
Remark 1.2. An important property of a majorant Hilbert topology $\tau$ is that from (2), we have
$$F^{(n)} \xrightarrow{\ \tau\ } F \implies \langle F^{(m)}, F^{(n)} \rangle_W \to \langle F, F \rangle_W.$$
Namely, the topology $\tau$ is strong enough for $\tau$-convergence to imply convergence of all the corresponding Wightman functions with respect to the inner product $\langle \cdot, \cdot \rangle_W$.
From Definition 1.1, $(D, (\cdot, \cdot))$ is a pre-Hilbert space. Setting $H := \overline{D}^{(\cdot,\cdot)}$, then $(H, (\cdot, \cdot))$ is a Hilbert space. By the known Riesz theorem, (2) implies that there exists a bounded self-adjoint operator $T$ on $H$, hereafter called the metric operator corresponding to $(\cdot, \cdot)$, such that
$$\langle F, G \rangle_W = (F, TG), \quad F, G \in H.$$
Moreover, such an operator can be chosen to be non-degenerate, i.e., $T$ fulfills $H_T = \{0\}$, where $H_T := \{F \in H : (F, TG) = 0,\ \forall G \in H\}$ is a Hilbert subspace of $H$. Actually, suppose that $H_T \ne \{0\}$. We remark that $T(H_T) = \{0\}$, thus the following Hilbert inner product:
$$(F, G)_1 := (F, (1 - P_T) G)$$
also determines a majorant Hilbert topology $\tau_1$ of $\langle \cdot, \cdot \rangle_W$ on $D$, where $P_T : H \to H_T$ is the projection. Clearly, the metric operator $T_1 := (1 - P_T)^{-1} T$ is non-degenerate (corresponding to $(\cdot, \cdot)_1$). We call such a $\tau_1$ a non-degenerate majorant Hilbert topology. In addition, such a procedure of removing the degeneracy of metric operators also removes the nontrivial ideals of the Borchers algebra $\underline{S}$ arising from properties like locality and spectral conditions of Wightman functions.
On the other hand, we can (well) define a field operator (i.e., an operator valued distribution) $\phi(f)$ on the dense domain $D \subset H$ for any $f \in S(\mathbb{R}^d, \mathbb{C}^q)$ as follows:
$$(\phi(f))(G) := F_f \otimes G + N_W, \quad G \in D,$$
where $F_f := (0, f, 0, \cdots) \in \underline{S}$, with the property that
$$\begin{aligned}
W_n(x_1, \cdots, x_n) &= (\Omega, T \phi(x_1) \cdots \phi(x_n) \Omega) \\
&= (\phi(x_j) \cdots \phi(x_1) \Omega, T \phi(x_{j+1}) \cdots \phi(x_n) \Omega) \\
&= \langle \phi(x_j) \cdots \phi(x_1) \Omega, \phi(x_{j+1}) \cdots \phi(x_n) \Omega \rangle_W,
\end{aligned} \qquad (3)$$
where $\Omega := (1, 0, \cdots) + N_W$. Clearly, $\langle \Omega, \Omega \rangle_W > 0$. Since $D$ by definition of $\phi$ is a multiplication core for the field $\phi$, products of field operators $\phi(f)\phi(g)$, $f, g \in S(\mathbb{R}^d, \mathbb{C}^q)$, are well-defined on $D$. By Axiom IV and Eq. (3) the field operators $\phi(f)$ are local in the sense that $[\phi(f), \phi(g)]_\mp = 0$ if the supports of the test functions $f, g \in S(\mathbb{R}^d, \mathbb{C}^q)$ are space-like separated. Here $[\cdot, \cdot]_\mp$ stands for the commutator if the spin of $T$ is integer and for the anticommutator otherwise (cf. Axiom IV). By Eq. (3) and the hermiticity condition, we conclude that the field operator $\phi$ is $T$-symmetric in the sense that for $f \in S(\mathbb{R}^d, \mathbb{C}^q)$,
$$T \phi(f)^* T^{-1} = \phi(f).$$
Furthermore, from the action of $P_+^\uparrow$ (via $T$) on the test function spaces $S(\mathbb{R}^{dn}, \mathbb{C}^{q^n})$ we get a representation $U$ of the proper orthochronous Poincaré group $P_+^\uparrow(\mathbb{R}^d)$ by $T$-unitary operators defined on the common dense domain $D$, where, by definition, an operator $U(a, \Lambda)$ on $H$ is called $T$-unitary, if
$$T\, U(a, \Lambda)^* T^{-1} = U(a, \Lambda)^{-1}.$$
The field $\phi$ transforms under $U$ as
$$U(a, \Lambda) \phi(x) U(a, \Lambda)^{-1} = T(\Lambda) \phi(\Lambda^{-1}(x - a)). \qquad (4)$$
Furthermore, the spectral condition in Axiom III is equivalent to the following condition:
$$\int_{\mathbb{R}^d} (F, T\, U(a, 1) G)\, e^{iqa}\, da = 0, \quad F, G \in D, \qquad (5)$$
if $q \notin \{q \in \mathbb{R}^d : q^2 \ge 0,\ q^0 < 0\}$. Now we define a "Krein topology" for $D$ as follows:
Definition 1.3. A non-degenerate majorant Hilbert topology $\tau$ on $D$ is called a Krein topology if $H := \overline{D}^{\tau}$ is maximal. Namely, if $\tau_1$ is another non-degenerate majorant Hilbert topology on $D$ such that $\overline{D}^{\tau_1} \supset H$, then $\tau = \tau_1$.
From Definitions 1.1 and 1.3, it is clear that a Krein topology is a minimal topology to provide maximal information from Wightman functions and to keep the density of $D$ in $H$. Moreover, we have the following result from [13]:
Proposition 1.4. (Morchio and Strocchi). A majorant Hilbert topology $\tau$ is a Krein topology iff the corresponding metric operator $T$ has a bounded inverse $T^{-1}$. Furthermore, such a bounded invertible operator can be chosen with the property that $T^2 = 1$.
Given a majorant Hilbert topology $\tau$ with the non-degenerate metric operator $T$ on the Hilbert space $(H, (\cdot, \cdot))$, one can always find a corresponding Krein topology associated with it. To this end, by Proposition 1.4, it is sufficient to find a metric operator with bounded inverse. In fact, remarking that $T$ is self-adjoint and bounded, the absolute operator $|T|$ is well defined. Furthermore,
$$(F, G)_K := (F, |T| G), \quad F, G \in H$$
determines a new Hilbert inner product whose induced Hilbert topology $\tau_K$ is weaker than $\tau$ since
$$(F, F)_K = (F, |T| F) \le \|T\| (F, F), \quad F \in H.$$
On the other hand, we have
$$\langle F, G \rangle_W = (F, TG) = (F, (\operatorname{sign} T) G)_K =: (F, T_K G)_K, \quad F, G \in H.$$
Obviously, $T_K^{-1} = T_K = \operatorname{sign} T$ is bounded.
Concerning the existence of a majorant Hilbert topology, we have the following crucial condition from [13] and [19]:
Theorem 1.5. (Morchio and Strocchi). Given a collection of Wightman functions $\{W_n\}_{n \in \mathbb{N}_0}$, a necessary and sufficient condition for the existence of a majorant Hilbert topology is that the following holds:
Axiom V. There is a sequence $\{p_n\}_{n \in \mathbb{N}}$, where $\forall n \in \mathbb{N}$, $p_n : S(\mathbb{R}^{dn}, \mathbb{C}^{q^n}) \to [0, \infty)$ is a Hilbert seminorm, such that
$$|W_{m+n}(\varphi^* \otimes \eta)| \le p_m(\varphi)\, p_n(\eta) \qquad (6)$$
for all $\varphi \in S(\mathbb{R}^{dm}, \mathbb{C}^{q^m})$, $\eta \in S(\mathbb{R}^{dn}, \mathbb{C}^{q^n})$, $m, n \in \mathbb{N}$.
Axiom V is called the Hilbert space structure condition. It is a new axiom for Wightman functions replacing the positivity condition in the standard QFT. Axioms I–IV together with Axiom V are called modified Wightman axioms. Such axioms, especially the Hilbert space structure condition, were presented and lucidly discussed in [19]. We remark that, in general, when the Wightman functions do not fulfill the positivity condition, one can not expect a unique Hilbert space structure for the states of the theory (the Hilbert space structure depending on the choice of Hilbert seminorms in Axiom V). This is at variance with non indefinite metric QFT, where the positivity condition guarantees the uniqueness of the physical Hilbert space. Uniqueness for indefinite metric QFT can perhaps be restored in terms of scattering theory, see Remark 1.6 below.
Lastly, let us also present the cluster property for Wightman functions. A sequence of Wightman functions $\{W_n\}_{n \in \mathbb{N}_0}$ satisfies the cluster property if for any $m, n \in \mathbb{N}$ and any space-like $a \in \mathbb{R}^d$ (i.e., $a^2 < 0$ in Minkowski metric)
$$W_{m+n}\bigl(\varphi_1 \otimes \cdots \otimes \varphi_m \otimes T_{\lambda a}(\varphi_{m+1} \otimes \cdots \otimes \varphi_{m+n})\bigr) \xrightarrow{\ \lambda \to \infty\ } W_m(\varphi_1 \otimes \cdots \otimes \varphi_m)\, W_n(\varphi_{m+1} \otimes \cdots \otimes \varphi_{m+n})$$
for $\varphi_1, \cdots, \varphi_{m+n} \in S(\mathbb{R}^d, \mathbb{C}^q)$, where $T_{\lambda a}$ denotes the representation of the translation by $\lambda a$ on $S(\mathbb{R}^{dn}, \mathbb{C}^{q^n})$.
Remark 1.6. We point out that the cluster property of Wightman functions is not an item of the modified Wightman axioms, since it does not (directly) imply the uniqueness of the vacuum and irreducibility of the field algebra as it does in the standard QFT. Nevertheless, the cluster property can still be looked upon as a genuine expression for the physical principle "forces decrease with the (spatial) distance" in indefinite metric QFT. Especially we expect that also in indefinite metric QFT there is a crucial connection between the cluster property and the possibility of an axiomatic scattering theory in such quantum field theories.
2. A Sufficient Hilbert Space Structure Condition for Truncated Wightman Functions
Given a sequence of Wightman functions $\{W_n\}_{n \in \mathbb{N}_0}$, $W_0 = 1$, $W_n \in S'(\mathbb{R}^{dn}, \mathbb{C}^{q^n})$, the corresponding sequence of truncated Wightman functions $\{W_n^T\}_{n \in \mathbb{N}}$, $W_n^T \in S'(\mathbb{R}^{dn}, \mathbb{C}^{q^n})$, is defined recursively by the equations
$$W_n(\varphi_1 \otimes \cdots \otimes \varphi_n) = \sum_{I \in P^{(n)}} F(I) \prod_{\{j_1, \cdots, j_l\} \in I} W_l^T(\varphi_{j_1} \otimes \cdots \otimes \varphi_{j_l}), \quad n \ge 1, \qquad (7)$$
where $\varphi_1, \cdots, \varphi_n \in S(\mathbb{R}^d, \mathbb{C}^q)$ and $P^{(n)}$ stands for the collection of all partitions $I$ of $\{1, \cdots, n\}$ into disjoint subsets. For each such subset $\{j_1, \cdots, j_l\} \in I$ we assume that $j_1 < \cdots < j_l$. $F(I)$ stands for the fermionic parity of the partition $I$, i.e. $F(I) := 1$ for (bosonic) integer spin $T$ and
$$F(I) := \prod_{j < l} \operatorname{sign}(\pi_I(l) - \pi_I(j))$$
for (fermionic) half-integer spin $T$. For $I = \{\{j_1^1, \ldots, j_{l_1}^1\}, \ldots, \{j_1^k, \ldots, j_{l_k}^k\}\}$ with $j_1^1 < \ldots < j_1^k$, $\pi_I$ is defined as the permutation which maps $(1, \ldots, n)$ to $(j_1^1, \ldots, j_{l_1}^1, \ldots, j_1^k, \ldots, j_{l_k}^k)$. By the nuclear theorem the sequence of truncated Wightman functions is determined uniquely by the sequence of Wightman functions and vice versa.
Since the truncated Wightman functions of the models introduced in Sect. 3 below are much simpler objects than the Wightman distributions themselves, it seems natural to ask for a sufficient condition on the truncated Wightman functions which implies the Hilbert space structure condition (HSSC) for the Wightman functions, as it was introduced in Sect. 1. The aim of this section is to deduce such a HSSC for truncated Wightman functions which is then verified in Sect. 4 for both models of Sect. 3.
Let us first introduce a special system of Schwartz norms $\{\|\cdot\|_{K,N}\}_{K,N \in \mathbb{N}_0}$ on the spaces $S(\mathbb{R}^{dn}, \mathbb{C}^{q^n})$, $n \in \mathbb{N}$, by
$$\|\varphi\|_{K,N} := \sup_{\substack{x_1, \cdots, x_n \in \mathbb{R}^d \\ 0 \le |\alpha_1|, \cdots, |\alpha_n| \le K}} \Bigl| \prod_{l=1}^{n} (1 + |x_l|^2)^{N/2}\, D^{\alpha_1 \cdots \alpha_n} \varphi(x_1, \cdots, x_n) \Bigr|,$$
for $K, N \in \mathbb{N}_0$ and $\varphi \in S(\mathbb{R}^{dn}, \mathbb{C}^{q^n})$. Here the absolute value $|\cdot|$ is induced by the scalar product $\langle \cdot, \cdot \rangle_E^{\otimes n}$ on $\mathbb{C}^{q^n} \cong (\mathbb{C}^q)^{\otimes n}$, where $\langle \cdot, \cdot \rangle_E$ stands for the Euclidean scalar product on $\mathbb{C}^q$, $\alpha_1, \cdots, \alpha_n \in \mathbb{N}_0^d$ are multi-indices and for $\alpha_j = (\alpha_j^0, \cdots, \alpha_j^{d-1})$ we have used the notations $|\alpha_j| = \alpha_j^0 + \cdots + \alpha_j^{d-1}$ and
$$D^{\alpha_1 \cdots \alpha_n} := D^{\alpha_1} \otimes \cdots \otimes D^{\alpha_n}, \quad \text{where } D^{\alpha_j} := \frac{\partial^{|\alpha_j|}}{(\partial x^0)^{\alpha_j^0} \cdots (\partial x^{d-1})^{\alpha_j^{d-1}}}.$$
The definition of the Schwartz norms $\|\cdot\|_{K,N}$ clearly implies that for $m, n \in \mathbb{N}_0$, $\varphi \in S(\mathbb{R}^{dm}, \mathbb{C}^{q^m})$, $\eta \in S(\mathbb{R}^{dn}, \mathbb{C}^{q^n})$ we get
$$\|\varphi \otimes \eta\|_{K,N} = \|\varphi\|_{K,N} \|\eta\|_{K,N}. \qquad (8)$$
The following lemma shows that the Schwartz norms $\|\cdot\|_{K,N}$ are also well adapted to the operation of taking the tensor product of two tempered distributions:
Lemma 2.1. Let $m, n \in \mathbb{N}$, $K, N \in \mathbb{N}_0$ and $R \in S'(\mathbb{R}^{dm}, \mathbb{C}^{q^m})$, $S \in S'(\mathbb{R}^{dn}, \mathbb{C}^{q^n})$. If there exist constants $C_R, C_S > 0$, such that
$$|R(\varphi)| \le C_R \|\varphi\|_{K,N}, \quad \forall \varphi \in S(\mathbb{R}^{dm}, \mathbb{C}^{q^m})$$
and
$$|S(\eta)| \le C_S \|\eta\|_{K,N}, \quad \forall \eta \in S(\mathbb{R}^{dn}, \mathbb{C}^{q^n}),$$
then
$$|R \otimes S\, (\chi)| \le C_R C_S \|\chi\|_{K,N}, \quad \forall \chi \in S(\mathbb{R}^{d(m+n)}, \mathbb{C}^{q^{(m+n)}}).$$
Proof. By Vol. II (see p. 115) of [8] there exist continuous, polynomially bounded functions $F_R : \mathbb{R}^{dm} \to \mathbb{C}^{q^m}$, $F_S : \mathbb{R}^{dn} \to \mathbb{C}^{q^n}$ and polynomials $P_R, P_S$, such that $R = P_R(D) F_R$ and $S = P_S(D) F_S$ holds in the sense of tempered distributions. Consequently, for $\chi \in S(\mathbb{R}^{d(m+n)}, \mathbb{C}^{q^{(m+n)}})$, we get that
$$R \otimes S\, (\chi) = F_R \otimes F_S\, (P_R(-D) \otimes P_S(-D)\, \chi).$$
The right-hand side (RHS) can be rewritten as an integral over $\mathbb{R}^{d(m+n)}$, where the integrand is a product of a polynomially bounded function with a fast falling function. Thus, the integral converges absolutely and by Fubini's theorem we get
$$R \otimes S\, (\chi) = F_R(P_R(-D) \varrho) = R(\varrho),$$
where
$$\varrho(x_1, \cdots, x_m) := S(\chi(x_1, \cdots, x_m, \cdot)) = F_S(1_m \otimes P_S(-D) \chi(x_1, \cdots, x_m, \cdot)).$$
Clearly, $\varrho \in S(\mathbb{R}^{dm}, \mathbb{C}^{q^m})$. Here we denoted the identity operation on $S(\mathbb{R}^{dm}, \mathbb{C}^{q^m})$ by $1_m$. Therefore one gets
Models of Local Relativistic Quantum Fields with Indefinite Metric
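\begin{align*}
|R\otimes S(\chi)| &\le C_R\|\varrho\|_{K,N} \\
&= C_R \sup_{\substack{x_1,\dots,x_m\in\mathbb{R}^d \\ 0\le|\alpha_1|,\dots,|\alpha_m|\le K}} \prod_{l=1}^{m}(1+|x_l|^2)^{N/2}\, \bigl|S\bigl(D^{\alpha_1\cdots\alpha_m}\otimes 1_n\,\chi(x_1,\dots,x_m,\cdot)\bigr)\bigr| \\
&\le C_R \sup_{\substack{x_1,\dots,x_m\in\mathbb{R}^d \\ 0\le|\alpha_1|,\dots,|\alpha_m|\le K}} \prod_{l=1}^{m}(1+|x_l|^2)^{N/2} \\
&\quad\times C_S \sup_{\substack{x_{m+1},\dots,x_{m+n}\in\mathbb{R}^d \\ 0\le|\alpha_{m+1}|,\dots,|\alpha_{m+n}|\le K}} \prod_{l=m+1}^{m+n}(1+|x_l|^2)^{N/2}\, \bigl|D^{\alpha_1\cdots\alpha_m}\otimes D^{\alpha_{m+1}\cdots\alpha_{m+n}}\chi(x_1,\dots,x_{m+n})\bigr| \\
&= C_R C_S\|\chi\|_{K,N}.
\end{align*}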
The following theorem gives a sufficient Hilbert space structure condition on the truncated Wightman functions:
Theorem 2.2. Let $K,N\in\mathbb{N}_0$ and let $\{a_n\}_{n\in\mathbb{N}}$ be a sequence of positive constants such that for all $n\in\mathbb{N}$ the truncated $n$-point Wightman function $W_n^T$ fulfills
\[ |W_n^T(\varphi)| \le a_n\|\varphi\|_{K,N}, \quad \forall\varphi\in\mathcal{S}(\mathbb{R}^{dn},\mathbb{C}^{q^n}). \tag{9} \]
Then the corresponding sequence of Wightman functions fulfills the Hilbert space structure condition.
We first prove an auxiliary lemma:
Lemma 2.3. For any sequence of positive constants $\{b_n\}_{n\in\mathbb{N}}$ there exists a sequence of positive constants $\{c_n\}_{n\in\mathbb{N}}$ such that for all $m,n\in\mathbb{N}$ the inequality $b_{m+n}\le c_m c_n$ holds.
Proof. Let $c_n := \max\{\max\{b_j : 1\le j\le 2n\}, 1\}$. Then for all $m,n\in\mathbb{N}$ we have $b_{m+n}\le\max\{c_m,c_n\}\le c_m c_n$.
Proof of Theorem 2.2. For $n\in\mathbb{N}$ we define
\[ b_n := \sum_{I\in P^{(n)}}\ \prod_{\{j_1,\dots,j_l\}\in I} a_l. \]
For $m,n\in\mathbb{N}$, $\chi\in\mathcal{S}(\mathbb{R}^{d(m+n)},\mathbb{C}^{q^{(m+n)}})$, we get by inductive use of (9) and Lemma 2.1
\[ |W_{m+n}(\chi)| \le b_{m+n}\|\chi\|_{K,N}. \]
Now we take $\chi=\varphi^*\otimes\eta$, $\varphi\in\mathcal{S}(\mathbb{R}^{dm},\mathbb{C}^{q^m})$, $\eta\in\mathcal{S}(\mathbb{R}^{dn},\mathbb{C}^{q^n})$; then by (8) we get
\[ |W_{m+n}(\varphi^*\otimes\eta)| \le b_{m+n}\|\varphi\|_{K,N}\|\eta\|_{K,N}. \]
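The constants $b_n$ and the construction in the proof of Lemma 2.3 can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the sum over partitions is evaluated through the standard recursion on the block containing the last element (an implementation choice, not taken from the text), and the sample sequence $a_l = 1$ is an assumption chosen so that $b_n$ reduces to the Bell numbers.

```python
from math import comb

def b_sequence(a, n_max):
    # b_n = sum over all partitions I of {1,...,n} of prod over blocks of a(block size).
    # Recursion on the size k of the block containing n:
    #   b_n = sum_{k=1..n} C(n-1, k-1) * a(k) * b_{n-k},  with b_0 = 1.
    b = [1]
    for n in range(1, n_max + 1):
        b.append(sum(comb(n - 1, k - 1) * a(k) * b[n - k] for k in range(1, n + 1)))
    return b

def c_from_b(b, n):
    # c_n := max(max{b_j : 1 <= j <= 2n}, 1), as in the proof of Lemma 2.3.
    return max(max(b[1:2 * n + 1]), 1)

def check_lemma_2_3(b, m_max):
    # Verifies b_{m+n} <= c_m * c_n for all 1 <= m, n <= m_max (needs b up to index 2*m_max).
    return all(b[m + n] <= c_from_b(b, m) * c_from_b(b, n)
               for m in range(1, m_max + 1) for n in range(1, m_max + 1))

# Assumed sample input: a_l = 1 for all l, so that b_n is the n-th Bell number.
bell = b_sequence(lambda l: 1, 8)
print(bell[1:6])
print(check_lemma_2_3(bell, 4))
```

The check works for the reason given in the proof: $m+n\le 2\max\{m,n\}$, so $b_{m+n}$ is among the values entering $c_{\max\{m,n\}}$, and all $c_n$ are at least 1.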
On the other hand, by Lemma 2.3 there exists a sequence of positive numbers $\{c_n\}_{n\in\mathbb{N}}$ such that
\[ |W_{m+n}(\varphi^*\otimes\eta)| \le c_m\|\varphi\|_{K,N}\, c_n\|\eta\|_{K,N}. \tag{10} \]
By Vol. IV (see p. 82) of [8], for $n\in\mathbb{N}$ there is a system $\{\|\cdot\|'_{K,N}\}_{K,N\in\mathbb{N}_0}$ of Hilbert norms on $\mathcal{S}(\mathbb{R}^{dn},\mathbb{C}^{q^n})$ which is equivalent to the system of Schwartz norms $\{\|\cdot\|_{K,N}\}_{K,N\in\mathbb{N}_0}$. Thus, there is a sequence of positive constants $\{d_n\}_{n\in\mathbb{N}}$ such that for the above fixed $K,N\in\mathbb{N}_0$ and suitable $K',N'\in\mathbb{N}_0$ (depending on $K$, $N$ and $n$) we get
\[ \|\varphi\|_{K,N} \le d_n\|\varphi\|'_{K',N'}, \quad \forall\varphi\in\mathcal{S}(\mathbb{R}^{dn},\mathbb{C}^{q^n}). \]
We now choose Hilbert norms $p_n$ on $\mathcal{S}(\mathbb{R}^{dn},\mathbb{C}^{q^n})$ as $p_n(\cdot) := c_n d_n\|\cdot\|'_{K',N'}$. From (10) we immediately get the Hilbert space structure condition
\[ |W_{m+n}(\varphi^*\otimes\eta)| \le p_m(\varphi)\,p_n(\eta). \]
Since all truncated Wightman functions $W_n^T$ are tempered distributions and are therefore continuous with respect to some norm $\|\cdot\|_{K(n),N(n)}$, it is enough to check (9) for $n$ larger than a certain number $m\in\mathbb{N}$: We may simply put $K' := \max\{K, K(n) : n=1,\dots,m\}$, $N' := \max\{N, N(n) : n=1,\dots,m\}$, and by $\|\cdot\|_{K',N'}\ge\|\cdot\|_{K,N}$ and $\|\cdot\|_{K',N'}\ge\|\cdot\|_{K(n),N(n)}$, $n=1,\dots,m$, we get (9) for all $n\in\mathbb{N}$ if the numbers $K,N$ are replaced by $K',N'$. In particular, we get
Corollary 2.4. Let $\{W_n^T\}_{n\in\mathbb{N}}$ be a sequence of truncated Wightman distributions. If $W_n^T=0$ for all $n$ larger than a certain number $m\in\mathbb{N}$, then the corresponding sequence of Wightman functions fulfills the Hilbert space structure condition.
Since in our models introduced in Sect. 3 we have explicit formulae for the Fourier transformed truncated Wightman functions rather than for the truncated Wightman functions themselves, we need the following Fourier transformed version of Theorem 2.2:
Corollary 2.5. Let $K$, $N$ and $\{a_n\}_{n\in\mathbb{N}}$ be as in Theorem 2.2. Suppose that for all $n\in\mathbb{N}$ the Fourier transformed truncated $n$-point Wightman function $\hat{W}_n^T$ fulfills
\[ |\hat{W}_n^T(\varphi)| \le a_n\|\varphi\|_{K,N}, \quad \forall\varphi\in\mathcal{S}(\mathbb{R}^{dn},\mathbb{C}^{q^n}). \tag{11} \]
Then the sequence of Wightman functions fulfills the Hilbert space structure condition.
Proof. By the basic fact that $W_n(\varphi)=\hat{W}_n(\hat{\varphi})$, $\forall\varphi\in\mathcal{S}(\mathbb{R}^{dn},\mathbb{C}^{q^n})$, we only have to replace the sequence of Hilbert norms $\{p_n\}_{n\in\mathbb{N}}$ constructed in the proof of Theorem 2.2 by the sequence $\{\hat{p}_n\}_{n\in\mathbb{N}}$ defined as $\hat{p}_n := p_n\circ\hat{\ }$. Then $\{W_n\}_{n\in\mathbb{N}_0}$ fulfills the Hilbert space structure condition with respect to $\{\hat{p}_n\}_{n\in\mathbb{N}}$.

3. Relativistic Fields from Convoluted Generalized White Noise
Since the work by Nelson [14], the problem of constructing Markovian or reflection positive (see [15] for the notion of reflection positivity) random fields over $\mathbb{R}^d$, which are invariant (i.e., homogeneous, stationary) with respect to the Euclidean group, has been looked upon as closely related to the problem of constructing (Bosonic) relativistic quantum fields. In such an approach, the moments of the Euclidean random fields are
viewed as Schwinger functions, which are the analytic continuation of the vacuum expectation values (Wightman functions) of relativistic quantum fields to purely imaginary time. In this section, we introduce Wightman functions associated with scalar and vector convoluted generalized white noise Euclidean random fields. Such Euclidean random fields are solutions of certain stochastic partial (pseudo-)differential equations of the form $LX=F$ with $F$ a Euclidean generalized white noise and $L$ a suitable invariant (pseudo-)differential operator. In the case where $F$ is a scalar Gaussian white noise and $L=(-\Delta+m^2)^{\alpha}$ with $\alpha\in(0,\frac{1}{2}]$, the obtained random field $X$ is a generalized free Euclidean scalar quantum field (see e.g. [17]). In the case that $F$ is a quaternionic Gaussian white noise, the solution $X$ of the quaternionic Cauchy–Riemann equation $\partial X=F$ driven by $F$ is the free Euclidean electromagnetic quantum field. If $F$ is non-Gaussian, the corresponding covariant random fields can be interpreted as Euclidean quantum fields with some nonlinear interactions. As has been investigated in [2] in the scalar case (see also [5] for an axiomatic result in the vector case), under the condition of non-Gaussian white noise such Euclidean random fields in general lack the reflection positivity property. However, since the Schwinger functions of such random fields can be explicitly calculated, we can perform the analytic continuation of the Schwinger functions to relativistic Wightman functions "by hand" (see [1–4] and [9]).
Using the properties of Euclidean invariance, symmetry and real-valuedness of the Schwinger functions on the one hand, and the Osterwalder–Schrader reconstruction theorem (see [15]) on the other hand, we obtain that the corresponding Wightman functions satisfy the relativistic postulates of invariance, locality and hermiticity, whereas the spectral property and the cluster property of the Wightman functions can be verified directly from the derived explicit formulae. In what follows, we only briefly review these constructions. We refer the reader to [1–4] and [9] for all details.

3.1. Scalar models. Let $\mathcal{S}(\mathbb{R}^d)$ be the Schwartz space of all rapidly decreasing real valued $C^\infty$-functions on $\mathbb{R}^d$ and $\mathcal{S}'(\mathbb{R}^d)$ its topological dual. The dual pairing is denoted by $\langle\cdot,\cdot\rangle$. Let $\mathcal{B}$ be the $\sigma$-algebra generated by all cylinder sets of $\mathcal{S}'(\mathbb{R}^d)$. Then $(\mathcal{S}'(\mathbb{R}^d),\mathcal{B})$ is a standard measurable space. By the well-known Bochner–Minlos theorem (see e.g. [10] or Vol. IV of [8]), there exists a unique probability measure $P$ on $(\mathcal{S}'(\mathbb{R}^d),\mathcal{B})$ such that its Fourier transform satisfies
\[ \int_{\mathcal{S}'(\mathbb{R}^d)} e^{i\langle\varphi,\omega\rangle}\,dP(\omega) = \exp\Bigl\{\int_{\mathbb{R}^d}\psi(\varphi(x))\,dx\Bigr\}, \quad \varphi\in\mathcal{S}(\mathbb{R}^d), \tag{12} \]
where $\psi$ is a Lévy–Khinchine function on $\mathbb{R}$ given by
\[ \psi(t) = iat - \frac{1}{2}\sigma^2t^2 + \int_{\mathbb{R}\setminus\{0\}}\Bigl(e^{ist}-1-\frac{ist}{1+s^2}\Bigr)dM(s), \quad t\in\mathbb{R}, \tag{13} \]
with $a,\sigma\in\mathbb{R}$ and $M$ a non-decreasing function satisfying
\[ \int_{\mathbb{R}\setminus\{0\}}\min(1,s^2)\,dM(s) < \infty. \]
We call $P$ a generalized white noise measure with Lévy–Khinchine function $\psi$. The associated coordinate process $F:\mathcal{S}(\mathbb{R}^d)\times(\mathcal{S}'(\mathbb{R}^d),\mathcal{B},P)\to\mathbb{R}$ defined by
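\[ F(\varphi,\omega) := \langle\varphi,\omega\rangle, \quad \varphi\in\mathcal{S}(\mathbb{R}^d),\ \omega\in\mathcal{S}'(\mathbb{R}^d), \]
is called a generalized white noise. Let $K:\mathbb{R}^d\times\mathbb{R}^d\to\mathbb{R}$ be a measurable integral kernel such that
\[ (G\varphi)(x) := \int_{\mathbb{R}^d}K(x,y)\varphi(y)\,dy, \quad \varphi\in\mathcal{S}(\mathbb{R}^d), \]
is a linear continuous mapping from $\mathcal{S}(\mathbb{R}^d)$ to itself. Then the conjugate mapping $\tilde{G}:\mathcal{S}'(\mathbb{R}^d)\to\mathcal{S}'(\mathbb{R}^d)$ is a measurable transform from $(\mathcal{S}'(\mathbb{R}^d),\mathcal{B})$ to itself. Let $P_K$ denote the image measure of $P$ under $\tilde{G}$: $P_K(A) := P(\tilde{G}^{-1}A)$, $A\in\mathcal{B}$. Then it is not hard to derive that
\[ \int_{\mathcal{S}'(\mathbb{R}^d)} e^{i\langle\varphi,\omega\rangle}\,dP_K(\omega) = \exp\Bigl\{\int_{\mathbb{R}^d}\psi\Bigl(\int_{\mathbb{R}^d}K(x,y)\varphi(y)\,dy\Bigr)dx\Bigr\} \tag{14} \]
for $\varphi\in\mathcal{S}(\mathbb{R}^d)$. The coordinate process $X:\mathcal{S}(\mathbb{R}^d)\times(\mathcal{S}'(\mathbb{R}^d),\mathcal{B},P_K)\to\mathbb{R}$ given by
\[ X(\varphi,\omega) := \langle\varphi,\omega\rangle, \quad \varphi\in\mathcal{S}(\mathbb{R}^d),\ \omega\in\mathcal{S}'(\mathbb{R}^d), \]
is a random field. Actually, $X$ is precisely $\tilde{G}F$ defined by $(\tilde{G}F)(\varphi,\omega) := F(G\varphi,\omega)$. Moreover, $X$ is a Euclidean field if $K$ is Euclidean invariant. In this case, we can write $K(x,y) := G(x-y)$ for some function $G$ on $\mathbb{R}^d$ (with the corresponding invariance property), and for the Euclidean field $X$ we have $X=G*F$, i.e. $X$ is a (Euclidean) convoluted generalized white noise. Now we assume that all the moments of $M$ in (13) are finite; then $\psi$ is $C^\infty$-smooth in a neighborhood of the origin and all the moments of $X$ exist. We define Schwinger functions of $X$ on the topological tensor product $\mathcal{S}^{\otimes n}(\mathbb{R}^d)\cong\mathcal{S}(\mathbb{R}^{dn})$ as follows:
\[ S_n(\varphi_1\otimes\cdots\otimes\varphi_n) := \int_{\mathcal{S}'(\mathbb{R}^d)}\prod_{j=1}^{n}X(\varphi_j,\omega)\,dP_K(\omega). \tag{15} \]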
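Moreover, by using the explicit form of the right-hand side of (14), we can calculate the truncated Schwinger functions of the model as follows:
\begin{align*}
S_n^T(\varphi_1\otimes\cdots\otimes\varphi_n) &:= i^{-n}\frac{\partial^n}{\partial\lambda_1\cdots\partial\lambda_n}\Bigl\{\int_{\mathbb{R}^d}\psi\Bigl(\sum_{j=1}^{n}\lambda_j(G*\varphi_j)(x)\Bigr)dx\Bigr\}\Big|_{\lambda_1=\cdots=\lambda_n=0} \\
&= c_n\int_{\mathbb{R}^{dn}}G^{(n)}(x_1,\dots,x_n)\prod_{j=1}^{n}\varphi_j(x_j)\prod_{j=1}^{n}dx_j \tag{16}
\end{align*}
for $\varphi_1,\dots,\varphi_n\in\mathcal{S}(\mathbb{R}^d)$ and $n\in\mathbb{N}$, where
\begin{align*}
c_1 &= a+\int_{\mathbb{R}\setminus\{0\}}\frac{s^3}{1+s^2}\,dM(s), \\
c_2 &= \sigma^2+\int_{\mathbb{R}\setminus\{0\}}s^2\,dM(s), \\
c_n &= \int_{\mathbb{R}\setminus\{0\}}s^n\,dM(s), \quad n\ge 3, \\
G^{(n)}(x_1,\dots,x_n) &:= \int_{\mathbb{R}^d}\prod_{j=1}^{n}G(x-x_j)\,dx, \quad n\in\mathbb{N}.
\end{align*}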
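Since (16) differentiates $\psi$ once in each $\lambda_j$ at zero, the constants satisfy $c_n = i^{-n}\psi^{(n)}(0)$. The sketch below checks the three formulae for $c_1$, $c_2$ and $c_n$ ($n\ge 3$) numerically. It rests on assumptions that are not in the text: $dM$ is taken to be a finite sum of point masses (so the integral in (13) becomes a finite sum), the parameter values are arbitrary, and the derivatives are approximated by central finite differences.

```python
import cmath
from math import comb

def psi(t, a, sigma, jumps):
    # Levy-Khinchine function (13) for dM = sum of point masses; jumps = [(s, weight), ...], s != 0.
    val = 1j * a * t - 0.5 * sigma ** 2 * t ** 2
    for s, w in jumps:
        val += w * (cmath.exp(1j * s * t) - 1 - 1j * s * t / (1 + s * s))
    return val

def c_exact(n, a, sigma, jumps):
    # Closed formulae for c_n stated after (16), specialised to point masses.
    if n == 1:
        return a + sum(w * s ** 3 / (1 + s * s) for s, w in jumps)
    if n == 2:
        return sigma ** 2 + sum(w * s ** 2 for s, w in jumps)
    return sum(w * s ** n for s, w in jumps)

def c_numeric(n, a, sigma, jumps, h=1e-2):
    # i^{-n} psi^{(n)}(0), with the n-th derivative approximated by the central difference
    # sum_k (-1)^k C(n,k) psi((n/2 - k) h) / h^n  (error of order h^2).
    diff = sum((-1) ** k * comb(n, k) * psi((n / 2 - k) * h, a, sigma, jumps)
               for k in range(n + 1))
    return (diff / h ** n) / (1j) ** n

# Assumed example parameters (not from the text).
params = (0.3, 0.7, [(1.0, 0.5), (-0.5, 2.0)])
for n in (1, 2, 3, 4):
    print(n, c_exact(n, *params), c_numeric(n, *params))
```

The subtraction of $ist/(1+s^2)$ in (13) is what turns the first-order contribution $s$ of each jump into $s^3/(1+s^2)$ in $c_1$; from second order on only $e^{ist}$ contributes, which is why $c_n$ for $n\ge 3$ is a pure moment of $M$.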
Furthermore, taking into account that the Schwinger functions can be expressed by partial derivatives of the right-hand side of (14) at zero, and using a generalized chain rule, we get the following formula
\[ S_n(\varphi_1\otimes\cdots\otimes\varphi_n) = \sum_{I\in P^{(n)}}\ \prod_{\{j_1,\dots,j_k\}\in I}S_k^T(\varphi_{j_1}\otimes\cdots\otimes\varphi_{j_k}), \quad n\in\mathbb{N}, \]
which is clearly the same relation as (7). Taking now $G$ to be the Green function $G_\alpha$, say, of the pseudo-differential operator $(-\Delta+m_0^2)^{\alpha}$ for the mass $m_0>0$ and $\alpha\in(0,\frac{1}{2}]$, where $\Delta$ is the Laplace operator on $\mathbb{R}^d$, namely (e.g. in the sense of Fourier transforms of tempered distributions)
\[ G_\alpha(x) = (2\pi)^{-d}\int_{\mathbb{R}^d}\frac{e^{ikx}}{(|k|^2+m_0^2)^{\alpha}}\,dk, \quad x\in\mathbb{R}^d, \]
then we have Euclidean fields $X=G_\alpha*F$ and their Schwinger functions and truncated Schwinger functions as defined above. To perform analytic continuation of $S_n^T$, we need first to represent $S_n^T$ in terms of a Laplace transform. In fact, we have (see [1, 2] and [9]) a sequence of truncated Wightman functions $\{W_{n,\alpha}^T\}_{n\in\mathbb{N}}$, with the following Laplace transform formula
\[ S_n^T(y_1,\dots,y_n) = (2\pi)^{-\frac{dn}{2}}\int_{\mathbb{R}^{dn}} e^{-\sum_{l=1}^{n}k_l^0y_l^0+i\mathbf{k}_l\mathbf{y}_l}\ \hat{W}_{n,\alpha}^T(k_1,\dots,k_n)\ \otimes_{l=1}^{n}dk_l \tag{17} \]
for $y_1^0<\cdots<y_n^0$, where $W_{1,\alpha}^T := 0$ (we take this for simplicity); $W_{2,\frac{1}{2}}^T$ is given as $c_2$ times the two-point function of the relativistic free field of mass $m_0$; and for $n\ge 3$, or $n=2$ and $\alpha\in(0,\frac{1}{2})$,
\[ \hat{W}_{n,\alpha}^T(k_1,\dots,k_n) := c_n2^{n-1}(2\pi)^d\sum_{j=1}^{n}\ \prod_{l=1}^{j-1}\mu_\alpha^-(k_l)\ \mu_\alpha(k_j)\prod_{l=j+1}^{n}\mu_\alpha^+(k_l)\ \times\ \delta\Bigl(\sum_{l=1}^{n}k_l\Bigr) \tag{18} \]
are tempered distributions with
\begin{align*}
\mu_\alpha^+(k) &:= (2\pi)^{-d/2}\sin(\pi\alpha)\,1_{\{k^2>m_0^2,\,k^0>0\}}(k)\,\frac{1}{(k^2-m_0^2)^{\alpha}}, \\
\mu_\alpha^-(k) &:= (2\pi)^{-d/2}\sin(\pi\alpha)\,1_{\{k^2>m_0^2,\,k^0<0\}}(k)\,\frac{1}{(k^2-m_0^2)^{\alpha}}, \\
\mu_\alpha(k) &:= (2\pi)^{-d/2}\bigl(\cos(\pi\alpha)\,1_{\{k^2>m_0^2\}}(k)+1_{\{k^2<m_0^2\}}(k)\bigr)\frac{1}{|k^2-m_0^2|^{\alpha}},
\end{align*}
where $k := (k^0,\mathbf{k})\in\mathbb{R}\times\mathbb{R}^{d-1}$. By the general property of the Laplace transform, $S_n^T$ can be analytically continued from the purely Euclidean imaginary time to the permuted extended backward tube $T_{p.e.}^n$ with the boundary value $W_{n,\alpha}^T=\mathcal{F}^{-1}\hat{W}_{n,\alpha}^T$ for real (relativistic) time. We then have the following result (see Corollary 7.11 of [2]):
Theorem 3.1. $\{W_{n,\alpha}\}_{n\in\mathbb{N}}$ defined by $\{W_{n,\alpha}^T\}_{n\in\mathbb{N}}$ via (7) is a sequence of Wightman functions which satisfy Axioms I–IV, the hermiticity condition and the cluster property.
3.2. Vector models. Euclidean vector models of quantum fields given by solutions of covariant stochastic partial differential equations with a white noise source have been discussed in [3] (see also references therein). We recall here briefly the basic elements, in the case of a four dimensional space-time, identified, in its Euclidean version, with the vector space of quaternions (this identification permits us to write the basic stochastic partial differential equation in a simple form). Thus, let $\mathbb{H}$ be the skew field of all quaternions with $\{1,i,j,k\}$ its canonical basis. Let $\mathcal{S}(\mathbb{R}^4,\mathbb{H})$ denote the Schwartz space of all rapidly decreasing functions from $\mathbb{R}^4$ to $\mathbb{H}$ and $\mathcal{S}'(\mathbb{R}^4,\mathbb{H})$ its topological dual. The dual pairing is denoted by $\langle\cdot,\cdot\rangle$. By the known Bochner–Minlos theorem (Vol. IV of [8]), there exists a unique probability measure $P$ on the standard measurable space $(\mathcal{S}'(\mathbb{R}^4,\mathbb{H}),\mathcal{B})$, where $\mathcal{B}$ is the $\sigma$-algebra generated by all cylinder sets of $\mathcal{S}'(\mathbb{R}^4,\mathbb{H})$, with the following Fourier transform:
\[ \int_{\mathcal{S}'(\mathbb{R}^4,\mathbb{H})} e^{i\langle\varphi,\omega\rangle}\,dP(\omega) = \exp\Bigl\{\int_{\mathbb{R}^4}\psi(\varphi(x))\,dx\Bigr\}, \quad \varphi\in\mathcal{S}(\mathbb{R}^4,\mathbb{H}), \]
where $\psi$ is a Lévy–Khinchine function on $\mathbb{H}$ given by
\[ \psi(x) = i\beta x^0-\frac{1}{2}\sigma_0(x^0)^2-\frac{1}{2}\sigma|\vec{x}|^2-\int_{\mathbb{H}\setminus\{0\}}\Bigl(1+i\langle x,y\rangle_E1_{(0,1)}(|y|)-e^{i\langle x,y\rangle_E}\Bigr)\nu(dy) \]
with the condition that $\psi(x)=O(|x|^{\frac{4}{3}+\epsilon})$ as $x\to 0$, where $x := x^01-x^1i-x^2j-x^3k$, $(x^0,x^1,x^2,x^3)\in\mathbb{R}^4$, $\vec{x} := x-1x^0$, $\beta\in\mathbb{R}$, $\sigma_0,\sigma\in(0,\infty)$, $|x|$ denotes the Euclidean norm of $x\in\mathbb{H}$ and $\nu$ is a Lévy measure on $\mathbb{H}$ supported by the centre of $\mathbb{H}\setminus\{0\}$ (see [3]). In the same way as in Subsect. 3.1, we can define the associated coordinate process $F:\mathcal{S}(\mathbb{R}^4,\mathbb{H})\times(\mathcal{S}'(\mathbb{R}^4,\mathbb{H}),\mathcal{B},P)\to\mathbb{R}$ by
\[ F(\varphi,\omega) := \langle\varphi,\omega\rangle, \quad \varphi\in\mathcal{S}(\mathbb{R}^4,\mathbb{H}),\ \omega\in\mathcal{S}'(\mathbb{R}^4,\mathbb{H}). \]
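We call $F$ an $\mathbb{H}$-valued generalized white noise. The covariant vector random fields were constructed in [3] as solutions of the inhomogeneous quaternionic Cauchy–Riemann equation $\partial X=F$ over $\mathbb{H}$, where $\partial$ is the quaternionic Cauchy–Riemann operator defined by
\[ \partial := 1\frac{\partial}{\partial x^0}-i\frac{\partial}{\partial x^1}-j\frac{\partial}{\partial x^2}-k\frac{\partial}{\partial x^3}. \]
The conjugate operator $\bar{\partial}$ of $\partial$ is given by
\[ \bar{\partial} := 1\frac{\partial}{\partial x^0}+i\frac{\partial}{\partial x^1}+j\frac{\partial}{\partial x^2}+k\frac{\partial}{\partial x^3}, \]
and the Laplace operator is defined by $\Delta_{\mathbb{H}} := \partial\bar{\partial}=\bar{\partial}\partial$. The Green function for $-\Delta_{\mathbb{H}}$ is given explicitly by
\[ g(x) = \frac{1}{4\pi^2|x|^2}, \quad x\in\mathbb{H}\setminus\{0\}. \]
Then the equation $\partial X=F$ is solved by the convolution $X=(-\bar{\partial}g)*F$, which is the coordinate process associated to the probability measure $P_X$ on $(\mathcal{S}'(\mathbb{R}^4,\mathbb{H}),\mathcal{B})$ determined by the following Fourier transform:
\[ \int_{\mathcal{S}'(\mathbb{R}^4,\mathbb{H})} e^{i\langle\varphi,\omega\rangle}\,dP_X(\omega) = \exp\Bigl\{\int_{\mathbb{R}^4}\psi((g*\partial\varphi)(x))\,dx\Bigr\}, \quad \varphi\in\mathcal{S}(\mathbb{R}^4,\mathbb{H}). \]
Similarly to the scalar case in Subsect. 3.1, under the assumption that $\nu$ has moments of all orders larger than one, $\psi$ is $C^\infty$-smooth in a neighborhood of $0\in\mathbb{H}$. The Schwinger functions $S_n$, $n\in\mathbb{N}$, and the truncated Schwinger functions $S_n^T$, $n\in\mathbb{N}$, of $X$ can be constructed explicitly as follows:
\[ S_n(\varphi_1\otimes\cdots\otimes\varphi_n) := \int_{\mathcal{S}'(\mathbb{R}^4,\mathbb{H})}\prod_{j=1}^{n}X(\varphi_j,\omega)\,dP_X(\omega), \quad n\in\mathbb{N}, \]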
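and
\begin{align*}
S_n^T(\varphi_1\otimes\cdots\otimes\varphi_n) &:= i^{-n}\frac{\partial^n}{\partial\lambda_1\cdots\partial\lambda_n}\Bigl\{\int_{\mathbb{R}^4}\psi\Bigl(\sum_{j=1}^{n}\lambda_j(g*\partial\varphi_j)(x)\Bigr)dx\Bigr\}\Big|_{\lambda_1=\cdots=\lambda_n=0} \\
&= \begin{cases}
\text{constant}, & n=1, \\
\langle c_0\,\mathrm{div}\,\varphi_1\otimes\mathrm{div}\,\varphi_2+cD_E(\varphi_1\otimes\varphi_2),\ g^{(2)}\rangle, & n=2, \\
\langle E^n\varphi_1\otimes\cdots\otimes\varphi_n,\ g^{(n)}\rangle, & n\ge 3,
\end{cases}
\end{align*}
for $\varphi_1,\dots,\varphi_n\in\mathcal{S}(\mathbb{R}^4,\mathbb{H})$, where
\[ g^{(n)}(y) = \begin{cases}
-\frac{1}{8\pi}\ln|y_1-y_2|, & n=2; \\
\int_{\mathbb{R}^4}\prod_{j=1}^{n}g(x-y_j)\,dx, & n\ge 3,
\end{cases} \tag{19} \]
for $y=(y_1,\dots,y_n)\in(\mathbb{R}^4)^n_{\neq} := \{y\in(\mathbb{R}^4)^n : y_j\neq y_l \text{ if } j\neq l\}$,
\[ E^n := \sum_{l=0,\ l\ \mathrm{even}}^{n}c_l^nE_l^n, \]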
\begin{align*}
c_0 &:= \sigma_0+\int_{\mathbb{H}\setminus\{0\}}(x^0)^2\,\nu(dx), \\
c &:= \sigma+\frac{1}{3}\int_{\mathbb{H}\setminus\{0\}}|\vec{x}|^2\,\nu(dx), \\
c_l^n &:= \frac{1}{l+1}\binom{n}{l}\int_{\mathbb{H}\setminus\{0\}}(x^0)^{n-l}|\vec{x}|^l\,\nu(dx), \quad n\ge 3,\ 0\le l\le n, \\
E_l^n &:= \mathrm{Sym}\bigl(\underbrace{\mathrm{div}\otimes\cdots\otimes\mathrm{div}}_{n-l}\otimes\underbrace{D_E\otimes\cdots\otimes D_E}_{l/2}\bigr),
\end{align*}
and $D_E:\mathcal{S}(\mathbb{R}^4\times\mathbb{R}^4,\mathbb{H}\times\mathbb{H})\to\mathcal{S}(\mathbb{R}^4\times\mathbb{R}^4,\mathbb{R})$ is a linear partial differential operator on $\mathbb{R}^4\times\mathbb{R}^4$ which is of first order with respect to each variable $x_1,x_2\in\mathbb{R}^4$. The analytic continuation of $\{S_n^T\}_{n\in\mathbb{N}}$ from the imaginary Euclidean time to the real relativistic time performed in [3] (to which we refer for details) yields a sequence of truncated Wightman functions $\{W_n^T\}_{n\in\mathbb{N}}$. In fact, each $g^{(n)}$ defined by (19) has a holomorphic extension $G^{(n)}$ defined on the permuted extended backward tube $T_{p.e.}^n$. For $n\ge 3$, it is defined as follows:
\[ G^{(n)}(z) := \Bigl\langle e(z,\cdot),\ M_0^n+\sum_{j=1}^{n-1}\Bigl(\frac{\partial}{\partial k_{j+1}^0}-\frac{\partial}{\partial k_j^0}\Bigr)M_j^n+M_n^n\Bigr\rangle, \quad z\in T_{p.e.}^n, \]
where
\[ e(z,k) := (2\pi)^{-2n}\exp\Bigl\{i\sum_{j=1}^{n}\langle z_j,k_j\rangle_E\Bigr\} \]
and $\{M_j^n : 0\le j\le n\}$ are measures defined on the space $\mathbb{R}^{4n}$ (see [3]). This can be verified by writing $g^{(n)}$ as the Laplace transform (cf. Eq. (17)) of the following tempered distribution
\[ M_0^n+\sum_{j=1}^{n-1}\Bigl(\frac{\partial}{\partial k_j^0}-\frac{\partial}{\partial k_{j+1}^0}\Bigr)M_j^n+M_n^n. \]
In what follows, we give a representation of $\{M_j^n : 0\le j\le n\}$ for later use in Sect. 4. It differs from the one given in [3] and can be derived from the argument in Subsect. 7.4 of [2] in the case $m_0=0$:
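\begin{align*}
M_0^n &= (2\pi)^{3-n}\frac{1}{2|\vec{k}_1|(k_1^0-|\vec{k}_1|)}\prod_{l=2}^{n}\delta_0^+(k_l)\ \delta\Bigl(\sum_{l=1}^{n}k_l\Bigr); \\
M_n^n &= (2\pi)^{3-n}\frac{1}{2|\vec{k}_n|(k_n^0+|\vec{k}_n|)}\prod_{l=1}^{n-1}\delta_0^-(k_l)\ \delta\Bigl(\sum_{l=1}^{n}k_l\Bigr); \\
M_j^n(\varphi) &= (2\pi)^{3-n}\int_0^1\Bigl\langle\prod_{l=1}^{j-1}\delta_0^-(k_l)\ \frac{\delta(k_j^0-\tilde{k}_j^0(s))}{4|\vec{k}_j||\vec{k}_{j+1}|}\prod_{l=j+2}^{n}\delta_0^+(k_l)\ \delta\Bigl(\sum_{l=1}^{n}k_l\Bigr),\ \varphi\Bigr\rangle\,ds, \tag{20}
\end{align*}
for $1\le j\le n-1$, $\varphi\in\mathcal{S}(\mathbb{R}^{4n})$, where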
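\begin{align*}
\delta_0^+(k_l) &= 1_{\{k_l^0>0\}}(k_l^0)\,\delta(k_l^2), \\
\delta_0^-(k_l) &= 1_{\{k_l^0<0\}}(k_l^0)\,\delta(k_l^2), \\
\tilde{k}_j^0(s) &= \tilde{k}_j^0(k_1^0,\dots,k_{j-1}^0,\vec{k}_j,\vec{k}_{j+1},k_{j+1}^0,\dots,k_n^0,s) \\
&:= -\Bigl\{\Bigl(-\sum_{l=1}^{j-1}k_l^0+\omega_j\Bigr)s+\Bigl(\omega_{j+1}+\sum_{l=j+2}^{n}k_l^0\Bigr)(1-s)+\sum_{l=1}^{j-1}k_l^0\Bigr\}.
\end{align*}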
For $n\ge 3$, let $G_n$ denote the boundary value of $G^{(n)}$ (under the limit of the purely real time) in the backward tube $T^n$. For the case $n=2$, $G_2$ can be calculated by using a different method. Since here we do not need an explicit formula for $W_2^T$, we refer to [2] for this calculation. The corresponding truncated Wightman functions $\{W_n^T\}_{n\in\mathbb{N}}$ over Minkowski space $M^4$ are then given as follows:
\[ W_n^T(\varphi_1\otimes\cdots\otimes\varphi_n) \begin{cases}
\equiv \text{constant}, & n=1, \\
:= \langle c_0\,\mathrm{div}\,\varphi_1\otimes\mathrm{div}\,\varphi_2+cD_M(\varphi_1\otimes\varphi_2),\ G_2\rangle, & n=2, \\
:= \langle L^n\varphi_1\otimes\cdots\otimes\varphi_n,\ G_n\rangle, & n\ge 3,
\end{cases} \tag{21} \]
for $\varphi_1,\dots,\varphi_n\in\mathcal{S}(\mathbb{R}^4,\mathbb{H})$, where $D_M$ is a linear partial differential operator on $\mathbb{R}^4\times\mathbb{R}^4$ which is obtained as an analytic continuation of $D_E$ and hence is of first order with respect to each variable $x_1,x_2\in\mathbb{R}^4$,
\[ L^n := \sum_{l=0,\ l\ \mathrm{even}}^{n}c_l^nL_l^n, \quad\text{with}\quad L_l^n := \mathrm{Sym}\bigl(\underbrace{\mathrm{div}\otimes\cdots\otimes\mathrm{div}}_{n-l}\otimes\underbrace{D_M\otimes\cdots\otimes D_M}_{l/2}\bigr). \]
We notice that $L_l^n$ is also a linear partial differential operator on $\underbrace{\mathbb{R}^4\times\cdots\times\mathbb{R}^4}_{n}$ which is of first order with respect to every variable $x_1,\dots,x_n\in\mathbb{R}^4$. We then have the following result from Theorem 4.21 of [3] (cf. Theorem 4.5 and Corollary 4.7 of [2] for the cluster property):
Theorem 3.2. $\{W_n\}_{n\in\mathbb{N}}$, as defined by $\{W_n^T\}_{n\in\mathbb{N}}$ via (7), is a sequence of Wightman functions which satisfy Axioms I–IV, the hermiticity condition and the cluster property.

4. Verification of the Hilbert Space Structure Condition for the Models
In this section we prove that the truncated Wightman functions of the scalar models as well as the vector models in Sect. 3 fulfill the requirements of Corollary 2.5, which further implies that the Wightman functions of both the scalar and vector models in Sect. 3 satisfy Axiom V. Thus, we prove the following result:
Theorem 4.1. The Wightman functions obtained in Sect. 3 for the scalar and the vector models (over the $d$-dimensional resp. 4-dimensional Minkowski space-time) fulfill the modified Wightman axioms I–V (of Morchio and Strocchi). In particular, for each such model there is a Hilbert space $(\mathcal{H},(\cdot,\cdot))$, a continuous and self-adjoint metric operator $T$ on $\mathcal{H}$ fulfilling $T^2=1$, and local $T$-symmetric field operators $\phi(f)$ defined on a common dense domain $\mathcal{D}\subset\mathcal{H}$ for $f\in\mathcal{S}(\mathbb{R}^d,\mathbb{C})$, $\mathcal{S}(\mathbb{R}^4,\mathbb{C}^4)$ respectively, such that Eq. (3) holds. Furthermore, we have a $T$-unitary representation $U$ of $P_+^\uparrow$ (the proper orthochronous Poincaré group over $\mathbb{R}^d$ resp. $\mathbb{R}^4$) on the dense domain $\mathcal{D}\subset\mathcal{H}$, where the transformation law of the fields $\phi$ under $U$ is given by (4) and $U$ fulfills the spectral condition as given in Eq. (5).
By the results of Sect. 1, the second part of Theorem 4.1 follows immediately from Axioms I–V. Although there is a lot of similarity in the methods applied to the scalar and the vector model, the origin of the technical difficulties in the proof of Axiom V is quite different in the two cases: In the scalar case the Källén–Lehmann representation of the Green functions $G_\alpha$ by infinite measures [2] leads to singularities of the Fourier transformed Wightman distributions near the mass shell of the lowest mass. These singularities for $0<\alpha\le\frac{1}{2}$ turn out to be locally integrable independently of the dimension $d$ of the underlying space-time. In the vector case we restricted ourselves to a special Green function, such that the above mentioned singularities do not arise. But in this case we have to overcome the problems caused by the fact that the fields have mass zero, leading to singularities at the bottom of the light cone. These singularities are however locally integrable, since we have specialized to the sufficiently large (physical) space-time dimension 4.

4.1. Proof of Theorem 4.1 for the scalar models. By the argument given in Sect. 2, it suffices to check Eq. (11) for $n\ge 3$, $K=0$ and $N=2d$. Using the explicit formulae of $\hat{W}_{n,\alpha}^T$ for $\alpha\in(0,\frac{1}{2}]$ with $m_0>0$, we get that for $\varphi\in\mathcal{S}(\mathbb{R}^{dn},\mathbb{C})$,
\[ |\hat{W}_{n,\alpha}^T(\varphi)| \le nc_n2^{n-1}(2\pi)^{d-\frac{dn}{2}}\int_{\mathbb{R}^{d(n-1)}}\Bigl|\Bigl(\sum_{l=2}^{n}k_l\Bigr)^2-m_0^2\Bigr|^{-\alpha}\prod_{l=2}^{n}\frac{|k_l^2-m_0^2|^{-\alpha}}{(1+|k_l|^2)^d}\ \bigotimes_{l=2}^{n}dk_l\ \|\varphi\|_{0,2d}. \]
It remains to show that the integral on the RHS is finite. Noticing that $(1+|k_l|^2)^{-d}\le(1+(k_l^0)^2)^{-1}(1+|\mathbf{k}_l|^2)^{1-d}$, the above integral can be estimated by the following expression:
\begin{align*}
&\Bigl(\int_{\mathbb{R}^{d-1}}\frac{d\mathbf{k}}{(1+|\mathbf{k}|^2)^{d-1}}\Bigr)^{n-1}\Bigl(\sup_{\mathbf{k}\in\mathbb{R}^{d-1}}\int_{\mathbb{R}}\frac{|k^2-m_0^2|^{-\alpha}}{1+(k^0)^2}\,dk^0\Bigr)^{n-3} \\
&\quad\times\sup_{\substack{\mathbf{k}_2,\mathbf{k}_3\in\mathbb{R}^{d-1} \\ k_4,\dots,k_n\in\mathbb{R}^d}}\int_{\mathbb{R}^2}\frac{\bigl|\bigl((\sum_{l=2}^{n}k_l)^2-m_0^2\bigr)(k_2^2-m_0^2)(k_3^2-m_0^2)\bigr|^{-\alpha}}{(1+(k_2^0)^2)(1+(k_3^0)^2)}\,dk_2^0\,dk_3^0. \tag{22}
\end{align*}
Clearly the first factor in (22) is finite. It remains to show that the remaining two factors are also finite. Since $|k^2-m_0^2|^{-\alpha}=|k^0+\omega|^{-\alpha}|k^0-\omega|^{-\alpha}$, where $\omega=(|\mathbf{k}|^2+m_0^2)^{\frac{1}{2}}$ and therefore $\omega\ge m_0$, we get that
\[ |k^2-m_0^2|^{-\alpha} \le |m_0(k^0+\omega)|^{-\alpha}+|m_0(k^0-\omega)|^{-\alpha}. \tag{23} \]
We set $\omega_1 := (|\sum_{l=2}^{n}\mathbf{k}_l|^2+m_0^2)^{\frac{1}{2}}$, $\omega_l := (|\mathbf{k}_l|^2+m_0^2)^{\frac{1}{2}}$ for $l=2,\dots,n$. By (23) the integral in the second factor in Eq. (22) can be estimated by $m_0^{-\alpha}$ times two integrals of the following kind:
\begin{align*}
\int_{\mathbb{R}}\frac{|k^0\pm\omega|^{-\alpha}}{1+(k^0)^2}\,dk^0 &= \int_{\{|k^0\pm\omega|<1\}}\frac{|k^0\pm\omega|^{-\alpha}}{1+(k^0)^2}\,dk^0+\int_{\{|k^0\pm\omega|>1\}}\frac{|k^0\pm\omega|^{-\alpha}}{1+(k^0)^2}\,dk^0 \\
&< \frac{2}{1-\alpha}+\int_{\mathbb{R}}\frac{1}{1+(k^0)^2}\,dk^0 < \infty.
\end{align*}
Here the latter estimate is independent of $\mathbf{k}\in\mathbb{R}^{d-1}$. Consequently the second factor in (22) is also finite. It remains to deal with the third factor. Again by (23) the integral in the third factor of (22) can be dominated by $m_0^{-\alpha}$ times eight integrals of the type
\[ \int_{\mathbb{R}^2}\frac{\bigl|\bigl((\sum_{l=2}^{n}k_l^0)\pm\omega_1\bigr)(k_2^0\pm\omega_2)(k_3^0\pm\omega_3)\bigr|^{-\alpha}}{(1+(k_2^0)^2)(1+(k_3^0)^2)}\,dk_2^0\,dk_3^0. \]
Therefore, to prove that the third factor in (22) is finite it is sufficient to show that
\[ \sup_{a,b,c\in\mathbb{R}}\int_{\mathbb{R}^2}\frac{|xy(x+y+c)|^{-\alpha}}{(1+(x+a)^2)(1+(y+b)^2)}\,dx\,dy < \infty. \tag{24} \]
To prove (24), we set $t := y+c$; then we get
\[ \int_{\mathbb{R}}\frac{|x(x+t)|^{-\alpha}}{1+(x+a)^2}\,dx = \int_{\mathbb{R}}\frac{|(x'+\frac{t}{2})(x'-\frac{t}{2})|^{-\alpha}}{1+(x'-\frac{t}{2}+a)^2}\,dx'. \tag{25} \]
For the case that $|t|>2$, the RHS of (25) is smaller than
\[ 2\int_0^\infty\frac{|x'-\frac{|t|}{2}|^{-\alpha}}{1+(x'-\frac{|t|}{2})^2}\,dx' < 2\int_{\mathbb{R}}\frac{|x''|^{-\alpha}}{1+x''^2}\,dx'' < \infty \]
independently of $a\in\mathbb{R}$ and the value of $|t|>2$. We now let $0<|t|\le 2$. In this case the RHS of (25), independently of $a\in\mathbb{R}$, is smaller than
\[ \int_{-2}^{2}\Bigl|\Bigl(x'+\frac{t}{2}\Bigr)\Bigl(x'-\frac{t}{2}\Bigr)\Bigr|^{-\alpha}dx'+\int_{\mathbb{R}}\frac{1}{1+x''^2}\,dx''. \]
Here the second integral is finite. For any $\gamma\in(0,\frac{1}{2})$ the first integral can be further estimated by
\[ 2^{1-\gamma}|t|^{-\gamma}\int_0^2\Bigl|x'-\frac{|t|}{2}\Bigr|^{-2\alpha+\gamma}dx' \le 2^{1-\gamma}|t|^{-\gamma}\int_{-1}^{2}|x''|^{-2\alpha+\gamma}\,dx''. \]
Since $2\alpha-\gamma<1$, the integral on the RHS of the above inequality is finite and thus the RHS of (25) is smaller than $C_1+C_2|t|^{-\gamma}$ for sufficiently large constants $C_1,C_2>0$, which can be chosen independently of the parameter $a\in\mathbb{R}$. We can therefore estimate the left-hand side (LHS) of (24) by
\begin{align*}
&\sup_{b,c\in\mathbb{R}}\int_{\mathbb{R}}\frac{(C_1+C_2|y+c|^{-\gamma})|y|^{-\alpha}}{1+(y+b)^2}\,dy \\
&\quad\le C_1\sup_{b\in\mathbb{R}}\int_{\mathbb{R}}\frac{|y|^{-\alpha}}{1+(y+b)^2}\,dy+C_2\sup_{b,c\in\mathbb{R}}\int_{\mathbb{R}}\frac{|y+c|^{-\alpha-\gamma}+|y|^{-\alpha-\gamma}}{1+(y+b)^2}\,dy.
\end{align*}
Here the first integral on the RHS of the above inequality is smaller than
\[ \frac{2}{1-\alpha}+\int_{\mathbb{R}}\frac{1}{1+y'^2}\,dy' < \infty, \]
and this estimate is independent of $b\in\mathbb{R}$. The second one is dominated by the following constant:
\[ \frac{4}{1-\alpha-\gamma}+2\int_{\mathbb{R}}\frac{1}{1+y'^2}\,dy', \]
which is independent of $b,c\in\mathbb{R}$ and is finite since $\alpha+\gamma<1$. Thus we have established (24), which was the missing step in the proof of the truncated Hilbert space structure condition (11).

4.2. Proof of Theorem 4.1 for the vector models. As in the preceding subsection we want to prove that the requirements of Corollary 2.5 are fulfilled by the Fourier transformed truncated Wightman functions $\hat{W}_n^T$ of the vector models described in Sect. 3. To this aim, we denote the Fourier transform of the partial differential operators $L^n$ by $M^n$. $M^n$ is a tensor valued multiplication operator mapping $\mathcal{S}(\mathbb{R}^{4n},\mathbb{C}^{4^n})$ to $\mathcal{S}(\mathbb{R}^{4n},\mathbb{C})$. Since $L^n$ is a first order partial differential operator in the variables $x_1,\dots,x_n\in\mathbb{R}^4$, each component of $M^n$ is a polynomial of degree 1 in each of the variables $k_1,\dots,k_n$, which are conjugate to $x_1,\dots,x_n$ under the Fourier transform. Thus, for $K,N\in\mathbb{N}_0$ there exists a constant $C_1>0$ such that
\[ \|M^n\varphi\|_{K,N} \le C_1\|\varphi\|_{K,N+1}, \quad \forall\varphi\in\mathcal{S}(\mathbb{R}^{4n},\mathbb{C}^{4^n}). \tag{26} \]
By application of the Leibniz rule and the above estimate, we get that there exists a constant $C_2>0$ such that also the following inequality holds:
\[ \Bigl\|\Bigl(\frac{\partial}{\partial k_j^0}-\frac{\partial}{\partial k_{j+1}^0}\Bigr)M^n\varphi\Bigr\|_{K,N} \le C_2\|\varphi\|_{K+1,N+1}, \quad \forall\varphi\in\mathcal{S}(\mathbb{R}^{4n},\mathbb{C}^{4^n}), \tag{27} \]
for $j=1,\dots,n-1$. Now let again $n\ge 3$ and $\varphi\in\mathcal{S}(\mathbb{R}^{4n},\mathbb{C}^{4^n})$. From (21) we get
\begin{align*}
\hat{W}_n^T(\varphi) &= \langle\hat{G}_n,M^n\varphi\rangle \\
&= \langle M_0^n,M^n\varphi\rangle+\sum_{j=1}^{n-1}\Bigl\langle M_j^n,\Bigl(\frac{\partial}{\partial k_j^0}-\frac{\partial}{\partial k_{j+1}^0}\Bigr)M^n\varphi\Bigr\rangle+\langle M_n^n,M^n\varphi\rangle.
\end{align*}
Taking into account the inequalities (26) and (27) we get that
\[ |\hat{W}_n^T(\varphi)| \le a_n\|\varphi\|_{K+1,N+1}, \quad \forall\varphi\in\mathcal{S}(\mathbb{R}^{4n},\mathbb{C}^{4^n}), \]
for the constant $a_n := \max\{C_1,C_2\}\sum_{j=0}^{n}C_j^n$, if the measures $M_j^n$, $j=1,\dots,n$, fulfill the conditions
\[ |M_j^n(\varphi)| \le C_j^n\|\varphi\|_{K,N}, \quad \forall\varphi\in\mathcal{S}(\mathbb{R}^{4n},\mathbb{C}^{4^n}), \tag{28} \]
for sufficiently large constants $C_j^n>0$. Thus, if we can choose $K,N\in\mathbb{N}_0$ in (28) independently of $n,j$, the truncated Hilbert space condition of Corollary 2.5 holds. Let $K=0$, $N=3$. We first prove (28) for $j=0$: By (20) we get that
\[ |M_0^n(\varphi)| \le (2\pi)^{3-n}2^{-n}\int_{\mathbb{R}^{3n-3}}\frac{|\vec{k}_1+\cdots+\vec{k}_n|^{-1}}{|\vec{k}_2|+\cdots+|\vec{k}_n|+|\vec{k}_2+\cdots+\vec{k}_n|}\ \prod_{l=2}^{n}\frac{|\vec{k}_l|^{-1}}{(1+|\vec{k}_l|^2)^{3/2}}\ \bigotimes_{l=2}^{n}d\vec{k}_l\ \|\varphi\|_{0,3}. \]
We have to show that the integral on the RHS is finite. Since $|\vec{k}_2|+\cdots+|\vec{k}_n|+|\vec{k}_2+\cdots+\vec{k}_n|\ge|\vec{k}_3|$, the integral is smaller than the following expression:
\[ \Bigl(\int_{\mathbb{R}^3}\frac{|\vec{k}|^{-1}}{(1+|\vec{k}|^2)^{3/2}}\,d\vec{k}\Bigr)^{n-3}\int_{\mathbb{R}^3}\frac{|\vec{k}|^{-2}}{(1+|\vec{k}|^2)^{3/2}}\,d\vec{k}\ \times\Bigl(\sup_{\vec{a}\in\mathbb{R}^3}\int_{\mathbb{R}^3}\frac{|\vec{k}+\vec{a}|^{-1}|\vec{k}|^{-1}}{(1+|\vec{k}|^2)^{3/2}}\,d\vec{k}\Bigr) \tag{29} \]
(as seen by taking $\vec{k}_2$ as first integration variable, setting $\vec{a}\equiv\vec{k}_1+\vec{k}_3+\cdots+\vec{k}_n$ and majorizing by the supremum over $\vec{a}\in\mathbb{R}^3$). Let us consider the first two factors, i.e. we let $\gamma=1,2$ and calculate
\[ \int_{\mathbb{R}^3}\frac{|\vec{k}|^{-\gamma}}{(1+|\vec{k}|^2)^{3/2}}\,d\vec{k} = 4\pi\int_0^\infty\frac{\lambda^{2-\gamma}}{(1+\lambda^2)^{3/2}}\,d\lambda < \infty, \]
since $2-\gamma\ge 0$ and $3-2+\gamma>1$. It remains to show that also the third factor in (29) is finite. For the moment we fix $\vec{a}\in\mathbb{R}^3$ and choose orthogonal coordinates such that $\vec{a}=(a,0,0)$. Let $\lambda := (|k^2|^2+|k^3|^2)^{\frac{1}{2}}$. Using Fubini's theorem we get that the integral in the last factor is smaller than
\[ 2\pi\int_0^\infty\frac{\lambda\,d\lambda}{(1+\lambda^2)^{3/2}}\int_{\mathbb{R}}(|k^1+a|^2+\lambda^2)^{-\frac{1}{2}}(|k^1|^2+\lambda^2)^{-\frac{1}{2}}\,dk^1. \]
Clearly $(|k^1+a|^2+\lambda^2)^{-\frac{1}{2}}$ and $(|k^1|^2+\lambda^2)^{-\frac{1}{2}}$ belong to $L^2(\mathbb{R},dk^1)$ for $\lambda>0$. By the Cauchy–Schwarz inequality we can dominate the inner integral by $\int_{\mathbb{R}}(|k^1|^2+\lambda^2)^{-1}\,dk^1=\pi\lambda^{-1}$. Therefore, the above expression is smaller than
\[ 2\pi^2\int_{\mathbb{R}}\frac{d\lambda}{(1+\lambda^2)^{3/2}} < \infty \]
independently of $\vec{a}\in\mathbb{R}^3$. The estimate $|M_n^n(\varphi)|\le C_n^n\|\varphi\|_{0,3}$ for a sufficiently large $C_n^n>0$ can be proved analogously.
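The Cauchy–Schwarz step for the inner one-dimensional integral can be checked numerically. The sketch below is only a sanity check under stated assumptions: the integral over $\mathbb{R}$ is truncated to a large finite interval and evaluated with a plain midpoint rule, and the function name, the truncation width, the step count and the sample values of $a$ and $\lambda$ are hypothetical choices, not taken from the text.

```python
import math

def inner_integral(a, lam, half_width=2000.0, steps=200_000):
    # Midpoint-rule approximation of
    #   int (|k + a|^2 + lam^2)^(-1/2) * (k^2 + lam^2)^(-1/2) dk
    # over the truncated interval [-half_width, half_width].
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        k = -half_width + (i + 0.5) * h
        total += 1.0 / math.sqrt(((k + a) ** 2 + lam * lam) * (k * k + lam * lam))
    return total * h

# Compare with the bound pi / lam for a few assumed sample values.
for a in (0.0, 3.0):
    for lam in (0.5, 2.0):
        print(a, lam, inner_integral(a, lam), math.pi / lam)
```

For $a=0$ the two factors coincide and the bound $\pi\lambda^{-1}$ is attained (up to the truncation of the integration range); for $a\neq 0$ the integral is strictly smaller, which is what makes the estimate uniform in $a$.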
Let us therefore consider the case $j=1,\dots,n-1$. From the representation (20) we get the estimate
\[ |M_j^n(\varphi)| \le (2\pi)^{3-n}2^{-n}\int_{\mathbb{R}^{3n-3}}\frac{|\sum_{l=1,l\neq j}^{n}\vec{k}_l|^{-1}|\vec{k}_{j+1}|^{-1}}{(1+|\vec{k}_{j+1}|^2)^{3/2}}\ \times\prod_{l=1,l\neq j}^{n}\frac{|\vec{k}_l|^{-1}}{(1+|\vec{k}_l|^2)^{3/2}}\ \bigotimes_{l=1,l\neq j}^{n}d\vec{k}_l\ \|\varphi\|_{0,3} \]
for $\varphi\in\mathcal{S}(\mathbb{R}^{dn},\mathbb{C})$. The integral can be dominated by the expression
\[ \Bigl(\int_{\mathbb{R}^3}\frac{|\vec{k}|^{-1}}{(1+|\vec{k}|^2)^{3/2}}\,d\vec{k}\Bigr)^{n-2}\Bigl(\sup_{\vec{a}\in\mathbb{R}^3}\int_{\mathbb{R}^3}\frac{|\vec{k}+\vec{a}|^{-1}|\vec{k}|^{-1}}{(1+|\vec{k}|^2)^{3/2}}\,d\vec{k}\Bigr), \]
which is finite by the above calculations. This completes the proof of the truncated Hilbert space structure condition for the truncated Wightman functions of the vector models.
Acknowledgement. Stimulating discussions of the first named author with R. Gielerak, K. Iwata and T. Kolsrud are gratefully acknowledged. The financial support of D.F.G. (SFB 237) is also gratefully acknowledged. We also would like to thank the referee for very helpful comments and corrections on a previous version of this paper.
Note added in proof. After submitting the paper we achieved the construction of non trivial scattering amplitudes for the models discussed in the present work, see S. Albeverio, H. Gottschalk, J.-L. Wu: Non trivial scattering amplitudes for some local relativistic quantum field models with indefinite metrics. Bochum. Preprint, March 1997.
References
1. Albeverio, S., Gottschalk, H., Wu, J.-L.: Euclidean random fields, pseudodifferential operators, and Wightman functions. In: Stochastic Analysis and Applications (eds. Davies, I.M., Truman, A., Elworthy, K.D.). Singapore: World Scientific, 1996, pp. 20–37
2. Albeverio, S., Gottschalk, H., Wu, J.-L.: Convoluted generalized white noise, Schwinger functions and their analytic continuation to Wightman functions. Rev. Math. Phys. 8, 763–817 (1996)
3. Albeverio, S., Høegh-Krohn, R., Iwata, K.: Covariant Markovian random fields in four space-time dimensions with nonlinear electromagnetic interaction. In: Applications of Self-Adjoint Extensions in Quantum Physics, Proc. Dubna Conf. 1987 (eds. Exner, P., Šeba, P.). Berlin: Springer-Verlag, 1989, pp. 69–83; Albeverio, S., Iwata, K., Kolsrud, T.: Random fields as solutions of the inhomogeneous quaternionic Cauchy–Riemann equation. I. Invariance and analytic continuation. Commun. Math. Phys. 132, 555–580 (1990)
4. Albeverio, S., Wu, J.-L.: Euclidean random fields obtained by convolution from generalized white noise. J. Math. Phys. 36, 5217–5245 (1995)
5. Becker, C.: Reflection positivity for quantum vector fields. In: Stochastic Analysis and Applications (eds. Davies, I.M., Truman, A., Elworthy, K.D.). Singapore: World Scientific, 1996, pp. 76–90
6. Bognár, J.: Indefinite Inner Product Spaces. Berlin–Heidelberg–New York: Springer-Verlag, 1974
7. Bogoliubov, N. N., Logunov, A. A., Todorov, R. T.: Introduction to Axiomatic Quantum Field Theory. Reading: Benjamin, 1975 (Translation and revision of original publication in 1969)
8. Gelfand, I.M., Vilenkin, N. Ya.: Generalized Functions. II and IV. New York–London: Academic Press, 1964
9. Gottschalk, H.: Die Momente gefalteten Gauss-Poissonschen Weißen Rauschens als Schwingerfunktionen. Diplomarbeit, Bochum 1995
10. Hida, T., Kuo, H.-H., Potthoff, J., Streit, L.: White Noise: An Infinite Dimensional Calculus. Dordrecht–Boston–London: Kluwer Academic, 1993
11. Jakobczyk, L., Strocchi, F.: Euclidean formulation of quantum field theory without positivity. Commun. Math. Phys. 119, 529–541 (1988)
12. Jost, R.: The General Theory of Quantized Fields. Providence, RI: AMS, 1965
13. Morchio, G., Strocchi, F.: Infrared singularities, vacuum structure and pure phases in local quantum field theory. Ann. Inst. H. Poincaré A33, 251–282 (1980)
14. Nelson, E.: Construction of quantum fields from Markoff fields. J. Funct. Anal. 12, 97–112 (1973)
15. Osterwalder, K., Schrader, R.: Axioms for Euclidean Green's functions. I. Commun. Math. Phys. 31, 83–112 (1973); II. Commun. Math. Phys. 42, 281–305 (1975)
16. Reed, M., Simon, B.: Methods of Modern Mathematical Physics. I. Functional Analysis. New York: Academic Press, 1972; II. Fourier Analysis, Self-Adjointness. New York: Academic Press, 1975; III. Scattering Theory. New York: Academic Press, 1978
17. Simon, B.: The P(φ)_2 Euclidean (Quantum) Field Theory. Princeton, NJ: Princeton University Press, 1975
18. Streater, R. F., Wightman, A. S.: PCT, Spin and Statistics, and All That. New York: Benjamin, 1964
19. Strocchi, F.: Selected Topics on the General Properties of Quantum Field Theory. Lect. Notes in Physics 51. Singapore–New York–London–Hong Kong: World Scientific, 1993
Communicated by H. Araki
Commun. Math. Phys. 184, 533 – 566 (1997)
Communications in Mathematical Physics © Springer-Verlag 1997
Quantum Stochastic Positive Evolutions: Characterization, Construction, Dilation
V. P. Belavkin
Mathematics Department, University of Nottingham, University Park, Nottingham NG7 2RD, United Kingdom
Received: 20 November 1995 / Accepted: 3 September 1996
Abstract: A characterization of the unbounded stochastic generators of quantum completely positive flows is given. This suggests the general form of quantum stochastic adapted evolutions with respect to the Wiener (diffusion), Poisson (jumps), or general Quantum Noise. The corresponding irreversible Heisenberg evolution in terms of stochastic completely positive (CP) maps is constructed. The general form and the dilation of the stochastic completely dissipative (CD) equation over the algebra L (H) is discovered, as well as the unitary quantum stochastic dilation of the subfiltering and contractive flows with unbounded generators. A unitary quantum stochastic cocycle, dilating the subfiltering CP flows over L (H), is reconstructed.
Introduction
In the quantum theory of open systems there is the well-known Lindblad form [1] of the quantum Markovian master equation, satisfied by a one-parameter semigroup of completely positive (CP) maps. This nonstochastic equation is obtained by averaging the stochastic Langevin equation for quantum diffusion [2] over the driving quantum noises. On the other hand, the quantum Langevin equation is satisfied by a quantum stochastic process of dynamical representations, which are obviously completely positive due to the *-multiplicativity of the homomorphisms describing these representations. The homomorphisms give examples of pure, i.e. extreme-point, CP maps, but they are not the only extreme points of the convex cone of all CP maps. This makes it possible to construct dynamical semigroups by averaging pure, i.e. non-mixing, irreversible quantum stochastic CP dynamics which cannot be driven by a Langevin equation. Examples of such dynamics, recently found in many physical applications, will be considered in the first section. The rest of the paper will be devoted to the mathematical derivation of the general structure for the quantum stochastic CP
evolutions and the corresponding equations. The results of the paper not only generalize the Evans–Hudson (EH) flows [2] from homomorphism-valued maps to general CP maps, but also prove the existence of homomorphic dilations for the subfiltering and contractive CP flows. Here in the introduction we would like to outline this structure on the formal level.

The initial purpose of this paper was to extend the Evans–Lewis differential analog [3] of the Stinespring dilation [4] for CP semigroups to the stochastic differentials generating an Itô $*$-algebra
$$d\Lambda(a)^* d\Lambda(a) = d\Lambda(a^\star a), \qquad \sum_i \lambda_i\, d\Lambda(a_i) = d\Lambda\Big(\sum_i \lambda_i a_i\Big), \qquad d\Lambda(a)^* = d\Lambda(a^\star), \tag{0.1}$$
with given mean values $\langle d\Lambda(t,a)\rangle = l(a)\,dt$, $a \in \mathfrak{a}$. Here $\mathfrak{a}$ is in general a noncommutative $\star$-algebra with a self-adjoint annihilator (death) $d = d^\star$, $\mathfrak{a}d = 0$, corresponding to $dt = d\Lambda(t,d)$, and $l : \mathfrak{a} \to \mathbb{C}$ is a positive, $l(a^\star a) \ge 0$, linear functional, normalized as $l(d) = 1$, corresponding to the determinism $\langle dt\rangle = dt$. The functional $l$ defines the GNS representation $a \mapsto \mathbf{a} = (a^\mu_\nu)^{\mu=-,\bullet}_{\nu=+,\bullet}$ of $\mathfrak{a}$ in terms of the quadruples
$$a^\bullet_\bullet = j(a), \qquad a^\bullet_+ = k(a), \qquad a^-_\bullet = k^*(a), \qquad a^-_+ = l(a), \tag{0.2}$$
where $j(a^\star a) = j(a)^* j(a)$ is the operator representation $j(a)^* k(a) = k(a^\star a)$ on the pre-Hilbert space $E = k(\mathfrak{a})$ of the Kolmogorov decomposition $l(a^\star a) = k(a)^* k(a)$, and $k^*(a) = k(a^\star)^*$. As was proved in [5], a quantum stochastic stationary process $t \in \mathbb{R}_+ \mapsto \Lambda(t,a)$, $a \in \mathfrak{a}$, with $\Lambda(0,a) = 0$ and independent increments $d\Lambda(t,a) = \Lambda(t+dt,a) - \Lambda(t,a)$, forming an Itô $\star$-algebra, can be represented in the Fock space $\mathcal{F}$ over the space of $E$-valued square-integrable functions on $\mathbb{R}_+$ as $\Lambda(t,a) = a^\mu_\nu \Lambda^\nu_\mu(t)$. Here
$$a^\mu_\nu \Lambda^\nu_\mu(t) = a^\bullet_\bullet \Lambda^\bullet_\bullet(t) + a^\bullet_+ \Lambda^+_\bullet(t) + a^-_\bullet \Lambda^\bullet_-(t) + a^-_+ \Lambda^+_-(t) \tag{0.3}$$
is the canonical decomposition of $\Lambda$ into the exchange $\Lambda^\bullet_\bullet$, creation $\Lambda^+_\bullet$, annihilation $\Lambda^\bullet_-$ and preservation (time) $\Lambda^+_- = tI$ processes of quantum stochastic calculus [6], [7], having the mean values $\langle\Lambda^\nu_\mu(t)\rangle = t\,\delta^\nu_+ \delta^-_\mu$ with respect to the vacuum state in $\mathcal{F}$. Thus the parametrizing algebra $\mathfrak{a}$ can always be identified with a $\star$-subalgebra of the algebra $\mathcal{Q}(E)$ of all quadruples $\mathbf{a} = (a^\mu_\nu)^{\mu=-,\bullet}_{\nu=+,\bullet}$, where $a^\mu_\nu : E_\nu \to E_\mu$ are linear operators on $E_\bullet = E$, $E_+ = \mathbb{C} = E_-$, having the adjoints $a^{\mu *}_\nu E_\mu \subseteq E_\nu$, with the Hudson–Parthasarathy (HP) multiplication table [8]
$$\mathbf{a} \bullet \mathbf{b} = \big(a^\mu_\bullet b^\bullet_\nu\big)^{\mu=-,\bullet}_{\nu=+,\bullet}, \tag{0.4}$$
the unique death $\mathbf{d} = (\delta^\mu_- \delta^+_\nu)^{\mu=-,\bullet}_{\nu=+,\bullet}$, and the involution $a^{\star\mu}_{-\nu} = a^{\nu *}_{-\mu}$, where $-(-) = +$, $-\bullet = \bullet$, $-(+) = -$.

The stochastic differential of a CP flow $\phi = (\phi_t)_{t>0}$ over an operator algebra $\mathcal{B}$ is written in terms of the quantum canonical differentials as $d\phi = \phi \circ \lambda^\mu_\nu\, d\Lambda^\nu_\mu$ with $\phi_0 = \imath$ at $t = 0$, where $\imath(B) = B$ is the identical representation of $\mathcal{B}$. The main result of this paper is the construction of CP flows and their filtering dilation to the HP flows, based on the linear quantum stochastic evolution equation of the form
$$d\phi_t(B) + \phi_t\big(K^*B + BK - L^*\jmath(B)L\big)dt = \phi_t\big(L^\bullet\jmath(B)L_\bullet - B\otimes\delta^\bullet_\bullet\big)d\Lambda^\bullet_\bullet + \phi_t\big(L^\bullet\jmath(B)L - K^\bullet B\big)d\Lambda^+_\bullet + \phi_t\big(L^*\jmath(B)L_\bullet - BK_\bullet\big)d\Lambda^\bullet_-, \tag{0.5}$$
where $\jmath$ is an operator representation of $\mathcal{B}$, $\delta^\bullet_\bullet$ is the identity operator in $E$, and the operator $K$ satisfies the conservativity condition $K + K^* = L^*L$ for the deterministic generator $\lambda = \lambda^-_+$. This form of the equation was discovered in [9] as a result of the general CP differential structure $\lambda(B) = L^*\jmath(B)L - K^*B - BK$ of the bounded quantum stochastic generators $\boldsymbol{\lambda} = (\lambda^\mu_\nu)^{\mu=-,\bullet}_{\nu=+,\bullet}$ over a von Neumann algebra $\mathcal{B}$, even in the nonlinear case. The dilation of the stochastic differentials for CP processes over arbitrary $*$-algebras, giving this structure for the bounded generators as a consequence of the Christensen–Evans theorem [10], was constructed in [11]. Here we shall prove that such a quantum stochastic extension of Lindblad's structure $\lambda(B) = L^*\jmath(B)L - K^*B - BK$ can always be used for the construction and the dilation of the CP flows also in the case of unbounded maps $\lambda^\mu_\nu : \mathcal{B} \to \mathcal{B}$ over the algebra $\mathcal{B} = \mathcal{L}(\mathcal{H})$ of all operators in a Hilbert space $\mathcal{H}$. We shall prove that this structure is necessary at least in the case of w*-continuous generators which are extendable to covariant ones over the algebra of all bounded operators $\mathcal{L}(\mathcal{H})$. The existence of a minimal CP solution, which is constructed under certain continuity conditions, proves that this structure is also sufficient for the CP property of any solution to this stochastic equation. The construction of the differential dilations and the CP solutions of such quantum stochastic differential equations with bounded generators over the simple finite-dimensional Itô algebra $\mathfrak{a} = \mathcal{Q}(E)$ and arbitrary $\mathcal{B} \subseteq \mathcal{L}(\mathcal{H})$ was recently discussed in [12, 13] (the latter paper also contains a characterization of the bounded generators for the contractive CP flows).

The Evans–Lewis case $\Lambda(t,a) = \alpha t I$ is described by the simplest one-dimensional Itô algebra $\mathfrak{a} = \mathbb{C}d$ with $l(a) = \alpha \in \mathbb{C}$, $\alpha^\star = \alpha^*$ and the nilpotent multiplication $\alpha^\star\alpha = 0$, corresponding to the non-stochastic (Newton) calculus $(dt)^2 = 0$ in $E = 0$.
The standard Wiener process $Q = \Lambda^\bullet_- + \Lambda^+_\bullet$ in Fock space is described by the second order nilpotent algebra $\mathfrak{a}$ of pairs $a = (\alpha, \xi)$ with $d = (1, 0)$, $\xi \in \mathbb{C}$, represented by the quadruples $a^-_+ = \alpha$, $a^-_\bullet = \xi = a^\bullet_+$, $a^\bullet_\bullet = 0$ in $E = \mathbb{C}$, corresponding to $\Lambda(t,a) = \alpha t I + \xi Q(t)$. The unital $\star$-algebra $\mathbb{C}$ with the usual multiplication $\zeta^\star\zeta = |\zeta|^2$ can be embedded into the two-dimensional Itô algebra $\mathfrak{a}$ of $a = (\alpha, \zeta)$, $\alpha = l(a)$, $\zeta \in \mathbb{C}$, as $a^\bullet_\bullet = \zeta$, $a^\bullet_+ = +i\zeta$, $a^-_\bullet = -i\zeta$, $a^-_+ = \zeta$. It corresponds to $\Lambda(t,a) = \alpha t I + \zeta P(t)$, where $P = \Lambda^\bullet_\bullet + i\,(\Lambda^+_\bullet - \Lambda^\bullet_-)$ is the representation of the standard Poisson process, compensated by its mean value $t$. Thus our results are applicable also to the classical stochastic differentials of completely positive processes, corresponding to the commutative Itô algebras, which are decomposable into the Wiener, Poisson and Newton orthogonal components.
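The HP multiplication table (0.4) can be checked by hand in the simplest case E = C. The following is only an illustrative sketch: the encoding of a quadruple as a 3×3 matrix with rows and columns ordered (−, •, +), and the helper names `quad`, `mul` and `add`, are assumptions made for this example and are not notation from the paper. Under this encoding the ordinary matrix product reproduces the rule that only the •-index is summed over, and one can verify that the Wiener quadruple squares to the death d, i.e. (dQ)² = dt, while the compensated Poisson quadruple p satisfies p • p = p + d, i.e. (dP)² = dP + dt.

```python
# Sketch: HP quadruples over E = C encoded as 3x3 matrices (assumed encoding).
# Row/column order is (-, bullet, +); only four entries may be non-zero.

def quad(a_mp=0, a_mb=0, a_bb=0, a_bp=0):
    """Build the matrix of a quadruple.

    a_mp = a^-_+ (time), a_mb = a^-_bullet (annihilation),
    a_bb = a^bullet_bullet (exchange), a_bp = a^bullet_+ (creation).
    """
    return [[0, a_mb, a_mp],
            [0, a_bb, a_bp],
            [0, 0, 0]]

def mul(a, b):
    """Ordinary 3x3 matrix product; on quadruples it sums over the bullet index only."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def add(a, b):
    """Entrywise sum of two quadruples."""
    return [[a[i][j] + b[i][j] for j in range(3)] for i in range(3)]

d = quad(a_mp=1)                       # death: corresponds to dt
q = quad(a_mb=1, a_bp=1)               # Wiener: annihilation + creation
p = quad(a_bb=1, a_bp=1j, a_mb=-1j)    # compensated Poisson

wiener_ok = mul(q, q) == d             # (dQ)^2 = dt
poisson_ok = mul(p, p) == add(p, d)    # (dP)^2 = dP + dt
nilpotent_ok = mul(d, d) == quad()     # (dt)^2 = 0
print(wiener_ok, poisson_ok, nilpotent_ok)
```

The matrix form is chosen only because it makes the product rule a one-liner; any representation that contracts the lower index of the left factor with the upper index of the right factor over • alone would serve equally well.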
1. Quantum Filtering Dynamics
The quantum filtering theory, which was outlined in [14, 15] and has been developed since then [16], provides derivations of new types of irreversible stochastic equations for quantum states, giving a dynamical solution to the well-known quantum measurement problem. Some particular types of such equations have been considered recently in the phenomenological theories of quantum permanent reduction [17, 18], continuous measurement collapse [19, 20], spontaneous jumps [27, 21], diffusions and localizations [22, 23]. The main feature of such dynamics is that the reduced irreversible evolution can be described in terms of a linear dissipative stochastic wave equation, the solution to which is normalized only in the mean square sense.
The simplest dynamics of this kind is described by the continuous filtering wave propagators $V_t(\omega)$, defined on the space $\Omega$ of all Brownian trajectories as an adapted operator-valued stochastic process in the system Hilbert space $\mathcal{H}$, satisfying the stochastic diffusion equation
$$dV_t + KV_t\,dt = LV_t\,dQ, \qquad V_0 = I \tag{1.1}$$
in the Itô sense. Here $Q(t,\omega)$ is the standard Wiener process, which is described by the independent increments $dQ(t) = Q(t+dt) - Q(t)$, having zero mean values $\langle dQ\rangle = 0$ and the multiplication property $(dQ)^2 = dt$; $K$ is an accretive operator, $K + K^\dagger \ge L^*L$, defined on a dense domain $\mathcal{D} \subseteq \mathcal{H}$, with $K^\dagger = K^*|\mathcal{D}$, and $L$ is a linear operator $\mathcal{D} \to \mathcal{H}$. This stochastic wave equation was first derived [25] from a unitary cocycle evolution by a quantum filtering procedure. A sufficient analyticity condition, under which it has a unique solution in the form of a stochastic multiple integral even in the case of unbounded $K$ and $L$, is given in the Appendix. Using the Itô formula
$$d\big(V_t^*V_t\big) = dV_t^*\,V_t + V_t^*\,dV_t + dV_t^*\,dV_t, \tag{1.2}$$
and averaging $\langle\cdot\rangle$ over the trajectories of $Q$, one obtains $\langle V_t^*V_t\rangle \le I$ as a consequence of $d\langle V_t^*V_t\rangle \le 0$. Note that the process $V_t$ is not necessarily unitary if the filtering condition $K^\dagger + K = L^*L$ holds, and even if $L^\dagger = -L$, it might be only isometric, $V_t^*V_t = I$, in the unbounded case.

Another type of filtering wave propagator $V_t(\omega) : \psi_0 \in \mathcal{H} \mapsto \psi_t(\omega)$ in $\mathcal{H}$ is given by the stochastic jump equation
$$dV_t + KV_t\,dt = LV_t\,dP, \qquad V_0 = I, \tag{1.3}$$
at the random time instants $\omega = \{t_1, t_2, \ldots\}$. Here $L = J - I$ is the jump operator, corresponding to the stationary discontinuous evolutions $\psi_{t+} = J\psi_t$ at $t \in \omega$, and $P(t,\omega)$ is the standard Poisson process, counting the number $|\omega \cap [0,t)|$ compensated by its mean value $t$. It is described as the process with independent increments $dP(t) = P(t+dt) - P(t)$, having the values $\{0, 1\}$ at $dt \to 0$, with zero mean $\langle dP\rangle = 0$, and the multiplication property $(dP)^2 = dP + dt$. This stochastic wave equation was first derived in [24] by conditioning with respect to the spontaneous reductions $J : \psi_t \mapsto \psi_{t+}$. An analyticity condition under which it has a unique solution in the form of a multiple stochastic integral even in the case of unbounded $K$ and $L$ is also given in the Appendix. Using the Itô formula (1.2) with $dV_t^*\,dV_t = V_t^*L^*LV_t\,(dP + dt)$, one can obtain
$$d\big(V_t^*V_t\big) = V_t^*\big(L^*L - K - K^\dagger\big)V_t\,dt + V_t^*\big(L^\dagger + L + L^*L\big)V_t\,dP.$$
Averaging $\langle\cdot\rangle$ over the trajectories of $P$, one can easily find that $d\langle V_t^*V_t\rangle \le 0$ under the sub-filtering condition $L^*L \le K + K^\dagger$. Such an evolution need not be unitary even if $L^*L = K + K^\dagger$, but it might be isometric, $V_t^*V_t = I$, if the jumps are isometric, $J^*J = I$.

This proves in both cases that the stochastic wave function $\psi_t(\omega) = V_t(\omega)\psi_0$ is not normalized for each $\omega$, but it is normalized in the mean square sense to the survival probability $\langle\|\psi_t\|^2\rangle \le \|\psi_0\|^2 = 1$ for the quantum system not to be demolished during its observation up to the time $t$. If $\langle\|\psi_t\|^2\rangle = 1$, then the positive stochastic function $\|\psi_t(\omega)\|^2$ is the probability density of a diffusive $\hat{Q}$ or counting $\hat{P}$ output process up to the given $t$ with respect to the standard Wiener $Q$ or Poisson $P$ input processes. Using the Itô formula for $\phi_t(B) = V_t^*BV_t$, one can obtain the stochastic equations
$$d\phi_t(B) + \phi_t\big(K^*B + BK - L^*BL\big)dt = \phi_t\big(L^*B + BL\big)dQ, \tag{1.4}$$
$$d\phi_t(B) + \phi_t\big(K^*B + BK - L^*BL\big)dt = \phi_t\big(J^*BJ - B\big)dP, \tag{1.5}$$
describing the stochastic evolution $Y_t = \phi_t(B)$ of a bounded system operator $B \in \mathcal{L}(\mathcal{H})$ as $Y_t(\omega) = V_t(\omega)^*BV_t(\omega)$. The maps $\phi_t : B \mapsto Y_t$ are Hermitian in the sense that $Y_t^* = Y_t$ if $B^* = B$, but, in contrast to the usual Hamiltonian dynamics, are not multiplicative in general, $\phi_t(B^*C) \ne \phi_t(B)^*\phi_t(C)$, even if they are not averaged with respect to $\omega$. Moreover, they are usually not normalized, $R_t(\omega) := \phi_t(\omega, I) \ne I$, although the stochastic positive operators $R_t = V_t^*V_t$ under the filtering condition are usually normalized in the mean, $\langle R_t\rangle = I$, and satisfy the martingale property $\epsilon_t[R_s] = R_t$ for all $s > t$, where $\epsilon_t$ is the conditional expectation with respect to the history of the processes $P$ or $Q$ up to time $t$.

Although the filtering equations (1.3), (1.1) look very different, they can be unified in the form of the quantum stochastic equation
$$dV_t + KV_t\,dt + K^-V_t\,d\Lambda_- = (J - I)V_t\,d\Lambda + L_+V_t\,d\Lambda^+, \tag{1.6}$$
where $\Lambda^+(t)$ is the creation process, corresponding to the annihilation $\Lambda_-(t)$ on the interval $[0,t)$, and $\Lambda(t)$ is the number of quanta on this interval. These canonical quantum stochastic processes, representing the quantum noise with respect to the vacuum state $|0\rangle$ of the Fock space $\mathcal{F}$ over the single-quantum Hilbert space $L^2(\mathbb{R}_+)$ of square-integrable functions of $t \in [0,\infty)$, are formally given in [26] by the integrals
$$\Lambda_-(t) = \int_0^t \Lambda^r_-\,dr, \qquad \Lambda^+(t) = \int_0^t \Lambda^+_r\,dr, \qquad \Lambda(t) = \int_0^t \Lambda^+_r\Lambda^r_-\,dr,$$
where $\Lambda^r_-$, $\Lambda^+_r$ are the generalized quantum one-dimensional fields in $\mathcal{F}$, satisfying the canonical commutation relations
$$\big[\Lambda^r_-, \Lambda^+_s\big] = \delta(s - r)\,I, \qquad \big[\Lambda^r_-, \Lambda^s_-\big] = 0 = \big[\Lambda^+_r, \Lambda^+_s\big].$$
They can be defined by the independent increments with
$$\langle 0|d\Lambda_-|0\rangle = 0, \qquad \langle 0|d\Lambda^+|0\rangle = 0, \qquad \langle 0|d\Lambda|0\rangle = 0 \tag{1.7}$$
and the noncommutative multiplication table
$$d\Lambda\,d\Lambda = d\Lambda, \qquad d\Lambda_-\,d\Lambda = d\Lambda_-, \qquad d\Lambda\,d\Lambda^+ = d\Lambda^+, \qquad d\Lambda_-\,d\Lambda^+ = dt\,I \tag{1.8}$$
with all other products being zero: $d\Lambda\,d\Lambda_- = d\Lambda^+\,d\Lambda = d\Lambda^+\,d\Lambda_- = 0$. The standard Poisson process $P$ as well as the Wiener process $Q$ can be represented in $\mathcal{F}$ by the linear combinations [8]
$$P(t) = \Lambda(t) + i\,\big(\Lambda^+(t) - \Lambda_-(t)\big), \qquad Q(t) = \Lambda^+(t) + \Lambda_-(t), \tag{1.9}$$
so Eq. (1.6) corresponds to the stochastic diffusion equation (1.1) if $J = I$, $L_+ = L = -K^-$, and it corresponds to the stochastic jump equation (1.3) if $J = I + L$, $L_+ = iL = K^-$. The quantum stochastic equation for $\phi_t(B) = V_t^*BV_t$ has the following general form
$$d\phi_t(B) + \phi_t\big(K^*B + BK - L^-BL_+\big)dt = \phi_t\big(J^*BJ - B\big)d\Lambda + \phi_t\big(J^*BL_+ - K_+B\big)d\Lambda^+ + \phi_t\big(L^-BJ - BK^-\big)d\Lambda_-, \tag{1.10}$$
where $L^- = L_+^*$, $K_+^* = K^-$, coinciding with either (1.4) or (1.5) in the particular cases. Equation (1.10) is obtained from (1.6) by using the Itô formula (1.2) with the multiplication table (1.8). The sub-filtering condition $K + K^\dagger \ge L^-L_+$ for Eq. (1.6) defines in both cases the positive operator-valued process $R_t = \phi_t(I)$ as a sub-martingale with $R_0 = I$, or a martingale in the case $K + K^\dagger = L^-L_+$. In the particular case
$$J = S, \qquad K^- = L^-S, \qquad L_+ = SK_+, \qquad S^*S = I,$$
corresponding to the Hudson–Evans flow [2] if $S^* = S^{-1}$, the evolution is isometric and identity preserving, $\phi_t(I) = I$, at least in the case of bounded $K$ and $L$. In the next sections we define a multidimensional analog of the quantum stochastic equation (1.10) and show that the suggested general structure of its generator indeed follows just from the property of complete positivity of the map $\phi_t$ for all $t > 0$ and the normalization condition $\phi_t(I) = R_t$ to a form-valued sub-martingale with respect to the natural filtration of the quantum noise in the Fock space $\mathcal{F}$.
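The mean-square normalization discussed in this section can be illustrated in the scalar case. The sketch below is not taken from the paper: it assumes real scalar K and L in place of operators, an Euler discretization with step dt, and the hypothetical helper names `diffusion_factor`, `jump_factor` and `mean_square`. It computes the exact one-step growth factor of the second moment ⟨|V|²⟩ for the discretized equations (1.1) and (1.3); to first order in dt both factors equal 1 + (L² − 2K) dt, so the mean square is conserved under the filtering condition 2K = L² and decays under the sub-filtering condition 2K > L².

```python
# Sketch: one-step growth factors of E|V|^2 for scalar filtering equations.
# K, L real scalars and the Euler step dt are illustrative assumptions.

def diffusion_factor(K, L, dt):
    """E[(1 - K dt + L dQ)^2] with E[dQ] = 0 and E[dQ^2] = dt (Euler step of (1.1))."""
    return (1 - K * dt) ** 2 + L ** 2 * dt

def jump_factor(K, L, dt):
    """E[(1 - K dt + L dP)^2] for dP = dN - dt, dN in {0, 1}, Prob(dN = 1) = dt."""
    no_jump = (1 - (K + L) * dt) ** 2       # branch dN = 0
    jump = (1 + L - (K + L) * dt) ** 2      # branch dN = 1
    return (1 - dt) * no_jump + dt * jump

def mean_square(factor, K, L, t=1.0, steps=100_000):
    """Iterate the one-step factor to approximate E|V_t|^2 with V_0 = 1."""
    dt = t / steps
    return factor(K, L, dt) ** steps

L0 = 0.8
K_filter = L0 ** 2 / 2          # filtering condition: 2K = L^2
K_sub = L0 ** 2 / 2 + 0.3       # sub-filtering condition: 2K > L^2

for name, f in (("diffusion", diffusion_factor), ("jump", jump_factor)):
    print(name, mean_square(f, K_filter, L0), mean_square(f, K_sub, L0))
```

Working with the second-moment recursion rather than simulated trajectories keeps the example deterministic; a Monte Carlo simulation of individual paths would show the same averages only up to sampling error.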
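2. Quantum Completely Positive Flows
Throughout, the complex pre-Hilbert space $\mathcal{D} \subseteq \mathcal{H}$ is a reflexive Fréchet space, $E \otimes \mathcal{D}$ denotes the projective tensor product ($\pi$-product) with another such space $E$, $\mathcal{D}' \supseteq \mathcal{H}$ denotes the dual space of continuous antilinear functionals $\eta' : \eta \in \mathcal{D} \mapsto \langle\eta|\eta'\rangle$ with respect to the canonical pairing $\langle\eta|\eta'\rangle$, given by $\|\eta\|^2$ for $\eta' = \eta \in \mathcal{H}$, $\mathcal{B}(\mathcal{D})$ denotes the linear space of all continuous sesquilinear forms $\langle\eta|B\eta\rangle$ on $\mathcal{D}$, identified with the continuous linear operators $B : \mathcal{D} \to \mathcal{D}'$ (kernels), $B^\dagger \in \mathcal{B}(\mathcal{D})$ is the Hermitian conjugate form (kernel) $\langle\eta|B^\dagger\eta\rangle = \langle\eta|B\eta\rangle^*$, and $\mathcal{L}(\mathcal{D}) \subseteq \mathcal{B}(\mathcal{D})$ denotes the algebra of all strongly continuous operators $B : \mathcal{D} \to \mathcal{D}$. Any such space $\mathcal{D}$ can be considered as a projective limit with respect to an increasing sequence of Hilbertian norms $\|\cdot\|_p > \|\cdot\|$ on $\mathcal{D}$; for the definitions and properties of these standard topological notions see for example [28]. The space $\mathcal{D}'$ will be equipped with the weak topology induced by its predual (= dual) $\mathcal{D}$, and $\mathcal{B}(\mathcal{D})$ will be equipped with the w*-topology (induced by the predual $\mathcal{B}_*(\mathcal{D}) = \mathcal{D} \otimes \mathcal{D}$), coinciding with the weak topology on each subset bounded with respect to a norm $\|\cdot\|_p$. Any operator $A \in \mathcal{L}(\mathcal{D})$ with $A^\dagger \in \mathcal{L}(\mathcal{D})$ can be uniquely extended to a weakly continuous operator on $\mathcal{D}'$ as $A^{\dagger *}$, denoted again as $A$, where $A^*$ is the dual operator $\mathcal{D}' \to \mathcal{D}'$, $\langle\eta|A^*\eta'\rangle = \langle A\eta|\eta'\rangle$, defining the involution $A \mapsto A^*$ for such continuations $A : \mathcal{D}' \to \mathcal{D}'$. We say that the operator $A$ commutes with a sesquilinear form $B$, $BA = AB$, if $\langle\eta|BA\eta\rangle = \langle A^\dagger\eta|B\eta\rangle$ for all $\eta \in \mathcal{D}$. The commutant $\mathcal{A}^c = \{B \in \mathcal{B}(\mathcal{D}) : [A, B] = 0,\ \forall A \in \mathcal{A}\}$ of an operator $*$-algebra $\mathcal{A} \subseteq \mathcal{L}(\mathcal{D})$ is weakly closed in $\mathcal{B}(\mathcal{D})$, so that the weak closure $\bar{\mathcal{B}} \subseteq \mathcal{B}(\mathcal{D})$ of any $\mathcal{B} \subseteq \mathcal{A}^c$ also commutes with $\mathcal{A}$.

1. Let $\mathcal{B} \subseteq \mathcal{L}(\mathcal{H})$ be a unital $*$-algebra of bounded operators $B : \mathcal{H} \to \mathcal{H}$, $\|B\| < \infty$, and $(\Omega, \mathfrak{A}, P)$ be a probability space with a filtration $(\mathfrak{A}_t)_{t>0}$, $\mathfrak{A}_t \subseteq \mathfrak{A}$, of $\sigma$-algebras on $\Omega$. One can assume that the filtration $\mathfrak{A}_t \subseteq \mathfrak{A}_s$, $\forall t < s$, is generated by the history $x_t = \{r \mapsto x(r) : r < t\}$ of a stochastic process $x(t,\omega)$ with independent increments $dx(t) = x(t+\Delta) - x(t)$, and the probability measure $P$ is invariant under the measurable representations $\omega \mapsto \omega_s \in \Omega$, $A_s^{-1} = \{\omega : \omega_s \in A\} \in \mathfrak{A}$, $\forall A \in \mathfrak{A}$, on $\Omega \ni \omega$ of the time shifts $t \mapsto t + s$, $s > 0$, corresponding to the shifts of the random increments
$$dx(t, \omega_s) = dx(t+s, \omega), \qquad \forall \omega \in \Omega,\ t \in \mathbb{R}_+.$$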
The filtering dynamics over $\mathcal{B}$ with respect to the process $x(t)$ is described by a cocycle flow $\phi = (\phi_t)_{t>0}$ of linear completely positive [4] w*-continuous stochastic adapted maps $\phi_t(\omega) : \mathcal{B} \to \mathcal{B}$, $\omega \in \Omega$, such that the stochastic process $y_t(\omega) = \langle\eta|\phi_t(\omega, B)\eta\rangle$ is causally measurable for each $\eta \in \mathcal{D}$, $B \in \mathcal{B}$ in the sense that $y_t^{-1}(\mathrm{B}) \in \mathfrak{A}_t$, $\forall t > 0$ and any Borel set $\mathrm{B} \subseteq \mathbb{C}$. The maps $\phi_t$ can be extended on the $\mathfrak{A}$-measurable functions $Y : \omega \mapsto Y(\omega)$ with values $Y(\omega) \in \bar{\mathcal{B}}$ as the normal maps $\phi_t[Y](\omega) = \bar{\phi}_t(\omega, Y(\omega_t))$, defined for each $\omega \in \Omega$ by the normal extension $\bar{\phi}_t(\omega)$ onto $\bar{\mathcal{B}}$, and the cocycle condition $\phi_r(\omega) \circ \phi_s(\omega_r) = \phi_{r+s}(\omega)$, $\forall r, s > 0$, reads as the semigroup condition $\phi_r[\phi_s[Y]] = \phi_{r+s}[Y]$ of the extended maps. As was noted in the previous section, the maps $\phi_t(\omega)$ are not considered to be normalized to the identity, and can even be unbounded, but they are supposed to be normalized, $\phi_t(\omega, I) = R_t(\omega)$, to an operator-valued martingale $R_t = \epsilon_t[R_s] \ge 0$ with $R_0(\omega) = I$, or to a positive submartingale, $R_t \ge \epsilon_t[R_s]$, $\forall s > t$, in the subfiltering case, where $\epsilon_t$ is the conditional expectation over $\omega$ with respect to $\mathfrak{A}_t$.

2. Now we give a noncommutative generalization of the filtering (subfiltering) CP flows for an arbitrary Itô algebra, which was suggested in [32] for a Gaussian Itô algebra of finite-dimensional quantum thermal noise, and in [9] for the simple quantum Itô algebra $\mathcal{Q}(\mathbb{C}^d)$ even in the nonlinear case. The role of the classical process $x(t)$ will be played by the quantum stochastic process
$$X(t) = A \otimes I + I \otimes \Lambda(t, a), \qquad A \in \mathcal{A},\ a \in \mathfrak{a},$$
indexed by an operator algebra $\mathcal{A} \subset \mathcal{L}(\mathcal{D})$ and a noncommutative Itô algebra $\mathfrak{a}$. Here $\Lambda(t,a)$ is the process with independent increments on a dense subspace $\mathcal{F} \subset \Gamma(\mathcal{E})$ of the Fock space $\Gamma(\mathcal{E})$ over the space $\mathcal{E} = L^2_E(\mathbb{R}_+)$ of all square-norm integrable $E$-valued functions on $\mathbb{R}_+$, where $E$ is a pre-Hilbert space of the representation $a \in \mathfrak{a} \mapsto (a^\mu_\nu)^{\mu=-,\bullet}_{\nu=+,\bullet}$ for the Itô $\star$-algebra $\mathfrak{a}$. Assuming that $E$ is a Fréchet space, given by an increasing sequence of Hilbertian norms $\|e^\bullet\|(\xi) > \|e^\bullet\|$, $\xi \in \mathbb{N}$, we define $\mathcal{F}$ as the projective limit $\cap_\xi \Gamma(\mathcal{E}, \xi)$ of the Fock spaces $\Gamma(\mathcal{E}, \xi) \subseteq \Gamma(\mathcal{E})$, generated by coherent vectors $f^\otimes$, with respect to the norms
$$\big\|f^\otimes\big\|^2(\xi) = \int_\Gamma \big\|f^\otimes(\tau)\big\|^2(\xi)\,d\tau := \sum_{n=0}^\infty \frac{1}{n!}\Big(\int_0^\infty \big\|f^\bullet(t)\big\|^2(\xi)\,dt\Big)^n = e^{\|f^\bullet\|^2(\xi)}. \tag{2.1}$$
Here $f^\otimes(\tau) = \bigotimes_{t\in\tau} f^\bullet(t)$ for each $f^\bullet \in \mathcal{E}$ is represented by tensor-functions on the space $\Gamma$ of all finite subsets $\tau = \{t_1, \ldots, t_n\} \subseteq \mathbb{R}_+$ (for an example of the Fock scale see the Appendix). Moreover, we shall assume that the Itô algebra $\mathfrak{a}$ is realized as a $\star$-subalgebra of the Hudson–Parthasarathy (HP) algebra $\mathcal{Q}(E)$ of all quadruples $\mathbf{a} = (a^\mu_\nu)^{\mu=-,\bullet}_{\nu=+,\bullet}$ with $a^\bullet_\bullet \in \mathcal{L}(E)$, strongly representing the $\star$-semigroup $1 + \mathfrak{a}$ on the Fréchet space $E$ by projective contractions $\delta^\bullet_\bullet + a^\bullet_\bullet \in \mathcal{L}(E)$ in the sense that for each $\zeta \in \mathbb{N}$ there exists $\xi$ such that $\|e^\bullet + a^\bullet_\bullet e^\bullet\|(\zeta) \le \|e^\bullet\|(\xi)$ for all $e^\bullet \in E$. The following proposition proves that these are natural assumptions (which are not restrictive in the simple Fock scale [31] for a finite-dimensional $\mathfrak{a}$).

Proposition 2.1. The exponential operators $W(t,a) = {:}\exp[\Lambda(t,a)]{:}$, defined as the solutions to the quantum Itô equation
$$dW_t(g) = W_t(g)\,d\Lambda(t, g(t)), \qquad W_0(g) = I, \quad g(t) \in \mathfrak{a} \tag{2.2}$$
with $g(t) = a$, are strongly continuous, $W(t,a) \in \mathcal{L}(\mathcal{F})$, if all $\hat{a}^\bullet_\bullet = \delta^\bullet_\bullet + a^\bullet_\bullet \in \mathcal{L}(E)$ are projective contractions on $E$. They give an analytic representation
$$W(t, a \star a) = W(t,a)^\dagger W(t,a), \qquad W(t,0) = I, \qquad W(t,d) = e^t I \tag{2.3}$$
of the unital $*$-semigroup $1 + \mathfrak{a}$ for the Itô $\star$-algebra $\mathfrak{a}$ with respect to the $\star$-product $a \star a = a + a^\star a + a^\star$.

Proof. The solutions $W(t,a)$ are uniquely defined on the coherent vectors as analytic functions
$$W(t,a)\,f^\otimes(\tau) = \bigotimes_{r\in\tau,\ r<t}\big(\hat{a}^\bullet_\bullet f^\bullet(r) + a^\bullet_+\big) \bigotimes_{r\in\tau,\ r\ge t} f^\bullet(r)\ \exp\int_0^t \big(a^-_\bullet f^\bullet(r) + a^-_+\big)\,dr$$
(2.4) which obey the properties (2.3), see for example [5]. Thus the span of coherent vectors is invariant, and it is also invariant under W (t, a)† = W (t, a? ). They can be extended on F by continuity which follows from the continuity of Wick exponentials ⊗ˆa•• for the 0 projective contractions aˆ •• on E, and boundedness a•+ ∈ E of a− • ∈E . 3. Let D denote the Fr´echet space D ⊗ F, generated by ψ = η ⊗ f ⊗ , η ∈ D, f • ∈ E. Assuming for simplicity the separability of the Itˆo algebra in the sense E ⊆ `2 such that f • = (f m )m∈N , one can identify each ψ 0 ∈ D0 with a sequence of D0 -valued symmetric 0 (t1 , ...tn ), n = 0, 1, 2, ... . Let (Dt )t>0 be the natural filtration tensor-functions ψm 1 ,...,mn and D[t t>0 be the backward filtration of the subspaces Dt = D ⊗ Ft , D[t = D ⊗ F[t generated by η ⊗ f ⊗ with f • ∈ Et and f • ∈ E[t respectively, where Et = L2E [0, t), E[t = L2E [t, ∞) are embedded into E. The spaces Dt , D[t of the restrictions Et ψ = ψ|Γt , E[t ψ = ψ|Γ[t onto Γt = {τt = τ ∩ [0, t)}, Γ[t = {τ[t = τ ∩ [t, ∞)} are embedded into D by the isometries Et† : ψ 7→ ψt , E[t† : ψ 7→ ψ[t as ψt (τ ) = ψ (τt ) δ∅ τ[t , ψ[t (τ ) = δ∅ (τt ) ψ τ[t , where δ∅ (τ ) = 1 if τ = ∅, otherwise δ∅ (τ ) = 0. The projectors Et , E[t onto Dt , Dt are extended onto D0 as the adjoints to Et† , E[t† . The time shift on D0 is defined by the semigroup T t t>0 of adjoint operators T t = Tt∗ to Tt ψ (τ ) = ψ (τ + t), where τ + t = {t1 + t, ..., tn + t}, ∅ + t = ∅, such that T t ψ (τ ) = δ∅ (τt ) ψ τ[t − t are isometries for ψ ∈ D onto D[t . A family (Zt )t>0 of sesquilinear forms hψ|Zt ψi given 0 t by linear operators Zt : D → D is called adapted (and Z t>0 is called backward adapted) if Z t η ⊗ f ⊗ = ψ 0 ⊗ Et f ⊗ , ∀η ∈ D, f • ∈ E, Zt η ⊗ f ⊗ = ψ 0 ⊗ E[t f ⊗ (2.5) where ψ 0 ∈ D0t (D0[t ) and E[t (Et ) are the projectors onto F[t (Ft ) correspondingly. 
The (vacuum) conditional expectation on B (D) with respect to the past up to a time t ∈ R+ is defined as a positive projector, t (Z) ≥ 0, if Z ≥ 0, t = t ◦s , ∀s > t, giving † 0 an adapted sesquilinear form Zt = t (Z) in (2.5) for each Z ∈ B (D) by ψ = Et ZEt ψ, ⊗ t where ψ = η ⊗ Et f . The time shift θ t>0 on B (D) is uniquely defined by the covariance condition θt (Z) T t = T t Z as a backward adapted family Z t = θt (Z) , t > 0 for each Z ∈ B (D). As in the bounded case [7] between the maps t and θt we have the relation θr ◦ s = r+s ◦ θr which follows from the operator relation T r Es = Er+s T r . An adapted family (Mt )t>0 of positive hψ|Mt ψi ≥ 0, ∀ψ ∈ D Hermitian Mt† = Mt forms Mt ∈ B (D) is called martingale (submartingale) if t (Ms ) = Mt (t (Ms ) ≤ Mt ) for all s ≥ t ≥ 0. The bounded operator-valued martingales Mt were introduced in the case of the finite-dimensional HP-algebra in [29]. 4. Let B denote the space of all Y ∈ B (D), commuting with all X = {X (t)} in the sense
$$AY = YA, \quad \forall A \in \mathcal{A}, \qquad YW(t,a) = W(t,a)Y, \quad \forall t > 0,\ a \in \mathfrak{a},$$
where A (η ⊗ ϕ) = Aη ⊗ ϕ, W (η ⊗ ϕ) = η ⊗ W ϕ, and the unital ∗-algebra B ⊆ L (H) be weakly dense in the commutant Ac . The quantum filtration (Bt )t>0 is defined as the increasing family of subspaces Bt ⊆ Bs , t ≤ s of the adapted sesquilinear forms Yt ∈ B. The covariant shifts θt : Y 7→ Y t leave the space B invariant, mapping it onto the subspaces of backward adapted sesquilinear forms Y t = θt (Y ). The quantum stochastic positive flow over B is described by a one parameter family φ = (φt )t>0 of linear w*-continuous maps φt : B → B satisfying 1. the causality condition φt (B) ⊆ Bt , ∀B ∈ B, t ∈ R+ , 2. the complete positivity condition [φt (Bkl )] ≥ 0 for each t > 0 and for any positive definite matrix [Bkl ] ≥ 0 with Bkl ∈ B, 3. the cocycle condition φr ◦ φrs = φr+s , ∀t, s > 0 with respect to the covariant shift φrs = θr ◦ φs . Here the composition ◦ is understood as φr [φs (B)] = φr+s (B) in terms of the linear normal extensions of φt [B ⊗ Z] = φ¯ t (B) Z t to the CP maps B → B, form¯ φ¯ t are the normal extensions of φt ing a one-parameter semigroup, where B ∈ B, ¯ and Z t = θt (Z), Z ∈ B (F). These can be defined like in the classical case onto B, as φt [Y ] f¯• , f • = φ¯ t f¯• , Y f¯t• , ft• , f • with ft• (r) = f • (t + r) by the coherent matrix elements Y f¯• , f • = F ∗ Y F for Y ∈ B given by the continuous operators FR : η 7→ ψf = η ⊗ f ⊗ , η ∈ D for each f • ∈ Et with the adjoints F ∗ ψ 0 = τ
0 f,h∈Et B,C∈B
(2.6) f = (the usual summation rule over repeated cross-level indices is understood), where ξB f k • • • η if f = fk and B = Bk with fk ∈ Et , Bk ∈ B, k = 1, 2, ..., otherwise ξB = 0, and φt (B, f • ) = φt (B) F , φt f¯• , B = F ∗ φt (B). Proof. By definition the map φ into the forms is completely positive on
sesquilinear B if ψ k |φ (Bkl ) ψ l ≥ 0 whenever η k |Bkl η l ≥ 0, where η k , ψ k are arbitrary finite sequences. Approximating from below the latter positive forms by sums of the P
∗ Bil η l ≥ 0, the complete positivity can be tested only for the forms forms kl η k |Bik P ∗ P P k ∗ l ∗ ≥ 0 due to the additivity φ kl η |Bk Bl η i Bik Bil = i φ Bik Bil . If φt is adapted, this can be written as X
X k χB |φ B ∗ C χC = ψ k |φ Bk∗ Bl ψ l := ψ |φ Bk∗ Bl ψ l ≥ 0, B,C∈B
k,l
= Bk ∈ B, otherwise χB = 0. Because any ψ ∈ Dt can be where χB = ψ k ∈ Dt if BP approximated by a D-span f η f ⊗ f ⊗ of coherent vectors over fk• ∈ Et , it is sufficient to define the CP property only for such spans as
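$$0 \le \sum_{f,h}\sum_{B,C} \big\langle \xi^f_B \otimes f^\otimes \big| \phi(B^*C)\, \xi^h_C \otimes h^\otimes \big\rangle = \sum_{f,h}\sum_{B,C} \big\langle \xi^f_B \big| \phi\big(\bar{f}^\bullet, B^*C, h^\bullet\big)\, \xi^h_C \big\rangle.$$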
5. Note that the subfiltering (filtering) flows can be considered as quantum stochastic CP dilations of the quantum sub-Markov (Markov) semigroups θ = (θt)t>0 , θr ◦ θs = θr+s in the sense θt = ◦ φt , where (Y ) η = EY ψ0 , Eψ 0 = ψ 0 ∅ , ∀ψ 0 ∈ D0 , with θs (I) ≤ θt (I) ≤ I (θt (I) = I), ∀t ≤ s. The contraction Ct = θt (I) with R0 = I defines the probability hη|Ct ηi ≤ 1, ∀η ∈ H, kηk = 1 for an unstable system not to be demolished by a time t ∈ R+ , and the conditional expectations hη|ACt ηi / hη|Ct ηi of the initial nondemolition observables A ∈ A in any state η ∈ D, and thus in any initial state ψ0 ∈ η ⊗ δ∅ . The following theorem shows that the submartingale (or the contraction) Rt = φt (I) is also the density operator with respect to ψ0 = η ⊗ δ∅ , η ∈ H (or with respect to any ψ ∈ H ⊗ F) for the conditional state of the restricted nondemolition process Xt = {r 7→ X (r) : r < t}. Theorem 2.3. Let t 7→ Rt ∈ Bt be a positive (sub)-martingale and (gt )t>0 be the increasing family of ?-semigroups gt of step functions g : R+ → a, g (s) = 0, ∀s ≥ t under the ?-product (gk ? gl ) (t) = gl (t) + gk (t)? gl (t) + gk (t)?
(2.7)
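of $g_k^\star = g_k \star 0$ and $g_l = 0 \star g_l$. The generating function $\vartheta_t(g) = \epsilon[R_tW_t(g)]$ of the output state for the process $\Lambda(t)$, defined for any $g \in \mathfrak{g}_t$ and each $t > 0$ as
$$\langle\eta|\vartheta_t(g)\eta\rangle = \langle\psi_0|R_tW_t(g)\psi_0\rangle, \qquad \psi_0 = \eta \otimes \delta_\emptyset, \tag{2.8}$$
is $\mathcal{B}^c$-valued, positive, $\vartheta_t \ge 0$, in the sense of positive definiteness of the kernel
$$\big[\langle\eta^k|\vartheta_t(g_k \star g_l)\eta^l\rangle\big] \ge 0, \qquad \forall g_k \in \mathfrak{g}_t;\ \eta^k \in \mathcal{D}, \tag{2.9}$$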
and ϑt ≥ ϑs |gt in this sense for any s ≥ t. If R0 = I, then ϑ0 (0) = I ≥ ϑt (0) , and if Rt is a martingale, then ϑt = ϑs |gt for any s ≥ t, and ϑt (0) = I for all t ∈ R+ . Any family ϑ = (ϑt )t≥0 of positive-definite functions ϑt : gt → B c , satisfying the above consistency and normalization properties, is the state generating function of the form (2.8) iff it is absolutely continuous in the following sense X X
lim ηng ⊗ g+⊗ = 0 ⇒ lim ηng |ϑt (g ? h) ηnh = 0, (2.10) n→∞
g∈gt
n→∞
g,h∈gt
where g+⊗ (τ ) = ⊗t∈τ g+• (t) and ηng = 0 for almost all g (i.e. except for a finite number of g ∈ gt ). Proof. Because the solutions Wt (g) to the quantum stochastic equation (2.2) for a step function g are given by finite products of commuting exponential operators W (t, a), they are multiplicative, Wt (gk )∗ Wt (gl ) = Wt (gk ? gl ), as the operators in (2.4) are. Then the positive definiteness of ϑt follows from their commutativity (2.7) with positive Rt :
k η |ϑt (gk ? gl ) η l = Wt (gk ) η k ⊗ δ∅ |Rt Wt (gl ) η l ⊗ δ∅ = hψt |Rt ψt i ≥ 0. It is B c -valued as
hη|ϑt (g) Bηi = hψ0 |Rt Wt (g) Bψ0 i = hBψ0 |Rt Wt (g) ψ0 i = hBη|ϑt (g) ηi , ∀B ∈ B. From Wt (gr ) = Wr (g), r < t and Wt (0) = I as the case g0 = 0 it follows that ϑt (gr ) = hψ0 |Rt Wt (gr ) ψ0 i = hψ0 |r (Rt ) Wr (g) ψ0 i ≤ hψ0 |Rr Wr (g) ψ0 i = ϑr (g) for any finite matrix g = [gk ? gl ], and ϑt (0) ≤ 1 = ϑ0 (0) if Rt is a submartingale with R0 = I. This implies the normalization and compatibility conditions if Rt is martingale. condition follows from the continuity of the forms Rt ∈ P The continuity B (D): if g ηng ⊗ g+⊗ → 0, then X
X
ηng |ϑt (g ? h) ηnh = Wt (g) ηng ⊗ δ∅ |Rt Wt (h) ηnh ⊗ δ∅ g,h
g,h
* =
X g
ηng
⊗
g+⊗ |Rt
X
+ ηng
⊗
g+⊗
→ 0.
g
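Conversely, let $(E, V_t, L)$ be the GNS triple describing the decomposition $\vartheta_t(g) = L^* V_t(g) L$ for a positive-definite kernel-function $\vartheta_t$. It is defined by the multiplicative $*$-representation $V_t(g\star h) = V_t(g)^* V_t(h)$ of $\mathfrak{g}_t$ on a pre-Hilbert space $E\subseteq\mathcal{H}$ and by a linear operator $L:\mathcal{D}\to E$. The correspondence $\pi_t(B): V_t(g)L\eta\mapsto V_t(g)LB\eta$ for $B\in\mathcal{B}$ is extended to a $*$-representation $\pi_t:\mathcal{B}\to V_t(\mathfrak{g}_t)'$ on the linear combinations $E_t^\circ = \{\sum_k V_t(g_k)L_t\eta_k : g_k\in\mathfrak{g}_t,\ \eta_k\in\mathcal{D}\}$ by virtue of the commutativity of $\vartheta_t(g)$ with $\mathcal{B}$:
\[ \langle B^*\eta^g|\vartheta_t(g\star h)\eta^h\rangle = \langle\pi_t(B^*)V_t(g)L_t\eta^g \,|\, V_t(h)L_t\eta^h\rangle = \langle V_t(g)L_t\eta^g \,|\, \pi_t(B)V_t(h)L_t\eta^h\rangle = \langle\eta^g|\vartheta_t(g\star h)B\eta^h\rangle. \]
The linear correspondence $F_t: \sum_g\eta^g\otimes g_+^\otimes\mapsto\sum_g V_t(g)L\eta^g$ obviously intertwines this representation with $B\mapsto B\otimes I$, as well as the representation $V_t$ with $W_t$ on $\mathfrak{D}_t^\circ = \{\sum_k\eta_k\otimes W_t(g_k)\delta_\emptyset : g_k\in\mathfrak{g}_t,\ \eta_k\in\mathcal{D}\}\subseteq\mathfrak{D}_t$:
\[ F_t\big(\eta\otimes W_t(f^\star)g_+^\otimes\big) = V_t(f\star g)L\eta = V_t(f^\star)V_t(g)L\eta = V_t(f^\star)F_t\big(\eta\otimes g_+^\otimes\big), \]
where $g_+^\otimes = W(g)\delta_\emptyset$. It is a continuous operator with respect to the Hilbert space norm in $\mathcal{H}_t$ because if $\sum_g\eta_n^g\otimes g_+^\otimes\to 0$, then
\[ \Big\| F_t\sum_g\eta_n^g\otimes g_+^\otimes \Big\|^2 = \Big\|\sum_g V_t(g)L\eta_n^g\Big\|^2 = \sum_{f,h}\langle\eta_n^f|\vartheta_t(f\star h)\eta_n^h\rangle\to 0 \]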
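due to the strong absolute continuity of $\vartheta_t$. Hence $F_t$ can be continued to an intertwining operator $\mathfrak{D}_t^\circ\to\mathcal{H}$, and there exists the adjoint intertwining operator $F_t^*: E\to\mathfrak{D}_t'$, $F_t^*V_t(g) = W_t(g)F_t^*$, such that
\[ \langle\psi_0|F_t^*F_tW_t(g)\psi_0\rangle = \langle F_t\psi_0|V_t(g)F_t\psi_0\rangle = \langle L_t\eta|V_t(g)L_t\eta\rangle = \langle\eta|L_t^*V_t(g)L_t\eta\rangle. \]
The positive operators $F_t^*F_t\in\mathcal{B}(\mathfrak{D}_t)$, uniquely extended to the adapted ones $R_t:\mathfrak{D}\to\mathfrak{D}'$, commute with all $W_t(g)$, $g\in\mathfrak{g}_t$. They define a submartingale (martingale) $R_t\in\mathcal{B}_t$ due to the property $W_t = W_s|\mathfrak{g}_t$ for all $s\ge t$ and
\[ \langle W_t(g_k)\psi_0^k \,|\, \epsilon_t(R_s)W_t(g_l)\psi_0^l\rangle = \langle\psi_0^k|R_sW_s(g_k\star g_l)\psi_0^l\rangle = \langle\eta^k|\vartheta_s(g_k\star g_l)\eta^l\rangle \]
\[ \le(=)\ \langle\eta^k|\vartheta_t(g_k\star g_l)\eta^l\rangle = \langle\psi_0^k|R_tW_t(g_k\star g_l)\psi_0^l\rangle = \langle W_t(g_k)\psi_0^k \,|\, R_tW_t(g_l)\psi_0^l\rangle \]
if $\vartheta_s|\mathfrak{g}_t\le(=)\ \vartheta_t$. It is normalized, $R_0 = F_0^*F_0 = I$, as $F_0 = I$ if $\vartheta_0(0) = I$.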
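3. Generators of Quantum CP Dynamics

The quantum stochastically differentiable positive flow $\phi$ is defined as a weakly continuous function $t\mapsto\phi_t$ with CP values $\phi_t:\mathcal{B}\to\mathcal{B}_t$, $\phi_0(B) = B\otimes I$, $\forall B\in\mathcal{B}$, such that for any product-vector $\psi_f = \eta\otimes f^\otimes$ given by $\eta\in\mathcal{D}$ and $f^\bullet\in\mathcal{E}$
\[ \frac{d}{dt}\langle\psi_f|\phi_t(B)\psi_f\rangle = \big\langle\psi_f \,\big|\, \phi_t\big(\lambda(\bar f^\bullet(t), B, f^\bullet(t))\big)\psi_f\big\rangle, \qquad B\in\mathcal{B}, \tag{3.1} \]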
where $\lambda(\bar e^\bullet, B, e^\bullet) = \lambda(B) + e_\bullet\lambda^\bullet(B) + \lambda_\bullet(B)e^\bullet + e_\bullet\lambda^\bullet_\bullet(B)e^\bullet$, $e_\bullet = \bar e^\bullet$ is the linear form on $\mathcal{E}$ with $e_\bullet^* = e^\bullet\in\mathcal{E}$, and $\langle\psi_f|\phi_0(B)\psi_f\rangle = \langle\eta|B\eta\rangle\exp\|f^\bullet\|^2$. The generator $\lambda(B) = \lambda(0,B,0)$ of the quantum dynamical semigroup $\theta_t = \epsilon\circ\phi_t$ is a linear w*-continuous map $B\mapsto\lambda(B)\in\mathcal{A}^c$, $\lambda^\bullet = \lambda_\bullet^\dagger$ is a linear w*-continuous map given by the Hermitian adjoint values $\lambda^\bullet(B^*) = \lambda_\bullet(B)^\dagger$ in the continuous operators $\mathcal{E}\to\mathcal{A}^c$, and $\lambda^\bullet_\bullet:\mathcal{B}\to\mathcal{B}(\mathcal{D}\otimes\mathcal{E})$ is a w*-continuous map with the values $\lambda^\bullet_\bullet(B)$ given by continuous operators $\mathcal{E}\otimes\mathcal{E}\to\mathcal{A}^c$.

1. The differential evolution equation (3.1) for the coherent vector matrix elements $\langle\psi_f|\phi_t(B)\psi_f\rangle$ corresponds to the Itô form [8] of the quantum stochastic equation
\[ d\phi_t(B) = \phi_t\circ\lambda^\mu_\nu(B)\,d\Lambda^\nu_\mu := \sum_{\mu,\nu}\phi_t\big(\lambda^\mu_\nu(B)\big)\,d\Lambda^\nu_\mu, \qquad B\in\mathcal{B}, \tag{3.2} \]
with the initial condition $\phi_0(B) = B$ for all $B\in\mathcal{B}$. Here $\lambda^\mu_\nu$ are the flow generators $\lambda^-_+ = \lambda$, $\lambda^\bullet_+ = \lambda^\bullet$, $\lambda^-_\bullet = \lambda_\bullet$, $\lambda^\bullet_\bullet$, called the structural maps, and the summation is taken over the indices $\mu = -,\bullet$, $\nu = +,\bullet$ of the standard quantum stochastic integrators $\Lambda^\nu_\mu$. For simplicity we shall assume that the pre-Hilbert Fréchet space $\mathcal{E}$ is separable, $\mathcal{E}\subseteq\ell^2$. Then the index $\bullet$ can take any value in $\{1,2,\dots\}$, and $\Lambda^\nu_\mu(t)$ are indexed with $\mu\in\{-,1,2,\dots\}$, $\nu\in\{+,1,2,\dots\}$ as the standard time $\Lambda^+_-(t) = tI$, annihilation $\Lambda^m_-(t)$, creation $\Lambda^+_n(t)$ and exchange-number $\Lambda^m_n(t)$ operator integrators with $m,n\in\mathbb{N}$. The infinitesimal increments $d\Lambda^\nu_\mu(t) = \Lambda^{t\nu}_\mu(dt)$ are formally defined by the HP multiplication table [8] and the $\star$-property [16],
\[ d\Lambda^\alpha_\mu\,d\Lambda^\nu_\beta = \delta^\alpha_\beta\,d\Lambda^\nu_\mu, \qquad \Lambda^\star = \Lambda, \tag{3.3} \]
where $\delta^\alpha_\beta$ is the usual Kronecker delta restricted to the indices $\alpha\in\{-,1,2,\dots\}$, $\beta\in\{+,1,2,\dots\}$, and $\Lambda^{\star\mu}_{-\nu} = \Lambda^{\nu*}_{-\mu}$ with respect to the reflection $-(-) = +$, $-(+) = -$ of the indices $(-,+)$ only.

The linear equation (3.2) of a particular type (the quantum Langevin equation) with bounded finite-dimensional structural maps $\lambda^\mu_\nu$ was introduced by Evans and Hudson [2] in order to describe the $*$-homomorphic quantum stochastic evolutions. The constructed quantum stochastic $*$-homomorphic flow (EH-flow) is identity preserving and is obviously completely positive, but it is hard to prove these algebraic properties for the unbounded case. However, the typical quantum filtering dynamics is not homomorphic or identity preserving, but it is completely positive and in the most interesting cases is described by unbounded generators $\lambda^\mu_\nu$. In the general context Eq. (3.2) was studied in [31], and the corresponding quantum stochastic flow, not necessarily homomorphic and normalized, was constructed even for the infinite-dimensional non-adapted case under the natural integrability condition for the chronological products of the generators $\lambda^\mu_\nu$ in the norm scale (6.2). The EH flows with unbounded $\lambda^\mu_\nu$, satisfying certain analyticity conditions, have been recently constructed in the strong sense by Fagnola-Sinha in
[30] for the non-Hilbert class $L^\infty$ of test functions $f^\bullet$. Another type of sufficient analyticity conditions, which is related to the Hilbert scales of the test functions, is given in the Appendix. Here we will formulate the necessary differential conditions which follow from the complete positivity, causality, and martingale properties of the filtering flows, and which are sufficient for the construction of the quantum stochastic flows obeying these properties in the case of the bounded $\lambda^\mu_\nu$. As we showed in [9], the found properties are sufficient to define the general structure of the bounded generators, and this structure will help us in the construction of the minimal completely positive weak solutions for the quantum filtering equations also with unbounded $\lambda^\mu_\nu$.

2. Obviously the linear w*-continuous generators $\lambda^\mu_\nu:\mathcal{B}\to\mathcal{A}^c$ for CP flows $\phi_t^* = \phi_t$, where $\phi_t^*(B) = \phi_t(B^*)^\dagger$, must satisfy the $\star$-property $\boldsymbol\lambda^\star = \boldsymbol\lambda$, where $\lambda^{\star\nu}_{-\mu} = \lambda^{\mu*}_{-\nu}$, $\lambda^{\mu*}_\nu(B) = \lambda^\mu_\nu(B^*)^*$, and are independent of $t$, corresponding to the cocycle property $\phi_s\circ\phi^s_r = \phi_{s+r}$, where $\phi^s_t$ is the solution to (3.2) with $\Lambda^\nu_\mu(t)$ replaced by $\Lambda^{s\nu}_\mu(t)$, and $\lambda^-_+(I) = 0$ if $\phi$ is a filtering flow, $\phi_t(I) = I$, as it is in the multiplicative case [2]. We shall assume that $\boldsymbol\lambda = [\lambda^\mu_\nu]^{\mu=-,\bullet}_{\nu=+,\bullet}$ for each $B^* = B$ defines a continuous Hermitian form $\mathbf{b} = \boldsymbol\lambda(B)$ on the Fréchet space $\mathcal{D}\oplus\mathcal{D}_\bullet$,
\[ \langle\boldsymbol\eta|\mathbf{b}\,\boldsymbol\eta\rangle = \sum_{m,n}\langle\eta^m|b^m_n\eta^n\rangle + \sum_m\langle\eta^m|b^m_+\eta\rangle + \sum_n\langle\eta|b^-_n\eta^n\rangle + \langle\eta|b^-_+\eta\rangle, \]
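where $\eta\in\mathcal{D}$, $\eta^\bullet = (\eta^m)_{m\in\mathbb{N}}\in\mathcal{D}_\bullet = \mathcal{D}\otimes\mathcal{E}$. We say that an Itô algebra $\mathfrak{a}$, represented on $\mathcal{E}$, commutes in the HP sense with a $\mathbf{b}$, given by the form-generator $\boldsymbol\lambda$, if $(I\otimes a^\mu_\bullet)\,b^\bullet_\nu = b^\mu_\bullet\,(I\otimes a^\bullet_\nu)$. (For simplicity the ampliation $I\otimes a^\mu_\nu$ will be written again as $a^\mu_\nu$.) Note that if we define the matrix elements $a^\mu_\nu$, $b^\mu_\nu$ also for $\mu = +$ and $\nu = -$ by the extension
\[ a^+_\nu = 0 = a^\mu_-, \qquad \lambda^+_\nu(B) = 0 = \lambda^\mu_-(B), \qquad \forall\mathbf{a}\in\mathfrak{a},\ B\in\mathcal{B}, \]
the HP product (0.4) of $\mathbf{a}$ and $\mathbf{b}$ can be written in terms of the usual matrix product $\mathbf{ab} = [a^\mu_\lambda b^\lambda_\nu]$ of the extended quadratic matrices $\mathbf{a} = [a^\mu_\nu]^{\mu=-,\bullet,+}_{\nu=-,\bullet,+}$ and $\mathbf{b} = \mathbf{b}\mathbf{g}$, where $\mathbf{g} = [\delta^\mu_{-\nu}]$. Then one can extend the summation in (3.2) so that it is also over $\mu = +$ and $\nu = -$, such that $b^\mu_\nu\,d\Lambda^\nu_\mu$ is written as the trace $\mathbf{b}\cdot d\Lambda$ over all $\mu,\nu$. By such an extension the multiplication table for $d\Lambda(\mathbf{a}) = \mathbf{a}\cdot d\Lambda$, $d\Lambda(\mathbf{b}) = \mathbf{b}\cdot d\Lambda$ can be represented as $d\Lambda(\mathbf{a})\,d\Lambda(\mathbf{b}) = \mathbf{ab}\cdot d\Lambda$, and the involution $\mathbf{b}\mapsto\mathbf{b}^\star$, defining $d\Lambda(\mathbf{b})^\dagger = \mathbf{b}^\star\cdot d\Lambda$, can be obtained by the pseudo-Hermitian conjugation $b^{\star\nu}_\alpha = g_{\alpha\mu}\,b^{\mu*}_\beta\,g^{\beta\nu}$ with respect to the indefinite Minkowski metric tensor $\mathbf{g} = [g_{\mu\nu}]$ and its inverse $\mathbf{g}^{-1} = [g^{\mu\nu}]$, given by $g^{\mu\nu} = \delta^\mu_{-\nu}I = g_{\mu\nu}$.

Now let us find the differential form of the normalization and causality conditions with respect to the quantum stationary process with independent increments $dX(t) = X(t+\Delta) - X(t)$ generated by an Itô algebra $\mathfrak{a}$ on the separable space $\mathcal{E}$.

Proposition 3.1. Let $\phi$ be a flow satisfying the quantum stochastic equation (3.2), and $[W_t(g),\phi_t(B)] = 0$ for all $g\in\mathfrak{g}$, $B\in\mathcal{B}$. Then the coefficients $b^\mu_\nu = \lambda^\mu_\nu(B)$, $\mu = -,\bullet$, $\nu = +,\bullet$, where $\bullet = 1,2,\dots$, written in the matrix form $\mathbf{b} = [b^\mu_\nu]^{\mu=-,\bullet}_{\nu=+,\bullet}$, commute in the sense of the HP product with $\mathbf{a} = [a^\mu_\nu]^{\mu=-,\bullet}_{\nu=+,\bullet}$ for all $\mathbf{a}\in\mathfrak{a}$ and $B\in\mathcal{B}$:
\[ [\mathbf{a},\mathbf{b}] := \big[a^\mu_\bullet b^\bullet_\nu - b^\mu_\bullet a^\bullet_\nu\big]^{\mu=-,\bullet}_{\nu=+,\bullet} = 0. \tag{3.4} \]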
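The statement that the HP product reduces to an ordinary matrix product of the extended matrices can be checked numerically. The following is a minimal sketch, assuming scalar (complex-valued) matrix elements and finitely many noise modes $d$ instead of operator-valued entries and $\bullet = 1,2,\dots$; the helper names `random_blocks`, `extend` and `hp_product` are hypothetical and not taken from the paper.

```python
import numpy as np

def random_blocks(rng, d):
    """Random scalar blocks (a^-_+, a^-_n, a^m_+, a^m_n) for d noise modes (toy assumption)."""
    c = lambda *shape: rng.normal(size=shape) + 1j * rng.normal(size=shape)
    return (complex(rng.normal(), rng.normal()), c(d), c(d), c(d, d))

def extend(a_mp, a_mb, a_bp, a_bb):
    """Extended (d+2)x(d+2) matrix with rows mu = (-, 1..d, +) and columns nu = (-, 1..d, +).
    The row mu = + and the column nu = - are zero, as in the extension a^+_nu = 0 = a^mu_-."""
    d = a_bb.shape[0]
    a = np.zeros((d + 2, d + 2), dtype=complex)
    a[0, 1:d + 1] = a_mb        # a^-_n
    a[0, d + 1] = a_mp          # a^-_+
    a[1:d + 1, 1:d + 1] = a_bb  # a^m_n
    a[1:d + 1, d + 1] = a_bp    # a^m_+
    return a

def hp_product(a, b):
    """Block form of the HP product: (a.b)^mu_nu = sum over the bullet index of a^mu_m b^m_nu."""
    _, a_mb, _, a_bb = a
    _, _, b_bp, b_bb = b
    return (a_mb @ b_bp, a_mb @ b_bb, a_bb @ b_bp, a_bb @ b_bb)

rng = np.random.default_rng(0)
a, b = random_blocks(rng, 3), random_blocks(rng, 3)
# The extended matrix of the HP product equals the usual product of the extended matrices.
print(np.allclose(extend(*hp_product(a, b)), extend(*a) @ extend(*b)))
```

Only the $\bullet$-indexed intermediate sum survives in the matrix product because the zero row $\mu = +$ and the zero column $\nu = -$ remove the other two contributions, which is why the commutator in (3.4) involves the $\bullet$ index alone.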
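Proof. Since $\epsilon_t(\phi_s(I) - \phi_t(I))$ is a negative Hermitian form, $\epsilon_t(d\phi_t(I)) = \epsilon_t\big(\phi_t(\lambda^\mu_\nu(I))\,d\Lambda^\nu_\mu\big) = \phi_t(\lambda^-_+(I))\,dt\le 0$. Since $Y_t = \phi_t(B)$ commutes with $W_t(g)$ for all $B$ and $g(t) = \mathbf{a}$, we have by virtue of the quantum Itô formula
\[ d[Y_t,W_t] = [dY_t,W_t] + [Y_t,dW_t] + [dY_t,dW_t] = 0. \]
Equations (2.2), (3.2) and the commutativity of $a^\mu_\nu$ with $Y_t$ and $W_t$ imply
\[ \big([\phi_t(b^\mu_\nu),W_t] + [Y_t,a^\mu_\nu W_t] + \phi_t(b^\mu_\bullet)\,a^\bullet_\nu W_t - a^\mu_\bullet W_t\,\phi_t(b^\bullet_\nu)\big)\,d\Lambda^\nu_\mu = W_t\big(\phi_t(b^\mu_\bullet)\,a^\bullet_\nu - a^\mu_\bullet\,\phi_t(b^\bullet_\nu)\big)\,d\Lambda^\nu_\mu = W_t\,\phi_t\big(b^\mu_\bullet a^\bullet_\nu - a^\mu_\bullet b^\bullet_\nu\big)\,d\Lambda^\nu_\mu = 0. \]
Thus $\mathbf{a}\bullet\mathbf{b} = \mathbf{b}\bullet\mathbf{a}$ by the argument [6] of independence of the integrators $d\Lambda^\nu_\mu$.

3. In order to formulate the CP differential condition we need the notion of quantum stochastic germ for the CP flow $\phi$ at $t = 0$. It was defined in [31, 11] for a quantum stochastic differential (3.2) with $\phi_0(B) = B$, $\forall B\in\mathcal{B}$, as $\gamma^\mu_\nu = \lambda^\mu_\nu + \imath^\mu_\nu$, where $\lambda^\mu_\nu$ are the structural maps $B\mapsto\lambda^\mu_\nu(B)$ given by the generators of the quantum Itô equation (3.2) and $\imath^\mu_\nu: B\mapsto B\delta^\mu_\nu$ is the ampliation of $\mathcal{B}$. Let us prove that the germ-maps $\gamma^\mu_\nu$ of a CP flow $\phi$ must be conditionally completely positive (CCP) in a degenerated sense, as it was found for the finite-dimensional bounded case in [9, 12].

Theorem 3.2. If $\phi$ is a completely positive flow satisfying the quantum stochastic equation (3.2) with $\phi_0(B) = B$, then the germ-matrix $\boldsymbol\gamma = [\lambda^\mu_\nu + \imath^\mu_\nu]^{\mu=-,\bullet}_{\nu=+,\bullet}$ is conditionally completely positive in the sense
\[ \sum_{B\in\mathcal{B}}\boldsymbol\iota(B)\,\boldsymbol\zeta_B = 0 \ \Rightarrow\ \sum_{B,C\in\mathcal{B}}\langle\boldsymbol\zeta_B|\boldsymbol\gamma(B^*C)\,\boldsymbol\zeta_C\rangle\ge 0. \]
Here $\boldsymbol\zeta\in\mathcal{D}\oplus\mathcal{D}_\bullet$, $\mathcal{D}_\bullet = \mathcal{D}\otimes\mathcal{E}$, and $\boldsymbol\iota = [\iota^\mu_\nu]^{\mu=-,\bullet}_{\nu=+,\bullet}$ is the degenerate representation $\iota^\mu_\nu(B) = B\,\delta^+_\nu\delta^\mu_-$, written both with $\boldsymbol\gamma$ in the matrix form as
\[ \boldsymbol\gamma = \begin{pmatrix}\gamma & \gamma_\bullet\\ \gamma^\bullet & \gamma^\bullet_\bullet\end{pmatrix}, \qquad \boldsymbol\iota(B) = \begin{pmatrix}B & 0\\ 0 & 0\end{pmatrix}, \tag{3.5} \]
where $\gamma = \lambda^-_+$, $\gamma^m = \lambda^m_+$, $\gamma_n = \lambda^-_n$, $\gamma^m_n = \imath^m_n + \lambda^m_n$ with $\imath^m_n(B) = B\delta^m_n$, such that
\[ \gamma(B^*) = \gamma(B)^*, \qquad \gamma^n(B^*) = \gamma_n(B)^*, \qquad \gamma^m_n(B^*) = \gamma^n_m(B)^*. \tag{3.6} \]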
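If $\phi$ is subfiltering, then $D = -\lambda^-_+(I)$ is a positive Hermitian form, $\langle\eta|D\eta\rangle\ge 0$ for all $\eta\in\mathcal{D}$, and if $\phi$ is contractive, then $\mathbf{D} = -\boldsymbol\lambda(I)$ is positive in the sense $\langle\boldsymbol\eta|\mathbf{D}\boldsymbol\eta\rangle\ge 0$ for all $\boldsymbol\eta\in\mathcal{D}\oplus\mathcal{D}_\bullet$.

Proof. The CP condition in the form (2.6) for the adapted map $\phi_t$ can obviously be extended on all $f^\bullet\in\mathcal{E}$ if the sesquianalytical function $f^\bullet\mapsto\phi_t(\bar f^\bullet, B, f^\bullet)$ is defined as the $\mathcal{E}$-function
\[ \langle\eta|\phi_t(\bar f^\bullet, B, f^\bullet)\eta\rangle = \langle\eta\otimes f^\otimes|\phi_t(B)\,\eta\otimes f^\otimes\rangle\exp\Big(-\int_t^\infty\|f^\bullet(s)\|^2\,ds\Big), \tag{3.7} \]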
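where $\|f^\bullet(t)\|^2 = \sum_{n=1}^\infty|f^n(t)|^2$. It coincides with the former definition on $\mathcal{E}_t$ and does not depend on $f^\bullet(s)$, $s>t$, due to the adaptiveness (2.5) of $Y_t = \phi_t(B)$. If the $\mathfrak{D}$-form $\phi_t(B)$ satisfies the stochastic equation (3.2), the $\mathcal{D}$-form $\phi_t(\bar f^\bullet, B, f^\bullet)$ satisfies the differential equation [8]
\[ \frac{d}{dt}\phi_t(\bar f^\bullet, B, f^\bullet) = \|f^\bullet(t)\|^2\,\phi_t(\bar f^\bullet, B, f^\bullet) + \phi_t(\bar f^\bullet, \lambda^-_+(B), f^\bullet) + \sum_{m=1}^\infty\bar f^m(t)\,\phi_t(\bar f^\bullet, \lambda^m_+(B), f^\bullet) \]
\[ + \sum_{n=1}^\infty\phi_t(\bar f^\bullet, \lambda^-_n(B), f^\bullet)\,f^n(t) + \sum_{m,n=1}^\infty\bar f^m(t)\,\phi_t(\bar f^\bullet, \lambda^m_n(B), f^\bullet)\,f^n(t) = \phi_t\big(\bar f^\bullet, \gamma(\bar f^\bullet(t), B, f^\bullet(t)), f^\bullet\big). \]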
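The positive definiteness of (3.7) ensures the conditional positive definiteness
\[ \sum_B\sum_f B\,\xi_B^f = 0 \ \Rightarrow\ \sum_{B,C}\sum_{f,h}\langle\xi_B^f|\gamma_t(\bar f^\bullet, B^*C, h^\bullet)\,\xi_C^h\rangle = \frac{1}{t}\sum_{B,C}\sum_{f,h}\langle\xi_B^f|\phi_t(\bar f^\bullet, B^*C, h^\bullet)\,\xi_C^h\rangle\ge 0 \]
of the form given by $\gamma_t(\bar f^\bullet, B, f^\bullet) = \frac{1}{t}\big(\phi_t(\bar f^\bullet, B, f^\bullet) - B\big)$ for each $t>0$. This holds also at the limit $\gamma_0(\bar f^\bullet, B, f^\bullet) = \gamma(\bar f^\bullet(0), B, f^\bullet(0))$, given at $t\downarrow 0$ by the $\mathcal{E}$-form
\[ \gamma(\bar e^\bullet, B, e^\bullet) = \sum_{m,n}\bar e^m\gamma^m_n(B)\,e^n + \sum_m\bar e^m\gamma^m(B) + \sum_n\gamma_n(B)\,e^n + \gamma(B), \]
where $e^\bullet = f^\bullet(0)\in\mathcal{E}$, $\bar e^\bullet = e_\bullet$, and the $\gamma$'s are defined in (3.5). Hence the form
\[ \sum_{B,C}\sum_{\mu,\nu}\langle\zeta_B^\mu|\gamma^\mu_\nu(B^*C)\,\zeta_C^\nu\rangle := \sum_{B,C}\sum_{m,n}\langle\zeta_B^m|\gamma^m_n(B^*C)\,\zeta_C^n\rangle + \sum_{B,C}\Big(\sum_n\langle\zeta_B|\gamma_n(B^*C)\,\zeta_C^n\rangle + \sum_m\langle\zeta_B^m|\gamma^m(B^*C)\,\zeta_C\rangle + \langle\zeta_B|\gamma(B^*C)\,\zeta_C\rangle\Big) \]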
P where e•f = f • (0), is positive if B BζB = 0. with ζ = f ξ , ζ = f ξ ⊗ The components ζ and ζ • of these vectors are independent because for any ζ ∈ D and such a function e• 7→ ξ e on E with a countable ζ • = ζ 1 , ζ 2 , ... P ∈eD ⊗ E there P exists e • • support, that e ξ = ζ, ξ e = 0 for all e• ∈ E except P∞ enξ ⊗ e• = ζ• , namely, • 0 th e = 0 with ξ = ζ − n=1 ζ and e = en , the n basis element in `2 , for which ξ e = ζ n . This proves the complete positivity of the matrix form γ, with respect to the matrix representation ι defined in (3.5) on the ket-vectors ζ = (ζ µ ). If (Rt ) ≤ I, then D = −λ(I)= lim 1t (I − Rt ) ≥ 0, and we also conclude the P dissipativity k,l ξ k |D k¯ • , l• ξ l ≥ 0 from Rt •
1X f f h 0 ≤ lim ξ |e 0 • I − φt f¯• , I, h• ξ h = − ξ f |λ f¯• (0) , I, h• (0) ξ h t f
•
f
e•f ,
f,h
if φt (I) ≤ I, where λ (e¯• , I, e• ) = γ (e¯• , I, e• ) − ke• k I = D (e¯• , e• ). 2
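4. Obviously the CCP property for the germ-matrix $\boldsymbol\gamma$ is invariant under the transformation $\boldsymbol\gamma\mapsto\boldsymbol\varphi$ given by
\[ \boldsymbol\varphi(B) = \boldsymbol\gamma(B) + \boldsymbol\iota(B)\mathbf{K} + \mathbf{K}^*\boldsymbol\iota(B), \tag{3.8} \]
where $\mathbf{K} = [K^\mu_\nu]^{\mu=-,\bullet}_{\nu=+,\bullet}$ is an arbitrary matrix of $K^\mu_\nu\in\mathcal{L}(\mathcal{D})$ with $K^{*\mu}_{-\nu} = K^{\nu*}_{-\mu}$. As was proven in [9, 12] for the case of a finite-dimensional matrix $\boldsymbol\gamma$ of bounded $\gamma^\mu_\nu$, see also [13], the matrix elements $K^-_\nu$ can be chosen in such a way that the matrix map $\boldsymbol\varphi = [\varphi^\mu_\nu]^{\mu=-,\bullet}_{\nu=+,\bullet}$ becomes CP from $\mathcal{B}$ into the quadratic matrices of $\varphi^\mu_\nu(B)$. (The other elements can be chosen arbitrarily, say as $K^\bullet_+ = 0$, $K^\bullet_\bullet = \frac12 I^\bullet_\bullet$, because (3.8) does not depend on $K^\bullet_+$, $K^\bullet_\bullet$.) Thus the generator $\boldsymbol\lambda = \boldsymbol\gamma - \boldsymbol\imath$ for a quantum stochastic CP flow $\phi$ can be written (at least in the bounded case) as $\boldsymbol\varphi - \boldsymbol\imath\mathbf{K} - \mathbf{K}^*\boldsymbol\imath$:
\[ \lambda^\mu_\nu(B) = \varphi^\mu_\nu(B) - B\Big(\tfrac12\delta^\mu_\nu I + \delta^\mu_- K_\nu\Big) - \Big(\tfrac12\delta^\mu_\nu I + K^\mu\delta^+_\nu\Big)B, \tag{3.9} \]
where $\varphi^\mu_\nu:\mathcal{B}\to\mathcal{B}(\mathcal{D})$ are matrix elements of the CP map $\boldsymbol\varphi$ and $K_\nu\in\mathcal{L}(\mathcal{D})$, $K^- = K_+^*$, $K^m = K_m^*$. Now we show that the germ-matrix of this form obeys the CCP property even in the general case of unbounded $K^-_\nu$, $\varphi^\mu_\nu(B)\in\mathcal{B}(\mathcal{D})$.

Proposition 3.3. The matrix map $\boldsymbol\gamma = [\gamma^\mu_\nu]^{\mu=-,\bullet}_{\nu=+,\bullet}$ given in (3.8) by
\[ \boldsymbol\varphi = \begin{pmatrix}\varphi & \varphi_\bullet\\ \varphi^\bullet & \varphi^\bullet_\bullet\end{pmatrix}, \quad\text{and}\quad \mathbf{K} = \begin{pmatrix}K & K_\bullet\\ 0 & \frac12 I^\bullet_\bullet\end{pmatrix}, \quad \mathbf{K}^* = \begin{pmatrix}K^* & 0\\ K_\bullet^* & \frac12 I^\bullet_\bullet\end{pmatrix}, \tag{3.10} \]
with $\varphi = \varphi^-_+$, $\varphi^m = \varphi^m_+$, $\varphi_n = \varphi^-_n$ and $\varphi^m_n = \gamma^m_n$, is CCP with respect to the degenerate representation $\boldsymbol\iota = [\delta^\mu_-\delta^+_\nu\,\iota]^{\mu=-,\bullet}_{\nu=+,\bullet}$, where $\iota(B) = B$, if $\boldsymbol\varphi$ is a CP map.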
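The cancellation that makes the CCP property insensitive to the term $\boldsymbol\iota(B)\mathbf{K} + \mathbf{K}^*\boldsymbol\iota(B)$ can be illustrated in finite dimensions. The sketch below assumes $\mathcal{D} = \mathbb{C}^d$ and $m$ noise modes, so that $\mathcal{D}\oplus\mathcal{D}_\bullet = \mathbb{C}^{d+dm}$, and uses an arbitrary random matrix for $\mathbf{K}$; the function names are hypothetical and serve only this illustration.

```python
import numpy as np

def iota(B, m):
    """Degenerate representation iota(B) = diag(B, 0) on C^d (+) C^(d*m)."""
    d = B.shape[0]
    out = np.zeros((d + d * m, d + d * m), dtype=complex)
    out[:d, :d] = B
    return out

def constrained_family(rng, d, m, n):
    """Random operators B_k and vectors eta^k subject to sum_k iota(B_k) eta^k = 0."""
    Bs = [rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)) for _ in range(n - 1)]
    etas = [rng.normal(size=d + d * m) + 1j * rng.normal(size=d + d * m) for _ in range(n)]
    Bs.append(np.eye(d, dtype=complex))
    # With B_n = I, the D-component of the last vector is fixed by the constraint.
    etas[-1][:d] = -sum(B @ e[:d] for B, e in zip(Bs[:-1], etas[:-1]))
    return Bs, etas

def k_term(Bs, etas, K, m):
    """sum_{k,l} <eta^k | (iota(B_k* B_l) K + K* iota(B_k* B_l)) eta^l>."""
    total = 0.0
    for Bk, ek in zip(Bs, etas):
        for Bl, el in zip(Bs, etas):
            X = iota(Bk.conj().T @ Bl, m)
            total += ek.conj() @ (X @ K + K.conj().T @ X) @ el
    return total

rng = np.random.default_rng(0)
d, m = 2, 3
Bs, etas = constrained_family(rng, d, m, n=4)
K = rng.normal(size=(d + d * m, d + d * m)) + 1j * rng.normal(size=(d + d * m, d + d * m))
print(abs(k_term(Bs, etas, K, m)))  # vanishes up to rounding error
```

Because $\boldsymbol\iota(B_k^*B_l) = \boldsymbol\iota(B_k)^*\boldsymbol\iota(B_l)$, each of the two sums factorizes through $\sum_k\boldsymbol\iota(B_k)\boldsymbol\eta^k = 0$, so on constrained families the forms of $\boldsymbol\gamma$ and $\boldsymbol\varphi$ coincide.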
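Proof. If $\boldsymbol\iota(B_k)\boldsymbol\eta^k = 0$, then
\[ \big\langle\boldsymbol\eta^k\,\big|\,\big(\boldsymbol\iota(B_k^*B_l)\mathbf{K} + \mathbf{K}^*\boldsymbol\iota(B_k^*B_l)\big)\boldsymbol\eta^l\big\rangle = 2\,\mathrm{Re}\,\langle\boldsymbol\iota(B_k)\boldsymbol\eta^k|\boldsymbol\iota(B_l)\mathbf{K}\boldsymbol\eta^l\rangle = 0. \]
Hence the CCP property for $\boldsymbol\gamma$ is equivalent to the CCP property for (3.8) and follows from its CP property:
\[ \langle\boldsymbol\eta^k|\boldsymbol\gamma(B_k^*B_l)\boldsymbol\eta^l\rangle = \langle\boldsymbol\eta^k|\boldsymbol\varphi(B_k^*B_l)\boldsymbol\eta^l\rangle\ge 0 \]
for such sequences $\boldsymbol\eta^k\in\mathcal{D}\oplus\mathcal{D}_\bullet$.

4. Construction of Quantum CP Flows

The necessary conditions for the stochastic generator $\boldsymbol\lambda = [\lambda^\mu_\nu]^{\mu=-,\bullet}_{\nu=+,\bullet}$ of a CP flow $\phi$ at $t = 0$ are found in the previous section in the form of a CCP property for the corresponding germ $\boldsymbol\gamma = [\gamma^\mu_\nu]^{\mu=-,\bullet}_{\nu=+,\bullet}$. In the next section we shall show that these conditions are essentially equivalent to the assumption (3.9), corresponding to
\[ \gamma(B) = \varphi(B) - K^*B - BK, \qquad \gamma^m(B) = \varphi^m(B) - K_m^*B = \gamma_m^*(B), \tag{4.1} \]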
where $\boldsymbol\varphi = [\varphi^\mu_\nu]^{\mu=-,\bullet}_{\nu=+,\bullet}$ is a CP map with $\varphi^m_n = \gamma^m_n$. Here we are going to prove, under the following conditions for the operators $K$, $K_\bullet$ and the maps $\varphi^\mu_\nu$, that this general form is also sufficient for the existence of the CP solutions to the quantum stochastic equation (3.2). We are going to construct the minimal quantum stochastic positive flow $B\mapsto\phi_t(B)$ for a given w*-continuous unbounded germ-matrix map of the above form, satisfying the following conditions.

1. First, we suppose that the operator $K\in\mathcal{B}(\mathcal{D})$ generates the one-parameter semigroup $(e^{-Kt})_{t>0}$, $e^{-Kr}e^{-Ks} = e^{-K(r+s)}$, of continuous operators $e^{-Kt}\in\mathcal{L}(\mathcal{D})$ in the strong sense
\[ \lim_{t\searrow 0}\frac{1}{t}\big(I - e^{-Kt}\big)\eta = K\eta, \qquad \forall\eta\in\mathcal{D}. \]
(A contraction semigroup on the Hilbert space $\mathcal{H}$ if $K$ defines an accretive, $K + K^\dagger\ge 0$, and so maximal accretive form.)

2. Second, we suppose that the solution $S_t^n$, $n\in\mathbb{N}$, to the recurrence
\[ S_t^{n+1} = S_t^\circ - \int_0^t S_{t-r}^\circ\sum_{m=1}^\infty K_mS_r^n\,d\Lambda^m_-, \qquad S_t^0 = S_t^\circ, \]
where $S_t^\circ = e^{-Kt}\otimes T_t\in\mathcal{L}(\mathfrak{D})$ is the contraction given by the shift co-isometries $T_t:\mathcal{F}\to\mathcal{F}$, strongly converges to a continuous operator $S_t\in\mathcal{L}(\mathfrak{D})$ as $n\to\infty$ for each $t>0$.

3. Third, we suppose that the solution $R_t^n$, $n\in\mathbb{N}$, to the recurrence
\[ R_t^{n+1} = S_t^*S_t + \int_0^t d\Lambda^\nu_\mu\big(r, S_r^*\varphi^\mu_\nu(R^n_{t-r})S_r\big), \qquad R_t^0 = S_t^*S_t, \]
where the quantum stochastic non-adapted integral is understood in the sense [31] (see the Appendix), weakly converges to a continuous form $R_t\in\mathcal{B}(\mathfrak{D})$ as $n\to\infty$ for each $t>0$.

The first and second assumptions are necessary to define the existence of the free evolution semigroup $S^\circ = (S_t^\circ)_{t>0}$ and its perturbation $S = (S_t)_{t>0}$ on the product space $\mathfrak{D} = \mathcal{D}\otimes\mathcal{F}$ in the form of the multiple quantum stochastic integral
\[ S_t = S_t^\circ + \sum_{n=1}^\infty(-1)^n\int\!\cdots\!\int_{0<t_1<\dots<t_n<t}K_{m_n}(t-t_n)\cdots K_{m_1}(t_2-t_1)\,S_{t_1}^\circ\,d\Lambda^{m_1}_-\cdots d\Lambda^{m_n}_-, \tag{4.2} \]
iterating the quantum stochastic integral equation
\[ S_t = S_t^\circ - \int_0^t\sum_{m=1}^\infty K_m(t-r)\,S_r\,d\Lambda^m_-, \qquad S_0 = I, \tag{4.3} \]
where $K_m(t) = S_t^\circ(K_m\otimes I)$. A sufficient analyticity condition under which this iteration strongly converges in $\mathfrak{D}$ is given in the Appendix. The third assumption supplies the weak convergence for the series
\[ R_t = S_t^*S_t + \sum_{n=1}^\infty\int\!\cdots\!\int_{0<t_1<\dots<t_n<t}d\Lambda^{\nu_1\dots\nu_n}_{\mu_1\dots\mu_n}\Big(t_1,\dots,t_n,\ \varphi^{\mu_1\dots\mu_n}_{\nu_1\dots\nu_n}\big(t_1,\dots,t_n,\ S^*_{t-t_n}S_{t-t_n}\big)\Big) \tag{4.4} \]
of non-adapted $n$-tuple integrals, i.e. for the multiple quantum stochastic integral (see the definition in the Appendix) with
\[ \varphi^{\mu_1\dots\mu_n}_{\nu_1\dots\nu_n}(t_1,\dots,t_n) = \varphi^{\mu_1\dots\mu_{n-1}}_{\nu_1\dots\nu_{n-1}}(t_1,\dots,t_{n-1})\circ\varphi^{\mu_n}_{\nu_n}(t_n-t_{n-1}), \tag{4.5} \]
where $\varphi^\mu_\nu(t,B) = S_t^*\varphi^\mu_\nu(B)S_t$. A sufficient analyticity condition for this convergence is also given in the Appendix. The following theorem gives a characterization of the evolution semigroup $S$ in terms of cocycles with unbounded coefficients, characterized by Fagnola [33] in the isometric and unitary case.

Proposition 4.1. Let the family $V^\circ = (V_t^\circ)_{t>0}$ be a quantum stochastic adapted cocycle, $V_r^\circ T_sV_s^\circ = T_sV_{r+s}^\circ$, satisfying the HP differential equation
\[ dV_t^\circ + KV_t^\circ\,dt + \sum_{m=1}^\infty K_mV_t^\circ\,d\Lambda^m_- + \sum_{m=1}^\infty V_t^\circ\,d\Lambda^m_m = 0, \qquad V_0^\circ = I. \tag{4.6} \]
Then $S_t = T_tV_t^\circ$ is a semigroup solution, $S_rS_s = S_{r+s}$, to the non-adapted integral equation (4.3) such that $S_t\psi_f = S_t(f^\bullet)\eta\otimes\delta_\emptyset$, $\forall\eta\in\mathcal{D}$, on $\psi_f = \eta\otimes f^\otimes$ with $f^\bullet\in\mathcal{E}_t$. Conversely, if $S = (S_t)_{t>0}$ is the non-adapted solution (4.2) to the integral equation (4.3), then $V_t^\circ = T_t^*S_t$ is the adapted solution to (4.6), defined as $V_t^\circ\psi_f = S_t(f^\bullet)\eta\otimes f^\otimes$, $\forall\eta\in\mathcal{D}$, where $S_t(f^\bullet) = F^*S_tF$ is given by $F\eta = \eta\otimes f^\otimes$ with $f^\bullet\in\mathcal{E}_t$.

Proof. First let us show that Eq. (4.6) is equivalent to the integral one
\[ V_t^\circ = e^{-Kt}\otimes I_t - \int_0^t\sum_{m=1}^\infty\big(e^{-K(t-r)}K_m\otimes I_{t-r}\big)V_r^\circ\,d\Lambda^m_-, \qquad V_0^\circ = I, \]
where $I_t = T_t^\dagger T_t$ is the orthoprojector onto $\mathcal{F}_{[t}$. Indeed, multiplying both parts of the integral equation from the left by $e^{K(t-s)}$ and differentiating the product $e^{K(t-s)}V_t^\circ$ at $t = s$, we obtain (4.6) by taking into account that $dI_t + \sum_{m=1}^\infty I_t\,d\Lambda^m_m = 0$. Conversely, the integral equation can be obtained from (4.6) by the integration:
\[ -\int_0^t\big(e^{-K(t-r)}\otimes I_{t-r}\big)K_\bullet V_r^\circ\,d\Lambda^\bullet_- = \int_0^t\big(e^{-K(t-r)}\otimes I_{t-r}\big)\big(dV_r^\circ + KV_r^\circ\,dr + V_r^\circ\,d\Lambda^\bullet_\bullet\big) = \int_0^t d\Big[\big(e^{-K(t-r)}\otimes I_{t-r}\big)V_r^\circ\Big] = V_t^\circ - e^{-Kt}\otimes I_t. \]
The non-adapted equation (4.3) is obtained by applying the operator $T_t = T_{t-r}T_r$ to both parts of this integral equation and taking into account the commutativity of $e^{K(r-t)}K_m$ with $T_r$. Moreover, due to the adaptiveness of $V_t^\circ$,
\[ S_t\psi_f = T_t\big(E_tV_t^\circ\psi_f\otimes E_{[t}f^\otimes\big) = S_t(f^\bullet)\eta\otimes f_t^\otimes, \]
where $f_t^\otimes = T_tf^\otimes$, and $S_t(f^\bullet) = EV_t^\circ F$ is the solution to the equation
\[ S_t(f^\bullet) = e^{-Kt} - \int_0^t e^{-K(t-r)}K_\bullet f^\bullet(r)\,S_r(f^\bullet)\,dr, \qquad S_0(f^\bullet) = I. \]
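Hence $S_tF = E^*S_t(f^\bullet)$ if $f^\bullet\in\mathcal{E}_t$, and $F^*S_tF = S_t(f^\bullet)$ as $EF = I$. Since this equation is equivalent to the differential one
\[ \frac{d}{dt}S_t(f^\bullet)\eta + \big(K_\bullet f^\bullet(t) + K\big)S_t(f^\bullet)\eta = 0, \qquad S_0(f^\bullet)\eta = \eta, \qquad \forall\eta\in\mathcal{D}, \tag{4.7} \]
the function $t\mapsto S_t(f^\bullet)$, $f^\bullet\in\mathcal{E}$, is a strongly continuous cocycle,
\[ S_r(f_s^\bullet)\,S_s(f^\bullet) = S_{r+s}(f^\bullet), \qquad \forall r,s>0, \qquad f_s^\bullet(t) = f^\bullet(t+s), \qquad S_0(f^\bullet) = I. \]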
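The cocycle identity for the solutions of the forward equation (4.7) can be illustrated numerically. The following sketch assumes a finite-dimensional $\mathcal{D}$, two noise components, bounded matrices in place of $K$ and $K_m$, and a classical Runge-Kutta integrator; `propagate` and its arguments are hypothetical names chosen for this example.

```python
import numpy as np

def propagate(K, Ks, f, t, steps=2000):
    """Integrate dS/dt = -(K + sum_m K_m f^m(u)) S, S(0) = I, on [0, t] with classical RK4.
    f(u) returns the tuple (f^1(u), f^2(u), ...) of test-function values."""
    d = K.shape[0]
    S = np.eye(d, dtype=complex)
    h = t / steps

    def G(u):
        return -(K + sum(Km * fm for Km, fm in zip(Ks, f(u))))

    for i in range(steps):
        u = i * h
        k1 = G(u) @ S
        k2 = G(u + h / 2) @ (S + h / 2 * k1)
        k3 = G(u + h / 2) @ (S + h / 2 * k2)
        k4 = G(u + h) @ (S + h * k3)
        S = S + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return S

rng = np.random.default_rng(0)
c = lambda: 0.5 * (rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
K, Ks = c(), [c(), c()]
f = lambda u: (np.sin(u), 1j * np.exp(-u))   # an arbitrary smooth test function
r, s = 0.4, 0.7
# S_r(f_s) S_s(f) with the shifted function f_s(u) = f(u + s), compared with S_{r+s}(f)
lhs = propagate(K, Ks, lambda u: f(u + s), r) @ propagate(K, Ks, f, s)
print(np.allclose(lhs, propagate(K, Ks, f, r + s), atol=1e-8))
```

The shift $f_s^\bullet(t) = f^\bullet(t+s)$ is what turns the two-parameter propagator of the non-autonomous equation into a one-parameter cocycle over the shift of test functions.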
As was proved in [31], the multiple integral (4.2) gives a solution to the integral equation (4.3), and so the multiple integral for $V_t^\circ\psi_f = S_t(f^\bullet)\eta\otimes f^\otimes$,
\[ S_t(f^\bullet) = e^{-Kt} + \sum_{n=1}^\infty(-1)^n\int\!\cdots\!\int_{0<t_1<\dots<t_n<t}K(t,t_n)\cdots K(t_2,t_1)\,e^{-Kt_1}\,dt_1\cdots dt_n, \]
where $K(t,r) = e^{-K(t-r)}K_\bullet f^\bullet(r)$, corresponding to the iteration of the integral equation for $V_t^\circ$ on $\psi_f$, satisfies the HP equation (4.6).

The following theorem reduces the problem of solving differential evolution equations to the problem of iteration of integral equations, similar to the nonstochastic case [34, 35].

Proposition 4.2. Let $S_t = T_tV_t^\circ$, where $V_t^\circ\in\mathcal{L}(\mathfrak{D})$ are continuous operators defining the adapted cocycle solution to Eq. (4.6). Then the linear stochastic evolution equation (3.2) is equivalent to the quantum non-adapted (in the sense of [31]) integral equation
\[ \phi_t(B) = S_t^*BS_t + \int_0^t d\Lambda^\nu_\mu\Big(r,\ \phi_r\big[\varphi^\mu_\nu\big(S^*_{t-r}BS_{t-r}\big)\big]\Big) \tag{4.8} \]
with $\phi_0(B) = B\in\mathcal{B}$, where the maps $\varphi^\mu_\nu$ are extended onto B by w*-continuity and linearity as $\varphi^\mu_\nu(B\otimes Z) = \varphi^\mu_\nu(B)\otimes Z$ for $B\in\mathcal{B}$, $Z\in\mathcal{B}(\mathcal{F})$.
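Proof. The non-adapted equation (4.8) is understood in the coherent form sense as
\[ \langle\psi_f|\phi_t(B)\psi_f\rangle = \langle S_t\psi_f|BS_t\psi_f\rangle + \int_0^t\big\langle\psi_f\,\big|\,\phi_r\big[\varphi\big(\bar f^\bullet(r), S^*_{t-r}BS_{t-r}, f^\bullet(r)\big)\big]\psi_f\big\rangle\,dr, \]
where $\varphi(\bar e^\bullet, B, e^\bullet) = \sum_{m,n}\bar e^m\varphi^m_n(B)\,e^n + \sum_m\bar e^m\varphi^m(B) + \sum_n\varphi_n(B)\,e^n + \varphi(B)$. Due to the adaptiveness of $\phi_t$ this can be written for $\psi_f = \eta\otimes f^\otimes = F\eta$ with $f^\bullet\in\mathcal{E}_t$ as
\[ \phi_t(\bar f^\bullet, B, f^\bullet) = S_t^*(\bar f^\bullet)\,B\,S_t(f^\bullet) + \int_0^t\phi_r\Big(\bar f^\bullet,\ \varphi\big(r,\ S^*_{t-r}(\bar f_r^\bullet)\,B\,S_{t-r}(f_r^\bullet)\big),\ f^\bullet\Big)\,dr, \tag{4.9} \]
where $S_t^*(\bar f^\bullet) = S_t(f^\bullet)^*$, $f_r^\bullet(t) = f^\bullet(t+r)$. Here we take into account that due to adaptiveness $F^*\phi_r[Y]F = \phi_r(\bar f^\bullet, F_r^*YF_r, f^\bullet)$, where $F_r = T_rF$, and therefore
\[ F^*\phi_r\big[\varphi\big(r, S^*_{t-r}BS_{t-r}\big)\big]F = \phi_r\big(\bar f^\bullet,\ F_r^*\varphi\big(r, S^*_{t-r}BS_{t-r}\big)F_r,\ f^\bullet\big) = \phi_r\big(\bar f^\bullet,\ \varphi\big(r, F_r^*S^*_{t-r}BS_{t-r}F_r\big),\ f^\bullet\big) = \phi_r\Big(\bar f^\bullet,\ \varphi\big(r, S^*_{t-r}(\bar f_r^\bullet)\,B\,S_{t-r}(f_r^\bullet)\big),\ f^\bullet\Big) \]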
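as $F_t^*\varphi(t,B)F_t = \varphi(t, F_t^*BF_t)$ for $\varphi(t,B) = \varphi(\bar f^\bullet(t), B, f^\bullet(t))$ and $S_{t-r}F_r = F_tS_{t-r}(f_r^\bullet)$, where $F_t\eta = \eta\otimes\delta_\emptyset$ for any $f^\bullet\in\mathcal{E}_t$. Let us prove that the operator-valued function $t\mapsto S_s(t,f^\bullet) := S_{s-t}(f_t^\bullet)$ satisfies the backward evolution equation
\[ \frac{d}{dt}S_s(t,f^\bullet)\eta = S_s(t,f^\bullet)\big(K_\bullet f^\bullet(t) + K\big)\eta, \qquad S_0(f_s^\bullet)\eta = \eta, \qquad \forall t\in[0,s). \]
Indeed, taking into account the forward equation (4.7), we obtain it at $r = t$ from the cocycle property $S_s(t,f^\bullet)\,S_t(r,f^\bullet) = S_s(r,f^\bullet)$:
\[ 0 = \frac{d}{dt}S_s(r,f^\bullet)\eta = \Big(\frac{d}{dt}S_s(t,f^\bullet) - S_s(t,f^\bullet)\big(K_\bullet f^\bullet(t) + K\big)\Big)S_t(r,f^\bullet)\eta. \]
Now, replacing $B$ in (4.9) by $Y_s(\bar f^\bullet, t, f^\bullet) = S_s^*(t,\bar f^\bullet)\,B\,S_s(t,f^\bullet)$, we can write
\[ \phi_t\big(\bar f^\bullet,\ S_s^*(t,\bar f^\bullet)\,B\,S_s(t,f^\bullet),\ f^\bullet\big) = S_s^*(\bar f^\bullet)\,B\,S_s(f^\bullet) + \int_0^t\phi_r\Big(\bar f^\bullet,\ \varphi\big(r, S_s^*(r,\bar f^\bullet)\,B\,S_s(r,f^\bullet)\big),\ f^\bullet\Big)\,dr. \]
Calculating the total derivative $\frac{d}{dt}\phi_t\big(\bar f^\bullet, S_s^*(t,\bar f^\bullet)BS_s(t,f^\bullet), f^\bullet\big)$ by taking into account the backward equation, we obtain the differential equation at $s = t$:
\[ \frac{d}{dt}\phi_t(\bar f^\bullet, B, f^\bullet) + \phi_t\big(\bar f^\bullet,\ K(t)^*B + BK(t),\ f^\bullet\big) = \phi_t\big(\bar f^\bullet,\ \varphi(\bar f^\bullet(t), B, f^\bullet(t)),\ f^\bullet\big), \]
where $K(t) = K + K_\bullet f^\bullet(t)$. This equation written for $\langle\eta|\phi_t(\bar f^\bullet, B, f^\bullet)\eta\rangle$ coincides with the coherent matrix form (3.1) for the quantum stochastic equation (3.2) with $\psi_f = F\eta$. The converse is easy to show by integrating the equation for $\phi_t(\bar f^\bullet, B, f^\bullet)$ with $B$ replaced by $Y(t) = S_s^*(t,\bar f^\bullet)\,B\,S_s(t,f^\bullet)$:
\[ \phi_s(\bar f^\bullet, B, f^\bullet) - S_s^*(\bar f^\bullet)\,B\,S_s(f^\bullet) = \int_0^s\frac{d}{dt}\phi_t\big(\bar f^\bullet, S_s^*(t,\bar f^\bullet)BS_s(t,f^\bullet), f^\bullet\big)\,dt \]
\[ = \int_0^s\Big[\frac{d}{dt}\phi_t(\bar f^\bullet, Y(r), f^\bullet) + \frac{d}{dt}\phi_r(\bar f^\bullet, Y(t), f^\bullet)\Big]_{r=t}dt = \int_0^s\phi_t\Big(\bar f^\bullet,\ \varphi\big(t, S_s^*(t,\bar f^\bullet)BS_s(t,f^\bullet)\big),\ f^\bullet\Big)\,dt, \]
whereas $\frac{d}{dt}Y(t) = \big(K_\bullet f^\bullet(t) + K\big)^*Y(t) + Y(t)\big(K_\bullet f^\bullet(t) + K\big)$.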
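In the vacuum sector ($f^\bullet = 0$) the equivalence between the differential and the integral form reduces to the Duhamel formula for the semigroup $\theta_t = \epsilon\circ\phi_t$ with generator $\lambda(B) = \varphi(B) - K^*B - BK$. The sketch below checks this numerically under simplifying assumptions that are not part of the paper: $\mathcal{D}$ is finite-dimensional, $\varphi(B) = \sum_l L_l^*BL_l$ with bounded matrices $L_l$, the Fock space is ignored, and the integral is approximated by the trapezoid rule. All function names are hypothetical.

```python
import numpy as np

def expm(A, squarings=8, terms=20):
    """Matrix exponential by scaling and squaring with a Taylor polynomial."""
    A = np.asarray(A, dtype=complex) / 2 ** squarings
    E = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for k in range(1, terms + 1):
        term = term @ A / k
        E = E + term
    for _ in range(squarings):
        E = E @ E
    return E

def superop(A, B):
    """Matrix of the map X -> A X B acting on the row-major vectorization of X."""
    return np.kron(A, B.T)

def generator(K, Ls):
    """Return (lambda, phi) as superoperators: phi(B) = sum_l L_l* B L_l and
    lambda(B) = phi(B) - K* B - B K."""
    I = np.eye(K.shape[0])
    phi = sum(superop(L.conj().T, L) for L in Ls)
    return phi - superop(K.conj().T, I) - superop(I, K), phi

def mild_residual(K, Ls, B, t, n=400):
    """theta_t(B) - [S_t* B S_t + int_0^t theta_r(phi(S_{t-r}* B S_{t-r})) dr], S_t = exp(-K t)."""
    d = K.shape[0]
    lam, phi = generator(K, Ls)

    def theta(r, X):
        return (expm(r * lam) @ X.reshape(-1)).reshape(d, d)

    def sandwich(r, X):
        S = expm(-K * r)
        return S.conj().T @ X @ S

    rs = np.linspace(0.0, t, n + 1)
    vals = [theta(r, (phi @ sandwich(t - r, B).reshape(-1)).reshape(d, d)) for r in rs]
    integral = (t / n) * (sum(vals) - 0.5 * (vals[0] + vals[-1]))  # trapezoid rule
    return theta(t, B) - sandwich(t, B) - integral

rng = np.random.default_rng(1)
d = 3
c = lambda: 0.3 * (rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
Ls = [c(), c()]
H = c()
H = H + H.conj().T
K = 1j * H + 0.5 * sum(L.conj().T @ L for L in Ls)   # then K + K^dagger = phi(I)
print(np.max(np.abs(mild_residual(K, Ls, c(), 1.0))))  # small, limited by the quadrature error
```

With the choice $K + K^\dagger = \varphi(I)$ made in the usage lines, the generator annihilates the identity, which is the finite-dimensional counterpart of the normalization $\theta_t(I) = I$.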
Theorem 4.3. Let ϕ be a w*-continuous CP-map, and St = Tt Vt◦ be given by the solution to the quantum stochastic equation (4.6). Then the solutions to the evolution equation (3.2) with the generators, corresponding to (4.1), have the CP property, and satisfy the submartingale (contractivity) condition φt (I) ≤ t [φs (I)] for all t < s if ϕ (I) ≤ K + K † (φt (I) ≤ φs (I) if ϕ(I) ≤ K + K † ). The minimal solution can be constructed in the form of a multiple quantum stochastic integral in the sense [31] as the series
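\[ \phi_t(B) = \sum_{n=0}^\infty\int\!\cdots\!\int_{0<t_1<\dots<t_n<t}d\Lambda^{\nu_1\dots\nu_n}_{\mu_1\dots\mu_n}\Big(t_1,\dots,t_n,\ \varphi^{\mu_1\dots\mu_n}_{\nu_1\dots\nu_n}\big(t_1,\dots,t_n,\ S^*_{t-t_n}BS_{t-t_n}\big)\Big) \tag{4.10} \]
of non-adapted $n$-tuple CP integrals with $S_t^*BS_t$ at $n = 0$ and
\[ \varphi^{\mu_1\dots\mu_n}_{\nu_1\dots\nu_n}(t_1,\dots,t_n) = \varphi^{\mu_1}_{\nu_1}(t_1)\circ\varphi^{\mu_2}_{\nu_2}(t_2-t_1)\circ\dots\circ\varphi^{\mu_n}_{\nu_n}(t_n-t_{n-1}), \]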
where $\varphi^\mu_\nu(t,B) = S_t^*\varphi^\mu_\nu(B)S_t$. If $\boldsymbol\varphi$ is bounded, then the solution to the equation is unique, and $\phi_t(I) = \epsilon_t[\phi_s(I)]$ for all $t<s$ if $K + K^\dagger = \varphi(I)$ ($\phi_t(I) = I$ if $\mathbf{K} + \mathbf{K}^\dagger = \boldsymbol\varphi(I)$).

Proof. The existence and uniqueness of the solutions to the quantum stochastic equations (3.2) with the bounded generators $\lambda^\mu_\nu(B) = \gamma^\mu_\nu(B) - B\delta^\mu_\nu$ and the initial conditions $\phi_0(B) = B$ in an operator algebra $\mathcal{B}\subseteq\mathcal{L}(\mathcal{H})$ was proved in [31]. The CP property of the solution to this equation with the generators corresponding to the conditionally positive germ-matrix (4.1) can be proven in the form (4.10), which is obtained by the iteration
\[ \phi_t^{n+1}(B) = S_t^*BS_t + \int_0^t d\Lambda^\nu_\mu\Big(r,\ \phi_r^n\big[\varphi^\mu_\nu\big(S^*_{t-r}BS_{t-r}\big)\big]\Big), \qquad \phi_t^0(B) = S_t^*BS_t, \]
of the equivalent non-adapted integral equation (4.8). Indeed, in order to prove the complete positivity of the solution written in this form, one should prove the positive definiteness of the iteration
\[ \phi_t^{n+1}(\bar f^\bullet, B, f^\bullet) = S_t^*(\bar f^\bullet)\,B\,S_t(f^\bullet) + \int_0^t\phi_r^n\Big(\bar f^\bullet,\ \varphi\big(\bar f^\bullet(r),\ S^*_{t-r}(\bar f_r^\bullet)\,B\,S_{t-r}(f_r^\bullet),\ f^\bullet(r)\big),\ f^\bullet\Big)\,dr \]
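of the integral equation (4.9) with the CP $\phi_t^0(B) = S_t^*BS_t$. Thus, we have to test the positive definiteness of the forms
\[ \sum_{B,C}\sum_{f,h}\big\langle\xi_B^f\,\big|\,\phi_t^{n+1}(\bar f^\bullet, B^*C, h^\bullet)\,\xi_C^h\big\rangle = \sum_{B,C}\big\langle BS_t(f^\bullet)\eta_B\,\big|\,CS_t(f^\bullet)\eta_C\big\rangle + \int_0^t\sum_{B,C}\sum_{f,h}\Big\langle\boldsymbol\eta_B^f(r)\,\Big|\,\phi_r^n\Big(\bar f^\bullet,\ \boldsymbol\varphi\big(S^*_{t-r}(\bar f_r^\bullet)\,B^*C\,S_{t-r}(h_r^\bullet)\big),\ h^\bullet\Big)\boldsymbol\eta_C^h(r)\Big\rangle\,dr, \]
where $\eta_B = \sum_f\xi_B^f$, $\boldsymbol\eta_B^f(r) = \xi_B^f\otimes\mathbf{f}(r)$, and $\mathbf{f}(r) = 1\oplus f^\bullet(r)$. It is a consequence of the CP condition for $\boldsymbol\varphi$ and the CP property for $\phi^n_{t_n}$, $\forall t_n<t$, which obviously follows from the positive definiteness of $\phi_r^{n-1}$, $r<t_n$, and so on up to $\phi_r^0$, $r<t_1$. The direct iteration of this integral recursion with the initial CP condition $\phi_t^0(B) = S_t^*BS_t$ gives at the limit $n\to\infty$ the solution in the form of a series
\[ \phi_t(\bar f^\bullet, B, f^\bullet) = \sum_{n=0}^\infty\int\!\cdots\!\int_{0<t_1<\dots<t_n<t}\varphi\big(t_1,\dots,t_n;\ S^*_{t-t_n}(\bar f_{t_n}^\bullet)\,B\,S_{t-t_n}(f_{t_n}^\bullet)\big)\,dt_1\cdots dt_n \]
of $n$-tuple integrals on the interval $[0,t)$ with $S_t^*(\bar f^\bullet)BS_t(f^\bullet)$ at $n = 0$. The positive definite kernels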
\[ \varphi(t_1,\dots,t_n) = \varphi_0(t_1)\circ\varphi_{t_1}(t_2)\circ\dots\circ\varphi_{t_{n-1}}(t_n), \]
where $\varphi_r(t,B) = S^*_{t-r}(\bar f_r^\bullet)\,\varphi(\bar f^\bullet(t), B, f^\bullet(t))\,S_{t-r}(f_r^\bullet)$, are obtained by the recurrence
\[ \varphi(t) = \varphi_0(t), \qquad \varphi(t_1,\dots,t_n) = \varphi(t_1,\dots,t_{n-1})\circ\varphi_{t_{n-1}}(t_n), \]
corresponding to (4.5). This proves the CP property for the series (4.10), which converges to a $0\le Y_t\le\kappa R_t$ for any positive bounded $0\le B\le\kappa I$ because of the increase $Y_t^n\le Y_t^{n+1}$ for $Y_t^n = \phi_t^n(B)$ and the boundedness $Y_t^n\le\kappa R_t^n$, $R_t^n\le R_t$, where $R_t$ is the continuous sesquilinear form (4.4). As follows from the exponential estimate [31] for the solutions to the quantum stochastic equations (3.2) with the bounded generators, $R_t = \phi_t(I)$ might be unbounded, but strongly continuous in the Fock scale $\mathcal{F}$. In the case of unbounded generators the solution to (4.4) might not be unique, and the iterated series (4.10) gives obviously the minimal one, which is unique among such solutions.

Let us prove the submartingale property for the sesquilinear form $R_t$ given by the weakly convergent series (4.4). $R_s$ for $s>t$ is defined as the iterated solution $Y_s = R_s := \lim R_s^n$ to the backward integral equation
\[ Y_s = S_s^*BS_s + \int_0^s d\Lambda^\nu_\mu\big(r,\ S_r^*\varphi^\mu_\nu(Y_{s-r})S_r\big) \]
for the series $Y_s = \phi_s(B)$ with $B = I$. It satisfies the integral equation
\[ R_s = S_t^*R_{s-t}S_t + \int_0^t d\Lambda^\nu_\mu\big(r,\ S_r^*\varphi^\mu_\nu(R_{s-r})S_r\big), \]
where we used the semigroup property $S_{s-t}S_t = S_s$ and that
\[ \int_t^s d\Lambda^\nu_\mu\big(r,\ S_r^*\varphi^\mu_\nu(Y_{s-r})S_r\big) = V_t^{\circ*}\int_t^s d\Lambda^\nu_\mu\big(r,\ T_t^*S^*_{r-t}\varphi^\mu_\nu(Y_{s-r})S_{r-t}T_t\big)V_t^\circ, \]
\[ \int_t^s d\Lambda^\nu_\mu\big(r-t,\ S^*_{r-t}\varphi^\mu_\nu(Y_{s-r})S_{r-t}\big) = \int_0^{s-t}d\Lambda^\nu_\mu\big(r,\ S_r^*\varphi^\mu_\nu(Y_{s-t-r})S_r\big). \]
This can be written in terms of the coherent matrix elements $R_s(\bar f^\bullet, f^\bullet) = F^*R_sF$, $f^\bullet\in\mathcal{E}_s$, as
\[ R_s(\bar f^\bullet, f^\bullet) = S_t^*(\bar f^\bullet)\,R_{s-t}(\bar f_t^\bullet, f_t^\bullet)\,S_t(f^\bullet) + \int_0^t S_r^*(\bar f^\bullet)\,\varphi\big(\bar f^\bullet(r),\ R_{s-r}(\bar f_r^\bullet, f_r^\bullet),\ f^\bullet(r)\big)\,S_r(f^\bullet)\,dr. \]
The coherent matrix elements $Y_t(\bar f^\bullet, f^\bullet)$ of the conditional expectation $Y_t = \epsilon_t(R_s)$ coincide with $R_s(\bar f^\bullet, f^\bullet)$ if $f^\bullet\in\mathcal{E}_t$. Hence, they satisfy the integral equation
\[ Y_t(\bar f^\bullet, f^\bullet) = S_t^*(\bar f^\bullet)\,P_{s-t}\,S_t(f^\bullet) + \int_0^t S_r^*(\bar f^\bullet)\,\varphi\big(\bar f^\bullet(r),\ Y_{t-r}(\bar f_r^\bullet, f_r^\bullet),\ f^\bullet(r)\big)\,S_r(f^\bullet)\,dr, \]
corresponding to the non-adapted backward equation
\[ Y_t = S_t^*P^\circ_{s-t}S_t + \int_0^t d\Lambda^\nu_\mu\big(r,\ S_r^*\varphi^\mu_\nu(Y_{t-r})S_r\big), \]
where $P_s^\circ = P_s\otimes I$, $P_s = R_s(0,0)$, as $f_t^\bullet(r) = f^\bullet(r+t) = 0$, $\forall r\in\mathbb{R}_+$, if $f^\bullet\in\mathcal{E}_t$. The operators $P_s = \epsilon(R_s) = \theta_s(I)$ are given by the Markov semigroup $\theta_s = \epsilon\circ\phi_s$ as the decreasing solutions to the integral equation
\[ P_s = e^{-K^*s}e^{-Ks} + \int_0^s e^{-K^*r}\varphi(P_{s-r})\,e^{-Kr}\,dr, \]
et = and Pt ≤ I if K + K ≤ 0. (See, for example, [34].) Thus, the difference R Rt − t (Rs ) = Rt − Yt satisfies the same equation Z t ∗e e et−r Sr Rt = St Is−t St + dΛνµ r, Sr∗ ϕµν R †
0
as $R_t$ with $\tilde I_s = I - P_s^\circ$ instead of $I$. The iteration of this equation defines it as the weak limit $\tilde R_t = \lim\tilde R_t^n$ in the form of the series (4.10) with $B = I - P_{s-t}\ge 0$. Hence $\tilde R_t = \phi_t(I - P_{s-t})$ is a positive sesquilinear form on $\mathfrak{D}$ for any $s\ge t$ due to the positivity of $\phi_t$. The proof of contractivity $\phi_t(I)\le\phi_s(I)$ for $t<s$ is similar to that one, without the vacuum averaging of $R_t$.

5. The Structure of the Generators and Flows

First, let us prove the structure (3.9) for the (unbounded) form-generator of CP flows over the algebra $\mathcal{B} = \mathcal{L}(\mathcal{H})$ of all bounded
operators. This algebra contains the onedimensional operators |η 0 ihη 0 | : η 7→ η 0 |η η 0 given by the vectors η 0 , η 0 ∈ H. 1. Let us fix a vector η 0 ∈ D ⊕ D• with the unit projection η 0 ∈ D, η 0 = 1, and make the following assumption of the weak continuity for the linear operator η 0 7→ γ |η 0 ihη 0 | η 0 . 0) The sequence η 0n = γ |ηn0 ihη 0 | η 0 ∈ D0 ⊕ D•0 of anti-linear forms
η ∈ D ⊕ D• 7→ hη|η 0n i := η|γ |ηn0 ihη 0 | η 0 converges for each sequence ηn0 ∈ H converging in D0 ⊇ H. Proposition 5.1. Let the CCP germ-matrix γ satisfy the above continuity condition for a given η 0 . Then there exist strongly continuous operators K ∈ L (D) , K• : D• → D defining the matrix operator K in (3.9), such that the matrix map (3.8) is CP, and there exists a Hilbert space K, a ∗ -representation : B 7→ B ⊗ J of B = L (H) on the Hilbert product G = H ⊗ K, given by an orthoprojector J in K, such that ∗ ϕ (B) = (Lµ (B) Lν )µ=−,• ν=+,• = L (B) L.
(5.1)
Here L = (L, L• ) is a strongly continuous operator D⊕D• → G with L = L+ , L− = L∗ , L• = L∗• which is always possible to make
0 η ⊗ e|Lη 0 = 0, (5.2) ∀e ∈ K1 , where K1 = JK. If D = −λ (I) ≥ 0, then one can make L∗ L = K + K † in a canonical way, and in addition L∗ L• = K• , L∗• L• = I•• , where I•• = Iδ•• , if D = −λ (I) ≥ 0.
Proof. Define the linear operator A : H → D0 ⊕ D•0 by the relation
hη|Aη 0 i = η|γ |η 0 ihη 0 | η 0 for all η ∈ D ⊕ D• and η 0 ∈ H. By the weak continuity it can be extended on D0 and its dual operator A∗ = A∗ , A∗• into D is strongly continuous on D ⊕ D• . The operators K=
1 0 η |γ |η 0 ihη 0 | η 0 I − A∗ , 2
K• = −A∗•
define the matrix-map (3.8) in the form
hη|ϕ (B) ηi = hη|γ (B) ηi + η 0 |γ |η 0 ihη 0 | η 0 hη|Bηi −
η|γ B|ηihη 0 | η 0 − η 0 |γ |η 0 ihη|B η , where η ∈ D is the natural projection of η ∈ D ⊕ D• onto D. Let us prove that this is a CP map, i.e. X
ξ B |ϕ B ∗ C ξ C ≥ 0 B,C∈B
for all ξ B = 0 except for a finite number of B = Bk ∈ B, k = 1, 2, . . ., for which ξ B = η k . Indeed,
η k |ϕ Bk∗ Bl η l = η k |γ Bk∗ Bl η l + η 0 |γ |η 0 ihη 0 | η 0 η k |Bk∗ Bl η l −
k
η |γ Bk∗ Bl |η l ihη 0 | η 0 − η 0 |γ |η 0 ihη k |Bk∗ Bl η l X
= η k |γ Bk∗ Bl η l , k,l≥0
P where B0 = − B∈B B|ξB ihη 0 |, and η k = ξ B for B = Bk , k = 1, 2, . . .. Because P P k 0 k≥0 Bk η = B0 η + B∈B BξB = 0, this form is positive, as it is written as a conditionally positive form X
ζ B |γ B ∗ C ζ C ≥ 0, B,C∈B
P with B∈B ι (B) ζ B = 0, where ζ B = η k = ξ B if B = Bk 6= B0 , and ζ B = ξ B + η 0 for B = B0 , otherwise ζ B = 0. Moreover,
η|ϕ |η 0 ihη 0 | η 0 = η|γ |η 0 ihη 0 | η 0 + η 0 |γ |η 0 ihη 0 | η 0 hη|η 0 i −
η|γ |η 0 ihη 0 | η 0 − η 0 |γ |η 0 i hη|η 0 i hη 0 | η 0 = 0. Thus, the form-generator over B = L (H) has the form (3.9), where the CP map ϕ can always be chosen to satisfy ϕ |η 0 ihη 0 | η 0 = 0 for all η 0 ∈ H and a given vector η 0 ∈ D. The Steinspring dilation (5.1) of the CP map ϕ into the continuous forms ϕ (B) ∈ B (D ⊕ D• ) is given by a continuous operator L : D ⊕ D• → G with the dual L∗ : G → D0 ⊕ D•0 because 2
kLη n k = hη n |ϕ (I) η n i −→ 0
if η n −→ 0 strongly in D ⊕ D• . The w*-representation : B → L (G) of B = L (H) is always an ampliation (B) = B ⊗ J, where J is an orthoprojector onto a subspace K1 ⊆ K, corresponding to the minimal dilation in G1 = H ⊗ K1 . The property (5.2) follows from the inequality |eihe| ≤ J if Je = e: 0
η ⊗ e|Lη 0 2 ≤ Lη 0 | |η 0 ihη 0 | Lη 0
= η 0 |ϕ |η 0 ihη 0 | η 0 = 0. If K + K † ≥ ϕ (I), L1 = (I ⊗ J) L is the operator of the minimal dilation ϕ (B) = L1 (B) L1 , so that ϕ (I) = L1 L1 with respect to the adjoint L1 : G1 → D0 , and L0 is an operator on D into a Hilbert product G0 = H ⊗ K0 , satisfying the condition L0 L0 = D with respect to the adjoint L0 : G0 → D0 , then L◦ : η 7→ L0 η ⊕ L1 η defines the canonical dilation in G◦ = H ⊗ K◦ having the property L◦ L◦ = ϕ (I) + D = K + K † , where K◦ = K0 ⊕ K1 and L◦ : G◦ → D0 is the adjoint to L◦ : D → G◦ . Moreover, if µ=−,• µ Kν + K µ δν+ ν=+,• = K + K † − ϕ(I) ≥ 0, D = δνµ I − ϕµν (I) + δ− the operator L◦ : η 7→ L0 η ⊕ L1 η defines the canonical dilation with the property L∗ L = K + K † : µ Lµ◦ L◦ν = Lµ0 L0ν + Lµ1 L1ν = Dνµ + ϕµν (I) = δ− Kν + K µ δν+ + δνµ I, where L0 : D → G0 are operators L0 , L0• with the adjoints Lµ0 = L0∗ −µ , satisfying the conditions Lµ0 L0ν = Dνµ , and L1ν = (I ⊗ J) Lν are the operators of the minimal dilation ϕµν (B) = Lµ1 (B) L1ν .
2. Thus we have proved that Eq. (3.2) for completely positive quantum stochastic flows over B = L (H) has the following general form ∞ X φt L∗m (B) Ln − Bδnm dΛnm dφt (B) + φt K ∗ B + BK − L∗ (B) L dt = m,n=1
+
∞ X m=1
∞ X ∗ φt L∗m (B) L − Km B dΛ+m + φt L∗ (B) Ln − BKn dΛn− , n=1
generalizing the Lindblad form [1] for the semigroups of completely positive maps. This can be written in the tensor notation form as ν β µ ? (5.3) dφt (B) = φt Lµα α β (B) Lν − ıν (B) dΛµ = φt L J (B) L − ı (B) · dΛ, where the summation is taken over all α, β = −, ◦, + and µ, ν = −, •, +, − − (B) = B = µ µ=−,•,+ ? µ µ (B) (B) , and L = L ++ (B), ◦◦ (B) = (B), α = 0 if α = 6 β, ı = Bδ ν ν α α=−,◦,+ is the β β β=−,◦,+ + triangular matrix, pseudoadjoint to L = Lν ν=−,•,+ with L− − = I = L+ , ∗ − • ∗ − L◦• = L• , L•◦ = L∗• , L◦+ = L, L− ◦ = L , L• = −K• , L+ = −K• , L+ = −K.
(All other Lµα and Lβν are zero.) If the Hilbert space H ⊗ G is embedded into the direct sum H ⊕ H ⊕ ... of copies of the initial Hilbert space H such that J = δli for a subset i, l ∉ N0 ⊆ N, this equation can be resolved as φt (B) = Vt∗ (B ⊗ It ) Vt , where
V = (Vt )t>0 is an (unbounded) cocycle on the product D ⊗ F with Fock space F over the 2 (N × R+ ) of the quantum noise, and It is the solution to the stochastic Hilbert space LP equation dIt + n∈N0 It dΛnn = 0 with I0 = I in F. The cocycle V satisfies the quantum stochastic equation dVt = (Lµν − Iδνµ )Vt dΛνµ of the form dVt + KVt dt +
∞ X
Kn Vt dΛn− =
n=1
∞ X
∞ X m n − Iδ dΛ + Lm Vt dΛ+m , Lm V t n n m
m,n=1
m=1
where Lin and Li are the operators in D, defining X X l Ll∗ ϕ (B) = Ll∗ BLl ϕm n (B) = m BLn , l∈N / 0
ϕm (B) = P∞
X
(5.5)
l∈N / 0 l Ll∗ m BL ,
ϕn (B) =
†
P
l∈N / 0 †
(5.4)
X
Ll∗ BLln
l∈N / 0
i∗ i with i=1 L L = K + K if K + K ≥ ϕ (I) = l∈N / 0 L L . The formal derivation of Eq. (5.4) from (5.3) is obtained by a simple application of the HP Itˆo formula. The martingale Mt , describing the density operator for the output state of Λ (t, a), is then defined as Mt = Vt∗ Vt . 3. The following theorem ensures the existence of a ∗-representation ι : Λ (t, a) 7→ β Λ (t, i (a)) := iα β (a) Λα (t) of the quantum stochastic process (0.2), commuting with Yt = φt (B) for all a ∈ a, B ∈ L (H), in the form i∗
i
◦ − + Λ (t, i (a)) = i◦◦ (a) Λ◦◦ (t) + i◦+ (a) Λ+◦ (t) + i− ◦ (a) Λ− (t) + i+ (a) Λ− (t) . α=−,◦ is a ?-representation Here i= iα β β=+,◦
α ? ◦ ? iα iβ (a) , β a a = i◦ a
β ? = i−α (a)∗ iα −β a
of the Itˆo algebra a in the operators iα β (a) : Kβ → Kα , with a domain K◦ ⊆ K, K− = C =K+ , and Λβα (t) are the canonical quantum stochastic integrators in the Fock space Γ (K) over K = L2K (R+ ), the space of K-valued square-integrable functions on h iα=−,◦,+ R+ .We shall extend i to the triangular matrix representation i = iα on the β β=−,◦,+ h i α pseudo-Hilbert space C⊕K ⊕ C with the Minkowski metrics tensor g = δ−β = g−1 , µ µ=−,•,+ by i+β (a) = 0 = iα − (a), for all a ∈ a, as it was done for a = aν ν=−,•,+ , and denote the α ampliation I ⊗ iα β (a) again as iβ (a) by omitting the index ◦. Note that if the stochastic generator of the form (3.9) is restricted onto an operator algebra B ⊆ L (H) with the weak closure B¯ = Ac , and all the sesquilinear forms γνµ (B), B ∈ B commute with the ¯ ∗-algebra A ⊂ L (D), then λµν (B) ∈ B. Proposition 5.2. Let b = γ (B) −ı (B) satisfy the commutativity conditions (3.4) for all a ∈ a, B ∈ L (H). Then there exists a ?-representation a 7→ i (a) of the Itˆo algebra a, ∗ (a) : Kβ → Kα , with iα defining the operators iα β β (a) Kα ⊆ Kβ , where K− = C =K+ , µ α β such that Lα µ I ⊗ aν = I ⊗ iβ (a) Lν for all a ∈ a : • − − a− L• a•• = i (a) L• , + − K• a+ = i (a) L + i+ (a) , • − a− L• a•+ = i (a) L + i+ (a) , • − K• a• = i (a) L• .
(5.6)
If A, γνµ (B) = 0 for all A ∈ A and B ∈ B, where B ⊆ L (H) is a ∗-algebra of bounded h iα=−,◦,+ operators, and B¯ = Ac , then there exists a triangular ?-representation j = jβα of the operator algebra A with j◦◦ (I) = J such that JLA = j (A) L, j (A) , i (a) = 0 j (A) , J (B) = 0,
β=−,◦,+
∀A ∈ A , a ∈ a, B ∈ B. (5.7)
Proof. Let G = G− ⊕ G ⊕ G+ be the pseudo-Hilbert space, where G+ = D = G− , G ⊆ H ⊗ K is the linear span of { (B) Lη|B ∈ B, η ∈ D ⊕ D• } and the indefinite metric is defined by
α 2 ξ |gαβ ξ β = kξk + 2Re ξ + |ξ − , ξ α ∈ Gα , ξ ◦ = ξ ∈ G. The algebra B = L (H) is represented on G by the ampliation J (B) = B ⊗ J, where J = 1 ⊕ J ⊕ 1, and J (B) Lgη ∈ G ◦ , where the pre-Hilbert space D ⊕ D• is isometrically embedded into D ⊕ D• ⊕ D as g (η ⊕ η • ) = 0 ⊕ η • ⊕ η. We define the representations i and j on G by intertwining i(a)L = La,
i (a) J (B) L = J (B) La,
j (A) J (B) L = J (B) LA,
the operators a =I ⊗ ag and A = A ⊗ I. Such a definition is correct, because if k k J (Bk ) Lζ = 0 for a finite family of non-zero ζ ∈ D ⊕ D• , then k = J (B) Lgη| J (Bk ) Lagζ k = J (B) Lgη|i (a) J (Bk ) Lgζ E D E η|γ B ∗ Bk gaζ k = η|agγ B ∗ Bk ζ k = D E a? η|gγ B ∗ Bk ζ k = J (B) La? gη| J (Bk ) Lgζ k = 0 D
for all η ∈ D ⊕ D• and B ∈ L (H), and so i (a) (Bk ) Lζ k = 0. Here we used the condition γ (B) ga = (ı (B) + b) ag = a (ı (B) + b) g = agγ (B) , as ab = ba due to the HP commutativity agb = bga of b = λ (B), where γ = (γνµ ) is extended to all indexes as γνµ (B) = Bδνµ +Cνµ with Cνµ = 0 if µ = + or ν = −. In the same way, the operators j (A) are correctly defined for B ∈ Ac if γ (B) ı (A) = ı (A) γ (B). This also proves that i (a)? = i (a? ), and j (A)? = j (A∗ ). (The multiplicativity of i, j as well as the commutativity properties (5.7) directly follow from the definition of these operators.) Note that j (I) = I ⊗ J, and if the dilation is minimal, j (I) = I ⊗ I. If it is not, the unital property can still be achieved for the canonical dilations in K◦ , by adding j◦◦ (A) (η ⊗ e0 ) = Aη ⊗ e0 for all e0 ∈ K with Je0 = 0. 4. Now we are going to construct the quantum stochastic dilation for the flow φt (B) and the quantum state generating function ϑat = [Rt W (t, a)] of the output process Λ (t, a) in the form φt (B) = Vt∗ (It ⊗ B) Vt , ϑt (g) = Vt∗ Wta ⊗ I Vt , ∀B ∈ L (H) , a ∈ a, where Vt is an operator on D ⊗ F into Γ (K) ⊗ D ⊗ F, intertwining the Weyl operators W (t, a) with the operators Wta = W (t, i (a)) It in the Fock space Γ (K),
dW (t, i (a)) = W (t, i (a)) dΛ (t, i (a)) ,
W (0, i (a)) = I,
and It ≥ Is , ∀t ≤ s is a decreasing family of orthoprojectors. In order to prove the existence of the Fock space dilation we need the following assumptions in addition to the continuity assumptions of this and previous sections. 1) The minimal quantum stochastic CP flow over the algebra A, resolving the quantum Langevin equation dτt (A) = τt (j (A) − ı (A)) · dΛ,
τ0 (A) = I ⊗ A,
A ∈ A,
(5.8)
where j (I) = J ⊗ I, ı(A) = I ⊗ A, is the multiplicative flow, satisfying the condition τt (I) = It ⊗ I, where It is the solution to the stochastic equation dI = (J − I)◦◦ It dΛ◦◦ with I◦ = I. 2) Let us assume the strong continuity of the operators L (e) ¯ : D → D, L• (e) ¯ : ¯ η|η 0 i = ¯ = (I ⊗ e∗ ) Lν by hLν (e) D• → D, given for all e ∈ K as Lν (e) hLν η|η 0 ⊗ ei ∀η ∈ D, η 0 ∈ D0 . This is necessary for the definition of the operators Vt (σ) for each subset σ ⊂ [0, t) of a finite cardinality |σ| ∈ N by the recurrence ! X Lm Vr σ\s ψ m (s) , s = max σ, Vt (σ) ψ = Vt◦ (s) LVs σ\s ψ + m
with Vt (∅) = Vt◦ . Here Vt◦ (s) = Tt∗ St−s Ts , Vt◦ is the solution to Eq. (4.3) in D ⊗ F ⊗|σ| ⊗ Vt◦ on K⊗|σ| ⊗ D ⊗ F, the operators Lν : D → K ⊗ H act on acting as I◦ ⊗|σ\s| ⊗|σ\s| ⊗ H ⊗ F as I◦ ⊗ L ⊗ I (s is identified with the single point subset K {s} such that σ\ max σ is the σ without its maximum) and ψ • (s) ∈ K ⊗ H ⊗ F is given as ψ • (τ, s) = ψ (τ t s) of ψ ∈ H ⊗ F, where τ t s is defined for almost all s (s ∈ / σ) as the disjoint union of the single point {s} with a finite subset τ ∈ R+ . 3) The operator-valued function σ 7→ Vt (σ), defined for all such Q σ ∈ Γt , is weakly square integrable for each t with respect to the measure dσ = s∈σ ds in the sense Z 2
kVt (σ) ψk dσ := Γt
∞ Z X
Z ···
Vt s1,..., sn ψ 2 ds1 . . . ds1 < ∞,
n=0 0<s ...s
for all ψ ∈ D ⊗ F. Thus the operators Vt can be extended to the Fock space ones Vt : D⊗F → Γ (K)⊗D⊗F, say, by letting Vt (σ) = Vt (σt )⊗δ∅ σ{t for all finite σ ⊂ R+ r (σ) Vr (σ) = Vt (σ), if σ[t = σ ∩ [t, ∞) 6= ∅. Obviously they form a cocycle, Vt−r ⊗|σr | r ∗ ⊗ Tr Vs (σr − r) Tr with σr = σ ∩ [0, r). where Vs (σ) = I◦ Theorem 5.3. Under the given assumptions 0), 1), 2), 3) there exist: (i)
A cocycle dilation Vt : D ⊗ F → Γ (K) ⊗ D ⊗ F of the minimal CP flow φ, intertwining the Weyl operator W (t, a) with Wta : Vt (I ⊗ W (t, a)) = Wta ⊗ I Vt , φt (B) = Vt∗ (It ⊗ B) Vt , ∀a ∈ a, B ∈ L (H) , (5.9) where It ≤ Is , ∀t < s are orthoprojectors in Γ (K).
(ii) A ∗-multiplicative flow τ = (τt ) over A in 0 (K) ⊗ H with the properties τt (I) = It , τt (A) , Wta = 0, Vt A = τt (A) Vt , (5.10) [τt (A) , I ⊗ B] = 0, ∀A ∈ A, a ∈ a, B ∈ B. (iii) If λ (I) ≤ 0, then one can make Mt = Vt∗ Vt martingale, and, if λ (I) ≤ 0, one can make Vt isometric, Vt∗ Vt = I. (iv) Moreover, let U = (U t )t>0 be a one parametric weakly continuous cocycle of unitary operators on 0 (K) ⊗ H ⊗ 0 (E) , giving the unique solution to the quantum stochastic equation dUt + Kdt + K•− dΛ•− + K◦− dΛ◦− Ut = L◦+ dΛ+◦ − I•• dΛ•• + J•◦ dΛ•◦ + J◦• dΛ◦• + J◦◦ − I◦◦ dΛ◦◦ Ut (5.11) with U0 = I and the necessary differential unitarity conditions ◦ − − ◦ • ◦ • − − ◦ ◦ ◦ ◦ • K + K † = L− ◦ L + , K • = L ◦ J • , J ◦ J • = I • , K ◦ = L◦ J ◦ , J ◦ = I ◦ − J • J ◦ , ◦∗ • ◦∗ ◦ ◦ where L− ◦ = L+ , J◦ = J• . If λ(I) ≤ 0 and L+ = L is the canonical operator the dilation (5.1), then
hψ| (A ⊗ I) φat (B) ψi = Ut (δ∅ ⊗ ψ) | τta (A) (I ⊗ B) Ut (δ∅ ⊗ ψ) (5.12)
for all A ∈ A, a ∈ a, B ∈ B, where ψ is any initial state ψ0 = η ⊗ δ∅ , η ∈ D and φat (B) = (I ⊗ W (t, a)) φt (B) , τta (A) = Wta ⊗ I τt (A) . If λ(I) ≤ 0, and in addition J•◦ = L◦• is the canonical isometry, this unitary cocycle dilation is valid also for any state ψ ∈ D ⊗ F. Proof. (Sketch). The cocycle V = (Vt )t>0 is recurrently constructed due to the above assumptions (1)–(3). It obviously intertwines the Weyl operators (2.4) with the operators Wta , acting in the same way in Γ (K), by virtue of the property (5.6). Let us denote by K1 = L2K (R+ ) the functional Hilbert space corresponding to the minimal dilation (5.1) sub-space K = K1 for the CP map ϕ, given by the orthoprojector J = J1 in the space K◦ of the canonical dilation, and K0 its orthogonal complement, corresponding to K0 = J0 K◦ , where J0 = I − J1 . Representing Γ (K0 ⊕ K1 ) as Γ (K0 ) ⊗ Γ (K1 ), let us denote by It the survival orthoprojectors It χ σ 0 , σ 1 = δ∅ σt0 χ σ 0 , σ 1 , 0 1 σt = σ ∩ [0, t), where χ σ 0 , σ 1 = χ σ 0 ⊔ σ 1 ∈ K⊗|σ | ⊗ K⊗|σ | is the set function, representing a χ ∈ Γ (K0 ⊕ K1 ). The decreasing family (It )t>0 defines the decay orthoprojectors Et = I − It in Γ (K◦ ) satisfying the quantum stochastic equation dEt = Et J0 · dΛ◦◦ with E0 = 0, and Λ◦◦ is the number integrator in the Fock space Γ (K◦ ) over K◦ = K0 ⊕ K1 . Then one easily finds that the minimal CP flow (4.10) can be represented as φt (B) = Vt∗ (It ⊗ B) Vt . We may also construct the minimal quantum stochastic ∗-flow over the operator algebra A, resolving the quantum Langevin equation (5.8) by its iteration as it was done in Sect. 4 for the flow φ, and then prove its ∗-multiplicativity under certain conditions as in [30]. However, we can directly construct the representations τt with the property τt (I) = It in a similar way as it was done for the representation j, and then prove that it satisfies the Langevin equation.
Then the properties (5.10) follow from the definition of the operators Vt , and can be checked recurrently by use of (5.6) and (5.7).
The cocycle U = (Ut ) is constructed to satisfy the HP quantum stochastic equation (5.11). It can be represented in the form of the stochastic multiple integral of the chronologically ordered products of the coefficients of the quantum differential equation under the integrability conditions given in the Appendix. If K + K † ≥ ϕ (I), the HP unitarity condition [8] is satisfied for the canonical choice L◦+ = L◦ , where L◦ = L0 + L1 , and arbitrary isometric operator J•◦ , J◦• J•◦ = I•• with K•− = L◦ J•◦ , K◦− = L◦ J◦◦ , J◦◦ = I◦◦ −J•◦ J◦• . In addition if K +K † ≥ ϕ (I), we make the choice J•◦ = L◦• from the canonical dilation, L◦• = L0• + L1• , and so K•− = L◦ L◦• = K• , where L∗◦ = L◦ , J◦• = J•◦∗ . In the first, subfiltering case λ (I) ≤ 0 such a choice gives Ut (δ∅ ⊗ ψ0 ) = Vt ψ0 for any ψ0 = η ⊗ δ∅ , η ∈ D and therefore kVt ψ0 k = kψ0 k. Thus Mt = Vt∗ Vt is a martingale and the condition (5.12) is satisfied for any initial ψ0 . In the second, contractive case λ (I) ≤ 0 the canonical choice gives Ut (δ∅ ⊗ ψ) = Vt ψ and therefore kVt ψk = kψk for any ψ ∈ D ⊗ F. Thus Vt∗ Vt = I and the condition (5.12) is satisfied for any ψ. 6. Appendix Here we give a resume on the sufficient analytical conditions for the quantum multiple integration [5] of stochastic linear differential equations in Hilbert spaces, based on the noncommutative analysis in the Fock scale [31]. 2 2 1. Let ke• k (ξ) = ξ ke• k , ξ > 0 as in [31], so that the projective limit E and the 2 0 dual space E coincide with the Hilbert space K with the norm kek . The projective 2 Fock space F = ∩ξ 0 (K, ξ) over h K =i LK (R+ ) with respect to the exponential scale
(2.1), where kf ⊗ k (ξ) = exp ξ kf • k , f • ∈ K, is the natural domain for the quantum stochastic integration [5], and F0 = ∪ξ 0 K, ξ −1 . If D = ∩Hp is the projective limit of an increasing family of the dense Hilbert subspaces Hp ⊆ Hp−1 , the π-product D = D ⊗ F of the Fr´echet spaces D and F is the projective limit of the directed family of the spaces Hp (ξ) = Hp ⊗ 0 (K, ξ) and D0 = D0 ⊗ F0 is given as ∪H−p ξ −1 , where H−p denote the duals Hp0 to the Hilbert spaces Hp , with respect to the standard pairing in the Hilbert product H of H = H0 and 0 (K) = 0 (K, 1) . Following [31, 5], we define the multiple quantum stochastic integral Yt = Λ⊗ [0,t) (B) of a function B (τ ) of µ µ=−,• µ the quadruple τ = τν ν=+,• of finite subsets τν ⊂ [0, t) with values in the nonadapted − kernels D ⊗ K⊗|τ ∪τ | → D0 ⊗ K⊗|τ ∪τ+ | , as the sesquilinear form Z Z Z Z D − τ− τ hψ|Yt ψi = ψ • (τ ∪ τ+ ) |B + · τ+ τ Γt Γt Γt Γt E (6.1) · ψ • τ ∪ τ − dτ dτ − dτ+ dτ+− 2
2
where ψ • (σ, τ ) = ψ (σ ∪ τ ), given by the quadruple of the multiple integrals Z Z Z ∞ X ··· B (τ ) dτ = B (t1 , . . . , tn ) dt1 · · · dtn . Γt
n=0 0
The function B is integrable up to a t > 0 if Yt ∈ B (D), and it is strongly integrable if Yt ∈ L (D). The natural criterion of multiple integrability was formulated in [31] in 0 terms of the norms kBkp,q (ξ, ζ) =
Z
Z
Z
Γt
Γt
21 2 |τ2 | |τ − | |τ+ | 0 1 1 1 sup kB (τ )kp,q dτ+ dτ − dτ+− ζ ξ ξζ τ ∈Γt Γt
(6.2) as kBkp,q (ξ, ζ) < ∞ for some p, q, ξ, ζ < ∞. The function B is strongly integrable if 0
0
kBkp,q (ξ, ζ) < ∞ for any p < 0, ξ < 1 and some q, ζ. Here
τ+−
B
τ+
τ− τ
0
p,q
ψn,n+ |B (τ ) χn,n−
(ξ, ζ) = sup
(ζ) ψ,χ kψn,n+ k (ζ) χn,n− p
(6.3)
q
denotes the norm of the kernel B (τ ) : Hq (ζ) ⊗ K⊗n ⊗ K⊗n− → H−p ξ −1 ⊗ K⊗n ⊗ K⊗n+ , where nµν = |τνµ | are the cardinalities of τνµ . The norms (6.2) define the estimate for the integral by virtue of the inequality [31]
0
0
⊗
Λ[0,t) (B) (3ξ, 3ζ) ≤ kBkp,q (ξ, ζ) p,q
so that Λ⊗ [0,t) (B) ∈ B (D) ( or ∈ L (D)) is defined as a bounded kernel Yt : Hq (ζ) → 0 H−p ξ −1 if kBkp,q 13 ξ, 13 ζ < ∞ (or if for each p, ξ the norm is finite for some q, ζ). 2. Let us consider the case when the operator-valued function B (τ ) is relatively bounded in the following sense
τ+−
B
τ+
τ− τ
0
p,q
n + n + + n − + n− + ! √ ≤ cp,q n + n+ + n− + n− + , − n! n+ !n !
(6.4)
P∞ where cp,q (n) are positive constants such that n=0 cp,q (n) ρRn
0
tn /n! three times, one can find that kBkp,q (ξ, ζ) ≤ 2 21 √ n − n+ n− + X t ξζ (tξ) 2 (tζ) 2 n!cp,q (n) X X sup n − 2 n !n− ! n − n − n− − n− ! n ! n (ξζ) + + + + − n − n+ n+
≤
XXXX n
n+
n− n− +
n+
− n+
n−
(tζ) 2 (tξ) 2 (tξtζ) 2 (n)! cp,q (n) , n − − (ξζ) 2 n+ !n− !n− + ! n − n + − n − n+ !
where the supremum and summation is taken over n ≥ n+ + n− + n− + . Then the function B is integrable up to a t < ρ as it has the finite estimate √ !n √ ∞ ∞ X X 0 1 + tξ 1 + tζ √ (6.5) cp,q (n) ≤ ρn cp,q (n) kBkp,q (ξ, ζ) ≤ ξζ n=0 n=0 0
and so kYt kp,q (3ξ, 3ζ) ≤
P∞ n=0
ρn cp,q (n) if
√
ξζ > 1/ρ and
p ξζt <
ξζρ +
p 2 1 p ξ− ζ 4
21
−
1 p p ξ+ ζ . 2
In particular, the integral Yt is defined as a continuous operator D → H into the Hilbert space H if this analytical estimate is valid for p = 0, ξ = 1/3 and some q, ζ, and it is a strongly continuous operator, Yt ∈ L (D), if it is also valid for any p < 0, ξ < 1/3. 3. Let us apply this estimate to the multiple integral of the chronological products (6.6) B (τ ) = L (tn ) · · · L (t1 ) = L |τ | , defined by the unique decomposition τ = t1 ∪ · · · ∪ tn of the set table τ into the single point tables t ∅ ∅ t ∅ ∅ ∅ ∅ − • • = = = = t− , t , t , t , + • + • ∅ ∅ ∅ ∅ t ∅ ∅ t with L tµν = Lµν ∈ L (D) and t1 < . . . < tn , n = |∪τνµ |. If the norms (6.2) of the chronological product (6.6) with
0 ηn,n+ |B (τ ) ηn,n− − −
τ τ +
B
=
sup
τ+ τ kηn,n+ k ηn,n− ⊗n+ ⊗n− ηn,n+ ∈Hp ⊗K
p,q
,ηn,n− ∈Hq ⊗K
p
q
(6.7) (B) satisfies the are finite for some ξζ ≥ 1 and a t = T , the multiple integral Vt = Λ⊗ [0,t) quantum linear differential equation dVt = Lµν Vt dΛνµ , t ≤ T with V0 = I, see Theorem 1 in [31]. Thus the estimate (6.4) for the chronological products (6.6) with L− + = −K,
− L− • = −K ,
L•+ = L,
L•• = J − I
gives a sufficient condition for the existence of the unique solution to Eqs. (1.1), (1.3) of the type (1.6) in the form of the stochastic chronologically ordered operator-valued exponents Vt . 4. A similar estimate (n)! 0 cp,q (n) kL (tn ) · · · L (t1 )kp,q ≤ √ (6.8) n− ! − = −K• gives for the chronological products B τ+− , τ − of L t− + = −K and L t the sufficient condition ! 21 2 Z |τ − | Z
1 0 0
B τ+− , τ − dτ+− dτ − kBkp,q (ξ, ζ) = (6.9) p,q ζ Γt Γt X 1 √ n √ + t cp,q (n) < ∞, ∀ξ −1 ≤ ζ ≤ ζ n
B τ+− , τ − dτ+− Λ− dτ − for the quantum stochastic √ 2 equation (4.6) with t ≤ ρ − 1/ ζ . Thus the iteration St = Tt Vt◦ of the nonadapted P∞ integral equation 4.3 has the estimates kSt kp,q (ξ, ζ) ≤ n=0 ρn cp,q (n) for all 2−1 ζ ≥ −2 −1 max ρ , 2ξ , and St ∈ L (D) if the chronological products B satisfy the analyticity condition (6.8) for each p < 0 and some q > 0. of the integrability Vt◦ =
R
Γt
R
Γt
5. In order to formulate an analyticity condition for the weak convergence of (4.4) in terms of the structural maps λµν , let us represent this multiple integral in the equivalent form (see Theorem 2 in [31]) as the adapted one Rt = Λ⊗ [0,t) (L), for the integrand
$$L(\tau) = \lambda(\tau, I), \qquad \tau = \cup_{i=1}^{n} t_i, \tag{6.10}$$
giving the solution to the equivalent equation (3.2) for B = I. The integrand L for such a representation is given by the chronological composition
$$\lambda(\tau) = \lambda(t_1) \circ \cdots \circ \lambda(t_n)$$
of λ(tµν , ·) = λµν (·), see [31]. If this integrand B = L has the estimate (6.4) for some p, q > 0 and ξ, ζ > 1, then the series (4.4) converges to the continuous sesquilinear form Rt ∈ B (D) with ‖Rt‖p,q (3ξ, 3ζ) < ∞. Another analyticity condition, corresponding to a smaller (L∞ ) space of test functions f • ∈ L2 and a stronger (L2 ) integrability, is given in [30]. In the next paper, which will be published elsewhere, we generalize the sufficient integrability condition (6.4) to the L1+ξ spaces of test functions, ξ ≥ 1, and will show that our method gives more precise estimates in the limit case ξ → ∞.
Acknowledgement. The author wishes to thank Professor K. R. Parthasarathy for drawing his attention to the problem of studying filtering dynamics from a general “completely positive” point of view, Dr. J. M. Lindsay for stimulating discussions on the subject, and Professor R. L. Hudson for encouraging the author to write this paper.
References
1. Lindblad, G.: On the Generators of Quantum Dynamical Semigroups. Commun. Math. Phys. 48, pp. 119–130 (1976)
2. Evans, M. P. and Hudson, R. L.: Multidimensional Quantum Diffusions. Lect. Notes in Math. 1303, Berlin–Heidelberg–New York: Springer-Verlag, 1988, pp. 69–88
3. Evans, D. E. and Lewis, J. T.: Comm. Dublin Institute for Advanced Studies 24, 104 (1977)
4. Stinespring, W. F.: Positive Functions on C*-algebras. Proc. Am. Math. Soc. 6, 242–247 (1955)
5. Belavkin, V. P.: Chaotic States and Stochastic Integration in Quantum Systems. Russ. Math. Surv. 47, (1), pp. 47–106 (1992)
6. Parthasarathy, K. R.: An Introduction to Quantum Stochastic Calculus. Basel: Birkhäuser, 1992
7. Meyer, P. A.: Quantum Probability for Probabilists. Lect. Notes in Math. 1538, Heidelberg: Springer-Verlag, 1993
8. Hudson, R. L. and Parthasarathy, K. R.: Quantum Itô’s Formula and Stochastic Evolution. Commun. Math. Phys. 93, 301–323 (1984)
9. Belavkin, V. P.: On Stochastic Generators of Completely Positive Cocycles. Russ. J. Math. Phys. 3, 523–528 (1995)
10. Christensen, E. and Evans, D. E.: Cohomology of Operator Algebras and Quantum Dynamical Semigroups. J. London Math. Soc. 20, 358–368 (1979)
11. Belavkin, V. P.: Positive Definite Germs of Quantum Stochastic Processes. Comptes Rendus 322, 1, 385–390 (1996)
12. Belavkin, V. P.: On the General Form of Quantum Stochastic Evolution Equation. Stochastic Analysis and Applications, Proc. of Fifth Gregynog Symposium, Singapore: World Scientific 1996, 91–106
13. Lindsay, J. M. and Parthasarathy, K. R.: Positivity and Contractivity of Quantum Stochastic Flows. Stochastic Analysis and Applications. Proc. of Fifth Gregynog Symposium, Singapore: World Scientific, 1996, pp. 315–329
14. Belavkin, V. P.: Nondemolition Measurements and Nonlinear Filtering of Quantum Stochastic Processes. Lecture Notes in Control and Information Sciences, 121, Springer-Verlag, 1988, 245–266
15. Belavkin, V. P.: Nondemolition Calculus and Nonlinear Filtering in Quantum Systems. In: Stochastic Methods in Mathematics and Physics, Singapore: World Scientific, 1989, pp. 310–324
16. Belavkin, V. P.: Quantum Stochastic Calculus and Quantum Nonlinear Filtering. J. Multivariate Analysis 42 (2), 171–201 (1992)
17. Gisin, N.: Phys. Rev. Lett. 52, 1657–1660 (1984)
18. Diosi, L.: Phys. Rev. A 40, 1165–1174 (1988)
19. Barchielli, A. and Belavkin, V. P.: Measurement Continuous in Time and a Posteriori States in Quantum Mechanics. J. Phys. A: Math. Gen. 24, 1495–1514 (1991)
20. Belavkin, V. P.: Quantum Continual Measurements and a Posteriori Collapse on CCR. Commun. Math. Phys. 146, 611–635 (1992)
21. Milburn, G.: Phys. Rev. A 36, 744 (1987)
22. Pearle, P.: Phys. Rev. D 29, 235 (1984)
23. Ghirardi, G. C., Pearle, P. and Rimini, A.: Markov Processes in Hilbert Space and Continuous Spontaneous Localization of Systems of Identical Particles. Phys. Rev. A 42, 78–89 (1990)
24. Belavkin, V. P.: A Continuous Counting Observation and Posterior Quantum Dynamics. J. Phys. A: Math. Gen. 22, L1109–L1114 (1989)
25. Belavkin, V. P.: A Posterior Schrödinger Equation for Continuous Nondemolition Measurement. J. Math. Phys. 31, 2930–2934 (1990)
26. Collett, M. J. and Gardiner C. W.: Input and output in Damped Quantum System. Phys. Rev. A 31, 3761–3774 (1985)
27. Carmichael, H.: Open Systems in Quantum Optics. Lect. Notes in Phys. 18, Berlin–Heidelberg–New York: Springer-Verlag, 1993
28. Obata, N.: White Noise Calculus and Fock Space. Lect. Notes in Math. 1577, Heidelberg: Springer-Verlag, 1994
29. Parthasarathy, K. R. and Sinha, K. B.: Stochastic Integral Representations of Bounded Quantum Martingales in Fock Space. J. Funct. Anal. 67, 126–151 (1986)
30. Fagnola, F. and Sinha, K. B.: Quantum Flows with Unbounded Structure Maps and Finite Degrees of Freedom. J. London Math. Soc. 48, 537–551 (1993)
31. Belavkin, V. P.: A Quantum Nonadapted Itô Formula and Stochastic Analysis in Fock Scale. J. Funct. Anal. 102, No. 2, 414–447 (1991)
32. Belavkin, V. P.: Continuous Nondemolition Observation, Quantum Filtering and Optimal Estimation. Lect. Notes in Physics, 378, Berlin: Springer-Verlag, 1991, 310–324
33. Fagnola, F.: Characterization of Isometric and Unitary Weakly Differentiable Cocycles in Fock Space, Quantum Probability and Related Topics 8, Singapore: World Scientific, 1993, pp. 143–164
34. Belavkin, V. P.: Multiquantum Systems and Point Processes I. Rep. in Math. Phys. 28, No. 1, 57–90 (1989)
35. Chebotarev, A. M.: Minimal Solutions in Classical and Quantum Stochastics. In: Quantum Probability and Related Topics, 7. Singapore: World Scientific, 1992, pp. 79–91
Communicated by H. Araki
Commun. Math. Phys. 184, 567 – 577 (1997)
Communications in Mathematical Physics © Springer-Verlag 1997
On Scaling in Relation to Singular Spectra
A. Hof
Division of Physics, Mathematics and Astronomy, California Institute of Technology 253-37, Pasadena, CA 91125, USA. E-mail: [email protected]
Received: 3 July 1996 / Accepted: 11 September 1996
Abstract: This paper relates uniform α-Hölder continuity, or α-dimensionality, of spectral measures in an arbitrary interval to the Fourier transform of the measure. This is used to show that scaling exponents of exponential sums obtained from time series give local upper bounds on the degree of Hölder continuity of the power spectrum of the series. The results have applications to generalized random walk, numerical detection of singular continuous spectra and to the energy growth in driven oscillators.
1. Introduction
An interesting method to numerically detect singular continuous spectrum has been proposed by Aubry, Godrèche and Luck [2, 3]. In essence, their method amounts to the following. Let $\{a_k\}_{k=0}^{\infty} \subset \mathbb{C}$ and $S_n(q) := n^{-1} \bigl|\sum_{k=0}^{n-1} a_k e^{-2\pi i k q}\bigr|^2$. Suppose that $S_n(q)\,dq$ converges weakly as $n \to \infty$ to a positive Borel measure $\mu$ on $\mathbb{T} = \mathbb{R}/\mathbb{Z}$. Then $\mu$ should be singular at values of $q$ for which $S_n(q)$ “scales” like $n^{\beta(q)}$ for $0 < \beta(q) < 1$. In a previous paper [22] we proved, without justifying any scaling argument, that the particular model Aubry et al. considered does indeed have purely singular continuous spectrum for generic parameters (see Sect. 5.1 for details). The method of Aubry et al. has been used to argue that sequences obtained from aperiodically driven quantum systems [29, 10] and a time series obtained from a dynamical system [30, 31] have singular continuous power spectrum. Moreover, the growth of sums of the form $S_N(q)$ has been shown to determine the growth of the energy in driven classical and quantum-mechanical oscillators [6, 8]. Finally, Luck [28] has perturbatively related the exponents $\beta(q)$ to the speed with which gaps close in the spectrum of a discrete Schrödinger operator with potential $\lambda a_n$ as $\lambda \to 0$. It is desirable, therefore, that the mathematical meaning of the exponents $\beta(q)$ and their relation to the singularity of the measure $\mu$ is clarified. That is the aim of this paper. It will show that: (i) if $\{a_n\}$ is a time series of an ergodic dynamical system, then $\beta(q)$
exist a.e. with respect to the ergodic measure when interpreted as critical exponent for a limsup, and (ii) $1 - \beta(q)$ is an upper bound on the degree of Hölder continuity of $\mu$ in a neighborhood of $q$ (see Corollary 4.3). However, scaling exponents $\beta(q) \in (0, 1)$ need not imply that $\mu$ has a singular continuous part (see Remarks 2.5 and 2.6).
Hölder continuity here means uniform Hölder continuity. A positive Borel measure on $\mathbb{R}$ is uniformly α-Hölder continuous – UαH, for short – in an interval $I$ if there is a constant $C$ such that $\mu(I_0) \le C|I_0|^{\alpha}$ for all intervals $I_0 \subset I$ with $|I_0| < 1$, where $|I_0|$ denotes the length of $I_0$. It is globally uniformly α-Hölder continuous if it is UαH on its support. Uniform α-Hölder continuity is often referred to as uniform α-dimensionality (e.g., in [35]). A measure that is UαH gives zero weight to sets that have measure zero for α-dimensional Hausdorff measure (cf. Sect. 3.3 in [34]).
Section 2 of this paper relates a double integral over the Fourier transform $\hat{\mu}(t) := \int e^{-2\pi i t x}\,d\mu(x)$ of a finite positive Borel measure – the convolution of the measure with the Fejér kernel – to the degree of Hölder continuity in arbitrary intervals. This should be contrasted to results that relate global uniform Hölder continuity to the averaged decay of $\hat{\mu}$. Strichartz [35] has shown that if $\mu$ is globally UαH then there is a constant $C$ such that $T^{-1}\int_0^T |\hat{\mu}(t)|^2\,dt < C T^{-\alpha}$. A partial, but optimal, converse has been obtained by Last [26]: if there exists a constant $C$ such that $T^{-1}\int_0^T |\hat{\mu}(t)|^2\,dt < C T^{-\alpha}$ for all $T > 0$ then $\mu$ is globally U(α/2)H. Of course a measure may very well have a degree of uniform Hölder continuity α that varies over its support. Then it is the most singular part (i.e., the part with the smallest α) that matters for the results just mentioned.
Global uniform Hölder continuity of spectral measures of Schrödinger operators has received attention because it gives lower bounds on the spread of wave functions [18, 7], see also [19, 20, 26, 24, 15, 4]. The remaining sections apply the result of Sect. 2 to spectral measures of dynamical systems. Section 3 relates the degree of Hölder continuity of a spectral measure near zero to critical exponents of “generalized random walks”. The generalization to “twisted generalized random walks” in Sect. 4 explains the meaning of the scaling exponents β(q). Finally, Sect. 5 discusses the application to the model considered by Aubry et al., and to the energy growth of driven oscillators.
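As an informal numerical illustration of the scaling idea described in this introduction (it is not part of the original argument), the following Python sketch evaluates $S_n(q) = n^{-1}|\sum_{k=0}^{n-1} a_k e^{-2\pi i k q}|^2$ and reads off an exponent from a log-log slope between two values of $n$. The function names `S` and `scaling_exponent`, the two test sequences and the crude two-point slope are assumptions made for illustration only; the paper itself interprets β(q) as a critical exponent for a limsup, which a two-point slope can only approximate.

```python
import cmath
import math

def S(a, n, q):
    """S_n(q) = |sum_{k<n} a_k exp(-2 pi i k q)|^2 / n, with a given as a callable k -> a_k."""
    total = sum(a(k) * cmath.exp(-2j * math.pi * k * q) for k in range(n))
    return abs(total) ** 2 / n

def scaling_exponent(a, q, n1, n2):
    """Two-point log-log slope of n -> S_n(q); a rough stand-in for beta(q)."""
    return math.log(S(a, n2, q) / S(a, n1, q)) / math.log(n2 / n1)

# Constant sequence a_k = 1: the limit measure has a point mass at q = 0,
# and S_n(0) = n exactly, so the slope equals 1.
const = lambda k: 1.0
beta_at_peak = scaling_exponent(const, 0.0, 100, 1000)

# Period-2 sequence a_k = (-1)^k: point mass at q = 1/2, no mass at q = 0.
alt = lambda k: (-1.0) ** k
s_alt_half = S(alt, 1000, 0.5)   # grows like n
s_alt_zero = S(alt, 1000, 0.0)   # vanishes for even n
```

In these toy cases the limit measure is purely discrete, so only the exponents 1 (at an atom) and 0 (away from atoms) appear; exponents strictly between 0 and 1 are the situation the paper is concerned with.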
2. Uniform Hölder Continuity in an Interval
This section explains how the Fourier transform of a positive bounded Borel measure can be related to uniform Hölder continuity in arbitrary intervals. In view of the applications that follow, the result will be stated and proved for spectral measures of a unitary operator. A series of remarks discusses some generalizations and the relation to the decomposition of the measure into its discrete, singular continuous and absolutely continuous parts.
Let $U$ be a unitary operator on a separable Hilbert space $H$. Every $\psi \in H$ defines a spectral measure $\mu_\psi$ on the circle $\mathbb{T}$ by $\int_{\mathbb{T}} e^{2\pi i n x}\,d\mu_\psi = (U^n\psi, \psi)$. For $\lambda \in \mathbb{T}$ let $U_\lambda := e^{-2\pi i \lambda} U$ and $G_n(\lambda) := \frac{1}{n}\bigl\|\sum_{k=0}^{n-1} U_\lambda^k \psi\bigr\|^2$. Then $\lim_{n\to\infty} G_n(\lambda)\,d\lambda = \mu_\psi$ weakly (as can e.g. be seen by integrating $e^{-2\pi i m \lambda}$ with respect to $G_n(\lambda)\,d\lambda$). The following theorem says that $G_n(\lambda)$ scales like $n^{\beta}$ as $n \to \infty$ if and only if $\mu_\psi$ is U(1 − β)H in a neighborhood of $\lambda$.
Theorem 2.1. Let $\Delta$ be an open interval and $0 \le \beta \le 1$. The following two statements are equivalent:
i) There exists a constant $C > 0$ such that $\limsup_{n\to\infty} n^{-\beta} G_n(\lambda) \le C < \infty$ uniformly in $\lambda \in \Delta$;
ii) $\mu_\psi$ is U(1 − β)H in $\Delta$.
R Proof. Direct computation shows that Gn (λ) = T Kn−1 (λ−x) dµψ (x), where Kn−1 (y) := 1 sin nπy 2 is the Fej´er kernel. Note that Kn (0) = n for all n and that limn→∞ Kn (y) = 0 n sin πy uniformly on every closed interval not containing 0. i) implies ii). By i) there exists for every > 0 an N > 0 such that Gn (λ) ≤ (C +)nβ for all n > N and all λ ∈ ∆. Let I ⊂ ∆ be an interval of length |I| < 1/N . Let 1 . Then I can be covered by two adjacent intervals n > N be such that n1 < |I| ≤ n−1 Ij := [λj − Therefore,
1 2n , λj
+
1 2n ]
µψ (I)
of length 1/n. Note that
≤
2 Z X j=1
≤ ≤
Ij
π2 4n Kn−1 (λj
− x) ≥ 1 if x ∈ Ij .
π2 Kn−1 (λj − x) dµψ (x) 4n
2
π {Gn (λ1 ) + Gn (λ2 )} 4n π2 (C + )|I|1−β . 4
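ii) implies i). Let $\lambda \in \Delta$ and choose $0 < a < 1/2$ such that $[\lambda - a, \lambda + a] \subset \Delta$. It suffices to show that
\[ n^{-\beta} \int_{\lambda - a}^{\lambda + a} K_{n-1}(\lambda - x)\, d\mu_\psi(x) \le C < \infty \]
for a constant $C$ independent of $\lambda$ and $n$. Since $\mu_\psi$ is U$(1-\beta)$H in $\Delta$ there are constants $D$, $N$ such that $\mu_\psi(I) \le D n^{\beta - 1}$ for every interval $I$ of length $1/n$ with $n > N$. Now $K_{n-1}(\lambda - x) \le n$ if $|\lambda - x| < 1/n$ and $K_{n-1}(\lambda - x) \le 4n/k^2\pi^2$ if $k/n \le |\lambda - x| \le (k+1)/n$ since $\sin \pi y \ge y/2$ for $0 \le y \le 1/2$. Therefore, denoting the integer part of $an$ by $[an]$,
\[ \int_{\lambda - a}^{\lambda + a} K_{n-1}(\lambda - x)\, d\mu_\psi(x) \le 2 D n^\beta + 2 \sum_{k=1}^{[na]} \frac{4}{k^2 \pi^2} D n^\beta \le C n^\beta , \]
for some constant $C$ independent of $\lambda$ and $n$.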
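The representation $G_n(\lambda) = \int_T K_{n-1}(\lambda - x)\, d\mu_\psi(x)$ used in the proof makes $G_n$ easy to evaluate for simple measures. The following is a minimal Python sketch, assuming a purely atomic measure with two atoms (an illustrative choice that is not taken from the paper; the names `fejer`, `G` and `atoms` are hypothetical). It suggests that $G_n$ grows like $n$ at an atom and stays bounded away from the atoms.

```python
import math

def fejer(n, y):
    """Fejer kernel K_{n-1}(y) = (1/n) * (sin(n*pi*y) / sin(pi*y))**2."""
    s = math.sin(math.pi * y)
    if abs(s) < 1e-12:          # y is (numerically) an integer: the limit value is n
        return float(n)
    return (math.sin(n * math.pi * y) / s) ** 2 / n

def G(n, lam, atoms):
    """G_n(lam) for an atomic measure given as a list of (position, weight) pairs."""
    return sum(w * fejer(n, lam - x) for x, w in atoms)

# Illustrative measure (an assumption): weight 0.7 at x = 0.25 and 0.3 at x = 0.6.
atoms = [(0.25, 0.7), (0.6, 0.3)]

for n in (10, 100, 1000):
    # First column: n^{-1} G_n at an atom; second column: G_n away from both atoms.
    print(n, G(n, 0.25, atoms) / n, G(n, 0.4137, atoms))
```

In this toy case the first printed column approaches the weight of the atom at 0.25, while the second column tends to zero.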
Remarks 2.2. Theorem 2.1 holds in greater generality than stated. Let $\mu$ be a finite positive Borel measure on $R$ with Fourier transform $\hat\mu(t) = \int_R e^{-2\pi i t x}\, d\mu(x)$. For $T > 0$, define
\[ G_T(\lambda) := T^{-1} \int_0^T \int_0^T e^{2\pi i (t - s)\lambda}\, \hat\mu(t - s)\, ds\, dt . \]
Again, $G_T(\lambda) = \int_R K_T(\lambda - x)\, d\mu(x)$, where
\[ K_T(y) = \int_{-T}^{T} \Big( 1 - \frac{|u|}{T} \Big) e^{2\pi i y u}\, du = \frac{1}{T} \left( \frac{\sin \pi T y}{\pi y} \right)^2 \]
is the Fejér kernel on $R$. Now Theorem 2.1 holds for $\mu$ and this function $G_T$. The proof is the same; that of “i) implies ii)” simplifies in that one can take $T = |I|^{-1}$.
2.3. In particular, if $\mu = \mu_\psi$ is a spectral measure of a strongly continuous group $\{U_t\}_{t \in R}$ of unitary operators on a separable Hilbert space $H$ (i.e. $\int_R e^{2\pi i t x}\, d\mu_\psi(x) = (U_t \psi, \psi)$ for a $\psi \in H$) then Theorem 2.1 holds for $\mu_\psi$ and $G_T(\lambda) := T^{-1} \big\| \int_0^T U_{\lambda,t} \psi\, dt \big\|^2$, where $U_{\lambda,t} := e^{-2\pi i \lambda t} U_t$.
2.4. Theorem 2.1 also generalizes to higher dimensions. This is of interest for actions of $Z^d$ on a probability space $(\Omega, \nu)$ that leave $\nu$ invariant. Let $U^k$, $k \in Z^d$, be the unitary action of $Z^d$ on $L^2(\Omega, \nu)$ defined by $U^k f := f \circ T^k$, where $T^k$ is the action of $Z^d$ on $(\Omega, \nu)$. For $\lambda \in T^d$, let $U_\lambda^k := e^{-2\pi i \langle \lambda, k \rangle} U^k$, where $\langle \cdot, \cdot \rangle$ denotes the Euclidean inner product in $R^d$. Let $C_n := \{ m \in Z^d \mid 0 \le m_i < n \}$ and $G_n(\lambda) := n^{-d} \big\| \sum_{k \in C_n} U_\lambda^k \psi \big\|^2$. The Fejér kernel now is $K_n(y) = n^{-d} \prod_{i=1}^{d} \left( \sin n\pi y_i / \sin \pi y_i \right)^2$. The theorem holds as before with $0 \le \beta \le d$, $\Delta$ a cube, and U$(d - \beta)$H instead of U$(1 - \beta)$H in statement ii).
2.5. The values of $\lim_{n\to\infty} n^{-\beta} G_n(\lambda)$ for $\beta = 1$ and $\beta = 0$ determine the discrete part and the absolutely continuous part of $\mu_\psi$, respectively. Since the functions $n^{-1} K_{n-1}$ tend to zero uniformly outside any neighborhood of zero, and $n^{-1} K_{n-1}(0) = 1$, one has $\mu_\psi(\{\lambda\}) = \lim_{n\to\infty} n^{-1} G_n(\lambda)$ for all $\lambda \in T$ (e.g., p. 42 in [25]). Thus $\beta = 1$ determines the discrete part of $\mu_\psi$. To see that $\beta = 0$ determines the absolutely continuous part of $\mu_\psi$, write $\mu_\psi = f\sigma + \rho$, where $\sigma$ denotes Lebesgue measure, $f \in L^1(\sigma)$ and $\rho \perp \sigma$. Then $G_n(\lambda) \to f(\lambda)$ at $\sigma$-a.e. $\lambda$, see e.g. Theorem III.8.1 in Volume 1 of [37]. If $f$ can be chosen to be continuous in a neighborhood of $\lambda$ and $\rho$ is zero on that neighborhood then $\lim_{n\to\infty} G_n(\lambda) = f(\lambda)$.
2.6. A measure $\mu$ that is U$\alpha$H for some $0 < \alpha < 1$ need not have a singular continuous part. For instance, let $\{c_i\}$ be a positive sequence with $\sum c_i < \infty$, $x_i \in T$, $0 < \beta < 1$ and let $\mu := \sum c_i |x - x_i|^{-\beta}$. Then $\mu$ is absolutely continuous and U$(1 - \beta)$H. Still, the behavior of $n^{-\beta} G_n(\lambda)$ for $\beta = 0$ and 1 can always be used in principle to determine whether $\mu_\psi$ has a singular continuous part $\mu_{\psi,sc}$, because
\[ \mu_{\psi,sc}(T) = \|\psi\|^2 - \int f\, d\sigma - \sum_{\lambda \in T} \lim_{n\to\infty} n^{-1} G_n(\lambda), \]
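where, as in the previous remark, $f = \lim_{n\to\infty} G_n$.
2.7. The critical Hölder exponent $\alpha_\mu(x)$ of $\mu$ at $x$ is defined by
\[ \limsup_{r \to 0} \mu([x - r, x + r])\, r^{-\alpha} = \begin{cases} 0 & \text{if } \alpha < \alpha_\mu(x) \\ \infty & \text{if } \alpha > \alpha_\mu(x) \end{cases} . \]
Note that $\alpha_\mu(x)$ may be larger than 1 if $\mu$ has very little mass near $x$. If $-1 \le \beta \le 1$ and $\limsup_{n\to\infty} n^{-\beta} G_n(\lambda) < \infty$ then $1 - \beta \le \alpha_\mu(\lambda)$. This follows from $\mu([\lambda - \frac{1}{2n}, \lambda + \frac{1}{2n}]) \le \frac{\pi^2}{4n} \int_{|\lambda - x| \le 1/2n} K_{n-1}(\lambda - x)\, d\mu(x)$.
3. Generalized Random Walk
Let $\Omega$ be a compact metric space with its Borel $\sigma$-algebra, $T : \Omega \to \Omega$ a measurable invertible map and $\nu$ a $T$-invariant ergodic probability measure on $\Omega$. Each $\psi \in L^2(\Omega, \nu)$ gives rise to a so-called generalized random walk (GRW)
\[ S_n^\omega := \sum_{k=0}^{n-1} \psi(T^k \omega). \qquad (1) \]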
Simple random walk on $Z^2$ is the GRW defined by $\Omega = \{0, 1, 2, 3\}^Z$, $T$ the left shift on $\Omega$, $\nu = \prod_{j \in Z} [(\delta_{\omega_j,0} + \delta_{\omega_j,1} + \delta_{\omega_j,2} + \delta_{\omega_j,3})/4]$ and $\psi(\omega) = \psi(\omega_0) = 1, i, -1, -i$ if $\omega_0 = 0, 1, 2$, and 3, respectively. Generalized random walks, and especially their recurrence properties, have been considered in e.g. [1, 5, 11, 36, 12, 13]. The aim of this section is to relate the Hölder continuity at 0 of the spectral measure $\mu_\psi$ of $\psi$ to the speed with which the walk wanders off to infinity, as expressed by the critical exponents of the mean squared displacement and of $|S_n^\omega|^2$. Recall that the critical exponent $\alpha_c$ of a sequence $N_n \in C$ is defined by
\[ \limsup_{n\to\infty} n^{-\alpha} |N_n| = \begin{cases} \infty & \alpha < \alpha_c \\ 0 & \alpha > \alpha_c \end{cases} . \qquad (2) \]
The critical exponent of $|S_n^\omega|^2$ is denoted by $\gamma(\omega)$, that of the mean squared displacement
\[ \langle |S_n^\omega|^2 \rangle := \int |S_n^\omega|^2\, d\nu(\omega) = \Big\| \sum_{k=0}^{n-1} U^k \psi \Big\|_{L^2}^2 \]
by $\gamma_{ms}$; here $U\psi := \psi \circ T$ is the unitary operator implementing the dynamics in $L^2(\Omega, \nu)$. Note that $0 \le \gamma_{ms} \le 2$.
Lemma 3.1. There exists a $\gamma \in [0, \gamma_{ms}]$ such that $\gamma = \gamma(\omega)$ for $\nu$-a.e. $\omega \in \Omega$.
Proof. For $\alpha > 0$ and $I \subset [0, \infty)$ an interval, let $B_{\alpha,I} := \{ \omega \in \Omega \mid \limsup_{n\to\infty} n^{-\alpha} |S_n^\omega|^2 \in I \}$.
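This set is measurable since $B_{\alpha,I} = \bigcap_{K=0}^{\infty} \bigcup_{L=K}^{\infty} \{ \omega \in \Omega \mid n^{-\alpha} |S_n^\omega|^2 \in I \text{ for an } n \in [K, \ldots, L] \}$. It is also invariant. Hence the sets $B_{\alpha,0} := \bigcap_{m=1}^{\infty} B_{\alpha,[0,1/m]}$ and $B_{\alpha,\infty} := \bigcap_{m=1}^{\infty} B_{\alpha,[m,\infty)}$ are measurable and invariant. Thus $\Omega$ is the disjoint union of the measurable and invariant sets $B_{\alpha,0}$, $B_{\alpha,(0,\infty)}$ and $B_{\alpha,\infty}$. By ergodicity, one of these three sets has measure one. Since there is at most one value of $\alpha$ for which $\nu(B_{\alpha,(0,\infty)}) = 1$ there is a $\gamma$ such that $\nu(B_{\alpha,\infty}) = 1$ if $\alpha < \gamma$ and $\nu(B_{\alpha,0}) = 1$ if $\alpha > \gamma$. This shows that $\gamma(\omega) = \gamma$ for $\nu$-a.e. $\omega \in \Omega$. If $\limsup_{n\to\infty} n^{-\beta} \langle |S_n^\omega|^2 \rangle = 0$, then there is a subsequence $n_k$ such that
\[ \lim_{k\to\infty} n_k^{-\beta} |S_{n_k}^\omega|^2 = 0 \quad \text{for } \nu\text{-a.e. } \omega . \]
This means that $\nu(B_{\beta,0}) = 1$. Hence $\beta > \gamma_{ms}$ implies $\beta > \gamma$, and $\gamma \le \gamma_{ms}$.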
It is now clear that the degree of Hölder continuity of $\mu_\psi$ at zero gives an upper bound on $\gamma_{ms}$, and that, conversely, $\gamma_{ms}$ and $\gamma$ provide upper bounds on the degree of Hölder continuity of $\mu_\psi$ near zero. This follows from the fact that, in the notation of Theorem 2.1, $G_n(0) = n^{-1} \langle |S_n^\omega|^2 \rangle$. If $\mu_\psi$ is U$\beta$H on a neighborhood of 0, then $\gamma_{ms} \le 2 - \beta$. Conversely, if $\gamma_{ms} > 1$ (or $\gamma > 1$), then $\mu_\psi$ is not U$(2 - \beta)$H on any neighborhood of 0, for any $\beta \in (1, \gamma_{ms})$ ($\beta \in (1, \gamma)$). Thus superdiffusive behavior of the GRW requires that $\mu_\psi$ is not U1H on any neighborhood of 0. On the other hand, the GRW can only be subdiffusive ($\gamma_{ms} < 1$) if $\mu_\psi$ has very little weight near 0. Indeed, by Remark 2.7 this requires $\alpha_{\mu_\psi}(0) > 1$. Theorem 1 in [27] gives that $\langle |S_n^\omega|^2 \rangle$ is bounded in $n$ if and only if $\int (\sin \pi\lambda)^{-2}\, d\mu_\psi(\lambda) < \infty$.
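The identity $G_n(0) = n^{-1} \langle |S_n^\omega|^2 \rangle$ can be checked by hand for the simple random walk on $Z^2$ introduced above. The sketch below is a hedged illustration, not part of the original argument: it computes the mean squared displacement exactly by averaging over all $4^n$ step words of length $n$ (the function name `mean_sq_displacement` is hypothetical). Because the steps are independent, have mean zero and modulus one, one expects $\langle |S_n^\omega|^2 \rangle = n$, that is $G_n(0) = 1$ and $\gamma_{ms} = 1$.

```python
from itertools import product

# Values of psi(omega_0) for omega_0 = 0, 1, 2, 3 (simple random walk on Z^2).
STEPS = (1, 1j, -1, -1j)

def mean_sq_displacement(n):
    """Exact <|S_n|^2>: average of |S_n|^2 over all 4**n equally likely step words."""
    total = 0.0
    for word in product(STEPS, repeat=n):
        total += abs(sum(word)) ** 2
    return total / 4 ** n

for n in range(1, 7):
    msd = mean_sq_displacement(n)
    # Last column is G_n(0) = <|S_n|^2> / n.
    print(n, msd, msd / n)
```

Exhaustive enumeration is only feasible for small $n$; it is used here because it avoids sampling error.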
Remarks 3.2. Dekking [13] shows that $\gamma_{ms} \le 1$ for the “Rudin-Shapiro walk”. This is the GRW in which $\Omega \subset \{0, 1, 2, 3\}^Z$ is the substitution dynamical system (e.g., [32]) arising from the primitive substitution $0 \to 02$, $1 \to 32$, $2 \to 01$, $3 \to 31$ with $\psi(\omega) = \psi(\omega_0) = 1, -i, i, -1$ if $\omega_0 = 0, 1, 2, 3$, respectively. The spectral measure $\mu_\psi$ can be computed from Proposition VII.5 and Example VIII.2.2 in [32] as Lebesgue measure, so $\gamma_{ms} = 1$, as conjectured by Dekking, and the diffusion coefficient is 1. (In Proposition VII.5 of [32] one should read τ(β) for τ(β); this τ is our $\psi$. The symbols $\{0, 1, 2, 3\}$ in [32] correspond to the symbols $\{a, d, b, c\}$ in [13].)
3.3. By Remark 2.5, if $\mu_\psi$ is absolutely continuous on a neighborhood of 0, with a density that can be chosen to be a continuous function $g$, then $\gamma_{ms} = 1$ and the diffusion coefficient of the GRW is $g(0)$.
3.4. Dumont and Thomas [16] have given an asymptotic expression for $S_n^\omega$ for the case that $\omega$ is a fixed point of a primitive substitution and $\psi(\omega)$ depends only on $\omega_0$ (here $\Omega \subset \{0, \ldots, a\}^Z$). They find, under certain assumptions on the eigenvalues of the substitution matrix, an upper bound $\beta < 1$ on $\gamma(\omega)$ if $\int \psi\, d\nu = 0$. This is a result for one particular sequence in $\Omega$, a fixed point of the substitution, and therefore has no direct implication for the Hölder continuity of $\mu_\psi$ at 0.
4. Twisted Generalized Random Walk
The exponential sums
\[ S_n^\omega(\lambda) := \sum_{k=0}^{n-1} e^{-2\pi i \lambda k} \psi(T^k \omega) \qquad (3) \]
define a walk in the complex plane for each $\omega \in \Omega$ and each $\lambda \in T$. One can think of these as walks with a rotational bias: at step $k$ the walker changes its direction by $e^{-2\pi i \lambda k}$ and then makes a step of length $|\psi(T^k \omega)|$ in the direction $\arg(\psi(T^k \omega))$. When plotted in the complex plane, such walks often give rise to pretty pictures [14, 3]. The mean squared displacement $\int |S_n^\omega(\lambda)|^2\, d\nu(\omega)$ of $S_n^\omega(\lambda)$ is given by $n G_n(\lambda)$, and now depends on $\lambda$. By Theorem 2.1, its critical exponent $\gamma_{ms}(\lambda)$ satisfies $\gamma_{ms}(\lambda) \le 2 - \alpha$ on any interval where $\mu_\psi$ is U$\alpha$H. Again, superdiffusive behavior of $\int |S_n^\omega(\lambda)|^2\, d\nu(\omega)$ requires singularity of $\mu_\psi$ at $\lambda$. The aim of this section is to show that $S_n^\omega(\lambda)$ is itself a GRW (i.e., a GRW for some ergodic dynamical system), except possibly for a countable set of $\lambda$, and to relate the critical exponents $\gamma(\lambda)$ of $|S_n^\omega(\lambda)|^2$ to $\mu_\psi$. This will explain the meaning of the method of Aubry et al. [2, 3], see the next section. It is of interest because it is often easier to deal numerically with one (hopefully typical) trajectory than with $\int |S_n^\omega(\lambda)|^2\, d\nu(\omega)$.
An eigenvalue of $(\Omega, T, \nu)$ is a $\lambda \in [0, 1] \simeq T$ for which there is a $\phi \in L^2(\Omega, \nu)$ such that $U\phi = e^{2\pi i \lambda} \phi$. The set of eigenvalues forms a countable group, which will be denoted by $\Lambda$. Let $Q[\Lambda] := \{ \mu \in T \mid \mu = q\lambda, \lambda \in \Lambda, q \in Q \}$.
Proposition 4.1. For $\lambda \notin Q[\Lambda]$ there exists a $\gamma(\lambda) \in [0, \gamma_{ms}(\lambda)]$ such that $\gamma(\lambda, \omega) = \gamma(\lambda)$ for $\nu$-a.e. $\omega$.
Proof. Let $X$ be a compact metric space and $\rho$ a Borel probability measure on $X$ that is ergodic for an invertible, measurable transformation $R$ on $X$. The direct product of $(\Omega, T, \nu)$ and $(X, R, \rho)$ is the dynamical system $(\tilde\Omega, \tilde T, \tilde\nu)$ defined by $\tilde\Omega := \Omega \times X$, $\tilde T(\omega, x) := (T\omega, Rx)$, $\tilde\nu := \nu \times \rho$. This direct product is ergodic if and only if $(\Omega, T, \nu)$ and $(X, R, \rho)$ share no eigenvalues other than 0 (see e.g. [9], Theorem 10.1.1).
If $\lambda$ is irrational, take $X = T$, $\rho$ normalized Lebesgue measure and let $R = R_\lambda$ be defined by $R_\lambda x := x + \lambda$ (mod 1). If $\lambda = p/q$ is rational, take $X = \{kp/q\}_{k=0}^{q-1}$, let $\rho$ be normalized counting measure on $X$ and again let $R = R_\lambda$. In both cases $(X, R_\lambda, \rho)$ is ergodic and has eigenvalues $e^{-2\pi i m \lambda}$, $m \in Z$. Therefore the direct product $(\tilde\Omega, \tilde T, \tilde\nu)$ is ergodic if $\lambda \notin Q[\Lambda]$. Take such a $\lambda$.
For $\psi \in L^2(\Omega, \nu)$ let $\tilde\psi \in L^2(\tilde\Omega, \tilde\nu)$ be defined by $\tilde\psi(\omega, x) := e^{-2\pi i x} \psi(\omega)$. Let $\tilde U$ act on $L^2(\tilde\Omega, \tilde\nu)$ by $\tilde U \phi = \phi \circ \tilde T$, and define $\tilde S_n^{(\omega,x)} := \sum_{k=0}^{n-1} \tilde U^k \tilde\psi(\omega, x)$. (Note that $\tilde S_n^{(\omega,x)}$ implicitly depends on $\lambda$.) Then $S_n^\omega(\lambda) = \tilde S_n^{(\omega,0)}$. Since $|\tilde S_n^{(\omega,x)}|$ is independent of $x$, the critical exponent of $\int |\tilde S_n^{(\cdot)}|^2\, d\tilde\nu$ is the same as the critical exponent $\gamma_{ms}(\lambda)$ of $\int |S_n^\omega(\lambda)|^2\, d\nu(\omega)$. By Lemma 3.1 there is a $\gamma(\lambda) \in [0, \gamma_{ms}(\lambda)]$ such that the critical exponent $\tilde\gamma(\lambda, \omega, x)$ is equal to $\gamma(\lambda)$ for $\tilde\nu$-a.e. $(\omega, x)$. But $\tilde\gamma(\lambda, \omega, x) = \tilde\gamma(\lambda, \omega, 0) = \gamma(\lambda, \omega)$ for all $x \in X$, so $\gamma(\lambda, \omega) = \gamma(\lambda)$ for $\nu$-a.e. $\omega$.
Corollary 4.2. If $(\Omega, T, \nu)$ is weakly mixing then there exists for all $\lambda \in T$ a $\gamma(\lambda) \in [0, \gamma_{ms}(\lambda)]$ such that $\gamma(\lambda, \omega) = \gamma(\lambda)$ for $\nu$-a.e. $\omega$.
Proof. Recall that, by definition, $(\Omega, T, \nu)$ is weakly mixing if it has no eigenvalues other than 1. The case $\lambda = 0$ is covered by Lemma 3.1.
Corollary 4.3. Suppose $\lambda \notin Q[\Lambda]$ and $\gamma(\lambda) > 1$. Let $0 < \epsilon < \gamma(\lambda) - 1$. Then $\mu_\psi$ is not U$(2 - \gamma(\lambda) + \epsilon)$H on any neighborhood of $\lambda$.
Proof. Let $\beta := \gamma(\lambda) - \epsilon$. Then $1 < \beta < \gamma_{ms}(\lambda)$ and $\limsup_{n\to\infty} n^{-(\beta - 1)} G_n(\lambda) = \infty$. Let $\Delta$ be any neighborhood of $\lambda$. By Theorem 2.1, $\mu_\psi$ is not U$(2 - \beta)$H on $\Delta$.
Note that the exceptional set in Proposition 4.1 depends on $\lambda$. Corollary 4.3 shows that a single (but typical) trajectory provides upper bounds on the degree of Hölder continuity of $\mu_\psi$. As explained in Remark 2.6, a degree of Hölder continuity strictly less than one does not prove that a measure is purely singular continuous.
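To make the single-trajectory point of view concrete, the twisted sums (3) can be evaluated along one explicit sequence. The sketch below is only an illustration under stated assumptions: it uses the one-sided fixed point (starting with 0) of the Rudin-Shapiro substitution of Remark 3.2 as the trajectory, which is one particular point of $\Omega$ and need not be typical, and the values of $\lambda$ are chosen arbitrarily. The names `fixed_point_prefix` and `twisted_sum` are hypothetical.

```python
import cmath

# Substitution 0 -> 02, 1 -> 32, 2 -> 01, 3 -> 31 and psi as in Remark 3.2.
SUBST = {0: (0, 2), 1: (3, 2), 2: (0, 1), 3: (3, 1)}
PSI = {0: 1, 1: -1j, 2: 1j, 3: -1}

def fixed_point_prefix(k):
    """First 2**k letters of the fixed point of the substitution that starts with 0."""
    word = [0]
    for _ in range(k):
        word = [b for a in word for b in SUBST[a]]
    return word

def twisted_sum(word, lam, n):
    """S_n(lam) = sum over k < n of exp(-2 pi i lam k) * psi(word[k])."""
    return sum(cmath.exp(-2j * cmath.pi * lam * k) * PSI[word[k]] for k in range(n))

word = fixed_point_prefix(10)            # 1024 letters
for lam in (0.0, 0.25, 0.1234):          # sample values of lambda (arbitrary)
    for n in (64, 256, 1024):
        # n^{-1} |S_n(lam)|^2 along this one trajectory.
        print(lam, n, abs(twisted_sum(word, lam, n)) ** 2 / n)
```

A useful sanity check is that $|S_n(\lambda)|^2$ is a trigonometric polynomial in $\lambda$ of degree below $n$, so its average over a uniform grid of at least $n$ points equals $\sum_{k<n} |\psi|^2 = n$ here.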
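Thus the question arises whether one can determine from a single (typical) trajectory that the spectral measure has no absolutely continuous part. In general, this is not possible, as is explained below. The following lemma is well known.
Lemma 4.4. $\lim_{n\to\infty} n^{-1} |S_n^\omega(\lambda)|^2\, d\lambda = \mu_\psi$ weakly for $\nu$-a.e. $\omega$. If $(\Omega, T, \nu)$ is uniquely ergodic and $\psi$ is continuous then “$\nu$-a.e. $\omega$” can be replaced by “for all $\omega \in \Omega$”.
Proof. For $p \in Z$, up to at most $|p|$ boundary terms that do not affect the limit,
\[ \int e^{-2\pi i \lambda p}\, \frac{1}{n} |S_n^\omega(\lambda)|^2\, d\lambda = \frac{1}{n} \sum_{m=0}^{n-1} \psi(T^{m-p} \omega)\, \overline{\psi(T^m \omega)}, \qquad (4) \]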
which by the ergodic theorem converges to $(U^{-p} \psi, \psi)$ as $n \to \infty$ for $\nu$-a.e. $\omega$. Hence the Fourier coefficients of $n^{-1} |S_n^\omega(\lambda)|^2\, d\lambda$ converge pointwise to those of $\mu_\psi$, for $\nu$-a.e. $\omega$. This shows that $\lim_{n\to\infty} n^{-1} |S_n^\omega(\lambda)|^2\, d\lambda = \mu_\psi$ weakly for $\nu$-a.e. $\omega$. The averages (4) converge for all $\omega \in \Omega$ as $n \to \infty$ if $(\Omega, T, \nu)$ is uniquely ergodic and $\psi$ is continuous (e.g. [9], Theorem 1.8.2).
Thus both $\frac{1}{n} |S_n^\omega(\lambda)|^2\, d\lambda$ (for a.e. $\omega$) and its expectation $G_n(\lambda)\, d\lambda$ converge weakly to $\mu_\psi$ as $n \to \infty$. There is a difference, however. The sequence $G_n(\lambda)$ converges for Lebesgue-a.e. $\lambda$ and its limit is the value of the density of the absolutely continuous part of $\mu_\psi$ (see Remark 2.5). But $\lim \frac{1}{n} |S_n^\omega(\lambda)|^2$ in general does not exist for Lebesgue-a.e. $\lambda$. (E.g., if $\psi(T^k \omega)$, $k \in Z$, are independent and uniformly distributed in $T$ then for every $\lambda$ and $\nu$-a.e. $\omega$ the limit does not exist.) Now assume that $(\Omega, T, \nu)$ is uniquely ergodic and that $\psi$ is continuous. Then, if $\lim_{n\to\infty} \frac{1}{n} |S_n^\omega(\lambda)|^2$ exists for Lebesgue-a.e. $\lambda$ in some interval, the limit is equal to the density of $\mu_\psi$ with respect to the Lebesgue measure in that interval. In particular, if $\lim_{n\to\infty} \frac{1}{n} |S_n^\omega(\lambda)|^2 = 0$ a.e. on $T$, then $\mu_\psi$ has no absolutely continuous part. In this special case examining a single trajectory allows one to conclude that the spectral measure has no absolutely continuous part.
Remarks 4.5. The results in Sects. 3 and 4 generalize to flows, if, for $T > 0$ and $\lambda \in R$, $S_T^\omega(\lambda)$ is defined by $S_T^\omega(\lambda) := \int_0^T e^{-2\pi i \lambda t} \psi(T_t \omega)\, dt$. An eigenvalue of the flow $T_t$ is a $\lambda \in R$ for which there is a $\phi \in L^2(\Omega, \nu)$ such that $U_t \phi = e^{2\pi i \lambda t} \phi$.
4.6. If $\mu_\psi(\{\lambda\}) > 0$ then $S_n^\omega(\lambda)$ behaves ballistically, $\gamma_{ms}(\lambda) = 2$, with diffusion coefficient $\mu_\psi(\{\lambda\})$, see Remark 2.5. If $(\Omega, T, \nu)$ is uniquely ergodic and the eigenfunction $\phi$ for $\lambda$ is continuous then $\gamma(\lambda, \omega) = 2$ for all $\omega \in \Omega$ [33].
5. Applications
5.1. The model of Aubry et al. Aubry et al. [2, 3] have considered a model of atoms on the line defined by
\[ x_n - x_{n-1} = 1 + \xi 1_{[0,\beta)}(n\alpha + \theta), \qquad (5) \]
where $\{x_n\}$ is the set of atomic positions, $\alpha$ is irrational, and $x_0 \in R$, $\beta, \theta \in T$ and $\xi > 0$ are parameters. (Changing $x_0$ translates the structure; $\theta$ selects a structure within a “local isomorphism class” determined by $\alpha$, $\beta$ and $\xi$.) They were interested in the “structure factor” $S(\lambda)$, an unbounded measure defined by
\[ dS(\lambda) = \lim_{T\to\infty} T^{-1} \Big| \sum_{x_n \in [0,T]} e^{-2\pi i \lambda x_n} \Big|^2 d\lambda, \qquad (6) \]
where the limit is taken in the vague topology (cf. [21, 23]). The structure factor does not depend on the choice of $x_0$ or $\theta$. Aubry et al. argued that $S$ should be purely singular continuous if $\beta \ne n\alpha$ (mod 1). One reason for this assertion was that they found critical exponents $\eta(\lambda)$ of the sums
\[ F_T(\lambda) := \sum_{x_n \in [0,T]} e^{-2\pi i \lambda x_n} \qquad (7) \]
to be strictly between 1/2 and 1. For the case $\beta = 1/2$, $\alpha = \tau^{-1}$ ($\tau = (\sqrt{5} + 1)/2$) they showed analytically [3] that $\eta(\lambda) = c$ for some $\frac{1}{2} < c < 1$ and $\lambda$ in a countable dense set. This section explains that $2 - 2\eta(\lambda)$ is an upper bound on the degree of Hölder continuity of $S$ on neighborhoods of $\lambda$. The idea is to view $S$ as a limit of spectral measures of a dynamical system, the flow under the function $f(x) = 1 + \xi 1_{[0,\beta)}(x)$ over the irrational rotation $\alpha$ on $T$. This was explained in detail in [22]. The essence is the following. Let $\phi \ge 0$ be a $C^\infty$-function with support in $[-1/2, 1/2]$, $\int \phi(x)\, dx = 1$, and for $1 > \epsilon > 0$ let $\phi_\epsilon(x) := \epsilon^{-1} \phi(x/\epsilon)$. Then the function $\rho_\epsilon := \phi_\epsilon * \sum_{n \in Z} \delta_{x_n}$ is of the form $x \to \psi_\epsilon(T_x \omega)$ for an $\omega \in \Omega$ and a $\psi_\epsilon \in L^2(\Omega, \nu)$, where $\Omega = \Omega_{\beta,\xi} = \{ (x, y) \mid x \in T, 0 \le y < f(x) \}$ and $\nu$ is the normalized Lebesgue measure on $\Omega$. The spectral measure $\mu_{\psi_\epsilon}$ of $\psi_\epsilon$ satisfies
\[ \mu_{\psi_\epsilon} = S |\hat\phi_\epsilon|^2 , \qquad (8) \]
where $\hat\phi_\epsilon$ denotes the Fourier transform of $\phi_\epsilon$. Thus $\mu_{\psi_\epsilon} \to S$ vaguely as $\epsilon \to 0$. Since $\hat\phi_\epsilon$ is smooth it follows that for every interval $I$ there is an $\epsilon$ such that the degrees of Hölder continuity of $S$ and $\mu_{\psi_\epsilon}$ are the same on $I$.
Now consider the integral $S_T^\omega(\lambda) := \int_0^T e^{-2\pi i \lambda x} \psi_\epsilon(T_x \omega)\, dx$. For each $\omega \in \Omega$ there is an $A_\omega$, $|A_\omega| \le 2$ such that
\[ S_T^\omega(\lambda) = \int_0^\infty e^{-2\pi i \lambda x} \Big( \phi_\epsilon * \sum_{x_n \in [0,T]} \delta_{x_n} \Big)(x)\, dx + A_\omega = \hat\phi_\epsilon(\lambda) F_T(\lambda) + A_\omega . \]
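Since the argument reduces to the growth of the sums $F_T(\lambda)$ in (7), it may help to see how they can be computed directly from the recursion (5). The following Python sketch is an illustration only: it assumes $x_0 = 0$ and $\theta = 0$, takes $\alpha = \tau^{-1}$ and $\beta = 1/2$ as in the case mentioned above, and picks $\xi = 0.5$ and $\lambda = 0.3$ arbitrarily. The function names and the crude exponent estimate $\log |F_T| / \log T$ are hypothetical conveniences, not the method of Aubry et al.

```python
import cmath
import math

def positions(alpha, beta, xi, theta, x0, T):
    """Positions x_0, x_1, ... <= T with x_n - x_{n-1} = 1 + xi * 1_{[0,beta)}(n*alpha + theta)."""
    xs, x, n = [], x0, 0
    while x <= T:
        xs.append(x)
        n += 1
        in_window = ((n * alpha + theta) % 1.0) < beta
        x += 1.0 + (xi if in_window else 0.0)
    return xs

def F(xs, lam):
    """F_T(lam): sum of exp(-2 pi i lam x_n) over the listed positions."""
    return sum(cmath.exp(-2j * math.pi * lam * x) for x in xs)

alpha = (math.sqrt(5.0) - 1.0) / 2.0     # tau^{-1}
beta, xi, theta, x0 = 0.5, 0.5, 0.0, 0.0  # xi, theta, x0 are illustrative assumptions
lam = 0.3                                 # arbitrary sample value of lambda

for T in (100.0, 1000.0, 10000.0):
    xs = positions(alpha, beta, xi, theta, x0, T)
    size = max(abs(F(xs, lam)), 1e-12)    # guard against log(0)
    # Crude finite-T proxy for the critical exponent eta(lam).
    print(T, len(xs), math.log(size) / math.log(T))
```

With $x_0 = 0$ all atoms with negative index lie to the left of the origin (every gap is at least 1), so the loop collects exactly the atoms in $[0, T]$.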
It follows that the critical exponent $\eta(\lambda)$ of $F_T(\lambda)$ is related to the critical exponent $\gamma(\lambda)$ of $|S_T^\omega(\lambda)|^2$ by $\gamma(\lambda) = 2\eta(\lambda)$. Hence Corollary 4.3 (in “flow form”, cf. Remark 4.5) gives that $S$ is not U$(2 - 2\eta(\lambda) + \delta)$H on any neighborhood of $\lambda$, for all $0 < \delta < 2\eta(\lambda) - 1$.
Remark 5.1. The connection with the flow under the function $f$ was used in [22] to determine the parameters $\alpha$, $\beta$, $\xi$ for which the unbounded measure $S$ is continuous (apart from a delta function at zero, which is present for all parameters). This in turn was used to prove that for every irrational $\xi$ and every $\beta$ the measure $S$ is purely singular continuous for generic $\alpha$ (i.e., for $\alpha$ in a dense $G_\delta$). We recently found a paper by Goodson and Whitman [17] that proves by periodic approximation that for each $\xi$ (irrational or not) and each irrational $\alpha$ the flow under $f$ has singular spectrum (i.e., no spectral measure has an absolutely continuous part) for Lebesgue-a.e. $\beta$. Combined with the continuity result of [22] this gives that for every irrational $\xi$ and every irrational $\alpha$, the flow has purely singular continuous spectrum for almost every $\beta$. For these parameters $S$ is purely singular continuous apart from the delta function at zero. For $\beta = 1/2$, $\alpha = \tau^{-1}$, the parameters considered in [3], $S$ is continuous [22]. It has not been proved that it is singular continuous although the work of Aubry et al. does suggest that it is singular continuous.
5.2. Energy growth in driven oscillators. Bunimovich et al. [6] consider a classical driven oscillator with natural frequency $\omega_0$ described by the Hamiltonian $H = p^2/2 + \omega_0^2 q^2/2 - q\psi(T_t x)$, where $T_t$ is an ergodic flow on a compact metric space with invariant probability measure $\nu$. They are interested in the expectation of the time evolution $\langle E_t \rangle_\nu$ of the energy of the oscillator. Their Eqs. (2.9) and (2.10) can be rewritten to give
\[ \langle E_t \rangle_\nu = \omega_0^2 q_0^2/2 + p_0^2/2 + \frac{\omega_0^2}{2}\, t\, G_t(\omega_0), \]
where $G_t$ is as in Remark 2.3. Bunimovich et al. express this in terms of the correlation function $\hat\mu_\psi$. In Proposition 3.1 they show that $\langle E_t \rangle_\nu$ grows linearly with $t$ and has a well defined diffusion coefficient if the correlation function decays sufficiently fast. Remark 3.3 generalizes this: it suffices to require that $\mu_\psi$ is absolutely continuous on a neighborhood of $\omega_0$ with a density that can be chosen to be continuous at $\omega_0$. The correlation function need not decay at all in order to have diffusive behavior of $\langle E_t \rangle_\nu$. Again, superdiffusive behavior of $\langle E_t \rangle$ requires $\mu_\psi$ to be singular at $\omega_0$, in the sense that $\mu_\psi$ is not U$\alpha$H on any neighborhood of $\omega_0$, for some $\alpha < 1$.
Note that by Proposition 4.1 the critical exponent of $E_t$ is independent of $x$ ($\nu$-a.e.), for all but possibly countably many $\omega_0$. The expectation of the energy growth for driven quantum oscillators is also determined by $t G_t$ [6, 8].
Acknowledgement. It is a pleasure to thank F.M. Dekking and O. Knill for discussions.
References
1. Aaronson, J. and Keane, M.: The visits to zero of some deterministic random walks. Proc. Lond. Math. Soc. (3) 44, 535–553 (1982)
2. Aubry, S., Godrèche, C. and Luck, J.M.: A structure intermediate between quasi-periodic and random. Europhys. Lett. 4, 639–643 (1987)
3. Aubry, S., Godrèche, C. and Luck, J.M.: Scaling properties of a structure intermediate between quasiperiodic and random. J. Stat. Phys. 51, 1033–1074 (1988)
4. Barbaroux, J.M., Combes, J.M. and Montchio, R.: Remarks on the relation between quantum dynamics and fractal spectra. Preprint Marseille CPT-96/P.3303
5. Berbee, H.: Recurrence and transience for random walks with stationary increments. Z. Wahr. Verw. Geb. 56, 531–536 (1981)
6. Bunimovich, L., Jauslin, H.R., Lebowitz, J.L., Pellegrenotti, A. and Nielaba, P.: Diffusive energy growth in classical and quantum driven oscillators. J. Stat. Phys. 62, 793–817 (1991)
7. Combes, J.M.: Connections between quantum dynamics and spectral properties of time evolution operators. In W.F. Ames, E.M. Harrell, and J.V. Herod, editors, Differential Equations with Applications to Mathematical Physics. New York: Academic Press, Inc., 1993
8. Combescure, M.: Recurrent versus diffusive dynamics for a kicked quantum oscillator. Ann. Inst. Henri Poincaré Phys. Théor. 57, 67–87 (1992)
9. Cornfeld, I.P., Fomin, S.V. and Sinai, Ya.G.: Ergodic Theory, Vol. 115 of Grundlehren der mathematischen Wissenschaften in Einzeldarstellungen. Berlin–Heidelberg–New York: Springer-Verlag, 1982
10. de Oliveira, C.R.: Numerical study of the long-time behaviour of quantum systems driven by Thue-Morse forces. Application to two-level systems. Europhys. Lett. 31, 63–68 (1995)
11. Dekking, F.M.: On transience and recurrence of generalized random walks. Z. Wahr. Verw. Geb. 61, 459–465 (1982)
12. Dekking, F.M.: Marches automatiques. J. de Théorie de Nombres de Bordeaux 5, 93–100 (1993)
13. Dekking, F.M.: Random and automatic walks. In F. Axel and D. Gratias, editors, Beyond Quasicrystals. Les Éditions de Physique, Berlin–Heidelberg–New York: Springer-Verlag, 1995
14. Dekking, F.M. and Mendès France, M.: Uniform distribution modulo one: A geometric point of view. J. Reine Angew. Math. 329, 143–153 (1981)
15. del Rio, R., Jitomirskaya, S., Last, Y. and Simon, B.: Operators with singular continuous spectrum, IV. Hausdorff dimensions, rank one perturbations, and localization. J. Analyse Math. 69, 153–200 (1996)
16. Dumont, J.M. and Thomas, A.: Systèmes de numération et fonctions fractales relatifs aux substitutions. Theor. Comp. Science 65, 153–169 (1989)
17. Goodson, G.R. and Whitman, P.N.: On the spectral properties of a class of special flows. J. Lond. Math. Soc. 21, 567–576 (1980)
18. Guarneri, I.: Spectral properties of quantum diffusion on discrete lattices. Europhys. Lett. 10, 95–100 (1989)
19. Guarneri, I.: On an estimate concerning quantum diffusion in the presence of a fractal spectrum. Europhys. Lett. 21, 729–733 (1993)
20. Guarneri, I. and Mantica, G.: On the asymptotic properties of quantum dynamics in the presence of a fractal spectrum. Ann. Inst. Henri Poincaré Phys. Théor. 61, 369–379 (1994)
21. Hof, A.: On diffraction by aperiodic structures. Commun. Math. Phys. 169, 25–43 (1995)
22. Hof, A.: On a “Structure intermediate between quasiperiodic and random”. J. Stat. Phys. 84, 309–320 (1996)
23. Hof, A.: Diffraction by aperiodic structures. In R.V. Moody and J. Patera, editors, Proceedings of the NATO Advanced Study Institute on the Mathematics of Aperiodic Long Range Order. Kluwer Academic Publishers. To appear
24. Jitomirskaya, S. and Last, Y.: Dimensional Hausdorff properties of singular continuous spectra. Phys. Rev. Lett. 76, 1765–1769 (1996)
25. Katznelson, Y.: An Introduction to Harmonic Analysis. New York: John Wiley & Sons, 1968
26. Last, Y.: Quantum dynamics and decompositions of singular continuous spectra. J. Func. Anal. 142, 406–445 (1996)
27. Liardet, P.: Regularities of distribution. Comp. Math. 61, 267–293 (1987)
28. Luck, J.M.: Cantor spectra and scaling of gap widths in deterministic aperiodic systems. Phys. Rev. B 39, 5834–5849 (1989)
29. Luck, J.M., Orland, H. and Smilansky, U.: On the response of a two-level quantum system to a class of time-dependent quasiperiodic perturbations. J. Stat. Phys. 53, 551–564 (1988)
30. Pikovsky, A.S. and Feudel, U.: Correlations and spectra of strange non-chaotic attractors. J. Phys. A: Math. Gen. 27, 5209–5219 (1994)
31. Pikovsky, A.S., Zaks, M.A., Feudel, U. and Kurths, J.: Singular continuous spectra in dissipative dynamics. Phys. Rev. B 52, 285–296 (1995)
32. Queffélec, M.: Substitution Dynamical Systems – Spectral Analysis, Vol. 1294 Lect. Notes in Mathematics. Berlin–Heidelberg–New York: Springer-Verlag, 1987
33. Robinson, Jr., E.A.: On uniform convergence in the Wiener-Wintner theorem. J. Lond. Math. Soc. 49, 493–501 (1994)
34. Rogers, C.A.: Hausdorff measures. Cambridge: Cambridge University Press, 1970
35. Strichartz, R.S.: Fourier asymptotics of fractal measures. J. Func. Anal. 89, 154–187 (1990)
36. Wen, Z.-X. and Wen, Z.-Y.: Marches sur les arbres homogènes suivant une suite substitutive. Sém. Théor. Nombres, Bordeaux 4, 155–186 (1992)
37. Zygmund, A.: Trigonometric Series. Cambridge: Cambridge University Press, 1959
Communicated by B. Simon
Commun. Math. Phys. 184, 579 – 596 (1997)
Communications in
Mathematical Physics © Springer-Verlag 1997
An Extension of Distribution Theory and of the Paley-Wiener-Schwartz Theorem Related to Quantum Gauge Theory
M.A. Soloviev
Department of Theoretical Physics, P. N. Lebedev Physical Institute, Leninsky prosp. 53, Moscow 117924, Russia. Fax: 095-938-2251, E-mail: [email protected]
Received: 29 December 1995 / Accepted: 13 September 1996
Abstract: We show that a considerable part of the theory of (ultra)distributions and hyperfunctions can be extended to more singular generalized functions, starting from an angular localizability notion introduced previously. Such an extension is needed to treat quantum gauge field theories with indefinite metric in a generic covariant gauge. Prime attention is paid to the generalized functions defined on the Gelfand-Shilov spaces $S^0_\alpha$, which give the widest framework for construction of gauge-like models. We associate a similar test function space with every open and every closed cone, show that these spaces are nuclear and obtain the required formulas for their tensor products. The main results include the generalization of the Paley–Wiener–Schwartz theorem to the case of arbitrary singularity and the derivation of the relevant theorem on holomorphic approximation.
1. Introduction This article will present a systematic development of the general distribution-theoretic framework proposed in [32] for the nonperturbative treatment of quantum field models with arbitrarily singular infrared and/or ultraviolet behavior and primarily for local covariant formulation of gauge theories. The main aim is to establish the most general version of the Paley-Wiener-Schwartz theorem which has a double conceptual significance for Quantum Field Theory since it determines the analytic structure of n-point correlation functions of the fields both in the space-time and energy-momentum variables, relating this structure to the support properties implied by the basic physical notions of Einstein causality and positivity of the energy. This theorem underlies the derivation of the main results of the axiomatic QFT [1, 34], such as the connection between spin and statistics and the existence of PCT-symmetry; it makes Euclidean Field Theory legitimate, and it enters in some way or other into the mathematical arsenal of all constructive approaches to relativistic quantum fields. In its standard form [1, 13, 27, 34], the PWS theorem deals with the Laplace transforms of tempered distributions and
so with analytic functions which have no worse than power-like growth at infinity and near the boundary of their analyticity domain. In the perturbative framework, such an assumption is justifiable and has its origin in the temperateness of the free propagators. However, the behavior of exact solutions of physically relevant QFT’s can be appreciably more singular, and the crucial question of the strength of these singularities, as an important characteristic of the interaction, has attracted the attention of many authors, see [35] and references therein. The principal difficulty in deriving and even in formulating the general PWS-like theorem lies in finding the proper substitute for the notion of support for the distribution$^1$ classes larger than ultradistributions and hyperfunctions, when it loses its sense. Classes of such a kind were considered [3–10] as candidates for a consistent quantum theory of nonlocal interactions, where severe singularities occur in the ultraviolet domain, and they make an appearance in the local and covariant quantization of gauge field models [22–24, 35], where one is facing the infrared singularity problem. Our extension of the PWS theorem is based on the employment of an angular localizability notion introduced in [10, 31] and is intended to express the physical ideas of causality and positivity of the energy in the field theories with highly singular behavior of correlation functions. It reveals the remarkable permanence of the basic analytic structure of QFT and is general enough to cover the wildest singularities predicted by field-theorists treating gauge models. We shall now state the main ingredients that go into the mathematical framework developed below.
The analysis of simple explicitly soluble gauge models [22–24] has shown that their correlation functions are correctly defined as distributions with momentum-space test functions in the Gelfand-Shilov [11] spaces $S^\beta_\alpha$, where $\alpha$ and $\beta$ are model-dependent and the superscript is less than one. These spaces have long ago been exploited [8, 9], equally with the spaces $S^\beta$ [4–7], in nonlocal QFT, where they came into play in configuration representation, and we consider the use of them to be adequate to our purpose. The elements of $S^\beta_\alpha$ decrease at infinity exponentially with order $1/\alpha$ and a finite type, and their Fourier transforms behave analogously but with order $1/\beta$. We recall that for $\beta > 1$, the configuration space test functions $S^\beta_\alpha$ serve as the functional domain of definition of the strictly localizable Jaffe fields [14, 15] whose local properties can be described by the usual “partition of unity” method and so are no different in the main from those of tempered distributions. The elements of $S^1_\alpha$ admit an analytic continuation into a complex neighborhood of the real space and in order to define the supports of distributions with such test functions, one has to use more sophisticated methods of the theory of hyperfunctions whose analytic basis is formed by Hörmander’s $L^2$-estimates [12] for solutions of the nonhomogeneous Cauchy-Riemann equations. In conformity to QFT, the corresponding framework, where $S^1_1$ plays the role of a universal object for local fields, has been developed by Nagamachi and Mugibayashi [2, 25, 26]. In the case $\beta < 1$ of interest to us, both these methods break down, but some ideas underlying them can be adapted for use. The notion of angular localizability is formalized by using test function spaces related to $S^\beta_\alpha$ and associated naturally with closed cones in $R^n$, whose definition will be given just below. As usual, we denote the continuous dual space by a prime.
A distribution belonging to Sα0β may be thought of as carried by a closed cone K if it allows a continuous linear extension to the space Sαβ (K) , and then the following theorem holds. Theorem 1. Let K1 and K2 be closed cones in Rn and let β < 1. Every distribution on Sαβ carried by K1 ∪ K2 can be decomposed into a sum of two distributions carried 1 Throughout this paper, the term distribution is used synonymously with, and instead of, generalized function, while the usual distributions with finite order of singularity are called Schwartz or tempered.
Extension of Distribution Theory Related to Gauge QFT
581
by $K_1$ and $K_2$. Furthermore, if each of the cones $K_1$, $K_2$ is a carrier of $f \in S'^{\beta}_{\alpha}$, then so is $K_1 \cap K_2$ and consequently $f$ has a unique minimal carrier cone.

For $0 < \beta < 1$, these facts can be established in a rather elementary way [31] which, however, is inapplicable to the case $\beta = 0$ when the Fourier transforms of test functions have compact supports. This borderline case is of particular interest because using $S^{0}_{\alpha}$ imposes no restrictions on the singularity. In other words, these spaces may serve as a universal object for covariant formulation of gauge QFT and construction of gauge-like models. The general proof of Theorem 1 set forth in [32] and covering $\beta = 0$ is based on a representation of the spaces $S^{\beta}_{\alpha}(K)$ as the inductive limits of Hilbert spaces of entire analytic functions, which enables one to take advantage of Hörmander's $L^2$-estimates again. So the situation somewhat copies that of the familiar case β ≥ 1 outlined above. Here we use the language of complex variables from the very outset, and we refer to our previous work [32] for the interplay with the theory of hyperfunctions and for the relationship to the original Gelfand and Shilov definition of $S^{\beta}_{\alpha}$ used traditionally in nonlocal QFT and given in terms of real variables.

First we assign a kindred space to every open cone with the vertex at the origin. If $U \subset \mathbb{R}^n$ is such a cone, then $S^{\beta}_{\alpha}(U)$ is defined to be the union or, more precisely, the inductive limit of Hilbert spaces $H^{\beta,b}_{\alpha,a}(U)$ ($a, b \to \infty$) consisting of entire functions on $\mathbb{C}^n$ and provided with the scalar products

$$\langle \varphi, \psi \rangle_{U,a,b} = \int \bar{\varphi}(z)\,\psi(z)\,\exp\{-2\rho_{U,a,b}(z)\}\,d\lambda, \tag{1}$$

where $d\lambda$ stands for the Lebesgue measure on $\mathbb{C}^n$ and

$$\rho_{U,a,b}(z) = -|x/a|^{1/\alpha} + d(bx, U)^{1/(1-\beta)} + |by|^{1/(1-\beta)}, \tag{2}$$
with $z = x + iy$ and $d(\cdot, U)$ being the distance of the point to $U$. We remark that the union is independent of the choice of the norm $|\cdot|$ in $\mathbb{R}^n$ and this degree of freedom will be of use in what follows, though at first one may hold it Euclidean. A sequence $\varphi_\nu \in S^{\beta}_{\alpha}(U)$ is regarded to be convergent if it is contained in some $H^{\beta,b}_{\alpha,a}(U)$ and is $\|\cdot\|_{U,a,b}$-convergent. Starting from $S^{\beta}_{\alpha}(U)$, the space $S^{\beta}_{\alpha}(K)$ associated with a closed cone $K$ is constructed by taking another inductive limit through those $U$ which contain the set $K \setminus \{0\}$ and shrink to it. The origin plays the role of the least element in the lattice of closed cones, and its associated space $S^{\beta}_{\alpha}(\{0\})$ is defined by the same formulas, but with the first term in (2) being dropped and $|\cdot|$ substituted for $d(\cdot, U)$. It should be noted that the designations used in (2) are inherited from nonlocal QFT and will be convenient for most of the derivations below. However, in the final application to infrared singular fields, the replacement $x \to p$, $y \to q$ is advisable of course.

For the reader's convenience, in Sect. 2 we briefly sketch the proof of the basic Theorem 1 presented in every detail in [32]. We show also that the spaces associated with cones are nuclear. This extends, with a simpler proof, Mitiagin's well-known result concerning $S^{\beta}_{\alpha}$ and implies some nice topological properties which are of use in QFT. Specifically, the nuclearity enables us to prove in Sect. 3 that the tensor product of the spaces over open cones, when completed under a proper topology, is identical to the space associated with the cross product of the cones. The desired extension of the Paley–Wiener–Schwartz theorem is established in Sect. 4 by combining this fact with Theorem 1 and Theorem 11.5 of [18]. All these derivations are expounded for the widest distribution class $S'^{0}_{\alpha}$ when, fortunately, the designations get simplified.
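As a small numerical illustration of the weight exponent (2), the sketch below evaluates $\rho_{U,a,b}$ in the simplest one-dimensional situation. The choice of cone $U = (0, \infty)$, the function name `rho_half_line` and the sample parameter values are assumptions made here for illustration only; they do not come from the text.

```python
def rho_half_line(x, y, a, b, alpha, beta=0.0):
    """Weight exponent (2) for z = x + i*y and the open cone U = (0, inf) in R^1.

    Hypothetical helper for illustration; assumes a > 0, b > 0 and 0 <= beta < 1.
    """
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    p = 1.0 / (1.0 - beta)
    # Distance from the point b*x to the half-line U: zero when b*x lies in U.
    dist = max(0.0, -b * x)
    return -abs(x / a) ** (1.0 / alpha) + dist ** p + abs(b * y) ** p


if __name__ == "__main__":
    # Inside the cone only the decay term -|x/a|^(1/alpha) survives.
    print(rho_half_line(4.0, 0.0, a=1.0, b=1.0, alpha=2.0))
    # Outside the cone the distance term dominates.
    print(rho_half_line(-4.0, 0.0, a=1.0, b=1.0, alpha=2.0))
```

Since the norm is built from the weight $\exp\{-2\rho\}$, a negative exponent along the cone forces decay of the test functions there, while the distance term leaves room for exponential growth in the directions outside $U$.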
Certainly the case of nonzero β < 1 can be treated in the same manner but the corresponding PWS-type theorem has already been proved in [31] by another method. Section 5 contains a complete
M.A. Soloviev
proof of an approximation theorem announced in [32], which asserts that the space $S^{0}_{\alpha}$ is dense in every $S^{0}_{\alpha}(K)$. Here again a leading part is played by Hörmander's $L^2$-estimate, but this time relating to the dual equation. This theorem implies that the space of distributions carried by $K$ can be identified with $S'^{0}_{\alpha}(K)$. The listed results form a basis for extension of methods of Euclidean and constructive field theory to gauge models with singular infrared behavior and for more special derivations such as structure theorems and representations for the corresponding correlation functions. Section 6 is devoted to concluding remarks.

2. Nuclearity

Theorem 2. For each open cone $U$, the space $S^{0}_{\alpha}(U)$ is nuclear and this property is inherited by the spaces corresponding to closed cones.

Proof. We refer the reader to Schaefer [28] for definition and basic facts concerning nuclear locally convex spaces. It is sufficient to show that, for each $a' > a$, $b' > b$, the inclusion mapping $i^{bb'}_{aa'}: H^{0,b}_{\alpha,a}(U) \to H^{0,b'}_{\alpha,a'}(U)$ is a Hilbert–Schmidt operator. Then $i^{bb'}_{aa'}$ is also nuclear since it may be regarded as a composition of two Hilbert–Schmidt operators. According to [28], it follows that the projective limit

$$S^{0,b+}_{\alpha,a+}(U) = \bigcap_{\varepsilon>0} H^{0,b+\varepsilon}_{\alpha,a+\varepsilon}(U) \tag{3}$$

is a nuclear space. The spaces $S^{0}_{\alpha}(U)$ and $S^{0}_{\alpha}(K)$ can be represented as countable inductive limits of auxiliary spaces of the form (3) and so they are nuclear by the inheritance properties listed in [28], Sect. III.7.4.

We denote by $\tilde{H}^{0,b}_{\alpha,a}(U)$ the Hilbert space of locally square-integrable functions on $\mathbb{C}^n$ equipped with the same scalar product as that of $H^{0,b}_{\alpha,a}(U)$, i.e., $\tilde{H}^{0,b}_{\alpha,a}(U) = L^2(\mathbb{C}^n, e^{-2\rho}d\lambda)$. Let us consider within this scale of spaces the integral operator $G$ defined by the kernel

$$G(z' - z) = \frac{1}{(2\pi)^{2n}} \int \frac{e^{ip(x'-x)+iq(y'-y)}}{(p^2+q^2+M^2)^{n+1}}\, d^np\, d^nq, \tag{4}$$

which is nothing but the inverse of $(-\Delta + M^2)^{n+1}$. We claim that, if $M$ is properly chosen, $G$ acts as a Hilbert–Schmidt operator $\tilde{H}^{0,b}_{\alpha,a}(U) \to \tilde{H}^{0,b'}_{\alpha,a'}(U)$. On the other hand, its restriction onto $H^{0,b}_{\alpha,a}(U)$ is the identity operator, up to the factor $M^{-2(n+1)}$, because the analytic functions satisfy the Laplace equation. Thus showing our claim implies that the injection $i^{bb'}_{aa'}$ has the desired property. We use the standard estimate

$$|G(z' - z)| \le C \exp\{-m|z' - z|\} \tag{5}$$

valid for each $m < M$. Due to the rotational invariance, when deriving (5) one may assume that the only nonzero component of the argument is $x'_1 - x_1$ and then shift the path of integration in the $p_1$-plane. Changing variables and combining (5) with the elementary inequalities $\sqrt{2}\,|z| \ge |x| + |y|$ and

$$|y' - y| \le |y'| + |y|, \qquad d(x' - x, U) \le d(x', U) + |x|, \qquad -|x' - x|^{1/\alpha} \le |x|^{1/\alpha} - |x'|^{1/\alpha}, \tag{6}$$

where the latter is true since $\alpha > 1$, we find that

$$\int |G(z' - z)|^2 \exp\{2\rho_{U,a,b}(z)\}\,d\lambda \le C' \exp\{2\rho_{U,a,b}(z')\}, \tag{7}$$

provided $M > b\sqrt{2}$. Then the integral $\int G(z' - z)\varphi(z)\,d\lambda$, where $\varphi \in \tilde{H}^{0,b}_{\alpha,a}(U)$, belongs to any $\tilde{H}^{0,b'}_{\alpha,a'}(U)$ with $a' > a$, $b' > b$ by virtue of Schwarz's inequality. Next we note that multiplication by the weight function $e^{-\rho}$ generates a unitary mapping of $\tilde{H}^{0,b}_{\alpha,a}(U)$ onto $L^2(\mathbb{C}^n)$ and hence our claim is equivalent to saying that $e^{-\rho'} G e^{\rho}$, with $\rho' = \rho_{U,a',b'}$, is a Hilbert–Schmidt operator on $L^2$. The same formula (7) shows that this is the case with $M$ as above, i.e., the kernel of $e^{-\rho'} G e^{\rho}$ is square-integrable.

For accuracy, let us specify a space containing those in question whereon $(-\Delta + M^2)^{n+1}$ acts as an automorphism. We take it to be the dual of the space $S_{1,l}(\mathbb{R}^{2n})$ consisting of infinitely differentiable functions with the property that

$$|\partial^\kappa \varphi(x, y)| \le C_\kappa \exp\{-|x/l| - |y/l|\}$$

for all multi-indices $\kappa$, and equipped with the corresponding topology. Clearly $(-\Delta + M^2)^{n+1}$ is a continuous operator on $S_{1,l}$, while $G$ maps this space into itself provided $l > \sqrt{2}/M$, and then it is just the inverse operator, as can easily be seen with the use of Fubini's theorem. The dual space contains $\tilde{H}^{0,b'}_{\alpha,a'}(U)$ if $l < 1/b'$ and the dual operator also has a continuous inverse whose restriction to $\tilde{H}^{0,b}_{\alpha,a}(U)$ cannot be different from $G$ since $S_{1,l}$ is dense there. This completes the proof.

Remark 1. Each Hilbert–Schmidt operator is compact and therefore so is the injection $i^{bb'}_{aa'}$. The limit space of a projective (injective) sequence of locally convex spaces with compact connecting mappings is referred to as an FS (DFS) space respectively, FS being an abbreviation of Fréchet–Schwartz and D signifying "dual". Thus, as a consequence of Theorem 2, the spaces $S^{0}_{\alpha}(U)$ and $S^{0}_{\alpha}(K)$ are DFS whereas their strong dual spaces as well as the spaces (3) are FS. In this connection, it perhaps should be recalled that all spaces of these two types are complete, reflexive, separable and Montel, see, e.g., [17] for more detailed comments. Certainly, these nice topological properties hold true [31] for nonzero β, and they simplify in particular the proof of Theorem 1. In a more refined formulation [32], it asserts that the sequence

$$0 \to S'^{\beta}_{\alpha}(K_1 \cap K_2) \to S'^{\beta}_{\alpha}(K_1) \oplus S'^{\beta}_{\alpha}(K_2) \to S'^{\beta}_{\alpha}(K_1 \cup K_2) \to 0 \tag{8}$$
is exact. All the arrows in (8) are natural mappings and the next to last one maps a pair of linear forms into the difference of their restrictions. Because the involved spaces are FS, this assertion is equivalent to saying that the sequence

$$0 \leftarrow S^{\beta}_{\alpha}(K_1 \cap K_2) \leftarrow S^{\beta}_{\alpha}(K_1) \oplus S^{\beta}_{\alpha}(K_2) \leftarrow S^{\beta}_{\alpha}(K_1 \cup K_2) \leftarrow 0 \tag{9}$$

is exact. Moreover, both of them are topologically exact for the same reason.

Remark 2. The formula (8) represents a weakened version of a similar formula for the Fourier hyperfunctions ($\alpha = \beta = 1$) which is valid for every pair of closed sets in the radially compactified $\mathbb{R}^n$, ensures the existence of supports for the elements of $S'^{1}_{1}$ and is really a simple way of describing their local properties with an accuracy sufficient for use in QFT.
The only nontrivial conclusion concerning (9) is the exactness at $S^{\beta}_{\alpha}(K_1 \cap K_2)$, which means that each element of this space can be decomposed into a sum of two functions belonging to $S^{\beta}_{\alpha}(K_j)$, $j = 1, 2$. For $0 < \beta < 1$, such a decomposition presents no serious problems and essentially copies the usual partition of unity. Namely, let $\varphi \in H^{\beta,b}_{\alpha,a}(U)$, where $U$ is a cone-shaped neighborhood of $K \setminus \{0\}$ with $K$ denoting the intersection $K_1 \cap K_2$. We recall that a cone $V$ is said to be a (relatively) compact subcone of $U$ if $\bar{V} \setminus \{0\} \subset U$, where the bar denotes closure, and then the notation $V \Subset U$ is used. Choose an open cone $V$ so that $K \Subset V \Subset U$. Since the angular distance between the closed cones $K_j \setminus V$ is nonzero, there are open cones $V_j$ such that $K_j \setminus V \Subset V_j$ and

$$|x - \xi| \ge \theta|x|, \qquad |x - \xi| \ge \theta|\xi| \qquad \text{for all } x \in V_1,\ \xi \in V_2, \tag{10}$$

where θ is a positive constant. Let us take $\chi_0 \in H^{\beta,b'}_{1-\beta,a'}$ so that $\int \chi_0\,dx = 1$ and set

$$\chi(z) = \int_{V_2} \chi_0(z - \xi)\,d\xi.$$

Using (10), one can verify that $\chi\varphi \in S^{\beta}_{\alpha}(K_1)$ provided $a' < \theta/b$ and show that $(1 - \chi)\varphi \in S^{\beta}_{\alpha}(K_2)$ if $a' < \theta'/b$, where $\theta'$ is the angular distance of $K_2 \setminus V$ to the complement of $V_2$.

When $\beta = 0$, this argument fails because the space $S^{0}_{1}$ is trivial, but one may follow the regular way [12] of solving the Cousin problem and start from a decomposition into nonanalytic functions, using this time a standard bump function $\chi_0 \in C_0^\infty(\mathbb{R}^n)$ and setting $\chi_0(z) = \chi_0(\mathrm{Re}\,z)$. The functions $\varphi_1 = \chi\varphi$ and $\varphi_2 = (1 - \chi)\varphi$, with χ defined as above, possess the required behavior at infinity and, on writing $\varphi = (\varphi_1 - \psi) + (\varphi_2 + \psi)$, our problem amounts to finding a solution of the system of equations

$$\frac{\partial\psi}{\partial\bar{z}_j} = \eta_j \qquad (j = 1, \ldots, n) \tag{11}$$

with the same growth properties as those of $\eta_j = \varphi\,\partial\chi/\partial\bar{z}_j$. The existence of such a solution is ensured by Hörmander's fundamental theorem [12], though there is a subtlety here. The point is that the function (2) is not plurisubharmonic, while this property is crucial for Hörmander's estimate. However, one can replace ρ by its greatest plurisubharmonic minorant and this leaves the space unaltered since $\ln|\varphi|$ is plurisubharmonic for any analytic function φ.
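The reduction to the system (11) can be spelled out in one line. The following worked step is a sketch that uses only the analyticity of φ and the definitions $\varphi_1 = \chi\varphi$, $\varphi_2 = (1-\chi)\varphi$ given above; it shows that both summands in the decomposition become analytic exactly when ψ solves (11).

```latex
% phi is entire, so \partial\varphi/\partial\bar z_j = 0 and the Leibniz rule gives
\begin{align*}
\frac{\partial \varphi_1}{\partial \bar z_j}
   &= \varphi\,\frac{\partial \chi}{\partial \bar z_j} = \eta_j ,
 &
\frac{\partial \varphi_2}{\partial \bar z_j}
   &= -\varphi\,\frac{\partial \chi}{\partial \bar z_j} = -\eta_j ,\\[4pt]
\frac{\partial (\varphi_1 - \psi)}{\partial \bar z_j}
   &= \eta_j - \frac{\partial \psi}{\partial \bar z_j} ,
 &
\frac{\partial (\varphi_2 + \psi)}{\partial \bar z_j}
   &= -\eta_j + \frac{\partial \psi}{\partial \bar z_j} .
\end{align*}
% Both right-hand sides in the second row vanish for all j = 1, ..., n
% if and only if \partial\psi/\partial\bar z_j = \eta_j, which is (11).
```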
3. Tensor Products

We will consider the tensor product $S^{0}_{\alpha}(U_1) \otimes_i S^{0}_{\alpha}(U_2)$ equipped with the inductive topology $\tau_i$, that is, the finest locally convex topology under which the canonical bilinear mapping $(\varphi, \psi) \to \varphi \otimes \psi$ is separately continuous. Recall [28] that this topologization has the following category meaning. If $E$, $F$, and $G$ are locally convex spaces and $u: E \times F \to G$ is a bilinear separately continuous mapping, then its associated linear mapping $u_*: E \otimes_i F \to G$ is continuous. Specifically, this implies the continuity of the natural injection

$$S^{0}_{\alpha}(U_1) \otimes_i S^{0}_{\alpha}(U_2) \to S^{0}_{\alpha}(U_1 \times U_2) \tag{12}$$

generated by the correspondence $\sum \varphi_j \otimes \psi_j \to \sum \varphi_j(z_1)\psi_j(z_2)$.
Theorem 3. When extended by continuity to the completion of the tensor product, the embedding (12) turns into an algebraic and topological isomorphism, that is, for any open cones $U_1, U_2 \subset \mathbb{R}^n$, the following identification holds:

$$S^{0}_{\alpha}(U_1) \,\hat{\otimes}_i\, S^{0}_{\alpha}(U_2) = S^{0}_{\alpha}(U_1 \times U_2).$$

Our basic representation $S^{0}_{\alpha}(U) = \operatorname{inj\,lim} H^{0,b}_{\alpha,a}(U)$ ($a, b \to \infty$) reduces the problem to that in Hilbert spaces, but here care is necessary because usually the topology on the tensor product of these latter is determined quite differently, by means of its natural scalar product [27]. It is customary to denote the completed tensor product in the Hilbert space category by $H_1 \otimes H_2$, and we hope this will not lead to a misunderstanding, though the same notation is used for the algebraic tensor product. The definition (2) should be fitted to the problem at hand, and this time we set

$$\rho_{U,a,b}(z) = -\sum_j |x_j/a|^{1/\alpha} + b \inf_{\xi \in U} \sum_j |x_j - \xi_j| + b \sum_j |y_j|, \tag{13}$$

so that the multiplicativity relation $\exp\{-\rho_{U_1 \times U_2}(z_1, z_2)\} = \exp\{-\rho_{U_1}(z_1)\} \exp\{-\rho_{U_2}(z_2)\}$ is fulfilled.

Lemma 1. If the defining weight function is chosen in the multiplicative form, then

$$H^{0,b}_{\alpha,a}(U_1) \otimes H^{0,b}_{\alpha,a}(U_2) = H^{0,b}_{\alpha,a}(U_1 \times U_2). \tag{14}$$
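The multiplicativity relation behind Lemma 1 depends on measuring the distance to the cone in the sum norm, as in (13). The sketch below checks this numerically for the simplest product, $U_1 = U_2 = (0, \infty)$, whose cross product is the open positive quadrant; it also shows that the Euclidean distance does not split in the same way. The choice of cones, the helper names and the sample points are assumptions made for illustration.

```python
import math


def dist_l1_orthant(x):
    """Sum-norm distance from x to the positive orthant (separable in the coordinates)."""
    return sum(max(0.0, -xj) for xj in x)


def dist_l2_orthant(x):
    """Euclidean distance from x to the positive orthant, for comparison only."""
    return math.sqrt(sum(max(0.0, -xj) ** 2 for xj in x))


def rho13(x, y, a, b, alpha):
    """Weight exponent (13) for the positive orthant; x and y list Re z_j and Im z_j."""
    decay = sum(abs(xj / a) ** (1.0 / alpha) for xj in x)
    return -decay + b * dist_l1_orthant(x) + b * sum(abs(yj) for yj in y)


if __name__ == "__main__":
    a, b, alpha = 2.0, 1.5, 3.0
    x1, y1 = [-1.0], [0.5]
    x2, y2 = [-2.0], [-0.25]
    joint = rho13(x1 + x2, y1 + y2, a, b, alpha)
    split = rho13(x1, y1, a, b, alpha) + rho13(x2, y2, a, b, alpha)
    print(joint, split)  # the two exponents agree up to rounding
    print(dist_l2_orthant(x1 + x2), dist_l2_orthant(x1) + dist_l2_orthant(x2))
```

Additivity of the exponent is equivalent to the factorization of $\exp\{-\rho\}$, which is what allows Fubini's theorem to be used in the proof of (14).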
The proof faithfully copies that of the analogous statement in [27], Sect. II.4, about weighted spaces of locally square-integrable functions. Choosing bases $\{\varphi_j\}$ and $\{\psi_k\}$ in the spaces on the left-hand side of (14) and using Fubini's theorem, one makes sure that $\{\varphi_j(z_1)\psi_k(z_2)\}$ is a basis for the space on the right and so the natural injection of the algebraic tensor product into it can uniquely be extended to a unitary operator.

As an immediate consequence, one obtains that the image of (12) is everywhere dense, by the very definition of convergence in $S^{0}_{\alpha}(U)$. Thus to prove Theorem 3, it is sufficient to show a continuous mapping

$$S^{0}_{\alpha}(U_1 \times U_2) \to S^{0}_{\alpha}(U_1) \,\hat{\otimes}_i\, S^{0}_{\alpha}(U_2) \tag{15}$$

whose composition with (12) is the identity mapping. Indeed, then $\tau_i$ is the same as the topology induced on the tensor product by that of $S^{0}_{\alpha}(U_1 \times U_2)$. At this point we will take advantage of the fact that, in the case of nuclear Fréchet spaces, $\tau_i$ coincides with another often-used topology on tensor products, the so-called topology of equicontinuous convergence $\tau_e$ which has a quite simple description. The auxiliary spaces (3) are just of this type and the space (14) can be canonically mapped into the completion of their tensor product endowed with the topology $\tau_e$ by virtue of the following

Lemma 2. Let $H_1$, $H_2$ be Hilbert spaces. The Hilbert norm $\|\cdot\|$ on their tensor product is stronger than the norm which determines the topology $\tau_e$, that is, the identity mapping

$$H_1 \otimes_{\|\cdot\|} H_2 \to H_1 \otimes_e H_2 \tag{16}$$

is continuous.
Proof. According to [28], for any pair of normed spaces, the norm $\|\cdot\|_e$ is defined by

$$\Big\| \sum \varphi_j \otimes \psi_j \Big\|_e = \sup_{\|f\|' \le 1,\ \|g\|' \le 1} \Big| \sum (f, \varphi_j)(g, \psi_j) \Big|,$$

where $f$, $g$ belong to the dual spaces and the dual norms are marked by primes. By Riesz's theorem, in the Hilbert case the linear forms $f$, $g$ are identified with elements of the spaces $H_1$, $H_2$ themselves and the primes can be dropped, so the sum on the right-hand side turns into the scalar product $(f \otimes g, \sum \varphi_j \otimes \psi_j)$ and Schwarz's inequality yields $\|\sum \varphi_j \otimes \psi_j\|_e \le \|\sum \varphi_j \otimes \psi_j\|$.

In order to complete the proof of Theorem 3, it remains to combine the lemmas and note that both topologies $\tau_e$ and $\tau_i$ are consistent with tensoring morphisms. Namely, if $h_1: E_1 \to F_1$ and $h_2: E_2 \to F_2$ are continuous linear mappings of locally convex spaces, then $h_1 \otimes h_2: E_1 \otimes_e E_2 \to F_1 \otimes_e F_2$ and $E_1 \otimes_i E_2 \to F_1 \otimes_i F_2$ are continuous too. For $\tau_e$, this fact can readily be established using the explicit form [28] of a base of neighborhoods and for $\tau_i$ by the category arguments. First we take $h_j$ to be the natural injections $H^{0,b}_{\alpha,a}(U_j) \to S^{0,b+}_{\alpha,a+}(U_j)$ and use $\tau_e$. Next we consider the inclusion mappings $S^{0,b+}_{\alpha,a+}(U_j) \to S^{0}_{\alpha}(U_j)$ and endow the tensor products with $\tau_i$. Making up a composition with (16) and passing to the completions, we arrive at the embeddings

$$H^{0,b}_{\alpha,a}(U_1 \times U_2) \to S^{0}_{\alpha}(U_1) \,\hat{\otimes}_i\, S^{0}_{\alpha}(U_2)$$

which are evidently compatible for all $a$, $b$ and, taken together, determine the desired mapping (15) which is continuous by the definition of inductive limit topology, and whose restriction to $S^{0}_{\alpha}(U_1) \otimes S^{0}_{\alpha}(U_2)$ is the identity mapping by construction.

4. A Generalization of the Paley–Wiener–Schwartz Theorem

As originally formulated, the PWS theorem establishes necessary and sufficient conditions for an analytic function to be the Laplace transform of a tempered distribution with compact or cone-shaped support. An analogous theorem for the more singular distributions defined on $S^{\beta}_{\alpha}$, $\beta > 1$, has been derived in [29] with the aim of application to strictly local QFT's of the Jaffe type [14, 15]. An appropriate generalization to the nonlocalizable case $0 < \beta < 1$ was formulated there too, but its complete proof was set forth much later, see [31], Theorem 5.23. In contrast to the case of Fourier hyperfunctions considered by Kawai [16], this proof is elementary in essence and makes use of the fact that an element $f$ of $S'^{\beta}_{\alpha}$, $0 < \beta < 1$, is carried by a cone if and only if the convolution $f * \varphi$ with $\varphi \in S^{\beta}_{1-\beta}$ falls off like φ in the complementary cone. Now we are in a position to treat the most difficult case $\beta = 0$ of arbitrary singularity. It is not so easy because $S^{0}_{1}$ is trivial, but one may use a result of Komatsu [18] who has established the growth conditions under which analytic functions have boundary values belonging to $S'^{\alpha}_{0}$. Actually he considered even a finer scale of spaces designated as $D^{\{M_p\}}(\Omega)$, where $M_p = p^{\alpha p}$ and $\Omega = \mathbb{R}^n$ for our case. A combination of Komatsu's result with Theorems 1 and 3 leads directly to the desired extension of the PWS theorem.

We now recall terminology and some simple facts concerning cones with the vertex at the origin. For a given cone $V \subset \mathbb{R}^n$, the dual cone is defined by $V^* = \{\eta : \eta x \ge 0,\ \forall x \in V\}$. The convex hull of $V$ is denoted by $\operatorname{ch}V$. Clearly $V^* = (\operatorname{ch}V)^*$ and $V^{**} = \operatorname{ch}V$.
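The definition of the dual cone and the identity $V^* = (\operatorname{ch}V)^*$ can be illustrated for a cone given by finitely many generating rays. The sketch below is a minimal example under that assumption; the function name `in_dual_cone`, the tolerance and the sample vectors are hypothetical and chosen only for illustration.

```python
def in_dual_cone(eta, generators, tol=1e-12):
    """Return True if eta . v >= 0 for every generating ray v, i.e. eta lies in V*.

    Because the pairing is linear in v, testing the generators alone also decides
    the condition on every nonnegative combination, that is, on ch V.
    """
    return all(sum(e * c for e, c in zip(eta, v)) >= -tol for v in generators)


if __name__ == "__main__":
    # V is the union of the two rays along (1, 0) and (1, 1) in R^2 (not convex).
    rays = [(1.0, 0.0), (1.0, 1.0)]
    print(in_dual_cone((0.0, 1.0), rays))    # on the boundary of V*
    print(in_dual_cone((1.0, -0.5), rays))   # strictly positive on both rays
    print(in_dual_cone((-1.0, 0.0), rays))   # negative on both rays
```

Here $\operatorname{ch}V$ contains no straight line, and accordingly the dual cone has interior points such as $(1, -0.5)$, in line with the characterization of proper cones.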
A cone V is said to be proper if chV does not contain a straight line, or equivalently, if
the interior of the dual cone is nonempty. By a tubular cone in $\mathbb{C}^n$ is meant one of the form $\mathbb{R}^n + iV$, with $V$ an open cone in $\mathbb{R}^n$. We will name it simply a tube and denote it by $T^V$. It should also be noted that we define the Fourier transformation $F_{\xi \to x}$ of test functions by $\psi(\xi) \to \int e^{ix\xi}\psi(\xi)\,d\xi$ and denote the inverse operator by $F_{\xi \leftarrow x}$. Their dual operators acting on distributions are designated as $F_{x \to \xi}$ and $F_{x \leftarrow \xi}$ respectively.

Theorem 4. Let $K$ be a closed proper cone in $\mathbb{R}^n$ and let $V$ be the interior $\operatorname{int}K^*$ of the dual cone. Suppose that $K$ is a carrier of $f \in S'^{0}_{\alpha}(\mathbb{R}^n)$, $\alpha > 1$. Then the distribution $f$ has a uniquely determined Laplace transform $g(\zeta)$ which is analytic in the tube $T^V$ and satisfies the estimate

$$|g(\zeta)| \le C_{\varepsilon,R}(V') \exp\{\varepsilon\,|\eta|^{-1/(\alpha-1)}\} \qquad (\eta = \operatorname{Im}\zeta \in V',\ |\zeta| \le R) \tag{17}$$

for each $\varepsilon, R > 0$ and any open relatively compact subcone $V'$ of $V$. As $\eta \to 0$ inside a fixed $V'$, the function $g(\xi + i\eta)$ tends in the topology of $S'^{\alpha}_{0}$ to the Fourier transform $F_{x \to \xi} f$. Conversely, if $g(\zeta)$ is an analytic function on $T^V$, with $V$ an open connected cone, and if its growth near the real boundary is bounded by (17), then $g(\zeta)$ is the Laplace transform of a distribution defined on $S^{0}_{\alpha}$ and carried by the cone $V^*$.

Remark 3. In the case of nonzero β examined in [31], the bound (17) is replaced by

$$|g(\zeta)| \le C_{\varepsilon}(V') \exp\{\varepsilon|\zeta|^{1/\beta} + \varepsilon|\eta|^{-1/(\alpha-1)}\} \qquad (\eta \in V'). \tag{17$'$}$$
The proof of the first part of Theorem 4 is direct and no different in the main from that for $\beta > 0$. The exponentials $e^{iz\zeta}$ belong to $S^{0}_{\alpha}(K)$ for each $\zeta \in T^V$ and one can set

$$g(\zeta) = (\hat{f}, e^{iz\zeta}), \tag{18}$$

with the caret denoting an extension to this space. The estimate (17) is a consequence of the inequality

$$|g(\zeta)| \le \|\hat{f}\|_{U,a,b}\, \|e^{iz\zeta}\|_{U,a,b}, \tag{19}$$

where $a$, $b$ can be taken arbitrarily large and $U$ is any open cone such that $K \Subset U$. From the definition of the norm $\|\cdot\|_{U,a,b}$, it follows that

$$\|e^{iz\zeta}\|_{U,a,b} \le C_{a',b'} \sup_{x,y} \exp\{-x\eta - y\xi + |x/a'|^{1/\alpha} - d(bx, U) - |b'y|\} \tag{20}$$

for any $a' < a$, $b' < b$. Taking $b' > R$, one sees that the terms dependent on $y$ are unessential for $|\zeta| \le R$ and may be omitted from the formula. The cone $U$ and another auxiliary open cone $U'$ should be taken so that $K \Subset U \Subset U' \Subset \operatorname{int}V'^*$. This is possible since $V' \Subset \operatorname{int}K^*$ implies $K \Subset \operatorname{int}V'^*$. If $x \notin U'$, then $d(x, U) > \delta|x|$ with $\delta > 0$ and, choosing $b > R/\delta$, one can majorize the exponent (20) by a constant. Now let $x$ be inside $U'$. By standard compactness arguments, from the inclusion $U' \Subset \operatorname{int}V'^*$ it follows that

$$-x\eta \le -\theta|x||\eta| \qquad \text{for all } x \in U',\ \eta \in V', \tag{21}$$

with θ a positive constant. Inserting (21) into (20), dropping the term $d(\cdot, U)$ and locating the maximum, we arrive at (17). A simple estimation proceeding along similar lines shows that, for any $\zeta \in T^{V'}$, the difference quotients corresponding to the partial derivatives $\partial e^{iz\zeta}/\partial\zeta_j$ converge in the topology of $S^{0}_{\alpha}(U)$ and hence $g(\zeta)$ is analytic. Further, using the mean value theorem, one can verify that, for each $\psi \in S^{\alpha}_{0}$ and $\eta \in V'$, the Riemann sums corresponding to the integral $\int e^{iz\zeta}\psi(\xi)\,d\xi$ converge in $S^{0}_{\alpha}(U)$ to $\varphi(z)e^{-z\eta}$, where $\varphi(z) = \int e^{iz\xi}\psi(\xi)\,d\xi$, and therefore the identity

$$\int g(\zeta)\psi(\xi)\,d\xi = (\hat{f}, \varphi(z)e^{-z\eta}) \tag{22}$$

holds in $V$. Finally, it is again straightforward to prove the convergence $\varphi e^{-z\eta} \to \varphi$ in the same space as $\eta \to 0$ inside $V'$. Thus the Fourier transform $F_{x \to \xi} f$ is a boundary value of $g(\zeta)$ in the sense of weak convergence. But the latter implies strong convergence since $S'^{\alpha}_{0}$ is Montel. It should be noted that the extension used in (18) is unique by Theorem 5 proved below, but really there is no need to appeal to it here since the difference of two analytic functions with the same boundary value must vanish, see Ref. [13], Theorem 9.3.3.

Remark 4. Evidently the space of analytic functions on $T^V$ with the growth property (17) is an algebra under multiplication for which we will use the notation $A_\alpha(T^V)$. It can be made into an FS space and into a topological algebra by giving it the projective limit topology determined by the set of norms

$$\|g\|_{V',R,\varepsilon} = \sup_{\eta \in V',\ |\zeta| \le R} |g(\zeta)| \exp\{-\varepsilon|\eta|^{-1/(\alpha-1)}\}.$$
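The step "locating the maximum" that leads from (20) and (21) to (17) is a one-variable calculation: with $r = |x|$ and $c = \theta|\eta|$ one maximizes $-cr + (r/a')^{1/\alpha}$ over $r \ge 0$. The sketch below compares a closed form for this supremum, obtained by setting the derivative to zero, with a brute-force grid search. The function names, the grid parameters and the sample values are assumptions made for illustration.

```python
def exponent_sup_closed_form(c, a, alpha):
    """sup over r >= 0 of -c*r + (r/a)**(1/alpha), for c > 0, a > 0, alpha > 1.

    Setting the derivative to zero gives the value
    (1 - 1/alpha) * (alpha*a*c) ** (-1/(alpha - 1)).
    """
    return (1.0 - 1.0 / alpha) * (alpha * a * c) ** (-1.0 / (alpha - 1.0))


def exponent_sup_grid(c, a, alpha, r_max=50.0, steps=200000):
    """Brute-force check of the same supremum on a uniform grid in [0, r_max]."""
    best = 0.0
    for i in range(steps + 1):
        r = r_max * i / steps
        best = max(best, -c * r + (r / a) ** (1.0 / alpha))
    return best


if __name__ == "__main__":
    for c in (0.5, 0.25):
        print(c, exponent_sup_closed_form(c, 1.0, 2.0), exponent_sup_grid(c, 1.0, 2.0))
```

The supremum scales like $c^{-1/(\alpha-1)}$, which is the power of $|\eta|$ appearing in (17), and its coefficient tends to zero as $a' \to \infty$, which is why ε in (17) can be taken arbitrarily small.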
The above examination shows that the Laplace transformation is a one-one and continuous mapping of $S'^{0}_{\alpha}(K)$ into $A_\alpha(T^V)$, $V = \operatorname{int}K^*$.

We now turn to the converse assertion of Theorem 4. Komatsu's Theorem 11.5 of [18] ensures that every function $g \in A_\alpha(T^V)$ has a boundary value $b_V g$ which belongs to $S'^{\alpha}_{0}$ and coincides with that in the sense of a hyperfunction. In [18], the cone $V$ was assumed to be convex but really only its connectedness is an essential assumption since then each $V' \Subset V$ can be covered by a finite enchained family of convex subcones. Our task is to make sure that $f = F_{x \leftarrow \xi}(b_V g)$ is carried by $V^*$.

We start with the simplest case $n = 1$ and $V = \mathbb{R}$, when the dual cone $V^*$ degenerates into $\{0\}$. Then $b_V g$ is merely the restriction of the entire function $g$ to the real axis and $f$ takes the form $\sum c_k \delta^{(k)}(0)$, where $\lim |c_k|^{1/k} = 0$. Let $\varphi \in S^{0}_{\alpha}(\{0\})$ and so $\|\varphi\|_b^2 = \int |\varphi|^2 \exp(-2b|z|)\,d\lambda < \infty$ for some $b$. Denote by $\chi_R$ the characteristic function of the disk $|z| < R$ smoothed by convolution with a $C^\infty$ function whose support is contained in the unit disk, and apply the Cauchy–Green integral formula to $\varphi\chi_R$. Using Schwarz's inequality and taking into account the support properties of $\partial\chi_R/\partial\bar{z}$, we find that

$$|\varphi^{(k)}| \le C\,\|\varphi\|_b\, k!\,(R - 1)^{-(k+1)} e^{b(R+1)} \sqrt{R}.$$

The optimization of $R$, combined with Stirling's formula, yields the estimate $|\varphi^{(k)}| \le C\,\|\varphi\|_b\, b^k$ which shows that the above series does determine a continuous functional on $S^{0}_{\alpha}(\{0\})$.

Next we consider the case $V = \mathbb{R}_+$, $V^* = \bar{\mathbb{R}}_+$ which is slightly more involved. This time we make use of Theorem 1 and decompose $f$ into a sum of two distributions $f_+$, $f_-$ carried by $\bar{\mathbb{R}}_+$ and $\bar{\mathbb{R}}_-$. On doing the Laplace transformation, we get $g(x + i0) = g_+(x + i0) + g_-(x - i0)$ with $g_\pm \in A_\alpha(\mathbb{C}_\pm)$. By the general version of the "edge of the wedge" theorem given in Ref. [13] in terms of hyperfunctions (Theorem 9.3.5), there is an entire function which continues both $g - g_+$ and $g_-$ and hence $f - f_+$ is carried by the origin according to what has just been said. This completes the proof for $n = 1$.

When $n \ge 2$, we may assume without loss of generality that the first basis
vector lies in $V$. Let us take $\psi \in S^{\alpha}_{0}(\mathbb{R}^{n-1})$, introduce the designation $\check{\xi} = (\xi_2, \ldots, \xi_n)$, and consider the mapping

$$\psi \to g_1(\zeta_1) = \int g(\zeta_1, \check{\xi})\,\psi(\check{\xi})\,d\check{\xi}$$

of $S^{\alpha}_{0}(\mathbb{R}^{n-1})$ into $A_\alpha(\mathbb{C}_+)$. As can easily be seen, it is continuous. Let $f_1$ be the distribution in $S'^{0}_{\alpha}(\bar{\mathbb{R}}_+)$ whose Laplace transform is $g_1$. Note that $S'^{0}_{\alpha}(\bar{\mathbb{R}}_+)$ is no different from $S'^{0}_{\alpha}(\mathbb{R}_+)$. The correspondence $g_1 \to f_1$ is also continuous by virtue of the open mapping theorem for the Fréchet spaces, see, e.g., Ref. [27], Theorem V.6. Since the Fourier transformation determines an isomorphism of $S^{\alpha}_{0}(\mathbb{R}^{n-1})$ onto $S^{0}_{\alpha}(\mathbb{R}^{n-1})$, we thereby obtain a bilinear separately continuous form on $S^{0}_{\alpha}(\mathbb{R}_+) \times S^{0}_{\alpha}(\mathbb{R}^{n-1})$ and hence, by Theorem 3, a linear continuous form on $S^{0}_{\alpha}(H_1)$, where $H_1$ is the half-space $x_1 > 0$. Its restriction to $S^{0}_{\alpha}(\mathbb{R}) \otimes S^{0}_{\alpha}(\mathbb{R}^{n-1})$ coincides obviously with that of $f$ and, since this tensor product is dense in $S^{0}_{\alpha}(\mathbb{R}^n)$ by Theorem 3 again, we infer that $\bar{H}_1$ is a carrier of $f$. The same argument shows that $f$ is carried by every half-space $\{x : x\eta \ge 0\}$ with $\eta \in V$, whose intersection is just the cone $V^*$. To complete the proof of Theorem 4, it remains to apply the last conclusion of Theorem 1.

Corollary 1. Every function analytic in a tubular connected cone $T^V$ and satisfying the condition (17) allows an analytic continuation into the tube $T^{\operatorname{ch}V}$ which possesses the same growth property.

Corollary 2. If a closed cone $K$ is convex and proper, then the distributions defined on $S^{0}_{\alpha}$ and carried by $K$ form a topological algebra under convolution, and the Laplace transformation maps it isomorphically onto the topological algebra $A_\alpha(T^V)$, $V = \operatorname{int}K^*$.

Proof. For brevity, we will identify the set of distributions carried by $K$ with the space $S'^{0}_{\alpha}(K)$. This is admissible due to Theorem 5 proved below. As shown in Theorem 5.16 of [31], for each $f_0 \in S'^{0}_{\alpha}(K)$ and $\varphi \in S^{0}_{\alpha}$, the convolution $(f_0 * \varphi)(x) = (f_0, \varphi(x - \cdot))$ belongs to $S^{0}_{\alpha}(U)$, where $U$ is any open cone compact in the complement $\complement K$ of $K$. (In the context of axiomatic QFT, a similar assertion for tempered distributions is sometimes referred to as Hepp's lemma.) Moreover, the mapping $S^{0}_{\alpha} \to S^{0}_{\alpha}(U): \varphi \to f_0 * \varphi$ is continuous. It should be noted that this fact is true for an arbitrary closed cone $K$ and is proved in [31] even for more general classes of distributions. It enables one to define the convolution $f * f_0$ for each $f$ carried by a compact subcone $C$ of $-\complement K$. Namely, the mapping $S'^{0}_{\alpha}(C) \to S'^{0}_{\alpha}: f \to f * f_0$ is defined to be the dual of $S^{0}_{\alpha} \to S^{0}_{\alpha}(C): \varphi \to (f_0, \varphi(x + \cdot))$. If $K$ is proper, then it is itself compact in $-\complement K$, and we obtain the bilinear mapping $S'^{0}_{\alpha}(K) \times S'^{0}_{\alpha}(K) \to S'^{0}_{\alpha}$. By the estimates obtained in [31], it is continuous in the second argument as well as in the first one and, since the space $S'^{0}_{\alpha}(K)$ is Fréchet, this separate continuity implies continuity, see [28], Theorem III.5.1. When restricted to the Schwartz distributions of compact support, the above convolution operation is in line with the definition of [11] and hence corresponds to the multiplication of the Laplace transforms by the usual PWS theorem. Taking into account analyticity of the elements of $S^{0}_{\alpha}(K)$ and reflexivity of this space and using the Hahn–Banach theorem, one sees immediately that these distributions and even those supported by the origin are dense in $S'^{0}_{\alpha}(K)$. Therefore, by virtue of Theorem 4 and the open mapping theorem, this correspondence holds true for the whole of $S'^{0}_{\alpha}(K)$ and in particular the convolution does not take distributions out of this space. This completes the proof.

It is worthwhile to note that, for the tempered distributions with support contained in a proper closed convex
cone, a convolution is defined in [13] through another construction which demonstrates associativity and commutativity of this operation but involves a localization and so is inapplicable to our more general case. However, the PWS theorem enables these algebraic properties to be interpreted as those of multiplication.

5. Approximation Theorem

In order to develop the calculus of interest to us by analogy with that of hyperfunctions, we need a suitable approximation theorem for functions belonging to $S^{0}_{\alpha}(K)$. However, unlike the case [31] of nonzero β, the customary means of approximation such as smoothing and cutoff are insufficient here, except for the degenerate cone $K = \{0\}$. Because of this, we will follow the line of thought used in [12]. Before showing the desired theorem for $S^{0}_{\alpha}(K)$, we consider a more general situation when the defining function $\rho(z)$ need not be of the form (2), and we denote by $H_\rho$ the closed subspace of $L^2_\rho = L^2(\mathbb{C}^n, e^{-\rho}d\lambda)$ consisting of analytic functions. If ρ is plurisubharmonic, then $\hat{\rho}$ denotes the strictly plurisubharmonic function $\rho + 2\ln(1 + |z|^2)$. It should be noted that the proof presented below is based on exploiting $L^2$-estimates for solutions of the dual equation rather than those for Eqs. (11) themselves, in contrast to the strategy sketched in [32].

We begin by quoting Hörmander's result derived in [12], Sect. 4.4. It was not stated there in the form of a theorem but this may be done as follows. Let ρ be a plurisubharmonic $C^2$ function on $\mathbb{C}^n$ and let $v \in L^2_{\hat{\rho}}$. If $v$ is orthogonal to each analytic function in this space, then the equation

$$\sum_j \frac{\partial(h_j e^{-\hat{\rho}})}{\partial z_j} = -v e^{-\hat{\rho}} \tag{23}$$

has a solution satisfying the estimate

$$2\int |h|^2 (1 + |z|^2)^{-2} e^{-\hat{\rho}}\,d\lambda \le \int |v|^2 e^{-\hat{\rho}}\,d\lambda. \tag{24}$$
This fact enables one to prove the following lemma.

Lemma 3. Let $\rho_0$, ρ and $\rho'$ be continuous real valued functions on $\mathbb{C}^n$ such that $\rho_0 \le \rho'$, $\rho \le \rho'$ and hence $H_{\rho_0}$, $H_\rho$ can be regarded as vector subspaces of $H_{\rho'}$. Suppose that there exists a sequence of smooth plurisubharmonic functions $\rho_\nu$, $\nu = 1, 2, \ldots$, which satisfy the conditions:
(i) $\hat{\rho}_\nu \le \rho'$,
(ii) $\hat{\rho}_\nu \le \rho_0 + C_\nu$,
(iii) $\rho_\nu \ge \rho$ for $|z| \le \nu$.
Then $H_{\rho_0}$ is dense in the space $H_\rho$ under the topology of $H_{\rho'}$.

Proof. The closure of $H_{\rho_0}$ in $H_{\rho'}$ covers $H_\rho$ if and only if $H_{\rho_0}^\perp \subset H_\rho^\perp$, where the orthogonal complement is determined by the scalar product of $H_{\rho'}$. It is this inclusion that we shall prove. Let $v \in H_{\rho'}$ and let $\int \bar{v}u \exp\{-\rho'\}\,d\lambda = 0$ for each $u \in H_{\rho_0}$. It suffices to derive the representation

$$-v e^{-\rho'} = \sum_j \frac{\partial w_j}{\partial z_j}, \tag{25}$$
with $w_j \in L^2_{-\rho}$, which should be fulfilled in the sense of a Schwartz distribution, that is,

$$\int \bar{v}\varphi e^{-\rho'}\,d\lambda = \sum_j \int \bar{w}_j \frac{\partial\varphi}{\partial\bar{z}_j}\,d\lambda \qquad \text{for all } \varphi \in C_0^\infty. \tag{26}$$

In fact, let $u$ be an element of $L^2_\rho$ such that $\bar{\partial}_j u \in L^2_\rho$ for every $j$. In the ordinary way which combines a cutoff with smoothing and is used, e.g., in the proof of Lemma 4.1.3 of [12], one can approximate $u$ by $C_0^\infty$ functions in the norm $\|\cdot\|_\rho + \|\bar{\partial}(\cdot)\|_\rho$. If $u \in H_\rho$, then by passing to the limit in (26), one obtains immediately $\int \bar{v}u \exp\{-\rho'\}\,d\lambda = 0$.

Let us now rewrite the orthogonality condition $v \perp H_{\rho_0}$ as follows:

$$\int \bar{v} e^{-\rho' + \hat{\rho}_\nu}\, u\, e^{-\hat{\rho}_\nu}\,d\lambda = 0.$$

By virtue of (i), the function $v \exp\{-\rho' + \hat{\rho}_\nu\}$ belongs to $L^2_{\hat{\rho}_\nu}$. Furthermore, it is orthogonal to each analytic element of $L^2_{\hat{\rho}_\nu}$ since these are contained in $H_{\rho_0}$ due to (ii). Thus by Hörmander's theorem referred to, the equation

$$\sum_j \frac{\partial(h_j e^{-\hat{\rho}_\nu})}{\partial z_j} = -v e^{-\rho'}$$

has a solution such that

$$2\int |h|^2 (1 + |z|^2)^{-2} e^{-\hat{\rho}_\nu}\,d\lambda \le \int |v|^2 e^{-2\rho' + \hat{\rho}_\nu}\,d\lambda.$$

Setting $h_j \exp\{-\hat{\rho}_\nu\} = w_j^\nu$ and using (i) again, we get a family of representations of the form (25), where

$$2\int |w^\nu|^2 e^{\rho_\nu}\,d\lambda \le \int |v|^2 e^{-\rho'}\,d\lambda. \tag{27}$$

Let $w^\nu_{\mathrm{cut}}(z) = w^\nu(z)\chi(z/\nu)$, where $\chi \in C_0^\infty$ is a standard cutoff function with support in the unit ball and equal to 1 in a neighborhood of the origin. Owing to the condition (iii), since the right-hand side of (27) is independent of ν, the sequence $w^\nu_{\mathrm{cut}}$ is strongly bounded in $L^2_{-\rho}$ and one can draw from it a weakly convergent subsequence. We take $w$ to be its limit. By construction, $w^\nu_{\mathrm{cut}}$ coincides with $w^\nu$ on every given compact set when ν is large enough and so $w$ does satisfy the Eq. (26). This ends the proof.
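The passage from Hörmander's estimate to (27) rests on the definition $\hat{\rho}_\nu = \rho_\nu + 2\ln(1+|z|^2)$ and on condition (i). As a sketch, the pointwise identities behind it can be written out as follows, with $w^\nu_j = h_j e^{-\hat{\rho}_\nu}$ as in the proof.

```latex
% Uses only \hat\rho_\nu = \rho_\nu + 2\ln(1+|z|^2) and condition (i): \hat\rho_\nu \le \rho'.
\begin{align*}
e^{\hat\rho_\nu} &= (1+|z|^2)^{2}\, e^{\rho_\nu}, \\[2pt]
|h|^2 (1+|z|^2)^{-2} e^{-\hat\rho_\nu}
  &= |w^\nu|^2\, e^{2\hat\rho_\nu} (1+|z|^2)^{-2} e^{-\hat\rho_\nu}
   = |w^\nu|^2\, e^{\hat\rho_\nu} (1+|z|^2)^{-2}
   = |w^\nu|^2\, e^{\rho_\nu}, \\[2pt]
|v|^2 e^{-2\rho' + \hat\rho_\nu}
  &= |v|^2 e^{-\rho'}\, e^{\hat\rho_\nu - \rho'}
   \le |v|^2 e^{-\rho'} .
\end{align*}
% Integrating the last two lines over C^n turns Hormander's estimate into (27).
```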
We shall apply Lemma 3 to the triple $H^{0,b_0}_{\alpha,a_0}$, $H^{0,b}_{\alpha,a}(U)$, $H^{0,b'}_{\alpha,a'}(U)$ with $a_0 = a'$ and $b_0 = b'$ for simplicity. The required sequence $\rho_\nu$ will be constructed starting from auxiliary functions of the form $\ln|\varphi_N|$, where $\varphi_N$ belong to $S^{0}_{\alpha}$ and are bounded by

$$|\varphi_N| \le A \exp\{|y| - |x/\gamma|^{1/\alpha}\}, \tag{28}$$

with constants $A$, γ common to all $\varphi_N$.

Lemma 4. For each $\alpha > 1$, $\gamma > 0$ and $0 < \sigma < 1/2$, there exists a family of functions $\varphi_N(z) \in S^{0}_{\alpha}(\mathbb{R})$, $N = 1, 2, \ldots$, which satisfy the bound (28) and the following additional requirements:
(*) $\ln|\varphi_N(z)| \ge \sigma|y|$ for $|x| \le 1$,
(**) $\ln|\varphi_N(z)| \le |y| - N\ln^+(\sigma|x|/N) + B$,
where $\ln^+ r = \max(0, \ln r)$ and $B$ is a constant independent of $N$.
M.A. Soloviev
Proof. Let $\chi(t)$ be the characteristic function of the interval $[-1, 1]$, and let
$$\chi_N(t) = \frac{N}{2}\int_{t-1/N}^{t+1/N} dt_N \,\cdots\, \frac{N}{2}\int_{t_2-1/N}^{t_2+1/N} \chi(t_1)\,dt_1. \tag{29}$$
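The iterated averaging in (29) can be explored numerically. The following is only a minimal sketch, not part of the original argument: it assumes a uniform grid with step $h = 1/(Nm)$ and a trapezoid-rule window, and the names `box_average` and `chi_N` are hypothetical. It lets one check on a grid that $N$ successive averages of half-width $1/N$ keep the integral equal to 2, keep the values in $[0, 1]$, and spread the support from $[-1, 1]$ to $[-2, 2]$.

```python
def box_average(f, m):
    """One averaging step of (29): trapezoid-rule mean of f over +-m grid points."""
    n = len(f)
    out = [0.0] * n
    for i in range(n):
        lo, hi = i - m, i + m
        total = 0.0
        for j in range(max(lo, 0), min(hi, n - 1) + 1):
            weight = 0.5 if j in (lo, hi) else 1.0  # trapezoid end weights
            total += weight * f[j]
        # (N/2) * h * (weighted sum) with h = 1/(N*m) reduces to total / (2*m)
        out[i] = total / (2 * m)
    return out


def chi_N(N, m=25):
    """Grid approximation of chi_N on [-3, 3]; returns (t, values, h)."""
    one = N * m        # grid index corresponding to t = 1
    half = 3 * one     # grid index corresponding to t = 3
    h = 1.0 / one
    f = []
    for i in range(-half, half + 1):
        if abs(i) < one:
            f.append(1.0)
        elif abs(i) == one:
            f.append(0.5)  # half weight at the jump of the indicator
        else:
            f.append(0.0)
    for _ in range(N):     # N nested averages, each of half-width 1/N
        f = box_average(f, m)
    t = [i * h for i in range(-half, half + 1)]
    return t, f, h


if __name__ == "__main__":
    N = 4
    t, f, h = chi_N(N)
    print("integral of chi_N:", h * sum(f))
    print("largest |t| with chi_N > 0:", max(abs(ti) for ti, v in zip(t, f) if v > 0))
    print("max difference quotient:", max(abs(f[i + 1] - f[i]) / h for i in range(len(f) - 1)))
```

The discrete difference quotient printed at the end gives a rough feel for the first-derivative bound; it is a grid quantity and not a substitute for the estimates in the proof.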
It is well known that the inequality $\alpha > 1$ is a non-quasianalyticity condition under which, for any $\gamma, \delta > 0$, one can construct (by an iteration procedure similar to (29)) a nonnegative even function $\omega$ such that
$$\|\omega^{(k)}(t)\| \le A\gamma^k k^{\alpha k}, \qquad \int \omega\,dt = 1, \qquad \operatorname{supp}\omega \subset [-\delta, \delta]. \tag{30}$$
It is easy to see that the Laplace transform of $\psi_N = \chi_N * \omega$ satisfies all the stated requirements after a suitable rescaling. First, dropping for the moment the subscript, we note that $\chi^{(N)}$ is of the form $(N/2)^N \sum (-1)^i \chi(t - \tau_i)$, where the sum involves $2^N$ terms, and so is dominated by $N^N$. Thus, making use of Kolmogorov's inequalities $M_k \le 2M_0^{1-k/N} M_N^{k/N}$ for modulus maxima of successive derivatives, we can write
$$\|\chi^{(k)}(t)\| \le 2N^k \quad (k \le N), \qquad \int \chi\,dt = 2, \qquad \operatorname{supp}\chi \subset [-2, 2], \tag{31}$$
with the last two properties being evident. From (30), (31), we have the estimate
$$|x^k \tilde\psi(z)| \le \left|\int_{-2-\delta}^{2+\delta} e^{izt}\,\psi^{(k)}(t)\,dt\right| \le C_\delta\, e^{(2+\delta)|y|}\,\gamma^k k^{\alpha k},$$
which implies (28) upon changing from $z$ to $z/(2+\delta)$ and redefining $A$, $\gamma$, because $\inf_k k^{\alpha k}/r^k \le C\exp\{-(\alpha/e)|r|^{1/\alpha}\}$. After the replacement $\gamma^k k^{\alpha k} \to N^k$, the same estimate and rescaling yield (**). Note now that, for $|x| < 1/2$ and $\delta$ small enough, the inequality $\cos xt > 1/2$ holds on $\operatorname{supp}\psi$ and that $\psi$ is an even function with the properties $0 \le \psi \le 1$, $\int \psi\,dt = 2$. Therefore
$$|\tilde\psi(z)| \ge \int e^{-yt}\cos xt\,\psi(t)\,dt \ge \frac{1}{2}\int_{1-\delta}^{2+\delta} e^{-yt}\,\psi(t)\,dt \ge \frac{\delta}{2}\,e^{(1-\delta)|y|}.$$
Thus, if $\delta$ is sufficiently close to zero, the function $\varphi_N(z) = (2/\delta)\,\tilde\psi_N(z/(2+\delta))$ satisfies all the conditions required.

Remark 5. Besides the approximation theorem below, we would like to point out another application [30] of the multipliers $\varphi_N$. Due to the condition (**), the lower envelope $\inf_N |\varphi_N(x)|$ decreases exponentially when $x$ approaches infinity, and this makes it possible to improve considerably Ruelle's original derivation of the cluster decomposition properties in QFT. Namely, when combined with Hepp's lemma, the above construction shows rather directly the exponential character of this property in field theories having a mass gap without using any more special tools, and such a simple derivation is applicable to quantum fields of arbitrary singularity including nonlocal ones.

In what follows, we use the specification (13) which again is most suitable.
Extension of Distribution Theory Related to Gauge QFT
Theorem 5. Let $\alpha > 1$ and let $U$ be an open cone in $\mathbb{R}^n$. For any $a' > a$ and $b' > 2enb$, the space $H^{0,b'}_{\alpha,a'}$ is dense in $H^{0,b}_{\alpha,a}(U)$ under the topology of the space $H^{0,b'}_{\alpha,a'}(U)$ which contains both of them. As a consequence, the space $S^0_\alpha$ is dense in $S^0_\alpha(U)$ and all the more it is dense in each space $S^0_\alpha(K)$, where $K$ is a closed cone.

Proof. We may assume that $b' > 1$ and $b = \sigma/en$ with some $\sigma < 1/2$ since the problem is reduced to this particular case by rescaling, and then apply Lemmas 3 and 4. Let us denote the number $\sigma/en$ by $\sigma'$. It is sufficient to find a sequence of plurisubharmonic functions such that
$$\rho_\nu \le \rho_{U,a,b'} \quad\text{and}\quad \rho_\nu \le \rho_{\mathbb{R}^n,a,b'} + C_\nu \quad\text{everywhere}, \tag{32}$$
$$\rho_\nu \ge \rho_{U,a,\sigma'} - C \quad\text{for}\quad \sum |x_j| < \nu, \tag{33}$$
with $C$ independent of $\nu$. In fact, if $\rho_\nu$ is not smooth, then one can correct this defect forming the convolution by a nonnegative $C_0^\infty$ function $\chi(|z|)$ such that $\int \chi\,d\lambda = 1$, which preserves plurisubharmonicity, see [12]. Elementary estimates using the triangle inequalities (6) show that the convolution satisfies the same conditions with some additional constants. Furthermore, we have $\ln(1+|z|^2) \le \delta|x|^{1/\alpha} + \delta|y| + C_\delta$ with arbitrarily small $\delta$ which can be included in $a'$, $b'$. Thus, (32) and (33) ensure the fulfillment of all the conditions of Lemma 3 for the triple $\rho_{\mathbb{R}^n,a',b'}$, $\rho_{U,a,\sigma'} - C'$, $\rho_{U,a',b'} + C''$ which defines the same spaces for any values of the constants.

Let us denote by $\varepsilon$ the difference $b' - 1$ and set $\gamma = \varepsilon a$ in (28). Let $\varphi_N$ be functions whose existence in $S^0_\alpha(\mathbb{R})$ is established by Lemma 4 and let $\Phi_N = \sum \ln|\varphi_N(z_j)|$, $\Phi = \sum \ln|\varphi_1(\varepsilon z_j)|$. We define $\rho_\nu$ by
$$\rho_\nu(z) = \sup_{|\kappa|\le\nu}\{\Phi(z-\kappa) + L(\kappa)\} + \sup_{|\kappa|\le\nu}\{\Phi_N(z-\kappa) + L_N(\kappa)\}, \tag{34}$$
where $\kappa$ runs over real multi-integers, $|\kappa| = \sum|\kappa_j|$ and $L$, $L_N$ are the least upper bounds of those $l$ for which
$$\Phi(z-\kappa) + l \le -\sum\left|\frac{x_j}{a}\right|^{1/\alpha} + \varepsilon\sum|y_j| \quad\text{and}\quad \Phi_N(z-\kappa) + l \le \inf_{\xi\in U}\sum|x_j - \xi_j| + \sum|y_j| \tag{35}$$
respectively. Since only a finite number of $\kappa$'s is involved in (34), the function $\rho_\nu$ is surely plurisubharmonic and it satisfies (32) due to the bound (28) and by construction since the sum of right-hand sides of the inequalities (35) is just $\rho_{U,a,b'}$. The parameter $N$ in (34) is regarded as a function of $\kappa$ which should be chosen in such a way to satisfy (33).

Let $|x| < \nu$. Note that $\Phi(z-\kappa) \ge 0$ for $|x_j - \kappa_j| \le 1/\varepsilon$ by the condition (*) and $L(\kappa) \ge -\sum|\kappa_j/a|^{1/\alpha} - n\ln A$ by definition of $L$ and due to (28). Therefore, if $\kappa_j$'s are equal to the integer parts of $x_j$'s, then $\Phi(z-\kappa) + L(\kappa) \ge -\sum|x_j/a|^{1/\alpha} - C_1$. With the same $\kappa_j$'s we have $\Phi_N(z-\kappa) \ge \sigma'\sum|y_j|$, since $1/2 \ge \sigma'$. Thus to complete the proof, we only need to show that $L_N(\kappa) \ge \sigma'\inf_{\xi\in U}\sum|\kappa_j - \xi_j| - C_2$, with $C_2$ independent of $\kappa$, providing $N(\kappa)$ is properly chosen. Then (33) is fulfilled for $C = n\sigma' + C_1 + C_2$. We assume that $\kappa \notin U$, for otherwise the infimum is zero and this inequality is obviously valid with $C_2 = n\ln A$, and we now pass to the Euclidean norm $|x|$ through the use of
$$\sum \ln^+|x_j| \ge \ln^+\frac{|x|}{\sqrt{n}}, \qquad |x| \le \sum|x_j|.$$
Then it remains to examine the inequality
$$-N\ln^+\frac{\sigma|x|}{N\sqrt{n}} + l \le \inf_{\xi\in U}|x + \kappa - \xi| \tag{36}$$
which implies the second of inequalities (35) by virtue of (**), upon replacing $l$ by $l + nB$ and $x$ by $x - \kappa$. Let $d(\kappa)$ be the Euclidean distance of the point $\kappa$ to $U$. The infimum on the right-hand side of (36) can be minorized by that taken over $\{\xi : |\kappa - \xi| \ge d\}$. The latter is equal to $d - r$ when $r \equiv |x| \le d$ and to zero otherwise. Thus we face an easy task to compare a piecewise convex function with a piecewise linear one. Regarding $N$ for the moment as a continuous parameter and performing the minimization of $-N\ln|\sigma r/N\sqrt{n}|$ with respect to $N$ at the point $r = d$, we find $N(\kappa) = \sigma d/e\sqrt{n}$. Next we equate the left-hand side of (36) to zero at the same point and obtain $l = N$. It is readily verified that then $l < d - r$ at the break point of $\ln^+|\sigma r/N\sqrt{n}|$, and hence the inequality (36) holds everywhere. If $N$ and $l$ are equal to the integer part of $\sigma d/e\sqrt{n}$, then it is also fulfilled and, returning to the norm $\sum|x_j| \le \sqrt{n}\,|x|$, we arrive at the estimate
$$L_N(\kappa) \ge \frac{\sigma d(\kappa)}{e\sqrt{n}} - 1 - nB \ge \sigma'\inf_{\xi\in U}\sum|\kappa_j - \xi_j| - C',$$
which completes the proof.
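The elementary comparison at the end of this proof is easy to check numerically. The sketch below is illustrative only: the sample values of $d$, $\sigma$, $n$ and all function names are assumptions, and the right-hand side of (36) is replaced by its lower bound $\max(d - r, 0)$ as in the text. It confirms on a grid that $N = \sigma d/(e\sqrt{n})$ minimizes $-N\ln(\sigma d/N\sqrt{n})$ and that (36) then holds with $l = N$, both for the continuous value and for its integer part.

```python
import math


def ln_plus(x):
    """ln^+ x = max(0, ln x), with the convention ln^+ 0 = 0."""
    return max(0.0, math.log(x)) if x > 0 else 0.0


def objective(N, d, sigma, n):
    """-N ln(sigma d / (N sqrt(n))), the quantity minimized over N at r = d."""
    return -N * math.log(sigma * d / (N * math.sqrt(n)))


def optimal_N(d, sigma, n):
    """Continuous minimizer N = sigma d / (e sqrt(n)) quoted in the proof."""
    return sigma * d / (math.e * math.sqrt(n))


def lhs_36(r, N, l, sigma, n):
    """Left-hand side of (36) as a function of r = |x|."""
    return -N * ln_plus(sigma * r / (N * math.sqrt(n))) + l


def rhs_36_lower(r, d):
    """Lower bound for the right-hand side of (36): d - r if r <= d, else 0."""
    return max(d - r, 0.0)


def holds_everywhere(N, l, d, sigma, n, samples=4000):
    """Check (36) on a uniform grid of r in [0, 2d], up to rounding error."""
    rs = [2.0 * d * k / samples for k in range(samples + 1)]
    return all(lhs_36(r, N, l, sigma, n) <= rhs_36_lower(r, d) + 1e-9 for r in rs)


if __name__ == "__main__":
    d, sigma, n = 50.0, 0.4, 3          # illustrative values, sigma < 1/2
    N_star = optimal_N(d, sigma, n)
    print("N* =", N_star, " objective(N*) =", objective(N_star, d, sigma, n))
    print("(36) with l = N = N*      :", holds_everywhere(N_star, N_star, d, sigma, n))
    N_int = math.floor(N_star)
    print("(36) with l = N = floor(N*):", holds_everywhere(N_int, N_int, d, sigma, n))
```

A grid check of this kind cannot replace the convexity argument, but it makes the choice $l = N$ and the role of the break point at $r = N\sqrt{n}/\sigma$ easy to visualize.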
6. Conclusion

The purpose of this work was to establish the basic properties of those distribution classes which provide the widest framework for constructing quantum field models with singular infrared behavior. The employment of such distributions enables one to keep analytical tools of QFT in coordination with the indefinite metric formalism in going beyond perturbation theory. We have shown that a considerable part of Schwartz's theory of distributions and Sato's theory of hyperfunctions has interesting analogues under arbitrary singularity, and it is luckily just this part that is of use in QFT. The results obtained form a basis for further developments which are beyond the scope of the present paper, such as general structure theorems and special ones concerning Lorentz covariant distributions, an invariant splitting of distributions carried by the closed light cone, a representation of the Jost–Lehmann–Dyson type, a connection with the concept of wave front and so on. These topics are under investigation and both Theorems 4 and 5 are of prime importance therein.

It is noteworthy that similar theorems are valid for distributions defined on the space $S^0 = S^0_\infty$. This is of interest particularly in view of Lücke's works [19–21] which show a way of deriving the connection between spin and statistics and the PCT-invariance for nonlocal fields whose matrix elements belong to $S'^0$ and so have arbitrary high-energy behavior. This time a part of the argument is even simpler since $S^0$ is none other than the Fourier transform of Schwartz's space $D$, and in proving the analogue of Theorem 4 one can appeal to the treatise [13] instead of Komatsu's theorem. However, the topological structure of $S^0(K)$ is rather complicated and in this respect exploiting the distributions of the wider class $S'^0_\alpha$ is perhaps preferable here too.
We would also like to point out that Theorem 4 has an immediate application to the problem of formulating the generalized spectral condition for infrared singular quantum fields raised by Moschella and Strocchi [22] and enables one to cope with it in a manner completely analogous to that used in nonlocal QFT for generalization of the microcausality condition. Indeed, as
argued in more detail in an accompanying paper, the first part of the generalized Paley–Wiener–Schwartz theorem establishes general bounds on the correlation functions of gauge fields while the second one specifies the test function spaces which correspond to quantum fields with given infrared behavior. The developed technique provides also a simple and general method for constructing Wick-ordered entire functions of the indefinite metric free field. As shown in [33], every such function with growth of order $\rho < 2$ allows a correct operator realization in the corresponding Fock-Hilbert-Krein space, while being smeared with configuration-space test functions in $S^0_\alpha$, where $\alpha < 1/2 + 1/\rho$, and this realization satisfies all requirements of the indefinite metric local QFT.

Acknowledgement. This work was supported in part by the Russian Foundation for Basic Research under Contract No. 96-01-00105, and in part by Grant No. INTAS-93-2058. The author is also grateful to Professor V. Ya. Fainberg for helpful discussions and to the referee for valuable remarks and suggestions.
References

1. Bogolubov, N. N., Logunov, A. A., Oksak, A. I., Todorov, I. T.: General Principles of Quantum Field Theory. Dordrecht: Kluwer, 1990
2. Bruning, E., Nagamachi, S.: Hyperfunction quantum field theory: Basic structural results. J. Math. Phys. 30, 2340–2359 (1989)
3. Bümmerstede, J., Lücke, W.: Haag–Ruelle–Hepp scattering formalism for essentially local nonlocalizable fields. Commun. Math. Phys. 37, 121–140 (1974)
4. Bümmerstede, J., Lücke, W.: The notion of essential locality for nonlocalizable fields. J. Math. Phys. 16, 1203–1209 (1975)
5. Constantinescu, F., Taylor, J. G.: Equivalence between nonlocalizable and local fields. Commun. Math. Phys. 30, 211–227 (1973)
6. Constantinescu, F., Taylor, J. G.: Causality and nonlocalizable fields. J. Math. Phys. 15, 824–830 (1974)
7. Efimov, G. V.: Nonlocal Interactions of Quantum Fields. Moscow: Nauka, 1977 (in Russian)
8. Efimov, G. V.: Problems in Quantum Theory of Nonlocal Interactions. Moscow: Nauka, 1985 (in Russian)
9. Fainberg, V. Ya., Soloviev, M. A.: How can local properties be described in field theories without strict locality? Ann. Phys. 113, 421–447 (1978)
10. Fainberg, V. Ya., Soloviev, M. A.: Nonlocalizability and asymptotical commutativity. Theor. Math. Phys. 93, 1438–1449 (1992)
11. Gelfand, I. M., Shilov, G. E.: Generalized Functions. Vol. 2. New York: Academic Press, 1964
12. Hörmander, L.: An Introduction to Complex Analysis in Several Variables. Princeton, N.J.: D. van Nostrand Publ. Co., 1966
13. Hörmander, L.: The Analysis of Linear Partial Differential Operators, Vol. 1. Berlin–Heidelberg–New York–Tokyo: Springer-Verlag, 1983
14. Jaffe, A.: Form factors at large momentum transfer. Phys. Rev. Lett. 17, 661–663 (1966)
15. Jaffe, A.: High energy behavior in quantum field theory. Phys. Rev. 158, 1454–1461 (1967)
16. Kawai, T.: On the theory of Fourier hyperfunctions and its applications to partial differential equations with constant coefficients. J. Fac. Sci. Univ. Tokyo, Sect. 1A, Math. 17, 465–517 (1970)
17. Komatsu, H.: Projective and injective limits of weakly compact sequences of locally convex spaces. J. Math. Soc. Japan 19, 366–383 (1967)
18. Komatsu, H.: Ultradistributions, I. Structure theorems and a characterization. J. Fac. Sci. Univ. Tokyo, Sect. 1A, Math. 20, 25–105 (1973)
19. Lücke, W.: PCT, spin and statistics, and all that for nonlocal Wightman fields. Commun. Math. Phys. 65, 77–82 (1979)
20. Lücke, W.: Spin-statistics theorem for fields with arbitrary high-energy behavior. Acta Phys. Austriaca 55, 213–228 (1984)
21. Lücke, W.: PCT theorem for fields with arbitrary high-energy behavior. J. Math. Phys. 27, 1901–1905 (1986)
22. Moschella, U., Strocchi, F.: The choice of test functions in gauge quantum field theories. Lett. Math. Phys. 24, 103–113 (1992)
23. Moschella, U.: A note on gauge symmetry breaking. Lett. Math. Phys. 24, 155–163 (1992)
24. Moschella, U.: The Wick-ordered exponential of the dipole field as a field of type S. J. Math. Phys. 34, 535–548 (1993)
25. Nagamachi, S., Mugibayashi, N.: Hyperfunction quantum field theory. Commun. Math. Phys. 46, 119–134 (1976)
26. Nagamachi, S., Mugibayashi, N.: Hyperfunction quantum field theory II. Euclidean Green's functions. Commun. Math. Phys. 49, 257–275 (1976)
27. Reed, M., Simon, B.: Methods of Modern Mathematical Physics. Vol. 1. New York, London: Academic Press, 1972
28. Schaefer, H. H.: Topological Vector Spaces. New York, Heidelberg, Berlin: Springer-Verlag, 1970
29. Soloviev, M. A.: On the Fourier–Laplace transformation of generalized functions. Theor. Math. Phys. 15, 317–328 (1973)
30. Soloviev, M. A.: A generalization of Ruelle's theorem. Theor. Math. Phys. 52, 756–763 (1982)
31. Soloviev, M. A.: Beyond the theory of hyperfunctions. In: Arnold, V., Monastyrsky, M. (eds.) Developments in Mathematics: The Moscow School, pp. 131–193. London: Chapman and Hall, 1993
32. Soloviev, M. A.: Towards a generalized distribution formalism for gauge quantum fields. Lett. Math. Phys. 33, 49–59 (1995)
33. Soloviev, M. A.: Wick-ordered entire functions of the indefinite metric free field. Preprint FIAN/TD/1496, Lett. Math. Phys. To appear
34. Streater, R. F., Wightman, A. S.: PCT, Spin & Statistics, and All That. Reading MA: Benjamin/Cummings, 1978
35. Wightman, A. S.: The choice of test functions in quantum field theory. In: Adv. Math. Suppl. Stud., Vol. 7B, pp. 769–791. New York: Academic Press, 1981

Communicated by D. Brydges
Commun. Math. Phys. 184, 597 – 617 (1997)
Communications in Mathematical Physics © Springer-Verlag 1997
Solutions of the Oppenheimer–Volkoff Equations Inside 9/8ths of the Schwarzschild Radius

Joel Smoller¹,*, Blake Temple²,**

¹ Department of Mathematics, University of Michigan, Ann Arbor, MI 48109, USA
² Department of Mathematics, University of California, Davis, Davis CA 95616, USA

* Supported in part by NSF Applied Mathematics Grant Number DMS-95OOO-694, in part by ONR, US NAVY grant number N00014-94-1-0691, and by the Institute of Theoretical Dynamics (ITD), UC-Davis. The author would like to thank Joel Keizer, director of the ITD, for his warm hospitality during the author's tenure as a Visiting Regents Professor at UC-Davis.
** Supported in part by NSF Applied Mathematics Grant Number DMS-95OOO-694, in part by ONR, US NAVY grant number N00014-94-1-0691, a Guggenheim Fellowship, and by the Institute of Theoretical Dynamics, UC-Davis.

Received: 19 June 1996 / Accepted: 13 September 1996

Abstract: We refine the Buchdahl 9/8ths stability theorem for stars by describing quantitatively the behavior of solutions to the Oppenheimer–Volkoff equations when the star surface lies inside 9/8ths of the Schwarzschild radius. For such solutions we prove that the density and pressure always have smooth profiles that decrease to zero as the radius $r \to 0$, and this implies that the gravitational field becomes repulsive near $r = 0$ whenever the star surface lies within 9/8ths of its Schwarzschild radius.

1. Introduction

In General Relativity, the interior of a star is modeled by solutions of the Oppenheimer–Volkoff (OV) equations which describe the pressure gradient inside a static fluid sphere. In this paper we describe the global behavior of the density, pressure, and gravitational field when the surface of the star lies within 9/8ths of its Schwarzschild radius. The well-known Buchdahl stability theorem, [1], states, loosely speaking, that when the surface of a star lies within 9/8ths of its Schwarzschild radius, then the star is unstable to gravitational collapse, and this result is essentially independent of the equation of state. This places a maximum red-shift factor of 2 on the possible emission spectrum from the surface of a spherically symmetric, static stellar object. The precise statement of Buchdahl's theorem is as follows, ([2], p. 332). Let $\rho(r)$ and $p(r)$ denote the density and pressure, respectively, and let $M(r)$ denote the mass function at radius $r < R$, where $R$ denotes the surface of the star. (We call $\rho$ the density so that $\rho c^2$ is the energy-density of the fluid, and $c$ denotes the speed of light.) Assume that these functions satisfy the Oppenheimer–Volkoff equations, ((2.1), (2.2) below), and that the following conditions hold:

(A) The radius $R > 0$ of the star is fixed, and the density $\rho(r)$ and pressure $p(r)$ are arbitrary bounded positive functions defined on $0 \le r < +\infty$, such that $\rho(r) = 0 = p(r)$ for $r \ge R$. The metric is assumed to be attached smoothly to the empty space Schwarzschild metric at $r = R$.

(B) The mass function $M(r)$ is given by
$$M(r) = \int_0^r 4\pi\rho(s)s^2\,ds,$$
so that the total mass of the star is given by
$$M_0 = \int_0^R 4\pi\rho(s)s^2\,ds.$$

(C) The metric coefficient $A$, defined by
$$A(r) \equiv 1 - \frac{2GM(r)}{c^2 r},$$
where $G$ denotes Newton's gravitational constant, satisfies $A(r) > 0$.

(D) The density $\rho(r)$ does not increase outward: $\rho'(r) \le 0$.

Then, assuming (A)–(D), the conclusion of the Buchdahl theorem is that, if $\rho(r)$, $p(r)$ and $M(r)$ satisfy the OV equations, the surface $r = R$ must satisfy
$$R > \frac{9}{8}R_s(M_0),$$
where $R_s(M_0) = \frac{2G}{c^2}M_0$ denotes the Schwarzschild radius of a star of total mass $M_0$.

The stability limit for stars is obtained from this theorem by concluding that if the boundary surface of a star satisfies $R \le \frac{9}{8}R_s(M_0)$, then one of the above assumptions must fail. However, no information is given about exactly how (A)–(D) fail in this case. For example, can $A \to 0$ for some $r > 0$? (This would correspond to the formation of a black-hole.) Can $p \to \infty$ for some $r \ge 0$? Can $M(0) = 0$ fail, or does the solution fail to exist on the entire interval $[0, R]$ for some other reason? In addition, what is the behavior of the solutions as $A(R) \to 0$; i.e., as the star surface tends to its Schwarzschild radius? In this paper we describe the global behavior of solutions of the OV equations starting from initial data satisfying $R_s(M_0) < R \le \frac{9}{8}R_s(M_0)$, and as a corollary we obtain a refinement of Buchdahl's theorem.

We have been led to study such solutions in detail because of our earlier work, [3, 4], in which we constructed shock-wave solutions of the Einstein equations by attaching a Friedmann-Robertson-Walker metric to the inside of an arbitrary static metric determined by the Oppenheimer–Volkoff equations, such that the interface between them is an outward moving shock-wave. In the forthcoming paper [7] we study shock-wave
solutions of the Einstein equations arbitrarily close to the Schwarzschild radius by placing an outgoing shock-wave inside the static solutions that we analyze here. In such a construction the shock-wave stabilizes the solution by supplying the pressure required to "hold the star up" even when $R_s(M_0) < R \le \frac{9}{8}R_s(M_0)$.

In order to make the exposition as simple as possible, we assume throughout that a barotropic equation of state of the form $p = p(\rho)$ is given, where the function $p(\rho)$ satisfies the conditions that $p/\rho$ and $p'(\rho)$ are bounded above and below by positive constants. Note that in this case $\sqrt{p'}$ is the sound speed, which for physical reasons should be bounded by $c$. Our approach is to start with initial conditions at $r = r_0 > 0$, and in terms of this data we estimate the solution for $0 < r < r_0$. This contrasts with the standard approach which is to assume conditions at $r = 0$. We prove that any solution of the OV equations starting from initial data at $r = r_0$, and satisfying $r_0 \le \frac{9}{8}R_s(M(r_0))$, will necessarily exist all the way into $r = 0$, and $A(r) > 0$ for all $r \ge 0$. Moreover, we show that the pressure $p$ and density $\rho$ never tend to $\infty$, and actually are bounded and tend to zero smoothly as $r \to 0$. (This contrasts with the case when $r_0 > \frac{9}{8}R_s(M(r_0))$, in which case we can have $p \to \infty$, cf. [4].) We prove that what always happens is that the mass function $M$ hits zero at some $r_1 > 0$, then goes negative for $r < r_1$, and $M'(r)$ remains positive for all $r \ge 0$. Moreover, $M(r) \to M(0)$ as $r \to 0$, where $-\infty < M(0) < 0$. Indeed, we show that the density $\rho$ and pressure $p$ increase as $r$ decreases until they reach a critical value $r = r_2$, $0 < r_2 < r_1$, (so that $M(r_2) < 0$), and then $\rho$ and $p$ decrease to zero as $r \to 0$. Moreover, we also prove that $\lim_{r\to 0}\rho'(r) = \lim_{r\to 0}p'(r) = 0$, which implies that $\rho$ and $p$ have smooth profiles at $r = 0$.
Thus we conclude that in the presence of positive density and pressure, a repulsive gravitational effect appears, (i.e., $p' > 0$ near $r = 0$), due to a negative mass function inside $r = r_1$. In light of the above, our results show that hypotheses (C) and (D) are actually consequences of the other assumptions in Buchdahl's theorem because (B) implies that $M(r) \ge 0$ for all $r \ge 0$. Moreover, when $M_0 \equiv M(r_0) \le \frac{9}{8}R_s(M(r_0))$, we show that the region of the solution where $M(r) \ge 0$ accumulates in a thin layer that tends to $r = r_0$ as $r_0$ tends to its Schwarzschild radius $R_s(M(r_0))$, and we obtain sharp estimates for the width of this layer. Note finally that the hypotheses of the Buchdahl theorem do not explicitly assume the existence of an equation of state. Although in our treatment here we assume the equation of state is of the form $p = p(\rho)$, we could be more general by assuming only that $\mu(r) = p/\rho$ and $\sigma(r) = p'/\rho'$ are any given positive functions that are bounded above and below by positive constants; cf. [6].

The main results of this paper are summarized in the following theorem which gives a refinement of Buchdahl's result. In what follows we utilize the variable $z$ defined by
$$z \equiv \frac{\rho}{\bar\rho}, \tag{1.1}$$
where $\bar\rho(r)$ is the average density inside radius $r$, defined by
$$\bar\rho \equiv \frac{3}{4\pi}\frac{M(r)}{r^3}. \tag{1.2}$$
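The quantities $A$, $R_s$, $\bar\rho$ and $z$ introduced so far are simple enough to tabulate directly. The following sketch is purely illustrative: the function names, the rounded SI constants and the sample mass are assumptions, and `surface_redshift` uses the standard Schwarzschild relation $1 + z_{\rm red} = A(R)^{-1/2}$, which is not written out in the text. It shows that $R = \frac{9}{8}R_s$ corresponds to $A(R) = \frac{1}{9}$ and to the red-shift factor 2 quoted in the Introduction, and that a uniform-density ball has $z = 1$.

```python
import math

G_NEWTON = 6.674e-11   # Newton's constant in SI units (rounded)
C_LIGHT = 2.998e8      # speed of light in m/s (rounded)


def schwarzschild_radius(M):
    """R_s(M) = 2 G M / c^2."""
    return 2.0 * G_NEWTON * M / C_LIGHT**2


def metric_A(M, r):
    """A(r) = 1 - 2 G M(r) / (c^2 r), as in condition (C)."""
    return 1.0 - 2.0 * G_NEWTON * M / (C_LIGHT**2 * r)


def average_density(M, r):
    """Average density (1.2): 3 M(r) / (4 pi r^3)."""
    return 3.0 * M / (4.0 * math.pi * r**3)


def z_ratio(rho, M, r):
    """The variable z of (1.1): local density over average density."""
    return rho / average_density(M, r)


def surface_redshift(M, R):
    """Standard Schwarzschild surface red-shift, 1/sqrt(A(R)) - 1 (assumed formula)."""
    return 1.0 / math.sqrt(metric_A(M, R)) - 1.0


if __name__ == "__main__":
    M = 2.0e30                          # illustrative mass in kg
    Rs = schwarzschild_radius(M)
    R_limit = 9.0 / 8.0 * Rs            # Buchdahl limit
    print("R_s [m]               :", Rs)
    print("A at R = 9/8 R_s      :", metric_A(M, R_limit))
    print("red-shift at the limit:", surface_redshift(M, R_limit))
    r = 1.0e4
    print("z for uniform density :", z_ratio(average_density(M, r), M, r))
```

In these variables the condition $r_0 \le \frac{9}{8}R_s(M(r_0))$ is the same statement as $A(r_0) \le \frac{1}{9}$, which is the form used in the theorem that follows.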
Theorem 1. Let $(r_1, r_0]$, $0 \le r_1 \le r_0$, be the maximal interval of existence of a positive smooth solution, $\rho(r) > 0$, $p(r) > 0$, and $M(r) > 0$, of the OV system, (given in (2.1), (2.2) below), starting from positive initial data at $r = r_0$ which satisfies
$$0 < A(r_0) \equiv 1 - \frac{2GM(r_0)}{c^2 r_0} < 1.$$
Then $M'(r) > 0$ and $A(r) > 0$ throughout $(r_1, r_0]$, $M(r_1) = 0$, and the following hold:

(i) If $r_1 = 0$, then $A(r_0) > \frac{1}{9}$, or equivalently $r_0 > \frac{9}{8}R_s(M(r_0))$.

(ii) If $r_1 > 0$, then the functions $\rho(r)$, $p(r)$ and $M(r)$ can be continued to the interval $[0, r_1]$ as bounded smooth solutions of the OV system, such that $\rho$, $p$, $A$ and $M'$ remain positive, but $M(r)$ is negative on $[0, r_1)$. Moreover, there exists a unique point $r_2 \in (0, r_1)$ such that the density $\rho$ and pressure $p$ increase on the interval $[0, r_2)$ and decrease on the interval $(r_2, r_0]$, and the following equalities hold:
$$\lim_{r\to 0}\rho(r) = \lim_{r\to 0}p(r) = \lim_{r\to 0}\rho'(r) = \lim_{r\to 0}p'(r) = 0, \tag{1.3}$$
and
$$\lim_{r\to 0}M(r) = M(0), \tag{1.4}$$
where $M(0)$ is a finite negative number.

(iii) Assume that the initial values satisfy the further conditions that
$$0 < z_0 < 1, \tag{1.5}$$
$$0 < A_0 \le \frac{1}{9}. \tag{1.6}$$
Then $r_1 > 0$, and there exists a unique point $r_*$, $r_1 < r_* < r_0$, such that $z(r_*) = 1$, $z(r) < 1$ for $r > r_*$, $z(r) > 1$ for $r < r_*$, and the following inequalities hold:
$$1 > \frac{r_*}{r_0} > \sqrt{\frac{1 - 9A(r_0)}{1 - A(r_0)}}, \tag{1.7}$$
and
$$\rho(r) < \rho(r_*) \le \frac{3}{8\pi G r_0^2}\,\frac{1 - A_0}{1 - 9A_0}, \tag{1.8}$$
for all $r$ in the interval $r_* \le r < r_0$.

(iv) For fixed $r_0 > 0$ and $z_0 > 0$, $r_1 \to r_0$ as $A_0 \to 0$.

Note that whenever $M(r)$ tends to a finite negative number at $r = 0$, the metric must have a singularity at $r = 0$ because $A(r) = 1 - \frac{2GM(r)}{r}$. We will show below that such singularities in solutions of the OV equations are non-removable, and we will use the results in [3] to show that this singularity corresponds to a delta function source of negative mass at $r = 0$. As a consequence of this theorem, it follows that for any solution of the OV system, the pressure can tend to $\infty$ only at the origin $r = 0$; i.e., by (ii), $p$ is uniformly bounded if $r_1 > 0$, so $p$ can tend to $\infty$ only at $r = 0$. Note that part (i) refines the Buchdahl result because it implies that if the mass $M(r)$ ever gets within 9/8ths of the Schwarzschild radius $R_s(M(r))$, then $r_1 > 0$, so $M$ must go negative before $r = 0$, thereby violating the definition of $M$ given in (B). Also, since $\rho'(r) > 0$ for $r$ near zero, we see that (D) is also violated. Note too that in our theorem, the critical 9/8ths limit applies at any radius interior to the star, while in Buchdahl's
argument the 9/8ths limit applies only at $r = R$, the surface of the star. Moreover, the fact that $A$ stays positive is a theorem in our treatment, not an assumption, and we demonstrate the failure of (D) when $r_0 \le \frac{9}{8}R_s(M(r_0))$, in which case (ii) and (iii) give the global behavior of solutions that start inside 9/8ths of the Schwarzschild radius. Theorem 1 also rules out the possibility that $p \to \infty$ as $r \to 0$ in the critical case when $r_0$ is exactly $\frac{9}{8}R_s(M(r_0))$, because when $r_0 = \frac{9}{8}R_s(M(r_0))$, Theorem 1 implies that $r_1 > 0$. (See [2], p. 334, where $p \to \infty$ as $r \to 0$ and $r_0 = \frac{9}{8}R_s(M(r_0))$, but in this case $\rho \equiv$ const, and so this example violates our assumption that $p/\rho$ remains bounded.) Note also that since $r_1 \to r_0$ as $A_0 \to 0$, and $M(r_1) = 0$, it follows that the entire portion of the solution in which the mass $M$ is positive, accumulates in a thin layer that tends to $r = r_0$ as $A_0$ tends to zero. In [7] we use our detailed description of this layer to analyze dynamical solutions in which a shock-wave inside the layer supplies the pressure required to hold the layer up when $A_0$ is arbitrarily close to zero. Statement (1.3) implies that the density $\rho(r)$ and pressure $p(r)$ are everywhere positive and have smooth profiles that tend to zero as $r \to 0$, and this implies that the gravitational field becomes repulsive near $r = 0$ (when $M(r)$ is negative). Note that $M(r) < 0$ for $r > 0$ is not ruled out in general relativity, (so long as the density and pressure are positive), because $M(r)$ is not an invariant quantity. This issue is discussed in the final section of this paper.

2. Statement of Results

Theorem 1 is a consequence of the results stated in this section; in the next section we will supply the proofs of the theorems in the order that they are presented here. The Oppenheimer–Volkoff (OV) system is, (cf. [2]),
$$-r^2\frac{dp}{dr} = GM\rho\left(1 + \frac{p}{\rho c^2}\right)\left(1 + \frac{4\pi r^3 p}{Mc^2}\right)A^{-1}, \tag{2.1}$$
$$\frac{dM}{dr} = 4\pi\rho r^2, \tag{2.2}$$
where
$$A \equiv A(r) = 1 - 2\,\frac{G}{c^2}\,\frac{M(r)}{r}. \tag{2.3}$$
Equations (2.1), (2.2) form a system of two ODE's in the unknown functions $p = p(r)$, $\rho = \rho(r)$, and $M = M(r)$, where $p$ denotes the pressure, $\rho c^2$ denotes the mass-energy density, $c$ denotes the speed of light, $M(r)$ denotes the total mass inside radius $r$, and $G$ denotes Newton's gravitational constant. The last three factors in (2.1) are the general-relativistic corrections to the Newtonian theory, [2]. Solutions of (2.1) and (2.2) determine a Lorentzian metric tensor $g$ of the form
$$ds^2 = -B(r)\,d(ct)^2 + A(r)^{-1}dr^2 + r^2\left(d\theta^2 + \sin^2(\theta)\,d\phi^2\right), \tag{2.4}$$
that solves the Einstein equations
$$G = \frac{8\pi G}{c^4}\,T, \tag{2.5}$$
where $G$ on the left-hand side is the Einstein tensor, and $T$ is the stress-energy tensor for a perfect fluid,
$$T_{ij} = (p + \rho c^2)u_i u_j + p\,g_{ij}. \tag{2.6}$$
Here $i$ and $j$ are indices that run from 0 to 3, $A(r)$ is defined by (2.3), and the function $B$ satisfies the equation
$$\frac{B'}{B} = -2\,\frac{p'}{p + \rho c^2}. \tag{2.7}$$
The metric (2.4) is spherically symmetric, time independent, and the fluid 4-velocity is given by $u_t = \sqrt{B}$ and $u_r = u_\theta = u_\phi = 0$, so that the fluid is fixed in the $(t, r, \theta, \phi)$-coordinate system, [2]. We assume that, (cf. [6]),
$$\mu = \frac{p}{\rho}, \tag{2.8}$$
and
$$\sigma = \frac{dp/dr}{d\rho/dr}, \tag{2.9}$$
satisfy the a priori bounds
$$0 \le \mu < \mu_+ < \infty, \tag{2.10}$$
and
$$0 < \sigma_- < \sigma < \sigma_+ < \infty. \tag{2.11}$$
Note that if an equation of state of the form $p = p(\rho)$ is given, then the bounds (2.10) and (2.11) are implied by the usual physical requirements on the function $p(\rho)$, (cf. [6]). Our results rely on a regularity theorem, (Theorem 2 below), for solutions of (2.1), (2.2) that satisfy (2.10) and (2.11). The results are stated in terms of the variables $z$ and $A$, where $z$ is defined above in (1.1). That is, in [6] we showed that on the maximal interval $(r_1, r_0]$ over which $M(r) > 0$, the OV system (2.1), (2.2) is equivalent to the system
$$\frac{dz}{dr} = -C\,\frac{1-A}{A}\,\frac{z}{r}, \tag{2.12}$$
$$\frac{dA}{dr} = (1 - 3z)\,\frac{1-A}{r}, \tag{2.13}$$
where
$$C \equiv \frac{\left(1 + \frac{\mu}{c^2}\right)\left(1 + \frac{3\mu z}{c^2}\right)}{\frac{2\sigma}{c^2}} - 3(1 - z)\,\frac{A}{1 - A}. \tag{2.14}$$
In terms of $z$ and $A$, Eq. (2.7) becomes
$$\frac{B'}{B} = \frac{1}{r}\left(1 + 3\,\frac{\mu z}{c^2}\right)\frac{1-A}{A}. \tag{2.15}$$
The regularity theorem that we need is the following theorem proved in [6].
Theorem 2. Let $(z(r), A(r))$ denote the smooth, (i.e., $C^1$), solution of (2.12), (2.13), defined on a maximal interval $(r_1, r_0]$, $0 \le r_1 < r_0 < \infty$, satisfying the initial conditions $z(r_0) = z_0$, $A(r_0) = A_0$, where
$$0 < z_0 < \infty, \qquad 0 < A_0 < 1. \tag{2.16}$$
Assume that (2.10) and (2.11) hold. Then $(z(r), A(r))$ satisfies the following inequalities for all $r \in (r_1, r_0]$:
$$0 < z(r) < \infty, \tag{2.17}$$
$$0 < A(r) < 1, \tag{2.18}$$
$$B(r) > 0, \tag{2.19}$$
$$0 < M(r) < M(r_0), \qquad M'(r) > 0, \tag{2.20}$$
and
$$\lim_{r\to r_1+} M(r) = 0. \tag{2.21}$$
Moreover, if $r_1 > 0$, then
$$\lim_{r\to r_1+} z(r) = +\infty, \tag{2.22}$$
$$\lim_{r\to r_1+} A(r) = 1, \tag{2.23}$$
$$\lim_{r\to r_1+} B(r) = B(r_1) > 0. \tag{2.24}$$
If $r_1 = 0$, then
$$0 \le z(r) \le 1, \tag{2.25}$$
for all $r \in (0, r_0]$, and if $\rho(r)$ has a finite limit at $r_1 = 0$, then (2.23) and (2.24) also hold. The original variables $\rho$ and $p$ of the OV system (2.1), (2.2) satisfy the inequalities
$$0 < \rho(r_0) < \rho(r) < \rho(r_1) < \infty, \qquad \rho'(r) < 0, \tag{2.26}$$
and
$$0 < p(r_0) < p(r) < p(r_1) < \infty, \qquad p'(r) < 0, \tag{2.27}$$
for all $r$, $r_1 < r < r_0$.
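To get a feel for the system (2.12), (2.13), one can integrate it inward from $r_0$ numerically. The sketch below is a rough illustration under stated assumptions, not the method of the paper: it uses units with $c = 1$, takes the equation of state $p = \rho/3$ so that $\mu = \sigma = 1/3$ are constants, works in the variable $s = \ln(r/r_0)$, and stops at the first grid point where $z \ge 1$, which serves as an approximation to $r_*$. All function names are hypothetical. For initial data with $0 < z_0 < 1$ and $0 < A_0 \le \frac{1}{9}$ it allows a comparison with the bounds (2.17), (2.18) and with the lower bound for $r_*/r_0$ in (1.7).

```python
import math


def C_coeff(z, A, mu, sigma):
    """Coefficient C of (2.14), in units with c = 1."""
    return (1 + mu) * (1 + 3 * mu * z) / (2 * sigma) - 3 * (1 - z) * A / (1 - A)


def rhs(z, A, mu, sigma):
    """Right-hand sides of (2.12), (2.13) written with respect to s = ln r."""
    dz = -C_coeff(z, A, mu, sigma) * z * (1 - A) / A
    dA = (1 - 3 * z) * (1 - A)
    return dz, dA


def integrate_inward(z0, A0, mu=1.0 / 3.0, sigma=1.0 / 3.0, ds=-1e-4, max_steps=200000):
    """Classical RK4 from r = r0 inward until z >= 1; returns [(r/r0, z, A), ...]."""
    z, A, s = z0, A0, 0.0
    path = [(1.0, z, A)]
    for _ in range(max_steps):
        if z >= 1.0:
            break
        k1z, k1A = rhs(z, A, mu, sigma)
        k2z, k2A = rhs(z + 0.5 * ds * k1z, A + 0.5 * ds * k1A, mu, sigma)
        k3z, k3A = rhs(z + 0.5 * ds * k2z, A + 0.5 * ds * k2A, mu, sigma)
        k4z, k4A = rhs(z + ds * k3z, A + ds * k3A, mu, sigma)
        z += ds * (k1z + 2 * k2z + 2 * k3z + k4z) / 6.0
        A += ds * (k1A + 2 * k2A + 2 * k3A + k4A) / 6.0
        s += ds
        path.append((math.exp(s), z, A))
    return path


def lower_bound(A0):
    """Right-hand side of (1.7): sqrt((1 - 9 A0) / (1 - A0))."""
    return math.sqrt((1 - 9 * A0) / (1 - A0))


if __name__ == "__main__":
    z0, A0 = 0.5, 0.05                  # illustrative data with A0 <= 1/9
    path = integrate_inward(z0, A0)
    r_star, z_end, A_end = path[-1]
    print("approximate r*/r0 :", r_star)
    print("lower bound (1.7) :", lower_bound(A0))
    print("A at r*           :", A_end)
```

Because the right-hand sides are autonomous in $\ln r$, the computed ratio $r_*/r_0$ does not depend on the choice of $r_0$ in this sketch.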
We remark that (2.21) and (2.22) show that $z$ can only tend to infinity at a value $r_1 > 0$ where $M(r_1) = 0$. Furthermore, it follows that when $r_1 > 0$, the values of $\rho(r)$ and $p(r)$ are bounded on the closed interval $r_1 \le r \le r_0$. Thus, solutions of the OV system (2.1), (2.2), actually exist on a larger interval containing $[r_1, r_0]$, but $M \ge 0$ is violated. Our first result is given in the following theorem which describes the continuation of an OV solution to values $0 \le r \le r_1$ in the case when $r_1 > 0$. We then show that $r_1$ is always positive when $r_0 \le \frac{9}{8}R_s(M(r_0))$; that is, $r_1 > 0$ if $r_0$ is within 9/8ths of the Schwarzschild radius.

Theorem 3. Let $(z(r), A(r))$ denote the smooth, (i.e., $C^1$), solution of (2.12), (2.13), defined on a maximal interval $(r_1, r_0]$, $0 \le r_1 < r_0 < \infty$, satisfying the initial conditions (2.16), and assume that (2.10) and (2.11) hold, so that the hypotheses of Theorem 2 hold. Assume that $r_1 > 0$. Then the functions $\rho(r)$, $p(r)$ and $M(r)$ can be extended as a smooth solution of the OV system (2.1), (2.2), to values $r$ satisfying $0 \le r < r_0$. Moreover, for $r < r_1$,
$$-\infty < M(0) < M(r) < 0, \tag{2.28}$$
where
$$\lim_{r\to 0} M(r) = M(0), \tag{2.29}$$
$A(r) > 0$, $M'(r) > 0$, and $p(r)$ and $\rho(r)$ are positive and bounded for all $r \in [0, r_0]$. Furthermore, there exists a unique value $r_2$, $0 < r_2 < r_1$, such that the functions $p(r)$ and $\rho(r)$ assume their maximum values at $r = r_2$, and
$$\lim_{r\to 0}p(r) = \lim_{r\to 0}\rho(r) = \lim_{r\to 0}p'(r) = \lim_{r\to 0}\rho'(r) = 0. \tag{2.30}$$
Finally, the component $B$ in the metric (2.4) satisfies
$$B(r) = O(r^{-1}) \quad\text{as } r \to 0, \tag{2.31}$$
and the tensor invariant $R \equiv R_{ijkl}R^{ijkl}$ of the Riemann curvature tensor determined by the metric (2.4) satisfies
$$R \ge \frac{\text{const.}}{r^6} \quad\text{as } r \to 0, \tag{2.32}$$
so that there is a non-removable singularity in the metric (2.4) at $r = 0$ when $r_1 > 0$.
The next theorem will be used to show that $r_1$ tends to $r_0$ as the initial condition $A(r_0) = A_0$ tends to zero. That is, as the initial condition is taken closer and closer to the Schwarzschild radius, the point $r_1$ at which $M(r_1) = 0$ tends to $r_0$. Since by (2.21), $M = 0$ at $r = r_1$, and $M(r_0)$ tends to $\frac{c^2 r_0}{2G}$ as $A_0$ tends to zero, we conclude that all of the mass accumulates in a surface layer near $r = r_0$ as $A_0$ tends to zero. Our analysis is based on estimating, explicitly in terms of $A_0$, the position $r = r_*$ of the unique point where $M(r)/r^3$ assumes its maximum. A calculation (below) shows that at $r = r_*$, we also have $\rho(r_*) = \bar\rho(r_*)$, so $z(r_*) = 1$, and moreover, $\bar\rho > \rho$ for $r_* < r < r_0$, and $\bar\rho < \rho$ for $r_1 < r < r_*$.³

³ The point $r_*$ also plays an important role in the shock-wave matching problem set out in [3, 4, 5]. Indeed, we showed in [5] that outgoing shocks, modeling explosions, can be constructed from any outer OV solution so long as $\bar\rho > \rho$. We will use these results in a future paper to study shock-waves near the Schwarzschild radius.
Theorem 4. Let $(z(r), A(r))$ be a smooth solution of (2.12), (2.13) starting from initial values $(z_0, A_0)$ and defined on a maximal interval $(r_1, r_0]$. Assume that the initial values satisfy
$$0 < z_0 < 1, \tag{2.33}$$
$$0 < A_0 \le \frac{1}{9}. \tag{2.34}$$
Then $r_1 > 0$, and there is a unique point $r_*$, $r_1 < r_* < r_0$, such that $z(r_*) = 1$, $z(r) < 1$ for $r > r_*$, $z(r) > 1$ for $r < r_*$, and the following inequalities hold:
$$1 > \frac{r_*}{r_0} > \sqrt{\frac{1 - 9A(r_0)}{1 - A(r_0)}}, \tag{2.35}$$
and
$$\rho(r) < \rho(r_*) \le \frac{3}{8\pi G r_0^2}\,\frac{1 - A_0}{1 - 9A_0}, \tag{2.36}$$
for all $r$, $r_* \le r < r_0$.

The estimate (2.35) gives a rate at which $\frac{r_*}{r_0} \to 1$ as $A_0 \to 0$, and we will use this to demonstrate that $\frac{r_1}{r_0} \to 1$ as $A_0 \to 0$. Note that the hypothesis $0 < A_0 \le \frac{1}{9}$ implies that $r_0$ is outside the Schwarzschild radius $R_s(M_0)$, but inside 9/8ths of $R_s(M_0)$. Theorem 1 of the introduction follows directly from Theorems 2-4, together with the following corollary, which generalizes the Buchdahl theorem:

Corollary 1. If $r_1 = 0$, then $A_0 > \frac{1}{9}$, or equivalently $r_0 > \frac{9}{8} R_s(M(r_0))$.

To see this, note that if $r_1 = 0$, then $M(0) = 0$, and so $M(r) = \int_0^r 4\pi\rho(s)s^2\,ds$. Now suppose that $A_0 \le \frac{1}{9}$. Then by (2.35), $r_* > 0$. But if $r_1 = 0$, then $\rho' < 0$ implies $\rho \le \bar\rho$, so $z \le 1$ when $r_1 = 0$ (Theorem 3). Thus $r_1 = 0$ is impossible when $r_* > 0$ because the latter implies $z > 1$ for $r < r_*$, a contradiction. The next corollary shows that $r_1 \to r_0$ as $A_0 \to 0$, thereby demonstrating that all of the mass accumulates in a layer that tends to $r_0$ as $r_0$ tends to the Schwarzschild radius.

Corollary 2. If $r_0$ and $z_0$ are fixed, then
$$\lim_{A_0 \to 0} \frac{r_1}{r_0} = 1. \tag{2.37}$$
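The bounds of Theorem 4 and Corollary 1 are easy to evaluate numerically. The following is a minimal illustrative sketch, assuming only the relation $A_0 = 1 - R_s/r_0$ and the lower bound in (2.35); the function names are hypothetical and do not come from the paper.

```python
import math


def radius_over_schwarzschild(a0):
    """Return r0 / R_s, assuming A0 = 1 - R_s / r0 (so r0 / R_s = 1 / (1 - A0))."""
    return 1.0 / (1.0 - a0)


def rstar_ratio_lower_bound(a0):
    """Lower bound sqrt((1 - 9 A0) / (1 - A0)) on r*/r0 from (2.35).

    Only meaningful under hypothesis (2.34), 0 < A0 <= 1/9.
    """
    if not (0.0 < a0 <= 1.0 / 9.0):
        raise ValueError("(2.35) requires 0 < A0 <= 1/9")
    # max(...) guards against a tiny negative value from rounding at A0 = 1/9.
    return math.sqrt(max(0.0, 1.0 - 9.0 * a0) / (1.0 - a0))


if __name__ == "__main__":
    for a0 in (1.0 / 9.0, 0.05, 0.01, 0.001):
        print(a0, radius_over_schwarzschild(a0), rstar_ratio_lower_bound(a0))
```

At $A_0 = 1/9$ the first function returns 9/8, matching Corollary 1, and the lower bound on $r_*/r_0$ increases toward 1 as $A_0$ decreases, which is the rate statement made after (2.36).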
The final theorem estimates the size of the surface layer $r_* < r < r_0$ (where $z < 1$) from above in terms of the initial data $(z_0, A_0)$. Our estimate for the width of the layer depends on the value $B(r_*)$, but this value depends on the initial condition for $B(R)$ at the surface of the star $r = R$. Thus in this case we shall assume that the solution is defined for $r_1 < r \le R$, and that $\lim_{r \to R} z(r) = 0$, and $B(R) = A(R)$. (Note here that the OV solution will not go continuously to a vacuum at $r = R$, ($z(R) = 0$, $\rho(R) = 0$), unless $\sigma \to 0$ as $r \to R$. This follows directly from (2.12) because, if $\sigma$ is bounded away from zero, then the system (2.12), (2.13) is regular, and has a unique solution through $r = R$, namely, the Schwarzschild solution. Allowing $\sigma \to 0$ as $r \to R$ is not a problem in the arguments to follow.)

Theorem 5. Let $(z(r), A(r))$ be a smooth solution of (2.12), (2.13) starting from initial values $(z_0, A_0)$ and defined on a maximal interval $(r_1, R]$, $0 < r_1 < r_0 < R$, where we assume the initial values satisfy (2.33), (2.34), together with
$$\lim_{r \to R} z(r) = 0, \tag{2.38}$$
and
$$z(r) = 0 \ \text{ and } \ B(r) = A(r) \ \text{ for } r \ge R. \tag{2.39}$$
Then the following inequality holds:
$$\frac{r_*}{r_0} \le \frac{1 - A_0}{1 - B(r_*)}. \tag{2.40}$$
Moreover, if $A$ is sufficiently small so that $C$ in (2.12) satisfies $C > 0$ for $r \in (r_*, r_0)$ (for example $A < \frac{1}{9}$ and $\sigma < 2$ will suffice), then $B(r_*)$ satisfies
$$B(r_*) = B(R)\,\exp\left(-\int_{z_0}^{1} \frac{1 + 3\mu z}{Cz}\,dz\right). \tag{2.41}$$
Note that to estimate $B(r_*)$ by using (2.41) (which by (2.40) yields an estimate for $\frac{r_*}{r_0}$ from below), we need to estimate the function $C$ in (2.14), and this essentially requires knowledge of the equation of state.

3. Proofs of Theorems

In this section we supply the proofs of Theorems 3-5 stated in Sect. 2. From here on we always assume that the speed of light $c$ is unity.

Proof of Theorem 3: Assume $r_1 > 0$. By Theorem 2,
$$\lim_{r \to r_1} M(r) = 0,$$
and $\rho$ and $p$ have finite positive limits $\rho(r_1)$, $p(r_1)$ at $r = r_1$, respectively. Thus by defining $M(r_1) = 0$, we have a continuous extension of the OV solution to $r = r_1$. Moreover, $M'(r_1) = 4\pi\rho(r_1)r_1^2 > 0$; thus there is an extension of the OV solution to a neighborhood $(r_1 - \epsilon, r_1]$, and we choose $\epsilon$ sufficiently small so that, on this neighborhood, $p(r) > 0$ and $\rho(r) > 0$ but $M(r) < 0$. Now let $I \equiv (r_3, r_1]$ denote the largest interval over which the solution of the OV equations starting from initial data at $r = r_1$ exists, is smooth, and both $\rho$ and $p$ are positive. The OV equation (2.1) can be rewritten in the form
$$-\rho' = \frac{G(1 + \mu)}{r^2\sigma}\,\rho\,(M + 4\pi\mu\rho r^3)\,\frac{1}{1 - \frac{2GM}{r}}. \tag{3.1}$$
Let
$$D(r) \equiv M(r) + 4\pi p(r) r^3. \tag{3.2}$$
Claim 1. $\rho$ and $M$ are bounded on $[r_3, r_1]$.

Proof of Claim 1. Using (3.1) we have that for $r \in I$,
$$-\rho' \le K_1\,\frac{\rho}{r^2}\,(4\pi p r^3)\,\frac{r}{2G|M|} \le K_2\,\frac{\rho^2 r^2}{|M|}, \tag{3.3}$$
for some positive constants $K_1$ and $K_2$. But $M_\epsilon \equiv M(r_1 - \epsilon) < 0$. Thus, since $M'(r) > 0$ on $I_\epsilon \equiv (r_3, r_1 - \epsilon]$, we have
$$-\rho' \le \frac{K_2}{|M|}\,\rho^2 r^2 \le K\rho^2 r^2,$$
for some positive constant $K$. Then integrating from $r > r_3$ to $r_1 - \epsilon$ gives
$$\rho(r) \le \frac{1}{\dfrac{1}{\rho(r_1 - \epsilon)} + \dfrac{K}{3}\left[r^3 - (r_1 - \epsilon)^3\right]} < \text{Const},$$
and this proves Claim 1.

Using the claim we conclude that $D(r_2) = 0$ for some $r_2 \in I$. Indeed, if $D(r) \ne 0$ for all $r \in I$, then since $\rho' < 0$ and $\rho$ is bounded, it follows that $\rho$, $p$ and $M$ would have finite positive limits at $r = r_3$ if $r_3 \ne 0$, so we must have $r_3 = 0$ in order not to contradict the maximality of the interval $I$. But if $r_3 = 0$, then clearly $D(r) = M + 4\pi p r^3$ is negative for $r$ sufficiently close to $r = 0$. Now let $r_2$ be any point in $I$ for which $D(r_2) = 0$. Then
$$\frac{d}{dr}D(r_2) = M'(r_2) + 4\pi p'(r_2) r_2^3 + 12\pi p(r_2) r_2^2 > 0,$$
since $p'(r_2) = 0$. It follows from this that there exists a unique $r_2 \in I$ at which $D(r_2) = 0$. For $r < r_2$, note that $\rho'(r) > 0$ and $p'(r) > 0$.

Claim 2. $r_3 = 0$.

Proof of Claim 2. Using (3.1) we can write
$$\rho' = \frac{G(1 + \mu)}{\sigma r^2}\,\rho\,(-M - 4\pi\mu\rho r^3)\,\frac{1}{A} < K\,\frac{1}{r^2}\,\rho\,(-M)\,\frac{r}{-M} < K_+\,\frac{\rho}{r},$$
for some positive constants $K$ and $K_+$. Integrating from $r < r_2$ to $r_2$ gives
$$\rho(r) > \rho(r_2)\left(\frac{r}{r_2}\right)^{K_+},$$
so that $\rho(r) \ge 0$ for all $r \ge r_3$. We conclude that either $r_3 = 0$ or else we contradict the maximality of $I$. This proves Claim 2.

Claim 3. $\lim_{r \to 0} \rho(r) = 0$.
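The two integrations used in Claims 1 and 2 are both separable differential inequalities. As a sketch (with $K$ and $K_+$ the constants above, and assuming $\rho > 0$ on the interval in question), they can be written out as follows:

```latex
% Claim 1: -\rho' \le K \rho^2 r^2 on (r_3, r_1 - \epsilon]
\frac{d}{dr}\left(\frac{1}{\rho}\right) = -\frac{\rho'}{\rho^2} \le K r^2
\;\Longrightarrow\;
\frac{1}{\rho(r_1 - \epsilon)} - \frac{1}{\rho(r)}
  \le \frac{K}{3}\left[(r_1 - \epsilon)^3 - r^3\right].

% Claim 2: \rho' < K_+ \rho / r for r < r_2
\frac{d}{dr}\ln\rho < \frac{K_+}{r}
\;\Longrightarrow\;
\ln\frac{\rho(r_2)}{\rho(r)} < K_+ \ln\frac{r_2}{r}
\;\Longrightarrow\;
\rho(r) > \rho(r_2)\left(\frac{r}{r_2}\right)^{K_+}.
```

Rearranging the first line gives the bound on $\rho(r)$ stated in the proof of Claim 1; the second is the power-law lower bound used in Claim 2.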
Proof of Claim 3. Note first that $D'(r) = M'(r) + 4\pi p'(r) r^3 + 12\pi p(r) r^2 \ge 0$ for all $r \in (0, r_2]$. It follows that
$$-D(r) > -D(r_2 - \epsilon) \equiv K_\epsilon, \qquad 0 < r < r_2 - \epsilon,$$
for some small positive number $\epsilon$. Thus from (3.1) we obtain for $0 < r < r_2 - \epsilon$,
$$\rho' \ge \frac{K}{r^2}\,\rho\,K_\epsilon\,\frac{1}{1 + \frac{2G|M|}{r}},$$
so that
$$\rho' \ge K_-\,\frac{\rho}{r},$$
where $K_- > 0$. Thus for such $r$ we have
$$\rho(r) < \rho(r_2 - \epsilon)\left(\frac{r}{r_2 - \epsilon}\right)^{K_-},$$
and this shows that $\rho(r) \to 0$ as $r \to 0$, which proves Claim 3.

Next we show that
$$\lim_{r \to 0} \rho'(r) = 0. \tag{3.4}$$
To see this, note that for $r$ near $r = 0$, we obtain from (3.1) that
$$\rho' = \frac{G(1 + \mu)}{\sigma r^2}\,\rho\,(|M| + O(r))\,\frac{r}{2G|M|}\,(1 + O(r)),$$
which we can rewrite as
$$\rho'(r) = \frac{(1 + \mu)}{2\sigma}\,\frac{\rho}{r}\,(1 + O(r)).$$
Since $\lim_{r \to 0}\rho(r) = \lim_{r \to 0} p(r) = 0$, we may write this last equation as
$$\rho'(r) = \frac{(1 + \mu(0))}{2\sigma(0)}\,\frac{\rho}{r}\,(1 + O(r)) \quad \text{as } r \to 0.$$
Now integrating from $r < \epsilon$ to $r = \epsilon$ (where $\epsilon$ is near zero), we obtain
$$\rho(r) = \rho(\epsilon)\left(\frac{r}{\epsilon}\right)^{K_0} e^{-K_0 O(\epsilon)}, \tag{3.5}$$
where
$$K_0 = \frac{1 + \mu(0)}{2\sigma(0)}.$$
But $\mu(0) = \lim_{\rho \to 0}\frac{p(\rho)}{\rho} = p'(0) = \sigma(0)$. Thus
$$K_0 = \frac{1 + \sigma(0)}{2\sigma(0)} > 1,$$
because $\sigma$, the sound speed squared, is less than unity. We conclude that
$$\lim_{r \to 0}\frac{\rho(r)}{r} = 0,$$
and hence
$$\rho'(0) = \lim_{r \to 0}\frac{\rho(r) - \rho(0)}{r - 0} = 0.$$
Finally we verify (2.31) and (2.32). For (2.31) note that we have
$$\frac{B'}{B} = -\frac{2p'}{p + \rho}, \tag{3.6}$$
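and using an argument similar to the derivation of (3.5), we obtain that near $r = 0$,
$$p' = \frac{1 + \mu}{2}\,\rho\left(\frac{1}{r} - \frac{1}{2G|M(0)|} + O(r)\right). \tag{3.7}$$
Substituting this for $p'$ in (3.6), and using $(1 + \mu)\rho = p + \rho$, we see that for $r$ near zero,
$$\frac{B'}{B} = -\frac{(1 + \mu)\rho}{p + \rho}\left(\frac{1}{r} + O(1)\right) = -\left(\frac{1}{r} + O(1)\right). \tag{3.8}$$
Now integrating from $r < \epsilon$ to $r = \epsilon$ yields
$$B(r) = B(\epsilon)\,\frac{\epsilon}{r}\,(1 + O(\epsilon)). \tag{3.9}$$
This shows that $B(r) = O\left(\frac{1}{r}\right)$ near $r = 0$. To verify (2.32), a calculation using MAPLE yields
$$R = \frac{\left[2ABB'' - A(B')^2 + BA'B'\right]^2}{4B^4} + \frac{2A^2(B')^2}{r^2B^2} + \frac{2(A')^2}{r^2} + \frac{4(1 - A)^2}{r^4}.$$
Thus
$$R \ge \frac{4(1 - A)^2}{r^4} = 16G^2\,\frac{M(r)^2}{r^6} \to \infty \quad \text{as } r \to 0,$$
since $M(0) \ne 0$. This completes the proof of Theorem 3.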
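As a sanity check on the expression for $R$ used above, one can evaluate it on the exterior Schwarzschild solution, where $A = B = 1 - 2m/r$ (with $G = 1$) and the curvature invariant is known to equal $48m^2/r^6$. The sketch below is illustrative only; the function names are hypothetical, and the derivatives of $A$ and $B$ are supplied analytically rather than computed.

```python
def curvature_invariant(A, dA, B, dB, d2B, r):
    """Evaluate the four-term expression for R = R_ijkl R^ijkl given in the text."""
    first = (2 * A * B * d2B - A * dB**2 + B * dA * dB) ** 2 / (4 * B**4)
    second = 2 * A**2 * dB**2 / (r**2 * B**2)
    third = 2 * dA**2 / r**2
    fourth = 4 * (1 - A) ** 2 / r**4
    return first + second + third + fourth


def schwarzschild_check(m, r):
    """Return (formula value, 48 m^2 / r^6) for A = B = 1 - 2m/r, assuming G = 1."""
    A = 1 - 2 * m / r
    dA = 2 * m / r**2       # A'(r)
    d2A = -4 * m / r**3     # A''(r)
    return curvature_invariant(A, dA, A, dA, d2A, r), 48 * m**2 / r**6


if __name__ == "__main__":
    print(schwarzschild_check(1.0, 5.0))
```

The last term alone, $4(1 - A)^2/r^4 = 16m^2/r^6$, is the lower bound used in the proof; in the Schwarzschild case it accounts for one third of the full invariant.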
We can use the shock-wave matching techniques developed in [3] to show that the non-removable singularity that appears in the metric at $r = 0$ in the case when $r_1 > 0$ really does represent a delta function source of negative density. Indeed, a Friedmann-Robertson-Walker (FRW) metric can only be matched Lipschitz continuously to a metric of type (2.4) if the following condition holds (cf. [3]):
$$M(r) = \frac{4\pi}{3}\,\bar\rho\,r^3, \tag{3.10}$$
where $\bar\rho$ denotes the FRW density behind the interface between an FRW metric inside radius $r$ and a metric of type (2.4) outside radius $r$. Thus if $M(r) < 0$, then only FRW metrics with negative density can be matched to (2.4) at radius $r$. In the limit that $r \to 0$, $M(r) \to M(0) < 0$, and thus by (3.10) the FRW density $\bar\rho$ tends to a negative delta function source of magnitude $M(0)$ centered at $r = 0$. In other words, replacing the ball of radius $r = \epsilon$ by an FRW space at fixed time has the effect of regularizing the singularity at $r = 0$ at that time. But by (3.10), the FRW solution inside radius $r = \epsilon$ determines a sequence whose density converges to a delta function of negative mass $M(0)$ as $\epsilon \to 0$.
We now show that a solution of the OV equation starting from initial values $M(r_0) < 0$ and $p(r_0) > 0$ cannot reach $p = 0$ for some $R > r_0$ without having $M(R) \ge 0$. To see this note that if $\lim_{r \to R} p(r) = 0$, we must have $p'(r_k) < 0$ on a sequence $r_k \to R$, so long as $p > 0$ for $r < R$. But if $M < 0$, then $A > 1$, and so the OV equation (2.1) implies that $0 \le \lim_{r_k \to R}\left[M(r_k) + 4\pi p(r_k) r_k^3\right] = \lim_{r_k \to R} M(r_k)$, and so in fact, since $M'(r) > 0$ when $p > 0$, we must have $M(R) \ge 0$. Thus negative total masses will never be observed at the surface of a star $r = R$ (or beyond), if $\rho(r) > 0$ at any $r < R$ outside the Schwarzschild radius (i.e., the solution is not the empty space Schwarzschild solution with negative mass).

Proof of Theorem 4. We begin by proving the following:

Lemma 1. Let $(z(r), A(r))$ denote the solution of (2.12), (2.13) defined on the maximal interval $(r_1, r_0]$, starting from initial data $z(r_0) = z_0$, $A(r_0) = A_0$, where $0 < z_0, A_0 < 1$ (so that the hypotheses of Theorem 2 hold). Assume that $r_1 > 0$. Then there exists a unique point $r_*$, $r_1 < r_* < r_0$, such that $z(r_*) = 1$.

Proof of Lemma. Since $z(r_0) < 1$, and by Theorem 2, $z(r) \to +\infty$ as $r \to r_1$, we see that there exists an $r_*$ for which $z(r_*) = 1$. On the other hand, by (2.12), $z'(r) < 0$ if $z \ge 1$, so we see that $r_*$ is unique. This completes the proof of the lemma.

Now differentiating the average density,
$$\bar\rho = \frac{3}{4\pi}\,\frac{M(r)}{r^3},$$
we obtain
$$\bar\rho' = \frac{3}{r}(\rho - \bar\rho) = \frac{3\bar\rho}{r}(z - 1), \tag{3.11}$$
so we see that $\bar\rho$ takes a unique maximum at $r = r_*$, and thus
$$\bar\rho'(r) < 0 \quad \text{if } r_* < r < r_0, \tag{3.12}$$
$$\bar\rho'(r) > 0 \quad \text{if } r_1 < r < r_*. \tag{3.13}$$
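For completeness, the differentiation leading to (3.11) can be spelled out. The following uses only $M'(r) = 4\pi\rho r^2$ from (2.2) and the definition $z = \rho/\bar\rho$:

```latex
\bar\rho' = \frac{d}{dr}\left(\frac{3M}{4\pi r^3}\right)
          = \frac{3M'}{4\pi r^3} - \frac{9M}{4\pi r^4}
          = \frac{3\rho}{r} - \frac{3\bar\rho}{r}
          = \frac{3\bar\rho}{r}\left(\frac{\rho}{\bar\rho} - 1\right)
          = \frac{3\bar\rho}{r}\,(z - 1).
```

Since $\bar\rho > 0$ on $(r_1, r_0]$, the sign of $\bar\rho'$ is the sign of $z - 1$, which gives (3.12) and (3.13).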
We now estimate $\frac{r_*}{r_0}$ when $A_0 < \frac{1}{9}$. As a first step, we prove the following lemma, which implies (2.35) in the special case when $r_0$ is the boundary surface of the star, and the Schwarzschild solution is attached to the OV solution at $r = r_0$. (Note here that the OV solution will not go continuously to a vacuum at $r = R$, namely, $z(R) = 0$, $\rho(R) = 0$, unless $\sigma \to 0$ as $r \to R$. This follows directly from (2.12) because, if $\sigma$ is bounded away from zero, then the system (2.12), (2.13) is regular, and has a unique solution through $r = R$, namely, the Schwarzschild solution. Allowing $\sigma \to 0$ as $r \to R$ is not a problem in the arguments to follow because, for any $\tilde r < R$, $\rho(\tilde r) \ne 0$, $\sigma > 0$, and our regularity results, Theorems 2 and 3, are valid for $r \le \tilde r$.)
Lemma 2. Assume the hypotheses of Theorem 4, and in addition assume that $\rho(r) = 0 = p(r)$, and $B(r) = A(r)$, for all $r \ge r_0$. Then inequality (2.35) holds.

Proof of Lemma 2. From Weinberg, [2], p. 333, we have the following identity that holds on solutions of the OV system:
$$\left(\frac{1}{r}\sqrt{A}\,(\sqrt{B})'\right)' = G\sqrt{\frac{B}{A}}\left(\frac{M}{r^3}\right)', \tag{3.14}$$
where prime denotes differentiation with respect to $r$. (Note that by Theorem 2, $A(r)$ and $B(r)$ are both positive on $(r_1, r_0]$.) Now from (3.11) and (3.12), $\left(\frac{M}{r^3}\right)' < 0$ for $r > r_*$ (and this holds when $r_* = 0$ because in this case $r_1 = 0$, and thus from (3.11), $\bar\rho' < 0$ for all $r > 0$), so that, from (3.14),
$$\left(\frac{1}{r}\sqrt{A}\,\big[\sqrt{B}\big]'\right)' < 0$$
holds for $r_* < r < r_0$. Integrating, we obtain for such $r$
$$0 > \int_r^{r_0}\left(\frac{1}{s}\sqrt{A}\,(\sqrt{B})'\right)' ds = \frac{1}{r_0}\sqrt{A(r_0)}\,\big[\sqrt{B(r_0)}\big]' - \frac{1}{r}\sqrt{A(r)}\,\big[\sqrt{B(r)}\big]',$$
or
$$\frac{r}{r_0}\,\frac{1}{\sqrt{A(r)}}\,\sqrt{\frac{A(r_0)}{B(r_0)}}\,\frac{B'(r_0)}{2} < \big[\sqrt{B(r)}\big]'. \tag{3.15}$$
But note that by assumption $B(r_0) = A(r_0)$, and moreover,
$$B'(r_0) = A'(r_0) = \frac{2GM(r_0)}{r_0^2}. \tag{3.16}$$
Indeed, for the second equality we use $M'(r_0) = 4\pi\rho(r_0) r_0^2$ and $\rho(r_0) = 0$. For the first equality, we substitute the expression for $p'$ given in the OV equation (2.1) into (2.7) and again use the fact that $\rho(r_0) = p(r_0) = 0$, and $A(r_0) = B(r_0)$. Integrating (3.15) from $r_*$ to $r_0$ and using the fact that $B'(r_0) = A'(r_0)$ gives
$$\sqrt{B(r_0)} - \sqrt{B(r_*)} > \frac{GM(r_0)}{r_0^3}\int_{r_*}^{r_0}\frac{r\,dr}{\sqrt{1 - \frac{2GM(r)}{r}}} \ge \frac{GM_0}{r_0^3}\int_{r_*}^{r_0}\frac{r\,dr}{\sqrt{1 - \frac{2GM_0}{r_0^3}r^2}},$$
because
$$M(r) = \frac{4\pi}{3}\bar\rho(r) r^3 \ge \frac{4\pi}{3}\bar\rho(r_0) r^3 = \frac{M_0}{r_0^3}\,r^3.$$
Now making the substitution $u = 1 - \frac{2GM_0}{r_0^3}r^2$ in the last integral gives
$$3\sqrt{A(r_0)} > \sqrt{1 - \frac{2GM_0}{r_0^3}r_*^2}. \tag{3.17}$$
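The passage from the integral estimate to (3.17), and from (3.17) to (2.35), can be written out as follows. This is a sketch of the algebra only, using $B(r_0) = A(r_0) = A_0$, $B(r_*) > 0$ and $\frac{2GM_0}{r_0} = 1 - A_0$:

```latex
% Substitution u = 1 - (2 G M_0 / r_0^3) r^2, so that
% r\,dr = -\frac{r_0^3}{4 G M_0}\,du and u(r_0) = A_0:
\frac{G M_0}{r_0^3}\int_{r_*}^{r_0}\frac{r\,dr}{\sqrt{u}}
  = \frac{1}{2}\left(\sqrt{u(r_*)} - \sqrt{A_0}\right).

% Hence, since \sqrt{B(r_*)} > 0:
\sqrt{A_0} > \sqrt{A_0} - \sqrt{B(r_*)}
  > \frac{1}{2}\left(\sqrt{u(r_*)} - \sqrt{A_0}\right)
\;\Longrightarrow\;
3\sqrt{A_0} > \sqrt{u(r_*)}.

% Squaring, with u(r_*) = 1 - (1 - A_0)\,(r_*/r_0)^2:
9 A_0 > 1 - (1 - A_0)\left(\frac{r_*}{r_0}\right)^2
\;\Longrightarrow\;
\left(\frac{r_*}{r_0}\right)^2 > \frac{1 - 9 A_0}{1 - A_0}.
```

Taking square roots gives the lower bound in (2.35); setting $r_* = 0$ in (3.17) gives $3\sqrt{A_0} > 1$, that is, $A_0 > \frac{1}{9}$.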
In particular, (3.17) implies that $r_* > 0$ because $r_* = 0$ would imply that $A_0 > \frac{1}{9}$, in violation of our hypothesis. But, if $r_* > 0$, then $z(r) > 1$ for $r < r_*$ by (2.12). Now using Theorem 2, we see that if $r_1 = 0$, then $z(0) \le 1$, and this is a contradiction. Thus $r_1 > 0$. Now simplifying (3.17) yields (2.35) in the case when $r = r_0$ is attached to the empty space Schwarzschild solution. This completes the proof of Lemma 2.

To complete the proof of (2.35) it remains only to extend Lemma 2 to the case when the initial conditions at $r = r_0$ are the general conditions (2.33), (2.34); that is, the case when we do not assume that the solution is attached to the empty space Schwarzschild metric at $r = r_0$; i.e., we assume that $\rho(r_0) > 0$. To accomplish this, we will extend the definition of the equation of state function $p(\rho)$ to values of $\rho$ smaller than the value $\rho(r_0)$ in such a way that the extension of the solution to $r > r_0$ ($r$ near $r_0$) hits $\rho = 0$ at an arbitrarily small distance from $r = r_0$. The extension of $p(\rho)$ to values of $\rho < \rho(r_0) \equiv \rho_0$ does not affect the solution for $r \in (r_1, r_0]$ because in this range, $\rho'(r) < 0$, and hence $\rho > \rho(r_0)$. Thus (2.35) will follow in full generality by passing to the limit. To carry out this program, let $0 < \delta < \rho_0$ be given and let $p_\delta(\rho)$ be an extension of $p(\rho)$ to values of $\rho < \rho_0$ such that the following conditions hold:
$$p_\delta(\rho) = p(\rho) \ \text{ for } \rho \ge \rho_0, \qquad p_\delta(\rho) = \delta\rho \ \text{ for } 0 \le \rho \le \rho_0 - \delta, \tag{3.18}$$
and we let $p_\delta$ be a smooth interpolation of $p$ between the values $\rho = \rho_0$ and $\rho = \rho_0 - \delta$. For this extension $p_\delta$ of $p$, we now show that the extension of the solution by the OV equation to values of $r > r_0$ satisfies $\rho'(r) < 0$, and $\rho(r) = 0$ for some $r \in (r_0, r_0 + \epsilon)$ for $\epsilon = \epsilon(\delta) \to 0$ as $\delta \to 0$. To this end, note that for $r$ sufficiently close to $r = r_0$, it is not difficult to see that using the OV equation (2.1), we can obtain the following estimate:
$$\rho'(r) \le -K\,\frac{\rho(r)}{p_\delta'(\rho(r))}, \tag{3.19}$$
where $K$ is a constant independent of $\delta$ (uniform over a fixed $r$-interval about $r_0$, and depending only on values of the solution near $r = r_0$). Now fix $\epsilon \ll 1$; we show that there exists a $\delta$ such that the solution of the OV system starting from initial data at $r = r_0$ to values $r > r_0$ (using equation of state $p_\delta$) must satisfy $\rho(r) = 0$ for some $r$, $r_0 < r < r_0 + \epsilon$. To this end, assume $\rho(r) > 0$ on this interval for all $\delta \ll 1$. We show that this is impossible. Indeed, integrating (3.19) from $r_0$ to $r_0 + \epsilon$ gives
$$\int_{\rho_0}^{\rho(r_0 + \epsilon)}\frac{p_\delta'}{\rho}\,d\rho \le -K\int_{r_0}^{r_0 + \epsilon} dr = -K\epsilon.$$
But
$$\int_{\rho_0}^{\rho(r_0 + \epsilon)}\frac{p_\delta'(\rho)}{\rho}\,d\rho = \int_{\rho_0}^{\rho_0 - \delta}\frac{p_\delta'(\rho)}{\rho}\,d\rho + \int_{\rho_0 - \delta}^{\rho(r_0 + \epsilon)}\frac{p_\delta'(\rho)}{\rho}\,d\rho = O(\delta) + \delta\rho(r_0 + \epsilon).$$
Thus we get
$$O(\delta) + \delta\rho(r_0 + \epsilon) \le -K\epsilon. \tag{3.20}$$
Since $\epsilon$ is fixed, we see from (3.20) that $\rho(r_0 + \epsilon)$ cannot be positive for $\delta$ sufficiently small. This proves that for every $\epsilon > 0$ there exists a $\delta > 0$ such that $\rho(r_\epsilon) = 0$ for some $r_0 < r_\epsilon < r_0 + \epsilon$, when $p_\delta(\rho)$ is taken as the equation of state. Thus for each $\epsilon \ll 1$, we can match the (extended) OV solution determined from initial data (2.33), (2.34) to the empty space Schwarzschild solution at $r = r_\epsilon$. Thus, by applying the last lemma we conclude that
$$1 > \frac{r_*}{r_0} > \sqrt{\frac{1 - 9A_\epsilon}{1 - A_\epsilon}},$$
where
$$A_\epsilon = A(r_\epsilon) = 1 - \frac{2GM(r_\epsilon)}{r_\epsilon}.$$
Since $M(r_\epsilon) \to M(r_0)$ as $\epsilon \to 0$ because
$$M(r_\epsilon) - M(r_0) = \int_{r_0}^{r_\epsilon} 4\pi\rho(r) r^2\,dr \to 0$$
as $\epsilon \to 0$, we conclude that indeed estimate (2.35) must hold in full generality.

To complete the proof of Theorem 4 it remains only to prove (2.36). To this end, we have
$$M(r_*) = \frac{4\pi}{3}\bar\rho(r_*) r_*^3 = \frac{4\pi}{3}\rho(r_*) r_*^3,$$
so that
$$A(r_*) = 1 - \frac{2GM(r_*)}{r_*} = 1 - \frac{8\pi G}{3}\rho(r_*) r_*^2,$$
and hence
$$1 - A(r_*) = \frac{8\pi G}{3}\rho(r_*) r_*^2 > \frac{8\pi G}{3}\rho(r_*) r_0^2\left(\frac{1 - 9A_0}{1 - A_0}\right),$$
where we have used (2.35). Thus
$$0 < A(r_*) < 1 - \frac{8\pi G}{3}\rho(r_*) r_0^2\left(\frac{1 - 9A_0}{1 - A_0}\right),$$
and simplifying yields (2.36) because $\rho'(r) < 0$ on $r_* < r < r_0$. This completes the proof of Theorem 4.

We now give the proof of Corollary 2. For this, consider a solution of (2.12), (2.13) defined on the maximal interval $(r_1, r_0)$, starting from initial data $(z_0, A_0)$ that satisfies $0 < z_0, A_0 < 1$. Now fix $z_0$ and $r_0$ and let $A_0 \to 0$. Then we know from Theorem 4 that $r_* \to r_0$ as $A_0 \to 0$. We also show that $r_1 \to r_0$ as $A_0 \to 0$. To this end, assume not. Then (at least for some subsequence of $A_0$'s), there exists an interval $(\tilde r_1, r_0)$ such that $r_1 \le \tilde r_1$ for all $A_0 \to 0$ in this subsequence. We show that this implies that $z(r) \to \infty$ for all $r \in (\tilde r_1, r_0)$ as $A_0$ tends to zero along this subsequence. This would give the desired contradiction because $z = \rho/\bar\rho$, and
$$\bar\rho(r) = \frac{3}{4\pi}\,\frac{M(r)}{r^3}$$
is bounded away from zero as $A_0 \to 0$, so $z \to \infty$ implies that $\rho(r) \to \infty$ as $A_0 \to 0$. The contradiction then is that
$$M(r_0) = \int_{r_1}^{r_0} 4\pi\rho(r) r^2\,dr > \int_{\tilde r_1}^{r_0} 4\pi\rho(r) r^2\,dr \to \infty$$
as $A_0 \to 0$, but $M(r_0) < \infty$. (We use the fact that the integral of a sequence of positive functions tends to infinity if the sequence tends to infinity pointwise.) Thus we need only show that $z(r) \to \infty$ as $A_0 \to 0$. To see this, note first that $z > 1$ for all $A_0$ sufficiently small because for $A_0$ sufficiently small, $r_* > r$ and hence $z(r) > 1$ because $z' < 0$ for $r < r_*$. Thus (2.14) implies that $C \ge \bar C$ for some positive constant $\bar C$ that is independent of $A_0$. Moreover, solving for $\frac{1 - A}{r}$ in (2.13) and substituting into (2.12), and using the fact that $z > 1$ and that
$$\left|\frac{z}{1 - 3z}\right| \ge \frac{1}{3},$$
we obtain the inequality
$$z' \le \frac{\bar C}{3}\,\frac{A'}{A},$$
which holds for all $r \in (\tilde r_1, r_*)$. Integrating between $r$ and $r_*$ yields
$$z(r) \ge 1 + \frac{\bar C}{3}\ln\frac{A(r)}{A(r_*)}. \tag{3.21}$$
Notice now that
$$M(r_*) = M(r_0) - \int_{r_*}^{r_0} 4\pi\rho(r) r^2\,dr.$$
But since (2.36) shows that $\rho(r)$ is uniformly bounded on the interval $(r_*, r_0)$, we see that this latter integral tends to zero as $A_0 \to 0$ because $r_* \to r_0$. Thus $M(r_*) \to M(r_0)$ as $A_0 \to 0$, which implies $A(r_*) \to 0$ as $A_0 \to 0$. But $A(r)$ is uniformly bounded away from zero because $A' = \frac{(1 - 3z)(1 - A)}{r}$ is bounded above by a nonzero negative constant when $z > 1$. In light of this, (3.21) shows that $z(r) \to \infty$ as $A_0 \to 0$ for all $r \in (\tilde r_1, r_0)$, the condition we sought. This proves Corollary 2.

Proof of Theorem 5. We first verify (2.41). From (2.15), (2.12) and (2.14), if the function $C$ given in (2.14) satisfies $C > 0$, then $z$ is a monotone function of $r$, so we have
$$\frac{d\ln(B)}{dz} = \frac{1}{B}\frac{dB}{dz} = \frac{1}{B}\frac{dB}{dr}\frac{dr}{dz} = -\frac{(1 + 3\mu z)}{C}\,\frac{1}{z}.$$
Thus integrating from $z_0$ to $z = 1$ gives (2.41). We also shall need the following lemma:
Lemma 3. The metric coefficients $B(r)$ and $A(r)$ determined by a solution of the OV equations satisfy
$$\frac{d}{dr}\ln\frac{A}{B} = -\frac{(1 + \mu)}{A}\,8\pi G\rho r < 0. \tag{3.22}$$
Proof of Lemma. First write
$$\frac{d}{dr}\ln\frac{A}{B} = \frac{A'}{A} - \frac{B'}{B},$$
and use (2.13) together with the OV equation (2.1) to write
$$\frac{A'}{A} - \frac{B'}{B} = \frac{(1 - 3z)(1 - A)}{rA} - \frac{(1 - A)}{rA}\left(1 + \frac{4\pi p r^3}{M}\right),$$
from which (3.22) follows upon noticing that
$$3z = \frac{4\pi\rho r^3}{M}.$$
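The simplification behind (3.22) can be spelled out as follows, using only $1 - A = \frac{2GM}{r}$, $p = \mu\rho$ and $3z = \frac{4\pi\rho r^3}{M}$:

```latex
\frac{A'}{A} - \frac{B'}{B}
  = \frac{1 - A}{rA}\left[(1 - 3z) - 1 - \frac{4\pi p r^3}{M}\right]
  = -\frac{1 - A}{rA}\cdot\frac{4\pi r^3(\rho + p)}{M}
  = -\frac{2GM}{r^2 A}\cdot\frac{4\pi r^3(1 + \mu)\rho}{M}
  = -\frac{(1 + \mu)}{A}\,8\pi G\rho r.
```

The factor $M$ cancels, so the sign of the right-hand side is that of $-1/A$, and the inequality in (3.22) holds wherever $A > 0$ and $\rho > 0$.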
This completes the proof of the lemma.

To prove Theorem 5, we see from (3.14) together with the last lemma (which implies $\frac{A}{B} > 1$ since $B(R) = A(R)$), that we may write
$$\left(\frac{1}{r}\sqrt{A(r)}\,\big(\sqrt{B(r)}\big)'\right)' \ge G\left(\frac{M(r)}{r^3}\right)',$$
for all $r \in (r_*, R)$. Integrating this expression from $r \in (r_*, R)$ to $R$ yields
$$\frac{1}{R}\sqrt{A(R)}\,\frac{B'(R)}{2\sqrt{A(R)}} - \frac{1}{r}\sqrt{A(r)}\,\big(\sqrt{B(r)}\big)' \ge G\left(\frac{M(R)}{R^3} - \frac{M(r)}{r^3}\right).$$
Using (3.16) and simplifying gives
$$\big(\sqrt{B(r)}\big)' \le \frac{GM(r)}{r^2\sqrt{A(r)}},$$
so integrating from $r_*$ to $R$ gives
$$\int_{r_*}^{R}\big(\sqrt{B(r)}\big)'\,dr \le \int_{r_*}^{R}\frac{GM(r)}{r^2\sqrt{A(r)}}\,dr,$$
or
$$\sqrt{B(R)} - \sqrt{B(r_*)} \le \int_{r_*}^{R}\frac{GM(r)}{r^2}\,\frac{1}{\sqrt{1 - \frac{2GM(r)}{r}}}\,dr. \tag{3.23}$$
Now to estimate the integral on the right hand side of (3.23), use the fact that $M(r) \le M(R)$, and
$$\frac{1}{\sqrt{1 - \frac{2GM(r)}{r}}} \le \frac{1}{\sqrt{1 - \frac{2GM(R)}{r}}},$$
to obtain
$$\int_{r_*}^{R}\frac{GM(r)}{r^2}\,\frac{1}{\sqrt{1 - \frac{2GM(r)}{r}}}\,dr \le \int_{r_*}^{R}\frac{GM(R)}{r^2}\,\frac{1}{\sqrt{1 - \frac{2GM(R)}{r}}}\,dr. \tag{3.24}$$
Using the substitution
$$u = 1 - \frac{2GM(R)}{r}, \qquad du = \frac{2GM(R)}{r^2}\,dr,$$
we obtain from (3.24) the estimate
$$\sqrt{B(R)} - \sqrt{B(r_*)} \le \sqrt{A(R)} - \sqrt{1 - \frac{2GM(R)}{r_*}}.$$
Finally, since $B(R) = A(R)$, a straightforward calculation gives (2.40). This completes the proof of Theorem 5.

4. Concluding Remarks

The issue of negative mass functions raises an interesting question. Recall that, for spherically symmetric solutions, it is only the total mass $M(R)$, which is the total mass measured in the far field, that has an intrinsic physical meaning in general relativity. That is, in the Newtonian theory, $M(r) = \int_0^r 4\pi\rho(s)s^2\,ds$ must be interpreted as the total mass inside radius $r$ because the underlying space is Euclidean; but in general relativity, the mass function enters indirectly through the metric coefficient $A(r)^{-1}$, the coefficient of the $dr^2$ term in the gravitational metric tensor, via the formula $M(r) = \frac{rc^2}{2G}(1 - A(r))$. In general relativity, only the equation $M'(r) = 4\pi\rho r^2$ follows from the Einstein equations, and the integration constant is not specified. Said differently, in general relativity, there is no intrinsic physical interpretation for the function $M(r)$ when $r < R$ because the spacetime inside radius $r$ is not fixed a priori as in the Newtonian theory. Since the density and pressure are everywhere positive but the mass $M(r)$ is negative for $0 < r < r_1$ in the solutions constructed here, we pose the question as to whether a region $0 \le r < \tilde r < r_1$ in an OV solution can be replaced by a perfect fluid solution that is singularity free inside radius $\tilde r$, such that the density and pressure are everywhere positive. This introduces the following dichotomy. Namely, if such a matching is possible, then the gravitational field can have a repulsive effect, in light of the fact that $p' > 0$ near $r = 0$.
If such a matching cannot be made, then the following conjecture must hold: Conjecture: No singularity free metric that solves the Einstein equations for a perfect fluid can be matched Lipschitz continuously to the negative mass portion of an OV metric in such a way that the interface between the metrics describes a fluid dynamical shockwave, and such that the matched solution is singularity free, and has everywhere positive density and pressure. We showed above (before the proof of Theorem 4) that the conjecture is correct for matching to a Friedmann-Robertson-Walker metric; cf. [3].
In light of this dichotomy, we find it interesting that, as we proved above, the invariant quantity $\lim_{r \to \infty} M(r) = M(R)$ must satisfy $M(R) \ge 0$ at the surface of the star $r = R$, even when $M(r)$ is negative at some interior point $r < R$. Therefore we conclude that negative mass $M < 0$ would never be seen by an observer beyond the surface of the star (consistent with the positive mass theorem, [8]).

References

1. Buchdahl, W.A.: Phys. Rev. 116, 1027 (1956)
2. Weinberg, S.: Gravitation and Cosmology: Principles and Applications of the General Theory of Relativity. New York: John Wiley & Sons, 1972
3. Smoller, J. and Temple, B.: Shock-wave solutions of the Einstein equations: The Oppenheimer-Snyder model of gravitational collapse extended to the case of non-zero pressure. Arch. Rat. Mech. Anal. 128, 249 (1994)
4. Smoller, J. and Temple, B.: Astrophysical shock wave solutions of the Einstein equations. Phys. Rev. D 51, 2733 (1995)
5. Smoller, J. and Temple, B.: General relativistic shock-waves that extend the Oppenheimer-Snyder model. Arch. Rat. Mech. Anal. (to appear)
6. Smoller, J. and Temple, B.: On the Oppenheimer-Volkoff equations in general relativity. Arch. Rat. Mech. Anal. (to appear)
7. Smoller, J. and Temple, B.: Shock-waves near the Schwarzschild radius and the stability limit for stars. Phys. Rev. D (to appear)
8. Schoen, R. and Yau, S.T.: Commun. Math. Phys. 79, 231 (1981)

Communicated by S.-T. Yau
Commun. Math. Phys. 184, 619 – 628 (1997)
Communications in
Mathematical Physics © Springer-Verlag 1997
Eigenvalue Inequalities and Poincaré Duality in Noncommutative Geometry

Henri Moscovici*
Department of Mathematics, The Ohio State University, Columbus, OH 43210, USA. E-mail: [email protected]
Received: 7 July 1996 / Accepted: 23 September 1996

Abstract: In the context of Connes' noncommutative geometry, eigenvalue inequalities of the type discovered by Vafa and Witten are shown to be a characteristic feature of those spectral geometric spaces of finite topological type that satisfy rational Poincaré duality in K-theory.
1. Introduction

In a much noted paper [VW], Vafa and Witten have produced, by a surprising argument of a topological nature, uniform upper bounds for the first eigenvalues of the twisted Dirac operators $\slashed{D}_A$ on a compact Riemannian spin manifold, formed by coupling the fiducial Dirac operator $\slashed{D}$ to an arbitrary background gauge potential $A$. Furthermore, for odd-dimensional manifolds they proved the existence of a constant $C > 0$, independent of the potential, such that every interval of length $C$ contains at least one eigenvalue of $\slashed{D}_A$. Their proof relies on ingenious and remarkably elementary manipulations with the index theorems of Atiyah-Singer [AS] and of Atiyah-Patodi-Singer [APS]. Our aim here is to point out that the explicit form of the index theorems is not essential and that, ultimately, the Vafa-Witten inequalities are a manifestation of Poincaré duality in K-theory. As such, they continue to hold in the full generality of Connes' noncommutative geometry (Theorem 1), for spectral geometric spaces of finite topological type that satisfy rational Poincaré duality in the sense of [C1, C2]. In particular, they hold for noncommutative spectral manifolds [C3], such as the noncommutative tori or the spectral triple describing the standard model of particle physics [C2], as well as for ordinary (topological) manifolds which may not be smooth but only Lipschitz. More remarkably, they remain true even when the growth of the number of eigenvalues is non-polynomial (Theorem 2), e.g. for duals of discrete, cocompact subgroups of $SO(n, 1)$ or $SU(n, 1)$ (Corollary 1).

* J.S. Guggenheim Fellow. Research supported in part by the U.S. National Science Foundation and the U.S.-Israel Binational Science Foundation.

2. The General Framework

In Connes' noncommutative geometry, the concept of a (compact) geometric space is realized by a spectral triple $(\mathcal{A}, \mathcal{H}, D)$, where $\mathcal{A}$ is an involutive algebra with unit, represented in the Hilbert space $\mathcal{H}$ by bounded operators, and $D = D^*$ is an unbounded selfadjoint operator in $\mathcal{H}$ so that

the resolvent $(D + i)^{-1}$ is compact; (1)

the commutators $[D, a] = Da - aD$ are bounded, for any $a \in \mathcal{A}$. (2)
Such a triple is even if it is further equipped with a $\mathbb{Z}_2$-grading $\gamma \in \mathcal{L}(\mathcal{H})$, $\gamma = \gamma^*$, $\gamma^2 = 1$, so that

$\forall\, a \in \mathcal{A}$, $\gamma a = a\gamma$, while $D\gamma = -\gamma D$. (3)

Otherwise, the triple is odd. The algebra $\mathcal{A}$ plays the role of the algebra of coordinates on the underlying "space" $X$. In the commutative case, the latter is a genuine space, namely the spectrum of the $C^*$-algebra $A$ = the norm closure of $\mathcal{A}$ in $\mathcal{L}(\mathcal{H})$. Without requiring $\mathcal{A}$ to be commutative, we shall assume that

if $a \in \mathcal{A}$ is invertible in $A$, then $a^{-1} \in \mathcal{A}$. (4)

This has as effect that the K-groups of $\mathcal{A}$ and $A$ coincide:
$$K_i(\mathcal{A}) = K_i(A), \qquad i = 0, 1.$$
Let us recall that $K_0(A)$ classifies finite (i.e. finitely generated) projective modules over $A$, or equivalently idempotents in $M_\infty(A)$, while $K_1(A) = \pi_0(GL_\infty(A))$ is the group of connected components of $GL_\infty(A)$. In view of Bott periodicity, one has
$$\pi_n(GL_\infty(A)) \cong K_{n+1}(A),$$
where $n$ only matters mod 2. We next recall that the datum of $D$ as above defines an additive index map
$$\mathrm{Ind}_D : K_n(\mathcal{A}) \to \mathbb{Z}$$
as follows. In the even case, one uses the grading $\gamma$ to decompose $\mathcal{H} = \mathcal{H}^+ \oplus \mathcal{H}^-$, $D$ into $D^+$ and $D^-$, where
$$D^+ = \frac{1 - \gamma}{2}\,D\,\frac{1 + \gamma}{2}, \qquad \mathcal{H}^\pm = \frac{1 \pm \gamma}{2}\,\mathcal{H};$$
for any selfadjoint idempotent $e^* = e^2 = e \in M_q(\mathcal{A})$, the operator
$$e(D^+ \otimes I_q)e : e(\mathcal{H}^+ \otimes \mathbb{C}^q) \to e(\mathcal{H}^- \otimes \mathbb{C}^q)$$
is Fredholm and one defines
$$\mathrm{Ind}_D(e) = \mathrm{Index}\; e(D^+ \otimes I_q)e. \tag{5$_{ev}$}$$
In the odd case, given $U \in GL_q(\mathcal{A})$, $U^*U = UU^* = I_q$, one sets
$$\mathrm{Ind}_D(U) = \mathrm{Index}\,(P \otimes I_q \cdot U \cdot P \otimes I_q), \tag{5$_{odd}$}$$
where $P = \frac{1 + F}{2}$, $F = \mathrm{Sign}(D)$. It is important to remark at this point that
$$\mathrm{Ind}_D(U) = \mathrm{Sf}(D_t), \tag{5$'_{odd}$}$$
where $D_t$ is the periodic one-parameter family of selfadjoint Fredholm operators
$$D_t = D \otimes I_q + t\,U[D \otimes I_q, U^*], \qquad t \in [0, 1],$$
and $\mathrm{Sf}(D_t)$ denotes the corresponding spectral flow (cf. [APS]), counting the net number of eigenvalues which cross the origin as the family turns once around the circle. Note that the assumption (4) allowed us to employ only selfadjoint elements in defining $\mathrm{Ind}_D$.

A more geometric account of the preceding discussion can be given in terms of (Hermitian) finite projective modules over $\mathcal{A}$, which correspond to (Hermitian) vector bundles over the ghost space $X$. To this end, we recall that, given a finite projective (right-)module $\mathcal{E}$ over $\mathcal{A}$, a Hermitian structure on $\mathcal{E}$ is a sesquilinear map
$$(\ ,\ ) : \mathcal{E} \times \mathcal{E} \to \mathcal{A}$$
satisfying the following conditions:
(i) $(\xi a, \eta b) = a^*(\xi, \eta)\,b$, $\forall\, \xi, \eta \in \mathcal{E}$ and $\forall\, a, b \in \mathcal{A}$;
(ii) $(\xi, \xi) \ge 0$;
(iii) $\mathcal{E}$ is self-dual with respect to $(\ ,\ )$.
If we write $\mathcal{E}$ as a direct summand $\mathcal{E} = e\,\mathcal{A}^q$ of a free module $\mathcal{E}_0 = \mathcal{A}^q$, with $e^2 = e = e^* \in M_q(\mathcal{A})$, any Hermitian structure on $\mathcal{E}$ is isomorphic to the one obtained by restricting to $\mathcal{E}$ the Hermitian structure on $\mathcal{E}_0$ given by
$$(\xi, \eta) = \sum_{j=1}^{q}\xi_j^*\eta_j, \qquad \forall\, \xi = (\xi_j),\ \eta = (\eta_j) \in \mathcal{E}_0. \tag{6}$$
Before recalling the notion of connection (cf. [C1]), we need to introduce the $\mathcal{A}$-bimodule of bounded operators on $\mathcal{H}$,
$$\Omega_D^1 \equiv \Omega_D^1(\mathcal{A}) = \Big\{\sum a_j[D, b_j]\ ;\ a_j, b_j \in \mathcal{A}\Big\},$$
which assumes the role of the 1-forms. A connection on the Hermitian finite projective module $\mathcal{E}$ is a $\mathbb{C}$-linear map $\nabla : \mathcal{E} \to \mathcal{E} \otimes_{\mathcal{A}} \Omega_D^1$ such that
(iv) $\nabla(\xi a) = (\nabla\xi)a + \xi \otimes da$, $\forall\, \xi \in \mathcal{E}$, $a \in \mathcal{A}$;
(v) $(\xi, \nabla\eta) - (\nabla\xi, \eta) = d(\xi, \eta)$, $\forall\, \xi, \eta \in \mathcal{E}$;
here $da = [D, a]$, $\forall\, a \in \mathcal{A}$, and if
$$\nabla\xi = \sum \xi_j \otimes \omega_j, \qquad \text{with } \xi_j \in \mathcal{E},\ \omega_j \in \Omega_D^1,$$
then
$$(\nabla\xi, \eta) = \sum \omega_j^*(\xi_j, \eta).$$
An example of such is the Grassmannian connection $\nabla_0$ on $\mathcal{E} = e \cdot \mathcal{A}^q$, given by
$$\nabla_0\,\xi = e\,\eta, \qquad \text{where } (\eta_j) = (d\xi_j).$$
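As a quick check that $\nabla_0$ satisfies the Leibniz rule (iv), the following sketch uses only the derivation property of $d = [D, \cdot\,]$ and the fact that $e\xi = \xi$ for $\xi \in e\,\mathcal{A}^q$. It assumes the usual identification of $\mathcal{E} \otimes_{\mathcal{A}} \Omega_D^1$ with $e\,(\Omega_D^1)^q$, under which $\xi \otimes da$ corresponds to the column $(\xi_j\,da)_j$:

```latex
d(\xi_j a) = [D, \xi_j a] = [D, \xi_j]\,a + \xi_j\,[D, a] = (d\xi_j)\,a + \xi_j\,da,

\nabla_0(\xi a) = e\,\big(d(\xi_j a)\big)_j
              = e\,(d\xi_j)_j\,a + e\,(\xi_j)_j\,da
              = (\nabla_0\xi)\,a + \xi \otimes da .
```

The last step uses $e\,(\xi_j)_j = \xi$, which holds precisely because $\xi$ lies in the range of the idempotent $e$.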
Note also that any two connections differ by a selfadjoint element of HomA (E, E ⊗A 1 ΩD ). Given a Hermitian finite projective module E over A, we can form the Hilbert space E ⊗A H, by completing the algebraic tensor product with respect to the inner product (vi) hξ1 ⊗ η1 , ξ2 ⊗ η2 i = hη1 , (ξ1 , ξ2 ) η2 i, ∀ ξj ∈ E, ηj ∈ H. If, in addition, E is equipped with a connection ∇, then one can define a twisted operator DE,∇ on E ⊗A H by setting (vii) DE,∇P (ξ ⊗ η) = ξ ⊗ Dη + (∇ξ)η, ∀ ξ ∈ E, η ∈ H; here, if ∇ξ = ξj ⊗ ωj , then (∇ξ) η =
X
ξj ⊗ ωj (η) ∈ E ⊗A H .
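It is straightforward to check that $D_{E,\nabla}$ is selfadjoint and Fredholm. Indeed, for the Grassmannian connection one has
$$D_{E,\nabla_0} = e\, (D \otimes I_q)\, e .$$
It then suffices to note that a (selfadjoint) element $T \in \mathrm{Hom}_A(E, E \otimes_A \Omega_D^1)$ acts as a (selfadjoint) bounded operator on the Hilbert space $E \otimes_A H$ via the composition
$$E \otimes_A H \xrightarrow{\ T \otimes I\ } E \otimes_A \Omega_D^1 \otimes_A H \xrightarrow{\ I \otimes \pi\ } E \otimes_A H ,$$
where $\pi(\omega \otimes \eta) = \omega(\eta)$, $\omega \in \Omega_D^1$, $\eta \in H$.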
In this picture, the additive map $\mathrm{Ind}_D : K_n(A) \to \mathbb{Z}$ can be described as follows. In the even case
$$\mathrm{Ind}_D(e) = \mathrm{Index}\, D^{+}_{E,\nabla} , \tag{$6_{\mathrm{ev}}$}$$
where $E = e A^q$, equipped with a Hermitian structure and a connection $\nabla$. In the odd case
$$\mathrm{Ind}_D(U) = \mathrm{Sf}(D_t) , \tag{$6_{\mathrm{odd}}$}$$
where the right hand side refers to the spectral flow of the one-parameter family
$$D_t = D_{E_0,\nabla} + t\, U [D_{E_0,\nabla}, U^{*}] , \qquad t \in [0, 1] .$$
Here $U$ is a unitary in $GL_q(A)$ and $\nabla$ an arbitrary connection on $E_0 = A^q$ compatible with a given Hermitian structure. In the case when $\nabla = \nabla_0$ is the (trivial) Grassmannian connection on $E_0$, the above family coincides with the one appearing in $(5'_{\mathrm{odd}})$.
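As a purely illustrative aside, the spectral flow of such a family can be made concrete in the simplest commutative model. The sketch below assumes $D = -i\, d/d\theta + \tfrac12$ on the circle, with eigenvalues $n + \tfrac12$, and $U = e^{ik\theta}$, so that $U[D, U^{*}] = -k$ and the eigenvalue branches of $D_t$ are $t \mapsto n + \tfrac12 - kt$. The model, the mode cutoff, the function names and the sign convention (+1 for a branch crossing zero from negative to positive) are assumptions made for this example and are not taken from the text.

```python
def circle_branches(k, cutoff=20, shift=0.5):
    """Eigenvalue branches t -> n + shift - k*t of D_t = D + t*U[D, U*].

    Model assumptions: D = -i d/dtheta + shift on the circle and
    U = exp(i*k*theta), for which U[D, U*] = -k. Only the modes with
    |n| <= cutoff are kept.
    """
    return [lambda t, n=n: n + shift - k * t for n in range(-cutoff, cutoff + 1)]


def spectral_flow(branches):
    """Net number of branches crossing zero as t runs from 0 to 1.

    Convention (an assumption): +1 for negative -> positive, -1 for the
    reverse. Comparing endpoints suffices because each branch is monotone.
    """
    flow = 0
    for lam in branches:
        start, end = lam(0.0), lam(1.0)
        if start < 0 < end:
            flow += 1
        elif end < 0 < start:
            flow -= 1
    return flow


# Spectral flow for a few winding numbers k of the unitary U.
flows = {k: spectral_flow(circle_branches(k)) for k in (-2, -1, 0, 1, 2)}
print(flows)
```

In this toy model the flow equals $-k$ under the stated convention, so it detects the winding number of $U$, in line with the role of $\mathrm{Ind}_D(U)$ as a pairing with $K_1$.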
3. Poincaré Duality in K-Theory

The formulation of Poincaré duality in noncommutative geometry involves, in general, a pair of algebras. In its fullest expression, it also involves Kasparov's KK-bifunctor [K1]. However, for the purposes of this paper, the appropriate version is a weaker, rational form of Poincaré duality in K-theory, which we now proceed to state following Connes [C1].

Let $(A, H, D)$ be a spectral triple describing the geometry of a noncommutative space. Assume we are given another spectral triple $(B, H, D)$ such that:
(α) $A$ and $B$ commute in $L(H)$;
(β) $[D, a]$ and $b$ commute, for any $a \in A$, $b \in B$.
Then $(A \otimes B, H, D)$ forms again a spectral triple, and so we can consider the index map
$$\mathrm{Ind}_D : K_n(A \otimes B) \to \mathbb{Z} .$$
Composing it with the natural biadditive map
$$K_i(A) \times K_j(B) \to K_{i+j}(A \otimes B) ,$$
one obtains a biadditive form
$$(\ ,\ )_D : K_i(A) \times K_j(B) \to \mathbb{Z} .$$
We shall say that $(A, H, D)$ and $(B, H, D)$ are in rational Poincaré duality if the form $(\ ,\ )_D$ is nondegenerate.

The prototype on which this notion is patterned arises in the case of an (even-dimensional) closed Riemannian manifold $M$, represented by the signature spectral triple
$$A = C^{\infty}(M) , \qquad H^{\pm} = L^2(M, \Lambda^{\pm}_{\mathbb{C}} T^{*}M) , \qquad D = d + d^{*} ,$$
where the superscript $\pm$ refers to the signature grading. Then $(A, H, D)$ is in Poincaré duality with $(B, H, D)$, where $B = \Gamma(\mathrm{Cliff}_{\mathbb{C}}(T^{*}M))$ is the algebra of $C^{\infty}$-sections of the Clifford bundle $\mathrm{Cliff}_{\mathbb{C}}(T^{*}M)$ (cf. [C1, VI.4.β]).

When $M$ is a Riemannian spin manifold, replacing the signature operator by the Dirac operator $\not D$ acting on the space of $L^2$-sections of the spin bundle $\not S$, one obtains the Dirac spectral triple
$$A = C^{\infty}(M) , \qquad H = L^2(M, \not S) , \qquad D = \not D ,$$
which is self-dual.

A simple noncommutative example is that of the noncommutative 2-torus [C1, loc.cit] $(A_\theta, H, D)$, $\theta \in \mathbb{R}/\mathbb{Z}$, which is in Poincaré duality with $(A_\theta^{o}, H, D)$, where $A_\theta^{o}$ denotes the opposite algebra of $A_\theta$. This is in fact a typical case of noncommutative self-duality, which occurs in the presence of a real structure (see [C2]).
4. Eigenvalue Inequalities

We now fix a spectral space $(A, H, D)$, assumed to admit a rational Poincaré dual $(B, H, D)$. To state the main result, we need one more definition: the underlying noncommutative space is said to be of finite topological type if
$$\dim K_{*}(A) \otimes \mathbb{Q} < \infty .$$

Theorem 1. Let $(A, H, D)$ be a spectral space of finite topological type, which admits a rational Poincaré dual $(B, H, D)$.
1°. There exists a constant $C < \infty$ so that, if $E$ is any Hermitian finite projective module over $A$ with connection $\nabla$ and if
$$|\lambda_1(E, \nabla)| \leq |\lambda_2(E, \nabla)| \leq \dots$$
denote the eigenvalues of the twisted operator $D_{E,\nabla}$ indexed in the order of ascending absolute value, then
$$|\lambda_1(E, \nabla)| \leq C .$$
2°. In the odd case, there also exists a constant $L > 0$ so that every interval of length $L$ contains an eigenvalue of $D_{E,\nabla}$, for any $(E, \nabla)$ as above.

Proof. The proof will follow closely the original arguments of Vafa and Witten (cf. [VW], also [A]), with one notable exception. Instead of pulling back the Bott class from spheres, a procedure which is not available in this generality, we shall make use of the finite topological type assumption.

A. We start with the proof in the even case. Fix a basis $\{\beta_1, \dots, \beta_r\}$ of $K_0(B) \otimes \mathbb{Q}$. Each $\beta_j$, $j = 1, \dots, r$, is of the form
$$\beta_j = [F_j] - [B^{q_j}] .$$
Given $(E, \nabla)$ as in 1°, by virtue of the Poincaré duality hypothesis, there exists $\beta \in \{\beta_1, \dots, \beta_r\}$, $\beta = [F'] - [B^{q}]$, so that
$$([E], \beta)_D \neq 0 . \tag{1}$$
We may assume
$$\mathrm{Index}\, D^{+}_{E,\nabla} = 0 , \tag{2}$$
the statement being otherwise trivially satisfied. Together with (1), this implies
$$([E], [F'])_D \neq 0 . \tag{3}$$
Let us write $F'$ as $f B^N$, with $f^2 = f = f^{*} \in M_N(B)$, and fix a connection $\nabla'$, e.g. the Grassmannian connection, on $F'$. By the very definitions of the index map (Sect. 2) and of the intersection form (Sect. 3), we can express $([E], [F'])_D$ as the index of the operator $D^{+}_{E \otimes F'}$ defined as follows. We view $H$, which is an $A \otimes B$ left module, as an $(A, B^{o})$-bimodule, where $B^{o}$ is the opposite algebra to $B$, acting on $H$ on the right:
$$\eta \cdot b^{o} = b\, \eta , \qquad \eta \in H ,\ b \in B .$$
Similarly, we regard $F'$ as a left $B^{o}$-module:
$$b^{o} \cdot \zeta' = \zeta' b , \qquad \zeta' \in F' ,\ b \in B .$$
We can now form the Hilbert space $E \otimes_A H \otimes_{B^{o}} F'$ as in Sect. 2, and then define the selfadjoint operator $D_{E \otimes F'}$ as follows:
$$D_{E \otimes F'}(\xi \otimes \eta \otimes \zeta') = \nabla \xi \otimes \eta \otimes \zeta' + \xi \otimes D\eta \otimes \zeta' + \xi \otimes \eta \otimes \nabla' \zeta' . \tag{4}$$
Then
$$([E], [F'])_D = \mathrm{Index}\, D^{+}_{E \otimes F'} ,$$
and because of (3) one has
$$\mathrm{Ker}\, D_{E \otimes F'} \neq 0 . \tag{5}$$
Consider now the complementary module $F'' = (1 - f) B^N$, equipped with its (Grassmannian) Hermitian structure and connection $\nabla''$, and let $F = F' \oplus F'' \cong B^N$ denote their direct sum equipped with the direct sum Hermitian structure and connection $\nabla' \oplus \nabla''$. We can form
$$D_{E \otimes F} = D_{E \otimes F'} \oplus D_{E \otimes F''}$$
using $\nabla' \oplus \nabla''$, and also
$$D_{E \otimes F_0} \cong \underbrace{D_{E,\nabla} \oplus \cdots \oplus D_{E,\nabla}}_{N\text{-times}}$$
using the trivial connection $\nabla_0$ on $F_0 = B^N$. From (4) it is obvious that the difference $B = D_{E \otimes F} - D_{E \otimes F_0}$ depends only on the difference of connections on $F \cong F_0$:
$$T_0 = \nabla' \oplus \nabla'' - \nabla_0 \in \Omega_D^1(B)^N ,$$
which is bounded and independent of $(E, \nabla)$. To complete the proof of the even case, it remains to notice, on one hand, that (5) implies that
$$\mathrm{Ker}\, D_{E \otimes F} \neq 0 ,$$
and, on the other hand, that $D_{E \otimes F_0}$ has the same eigenvalues as $D_{E,\nabla}$.

B. Passing now to the odd case, we fix a basis $\{[U_1], \dots, [U_s]\}$ of $K_1(B) \otimes \mathbb{Q}$. By hypothesis,
$$([E], [U])_D \neq 0 \tag{6}$$
for some unitary $U \in \{U_1, \dots, U_s\}$, $U^{*}U = UU^{*} = I$, $U \in GL_N(B)$. Again appealing to the definitions of the index map and intersection form, $([E], [U])_D$ can be expressed as the spectral flow of a one-parameter family of selfadjoint operators, as follows. On the Hilbert space $E \otimes_A H \otimes_{B^{o}} F_0$, where $F_0 = B^N$ is equipped with the Grassmannian connection $\nabla_0$, we consider the operators
$$D_0 = D_{E \otimes F_0} \simeq N\, D_{E,\nabla} \qquad \text{and} \qquad D_1 = (I \otimes U) \cdot D_0 \cdot (I \otimes U^{*}) .$$
Then $([E], [U])_D$ is the spectral flow of the family
$$D_t = (1 - t)\, D_0 + t\, D_1 , \qquad t \in [0, 1] . \tag{7}$$
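The family (7) can be written as $D_t = D_0 + tB$ with $B = D_1 - D_0$, and the argument rests on the elementary fact that a bounded selfadjoint perturbation $tB$, $t \in [0,1]$, moves each eigenvalue by at most $\|B\|$. As an illustration only, the following sketch checks this displacement bound for real symmetric $2 \times 2$ matrices. The particular matrices and all helper names are arbitrary choices made for the example, not objects from the paper.

```python
import math


def eig_sym2(a, b, c):
    """Eigenvalues (ascending) of the real symmetric matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return (mean - radius, mean + radius)


def op_norm_sym2(a, b, c):
    """Operator norm of a symmetric 2x2 matrix: largest absolute eigenvalue."""
    lo, hi = eig_sym2(a, b, c)
    return max(abs(lo), abs(hi))


def max_displacement(d0, pert, steps=100):
    """Largest |lambda_k(D_t) - lambda_k(D_0)| over D_t = D_0 + t*pert, t in [0, 1]."""
    base = eig_sym2(*d0)
    worst = 0.0
    for i in range(steps + 1):
        t = i / steps
        d_t = tuple(x + t * y for x, y in zip(d0, pert))
        current = eig_sym2(*d_t)
        worst = max(worst, max(abs(u - v) for u, v in zip(current, base)))
    return worst


# Matrices are encoded as (a, b, c) for [[a, b], [b, c]]; values are arbitrary.
D0 = (2.0, 0.0, -3.0)
B = (-1.0, 1.5, 0.5)
norm_B = op_norm_sym2(*B)
shift = max_displacement(D0, B)
print(shift <= norm_B)
```

In particular, if some $D_{t_0}$ has the eigenvalue $\lambda$, the same bound places an eigenvalue of $D_0$ within $\|B\|$ of $\lambda$.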
Because of (6), there must be some $t_0 \in [0, 1]$ such that $D_{t_0}$ has a zero-eigenvalue. But $D_{t_0} - D_0 = t_0 B$, with $B = D_1 - D_0$ involving only the bounded operator $U [D \otimes I_N, U^{*}]$. This proves 1° in the odd case, and therefore achieves the proof of the first part of the theorem.

C. To prove 2° it suffices to note that the fact that the family (7) has a nonzero spectral flow actually implies that for any $\lambda \in \mathbb{R}$, there is a $t \in [0, 1]$ so that $D_t$ has eigenvalue $\lambda$. Thus, $D_0$ must have an eigenvalue within $\|B\|$ of $\lambda$.

Remark 1. The statement is symmetric with respect to the algebras $A$ and $B$.

5. Non-smooth and Non-polynomial Growth Examples

The first comment we want to make is to emphasize the topological (not necessarily smooth) nature of the result. This is put in evidence by the fact that the original proof of Vafa-Witten applies verbatim to the signature operator $D$ on a Lipschitz manifold $M$, if instead of the Atiyah-Singer index theorem one uses its extension to Lipschitz manifolds (cf. [T]). In particular, the uniform estimates for the higher eigenvalues
$$|\lambda_N(E, \nabla)| \leq C\, N^{1/n} , \qquad n = \dim M ,$$
with $C < \infty$ independent of the (Lipschitz) Hermitian bundle with connection $(E, \nabla)$, also hold.

Incidentally, the Lipschitz manifold example can be easily fitted in the noncommutative framework discussed above. Thus, one can take $A$ as being the algebra of Lipschitz functions on $M$, $H$ as being the Hilbert space of $L^2$-forms on $M$ and $D$ the signature operator, as in [H]. As in the smooth case [C1], the commutant of the algebra generated by $A$ and $[D, a]$, with $a \in A$, is canonically isomorphic to the algebra of bounded measurable forms on $M$, equipped with the Clifford multiplication. Taking as $B$ the subalgebra of Lipschitz forms (relative to the same Clifford multiplication), one can check with the help of the results in [H] that $(A, H, D)$ and $(B, H, D)$ are in Poincaré duality.

The second and more substantive remark is that the properties stated in Theorem 1 do not represent a strictly finite-dimensional phenomenon. To illustrate this, we shall exhibit below a class of non-polynomial growth, θ-summable, noncommutative spaces (duals of discrete countable groups) which satisfy the hypotheses of Theorem 1.

We begin by recalling Connes' construction of a geometric structure on a dual of a discrete subgroup $\Gamma$ of a Lie group $G$ (see [C1, IV.9.α]). To simplify the discussion, we shall assume that $G$ is connected and semisimple and that $\Gamma$ is torsion-free and cocompact. In particular, $E\Gamma$ can be identified with the symmetric space $X = G/K$, where $K$ is a maximal compact subgroup of $G$. We shall equip $X$ with the canonical $G$-invariant Riemannian metric. Let $H$ be the Hilbert space of $L^2$-forms on $X$, on which $\Gamma$ acts by left translations. Consider the Morse function $2\varphi$, given by the square of the geodesic distance to the base point $\{K\} \in X$, then define (cf. [W])
$$D_\tau = d_\tau + d_\tau^{*} , \tag{1}$$
where $d_\tau = e^{-\tau\varphi}\, d\, e^{\tau\varphi}$, with $\tau \neq 0$. We let $A$ be the closure under the holomorphic functional calculus of the group algebra $\mathbb{C}\Gamma$ in the reduced $C^{*}$-algebra $C_r^{*}(\Gamma)$, and $B = C^{\infty}(\Gamma \backslash X)$, acting by multiplication operators on $H$.

At this point, we need to recall that there is a canonical analytic assembly map (cf. [C1, Chapter II])
$$\mu_r^{\Gamma} : K_i(B\Gamma) \to K_i(C_r^{*}(\Gamma)) ,$$
which was conjectured by Baum and Connes to be an isomorphism. This conjecture has been verified for discrete subgroups of $SO(n, 1)$ (cf. [K2]) and of $SU(n, 1)$ (cf. [JK]). With these preparations, we are ready to state the main result of this section.

Theorem 2. Let $\Gamma$ be a discrete, torsion-free, cocompact subgroup of a connected, noncompact, semisimple Lie group $G$. Assume that the assembly map $\mu_r^{\Gamma} : K_i(B\Gamma) \to K_i(C_r^{*}(\Gamma))$ is a rational isomorphism. Then:
1°. $(A, H, D_\tau)$, with the $\mathbb{Z}_2$-grading $\gamma = (-1)^{\mathrm{degree}}$, is an even spectral triple of finite topological type with rational Poincaré dual $(B, H, D_\tau)$;
2°. $D_\tau$ satisfies the θ-summability property
$$\mathrm{Trace}\, e^{-\beta D_\tau^2} < \infty , \qquad \forall\, \beta > 0 , \tag{2}$$
without being finitely-summable:
$$\mathrm{Trace}\, (1 + D_\tau^2)^{-p} = \infty , \qquad \forall\, p < \infty . \tag{3}$$
Proof. The fact that $(A \otimes B, H, D_\tau)$ is a θ-summable spectral triple satisfying the property
$$[[D, a], b] = 0 , \qquad \forall\, a \in A ,\ b \in B \tag{4}$$
is the content of Prop. 3 in [C1, IV.9.α]. Given the assumption of rational isomorphism of the analytic assembly map $\mu_r^{\Gamma}$, the nondegeneracy of the pairing
$$(\ ,\ )_{D_\tau} : K_i(A) \times K_i(B) \to \mathbb{Z}$$
is a consequence of the following index formula of Connes (Theorem 4, [C1, loc.cit]):
$$(\mu_r^{\Gamma}(x), y)_{D_\tau} = \langle \mathrm{ch}_{*}(x), \mathrm{ch}^{*}(y) \rangle ,$$
for any K-homology class $x \in K_i(B\Gamma)$ and any K-theory class $y \in K_i(B) = K^{i}(\Gamma \backslash X)$. Finally, since $\Gamma$ cannot be amenable, (3) follows from Theorem 1 in [C1, loc.cit].
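To see how properties (2) and (3) can coexist, it may help to look at a toy spectrum with exponentially growing multiplicities. The sketch below assumes an operator with eigenvalue $n$ of multiplicity $2^n$ for $n = 0, 1, 2, \dots$; this spectrum is a hypothetical stand-in chosen for illustration and is not the spectrum of $D_\tau$. For it, the heat trace $\sum 2^n e^{-\beta n^2}$ converges for every $\beta > 0$, while the terms $2^n (1 + n^2)^{-p}$ of the resolvent-power trace eventually grow for every fixed $p$.

```python
import math

LOG2 = math.log(2.0)


def heat_term(beta, n):
    """Level-n contribution 2**n * exp(-beta * n**2) to Trace exp(-beta * D**2)."""
    return math.exp(n * LOG2 - beta * n * n)


def resolvent_term(p, n):
    """Level-n contribution 2**n * (1 + n**2)**(-p) to Trace (1 + D**2)**(-p)."""
    return math.exp(n * LOG2 - p * math.log1p(n * n))


def heat_trace(beta, levels):
    """Partial sum of the heat trace over the levels n = 0, ..., levels - 1."""
    return sum(heat_term(beta, n) for n in range(levels))


# The heat trace has stabilised long before level 100 ...
tail = abs(heat_trace(0.1, 200) - heat_trace(0.1, 100))
# ... whereas the resolvent-power terms keep growing, so that series diverges.
growing = resolvent_term(3, 100) > resolvent_term(3, 50) > 1.0
print(tail < 1e-12, growing)
```

The Gaussian damping $e^{-\beta n^2}$ beats any exponential multiplicity, whereas no fixed power of $(1 + n^2)$ does; this is the mechanism behind θ-summability without finite summability in this toy model.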
We conclude with a consequence which can be stated without any reference to noncommutative geometry.

Let $E$ be a $\Gamma$-equivariant Hermitian vector bundle on $X = G/K$ equipped with a $\Gamma$-equivariant connection $\nabla$. As in [C1, IV.9.α], one constructs the corresponding Hilbert space of $E$-valued forms on $X$,
$$H_E = L^2(X, \Lambda^{*}_{\mathbb{C}} T^{*}X \otimes E) ,$$
on which one can define a twisted analogue of $D_\tau$, namely the operator
$$D_{\tau,E,\nabla} = \nabla_\tau + \nabla_\tau^{*} .$$
With this notation, the following geometric statement is an immediate consequence of the preceding results.

Corollary 1. Let $\Gamma$ be a discrete, torsion-free, cocompact subgroup of $G = SO_0(n, 1)$ or $SU(n, 1)$. There exists a constant $C < \infty$, such that for any $\Gamma$-equivariant Hermitian vector bundle $E$ on $X = G/K$ equipped with a $\Gamma$-equivariant connection $\nabla$, and with
$$|\lambda_1(E, \nabla)| \leq |\lambda_2(E, \nabla)| \leq \dots$$
denoting the increasing sequence of absolute values of eigenvalues for $D_{\tau,E,\nabla}$, one has
$$|\lambda_1(E, \nabla)| \leq C .$$
At the same time, for any $p < \infty$,
$$\sum_{n=1}^{\infty} (1 + |\lambda_n(E, \nabla)|)^{-p} = \infty .$$

References

[A] Atiyah, M.F.: Eigenvalues of the Dirac operator. Lecture Notes in Math. no. 1111, pp. 251–260
[AS] Atiyah, M.F. and Singer, I.M.: The index of elliptic operators III. Ann. of Math. 87, 546–604 (1968)
[APS] Atiyah, M.F., Patodi, V.K. and Singer, I.M.: Spectral asymmetry and Riemannian geometry III. Math. Proc. Camb. Phil. Soc. 79, 71–99 (1976)
[C2] Chamseddine, A. and Connes, A.: The spectral action principle. Preprint
[C1] Connes, A.: Noncommutative Geometry. New York–London: Academic Press (1994)
[C2] Connes, A.: Noncommutative geometry and reality. J. Math. Phys. 36, no. 11 (1995)
[C3] Connes, A.: Gravity coupled with matter and the foundation of noncommutative geometry. Commun. Math. Phys. 182, 155–176 (1996)
[H] Hilsum, H.: Fonctorialité en K-théorie bivariante pour les variétés Lipschitziennes. K-theory 3, 401–440 (1989)
[JK] Julg, P. and Kasparov, G.: Operator K-theory for SU(n, 1). J. Reine Angew. Math. 463, 99–152 (1995)
[K1] Kasparov, G.G.: The operator K-functor and extensions of C*-algebras. Math. U.S.S.R. Izv. 16, no. 3, 513–572 (1981). Translated from Izv. Akad. Nauk. S.S.S.R. Ser. Mat. 44, 571–636 (1980)
[K2] Kasparov, G.G.: Lorentz groups: K-theory of unitary representations and crossed products. Dokl. Akad. Nauk. S.S.S.R. 275, 541–545 (1984)
[T] Teleman, N.: The index theorem for topological manifolds. Acta Math. 153, 117–152 (1984)
[VW] Vafa, C. and Witten, E.: Eigenvalue inequalities for fermions in gauge theories. Commun. Math. Phys. 95, 257–276 (1984)
[W] Witten, E.: Supersymmetry and Morse theory. J. Differential Geom. 17, 661–692 (1982)

Communicated by A. Connes
Commun. Math. Phys. 184, 629 – 652 (1997)
Communications in Mathematical Physics © Springer-Verlag 1997
Basic Properties of Symplectic Dirac Operators

K. Habermann

Mathematisches Institut, Ruhr-Universität Bochum, 44780 Bochum, Germany. E-mail: [email protected]

Received: 1 July 199 / Accepted: 24 September 1996
Abstract: Symplectic Dirac operators, acting on symplectic spinor fields introduced by B. Kostant in geometric quantization, are canonically defined in a similar way as the Dirac operator on Riemannian manifolds. These operators depend on a choice of a metaplectic structure as well as on a choice of a symplectic covariant derivative on the tangent bundle of the underlying manifold. This paper performs a complete study of these relations and shows further basic properties of the symplectic Dirac operators. Various examples are given for illustration.
1. Introduction

The study of relations between analytic properties of certain differential operators acting on sections of vector bundles and geometrical properties of the underlying base manifold is a central topic in differential geometry. There are several classical results, for example the Hodge–de Rham theory. The Hodge–Laplace–Beltrami operator $\Delta$ acting on differential forms is one of the natural operators in global Riemannian geometry, and its kernel gives important topological invariants: the dimension of the kernel of $\Delta$ on $p$-forms over a closed Riemannian manifold is the $p$th Betti number. Other well known and well studied operators are the Kodaira–Hodge–Laplace operator on differential forms with values in a holomorphic vector bundle, or the Dirac operator on Riemannian manifolds.

Symplectic spinor fields were introduced by B. Kostant in [12] in order to give the construction of the half-form bundle and the half-form pairings in the context of geometric quantization. Furthermore, the Kostant-Souriau scheme for geometric quantization of a symplectic manifold $(M, \omega)$ uses the metaplectic representation, based on the Schrödinger representation of the canonical commutation relations (CCR), and metaplectic structures.

Now, it is also possible to define symplectic Dirac operators $D$ and $\tilde D$, acting on symplectic spinor fields, in a canonical way and by a construction analogous to the one in the Riemannian situation. This has been done by the author in [6]. All tools necessary for that (symplectic Clifford algebra, metaplectic group, metaplectic representation, and metaplectic structures) are already known and accepted. Although the whole construction follows the same procedure by which one introduces the classical Riemannian Dirac operator, using the symplectic structure of $M$ instead of the Riemannian metric, we wish to emphasize that the underlying algebraic structure of the symplectic Clifford algebra is completely different. For the classical Clifford algebra we have the relation $X^2 = -\|X\|^2$, whereas the algebraic structure of the symplectic Clifford algebra (also known as the Weyl algebra) is given by $X \cdot Y - Y \cdot X = -\omega(X, Y)$. This implies essentially different properties for the Clifford multiplication, which enters the definition of the Dirac operators.

The definition of $D$ and $\tilde D$ depends on a choice of a metaplectic structure of $M$ as well as on a choice of a symplectic covariant derivative on the tangent bundle $TM$. In this paper we first study the relation between different metaplectic structures of $(M, \omega)$ and the corresponding symplectic Dirac operators. Secondly, this paper gives a complete investigation of the relation between symplectic connections over $(M, \omega)$ and the induced Dirac operators. Furthermore, the present paper is concerned with the formal $L^2$-adjoint operators, gives necessary and sufficient conditions for formal selfadjointness, and states under which conditions the operators are essentially selfadjoint. These observations are important for further investigations of these symplectic Dirac operators, even though the computations are quite elementary. Since the investigation of symplectic manifolds has become a fundamental subject in mathematics, it will be useful to study these symplectic Dirac operators.
2. Notations, Definitions, and Preliminaries

We consider the standard symplectic space $(\mathbb{R}^{2n}, \omega_0)$. Then the group $Sp(2n, \mathbb{R})$ of symplectic transformations has the fundamental group $\pi_1(Sp(2n, \mathbb{R})) \cong \mathbb{Z}$. Its two-fold connected covering group is known as the metaplectic group $Mp(2n, \mathbb{R})$ and we have the central short exact sequence
$$1 \to \mathbb{Z}_2 \to Mp(2n, \mathbb{R}) \xrightarrow{\ \rho\ } Sp(2n, \mathbb{R}) \to 1 .$$
The metaplectic group $Mp(2n, \mathbb{R})$ has a natural representation acting on the Hilbert space $L^2(\mathbb{R}^n)$. Perhaps the best way to introduce this representation is by using the Stone–von Neumann theorem to obtain intertwining operators of irreducible unitary representations of the $(2n+1)$-dimensional Heisenberg group $H^n = \mathbb{R}^{2n} \times \mathbb{R}$. These intertwining operators define the so-called Segal-Shale-Weil representation or metaplectic representation
$$L : Mp(2n, \mathbb{R}) \to \mathrm{Unit}(L^2(\mathbb{R}^n))$$
of the metaplectic group. This representation stabilizes the Schwartz space $S(\mathbb{R}^n) \subset L^2(\mathbb{R}^n)$ of rapidly decreasing smooth functions on $\mathbb{R}^n$, which consists of $C^{\infty}$ vectors for $L$, cf. [3, 10 and 15].

Furthermore, we have the symplectic Clifford multiplication
$$\mu : \mathbb{R}^{2n} \times S(\mathbb{R}^n) \to S(\mathbb{R}^n) , \qquad (v, f) \mapsto v \cdot f = \mu(v, f) = \sigma(v) f ,$$
defined by the Schrödinger quantization prescription
$$1 \in \mathbb{R} \mapsto \sigma(1) = i , \qquad a_j \in \mathbb{R}^{2n} \mapsto \sigma(a_j) = i x_j , \qquad a_{j+n} \in \mathbb{R}^{2n} \mapsto \sigma(a_{j+n}) = \frac{\partial}{\partial x_j} \qquad \text{for } j = 1, \dots, n ,$$
where $a_1, \dots, a_{2n}$ denotes the canonical symplectic basis of $\mathbb{R}^{2n}$ with respect to $\omega_0$. The Clifford multiplication commutes with the action of the metaplectic group, i.e. we have
$$\mu(g(v, f)) = \mu(\rho(g) v, L(g) f) = L(g) \mu(v, f)$$
for any $g \in Mp(2n, \mathbb{R})$, $v \in \mathbb{R}^{2n}$, and $f \in S(\mathbb{R}^n)$. For $x, y \in \mathbb{R}^{2n}$ we have the relation
$$\sigma(x) \circ \sigma(y) - \sigma(y) \circ \sigma(x) = \sigma(\omega_0(y, x)) = i\, \omega_0(y, x) . \tag{1}$$
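Relation (1) can be checked by hand on polynomials. The following sketch does so for $n = 1$, where $\sigma(a_1) = ix$ and $\sigma(a_2) = d/dx$ act on polynomials stored as coefficient lists. The helper names, the test vectors and the sign convention $\omega_0(a_1, a_2) = 1$ are assumptions made for this illustration.

```python
def poly_add(p, q):
    """Add two polynomials given as coefficient lists [c0, c1, ...]."""
    size = max(len(p), len(q))
    p = list(p) + [0] * (size - len(p))
    q = list(q) + [0] * (size - len(q))
    return [a + b for a, b in zip(p, q)]


def poly_scale(c, p):
    """Multiply every coefficient of p by the scalar c."""
    return [c * a for a in p]


def sigma(v, p):
    """Action of v = v1*a_1 + v2*a_2 on p(x): sigma(a_1) = i*x, sigma(a_2) = d/dx."""
    v1, v2 = v
    ix_p = [0] + [1j * a for a in p]               # i * x * p(x)
    dp = [k * p[k] for k in range(1, len(p))]      # p'(x)
    return poly_add(poly_scale(v1, ix_p), poly_scale(v2, dp))


def commutator(x, y, p):
    """(sigma(x) sigma(y) - sigma(y) sigma(x)) applied to p."""
    return poly_add(sigma(x, sigma(y, p)), poly_scale(-1, sigma(y, sigma(x, p))))


def omega0(x, y):
    """Standard symplectic form with omega0(a_1, a_2) = 1 (assumed convention)."""
    return x[0] * y[1] - x[1] * y[0]


def same_poly(p, q, tol=1e-12):
    """True if p and q agree coefficient by coefficient up to tol."""
    return all(abs(c) <= tol for c in poly_add(p, poly_scale(-1, q)))


x, y = (2, -1), (3, 5)
p = [1, 0, -2, 4]                                  # 1 - 2x^2 + 4x^3
lhs = commutator(x, y, p)
rhs = poly_scale(1j * omega0(y, x), p)             # i * omega0(y, x) * p
print(same_poly(lhs, rhs))
```

The check reduces to the one-dimensional canonical commutation relation $[ix, d/dx] = -i$, which is what produces the factor $i\,\omega_0(y, x)$.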
Now let $(M, \omega)$ be a $2n$-dimensional connected symplectic manifold and $\pi : R \to M$ the $Sp(2n, \mathbb{R})$-principal fibre bundle of all symplectic frames over $M$. In order to be able to construct a symplectic spinor bundle, we have to fix a metaplectic structure for $(M, \omega)$. A metaplectic structure on $(M, \omega)$ is a reduction of the symplectic frame bundle $R$ with respect to the double covering $\rho$, i.e. a metaplectic structure is a principal $Mp(2n, \mathbb{R})$ fibre bundle $P$ over $M$ together with a bundle morphism $f : P \to R$ which is equivariant with respect to $\rho : Mp(2n, \mathbb{R}) \to Sp(2n, \mathbb{R})$. The topological obstruction to the existence of a metaplectic structure is a certain cohomology class in $H^2(M, \mathbb{Z}_2)$, which has to vanish. Equivalently, a symplectic manifold admits a metaplectic structure if and only if $c_1(M) \equiv 0 \bmod 2$. In that case the first cohomology group $H^1(M, \mathbb{Z}_2)$ classifies the set of all inequivalent metaplectic structures. Cf. [12].

Finally, we need a symplectic connection to define symplectic Dirac operators. A covariant derivative $\nabla : \Gamma(TM) \to \Gamma(T^{*}M \otimes TM)$ on the tangent bundle $TM$ of a symplectic manifold $(M, \omega)$ is called symplectic if and only if $\nabla \omega = 0$. There is a bijective correspondence between symplectic covariant derivatives on the tangent bundle $TM$ and connection 1-forms $Z : TR \to \mathfrak{sp}(2n, \mathbb{R})$ in the symplectic frame bundle. There are infinitely many symplectic connections on a symplectic manifold, even infinitely many symplectic connections without torsion. The difference between two torsion-free symplectic connections is given by a symmetric covariant 3-tensor. On Kähler manifolds, however, the Levi-Civita connection is a canonical torsion-free symplectic connection. Note that torsion-free symplectic connections enjoy no uniqueness property of the kind known for the Levi-Civita connection in Riemannian geometry. See [14]. However, fixing an almost complex structure $J$ and a Riemannian metric $g$ compatible with $\omega$, i.e. such that $\omega(\ ,\ ) = g(J\ ,\ )$, one has a natural Hermitian connection $\nabla$ on the tangent bundle $TM$. If $\nabla^g$ denotes the Levi-Civita connection with respect to the Riemannian metric $g$ then this Hermitian connection is given by
$$\nabla_X Y = \nabla^g_X Y - \tfrac{1}{2} J((\nabla^g_X J) Y)$$
for vector fields $X$ and $Y$. Cf. [11] and [13].

Using the metaplectic representation of $Mp(2n, \mathbb{R})$ on $L^2(\mathbb{R}^n)$ one can form the bundle of symplectic spinors for $(M, \omega)$ with respect to $P$. We define the symplectic spinor bundle $Q$ to be the Hilbert bundle associated to the metaplectic structure $P$ via the metaplectic representation $L$
$$Q = P \times_L L^2(\mathbb{R}^n) .$$
This bundle has the structure of a topological vector bundle. Nevertheless, it is possible and useful to consider smooth sections in $Q$, using the Schwartz space $S(\mathbb{R}^n)$, which consists of $C^{\infty}$ vectors of the metaplectic representation.

Definition 2.1. A section $\varphi$ of the symplectic spinor bundle $Q$ is called a smooth section if and only if it locally can be written as $\varphi = [s, F]$, where $s : U \to P$ is a smooth local section of the metaplectic structure $P$ and $F : U \to L^2(\mathbb{R}^n)$ is a smooth mapping such that $F(U) \subset S(\mathbb{R}^n)$. $\Gamma(Q)$ denotes the space of all smooth sections of the symplectic spinor bundle.

Remark. The metaplectic representation maps the Schwartz space $S(\mathbb{R}^n)$ onto itself and for any $f \in S(\mathbb{R}^n)$ the map $g \in Mp(2n, \mathbb{R}) \mapsto L(g) f \in S(\mathbb{R}^n)$ is of class $C^{\infty}$. Thus, the above definition is correct.

Additionally, one may consider the associated vector bundle $S = P \times_L S(\mathbb{R}^n)$, which is a subbundle of $Q$. If $\Gamma(S)$ denotes the space of all smooth sections in $S$, we have that $\Gamma(Q) = \Gamma(S)$.

Now, by the $Mp(2n, \mathbb{R})$-equivariance the Clifford multiplication in the fibres defines a Clifford multiplication
$$\mu : TM \otimes S \to S , \qquad X \otimes \varphi \mapsto \mu(X, \varphi) = X \cdot \varphi$$
on the bundle level. By formula (1) we have
$$(X \cdot Y - Y \cdot X) \cdot \varphi = -i\, \omega(X, Y)\, \varphi . \tag{2}$$
Furthermore, $Q$ admits a canonical Hermitian scalar product $\langle\ ,\ \rangle$ given by the $L^2(\mathbb{R}^n)$-scalar product on the fibres. Moreover, this gives the $L^2$-product
$$(\varphi, \psi) = \int_M \langle \varphi, \psi \rangle\, dM$$
for any sections $\varphi$ and $\psi$ in $Q$. Finally, any symplectic covariant derivative $\nabla : \Gamma(TM) \to \Gamma(T^{*}M \otimes TM)$ on the tangent bundle $TM$ of $(M, \omega)$ induces a covariant derivative, the so-called spinor derivative, on the symplectic spinor bundle $S$, which also will be denoted by $\nabla$. For the Clifford multiplication, the spinor derivative, and the Hermitian scalar product we have the following properties on $S$:
$$\langle X \cdot \varphi, \psi \rangle = -\langle \varphi, X \cdot \psi \rangle ,$$
$$\nabla_X (Y \cdot \varphi) = (\nabla_X Y) \cdot \varphi + Y \cdot \nabla_X \varphi ,$$
$$X \langle \varphi, \psi \rangle = \langle \nabla_X \varphi, \psi \rangle + \langle \varphi, \nabla_X \psi \rangle .$$

Having introduced the necessary material we now define symplectic Dirac operators in a canonical way. We consider the $2n$-dimensional symplectic manifold $(M, \omega)$ with symplectic spinor bundle $S$ and fix an almost complex structure $J$ and a Riemannian metric $g$ compatible with $\omega$. Further, let $\nabla$ be any fixed symplectic covariant derivative on the tangent bundle of $M$.
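Definition 2.2. The symplectic Dirac operator $D$ of $(M, \omega)$ with respect to $\nabla$ is defined as the composition of the spinor derivative $\nabla$ and the Clifford multiplication $\mu$,
$$D = \mu \circ \nabla : \Gamma(S) \to \Gamma(T^{*}M \otimes S) \overset{\omega}{\cong} \Gamma(TM \otimes S) \to \Gamma(S) ,$$
where we identify the bundles $T^{*}M$ and $TM$ via the symplectic structure $\omega$ such that $\omega(X,\ ) \cong X$. Using the Riemannian metric $g$ for identifying the bundles $T^{*}M$ and $TM$ we obtain a second Dirac operator $\tilde D$,
$$\tilde D = \mu \circ \nabla : \Gamma(S) \to \Gamma(T^{*}M \otimes S) \overset{g}{\cong} \Gamma(TM \otimes S) \to \Gamma(S) .$$
Locally, $D$ and $\tilde D$ are given by
$$D\varphi = \sum_{j=1}^{n} \{ e_j \cdot \nabla_{f_j} \varphi - f_j \cdot \nabla_{e_j} \varphi \} \qquad \text{and} \qquad \tilde D\varphi = \sum_{j=1}^{n} \{ Je_j \cdot \nabla_{f_j} \varphi - Jf_j \cdot \nabla_{e_j} \varphi \} ,$$
where $e_1, \dots, e_n, f_1, \dots, f_n$ is any local symplectic frame on $M$, and
$$D\varphi = -\sum_{j=1}^{2n} Je_j \cdot \nabla_{e_j} \varphi \qquad \text{and} \qquad \tilde D\varphi = \sum_{j=1}^{2n} e_j \cdot \nabla_{e_j} \varphi ,$$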
where $e_1, \dots, e_{2n}$ denotes a local orthonormal (with respect to $g$) frame on $M$, respectively.

The definition of $D$ and $\tilde D$ depends on a choice of a symplectic covariant derivative on the tangent bundle $TM$ as well as on a choice of a metaplectic structure of $M$. Moreover, $\tilde D$ also depends on an arbitrary almost complex structure compatible with $\omega$. General references for this are [3, 6, 10 and 12].

3. How $\tilde D$ Depends on the Choice of an Almost Complex Structure

Our first task is to show that $\tilde D$ actually does not depend on the choice of a compatible almost complex structure up to isomorphism. The almost complex structure chosen to define the operator $\tilde D$ is not unique. As one knows, a compatible almost complex structure always exists and the set of all such almost complex structures is contractible. Let $J$ and $K$ be two of them. We consider the composition $A = -J \circ K$. Since $\omega(AX, AY) = \omega(X, Y)$ for all vector fields $X$ and $Y$, $A$ defines an isomorphism of the symplectic frame bundle
by
$$\Phi : R \to R , \qquad (e_1, \dots, e_n, f_1, \dots, f_n) \mapsto (Ae_1, \dots, Ae_n, Af_1, \dots, Af_n) .$$
Considering $TM$ as the associated vector bundle $TM \cong R \times_\kappa \mathbb{R}^{2n}$ with respect to the standard representation $\kappa : Sp(2n, \mathbb{R}) \hookrightarrow GL(\mathbb{R}^{2n})$, one has locally
$$A[s, t] = [\Phi(s), t] ,$$
where $s$ is any local section of the symplectic frame bundle $R$. Furthermore, $\Phi$ lifts into the metaplectic structure $P$ and gives an isomorphism $\Psi : P \to P$ such that the diagram
$$\begin{array}{ccc} P & \xrightarrow{\ \Psi\ } & P \\ \downarrow & & \downarrow \\ R & \xrightarrow{\ \Phi\ } & R \end{array}$$
commutes. Then
$$\Xi : S \to S \qquad \text{given by} \qquad [\bar s, f] \mapsto [\Psi(\bar s), f]$$
denotes the induced isomorphism of the symplectic spinor bundle $S = P \times_L S(\mathbb{R}^n)$.

Lemma 3.1. With respect to the isomorphism $\Xi : S \to S$, the Clifford multiplication satisfies
$$\Xi(X \cdot \varphi) = AX \cdot \Xi(\varphi) .$$
Proof. Let $s : U \to R$ be a local section of the symplectic frame bundle and $\bar s : U \to P$ a lift of $s$ into the metaplectic structure. Then locally $X = [s, t]$ and $\varphi = [\bar s, f]$. The definition of the Clifford multiplication gives
$$\Xi(X \cdot \varphi) = \Xi([s, t] \cdot [\bar s, f]) = \Xi([\bar s, \sigma(t) f]) = [\Psi(\bar s), \sigma(t) f]$$
and
$$AX \cdot \Xi(\varphi) = [\Phi(s), t] \cdot [\Psi(\bar s), f] = [\Psi(\bar s), \sigma(t) f] ,$$
which already proves the assertion.
Lemma 3.2. With respect to the isomorphism $\Xi : S \to S$, the spinor derivative satisfies
$$\Xi(\nabla_X \varphi) = \nabla_X (\Xi(\varphi)) .$$
Proof. With $X = [s, t]$ and $\varphi = [\bar s, f]$ as in the proof above we have
$$\Xi(\nabla_X \varphi) = \Xi([\bar s, X(f) + L_{*}(Z(ds(X))) f]) = [\Psi(\bar s), X(f) + L_{*}(Z(ds(X))) f] = \nabla_X (\Xi(\varphi)) .$$

Proposition 3.3. Let $J$ and $K$ be two almost complex structures on $M$ compatible with $\omega$. Further, let $\tilde D_J$ and $\tilde D_K$ denote the symplectic Dirac operators defined with respect to the Riemannian metrics $g_J$ and $g_K$. Then
$$\Xi(\tilde D_K \varphi) = \tilde D_J (\Xi(\varphi)) .$$
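Proof. Let $e_1, \dots, e_n, f_1, \dots, f_n$ be any local symplectic frame on $M$. Using $A \circ K = -J \circ K^2 = J$, we compute
$$\begin{aligned}
\Xi(\tilde D_K \varphi) &= \sum_{j=1}^{n} \{ \Xi(Ke_j \cdot \nabla_{f_j} \varphi) - \Xi(Kf_j \cdot \nabla_{e_j} \varphi) \} \\
&= \sum_{j=1}^{n} \{ A \circ K(e_j) \cdot \Xi(\nabla_{f_j} \varphi) - A \circ K(f_j) \cdot \Xi(\nabla_{e_j} \varphi) \} \\
&= \sum_{j=1}^{n} \{ Je_j \cdot \nabla_{f_j} (\Xi(\varphi)) - Jf_j \cdot \nabla_{e_j} (\Xi(\varphi)) \} \\
&= \tilde D_J (\Xi(\varphi)) ,
\end{aligned}$$
which concludes the proof.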
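The two algebraic facts used in this section, namely that $A = -J \circ K$ preserves $\omega$ and that $A \circ K = J$, can be checked on a small example. The sketch below works on $(\mathbb{R}^2, \omega_0)$ with the standard $J$ and a second compatible structure $K = S J S^{-1}$ obtained from a matrix $S$ of determinant 1. The concrete matrices, the convention $\omega_0(u, v) = u^{T} \Omega v$ and the helper names are assumptions chosen for the example.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def neg(a):
    return [[-x for x in row] for row in a]


OMEGA = [[0, 1], [-1, 0]]      # omega0(u, v) = u^T OMEGA v, so omega0(e, f) = 1
J = [[0, -1], [1, 0]]          # standard structure: J e = f, J f = -e
S = [[2, 1], [1, 1]]           # determinant 1, hence symplectic in dimension 2
S_INV = [[1, -1], [-1, 2]]

K = matmul(matmul(S, J), S_INV)    # a second omega0-compatible structure
A = neg(matmul(J, K))              # A = -J o K


def preserves_omega(m):
    """True if omega0(m u, m v) = omega0(u, v), i.e. m^T OMEGA m = OMEGA."""
    return matmul(matmul(transpose(m), OMEGA), m) == OMEGA


print(matmul(K, K) == neg([[1, 0], [0, 1]]))   # K is an almost complex structure
print(preserves_omega(A))                       # A is symplectic
print(matmul(A, K) == J)                        # A o K = -J o K^2 = J
```

All entries are integers, so the comparisons are exact. The matrix `matmul(OMEGA, K)` is the Gram matrix of $g_K(u, v) = \omega_0(u, Kv)$ and is positive definite here, which is the compatibility of $K$ with $\omega_0$.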
4. Some Technical Definitions and Remarks

Before continuing with our discussion of symplectic Dirac operators we pause to give some technical considerations, which will be useful in our further investigations. The proofs are elementary computations and may be found in [8].

Definition 4.1. For $X \in \Gamma(TM)$ and $A \in \Gamma(T^{*}M \otimes TM)$ we set
$$P(X) = \sum_{j=1}^{n} \{ e_j \cdot \nabla_{f_j} X - f_j \cdot \nabla_{e_j} X \} ,$$
$$\tilde P(X) = \sum_{j=1}^{n} \{ Je_j \cdot \nabla_{f_j} X - Jf_j \cdot \nabla_{e_j} X \} ,$$
and
$$P(A)(X) = \sum_{j=1}^{n} \{ (\nabla_{f_j} A)(X) \cdot e_j - (\nabla_{e_j} A)(X) \cdot f_j \} ,$$
where $e_1, \dots, e_n, f_1, \dots, f_n$ denotes any symplectic frame on $M$.

For $X \in \Gamma(TM)$ and $A \in \Gamma(T^{*}M \otimes TM)$ the expressions $P(X)$, $\tilde P(X)$, and $P(A)(X)$ act on the symplectic spinor bundle $S$ via Clifford multiplication. Further, the context always makes clear which $P$ is used.

Remark. If $A$ is parallel with respect to $\nabla$ then obviously $P(A) = 0$.

Lemma 4.2. For any spinor field $\varphi \in \Gamma(S)$ and any vector field $X$ on $M$ we have
$$D(X \cdot \varphi) = X \cdot D\varphi + P(X) \cdot \varphi - i \nabla_X \varphi$$
and
$$\tilde D(X \cdot \varphi) = X \cdot \tilde D\varphi + \tilde P(X) \cdot \varphi + i \nabla_{JX} \varphi .$$

Lemma 4.3. Let $\varphi \in \Gamma(S)$ be any spinor field on $M$. Then $D(X \cdot \varphi) = P(X) \cdot \varphi$ (resp. $\tilde D(X \cdot \varphi) = \tilde P(X) \cdot \varphi$) for all vector fields $X$ if and only if $\varphi$ is parallel.

Proposition 4.4. Let $X$ be a vector field, such that the corresponding 1-form $\omega(X,\ )$ is closed. Then we have the relation
$$P(JX) - \tilde P(X) = P(J)(X) - i\, \mathrm{div}(JX) .$$
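5. The Operator P

The relation (2) for the Clifford multiplication suggests that it is quite natural to study the differential operator $i(\tilde D D - D \tilde D)$, whereas in (Riemannian) spin geometry the square of the Dirac operator is considered. The principal symbols (see [6])
$$\sigma(D),\ \sigma(\tilde D) : \pi^{*}(S) \to \pi^{*}(S)$$
of the symplectic Dirac operators $D$ and $\tilde D$ are given by
$$\sigma(D)_\eta(\varphi) = X \cdot \varphi \qquad \text{and} \qquad \sigma(\tilde D)_\eta(\varphi) = JX \cdot \varphi$$
with $\eta \in T_x^{*}M$ such that $\eta = \omega(X,\ ) = g(JX,\ )$, $X \in T_xM$ and $\varphi \in S_x$ for $x \in M$. Hence, the operator $i(\tilde D D - D \tilde D)$ has metric principal symbol
$$\begin{aligned}
i\, \sigma(\tilde D D - D \tilde D)_\eta(\varphi) &= i\, (\sigma(\tilde D)_\eta \circ \sigma(D)_\eta\, \varphi - \sigma(D)_\eta \circ \sigma(\tilde D)_\eta\, \varphi) \\
&= i\, (JX \cdot X \cdot \varphi - X \cdot JX \cdot \varphi) \\
&= -g(X, X)\, \varphi .
\end{aligned}$$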
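Taking the formal $L^2$-adjoint operator $\nabla^{*}$ of $\nabla : \Gamma(S) \to \Gamma(T^{*}M \otimes S)$ with respect to the $L^2$-product $(\alpha, \eta)_g$ for $\alpha, \eta \in \Gamma(T^{*}M \otimes S)$ given by the $L^2$-product $(\ ,\ )$ on sections in $Q$ and by the Riemannian metric $g$, we have a Bochner-Laplace operator $\nabla^{*}\nabla$, the spinor Laplacian, which will be denoted by $\Delta_Q : \Gamma(S) \to \Gamma(S)$. For a local orthonormal frame $e_1, \dots, e_{2n}$ on $M$
$$\begin{aligned}
\Delta_Q \varphi &= -\sum_{j=1}^{2n} \{ \nabla_{e_j} \nabla_{e_j} \varphi - \nabla_{\nabla_{e_j} e_j} \varphi \} \\
&= -\sum_{j=1}^{2n} \{ \nabla_{e_j} \nabla_{e_j} \varphi + \mathrm{div}^{\nabla}(e_j)\, \nabla_{e_j} \varphi \} .
\end{aligned}$$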
˜ we have the following Proposition 5.1. For the commutator of the operators D and D Weitzenb¨ock formula ˜ − DD)ϕ ˜ (DD = i∆Q ϕ −
2n X
Jej · ek · RQ (ej , ek )ϕ +
2n X
P (J)(ej ) · ∇Jej ϕ, (3)
j=1
j,k=1
where $e_1,\dots,e_{2n}$ denotes a local orthonormal frame on $M$.

Proof. We assume that $e_1,\dots,e_{2n}$ is a local frame on $M$ that is orthonormal with respect to $g$ as well as symplectic with respect to $\omega$. Then we have the relations $Je_j = e_{j+n}$ and $Je_{j+n} = -e_j$ for $j = 1,\dots,n$, and the Dirac operators $D$ and $\tilde D$ are locally given by
\[ D\varphi = -\sum_{j=1}^{2n} Je_j\cdot\nabla_{e_j}\varphi \quad\text{and}\quad \tilde D\varphi = \sum_{j=1}^{2n} e_j\cdot\nabla_{e_j}\varphi. \]
Furthermore, suppose that $e_1,\dots,e_{2n}$ arises from a frame in $T_xM$ by parallel displacement along geodesics, $x \in M$. Then $\nabla e_j(x) = 0$ for $j = 1,\dots,2n$. Using Lemma 4.2 and Proposition 4.4 we compute
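\[
\begin{aligned}
(D\tilde D - \tilde D D)\varphi
&= \sum_{k=1}^{2n} D(e_k\cdot\nabla_{e_k}\varphi) + \sum_{j=1}^{2n}\tilde D(Je_j\cdot\nabla_{e_j}\varphi) \\
&= \sum_{k=1}^{2n}\{e_k\cdot D(\nabla_{e_k}\varphi) + P(e_k)\cdot\nabla_{e_k}\varphi - i\nabla_{e_k}\nabla_{e_k}\varphi\} \\
&\quad + \sum_{j=1}^{2n}\{Je_j\cdot\tilde D(\nabla_{e_j}\varphi) + \tilde P(Je_j)\cdot\nabla_{e_j}\varphi + i\nabla_{J^2e_j}\nabla_{e_j}\varphi\} \\
&= \sum_{j,k=1}^{2n}\{-e_k\cdot Je_j\cdot\nabla_{e_j}\nabla_{e_k}\varphi + Je_j\cdot e_k\cdot\nabla_{e_k}\nabla_{e_j}\varphi\} \\
&\quad - \sum_{j=1}^{2n}(P(J^2e_j) - \tilde P(Je_j))\cdot\nabla_{e_j}\varphi - 2i\sum_{j=1}^{2n}\nabla_{e_j}\nabla_{e_j}\varphi \\
&= -\sum_{j,k=1}^{2n} Je_j\cdot e_k\cdot R^Q(e_j,e_k)\varphi + i\sum_{j,k=1}^{2n} g(e_k,e_j)\nabla_{e_j}\nabla_{e_k}\varphi - 2i\sum_{j=1}^{2n}\nabla_{e_j}\nabla_{e_j}\varphi \\
&\quad - \sum_{j=1}^{2n} P(J)(Je_j)\cdot\nabla_{e_j}\varphi - i\sum_{j=1}^{2n}\mathrm{div}(e_j)\nabla_{e_j}\varphi \\
&= i\Delta^Q\varphi - \sum_{j,k=1}^{2n} Je_j\cdot e_k\cdot R^Q(e_j,e_k)\varphi + \sum_{j=1}^{2n} P(J)(e_j)\cdot\nabla_{Je_j}\varphi.
\end{aligned}
\]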
Definition 5.2. We define the operator $P : \Gamma(S) \to \Gamma(S)$ to be
\[ P = i[\tilde D, D] = i(\tilde D D - D\tilde D). \]
Like the definition of $\tilde D$, this operator $P$ also depends on the choice of an almost complex structure on $M$ compatible with $\omega$.

Proposition 5.3. Let $J$ and $K$ be two almost complex structures compatible with $\omega$ and let $P_J = i[\tilde D_J, D]$, $P_K = i[\tilde D_K, D]$ denote the corresponding operators. Then
\[ \Xi(P_K\varphi) = P_J(\Xi(\varphi)). \]

Proof. The assertion is a direct consequence of Proposition 3.3 and the fact that the operator $D$ depends only on the symplectic structure $\omega$.

6. On the Dependence of the Symplectic Dirac Operators on the Metaplectic Structure Defining it

Extending methods developed for the classical Riemannian Dirac operator (see [1] and [4]) to our situation, we perform a study of symplectic Dirac operators with respect to
different metaplectic structures of a symplectic manifold in this section. Furthermore, we refer to [8], where this approach is described in full detail for the symplectic case.

The set of all inequivalent metaplectic structures is classified by the first cohomology group $H^1(M,\mathbb{Z}_2)$. Thus, the difference of two metaplectic structures $(P_1,f_1)$ and $(P_2,f_2)$ over $(M,\omega)$ gives an element in $H^1(M,\mathbb{Z}_2)$, which in turn defines a complex line bundle $E$. This complex line bundle may be constructed making use of deformations of metaplectic structures. The deformation of two metaplectic structures is a reduction $(S,\tau)$ of the symplectic frame bundle $R$ with respect to the projection $Sp(2n,\mathbb{R})\times\mathbb{Z}_2 \to Sp(2n,\mathbb{R})$ onto the first component, consisting of an $Sp(2n,\mathbb{R})\times\mathbb{Z}_2$ principal fibre bundle over $M$ and a twofold covering $\tau : S \to R$. It then follows that $S$ is connected if and only if there is no isomorphism between $(P_1,f_1)$ and $(P_2,f_2)$. Then the complex line bundle $E$ is defined to be the associated bundle $E = S\times_\alpha\mathbb{C}$, where $\alpha$ denotes the action of $Sp(2n,\mathbb{R})\times\mathbb{Z}_2$ on $\mathbb{C}$ given by $\alpha(A,a)z = az$. Since $E$ is the complexification of a real line bundle one has $2c_1(E) = 0$. If $Q_1 = P_1\times_L L^2(\mathbb{R}^n)$ and $Q_2 = P_2\times_L L^2(\mathbb{R}^n)$ are the corresponding symplectic spinor bundles, we have an isomorphism $\beta : E\otimes Q_1 \to Q_2$ of these vector bundles. If the complex line bundle $E$ according to the metaplectic structures $(P_1,f_1)$ and $(P_2,f_2)$ is trivial, then the symplectic spinor bundles $Q_1$ and $Q_2$ are isomorphic. In particular, $E$ is trivial if $(P_1,f_1)$ and $(P_2,f_2)$ are isomorphic metaplectic structures or if the second cohomology group $H^2(M,\mathbb{Z})$ has no torsion of order 2.

Studying the Dirac operators $D$ and $\tilde D$ according to two different metaplectic structures $(P_1,f_1)$ and $(P_2,f_2)$, one sees that the Dirac operators in $Q_2$ correspond to twisted Dirac operators in $E\otimes Q_1$ coupled to a certain connection $\nabla$ in the line bundle $E$.
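The symplectic connection fixed to define the symplectic Dirac operators also induces a covariant derivative $\nabla : \Gamma(E) \to \Gamma(T^*M\otimes E)$ in the complex line bundle $E$. One easily sees that $\nabla$ is flat. Now we introduce twisted Dirac operators
\[ D_\nabla, \tilde D_\nabla : \Gamma(E\otimes S_1) \to \Gamma(E\otimes S_1) \]
given by $\nabla$. For this we consider the covariant derivative
\[ \mathrm{id}\otimes\nabla^1 + \nabla\otimes\mathrm{id} : \Gamma(E\otimes S_1) \to \Gamma(T^*M\otimes E\otimes S_1) \]
and the Clifford multiplication $\mu : \Gamma(TM\otimes S_1) \to \Gamma(S_1)$. Then $D_\nabla$ and $\tilde D_\nabla$ are defined to be the compositions
\[ D_\nabla : \Gamma(E\otimes S_1) \xrightarrow{\ \mathrm{id}\otimes\nabla^1+\nabla\otimes\mathrm{id}\ } \Gamma(T^*M\otimes E\otimes S_1) \overset{\omega}{\cong} \Gamma(TM\otimes E\otimes S_1) \xrightarrow{\ \mu\ } \Gamma(E\otimes S_1) \]
and
\[ \tilde D_\nabla : \Gamma(E\otimes S_1) \xrightarrow{\ \mathrm{id}\otimes\nabla^1+\nabla\otimes\mathrm{id}\ } \Gamma(T^*M\otimes E\otimes S_1) \overset{g}{\cong} \Gamma(TM\otimes E\otimes S_1) \xrightarrow{\ \mu\ } \Gamma(E\otimes S_1), \]
where the cotangent bundle $T^*M$ and the tangent bundle $TM$ are identified via the symplectic structure $\omega$ and the Riemannian metric $g$, respectively.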
Lemma 6.1. Locally, the operators $D_\nabla, \tilde D_\nabla : \Gamma(E\otimes S_1) \to \Gamma(E\otimes S_1)$ are given by
\[ D_\nabla(e\otimes\varphi) = e\otimes D_1\varphi + \sum_{j=1}^{n}\{\nabla_{f_j}e\otimes e_j\cdot\varphi - \nabla_{e_j}e\otimes f_j\cdot\varphi\} \]
and
\[ \tilde D_\nabla(e\otimes\varphi) = e\otimes\tilde D_1\varphi + \sum_{j=1}^{n}\{\nabla_{f_j}e\otimes Je_j\cdot\varphi - \nabla_{e_j}e\otimes Jf_j\cdot\varphi\}, \]
where $e_1,\dots,e_n,f_1,\dots,f_n$ is any local symplectic frame on $M$. A direct calculation gives
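Proposition 6.2. Let $D_j, \tilde D_j : \Gamma(S_j) \to \Gamma(S_j)$ with $j = 1,2$ be the symplectic Dirac operators defined according to the metaplectic structures $(P_1,f_1)$ and $(P_2,f_2)$. Then the following diagrams are commutative.
\[
\begin{array}{ccc}
\Gamma(E\otimes S_1) & \xrightarrow{\ \beta\ } & \Gamma(S_2) \\
\downarrow D_\nabla & & \downarrow D_2 \\
\Gamma(E\otimes S_1) & \xrightarrow{\ \beta\ } & \Gamma(S_2)
\end{array}
\qquad\qquad
\begin{array}{ccc}
\Gamma(E\otimes S_1) & \xrightarrow{\ \beta\ } & \Gamma(S_2) \\
\downarrow \tilde D_\nabla & & \downarrow \tilde D_2 \\
\Gamma(E\otimes S_1) & \xrightarrow{\ \beta\ } & \Gamma(S_2)
\end{array}
\]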
In order to understand the relation between the operators $D$, $\tilde D$ and $P$ according to different metaplectic structures in detail, it will be useful to choose a local non-vanishing section $e$ in $E$. Locally, this section gives an isomorphism between $Q_1$ and $E\otimes Q_1$ defined by $\varphi \in Q_1 \mapsto e\otimes\varphi \in E\otimes Q_1$. Hence, the map
\[ \beta_e : Q_1 \to E\otimes Q_1 \to Q_2, \qquad \varphi \mapsto e\otimes\varphi \mapsto \beta(e\otimes\varphi) \]
defines, locally, an isomorphism between $Q_1$ and $Q_2$.

Remark. Note that the following investigations are local considerations depending on the choice of the local section $e$ in $E$. In the case that the complex line bundle $E$ is trivial, a global section $e \in \Gamma(E)$ can be chosen, the symplectic spinor bundles $Q_1$ and $Q_2$ are isomorphic to each other, and the following relations hold globally. Moreover, the section $e$ defines by $\nabla_X e = w_e(X)e$ a closed complex-valued 1-form $w_e$. Let $X_e$ denote the complex vector field related to $w_e$, i.e. $X_e$ is given by the relation $w_e = \omega(X_e,\cdot)$, where $\omega$ is extended in a $\mathbb{C}$-linear way.

Proposition 6.3. Let $e$ be any local non-vanishing section in $E$. Then the diagram
\[
\begin{array}{ccccc}
\Gamma(S_1) & \xrightarrow{\ e\otimes\ } & \Gamma(E\otimes S_1) & \xrightarrow{\ \beta\ } & \Gamma(S_2) \\
\downarrow D_1 + X_e\cdot & & \downarrow D_\nabla & & \downarrow D_2 \\
\Gamma(S_1) & \xrightarrow{\ e\otimes\ } & \Gamma(E\otimes S_1) & \xrightarrow{\ \beta\ } & \Gamma(S_2)
\end{array}
\]
(resp. the same diagram with vertical maps $\tilde D_1 + JX_e\cdot$, $\tilde D_\nabla$ and $\tilde D_2$)
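commutes. That means, locally, we have the following identities:
\[ D_2\circ\beta_e(\varphi) = \beta_e(D_1\varphi + X_e\cdot\varphi) \quad\text{and}\quad \tilde D_2\circ\beta_e(\varphi) = \beta_e(\tilde D_1\varphi + JX_e\cdot\varphi). \]

Proof. Let $e_1,\dots,e_n,f_1,\dots,f_n$ denote any local symplectic frame. Then
\[
\begin{aligned}
D_\nabla(e\otimes\varphi) &= e\otimes D_1\varphi + \sum_{j=1}^{n}\{w_e(f_j)\,e\otimes e_j\cdot\varphi - w_e(e_j)\,e\otimes f_j\cdot\varphi\} \\
&= e\otimes\Big\{D_1\varphi + \sum_{j=1}^{n}\{\omega(X_e,f_j)e_j + \omega(e_j,X_e)f_j\}\cdot\varphi\Big\} \\
&= e\otimes\{D_1\varphi + X_e\cdot\varphi\}
\end{aligned}
\]
and
\[
\begin{aligned}
\tilde D_\nabla(e\otimes\varphi) &= e\otimes\tilde D_1\varphi + \sum_{j=1}^{n}\{w_e(f_j)\,e\otimes Je_j\cdot\varphi - w_e(e_j)\,e\otimes Jf_j\cdot\varphi\} \\
&= e\otimes\Big\{\tilde D_1\varphi + J\Big(\sum_{j=1}^{n}\{\omega(X_e,f_j)e_j + \omega(e_j,X_e)f_j\}\Big)\cdot\varphi\Big\} \\
&= e\otimes\{\tilde D_1\varphi + JX_e\cdot\varphi\},
\end{aligned}
\]
which already gives the assertion.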
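7. The Operator P with Respect to Different Metaplectic Structures

In this section we study the relation between the operators $P_1$ and $P_2$ defined according to two different metaplectic structures $(P_1,f_1)$ and $(P_2,f_2)$ over $M$.

Proposition 7.1. Let $e$ denote any local non-vanishing section in $E$. Then we have for all symplectic spinor fields $\varphi \in \Gamma(S_1)$,
\[ P_2\circ\beta_e(\varphi) = \beta_e\big(P_1\varphi - 2\nabla^1_{JX_e}\varphi - iP(J)(X_e)\cdot\varphi - (\mathrm{div}(JX_e) + |X_e|^2)\varphi\big). \]

Proof. We proved for $D_j$ and $\tilde D_j$ with $j = 1,2$ the relations
\[ D_2\circ\beta_e(\varphi) = \beta_e(D_1\varphi + X_e\cdot\varphi) \quad\text{and}\quad \tilde D_2\circ\beta_e(\varphi) = \beta_e(\tilde D_1\varphi + JX_e\cdot\varphi). \]
Using Sect. 4 this implies
\[
\begin{aligned}
P_2\circ\beta_e(\varphi) &= i\tilde D_2\circ D_2\circ\beta_e(\varphi) - iD_2\circ\tilde D_2\circ\beta_e(\varphi) \\
&= i\{\tilde D_2\circ\beta_e(D_1\varphi + X_e\cdot\varphi) - D_2\circ\beta_e(\tilde D_1\varphi + JX_e\cdot\varphi)\} \\
&= i\{\beta_e(\tilde D_1D_1\varphi + \tilde D_1(X_e\cdot\varphi) + JX_e\cdot D_1\varphi + JX_e\cdot X_e\cdot\varphi) \\
&\qquad - \beta_e(D_1\tilde D_1\varphi + D_1(JX_e\cdot\varphi) + X_e\cdot\tilde D_1\varphi + X_e\cdot JX_e\cdot\varphi)\} \\
&= \beta_e\big(P_1\varphi + i\{-JX_e\cdot D_1\varphi - P(JX_e)\cdot\varphi + i\nabla^1_{JX_e}\varphi - X_e\cdot\tilde D_1\varphi \\
&\qquad + X_e\cdot\tilde D_1\varphi + \tilde P(X_e)\cdot\varphi + i\nabla^1_{JX_e}\varphi + JX_e\cdot D_1\varphi + i\omega(X_e,JX_e)\varphi\}\big) \\
&= \beta_e\big(P_1\varphi - i\{P(JX_e) - \tilde P(X_e)\}\cdot\varphi - 2\nabla^1_{JX_e}\varphi - |X_e|^2\varphi\big) \\
&= \beta_e\big(P_1\varphi - iP(J)(X_e)\cdot\varphi - \mathrm{div}(JX_e)\varphi - 2\nabla^1_{JX_e}\varphi - |X_e|^2\varphi\big).
\end{aligned}
\]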
Here we note that the 1-form $w_e$ is closed, which allows us to apply Proposition 4.4 in the last equation.

Example. The flat complex torus $T^n_{\mathbb{C}}$. On $T^n_{\mathbb{C}}$ we have the Kähler form as a symplectic structure and the Levi-Civita connection as a canonical symplectic and torsion-free connection. We consider $\mathbb{R}^{2n} \cong \mathbb{C}^n$ in coordinates $(x_1,\dots,x_{2n})$ and let $\Gamma$ be a lattice in $\mathbb{R}^{2n}$. Then $\partial_j = \partial/\partial x_j$ for $j = 1,\dots,2n$ yields a global symplectic frame on $T^n_{\mathbb{C}} = \mathbb{C}^n/\Gamma$. This gives a trivialization of the symplectic frame bundle $R = T^n_{\mathbb{C}}\times Sp(2n,\mathbb{R})$ and thus a canonical metaplectic structure $(P_1,f_1)$ on $T^n_{\mathbb{C}}$, where $P_1 = T^n_{\mathbb{C}}\times Mp(2n,\mathbb{R})$ and $f_1 = \mathrm{id}\times\rho$. For the associated symplectic spinor bundle $Q_1 = T^n_{\mathbb{C}}\times L^2(\mathbb{R}^n)$ all functions $f \in \mathcal{S}(\mathbb{R}^n)$ define global parallel smooth sections via $\varphi = (\cdot, f)$, i.e. $\varphi_x = (x,f)$ for $x \in T^n_{\mathbb{C}}$. Hence, there are infinitely many parallel sections in $\Gamma(S_1)$. Conversely, any parallel section of $S_1$ is given in this way by a function $f \in \mathcal{S}(\mathbb{R}^n)$. The Weitzenböck formula (3) implies that $P_1$ is the spinor Laplacian on smooth sections in $Q_1$,
\[ P_1 = \Delta^{Q_1} : \Gamma(S_1) \to \Gamma(S_1). \]
Thus, since $T^n_{\mathbb{C}}$ is compact, the kernel of $P_1$ consists of all parallel sections $\varphi \in \Gamma(S_1)$.

Now, let $(P_2,f_2)$ denote a second non-isomorphic metaplectic structure on $T^n_{\mathbb{C}}$. Since $H^2(T^n_{\mathbb{C}},\mathbb{Z})$ has no torsion of order 2, the complex line bundle $E$ is trivial and thus all spinor bundles over $T^n_{\mathbb{C}}$ are isomorphic to each other. A global section $e$ in $E$ is a map $e : S \to \mathbb{C}$, so that any non-vanishing $e$ cannot be constant. Since $S$ is connected the section $e$ cannot be parallel, which gives $X_e \not\equiv 0$. Furthermore, one sees that $\mathrm{div}(JX_e) + |X_e|^2 \not\equiv 0$ is always satisfied. Otherwise, we would have
\[ 0 = \int_{T^n_{\mathbb{C}}} (\mathrm{div}(JX_e) + |X_e|^2)\,dT^n_{\mathbb{C}} = \int_{T^n_{\mathbb{C}}} |X_e|^2\,dT^n_{\mathbb{C}}, \]
which contradicts $X_e \not\equiv 0$. Using that $T^n_{\mathbb{C}}$ is Kählerian, i.e. $P(J) = 0$, we obtain for the operator $P_2$ with respect to the metaplectic structure $(P_2,f_2)$
\[ P_2\circ\beta_e(\varphi) = \beta_e\big(P_1\varphi - 2\nabla^1_{JX_e}\varphi - (\mathrm{div}(JX_e) + |X_e|^2)\varphi\big). \tag{4} \]
Proposition 7.2. On the flat complex torus $T^n_{\mathbb{C}} = \mathbb{C}^n/\Gamma$ only the operators $P$, $D$, and $\tilde D$ defined with respect to the canonical metaplectic structure admit a non-trivial kernel. This kernel consists of all parallel smooth symplectic spinor fields on $T^n_{\mathbb{C}}$, i.e. it is nothing else but the Schwartz space $\mathcal{S}(\mathbb{R}^n)$.
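Proof. Formula (4) gives that $\beta_e\varphi \in \ker P_2$ if and only if
\[ P_1\varphi - 2\nabla^1_{JX_e}\varphi - (\mathrm{div}(JX_e) + |X_e|^2)\varphi = 0. \]
This equation implies
\[ \|\nabla^1\varphi\|^2 - JX_e(\|\varphi\|^2) = \int_{T^n_{\mathbb{C}}} (\mathrm{Re}(\mathrm{div}(JX_e)) + |X_e|^2)\,|\varphi|^2\,dT^n_{\mathbb{C}}. \tag{5} \]
Equation (5) also holds for any section $\alpha e$ instead of $e$, where $\alpha \in \mathbb{R}^*$ is an arbitrary real non-zero constant, i.e. we have
\[ \|\nabla^1\varphi\|^2 - \alpha JX_e(\|\varphi\|^2) = \int_{T^n_{\mathbb{C}}} (\alpha\,\mathrm{Re}(\mathrm{div}(JX_e)) + \alpha^2|X_e|^2)\,|\varphi|^2\,dT^n_{\mathbb{C}} \]
for any $\alpha \in \mathbb{R}^*$. Taking a sequence $\alpha_n \to 0$ with $\alpha_n \neq 0$ for all indices $n$, we obtain $\|\nabla^1\varphi\|^2 = 0$, which gives that $\varphi$ is parallel with respect to $\nabla^1$. Hence, the function $|\varphi|^2$ is constant and (5) implies
\[ |\varphi|^2\int_{T^n_{\mathbb{C}}} |X_e|^2\,dT^n_{\mathbb{C}} = 0. \]
Since $X_e \not\equiv 0$ this says $\varphi \equiv 0$ and $\ker P_2 = 0$. Furthermore, for $P_1$ we see
\[ \ker P_1 = \ker D_1 = \ker\tilde D_1. \]
In fact, since the kernel of $P_1$ consists of all parallel sections, the inclusions $\ker P_1 \subset \ker D_1$ and $\ker P_1 \subset \ker\tilde D_1$ are obvious. Conversely, let $\varphi \in \ker D_1$. Then
\[ \|\nabla^1\varphi\|^2 = (P_1\varphi,\varphi) = i(\tilde D_1\circ D_1\varphi - D_1\circ\tilde D_1\varphi,\varphi) = i(D_1\varphi,\tilde D_1\varphi) - i(\tilde D_1\varphi,D_1\varphi) = 0, \]
which gives $\varphi$ parallel and thus $\ker D_1 \subset \ker P_1$. In the same way one sees $\ker\tilde D_1 \subset \ker P_1$.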
8. Symplectic Connections and the induced Dirac Operators

Here, the developed concepts of Riemannian Dirac operators defined with respect to different metric connections are carried over from the orthogonal to the symplectic case. See also [7]. For the Riemannian Dirac operators, we refer to [1]. First we prove an elementary property of the symplectic Clifford multiplication. Note that the corresponding observations in the Riemannian situation are purely algebraic.

Lemma 8.1. Let $a_1,\dots,a_{2n}$ be the canonical symplectic basis of $\mathbb{R}^{2n}$ with respect to the standard symplectic structure $\omega_0$. Then the subspace of the symplectic Clifford algebra spanned by all elements of the form $a_j$, $j = 1,\dots,2n$, and $a_i\cdot a_j\cdot a_k$, $1 \le i \le j \le k \le 2n$, is not contained in the kernel of the homomorphism
\[ sCliff(\mathbb{R}^{2n}) \to L(L^2(\mathbb{R}^n)), \qquad a_i\cdot a_j\cdot\ldots\cdot a_k \mapsto \sigma(a_i)\circ\sigma(a_j)\circ\ldots\circ\sigma(a_k), \quad i,j,k = 1,\dots,2n, \]
defined by the Clifford multiplication.

Proof. First of all we consider the Schwartz space $\mathcal{S}(\mathbb{R}^n)$, the operators $X_j = ix_j$ of multiplication by $ix_j$, and the operators $D_j = \partial/\partial x_j$ for $j = 1,\dots,n$, acting as continuous operators on $\mathcal{S}(\mathbb{R}^n)$. Furthermore, let $A : \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$ be given by
\[
\begin{aligned}
A &= \sum_{1\le i\le j\le k\le n} A^1_{ijk}\,X_i\circ X_j\circ X_k + \sum_{k=1}^{n}\sum_{1\le i\le j\le n} A^2_{ijk}\,X_i\circ X_j\circ D_k \\
&\quad + \sum_{k=1}^{n}\sum_{1\le i\le j\le n} A^3_{ijk}\,D_i\circ D_j\circ X_k + \sum_{1\le i\le j\le k\le n} A^4_{ijk}\,D_i\circ D_j\circ D_k \\
&\quad + \sum_{k=1}^{n} A^1_k X_k + \sum_{k=1}^{n} A^2_k D_k
\end{aligned}
\]
with real coefficients $A^\alpha_{ijk}$ and $A^\beta_k$. If $Af = 0$ holds for all $f \in \mathcal{S}(\mathbb{R}^n)$, then $A^\alpha_{ijk} = 0$ and $A^\beta_k = 0$ for all indices. Recalling the definition of the symplectic Clifford multiplication, this proves the assertion.
Definition 8.2. Let $\nabla : \Gamma(TM) \to \Gamma(T^*M\otimes TM)$ be any symplectic covariant derivative on the tangent bundle of $(M,\omega)$. Then $\mathrm{div}^\nabla$ denotes the corresponding divergence operator, i.e. for each vector field $X$ the divergence $\mathrm{div}^\nabla(X)$ is defined by
\[ \mathrm{div}^\nabla(X) = \sum_{j=1}^{n}\{\omega(\nabla_{e_j}X,f_j) + \omega(e_j,\nabla_{f_j}X)\}, \]
where $e_1,\dots,e_n,f_1,\dots,f_n$ denotes any local symplectic frame on $M$.

Remark. If $e_1,\dots,e_n,f_1,\dots,f_n$ denotes a local frame which is symplectic with respect to $\omega$ as well as orthonormal with respect to $g$, then one has $Je_j = f_j$ and $Jf_j = -e_j$ for $j = 1,\dots,n$ and the divergence operator for $\nabla$ satisfies
\[ \mathrm{div}^\nabla(X) = \sum_{j=1}^{n}\{\omega(\nabla_{e_j}X,f_j) + \omega(e_j,\nabla_{f_j}X)\} = \sum_{j=1}^{n}\{g(e_j,\nabla_{e_j}X) + g(f_j,\nabla_{f_j}X)\}. \]
Thus, $\mathrm{div}^\nabla(X)$ is defined as one knows it from Riemannian geometry.

Remark. If $e_1,\dots,e_n,f_1,\dots,f_n$ is any local symplectic frame, then
\[ \mathrm{div}^\nabla(e_k) = \sum_{j=1}^{n}\omega(\nabla_{e_j}f_j - \nabla_{f_j}e_j, e_k) \quad\text{and}\quad \mathrm{div}^\nabla(f_k) = \sum_{j=1}^{n}\omega(\nabla_{e_j}f_j - \nabla_{f_j}e_j, f_k) \]
holds for $k = 1,\dots,n$.

Definition 8.3. Now, the symplectic divergence operator $\mathrm{sdiv}^\nabla$ is defined with respect to a symplectic covariant derivative $\nabla : \Gamma(TM) \to \Gamma(T^*M\otimes TM)$ by
\[ \mathrm{sdiv}^\nabla(X) = \sum_{j=1}^{n}\{\omega(e_j,\nabla_{e_j}X) + \omega(f_j,\nabla_{f_j}X)\}, \]
where $e_1,\dots,e_n,f_1,\dots,f_n$ denotes any local symplectic frame on $M$.

Remark. If $e_1,\dots,e_n,f_1,\dots,f_n$ is any local symplectic frame, then
\[ \mathrm{sdiv}^\nabla(e_k) = -\sum_{j=1}^{n}\omega(\nabla_{e_j}e_j + \nabla_{f_j}f_j, e_k) \quad\text{and}\quad \mathrm{sdiv}^\nabla(f_k) = -\sum_{j=1}^{n}\omega(\nabla_{e_j}e_j + \nabla_{f_j}f_j, f_k) \]
holds for $k = 1,\dots,n$.

In the sequel let $\nabla$ as well as $\overline{\nabla}$ denote symplectic covariant derivatives on the tangent bundle $TM$. Then $A \in \Gamma(T^*M\otimes T^*M\otimes TM)$ is defined to be the (2,1)-tensor field given by
\[ A(X,Y) = \overline{\nabla}_XY - \nabla_XY. \]
Thus, $\omega(A(X,Y),Z) = \omega(A(X,Z),Y)$, since $\nabla$ and $\overline{\nabla}$ are symplectic. With
\[ T(X,Y) = \nabla_XY - \nabla_YX - [X,Y] \quad\text{and}\quad \overline{T}(X,Y) = \overline{\nabla}_XY - \overline{\nabla}_YX - [X,Y] \]
one obtains $A(X,Y) = (\overline{T} - T)(X,Y) + A(Y,X)$.

Definition 8.4. In the situation given above $A$ defines a (3,0)-tensor field $K \in \Gamma(T^*M\otimes T^*M\otimes T^*M)$ by
\[ K(X,Y,Z) = \omega(A(X,Y),Z). \]
Moreover, we define a symmetric (3,0)-tensor field $S$ by
\[ S(X,Y,Z) = K(X,Y,Z) + K(Y,Z,X) + K(Z,X,Y) \]
and a further symmetric (3,0)-tensor field $\tilde S$ by
\[ \tilde S(X,Y,Z) = K(X,JY,JZ) + K(Y,JZ,JX) + K(Z,JX,JY). \]

Remark. If $\nabla$ and $\overline{\nabla}$ have the same torsion, i.e. if $T \equiv \overline{T}$, then $K$ itself is symmetric and $S = 3K$. Moreover, one sees that $K(X,JY,JZ) = K(X,Y,Z)$ if and only if $\overline{\nabla}J = \nabla J$.

Proposition 8.5. For two symplectic connections $\nabla$ and $\overline{\nabla}$ the corresponding Dirac operators $D^\nabla$ and $D^{\overline{\nabla}}$ coincide, $D^{\overline{\nabla}} = D^\nabla$, if and only if $S \equiv 0$ and $\mathrm{div}^{\overline{\nabla}} = \mathrm{div}^\nabla$. Furthermore, $\tilde D^{\overline{\nabla}} = \tilde D^\nabla$ if and only if $\tilde S \equiv 0$ and $\mathrm{sdiv}^{\overline{\nabla}} = \mathrm{sdiv}^\nabla$.
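Proof. Let $e_1,\dots,e_n,f_1,\dots,f_n$ denote any local symplectic frame on $M$. For the difference of the Dirac operators one obtains
\[
\begin{aligned}
D^{\overline{\nabla}}\varphi - D^\nabla\varphi = \frac{i}{2}\sum_{j,k,l=1}^{n}\{&K(f_l,f_j,f_k)\,e_l\cdot e_j\cdot e_k - K(f_l,f_j,e_k)\,e_l\cdot e_j\cdot f_k \\
&- K(f_l,e_j,f_k)\,e_l\cdot f_j\cdot e_k + K(f_l,e_j,e_k)\,e_l\cdot f_j\cdot f_k \\
&- K(e_l,f_j,f_k)\,f_l\cdot e_j\cdot e_k + K(e_l,f_j,e_k)\,f_l\cdot e_j\cdot f_k \\
&+ K(e_l,e_j,f_k)\,f_l\cdot f_j\cdot e_k - K(e_l,e_j,e_k)\,f_l\cdot f_j\cdot f_k\}\cdot\varphi.
\end{aligned}
\]
Now, a rather extensive, but elementary computation follows. Altogether, this leads to
\[
\begin{aligned}
D^{\overline{\nabla}}\varphi - D^\nabla\varphi
&= i\sum_{1\le l<j<k\le n}\{S(f_l,f_j,f_k)\,e_l\cdot e_j\cdot e_k - S(e_l,e_j,e_k)\,f_l\cdot f_j\cdot f_k\}\cdot\varphi \\
&\quad + i\sum_{l=1}^{n}\sum_{1\le j<k\le n}\{S(e_j,e_k,f_l)\,f_j\cdot f_k\cdot e_l - S(f_j,f_k,e_l)\,e_j\cdot e_k\cdot f_l\}\cdot\varphi \\
&\quad + \frac{i}{2}\sum_{1\le j<k\le n}\{S(f_j,f_j,f_k)\,e_j^2\cdot e_k + S(f_j,f_k,f_k)\,e_j\cdot e_k^2 \\
&\qquad\qquad - S(e_j,e_j,e_k)\,f_j^2\cdot f_k - S(e_j,e_k,e_k)\,f_j\cdot f_k^2\}\cdot\varphi \\
&\quad + \frac{i}{2}\sum_{j,k=1}^{n}\{S(e_k,e_k,f_j)\,f_k^2\cdot e_j - S(f_k,f_k,e_j)\,e_k^2\cdot f_j\}\cdot\varphi \\
&\quad + \frac{i}{6}\sum_{k=1}^{n}\{S(f_k,f_k,f_k)\,e_k^3 - S(e_k,e_k,e_k)\,f_k^3\}\cdot\varphi \\
&\quad + \frac{1}{2}\sum_{k=1}^{n}\{(\mathrm{div}^{\overline{\nabla}}(f_k) - \mathrm{div}^\nabla(f_k))\,e_k - (\mathrm{div}^{\overline{\nabla}}(e_k) - \mathrm{div}^\nabla(e_k))\,f_k\}\cdot\varphi \\
&\quad + \frac{1}{2}\sum_{j,k=1}^{n}\{S(e_j,f_j,f_k)\,e_k + S(e_j,f_j,e_k)\,f_k\}\cdot\varphi.
\end{aligned}
\]
For $S \equiv 0$ and $\mathrm{div}^{\overline{\nabla}} = \mathrm{div}^\nabla$ the identity $D^{\overline{\nabla}} = D^\nabla$ follows obviously. Conversely, if $D^{\overline{\nabla}}\varphi = D^\nabla\varphi$ holds for any symplectic spinor field $\varphi$, Lemma 8.1 gives
\[
\begin{aligned}
S(f_l,f_j,f_k) = S(e_l,e_j,e_k) &= 0 && \text{for } 1\le l<j<k\le n, \\
S(f_j,f_j,f_k) = S(f_j,f_k,f_k) &= 0 && \text{for } 1\le j<k\le n, \\
S(e_j,e_j,e_k) = S(e_j,e_k,e_k) &= 0 && \text{for } 1\le j<k\le n, \\
S(f_k,f_k,f_k) = S(e_k,e_k,e_k) &= 0 && \text{for } 1\le k\le n, \\
S(f_k,f_k,e_j) = S(e_k,e_k,f_j) &= 0 && \text{for } 1\le j,k\le n, \\
S(f_j,f_k,e_l) = S(e_j,e_k,f_l) &= 0 && \text{for } 1\le l\le n \text{ and } 1\le j<k\le n,
\end{aligned}
\]
as well as
\[ \mathrm{div}^{\overline{\nabla}}(f_k) - \mathrm{div}^\nabla(f_k) + \sum_{j=1}^{n} S(e_j,f_j,f_k) = 0 \quad\text{and}\quad \mathrm{div}^{\overline{\nabla}}(e_k) - \mathrm{div}^\nabla(e_k) - \sum_{j=1}^{n} S(e_j,f_j,e_k) = 0 \]
for $1 \le k \le n$. The first group of equations means $S \equiv 0$ by the symmetry of the tensor $S$. Thus, the last two equations imply $\mathrm{div}^{\overline{\nabla}} = \mathrm{div}^\nabla$.

If the local frame $e_1,\dots,e_n,f_1,\dots,f_n$ is symplectic as well as orthonormal, which implies $Je_j = f_j$ and $Jf_j = -e_j$ for $j = 1,\dots,n$, one obtains for the difference of $\tilde D^{\overline{\nabla}}$ and $\tilde D^\nabla$,
\[
\begin{aligned}
\tilde D^{\overline{\nabla}}\varphi - \tilde D^\nabla\varphi = \frac{i}{2}\sum_{j,k,l=1}^{n}\{&K(f_l,f_j,f_k)\,f_l\cdot e_j\cdot e_k - K(f_l,f_j,e_k)\,f_l\cdot e_j\cdot f_k \\
&- K(f_l,e_j,f_k)\,f_l\cdot f_j\cdot e_k + K(f_l,e_j,e_k)\,f_l\cdot f_j\cdot f_k \\
&+ K(e_l,f_j,f_k)\,e_l\cdot e_j\cdot e_k - K(e_l,f_j,e_k)\,e_l\cdot e_j\cdot f_k \\
&- K(e_l,e_j,f_k)\,e_l\cdot f_j\cdot e_k + K(e_l,e_j,e_k)\,e_l\cdot f_j\cdot f_k\}\cdot\varphi.
\end{aligned}
\]
As above we arrive at
\[
\begin{aligned}
\tilde D^{\overline{\nabla}}\varphi - \tilde D^\nabla\varphi
&= i\sum_{1\le l<j<k\le n}\{\tilde S(e_l,e_j,e_k)\,e_l\cdot e_j\cdot e_k + \tilde S(f_l,f_j,f_k)\,f_l\cdot f_j\cdot f_k\}\cdot\varphi \\
&\quad + i\sum_{l=1}^{n}\sum_{1\le j<k\le n}\{\tilde S(f_j,f_k,e_l)\,f_j\cdot f_k\cdot e_l + \tilde S(e_j,e_k,f_l)\,e_j\cdot e_k\cdot f_l\}\cdot\varphi \\
&\quad + \frac{i}{2}\sum_{1\le j<k\le n}\{\tilde S(e_j,e_j,e_k)\,e_j^2\cdot e_k + \tilde S(e_j,e_k,e_k)\,e_j\cdot e_k^2 \\
&\qquad\qquad + \tilde S(f_j,f_j,f_k)\,f_j^2\cdot f_k + \tilde S(f_j,f_k,f_k)\,f_j\cdot f_k^2\}\cdot\varphi \\
&\quad + \frac{i}{2}\sum_{j,k=1}^{n}\{\tilde S(f_k,f_k,e_j)\,f_k^2\cdot e_j + \tilde S(e_k,e_k,f_j)\,e_k^2\cdot f_j\}\cdot\varphi \\
&\quad + \frac{i}{6}\sum_{k=1}^{n}\{\tilde S(e_k,e_k,e_k)\,e_k^3 + \tilde S(f_k,f_k,f_k)\,f_k^3\}\cdot\varphi \\
&\quad + \frac{1}{2}\sum_{k=1}^{n}\{(\mathrm{sdiv}^{\overline{\nabla}}(f_k) - \mathrm{sdiv}^\nabla(f_k))\,e_k - (\mathrm{sdiv}^{\overline{\nabla}}(e_k) - \mathrm{sdiv}^\nabla(e_k))\,f_k\}\cdot\varphi \\
&\quad + \frac{1}{2}\sum_{j,k=1}^{n}\{-\tilde S(f_j,e_j,e_k)\,e_k + \tilde S(f_j,e_j,f_k)\,f_k\}\cdot\varphi,
\end{aligned}
\]
which also gives the assertion by Lemma 8.1.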
Corollary 8.6. If $\nabla$ and $\overline{\nabla}$ are two symplectic connections such that $T = \overline{T}$, then the Dirac operators $D^\nabla$ and $D^{\overline{\nabla}}$ coincide if and only if $\nabla = \overline{\nabla}$. In other words, different symplectic connections having the same torsion induce different Dirac operators $D$.

Proof. In case $T = \overline{T}$ the conditions $S = 0$ and $\nabla = \overline{\nabla}$ are equivalent.
Remark. For $\overline{\nabla}J = \nabla J$ the relation $\mathrm{div}^{\overline{\nabla}} - \mathrm{div}^\nabla = \mathrm{sdiv}^{\overline{\nabla}} - \mathrm{sdiv}^\nabla$ is valid, so that in this case $D^{\overline{\nabla}} = D^\nabla$ holds if and only if $S \equiv 0$ and $\mathrm{div}^{\overline{\nabla}} = \mathrm{div}^\nabla$, as well as $\tilde D^{\overline{\nabla}} = \tilde D^\nabla$ if and only if $S \equiv 0$ and $\mathrm{div}^{\overline{\nabla}} = \mathrm{div}^\nabla$.

In order to illustrate some matters, we now want to consider three examples. On the one hand, different symplectic connections may induce the same Dirac operator. On the other hand, the two conditions for $D^{\overline{\nabla}} = D^\nabla$ given in Proposition 8.5 are both necessary. Moreover, the 2-dimensional case is a special one. Examples illustrating these relations for $\tilde D^\nabla$ may be found analogously. Finally, one has to extend the computations made in the proof of Proposition 8.5 to see how $P^\nabla$ depends on the choice of $\nabla$.

Example. We consider the 4-dimensional symplectic space $(\mathbb{R}^4,\omega_0)$ with the standard symplectic structure. Furthermore, $\nabla$ denotes the Levi-Civita connection. For $A_t : \mathbb{R}^4\times\mathbb{R}^4 \to \mathbb{R}^4$ defined by
\[
A_t\left(\begin{pmatrix} x_1\\ x_2\\ x_3\\ x_4\end{pmatrix},\begin{pmatrix} y_1\\ y_2\\ y_3\\ y_4\end{pmatrix}\right) =
\begin{pmatrix}
tx_1y_1 + tx_1y_2 + tx_2y_1 + 2x_2y_3 - x_3y_2 \\
0 \\
2tx_3y_1 - tx_1y_3 - tx_2y_3 + 2tx_3y_2 \\
-tx_1y_3 + 2tx_3y_1 + x_3y_3
\end{pmatrix}
\]
one verifies $\omega_0(A_t(X,Y),Z) = \omega_0(A_t(X,Z),Y)$. Hence, $\nabla^t_XY = \nabla_XY + A_t(X,Y)$ defines a family $\nabla^t$, $t \in \mathbb{R}$, of symplectic connections on $\mathbb{R}^4$. With
\[ S_t(X,Y,Z) = \omega_0(A_t(X,Y),Z) + \omega_0(A_t(Y,Z),X) + \omega_0(A_t(Z,X),Y) \]
one sees $S_t \equiv 0$ for any $t \in \mathbb{R}$. Besides, it follows immediately that
\[ \mathrm{div}^{\nabla^t}(X) - \mathrm{div}^\nabla(X) = 3t(x_1 + x_2) \]
for $X = (x_1,x_2,x_3,x_4)^\top$. Thus, for $t = 0$ we have $S_0 = 0$ and $\mathrm{div}^{\nabla^0} = \mathrm{div}^\nabla$ but $\nabla^0 \neq \nabla$. Moreover, for any $t \neq 0$ one has $S_t = 0$ and $\mathrm{div}^{\nabla^t} \neq \mathrm{div}^\nabla$.
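The three claims of this example (the symplectic condition, $S_t \equiv 0$, and the divergence difference $3t(x_1+x_2)$) can be checked mechanically. The sketch below assumes that $\omega_0$ pairs $a_1$ with $a_3$ and $a_2$ with $a_4$, and that the components of $A_t$ are read off the garbled display as the four rows used in the code; both are reconstructions, not statements taken verbatim from the paper. Since $A_t$ is bilinear, testing on basis vectors suffices.

```python
from itertools import product

def omega0(x, y):
    # Assumed convention: omega0(a1, a3) = omega0(a2, a4) = 1.
    return x[0] * y[2] - x[2] * y[0] + x[1] * y[3] - x[3] * y[1]

def A_t(t, x, y):
    # Components of A_t as reconstructed from the example (an assumption).
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return (
        t * x1 * y1 + t * x1 * y2 + t * x2 * y1 + 2 * x2 * y3 - x3 * y2,
        0,
        2 * t * x3 * y1 - t * x1 * y3 - t * x2 * y3 + 2 * t * x3 * y2,
        -t * x1 * y3 + 2 * t * x3 * y1 + x3 * y3,
    )

# Canonical basis a1, ..., a4 of R^4.
BASIS = [tuple(int(i == j) for j in range(4)) for i in range(4)]

def K(t, x, y, z):
    return omega0(A_t(t, x, y), z)

def S(t, x, y, z):
    # Cyclic sum defining the tensor S_t.
    return K(t, x, y, z) + K(t, y, z, x) + K(t, z, x, y)

def div_difference(t, x):
    # Sum over the symplectic frame e = (a1, a2), f = (a3, a4).
    a1, a2, a3, a4 = BASIS
    return (omega0(A_t(t, a1, x), a3) + omega0(a1, A_t(t, a3, x))
            + omega0(A_t(t, a2, x), a4) + omega0(a2, A_t(t, a4, x)))

def check(t):
    triples = list(product(BASIS, repeat=3))
    symplectic = all(K(t, x, y, z) == K(t, x, z, y) for x, y, z in triples)
    s_vanishes = all(S(t, x, y, z) == 0 for x, y, z in triples)
    return symplectic and s_vanishes

print(all(check(t) for t in (0, 1, -2, 5)))
print(div_difference(2, (1, 2, 3, 4)) == 3 * 2 * (1 + 2))
```

With this reading, $A_0(a_2,a_3) = (2,0,0,0)^\top \neq 0$, so $\nabla^0 \neq \nabla$ even though $S_0$ and the divergence difference both vanish.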
Example. We consider the same setting as in the example above and define $A : \mathbb{R}^4\times\mathbb{R}^4 \to \mathbb{R}^4$ to be
\[
A\left(\begin{pmatrix} x_1\\ x_2\\ x_3\\ x_4\end{pmatrix},\begin{pmatrix} y_1\\ y_2\\ y_3\\ y_4\end{pmatrix}\right) =
\begin{pmatrix} x_2y_3 - x_3y_2 \\ 0 \\ 0 \\ x_3y_3 \end{pmatrix}.
\]
Then $\omega_0(A(X,Y),Z) = \omega_0(A(X,Z),Y)$ and $\overline{\nabla}_XY = \nabla_XY + A(X,Y)$ gives a second symplectic connection on $\mathbb{R}^4$. One computes easily that $S \not\equiv 0$ and $\mathrm{div}^{\overline{\nabla}} = \mathrm{div}^\nabla$.

Example. On a 2-dimensional manifold the condition $\mathrm{div}^{\overline{\nabla}} = \mathrm{div}^\nabla$ already implies $\overline{\nabla} = \nabla$. For the difference of $\mathrm{div}^{\overline{\nabla}}$ and $\mathrm{div}^\nabla$ we have
\[ \mathrm{div}^{\overline{\nabla}}(X) - \mathrm{div}^\nabla(X) = \omega(A(e,X),f) - \omega(A(f,X),e), \]
where $e, f$ is any symplectic frame. Thus, $\mathrm{div}^{\overline{\nabla}} = \mathrm{div}^\nabla$ holds if and only if
\[ \omega(A(e,X),f) = \omega(A(f,X),e), \]
which is equivalent to $\omega(A(e,f),X) = \omega(A(f,e),X)$ for any vector field $X$. Finally, we have $\mathrm{div}^{\overline{\nabla}} = \mathrm{div}^\nabla$ if and only if $A(e,f) = A(f,e)$. But, generally, the sum
\[ \sum_{j=1}^{n} A(e_j,f_j) \]
does not depend on the choice of the symplectic frame $e_1,\dots,e_n,f_1,\dots,f_n$. Since for any local symplectic frame $e_1,\dots,e_n,f_1,\dots,f_n$ the frame $-f_1,\dots,-f_n,e_1,\dots,e_n$ is a symplectic one, too, we have in dimension $2n$,
\[ \sum_{j=1}^{n} A(e_j,f_j) = -\sum_{j=1}^{n} A(f_j,e_j) \]
for any local symplectic frame $e_1,\dots,e_n,f_1,\dots,f_n$. In dimension two this gives $A(e,f) = -A(f,e)$. Consequently, $\mathrm{div}^{\overline{\nabla}} = \mathrm{div}^\nabla$ implies $A = 0$, which gives $\overline{\nabla} = \nabla$.
9. The Formal Adjoint Operators of the Symplectic Dirac Operators and Formal Selfadjointness

Remark. If $X$ is any vector field on the symplectic manifold $(M,\omega)$, then $\eta_X = \omega(X,\cdot)$ denotes the corresponding 1-form on $M$. If now $\nabla$ is any symplectic and torsion-free covariant derivative on the tangent bundle of $M$, the structure equations of $\nabla$ yield for the divergence operator $\mathrm{div}^\nabla$ of $\nabla$,
\[ d(\eta_X\wedge\omega^{n-1}) = d\eta_X\wedge\omega^{n-1} = \frac{1}{n}\,\mathrm{div}^\nabla(X)\,\omega^n. \]
Moreover, this relation implies the following lemma.

Lemma 9.1. For any symplectic and torsion-free connection $\nabla$ and any vector field $X$ on a closed symplectic manifold $M$ we have the identity
\[ \int_M \mathrm{div}^\nabla(X)\,dM = 0. \]
If B is any operator acting on Γ (Q) then the formal adjoint operator B ∗ is defined to be the operator satisfying the equation (Bϕ, ψ) = (ϕ, B ∗ ψ) for any symplectic spinor fields ϕ, ψ ∈ Γ0 (Q) with compact support. In this section ∇ always denotes any fixed symplectic connection.
Proposition 9.2. The formal adjoint of $D^\nabla$ locally is given by
\[ (D^\nabla)^*\psi = \sum_{k=1}^{n}\{\nabla_{f_k}(e_k\cdot\psi) - \nabla_{e_k}(f_k\cdot\psi) + \mathrm{div}^{\overline{\nabla}}(f_k)\,e_k\cdot\psi - \mathrm{div}^{\overline{\nabla}}(e_k)\,f_k\cdot\psi\}, \]
where $e_1,\dots,e_n,f_1,\dots,f_n$ is any local symplectic frame on $M$. Moreover, the formal adjoint of $\tilde D^\nabla$ locally is given by
\[ (\tilde D^\nabla)^*\psi = \sum_{k=1}^{2n}\{\nabla_{e_k}(e_k\cdot\psi) + \mathrm{div}^{\overline{\nabla}}(e_k)\,e_k\cdot\psi\}, \]
where $e_1,\dots,e_{2n}$ is a local orthonormal frame on $M$. For both operators $\overline{\nabla}$ is an arbitrary symplectic and torsion-free connection.

Proof. Using Lemma 9.1 this proof is straightforward.
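Proposition 9.3. Let $\nabla$ denote any symplectic connection on $(M,\omega)$. Then the symplectic Dirac operators $D^\nabla$ and $\tilde D^\nabla$ acting on $\Gamma_0(Q)$ are formally selfadjoint with respect to the $L^2$-scalar product $(\ ,\ )$ if and only if $\mathrm{div}^\nabla = \mathrm{div}^{\overline{\nabla}}$ for a certain symplectic and torsion-free connection $\overline{\nabla}$.

Proof. By the proposition proved above we obtain
\[
\begin{aligned}
(D^\nabla)^*\psi &= \sum_{k=1}^{n}\{e_k\cdot\nabla_{f_k}\psi - f_k\cdot\nabla_{e_k}\psi\} \\
&\quad + \sum_{k=1}^{n}\{\nabla_{f_k}e_k - \nabla_{e_k}f_k + \mathrm{div}^{\overline{\nabla}}(f_k)\,e_k - \mathrm{div}^{\overline{\nabla}}(e_k)\,f_k\}\cdot\psi \\
&= D^\nabla\psi + \sum_{j,k=1}^{n}\{\omega(\nabla_{f_k}e_k,f_j)\,e_j + \omega(e_j,\nabla_{f_k}e_k)\,f_j\}\cdot\psi \\
&\quad - \sum_{j,k=1}^{n}\{\omega(\nabla_{e_k}f_k,f_j)\,e_j + \omega(e_j,\nabla_{e_k}f_k)\,f_j\}\cdot\psi \\
&\quad + \sum_{k=1}^{n}\{\mathrm{div}^{\overline{\nabla}}(f_k)\,e_k - \mathrm{div}^{\overline{\nabla}}(e_k)\,f_k\}\cdot\psi \\
&= D^\nabla\psi + \sum_{k=1}^{n}\{(\mathrm{div}^\nabla(e_k) - \mathrm{div}^{\overline{\nabla}}(e_k))\,f_k - (\mathrm{div}^\nabla(f_k) - \mathrm{div}^{\overline{\nabla}}(f_k))\,e_k\}\cdot\psi
\end{aligned}
\]
and
\[
\begin{aligned}
(\tilde D^\nabla)^*\psi &= \sum_{k=1}^{2n}\{e_k\cdot\nabla_{e_k}\psi + \nabla_{e_k}e_k\cdot\psi + \mathrm{div}^{\overline{\nabla}}(e_k)\,e_k\cdot\psi\} \\
&= \tilde D^\nabla\psi + \sum_{k=1}^{2n}\{\mathrm{div}^{\overline{\nabla}}(e_k) - \mathrm{div}^\nabla(e_k)\}\,e_k\cdot\psi,
\end{aligned}
\]
which gives the assertion.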
Corollary 9.4. If $\nabla$ is any symplectic and torsion-free connection, then the induced Dirac operators $D^\nabla$ and $\tilde D^\nabla$ are formally selfadjoint on $\Gamma_0(Q)$ with respect to the $L^2$-product $(\ ,\ )$.

Finally, Proposition 5.1 immediately gives

Proposition 9.5. For any symplectic and torsion-free connection $\nabla$ the operator $P^\nabla$ is an elliptic, formally selfadjoint differential operator of second order and has metric principal symbol, i.e. $P$ is a generalized Laplacian.

There is a completely developed theory for elliptic, formally selfadjoint pseudodifferential operators with positive definite leading symbol acting on sections of a vector bundle with finite-dimensional fibres. But we deal with operators acting on sections of an infinite-dimensional vector bundle, so for our operator $P$ none of these results can be taken for granted.

In the following example we give two covariant derivatives $\nabla$ and $\overline{\nabla}$ with the properties

– $\nabla$ symplectic and torsion-free
– $\overline{\nabla}$ symplectic with torsion
– $\mathrm{div}^{\overline{\nabla}} = \mathrm{div}^\nabla$
– $D^{\overline{\nabla}} \neq D^\nabla$

Thus, $D^\nabla$ as well as $D^{\overline{\nabla}}$ are formally selfadjoint, but they are different operators.

Example. Let $(\mathbb{R}^4,\omega_0)$ be the 4-dimensional symplectic space with the standard symplectic structure $\omega_0$ and with the Levi-Civita connection $\nabla$, which is symplectic and torsion-free. Furthermore, $A : \mathbb{R}^4\times\mathbb{R}^4 \to \mathbb{R}^4$ is given by
\[
A\left(\begin{pmatrix} x_1\\ x_2\\ x_3\\ x_4\end{pmatrix},\begin{pmatrix} y_1\\ y_2\\ y_3\\ y_4\end{pmatrix}\right) =
\begin{pmatrix} x_2y_3 - x_3y_2 \\ 0 \\ 0 \\ x_3y_3 \end{pmatrix}.
\]
Then $\overline{\nabla}$ defined by $\overline{\nabla}_XY = \nabla_XY + A(X,Y)$ is symplectic and has the torsion
\[ \overline{T}(X,Y) = A(X,Y) - A(Y,X) = 2\begin{pmatrix} x_2y_3 - x_3y_2 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \]
where $X = (x_1,x_2,x_3,x_4)^\top$ and $Y = (y_1,y_2,y_3,y_4)^\top$. With respect to the canonical symplectic basis $a_1,a_2,a_3,a_4$ of $\mathbb{R}^4$ we obtain for the divergence operators
\[
\begin{aligned}
\mathrm{div}^{\overline{\nabla}}(X) - \mathrm{div}^\nabla(X) &= \omega_0(A(a_1,X),a_3) + \omega_0(a_1,A(a_3,X)) \\
&\quad + \omega_0(A(a_2,X),a_4) + \omega_0(a_2,A(a_4,X)) \\
&= 0
\end{aligned}
\]
for any vector field $X$. Moreover, an elementary computation gives $S \not\equiv 0$, which says by Proposition 8.5 that $D^{\overline{\nabla}} \neq D^\nabla$.
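The "elementary computation" for this example can be made explicit on basis vectors. The sketch below is illustrative only; it assumes the same convention $\omega_0(a_1,a_3) = \omega_0(a_2,a_4) = 1$ that the divergence formula above suggests, and the function names are hypothetical.

```python
def omega0(x, y):
    # Assumed convention: a1 is paired with a3, and a2 with a4.
    return x[0] * y[2] - x[2] * y[0] + x[1] * y[3] - x[3] * y[1]

def A(x, y):
    # The tensor A of the example: (x2*y3 - x3*y2, 0, 0, x3*y3).
    return (x[1] * y[2] - x[2] * y[1], 0, 0, x[2] * y[2])

a1, a2, a3, a4 = [tuple(int(i == j) for j in range(4)) for i in range(4)]

def torsion(x, y):
    # Torsion of the modified connection: A(X, Y) - A(Y, X).
    return tuple(u - v for u, v in zip(A(x, y), A(y, x)))

def div_difference(x):
    # Difference of the two divergence operators applied to X.
    return (omega0(A(a1, x), a3) + omega0(a1, A(a3, x))
            + omega0(A(a2, x), a4) + omega0(a2, A(a4, x)))

def S(x, y, z):
    # Cyclic sum of K(X, Y, Z) = omega0(A(X, Y), Z).
    k = lambda u, v, w: omega0(A(u, v), w)
    return k(x, y, z) + k(y, z, x) + k(z, x, y)

X, Y = (1, 2, 3, 4), (5, 6, 7, 8)
print(torsion(X, Y))        # first entry is 2*(x2*y3 - x3*y2) = -8
print(div_difference(X))    # 0
print(S(a2, a3, a3))        # -1, so S does not vanish identically
```

A single non-zero value such as $S(a_2,a_3,a_3)$ is enough to conclude $S \not\equiv 0$, while the divergence difference vanishes because it is linear in $X$ and zero on each basis vector.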
10. Essential Selfadjointness

We consider the space $\Gamma_0(Q)$ of symplectic spinor fields with compact support. Further, let $L^2(Q)$ denote the Hilbert space of $L^2$-sections of $Q$ with respect to the norm $\|\varphi\|^2 = (\varphi,\varphi)$ given by the $L^2$-product for sections in $Q$. Since $\Gamma_0(Q)$ is dense in $L^2(Q)$, the symplectic Dirac operators can naturally be considered as unbounded operators on the Hilbert space $L^2(Q)$,
\[ D, \tilde D, P : L^2(Q) \to L^2(Q), \]
with domain $\Gamma_0(Q)$. In the previous section we determined under which conditions these operators are formally selfadjoint, i.e. symmetric operators on $L^2(Q)$ with domain $\Gamma_0(Q)$.

Now, we consider the case where these operators are formally selfadjoint. Moreover, let $B$ be one of them. Since $B$ is symmetric with dense domain, the adjoint $B^*$ of $B$ with domain
\[ \mathrm{dom}(B^*) = \{\psi \in L^2(Q) \text{ such that there is a section } \psi' \in L^2(Q) \text{ with } (B\varphi,\psi) = (\varphi,\psi') \text{ for all sections } \varphi \in \Gamma_0(Q)\} \]
and $B^*\psi = \psi'$, as well as the closure $\overline{B}$ of $B$ with domain
\[ \mathrm{dom}(\overline{B}) = \{\varphi \in L^2(Q) \text{ such that there is a sequence } \{\varphi_n\}_{n=1,2,\dots} \text{ of elements in } \Gamma_0(Q) \text{ with } \varphi = \lim_{n\to\infty}\varphi_n \text{ in } L^2(Q) \text{ and } \{B\varphi_n\}_{n=1,2,\dots} \text{ converges in } L^2(Q)\} \]
and $\overline{B}\varphi = \lim_{n\to\infty} B\varphi_n$, are well defined operators. This, furthermore, establishes that $\mathrm{dom}(B^*)$ equipped with the scalar product
\[ (\varphi,\psi)_* = (\varphi,\psi) + (B^*\varphi,B^*\psi) \]
is a Hilbert space. Moreover, $\mathrm{dom}(\overline{B})$ is a closed subspace in $(\mathrm{dom}(B^*),(\ ,\ )_*)$. Cf. [2].

Now, $B$ is called essentially selfadjoint if and only if $B$ is formally selfadjoint and the closure $\overline{B}$ of $B$ is a selfadjoint operator. Since $(\overline{B})^* = B^*$ holds for any closable operator on a Hilbert space with dense domain, the operator $B$ is essentially selfadjoint if and only if $B^* = \overline{B}$. Hence, it remains to prove that the subspace $\mathrm{dom}(\overline{B})$ is dense in the Hilbert space $(\mathrm{dom}(B^*),(\ ,\ )_*)$. Using formal selfadjointness, we have $B\varphi = B^*\varphi$ for any $\varphi \in \Gamma_0(Q)$, which gives by the definition of $\overline{B}$ that $\Gamma_0(Q)$ is dense in $\mathrm{dom}(\overline{B})$ with respect to the norm $\|\ \|_*$ given by $(\ ,\ )_*$.

Proposition 10.1. Let $M$ be compact with formally selfadjoint symplectic Dirac operators. Then these operators are essentially selfadjoint.

Proof. Let $B$ be as above. Following the proof of Theorem 5.1 in [16] for the classical Riemannian Dirac operator one sees that $\Gamma_0(Q)$ is dense in $\mathrm{dom}(B^*)$ with respect to the norm $\|\ \|_*$. Thus, the closure $\overline{B}$ of $B$ is the unique selfadjoint extension of $B$.

Acknowledgement. The author wishes to thank the Mathematics Institute of the Ruhr-University Bochum and in particular Professor Jürgen Jost for kind hospitality. This work was supported by a Lise-Meitner fellowship of North Rhine–Westphalia.
References

1. Baum, H.: Spin-Strukturen und Dirac-Operatoren über pseudo-riemannschen Mannigfaltigkeiten. Leipzig: Teubner Verlag, 1981
2. Dunford, N., Schwartz, J.: Linear Operators, Part II. New York: Interscience Publishers, 1963
3. Folland, G.B.: Harmonic Analysis in Phase Space. Annals of Mathematics Studies 122 (1989)
4. Friedrich, Th.: Zur Abhängigkeit des Dirac Operators von der Spin-Struktur. Colloquium Mathematicum Vol. XLVIII, 57–62 (1984)
5. Friedrich, Th., Sulanke, S.: Ein Kriterium für die formale Selbstadjungiertheit des Dirac Operators. Colloquium Mathematicum Vol. XL, 239–247 (1979)
6. Habermann, K.: The Dirac Operator on Symplectic Spinors. Annals of Global Analysis and Geometry 13, 155–168 (1995)
7. Habermann, K.: On the Dependence of the Symplectic Dirac Operator on the Connection defining it and Formal Selfadjointness. Preprint 193, Bochum 1995
8. Habermann, K.: Metaplectic Structures and Dirac Operators over Symplectic Manifolds. Preprint 196, Bochum 1996
9. Habermann, K.: Symplectic Dirac Operators on Kähler Manifolds. In preparation
10. Kashiwara, M., Vergne, M.: On the Segal-Shale-Weil Representations and Harmonic Polynomials. Invent. math. 44, 1–47 (1978)
11. Kobayashi, S., Nomizu, K.: Foundations of Differential Geometry. New York: Interscience Publishers, 1963
12. Kostant, B.: Symplectic Spinors. Symposia Mathematica. Vol. XIV (1974)
13. Salamon, D.: Spin Geometry and Seiberg-Witten Invariants. Manuscript, University of Warwick 1995
14. Tondeur, Ph.: Affine Zusammenhänge auf Mannigfaltigkeiten mit fast-symplektischer Struktur. Comment. Math. Helv. 36, 234–244 (1961)
15. Wallach, N.R.: Symplectic Geometry and Fourier Analysis. MATH SCI PRESS, 1977
16. Wolf, J.A.: Essential Self Adjointness for the Dirac Operator and its Square. Indiana Univ. Math. Journal, Vol. 22, No. 7, 611–640 (1973)
Communicated by A. Connes
Commun. Math. Phys. 184, 653 – 667 (1997)
Communications in
Mathematical Physics c Springer-Verlag 1997
Some Classes of Solutions to the Toda Lattice Hierarchy Harold Widom Department of Mathematics, University of California, Santa Cruz, CA 95064, USA. E-mail: [email protected] Received: 12 March 1996 / Accepted: 26 September 1996
Abstract: We apply an analogue of the Zakharov-Shabat dressing method to obtain infinite matrix solutions to the Toda lattice hierarchy. Using an operator transformation we convert some of these into solutions in terms of integral operators and Fredholm determinants. Others are converted into a class of operator solutions to the l-periodic Toda hierarchy.

0. Introduction

We begin by recalling some terminology: The shift matrix $(\delta_{i+1,j})$ is denoted by $\Lambda$. (All matrices are doubly-infinite unless otherwise stated.) Any matrix $A$ has a representation as a formal sum $\sum_{i=-\infty}^{\infty} a_i \Lambda^i$ where the $a_i$ are diagonal matrices. Two of these may be multiplied if for both matrices the indices corresponding to nonzero components are bounded above (or below), or if for one of the matrices these indices are bounded above and below. If $A$ is triangular then it is invertible if and only if each diagonal entry of $a_0$ is nonzero. The upper-triangular and strictly lower-triangular projections of $A$ are defined by
\[ A_+ = \sum_{i=0}^{\infty} a_i \Lambda^i, \qquad A_- = \sum_{i=-\infty}^{-1} a_i \Lambda^i. \]
A solution to the Toda lattice (TL) hierarchy [11] is a family of matrices of the form
\[ B_n = \sum_{i=0}^{n-1} b_{i,n} \Lambda^i + \Lambda^n, \qquad C_n = \sum_{i=-n}^{-1} c_{i,n} \Lambda^i, \tag{0.1} \]
which satisfy the 2-dimensional TL equations
\[ \partial_{x_n} B_m - \partial_{x_m} B_n + [B_m, B_n] = 0, \]
\[ \partial_{y_n} C_m - \partial_{y_m} C_n + [C_m, C_n] = 0, \qquad \partial_{y_n} B_m - \partial_{x_m} C_n + [B_m, C_n] = 0. \]
In the first section we apply a discrete version of the dressing method of Zakharov and Shabat [12] (see also the nice exposition in [8]) to obtain infinite matrix solutions to the TL hierarchy. By this we mean that the entries of the $B_n$ and $C_n$ are themselves expressed in terms of infinite matrices, analogous to the operators whose Fredholm determinants give solutions to the KP hierarchy. Matrix solutions to the semi-infinite TL hierarchy were obtained in [7] and [1]. The methods here and in these references are different although the latter also begins with the factorization of a type of moment matrix.

In the following section we obtain operator solutions to the TL hierarchy, in which the entries of $B_n$ and $C_n$ are expressible in terms of integral operators and the diagonal entries of the $B_n$ are given in terms of Fredholm determinants. These are obtained from the matrix solutions by applying the fact that the inverses of the operators $I - AB$ and $I - BA$ may be expressed in terms of each other, and that they generally have the same determinant. The simplest integral operators which arise have kernel
\[ \frac{\exp\bigl(\tfrac12 \sum_n (x_n - y_n)(u^n - u^{-n} + v^n - v^{-n})\bigr)}{1 - uv} \]
and act on the space $L^2(\sigma)$, where $\sigma$ is a measure supported in the unit disc of the complex plane. When $\sigma$ equals a function $p(u)^2$ times Lebesgue measure we obtain an equivalent operator which acts on $L^2$ of Lebesgue measure when the kernel given above is multiplied by $p(u)\,p(v)$. If we set $t = x_1 - y_1$ (ignoring the other parameters) these operators give Fredholm determinant solutions to the Toda equations
\[ \frac{d^2 q_k}{dt^2} = e^{q_{k-1} - q_k} - e^{q_k - q_{k+1}} \qquad (k \in \mathbb{Z}). \]
When $\sigma$ is a discrete measure supported on $N$ points the Fredholm determinants are finite determinants and these give the familiar $N$-soliton solutions.

In the next section we use the same device to obtain another class of operator solutions which includes solutions to the l-periodic Toda hierarchy. The simplest operators here have kernel
\[ \frac{\exp\bigl(\tfrac12 \sum_n [x_n (1 - \omega^{-n})(u^n + v^n) + y_n (1 - \omega^n)(u^{-n} + v^{-n})]\bigr)}{u - \omega v}, \]
where $\omega$ is an $l$th root of unity, and act on $L^2(\sigma)$, where now $\sigma$ is a measure on $\mathbb{R}^+$. Again there is the special case where the operator acts on the usual $L^2(\mathbb{R}^+)$ space and the kernel is multiplied by $p(u)\,p(v)$. When $\sigma$ is a discrete measure supported on finitely many points the solutions are again expressed in terms of finite determinants. A related class of kernels gives solutions to the cylindrical Toda equations
\[ \frac{d^2 q_k}{dt^2} + t^{-1} \frac{dq_k}{dt} = e^{q_k - q_{k-1}} - e^{q_{k+1} - q_k} \qquad (k \in \mathbb{Z}). \]
Integral operator solutions to some analogous equations, or special cases, have also appeared in the literature, for example [2, 5, 6, 10]. The methods here are quite different. We mention that in [10] it was shown that the Fredholm determinants of the case l = 2 of the last kernels gave solutions of the mKdV/sinh-Gordon hierarchies. In [4] it was observed that those Fredholm determinant solutions were limits of N -soliton solutions.
They are both special cases of the more general situation where the operator acts on an $L^2(\sigma)$ space. We shall see that it is easy formally to derive the solutions to the TL hierarchy. But some justification is required, and the case of the periodic TL hierarchy is a little tricky.

1. Matrix Solutions to the Toda Hierarchy

In our discrete version of the dressing method we begin with a doubly-infinite matrix $F(i,j)$ for which there is a factorization $I - F = K_+^{-1} K_-$ with $K_+$ upper-triangular and $K_-$ of the form $I$ + strictly lower-triangular. More precisely, we require the relations
\[ K_+ (I - F) = K_-, \qquad (I - F) K_-^{-1} = K_+^{-1}. \tag{1.1} \]
(There is a difference, since not all matrix products are defined.) To give conditions assuring the existence of such matrices $K_\pm$ we denote by $F_k$ the infinite matrix $F(i,j)$ with $i, j \ge k$ and denote by $\mathbb{Z}_k$ the set of integers $\ge k$.

Lemma. Assume that $F_k$ represents a bounded operator on $l^2(\mathbb{Z}_k)$ for each $k$ and that each $I - F_k$ is invertible. Then there are triangular matrices $K_\pm$ of the form described such that all rows of $K_+$ and all columns of $K_-^{-1}$ belong to $l^2(\mathbb{Z})$ and such that the relations (1.1) hold.

Proof. We shall see what the matrices $K_\pm$ should be so that (1.1) holds, the actual verification then being a simple matter. If $K_+$ is upper-triangular then $K_+(I - F)$ is of the form $I$ + strictly lower-triangular precisely when
\[ K_+(k,j) - \sum_{i=k}^{\infty} K_+(k,i)\, F(i,j) = \delta_{j,k} \qquad (j \ge k). \]
So we define $K_+$ to be the upper-triangular matrix with entries
\[ K_+(k,j) = (I - F_k)^{-1}(k,j) \qquad (j \ge k). \tag{1.2} \]
Note that each row of $K_+$ belongs to $l^2(\mathbb{Z})$. Note also that $K_+(k,k)$, the upper left-hand corner of $(I - F_k)^{-1}$, must be nonzero since the corresponding minor, the matrix $I - F_{k+1}$, is invertible.

Since $(I - F) K_-^{-1}$ should be upper triangular with $k,k$ entry $K_+(k,k)^{-1}$ the entries of $K_-^{-1}$ must satisfy
\[ K_-^{-1}(i,k) - \sum_{j=k}^{\infty} F(i,j)\, K_-^{-1}(j,k) = K_+(k,k)^{-1}\, \delta_{i,k} \qquad (i \ge k). \]
So we define $K_-^{-1}$ to be the lower-triangular matrix with entries
\[ K_-^{-1}(i,k) = K_+(k,k)^{-1} \times \text{$i$th component of } (I - F_k)^{-1} \varepsilon_k \qquad (i \ge k), \]
where $\varepsilon_k$ is the vector $(\delta_{i,k})$ of $\mathbb{Z}_k$. Note that each column of $K_-^{-1}$ belongs to $l^2(\mathbb{Z})$.

The matrices $K_+$ and $K_-^{-1}$ defined above have the right form and their rows and columns, respectively, belong to $l^2(\mathbb{Z})$. Since each $F_k$ represents a bounded operator on $l^2(\mathbb{Z}_k)$ the two matrix products $[K_+(I - F)]K_-^{-1}$ and $K_+[(I - F)K_-^{-1}]$ are well-defined and equal. It follows from our definitions that the first product is of the form $I$ + strictly lower-triangular while the second is of the form $I$ + strictly upper-triangular. Hence both products equal $I$, and this is equivalent to the two relations (1.1). □
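The construction in the proof can be tried out on a finite matrix. The sketch below is only an illustration under stated assumptions: it replaces the doubly-infinite matrix by a hypothetical 4 × 4 matrix F(i, j) = 1/(i + j + 3), lets F_k be the lower-right block with indices ≥ k, builds K_+ row by row from (1.2), and forms K_+(I − F). The helper names (minor, det, inverse, k_plus) are not from the paper.

```python
from fractions import Fraction

def minor(m, i, j):
    """Matrix m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]

def det(m):
    """Determinant by Laplace expansion (fine for tiny matrices)."""
    if not m:
        return Fraction(1)
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def inverse(m):
    """Inverse via the adjugate formula."""
    d, n = det(m), len(m)
    return [[(-1) ** (i + j) * det(minor(m, j, i)) / d for j in range(n)]
            for i in range(n)]

def k_plus(F):
    """Row k of K_+ is the first row of (I - F_k)^{-1}, as in (1.2)."""
    n = len(F)
    K = [[Fraction(0)] * n for _ in range(n)]
    for k in range(n):
        block = [[Fraction(int(i == j)) - F[i][j] for j in range(k, n)]
                 for i in range(k, n)]
        K[k][k:] = inverse(block)[0]
    return K

N = 4  # finite stand-in for the infinite index set (an assumption)
F = [[Fraction(1, i + j + 3) for j in range(N)] for i in range(N)]
K = k_plus(F)
I_minus_F = [[Fraction(int(i == j)) - F[i][j] for j in range(N)] for i in range(N)]
# K_minus = K_+ (I - F); it should be I + strictly lower-triangular.
K_minus = [[sum(K[k][i] * I_minus_F[i][j] for i in range(N)) for j in range(N)]
           for k in range(N)]
```

Exact rational arithmetic is used so that the triangular shape of K_+(I − F) can be tested with equality rather than a tolerance.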
Theorem 1. Suppose, in addition to the hypotheses of the lemma, that
\[ \frac{\partial F}{\partial x_n} = F(i+n, j) - F(i, j-n), \qquad \frac{\partial F}{\partial y_n} = F(i-n, j) - F(i, j+n). \tag{1.3} \]
Then the matrices $B_n$ and $C_n$ defined by
\[ \partial_{x_n} - B_n = K_+ (\partial_{x_n} - \Lambda^n) K_+^{-1}, \qquad \partial_{y_n} - C_n = K_+ (\partial_{y_n} - \Lambda^{-n}) K_+^{-1} \tag{1.4} \]
are solutions to the TL hierarchy. Moreover, if we define
\[ L = K_- \Lambda K_-^{-1}, \qquad M = K_+ \Lambda^{-1} K_+^{-1}, \]
then
\[ B_n = (L^n)_+, \qquad C_n = (M^n)_-. \tag{1.5} \]

Proof. The identities (1.3) are equivalent to the statement that $F$ commutes with the operators $\partial_{x_n} - \Lambda^n$ and $\partial_{y_n} - \Lambda^{-n}$. This commutativity implies that we may replace $K_+$ everywhere in (1.4) by $K_-$. The reason is that
\[ K_- (\partial_{x_n} - \Lambda^n) K_-^{-1} = K_+ (I - F)(\partial_{x_n} - \Lambda^n) K_-^{-1} = K_+ (\partial_{x_n} - \Lambda^n)(I - F) K_-^{-1} = K_+ (\partial_{x_n} - \Lambda^n) K_+^{-1}, \]
and similarly for the definition of $C_n$ given by the second part of (1.4). (These matrix manipulations are justified using the facts that the rows of $K_+$, the columns of $K_-^{-1}$, and their derivatives all belong to $l^2(\mathbb{Z})$, and that $F_k$ and its derivatives represent bounded operators on $l^2(\mathbb{Z}_k)$.) Thus we have the second pair of relations
\[ \partial_{x_n} - B_n = K_- (\partial_{x_n} - \Lambda^n) K_-^{-1}, \qquad \partial_{y_n} - C_n = K_- (\partial_{y_n} - \Lambda^{-n}) K_-^{-1}. \tag{1.6} \]
The second statement of (1.5) is obvious once we recognize that the definition of $C_n$ may be rewritten
\[ C_n = \frac{\partial K_+}{\partial y_n} K_+^{-1} + M^n. \]
Similarly the first statement follows from the identity
\[ B_n = \frac{\partial K_-}{\partial x_n} K_-^{-1} + L^n, \]
which is equivalent to the first identity of (1.6).

To see that the TL equations are satisfied we observe that the operators $\partial_{x_n} - B_n$ and $\partial_{y_m} - C_m$ all commute since by (1.4) they are simultaneously similar to the commuting operators $\partial_{x_n} - \Lambda^n$ and $\partial_{y_m} - \Lambda^{-m}$. The commutativity of the $\partial_{x_n} - B_n$ and $\partial_{y_m} - C_m$ is equivalent to the TL equations. □
Remark 1. The definition of $B_n$ in (1.4) may be rewritten as
\[ B_n K_+ = \frac{\partial K_+}{\partial x_n} + K_+ \Lambda^n. \]
Since the last term is strictly upper-triangular, we have for the diagonal entries $B_n(k,k)\, K_+(k,k) = \partial_{x_n} K_+(k,k)$, or $B_n(k,k) = \partial_{x_n} \log K_+(k,k)$. If each $F_k$ represents a trace class operator then Cramer's rule gives the nice formula
\[ K_+(k,k) = \det(I - F_{k+1}) / \det(I - F_k). \tag{1.7} \]

Remark 2. It is familiar how the two-dimensional Toda equations
\[ \frac{\partial^2 q_k}{\partial x\, \partial y} = e^{q_k - q_{k-1}} - e^{q_{k+1} - q_k} \tag{1.8} \]
lead to one of the equations of the TL hierarchy,
\[ \frac{\partial B_1}{\partial y} - \frac{\partial C_1}{\partial x} + [B_1, C_1] = 0. \tag{1.9} \]
Here we write $x, y$ for $x_1, y_1$ respectively. (See, for example, the introduction to [11].) In fact we can easily see directly that
\[ q_k(x,y) = \log K_+(k,k) \tag{1.10} \]
solves (1.8). From the second relation of (1.4) or (1.5) follows the formula
\[ C_1(k+1, k) = K_+(k+1, k+1) / K_+(k,k) = e^{q_{k+1} - q_k}, \]
and so taking the $k,k$ entry of (1.9) gives
\[ \frac{\partial B_1(k,k)}{\partial y} = C_1(k, k-1) - C_1(k+1, k) = e^{q_k - q_{k-1}} - e^{q_{k+1} - q_k}. \]
Since $B_1(k,k) = \partial q_k / \partial x$ we obtain (1.8).

It remains to write down a class of matrices $F$ satisfying the conditions of the theorem. If $\mu$ is a measure on $\mathbb{C} \times \mathbb{C}$ then
\[ F(i,j) = \iint u^i v^j \exp\Bigl(\sum_n [x_n (u^n - v^{-n}) + y_n (u^{-n} - v^n)]\Bigr)\, d\mu(u,v) \tag{1.11} \]
satisfies (1.3) as long as the integrands belong to $L^1(\mu)$ for some range of the parameters $x_n$ and $y_n$. Boundedness of the operators $F_k$ on $l^2(\mathbb{Z}_k)$ requires that the support of $\mu$ be contained in the product $D \times D$ of the closed unit discs in $\mathbb{C}$. If the support is contained in the product of the open subdiscs then $F(i,j)$ tends rapidly to 0 as $i$ or $j$ tends to $+\infty$ and so each $F_k$ is trace class then. We shall not concern ourselves here with precise necessary or sufficient conditions.
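That the matrices (1.11) satisfy (1.3) can be checked numerically in the simplest case. The following sketch assumes, for illustration only, that μ is a single unit point mass at a hypothetical point (u, v) inside the unit discs and that only x_1 and y_1 are switched on; the derivatives are approximated by central differences.

```python
import cmath

U, V = 0.4 + 0.2j, 0.3 - 0.5j  # hypothetical support point of the measure

def F(i, j, x1, y1):
    """F(i, j) from (1.11) for a unit point mass at (U, V), n = 1 only."""
    return U ** i * V ** j * cmath.exp(x1 * (U - 1 / V) + y1 * (1 / U - V))

def dF_dx(i, j, x1, y1, h=1e-5):
    """Central-difference approximation to dF/dx_1."""
    return (F(i, j, x1 + h, y1) - F(i, j, x1 - h, y1)) / (2 * h)

def dF_dy(i, j, x1, y1, h=1e-5):
    """Central-difference approximation to dF/dy_1."""
    return (F(i, j, x1, y1 + h) - F(i, j, x1, y1 - h)) / (2 * h)

x1, y1, i, j = 0.3, -0.2, 2, 1
# Residuals of the two identities (1.3) with n = 1; both should be tiny.
res_x = dF_dx(i, j, x1, y1) - (F(i + 1, j, x1, y1) - F(i, j - 1, x1, y1))
res_y = dF_dy(i, j, x1, y1) - (F(i - 1, j, x1, y1) - F(i, j + 1, x1, y1))
```

For a general measure the same identities follow by differentiating under the integral sign, which is where the integrability assumption in the text enters.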
2. Operator Solutions to the Toda Hierarchy

In this section and the next we obtain operator solutions to the TL hierarchy from the matrix solutions above by using two facts about operators. The first, which is very easy, is that if $A$ and $B$ are operators such that $I - AB$ is invertible then so is $I - BA$ and
\[ (I - AB)^{-1} = I + A (I - BA)^{-1} B. \tag{2.1} \]
(This is obvious if the inverses are given by the Neumann series.) The second, which is not as easy, is that if $A$ and $B$ are both Hilbert-Schmidt operators, or one operator is trace class and the other merely bounded, then
\[ \det(I - AB) = \det(I - BA). \tag{2.2} \]
(See, for example, [3], Sect. IV.1.) It is important that $A$ and $B$ are not required to act on the same Hilbert space; $A$ may take a space $H_1$ to a space $H_2$ and $B$ take $H_2$ to $H_1$, so that $AB$ acts on $H_2$ and $BA$ acts on $H_1$.

When we say that a particular matrix $K_+$ "gives a solution to the TL hierarchy" we shall mean that the matrices $B_n$ and $C_n$ defined in terms of $K_+$ by (1.4) are of the form (0.1) and satisfy the TL equations. We shall always assume that only finitely many of the parameters $x_n, y_n$ occur and that the range of these parameters is such that the exponentials which appear, such as in (1.11), are bounded in the support of the corresponding measure. Throughout, we shall use the notation
\[ E(u,v) := \exp\Bigl(\tfrac12 \sum_n [x_n (u^n - v^{-n}) + y_n (u^{-n} - v^n)]\Bigr). \tag{2.3} \]
The simplest special case, where the measure $\mu$ in (1.11) is supported on the set $u = v$, leads to our first family of operator solutions to the TL hierarchy. We define
\[ E_k(u) := u^k E(u,u) = u^k \exp\Bigl(\tfrac12 \sum_n (x_n - y_n)(u^n - u^{-n})\Bigr). \tag{2.4} \]

Theorem 2. Let $\sigma$ be a measure on the unit disc $D$ satisfying
\[ \int_D \frac{|E_k(u)|^2}{1 - |u|}\, d\sigma(u) < \infty \]
for all $k \in \mathbb{Z}$. Then $G_k$, the integral operator on $L^2(\sigma)$ with kernel
\[ G_k(u,v) = \frac{E_k(u)\, E_k(v)}{1 - uv}, \tag{2.5} \]
is trace class, and if 1 is not an eigenvalue of any $G_k$ then
\[ K_+(k,j) = \delta_{k,j} + \bigl( (I - G_k)^{-1} E_j,\; E_k \bigr) \tag{2.6} \]
gives a solution of the TL hierarchy. (The parentheses in the displayed formula denote the inner product in $L^2(\sigma)$.) The diagonal entries of $K_+$ are also given by
\[ K_+(k,k) = \det(I - G_{k+1}) / \det(I - G_k). \tag{2.7} \]
Proof. We take the special case of (1.11), where $\mu$ is supported on the set $v = u$ and is given by $d\mu(u,v) = \delta(u - v)\, d\sigma(u)$. Then (1.11) becomes
\[ F(i,j) = \int u^{i+j} E(u,u)^2\, d\sigma(u) \]
and $F_k$ may be written as the product $AB$ where $A$ is the operator from $L^2(\sigma)$ to $l^2(\mathbb{Z}_k)$ with "kernel" $A(i,u) = E_i(u)$ and $B$ is the operator from $l^2(\mathbb{Z}_k)$ to $L^2(\sigma)$ with kernel $B(u,i)$ given by exactly the same formula. They are both Hilbert-Schmidt,
\[ \int_D \sum_{i=k}^{\infty} |A(i,u)|^2\, d\sigma(u) = \int_D \sum_{i=k}^{\infty} |B(u,i)|^2\, d\sigma(u) = \int_D \frac{|E_k(u)|^2}{1 - |u|^2}\, d\sigma(u) < \infty, \]
and $BA$ is precisely the operator $G_k$. Since $F_k = AB$ is trace class we may apply Theorem 1 to conclude that the matrix $K_+$ defined by (1.2) gives a solution to the TL hierarchy and (1.7) holds. Applying (2.1) in our situation gives (2.6) and applying (2.2) gives (2.7). □

Remark 3. If $d\sigma(u) = p(u)^2\, du$ on the interval $(0,1) \subset \mathbb{R}$ and
\[ \int_0^1 \frac{|E_k(u)|^2\, p(u)^2}{1 - u}\, du < \infty \]
for all $k \in \mathbb{Z}$ then the conclusion of the theorem holds if the quantities $E_k(u)$ are replaced by $E_k(u)\, p(u)$ and the operators act on $L^2(0,1)$. This follows by using the unitary equivalence between $L^2(\sigma)$ and $L^2(0,1)$ given by $f \leftrightarrow pf$. At the other extreme, if $\sigma$ consists of $N$ masses $p_i^2$ at points $u_i$ then $G_k$ becomes an $N \times N$ matrix, the Fredholm determinants become finite determinants and we obtain exponential solutions through the formula (recall (2.4))
\[ \det(I - G_k) = \det\Bigl( \delta_{i,j} - p_i p_j\, \frac{(u_i u_j)^k\, E_0(u_i)\, E_0(u_j)}{1 - u_i u_j} \Bigr)_{i,j=1,\dots,N}. \tag{2.8} \]
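The two operator facts (2.1) and (2.2) used in the proof above already have content for rectangular matrices, where A and B map between spaces of different dimension. The sketch below is a finite-dimensional illustration with hypothetical 2 × 3 and 3 × 2 rational matrices; the helper names are assumptions and not part of the paper.

```python
from fractions import Fraction

def minor(m, i, j):
    """Matrix m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]

def det(m):
    """Determinant by Laplace expansion (fine for tiny matrices)."""
    if not m:
        return Fraction(1)
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def inverse(m):
    """Inverse via the adjugate formula."""
    d, n = det(m), len(m)
    return [[(-1) ** (i + j) * det(minor(m, j, i)) / d for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def eye_minus(m):
    """I - m for a square matrix m."""
    n = len(m)
    return [[Fraction(int(i == j)) - m[i][j] for j in range(n)] for i in range(n)]

A = [[Fraction(1, 2), Fraction(1, 3), Fraction(0)],
     [Fraction(1, 5), Fraction(0), Fraction(1, 4)]]   # maps H1 (dim 3) to H2 (dim 2)
B = [[Fraction(1, 3), Fraction(1, 7)],
     [Fraction(0), Fraction(1, 2)],
     [Fraction(1, 6), Fraction(1, 5)]]                # maps H2 (dim 2) to H1 (dim 3)

AB, BA = matmul(A, B), matmul(B, A)                   # 2 x 2 and 3 x 3
lhs = inverse(eye_minus(AB))                          # left side of (2.1)
middle = matmul(matmul(A, inverse(eye_minus(BA))), B)
rhs = [[Fraction(int(i == j)) + middle[i][j] for j in range(2)] for i in range(2)]
```

In the paper the same identities are applied with A and B acting between a sequence space and an L² space, which is what converts matrix solutions into integral-operator solutions.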
Remark 4. If we think of $x_n - y_n$ as a new variable $t_n$ then we obtain solutions of the 1-dimensional Toda hierarchy $\partial_{t_n}(B_n + C_n) = [B_n, C_n]$. Writing $t = t_1$, $x = x_1$, $y = y_1$ (ignoring the other parameters), keeping in mind that $\partial^2/\partial x\, \partial y = -d^2/dt^2$ and substituting into (1.8) we find that $q_k(t) = \log K_+(-k,-k)$ solves the one-dimensional Toda equations
\[ \frac{d^2 q_k}{dt^2} = e^{q_{k-1} - q_k} - e^{q_k - q_{k+1}}. \]
In the case of a finite discrete measure we have (2.8), which becomes
\[ \det(I - G_k) = \det\Bigl( \delta_{i,j} - p_i p_j\, \frac{(u_i u_j)^k}{1 - u_i u_j}\, e^{t (u_i - u_i^{-1} + u_j - u_j^{-1})/2} \Bigr)_{i,j=1,\dots,N}. \]
This gives the $N$-soliton solutions of the Toda equations. (See [9], Sect. 3.6.)
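For N = 1 the determinant in Remark 4 is a scalar, and the claim can be tested directly. The sketch below assumes a single mass p² at a point u with hypothetical values u = 0.5, p = 0.3, builds q_k(t) = log K_+(−k, −k) from (2.7), and compares a finite-difference second derivative with the right side of the Toda equations.

```python
import math

U, P = 0.5, 0.3  # hypothetical location and weight of the single point mass

def tau(k, t):
    """det(I - G_k) for one mass P^2 at U: a 1 x 1 determinant."""
    return 1.0 - P * P * U ** (2 * k) * math.exp(t * (U - 1.0 / U)) / (1.0 - U * U)

def q(k, t):
    """q_k(t) = log K_+(-k, -k), with K_+(k, k) = tau(k+1) / tau(k) as in (2.7)."""
    return math.log(tau(-k + 1, t) / tau(-k, t))

def toda_residual(k, t, h=1e-4):
    """Second derivative of q_k (central difference) minus the Toda right side."""
    d2q = (q(k, t + h) - 2.0 * q(k, t) + q(k, t - h)) / (h * h)
    rhs = math.exp(q(k - 1, t) - q(k, t)) - math.exp(q(k, t) - q(k + 1, t))
    return d2q - rhs

residuals = [toda_residual(k, 1.0) for k in (-1, 0, 1)]
```

The parameter values are chosen so that every tau(k, t) involved stays positive; for other choices the logarithm may hit a zero of the determinant, which corresponds to a singularity of the solution.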
Remark 5. The kernels (2.5) and those described in Remark 1 admit significant generalizations. They are obtained by thinking of $\mathbb{C} \times \mathbb{C}$ as the union of complex lines $v = \omega u$ ($\omega \in \mathbb{C}$) and taking $d\mu(u,v) = p(\omega,u)^2\, d\rho(\omega)\, d\sigma(u)$, so that (1.11) becomes (recall (2.3))
\[ F(i,j) = \iint u^{i+j} \omega^j\, p(\omega,u)^2\, E(u, \omega u)^2\, d\rho(\omega)\, d\sigma(u). \]
In the kernels above $\rho$ was a unit mass at $\omega = 1$. Now $F_k$ may be written as the product $AB$, where $A$ and $B$ are the operators from $L^2(\rho \times \sigma)$ to $l^2(\mathbb{Z}_k)$ and from $l^2(\mathbb{Z}_k)$ to $L^2(\rho \times \sigma)$, respectively, with kernels
\[ A(i; \omega, u) = u^i\, p(\omega,u)\, E(u, \omega u), \qquad B(\omega, u; i) = (\omega u)^i\, p(\omega,u)\, E(u, \omega u). \]
Appropriate assumptions on the measures $d\rho(\omega)$ and $d\sigma(u)$ and the function $p(\omega,u)$ guarantee that these are Hilbert-Schmidt operators between $l^2(\mathbb{Z}_k)$ and $L^2(\rho \times \sigma)$. The operator $G_k = BA$ on $L^2(\rho \times \sigma)$ has kernel
\[ G_k(\omega, u; \omega', v) = \frac{(\omega u v)^k}{1 - \omega u v}\, p(\omega,u)\, p(\omega',v)\, E(u, \omega u)\, E(v, \omega' v). \tag{2.9} \]
If we now define
\[ E_k(\omega, u) := u^k E(u, \omega u) \]
then we obtain a solution of the TL hierarchy in which (2.6) is replaced by
\[ K_+(k,j) = \delta_{k,j} + \bigl( (I - G_k)^{-1} \omega^j p E_j,\; p E_k \bigr). \]
The inner product is now taken in $L^2(\rho \times \sigma)$. Of course (2.7) remains the same. If the measure $\rho$ consists of a finite set $\Omega$ of unit point masses then we may think of $G_k$ as acting on vector-valued $L^2(\sigma)$, the components being indexed by $\omega \in \Omega$. The right side of (2.9) gives the $\omega, \omega'$ entry of the matrix kernel of $G_k$. Actually, the kernel of $G_k$ can always be transformed to an equivalent scalar kernel by another $AB \to BA$ operation. The details of this will be given for another class of kernels at the end of the next section.

3. More Operator Solutions to the Toda Hierarchy

The kernels described in the last remark were given not so much for their inherent interest but because our solutions to the periodic TL hierarchy and the cylindrical Toda equations arise in a similar way. Now we think of $\mathbb{C} \times \mathbb{C}$ as the union of complex hyperbolas $v = \omega/u$. This time we take $d\mu(u,v) = u^{-1} p(\omega,u)^2\, d\rho(\omega)\, d\sigma(u)$ (the reason for the factor $u^{-1}$ in the measure will soon become apparent) so that (1.11) becomes
\[ F(i,j) = \iint u^{i-j-1} \omega^j\, p(\omega,u)^2\, E(u, \omega/u)^2\, d\rho(\omega)\, d\sigma(u). \]
In this case for the representation of $F_k$ we define
\[ A_k(i; \omega, u) = u^{i-k}\, p(\omega,u)\, E(u, \omega/u), \qquad B_k(\omega, u; i) = \omega^i u^{k-i-1}\, p(\omega,u)\, E(u, \omega/u). \]
The situation is more awkward now, and we begin with the assumption that $d\sigma(u)$ and $d\rho(\omega)$ are finite measures supported on annuli
\[ r_1 \le |u| \le r_2 < 1, \qquad 0 < s_1 \le |\omega| \le s_2, \]
where $s_2 < r_1$. We call this "the annulus condition". We also assume that the function $p(\omega,u)\, E(u, \omega/u)$ belongs to $L^2(\rho \times \sigma)$. It is easy to see that both operators $A_k$ and $B_k$ are then Hilbert-Schmidt, $F_k = A_k B_k$ and the operator $G_k = B_k A_k$ on $L^2(\rho \times \sigma)$ has kernel
\[ G_k(\omega, u; \omega', v) = \frac{\omega^k}{u - \omega v}\, p(\omega,u)\, E(u, \omega/u)\, p(\omega',v)\, E(v, \omega'/v). \tag{3.1} \]
If we define now $E_k(\omega,u) := u^k E(u, \omega/u)$ then the solution to the TL hierarchy is given by the formula
\[ K_+(k,j) = \delta_{k,j} + \bigl( (I - G_k)^{-1} \omega^j p E_{k-j-1},\; p E_0 \bigr), \tag{3.2} \]
and (2.7) holds as usual. The very restrictive condition imposed on $d\sigma(u)$ and $d\rho(\omega)$ makes this not very interesting. But the formulas make sense much more generally and we might expect them to give solutions whenever the operators $G_k$ with kernel defined by (3.1) are trace class and the $I - G_k$ are invertible. Although we cannot prove such a general result we can prove enough for our purposes.

We denote the supports of the finite measures $d\sigma(u)$ and $d\rho(\omega)$ by $U$ and $\Omega$ respectively and use the familiar notation $UU^{-1} := \{uv^{-1} : u, v \in U\}$. The annulus condition implies that $UU^{-1}$ is also contained in an annulus and that $\Omega$ is inside the inner boundary of this annulus. In particular $\Omega$ can be shrunk down to 0 without in the process intersecting $UU^{-1}$. As we shall now show, a quasi-topological condition like this on $U$ and $\Omega$ is all that we need. Note that just the assumption $\Omega \cap UU^{-1} = \emptyset$ implies that $G_k$ has $C^\infty$ kernel and so is trace class.

Theorem 3. Assume that $pE_0 \in L^2(\rho \times \sigma)$ and that the supports $U$ and $\Omega$ are compact subsets of $\mathbb{C} \setminus 0$ with the property that there is a path from 0 to 1 in $\mathbb{C}$ such that $\beta\Omega \cap UU^{-1} = \emptyset$ for all $\beta$ in the path. Assume also that each $I - G_k$ is invertible. Then (3.2) gives a solution of the TL hierarchy and (2.7) holds.

Remark 6. For the proof it will be convenient to widen the meaning of the phrase "$K_+$ gives a solution to the TL hierarchy" to the case where not all entries of $K_+$ are defined. The TL hierarchy consists of an infinite number of scalar equations each of which depends on only finitely many entries of the matrices $B_n$ and $C_n$, and these in turn are determined from (1.4) using only finitely many entries of $K_+$. If $K_+$ is only a partially defined matrix then our phrase will mean that all those scalar equations of the TL hierarchy which make sense are satisfied. This will be the case if in the statement of the theorem not all $I - G_k$ are invertible. This occurs for those values of the parameters $x_n$ and $y_n$ for which $\det(I - G_k) = 0$ and at these values certain entries of the solution matrices become singular. Even if all $I - G_k$ are invertible this concept, in proving the theorem, will allow us to deal with finitely many $k$ at a time rather than all $k$ simultaneously.
Proof. For nonzero complex numbers $\alpha$ and $\beta$ let $\sigma_\alpha$ and $\rho_\beta$ be the measures on $\alpha U$ and $\beta\Omega$ defined by
\[ \sigma_\alpha(V) = \sigma(\alpha^{-1} V), \qquad \rho_\beta(W) = \rho(\beta^{-1} W). \]
If we first choose $\alpha$ small enough and then choose $\beta$ small enough these measures will satisfy the annulus condition. Thus if we replace $\sigma$ by $\sigma_\alpha$, $\rho$ by $\rho_\beta$ and $p(\omega,u)$ by $p(\beta^{-1}\omega, \alpha^{-1}u)$ then the corresponding operators give a solution to the TL hierarchy by the corresponding formula (3.2), and (2.7) holds. If 1 is an eigenvalue of some of these operators then we obtain a partial solution, as described in the remark. These operators act on $L^2(\rho_\beta \times \sigma_\alpha)$ but by means of the variable changes $\omega \to \beta\omega$, $u \to \alpha u$ we obtain operators $G_k^{(\alpha,\beta)}$ on $L^2(\rho \times \sigma)$ which give a solution by the corresponding formulas (3.2). Their kernels are
\[ \frac{\alpha^{-1} \omega^k}{u - \beta\omega v}\, p(\omega,u)\, E(\alpha u, \beta\omega/\alpha u)\, p(\omega',v)\, E(\alpha v, \beta\omega'/\alpha v). \]
These will give a solution to the TL hierarchy if $\alpha$ lies in some annulus $r_1 < |\alpha| < r_2$ and $\beta$ in some punctured disc $0 < |\beta| < s$.

Consider a finite set $k_1, \dots, k_n$ of $k$ for each of which $I - G_k$ is invertible for particular values of the parameters $x_n$ and $y_n$. We are going to show that the TL equations which are defined using the formula (3.2) for these $k$ are correct for these values of the parameters. Choose any $\beta$ with $0 < |\beta| < s$. A glance at the form of its kernel shows that the operator $G_k^{(\alpha,\beta)}$ is well-defined for all $\alpha \in \mathbb{C} \setminus 0$ and is analytic in $\alpha$. The set of $\alpha$ such that $I - G_k^{(\alpha,\beta)}$ is not invertible for one of our $k$ is the union of the set of zeros of the functions (of $\alpha$) $\det(I - G_k^{(\alpha,\beta)})$. This is a discrete subset of $\mathbb{C} \setminus 0$ varying continuously with the parameters. It follows that there is a path running from a point in $r_1 < |\alpha| < r_2$ to $\alpha = 1$ such that everywhere on a neighborhood of this path, and on neighborhoods of our given parameter values, all operators $I - G_k^{(\alpha,\beta)}$ are invertible. The matrix entries which arise in those TL equations which are defined using only $k_1, \dots, k_n$ are analytic functions of $\alpha$ and since these equations are satisfied for $r_1 < |\alpha| < r_2$ they must persist by analytic continuation to a neighborhood of $\alpha = 1$. In other words, the operators $G_k^{(1,\beta)}$ give a partial solution to the TL hierarchy for $k = k_1, \dots, k_n$ whenever $|\beta| < s$.

Now we can apply the same argument to $\beta$. Our hypothesis implies that there is a path running from a point in $0 < |\beta| < s$ to $\beta = 1$ on a neighborhood of which $G_k^{(1,\beta)}$ is an analytic family of trace class operators: the denominator in the formula for the kernel will remain nonzero. By deforming the path slightly if necessary we can assure that for $\beta$ on the path all operators $I - G_k^{(1,\beta)}$ are invertible. Analytic continuation as before now shows that the $G_k^{(1,1)}$, in other words our given operators $G_k$, give a solution to the TL hierarchy. □

A limiting argument applied to a special case of the above will give a class of solutions to the TL hierarchy which include periodic ones. The measure $\sigma$ will now be supported on $\mathbb{R}^+$, the nonnegative real numbers in $\mathbb{C}$. Thus we shall have a case where the support of $\sigma$ is neither compact nor contained in $\mathbb{C} \setminus 0$. This is the reason a limiting argument is necessary.

Theorem 4. Assume $\Omega$ is a compact subset of $\mathbb{C} \setminus \mathbb{R}^+$, the measure $\sigma$ is supported on $\mathbb{R}^+$ and $pE_i \in L^2(\rho \times \sigma)$ for all $i \le 0$. Then (3.2) gives a solution of the TL hierarchy and (2.7) holds.
Remark 7. Our basic assumption of boundedness of $E(u, \omega/u)$ will imply that it vanishes exponentially at $u = 0$ and $u = \infty$. It follows that the function $p(\omega,u)$ can be very general.

Proof. For $r > 1$ let $P_r$ denote multiplication by the characteristic function of the interval $[r^{-1}, r]$ and define $G_k^{(r)} := P_r G_k P_r$. These may be thought of alternatively as the operators $G_k$ which arise when $\sigma$ is replaced by its restriction to $[r^{-1}, r]$. The assumption of Theorem 3 is satisfied for this measure: we can take our path from $\beta = 0$ to $\beta = 1$ to be the line segment. Thus for each $r$ the corresponding operators $G_k^{(r)}$ and formulas (3.2) give solutions (or partial solutions) to the TL hierarchy, and (2.7) holds.

We shall show that the trace norm of the operator $G_k - G_k^{(r)}$ tends to 0 as $r \to \infty$. We write
\[ G_k - G_k^{(r)} = (I - P_r)\, G_k + P_r\, G_k\, (I - P_r) \]
and apply Lemma 1 of the appendix to each of the operators on the right. Denote by $\chi_r(u)$ the characteristic function of the $u$-interval $[r^{-1}, r]$, and recall (3.1). In the first application of the lemma $q_1 = (1 - \chi_r)\, \omega^k p E_0$, $q_2 = p E_0$ and in the second application $q_1 = \chi_r\, \omega^k p E_0$, $q_2 = (1 - \chi_r)\, p E_0$. In both cases one of the integrals in the statement of the lemma will tend to 0 as $r \to \infty$. Notice that all we need here is that $pE_{-1/2} \in L^2(\rho \times \sigma)$.

All functions which appear in the inner products in (3.2) are of the form $\omega^j p E_i$ for $i \le 0$, and these belong to $L^2(\rho \times \sigma)$ by our main assumption. It follows that the functions of the parameters $x_n$ and $y_n$ which appear in the TL equations obtained from the $G_k$ using (3.2) are the limits of the corresponding functions obtained from the $G_k^{(r)}$. The same is true of the derivatives of these functions. (This is automatic because the functions are locally bounded and analytic in the parameters.) Since the equations are satisfied for each $r$ the equations must be satisfied by the limiting functions.

Trace norm convergence of the operators is needed to deduce (2.7): we know that it holds for the solution corresponding to $G_k^{(r)}$ and so
\[ K_+(k,k) = \lim_{r \to \infty} \frac{\det(I - G_{k+1}^{(r)})}{\det(I - G_k^{(r)})} = \frac{\det(I - G_{k+1})}{\det(I - G_k)}. \]
The convergence of the determinants requires trace norm convergence of the operators. □

We are now going to transform (3.1) into a kernel on $L^2(\sigma)$ by another $AB \to BA$ operation. (We think of this as a scalar-valued kernel on $L^2(\sigma)$ whereas (3.1) could be thought of as an $L^2(\rho)$ operator-valued kernel or, in case $\Omega$ is finite, a matrix-valued kernel.) Now $B$ will be the operator from $L^2(\rho \times \sigma)$ to $L^2(\sigma)$ which is multiplication by $pE_0$ followed by integration with respect to $d\rho$ over $\Omega$. If $pE_0$ is bounded then this will be a bounded operator. The operator $A$ from $L^2(\sigma)$ to $L^2(\rho \times \sigma)$ has kernel $\omega^k p(\omega,u)\, E(u, \omega/u)/(u - \omega v)$. (In other words one takes a function $g(v)$, multiplies by this kernel, and then integrates with respect to $d\sigma(v)$.) The operator $\tilde G_k = BA$ on $L^2(\sigma)$ has kernel
\[ \tilde G_k(u,v) = \int_\Omega \omega^k\, \frac{p(\omega,u)^2\, E(u, \omega/u)^2}{u - \omega v}\, d\rho(\omega). \tag{3.3} \]
In order to apply (2.1) we have to know that $A$ is also bounded, and in order to apply (2.2) we have to know that it is even trace class, since $B$ is surely not Hilbert-Schmidt. Fortunately, we have Lemma 2 of the appendix, and so we can immediately give a variant of Theorem 4 for this scalar kernel. The condition imposed on $\sigma$ is a little awkward but it is satisfied by Lebesgue measure, the most interesting case. We shall not write down the analogue of (3.2) here since it may be obtained from (3.2) itself by applying (2.1).

Theorem 4′. In addition to the assumptions of Theorem 4, assume that $pE_0$ is bounded and that for some $\delta \in (0,2)$
\[ \int_0^\infty \frac{d\sigma(u)}{u^{2-\delta} + 1} < \infty \]
and $u^{-1-\delta/2}\, p^2 E_0^2 \in L^2(\rho \times \sigma)$. Then the operators $\tilde G_k$ give a solution of the TL hierarchy if all $I - \tilde G_k$ are invertible, and we have
\[ K_+(k,k) = \det(I - \tilde G_{k+1}) / \det(I - \tilde G_k). \tag{3.4} \]
We now show that the special case of these kernels where $\sigma$ is Lebesgue measure and $p$ is independent of $u$ (in which case it can be incorporated into the measure $\rho$) gives solutions to the cylindrical Toda equations
\[ \frac{d^2 q_k}{dt^2} + t^{-1} \frac{dq_k}{dt} = e^{q_k - q_{k-1}} - e^{q_{k+1} - q_k}. \]
We consider only the scalar kernel $\tilde G_k$ although the case of $G_k$ is no different. Again we ignore all parameters except $x = x_1$ and $y = y_1$, so (3.3) becomes
\[ \tilde G_k(u,v) = \int_\Omega \omega^k\, \frac{E(u, \omega/u)^2}{u - \omega v}\, d\rho(\omega) \]
with
\[ E(u, \omega/u)^2 = e^{x (1 - \omega^{-1}) u + y (1 - \omega) u^{-1}}. \]
If we make the variable change $u \to \sqrt{y/x}\, u$ then the new kernels, which have the same determinants, are
\[ \int_\Omega \omega^k\, \frac{e^{\sqrt{xy}\, [(1 - \omega^{-1}) u + (1 - \omega) u^{-1}]}}{u - \omega v}\, d\rho(\omega). \]
(This is where we use the fact that $\sigma$ is Lebesgue measure and $p$ is independent of $u$.) Thus the determinants in (3.4) are functions of $xy$, and if we set $t = 2\sqrt{xy}$ then the two-dimensional Toda equations (1.8) become the cylindrical Toda equations. Solutions are given by (1.10) and (3.4).

4. The l-Periodic Toda Hierarchy

A doubly-infinite matrix $M(i,j)$ is called $l$-periodic if $M(i+l, j+l) = M(i,j)$ for all $i, j$. The solutions $B_n$, $C_n$ of the TL hierarchy given by (1.4) are $l$-periodic if the matrix $K_+$ is. And the matrix $K_+$ given by (3.2) is $l$-periodic if $\Omega$ is contained in the set of $l$th roots of unity, as is easily seen by referring to (3.1) and (3.2). Of course since $\Omega$ is disjoint from $\mathbb{R}^+$ the root 1 must be omitted, so we may take $\Omega$ to be the set of $l$th roots of unity other than 1. Assuming $\rho(\{\omega\}) = 1$ for each of these roots the formula (3.1) for the kernel, now a matrix kernel, may be written
\[ G_k(u,v) = \Bigl( \frac{\omega^k}{u - \omega v}\, p_\omega(u)\, E(u, \omega/u)\, p_{\omega'}(v)\, E(v, \omega'/v) \Bigr)_{\omega, \omega' \in \Omega} \tag{4.1} \]
and the hypothesis of Theorem 4 becomes
\[ \int_0^\infty |p_\omega(u)\, u^i\, E(u, \omega/u)|^2\, d\sigma(u) < \infty \]
for all $\omega$ and $i \le 0$. The scalar kernel of $\tilde G_k$ becomes
\[ \tilde G_k(u,v) = \sum_\omega \omega^k\, \frac{p_\omega(u)^2\, E(u, \omega/u)^2}{u - \omega v}, \tag{4.2} \]
with additional assumptions coming from the statement of Theorem 4′.

The case where only one $\omega$ occurs is especially simple. If $G$ is the operator with either kernel
\[ G(u,v) = \frac{p(u)\, E(u, \omega/u)\, p(v)\, E(v, \omega/v)}{u - \omega v} \quad \text{or} \quad \frac{p(u)^2\, E(u, \omega/u)^2}{u - \omega v}, \]
then $G_k$ resp. $\tilde G_k$ equals $\omega^k G$ and (2.7) becomes
\[ K_+(k,k) = \det(I - \omega^{k+1} G) / \det(I - \omega^k G). \tag{4.3} \]
Of course the assumptions are slightly different in the two cases. If $\sigma$ consists of $N$ masses $p_i^2$ at the points $u_i$ then the Fredholm determinants become finite determinants,
\[ \det(I - \omega^k G) = \det\Bigl( \delta_{i,j} - \omega^k p_i p_j\, \frac{E(u_i, \omega/u_i)\, E(u_j, \omega/u_j)}{u_i - \omega u_j} \Bigr)_{i,j=1,\dots,N}. \]

Remark 8. Observe that for $l = 2$ only $\omega = -1$ occurs and so each term in (4.3) is the ratio of the determinants $\det(I \pm G)$.

Remark 9. If in (4.1) or (4.2) $p_\omega(u)$ is independent of $u$ then we are in the case considered at the end of Sect. 3. Therefore if $\sigma$ is Lebesgue measure and we set $t = 2\sqrt{x_1 y_1}$ we obtain periodic solutions of the cylindrical Toda equations through the formulas (1.10) and (2.7) or (3.4).

5. Appendix

We prove here the lemmas needed for Theorems 4 and 4′. First we shall prove a sublemma, which gives a family of estimates for the trace norm of the operator from $L^2(\rho' \times \sigma)$ to $L^2(\rho \times \sigma)$ with kernel
\[ \frac{q_1(\omega,u)\, q_2(\omega',v)}{u - \omega v}. \tag{5.1} \]
There will be an inequality for each positive function $\varphi(s)$ defined on $\mathbb{R}^+$. We denote the Laplace transform of $\varphi(s)$ by $\Phi$ and the Laplace transform of $\varphi(s)^{-1}$ by $\Psi$.

Sublemma. Let $\rho$ be a finite measure supported on a compact set $\Omega \subset \mathbb{C} \setminus \mathbb{R}^+$ and $\sigma$ a measure supported on $\mathbb{R}^+$. (The measure $\rho'$ is arbitrary.) Then there is a constant $m$ depending only on $\Omega$ such that the square of the trace norm of the operator with kernel (5.1) is at most $m^{-1}$ times the square root of
\[ \int_0^\infty\!\!\int_\Omega |q_1(\omega,u)|^2\, \Phi(mu)\, d\rho(\omega)\, d\sigma(u) \cdot \int_0^\infty\!\!\int_\Omega |q_2(\omega',u)|^2\, \Psi(mu)\, d\rho'(\omega')\, d\sigma(u). \]
Proof. Write $-\omega = r^2 e^{2i\theta}$. Then $r$ is bounded and bounded away from 0 for $\omega \in \Omega$ and we may take $|\theta| \le \frac{\pi}{2} - \delta$ for some $\delta > 0$. With this notation we write
\[ \frac{1}{u - \omega v} = \frac{r^{-1} e^{-i\theta}}{r^{-1} e^{-i\theta} u + r e^{i\theta} v}, \]
which has the integral representation
\[ r^{-1} e^{-i\theta} \int_0^\infty e^{-s r^{-1} e^{-i\theta} u}\, e^{-s r e^{i\theta} v}\, ds. \]
It follows that the kernel of our operator has the integral representation
\[ \int_0^\infty r^{-1} e^{-i\theta}\, q_1(\omega,u)\, e^{-s r^{-1} e^{-i\theta} u}\; q_2(\omega',v)\, e^{-s r e^{i\theta} v}\, ds. \tag{5.2} \]
From this representation it follows that for any choice of $\varphi(s)$ we can factor the operator as the product $AB$ where $A$ is the integral operator from $L^2(\mathbb{R}^+)$ (with Lebesgue measure) to $L^2(\rho \times \sigma)$ with kernel
\[ A(\omega, u; s) = r^{-1} e^{-i\theta}\, q_1(\omega,u)\, \varphi(s)^{1/2}\, e^{-s r^{-1} e^{-i\theta} u} \]
and $B$ is the integral operator from $L^2(\rho' \times \sigma)$ to $L^2(\mathbb{R}^+)$ with kernel
\[ B(s; \omega', u) = q_2(\omega',u)\, \varphi(s)^{-1/2}\, e^{-s r e^{i\theta} u}. \]
Let $m > 0$ be any constant less than or equal to $r \cos\theta$ and to $r^{-1} \cos\theta$ for all $\omega \in \Omega$. (Notice that $\cos\theta \ge \sin\delta$ for all $\omega \in \Omega$.) Then the square of the Hilbert-Schmidt norm of $A$ is at most $m^{-2}$ times
\[ \iiint_0^\infty |q_1(\omega,u)|^2\, \varphi(s)\, e^{-smu}\, ds\, d\rho(\omega)\, d\sigma(u) = \iint |q_1(\omega,u)|^2\, \Phi(mu)\, d\rho(\omega)\, d\sigma(u). \]
Similarly the square of the Hilbert-Schmidt norm of $B$ is at most
\[ \iint |q_2(\omega',u)|^2\, \Psi(mu)\, d\rho'(\omega')\, d\sigma(u), \]
which establishes the sublemma. □
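The integral representation of 1/(u − ωv) at the start of the proof can be checked numerically. The sketch below assumes hypothetical sample values (ω = −1 + i, u = 1.5, v = 0.7), truncates the integral at a large cutoff S, and applies composite Simpson quadrature; the function name and the cutoff are illustrative choices, not part of the paper.

```python
import cmath

def kernel_via_laplace(u, v, omega, S=40.0, n=40000):
    """Approximate r^{-1} e^{-i theta} * int_0^S exp(-s(a + b)) ds by Simpson's rule,
    where -omega = r^2 e^{2 i theta}, a = e^{-i theta} u / r and b = r e^{i theta} v."""
    z = -omega
    r = abs(z) ** 0.5
    theta = cmath.phase(z) / 2.0      # |theta| < pi/2 when omega is off R+
    a = cmath.exp(-1j * theta) * u / r
    b = r * cmath.exp(1j * theta) * v

    def f(s):
        return cmath.exp(-s * a) * cmath.exp(-s * b)

    h = S / n                         # n must be even for Simpson's rule
    total = f(0.0) + f(S)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return cmath.exp(-1j * theta) / r * total * h / 3.0

omega, u, v = -1 + 1j, 1.5, 0.7       # hypothetical sample point with u, v > 0
approx = kernel_via_laplace(u, v, omega)
exact = 1.0 / (u - omega * v)
```

The real parts of a and b are positive exactly because |θ| stays below π/2, which is the role of the assumption that Ω avoids R+; this is also what makes the truncation at S harmless here.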
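Lemma 1. The trace norm of the integral operator on $L^2(\rho \times \sigma)$ with kernel (5.1) is at most a constant depending only on $\Omega$ times
\[ \int_0^\infty\!\!\int_\Omega u^{-1} |q_1(\omega,u)|^2\, d\rho(\omega)\, d\sigma(u) \cdot \int_0^\infty\!\!\int_\Omega u^{-1} |q_2(\omega,u)|^2\, d\rho(\omega)\, d\sigma(u). \]

Proof. In the sublemma take $\rho' = \rho$ and $\varphi(s) \equiv 1$. □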
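Lemma 2. For any $\delta \in (0, 2)$ the trace norm of the integral operator from $L_2(\sigma)$ to $L_2(\rho \times \sigma)$ with kernel $q(\omega, u)/(u - \omega v)$ is at most a constant depending only on $\Omega$ and $\delta$ times the square root of
\[ \int_0^\infty\!\!\int_\Omega (u^{-2-\delta} + 1)\, |q(\omega, u)|^2\, d\rho(\omega)\, d\sigma(u) \cdot \int_0^\infty \frac{d\sigma(u)}{u^{2-\delta} + 1} . \]
Proof. In the sublemma we take $\rho'$ to be a unit point mass, so that $L_2(\rho' \times \sigma)$ may be identified in the obvious way with $L_2(\sigma)$. We take $\varphi(s) = s^{-1+\delta}$ for $s \le 1$ and $\varphi(s) = s^{1+\delta}$ for $s \ge 1$. Then it is easy to see that
\[ \Phi(t) = \begin{cases} O(t^{-2-\delta}) & \text{when } t \to 0 \\ O(1) & \text{when } t \to \infty, \end{cases} \qquad \Psi(t) = \begin{cases} O(1) & \text{when } t \to 0 \\ O(t^{-2+\delta}) & \text{when } t \to \infty. \end{cases} \]
Combining these estimates with the sublemma gives the statement of the lemma. □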
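The two estimates used in this proof can be checked by splitting the integrals at $s = 1$. The following is only a sketch: it assumes (consistently with the Hilbert-Schmidt bounds in the proof of the sublemma, but not restated in this section) that $\Phi$ and $\Psi$ are the Laplace transforms of $\varphi$ and $1/\varphi$, and the constants are not optimised.

```latex
% Assumption: \Phi(t) = \int_0^\infty \varphi(s) e^{-st} ds,
%             \Psi(t) = \int_0^\infty \varphi(s)^{-1} e^{-st} ds.
\begin{align*}
\Phi(t) &= \int_0^1 s^{-1+\delta} e^{-st}\,ds + \int_1^\infty s^{1+\delta} e^{-st}\,ds , \\
\Psi(t) &= \int_0^1 s^{1-\delta} e^{-st}\,ds + \int_1^\infty s^{-1-\delta} e^{-st}\,ds .
\end{align*}
% Bounds valid for all t > 0:
\begin{align*}
\int_0^1 s^{-1+\delta} e^{-st}\,ds &\le \frac{1}{\delta}, &
\int_1^\infty s^{1+\delta} e^{-st}\,ds &\le \Gamma(2+\delta)\, t^{-2-\delta}, \\
\int_0^1 s^{1-\delta} e^{-st}\,ds &\le \min\Big( \frac{1}{2-\delta},\ \Gamma(2-\delta)\, t^{-2+\delta} \Big), &
\int_1^\infty s^{-1-\delta} e^{-st}\,ds &\le \frac{e^{-t}}{\delta} .
\end{align*}
% For t >= 1 one also has
% \int_1^\infty s^{1+\delta} e^{-st} ds \le \int_1^\infty s^{1+\delta} e^{-s} ds \le \Gamma(2+\delta).
% Hence
\begin{align*}
\Phi(t) &= O(t^{-2-\delta}) \ (t \to 0), & \Phi(t) &= O(1) \ (t \to \infty), \\
\Psi(t) &= O(1) \ (t \to 0), & \Psi(t) &= O(t^{-2+\delta}) \ (t \to \infty).
\end{align*}
```

Since $0 < \delta < 2$, all constants above are finite, and the bounds translate into $\Phi(mu) \le C\,(u^{-2-\delta} + 1)$ and $\Psi(mu) \le C/(u^{2-\delta} + 1)$, which are the weights appearing in the statement of Lemma 2.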
Acknowledgement. The author thanks Fritz Gesztesy, David Sattinger and Pierre Van Moerbeke for helpful electronic conversations. Special thanks go to Craig Tracy, whose ongoing collaboration with the author led to this work and whose advice during its preparation was invaluable. Research was supported by the National Science Foundation through grant DMS-9424292.
References
1. Adler, M., Van Moerbeke, P.: String-orthogonal polynomials, string equations and two-Toda symmetries. Preprint
2. Bernard, D., LeClair, A.: Differential equations for sine-Gordon correlation functions at the free fermion point. Nucl. Phys. B426 [FS], 534–558 (1994)
3. Gohberg, I. C., Krein, M. G.: Introduction to the theory of linear nonselfadjoint operators. Amer. Math. Soc. Transl. Math. Monog. 18, Providence, RI: Am. Math. Soc., 1969
4. Kakei, S.: Toda lattice hierarchy and Zamolodchikov's conjecture. solv-int/9510006
5. Nijhoff, F. W.: Linear integral transformations and hierarchies of integrable nonlinear evolution equations. Physica 31D, 339–388 (1988)
6. Pöppe, Ch.: Construction of solutions of the sine-Gordon equation by means of Fredholm determinants. Physica 9D, 103–139 (1983)
7. Pöppe, Ch.: Fredholm determinant methods for continuous and discrete soliton equations. In: M. Ablowitz, B. Fuchssteiner, M. Kruskal (eds.) Topics in soliton theory and exactly solvable nonlinear equations. Singapore: World Scientific Publ., 1987, pp. 277–283
8. Pöppe, Ch., Sattinger, D. H.: Fredholm determinants and the τ function for the Kadomtsev-Petviashvili hierarchy. Publ. Research Inst. Kyoto Univ. 24, 505–538 (1988)
9. Toda, M.: Theory of nonlinear lattices. 2nd ed. Springer Series in Solid-State Sciences 20. Heidelberg: Springer-Verlag, 1989
10. Tracy, C. A., Widom, H.: Fredholm determinants and the mKdV/sinh-Gordon hierarchies. Commun. Math. Phys. 179, 1–10 (1996)
11. Ueno, K., Takasaki, K.: Toda lattice hierarchy. In: Group representations and systems of differential equations. K. Okamoto (ed.) Adv. Stud. Pure Math. 4, Tokyo: Kinokuniya and North-Holland, 1984, pp. 1–95
12. Zakharov, V. E., Shabat, A. B.: A scheme for integrating the nonlinear equations of mathematical physics by the method of the inverse scattering problem. Funct. Anal. Applic. 8, 226–235 (1974)

Communicated by M. Jimbo
Commun. Math. Phys. 184, 669 – 681 (1997)
Communications in Mathematical Physics © Springer-Verlag 1997
Higher-Order Simple Lie Algebras
J. A. de Azcárraga*,**,***, J. C. Pérez Bueno**
Department of Applied Mathematics and Theoretical Physics, Silver St., Cambridge, CB3-9EW, UK
Received: 3 June 1996 / Accepted: 8 November 1996
Abstract: It is shown that the non-trivial cocycles on simple Lie algebras may be used to introduce antisymmetric multibrackets which lead to higher-order Lie algebras, the definition of which is given. Their generalised Jacobi identities turn out to be satisfied by the antisymmetric tensors (or higher-order "structure constants") which characterise the Lie algebra cocycles. This analysis allows us to present a classification of the higher-order simple Lie algebras as well as a constructive procedure for them. Our results are synthesised by the introduction of a single, complete BRST operator associated with each simple algebra.

1. Introduction

It is well known that, given $[X, Y] := XY - YX$, the standard Jacobi identity (JI) $[[X, Y], Z] + [[Y, Z], X] + [[Z, X], Y] = 0$ is automatically satisfied if the product is associative (which will be assumed throughout). For a Lie algebra G, expressed by the Lie commutators $[X_i, X_j] = C_{ij}^{k} X_k$ in a certain basis $\{X_i\}$, $i = 1, \dots, r = \dim G$, the JI implies the Jacobi condition (JC)
\[ \frac{1}{2}\, \epsilon^{j_1 j_2 j_3}_{i_1 i_2 i_3}\, C^{\rho}_{j_1 j_2}\, C^{\sigma}_{\rho j_3} = 0 , \tag{1.1} \]
on the structure constants. Let G be simple. Using the Killing metric $k_{ij} = k(X_i, X_j)$ to lower and raise indices, the fully antisymmetric tensor $C_{ijk} = C_{ij}^{s} k_{sk} = k([X_i, X_j], X_k)$ defines a non-trivial Lie algebra three-cocycle. Since it is obtained from k, this three-cocycle is always present ($H_0^3(G, R) \neq 0$ for any G simple). In fact, it is known since the classical work of Cartan, Pontrjagin, Hopf and others (see in particular [1–9]) that, from a topological point of view, the group manifolds of all simple compact groups are essentially equivalent to the products of odd spheres¹, that $S^3$ is always present in these products and that the simple Lie algebra G-cocycles are in one-to-one correspondence with bi-invariant de Rham cocycles on the associated compact group manifolds G. The appearance of specific spheres $S^{2p+1}$ ($p \ge 1$) other than $S^3$ depends on the simple group considered. This is due to the intimate relation between the order of the l = rank G primitive symmetric polynomials which can be defined on a simple Lie algebra, their l associated generalised Casimir-Racah invariants [10–17, 39, 40]² and the topology of the associated simple groups, a fact which was found in the eighties to be a key in the understanding of non-abelian anomalies in gauge theories (see [18] for an account of the subject and e.g., [19–21]).

By looking at the invariant symmetric polynomials on G we may obtain the higher-order cocycles of the Lie algebra cohomology. These cocycles will turn out to define G-valued skew-symmetric brackets of even order s satisfying a generalised Jacobi condition replacing (1.1). Higher-order generalisations of Lie algebras, in the form of the strongly homotopy Lie algebras (SH) of Stasheff [22, 23], have recently appeared in physics. This is the case of the (SH) algebra of products of closed string fields (see [24, 25] and references therein), which involves multilinear, graded-commutative products of n string fields satisfying certain "main identities" which also generalise the standard Jacobi identity. The higher-order Lie algebras to be discussed in this paper satisfy, however, a generalised Jacobi condition which is a consequence of the assumed associativity of the product of algebra elements and which has also appeared in another, but related context [26]. As a result the definition of the skew-symmetric multibracket to be given in Sect. 2 permits, for each even s, the introduction of a coderivation $\partial_s$ of the exterior algebra $\wedge(G)$ constructed on the Lie algebra G. In contrast, the "main identities" of the SH Lie algebras [22, 23] (some detailed expressions can be found in [27]) involve a further extension of our generalised Jacobi identities which in effect describes how the products fail to satisfy them and how the various $\partial_s$ involved in the main identities fail separately to be a coderivation. Our extended higher-order algebras are thus a particular case of the SH Lie algebras in which only one of the $\partial_s$ is non-zero³.

We shall now show how to introduce them (Sect. 2) and present their Cartan-like classification in the simple case (Sects. 3, 4). In Sect. 5 we shall describe our results by introducing the complete BRST operator associated with a simple Lie algebra; some comments concerning applications and extensions will be made in Sect. 6.

* St. John's College Overseas Visiting Scholar
** On sabbatical (J.A.) leave and on leave of absence (J.C.P.B.) from Departamento de Física Teórica and IFIC (Centro Mixto Univ. de Valencia-CSIC), E-46100 Burjassot (Valencia), Spain
*** Permanent address: Dpto. de Fisica Teorica, Facultad de Fisica, 46100 Burjasot (Valencia), Spain
¹ More precisely, if G is a compact connected Lie group, G has the (real) cohomology or homology of a product of odd dimensional spheres.
² In the non-simple case the situation is more involved (see [40]).
³ Note added: This is also the case of the "k-algebras" [28]. We thank P. Hanlon for sending us this reference.

2. Multibrackets and Higher-Order Lie Algebras

Higher-order Lie algebras may be defined by introducing a suitable generalisation of the Lie bracket by means of

Definition 2.1 (s-bracket). Let s be even. A s-bracket or skew-symmetric Lie multibracket is a Lie algebra valued s-linear skew-symmetric mapping $G \times \overset{s}{\cdots} \times G \to G$,
\[ (X_{i_1}, X_{i_2}, \dots, X_{i_s}) \mapsto [X_{i_1}, X_{i_2}, \dots, X_{i_s}] = \omega_{i_1 \dots i_s}{}^{\sigma}_{\cdot}\, X_\sigma , \tag{2.1} \]
where the constants $\omega_{i_1 \dots i_s}{}^{\sigma}_{\cdot}$ satisfy the condition
\[ \epsilon^{j_1 \dots j_{2s-1}}_{i_1 \dots i_{2s-1}}\, \omega_{j_1 \dots j_s}{}^{\rho}_{\cdot}\, \omega_{\rho j_{s+1} \dots j_{2s-1}}{}^{\sigma}_{\cdot} = 0 . \tag{2.2} \]
The $\omega_{i_1 \dots i_s}{}^{\sigma}_{\cdot}$ will be called higher-order structure constants, and condition (2.2) will be referred to as the generalised Jacobi condition (GJC); for s = 2 it gives the ordinary JC (1.1).

Remark 1. Although we shall only consider here the case of Lie algebras, this definition (as others below) is more general. The GJC (2.2) is clearly a consistency condition for (2.1).

From now on $X_i$ will denote both the algebra basis elements and its representatives in a faithful representation of G. Let now $[X_{i_1}, X_{i_2}, \dots, X_{i_n}]$, n arbitrary, be defined by
\[ [X_{i_1}, X_{i_2}, \dots, X_{i_n}] = \sum_{\sigma \in S_n} (-1)^{\pi(\sigma)} X_{i_{\sigma(1)}} X_{i_{\sigma(2)}} \dots X_{i_{\sigma(n)}} , \tag{2.3} \]
where $\pi(\sigma)$ is the parity of the permutation $\sigma$ and the (associative) products on the r.h.s. are well defined as products of matrices or as elements of U(G). Then, the following lemma holds:

Lemma 2.1. Let $[X_1, \dots, X_n]$ be as in (2.3) above. Then, for n even,
\[ \frac{1}{(n-1)!}\, \frac{1}{n!} \sum_{\sigma \in S_{2n-1}} (-1)^{\pi(\sigma)} [[X_{\sigma(1)}, \dots, X_{\sigma(n)}], X_{\sigma(n+1)}, \dots, X_{\sigma(2n-1)}] = 0 \tag{2.4} \]
is an identity, the generalised Jacobi identity (GJI) which for (2.1) implies the GJC (2.2); for n odd, the l.h.s. is proportional to $[X_1, \dots, X_{2n-1}]$.

Proof. Let $Q_p$ be the antisymmetriser for the symmetric group $S_p$ (i.e., the primitive "idempotent" [$Q_p^2 = p!\, Q_p$] in the Frobenius algebra of $S_p$, associated with the fully antisymmetric Young tableau). The sum in (2.4) contains $C^{n-1}_{2n-1}$ different terms ($(2n-1)!/(n-1)!\,n! = C^{n-1}_{2n-1}$). Consider the first of these, $[[X_1, \dots, X_n], X_{n+1}, \dots, X_{2n-1}]$. Its full expansion contains $(n!)^2$ terms, which may be written as the sum of n terms
\begin{align*}
[[X_1, \dots, X_n], X_{n+1}, \dots, X_{2n-1}] ={}& Q_n(X_1 X_2 \dots X_n)\, Q_{n-1}(X_{n+1} X_{n+2} \dots X_{2n-1}) \\
& - Q_{n-1}(X_{n+1}\, Q_n(X_1 X_2 \dots X_n)\, X_{n+2} \dots X_{2n-1}) \\
& + Q_{n-1}(X_{n+1} X_{n+2}\, Q_n(X_1 X_2 \dots X_n)\, X_{n+3} \dots X_{2n-1}) + \dots \\
& + (-1)^{n-2}\, Q_{n-1}(X_{n+1} X_{n+2} \dots X_{2n-2}\, Q_n(X_1 X_2 \dots X_n)\, X_{2n-1}) \\
& + (-1)^{n-1}\, Q_{n-1}(X_{n+1} X_{n+2} \dots X_{2n-1})\, Q_n(X_1 X_2 \dots X_n) , \tag{2.5}
\end{align*}
where the antisymmetriser $Q_n$ [$Q_{n-1}$] acts on the n [n − 1] indices (1, ..., n) [(n + 1, ..., 2n − 1)] only. This sum may be rewritten as
\begin{align*}
Q_n Q_{n-1} \{ & e + (-1)^n (1, n+1) + (1, n+1)(2, n+2) + (-)^n (1, n+1)(2, n+2)(3, n+3) + \dots \\
& + (-1)^n (1, n+1)(2, n+2) \dots (n-2, 2n-2) + (1, n+1) \dots (n-1, 2n-1) \}\, X_1 \dots X_{2n-1} , \tag{2.6}
\end{align*}
where (i, j) indicates the transposition in $S_{2n-1}$ which interchanges the indices i, j; thus, all the signs in (2.6) are positive for n even, and they alternate for n odd according to the parity of the accompanying permutation. Numerical factors apart, the l.h.s. of (2.4) is the result of the action of the $S_{2n-1}$ antisymmetriser in (2n − 1) indices, $Q_{2n-1}$, on (2.5) or (2.6). Since $\sigma Q_{2n-1} = (-1)^{\pi(\sigma)} Q_{2n-1}$ $\forall \sigma \in S_{2n-1}$, it turns out that $Q_{2n-1}(Q_n Q_{n-1}) \propto Q_{2n-1}$. Thus, only the action of $Q_{2n-1}$ on the curly bracket in (2.6) has to be considered. Since its permutations are half even and half odd, it becomes identically zero for n even and proportional to $Q_{2n-1}$ for n odd.

Lemma 2.1 shows that the higher-order bracket may be defined, as the Lie bracket, by the skew-symmetric product of an (even) number of generators. By analogy with the standard Lie algebra (s = 2) case, we may now give the following

Definition 2.2 (Higher-order Lie algebra). Let G be a Lie algebra. A higher-order Lie algebra on G is the algebra defined by the s-bracket (2.1), where the higher-order structure constants satisfy the generalised Jacobi condition (2.2).

Multibrackets appear naturally if we use for the basis $X_i$ of G a set of left-invariant vector fields (LIVF) on the group manifold⁴ of the Lie group G associated with G. Then, the exterior algebra $\wedge(G)$ may be identified as the exterior algebra of the LI contravariant, skew-symmetric tensor fields on G obtained by taking the exterior products of LIVF's with constant coefficients; this is analogous to the exterior algebra of LI covariant tensor fields (LI forms) on G. Then, in analogy with the exterior derivative of a LI q-form $\omega \in \wedge^q(G)$, an exterior coderivation $\partial : \wedge^q(G) \to \wedge^{q-1}(G)$, $\partial^2 = 0$, may be introduced by taking
\[ \partial(X_1 \wedge \dots \wedge X_q) = \sum_{\substack{l=1 \\ l<k}}^{q} (-1)^{l+k+1} [X_l, X_k] \wedge X_1 \wedge \dots \hat{X}_l \dots \hat{X}_k \dots \wedge X_q . \tag{2.7} \]
For instance, on $X_{i_1} \wedge X_{i_2} \wedge X_{i_3} \in \wedge^3(G)$, the statement $\partial^2(X_{i_1} \wedge X_{i_2} \wedge X_{i_3}) = 0$ is nothing but the standard Jacobi identity. If we now define $\partial_2(X_{i_1} \wedge X_{i_2}) = \epsilon^{j_1 j_2}_{i_1 i_2} X_{j_1} X_{j_2} = [X_{i_1}, X_{i_2}]$, the coderivation $\partial$ above corresponds to $\partial_2 : \wedge^q(G) \to \wedge^{q-1}(G)$. This may now be extended to a general even coderivation $\partial_s$, $\partial_s : \wedge^q(G) \to \wedge^{q-(s-1)}(G)$, $\partial_s^2 = 0$:

Definition 2.3 (coderivation $\partial_s$). Let s be even. The mapping $\partial_s : \wedge^s(G) \to \wedge^1(G) \sim G$ given by $\partial_s : X_1 \wedge \dots \wedge X_s \mapsto [X_1, \dots, X_s]$, where the s-bracket is given by Def. 2.1, may be extended to a higher-order coderivation $\partial_s : \wedge^n(G) \to \wedge^{n-s+1}$ by
\[ \partial_s(X_1 \wedge \dots \wedge X_n) = \frac{1}{s!\,(n-s)!}\, \epsilon^{i_1 \dots i_n}_{1 \dots n}\, \partial_s(X_{i_1} \wedge \dots \wedge X_{i_s}) \wedge X_{i_{s+1}} \wedge \dots \wedge X_{i_n} , \tag{2.8} \]
with $\partial_s \wedge^n(G) = 0$ for s > n. It follows from (2.2) that $\partial_s^2 = 0$.

For s = 2, Eq. (2.8) reduces to (2.7). On $X_{i_1} \wedge \dots \wedge X_{i_7} \in \wedge^7(G)$, for instance, $\partial_4^2 = 0$ leads to the GJI which must be satisfied by a 4th order Lie algebra. As mentioned, these higher-order algebras are particular cases of the strongly homotopy algebras [22, 23] of recent relevance in string field theory (see [25]). We shall now give explicit examples of higher-order algebras and, as a result, provide the classification of all higher-order simple Lie algebras.

⁴ On G, a vector field $X_i$ is expressed as $X_i^j(g)\, \partial/\partial g^j$, j = 1, ..., r, where $g^i$ are local coordinates of G at the unity.
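Lemma 2.1 lends itself to a brute-force check. The sketch below is not part of the original argument: it works in the free associative algebra on 2n − 1 symbols (so it relies on associativity only), represents an element as a dictionary from words to integer coefficients, and evaluates the l.h.s. of (2.4) as a signed sum over the choices of the n generators entering the inner bracket, which absorbs the 1/((n − 1)! n!) normalisation. The function names (`multibracket`, `gji_lhs`) are illustrative choices, and generators are labelled from 0.

```python
from collections import defaultdict
from itertools import combinations, permutations


def perm_sign(seq):
    """Sign of the permutation that sorts seq (parity of its inversion count)."""
    inversions = sum(
        1
        for a in range(len(seq))
        for b in range(a + 1, len(seq))
        if seq[a] > seq[b]
    )
    return -1 if inversions % 2 else 1


def multiply(x, y):
    """Product in the free associative algebra.

    An element is a dict mapping a word (tuple of generator labels)
    to its integer coefficient; words are multiplied by concatenation.
    """
    out = defaultdict(int)
    for w1, c1 in x.items():
        for w2, c2 in y.items():
            out[w1 + w2] += c1 * c2
    return out


def multibracket(elements):
    """Antisymmetrised product, Eq. (2.3), of a list of algebra elements."""
    total = defaultdict(int)
    for p in permutations(range(len(elements))):
        term = {(): perm_sign(p)}
        for k in p:
            term = multiply(term, elements[k])
        for word, coeff in term.items():
            total[word] += coeff
    return {w: c for w, c in total.items() if c != 0}


def gji_lhs(n):
    """L.h.s. of (2.4) for generators X_0, ..., X_{2n-2}.

    The normalised sum over S_{2n-1} equals a signed sum over the subsets
    of n generators placed in the inner bracket, because the nested bracket
    is antisymmetric in its inner and in its outer arguments separately.
    """
    size = 2 * n - 1
    gens = [{(i,): 1} for i in range(size)]
    total = defaultdict(int)
    for inner in combinations(range(size), n):
        outer = [i for i in range(size) if i not in inner]
        sign = perm_sign(list(inner) + outer)
        inner_bracket = multibracket([gens[i] for i in inner])
        nested = multibracket([inner_bracket] + [gens[i] for i in outer])
        for word, coeff in nested.items():
            total[word] += sign * coeff
    return {w: c for w, c in total.items() if c != 0}


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, len(gji_lhs(n)))
```

For n = 2 and n = 4 the result is the empty dictionary, so the identity holds; for n = 3 every word built from the five generators survives with coefficient ±3, i.e. the l.h.s. is proportional to the five-bracket, as the lemma states for odd n.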
3. Higher-Order Simple Lie Algebras. The Case of su(n)

Let G be now a simple Lie algebra. In what follows, we shall also assume G to be compact (although compactness is not essential in many reasonings below) so that the nondegenerate Killing matrix $k_{ij}$ may be taken as the unity $\delta_{ij}$ after suitable normalization of the generators. As mentioned, there are l primitive invariant polynomials for each simple algebra of rank l which are in turn related to the Casimir-Racah operators of the algebra [10–13, 15–17, 29, 30, 31, 39], to the Lie algebra cohomology for the trivial action and to the topology and de Rham cohomology of the associated simple compact Lie group [1, 3–8]. We now use this fact to provide a classification of the possible higher-order simple Lie algebras. Given a simple Lie algebra G, the orders $m_i$ of the l invariant polynomials (or of the generalised Casimir invariants) and of the l cocycles (or bi-invariant forms on the corresponding compact group G) are given by the following table:

G      dimension r = dim G      order of invariants           order of G-cocycles
                                m_1, ..., m_l                 (2m_1 − 1), ..., (2m_l − 1)
A_l    (l + 1)^2 − 1 [l ≥ 1]    2, 3, ..., l + 1              3, 5, ..., 2l + 1
B_l    l(2l + 1) [l ≥ 2]        2, 4, ..., 2l                 3, 7, ..., 4l − 1
C_l    l(2l + 1) [l ≥ 3]        2, 4, ..., 2l                 3, 7, ..., 4l − 1
D_l    l(2l − 1) [l ≥ 4]        2, 4, ..., 2l − 2, l          3, 7, ..., 4l − 5, 2l − 1
G_2    14                       2, 6                          3, 11
F_4    52                       2, 6, 8, 12                   3, 11, 15, 23
E_6    78                       2, 5, 6, 8, 9, 12             3, 9, 11, 15, 17, 23
E_7    133                      2, 6, 8, 10, 12, 14, 18       3, 11, 15, 19, 23, 27, 35
E_8    248                      2, 8, 12, 14, 18, 20, 24, 30  3, 15, 23, 27, 35, 39, 47, 59
Dimension of the Casimir-Racah invariants and Lie algebra cocycles for G simple.

We see that $\sum_{i=1}^{l} (2m_i - 1) = r$.

Definition 3.1 (Higher-order simple Lie algebras). A higher-order simple Lie algebra associated with a simple Lie algebra G is the higher-order algebra defined by a primitive G-cocycle (of order > 3) on G.

Thus, to find the higher-order simple Lie algebras one has to look for the invariant polynomials on them. For the compact forms on these groups, the cocycle orders are also the dimensions of the primitive de Rham cycles (odd spheres) to which the group manifolds are essentially equivalent. We shall now find explicit realizations of these algebras.

Consider first the case of su(n), n ≥ 3 with $k_{ij} \sim \delta_{ij}$ (there are no higher-order simple Lie algebras on su(2)). In terms of its structure constants (for hermitian generators $T_i$) $[T_i, T_j] = i C_{ijk} T_k$, the anticommutator of two n × n su(l + 1) matrices may be expressed as $\{T_i, T_j\} = c\,\delta_{ij} + d_{ijk} T_k$ (with c = 1/n, $\mathrm{Tr}(T_i T_j) = \frac{1}{2}\delta_{ij}$). The $d_{ijk} \propto \mathrm{Tr}(T_i \{T_j, T_k\})$ term (absent for su(2)) is the first example of a symmetric invariant polynomial (of 3rd order⁵) beyond the Killing tensor $k_{ij}$ (see the table). Invariant, symmetric polynomials are given by the symmetric traces (sTr) of products of su(n) generators (cf. the theory of characteristic classes). Let us then consider the next case, $m_3 = 4$. The coordinates of this fourth-order polynomial $k_{i_1 i_2 i_3 i_4}$ are given by $\mathrm{sTr}(T_{i_1} T_{i_2} T_{i_3} T_{i_4})$ or (ignoring numerical factors) by $\mathrm{Tr}(s(T_{i_1} T_{i_2} T_{i_3}) T_{i_4}) \propto \mathrm{Tr}(s(\{\{T_{i_1}, T_{i_2}\}, T_{i_3}\}) T_{i_4}) \propto (d_{(i_1 i_2 l}\, d_{l i_3) i_4} + 2c\, \delta_{(i_1 i_2} \delta_{i_3) i_4})$, where s symmetrises the $i_1, i_2, i_3$ indices. Thus, we may take
\begin{align*}
k_{i_1 i_2 i_3 i_4} ={}& d_{i_1 i_2 l}\, d_{l i_3 i_4} + d_{i_1 i_3 l}\, d_{l i_2 i_4} + d_{i_1 i_4 l}\, d_{l i_2 i_3} \\
& + 2c\, (\delta_{i_1 i_2} \delta_{i_3 i_4} + \delta_{i_1 i_3} \delta_{i_2 i_4} + \delta_{i_1 i_4} \delta_{i_2 i_3}) . \tag{3.1}
\end{align*}
Clearly, the last term will not generate a primitive 4th order Casimir operator⁶, since it is proportional to the square of the second order one, $(I_2)^2$. Equation (3.1) reflects the well known ambiguity in the selection of the higher-order Casimirs for the simple Lie algebras (see, e.g., [11, 13, 17, 33]). The first part, which generalises easily up to $k_{i_1 \dots i_n}$, leads to the form of the Casimir-Racah operator $I_n$ given in [12].

⁵ For the properties of the d-tensors see [32].

We are now in a position to introduce all $A_l$ higher-order simple Lie algebras.

Theorem 3.1 (Higher-order $A_l$ Lie algebras). Let $X_i$ be a basis of $A_l$, $i = 1, \dots, (l + 1)^2 - 1$. Then, the even multibracket
\[ [X_{i_1}, \dots, X_{i_{2m-2}}] := \epsilon^{j_1 \dots j_{2m-2}}_{i_1 \dots i_{2m-2}}\, X_{j_1} \dots X_{j_{2m-2}} \tag{3.2} \]
is G-valued and defines a higher-order simple Lie algebra
\[ [X_{i_1}, \dots, X_{i_{2m-2}}] = \omega_{i_1 \dots i_{2m-2}}{}^{\sigma}_{\cdot}\, X_\sigma , \tag{3.3} \]
where the higher-order structure constants $\omega_{i_1 \dots i_{2m-2}}{}^{\sigma}_{\cdot}$ associated to the invariant polynomial $k_{i_1 \dots i_m}$ are given by the skew-symmetric tensor
\[ \omega_{i_1 \dots i_{2m-2} \sigma} = \epsilon^{j_2 \dots j_{2m-2}}_{i_2 \dots i_{2m-2}}\, C^{l_1}_{i_1 j_2} \dots C^{l_{m-1}}_{j_{2m-3} j_{2m-2}}\, k_{l_1 \dots l_{m-1} \sigma} , \tag{3.4} \]
which defines a non-trivial (2m − 1)-cocycle for su(l + 1), 3 ≤ m ≤ l + 1 (m = 2 is the standard Lie algebra).

Before presenting a general proof, let us illustrate the theorem in the two simplest cases. For m = 2 Eq. (3.4) reads
\[ \omega_{i_1 i_2 \sigma} = \delta^{j_2}_{i_2}\, C^{l_1}_{i_1 j_2}\, k_{l_1 \sigma} = k([X_{i_1}, X_{i_2}], X_\sigma) , \tag{3.5} \]
and the $\omega_{i_1 i_2 \sigma}$ are the standard structure constants of G. Thus, the m = 2 (lowest) polynomial corresponds to the ordinary (su(n), in this case) Lie algebra commutators. Let m = 3. If d denotes the symmetric polynomial, Eq. (3.4) gives
\begin{align*}
\omega_{i_1 i_2 i_3 i_4 \sigma} &= \epsilon^{j_2 j_3 j_4}_{i_2 i_3 i_4}\, C^{l_1}_{i_1 j_2}\, C^{l_2}_{j_3 j_4}\, d_{l_1 l_2 \sigma} \\
&= \epsilon^{j_2 j_3 j_4}_{i_2 i_3 i_4}\, d([X_{i_1}, X_{j_2}], [X_{j_3}, X_{j_4}], X_\sigma) , \tag{3.6}
\end{align*}
which is the expression of the fully antisymmetric five-cocycle. On the other hand,
\begin{align*}
[X_{i_1}, X_{i_2}, X_{i_3}, X_{i_4}] &= \epsilon^{j_1 j_2 j_3 j_4}_{i_1 i_2 i_3 i_4}\, X_{j_1} \dots X_{j_4} = \frac{1}{2^2}\, \epsilon^{j_1 j_2 j_3 j_4}_{i_1 i_2 i_3 i_4}\, [X_{j_1}, X_{j_2}][X_{j_3}, X_{j_4}] \\
&= \frac{1}{2^2}\, \epsilon^{j_1 j_2 j_3 j_4}_{i_1 i_2 i_3 i_4}\, C^{k}_{j_1 j_2}\, C^{l}_{j_3 j_4}\, X_k X_l . \tag{3.7}
\end{align*}

⁶ Notice that l ≥ 3 for $k_{i_1 i_2 i_3 i_4}$ to be primitive. For SU(3), the identity $d_{l(i_1 i_2}\, d_{i_3) i_4 l} = \frac{1}{3}\, \delta_{(i_1 i_2} \delta_{i_3) i_4}$, where the brackets mean symmetrisation, precludes Eq. (3.1) from producing a primitive fourth-order invariant. Similar type relations hold for higher ranks [32] (see also [41], where higher order Casimir operators are used to introduce the so-called Casimir W-algebras).
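The relations quoted for su(3) can be verified numerically. The sketch below assumes the standard Gell-Mann matrices λ_i with T_i = λ_i/2 (a conventional choice, not fixed by the text), extracts the d-tensor from d_ijk = 2 Tr({T_i, T_j} T_k), and returns the largest deviations from Tr(T_i T_j) = δ_ij/2, from {T_i, T_j} = δ_ij/3 + d_ijk T_k, and from the identity of footnote 6 written out as d_ijm d_klm + d_ikm d_jlm + d_ilm d_jkm = (δ_ij δ_kl + δ_ik δ_jl + δ_il δ_jk)/3. Indices run from 0 and all function names are illustrative.

```python
from itertools import product
from math import sqrt

R3 = 1 / sqrt(3)
# Standard Gell-Mann matrices (assumed convention); T_i = lambda_i / 2.
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[R3, 0, 0], [0, R3, 0], [0, 0, -2 * R3]],
]
T = [[[x / 2 for x in row] for row in lam] for lam in GELL_MANN]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def trace(a):
    return sum(a[i][i] for i in range(3))


def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(3)] for i in range(3)]


def d_tensor():
    """d_ijk = 2 Tr({T_i, T_j} T_k), stored as a dict on index triples."""
    return {
        (i, j, k): complex(2 * trace(matmul(anticommutator(T[i], T[j]), T[k]))).real
        for i, j, k in product(range(8), repeat=3)
    }


def normalisation_residual():
    """Largest deviation from Tr(T_i T_j) = delta_ij / 2."""
    return max(
        abs(trace(matmul(T[i], T[j])) - (0.5 if i == j else 0.0))
        for i, j in product(range(8), repeat=2)
    )


def anticommutator_residual(d):
    """Largest deviation from {T_i, T_j} = delta_ij / 3 + d_ijk T_k."""
    worst = 0.0
    for i, j in product(range(8), repeat=2):
        lhs = anticommutator(T[i], T[j])
        for a, b in product(range(3), repeat=2):
            rhs = sum(d[i, j, k] * T[k][a][b] for k in range(8))
            if i == j and a == b:
                rhs += 1 / 3
            worst = max(worst, abs(lhs[a][b] - rhs))
    return worst


def footnote_identity_residual(d):
    """Largest deviation from the su(3) identity quoted in footnote 6."""
    def delta(a, b):
        return 1.0 if a == b else 0.0

    worst = 0.0
    for i, j, k, l in product(range(8), repeat=4):
        lhs = sum(
            d[i, j, m] * d[k, l, m] + d[i, k, m] * d[j, l, m] + d[i, l, m] * d[j, k, m]
            for m in range(8)
        )
        rhs = (delta(i, j) * delta(k, l) + delta(i, k) * delta(j, l)
               + delta(i, l) * delta(j, k)) / 3
        worst = max(worst, abs(lhs - rhs))
    return worst
```

All three residuals vanish up to rounding, which illustrates why, for su(3), the d-dependent part of (3.1) reduces to products of Kronecker deltas and no primitive fourth-order invariant arises.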
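Taking into account that $\epsilon^{j_1 j_2 j_3 j_4}_{i_1 i_2 i_3 i_4}\, C^{k}_{j_1 j_2}\, C^{l}_{j_3 j_4}$ is symmetric in k, l this is equal to
\[ \frac{1}{2^3}\, \epsilon^{j_1 j_2 j_3 j_4}_{i_1 i_2 i_3 i_4}\, C^{l_1}_{j_1 j_2}\, C^{l_2}_{j_3 j_4}\, (d_{l_1 l_2}{}^{\sigma}_{\cdot}\, X_\sigma + c\, \delta_{l_1 l_2}) . \tag{3.8} \]
The term in c may be dropped since, for each $j_4$, it is proportional to the antisymmetrised sum $C_{j_1 j_2 l}\, C^{l}_{j_3 j_4}$ in $j_1, j_2, j_3$ which is zero by the Jacobi identity. Using now that
\[ \epsilon^{j_1 j_2 j_3 j_4}_{i_1 i_2 i_3 i_4} = \sum_{s=1}^{4} (-)^{s+1}\, \delta^{j_s}_{i_1}\, \epsilon^{j_1 \dots \hat{j}_s \dots j_4}_{i_2 i_3 i_4} , \tag{3.9} \]
it is easy to see that all the terms in (3.9) give the same contribution for the remaining d term in (3.8). Hence, the fourth-commutator
\[ [X_{i_1}, X_{i_2}, X_{i_3}, X_{i_4}] = \frac{1}{2}\, \epsilon^{j_2 j_3 j_4}_{i_2 i_3 i_4}\, C^{l_1}_{i_1 j_2}\, C^{l_2}_{j_3 j_4}\, d_{l_1 l_2}{}^{\sigma}_{\cdot}\, X_\sigma \tag{3.10} \]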
is indeed of the form (3.6), and it may be checked explicitly that it is in su(3). The proof of Theorem 3.1 requires now the following simple

Lemma 3.1. If $k_{l_1 \dots l_m}$ is an ad-invariant, symmetric polynomial on a simple Lie algebra G,
\[ \epsilon^{j_1 \dots j_{2m}}_{i_1 \dots i_{2m}}\, C^{l_1}_{j_1 j_2} \dots C^{l_m}_{j_{2m-1} j_{2m}}\, k_{l_1 \dots l_m} = 0 . \tag{3.11} \]
Proof. First, we note that the ad-invariance condition of the m-tensor k may be expressed in coordinates by
\[ \sum_{s=1}^{m} C^{k}_{j_{2m-1} l_s}\, k_{l_1 \dots l_{s-1} k\, l_{s+1} \dots l_m} = 0 . \tag{3.12} \]
Hence, replacing $C^{l_m}_{j_{2m-1} j_{2m}}\, k_{l_1 \dots l_m}$ in the l.h.s. of (3.11) by the other terms in (3.12) we get
\[ \epsilon^{j_1 \dots j_{2m}}_{i_1 \dots i_{2m}}\, C^{l_1}_{j_1 j_2} \dots C^{l_{m-1}}_{j_{2m-3} j_{2m-2}} \Big( \sum_{s=1}^{m-1} C^{k}_{j_{2m-1} l_s}\, k_{l_1 \dots l_{s-1} k\, l_{s+1} \dots l_{m-1} j_{2m}} \Big) , \tag{3.13} \]
which vanishes since all terms in the sum include products of the form $C^{s}_{j j'}\, C^{k}_{s j''}$ antisymmetrised in j, j', j'', which are zero due to the standard JC (1.1).
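The m = 3 illustration of Theorem 3.1 can be probed numerically for su(3). The sketch below assumes Hermitian generators T_i = λ_i/2 built from the standard Gell-Mann matrices and defines the coefficients through [T_a, T_b, T_c, T_d] = Σ_e ω_abcde T_e with ω_abcde = 2 Tr([T_a, T_b, T_c, T_d] T_e). This normalisation, and the factors of i relating the T_i to the X_i of the text, are choices made for the example, so only structural properties are tested: the four-bracket is traceless, it is reproduced by its expansion on the T_e, and ω is real, vanishes for repeated indices and changes sign when the last index is exchanged with one of the first four. Indices run from 0 and the function names are illustrative.

```python
from itertools import combinations, permutations
from math import sqrt

R3 = 1 / sqrt(3)
# Standard Gell-Mann matrices (assumed convention); T_i = lambda_i / 2.
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[R3, 0, 0], [0, R3, 0], [0, 0, -2 * R3]],
]
T = [[[x / 2 for x in row] for row in lam] for lam in GELL_MANN]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def trace(a):
    return sum(a[i][i] for i in range(3))


def perm_sign(seq):
    """Sign of the permutation that sorts seq (distinct entries assumed)."""
    inversions = sum(1 for a in range(len(seq))
                     for b in range(a + 1, len(seq)) if seq[a] > seq[b])
    return -1 if inversions % 2 else 1


def four_bracket(indices):
    """Antisymmetrised product (2.3) of four generators, indices increasing."""
    total = [[0j] * 3 for _ in range(3)]
    for p in permutations(indices):
        sign = perm_sign(p)
        prod = T[p[0]]
        for k in p[1:]:
            prod = matmul(prod, T[k])
        for i in range(3):
            for j in range(3):
                total[i][j] += sign * prod[i][j]
    return total


def five_cocycle_checks():
    """Return (residuals, omega) for the su(3) four-bracket."""
    omega = {}
    worst_trace = worst_expansion = 0.0
    for quad in combinations(range(8), 4):
        m = four_bracket(quad)
        worst_trace = max(worst_trace, abs(trace(m)))
        coeffs = [2 * trace(matmul(m, T[e])) for e in range(8)]
        for e in range(8):
            omega[quad + (e,)] = coeffs[e]
        for i in range(3):
            for j in range(3):
                rebuilt = sum(coeffs[e] * T[e][i][j] for e in range(8))
                worst_expansion = max(worst_expansion, abs(m[i][j] - rebuilt))
    worst_imag = max(abs(v.imag) for v in omega.values())
    worst_repeat = max(abs(v) for key, v in omega.items() if key[4] in key[:4])
    worst_antisym = 0.0
    for five in combinations(range(8), 5):
        values = []
        for k in range(5):
            key = five[:k] + five[k + 1:] + (five[k],)
            values.append(perm_sign(key) * omega[key])
        worst_antisym = max(worst_antisym, max(abs(v - values[0]) for v in values))
    residuals = {
        "trace": worst_trace,
        "expansion": worst_expansion,
        "imaginary part": worst_imag,
        "repeated index": worst_repeat,
        "antisymmetry": worst_antisym,
    }
    return residuals, omega
```

All residuals vanish up to rounding while ω itself does not vanish identically, in line with the statement that the four-bracket closes on su(3) with a fully antisymmetric five-index tensor as structure constants.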
To prove now Theorem 3.1, we write the (2m − 2) bracket as
\begin{align*}
[X_{i_1}, \dots, X_{i_{2m-2}}] &= \frac{1}{2^{m-1}}\, \epsilon^{j_1 \dots j_{2m-2}}_{i_1 \dots i_{2m-2}}\, [X_{j_1}, X_{j_2}] \dots [X_{j_{2m-3}}, X_{j_{2m-2}}] \\
&= \frac{1}{2^{m-1}}\, \epsilon^{j_1 \dots j_{2m-2}}_{i_1 \dots i_{2m-2}}\, C^{l_1}_{j_1 j_2} \dots C^{l_{m-1}}_{j_{2m-3} j_{2m-2}}\, \frac{1}{(m-1)!}\, s(X_{l_1} \dots X_{l_{m-1}}) , \tag{3.14}
\end{align*}
where we have used (cf. the m = 3 case) that $\epsilon^{j_1 \dots j_{2m-2}}_{i_1 \dots i_{2m-2}}\, C^{l_1}_{j_1 j_2} \dots C^{l_{m-1}}_{j_{2m-3} j_{2m-2}}$ is symmetric in $l_1, \dots, l_{m-1}$ to introduce the symmetrised product of generators, which in turn may be replaced, adding the appropriate factors, by $s(\{\{\dots\{X_{l_1}, X_{l_2}\}, X_{l_3}\}, \dots, X_{l_{m-1}}\})$. Using that $\{X_i, X_j\} = c\,\delta_{ij} + d_{ijk} X_k$ in the expression of the nested anticommutators, we then conclude that it has the form
\[ (\text{factors})\; s(X_{l_1} \dots X_{l_{m-1}}) = \tilde{k}_{l_1 \dots l_{m-1}}{}^{\sigma}_{\cdot}\, X_\sigma + \hat{k}_{l_1 \dots l_{m-1}}\, 1 . \tag{3.15} \]
By Lemma 3.1, the second term does not contribute to (3.14) because $\hat{k}$ is an invariant polynomial of (m − 1)-order. On the other hand since $\mathrm{Tr}(s(X_{l_1} \dots X_{l_{m-1}}) X_\sigma) \propto \mathrm{sTr}(X_{l_1} \dots X_{l_{m-1}} X_\sigma)$, we conclude that $\tilde{k}_{l_1 \dots l_{m-1} \sigma}$ is an invariant symmetric mth order polynomial. Absorbing all numerical factors in $\tilde{k}$ and renaming it as k, we find that the (2m − 2)-commutator in (3.14) is given by
\[ \frac{1}{(2m-2)}\, \epsilon^{j_1 \dots j_{2m-2}}_{i_1 \dots i_{2m-2}}\, C^{l_1}_{j_1 j_2} \dots C^{l_{m-1}}_{j_{2m-3} j_{2m-2}}\, k_{l_1 \dots l_{m-1}}{}^{\sigma}_{\cdot}\, X_\sigma = \omega_{i_1 \dots i_{2m-2}}{}^{\sigma}_{\cdot}\, X_\sigma \tag{3.16} \]
i.e., by the (2m − 1)-cocycle (3.4). Since (3.2) is given by the product of associative operators, the GJC (2.2) follows from the GJI (2.4). Equivalently, one may show that the cocycle condition for $\omega_{i_1 \dots i_{2m-2} \sigma}$ guarantees that the GJC is satisfied (see the remark after Theorem 5.1 below). This establishes the connection between Lie algebra cohomology cocycles and higher-order Lie algebras.

4. Higher-Order Orthogonal and Symplectic Algebras

We now extend to the $B_l$ (l ≥ 2), $C_l$ (l ≥ 3), $D_l$ (l ≥ 4) series the considerations in Sect. 3 for $A_l$. First we notice that for all of them the third-order symmetric polynomial is absent and that only for the even orthogonal algebra $D_l$ (and odd l) we may have an odd-order invariant polynomial. We shall ignore this case for a moment, and look first for the even-order symmetric polynomials. Let us realise the generators of the above algebras in terms of the n × n matrices of the defining representation, where n = (2l + 1, 2l, 2l) for ($B_l$, $C_l$, $D_l$) respectively. These matrices T have all in common the metric preserving defining property $T g = -g T^t$, where g is the n × n unit matrix for the orthogonal algebras and the symplectic metric for $C_l$. If we define the symmetric third-order anticommutator by
\[ \{T_1, T_2, T_3\} = \sum_{\sigma \in S_3} T_{\sigma(1)} T_{\sigma(2)} T_{\sigma(3)} \equiv s(T_1 T_2 T_3) , \tag{4.1} \]
it is trivial to check that $\{T_1, T_2, T_3\}\, g = -g\, \{T_1, T_2, T_3\}^t$ so that $\{T_1, T_2, T_3\} \in G$ (G = so(2l + 1), sp(2l) or so(2l)). Notice that such a relation cannot be satisfied for the ordinary anticommutator, and that in general requires odd-order anticommutators in order to preserve the minus sign in the r.h.s. Note also the absence of the identity matrix in the r.h.s. of the odd-order anticommutator, which was allowed for $A_l$. Let then $\{T_{i_1}, T_{i_2}, T_{i_3}\} = k_{i_1 i_2 i_3}{}^{\sigma}_{\cdot}\, T_\sigma$. Extending this result to the arbitrary odd case, we find

Lemma 4.1. The symmetrised product of an odd number of n × n matrix generators of so(2l + 1), sp(2l) or so(2l) is also an element of these algebras which is determined by the associated invariant symmetric polynomial.

Proof.
\begin{align*}
s(T_{i_1} T_{i_2} T_{i_3} T_{i_4} \dots T_{i_{2p-1}}) &= \frac{1}{6^{p-1}}\, s(\{\dots\{\{T_{i_1}, T_{i_2}, T_{i_3}\}, T_{i_4}, T_{i_5}\}, \dots, T_{i_{2p-2}}, T_{i_{2p-1}}\}) \\
&= \frac{1}{6^{p-1}}\, s(k_{i_1 i_2 i_3}{}^{\alpha_1}_{\cdot}\, k_{\alpha_1 i_4 i_5}{}^{\alpha_2}_{\cdot} \dots k_{\alpha_{p-2} i_{2p-2} i_{2p-1}}{}^{\beta}_{\cdot}\, T_\beta) . \tag{4.2}
\end{align*}
Since s symmetrises all $i_1, i_2, \dots, i_{2p-1}$ indices we may write this as
\[ \{T_{i_1}, \dots, T_{i_{2p-1}}\} = k_{i_1 \dots i_{2p-1}}{}^{\sigma}_{\cdot}\, T_\sigma , \tag{4.3} \]
and identify k with the invariant symmetric polynomial of even order 2p (see Table) since $\mathrm{Tr}(\{T_{i_1}, \dots, T_{i_{2p-1}}\} T_\sigma)$ is equal to
\[ \mathrm{sTr}(T_{i_1} \dots T_{i_{2p-1}} T_\sigma) = k_{i_1 \dots i_{2p-1} \sigma} . \tag{4.4} \]
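Lemma 4.1 and the remark on even-order anticommutators can be tested with exact integer arithmetic. In the sketch below (an illustration, not the authors' construction) so(n) is spanned by the matrices E_ab − E_ba with g the unit matrix, and sp(2l) by the matrices SJ with S symmetric and J the standard symplectic form, which is one convenient way of solving T g = −g T^t; the helper names are hypothetical.

```python
from itertools import combinations_with_replacement, permutations


def zeros(n):
    return [[0] * n for _ in range(n)]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def so_basis(n):
    """Antisymmetric matrices E_ab - E_ba, a < b."""
    basis = []
    for a in range(n):
        for b in range(a + 1, n):
            m = zeros(n)
            m[a][b], m[b][a] = 1, -1
            basis.append(m)
    return basis


def symplectic_form(l):
    """J = [[0, 1], [-1, 0]] in l x l blocks."""
    j = zeros(2 * l)
    for a in range(l):
        j[a][a + l], j[a + l][a] = 1, -1
    return j


def sp_basis(l):
    """Matrices S J with S symmetric; they satisfy T J = -J T^t."""
    n, j = 2 * l, symplectic_form(l)
    basis = []
    for a in range(n):
        for b in range(a, n):
            s = zeros(n)
            s[a][b] = s[b][a] = 1
            basis.append(matmul(s, j))
    return basis


def symmetrised_product(mats):
    """s(T_1 ... T_k): sum of the products over all orderings, cf. (4.1)."""
    n = len(mats[0])
    total = zeros(n)
    for p in permutations(mats):
        prod = p[0]
        for m in p[1:]:
            prod = matmul(prod, m)
        for i in range(n):
            for j in range(n):
                total[i][j] += prod[i][j]
    return total


def in_algebra(m, g):
    """True if m g = -g m^t holds exactly."""
    left, right = matmul(m, g), matmul(g, transpose(m))
    n = len(m)
    return all(left[i][j] == -right[i][j] for i in range(n) for j in range(n))


def products_stay_in_algebra(basis, g, order):
    """Check all symmetrised products of `order` basis elements."""
    return all(
        in_algebra(symmetrised_product(list(choice)), g)
        for choice in combinations_with_replacement(basis, order)
    )
```

For example, `products_stay_in_algebra(so_basis(4), identity(4), 3)` and `products_stay_in_algebra(sp_basis(2), symplectic_form(2), 3)` both return True, whereas the ordinary anticommutator of `so_basis(4)[0]` with itself fails `in_algebra`, in agreement with the observation that only odd-order anticommutators preserve the minus sign.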
This now leads to the following

Theorem 4.1. Let G be a simple orthogonal or symplectic algebra. Let $k_{i_1 \dots i_{2p}}$ be as in (4.4) for 2 ≤ p ≤ l ($B_l$, $C_l$) and 2 ≤ p ≤ l − 1 ($D_l$). Then, the even (4p − 2) bracket defined as in (3.2) defines a higher-order orthogonal or symplectic algebra, the structure constants of which are given by the Lie algebra (4p − 1)-cocycles associated with the symmetric invariant polynomials on G.

Proof. It suffices to use Lemma 4.1 and to insert (4.3) in expression (3.14). As a result, the (4p − 1)-cocycle is given again by (3.4), where the $k_{i_1 \dots i_{2p-1} \sigma}$ is now found in (4.4).

Let us consider now the order l invariant for so(2l). The reasonings before Lemma 4.1 show that, for l odd, the order l invariant polynomial cannot be obtained from the symmetric trace of (l − 1) 2l × 2l T's, since the symmetrised bracket of an even number of T's cannot be expressed as a linear combination of the 2l × 2l matrix generators of so(2l). It is well known, however, that for so(2l) there is an order l (even or odd) invariant polynomial (which gives the Euler class of a real oriented vector bundle with even-dimensional fibre) which comes from the Pfaffian, since $\mathrm{Pf}(A T A^t) = \mathrm{Pf}(T)$ for $A \in SO(2l)$. Using pairs of indices to relabel the generators $T_i$, $i = 1, \dots, \binom{2l}{2}$, as $T_{\mu\nu} = -T_{\nu\mu}$, $\mu, \nu = 1, \dots, 2l$, the order l invariant (corresponding to the last one in the table for $D_l$) is given by
\[ \mathrm{Pf}(T) = \frac{(-1)^l}{2^l\, l!}\, \epsilon^{\mu_1 \nu_1 \dots \mu_l \nu_l}_{1 \dots 2l}\, T_{\mu_1 \nu_1} T_{\mu_2 \nu_2} \dots T_{\mu_l \nu_l} . \tag{4.5} \]
The antisymmetric tensor $\epsilon_{\mu_1 \nu_1 \mu_2 \nu_2 \dots \mu_l \nu_l}$ defining the invariant is symmetric under the exchange of pairs of indices ($\mu_i \nu_i$), i = 1, ..., l. Although it cannot be obtained as the symmetric trace of a product of 2l × 2l generators it may be obtained again in the standard way if we use an appropriate spinorial representation for so(2l). This means that the previous arguments may be also carried through to the lth order invariant of the $D_l$ algebra. To see it explicitly, consider the $2^l$-dimensional Clifford algebra $\{\Gamma_\mu, \Gamma_\nu\} = 2\delta_{\mu\nu}$ ($\mu, \nu = 1, \dots, 2l$). The $\binom{2l}{2}$ Spin(2l) generators are given by $\Sigma_{\mu\nu} = \frac{i}{2} [\Gamma_\mu, \Gamma_\nu]$, and the $\Gamma_{2l+1}$ matrix by
\[ \Gamma_{2l+1} = \frac{i^l}{(2l)!}\, \epsilon_{\mu_1 \dots \mu_{2l}}\, \Gamma_{\mu_1} \dots \Gamma_{\mu_{2l}} ; \qquad \Gamma_\mu^\dagger = \Gamma_\mu , \quad \Gamma_{2l+1}^\dagger = \Gamma_{2l+1} . \]
Thus, we may write with all indices different, $\mu_1 \neq \nu_1 \neq \dots \neq \mu_{l-1} \neq \nu_{l-1} \neq \alpha \neq \beta$,
\[ \Gamma_{\mu_1} \Gamma_{\nu_1} \dots \Gamma_{\mu_{l-1}} \Gamma_{\nu_{l-1}} \propto \epsilon_{\mu_1 \nu_1 \dots \mu_{l-1} \nu_{l-1} \alpha \beta}\, \Gamma_{2l+1} \Gamma_\alpha \Gamma_\beta . \tag{4.6} \]
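Relation (4.6) can be checked in an explicit representation. The sketch below assumes a Jordan-Wigner type construction of the Γ_µ as tensor products of Pauli matrices (one of many equivalent choices) and takes Γ_{2l+1} = i^l Γ_1 ⋯ Γ_{2l}, which is what the antisymmetrised definition reduces to for anticommuting matrices. For distinct indices and l ≥ 2 it computes the constant c in Γ_{µ1} Γ_{ν1} ⋯ Γ_{ν(l−1)} = c Γ_{2l+1} Γ_α Γ_β together with the residual of that equality. Indices run from 0, and all function names are illustrative.

```python
from itertools import permutations

SIGMA1 = [[0, 1], [1, 0]]
SIGMA2 = [[0, -1j], [1j, 0]]
SIGMA3 = [[1, 0], [0, -1]]
IDENTITY2 = [[1, 0], [0, 1]]


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def perm_sign(seq):
    inversions = sum(1 for a in range(len(seq))
                     for b in range(a + 1, len(seq)) if seq[a] > seq[b])
    return -1 if inversions % 2 else 1


def gamma_matrices(l):
    """2l anticommuting Hermitian matrices of size 2^l (assumed representation)."""
    gammas = []
    for k in range(l):
        for pauli in (SIGMA1, SIGMA2):
            g = [[1]]
            for factor in [SIGMA3] * k + [pauli] + [IDENTITY2] * (l - k - 1):
                g = kron(g, factor)
            gammas.append(g)
    return gammas


def chirality_matrix(gammas):
    """Gamma_{2l+1} = i^l Gamma_1 ... Gamma_{2l}."""
    l = len(gammas) // 2
    prod = gammas[0]
    for g in gammas[1:]:
        prod = matmul(prod, g)
    phase = 1j ** l
    return [[phase * x for x in row] for row in prod]


def clifford_residual(gammas):
    """Largest deviation from {Gamma_mu, Gamma_nu} = 2 delta_{mu nu}."""
    n = len(gammas[0])
    worst = 0.0
    for mu, g_mu in enumerate(gammas):
        for nu, g_nu in enumerate(gammas):
            a, b = matmul(g_mu, g_nu), matmul(g_nu, g_mu)
            for i in range(n):
                for j in range(n):
                    target = 2 if (mu == nu and i == j) else 0
                    worst = max(worst, abs(a[i][j] + b[i][j] - target))
    return worst


def relation_46(gammas, indices):
    """Return (c, residual) for P = c * Gamma_{2l+1} Gamma_alpha Gamma_beta.

    P is the ordered product of the gammas labelled by indices[:-2];
    (alpha, beta) are the last two entries of `indices` (all distinct, l >= 2).
    """
    n = len(gammas[0])
    p = gammas[indices[0]]
    for mu in indices[1:-2]:
        p = matmul(p, gammas[mu])
    q = matmul(matmul(chirality_matrix(gammas), gammas[indices[-2]]),
               gammas[indices[-1]])
    # q is unitary, so the proportionality constant is Tr(q^dagger p) / n.
    c = sum(complex(q[k][i]).conjugate() * p[k][i]
            for i in range(n) for k in range(n)) / n
    residual = max(abs(p[i][j] - c * q[i][j]) for i in range(n) for j in range(n))
    return c, residual
```

In this representation the residual vanishes for every ordering of distinct indices, |c| = 1, and c changes sign with the parity of the ordering, which is the content of the ε tensor on the r.h.s. of (4.6).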
Antisymmetrising the (l − 1) pairs of gammas this leads to
\[ \Sigma_{\mu_1 \nu_1} \Sigma_{\mu_2 \nu_2} \dots \Sigma_{\mu_{l-1} \nu_{l-1}} \propto \epsilon_{\mu_1 \nu_1 \dots \mu_{l-1} \nu_{l-1} \alpha \beta}\, \Gamma_{2l+1} \Sigma_{\alpha\beta} , \tag{4.7} \]
an expression which is symmetric in the ($\mu\nu$) pairs which are all different. To check that the definition (2.3) for the (2l − 2) bracket is indeed so(2l)-valued, we notice that the so(2l) commutators $[\Sigma_{\mu\nu}, \Sigma_{\rho\sigma}] \equiv i\, C^{\lambda\kappa}_{(\mu\nu)(\rho\sigma)}\, \Sigma_{\lambda\kappa}$ are non-zero only if the pairs ($\mu\nu$), ($\rho\sigma$) have one (and only one) index in common. Thus, the only non-zero (2l − 2)-brackets have the form $[\Sigma_{i_1 k}, \Sigma_{i_2 k}, \dots, \Sigma_{i_{2l-2} k}]$ where all indices i are different. Since the ordinary product of such Σ's sharing an index is already antisymmetric, we find that (cf. (3.14))
\begin{align*}
[\Sigma_{i_1 k}, \Sigma_{i_2 k}, \dots, \Sigma_{i_{2l-2} k}] &\propto C^{j_1 j_2}_{(i_1 k)(i_2 k)} \dots C^{j_{2l-3} j_{2l-2}}_{(i_{2l-3} k)(i_{2l-2} k)}\, \{\Sigma_{j_1 j_2}, \dots, \Sigma_{j_{2l-3} j_{2l-2}}\} \\
&\propto \omega_{i_1 k, \dots, i_{2l-2} k}{}^{\alpha\beta}_{\cdot}\, \Gamma_{2l+1} \Sigma_{\alpha\beta} , \tag{4.8}
\end{align*}
and we may now use the chiral projectors $\frac{1}{2}(1 \pm \Gamma_{2l+1})$ to extract from the reducible $2^l \times 2^l$ representation $\Sigma_{\mu\nu}$ its two irreducible $2^{l-1} \times 2^{l-1}$ components.

5. Higher-Order Simple Lie Algebras and Their Complete BRST Operator

The case of the exceptional algebras requires more care, and we shall not discuss here their realization. We may nevertheless state the following

Theorem 5.1 (Classification theorem for higher-order simple algebras). Given a simple Lie algebra G of rank l, there are (l − 1) $(2m_i - 2)$-higher-order simple algebras associated with G. They are given by the (l − 1) Lie algebra cocycles of order $(2m_i - 1) > 3$ which may be obtained from the (l − 1) symmetric invariant polynomials on G of order $m_i > m_1 = 2$. The $m_1 = 2$ case (Killing metric) reproduces the original simple Lie algebra G; for the other (l − 1) cases, the skew-symmetric $(2m_i - 2)$-commutators define an element of G by means of the $(2m_i - 1)$-cocycles. These higher-order structure constants (as the ordinary structure constants with all indices are written down) are fully antisymmetric and satisfy, by virtue of being Lie algebra cocycles, the generalised Jacobi condition (2.2).

Remark 2. It may be checked explicitly that the coordinate definition of the cocycles $\omega_{i_1 \dots i_{2m-2}}{}^{\sigma}_{\cdot}$ and the invariance condition (3.12) for their associated invariant polynomials entail the GJI. Indeed, the l.h.s. of (2.2) (for s = 2m − 2) is, using (3.4), equal to
\begin{align*}
& \epsilon^{j_1 \dots j_{4m-5}}_{i_1 \dots i_{4m-5}}\, \omega_{j_1 \dots j_{2m-2}}{}^{\rho}_{\cdot}\, \epsilon^{l_1 \dots l_{2m-3}}_{j_{2m-1} \dots j_{4m-5}}\, C^{p_1}_{\rho l_1} \dots C^{p_{m-1}}_{l_{2m-4} l_{2m-3}}\, k_{p_1 \dots p_{m-1}}{}^{\sigma}_{\cdot} \\
& \quad = (2m-3)!\; \epsilon^{j_1 \dots j_{2m-2}\, l_1 \dots l_{2m-3}}_{i_1 \dots i_{4m-5}}\, \omega_{j_1 \dots j_{2m-2}}{}^{\rho}_{\cdot}\, C^{p_1}_{\rho l_1} \dots C^{p_{m-1}}_{l_{2m-4} l_{2m-3}}\, k_{p_1 \dots p_{m-1}}{}^{\sigma}_{\cdot} = 0 , \tag{5.1}
\end{align*}
which is zero since if $\omega_{j_1 \dots j_{2m-2}}{}^{\rho}_{\cdot}$ is a (2m − 1)-cocycle [(3.4)]
\[ \epsilon^{j_1 \dots j_{2m-1}}_{i_1 \dots i_{2m-1}}\, C^{\nu}_{j_1 \rho}\, \omega_{j_2 \dots j_{2m-1}}{}^{\rho}_{\cdot} = 0 , \tag{5.2} \]
which follows from Lemma 3.1.

There is a simple way of expressing the above results making use of the Chevalley-Eilenberg [5] formulation of the Lie algebra cohomology. For the standard case, we may introduce the BRST operator
\[ s = -\frac{1}{2}\, c^i c^j\, C_{ij}{}^{k}_{\cdot}\, \frac{\partial}{\partial c^k} , \qquad s^2 = 0 , \tag{5.3} \]
with $c^i c^j = -c^j c^i$ (in a graded algebra case, the c's would have a grading opposite to that of the associated generators). Then, $s c^k = -\frac{1}{2}\, C_{ij}{}^{k}_{\cdot}\, c^i c^j$ (Maurer-Cartan eqs.) and the nilpotency of s is equivalent to the JC (1.1). In the present case, we may describe all the previous results by introducing the following generalisation:

Theorem 5.2 (Complete BRST operator for a simple Lie algebra). Let G be a simple Lie algebra. Then, there exists a nilpotent associated operator given by the odd vector field
\begin{align*}
s ={}& -\frac{1}{2}\, c^{j_1} c^{j_2}\, \omega_{j_1 j_2}{}^{\sigma}_{\cdot}\, \frac{\partial}{\partial c^\sigma} - \dots - \frac{1}{(2m_i - 2)!}\, c^{j_1} \dots c^{j_{2m_i - 2}}\, \omega_{j_1 \dots j_{2m_i - 2}}{}^{\sigma}_{\cdot}\, \frac{\partial}{\partial c^\sigma} - \dots \\
& - \frac{1}{(2m_l - 2)!}\, c^{j_1} \dots c^{j_{2m_l - 2}}\, \omega_{j_1 \dots j_{2m_l - 2}}{}^{\sigma}_{\cdot}\, \frac{\partial}{\partial c^\sigma} \equiv s_2 + \dots + s_{2m_i - 2} + \dots + s_{2m_l - 2} , \tag{5.4}
\end{align*}
where i = 1, ..., l, $\omega_{j_1 j_2}{}^{\sigma}_{\cdot} \equiv C_{j_1 j_2}{}^{\sigma}_{\cdot}$ and $\omega_{j_1 \dots j_{2m_i - 2}}{}^{\sigma}_{\cdot}$ are the corresponding l (c-number) higher-order cocycles. The operator s will be called the complete BRST operator associated with G.

Proof. The nilpotency of s encompasses, in fact, the JC and the (l − 1) GJC's which have to be satisfied, respectively, by the ω's which determine the standard BRST operator (5.3) and the (l − 1) higher-order BRST operators; all the cohomological information on G is contained in the complete BRST operator. The GJC's come from the squares of the individual terms $s_{2p}$, the crossed products $s_p s_q$ not contributing since the terms $s_{2m_i - 2}$ are given by Lie algebra $(2m_i - 1)$-cocycles. To see this, we first notice that there are no ω's with an even number of indices (s is an odd operator). Consider now a mixed product $s_p s_q$ (p and q even). This is given by
\begin{align*}
s_p s_q &\propto \omega_{i_1 \dots i_p}{}^{\rho}_{\cdot}\, c^{i_1} \dots c^{i_p}\, \omega_{j_1 \dots j_q}{}^{\sigma}_{\cdot} \Big[ \sum_{l=1}^{q} (-)^{l+1}\, \delta^{j_l}_{\rho}\, c^{j_1} \dots \hat{c}^{j_l} \dots c^{j_q} \Big] \frac{\partial}{\partial c^\sigma} \\
&= q\, \omega_{i_1 \dots i_p}{}^{\rho}_{\cdot}\, \omega_{j_1 \dots j_q}{}^{\sigma}_{\cdot}\, \delta^{j_1}_{\rho}\, c^{i_1} \dots c^{i_p} c^{j_2} \dots c^{j_q}\, \frac{\partial}{\partial c^\sigma} \\
&= q\, \omega_{i_1 \dots i_p}{}^{\rho}_{\cdot}\, \omega_{\rho j_2 \dots j_q}{}^{\sigma}_{\cdot}\, c^{i_1} \dots c^{i_p} c^{j_2} \dots c^{j_q}\, \frac{\partial}{\partial c^\sigma} , \tag{5.5}
\end{align*}
where the term in $\frac{\partial}{\partial c^\rho} \frac{\partial}{\partial c^\sigma}$ has been omitted since ($p$ and $q$ being even) it cancels with the one coming from $s_q s_p$. Recalling now expression (3.4), it is found that (5.5) is zero because of (5.2), which in the present language reads $s_p s_2 = 0$. Thus, $s^2 = s_2^2 + \ldots + s_{2m_i-2}^2 + \ldots + s_{2m_l-2}^2 = 0$, each of the $l$ terms being zero separately as a result of the GJC (2.2).

6. Concluding Remarks

Many questions arise now that require further study. From a physical point of view it would be interesting to find applications of these higher-order Lie algebras, and to know whether the cohomological restrictions which determine and condition their existence have a physical significance. Lie algebra cohomology arguments have already been very useful in various physical problems, e.g. in the description of anomalies [18] or in the construction of the Wess-Zumino terms required in the action of extended
supersymmetric objects [34]. In the form (5.4), the above formulation of the higher algebras bears a resemblance to the closed string BRST cohomology and the SH algebras [22, 23] relevant in the theory of graded string field products [24, 25] (see also [35]). Note, however, that because of the cocycle form of the $\omega$'s, the GJI's are not modified, as already mentioned in the introduction. In the SH algebras such a modification is the result of having, for instance, terms lower than quadratic in (5.4) (with the appropriate change in ghost grading). Due to their underlying BRST symmetry, similar structures appear in the determination of the different gauge structure tensors through the antibrackets and the master equation in the Batalin-Vilkovisky formalism (for a review, see [36, 37]), where violations of the JI are also present (the Batalin-Vilkovisky antibracket is a two-bracket, but higher-order ones may also be considered [38]).

Other questions may be posed from a purely mathematical point of view. As the discussion in Sect. 4 shows, a representation of a simple Lie algebra may not be a representation for the associated higher-order Lie algebras. Thus, the representation theory of higher-order algebras requires a separate analysis. Other problems may be more interesting from a structural point of view as, for instance, the contraction theory of higher-order Lie algebras (which will take us outside the domain of the simple ones), as well as the study of the non-simple higher-order algebras themselves and their cohomology. These, and the generalisation of these ideas to superalgebras (for which there exist simple finite dimensional ones with zero Killing form), are problems for further research.

Acknowledgements. The authors wish to thank J. Stasheff for helpful correspondence and his comments on the manuscript and T. Lada for a copy of [23]. This research has been partially sponsored by the Spanish CICYT and DGICYT (AEN 96-1669, PR 95-439). Both authors wish to acknowledge the kind hospitality extended to them at DAMTP. The support of St. John's College (J.A.) and an FPI grant from the Spanish Ministry of Education and Science and the CSIC (J.C.P.B.) are also gratefully acknowledged.
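The statement of Sect. 5 that the nilpotency of the standard BRST operator (5.3) is equivalent to the JC can be checked on small examples. The following Python sketch is only an illustration under stated assumptions: the ghosts are modelled as a Grassmann algebra stored as dictionaries, and all names (`sort_sign`, `multiply`, `s_on_ghost`, `s`, `structure_constants`, `s_squared`, `so3`, `broken`) are hypothetical helpers introduced here, not notation from the text. It treats only the quadratic term $s_2$, not the higher-order terms of (5.4).

```python
from fractions import Fraction

def sort_sign(indices):
    """Sort anticommuting ghost indices.

    Returns (sign, sorted_tuple); sign is 0 if an index repeats (c^i c^i = 0).
    """
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, ()
    sign = 1
    for a in range(len(idx)):                 # bubble sort, one sign flip per swap
        for b in range(len(idx) - 1 - a):
            if idx[b] > idx[b + 1]:
                idx[b], idx[b + 1] = idx[b + 1], idx[b]
                sign = -sign
    return sign, tuple(idx)

def multiply(x, y):
    """Product of two Grassmann polynomials stored as {monomial: coefficient}."""
    out = {}
    for mx, cx in x.items():
        for my, cy in y.items():
            sign, mono = sort_sign(mx + my)
            if sign:
                out[mono] = out.get(mono, 0) + sign * cx * cy
    return {m: c for m, c in out.items() if c != 0}

def s_on_ghost(C, k):
    """s c^k = -1/2 C_{ij}^k c^i c^j, with C[i][j][k] = C_{ij}^k."""
    dim = len(C)
    out = {}
    for i in range(dim):
        for j in range(dim):
            sign, mono = sort_sign((i, j))
            if sign and C[i][j][k]:
                out[mono] = out.get(mono, 0) - Fraction(1, 2) * sign * C[i][j][k]
    return {m: c for m, c in out.items() if c != 0}

def s(C, poly):
    """Extend s to polynomials as an odd derivation (graded Leibniz rule)."""
    out = {}
    for mono, coeff in poly.items():
        for pos, a in enumerate(mono):
            left, right = {mono[:pos]: 1}, {mono[pos + 1:]: 1}
            term = multiply(multiply(left, s_on_ghost(C, a)), right)
            for m, c in term.items():
                out[m] = out.get(m, 0) + coeff * (-1) ** pos * c
    return {m: c for m, c in out.items() if c != 0}

def structure_constants(brackets, dim):
    """Build antisymmetric C[i][j][k] from {(i, j): {k: value}} given for i < j."""
    C = [[[0] * dim for _ in range(dim)] for _ in range(dim)]
    for (i, j), image in brackets.items():
        for k, value in image.items():
            C[i][j][k] = value
            C[j][i][k] = -value
    return C

def s_squared(C):
    """Return s(s(c^k)) for every ghost c^k."""
    return [s(C, s_on_ghost(C, k)) for k in range(len(C))]

# so(3): [X_0, X_1] = X_2 and cyclic; satisfies the Jacobi condition.
so3 = structure_constants({(0, 1): {2: 1}, (1, 2): {0: 1}, (0, 2): {1: -1}}, 3)
# Antisymmetric but NOT a Lie algebra: [X_0, X_1] = X_0, [X_1, X_2] = X_1.
broken = structure_constants({(0, 1): {0: 1}, (1, 2): {1: 1}}, 3)

print(s_squared(so3))
print(s_squared(broken))
```

For `so3` every entry of `s_squared` is the empty (zero) polynomial, whereas for `broken`, whose brackets violate the Jacobi condition, one finds $s^2 c^0 = -c^0 c^1 c^2 \neq 0$. Exact rational coefficients are used so that the factor $\frac{1}{2}$ introduces no rounding.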
References

1. Cartan, E.: La topologie des groupes de Lie. L'Enseignement math. 35, 177–200 (1936)
2. Pontrjagin, L.: Sur les nombres de Betti des groupes de Lie. C. R. Acad. Sci. Paris 200, 1277–1280 (1935)
3. Hopf, H.: Über die Topologie der Gruppen-Mannigfaltigkeiten und ihre Verallgemeinerungen. Ann. Math. 42, 22–52 (1941)
4. Hodge, W. V. D.: The theory and applications of harmonic integrals. Camb. Univ. Press (1941)
5. Chevalley, C. and Eilenberg, S.: Cohomology theory of Lie groups and Lie algebras. Trans. Am. Math. Soc. 63, 85–124 (1948)
6. Samelson, H.: Topology of Lie groups. Bull. Am. Math. Soc. 57, 2–37 (1952)
7. Borel, A. and Chevalley, C.: The Betti numbers of exceptional groups. Mem. Am. Math. Soc. 14, 1–10 (1955)
8. Borel, A.: Topology of Lie groups and characteristic classes. Bull. Am. Math. Soc. 61, 397–432 (1965)
9. Bott, R.: The geometry and representation theory of compact Lie groups. In: London Math. Soc. Lecture Notes Ser. 34, 65–90, Camb. Univ. Press (1979)
10. Racah, G.: Sulla caratterizzazione delle rappresentazioni irreducibili dei gruppi semisimplici di Lie. Lincei-Rend. Sc. fis. mat. e nat. VIII, 108–112 (1950); Princeton lectures, CERN-61-8 (reprinted in Ergeb. Exact Naturwiss. 37, 28–84 (1965)), Springer-Verlag
11. Biedenharn, L. C.: On the representations of the semisimple Lie groups I. J. Math. Phys. 4, 436–445 (1963)
12. Klein, A.: Invariant operators of the unimodular group in n dimensions. J. Math. Phys. 4, 1283–1284 (1963)
13. Gruber, B. and Raifeartaigh, O.: S-theorem and construction of the invariants of the semisimple compact Lie algebras. J. Math. Phys. 5, 1796–1804 (1964)
14. Perelomov, A. M. and Popov, V. S.: Casimir operators for semisimple groups. Math. USSR-Izvestija 2, 1313–1335 (1968)
15. Nwachuku, C. O. and Rashid, M. A.: Eigenvalues of the Casimir operators of the orthogonal and symplectic groups. J. Math. Phys. 17, 1611–1616 (1976)
16. Okubo, S. and Patera, J.: General indices of representations and Casimir invariants. J. Math. Phys. 25, 219–227 (1983)
17. Okubo, S.: Modified fourth-order Casimir invariants and indices for simple Lie algebras. J. Math. Phys. 23, 8–20 (1982)
18. For an outlook including further references, see the articles by Jackiw and Zumino in Treiman, S. B.; Jackiw, R.; Zumino, B. and Witten, E.: Current algebra and anomalies. World Sci. (1985)
19. Atiyah, M. F. and Jones, J. D. S.: Topological aspects of Yang-Mills theory. Commun. Math. Phys. 61, 97–118 (1978)
20. Kephart, T. W.: Safe groups and anomaly cancellation in even dimensions. Phys. Lett. B151, 267–270 (1985)
21. Okubo, S. and Patera, J.: Cancellation of higher order anomalies. Phys. Rev. D31, 2669–2671 (1985)
22. Lada, T. and Stasheff, J.: Introduction to SH Lie algebras for physicists. Int. J. Theor. Phys. 32, 1087–1103 (1993)
23. Lada, T. and Markl, M.: Strongly homotopy Lie algebras. Commun. in Alg. 23, 2147–2161 (1995)
24. Witten, E. and Zwiebach, B.: Algebraic structures and differential geometry in two-dimensional string theory. Nucl. Phys. B377, 55–112 (1992)
25. Zwiebach, B.: Closed string theory: quantum action and the Batalin-Vilkovisky master equation. Nucl. Phys. B390, 33–152 (1993)
26. de Azcárraga, J. A., Perelomov, A. M. and Pérez Bueno, J. C.: New Generalized Poisson Structures. J. Phys. A29, L151–L157 (1996); The Schouten-Nijenhuis bracket, cohomology and generalized Poisson structures. J. Phys. A29, 7993–8009 (1996)
27. Jones, E. S.: A study of Lie and associative algebras from a homotopy point of view. Master's project, North Carolina Univ., Raleigh (1990)
28. Hanlon, P. and Wachs, H.: On Lie k-algebras. Adv. in Math. 113, 206–236 (1995)
29. Weyl, H.: The classical groups. Princeton (1946)
30. Englefield, M. and King, R. C.: Symmetric power sum expansions of the eigenvalues of generalised Casimir operators of semi-simple Lie groups. J. Phys. A13, 2297–2317 (1980)
31. Boya, L. J.: Representations of simple Lie groups. Rep. Math. Phys. 32, 351–354 (1993)
32. Sudbery, A.: Computer-friendly d-tensor identities for SU(n). J. Phys. A23(15), L705–L710 (1990)
33. Berdjis, F.: A criterium for completeness of Casimir operators. J. Math. Phys. 22, 1850–1856 (1981); Berdjis, F. and Beslmüller, E.: Casimir operators for $F_4$, $E_6$, $E_7$ and $E_8$. ibid. 1857–1860
34. de Azcárraga, J. A. and Townsend, P. K.: Superspace geometry and classification of supersymmetric extended objects. Phys. Rev. Lett. 62, 2579–2512 (1989)
35. Lian, B. H. and Zuckerman, G. J.: New perspectives on the BRST-algebraic structure of string theory. Commun. Math. Phys. 154, 613–646 (1993)
36. Henneaux, M.: The antifield BRST formalism. Nucl. Phys. B (Proc. Suppl.) 18A, 47–106 (1990)
37. Gomis, J.; Paris, J. and Samuel, S.: Antibracket, antifields and gauge-theory quantization. Phys. Rep. 259, 1–145 (1995)
38. Bering, K.; Damgaard, P. H. and Alfaro, J.: Algebra of higher antibrackets. Nucl. Phys. B478, 459–503 (1996)
39. Gel'fand, I. M.: The center of an infinitesimal group ring. Mat. Sbornik 26, 103–112 (1950) (English trans. by Los Alamos Sci. Lab. AEC-TR-6133 (1963))
40. Abellanas, L. and Martínez Alonso, L.: A general setting for Casimir invariants. J. Math. Phys. 16, 1580–1584 (1975)
41. Bais, F. A., Bouwnegt, P., Surridge, M. and Schoutens, K.: Extensions of the Virasoro algebra constructed from Kac–Moody algebras using higher order Casimir invariants. Nucl. Phys. B304, 348–370 (1988)

Communicated by T. Miwa