Communications in Mathematical Physics - Volume 202


Author: A. Jaffe (Chief Editor)


Commun. Math. Phys. 202, 1 – 63 (1999)

Communications in Mathematical Physics
© Springer-Verlag 1999

Exotic Subfactors of Finite Depth with Jones Indices $(5+\sqrt{13})/2$ and $(5+\sqrt{17})/2$

M. Asaeda$^{1,\star}$, U. Haagerup$^{2}$

$^1$ Graduate School of Mathematical Sciences, University of Tokyo, Komaba, Meguro-ku, Tokyo, 153-8924, Japan. E-mail: [email protected]
$^2$ Institut for Matematik og Datalogi, Odense Universitet, Campusvej 55, DK-5230 Odense M, Denmark. E-mail: [email protected]

Received: 23 February 1998 / Accepted: 3 June 1998

Abstract: We prove the existence of subfactors of finite depth of the hyperfinite II$_1$ factor with indices $(5+\sqrt{13})/2 = 4.302\cdots$ and $(5+\sqrt{17})/2 = 4.561\cdots$. The existence of the former was announced by the second named author in 1993 and that of the latter has been conjectured since then. These are the only known subfactors with finite depth which do not arise from classical groups, quantum groups or rational conformal field theory.

1. Introduction

In the theory of operator algebras, subfactor theory has been developing dynamically, involving various fields in mathematics and mathematical physics since its foundation by V. F. R. Jones in 1983 [J]. Above all, the classification of subfactors is one of the most important topics in the theory. In the celebrated Jones index theory [J], Jones introduced the Jones index for subfactors of type II$_1$ as an invariant. Later, he also introduced a principal graph and a dual principal graph as finer invariants of subfactors. Since Jones proved in the mid-1980s that subfactors with index less than 4 have one of the Dynkin diagrams as their (dual) principal graphs, the classification of the hyperfinite II$_1$ subfactors has been studied by A. Ocneanu and S. Popa, and also by M. Izumi, Y. Kawahigashi, and a number of other mathematicians. In this process, Ocneanu’s paragroup theory [O1] has been quite effective. He penetrated the algebraic, or rather combinatorial, nature of subfactors and constructed a paragroup from a subfactor of type II$_1$. A paragroup is a set of data consisting of four graphs made of a (dual) principal graph and an assignment of complex numbers to “cells” arising from the four graphs, called a connection. Thanks to the “generating property” for subfactors of finite depth proved by Popa in [P1], it has turned out that the correspondence between paragroups and subfactors of the hyperfinite II$_1$ factor with finite index and finite depth is bijective; therefore the classification of hyperfinite II$_1$ subfactors with finite index and finite depth is reduced to that of paragroups. By checking the flatness condition for the connections on the Dynkin diagrams, Ocneanu announced in [O1] that subfactors with index less than 4 are completely classified by the Dynkin diagrams $A_n$, $D_{2n}$, $E_6$, and $E_8$. (See also [BN], [I1], [I2], [K], [SV].) After that, Popa ([P2]) extended the correspondence between paragroups and subfactors of the hyperfinite II$_1$ factor to the strongly amenable case, and gave a classification of subfactors with indices equal to 4. (In all the above mentioned cases, the dual principal graph of a subfactor is the same as the principal graph. See also [IK].) We refer readers to [EK], [GHJ] for algebraic aspects of a general theory of subfactors.

$^\star$ Current address: Department of Mathematics, The Pennsylvania State University, 218 McAllister Building, University Park, PA 168032-6401. E-mail: [email protected]

The second named author then tried to find subfactors with index a little bit beyond 4. Some subfactors with index larger than 4 had already been constructed from other mathematical objects. For example, we can construct a subfactor from an arbitrary finite group by a crossed product with an outer action, and this subfactor has an index equal to the order of the original finite group. Trivially, the index is at least 5 if it is larger than 4. We also have subfactors constructed from quantum groups $U_q(sl(n))$, $q = e^{2\pi i/k}$, with index $\dfrac{\sin^2(n\pi/k)}{\sin^2(\pi/k)}$ as in [W], and these index values do not fall in the interval $(4, 5)$. Unfortunately or naturally, these subfactors do not contain more information about the algebraic structure than the original mathematical objects themselves, such as groups or quantum groups. “Does there exist any subfactor not arising from (quantum) groups?” If this is the case, we have a subfactor as a really new object producing new mathematical structures. We expect that subfactors with index slightly larger than 4 would indeed be those of exotic nature which do not arise from other mathematical objects.

The second named author gave in 1993 a list of possible candidates of graphs which might be realized as (dual) principal graphs of subfactors with index in $(4, 3+\sqrt{3}) = (4, 4.732\cdots)$ in [H]. We see four candidates of pairs of graphs, including two pairs with parameters, in §7 of [H]. At the same time, the second named author announced a proof of existence of the subfactor with index $(5+\sqrt{13})/2$ for the case $n = 3$ of (2) in the list in [H], but the proof has not been published until now. Ever since, nothing had been known for the other cases for some years, until D. Bisch recently proved that a subfactor with (dual) principal graph (4) in §7 of [H] does not exist [B]. About case (3) in §7 of [H] as in Fig. 2, as well as the case $n = 3$ of (2), we can easily determine a biunitary connection uniquely on the four graphs consisting of the graphs (3), and we thus have a hyperfinite II$_1$ subfactor with index $(5+\sqrt{17})/2$ constructed from the connection by the commuting square as in [S]. The problem is whether this subfactor has (3) as (dual) principal graphs or not. This amounts to verifying the flatness condition of the connection. In 1996, K. Ikeda made a numerical check of the flatness of this connection by approximate computations on a computer in [Ik] and showed that the graphs (3) are very “likely” to exist as (dual) principal graphs. (He also made a numerical verification of the flatness for the case $n = 7$ of (2) in §7 of [H].)

In this paper, we will give the proof of the existence for the case of index $(5+\sqrt{13})/2$ previously announced by the second named author, and give the proof of the existence for the case of index $(5+\sqrt{17})/2$. The proof in the latter case was recently obtained by computations of the first named author based on a strategy of the second named author. Our main result in this paper is as follows.

Theorem 1 ($(5+\sqrt{13})/2$ case). The two graphs in Fig. 1 are realized as a pair of (dual) principal graphs of a subfactor with index equal to $\frac{5+\sqrt{13}}{2}$ of the hyperfinite II$_1$ factor.

Theorem 2 ($(5+\sqrt{17})/2$ case). The two graphs in Fig. 2 are realized as a pair of (dual) principal graphs of a subfactor with index equal to $\frac{5+\sqrt{17}}{2}$ of the hyperfinite II$_1$ factor.

[Figure 1: drawing of the pair of graphs]

Fig. 1. The case n = 3 of the pair of graphs (2) in the list of Haagerup

[Figure 2: drawing of the pair of graphs]

Fig. 2. The pair of graphs (3) in Haagerup’s candidates list

In Sect. 2, we will give two key lemmas to prove our two main results respectively. In Sect. 3, we will give a construction of generalized open string bimodules which is a generalization of Ocneanu’s open string bimodules in [O1], [Sa], and we will give a correspondence between bimodules and general biunitary connections on finite graphs. In Sects. 4 and 5, we will prove our two main theorems respectively.

2. Key Lemmas to the Main Results

In this section, we will give the key lemmas which have been proved by the second named author. First of all, we will explain the motivation for the lemmas. Proofs of our main theorems presented in Sect. 1 are reduced to verifying “flatness” of the biunitary connections which exist on the four graphs made of the pairs of the graphs in Figs. 1 and 2 respectively. However, it is well known that verifying flatness exactly is almost impossible, except for some easy cases such as the biunitary connections arising from the subfactors of crossed products of finite groups ([EK, 10.6]). So far, in the history of classification of subfactors, several methods have been introduced to prove flatness/nonflatness of biunitary connections. Finding inconsistency of the fusion rule on the graph of a given biunitary connection has sometimes been effective to prove nonflatness, e.g. $D_{2n+1}$, $E_7$ ([I1], [SV], . . . ). On the other hand, since consistency of the fusion rule never means flatness of a given biunitary connection, several ideas have been introduced to prove flatness ([EK], [I2], [IK], [K] . . . ). The second named author, however, inspected the fusion rules of the upper graphs in Figs. 1 and 2 and noticed that if we can construct bimodules satisfying part of the fusion rule, we can conclude that there exists a subfactor having the desired principal graph. Now we introduce the notation used in the lemmas.

Definition 1. Let $N$ and $M$ be II$_1$ factors and ${}_N X_M = X$ be an $N$-$M$ bimodule (see [EK]). We denote by $R_X(M)$ the right action of $M$ on $X$, and by $L_X(N)$ the left action of $N$. We have the subfactor $R_X(M)' \supset L_X(N)$. We denote its Jones index by $[X]$. We define the principal graph of the bimodule $X$ as that of the subfactor $R_X(M)' \supset L_X(N)$.

Definition 2. For bimodules $X$ and $Y$ with common coefficient algebras, we define $\langle X, Y\rangle = \dim \mathrm{Hom}(X, Y)$. A formal $\mathbb{Z}$-linear combination $Y$ of bimodules (of finite index) will be called positive if it is an actual bimodule, i.e. if $\langle Y, Z\rangle \geq 0$ for any irreducible bimodule $Z$ which appears in the direct sum decomposition of $X$. When $Y \cong X \oplus Z$ for some positive bimodule $Z$, we write $X \prec Y$.

Hereafter we use the following expressions, so far as they do not cause misunderstanding:
$$1_N = {}_N N_N, \quad 2X = X \oplus X, \quad XY = X \otimes_N Y, \quad X^2 = X \otimes_N X,$$
where $N$ is a II$_1$ factor and $X$ and $Y$ are suitable bimodules.

Lemma 1. Let $X = {}_N X_M$ be a bimodule with finite Jones index larger than or equal to four. Then,

1) $X\bar X - 1_N$ and $(X\bar X)^2 - 3X\bar X \oplus 1_N$ are positive $N$-$N$ bimodules.
2) $X\bar X X - 2X$ and $(X\bar X)^2 \otimes_N X - 4X\bar X X \oplus 3X$ are positive $N$-$M$ bimodules.

Proof. Let $G$ be the principal graph of $X$. We set
$$G^{(0)}_{\rm even} := \text{set of all irreducible components of } 1_N,\ (X\bar X)^n,\ n = 1, 2, \ldots,$$
$$G^{(0)}_{\rm odd} := \text{set of all irreducible components of } X,\ (X\bar X)^n X,\ n = 1, 2, \ldots,$$
where $G^{(0)}_{\rm even}$ (resp. $G^{(0)}_{\rm odd}$) means the even (resp. odd) vertices of $G$, and
$$G = (G_{Y,Z})_{Y \in G^{(0)}_{\rm even},\, Z \in G^{(0)}_{\rm odd}}$$
to be the incidence matrix for $G$, i.e.,
$$G_{Y,Z} = \langle YX, Z\rangle, \quad Y \in G^{(0)}_{\rm even},\ Z \in G^{(0)}_{\rm odd}.$$
Since $4 \leq [X] < \infty$, we have $2 \leq \|G\| < \infty$. Put
$$\Delta = \begin{pmatrix} 0 & G \\ G^t & 0 \end{pmatrix},$$
then $\Delta$ is the adjacency matrix of $G^{(0)} = G^{(0)}_{\rm even} \cup G^{(0)}_{\rm odd}$ and $\|\Delta\| \geq \|G\| \geq 2$. Let $P_0, P_1, P_2, \ldots$ be the sequence of the polynomials given by
$$P_0(x) = 1, \quad P_1(x) = x, \quad \ldots, \quad P_{n+1}(x) = P_n(x)x - P_{n-1}(x).$$
Then, by [HW], all of $P_2(\Delta), P_3(\Delta), \ldots$ have non-negative entries. For $n = 2, 3, 4, 5$, we get, in particular, that
$$GG^t - 1, \quad GG^tG - 2G, \quad (GG^t)^2 - 3GG^t + 1, \quad \text{and} \quad (GG^t)^2G - 4GG^tG + 3G$$
have non-negative entries. Hence, for $W \in G^{(0)}_{\rm even}$,
$$\langle X\bar X - 1_N, W\rangle = (GG^t - 1)_{1_N, W} \geq 0,$$
$$\langle (X\bar X)^2 - 3X\bar X \oplus 1_N, W\rangle = ((GG^t)^2 - 3GG^t + 1)_{1_N, W} \geq 0,$$
namely, $X\bar X - 1_N$ and $(X\bar X)^2 - 3X\bar X \oplus 1_N$ are positive $N$-$N$ bimodules. The same argument (with $W \in G^{(0)}_{\rm odd}$) shows that $X\bar X X - 2X$ and $(X\bar X)^2X - 4X\bar XX \oplus 3X$ are positive $N$-$M$ bimodules.

2.1. Key lemma for the case of index $(5+\sqrt{13})/2$. In this subsection, we present the key lemma given by the second named author to which the construction of the finite depth subfactor with index $(5+\sqrt{13})/2$ is reduced.

Lemma 2. Let $M$ and $N$ be II$_1$ factors. Assume the following:

1) We have an $N$-$M$ bimodule $X = {}_N X_M$ of index $\frac{5+\sqrt{13}}{2}$.
2) We have an $N$-$N$ bimodule $S = {}_N S_N$ of index 1 satisfying $S^3 \cong 1_N$ and $S \not\cong 1_N$, i.e. $S$ is given by an automorphism of $N$ of outer period 3.
3) The six bimodules $1_N$, $S$, $S^2$, $X\bar X - 1_N$, $S(X\bar X - 1_N)$, $S^2(X\bar X - 1_N)$ are irreducible and mutually inequivalent.
4) The four bimodules $X$, $SX$, $S^2X$, $X\bar XX - 2X$ are irreducible and mutually inequivalent.
5) (The most important assumption) $S(X\bar X - 1_N) \cong (X\bar X - 1_N)S^2$.

Then the principal graph of $X$ and bimodules corresponding to the vertices on the graph are as follows:

[Figure: the principal graph of $X$, with each vertex labeled by its bimodule]

$*$ : $1_N$, $a$ : $X$, $b$ : $X\bar X - 1_N$, $c$ : $X\bar XX - 2X$,
$b\sigma$ : $S(X\bar X - 1_N)$, $a\sigma$ : $SX$, $*\sigma$ : $S$,
$b\sigma^2$ : $S^2(X\bar X - 1_N)$, $a\sigma^2$ : $S^2X$, $*\sigma^2$ : $S^2$.

Remark. By Lemma 1, all the above bimodules are well-defined (i.e. positive in the sense of Def. 2).

Proof. We have
$$\langle (X\bar XX - 2X)\bar X,\ X\bar X - 1_N\rangle = \langle X\bar XX - 2X,\ (X\bar X - 1_N)X\rangle \quad \text{(by Frobenius reciprocity)}$$
$$= \langle X\bar XX - 2X,\ X\bar XX - 2X\rangle + \langle X\bar XX - 2X,\ X\rangle = 1$$
because $X$ and $X\bar XX - 2X$ are irreducible and inequivalent. Hence, by irreducibility of $X\bar X - 1_N$, we have
$$X\bar X - 1_N \prec (X\bar XX - 2X)\bar X. \tag{2.1}$$
We have
$$(X\bar XX - 2X)\bar X \cong (X\bar X - 1_N)^2 - 1_N, \tag{2.2}$$
and 5) says
$$S(X\bar X - 1_N) \cong (X\bar X - 1_N)S^2. \tag{2.3}$$
Hence
$$S^2(X\bar X - 1_N) \cong S(X\bar X - 1_N)S^2 \cong (X\bar X - 1_N)S^4 \cong (X\bar X - 1_N)S. \tag{2.4}$$
Therefore,
$$S(X\bar X - 1_N)^2 \cong (X\bar X - 1_N)S^2(X\bar X - 1_N) \cong (X\bar X - 1_N)^2 S$$
and
$$S^2(X\bar X - 1_N)^2 \cong (X\bar X - 1_N)^2 S^2.$$
Hence by (2.2),
$$S(X\bar XX - 2X)\bar XS^2 \cong (X\bar XX - 2X)\bar XS^3 \cong (X\bar XX - 2X)\bar X,$$
and similarly
$$S^2(X\bar XX - 2X)\bar XS \cong (X\bar XX - 2X)\bar X.$$
So, by (2.1),
$$S(X\bar XX - 2X)\bar XS^2 \prec (X\bar XX - 2X)\bar X, \tag{2.5}$$
$$S^2(X\bar XX - 2X)\bar XS \prec (X\bar XX - 2X)\bar X, \tag{2.6}$$
by (2.3) and (2.4). Hence, by (2.1), (2.5), (2.6) and 3), $(X\bar X - 1_N)$, $S(X\bar X - 1_N)$, $S^2(X\bar X - 1_N)$ are mutually inequivalent subbimodules of $(X\bar XX - 2X)\bar X$, i.e.,
$$(X\bar XX - 2X)\bar X \cong (X\bar X - 1_N) \oplus S(X\bar X - 1_N) \oplus S^2(X\bar X - 1_N) \oplus Y,$$
where $Y$ is an $N$-$N$ bimodule (possibly zero). Since $X = {}_N X_M$ is irreducible, the subfactor $R_X(M)' \supset L_X(N)$ has the trivial relative commutant, hence it is extremal (see [P], p. 176). Therefore, the square root of the Jones index of a bimodule $[\,\cdot\,]^{1/2}$ is additive and multiplicative on the bimodules expressed in terms of $X$ and $\bar X$ (see [P2]). Thus, we have
$$[(X\bar XX - 2X)\bar X]^{1/2} = 3\,[(X\bar X - 1_N)]^{1/2} + [Y]^{1/2},$$
where the index of a zero bimodule is defined to be 0. Hence with $\lambda = [X]^{1/2} = \sqrt{\frac{5+\sqrt{13}}{2}}$, we get
$$[Y]^{1/2} = \lambda(\lambda^3 - 2\lambda) - 3(\lambda^2 - 1) = \lambda^4 - 5\lambda^2 + 3 = 0,$$
so that $Y = 0$. We must finally prove that
$$S(X\bar XX - 2X) \cong X\bar XX - 2X. \tag{2.7}$$
To see this, we compute
$$\langle S(X\bar XX - 2X),\ X\bar XX - 2X\rangle = \langle S(X\bar XX - X),\ X\bar XX - 2X\rangle - \langle SX,\ X\bar XX - 2X\rangle$$
$$= \langle S(X\bar X - 1_N),\ (X\bar XX - 2X)\bar X\rangle - \langle SX,\ X\bar XX - 2X\rangle.$$
The first bracket is 1 because $S(X\bar X - 1_N)$ is contained in $(X\bar XX - 2X)\bar X$ with multiplicity 1, and the second bracket is 0 because $SX$ and $X\bar XX - 2X$ are irreducible and inequivalent by 4), hence
$$\langle S(X\bar XX - 2X),\ X\bar XX - 2X\rangle = 1,$$
thus, the equality of the irreducible bimodules (2.7) holds.
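The dimension count in this proof can be checked numerically. The following is only a floating-point sanity check of the identity $\lambda^4 - 5\lambda^2 + 3 = 0$ and of the additivity formula for $[\,\cdot\,]^{1/2}$, not a substitute for the argument; the variable names are ad hoc and not taken from the paper.

```python
import math

# lambda^2 is the Jones index [X] = (5 + sqrt(13))/2 from assumption 1) of Lemma 2.
index = (5 + math.sqrt(13)) / 2
lam = math.sqrt(index)

# Square roots of the indices, computed formally from additivity and
# multiplicativity of [.]^(1/2) on bimodules expressed in X and X-bar.
dim_b = lam**2 - 1          # [X Xbar - 1_N]^(1/2)
dim_c = lam**3 - 2 * lam    # [X Xbar X - 2X]^(1/2)

# [Y]^(1/2) = [(X Xbar X - 2X) Xbar]^(1/2) - 3 [X Xbar - 1_N]^(1/2)
dim_Y = dim_c * lam - 3 * dim_b

print(f"index               = {index:.6f}")
print(f"index^2 - 5 index + 3 = {index**2 - 5 * index + 3:.2e}")
print(f"[Y]^(1/2)           = {dim_Y:.2e}")
```

Both printed residuals should vanish up to rounding error, which is the numerical shadow of $Y = 0$.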

From all the above, it follows easily that

(a) $1_N \in G^{(0)}$,
(b) $G$ is connected,
(c) Multiplication by $\bar X$ (resp. $X$) from the right (resp. left) on any bimodule $U$ in $G^{(0)}_{\rm odd}$ (resp. $G^{(0)}_{\rm even}$) gives a direct sum of the bimodules in $G^{(0)}_{\rm even}$ (resp. $G^{(0)}_{\rm odd}$) connected to $U$ by edges,

namely, we find that $G$ is the principal graph of $X$.

2.2. Key lemma for the case of index $(5+\sqrt{17})/2$. In this subsection we present the key lemma, similar to the previous one, for the construction of the finite depth subfactor with index $(5+\sqrt{17})/2$.

Lemma 3. Let $M$, $N$ be II$_1$ factors. Assume the following:

1) We have an $N$-$M$ bimodule $X$ of index $\frac{5+\sqrt{17}}{2}$.
2) We have an $N$-$N$ bimodule $S$ of index 1 satisfying $S^2 \cong 1_N$ and $S \not\cong 1_N$, i.e., $S$ is given by an automorphism of $N$ of outer period 2.
3) The eight $N$-$N$ bimodules $1_N$, $S$, $X\bar X - 1_N$, $S(X\bar X - 1_N)$, $(X\bar X - 1_N)S$, $S(X\bar X - 1_N)S$, $(X\bar X)^2 - 3X\bar X \oplus 1_N$, $S((X\bar X)^2 - 3X\bar X \oplus 1_N)$ are irreducible and mutually inequivalent.
4) The six $N$-$M$ bimodules $X$, $SX$, $X\bar XX - 2X$, $S(X\bar XX - 2X)$, $(X\bar X)^2X - 4X\bar XX \oplus 3X$, $(X\bar X - 1_N)SX$ are irreducible and mutually inequivalent.
5) (The most important assumption) $S(X\bar X - 1_N)SX \cong (X\bar X - 1_N)SX$.

Then the principal graph of $X$ and the bimodules corresponding to the vertices on the graph are as follows:

[Figure: the principal graph of $X$, with each vertex labeled by its bimodule]

where
$*$ : $1_N$, $a$ : $X$, $b$ : $X\bar X - 1_N$, $c$ : $X\bar XX - 2X$, $d$ : $(X\bar X)^2 - 3X\bar X \oplus 1_N$, $e$ : $(X\bar X)^2X - 4X\bar XX \oplus 3X$,
$f$ : $(X\bar X - 1_N)S(X\bar X - 1_N) - S(X\bar X - 1_N)S$, $g$ : $S(X\bar X - 1_N)SX \cong (X\bar X - 1_N)SX$,
$h$ : $S(X\bar X - 1_N)S$, $\tilde h$ : $(X\bar X - 1_N)S$,
$\tilde d$ : $S((X\bar X)^2 - 3X\bar X \oplus 1_N)$, $\tilde c$ : $S(X\bar XX - 2X)$, $\tilde b$ : $S(X\bar X - 1_N)$, $\tilde a$ : $SX$, $\tilde *$ : $S$.

Remark. By Lemma 1, we know that all the bimodules above except for the “bimodule” corresponding to $f$ are well-defined (i.e., positive in the sense of Def. 2). The well-definedness of the bimodule at $f$ will come out of the proof below.

Proof. In the proof we will sometimes use formal computations in the $\mathbb{Z}$-linear span of the $N$-$M$ bimodules or $N$-$N$ bimodules considered. The symbol $\langle\ ,\ \rangle$ for computing the dimension of the space of intertwiners can be extended to $\mathbb{Z}$-bilinear maps, and $\langle Z, Z\rangle = 0$ implies $Z = 0$ also for these generalized bimodules. Since $1_N \prec X\bar X$, we have by 5) that both $(X\bar X - 1_N)S$ and $S(X\bar X - 1_N)S$ are (equivalent to) subbimodules of $(X\bar X - 1_N)SX\bar X$. By 3), $(X\bar X - 1_N)S$ and $S(X\bar X - 1_N)S$ are two non-equivalent irreducible bimodules. Hence we can write
$$(X\bar X - 1_N)SX\bar X \cong (X\bar X - 1_N)S \oplus S(X\bar X - 1_N)S \oplus R,$$
where $R$ is an $N$-$N$ bimodule, and we have
$$R \cong (X\bar X - 1_N)S(X\bar X - 1_N) - S(X\bar X - 1_N)S.$$
Note that $R \not\cong 0$ because $R \cong 0$ would imply $[(X\bar X - 1_N)] = 1$, which is impossible since $X$ is irreducible and $[X] > 4$. Next we will show the following.
$$\text{The bimodule } R \text{ is irreducible}, \tag{2.8}$$
$$S((X\bar X)^2 - 3X\bar X \oplus 1_N)S \cong (X\bar X)^2 - 3X\bar X \oplus 1_N, \tag{2.9}$$
$$S((X\bar X)^2X - 4X\bar XX \oplus 3X) \cong (X\bar X)^2X - 4X\bar XX \oplus 3X. \tag{2.10}$$
We have
$$\langle R, R\rangle = \langle (X\bar X - 1_N)S(X\bar X - 1_N),\ (X\bar X - 1_N)S(X\bar X - 1_N)\rangle - 2\langle S(X\bar X - 1_N)S,\ (X\bar X - 1_N)S(X\bar X - 1_N)\rangle + \langle S(X\bar X - 1_N)S,\ S(X\bar X - 1_N)S\rangle = t_1 - 2t_2 + t_3,$$
where
$$t_1 = \langle S(X\bar X - 1_N)^2S,\ (X\bar X - 1_N)^2\rangle,$$
$$t_2 = \langle S(X\bar X - 1_N)S,\ (X\bar X - 1_N)S(X\bar X - 1_N)\rangle,$$
$$t_3 = \langle (X\bar X - 1_N),\ (X\bar X - 1_N)\rangle.$$
Note first that $t_3 = 1$ because $(X\bar X - 1_N)$ is irreducible. Next
$$t_2 = \langle S(X\bar X - 1_N)S,\ (X\bar X - 1_N)SX\bar X\rangle - \langle S(X\bar X - 1_N)S,\ (X\bar X - 1_N)S\rangle.$$

The last term is 0 because $S(X\bar X - 1_N)S$ and $(X\bar X - 1_N)S$ are irreducible and inequivalent by 3). Hence, using 4) and 5), we get
$$t_2 = \langle S(X\bar X - 1_N)SX,\ (X\bar X - 1_N)SX\rangle = 1.$$
To compute $t_1$, set irreducible bimodules $Y$, $Z$ as
$$Y = X\bar X - 1_N, \quad Z = (X\bar X)^2 - 3X\bar X \oplus 1_N.$$
Then
$$t_1 = \langle S(1_N \oplus Y \oplus Z)S,\ 1_N \oplus Y \oplus Z\rangle = \langle 1_N, 1_N\rangle + \langle SYS, Y\rangle + \langle SZS, Z\rangle + 2\langle 1_N, Y\rangle + 2\langle 1_N, Z\rangle + 2\langle SYS, Z\rangle.$$
By 3), $1_N$, $Y$, $SYS$, and $Z$ are irreducible and mutually inequivalent. Hence
$$t_1 = 1 + \langle SZS, Z\rangle.$$
Altogether, we have shown that
$$\langle R, R\rangle = (1 + \langle SZS, Z\rangle) - 2 + 1 = \langle SZS, Z\rangle.$$
Since $R \neq 0$, we have $\langle R, R\rangle \geq 1$. Moreover, since $Z$ is irreducible, so is $SZS$. Hence $\langle SZS, Z\rangle \leq 1$. Therefore
$$\langle R, R\rangle = \langle SZS, Z\rangle = 1,$$
which shows that $R$ is irreducible, and using that $Z$ and $SZS$ are irreducible, we also get that $SZS \cong Z$. Hence we have verified (2.8) and (2.9). To prove (2.10), put
$$G = X\bar XX - 2X, \quad E = (X\bar X)^2X - 4X\bar XX \oplus 3X.$$
Note that $E = ((X\bar X)^2 - 3X\bar X \oplus 1_N)X - (X\bar XX - 2X)$, then, by (2.9),
$$E \cong S((X\bar X)^2 - 3X\bar X \oplus 1_N)SX - X\bar XX \oplus 2X \cong S(X\bar X - 1_N)^2SX - S(X\bar X - 1_N)SX - (X\bar X - 1_N)X.$$
Using 5), we have
$$E \cong S(X\bar X - 1_N)S(X\bar X - 1_N)SX - (X\bar X - 1_N)SX - (X\bar X - 1_N)X$$
$$\cong S(X\bar X - 1_N)SX\bar XSX - S(X\bar X - 1_N)X - (X\bar X - 1_N)SX - (X\bar X - 1_N)X.$$
Hence, again using 5), we get
$$E \cong (X\bar X - 1_N)SX\bar XSX - S(X\bar X - 1_N)X - (X\bar X - 1_N)SX - (X\bar X - 1_N)X$$
$$\cong (X\bar X - 1_N)SX(\bar XSX - 1_M) - (1_N \oplus S)(X\bar X - 1_N)X.$$
From this expression of $E$ and 5), we clearly have $SE \cong E$, which proves (2.10). We next prove
$$RX \cong (X\bar X - 1_N)SX \oplus E, \tag{2.11}$$
where $R \cong (X\bar X - 1_N)S(X\bar X - 1_N) - S(X\bar X - 1_N)S$ is irreducible by (2.8). We put
$$E' = RX - (X\bar X - 1_N)SX.$$
By 5), we have
$$E' \cong (X\bar X - 1_N)S(X\bar X - 1_N)X - 2(X\bar X - 1_N)SX \cong (X\bar X - 1_N)S(X\bar X - 3\cdot 1_N)X.$$
To prove (2.11), we just have to show that $E \cong E'$, namely $\langle E' - E, E' - E\rangle = 0$. Note that
$$\langle E' - E,\ E' - E\rangle = s_1 - 2s_2 + s_3,$$
where $s_1 = \langle E', E'\rangle$, $s_2 = \langle E', E\rangle$, and $s_3 = \langle E, E\rangle$. First, $s_3 = 1$ because $E$ is irreducible. Next,
$$s_2 = \langle (X\bar X - 1_N)S(X\bar X - 3\cdot 1_N)X,\ (X\bar X - 1_N)(X\bar X - 3\cdot 1_N)X\rangle$$
$$= \langle S,\ (X\bar X - 1_N)(X\bar X - 3\cdot 1_N)X\bar X(X\bar X - 3\cdot 1_N)(X\bar X - 1_N)\rangle$$
$$= \langle S(X\bar X - 1_N)(X\bar X - 3\cdot 1_N)X,\ (X\bar X - 1_N)(X\bar X - 3\cdot 1_N)X\rangle = \langle SE, E\rangle = 1 \quad \text{(by (2.10))}.$$
Finally,
$$s_1 = \langle E', E'\rangle = \langle S(X\bar X - 1_N)^2S,\ (X\bar X - 3\cdot 1_N)X\bar X(X\bar X - 3\cdot 1_N)\rangle = \langle S(1_N \oplus Y \oplus Z)S,\ (X\bar X - 3\cdot 1_N)^2X\bar X\rangle.$$
Here, using $SYS = S(X\bar X - 1_N)S$ and $SZS \cong Z$ by (2.9), we have
$$s_1 = \langle 1_N \oplus S(X\bar X - 1_N)S \oplus Z,\ (X\bar X - 3\cdot 1_N)^2X\bar X\rangle = \langle X \oplus S(X\bar X - 1_N)SX \oplus ZX,\ (X\bar X - 3\cdot 1_N)^2X\rangle$$
$$= \langle X \oplus (X\bar X - 1_N)SX \oplus ZX,\ (X\bar X - 3\cdot 1_N)^2X\rangle,$$
where we have used 5) again. We expand $ZX$ and $(X\bar X - 3\cdot 1_N)^2X$ in terms of the irreducible bimodules $X$, $G = X\bar XX - 2X$, and $E = (X\bar X)^2X - 4X\bar XX \oplus 3X$, and get
$$ZX \cong G \oplus E \quad \text{and} \quad (X\bar X - 3\cdot 1_N)^2X \cong 2X - 2G \oplus E.$$
Hence
$$s_1 = \langle X \oplus (X\bar X - 1_N)SX \oplus G \oplus E,\ 2X - 2G \oplus E\rangle.$$
By 4), $X$, $G$, $E$, and $(X\bar X - 1_N)SX$ are irreducible and mutually inequivalent, hence
$$s_1 = 2\langle X, X\rangle - 2\langle G, G\rangle + \langle E, E\rangle = 1.$$

Altogether,
$$\langle E' - E,\ E' - E\rangle = s_1 - 2s_2 + s_3 = 1 - 2 + 1 = 0,$$
which proves (2.11). We need to prove one more relation
$$E\bar X \cong Z \oplus SZ \oplus R. \tag{2.12}$$
To prove (2.12), note first that $X$, $G$, and $E$ all correspond to the vertices in $G^{(0)}_{\rm odd}$, the set of the odd vertices of the principal graph of $X$. (We write $G^{(0)}_{\rm even}$ for the even vertices.) Hence, $E\bar E \in G^{(0)}_{\rm even}$, and since
$$\langle S, E\bar E\rangle = \langle SE, E\rangle = 1,$$
$S$ is an irreducible subbimodule of $E\bar E$, so also $S \in G^{(0)}_{\rm even}$. Therefore, every irreducible $N$-$N$ bimodule or $N$-$M$ bimodule that can be expressed in terms of $X$, $\bar X$, and $S$ belongs to the principal graph $G$ of $X$. Therefore, by the same argument as in the proof of the previous lemma, the square root of the Jones index is additive and multiplicative on the $N$-$N$ bimodules or $N$-$M$ bimodules which can be expressed in terms of $X$, $\bar X$, and $S$, because each of them will occur as a submodule of
$$(X\bar X)^n,\ n \geq 0, \quad \text{or} \quad (X\bar X)^nX,\ n \geq 1.$$
Since $ZX \cong G \oplus E$, we have $E \prec ZX$ and therefore
$$Z \prec E\bar X. \tag{2.13}$$
By (2.10), also
$$SZ \prec E\bar X. \tag{2.14}$$
Moreover, in the same way, we have
$$R \prec E\bar X \tag{2.15}$$
by (2.11). We know that $Z$, $SZ$, and $R$ are irreducible and $Z \not\cong SZ$ by 3). Moreover, by a simple computation using the additivity and multiplicativity of $[\,\cdot\,]^{1/2}$, we have
$$[R]^{1/2} = [Z]^{1/2} + 1 = [SZ]^{1/2} + 1.$$
Hence all of $R$, $Z$, $SZ$ are mutually inequivalent. Thus
$$E\bar X \cong Z \oplus SZ \oplus R \oplus T,$$
where $T$ is an $N$-$N$ bimodule. By $[X] = (5+\sqrt{17})/2$, we easily get
$$[E\bar X]^{1/2} = [Z]^{1/2} + [SZ]^{1/2} + [R]^{1/2},$$
hence $T = 0$. Putting everything together, we see that conditions (a), (b), (c) in the proof of the previous lemma hold, namely, $G$ is the principal graph of $X$.
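As a plausibility check of Lemmas 1 to 3, one can compute on the two principal graphs directly. The sketch below makes one assumption that should be stated loudly: the edge lists are read off from the fusion rules established in the proofs of Lemmas 2 and 3 (for example $E\bar X \cong Z \oplus SZ \oplus R$ gives the three edges at $e$), not copied from the figures, and the vertex names are ad hoc ASCII stand-ins. It then estimates $\|\Delta\|^2$ by power iteration and evaluates $P_2(\Delta), \ldots, P_5(\Delta)$ from the recurrence in the proof of Lemma 1.

```python
import math

# Assumption: edges inferred from the fusion rules in Lemmas 2 and 3.
# "as" stands for a*sigma, "bs2" for b*sigma^2, "d~" for d-tilde, and so on.
GRAPH_13 = [
    ("*", "a"), ("a", "b"), ("b", "c"),
    ("c", "bs"), ("bs", "as"), ("as", "*s"),
    ("c", "bs2"), ("bs2", "as2"), ("as2", "*s2"),
]
GRAPH_17 = [
    ("*", "a"), ("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"),
    ("e", "f"), ("f", "g"), ("g", "h"), ("g", "h~"),
    ("e", "d~"), ("d~", "c~"), ("c~", "b~"), ("b~", "a~"), ("a~", "*~"),
]


def adjacency(edges):
    """Symmetric 0/1 adjacency matrix (list of lists) of an undirected graph."""
    verts = sorted({v for edge in edges for v in edge})
    index = {v: i for i, v in enumerate(verts)}
    mat = [[0] * len(verts) for _ in verts]
    for u, v in edges:
        mat[index[u]][index[v]] = 1
        mat[index[v]][index[u]] = 1
    return mat


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def recurrence_polys(delta, n_max):
    """P_0(delta), ..., P_{n_max}(delta) with P_{n+1} = P_n * delta - P_{n-1}."""
    size = len(delta)
    identity = [[int(i == j) for j in range(size)] for i in range(size)]
    polys = [identity, delta]
    while len(polys) <= n_max:
        prod = matmul(polys[-1], delta)
        prev = polys[-2]
        polys.append([[p - q for p, q in zip(r1, r2)]
                      for r1, r2 in zip(prod, prev)])
    return polys


def norm_squared(delta, iterations=2000):
    """Largest eigenvalue of delta^2, estimated by power iteration."""
    square = matmul(delta, delta)
    vec = [1.0] * len(delta)
    for _ in range(iterations):
        vec = [sum(a * b for a, b in zip(row, vec)) for row in square]
        top = max(vec)
        vec = [c / top for c in vec]
    image = [sum(a * b for a, b in zip(row, vec)) for row in square]
    return sum(a * b for a, b in zip(vec, image)) / sum(c * c for c in vec)


for name, edges, expected in [
    ("Lemma 2 graph", GRAPH_13, (5 + math.sqrt(13)) / 2),
    ("Lemma 3 graph", GRAPH_17, (5 + math.sqrt(17)) / 2),
]:
    delta = adjacency(edges)
    polys = recurrence_polys(delta, 5)
    nonneg = all(entry >= 0 for p in polys[2:] for row in p for entry in row)
    print(name, "| norm^2:", round(norm_squared(delta), 6),
          "| expected index:", round(expected, 6),
          "| P_2..P_5 entrywise >= 0:", nonneg)
```

Agreement of the estimated norm with the index is only a consistency check on the graphs; it says nothing about flatness of the connections, which is the actual content of the theorems.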

3. Generalized Open String Bimodules

In Sect. 2, we have reduced our construction problem to verification of certain fusion rules, but we still have a problem of handling bimodules in a concrete way. For example, we do not know how to represent $X$ or $S$, or how to verify equalities of infinite dimensional bimodules. In this section, we will introduce the tool to make full use of the lemmas. Consider a biunitary connection $\alpha$, as in Fig. 3, on the four graphs with upper graph $K$, lower graph $L$ and the sets of vertices $V_0, \ldots, V_3$. Note that by the definition of biunitary connection, the graphs $K$ and $L$ should be connected, and the vertical graphs need not be connected. We fix the vertices $*_K \in V_0$ and $*_L \in V_2$. We will now construct the bimodule corresponding to $\alpha$.

[Figure 3: a square marked $\alpha$ with corners $V_0$, $V_1$ (top) and $V_2$, $V_3$ (bottom), upper edge $K$ and lower edge $L$]

Fig. 3. The connection α with four graphs

First we construct AFD II$_1$ factors from the string algebras
$$K = \overline{\bigcup_{n=1}^{\infty} \mathrm{String}^{(n)}_{*_K} K}^{\ \rm weak}, \qquad L = \overline{\bigcup_{n=1}^{\infty} \mathrm{String}^{(n)}_{*_L} L}^{\ \rm weak},$$
by the GNS construction using the unique trace, where
$$\mathrm{String}^{(n)}_{*_G} G = \mathrm{span}\{(\xi, \eta) \mid \text{a pair of paths on the graph } G,\ s(\xi) = s(\eta) = *_G,\ r(\xi) = r(\eta),\ |\xi| = |\eta| = n\}.$$
Here for a path $\zeta$, we denote the initial vertex, the final vertex and the length of the path by $s(\zeta)$, $r(\zeta)$ and $|\zeta|$ respectively. We define its $*$-algebra structure as
$$(\xi, \eta) \cdot (\xi', \eta') = \delta_{\eta, \xi'}(\xi, \eta'), \qquad (\xi, \eta)^* = (\eta, \xi).$$
Now we have another AFD II$_1$ factor
$$\tilde L = \overline{\bigcup_{n=0}^{\infty} \mathrm{span}\{\,[\text{diagram: a pair of paths from } *_K \text{ through } x] \mid \text{a pair of paths},\ x \in L^{(0)},\ \text{horizontal paths are in } L,\ \text{length } n\}}^{\ \rm weak},$$
where $L^{(0)}$ denotes the set of vertices on $L$. We identify elements in $K$ with elements in $\tilde L$ by the embedding using connection $\alpha$, and then have an AFD II$_1$ subfactor $K \subset \tilde L$. (See [EK], Chapter 11.)
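The finite-dimensional string algebras $\mathrm{String}^{(n)}_{*_G} G$ above are easy to model on a computer. The following is a minimal sketch under stated assumptions: the example graph (the Dynkin diagram $A_4$) and all function names are hypothetical choices made for illustration, and a basis element is stored as a pair of vertex tuples, which is adequate only for graphs without multiple edges.

```python
from collections import defaultdict

# Hypothetical example graph (Dynkin diagram A_4) as an adjacency list.
GRAPH = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
BASE = 0  # plays the role of the distinguished vertex *


def paths(graph, start, length):
    """All paths of the given length starting at `start`, as vertex tuples."""
    result = [(start,)]
    for _ in range(length):
        result = [p + (w,) for p in result for w in graph[p[-1]]]
    return result


def strings(graph, start, length):
    """Basis of String^(n): pairs (xi, eta) of paths with a common endpoint."""
    by_end = defaultdict(list)
    for p in paths(graph, start, length):
        by_end[p[-1]].append(p)
    return [(xi, eta) for group in by_end.values()
            for xi in group for eta in group]


def multiply(s, t):
    """(xi, eta) . (xi', eta') = delta_{eta, xi'} (xi, eta'); None means 0."""
    (xi, eta), (xi2, eta2) = s, t
    return (xi, eta2) if eta == xi2 else None


def adjoint(s):
    """(xi, eta)^* = (eta, xi)."""
    xi, eta = s
    return (eta, xi)


basis = strings(GRAPH, BASE, 4)
print("dimension of String^(4):", len(basis))
```

The basis elements behave as matrix units, so the algebra is a direct sum of full matrix algebras, one for each possible endpoint, and its dimension is the sum of the squares of the numbers of paths ending at each vertex.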

Next we construct the $K$-$L$ bimodule corresponding to $\alpha$. Consider a pair of paths as follows:

[diagram: a pair of paths, the left one starting at $*_K$ and the right one starting at $*_L$]

here the horizontal part of the left (resp. right) path consists of edges of the graph $K$ (resp. $L$), the vertical edge is from one of the two vertical graphs of the four graphs of the connection $\alpha$, and the paths have a common final vertex. In general, a pair of paths, as above, with a common final vertex, not necessarily with a common initial vertex, is called an open string. It was first introduced by Ocneanu in [O1] in more restricted situations. We embed an open string of length $k$ into the linear span of open strings of length $k + 1$ in a similar way to the embedding of string algebras as follows:

[diagram equation: the open string is expanded as a sum over edges $\xi$ with $|\xi| = 1$, and then as a sum over $\eta'$, $\xi'$ and $\xi$ with coefficients given by a square marked $\alpha$]

here the square marked with $\alpha$ means the value given by the connection $\alpha$. We define the vector space spanned by the above open strings with the above embedding as follows:
$$\overset{\circ}{X}{}^{\alpha} = \bigcup_n \mathrm{span}\{(\xi, \eta) \mid s(\xi) = *_K,\ s(\eta) = *_L,\ r(\xi) = r(\eta)\}.$$
We define an inner product of $\overset{\circ}{X}{}^{\alpha}$ as the sesqui-linear extension of the following:
$$\langle (\xi \cdot \zeta, \eta),\ (\xi' \cdot \zeta', \eta')\rangle = \delta_{\zeta, \zeta'}\, \frac{\mu_K(s(\zeta))}{\mu_L(r(\zeta))}\, \mathrm{tr}_K(\xi, \xi')\, \mathrm{tr}_L(\eta', \eta),$$
where $\xi \cdot \zeta$ denotes the concatenation of $\xi$ and $\zeta$, $\mu_K$ and $\mu_L$ denote the Perron–Frobenius eigenvectors of the graphs, $\mathrm{tr}_K$ is the unique trace on $K$, and $\mathrm{tr}_L$ is as well. We set $(\xi, \xi')$ and $(\eta', \eta)$ to be the elements 0 of $\tilde L$ and $L$ respectively if the end points of each pair of paths do not coincide.

By this inner product, $\overset{\circ}{X}{}^{\alpha}$ is regarded as a pre-Hilbert space, and then we complete it and denote the completion by $X^{\alpha}$. We have the natural left action of $K$ and the right action of $L$ as follows: for
$$x = (\xi, \eta) \in X^{\alpha}, \qquad k = (\sigma, \sigma') \in K, \qquad l = (\rho, \rho') \in L,$$
we have
$$k \cdot x = \sum_{|\zeta| = 1} (\sigma \cdot \zeta,\ \sigma' \cdot \zeta) \cdot x = \sum_{|\zeta| = 1} \delta_{\sigma' \cdot \zeta,\, \xi}\, (\sigma \cdot \zeta,\ \eta), \qquad x \cdot l = \delta_{\eta, \rho}\, (\xi, \rho').$$
By the extension of this action, the Hilbert space $X^{\alpha}$ is considered as a Hilbert $K$-$L$ bimodule ${}_K X^{\alpha}_L$. Then we have a $K$-$L$ bimodule ${}_K X^{\alpha}_L$ constructed from $\alpha$. (We call this bimodule made of open strings an open string bimodule. This is a generalization of open string bimodules in [O1] and [Sa], which are the bimodules constructed from flat connections.)

We make the correspondence between direct sums, relative tensor products, and the contragredient map of bimodules and “sums”, “products”, and the renormalization of connections, so that fusion rules on open string bimodules are reduced to the operations of connections ([O3]). First we introduce the sum of two connections. Consider $\alpha$ and $\beta$ as connections on the four graphs with upper graph $K$, lower graph $L$ and sets of vertices $V_0, \ldots, V_3$ as in Fig. 3 (the side graphs of $\alpha$ and $\beta$ need not be identical), then they give rise to two $K$-$L$ bimodules. We define the sum of the connections as follows:
$$(\alpha + \beta)\left(\begin{smallmatrix} & k & \\ m & & n \\ & l & \end{smallmatrix}\right) = \begin{cases} \alpha\left(\begin{smallmatrix} & k & \\ m & & n \\ & l & \end{smallmatrix}\right), & \text{if both } m, n \text{ are edges appearing in } \alpha, \\[4pt] \beta\left(\begin{smallmatrix} & k & \\ m & & n \\ & l & \end{smallmatrix}\right), & \text{if both } m, n \text{ are edges appearing in } \beta, \\[4pt] 0, & \text{otherwise.} \end{cases}$$
Obviously it satisfies the biunitarity. We denote the bimodule constructed from a connection $\gamma$ by $X^{\gamma}$. By considering the action of $K$ from the left, it is easy to see that
$${}_K X^{\alpha}_L \oplus {}_K X^{\beta}_L = {}_K X^{\alpha + \beta}_L,$$
thus, we can use the summation of connections instead of the direct sum of bimodules.

Next we define the product of connections ([O3], [Sa]). Consider the connections $\alpha$ and $\beta$, as in Fig. 4, which give rise to a $K$-$L$ bimodule (resp. $L$-$M$ bimodule). Note that the graph $L$ appears in both connections. Then we can define the product connection $\alpha\beta$ on the four graphs with upper graph $K$, lower graph $M$ and the sets of vertices $V_0$, $V_1$, $V_4$, $V_5$. The side graphs consist of the edges $\{p - q \mid p \in V_0\ (\text{resp. } V_1),\ q \in V_4\ (\text{resp. } V_5)\}$ with multiplicity $\#\{p - x - q \mid \text{a path of length 2 from } p \text{ to } q,\ x \in V_2\ (\text{resp. } V_3)\}$.

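[Figure 4: two squares stacked vertically. The upper square, marked $\alpha$, has corners $V_0$, $V_1$ (top) and $V_2$, $V_3$ (bottom) with horizontal graphs $K$ and $L$. The lower square, marked $\beta$, has corners $V_2$, $V_3$ (top) and $V_4$, $V_5$ (bottom) with horizontal graphs $L$ and $M$.]

Figure 4.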
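The multiplicity rule for the side graphs of a product connection is just a product of incidence matrices: the number of length-2 paths $p - x - q$ is the $(p, q)$ entry of the product of the two vertical incidence matrices. A minimal sketch, in which the two small matrices are hypothetical examples and not data from the paper:

```python
def count_two_step_paths(first, second):
    """Entry (p, q) counts paths p - x - q, i.e. the matrix product first * second."""
    mids = len(second)
    cols = len(second[0])
    return [[sum(row[x] * second[x][q] for x in range(mids)) for q in range(cols)]
            for row in first]


# Hypothetical incidence matrices of the left vertical graphs:
# S_ALPHA[p][x] = number of edges from p in V0 to x in V2 (side graph of alpha),
# S_BETA[x][q]  = number of edges from x in V2 to q in V4 (side graph of beta).
S_ALPHA = [[1, 1],
           [0, 1]]
S_BETA = [[1, 0, 1],
          [1, 1, 0]]

product = count_two_step_paths(S_ALPHA, S_BETA)
print(product)
```

The same computation with the right vertical graphs (between $V_1$, $V_3$ and $V_5$) gives the other side graph of $\alpha\beta$.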

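We have the connection $\alpha\beta$ as follows:
$$(\alpha\beta)\left(\begin{smallmatrix} & k & \\ n & & o \\ & m & \end{smallmatrix}\right) = \sum_{l} \alpha\left(\begin{smallmatrix} & k & \\ n_1 & & o_1 \\ & l & \end{smallmatrix}\right) \beta\left(\begin{smallmatrix} & l & \\ n_2 & & o_2 \\ & m & \end{smallmatrix}\right),$$
where $n_1$ and $n_2$ are edges such that their concatenation $n_1 \cdot n_2$ is $n$, and $o_1$ and $o_2$ are as well. We observe that this process corresponds to the following process of composing commuting squares of finite dimension:
$$\begin{matrix} A \subset B \\ \cap \quad \cap \\ C \subset D \end{matrix}\ ,\qquad \begin{matrix} C \subset D \\ \cap \quad \cap \\ E \subset F \end{matrix} \quad \Longrightarrow \quad \begin{matrix} A \subset B \\ \cap \quad \cap \\ E \subset F \end{matrix}\ ,$$
where these three squares are finite dimensional commuting squares. We will show that ${}_K X^{\alpha} \otimes_L X^{\beta}_M$ is isomorphic to ${}_K X^{\alpha\beta}_M$. We define the map $\varphi$ from ${}_K X^{\alpha} \otimes_L X^{\beta}_M$ to ${}_K X^{\alpha\beta}_M$ as follows: For
$$x = (\xi, \eta) = \sum_{|\zeta| = 1} (\xi \cdot \zeta,\ \eta \cdot \zeta) \in {}_K \overset{\circ}{X}{}^{\alpha}_L, \qquad y = (\rho, \sigma) \in {}_L \overset{\circ}{X}{}^{\beta}_M,$$
we define
$$\varphi(x \otimes_L y) = x \cdot y = \sum_{|\zeta| = 1} \delta_{\eta \cdot \zeta,\, \rho}\, (\xi \cdot \zeta,\ \sigma) \in {}_K X^{\alpha\beta}_M.$$
Since
$$(x \otimes_L y,\ x \otimes_L y) = (x(y, y)_L,\ x) = \mathrm{tr}_M(x^* \cdot x(y, y)_L),$$
$$(x \cdot y,\ x \cdot y) = \mathrm{tr}_M(y^* \cdot x^* \cdot x \cdot y) = \mathrm{tr}_M(y^* \cdot (x^* \cdot x)y) \quad ((x^* \cdot x) \in L)$$
$$= ((x^* \cdot x)y,\ y) = \mathrm{tr}_M(x^* \cdot x(y, y)_L) = (x \otimes_L y,\ x \otimes_L y),$$
where $x^*$ and $y^*$ mean that we reverse the order of the pairs of paths and also take the complex conjugate of their coefficients, we see that $\varphi$ is an isometry, so it is well-defined as a linear map from ${}_K X^{\alpha} \otimes_L X^{\beta}_M$ to ${}_K X^{\alpha\beta}_M$, and it is also injective. Moreover, it is surjective because, for an element
$$x = [\text{diagram: an open string from } *_K \text{ and } *_M \text{ containing the path } \rho] \in {}_K X^{\alpha\beta}_M,$$
where we assume without loss of generality that $x$ is long enough that there is a path connecting $*_L$ and $s(\rho)$,

[diagram equation: $x$ is rewritten as the product of an open string between $*_K$ and $*_L$ ending with a path $\eta$ and an open string between $*_L$ and $*_M$ passing through $\eta$ and $\rho$, that is, as $\varphi$ of their tensor product over $L$]

for some $\eta$ with $s(\eta) = *_L$, $r(\eta) = s(\rho)$. Therefore, we have the isomorphism
$${}_K X^{\alpha} \otimes_L X^{\beta}_M \cong {}_K X^{\alpha\beta}_M.$$
Next we prove
$$\overline{{}_K X^{\alpha}_L} = {}_L X^{\tilde\alpha}_K,$$
here we denote the renormalization of the connection $\alpha$ by $\tilde\alpha$. Take an element
$$x = (\zeta \cdot \xi,\ \eta) \in {}_K X^{\alpha}_L,$$
where $\zeta$ starts at $*_K$ and $\eta$ starts at $*_L$. For $x$, we easily see that its image by the contragredient map is given as
$$\bar x = (\eta \cdot \tilde\xi,\ \zeta) \in \overline{{}_K X^{\alpha}_L},$$
here $\tilde\xi$ means the upside down edge of $\xi$. Since

[diagram equations: $\bar x$ is expanded as a sum over $\sigma$, then as a sum over $\sigma, \sigma', \xi'$ with coefficients given by a square marked $\alpha$, which equals the same sum with coefficients given by a square marked $\tilde\alpha$ on the upside down edges $\tilde\xi$, $\tilde\xi'$]

$\bar x$ is regarded as the element of ${}_L X^{\tilde\alpha}_K$, thus we have
$$\overline{{}_K X^{\alpha}_L} \cong {}_L X^{\tilde\alpha}_K.$$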

Now we have a good correspondence between the operations of certain bimodules and those of connections. To complete it, we should check that the construction of bimodules from connections is a one to one correspondence of the equivalent classes. Theorem 3. Let α and β be two connections as below; V0 V1 V0 K K

V1

S1

α

T1

S2

β

T2

V2

L

V3

V2

L

V3

α K XL

β K XL

and are isomorphic if and only if α and β are then the K-L bimodules equivalent to each other up to gauge choice for the vertical edges, in particular the pairs (S1 , T1 ) and (S2 , T2 ) of the vertical graphs must coincide. Remark. In [O3], the same correspondence of bimodules and equivalent classes of connections has been introduced for limited objects, and there an equivalent class of connections is defined as that of a gauge transform not only by vertical gauges but also horizontal ones. If the horizontal graphs are “trees”, the equivalent class by total gauges is the same as that by vertical gauges, however, for general biunitary connections, we should limit the gauge choices only to vertical ones. Proof. First assume that α and β are equivalent up to gauge choice for the vertical edges. Now α and β are on the common four graphs, namely S1 = S2 = S, T1 = T2 = T . From the assumption, we have two unitary matrices uS , uT corresponding to the graphs S, T respectively, such that u∗S αuT = β, where α and β represent the matrices corresponding to the connections. Now we define the isomorphism 8 from K XLα to K XLβ as follows: ∗K ∈ K XLα , |x| = n x = ,∗ L ↓  (n)  ∗K , if n is even,  ξ,  (id · uS ) ∗L 8(x) =    (id(n) · uT ) ∗K , if n is odd, ξ, ∗L ∈ K XLβ , (n) (n) · uS is the concatenation, where id(n) represents the identity ∗K K, and id L of String(1) regarding uS as an element of p∈V0 Stringp S, and uT is as well. Note that this map changes only the vertical part of the elements of the bimodule. Now we check that 8


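is a well-defined linear map, i.e., does not depend on the length of the expression of $x$. Here we assume $n$ is even. Write $[\eta]_n$ for the element of length $n$ with vertical edge $\eta$, and $[\sigma', \eta', \sigma]_{n+1}$ for the element of length $n+1$ with vertical edge $\eta'$ whose last horizontal edges are $\sigma'$ on $K$ and $\sigma$ on $L$. We have

$$
\Phi(x) = \sum_{\eta} u_S^{\eta,\xi}\,[\eta]_n
= \sum_{\eta} u_S^{\eta,\xi} \sum_{\sigma} [\eta]_n\cdot(\sigma,\sigma)
= \sum_{\eta} u_S^{\eta,\xi} \sum_{\sigma,\sigma',\eta'}
\beta\begin{pmatrix} & \sigma' & \\ \eta & & \eta' \\ & \sigma & \end{pmatrix} [\sigma',\eta',\sigma]_{n+1},
$$

where $u_S^{\eta,\xi}$ denotes the $\eta$-$\xi$ entry of the matrix $u_S$. On the other hand, we have

$$
\Phi(x) = \Phi\Big(\sum_{\sigma} [\xi]_n\cdot(\sigma,\sigma)\Big)
= (\mathrm{id}^{(n+1)}\cdot u_T) \sum_{\sigma,\sigma',\xi'}
\alpha\begin{pmatrix} & \sigma' & \\ \xi & & \xi' \\ & \sigma & \end{pmatrix} [\sigma',\xi',\sigma]_{n+1}
= \sum_{\sigma,\sigma',\xi'}
\alpha\begin{pmatrix} & \sigma' & \\ \xi & & \xi' \\ & \sigma & \end{pmatrix}
\sum_{\eta'} u_T^{\eta',\xi'}\,[\sigma',\eta',\sigma]_{n+1}.
$$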

By $u_S^{*}\alpha u_T = \beta$, the above two expressions of $\Phi(x)$ coincide. When $n$ is odd, the same argument applies. Therefore, $\Phi$ is a well-defined linear map. Here, $\Phi$ is obviously a right $L$-homomorphism, and, since $\mathrm{id}\cdot u_S$ (resp. $\mathrm{id}\cdot u_T$) of any length commutes with the elements of $K$ of the same length, $\Phi$ is a left $K$-homomorphism, too. Since $u_S$ and $u_T$ are unitaries, $\Phi$ is an isomorphism. Thus we have ${}_K X_L^{\alpha} \cong {}_K X_L^{\beta}$.

Next we prove the converse. Assume ${}_K X_L^{\alpha} \cong {}_K X_L^{\beta}$. Then we have a partial isometry $u \in \mathrm{End}({}_K X_L^{\alpha} \oplus {}_K X_L^{\beta}) = \mathrm{End}({}_K X_L^{\alpha+\beta})$ such that

$$u : {}_K X_L^{\alpha} \xrightarrow{\ \sim\ } {}_K X_L^{\beta}, \qquad uu^{*} + u^{*}u = \mathrm{id}.$$

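Our aim is to prove $S_1 = S_2$, $T_1 = T_2$ and to construct a gauge transform between $\alpha$ and $\beta$ from $u$.

Claim 1. Consider a connection $\gamma$ with four graphs as below,

$$
\begin{array}{ccc} V_0 & \overset{K}{\text{---}} & V_1\\ S\,\big| & \gamma & \big|\,T\\ V_2 & \underset{L}{\text{---}} & V_3 \end{array}
$$

and three AFD II$_1$ factors as in the beginning of this section. Then we have

$$\mathrm{End}({}_K X_L^{\gamma}) = K' \cap \tilde{L},$$

where the embedding $K \subset \tilde{L}$ is given by $\gamma$.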


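Proof. First we have

$$\mathrm{End}({}_K X_L^{\gamma}) = (\text{the left action of } K \text{ on } X^{\gamma})' \cap (\text{the right action of } L \text{ on } X^{\gamma})'.$$

We have a natural left action of $\tilde{L}$ on ${}_K X_L^{\gamma}$. Now we prove

$$(\text{the right action of } L \text{ on } X^{\gamma})' = (\text{the left action of } \tilde{L} \text{ on } X^{\gamma}). \tag{3.1}$$

The inclusion $\supset$ is obvious, since the left action of $\tilde{L}$ commutes with the right action of $L$, so we prove the equality by comparing the dimensions of $X^{\gamma}$ as a module over both algebras. Take a vertex $x$ on $L$ and consider the projection $p \in \tilde{L}$ given by a pair of identical paths from $*_K$ to $x$, and the projection $q \in L$ given by a pair of identical paths from $*_L$ to $x$. We see that $p\tilde{L}p$ consists of the strings of the form $p\cdot(\sigma,\rho)$, where $\cdot$ means the concatenation and $\sigma$, $\rho$ are paths on $L$ starting at $x$. Namely, $p\tilde{L}p$ essentially consists of the strings of $L$ with the initial vertex $x$. The same holds for $pX^{\gamma}q$ and $qLq$ by a similar argument, thus we have

$$\dim_{p\tilde{L}p}(pX^{\gamma}q) = 1.$$

On the other hand, we have

$$\dim_{p\tilde{L}p}(pX^{\gamma}q) = \frac{\mathrm{tr}_L\, q}{\mathrm{tr}_{\tilde{L}}\, p}\,\dim_{\tilde{L}} X^{\gamma},$$

hence

$$\dim_{\tilde{L}} X^{\gamma} = \frac{\mathrm{tr}_{\tilde{L}}\, p}{\mathrm{tr}_L\, q}.$$

By the same argument, we have $\dim (pX^{\gamma}q)_{qLq} = 1$ and

$$\dim X^{\gamma}_L = \frac{\mathrm{tr}_L\, q}{\mathrm{tr}_{\tilde{L}}\, p}.$$

Thus, we have

$$\dim_{\tilde{L}} X^{\gamma} = \frac{1}{\dim X^{\gamma}_L} = \dim_{L'} X^{\gamma},$$

and the equality in (3.1) holds.

By applying this claim to $\alpha + \beta$, we see that the partial isometry $u$ is in $K' \cap \tilde{L}$ and the map $X^{\alpha} \to X^{\beta}$ is given by the natural left action of $\tilde{L}$ on $X^{\alpha}$. To construct the gauge matrices which transfer $\alpha$ to $\beta$, we use the compactness argument of Ocneanu ([O2], [EK, Sect. 11.4]). We introduce some necessary notions and facts.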


Definition 3 (Flat element, Flat field, Ocneanu [O2], [EK]). Consider a connection $\gamma$ on the four graphs as in the previous claim, and three AFD II$_1$ factors $K$, $L$, and $\tilde{L}$ as at the beginning of this section. Take an element $\xi \in \bigoplus_{p\in V_0}\mathrm{String}^{(1)}_p S$. It is called a flat element if

$$\xi\cdot\mathrm{id}^{(2l)} = \mathrm{id}^{(2l)}\cdot\xi, \qquad l \in \mathbb{N},$$

under the identification by the connection $\gamma$, where $\mathrm{id}^{(2l)}$ denotes the string $\sum_{|\sigma|=2l}(\sigma,\sigma)$ on the graph $K$ (resp. $L$). We use this notation often hereafter under similar conditions.

It is known that, for a flat element $\xi$, there is an element $\eta \in \bigoplus_{p\in V_1}\mathrm{String}^{(1)}_p T$ such that

$$\xi\cdot\mathrm{id}^{(1)} = \mathrm{id}^{(1)}\cdot\eta \qquad\text{and}\qquad \eta\cdot\mathrm{id}^{(2l)} = \mathrm{id}^{(2l)}\cdot\eta$$

by the connection $\gamma$. We call $\eta$ a flat element, too. This "couple" of flat elements represents an element of the string algebra with identification by $\gamma$, namely, starting from $*_K$,

$$\mathrm{id}^{(k)}\cdot\xi = \mathrm{id}^{(l)}\cdot\eta$$

for any sufficiently large even $k$ and odd $l$, that is, large enough that the set of the end points of $\mathrm{id}^{(k)}$ (resp. $\mathrm{id}^{(l)}$) coincides with $V_0$ (resp. $V_1$). Now we define $z$ to be a function on $V_0 \cup V_1$ such that $z(p) \in \mathrm{String}^{(1)}_p S$ (resp. $\mathrm{String}^{(1)}_p T$) for $p \in V_0$ (resp. $V_1$) and $\bigoplus_{p\in V_0} z(p) = \xi$ (resp. $\bigoplus_{p\in V_1} z(p) = \eta$), and call it a flat field. Let $V_n^0$ be a proper subset of $V_n$, where $n = 0, 1$, and put $\xi_0 = \bigoplus_{p\in V_0^0} z(p)$ (resp. $\eta_0 = \bigoplus_{p\in V_1^0} z(p)$). It is known that

$$\xi_0\ (\text{resp. } \eta_0)\cdot\mathrm{id}^{(2j)} = \mathrm{id}^{(2j)}\cdot\xi\ (\text{resp. } \eta)$$

for sufficiently large $j$. We call such elements as $\xi_0$ and $\eta_0$ flat, too, though they are not flat elements by the definition above.

Theorem 4 (Ocneanu [O2], [EK]). Let $K \subset \tilde{L}$ be the AFD II$_1$ subfactor constructed from the connection $\gamma$. Then,

$$K' \cap \tilde{L} = \{\text{flat fields}\}.$$

The correspondence of elements is as follows: Take a flat field $z$ and let $\xi = \bigoplus_{p\in V_0} z(p)$; then $\mathrm{id}^{(2k)}\cdot\xi$, starting from $*_K$, is an element of $K' \cap \tilde{L}$, and conversely, for $x \in K' \cap \tilde{L}$, it turns out that $x$ is written as

$$x = z(*) \in \mathrm{String}^{(1)}_{*} S \subset \tilde{L}$$

for some flat field $z$.


This theorem is proved by the compactness argument of Ocneanu, see [O2] and [EK]. (Generally, the length of a flat field or flat element can be arbitrary.)

Now we continue the proof of Theorem 3, using the above notions. Let $\gamma = \alpha + \beta$ and $S = S_1 \cup S_2$, $T = T_1 \cup T_2$. By the above theorem, we consider the partial isometry $u \in K' \cap \tilde{L}$, which gives the isometry $X^{\alpha} \to X^{\beta}$, as a flat field for the connection $\gamma$. Take $p \in V_0$ and $q \in V_2$ so that they are connected in $S$. Since $uu^{*} + u^{*}u = 1$, we have

$$u(p,q)u(p,q)^{*} + u(p,q)^{*}u(p,q) = 1$$

in the algebra $\mathrm{String}^{(1)}_{(p,q)} S = \mathrm{span}\{(\sigma,\rho) \mid |\sigma| = |\rho| = 1,\ s(\sigma) = s(\rho) = p,\ r(\sigma) = r(\rho) = q\}$, where $u(p,q) \in \mathrm{String}^{(1)}_{(p,q)} S$ is such that $\bigoplus_{q\in V_2} u(p,q) = u(p)$. Take an element of $X^{\alpha}$,

$$x = [\zeta, \xi, \varepsilon], \qquad \xi \in S_1,$$

where $\zeta$ is a horizontal path from $*_K$ to $p$, $\varepsilon$ is a horizontal path from $*_L$ to $q$, and $\xi$ is a vertical edge from $p$ to $q$. From the definition of $u$, we have $ux \in X^{\beta}$, and

$$(\mathrm{id}\cdot u(p,q))\cdot x = \sum_{\eta} u(p,q)_{\eta,\xi}\,[\zeta, \eta, \varepsilon] \in X^{\beta}, \qquad u(p,q)_{\eta,\xi} \in \mathbb{C}.$$

Note that $\eta \in S_2$ if $u(p,q)_{\eta,\xi} \neq 0$. Since $u$ gives an isometry of $X^{\alpha}$ and $X^{\beta}$,

$$\dim \mathrm{span}\{[\zeta, \xi, \varepsilon] \mid \xi \in S_1\} = \dim \mathrm{span}\{[\zeta, \eta, \varepsilon] \mid \eta \in S_2\}$$

for each $\zeta$ and $\varepsilon$. This means

$$\#\{\xi \in S_1 \mid \xi \text{ connects } p \text{ and } q\} = \#\{\eta \in S_2 \mid \eta \text{ connects } p \text{ and } q\}.$$

By considering all the possible pairs of vertices $p$ and $q$, we have $S_1 = S_2$. By the same discussion, we also have $T_1 = T_2$, so $\alpha$ and $\beta$ are on the same four graphs. Now we see that $u(p,q)$ gives the gauge matrix for the edges which connect $p$ and $q$. Let $u_S$ and $u_T$ be the "stable" flat elements on $S$ and $T$ corresponding to the flat field $u$. Since the isomorphism $x \in X^{\alpha} \mapsto u\cdot x \in X^{\beta}$ is well-defined, the same computation as in the proof of the well-definedness of $\Phi$ in the first half of this proof gives $u_S^{*}\alpha u_T = \beta$. Under the identification of $S_1 = S_2$ and $T_1 = T_2$, $u_S$ and $u_T$ are considered as unitary matrices corresponding to the gauge transform between $\alpha$ and $\beta$. Thus we have $\alpha \cong \beta$ up to vertical gauge choice.

Corollary 1. ${}_K X_L^{\alpha}$ is irreducible if and only if $\alpha$ is indecomposable.

Proof. Assume $\alpha$ is decomposable, i.e., there exist gauge unitaries $u_S$, $u_T$ and connections $\beta$, $\gamma$ such that $u_S^{*}\alpha u_T = \beta + \gamma$. Then we have

$${}_K X_L^{\alpha} \cong {}_K X_L^{u_S^{*}\alpha u_T} \cong {}_K X_L^{\beta} \oplus {}_K X_L^{\gamma},$$

namely, ${}_K X_L^{\alpha}$ is reducible. Conversely, assume ${}_K X_L^{\alpha}$ is reducible. Then we have bimodules ${}_K Y_L$ and ${}_K Z_L$ such that

$${}_K X_L^{\alpha} \cong {}_K Y_L \oplus {}_K Z_L$$

and a projection $p \in \mathrm{End}({}_K X_L^{\alpha}) = K' \cap \tilde{L}$ with $p : {}_K X_L^{\alpha} \to {}_K Y_L$. By the same argument as in the proof of the previous theorem, we consider $p$ as a flat field and make the projections $p_S$ and $p_T$ which project elements of ${}_K X_L^{\alpha}$ to ${}_K Y_L$ at the finite level. They act as "projections" of the connection matrix, and we have a "subconnection" $\beta = p_S^{*}\alpha p_T$ of $\alpha$ so that

$${}_K X_L^{\beta} \cong {}_K Y_L;$$

thus, $\alpha$ is decomposable.

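Corollary 2. Let $\gamma$ be a connection as in Claim 1, i.e.,

$$
\begin{array}{ccc} V_0 & \overset{K}{\text{---}} & V_1\\ S\,\big| & \gamma & \big|\,T\\ V_2 & \underset{L}{\text{---}} & V_3 \end{array}
$$

If there exists a vertex $p$ to which only one vertical edge is connected, then the bimodule ${}_K X_L^{\gamma}$ is irreducible.

Proof. Assume $p \in V_0$ without loss of generality. Let $\xi$ be the only vertical edge in $S$ connected to $p$. Assume ${}_K X_L^{\gamma}$ is not irreducible. Then, by the above argument, we have connections $\gamma_1$ and $\gamma_2$ with the four graphs

$$
\begin{array}{ccc} V_0 & \overset{K}{\text{---}} & V_1\\ S_1\,\big| & \gamma_1 & \big|\,T_1\\ V_2 & \underset{L}{\text{---}} & V_3 \end{array}
\qquad\qquad
\begin{array}{ccc} V_0 & \overset{K}{\text{---}} & V_1\\ S_2\,\big| & \gamma_2 & \big|\,T_2\\ V_2 & \underset{L}{\text{---}} & V_3 \end{array}
$$

respectively, so that

$$\gamma \cong \gamma_1 + \gamma_2, \qquad S = S_1 \cup S_2, \qquad T = T_1 \cup T_2.$$

The edge $\xi$ should be contained either in $S_1$ or in $S_2$. Assume $\xi \in S_2$; then no edge in $S_1$ connects to $p$. This contradicts the unitarity of $\gamma_1$.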


Remark. Corollary 2 is a generalization of Wenzl's criterion for irreducibility of subfactors obtained from a periodic sequence of commuting squares (cf. [W]).

4. Main Theorem for the Case of (5 + √13)/2

In this section, we give a proof of our main theorem for the case of index $(5 + \sqrt{13})/2$, due to the second named author.

Theorem 5. A subfactor with principal graph and dual principal graph as in Fig. 1 exists.

From the key lemma, we know that the above theorem follows from the next proposition. We define the connection $\sigma$ as

$$\sigma\begin{pmatrix} p & q\\ r & s \end{pmatrix} = \delta_{\sigma(p),r}\,\delta_{\sigma(q),s},$$

where $p, q, r, s$ are the vertices on the upper graph in Fig. 1, and we define $\sigma(\cdot)$ by $\sigma(x) = x_{\sigma}$, $\sigma(x_{\sigma}) = x_{\sigma^2}$, and $x_{\sigma^3} = x$. Note that, for the vertex $c$, we put $\sigma(c) = c$.

Proposition 1. Let $\alpha$ be the unique connection on the four graphs consisting of the pair of the graphs appearing in Fig. 1, and $\sigma$ be the connection defined above. Then, the following hold.

1) The six connections $1,\ \sigma,\ \sigma^2,\ (\alpha\tilde{\alpha} - 1),\ \sigma(\alpha\tilde{\alpha} - 1),\ \sigma^2(\alpha\tilde{\alpha} - 1)$ are indecomposable and mutually inequivalent.

2) The four connections $\alpha,\ \sigma\alpha,\ \sigma^2\alpha,\ \alpha\tilde{\alpha}\alpha - 2\alpha$ are irreducible and mutually inequivalent.

3) $\sigma(\alpha\tilde{\alpha} - 1) \cong (\alpha\tilde{\alpha} - 1)\sigma^2$.

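Proof. The four graphs of the connection $\alpha$ are as in Fig. 5. The Perron–Frobenius weights of the vertices can easily be computed as follows:

$$\mu(*) = 1,\quad \mu(a) = \mu(a_{\sigma}) = \mu(a_{\sigma^2}) = \lambda,\quad \mu(b) = \mu(b_{\sigma}) = \mu(b_{\sigma^2}) = \lambda^2 - 1,\quad \mu(c) = \lambda^3 - 2\lambda,$$

$$\mu(1) = 1,\quad \mu(2) = \lambda^2 - 1,\quad \mu(3) = \lambda^2 - 2,\quad \mu(4) = \lambda^2,$$

where $\lambda = \sqrt{\frac{5+\sqrt{13}}{2}}$. One can check that Table 1 defines a connection $\alpha$ on the four graphs (Fig. 5) which satisfies Ocneanu's biunitary conditions, i.e.,

$$\left(\alpha\begin{pmatrix} p & \eta' & \cdot\\ \xi & & \xi'\\ \cdot & \eta & s \end{pmatrix}\right)_{\xi\cdot\eta,\ \eta'\cdot\xi'} \text{ is a unitary matrix for each fixed } p, s \qquad\text{(unitarity)}$$

and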

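[Figure 5: the four graphs of the connection α. The horizontal graphs are G0 (between V0 and V1) and G2 (between V3 and V2); the vertical graphs are G3 (between V0 and V3) and G1 (between V1 and V2). Vertex sets: V0 = {∗, b, bσ, bσ2, ∗σ, ∗σ2}, V1 = V3 = {a, c, aσ, aσ2}, V2 = {1, 2, 3, 4}.]

Fig. 5. Four graphs of the connection α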

$$\alpha\begin{pmatrix} x & \eta' & z\\ \xi & & \xi'\\ y & \eta & w \end{pmatrix} = \sqrt{\frac{\mu(y)\mu(z)}{\mu(x)\mu(w)}}\ \overline{\alpha\begin{pmatrix} y & \eta & w\\ \tilde{\xi} & & \tilde{\xi}'\\ x & \eta' & z \end{pmatrix}} \qquad\text{(renormalization)},$$

see [O1] and [EK, Chap. 10]. We see that such a biunitary connection $\alpha$ on these four graphs is determined uniquely up to the complex conjugation arising from the symmetry of the graphs, namely it is essentially unique. The connection $\alpha$ is as in Table 1. Note that

$$(xy, zw)\text{-entry in the table} = \alpha\begin{pmatrix} x & z\\ y & w \end{pmatrix},$$

where we note that, since all the graphs which constitute the four graphs in Fig. 5 are trees, every edge is determined by its two end points. For example, in Table 1 one can find

$$(*a, a2)\text{-entry} = \alpha\begin{pmatrix} * & a\\ a & 2 \end{pmatrix} = 1.$$

We also note that blank entries are all 0's, and

$$\rho = \tfrac{1}{2}\left(-\sqrt{\lambda^2 - 4} + i\sqrt{8 - \lambda^2}\right),\qquad \tau = \tfrac{1}{2}\left(-\sqrt{\lambda^2 - 1} - i\sqrt{5 - \lambda^2}\right),\qquad |\rho| = |\tau| = 1\quad (\bar{\tau}^3 = \rho).$$

Now we display the table of the connection $\tilde{\alpha}$, computed by "renormalization", in Table 2 for use in the later computations. First we check condition 3), namely we prove $\sigma(\alpha\tilde{\alpha} - 1) \cong (\alpha\tilde{\alpha} - 1)\sigma^2$

Table 1. Connection α ∗a

a1 1

a2 1

ba

1

−1 λ2 −1

λ

bc

c2

c3

c4

1 λ2 −1

1

1

ρ¯

τ¯

aσ 2 4

aσ 4

√

λ

√

λ2 −2 λ2 −1

bσ c

λ2 −2 λ2 −1

bσ aσ bσ2 c

ρ

τ

bσ2 aσ

√

q √

q

1 λ2 −1

λ2 −2 λ2 −1

√−1

λ2 −2 λ2 −1

λ2 −1

q

1 λ2 −1

q

λ2 −2 λ2 −1

√−1

λ2 −2 λ2 −1

λ2 −1

∗σ aσ ∗σ 2 a σ 2

1 1

Table 2. Connection α˜ 1a a∗ ab cb cbσ cbσ2

√

1 λ λ2 −1 λ

√

2a

2c

3c

4c

4aσ2

4aσ

λ2 −1 λ −1 λ

1

1 √ 1 λ(λ2 −2) λ2 −1 ρ λ(λ2 −2) λ2 −1 ρ¯ λ(λ2 −2)

√1 3 √1 τ 3 √1 τ¯ 3

aσ bσ a σ ∗σ 2 aσ2 bσ2 aσ 2 ∗ σ

λ2 −1 λ2 −2 1 λ2 −2

1

1 λ2 −2

1

1 −1 1 −1

1 1

up to vertical gauge choice. It is enough to show $(\alpha\tilde{\alpha} - 1) \cong \sigma(\alpha\tilde{\alpha} - 1)\sigma$, so now we will prove this equivalence. First we compute the connection $\alpha\tilde{\alpha}$. The four graphs on which the connection $\alpha\tilde{\alpha}$ exists are as in Fig. 7. The vertical graphs are constructed as in Fig. 6, where we explain it only for $GG^{t}$. To obtain the connection $\alpha\tilde{\alpha} - 1$, we multiply the entries of the connections $\alpha$ and $\tilde{\alpha}$ properly (we call this sort of multiplication of connections "actual" multiplication), transform the result by a vertical gauge so that the entries corresponding to the trivial connection 1 are 1, and subtract 1. (In Fig. 7, the broken lines correspond to this trivial summand.) Here, in Table 3, we show the landscape of $\alpha\tilde{\alpha}$ with 1's in the entries corresponding to 1.

[Figure 6: the graph G1 between V1 = {a, c, aσ, aσ2} and V2 = {1, 2, 3, 4} is composed with its transpose G1t to give the vertical graph G1G1t on V1.]

Fig. 6. Construction of the vertical graphs of αα˜

First we will compute the entries marked in Table 3. We assume that 1 × 1 gauge transform unitaries corresponding to single vertical edges which connect different vertices in the graph G to be 1 without losing generality, because they are not involved in the trivial connection 1. We compute such entries by “actual” multiplication. ∗

a

αα˜

b

∗ =

c

a

a

α

a

·

2

2

α˜

b

c

= 1 · 1 = 1,

√ √ √ λ λ2 − 2 λ2 − 1 λ2 − 2 √ α˜ · . = = = · λ2 − 1 λ λ2 − 1 a a a ∗ ∗ 2 From here, we only write the result of multiplication. b

αα˜

c

b

α

b bσ

a

c

αα˜

a c

b bσ b

αα˜

bσ 2

2

b

=

αα˜

c

a

b

αα˜

b

= b

=

c

c α

·

α

a

b

c

·

α

α˜

bσ c 4

2

c

·

c bσ 2 c

2 c c

bσ

α˜

·

=√

α˜

2

c

α˜

−2

,

4 aσ

= 1,

=√

c

ρ λ2

ρ¯ λ2

−2

,

4

= 1, bσ2 aσ2 c 2 α αα˜ α˜ · = ρ, ¯ = a a c 2 b b

bσ2 aσ2 bσ c

bσ b

aσ

αα˜

c

=

a 2

c

aσ

c

α

bσ =

c

c bσ

α

aσ 4

4 c

·

c b

α˜

4 c

=√

1 λ2

−2

,

[Figure 7: the connection α on the graphs (G0; G3, G1; G2) is stacked on the connection α˜ on the graphs (G2; G3t, G1t; G0) to give the connection αα˜, whose horizontal graphs are both G0 (between V0 and V1) and whose vertical graphs are G3G3t on V0 = {∗, b, bσ, bσ2, ∗σ, ∗σ2} and G1G1t on V1 = {a, c, aσ, aσ2}.]

Fig. 7. The four graphs of αα˜

bσ

aσ

αα˜

bσ 2

c

bσ

c

αα˜

bσ2 aσ2 bσ

αα˜

∗σ2 aσ2 bσ 2

c bσ

=

c bσ

=

aσ

αα˜

c bσ

=

c

∗σ2 aσ2 bσ

=

aσ

αα˜

bσ2 aσ2 bσ

bσ

aσ bσ

=

aσ

c

α

α

√ λ2 − 4 α˜ · , =√ λ2 − 2 4 bσ 2 c

aσ

c

·

4 α

aσ 4

α

c aσ 4

bσ 2

·

·

4 α

c

·

α

4

c

α˜

4

bσ2 aσ2 c

α˜

4

bσ2 aσ2 aσ

α˜

4

∗σ2 aσ2 aσ

α˜

4

∗σ2 aσ2 c

·

c

α˜

=√

1 λ2

−1

,

√ λ2 − 2 , = √ λ2 − 1 √ λ2 − 2 , = √ λ2 − 1 =√ 2

−1 , λ2 − 1

= ρ, a a c 2 b b c bσ2 aσ2 aσ2 c 4 1 αα˜ α˜ , = bσ 2 · =√ 2 λ −2 c c c 4 b b αα˜

=


Table 3. Landscape of the connection αα˜ after a gauge transform aa1 aa2 ca ac cc1 cc2 cc3 aσ c aσ2 c caσ aσ aσ aσ2 aσ caσ2 aσ aσ2 aσ2 aσ2 ∗∗

1

0

∗b

0

◦

b∗

0

◦

bb1

1

0

0

0

1

0

0

bb2

0

•

◦ ◦

0

•

•

bbσ

0

bbσ2

0

bσ b

0

bσ bσ 1

1

0

0

0

0

1

bσ bσ 2

0

•

•

◦

◦

0

bσ bσ2

0

bσ ∗σ2 0

bσ2 bσ

bσ2 b

0

bσ2 bσ2 1

1

•

•

◦

bσ2 bσ2

0

•

•

◦

2

bσ2 ∗σ

∗σ bσ2

◦

1

◦

0

∗σ ∗σ

1

∗σ2 bσ

∗σ 2 ∗σ 2

1

bσ2 aσ2 αα˜

bσ 2

c

bσ 2

c

bσ

αα˜

aσ

=

=

√ bσ2 aσ2 c 4 λ2 − 4 α α˜ , · =√ λ2 − 2 c 4 bσ c bσ 2 c

α

c 4

·

c bσ

α˜

4 aσ

=√

1 λ2

−1

,

√ bσ2 aσ2 c 4 λ2 − 2 α αα˜ α˜ , · = = √ λ2 − 1 c bσ aσ 4 bσ aσ √ bσ2 c aσ2 4 bσ 2 c λ2 − 2 α αα˜ α˜ · , = = √ λ2 − 1 ∗σ aσ aσ2 4 ∗σ aσ bσ2 aσ2

bσ2 aσ2 ∗σ

αα˜

aσ

∗σ

=

aσ

αα˜

bσ 2

bσ2 aσ2 aσ2 4 −1 α α˜ · , =√ aσ2 4 ∗σ aσ λ2 − 1

c

∗σ =

aσ2

α

aσ aσ2 4 α˜ · = 1, 4 bσ 2 c


aσ aσ2 4 α˜ · = −1, bσ2 aσ2 aσ2 4 bσ2 aσ2 ∗σ2 aσ2 ∗σ2 aσ2 aσ 4 α αα˜ α˜ · = = 1, aσ 4 bσ c bσ c ∗σ2 aσ2 ∗σ2 aσ2 aσ2 4 α αα˜ α˜ · = = −1. bσ aσ aσ2 4 bσ aσ Next, we will obtain the entries marked ◦. We have two vectors of connection αα˜ concerning b-b double edges by “actual” multiplication as follows:   a c  √ 2  2 b − λ −2 α α ˜ ·   c λ2 −1  b    a a 2 b = , αα˜ =     a   b b c c 2 1 λ2 −1 α α˜ · a c 2 b   a a   2 b −1 α 2 −1 α˜ ·   λ a b    a c  2 b = . αα˜ =     c  b b 1 c a 2  √ (λ2 −1) λ2 −2 α α˜ · c c 2 b Since these two vectors are proportional, they are transformed to two proportional vectors by a left vertical gauge transform for the double edges b-b, i.e., multiplication from the left by an element of U (2). Since we should have 1’s in the (bb1 , aa1 )-entry and the (bb1 , cc1 )-entry, they can be transformed into the following pair:     0 0  1  , and  √λ2 −4  , √ √ 2 ∗σ

aσ

αα˜

∗σ

=

α

λ −1

then we have (bb2 , ca) = √ 12

λ −1

λ2 −2

√ λ2 −4 , (bb2 , ac) = √ 2 respectively, where we have omitted λ −2

“-entry”. The same procedure for the entries with vertical double edges bσ -bσ and bσ2 -bσ2 gives two pairs of vectors as follows:     bσ aσ c 4 1 √ α α ˜ · 2 2    (λ −1)(λ −2)  bσ aσ  c  4 bσ c  = αα˜ = ,     bσ c  bσ aσ aσ 4   −1 √ α α˜ · λ2 −1 aσ 4 bσ c     c bσ c 4 √1 α 2 α ˜ ·    λ −1  bσ c  c  4 bσ aσ  = αα˜ = ,    √   a bσ σ  bσ c aσ 4  − λ2 −2 √ α α˜ · λ2 −1 aσ 4 bσ aσ


   bσ2 aσ2 c 4 1 √ α α ˜ · 2 2    (λ −1)(λ −2)  bσ2 aσ2  c  4 bσ 2 c  = αα˜ = ,      bσ 2 c  bσ2 aσ2 aσ2 4  −1 √ 2 α α˜ · λ −1 aσ2 4 bσ2 c     bσ2 aσ2 c 4 √1 α 2 α ˜ ·    λ −1  bσ 2 c  c  4 bσ2 aσ2  = αα˜ = .    bσ2 aσ2  bσ2 c aσ2 4   −√λ2 −2  √ α α˜ · λ2 −1 aσ2 4 bσ2 aσ2 

The first pair concerning bσ -bσ double edges can be transformed by gauge unitary into the pair   ! 0 0  1  and , √ 1 λ2 −2

where we note (bσ b2σ , cc1 ) = (bσ b2σ , aσ a1σ ) = 1 by this gauge. Since the second pair is equal to the first pair, it can be transformed to the same pair of vectors by the left gauge unitary of the double edges bσ2 -bσ2 . Thus, we have (bσ b2σ , aσ c) = (bσ2 b2σ2 , aσ2 c) = √ and

1 λ2 − 2

(bσ b2σ , caσ ) = (bσ2 b2σ2 , caσ2 ) = 1.

Along the same argument, we have (∗b, aa1 ) = 1, (b∗, aa1 ) = √

1 λ2

−1

from the “actual” multiplications ∗ b b ∗

αα˜

αα˜

√

a a

=

a a

=

λ2 − 1 −1 , λ λ

1 −1 , √ λ λ λ2 − 1

! ,

,

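by the right gauge unitary of double edges a-a. So far, we have the entries of $\alpha\tilde{\alpha}$ as in Table 4, where the entries $g_{**}$ mean that they have not been determined so far. Now we will obtain the entries marked in Table 3. Denote the vectors of entries of "actual" multiplication $\alpha\tilde{\alpha}$ corresponding to $(0, g_{??})$ by $f_{??}$. We use the following data of $f_{??}$'s, each of which is a vector of three products of the form

$$f_{bb_{\sigma}} = \alpha\tilde{\alpha}\begin{pmatrix} b & c\\ b_{\sigma} & c \end{pmatrix} = \left(\alpha\begin{pmatrix} b & c\\ c & 2 \end{pmatrix}\cdot\tilde{\alpha}\begin{pmatrix} c & 2\\ b_{\sigma} & c \end{pmatrix},\ \alpha\begin{pmatrix} b & c\\ c & 3 \end{pmatrix}\cdot\tilde{\alpha}\begin{pmatrix} c & 3\\ b_{\sigma} & c \end{pmatrix},\ \alpha\begin{pmatrix} b & c\\ c & 4 \end{pmatrix}\cdot\tilde{\alpha}\begin{pmatrix} c & 4\\ b_{\sigma} & c \end{pmatrix}\right).$$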


Table 4. Connection αα˜ (λn =

√

λ2 − n)

aa1 aa2 ca ac cc1 cc2 cc3 aσ c aσ2 c caσ aσ aσ aσ2 aσ caσ2 aσ aσ2 aσ2 aσ2 ∗∗

1

0

∗b

0

1

b∗

0

1 λ1

λ2 λ1

bb1

1

0

0

0

bb2

0

−λ2 λ1

1 λ1

λ4 λ2 ρ¯ λ2 ρ λ2

1

bbσ

1

0

0 gbb

0 0

gbbσ

0

gbb

0

gbσ b

1

0

0

0

0

1

2

0

gbσ bσ

1

0

bσ bσ2

0

gbσ b

1 λ2 λ4 λ2

0

gb

bσ2 bσ

0

gb

bσ2 bσ2 1

1

0

bσ2 bσ2 2

0 gb

bbσ2 bσ b

ρ

bσ bσ 1 bσ bσ

1 1

σ2

σ2

1 λ2

1 λ1 λ2 λ1

bσ ∗σ2 bσ2 b

ρ¯

b σ2 σ

1 λ2 λ4 λ2

0

0

0

1

1 λ2

1

0

σ2

b

b σ2 σ2

bσ2 ∗σ ∗σ bσ2

λ2 λ1 −1 λ1

1 λ1

λ2 λ1

0

0

λ2 λ1

−1 λ1

−1

1

∗σ ∗σ

1

∗σ2 bσ

1

∗σ 2 ∗σ 2

−1 1

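$$f_{bb_{\sigma}} = \left(\frac{\rho}{\lambda(\lambda^2-2)},\ \frac{\tau}{\sqrt{3}},\ \frac{1}{\lambda^2-2}\right),\qquad
f_{bb_{\sigma^2}} = \left(\frac{\bar{\rho}}{\lambda(\lambda^2-2)},\ \frac{\bar{\tau}}{\sqrt{3}},\ \frac{1}{\lambda^2-2}\right),$$

$$f_{b_{\sigma}b} = \left(\frac{\bar{\rho}}{\lambda(\lambda^2-2)},\ \frac{\bar{\tau}}{\sqrt{3}},\ \frac{1}{\lambda^2-2}\right),\qquad
f_{b_{\sigma}b_{\sigma^2}} = \left(\frac{(\lambda^2-1)\bar{\rho}^2}{\lambda(\lambda^2-2)},\ \frac{\bar{\tau}^2}{\sqrt{3}},\ \frac{1}{(\lambda^2-2)\sqrt{\lambda^2-1}}\right),$$

$$f_{b_{\sigma^2}b} = \left(\frac{\rho}{\lambda(\lambda^2-2)},\ \frac{\tau}{\sqrt{3}},\ \frac{1}{\lambda^2-2}\right),\qquad
f_{b_{\sigma^2}b_{\sigma}} = \left(\frac{(\lambda^2-1)\rho^2}{\lambda(\lambda^2-2)},\ \frac{\tau^2}{\sqrt{3}},\ \frac{1}{(\lambda^2-2)\sqrt{\lambda^2-1}}\right).$$

Note that $f_{bb_{\sigma^2}} = f_{b_{\sigma}b}$ and $f_{b_{\sigma^2}b} = f_{bb_{\sigma}}$, so they are transformed keeping equality by the gauge transform of the triple edges c-c. Therefore, we consider only $f_{bb_{\sigma}}$, $f_{b_{\sigma}b}$, $f_{b_{\sigma}b_{\sigma^2}}$ and $f_{b_{\sigma^2}b_{\sigma}}$. We have the following lemma.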

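Lemma 4. The three vectors

$$u_1 = \left(\frac{1}{\sqrt{3}},\ \frac{1}{\lambda},\ \frac{1}{\sqrt{\lambda^2-2}}\right),\qquad
u_2 = \left(\frac{\sqrt{\lambda^2-2}}{3},\ \frac{\sqrt{\lambda^2-2}}{\lambda\sqrt{3}},\ -\frac{\lambda^2-3}{\sqrt{3}}\right),$$

and

$$u_3 = \left(\frac{\lambda^2-2}{\lambda\sqrt{3}},\ -\frac{\lambda^2-2}{3},\ 0\right)$$

form an orthonormal basis for $\mathbb{C}^3$, and

$$f_{b_{\sigma^2}b} = f_{bb_{\sigma}} = \frac{-1}{\sqrt{3}}\,u_2 + \left(\frac{\sqrt{\lambda^2-2}}{2\lambda} + \frac{\lambda^2-3}{2\lambda}\,i\right)u_3,$$

$$f_{bb_{\sigma^2}} = f_{b_{\sigma}b} = \frac{-1}{\sqrt{3}}\,u_2 + \left(\frac{\sqrt{\lambda^2-2}}{2\lambda} - \frac{\lambda^2-3}{2\lambda}\,i\right)u_3,$$

$$f_{b_{\sigma}b_{\sigma^2}} = -\sqrt{\frac{\lambda^2-4}{3}}\,u_2 + \left(-\frac{\lambda^2-2}{2\sqrt{3}} + \frac{\sqrt{\lambda^2-3}}{2}\,i\right)u_3,$$

$$f_{b_{\sigma^2}b_{\sigma}} = -\sqrt{\frac{\lambda^2-4}{3}}\,u_2 + \left(-\frac{\lambda^2-2}{2\sqrt{3}} - \frac{\sqrt{\lambda^2-3}}{2}\,i\right)u_3.$$

Proof. Checked by elementary, but heavy computations, using $\lambda^4 - 5\lambda^2 + 3 = 0$.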
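The identities of Lemma 4 can also be confirmed numerically. The following Python sketch is not part of the original argument. It assumes the reading of $u_1, u_2, u_3$, of $\rho$, $\tau$, and of the vectors $f_{bb_\sigma}$ and $f_{b_\sigma b_{\sigma^2}}$ used in this section, with components ordered by the middle vertices 2, 3, 4; the helper names `dot`, `combo` and `close` are ours.

```python
import math

lam2 = (5 + math.sqrt(13)) / 2          # lambda^2, the Jones index
lam = math.sqrt(lam2)
s3 = math.sqrt(3)

# The two phases appearing in Table 1 (assumed reading of the text).
rho = 0.5 * (-math.sqrt(lam2 - 4) + 1j * math.sqrt(8 - lam2))
tau = 0.5 * (-math.sqrt(lam2 - 1) - 1j * math.sqrt(5 - lam2))

# Candidate orthonormal basis of C^3 from Lemma 4.
u1 = (1 / s3, 1 / lam, 1 / math.sqrt(lam2 - 2))
u2 = (math.sqrt(lam2 - 2) / 3,
      math.sqrt(lam2 - 2) / (lam * s3),
      -(lam2 - 3) / s3)
u3 = ((lam2 - 2) / (lam * s3), -(lam2 - 2) / 3, 0.0)


def dot(v, w):
    """Hermitian inner product sum_i v_i * conj(w_i)."""
    return sum(a * complex(b).conjugate() for a, b in zip(v, w))


def combo(a, b):
    """Return the vector a*u2 + b*u3."""
    return tuple(a * x + b * y for x, y in zip(u2, u3))


def close(v, w, tol=1e-12):
    """Componentwise comparison of two vectors."""
    return all(abs(a - b) < tol for a, b in zip(v, w))


# Vectors obtained by "actual" multiplication (assumed reading).
f_b_bs = (rho / (lam * (lam2 - 2)), tau / s3, 1 / (lam2 - 2))
f_bs_bs2 = ((lam2 - 1) * rho.conjugate() ** 2 / (lam * (lam2 - 2)),
            tau.conjugate() ** 2 / s3,
            1 / ((lam2 - 2) * math.sqrt(lam2 - 1)))

# Coefficients of u3 claimed in Lemma 4.
c_b_bs = (math.sqrt(lam2 - 2) + 1j * (lam2 - 3)) / (2 * lam)
c_bs_bs2 = -(lam2 - 2) / (2 * s3) + 1j * math.sqrt(lam2 - 3) / 2

basis = (u1, u2, u3)
gram_ok = all(abs(dot(basis[i], basis[j]) - (1 if i == j else 0)) < 1e-12
              for i in range(3) for j in range(3))
lemma_ok = (close(f_b_bs, combo(-1 / s3, c_b_bs))
            and close(f_bs_bs2, combo(-math.sqrt((lam2 - 4) / 3), c_bs_bs2)))
```

Here `gram_ok` tests orthonormality of the basis and `lemma_ok` tests the first and third decompositions; the remaining two follow by complex conjugation. A floating-point check of this kind does not replace the algebraic proof, but it is a quick guard against sign or ordering slips when the vectors are transcribed.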

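From Lemma 4, we have the $g_{??}$'s as the expressions of the $f_{??}$'s in the orthonormal basis $u_2$ and $u_3$, as follows:

$$g_{bb_{\sigma}} = g_{b_{\sigma^2}b} = \left(\frac{-1}{\sqrt{3}},\ \frac{\sqrt{\lambda^2-2}}{2\lambda} + \frac{\lambda^2-3}{2\lambda}\,i\right),\qquad
g_{bb_{\sigma^2}} = g_{b_{\sigma}b} = \left(\frac{-1}{\sqrt{3}},\ \frac{\sqrt{\lambda^2-2}}{2\lambda} - \frac{\lambda^2-3}{2\lambda}\,i\right),$$

$$g_{b_{\sigma}b_{\sigma^2}} = \left(-\sqrt{\frac{\lambda^2-4}{3}},\ -\frac{\lambda^2-2}{2\sqrt{3}} + \frac{\sqrt{\lambda^2-3}}{2}\,i\right),\qquad
g_{b_{\sigma^2}b_{\sigma}} = \left(-\sqrt{\frac{\lambda^2-4}{3}},\ -\frac{\lambda^2-2}{2\sqrt{3}} - \frac{\sqrt{\lambda^2-3}}{2}\,i\right).$$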


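$g_{bb}$, $g_{b_{\sigma}b_{\sigma}}$ and $g_{b_{\sigma^2}b_{\sigma^2}}$ are uniquely determined so that the matrices

$$\alpha\tilde{\alpha}\binom{b}{c} = \begin{pmatrix} \sqrt{\frac{\lambda^2-4}{\lambda^2-2}} & g_{bb}\\[4pt] \frac{\bar{\rho}}{\sqrt{\lambda^2-2}} & g_{bb_{\sigma}}\\[4pt] \frac{\rho}{\sqrt{\lambda^2-2}} & g_{bb_{\sigma^2}} \end{pmatrix},\qquad
\alpha\tilde{\alpha}\binom{b_{\sigma}}{c} = \begin{pmatrix} g_{b_{\sigma}b} & \frac{1}{\sqrt{\lambda^2-2}}\\[4pt] g_{b_{\sigma}b_{\sigma}} & \frac{1}{\sqrt{\lambda^2-2}}\\[4pt] g_{b_{\sigma}b_{\sigma^2}} & \sqrt{\frac{\lambda^2-4}{\lambda^2-2}} \end{pmatrix}$$

and

$$\alpha\tilde{\alpha}\binom{b_{\sigma^2}}{c} = \begin{pmatrix} g_{b_{\sigma^2}b} & \frac{1}{\sqrt{\lambda^2-2}}\\[4pt] g_{b_{\sigma^2}b_{\sigma}} & \sqrt{\frac{\lambda^2-4}{\lambda^2-2}}\\[4pt] g_{b_{\sigma^2}b_{\sigma^2}} & \frac{1}{\sqrt{\lambda^2-2}} \end{pmatrix}$$

are unitaries, hence we have

$$g_{bb} = \left(\frac{-1}{\sqrt{3}},\ -\frac{\sqrt{\lambda^2-2}}{\lambda}\right),\qquad
g_{b_{\sigma}b_{\sigma}} = \left(\frac{\lambda^2-3}{\sqrt{3}},\ 0\right),\qquad
g_{b_{\sigma^2}b_{\sigma^2}} = \left(\frac{\lambda^2-3}{\sqrt{3}},\ 0\right).$$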

Now, the connection (αα˜ − 1) is as in the Table 5. ∼ Our aim is to show (αα−1) ˜ ˜ For this purpose, an expression of αα−1 ˜ = σ(αα−1)σ. with symmetry up to σ is useful. We will re-choose another gauge as in Table 6, where s = τ¯ , and numbers beside the name of the edges denote 1 × 1 unitaries corresponding to the edges, namely, we have multiplied these numbers to the corresponding rows (resp. 0 ’s denote the vectors corresponding to g∗∗ ’s after columns) in the previous table, and g∗∗ being multiplied by suitable gauge numbers respectively. By seeing this table, we easily 0 see that entries other than g∗∗ ’s are invariant to the transformation of (αα˜ − 1) −→ σ(αα˜ − 1)σ, which acts on the table as the relabeling xy → σ(x)σ(y). The remaining problem is whether we have a gauge unitary matrix u cc 2 corresponding to double edges c-c such that c 0 u ( c) 0 −→ gσ(∗)σ(∗) g∗∗ or not. We can check by a simple computation that   3 (λ2 −2) 2 −1 λ2√ −2 − i c 6 λ 3  = 2 u 3 (λ2 −2) 2 λ2√ −2 −1 c 2 i i 2 + 6 λ 3


Table 5. Connection αα˜ − 1(λn = aa2 ∗b

1

b∗

1 λ1 −λ2 λ1

bb2

ca λ2 λ1 1 λ1

bbσ bbσ2 bσ b

ac

cc2

λ4 λ2 ρ¯ λ2 ρ λ2

ρ

cc3

aσ c

√

λ2 − n)

aσ 2 c

caσ

a σ 2 aσ

1

gbb

1

σ2

gbσ b gbσ bσ

bσ bσ2

gbσ b

σ2

1 λ2 1 λ2 λ4 λ2

1 1 λ1 λ2 λ1

bσ ∗σ2 ρ¯

gb

σ2

bσ2 bσ

gb

bσ2 bσ2 2

gb

1 λ2 λ4 λ2 1 λ2

b

b σ2 σ b

σ2 σ2

bσ2 ∗σ ∗σ bσ2

1 λ1

λ2 λ1

λ2 λ1

−1 λ1

λ2 λ1 −1 λ1

1

−1

1

∗σ2 bσ

a σ aσ 2

gbb gbbσ

bσ bσ 2

bσ2 b

caσ2

1

−1

gives rise to the transformation

$$g'_{bb_{\sigma}} \to g'_{b_{\sigma}b_{\sigma^2}} \to g'_{b_{\sigma^2}b} \to g'_{bb_{\sigma}},\qquad
g'_{bb_{\sigma^2}} \to g'_{b_{\sigma}b} \to g'_{b_{\sigma^2}b_{\sigma}} \to g'_{bb_{\sigma^2}},$$

and

$$g'_{bb} \to g'_{b_{\sigma}b_{\sigma}} \to g'_{b_{\sigma^2}b_{\sigma^2}}.$$

Thus, we have proved the equivalence of connections $\alpha\tilde{\alpha} - 1 \cong \sigma(\alpha\tilde{\alpha} - 1)\sigma$.

Finally we will check conditions 1) and 2). Mutual inequivalence is obvious from the four graphs of the connections appearing there. Namely, connections producing bimodules of different indices are trivially mutually inequivalent. To prove inequivalence of connections which produce bimodules of the same index, it is sufficient to show the existence of unitary matrices of the connection, at a pair of vertices $\binom{x}{y}$, which have different sizes in each connection. We can check it only by looking at the four graphs. As for indecomposability, since the key lemma is stated in terms of irreducibility of bimodules, all we must check is the irreducibility of the bimodules made of the connections here. The bimodule $X^{1} = {}_N N_N$ is trivially irreducible, and indecomposability of $\sigma$ and $\sigma^2$ follows. To see the irreducibility of $X^{\alpha}$, consider the subfactor $N \subset M$ constructed from the connection $\alpha$. Then $X^{\alpha} = {}_N M_M$. By Ocneanu's compactness argument (see Sect. 3, Theorem 4),


Table 6. Connection αα˜ − 1 after taking symmetric gauge choice 1 aa2 1

∗b

1

1

b∗

1

bb2

1 λ1 −λ2 λ1

s2

bbσ

s¯2

1

bσ b

1

bσ bσ

s¯

bσ bσ2

s¯

bσ ∗σ2

s2

bσ2 b

s

bσ2 bσ

s 2

1 bσ2 bσ2 2 s

bσ2 ∗σ

1

∗σ bσ2

1

∗σ2 bσ

s¯

s¯

−s¯

s

−s

1 λ2 λ1 1 λ1

bbσ2

s¯2

s

1

ca ac cc2 cc3 aσ c aσ2 c caσ aσ2 aσ caσ2 aσ aσ2

s¯

λ4 λ2 s¯ λ2 s λ2

0 gbb 0 gbb σ

s

0 gbb σ2

gb0 σ b gb0 σ bσ gb0 σ b σ2 gb0 gb0 gb0

σ2

s¯ s¯ λ2 s λ2 λ4 λ2

s¯

s λ2 λ4 λ2 s¯ λ2

b

b σ2 σ b

σ2 σ2

−λ2 λ1 1 λ1

1 λ1 λ2 λ1

1 λ1

−λ2 λ1

λ2 λ1

1 λ1

s

1

1 1

1

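$$\mathrm{End}({}_N X_M^{\alpha}) = \mathrm{End}({}_N M_M) = N' \cap M \subset \mathrm{String}^{(1)}_{*} G = \mathbb{C},$$

where $G$ is the upper graph in Fig. 1; thus irreducibility of $X^{\alpha}$, $X^{\sigma\alpha}$, and $X^{\sigma^2\alpha}$ follows. Similarly, we have

$$\mathrm{End}({}_N X_N^{\alpha\tilde{\alpha}}) = \mathrm{End}({}_N M \otimes_M M_N) = N' \cap M_1 \subset \mathrm{String}^{(2)}_{*} G = \mathbb{C} \oplus \mathbb{C},$$

where $N \subset M \subset M_1 \subset \cdots$ is the Jones tower of $N \subset M$; thus irreducibility of $X^{\alpha\tilde{\alpha}-1}$, $X^{\sigma(\alpha\tilde{\alpha}-1)}$, and $X^{\sigma^2(\alpha\tilde{\alpha}-1)}$ follows. Irreducibility of $X^{\alpha\tilde{\alpha}\alpha-2\alpha}$ follows in the same way, using $\mathrm{String}^{(3)}_{*} G = \mathbb{C} \oplus M_2(\mathbb{C})$.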

Now, the proposition holds and thus we have proved the theorem.

5. Main Theorem for the Case of (5 + √17)/2

In this section, we will give a proof of our main theorem for the case of index $(5 + \sqrt{17})/2$, due to the first named author.

Theorem 6. A subfactor with principal graph and dual principal graph as in Fig. 2 exists.

From the key lemma, we know that the above theorem follows from the next proposition. We define the connection $\sigma$ as

$$\sigma\begin{pmatrix} p & q\\ r & s \end{pmatrix} = \delta_{p,\tilde{r}}\,\delta_{q,\tilde{s}},$$

where $p, q, r, s$ are the vertices on the upper graph in Fig. 2, we consider $\tilde{\tilde{x}}$ as $x$, and, if $x$ is one of $e, f, g$, we put $\tilde{x} = x$.

Proposition 2. Let $\alpha$ be the unique connection on the four graphs consisting of the pair of the graphs appearing in Fig. 2, and $\sigma$ be the connection defined above. Then, the following hold:

1) The eight connections
$$1,\ \sigma,\ (\alpha\tilde{\alpha} - 1),\ \sigma(\alpha\tilde{\alpha} - 1),\ (\alpha\tilde{\alpha} - 1)\sigma,\ \sigma(\alpha\tilde{\alpha} - 1)\sigma,\ (\alpha\tilde{\alpha})^2 - 3\alpha\tilde{\alpha} + 1,\ \sigma((\alpha\tilde{\alpha})^2 - 3\alpha\tilde{\alpha} + 1)$$
are indecomposable and mutually inequivalent.

2) The six connections
$$\alpha,\ \sigma\alpha,\ \alpha\tilde{\alpha}\alpha - 2\alpha,\ \sigma(\alpha\tilde{\alpha}\alpha - 2\alpha),\ (\alpha\tilde{\alpha})^2\alpha - 4\alpha\tilde{\alpha}\alpha + 3\alpha,\ (\alpha\tilde{\alpha} - 1)\sigma\alpha$$
are irreducible and mutually inequivalent.

3) $\sigma(\alpha\tilde{\alpha} - 1)\sigma\alpha \cong (\alpha\tilde{\alpha} - 1)\sigma\alpha$.

Proof. The four graphs of the connection $\alpha$ are as in Fig. 8. The Perron–Frobenius weights for Fig. 8 are:

$$\mu(*) = \mu(\tilde{*}) = 1,\quad \mu(a) = \mu(\tilde{a}) = \beta,\quad \mu(b) = \mu(\tilde{b}) = \mu(h) = \mu(\tilde{h}) = \beta^2 - 1,$$

$$\mu(c) = \mu(\tilde{c}) = \beta^3 - 2\beta,\quad \mu(d) = \mu(\tilde{d}) = 2\beta^2 - 1,\quad \mu(e) = \beta^3 + \beta,\quad \mu(f) = 2\beta^2,\quad \mu(g) = \beta^3 - \beta,$$

$$\mu(2) = \beta^2 - 1,\quad \mu(3) = 2\beta^2 - 1,\quad \mu(4) = \beta^2 + 1,\quad \mu(5) = 3\beta^2 - 2,\quad \mu(6) = \beta^2.$$

Note that the Perron–Frobenius weights of the vertices in $V_3$ are the same as those of the vertices in $V_1$, and here we used $\beta^4 - 5\beta^2 + 2 = 0$. The biunitary connection $\alpha$ on these four graphs is determined uniquely as in Table 7, as in the $(5 + \sqrt{13})/2$ case. We also display the table of the connection $\tilde{\alpha}$ in Table 7 for use in later computations.
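The weights above can be checked against the eigenvalue equation $\sum_{w \sim v}\mu(w) = \beta\,\mu(v)$. The following Python sketch does this for the graph $G_3$ between $V_0$ and $V_3$. It rests on two assumptions: the edges are read off from the row labels of Table 7 (∗A, bA, bC, dC, dE, fE, fG, hG, h̃C̃, h̃Ã, d̃E, d̃C̃, b̃G, ∗̃Ã), and the vertices A, C, E, C̃, G, Ã of $V_3$ carry the weights of a, c, e, c̃, g, ã. The name `pf_defect` is ours, and a trailing `~` stands for a tilde.

```python
import math

beta2 = (5 + math.sqrt(17)) / 2         # beta^2, the Jones index
beta = math.sqrt(beta2)

# Edges of G3 as read off from the row labels of Table 7 (an assumption).
EDGES = [
    ("*", "A"), ("b", "A"), ("b", "C"), ("d", "C"), ("d", "E"),
    ("f", "E"), ("f", "G"), ("h", "G"), ("h~", "C~"), ("h~", "A~"),
    ("d~", "E"), ("d~", "C~"), ("b~", "G"), ("*~", "A~"),
]

# Perron-Frobenius weights as listed in the proof of Proposition 2.
WEIGHTS = {
    "*": 1.0, "*~": 1.0,
    "b": beta2 - 1, "b~": beta2 - 1, "h": beta2 - 1, "h~": beta2 - 1,
    "d": 2 * beta2 - 1, "d~": 2 * beta2 - 1,
    "f": 2 * beta2,
    "A": beta, "A~": beta,
    "C": beta ** 3 - 2 * beta, "C~": beta ** 3 - 2 * beta,
    "E": beta ** 3 + beta,
    "G": beta ** 3 - beta,
}


def pf_defect(edges, weights, eigenvalue):
    """Largest violation of sum_{w ~ v} mu(w) = eigenvalue * mu(v)."""
    sums = {v: 0.0 for v in weights}
    for u, v in edges:
        sums[u] += weights[v]
        sums[v] += weights[u]
    return max(abs(sums[v] - eigenvalue * weights[v]) for v in weights)


print(pf_defect(EDGES, WEIGHTS, beta))
```

The printed defect vanishes up to rounding precisely because $\beta^4 - 5\beta^2 + 2 = 0$; for instance, at the vertex E the equation reads $2(2\beta^2 - 1) + 2\beta^2 = \beta^4 + \beta^2$.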

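[Figure 8: the four graphs of the connection α. The horizontal graphs are G0 (between V0 and V1) and G2 (between V3 and V2); the vertical graphs are G3 (between V0 and V3) and G1 (between V1 and V2). Vertex sets: V0 = {∗, b, d, f, d˜, h, b˜, h˜, ∗˜}, V1 = {a, c, e, c˜, g, a˜}, V2 = {1, 2, 3, 4, 5, 6}, V3 = {A, C, E, C˜, G, A˜}.]

Fig. 8. Four graphs of the connection α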

First we check condition 3), namely we prove $\sigma(\alpha\tilde{\alpha} - 1)\sigma\alpha \cong (\alpha\tilde{\alpha} - 1)\sigma\alpha$ up to vertical gauge choice. Now we compute the connection $\alpha\tilde{\alpha}$. The four graphs on which the connection $\alpha\tilde{\alpha}$ exists are as in Fig. 9.

Table 7. Connections α (upper) and α˜ (lower) ∗A

a1 1

a2 1

c2

bA

1

−1 β10

ββ2 β10

ββ2 β10

1 β10

1

1

−1 γ0

2ββ1 γ0

2ββ1 γ0

bC dC dE fE

c3

e3

e4

e5

1 γ0

1

1

1

−1

1 β20

β−1 β20

β−1 β20

−1 β20

1

1

1

fG

c5 ˜

hG ˜ C˜ h ˜ A˜ h ˜ dE d˜C˜ ˜ bG ∗˜ A˜

g5

g6

a6 ˜

1 1 1

1

−β2 γ

β−1 γ

β−1 γ

β2 γ

1

1 1

1a

2a

A∗

1 β

β1 β

Ab

β1 β

−1 β

1

1

1 ββ20

Cb

2c

3c

β0 √ 1 2β20

Cd


3e

4e

5e

β2 0 β−1

ββ1 0 β−1

β √ 1 2β−1 √ − 2 β−1

γ 0 ββ−1

β √ 1 2β−1

5c˜

5g

6g

6a˜

β0 √ 1 2β20

Ed

−1 ββ20

1

1

1 0 ββ−1

Ef

β1 0 β−1

1

−β20 0 β−1

1

C˜ d˜ ˜ C˜ h

1

1

Gf

1

0

E d˜

1

Gh Gb˜

−1 β1

β2 β1

β2 β1

1 β1

1

1

˜ A˜ h ˜∗ A˜

1 1

where $\beta_n = \sqrt{\beta^2 - n}$, $\beta'_n = \beta^2 - n$, $\gamma = \sqrt{2\beta^2 - 1}$, $\gamma' = 2\beta^2 - 1$.
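The abbreviations above can be sanity-checked on the $2 \times 2$ blocks of Table 7, which must be unitary by biunitarity. The following Python sketch assumes that each such block has the symmetric form $\begin{pmatrix} -p & q\\ q & p \end{pmatrix}$ with the pairs $(p, q)$ listed below; this is our reading of the flattened table, not a statement taken from it, and the names `b`, `bp`, `PAIRS` and `unitarity_defect` are ours.

```python
import math

beta2 = (5 + math.sqrt(17)) / 2         # beta^2
beta = math.sqrt(beta2)


def b(n):
    """beta_n = sqrt(beta^2 - n)."""
    return math.sqrt(beta2 - n)


def bp(n):
    """beta'_n = beta^2 - n."""
    return beta2 - n


gamma = math.sqrt(2 * beta2 - 1)
gamma_p = 2 * beta2 - 1

# (p, q) for the blocks [[-p, q], [q, p]]; the overall sign of a block
# does not affect unitarity.
PAIRS = [
    (1 / bp(1), beta * b(2) / bp(1)),        # entries 1/beta'_1, beta*beta_2/beta'_1
    (1 / gamma_p, 2 * beta * b(1) / gamma_p),  # entries 1/gamma', 2*beta*beta_1/gamma'
    (1 / bp(2), b(-1) / bp(2)),              # entries 1/beta'_2, beta_{-1}/beta'_2
    (b(2) / gamma, b(-1) / gamma),           # entries beta_2/gamma, beta_{-1}/gamma
]


def unitarity_defect(p, q):
    """Largest entry of M M^T - I for the real matrix M = [[-p, q], [q, p]]."""
    m = [[-p, q], [q, p]]
    worst = 0.0
    for i in range(2):
        for j in range(2):
            entry = sum(m[i][k] * m[j][k] for k in range(2))
            worst = max(worst, abs(entry - (1.0 if i == j else 0.0)))
    return worst


defects = [unitarity_defect(p, q) for p, q in PAIRS]
```

The first two blocks are orthogonal for every value of $\beta$, the fourth because $(\beta^2 - 2) + (\beta^2 + 1) = 2\beta^2 - 1$, and the third only because $\beta^4 - 5\beta^2 + 2 = 0$, so the check is sensitive to which of the four abbreviations carries a square root.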

[Figure 9: the four graphs of the connection αα˜. The vertical graphs are G3G3t on V0 = {∗, b, d, f, d˜, h, b˜, h˜, ∗˜} and G1G1t on V1 = {a, c, e, c˜, g, a˜}.]

Fig. 9. Four graphs of the connection αα˜


Table 8. Connection αα˜ − 1 (left part of diagram) ∗b

aa 1

ca

b∗

1 β1

β2 β1

2 bb − β (?) β 1

1 β1

bd db dd

ac 1

cc

√1 2β

−γ √ (?) 2β −γ √ 2β

√1 2β

1 −

√

ec

1 β2

β 4 −1 β2

ee1 ee2

ce ˜

β 4 −1 β2

(?)

1 β2

1 β2 β 2 β−1 β 2 −2 β 2 −1 2β1 β 2 +1

dd˜ 1

ff fh f d˜

l1

l2

m1 m2 n1

n2

p1

p2

q1

q2

r1

r2

√

3β 2 −1 2β−1 −1 2 −1 β−1

f b˜ hf hb˜

1

˜h ˜ h ˜ d˜ h ˜∗ h˜ ˜ dd ˜ df ˜ d˜h d˜d˜ ˜ bf ˜ bh ˜ ∗˜ h

ge

1 √

df

fd

ce

1 1

s1

s2

β2 β−1

t1

t2

1 β 2 −1

u1

u2

β2 β−1

1

The broken edges correspond to the trivial connection 1. We will now compute the connection $\alpha\tilde{\alpha} - 1$, which is determined only up to vertical gauges. As in the $(5 + \sqrt{13})/2$ case, we assume that the $1 \times 1$ gauge transform unitaries corresponding to single vertical edges which connect different vertices in the graph $G_1 G_1^{t} \cup G_3 G_3^{t}$ are 1. Then we easily find 38 entries of $\alpha\tilde{\alpha} - 1$ by "actual" multiplication of the connections $\alpha$ and $\tilde{\alpha}$. Next, as in the $(5 + \sqrt{13})/2$ case, we can compute all the entries of $\alpha\tilde{\alpha} - 1$ which involve the double edges in the graph of $\alpha\tilde{\alpha}$ by a simple gauge transform; then we have 14 entries listed in Tables 8 and 9 other than the entries marked "(?)". The four entries marked (?) in Tables 8 and 9 can easily be computed by the unitarity of the $2 \times 2$ matrices which they are part of, and the $(\tilde{h}\tilde{h}, gg)$-entry in Table 9 can be put equal to 1, because a gauge choice corresponding to the $\tilde{h}\tilde{h}$-edge in the vertical left graph will only be concerned with the $(\tilde{h}\tilde{h}, gg)$-entry. The only entries left to compute are the 18 entries $l_1, l_2, m_1, m_2, \ldots, u_1, u_2$ in Table 8. They can also be obtained as in the $(5 + \sqrt{13})/2$ case, but here we will make a shortcut: Since the entries of the connection $\alpha\tilde{\alpha}$ obtained by "actual" multiplication are all real


Table 9. Connection αα˜ − 1 (right part of diagram) ∗b b∗ bb bd db dd df dd˜ fd ff

ec˜

g c˜

eg

hf hb˜

ag ˜ g a˜

1

1 β 2 −2 β−1 β 2 −2

β √−1 2β

−β √ 1 2β β √−1 2β

β−1 β 2 −2 −1 β 2 −2

(?)

β √1 2β

1 1

1

˜h ˜ h ˜ d˜ h ˜∗ h˜

1 1 (?)

1 1

˜ dd ˜ df ˜ d˜h d˜d˜

gg

1

fh f d˜ f b˜

cg ˜

−

q

q

β 2 +1 2β 2 −1

β 2 +1 2β 2 −1

q −

β 2 +1 2 −1

q2β

β 2 +1 2β 2 −1

1

˜ bf ˜ bh ˜ ∗˜ h

−1 β1 β2 β1

β2 β1 1 β1

1

scalars, all the gauge choices involved in decomposing the connection αα˜ into (αα−1)+1 ˜ can be chosen to be matrices with real entries. Hence, l1 , l2 , m1 , m2 , . . . , u1 , u2 become real numbers. We still have a possibility of making a gauge choice of the double edges e-e with an orthogonal matrix, i.e., we can make the following change: (l1 , l2 ) → (l1 , l2 )v, (m1 , m2 ) → (m1 , m2 )v, . . . , (u1 , u2 ) → (u1 , u2 )v for some v ∈ O(2) (common orthogonal matrix for all the vectors in R2 ). Then, we can assume l2 = 0 and m2 ≥ 0, thus, we obtain d = q   q 2 e   q 2 β −2 β 4 +4 1 0 β −2 1 2 β 2 +1 β 2 (β 2 +1) 2  √ β q β 2 +1 l1 l2    β 2 −2  β β 2 −2 − β 4 −4 2β 4 √     2 2 4 = m1 m2   β −1 (β −1)(β +4) 2 4   √ β 2 −1 β(β −1) β +4 √ √ q   2 2 β −1 2 (β 2 −2) −4 β 2 (2β 2 −1) 2 β 2 −1 β n1 n2 − (β 2 +1)(2β 2 −1)(β 4 +4) 2 √ 2 β 2 +1 β 2 +1 4 (β −1)

by the orthogonality of the matrix.

(β −1)(β +4)


Now, all the gauge choices have been used up. We know that there is an orthogonal matrix V ∈ O(3) such that d f

αα˜

d d˜ f d

where

e e

αα˜

αα˜

e e e e

f

= (0, m1 , m2 )V,

d˜ d˜

= (0, n1 , n2 )V,

d d˜

= (0, p1 , p2 )V,

f

αα˜

αα˜

αα˜

e e

e e e e

= (0, r1 , r2 )V, = (0, s1 , s2 )V, = (0, s1 , s2 )V,

αα˜

denotes the 1×3 matrices obtained by “actual” multiplication of α and α. ˜ It is clear from the definition of the renormalization of the connection, that (αα˜ − 1)∼ = αα˜ − 1 without any gauge transformation. Together with the fact that all the entries of the connection αα˜ by “actual” multiplication are real numbers, we have f d d˜ d d˜ f

αα˜

αα˜

αα˜

s

e =

e

s

e e

= s

e e

=

e µ(d) d αα˜ , µ(f ) f e e µ(d) d αα˜ , ˜ ˜ e µ(d) d e µ(f ) f αα˜ , ˜ ˜ e µ(d) d

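hence,

$$(p_1, p_2) = \sqrt{\frac{2\beta^2-1}{2\beta^2}}\,(m_1, m_2) = \sqrt{\frac{2\beta^2-1}{2\beta^2}} \left( -\frac{\sqrt{\beta^4-4}}{\beta(\beta^2-1)\sqrt{\beta^4+4}},\ \sqrt{\frac{2\beta^4}{(\beta^2-1)(\beta^4+4)}} \right),$$

$$(s_1, s_2) = (n_1, n_2) = \left( -\sqrt{\frac{\beta^2(\beta^2-2)}{(\beta^2+1)(2\beta^2-1)(\beta^4+4)}},\ \frac{-4\sqrt{\beta^2(2\beta^2-1)}}{(\beta^2-1)\sqrt{(\beta^2-1)(\beta^4+4)}} \right),$$

and

$$(t_1, t_2) = \sqrt{\frac{2\beta^2}{2\beta^2-1}}\,(r_1, r_2). \tag{5.1}$$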


We next determine $(r_1, r_2)$ and $(t_1, t_2)$. In the text, to save space, we write e.g. $M(b/5)$ for the connection matrix labeled by the vertices $b$ and 5. By orthogonality of the first and the last row in the $3 \times 3$ matrix $M(f/e)$ in Tables 8–9, we have

$$p_1 r_1 + p_2 r_2 = \frac{\sqrt{3\beta^2-1}}{2(\beta^2+1)}, \tag{5.2}$$

and by orthogonality of the first two rows in the $3 \times 3$ matrix $M(\tilde d/e)$ in Table 9, we have

$$s_1 t_1 + s_2 t_2 = \frac{-1}{\beta^2-1}\sqrt{\frac{\beta^2-2}{\beta^2+1}},$$

too, and together with (5.1) we have

$$s_1 r_1 + s_2 r_2 = \frac{-1}{\beta^2-1}\sqrt{\frac{(\beta^2-2)(2\beta^2-1)}{(\beta^2+1)\,2\beta^2}}. \tag{5.3}$$

Solving (5.2) and (5.3) with respect to $(r_1, r_2)$ using the known values of $p_1$, $p_2$, $s_1$, and $s_2$ gives

$$(r_1, r_2) = \left( \frac{-\beta^3}{\sqrt{(\beta^2+1)(\beta^4+4)}},\ \sqrt{\frac{4\beta^2}{(\beta^2+1)(\beta^4+4)}} \right),$$

and therefore,

$$(t_1, t_2) = \sqrt{\frac{2\beta^2}{2\beta^2-1}} \left( \frac{-\beta^3}{\sqrt{(\beta^2+1)(\beta^4+4)}},\ \sqrt{\frac{4\beta^2}{(\beta^2+1)(\beta^4+4)}} \right).$$

The four remaining entries $q_1, q_2, u_1, u_2$ can now be computed using the orthogonality of the $3 \times 3$ matrices $M(f/e)$ and $M(\tilde d/e)$. We have

$$(q_1, q_2) = \left( \sqrt{\frac{2\beta^4}{(\beta^2+1)(\beta^4+4)}},\ \sqrt{\frac{2(\beta^2+1)}{\beta^4+4}} \right),$$

$$(u_1, u_2) = \left( \sqrt{\frac{\beta^2(\beta^2+2)}{(\beta^2+1)(\beta^4+4)}},\ \frac{2\sqrt{2}}{\sqrt{\beta^4+4}} \right).$$
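As a sanity check, the relations (5.2) and (5.3) and the orthogonality used for $q$ and $u$ can be tested numerically. The sketch below assumes the values of $(p_1,p_2)$, $(s_1,s_2)$, $(r_1,r_2)$, $(q_1,q_2)$, $(u_1,u_2)$ as read from this section (with $\beta^2 = (5+\sqrt{17})/2$) and assumes that the rows of $M(f/e)$ and $M(\tilde d/e)$ end in $(p, q, r)$ and $(s, t, u)$ respectively; the variable and function names are ours.

```python
import math

A = (5 + math.sqrt(17)) / 2      # beta^2, a root of x^2 - 5x + 2 = 0
b = math.sqrt(A)                 # beta
B4 = A * A                       # beta^4
D = (A + 1) * (B4 + 4)

scale = math.sqrt((2 * A - 1) / (2 * A))
m = (-math.sqrt(B4 - 4) / (b * (A - 1) * math.sqrt(B4 + 4)),
     math.sqrt(2 * B4 / ((A - 1) * (B4 + 4))))
p = (scale * m[0], scale * m[1])
s = (-math.sqrt(A * (A - 2) / ((A + 1) * (2 * A - 1) * (B4 + 4))),
     -4 * math.sqrt(A * (2 * A - 1)) / ((A - 1) * math.sqrt((A - 1) * (B4 + 4))))
r = (-b ** 3 / math.sqrt(D), math.sqrt(4 * A / D))
t = (r[0] / scale, r[1] / scale)                     # relation (5.1)
q = (math.sqrt(2 * B4 / D), math.sqrt(2 * (A + 1) / (B4 + 4)))
u = (math.sqrt(A * (A + 2) / D), 2 * math.sqrt(2) / math.sqrt(B4 + 4))

def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]

# Left-hand side minus right-hand side of (5.2) and (5.3).
residual_52 = dot(p, r) - math.sqrt(3 * A - 1) / (2 * (A + 1))
residual_53 = dot(s, r) + math.sqrt((A - 2) * (2 * A - 1) / ((A + 1) * 2 * A)) / (A - 1)

def column_gram(rows):
    """Gram data (|c1|^2, c1.c2, |c2|^2) of the two columns formed by the rows.
    For the last two columns of a 3x3 orthogonal matrix this is (1, 0, 1)."""
    c1 = [x[0] for x in rows]
    c2 = [x[1] for x in rows]
    return (sum(a * a for a in c1),
            sum(a * c for a, c in zip(c1, c2)),
            sum(c * c for c in c2))

gram_f = column_gram([p, q, r])
gram_d = column_gram([s, t, u])
```

Both residuals vanish up to rounding, and both Gram triples come out as $(1, 0, 1)$, which is what orthogonality of the two $3 \times 3$ matrices requires of their last two columns.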

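Now, we have obtained all the entries of $\alpha\tilde\alpha - 1$. We obtain $(\alpha\tilde\alpha - 1)\sigma$ simply by exchanging the vertices at the bottom of the connection $\alpha\tilde\alpha - 1$: the cell of $(\alpha\tilde\alpha - 1)\sigma$ with upper vertices $p$, $q$ and lower vertices $r$, $s$ is defined to be the cell of $\alpha\tilde\alpha - 1$ with upper vertices $p$, $q$ and lower vertices $\tilde r$, $\tilde s$. Together with the information of $\alpha$, we obtain all the entries of the connection $(\alpha\tilde\alpha - 1)\sigma\alpha$. Their landscape is shown in Table 10. The four graphs of this connection are as in Fig. 10.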


Table 10. Landscape of $(\alpha\tilde\alpha - 1)\sigma\alpha$

[Table body: a grid of bullets. The rows carry the labels ∗G, bG, bÃ, bE, bC̃, dC, dE, dG, dC̃, fA, fC, fE, fC̃, fG, fÃ, hA, hC, hE, hG, h̃A, h̃C, h̃E, h̃G, d̃C, d̃E, d̃G, d̃C̃, b̃E, b̃C̃, b̃G, b̃Ã, ∗̃G; the columns are grouped under 1, 2, 3, 4, 5, 6 with sub-labels a, ã, c, c̃, e, g.]

[Figure: the four graphs $G_0$, $G_2$, $H_0$, $H_1$ on the vertex sets $V_0$, $V_1$, $V_2$, $V_3$]

Fig. 10. Four graphs of the connection $(\alpha\tilde\alpha - 1)\sigma\alpha$

Since the exact values of the connection take up too much room to be listed in a table, we will show them in the shape of unitary matrices. Table 10 also gives an overview of the connection $\sigma(\alpha\tilde\alpha - 1)\sigma\alpha$, because it is easy to check that $\sigma(\alpha\tilde\alpha - 1)\sigma\alpha$ has exactly the same vertical edges as $(\alpha\tilde\alpha - 1)\sigma\alpha$. Below we list all the entries of $(\alpha\tilde\alpha - 1)\sigma\alpha$. These entries can be obtained by direct multiplication of the connections $(\alpha\tilde\alpha - 1)\sigma$ and $\alpha$ as explained in Sect. 3. In the list we have labeled the rows and columns of the unitary matrices according to the entries that have to be used in the direct multiplication. For instance, in one of the $2 \times 2$ matrices below, the entry with row label $Gb$ and column label $a\tilde a$ is computed as follows:

b

aa˜

Gb

6

˜

b

=

=

(αα˜ − 1)σ

b˜ b b

αα˜ − 1

a

a

·

b˜

a˜ G b˜

a˜

α

α · = a G 6

a˜ 6 p β2 − 2 ) · 1, (− p β2 − 1

6


where the last equality is obtained from Tables 1 and 9. Sometimes the entries listed below appear at first glance to be different from the entries obtained by direct multiplication. However, in all those cases, it is just a different representation of the same algebraic number. This can easily be checked using the following identities for $\beta = \sqrt{(5+\sqrt{17})/2}$:

\begin{align*}
\beta^2 + 1 &= \frac{2(\beta^2-1)^2}{\beta^2}, & \beta^2 - 2 &= \frac{2\beta^2}{\beta^2-1}, & 5 - \beta^2 &= \frac{2}{\beta^2},\\
\beta^2 + 2 &= \frac{4\beta^4}{(\beta^2-1)^2}, & \beta^2 - 3 &= \frac{2(\beta^2-1)}{\beta^2}, & 2\beta^2 - 1 &= \frac{\beta^2(\beta^2-1)}{2},\\
\beta^2 + 3 &= \frac{(\beta^2-1)^2}{2\beta^4}\,(\beta^4+4), & \beta^2 - 4 &= \frac{2}{\beta^2-1}, & 3\beta^2 - 1 &= (\beta^2-1)^2,\\
& & & & 3\beta^2 - 4 &= \frac{\beta^2-1}{2\beta^2}\,(\beta^4+4).
\end{align*}
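Each of these identities follows from $\beta^4 - 5\beta^2 + 2 = 0$, and they are easy to confirm numerically. The following sketch does so in floating point; the dictionary keys are just labels chosen here for readability.

```python
import math

A = (5 + math.sqrt(17)) / 2   # beta^2
B4 = A * A                    # beta^4

# label -> (left-hand side, right-hand side)
identities = {
    "beta^2 + 1":  (A + 1, 2 * (A - 1) ** 2 / A),
    "beta^2 + 2":  (A + 2, 4 * B4 / (A - 1) ** 2),
    "beta^2 + 3":  (A + 3, (A - 1) ** 2 * (B4 + 4) / (2 * B4)),
    "beta^2 - 2":  (A - 2, 2 * A / (A - 1)),
    "beta^2 - 3":  (A - 3, 2 * (A - 1) / A),
    "beta^2 - 4":  (A - 4, 2 / (A - 1)),
    "5 - beta^2":  (5 - A, 2 / A),
    "2beta^2 - 1": (2 * A - 1, A * (A - 1) / 2),
    "3beta^2 - 1": (3 * A - 1, (A - 1) ** 2),
    "3beta^2 - 4": (3 * A - 4, (A - 1) * (B4 + 4) / (2 * A)),
}

# Largest discrepancy between the two sides over all identities.
max_residual = max(abs(lhs - rhs) for lhs, rhs in identities.values())
```

A floating-point check of this kind does not replace the algebraic verification, but it catches transcription errors quickly.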

Here is the list of entries of (αα˜ − 1)σα: aa˜

∗

=

Gb ˜

6

∗

a

b˜

a˜

∗˜

a˜ g

G˜ h 6 ∗ ac˜ Gb ∗˜

5 a˜ g

Gh

5

˜

˜

A˜ ∗˜

b

=

G

6

1,

=

1,

=

1,

aa˜

β 2 −1



a˜ a˜

β −1



  E    ˜ d C˜ d˜

ac˜ √1 √ 2β β 2 +1 √ √ 2β β 2 −2 √ 2β

−

β 2 −1

c˜g˜ √  β 2 −2 √ β 2 −1  , −√ 1 β 2 −1

c √ 2 c˜ 2

√1 2β √1 2β

2β −1 √ 2β

q q

1,

c √ a˜  β 2 −2 √ β 2 −1  , √1

√1 2 √β −1 β 2 −2 −√

6

A

=

=

=

√1  √β 2 −1  β 2 −2 Gf √ 2

˜

˜ h˜

˜

=

a˜

Gb

Gb

5

 

b˜

6

b˜

b



·

β 2 +1 2β 2 −1 β 2 −2 2β 2 −1

ce 0



 q  β 2 −2  − 2β 2 −1 , q  2 β +1 2β 2 −1

√ 5+ 17 2 :


a˜ g q 2

β +1 2β 2

Ef

b˜

  Gf  − √1 2  2β  ˜ 1 h ˜ √ C 2

= 5

β −1

d

= 6

d˜

=

cg √

Gf b˜

− β −1 ˜ C˜ d  2β 2 −1  √ 2 ˜ +1) E d  − 2(β β3   Ef  0  =  5 Ed  0    Gf  0   ˜ 1 Gb β2

G

2 −(β √ −2) β 2 (β 2 −1)3

1 β 2 −1

√ β (β 2 −1)3 q 2

2 β2

0

β 4 +4

β 2 (2β 2 −1)

√

−

2(β 4 +4) (β 2 −1)2

√ −(β −2) 2(β 2 −1)(β 4 +4) √ 2 2 √− β (β −2) 2

(β 2 −1)3 (β 4 +4)

−2 β 2 (β 2 −1)(β 4 +4)

0

e3e

ec˜

eg

0

1 2β 2 −1

0

0 √ 2 √ β −1

β2

√

eg ca˜ 1 0 , 0 1

1 2β 2 −1

e2e

q

ce

4

d

c˜g c˜e q q  β 2 +1 β 2 −1 − 2β 2 (β 2 −2) 2β 2 (β 2 −2)  q  β 4 −1 √ 1  2β 2 (β 2 −2) , 2β 2 (β 2 −2) q  β 2 −2 0 2 β −1

eg c˜g q  q 2  β −2 β 2 +1 Gf − 2β 2 −1 2  q q 2β −1 , β 2 +1 β 2 −2 h G 2β 2 −1 2β 2 −1

6




1 β2

q

2β 2 −1

0

2(β 4 +4)

q

2 −2(β √ −2) 2(β 4 +4)

0

β 2 (β 2 +1) (β 2 −2)(β 4 +4)

0

0

β 2 +1

√

β 4 −1 β2



   0    √ β 2 +1   β 2 −2  ,  0     −1  β 2 −2    0


c˜g



d˜

c˜e

˜ 0 C˜ d   ˜ Ed  0    2 √ 2 (β +1) β −1 √ Ef   2β 3  =  5 Ed  0   q  β 4 −1 Gf   − 2β 6  q β 2 −2 Gh 2β 2 −1

−1 2β 2 −1 β 2 −2 2β 2 −1

β 2 −2 β2

√

−(β 2 −2) (β 2 +1)(2β 2 −1) 1 2β 2

q 1 β2

q

β 2 −1 β 2 −2

0

e2e q

β 2 (β 2 −2)

q

β 2 −2 β 2 +1

β 4 +4

β2 (β 4 +4)(β 2 +1)

e3e ec˜ q 2 4(β +1) β 2 −2 − (β 2 +3)(2β 2 −1) β2 q q 4(β 2 −2) (β 2 +3)(2β 2 −1)

√ − β 2 −1 √

q

β 2 (β 2 +2) (β 2 +1)(β 4 +4)

= 2

−

0

β 2 +3

0 √

β 2 +1 β2

0

0

q 2 2 − β β(β4 +4−1)

√ 4 2(β 4 +4) √ 2 √2 β −1

0

1 β2

0

0

0

Ab

f

d

2

β2

0

β 2 +1 2β 2 −1

√ β 2 −1 √

β 2 +3

eg

 q

β 2 (β 4 +4)

ec

q

ga

β 2 +1 2(β 2 −1)

q  β 2 −1 C   2β 2 (β 2 −2)  √ 2 β +1 Cb 2β 2 d

         ,         

gc

 − √ 12 q 2(β −1)   β 4 −1 , 2β 2 (β 2 −2)   − 2β1 2

− β 21−1 √

β 2 +1 2β 2 −1



0 β 2 (β 2 −2) β 2 −1

= d˜

3 

q

e2e

e3e

ec

β 4 +4

E  0 β 2 (β 2 +1)  q q  β 4 −4 −1 2 Ef  β 2 (β 2 −1)(β 4 +4) β 2 −1 β 2 (β 4 +4)  √  2 2 β −2 2(β −2) Ed   − (2β 2 −1)(β 2 −1)√2(β 4 +4) − (2β 2 −1)√β 4 +4   2 2 2 −2) Cd − √ β 2(β −2) − √ 4(β 3 4 4 2 (2β −1) (β +4)

(β +4)(2β −1)

0 0 √

4 β 2 (β 2 −1)

− 2β 21−1

1 β2

ce q

β 2 −2 β 2 +1



     , β2  √ (2β 2 −1)3 (β 2 +1)   q  6 2 β 2 −2 β 2 −1

β (β −2) (2β 2 −1)3


d˜

= 3 e2e e3e ec c˜e  q  q β 2 (β 2 −2) 2(β 2 −2) β 2 −2 d˜ √ E  − (β 2 +1)(2β 2 −1)(β 4 +4) − 0 β 2 +1  2(β 4 +4)  q  q q q   3 2 2 2 2 2β 2β 4β 5−β β f   √ E  − 2β 2 −1 0 2β 2 −1 (β 2 +1)(β 4 +4) β 2 +1  (β 2 +1)(β 4 +4)  , q q q   2 2 2 β (β +2) β −2 1 1 4 1 2 d   √ E  2β 2 −1 (β 2 +1)(β 4 +4) 2 2β 2 −1 2β 2 −1 β 2 +1 2β 2 −1  4 +4) 2(β   q q  q  2 β 2 (β 2 +2) −2) β(β 2 2 1 4 d C 2 2β 2 −1 (β 2 +1)(β 4 +4) 2 2β 2 −1 √ 4 − 2β 2 −1 2β 2 −1 2(β +4)

h

= 3

=

g  √c − β 2 +1 C d  β4   −1 Cb   β 2 −2  ˜ = Ed  0  3  f  E  0  √  2 β 4 −1 d E β3

gc

,

q

√ √−2 2 2β 2 −1

0 2β 2 −1 2β 2 (β 2 −3)

−1 √ (2β 2 −1) β 2 +1

− 2β 21−1 √ ec − β14 √

β 2 +1 β 2 −2

0

2(β 2 +1) β 4 +4 4β 2 (β 2 +1)(β 4 +4)

β3 (β 2 +1)(β 4 +4)

2β 4 (β 2 +1)(β 4 +4)

− 21

q

√

0 q 2 −1)(β 4 −4) 1 − (2β 2(β 4 +4) β 2 (β 2 −1) q

β 2 (2β 2 −1) (β 2 −1)(β 4 +4)

1 2β 2 −1



e2e

√ √−2 2 (β 2 −1)3

0

q

ge √ 2 √ 2 2β 2 −1 1 2β 2 −1

ge

e3e q √ 4β 2 √2 2 (β 2 +1)(β 4 +4) 2β 2 −1 q

gc ge 1 0 , 0 1

C − 2β 21−1  √ E d √2 22 2β −1

3

f



d

h˜

Cb Ef

0 √ 2

β 2 −1 β3

        ,      

β3 (β 2 +1)(β 4 +4)


E

f = 4

d˜

q

ge

e2e

2β 2 −1 2β 2 (β 2 −3)

  1 E  2   Ed − √ 12

q 2β 4 − (β 2 +1)(β 4 +4)

f

−√

β +1

h

e3e β 2 (2β 2 −1) (β 2 −1)(β 4 +4)

q 2 +1) − 2(β β 4 +4 q

β3 (β 2 +1)(β 4 +4)

4β 2 (β 2 +1)(β 4 +4)

   ,  

gg ge q  q 2  β 4 −1 Gf − 2ββ2 (β−1 2 −2) 2 2  q q 2β (β −2) , β 4 −1 β 2 −1 f E 2β 2 (β 2 −2) 2β 2 (β 2 −2)

= 5 h˜

Gh Ed

= 5

gg ge 1 0 , 0 1

gg



ge

√(β

+1)

−1 2

β 2 −1

ce ˜

= 3

q

β 4 −1 2β 2 (β 2 −2)

0

1,

√β

β 4 +4

β 2 (β 2 −1)

0

β 2 −2 2β 2 −1

2(β 4 +4)

3

β 2 −2

q

√

√ −β (β 2 +1)(β 4 +4) √ − (β 2 +1)(β 4 −4) √

√1

β 2 (β 4 +4)(β 2 −2)

Ed

2

β 2 +1

0

b

(β 2 −2)(β 4 −4)

β 2 (β 2 −1)

√−1

2(β 2 −1) β 4 +4

0 √

e2e

−1 2(β 2 −2)

e3e ec˜ q q 2 (β 2 −2) β 2 +1 − (ββ2 −1)(β 4 +4) 2β 2 −1 q (β 2 +1)(β 2 −1) 0 β 2 (β 2 −2)(β 4 +4) q 4β 2 0 (β 2 +1)(β 4 +4) q q

2

√

2 −(β √ −2) 8β 2

Ed  0   2 q 2  −1 E f  β2β+12 ββ 2 −2   Ed  0   =  5 ˜  C˜ d  0   q ˜  β 2 −1 C˜ h  2β 2  q  β 4 −1 −1 Gf 2β 2 β 2 −2 ˜

f

q

2 −(β √−2) 2 (β −1) 2(β 4 +4)

2(β 4 +4)

0 β 2 (β 2 −1) (β 4 +4)(β 2 −2)

eg



   −(β −1)  2β 2 β 2 −2     0  ,  0   q  β 2 +1   2β 2   β 2 −1 √ 1 2β 2 2

0 q

β 2 +1

β 2 −2


b˜

c˜e

Ef b

3 c˜e

Ed

4

˜

E

d

= 4

d˜



1 β2

b˜

c˜e

Ef

4

c qe

E

d˜

= 4

β 2 −2

 qβ +1 2  E f  − 5−β  q β 2 +1  β 2 −2 Ed β 2 +1 2

=

1,

−1,

1 β 2 −1

e2e

e3e

β 4 +4 β 2 (β 2 +1)

q

β 4 −4 β 2 (β 4 +4)

q β 2 (β 2 −2) − (β 2 +1)(2β 2 −1)(β 4 +4)

(β +1)(2β −1)

c˜e

1,

q

β 2 −2 β 2 +1

 √  β 2 +2  E  − β 2 −1  2 Ed √ 2 β 2

 q

=

=

f

d˜


h

gg

Gf h

6 gg

f

=

1,

=

1,

=

∗,

1 Gf ˜ h˜

eg gg

Ab C

h˜

b



− 1  √ β 2 −1

β 2 (β 2 −2) β 2 −1

= 2

ga

A∗ Cd

.

= A 6 (We do not use the values of these entries.)

=

− β 24−1

2 (β 2 −1)(β 4 +4)

q

2β 2 −1 2(3β 2 −4)

2(β 4 +4)

ab

2

q

   ,  

e2e e q q3e  2 2 β (β −2) 2β 2 −1 − (β 2 +1)(2β − β 24−1 2(3β 2 −1)(β 4 +4) 2 −4)  q q q  2β 2 √ 2β 2 4β 2 β3 − 2β 2 −1 2β 2 −1 (β 2 +1)(β 4 +4)  (β 2 +1)(β 4 +4) , q  2 2 β (β +2) √ 4 (β 2 +1)(β 4 +4)

Gh 6 f ga

h

−β

2

0

√

gc β 2 (β 2 −2) β 2 −1 1 β 2 −1

ga gc 1 0 , 0 1

 ,


h

ga

Ab

1

h˜

ga

Ab

1

h

ge

Ef

4

=

1,

=

1,

−1,

=

h˜

ge

Ed

4

d

e

C

2

d˜

ec

Cd

2

=

1,

=

1,

=

1.

Here we display three matrices of the connection $((\alpha\tilde\alpha - 1)\sigma\alpha)^{\sim} = \tilde\alpha\sigma(\alpha\tilde\alpha - 1)$ for ease of the later procedures. These matrices are computed from the entries of $(\alpha\tilde\alpha - 1)\sigma\alpha$ and the Perron–Frobenius weights of the horizontal graphs by the renormalization rule as in Sect. 4:
5e2 q



(2β 2 −1)(β 4 −4)

G

df  − β 2 (β12 −1) 2(β 4 +4)  ˜  0 db   3  ˜ = df  −√ 2 β 4 (β +1)(β +4) e   d˜h  q 0  f 2β 2 f1 (β 2 +1)(β 4 +4) 5c˜ 0 −

5g √

2β 2 −1 2(β 2 −1)

1

0

0

√1 β 4 −1 q

β 2 −2 β 2 −1

0 0

2

√1

β 2 −1

q

5e3 β 2 (2β 2 −1)

(β 2 −1)(β 4 +4)

q

0 4β 2 (β 2 +1)(β 4 +4)

q 0

2(β 2 +1) β 4 +4

q

6g



2β 2 −1 β 4 −1

    q0 2   − ββ 4 −2 −1 ,    √1 2 β −1   √−1 β 2 +1

(5.4)


f

A g

=

b

q

  h    ˜h∗ b

∗a β 2 −2 β2 1 β 1 β


2 2c q a  β 2 −2 − (β 2 −1)β 2 − √ 12 q β −1   β 2 −2  1 √ − 2 2 β 2 −1 , β (β −1)  q β 2 −1 0 2 β

C e 

=

3e3 3e2 q 2 2 2 β (β −2) −2(β √ −2) dd − (β 2 +1)(2β 2 −1)(β 4 +4)  2(β 4 +4) q q  q 2 2β 2β 2 4β 2 β3 √ fd  −  2β 2 −1 2β 2 −1 (β 2 +1)(β 4 +4) (β 2 +1)(β 4 +4)  q d β 2 (β 2 +2) √ 4 d˜1  (β 2 +1)(β 4 +4)  2(β 4 +4)  b f 0 0

2β

3c

2c

√−1

√β

β 2 −1

−1 2β 3

√−1

β 2 −1 2 β −1 √ 2(β 2 −2)

2β

2

β 2 −2

√ 1 2(β 2 −2)

√β

β 2 −2 1 β(β 2 −2)

2

     .    

Now we will prove that $(\alpha\tilde\alpha - 1)\sigma\alpha$ and $\sigma(\alpha\tilde\alpha - 1)\sigma\alpha$ are equivalent up to vertical gauge choice. What we have to do is to construct a gauge transformation matrix for each vertical edge. We write $u\binom{p}{q}_{m,l}$ for the $m \times m$ unitary gauge transformation matrix coming from the edges $p$-$q$ of multiplicity $m$ in the left vertical graph $H_0$, and $u\binom{r}{s}_{n,r}$ for the $n \times n$ unitary gauge transformation matrix coming from the edges $r$-$s$ of multiplicity $n$ in the right vertical graph $H_1$. Let

$$\begin{pmatrix} x & z \\ y & w \end{pmatrix}_{\xi,\eta}$$

be the $n \times m$ matrix of the connection $(\alpha\tilde\alpha - 1)\sigma\alpha$, where $n$ and $m$ are the multiplicities of the edges $x$-$y$ and $z$-$w$ respectively, and let

$$\begin{pmatrix} x & z \\ y & w \end{pmatrix}^{\sim}_{\xi,\eta}$$

be the corresponding $n \times m$ matrix of the connection $\sigma(\alpha\tilde\alpha - 1)\sigma\alpha$. Then the gauge matrices which we are going to construct should satisfy the equality

$$\begin{pmatrix} x & z \\ y & w \end{pmatrix}^{\sim}_{\xi,\eta} = u\binom{x}{y}_{n,l} \begin{pmatrix} x & z \\ y & w \end{pmatrix}_{\xi,\eta} u\binom{z}{w}_{m,r}$$

for all pairs of vertical edges $(xy, zw)$. Notice that multiplying the connection by $\sigma$ from the left simply changes the upper vertices of the connection as $x \leftrightarrow \tilde x$, so the above equality is equivalent to

$$\begin{pmatrix} \tilde x & \tilde z \\ y & w \end{pmatrix}_{\xi,\eta} = u\binom{x}{y}_{n,l} \begin{pmatrix} x & z \\ y & w \end{pmatrix}_{\xi,\eta} u\binom{z}{w}_{m,r}.$$
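The condition above is a plain matrix identity for each pair of vertical edges, so once candidate gauge matrices are known it can be tested mechanically. The sketch below shows one way to do this; the function name `gauge_defect` and the toy $2 \times 1$ example are our own and are not entries of the actual connection.

```python
def matmul(x, y):
    """Product of two matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]

def gauge_defect(w_tilde, u_left, w, u_right):
    """Largest entrywise deviation of w_tilde from u_left * w * u_right."""
    candidate = matmul(matmul(u_left, w), u_right)
    return max(abs(a - c)
               for row_a, row_c in zip(w_tilde, candidate)
               for a, c in zip(row_a, row_c))

# Toy data: a 2x1 block, a 2x2 left gauge (a permutation) and a 1x1 right gauge.
w = [[0.6], [0.8]]
u_left = [[0.0, 1.0], [1.0, 0.0]]
u_right = [[-1.0]]
w_tilde = [[-0.8], [-0.6]]

defect_ok = gauge_defect(w_tilde, u_left, w, u_right)
defect_bad = gauge_defect([[0.8], [0.6]], u_left, w, u_right)
```

In the setting of this section one would call such a check once for every pair $(xy, zw)$, with the exact entries replaced by high-precision numerical values.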

Note that the vertices e, f and g are fixed by taking ∼. We easily know that t z˜ z z =u =u . u w n,l w n,r w n,l Note that the multiplicity n of the edges z-w is equal to that of the edges z-w. ˜ Now we begin to construct a candidate for the list of gauge transformation matrices. First, for the connections M (∗/6) and M (˜∗/6), we fix the gauges for the simple edges as a ∗˜ a˜ ∗ =u =u =u = (1)1,1 . u 6 1,r G 1,l 6 1,r G 1,l Here the matrices are all 1 × 1. From the next matrices, we always fix the gauges for simple edges to 1 × 1 matrices (1)1,1 , unless otherwise specified. We denote it simply by 1. ˜ Next we fix the gauges for the connections M (∗/5), M (˜∗/5) and M (b/6), M (b/6). b b˜ We put u G 1,l = u G 1,l = −1. ˜ For M (b/5) and M (b/5), we fix gauges as follows: b˜





1

1 u

c˜ 5 2,r



b

=  −1  5 1

1 u

5

1

0

c˜

,

5 2,r



√0

 1 0 =  √ β2 β 4 −1 0 β2

β 4 −1  . β2 

−1 β2

In the same way we get   √ β 4 −1 1 c 2 β2 . =  √ β4 u β −1 5 2,r −1 β2

β2

˜ For M (d/6) and M (d/6), d˜ 6

=u

d d G 2,l

6

,

 q q  ˜ β 2 −2 β 2 +1 − 2β 2 −1 2 −1 d d 2β   q q = , =u u β 2 +1 β 2 −2 G G 2,l 2,l 2 2 2β −1

2β −1


˜ by symmetricity of this matrix. To check M (d/5) and M (d/5), we use M (G/e). See the matrix (1). Note that the multiplication by σ from the left to (αα˜ − 1)σα corresponds to that by σ from the right on ασ(α ˜ α˜ − 1), which causes the permutation of the entries of the connection matrix M (G/e) as follows: e

G

e

G

e −→ d˜ e . d We denote the connection matrix made from M (G/e) by multiplying σ by M (G/e)∼ . Since the vertex e is fixed by multiplying σ, we should fix the gauge matrix so that M (G/e) and M (G/e)∼ are transferred to each other. By the effect of multiplying σ, ˜ we see that M (G/e)∼ is made from M (G/e) by exchanging df (resp. db )-row and d˜f (resp. d˜h )-row, i.e., we have the following relation:   00100 0 0 0 1 0 G G   = 1 0 0 0 0  ,  ∼ e e 0 1 0 0 0 00001 here ∼ at the lower left corner in the left hand side square means the changing of labels and replacing columns according to the labels. Now we get the gauge as usual. Note that gauge matrices for upside down edges are the same as those of normal position:   ˜ d e u G u 5 4,r  G  G 2,l d =  u G 2,l 1 ∼ e e 1 

2 3 (β √−2)  2β 4 2(β 4 +4)  √  √ − 2(β 2 −2)  β 2 (β 2 −1)(β 4 +4) 

 =      

2 β 2 (β √−2) (β 2√ −1) 2(β 4 +4) β 2 (β 2 −2)

√

β 4 +4 √ 2 √ 2β (β 2 +1)(β 4 +4)



u

e 5 4,r

q

−(β −2) √ 4 √ 2(β +4) 2(β 2 −1) √ 2

q

β 4 +4 √ 2 − 2(β √−2) 4 (β 2 −1) √ β +4 −2 β 2 −2

0 0

β 4 +4

2(β 2 −2) β 4 +4

β (β −1)(β +4)

0

√

β 2 −2 2(β 2 −1)

β 2 −2 2β 2 −1

0

√ 2 4 √β (β +4) 2(β 2 +1) √

  2(β 2 +4)  4   √β +4 − β 2 (β 2 −2) =  √ 1  β 4 +4  √ −2  2 2 4

β 2 +1 2β 2 −1

√−1

2β 2

2 √β −2 β 4 −1 √ 2 β

2

√1

2(β 2 +4) β 4 +4 β2 β 4 +4

√

2β 4 (2β 2 −1)(β √ +4) (β 2 −1) β 2 −1

√

β 2 (β 4 +4)

0

β 2 −1



−(β 2 −2) 2(β 2 −1) 

 √1   2 √ β −1   u e 2 5 4,r √2β −1  , β 4 −1  1   0    √−1 β 2 +1

√ 2 2 √β (β −2)

−

β 4 +4

√ 2β (2β 2 −1)(β 4 +4) 0 √ 2 β

0

√

−2 4 β 2 (β 2 −1)(β √ +4) (β 2 −1) β 2 −1

√

β 2 (β 4 +4) √ 2 β −1 β 2 −2

0

 0

  0   . 0    0 1


It is too hard to obtain the above gauge matrix by calculating all the elements by multiplication of matrices. Note that 

00100

  0 0 0 1 0   G   = 1 0 0 0 0    e e 0 1 0 0 0   00001   ˜ d u G 2,l  G  d  = u  G 2,l  1

G ∼



00100

 0   1  0  0  0  0   = 1  0  0   = 

u

!

e 5 4,r

e

,

1



 0 0 1 0  G  0 0 0 0  e 1 0 0 0  0001  0100   t d˜ 0 0 1 0   u G 2,l  0 0 0 0  u  1 0 0 0  0001

u





00100

d t G 2,l

 0   1   0 1  0



d t G 2,l u

 G  

t d˜ G 2,l

u

1

 0 0 1 0  G  0 0 0 0  1 0 0 0  0001

u e

!

e t 5 4,r 1

!

e t 5 4,r

e



,

1

d˜ d d t = u G = u G . By comparing the above two equations we know and u G 2,l 2,l 2,l e that the matrix for the gauge u 5 4,r is symmetric. Thus it is enough first to check the (5,5)-entry is 1, and then to calculate the (1, 1), (2, 1), (3, 1), (4, 1), (2, 2), (3, 2), (3, 3), (2, 4), (3, 4) and (4, 4)-entries. We continue to fix gauge transformation matrices:  d˜ 5

 = 



−1 u

d E 3,l

u

d G 2,l

 d  

u 5

!

c 5 2,r

u

e 5 4,r


  =  




−1 u

d e 3,l

u 0

d G 2,l

β2 − 4

  2 −(4β  √ −5) ∗  2β 2 β 2 −1   −(β 2 −2)  ∗ 2β 4    √ −1 ∗  (β 2 −1)3 √ 4  2(β −1)  β3 ∗  1 ∗ β4

            =      

   √

β 2 −2 β 2 (2β 2 −1)(β 4 +4)

√

3β 2 +4 β 2 (2β 2 −1)(β 4 +4)

√ 2 3√ 2 (β√ −1) β 4 +4 (β 2 −2)3

√ 2 − 2(3β √ +1) 2 2 4 (β −1) √ β +4 2 4 β −2

β 3 β 4 +4 √ 2 2(β√ −2)2 (β 2 −1) β 4 +4

2 4 +4) √β (β 2 2(β√ −2) (β 2 −1) β 4 +4

∗

∗

0

∗

∗

0

√

√

0

2β 2 (β 2 −2) (β 2 −1)2 β −2 β 2 −1 2

√

∗ ∗

−1 (β 2 −1)3

∗ 2 β 2 −1 2 β +1 √ β 2 β 2 −1

        ,       



−1

d E 3,l

u

u

d G 2,l

  

−1

0

0

0

0

0

β 2 −2 β 2 +1

√β(β −2) 4 q2(β −1)

−1 β 2 +1

0

2 −2) 2(β 4 −1)

0

0

√

−β 2 2β 2 −1

√ 2 √β −2

− 2

β 2 +1

2

β 2 −2 β 4 −1

√ 2 √β −2

√β(β

−(β 2 −2) β 2 +1

0

−β 2 β 2 +1

0

0

0

0

0

0

0

0

2

β 2 +1

0 −(β 2 −2) √β 2 2 β 2 −1 β2

0



     0   . 0    √ 2 2 β −1   β2  0

β 2 −2 β2

d , we use Mathematica Here, to obtain the above matrix for the gauges u Ed and u G to see the signs of non-zero entries. First we calculate the (6, 5), (6, 6)-entries and (1, 1)entry (equal to −1), and also check {(6, 5)-entry}2 + {(6, 6)-entry}2 = 1, then we know the entries of the first and last rows and of the first column are equal to 0 except for (1,1),(6,5) and (6,6). Next we calculate (2, 2), . . . , (2, 4) and (4, 2), . . . , (4, 4) and check that the square sums are equal to 1 respectively, and also calculate the (5,2)-entry is equal to 0. Then the (3,2)-entry is determined by using the fact that the square sum of the entries in the second column is equal to 1, and by the sign of the entry obtained by the numerical calculation of the product of matrices by Mathematica. Then we have (3,3) and (3,4)-entries by the orthogonal relation of the second column and the third and fourth. Now we know that the matrix is block diagonal, and we have the rest (5,5) and (5,6)-entries by unitarity and signs obtained by Mathematica. To execute the above calculation, we do not use the entries ∗ in the multiplication matrix of M (d/5) and the ˜ right gauges. Note u Cd˜ = u Cd˜ = −1.
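The step "determine an entry from the unit square sum of its column, with the sign taken from a numerical computation" can be sketched as follows. This is only an illustration of that bookkeeping with made-up numbers; the helper name `complete_unit_vector` is hypothetical.

```python
import math

def complete_unit_vector(known_entries, sign_hint):
    """Return the one missing entry of a real unit vector.

    known_entries: the other entries of the row or column.
    sign_hint: any number whose sign is the sign of the missing entry,
               e.g. a floating-point value from a numerical matrix product.
    """
    remainder = 1.0 - sum(x * x for x in known_entries)
    if remainder < -1e-12:
        raise ValueError("known entries already have norm greater than 1")
    # Clamp tiny negative rounding errors before taking the square root.
    return math.copysign(math.sqrt(max(remainder, 0.0)), sign_hint)

# Made-up example: known entries 0.6 and 0, numerical sign negative.
missing = complete_unit_vector([0.6, 0.0], -0.79)
```

The numerical product is thus needed only for signs; the absolute values come from unitarity, which keeps the final entries exact.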


We see that $u\binom{d}{G}_{2,l}$ is the same matrix as the one we have already obtained. From now on, whenever a gauge matrix that has already been determined reappears together with new gauge matrices, it is understood that we check that it agrees with the earlier one. We check the remaining connections as before:
A

!

1

g 2 2,r

u

f =

1 u

f

C 2,l

! u



f C 2,l

    

 =  1

u

f C 2,l



 e  

2 −(β √−2)  (β 2 −1) 2(β 4 +4)

 2(β 2 −2)  −√  2(β 4 +4) =  −1 √   2β β 2 −1 ∗

q

β 2 (β 2 −2)  ,  β 2 −1

!

1 u

√ 2 β 2 −2

−1 β 2 −1

√

1

 = 0 0

u

√

1 β 2 −2 √ β 2 +1 β 2 −2

∗ 

0

β 2 +1  , β 2 −2  −1 β 2 −2

!

e 3 3,r

1 √

∗



∗

0



β 2 −2 2β 

β 2 −2 , β2 2β 2(β 2 −2) 

−β β 2 −1 β 2 (β 2 +2) √ (β 2 +1)(β 4 +4) 4 √β +4 2 β 2 −1 4

2(β 4 +4) √−1 2β β 2 −1

√

√

−

β 2 +1 2β 2

c

√

g 2 2,r

2

∗ 



0

1 β 2 −1

f

 1

e

√

−1 √ β 2 −1 2 β (β 2 −2) β 2 −1

f C 2,l

!

1

∼

 = 0

0

,

g 2 2,r

u

1

! u

=



!

1 g

0

1

2

C

=

g

∼

A

√

β 2 (β 4 +4) √ −2 2 2 (β −2)2

1 (β 2 −2)β

∗



  ∗    2 3β −4  2 2β(β −2)   √ 1 2(β 2 −2)

u

!

e 3 3,r

1

,


 !

e 3 3,r

u

1

d˜

=

2(β 2 −2) β 4 +4

   −2(β 2 −2)2 4  =  √β +4  − (β 2 −2)3  √  β 4 +4 0 !

d˜ E 3,l

u

1

3 d˜ E 3,l

u

=

d 

!    1 

!

u

1

√ 2 3 √(β −2) 4 √ β +4 2 (β 2 −2)3 √

−

−2(β 2 −2)2 β 4 +4 −2(7β 2 +2) (β 2 −1)(β 4 +4) 2

β2

√ (β 2 −2)3 √

β2

β 4 +4

−2 β2

β 4 +4

0

0

1

3

√ 2 2(β 2 −2) √ −8 2β 4 2 4 4 2(β √ +4)(β −1) β√+4(β −1) 2 3 2 − (β −2) −4 2(β −1)

√

√

β 2 (β 4 +4)

β2

√

∗

β 4 +4

∗

β 2 −2 β 2 +1

 0

  0  ,  0  1

!

e 3 3,r

u

 d˜ E 3,l


2 −(β √ −2) 4 β 2 −1 β2 (β 2 −1)2

−β 2 β 2 +1

 √ β 2 −2 √ β 2 β 2 +1   2 , ∗ ββ 2 −2 −1   ∗ ∗ ∗

 2 2  β√(β −2) β 2 −2 √  2 3 2 (β −1) 4 β 2 −1 = 2 2  −1 (β −2) −(β 2 −2) β √  β 2 +1 β 2 +1 2 3  2 (β −1) 0 0 0

0



  0 ,  0  1

g , 3 2,r 3 3 √   −1 √2 2 2 −1 2β 2 g 2β −1  =  2√2 , u 1 3 2,r √ 2β 2 −1 h˜

=

h

u

2β 2 −1

f =

u

u 

3

=

u

f C 2,l

u

!

f C 2,l

f E 3,l

f

f E 3,l

1  2 β 4√  β 2 +1 !  −(β −1) β4 

     

u

!

g 3 2,r

u

3

e 3 3,r

∗

∗

∗

∗

∗ √1

∗

∗

√ 1 2β(β 2 −1)

∗

−1 β 2 (β 2 −1)

∗

∗

√ 2(β 2 −1) −2 √ 2 +3 2 β β β 4 +4 √ 2 2 − 2(β √ −2) √−(β +4) 2β 2 β 4 +4 β 2 (β 4 +4)

∗

∗

√



∗

 ,    

−2 β 2 +1 β 2 (β 2 −2)  4β 2 −1   β4 

∗ ∗


 u

!

f C 2,l

u

=    

f

E 3,l

β 2 +1 β 2 −2 −1 β 2 −2

0

0

0

0

0

0

f f =u E 3,l 4 4  β 2 −(β −2) √ √ 2(β 4 +4)  2 2 2β  −1 √ = 2 β 4 +4  2 √1 √β −2

f

β 2 +1

!

−1 u

e 4 2,r

 0   0   β2  , 2 β√+1   − 2β  2 β +1 

√

1 β 2 −2 √  β 2 +1  2  β −2

2(β 4 +4)

0

0

0

0

√ − 2β −1 2 β√+1 β 2 +1 − 2β −β 2 2 −1) β 2 +1 2(β√ β2 − 2β β 2 +1 β 2 +1

!

−1 u



e 4 2,r

∗

 ∗   ∗

!

−1 u

e 4 2,r



−1

 = 0 0

−1 β 2 +1

,

 0

0



−4β 2 −(β 4 −4)  . β 4 +4 β 4 +4  −(β 4 −4) 4β 2 β 4 +4 β 4 +4
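The last matrix above can be checked for orthogonality directly. The sketch below assumes it reads as the block matrix with $-1$ in the corner and the $2 \times 2$ block $\frac{1}{\beta^4+4}\begin{pmatrix} -4\beta^2 & -(\beta^4-4) \\ -(\beta^4-4) & 4\beta^2 \end{pmatrix}$; this reading of the layout is our assumption, and the helper names are ours.

```python
import math

A = (5 + math.sqrt(17)) / 2      # beta^2
B4 = A * A                       # beta^4
d = B4 + 4

gauge = [
    [-1.0, 0.0, 0.0],
    [0.0, -4 * A / d, -(B4 - 4) / d],
    [0.0, -(B4 - 4) / d, 4 * A / d],
]

def matmul(x, y):
    """Product of two matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]

def transpose(x):
    return [list(row) for row in zip(*x)]

def max_abs_diff(x, y):
    return max(abs(a - c) for rx, ry in zip(x, y) for a, c in zip(rx, ry))

identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
orthogonality_defect = max_abs_diff(matmul(gauge, transpose(gauge)), identity)
symmetry_defect = max_abs_diff(gauge, transpose(gauge))
```

Orthogonality here reduces to $(4\beta^2)^2 + (\beta^4-4)^2 = (\beta^4+4)^2$, which holds for every $\beta$, so this particular check does not depend on the value of $\beta$.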

Here note u(g-4) = −1. h˜

=

g u , 5 2,r 5

h

5

  √ β 2 +1 −1 g β 2 −2 β 2 −2  = √ , u β 2 +1 5 2,r 1 β 2 −2

 f 5

 = 

u

β 2 −2



f E 3,l

u

 f  

f

C˜ 2,l

1

u 5

!

g 5 2,r

u

e 5 4,r

,


f

u

!

g 5 2,r

u

5



∗

  ∗    ∗   √ 2 =  2 β −1  (β 2 −2)2   −1 √  (β 2 −2)3  −2 (β 2 −2)2


e 5 4,r

∗

∗ √3

∗ −β √

∗

−1 √ 2 2β

β 4 +4 √ 2 √ (β 2 −1) β 4 +4

(β 2 −2) β 4 +4 −(β 2 −1) √ 2(β 4 +4)

2 −(β √ −2) 2 2β 2 −1 √ β (β 2 −1)3

∗

∗

∗

∗

∗ √

∗

∗ √ 2 2 √β −1(β +1)

∗

3 2(β 2 −2)

β

− β 2 +1 2(β 2 −2)

2 −1 2(β 4 +4)

√β

β

∗

(β 4 +4)(β 2 −2)

    

u

−(β√2 −2)2 β2 −1 β 2 +1 β 2 +1 4 2β  −(β 2 −2)2 −(β 2 −1) −(β 2 −2)  √ √ β 2 +1  4 2β 2 β 2 +1  2 2 −(β  β √ −2) β−1 2 +1  β 2 +1 2 β 2 +1 



f

E 3,l

u

f C˜ 2,l

 =    1   

b

ce

Ef

3

0

0

0

0

0

0

0

0

0

=

c˜e

b˜ Ed

˜

3



∗

     ∗   2 , β +1 √  2β(β √ 2 −2)  2 − β +1   √ 2β(β 2 −2)   √ 1 ∗

(β 2 −2)3

0

0

0

0

0

0

β√+1 1 β 2 −2 2 2β 2 β√+1 −1 2 2β β 2 −2 2

0

0

0



 0   0  ,  0  0  1

× (−1),

u(c-4) = u(˜c-4) = −1.

From the above computations, we can extract a list of 32 unitary matrices $u\binom{x}{y}_{m,l}$ labeled by the edges $x$-$y$ of $H_0$ and 23 unitary matrices $u\binom{z}{w}_{n,r}$ labeled by the edges $z$-$w$ of $H_1$. We now have to check the equality

$$\begin{pmatrix} \tilde x & \tilde z \\ y & w \end{pmatrix}_{xy,zw} = u\binom{x}{y}_{n,l} \begin{pmatrix} x & z \\ y & w \end{pmatrix}_{xy,zw} u\binom{z}{w}_{m,r}$$

for all pairs of edges $(xy, zw)$ from the vertical graphs. This amounts to checking 280 equalities of real numbers. First we go back and check that all the matrix identities listed above for the construction of the gauge matrices hold for all entries, and not just for the subset of entries needed to produce the candidates for the $u\binom{x}{y}_{n,l}$'s and $u\binom{z}{w}_{m,r}$'s. Then we see that only a few equalities are left to be checked, namely the equality of the gauge transform between $M(d/4)$ and $M(\tilde d/4)$, and that of some other matrices. The checking of the former is done as follows:


d˜

˜ d =u E 3,l 4  2 −(β √ −2)  √ β 2 +1  5−β 2 =  √β 2 +1  2 −(β √ −2) β 2 +1

d 4 ∗ √ β(β 2

u

e 4 2,r

! u e4 2,r 

2 −2) (β +1)(β 4 +4)

∗ ∗

!

−1

−1

 

3  √ −2β 4 (β −1)(β 4 +4)  √  2 √2 2 4 (β +1) β +4

u



−1

 = 0 0

!

−1

e 4 2,r

,

 0

0



−4β 2 −(β 4 −4)  . β 4 +4 β 4 +4  −(β 4 −4) 4β 2 β 4 +4 β 4 +4

We can see that $u\binom{e}{4}_{2,r}$ here is the same matrix as when it first appeared. The latter equalities involve scalar matrices or $2 \times 2$ matrices, for which one can check at a glance that the gauge matrix agrees with the one already used; in either case the verification is easy enough that we do not write it down. We have checked all of the above identities using Mathematica, making repeated use of the identity $\beta^4 - 5\beta^2 + 2 = 0$. Thus we have obtained the equivalence of the connections $(\alpha\tilde\alpha - 1)\sigma\alpha \cong \sigma(\alpha\tilde\alpha - 1)\sigma\alpha$ up to the vertical gauge choice.

Finally we check conditions 1) and 2). By the same argument as in the proof of the previous theorem for the $(5+\sqrt{13})/2$ case, we see the indecomposability of all the connections other than $(\alpha\tilde\alpha - 1)\sigma\alpha$ and the mutual inequivalence of all of them. In Fig. 10, the vertex $*$ in $V_0$ is the endpoint of only one edge in $K$. Thus, using Corollary 2, we have indecomposability of the connection $(\alpha\tilde\alpha - 1)\sigma\alpha$. Now the proposition holds, and thus we have proved the theorem.

Acknowledgement. The first named author acknowledges financial support and hospitality from Odense University and University of Copenhagen during her visit to Denmark in March/April and September, 1997. She also acknowledges financial support from the Honda Heizaemon memorial fellowship. She is very grateful to Y. Kawahigashi and M. Izumi for constant advice and encouragement.

References

[BN] Bion-Nadal, J.: Subfactor of hyperfinite $\mathrm{II}_1$ factor with Coxeter graph $E_6$ as invariant. J. Operator Theory 28, 27–50 (1992)
[B] Bisch, D.: Principal graphs of subfactors with small index. To appear in Math. Ann.
[EK] Evans, D.E. & Kawahigashi, Y.: Quantum symmetries on operator algebras. Oxford: Oxford University Press, 1998
[EK1] Evans, D.E. & Kawahigashi, Y.: Orbifold subfactors from Hecke algebras. Commun. Math. Phys. 165, 445–484 (1994)
[GHJ] Goodman, F., de la Harpe, P. & Jones, V.F.R.: Coxeter graphs and towers of algebras. MSRI publications 14, Berlin–Heidelberg–New York: Springer, 1989
[H] Haagerup, U.: Principal graphs of subfactors in the index range $4 < [M : N] < 3 + \sqrt{2}$. In: Subfactors. Singapore: World Scientific, 1994, pp. 1–38
[HW] de la Harpe, P. & Wenzl, H.: Operations sur les rayons spectraux de matrices symetriques entieres positives. C. R. Acad. Sci. I 305, 733–736 (1987)
[Ik] Ikeda, K.: Numerical evidence for flatness of Haagerup’s connections. Preprint (1996)
[I1] Izumi, M.: Application of fusion rules to classification of subfactors. Publ. RIMS, Kyoto Univ. 27, 953–994 (1991)
[I2] Izumi, M.: On flatness of the Coxeter graph $E_8$. Pac. J. Math. 166, 305–327 (1994)
[IK] Izumi, M. & Kawahigashi, Y.: Classification of subfactors with the principal graph $D_n^{(1)}$. J. Funct. Anal. 112, 257–286 (1993)
[J] Jones, V.F.R.: Index for subfactors. Invent. Math. 72, 1–15 (1983)
[K] Kawahigashi, Y.: On flatness of Ocneanu’s connection on Dynkin diagrams and classification of subfactors. J. Funct. Anal. 127, 63–107 (1995)
[O1] Ocneanu, A.: Quantized group, string algebras and Galois theory for algebras. In: Operator algebras and applications, Vol. 2. London Math. Soc. Lecture Notes Series 136, 1989, pp. 119–172
[O2] Ocneanu, A.: Quantum symmetry, differential geometry of finite graphs and classification of subfactors. Univ. of Tokyo Seminary Note 45, recorded by Y. Kawahigashi, 1991
[O3] Ocneanu, A.: Paths on Coxeter Diagrams: From Platonic solids and singularities to minimal models and subfactors. In preparation
[P1] Popa, S.: Classification of subfactors: Reduction to commuting squares. Invent. Math. 101, 19–43 (1990)
[P2] Popa, S.: Classification of amenable subfactors of Type II. Acta Math. 172, 352–445 (1994)
[Sa] Sato, N.: Two subfactors arising from a non-degenerate commuting square II – tensor categories and TQFT’s. Internat. J. Math. 8, 407–420 (1997)
[S] Schou, J.: Commuting squares and index for subfactors. Ph.D. dissertation at Odense University (1990)
[SV] Sunder, V.S. & Vijayarajan, A.K.: On the non-occurrence of the Coxeter graphs $\beta_{2n+1}$, $E_7$, $D_{2n+1}$ as principal graphs of an inclusion of $\mathrm{II}_1$ factors. Pac. J. Math. 161, 185–200 (1993)
[W] Wenzl, H.: Hecke algebras of type $A_n$ and subfactors. Invent. Math. 92, 349–384 (1988)

Communicated by H. Araki

Commun. Math. Phys. 202, 65 – 87 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Topological Approach to Quantum Surfaces

Toshikazu Natsume1, Ryszard Nest2

1 School of Mathematics, Nagoya Institute of Technology, Showa-ku, Nagoya 466, Japan
2 Mathematics Institute, University of Copenhagen, Universitetsparken 5, Copenhagen, DK-2100 Ø, Denmark

Received: 11 June 1998 / Accepted: 28 July 1998

Abstract: We discuss a topological method to quantize closed surfaces.

1. Introduction

One of the more exciting developments of the last decade is the introduction of noncommutative geometry, which subsumes under a common structure both classical Riemannian geometry and various noncommutative situations like discrete groups or their duals, C*-algebras associated to various questions of number theory, quantum mechanical systems and many more. However, one of the main questions at the moment is to find sufficiently many computable examples. The most celebrated one is the noncommutative torus, which played a significant role in the development of the subject [5]. The C*-algebra $T^2_\theta$ is generated by two unitaries $U$, $V$ subject to the commutation relation $UV = e^{2\pi i\theta} VU$. In particular, for $\theta = 0$, this is just the C*-algebra of continuous functions on the two-torus $\mathbb{T}^2$, and the noncommutative version can in a natural way be regarded as a deformation of this commutative C*-algebra. Actually this is the way that the noncommutative tori appear in physics (Hall effect), where the phase factor $e^{2\pi i\theta}$ comes from the shift in the phase of an electron on a lattice in a transversal magnetic field.

The fact that $C(\mathbb{T}^2)$ can be naturally described by generators and relations is a consequence of the fact that $C(\mathbb{T}^2)$ is isomorphic to the group C*-algebra of its fundamental group $\mathbb{Z}^2$, the isomorphism being given by the Fourier transform. This is one of the special features of the genus one case, and fails miserably in the higher genus case. Hence a more or less natural construction of a “quantization” of the algebra of continuous functions on a surface of higher genus requires a different approach. Before explaining our approach let us list a few of the more pertinent properties of the genus one case as a guideline for what follows.


T. Natsume, R. Nest

(1) $T^2_\theta$ is simple for $\theta$ irrational;
(2) $T^2_\theta$ is isomorphic to a twisted group C*-algebra $C^*(\mathbb{Z}^2, \omega_\theta)$ for an appropriate element $\omega_\theta$ of the second cohomology group of $\mathbb{Z}^2$ with values in the unit circle;
(3) $\theta \mapsto T^2_\theta$ is a strict deformation quantisation of the Poisson manifold $(\mathbb{T}^2, \{\,,\})$ (where $\{\,,\}$ is the standard Poisson structure of $\mathbb{T}^2$) in the sense of M. A. Rieffel, i.e. there exists a family of maps $\theta \mapsto \pi_\theta : C^\infty(\mathbb{T}^2) \to T^2_\theta$ such that for each $\theta$ the range of $\pi_\theta$ is dense and
\[ \frac{1}{\theta}\,\|\pi_\theta(f)\pi_\theta(g) - \pi_\theta(fg + 2\pi\theta\{f,g\})\| \longrightarrow 0 \quad \text{as } \theta \to 0. \]

In particular the last of the above mentioned facts leads to the interpretation of $T^2_\theta$ as a deformation of the group C*-algebra $C^*(\pi_1(\mathbb{T}^2))$ associated to the family of group 2-cocycles $\omega_\theta$. We will use the above observation to introduce the following “minimal” requirements for a noncommutative deformation $R$ of a Riemannian surface $\Sigma$:
• $R$ is a unital (hopefully simple) C*-algebra;
• both $R$ and $C(\Sigma)$ are fibers of a continuous family of C*-algebras over a path connected space;
• $R$ is (at least) KK-equivalent to a twisted group C*-algebra $C^*_{red}(\Gamma, \sigma)$ for some group 2-cocycle $\sigma$.

Let now $\Sigma$ be a closed Riemannian surface of genus $g \ge 2$ and let $\Gamma$ denote its fundamental group. In this case there is no direct analytic connection (such as Morita equivalence) between the abelian C*-algebra $C(\Sigma)$ and $C^*_{red}(\Gamma)$. However, there exists a KK-equivalence between the two algebras (see e.g. [21]). Moreover, since $H^2(\Gamma; \mathbb{T}) \simeq \mathbb{T}$, there exists a one-parameter family of twisted group C*-algebras which can be regarded as deformations of $C^*_{red}(\Gamma)$.

The group $\Gamma$ will be regarded as a discrete cocompact subgroup of PSU(1,1). The latter acts on the unit disc $D$ by linear fractional transformations and hence induces the holomorphic covering map $D/\Gamma \to \Sigma$. The C*-algebra $C(\Sigma)$ is identified with the algebra of continuous $\Gamma$-invariant functions on $D$. There exists a construction due to A. Lesniewski and S. Klimek [13] which takes a representation theoretic approach and which yields a sequence of finite dimensional algebras “converging” in an appropriate sense to $C(\Sigma)$, essentially based on what is known as Kaehler quantization. The paper [14] introduces various algebras of Toeplitz operators restricted to finite dimensional subspaces of modular forms. It is not entirely clear what the relation is between those algebras and either $C(\Sigma)$ or $C^*_{red}(\Gamma)$.

In the present paper we develop a topological method to construct noncommutative surfaces based on the relation between the algebras $C(\Sigma)$ and $C^*_{red}(\Gamma)$. The starting point is the fact that $C(\Sigma)$ and $C_0(D) \rtimes_{red} \Gamma$ are Morita equivalent [12, 17]. In particular there exists a projection $e \in C_0(D) \rtimes_{red} \Gamma$ such that
\[ e(C_0(D) \rtimes_{red} \Gamma)e \cong C(\Sigma). \]
Our approach will be based on a construction of a deformation of the left-hand side of the above equation. The first step is a $\Gamma$-equivariant deformation of the algebra $C(\Sigma)$

Topological Approach to Quantum Surfaces


given by an equivariant field of C*-algebras of compact operators, defined as follows. For $s > 2$ denote by $\mu_s$ the weighted Lebesgue measure on $D$:
\[ \mu_s = \frac{i}{2}\,(1 - |z|^2)^{s-2}\,dz\,d\bar z. \]
The Bergman space $H_s$ of $d\mu_s$-square integrable holomorphic functions on $D$ is a closed subspace of $L^2(D, d\mu_s)$ and carries a projective representation of PSU(1,1). In particular $\Gamma$ acts on the algebra $K(H_s)$ of compact operators on $H_s$. There exists a family of projections $e_s$ in $K(H_s) \rtimes_{red} \Gamma$ such that the family of the corners $e_s(K(H_s) \rtimes_{red} \Gamma)e_s$, $s > 2$, together with $C(\Sigma)$ forms a continuous field of C*-algebras, and our candidate for the noncommutative surface $R_s$ will be the above corner $e_s(K(H_s) \rtimes_{red} \Gamma)e_s$ of the reduced crossed product (see Definition 2.6).

About the content of this paper: In the next section we construct the family $\{R_s\}_{s\in]2,\infty]}$ of C*-algebras along the above lines. In Sect. 3 we study the individual algebras $\{R_s\}$, in particular their K-theory, and compute its pairing with the (unique) trace. In Sect. 4 we construct the structure of a continuous field on the family $\{R_s\}_{s\in]2,\infty]}$. In Sect. 5 we describe the elements of $\{R_s\}$ in terms of operators on the Bergman space $H_s$ and recover the results of [16] on the value of the unique normalized trace on $\{R_s\}$. In the last section we apply the above general construction to the case of the surface of genus one and show that one recovers the usual description of the noncommutative tori.

2. Construction of $\{R_s\}_{s\in]2,\infty]}$

Recall that a model for the universal covering space of $\Sigma$ is given by the Poincaré disc $D$, and $\Gamma$ can be identified with a discrete subgroup of PSU(1,1) acting on $D$ by linear fractional transformations. The reduced crossed product $C_0(D) \rtimes_{red} \Gamma$ is stable and strongly Morita equivalent to $C(\Sigma)$ ([12], [17]). In particular there exists a projection $e$ in $C_0(D) \rtimes_{red} \Gamma$ such that
\[ e(C_0(D) \rtimes_{red} \Gamma)e \simeq C(\Sigma). \tag{2.1} \]

The explicit description of the projection $e$ is given as follows. Let $f$ be a compactly supported, nonnegative function on $D$ such that $\{\gamma(f^2)\}_{\gamma\in\Gamma}$ forms a partition of unity. Then the sum
\[ e = \sum_{\gamma\in\Gamma} f\,\gamma(f)\,U_\gamma \tag{2.2} \]

is finite and defines a projection satisfying (2.1). As mentioned in the introduction, we will construct a continuous deformation of the left-hand side of (2.1). Let $s$ be a real number greater than two and let $dm(z)$ denote the Lebesgue measure on $D$. We will work with the Hilbert spaces:
\[ L_s = L^2(D, (1 - |z|^2)^{s-2}\,dm(z)), \tag{2.3} \]
\[ H_s = \text{the subspace of holomorphic functions in } L_s. \tag{2.4} \]
The space $L_s$ carries a projective representation of SU(1,1) given as follows. For an element
\[ \gamma^{-1} = \begin{pmatrix} \alpha & \beta \\ \bar\beta & \bar\alpha \end{pmatrix} \in \mathrm{SU}(1,1), \tag{2.5} \]
we denote by $\log$ a branch of the logarithm holomorphic on the subset of the complex plane given by $\{\bar\beta z + \bar\alpha \mid z \in D\}$ and set
\[ (\pi_s(\gamma)\phi)(z) = \exp(-s\log(\bar\beta z + \bar\alpha))\,\phi(\gamma^{-1}z). \tag{2.6} \]
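The reason a formula of the type (2.6) can only fail to be multiplicative by a phase is that the factor $\bar\beta z + \bar\alpha$ itself is exactly multiplicative along compositions of Möbius transformations; taking the power $-s$ through a branch of the logarithm then introduces at most a constant of modulus one. The following Python sketch checks this multiplicativity numerically. The helper names (`su11`, `multiply`, `act`, `automorphy`) and the hyperbolic parametrisation of SU(1,1) are illustrative assumptions, not notation from the text.

```python
import cmath
import math


def su11(t, phi, psi):
    """Return (alpha, beta) with |alpha|^2 - |beta|^2 = 1 (assumed parametrisation)."""
    alpha = math.cosh(t) * cmath.exp(1j * phi)
    beta = math.sinh(t) * cmath.exp(1j * psi)
    return alpha, beta


def multiply(g, h):
    """Matrix product of [[a, b], [conj(b), conj(a)]] for g and h, as an (alpha, beta) pair."""
    a1, b1 = g
    a2, b2 = h
    return a1 * a2 + b1 * b2.conjugate(), a1 * b2 + b1 * a2.conjugate()


def act(g, z):
    """Linear fractional action of g on a point z of the unit disc."""
    a, b = g
    return (a * z + b) / (b.conjugate() * z + a.conjugate())


def automorphy(g, z):
    """The factor conj(beta) * z + conj(alpha) appearing in (2.6)."""
    a, b = g
    return b.conjugate() * z + a.conjugate()


g = su11(0.7, 0.3, 1.1)
h = su11(1.2, -0.4, 2.0)
z = 0.3 + 0.4j

# The disc is preserved, and the factor is multiplicative along the composition g*h.
lhs = automorphy(multiply(g, h), z)
rhs = automorphy(g, act(h, z)) * automorphy(h, z)
print(abs(act(g, z)) < 1.0, abs(lhs - rhs) < 1e-9)
```

Only the logarithms of the three factors can differ, by an integer multiple of $2\pi i$ that is continuous in $z$ and hence constant on the connected disc.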

It is easy to see that $\pi_s$ is indeed a projective representation of SU(1,1) which moreover commutes with the orthogonal projection $P_s$ of $L_s$ onto $H_s$.

Remark 2.1. The definition of $\pi_s$ above is different from the one given in [13]. In fact, the choice of the principal branch of the logarithm in [13] makes the expression (V.1) there dependent on the choice of $\zeta$, contrary to what is claimed there, as seen in the example of $\gamma_1^{-1}$ and $\gamma_2^{-1}$ equal to √ 2 −i 2 1 − √ √ i − 2 1 2 √ and respectively.

Note that, for $s \in \mathbb{R} \setminus 2\mathbb{Z}$, πs ( 2 1 2 √1 ) $\neq 1$, and $\pi_s$ does not induce a map from PSU(1,1) into the group of unitaries on $H_s$. However, if we define, for $\gamma \in$ SU(1,1), $\alpha_\gamma = \mathrm{Ad}\,\pi_s(\gamma) \in \mathrm{Aut}\,K(H_s)$, the map $\gamma \mapsto \alpha_\gamma$ gives rise to a representation of PSU(1,1) into the group $\mathrm{Aut}(K(H_s))$ which we will still denote by $\alpha$. Given the fundamental group of a Riemannian surface $\Gamma \subset$ PSU(1,1), we will denote by $\Gamma_0$ its lift to SU(1,1) by the canonical map SU(1,1) → PSU(1,1). For a continuous function $g$ on the unit disc $D$ we denote by $M_g$ the multiplication operator $\phi \mapsto g\phi$ on $L_s$ and by $T_g$ the Toeplitz operator $P_s M_g P_s$ on $H_s$. Recall that we have already chosen a $\Gamma$-invariant partition of unity $\{\gamma(f^2)\}_{\gamma\in\Gamma}$ on $D$. As $f^2$ is nonnegative and compactly supported, the operator $T_{f^2}$ is positive and compact. Let $\sqrt{T_{f^2}}$ be its positive square root. Note that
\[ \alpha_\gamma\bigl(\sqrt{T_{f^2}}\bigr) = \sqrt{T_{\gamma(f)^2}}, \qquad \gamma \in \Gamma. \]

Lemma 2.2. The series $\sum_{\gamma\in\Gamma} \bigl\|\sqrt{T_{f^2}}\sqrt{T_{\gamma(f)^2}}\bigr\|$ is convergent.


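Proof. Since $f$ has compact support, the operators $\sqrt{T_{f^2}}\sqrt{T_{\gamma(f)^2}}$ are of Hilbert–Schmidt class. If we denote by $\|\cdot\|_2$ the Hilbert–Schmidt norm, the following holds:
\[ \bigl\|\sqrt{T_{f^2}}\sqrt{T_{\gamma(f)^2}}\bigr\|^2 \le \bigl\|\sqrt{T_{f^2}}\sqrt{T_{\gamma(f)^2}}\bigr\|_2^2 = \mathrm{Tr}(T_{f^2}T_{\gamma(f)^2}). \]
Recall that
\[ d\mu_s(z) = (1 - |z|^2)^{s-2}\,dm(z). \]
The functions
\[ \phi_n(z) = \sqrt{\frac{1}{\pi}\,\frac{\Gamma(s+n)}{n!\,\Gamma(s-1)}}\;z^n, \qquad n = 0, 1, \dots \tag{2.7} \]
form an orthonormal basis of $H_s$, hence
\begin{align*}
\mathrm{Tr}(T_{f^2}T_{\gamma(f)^2})
&= \sum_{m=0}^{\infty} \langle T_{f^2}T_{\gamma(f)^2}\phi_m \mid \phi_m\rangle
 = \sum_{m=0}^{\infty} \langle M_{f^2}P_s M_{\gamma(f)^2}\phi_m \mid \phi_m\rangle \\
&= \sum_m \int_D\!\int_D \frac{s-1}{\pi}\,\frac{\overline{\phi_m(x)}\,f^2(x)\,\gamma(f^2)(y)\,\phi_m(y)}{(1 - x\bar y)^s}\,d\mu_s(y)\,d\mu_s(x) \\
&= \frac{s-1}{\pi}\int_D\!\int_D \frac{f^2(x)\,\gamma(f^2)(y)}{(1 - x\bar y)^s}\Bigl(\sum_m \overline{\phi_m(x)}\,\phi_m(y)\Bigr)\,d\mu_s(y)\,d\mu_s(x) \\
&= \Bigl(\frac{s-1}{\pi}\Bigr)^2\int_D\!\int_D f^2(x)\,\gamma(f^2)(y)\,\frac{1}{(1 - x\bar y)^s}\,\frac{1}{(1 - y\bar x)^s}\,d\mu_s(y)\,d\mu_s(x) \\
&= \Bigl(\frac{s-1}{\pi}\Bigr)^2\int_D\!\int_D \frac{f^2(x)\,\gamma(f^2)(y)}{|1 - x\bar y|^{2s}}\,(1 - |x|^2)^{s-2}\,dm(x)\,(1 - |y|^2)^{s-2}\,dm(y).
\end{align*}
Since $f$ is compactly supported, the above integral is dominated by
\[ C' \int_{\gamma^{-1}(\mathrm{supp}\,f)} (1 - |y|^2)^{s-2}\,dm(y) \]
with a constant $C'$ dependent only on the support of $f$. Now a bit of geometry shows that, for $\begin{pmatrix} \alpha & \beta \\ \bar\beta & \bar\alpha \end{pmatrix}$ in $\Gamma_0$ covering $\gamma$, we have an estimate
\[ \int_{\gamma^{-1}(\mathrm{supp}\,f)} (1 - |y|^2)^{s-2}\,dm(y) \le C_2\,\frac{1}{|\alpha|^{2s}}. \]
As a consequence we get an estimate
\[ \bigl\|\sqrt{T_{f^2}}\sqrt{T_{\gamma(f)^2}}\bigr\| \le C\,\frac{1}{|\alpha|^s} \]
for any $\gamma \in \Gamma$. To complete the proof of our lemma we will apply the following:

Claim. The series $\sum_{\gamma\in\Gamma} \frac{1}{|\alpha|^s}$ is convergent for $s > 2$.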


In fact, let $N_r$ denote the number of points of the form $\gamma^{-1}(0)$ inside the disc $|z| \le r$. If we choose a fundamental domain $F$ for $\Gamma$ which contains the origin in its interior, $N_r$ is estimated (up to a multiplicative constant) by
\[ \frac{\text{Volume of ball of radius } r}{\text{Volume of } F} = \frac{1}{2(g-1)}\,\frac{1}{(1-r)^2}, \]
cf. (6.2.14) of [15]. If $|\gamma^{-1}(0)| = r$ then $|\alpha|^2 = \frac{1}{(1-r)^2}$. Therefore
\[ \sum_{\gamma\in\Gamma} \frac{1}{|\alpha|^s} \le C \int_0^1 (1 - r^2)^s\,d\Bigl(\frac{1}{2(g-1)}\,\frac{1}{(1-r)^2}\Bigr) \]
for some universal constant $C$, and the claimed result follows.
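Two ingredients of the proof of Lemma 2.2 lend themselves to a quick numerical sanity check: the norm of the monomial $z^n$ in $L^2(D, d\mu_s)$, which in polar coordinates reduces to a Beta integral with value $\pi\, n!\,\Gamma(s-1)/\Gamma(s+n)$, and the resulting sum $\sum_n \phi_n(x)\overline{\phi_n(y)} = \frac{s-1}{\pi}(1 - x\bar y)^{-s}$ over the normalised monomials. The sketch below uses only the Python standard library; the function names, the midpoint rule and the truncation parameters are choices made here for illustration.

```python
import math


def monomial_norm_sq(n, s, steps=20000):
    """Midpoint-rule value of the integral of |z|^(2n) (1 - |z|^2)^(s-2) over the unit disc.

    With t = r^2 in polar coordinates this is pi * int_0^1 t^n (1 - t)^(s-2) dt.
    """
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += t ** n * (1.0 - t) ** (s - 2.0)
    return math.pi * total * h


def closed_form(n, s):
    """Beta-function value of the same integral: pi * n! * Gamma(s-1) / Gamma(s+n)."""
    return math.pi * math.gamma(n + 1) * math.gamma(s - 1) / math.gamma(s + n)


def kernel_partial_sum(x, y, s, terms=400):
    """Partial sum of sum_n phi_n(x) * conj(phi_n(y)) for phi_n = z^n / ||z^n||."""
    w = x * y.conjugate()
    coeff = (s - 1.0) / math.pi      # 1 / ||z^0||^2
    power = 1 + 0j
    total = 0j
    for n in range(terms):
        total += coeff * power
        coeff *= (s + n) / (n + 1)   # ratio of consecutive values of 1 / ||z^n||^2
        power *= w
    return total


s = 2.7
x, y = 0.3 + 0.2j, -0.1 + 0.5j
expected = (s - 1.0) / math.pi * (1 - x * y.conjugate()) ** (-s)
print(abs(monomial_norm_sq(3, s) / closed_form(3, s) - 1.0) < 1e-4)
print(abs(kernel_partial_sum(x, y, s) - expected) < 1e-10)
```

The second check is the binomial series for $(1-w)^{-s}$ with $w = x\bar y$, which is the step that turns the sum over $m$ in the trace computation into the closed kernel.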

Corollary 2.3. The series
\[ \sum_{\gamma\in\Gamma} \sqrt{T_{f^2}}\,\sqrt{T_{\gamma(f)^2}}\,U_\gamma \tag{2.8} \]
converges in norm and defines an element $e_s$ in the reduced crossed product $K(H_s) \rtimes_{red} \Gamma$.

Our next goal is to show that $e_s$ is in fact a projection. For that purpose we need a bit more information about the operators $T_{f^2}$.

Lemma 2.4. For any $A \in B(L_s)$ the following implication holds:
\[ \sum_{\gamma\in\Gamma} \|A M_{\gamma(f^2)}\| < \infty \;\Longrightarrow\; \sum_{\gamma\in\Gamma} A M_{\gamma(f^2)} = A. \]

Proof. By our assumption, the sum $\sum_{\gamma\in\Gamma} A M_{\gamma(f^2)}$ converges to a bounded operator, say $B$, on $L_s$. Since continuous functions with compact support are dense in $L_s$, it is enough to check that $Ag = Bg$ for $g$ with compact support. But then $\mathrm{supp}\,\gamma(f^2) \cap \mathrm{supp}\,g = \emptyset$ for all but finitely many elements of $\Gamma$, say $\gamma_1, \dots, \gamma_n$. Then $\sum_{i=1}^{n} M_{\gamma_i(f^2)}\,g = g$ and therefore
\[ Bg = \sum_{i=1}^{n} A M_{\gamma_i(f^2)}\,g = A \sum_{i=1}^{n} M_{\gamma_i(f^2)}\,g = Ag. \]

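Lemma 2.5. The element $e_s$ is a projection in $K(H_s) \rtimes_{red} \Gamma$.

Proof. It is straightforward to see that $e_s$ is selfadjoint. Since it is in $\ell^1(\Gamma, K(H_s)) \subset K(H_s) \rtimes_{red} \Gamma$,
\begin{align*}
e_s^2 &= \sum_{\gamma} \Bigl\{ \sum_{\gamma'} \sqrt{T_{f^2}}\sqrt{T_{\gamma'(f)^2}}\;\pi(\gamma')\sqrt{T_{f^2}}\sqrt{T_{(\gamma')^{-1}\gamma(f)^2}}\;\pi(\gamma')^* \Bigr\} U_\gamma \\
&= \sum_{\gamma} \Bigl\{ \sum_{\gamma'} \sqrt{T_{f^2}}\;T_{\gamma'(f)^2}\sqrt{T_{\gamma(f)^2}} \Bigr\} U_\gamma.
\end{align*}
But
\begin{align*}
\sum_{\gamma'} \bigl\|\sqrt{T_{f^2}}\,P M_{\gamma'(f)^2}\bigr\|
&\le \sum_{\gamma'} \bigl\|\sqrt{T_{f^2}}\,P M_{\gamma'(f)^2}\bigr\|_2 \\
&= \sum_{\gamma'} \Bigl( \mathrm{Tr}\bigl(\sqrt{T_{f^2}}\,P M_{\gamma'(f)^4} P \sqrt{T_{f^2}}\bigr) \Bigr)^{1/2} \\
&\le \sum_{\gamma'} \|M_{\gamma'(f)^2}\|^{1/2} \Bigl( \mathrm{Tr}\bigl(\sqrt{T_{f^2}}\,P M_{\gamma'(f)^2} P \sqrt{T_{f^2}}\bigr) \Bigr)^{1/2} \\
&= \|M_{f^2}\|^{1/2} \sum_{\gamma'} \bigl\|\sqrt{T_{f^2}}\sqrt{T_{\gamma'(f)^2}}\bigr\|_2 < \infty,
\end{align*}
and the lemma above shows that
\[ \Bigl( \sum_{\gamma'} \sqrt{T_{f^2}}\,P M_{\gamma'(f)^2} \Bigr) P = \sqrt{T_{f^2}}. \]
Thus $e_s^2 = e_s$.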

Definition 2.6. Let $2 < s < \infty$. The noncommutative surface of genus $g \ge 2$ is the unital C*-algebra $R_s$ given by the reduction $e_s(K(H_s) \rtimes_{red} \Gamma)e_s$.

Proposition 2.7. The C*-algebra $R_s$ is independent of the choice of $f$.

Proof. Suppose that $\{\gamma(f^2)\}_{\gamma\in\Gamma}$ and $\{\gamma(g^2)\}_{\gamma\in\Gamma}$ are two $\Gamma$-invariant partitions of unity. Set
\[ v = \sum_{\gamma} \sqrt{T_{f^2}}\,\sqrt{T_{\gamma(g^2)}}\,U_\gamma. \tag{2.9} \]
The arguments used to prove that $e_s$ is in $\ell^1(\Gamma, K(H_s))$ work just as well for $v$, and then it is straightforward to see that the proof of the above lemma applied verbatim gives
\[ v^*v = \sum_{\gamma\in\Gamma} \sqrt{T_{g^2}}\,\sqrt{T_{\gamma(g)^2}}\,U_\gamma, \tag{2.10} \]
\[ vv^* = \sum_{\gamma\in\Gamma} \sqrt{T_{f^2}}\,\sqrt{T_{\gamma(f)^2}}\,U_\gamma. \tag{2.11} \]
In particular, $e_s(K(H_s) \rtimes_{red} \Gamma)e_s$ is independent (up to an isomorphism) of the choice of $f$.


3. Algebraic Properties of $\{R_s\}_{s\in]2,\infty[}$

Theorem 3.1. The noncommutative surface $R_s$ is strongly Morita equivalent to a reduced twisted group C*-algebra of $\Gamma$, is simple, and has a unique normalized trace.

Proof. The first statement follows immediately from the fact that $R_s$ is a full corner in the reduced crossed product $K(H_s) \rtimes_{red} \Gamma$ of the action of $\Gamma$ on the algebra of compact operators. Since the action of $\Gamma$ is implemented by a projective unitary representation, the reduced crossed product is isomorphic to the tensor product $K(H_s) \otimes C^*_{red}(\Gamma, \sigma)$, where $\sigma$ is the $\mathbb{T}$-valued group two-cocycle associated with $\pi_s$. The argument in [8] can be applied to the twisted group C*-algebra $C^*_{red}(\Gamma, \sigma)$ to show that it is simple and has a unique normalized trace, both of which properties descend by strong Morita equivalence to $R_s$.

Corollary 3.2. $K_0(R_s) \simeq \mathbb{Z}^2$, $K_1(R_s) \simeq \mathbb{Z}^{2g}$.

Proof. This follows immediately from the analogous statements for $C^*_{red}(\Gamma, \sigma)$, which can be seen, for example, from the fact that $\Gamma$ is hyperbolic and hence the Baum–Connes conjecture holds and implies that $C^*_{red}(\Gamma, \sigma)$ is KK-equivalent to the bundle $(C_0(D) \otimes_\Gamma K(H_s))$ over $\Sigma$ (with fiber isomorphic to the algebra of compact operators).

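Let $\tau$ denote the unique normalized trace on $R_s$. Our next goal is to compute its range on $K_0(R_s)$. We denote by $\sigma_0$ the pull-back of the cocycle $\sigma$ to SU(1,1). As in Sect. 2 we choose, for each
\[ \gamma^{-1} = \begin{pmatrix} \alpha_\gamma & \beta_\gamma \\ \bar\beta_\gamma & \bar\alpha_\gamma \end{pmatrix} \in \mathrm{SU}(1,1), \]
a branch of the logarithm so that the expression
\[ \chi(\gamma_1, \gamma_2) = \frac{1}{2\pi i}\Bigl( \log\bigl(\bar\beta_{\gamma_2}(\gamma_1^{-1}z) + \bar\alpha_{\gamma_2}\bigr) - \log\bigl(\bar\beta_{\gamma_1\gamma_2}z + \bar\alpha_{\gamma_1\gamma_2}\bigr) + \log\bigl(\bar\beta_{\gamma_1}z + \bar\alpha_{\gamma_1}\bigr) \Bigr) \tag{3.1} \]
is independent of $z \in D$. For $\gamma_1, \gamma_2 \in$ SU(1,1), set
\[ c(\gamma_1, \gamma_2) = \exp\{2\pi i s\,\chi(\gamma_1, \gamma_2)\}. \tag{3.2} \]
Then $c$ is a cocycle associated with the projective unitary representation $\pi_s$ of SU(1,1), i.e.
\[ \pi_s(\gamma_1)\pi_s(\gamma_2) = c(\gamma_1, \gamma_2)\,\pi_s(\gamma_1\gamma_2). \tag{3.3} \]
It is evident that $\sigma_0$ and $c$ are cohomologous on SU(1,1), as both define the same projective representation.

Proposition 3.3. The cocycle $\sigma$ on $\Gamma$ represents $\exp(2\pi i s(1 - g))$ via the canonical isomorphism of $H^2(\Gamma, \mathbb{T})$ with $\mathbb{T}$, i.e. for $s = 1$ it is half of the Euler class of $\Sigma$.

Proof. This follows from Proposition 2 of [22].

Theorem 3.4. Let $\tau$ be the unique normalised trace on $R_s$. Then
\[ \tau_*(K_0(R_s)) = \mathbb{Z} + \frac{1}{(s-1)(g-1)}\,\mathbb{Z}. \]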


Proof. To begin with, let $\tau_0$ be the unique normalized trace on $C^*_{red}(\Gamma, \sigma)$. By [3] its range on $K_0$ is $\mathbb{Z} + s(1-g)\mathbb{Z}$. Since the canonical semifinite trace on $K(H_s)$ is invariant under the action of $\Gamma$, it gives rise to a semifinite trace $\mathrm{tr}$ on $K(H_s) \rtimes_{red} \Gamma$, again with the range $\mathbb{Z} + s(1-g)\mathbb{Z}$. Let $j$ denote the canonical inclusion of $R_s$ into $K(H_s) \rtimes_{red} \Gamma$. Then $j$ induces an isomorphism on K-groups. Since $e_s$ is in the domain of $\mathrm{tr}$, so is all of $j(R_s)$ and
\[ \mathrm{tr} \circ j = \mathrm{tr}(e_s)\,\tau. \tag{3.4} \]
As in the proof of Lemma 2.2 we can compute the value $\mathrm{tr}(e_s)$ and get
\begin{align*}
\mathrm{tr}(e_s) = \mathrm{tr}(T_{f^2})
&= \frac{s-1}{\pi}\int_D \frac{f(z)^2}{(1 - |z|^2)^s}\,(1 - |z|^2)^{s-2}\,dm(z) \\
&= \frac{s-1}{\pi}\int_D \frac{f(z)^2}{(1 - |z|^2)^2}\,dm(z).
\end{align*}
As $\frac{1}{(1-|z|^2)^2}\,dm(z)$ is $\Gamma$-invariant and $\{\gamma(f^2)\}_{\gamma\in\Gamma}$ is a $\Gamma$-invariant partition of unity, the integral above is equal to the $\frac{1}{(1-|z|^2)^2}\,dm(z)$-volume of a fundamental domain $F$, i.e.
\[ \mathrm{tr}(e_s) = \frac{s-1}{\pi}\,\mathrm{vol}(D/\Gamma) = \frac{s-1}{\pi}\,\pi(g-1) = (s-1)(g-1). \]
Therefore
\[ \tau_*(K_0(R_s)) = \frac{1}{\mathrm{tr}(e_s)}\,\mathrm{tr}_*\bigl(K_0(K(H_s) \rtimes_{red} \Gamma)\bigr) = \mathbb{Z} + \frac{1}{(s-1)(g-1)}\,\mathbb{Z}. \]
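The final equality is a short piece of arithmetic with subgroups of $\mathbb{R}$. The following worked step spells it out, using only the range $\mathbb{Z} + s(1-g)\mathbb{Z}$ and the value $\mathrm{tr}(e_s) = (s-1)(g-1)$ from the proof; the abbreviation $x$ is introduced here for convenience and is not notation from the text.

```latex
\[
  x := \frac{1}{(s-1)(g-1)}, \qquad
  \frac{1}{\operatorname{tr}(e_s)}\bigl(\mathbb{Z} + s(1-g)\,\mathbb{Z}\bigr)
  = x\,\mathbb{Z} + \frac{s}{s-1}\,\mathbb{Z},
\]
% the sign of the second generator is irrelevant for the subgroup it generates
\[
  \frac{s}{s-1} = 1 + \frac{1}{s-1} = 1 + (g-1)\,x
  \quad\Longrightarrow\quad
  x\,\mathbb{Z} + \frac{s}{s-1}\,\mathbb{Z} = \mathbb{Z} + x\,\mathbb{Z}.
\]
```

Indeed, $1 = \frac{s}{s-1} - (g-1)x$ lies in the left-hand group, and $\frac{s}{s-1} = 1 + (g-1)x$ lies in the right-hand one, so the two groups have the same generators up to integer combinations.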

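Corollary 3.5. For $s$ irrational, the following holds:
• if $g > 2$, then $R_s \simeq R_{s_1} \iff s = s_1$;
• if $g = 2$, then $R_s \simeq R_{s_1} \implies s = s_1$ or $(s-2)(s_1-2) = 1$.

Proof. This follows from the uniqueness of the normalised trace and the above theorem. In fact, the equality
\[ \mathbb{Z} + \frac{1}{(s-1)(g-1)}\,\mathbb{Z} = \mathbb{Z} + \frac{1}{(s_1-1)(g-1)}\,\mathbb{Z} \]
implies
\[ \Bigl( \frac{1}{s-1} \pm \frac{1}{s_1-1} \Bigr)\frac{1}{g-1} \in \mathbb{Z} \]
for one of the two signs and, since $s, s_1 > 2$, the absolute value of the left-hand side is bounded by $2\,\frac{1}{g-1}$.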
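For the reader's convenience, here is one way to finish the argument, written as a sketch. It assumes the elementary fact that for irrational $x, x'$ the groups $\mathbb{Z} + x\mathbb{Z}$ and $\mathbb{Z} + x'\mathbb{Z}$ coincide exactly when $x' \mp x \in \mathbb{Z}$ for one choice of sign, applied to $x = \frac{1}{(s-1)(g-1)}$ and $x' = \frac{1}{(s_1-1)(g-1)}$.

```latex
% Since s, s_1 > 2, both 1/(s-1) and 1/(s_1-1) lie in the open interval (0,1).
\[
  g \ge 3:\quad
  \Bigl|\frac{1}{s-1} \pm \frac{1}{s_1-1}\Bigr|\frac{1}{g-1} < \frac{2}{g-1} \le 1
  \;\Longrightarrow\; \frac{1}{s-1} - \frac{1}{s_1-1} = 0
  \;\Longrightarrow\; s = s_1 ,
\]
% (the sum is strictly positive, so only the difference can vanish)
\[
  g = 2:\quad
  \frac{1}{s-1} - \frac{1}{s_1-1} \in (-1,1)\cap\mathbb{Z} = \{0\}
  \quad\text{or}\quad
  \frac{1}{s-1} + \frac{1}{s_1-1} \in (0,2)\cap\mathbb{Z} = \{1\},
\]
\[
  \frac{1}{s-1} + \frac{1}{s_1-1} = 1
  \;\Longleftrightarrow\; (s-1) + (s_1-1) = (s-1)(s_1-1)
  \;\Longleftrightarrow\; (s-2)(s_1-2) = 1 .
\]
```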


4. Continuous Field Structure

The goal of this section is to prove the following result.

Theorem 4.1. Let $R_\infty = C(\Sigma)$. The collection $\{R_s\}_{s\in]2,\infty]}$ can be endowed with the structure of a continuous field of C*-algebras over $]2, \infty]$.

Construction of the continuous field.

Step 1. Set $A_s = K(H_s)$ for $s > 2$ and $A_\infty = C_0(D)$. We endow the collection $\{A_s\}_{s\in]2,\infty]}$ with a structure of continuous field of C*-algebras. Recall that the functions
\[ \phi^s_n(z) = \sqrt{\frac{1}{\pi}\,\frac{\Gamma(s+n)}{n!\,\Gamma(s-1)}}\;z^n; \qquad s > 2,\; n = 0, 1, \dots \tag{4.1} \]
form an orthonormal basis for $H_s$. Let $\mathcal{F}$ denote the space of continuous functions on $]2, \infty[$ and let $M_0$ be the $\mathcal{F}$-submodule of $\prod_s H_s$ generated by the sections $\{\phi^s_n\}$. Then $((H_s), M_0)$ satisfies:
• $M_0$ is a linear subspace of $\prod_s H_s$;
• for every $2 < s < \infty$, the set $\{\phi(s) \mid \phi \in M_0\}$ is dense in $H_s$;
• for every $\phi \in M_0$ the function $s \mapsto \|\phi(s)\|$ is continuous.
Therefore by [9], (0.2.3), there exists
\[ M \subset \Bigl\{ \text{sections of } \prod_s H_s \text{ over } ]2, \infty[ \Bigr\} \]
such that $((H_s), M)$ is a continuous field of Hilbert spaces. The field $((H_s), M)$ in turn defines the associated continuous field of elementary C*-algebras $\{K(H_s)\}_{s\in]2,\infty[}$. Our goal is to extend it to $]2, \infty]$.

Let $T^s_f$ denote the Toeplitz operator with symbol $f \in C_0(D)$ acting on $H_s$. For $s > 2$, $T^s_f$ is of Hilbert–Schmidt class and, in terms of the rank one partial isometries $\theta_{\phi^s_m, \phi^s_n}$, has a norm convergent expansion
\[ T^s_f = \sum_{m,n} c_s(m, n; f)\,\theta_{\phi^s_m, \phi^s_n}, \]
where
\[ c_s(m, n; f) = \int_D \overline{\phi^s_n(z)}\,f(z)\,\phi^s_m(z)\,(1 - |z|^2)^{s-2}\,dm(z). \]
It is fairly obvious from this formula that the functions $s \mapsto c_s(m, n; f)$ are in $\mathcal{F}$; hence $s \mapsto T^s_f$ is approximated by continuous sections and is therefore itself a continuous section. Moreover a minor modification of the result of Engliš ([11]) shows that the sections $s \mapsto \theta_{\phi^s_m, \phi^s_n}$ are locally uniformly approximated by Toeplitz operators with continuous compactly supported symbols. Now set
\[ \Lambda_0 = \Bigl\{ \sum a_{i_1, \dots, i_k}(s)\,T^s_{g_{i_1}} \cdots T^s_{g_{i_k}} \;\Bigm|\; a_{i_1, \dots, i_k}(s) \in \mathcal{F};\; g_{i_j} \in C_c(D) \Bigr\}. \]
$\Lambda_0$ is in a natural way a subalgebra of $\prod_s K(H_s)$ pointwise dense in the fibers, hence defines a continuous field of C*-algebras, identical with the one constructed with the help of the $\theta_{\phi^s_m, \phi^s_n}$'s. Now for any $g \in C_c(D)$, we set


\[ T^\infty_g = g \in C_0(D) = A_\infty. \]
Let $\mathcal{F}'$ denote the space of continuous functions on $]2, \infty]$. Set
\[ \Lambda_0' = \Bigl\{ \sum a_{i_1, \dots, i_k}(s)\,T^s_{g_{i_1}} \cdots T^s_{g_{i_k}} \;\Bigm|\; a_{i_1, \dots, i_k}(s) \in \mathcal{F}';\; g_{i_j} \in C_c(D) \Bigr\}. \tag{4.2} \]
Using Theorem VI.1 of [13] we have, for $g, h \in C_c(D)$,
• $\lim_{s\to\infty} \|T^s_g\| = \|g\|_\infty$,
• $\lim_{s\to\infty} \|T^s_g T^s_h - T^s_{gh}\| = 0$.
Hence
\[ \Bigl\| \sum a_{i_1, \dots, i_k}(s)\,T^s_{g_{i_1}} \cdots T^s_{g_{i_k}} \Bigr\| \longrightarrow \Bigl\| \sum a_{i_1, \dots, i_k}(\infty)\,g_{i_1} \cdots g_{i_k} \Bigr\|_\infty \quad \text{as } s \to \infty. \]
But this means that $\Lambda_0'$ determines a continuous field of C*-algebras of the required type.

Step 2. Set $B_s = K(H_s) \rtimes_{red} \Gamma$ for $s < \infty$ and $B_\infty = C_0(D) \rtimes_{red} \Gamma$. The collection $\{B_s\}$ has a structure of continuous field of C*-algebras over $]2, \infty]$. This is a direct consequence of the fact that $C^*_{red}(\Gamma)$ is an exact C*-algebra (Anantharaman–Delaroche) and the general theory of discrete groups with exact reduced group C*-algebra (E. Kirchberg, S. Wassermann, E. Blanchard) but, for the convenience of the reader (and for lack of an easy reference), we include a simple proof below. Let us start with the following two results.

Lemma 4.2. The canonical map $C_0(D) \rtimes \Gamma \to C_0(D) \rtimes_{red} \Gamma$ is an isomorphism.

Proof. Since the $\Gamma$-action on $D$ is amenable, the conclusion follows ([1]).

Lemma 4.3. For any element $\lambda$ in $\mathbb{C}[\Gamma]$ the function
\[ ]2, \infty[\; \ni s \mapsto \|\lambda\|_{C^*_{red}(\Gamma, c_s)} \in \mathbb{R} \]
is continuous.

Proof. The $C^*_{red}(\Gamma, c_s)$-norm is computed by representing $\lambda$ by the operator acting on $\ell^2(\Gamma)$ by
\[ (\pi_s(\lambda)\xi)(h) = \sum_{g\in\Gamma} \lambda(g)\,c_s(g, h)\,\xi(g^{-1}h). \]
We will use this notation throughout the proof. We will require a bit more information about the group $\Gamma$ and the cocycle $c_s \in H^2(\Gamma, \mathbb{T})$.
• The one-parameter family of cocycles $c_s$ can be represented by the functions $s \mapsto \exp(2\pi i s\,\omega(g, h))$, where $\omega(g, h)$ is the (oriented) volume of the geodesic triangle with vertices $(0, g(0), gh(0))$ in the Poincaré disc $D$.


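• For each fixed element $g \in \Gamma$, $C_g = \sup_{h\in\Gamma} |\omega(g, h)| < \infty$.
The first of the above is well known. The second claim is a direct consequence of the Gauss–Bonnet theorem. Let now again $\xi \in \ell^2(\Gamma)$. We have the following estimate:
\begin{align*}
\|(\pi_t(\lambda) - \pi_s(\lambda))\xi\|^2
&= \sum_h \Bigl| \sum_g \lambda(g)\bigl( \exp(it^{-1}\omega(g, h)) - \exp(is^{-1}\omega(g, h)) \bigr)\,\xi(g^{-1}h) \Bigr|^2 \\
&\le \sum_h \Bigl( \sum_g |\lambda(g)|\,|t^{-1} - s^{-1}|\,|\omega(g, h)|\,|\xi(g^{-1}h)| \Bigr)^2 \\
&\le |t^{-1} - s^{-1}|^2 \Bigl( \sup_{g\in\mathrm{supp}\,\lambda} |C_g\,\lambda(g)| \Bigr)^2 (\#\{\mathrm{supp}\,\lambda\})^2\,\|\xi\|^2.
\end{align*}
Hence $\|\pi_t(\lambda) - \pi_s(\lambda)\| \le \mathrm{const}\cdot|t - s|$, and the claimed result follows.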
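Two elementary inequalities are used silently in this estimate. They are recorded here as a worked step; the symbols $a, b$ stand for arbitrary real numbers and are not notation from the text.

```latex
% 1. The exponential is 1-Lipschitz along the real line:
\[
  \bigl| e^{ia} - e^{ib} \bigr|
  = \Bigl| \int_b^a i\,e^{iu}\,du \Bigr|
  \le |a - b| , \qquad a, b \in \mathbb{R},
\]
% applied with a = t^{-1}\omega(g,h) and b = s^{-1}\omega(g,h).
% 2. On ]2,\infty[ the difference of reciprocals is controlled by |t - s|:
\[
  \bigl| t^{-1} - s^{-1} \bigr| = \frac{|t - s|}{t\,s} \le \frac{|t - s|}{4},
  \qquad s, t > 2 .
\]
```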

Now we can finish the proof of continuity of the field. Denote by $A$ the C*-algebra of continuous sections of the field $\{A_s\}_s$ over $]2, \infty]$ vanishing at infinity. We prove that the reduced crossed product $A \rtimes_{red} \Gamma$ defines a continuity structure on the field $\{A_s \rtimes_{red} \Gamma\}_s$. Let $\rho_s$ be the evaluation map $A_s \rtimes \Gamma \to A_s \rtimes_{red} \Gamma$. The map $\rho_s$ exists (see, for instance, the proof of Theorem 2.5 of [10]). What remains to prove is the fact that, for any $a \in A \rtimes_{red} \Gamma$, the map $s \mapsto \|\rho_s(a)\|$ is continuous. First of all, the lower semicontinuity is a straightforward consequence of the Fatou Lemma (see also the proof of Theorem 2.5 of [10]). The universality of the full crossed product implies the upper semicontinuity of the field $\{A_s \rtimes \Gamma\}_s$. Then the continuity at $s = \infty$ follows from Lemma 4.2.

To deal with finite $s$, denote by $A'$ the C*-algebra of continuous sections of the field $\{A_s\}_s$ over $]2, \infty[$ vanishing at infinity. Note that $A'$ is a closed ideal of $A$ and, at any finite value of $s$, the continuity structure of $\{A_s \rtimes_{red} \Gamma\}_s$ is defined as well by $A' \rtimes_{red} \Gamma$. On $]2, \infty[$ we have two continuous fields of C*-algebras:
\[ s \mapsto K(H_s) \quad\text{and}\quad s \mapsto C^*_{red}(\Gamma, c_s). \]
By Remark 2.6 of [10], the field $s \mapsto K(H_s) \otimes C^*_{red}(\Gamma, c_s)$ can be endowed with a continuity structure. The ∗-algebra $C_c(\Gamma, A')$ is a total subset of $A' \rtimes_{red} \Gamma$ and the isomorphisms
\[ K(H_s) \rtimes_{red} \Gamma \cong K(H_s) \otimes C^*_{red}(\Gamma, c_s) \]
map any $a \in C_c(\Gamma, A')$ to a continuous section of $\{K(H_s) \otimes C^*_{red}(\Gamma, c_s)\}_s$. This means that $\|\rho_s(a)\|$ is continuous in $s$ and hence completes Step 2.


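Step 3. Let us fix an $\epsilon > 0$ and denote by $e_\infty$ the projection $e$ given by (2.2). Then the section
\[ ]2, \infty] \ni s \mapsto e_s \in K(L_s) \rtimes_{red} \Gamma \]
is continuous.

Recall that continuous sections of the field $\{B_s\}$ are given by elements of the crossed product $C(A) \rtimes_{red} \Gamma$. Since $\sqrt{T^\infty_{f^2}} = \sqrt{f^2} = f$, the section $s \mapsto \sqrt{T^s_{f^2}}$ is continuous (and similarly for $\sqrt{T^s_{\gamma(f)^2}}$). Therefore it is sufficient to show that the function
\[ \gamma \mapsto \Bigl\| \sqrt{T^s_{f^2}}\,\sqrt{T^s_{\gamma(f)^2}} \Bigr\| \tag{4.3} \]
is summable in $\gamma \in \Gamma$ uniformly in $s \in ]2, \infty]$. Hence we have to estimate the sum
\[ \sum_\gamma \Bigl\| \sqrt{T^s_{f^2}}\,\sqrt{T^s_{\gamma(f)^2}} \Bigr\|. \]
As in Sect. 2,
\begin{align*}
\Bigl\| \sqrt{T^s_{f^2}}\,\sqrt{T^s_{\gamma(f)^2}} \Bigr\|^2
&\le \mathrm{Tr}\bigl( T^s_{f^2}\,T^s_{\gamma(f)^2} \bigr) \\
&= \Bigl(\frac{s-1}{\pi}\Bigr)^2 \int_{D\times D} \frac{f^2(z)\,f^2(\gamma^{-1}\xi)}{|1 - z\bar\xi|^{2s}}\,(1 - |z|^2)^{s-2}(1 - |\xi|^2)^{s-2}\,dm(z)\,dm(\xi).
\end{align*}
For $z, \xi \in D$ we have the equality
\[ \frac{(1 - |z|^2)(1 - |\xi|^2)}{|1 - z\bar\xi|^2} = (\cosh d(z, \xi))^{-1}, \]
where $d(z, \xi)$ is the hyperbolic distance between the points $z$ and $\xi$. Let $\delta$ be a fixed strictly positive real number. Then
\[ d(\mathrm{supp}\,f, \mathrm{supp}\,\gamma(f)) > \delta \tag{4.4} \]
except for finitely many $\gamma$'s. We claim that there exists an $s_0$ such that for any $\gamma$ satisfying (4.4), any $z \in \mathrm{supp}\,f$ and $\xi \in \mathrm{supp}\,\gamma(f)$, the function
\[ \phi(s) = (s-1)^2 \Bigl( \frac{(1 - |z|^2)(1 - |\xi|^2)}{|1 - z\bar\xi|^2} \Bigr)^s \]
is monotone decreasing for $s \ge s_0$. Since $d(z, \xi) > \delta$,
\[ (\cosh d(z, \xi))^{-1} < (\cosh\delta)^{-1} < 1. \]
Set
\[ (\cosh d(z, \xi))^{-1} = e^{-a} \quad\text{and}\quad (\cosh\delta)^{-1} = e^{-b}. \]
By the above, $a > b > 0$. Now $\phi(s) = (s-1)^2 e^{-as}$ and $\phi'(s) = (s-1)e^{-as}(2 + a - as)$. Thus, for $s > s_0 = \frac{2}{b} + 1 > \frac{2}{a} + 1$, $\phi'(s) < 0$.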
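The monotonicity claim is elementary calculus, and a small numerical sketch makes the role of the threshold $s_0 = 2/b + 1$ concrete. The sample values of $\delta$ and of the distance $d$ below are arbitrary choices for illustration; the function names are likewise not from the text.

```python
import math


def phi(s, a):
    """phi(s) = (s - 1)^2 * exp(-a * s), where exp(-a) stands for (cosh d(z, xi))^(-1)."""
    return (s - 1.0) ** 2 * math.exp(-a * s)


def phi_prime(s, a):
    """Closed-form derivative of phi: (s - 1) * exp(-a * s) * (2 + a - a * s)."""
    return (s - 1.0) * math.exp(-a * s) * (2.0 + a - a * s)


def threshold(b):
    """s_0 = 2 / b + 1, where exp(-b) = (cosh delta)^(-1)."""
    return 2.0 / b + 1.0


delta = 0.5                       # assumed separation bound, as in (4.4)
d = 0.9                           # assumed distance with d > delta
b = math.log(math.cosh(delta))
a = math.log(math.cosh(d))
s0 = threshold(b)

samples = [s0 + 0.5 * k for k in range(40)]
print(a > b > 0.0)
print(all(phi_prime(s, a) < 0.0 for s in samples))
print(all(phi(samples[k + 1], a) < phi(samples[k], a) for k in range(len(samples) - 1)))
```

Because $s_0$ depends only on $b$, that is on $\delta$, the same threshold works for every $\gamma$ satisfying the separation condition, which is what makes the subsequent estimate uniform in $s$.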


But now for any $s \ge s_0$,
\begin{align*}
&\Bigl(\frac{s-1}{\pi}\Bigr)^2 \int_{D\times D} \frac{f^2(z)\,f^2(\gamma^{-1}\xi)}{|1 - z\bar\xi|^{2s}}\,(1 - |z|^2)^{s-2}(1 - |\xi|^2)^{s-2}\,dm(z)\,dm(\xi) \\
&\qquad \le \Bigl(\frac{s_0-1}{\pi}\Bigr)^2 \int_{D\times D} \frac{f^2(z)\,f^2(\gamma^{-1}\xi)}{|1 - z\bar\xi|^{2s_0}}\,(1 - |z|^2)^{s_0-2}(1 - |\xi|^2)^{s_0-2}\,dm(z)\,dm(\xi).
\end{align*}
Since the sum defining $e_s$ converges uniformly for $s$ in any compact subset of $]2, \infty[$, the required conclusion follows immediately.

Step 4. The completion of the proof of the theorem. Let $B$ denote the C*-algebra of continuous sections of the field $B_s$. Since $e = \{e_s\}_{s\in]2,\infty]}$ is a continuous section of the field $B_s$, $eBe$ determines a structure of a continuous field of C*-algebras on $\{R_s\} = \{e_s B_s e_s\}$.

5. Structure of $\{R_s\}_{s\in]2,\infty]}$

We will start by giving an equivalent description of $R_s$. Recall that $P$ is the orthogonal projection onto the subspace of holomorphic functions in $L_s$. It is easy to check that
\[ p = \sum_{\gamma\in\Gamma} M_f\,P\,M_{\gamma(f)}\,U_\gamma \]
is a projection in $K(L_s) \rtimes_{red} \Gamma$.

Proposition 5.1. The algebra $p(K(L_s) \rtimes_{red} \Gamma)p$ is isomorphic to $R_s$.

Proof. Set
\[ w = \sum_\gamma M_f\,P\,\sqrt{T_{\gamma(f^2)}}\,U_\gamma. \]
Then $w$ is an element of the crossed product $K(L_s) \rtimes_{red} \Gamma$ and satisfies $ww^* = p$ and $w^*w = e_s$. Since $R_s = e_s(K(H_s) \rtimes_{red} \Gamma)e_s = e_s(K(L_s) \rtimes_{red} \Gamma)e_s$, the conclusion follows.

Let $R$ denote the linear space of all bounded operators $A$ on $H_s$ satisfying
• $\pi(\gamma)A\pi(\gamma)^* = A$, for all $\gamma \in \Gamma$;
• $\sum_\gamma \|M_f\,PAP\,M_{\gamma(f)}\| < \infty$.

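Proposition 5.2. The space $R$ is a ∗-subalgebra of $B(H_s)$.

Proof. The invariance of $R$ under the adjoint follows immediately from the fact that the $\pi(\gamma)$'s are unitary. The fact that it is closed under products is a fairly routine computation based on Lemma 2.4. In fact, suppose that we are given $A, B \in R$. Then
\begin{align*}
&\sum_{\gamma,\gamma_1} \bigl\| M_f\,PAP\,M_{\gamma_1(f)}\,\pi(\gamma_1)\,M_f\,PBP\,M_{\gamma_1^{-1}\gamma(f)}\,\pi(\gamma_1)^* \bigr\| \\
&\qquad \le \sum_{\gamma,\gamma_1} \|M_f\,PAP\,M_{\gamma_1(f)}\|\;\bigl\|\pi(\gamma_1)\,M_f\,PBP\,M_{\gamma_1^{-1}\gamma(f)}\,\pi(\gamma_1)^*\bigr\| \\
&\qquad = \sum_{\gamma,\gamma_1} \|M_f\,PAP\,M_{\gamma_1(f)}\|\;\|M_f\,PBP\,M_{\gamma_1^{-1}\gamma(f)}\| \\
&\qquad = \sum_{\gamma,\gamma_1} \|M_f\,PAP\,M_{\gamma_1(f)}\|\;\|M_f\,PBP\,M_{\gamma(f)}\| < \infty.
\end{align*}
Since $B$ commutes with $\pi(\gamma)$,
\[ \sum_{\gamma,\gamma_1} M_f\,PAP\,M_{\gamma_1(f)}\,\pi(\gamma_1)\,M_f\,PBP\,M_{\gamma_1^{-1}\gamma(f)}\,\pi(\gamma_1)^* = \sum_{\gamma,\gamma_1} M_f\,PAP\,M_{\gamma_1(f^2)}\,PBP\,M_{\gamma(f)}, \]
and
\[ \sum_{\gamma,\gamma_1} \bigl\| M_f\,PAP\,M_{\gamma_1(f^2)}\,PBP\,M_{\gamma(f)} \bigr\| < \infty. \tag{5.1} \]
By assumption,
\[ \sum_\gamma \|M_f\,PAP\,M_{\gamma(f^2)}\| \le \sum_\gamma \|M_f\,PAP\,M_{\gamma(f)}\|\,\|M_{\gamma(f)}\| < \infty. \]
By Lemma 2.4,
\[ \sum_\gamma M_f\,PAP\,M_{\gamma(f^2)} = M_f\,PAP, \]
therefore
\[ \sum_{\gamma_1} M_f\,PAP\,M_{\gamma_1(f^2)}\,PBP\,M_{\gamma(f)} = M_f\,P\,AB\,P\,M_{\gamma(f)}. \]
In view of (5.1) this shows that $AB \in R$.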

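The space $R$ is nontrivial. In fact,

Proposition 5.3. Any Toeplitz operator $T_\phi$ with continuous $\Gamma$-invariant symbol belongs to $R$.

Proof. For a given $\phi$ we choose $\phi_0, \psi \in C_c(D)$ such that $\phi = \sum_\gamma \gamma(\phi_0)$ and $\psi\phi_0 = \phi_0$. In this case $M_\psi M_{\phi_0} = M_{\phi_0}$ in $B(L_s)$. The argument of Lemma 2.2 gives
\[ \sum_\gamma \|M_f\,P\,M_{\gamma(\psi)}\| < \infty \quad\text{and}\quad \sum_\gamma \|M_{\phi_0}\,P\,M_{\gamma(f)}\| < \infty. \]
As in the proof of Proposition 5.2,
\[ \sum_{\gamma,\gamma_1} \bigl\| M_f\,P\,M_{\gamma_1(\psi)}\,\pi(\gamma_1)\,M_{\phi_0}\,P\,M_{\gamma_1^{-1}\gamma(f)}\,\pi(\gamma_1)^* \bigr\| < \infty, \]
and
\[ \sum_{\gamma_1} M_f\,P\,M_{\gamma_1(\psi)}\,\pi(\gamma_1)\,M_{\phi_0}\,P\,M_{\gamma_1^{-1}\gamma(f)}\,\pi(\gamma_1)^* = M_f\,P\,T_\phi\,P\,M_{\gamma(f)}. \]
Therefore $\sum_\gamma \|M_f\,P\,T_\phi\,P\,M_{\gamma(f)}\| < \infty$ and $T_\phi \in R$.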


Proposition 5.4. The map $\Psi : R \to p\,\ell^1(\Gamma, K(L_s))\,p \hookrightarrow R_s$ defined by
\[ A \mapsto p \Bigl( \sum_\gamma M_f\,PAP\,M_{\gamma(f)}\,U_\gamma \Bigr) p \]
is an isometric ∗-homomorphism.

Proof. The fact that $\Psi$ is a well-defined ∗-homomorphism follows from the computations done in the proof of Proposition 5.2. To show injectivity, suppose that $\Psi(A) = 0$. Since $\Psi(A)$ lies in $\ell^1(\Gamma, K(L_s))$, all its coefficients must be zero and hence also
\[ M_{f^2}\,PAP\,M_{\gamma(f^2)} = M_f\,(M_f\,PAP\,M_{\gamma(f)})\,M_{\gamma(f)}, \qquad \gamma \in \Gamma. \]
Now a double application of Lemma 2.4 gives
\[ 0 = \sum_\gamma \pi(\gamma) \Bigl( \sum_{\gamma_1} M_{f^2}\,PAP\,M_{\gamma_1(f^2)} \Bigr) \pi(\gamma)^* = A. \]
Taking into account the fact that we are dealing with projective unitary representations, a straightforward computation shows the following. For any $\xi \in \ell^2(\Gamma, L^2_s)$ finitely supported on $\Gamma$ (i.e. $\xi(g) = 0$ for all but finitely many $g$), we have
\begin{align*}
\langle \Psi(A)\xi \mid \xi \rangle
&= \sum_{g\in\Gamma} \Bigl( \sum_h \bigl\langle (A^{1/2} P M_f\,\xi)(g^{-1}h) \bigm| (A^{1/2} P M_f\,\xi)(h) \bigr\rangle \Bigr) \\
&= \Bigl\| \sum_h (A^{1/2} P M_f\,\xi)(h) \Bigr\|^2.
\end{align*}
Since finitely supported $\xi$'s are dense in $\ell^2(\Gamma, L^2_s)$ and elements of the form
\[ P \Bigl( \sum_h M_{h(f)}\,\pi(h)(\xi(h^{-1})) \Bigr) \]
with finitely supported $\xi$'s are dense in $H_s$, the equality above means that $A$ is positive as an operator on $H_s$ if and only if $\Psi(A)$ is a positive operator on $L^2_s$. Let $A \in R$. The operator $\|A\|^2 - A^*A$ is positive as an operator on $H_s$. Therefore
\[ 0 \le \Psi(\|A\|^2 - A^*A) = \|A\|^2 - \Psi(A)^*\Psi(A). \]
By the spectral theorem,
\[ \|\Psi(A)^*\Psi(A)\| \le \|A\|^2, \]
and hence $\Psi$ is a contraction. If we apply the same argument to the map $\Psi^{-1} : \Psi(R) \to R$, the claimed result follows.


Remark 5.5. By the two propositions above, $R_s$ contains the algebra $T^s_\Gamma$ of Toeplitz operators with $\Gamma$-invariant kernels. However it does not seem clear whether the map $\Psi : T^s_\Gamma \to R_s$ is surjective.

Suppose that $k_e(x, y) \in C_c(D \times D)$. The integral kernel of the operator $\pi(\gamma)\,\mathrm{Int}(k_e)\,\pi(\gamma)^*$ is given by
\[ k_\gamma(x, y) = \frac{1}{(\bar\beta x + \bar\alpha)^s}\,k_e(\gamma^{-1}x, \gamma^{-1}y)\,\frac{1}{(\beta\bar y + \alpha)^s}. \tag{5.2} \]
The family $\{\mathrm{supp}(k_\gamma)\}$ of subsets of $D \times D$ is locally finite. Hence
\[ K(x, y) = \sum_\gamma k_\gamma(x, y) \tag{5.3} \]
is a well-defined continuous function on $D \times D$, having the property
\[ \frac{1}{(\bar\beta x + \bar\alpha)^s}\,K(\gamma^{-1}x, \gamma^{-1}y)\,\frac{1}{(\beta\bar y + \alpha)^s} = K(x, y) \quad \text{for all } \gamma \in \Gamma. \tag{5.4} \]
Let $p(x, y)$ denote the Bergman kernel for $H_s$, i.e.
\[ p(x, y) = \frac{s-1}{\pi}\,\frac{1}{(1 - x\bar y)^s}. \tag{5.5} \]
Then $p$ satisfies the identity (5.4) above and hence, if we set $k(x, y) = \frac{K(x, y)}{p(x, y)}$, then $k$ becomes a $\Gamma$-invariant function on $D \times D$. Recall that a subset of $D \times D$ is called $\Gamma$-compact if its image in the quotient $(D \times D)/\Gamma$ (with the diagonal action of the group) is compact. The above assumption that $k_e$ is compactly supported implies that $k$ is $\Gamma$-compactly supported. In other words, given a compactly supported function $k_e(x, y)$ on $D \times D$, the function $K(x, y) = \sum_\gamma k_\gamma(x, y)$ is of the form
\[ k(x, y)\,p(x, y); \quad k \text{ is } \Gamma\text{-invariant and } \Gamma\text{-compactly supported.} \tag{5.6} \]
Conversely, given any $\Gamma$-invariant and $\Gamma$-compactly supported function $k$ on $D \times D$, we can set
\[ k_e(x, y) = f^2(x)\,k(x, y)\,p(x, y), \tag{5.7} \]
and it is straightforward to see that the above construction recovers $k$ from $k_e$. For simplicity, we will call any function of the form given by Eq. (5.6) a $\Gamma$-compact integral kernel.

Proposition 5.6. The operator $\mathrm{Int}(K)$ with a $\Gamma$-compact integral kernel is bounded and commutes with $\pi(\gamma)$, $\gamma \in \Gamma$.


Proof. We will use the criterion given by Proposition 2-7 of [16]. What we need to show is that
\[ \sup_{x\in D} \int_D |k(x, y)|\,\frac{(1 - |x|^2)^{s/2}(1 - |y|^2)^{s/2}}{|1 - x\bar y|^s}\,\frac{dm(y)}{(1 - |y|^2)^2} \tag{5.8} \]
and
\[ \sup_{y\in D} \int_D |k(x, y)|\,\frac{(1 - |x|^2)^{s/2}(1 - |y|^2)^{s/2}}{|1 - x\bar y|^s}\,\frac{dm(x)}{(1 - |x|^2)^2} \tag{5.9} \]
are finite. But, since both $k$ and the measure $\frac{(1 - |x|^2)^{s/2}(1 - |y|^2)^{s/2}}{|1 - x\bar y|^s}\,\frac{dm(y)}{(1 - |y|^2)^2}$ are $\Gamma$-invariant, the $\sup_D$ in both of the above expressions can be taken over $x$ and $y$ respectively in the (compact) fundamental domain for $\Gamma$, and hence both expressions are finite. The second claim of the proposition is obvious.

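Proposition 5.7. Let $\mathrm{Int}(K)$ be as in Proposition 5.6. Then the compression $P(\mathrm{Int}(K))P$ belongs to $R$.

Proof. Let us write
\[ K(x, y) = k(x, y)\,p(x, y) = \sum_\gamma \frac{1}{(\bar\beta x + \bar\alpha)^s}\,k_e(\gamma^{-1}x, \gamma^{-1}y)\,\frac{1}{(\beta\bar y + \alpha)^s} \]
with $k_e \in C_c(D \times D)$. We can choose a smooth function $\psi$ with compact support such that $\psi(x)k_e(x, y) = k_e(x, y)$. Now by arguments similar to the ones in the proof of Lemma 2.2,
\[ \sum_\gamma \|M_f\,P\,M_{\gamma(\psi)}\| < \infty \quad\text{and}\quad \sum_\gamma \|\mathrm{Int}(k_e)\,P\,M_{\gamma(f)}\| < \infty. \]
As in the proof of Proposition 5.3,
\[ \sum_{\gamma,\gamma'} \bigl\| M_f\,P\,\pi(\gamma)\,M_\psi\,\mathrm{Int}(k_e)\,\pi(\gamma)^*\,P\,M_{\gamma'(f)} \bigr\| < \infty, \]
and
\[ \sum_\gamma M_f\,P\,\pi(\gamma)\,M_\psi\,\mathrm{Int}(k_e)\,\pi(\gamma)^* = \sum_\gamma M_f\,P\,\pi(\gamma)\,\mathrm{Int}(k_e)\,\pi(\gamma)^* = M_f\,P\,\mathrm{Int}(K). \]
Therefore $\sum_\gamma \|M_f\,P\,\mathrm{Int}(K)\,P\,M_{\gamma(f)}\|$ converges and $P(\mathrm{Int}(K))P$ belongs to $R$ as claimed.

We are now ready to describe $R_s$ in terms of operators on $H_s$.

Theorem 5.8. Let $\mathcal{R}_s$ denote the C*-algebra of operators on $H_s$ generated by the operators of the form $P(\mathrm{Int}(K))P$ with $\Gamma$-compact invariant kernels. Then $\Psi$ is an isomorphism of $\mathcal{R}_s$ with $R_s$.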


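Proof. The only unproven part of the statement is the surjectivity of $\Psi$. But, since operators of the form $p(\mathrm{Int}(k_e)U_\gamma)p$ with $k_e$ compactly supported are in the range of $\Psi$, this follows immediately from the fact that the $\mathrm{Int}(k_e)U_\gamma$ form a dense family in the crossed product $K(L_s) \rtimes_{red} \Gamma$.

Remark 5.9. Note that we are working with three algebras acting on three different Hilbert spaces. In fact,
\begin{align*}
R_s = e_s(K(H_s) \rtimes_{red} \Gamma)e_s &\subset B(H_s \otimes \ell^2(\Gamma)), \\
p\,(K(L_s) \rtimes_{red} \Gamma)\,p &\subset B(L_s \otimes \ell^2(\Gamma)), \tag{5.10} \\
\mathcal{R}_s &\subset B(H_s),
\end{align*}
and
\[ \Psi : \mathcal{R}_s \longrightarrow p\,(K(L_s) \rtimes_{red} \Gamma)\,p \xrightarrow{\;\mathrm{ad}(Pw)\;} R_s. \]
We will conclude this section by computing the value of the unique normalized trace $\tau$ on $R_s \simeq \mathcal{R}_s$. By construction, for an element $A$ of $\mathcal{R}_s$, the normalised trace is given by
\[ \tau(A) = \frac{1}{(s-1)(g-1)}\,\mathrm{Tr}\bigl( \sqrt{T_{f^2}}\,A\,\sqrt{T_{f^2}} \bigr) = \frac{1}{(s-1)(g-1)}\,\mathrm{Tr}(T_{f^2}A). \]
Proceeding as in the proof of Lemma 2.2 we get
\[ \mathrm{Tr}\bigl( \sqrt{T_{f^2}}\,A\,\sqrt{T_{f^2}} \bigr) = \sum_n \langle T_{f^2}A\phi_n, \phi_n \rangle = \sum_n \int_D f^2(x)\,(A\phi_n)(x)\,\overline{\phi_n(x)}\,(1 - |x|^2)^{s-2}\,dm(x). \]
Denote by $e_x$ the evaluation vector, $\langle \phi, e_x \rangle = \phi(x)$. It is given by
\[ e_x(y) = \frac{s-1}{\pi}\,\frac{1}{(1 - \bar x y)^s}. \]
Then the integral above becomes
\begin{align*}
\sum_n \int_D f^2(x)\,\langle A\phi_n, e_x \rangle\,\overline{\phi_n(x)}\,(1 - |x|^2)^{s-2}\,dm(x)
&= \int_D f^2(x)\,\Bigl\langle A\Bigl( \sum_n \overline{\phi_n(x)}\,\phi_n \Bigr), e_x \Bigr\rangle\,(1 - |x|^2)^{s-2}\,dm(x) \\
&= \int_D f^2(x)\,\langle Ae_x, e_x \rangle\,(1 - |x|^2)^{s-2}\,dm(x).
\end{align*}
According to the definition [4] of the contravariant symbol $\hat A$ of $A$, we have
\[ \langle Ae_x, e_x \rangle = \hat A(x, x)\,\langle e_x, e_x \rangle = \frac{s-1}{\pi}\,\hat A(x, x)\,\frac{1}{(1 - |x|^2)^s}. \tag{5.11} \]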


Therefore
\[ \mathrm{Tr}\bigl( \sqrt{T_{f^2}}\,A\,\sqrt{T_{f^2}} \bigr) = \frac{s-1}{\pi} \int_D f^2(x)\,\hat A(x, x)\,(1 - |x|^2)^{-2}\,dm(x). \]
Since $\hat A(x, x)$ and $(1 - |x|^2)^{-2}\,dm(x)$ are $\Gamma$-invariant and $\{\gamma(f^2)\}$ is a $\Gamma$-invariant partition of unity, we get, as in [16],
\[ \tau(A) = \frac{1}{\pi(g-1)} \int_F \hat A(x, x)\,\frac{dm(x)}{(1 - |x|^2)^2}. \tag{5.12} \]

Proposition 5.10. The normalized invariant trace on $\mathcal{R}_s$ is given in terms of the contravariant Berezin symbol (cf. (5.11)) by
\[ \tau(A) = \frac{1}{\mathrm{Vol}(\Sigma)} \int_F \hat A(x, x)\,\frac{dm(x)}{(1 - |x|^2)^2}. \]

Proof. All that is left is to recall that, in terms of the invariant measure $(1 - |x|^2)^{-2}\,dm(x)$, $\mathrm{Vol}(\Sigma) = \pi(g-1)$.

6. Noncommutative Tori Revisited

Below we apply the above procedure to the case $g = 1$, i.e. $C(\mathbb{T}^2)$. The description will be a bit sketchy, since careful bookkeeping can fill in the details. The universal covering space of $\mathbb{T}^2$ is given by $\mathbb{R}^2$ with the natural action of the lattice $\mathbb{Z}^2 = \pi_1(\mathbb{T}^2)$. In particular we have a strong Morita equivalence between $C_0(\mathbb{R}^2) \rtimes \mathbb{Z}^2$ and $C(\mathbb{T}^2)$. The obvious (in this context) deformation of $C_0(\mathbb{R}^2)$ is given by the Moyal product, i.e. the twisted group C*-algebras of $\mathbb{R}^2$. To be more precise, fix $0 < \theta < 1$ and define a $\mathbb{T}$-valued group cocycle by
\[ \omega_\theta((r, s), (t, u)) = e^{2\pi i\theta st}, \qquad (r, s), (t, u) \in \mathbb{R}^2. \]
The cocycle $\omega_\theta$ restricts to the discrete subgroup $\mathbb{Z}^2$ and the noncommutative torus $T^2_\theta$ is defined (and denoted in what follows) by $A_\theta = C^*(\mathbb{Z}^2, \omega_\theta)$. Let $E_\theta$ be the ∗-algebra $S(\mathbb{R}^2, \omega_\theta)$, i.e.
• $S(\mathbb{R}^2, \omega_\theta) = S(\mathbb{R}^2)$ (Schwartz functions) as a linear space;
• $(\phi * \psi)(x) = \int \phi(y)\,\psi(x - y)\,\omega_\theta(y, x - y)\,dm(y)$, where $dm(y)$ is the Lebesgue measure on $\mathbb{R}^2$;
• $\phi^*(x) = \overline{\phi(-x)}\,\omega_\theta(x, -x)$.
The action of $\mathbb{Z}^2$ on the ∗-algebra $S(\mathbb{R}^2, \omega_\theta)$ given by
\[ \pi_n(\phi)(x) = e^{2\pi i x\cdot n}\,\phi(x), \qquad x \in \mathbb{R}^2,\; n \in \mathbb{Z}^2, \]
extends to an action on $C^*(\mathbb{R}^2, \omega_\theta)$ which will be denoted by $\alpha$.

Remark 6.1. Our noncommutative surface of genus one, as an analogue of $R_{1/\theta}$, is given by the unital reduction of the stable C*-algebra $C^*(\mathbb{R}^2, \omega_\theta) \rtimes_\alpha \mathbb{Z}^2$. We will describe its structure below and compare it with the standard noncommutative torus $A_\theta$.
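As a concrete reminder of what the standard noncommutative torus encodes, recall that for rational $\theta = p/q$ the defining relation $UV = e^{2\pi i\theta}VU$ admits a well-known finite-dimensional model by the $q \times q$ "clock" and "shift" matrices. The sketch below builds these matrices with plain Python lists and checks the relation; it illustrates the relation only (for irrational $\theta$ no finite-dimensional representation exists), and the function names are chosen here, not taken from the text.

```python
import cmath


def clock_and_shift(p, q):
    """Return (omega, U, V): U is the diagonal 'clock' matrix, V the cyclic 'shift'."""
    omega = cmath.exp(2j * cmath.pi * p / q)
    U = [[omega ** j if i == j else 0j for j in range(q)] for i in range(q)]
    # V sends the basis vector e_j to e_{j+1 mod q}.
    V = [[1 + 0j if i == (j + 1) % q else 0j for j in range(q)] for i in range(q)]
    return omega, U, V


def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def max_abs_diff(A, B, scalar=1.0):
    """Largest entrywise modulus of A - scalar * B."""
    n = len(A)
    return max(abs(A[i][j] - scalar * B[i][j]) for i in range(n) for j in range(n))


omega, U, V = clock_and_shift(2, 5)
UV, VU = matmul(U, V), matmul(V, U)
print(max_abs_diff(UV, VU, omega) < 1e-12)   # U V = exp(2 pi i theta) V U
print(max_abs_diff(UV, VU) > 0.1)            # U and V do not commute
```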


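Let us start with the following well known result.

Lemma 6.2. $C^*(\mathbb{R}^2,\omega_\theta)\rtimes_\alpha\mathbb{Z}^2$ is isomorphic to the C*-algebra $\mathcal{K}(L^2(\mathbb{R}))\otimes A_{1/\theta}$.

Proof. Under the Fourier transform $\mathcal{F}: L^2(\mathbb{R}\times\hat{\mathbb{R}})\to L^2(\mathbb{R}^2)$ the algebra $C^*(\mathbb{R}^2,\omega_\theta)$ is easily seen to be isomorphic to the norm closure $C$ of the algebra of operators on $L^2(\mathbb{R}\times\hat{\mathbb{R}})$ of the form

$$(\rho(a)\phi)(x,\eta) = \int e^{2\pi i(\xi-\eta)}\,a(x,\eta)\,\phi(x+\theta y,\xi)\,dy\,d\xi; \qquad a\in\mathcal{S}(\mathbb{R}^2).$$

The $\mathbb{Z}^2$-action becomes implemented by the unitary representation

$$U_{(p,q)}\phi(x,\xi) = \phi(x-p,\xi-q).$$

Let $V$ be the unitary operator on $L^2(\mathbb{R}\times\hat{\mathbb{R}})$ given by

$$V\phi(x,\xi) = e^{\frac{2\pi i}{\theta}x\xi}\,\phi(x,\xi).$$

Then

$$(V\rho(a)V^*\phi)(x,\xi) = \int e^{2\pi i(x-y)\cdot\eta}\,a(x,\theta\eta)\,\phi(y,\xi)\,dy\,d\eta.$$

In other words, $V\rho(a)V^* = \mathrm{Op}(a_\theta)\otimes I$, where $\mathrm{Op}(a_\theta)$ is the pseudodifferential operator with symbol $a_\theta(x,\xi) = a(x,\theta\xi)$. Since those generate $\mathcal{K}(L^2(\mathbb{R}))$, we get $VCV^* = \mathcal{K}(L^2(\mathbb{R}))\otimes I$. Moreover

$$(VU_{(m,n)}V^*\phi)(x,\xi) = e^{\frac{2\pi i}{\theta}(xn+m\xi-mn)}\,\phi(x-m,\xi-n).$$

It follows immediately that

$$\mathrm{Ad}\,V\,(C\rtimes\mathbb{Z}^2) = \bigl(\mathcal{K}(L^2(\mathbb{R}))\otimes I\bigr)\,C^*(U_{(1,0)},U_{(0,1)}) = \mathcal{K}(L^2(\mathbb{R}))\otimes A_{1/\theta}.$$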

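We let $A_\theta^\infty$ denote the dense subalgebra $\mathcal{S}(\mathbb{Z}^2,\omega_\theta)\subset A_\theta$. A right $A_\theta^\infty$-module structure on $E_\theta$ is given by

$$(\phi\cdot a)(x) = \sum_{p\in\mathbb{Z}^2}\phi(p)\,\psi(x-p)\,\omega_\theta(p,x-p).$$

Moreover $E_\theta$ is a pre-Hilbert C*-module over $A_\theta^\infty$ with inner product given by

$$\langle\phi|\psi\rangle_{A_\theta^\infty}(p) = \int_{\mathbb{R}^2}\phi(x)\,\psi(x+p)\,\omega_\theta(x,-p)\,dm(x).$$

Let us denote by $B_\theta$ the crossed product C*-algebra $B_\theta = C^*(\mathbb{R}^2,\omega_\theta)\rtimes_\alpha\mathbb{Z}^2$, and by $B_\theta^\infty$ the dense ∗-subalgebra $\mathcal{S}(\mathbb{Z}^2,\mathcal{S}(\mathbb{R}^2,\omega_\theta))$ of $\mathcal{S}(\mathbb{R}^2,\omega_\theta)$-valued rapidly decreasing functions on $\mathbb{Z}^2$. The algebra $B_\theta^\infty$ acts on $E_\theta$ from the left by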


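$$\Bigl(\sum_{m\in\mathbb{Z}^2} f_m U_m\Bigr)\cdot\phi = \sum_m f_m * \alpha_m(\phi).$$

We define a $B_\theta^\infty$-valued inner product on $E_\theta$ by

$$\langle\phi|\psi\rangle_{B_\theta^\infty} = \sum_m \phi * \alpha_m(\psi^*)\,U_m.$$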

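Lemma 6.3. The closure of $E_\theta$ is a strong Morita equivalence $C^*(\mathbb{R}^2,\omega_\theta)\rtimes_\alpha\mathbb{Z}^2$-$A_\theta$ bimodule.

Proof. The only fact needed to verify the claim is the equality

$$\langle\phi|\psi\rangle_{B_\theta^\infty}\cdot\pi = \phi\cdot\langle\psi|\pi\rangle_{A_\theta^\infty}; \qquad \phi,\psi,\pi\in E_\theta.$$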

But this is a straightforward consequence of the Poisson summation formula for a Schwartz function $\phi$:

$$\sum_{p\in\mathbb{Z}^2}\int_{\mathbb{R}^2}\phi(x)\,e^{-2\pi i p\cdot x}\,dx = \sum_{p}\phi(p).$$
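The Poisson summation formula used in the proof can be checked numerically on a simple test function. The following is only an illustrative sketch: the Gaussian $\phi(x) = e^{-\pi a|x|^2}$, the parameter `a` and the truncation `radius` are assumptions chosen for the example and do not appear in the text. For this $\phi$ the Fourier transform is known in closed form, $\hat\phi(p) = a^{-1}e^{-\pi|p|^2/a}$.

```python
import math

def lattice_sum(f, radius=12):
    """Sum f(p1, p2) over integer points with |p1|, |p2| <= radius (truncated Z^2 sum)."""
    return sum(f(p1, p2)
               for p1 in range(-radius, radius + 1)
               for p2 in range(-radius, radius + 1))

def phi(x1, x2, a):
    """Gaussian test function phi(x) = exp(-pi * a * |x|^2) on R^2 (illustrative choice)."""
    return math.exp(-math.pi * a * (x1 * x1 + x2 * x2))

def phi_hat(p1, p2, a):
    """Closed-form Fourier transform: int phi(x) exp(-2 pi i p.x) dx = exp(-pi |p|^2 / a) / a."""
    return math.exp(-math.pi * (p1 * p1 + p2 * p2) / a) / a

a = 0.5
lhs = lattice_sum(lambda p1, p2: phi_hat(p1, p2, a))  # sum over p of the Fourier integrals
rhs = lattice_sum(lambda p1, p2: phi(p1, p2, a))      # sum over p of the lattice values
print(lhs, rhs)
```

Both truncated sums converge very quickly for a Gaussian, so the two printed numbers should agree to floating-point accuracy.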

In particular we get the following well known result [17].

Corollary 6.4. The C*-algebras $A_\theta$ and $A_{1/\theta}$ are strongly Morita equivalent.

References

1. Anantharaman-Delaroche, C.: Systèmes dynamiques non commutatifs et moyennabilité. Math. Ann. 279, 297–315 (1987)
2. Anderson, J., Paschke, W.: The rotation algebra. Houston J. Math. 15, 1–26 (1989)
3. Baum, P., Connes, A.: Geometric K-theory for Lie groups and foliations. Preprint IHES, 1982
4. Berezin, F.A.: Quantisation. Math. USSR Izvestia 8 (1974)
5. Connes, A.: Noncommutative geometry. New York: Academic Press, 1994
6. Connes, A.: Noncommutative geometry and reality. J. Math. Phys. 36, 6194–6231 (1995)
7. Connes, A., Moscovici, H.: Cyclic cohomology, the Novikov conjecture and hyperbolic groups. Topology 29, 345–388 (1990)
8. de la Harpe, P.: Reduced C*-algebras of discrete groups which are simple with unique trace. In: Operator algebras and their connections with topology and ergodic theory, Proceedings, Busteni, Romania 1983, Lecture Notes in Math. 1132, pp. 230–253
9. Dixmier, J.: C*-algebras. Amsterdam: North Holland, 1977
10. Elliott, G.A., Natsume, T., Nest, R.: The Heisenberg group and K-theory. K-Theory 7, 409–428 (1993)
11. Engliš, M.: Some density theorems for Toeplitz operators on Bergman spaces. Czechoslovak Math. J. 40, 491–502 (1990)
12. Green, P.: C*-algebras of transformation groups with smooth orbit space. Pacific J. Math. 72, 71–97 (1977)
13. Klimek, S., Lesniewski, A.: Quantum Riemann surfaces I, The unit disc. Commun. Math. Phys. 146, 105–122 (1992)
14. Klimek, S., Lesniewski, A.: Quantum Riemann surfaces II, The discrete series. Lett. Math. Phys. 24, 125–139 (1992)
15. Lehner, J.: Discrete groups and automorphic forms. In: Automorphic forms, Edit.: J. Harvey, Providence, RI: AMS 1964, pp. 73–119
16. Radulescu, F.G.: On the Γ-equivariant form of the Berezin quantisation. Memoirs AMS 630, Providence, RI: AMS, 1998
17. Rieffel, M.A.: C*-algebras associated with irrational rotations. Pacific J. Math. 93, 415–429 (1981)
18. Rieffel, M.A.: Strong Morita equivalence of certain transformation group C*-algebras. Math. Ann. 222, 7–22 (1976)
19. Rieffel, M.A.: Continuous fields of C*-algebras coming from group cocycles and actions. Math. Ann. 283, 131–143 (1989)
20. Rieffel, M.A.: Deformation quantisation for actions of R^d. Memoirs AMS 506, Providence, RI: AMS, 1982
21. Rosenberg, J.: The role of K-theory in noncommutative algebraic topology. In: Operator Algebras and K-theory, Contemp. Math. 10, Providence, RI: AMS, 1982, pp. 155–182
22. Patterson, S.J.: On the cohomology of Fuchsian groups. Glasgow Math. J. 16, 123–140 (1975)

Communicated by H. Araki

Commun. Math. Phys. 202, 89 – 126 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Mass Generation in the Large N-Nonlinear σ-Model

C. Kopper

Centre de Physique Théorique de l’Ecole Polytechnique, F-91128 Palaiseau, France

Received: 30 March 1998 / Accepted: 19 September 1998

Abstract: We study the infrared behaviour of the two-dimensional Euclidean O(N) nonlinear σ-model with a suitable ultraviolet cutoff. It is proven that for a sufficiently large (but finite!) number N of field components the model is massive and thus has exponentially decaying correlation functions. We use a representation of the model with an interpolating bosonic field. This permits us to analyse the infrared behaviour without any intermediate breaking of O(N)-symmetry. The proof is simpler than that of the corresponding result for the Gross–Neveu model [1].

1. Introduction

We want to study the infrared behaviour of the two-dimensional Euclidean nonlinear σ-model [2] which is formally given in terms of the Lagrangian

$$L = \frac{N}{2\lambda}\Bigl[(\partial\phi)^2 + \frac{K}{4}(\phi^2-1)^2\Bigr]. \tag{1}$$

Here the constant K is assumed to be of order 1, whereas we assume $N \gg 1$; for λ see below. φ is a real-valued N-(flavour-)component bosonic field in the fundamental (vector) representation of O(N). The minimum of L is thus situated at $\phi^2 = 1$, where the value 1 may be changed by rescaling the field variable. The ultraviolet (UV) cutoff as well as more precise statements on the lower bound for N will be specified later. As regards λ, its value should not be much larger than 1, because otherwise the generated mass m approaches the UV cutoff, see below (20). If it is much smaller than 1, on the other hand, the effective energy range of the UV cutoff model becomes large and therefore the bounds, which involve factors of $\exp(4\pi/\lambda)$, deteriorate. The convergence proof then requires larger values of N. In the full renormalization group construction one would try to impose a condition λ ∼ 1 by fixing the renormalized coupling λ0 of


the last renormalization group step to obey that condition since in the full construction λ0 corresponds to our coupling λ. The standard nonlinear σ-model has the constraint on the field variable (which we call φ instead of σ)

$$\phi^2 = 1. \tag{2}$$

Condition (2) can be obtained from (1) by a suitable limit taking K → ∞.¹ Such a constraint however is immediately softened when starting from the model with a large UV cutoff on integrating out high frequency modes, even after the first renormalization group step in a renormalization group construction. This can be seen from the renormalization group construction of the hierarchical model which has been performed by Gawedzki and Kupiainen [4] and later also by Pordt and Reiss [5]. It is rather obvious anyhow: once there are (at least) two independent frequency modes, fluctuations of one may compensate those of the other such that the constraint (2) is restored for the sum. These fluctuations are not even highly improbable since neighbouring frequency modes may look similar in position space for frequencies close to the border line between the two. Thus we obtain for K a value of order 1 after the first step. The much more difficult part of the ultraviolet analysis of the model – so far only performed in the hierarchical case for N > 2 and as long as the effective coupling stays small – is to show that the Lagrangian (1) is a good approximation to the full model. That implies in particular that the model has only one marginal direction which is well represented by the quartic term in (1). So our starting point is reasonable when giving credit to the evidence based on the hierarchical approximation. This hierarchical analysis in turn agrees with the seminal papers on the model based on perturbation theory by Brézin, Le Guillou and Zinn-Justin [6], and the analysis of Brézin and Zinn-Justin [6] also agrees with ours on the IR side in the limit N → ∞. Furthermore the generally accepted view is confirmed by numerical simulations [7] and, which is of great importance in this respect, also by the Bethe Ansatz methods based on the exact S-matrix [3], which show in particular that the model has a mass gap.
Nevertheless these results are not fully based on well proven assumptions and are self-consistent rather than rigorous. So we note that on the other hand doubts against the general wisdom have been raised by Patrascioiu and Seiler [8]. We take a UV regularized version of (1) as our starting point. The scale is chosen such that the UV cutoff Λ is situated at Λ = 1. The situation in constructive field theory is often complicated by the fact that the expansions around the situation where the degrees of freedom are to some extent decoupled start from regularized versions which tend to violate symmetries of the model in question. The symmetries on the other hand often greatly simplify the perturbative analysis if, as is often the case, an invariant regularization for perturbation theory is at hand. Fortunately this time we are on the easy side: once we have introduced an interpolating field, which we now call σ, the whole analysis of the model can be performed without breaking the O(N)-symmetry, in complete agreement with the Mermin–Wagner theorem [9, 10]. When the one-component scalar σ-field has been introduced we may integrate out the φ-field thus obtaining a new interaction given by (the inverse of) a Fredholm determinant. For the UV cutoff model it is well-defined in finite volume. The infinite volume limit is taken in the end, once the cluster and Mayer expansions have been performed, which allow us to divide out the divergent vacuum functional.

¹ For the analysis of the model in that limit a lattice regularization is probably most appropriate, see also [25, 26] and the comments below.

The analysis of the Fredholm determinant proceeds similarly to that of the corresponding determinant in the case of


the Gross–Neveu Model [1]. It is simplified in the same way as the expansions are since we do not have to distinguish different zones characterized by the mean value of the σ-field – apart from the small field/large field splitting. The main new problem lies in the fact that for the inverted Fredholm determinant some of the estimates used to bound the determinant (together with antisymmetric tensor products generated by taking derivatives when cluster expanding, see [1], p. 169 and more generally [11]) are no longer valid. The problem is solved by deriving new bounds on inverted Fredholm determinants (in the last part of Sect. 3, to show stability), by introducing a finer splitting of the large field configurations before cluster expanding to make sure that the cluster expansion derivatives always produce small terms, and by evaluating the expansion derivatives through Cauchy formulae.

The paper is organized as follows: Our specific choices for the regulators and the basic definitions are presented in Sect. 2. They are dictated by technical simplicity. In Sect. 3 we perform the small/large field splitting and develop the bounds on the various terms in the action ensuing from that splitting, as well as on the non-local operator kernels appearing. In particular we show that all the kernels appearing fall off exponentially in the small field region. In Sect. 4 the cluster expansion is performed, which then allows us to control the thermodynamic limit and to prove the exponential fall-off of the (two-point) correlation function(s).

After submitting this paper we learned about two important references on the subject. First the author was not aware of Kupiainen’s work² [25]. Secondly, a few weeks after submission there appeared a preprint by Ito and Tamura [26]. We close the introduction by briefly commenting on these papers. Kupiainen regards the N-component nonlinear σ-model on a unit width lattice for arbitrary dimensions d.
He shows that the 1/N expansion is asymptotic above the spherical model critical temperature $T_S$, which is zero for d = 2. He also proves the existence of a mass gap for these temperatures and N sufficiently large. Without attaching much importance to the numerical side we just say what “N sufficiently large” means. We read from [25] (see Eq. (19)) for the two-dimensional case that for given inverse temperature β one needs $N > \mathrm{cst}\; e^{50\pi\beta}$. Since β is to be identified with the inverse coupling 1/λ in our language this is basically the same as our bound: we require $N^{-1/6} < \mathrm{cst}\; e^{-4\pi/\lambda}$ since the small factor per cluster expansion step (see end of Sect. 4.4) has to beat the factor $O(m^{-2})$ from the spatial integration per link. Similarly the authors of [26] state their result in Theorem 24 for $N > \mathrm{cst}\; e^{400\pi\beta}$ and β large. They regard the same model as Kupiainen, the N-component nonlinear σ-model on a unit width lattice, for d = 2. Thus [25] and [26] analyse the lattice version of (1) where the limit K → ∞ has been taken, i.e. the Heisenberg model. The result of [26] only concerns the free energy or partition function, which is shown to be analytic in β, given N as above. Correlation functions have not yet been treated. It seems clear however that their method of proof which, as ours, is based on a small/large field cluster expansion is well adapted for that case too.

² This important and beautiful contribution to constructive physics is maybe not as well known as it should be to those working in the field. In part this might be due to its title.

We prove exponential fall-off of the two-point


function; the extension to any connected n-point function is straightforward using the Mayer expansion formulae for those, see e.g. [19]. The change in Sect. 4.5 would consist in singling out a connecting tree now for n external points instead of two. Kupiainen’s result on the other hand is based on reflection positivity in the form of chess board estimates. It is not clear how the result on the exponential fall-off can be extended to general connected functions in this context, so strictly speaking (as he does) his result only holds for those correlation functions which have no nontrivial truncations.³ An important point shared by [25] and [26] (in fact the authors of [26] could have referred to [25] here) is that they both apply the Brydges–Federbush random walk representation to show and use exponential fall-off of the lattice kernels of $1/[p^2+m^2+i\sigma]$. In the continuum we only succeed in proving exponential fall-off for small fields σ. This is the main reason why we introduce a whole hierarchy of large field regions with larger and larger protection corridors (see (60)–(63)), and the fall-off over the corridors has to make up for the (possibly) absent fall-off in the large field domain. Apart from this [26] is technically closer to my paper than to [25]. It is more detailed on some aspects of the expansions. A number of bounds take a similar form here and in [26]. In [26] the building blocks of the cluster expansion are taken to be large also in the small field region. This has technical advantages; on the other hand treating many degrees of freedom as a whole generally tends to deteriorate the numerical bounds.

³ In special cases he succeeds in performing truncations by a clever use of certain Ward identities.

2. Presentation and Rigorous Definition of the Regularized Model

We want to show that the UV regularized large N σ-model is massive, i.e. that the correlation functions decay exponentially. In our explicit representation we will restrict to the two-point function, generalizations to arbitrary 2N-point functions being obvious. Thus formally we study the following object:

$$S_2(x,y) \sim \int D\vec{\phi}\;\phi_i(x)\,\phi_i(y)\;e^{-\frac{N}{2\lambda}\int\{(\partial\phi)^2+\frac{K}{4}(\phi^2-1)^2\}}. \tag{3}$$

Here $D\vec{\phi}$ indicates the product of (ill-defined) Lebesgue measures $D\phi_1,\ldots,D\phi_N$. Before giving sense to this expression mathematically by imposing suitable regulators we want to introduce the interpolating field σ as announced. We rewrite (3) as

$$S_2(x,y) \sim \int D\vec{\phi}\,D\sigma\;\phi_i(x)\,\phi_i(y)\;e^{-\frac{N}{2\lambda}\int\{(\partial\phi)^2+i(\phi^2-1)\sigma+\frac{1}{K}\sigma^2\}} \tag{4}$$

up to a global field-independent normalization factor. Now we can perform the Gaussian integrations over the φ-fields to obtain

$$S_2(x,y) \sim \int D\sigma\;\Bigl(\frac{1}{p^2+i\sigma}\Bigr)(x,y)\;{\det}^{-N/2}(p^2+i\sigma)\;e^{-\frac{N}{2\lambda K}\int\sigma^2+\frac{iN}{2\lambda}\int\sigma} \tag{5}$$

again up to a global field-independent normalization factor and on rescaling $\phi^2\to\phi'^2 = (N/2\lambda)\,\phi^2$. $\bigl(\frac{1}{p^2+i\sigma}\bigr)(x,y)$ denotes the position space kernel of the operator $\frac{1}{p^2+i\sigma}$. Its existence in $L^2(\mathbb{R}^2)$ say, will be clear once the cutoffs and thus the support of the measure


are specified below.⁴ As regards notation we will generally use the same letters for position and momentum space objects. This lack in precision in our eyes is overcompensated by the gain in suggestive shortness. For the same reason and on the basis of the previous remarks on the size of the constants appearing we will abbreviate by O(1) sums of products of N-independent constants the largest of which appearing will actually be $1/m^2$ (21). Without making this explicit we pay some attention not to collect astronomic numbers into O(1). By performing a translation of the field variable σ according to

$$\tau' = \sigma + i m^2 \tag{6}$$

we finally arrive at

$$S_2(x,y) \sim \int D\tau\;\Bigl(\frac{1}{p^2+m^2+ig\tau}\Bigr)(x,y)\;{\det}^{-N/2}\Bigl(1+\frac{1}{p^2+m^2}\,ig\tau\Bigr)\;e^{-\frac12\int\tau^2+i\sqrt{N}\bigl(\frac{m^2}{\sqrt{\lambda K}}+\sqrt{\frac{K}{4\lambda}}\bigr)\int\tau}. \tag{7}$$

This time the change of normalization stems from three sources: from the translation, from a change of normalization of the Fredholm determinant and from a rescaling of the τ′-field: $\tau'\to\tau = \sqrt{N/(\lambda K)}\;\tau'$. In (7) we introduced the coupling constant

$$g = \sqrt{\frac{\lambda K}{N}}. \tag{8}$$
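The passage from the exponent of (5) to that of (7) is a short piece of algebra that can be verified pointwise. The sketch below does so numerically for the integrands (per unit volume), assuming the substitution $\sigma = g\tau - im^2$ implied by (6) and (8); the helper names and the numerical values of N, λ, K and m² are illustrative assumptions only. The two exponents should differ by a τ-independent constant, which is absorbed into the normalization.

```python
import math

def exponent_sigma(sigma, N, lam, K):
    """Exponent density of (5): -(N / (2 lam K)) sigma^2 + (i N / (2 lam)) sigma."""
    return -(N / (2 * lam * K)) * sigma**2 + 1j * (N / (2 * lam)) * sigma

def exponent_tau(tau, N, lam, K, m2):
    """Exponent density of (7): -tau^2/2 + i sqrt(N) (m^2/sqrt(lam K) + sqrt(K/(4 lam))) tau."""
    coeff = math.sqrt(N) * (m2 / math.sqrt(lam * K) + math.sqrt(K / (4 * lam)))
    return -0.5 * tau**2 + 1j * coeff * tau

N, lam, K, m2 = 50.0, 1.3, 0.8, 0.02     # illustrative values only
g = math.sqrt(lam * K / N)               # coupling constant (8)

def offset(tau):
    """Difference of the two exponents after substituting sigma = g*tau - i*m^2."""
    sigma = g * tau - 1j * m2            # sigma = tau' - i m^2 with tau' = g tau
    return exponent_sigma(sigma, N, lam, K) - exponent_tau(tau, N, lam, K, m2)

# Expected tau-independent constant: N m^4 / (2 lam K) + N m^2 / (2 lam).
expected = N * m2**2 / (2 * lam * K) + N * m2 / (2 * lam)
offsets = [offset(t) for t in (-2.0, 0.0, 0.7, 3.1)]
print(expected, offsets)
```

The quadratic term becomes $-\tau^2/2$ because $N g^2/(2\lambda K) = 1/2$, which is the reason for the particular rescaling in (8).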

The value of the translation parameter m is fixed below ((17)–(21)) by a gap equation. This eliminates the term in the interaction exponential which is linear in τ, and this in turn is a prerequisite in the 1/N-expansion, since that term has a coefficient $\sim\sqrt{N}$. Before specifying the UV and IR regularizations we note that from the point of view of mathematical purity it would have been preferable to introduce them from the beginning. This however would have blown up the previous manipulations without a real gain since (3) and (7) are in fact to be viewed on equal footing as starting points: they both produce the same perturbation theory in 1/N.

⁴ When studying higher order correlation functions it is preferable to work in the space $L^2_N(\mathbb{R}^2)$ and to suppress the exponent N of det instead, because in this case the factor replacing $\bigl(\frac{1}{p^2+i\sigma}\bigr)(x,y)$ will depend on the flavour indices. We will adopt this convention only in the last part of the paper where it somewhat shortens the notation.

We now introduce the following regularizations:

UV1. We set the cutoff scale to be 1 and replace

$$p^2 \to p^2_{\mathrm{reg}} = p^2\,e^{p^2} \tag{9}$$

in (7).

UV2. We also introduce a UV cutoff for the τ-field. When tracing this back to the original interaction (1) it amounts to smoothing out the pointlike quartic $\phi^4$-interaction. To the expression $\int D\tau\,e^{-\frac12\int\tau^2}$ in (7) corresponds in rigorous notation integration with respect to the Gaussian measure $d\mu_\delta(\tau)$ with mean zero and covariance $C_\delta(x-y) = \delta(x-y)$, or in momentum space $C_\delta(p) = 1(p)$. We replace the δ-function by a regularized version


$$1(p) \to \frac{1}{1+\hat{f}(p)}, \qquad \hat{f}(p) = \sqrt{1+\pi(p)}\;f(p)\;\sqrt{1+\pi(p)}, \tag{10}$$

where π(p) is defined below, see (28). It is a smooth nonnegative function depending on $p^2$ only, bounded above by a constant of order $1/m^2$ (see (29)). f(p) also is a smooth nonnegative function depending on $p^2$ only. It vanishes in the origin, grows monotonically with $p^2$ such that $\alpha(p^2)^2 < f(p) < A(p^2)^2$ with suitable $0 < \alpha < A < \infty$, and fulfills

$$\Bigl(\frac{1}{1+f}\Bigr)(x-y) = 0, \quad\text{if } |x-y| > 1. \tag{11}$$

The last condition is the most important one. That all conditions are mutually compatible is rather credible. A proof is in the elementary Lemma 1 in [1].⁵ There a suitable f (which in [1] is further restricted by demanding that it should vanish of high order in the origin) is constructed explicitly, basically by starting from the characteristic function of the unit ball in position space $\mathbb{R}^2$ and taking linear combinations of rescaled convolutions thereof. We should note that it is by no means crucial to have a cutoff with these particular properties. Only sufficient fall-off of 1/(1 + f) in momentum and position space is required. So e.g. $1/(1+e^{p^2})$ would do. The compact support property (11) is however helpful when fixing the final covariance of the model, taking into account large field constraints, see (67). It eliminates further small correction terms of a similar nature as those appearing in $\delta C_\gamma$ (70), cf. the remark after (79). In short the UV cutoff on the τ-field replaces the ultralocal covariance δ(x − y) of this field by a smoothed compact support version of the δ-function sandwiched between the two $\sqrt{1+\pi}$-factors. The growth properties of f(p) restrict the support properties of the corresponding Gaussian measure $d\mu_f(\tau)$ to (real) continuous functions [12] and therefore we need not regularize expressions such as $\tau^2$, etc. In general we will view τ as an element of the real Hilbert space $L^2(\mathbb{R}^2,\mathbb{R})$.

⁵ In fact we did not prove monotonicity in [1]. This can however be achieved by a slight extension of the proof. We do not include it since monotonicity is not needed here, it might however be useful when performing a renormalization group construction on the same basis.

IR. As an intermediate IR regularization to be taken away in the end we also introduce a finite volume cutoff. To be definite we choose a square

$$\Lambda\subset\mathbb{R}^2 \tag{12}$$

centered at the origin with volume

$$|\Lambda| = 4n^2 \gg 1, \quad n\in\mathbb{N}. \tag{13}$$

We then restrict the support of the τ-field to Λ. But we do not restrict the Gaussian measure to Λ from the beginning, because this again would increase the number of correction terms later when we perform a configuration dependent change of covariance. We want to avoid this, but nevertheless want to suppress contributions in the measure supported outside Λ. We therefore introduce a term

$$\exp\Bigl(-R\int_{\mathbb{R}^2-\Lambda}\tau^2\Bigr), \quad R\gg 1 \tag{14}$$

in the functional integral, and take the limit R → ∞ later on. Note that absorbing this term in the measure and taking the limit right away would amount to restricting the


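covariance to Λ from the beginning [13]. Again our particular choices for the IR cutoff are convenient, but not crucial. With these preparations we now obtain the following rigorous expression for the regularized normalized two-point function:

$$S_2^\Lambda(x,y) = \frac{1}{\hat{Z}^\Lambda}\int d\mu_f(\tau)\;\Bigl(\frac{1}{p^2_{\mathrm{reg}}+m^2+ig\tau\chi_\Lambda}\Bigr)(x,y)\;{\det}^{-N/2}\Bigl(1+\frac{1}{p^2_{\mathrm{reg}}+m^2}\,ig\tau\chi_\Lambda\Bigr)\;e^{-R\int_{\mathbb{R}^2-\Lambda}\tau^2}\;e^{i\sqrt{N}\bigl(\frac{m^2}{\sqrt{\lambda K}}+\sqrt{\frac{K}{4\lambda}}\bigr)\int_\Lambda\tau}. \tag{15}$$

The partition function $\hat{Z}^\Lambda$ is given by

$$\hat{Z}^\Lambda = \int d\mu_f(\tau)\;{\det}^{-N/2}\Bigl(1+\frac{1}{p^2_{\mathrm{reg}}+m^2}\,ig\tau\chi_\Lambda\Bigr)\;e^{-R\int_{\mathbb{R}^2-\Lambda}\tau^2}\;e^{i\sqrt{N}\bigl(\frac{m^2}{\sqrt{\lambda K}}+\sqrt{\frac{K}{4\lambda}}\bigr)\int_\Lambda\tau}. \tag{16}$$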

Here $\chi_\Lambda$ is the sharp characteristic function of the set Λ in position space. Instead of $\chi_X$ we will mostly use $P_X$ to denote the orthogonal projector on the subspace of functions supported in X. From the bounds on the action given in the next section it is clear that $\hat{Z}^\Lambda$ will not vanish in finite volume (see (118)). In the following we will mostly suppress explicit reference to the regulators by reg and $\chi_\Lambda$ for shortness. As announced the value of m is fixed by imposing a gap equation eliminating the linear term in τ from the action, i.e. we demand:

$$i\sqrt{N}\Bigl(\frac{m^2}{\sqrt{\lambda K}}+\sqrt{\frac{K}{4\lambda}}\Bigr)\int_\Lambda\tau = \frac{N}{2}\,\mathrm{Tr}\Bigl(\frac{1}{p^2_{\mathrm{reg}}+m^2}\,ig\,\tau\chi_\Lambda\Bigr). \tag{17}$$

When evaluating the Tr, the term $\int_\Lambda\tau$ factorizes on both sides of (17), and we obtain the relation

$$\frac12\int\frac{d^2p}{(2\pi)^2}\,\frac{1}{p^2e^{p^2}+m^2} = \frac{m^2}{\lambda K}+\frac{1}{2\lambda}. \tag{18}$$

For a sharp cutoff at $p^2 = 1$ we would find from this

$$\frac{1}{4\pi}\ln\Bigl(\frac{1+m^2}{m^2}\Bigr) = 1/\lambda + 2\,\frac{m^2}{\lambda K} \tag{19}$$

with the solution

$$m^2 = e^{-\frac{4\pi}{\lambda}}\Bigl(1+O\Bigl(\frac{4\pi}{\lambda K}\,e^{-\frac{4\pi}{\lambda}}\Bigr)\Bigr). \tag{20}$$
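The sharp-cutoff gap equation (19) is transcendental in $m^2$ but easy to solve numerically. The following sketch uses plain bisection in $\log m^2$ and compares the root with the leading term $e^{-4\pi/\lambda}$ of (20). The function names, the bracket and the sample values λ = K = 1 are assumptions made for the illustration; this is not the bound-based argument of the paper for the exponential cutoff.

```python
import math

def gap_residual(log_m2, lam, K):
    """Left side minus right side of (19), as a function of log(m^2)."""
    m2 = math.exp(log_m2)
    lhs = math.log((1.0 + m2) / m2) / (4.0 * math.pi)
    rhs = 1.0 / lam + 2.0 * m2 / (lam * K)
    return lhs - rhs

def solve_gap(lam, K, lo=-60.0, hi=0.0, iters=200):
    """Bisection in log(m^2); the residual decreases monotonically as m^2 grows."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap_residual(mid, lam, K) > 0.0:
            lo = mid      # m^2 still too small
        else:
            hi = mid      # m^2 too large
    return math.exp(0.5 * (lo + hi))

lam, K = 1.0, 1.0                           # illustrative values only
m2 = solve_gap(lam, K)
ratio = m2 / math.exp(-4.0 * math.pi / lam)  # should be 1 up to exponentially small terms
print(m2, ratio)
```

For these sample values the ratio should differ from 1 only by a correction of the order indicated in (20).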

For the case of an exponential cutoff the integral cannot be evaluated analytically, but it is easy to find suitable upper and lower bounds saying that

$$m^2 = c_m\,e^{-\frac{4\pi}{\lambda}}, \tag{21}$$

where the constant $c_m$ is close to one (lies between 0.9 and 1.1) for λ ≤ 1. For definiteness we will assume from now on

$$2/\pi < \lambda < \pi \quad\text{so that}\quad e^{-10} < m < 1/6. \tag{22}$$

Taking into account the constraint (17) we thus obtain for the two-point function

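$$S_2^\Lambda(x,y) = \frac{1}{\hat{Z}^\Lambda}\int d\mu_f(\tau)\;\Bigl(\frac{1}{p^2+m^2+ig\tau}\Bigr)(x,y)\;{\det}_2^{-N/2}\Bigl(1+\frac{1}{p^2+m^2}\,ig\tau\chi\Bigr)\;e^{-R\int_{\mathbb{R}^2-\Lambda}\tau^2}, \tag{23}$$

where we used the standard definition

$${\det}_{n+1}(1+K) = \det(1+K)\;e^{-\mathrm{Tr}K+\frac12\mathrm{Tr}K^2+\ldots+(-1)^n\frac1n\mathrm{Tr}K^n} \tag{24}$$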

for any traceclass operator K and n ∈ N. In an expansion based on the parameter 1/N the canonical choice of covariance is such that it contains all terms of the action quadratic in the field τ, possibly up to terms which are suppressed for N → ∞. This is not yet the case for (23) since the term quadratic in τ from det is not suppressed for N large: it contains 1/N from $g^2$ and N from $\det^{-N/2}$ giving $N^0$ altogether. Thus the appropriate presentation of the two-point function is rather

$$S_2^\Lambda(x,y) = \frac{1}{Z^\Lambda}\int d\mu_C(\tau)\;\Bigl(\frac{1}{p^2+m^2+ig\tau}\Bigr)(x,y)\;{\det}_3^{-N/2}\Bigl(1+\frac{1}{p^2_{\mathrm{reg}}+m^2}\,ig\tau\Bigr)\;e^{-R\int_{\mathbb{R}^2-\Lambda}\tau^2}. \tag{25}$$

A corresponding change of definition has also been introduced when passing from $\hat{Z}^\Lambda$ to $Z^\Lambda$,

$$Z^\Lambda = \int d\mu_C(\tau)\;{\det}_3^{-N/2}\Bigl(1+\frac{1}{p^2_{\mathrm{reg}}+m^2}\,ig\,\tau\chi_\Lambda\Bigr)\;e^{-R\int_{\mathbb{R}^2-\Lambda}\tau^2}. \tag{26}$$

In (25), (26) $d\mu_C(\tau)$ represents the Gaussian measure with covariance

$$C = (1+\hat{f}+P_\Lambda\,\pi\,P_\Lambda)^{-1}. \tag{27}$$

$P_\Lambda$ is the orthogonal projector onto the subspace $L^2(\Lambda)$ of $L^2(\mathbb{R}^2)$, and π is the quadratic part in τ from det. In momentum space it is given as

$$\pi(p) = \frac{\lambda K}{2}\int\frac{d^2q}{(2\pi)^2}\,\frac{1}{q^2+m^2}\,\frac{1}{(p+q)^2+m^2} > 0. \tag{28}$$

Since the integral is UV convergent, it is largely independent of the cutoff functions $e^{p^2}$ which we did not write explicitly. We find in particular

$$\pi(0) = \frac{C_\pi}{8\pi}\,\frac{\lambda K}{m^2}, \tag{29}$$

where $C_\pi$ is again a constant close to 1. Furthermore one easily realizes that π ≤ π(0) in the operator sense, or in momentum space

$$\pi(0)-\pi(p) \ge 0. \tag{30}$$

This can be seen either by direct calculation or by noting that

$$(\tau,\pi(0)\,\tau) = \frac{N}{2}\,\mathrm{Tr}(VV^*) \ge \frac{N}{2}\,\mathrm{Tr}V^2 = (\tau,\pi\,\tau), \tag{31}$$

for the operator

$$V(\tau) = \frac{1}{p^2+m^2}\,g\,\tau. \tag{32}$$
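The statements (29) and (30) can be illustrated numerically if the cutoff factors are dropped, so that $C_\pi = 1$. The sketch below rests on an assumption not spelled out in the text, namely the standard Feynman-parameter form of the two-dimensional one-loop integral, $\int\frac{d^2q}{(2\pi)^2}\frac{1}{(q^2+m^2)((p+q)^2+m^2)} = \frac{1}{4\pi}\int_0^1\frac{dx}{m^2+x(1-x)p^2}$. The function names and the sample values are hypothetical.

```python
import math

def bubble(p2, m2, steps=20000):
    """(1/4pi) * int_0^1 dx / (m^2 + x(1-x) p^2), by Simpson's rule (steps must be even)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps + 1):
        x = k * h
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight / (m2 + x * (1.0 - x) * p2)
    return total * h / 3.0 / (4.0 * math.pi)

def pi_kernel(p2, lam, K, m2):
    """pi(p) of (28) as a function of p^2, with the cutoff factors dropped."""
    return 0.5 * lam * K * bubble(p2, m2)

lam, K, m2 = 1.0, 1.0, 0.01              # illustrative values only
pi0 = pi_kernel(0.0, lam, K, m2)
pi0_closed = lam * K / (8.0 * math.pi * m2)   # (29) with C_pi = 1
values = [pi_kernel(p2, lam, K, m2) for p2 in (0.0, 0.01, 0.1, 1.0, 10.0)]
print(pi0, pi0_closed, values)
```

In this form the integrand decreases pointwise as $p^2$ grows, which makes the inequality (30) visible directly.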

By (τ, π τ) we denote the scalar product, which is given by $\int\tau(x)\,\pi(x-y)\,\tau(y)$ in position space. Note that V has real expectation values in the real Hilbert space $L^2(\Lambda,\mathbb{R})$. For later use we collect the following facts about the operator π and (some functions of) the kernels of π and of $1/(p^2e^{p^2}+m^2)$ in position space.

Lemma 1.
a) The operator π fulfills: 0 ≤ π ≤ π(0).
b) The kernel of π in position space denoted as π(x − y) (using translation invariance) satisfies:
i) $|\pi(x-y)| \le O(1)\,e^{-2m|x-y|}$,
ii) furthermore $|(\sqrt{1+\pi})^{\pm1}(x-y)| \le O(1)\,e^{-2m|x-y|}$ for $x\neq y$.
c) $0 < [1/(p^2e^{p^2}+m^2)](x-y) < O(1)\exp\{-m|x-y|\}$.

Proof. The proof of a) was given previously. The statement b)i) follows from standard analyticity arguments: π(p) is analytic in momentum space for $(\mathrm{Im}\,p)^2 \le 4m^2$ as is directly seen from the integrand in (28) by shifting the integration variable q by p/2. The main reason to choose the analytic regulator function $e^{p^2}$ was that it does preserve (and even slightly enlarge)⁶ this analyticity domain so that b)i) follows. Coming now to the statement in b)ii) we first note that the condition $x\neq y$ eliminates the δ-distribution contribution to the kernel so that we may regard in fact

$$(\sqrt{1+\pi})^{\pm1}-1. \tag{33}$$

As compared to b)i) we now also have to verify that the real part of 1 + π stays positive for $(\mathrm{Im}\,p)^2 \le 4m^2$ (so as to exclude a cut, i.e. a violation of analyticity due to the square root). Again explicit calculation simply reveals this to be the case, where the regularization again slightly improves the situation. c) This statement was proven in [1], Lemma 5. The lower bound follows from the representations

$$\frac{1}{p^2e^{p^2}+m^2} = e^{-p^2}\,\frac{1}{p^2+1}\sum_{n=0}^{\infty}\Bigl(\frac{1}{p^2+1}-m^2\,\frac{1}{p^2+1}\,e^{-p^2}\Bigr)^n \tag{34}$$

and

$$\frac{1}{p^2+1}-m^2\,\frac{1}{p^2+1}\,e^{-p^2} = \frac{1}{p^2+1}\,(1-m^2e)+m^2e\int_0^1 ds\;e^{-s(p^2+1)}. \tag{35}$$

Since $m^2e < 1$ it becomes now obvious by explicit calculation of the Fourier transforms that the kernel of $\frac{1}{p^2e^{p^2}+m^2}$ is pointwise positive.⁷

⁶ by a factor of $|O(1)\,m^2|$, see also [1]
⁷ This fact will be useful later (see in particular (173)), but it is not crucial.
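The role of the regularized determinants of (24) in the 1/N counting can be illustrated on a toy model in which the operator is replaced by a finite list of eigenvalues. In the sketch below everything is an assumption made for illustration: the eigenvalues `mus` stand in for a hypothetical finite-rank version of $\tau/(p^2+m^2)$, the operator in the determinant is modelled as $ig\mu_j$ with $g = \sqrt{\lambda K/N}$, and `lamK` is a sample value. The point is only that $-(N/2)\log\det_2$ stays of order $N^0$, while $-(N/2)\log\det_3$ decays like $N^{-1/2}$.

```python
import cmath
import math

def log_det_n(eigs, n):
    """log det_n(1+K) from eigenvalues k of K: sum of log(1+k) minus its Taylor terms up to k^(n-1)."""
    total = 0j
    for k in eigs:
        taylor = sum((-1) ** (j + 1) * k ** j / j for j in range(1, n))
        total += cmath.log(1 + k) - taylor
    return total

mus = [1.0, 0.5, 0.25]      # hypothetical eigenvalues standing in for tau / (p^2 + m^2)
lamK = 1.0                  # illustrative value of lambda * K

def remnant(N, n):
    """-(N/2) log det_n(1 + i g M) for the toy eigenvalues, with g = sqrt(lamK / N)."""
    g = math.sqrt(lamK / N)
    return -(N / 2.0) * log_det_n([1j * g * mu for mu in mus], n)

# Leading coefficients expected from the Taylor expansion of log(1+k).
quad_coeff = lamK * sum(mu ** 2 for mu in mus) / 4.0          # |remnant(N, 2)| for large N
cubic_coeff = lamK ** 1.5 * sum(mu ** 3 for mu in mus) / 6.0  # |remnant(N, 3)| * sqrt(N)

for N in (1e2, 1e4, 1e6):
    print(N, abs(remnant(N, 2)), abs(remnant(N, 3)) * math.sqrt(N))
```

The second printed column should approach `quad_coeff` and the third `cubic_coeff`, which mirrors the observation that the quadratic term must be moved into the covariance before expanding in 1/N.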


3. Small/Large Field Decomposition and Bounds

The representation of the correlation functions according to (25), (26) is well-suited for an expansion in 1/N, since the remnants of the action left in $\det_3$ are all suppressed by factors of $1/\sqrt{N}$ or smaller. We then have to bound the contributions from $\det_3$ for large values of the field variable τ to show that it is integrable with respect to the Gaussian measure $d\mu_C(\tau)$. From our starting point we presume that this should be possible, since there the action was manifestly integrable. However, to obtain a convergent expansion of the correlation functions we have to perform a cluster expansion which makes visible the decoupling of the degrees of freedom with increasing separation in space. The cluster expansion interpolation formulae modify all nonlocal kernels of the theory, the modification being different for the measure and $\det_3$. Therefore one global bound is not sufficient. What we rather need are local bounds per degree of freedom. The solution we adopt is similar to that in [1], with simplifications due to the fact that we only have one phase, and complications due to the fact that the model is not fermionic in origin. The latter implies that certain sign cancellations due to the Pauli principle are absent in the outcome of the cluster expansion and necessitates finer distinctions on the size of the τ-field than in [1]. We distinguish between small and (a series of) large field configurations depending on the size of $\int_\Delta\tau^2$, where Δ is any (closed) unit square in Λ with lower left corner coordinates $(n_1,n_2)\in\mathbb{Z}^2$. Then we sum over the possible choices for all squares. For a given configuration we take the union of large field squares, enlarge this region by adding all squares below some finite distance from those and divide (roughly speaking) the enlarged region into its connected components.
In the interior of any such component we do not introduce interpolation parameters, it is even reasonable not to absorb the quadratic part of $\det_2$ in the covariance there. Rather we use the large field criteria and certain bounds on inverted Fredholm determinants to show that these regions are suppressed in probability per large field square Δ and according to the size of $\int_\Delta\tau^2$. Then the expansion is largely restricted to the small field region, where the integrability of $\det_3$ is assured due to the small field criterion anyway. As usual such a cluster expansion with constraints goes hand in hand with a certain amount of combinatorics and technicalities coming from all sorts of correction terms. These are controlled by means of the large value of N. We are now going to make this reasoning precise. We subdivide the volume Λ into the $4n^2$ unit squares Δ specified above and regard some given $\tau\in L^2(\Lambda)$. We say that Δ ⊂ Λ is a large field square w.r.t. τ if

$$\lambda K\int_\Delta\tau^2 \ge N^{1/6}, \tag{36}$$

and Δ ⊂ Λ is a small field square w.r.t. τ if

$$\lambda K\int_\Delta\tau^2 < N^{1/6}. \tag{37}$$

We introduce a smoothed monotonic step function $\theta \in C^\infty(\mathbb{R})$ fulfilling

$$\theta(x) = \begin{cases} 0 & \text{for } x \le -1/4, \\ 1 & \text{for } x \ge 1/4. \end{cases} \tag{38}$$


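Then we also introduce $4n^2$ factors of 1 into the functional integral according to

$$1_\Delta = \theta\Bigl(\frac{\lambda K \|\tau_\Delta\|_2^2}{N^{1/6}} - 1\Bigr) + \Bigl(1 - \theta\Bigl(\frac{\lambda K \|\tau_\Delta\|_2^2}{N^{1/6}} - 1\Bigr)\Bigr). \tag{39}$$

In (39) we set as usual

$$\|\tau_\Delta\|_2^2 = \int_\Delta \tau^2. \tag{40}$$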

Now the first factor is decomposed further, writing

$$\theta\Bigl(\frac{\lambda K \|\tau_\Delta\|_2^2}{N^{1/6}} - 1\Bigr) = \sum_{n=1}^{\infty} \Bigl[\theta\Bigl(\frac{\lambda K \|\tau_\Delta\|_2^2}{N^{n/6}} - 1\Bigr) - \theta\Bigl(\frac{\lambda K \|\tau_\Delta\|_2^2}{N^{(n+1)/6}} - 1\Bigr)\Bigr] =: \sum_{n=1}^{\infty} \theta_n(\|\tau_\Delta\|_2^2). \tag{41}$$

We then may rewrite (25), (26) as a sum of $2^{4n^2}$ terms, each carrying for any square $\Delta$ a factor which is either the first or the second summand in (39). For a square carrying the first factor the functional integral is then split up further according to (41). To fix the language we say

Definition. A square $\Delta$ carrying the factor

$$\theta^s_\Delta(\tau) := 1 - \theta\Bigl(\frac{\lambda K \|\tau_\Delta\|_2^2}{N^{1/6}} - 1\Bigr) \tag{42}$$

is called a small field or s-square. A square $\Delta$ carrying the factor

$$\theta^l_\Delta(\tau) := \theta\Bigl(\frac{\lambda K \|\tau_\Delta\|_2^2}{N^{1/6}} - 1\Bigr) \tag{43}$$

is called a large field or l-square. More specifically we call it an $l^n$-square if it carries a factor $\theta_n(\|\tau_\Delta\|_2^2)$ resulting from the splitting (41).
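The decomposition (39), (41) can be illustrated numerically. The following is only a sketch under stated assumptions: the text fixes $\theta$ solely through (38), so the concrete smooth step `smooth_step` below is one admissible choice among many, the variable `y` stands for $\lambda K \|\tau_\Delta\|_2^2$, and all function names are hypothetical.

```python
import math


def smooth_step(x: float) -> float:
    """One possible C-infinity monotone step: 0 for x <= -1/4, 1 for x >= 1/4.

    Assumption: the paper only requires property (38); this explicit
    formula is an illustrative choice, not the author's.
    """
    def g(t: float) -> float:
        return math.exp(-1.0 / t) if t > 0 else 0.0

    a, b = g(x + 0.25), g(0.25 - x)
    return a / (a + b)


def theta_small(y: float, N: float) -> float:
    """Small field factor (42); y stands for lambda*K*||tau_Delta||_2^2."""
    return 1.0 - smooth_step(y / N ** (1 / 6) - 1.0)


def theta_n(y: float, N: float, n: int) -> float:
    """n-th large field factor from the telescoping splitting (41)."""
    return (smooth_step(y / N ** (n / 6) - 1.0)
            - smooth_step(y / N ** ((n + 1) / 6) - 1.0))


def large_levels(y: float, N: float) -> list:
    """Levels n >= 1 with theta_n(y) != 0 (finitely many for N > 1)."""
    levels, n = [], 1
    # theta(y / N^{n/6} - 1) vanishes as soon as the ratio is <= 3/4
    while y / N ** (n / 6) > 0.75:
        if theta_n(y, N, n) != 0.0:
            levels.append(n)
        n += 1
    return levels
```

For example, with $N = 10^6$ (so $N^{1/6} = 10$) the value `y = 500` activates only the level $n = 2$, and the small field factor plus all large field factors sum to one for every `y`.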

An l-square then only contributes to the functional integral if

$$\lambda K \|\tau_\Delta\|_2^2 \ge \frac{3}{4} N^{1/6}, \tag{44}$$

an $l^n$-square only if

$$\frac{5}{4} N^{(n+1)/6} > \lambda K \|\tau_\Delta\|_2^2 \ge \frac{3}{4} N^{n/6}, \tag{45}$$

and an s-square only contributes if

$$\lambda K \|\tau_\Delta\|_2^2 < \frac{5}{4} N^{1/6}, \tag{46}$$

so that we will always assume the respective inequality to hold once a square has been specified to be $l$, $l^n$ or $s$, since in this paper we are only bounding contributions to the functional integral. As regards notation we will write $P_s$, $P_l$, $P_{l^n}$ and $P_\Delta$ for the orthogonal projectors onto functions with support in $\Lambda_s$, $\Lambda_l$, $\Lambda_{l^n}$ and $\Delta$ respectively. Here we denote by $\Lambda_s \subset \Lambda$, resp. $\Lambda_l \subset \Lambda$, resp. $\Lambda_{l^n} \subset \Lambda$ the set of small field, resp. large field, resp. $l^n$-squares in $\Lambda$. Note $\Lambda_s \cup \Lambda_l = \Lambda$, $\bigcup_{n \in \mathbb{N}} \Lambda_{l^n} = \Lambda_l$. Before proceeding further with the l/s decomposition we want to show that the small field condition is sufficient to obtain a small upper bound in norm on the operator appearing in det:


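Proposition 2. For $\tau \in L^2(\Lambda)$ let $\Lambda_s \subset \Lambda$ be a collection of unit squares such that for $\Delta \in \Lambda_s$ we have

$$\lambda K \|\tau_\Delta\|_2^2 < \frac{5}{4} N^{1/6}. \tag{47}$$

Then the operator norm of $A_s : L^2(\Lambda) \to L^2(\Lambda)$ satisfies:

$$\|A_s\| \le O(1) N^{-5/12} \le N^{-2/5}. \tag{48}$$

Here $A$ is the operator $P_\Lambda \frac{1}{p^2_{reg} + m^2}\, g\tau\, P_\Lambda$, and $A_s$ is defined to be $P_s A P_s$.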

Proof. We first regard $A_\Delta = P_\Delta A P_\Delta$ for $\Delta \in \Lambda_s$. For $\varphi \in L^2(\Delta)$ and $\|\varphi\|_2 = 1$ we find:

$$|(A_\Delta \varphi, A_\Delta \varphi)| \le g^2 \int_{x,y,z} |\varphi(x)\tau(x) F(x-y) \chi_\Delta(y) F(y-z) \tau(z)\varphi(z)| \le g^2 F^2(0) \int_{x,y} |\varphi(x)\tau(x)\tau(y)\varphi(y)| \le g^2 F^2(0) \int_{x \in \Delta} \tau^2(x) < \frac{5}{4} F^2(0)\, N^{-5/6}. \tag{49}$$

Here $F(x-y)$ is the pointwise positive kernel (see Lemma 1)

$$F(x-y) = \int \frac{d^2 q}{(2\pi)^2}\, \frac{e^{iq(x-y)}}{q^2 e^{q^2} + m^2}, \tag{50}$$

which is obviously bounded by its value at 0, which in turn is bounded by $O(1/\lambda)$, which we absorb in $O(1)$, which we bound by $N^{-2/5+5/6}$. This proves the assertion for a single square $\Delta$. To go from here to the general case one has to exploit the exponential fall-off of the kernel $F(x-y)$ (Lemma 1), which deteriorates the bound by a factor of $O(1/m^2)$, which we absorb in $O(1)$ and bound again by $N^{-2/5+5/6}$. So now let $\varphi \in L^2(\Lambda)$ with $\|\varphi\|_2 = 1$,

$$|(A_s \varphi, A_s \varphi)| \le \sum_{\Delta, \Delta', \Delta'' \in \Lambda_s} |(A P_\Delta \varphi, P_{\Delta'} A P_{\Delta''} \varphi)| \le O(1)\, g^2 \sum_{\Delta, \Delta', \Delta'' \in \Lambda_s} \exp\{-m(\mathrm{dist}(\Delta, \Delta') + \mathrm{dist}(\Delta', \Delta''))\} \times \int_{x,y} |\tau(x)\chi_\Delta(x)\varphi(x)\,\tau(y)\chi_{\Delta''}(y)\varphi(y)|. \tag{51}$$

By performing first the sum over $\Delta''$ and then over $\Delta'$ and using the bound on $\tau$, the Schwarz inequality and the fact that $\varphi$ is normalized, we obtain the bound

$$O(1)\, N^{-1} N^{1/12} \sum_{\Delta \in \Lambda_s} \Bigl(\int_\Delta \varphi^2\Bigr) \Bigl(\int_\Delta \tau^2\Bigr)^{1/2} \le O(1)\, N^{-5/6}. \tag{52}$$

This ends the proof.
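The powers of $N$ in Proposition 2 can be tracked with exact fractions. This is a bookkeeping sketch only: it assumes that the last factor in (52) contributes $N^{1/12}$ through the small field condition (47), and it shows how much room the $O(1)$ constant has in the step $O(1) N^{-5/12} \le N^{-2/5}$ of (48). The variable names are illustrative.

```python
from fractions import Fraction as F

# Exponents of N in (52): the explicit N^{-1} N^{1/12} and a further N^{1/12}
# from (int_Delta tau^2)^{1/2} on a small field square (assumption: via (47)).
norm_squared_exponent = F(-1) + F(1, 12) + F(1, 12)

# (52) bounds ||A_s||^2, so the norm itself carries half of that exponent.
norm_exponent = norm_squared_exponent / 2

# Room left for the O(1) constant when passing to N^{-2/5} in (48).
slack_exponent = F(-2, 5) - norm_exponent
```

The slack is $N^{1/60}$, which is why the $O(1)$ constants can be absorbed only for $N$ sufficiently large.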


As announced we want, for given l/s-regions, to enlarge the l-regions by security belts of sufficient width such that the fall-off of the kernels from Lemma 1 will produce a small factor if the kernels have to bridge these belts. This procedure generally will merge together some of the different connected components of the l-region. Let $\Lambda_l^1, \ldots, \Lambda_l^r$ be the connected components of $\Lambda_l$. We say there is a connectivity link between $\Lambda_l^i$ and $\Lambda_l^j$, $1 \le i, j \le r$, $i \ne j$, if there exists some $\Delta_i \in \Lambda_l^i$ and some $\Delta_j \in \Lambda_l^j$ such that there exists $\Delta \in \Lambda$ with

$$\mathrm{dist}(\Delta_i, \Delta) + \mathrm{dist}(\Delta_j, \Delta) \le 2M, \tag{53}$$

where we choose for definiteness

$$M = \frac{2}{m} \ln N. \tag{54}$$

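Then we call $l_1, \ldots, l_s$ the maximal subsets of $\Lambda_l$ connected by connectivity links and call them connectivity components. Obviously $s \le r$. Now we set

$$\Gamma = \Gamma(l) = \{\Delta \subset \Lambda \mid \mathrm{dist}(\Delta, \Lambda_l) \le M\} \tag{55}$$

and

$$\Gamma_i = \Gamma(l_i) = \{\Delta \subset \Lambda \mid \Lambda_l^k \subset l_i,\ \mathrm{dist}(\Delta, \Lambda_l^k) \le M\}. \tag{56}$$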
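The grouping into connectivity components and the construction of the belts $\Gamma_i$ can be sketched on a small grid. This is a minimal illustration under stated assumptions, not the author's procedure: squares are represented by their lower left corners, `square_dist` takes the Euclidean distance between closed unit squares, and the link criterion (53) is applied directly to pairs of large field squares (touching squares are linked trivially, so the resulting classes coincide with classes of connected components). All names are hypothetical.

```python
import math


def square_dist(a, b):
    """Euclidean distance between closed unit squares given by lower left corners."""
    dx = max(abs(a[0] - b[0]) - 1, 0)
    dy = max(abs(a[1] - b[1]) - 1, 0)
    return math.hypot(dx, dy)


def connectivity_components(large, volume, M):
    """Group large field squares into classes linked as in (53)."""
    large = sorted(large)
    parent = {s: s for s in large}

    def find(s):
        # union-find root lookup with path halving
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for i, a in enumerate(large):
        for b in large[i + 1:]:
            # link if some square d of the volume has dist(a,d) + dist(b,d) <= 2M
            if any(square_dist(a, d) + square_dist(b, d) <= 2 * M for d in volume):
                parent[find(a)] = find(b)

    classes = {}
    for s in large:
        classes.setdefault(find(s), []).append(s)
    return sorted(classes.values())


def belt(component, volume, M):
    """Squares of the volume within distance M of the component, cf. (56)."""
    return [d for d in volume if min(square_dist(d, s) for s in component) <= M]
```

With `volume = [(i, j) for i in range(-10, 10) for j in range(-10, 10)]`, the squares `(1, 0)` and `(6, 0)` end up in the same class for `M = 2` but in different classes for `M = 1`.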

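Thus there is a one-to-one relation between the $\Gamma_i$ and the $l_i$, and the $\Gamma_i$ are connected⁸ (in the standard sense), and we have

$$\Gamma_i \cap \Gamma_j = 0 \ \text{ for } i \ne j, \quad \text{and} \quad \bigcup_{1}^{n} \Gamma_i = \Gamma. \tag{57}$$

In set-theoretic relations we always denote by 0 a set of (standard) Lebesgue measure 0. We also introduce the sets $\gamma_i$ which (roughly speaking) lie between $l_i$ and $\Gamma_i$:

$$\gamma_i = \{\Delta \subset \mathbb{R}^2 \mid \Lambda_l^k \subset l_i,\ \mathrm{dist}(\Delta, \Lambda_l^k) \le M/2\}, \quad \gamma = \bigcup_{1}^{n} \gamma_i \tag{58}$$

so that

$$\mathrm{dist}(\gamma, \Lambda - \Gamma) \ge M/2 - \sqrt{2}. \tag{59}$$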

Note that for technical reasons we have defined $\gamma$ as a subset of $\mathbb{R}^2$, not necessarily of $\Lambda$. We do so because this definition of $\gamma$ is useful when fixing the covariance in the presence of large field configurations such that it has good positivity and fall-off properties (see (67)–(70) and Lemma 3). The previous definitions now are extended to the situation where we split up further the $\Lambda_l$-region into the components $\Lambda_{l^n}$. If the size of the field is very large we also need very large security belts to protect our large field regions, such that the decay of the kernel across this belt again assures a small contribution. We start again from the connected components $\Lambda_l^1, \ldots, \Lambda_l^r$ of $\Lambda_l$ and say that there is an e-connectivity link (or extended connectivity link) between $\Lambda_l^i$ and $\Lambda_l^j$, $1 \le i, j \le r$, $i \ne j$, if there exists some $\Delta_i \in \Lambda_l^i \cap \Lambda_{l^{n'}}$ and some $\Delta_j \in \Lambda_l^j \cap \Lambda_{l^{n''}}$ such that there exists $\Delta \in \Lambda$ with

$$\mathrm{dist}(\Delta_i, \Delta) + \mathrm{dist}(\Delta_j, \Delta) \le (n' + n'') M. \tag{60}$$

⁸ It requires some (elementary) work to really give an explicit proof of that fact, which amounts basically to transferring the square $\Delta$ constituting the connectivity link between $\Delta_i$ and $\Delta_j$ to the centre of a line of minimal length connecting $\Delta_i$ and $\Delta_j$ and showing that then either this transferred square or two of its neighbours touching each other connect together $\Delta_i$ and $\Delta_j$ within some $\Gamma_k$. We skip the proof since it is not crucial for us that the $\Gamma_i$ are connected.

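The e-connectivity components are then the maximal subsets of $\Lambda_l$ connected by e-connectivity links. We call them $l_i^e$, $1 \le i \le s'$, and obviously $s' \le s \le r$. Now we set

$$\Gamma^e = \Gamma^e(l) = \bigcup_n \{\Delta \subset \Lambda \mid \mathrm{dist}(\Delta, \Lambda_{l^n}) \le n M\} \tag{61}$$

and

$$\Gamma^e_i = \Gamma^e(l_i^e) = \bigcup_n \{\Delta \subset \Lambda \mid \Lambda^k_{l^n} := \Lambda_{l^n} \cap \Lambda_l^k \subset l_i^e,\ \mathrm{dist}(\Delta, \Lambda^k_{l^n}) \le n M\}. \tag{62}$$

Again there is a one-to-one relation between the $\Gamma^e_i$ and the $l_i^e$, and as before

$$\Gamma^e_i \cap \Gamma^e_j = 0 \ \text{ for } i \ne j, \quad \text{and} \quad \bigcup_{1}^{s'} \Gamma^e_i = \Gamma^e. \tag{63}$$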

Starting from the l/s-decomposition of the volume $\Lambda$ we now decompose the Fredholm determinant, define the s-dependent final covariance and bound the large field action. With the definition of the operator $A$ (Proposition 2) we can write the Fredholm determinant as $\det(1 + iA)$. We first separate $A_s$ from the rest of $A$ via the standard relation

$$\det{}^{-1}(1 + iA) = \det{}^{-1}(1 + iA_s)\, \det{}^{-1}\Bigl(1 + \frac{1}{1 + iA_s}\, iA''\Bigr) \tag{64}$$

with

$$A'' := A - A_s = A' + A_l, \quad A' := P_s A P_l + P_l A P_s. \tag{65}$$

Since $A$ has real spectrum, the operator $1/(1 + iA)$ is well-defined. For $A_s$ we now proceed as indicated before (see (25)), i.e. we absorb the quadratic part in $\tau$ into the covariance. When doing so we obtain the following (transitory) expression for the inverse propagator $C_{ls}^{-1}$:

$$C_{ls}^{-1} = P_s \pi P_s + 1 + \hat{f}. \tag{66}$$

We express (66) in terms of $C_\gamma^{-1}$ (67), the basic reason for this being the fact that we are not able to deduce suitable fall-off properties in position space for the inverse of (66). Our final choice for the configuration dependent covariance will rather be

$$C_\gamma^{-1} = \sqrt{1 + \pi}\,(1 - P_\gamma + \varepsilon P_\gamma + f)\,\sqrt{1 + \pi}. \tag{67}$$

Here $\varepsilon$ is introduced so that $C_\gamma$ is bounded also in the large field region. We fix it as

$$\varepsilon = N^{-2/5}. \tag{68}$$

Choosing (67) we have to control the difference between (66) and (67), since it is (66) which is isolated from the action. Writing

$$C_\gamma^{-1} - C_{ls}^{-1} = \delta C_\gamma - P_l \tag{69}$$

we obtain for $\delta C_\gamma$ the sum of terms:

$$\delta C_\gamma = \sum_{i=1}^{4} \delta C_i(\gamma), \tag{70}$$

$$\delta C_1(\gamma) = -P_s \sqrt{1+\pi}\, P_\gamma \sqrt{1+\pi}\, P_s,$$

$$\delta C_2(\gamma) = P_l \sqrt{1+\pi}\,(1 - P_\gamma)\sqrt{1+\pi}\, P_s + P_s \sqrt{1+\pi}\,(1 - P_\gamma)\sqrt{1+\pi}\, P_l + P_l \sqrt{1+\pi}\,(1 - P_\gamma)\sqrt{1+\pi}\, P_l,$$

$$\delta C_3(\gamma) = (1 - P_\Lambda)\sqrt{1+\pi}\,(1 - P_\gamma)\sqrt{1+\pi}\, P_\Lambda + P_\Lambda \sqrt{1+\pi}\,(1 - P_\gamma)\sqrt{1+\pi}\,(1 - P_\Lambda) + (1 - P_\Lambda)\sqrt{1+\pi}\,(1 - P_\gamma)\sqrt{1+\pi}\,(1 - P_\Lambda),$$

$$\delta C_4(\gamma) = \sqrt{1+\pi}\; \varepsilon P_\gamma \sqrt{1+\pi}.$$

Having introduced the final covariance we may now rewrite the expression for the two-point function based on the Gaussian measure $d\mu_\gamma$ with covariance $C_\gamma$ normalized such that

$$\int d\mu_\gamma(\tau) = 1. \tag{71}$$

Since our covariance is configuration dependent there will be a change of normalization of the functional integral when changing the l/s-assignment. Relative to the situation where $\gamma = \emptyset$ this normalization factor is given by [13]

$$Z_\gamma = \det{}^{1/2}(C_\gamma / C_0), \tag{72}$$

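where $C_0$ is given below (76). Taking into account this factor we may rewrite (25) as

$$S_2^\Lambda(x, y) = \frac{\sum_{l,s} Z_\gamma \int d\mu_\gamma(\tau)\, \Bigl(\frac{1}{p^2_{reg} + m^2 + i g \tau \chi_\Lambda}\Bigr)(x, y)\; G_\gamma}{\sum_{l,s} Z_\gamma \int d\mu_\gamma(\tau)\; G_\gamma}, \quad \gamma = \gamma(l). \tag{73}$$

For the action $G_\gamma$ we find, collecting the results of the previous manipulations:

$$G_\gamma = \prod_{\Delta \in \Lambda_l} \theta^l_\Delta(\tau) \prod_{\Delta \in \Lambda_s} \theta^s_\Delta(\tau)\; e^{-\frac{1}{2} \int_{\Lambda_l} \tau^2} \times \det{}_3^{-N/2}(1 + iA_s)\, \det{}_2^{-N/2}\Bigl(1 + \frac{1}{1 + iA_s}\, iA''\Bigr) \times e^{-R \int_{\mathbb{R}^2 - \Lambda} \tau^2}\, e^{\frac{1}{2}(\tau,\, \delta C_\gamma \tau)}. \tag{74}$$

Note that we get indeed $\det_3^{-N/2}(1 + iA_s)\, \det_2^{-N/2}(1 + \frac{1}{1+iA_s}\, iA'')$ after using the gap equation and absorbing the quadratic part of $\det^{-N/2}(1 + iA_s)$, since

$$\mathrm{Tr}\, A = \mathrm{Tr}\, A_s + \mathrm{Tr}\, A_l, \quad \mathrm{Tr}\, A_l = \mathrm{Tr}\Bigl(\frac{1}{1 + iA_s}\, A_l\Bigr), \tag{75}$$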

on using $\mathrm{Tr}\, A' = 0$ and $A_s A_l = 0$. We first analyse the covariance $C_\gamma$. Then we bound the normalization factors $Z_\gamma$ and the correction terms $\delta C_\gamma$. Finally we bound the large field determinant. Calling $C_0$ the covariance $C_\gamma$ for the case that $\gamma = \emptyset$, which means

$$C_0 = \frac{1}{\sqrt{1+\pi}}\, \frac{1}{1+f}\, \frac{1}{\sqrt{1+\pi}}, \tag{76}$$

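we may write the inverse of (67) as

$$C_\gamma = C_0 + C_0 (C_0^{-1} - C_\gamma^{-1}) C_\gamma = C_0 \sum_{r=0}^{\infty} [(C_0^{-1} - C_\gamma^{-1}) C_0]^r$$

$$= \frac{1}{\sqrt{1+\pi}}\, \frac{1}{\sqrt{1+f}} \sum_{r=0}^{\infty} \Bigl[\frac{1}{\sqrt{1+f}}\, P_\gamma (1 - \varepsilon)\, \frac{1}{\sqrt{1+f}}\Bigr]^r \frac{1}{\sqrt{1+f}}\, \frac{1}{\sqrt{1+\pi}}$$

$$= C_0 + \frac{1}{\sqrt{1+\pi}}\, \frac{1}{1+f}\, P_\gamma (1 - \varepsilon) \sum_{r=0}^{\infty} \Bigl[\frac{1}{1+f}\, P_\gamma (1 - \varepsilon)\Bigr]^r \frac{1}{1+f}\, \frac{1}{\sqrt{1+\pi}}. \tag{77}$$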

The sums are obviously norm-convergent. At this stage the support properties of $1/(1+f)$ (11) become very helpful. They imply that in position space $C_\gamma$ may be written in terms of a simple sum over disconnected pieces with support restricted to (a neighbourhood of) $\gamma_i$. We obtain

$$C_\gamma = C_0 + C^\gamma, \quad C^\gamma := \sum_{i=1}^{n} C^{\gamma_i}, \tag{78}$$

$$C^{\gamma_i} := \frac{1}{\sqrt{1+\pi}}\, \frac{1}{1+f}\, P_{\gamma_i} (1 - \varepsilon) \sum_{r=0}^{\infty} \Bigl[\frac{1}{1+f}\, P_{\gamma_i} (1 - \varepsilon)\Bigr]^r \frac{1}{1+f}\, \frac{1}{\sqrt{1+\pi}}. \tag{79}$$

If we had only imposed exponential fall-off for $1/(1+f)$, arbitrarily many terms coupling the various $\gamma_i$ would have appeared. They could be shown to be small using the distance between the various $\gamma_i$ of size $\sim M$ and the fall-off of $1/(1+f)$ and of $\frac{1}{\sqrt{1+\pi}}$, but still they would be a nuisance. The fall-off properties of $C_0$ have been analysed in Lemma 1. The complications stemming from nonempty $\gamma$ are controlled easily in

Lemma 3. The kernel $C^\gamma$ for $\gamma \ne \emptyset$ satisfies the following estimates:

$$|C^\gamma(x, y)| \le O(1)\, N^{2/5} \exp\{-2m(\mathrm{dist}(x, \gamma) + \mathrm{dist}(y, \gamma))\}. \tag{80}$$

For $x, y \in \Lambda - \Gamma$ or $x \in \Gamma_i$, $y \in \Gamma_j$ with $i \ne j$ we find:

$$|C^\gamma(x, y)| \le O(1)\, \frac{1}{N^{18/5}} \tag{81}$$

and for $x \in \Gamma$, $y \in \Lambda - \Gamma$,

$$|C^\gamma(x, y)| \le O(1)\, \frac{1}{N^{8/5}}. \tag{82}$$

Finally we have

$$|C_0(x, y)| \le O(1) \exp\{-2m|x - y|\}. \tag{83}$$
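The exponents in (81) and (82) follow from (80) once the minimal distances from (59) and the choice $M = \frac{2}{m}\ln N$ from (54) are inserted. The short sketch below does this arithmetic with exact fractions; it assumes that the $\sqrt{2}$ correction in (59) only changes the $O(1)$ constant, and the function name is hypothetical.

```python
from fractions import Fraction as F


def decay_exponent(total_distance_in_units_of_M):
    """Exponent of N in N^{2/5} * exp(-2 m d) for d = c * M with M = (2/m) ln N.

    exp(-2 m c M) = exp(-4 c ln N) = N^{-4c}; the prefactor N^{2/5} is from (80).
    """
    return F(2, 5) - 4 * F(total_distance_in_units_of_M)


# x, y both in Lambda - Gamma: each is at distance about M/2 from gamma, total M.
both_outside = decay_exponent(1)

# x in Gamma, y in Lambda - Gamma: only y is guaranteed to be M/2 away from gamma.
one_outside = decay_exponent(F(1, 2))
```

This reproduces $N^{-18/5}$ for (81) and $N^{-8/5}$ for (82).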

Proof. We have to control the contribution of the infinite sum over $r$ in (79). We abbreviate

$$O = P_{\gamma_i} (1 - \varepsilon) \sum_{r=0}^{\infty} \Bigl[\frac{1}{1+f}\, P_{\gamma_i} (1 - \varepsilon)\Bigr]^r, \quad B = \frac{1}{\sqrt{1+\pi}}\, \frac{1}{1+f} \tag{84}$$

so that

$$C^{\gamma_i} = B\, O\, B^*. \tag{85}$$

Obviously $\|O\| \le N^{2/5}$ and $\|B\| \le 1$. Furthermore the kernel of $B$ is continuous and pointwise bounded by $O(1)$. By inserting characteristic functions of squares $\Delta$ between $B$ and $O$ and between $O$ and $B^*$, summing over the squares, using the fall-off properties of the kernels and bounding

$$|(\chi_\Delta, O \chi_{\Delta'})| \le N^{2/5}, \tag{86}$$

we then arrive at the bounds stated in Lemma 3. For the required properties of the kernels see Lemma 1 and (11). The minimal distances of points fulfilling the conditions specified in Lemma 3 follow from the definitions (54)–(59).

We remark that the bounds in Lemma 3 could be somewhat improved on by using methods similar to those employed in the proof of Lemma 4. We do not do so because this improvement would not strengthen our final bounds anyway. Note in particular that the cluster expansion will be performed such that only $C^\gamma$-terms bridging the gap between $\gamma$ and $\Lambda - \Gamma$ will be produced. Now we are going to bound the factors $Z_\gamma$.

Lemma 4. Let $|\gamma|$ denote the volume of $\gamma$. Then

$$1 \le Z_\gamma \le e^{O(1)|\gamma|}. \tag{87}$$

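Proof. Using (76), (78), (79) we have

$$Z_\gamma = \det{}^{1/2}(C_\gamma / C_0) = \det{}^{1/2}\Bigl(1 + \frac{1}{C_0} \sum_i C^{\gamma_i}\Bigr) = \det{}^{1/2}\Bigl(1 + \sum_{i,\, r_i \ge 1} \Bigl[(1 - \varepsilon) P_{\gamma_i} \frac{1}{1+f}\Bigr]^{r_i}\Bigr)$$

$$= \prod_i \det{}^{1/2}\Bigl(1 + \sum_{r_i \ge 1} \Bigl[(1 - \varepsilon) P_{\gamma_i} \frac{1}{1+f}\Bigr]^{r_i}\Bigr) = \prod_i \det{}^{1/2}(C_{\gamma_i} / C_0) = \prod_i Z_{\gamma_i}. \tag{88}$$

Again we used the support properties of $1/(1+f)$ to factorize the determinant. For $Z_{\gamma_i}$ we now find

$$Z_{\gamma_i} = \det{}^{1/2}\Bigl(1 + \frac{(1 - \varepsilon) P_{\gamma_i} \frac{1}{1+f}}{1 - (1 - \varepsilon) P_{\gamma_i} \frac{1}{1+f}}\Bigr) = \det{}^{-1/2}\Bigl(1 - (1 - \varepsilon) P_{\gamma_i} \frac{1}{1+f}\Bigr)$$

$$= \exp \mathrm{Tr}\Bigl\{-\frac{1}{2} \ln\Bigl(1 - (1 - \varepsilon) P_{\gamma_i} \frac{1}{1+f}\Bigr)\Bigr\} = \exp \mathrm{Tr}\Bigl\{\frac{1}{2} \sum_{r \ge 1} \frac{1}{r} (1 - \varepsilon)^r \Bigl[P_{\gamma_i} \frac{1}{1+f}\Bigr]^r\Bigr\}. \tag{89}$$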

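This expression implies $Z_\gamma \ge 1$. On the other hand we may use Lemma 3' from [1], which says that for a Hermitian trace class operator $A$ and orthogonal projector $P$ we have the inequality:

$$\mathrm{Tr}\,(P A P)^r \le \mathrm{Tr}\, P A^r P. \tag{90}$$

Applying this to $\frac{1}{1+f}$ and $P_{\gamma_i}$ we may bound

$$\mathrm{Tr}\Bigl(\frac{1}{1+f}\, P_{\gamma_i}\Bigr)^r = \mathrm{Tr}\Bigl(P_{\gamma_i} \frac{1}{1+f}\, P_{\gamma_i}\Bigr)^r \le \mathrm{Tr}\Bigl(P_{\gamma_i} \Bigl(\frac{1}{1+f}\Bigr)^r P_{\gamma_i}\Bigr) \le O(1)\, |\gamma_i|, \tag{91}$$

using the fact that⁹

$$\int d^2 p\, \Bigl(\frac{1}{1 + f(p)}\Bigr)^r \le O(1) \int d^2 p\, \Bigl(\frac{1}{1 + (p^2)^2}\Bigr)^r \le O(1)\, r^{-1/2}. \tag{92}$$

Using this we obtain

$$Z_\gamma \le \exp\Bigl\{O(1)\, |\gamma| \sum_{r \ge 1} \frac{1}{r^{3/2}} (1 - \varepsilon)^r\Bigr\} \le \exp(O(1)|\gamma|). \tag{93}$$

This proves Lemma 4.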
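Why the factor $r^{-1/2}$ from (92) matters in (93) can be seen numerically. The sketch below compares the series with weight $r^{-3/2}$, which stays below $\zeta(3/2) \approx 2.612$ for every $\varepsilon > 0$, with the series with weight $r^{-1}$, which equals $\ln(1/\varepsilon)$ and would therefore grow like $\ln N$ for $\varepsilon = N^{-2/5}$. The truncation length and the function name are illustrative assumptions.

```python
import math


def weighted_series(eps: float, power: float, terms: int = 200000) -> float:
    """Truncated sum over r >= 1 of (1 - eps)^r / r^power."""
    return sum((1.0 - eps) ** r / r ** power for r in range(1, terms + 1))


eps = 1e-4                                   # e.g. eps = N^{-2/5} with N = 10^10
with_decay = weighted_series(eps, 1.5)       # weight 1/r^{3/2} as in (93)
without_decay = weighted_series(eps, 1.0)    # weight 1/r only, as in (89) alone
log_growth = math.log(1.0 / eps)             # closed form of the 1/r series
```

The first sum is bounded uniformly in $\varepsilon$, the second is not, so the bound $Z_\gamma \le e^{O(1)|\gamma|}$ with an $N$-independent constant relies on (92).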

We now come to the bounds on the correction terms $\delta C_i(\gamma)$ from (70).

Lemma 5.
i) $\delta C_1(\gamma) \le 0$ (as an operator),
ii) $\delta C_3(\gamma) \le 1 + \pi(0)$, $\delta C_4(\gamma) \le O(1)\, N^{-2/5}$ (as operators),
iii) $\|\delta C_2(\gamma)\| \le O(1)\, N^{-2}$, $|\delta C_2(\gamma)(x - y)| \le O(1) \inf\{e^{-2m|x-y|}, N^{-2}\}$.

Proof. i) is immediately obvious from the positivity of $\pi$.
ii) The first statement is obvious since

$$\delta C_3(\gamma) = \sqrt{1+\pi}\,(1 - P_\gamma)\sqrt{1+\pi} - P_\Lambda \sqrt{1+\pi}\,(1 - P_\gamma)\sqrt{1+\pi}\, P_\Lambda. \tag{94}$$

Note that $\delta C_3(\gamma)$ only enters through interactions with field configurations of support outside $\Lambda$, which will be suppressed anyway when taking $R \to \infty$ (Prop. 8, (114)). The bound on $\delta C_4(\gamma)$ follows from the definition of $\varepsilon$ in (68).
iii) The first statement in iii) follows from the exponential fall-off of $\sqrt{1+\pi}$ (Lemma 1) and the fact that $\mathrm{dist}(\Lambda_l, (\mathbb{R}^2 - \gamma)) \ge \frac{\ln N}{m}$ (see (58)). This implies a bound on $\delta C_2(\gamma)$ of the form in iii); closer inspection shows that $O(1)$ is basically given by $m^{-3}$, two powers coming from the integration over the kernel bridging the distance gap and one coming from a norm bound on the second $\sqrt{1+\pi}$. The second statement in iii) also follows from the definition of $\gamma$ and from the fall-off of the kernel of $\sqrt{1+\pi}$.

Now we come to the bound on the nondiagonal term $\det_2^{-N/2}(1 + \frac{1}{1+iA_s}\, iA'')$ in the action (74). We need to get a suitable bound for this term which is sufficiently stable under the modifications caused by the cluster expansion parameters. We (temporarily) introduce the operator $B$ through

$$B = \frac{1}{1 + iA_s}\, iA'' = \frac{1}{(1 + iA_s)(1 - iA_s^*)}\, (i + A_s^*)\, A''. \tag{95}$$

Using the facts that the $A$-operators have real expectation values in real Hilbert space, that $\mathrm{Tr}\, A_s^n A'' = 0$ and cyclicity we find

$$|\det{}_2^{-1}(1 + B)| = |\det{}^{-1}(1 + B)| = |\det{}^{-1}(1 + B^*)| = \det{}^{-1/2}(1 + D), \tag{96}$$

where

$$D = B + B^* + B^* B. \tag{97}$$

⁹ Unfortunately the factor of $r^{-1/2}$ appearing in (92) is falsely written as $2^{-r}$ in [1]. This mistake fortunately is of no consequence however.

Now we may apply the norm bound on $A_s$ from Proposition 2 to realize that $B$ coincides with $iA''$ up to small corrections, more precisely:

Lemma 6. For $\varphi \in L^2(\Lambda)$ we find

$$B\varphi = iA''\varphi + \delta A''\varphi, \quad B^*\varphi = -iA''^*\varphi + A''^*\delta^*\varphi, \tag{98}$$

where the operator $\delta$ is bounded in norm as

$$\|\delta\| \le (1 + \alpha)\|A_s\| \le (1 + \alpha)\, N^{-2/5} \ll 1 \tag{99}$$

with suitable $0 < \alpha \ll 1$.

Proof. Since we have

$$\delta = i\Bigl(\frac{1}{1 + iA_s} - 1\Bigr), \tag{100}$$

the statements of the lemma follow directly from Proposition 2, where $\alpha$ may be chosen to obey an upper bound of size $\sim \|A_s\|$.

Now we can also bound the operator $D$. For $\varphi \in L^2(\Lambda)$ normalized to one we obtain

$$(\varphi, D\varphi) = i(\varphi, A''\varphi) - i(A''\varphi, \varphi) + (\varphi, \delta A''\varphi) + (\delta A''\varphi, \varphi) + (A''\varphi, A''\varphi) + \Bigl(A''\varphi, \Bigl(\frac{1}{(1 - iA_s^*)(1 + iA_s)} - 1\Bigr) A''\varphi\Bigr). \tag{101}$$

Since the first two terms drop out, this entails

$$(\varphi, D\varphi) \ge (1 - \eta)\|A''\varphi\|_2^2 - \eta\|A''\varphi\|_2 \ge -\frac{4}{25}\, \frac{\eta^2}{1 - \eta} \tag{102}$$

with the choice

$$\eta = 2\|\delta\| \ll 1. \tag{103}$$

Splitting the selfadjoint operator $D$ into its negative part $D_-$ and its nonnegative part $D_+$,

$$D = D_+ - D_-, \tag{104}$$

we thus have obtained that

$$0 \le D_- \le \frac{4}{25}\, \frac{\eta^2}{1 - \eta} \le N^{-4/5}. \tag{105}$$

Using this we may now proceed to a bound on $|\det_2^{-1}(1 + B)| = \det^{-1/2}(1 + D)$. We find

$$\det{}^{-1/2}(1 + D) = e^{-\frac{1}{2} \mathrm{Tr}\, D}\, \det{}_2^{-1/2}(1 + D) = e^{-\frac{1}{2} \mathrm{Tr}\, B^* B}\, \det{}_2^{-1/2}(1 + D_+)\, \det{}_2^{-1/2}(1 - D_-) \le \det{}_2^{-1/2}(1 - D_-). \tag{106}$$


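Here we used again the fact that $\mathrm{Tr}\, B = i\, \mathrm{Tr}\, A_l = -\mathrm{Tr}\, B^*$. Evaluating the trace of $D_-$ in an eigenbasis of $D_-$ one may easily establish the bound

$$\det{}_2^{-1/2}(1 - D_-) \le \exp\Bigl(\frac{1}{2}\, \mathrm{Tr}\Bigl(\frac{D_-^2}{2 - \|D_-\| - D_-}\Bigr)\Bigr) \le \exp\Bigl(\frac{1}{4 - 4\|D_-\|}\, \mathrm{Tr}\, D_-^2\Bigr). \tag{107}$$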

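To verify the first inequality one observes that for $x > -\varepsilon > -1$ we have $x - \ln(1 + x) \le \frac{x^2}{2 - \varepsilon + x}$. Now it remains to bound $\mathrm{Tr}\, D_-^2$. We call $\{\varphi_-\}$ a suitable set of normalized eigenfunctions of $D_-$ and find

$$\mathrm{Tr}\, D_-^2 = \sum_{\varphi_-} (\varphi_-, D^2 \varphi_-) = \sum_{\varphi_-} [(\varphi_-, D\varphi_-)]^2 \le \sum_{\varphi_-} \Bigl(\frac{4}{5}\, \eta\, \|A''\varphi_-\|\Bigr)^2 \le \frac{16}{25}\, \eta^2\, \mathrm{Tr}(A''^* A'') \le \frac{16}{25}\, \eta^2\, \mathrm{Tr}(A^* A)$$

$$\le \frac{16}{25}\, \eta^2 g^2 \Bigl(\int_\Lambda \tau^2\Bigr) \Bigl(\int \frac{d^2 p}{(2\pi)^2} \Bigl(\frac{1}{p^2_{reg} + m^2}\Bigr)^2\Bigr) \le O(1)\, N^{-4/5} g^2 \Bigl(\int_{\Lambda_s} \tau^2 + \int_{\Lambda_l} \tau^2\Bigr). \tag{108}$$

In the first inequality in (108) we made use of (102). It is admittedly pedantic to insist on factors such as $(4/5)^2$ in our context. To pass from $\mathrm{Tr}(A''^* A'')$ to the expression in the last line in (108) it is sufficient to take away the projectors $P_l$ or $P_s$ and thus to bound $\mathrm{Tr}(A''^* A'')$ in terms of $\mathrm{Tr}(A^* A)$, which is given by the double integral. We have obtained

Lemma 7.

$$\Bigl|\det{}_2^{-N/2}\Bigl(1 + \frac{1}{1 + iA_s}\, iA''\Bigr)\Bigr| \le e^{O(1) N^{-4/5} \int_\Lambda \tau^2}. \tag{109}$$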
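The scalar inequality $x - \ln(1+x) \le x^2/(2 - \varepsilon + x)$ for $x > -\varepsilon > -1$, which underlies the first step of (107), can be spot-checked numerically. The sketch below samples the inequality on a grid; the sampling range, the tolerance and the function names are illustrative assumptions. It also shows that the shift by $\varepsilon$ in the denominator is needed for negative $x$.

```python
import math


def lhs(x: float) -> float:
    """x - ln(1 + x), evaluated with log1p for accuracy near x = 0."""
    return x - math.log1p(x)


def rhs(x: float, eps: float) -> float:
    """x^2 / (2 - eps + x)."""
    return x * x / (2.0 - eps + x)


def inequality_holds(eps: float, samples: int = 2000,
                     x_max: float = 5.0, tol: float = 1e-12) -> bool:
    """Check lhs(x) <= rhs(x, eps) on a grid strictly inside (-eps, x_max)."""
    for k in range(1, samples + 1):
        x = -eps + (x_max + eps) * k / (samples + 1)
        if lhs(x) > rhs(x, eps) + tol:
            return False
    return True
```

For instance `inequality_holds(0.5)` is true, whereas dropping the shift fails at $x = -1/2$: there $x - \ln(1+x) \approx 0.193$ exceeds $x^2/(2+x) \approx 0.167$.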

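Finally we may also bound somewhat further the terms $(\tau, \delta C_3(\gamma)\tau)$ and $(\tau, \delta C_4(\gamma)\tau)$ appearing in $G_\gamma$. Writing $\hat{\tau} := \tau \chi_{\mathbb{R}^2 - \Lambda}$ we have

$$\frac{1}{2}(\tau, \delta C_3(\gamma)\, \tau) = (\hat{\tau}, \delta C_3(\gamma)\, \tau) + \frac{1}{2}(\hat{\tau}, \delta C_3(\gamma)\, \hat{\tau}). \tag{110}$$

We may then bound

$$(\hat{\tau}, \delta C_3(\gamma)\, \tau) + \frac{1}{2}(\hat{\tau}, \delta C_3(\gamma)\, \hat{\tau}) \le \frac{R}{2}(\hat{\tau}, \hat{\tau}) + \frac{O(1)}{R}(\tau \chi_\Lambda, \tau \chi_\Lambda) \tag{111}$$

for $R$ large enough, using Lemma 5. As for $\delta C_4(\gamma)$ we find

$$(\tau, \delta C_4(\gamma)\, \tau) \le O(1)\, N^{-2/5} \bigl[(\tau \chi_\Lambda, \tau \chi_\Lambda) + (\hat{\tau}, \hat{\tau})\bigr]. \tag{112}$$

We now have complete control of the action $G_\gamma$ from (74) and may collect our findings in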

Proposition 8. For $R$ large enough we have

$$Z_\gamma\, G_\gamma(\tau) \le e^{-\frac{49}{100} \int_{\Lambda_l} \tau^2}\; e^{O(1) N^{-2/5} \int_{\Lambda_s} \tau^2}\; e^{-\frac{R}{2} \int_{\mathbb{R}^2 - \Lambda} \tau^2}. \tag{113}$$

Thus we now take the limit $R \to \infty$ and absorb the term $e^{-\frac{R}{2} \int_{\mathbb{R}^2 - \Lambda} \tau^2}$ in the covariance, which implies that we may replace

$$C_\gamma \to P_\Lambda C_\gamma P_\Lambda \tag{114}$$

and restrict the action to configurations $\tau(x)$ with $\mathrm{supp}\, \tau \subset \Lambda$.

Proof. The proof concerning (114) is to be found e.g. in [13], so we have only to gather the pieces for the proof of (113). In the first term $e^{-\frac{49}{100} \int_{\Lambda_l} \tau^2}$ we collected together the contribution $e^{-\frac{1}{2} \int_{\Lambda_l} \tau^2}$ from (74), the $\Lambda_l$-contribution from (109), the term from the bound on $Z_\gamma$ in Lemma 4, where we used

$$O(1)\, |\gamma| \le O(1) \Bigl(\frac{\ln N}{m}\Bigr)^2 |\Lambda_l| \le O(1) \Bigl(\frac{\ln N}{m}\Bigr)^2 N^{-1/6} \int_{\Lambda_l} \tau^2, \tag{115}$$

and the contributions from (111) and (112) in $\Lambda_l$. Finally we also absorbed in this term a contribution coming from $\delta C_2(\gamma)$. In the term $e^{O(1) N^{-2/5} \int_{\Lambda_s} \tau^2}$ we have absorbed the contribution in $\Lambda_s$ from (109), (111), (112) and again a contribution coming from $\delta C_2(\gamma)$. Finally we absorbed the one from the bound on $|\det_3^{-N/2}(1 + iA_s)|$ to be derived now: $|\det_3^{-N/2}(1 + iA_s)|$ can be bounded using the inequality

$$|\mathrm{Tr}\, A^n| \le \|A^{n-2}\|\, \mathrm{Tr}(A^* A) \tag{116}$$

valid for any trace class operator $A$ and $n \ge 2$. To bound $\|A_s^{n-2}\|$ we use Proposition 2. We may restrict to $n = 3$, the subsequent terms in the expansion of $\det_3^{-N/2}(1 + iA_s)$ being much smaller,

$$|\mathrm{Tr}\, A_s^3| \le \|A_s\|\, \mathrm{Tr}(A_s^* A_s) \le N^{-2/5} N^{-1}\, O(1)\, (\tau \chi_s, \pi(0)\, \tau \chi_s) \le O(1)\, N^{-7/5} \int_{\Lambda_s} \tau^2. \tag{117}$$

With the help of the previous remarks and this relation we can verify the bound $|\det_3^{-N/2}(1 + iA_s)| \le e^{O(1) N^{-2/5} \int_{\Lambda_s} \tau^2}$.

The following result now is immediate.

Corollary. Reducing the volume $\Lambda$ to a single square $\Delta$ equipped with a small field condition (42) we find

$$Z_\Delta = \int d\mu_\Delta(\tau)\, G_\Delta = 1 + o(N^{-1/5}), \tag{118}$$

where $G_\Delta$ is the integrand from (26) restricted to the single small field square volume, and $d\mu_\Delta(\tau)$ is the normalized measure with covariance $\chi_\Delta C_0 \chi_\Delta$.


The statement follows from the bound (113) restricted to one small field square in $\Lambda$. It will be useful later on to bound the large field contribution in (113), r.h.s., by a product of suppression factors in probability per square $\Delta \in \Gamma$. If (43) holds we may write

$$\int_{\Delta \in \Lambda_l} \tau^2 \ge \frac{1}{2} \int_{\Delta \in \Lambda_l} \tau^2 + \frac{3}{8 \lambda K}\, N^{1/6} \quad \text{and} \quad \int_{\Delta \in \Lambda_{l^n}} \tau^2 \ge \frac{1}{2} \int_{\Delta \in \Lambda_{l^n}} \tau^2 + \frac{3}{8 \lambda K}\, N^{n/6}. \tag{119}$$

Therefore we obtain

Lemma 9.

$$e^{-\frac{49}{100} \int_{\Lambda_l} \tau^2} \le e^{-\frac{1}{4} \int_{\Lambda_l} \tau^2}\; e^{-N^{1/8} |\Gamma^e|} \prod_{\Delta \in \Lambda_{l^n}} e^{-N^{\frac{n-1}{8}}}. \tag{120}$$

Proof. It suffices to observe that for $N$ sufficiently large we have (using (42), (45))

$$O(1) \Bigl(\frac{n \ln N}{m}\Bigr)^{-2} N^{n/6} \ge N^{n/8}.$$

Now we have sufficient control of the action to start with the expansions.

4. The Expansions, Proof of Mass Generation

4.1. The general form of the expansions. The cluster expansion allows one to control the spatial correlations of the model. When combined with a subsequent Mayer expansion, which frees the clusters from their hard core constraints, it allows one to take the thermodynamic limit and to bound the decay of the correlation functions. We proceed similarly to [1] and use the general formalism for cluster expansions presented in [18], which in turn is an elaboration on a theme which has been the subject of several seminal papers by Brydges and collaborators over more than a decade. We apply in particular the Brydges–Kennedy formulae [15]. For general references on cluster expansions see also [13], where the presentation is close to the original way of introducing cluster expansions in constructive field theory, and [14, 19], which are close to our way of presentation.

The cluster expansion is a technique to select explicit connections between different spatial regions. The best formulas for the clusters involve trees, which are the minimal way to connect abstract objects together. We call the subsequent formulae forest formulae, the forests generally consisting of several disconnected trees. The basic building blocks of our expansion are the large field blocks $\Gamma^e_i$ (62), composed of (generally many) large field squares and their security belts, and the individual small field squares $\Delta$ from

$$S := \Lambda - \Gamma^e. \tag{121}$$

From the point of view of the presentation it seems advantageous to connect together these large field blocks $\Gamma^e_i$ already by a first cluster expansion, and then to proceed to a second one, the building blocks of which are given by the outcome of the first. Then the expansion really connects together unit size squares, which allows us to somewhat unify the language as regards convergence criteria, etc. In view of the existence of the excellent presentations to be found in [14–19] and since we stick very closely to [18] we hardly give indications on the proofs of cluster expansion formulae here. The general forest formula we are going to use will be given now.

We introduce the following notation: Let $I$ be a finite index set (in our context the set $\Lambda$ of the squares $\Delta \in \Lambda$) and $P(I)$ the set of all unordered pairs $(i, j) \in I \times I$, $i \ne j$. A (unordered) forest $F$ on $I$ is a subset of $P(I)$ which does not contain loops $(i_1, i_2) \ldots (i_n, i_1)$. Any such forest splits as a single union of disjoint trees, and it gives also a decomposition of $I$ into $|I| - |F|$ clusters (some of them possibly singletons). The non-trivial clusters are connected by the (non-empty) trees of the forest. Let $H$ be a function of variables $x_{ij}$, $ij \in P$. Then the following forest formula due to Brydges is proven in [18]:

$$H(1, \ldots, 1) = \sum_{F} \Bigl(\prod_{l \in F} \int_0^1 dh_l\Bigr) \Bigl(\prod_{l \in F} \frac{d}{dx_l}\, H\Bigr)(h^F_{ij}(h)), \tag{122}$$

where

$$h^F_{ij}(h) = \inf\{h_l,\ l \in L_F(i, j)\} \tag{123}$$

and $L_F(i, j)$ is the unique path in the forest $F$ connecting $i$ to $j$. If no such path exists, by convention $h^F_{ij}(h) = 0$. This interpolation formula will subsequently be applied to our expression for the two-point function, more precisely to the summands in the numerator and denominator of (73) with given l/s-assignments. As mentioned we proceed in two steps. The first rather trivial one is to connect together the squares in the components $\Gamma^e_a$ of $\Gamma^e$. Let $P_{\Gamma^e}$ be the set of all pairs $(i, j)$ of distinct squares in $\Gamma^e$. We define $\varepsilon_{ij} = 0$ if $\Delta_i \cap \Delta_j = \emptyset$ or if $\Delta_i$ and $\Delta_j$ belong to different components $\Gamma^e_{a(i)} \ne \Gamma^e_{a(j)}$ of $\Gamma^e$, $\varepsilon_{ij} = 1$ otherwise, and $\eta_{ij} = 1 - \varepsilon_{ij}$. Our first forest formula is simply

$$1 = \sum_{F_1} \prod_{l \in F_1} \varepsilon_l \int_0^1 dh_l \prod_{l \notin F_1} (\eta_l + \varepsilon_l\, h^{F_1}_l(h)). \tag{124}$$

This follows directly from the application of the forest formula to

$$H(\{x_{ij}\}) = \prod_{ij \in P_{\Gamma^e}} (x_{ij}\, \varepsilon_{ij} + \eta_{ij}),$$

using that here $H(1, \ldots, 1) = 1$. The only non-zero terms in this formula are those for which the clusters associated to the forest $F_1$ are exactly the set of connected components $\Gamma^e_a$ of the large field region. Indeed they cannot be larger because of the factor $\prod_{l \in F_1} \varepsilon_l$, nor can they be smaller because of the factor $\prod_{l \notin F_1} (\eta_l + \varepsilon_l\, h^{F_1}_l(h))$, which is zero if there are some neighbours belonging to the same component (for which $\eta_{ij} = 0$) belonging to different clusters (for which $h^{F_1}_{ij}(h) = 0$). Therefore this formula simply associates connecting trees of "neighbour links" to each such connected component, but in a symmetric way without arbitrary choices. We remark finally that in (124) the interpolated factors $\prod_{l \notin F_1} (\eta_l + \varepsilon_l\, h^{F_1}_l(h))$, after giving the necessary constraints on the clusters, can be bounded simply by 1.

The second cluster expansion links together the previous clusters by interpolating all the non-local kernels in the theory. It gives a forest formula which is an extension of the first one. We consider all non-local kernels in our theory, that is

$$\frac{1}{\sqrt{1+\pi}}, \quad \sqrt{1+\pi}, \quad \frac{1}{p^2 + m^2}. \tag{125}$$
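The infimum rule (123) and the first forest formula (124) can be checked on a toy example. The sketch below enumerates all forests on a handful of squares, evaluates the $h$-integrals with a midpoint rule, and sums the right-hand side of (124). It is an illustration only: the function names, the dictionary `eps` keyed by pairs `(i, j)` with `i < j`, and the numerical quadrature are assumptions made here, and the enumeration is feasible only for very small index sets.

```python
from itertools import combinations, product


def is_forest(edges):
    """True if the edge set contains no loop (union-find cycle check)."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            v = parent[v]
        return v

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            return False
        parent[ra] = rb
    return True


def h_forest(forest, h, i, j):
    """h^F_ij of (123): infimum of h_l along the path from i to j, 0 if none."""
    adj = {}
    for (a, b), value in zip(forest, h):
        adj.setdefault(a, []).append((b, value))
        adj.setdefault(b, []).append((a, value))
    stack = [(i, None, 1.0)]
    while stack:
        node, came_from, running_min = stack.pop()
        if node == j:
            return running_min
        for nxt, value in adj.get(node, []):
            if nxt != came_from:
                stack.append((nxt, node, min(running_min, value)))
    return 0.0


def first_forest_sum(n, eps, grid=40):
    """Midpoint-rule evaluation of the right-hand side of (124) for n squares."""
    pairs = list(combinations(range(n), 2))
    mids = [(k + 0.5) / grid for k in range(grid)]
    total = 0.0
    for size in range(len(pairs) + 1):
        for forest in combinations(pairs, size):
            # the factor prod_{l in F} eps_l kills forests using a pair with eps = 0
            if not is_forest(forest) or any(eps[l] == 0 for l in forest):
                continue
            rest = [l for l in pairs if l not in forest]
            acc = 0.0
            for h in product(mids, repeat=size):
                term = 1.0
                for l in rest:
                    term *= (1 - eps[l]) + eps[l] * h_forest(forest, h, *l)
                acc += term
            total += acc / grid ** size
    return total
```

For three mutually neighbouring squares only the three spanning trees contribute, each with $\int_0^1\!\!\int_0^1 \min(h_1, h_2)\, dh_1 dh_2 = 1/3$, so the sum is 1 up to quadrature error, in line with (124).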


Note that due to the support property (11) of $1/(1+f)$ and our choice of treating each $\Gamma^e_a$ as one connected block of the second expansion, we need not interpolate $C_\gamma = C_0 + C^\gamma$ as a whole: When all kernels appearing in (125) are interpolated such that they do not connect any more different clusters of the second expansion, then $C_\gamma$, with these interpolated kernels replacing the noninterpolated ones in the expression for $C_\gamma$, does not connect different clusters either. The three kernels from (125) will be generically called $K$. Now the second expansion takes into account the connections built by the first, i.e. it interpolates only the links

$$K_l(x, y) = K_{ij}(x, y) = \Delta_i(x)\, K(x, y)\, \Delta_j(y) \tag{126}$$

for squares which belong to different clusters of the first forest. Let $Z(K, \Gamma^e, \Lambda)$ be a generic name for the quantities we want to compute, namely the numerator and denominator in (73). Then the second forest formula gives:

$$Z(K, \Gamma^e, \Lambda) = \sum_{F_1} \prod_{l \in F_1} \varepsilon_l \int_0^1 dh_l \prod_{l \notin F_1} (\eta_l + \varepsilon_l\, h^{F_1}_l(h)) \times \sum_{F_2 \supset F_1}\ \prod_{l \in F_2 - F_1} \int_0^1 dh_l \prod_{l \in F_2 - F_1} \frac{d}{dx_l}\, Z(K(\{h^{F_2 - F_1}\}), \Lambda), \tag{127}$$

where $Z(K(\{h^{F_2 - F_1}\}), \Lambda)$ is a functional integral with interpolated kernels $K(\{h^{F_2 - F_1}\})$. These interpolated kernels are defined by $K(\{h^{F_2 - F_1}\}) = h^{F_1, F_2}_l(h)\, K_l(x, y)$, where $h^{F_1, F_2}_l(h)$ is the inf of the $h$ parameters of the lines of $F_2 - F_1$ on the unique path in $F_2$ joining $\Delta_i$ to $\Delta_j$ (if $l = (i, j)$). Again if no such path exists, by convention $h^{F_1, F_2}_l(h) = 0$. In other words the path is computed with the full forest, but only the parameters of the forest $F_2 - F_1$ are taken into account for the interpolated non-local kernels.

The product $\prod_{l \in F_2 - F_1} \frac{d}{dx_l}$ is a short notation for an operator which derives with respect to a parameter $x_l$ multiplying $K_l$, where $K$ is any of the non-local kernels, and then takes $x_l$ to 1. Therefore the action of $\prod_{l \in F_2 - F_1} \frac{d}{dx_l}$ creates the product $\prod_{l \in F_2 - F_1} K_l$ (with summation over the finite set of possible $K$'s), multiplied either by functional derivatives hooked to both ends (for the case where the derivatives apply to the measure and are evaluated by partial integration) or by other terms descended from the action exponential, if the derivatives apply directly to the action. In Sect. 4.4 we give the list of the corresponding derived "vertices" produced by these derivatives. The important fact to be shown is that, because these derivatives act on terms which carry a factor $N^{-x}$, $x > 0$, to each such vertex, hence to each link of this second expansion, is associated a factor which tends to zero as $N \to \infty$.

It is an important property of the forest formulas of this type that they preserve positivity properties [18], so that if $K$ is a positive operator, $K(\{h^{F_2 - F_1}\})$ is also positive. This is not obvious at first sight from the infimum rule of (123), but it is true because for any ordering of the $h$ parameters (say $h_1 \le \ldots \le h_n$) there is a way (which varies with the ordering) to rewrite the interpolated $K(h)$ as an explicit sum of positive operators [18]:

$$K(h) = \sum_{p} (h_p - h_{p-1}) \sum_{q=1}^{p} \chi_{p,q}\, K \chi_{p,q}. \tag{128}$$


The functions $\chi_{p,q}$ are the characteristic functions of the clusters built with the part of the forest made of lines $p, p+1, \ldots, n$. For us (as for anyone interpolating Gaussian measures) this preservation of positivity is crucial when the covariance $C_\gamma$ is interpolated.

4.2. The cluster amplitudes. Factorization. From (127) we realize that the quantities $Z(K, \Gamma^e, \Lambda)$ factorize over contributions, the mutually disjoint supports of which, to be called polymers, are the blocks connected together by the links of the disjoint trees in the forest $F_2$. So they take the form

$$A(K, \Gamma^e, Y) = \sum_{\text{trees } \{T_1^a\} =: T_1}\ \prod_{l \in T_1} \varepsilon_l \int_0^1 dh_l \prod_{l \notin T_1} (\eta_l + \varepsilon_l\, h^{T_1}_l(h)) \times \sum_{\substack{\text{trees } T_2 \text{ on } Y, \\ T_2 \supset T_1}}\ \prod_{l \in T_2 - T_1} \int_0^1 dh_l \prod_{l \in T_2 - T_1} \frac{d}{dx_l}\, A(K(\{h^{T_2 - T_1}\}), Y). \tag{129}$$

The trees $T_1^a$ join together the connected subsets of $Y \cap \Gamma^e_a$; their union, called $T_1$ (which in fact is a forest), becomes a subset of a single tree when adding the links from $T_2 - T_1$. The trees $T_2$ connect together all of the polymer $Y$, so they have $|Y| - 1$ elements. Then (similarly as above (122)) $A(K(\{h^{T_2 - T_1}\}), Y)$ is a functional integral with interpolated kernels $K(\{h^{T_2 - T_1}\})$. These kernels are defined by $K(\{h^{T_2 - T_1}\}) = h^{T_1, T_2}_l(h)\, K_l(x, y)$, where $h^{T_1, T_2}_l(h)$ is the inf of the $h$ parameters of the lines of $T_2 - T_1$ on the unique path in $T_2$ joining $\Delta_i$ to $\Delta_j$ for $l = (i, j)$. Now regarding more explicitly the two-point function (73) we get the following formula as result of the cluster expansion:

$$S_2^\Lambda(x, y) = \frac{\displaystyle \sum_l \prod_a Z_{\gamma_a} \sum_{\substack{q,\, Y_i^l \\ Y_i \cap Y_j = 0,\ \cup_i Y_i = \Lambda}} A^l(Y_1, x, y)\, \frac{1}{(q-1)!} \prod_{i=2}^{q} A^l(Y_i)}{\displaystyle \sum_l \prod_a Z_{\gamma_a} \sum_{\substack{q,\, Y_i^l \\ Y_i \cap Y_j = 0,\ \cup_i Y_i = \Lambda}} \frac{1}{q!} \prod_{i=1}^{q} A^l(Y_i)} \tag{130}$$

with the following explanations:

1) The amplitudes for the polymers depend on the choice $l$ of the large field region. By shorthand notation $l$ stands for the infinite series of possible choices $s, l^1, l^2, \ldots$. Correspondingly the sum $\sum_l$ stands for the infinite sum over those choices. We note already that there is no convergence problem associated with this infinite sum due to the suppression factors (120).

2) The difference between the numerator and the denominator in (130) is that in the numerator there is one external polymer depending on the source points $x$ and $y$. Note that there is no nonzero contribution in which the points $x$ and $y$ lie in two distinct polymers. This would necessitate cutting the factor $\big(\frac{1}{p^2+m^2+ig\tau\chi_\Lambda}\big)(x,y)$ into a product of two pieces¹⁰ of disjoint support, one containing $x$ and the other $y$. Such a contribution obviously vanishes. The absence of such a contribution can be traced back to the symmetry $\phi \to -\phi$ of the action (1).

¹⁰ We write $\frac{1}{p^2+m^2+ig\tau} = \frac{1}{1+\frac{1}{p^2+m^2}\,ig\tau}\,\frac{1}{p^2+m^2}$ and interpolate the kernel $1/(p^2+m^2)$, see also the proof of the theorem below.


Since by the rule of our cluster expansion each component $\gamma_a$ of the large field region is contained in exactly one polymer $Y$, we may absorb each normalization factor $Z_{\gamma_a}$ into its cluster, defining

$$
\tilde A(Y) := A(Y) \prod_{a\,/\,\gamma_a\subset Y} Z_{\gamma_a}. \tag{131}
$$

The simplest cluster is a single small field square $\Delta \subset S = \Lambda - \tilde\Gamma$.¹¹ Due to (118) we find in this case

$$
A_0(\Delta) = 1 + o(N^{-1/5}). \tag{132}
$$

Therefore it is convenient to cancel out the background of trivial single square small field clusters, hence to introduce for a polymer $Y$ the normalized amplitude

$$
a(Y) = \frac{\tilde A(Y)}{\prod_{\Delta\subset Y} A_0(\Delta)}. \tag{133}
$$

Then we obtain the usual dilute polymer representation:

$$
S_2(x,y) = \frac{\sum_l \sum_{q,\,Y_i^l:\ Y_i\cap Y_j=\emptyset} a^l(Y_1,x,y)\,\frac{1}{(q-1)!}\prod_{i=2}^{q} a^l(Y_i)}{\sum_l \sum_{q,\,Y_i^l:\ Y_i\cap Y_j=\emptyset} \frac{1}{q!}\prod_{i=1}^{q} a^l(Y_i)}. \tag{134}
$$

To get factorization we must analyze how the choice of $l$ affects the cluster amplitudes. The choice of the large field regions $\Lambda_l^n$ for fixed $n$ is a local one, which means that the constraints implied by the choice are of finite range. The sum over these choices therefore can be absorbed into the value of (redefined) factorized amplitudes. Indeed we can replace the global sums over $s, l^1, l^2, \ldots$ by local ones:

$$
\sum_l \sum_{q,\,Y_i^l:\ Y_i\cap Y_j=\emptyset} a^l(Y_1,x,y)\,\frac{1}{(q-1)!}\prod_{i=2}^{q} a^l(Y_i) = \sum_{q,\,Y_i:\ Y_i\cap Y_j=\emptyset} b(Y_1,x,y)\,\frac{1}{(q-1)!}\prod_{i=2}^{q} b(Y_i), \tag{135}
$$

$$
\sum_l \sum_{q,\,Y_i^l:\ Y_i\cap Y_j=\emptyset} \frac{1}{q!}\prod_{i=1}^{q} a^l(Y_i) = \sum_{q,\,Y_i:\ Y_i\cap Y_j=\emptyset} \frac{1}{q!}\prod_{i=1}^{q} b(Y_i) \tag{136}
$$

with the explanations:

(i) The right sum is over all sets $\{Y_1, \ldots, Y_q\}$, where the $Y_i$ are sets of $\Delta$'s, a single $\Delta$ being excluded (except if it is an external square containing one of the source points $x$ and $y$). One has the disjointness or hard core constraints $Y_i \cap Y_j = \emptyset$ for $i \neq j$.

¹¹ We assume the square not to contain the external points $x$, $y$, which may be thought of as lying far apart.

(ii) $b(Y)$ is computed from $a(Y)$ through

$$
b(Y) = {\sum}'\, a^l(Y), \tag{137}
$$

where the sum is over all assignments of large field regions included in $Y$. This sum ${\sum}'$ is submitted to constraints (as indicated): We define $\Lambda_l(Y) := \Lambda_l \cap Y = \cup_n \Lambda_l^n(Y)$, $\Lambda_l^n(Y) := \Lambda_l^n \cap Y$ and sum over the $s, l^n$-assignments within $Y$ with the following restriction: For given $Y$ any assignment for which there exists some $\Delta \in \Lambda_l^n(Y)$ with

$$
\mathrm{dist}(\Delta, (\partial Y - \partial\Lambda)) \le n\,M \tag{138}
$$

is forbidden, because otherwise our polymer would not contain the whole of the large field block $\tilde\Gamma_a$ containing $\Delta$ and associated with $\Lambda_l(Y)$. It is also evident that it does contain this block if (138) does not hold for any square from $\Lambda_l^n(Y)$. With this definition of the amplitudes $b(Y)$ we now obtain factorization:

$$
S_2(x,y) = \frac{\sum_{q,\,Y_i:\ Y_i\cap Y_j=\emptyset} b(Y_1,x,y)\,\frac{1}{(q-1)!}\prod_{i=2}^{q} b(Y_i)}{\sum_{q,\,Y_i:\ Y_i\cap Y_j=\emptyset} \frac{1}{q!}\prod_{i=1}^{q} b(Y_i)}. \tag{139}
$$

4.3. The Mayer expansion and the convergence criterion. Equation (139) now has the form required for the application of the Mayer expansion in a standard way. The hard core interaction between two clusters or polymers $X, Y$ is $V(X,Y) = 0$ if $X \cap Y = \emptyset$, and $V(X,Y) = +\infty$ if $X \cap Y \neq \emptyset$, and the disjointness constraint for the polymers can be replaced by the inclusion of an interaction $e^{-V(Y_i,Y_j)}$ between each pair of polymers. A configuration $M$ is an ordered sequence of polymers. We define $b^T(M)$ by

$$
b^T(M) = T(M)\Big(\frac{1}{q!}\prod_{i=1}^{q} b(Y_i)\Big), \tag{140}
$$

where the connectivity factor $T(M)$ is defined using connected graphs $G$ on $M$, by

$$
T(M) := \sum_{G\ \text{connected on}\ M}\ \prod_{ij\in G} \big(e^{-V(X_i,X_j)} - 1\big). \tag{141}
$$

Then we can divide by the vacuum functional to obtain

$$
S_2(x,y) = \sum_{M\ (x,y)\text{-configuration}} b^T(M), \tag{142}
$$

where $M$ is a sequence of overlapping polymers $Y_1, \ldots, Y_q$, the first of which contains the squares containing $x$ and $y$ and thus includes the factor $\big(\frac{1}{p_{\rm reg}^2+m^2+ig\tau\chi_\Lambda}\big)(x,y)$ from (73). The sufficient condition for the convergence of (142) in the thermodynamic limit is well known: It is a particular bound on the sum over all clusters containing a fixed square or point to break translation invariance [14,18,19]. We state it as

Proposition 10.

$$
\Big|\sum_{Y,\ 0\in Y} b(Y)\,e^{|Y|}\Big| \le 1/2 \tag{143}
$$

for $N$ sufficiently large, uniformly in $\Lambda$, $|Y|$ being the number of squares in $Y$.
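As a concrete illustration of the connectivity factor (141) for the hard core interaction, the following sketch enumerates all connected graphs on a small configuration. It is not part of the original argument: polymers are modelled as Python sets of lattice squares, and the names `overlaps`, `is_connected` and `connectivity_factor` are hypothetical helpers introduced here. For the hard core interaction each edge contributes $e^{-V}-1 = -1$ if the two polymers overlap and $0$ otherwise, so only graphs built from overlapping pairs survive.

```python
from itertools import combinations

def overlaps(X, Y):
    """Hard core interaction: e^{-V(X,Y)} - 1 equals -1 if X and Y share a square, else 0."""
    return len(X & Y) > 0

def is_connected(n, edges):
    """Check connectivity of a graph on vertices 0..n-1 with a simple union-find."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in edges:
        parent[find(i)] = find(j)
    return len({find(i) for i in range(n)}) == 1

def connectivity_factor(polymers):
    """Sum over connected graphs G of the product over edges of (e^{-V} - 1), cf. (141)."""
    n = len(polymers)
    # Only overlapping pairs give a nonzero edge factor (-1).
    pairs = [(i, j) for i, j in combinations(range(n), 2)
             if overlaps(polymers[i], polymers[j])]
    total = 0
    for k in range(len(pairs) + 1):
        for edges in combinations(pairs, k):
            if is_connected(n, edges):
                total += (-1) ** k
    return total

a = {(0, 0), (0, 1)}
b = {(0, 1), (1, 1)}
c = {(1, 1), (0, 0)}
d = {(5, 5)}
print(connectivity_factor([a, b]))      # two overlapping polymers
print(connectivity_factor([a, b, c]))   # three mutually overlapping polymers
print(connectivity_factor([a, d]))      # disjoint polymers
```

A configuration containing a polymer disjoint from all others has vanishing connectivity factor, which is the sense in which only sequences of overlapping polymers contribute to (142).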


The fixed point is chosen to be 0 without restriction. For $N$ large enough, (143) in fact holds if one replaces the number $e$ in (143) by any other constant. To deduce convergence of (142) under condition (143) one has to reorganize the connectivity factor $T(M)$ according to a tree formula. We can use again the basic forest formula (122) to obtain a symmetric sum over all trees. We define

$$
v_{ij} = \big(e^{-V(X_i,X_j)} - 1\big) \quad \text{for } i \neq j. \tag{144}
$$

We call $P$ the set of pairs $1 \le i < j \le n$. Expanding $\prod_{(ij)\in P}(1 + v_{ij})$ with (122) we get another forest formula, on which we can read the connectivity factor

$$
T(M) = \sum_T \prod_{l\in T}\Big[\int_0^1 dh_l\; v_{i_l j_l}\Big] \prod_{(ij)\notin T} \big(1 + h^T(i,j)\,v_{ij}\big), \tag{145}
$$

where $h^T(i,j)$ is the inf of all parameters in the unique path in the tree $T$ joining $i$ to $j$. This formula is then used, e.g. as in [14,18,19], to derive the convergence of (142). Note that again every tree coefficient forces the necessary overlaps and is bounded by 1.

It remains to prove Proposition 10. We do not give a first principles proof here, but we do show how to sufficiently control those contributions to the polymer amplitudes which do not appear in analogous form in e.g. UV-regularized massive $\varphi^4$-theory, since the latter is clearly exposed in many reviews and textbooks, e.g. [14, 19, 22]. Cluster expansion techniques are nowadays applied to much more complicated situations than this, recently also with accent on a clear and systematic presentation [20, 21]. The aspects not to be encountered in a $\varphi^4$-treatment are analyzed in Sect. 4.4. Here we reduce the proof to certain bounds on functional derivatives generated by the links of the second tree $T_2 - T_1$ in (129). Because the amplitude $b(Y)$ is given by a tree formula we will sum over all squares in $Y$ by following the natural ordering of the tree, from the leaves towards the root, i.e. the particular square containing 0. The factorial of the Cayley theorem counting the number of (unordered) trees is compensated in the usual way by the symmetry factor $1/|Y|!$ that one naturally gets when summing over all positions of labeled squares [14, 19]. Then the only requirements to complete the proof of (143) are

(i) Summable decay of the factor associated to each tree link. This is obvious for the $\varepsilon_{ij}$ links of $T_1$, because these extend only over neighbours, so have bounded range. For the tree links of $T_2 - T_1$, it follows from the decay of the corresponding kernels (125), see Lemmas 1, 3.

(ii) A small factor for each tree link, or equivalently for each square of $Y$. This will compensate in particular for the combinatorial factors to choose which term of the action to act on by the derivatives, etc. For tree links of $T_1$ this small factor comes from the one associated to each of the large field squares, hence from Lemma 9. Once a square is chosen large field we still have the choices $l^1, \ldots, l^{n_0}$, the value of $n_0$ depending on the distance of the square from the boundary of $Y$. The sum over the $n$-values converges (rapidly) due to (120). For the tree links of $T_2 - T_1$ the small factor comes from the negative powers of $N$ generated at the ends of these links ("vertices"). These small factors are described in more detail in the next section. Note that all types of small factors tend to zero as $N \to \infty$.

We note that the small factor per square should be there on taking into account the bound on the action as a net effect. Equation (113) was derived before performing the cluster expansion. Does it still hold once the interpolation parameters and support restrictions are introduced? It does indeed, because support restrictions do not cause


any harm in the reasoning of Sect. 3, because all interpolated kernels are bounded in modulus by the modulus of their noninterpolated versions (see (128)), and because the interpolated versions of the operator $A$ still have real spectrum. Then one easily realizes that all statements go through as before, in particular the proofs of Proposition 2 and of Lemma 7. A slightly more serious modification of the action is caused by the use of the Cauchy formula below; it will be controlled by Lemmas 12 and 13.

4.4. The outcome of the derivatives. With the tools previously developed we now want to show the existence of the correlation functions in the thermodynamic limit. We have at our disposal exponentially decaying kernels, a suitable stability bound on the action (Proposition 8), and we have arranged things such that derivatives will produce a small factor corresponding to the small coupling. As compared to a treatment of UV-regularized $\varphi^4$, the main new features to be analysed are the following:

a) The action is nonlocal, and the covariance is interpolated twice.
b) There is a small/large field split, and thus small factors per derivative appear in various different forms.
c) The action is nonpolynomial, which implies in particular that terms descended from the action by derivation may be rederived arbitrarily often.

The amplitudes of the polymers $Y$ are given as sums over trees (129) which are the factorized contributions coming from the forest formula (122). When performing the $h$-derivatives those may either apply to $d\mu_\gamma(Y)$ or to

$$
\Big(\frac{1}{p_{\rm reg}^2+m^2+ig\tau\chi_Y}\Big)(x,y)\ {\det}_3^{-N/2}(1+iA_s)\ {\det}_2^{-N/2}\Big(1+\frac{1}{1+iA_s}\,iA''\Big)\ e^{\frac12(\tau,\,\delta C_\gamma\tau)}. \tag{146}
$$

Here we went back to (73) (remembering that the term $e^{-R\int_{\mathbb{R}^2-\Lambda}\tau^2}$ is now absent, cf. Proposition 8). In (146) the kernels from (125), which appear in $C_\gamma$ and the action, are to be replaced by their $h$-dependent versions. We write shortly $K(h)$ for $K(\{h_{T_2-T_1}\})$ and have (see (128), (129)...)

$$
K(h)(x,y) = \chi_Y(x)\,h_l^{T_1,T_2}(h)\,K(x,y)\,\chi_Y(y). \tag{147}
$$

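Application of derivatives with respect to $d\mu_\gamma$ is evaluated by partial integration ([13], Chap. 9):

$$
\partial_{h_i}\int d\mu_\gamma^Y(h,\tau)\,\ldots = \int d\mu_\gamma^Y(h,\tau)\int_{x,y}\big(\partial_{h_i}C_\gamma(h)\big)(x-y)\,\frac{\delta}{\delta\tau(x)}\,\frac{\delta}{\delta\tau(y)}\,\ldots\,. \tag{148}
$$

In $C_\gamma$ the kernels $S = \frac{1}{\sqrt{1+\pi}}$ are interpolated. Thus $\partial_{h_i}C_\gamma(h)$ is of the form

$$
\partial_{h_i}C_\gamma(h) = \big(\partial_{h_i}S(h)\big)\,\hat C_\gamma\,S(h) + S(h)\,\hat C_\gamma\,\partial_{h_i}S(h). \tag{149}
$$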

The supports of the derived kernels, i.e. $\partial_{h_i}S(h)$, are by construction restricted to the two squares linked by the $h_i$ derivation [18], which adds a link to the previous tree. Therefore the $\tau$ functional derivatives are either directly localized in these squares (in the case where $\partial_{h_i}$ applies to the first (second) kernel $S(h)$ in $C_\gamma$, and we consider the $\frac{\delta}{\delta\tau}$ derivative on the left (right)), or they are only essentially localized (when e.g. $\partial_{h_i}$ applies to the first (second) kernel $S(h)$ in $C_\gamma$, and we consider the $\frac{\delta}{\delta\tau}$ derivative on the right (left)). In the last case this means that the $\frac{\delta}{\delta\tau}$ functional derivative is linked to its localization square via the second (underived) kernel $S(h)$, which is supported over the polymer in question, see (128), (147). It has exponential decay, so the links to squares distant from the localization square rapidly decrease with distance. Summing over them gives an additional factor $\sim 1/m^2$. Since this tolerable deterioration of the bound per derivative is the only effect of essential localization, we may forget about this difference from now on.

The $(T_2 - T_1)$-$h$-derivatives can apply also to the terms in (107). To roughly keep track of the combinatorial factors involved we note that any $h$-derivative may apply to any kernel in (146) ($\sim 10$ terms). If it applies to the measure there appear two terms with two functional derivatives which again may apply to the action ($\sim 40$ terms). Still one should note that the effect of these combinatorics is not very important, since going through the terms in detail (which we shall not do too explicitly) reveals that most of them give much smaller (in $N$) contributions than the dominating ones. This is also true for the sum over the $l$-assignments: Large field contributions, in particular for $n > 1$, are tiny corrections due to (120). Therefore e.g. all the contributions coming from the terms in $\delta C_\gamma$ are unimportant: They are only present when $\Lambda_l \subset \gamma$ is not empty. There is one more source of combinatoric increase of the number of terms, namely the fact that the derivatives may also act on terms produced by previous derivatives. For the polynomial part of the action this may only happen a few times. But it needs special discussion when regarding the determinants. So we will now go through the various contributions and comment on how the derivatives act on them.

We can be short about $\delta C_\gamma$: In all terms we have the kernels $\sqrt{1+\pi}$, which fall off as $\exp(-2m|x-y|)$. The contributions are listed in (70). When applying an $h$-derivative to $\delta C_1(\gamma)$ the small factor in $N$ comes from $\mathrm{dist}(\gamma, \Lambda - \Gamma) \ge \ln N/m$. Due to the fall-off this gives a factor $\sim N^{-2}$. We may then e.g. write in the bound for the kernel

$$
\exp(-2m|x-y|) = \exp\big(-\tfrac{5m}{4}|x-y|\big)\,\exp\big(-\tfrac{3m}{4}|x-y|\big) \tag{150}
$$

and keep the first factor as a kernel with exponential fall-off and bound the second by $N^{-3/4}$ using the support restrictions. This is then the small factor per derivative. Note that we could also do without extracting this factor from (150), extracting it as a part of (120) instead. The same splitting (150) can be applied when the $h$-derivatives act on $\delta C_2(\gamma)$. For $\delta C_4(\gamma)$ we may invoke support restrictions to extract $N^{-3/4}$ as above; additionally we get a factor of $\varepsilon \sim N^{-2/5}$. The term $\delta C_3(\gamma)$ no longer contributes due to the limit $R \to \infty$. The same mechanism also produces the small factors when we apply the functional derivatives $\delta/\delta\tau$ instead of $h$-derivatives. Remember the above remarks concerning essential localization.

By the derivatives we also produce $\tau$-fields (essentially) localized in some square $\Delta$ (two fields per $h$-derivative, one per $\delta/\delta\tau$ derivative). If the square $\Delta$ is in $\Lambda_s$, we have the choice to perform Gaussian integration or to bound the contribution directly using (46),

$$
\Big|\ldots\int_\Delta K_1(z-x)\,\tau(x)\,K_2(x-y)\ldots\Big| \le O(1)\,N^{1/12}\,\Big|\ldots\sup_{x\in\Delta}|K_1(z-x)\,K_2(x-y)|\ldots\Big|. \tag{151}
$$

This is maybe the simplest way of doing it. Note that in this case we can still keep aside a factor of $N^{-3/4+2/12} < N^{-1/2}$ per $h$-derivative. If the square is in $\Lambda_l^n$, the bound is achieved using (44), (45) and (120). The above-mentioned rederivation of derived terms allows one to apply (at most) two $\delta/\delta\tau$ on an $h$-derived term, so that the factor has to be distributed over three derivatives, leaving in this worst case $N^{-1/6}$ per derivative (without invoking large field suppression).

Maybe we should also mention shortly the well-known and well-solved local factorial problem. There is the possibility that a large number of $\tau$-fields accumulate in a single square $\Delta$, even when regarding only the polynomial part of the action, namely if the tree in question has a large coordination number $d$ at that square: There are $d$ links of the type $l_{i,j_\nu}$, $\nu = 1, \ldots, d$ in the tree, $i$ referring to $\Delta$. Then bounding the at most $2d$ $\tau$-fields in $\Delta \subset \Lambda_l$ using (120) (and the Schwarz inequality) gives

$$
\Big[\int_\Delta \tau^2\Big]^d\,e^{-\frac14\int_\Delta \tau^2} \le 4^d\,d!\,. \tag{152}
$$
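The bound (152) rests on the elementary inequality $x^d e^{-x/4} \le (4d/e)^d \le 4^d\,d!$ for $x \ge 0$, with $x$ playing the role of $\int_\Delta \tau^2$. The short numerical check below is only a sketch of this step; the function name `lhs_max` is a hypothetical helper, not notation from the paper.

```python
import math

def lhs_max(d):
    """Maximum over x >= 0 of x**d * exp(-x/4); the maximum is attained at x = 4*d."""
    x = 4.0 * d
    return x ** d * math.exp(-x / 4.0)

for d in range(1, 11):
    bound = 4.0 ** d * math.factorial(d)
    # The ratio equals (d/e)**d / d!, which stays below 1.
    print(d, lhs_max(d) / bound)
```

The factorial growth of the right-hand side in $d$ is what the text refers to as the local factorial problem.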

This is not tolerable as a bound when aiming to prove (143), but the solution is in the fact that most of the $d$ squares associated to the links $l_{i,j_\nu}$ have to be at a large distance from $\Delta$ for large $d$. Extracting a small fraction $\eta$ of the kernel decay we can isolate a factor associated to $d \gg 1$, which is much smaller than $\frac{1}{d!}$.¹² For a more thorough discussion of the point see [14,19] or also [1].

Now we regard the Fredholm determinants. As compared to [1] we have to regard an inverted determinant. This is related to the fact that we regard a bosonic model, and it means that the sign cancellations appearing as a consequence of the Pauli principle, which sometimes improve the convergence properties, are absent. The inverted determinants are raised to the power $N/2$. For brevity we will change the notation for the rest of this section and suppress this power, assuming instead the operators $A_s, \ldots$ to act in $\bigoplus_{k=1}^{N/2} L^2(\Lambda)$. We assume $N$ to be even, otherwise we still would have to carry around a power $1/2$ (without consequence). This change entails that we absorb a factor of $N/2$ in $\mathrm{Tr}$ as well. We rewrite the product of the two Fredholm determinants appearing in terms of a single one. This is possible, since the interpolation acts equally on all $A$-operators. We have

$$
{\det}_3^{-1}(1+iA_s)\ {\det}_2^{-1}\Big(1+\frac{1}{1+iA_s}\,iA''\Big) = {\det}^{-1}(1+iA)\ e^{\mathrm{Tr}\{iA_s - \frac12(iA_s)^2 + \frac{1}{1+iA_s}\,iA''\}}. \tag{153}
$$

Since the $\mathrm{Tr}$ of $A''$ multiplied by any power of $A_s$ vanishes, whereas $\mathrm{Tr}(A_s + A'') = \mathrm{Tr}\,A$, we may rewrite (153) as

$$
{\det}_2^{-1}(1+iA)\ \exp \mathrm{Tr}\{-\tfrac12(iA_s)^2\}. \tag{154}
$$
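The algebra behind (153) can be checked in finite dimensions. The sketch below uses $2\times 2$ matrices and assumes the standard conventions $\det_2(1+X) = \det(1+X)\,e^{-\mathrm{Tr}X}$ and $\det_3(1+X) = \det(1+X)\,e^{-\mathrm{Tr}X + \frac12\mathrm{Tr}X^2}$ for the regularized determinants. The test matrices and all helper names are illustrative assumptions and carry none of the structure of the operators in the paper. The further step from (153) to (154) uses the vanishing traces stated in the text and is not reproduced here.

```python
import cmath

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def tr(a):
    return a[0][0] + a[1][1]

def inv(a):
    d = det(a)
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]

ONE = [[1, 0], [0, 1]]

def det2(x):
    """det_2(1+X) = det(1+X) exp(-Tr X) (assumed standard convention)."""
    return det(add(ONE, x)) * cmath.exp(-tr(x))

def det3(x):
    """det_3(1+X) = det(1+X) exp(-Tr X + Tr X^2 / 2) (assumed standard convention)."""
    return det(add(ONE, x)) * cmath.exp(-tr(x) + tr(mul(x, x)) / 2)

# Arbitrary real symmetric matrices standing in for A_s and A''.
A_s = [[0.3, 0.1], [0.1, -0.2]]
A_pp = [[0.05, -0.4], [-0.4, 0.25]]

iA_s = scale(1j, A_s)
iA_pp = scale(1j, A_pp)
Y = mul(inv(add(ONE, iA_s)), iA_pp)      # (1 + iA_s)^{-1} iA''
A = add(A_s, A_pp)                       # A = A_s + A''

lhs = 1 / det3(iA_s) / det2(Y)
rhs = (1 / det(add(ONE, scale(1j, A)))) * cmath.exp(
    tr(iA_s) - tr(mul(iA_s, iA_s)) / 2 + tr(Y))
print(abs(lhs - rhs))   # difference at the level of floating point rounding
```

The identity only uses $\det(1+iA_s)\det\big(1+(1+iA_s)^{-1}iA''\big) = \det(1+iA_s+iA'')$ together with the definitions of the regularized determinants.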

The cluster derivatives acting on (154) will be evaluated as Cauchy integrals over suitable (large) contours. Similar reasoning has been used by Iagolnitzer and Magnen [23] in a renormalization group analysis of the Edwards model and earlier by Spencer in the analysis of the decay of Bethe-Salpeter kernels [24]. To obtain useful bounds using this method requires that the derivatives $\partial_{h_l}A$ are always small in norm. At this stage we therefore really need the whole cascade of large field splittings from the previous chapter. We have

¹² It is of order $e^{-\delta d^{3/2}}$.


Lemma 11. Let $l \in T_2 - T_1$ be a link of the cluster expansion joining two squares $\Delta, \Delta'$ such that $\partial_{h_l}A = P_{\Delta'}\,A\,P_\Delta$. Then we have

$$
\|\partial_{h_l}A\| \le O(1)\,N^{-5/12}\exp\{-m\,d_l\}, \tag{155}
$$

if $\Delta$ is a small field square. Here we set $d_l = \mathrm{dist}(\Delta, \Delta')$. If $\Delta$ is a large field square in $\Lambda_l^n$, we find

$$
\|\partial_{h_l}A\| \le O(1)\,N^{-1/2}\Big(\int_\Delta \tau^2\Big)^{1/2}\exp\{-m\,d_l\} \le O(1)\,N^{\frac{n}{12}-\frac12-2n}. \tag{156}
$$

Proof. The result is obtained in the same way as when proving Proposition 2, if $\Delta$ is a small field square. If $\Delta$ is in $\Lambda_l^n$, the distance between the squares is by our expansion rules larger than $2n\,\frac{\ln N}{m}$, which assures (156) through the decay of $\frac{1}{p^2+m^2}$ (remember in particular (44), (45), (62)).

For brevity of notation we introduce

$$
{\det}^{-1}(1+Q) := {\det}^{-1}(1+iA) \tag{157}
$$

and first describe how the derivatives act on (157) instead of (154). Namely we write

$$
\partial_{h_1}\ldots\partial_{h_n}\,{\det}^{-1}(1+Q) = \Big[\partial_{\alpha_1}\ldots\partial_{\alpha_n}\,{\det}^{-1}\big(1+Q+\alpha_1\partial_{h_1}Q+\ldots+\alpha_n\partial_{h_n}Q\big)\Big]_{\alpha_1,\ldots,\alpha_n=0}. \tag{158}
$$

We evaluate (158) by means of a Cauchy formula for the $n$ independent complex variables $\alpha_i$. The idea is to regain the small factor per derivative and the distance decay by choosing the $\alpha$-parameters sufficiently large. We note first that $\det^{-1}(1+Q+\alpha_1\partial_{h_1}Q+\ldots+\alpha_n\partial_{h_n}Q)$ is analytic in the $\alpha$-parameters, see Simon [11], as long as $1+Q+\alpha_1\partial_{h_1}Q+\ldots+\alpha_n\partial_{h_n}Q$ has no 0 eigenvalues. This restricts the maximal size of the $|\alpha_i|$. We choose the size of the $\alpha_l$-parameter corresponding to the link $l$ as follows:

$$
R_l := |\alpha_l| = N^{\frac16}\,e^{\frac{9m}{10}d_l}. \tag{159}
$$

We now find

Lemma 12. If the $\alpha_l$ are chosen according to (159) then

$$
\Big\|\sum_l \alpha_l\,\partial_{h_l}A\Big\| \le O(1)\,N^{-1/4}. \tag{160}
$$

Proof. For the individual entries in the sum the bound follows on inspection. If the supports of the links (i.e. the pairs $\Delta, \Delta'$) are mutually disjoint it stays true, since then the $\partial_{h_l}A$ are mutually orthogonal. If they are not, we again employ the argument (see above (152)) that in this case the links corresponding to a large coordination number $d$ in the tree have to grow longer and longer. In this case the sum may be performed using the remnant decay $e^{-\frac{m}{10}d_l}$.
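The statement that the bound "follows on inspection" for a single link amounts to adding exponents: the radius (159) times the small field bound (155) gives $N^{1/6}\,N^{-5/12} = N^{-1/4}$, and the decay rates $\frac{9m}{10} - m$ leave the remnant $e^{-\frac{m}{10}d_l}$. The following sketch does this bookkeeping in exact rational arithmetic; the variable names are hypothetical.

```python
from fractions import Fraction as F

# Exponent of N and coefficient of m*d_l in the radius R_l of (159).
radius_N, radius_decay = F(1, 6), F(9, 10)
# Exponent of N and coefficient of m*d_l in the small field bound (155).
bound_N, bound_decay = F(-5, 12), F(-1)

# |alpha_l| * ||d_{h_l} A|| scales like N**product_N * exp(product_decay * m * d_l).
product_N = radius_N + bound_N
product_decay = radius_decay + bound_decay
print(product_N, product_decay)
```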


Remark. When proving the exponential decay of the two-point function at the end of the paper we would like to have exponential decay with mass $m$ up to corrections small with $N$ (without invoking the analyticity improvement due to the UV cutoff). It may then be necessary to use the full decay for at most two links¹³ among those appearing at a branch point of the respective tree (see below, proof of Theorem). Obviously this does not change the norm bound (160) at all, since we may bound the sum in the same way keeping aside a fraction of the decay for $d - 2$ links only.

So we now evaluate (158) through

$$
\big|\partial_{h_1}\ldots\partial_{h_n}\,{\det}^{-1}(1+Q)\big| = \Big|\Big(\frac{1}{2\pi i}\Big)^n\int_{R_1\ldots R_n}\frac{1}{\alpha_1^2\ldots\alpha_n^2}\,{\det}^{-1}\Big(1+Q+\sum_l\alpha_l\,\partial_{h_l}Q\Big)\Big| \le \Big(\frac{1}{2\pi}\Big)^n\frac{1}{R_1\cdots R_n}\,\sup_\alpha\Big|{\det}^{-1}\Big(1+Q+\sum_l\alpha_l\,\partial_{h_l}Q\Big)\Big|, \tag{161}
$$

where the sup is to be taken over the $\alpha_l$-parameters on the circles $R_l$. Thus we obtain indeed per derivative a factor

$$
\frac{1}{2\pi}\,N^{-1/6}\,e^{-\frac{9m}{10}d_l}. \tag{162}
$$
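The mechanism used here is the Cauchy estimate: for $f$ analytic on $|\alpha| \le R$ one has $|f'(0)| \le \sup_{|\alpha|=R}|f|/R$, so a large admissible radius yields a small factor per derivative. The toy example below is a sketch with a scalar stand-in $f(\alpha) = 1/(1+q+\alpha d)$ for the inverse determinant. The numbers `q`, `d`, `R` and the helper names are illustrative assumptions, not quantities from the paper.

```python
import cmath
import math

q, d = 0.3, 0.01          # scalar stand-ins for Q and for a small derivative of Q

def f(alpha):
    """Toy analogue of det^{-1}(1 + Q + alpha * dQ); analytic for |alpha| < (1+q)/d."""
    return 1.0 / (1.0 + q + alpha * d)

def cauchy_derivative(R, n=2000):
    """f'(0) from (1/(2 pi i)) times the contour integral of f(a)/a**2 over |a| = R."""
    total = 0.0
    for k in range(n):
        a = R * cmath.exp(2j * math.pi * k / n)
        # With a = R e^{i theta} and da = i a dtheta the integrand reduces to f(a)/a.
        total += f(a) / a
    return total / n

def sup_on_circle(R, n=2000):
    return max(abs(f(R * cmath.exp(2j * math.pi * k / n))) for k in range(n))

R = 50.0                          # inside the domain of analyticity, (1+q)/d = 130
exact = -d / (1.0 + q) ** 2       # derivative computed directly
print(abs(cauchy_derivative(R) - exact))     # contour integral reproduces f'(0)
print(abs(exact), sup_on_circle(R) / R)      # Cauchy estimate |f'(0)| <= sup|f| / R
```

The radius cannot exceed the distance to the nearest singularity, which mirrors the requirement in the text that $1+Q+\sum_l\alpha_l\partial_{h_l}Q$ has no zero eigenvalue.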

Before ending the discussion of how to evaluate derivatives acting on det we mention how we treat the $\delta/\delta\tau_m$-derivatives. In this case we choose (in modification of (159))

$$
R_m(\tau) = N^{1/4}. \tag{163}
$$

We thus collect a smaller factor in $N$ from the $\tau$-derivative because $\delta/\delta\tau$ annihilates a possibly large $\tau$-factor; on the other hand we do not get a distance decay factor and need not do so, because it is already present in the term $\partial_{h_l}S(h)$ which accompanies $\delta/\delta\tau$ (see (149)).

Of course it remains to give suitable bounds on the Fredholm determinants modified by the $\alpha$-parameters. We have to remember that our true object of interest is not $\det^{-1}(1+Q)$ but rather the subtracted determinant (154). First we note that we may still evaluate the derivatives acting on (154) by introducing $\alpha$-parameters, on replacing as before, for a given choice of $h$- and $\tau$-derivatives,

$$
A \to A_\alpha := A + \sum_l \alpha_l\,\partial_{h_l}A + \sum_m \alpha_m\,\delta_{\tau_m}A \quad \text{and similarly for } A_s,\ A''. \tag{164}
$$

So after bounding the Cauchy integrals we have to bound

$$
\sup_\alpha\Big|{\det}_2^{-1}(1+iA_\alpha)\,\exp\{\tfrac12\,\mathrm{Tr}(A_{\alpha s}^2)\}\Big|. \tag{165}
$$

The task is to reproduce the bounds on the action from Sect. 3 on replacing $A \to A_\alpha$. Inspection shows that the proofs of Proposition 2, Lemmas 6, 7 and part of Proposition 8 (as far as (117) is concerned) have to be redone with this modification on $A$. We collect our findings in

¹³ if these links are indispensable to join via the tree the squares containing the points $x$ and $y$ in the external polymer $A(Y, x, y)$.


Lemma 13. We assume that the kernels $A$ are restricted to a given polymer $Y \subset \Lambda$ of the cluster expansion. Then we have

a) $\|A_{\alpha s}\| \le O(1)\,N^{-1/4}$ (replacing (48)),
b) $\big|\det^{-1}\big(1+\frac{1}{1+iA_{\alpha s}}\,iA''_\alpha\big)\big| \le \exp\{O(1)\,N^{-1/4}\int_Y(\tau^2+1)\}$ (replacing (109)),
c) $|\mathrm{Tr}(A_{\alpha s}^3)| \le N^{-1/4}\int_{Y\cap\Lambda_s}\tau^2$ (replacing (117)).

Remark. Note again that due to our change of notation a factor of $N/2$ has been absorbed in $\mathrm{Tr}$ together with a corresponding change in det.

Proof. The proof of a) is trivial from Proposition 2 and Lemma 12. As for b) we have to go again through the considerations leading from (95) to (109). Since the reasoning is analogous, we will be rather brief. Introducing the quantities $B_\alpha, D_\alpha$ as we did for $A_\alpha$ we find that (95) to (97) stay true also for complex $\alpha$. The essential modification occurs in (101), (102): since the $\alpha_i$ are complex we find

$$
i(\varphi, A''_\alpha\varphi) - i\,\overline{(\varphi, A''_\alpha\varphi)} = -2\sum_{\alpha_l}\mathrm{Im}\,\alpha_l\,(\varphi, \partial_{h_l}A''\varphi) - 2\sum_{\alpha_m}\mathrm{Im}\,\alpha_m\,(\varphi, \delta_{\tau_m}A''\varphi) \tag{166}
$$

instead of 0 for $\alpha \equiv 0$. Correspondingly we have to modify (102). The norm bound (160) then still implies

$$
D_\alpha \le O(1)\,N^{-1/4}, \tag{167}
$$

which is weaker than (105) but sufficient for us. In evaluating $\mathrm{Tr}\,D_{\alpha-}^2$ we take into account the additional contribution too. Since now

$$
0 \le D_{\alpha-} \le (2+\eta)\,|\delta A''| + 2\sum_l R_l\,|\partial_{h_l}A''| + 2\sum_m R_m(\tau)\,|\delta_{\tau_m}A''|, \quad \eta \ll 1, \tag{168}
$$

it is straightforward to realize that we may bound $\mathrm{Tr}\,D_{\alpha-}^2$ by

$$
\mathrm{Tr}\,D_{\alpha-}^2 \le O(1)\,N^{-1/4}\int_{\Lambda\cap Y}(\tau^2+1). \tag{169}
$$

The first contribution is obtained similarly as in (108); it is quadratic in $\tau$, but we can keep aside a small factor. The additional contribution is proportional to the number of squares touched by $\delta_\tau$-derivatives ($\le |Y|$), thus it is independent of the size of $\tau$. This ends the proof of b).

c) The proof is as in (116), (117).

From Lemma 13 we now find that (113) (on restriction to $Y \subset \Lambda$ and on using interpolated kernels) is to be replaced by

$$
Z_\gamma^Y\,G^Y_{\gamma,\alpha}(\tau) \le e^{-\frac{49}{100}\int_{\Lambda_l\cap Y}\tau^2}\ e^{O(1)\,N^{-1/4}\int_{\Lambda_s\cap Y}(\tau^2+1)}. \tag{170}
$$

So the large field suppression stays unaltered and in the bound on the polymer amplitudes there is at most a factor of $\sim 1 + O(N^{-1/12})$ per small field square from the action to beat (we could tolerate $O(1)$).


Here we may end our discussion on the outcome of the derivatives. We have shown that we have a small factor $\sim N^{-1/6}$ per derivative and a factor of $e^{-N^{1/8}}$ per large field square. All links are through kernels decaying exponentially with mass $> m$. This is sufficient to beat the factors $O(1)$ per square from the combinatoric choices¹⁴ and from the action. We pointed out that this is sufficient for the proof of Proposition 10.

4.5. Exponential decay of the correlation functions. Now that we have proven the existence of $S_2(x,y)$ in the TD limit we want to proceed to the announced result on its exponential decay.

Theorem. For $N \gg 1$ sufficiently large the infinite volume two-point function decays exponentially,

$$
|S_2(x,y)| \le O(1)\,e^{-m'|x-y|} \tag{171}
$$

with

$$
m' = m\,\big(1 + o(N^{-1/10})\big). \tag{172}
$$

Remark. $O(1)$ is an $N$-independent positive number. The estimate on the exponent of $N$ in (172) is of course not optimal. The proof goes through without much change also for any $2n$-point function. Using the effects of the UV-cutoff we could replace $m'$ by $m$.

Proof. The reasoning is very similar to that of [1] though somewhat simpler. The point is now to realize that the convergence proof still works when we put aside the decay factor appearing in (171). We may assume $x$ and $y$ far apart. They both have to be contained in the same polymer $A(Y, x, y)$, and we have to extract the decay factor when calculating its amplitude. More specifically we shall extract it from the sum over trees $T_2$ in (129), where we first only deal with those trees $T$ for which $T_1$ is empty, namely we first assume that $Y$ does not contain large field squares, which is the dominant contribution. Obviously the decay is associated with the factor

$$
\frac{1}{p^2+m^2+ig\tau}(x,y) = \frac{1}{1+\frac{1}{p^2+m^2}\,ig\tau}\ \frac{1}{p^2+m^2}(x,y) \tag{173}
$$

which appears in the external polymer. The kernel $\frac{1}{p^2+m^2}$ is interpolated and thus in particular restricted in support to $Y$. Let $\Delta_1$ and $\Delta_2$ be the squares in $Y$ containing $x$ and $y$. For given tree $T$ there is a unique path in $T$ connecting $\Delta_1$ and $\Delta_2$. We call it $T'$, noting that $T'$ is a tree with coordination numbers $d_i = 2$, apart from the ends, where they equal 1. Its complement in $T$ will be called $T''$. It has several connected components in general. Each of these connected components may be viewed as being rooted at some square (attached to links) from $T'$. Keeping these squares fixed for the moment and summing over the positions of the other squares in the various connected components of $T''$ then provides us for these connected components with the usual polymer bound (Proposition 10) sufficient for convergence. It remains to sum over the positions of the squares in $T'$ apart from $\Delta_1$ and $\Delta_2$, which are sitting on the ends. For given positions of those squares we may isolate a factor of

$$
\varepsilon^{|T'|}\prod_{l'\in T'} K_{l'}(x_{l'}, y_{l'}). \tag{174}
$$

¹⁴ where we mentioned already that just taking the maximal value gives a crude bound since most terms are much smaller than the leading ones


Here $\varepsilon \sim o(N^{-1/10})$ is part of the small factor per small field derivative, the other part being used to beat the combinatoric constants etc., see above. The kernels $K_{l'}(x_{l'}, y_{l'})$ are those generated by the derivatives of the expansion. They all fall off exponentially with at least the rate of $\frac{1}{p^2+m^2}$, so they all may be bounded by the modulus of $[\frac{1}{p^2+m^2}](x_{l'}, y_{l'})$ up to a constant $\sim O(1)$, which we absorb in $\varepsilon$. The coordinates $(x_{l'}, y_{l'})$ are situated in the two squares linked by $l' \in T'$ and are to be integrated over those squares.¹⁵ In Lemma 1 we showed that the kernel of $\frac{1}{p^2+m^2}$ is pointwise positive. From this we then obtain easily that (174), when integrated over the intermediate squares and summed over their positions, is bounded by

$$
\varepsilon^{|T'|}\Big(\frac{1}{p^2+m^2}\Big)^{|T'|}(x,y) \tag{175}
$$

(up to a constant $\sim O(1)$, which we absorb in $\varepsilon$). Note that having split up the tree $T$ does not change the way in which the sum over the trees is performed. We succeeded in extracting the factor (175) due to the fact that two squares in the external polymer are fixed instead of only one as in Proposition 10. When summing over all possible values of $|Y|$ and using the polymer bound (143) we now obtain a bound of the form

$$
|S_2(x,y)| \le O(1)\,\frac{1}{p^2+m^2}\Big[1+\sum_{|T'|}\varepsilon^{|T'|}\Big(\frac{1}{p^2+m^2}\Big)^{|T'|}\Big](x,y). \tag{176}
$$

Here the first term is the contribution for $|Y| = 2$ and the single $h$-derivative applying to the second factor in (173). This is the only case where it does not produce a factor $\le \varepsilon$. Performing the geometric series in (176) now proves (171) on using

$$
\frac{1}{p^2+m^2-\varepsilon}(x,y) \le O(1)\,\exp\{-(m-\varepsilon/m)\,|x-y|\}. \tag{177}
$$
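In momentum space the geometric series in (176) can be summed explicitly, $\frac{1}{p^2+m^2}\sum_{n\ge 0}\big(\frac{\varepsilon}{p^2+m^2}\big)^n = \frac{1}{p^2+m^2-\varepsilon}$ for $\varepsilon < m^2$. The pole of the resummed propagator sits at mass $\sqrt{m^2-\varepsilon}$, which is at least $m - \varepsilon/m$, the rate appearing in (177). The following sketch checks both statements numerically; the values of `m`, `eps` and `p` are arbitrary illustrative choices and `resummed` is a hypothetical helper.

```python
import math

m, eps = 1.0, 0.05          # illustrative mass and small factor, with eps < m**2

def resummed(p, terms=200):
    """Partial sum of (1/(p^2+m^2)) * sum_n (eps/(p^2+m^2))**n."""
    g = 1.0 / (p * p + m * m)
    return g * sum((eps * g) ** n for n in range(terms))

for p in (0.0, 0.5, 2.0):
    closed_form = 1.0 / (p * p + m * m - eps)
    print(p, abs(resummed(p) - closed_form))

# Mass of the resummed propagator compared with the rate used in (177).
print(math.sqrt(m * m - eps), m - eps / m)
```

The comparison of the two masses follows from $\sqrt{1-t} \ge 1-t$ for $0 \le t \le 1$, applied with $t = \varepsilon/m^2$.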

Finally we have to make sure that large field contributions do not spoil our estimate. For this it suffices to note that in the large field region we have at our disposal a factor of $\le \exp(-N^{1/8})$ per square of $\tilde\Gamma$, half of which may be put aside for each square of $\tilde\Gamma_i$ on which ends some $l' \in T'$. Then we only have to note that this factor is much smaller than the factor of $\varepsilon$ which we lose instead, and that the links within $\tilde\Gamma_i$ are of short range.

Acknowledgement. The author is indebted to Jacques Magnen and Vincent Rivasseau for many helpful remarks. In particular the paper was initiated through several discussions with Jacques Magnen. The important reference [25] was pointed out to me by K. Gawedzki.

¹⁵ apart from $x_{1'} = x$ and $y_{|T'|} = y$ which are fixed


References 1. Kopper, Ch., Magnen, J., Rivasseau, V.: Mass generation in the large N Gross–Neveu-Model, Commun. Math. Phys. 169, 121–180 (1995) 2. Zinn-Justin, J.: Quantum Field Theory and Critical Phenomena. 3rd ed., Oxford: Clarendon Press, 1997 3. For a review see: Smirnov, F.A.: Form Factors in Completely Integrable Models of Quantum field Theory, Singapore: World Scientific, 1992. Some important references are: Zamolodchikov, A., Zamolodchikov, Al.B.: Relativistic Factorized S-Matrix in two dimensions having O(N ) isotopic symmetry. Nucl. Phys. 133, 525 (1978). Karowski, M.,Weisz, P.: Nucl. Phys. B139, 455 (1978). Hasenfratz, P. Maggiore, M. Niedermayer, F.: The exact mass gap of the O(3) and O(4) σ-models in d = 2. Phys. Lett. B245, 522– 528 (1990). Hasenfratz, P., Niedermayer, F.: The exact mass gap for the O(N ) σ-model for arbitrary N ≥ 3 in d = 2. Phys. Lett. B245, 529–534 (1990) 4. Gawedzki, K., Kupiainen, A.: Continuum Limit of the Hierarchical O(N ) Nonlinear σ-Model. Commun. Math. Phys. 106, 533–550 (1986) 5. Pordt, A., Reiss, Th.: On the renormalization group iteration of a two-dimensional hierarchical nonlinear σ-model. Annales de l’Institut Poincar´e 55, 545–587 (1991) 6. Br´ezin, E., Le Guillou, J. and Zinn-Justin, J.: Renormalization of the nonlinear σ model in 2 + ε dimensions. Phys. Rev. D14, 2615–2621 (1976). Br´ezin, E., Zinn-Justin, J.: Spontaneous Breakdown of Continuous Symmetries near two dimensions, Phys. Rev. B14, 3110–3112 (1976) 7. See e.g., Caracciolo, S., Edwards, R., Pelissetto, A. and Sokal, A.: Asymptotic Scaling in the Twodimensional O(3) σ Model at Correlation Length 105 . Phys. Rev. Lett. 75, 1891–1894 (1995) 8. Patrascioiu, A., Seiler, E.: Super-Instantons and the Reliability of Perturbation Theory in Non-Abelian Models. Phys. Rev. Lett. 74, 1920–1923 (1995) and: Nonuniformity of the 1/N Expansion for O(N ) Models. Nucl. Phys. B443, 596 (1995) 9. Mermin, N.D. 
and Wagner, H.: Absence of ferromagnetism and antiferromagnetism in one- and twodimensional isotropic Heisenberg models. Phys. Rev. Lett. 17,1133–1136 (1966), Mermin, N.D.: Absence of ordering in certain classical systems. Journ. Math. Phys. 8, 1061–1064 (1967) 10. Dobrushin, R. and Shlosman, S.: Absence of breakdown of continuous symmetries in two-dimensional models of statistical mechanics. Commun. Math. Phys. 42, 31–40 (1975) 11. Seiler, E.: Schwinger functions for the Yukawa Model in two space time dimensions with space time cutoff, Commun. Math. Phys. 42, 163–182 (1975), Simon, B.: Trace Ideals and their Applications. London Mathematical Society Lecture Note Series 35, Cambridge: Cambridge Univ. Press, 1979, Faria da Veiga, P.A.: Constructions de Mod`eles non renormalisables en Th´eorie quantique des Champs, Thesis Ecole Polytechnique, 1991, Magnen, J. and S´en´eor, R.: Yukawa Quantum Field Theory in three Dimensions, Proc. 3rd Int. Conf. on Collective Phenomena, Annals of the New York Academy of Sciences 337, New York, 1980 12. Reed, M.C.: In: Constructive Field Theory. Proc. Erice 1973, Lecture Notes in Physics 25, 1973 13. Glimm, J., Jaffe, A.: Quantum Physics. New York, Springer-Verlag, 1987 14. Brydges, D.: In: Critical Phenomena, Random Systems, Gauge Theories, Proc. Les Houches 1984. Amsterdam: North Holland, 1986. Brydges, D. and Martin, Ph.A.: Coulomb Systems at low densitiy. Preprint 1998 15. Brydges D. and Kennedy, T.: Mayer Expansions and the Hamilton–Jacobi Equation. Journ. Stat. Phys. 48, 19 (1987) 16. Brydges, D. and Yau, H.T.: Grad8 Perturbations of massless Gaussian Fields. Commun. Math. Phys. 129, 351 (1990) 17. Brydges, D. and Federbush, P.: A new Form of the Mayer Expansion in Classical Statistical Mechanics. Journ. Math. Phys. 19, 2064 (1978) 18. Abdesselam,A. and Rivasseau, V.: Trees, Forests and Jungles, a Botanical Garden for Cluster Expansions. In: Rivasseau, V. 
(ed): Proceedings of the International Workshop on Constructive Theory, Berlin– Heidelberg–New York: Springer Verlag, 1995 19. Rivasseau, V.: From Perturbative to Constructive Renormalization. Princeton, NJ: Princeton University Press, 1991 20. Abdesselam, A.: Renormalisation Constructive Explicite. Thesis Ecole Polytechnique, 1997 21. Brydges, D., Dimock, J. and Hurd, T.: Estimates on Renormalization Group Transformations. Univ. of Virginia. Preprint 1996, and: A non-Gaussian fixed point for φ4 in 4 − ε dimensions. Commun. Math. Phys. 198, 111–156 (1998) 22. Brydges, D.: Functional integrals and their Applications. EPFL Lecture Notes, Lausanne 1992


23. Iagolnitzer, D. and Magnen, J.: Polymers in a Weak Random Potential in Dimension Four: Rigorous Renormalization Group Analysis. Commun. Math. Phys. 162, 85–121 (1994) 24. Spencer, T.: The decay of the Bethe-Salpeter kernel in P (ϕ)2 Quantum Field Theory. Commun. Math. Phys. 44, 153–164 (1975) 25. Kupiainen, A.: On the 1/n Expansion. Commun. Math. Phys. 73, 273–294 (1980) 26. Ito, K.R., Tamura, H.: N Dependence of Upper Bounds of Critical Temperatures of 2D O(N ) Spin Models. Commun. Math. Phys. 202, 127–168 (1999) Communicated by D. C. Brydges

Commun. Math. Phys. 202, 127 – 168 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

N Dependence of Upper Bounds of Critical Temperatures of 2D O(N) Spin Models

K. R. Ito^{1,*}, H. Tamura^{2}

1 Department of Mathematics and Physics, Setsunan University, Ikeda-Naka Machi, Neyagawa 572, Japan. E-mail: [email protected]
2 Department of Mathematics, Faculty of Science, Kanazawa University, Kanazawa 920-11, Japan. E-mail: [email protected]

Received: 23 April 1998 / Accepted: 19 September 1998

Abstract: We investigate the critical temperature of the classical O(N) spin model in two dimensions. We show that if N is large and there is a phase transition in the system, the critical inverse temperature $\beta_c$ obeys the bound $\beta_c(N) > \mathrm{const.}\, N \log N$.

1. Introduction

Quark confinement in 4-dimensional non-abelian lattice gauge theories and spontaneous mass generation in two-dimensional (2D) non-abelian sigma models are widely believed [18]. These models exhibit no phase transitions in the hierarchical model approximation of Wilson-Dyson type or Migdal-Kadanoff type [10], but we still do not have a rigorous proof for the real system. We recently considered a block-spin-type transformation of a random walk which appears in the O(N) spin models [3, 4], and showed that [11] the correlation functions are represented by self-avoiding walks on $\mathbb{Z}^\nu$. This considerably improves our previous estimates for the inverse critical temperature $\beta_c$ of the system:

$$\frac{\beta_c}{N} \ge \frac{\mu_\nu}{\mu_\nu^2 - 1}, \quad \text{as } N \to \infty, \tag{1.1}$$

where $\mu_\nu \in (\nu, 2\nu - 1)$ is the connective constant of the self-avoiding walk on $\mathbb{Z}^\nu$ ($\mu_2 = 2.653\cdots$). In this paper, we amalgamate our previous methods with the idea of the $N^{-1}$ expansion [14, 15] and the cluster expansion [5, 9, 13, 16], the technology to represent quantities of the infinite volume limit by finite volume quantities. In spirit, our single block cluster expansion is similar to that in [1]. Our main conclusion in this paper is

* also at: Division of Mathematics, College of Human and Environmental Studies, Kyoto University, Kyoto 606, Japan.


Main Theorem. The critical inverse temperature $\beta_c(N)$ of the two-dimensional O(N) Heisenberg model obeys the following bound for large N:

$$\beta_c(N) > \mathrm{const.}\, N \log N, \tag{1.2}$$

where const. > 0 is independent of N.

This result was announced in [12]. As will be discussed, for the dimension $\nu > 2$, we have

$$G_0(0) \ge \frac{\beta_c(N)}{N} \ge \frac{1}{\mu_\nu}, \tag{1.3}$$

where $G_0(x)$ is the lattice Green's function on the $\nu$-dimensional lattice $\mathbb{Z}^\nu$. Therefore a strong deviation exists in the N dependence of the critical temperature of the 2D O(N) Heisenberg model. We expect that a combination of the present method and renormalization group type arguments will establish our longstanding conjecture on the 2D sigma model.

The $\nu$-dimensional O(N) spin (Heisenberg) model is defined by the Gibbs measure

$$\langle F \rangle \equiv \frac{1}{Z_\Lambda(\beta)} \int F(\phi) \exp[-H_\Lambda(\phi)] \prod_i \delta(\phi_i^2 - 1)\, d\phi_i. \tag{1.4}$$

Here $\Lambda \subset \mathbb{Z}^\nu$ is the large square with its center at the origin. Moreover $\phi(x) = (\phi(x)^{(1)}, \cdots, \phi(x)^{(N)})$ is the vector valued spin at $x \in \Lambda$, and $Z_\Lambda$ is the partition function defined so that $\langle 1 \rangle = 1$. $H_\Lambda$ is the Hamiltonian given by

$$H_\Lambda \equiv -\frac{\beta(N)}{2} \sum_{|x-y|_1 = 1} \phi(x)\phi(y), \tag{1.5}$$

where $|x - y|_1 = \sum_i |x_i - y_i|$ and $\beta(N)$ is the inverse temperature. To appeal to the 1/N expansion [15], we set

$$\beta(N) = N\beta. \tag{1.6}$$
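As a concrete reading of (1.5) and (1.6), the following Python sketch evaluates the Hamiltonian for a spin configuration on a finite subset of the lattice. It is only an illustration under stated assumptions: the sum in (1.5) is taken over ordered nearest-neighbour pairs (so the prefactor 1/2 counts each bond once), free boundary conditions are used, and the names `hamiltonian` and `aligned_configuration` are hypothetical helpers, not part of the paper.

```python
def hamiltonian(spins, beta_N):
    """Evaluate H = -(beta_N / 2) * sum over ordered nearest-neighbour pairs of phi(x).phi(y).

    `spins` maps lattice sites (tuples of ints) to unit vectors (tuples of floats).
    Assumption: free boundary conditions, i.e. neighbours outside `spins` are ignored.
    """
    total = 0.0
    for x, phi_x in spins.items():
        for axis in range(len(x)):
            for step in (-1, 1):
                y = x[:axis] + (x[axis] + step,) + x[axis + 1:]
                phi_y = spins.get(y)
                if phi_y is not None:
                    total += sum(a * b for a, b in zip(phi_x, phi_y))
    return -0.5 * beta_N * total


def aligned_configuration(side, n_components):
    """All spins equal to the first unit vector on a side x side square."""
    e1 = (1.0,) + (0.0,) * (n_components - 1)
    return {(i, j): e1 for i in range(side) for j in range(side)}


if __name__ == "__main__":
    N, beta = 4, 0.5
    # beta(N) = N * beta as in (1.6)
    print(hamiltonian(aligned_configuration(3, N), N * beta))
```

For the fully aligned configuration every bond contributes 1, so the value is minus $\beta(N)$ times the number of nearest-neighbour bonds in the square.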

We organize the paper as follows: in Sect. 2, we represent the theory in terms of a determinant by introducing an auxiliary field $\psi$ and integrating out the spin variables. We discuss the reason why phase transitions may not occur in two-dimensional systems which have O(N) symmetries. In Sect. 3, we discuss the polymer expansion when all $|\psi(x)|$ are small. Sect. 4 is the main part of this paper, in which we prove that the contributions from large fields are small and negligible. Since $\psi(x)$ can get large, we decompose $\Lambda$ into two regions, the large and the small field regions, and we estimate their contributions separately. The polymer expansion is then carried out by combining these two regions. In Sect. 5, we represent the free energy by the convergent polymer expansion, from which the analyticity of the free energy follows. We discuss some related problems in Sect. 6. In the Appendices, we calculate decay rates and inverses of the Green's functions used in this paper. We also discuss polymer expansions of Green's functions and Gaussian measures restricted to subsets of $\mathbb{Z}^2$.

Added Note. After submitting this paper, we received the paper by C. Kopper [19] in which the same problem is discussed and similar results are obtained. In this paper, we discuss the problem on the lattice $\mathbb{Z}^2$, whereas in [19] the model is discussed on $\mathbb{R}^2$ with an ultra-violet cutoff and the correlation functions are investigated.


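2. Determinant Representation

We substitute the identity $\delta(\phi^2 - 1) = \int \exp[-ia(\phi^2 - 1)]\, da/2\pi$ into Eq. (1.4) with the condition [3, 4] that $\mathrm{Im}\, a_i \le -\nu N\beta$. We set

$$\mathrm{Im}\, a_i = -N\beta\Big(\nu + \frac{m^2}{2}\Big), \qquad \mathrm{Re}\, a_i = \sqrt{N}\beta\psi_i, \tag{2.1}$$

where $m^2 \ge 0$ will be determined soon. Thus we have

$$\begin{aligned}
Z_\Lambda &= c^{|\Lambda|} \int\cdots\int \exp\Big[-\frac{N\beta}{2}\Big\langle \phi, \Big(m^2 - \Delta + \frac{2i}{\sqrt{N}}\psi\Big)\phi \Big\rangle + \sum_j i\sqrt{N}\beta\psi_j\Big] \prod_j \frac{d\phi_j\, d\psi_j}{2\pi} \\
&= c^{|\Lambda|} \int\cdots\int \det\Big(m^2 - \Delta + \frac{2i}{\sqrt{N}}\psi\Big)^{-N/2} \exp\Big[i\sqrt{N}\beta \sum_j \psi_j\Big] \prod_j \frac{d\psi_j}{2\pi} \\
&= c^{|\Lambda|} \det(m^2 - \Delta)^{-N/2} \int\cdots\int F(\psi) \prod_j \frac{d\psi_j}{2\pi},
\end{aligned} \tag{2.2}$$

where the constants c may differ from line to line, $\Delta_{ij} = -2\nu\delta_{ij} + \delta_{|i-j|_1, 1}$ is the lattice Laplacian and

$$F(\psi) = \det\Big(1 + \frac{2iG}{\sqrt{N}}\psi\Big)^{-N/2} \exp\Big[i\sqrt{N}\beta \sum_j \psi_j\Big]. \tag{2.3}$$

Moreover $G = (m^2 - \Delta)^{-1}$ is the Green's function (matrix) discussed later. In the same way, the two point functions are given by

$$\langle \phi_0 \phi_x \rangle = \frac{1}{\tilde Z} \int\cdots\int \Big(m^2 - \Delta + \frac{2i}{\sqrt{N}}\psi\Big)^{-1}_{0x} F(\psi) \prod_j \frac{d\psi_j}{2\pi}, \tag{2.4}$$

where $\tilde Z$ is the obvious normalization constant. We choose $m \ge 0$ so that $G(0) = \beta$, where

$$G(x) = \int_{-\pi}^{\pi}\cdots\int_{-\pi}^{\pi} g(p)\, e^{ipx} \prod_{i=1}^{\nu} \frac{dp_i}{2\pi}, \tag{2.5}$$

$$g(p) \equiv \frac{1}{m^2 + 2\sum_k (1 - \cos p_k)} \in \Big[\frac{1}{m^2 + 4\nu}, \frac{1}{m^2}\Big]. \tag{2.6}$$

This choice is possible for any $\beta$ (and N) if and only if $\nu \le 2$, that is, if and only if $G_0(0) \equiv G(0)|_{m^2 = 0} = \infty$. In other words, we can rewrite Eq. (2.3) as

$$F(\psi) = {\det}_3\Big(1 + \frac{2iG}{\sqrt{N}}\psi\Big)^{-N/2} \exp[-\mathrm{Tr}(G\psi)^2] \tag{2.7}$$

for any $\beta$, only for $\nu \le 2$, where ${\det}_3(1 + A) = \det[(1 + A)e^{-A + A^2/2}]$. The factor $\exp[i\sqrt{N}\beta\sum\psi_x]$ in (2.3) is the reminiscence of the double-well potential $\prod \delta(\phi_x^2 - 1)$ which is responsible for phase transitions. Then roughly speaking, the disappearance of $\exp[i\sqrt{N}\beta\sum\psi_x]$ in (2.7) means the absence of the effect of the double-well potential and is consistent with the absence of phase transitions [2].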


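An explicit calculation shows that $m^2 = \beta^{-1}\big(\sqrt{1 + 4\beta^2} - 2\beta\big)$ for $\nu = 1$. For $\nu = 2$, $G(0)$ is expressed by the complete elliptic integral of the first kind $F(k, \pi/2) = \int_0^{\pi/2} d\varphi\, (1 - k^2\sin^2\varphi)^{-1/2}$:

$$\begin{aligned}
G(0) &= \frac{1}{2\pi} \int_0^{\pi} \frac{dp}{\sqrt{(1 + 2\varepsilon - \cos p)(3 + 2\varepsilon - \cos p)}} \\
&= \frac{k}{2\pi} F(k, \pi/2) = \frac{1}{2\pi}\Big[O(\varepsilon) + \frac{3}{2}\log 2 + \frac{1}{2}\log\frac{1}{\varepsilon}\Big],
\end{aligned}$$

where $\varepsilon = m^2/4$ and $k = (1 + \varepsilon)^{-1}$. Then the condition $G(0) = \beta$ implies that

$$m^2 \sim 32\, e^{-4\pi\beta} \quad \text{as } \beta \to \infty, \tag{2.8}$$

which is consistent with the renormalization group arguments, see [6] and references therein. If $\nu \ge 3$, such an $m \ge 0$ exists if $\beta \le G_0(0)$. If $\beta > G_0(0)$, there exists spontaneous magnetization in the system [7]. That is, $N G_0(0) > \beta_c(N) > N/\mu_\nu$ for $\nu > 2$.

If m is chosen so that $G(0) = \beta$, ${\det}_3(1 + 2iG\psi/\sqrt{N})^{-N/2}$ is almost equal to $\exp[4i\,\mathrm{Tr}(G\psi)^3/(3\sqrt{N})]$ and is regarded as a small perturbation to the Gaussian measure $\sim \exp[-\mathrm{Tr}(G\psi)^2]\prod d\psi$. Namely $F(\psi)$ looks like $|F(\psi)| = \det(1 + 4G\psi G\psi/N)^{-N/4}$, which is strictly positive. If this is justified, then from Eq. (2.4), we have exponential decay of the correlation functions:

$$\begin{aligned}
\langle \phi_0\phi_x \rangle &\sim \frac{1}{\tilde Z} \int\cdots\int \Big(m^2 - \Delta + \frac{2i}{\sqrt{N}}\psi\Big)^{-1}_{0x} |F(\psi)| \prod_j \frac{d\psi_j}{2\pi} \\
&\le \Big|\sup_\psi \Big(m^2 - \Delta + \frac{2i}{\sqrt{N}}\psi\Big)^{-1}_{0x}\Big| \\
&\le (m^2 - \Delta)^{-1}_{0x} \sim e^{-m|x|}.
\end{aligned}$$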
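The asymptotic relation (2.8) can be checked numerically. The sketch below is an illustration, not the authors' computation: it evaluates $G(0) = (k/2\pi)F(k, \pi/2)$ for $\nu = 2$ using the standard arithmetic-geometric mean formula $F(k, \pi/2) = \pi / (2\,\mathrm{AGM}(1, \sqrt{1 - k^2}))$, solves $G(0) = \beta$ by bisection, and compares the result with $32e^{-4\pi\beta}$. The function names and the bisection bracket are assumptions made for this example.

```python
import math


def agm(a, b):
    """Arithmetic-geometric mean of two positive numbers (fixed iteration count)."""
    for _ in range(60):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a


def green_at_origin(m2):
    """G(0) on Z^2 for mass squared m2 > 0, via G(0) = (k / 2 pi) F(k, pi/2)."""
    eps = m2 / 4.0
    k = 1.0 / (1.0 + eps)
    # sqrt(1 - k^2), written so that no cancellation occurs for small eps
    k_prime = math.sqrt(2.0 * eps + eps * eps) / (1.0 + eps)
    elliptic_k = math.pi / (2.0 * agm(1.0, k_prime))
    return k * elliptic_k / (2.0 * math.pi)


def solve_mass_squared(beta):
    """Find m^2 with G(0) = beta by bisection in log(m^2); G(0) decreases in m^2."""
    lo, hi = -200.0, 5.0  # assumed bracket for log(m^2)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if green_at_origin(math.exp(mid)) > beta:
            lo = mid  # mass too small, G(0) too large
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))


if __name__ == "__main__":
    for beta in (0.5, 1.0, 2.0):
        m2 = solve_mass_squared(beta)
        print(beta, m2, m2 / (32.0 * math.exp(-4.0 * math.pi * beta)))
```

The printed ratio should approach 1 as $\beta$ grows, in line with (2.8).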

3. Polymer (Cluster) Expansion in Small Field

3.1. Polymer expansion. Let

$$d\mu_\Lambda(\psi) = {\det}^{1/2}[C^{-1}] \exp[-\langle \psi, C^{-1}\psi \rangle] \prod \frac{d\psi(x)}{\sqrt{\pi}} \tag{3.1}$$

be the Gaussian probability measure of mean zero and covariance $\frac{1}{2}C$, where $C^{-1} \equiv G^{\circ 2}$ and $G^{\circ 2}$ is the matrix given by $G^{\circ 2}(x, y) = G(x - y)^2$. The partition function $Z_\Lambda$ is given by

$$Z_\Lambda = Z_\infty \int {\det}_3^{-N/2}\Big(1 + \frac{2i}{\sqrt{N}}G\psi\Big)\, d\mu_\Lambda(\psi), \tag{3.2}$$

$$Z_\infty \equiv {\det}^{-1/2}[C^{-1}] = {\det}^{1/2}[C], \tag{3.3}$$

up to an unimportant multiplicative factor. Our purpose is to discuss analyticity of the free energy αF $= -\lim \log Z_\Lambda/|\Lambda|$ in $\beta$. Since m is analytic in $\beta \ge 0$, the assertion is trivial if there is no determinant. In the present case where we have the determinant, which is quite non-linear and non-local in $\psi(x)$, we represent $Z_\Lambda$ in terms of polymers:


Theorem 1. The partition function $Z_\Lambda$ is represented by polymers $\rho_X$, $X \subset \Lambda$:

$$Z_\Lambda = Z_\infty \Big[\sum_p \frac{1}{p!} \sum_{\cup_1^p X_i = \Lambda} \prod_i \rho_{X_i}\Big], \tag{3.4}$$

where $X_i$ are unions of squares $\Delta \subset \Lambda$ of size $L \times L$ ($L \gg 1$ is determined later) and $X_i \cap X_j = \emptyset$ ($i \ne j$). Given $\beta > 0$, if N is chosen large, $N \ge \exp[\mathrm{const.}\,\beta]$, there exist strictly positive constants $\delta_c$ and $m_c$ such that

$$|\rho_X| \le \exp[-\delta_c n_X \log N - m_c L(X)], \tag{3.5}$$

where $n_X$ is the number of squares $\Delta_i$ in X and $L(X)$ is the length of the shortest connected tree graph over centers of $\Delta_i \subset X$. The free energy is the convergent series of $\rho_X$. Each $\rho_X$ is analytic in $\beta$.

Thus the Main Theorem follows from Theorem 1 since αF is represented by the convergent series of $\rho_X$. The proof of this theorem is, however, postponed until Sect. 5. Here we restrict ourselves to the small field case where the expansion can be easily done by the $N^{-1}$ expansion.

3.2. Small and large fields. We let $\tilde G \equiv [G^{\circ 2}]^{1/2}$. Then C and $\tilde G$ have the following Fourier expansions:

$$C = \int_{-\pi}^{\pi}\int_{-\pi}^{\pi} e^{ip(x-y)}\, \tilde g^{-2}(p) \prod_{i=1}^{2} \frac{dp_i}{2\pi}, \tag{3.6}$$

$$\tilde G = \int_{-\pi}^{\pi}\int_{-\pi}^{\pi} e^{ip(x-y)}\, \tilde g(p) \prod_{i=1}^{2} \frac{dp_i}{2\pi}, \tag{3.7}$$

$$\tilde g(p) = \Big[\int_{-\pi}^{\pi}\int_{-\pi}^{\pi} g(p - k)\, g(k) \prod_{i=1}^{2} \frac{dk_i}{2\pi}\Big]^{1/2} \in \Big[\frac{c_1}{m^2 + 8}, \frac{c_2}{m}\Big]. \tag{3.8}$$

Here and below, c stands for a generic constant independent of $\beta$ which may change from place to place even in the same equations, and $c_0, c_1, \cdots$ stand for similar constants which are kept in the same equations. The following lemma is proved in Appendix C:

Lemma 2. For $m < 1$, the kernels $G$, $\tilde G$, $\tilde G^{-1}$ and $C$ exhibit the following exponential decay:

$$G(x, y) \le c \log\Big(1 + \frac{1}{m}\Big) \exp[-m_*|x - y|], \tag{3.9}$$

$$|\tilde G(x, y)| \le c \log\Big(1 + \frac{1}{m}\Big) \exp[-m|x - y|], \tag{3.10}$$

$$|\tilde G^{-1}(x, y)| \le c(1 + m^2) \exp[-m|x - y|], \tag{3.11}$$

$$|C(x, y)| \le c(1 + m^2) \exp[-m|x - y|], \tag{3.12}$$

where $|x| = \sqrt{x_1^2 + x_2^2}$ and $m_* > 0$ is a constant defined by $2\cosh(m_*) = 2 + m^2$.
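The decay rate $m_*$ of Lemma 2 can be computed in closed form. A minimal sketch, assuming only the defining relation $2\cosh(m_*) = 2 + m^2$: since $\cosh t - 1 = 2\sinh^2(t/2)$, the relation is equivalent to $\sinh(m_*/2) = m/2$, which is numerically stable for small m. The function name `decay_rate` is hypothetical.

```python
import math


def decay_rate(m):
    """Return m_* > 0 solving 2*cosh(m_*) = 2 + m**2.

    Uses the equivalent form sinh(m_*/2) = m/2, which avoids the loss of
    precision of acosh(1 + m**2/2) when m is small.
    """
    return 2.0 * math.asinh(0.5 * m)


if __name__ == "__main__":
    for m in (0.5, 0.1, 0.01):
        m_star = decay_rate(m)
        print(m, m_star, m - m_star)
```

The printed differences illustrate that $m_*$ lies slightly below m and that the gap is tiny when $m \ll 1$.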


We introduce the notion of the large field region R and the small field region K:

$$R = \{x;\ N^\delta \le |\psi(x)|\}, \qquad K = \Lambda - R, \tag{3.13}$$

where $N = N(\beta)$ and a positive constant $\delta < 1/2$ is chosen so that if $|\psi(x)| \le N^\delta$ for all x, then $N^{-1/2}\|G^{1/2}\psi G^{1/2}\| \ll 1$. Then the determinant is perturbatively expanded and the higher order terms are negligible. Since $\mathrm{spec}\, G \in [(8 + m^2)^{-1}, m^{-2}]$ and $m^{-2} \sim (32)^{-1}e^{4\pi\beta}$, these conditions are satisfied if $\exp[12\pi\beta] < N$ for large $\beta$. The following is one of the most typical choices satisfying these conditions (though they are not optimal):

$$\delta = \frac{1}{12}, \qquad N(\beta) = \exp[400\pi\beta]. \tag{3.14}$$
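The smallness condition behind (3.14) is simple bookkeeping of exponents. The following sketch assumes the crude bound $\|G^{1/2}\psi G^{1/2}\| \le m^{-2}N^\delta$ for $|\psi(x)| \le N^\delta$ and the asymptotic value $m^{-2} \approx e^{4\pi\beta}/32$; it returns the logarithm of $N^{-1/2+\delta}m^{-2}$, which should be very negative. The function name and its default arguments are assumptions for illustration.

```python
import math


def log_small_field_ratio(beta, delta=1.0 / 12.0, log_N=None):
    """log of N^(-1/2 + delta) * m^(-2), with m^(-2) ~ exp(4 pi beta) / 32.

    By default N = exp(400 pi beta) as in (3.14); pass log_N to try other choices.
    """
    if log_N is None:
        log_N = 400.0 * math.pi * beta
    log_m_inv2 = 4.0 * math.pi * beta - math.log(32.0)
    return (delta - 0.5) * log_N + log_m_inv2


if __name__ == "__main__":
    beta = 10.0
    print(log_small_field_ratio(beta))                               # choice (3.14)
    print(log_small_field_ratio(beta, log_N=12.0 * math.pi * beta))  # threshold in the text
    print(log_small_field_ratio(beta, log_N=9.0 * math.pi * beta))   # N too small
```

With $\delta = 1/12$ the exponent of N is $-5/12$, so the ratio is small exactly when $\log N$ exceeds roughly $9.6\pi\beta$; the value $12\pi\beta$ quoted in the text leaves a margin.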

Remark 1. For matrices A and B, we define $A \circ B$ by $(A \circ B)(x, y) = A(x, y)B(x, y)$. This is called the Hadamard product of A and B. It is easy to see that $A \circ B \ge 0$ if $A \ge 0$ and $B \ge 0$.

Remark 2. The kernel functions $C(x)$, $\tilde G(x)$ and $\tilde G^{-1}(x)$ decay faster than $e^{-\sqrt{2}m|x|}$, see the Appendix. Of course, $m_* < m$, $m_* = m - O(m^2)$. However, since $m_*$ is almost equal to m in the present problem where $m \ll 1$, we use m for $m_*$ for notational simplicity in the remaining part of the paper. If $\beta < O(1)$, it is enough to choose L (the size for the expansion) and N larger than some constants for the convergence. So it suffices to consider the case $\beta \gg 1$.

Remark 3. In this paper, we use free boundary conditions for the Green's function G and its inverse, and we assume that the $\psi$ field is distributed only in the large square region $\Lambda \subset \mathbb{Z}^2$. Other boundary conditions can be easily adopted without changing the main estimates in the present paper.

3.3. Polymer expansion in small field region. We first consider the case of $R = \emptyset$. In this case, we decompose $\Lambda \subset \mathbb{Z}^2$ into squares (denoted $\Delta$ or $\Delta_i$ below) of size $L \times L$ whose centers are at $\Lambda \cap L\mathbb{Z}^2$. Collections of these squares are called paved sets. We also define $L_0 \ll L$, where L and $L_0$ are chosen so that

$$L \ll N \ll e^{mL}, \qquad G(L_0) = N^{-2}. \tag{3.15}$$

For this to be satisfied, we take L slightly larger than $m^{-1}$. Typically we may take $L = 20 m^{-1}\log N$ so that $e^{mL} = N^{20}$, in which case $L_0 = L/10$. They satisfy the conditions on L and N. Let $\tau(\psi)$ be an even, positive and decreasing (in $|\psi|$) $C^\infty$ function such that

$$\tau(\psi) = \begin{cases} 1 & \text{for } |\psi| < N^\delta, \\ 0 & \text{for } |\psi| > N^\delta + h. \end{cases} \tag{3.16}$$

We may take the limit $h \to 0$ after all calculations ($\lim_{h \to 0}\tau(\psi) = \theta(N^\delta - |\psi|)$), but we can keep h as a non-zero constant (say 1). We multiply

$$1 = \sum_{K \subset \Lambda} \tau(\psi_K)\, \tau^c(\psi_R) \tag{3.17}$$


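to $d\mu_\Lambda$, where $\tau^c(\psi) = 1 - \tau(\psi)$, $R = K^c = \Lambda - K$ and $\tau(\psi_K) \equiv \prod_{x \in K}\tau(\psi(x))$, $\tau^c(\psi_R) \equiv \prod_{x \in R}\tau^c(\psi(x))$. We call K the small field region and $R = K^c$ the large field region. Then

$$Z_\Lambda \equiv Z_\infty \sum_R Z(R), \tag{3.18}$$

$$Z(R) \equiv \int {\det}_3^{-N/2}\Big(1 + \frac{2i}{\sqrt{N}}G\psi\Big)\, \tau^c(\psi_R)\, \tau(\psi_K)\, d\mu_\Lambda(\psi). \tag{3.19}$$

We put $Z_\Lambda(R) = Z_\infty Z(R)$ and we first consider the case $R = \emptyset$:

$$Z_\Lambda(R = \emptyset) \equiv Z_\infty \int \eta_\Lambda\, d\mu_\Lambda(\psi), \tag{3.20}$$

$$\eta_\Lambda \equiv {\det}_3^{-N/2}\Big(1 + \frac{2i}{\sqrt{N}}G\psi\Big) \prod_{x \in \Lambda} \tau(\psi(x)). \tag{3.21}$$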

We introduce interpolation parameters $s_i$ into $d\mu_\Lambda(\psi)$ to expand the measure [5, 16]. Let $Y \subset \Lambda$ be a paved set consisting of p squares $\{\Delta_1, \cdots, \Delta_p\}$. Let $\{\Delta_{j_1}, \cdots, \Delta_{j_p}\}$ be any permutation of them such that $\Delta_{j_1} = \Delta_1$ and let a be a map from $\{1, \cdots, p - 1\}$ into itself such that $a(k) \le k$. Then we have a set of ordered links $\{(j_{a(i)}, j_{i+1});\ i = 1, \cdots, p - 1\}$ which is regarded as a tree graph $T'$ over $\{\Delta_i\}$ with root $\Delta_1$. Let

$$C_Y = \chi_Y C \chi_Y, \tag{3.22}$$

where $\chi_Y$ is the characteristic function of Y. For a given permutation and a function $a = a_{T'}$, we define

$$C_Y(\{s\}) = \Big[\prod_{i=1}^{p-1} \big((1 - s_i)P_i + s_i\big)\Big] C_Y, \tag{3.23}$$

$$M_{T'} = \prod_{k=1}^{p-1} \prod_{i=a(k)}^{k-1} s_i, \tag{3.24}$$

where $P_i$ are operators which bisect paved sets: $P_i C_X = C_{X \setminus X_i} + C_{X \cap X_i}$, $X_i \equiv \cup_{k=1}^{i} \Delta_{j_k}$. See Appendix C for the construction, and see [5, 16] for the proof of the next theorem:

Theorem 3. $Z_\Lambda(R = \emptyset)$ has the cluster expansion

$$Z_\Lambda(R = \emptyset) = Z_\infty \Big[\sum_n \frac{1}{n!} \sum_{\cup_1^n Y_i = \Lambda} \prod_i S_{Y_i}\Big] \eta_\Lambda, \tag{3.25}$$

where $Y_i$ are paved sets such that $\cup_1^n Y_i = \Lambda$ and $Y_i \cap Y_j = \emptyset$ for $i \ne j$. Let $Y = \cup_{k=1}^{p}\Delta_k$ be one of the $Y_i$. Then $S_Y$ is the differential and integral operator given by

$$S_Y = \sum_{T'} \int_0^1 ds_1 \cdots ds_{p-1}\, M_{T'}(s) \int d\mu_Y(\{s\}, \psi) \times \prod_{k=1}^{p-1} \Big[\sum_{x_k \in \Delta_{j_{a(k)}}} \sum_{y_{k+1} \in \Delta_{j_{k+1}}} \frac{1}{2} C(x_k, y_{k+1}) \frac{\partial^2}{\partial\psi(x_k)\,\partial\psi(y_{k+1})}\Big], \tag{3.26}$$


where $\sum_{T'}$ is the sum over all tree graphs $T' = \{(j_{a(k)}, j_k)\}$ over $\{j_1, j_2, \cdots, j_p\}$ ($j_1 = 1$) and

$$d\mu_Y(\{s\}, \psi) = {\det}^{-1/2}[C_Y(s)] \exp[-\langle \psi, C_Y^{-1}(\{s\})\psi \rangle] \prod_{x \in Y} \frac{d\psi(x)}{\sqrt{\pi}}. \tag{3.27}$$

Here $C_Y(\{s\})$ is given by (3.23) and depends on permutations only. There are many graphs $T'$ which have the same links and vertices but belong to different permutations $\{j_1, j_2, \cdots, j_p\}$ of $\{1, \cdots, p\}$. The following lemma is well known [5, 16]:

Lemma 4. The measure $M_T \prod ds_i$ is the probability measure in the following sense:

$$\sum_{T' : T(T') = T} \int_0^1 M_{T'} \prod_{i=1}^{p-1} ds_i = 1, \tag{3.28}$$

where $\sum_{T' : T(T') = T}$ means the sum over tree graphs $T'$ which have the same links as T.
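Lemma 4 can be verified by brute force for small trees. The sketch below is an illustration under the following reading of the definitions, which is an assumption of this example: for a fixed labelled tree T with root 0, the graphs $T'$ are the orderings $(j_1 = \text{root}, j_2, \cdots, j_p)$ in which each new vertex $j_{k+1}$ is linked in T to an earlier vertex $j_{a(k)}$, and $M_{T'} = \prod_{k}\prod_{i=a(k)}^{k-1} s_i$ as in (3.24). The integral of a monomial $\prod s_i^{e_i}$ over $[0,1]^{p-1}$ is $\prod 1/(e_i + 1)$, so exact rational arithmetic suffices. The function name `tree_weight_sum` is hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def tree_weight_sum(p, edges):
    """Sum over admissible orderings of the integral of M_{T'} over [0,1]^(p-1).

    Vertices are 0..p-1 with root 0; `edges` lists the links of the tree T.
    """
    nbrs = {v: set() for v in range(p)}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    total = Fraction(0)
    for rest in permutations(range(1, p)):
        order = (0,) + rest
        pos = {v: i + 1 for i, v in enumerate(order)}  # 1-based position in the ordering
        exponents = [0] * p  # exponents[i] is the power of s_i, i = 1..p-1
        admissible = True
        for k in range(1, p):  # link k attaches order[k], the (k+1)-th vertex
            new = order[k]
            earlier = [u for u in nbrs[new] if pos[u] <= k]
            if len(earlier) != 1:
                admissible = False
                break
            a_k = pos[earlier[0]]
            for i in range(a_k, k):  # i = a(k), ..., k-1
                exponents[i] += 1
        if admissible:
            weight = Fraction(1)
            for i in range(1, p):
                weight *= Fraction(1, exponents[i] + 1)
            total += weight
    return total


if __name__ == "__main__":
    print(tree_weight_sum(4, [(0, 1), (0, 2), (0, 3)]))  # star with centre 0
    print(tree_weight_sum(4, [(0, 1), (0, 2), (1, 3)]))
```

For the star on four vertices, each of the six orderings contributes $\int s_1^2 s_2\, ds = 1/6$, which adds up to the normalization stated in (3.28).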

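Let

$$A_\Lambda = \frac{2i}{\sqrt{N}} G\psi \tag{3.29}$$

for simplicity, and let $\Lambda = \cup_{i=1}^{p} Y_i$ be one of the partitions which appear in Eq. (3.25). Since $\{\psi_{Y_i}\}$ are coupled in the determinant, we introduce interpolation parameters $s_{ij}$ and set

$$A_\Lambda = \sum_i A_{Y_i} + \sum_{i<j} (A_{Y_i, Y_j} + A_{Y_j, Y_i}) \to A + B(s), \tag{3.30}$$

$$A \equiv \sum_i A_{Y_i}, \qquad B(s) \equiv \sum_{i<j} s_{ij}(A_{Y_i, Y_j} + A_{Y_j, Y_i}), \tag{3.31}$$

in the determinant, where

$$A_{Y_i} = \chi_{Y_i} A_\Lambda \chi_{Y_i}, \qquad A_{Y_i, Y_j} = \chi_{Y_i} A_\Lambda \chi_{Y_j}. \tag{3.32}$$

We iteratively apply the identity $f(1) = \int_0^1 ds\, \partial_s f(s) + f(0)$ to ${\det}_3(1 + A + B(s))$. If all $s_{ij}$ are set to zero, then the determinant is factorized with respect to $\psi_{Y_i}$. We thus have:

$$Z_\Lambda(R = \emptyset) = Z_\infty \Big[\sum_n \frac{1}{n!} \sum_{\cup_1^n X_i = \Lambda} \prod_i \rho_{X_i}\Big].$$

Here $\{X_i\}_1^n$ are partitions of $\Lambda$ into polymers, $X_i \cap X_j = \emptyset$ ($i \ne j$), $\cup X_i = \Lambda$ and

$$\rho_X = \sum_p \frac{1}{p!} \sum_{Y_1 \cup \cdots \cup Y_p = X} \Big[\prod_i S_{Y_i}\Big] \sum_{\gamma \in \tilde T(\{Y_i\})} \int \prod ds_\gamma\, \partial_\gamma\, \eta_X(\{Y_i\}), \tag{3.33}$$

$$\eta_X(\{Y_j\}) = {\det}_3^{-N/2}\Big(1 + \sum_i A_{Y_i} + \sum_{i<j} s_{ij}(A_{Y_i, Y_j} + A_{Y_j, Y_i})\Big)\, \tau(\psi_X), \tag{3.34}$$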


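where $S_Y$ is the interpolation operator on Y defined by (3.26) and

1. $\cup Y_i = X$ and $Y_i$ are mutually disjoint paved sets,
2. $\tilde T(\{Y_i\})$ is the set of connected graphs (not necessarily trees) over $\{Y_i\}_{i=1}^{p}$,
3. $\prod ds_\gamma = \prod_{(ij) \in \gamma} ds_{ij}$ and $\partial_\gamma = \prod_{(ij) \in \gamma} (\partial/\partial s_{ij})$ (put $s_{ij} = 0$ if $(i, j) \notin \gamma$).

In the rest of this section, we prove the following theorem which ensures that the free energy $\log Z_\Lambda(R = \emptyset)$ is the convergent series of $\rho_X$ [19], if N is chosen large:

Theorem 5. Assume that $R = \emptyset$ and let n be the number of $\Delta$ in $X \subset \Lambda$. If $N \ge N(\beta)$, there exist strictly positive constants $\delta_0$ and $m_0$ such that

$$|\rho_X| \le \exp[-n\delta_0 \log N - m_0 L(X)], \quad n \ge 2, \tag{3.35}$$

$$\rho_\Delta = \exp[-W_\Delta], \quad n = 1, \tag{3.36}$$

where $L(X)$ is the length of the shortest tree graph connecting all centers of squares $\Delta_i \subset X$, and $W_\Delta$ is the single square activity defined later.

To prove this, we first set

$$\eta_X(\{Y_i\}) \equiv \exp\Big[-\frac{N}{2}\sum_{i=1}^{2} V_i(A, B)\Big] \prod_{x \in X} \tau(\psi(x)), \tag{3.37}$$

$$V_1(A, B) = \frac{1}{2}\mathrm{Tr}\, B^2 - \frac{1}{2}\mathrm{Tr}\Big(B\frac{1}{1 + A}\Big)^2 + \frac{1}{3}\mathrm{Tr}\Big(B\frac{1}{1 + A}\Big)^3, \tag{3.38}$$

$$V_2(A, B) = \log {\det}_3(1 + A) + \log {\det}_4\Big(1 + \frac{1}{1 + A}B\Big). \tag{3.39}$$

The derivatives of $V_i$ with respect to $s_{ij}$ can be done by the contour integrals:

$$\prod \frac{\partial}{\partial s_{ij}}\, \eta_X(s) = \int_C \prod \frac{dt_{ij}}{2\pi i}\, \frac{\eta_X(t)}{\prod (t_{ij} - s_{ij})^2},$$

where C is the product of the circles $|t_{ij} - s_{ij}| = r_{ij}$ on $\mathbb{C}$ with their radii $r_{ij}$ given by

$$r_{ij} = N^{\tilde\delta} \exp\Big[\frac{4}{5} m\, \mathrm{dist}(Y_i, Y_j)\Big], \tag{3.40}$$

where $\tilde\delta > 0$. Put $B_{ij} = 2i t_{ij}(G_{Y_i Y_j}\psi_{Y_i} + G_{Y_j Y_i}\psi_{Y_j})/\sqrt{N}$. Then for $|t_{ij}| < r_{ij} + 1$, we find that

$$|B_{ij}(x, y)| \le \mathrm{const.}\, \log(1 + m^{-1})\, N^{-1/2 + \delta + \tilde\delta} \exp\Big[-\frac{m}{5}|x - y|\Big],$$

$$N|\mathrm{Tr}\, \chi_X B^3 \chi_X| \le N^{-1/2 + 3\delta + 3\tilde\delta + 2\varepsilon_0}|X|, \tag{3.41}$$

$$\varepsilon_0 \equiv -2.1 \times \log m/\log N \quad (\sim 1/100 \text{ if } N \sim e^{400\pi\beta}),$$

where $\varepsilon_0$ is chosen slightly larger than $-2\log m/\log N$ so that $N^{\varepsilon_0} > c\, m^{-2}\log(1 + m^{-1})$ and some trivial constants can be absorbed by $N^{\varepsilon_0}$. We choose $\tilde\delta > 0$ so that

$$\hat\delta \equiv \frac{1}{2} - 3(\delta + \tilde\delta) - 2\varepsilon_0 > 0. \tag{3.42}$$

For example, we can choose $\delta = 1/12$, $\tilde\delta = 1/16$, $\hat\delta = 1/16 - 2\varepsilon_0$. Thus we have: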


Lemma 6. If N is chosen so large that (3.42) holds, then

$$\Big|\prod_{(ij) \in \gamma} \partial/\partial s_{ij}\, \eta_X\Big| \le \exp\Big[-n\tilde\delta \log N - m_2 \sum_{(ij) \in \gamma} \mathrm{dist}(Y_i, Y_j)\Big]\, \|\eta_X\|, \tag{3.43}$$

where $m_2 = 4m/5$ and $\gamma$ are connected tree graphs over $\{Y_i \subset X\}$, and n is the number of the bonds in the graph $\gamma$. Moreover

$$\|\eta_X\| \equiv \sup_{\{|t_{ij}| \le r_{ij} + 1\}} |\eta_X(t)| \le \exp[N^{-\hat\delta}|X|]. \tag{3.44}$$

Lemma 7. Let $\cup_{i=1}^{n}\Delta_i = Y$ and let $x_i \in \Delta_i$. Then

$$\Big|\int d\mu_Y(s, \psi) \prod_{i=1}^{n} \frac{\partial}{\partial\psi(x_i)}\, \eta_Y(\psi)\Big| < \exp[-n\tilde\delta \log N + N^{-\hat\delta}|Y|]. \tag{3.45}$$

Proof. Each derivative acts either on ${\det}_3^{-N/2}(\cdots)$ or on $\tau(\psi)$. If it acts on ${\det}_3^{-N/2}(\cdots)$, it yields the factor bounded by $N^{-\tilde\delta}$. (We can get a much smaller factor $N^{-1/6 + \varepsilon_0}$ in this case.) On the other hand, if $\partial/\partial\psi(x)$ acts on $\tau(\psi(x))$,

$$\frac{\partial}{\partial\psi(x)}\tau(\psi(x)) = 0 \quad \text{unless } N^\delta < |\psi(x)| < N^\delta + h.$$

Note that

$$d\mu_Y(s) \to \prod \frac{dz(x)\exp[-z(x)^2]}{\sqrt{\pi}}$$

by the linear transformation $\psi(x) = (\tilde G_Y^{-1}z)(x)$, where $\tilde G_Y^{-1} = \sqrt{C_Y}$. Since $C^{-1} = G^{\circ 2}$ and $C_Y(s)$ is a convex linear combination of $\{C_{Y_i}\}$, we see

$$\sum_{x \in Y} z_x^2 = \langle \psi, \chi_Y C_Y(s)^{-1}\chi_Y \psi \rangle \ge \frac{1}{(8 + m^2)^2} \sum_{x \in Y} \psi^2(x).$$

If $|\psi(x)| > N^\delta$, then $\{y : |z(y)| > N^{\delta - \varepsilon_0},\ |x - y| < L_0\} \ne \emptyset$ since $|\psi(x)| = |\sum_y \tilde G_Y^{-1}(x, y)z(y)|$ and $|\tilde G^{-1}(x)| < c(1 + m^2)e^{-m|x|}$. Thus the contributions from the derivatives of $\tau$ are exponentially smaller than those from the derivatives of ${\det}_3^{-N/2}(\cdots)$.

The single square activity $\rho_\Delta = e^{-W_\Delta}$ is defined by

$$\rho_\Delta = \int {\det}_3^{-N/2}(1 + A_\Delta)\, \tau(\psi_\Delta)\, d\mu_\Delta(\psi). \tag{3.46}$$

Since $|\log {\det}_3^{-N/2}(1 + A_\Delta)| = O(N|\mathrm{Tr}\, A_\Delta^3|)$, we have $W_\Delta = O(N^{-1/2 + 3\delta + 3\varepsilon_0})$, which is independent of the location of $\Delta$ ($|\Delta| = L^2 < N^{\varepsilon_0}$).

Let $d_i$ be the number of lines which connect $\Delta_i$ with other $\Delta_j$ in the tree graph, i.e. the incidence number. Then $\sum_{i=1}^{n} d_i = 2n - 2$, where n is the number of squares $\Delta_i$ in Y. In this case there can appear $d_i$ derivatives $\partial^{d_i}/\partial\psi(x)^{d_i}$, $x \in \Delta_i$, in Eq. (3.26). By integration by parts, we can shift the action of $\partial/\partial\psi$ from $\tau$ to ${\det}_3^{-N/2}(\cdots)$ or to $\exp[-\langle \psi, C_Y^{-1}\psi \rangle]$.


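Lemma 8. [16] With the notation of (3.26) in Theorem 3 (with p replaced by n), let

$$\mathcal{F}(x_1, y_2, \cdots, y_n) \equiv \Big|\prod_{i=1}^{n-1} C(x_i, y_{i+1}) \int d\mu_Y(s, \psi) \prod_{i=1}^{n-1} \frac{\partial^2}{\partial\psi(x_i)\,\partial\psi(y_{i+1})}\, \eta_Y(\psi)\Big|,$$

where $x_k \in \Delta_{j_{a(k)}}$, $y_{k+1} \in \Delta_{j_{k+1}}$. Let $\gamma$ be the tree graph defined by $a(\cdot)$. Then

$$\sum_{\{x_k, y_{k+1}\}} \mathcal{F}(x_1, y_2, \cdots, y_n) \le \exp\Big[-n(\tilde\delta - 4\varepsilon_0)\log N - \frac{4m}{5}L_0(X) + N^{-\hat\delta}|X|\Big], \tag{3.47}$$

where $x_k \in \Delta_{j_{a(k)}}$, $y_{k+1} \in \Delta_{j_{k+1}}$ and $L_0(X) = \sum_{(i,j) \in \gamma} \mathrm{dist}(\Delta_i, \Delta_j)$.

Proof. Without loss of generality, we assume $\{j_k = k\}_{k=1}^{n}$. Let $d_i \ge 1$ be the incidence number of the vertex $\Delta_i$. Since $\#\{\Delta_j : \mathrm{dist}(\Delta_i, \Delta_j) < 2,\ i \ne j\} = 8$, $\sum_i |x_i - y_{i+1}|$ is larger than

$$\frac{4}{5}\sum_i |x_i - y_{i+1}| + \frac{1}{10}\sum_i \sum_{x \in \Delta_i} \sum_{y : (x, y) \in \gamma} |x - y| \ge \frac{4}{5}\sum_i |x_i - y_{i+1}| + \frac{L}{10}\sum_i \Big[\frac{d_i}{9}\Big]^{3/2},$$

where $[x]$ is the maximal integer not larger than x. By integration by parts, we see that

$$|\mathcal{F}(x_1, y_2, \cdots)| = \Big|\prod_{i=1}^{n-1} C(x_i, y_{i+1}) \int d\mu_Y(s, \psi)\, \Phi\Psi\Big|, \tag{3.48}$$

where, relabelling $\{x_i, y_{i+1}\}$ as $\{x_i, x_{i,1}, \cdots, x_{i, d_i - 1}\}_1^n$, $x_{i,k} \in \Delta_i$,

$$\Psi = \prod_{i=1}^{n} \frac{\partial}{\partial\psi(x_i)}\, \eta_Y(\psi), \tag{3.49}$$

$$\Phi = (-1)^{\sum d_i - n}\, e^{H} \prod_{i=1}^{n} \prod_{j=1}^{d_i - 1} \frac{\partial}{\partial\psi(x_{i,j})}\, e^{-H}, \tag{3.50}$$

$$H = \langle \psi_Y, C_Y^{-1}(s)\psi_Y \rangle. \tag{3.51}$$

Rewriting $\{x_{i,j}\}$ as $\{\xi_i\}_1^{n-2}$, we put

$$\Phi = e^{H} \prod_{i=1}^{n-2} \frac{\partial}{\partial\psi(\xi_i)}\, e^{-H} = \sum_I (-1)^{|I|} \prod_{i \in I} H_{\xi_i} \Big(\sum_{P \subset I^c} \prod_{(j,k) \in P} H_{\xi_j, \xi_k}\Big), \tag{3.52}$$

where I are subsets of $\{1, \cdots, n - 2\}$, P are sets of unordered pairs of elements in $I^c$ and

$$H_\xi = 2\sum_\zeta C^{-1}(\xi, \zeta)\psi(\zeta), \qquad H_{\xi_1 \xi_2} = 2C^{-1}(\xi_1, \xi_2). \tag{3.53}$$

The number of partitions $I \subset \{1, \cdots, n - 2\}$ is $2^{n-3}$ ($|I^c|$ must be even) and note that

$$\sum_{P \subset I^c} \prod_{(j,k) \in P} H_{\xi_j, \xi_k} = \int \prod_{i \in I^c} \phi(\xi_i)\, d\nu_H(\phi),$$

where $d\nu_H(\phi)$ is the Gaussian measure of mean zero and covariance $H = 2G^{\circ 2}$.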


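We first estimate the first term of $\Phi$, $I = \{1, \cdots, n - 2\}$:

$$\sum_{\{\zeta_i\}} \prod_i 2|C(x_i, y_{i+1})| \prod_i |C^{-1}(\xi_i, \zeta_i)| \Big[\int d\mu_Y(s, \psi) \prod_i |\psi(\zeta_i)|\, |\Psi|\Big] \le M \Big[\int d\mu_Y(s, \psi)\, \Psi^2\Big]^{1/2},$$

where the integral of $\Psi^2$ is bounded by Lemma 7 (easily extended to $\Psi^2$) and

$$M \equiv \sum_{\zeta_i} \prod 2|C(x_i, y_{i+1})| \prod |C^{-1}(\xi_i, \zeta_i)| \Big[\int d\mu_Y(s, \psi) \prod \psi(\zeta_i)^2\Big]^{1/2}. \tag{3.54}$$

We take the sum over $\{\xi_i\}_1^{n-2} \subset \{x_k, y_{k+1}\}_1^{n-1}$ and put

$$\sum_{\xi \in \Delta_{a(k)}} \sum_{\xi' \in \Delta_{k+1}} 2|C(\xi, \xi')|\, |C^{-1}(\xi, \tilde x_k)|\, |C^{-1}(\xi', \tilde y_{k+1})| \equiv m^{-4}\, \delta f(\Delta_{a(k)}, \Delta_{k+1})(\tilde x_k, \tilde y_{k+1}).$$

Then $\delta f(\Delta_{a(k)}, \Delta_{k+1})(\tilde x_k, \tilde y_{k+1})$ is bounded by

$$\exp[-m\{\mathrm{dist}(\Delta_{a(k)}, \Delta_{k+1}) + \mathrm{dist}(\Delta_{a(k)}, \tilde x_k) + \mathrm{dist}(\Delta_{k+1}, \tilde y_{k+1})\}] \tag{3.55}$$

except for a coefficient $O(\log^4(1 + m^{-1}))$ which originates from $C^{-1} = G^{\circ 2}$. Here the constraints $\tilde x_k \in \Delta_{a(k)}$ and $\tilde y_{k+1} \in \Delta_{k+1}$ do not hold anymore. For $x_k$ or $y_{k+1}$ not contained in $\{\xi\}_1^{n-1}$, we put $\tilde x_k = x_k$ or $\tilde y_{k+1} = y_{k+1}$ and put $\delta f(\Delta_{a(k)}, \Delta_{k+1})(\tilde x_k, \tilde y_{k+1}) = 2C(\tilde x_k, \tilde y_{k+1})\chi_{\Delta_{a(k)}}(\tilde x_k)\chi_{\Delta_{k+1}}(\tilde y_{k+1})$ and so on. This is again bounded by (3.55).

Assume that $\tilde\Delta_i \subset \Lambda$ contains $\tilde d_i$ points of $\{\zeta_i\}$. If $d_{ij}$ points in $\tilde\Delta_i$ couple with $d_{ij}$ points in $\tilde\Delta_j$ (the same points appear twice in $\prod\psi(\zeta)^2$), $\sum_j d_{ij} = 2\tilde d_i$ and we have the factor $2^{\tilde d_i}2^{\tilde d_j}d_{ij}!$ ($2^{d_{ii}}$ for $(i, i)$). Since $\prod_j (d_{ij})! < (2\tilde d_i)!$, we find that

$$\begin{aligned}
\int d\mu \prod \psi(\zeta_i)^2 &\le \prod_i [(2\tilde d_i)!]^{1/2} \sum_{\{d_{ij}\}_j} \Big[\frac{(2\tilde d_i)!}{d_{i,1}! \cdots d_{i,n}!} \prod_j |C(\mathrm{dist}(\tilde\Delta_i, \tilde\Delta_j))|^{\frac{1}{2}d_{ij}}\Big] \\
&\le \prod_i [(2\tilde d_i)!]^{1/2} \Big[\sum_j |C(\mathrm{dist}(\tilde\Delta_i, \tilde\Delta_j))|^{1/2}\Big]^{2\tilde d_i} \\
&\le c_0^{2(n-2)} \prod [(2\tilde d_i)!]^{1/2},
\end{aligned} \tag{3.56}$$

where $c_0 = O(1)$. Since $(2d)! \le e^{2d\log 2d}$ and $\prod \delta f(\Delta_{a(k)}, \Delta_{k+1})(\tilde x_k, \tilde y_{k+1})$ is bounded by

$$\exp\Big[-\frac{4m}{5}\sum_k \{\mathrm{dist}(\Delta_{a(k)}, \Delta_{k+1}) + \mathrm{dist}(\Delta_{a(k)}, \tilde x_k) + \mathrm{dist}(\Delta_{k+1}, \tilde y_{k+1})\} - \frac{mL}{10}\sum_i \Big[\frac{\tilde d_i}{9}\Big]^{3/2}\Big],$$

the factors $(2\tilde d_i)!$ are compensated and the sum over $\{\tilde x_k, \tilde y_{k+1}\}_1^{n-1}$ just yields a coefficient $\le (cm)^{-4(n-1)}$.

The coefficients $\int \prod_{\xi \in I^c}\phi(\xi)\, d\nu_H$ of $\prod_{\xi \in I} H_\xi$ are again bounded by (3.56) by replacing $c_0$ by $c_0\log(1 + m^{-1})$ and $2\tilde d_i$ by the corresponding incidence numbers. Thus the total contribution of $\Phi$ is bounded by $2^{n-3}$ times the result of $I = \{1, \cdots, n - 2\}$.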


We introduce mass parameters $m_i$ for later convenience:

$$0 < m_1 = \frac{m}{10} < m_0 < \tilde m_0 < m_2 = \frac{4m}{5} < m, \tag{3.57}$$

where $Lm_0 \sim O(\beta) \gg 1$. The following lemmas are well-known to experts [5, 8, 16]:

Lemma 9 ([16], Lemma A.5). For a paved set X consisting of n squares $\{\Delta_i\}$, let $T(X)$ denote the set of tree graphs $\gamma$ over $\Delta_i$ and $L(X)$ denote the length of the shortest tree graph over centers of $\Delta_i \subset X$. Let $\mathrm{dist}_c(\Delta_i, \Delta_j)$ be the distance from the center of $\Delta_i$ to that of $\Delta_j$. Then there exist constants $K_1 = o(1)$ and $K_2 = o(1)$ such that

$$(1) \quad \sum_{X \ni 0} \sum_{\gamma \in T(X)} \exp\Big[-\tilde m_0 \sum_{(ij) \in \gamma} \mathrm{dist}_c(\Delta_i, \Delta_j)\Big] < K_1^n, \tag{3.58}$$

$$(2) \quad \sum_{X \ni 0} \exp[-\tilde m_0 L(X)] < K_2^n. \tag{3.59}$$

Proof. (1) Interchange the order of $\sum_X$ and $\sum_\gamma$, and take the sum over positions of $\Delta_i$ for each $\gamma$. If $\Delta_i$ are distinguishable, the result is bounded by $K^{n-1}$, where $K = o(1)$ since $\Delta_i$ are squares of size $L \times L$ and $e^{-\tilde m_0 L} \ll 1$. However the same configuration is counted $n!$ times. Then

$$\sum_{X \ni 0} \exp\Big[-\tilde m_0 \sum_{(ij) \in \gamma} \mathrm{dist}_c(\Delta_i, \Delta_j)\Big] < \frac{K'^{\,n}}{n!}.$$

We finally note that the number of tree graphs is $n^{n-2} < n!\,e^n$ to take the sum over $\gamma$.

(2) This is clear from $\exp[-\tilde m_0 L(X)] \le \sum_{\gamma \in T(X)} \exp[-\tilde m_0 \sum_{(ij) \in \gamma} \mathrm{dist}_c(\Delta_i, \Delta_j)]$.
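The counting step in the proof of Lemma 9 uses Cayley's formula, that there are $n^{n-2}$ labelled trees on n vertices, together with $n^{n-2} < n!\,e^n$. Both facts are easy to check for small n. The sketch below is an illustration only; the helper names are hypothetical, and the brute-force count is feasible only for very small n.

```python
import math
from itertools import combinations


def count_labeled_trees(n):
    """Count spanning trees of the complete graph on n vertices by brute force."""
    edges = list(combinations(range(n), 2))
    count = 0
    for subset in combinations(edges, n - 1):
        parent = list(range(n))

        def find(v):
            while parent[v] != v:
                parent[v] = parent[parent[v]]
                v = parent[v]
            return v

        acyclic = True
        for u, v in subset:
            ru, rv = find(u), find(v)
            if ru == rv:  # this edge would close a cycle
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:  # n - 1 acyclic edges on n vertices form a spanning tree
            count += 1
    return count


def cayley_bound_holds(n):
    """Check n^(n-2) < n! * e^n in logarithmic form to avoid overflow."""
    return (n - 2) * math.log(n) < math.lgamma(n + 1) + n


if __name__ == "__main__":
    print([count_labeled_trees(n) for n in (3, 4, 5)])
    print(all(cayley_bound_holds(n) for n in range(1, 201)))
```

The inequality is far from sharp, since $n!\,e^n \ge n^n$ already; it only needs to hold with room to spare for the sum over $\gamma$ to be absorbed into $K_1^n$.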

Lemma 10 ([5], Appendix C). Let X be a paved set consisting of $n_X$ squares $\Delta_i \subset X$. Let $f(Y)$ be functions satisfying the bounds

$$|f(Y)| \le \exp[-n_Y\tilde\delta_0 \log N - \tilde m_0 L(Y)],$$

where $n_Y$ is the number of squares $\Delta_i$ in Y and $L(Y)$ is the length of the shortest tree graph over centers of $\Delta_i \subset Y$. Then there exist strictly positive constants $\delta_0$ ($\sim \tilde\delta_0$) and $m_0$ ($\sim \tilde m_0$) such that

$$\Big|\frac{1}{p!}\sum_{Y_1 \cup \cdots \cup Y_p = X} \prod f(Y_i)\Big| \le \exp[-n_X\delta_0 \log N - m_0 L(X)], \tag{3.60}$$

where $\{Y_i : i = 1, \cdots, p\}$ are paved sets such that X cannot be divided into two disconnected parts without bisecting some $Y_i$.

Proof. We first extract the tree decay factor $\exp[-n_X\delta_0\log N - m_0 L(X)]$ from $\prod f(Y_i)$, choosing $\delta_0$ and $m_0$ slightly less than $\tilde\delta_0$ and $\tilde m_0$. We show that the remaining sum converges. By Cayley's theorem on the number of tree graphs with fixed incidence numbers $d_1, \cdots, d_p$, we have

$$\Big|\sum_T (\cdot)\Big| = \Big|\sum_{\{d_i\}} \sum_{T, \{d_i\}\ \mathrm{fixed}} (\cdot)\Big| \le \sum_{d_1, \cdots, d_p} \frac{(p - 2)!}{\prod (d_i - 1)!} \sup_{(T, d):\ \mathrm{fixed}} |(\cdot)|,$$


and take the sum over the $Y_i$'s starting from the end branches of the tree. Let $Y_p$ be one of the end branches and let $Y_j$ be the ancestor. Fix $\Delta_j \subset Y_p \cap Y_j$ and take the sum over $Y_p$. The sum is convergent and is bounded by $\sum_{Y_p \ni 0}|f(Y_p)|$. Next take the sum over $\Delta_j \subset Y_j$, which yields $(n_{Y_j})^{d_j - 1}$. Repeating this, we see that the sum is bounded by $n_X[\sum_{Y \ni 0}|f(Y)|e^{n_Y}]^p$, since $\sum_d n_Y^d/d! \le e^{n_Y}$. The factor $e^{n_Y}$ is compensated by a fraction of $\exp[-n_Y\tilde\delta_0\log N]$ in $f(Y)$. See also [5, 16] for details.

Proof of Theorem 5. We obtain $f(Y)$ in Lemma 10 from Lemma 8 by taking the sum over $T'$ in (3.26). This yields a constant bounded by 1. Thus we obtain $f(Y)$ in Lemma 10.

We determine the parameters $\tilde\delta_0$ and $\tilde m_0$. In Lemma 8, X may be single squares $\Delta$, and they do not have tree decay factors. Moreover $\Delta_i$ and $\Delta_j$ may be nearest neighbours of each other and $\mathrm{dist}(\Delta_i, \Delta_j) = 1$. Then we put $\tilde\delta_0 \equiv (\tilde\delta - 4\varepsilon_0)/2$ and borrow $N^{-\tilde\delta_0}$ from $N^{-2\tilde\delta_0}$ in Eq. (3.47) in Lemma 8 to extract the factor $\exp[-\tilde m_0 L(\Delta_i \cup \Delta_j)] = e^{-\tilde m_0 L}$ in this case. Namely

$$\tilde m_0 \equiv \tilde\delta_0 \frac{\log N}{L} \quad \Big(\sim \tilde\delta\frac{m}{40} \text{ if } L = 20\log N/m\Big). \tag{3.61}$$

Let $T(\{Y_i\})$ be the set of tree graphs (no loops) over $\{Y_i\}$ such that $\cup Y_i = X$. Thus applying (3.43) and (3.47) to (3.33), we have that

$$|\rho_X| \le \sum_{p=1}^{n} \frac{1}{p!} \sum_{\cup Y_i = X} \Big[\prod_{i=1}^{p} A(Y_i)e^{\tilde\varepsilon_i}\Big] \sum_T \Big[\prod_{(ij) \in T} b_{ij}\Big],$$

where $A(Y) \le \exp[-n_Y\tilde\delta_0\log N - \tilde m_0 L(Y) + c_1 N^{-\hat\delta}|Y|]$ ($c_1 = O(1)$), $Y_i \cap Y_j = \emptyset$ for $i \ne j$, and $b_{ij} \equiv \exp[-\tilde\delta_0\log N - \tilde m_0\, \mathrm{dist}_c(Y_i, Y_j)]$ comes from $\partial/\partial s_{ij}$ and

$$\mathrm{dist}_c(Y_i, Y_j) = \min_{\Delta_i \subset Y_i,\ \Delta_j \subset Y_j} \mathrm{dist}_c(\Delta_i, \Delta_j). \tag{3.62}$$

Moreover we have put (effects of loops in $\tilde T(\{Y_i\})$)

$$1 + \sum_{\ell \ne i} b_{i\ell} + \sum_{\ell < m} b_{i\ell}b_{im} + \cdots < \exp\Big[\sum_{j : j \ne i} b_{ij}\Big] < e^{\tilde\varepsilon_i}, \qquad \tilde\varepsilon_i = O(N^{-\tilde\delta_0}).$$

Then we can extract $\exp[-n_X\delta_0\log N - m_0 L(X)]$, choosing $\delta_0$ and $m_0$ slightly smaller than $\tilde\delta_0$ and $\tilde m_0$, respectively, to compensate $N^{-\hat\delta}|X| \le n_X N^{-\hat\delta + \varepsilon_0}$. Finally we use Lemma 10 to prove that the remaining terms converge.

Remark 4. We may choose $\delta = \frac{1}{12}$ and $\tilde\delta = \frac{1}{16}$ so that $\hat\delta = \frac{1}{2} - 3(\delta + \tilde\delta) - 2\varepsilon_0 = \frac{1}{16} - 2\varepsilon_0$. Then $\tilde\delta_0 \sim \frac{1}{2}\tilde\delta = \frac{1}{32}$. For large N, $\delta_0 \sim \tilde\delta_0$ and $m_0 \sim \tilde m_0$.
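The exponent bookkeeping of Remark 4 can be reproduced with a few lines of arithmetic. The sketch below assumes the asymptotic relations $m^2 = 32e^{-4\pi\beta}$ and $N = e^{400\pi\beta}$ to evaluate $\varepsilon_0 = -2.1\log m/\log N$; the function names are hypothetical and the numbers are only indicative.

```python
import math
from fractions import Fraction

DELTA = Fraction(1, 12)        # delta in (3.14)
DELTA_TILDE = Fraction(1, 16)  # delta-tilde of Remark 4


def epsilon_0(beta):
    """eps_0 = -2.1 log m / log N, assuming m^2 = 32 e^(-4 pi beta), N = e^(400 pi beta)."""
    log_m = 0.5 * (math.log(32.0) - 4.0 * math.pi * beta)
    log_n = 400.0 * math.pi * beta
    return -2.1 * log_m / log_n


def delta_hat_leading():
    """The part of delta-hat in (3.42) that does not depend on eps_0."""
    return Fraction(1, 2) - 3 * (DELTA + DELTA_TILDE)


def exponents(beta):
    """Return (eps_0, delta_hat, delta_tilde_0) as floats for a given beta."""
    eps = epsilon_0(beta)
    delta_hat = float(delta_hat_leading()) - 2.0 * eps
    delta_tilde_0 = 0.5 * (float(DELTA_TILDE) - 4.0 * eps)
    return eps, delta_hat, delta_tilde_0


if __name__ == "__main__":
    print(delta_hat_leading())
    print(exponents(5.0))
```

Under these assumptions $\varepsilon_0$ stays just below $0.0105$ for large $\beta$, so both $\hat\delta$ and $\tilde\delta_0$ remain strictly positive.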

4. Polymer Expansion with Large Fields

We here show that the contributions from large field regions are small and that the dominant contributions come from the small field regions we discussed. The analysis is easy in two extremal cases where $|\psi(x)|$ are very small or very large. If $|\psi|$ are small, we expand the determinant using the $N^{-1}$ expansion, and we extract small fields as $\exp[-\langle \psi, C^{-1}\psi \rangle]$, leaving large fields untouched. Very large fields are easily estimated


by the $|\psi|^{-N/2}$ behaviour of the determinant (thus the contribution is small). But it is hard to estimate contributions from $N^\delta < |\psi(x)| < N^{1/2 + \delta}$ and from $|\psi(x)| < N^\delta$ near R. We bound their contributions by the stability. This makes our analysis complicated (crude).

For the large field region R introduced by $\prod_{x \in R}\tau^c(\psi(x))$, we define another large field region $R' = R(L_0)$ which includes points of $K = \Lambda - R$ near R:

$$R' = R(L_0) \equiv \{x \in \Lambda;\ \mathrm{dist}(x, R) \le L_0\}. \tag{4.1}$$

Let $\tilde D$ be the smallest paved set containing $R'$. We denote by D the union of $\tilde D$ and those $\Delta \subset K$ nearest to $\tilde D$. We set $\partial D = D - \tilde D$, and we call it a collar [8] or a corridor [16]. Decompose D into connected components $D_i$, and set $R_i = D_i \cap R$ and $R_i' = D_i \cap R'$. Then

$$D = \cup D_i, \qquad \mathrm{dist}(D_i, D_j) \ge L, \quad i \ne j, \tag{4.2}$$

$$R_i = D_i \cap R, \qquad \mathrm{dist}(R_i, R_j) > 3L + 2L_0, \quad i \ne j. \tag{4.3}$$

It is convenient to define two types of small field regions:

$$K' = \Lambda - D, \qquad \tilde K = \Lambda - R'. \tag{4.4}$$

In the following, we may write

$$A_i = \frac{2i}{\sqrt{N}}\chi_i G\psi\chi_i, \qquad A_{ij} = \frac{2i}{\sqrt{N}}\chi_i G\psi\chi_j, \tag{4.5}$$

where $\chi_0 = \chi_{\tilde K}$, $\chi_1 = \chi_{R'}$, i.e., $A_0 = A_{\tilde K}$, $A_1 = A_{R'}$, $A_{01} = A_{\tilde K, R'}$ and so on when there is no danger of confusion. Then we can factorize the determinant (see the remark below):

$$\det(1 + A_\Lambda) = \det(1 + A_{R'})\det(1 + A_{\tilde K} - W_{\tilde K}), \tag{4.6}$$

$$W_{\tilde K} \equiv A_{\tilde K, R'}\frac{1}{1 + A_{R'}}A_{R', \tilde K}. \tag{4.7}$$

Here and hereafter we regard $A_R$, $G_R$ and so on as operators on $\mathbb{C}^R$, and $A_{R_1, R_2}$, $G_{R_1, R_2}$ and so on as operators $\mathbb{C}^{R_2} \to \mathbb{C}^{R_1}$, where $R, R_1, R_2 \subset \Lambda$.

Theorem 11. Let $D_i$ be any connected paved set and let $R_i$ be a large field region consistent with $D_i$. Put $R_i' = \{x \in D_i;\ \mathrm{dist}(x, R_i) \le L_0\}$. Then the following (stability) bound holds:

$$\int |{\det}^{-N/2}(1 + A_{R_i'})| \prod_{x \in R_i} d\psi(x) = \exp[-\langle \psi_{\tilde R_i}, T_i\psi_{\tilde R_i} \rangle - E(\psi_{\tilde R_i})], \tag{4.8}$$

$$E(\psi_{\tilde R_i}) \ge \frac{\beta}{10}|R_i|N^{\delta_2}, \tag{4.9}$$

where $\tilde R_i \equiv R_i' \setminus R_i$, $\langle \psi_{\tilde R_i}, T_i\psi_{\tilde R_i} \rangle$ is a positive bilinear form of $\psi_{\tilde R_i}$ defined later and $\delta_2 = O(1)$ ($= 1/24$) is a strictly positive constant discussed later.


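Theorem 12. The small field contribution is represented by the polymer expansion:

$$Z_{\tilde K} \equiv \int \exp\Big[-\sum_i \langle\psi_{\tilde R_i}, T_i\psi_{\tilde R_i}\rangle + \frac{N}{2}\mathrm{Tr}\,A_{\tilde K}\Big] \det{}^{-N/2}(1 + A_{\tilde K} - W_{\tilde K}) \prod_{x\in K}\frac{d\psi(x)}{\sqrt\pi} = Z_\infty\,\frac{\det^{1/2}(C_{\tilde K})}{Z_\infty}\Big(\sum_n \frac{1}{n!}\sum_{\cup_1^n X_i = \tilde K}\prod_i \tilde\rho_{X_i}\Big), \tag{4.10}$$

where $C_{\tilde K} = [\chi_{\tilde K} G^{\circ 2}\chi_{\tilde K}]^{-1}$. $\tilde\rho_X$ satisfies the following bound uniformly in $\psi_R(x)$:

$$|\tilde\rho_X| \le \exp[-m_0 L(X \wedge D) - \delta_0 n_X \log N + \pi L_0^2 |R_X|\delta \log N], \quad \text{for } n_X \ge 2. \tag{4.11}$$

Here $R_X = R \cap X$, $n_X$ is the number of unit squares $\Delta \subset X$ such that $\Delta \cap R' = \emptyset$, and $L(X \wedge D)$ is the length of the shortest tree graph over $D_\ell \subset X$ and centers of $\Delta \subset X$.

The reader should note that these theorems mean that

$$Z_\infty Z(R) \sim Z_{\tilde K}\exp\Big[-\sum_i \min_{\psi_{\tilde R_i}} E(\psi_{\tilde R_i})\Big]$$

and $\rho_X \sim \tilde\rho_X \exp[-\min_{\psi_{\tilde R_X}} E(\psi_{\tilde R_X})]$. (The estimate of $\psi_{R_i}$ in $Z_{\tilde K}$ remains.) Since the factor $E(\psi_R)$ compensates $\pi L_0^2 |R_X|\delta \log N$ in $\tilde\rho_X$ which originates from small fields near $R_X$, we obtain sufficiently small $\rho_X$. We prove these theorems in the rest of this section.

Remark 5. For matrices $A$, $B$, $C$ and $D$ of sizes $\ell\times\ell$, $m\times m$, $m\times\ell$ and $\ell\times m$ respectively, we have (blockwise diagonalization [11]):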

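$$\begin{pmatrix} A & D \\ C & B \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ CA^{-1} & 1 \end{pmatrix}\begin{pmatrix} A & 0 \\ 0 & B - CA^{-1}D \end{pmatrix}\begin{pmatrix} 1 & A^{-1}D \\ 0 & 1 \end{pmatrix}. \tag{4.12}$$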
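The blockwise diagonalization of Remark 5, and the determinant factorization it implies, $\det = \det(A)\det(B - CA^{-1}D)$, can be checked exactly on a small example. The following is only an illustrative sketch: the block entries are arbitrary numbers chosen here (they are not taken from the model), and the helper names `matmul`, `inverse`, `det` and `block` are ours. Exact rational arithmetic is used so that equality can be tested without rounding.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def inverse(M):
    """Exact Gauss-Jordan inverse of a square matrix of Fractions."""
    n = len(M)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)] for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def det(M):
    """Exact determinant by Laplace expansion (adequate for tiny matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def frac(rows):
    return [[Fraction(v) for v in row] for row in rows]


def block(P, Q, R, S):
    """Assemble the block matrix [[P, Q], [R, S]]."""
    return [p + q for p, q in zip(P, Q)] + [r + s for r, s in zip(R, S)]


def zeros(r, c):
    return [[Fraction(0)] * c for _ in range(r)]


def eye(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


# Arbitrary illustrative blocks: A is 2x2, B is 3x3, C is 3x2, D is 2x3.
A = frac([[4, 1], [2, 3]])
B = frac([[5, 1, 0], [1, 6, 2], [0, 2, 7]])
C = frac([[1, 0], [2, 1], [0, 3]])
D = frac([[1, 2, 0], [0, 1, 1]])

Ainv = inverse(A)
CAinv = matmul(C, Ainv)
AinvD = matmul(Ainv, D)
# Schur complement B - C A^{-1} D
schur = [[b - x for b, x in zip(rb, rx)] for rb, rx in zip(B, matmul(CAinv, D))]

full = block(A, D, C, B)
lower = block(eye(2), zeros(2, 3), CAinv, eye(3))
middle = block(A, zeros(2, 3), zeros(3, 2), schur)
upper = block(eye(2), AinvD, zeros(3, 2), eye(3))
factorized = matmul(matmul(lower, middle), upper)
```

With $A \to 1 + A_{R'}$ and $B \to 1 + A_{\tilde K}$ the same computation gives the factorization (4.6) with the Schur complement $1 + A_{\tilde K} - W_{\tilde K}$.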

4.1. Polymer expansion of $A_i$ with large fields.

4.1.1. Properties and expansions of $J(R)$. We note that

$$W_{\tilde K} = G_{\tilde K,R'}\,J(R', \psi_{R'})\,G_{R',\tilde K}\,\frac{2i}{\sqrt N}\psi_{\tilde K}, \tag{4.13}$$

$$J(R', \psi_{R'}) = \frac{1}{G_{R'} - i\frac{\sqrt N}{2}\psi_{R'}^{-1}}, \tag{4.14}$$

where $G_{R'} = \chi_{R'} G\chi_{R'}$, $G_{\tilde K,R'} = \chi_{\tilde K} G\chi_{R'}$ and so on, and $\psi_{R'}$ is regarded as the diagonal matrix: $\psi_{R'} = \mathrm{diag}(\psi(x), x \in R')$. We first study properties of the operator $J(R, \psi) \equiv J(R, \psi_R)$.
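The rewriting of $W_{\tilde K}$ in terms of $J$ follows from $\frac{2i}{\sqrt N}\psi\,(1 + G\frac{2i}{\sqrt N}\psi)^{-1} = (G - i\frac{\sqrt N}{2}\psi^{-1})^{-1}$. As a sanity check one may compare both sides of (4.13) numerically. The sketch below is only illustrative: the symmetric matrix `G`, the field values `psi`, the value of `N` and the split into two "small field" sites `K` and two "large field" sites `Rp` are arbitrary choices made here, not data from the model.

```python
import math


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def inverse(M):
    """Gauss-Jordan inverse with partial pivoting, for complex entries."""
    n = len(M)
    aug = [[complex(v) for v in row] + [complex(i == j) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


N = 16                       # illustrative value only
c = 2j / math.sqrt(N)        # the factor 2i / sqrt(N)
G = [[2.0, 0.5, 0.3, 0.1],   # arbitrary symmetric stand-in for the Green's function
     [0.5, 2.0, 0.4, 0.2],
     [0.3, 0.4, 2.0, 0.5],
     [0.1, 0.2, 0.5, 2.0]]
psi = [0.3, -0.7, 5.0, -8.0]
K, Rp = [0, 1], [2, 3]       # stand-ins for K-tilde and R'


def A(rows, cols):
    """Block A_{rows, cols} = (2i/sqrt N) chi_rows G psi chi_cols."""
    return [[c * G[x][y] * psi[y] for y in cols] for x in rows]


# Direct definition (4.7): W = A_{K,R'} (1 + A_{R'})^{-1} A_{R',K}
one_plus_A = [[(1 if x == y else 0) + c * G[x][y] * psi[y] for y in Rp] for x in Rp]
W_direct = matmul(matmul(A(K, Rp), inverse(one_plus_A)), A(Rp, K))

# Form (4.13)-(4.14): W = G_{K,R'} J G_{R',K} (2i/sqrt N) psi_K
J = inverse([[G[x][y] - (1j * math.sqrt(N) / (2 * psi[x]) if x == y else 0) for y in Rp]
             for x in Rp])
G_K_Rp = [[G[x][y] for y in Rp] for x in K]
G_Rp_K_psi = [[c * G[x][y] * psi[y] for y in K] for x in Rp]
W_via_J = matmul(matmul(G_K_Rp, J), G_Rp_K_psi)

max_diff = max(abs(a - b) for ra, rb in zip(W_direct, W_via_J) for a, b in zip(ra, rb))
```

The two expressions agree up to rounding error, which is what makes the bounds on $J$ below sufficient to control $W_{\tilde K}$.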

Lemma 13. The following relations hold:

(1) $\|J(R, \psi_R)\| \le \|G_R^{-1}\| \le 8 + m^2$ uniformly in $R \ne \emptyset$ and $\psi_R$.

(2) $[G_R - i\psi_R^{-1}]^{-1}(x, y)$ decays exponentially fast uniformly in $R \ne \emptyset$ and $\psi_R$:

$$|[G_R - i\psi_R^{-1}]^{-1}(x, y)| \le \mathrm{const.}\ G_R(x, y). \tag{4.15}$$

Proof. (1) Since $m^{-2} > G_R > (m^2 + 8)^{-1} > 0$ uniformly in $R \ne \emptyset$, $G_R^{1/2}$ and $G_R^{-1/2}$ satisfy inequalities of the same type. Moreover since $G_R^{-1/2}\psi_R^{-1}G_R^{-1/2}$ is self-adjoint, we see that $\|1 - i\frac{\sqrt N}{2}G_R^{-1/2}\psi_R^{-1}G_R^{-1/2}\| \ge 1$. Then the conclusion follows from

$$J(R, \psi) = G_R^{-1/2}\,\frac{1}{1 - i\frac{\sqrt N}{2}G_R^{-1/2}\psi_R^{-1}G_R^{-1/2}}\,G_R^{-1/2}.$$

(2) We first note that

$$\frac{1}{G_R - i\psi_R^{-1}} = i\psi_R\,\frac{1}{G_R^{-1} + i\psi_R}\,G_R^{-1},$$

where ([17], Theorem VIII.1, or use (4.12))

$$G_R^{-1} = \chi_R(-\Delta + m^2)\chi_R - B_{\partial R}, \tag{4.16}$$

$$B_{\partial R} = E\,(\chi_{R^c}(-\Delta + m^2)\chi_{R^c})^{-1}E^{*}, \tag{4.17}$$

$$E = \chi_R(-\Delta)\chi_{R^c}. \tag{4.18}$$

Here $B_{\partial R}$ is a positive operator bounded by $\chi_R(-\Delta + m^2)\chi_R$ (by the positivity) and has non-negative matrix elements. $B_{\partial R}(x, y) \ne 0$ if and only if $(x, y) \in \partial R \times \partial R$, where $\partial R = \{x \in R;\ \exists\, y \in R^c, |x - y| = 1\}$. Then we have the convergent Neumann expansion,

$$i\psi_R\,\frac{1}{\chi_R(-\Delta + m^2 + i\psi)\chi_R - B_{\partial R}} = i\psi_R\,G_R^D(\psi)\Big[\sum_{n=0}^{\infty}(B_{\partial R}G_R^D(\psi))^n\Big],$$

where $G_R^D(\psi) = [\chi_R(-\Delta + m^2 + i\psi)\chi_R]^{-1}$ and

$$|G_R^D(\psi)_{xy}| \le G_R^D(\psi = 0)_{xy}, \qquad |\psi_x G_R^D(\psi)_{xy}| \le (4 + m^2)\,G_R^D(\psi = 0)_{xy},$$

as is proved by the random walk representation of $G_R^D(\psi)$. Putting all $\psi = 0$, we find that

$$\Big|\frac{1}{G_R - i\psi_R^{-1}}(x, y)\Big| \le (4 + m^2)\sum_\zeta G_R(x, \zeta)\,|G_R^{-1}(\zeta, y)|.$$

Then (2) follows since $|G_R^{-1}(\zeta, y)| = 2(m^2 + 4)\delta_{\zeta y} - G_R^{-1}(\zeta, y)$ by (4.16).
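The representation (4.16)-(4.18) of $G_R^{-1}$ is the Schur complement formula applied to $-\Delta + m^2$, and it can be verified exactly on a toy lattice. The sketch below makes several assumptions of its own: a $3\times 3$ box with the lattice Laplacian restricted to the box (diagonal $4$, nearest-neighbour entries $-1$), the value $m^2 = 1/4$, and a three-point region `R`. None of these choices come from the paper; they only make the identity checkable in exact arithmetic.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def inverse(M):
    """Exact Gauss-Jordan inverse of a square matrix of Fractions."""
    n = len(M)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)] for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


side = 3                      # toy box Lambda of 3 x 3 sites (assumption)
m2 = Fraction(1, 4)           # illustrative mass squared (assumption)
sites = [(r, c) for r in range(side) for c in range(side)]
index = {s: k for k, s in enumerate(sites)}


def operator():
    """Matrix of -Delta + m^2 on the box: diagonal 4 + m^2, neighbours -1."""
    n = len(sites)
    M = [[Fraction(0)] * n for _ in range(n)]
    for (r, c), k in index.items():
        M[k][k] = 4 + m2
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nb = (r + dr, c + dc)
            if nb in index:
                M[k][index[nb]] = Fraction(-1)
    return M


def restrict(M, rows, cols):
    return [[M[i][j] for j in cols] for i in rows]


M = operator()
G = inverse(M)                                          # G = (-Delta + m^2)^{-1}
R = [index[s] for s in ((0, 0), (0, 1), (1, 1))]        # illustrative region R
Rc = [k for k in range(len(sites)) if k not in R]       # complement R^c

G_R = restrict(G, R, R)                                 # G_R = chi_R G chi_R
E = restrict(M, R, Rc)       # equals chi_R(-Delta)chi_{R^c} because m^2 is diagonal
E_star = restrict(M, Rc, R)
B = matmul(matmul(E, inverse(restrict(M, Rc, Rc))), E_star)          # (4.17)
rhs = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(restrict(M, R, R), B)]
lhs = inverse(G_R)                                      # left side of (4.16)
```

In this example `lhs` and `rhs` coincide exactly, and every entry of `B` is non-negative, in line with the properties of $B_{\partial R}$ used in the proof.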

Lemma 14. $J(R, \psi)$ admits the following cluster (random walk) expansion:

$$J(R, \psi) = \sum_{X \subset R}\delta J(X, \psi), \tag{4.19}$$

where $X$ are intersections of $R$ with paved sets ($X = \cup_i(\Delta_i \cap R)$). Moreover $\delta J(X, \psi)$ depends only on $\psi(x)$, $x \in X$. If $\mathrm{diam}(X) > \sqrt 2(2L + 1)$, then

$$\|\delta J(X, \psi)\| \le \exp[-m_1 L(X)], \tag{4.20}$$

$$|\delta J(X, \psi)_{xy}| \le \exp[-m_1 L(X, x, y)], \tag{4.21}$$

where $\delta J(X, \psi)_{xy}$ is the $(x, y)$ component of $\delta J(X, \psi)$ $(x, y \in X)$ and $L(X, x, y)$ is the length of the shortest walk from $x$ to $y$ through all centers of $\Delta_\ell \subset X$, $x \notin \Delta_\ell$, $y \notin \Delta_\ell$.

Proof. We apply the expansion procedure by Federbush and Brydges to $G_R^{-1}$. For any $X \subset R$, $X = \cup_1^n(\Delta_i \cap R)$, we choose $\Delta_1 \cap R \subset X$ and $s_1 \in [0, 1]$ and define

$$G(X, s_1) = [(1 - s_1)(G_{X\setminus\Delta_1}^{-1} + G_{\Delta_1}^{-1}) + s_1 G_X^{-1}]^{-1}, \qquad J(X, s_1) = [G(X, s_1) - ih_X]^{-1},$$

where $h = \sqrt N\psi^{-1}/2$, $G_X^{-1} \equiv \chi_X G_R^{-1}\chi_X$ and $\Delta_i \cap R$ is denoted as $\Delta_i$ for simplicity. Then $J(X) = J(X, s_1 = 1)$ and $J(X, s_1)$ is bounded uniformly in $h$ and $s_1$, and we have

$$J(X) = J(X, s_1 = 0) + \int_0^1 J'(X, s_1)\,ds_1 = J(X\setminus\Delta_1) \oplus J(\Delta_1) - \sum_{\Delta_2 \ne \Delta_1}\int_0^1 J(X, s_1)G(X, s_1)\,\delta G_{12}^{-1}\,G(X, s_1)J(X, s_1)\,ds_1,$$

where $\delta G_{ij}^{-1} = G_{\Delta_i\Delta_j}^{-1} + G_{\Delta_j\Delta_i}^{-1}$, and we have used

$$\frac{\partial}{\partial s_1}G(X, s_1) = -\sum_{\Delta_2 \subset X\setminus\Delta_1}G(X, s_1)\big[G_{\Delta_1,\Delta_2}^{-1} + G_{\Delta_2,\Delta_1}^{-1}\big]G(X, s_1),$$

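and so on. We choose $\Delta_2 \ne \Delta_1$ and $s_2$ in the next step and continue the process inductively. (See the appendix and the proof of Theorem 3). Let $J(R)_{xy}$ be the $(x, y)$ component of $J(R)$. Then we have

$$J(R)_{x,y} = \sum_{X \subset R}\delta J(X)_{x,y}, \qquad \delta J(X)_{xy} = \sum_T \delta J(X)_T(x, y),$$

where $T$ are tree graphs over $\{\Delta_1 \cap R, \cdots, \Delta_n \cap R\}$ with root $\Delta_1$ and $\delta J(X)_T(x, y)$ is given by

$$\sum_{\gamma: T(\gamma) = T}\sum_\pi (-1)^{n-1}\int M_\gamma(s)\prod_1^{n-1}ds_i \sum_{k_i = 0,1}[J(X, s_\gamma)G(X, s_\gamma)]_{x,\ell_{\pi(1)}}\,\delta G^{-1}_{\ell_{\pi(1)},m_{\pi(1)}}\,G^{(k_1)}_{m_{\pi(1)},\ell_{\pi(2)}}\,\delta G^{-1}_{\ell_{\pi(2)},m_{\pi(2)}}\cdots G^{(k_{n-1})}_{m_{\pi(n-2)},\ell_{\pi(n-1)}}\,\delta G^{-1}_{\ell_{\pi(n-1)},m_{\pi(n-1)}} \times [G(X, s_\gamma)J(X, s_\gamma)]_{m_{\pi(n-1)},y}$$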

with $G^{(0)} = G(X, s_\gamma)$ and $G^{(1)} = G(X, s_\gamma)J(X, s_\gamma)G(X, s_\gamma)$. Here $\gamma$ are tree graphs over $\{\Delta_{j_1}, \cdots, \Delta_{j_n}\}$ $(j_1 = 1)$ and for a given tree $\gamma = \{b_1, b_2, \cdots, b_{n-1}\}$, $b_k = (\ell_k, m_k)$ $(\ell_k, m_k \in \{j_1, \cdots, j_n\})$, $\pi$ stands for permutations of $\{b_k = (\ell_k, m_k)\}_1^{n-1}$. Moreover $s_i$ are introduced following the tree graph $\gamma$. (See Theorem 2 for the notation.) $G^{-1}(X, s_\gamma)$ is a convex linear combination of $\chi_Y(-\Delta + m^2 - B_{\partial R})\chi_Y$, $Y \subset X$. Then the non-diagonal terms of $G^{-1}(X, s_\gamma)$ are negative (ferromagnetic), and we have

$$|G^{(i)}(X, s_\gamma)_{x,y}| \le c_1 m^{-2}\exp[-m_2|x - y|],$$

uniformly in $\{s_i\}$ and $X$, where $i = 0, 1$, $m_2 = 4m/5$ and $c_1$ is a positive constant. If $\Delta_i$ and $\Delta_j$ are nearest neighbour and $x \in \Delta_i$ and $y \in \Delta_j$ are close to each other, some of the matrix elements $(\delta G_{ij}^{-1})_{xy}$ may be large. Since $e^{-mL} \ll 1$, this happens only for blocks of form $\cup_{i=1}^p\Delta_i$ with $\mathrm{diam}(\cup\Delta_i) \le \sqrt 2(2L + 1)$ (thus $p \le 4$). Then for $n > 4$,

$$|G(s)_{p,\ell_{\pi(1)}}\,\delta G^{-1}_{\ell_{\pi(1)},m_{\pi(1)}}\,G^{(k_1)}(s)\cdots\delta G^{-1}_{\ell_{\pi(n-1)},m_{\pi(n-1)}}\,G(s)_{m_{\pi(n-1)},q}| \le \exp\Big[-\frac{1}{5}m_2 L_{\pi(\gamma)}(p, q)\Big],$$

$$L_{\pi(\gamma)}(p, q) = \mathrm{dist}_c(p, \ell_{\pi(1)}) + \mathrm{dist}_c(\ell_{\pi(1)}, m_{\pi(1)}) + \mathrm{dist}_c(m_{\pi(1)}, \ell_{\pi(2)}) + \cdots + \mathrm{dist}_c(m_{\pi(n-1)}, q),$$

where $\mathrm{dist}_c(i, j) \equiv \mathrm{dist}_c(\Delta_i, \Delta_j)$. We can then extract either the tree decay factor of $\gamma$

$$\exp[-m_1''(\mathrm{dist}_c(p, \ell_{\pi(1)}) + \mathrm{dist}_c(m_{\pi(n-1)}, q))]\prod_{(ij)\in\gamma}\exp[-m_1'\,\mathrm{dist}_c(i, j)], \tag{4.22}$$

or the decay factor proportional to the length of walk, $\exp[-m_1' L(\Delta_p, \{\Delta\}, \Delta_q)]$ with the remainder bounded by $\sum_\pi\exp[-m_1'' L_{\pi(\gamma)}(p, q)]$, where $m_1' + m_1'' = m_2/5$. We complete the proof by Lemma 9, by replacing $m_1'$ by $m_1 \equiv m/10 < m_1'$ to compensate $K_2^n$.

Remark 6. In the proof of Lemma 14, we may introduce interpolation parameters $s_i$ in such a way that $G_R \to G_R(s) \equiv (1 - s)(\chi_{R\setminus\Delta}G_R\chi_{R\setminus\Delta} + \chi_\Delta G_R\chi_\Delta) + sG_R$ in the denominator of $J(R, \psi)$, though $G_R^{-1}(s)$ may not be ferromagnetic in this case. See Appendix C. Moreover if $R = \cup R_i$ and $\{R_i\}$ distribute dilutely, we can just Taylor-expand the off-diagonal terms $G_{R_i,R_j}$ $(i \ne j)$. This is the standard random walk expansion.

4.1.2. Proof of Theorem 11 (large field contribution). Let us consider the contribution from the large field region $R' = \cup R_i'$, $R_i' = R' \cap D_i$:

$$\det{}_2^{-N/2}(1 + A_{R'}) = \Big[\prod_i \det{}_2^{-N/2}(1 + A_{R_i'})\Big]\det{}^{-N/2}\Big(1 + \sum_{i\ne j}\delta A_{ij}\Big), \tag{4.23}$$

$$\delta A_{ij} = A_{R_i',R_j'}\frac{1}{1 + A_{R_j'}} = G_{R_i',R_j'}\frac{1}{G_{R_j'} - \frac{i\sqrt N}{2\psi_{R_j'}}}. \tag{4.24}$$

Since $R_i'$ and $R_j'$ are separated by distance more than $3L$, we see that

$$\|\delta A_{ij}\|_1 \le m^{-4}\exp[-m\,\mathrm{dist}(R_i', R_j')] \times \min\{|R_i'|, |R_j'|\} \le \min\{|R_i'|, |R_j'|\}\exp\Big[-\frac{4m}{5}\mathrm{dist}(R_i', R_j')\Big] \tag{4.25}$$

uniformly in $\psi(x)$, $x \in R'$, where $\|A\|_p^p = \mathrm{Tr}\,|A|^p$ $(p \ge 1)$. (Note that $\|A\|_1 = \mathrm{Tr}\,|A| \le \sum|A(x, y)|$ and $\|A\|_2^2 = \sum|A(x, y)|^2$.) Then it is enough to consider $\det(1 + A_{R_i'})$.

Let $\delta_1$ be a positive constant such that $0 < 2\delta - 3\delta_1$, and set $R_i = L_i \cup M_i$, where

$$L_i = \{x \in R_i;\ |\psi(x)| > N^{\frac12+\delta_1}\}, \qquad M_i = \{x \in R_i;\ |\psi(x)| \le N^{\frac12+\delta_1}\}. \tag{4.26}$$

($L$ stands for Large, and $M$ stands for Medium. Only in this subsection, $L$ and $L_i$ stand for regions of very large fields $\psi$. We apologize for the abuse of notation.) We also introduce

$$L_i(L_0) = \{x \in R_i';\ \mathrm{dist}(x, L_i) \le L_0\}, \tag{4.27}$$

$$M_i(L_0) = \{x \in R_i';\ \mathrm{dist}(x, M_i) \le L_0\}, \tag{4.28}$$

and set $\tilde M_i = R_i' - L_i = M_i \cup \tilde R_i$. For notational simplicity, we omit the subscript $i$ for a while and we denote $R_i'$ by $R'$, $R_i$ by $R$ and $L_i$ by $L$ and so on. We first extract $\psi_L = \chi_L\psi\chi_L$:

$$\det(1 + A_{R'}) = \det(1 + A_L)\det\Big(1 + (T_{\tilde M} - \delta T_{\tilde M})\frac{2i}{\sqrt N}\psi_{\tilde M}\Big), \tag{4.29}$$

where we have used the following abbreviations:

$$A_L = \chi_L G\chi_L\frac{2i}{\sqrt N}\psi_L, \tag{4.30}$$

$$T_{\tilde M} = G_{\tilde M} - G_{\tilde M,L}G_L^{-1}G_{L,\tilde M}, \tag{4.31}$$

$$\delta T_{\tilde M} = G_{\tilde M,L}\Big[\Big(G_L - \frac{i\sqrt N}{2\psi_L}\Big)^{-1} - G_L^{-1}\Big]G_{L,\tilde M}. \tag{4.32}$$

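Lemma 15. If $\{c_1 < |\psi(x)| < c_2;\ x \in A\}$, $0 < c_i$, then

$$\mathrm{spec}\,|G_A^{1/2}\psi_A G_A^{1/2}| \subset \Big[\frac{c_1}{m^2 + 8}, \frac{c_2}{m^2}\Big].$$

Proof. Since $(8 + m^2)^{-1}\langle f, f\rangle \le \langle f, G_A f\rangle \le m^{-2}\langle f, f\rangle$ for $f \in C^A$, we have

$$\|G_A^{1/2}\psi_A G_A^{1/2}f\|^2 = \langle\psi_A G_A^{1/2}f,\ G_A\psi_A G_A^{1/2}f\rangle \ge (8 + m^2)^{-1}\langle\psi_A G_A^{1/2}f,\ \psi_A G_A^{1/2}f\rangle \ge (8 + m^2)^{-2}\Big(\inf_{x\in A}|\psi(x)|^2\Big)\langle f, f\rangle. \tag{4.33}$$

The other inequality is also immediate.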

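Lemma 16. The matrices $T_{\tilde M}$ and $\delta T_{\tilde M}$ have the following properties:

$$T_{\tilde M}^{-1} = (G_{R'}^{-1})_{\tilde M} \equiv \chi_{\tilde M}G_{R'}^{-1}\chi_{\tilde M}, \tag{4.34}$$

$$T_{\tilde M}^{1/2} = G_{\tilde M}^{1/2} + t_{\tilde M}^{1/2}, \tag{4.35}$$

$$|t_{\tilde M}^{1/2}(x, y)| \le c\,m^{-4}\exp\Big[-\frac{m}{2}\big(\mathrm{dist}(x, L) + \mathrm{dist}(y, L) + |x - y|\big)\Big], \tag{4.36}$$

$$\|\delta T_{\tilde M}\|_1 \le |L|\,N^{-\delta_1+\varepsilon_0}. \tag{4.37}$$

Proof. To show (4.34), we take the inverses of both sides of the block-diagonalization of $G_{R'} > 0$:

$$G_{R'} = U\begin{pmatrix} G_L & 0 \\ 0 & T_{\tilde M} \end{pmatrix}U^{*}, \qquad U = \begin{pmatrix} 1 & 0 \\ G_{\tilde M L}G_L^{-1} & 1 \end{pmatrix}.$$

To show the second, using $T^{-1/2} = 2\int_0^\infty(T + u^2)^{-1}du/\pi$, we have

$$T_{\tilde M}^{-1/2} = G_{\tilde M}^{-1/2} + \hat t_{\tilde M}^{-1/2},$$

$$\hat t_{\tilde M}^{-1/2} = 2\int\frac{1}{G_{\tilde M} + u^2}\,G_{\tilde M L}\,\frac{1}{F_L(u)}\,G_{L\tilde M}\,\frac{1}{G_{\tilde M} + u^2}\,\frac{du}{\pi},$$

$$F_L(u) = G_L - G_{L\tilde M}(G_{\tilde M} + u^2)^{-1}G_{\tilde M L},$$

where $|G_{\tilde M}^{-1}(x, y)| \le ce^{-m|x-y|}$, $|G_{\tilde M L}(x, y)| \le c\log(1 + m^{-1})e^{-m|x-y|}$ $(x \in \tilde M, y \in L)$ and $F_L(u)^{-1}(x, y) \le ce^{-m|x-y|}$, $x, y \in L$ uniformly in $u \ge 0$. In fact $F_L^{-1}$ is essentially equal to $(G_{R'}^{-1})_L$. Then $\hat t^{-1/2}$ has the decay property (4.36) except for the coefficient. We multiply $T_{\tilde M}$ to the expression of $T_{\tilde M}^{-1/2}$ to obtain (4.35).

To estimate $\|\delta T_{\tilde M}\|_1$, we expand $(G_L - \frac{i\sqrt N}{2\psi_L})^{-1}$ into a series of $G_L^{-1}$ which converge absolutely since $|\sqrt N/\psi_L(x)| \le N^{-\delta_1}$. Since $\|G_L^{-1}\| \le 8 + m^2$ and $\|G_{\tilde M L}\|_2^2 = \sum_{xy}G_{\tilde M L}(x, y)^2 \le c|L|\log^2(1 + m^{-1})m^{-2}$, (4.37) follows from the definition (3.41) of $\varepsilon_0$.

Let

$$\det(1 + A_{R'}) = \det(1 + A_L)\det\Big(1 - \delta T_{\tilde M}\frac{1}{T_{\tilde M} - \frac{i\sqrt N}{2\psi_{\tilde M}}}\Big)\det\Big(1 + T_{\tilde M}\frac{2i}{\sqrt N}\psi_{\tilde M}\Big). \tag{4.38}$$

Using $\det(1 + A) = \exp[\mathrm{Tr}(A + O(A^2))]$ and $|\det(A_L)| \le |\det(1 + A_L)|$, we have estimates

$$\Big|\det{}^{-N/2}\Big(1 - \delta T_{\tilde M}\frac{1}{T_{\tilde M} - \frac{i\sqrt N}{2\psi_{\tilde M}}}\Big)\Big| \le \exp[|L|N^{1-\delta_1+\varepsilon_0}], \tag{4.39}$$

$$|\det{}^{-N/2}(1 + A_L)| \le \det{}^{-N/2}(G_L)\prod_{x\in L}\Big[\frac{\sqrt N}{2|\psi(x)|}\Big]^{N/2} \le \exp\Big[-\frac{1}{2}N\sum_{x\in L}\Big(\log\frac{2|\psi(x)|}{\sqrt N} - \log(8 + m^2)\Big)\Big]. \tag{4.40}$$

Therefore we have (using 2/5 instead of 1/2):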

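Lemma 17. If $N \ge N(\beta)$ so that $\delta_1 > \varepsilon_0$, then

$$\Big|\det(1 + A_L)\det\Big(1 - \delta T_{\tilde M}\frac{1}{T_{\tilde M} - \frac{i\sqrt N}{2\psi_{\tilde M}}}\Big)\Big|^{-N/2} < \exp\Big[-\frac{2}{5}N\sum_{x\in L}\log\Big(\Big|\frac{\psi(x)}{\sqrt N}\Big|\Big)\Big]. \tag{4.41}$$

It remains to estimate the final determinant in the R.H.S. of Eq. (4.38):

$$\Big|\det{}^{-N/2}\Big(1 + \frac{2i}{\sqrt N}T^{1/2}\psi_{\tilde M}T^{1/2}\Big)\Big| = \det{}^{-N/4}\Big(1 + \frac{4}{N}[\hat T_0 + \hat T_1]\Big) = \exp[-\Psi_0 - \Psi_1], \tag{4.42}$$

where $T \equiv T_{\tilde M}$ and

$$\hat T_0 = T^{1/2}\psi_{\tilde M}T^{1/2}\chi_{\tilde M\setminus R(L_0/2)}T^{1/2}\psi_{\tilde M}T^{1/2}, \tag{4.43}$$

$$\hat T_1 = T^{1/2}\psi_{\tilde M}T^{1/2}\chi_{R(L_0/2)}T^{1/2}\psi_{\tilde M}T^{1/2}, \tag{4.44}$$

$$\Psi_0 = \frac{N}{4}\mathrm{Tr}\log\Big(1 + \frac{4}{N}\hat T_0\Big), \tag{4.45}$$

$$\Psi_1 = \frac{N}{4}\mathrm{Tr}\log\Big(1 + \frac{1}{(1 + \frac{4}{N}\hat T_0)^{1/2}}\,\frac{4}{N}\hat T_1\,\frac{1}{(1 + \frac{4}{N}\hat T_0)^{1/2}}\Big). \tag{4.46}$$

Both $\hat T_0$ and $\hat T_1$ are positive. Put

$$\Phi_0 = \mathrm{Tr}\,\hat T_0 = \sum_{x,y\in\tilde M}\psi(x)\hat{\mathcal T}_0(x, y)\psi(y) \equiv \langle\psi_{\tilde M}, \hat{\mathcal T}_0\psi_{\tilde M}\rangle, \tag{4.47}$$

$$\Phi_1 = \mathrm{Tr}\,\hat T_1 = \sum_{x,y\in\tilde M}\psi(x)\hat{\mathcal T}_1(x, y)\psi(y) \equiv \langle\psi_{\tilde M}, \hat{\mathcal T}_1\psi_{\tilde M}\rangle, \tag{4.48}$$

$$\hat{\mathcal T}_1 = (T_{\tilde M}^{1/2}\chi_{R(L_0/2)}T_{\tilde M}^{1/2}) \circ T_{\tilde M}, \tag{4.49}$$

$$\hat{\mathcal T}_0 = (T_{\tilde M}^{1/2}\chi_{\tilde M\setminus R(L_0/2)}T_{\tilde M}^{1/2}) \circ T_{\tilde M} \equiv \mathcal T + \delta\mathcal T, \tag{4.50}$$

$$\mathcal T \equiv (G_{R'}^{1/2}\chi_{R'\setminus R(L_0/2)}G_{R'}^{1/2}) \circ G_{R'}, \tag{4.51}$$

where $\tilde M = R'\setminus L$ and note that $\tilde M\setminus R(L_0/2) = R'\setminus R(L_0/2)$. Since $G_{R'}^{1/2}(x, y) \le ce^{-m|x-y|}$ and $G_{\tilde M}^{1/2}(x, y) \le ce^{-m|x-y|}$ (Appendix B), we have

$$|(G_{R'}^{1/2}\chi_{R'\setminus R(L_0/2)}G_{R'}^{1/2})(x, y)| \le N^{-1+\varepsilon_0}, \quad \text{if } x \in R,\ y \in R',$$

$$|(G_{R'}^{1/2}\chi_{R'\setminus R(L_0/2)}G_{R'}^{1/2})(x, y)| \le N^{-2+\varepsilon_0}, \quad \text{if } x \in R,\ y \in R.$$

Since $\psi_{\tilde M} = \psi_{\tilde R} + \psi_M$, we have

$$\Phi_0 = \langle\psi_{\tilde R}, \mathcal T\psi_{\tilde R}\rangle + \delta\Phi_0, \tag{4.52}$$

$$|\delta\Phi_0| \le \mathrm{const.}\,|L|L_0^2N^{1/2+2\delta}e^{-mL_0/2} \le |L|N^{-1/2+2\delta+2\varepsilon_0}. \tag{4.53}$$

The argument of the same type shows that $\|\hat T_0/N\| \le N^{-1+2\delta+\varepsilon_0}$ and $\|\hat T_1/N\| \le N^{2\delta_1+\varepsilon_0}$.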

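We remark on the following facts: Let $A$ and $B$ be any positive matrices. Then

(i) $\mathrm{Tr}(xA - \frac12x^2A^2) \le \mathrm{Tr}\log(1 + A) \le \mathrm{Tr}\,A$ for any $x \in [0, 1]$.

(ii) $A \circ B \ge c\,\mathrm{diag}(A)$ if $B \ge c1$, where $1$ is the identity.

The fact (i) is trivial and the fact (ii) follows from $A \circ B = A \circ (c1 + (B - c1)) \ge cA \circ 1$, where $A \circ 1 = \mathrm{diag}(A)$. Then we have

$$\Phi_0 \ge \Psi_0 \ge (1 - O(N^{-1}))\Phi_0 = \Phi_0 + O(N^{-1+2\varepsilon_0+2\delta}|R|),$$

$$\Phi_1 \ge \Psi_1 \ge (1 - O(N^{-2\delta_1+2\varepsilon_0}))N^{-3\delta_1}\Phi_1$$

(we used (i) with $x = N^{-3\delta_1}$ in the second). To obtain the lower bound for $\Phi_1 > 0$, we apply (ii) by setting $A = T^{1/2}\chi_{M(L_0/2)}T^{1/2}$ and $B = T$, where $T = T_{\tilde M}$. Therefore we have

$$\Phi_1 \ge \frac{1}{8 + m^2}\sum_{x\in\tilde M}\Big(\sum_{\zeta\in R(L_0/2)}T^{1/2}(x, \zeta)^2\Big)\psi(x)^2, \tag{4.54}$$

since $\|T\| \ge (8 + m^2)^{-1}$, see Lemma 16. Here again by Lemma 16, we have $\sum_{\zeta\in R(L_0/2)}T^{1/2}(x, \zeta)^2 = G_{\tilde M}(x, x) - O(N^{-1/2+\varepsilon_0}) = \beta - O(N^{-1/2+\varepsilon_0}) \gg 1$ for $x \in R(L_0/2)\setminus L(L_0/2)$. Thus we find that

$$\Phi_1 \ge \frac{\beta}{9}\sum_{x\in R(L_0/2)\setminus L(L_0/2)}\psi(x)^2. \tag{4.55}$$

Therefore we choose $\delta_1 > 0$ so that

$$\delta_2 \equiv 2\delta - 3\delta_1 > 1.2\times\varepsilon_0, \qquad \delta_1 > 1.2\times\varepsilon_0, \tag{4.56}$$

which are satisfied by $\delta = 1/12$ and $\delta_1 = \delta_2 = 1/24$. ($\delta_2 > 1.2\times\varepsilon_0$ is needed later.)

Proof of Theorem 11. Putting $\mathcal T = T_i$, $R = R_i$, $L = L_i$ and so on, we have

$$|\det{}^{-N/2}(1 + A_{R_i'})| \le \exp\Big[-\langle\psi_{\tilde R_i}, T_i\psi_{\tilde R_i}\rangle + |L_i|N^{-1/2+\delta+\varepsilon_0} + N^{-1+2\delta+2\varepsilon_0}|R_i| - c_1N^{-3\delta_1}\sum_{x\in R_i(L_0/2)\setminus L_i(L_0/2)}\psi^2(x) - \frac{2}{5}N\sum_{x\in L_i}\log\Big|\frac{\psi(x)}{\sqrt N}\Big|\Big],$$

where $c_1 \ge \beta/9$. We fix $L_i \subset R_i$ and integrate over $\psi(x)$, $x \in R_i$ noticing that $\int_s^\infty e^{-x^2}dx = \frac{e^{-s^2}}{2s}(1 + O(s^{-1}))$ and $c_1|L_i(L_0)|N^{\delta_2} \le (1/15)\delta_1|L_i|N\log N$:

$$\int|\det{}^{-N/2}(1 + A_{R_i'})|\prod_{x\in L_i}d\psi(x)\prod_{x\in R_i\setminus L_i}d\psi(x) \le e^{-\langle\psi_{\tilde R_i}, T_i\psi_{\tilde R_i}\rangle}\exp\Big[-(c_1 - o(1))|R_i|N^{\delta_2} - \frac{1}{3}\delta_1|L_i|N\log N\Big].$$

Take the sum over all $L_i \subset R_i$ and put $c_2 = c_1 - o(1) - O(e^{-N}) \ge \beta/10$.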
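The Gaussian tail asymptotics used in the last step, $\int_s^\infty e^{-x^2}dx = \frac{e^{-s^2}}{2s}(1 + O(s^{-1}))$, can be illustrated numerically through the complementary error function, since $\int_s^\infty e^{-x^2}dx = \frac{\sqrt\pi}{2}\mathrm{erfc}(s)$. The sketch below is only an illustration; the sample values of $s$ are arbitrary, and the two-sided bound tested afterwards is the standard estimate $1 - \frac{1}{2s^2} < \text{ratio} < 1$, which is sharper than the $O(s^{-1})$ stated in the text.

```python
import math


def gaussian_tail(s):
    """Integral of exp(-x^2) over [s, infinity), via the complementary error function."""
    return 0.5 * math.sqrt(math.pi) * math.erfc(s)


def leading_term(s):
    """Leading asymptotic term exp(-s^2) / (2 s)."""
    return math.exp(-s * s) / (2.0 * s)


# Ratio of the exact tail to its leading term for a few illustrative values of s.
ratios = {s: gaussian_tail(s) / leading_term(s) for s in (5.0, 10.0, 20.0)}
```

The ratios approach 1 from below as $s$ grows, so for $|\psi(x)| > N^{\delta}$ the integral over the large field is governed by the value of the integrand at the threshold.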

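4.2. Polymer expansion of the Gaussian measure.

4.2.1. Stability of small fields. For any large field region $R$, we integrate the following function:

$$\Xi_R(\psi) \equiv \prod_i D(A_{R_i'})\,\det{}^{-N/2}\Big(1 + \sum_{i\ne j}\delta A_{ij}\Big) \times \det{}_3^{-N/2}(1 + A_{\tilde K} - W_{\tilde K})\,e^{-V}\tau(\psi_K)\tau^c(\psi_R), \tag{4.57}$$

where

$$D(A_{R_i'}) = \det{}_2^{-N/2}(1 + A_{R_i'})\exp[\langle\psi_{\tilde R_i}, T_i\psi_{\tilde R_i}\rangle], \tag{4.58}$$

$$T_i = (G_{R_i'}^{1/2}\chi_{R_i'\setminus R_i(L_0/2)}G_{R_i'}^{1/2}) \circ G_{R_i'}, \tag{4.59}$$

and

$$V = \langle\psi_{\tilde K}, G^{\circ 2}\psi_{\tilde K}\rangle + \delta V_K \equiv V_0 + V_1, \tag{4.60}$$

$$V_0 = \langle\psi_{\tilde K}, G^{\circ 2}\psi_{\tilde K}\rangle + 2\langle\psi_{\tilde K}, G^{\circ 2}\psi_{R'\setminus R}\rangle + \sum_i\langle\psi_{R_i'\setminus R_i}, T_i\psi_{R_i'\setminus R_i}\rangle, \tag{4.61}$$

$$V_1 = -\frac{N}{2}\mathrm{Tr}\big(W_{\tilde K} - A_{\tilde K,R'\setminus R}A_{R',\tilde K}\big) - \frac{N}{2}\mathrm{Tr}\Big(A_{\tilde K}W_{\tilde K} - \frac{1}{2}W_{\tilde K}^2\Big), \tag{4.62}$$

$$\delta V_K = \sum_i\langle\psi_{\tilde R_i}, T_i\psi_{\tilde R_i}\rangle - \frac{N}{2}\mathrm{Tr}\Big(W_{\tilde K} + A_{\tilde K}W_{\tilde K} - \frac{1}{2}W_{\tilde K}^2\Big). \tag{4.63}$$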

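(We remark that $\tilde R_i \equiv R_i'\setminus R_i$.) $V_0$ does not depend on $\psi(x)$, $x \in R$, and $V_1$ contains $\psi(x)$, $x \in R$ only through $W_{\tilde K}$. $W_{\tilde K}$ is bounded uniformly in $\psi(x)$, $x \in R$, because of the small field region surrounding $R$. We would like to stress that $\langle\psi_{\tilde K}, G^{\circ 2}\psi_{R'\setminus R}\rangle = -\frac{N}{4}\mathrm{Tr}\,A_{\tilde K,R'\setminus R}A_{R',\tilde K}$ is extracted from $\mathrm{Tr}\,W_{\tilde K}$.

Lemma 18. The following bounds (stability bounds) hold uniformly in $|\psi(x)| > N^\delta$, $x \in R$ and $\psi(x) \in [-N^\delta, N^\delta]$, $x \in K = \Lambda\setminus R$:

$$|V_1| \le \mathrm{const.}\,N^{-1/2+2\delta+\varepsilon_0}|R|, \tag{4.64}$$

$$V_0 \ge -O(|R|N^{-1/2+2\delta+\varepsilon_0}). \tag{4.65}$$

Proof. To show the first, we note that

$$W_{\tilde K} - A_{\tilde K,R'\setminus R}A_{R',\tilde K} = -A_{\tilde K,R'\setminus R}\frac{A_{R'}}{1 + A_{R'}}A_{R',\tilde K} + A_{\tilde K,R}\frac{1}{1 + A_{R'}}A_{R',\tilde K}.$$

Then the trace of the left hand side is bounded by $\sum_iO(N^{-3/2+\delta+2\varepsilon_0}|R_i|)$. To show the second, we introduce the positive function

$$P(\psi) = \langle\psi_K, [(G^{1/2}\chi_{\Lambda\setminus R(L_0/2)}G^{1/2}) \circ G]\psi_K\rangle = \langle\psi_{\tilde K}, [(G^{1/2}\chi_{\Lambda\setminus R(L_0/2)}G^{1/2}) \circ G]\psi_{\tilde K}\rangle + 2\langle\psi_{\tilde K}, [(G^{1/2}\chi_{\Lambda\setminus R(L_0/2)}G^{1/2}) \circ G]\psi_{\tilde R}\rangle + \langle\psi_{\tilde R}, [(G^{1/2}\chi_{\Lambda\setminus R(L_0/2)}G^{1/2}) \circ G]\psi_{\tilde R}\rangle \tag{4.66}$$

which approximates $V_0$ and interpolates $\tilde K = \Lambda\setminus R'$ and $\tilde R = R'\setminus R$. Since

$$G_{R'}^{1/2}\chi_{R'\setminus R(L_0/2)}G_{R'}^{1/2} = G_{R'} - G_{R'}^{1/2}\chi_{R(L_0/2)}G_{R'}^{1/2}, \qquad G^{1/2}\chi_{\Lambda\setminus R(L_0/2)}G^{1/2} = G - G^{1/2}\chi_{R(L_0/2)}G^{1/2},$$

and $G_{R'} = G$ on $C^{R'}$, we see that $G_{R'}^{1/2}\chi_{R'\setminus R(L_0/2)}G_{R'}^{1/2}$ is equal to $G^{1/2}\chi_{\Lambda\setminus R(L_0/2)}G^{1/2}$ on $C^{R'}$ with an error of order $O(m^{-2}e^{-mL_0/4}) = O(N^{-1/2+\varepsilon_0})$. To prove this, we estimate

$$[G_{R'}^{1/2}\chi_{R'\setminus R(L_0/2)}G_{R'}^{1/2}](x, y) = \sum_{\zeta\in R'\setminus R(L_0/2)}G_{R'}^{1/2}(x, \zeta)G_{R'}^{1/2}(\zeta, y)$$

for $x, y \in R'$. Since $G_{R'}^{1/2}(x, y) \le ce^{-m|x-y|}$, if $\mathrm{dist}(x, R) > 3L_0/4$, the sum over $\zeta$ is extended to all $\zeta \in R'$ with a correction bounded by $O(m^{-2}e^{-mL_0/4})$. Thus this is equal to $G_{R'}(x, y) = G(x, y)$. If $\mathrm{dist}(x, R) < 3L_0/4$, then $\mathrm{dist}(x, (R')^c) \ge L_0/4$ and $G_{R'}^{1/2}(x, y) = G^{1/2}(x, y)$ with a correction bounded by $O(m^{-2}e^{-mL_0/4})$. Thus we have

$$\Big|\sum_i\langle\psi_{R_i'\setminus R_i}, T_i\psi_{R_i'\setminus R_i}\rangle - \langle\psi_{\tilde R}, [(G^{1/2}\chi_{\Lambda\setminus R(L_0/2)}G^{1/2}) \circ G]\psi_{\tilde R}\rangle\Big| \le \sum_i|R_i|N^{-1/2+2\delta_0+\varepsilon_0},$$

since $\mathrm{dist}(R_i, R_j) \ge L$. The same relation holds between the first two terms in $V_0$ and $P(\psi)$. Since $P(\psi) \ge 0$, this implies $V_0 \ge -O(|R|N^{-1/2+2\delta_0+\varepsilon_0})$ uniformly in $\psi(x) \in [-N^\delta, N^\delta]$, $x \in \Lambda\setminus R$.

4.2.2. Proof of Theorem 12 (small field contribution). Let $d\mu_{\tilde K}(\psi)$ be the Gaussian measure of mean 0 and covariance $\frac12[\chi_{\tilde K}G^{\circ 2}\chi_{\tilde K}]^{-1}$: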

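$$d\mu_{\tilde K}(\psi) = \det{}^{1/2}(C_{\tilde K}^{-1})\exp[-\langle\psi, \chi_{\tilde K}C^{-1}\chi_{\tilde K}\psi\rangle]\prod_{x\in\tilde K}\frac{d\psi_x}{\sqrt\pi}, \tag{4.67}$$

where $C^{-1} = G^{\circ 2}$, $C_{\tilde K}^{-1} = \chi_{\tilde K}C^{-1}\chi_{\tilde K}$. We define the small field contribution $Z_{\tilde K}$ by

$$Z_{\tilde K} = \det{}^{-1/2}[C_{\tilde K}^{-1}]\int d\mu_{\tilde K}\,\eta_K(\psi), \tag{4.68}$$

$$\eta_K(\psi) = \det{}_3^{-N/2}(1 + A_{\tilde K} - W_{\tilde K})\exp[-\delta V_K]\prod_{x\in K}\tau(\psi(x)), \tag{4.69}$$

where $\delta V_K$ is defined by Eq. (4.63). We again use the cluster expansion of the Gaussian measure. But this time, the covariance $\frac12[\chi_{\tilde K}G^{\circ 2}\chi_{\tilde K}]^{-1}$ depends on locations of $R_i'$. We introduce interpolation parameters $s_i \in [0, 1]$ into (4.67) as follows [16, 5]:

$$C = C_\Lambda \to C(s_1) \equiv (1 - s_1)(C_{\Lambda\setminus X_1} + C_{X_1}) + s_1C_\Lambda.$$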

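The integral is decoupled into $X_1$ and $\Lambda\setminus X_1$ if $s_1 = 0$. Integration by parts yields

$$\frac{\partial}{\partial s_1}\int d\mu(s_1)e^{-V} = \int d\mu(s_1)\sum_{x,y}\frac{1}{4}(\partial_{s_1}A)_{xy}\frac{\partial^2}{\partial\psi(x)\partial\psi(y)}e^{-V}, \tag{4.70}$$

where $A \equiv [\chi_{\tilde K}H\chi_{\tilde K}]^{-1}$ and $H = C^{-1}(s)$. Then we have (see Appendix C)

$$\partial_{s_1}A = A(C_{\Lambda\setminus X_1,X_1} + C_{X_1,\Lambda\setminus X_1})A = \sum_{X_2\subset\Lambda\setminus X_1}[\delta F(X_1, X_2) + \delta F(X_2, X_1)]$$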

by decomposing $A \equiv [\chi_{\tilde K}H\chi_{\tilde K}]^{-1}\chi_{\tilde K}H$ into polymers, $A = I_{\tilde K} + \sum_X\delta C(X)$, where $I_{\tilde K}$ is the identity operator on $C^{\tilde K}$ and $\delta C(X)$ is the Green's function represented by random walks passing all squares $\Delta$ only in $X$, $X \cap R' \ne \emptyset$ and then exhibits tree decay over $\Delta \subset X$. The next theorem is an extension of Theorem 3. We would like to remind the reader that $\delta F(X_i, X_j)(x, y) = O(e^{-mL})$ unless $x \in X_i$ and $y \in X_j$. See Appendix C for the construction of $\delta F(X_i, X_j)$. The sum over partitions $Y = \cup X_i$ is harmless thanks to Lemma 10.

Theorem 19. Let $\tilde K = \cup Y_i$ be partitions of $\tilde K$ into paved sets $\{Y_i\}$. Then

$$\int\eta_K(\psi)\,d\mu_{\tilde K} = \Big(\sum_n\frac{1}{n!}\sum_{\cup_1^nY_i=\tilde K}\prod_iS(Y_i)\Big)\eta_K(\psi), \tag{4.71}$$

$$S(Y) = \sum_p\sum_{\cup^pX_i=Y}\sum_{T'}\int_0^1ds_1\cdots ds_{p-1}\,M_{T'}(s)\int d\mu_Y(\{s\}, \psi) \times \prod_{k=1}^{p-1}\Big[\sum_{x_k}\sum_{y_{k+1}}\frac{1}{2}\,\delta F(X_{j_{a(k)}}, X_{j_{k+1}})(x_k, y_{k+1})\frac{\partial^2}{\partial\psi(x_k)\partial\psi(y_{k+1})}\Big], \tag{4.72}$$

where $Y = \cup_1^pX_i$ are partitions of $Y$ by unions of $\Delta_j \subset Y$ and $D_k \subset Y$. Moreover both $x_k$ and $y_{k+1}$ are $\in \cup_1^{k+1}X_{j_i}$. If $i < j$, then

$$|\delta F(X_i, X_j)(x, y)| \le \min_\ell\exp[-m_1L(\Delta_\ell \cup (X_j \wedge D), x, y)], \quad (\Delta_\ell \subset X_i), \tag{4.73}$$

where $X \wedge D$ means that $D_\ell \subset X$ are regarded as one set $D_\ell$, and $L(X, x, y)$ means the shortest length of walks from $x$ to $y$ passing all centers of $\Delta_i \subset X$, $x, y \notin \Delta_i$.

Here and hereafter, we use the following notational convention for paved sets $Y$:

$$R_Y = R \cap Y, \qquad R_Y' = R' \cap Y, \qquad \tilde Y = Y\setminus R_Y'. \tag{4.74}$$

By Lemma 14, we expand $(1 + A_{R'})^{-1}$ and obtain polymer expansions of $W_{\tilde K}$ and $\delta V$.

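Lemma 20. The following cluster expansion holds:

$$W_{\tilde K} = A_{\tilde K,R'}\frac{1}{1 + A_{R'}}A_{R',\tilde K} = \sum_iW_{Y_i} + \sum_{X\ne Y_i}\delta W_X, \tag{4.75}$$

$$W_Y = A_{Y\setminus R_Y',R_Y'}\frac{1}{1 + A_{R_Y'}}A_{R_Y',Y\setminus R_Y'}, \tag{4.76}$$

$$\delta W(X) = \sum_{Y_i\cup X'\cup Y_\ell=X}\ \sum_{Y_j\cup Y_k\subset X'}A_{Y_i\setminus R_i',R_{Y_j}'}\frac{1}{1 + A_{R_{Y_j}'}}F(X')\frac{1}{1 + A_{R_{Y_k}'}}A_{R_{Y_k}',Y_\ell\setminus R_\ell'}, \tag{4.77}$$

where $\{Y_i\}_1^p$ are paved sets in Eq. (4.71), $X$ is a paved set consisting of $Y_i$ more than or equal to 2. ($R'$ must be subtracted.) $F(X)$ are the non-diagonal terms coming from the random walk expansion of $(1 + A_{R'})^{-1}$, $R' = \cup_iR_{Y_i}'$. They satisfy the bounds

$$\|F(X)\|_1 \le \Big(\sum|R_{Y_i}'|\Big)\exp\Big[-m_2\min_\gamma\sum_{(ij)\in\gamma}\mathrm{dist}(R_{Y_i}', R_{Y_j}')\Big],$$

$$\|\delta W(X)\|_1 \le \Big(\sum|R_{Y_i}'|\Big)\exp\Big[-m_2\min_\gamma\sum_{(ij)\in\gamma}\mathrm{dist}(R_{Y_i}', R_{Y_j}')\Big], \tag{4.78}$$

where $\gamma$ are tree graphs over $Y_i \subset X$.

Lemma 21. The following expansion holds for $\delta V_K$ defined in (4.63):

$$\delta V_K = \sum_i\delta V_{Y_i} + \sum_X\delta\tilde V(X), \tag{4.79}$$

$$\delta V_Y = \sum_{j:D_j\subset Y}\langle\psi_{\tilde R_j}, T_j\psi_{\tilde R_j}\rangle - \frac{N}{2}\mathrm{Tr}\Big(W_Y + A_YW_Y - \frac{1}{2}W_Y^2\Big), \tag{4.80}$$

where $Y_i$ are paved sets made by the expansion of the Gaussian measure, $X$ is a paved set consisting of $Y_i$. Moreover

$$\|\delta\tilde V(X)\|_1 \le \Big(\sum|R_{Y_i}'|\Big)\exp\Big[-m_2\min_\gamma\sum_{(ij)\in\gamma}\mathrm{dist}(R_{Y_i}', R_{Y_j}')\Big], \tag{4.81}$$

where $\gamma$ are tree graphs over $Y_i \subset X$.

For each partition $\cup Y_i$ of $\tilde K$, we introduce interpolation parameters $s_{ij}$ connecting $Y_i$ and $Y_j$ in the determinant:

$$\frac{2i}{\sqrt N}\chi_{\tilde K}G\psi\chi_{\tilde K} \to \sum_i\frac{2i}{\sqrt N}\chi_{Y_i}G\psi\chi_{Y_i} + \sum_{i<j}s_{ij}\frac{2i}{\sqrt N}(\chi_{Y_i}G\psi\chi_{Y_j} + \chi_{Y_j}G\psi\chi_{Y_i}) \equiv \sum_iA_{Y_i} + \sum_{i<j}s_{ij}B_{Y_i,Y_j}, \tag{4.82}$$

where $Y_i$ should be regarded as $\tilde Y_i = Y_i - R'$ if $R_{Y_i} \ne \emptyset$. We also introduce interpolation parameters $\{t_X\}$ and $\{\tilde t_X\}$ into the decompositions (4.75) and (4.79) of $W_{\tilde K}$ and $\delta V_K$: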

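$$W_{\tilde K} \to \sum_iW_{Y_i} + \sum_{X=\cup Y_i}t_X\,\delta W(X), \tag{4.83}$$

$$\delta V_K \to \sum_i\delta V_{Y_i} + \sum_{X=\cup Y_i}\tilde t_X\,\delta\tilde V(X), \tag{4.84}$$

where $X = \cup Y_i$ are paved sets consisting of more than or equal to two $Y_i$'s. Thus we have

$$\sum_p\frac{1}{p!}\sum_{\cup Y_i=\tilde K}\prod S(Y_i)\,\eta_K(\psi) = \sum_p\frac{1}{p!}\sum_{\cup X_i=\tilde K}\prod\tilde\rho_{X_i}, \tag{4.85}$$

$$\tilde\rho_X = \sum_p\frac{1}{p!}\sum_{\cup Y_i=X}\prod S(Y_i)\Big[\sum_{\gamma\in\tilde T(\{Y_i\})}\int ds_\gamma\,\partial_\gamma\Big]\eta_X(s), \tag{4.86}$$

where $X = \cup Y_i$ are partitions of $X$ into decoupled paved sets $Y_i$, $\tilde T(\{Y_i\})$ is the set of connected graphs over $\{Y_i\}$ and $\eta_X$ is the $\eta$ function restricted to the paved set $X = \cup Y_i$.

Proof of Theorem 12. (Step 1). We consider the action of the differential operators in $S(Y)$ on $\eta_Y$. By integration by parts, we start with

$$\int d\mu_Y(\{s\}, \psi)\Big[\prod_{k=1}^{p-1}\delta F(X_{j_{a(k)}}, X_{j_{k+1}})(x_k, y_{k+1})\frac{\partial^2}{\partial\psi(x_k)\partial\psi(y_{k+1})}\Big]\eta_Y = \prod_{k=1}^{p-1}\delta F(X_{j_{a(k)}}, X_{j_{k+1}})(x_k, y_{k+1})\int d\mu_Y(\{s\}, \psi)\,e^{-\delta V_0(Y)}\,\Phi\Psi, \tag{4.87}$$

where putting $H = \langle\psi_{\tilde Y}, C_Y^{-1}(s)\psi_{\tilde Y}\rangle$, $\tilde Y = Y\setminus R_Y'$ and $\tilde R_Y = R_Y'\setminus R_Y$, we have set

$$\Phi = e^H\Big[\prod_i(-1)^{d_i-1}\prod_{j=1}^{d_i-1}\frac{\partial}{\partial\psi(x_{i,j})}\Big]e^{-H}, \tag{4.88}$$

$$\Psi = e^{\delta V_0(Y)}\prod_{i=1}^{p-1}\frac{\partial}{\partial\psi(x_{i,d_i})}\eta_Y, \tag{4.89}$$

$$\eta_Y = \det{}_3^{-N/2}(1 + A_Y - W_Y)\exp[-\delta V_0(Y) - V_1(Y)]\,\tau(\psi_Y), \tag{4.90}$$

$$\delta V_0(Y) = 2\langle\psi_{\tilde Y}, C^{-1}\psi_{\tilde R_Y}\rangle + \sum_{i:R_i\subset Y}\langle\psi_{\tilde R_i}, T_i\psi_{\tilde R_i}\rangle, \tag{4.91}$$

$$V_1(Y) = -\frac{N}{2}\mathrm{Tr}\Big(W_Y - A_{\tilde Y,R_Y'\setminus R_Y}A_{R_Y'\setminus R_Y,\tilde Y} + A_YW_Y - \frac{1}{2}W_Y^2\Big), \tag{4.92}$$

and $d_i$ is the number of $\{x_k, y_{k+1}\}_1^{p-1}$ such that $x_k \in X_{j_i}$ or $y_{k+1} \in X_{j_i}$. If $x_k \in X_{j_{a(k)}}$ and $y_{k+1} \in X_{j_{k+1}}$, $d_i$ is the incidence number of the vertex $X_{j_i}$. By Theorem 19

$$\prod_k\big|\delta F(X_{j_{a(k)}}, X_{j_{k+1}})(x, y)\big| \le \exp\Big[-\frac{4}{5}\sum_km_1L(\Delta_{j_{a(k)}} \cup X_{j_{k+1}}, x_k, y_{k+1}) - \frac{m_1}{10}\sum_iL\Big[\frac{d_i}{9}\Big]^{3/2}\Big]$$

($\wedge D$ is omitted for simplicity). Then (see Appendix C)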

(i) we can extract tree decay factors $\prod\exp[-cm_1L(X_k')]$, $X_k' = \Delta_{j_{a(k)}} \cup X_{j_{k+1}}$,

(ii) if $X_{j_{k+1}}$ consists of more than or equal to two $\Delta_k$ or $D_\ell$, $\delta F(X_{j_{a(k)}}, X_{j_{k+1}})$ contains $\exp[-m\,\mathrm{dist}(X_{j_{a(k)}}, R' \cap X_{j_{k+1}})]$,

(iii) $x_k \notin X_{j_{a(k)}}$ takes place only if $\delta F$ consists of walks passing through $\exists\,R_\ell' \subset X_{j_{a(k)}} \cup X_{j_{k+1}}$. So $|\delta F|$ is bounded by $e^{-(L+L_0)m}$.

The fact (i) means that it is enough to show that the derivatives and the summations over $\{x_k, y_{k+1}\}$ do not yield very large terms.

(Step 2.) We show the stability of $e^{-\delta V_0(Y)}d\mu_Y$. It suffices to consider a paved set $Y$ such that $R_Y \ne \emptyset$. Then $V_0(Y) \equiv \langle\psi_{\tilde Y}, C_Y^{-1}(s)\psi_{\tilde Y}\rangle + \delta V_0(Y)$ is given by

$$\langle(\psi_{\tilde Y} + D\psi_{\tilde R_Y}), C_Y^{-1}(s)(\psi_{\tilde Y} + D\psi_{\tilde R_Y})\rangle + \langle\psi_{\tilde R_Y}, E\psi_{\tilde R_Y}\rangle + O(|R_Y|e^{-mL_0}),$$

where $D = C_Y(s)(C^{-1})_{\tilde Y,\tilde R_Y}$, $E = T_{\tilde Y} - (C^{-1})_{\tilde R_Y,\tilde Y}C_Y(s)(C^{-1})_{\tilde Y,\tilde R_Y}$ and $T_{\tilde Y} = (G_Y^{1/2}\chi_{Y\setminus R_Y(L_0/2)}G_Y^{1/2}) \circ G$. Then $E \ge -\mathrm{const.}\,e^{-mL_0/2}$ on $R_Y'$ by Lemma 18. (Accurately speaking, $d\mu_Y$ and $C_Y^{-1}$ should be written $d\mu_{\tilde Y}$ and $C_{\tilde Y}^{-1}$.) Let us define

$$d\tilde\mu_Y \equiv \det{}^{1/2}[C_Y^{-1}(s)]\exp[-V_0(Y)]\,\tau(\psi_{R_Y'\setminus R_Y})\prod_{x\in Y\setminus R_Y}\frac{d\psi(x)}{\sqrt\pi}. \tag{4.93}$$

Then $d\tilde\mu_Y$ is Gaussian with respect to $\psi_{\tilde Y}$ if $\psi_{\tilde R_Y}$ are fixed. Since $|\psi(x)| \le N^\delta$ for $x \in \tilde R_Y$, we have $\int d\tilde\mu_Y \le \exp[\pi L_0^2|R|\delta\log N]$. Thus we can regard $d\tilde\mu_Y$ as the probability measure with an additional factor bounded by $\exp[\pi L_0^2|R|\delta\log N]$.

(Step 3.) The application of $\partial/\partial\psi(\xi)$ on $H$ yields $\sum_\zeta C_Y^{-1}(\xi, \zeta)\psi(\zeta)$. Then using Schwarz's inequality, we find it enough to estimate

$$\sum_{\{\zeta_i\}}\prod_{i=1}^{p-2}|C^{-1}(\xi_i, \zeta_i)|\Big(\int d\mu_Y\,e^{-\delta V_0(Y)}\prod\psi(\zeta_i)^2\Big)^{1/2}\Big(\int d\mu_Y\,e^{-\delta V_0(Y)}|\Psi|^2\Big)^{1/2},$$

where $\{\xi_i;\ i = 1, \cdots, p - 2\}$ are $\{x_{i,1}, \cdots, x_{i,d_i-1}\}$, see Lemma 8.

Consider $\Psi$. As for the derivatives of $\eta_Y$, we first see that the derivatives of $W_Y$ with respect to $\psi(y)$, $y \in \tilde Y$ yield the factor $N^{-1+3\delta}$ thanks to the small fields enclosing the large fields. Thus derivatives of $\det_3^{-N/2}(\cdots)$ yield factors bounded by $N^{-1+3\delta}$. We estimate the derivatives of $\delta V_Y = \delta V_0(Y) + V_1(Y)$. The derivatives of $V_1$ yield factors bounded by $N^{-1+2\delta}$. The derivatives of $\delta V_0(Y)$ yield $2\sum_{\zeta\in R_Y'\setminus R_Y}C^{-1}(y, \zeta)\psi(\zeta)$, $|\psi(\zeta)| < N^\delta$. But they come with $\delta F(X_{j_{a(k)}}, X_{j_{k+1}})(x_k, y_{k+1})$ ($y = y_{k+1}$ or $y = x_k$). Then $L(\Delta_{j_{a(k)}} \cup X_{j_{k+1}}, x_k, y_{k+1}) + |y_{k+1} - \zeta| > L$. Thus we can bound $|\Psi|$ by $N^{-n_Y\delta_0}\|e^{\delta V_0}\eta_Y\|$ uniformly in $\psi_Y$ by a fraction of $\prod\delta F$. Differentiations of $\tau$ can be treated as before.

Let us consider $\int\prod\psi(\zeta_i)^2\,d\tilde\mu_Y$. We shift $\psi(x)$, $x \in \tilde Y$, by $-(D\psi_{\tilde R_Y})(x)$ which is bounded by $e^{-m\,\mathrm{dist}(x,R_Y')}$. Then $d\tilde\mu_Y$ decomposes into $d\mu_Y$ and the integration over $\psi(x)$, $x \in \tilde R_Y$. Then we may regard $d\tilde\mu_Y$ as $d\mu_Y$. Therefore the proof of Lemma 8 can be applied and we obtain the same results by replacing $\mathrm{dist}(\Delta_i, \Delta_j)$ by $L(\Delta_i \cup X_j)$ and so on. In fact we define

$$\sum_{\xi,\xi'}\delta F(X_{j_{a(k)}}, X_{j_{k+1}})(\xi, \xi')\,|C^{-1}(\xi, x_k)|\,|C^{-1}(\xi', y_{k+1})| \equiv m^{-4}\,\delta f(X_{j_{a(k)}}, X_{j_{k+1}})(x_k, y_{k+1}).$$

Then $\delta f(X_{j_{a(k)}}, X_{j_{k+1}})(x_k, y_{k+1})$ again has the property (4.73) except for a multiplicative constant $\log^4(1 + m^{-1})$ which comes from $C^{-1} = G^{\circ 2}$. Then we repeat the arguments in Lemma 8 by replacing $\Delta_i$ by $X_i$ and $\mathrm{dist}(\Delta, x)$ by $L(X, x, y)$ and so on. (We remark that the factor $\prod|X_{j_{k+1}}|$ is compensated by a fraction of $\exp[-m_1\sum L(X_{j_{k+1}})]$.)

(Step 4.) Finally take the sum over partitions $Y = \cup X_i$. Since we already have tree decay factors of $X_k'$, the proofs of Lemma 8 and Theorem 5 apply to the rest.

To expand $\det^{1/2}(C_{\tilde K})/Z_\infty$, $C_{\tilde K} = [\chi_{\tilde K}G^{\circ 2}\chi_{\tilde K}]^{-1}$, we put $H = G^{\circ 2}$ and observe that

$$\det\begin{pmatrix} H_0 & H_{01} \\ H_{10} & H_1 \end{pmatrix} = \det(H_0)\det(H_1)\det(1 - H_1^{-1/2}H_{10}H_0^{-1}H_{01}H_1^{-1/2}) = \det(H_0)\prod_i\det(H_{R_i'})\det\Big(1 + \sum\delta H_{ij}\Big)\det(1 - H_1^{-1/2}H_{10}H_0^{-1}H_{01}H_1^{-1/2}),$$

where $H_0 = \chi_{\tilde K}H\chi_{\tilde K}$, $H_1 = \chi_{R'}H\chi_{R'}$ and

$$\delta H_{ij} = (H_{R_i'})^{-1}\chi_{R_i'}H\chi_{R_j'}, \tag{4.94}$$

and we have used the notational convention $H_X \equiv \chi_XH\chi_X$ and $H_{XY} \equiv \chi_XH\chi_Y$. Thus $H_1^{-1/2}H_{10}H_0^{-1}H_{01}H_1^{-1/2}$ is the matrix of size $|R'| \times |R'|$. $H_i^{-1}(x, y)$ and $(\chi_XH_i\chi_X)^{-1}(x, y)$ decay exponentially fast (see Appendix C). We expand $H_0^{-1}$ and $H_1^{-1/2}$ by introducing interpolation parameters like $[(1 - s)(H_{X\setminus\Delta} + H_\Delta) + sH_X]^{-1}$ and repeating the method used in the proof of Lemma 14. (We use $H^{-1/2} = 2\int(H + u^2)^{-1}du/\pi$ to expand $H_1^{-1/2}$.)

Lemma 22. The matrix $H_1^{-1/2}H_{10}H_0^{-1}H_{01}H_1^{-1/2}$ has the following expansion:

$$H_1^{-1/2}H_{10}H_0^{-1}H_{01}H_1^{-1/2} = \sum_i\delta H(Y_i) + \sum_{X=\cup Y_i}\delta H(X),$$

where $X$ are paved sets consisting of more than or equal to two $Y_i$'s and include at least one $R_i' \subset D_i$. The functions $\delta H(Y_i)$ and $\delta H(X)$ depend on variables located on $Y_i$ and $X$ only. The diagonal terms $\delta H(Y)$ are given by

$$\delta H(Y) = H_{R_Y'}^{-1/2}H_{R_Y',Y\setminus R_Y'}H_{Y\setminus R_Y'}^{-1}H_{Y\setminus R_Y',R_Y'}H_{R_Y'}^{-1/2}, \qquad R_Y' = R' \cap Y.$$

The non-diagonal terms $\delta H(X)$ $(X = \cup Y_i)$ satisfy the bound $|\delta H(X)(x, y)| \le \exp[-m_1L(X, x, y)]$.

The proof of Lemma 16 (1) means that

$$0 < O(1)m^4 \le 1 - H_1^{-1/2}H_{10}H_0^{-1}H_{01}H_1^{-1/2} \le 1.$$

Then the diagonal terms satisfy the bounds

$$\exp[-\mathrm{const.}\,L_0^2|R|\log m^{-1}] \le \det{}^{1/2}(1 - \delta H(Y)) \le 1. \tag{4.95}$$

Since $L_0 \sim 2m^{-1}\log N \sim \beta e^{2\pi\beta}$, if the condition (4.56) is satisfied, the factors $\exp[\pi\delta L_0^2|R|\log N]$ from $d\tilde\mu_Y$ and $\exp[O(1)L_0^2|R|\log m^{-1}]$ from $\det^{1/2}(H_{R_i'})$ are all compensated by $\exp[-(\beta/12)N^{\delta_2}|R|]$ given in Theorem 11 (the large field stability). In fact for $\delta = 1/12$ and $\delta_1 = 1/24$, we have $\delta_2 = 1/24$. If $N \sim e^{400\pi\beta}$, we have $N^{\delta_2} > e^{16\pi\beta} > m^{-8} \sim L^8$.
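The bookkeeping of exponents in the last paragraph is elementary but easy to get wrong, so a short exact computation may help. The sketch below only reproduces the arithmetic stated in the text ($\delta = 1/12$, $\delta_1 = 1/24$, $N \sim e^{400\pi\beta}$); the variable names are ours, and the bound on $\varepsilon_0$ is simply what condition (4.56) requires for these values, since the definition (3.41) of $\varepsilon_0$ lies outside this section.

```python
from fractions import Fraction

delta = Fraction(1, 12)
delta1 = Fraction(1, 24)

# delta_2 = 2 delta - 3 delta_1, as in (4.56)
delta2 = 2 * delta - 3 * delta1

# (4.56) asks for delta_1 > 1.2 eps_0 and delta_2 > 1.2 eps_0,
# i.e. eps_0 must stay below min(delta_1, delta_2) / 1.2.
eps0_max = min(delta1, delta2) / Fraction(6, 5)

# With N = exp(400 pi beta), N^{delta_2} = exp(exponent * pi * beta).
exponent = 400 * delta2
```

Here `delta2` equals $1/24$, and `exponent` equals $50/3 > 16$, which is the inequality $N^{\delta_2} > e^{16\pi\beta}$ quoted above.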

5. Analyticity of the Free Energy

5.1. Proof of Theorem 1 (former half). To carry out the integration over $\{\psi(x);\ x \in \Lambda\}$, we introduce a series of interpolation parameters $\{s_i, s_{ij}, t_X, \tilde t_X, u_Y, v_{ij}, \tilde v_{ij}\}$ to decouple $R_i' \subset D_i$, $R_j' \subset D_j$, $i \ne j$, $Y_k \subset K'$ $(K' \equiv \Lambda\setminus D)$ and $D_i\setminus R_i' \subset D$. From now on, let $Y_i$ stand for either $Y_i \subset K'$ or $D_i\setminus R_i' \subset D$ or for their unions. We summarize the interpolation parameters:

(1) Given the configuration of $R$, we decompose $\tilde K = \Lambda - R'$ into squares $\Delta_i \subset K'$ and paved sets $D_i$ with $R_i'$ subtracted. Introduce interpolation parameters $s_i$ to the measure $d\mu_{\tilde K}(\psi)$,

$$Z(R) = \frac{\det^{1/2}(C_{\tilde K})}{Z_\infty}\Big(\sum_p\frac{1}{p!}\sum_{\cup Y_i=\tilde K}\prod_iS_{Y_i}\Big)\eta_\Lambda.$$

(2) To each decomposition $\tilde K = \cup Y_i$, introduce real interpolation parameters $s_{ij} \in [0, 1]$ for $B_{Y_iY_j}$ like Eq. (4.82).

(3) Introduce $t_X \in [0, 1]$ and $\tilde t_X \in [0, 1]$ following Eq. (4.83) and Eq. (4.84).

(4) Introduce $u_Y \in [0, 1]$ in such a way that

$$H_1^{-1/2}H_{10}H_0^{-1}H_{01}H_1^{-1/2} \to \sum_i\delta H(Y_i) + \sum_{X=\cup Y_i}u_X\,\delta H(X). \tag{5.1}$$

The diagonal terms $\delta H(Y_i)$ such that $H_0 = H_{Y_i\setminus R_{Y_i}'}$ and $H_1 = H_{R_{Y_i}'}$ are untouched and coupled with $D(A_{R_j'})$, $D_j \subset Y_i$.

(5) Redefine $A_{ij}$ and $H_{ij}$ by

$$\delta A_{ij} \equiv \sum_{k:R_k\subset Y_i}\ \sum_{\ell:R_\ell\subset Y_j}A_{R_k',R_\ell'}\frac{1}{1 + A_{R_\ell'}}, \qquad \delta H_{ij} \equiv \frac{1}{H_{R_{Y_i}'}}H_{R_{Y_i}',R_{Y_j}'} \tag{5.2}$$

and introduce $v_{ij} \in [0, 1]$ and $\tilde v_{ij} \in [0, 1]$ in such a way that

$$\sum_{i,j}\delta A_{ij} \to \sum_{ij}v_{ij}\,\delta A_{ij}, \qquad \sum_{i,j}\delta H_{ij} \to \sum_{ij}\tilde v_{ij}\,\delta H_{ij}. \tag{5.3}$$

Thus both of $\|\delta A_{ij}\|$ and $\|\delta H_{ij}\|$ are bounded by $m^{-2}\exp[-m\,\mathrm{dist}(R_{Y_i}', R_{Y_j}')]$, and both of $\|\delta A_{ij}\|_1$ and $\|\delta H_{ij}\|_1$ are bounded by

$$\min\{|R_{Y_i}'|, |R_{Y_j}'|\}\exp\Big[-\frac{4m}{5}\mathrm{dist}(R_{Y_i}', R_{Y_j}')\Big].$$

Substituting these into the integrand $\Xi_R(\psi)$ defined by (4.57), we have our final expression of $Z_\Lambda = Z_\infty\sum_RZ(R)$, where

$$Z(R) = \int\Big(\sum_p\frac{1}{p!}\sum_{\cup Y_i=\tilde K}\prod S_{Y_i}\Big)\cdot\Xi(\{Y_i\}, R; s, \cdots, \tilde v)\Big|_{s=\cdots=\tilde v=1}\prod_{\zeta\in R'}\frac{d\psi(\zeta)}{\sqrt\pi}, \tag{5.4}$$

and $\Xi(\{Y_i\}, R; s, \cdots, \tilde v)$ is the $\Xi$-function with the interpolation parameters introduced through $Y_i$ and explicitly given by

$$\det{}_3^{-N/2}\Big[1 + \sum_iA_i - \sum_{Y_i\cap R\ne\emptyset}W(Y_i) + \sum_{i\ne j}s_{ij}B_{ij} - \sum_Xt_X\,\delta W_{\tilde K}(X)\Big] \times \det{}^{-N/2}\Big[1 + \sum_{ij}v_{ij}\,\delta A_{ij}\Big]\det{}^{1/2}\Big[1 - \sum_i\delta H(Y_i) - \sum_Xu_X\,\delta H(X)\Big] \times \det{}^{1/2}\Big[1 + \sum_{ij}\tilde v_{ij}\,\delta H_{ij}\Big]\prod_iD(R_{Y_i}') \times \exp\Big[-\sum\delta V_{Y_i} - \sum\tilde t_X\,\delta\tilde V(X)\Big]\tau(\psi_K)\tau^c(\psi_R). \tag{5.5}$$

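Here $D(R_Y')$ is the contribution from $R_Y' = R' \cap Y$ with the small fields subtracted:

$$D(R_Y') \equiv \prod_{i:R_i\subset Y}\det{}_2^{-N/2}(1 + A_{R_i'})\,\det{}^{1/2}(\chi_{R_i'}H\chi_{R_i'}) \times \exp\big[\langle\psi_{R_i'\setminus R_i}, T_i\psi_{R_i'\setminus R_i}\rangle\big], \tag{5.6}$$

where by Theorem 11

$$\int\sup_{\psi_{R_Y'\setminus R_Y}}|D(R_Y')|\,\tau^c(\psi_{R_Y})\prod_{x\in R_Y}\frac{d\psi(x)}{\sqrt\pi} \le \exp\Big[-\frac{\beta}{11}|R_Y|N^{\delta_2}\Big]. \tag{5.7}$$

If all parameters are set to 0, we have the completely decoupled result:

$$Z(R) \to \sum\frac{1}{p!}\sum_{\cup Y_i=\Lambda}\prod_i\eta(Y_i; R_{Y_i}),$$

$$\eta(Y; R_Y) = \int S_Y\,\Xi(Y; R_Y)\,\tau(\psi_{R_Y'\setminus R_Y})\tau^c(\psi_{R_Y})\prod_{\zeta\in R_Y'}\frac{d\psi(\zeta)}{\sqrt\pi},$$

$$\Xi(Y; R_Y) = \det{}_3^{-N/2}[1 + A_Y - W(Y)]\exp[-\delta V_Y] \times \det{}^{1/2}[1 - \delta H(Y)]\,D(R_Y')\,\tau(\psi_{Y\setminus R_Y'}).$$

Here and hereafter, $\eta$ means integrated activities which may contain contributions from $\psi_R$. If $Y = \Delta$, $R_Y = \emptyset$ and $S_\Delta = d\mu_\Delta(\psi)$ (with $|\psi(x)| < N^\delta$, $x \in \Delta$), and $\eta(\Delta) = \rho_\Delta$. If $Y = D_i$, then $S_Y = d\mu_{Y\setminus R_Y'}(\psi)$ and by Theorem 11, we have

$$|\eta(D_i; R_i)| \le \exp\Big[-\frac{\beta}{12}|R_i|N^{\delta_2} + N^{-\hat\delta}|D_i| + \pi L_0^2|R_i|\delta\log N\Big], \tag{5.8}$$

where $|D_i| < 9|R_i|L^2$ and $L_0 < L \sim m^{-1}\log N$. Then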

Lemma 23. Take the sum over all $R \subset D_i$ which are consistent with $D_i$. Then

$$\sum_R\eta(D_i; R) \le \exp[-|D_i|N^{\delta_3}], \tag{5.9}$$

$$\delta_3 = \delta_2 - O(N^{-\delta_2}). \tag{5.10}$$

Proof. Take a square $\Delta \subset D_i$ of size $L \times L$ such that $R \cap \Delta \ne \emptyset$, and take the sum over $R \cap \Delta$ ($|R \cap \Delta| = 1, \cdots, L^2$). Since $L^2 \sim 400m^{-2}\log^2N \sim N^{\varepsilon_0}$, we have estimates

$$\sum_{R\subset\Delta}\exp\Big[-\frac{\beta}{12}|R|N^{\delta_2}\Big] \le \Big(1 + \exp\Big[-\frac{\beta}{12}N^{\delta_2}\Big]\Big)^{L^2} - 1 \le \exp\Big[L^2\exp\Big[-\frac{\beta}{12}N^{\delta_2}\Big]\Big] - 1 \le \exp[-|\Delta|N^{\delta_3}].$$

Since $D_i$ is the connected set of $\{\Delta \subset D_i\}$, the conclusion follows [9].

We iteratively use the identity $f(1) = \int_0^1dw\,\partial_wf(w) + f(0)$ with respect to all interpolation parameters except for $s_i$ already used to expand the Gaussian measure. We thus obtain $Z(R) = \sum_{U(R)}\prod_{X\in U(R)}\eta(X; R)$, where $U(R)$ are partitions of $\Lambda$ into paved sets which consist of $\Delta_i \subset K'$ and $D_i \subset D$, and $\eta(X; R)$ is the quantity given by

$$\int\sum_p\frac{1}{p!}\sum_{\cup Y_i=X}\prod_{i=1}^pS_{Y_i}\,I(\{Y_i\})\,\Xi(X, \{Y_i\}, R_X)\,\tau(\psi_{X\cap\tilde K})\tau^c(\psi_{R_X}) \times \prod_{x\in X\cap R'}\frac{d\psi(x)}{\sqrt\pi}. \tag{5.11}$$

Namely if $U = \{X_1, \cdots, X_n\}$ is a partition, $X_i$ are unions of $\Delta_i$ and $D_j$ and $\cup X_i = \Lambda$. Moreover $\Xi(X, \{Y_i\}, R_X)$ is the restriction of $\Xi(\Lambda, R)$ to the region $X$ equipped with $R_X = R \cap X$, together with the interpolation parameters following the decomposition $X = \cup Y_i$. $I(\{Y_i\})$ is the interpolation operator over $\{Y_i\}$, $R'_i$ and so on defined by
\[
I = \sum_{\cup X_{ij} = X} \prod_{i,j} I(X_{ij}), \tag{5.12}
\]
where $X_{ij}$ is a paved set consisting of $Y_i \subset X$ connected by the interpolation parameters ($s_{ij}$ for $i = 1$, $t_X$ for $i = 2$, $\tilde t_X$ for $i = 3$, $u_X$ for $i = 4$, $v_{ij}$ for $i = 5$ and $\tilde v_{ij}$ for $i = 6$). The paved set $X$ cannot be decomposed into two disconnected pieces without bisecting some $X_{ij}$ and
\[
I_i(X_{ij}) = \sum_{\gamma \in \tilde T(X_{ij})} \int_0^1 dw_\gamma\, \partial_{w_\gamma}, \tag{5.13}
\]
where $\tilde T(X_{ij})$ is the set of connected graphs over the constituents $Y_k \subset X_{ij}$ or $R'_i \subset X_{ij}$ made by $w$ ($= s_{ij}$, $t_X$, $\tilde t_X$, $u_Y$, $v_{ij}$ and $\tilde v_{ij}$). (Multi-indices are used for $w_\gamma$.) Then we have

\[
Z_\Lambda = Z_\infty \Big[ \sum_{p} \frac{1}{p!} \sum_{\cup_1^p X_i = \Lambda} \prod_{i}^{p} \rho_{X_i} \Big], \tag{5.14}
\]
\[
\rho_X = \sum_{R \subset X} \eta(X; R), \tag{5.15}
\]

where the sum over R ⊂ X is chosen so that the locations of R are consistent with the polymer expansion, i.e., R0 ∩ 1 = ∅ for 1 ⊂ ∂X. We can now prove Theorem 1: Proof of Theorem 1 (former half). Put X = ∪6i=1 Xi and Xi = ∪j Xij , where Xij is a collection of paved sets {Yk ⊂ X} such that Xi = ∪Xij , and is constructed by the action of Ii (Xij ) on 4. X cannot be divided into two disconnected sets without bisecting some Xij and Xi . SYi yields the tree decay factor exp[−δ0 nYi log N − m0 L(Yi )], nYi ≥ 2 over the squares 1k ⊂ Yi . Moreover as is seen from Lemmas 9 and 10, the action of Ii (Xij ) on 4 yields the factor σi (Xij ) bounded by the tree decay factor: |σi (Xij )| ≤ exp[−δ0 n˜ Xji log N − m0 LY (Xij )], where n˜ X is the number of Yi contained in X (n˜ X ≥ 2) and LY (X) denotes the length of the shortest tree graphs over Yi ⊂ X (from the center of 1i ⊂ Yi to the center of 1j ⊂ Yj ). The factor D(RY0 ) is combined with det1/2 (1 P− δH(Y )) ≤ 1. By Lemma 23, we see that it yields the factors bounded by exp[− Dk ⊂Y |Dk |N δ3 ]. Since σs , · · · , and σw contain the tree decay factors over Yi and Dj , and since SYi contains the tree decay factors over 1k ⊂ Yi , we can extract a part (e.g. 7/8) of the tree decay factors over 1i ⊂ X\D and Dk ⊂ D ∩ X in advance from σs,··· ,w (X) (we denote the remainders again by σs,··· ,w (X) for simplicity). Thus we have h 7 X X exp − δ0 nX\∪Dk log N − c1 |Dk |N δ3 |ρ(X)| ≤ 8 Dk ⊂X 0 i 7 − m0 L({1i ⊂ X\ ∪ Dk }, {Dk }) 8      p1 p6 X X 1 Y X X 1 Y X  σs (X1i ) · · ·  σw (X6i ) × p ! p ! 1 6 i i p p i=1 i=1 ∪Xi =X

1

∪X1 =X1

6

∪X6 =X6

where X 0 = X − ∂X, c1 = O(1) > 0 (in fact c1 ∼ 1), {Dk } are the large field regions consistent with Xij and L({1i ⊂ X\ ∪ Dk }, {Dk }) is the length of the shortest tree graph over {1i } and {Dk }. Then we can assume that Xi cannot be bisected without Q bisecting some Xij by adding 1/8 of the decay factor to each of σi (Xij ), i = 1, · · · , 6. Thus the sum over {Xij }j is convergent for i = 1, · · · , 6. Since X cannot be divided into two pieces without bisecting some Xi , the sum over Xi is again convergent. The result is bounded by exp[−δc nX log N − mc L(X)] if N is large, where δc > δ0 /8 and mc > m0 /8. Remark 7. It is obvious that mc and δc converge to m0 and δ0 , respectively for large N since the contributions from large fields are exponentially small.


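5.2. Proof of Theorem 1 (latter half). We now resum Eq. (5.14) in the following form:
\[
Z_\Lambda = Z_\infty \exp\Big[-\sum_{\Delta} W_\Delta\Big] \Big[ \sum_{p} \frac{1}{p!} \sum_{\cup X_i \subset \Lambda} \prod_{i}^{p} \hat\rho_{X_i} \Big]
= Z_\infty \exp\Big[ -\sum_{\Delta} W_\Delta - \sum_{Y} \hat W_Y \Big], \tag{5.16}
\]
where $\hat\rho_X \equiv \exp[\sum_{\Delta \subset X} W_\Delta]\, \rho_X$ is the polymer activity with the single square contributions subtracted. Thus $\hat\rho_\Delta = 1$. Moreover $n_Y \ge 2$ ($n_Y$ = number of squares in $Y$) and
\[
\hat W_Y = - \sum_{k} \frac{1}{k!} \sum_{\{X_i;\, i = 1, \cdots, k\};\ \cup X_i = Y}\ \sum_{\gamma_c} \prod_{\ell \in \gamma_c} (\ell) \prod_{\zeta} \hat\rho_{X_\zeta}. \tag{5.17}
\]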

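In this equation, $k$ is the number of $\{X_i\}$ and $\gamma_c$ runs over connected graphs of lines $\{\ell\}$ joining vertices $\{1, 2, \cdots, k\}$; $(\ell) = -1$ if $X_{\ell_+} \cap X_{\ell_-} \neq \emptyset$, where $\ell = (\ell_+, \ell_-)$, and zero otherwise. Then it follows [5, 13, 16] from (3.5) that

Theorem 24. For given $\beta > 0$, if $N$ is chosen large ($N \ge \exp[400\pi\beta]$), then
\[
\alpha \equiv \frac{W_\Delta}{L^2} + \sum_{Y \ni 0} \frac{1}{|Y|} \hat W_Y \tag{5.18}
\]
converges absolutely as $\Lambda \to Z^2$. The free energy $\alpha_F = \alpha_0 + \alpha$ is analytic in $\beta$, where
\[
\alpha_0 \equiv \lim \frac{1}{|\Lambda|} \Big[ \frac{N}{2} \log(\det(m^2 - \Delta)) - \frac{1}{2} \log(\det(C_\Lambda)) \Big]. \tag{5.19}
\]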
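The combinatorial factor in (5.17), the sum over connected graphs $\gamma_c$ of the product of link weights, can be evaluated by brute force for a handful of polymers. The following is a minimal sketch under stated assumptions: the function names `connected_graph_sum` and `is_connected` and the encoding of polymers as Python sets of lattice sites are illustrative choices, not taken from the paper. A link between two polymers is given weight $-1$ when they intersect and $0$ otherwise, as described after (5.17).

```python
from itertools import combinations


def is_connected(k, graph):
    """Return True if the links in `graph` connect all vertices 0..k-1 (k >= 1)."""
    adjacency = {v: set() for v in range(k)}
    for a, b in graph:
        adjacency[a].add(b)
        adjacency[b].add(a)
    reached, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for w in adjacency[v] - reached:
            reached.add(w)
            stack.append(w)
    return len(reached) == k


def connected_graph_sum(polymers):
    """Sum over connected graphs on the vertices 0..k-1 of the product of link weights.

    A link (a, b) has weight -1 if polymers a and b intersect and 0 otherwise,
    so only graphs built from links between overlapping polymers contribute.
    """
    k = len(polymers)
    # Links with nonzero weight: pairs of overlapping polymers.
    links = [(a, b) for a, b in combinations(range(k), 2)
             if polymers[a] & polymers[b]]
    total = 0
    for r in range(len(links) + 1):
        for graph in combinations(links, r):
            if is_connected(k, graph):
                total += (-1) ** r  # product of r weights, each equal to -1
    return total
```

For two overlapping polymers the sum is $-1$, for two disjoint polymers it vanishes, and for three mutually overlapping polymers it equals $2$. The enumeration is exponential in the number of links, so it is only meant to illustrate the definition for small $k$.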

6. Conclusion and Some Remarks

We have shown that the free energy is represented by the convergent polymer expansion, which establishes the analyticity of the free energy. Exponential decay of the correlation functions will be proved in the same way, but with some additional tricks. The mass parameter $m \sim e^{-2\pi\beta}$ is almost zero for large $\beta$, and our result is weak in the sense that $\beta_c(N)/N$ increases just logarithmically. Note that we used blocks of a single scale only. Our longstanding problem will be solved by iterative use of block-spin-type calculations.

Acknowledgement. We would like to thank T. Hara for many inspiring conversations. The contribution by K.R.I. is partially supported by the Grant-in-Aid for Scientific Research, No. 09640304, the Ministry of Education, Science and Culture, the Japanese Government. He thanks K. Gawedzki for the hospitality extended to him at I.H.E.S., where part of this work was done.


Appendix A: Properties of $G$, $C$, $\tilde G$ and Their Inverses

We first consider $G(x) = (2\pi)^{-2} \int e^{ipx} g(p) \prod dp$. Since $g(p)$ is analytic and periodic in $p$, the integral is invariant by the shift of $p_k$ by $i\varepsilon_k$, where $\varepsilon_k = \varepsilon x_k (x_1^2 + x_2^2)^{-1/2}$, $\varepsilon > 0$. Then $ipx \to ipx - \varepsilon|x|$ and $g(p + i\varepsilon)^{-1}$ is equal to
\[
m^2 + 2\sum_k [1 - \cos(p_k)\cosh(\varepsilon_k)] + 2i \sum_k \sin(p_k)\sinh(\varepsilon_k)
= m^2 + 2\sum_k [1 - \cosh(\varepsilon_k)] + 2\sum_k (1 - \cos(p_k))\cosh(\varepsilon_k) + 2i\sum_k \sin(p_k)\sinh(\varepsilon_k).
\]
Here we can set $\varepsilon = m_*$ by $m^2 + 2(1 - \cosh(m_*)) = 0$ since $\sum (1 - \cosh \varepsilon_k) \ge 1 - \cosh \varepsilon$. Then $\varepsilon = O(m)$ and it is immediate to see that $\int |g(p + i\varepsilon)| \prod dp < \mathrm{const.} \log(1 + m^{-1})$.

In Eq. (3.8), we consider the complex displacement of $p_i$ by $i\varepsilon_i$. We again shift $k_i$ by $i\varepsilon_i/2$ since $g(p - k)$ is periodic. Then $\tilde g^2(p + i\varepsilon)$ is equal to
\[
\int g\Big(p - k + i\frac{\varepsilon}{2}\Big)\, g\Big(k + i\frac{\varepsilon}{2}\Big) \prod \frac{dk_i}{2\pi}
= \int \frac{D - i(A_1 B_2 + A_2 B_1)}{(A_1^2 + B_1^2)(A_2^2 + B_2^2)} \prod \frac{dk_i}{2\pi},
\]
where $A_1 = m^2 + 2\sum [1 - \cos(p_i - k_i)\cosh(\varepsilon_i/2)]$, $B_1 = 2\sum \sin(p_i - k_i)\sinh(\varepsilon_i/2)$, $A_2 = A_1(p \equiv 0)$, $B_2 = -B_1(p \equiv 0)$ and $D \equiv A_1 A_2 - B_1 B_2$. Note that $2D = (A_1 + B_1)(A_2 - B_2) + (A_1 - B_1)(A_2 + B_2)$, where $A_1 \pm B_1 = m^2 + 4 - 2\sum_i \sqrt{\cosh(\varepsilon_i)}\, \cos(p_i - k_i \pm \delta_i)$ and so on, where $\tan \delta_i = \tanh(\varepsilon_i/2)$. Then $D > 0$ if $m^2 + 4 - 2\sum \sqrt{\cosh(\varepsilon_i)} > 0$. Since $\varepsilon^2 = \sum \varepsilon_i^2$ and $\sum \sqrt{\cosh(\varepsilon_i)} = 2 + \frac{1}{4}\varepsilon^2 - O(\varepsilon^4)$, $D > 0$ if $|\varepsilon| \le \sqrt{2}\, m$. Then $C(x, y)$, $\tilde G(x, y)$ and $\tilde G^{-1}(x, y)$ have uniform exponential decay faster than $\exp[-\sqrt{2}\, m|x - y|]$. By Schwarz's inequality, $\int |\tilde g(p + i\varepsilon)|\, dp < \mathrm{const.} \log(1 + m^{-1})$ if $|\varepsilon| < \sqrt{2}\, m$. Thus the bound for $\tilde G$ follows. Maximize $A_i^2 + B_i^2$ and integrate $D$ over $k$ to obtain $\mathrm{Re}\, \tilde g(p + i\varepsilon)^2 \ge c_0 (8 + m^2)^{-2}$, $c_0 = O(1) > 0$. Thus the bounds for $C = [G^{\circ 2}]^{-1}$ and $\tilde G^{-1}$ follow. The function $\tilde g(p)$ is exactly obtained in the continuum limit, and is analytic in $|\mathrm{Im}\, p| < 2m$. Thus our estimate will be improved.

Appendix B: Polymer Expansions of Kernel Functions

Let $H(x)$ be a positive type function defined on $Z^2$ whose Fourier transform $\tilde H(p)$ satisfies the following:
(1) $0 < c_1 \le \tilde H(p) \le c_2$.
(2) $\tilde H(p)$ is periodic in $p_i$, $i = 1, 2$.
(3) $\tilde H(p)$ is analytic in the strip $\{(p_1, p_2);\ |\mathrm{Im}\, p_i| < \varepsilon_i\}$, $\sum \varepsilon_i^2 < m^2$. $|\tilde H(p)|$ and $|\tilde H(p)|^{-1}$ are bounded on the boundary.
(4) $0 < c'_1 \le \mathrm{Re}\, \tilde H(p) \le c'_2$ and $|\mathrm{Im}\, \tilde H(p)| \le c'_3$ for $p$ in this strip.
Then we have shown that both $H(x)$ and $H^{-1}(x)$ decrease exponentially fast in $|x|$. Put
\[
H(x, y) = \int \exp[ip(x - y)]\, \tilde H(p) \prod \frac{dp_i}{2\pi}.
\]
Let $X \subset \Lambda$ and we define the matrix $H_X$ of size $|X| \times |X|$ by $H_X(x, y) \equiv \chi_X(x) H(x - y) \chi_X(y)$.
Then $c_1 \le H_X \le c_2$ and we have:

Theorem B.1. $H_X^{-1}(x, y)$, $H_X^{-1/2}(x, y)$ and $H_X^{1/2}(x, y)$ again decay exponentially fast:
\[
|H_X^{-1}(x, y)| < \mathrm{const.} \exp[-m|x - y|], \qquad
|H_X^{\pm 1/2}(x, y)| < \mathrm{const.} \exp[-m|x - y|].
\]

Proof. First suppose that $X$ is a rectangle of side lengths $X_1$ and $X_2$ with the center at the origin. The operator $H_X(x, y)$ is strictly positive. Let $\tilde H_X(p, q)$ be its Fourier kernel:
\[
\tilde H_X(p, q) = \sum_{x, y \in X} \int e^{i(p + k)x - i(q + k)y}\, \tilde H(k) \prod \frac{dk_i}{2\pi}. \tag{B.1}
\]
This is strictly positive and hence invertible. The properties (2) and (3) mean that $\tilde H_X$ can be analytically continued by
\[
\tilde H_X(i\varepsilon)(p, q) \equiv \tilde H_X(p + i\varepsilon, q + i\varepsilon) = \sum_{x, y \in X} \int e^{i(p + k)x - i(q + k)y}\, \tilde H(k - i\varepsilon) \prod \frac{dk_i}{2\pi}, \tag{B.2}
\]
and we see that
(i) $\tilde H_X$ is strictly positive as an operator on $\ell^2(X^*)$, where $X^*$ is the dual of $X$: $X^* = \{(2\pi n_1/X_1, 2\pi n_2/X_2);\ n_i = 0, 1, \cdots, X_i - 1\}$.
(ii) The self-adjoint part of $\tilde H_X(i\varepsilon)$ is strictly positive for $|\varepsilon| < m$.
Since
\[
H_X(x, y) = \frac{1}{|X|^2} \sum_{p, q \in X^*} e^{-ipx + iqy}\, \tilde H_X(0)(p, q)
= \frac{1}{|X|^2} \sum_{p, q \in X^*} e^{-i(p + i\varepsilon)x + i(q + i\varepsilon)y}\, \tilde H_X(i\varepsilon)(p, q), \tag{B.3}
\]
we have
\[
H_X^{-1}(x, y) = \frac{1}{|X|^2} \sum_{p, q \in X^*} e^{-i(p + i\varepsilon)x + i(q + i\varepsilon)y}\, \tilde H_X(i\varepsilon)^{-1}(p, q), \tag{B.4}
\]
where $|X| = X_1 X_2$. Then take $\varepsilon_k = -m\zeta_k/|\zeta|$, $\zeta = x - y$.

If $X$ is not a rectangle, choose the smallest rectangular set $\hat X$ containing $X$. Define $\hat H_{\hat X} = \chi_X H \chi_X + 1_{\hat X \setminus X}$, where $1_{\hat X \setminus X}$ is the identity operator on $\hat X \setminus X$. Then $\hat H_{\hat X}$ is strictly positive on $\ell^2(\hat X)$ and the previous discussion applies. The proof is the same for $H_X^{1/2}(x, y)$ and $H_X^{-1/2}(x, y)$.

For $(G_R)^{-1/2}$, we have an alternative: we can apply the polymer expansion or random walk expansion to the right-hand side of the integral representation $G_R^{-1/2} = 2\int_0^\infty (G_R + u^2)^{-1}\, du/\pi$. (This is left to the reader.)

Let $X = X_1 \cup X_2$, where $X_1 \cap X_2 = \emptyset$ and we assume that $X_1$, $X_2$ and $X = X_1 \cup X_2$ are rectangles. Let $H_X(s) \equiv (1 - s)(H_{X_1} + H_{X_2}) + sH_X$. Then $H(s)$ is strictly positive uniformly in $s \in [0, 1]$. What is important is that the Fourier transform of $H(s)(x, y)$ is $\tilde H_s(p, q) \equiv (1 - s)(\tilde H_{X_1}(p, q) + \tilde H_{X_2}(p, q)) + s\tilde H_X(p, q)$, which satisfies the conditions (i) and (ii) uniformly in $s \in [0, 1]$. This implies that


Theorem B.2. Let $H_X(s)$ be a convex linear combination of $\{H_{X_1} \oplus \cdots \oplus H_{X_n};\ X = \cup X_i,\ X_i \cap X_j = \emptyset\ (i \neq j)\}$. Then the following bound holds uniformly in $s_i \in [0, 1]$:
\[
|H_X^{-1}(s)(x, y)| < \mathrm{const.} \exp[-m|x - y|].
\]
For $H_X(s)$ with $X = X_1 \cup X_2$, we have:
\[
H_X^{-1} = H_{X_1}^{-1} \oplus H_{X_2}^{-1} - \int_0^1 H_X(s)^{-1} (H_{X_1 X_2} + H_{X_2 X_1}) H_X(s)^{-1}\, ds.
\]
This is the first step of the polymer expansion of $H_X^{-1}$ in the form of Lemma 14, but here we have introduced the interpolation parameter $s = s_1$ into the denominator (not in $G^{-1}$ like in Lemma 14). All these mean that we can apply the Brydges–Federbush method to cluster-expand some Green's functions.
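The interpolation formula for $H_X^{-1}$ can be checked in two lines. The following is a sketch that assumes only the definition $H_X(s) = (1 - s)(H_{X_1} + H_{X_2}) + sH_X$ and reads $H_{X_1 X_2}$ as $\chi_{X_1} H \chi_{X_2}$, by analogy with the convention $C_{X,Y} = \chi_X C \chi_Y$ used in Appendix C.

```latex
\frac{d}{ds} H_X(s)^{-1} = - H_X(s)^{-1}\, \frac{dH_X(s)}{ds}\, H_X(s)^{-1},
\qquad
\frac{dH_X(s)}{ds} = H_X - H_{X_1} - H_{X_2} = H_{X_1 X_2} + H_{X_2 X_1},
\\[4pt]
H_X(1)^{-1} - H_X(0)^{-1}
  = - \int_0^1 H_X(s)^{-1} \big( H_{X_1 X_2} + H_{X_2 X_1} \big) H_X(s)^{-1}\, ds,
\qquad
H_X(0)^{-1} = H_{X_1}^{-1} \oplus H_{X_2}^{-1} .
```

Since $H_X(1) = H_X$, rearranging the second line gives the stated identity.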

Appendix C: Polymer Expansion of Gaussian Measures

We here discuss a cluster expansion of Gaussian measures with an interaction $V$:
\[
Z_\Lambda = \int \exp[-V(\psi)]\, d\mu, \tag{C.1}
\]
\[
d\mu = \det{}^{-1/2}(C) \exp[-\langle \psi, C^{-1}\psi \rangle] \prod \frac{d\psi(x)}{\sqrt{\pi}}. \tag{C.2}
\]
Since $C$ is strictly positive, we use the cluster expansion of Brydges–Federbush type which keeps positivity of the operator. To do so, we first choose $\Delta_1 \subset \Lambda$ and define
\[
C(s_1) = [(1 - s_1)P_1 + s_1]\, C_\Lambda = (1 - s_1)(C_{\Lambda \setminus \Delta_1} + C_{\Delta_1}) + s_1 C_\Lambda, \tag{C.3}
\]
\[
P_1 C_X \equiv C_{X \setminus \Delta_1} + C_{X \cap \Delta_1}, \tag{C.4}
\]

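where we have used the notational convention $C_X = \chi_X C \chi_X$, $C_{X,Y} = \chi_X C \chi_Y$ and $X^c = \Lambda \setminus X$ as usual. Thus we have ($C$ in [5, 16] is written $\frac{1}{2}C$ here)
\[
Z_\Lambda = \int \exp[-V(\psi)]\, d\mu(s_1 = 1) = Z_{\Lambda \setminus \Delta_1} Z_{\Delta_1}
+ \sum_{\Delta_2 \subset \Lambda \setminus \Delta_1} \int_0^1 ds_1 \int d\mu(s_1) \sum_{x \in \Delta_1} \sum_{y \in \Delta_2} \frac{1}{2} C(x, y) \frac{\partial^2}{\partial\psi(x)\partial\psi(y)} e^{-V}, \tag{C.5}
\]
where
\[
d\mu(s_1) = \det{}^{-1/2}[C(s_1)] \exp[-\langle \psi, C(s_1)^{-1}\psi \rangle] \prod \frac{d\psi(x)}{\sqrt{\pi}}. \tag{C.6}
\]
In fact, this follows from the observations of
\[
\int d\mu(s_1)\, e^{i\psi(f)} = \exp\Big[-\frac{1}{4}\langle f, C(s_1) f \rangle\Big], \tag{C.7}
\]
\[
\frac{\partial}{\partial s_1}(\mathrm{r.h.s.}) = \frac{1}{4} \sum_{x, y} \Big[\frac{\partial}{\partial s_1} C(s_1)\Big]_{xy} \int d\mu(s_1) \frac{\partial^2}{\partial\psi(x)\partial\psi(y)} e^{i\psi(f)}, \tag{C.8}
\]
\[
\frac{\partial}{\partial s_1} C(s_1) = \sum_{\Delta_2 \subset \Lambda \setminus \Delta_1} (C_{\Delta_1, \Delta_2} + C_{\Delta_2, \Delta_1}). \tag{C.9}
\]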
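The step from (C.7) to (C.8) can be spelled out as follows. This is a sketch of the standard computation, using only that the measure with density proportional to $\exp[-\langle \psi, C^{-1}\psi \rangle]$ has covariance $\frac{1}{2}C$ and that $C(s_1)$ is symmetric.

```latex
\frac{\partial}{\partial s_1} \exp\Big[-\tfrac{1}{4}\langle f, C(s_1) f\rangle\Big]
  = -\frac{1}{4} \sum_{x,y} f(x) \Big[\frac{\partial C(s_1)}{\partial s_1}\Big]_{xy} f(y)\,
    \exp\Big[-\tfrac{1}{4}\langle f, C(s_1) f\rangle\Big],
\\[4pt]
\frac{\partial^2}{\partial\psi(x)\,\partial\psi(y)}\, e^{i\psi(f)}
  = \big(i f(x)\big)\big(i f(y)\big)\, e^{i\psi(f)}
  = - f(x) f(y)\, e^{i\psi(f)} .
```

Inserting the second line under $\int d\mu(s_1)$ and using (C.7) reproduces the first line, which is (C.8). With (C.9), the two terms $C_{\Delta_1,\Delta_2}$ and $C_{\Delta_2,\Delta_1}$ contribute equally by symmetry, which turns the factor $\frac{1}{4}$ into the $\frac{1}{2}$ of (C.5).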


This establishes the claim for the decomposition into $\Delta_1$ and $\Lambda \setminus \Delta_1$. We next apply the same steps to each term of Eq. (C.5): we introduce an interpolation parameter $s_1$ to $Z_{\Lambda \setminus \Delta_1}$ to decouple $\Delta_2$ from $\Lambda \setminus \Delta_1$ and introduce next an interpolation parameter $s_2$ to the rest to decouple $Y \equiv \Delta_1 \cup \Delta_2$ from $\Delta_3 \subset \Lambda \setminus Y$. See [5, 16] for detail.

Tree graphs $T'$ over $\{\Delta_1, \cdots, \Delta_p\}$ with the root $\Delta_1$ are graphs defined by permutations $\{j_1, \cdots, j_p\}$ of $\{1, 2, \cdots, p\}$ with $j_1 = 1$ and a map $a_{T'} : \{1, 2, \cdots, p - 1\} \to \{1, 2, \cdots, p - 1\}$ such that $a_{T'}(k) \le k$. They define a set of ordered links (tree graph $T'$) $\ell_k = (\Delta_{j_{a(k)}}, \Delta_{j_{k+1}})$, $k = 1, 2, \cdots, p - 1$. Set
\[
M_{T'}(s) = \prod_{i=1}^{p-1}\ \prod_{j = a_{T'}(i)}^{i-1} s_j. \tag{C.10}
\]

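Theorem C.1. [16] $Z_\Lambda$ has the cluster expansion
\[
\sum_{p} \frac{1}{p!} \sum_{Y_1, \cdots, Y_p} \prod_{i} Z^c_{Y_i} \prod_{\Delta \subset \Lambda \setminus \cup Y_i} Z_\Delta, \tag{C.11}
\]
where $Y_i$ are paved sets which are disjoint from each other and consist of two or more $\Delta_i \subset \Lambda$. Let $Y = \cup_{i=1}^{p} \Delta_i$ be one of $Y_i$. Then $Z^c_Y$ has the following expression:
\[
\sum_{T'} \int_0^1 ds_1 \cdots ds_{p-1}\, M_{T'}(s) \int d\mu(\{s\})
\times \prod_{k=1}^{p-1} \Big[ \sum_{x_k \in \Delta_{j_{a(k)}}} \sum_{y_{k+1} \in \Delta_{j_{k+1}}} \frac{1}{2} C(x_k, y_{k+1}) \frac{\partial^2}{\partial\psi(x_k)\partial\psi(y_{k+1})} \Big] \exp[-V(\psi)], \tag{C.12}
\]
where $T' = T_a = \{(j_{a(k)}, j_{k+1})\}_{k=1}^{p-1}$,
\[
d\mu(\{s\}) = \det{}^{-1/2}[C(\{s\})] \exp[-\langle \psi, C^{-1}(\{s\})\psi \rangle] \prod \frac{d\psi(x)}{\sqrt{\pi}}, \tag{C.13}
\]
\[
C(\{s\}) = \Big[ \prod_{i=1}^{p-1} ((1 - s_i)P_i + s_i) \Big] C_\Lambda, \tag{C.14}
\]
\[
P_i C_X = C_{X \setminus X_i} + C_{X \cap X_i}, \qquad (X_i = \cup_{k=1}^{i} \Delta_{j_k}). \tag{C.15}
\]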

There are many tree graphs $T'$ with root $\Delta_1$ which have the same links and vertices as $T$. They differ from each other by $M_{T'}(s)$ and $C^{-1}(s)$ [5, 16]:

Theorem C.2. $M_T \prod ds_i$ is a probability measure in the following sense:
\[
\sum_{T' : T(T') = T} \int_0^1 M_{T'} \prod ds_i = 1, \tag{C.16}
\]
where $\sum_{T' : T(T') = T}$ means the sum over tree graphs $T'$ which have the same links as $T$.


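For the Gaussian measure $d\mu_{\tilde K}$ restricted to the region $\tilde K$, we have:
\[
d\mu_{\tilde K}(s) = \det{}^{1/2}(\chi_{\tilde K} H(s) \chi_{\tilde K}) \exp[-\langle \psi, \chi_{\tilde K} H(s) \chi_{\tilde K} \psi \rangle] \prod \frac{d\psi(x)}{\sqrt{\pi}}, \tag{C.17}
\]
where $H(s)^{-1} = C(s) = (1 - s)(C_{\Lambda \setminus X_1} + C_{X_1}) + sC_\Lambda$ (we used $X_1$ for $\Delta_1$) and
\[
\frac{d}{ds} \int e^{-V} d\mu_{\tilde K} = \int d\mu_{\tilde K}(s) \sum_{x, y} \frac{1}{4} (ABA)_{xy} \frac{\partial^2}{\partial\psi(x)\partial\psi(y)} e^{-V}, \tag{C.18}
\]
\[
A = [\chi_{\tilde K} H(s) \chi_{\tilde K}]^{-1}, \tag{C.19}
\]
\[
B = -\frac{\partial}{\partial s} A^{-1} = \chi_{\tilde K} H(s) \Big[\frac{\partial}{\partial s} C(s)\Big] H(s) \chi_{\tilde K}. \tag{C.20}
\]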

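Since $(ABA)_{xy}$ depends on locations of $R'_i$, we expand $ABA$ into polymers. In fact, using the method of Lemma 14 to expand $[\chi_{\tilde K} H(s) \chi_{\tilde K}]^{-1}$ in terms of $H_{\Delta_i}$ and $H_{D_i \setminus R'_i}$, we have
\[
[\chi_{\tilde K} H(s) \chi_{\tilde K}]^{-1} \chi_{\tilde K} H(s) = I_{\tilde K} + [\chi_{\tilde K} H(s) \chi_{\tilde K}]^{-1} \chi_{\tilde K} H(s) \chi_{R'} = I_{\tilde K} + \sum_{X \cap R' \neq \emptyset} \delta C(X),
\]
where $I_{\tilde K}$ is the identity operator on $C^{\tilde K}$ and $\delta C(X)$ are the polymers expressed by random walks passing all squares $\Delta_i$ only in $X$ and at least one of $\{R'_i, D_i \setminus R'_i\}$ if $D_i \subset X$. We proceed inductively. After $j$ steps, $ABA$ is expressed as the sum of the following terms:
\[
1_{\tilde K} \big[ C_{X_i, \Lambda \setminus \cup_1^j X_k} + C_{\Lambda \setminus \cup_1^j X_k, X_i} \big] 1_{\tilde K}
+ \sum_{X'_1} \delta C(X'_1)\, [(\mathrm{same})]\, 1_{\tilde K}
+ \sum_{X'_2} 1_{\tilde K}\, [(\mathrm{same})]\, \delta C^+(X'_2)
+ \sum_{X'_1} \sum_{X'_2} \delta C(X'_1)\, [(\mathrm{same})]\, \delta C^+(X'_2),
\]
where $X_\ell \cap X_k = \emptyset$ ($k \neq \ell$), $1 \le i \le j$ and $\{s_i\}_1^j$ are omitted. The next step is:
(i) In $1_{\tilde K}[\cdots]1_{\tilde K}$, choose any $X_{j+1} = \Delta_\ell \subset \Lambda \setminus \cup_1^j X_k$ or $X_{j+1} = D_\ell \subset \Lambda \setminus \cup_1^j X_k$. Define $\delta F_1(X_i, X_{j+1}) \equiv C_{X_i, X_{j+1}}$, $\delta F_1(X_{j+1}, X_i) \equiv C_{X_{j+1}, X_i}$.
(ii) In $\delta C(X'_1)[\cdots]1_{\tilde K}$, choose any $X_{j+1} \subset \Lambda \setminus \cup_1^j X_k$. Define $\delta F_2(X_i, X_{j+1}) \equiv \sum_{X'_1} \delta C(X'_1) C_{X_i, X_{j+1}}$, $\delta F_2(X_{j+1}, X_i) \equiv \sum_{X'_1} \delta C(X'_1) C_{X_{j+1}, X_i}$, where $X'_1 \subset \cup_1^{j+1} X_k$, and $X'_1 \cap X_{j+1}$ must contain $X_{j+1} \cap K'$ and at least one of $\{R'_k, D_k \setminus R'_k\}$ if $D_k \subset X_{j+1}$. This is the same for $1_{\tilde K}[\cdots]\delta C^+(X'_2)$.
(iii) In $\delta C(X'_1)[\cdots]\delta C(X'_2)$, choose any $X_{j+1} \subset \Lambda \setminus \cup_1^j X_k$. Define
\[
\delta F_4(X_i, X_{j+1}) \equiv \sum_{X'_1, X'_2} \delta C(X'_1)\, C_{X_i, \Lambda \setminus \cup_1^j X_k}\, \delta C^+(X'_2),
\qquad
\delta F_4(X_{j+1}, X_i) \equiv \sum_{X'_1, X'_2} \delta C(X'_1)\, C_{\Lambda \setminus \cup_1^j X_k, X_i}\, \delta C^+(X'_2),
\]
where $X'_1 \cup X'_2 \subset \cup_1^{j+1} X_k$, and $(X'_1 \cup X'_2) \cap X_{j+1}$ must contain $X_{j+1} \cap K'$ and at least one of $\{R'_k, D_k \setminus R'_k\}$ if $D_k \subset X_{j+1}$.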


Then we define $\delta F(X_i, X_{j+1}) \equiv \sum_{k=1}^{4} \delta F_k(X_i, X_{j+1})$. (It is the same for $\delta F(X_{j+1}, X_i)$.) The following facts are immediate from the construction:
(1) Thanks to the random walk expansion, the sums on the right-hand sides converge and exhibit a tree decay property with respect to blocks $\Delta_k \subset X_{j+1}$ and $D_\ell \subset X_{j+1}$. The factor $\delta F(X_i, X_{j+1})$, with $i < j + 1$, includes the tree decay factors $\exp[-mL(X_{j+1} \wedge D)]$ and $\exp[-m\, \mathrm{dist}(X_i, X_{j+1})]$, where $X_{j+1} \wedge D$ implies that $D_\ell \subset X_{j+1}$ must be regarded as one set and must not be decomposed into $\Delta_k \subset D_\ell$.
(2) If $X_{j+1}$ consists of two or more $\Delta_k$ or $D_\ell$, then the factor $\delta F(X_i, X_{j+1})$ must contain $\exp[-m\, \mathrm{dist}(R' \cap X_{j+1}, X_i)] \le \exp[-m(L + L_0)]$.
(3) The matrix element $\delta F(X_i, X_j)(x, y)$ is less than $\min_\ell \exp[-mL(\Delta_\ell \cup (X_j \wedge D), x, y)]$, where $\Delta_\ell \subset X_i$.
(4) The matrix element $\delta F(X_i, X_j)(x, y) \neq 0$ even if $x \notin X_i$ or $y \notin X_j$. But it is less than the value given above, and bounded by $\exp[-m(L + L_0)]$ since it contains $R'$.
We then introduce $s_{j+1}$ to $C(s_1, \cdots, s_j)$ to separate $\cup_1^{j+1} X_k$ from its complement. We repeat the argument and obtain Theorem 19.

References
1. Balaban, T., Brydges, D., Imbrie, J. and Jaffe, A.: The Mass Gap for Higgs Models on a Unit Lattice. Ann. Physics 158, 281 (1984)
2. Brascamp, H. and Lieb, E.: On extensions of the Brunn–Minkowski and Prékopa–Leindler Theorem, including Inequalities for Log Concave Functions with Some Applications. J. Funct. Anal. 22, 336–389 (1976)
3. Brydges, D., Fröhlich, J. and Spencer, T.: The Random Walk Representation of Classical Spin Systems and Correlation Inequalities. Commun. Math. Phys. 83, 123–150 (1982)
4. Brydges, D., Fröhlich, J. and Sokal, A.: The Random Walk Representation of Classical Spin Systems and Correlation Inequalities, II. Commun. Math. Phys. 91, 117–139 (1985)
5. Brydges, D.: A Short Course on Cluster Expansions. In: Les Houches Summer School, Session XLIII (1984), ed. by K. Osterwalder et al. London: Elsevier Sci. Publ., 1986, pp. 129–183
6. Caracciolo, S., Edwards, R., Pelissetto, A. and Sokal, A.: Asymptotic Scaling in the Two-Dimensional O(3) σ model at Correlation Length $10^5$. Phys. Rev. Letters 75, 1891–1894 (1995)
7. Fröhlich, J., Israel, R., Lieb, E. and Simon, B.: Phase Transitions and Reflection Positivity. I. Commun. Math. Phys. 62, 1–34 (1978); Fröhlich, J., Israel, R., Lieb, E. and Simon, B.: Phase Transitions and Reflection Positivity. II. J. Stat. Phys. 22, 297–347 (1980)
8. Gawedzki, K. and Kupiainen, A.: Massless Lattice $\phi^4_4$ Theory, Rigorous Control of a Renormalizable Asymptotically Free Model. Commun. Math. Phys. 99, 197–252 (1985)
9. Glimm, J., Jaffe, A. and Spencer, T.: The Particle Structures of the Weakly Coupled $P(\phi)_2$ Models and Other Applications, Part II, The cluster expansion. In: Constructive Quantum Field Theory; Lecture Notes in Physics, 25, ed. by G. Velo and A. Wightman, Heidelberg: Springer Verlag, 1973, pp. 199–242
10. Ito, K.R.: Permanent Quark Confinement in 4D Hierarchical Lattice Gauge Models of Migdal–Kadanoff Type. Phys. Rev. Letters 55, 558–561 (1985); Mass Generations in Two-Dimensional Hierarchical Heisenberg Model of Migdal–Kadanoff Type. Commun. Math. Phys. 110, 237–246 (1987); Renormalization Group Flow of Two-Dimensional Hierarchical Heisenberg Model of Dyson–Wilson Type. Commun. Math. Phys. 137, 45–70 (1991)
11. Ito, K.R., Kugo, T. and Tamura, H.: Representation of O(N) Spin Models by Self-Avoiding Random Walks. Commun. Math. Phys. 183, 723–736 (1997)
12. Ito, K.R. and Tamura, H.: Deviations of Upper Bounds of Critical Temperatures of 2D O(N) Spin Models. To appear in Letters in Math. Phys. (1998)
13. Kotecky, R. and Preiss, D.: Cluster Expansion for Abstract Polymer Models. Commun. Math. Phys. 103, 491–498 (1986)
14. Kupiainen, A.: On the 1/n expansion. Commun. Math. Phys. 73, 273–294 (1980)
15. Ma, S.K.: The 1/n expansion. In: Phase Transitions and Critical Phenomena, 6, ed. by C. Domb and M. S. Green, London: Academic Press, 1976, pp. 249–292
16. Rivasseau, V.: Cluster Expansion with Small/Large Field Conditions. In: Mathematical Quantum Theory I, Field Theory and Many-Body Theory, ed. by J. Feldman et al. (CRM Proceedings and Lecture Notes, Vol. 7), Providence, RI: Am. Math. Soc., 1994
17. Simon, B.: The $P(\phi)_2$ Euclidean (Quantum) Field Theory. Princeton Series in Physics, Princeton, N.J.: Princeton Univ. Press, 1974
18. Wilson, K.: Confinement of Quarks. Phys. Rev. D 10, 2445–2459 (1974); Polyakov, A.: Interactions of Goldstone Bosons in Two Dimensions. Phys. Lett. 59B, 79–81 (1975)
19. Kopper, C.: Mass Generation in the large N-nonlinear σ-Model. Commun. Math. Phys. 202, 89–126 (1999)

Communicated by D. C. Brydges

Commun. Math. Phys. 202, 169 – 195 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Representations of Vertex Operator Algebra $V_L^+$ for Rank One Lattice $L$

Chongying Dong⋆, Kiyokazu Nagatomo⋆⋆

Department of Mathematics, University of California, Santa Cruz, CA 95064, USA

Received: 3 August 1998 / Accepted: 25 September 1998

Abstract: We classify the irreducible modules for the fixed point vertex operator subalgebra $V_L^+$ of the vertex operator algebra $V_L$ associated to a positive definite even lattice of rank 1 under the automorphism lifted from the $-1$ isometry of $L$.

1. Introduction

Vertex operator algebras $V_L$ associated to an arbitrary even positive definite lattice $L$ have been studied extensively, and their representation theory and fusion rules have been understood very well (see [B, FLM2, D1, DL1, DLM1]). It is well known that the vertex operator algebra $V_L$ has an order 2 automorphism $\theta$ which is deduced from the $-1$ isometry of the lattice [FLM2]. The space of $\theta$-invariants $V_L^+$ is a simple vertex operator subalgebra of $V_L$. In this paper we classify the irreducible modules for $V_L^+$ for all rank 1 lattices $L$. The classification result says that any irreducible module for $V_L^+$ is isomorphic to either a submodule of an irreducible $V_L$-module or a submodule of an irreducible $\theta$-twisted $V_L$-module. This confirms a conjecture in the orbifold conformal field theory [DVVV] in this special case.

The study of $V_L^+$ was initiated in [FLM1] during the course of constructing the moonshine module, although the notion of vertex operator algebra was not available back then. It became clear later in [FLM2] (also see [B]) that $V_\Lambda^+$ (where $\Lambda$ is the Leech lattice) is a vertex operator subalgebra of the moonshine (module) vertex operator algebra $V^\natural$, which is a direct sum of $V_\Lambda^+$ and an irreducible module for $V_\Lambda^+$. In fact the moonshine module is the first example of so-called "orbifold conformal field theory".

⋆ Supported by NSF grant DMS-9700923 and a research grant from the Committee on Research, UC Santa Cruz.
⋆⋆ On leave of absence from Department of Mathematics, Graduate School of Science, Osaka University, Toyonaka, Osaka 560-0043, Japan. This work was partly supported by Grant-in-Aid for Scientific Research, the Ministry of Education, Science and Culture.

170

C. Dong, K. Nagatomo

The important role of the twisted modules in the orbifold theory was also noticed and used in the construction of the moonshine module. There is a fundamental difference between vertex operator algebras $V_L$ and $V_L^+$. In order to see this we need to recall their construction. Set $\mathfrak{h} = \mathbb{C} \otimes_{\mathbb{Z}} L$ and $\hat{\mathfrak{h}} = \mathfrak{h} \otimes \mathbb{C}[t, t^{-1}] \oplus \mathbb{C}c$. Then the affine Lie algebra $\hat{\mathfrak{h}}$ has an automorphism $\theta$ of order 2 such that $\theta(h \otimes t^n) = -h \otimes t^n$ and $\theta(c) = c$. Then $V_L$ as a vector space is a tensor product of $M(1)$ and $\mathbb{C}[L]$, where $M(1) = S(\mathfrak{h} \otimes t^{-1}\mathbb{C}[t^{-1}])$ and $\mathbb{C}[L]$ is the group algebra of $L$. The map $\theta$ extends to an algebra automorphism of $S(\mathfrak{h} \otimes t^{-1}\mathbb{C}[t^{-1}])$ and maps $e^{\alpha}$ to $e^{-\alpha}$ for $\alpha \in L$, where we use $e^{\alpha}$ to denote the corresponding element in $\mathbb{C}[L]$. It is clear that $V_L$ has the subspace $\mathfrak{h} \otimes t^{-1}$. So the affine algebra $\hat{\mathfrak{h}}$ with $c = 1$ is a substructure of $V_L$ and plays a very important role in the classification of irreducible (twisted) modules for $V_L$ (see [D1] and [D2]). But this affine algebra is not available to $V_L^+$. This explains why the representation theory for $V_L^+$ is more difficult. Although $V_L^+$ provides a large class of concrete and important examples of vertex operator algebras, the study of representation theory for an arbitrary $V_L^+$ is very limited so far. It was proved in [DGH] that if $L$ contains a sublattice of type $D_1^d$ then $V_L^+$ is rational, but the classification of irreducible modules even in this case remains open except when $L$ is the Leech lattice (see [D3]). The classification of irreducible modules for $V_L^+$ is well motivated by the problem of classification of rational vertex operator algebras. The classification of rational vertex operator algebras is definitely one of the most important problems in the theory of vertex operator algebras and it has immediate applications to the classification of rational conformal field theory.
If the central charge is less than 1, the classification problem is not difficult, as the vertex operator subalgebra generated by the Virasoro element has only finitely many irreducible modules. So the first nontrivial case is the central charge equal to 1. It is believed that all rational vertex operator algebras of central charge 1 are $V_L$, $V_L^+$ and $V_{L_2}^G$, where the rank of $L$ is 1, $L_2$ is the root lattice of type $A_1$, $G$ is a finite subgroup of $SO(3)$ of type $E$ and $V_{L_2}^G$ is the corresponding space of invariants. Since, as we mentioned already, the representation theory of $V_L$ including the fusion rules is clear, one has to understand $V_L^+$ and $V_{L_2}^G$ better and eventually characterize them. Another importance of studying the vertex operator algebra $V_L^+$ lies in the connection of $V_L^+$ with the $W$-algebra $W(2, 4, k)$ (cf. [BFKNRV]), where the positive integer $k$ is half of the square length of generators of the lattice $L$. It was pointed out in [DG] that $V_L^+$ is generated by the Virasoro vector and two additional highest weight vectors for the Virasoro algebra of weights 4 and $k$. So $V_L^+$ is the vertex operator algebra associated to the $W$-algebra $W(2, 4, k)$ of central charge 1. We may expect some application of the results in this paper to the study of the $W$-algebra $W(2, 4, k)$. The present paper is a continuation of [DN], in which we determined Zhu's algebra $A(M(1)^+)$ in the case $d = 1$ and classified the irreducible modules for $M(1)^+$, where $M(1)^+$ is the space of $\theta$-invariants of $M(1)$. It is well known that $M(1)^+$ is a vertex operator subalgebra of $V_L^+$. The ideas, techniques and results in [DN] have been extensively used and significantly extended in the present paper. As in [DN] our main strategy is to determine Zhu's algebra $A(V_L^+)$, whose inequivalent irreducible modules have a one to one correspondence with the inequivalent irreducible (admissible) modules for $V_L^+$.
It turns out that $A(V_L^+)$ is a semisimple commutative associative algebra of dimension $k + 7$ generated by the image of the three generators of $V_L^+$ in $A(V_L^+)$. The organization of this paper is as follows: In Sect. 2 we review Zhu's algebra and related results. We also briefly review the vertex operator algebras $V_L$, $V_L^+$ and their (twisted) modules. In Sect. 3 we introduce the three generators of $V_L^+$ following [DG]

Representations of Vertex Operator Algebra for Rank One Lattice

171

and give the commutator relations of the component operators of these generators. We then show how to obtain a "small" spanning set for $A(V_L^+)$. In Sect. 4 we use a PBW type generating result to give an even "smaller" spanning set for $A(V_L^+)$ in terms of the images of the three generators of $V_L^+$. Section 5 is the core of this paper. In this section we first find the four relations among the three generators of $A(V_L^+)$, two of which were in [DN] already. These relations are good enough for us to determine a basis of $A(V_L^+)$ in the three separate cases: $k$ is not a perfect square, $k$ is an even perfect square and $k$ is an odd perfect square different from 1. The case $k = 1$ needs a special treatment although it is easy: in this case $V_L^+$ is isomorphic to another lattice vertex operator algebra $V_{L'}$ corresponding to $k = 4$.

2. Preliminaries

In this section, after recalling the definitions of admissible modules for a vertex operator algebra and a rational vertex operator algebra from [DLM2] and [Z], we review the definition of Zhu's algebra and related results. We then review the construction of the vertex operator algebra $V_L$ associated to an even positive definite lattice of rank 1 and its representations (see [B, FLM2, D1]). We also define the automorphism $\theta$ of $V_L$ which is a lifting from the $-1$ isometry of $L$ and the fixed point vertex operator subalgebra $V_L^+$. The $\theta$-twisted modules for $V_L$ are also discussed.

2.1. Modules and Zhu's algebras. We begin with a vertex operator algebra $V$ (cf. [B, FLM2]) and an automorphism $g$ of $V$ of finite order $T$. Then $V$ decomposes into eigenspaces with respect to the action of $g$ as $V = \bigoplus_{r \in \mathbb{Z}/T\mathbb{Z}} V^r$, where $V^r = \{v \in V \mid gv = e^{-2\pi i r/T} v\}$. An admissible $g$-twisted $V$-module (cf. [DLM2, Z]) is a $\frac{1}{T}\mathbb{Z}$-graded vector space
\[
M = \sum_{n=0}^{\infty} M\Big(\frac{n}{T}\Big),
\]
with top level $M(0) \neq 0$, equipped with a linear map
\[
V \longrightarrow (\mathrm{End}\, M)\{z\}, \qquad
v \longmapsto Y_M(v, z) = \sum_{n \in \mathbb{Q}} v_n z^{-n-1} \quad (v_n \in \mathrm{End}\, M)
\]
satisfying the following conditions: for all $0 \le r \le T - 1$, $u \in V^r$, $v \in V$, $w \in M$,
\[
Y_M(u, z) = \sum_{n \in \frac{r}{T} + \mathbb{Z}} u_n z^{-n-1},
\]
\[
u_n w = 0 \quad \text{for } n \gg 0, \qquad Y_M(\mathbf{1}, z) = 1,
\]
\[
z_0^{-1} \delta\Big(\frac{z_1 - z_2}{z_0}\Big) Y_M(u, z_1) Y_M(v, z_2) - z_0^{-1} \delta\Big(\frac{z_2 - z_1}{-z_0}\Big) Y_M(v, z_2) Y_M(u, z_1)
= z_2^{-1} \Big(\frac{z_1 - z_0}{z_2}\Big)^{-r/T} \delta\Big(\frac{z_1 - z_0}{z_2}\Big) Y_M(Y(u, z_0)v, z_2),
\]


where $\delta(z) = \sum_{n \in \mathbb{Z}} z^n$ and all binomial expressions are to be expanded in nonnegative integral powers of the second variable,
\[
u_m M(n) \subset M(\mathrm{wt}(u) - m - 1 + n)
\]
if $u$ is homogeneous. If $g = 1$, this reduces to the definition of an admissible $V$-module. A $g$-twisted $V$-module is an admissible $g$-twisted $V$-module $M$ which carries a $\mathbb{C}$-grading by weight. That is, we have
\[
M = \coprod_{\lambda \in \mathbb{C}} M_\lambda,
\]
where $M_\lambda = \{w \in M \mid L(0)w = \lambda w\}$. Moreover we require that $\dim M_\lambda$ is finite and for fixed $\lambda$, $M_{\frac{n}{T} + \lambda} = 0$ for all small enough integers $n$. Again if $g = 1$ we get an ordinary $V$-module.

A vertex operator algebra is called rational if any admissible module is a direct sum of irreducible admissible modules. It was proved in [Z] and [DLM2] that if $V$ is rational then $V$ has only finitely many irreducible admissible modules and each irreducible admissible module is an ordinary module.

Zhu introduced an associative algebra $A(V)$ associated to a vertex operator algebra $V$ which is extremely useful in the study of the representation theory of $V$ [Z]. In fact we will compute Zhu's algebra $A(V_L^+)$ to determine the irreducible modules for $V_L^+$. Let $V$ be a vertex operator algebra. For homogeneous $u \in V$ and $v \in V$, we define products $u * v$ and $u \circ v$ as follows:
\[
u * v = \mathrm{Res}_z \frac{(1 + z)^{\mathrm{wt}(u)}}{z} Y(u, z)v = \sum_{i=0}^{\infty} \binom{\mathrm{wt}(u)}{i} u_{i-1} v,
\qquad
u \circ v = \mathrm{Res}_z \frac{(1 + z)^{\mathrm{wt}(u)}}{z^2} Y(u, z)v = \sum_{i=0}^{\infty} \binom{\mathrm{wt}(u)}{i} u_{i-2} v. \tag{2.1}
\]
Then we extend (2.1) to linear products on $V$. Let $O(V)$ be the linear span of $u \circ v$ for $u, v \in V$. Set $A(V) = V/O(V)$. Let $M$ be an admissible module for $V$. Following [DLM2] we define the "vacuum space" $\Omega(M) = \{w \in M \mid u_n w = 0,\ u \in V,\ n \ge \mathrm{wt}(u)\}$. Then $\Omega(M)$ contains $M(0)$ and each $o(u) = u_{\mathrm{wt}(u) - 1}$ for homogeneous $u \in V$ preserves $\Omega(M)$. One can extend $o(u)$ to all $u \in V$ by linearity. Then we have (see [Z, DLM2])

Theorem 2.1. (1) $A(V)$ is an associative algebra under multiplication $*$ and with identity $\mathbf{1} + O(V)$ and central element $\omega + O(V)$.
(2) The map $u \mapsto o(u)$ gives a representation of $A(V)$ on $\Omega(M)$ for any admissible $V$-module $M$. Moreover, if $V$ is rational, $A(V)$ is a finite dimensional semisimple algebra.
(3) The map $M \to M(0)$ gives a bijection between the set of equivalence classes of irreducible admissible $V$-modules and the set of equivalence classes of simple $A(V)$-modules.
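As a worked instance of the products in (2.1), take $u = \omega$, which has weight 2 and whose components are $\omega_n = L(n - 1)$ in the usual convention $Y(\omega, z) = \sum_{n \in \mathbb{Z}} L(n) z^{-n-2}$. The computation below is only an illustration of the definitions; it also uses the standard facts $u_{-1}\mathbf{1} = u$, $u_{-2}\mathbf{1} = L(-1)u$ and $u_n \mathbf{1} = 0$ for $n \ge 0$.

```latex
\omega * v = \sum_{i=0}^{2} \binom{2}{i}\, \omega_{i-1} v
           = \big( L(-2) + 2L(-1) + L(0) \big) v ,
\\[4pt]
\omega \circ v = \sum_{i=0}^{2} \binom{2}{i}\, \omega_{i-2} v
               = \big( L(-3) + 2L(-2) + L(-1) \big) v \;\in\; O(V),
\\[4pt]
u \circ \mathbf{1} = u_{-2}\mathbf{1} + \mathrm{wt}(u)\, u_{-1}\mathbf{1}
                   = \big( L(-1) + L(0) \big) u \;\in\; O(V)
\quad \text{for homogeneous } u .
```

In particular $(L(-1) + L(0))v$ vanishes in $A(V)$, so the class of $\omega * v$ equals the class of $(L(-2) + L(-1))v$.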


Following [DN] we write $[u] = u + O(V) \in A(V)$. We define $u \sim v$ for $u, v \in V$ if $[u] = [v]$. This induces a relation on $\operatorname{End} V$ such that for $f, g \in \operatorname{End} V$, $f \sim g$ if and only if $fu \sim gu$ for all $u \in V$. We also need the following results from [Z].

Proposition 2.2. (1) Assume that $u \in V$ is homogeneous, $v \in V$ and $n \ge 0$. Then

$$\operatorname{Res}_z \frac{(1+z)^{\mathrm{wt}(u)}}{z^{2+n}}\, Y(u,z)v = \sum_{i=0}^{\infty} \binom{\mathrm{wt}(u)}{i} u_{i-n-2}v \in O(V). \tag{2.2}$$

(2) If $u$ and $v$ are homogeneous elements of $V$, then

$$u * v \sim \operatorname{Res}_z \frac{(1+z)^{\mathrm{wt}(v)-1}}{z}\, Y(v,z)u. \tag{2.3}$$

(3) For any $n \ge 1$,

$$L(-n) \sim (-1)^n \{(n-1)(L(-2) + L(-1)) + L(0)\}, \tag{2.4}$$

where $L(n)$ are the Virasoro operators given by $Y(\omega, z) = \sum_{n \in \mathbb{Z}} L(n) z^{-n-2}$.

(4) For any $u \in V$,

$$[u] * [\omega] = [(L(-2) + L(-1))u]. \tag{2.5}$$
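Relation (2.4) is used repeatedly below, in particular in the forms $L(-1) \sim -L(0)$ and $L(-2) \sim L(-2) + L(-1) + L(0)$. The following sketch simply evaluates the coefficients in (2.4); the function name `reduce_L` is a hypothetical helper, not notation from the paper.

```python
def reduce_L(n):
    """Coefficients (c2, c1, c0) with L(-n) ~ c2*L(-2) + c1*L(-1) + c0*L(0), as in (2.4)."""
    if n < 1:
        raise ValueError("(2.4) is stated for n >= 1")
    sign = (-1) ** n
    return (sign * (n - 1), sign * (n - 1), sign)

for n in range(1, 5):
    print(n, reduce_L(n))
```

For $n = 1$ the output reproduces $L(-1) \sim -L(0)$, the special case invoked in the proofs of Lemmas 3.8 and 3.10.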

2.2. Vertex operator algebras $V_L$ and $V_L^+$. We work in the setting of [FLM2]. Let $L$ be an even lattice of rank one with nondegenerate symmetric $\mathbb{Z}$-bilinear form $\langle\cdot,\cdot\rangle$, $\mathfrak{h} = L \otimes_{\mathbb{Z}} \mathbb{C}$ and $\hat{\mathfrak{h}}_{\mathbb{Z}}$ the corresponding Heisenberg algebra. Let $M(1)$ be the associated irreducible induced module for $\hat{\mathfrak{h}}_{\mathbb{Z}}$ such that the canonical central element of $\hat{\mathfrak{h}}_{\mathbb{Z}}$ acts as 1. Define $V_L = M(1) \otimes \mathbb{C}[L]$, where $\mathbb{C}[L]$ is the group algebra of $L$ with a basis $\{e^\alpha \mid \alpha \in L\}$. Set $\mathbf{1} = 1 \otimes 1$ and $\omega = \frac{1}{2}\beta(-1)^2\mathbf{1}$, where $\beta \in \mathfrak{h}$ is such that $\langle\beta,\beta\rangle = 1$. It was proved in [B] and [FLM2] that there is a linear map

$$V_L \to (\operatorname{End} V_L)[[z, z^{-1}]], \qquad v \mapsto Y(v,z) = \sum_{n \in \mathbb{Z}} v_n z^{-n-1} \quad (v_n \in \operatorname{End} V_L)$$

such that $V_L = (V_L, Y, \mathbf{1}, \omega)$ is a simple vertex operator algebra.

Let $L^\circ = \{x \in \mathfrak{h} \mid \langle x, L\rangle \subset \mathbb{Z}\}$ be the dual lattice of $L$. Then the irreducible modules for $V_L$ are $V_{L+\lambda} = M(1) \otimes \mathbb{C}[L+\lambda]$, where $\lambda$ runs over the coset representatives of $L$ in $L^\circ$ (see [D1]). Moreover, $V_L$ is a rational vertex operator algebra (see [DLM1]). To be more precise, let $L = \mathbb{Z}\alpha$ with $\langle\alpha,\alpha\rangle = 2k$. Then $L^\circ = \frac{1}{2k}L$ and the irreducible modules for $V_L$ are $V_{L+\frac{i}{2k}\alpha}$ for $i = 0, \ldots, 2k-1$.

Let $\theta$ be the linear automorphism of $V_{L^\circ}$ such that $\theta(u \otimes e^\gamma) = \theta(u) \otimes e^{-\gamma}$ for $u \in M(1)$ and $\gamma \in L^\circ$. Here the action of $\theta$ on $M(1)$ is given by $\theta(\alpha_1(n_1)\cdots\alpha_k(n_k)) = (-1)^k \alpha_1(n_1)\cdots\alpha_k(n_k)$. Then the restriction of $\theta$ to $V_L$ is a VOA automorphism. Let $M$ be a $\theta$-stable subspace of $V_{L^\circ}$. We denote the $\pm 1$ eigenspaces by $M^\pm$, respectively. Then $M(1)^+$ is a vertex operator subalgebra of $V_L^+$. We have (see Theorems 4.4 and 6.1 of [DM])

Proposition 2.3. (1) $V_L^+$ is a simple vertex operator algebra.
(2) $V_L^\pm$ and $V_{L+\frac{1}{2}\alpha}^\pm$ are irreducible $V_L^+$-modules.
(3) $V_{L+\frac{i}{2k}\alpha}$ and $V_{L+\frac{2k-i}{2k}\alpha}$ are isomorphic and irreducible $V_L^+$-modules for $i = 1, \ldots, k-1$.

Next we discuss the $\theta$-twisted modules of $V_L$ following Chapter 9 of [FLM2]. Then $L/2L$ is an abelian group isomorphic to $\mathbb{Z}_2$ which has two irreducible modules $T_1, T_2$ such that $\alpha + 2L$ acts as the scalars $1$ and $-1$, respectively. Let $\hat{\mathfrak{h}}[-1]$ be the twisted Heisenberg algebra. As in Sect. 1.7 of [FLM2] we also denote by $M(1)$ the unique irreducible $\hat{\mathfrak{h}}[-1]$-module with the canonical central element acting by 1. Define the twisted space $V_L^{T_i} = M(1) \otimes T_i$. It was shown in [FLM2] and [DL2] that there is a linear map

$$V_L \to (\operatorname{End} V_L^{T_i})[[z^{1/2}, z^{-1/2}]], \qquad v \mapsto Y(v,z) = \sum_{n \in \frac{1}{2}\mathbb{Z}} v_n z^{-n-1}$$

such that $V_L^{T_i}$ is an irreducible $\theta$-twisted module for $V_L$. Moreover, $V_L^{T_i}$ for $i = 1, 2$ give all irreducible $\theta$-twisted $V_L$-modules (see [D2]). We also define a linear operator $\theta$ on $V_L^{T_i}$ such that

$$\theta(\alpha_1(-n_1)\cdots\alpha_s(-n_s) \otimes t) = (-1)^s \alpha_1(-n_1)\cdots\alpha_s(-n_s) \otimes t$$

for $\alpha_i \in \mathfrak{h}$, $n_i \in \frac{1}{2} + \mathbb{Z}$ and $t \in T_i$. Then $\theta Y(u,z)\theta^{-1} = Y(\theta u, z)$ for $u \in V_L$ (cf. [FLM2]). We have the decomposition $V_L^{T_i} = (V_L^{T_i})^+ \oplus (V_L^{T_i})^-$, where $(V_L^{T_i})^\pm$ are the $\pm 1$ eigenspaces. Then we have (see [FLM2] and Theorem 5.5 of [DLi])

Proposition 2.4. $(V_L^{T_i})^\pm$ are irreducible $V_L^+$-modules for $i = 1, 2$.

Our main result in this paper is that Propositions 2.3 and 2.4 give a complete list of irreducible modules for $V_L^+$.

3. A Spanning Set for $A(V_L^+)$

In this section we use the ideas and techniques developed in [DN] to reduce the spanning set for $A(V_L^+)$ to the images of $M(1)^+$ and $V_L^+(1)$ in $A(V_L^+)$. We shall use the vertex operators $Y(u,z)$ for $u \in V_L$ freely and we refer the reader to [FLM2] for the definition of these operators. In Subsect. 3.1 we review the bracket relations for the component operators of the generators of $V_L^+$. Subsect. 3.2 gives several lemmas which are used in the later subsections. In Subsects. 3.3 and 3.4 we prove that the subspace $V_L^+(m) + O(V_L^+)$ of $A(V_L^+)$ can be generated by the subspace $M(1)^+ + V_L^+(1) + O(V_L^+)$ for all $m \ge 1$.

3.1. The generators of $V_L^+$. Recall from [DG] that the vertex operator algebra $M(1)^+$ is generated by $\omega$ and

$$J = \beta(-1)^4\mathbf{1} - 2\beta(-3)\beta(-1)\mathbf{1} + \frac{3}{2}\beta(-2)^2\mathbf{1}, \tag{3.1}$$

which is a singular vector of weight 4 for the Virasoro algebra. Also recall that $Y(\omega,z) = \sum_{n\in\mathbb{Z}} L(n)z^{-n-2}$ and $J(z) = \sum_{n\in\mathbb{Z}} J_n z^{-n-1}$. The following lemma can be found in [DN].

Lemma 3.1. (1) For any $m, n \in \mathbb{Z}$, $[L(m), J_n] = (3(m+1) - n)J_{n+m}$.
(2) The commutators $[J_m, J_n]$ are expressed as linear combinations of $L(p_1)\cdots L(p_s)$ and $L(q_1)\cdots L(q_t)J_r$, where $p_1, \ldots, p_s, q_1, \ldots, q_t, r \in \mathbb{Z}$ and $s, t \le 3$.

For convenience we set

$$V_L^+(m) = M(1)^+ \otimes (e^{m\alpha} + e^{-m\alpha}) + M(1)^- \otimes (e^{m\alpha} - e^{-m\alpha})$$

for $m \ge 0$. Then $V_L^+(m)$ is an irreducible $M(1)^+$-module and is also a completely reducible module for the Virasoro algebra:

$$V_L^+(m) = \begin{cases} \bigoplus_{p \ge 0} L(1, (m\sqrt{k} + p)^2) & \text{if } k \ne 0 \text{ is a perfect square}, \\ \bigoplus_{p \ge 0} L(1, (4p)^2) & \text{if } k = 0, \\ L(1, km^2) & \text{otherwise} \end{cases} \tag{3.2}$$

(cf. [DG]), where $L(1,h)$ is the highest weight module for the Virasoro algebra with central charge 1 and highest weight $h$.

Set $E = e^\alpha + e^{-\alpha}$. Then $V_L^+$ is generated by $\omega$, $J$ and $E$ (see Theorem 2.9 of [DG]). Since $E$ is a highest weight vector for the Virasoro algebra of weight $k$, we immediately have

$$[L(m), E_n] = ((k-1)(m+1) - n)E_{n+m}.$$

The commutator $[J_m, E_n]$ could be computed if one knew $J_sE$ for $s \ge 0$. Next we make a rough estimate of $J_sE$. Since $\mathrm{wt}(J_sE) = k + 3 - s \le k + 3$ if $s \ge 0$, we have $J_sE \in L(1,k)$ by (3.2) for $k > 1$. The following lemma is now obvious.

Lemma 3.2. Assume that $k > 1$. For any nonnegative integer $n$, $J_nE$ is expressed as a linear combination of the set

$$\{L(-m_1)\cdots L(-m_s)E \mid m_1 \ge m_2 \ge \cdots \ge m_s \ge 1,\ s \le 3\}.$$

3.2. Several lemmas. Recall from [FLM2] that $\alpha(z) = \sum_{m\in\mathbb{Z}} \alpha(m)z^{-m-1}$ for $\alpha \in \mathfrak{h}$ and $Y(\alpha(-n-1)\mathbf{1}, z) = \partial^{(n)}\alpha(z)$ for $n \in \mathbb{Z}_{\ge 0}$, where $\partial^{(n)} = \frac{1}{n!}\left(\frac{d}{dz}\right)^n$. Then

$$\partial^{(n)}\alpha(z) = \sum_{j \ge 0} \binom{-j-1}{n}\alpha(j)z^{-j-n-1} + \sum_{j \le -n-1} \binom{-j-1}{n}\alpha(j)z^{-j-n-1}. \tag{3.3}$$

Set $E^m = e^{m\alpha} + e^{-m\alpha}$ and $F^m = e^{m\alpha} - e^{-m\alpha}$ for any integer $m$. Then $E^1$ is the $E$ defined in Subsect. 3.1. Also set $F = F^1$. Notice that

$$\alpha(0)E^m = 2kmF^m, \qquad \alpha(0)F^m = 2kmE^m.$$

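Lemma 3.3. For $m, n \ge 1$,

$$(\alpha(-n)\alpha(-1)\mathbf{1}) * E^m = \alpha(-n)\alpha(-1)E^m + 2mk(n + (-1)^{n+1})\alpha(-n-1)F^m + \sum_{i=0}^{n} c_i\,\alpha(-i)F^m$$

for some $c_i \in \mathbb{C}$.

Proof. Let $v = \alpha(-n)\alpha(-1)\mathbf{1}$. By (3.3) we have

$$Y(v,z) = \sum_{\substack{j_1, j_2 \in \mathbb{Z} \\ j_1 \ge 0 \text{ or } j_1 \le -n}} \binom{-j_1-1}{n-1}\, {}^{\circ}_{\circ}\alpha(j_1)\alpha(j_2){}^{\circ}_{\circ}\, z^{-j_1-j_2-n-1}. \tag{3.4}$$

Recall that $Y(v,z) = \sum_{j\in\mathbb{Z}} v_j z^{-j-1}$. Then

$$v_{-1} = \sum_{\substack{j_1 + j_2 = -n-1 \\ j_1 \ge 0 \text{ or } j_1 \le -n}} \binom{-j_1-1}{n-1}\, {}^{\circ}_{\circ}\alpha(j_1)\alpha(j_2){}^{\circ}_{\circ}.$$

Hence

$$v_{-1}E^m = \left((-1)^{n-1}\alpha(-n-1)\alpha(0) + n\alpha(-n-1)\alpha(0) + \alpha(-n)\alpha(-1)\right)E^m.$$

This proves that $v_{-1}E^m = \alpha(-n)\alpha(-1)E^m + 2mk(n + (-1)^{n+1})\alpha(-n-1)F^m$. Next we consider $v_iE^m$ for $0 \le i \le n$. By (3.3), we see

$$v_i = \sum_{\substack{j_1 + j_2 = i-n \\ j_1 \ge 0 \text{ or } j_1 \le -n}} \binom{-j_1-1}{n-1}\, {}^{\circ}_{\circ}\alpha(j_1)\alpha(j_2){}^{\circ}_{\circ}.$$

So

$$v_iE^m = \left((-1)^{n+1}\alpha(i-n)\alpha(0) + \delta_{i0}\,\alpha(-n)\alpha(0)\right)E^m = 2km\left((-1)^{n+1}\alpha(i-n) + \delta_{i0}\,\alpha(-n)\right)F^m.$$

Since $\mathrm{wt}(v) = n + 1$ we see from (2.1) that

$$v * E^m = \sum_{i=0}^{n+1} \binom{n+1}{i} v_{i-1}E^m.$$

Substitute the explicit expressions of $v_{i-1}E^m$ into the equation above to get the desired result.

We say that an element $u = \alpha(-n_1)\cdots\alpha(-n_r)v$ ($n_i > 0$, $v = E^m$ or $F^m$) has length $r$ with respect to $\alpha$, and we write $\ell_\alpha(u) = r$. In general, if $u$ is a linear combination of such vectors $u_i$, we define the length of $u$ to be the maximal length among the $\ell_\alpha(u_i)$.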

Lemma 3.4. Let $n_1, n_2, \ldots, n_r \in \mathbb{Z}_{>0}$ with $r$ even. Then

$$(\alpha(-n_1)\cdots\alpha(-n_r)\mathbf{1}) * E^m = \alpha(-n_1)\cdots\alpha(-n_r)E^m + u,$$

where $u \in V_L^+(m)$ and $\ell_\alpha(u) < r$.

Proof. Let $v = \alpha(-n_1)\cdots\alpha(-n_r)\mathbf{1}$. By the definition of a vertex operator and (3.3), we have

$$Y(v,z) = \sum_{m_i \in \mathbb{Z}} c_{m_1n_1}\cdots c_{m_rn_r}\, {}^{\circ}_{\circ}\alpha(m_1)\cdots\alpha(m_r){}^{\circ}_{\circ}\, z^{-m-n},$$

where $m = m_1 + \cdots + m_r$, $n = n_1 + \cdots + n_r$ and $c_{mn} = \binom{-m-1}{n-1}$, with $m_i \ge 0$ or $m_i \le -n_i$. Therefore for $j \ge 0$,

$$v_{j-1}E^m = \sum_{\substack{m = j-n \\ m_i \ge 0 \text{ or } m_i \le -n_i}} c_{m_1n_1}\cdots c_{m_rn_r}\, {}^{\circ}_{\circ}\alpha(m_1)\cdots\alpha(m_r){}^{\circ}_{\circ}\, E^m.$$

If $j = 0$ then either $m_i = -n_i$ for all $i$ or there exists $i$ such that $m_i \ge 0$. So in this case $v_{-1}E^m = \alpha(-n_1)\cdots\alpha(-n_r)E^m + u$, where $\ell_\alpha(u) < r$. If $j > 0$ then there exists $i$ such that $m_i \ge 0$. This implies $\ell_\alpha(v_{j-1}E^m) < r$. The lemma follows from the definition of the $*$ product.

A similar argument gives:

Lemma 3.5. Let $n_1, \ldots, n_r \in \mathbb{Z}_{>0}$ with $r$ odd. Then

$$(\alpha(-n_1)\cdots\alpha(-n_{r-1})\mathbf{1}) * (\alpha(-n_r)F^m) = \alpha(-n_1)\cdots\alpha(-n_r)F^m + u,$$

where $u \in V_L^+(m)$ and $\ell_\alpha(u) < r$.

3.3. Reduction I: Even case. In this subsection we prove by induction on $n$ that $V_L^+(n) \equiv 0 \bmod O(V_L^+) + M(1)^+$ for even integers $n$. We need the following notation:

$$\exp\left(\sum_{n=1}^{\infty} \frac{x_n}{n} z^n\right) = \sum_{j=0}^{\infty} p_j(x_1, x_2, \ldots)z^j = \sum_{j=0}^{\infty} p_j(x)z^j \tag{3.5}$$

and $q_j(x) = p_j(x) + p_j(-x)$. Then the $p_j(x)$ are elementary Schur polynomials. The following lemma is easily derived from the definition of vertex operators.

Lemma 3.6. For $m, n \in \mathbb{Z}$,

$$Y(e^{m\alpha}, z)e^{n\alpha} = \sum_{j=0}^{\infty} p_j(m\alpha) \otimes e^{(m+n)\alpha} z^{2kmn+j},$$

where $p_j(\beta) = p_j(\beta(-1), \beta(-2), \ldots)$.

Lemma 3.7. For any $m \in \mathbb{Z}_{>0}$,

$$e^{2m\alpha} + e^{-2m\alpha} \equiv 0 \bmod O(V_L^+) + M(1)^+.$$

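Proof. Using Lemma 3.6 we see that

$$
\begin{aligned}
Y(E^m, z)E^m &= Y(e^{m\alpha}, z)e^{m\alpha} + Y(e^{-m\alpha}, z)e^{-m\alpha} + Y(e^{m\alpha}, z)e^{-m\alpha} + Y(e^{-m\alpha}, z)e^{m\alpha} \\
&= \sum_{j=0}^{\infty} \left(p_j(m\alpha) \otimes e^{2m\alpha} + p_j(-m\alpha) \otimes e^{-2m\alpha}\right) z^{2km^2+j} + \sum_{j=0}^{\infty} q_j(m\alpha)\, z^{-2km^2+j}.
\end{aligned}
$$

Hence,

$$\operatorname{Res}_z \frac{(1+z)^{km^2}}{z^{2km^2+1}}\, Y(E^m, z)E^m = E^{2m} + u,$$

where $u$ is a linear combination of the $q_j(m\alpha)$; in particular, $u \in M(1)^+$. Since $\mathrm{wt}(E^m) = km^2$ and $2km^2 + 1 \ge 2$, by Proposition 2.2 (1), $\operatorname{Res}_z \frac{(1+z)^{km^2}}{z^{2km^2+1}}\, Y(E^m, z)E^m$ lies in $O(V_L^+)$. The proof is complete.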
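The elementary Schur polynomials $p_j$ of (3.5) and the combinations $q_j$ can be evaluated numerically, which is a convenient sanity check on the generating function. The sketch below assumes commuting numeric values for $x_1, x_2, \ldots$ (so it does not model the operators $\alpha(-n)$ themselves) and uses the recurrence $j\,p_j = \sum_{n=1}^{j} x_n p_{j-n}$, obtained by differentiating (3.5) in $z$; the function names are ours.

```python
from fractions import Fraction

def schur_p(xs, jmax):
    """Values p_0..p_jmax of the elementary Schur polynomials at numeric x_1, x_2, ...

    Uses j * p_j = sum_{n=1}^{j} x_n * p_{j-n}, which follows from differentiating
    exp(sum_n x_n z^n / n) in z. xs[n-1] holds x_n; missing entries are taken as 0.
    """
    x = [Fraction(v) for v in xs] + [Fraction(0)] * jmax
    p = [Fraction(1)]
    for j in range(1, jmax + 1):
        p.append(sum(x[n - 1] * p[j - n] for n in range(1, j + 1)) / j)
    return p

def schur_q(xs, jmax):
    """q_j(x) = p_j(x) + p_j(-x), with every variable negated in the second term."""
    plus = schur_p(xs, jmax)
    minus = schur_p([-Fraction(v) for v in xs], jmax)
    return [a + b for a, b in zip(plus, minus)]

ones = schur_p([1] * 6, 6)   # x_n = 1 for all n: exp(-log(1 - z)) = 1/(1 - z)
first = schur_p([1], 4)      # only x_1 = 1: exp(z)
q_vals = schur_q([2, 3, 5], 3)
print(q_vals)
```

With $x_n = 1$ for all $n$ the generating function is $1/(1-z)$, so every $p_j$ equals 1, and $q_1$ vanishes identically because $p_1(x) = x_1$ is odd.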

Lemma 3.8. Let $m \in \mathbb{Z}_{>0}$ be even and $n \in \mathbb{Z}_{>0}$. Then

$$\alpha(-n)(e^{m\alpha} - e^{-m\alpha}) \equiv 0 \bmod O(V_L^+) + M(1)^+.$$

Proof. We prove the result by induction on $n$. Note that $L(-1) \sim -L(0)$. Then $L(-1)E^m \equiv -L(0)E^m \bmod O(V_L^+)$, or

$$m\alpha(-1)F^m \equiv -km^2E^m \bmod O(V_L^+).$$

Using Lemma 3.7 shows that $\alpha(-1)F^m \equiv 0 \bmod O(V_L^+) + M(1)^+$. So the case $n = 1$ is done. Assume that the lemma is true for all positive integers less than or equal to $n$; we prove it for $n + 1$. By the induction hypothesis, $\alpha(-n)(e^{m\alpha} - e^{-m\alpha}) \in M(1)^+ \bmod O(V_L^+)$. Then again use $L(-1) \sim -L(0)$ to get

$$
\begin{aligned}
-(n + km^2)\alpha(-n)F^m &= -L(0)\alpha(-n)F^m \sim L(-1)(\alpha(-n)F^m) \\
&= n\alpha(-n-1)F^m + m\alpha(-n)\alpha(-1)E^m \in M(1)^+ \bmod O(V_L^+).
\end{aligned} \tag{3.6}
$$

Let $v = \alpha(-n)\alpha(-1)\mathbf{1}$. Then by Lemma 3.3 and the induction hypothesis, we have

$$v * E^m \equiv \alpha(-n)\alpha(-1)E^m + 2mk(n + (-1)^{n+1})\alpha(-n-1)F^m \bmod O(V_L^+) + M(1)^+.$$

On the other hand, $E^m \in M(1)^+ \pmod{O(V_L^+)}$ by Lemma 3.7, since $m$ is even. Hence $v * (e^{m\alpha} + e^{-m\alpha}) \in M(1)^+ \bmod O(V_L^+)$, and

$$\alpha(-n)\alpha(-1)E^m \equiv -2mk(n + (-1)^{n+1})\alpha(-n-1)F^m \bmod O(V_L^+) + M(1)^+.$$

Finally, substituting this into (3.6), we arrive at

$$\left(n - 2km^2(n + (-1)^{n+1})\right)\alpha(-n-1)F^m \equiv 0 \bmod O(V_L^+) + M(1)^+.$$

Since $n - 2km^2(n + (-1)^{n+1}) \ne 0$ for $m \ge 2$, we see that

$$\alpha(-n-1)F^m \equiv 0 \bmod O(V_L^+) + M(1)^+.$$
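The last step of the proof of Lemma 3.8 relies on the scalar $n - 2km^2(n + (-1)^{n+1})$ being nonzero for $m \ge 2$. A brute-force check over a finite range (which of course proves nothing beyond that range, and whose bounds are arbitrary choices of ours) is sketched below; it also shows that the restriction on $m$ matters, since the scalar does vanish for $m = k = 1$, $n = 2$.

```python
def coeff(n, k, m):
    """Scalar n - 2*k*m^2*(n + (-1)^(n+1)) multiplying alpha(-n-1)F^m in the proof of Lemma 3.8."""
    return n - 2 * k * m * m * (n + (-1) ** (n + 1))

# Search for zeros with m even and m >= 2, over an arbitrary finite window.
zeros_even_m = [(n, k, m)
                for n in range(1, 60)
                for k in range(1, 30)
                for m in range(2, 12, 2)
                if coeff(n, k, m) == 0]
print(zeros_even_m)
print(coeff(2, 1, 1))
```

For $m \ge 2$ one has $2km^2 \ge 8$, so the scalar is negative for every $n \ge 1$, which is consistent with the empty list found by the search.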

The main result in this subsection is the following:

Lemma 3.9. For any positive even integer $m$, $V_L^+(m) \equiv 0 \bmod O(V_L^+) + M(1)^+$.

Proof. Let $u = \alpha(-n_1)\cdots\alpha(-n_r)(e^{m\alpha} + (-1)^r e^{-m\alpha})$. Set

$$v = \begin{cases} \alpha(-n_1)\cdots\alpha(-n_r)\mathbf{1} & \text{if } r \text{ is even}, \\ \alpha(-n_1)\cdots\alpha(-n_{r-1})\mathbf{1} & \text{if } r \text{ is odd}, \end{cases}$$

and

$$w = \begin{cases} e^{m\alpha} + e^{-m\alpha} & \text{if } r \text{ is even}, \\ \alpha(-n_r)(e^{m\alpha} - e^{-m\alpha}) & \text{if } r \text{ is odd}. \end{cases}$$

Then by Lemma 3.4 and Lemma 3.5, we see that $v * w = u + u'$, where $u' \in V_L^+(m)$ and $\ell_\alpha(u') < \ell_\alpha(u) = r$. From Lemma 3.7 and Lemma 3.8, $v * w \equiv 0 \bmod O(V_L^+) + M(1)^+$, that is, $u + u' \equiv 0 \bmod O(V_L^+) + M(1)^+$. An induction on $r$ shows that $u \in M(1)^+ \bmod O(V_L^+)$.

3.4. Reduction II: Odd case. In this subsection we prove that $V_L^+(m) \subset M(1)^+ \otimes (e^\alpha + e^{-\alpha}) + O(V_L^+)$ for all odd integers $m$. Recall from the previous subsections that $E = e^\alpha + e^{-\alpha}$ and $F = e^\alpha - e^{-\alpha}$.

Lemma 3.10. For any $s \in M(1)^+$ and $n \in \mathbb{Z}_{>0}$, there exists $t \in M(1)^+$ such that

$$\alpha(-n)s \otimes F \equiv t \otimes E \bmod O(V_L^+).$$

Proof. We prove the lemma by induction on $n$. Applying $L(-1)$ to $s \otimes E$ and using the relation $L(-1) \sim -L(0)$ yields

$$\alpha(-1)s \otimes F = L(-1)(s \otimes E) - (L(-1)s) \otimes E \equiv -L(0)(s \otimes E) - (L(-1)s) \otimes E \bmod O(V_L^+).$$

Since $M(1)^+ \otimes (e^\alpha + e^{-\alpha})$ is invariant under $L(0)$ and $L(-1)s \in M(1)^+$, we see immediately that

$$\alpha(-1)s \otimes F \in M(1)^+ \otimes (e^\alpha + e^{-\alpha}) \bmod O(V_L^+).$$

Let us assume that the lemma holds for $n > 0$. Again applying $L(-1)$ to $\alpha(-n)s \otimes F$ gives

$$L(-1)(\alpha(-n)s \otimes F) = (L(-1)s)\alpha(-n) \otimes F + ns\,\alpha(-n-1) \otimes F + s\,\alpha(-n)\alpha(-1) \otimes E.$$

Thus

$$
\begin{aligned}
ns\,\alpha(-n-1) \otimes F &= L(-1)(s\alpha(-n) \otimes F) - (L(-1)s)\alpha(-n) \otimes F - s\alpha(-n)\alpha(-1) \otimes E \\
&\equiv -L(0)(s\alpha(-n) \otimes F) - (L(-1)s)\alpha(-n) \otimes F - s\alpha(-n)\alpha(-1) \otimes E \bmod O(V_L^+).
\end{aligned}
$$

By the induction hypothesis both $L(0)(s\alpha(-n) \otimes F)$ and $(L(-1)s)\alpha(-n) \otimes F$ lie in $M(1)^+ \otimes E$ modulo $O(V_L^+)$, and the last term $s\alpha(-n)\alpha(-1) \otimes E$ already lies in $M(1)^+ \otimes E$. As a result we have

$$\alpha(-n-1)s \otimes F \in M(1)^+ \otimes (e^\alpha + e^{-\alpha}) \bmod O(V_L^+).$$

Remark 3.11. From the proof of Lemma 3.10, it is clear that for any $0 \ne \gamma \in L$,

$$M(1)^- \otimes (e^\gamma - e^{-\gamma}) \subset M(1)^+ \otimes (e^\gamma + e^{-\gamma}) \bmod O(V_L^+).$$

The main result in this subsection is:

Lemma 3.12. For any odd positive integer $m$, $V_L^+(m) \subset V_L^+(1) + O(V_L^+)$.

Proof. We prove this by induction on $m$. By Lemma 3.10 the lemma holds for $m = 1$. Suppose the assertion is true for $m - 2$ ($m \ge 3$). A straightforward computation using Lemma 3.6 gives

$$
\begin{aligned}
Y(E,z)E^{m-1} &= Y(e^\alpha, z)e^{(m-1)\alpha} + Y(e^{-\alpha}, z)e^{-(m-1)\alpha} + Y(e^{-\alpha}, z)e^{(m-1)\alpha} + Y(e^\alpha, z)e^{-(m-1)\alpha} \\
&= \sum_{j=0}^{\infty} \left(p_j(\alpha) \otimes e^{m\alpha} + p_j(-\alpha) \otimes e^{-m\alpha}\right) z^{2k(m-1)+j} \\
&\quad + \sum_{j=0}^{\infty} \left(p_j(-\alpha) \otimes e^{(m-2)\alpha} + p_j(\alpha) \otimes e^{-(m-2)\alpha}\right) z^{-2k(m-1)+j}.
\end{aligned}
$$

So

$$\operatorname{Res}_z \frac{(1+z)^k}{z^{2k(m-1)+1}}\, Y(E,z)E^{m-1} = e^{m\alpha} + e^{-m\alpha} + u,$$

where $u \in V_L^+(m-2)$. Since $\mathrm{wt}(E) = k$ and $2k(m-1) + 1 \ge 2$ we see from Proposition 2.2 that $\operatorname{Res}_z \frac{(1+z)^k}{z^{2k(m-1)+1}}\, Y(E,z)E^{m-1} \in O(V_L^+)$. From the induction hypothesis we know that $u$ lies in $V_L^+(1)$ modulo $O(V_L^+)$. Thus

$$e^{m\alpha} + e^{-m\alpha} \in V_L^+(1) \bmod O(V_L^+).$$

By the same argument given in the proof of Lemma 3.8 we prove that $\alpha(-n)F^m \in V_L^+(1) \bmod O(V_L^+)$. Next, we want to show by induction on $r$ that

$$u = \alpha(-n_1)\alpha(-n_2)\cdots\alpha(-n_r)(e^{m\alpha} + (-1)^r e^{-m\alpha}) \in V_L^+(1) + O(V_L^+).$$

Set

$$v = \begin{cases} \alpha(-n_1)\cdots\alpha(-n_r)\mathbf{1} & \text{if } r \text{ is even}, \\ \alpha(-n_1)\cdots\alpha(-n_{r-1})\mathbf{1} & \text{if } r \text{ is odd}, \end{cases}$$

and

$$w = \begin{cases} e^{m\alpha} + e^{-m\alpha} & \text{if } r \text{ is even}, \\ \alpha(-n_r)(e^{m\alpha} - e^{-m\alpha}) & \text{if } r \text{ is odd}. \end{cases}$$

Then from Lemmas 3.4 and 3.5 we see that

$$v * w = \alpha(-n_1)\alpha(-n_2)\cdots\alpha(-n_r)(e^{m\alpha} + (-1)^r e^{-m\alpha}) + u',$$

where $\ell_\alpha(u') < r$ and $u' \in V_L^+(m)$. By the induction hypothesis $w, u' \in V_L^+(1) + O(V_L^+)$. Since $v \in M(1)^+$ we have $v * w \in V_L^+(1) + O(V_L^+)$ and then $u \in V_L^+(1) + O(V_L^+)$.

4. Generators of $A(V_L^+)$

We have already proved in Sect. 3 that $A(V_L^+) = M(1)^+ + V_L^+(1) + O(V_L^+)$. The main result of this section is that $A(V_L^+)$ is generated by $\omega + O(V_L^+)$, $J + O(V_L^+)$ and $E + O(V_L^+)$. Since $M(1)^+ + O(V_L^+)$ is generated by $\omega + O(V_L^+)$ and $J + O(V_L^+)$ [DN], we establish that $V_L^+(1) + O(V_L^+)$ is generated by $\omega + O(V_L^+)$ and $E + O(V_L^+)$. Since the structure of $V_L^+(1)$ as a Virasoro module varies according to whether $k$ is a perfect square or not, we deal with these cases separately. The case that $k$ is a perfect square is more complicated. Nevertheless, the ideas and the techniques developed in [DN] still work in the present situation.

4.1. A spanning set for $V_L^+(1) + O(V_L^+)$ I: $k$ is not a perfect square. In this subsection we assume that $k$ is not a perfect square. In this case $V_L^+(1)$ is an irreducible Virasoro module which is isomorphic to $L(1,k)$ with a highest weight vector $e^\alpha + e^{-\alpha}$. For short, we set

$$v^{*s} = \underbrace{v * \cdots * v}_{s}$$

for $v \in V_L^+$. Recall that $[v] = v + O(V_L^+)$ for $v \in V_L^+$; we will use a similar notation $[v]^{*s}$. Then it is easy to see that $[v^{*s}] = [v]^{*s}$.

Lemma 4.1. Suppose that $k$ is not a perfect square. Then $V_L^+(1) + O(V_L^+)$ is spanned by

$$[S_{\omega,E}] = \{[\omega^{*s} * E] \mid s \ge 0\}.$$

Proof. In this case $V_L^+(1)$ is spanned by the vectors

$$v = L(-n_1)L(-n_2)\cdots L(-n_r)E, \qquad n_1 \ge n_2 \ge \cdots \ge n_r \ge 1.$$

So it is enough to show that $[v]$ is spanned by $[S_{\omega,E}]$. Using Proposition 2.2 (3), (4) and the relation

$$L(0)L(-n_1)\cdots L(-n_r)E = (n_1 + \cdots + n_r + k)L(-n_1)\cdots L(-n_r)E$$

one can easily show that $[v] = [P(\omega) * E]$ for some polynomial $P(x)$.

4.2. A spanning set for $V_L^+(1) + O(V_L^+)$ II: $k$ is a perfect square. In this subsection we consider the case that $k$ is a perfect square. Since $V_L^+(1)$ is an irreducible $M(1)^+$-module and $M(1)^+$ is generated by $\omega$ and $J$, one can see that $V_L^+(1)$ is spanned by

$$\{u^1_{m_1}\cdots u^k_{m_k}E \mid u^i = \omega, J,\ m_i \in \mathbb{Z}\},$$

which are not necessarily linearly independent. We say that an expression $u^1_{m_1}\cdots u^k_{m_k}E$ has length $t$ with respect to $J$, which we write $\ell_J(u^1_{m_1}\cdots u^k_{m_k}E) = t$, if $\{i \mid u^i = J\}$ has cardinality $t$. Note that $\omega_i = L(i-1)$. An induction on $\ell_J(u^1_{m_1}\cdots u^k_{m_k}E)$ using Lemma 3.1 (1) shows that $u^1_{m_1}\cdots u^k_{m_k}E$ is a linear combination of vectors of type

$$\{L(m_1)L(m_2)\cdots L(m_s)J_{n_1}J_{n_2}\cdots J_{n_t}E \mid m_a, n_b \in \mathbb{Z}\}.$$

Using the commutation relation in Lemma 3.1 and the fact that $E$ is a singular vector we can prove the following lemma.

Lemma 4.2. Let $W$ be the subspace of $V_L^+$ spanned by $J_{n_1}\cdots J_{n_t}E$ with $n_i \in \mathbb{Z}$. Then $W$ is invariant under the action of $L(m)$, $m \ge 0$.

Lemma 4.3. $V_L^+(1)$ is spanned by

$$L(-m_1)\cdots L(-m_s)J_{-n_1}\cdots J_{-n_t}E,$$

where $m_1 \ge m_2 \ge \cdots \ge m_s \ge 1$ and $n_1 \ge n_2 \ge \cdots \ge n_t \ge 1$.

Proof. We already know that $V_L^+(1)$ is spanned by $L(-m_1)\cdots L(-m_s)J_{-n_1}\cdots J_{-n_t}E$, where $m_a, n_b \in \mathbb{Z}$. Using the PBW theorem for the Virasoro algebra we can assume that $m_1 \ge \cdots \ge m_s$. By Lemma 4.2 we can further assume that $m_1 \ge m_2 \ge \cdots \ge m_s \ge 1$. We show by induction on $\ell_J(v)$ that $v = L(-m_1)\cdots L(-m_s)J_{-n_1}\cdots J_{-n_t}E$ can be spanned by the vectors indicated in the lemma. If the length is 0, it is clear. Suppose that it is true for all monomials $v$ such that $\ell_J(v) < t$. By Lemma 3.2 and the induction hypothesis we can assume $n_t \ge 1$. If $n_1 \ge \cdots \ge n_t$ we are done. Otherwise there exists $n_a$ such that $n_{a+1} \ge \cdots \ge n_t$ but $n_a < n_{a+1}$. There are two cases, $n_a \le 0$ and $n_a > 0$, which are dealt with separately. If $n_a \le 0$, then

$$
\begin{aligned}
L(-m_1)\cdots L(-m_s)J_{-n_1}\cdots J_{-n_t}E &= \sum_{j=a+1}^{t} L(-m_1)\cdots L(-m_s)J_{-n_1}\cdots \check{J}_{-n_a} \cdots [J_{-n_a}, J_{-n_j}] \cdots J_{-n_t}E \\
&\quad + L(-m_1)\cdots L(-m_s)J_{-n_1}\cdots \check{J}_{-n_a} \cdots J_{-n_t}J_{-n_a}E,
\end{aligned}
$$

where $\check{J}_{-n_a}$ means that we omit the term $J_{-n_a}$. However by Lemma 3.1 (2), the $[J_{-n_a}, J_{-n_j}]$ are linear combinations of operators of type $L(p_1)\cdots L(p_{s'})$ and $L(q_1)\cdots L(q_{t'})J_r$. By substituting these into the above and using the commutation relation in Lemma 3.1 (1) again, the first term of the right-hand side is a linear combination of monomials whose lengths with respect to $J$ are less than or equal to $t - 1$. Further, by Lemma 3.2, the second term is also a linear combination of such monomials. Thus by the induction hypothesis, this is expressed as a linear combination of the expected monomials.

If $n_a > 0$, then either $n_a < n_t$ or there exists $b$ with $t > b > a$ such that $n_b > n_a \ge n_{b+1}$. Then we have either

$$
\begin{aligned}
L(-m_1)\cdots L(-m_s)J_{-n_1}\cdots J_{-n_t}E &= \sum_{j=a+1}^{t} L(-m_1)\cdots L(-m_s)J_{-n_1}\cdots \check{J}_{-n_a} \cdots [J_{-n_a}, J_{-n_j}] \cdots J_{-n_t}E \\
&\quad + L(-m_1)\cdots L(-m_s)J_{-n_1}\cdots \check{J}_{-n_a} \cdots J_{-n_t}J_{-n_a}E
\end{aligned}
$$

or

$$
\begin{aligned}
L(-m_1)\cdots L(-m_s)J_{-n_1}\cdots J_{-n_t}E &= \sum_{j=a+1}^{b} L(-m_1)\cdots L(-m_s)J_{-n_1}\cdots \check{J}_{-n_a} \cdots [J_{-n_a}, J_{-n_j}] \cdots J_{-n_t}E \\
&\quad + L(-m_1)\cdots L(-m_s)J_{-n_1}\cdots \check{J}_{-n_a} \cdots J_{-n_b}J_{-n_a}J_{-n_{b+1}} \cdots J_{-n_t}E.
\end{aligned}
$$

From the discussion of the case $n_a \le 0$ it is enough to show that either

$$L(-m_1)\cdots L(-m_s)J_{-n_1}\cdots \check{J}_{-n_a} \cdots J_{-n_t}J_{-n_a}E$$

or

$$L(-m_1)\cdots L(-m_s)J_{-n_1}\cdots \check{J}_{-n_a} \cdots J_{-n_b}J_{-n_a}J_{-n_{b+1}} \cdots J_{-n_t}E$$

can be expressed as a linear combination of the desired vectors. But this follows from an induction on $a$.

Lemma 4.4. Assume that $k \ne 1$. Then $V_L^+(1) + O(V_L^+)$ is spanned by

$$[S_{\omega,E}] = \{[\omega^{*s} * E] \mid s \in \mathbb{Z}_{\ge 0}\}.$$

Proof. The case that $k$ is not a perfect square was already treated in Lemma 4.1, so we can assume that $k$ is a perfect square. By Lemma 4.3, it is enough to show that any

$$[v] = [L(-m_1)L(-m_2)\cdots L(-m_s)J_{-n_1}\cdots J_{-n_t}E],$$

where $m_1 \ge m_2 \ge \cdots \ge m_s \ge 1$ and $n_1 \ge n_2 \ge \cdots \ge n_t \ge 1$, is spanned by $[S_{\omega,E}]$. We prove this by induction on $\ell_J(v)$. If the length is 0, the proof of Lemma 4.1 gives the result. Let $t > 0$ and assume that the statement is true for all $v$ with $\ell_J(v) < t$. We will prove that $[v]$ is spanned by $[S_{\omega,E}]$ by induction on the weight of $v$. Clearly, the smallest weight is $4t + k$ and $v = J_{-1}\cdots J_{-1}E$. Then

$$J * \cdots * J * E - v = \sum_{\substack{n_i \in \{-1,0,1,2,3\} \\ (n_i) \ne (-1,-1,\ldots,-1)}} a_{n_1 \ldots n_t}\, J_{n_1}\cdots J_{n_t}E.$$

Since each term appearing on the right-hand side involves $J_{n_i}$ for some nonnegative integer $n_i$, by using Lemma 3.2, its length is strictly less than $t$. Thus by the induction hypothesis, the image of the right-hand side in $A(V_L^+)$ is spanned by $[S_{\omega,E}]$. So we can assume that $v = J * \cdots * J * E$. Note that

$$J * E = \sum_{j=0}^{4} \binom{4}{j} J_{j-1}E$$

and $\mathrm{wt}(J_{j-1}E) = 4 + k - j \le 4 + k$ if $j \ge 0$. Then from the decomposition of $V_L^+(1)$ (see Eq. (3.2)) we see that $J * E$ is a vector in the irreducible module for the Virasoro algebra generated by the highest weight vector $E$. The proof of Lemma 4.1 shows that $[J * E]$ is a linear combination of elements of $[S_{\omega,E}]$. Then using the fact that $[\omega]$ is a central element proves that $[v]$ is spanned by $[S_{\omega,E}]$.

Now consider a general vector $v = L(-m_1)L(-m_2)\cdots L(-m_s)J_{-n_1}\cdots J_{-n_t}E$. Suppose $m_1 > 2$. Then by using Proposition 2.2 (3) we have

$$v \sim (-1)^{m_1}\{(m_1 - 1)(L(-2) + L(-1)) + L(0)\}\, L(-m_2)\cdots L(-m_s)J_{-n_1}\cdots J_{-n_t}E,$$

which is a sum of three homogeneous vectors of weight strictly less than $\mathrm{wt}(v)$. Then by the induction hypothesis, $[v]$ is spanned by $[S_{\omega,E}]$. Thus we can assume that $m_1 \le 2$. We can further assume, by using the relation $L(-1) \sim -L(0)$, that $m_1 = m_2 = \cdots = m_s = 2$. Namely,

$$v = \underbrace{L(-2)\cdots L(-2)}_{s}\, J_{-n_1}J_{-n_2}\cdots J_{-n_t}E.$$

Then

$$v = \omega^{*s} * (J_{-n_1}\mathbf{1}) * (J_{-n_2}\cdots J_{-n_t}\mathbf{1}) * E + u,$$

where the weights of the homogeneous components of $u$ are less than $\mathrm{wt}(v)$ and the length of each homogeneous component of $u$ with respect to $J$ is less than or equal to $t$. Then again by using the induction hypothesis $[u]$ is spanned by $[S_{\omega,E}]$. It reduces to the case that

$$v = \omega^{*s} * (J_{-n_1}\mathbf{1}) * (J_{-n_2}\cdots J_{-n_t}\mathbf{1}) * E.$$

Note that

$$v \equiv (J_{-n_1}\mathbf{1}) * \omega^{*s} * (J_{-n_2}\cdots J_{-n_t}\mathbf{1}) * E = (J_{-n_1}\mathbf{1}) * \omega^{*s} * (J_{-n_2}\cdots J_{-n_t}E) + w,$$

where $w$ is a vector spanned by $[S_{\omega,E}]$. By the induction hypothesis on the length with respect to $J$, we see that $J_{-n_2}\cdots J_{-n_t}E$ is spanned by $S_{\omega,E}$ modulo $O(V_L^+)$. So we can assume that $v = \omega^{*p} * (J_{-n_1}\mathbf{1}) * E$ for some $p \ge 0$. Again by the induction hypothesis on the length of $v$ with respect to $J$ we conclude that such $[v]$ is spanned by $[S_{\omega,E}]$. This establishes the lemma.

Let us summarize the main results in this section.

Proposition 4.5. Assume that $k \ne 1$. Then Zhu's algebra $A(V_L^+)$ is spanned by

$$\{[\omega^{*s} * J^{*t}],\ [\omega^{*s} * E] \mid s, t \ge 0\}.$$

5. The Structure of $A(V_L^+)$

It is proved in Sect. 4 that the algebra $A(V_L^+)$ is generated by $[\omega]$, $[J]$ and $[E]$ if $k \ne 1$. In this section we determine the algebra structure of $A(V_L^+)$, which is a commutative semisimple algebra of dimension $k + 7$ if $k \ne 1$. This is achieved by studying the relations among $[\omega]$, $[J]$ and $[E]$. We already know two relations between $[\omega]$ and $[J]$ from [DN]. Using the known irreducible modules of $A(V_L^+)$, we obtain more relations. The classification of irreducible modules for $V_L^+$ follows immediately from the dimension of $A(V_L^+)$, as $A(V_L^+)$ has $k + 7$ known irreducible modules.

In Subsect. 5.1 we list all known irreducible modules of $V_L^+$ and give the scalars of $\omega$, $J$, $E$ on the top levels of these modules. In Subsect. 5.2 we find two relations among $[\omega]$, $[J]$ and $[E]$. Subsection 5.3 is the core of this paper, where we determine a basis of $A(V_L^+)$. Subsection 5.4 is easy but important. In this subsection we classify the irreducible modules for $V_L^+$. We assume that $k \ne 1$ in the first three subsections.

5.1. List of known irreducible modules. Here we give the list of known irreducible $V_L^+$-modules and the action of $\omega$, $E$ and $J$ on their top levels. As we mentioned before, we have the following irreducible $V_L^+$-modules:

$$V_L^+,\ V_L^-,\ V_{L+\frac{r}{2k}\alpha}\ (r = 1, 2, \ldots, k-1),\ V_{L+\frac{\alpha}{2}}^+,\ V_{L+\frac{\alpha}{2}}^-,\ V_L^{T_1,+},\ V_L^{T_1,-},\ V_L^{T_2,+},\ V_L^{T_2,-}. \tag{5.1}$$

Note that the top levels of these modules are 1-dimensional and $\omega$, $J$ and $E$ act as scalars. The following table gives the scalars, which follow from the construction of these modules (cf. [DN]):

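$$
\begin{array}{l|ccc}
\text{module} & \omega & E & J \\ \hline
V_L^+ & 0 & 0 & 0 \\
V_L^- & 1 & 0 & -6 \\
V_{L+\frac{r}{2k}\alpha}\ (1 \le r \le k-1) & r^2/4k & 0 & c^4 - c^2/2,\ \ c^2 = r^2/2k \\
V_{L+\frac{\alpha}{2}}^+ & k/4 & 1 & k^2/4 - k/4 \\
V_{L+\frac{\alpha}{2}}^- & k/4 & -1 & k^2/4 - k/4 \\
V_L^{T_1,+} & 1/16 & 2^{-2k+1} & 3/128 \\
V_L^{T_1,-} & 9/16 & -2^{-2k+1}(4k-1) & -45/128 \\
V_L^{T_2,+} & 1/16 & -2^{-2k+1} & 3/128 \\
V_L^{T_2,-} & 9/16 & 2^{-2k+1}(4k-1) & -45/128
\end{array}
$$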

5.2. The relations among $\omega$, $E$ and $J$. In this subsection we first prove the relation

$$([\omega] - k/4) * ([\omega] - 1/16) * ([\omega] - 9/16) * [E] = 0. \tag{5.2}$$

Note that $V_L^+(1) = \bigoplus_{n \ge k} V_L^+(1,n)$ is $\mathbb{Z}$-graded, where $V_L^+(1,n)$ is the weight $n$ subspace of $V_L^+(1)$. Then

$$(\omega - k/4) * (\omega - 1/16) * (\omega - 9/16) * E \in \bigoplus_{0 \le n \le k+6} V_L^+(1,n).$$

It is easy to see that $V_L^+(1, k+6)$ has the following basis:

$$
\begin{aligned}
g_1 &= \alpha(-6)F, & g_2 &= \alpha(-5)\alpha(-1)E, \\
g_3 &= \alpha(-4)\alpha(-2)E, & g_4 &= \alpha(-4)\alpha(-1)^2F, \\
g_5 &= \alpha(-3)^2E, & g_6 &= \alpha(-3)\alpha(-2)\alpha(-1)F, \\
g_7 &= \alpha(-3)\alpha(-1)^3E, & g_8 &= \alpha(-2)^3F, \\
g_9 &= \alpha(-2)^2\alpha(-1)^2E, & g_{10} &= \alpha(-2)\alpha(-1)^4F, \\
g_{11} &= \alpha(-1)^6E.
\end{aligned}
$$

In particular, $\dim V_L^+(1, k+6) = 11$. Similarly $V_L^+(1, k+5)$ has the following basis

$$
\begin{aligned}
f_1 &= \alpha(-5)F, & f_2 &= \alpha(-4)\alpha(-1)E, \\
f_3 &= \alpha(-3)\alpha(-2)E, & f_4 &= \alpha(-3)\alpha(-1)^2F, \\
f_5 &= \alpha(-2)^2\alpha(-1)F, & f_6 &= \alpha(-2)\alpha(-1)^3E, \\
f_7 &= \alpha(-1)^5F,
\end{aligned}
$$

and dimension 7. We also need the following basis of $V_L^+(1, k+3)$:

$$h_1 = \alpha(-3)F, \qquad h_2 = \alpha(-2)\alpha(-1)E, \qquad h_3 = \alpha(-1)^3F.$$

Clearly, $\dim V_L^+(1, k+3) = 3$. Set $v = \alpha(-1)^4_{-3}E$, where $\alpha(-1)^4_{-3}$ is the component operator of $Y(\alpha(-1)^4\mathbf{1}, z) = \sum_{n\in\mathbb{Z}} \alpha(-1)^4_n z^{-n-1}$.

Lemma 5.1. The vectors $L(-1)f_i$ ($i = 1, \ldots, 7$), $L(-3)h_j$ ($j = 1, 2, 3$) and $v$ form a basis of $V_L^+(1, k+6)$.

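$$
\begin{array}{l|ccccccccccc}
 & g_1 & g_2 & g_3 & g_4 & g_5 & g_6 & g_7 & g_8 & g_9 & g_{10} & g_{11} \\ \hline
L(-1)f_1 & 5 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
L(-1)f_2 & 0 & 4 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
L(-1)f_3 & 0 & 0 & 3 & 0 & 2 & 1 & 0 & 0 & 0 & 0 & 0 \\
L(-1)f_4 & 0 & 0 & 0 & 3 & 0 & 2 & 1 & 0 & 0 & 0 & 0 \\
L(-1)f_5 & 0 & 0 & 0 & 0 & 0 & 4 & 0 & 1 & 1 & 0 & 0 \\
L(-1)f_6 & 0 & 0 & 0 & 0 & 0 & 0 & 2 & 0 & 3 & 1 & 0 \\
L(-1)f_7 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 5 & 1 \\
2kL(-3)h_1 & 6k & 0 & 0 & 0 & 2k & 1 & 0 & 0 & 0 & 0 & 0 \\
2kL(-3)h_2 & 0 & 4k & 2k & 0 & 0 & 2k & 0 & 0 & 1 & 0 & 0 \\
2kL(-3)h_3 & 0 & 0 & 0 & 6k & 0 & 0 & 2k & 0 & 0 & 1 & 0 \\
v & 32k^3 & 48k^2 & 48k^2 & 24k & 24k^2 & 48k & 4 & 8k & 6 & 0 & 0
\end{array}
$$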
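The determinant of the matrix in the table above can be recomputed exactly. The sketch below transcribes the table row by row (rows are the eleven vectors, columns the coefficients of $g_1, \ldots, g_{11}$) and evaluates the determinant over the rationals for a few integer values of $k$; the function names and the choice of plain Gaussian elimination are ours.

```python
from fractions import Fraction

def matrix_A(k):
    """Rows: L(-1)f_1..f_7, 2k L(-3)h_1..h_3, v; columns: coefficients of g_1..g_11."""
    return [
        [5, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 4, 1, 1, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 3, 0, 2, 1, 0, 0, 0, 0, 0],
        [0, 0, 0, 3, 0, 2, 1, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 4, 0, 1, 1, 0, 0],
        [0, 0, 0, 0, 0, 0, 2, 0, 3, 1, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 1],
        [6*k, 0, 0, 0, 2*k, 1, 0, 0, 0, 0, 0],
        [0, 4*k, 2*k, 0, 0, 2*k, 0, 0, 1, 0, 0],
        [0, 0, 0, 6*k, 0, 0, 2*k, 0, 0, 1, 0],
        [32*k**3, 48*k**2, 48*k**2, 24*k, 24*k**2, 48*k, 4, 8*k, 6, 0, 0],
    ]

def det(rows):
    """Exact determinant by Gaussian elimination with row swaps over the rationals."""
    a = [[Fraction(x) for x in row] for row in rows]
    n, d = len(a), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)          # a zero column below the diagonal: singular
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            d = -d                      # a row swap flips the sign
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return d

for k in range(1, 5):
    print(k, det(matrix_A(k)))
```

The values agree with the closed form $6144(1-k)k^2$, so the matrix is singular exactly at $k = 1$ among positive integers.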

Proof. The main idea of the proof is to show that these vectors are linearly independent. This is done in the table above, which gives explicit expressions of these vectors in terms of $g_i$ for $i = 1, \ldots, 11$. In fact, if we denote the matrix above by $A$ then $\det A = 6144(1-k)k^2$. Thus $A$ is non-singular if $k \ne 1$.

Lemma 5.2. We have the relation

$$(\omega - k/4)(\omega - 1/16)(\omega - 9/16)E = 0 \quad \text{in } A(V_L^+),$$

where we also use $v$ to denote its image in $A(V_L^+)$ for any $v \in V_L^+$ and the product is the $*$ operation.

Proof. We first note that all vectors of the basis in Lemma 5.1 are congruent to vectors of $V_L^+(1)$ of weight less than or equal to $k + 5$. By Lemma 4.4 there exists a monic polynomial $f$ of degree 3 such that

$$f(\omega)E = 0. \tag{5.3}$$

Let us apply both sides of (5.3) to the top levels of $V_{L+\frac{\alpha}{2}}^+$, $V_L^{T_1,+}$ and $V_L^{T_1,-}$, and note that $E$ is nonzero on these top levels. We immediately have

$$f(k/4) = f(1/16) = f(9/16) = 0.$$

Since $f$ has degree 3 we get $f(x) = (x - k/4)(x - 1/16)(x - 9/16)$.

Next we study relations between $J$ and $E$. From Lemma 4.4, it is clear that there exists a polynomial $r(x)$ of degree 2 such that $J * E = r(\omega)E$. We give the explicit expression of $r(x)$ in the following lemma.

Lemma 5.3. We have

$$J * E = E * J = r(\omega)E \quad \text{in } A(V_L^+), \tag{5.4}$$

where

$$r(x) = \frac{2(32k^2 - 8k - 9)}{(4k-9)(4k-1)}x^2 + \frac{9 + 80k - 104k^2}{2(4k-1)(4k-9)}x + \frac{27k(k-1)}{8(4k-1)(4k-9)}. \tag{5.5}$$
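The coefficients in (5.5) are determined by requiring $r(\omega) = J$ on the top levels where $E \ne 0$, namely at $\omega = k/4$, $1/16$ and $9/16$ with the values of $J$ from the table in Subsect. 5.1. The sketch below recomputes the quadratic through these three points by Lagrange interpolation and compares it with the printed closed form; the function names are ours, and $k$ is assumed to be a positive integer so that the three points are distinct.

```python
from fractions import Fraction

def r_coeffs_closed(k):
    """Coefficients (a, b, c) of r(x) = a x^2 + b x + c as printed in (5.5)."""
    k = Fraction(k)
    D = (4*k - 1) * (4*k - 9)
    return (2*(32*k**2 - 8*k - 9) / D,
            (9 + 80*k - 104*k**2) / (2*D),
            27*k*(k - 1) / (8*D))

def r_coeffs_from_modules(k):
    """Quadratic through (omega, J) at the three top levels with E != 0."""
    k = Fraction(k)
    pts = [(k/4, k*k/4 - k/4),
           (Fraction(1, 16), Fraction(3, 128)),
           (Fraction(9, 16), Fraction(-45, 128))]
    a = b = c = Fraction(0)
    for i, (xi, yi) in enumerate(pts):
        (xj, _), (xl, _) = [p for j, p in enumerate(pts) if j != i]
        w = yi / ((xi - xj) * (xi - xl))   # weight of the i-th Lagrange basis polynomial
        a += w
        b -= w * (xj + xl)
        c += w * xj * xl
    return (a, b, c)

for k in (2, 3, 4):
    print(k, r_coeffs_closed(k))
```

Agreement of the two computations for a range of $k$ is equivalent to the closed form solving the three evaluation conditions, since a quadratic is determined by its values at three distinct points.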

Proof. Set $r(x) = ax^2 + bx + c$. We will evaluate $J * E = r(\omega)E$ on the top levels of the modules listed in Subsect. 5.1 on which $E \ne 0$. Namely, we calculate the values of $\omega$ and $J$ on the top levels of the modules $V_{L+\frac{\alpha}{2}}^+$, $V_L^{T_1,+}$ and $V_L^{T_1,-}$. Then we have

$$k^2a + 4kb + 16c = 4k^2 - 4k, \qquad a + 16b + 256c = 6, \qquad 81a + 144b + 256c = -90.$$

By solving this linear system, we have the desired result. The same argument also shows that $E * J = r(\omega)E$. In particular, $J$ and $E$ commute.

5.3. A basis for $A(V_L^+)$. So far, we have established the following relations:

$$J^2 = p(\omega) + q(\omega)J, \tag{B1}$$
$$(\omega - 1)(\omega - 1/16)(\omega - 9/16)(J + \omega - 4\omega^2) = 0, \tag{B2}$$
$$JE = r(\omega)E, \tag{L1}$$
$$t(\omega)E = 0, \tag{L2}$$

where

$$p(x) = \frac{1816}{35}x^4 - \frac{212}{5}x^3 + \frac{89}{10}x^2 - \frac{27}{70}x, \tag{5.6}$$

$$q(x) = -\frac{314}{35}x^2 + \frac{89}{14}x - \frac{27}{70}, \tag{5.7}$$

$$r(x) = \frac{2(32k^2 - 8k - 9)}{(4k-9)(4k-1)}x^2 + \frac{9 + 80k - 104k^2}{2(4k-1)(4k-9)}x + \frac{27k(k-1)}{8(4k-1)(4k-9)}, \tag{5.8}$$

$$t(x) = (x - k/4)(x - 1/16)(x - 9/16). \tag{5.9}$$
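Since every element of $A(V_L^+)$ acts on the one-dimensional top levels listed in Subsect. 5.1, the relations (B1), (B2), (L1), (L2) must hold for the scalars in that table. The sketch below checks this with exact rational arithmetic for a few values of $k$; it is only a consistency check of the table against (5.6)–(5.9), not a proof of the relations, and all function names are ours.

```python
from fractions import Fraction as Fr

def p(x):
    return Fr(1816, 35)*x**4 - Fr(212, 5)*x**3 + Fr(89, 10)*x**2 - Fr(27, 70)*x

def q(x):
    return -Fr(314, 35)*x**2 + Fr(89, 14)*x - Fr(27, 70)

def r(x, k):
    k = Fr(k)
    D = (4*k - 1)*(4*k - 9)
    return 2*(32*k**2 - 8*k - 9)*x**2/D + (9 + 80*k - 104*k**2)*x/(2*D) + 27*k*(k - 1)/(8*D)

def t(x, k):
    return (x - Fr(k)/4)*(x - Fr(1, 16))*(x - Fr(9, 16))

def top_levels(k):
    """Scalars (omega, E, J) on the top levels of the k + 7 modules of Subsect. 5.1."""
    k = Fr(k)
    e = Fr(2)**(1 - 2*int(k))                 # 2^{-2k+1}
    rows = [(Fr(0), Fr(0), Fr(0)), (Fr(1), Fr(0), Fr(-6))]
    for i in range(1, int(k)):                # V_{L + (i/2k) alpha}, with c^2 = i^2/2k
        c2 = Fr(i*i) / (2*k)
        rows.append((c2/2, Fr(0), c2*c2 - c2/2))
    rows += [(k/4, Fr(1), k*k/4 - k/4), (k/4, Fr(-1), k*k/4 - k/4)]
    rows += [(Fr(1, 16), e, Fr(3, 128)), (Fr(9, 16), -e*(4*k - 1), Fr(-45, 128)),
             (Fr(1, 16), -e, Fr(3, 128)), (Fr(9, 16), e*(4*k - 1), Fr(-45, 128))]
    return rows

def relations_hold(k):
    """True if (B1), (B2), (L1), (L2) hold for the scalars on every listed top level."""
    for w, E, J in top_levels(k):
        b1 = J*J == p(w) + q(w)*J
        b2 = (w - 1)*(w - Fr(1, 16))*(w - Fr(9, 16))*(J + w - 4*w*w) == 0
        l1 = J*E == r(w, k)*E
        l2 = t(w, k)*E == 0
        if not (b1 and b2 and l1 and l2):
            return False
    return True

print([relations_hold(k) for k in range(2, 7)])
```

The check also confirms the count of $2 + (k-1) + 2 + 4 = k + 7$ modules in the list.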

We remark that the relations (B1) and (B2) were found in [DN] in the algebra $A(M(1)^+)$. Since $O(M(1)^+) \subset O(V_L^+)$ these two relations are also true in $A(V_L^+)$.

Lemma 5.4.

$$E * E = \sum_{j=0}^{k} \binom{k}{j} q_{2k-j}(\alpha)\mathbf{1}.$$

Proof. From the proof of Lemma 3.7, we have

$$Y(E,z)E = \sum_{j=0}^{\infty} \left(p_j(\alpha) \otimes e^{2\alpha} + p_j(-\alpha) \otimes e^{-2\alpha}\right) z^{2k+j} + \sum_{j=0}^{\infty} q_j(\alpha)\, z^{-2k+j}.$$

Therefore, we have

$$E * E = \operatorname{Res}_z \frac{(1+z)^k}{z}\, Y(E,z)E = \sum_{j=0}^{k} \binom{k}{j} E_{j-1}E = \sum_{j=0}^{k} \binom{k}{j} q_{2k-j}(\alpha)\mathbf{1}.$$

Lemma 5.5. There exist polynomials $a(x)$ with $\deg a = k$ and $s(x)$ with $\deg s \le 2$ such that

$$E^2 = a(\omega) + s(\omega)(J + \omega - 4\omega^2). \tag{5.10}$$

Further

$$a(x) = a_0\, x\left(x - \frac{1}{4k}\right)\left(x - \frac{4}{4k}\right)\cdots\left(x - \frac{(k-1)^2}{4k}\right),$$

where $a_0 = 2(4k)^k/(2k)!$.

Proof. Since $E * E \in M(1)^+$ and the highest weight of its homogeneous components is $2k$, we can write $E^2$ as a linear combination of $\omega^{*s} * J^{*t}$ for $s, t \ge 0$ such that $2s + 4t \le 2k$ (see [DN]). The existence of $a(x)$ and $s(x)$ follows from the relations (B1)–(B2). Clearly, the degrees of $a(x)$ and $s(x)$ are less than or equal to $k$ and $k - 2$, respectively. Using (B2) we can assume that the degree of $s(x)$ is less than or equal to 2. Let us apply both sides of (5.10) to the top levels of the modules

$$V_L^+,\ V_{L+\frac{1}{2k}\alpha},\ \ldots,\ V_{L+\frac{k-1}{2k}\alpha}.$$

Since both $E$ and $J + \omega - 4\omega^2$ act trivially on these top levels, we see that $a(\omega)$ also acts trivially on them. This implies

$$a(0) = a\left(\frac{1}{4k}\right) = \cdots = a\left(\frac{(k-1)^2}{4k}\right) = 0.$$

Since $\deg a \le k$, we find

$$a(x) = a_0\, x\left(x - \frac{1}{4k}\right)\left(x - \frac{4}{4k}\right)\cdots\left(x - \frac{(k-1)^2}{4k}\right)$$

for some $a_0 \in \mathbb{C}$. Note that $J + \omega - 4\omega^2$ acts trivially and $E = 1$ on the top level of $V_{L+\frac{\alpha}{2}}^+$; hence $a(k/4) = 1$, which implies $a_0 = 2(4k)^k/(2k)!$.
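The normalization $a_0 = 2(4k)^k/(2k)!$ can be checked directly: with this value, $a(k/4)$ should equal 1 and $a$ should vanish at each $r^2/4k$ for $0 \le r \le k-1$. The following sketch does this with exact fractions; the function name `a_poly` is ours.

```python
from fractions import Fraction
from math import factorial

def a_poly(x, k):
    """a(x) = a0 * prod_{r=0}^{k-1} (x - r^2/(4k)) with a0 = 2 (4k)^k / (2k)!."""
    value = Fraction(2 * (4*k)**k, factorial(2*k))
    for r in range(k):
        value *= Fraction(x) - Fraction(r*r, 4*k)
    return value

for k in range(1, 6):
    print(k, a_poly(Fraction(k, 4), k))
```

The identity behind the check is $\prod_{r=0}^{k-1}(k^2 - r^2) = (2k)!/2$, which gives $a(k/4) = a_0 (4k)^{-k} (2k)!/2$.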

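Set

$$\varphi(x) = (x - 1)\left(x - \frac{1}{16}\right)\left(x - \frac{9}{16}\right)\left(x - \frac{k}{4}\right)a(x).$$

Lemma 5.6. We have $\varphi(\omega) = 0$.

Proof. From (B2),

$$(\omega - 1)\left(\omega - \frac{1}{16}\right)\left(\omega - \frac{9}{16}\right)(J + \omega - 4\omega^2) = 0,$$

we see from Lemma 5.5 that

$$(\omega - 1)\left(\omega - \frac{1}{16}\right)\left(\omega - \frac{9}{16}\right)E^2 = (\omega - 1)\left(\omega - \frac{1}{16}\right)\left(\omega - \frac{9}{16}\right)a(\omega).$$

On the other hand, (L2) tells us

$$\left(\omega - \frac{k}{4}\right)\left(\omega - \frac{1}{16}\right)\left(\omega - \frac{9}{16}\right)E^2 = 0, \tag{5.11}$$

and therefore $\varphi(\omega) = 0$.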

Since $JE = r(\omega)E$, we see

$$0 = (J - r(\omega))E^2 = (J - r(\omega))a(\omega) + (J - r(\omega))(J + \omega - 4\omega^2)s(\omega),$$

and therefore

$$J^2s(\omega) + \{a(\omega) + (\omega - 4\omega^2 - r(\omega))s(\omega)\}J - r(\omega)a(\omega) - r(\omega)(\omega - 4\omega^2)s(\omega) = 0.$$

By using the relation $J^2 = p(\omega) + q(\omega)J$, this is reduced to

$$\{a(\omega) + (q(\omega) - r(\omega) + \omega - 4\omega^2)s(\omega)\}J - r(\omega)a(\omega) + \left(p(\omega) - r(\omega)(\omega - 4\omega^2)\right)s(\omega) = 0. \tag{5.12}$$

For convenience we introduce

$$b(x) = a(x) + (q(x) - r(x) + x - 4x^2)s(x). \tag{5.13}$$

Lemma 5.7.

$$b(1) = a(1) + \frac{27(-12 + 65k - 33k^2)}{8(9 - 40k + 16k^2)}s(1), \qquad b\left(\frac{1}{16}\right) = a\left(\frac{1}{16}\right), \qquad b\left(\frac{9}{16}\right) = a\left(\frac{9}{16}\right).$$

Proof. A straightforward calculation shows that

$$q(x) - r(x) + x - 4x^2 = \frac{9(-12 + 65k - 33k^2)(16x - 1)(16x - 9)}{280(4k-1)(4k-9)}.$$

The lemma follows.
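The "straightforward calculation" in the proof of Lemma 5.7 can be spot-checked numerically. Both sides of the identity are quadratics in $x$, so agreement at three or more values of $x$ for a given $k$ confirms it for that $k$. The sketch below uses the expressions (5.7) and (5.8) with exact fractions; the function names are ours.

```python
from fractions import Fraction as Fr

def q(x):
    return -Fr(314, 35)*x**2 + Fr(89, 14)*x - Fr(27, 70)

def r(x, k):
    k = Fr(k)
    D = (4*k - 1)*(4*k - 9)
    return 2*(32*k**2 - 8*k - 9)*x**2/D + (9 + 80*k - 104*k**2)*x/(2*D) + 27*k*(k - 1)/(8*D)

def lhs(x, k):
    """q(x) - r(x) + x - 4x^2, the factor multiplying s(x) in (5.13)."""
    x = Fr(x)
    return q(x) - r(x, k) + x - 4*x*x

def rhs(x, k):
    """Factored form claimed in the proof of Lemma 5.7."""
    x, k = Fr(x), Fr(k)
    return 9*(-12 + 65*k - 33*k*k)*(16*x - 1)*(16*x - 9) / (280*(4*k - 1)*(4*k - 9))

def coeff_at_one(k):
    """Coefficient of s(1) in b(1) as stated in Lemma 5.7."""
    k = Fr(k)
    return 27*(-12 + 65*k - 33*k*k) / (8*(9 - 40*k + 16*k*k))

print(lhs(1, 2), coeff_at_one(2))
```

Because the factored form vanishes at $x = 1/16$ and $x = 9/16$, the equalities $b(1/16) = a(1/16)$ and $b(9/16) = a(9/16)$ follow at once from (5.13).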

Lemma 5.8. (1) If $k$ is not a perfect square, then

$$b(1) \ne 0, \qquad b\left(\frac{1}{16}\right) \ne 0, \qquad b\left(\frac{9}{16}\right) \ne 0.$$

(2) If $k = 4m^2$ for some positive integer $m$, then

$$b(1) = b\left(\frac{1}{16}\right) = b\left(\frac{9}{16}\right) = 0.$$

(3) If $k = (2m+1)^2$ for some positive integer $m$, then

$$b(1) = 0, \qquad b\left(\frac{1}{16}\right) \ne 0, \qquad b\left(\frac{9}{16}\right) \ne 0.$$

Proof. (1) Recall that $a(x) = a_0\prod_{r=0}^{k-1}\left(x - \frac{r^2}{4k}\right)$. If $b(1/16) = a(1/16) = 0$, then there exists $i$ ($1 \le i \le k-1$) such that $\frac{1}{16} = \frac{i^2}{4k}$, namely, $k = 4i^2$. This is a contradiction. For exactly the same reason, we know $b(9/16) = a(9/16) \ne 0$. It remains to show $b(1) \ne 0$. Let us evaluate the relation $E^2 = a(\omega) + s(\omega)(J + \omega - 4\omega^2)$ on the top level of the module $V_L^-$. Then we have $a(1) = 9s(1)$. Using Lemma 5.7 gives

$$b(1) = \frac{(k-4)(29k-9)}{8(4k-1)(4k-9)}\, a(1). \tag{5.14}$$

Since $k$ is not a perfect square, $a(1) \ne 0$ (a root $r^2/4k = 1$ would force $k = (r/2)^2$) and $k \ne 4$, so $b(1) \ne 0$.

(2) Since

$$\frac{m^2}{4k} = \frac{1}{16} \qquad \text{and} \qquad \frac{9m^2}{4k} = \frac{9}{16},$$

we have $a\left(\frac{1}{16}\right) = a\left(\frac{9}{16}\right) = 0$. By Lemma 5.7,

$$b\left(\frac{1}{16}\right) = b\left(\frac{9}{16}\right) = 0.$$

Next we assert that $b(1) = 0$. If $k = 4$ this is immediate from (5.14). If $k > 4$ then $m > 1$. So $4m \le k - 1$. Since $\frac{16m^2}{4k} = 1$ we again have $a(1) = 0$ and $b(1) = 0$ by using (5.14). The proof of (3) is similar to that of (2).

Proposition 5.9. If $k$ is not a perfect square then $\{1, \omega, \omega^2, \dots, \omega^{k+3}, E, \omega E, \omega^2 E\}$ is a basis of $A(V_L^+)$. In particular, $\dim_{\mathbb{C}} A(V_L^+) = k + 7$.

Proof. Note that (B2) can be written as
\[
(\omega-1)\left(\omega-\tfrac{1}{16}\right)\left(\omega-\tfrac{9}{16}\right)J = (\omega-1)\left(\omega-\tfrac{1}{16}\right)\left(\omega-\tfrac{9}{16}\right)(4\omega^2 - \omega).
\]
From (5.12)–(5.13) we see that
\[
b(\omega)J = r(\omega)a(\omega) + \{r(\omega)(\omega - 4\omega^2) - p(\omega)\}s(\omega).
\]
Since $k$ is not a perfect square it follows from Lemma 5.8 (1) that $(x-1)(x-1/16)(x-9/16)$ and $b(x)$ are coprime. Then there exist polynomials $\alpha(x)$ and $\beta(x)$ such that
\[
\alpha(x)(x-1)\left(x-\tfrac{1}{16}\right)\left(x-\tfrac{9}{16}\right) + \beta(x)b(x) = 1.
\]
Thus
\[
J = \alpha(\omega)(\omega-1)\left(\omega-\tfrac{1}{16}\right)\left(\omega-\tfrac{9}{16}\right)(4\omega^2 - \omega) + \beta(\omega)\{r(\omega)a(\omega) + (r(\omega)(\omega - 4\omega^2) - p(\omega))s(\omega)\}.
\]
This shows that $A(V_L^+)$ is spanned by $\{\omega^i,\ i \in \mathbb{Z}_{\ge 0},\ E, \omega E, \omega^2 E\}$. Lemma 5.6 then implies that $A(V_L^+)$ is spanned by $\{1, \omega, \omega^2, \dots, \omega^{k+3}, E, \omega E, \omega^2 E\}$ since $\deg\varphi = k + 4$. Finally linear independence of these vectors is clear because $A(V_L^+)$ has $k + 7$ inequivalent irreducible modules which are the top levels of the known irreducible modules for $V_L^+$.

Proposition 5.10. If $k = 4m^2$ for some positive integer $m$, then the following set is a basis of $A(V_L^+)$:
\[
\{1, \omega, \dots, \omega^k, J, \omega J, \omega^2 J, E, \omega E, \omega^2 E\}.
\]
In particular, $\dim_{\mathbb{C}} A(V_L^+) = k + 7$.


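Proof. By Lemma 5.8 (2), $b(x)$ has a factor $(x-1)(x-1/16)(x-9/16)$ and we can write
\[
b(x) = (x-1)\left(x-\tfrac{1}{16}\right)\left(x-\tfrac{9}{16}\right)c(x),
\]
where $c(x)$ is some polynomial. Even in this case, we still have two relations:
\[
(\omega-1)\left(\omega-\tfrac{1}{16}\right)\left(\omega-\tfrac{9}{16}\right)J = (\omega-1)\left(\omega-\tfrac{1}{16}\right)\left(\omega-\tfrac{9}{16}\right)(4\omega^2 - \omega),
\]
\[
(\omega-1)\left(\omega-\tfrac{1}{16}\right)\left(\omega-\tfrac{9}{16}\right)c(\omega)J = r(\omega)a(\omega) + \{r(\omega)(\omega - 4\omega^2) - p(\omega)\}s(\omega)
\]
(see (B2) and (5.12)–(5.13)). By eliminating $J$ we obtain
\[
b(\omega)(4\omega^2 - \omega) = r(\omega)a(\omega) + \{r(\omega)(\omega - 4\omega^2) - p(\omega)\}s(\omega).
\]
Using the definition of $b(x)$ we see that
\[
\{r(\omega) - 4\omega^2 + \omega\}a(\omega) + \{r(\omega)(\omega - 4\omega^2) - p(\omega) - (q(\omega) - r(\omega) + \omega - 4\omega^2)(4\omega^2 - \omega)\}s(\omega) = 0
\]
or that
\[
(r(\omega) - 4\omega^2 + \omega)a(\omega) + \{-p(\omega) - q(\omega)(4\omega^2 - \omega) + (4\omega^2 - \omega)^2\}s(\omega) = 0.
\]
A direct calculation gives
\[
-p(x) - q(x)(4x^2 - x) + (4x^2 - x)^2 = 0 \quad\text{and}\quad r(x) - 4x^2 + x = \frac{9(4x - k)((32k - 12)x - 3k + 3)}{8(4k - 1)(4k - 9)}. \tag{5.15}
\]
Thus
\[
\left(\omega - \tfrac{k}{4}\right)\left(\omega - \tfrac{3k-3}{32k-12}\right)a(\omega) = 0. \tag{5.16}
\]
Recall the definition of $\varphi(x)$ from (5.11). Suppose that $\frac{3k-3}{32k-12}$ is not a root of $\varphi(x)$. Then $x - \frac{3k-3}{32k-12}$ and $\varphi(x)$ are relatively prime and there exist polynomials $f(x)$ and $g(x)$ such that $\left(x - \frac{3k-3}{32k-12}\right)f(x) + \varphi(x)g(x) = 1$. So
\[
\left(x - \tfrac{k}{4}\right)a(x) = \left(x - \tfrac{k}{4}\right)\left(x - \tfrac{3k-3}{32k-12}\right)a(x)f(x) + \left(x - \tfrac{k}{4}\right)a(x)\varphi(x)g(x).
\]
By Lemma 5.6
\[
\left(\omega - \tfrac{k}{4}\right)a(\omega) = 0.
\]
So $\omega^{k+1}$ is a linear combination of $\omega^i$ for $i \le k$. We then use the relations (B1), (B2), (L1) and (L2) to conclude that $\{1, \omega, \dots, \omega^k, J, \omega J, \omega^2 J, E, \omega E, \omega^2 E\}$ is a spanning set for $A(V_L^+)$. Since $A(V_L^+)$ has $k + 7$ known inequivalent irreducible modules already we see immediately that this spanning set is a basis.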

It remains to prove that $\frac{3k-3}{32k-12}$ is not a root of $\varphi(x)$. Note that the roots of $\varphi(x)$ are
\[
1, \quad \frac{1}{16}, \quad \frac{9}{16}, \quad \frac{k}{4}, \quad \frac{i^2}{4k}, \quad i = 0, \dots, k-1.
\]
It is easy to see that $\frac{3k-3}{32k-12} \ne 0, 1, \frac{1}{16}, \frac{9}{16}, \frac{k}{4}$. Suppose that
\[
\frac{3k-3}{32k-12} = \frac{i^2}{4k} \qquad\text{or}\qquad \frac{12m^2 - 3}{128m^2 - 12} = \frac{i^2}{16m^2}
\]
for some $1 \le i \le k-1$. Then
\[
i^2 = \frac{12m^2(4m^2 - 1)}{32m^2 - 3}.
\]
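The integrality obstruction behind this step can be checked by brute force for small $m$. This is only a sanity check under the assumption $k = 4m^2$ of Proposition 5.10, not a substitute for the divisibility argument; the helper names are hypothetical.

```python
from fractions import Fraction


def quotient_is_integer(m):
    """True if 12 m^2 (4 m^2 - 1) / (32 m^2 - 3) is an integer."""
    return (12 * m * m * (4 * m * m - 1)) % (32 * m * m - 3) == 0


def collides_with_root(m):
    """True if (3k-3)/(32k-12) equals i^2/(4k) for some 1 <= i <= k-1, with k = 4 m^2."""
    k = 4 * m * m
    target = Fraction(3 * k - 3, 32 * k - 12)
    return any(Fraction(i * i, 4 * k) == target for i in range(1, k))


def residue_bound_holds(m):
    """True if 0 < 15 m^2 < 32 m^2 - 3, so 32 m^2 - 3 cannot divide 15 m^2."""
    return 0 < 15 * m * m < 32 * m * m - 3
```

The last helper encodes why the divisibility is impossible: a positive integer cannot be divisible by a strictly larger one.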

Let $d$ be the greatest common divisor of $12m^2(4m^2 - 1)$ and $32m^2 - 3$; then $d$ divides $-24m^2(4m^2 - 1) + 3m^2(32m^2 - 3)$, that is, $15m^2$. Since $\frac{12m^2(4m^2 - 1)}{32m^2 - 3}$ is an integer we see that $d = 32m^2 - 3$ and $32m^2 - 3$ divides $15m^2$. This is impossible for any positive integer $m$, because $0 < 15m^2 < 32m^2 - 3$. So we have a contradiction.

Proposition 5.11. If $k = (2m+1)^2$ for some positive integer $m$, then the following set is a basis of $A(V_L^+)$:
\[
\{1, \omega, \dots, \omega^{k+2}, J, E, \omega E, \omega^2 E\}.
\]
In particular, $\dim_{\mathbb{C}} A(V_L^+) = k + 7$.

Proof. By Lemma 5.8 (3) we can write $b(x) = (x-1)c(x)$ such that the polynomials $c(x)$ and $(x - 1/16)(x - 9/16)$ are coprime. Use the following relations:
\[
\begin{aligned}
(\omega-1)\left(\omega-\tfrac{1}{16}\right)\left(\omega-\tfrac{9}{16}\right)J &= (\omega-1)\left(\omega-\tfrac{1}{16}\right)\left(\omega-\tfrac{9}{16}\right)(4\omega^2 - \omega),\\
(\omega-1)c(\omega)J &= r(\omega)a(\omega) + \{r(\omega)(\omega - 4\omega^2) - p(\omega)\}s(\omega)
\end{aligned} \tag{5.17}
\]
(see (B2) and (5.12)–(5.13)) to eliminate $J$ and to obtain
\[
\left(\omega-\tfrac{1}{16}\right)\left(\omega-\tfrac{9}{16}\right)\left(\omega-\tfrac{k}{4}\right)\left(\omega-\tfrac{3k-3}{32k-12}\right)a(\omega) = 0.
\]
Again one can show that $\frac{3k-3}{32k-12}$ is not a root of $\varphi(x)$ as in the proof of Proposition 5.10. Thus we have
\[
\left(\omega-\tfrac{1}{16}\right)\left(\omega-\tfrac{9}{16}\right)\left(\omega-\tfrac{k}{4}\right)a(\omega) = 0,
\]
and $\omega^{k+3}$ is a linear combination of the $\omega^i$ for $i = 0, \dots, k+2$. Let the polynomials $\alpha(x)$ and $\beta(x)$ satisfy $\alpha(x)c(x) + \beta(x)(x - 1/16)(x - 9/16) = 1$.


Combining this with relations (5.17) gives
\[
(\omega-1)J = \beta(\omega)(\omega-1)\left(\omega-\tfrac{1}{16}\right)\left(\omega-\tfrac{9}{16}\right)(4\omega^2 - \omega) + \alpha(\omega)\{r(\omega)a(\omega) + (r(\omega)(\omega - 4\omega^2) - p(\omega))s(\omega)\}.
\]
Thus $A(V_L^+)$ is spanned by $\{\omega^i,\ i = 0, \dots, k+2,\ J, E, \omega E, \omega^2 E\}$. Again the known simple modules for $A(V_L^+)$ imply that this in fact is a basis of $A(V_L^+)$.

Remark 5.12. We can determine all relations in $A(V_L^+)$. From the proofs of Propositions 5.9–5.11 it is enough to give the exact expression of $s(x)$ in (5.10). Since $s(x)$ is a polynomial of degree less than or equal to 2 there are at most 3 coefficients to be determined. This can be done by evaluating (5.10) on the top levels of $V_L^-$, $V_L^{T_1,+}$ and $V_L^{T_1,-}$. We leave the details to the reader.

5.4. Classification of irreducible modules. Recall the known irreducible modules for $V_L^+$ from Sect. 2. We finally have the following classification result:

Theorem 5.13. Let $L = \mathbb{Z}\alpha$ be an even positive definite lattice of rank 1 such that $\alpha$ has square length $2k$. Then
\[
\{V_L^{\pm},\ V_{L+\alpha/2}^{\pm},\ V_L^{T_i,\pm},\ V_{L+r\alpha/2k} \mid i = 1, 2,\ r = 1, \dots, k-1\}
\]
gives a complete list of irreducible modules for $V_L^+$. Moreover, any admissible irreducible $V_L^+$-module is an ordinary module.

Proof. If $k \ne 1$ then $A(V_L^+)$ is a commutative algebra of dimension $k + 7$ (see Sect. 5). So $A(V_L^+)$ has at most $k + 7$ simple modules. Since $A(V_L^+)$ has $k + 7$ known inequivalent simple modules already we conclude that $A(V_L^+)$ is a semisimple algebra of dimension $k + 7$. Using the one to one correspondence result (see Theorem 2.1) we see that $V_L^+$ has exactly $k + 7$ inequivalent irreducible admissible modules which are ordinary modules. If $k = 1$, $V_L^+$ is isomorphic to the lattice vertex operator algebra $V_{L'}$, where $L'$ is a rank one positive definite lattice spanned by $\beta$ whose square length is 8 (see [DG]); it then follows from a result in [D1] that $V_L^+$ has exactly 8 irreducible modules. Since $V_{L'}$ is rational [DLM1] every irreducible admissible module is an ordinary module.

Remark 5.14. If $k = 2$ the vertex operator algebra $V_L^+$ is isomorphic to $L(1/2, 0) \otimes L(1/2, 0)$ (see Lemma 3.1 of [DGH]) where $L(1/2, h)$ is the irreducible highest weight module for the Virasoro algebra with central charge $1/2$ and highest weight $h$. So the classification of irreducible modules in this case also follows from the classification of irreducible modules for the vertex operator algebra $L(1/2, 0)$. One can easily see that all the irreducible modules for $L(1/2, 0) \otimes L(1/2, 0)$ are $L(1/2, h_1) \otimes L(1/2, h_2)$, where $h_i = 0, 1/16, 1/2$. One can find in [DGH] the identification of these modules with the modules for $V_L^+$ listed in the theorem.

References

[BFKNRV] Blumenhagen, R., Flohr, M., Kliem, A., Nahm, W., Recknagel, A. and Varnhagen, R.: W-algebras with two and three generators. Nucl. Phys. B361, 255–289 (1991)
[B] Borcherds, R.: Vertex algebras, Kac–Moody algebras, and the Monster. Proc. Natl. Acad. Sci. USA 83, 3068–3071 (1986)
[DVVV] Dijkgraaf, R., Vafa, C., Verlinde, E. and Verlinde, H.: The operator algebra of orbifold models. Commun. Math. Phys. 123, 485–526 (1989)
[D1] Dong, C.: Vertex algebras associated with even lattices. J. Algebra 160, 245–265 (1993)
[D2] Dong, C.: Twisted modules for vertex algebras associated with even lattices. J. Algebra 165, 90–112 (1994)
[D3] Dong, C.: Representations of the moonshine module vertex operator algebra. Contemporary Math. 175, 27–36 (1994)
[DG] Dong, C. and Griess, R.L., Jr.: Rank one lattice type vertex operator algebras and their automorphism groups. J. Algebra 208, 262–275 (1998)
[DGH] Dong, C., Griess, R.L., Jr. and Hoehn, G.: Framed vertex operator algebras, codes and the moonshine module. Commun. Math. Phys. 193, 407–448 (1998)
[DL1] Dong, C. and Lepowsky, J.: Generalized Vertex Algebras and Relative Vertex Operators. Progress in Math. Vol. 112, Boston: Birkhäuser, 1993
[DL2] Dong, C. and Lepowsky, J.: The algebraic structure of relative twisted vertex operators. J. Pure and Applied Algebra 110, 259–295 (1996)
[DLM1] Dong, C., Li, H. and Mason, G.: Regularity of rational vertex operator algebras. Adv. in Math. 132, 148–166 (1997)
[DLM2] Dong, C., Li, H. and Mason, G.: Twisted representation of vertex operator algebras. Math. Ann. 310, 571–600 (1998)
[DM] Dong, C. and Mason, G.: On quantum Galois theory. Duke Math. J. 86, 305–321 (1997)
[DLi] Dong, C. and Lin, Z.: Induced modules for vertex operator algebras. Commun. Math. Phys. 179, 157–184 (1996)
[DN] Dong, C., Nagatomo, K.: Classification of irreducible modules for vertex operator algebra $M(1)^+$. J. Algebra, to appear, math.QA/9806051
[FHL] Frenkel, I.B., Huang, Y. and Lepowsky, J.: On axiomatic approaches to vertex operator algebras and modules. Mem. AMS 104 (1993)
[FLM1] Frenkel, I.B., Lepowsky, J. and Meurman, A.: A natural representation of the Fischer–Griess Monster with the modular function J as character. Proc. Natl. Acad. Sci. USA 81, 3256–3260 (1984)
[FLM2] Frenkel, I.B., Lepowsky, J. and Meurman, A.: Vertex operator algebras and the Monster. London–New York: Academic Press, 1988
[Z] Zhu, Y.: Modular invariance of characters of vertex operator algebras. J. AMS 9, 237–301 (1996)

Communicated by T. Miwa

Commun. Math. Phys. 202, 197 – 236 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Separatrix Splitting for Systems with Three Time Scales*

G. Gallavotti1, G. Gentile2, V. Mastropietro3

1 Dipartimento di Fisica, Università di Roma 1, P. le Moro 2, 00185 Roma, Italy. E-mail: [email protected]
2 Dipartimento di Matematica, Università di Roma 3, Largo S. Leonardo Murialdo 1, 00146 Roma, Italy. E-mail: [email protected]
3 Dipartimento di Matematica, Università di Roma 2, Viale Ricerca Scientifica, 00133 Roma, Italy. E-mail: [email protected]

* Paper archived in mp [email protected] 97-472 and [email protected] 9709004

Received: 10 November 1997 / Accepted: 25 September 1998

Abstract: An exact expression for the determinant of the splitting matrix is derived for three degrees of freedom systems with three time scales: it allows us to analyze the asymptotic behaviour needed to amend the large angles theorem proposed in Ann. Inst. H. Poincaré, B-60, 1 (1994). The asymptotic validity of Mel’nikov’s integrals is proved for the class of models considered, which are polynomial perturbations. The technique for exhibiting cancellations is inspired by renormalization theory in quantum electrodynamics and uses an analogue of Dyson’s equations to prove an infinite family of identities, due to symmetries, that remind us of Ward’s identities.

1. Introduction

Recently V. Gelfreich noted that a “theorem” in [CG] contains an error. The theorem gave a lower bound on the splitting angles in a three degrees of freedom system and it was needed to show the existence of heteroclinic chains in a class of Hamiltonian systems with the aim of an application to a Celestial Mechanics problem. We correct it here by providing the correct lower bound and, at the same time, exposing again and in a more meditated form some ideas of [CG].

In the present paper we do not discuss the existence of Arnol’d diffusion, [A2], in our systems. We do not discuss the Celestial Mechanics application of [CG] either, as two parts of it (see below) relied on the erroneous statement, and more work is needed. The present paper is, therefore, not a correction of the implications of the error in [CG] but only of the error itself. In order to derive the same implications further work is necessary as the erroneous result was used several times in the last three sections of [CG]. Each use has therefore to be treated separately.

Here we shall consider a pendulum subject both to a slow periodic force and to a rapid periodic force (incommensurate to the former) that we imagine generated by a pair of rotators, whose positions are given by two angles $\alpha, \lambda$, while the pendulum position is given by an angle $\varphi$ (the angles $\alpha, \lambda$ will be respectively called slow and fast). So, if $(I, A, B)$ are the conjugate momenta of the angles $(\varphi, \alpha, \lambda)$, the systems will be described by Hamiltonians of the form:
\[
H = \eta^{a} A + \frac{1}{\eta^{1/2}} B + \eta^{2a}\frac{A^2}{2} + \frac{I^2}{2} + g^2(\cos\varphi - 1) + \varepsilon f(\varphi, \alpha, \lambda), \tag{1.1}
\]

where $g$ and $\eta$ are constants, $\varepsilon$ is the perturbative parameter and the function $f$ is an even trigonometric polynomial in its arguments. The Hamiltonian in which the isochrony breaking term $\frac{1}{2}\eta^{2a}A^2$ is missing will also be studied, essentially as a preliminary approach to the more interesting case (1.1). Our analysis deals mainly with systems with three different time scales with the ratio between the smallest to the largest being 1: the largest is $\eta^{-1/2}$, the intermediate is $g = O(1)$ and the smallest is $\eta^{a}$, $a \ge 0$ (the $a = 0$ case being a limiting two scales problem).

When the system is perturbed some of the quasi periodic motions performed by rotators (or clocks) while the pendulum is in its unstable equilibrium persist. They become motions in which the pendulum is slightly moving quasi periodically around its unstable position (“without ever falling down”): one has therefore an invariant 2-dimensional torus in phase space. For each position $\boldsymbol{\alpha} \overset{def}{=} (\alpha, \lambda)$ of the rotators there are well defined values of the pendulum variables $(I, \varphi)$ as well as of the rotators actions $\mathbf{A} \overset{def}{=} (A, B)$. And $(\alpha, \lambda)$ evolve quasi–periodically, i.e. $(\alpha, \lambda)$ can be written as $\alpha = \psi_1 + \Delta_1(\psi_1, \psi_2)$, $\lambda = \psi_2 + \Delta_2(\psi_1, \psi_2)$ and the time evolution is simply linear $\psi_j \to \psi_j + \omega_j t$, $j = 1, 2$, for a suitable $\boldsymbol{\omega} = (\omega_1, \omega_2)$.

Obviously the above quasi periodic motion is unstable and perturbations will make the pendulum “fall”. The 2-dimensional invariant tori have therefore stable and unstable manifolds, which coincide in absence of perturbations (because they correspond to the separatrix motions and the pendulum separatrix is degenerate). However under perturbations they “split” (i.e. become distinct 3-dimensional surfaces). Often, simply by symmetry, there is one trajectory that lies on both manifolds. Physically it corresponds to a motion that consists of just one swing of the pendulum from nearby the unstable position back to it.
If we observe the swinging trajectory at the moment it passes through the stable equilibrium position, ϕ = π, we see just a point that can be identified by the value α 0 , at that moment, of the α coordinates (where the two manifolds meet). If we move away from that point the A coordinates on the two manifolds become different, at the same α , and their difference is a “splitting” vector Q ( α ). The 2 × 2 matrix D = ∂ α Q ( α 0 ) is called the intersection matrix. The homoclinic angles can be defined as the angles whose tangents are the eigenvalues of the intersection matrix. The splitting is usually defined as det D; see Appendix R1. When the perturbing frequencies are held fixed and the perturbation is sent to 0 there is a well known asymptotic expression for the splitting, called the Mel’nikov integral and coinciding with the “first order perturbation theory result”. The splitting problem is to find under which conditions the Mel’nikov integral holds when both the perturbation and the shortest forcing period are sent to 0. The main result of the paper will be called the large angles theorem (Theorem 2 in Sect. 5, proved in Sects. 6 and 7 for the systems defined by (1.1) without and with the anisochronous term; see (2.1), (7.1) below). The motivation for the name is that, even though the splitting is found to be exponentially small in the parameter η, the gaps


between the tori still have size which is narrower than the splitting; see remark (1) after Theorem 2 in Sect. 5. Informally the result is (see Theorem 2 in Sect. 5 for a formal statement): Suppose that one of the forcing frequencies is very large, say η −1/2 times the pendulum frequency, with η very small, and suppose also that the second characteristic frequency is relatively very small, say η a times that of the pendulum, with a ≥ 0; see the Hamiltonian (1.1). Then there are perturbations of size ε = O(η c ) (for some c > 0) such that the separatrix splitting has size given asymptotically by the Mel’nikov integral, i.e. it is given by γη −b exp[−(π/2)η −1/2 ], with γ 6 = 0, for some positive constant b, provided the frequencies of the motion verify a suitable Diophantine condition. In fact this property holds generically, under the same conditions, for perturbations of size ε = O(η c ) which are trigonometric polynomials. There are many examples of systems for which the above property does not hold; remarkably the correct answer was, in the early cases, found in experimental (i.e. numerical) works, (see [S] and [BCF] for the experimental part and, for the theory, see [DGJS] and the related later paper [RW]). But the ratios between the various forcing frequencies are different in the mentioned papers. Obtaining the correct lower bound estimate on the splitting makes an analysis to all orders necessary: the analysis, performed here, becomes “marginal” but the final result on the existence of homoclinic splitting (i.e. a lower bound on it) remains valid (see Appendix R2). Therefore the present paper corrects the error in Sect. 10 of [CG] as far as its implications on the size of the splitting are concerned. The techniques we use here to bypass perturbation theory were started in Appendix A13 of [CG]: they were not pushed too far because the error made any developments unnecessary for the purposes of [CG]. 
The techniques were subsequently developed in [G3] (which does not repeat the computational error). The methods of [G3] were not developed to treat three time scales problems: the aim there being to study the smallness of the splitting in two time scales problems (i.e. in systems not considered here with all rotators with comparably large velocity, a = − 21 ). But they can be easily extended to three time scales problems and even lead to a remarkable nonperturbative and exact computation of the leading order (exponentially small) of the intersection matrix, see (6.12), (7.19): V. Gelfreich stressed the necessity of a nonperturbative analysis by a simple cogent argument. The reason why we study three time scales systems is simply that they arise in a Celestial Mechanics problem of interest to us, see § 12 of [CG]. But our results hold also if the slow time scale is the same as that of the pendulum, i.e. the parameter a above (see also (2.1)) is a = 0, see also comments in Sect. 9: this is also an interesting case and it has been considered, to some extent, in [DGJS] and in subsequent papers. The recent works [DGJS, RW] considerably overlap with ours: even when a = 0 (a two time scales problem) it cannot be used to achieve all our results about the upper bound on the splitting because our assumptions can violate Eq. (15) of [RW] (i.e. the bound on the constant called b in [RW]; see (2.2), below, for the isochronous case and (7.3) for the anisochronous). We also require the perturbations to be trigonometric polynomials, while the examples of [DGJS, RW] require, as an essential assumption to obtain the lower bound for systems in which all rotators are equally fast, the perturbation to have infinitely many non vanishing modes in the rotators angles, although still finitely many in the pendulum angle variable; see Eqs. (16) and (19) of [RW]. 
Another essential difference with respect to [G3] and [RW] is that we only study the splitting at the homoclinic point while they study it everywhere: hence our work is much more limited in scope. On the other hand we recover some of the results of [RW] because our proofs


also apply nontrivially to two time scale cases with 3 degrees of freedom (the ones in [RW]). Nevertheless we feel that the main difference between our work and [DGJS] (and the similar [RW]) lies in the techniques: here we show that the techniques of [G3] do apply immediately to the problem. See the concluding remarks in Sect. 9 for a more technical comparison. [During the refereeing process our view of the results in [RW] and [DGJS] has considerably changed, see Appendix R9.]

2. Isochronous Clock–Pendulum System

Our main result concerns anisochronous systems, (1.1). Isochronous models will be considered only to illustrate the simplest cases: the cancellations that we find in the anisochronous case would look miraculous otherwise. Calling $(I, \varphi)$, $(A, \alpha)$, $(B, \lambda)$ pairs of canonical coordinates suppose them action–angle variables: $(\varphi, \alpha, \lambda)$ is a triple of angles (varying on the 3-dimensional torus $T^3 = [0, 2\pi]^3$) and $(I, A, B) \in \mathbb{R}^3$. The Hamiltonian will be:
\[
H = \eta^{a} A + \frac{I^2}{2} + \eta^{-1/2} B + g^2(\cos\varphi - 1) + \mu\,\eta^{c} f(\varphi, \alpha, \lambda), \qquad \varepsilon \overset{def}{=} \mu\eta^{c}, \tag{2.1}
\]
where $\eta, \mu > 0$ and $f$ is an even trigonometric polynomial in the angles $(\varphi, \alpha, \lambda)$; for instance one could take $f(\varphi, \alpha, \lambda) = \cos(\alpha + \varphi) + \cos(\lambda + \varphi)$. The parameter $\eta$ sets the ratios between time scales and is a free parameter which we take, eventually, to be close to zero in order to study asymptotic properties as $\eta \to 0$. If $\mu = 0$ the 2-dimensional torus: $A = B = I = 0$, $\varphi = 0$ and $\boldsymbol{\alpha} = (\alpha, \lambda)$ arbitrary, is invariant and run quasi-periodically with rotation velocity $\boldsymbol{\omega} = (\eta^{a}, \eta^{-1/2})$: it will be called the unperturbed torus. In the above case, i.e. in the isochronous case only, we shall suppose (not “for simplicity”, but as an essential hypothesis) that the vector $\boldsymbol{\omega} = (\eta^{a}, \eta^{-1/2})$ is a Diophantine vector:
\[
|\boldsymbol{\omega}\cdot\boldsymbol{\nu}| > C^{-1}\eta^{d}\,|\boldsymbol{\nu}|^{-\tau} \overset{def}{=} C(\eta)\,|\boldsymbol{\nu}|^{-\tau} \tag{2.2}
\]

for some Diophantine constant $C(\eta)$, $\tau$ and some $d > 0$. This restricts the values of $\eta$ that we can consider. For $a \in [0, \frac{1}{2}]$ it still allows sequences $\eta_k$, such that (2.2) holds with prefixed $\tau > 2$, $d = a$ and $C$ large enough, with $\eta = \eta_k = \eta_1 k^{-1}$, for some $\eta_1 \in [\frac{1}{2}, 1]$ and all integers $k$ large enough.

Remarks. (1) A related $H$ with $a = \frac{1}{2}$ arises in a Celestial Mechanics problem near a double resonance, responsible for the time scales differences (see (12.39) in [CG], after scaling away the factors $\omega_T$ to put $H$ in dimensionless form and with several factors replaced here by constants, for simplicity). A first simplification of (2.1) compared to the “realistic” model in § 12 of [CG] is the absence of an additive isochrony breaking term $\frac{1}{2}\eta^{2a}A^2$ (see (1.1)): taking it into account does not essentially change the analysis of the splitting results (even in its quantitative aspects on the asymptotics as $\eta \to 0$, see Sect. 7). A second simplification is the absence in (2.1) of a further perturbation $\beta f_0$ which is not small (i.e. $f_0, \beta$ are $\eta, \mu$-independent), but depends only on the “fast” angle $\lambda$ and on $\varphi$: $f_0 = f_0(\varphi, \lambda)$. Taking it into account is a problem not discussed here. However it does not change the qualitative aspects but only the quantitative ones as long as the system is isochronous, see the discussion at the end of Sect. 8.


(2) $g^2$ in the above Hamiltonian is fixed (i.e. it is $\eta, \mu$ independent): eventually we take $g \equiv 1$ for simplicity. The parameters $\mu, \eta$ are free and we shall be interested in them having a “small value”: note that if $\eta \to 0$ the rotation vector $\boldsymbol{\omega}$ of the unperturbed torus ($I = \varphi = 0$, $B = A = 0$) has a size tending to $\infty$. The even parity assumption on the “interactions” $f$ simplifies, possibly in an essential way, the analysis. In the precession problem of [CG] $\eta$ is the deviation from spherical shape of a planet precessing around its barycenter, which moves on an ellipse of (fixed) eccentricity $\varepsilon = \mu\eta^{c}$.

(3) A general physical interpretation of (2.1) is that of a system consisting in: (i) a forcing clock with angular velocity $\eta^{a}$ (i.e. a point moving on a circle with angular velocity $\eta^{a}$, position $\alpha$ and action variable $A$); (ii) a pendulum (i.e. a point moving on a vertically placed circle with angular momentum $I$ and position $\varphi$ counted by taking $\varphi = 0$ as the unstable equilibrium position of the pendulum); (iii) a second forcing clock with angular velocity $\eta^{-1/2}$ and action variable $B$. Equivalently one can delete the $A\eta^{a}$, $B\eta^{-1/2}$ terms and replace $\alpha, \lambda$ by $\boldsymbol{\omega} t = (\eta^{a} t, \eta^{-1/2} t)$ thus regarding the system as a time dependent one, consisting of a pendulum subject to a quasi periodic force with periods $2\pi\eta^{-a}$, $2\pi\eta^{1/2}$. The “characteristic time” of the pendulum system is $T_0 = g^{-1}$.

(4) Finally the “coupling constant” in (2.1) is written as $\mu\eta^{c}$ and not just $\mu$, because the convergence radius of the expansions around the unperturbed torus is expected to be of the order of some power of $\eta$: it would be nice to know the best value of $c$ (thus replacing the constant $c$ by its optimal value) but it seems not known, see [HMS, ACKR] (it seems that $c > \frac{1}{2}$ might be the right condition, but below we make no attempt at getting even close to such a small value).
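The Diophantine condition (2.2) can be probed numerically on a finite box of integer vectors. The sketch below is only a heuristic aid under stated assumptions: $|\boldsymbol{\nu}|$ is taken to be the $\ell^1$ norm (the text does not fix the norm here), the cutoff `nu_max` is arbitrary, and a finite search can refute (2.2) for a given constant but can never prove it. The function names are hypothetical.

```python
import math


def rotation_vector(eta, a):
    """Rotation vector (eta**a, eta**(-1/2)) of the unperturbed torus of (2.1)."""
    return (eta ** a, 1.0 / math.sqrt(eta))


def diophantine_margin(omega, tau, nu_max):
    """Minimum of |omega . nu| * |nu|**tau over nonzero integer nu in a box.

    Condition (2.2) restricted to the box holds for a constant C(eta)
    exactly when C(eta) is below the returned value.
    """
    best = math.inf
    for n1 in range(-nu_max, nu_max + 1):
        for n2 in range(-nu_max, nu_max + 1):
            if n1 == 0 and n2 == 0:
                continue
            size = abs(n1) + abs(n2)  # assumed l1 norm of nu
            value = abs(omega[0] * n1 + omega[1] * n2) * size ** tau
            best = min(best, value)
    return best
```

A resonant vector such as $(1/2, 2)$ gives margin zero because $\boldsymbol{\nu} = (4, -1)$ annihilates it, which illustrates why (2.2) restricts the admissible values of $\eta$.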

3. Separatrices and Non Degeneracy

Supposing $\mu = 0$ in (2.1) we look at the unstable quasi periodic motions with $A = 0$, $B = 0$, $I = 0$, $\varphi = 0$, where $\alpha = \alpha_0 + \eta^{a} t$ and $\lambda = \lambda_0 + \eta^{-1/2} t$. This is a family of motions whose initial data are parameterized by $\boldsymbol{\alpha}_0 \overset{def}{=} (\alpha_0, \lambda_0)$ and therefore form an invariant torus: we shall call it a hyperbolic torus. The energy of such motions is $H = 0$, (if we fix $A, B$ at other values we find a continuum of invariant hyperbolic tori, with rotation vector $\boldsymbol{\omega} \overset{def}{=} (\eta^{a}, \eta^{-1/2})$: the energy of such motions is $\eta^{a} A + \eta^{-1/2} B$). The torus is unstable and its unstable manifold $W^-$ is 3-dimensional and contains the set parameterized by $(\varphi, \alpha, \lambda)$ via the equations:
\[
A = B = 0, \qquad I = I(\varphi), \qquad (\alpha, \lambda) \in [0, 2\pi]^2, \qquad \varphi \in (0, 2\pi), \tag{3.1}
\]
where $I(\varphi) = I^0(\varphi) \overset{def}{=} -g\sqrt{2(1 - \cos\varphi)}$ is the pendulum separatrix. But also the set with $I = -I^0(\varphi)$ is part of the torus unstable manifold because of the meaning of $\varphi$ as an angle. The stable manifold $W^+$ has equations $I(\varphi) = \pm I^0(\varphi)$: this means that $W^+ \equiv W^-$: a well known degeneracy of the pendulum motion. A nice way of seeing the separatrix motions is by representing them in (canonical) Jacobi coordinates: these are coordinates $(p, q)$ in terms of which the neighborhood of the (two) lines $I = \pm I^0(\varphi)$ in (3.1) is represented, near $I = \varphi = 0$, as:
\[
I = R_0(p, q), \qquad \varphi = S_0(p, q), \tag{3.2}
\]
where the functions $R_0, S_0$ are defined near the origin, where they vanish, and the pendulum motion becomes, in such coordinates, simply $p(t) = p_0 e^{-gt}$, $q(t) = q_0 e^{gt}$.
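The unperturbed separatrix can be written in closed form, which makes the statements above easy to test numerically. The sketch below is illustrative and not taken from the paper: it uses the standard explicit solution $\varphi(t) = 4\arctan(e^{gt})$, $I(t) = 2g/\cosh(gt)$ of the pendulum part $I^2/2 + g^2(\cos\varphi - 1)$ of (2.1), with the time origin chosen so that $\varphi(0) = \pi$. This is the branch $I = -I^0(\varphi)$; reversing time gives the branch $I = I^0(\varphi)$. The function names are hypothetical.

```python
import math


def separatrix(t, g=1.0):
    """(phi, I) on the branch I = -I0(phi) of the pendulum separatrix, phi(0) = pi."""
    phi = 4.0 * math.atan(math.exp(g * t))
    momentum = 2.0 * g / math.cosh(g * t)
    return phi, momentum


def pendulum_energy(phi, momentum, g=1.0):
    """Pendulum part of (2.1): I^2/2 + g^2 (cos(phi) - 1)."""
    return 0.5 * momentum * momentum + g * g * (math.cos(phi) - 1.0)


def I0(phi, g=1.0):
    """Separatrix function I0(phi) = -g sqrt(2 (1 - cos(phi))) from (3.1)."""
    return -g * math.sqrt(2.0 * (1.0 - math.cos(phi)))


def phi_dot(t, g=1.0, h=1e-5):
    """Central finite difference of phi(t), to compare with dphi/dt = I."""
    return (separatrix(t + h, g)[0] - separatrix(t - h, g)[0]) / (2.0 * h)
```

Because $I(t)$ is proportional to $1/\cosh(gt)$, its singularities in complex time sit at the zeros of $\cosh(gt)$, the nearest ones being at $t = \pm i\frac{\pi}{2} g^{-1}$.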


Therefore the $(p, q)$ coordinates describe globally, as $t \to +\infty$, the evolution of initial data $(p_0, 0)$ and globally, as $t \to -\infty$, the evolution of initial data $(0, q_0)$. In such coordinates the equation of the stable manifold is simply $p = 0$ while that of the unstable manifold is $q = 0$. The functions $S_0, R_0$ have well known holomorphy properties: the latter imply that the singularities of $R_0(pe^{gt}, 0)$, $R_0(0, qe^{-gt})$, at fixed $p$ or $q$, occur at $t = \pm i\frac{\pi}{2} g^{-1}$, and the same holds for $S_0$. This can be seen from the explicit solution of the pendulum equation in terms of elliptic integrals: an elementary analysis is in Appendix A9 of [CG], or in Appendix A1 of [G5]. The first result that we use to set up a general picture but strictly speaking not even necessary, as shown in [Ge1], is the (well known, see [Gr]) stability of such unperturbed hyperbolic torus and of its stable and unstable manifolds. This will be stated as:

Theorem 1. Suppose $c$ in (2.1) large enough, and suppose that $\eta$ is such that the rotation vector $\boldsymbol{\omega} \overset{def}{=} (\eta^{a}, \eta^{-1/2})$ verifies the Diophantine condition (2.2) and $\mu_0 > 0$ is small enough. One can define functions $\Xi, \Gamma, \Lambda, \Theta$ divisible by $\mu$ and analytic in the variables $(\psi_1, \psi_2, p, q, \mu)$ varying respectively on the torus $[0, 2\pi]^2$, on a neighborhood of $p = q = 0$, and in $|\mu| \le \mu_0$, such that setting, as $\boldsymbol{\psi}$ varies on $T^2 = [0, 2\pi]^2$:
\[
\begin{aligned}
A &= A_0 + \Xi(\boldsymbol{\psi}, p, q), & B &= B_0 + \Gamma(\boldsymbol{\psi}, p, q), & \alpha &= \psi_1, \quad \lambda = \psi_2,\\
I &= R_0(p, q) + \Lambda(\boldsymbol{\psi}, p, q), & \varphi &= S_0(p, q) + \Theta(\boldsymbol{\psi}, p, q), &&
\end{aligned} \tag{3.3}
\]

one defines, for all $A_0, B_0$, an invariant set on which the motion described by (2.1) takes place following $t \to (\boldsymbol{\psi} + \boldsymbol{\omega} t, pe^{-\bar{g}t}, qe^{+\bar{g}t})$ (as long as the $(p, q)$ stay in the domain of definition of $\Xi, \Gamma, \Lambda, \Theta$), with $\bar{g} = (1 + \gamma(\mu))g$ and $\gamma$ analytic in $\mu$, near $\mu = 0$ and divisible by $\mu$. The functions $\Xi, \Gamma$ evaluated at $p = q = 0$ have zero average with respect to $\boldsymbol{\psi}$; also the time average of $H$ vanishes on the above motions and (therefore) $H = 0$ for all of them. The estimate $\mu_0$ for the radius of convergence is positive and $\eta$-independent for $\eta$ small enough, as long as $\boldsymbol{\omega}$ verifies the Diophantine condition.

Remarks. (1) The theorem implies that if $p = q = 0$ then the parametric equations $\varphi = \Theta(\boldsymbol{\psi}, 0, 0)$, $\alpha = \psi_1$, $\lambda = \psi_2$, $I = \Lambda(\boldsymbol{\psi}, 0, 0)$, $A = \Xi(\boldsymbol{\psi}, 0, 0)$ and $B = \Gamma(\boldsymbol{\psi}, 0, 0)$ describe, as $\boldsymbol{\psi}$ varies on the 2-dimensional torus $[0, 2\pi]^2$, an invariant torus. The quasi periodic rotation $\boldsymbol{\psi} \to \boldsymbol{\psi} + \boldsymbol{\omega} t$ with $\boldsymbol{\omega} = (\eta^{a}, \eta^{-1/2})$ gives, for all $\boldsymbol{\psi}$, a solution to the equations of motion. This means that the invariant torus of dimension 2 that we considered in Sect. 2 (i.e. $A = B = I = 0$, $\varphi = 0$) survives the onset of perturbation and persists, slightly deformed, with the same rotation vector $\boldsymbol{\omega}$ and a slightly varied pair of Lyapunov exponents (i.e. $\pm\bar{g} = \pm(1 + \gamma)g$ rather than $\pm g$). The zero average property for $\Xi, \Gamma$ means that the perturbed torus is in the average (over $\boldsymbol{\psi}$) located at the same position as the unperturbed one from which it “emanates”: this is useful as it allows us to parameterize the invariant tori by their average position in action space. About the proof of the theorem, see Appendix R4.

(2) Setting $p = 0$, $q \ne 0$ one finds a surface of dimension 3 which is a part of the unstable manifold $W^-$ of the torus, while setting $q = 0$, $p \ne 0$ one gets a part of the stable manifold $W^+$. Such manifolds are colorfully called local whiskers.
They are in fact “local” parts of larger “global” manifolds (see item (4) below) and can be called, for this reason, local stable and local unstable manifolds. (3) The motion of A, B is computed from that of the other coordinates by quadrature (e.g. B˙ = −ε ∂λ f (ϕ, α, λ), while B itself does not occur in the equations of the other


coordinates, and similarly for A). The symmetry of f does not imply that 3, 2 have zero average over ψ , if ϕ = π, [CG], and also the Lyapunov exponent g in general changes by a quantity γ of order O(µ) with respect to the unperturbed value g(pq). The motions energy also changes, for more general (action dependent) perturbations, by an amount analytic in µ and divisible by µ (i.e. of order O(µ)). (4) Once the local parts of the torus whiskers have been defined as above we can extend them to global objects by simply applying time evolution to their points. (5) The case µ = 0 (one of the most classical results in Mechanics) is a non trivial exercise: it is developed in Appendix A9 of [CG] or Appendix A1 of [G5], see Appendix R3. (6) The stable and unstable manifolds do not coincide, in general, for µ 6 = 0. (7) In the anisochronous cases of Sect. 7 a very similar result holds with the difference that the relation between α and ψ is also non trivial and described by a function 1 ( ψ , p, q) = (1( ψ , p, q), 0) with zero average over ψ and divisible by µ and with the same domain of definition of the other functions in (3.3). In this case all the functions in (3.3), and 1 as well, depend also on A0 which must be restricted so that ω = (η a +η 2a A0 , η −1/2 ) verifies a Diophantine condition with suitable Diophantine constants C(η), τ : the size of the radius of convergence and the value of c will depend on the selected C(η) (like (2.2) or (7.3)). (8) The above theorem is well known; it is explicitly proved (in a much more general case) in the above form in § 5 of [CG], see p. 38. Its proof is a rather straightforward adaptation of Arnol’d’s method of proof of the KAM theorem, [A1], as exposed for instance in [G1]: the only element of “novelty” is perhaps the normal form of the motion in the coordinates ψ , p, q, as long as the latter two remain in their domain of definition. 
A more modern proof, based on Eliasson's method [E] for proving KAM (as exposed in [G2, GG]) and extending it to the problem of tori of dimension one less than the maximal, can be derived from [Ge1] (where only the cases p = 0 or q = 0 are studied). Also unusual is the absence of the twist condition (present, and necessary, in the more general proof in [CG]); it can be eliminated because of the special structure of the Hamiltonian (2.1). In any case the proof of Theorem 1 is long and of no direct interest to us here. We shall not need, however, the normal form and, as will become clear, we only need the classical results in the weaker form discussed in [Gr, Ge1].

4. Splitting Angles. A Recursive Determination

The symmetry of f implies an intersection at ϕ = π and $\boldsymbol\alpha = \mathbf 0$ (see below or, for instance, p. 363 of [G3]) between the stable and unstable manifolds of the torus into which the unperturbed torus (i.e. A = B = 0, I = 0, µ = 0) is deformed by the perturbation. Therefore we set up an algorithm to study such an intersection. For this purpose it is convenient to work in the original canonical coordinates and write the stable and unstable whiskers $W^\pm_\mu$ as:

$$ W^\pm_\mu = \{(\varphi, \boldsymbol\alpha, I, \mathbf A) = (\varphi, \boldsymbol\alpha, I^\pm_\mu(\varphi, \boldsymbol\alpha), \mathbf A^\pm_\mu(\varphi, \boldsymbol\alpha))\} \tag{4.1} $$

with $\boldsymbol\alpha \in T^2$, ε < |ϕ| < 2π − ε, where ε > 0 can be fixed a priori as small as we please provided we diminish the value of the analyticity radius $\mu_0$ in Theorem 1. In other words the deformation of the whiskers is of order O(µ) in every closed subinterval of (−2π, 2π): therefore they remain parametrizable by $(\varphi, \boldsymbol\alpha)$ for ϕ in any closed


G. Gallavotti, G. Gentile, V. Mastropietro

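subinterval if µ is small enough (just note that for µ = 0 they are parametrizable). We say that, in such a region of $(\varphi, \boldsymbol\alpha)$, they are graphs over the angle variables. We only need that ϕ = π is allowed (hence ε = π/2 will do). We define the splitting vector $\mathbf Q(\boldsymbol\alpha)$, the splitting matrix (or intersection matrix) and the splitting between $W^+_\mu$ and $W^-_\mu$ at ϕ = π and $\boldsymbol\alpha = \mathbf 0$, respectively, as:

$$ \mathbf Q(\boldsymbol\alpha) \stackrel{def}{=} \mathbf A^+_\mu(\pi, \boldsymbol\alpha) - \mathbf A^-_\mu(\pi, \boldsymbol\alpha), \qquad D \stackrel{def}{=} \partial_{\boldsymbol\alpha} \mathbf Q(\boldsymbol\alpha)\big|_{\boldsymbol\alpha = \mathbf 0}, \qquad \text{and} \quad \det D. \tag{4.2} $$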

A relevant remark (Lochak and Sauzin, private communication) is that the whiskers are Lagrangian manifolds, so that, for a suitable generating function $S^\pm(\varphi, \boldsymbol\alpha)$, the whiskers have equation: $\mathbf A^\pm = \partial_{\boldsymbol\alpha} S^\pm(\varphi, \boldsymbol\alpha)$, $I^\pm = \partial_\varphi S^\pm(\varphi, \boldsymbol\alpha)$ around every point $(\varphi_0, \boldsymbol\alpha_0)$, where they can be locally regarded as graphs over the angles $(\varphi, \boldsymbol\alpha)$, e.g. at ϕ = π and for µ small. This implies that D is symmetric, as we indeed find in (6.10), (7.10).

We now derive recursive formulae for $I^\pm_\mu$, $\mathbf A^\pm_\mu$ in (4.1) and their time evolution, keeping in mind that for µ = 0 one has $I^+_0 = I^-_0 = I^0(\varphi) = -g\sqrt{2(1 - \cos\varphi)}$, $\mathbf A^\pm_0 = \mathbf 0$. The unperturbed motion is simply: $X^0(t) \equiv (\varphi^0(t), \boldsymbol\alpha + \boldsymbol\omega t, I^0(\varphi^0(t)), \mathbf 0)$, where $(\varphi^0(t), I^0(\varphi^0(t)))$ is the free (i.e. with µ = 0) separatrix motion, generated by the pendulum in (2.1), setting the origin of time when the pendulum swings through ϕ = π.

The following Sects. 4 and 5 are presented here only for completeness: although self-contained they are not meant as a substitute for the work done in [G3], but serve the purpose of guiding the reader to dig out of that paper what he may want to see in more detail.

Let $X^\sigma_\mu(t; \boldsymbol\alpha)$, σ = ±, be the evolution of the point on $W^\sigma_\mu$ whose initial coordinates are given by $(\pi, \boldsymbol\alpha, I^\sigma_\mu(\pi, \boldsymbol\alpha), \mathbf A^\sigma_\mu(\pi, \boldsymbol\alpha))$; from now on we shall fix initial data with ϕ = π (which amounts to studying the whiskers at the "Poincaré section" {ϕ = π}). The analyticity in µ implied by the above Theorem 1 and the analyticity properties of the Jacobi functions $R_0$, $S_0$ allow us to consider the convergent (if c in (2.1) is large) Taylor series expansions, in µ, of the whiskers' equations. Let:

$$ X^\sigma_\mu(t) \equiv X^\sigma_\mu(t; \boldsymbol\alpha) \stackrel{def}{=} \sum_{k \ge 0} X^{k\sigma}(t; \boldsymbol\alpha)\, \varepsilon^k \equiv \sum_{k \ge 0} X^{k\sigma}(t)\, \varepsilon^k, \qquad \sigma = \pm \tag{4.3} $$

be the power series in $\varepsilon = \mu\eta^c$ of $X^\sigma_\mu$ (convergent for µ small); note that $X^{0\sigma} \equiv X^0$ is the degenerate unperturbed whisker. We shall often omit writing explicitly the $\boldsymbol\alpha$ variable among the arguments of various $\boldsymbol\alpha$-dependent functions, to simplify the notations. Theorem 1, Sect. 3, tells us that the t-dependence of $X^\sigma_\mu(t)$ has the form:

$$ X^\sigma_\mu(t; \boldsymbol\alpha) = X^\sigma_\mu(\boldsymbol\omega t, t; \boldsymbol\alpha) \tag{4.4} $$

where $X^\sigma_\mu(\boldsymbol\psi, t; \boldsymbol\alpha)$ is a real analytic function of all its arguments (µ included), which is periodic in $\boldsymbol\psi$ and $\boldsymbol\alpha$. The two functions $X^{k\sigma}(t)$ will be regarded as forming a single function $X^k(t)$:

$$ X^k(t) = \begin{cases} X^{k+}(t) & \text{if } \sigma = \operatorname{sign} t = + \\ X^{k-}(t) & \text{if } \sigma = \operatorname{sign} t = - \end{cases} \tag{4.5} $$

We label the 6 components of X with an index j, j = 0, 1, . . . , 5, and write them (notation used in [G3]) with the convention:

$$ X_0 \stackrel{def}{=} X_-, \qquad (X_1, X_2) \stackrel{def}{=} \mathbf X_\downarrow, \qquad X_3 \stackrel{def}{=} X_+, \qquad (X_4, X_5) \stackrel{def}{=} \mathbf X_\uparrow, \tag{4.6} $$


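and we write first the angle variables ($(\varphi, \alpha, \lambda) = (X_-, \mathbf X_\downarrow)$), then the action variables ($(I, A, B) = (X_+, \mathbf X_\uparrow)$); first the pendulum, then the rotator and then the clock variables. Therefore at order 0 in µ:

$$ X_0(t) = \varphi^0(t), \quad X_1(t) = \alpha_0 + \eta^a t, \quad X_2(t) = \lambda_0 + \eta^{-1/2} t, \quad X_3(t) = I^0(\varphi^0(t)), \quad X_4 = 0 = X_5, \tag{4.7} $$

where $\varphi^0(t)$ is the free separatrix motion (i.e. $\varphi^0(t) = 4 \arctan e^{-gt}$, see Appendix A1). Inserting (4.3) in the Hamilton equations for (2.1) and comparing the various orders in µ, the coefficients $X^{k\sigma}(t) \equiv X^{k\sigma}(\boldsymbol\omega t, t; \boldsymbol\alpha)$ are seen to satisfy the hierarchy of equations:

$$ \frac{d}{dt} X^{k\sigma} \equiv \dot X^{k\sigma} = L X^{k\sigma} + F^{k\sigma}, \tag{4.8} $$

where L is a 6 × 6 matrix with only two nonvanishing elements, $L_{03} = 1$ and $L_{30} = g^2 \cos\varphi^0(t)$; in the anisochronous case discussed in Sect. 7 below the matrix element $L_{14}$ is also nonvanishing: $L_{14} = \eta^{2a}$. Expanding $X^\sigma$ in powers of ε and imposing that the equations of motion are verified, we find recursively the expressions for F. For instance:

$$ \begin{aligned}
F^1_+ &= -\partial_\varphi f, \qquad \mathbf F^1_\uparrow = -\partial_{\boldsymbol\alpha} f, \\
F^2_+ &= -\frac{g^2}{2} \sin\varphi\, (X^1_-)^2 - \partial^2_\varphi f\, X^1_- - \partial_{\varphi\boldsymbol\alpha} f\, \mathbf X^1_\downarrow, \\
\mathbf F^2_\uparrow &= -\partial_{\boldsymbol\alpha\varphi} f\, X^1_- - \partial^2_{\boldsymbol\alpha} f\, \mathbf X^1_\downarrow, \\
F^3_+ &= -\frac{g^2}{2}\, 2 X^1_- X^2_- \sin\varphi - \frac{g^2}{3!} \cos\varphi\, (X^1_-)^3 - \partial^2_\varphi f\, X^2_- - \partial_{\boldsymbol\alpha\varphi} f\, \mathbf X^2_\downarrow \\
&\quad - \frac12 \partial_{\varphi\boldsymbol\alpha^2} f\, \mathbf X^1_\downarrow \mathbf X^1_\downarrow - \partial_{\varphi^2\boldsymbol\alpha} f\, \mathbf X^1_\downarrow X^1_- - \frac12 \partial^3_\varphi f\, (X^1_-)^2, \\
\mathbf F^3_\uparrow &= -\partial_{\boldsymbol\alpha\varphi} f\, X^2_- - \partial^2_{\boldsymbol\alpha} f\, \mathbf X^2_\downarrow - \frac12 \partial^3_{\boldsymbol\alpha} f\, \mathbf X^1_\downarrow \mathbf X^1_\downarrow - \partial_{\varphi\boldsymbol\alpha^2} f\, \mathbf X^1_\downarrow X^1_- - \frac12 \partial_{\boldsymbol\alpha\varphi^2} f\, X^1_- X^1_-,
\end{aligned} \tag{4.9} $$

where the functions are evaluated at $\varphi(t) \stackrel{def}{=} \varphi^0(t)$, $\boldsymbol\alpha(t) \stackrel{def}{=} \boldsymbol\alpha + \boldsymbol\omega t$. More generally, $F^k$ depends upon $X^0, \ldots, X^{k-1}$ but not on $X^k$. It turns out that $F^k_- \equiv 0$ and $\mathbf F^k_\downarrow \equiv \mathbf 0$ for all k ≥ 1.

The solution of a linear inhomogeneous equation like the one in (4.8) can be conveniently expressed in terms of the Wronskian matrix. We recall therefore the notion of Wronskian matrix W(t) of a solution t → x(t) of a differential equation $\dot x = f(x)$ in $R^n$. It is an n × n matrix whose columns are formed by n linearly independent solutions of the linear differential equation obtained by linearizing f around the solution x and assuming W(0) = identity. In other words W(t) is, in our case, the solution of the differential equation $\dot W(t) = L(t) W(t)$, W(0) = 1.

The solubility by elementary quadrature of the free pendulum equations leads, on the separatrix, to the following formulae, which are important because the Wronskian matrix of the free separatrix motion can be expressed in terms of them. If $\varphi^0(t) = 4 \arctan e^{-gt}$:

$$ \cos\varphi^0 = 1 - \frac{2}{(\cosh gt)^2} = 1 - 8 \frac{x^2}{(1 + x^2)^2}, \qquad \sin\varphi^0 = 2 \frac{\sinh gt}{(\cosh gt)^2} = 4\sigma x \frac{1 - x^2}{(1 + x^2)^2} \tag{4.10} $$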


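for $x = e^{gt}$ or, as well, for $x = e^{-gt}$. Equation (4.10) leads, see (A1.1) in Appendix A1, to the following expression for the Wronskian W(t) of the separatrix motion for the pendulum appearing in (2.1), with initial data at t = 0 given by ϕ = π, I = 2g:

$$ W(t) = \begin{pmatrix} \dfrac{1}{\cosh gt} & \dfrac{w}{4} \\[2mm] -\dfrac{\sinh gt}{\cosh^2 gt} & \Big(1 - \dfrac{w}{4} \dfrac{\sinh gt}{\cosh^2 gt}\Big) \cosh gt \end{pmatrix}, \qquad w \equiv \frac{2gt + \sinh 2gt}{\cosh gt}. \tag{4.11} $$

Notationally we follow here [G3] (in [CG] I, ϕ are exchanged). The evolution of the $X_\pm$ (i.e. I, ϕ) components can be determined by using the above Wronskian matrix:

$$ \begin{pmatrix} X^{k\sigma}_- \\ X^{k\sigma}_+ \end{pmatrix} = W(t) \begin{pmatrix} 0 \\ X^{k\sigma}_+(0) \end{pmatrix} + W(t) \int_0^t W^{-1}(\tau) \begin{pmatrix} 0 \\ F^{k\sigma}_+(\tau) \end{pmatrix} d\tau. \tag{4.12} $$

Thus, denoting by $w_{ij}$ (i, j = 0, 3) the entries of W, we see immediately that:

$$ \begin{aligned}
X^{k\sigma}_-(t) &= w_{03}(t) X^{k\sigma}_+(0) + w_{03}(t) \int_0^t w_{00}(\tau) F^{k\sigma}_+(\tau)\, d\tau - w_{00}(t) \int_0^t w_{03}(\tau) F^{k\sigma}_+(\tau)\, d\tau, \\
X^{k\sigma}_+(t) &= w_{33}(t) X^{k\sigma}_+(0) + w_{33}(t) \int_0^t w_{00}(\tau) F^{k\sigma}_+(\tau)\, d\tau - w_{30}(t) \int_0^t w_{03}(\tau) F^{k\sigma}_+(\tau)\, d\tau.
\end{aligned} \tag{4.13} $$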
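The closed forms (4.10) and (4.11) lend themselves to a quick numerical sanity check. The following is only a minimal sketch and not part of the original argument: it assumes g = 1 (the normalization adopted from Sect. 5 on), and the names `phi0`, `w_aux`, `wronskian` and `pendulum_checks` are hypothetical. It tests that det W = 1 and that the columns of W solve the linearized pendulum equation, i.e. the (X−, X+) block of Ẇ = LW with L03 = 1 and L30 = cos ϕ0(t).

```python
import math


def phi0(t):
    """Free separatrix motion for g = 1 (assumption): 4 * arctan(exp(-t))."""
    return 4.0 * math.atan(math.exp(-t))


def w_aux(t):
    """Auxiliary function w of (4.11) for g = 1."""
    return (2.0 * t + math.sinh(2.0 * t)) / math.cosh(t)


def wronskian(t):
    """2x2 matrix of (4.11), rows and columns ordered as (X_-, X_+), g = 1."""
    ch, sh = math.cosh(t), math.sinh(t)
    w = w_aux(t)
    return ((1.0 / ch, w / 4.0),
            (-sh / ch ** 2, (1.0 - (w / 4.0) * sh / ch ** 2) * ch))


def pendulum_checks(t, step=1e-5):
    """Return the largest deviation among the identities tested at time t."""
    errors = []
    # (4.10): closed forms of cos(phi0) and sin(phi0).
    errors.append(abs(math.cos(phi0(t)) - (1.0 - 2.0 / math.cosh(t) ** 2)))
    errors.append(abs(math.sin(phi0(t)) - 2.0 * math.sinh(t) / math.cosh(t) ** 2))
    # det W = 1 because the matrix L of the linearized flow is traceless.
    (w00, w03), (w30, w33) = wronskian(t)
    errors.append(abs(w00 * w33 - w03 * w30 - 1.0))
    # dW/dt = L W, with the time derivative taken by central differences.
    plus, minus = wronskian(t + step), wronskian(t - step)
    dw = [[(plus[i][j] - minus[i][j]) / (2.0 * step) for j in range(2)]
          for i in range(2)]
    c = math.cos(phi0(t))
    errors.append(abs(dw[0][0] - w30))
    errors.append(abs(dw[0][1] - w33))
    errors.append(abs(dw[1][0] - c * w00))
    errors.append(abs(dw[1][1] - c * w03))
    return max(errors)


if __name__ == "__main__":
    for time in (-1.3, 0.0, 0.4, 2.0):
        print(time, pendulum_checks(time))
```

The finite-difference test only confirms the formulae up to discretization error; it is meant as a guard against transcription slips in (4.11), not as a proof.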

Integrating (4.8) on the separatrix for the ↑, ↓ components is "easier": one can do it directly or, more systematically, by writing the full 6 × 6 Wronskian matrix of Eq. (4.8), which is trivially related to W(t) above; we shall not write it here: see Eq. (4.21) in [G3] for an explicit expression. The result is:

$$ \mathbf X^{k\sigma}_\downarrow(t) = \mathbf 0, \qquad \mathbf X^{k\sigma}_\uparrow(t) = \mathbf X^{k\sigma}_\uparrow(0) + \int_0^t \mathbf F^{k\sigma}_\uparrow(\tau)\, d\tau \tag{4.14} $$

having used that $\mathbf X^{k\sigma}_\downarrow(0) \equiv \mathbf 0$, because the initial datum is fixed and µ-independent. If (2.1) is modified by adding an isochrony breaking term $\frac12 \eta^{2a} A^2$, the first component of the first of (4.14) becomes:

$$ X^{k\sigma}_1(t) = \eta^{2a} \Big( t\, X^{k\sigma}_{\uparrow 1}(0) + \int_0^t (t - \tau)\, F^{k\sigma}_{\uparrow 1}(\tau)\, d\tau \Big), \tag{4.15} $$

where $X^{k\sigma}_1$ is the first component of $\mathbf X^{k\sigma}_\downarrow$ and the index 1 on the right-hand side denotes the first component of $\mathbf X^{k\sigma}_\uparrow(0)$ and of $\mathbf F^{k\sigma}_\uparrow$; see Eq. (2.18) in [G3].

Equations (4.13), (4.14) can be used to find a reasonably simple algorithm to represent whiskers to all orders k ≥ 1 of the perturbation expansion. It is very important to keep in mind that the initial data in (4.13), (4.14) are not constants: according to the convention following (4.3) they can depend on the $\boldsymbol\alpha$ variables of the initial data. This means that the functions X depend separately on $\boldsymbol\alpha$ and $\boldsymbol\omega t$, except when, as in (2.1), the Hamiltonian is linear in the A, B variables. In the latter case the dependence on $\boldsymbol\alpha$ and $\boldsymbol\omega t$ of the r.h.s. of (4.4) (where the notation is complete and all variables are indicated explicitly) must be through $\boldsymbol\alpha + \boldsymbol\omega t$, since the $\boldsymbol\alpha(t)$ angles vary as $\dot{\boldsymbol\alpha}(t) = \boldsymbol\omega$.


Note that the case (2.1) is nontrivial and, in fact, very interesting: it is equivalent to a problem on a nonlinear quasi-periodic Schrödinger equation, see [G2, BGGM]. The extension to anisochronous cases (i.e. with a quadratic term in A added to (2.1)) is worked out in [CG] up to third order: a general analysis can be found in [G3].

The initial data (still unknown) in (4.13), (4.14) are determined by imposing the correct behavior at ±∞, and the correct dependence on $\boldsymbol\alpha$ and $\boldsymbol\omega t$ (i.e. a dependence on these two vectors through their sum). These conditions arise from the fact that the motion must be asymptotic to the quasi-periodic motion on the invariant torus whose whiskers are described by $X^+$ or $X^-$. The scheme to do so is the following, see [G3].

Note that $w_{03}(t)$, $w_{33}(t)$ in (4.11) behave, as t → σ∞ with σ = ±1, asymptotically as $\sigma e^{gt\sigma}/4$ and $e^{gt\sigma}/4$, while the other two matrix elements become exponentially small. Hence we see that the terms in (4.13) proportional to $w_{03}(t)$ or $w_{33}(t)$ diverge, in general, exponentially fast as t → σ∞ (supposing the integrals convergent, i.e. supposing that $F^{k\sigma}_+(\tau)$ does not grow faster than a polynomial in t, as will turn out to be the case). But they are multiplied by $X^{k\sigma}_+(0) + \int_0^t w_{00}(\tau) F^{k\sigma}_+(\tau)\, d\tau$, so that the condition to determine $X^{k\sigma}_+(0)$ is:

$$ X^{k\sigma}_+(0) + \int_0^{\sigma\infty} w_{00}(\tau) F^{k\sigma}_+(\tau)\, d\tau = 0. \tag{4.16} $$

Likewise, in the isochronous case, from the second of (4.14) we determine $\mathbf X^{k\sigma}_\uparrow(0)$ by imposing that $\mathbf X^{k\sigma}_\uparrow(t)$, i.e. the momentum component corresponding to the isochronous angles α, λ, depends asymptotically on t only via $\boldsymbol\alpha + \boldsymbol\omega t$: this will determine $\mathbf X^{k\sigma}_\uparrow(0)$ up to a constant. And the average over $\boldsymbol\alpha$ of $\mathbf F^{k\sigma}_\uparrow(t)$ must tend to 0 as t → σ∞ (otherwise the second of (4.14) could not possibly be bounded as t → σ∞: but it has to be, because $\mathbf X^{k\sigma}_\uparrow(t)$ has to be bounded, by Theorem 1). This means that the constant is not a function of $\boldsymbol\alpha$ and can be fixed arbitrarily: however we want the averages of the $\Xi, \Gamma$ in (3.3) to be 0, so that $\mathbf X^{k\sigma}_\uparrow(0)$ is completely determined. One finds, for both components:

$$ \mathbf X^{k\sigma}_\uparrow(0) + \int_0^{\sigma\infty} \mathbf F^{k\sigma}_\uparrow(\tau)\, d\tau = \mathbf 0, \tag{4.17} $$

where the integral is usually improper: see (5.1) below for a proper definition (derived by simply looking at the meaning of the conditions imposed to determine $\mathbf X^{k\sigma}_\uparrow(0)$). Equation (4.17) remains the same even in the anisochronous cases (see Eq. (4.5) in [G3]). We learned of the importance and relevance of improper integrals in the present context from the works of the series leading to [HMS], whose importance we want to stress.

The key to concrete calculations is that, f in (2.1) being a trigonometric polynomial, the function $F^1$ (see (4.9)) belongs to a class M, see Definition 1 of § 3 of [G3], of linear combinations of terms like:

$$ M(t) = \frac{(g\sigma t)^h}{h!}\, x^k \sigma^\vartheta e^{i\rho\, \boldsymbol\omega \cdot \boldsymbol\nu\, t} \tag{4.18} $$

with h ≥ 0, k integers, ϑ = 0, 1, ρ = ±1 and $gk \pm i \boldsymbol\omega \cdot \boldsymbol\nu \neq 0$, and we set $x = e^{-gt\sigma}$ with σ = sign t = ±1. In fact $F^1$ is a finite linear combination of harmonics $\boldsymbol\nu$. By (4.10) we see that (4.11) is an analytic function of the variable $x = e^{gt}$ and of t, or of $x = e^{-gt}$ and of t: the explicit t-dependence arises because of the term w in (4.11).
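To make the role of the variable x concrete, the entry w00 = 1/cosh gt of (4.11) can be expanded as 2x/(1 + x²) = 2 Σ_j (−1)^j x^{2j+1}, an infinite but convergent sum of monomials of the type (4.18) with h = 0. The sketch below is an illustration only, assuming g = 1; the function names are hypothetical. It compares a partial sum with the closed form and also evaluates the x-representation of sin ϕ0 in (4.10) with x = e^{−σt}.

```python
import math


def w00_series(t, terms=80):
    """Partial sum of 1/cosh(t) = 2 * sum_j (-1)^j x^(2j+1), x = exp(-|t|), t != 0."""
    x = math.exp(-abs(t))
    return 2.0 * sum((-1) ** j * x ** (2 * j + 1) for j in range(terms))


def sin_phi0_in_x(t):
    """sin(phi0) as 4 sigma x (1 - x^2) / (1 + x^2)^2 with x = exp(-sigma t), g = 1."""
    sigma = 1.0 if t > 0 else -1.0
    x = math.exp(-sigma * t)
    return 4.0 * sigma * x * (1.0 - x * x) / (1.0 + x * x) ** 2


if __name__ == "__main__":
    for time in (0.5, -1.2):
        exact_sin = math.sin(4.0 * math.atan(math.exp(-time)))
        print(time, w00_series(time), 1.0 / math.cosh(time),
              sin_phi0_in_x(time), exact_sin)
```

The series converges geometrically in x², so it is slow only very close to t = 0, where x approaches 1.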


Expressing $F^1$ as a sum of monomials like (4.18) requires a (convergent) infinite sum over k, h. In the case of $F^1$, in fact, there are no monomials with a power of t higher than 0 (i.e. with h > 0 in (4.18)). By induction all the $F^k$ have the property of being expressible as sums of monomials like (4.18). However, starting with the second order, one sees that powers of t do appear (this reflects, see Theorem 1, that the function γ, describing the change of the Lyapunov exponents to ±(1 + γ)g with γ analytic in µ, is not identically zero; see, however, [Ge2]). A full description of the induction can be found in [G3]. For the analyticity properties of the series introduced above we refer to Sect. 3 of [G3] and we proceed to a quick discussion of the determination of the initial constants.

5. Improper Integrals and the Operators I, O, O0

The integrations in (4.13), (4.14) can be expressed in terms of an operator $\mathcal I$ acting linearly on finite linear combinations of monomials like (4.18) with $k^2 + (\boldsymbol\omega \cdot \boldsymbol\nu)^2 > 0$:

$$ \begin{aligned}
\mathcal I M(t) &\stackrel{def}{=} \int_{\sigma\infty}^t M(\tau)\, d\tau, \qquad \text{with:} \\
\mathcal I M(t) &= -\sigma^{\vartheta+1} x^k e^{i\rho\, \boldsymbol\omega \cdot \boldsymbol\nu\, t} \sum_{p=0}^{h} \frac{(\sigma t)^{h-p}}{(h-p)!}\, \frac{1}{(k - i\rho\sigma\, \boldsymbol\omega \cdot \boldsymbol\nu)^{p+1}},
\end{aligned} \tag{5.1} $$

where the first row is a formal definition whose mathematical meaning is given by the second row (note that if k ≤ 0 the first line is an improper integral), and we set g = 1. Note that $\mathcal I$ is not defined on the polynomials of t, σ, i.e. if k = 0 and $\boldsymbol\omega \cdot \boldsymbol\nu = 0$ (so that no exponentials are present in the monomial defining M). It can be naturally extended, for j ≥ 0, to the polynomials by setting $\mathcal I\, t^j \sigma^\vartheta = \sigma^\vartheta \frac{t^{j+1}}{j+1}$, see (3.7) in [G3]. The operator $\mathcal I$ is an integration with respect to t with special initial data: in fact at fixed σ:

$$ \partial_t\, \mathcal I M \equiv M. \tag{5.2} $$

If M is such that $M(t) \equiv M(\boldsymbol\omega t, \sigma)$ for some $M(\boldsymbol\psi, \sigma)$ defined on the torus, then $\mathcal I M(t) = (\boldsymbol\omega \cdot \partial_{\boldsymbol\psi})^{-1} M(\boldsymbol\omega t, \sigma)$.
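The definition (5.1) and the property (5.2) can be illustrated numerically. The sketch below is a hedged illustration, not the construction of [G3]: it takes g = 1, treats ω·ν as a single real number `omega_nu`, and all function names are hypothetical. It implements the second row of (5.1), checks (5.2) by central differences (also in a case with k = 0, where the first row is an improper integral), and, for k > 0, compares the formula with an ordinary quadrature of the first row truncated at a large time.

```python
import cmath
from math import factorial


def monomial(t, h, k, theta, rho, omega_nu):
    """M(t) of (4.18) with g = 1 and x = exp(-sigma t); requires t != 0."""
    sigma = 1.0 if t > 0 else -1.0
    s = sigma * t  # s = |t|
    return (s ** h / factorial(h)) * sigma ** theta * cmath.exp(
        -k * s + 1j * rho * omega_nu * t)


def improper_integral(t, h, k, theta, rho, omega_nu):
    """Second row of (5.1): the value assigned to the integral from sigma*inf to t."""
    sigma = 1.0 if t > 0 else -1.0
    s = sigma * t
    c = k - 1j * rho * sigma * omega_nu
    if c == 0:
        raise ValueError("(5.1) does not apply when k = 0 and omega.nu = 0")
    series = sum(s ** (h - p) / (factorial(h - p) * c ** (p + 1))
                 for p in range(h + 1))
    return -(sigma ** (theta + 1)) * cmath.exp(
        -k * s + 1j * rho * omega_nu * t) * series


def derivative_defect(t, h, k, theta, rho, omega_nu, step=1e-5):
    """|d/dt (I M)(t) - M(t)|, the derivative taken by central differences."""
    up = improper_integral(t + step, h, k, theta, rho, omega_nu)
    down = improper_integral(t - step, h, k, theta, rho, omega_nu)
    return abs((up - down) / (2.0 * step) - monomial(t, h, k, theta, rho, omega_nu))


def simpson(func, a, b, n=4000):
    """Composite Simpson rule on [a, b]; a > b gives the oriented integral."""
    width = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * width)
    return total * width / 3.0


def quadrature_defect(t, h, k, theta, rho, omega_nu, cutoff=40.0):
    """For t > 0 and k > 0: compare (5.1) with the integral from cutoff to t."""
    numeric = simpson(lambda u: monomial(u, h, k, theta, rho, omega_nu), cutoff, t)
    return abs(numeric - improper_integral(t, h, k, theta, rho, omega_nu))


if __name__ == "__main__":
    print(derivative_defect(0.7, 2, 1, 1, -1, 2.3))
    print(derivative_defect(-1.1, 1, 0, 0, 1, 0.7))   # k = 0: improper case
    print(quadrature_defect(0.7, 2, 1, 1, -1, 2.3))
```

The k = 0 case cannot be checked by quadrature, since the integral does not converge in the ordinary sense; there the second row of (5.1) is the definition, and only (5.2) can be tested.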

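The integrals in (4.13), (4.14) can be expressed in terms of the operators:

$$ \begin{aligned}
\mathcal O F(t) &= w_{03}(t)\, \mathcal I(w_{00} F)(t) - w_{00}(t)\, \mathcal I(w_{03} F)(\tau)\Big|_{0^\sigma}^{t}, \\
\mathcal O_+ F(t) &= w_{33}(t)\, \mathcal I(w_{00} F)(t) - w_{30}(t)\, \mathcal I(w_{03} F)(\tau)\Big|_{0^\sigma}^{t},
\end{aligned} \tag{5.3} $$

$$ \bar{\mathcal I}^2 F(t) = \mathcal I^2 F(t) - \mathcal I^2 F(0^\sigma), \tag{5.4} $$

where σ = sign t. Then one finds, in the general anisochronous case:

$$ X^h_-(t) = \mathcal O F^h_+(t), \qquad X^h_+(t) = \mathcal O_+ F^h_+, \qquad \mathbf X^h_\downarrow(t) = \eta^{2a} E\, \bar{\mathcal I}^2 \mathbf F^h_\uparrow(t), \qquad \mathbf X^h_\uparrow(t) = \mathcal I\, \mathbf F^h_\uparrow(t), \tag{5.5} $$

where E is the projection over the first component, see (4.15), and $F^h$ has to be expressed in terms of the $X^{h'}$ with h' < h.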


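Since, in this section, we consider an isochronous model, the above formulae are slightly simpler as $\mathbf X^h_\downarrow(t) \equiv \mathbf 0$, i.e. one must take E = 0 in the third of (5.5). In Sect. 7 we shall need, however, (5.5). There is no difficulty in setting up a general recursive scheme for the computation. We just give the result using the convenient notations ($k^i_j \ge 1$, $\mathbf m = (m_0, m_1, m_2)$, $m_i \ge 0$):

$$ (k^i_j)_{\mathbf m, p} \equiv (k^0_1, \ldots, k^0_{m_0}, k^1_1, \ldots, k^1_{m_1}, k^2_1, \ldots, k^2_{m_2}) \quad \text{s.t.} \quad \sum k^i_j = p \tag{5.6} $$

referring to Sect. 2 of [G3] for more details (if needed). If $f_1 \stackrel{def}{=} f$, $f_0 \stackrel{def}{=} g^2 \cos\varphi$, we find:

$$ \begin{aligned}
F^{k\sigma}_- &= 0, \qquad \mathbf F^{k\sigma}_\downarrow \equiv \mathbf 0, \\
\mathbf F^{k\sigma}_\uparrow &= -\sum_{|\mathbf m| \ge 0,\ \delta = 1} \frac{1}{\mathbf m!}\, (\partial^{\mathbf m} \partial_{\boldsymbol\alpha} f_\delta) \sum_{(k^i_j)_{\mathbf m, k-1}}\ \prod_{i=0}^{2} \prod_{j=1}^{m_i} X_i^{k^i_j \sigma}, \\
F^{k\sigma}_+ &= -\sum^{*}_{|\mathbf m| \ge 0,\ \delta = 0,1} \frac{1}{\mathbf m!}\, (\partial^{\mathbf m} \partial_\varphi f_\delta) \sum_{(k^i_j)_{\mathbf m, k-\delta}}\ \prod_{i=0}^{2} \prod_{j=1}^{m_i} X_i^{k^i_j \sigma},
\end{aligned} \tag{5.7} $$

where $(k^i_j)_{\mathbf m, k}$ (used when δ = 0) and $(k^i_j)_{\mathbf m, k-1}$ (used when δ = 1) are defined in (5.6). The ∗ means that if δ = 0 only vectors $\mathbf m$ with $|\mathbf m| \ge 2$ have to be considered in the sums. Note that if δ = 0 the sum in the expression for $F^h_+$ can only involve vectors $\mathbf m$ with $m_j = 0$ if j ≥ 1, because the function $f_0 = g^2 \cos\varphi$ depends only on ϕ and not on $\boldsymbol\alpha$ (hence also $k^i_j = 0$ if i > 0), while no sum with δ = 0 appears for $\mathbf F^{k\sigma}_\uparrow$. The functions are evaluated at $\varphi(t) \stackrel{def}{=} \varphi^0(t)$, $\boldsymbol\alpha(t) \stackrel{def}{=} \boldsymbol\alpha + \boldsymbol\omega t$. The indices in (5.7) are mutually contracted with a natural rule that we leave to the reader to work out.

The relations $\mathbf F^k_\downarrow \equiv \mathbf 0$ and $F^k_- \equiv 0$ are general (as the equations for $\dot{\mathbf X}_\downarrow$, $\dot X_-$ are linear) and in the isochronous cases:

$$ \begin{aligned}
\mathbf X^1_\uparrow(t) &= -\mathcal I(\partial_{\boldsymbol\alpha} f), \qquad \mathbf X^2_\uparrow(t) = -\mathcal I\big( \partial_{\boldsymbol\alpha\varphi} f\, \mathcal O(-\partial_\varphi f) \big), \\
\mathbf X^3_\uparrow(t) &= -\mathcal I\Big( \partial_{\boldsymbol\alpha\varphi} f\, \mathcal O\big( -\tfrac12 \sin\varphi\, \mathcal O(\partial_\varphi f)\, \mathcal O(\partial_\varphi f) \big) \Big) \\
&\quad - \mathcal I\Big( \partial_{\boldsymbol\alpha\varphi} f\, \mathcal O\big( \partial^2_\varphi f\, \mathcal O(\partial_\varphi f) \big) \Big) - \tfrac12\, \mathcal I\big( \partial_{\boldsymbol\alpha\varphi^2} f\, \mathcal O(\partial_\varphi f)\, \mathcal O(\partial_\varphi f) \big)
\end{aligned} \tag{5.8} $$

(see the examples in (4.9)). We fix our attention on the models with f a trigonometric polynomial ("trigonometric perturbation") of degree N:

$$ f(\boldsymbol\alpha, \varphi) = f_S(\varphi, \alpha, \lambda) + f_F(\varphi, \alpha, \lambda), \qquad f_j(\varphi, \alpha, \lambda) = \sum_{n, \boldsymbol\nu} f_{j,(n, \boldsymbol\nu)} \cos(\nu_1 \alpha + \nu_2 \lambda + n\varphi), \quad j = S, F \tag{5.9} $$

with |n|, $|\boldsymbol\nu|$ ≤ N and $f_{S,(n, \boldsymbol\nu)} = 0$ unless $\nu_2 = 0$, i.e. $\boldsymbol\nu$ is a slow mode, while $f_{F,(n, \boldsymbol\nu)} = 0$ unless $\nu_2 \neq 0$, i.e. $\boldsymbol\nu$ is a fast mode. We also say that $f_S$ depends only on slowly rotating angles and $f_F$ depends on fast rotating angles. A nontrivial example can be:

$$ f(\boldsymbol\alpha, \varphi) = \cos(\alpha + \varphi) + \cos(\lambda + \varphi). \tag{5.10} $$

The intersection matrix to order h, $D^h_{ij}$, can be expressed, to order h = 1, 2, 3, as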


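$$ \begin{aligned}
D^1_{ij} &= \int_{-\infty}^{\infty} dt\, \partial_{ij} f, \qquad D^2_{ij} = -\int_{-\infty}^{\infty} dt\, \Big[ \partial_{ij0} f\, \mathcal O(\partial_0 f) + \partial_{j0} f\, \mathcal O(\partial_{i0} f) \Big], \\
D^3_{22} &= \int_{-\infty}^{\infty} dt\, \Big[ w_{30}\, \mathcal O(\partial_{220} f)\, \mathcal O(\partial_0 f)^2 + 2 w_{30}\, \mathcal O(\partial_{20} f)^2\, \mathcal O(\partial_0 f) \\
&\qquad + \partial_{200} f\, \mathcal O(\partial_{20} f)\, \mathcal O(\partial_0 f) + \partial_{00} f\, \mathcal O(\partial_{220} f)\, \mathcal O(\partial_0 f) + \partial_{00} f\, \mathcal O(\partial_{20} f)\, \mathcal O(\partial_{20} f) \\
&\qquad + \tfrac12\, \partial_{2200} f\, \mathcal O(\partial_0 f)^2 + \partial_{200} f\, \mathcal O(\partial_{20} f)\, \mathcal O(\partial_0 f) \Big],
\end{aligned} \tag{5.11} $$

where the derivatives of f are evaluated at $\varphi(t)$, $\boldsymbol\alpha + \boldsymbol\omega t$, with $\varphi(t) \equiv \varphi^0(t)$ the free separatrix motion, see (4.10). We set $\partial_0 \stackrel{def}{=} \partial_\varphi$ and $\partial_i \stackrel{def}{=} \partial_{\alpha_i}$; the $\boldsymbol\alpha$'s have to be set equal to $\mathbf 0$ after evaluating derivatives. The expressions contain improperly convergent integrals (in general) and must be understood by thinking of $\int_{-\infty}^{+\infty}$ as $\int_{-\infty}^{0} + \int_{0}^{+\infty}$ and by using Definition (5.1), see [G3]. It is convenient to split the operation $\mathcal O$, see (6.5) in [G3], as:

$$ \begin{aligned}
\mathcal O(F) &= \mathcal O_0(F) + |w_{03}(t)|\, G^{(0)}(F) + w_{00}(t)\, G^{(1)}(F), \\
\mathcal O_0(F) &= \frac12 \Big[ \int_{-\infty}^{t} d\tau\, \big( w_{03}(t) w_{00}(\tau) - w_{00}(t) w_{03}(\tau) \big) F(\tau) + \int_{+\infty}^{t} d\tau\, (\text{same}) \Big], \\
G^{(0)}(F) &= \frac12 \int_{+\infty}^{-\infty} d\tau\, w_{00}(\tau) F(\tau), \qquad G^{(1)}(F) = \frac12 \int_{+\infty}^{-\infty} d\tau\, |w_{03}(\tau)|\, F(\tau).
\end{aligned} \tag{5.12} $$

The identity, see (6.12) and Appendix A2 of [G3]:

$$ \int_{-\infty}^{+\infty} dt\, F(t)\, \mathcal O(H)(t) = \int_{-\infty}^{+\infty} dt\, H(t)\, \mathcal O(F)(t) \tag{5.13} $$

implies symmetry of the above matrices $D_{ij}$, at least for the first three orders (see (5.11)): symmetry follows to all orders as said after (4.2), or as will be seen in Sect. 6. In the anisochronous case, i.e. in Sect. 7, we shall also use the splitting: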

$$ \begin{aligned}
\bar{\mathcal I}^2(F) &= \mathcal I^2_0(F) + |t|\, G^{(2)}(F) + G^{(3)}(F), \\
\mathcal I^2_0(F) &= \frac12 \Big[ \int_{-\infty}^{t} d\tau\, (t - \tau) F(\tau) + \int_{+\infty}^{t} d\tau\, (\text{same}) \Big], \\
G^{(2)}(F) &= \frac12 \int_{+\infty}^{-\infty} d\tau\, F(\tau), \qquad G^{(3)}(F) = \frac12 \int_{+\infty}^{-\infty} d\tau\, |\tau|\, F(\tau),
\end{aligned} \tag{5.14} $$

see Eqs. (6.3), (6.6) in [G3], and the identity:

$$ \int_{-\infty}^{+\infty} dt\, F(t)\, \bar{\mathcal I}^2(H)(t) = \int_{-\infty}^{+\infty} dt\, H(t)\, \bar{\mathcal I}^2(F)(t) \tag{5.15} $$

which can be proven as (5.13); see again [G3].

The key remark, to understand the asymptotic behaviour of (5.11) as η → 0, is that whenever the integrand is analytic it becomes possible to shift the integrations, over t and the τ's, to an axis close to Im t, Im τ = $\pm(\frac{\pi}{2} - \eta^{1/2})$: see § 8 in [G3] (choosing the free parameter d appearing in [G3] as $\eta^{1/2}$).

Using that $G^{(1)}(F) = G^{(0)}(F) = 0$ if F is odd (as the odd derivatives of f are, when evaluated at $\boldsymbol\alpha = \mathbf 0$) and using also that $\mathcal O_0$ leaves parity unchanged, in general, we


shall find that the non-analytic terms (i.e. those containing integrals of a non-analytic function, e.g. $|w_{03}(\tau)|$) cancel each other in their contribution to the determinant of $D_{ij}$ to all orders k ≥ 1. The result will be the proof of the following theorem.

Theorem 2 ("Large angles theorem"). Consider a system described by the Hamiltonian (2.1) or (1.1) with f an even trigonometric polynomial of degree N. Let µ be small (|µ| < µ0) and c large enough. Consider an invariant torus with Diophantine rotation vector $\boldsymbol\omega$ among those described in Theorem 1 above. At the homoclinic point with ϕ = π, $\boldsymbol\alpha = \mathbf 0$ the intersection matrix determinant is exponentially small as η → 0, and it is generically asymptotic to its first order value ("Mel'nikov integral"), i.e. it is $\gamma\, \eta^{-b} e^{-\frac{\pi}{2} \eta^{-1/2}}$, with $\gamma \neq 0$ and some positive constant b > 0 (depending on N). The choice given by (5.10) is a concrete example of this result, generically holding for (5.9).

Remarks. (1) The name of the theorem is due to the fact that the splitting, despite its exponential smallness, is nevertheless far larger than the tori separation, which is the natural scale over which to measure the splitting size. In the anisochronous case the average actions of the tori do not fill phase space (they do in the isochronous case as well as in the case of [A]): in fact the above theorem will imply immediately, along the lines of [CG], the existence of heteroclinic chains and therefore of Arnol'd diffusion, see [GGM2]. (2) The result in the case f given by (5.10) is $32\pi\, \eta^{-1/2} \varepsilon^2 e^{-\frac{\pi}{2} \eta^{-1/2}}$ in both cases (isochronous, (2.1), and anisochronous, (1.1)) with a > 0. (3) This theorem has an analogue when $a = -\frac12$, i.e. only two time scales: an early review is in [G3]. In the latter case it has been extended to (special) analytic perturbations, i.e. beyond the trigonometric case, and to cover the exact asymptotic value of the splitting, i.e. far beyond [G3], see [DGJS, RW]. The above three scales case, (2.1), is quite different from the two scales case (discussed in [G3]): but it arises naturally in Celestial Mechanics problems near a double resonance, as in the case of the precession problem in [CG] to which we hope to apply, eventually, the results of this paper. (4) The case in (2.1) is that of a pair of "clocks" and a pendulum. The case of a "clock", a "rotator" and a pendulum is exemplified by the Hamiltonian obtained by adding $\frac12 \eta^{2a} A^2$ to (2.1), see (1.1) (or (7.1) below). Both cases are treated in full detail in Sects. 6 and 7.

6. Nonperturbative Splitting Analysis in Presence of Fast and Slow Rotations

It would be easy to show that the determinant of the intersection matrix is exponentially small to order $\varepsilon^4$: this requires evaluating the intersection matrix only to third order, by using (5.11). But it cannot be done without due care, as the error in [CG] was precisely due to the belief that it was not necessary to evaluate the matrix element $D_{22}$ because it was exponentially small. In fact it is not exponentially small and it has the right value to make, instead, the whole determinant exponentially small. The real problem is to compute the determinant to all orders and to show that to all orders it is exponentially small: i.e. to all orders the determinant is a sum of "large" terms (not exponentially small) which "cancel each other" with a result that is exponentially small. So small that the first order calculation dominates in the limit as η → 0.
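The first order value for the example (5.10) can be evaluated directly from the first of (5.11). The following is a hedged numerical sketch, not the computation of the paper. It assumes g = 1. It interprets the improper integral of cos(ωt + ϕ0(t)) by discarding the non-decaying part cos ωt, to which the convention (5.1) assigns the value zero. It compares the result with the closed form 2πω/sinh(πω/2) + 2πω/cosh(πω/2), which follows from the standard Fourier transforms of 1/cosh t and 1/cosh² t. The sample values of η and a, and all function names, are hypothetical.

```python
import math


def phi0(t):
    """Free separatrix motion for g = 1 (assumption)."""
    return 4.0 * math.atan(math.exp(-t))


def simpson(func, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    width = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * width)
    return total * width / 3.0


def first_order_entry(omega, cutoff=40.0, n=16000):
    """-(integral of cos(omega t + phi0(t))) with the cos(omega t) part removed.

    For f = cos(alpha + phi) + cos(lambda + phi) this is the diagonal entry of
    the first order intersection matrix (divided by epsilon) for the angle
    rotating with frequency omega.
    """
    def integrand(t):
        p = phi0(t)
        return -((math.cos(p) - 1.0) * math.cos(omega * t)
                 - math.sin(p) * math.sin(omega * t))
    return simpson(integrand, -cutoff, cutoff, n)


def closed_form(omega):
    """2 pi omega / sinh(pi omega / 2) + 2 pi omega / cosh(pi omega / 2)."""
    half = math.pi * omega / 2.0
    return 2.0 * math.pi * omega * (1.0 / math.sinh(half) + 1.0 / math.cosh(half))


if __name__ == "__main__":
    eta, a = 0.04, 0.5                      # hypothetical sample values
    slow, fast = eta ** a, eta ** -0.5      # the two frequencies in omega
    d11, d22 = first_order_entry(slow), first_order_entry(fast)
    leading = 32.0 * math.pi * eta ** -0.5 * math.exp(-math.pi / (2.0 * math.sqrt(eta)))
    print("slow entry:", d11, closed_form(slow))
    print("fast entry:", d22, closed_form(fast))
    print("determinant / leading asymptotics:", d11 * d22 / leading)
```

In this sketch the slow entry tends to 4 and the fast entry behaves like 8π η^{−1/2} e^{−(π/2)η^{−1/2}} as η → 0, so their product reproduces the form quoted in remark (2) up to the factor ε². At moderate η the printed ratio still differs visibly from 1.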
It is remarkable that in fact one can give an exact expression for the leading corrections to the first order of perturbation expansion. See (6.12), (7.19). This section relies on (and in fact it follows almost immediately from) the general theory of the intersection matrix in [G3]. We cannot repeat here the general theory and


therefore refer the reader to [G3] for details on the main definitions: we try nevertheless to make what follows readable at least from a formal viewpoint and as a guide to [G3]. The point of [G3] is that the intersection matrix can be quite explicitly calculated to all orders by using a graphical formalism very similar to that used in quantum field theory when the Schwinger functions are expressed via Feynman's graphs. In the present case the graphs will be, topologically, trees: very unusual graphs from the viewpoint of field theory, where loops are often the main source of interest and nontriviality. On the other hand the graphs have nodes with arbitrarily large coordination number: also unusual in quantum field theories (with polynomial interactions). In this section we confine ourselves to the Hamiltonian (2.1); we shall see in Sect. 7 how to extend the graphical construction to the anisochronous case.

Let ϑ be a tree built with oriented lines all pointing towards a "highest" node r that we call the root and that we suppose to have only one "incoming" line, see the figure below.

[Figure (6.1): a tree ϑ with root r (label j), first node v0 and further nodes v1, ..., v11; the node labels δv0 and δv1 are shown.]

Fig. 1. A graph ϑ with $p_{v_0} = 2$, $p_{v_1} = 2$, $p_{v_2} = 3$, $p_{v_3} = 2$, $p_{v_4} = 2$ and k = 12, and some labels. The line numbers, distinguishing the lines, and their orientation pointing at the root, are not shown. The length of the lines should be the same but it is drawn of arbitrary size. The node labels $\delta_v$ are indicated only for two nodes

The graph will bear a label $\delta_v = 0, 1$ on each node v: if $\delta_v = 1$ it represents $f \stackrel{def}{=} f_1$, while if $\delta_v = 0$ it represents $f_0 = g^2 \cos\varphi$. The labels can be given arbitrarily, with the restrictions that all endnodes bear a label δ = 1 and that all nodes bearing a label δ = 0 have at least two incoming lines. Each node v will also bear a "time" label $\tau_v$.

We define the value of a graph by building the following symbol. We first lay down a set of parentheses ( ) ordered hierarchically and reproducing the tree structure: in fact any tree partially ordered towards the root can be represented as a set of matching parentheses corresponding to the tree nodes. Matching parentheses corresponding to a node v will be made easy to see by appending to them a label v. The root will not be associated with a parenthesis. Inside the parenthesis $(_v$ and next to it we write $-\partial_0^{p_v + 1} f_{\delta_v}$ for all nodes v lower than the node $v_0$ preceding the root ("first node"), where $p_v$ is the number of lines entering v; for $v = v_0$ we write $-\partial_j \partial_0^{p_{v_0}} f_{\delta_{v_0}}$. Here and henceforth $\partial_0 \stackrel{def}{=} \partial_\varphi$ and $\partial_j \stackrel{def}{=} \partial_{\alpha_j}$, j = 1, 2 (this implies that $\delta_{v_0} = 1$): the functions have to be evaluated at $(\varphi^0(t), \boldsymbol\alpha + \boldsymbol\omega t)$.


Outside the parenthesis $(_v$ we write $\mathcal O$ for all the $v < v_0$ and we add to the right of the matching parenthesis the symbol $(\tau_v)$; for the first node we simply integrate over $\tau_{v_0}$ from +∞ to −∞. The symbol thus defined has the meaning of a linear combination of products of multiple integrals if one uses the definitions of the symbols $\mathcal O$, see (5.12). We multiply it by $(n!)^{-1}$, where n is the number of lines in the graph, and we regard all the lines as different (i.e. labeled); however two graphs that can be superposed, labels included, by successively rotating rigidly around the nodes the subtrees that are attached to them have to be regarded as identical. This defines the value of a graph (it is a function of $\boldsymbol\alpha$). The reader can see that the above is a rather natural construction by working out patiently the definition in the case of trees with one, two or three nodes. The sum over all graphs of "order" $k = \sum_v \delta_v$ of the graph values gives the coefficient $Q^k_j$ of order k of the splitting vector $\mathbf Q(\boldsymbol\alpha)$: see [G3], where the above construction is performed in Fourier transform to obtain directly an expression for $\mathbf Q^k_{\boldsymbol\nu}$.
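The combinatorial part of this construction (labels, order, and the hierarchy of matching parentheses) is easy to mirror in a small data structure. The sketch below is purely illustrative and covers only the bookkeeping, not the integrals defining the value of a graph. The class `Node` and the functions `labels_are_admissible`, `order`, `parentheses` and `example_tree` are hypothetical names, and the root itself is not represented, in line with the rule that no parenthesis is associated with it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A tree node v with its label delta_v and the nodes whose lines enter it."""
    name: str
    delta: int                      # 1 stands for f_1 = f, 0 for f_0 = g^2 cos(phi)
    children: List["Node"] = field(default_factory=list)


def labels_are_admissible(node: Node, is_first: bool = True) -> bool:
    """Endnodes need delta = 1; delta = 0 needs at least two incoming lines;
    the first node v0 needs delta = 1."""
    incoming = len(node.children)
    if is_first and node.delta != 1:
        return False
    if incoming == 0 and node.delta != 1:
        return False
    if node.delta == 0 and incoming < 2:
        return False
    return all(labels_are_admissible(child, False) for child in node.children)


def order(node: Node) -> int:
    """The order k of the graph: the sum of delta_v over all nodes."""
    return node.delta + sum(order(child) for child in node.children)


def parentheses(node: Node) -> str:
    """Matching parentheses reproducing the tree, each closing one labeled by its node."""
    inner = "".join(parentheses(child) for child in node.children)
    return "(" + inner + ")" + node.name


def example_tree() -> Node:
    """A five-node tree: v0 <- v1, v2 and v2 <- v3, v4, with delta = 0 only on v2."""
    v2 = Node("v2", 0, [Node("v3", 1), Node("v4", 1)])
    return Node("v0", 1, [Node("v1", 1), v2])


if __name__ == "__main__":
    tree = example_tree()
    print(parentheses(tree), order(tree), labels_are_admissible(tree))
```

Here the number of lines entering a node, $p_v$, is simply `len(node.children)`, which is the exponent data needed for the derivatives attached to each parenthesis.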

It is convenient to make this more explicit by using the decomposition of $\mathcal O$ in the first line of (5.12). This can be easily done by simply attaching to each node v lower than the first a label $\beta_v = O, D, R$ signifying that we select the first of the three terms in the decomposition of $\mathcal O$ (see the first line in (5.12)) or the second or the third. We can alternatively imagine drawing a circle around each node v enclosing only the subtree with that node v as the first node and then attaching the label to the circle.

Let ϑ be a graph whose nodes v carry indices $\delta_v = 0, 1$ and $\tau_v$; let $v_0$ be the first node of ϑ and let $v' > v$ be (if $v < v_0$) the node following v. By (5.12) we see that a circle with a D or R label encircling a node v linked to the higher node $v'$ (external to the circle) represents just a function $|w_{03}(\tau_{v'})|$ or $w_{00}(\tau_{v'})$ multiplied by a number whose evaluation requires essentially the same operations required to evaluate the value of a graph ϑ.

This allows us to give a nonperturbative expression for the splitting vector $\mathbf Q(\boldsymbol\alpha)$. We simply consider the sum of all the values of the graphs bearing a label O on all the nodes, except perhaps the endnodes, which can bear also D, R labels. We evaluate the graph values and in the end we replace the number associated with the R, D-labeled endnodes by the full perturbation series that is obtained by imagining that inside the circles with R and D labels there is the most general graph with O, R, D labels in all possible ways. The sum of such perturbation series will be denoted $G^{(0)}(\boldsymbol\alpha)$ for D-labeled circles, and $G^{(1)}(\boldsymbol\alpha)$ for R-labeled circles. The new representation of $\mathbf Q(\boldsymbol\alpha)$ is therefore a representation in terms of trees with a few "fruits" around some of the endnodes (possibly none or all) that can be D-labeled or R-labeled (dry or ripe, to follow the names of [G3]).

Furthermore $G^{(1)}$ and $G^{(0)}$ verify a simple recursion relation that can be found by a more explicit representation of the quantities defined in the same way as $G^{(0)}$, $G^{(1)}$, $\mathbf Q$ but evaluated by considering only trees deprived of fruits, see (6.4) below. Having fixed ϑ and setting $w(\tau', \tau) = w_{03}(\tau') w_{00}(\tau) - w_{03}(\tau) w_{00}(\tau')$, define the function $A(\{\tau_v\})$ of the time labels $\tau_v$ associated with the nodes by:

$$ A \equiv A(\{\tau_v\}) = \frac{(-1)^n}{2^n\, n!} \Big[ \prod_{v < v_0} w(\tau_{v'}, \tau_v)\, \varepsilon^{\delta_v}\, \partial_0^{p_v + 1} f_{\delta_v}(\tau_v) \Big]\, \partial_0^{p_{v_0}} f_{\delta_{v_0}}(\tau_{v_0}), \tag{6.2} $$

where $p_v$ is the number of lines entering the node v; $f_\delta(\tau) \equiv f_\delta(\varphi^0(\tau), \boldsymbol\alpha + \boldsymbol\omega \tau)$ and n is the number of nodes of the graph ϑ; $\partial_0^q f_\delta(\tau)$ means $(\partial_\varphi^q f_\delta)(\varphi^0(\tau), \boldsymbol\alpha + \boldsymbol\omega \tau)$.


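We define also $A^{i,j,m,\ldots}_{x,y,w,\ldots}$ by the same expression as A with extra derivatives $\partial_i, \partial_j, \partial_m, \ldots$ acting on the functions $f_{\delta_x}, f_{\delta_y}, f_{\delta_w}, \ldots$ in (6.2). We shall say that i, j, m, . . . are marks on the nodes x, y, w, . . . . In terms of graphs we simply represent $A^{i,j,m,\ldots}_{x,y,w,\ldots}$ by affixing also "marks" i, j, m, . . . to the nodes x, y, w, . . . . Then let, for j = 1, 2:

$$ \begin{aligned}
\Gamma^{(1)}(\boldsymbol\alpha) &= \sum_\vartheta \int |w_{03}(\tau_{v_0})|\, A^{0}_{v_0}(\{\tau_v\}), \\
\Gamma^{(0)}(\boldsymbol\alpha) &= \sum_\vartheta \int w_{00}(\tau_{v_0})\, A^{0}_{v_0}(\{\tau_v\}), \\
\Gamma_j(\boldsymbol\alpha) &= 2 \sum_\vartheta \int A^{j}_{v_0}(\{\tau_v\}),
\end{aligned} \tag{6.3} $$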

the integrals ranging over $\tau_{v_0} \in (+\infty, -\infty)$ and over $\tau_v \in (-\infty, \tau_{v'}) \cup (\infty, \tau_{v'})$ for $v < v_0$. Given the above definitions, $\mathbf G = (G^{(0)}, G^{(1)})$ and $\boldsymbol\Gamma = (\Gamma^{(0)}, \Gamma^{(1)})$ verify the following (analogues of the well known field theoretic) "Dyson equations" (see also [GGM1]):

$$ \begin{aligned}
G^{(0)}(\boldsymbol\alpha) &= \Gamma^{(0)}(\boldsymbol\alpha) + \sum_{\vartheta;\, y \in \vartheta} \int_{\infty}^{-\infty} A^{0,0}_{v_0, y}\, w_{00}(\tau_{v_0}) \big( |w_{03}(\tau_y)|\, G^{(0)}(\boldsymbol\alpha) + w_{00}(\tau_y)\, G^{(1)}(\boldsymbol\alpha) \big) + \cdots, \\
G^{(1)}(\boldsymbol\alpha) &= \Gamma^{(1)}(\boldsymbol\alpha) + \sum_{\vartheta;\, y \in \vartheta} \int_{\infty}^{-\infty} A^{0,0}_{v_0, y}\, |w_{03}(\tau_{v_0})| \big( |w_{03}(\tau_y)|\, G^{(0)}(\boldsymbol\alpha) + w_{00}(\tau_y)\, G^{(1)}(\boldsymbol\alpha) \big) + \cdots,
\end{aligned} \tag{6.4} $$

where the r.h.s. represents the contributions of the trees with no fruit or with just one fruit and the ⋯ denote contributions from the trees with more than one fruit, see Appendix R5. Equation (6.4) is obvious if one writes the r.h.s. as a sum of graphs and sees its graphical interpretation: note that (6.4) is a "nonperturbative" identity since $\boldsymbol\Gamma$, $\mathbf G$ are infinite series in ε (i.e. they contain all the contributions from graphs of any order). Then:

$$ Q_j(\boldsymbol\alpha) = \Gamma_j(\boldsymbol\alpha) + 2 \sum_{\vartheta;\, w \in \vartheta} \int_{\infty}^{-\infty} A^{j,0}_{v_0, w} \big( |w_{03}(\tau_w)|\, G^{(0)}(\boldsymbol\alpha) + w_{00}(\tau_w)\, G^{(1)}(\boldsymbol\alpha) \big) + \cdots \tag{6.5} $$

with the above meaning of the ⋯ (i.e. up to sums involving at least two fruits). The convergence of all the above series and the estimates about their remainders are nontrivial and an open problem: they are, however, finite order by order, as follows from the analysis of Appendix A1 of [G3]. Therefore what follows can be interpreted as a collection of identities valid order by order in the expansion and as a way of deriving and expressing compactly infinitely many identities. This will be sufficient for all our conclusions; nevertheless one would get better bounds if one could prove convergence.

Consider $G^{(0)}_i, \Gamma^{(0)}_i, G^{(1)}_i, \Gamma^{(1)}_i, \Gamma_{ij}$ defined as the $\partial_i$ derivatives of $G^{(0)}, \Gamma^{(0)}, G^{(1)}, \Gamma^{(1)}, \Gamma_j$, see (6.3), (6.4), evaluated at $\boldsymbol\alpha = \mathbf 0$. The terms hidden in the ⋯ contribute 0 since at the homoclinic point, by parity, $G^{(0)}(\mathbf 0) = 0$ and, as well, $G^{(1)}(\mathbf 0) = 0$. Differentiating (6.4) and evaluating at $\boldsymbol\alpha = \mathbf 0$ we get the remarkable exact relations:

$$ G^{(0)}_j = \Gamma^{(0)}_j + M_{01} G^{(0)}_j + M_{11} G^{(1)}_j, \qquad G^{(1)}_j = \Gamma^{(1)}_j + M_{00} G^{(0)}_j + M_{10} G^{(1)}_j \tag{6.6} $$

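with the matrix $M = \begin{pmatrix} M_{00} & M_{01} \\ M_{10} & M_{11} \end{pmatrix}$ defined by:

$$M = \begin{pmatrix}
\sum_{\vartheta;\,y\in\vartheta}\int_{\infty}^{-\infty} A^{0,0}_{v_0,y}\,|w_{03}(\tau_{v_0})|\,|w_{03}(\tau_y)| & \sum_{\vartheta;\,y\in\vartheta}\int_{\infty}^{-\infty} A^{0,0}_{v_0,y}\,w_{00}(\tau_{v_0})\,|w_{03}(\tau_y)| \\[4pt]
\sum_{\vartheta;\,y\in\vartheta}\int_{\infty}^{-\infty} A^{0,0}_{v_0,y}\,|w_{03}(\tau_{v_0})|\,w_{00}(\tau_y) & \sum_{\vartheta;\,y\in\vartheta}\int_{\infty}^{-\infty} A^{0,0}_{v_0,y}\,w_{00}(\tau_{v_0})\,w_{00}(\tau_y)
\end{pmatrix}. \tag{6.7}$$

$M$ is symmetric: this is implied by the symmetry of the operator $O_0$, which follows immediately from that of $O$ (see (5.13), and Appendix A2 of [G3]: or proceed inductively). More generally the symmetry of $O_0$, repeatedly applied (as done with $O$ in [G3]), implies that $\sum_{\vartheta,y}\int A^{0,0}_{v_0,y}\,F(\tau_{v_0})\,H(\tau_y) = \sum_{\vartheta,y}\int A^{0,0}_{v_0,y}\,H(\tau_{v_0})\,F(\tau_y)$ for all $F, H$: this is used several times below. Going back to (6.5), in order to evaluate the derivatives $\partial_i Q_j(\boldsymbol{\alpha})$, $i, j = 1, 2$, at $\boldsymbol{\alpha} = \boldsymbol{0}$ we need only the linear terms and the $\cdots$ contribute exactly 0. If we set:

$$\Gamma_{ij} = 2\sum_{\vartheta;\,y\in\vartheta}\int A^{j,i}_{v_0,y} \tag{6.8}$$

we have that the $2\times 2$ matrix $\Gamma$ is symmetric (again use the symmetry of $O_0$). Noting that $\boldsymbol{G} = (1 - \sigma M)^{-1}\boldsymbol{\Gamma}$ with $\sigma = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, the intersection matrix $D$ is:

$$D_{ij} = \Gamma_{ij} + 2\big(\Gamma^{(1)}_j\,G^{(0)}_i + \Gamma^{(0)}_j\,G^{(1)}_i\big) \tag{6.9}$$

which, setting $C = \sigma(1 - \sigma M)^{-1}$, becomes:

$$D = \begin{pmatrix}
\Gamma_{11} + 2(\boldsymbol{\Gamma}_1, C\,\boldsymbol{\Gamma}_1) & \Gamma_{12} + 2(\boldsymbol{\Gamma}_2, C\,\boldsymbol{\Gamma}_1) \\
\Gamma_{21} + 2(\boldsymbol{\Gamma}_1, C\,\boldsymbol{\Gamma}_2) & \Gamma_{22} + 2(\boldsymbol{\Gamma}_2, C\,\boldsymbol{\Gamma}_2)
\end{pmatrix}. \tag{6.10}$$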
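The algebra linking (6.9) and (6.10) can be checked numerically. The following is only a sketch on made-up numbers: the helper names (`c_matrix`, `d_from_69`, `d_from_610`) and the sample values of $M$, $\Gamma$ and $\boldsymbol{\Gamma}_j$ are assumptions introduced for illustration, not quantities computed in the paper. It verifies that $C = \sigma(1-\sigma M)^{-1}$ is symmetric when $M$ is, and that the two expressions for $D$ agree.

```python
# Sanity check of (6.9)-(6.10) on arbitrary sample numbers (not from the paper).

def inv2(p):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = p[0][0] * p[1][1] - p[0][1] * p[1][0]
    return [[p[1][1] / det, -p[0][1] / det], [-p[1][0] / det, p[0][0] / det]]

def one_minus_sigma_m_inv(m):
    """(1 - sigma M)^{-1}; sigma = [[0,1],[1,0]], so sigma*M swaps the rows of M."""
    p = [[(1.0 if i == j else 0.0) - m[1 - i][j] for j in range(2)] for i in range(2)]
    return inv2(p)

def c_matrix(m):
    """C = sigma (1 - sigma M)^{-1}: left multiplication by sigma swaps rows."""
    inv = one_minus_sigma_m_inv(m)
    return [inv[1], inv[0]]

def d_from_69(gamma, vecs, m):
    """D_ij = Gamma_ij + 2 (Gamma^(1)_j G^(0)_i + Gamma^(0)_j G^(1)_i), G_i = (1 - sigma M)^{-1} Gamma_i."""
    inv = one_minus_sigma_m_inv(m)
    g = [[sum(inv[a][b] * v[b] for b in range(2)) for a in range(2)] for v in vecs]
    return [[gamma[i][j] + 2.0 * (vecs[j][1] * g[i][0] + vecs[j][0] * g[i][1])
             for j in range(2)] for i in range(2)]

def d_from_610(gamma, vecs, m):
    """D_ij = Gamma_ij + 2 (Gamma_j, C Gamma_i)."""
    c = c_matrix(m)
    def form(u, v):
        return sum(u[a] * c[a][b] * v[b] for a in range(2) for b in range(2))
    return [[gamma[i][j] + 2.0 * form(vecs[j], vecs[i]) for j in range(2)] for i in range(2)]

if __name__ == "__main__":
    m = [[0.2, -0.1], [-0.1, 0.3]]          # symmetric sample M
    gamma = [[4.0, 0.3], [0.3, 0.05]]       # symmetric sample Gamma_ij
    vecs = [(0.7, -0.4), (0.02, 0.05)]      # sample vectors Gamma_1, Gamma_2
    print(c_matrix(m))
    print(d_from_69(gamma, vecs, m))
    print(d_from_610(gamma, vecs, m))
```

The symmetry of $C$ can also be seen directly: since $\sigma^{-1} = \sigma$, one has $C = (\sigma - \sigma M \sigma)^{-1}$, the inverse of a symmetric matrix.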

The matrix $C$ is symmetric, because $M$ is symmetric; as also $\Gamma$ is symmetric, then $D$ is a symmetric matrix (to all orders as (6.9) is exact): this agrees with the mentioned result (see comment following (4.2)) of Lochak and Sauzin on the symmetry of $D$. Setting $\Gamma^{(0)}_i = \sum_{n=1}^{\infty}\varepsilon^n\,\Gamma^{(0)(n)}_i$, $\Gamma_{ij} = \sum_{n=1}^{\infty}\varepsilon^n\,\Gamma^{(n)}_{ij}$, an explicit calculation yields in the case (5.10):

$$\Gamma^{(0)(1)}_2 = \tfrac{1}{4}\,\eta^{-1/2}\,\Gamma^{(1)}_{22} = 2\pi\,\eta^{-1}\,e^{-\frac{\pi}{2}\eta^{-1/2}}\big(1 + O(e^{-2\pi\eta^{-1/2}})\big). \tag{6.11}$$

An elementary (and somewhat surprising) calculation of the determinant yields a remarkably simple expression in which the (generically) leading first non trivial order of $\det D$ (i.e. the second) and the possibly non exponentially small contributions are evaluated without any approximation:

$$\det D = \Gamma^{(1)}_{11}\,\Gamma^{(1)}_{22}\,\varepsilon^2 - 2\,\big(\Gamma_{11}M_{11} - 2(\Gamma^{(0)}_1)^2\big)\,(\Gamma^{(1)}_2)^2\,\det C + \cdots; \tag{6.12}$$

the last $\cdots$ collects contributions to $\det D$ from the products in $\det D$ involving $\Gamma_{12}$, $\Gamma_{22}$ (for instance $-4\Gamma_{12}(\boldsymbol{\Gamma}_1, C\,\boldsymbol{\Gamma}_2)$ or $2\Gamma_{22}(\boldsymbol{\Gamma}_1, C\,\boldsymbol{\Gamma}_1)$, $-\Gamma_{12}^2$ and a few others, see Appendix R6) or the higher orders of $\Gamma_{11}\Gamma_{22}$; we used that $\Gamma^{(1)}_{12}$ and $\Gamma^{(1)}_{22}$ are in general equally exponentially small, while $\Gamma^{(1)}_{11}$ is not. The terms not written in (6.12) contain as factors (see Appendix A2 below) derivatives $\partial_2$ ("fast angle derivatives") of integrals of an analytic function of the $\tau_v$'s: if we imagine developing the kernels $A$ as Fourier series in $\boldsymbol{\alpha}$ we see that only Fourier components


$\boldsymbol{\nu} = (\nu_1, \nu_2)$ with $\nu_2 \neq 0$ can contribute. This means that the dependence on the $\tau_v$'s will be via the matrix elements $w(\tau_{v'}, \tau_v)$ of the pendulum Wronskian matrix and via $e^{i\sum_v \tau_v\,\boldsymbol{\omega}\cdot\boldsymbol{\nu}_v}$ with $\sum_v \nu_{v2} \neq 0$ (we say that "the dependence on time is via a total fast rotation"). So that by shifting the integration over $\tau_v$ to $\sim \pm i(\pi/2 - \eta^{1/2})$, i.e. almost as far out in the complex plane as the singularities of the $w$-functions allow, one sees that such terms contribute a quantity bounded to all orders $h$ proportionally to $e^{-\frac{\pi}{2}\eta^{-1/2}}$ because $|\boldsymbol{\omega}\cdot\boldsymbol{\nu}| > \eta^{-1/2} - N h\,\eta^{a}$ if $\nu_2 \neq 0$: this gives a good bound for $h$ not too large, e.g. $h < \eta^{-1/2}N^{-1}$ (actually even for $h \simeq \eta^{-1}N^{-1/2-a}$, see Appendix A2). For $h > \eta^{-1/2}N^{-1}$ one can invoke the convergence of the series for $\det D$ and obtain, for some constant $c$, a bound $(\eta^{-c}\varepsilon)^{\eta^{-1/2}N^{-1}}$ (much smaller than $e^{-\frac{\pi}{2}\eta^{-1/2}}$ because $\varepsilon = \mu\eta^{c}$, provided $c$ is large enough); see Appendix A2 for details. The conclusion is that the terms not written in (6.12) can be bounded by $\varepsilon^3\,e^{-\frac{\pi}{2}\eta^{-1/2}}\,O(\eta^{-3\beta})$, provided $c$ is large enough so that the sum of the bounds of the orders from 3 to $\eta^{-1/2}N^{-1}$ is dominated by the third order bound; see Appendix A2 for details. The value of $\beta$ can be taken $2(N+1) + 4d$ in terms of the degree $N$ of $f$ and of the constant $d$ in (2.2): it is explained by the singularities of the elementarily computable Fourier transforms of $\cos N\varphi(t)$ and $\sin N\varphi(t)$, see [GR]. A similar argument is in Sect. 8 of [G3].

Hence the leading value of $\det D$ is given, as $\eta \to 0$, by its first order expression $\Gamma^{(1)}_{11}\Gamma^{(1)}_{22}\,\varepsilon^2$ plus the apparently much larger $-2\big(\Gamma_{11}M_{11} - 2(\Gamma^{(0)}_1)^2\big)(\Gamma^{(1)}_2)^2\det C$. But we shall show that:

$$\big(\Gamma_{11}M_{11} - 2(\Gamma^{(0)}_1)^2\big)\,(\Gamma^{(1)}_2)^2\,\det C = O(\eta^{-4\beta})\,\varepsilon^4\,e^{-\frac{\pi}{2}\eta^{-1/2}}, \tag{6.13}$$

i.e. essentially 0, again because of the first factor being of order $O(\varepsilon^2\eta^{-2\beta}e^{-\frac{\pi}{2}\eta^{-1/2}})$, assuming (temporarily) convergence of the series for $\boldsymbol{G}$, $\boldsymbol{\Gamma}$, $\Gamma$, $M$ (see comments after (6.5)). The discussion of the convergence of the series for $\boldsymbol{\Gamma}$, $\Gamma$, $M$ is very non trivial, while one could show the convergence of the series for $\boldsymbol{G}$, following [Ge1]. However the series for $\det D$ converges and its convergence, which is absolutely essential, follows immediately from Theorem 1 above. It is remarkable that we can avoid proving the convergence of the power series for $\boldsymbol{G}$, $\boldsymbol{\Gamma}$, $\Gamma$, $M$ and get away with only the easily established convergence of $\det D$: in fact we can just use the above series as formal power series and that is all we really need (together with the convergence of $\det D$). After all, the identity (6.12) as an identity between formal power series and the formal bound (6.13) show that to all orders $\det D$ is exponentially bounded, and this is almost enough. We suggest, however, proceeding by first assuming convergence of the series for $\boldsymbol{G}$, $\boldsymbol{\Gamma}$, $\Gamma$, $M$ for $\varepsilon\eta^{-c}$ small enough, and only on a second reading checking that formal power series considerations (plus analyticity of $\det D$) suffice: a technique used in [G3], Sect. 8. Here the difficulty is not the bounds but the cancellations, and assuming convergence removes unessential worries and clarifies the algebra.

For instance in the case (5.10), with $a > 0$, one has $\Gamma^{(1)}_{11} = 4 + O(\eta^{a})$, by direct computation, and, by (6.11), $\Gamma^{(1)}_{22} = 8\pi\,\eta^{-1/2}e^{-\frac{\pi}{2}\eta^{-1/2}}$, so that the leading term in (6.12) is found to be:

$$\det D = 32\,\pi\,\eta^{-1/2}\,\varepsilon^2\,e^{-\frac{\pi}{2}\eta^{-1/2}}. \tag{6.14}$$
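The structure of the second term in (6.12) and the arithmetic behind (6.14) can be checked with a short script. This is a sketch on arbitrary sample numbers; the function names are hypothetical and nothing here is computed from the actual graph expansions. It uses the fact that, for $C = \sigma(1-\sigma M)^{-1}$, the entry $C_{11}$ equals $-M_{11}\det C$, so that when $\Gamma_{12}$, $\Gamma_{22}$ and $\Gamma^{(0)}_2$ are set to zero the determinant of $D$ reduces exactly to the second term of (6.12).

```python
import math

def c_matrix(m):
    """C = sigma (1 - sigma M)^{-1} for a 2x2 matrix M, with sigma = [[0,1],[1,0]]."""
    p = [[(1.0 if i == j else 0.0) - m[1 - i][j] for j in range(2)] for i in range(2)]
    det = p[0][0] * p[1][1] - p[0][1] * p[1][0]
    inv = [[p[1][1] / det, -p[0][1] / det], [-p[1][0] / det, p[0][0] / det]]
    return [inv[1], inv[0]]

def det_d(gamma, v1, v2, m):
    """Determinant of D_ij = Gamma_ij + 2 (Gamma_j, C Gamma_i), cf. (6.10)."""
    c = c_matrix(m)
    vecs = (v1, v2)
    def form(u, v):
        return sum(u[a] * c[a][b] * v[b] for a in range(2) for b in range(2))
    d = [[gamma[i][j] + 2.0 * form(vecs[j], vecs[i]) for j in range(2)] for i in range(2)]
    return d[0][0] * d[1][1] - d[0][1] * d[1][0]

def second_term_612(gamma11, m, v1, v2):
    """-2 (Gamma_11 M_11 - 2 (Gamma^(0)_1)^2) (Gamma^(1)_2)^2 det C."""
    c = c_matrix(m)
    det_c = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return -2.0 * (gamma11 * m[1][1] - 2.0 * v1[0] ** 2) * v2[1] ** 2 * det_c

def leading_det_d(eta, eps):
    """Leading term (6.14): 32 pi eta^{-1/2} eps^2 exp(-pi/2 eta^{-1/2})."""
    return 32.0 * math.pi * eta ** -0.5 * eps ** 2 * math.exp(-0.5 * math.pi * eta ** -0.5)

if __name__ == "__main__":
    m = [[0.2, -0.1], [-0.1, 0.3]]        # sample symmetric M
    gamma = [[4.0, 0.0], [0.0, 0.0]]      # Gamma_12 = Gamma_22 = 0 isolates the second term
    v1, v2 = (0.7, -0.4), (0.0, 0.05)     # Gamma^(0)_2 = 0 as well
    print(det_d(gamma, v1, v2, m), second_term_612(4.0, m, v1, v2))
    print(leading_det_d(0.01, 1e-3))
```

In the actual problem the remaining terms are not zero but exponentially small, which is what the bounds discussed above control.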


We also take $a = \frac{1}{2}$ in the following calculations to simplify notations: the general case is obtained by replacing the coefficients explicitly appearing in the following formulae, $\omega$ by $\eta^{a}$ and $\omega^2$ by $\eta^{\frac{1}{2}+a}$ respectively. We recall that we set $\varphi(t) \equiv \varphi_0(t)$. The reason why (6.13) holds is that if $\omega \stackrel{\rm def}{=} \eta^{1/2}$:

$$\Gamma_{11} = \frac{4}{\omega}\,\Gamma^{(0)}_1 - \frac{1}{\omega^2}\,\Gamma_{12}, \qquad M_{11} = \frac{\omega}{2}\Big(\Gamma^{(0)}_1 + \frac{1}{\omega^2}\,\Gamma^{(0)}_2\Big). \tag{6.15}$$

To prove (6.15) note that $\sum_{\vartheta,y}\int A^{10}_{v_0 y}\,w_{00}(\tau_y) = \sum_{\vartheta,y}\int A^{01}_{v_0 y}\,w_{00}(\tau_{v_0})$ by the symmetry of $O_0$; hence

$$\Gamma_{11} = 2\sum_{\vartheta,y}\int A^{11}_{v_0 y}, \qquad \Gamma^{(0)}_1 = \sum_{\vartheta,y}\int A^{10}_{v_0 y}\,w_{00}(\tau_y). \tag{6.16}$$

Thinking of (6.16) as sums of graphs we see that to each graph with $n$ nodes contributing to $\Gamma_{11}$ with the node $y$ marked 1 (see the definition preceding (6.3)) as in (6.16) there correspond two graphs contributing, if $y < v_0$, to $\Gamma^{(0)}_1$. Namely the one with the node $y$ marked 0 and the one obtained by deleting the mark 0 on $y$, adding a new node $y'$ marked 0 on the line coming out of $y$ and with index $\delta_{y'} = 0$. If $y = v_0$ we associate with it only the first of the above graphs. Note that the second graph has $n + 1$ nodes. Suppose that $y$ is an endnode in a graph $\vartheta$ associated with $\Gamma_{11}$: by the definition of the $A$-kernels we must evaluate, when computing the contribution to $\Gamma_{11}$ from such a graph, the quantity:

$$O_0(\partial_{10} f) = \frac{1}{\omega}\,O_0(\partial_\tau\partial_0 f) + \frac{1}{\omega}\,O_0(-\dot\varphi\,\partial_{00} f) - \frac{1}{\omega^2}\,O_0(\partial_{20} f), \tag{6.17}$$

where $\omega \stackrel{\rm def}{=} \eta^{1/2}$ (recall also that we take $a = \frac{1}{2}$) and having used $\partial_1 \equiv \frac{1}{\omega}(\partial_\tau - \dot\varphi\,\partial_0) - \frac{1}{\omega^2}\partial_2$ (because the derivatives act on functions of the special form $F(\varphi(t), \boldsymbol{\omega} t)$ with $F(\varphi, \boldsymbol{\alpha})$ suitable). Since $\dot\varphi = -2w_{00}$ we see that the second term is $\frac{2}{\omega}O_0(w_{00}\,\partial_{00} f)$: hence it appears in the evaluation of the first among the two corresponding graph contributions to $\frac{4}{\omega}\Gamma^{(0)}_1$. The second corresponding graph contribution to $\frac{4}{\omega}\Gamma^{(0)}_1$ will require computing:

$$-\frac{4}{\omega}\,O_0\big(w_{00}\,\partial_0^3 f_0\,O_0(\partial_0 f)\big) = \frac{2}{\omega}\,O_0\big(\dot\varphi\,\partial_0^3 f_0\,O_0(\partial_0 f)\big). \tag{6.18}$$

The combinatorial coefficient associated with this graph is $(n+1)!$ rather than $n!$: but one checks that this is what is necessary to verify that the difference between the contributions to $\Gamma_{11}$ and $\frac{4}{\omega}\Gamma^{(0)}_1$ due to the three graphs considered requires computing:

$$\frac{1}{\omega}\,O_0\big(\partial_{\tau_y}\partial_0 f - \dot\varphi\,\partial_0^3 f_0\,O_0(\partial_0 f)\big) - \frac{1}{\omega^2}\,O_0(\partial_{02} f). \tag{6.19}$$

In Appendix A3 we prove for all odd $F$ the following commutation relation:

$$\partial_{\tau_{y'}} O_0(F) = O_0\big(\partial_{\tau_y} F - \dot\varphi\,\partial_0^3 f_0\,O_0(F)\big) \tag{6.20}$$

which, noting that below we only apply (6.20) for odd $F$'s, allows us to rewrite (6.19) as:

$$\frac{1}{\omega}\,\partial_{\tau_{y'}} O_0(\partial_0 f) - \frac{1}{\omega^2}\,O_0(\partial_{02} f). \tag{6.21}$$

We proceed by picking a new endnode (if any) and by repeating the construction until all endnodes are exhausted. The difference between the contributions to $\Gamma_{11}$ and $\frac{4}{\omega}\Gamma^{(0)}_1$ considered so far will therefore be given by a collection of graph values contributing to $\Gamma_{11}$ with no marked node and with one of the endnodes requiring the evaluation of

$$\frac{1}{\omega}\,\partial_{\tau_{y'}} O_0(\partial_0 f) - \frac{1}{\omega^2}\,O_0(\partial_{20} f). \tag{6.22}$$

Consider now a node $y$ marked 1, which is next to an endnode; and suppose that there are $p$ lines linking it to the endnodes. The differences between the contribution of the considered graphs to $\Gamma_{11}$ and of the corresponding ones to $\frac{4}{\omega}\Gamma^{(0)}_1$ plus the already evaluated differences due to the previously considered graphs will be, proceeding as before:

$$\begin{aligned}
&\frac{1}{\omega}\,O_0\big((\partial_\tau\partial_0^{p+1} f)\,O_0(\partial_0 f)^p\big) + \frac{1}{\omega}\sum_{j=1}^{p} O_0\big(\partial_0^{p+1} f\;O_0(\partial_0 f)^{p-1}\,\partial_{\tau_y} O_0(\partial_0 f)\big)\\
&\quad + \frac{1}{\omega}\,O_0\big(-\dot\varphi\,\partial_0^3 f_0\,O_0(\partial_0^{p+1} f\;O_0(\partial_0 f)^p)\big) - \frac{1}{\omega^2}\,O_0\big(\partial_2\partial_0^{p+1} f\;O_0(\partial_0 f)^p\big)\\
&\quad - \frac{1}{\omega^2}\sum_{j=1}^{p} O_0\big(\partial_0^{p+1} f\;O_0(\partial_0 f)^{p-1}\,O_0(\partial_{20} f)\big),
\end{aligned} \tag{6.23}$$

which is, by (6.20):

$$\frac{1}{\omega}\,\partial_{\tau_{y'}} O_0\big(\partial_0^{p+1} f\;O_0(\partial_0 f)^p\big) - \frac{1}{\omega^2}\,\partial_2\,O_0\big(\partial_0^{p+1} f\;O_0(\partial_0 f)^p\big). \tag{6.24}$$

We proceed in the same way until we reach the first node and, in fact, we can treat in the same way also the first node except that in this case there is only one corresponding graph in $\Gamma^{(0)}_1$. There is no need, this time, of using the commutation relation (6.20) because the $\tau_{v_0}$-integral is a simple integral not involving $O_0$ operations any more. Hence eventually we find that the differences between the contributions to $\Gamma_{11}$ and $\frac{4}{\omega}\Gamma^{(0)}_1$ add up to:

$$\frac{2}{\omega}\int \partial_t A^{1}_{v_0} - \frac{2}{\omega^2}\int \partial_2 A^{1}_{v_0} = -\frac{2}{\omega^2}\int \partial_2 A^{1}_{v_0} = -\frac{1}{\omega^2}\,\Gamma_{12}, \tag{6.25}$$

where one should note that the first term in the l.h.s. vanishes: this does not immediately follow from it being an integral of a derivative because the integrals are improper: one checks this by (5.1) and the fact that $A^{1}_{v_0}$ is even (at $\boldsymbol{\alpha} = \boldsymbol{0}$).

In a similar way one shows that $M_{11} = \frac{\omega}{2}\big(\Gamma^{(0)}_1 + \frac{1}{\omega^2}\Gamma^{(0)}_2\big)$. One proceeds by writing $\Gamma^{(0)}_1$ as $\Gamma^{(0)}_1 = \sum_{\vartheta,y}\int A^{01}_{v_0 y}\,w_{00}(\tau_{v_0})$, and marking 0 the first node for both $\Gamma^{(0)}_1$ and $M_{11}$. Then note that to each graph contributing to $\Gamma^{(0)}_1$ with the node $y$ marked 1 there correspond two graphs contributing to $M_{11}$ also for $y = v_0$, when $\delta_{v_0} = 1$, and they are constructed by following the same prescription given after (6.16); moreover to each graph contributing to $\Gamma^{(0)}_1$ with $\delta_{v_0} = 0$ there correspond two graphs contributing to $M_{11}$, namely the one with the node $v_0$ marked 0 twice and the one obtained by adding a new node $v_0'$ (which becomes the new first node), marked 0 twice and with $\delta_{v_0'} = 0$, on the line coming out of $y$. Then proceed as before, with the only difference that, at the last step, when dealing with the contributions arising from the graphs with $\delta_{v_0} = 0$, one has to use the identity

$$O_0(w_{00}^2\sin\varphi) = -\frac{1}{2}\,\frac{\sinh t}{\cosh^2 t} = \frac{1}{2}\,\frac{d}{dt}\,\frac{1}{\cosh t} = -\frac{1}{4}\,\ddot\varphi.$$

This can be proved in the same way as the commutation relation (6.20); see Appendix A3. This completes the proof that the cancellation discussed in Sect. 5 is in fact taking place to all orders of perturbation theory.
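The elementary equalities in the last identity, and the relation $\dot\varphi = -2w_{00}$ used in (6.17), can be checked numerically. The sketch below assumes the standard pendulum separatrix $\varphi(t) = 4\arctan e^{-t}$ with $g = 1$ and $w_{00}(t) = 1/\cosh t$; this normalization is an assumption made here for illustration (it is consistent with $\dot\varphi = -2w_{00}$), and the function names are hypothetical. The operator $O_0$ itself is not evaluated.

```python
import math

def phi(t):
    """Assumed separatrix motion: phi(t) = 4 arctan(exp(-t)), pendulum with g = 1."""
    return 4.0 * math.atan(math.exp(-t))

def w00(t):
    """Assumed Wronskian element w00(t) = 1 / cosh(t)."""
    return 1.0 / math.cosh(t)

def first_derivative(f, t, h=1e-4):
    """Central finite-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

def second_derivative(f, t, h=1e-4):
    """Central finite-difference approximation of f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)

def max_deviation(times):
    """Largest mismatch among the three expressions of the identity and phi' = -2 w00."""
    worst = 0.0
    for t in times:
        closed_form = -0.5 * math.sinh(t) / math.cosh(t) ** 2
        via_w00 = 0.5 * first_derivative(w00, t)
        via_phi = -0.25 * second_derivative(phi, t)
        velocity_gap = first_derivative(phi, t) + 2.0 * w00(t)
        worst = max(worst, abs(closed_form - via_w00), abs(closed_form - via_phi), abs(velocity_gap))
    return worst

if __name__ == "__main__":
    print(max_deviation([-3.0, -1.0, -0.2, 0.0, 0.5, 2.0]))
```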


7. The Splitting in Anisochronous Cases

We now proceed to the analysis of the anisochronous case (1.1). We recall the form of the Hamiltonian:

$$H = \eta^{a} A + \frac{1}{\eta^{1/2}}\,B + \eta^{2a}\,\frac{A^2}{2} + \frac{I^2}{2} + g^2(\cos\varphi - 1) + \varepsilon f(\varphi, \alpha, \lambda), \tag{7.1}$$

where $\varepsilon = \mu\eta^{c}$ and $c$ will be chosen large enough. This model belongs, see [T, G2], to a well studied class of models ("Thirring models") and it is a simplified version of the Hamiltonian considered in [CG] in Sect. 12. If we added to (7.1) a further "monochromatic" term $f_0(\varphi, \alpha, \lambda)$ which has Fourier harmonics $\boldsymbol{\nu}$, for $(\alpha, \lambda)$, multiples of a given fast harmonic $\boldsymbol{\nu}_0$ (i.e. a harmonic with the component $\nu_{02}$ different from 0) it would offer all the difficulties of that case in spite of being much simpler analytically. We shall not attempt the analysis of such a more general Hamiltonian: the error in [CG] eliminated almost completely this problem which, hence, has to be studied again if one wants to recover the results of that paper.

Consider, as an example, the case (5.10). This time the first problem is to find for how many values of $A$ one can have, for $\varepsilon$ small, an invariant hyperbolic torus with rotation number:

$$\boldsymbol{\omega} = (\eta^{a} + \eta^{2a}A,\ \eta^{-1/2}) \tag{7.2}$$

and $\varepsilon$-close to an unperturbed one. For Hamiltonians like (7.1) the average position in the $A$ variables of the torus with rotation (7.2) will be exactly $A$ (this is a general property of "twistless tori", as discussed in [G3, G4]). The average position in the $B$ variables will be chosen so that the energy of the torus is a fixed value, e.g. 0: this can be done because the fast variable $\lambda$ conjugate to $B$ is still isochronous (see Sect. 5 in [CG] for the more general case in which also $\lambda$ is anisochronous, when the analysis is somewhat more involved). Of course $\boldsymbol{\omega}$ must verify a Diophantine condition: but in view of using the result to show that the tori resisting the perturbation, i.e. the ones described by Remark (7) to Theorem 1 above, are dense enough to build long chains of tori joined by heteroclinic trajectories, "heteroclinic chains", we must consider tori that verify a very generous Diophantine condition, compared to (2.2).
Since we shall not discuss the ambitious application attempted in [CG] we just allow rotation vectors verifying (2.2) with a fixed, possibly very large, $d$.

From Lemmata 1, 1' in Sect. 5 of [CG], all tori with rotation $\boldsymbol{\omega} \stackrel{\rm def}{=} \boldsymbol{\omega}(A) \stackrel{\rm def}{=} (\eta^{a} + \eta^{2a}A,\ \eta^{-1/2})$ verifying:

$$|\boldsymbol{\omega}\cdot\boldsymbol{\nu}| > C^{-1}\eta^{d}\,|\boldsymbol{\nu}|^{-3} \stackrel{\rm def}{=} C(\eta)\,|\boldsymbol{\nu}|^{-3} \tag{7.3}$$

will survive the perturbation if the parameter $c$ in the definition $\varepsilon = \mu\eta^{c}$ of the coupling constant is large enough: so that $\varepsilon\,C(\eta)^{-q} < \varepsilon_0$ for some $\varepsilon_0$ and some $q > 0$. The splitting theory is "insensitive", at given $\boldsymbol{\omega}$, to the presence or absence of the isochrony breaking term $\frac{1}{2}\eta^{2a}A^2$ in (7.1). We discuss this delicate point below, for general perturbation $f$, see (5.9). The homoclinic splitting is given by (6.14) with no extra leading terms. The only effect of the anisochrony is to introduce a few gaps in the foliations of phase space into stable and unstable manifolds: but it has also the advantage that we no longer must be careful about the values of $\eta$. Anisochrony guarantees that the Diophantine condition holds for "many" values of $A$.
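To get a feeling for condition (7.3), one can scan the integer vectors $\boldsymbol{\nu}$ up to a cutoff and record the smallest value of $|\boldsymbol{\omega}\cdot\boldsymbol{\nu}|\,|\boldsymbol{\nu}|^{3}$. The sketch below is illustrative only: the function names, the choice $|\boldsymbol{\nu}| = |\nu_1| + |\nu_2|$ for the norm and the finite cutoff are all assumptions made here, and a finite scan can of course never prove a Diophantine property; it can only exhibit violations within the scanned range.

```python
import math

def rotation_vector(eta, a, action):
    """omega(A) = (eta^a + eta^(2a) A, eta^(-1/2)), as in (7.2)."""
    return (eta ** a + eta ** (2 * a) * action, eta ** -0.5)

def worst_small_divisor(omega, cutoff):
    """Minimum of |omega . nu| * |nu|^3 over nonzero integer nu with |nu_i| <= cutoff.

    Assumption: |nu| is taken to be |nu_1| + |nu_2|.
    Within the scanned range, (7.3) holds if and only if the result exceeds C(eta).
    """
    worst = math.inf
    for n1 in range(-cutoff, cutoff + 1):
        for n2 in range(-cutoff, cutoff + 1):
            if n1 == 0 and n2 == 0:
                continue
            norm = abs(n1) + abs(n2)
            worst = min(worst, abs(omega[0] * n1 + omega[1] * n2) * norm ** 3)
    return worst

if __name__ == "__main__":
    golden = (1.0, (1.0 + math.sqrt(5.0)) / 2.0)   # a classical Diophantine example
    resonant = (1.0, 2.0)                          # omega . (2, -1) = 0
    print(worst_small_divisor(golden, 50))
    print(worst_small_divisor(resonant, 50))
    print(worst_small_divisor(rotation_vector(0.01, 0.5, 0.3), 50))
```

Scanning `rotation_vector(eta, a, A)` over a grid of values of `A` gives a rough picture of how many actions satisfy the condition for a given threshold.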


Turning to the main point of this section (and of the whole paper) we prove that usually the first order ("Mel'nikov integral") dominates the splitting. Again the technique will be based on the general theory of [G3]. In the anisochronous case the graph labels have to be extended, see (5.7), (5.14) and [G3]. On each node $v$ one adds a further node label $j_v = 0, 1$ (which in the isochronous case would be $j_v \equiv 0$) and this has the effect that in the definition of $A$ one replaces:

$$\begin{aligned}
\partial_0^{\,p_v+1} f_{\delta_v} &\to \partial_{j_v}\partial_{j_{v_1}}\cdots\partial_{j_{v_{p_v}}} f_{\delta_v} && \text{if } v < v_0,\\
\partial_0^{\,p_{v_0}} f_{\delta_{v_0}} &\to \partial_{j_{v_1}}\cdots\partial_{j_{v_{p_v}}} f_{\delta_{v_0}} && \text{if } v = v_0,
\end{aligned} \tag{7.4}$$

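where $p_v$ is the number of nodes $v_1, \dots, v_{p_v}$ preceding $v$. Furthermore the kernels $w(\tau_{v'}, \tau_v)$ become node dependent, $w_v(\tau_{v'}, \tau_v)$, and equal to $w(\tau_{v'}, \tau_v)$ if $j_v = 0$ and to $\eta^{2a}(\tau_{v'} - \tau_v)$ if $j_v = 1$. The first component $X_1(t; \boldsymbol{\alpha})$ of $\boldsymbol{X}_\downarrow(t; \boldsymbol{\alpha})$ will not vanish. Let us define $\alpha(t)$ as $\alpha + \omega_1 t + \sum_{k=1}^{\infty}\varepsilon^k X_1^k(t; \boldsymbol{\alpha})$, where $X_1^k$ is the first (and only non-vanishing) component of $\boldsymbol{X}^k_\downarrow$. The contributions to the splitting $Q_j(\boldsymbol{\alpha})$ due to fruitless trees will be $2\sum_\vartheta\int A^{j}_{v_0}$, with the same notations of Sect. 6. The full splitting will be:

$$Q_j(\boldsymbol{\alpha}) = 2\sum_{\vartheta}\int A^{j}_{v_0} + 2\sum_{\vartheta;\,y,r}\int A^{j,[r]}_{v_0,y}\,w_r(\tau_y)\,G^{(r)}(\boldsymbol{\alpha}) + \cdots \tag{7.5}$$

here $[r] = 0$ if $r = 0, 1$ and $[r] = 1$ if $r = 2, 3$, and:

$$w_0(\tau) = w_{00}(\tau), \quad w_1(\tau) = |w_{03}(\tau)|, \quad w_2(\tau) = \eta^{2a}, \quad w_3(\tau) = \eta^{2a}|\tau| \tag{7.6}$$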

with $\boldsymbol{G} = (G^{(0)}, G^{(1)}, G^{(2)}, G^{(3)})$ representing the fruit values, defined as in Sect. 6, for fruits which now can carry also a label 2, 3 on the first node: the latter values correspond to the fruits carrying label $j_v = 1$ (2 corresponds to a dry fruit and 3 to a ripe fruit): the choices of the $w_2$, $w_3$ arise from the form of the operator corresponding to $O$ for the nodes with the new labels, called $I^2$ as in [G3]. In complete analogy with Sect. 6 the $\boldsymbol{G}$ verify Dyson equations. If we set:

$$\Gamma^{(r)}(\boldsymbol{\alpha}) = \sum_{\vartheta}\int A^{[r]}_{v_0}\,w_r(\tau_{v_0}), \qquad M_{rs} = \sum_{\vartheta;\,y}\int A^{[r],[s]}_{v_0,y}\,(\sigma\boldsymbol{w})_r(\tau_{v_0})\,(\sigma\boldsymbol{w})_s(\tau_y), \tag{7.7}$$

where the matrix $\sigma$ is defined to be $\sigma_{rs} = 0$ except for the matrix elements $\sigma_{01} = \sigma_{10} = \sigma_{23} = \sigma_{32} = 1$, then:

$$G^{(r)}(\boldsymbol{\alpha}) = \Gamma^{(r)}(\boldsymbol{\alpha}) + \sum_{\vartheta;\,y,s}\int (\boldsymbol{w})_r(\tau_{v_0})\,A^{[r],[s]}_{v_0,y}\,(\sigma\boldsymbol{w})_s(\tau_y)\,G^{(s)}(\boldsymbol{\alpha}) + \cdots, \tag{7.8}$$

where the dots represent contributions from the graphs with more than one fruit, while the terms explicitly written represent the contributions from the graphs with no fruits or with just one fruit. At the homoclinic point the derivatives $\boldsymbol{G}_j = \partial_j\boldsymbol{G}$, $\boldsymbol{\Gamma}_j = \partial_j\boldsymbol{\Gamma}$ verify exactly:

$$\boldsymbol{G}_j = \boldsymbol{\Gamma}_j + \sigma M\,\boldsymbol{G}_j, \qquad \boldsymbol{G}_j = (1 - \sigma M)^{-1}\,\boldsymbol{\Gamma}_j \tag{7.9}$$

(compare with (6.6)).
Separatrix Splitting for Systems with Three Time Scales

221

P

R

ϑ;y

Aj,i v0 ,y , i, j = 1, 2

Dij = 0ij + 2( 0 j , σ G i ) = 0ij + 2( 0 j , C 0 i ).

(7.10)

The intersection matrix will be, setting as in Sect. 6, 0ij = 2 and C = σ(1 − M σ)−1 :

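The convergence of the above series, (7.5)÷(7.8), and the estimate of their remainders is discussed as in Sect. 6: see Appendix A2. The above equation is not sufficient this time: there are in fact too many variables. There are however several relations between the matrix elements of $M$ and $\boldsymbol{\Gamma}$, $\boldsymbol{G}$. In fact $M$ is symmetric for the same reasons as the corresponding matrix in Sect. 6: i.e. by using the symmetry of the $O_0$ and $I^2_0$ operators (see (5.13) and (5.15)). If $\lambda \stackrel{\rm def}{=} \frac{\eta^{2a}}{2\omega}$, $\omega = \eta^{a}$, the relations are, up to terms exponentially small in $\eta$:

$$\begin{aligned}
&\Gamma^{(2)}_1 = \lambda\,\Gamma^{(0)}_1, \qquad G^{(2)}_1 = Z\lambda\,G^{(0)}_1, \qquad M_{11} = \frac{\omega}{2}\,\Gamma^{(0)}_1,\\
&M_{03} = \lambda M_{01} - \lambda\Lambda_0, \qquad M_{23} = \lambda M_{12} - \lambda\Lambda_1,\\
&M_{33} = \lambda^2 M_{11}, \qquad M_{31} = \lambda M_{11},
\end{aligned} \tag{7.11}$$

where $\Lambda_0 = (2\eta^{1/2})^{-1}\,\partial_2\sum\int |w_{03}(\tau_{v_0})|\,A^{0}_{v_0}$ and $\Lambda_1 = (2\eta^{1/2})^{-1}\,\partial_2\sum\int \eta^{2a}|\tau_{v_0}|\,A^{0}_{v_0}$ and $Z = \frac{1 - \Lambda_0}{1 + \lambda\Lambda_1}$. The relations among the $M$ elements are proved by the same argument discussed in Sect. 6 for the first of (7.11); one should use also the relation $O_0(w_{00}|w_{03}|\sin\varphi) = |\dot w_{03}|/2$, proven in Appendix A3. The constant $Z$ arises solving by iteration (7.9): the structure of the matrix $\sigma M$ and of its powers is, given the relations between the $M_{ij}$ in (7.11), such that the first and third components of $\boldsymbol{G}$ are proportional via the constant $\lambda Z$; see Appendix A4 for details.

Equation (7.11) allows us to reduce the size of the vectors $\boldsymbol{G}$, $\boldsymbol{\Gamma}$ and of the matrix $M$. We shall denote with a tilde the new vectors and matrices. Introduce $\tilde{\boldsymbol{G}} = (G^{(0)}, G^{(1)}, G^{(3)})$, $\tilde{\boldsymbol{\Gamma}} = (\Gamma^{(0)}, \Gamma^{(1)}, \Gamma^{(3)})$, and $\tilde M$, $N$ as:

$$\tilde M = \begin{pmatrix}
M_{10} + Z\lambda M_{12} & M_{11} & \lambda M_{11}\\
M_{00} + Z\lambda M_{02} & M_{01} & \lambda M_{01} - \lambda\Lambda_0\\
M_{20} + Z\lambda M_{22} & M_{21} & \lambda M_{21} - \lambda\Lambda_1
\end{pmatrix}, \qquad
N = \begin{pmatrix} 0 & 1 & \lambda\\ 1 & 0 & 0\\ Z\lambda & 0 & 0 \end{pmatrix}. \tag{7.12}$$

Equations (7.9), (7.10) become respectively:

$$\tilde{\boldsymbol{G}} = (1 - \tilde M)^{-1}\,\tilde{\boldsymbol{\Gamma}}, \qquad D_{ij} = \Gamma_{ij} + 2\big(\tilde{\boldsymbol{\Gamma}}_i,\ N(1 - \tilde M)^{-1}\,\tilde{\boldsymbol{\Gamma}}_j\big), \tag{7.13}$$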

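noting that the matrix $\tilde C = N(1 - \tilde M)^{-1}$ is symmetric (because $C$ in (7.10) is symmetric) and that it has the second and third row proportional one deduces, analogously to (6.12), that:

$$\begin{aligned}
\det D = \Gamma_{11}\Gamma_{22} &+ 2(\Gamma^{(1)}_2)^2\big(\Gamma_{11}\tilde C_{11} + 2(\Gamma^{(0)}_1)^2\Delta_{00,11}\big)
+ 2(\Gamma^{(3)}_2)^2\big(\Gamma_{11}\tilde C_{33} + 2(\Gamma^{(0)}_1)^2\Delta_{00,33}\big)\\
&+ 4\,\Gamma^{(1)}_2\Gamma^{(3)}_2\big(\Gamma_{11}\tilde C_{13} + 2(\Gamma^{(0)}_1)^2\Delta_{00,13}\big),
\end{aligned} \tag{7.14}$$

where $\Delta_{00,11}$, $\Delta_{00,33}$, $\Delta_{00,13}$ denote the determinants of the matrices:

$$\begin{pmatrix} \tilde C_{00} & \tilde C_{01}\\ \tilde C_{10} & \tilde C_{11} \end{pmatrix}, \qquad
\begin{pmatrix} \tilde C_{00} & \tilde C_{03}\\ \tilde C_{30} & \tilde C_{33} \end{pmatrix}, \qquad
\begin{pmatrix} \tilde C_{00} & \tilde C_{03}\\ \tilde C_{10} & \tilde C_{13} \end{pmatrix}. \tag{7.15}$$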


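To compute all the above quantities we note that if we set (see (7.12)) $a = M_{11}$, $b = M_{01}$, $c = M_{12}$ and $x = M_{10} + Z\lambda M_{12} = b + Z\lambda c$, $y = M_{00} + Z\lambda M_{02}$, $z = M_{20} + Z\lambda M_{22}$, $\Lambda_0 = -b'$, $\Lambda_1 = -c'$, $Z = (1 + b')(1 - \lambda c')^{-1}$:

$$1 - \tilde M = \begin{pmatrix}
1 - x & -a & -\lambda a\\
-y & 1 - b & -\lambda b - \lambda b'\\
-z & -c & 1 - \lambda c - \lambda c'
\end{pmatrix}, \tag{7.16}$$

hence $\det(1 - \tilde M) = -\big((y + \lambda z Z)\,a - (1 - x)^2\big)(1 - \lambda c') \stackrel{\rm def}{=} \Delta$, and $(1 - \tilde M)^{-1}$ is:

$$\frac{1}{\Delta}\begin{pmatrix}
(1 - b)(1 - \lambda\tilde c) - \lambda\tilde b c & a(1 - \lambda c') & \lambda a(1 + b')\\
y(1 - \lambda\tilde c) + \lambda z\tilde b & (1 - x)(1 - \lambda\tilde c) - \lambda a z & (1 - x)\lambda\tilde b + \lambda a y\\
y c + (1 - b) z & (1 - x) c + a z & (1 - x)(1 - b) - a y
\end{pmatrix}, \tag{7.17}$$

where $\tilde b \stackrel{\rm def}{=} b + b'$, $\tilde c \stackrel{\rm def}{=} c + c'$. Thus the matrix $\tilde C = N(1 - \tilde M)^{-1}$ is:

$$\tilde C = \frac{1}{\Delta}\begin{pmatrix}
y(1 - \lambda c') + \lambda z(1 + b') & (1 - x)(1 - \lambda c') & \lambda(1 - x)(1 + b')\\
(1 - x)(1 - \lambda c') & a(1 - \lambda c') & \lambda a(1 + b')\\
\lambda(1 - x)(1 + b') & \lambda a(1 + b') & \lambda^2 a(1 + b')Z
\end{pmatrix}. \tag{7.18}$$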
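The closed forms (7.16)÷(7.18) are pure linear algebra and can be cross-checked numerically. The sketch below plugs arbitrary sample values into $a, b, c, y, z, b', c', \lambda$ (these numbers and all function names are assumptions for illustration), builds $1 - \tilde M$ and $N$, and compares the determinant and the product $N(1 - \tilde M)^{-1}$ computed by cofactors with the stated formulas.

```python
def det3(m):
    """Determinant of a 3x3 matrix (nested lists), expanded along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    d = det3(m)
    def cofactor(i, j):
        rows = [r for r in range(3) if r != i]
        cols = [k for k in range(3) if k != j]
        minor = m[rows[0]][cols[0]] * m[rows[1]][cols[1]] - m[rows[0]][cols[1]] * m[rows[1]][cols[0]]
        return (-1) ** (i + j) * minor
    return [[cofactor(j, i) / d for j in range(3)] for i in range(3)]

def matmul3(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def build(a, b, c, y, z, b1, c1, lam):
    """Return Z, x, the matrix 1 - M~ of (7.16) and N of (7.12); b1, c1 stand for b', c'."""
    Z = (1.0 + b1) / (1.0 - lam * c1)
    x = b + Z * lam * c
    one_minus_m = [[1.0 - x, -a, -lam * a],
                   [-y, 1.0 - b, -lam * b - lam * b1],
                   [-z, -c, 1.0 - lam * c - lam * c1]]
    n = [[0.0, 1.0, lam], [1.0, 0.0, 0.0], [Z * lam, 0.0, 0.0]]
    return Z, x, one_minus_m, n

def delta_formula(a, x, y, z, Z, lam, c1):
    """Delta = -((y + lam z Z) a - (1 - x)^2)(1 - lam c')."""
    return -((y + lam * z * Z) * a - (1.0 - x) ** 2) * (1.0 - lam * c1)

def c_tilde_formula(a, x, y, z, Z, b1, c1, lam, delta):
    """Matrix (7.18)."""
    u, v = 1.0 - lam * c1, 1.0 + b1
    rows = [[y * u + lam * z * v, (1.0 - x) * u, lam * (1.0 - x) * v],
            [(1.0 - x) * u, a * u, lam * a * v],
            [lam * (1.0 - x) * v, lam * a * v, lam ** 2 * a * v * Z]]
    return [[e / delta for e in row] for row in rows]

if __name__ == "__main__":
    a, b, c, y, z, b1, c1, lam = 0.3, 0.2, -0.1, 0.4, 0.25, 0.05, -0.07, 0.5
    Z, x, p, n = build(a, b, c, y, z, b1, c1, lam)
    delta = delta_formula(a, x, y, z, Z, lam, c1)
    print(det3(p), delta)
    print(matmul3(n, inv3(p)))
    print(c_tilde_formula(a, x, y, z, Z, b1, c1, lam, delta))
```

The same script makes the two structural facts used in the text visible: $\tilde C$ is symmetric, and its third row is $Z\lambda$ times its second row.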

Noting that $\Delta_{00,11} = -\Delta^{-1}(1 - \lambda c')$, $\Delta_{00,33} = Z^2\lambda^2\Delta_{00,11}$, $\Delta_{00,13} = Z\lambda\,\Delta_{00,11}$ and $\tilde C_{11} = \frac{a(1 - \lambda c')}{\Delta}$, $\tilde C_{33} = (Z\lambda)^2\tilde C_{11}$, $\tilde C_{13} = Z\lambda\,\tilde C_{11}$ with $a = M_{11}$ (not to be confused with $a$ in (7.1)) we get, for a suitable $\beta > 0$, for $\det D$:

$$\begin{aligned}
\det D &= \Gamma_{11}\Gamma_{22} + 2\,\frac{(\Gamma^{(1)}_2 + Z\lambda\,\Gamma^{(3)}_2)^2}{\Delta\,(1 - \lambda c')^{-1}}\,\big(\Gamma_{11}M_{11} - 2(\Gamma^{(0)}_1)^2\big) + O\big(\varepsilon^3\eta^{-3\beta}e^{-\frac{\pi}{2}\eta^{-1/2}}\big)\\
&= \Gamma_{11}\Gamma_{22} + O\big(\varepsilon^3\eta^{-3\beta}e^{-\frac{\pi}{2}\eta^{-1/2}}\big)
\end{aligned} \tag{7.19}$$

by the argument leading to (6.12), (6.14): this completes the analysis of the remarkable cancellations for separatrix splitting in the anisochronous case. The leading order remains exactly the same as in the isochronous case: anisochrony only alters the final result by a factor of order $(1 + O(\eta^{a}))$, as should have been expected a priori once the isochronous case is understood. The proof of the domination of the first order now follows the same path as the corresponding one in Sect. 8 of [G3], see Appendix R7: one uses the above results to treat the first $\eta^{-1/2}N^{-1}$ orders of perturbation theory and for the remainder one just uses that the series for the splitting is convergent (see also Sect. 6 and Appendix A2).

8. Heteroclinic Chains

For completeness we give the argument for the existence of heteroclinic chains, following [CG], in the easy case of isochronous systems. Below we imagine having fixed $\mu$ and taking $\eta \to 0$ (so that $\varepsilon = O(\eta^{c})$). It is worth noting that no gaps are present in the isochronous case (2.1) (i.e. all actions $\boldsymbol{A}$ are the average actions of an invariant torus), which is, therefore, very similar to the original example proposed by Arnol'd (also gapless). Let $A_0 = 0 < A_1 < \dots < A_N = A^0$ and choose correspondingly $B_0, \dots, B_N$ so that the sequence of action variables $(A_j, B_j)$ describes the time averaged location of invariant tori for (2.1) with energy 0 (say).


We consider a perturbation like (5.9) for which the splitting is given by the Mel'nikov integral $\sigma = O(\eta^{-b}e^{-\frac{\pi}{2}\eta^{-1/2}})$ for some $b > 0$. Since there are no gaps the sequence $A_i$ can be chosen so that $A_{i+1} - A_i < e^{-\delta\frac{\pi}{2}\eta^{-1/2}}$ for a prefixed $\delta > 1$ and for all $i$'s. Hence the number $N$ has size $O(A^0 e^{+\delta\frac{\pi}{2}\eta^{-1/2}})$. We want to show that there are heteroclinic intersections $H_i$ between the unstable manifold of the torus $\boldsymbol{A}_i$ and the stable manifold of $\boldsymbol{A}_{i+1}$. Since by construction the tori have the same energy this simply means finding a solution for the equations: $\boldsymbol{X}^-_\uparrow(\pi, \boldsymbol{\alpha}; \boldsymbol{A}_i) - \boldsymbol{X}^+_\uparrow(\pi, \boldsymbol{\alpha}; \boldsymbol{A}_{i+1}) = \boldsymbol{0}$ (the energy being equal, this equality then implies $X^-_+(\pi, \boldsymbol{\alpha}; \boldsymbol{A}_i) - X^+_+(\pi, \boldsymbol{\alpha}; \boldsymbol{A}_{i+1}) = 0$, i.e. also the pendulum momenta match).

The tori equations depend linearly on their average actions, i.e. $\boldsymbol{X}^\pm_\uparrow(\pi, \boldsymbol{\alpha}, \boldsymbol{A}) = \boldsymbol{A} + \boldsymbol{Y}^\pm_\uparrow(\boldsymbol{\alpha})$ (see Theorem 1), where $\boldsymbol{Y}$ is defined here. We can regard the equation for the heteroclinic intersection $\boldsymbol{X}^-_\uparrow(\pi, \boldsymbol{\alpha}; \boldsymbol{A}_i) - \boldsymbol{X}^+_\uparrow(\pi, \boldsymbol{\alpha}; \boldsymbol{A}_{i+1}) = \boldsymbol{0}$ as an implicit function problem which for $\boldsymbol{A}_{i+1} = \boldsymbol{A}_i$ has $\boldsymbol{\alpha} = \boldsymbol{0}$ as a solution. The linearization of the equation at $\boldsymbol{A}_i$ involves the intersection matrix $D$ at $\boldsymbol{A}_i$, $\boldsymbol{\alpha} = \boldsymbol{0}$ (which in the isochronous case is $\boldsymbol{A}$-independent):

$$D\,\boldsymbol{\alpha} = \boldsymbol{A}_i - \boldsymbol{A}_{i+1} \tag{8.1}$$

showing that the implicit function problem of determining the heteroclinic point $\boldsymbol{\alpha}$ can be solved for $\eta$ small enough because $\det D = O(\eta^{-b}e^{-\frac{\pi}{2}\eta^{-1/2}})$ and $|\boldsymbol{A}_{i+1} - \boldsymbol{A}_i| = O(e^{-\delta\frac{\pi}{2}\eta^{-1/2}})$, with $\delta > 1$.

It might be surprising, at first, that the equation for $\boldsymbol{\alpha}$ can be solved without an explicit estimate of the $\boldsymbol{\alpha}$-derivatives of the $\boldsymbol{Y}_\uparrow(\boldsymbol{\alpha})$ at points $\boldsymbol{\alpha}$ near $\boldsymbol{0}$. Such estimates can be made directly from the existence theorem: however they give bounds on the derivatives that are much larger than $\sigma$, i.e. they have size $O(1)$. This may seem to undermine the foundations of the implicit function method, which relies on the solubility of the linear equation. However the corrections to (8.1) are bounded by $O(\boldsymbol{\alpha}^2)$; and $|\boldsymbol{A}_{i+1} - \boldsymbol{A}_i| \le e^{-\delta\frac{\pi}{2}\eta^{-1/2}}$. The solution of the linear equation (8.1) has size $O(\eta^{b}e^{-\delta\frac{\pi}{2}\eta^{-1/2}}\cdot e^{+\frac{\pi}{2}\eta^{-1/2}})$. Hence near such $\boldsymbol{\alpha}$ the higher order corrections have roughly still the size $O(e^{-(\delta-1)\pi\eta^{-1/2}})$: much smaller than the linear contribution, if $\delta > 1$. This shows that our knowledge of the smoothness of $\boldsymbol{X}$ suffices, together with the basic estimate on the homoclinic angles, to deduce that the linear approximation dominates and to claim that the solutions for the heteroclinic point do exist and are very close to those of (8.1).

Therefore there is a chain of heteroclinic points $H_0, H_1, \dots, H_{N-1}$ "connecting" a neighborhood of $\boldsymbol{A}_0$ to one of $\boldsymbol{A}_N$. The "length" of the chain is $N = O(e^{+\delta\frac{\pi}{2}\eta^{-1/2}})$, i.e. in some sense it is the inverse of the splitting. In the more interesting anisochronous case (7.1) there are gaps (i.e. not all actions are the average positions of an invariant torus), but one can show that the average actions fill action space within a distance much smaller than the splitting size, see [GGM2]. So one can prove immediately the existence of heteroclinic chains. We now consider the case in which the system in (2.1) is further perturbed by a monochromatic perturbation $\beta f_0(\lambda, \varphi)$.
The radius of convergence of the whiskers series in $\beta$, $\mu$ (recall that $\varepsilon = \mu\eta^{c}$) can be shown to have size of order $|\beta| < O(\eta^{-\frac{1}{2}})$, $|\mu| < O(1)$, see Appendix A10 of [CG]. The splitting $\det D$ is analytic in $\beta$, $\mu$ for $\beta \ll \eta^{-1/2}$, see Appendix A10 in [CG], and it is different from 0 for $|\beta| \ll O(\eta^{c})$, $\mu \neq 0$, provided the generically non zero splitting at $\beta = 0$ is not zero. In the latter case the splitting can only vanish finitely many times, at $\mu$ fixed, in the domain of convergence of the series if $\mu \neq 0$. Hence for all values of $\beta$ close to 1 (including $\beta = 1$ generically in $f_0$), there exist heteroclinic chains as long as we wish. However, not having an estimate for the size of the splitting, we cannot infer from the above argument how many tori build the chains. The remark is interesting in view of the general fact, [A2], that heteroclinic chains imply diffusion in phase space, i.e. existence of motions starting near the torus at one end of the chain and reaching the one at the other end in due (finite) time; see [G5] for the discussion of the method of proof in [CG].

9. Other Results. Comments

9.1. General theory of splitting. We call Diophantine with Diophantine constants $C_0, \tau > 0$ a vector $\boldsymbol{\omega} \in \mathbb{R}^{\ell}$ such that:

$$|\boldsymbol{\omega}\cdot\boldsymbol{\nu}| > C_0^{-1}\,|\boldsymbol{\nu}|^{-\tau}, \qquad \forall\,\boldsymbol{\nu} \in \mathbb{Z}^{\ell},\ \boldsymbol{\nu} \neq \boldsymbol{0} \tag{9.1}$$

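(compare with (2.2); here there is no extra parameter $\eta$). We use the notations of Sect. 2 for the other symbols that are not redefined.

Theorem 3. Suppose that $\boldsymbol{\omega} \in \mathbb{R}^{\ell-1}$ is Diophantine and consider the Hamiltonian:

$$H = \frac{\boldsymbol{A}^2}{2J_1} + \boldsymbol{\omega}\cdot\boldsymbol{A} + \frac{I^2}{2J_0} + g^2 J_0(\cos\varphi - 1) + \mu f(\boldsymbol{\alpha}, \varphi) \tag{9.2}$$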

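with $(\boldsymbol{A}, \boldsymbol{\alpha})$, $(I, \varphi)$ being $\ell$ pairs of canonically conjugate action–angle variables. Let $f$ be an even trigonometric polynomial of degree $N$ and, for simplicity, $J_0 \le J_1$. Then:
(1) The separatrix splitting, for the torus with rotation vector $\boldsymbol{\omega}$ into which the unperturbed torus $\boldsymbol{A} = \boldsymbol{0}$ evolves with $\mu$, is analytic in $\mu$ near $\mu = 0$.
(2) The power series expansion of the splitting vector $\boldsymbol{Q}(\boldsymbol{\alpha})$ in powers of $\mu$ has coefficients with 0-average; their Fourier components $\boldsymbol{Q}^k_{\boldsymbol{\nu}}$ are bounded, for any $\delta < 1$, by:

$$|\boldsymbol{Q}^k_{\boldsymbol{\nu}}| \le \begin{cases}
J_0\,g\,D\,\delta^{-\beta}(B\delta^{-\beta})^{k-1}\,k!^{\,p}\,\varepsilon(k, \boldsymbol{\nu}) & \boldsymbol{\nu} \neq \boldsymbol{0},\\
J_0\,g\,D\,\eta^{-\beta}(B\eta^{-\beta})^{k-1}
\end{cases} \tag{9.3}$$

where $D$, $B$ are suitable constants and $\beta$, $p$ can be taken $\beta = 4(N+1)$, $p = 4\tau + 4$, if $\tau$ is the Diophantine constant of $\boldsymbol{\omega}$, and:

$$\varepsilon(k, \boldsymbol{\nu}) = \max_{\substack{h \le k;\ \{\boldsymbol{\nu}_j\}_{j=1,\dots,h}\\ \boldsymbol{0} \neq \boldsymbol{\nu}' = \sum_{j=1}^{h}\boldsymbol{\nu}_j;\ |\boldsymbol{\nu}'| \le |\boldsymbol{\nu}|}}\ \prod_{j=1}^{h} |f_{\boldsymbol{\nu}_j}|\; e^{-g^{-1}|\boldsymbol{\nu}'\cdot\boldsymbol{\omega}|\,(\frac{\pi}{2} - \delta)}. \tag{9.4}$$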

Remarks. (1) This theorem is the main result of [G3]; note that (9.4) is stronger than the form in which it is quoted in Eq. (6) of [RW], which refers to the theorem stated in [G3] but not to its proof (which gives in fact (9.4)), see Appendix R8. The method of proof in [G3] could in fact yield the result for $f$ depending analytically on the rotator angles $\boldsymbol{\alpha}$, by using the ideas in [GM], but extra work is necessary. See [DGJS, RW] for alternative proofs; see also [BCG], where the stronger form (with $f$ analytic) was derived, in a similar problem.


(2) There are many instances in which the first order expression (called the Mel'nikov integral) of the splitting vector $\boldsymbol{Q}^1(\boldsymbol{\alpha})$ gives in fact the leading behavior (as $\mu \to 0$) in the calculation of the splitting. In the case of fixed $\boldsymbol{\omega}$, i.e. for the one time scale problem, this follows from the classical results of Mel'nikov, [Me].
(3) Another interesting question arises when $\boldsymbol{\omega} = \gamma\,\boldsymbol{\omega}_0$ with $\boldsymbol{\omega}_0$ Diophantine and $\gamma$ a parameter that we let tend to $\infty$: this is a two time scales problem. In the case $\ell = 2$ (hence $\boldsymbol{\omega}_0$ is a constant $\omega_0$) and with $f$ a trigonometric polynomial the above theorem proves that the splitting is (generically) $O(e^{-\frac{\pi}{2}g^{-1}\omega_0\gamma})$: in fact this result was the main purpose of the theory in [G3] (see Sect. 8 in [G3] and, in particular, (8.6) and the related discussion). It should be stressed that the latter reference simply provides a new proof of a result already obtained, in a slightly different case, by [HMS] or, in the same case, by [Gl, GLT, LST]. The interest of [G3] lies in the technique. Mel'nikov's "approximation", i.e. the dominance of the first order value of the splitting, is more delicate if $\ell \ge 3$. The techniques of [G3] are inadequate to deal with this case and they only show that the splitting is smaller than any power of $\gamma^{-1}$ while the first order value is $O(e^{c\gamma})$ for some (computable) $c > 0$: in the case $\ell = 3$ this has been studied in [DGJS, RW], where the bound (9.3) is improved by replacing $\varepsilon(k, \boldsymbol{\nu})$ by the much better $e^{-\frac{\pi}{2}g^{-1}\gamma|\boldsymbol{\omega}_0\cdot\boldsymbol{\nu}|}$. Several examples of first order dominance are provided in the latter references: see however Appendix R9. However all examples are constructed with $f$ analytic: it would be nice to find a model with a trigonometric polynomial $f$ for which the first order theory gives the asymptotic result; see the recent work [G6] for a precise general conjecture on the result.

9.2. Three time scales. Anisochrony strength. Homoclinic scattering.
The three time scales condition for the first order dominance (Mel’nikov integral) includes the case a = 0, which is in fact a two time scales problem: denoting always by $\eta^{-1/2}$ the fast velocity scale, from the analysis of Sect. 7 we see that the slow scale could be $\eta^{a}$ with a ≥ 0. This means that the above theory provides a class of models in which the Mel’nikov integral gives the exact asymptotics as η → 0 and the perturbation is a trigonometric polynomial, of which (5.10) is a concrete example. This does not seem to contradict the results of [DGJS] and [RW], who show that the Mel’nikov integral does not necessarily give the leading asymptotics as η → 0 in cases corresponding to their n = 3, s = 2. In the only almost overlapping case a = 0, however, the above question is not treated in [DGJS] and [RW] (they consider the very different case n = 3, s = 2 in which a = −1/2, i.e. two fast rotators and a pendulum). This illustrates also that there are several “two time scales problems”, depending on which pair among the three time scales is identified. The value a = 0 is a case considered with other techniques in the paper [RW] (it corresponds to their n = 3, s = 1): there the attention is dedicated to a wider question (namely the leading order of the splitting everywhere on the section ϕ = π rather than just at the homoclinic point), see Appendix R9. Our asymptotic result is consistent with their Theorem 2.1. We also get the complete asymptotics in the case of trigonometric polynomial perturbations: but they do not seem interested in this point and deal only with other cases (n = 3, s = 2 and non-trigonometric perturbations); their technique seems to apply to our (special) case a = 0 as well (in fact a simpler case). The case a > 0 is not considered in [RW] except, perhaps, for a remark at the end of the abstract and following Eq. (15): we do not know whether this case, which is explicitly excluded in the paper, can be treated with their techniques.
In the end the main difference between our work and that of [DGJS, RW] might just lie in the technique, see Appendix R9: we have shown that the work in [G3] provides all the necessary technical tools for the analysis of the splitting and even leads to an “exact” expression for it. It is however limited to the splitting at the homoclinic point ϕ = π, α = 0 (unlike [G3] and [RW], where the splitting is measured at α arbitrary, on the section ϕ = π). The work [RW] is the last in a series of papers (like [DGJS]) which are inextricably linked with each other (and with [G3, BCG]). The above comments therefore are easily presented in connection with [RW]: but we are aware of the role of the other papers quoted in [RW].

Fixing a = 1/2 the anisochrony coefficient (of $A^2$) in (7.1) is $\eta^{\beta}$ with β = 2a. The value β > a is necessary if one wants the anisochronous and the isochronous splittings to coincide to leading order as η → 0 (at given rotator velocities): however the analysis above does not seem yet sharp enough for such an improvement (i.e. taking β < 1). Finally the physical interpretation of the precession problem (i.e. diffusion in the presence of a double resonance for an a priori stable system) requires β = 1, a = 1/2.

Extensions of the cancellations theory of Sect. 7 to ℓ > 3 seem only a matter of patience. And they would be interesting as they can conceivably be used to treat a variety of systems, and one should expect that the results will be quite different when the number of fast scales exceeds 1, as shown in the “maximal case” in which it is ℓ − 1 ([DGJS, RW], see Appendix R9). However a general theory of a priori stable systems, with a free Hamiltonian without free parameters and a perturbation consisting of terms of equal order of magnitude, seems to require substantial new ideas.

In Sect. 10 of [CG] there is also a statement about the homoclinic scattering: the techniques of this paper apply to its theory as well.
We have not worked out, however, the corresponding details (the statement was not used anywhere in [CG]) and at the moment it is still an open question for us whether the homoclinic phase shifts are exponentially small or not at the homoclinic point (as claimed in [CG] on the basis of the computational error mentioned above).

Appendix A1. Computation of the Pendulum Wronskian Matrix

The pendulum Hamiltonian $H = I^2/2J_0 + g_0^2 J_0(\cos\varphi - 1)$ generates a separatrix motion $t \to \varphi^0(t)$ which is exactly computable. One finds, starting at ϕ = π at t = 0:

$$\sin\frac{\varphi^0(t)}{2} = \frac{1}{\cosh g_0 t},\qquad \cos\frac{\varphi^0(t)}{2} = \tanh g_0 t,$$
$$\sin\varphi^0(t) = \frac{2\sinh g_0 t}{(\cosh g_0 t)^{2}},\qquad \cos\varphi^0(t) = 1 - \frac{2}{(\cosh g_0 t)^{2}}.$$
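The closed forms above can be checked numerically. The following is only a sanity-check sketch: the parameter values g0 = 1.3, J0 = 0.7, the sample times and all function names are illustrative assumptions, and the explicit expression ϕ0(t) = 4 arctg e^{−g0 t} is the one quoted in (A1.1). It verifies the four trigonometric relations and that the energy H vanishes along the motion (with I = J0 dϕ0/dt approximated by a finite difference).

```python
import math

def phi0(t, g0):
    """Separatrix motion with phi0(0) = pi, as quoted in (A1.1)."""
    return 4.0 * math.atan(math.exp(-g0 * t))

def phi0_dot(t, g0, dt=1e-6):
    """Central finite-difference approximation of d(phi0)/dt."""
    return (phi0(t + dt, g0) - phi0(t - dt, g0)) / (2.0 * dt)

def energy(t, g0, J0):
    """H = I^2/(2 J0) + g0^2 J0 (cos(phi) - 1), with I = J0 * phi_dot."""
    action = J0 * phi0_dot(t, g0)
    return action * action / (2.0 * J0) + g0 ** 2 * J0 * (math.cos(phi0(t, g0)) - 1.0)

def max_residual(g0=1.3, J0=0.7, times=(-2.0, -0.5, 0.0, 0.4, 1.7)):
    """Largest absolute violation of the separatrix relations over the sample times."""
    worst = 0.0
    for t in times:
        p = phi0(t, g0)
        c = math.cosh(g0 * t)
        residuals = (
            math.sin(p / 2) - 1.0 / c,
            math.cos(p / 2) - math.tanh(g0 * t),
            math.sin(p) - 2.0 * math.sinh(g0 * t) / c ** 2,
            math.cos(p) - (1.0 - 2.0 / c ** 2),
            energy(t, g0, J0),  # the separatrix has energy E = 0
        )
        worst = max(worst, max(abs(r) for r in residuals))
    return worst
```

The residual returned by `max_residual()` is limited only by the finite-difference error in the velocity, so a tolerance of about 1e-6 is a reasonable threshold.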

A further elementary discussion of the pendulum quadratures near E = 0 allows us to find the E derivatives of the separatrix motion and leads to:

$$I^0 = \frac{-2g_0J_0}{\cosh g_0 t} = -2g_0J_0\sin\frac{\varphi^0}{2},\qquad \partial_E I^0 = J_0\,(I^0)^{-1}\Bigl(1 + J_0 g_0^2\,(\partial_E\varphi^0)\sin\varphi^0\Bigr),$$
$$\partial_E\varphi^0 = \frac{-1}{8g_0^2J_0}\bigl(2g_0t + \sinh 2g_0t\bigr)\sin\frac{\varphi^0}{2},\qquad \varphi^0 = 4\,\mathrm{arctg}\,e^{-g_0t}, \tag{A1.1}$$

exhibiting the analyticity properties in the complex t plane that are useful in discussing the size of the homoclinic angles. The relations (A1.1) allow us to compute the Wronskian matrix of the above separatrices, i.e. the solution of the pendulum equations, namely $\dot\varphi = I/J_0$, $\dot I = J_0 g_0^2\sin\varphi$, linearized on the separatrices:


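$$\dot W = L(t)\,W,\qquad W(0) = 1,\qquad L(t) = \begin{pmatrix} 0 & J_0^{-1} \\ J_0 g_0^2\cos\varphi^0(t) & 0 \end{pmatrix} \tag{A1.2}$$

and we get:

$$W(t) = \begin{pmatrix} \dot\varphi^0/c_2 & \partial_E\varphi^0/c_1 \\ \dot I^0/c_2 & \partial_E I^0/c_1 \end{pmatrix},\qquad c_1 = \partial_E I^0(0),\qquad c_2 = \dot\varphi^0(0) = -2g_0, \tag{A1.3}$$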

where the E derivative is computed by imagining motions close to the separatrix (which has energy E = 0) and with the same initial ϕ = π. This becomes:

$$W(t) = \begin{pmatrix} \dfrac{1}{\cosh g_0t} & \dfrac{w}{4J_0g_0} \\[2mm] -J_0g_0\,\dfrac{\sinh g_0t}{\cosh^2 g_0t} & \Bigl(1 - \dfrac{w}{4}\,\dfrac{\sinh g_0t}{\cosh^2 g_0t}\Bigr)\cosh g_0t \end{pmatrix},\qquad w \equiv \frac{2g_0t + \sinh 2g_0t}{\cosh g_0t}. \tag{A1.4}$$

The theory of the Jacobian elliptic functions shows how to perform a complete calculation of the functions R0, S0 in (3.2): see [CG], Appendix A9.

Appendix A2. Convergence of the “Form Factors” $\boldsymbol{\Gamma}$, $\Gamma$, $M$ and Remainders Bounds

Integrals for $\boldsymbol{\Gamma}$, $\Gamma$, $M$, see (6.3), (6.7), (6.8) and the analogous ones in Sect. 7, are precisely the object of the analysis of Appendix A1 of [G3]. Hence we adhere closely to it. Consider any of the form factors, i.e. any of the series in Sect. 6 or 7. Following [G3], word by word, we obtain a bound on the sum of the contributions of the values of all trees ϑ with m nodes and order h, m ≤ 2h, as $D_0 B_0^{h-1}\varepsilon^h\, m!^2 \max_{0<|\nu|\le Nh}|\omega\cdot\nu|^{-4h}$ with $D_0$, $B_0$ constants (see (A1.13) and (A1.4) of [G3], noting that in (A1.4) the factor $m!^2$ is missing due to a typo). This factorial becomes $h!^4$ in the bound (8.2) of [G3] (and consequently on the form factors and det D at order h, which interest us here and which are expressed essentially by the same integrals) because m ≤ 2h, and it is responsible for the first value 4 in the variable called p in [G3], p = 4 + 4τ. We know from Theorem 1 above that det D is given by a convergent series, but we do not know whether the series for $\boldsymbol{\Gamma}$, $\Gamma$ and for the matrix M converge. For the “dressed form factors” G one could in fact show that the series in ε converge for small ε by the method developed in Appendix A1 of [G3] (consisting in reexpressing the G in terms of the unsplit operators O): but a lot of work can be saved because, as we show now, we do not need to know such convergence properties and the analyticity of det D suffices.
One can find some improvements over the bounds (8.2) of [G3] by using the fact that one has a better bound as long as $Nh < \gamma\eta^{-1/2-a}$, with any prefixed γ < 1, because it is only for $Nh \ge \gamma\eta^{-1/2-a}$ that one can find really small divisors: for $h < \gamma\eta^{-b}N^{-1}$, with 0 < b ≤ 1/2 + a, we can bound |ω · ν| below by $\eta^{a}$, if η is small enough. Hence for such values of h we have the bound $D_1 B_1^{h-1}(\varepsilon\eta^{-4a})^h\, h!^4$, with $D_1$, $B_1$ constants and for $h < N^{-1}\eta^{-b}\gamma$. This bound holds for all form factors and also for det D (note that it does not imply any convergence). On the other hand we know that det D is analytic in ε for $\varepsilon < \eta^{c}$ and c large enough. Hence, to order h, det D (but a priori not individual form factors) can be bounded by:

$$D_2 B_2^{h-1}(\varepsilon\eta^{-c})^h,\qquad \text{for all } h \ge 1 \tag{A2.1}$$

for some c > 0. It was conjectured in [G3], see Appendix A1, that this bound could be obtained directly from the graphical expansions. This has been proved in [Ge1] (getting


c = 4d if d is the constant in (2.2) and (7.3)); but we are showing that such a stronger result is not needed here.

(2) Finally in the case of $\Gamma^{(0)}_2$, $\Gamma_2$, $\Gamma_{i2}$, i.e. in the case of the “bare” or “analytic form factors”, which are expressed as integrals of analytic functions, one can further improve the bound by the usual shift of the $\tau_v$-variables integrations to $\mathrm{Im}\,\tau_v = \pm i(\frac{\pi}{2} - \eta^{1/2})$, choosing the quantity called d in [G3] as $\eta^{1/2}$, a natural but quite arbitrary choice. One checks directly (as explained in [G3], Appendix A1) that this simply introduces a factor $\eta^{-\beta'}$ with β′ = 2(N + 1), due to closeness of the singularities of the functions appearing in the Wronskian matrix or of the $f_\delta(\varphi^0(\tau_v), \alpha + \omega\tau_v)$ (located at the same places because f is assumed to be a trigonometric polynomial); it introduces also an exponentially small factor $\varepsilon_h = \max_{0<|\nu|\le Nh,\ 0<|\nu_2|} e^{-(\frac{\pi}{2}-\eta^{1/2})|\omega\cdot\nu|}$, so that the bound of the h-th order, for such form factors (and hence for det D by Sect. 7), becomes:

$$D_3\, h!^4\, B_3^{h-1}(\varepsilon\eta^{-\beta})^h \max_{0<|\nu|\le Nh,\ 0<|\nu_2|} e^{-(\frac{\pi}{2}-\eta^{1/2})|\omega\cdot\nu|},\qquad \text{for all } h < \gamma(N\eta^{b})^{-1} \equiv h_0 \tag{A2.2}$$

with β = 2(N + 1) + 2 and $B_3$, $D_3$ constants. Note that in order to deduce this all we really need is that the exact expressions (6.12) and (7.10), regarded as formal power series, be true order by order in the expansion. We do not really need the (yet unknown) convergence of the series for the G’s and for the matrices Γ and M, which might not be true.

Since $(\varepsilon\eta^{-c})^h = (2\varepsilon\eta^{-c})^h\,2^{-h}$ and for η small enough and $h > h_0$ the quantity $2^{-h}$ is $< e^{-\frac{\pi}{2}\eta^{-1/2}}$, provided b ≥ 1/2, the bounds (A2.1) and (A2.2) can be combined into the bound for the h-th order value of det D:

$$D_4\,(B_4\,\varepsilon\eta^{-c'})^h\, e^{-\frac{\pi}{2}\eta^{-1/2}},\qquad \text{for all } h \ge 1 \tag{A2.3}$$

with c′ > max{2(N + 1) + 2 + 4b, c} and $B_4$, $D_4$ suitable constants. The bound (A2.3) is an immediate consequence of (A2.1) for $h > h_0$ and the fact that, as $|\omega\cdot\nu| \ge \eta^{-1/2} - Nh\eta^{a}$ for $\nu_2 \ne 0$, (A2.2) can be used to give

$$\sum_{h=3}^{h_0} D_3\, h!^4\, B_3^{h-1}(\varepsilon\eta^{-\beta})^h\, e^{-\frac{\pi}{2}\eta^{-1/2} + 1 + \frac{\pi}{2}Nh\eta^{a}} \le D_4 B_4^{2}\,(\varepsilon\eta^{-\beta})^{3}\, e^{-\frac{\pi}{2}\eta^{-1/2}} \tag{A2.4}$$

provided that in (A2.2) one has 1/2 ≤ b ≤ 1/2 + a; the best choice is b = 1/2, which gives c′ > max{2(N + 1) + 4, c}. For instance, if f is given by (5.10), supposing also c = 4d < 2(N + 1) + 4, one finds c′ > 8.

Appendix A3. The Commutation Relation (6.20)

Let us denote by $\varphi \equiv \varphi^0(t)$ the motion on the separatrix of the unperturbed pendulum (see Appendix A1). Let $\mathbf{F} = \begin{pmatrix} 0 \\ F \end{pmatrix}$ and $\mathbf{T} = \begin{pmatrix} O(F) \\ \partial_t O(F) \end{pmatrix}$: the vector $\mathbf{T}$ verifies $\dot{\mathbf{T}} = L(t)\,\mathbf{T} + \mathbf{F}$, where $L(t) = \begin{pmatrix} 0 & 1 \\ \cos\varphi & 0 \end{pmatrix}$. By differentiating both sides with respect to t we get

$$\ddot{\mathbf{T}} = L(t)\,\dot{\mathbf{T}} + \dot{\mathbf{F}} + \begin{pmatrix} 0 & 0 \\ -\partial_0^3 f_0 & 0 \end{pmatrix}\dot\varphi\,\mathbf{T},$$

which means

$$\partial_t O(F) = O\bigl(\partial_t F - \dot\varphi\,\partial_0^3 f_0\, O(F)\bigr) \tag{A3.1}$$


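up to a homogeneous solution $t \to W(t)X$, $X \in \mathbb{R}^2$, of the latter linear equation, when written for the first component; but the only X such that O(F) and $\partial_t O(F)$ are both bounded uniformly in time is X = 0. Choosing $\mathbf{T} = \begin{pmatrix} \dot\varphi \\ \ddot\varphi \end{pmatrix}$ or $\mathbf{T} = \begin{pmatrix} \partial_E\varphi \\ \partial_E\dot\varphi \end{pmatrix}$ and F = 0, and proceeding as before, one gets in the same way the identities

$$O(\dot\varphi^{2}\sin\varphi) = -\ddot\varphi,\qquad O(\dot\varphi\,[\partial_E\varphi]\sin\varphi) = -\partial_E\dot\varphi. \tag{A3.2}$$

Noting that $O_0(F) = O(F)$ for odd F, and that $O_0(\mathrm{sign}\,\tau\, F)(t) = \mathrm{sign}\,t\; O_0(F)(t)$, from the above equalities it follows immediately that

$$O_0(w_{00}^{2}\sin\varphi) = \tfrac{1}{2}\,w_{30},\qquad O_0(w_{00}\,|w_{03}|\sin\varphi) = \tfrac{1}{2}\,|w_{33}|, \tag{A3.3}$$

which are used in Sect. 6 and 7. By definition of $G_0$ in (5.12) and by the first of (A3.2), one has

$$G_0\bigl(\partial_t F - \dot\varphi\,\partial_0^3 f_0\, O_0(F)\bigr) = \frac{1}{2}\int_{+\infty}^{-\infty} d\tau\; w_{00}(\tau)\bigl(\partial_\tau F - \dot\varphi\,\partial_0^3 f_0\, O_0(F)\bigr) = \frac{1}{4}\int_{+\infty}^{-\infty} d\tau\,\bigl[F\, O_0(\dot\varphi^{2}\sin\varphi) - \dot\varphi\,\dot F\bigr] = 0. \tag{A3.4}$$

A similar identity is obtained by considering $G\bigl(\partial_t F - \dot\varphi\,\partial_0^3 f_0\, O_0(F)\bigr)$ and using the second of (A3.2), with $|\partial_E\varphi|$ replacing $\partial_E\varphi$ (see comments after (A3.2)). The oddness of F implies that $\partial_t O_0(F) = \partial_t O(F)$. Then, as a consequence of (A3.4) and the analogous relation for G, (6.20) follows.

Appendix A4. Proportionality Between $G^{(0)}$ and $G^{(2)}$

From (7.9) one has that, formally:

$$\mathbf{G}_1 = \sum_{k=0}^{\infty}(\sigma M)^k\,\boldsymbol{\Gamma}_1 = \lim_{N\to\infty}\sum_{k=0}^{N}(\sigma M)^k\,\boldsymbol{\Gamma}_1 \equiv \lim_{N\to\infty}\mathbf{G}_1(N) \tag{A4.1}$$

so that, by using the relations between the matrix elements of M listed in (7.11), one has

$$\mathbf{G}_1(N) = \begin{pmatrix} M_{01} & M_{11} & M_{12} & \lambda M_{11} \\ M_{00} & M_{01} & M_{02} & \lambda M_{01} - \lambda\Lambda_0 \\ \lambda M_{01} - \lambda\Lambda_0 & \lambda M_{11} & \lambda M_{12} - \lambda\Lambda_1 & \lambda^2 M_{11} \\ M_{02} & M_{12} & M_{22} & \lambda M_{12} - \lambda\Lambda_1 \end{pmatrix}\mathbf{G}_1(N-1) + \boldsymbol{\Gamma}_1, \tag{A4.2}$$

which gives

$$G^{(0)}_1(N) = M_{01}\,G^{(0)}_1(N-1) + M_{11}\,G^{(1)}_1(N-1) + M_{12}\,G^{(2)}_1(N-1) + \lambda M_{11}\,G^{(3)}_1(N-1) + \Gamma^{(0)}_1,$$
$$G^{(2)}_1(N) = \lambda\,G^{(0)}_1(N) - \lambda\Lambda_0\,G^{(0)}_1(N-1) - \lambda\Lambda_1\,G^{(2)}_1(N-1) \tag{A4.3}$$

as $\Gamma^{(2)}_1 = \lambda\Gamma^{(0)}_1$ (see the second identity in (7.11)). Taking the limit N → ∞, one obtains the third equality in (7.11), defining $Z = (1 - \Lambda_0)/(1 + \lambda\Lambda_1)$.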
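The recursion of Appendix A4 can be illustrated numerically. The sketch below rests on assumptions that are ours and not statements of the paper: the 4×4 matrix is taken with the entry pattern read off from (A4.2), the numerical values of the entries, of λ, Λ0, Λ1 and of the vector Γ1 (subject only to Γ1^(2) = λ Γ1^(0)) are arbitrary small numbers chosen so that the iteration converges, and the last term of the second relation in (A4.3) is taken to involve G1^(2)(N − 1), as follows from the third row of (A4.2). Under these assumptions the code checks that relation at every step and that in the limit G1^(2) = λ Z G1^(0) with Z = (1 − Λ0)/(1 + λΛ1).

```python
def sigma_m(entries, lam, lam0, lam1):
    """4x4 matrix with the entry pattern of (A4.2); entries = (M00, M01, M02, M11, M12, M22)."""
    m00, m01, m02, m11, m12, m22 = entries
    return [
        [m01, m11, m12, lam * m11],
        [m00, m01, m02, lam * m01 - lam * lam0],
        [lam * m01 - lam * lam0, lam * m11, lam * m12 - lam * lam1, lam ** 2 * m11],
        [m02, m12, m22, lam * m12 - lam * lam1],
    ]

def partial_sums(mat, gamma, steps):
    """Return [G(0), ..., G(steps)] with G(0) = gamma and G(N) = mat G(N-1) + gamma."""
    history = [list(gamma)]
    for _ in range(steps):
        prev = history[-1]
        history.append([sum(mat[i][j] * prev[j] for j in range(4)) + gamma[i]
                        for i in range(4)])
    return history

def check_proportionality(lam=0.5, lam0=0.2, lam1=0.3, steps=200):
    """Return (max violation of the recursion for G^(2), |G^(2) - lam*Z*G^(0)| at the last step)."""
    entries = (0.10, 0.05, -0.03, 0.08, 0.02, 0.07)   # illustrative values only
    gamma = [1.0, 0.4, lam * 1.0, -0.2]               # Gamma^(2) = lam * Gamma^(0)
    hist = partial_sums(sigma_m(entries, lam, lam0, lam1), gamma, steps)
    recursion_error = max(
        abs(hist[n][2] - (lam * hist[n][0]
                          - lam * lam0 * hist[n - 1][0]
                          - lam * lam1 * hist[n - 1][2]))
        for n in range(1, steps + 1)
    )
    z = (1.0 - lam0) / (1.0 + lam * lam1)
    limit_error = abs(hist[-1][2] - lam * z * hist[-1][0])
    return recursion_error, limit_error
```

With the default values every row of the matrix has absolute sum below 1, so the partial sums converge geometrically and both returned errors are at rounding level.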


Appendix R. After Refereeing Comments

This appendix contains a few clarifications on the text and on the comments on [RW, DGJS], requested by the referee.

R1 (Why det D is a measure of transversality). The splitting vector clearly gives the distance between the two manifolds at corresponding α’s (because it is just the difference between the two). Therefore if the angles are increased by an infinitesimal amount dα away from a homoclinic point (α = 0 in our case) the vector changes by its derivative D times dα: the derivative is the intersection matrix. Therefore D dα is the increment of the splitting vector. Hence this cannot vanish unless the determinant of M vanishes: hence the determinant measures transversality. Other evidence that the determinant measures transversality is that the square root of its value is a bound on the lowest eigenvalue of the matrix M which gives the “minimum transversality” or the minimum splitting angle (the arctangents of the “principal angles” can be identified with the eigenvalues). This is the second geometrical interpretation. That it is a quantity of physical interest also for the theory of heteroclinic intersections is made clear by the whole content of the paper. In any event in Sect. 8 we repeat the argument used in [CG]: the determinant has the mathematical interpretation of the Jacobian determinant for the implicit equation problem that has to be solved when one looks for heteroclinic intersections (hence for Arnol’d diffusion, as an example). Further analysis can be found in [GGM2].

We briefly explain here how to find such intersections. The equations for the stable and unstable manifolds at ϕ = π of an invariant torus to which perturbation theory with $\varepsilon < \eta^{c}$, with a suitably large c, can be applied have the form:

$$A^{b}(\alpha) = A_0 + H^{b}(A_0, \alpha),\qquad b = s, u, \tag{R1.1}$$

where $A_0$ is such that $\omega(A_0) = (\eta^{a} + \eta^{2a}A_{01},\ \eta^{-1/2})$ provided $|\omega(A_0)\cdot\nu| \ge e^{-\eta^{-\gamma}}|\nu|^{-\tau}$ for γ, τ > 0 and γ < 1: this is a consequence of a detailed analysis of the classical proofs (one can take for instance that in [CG], explicitly and more carefully reworked out for the case of the present models in [GGM2]).¹ The point is that the constant c should be fixed suitably large once and for all and then, fixed γ < 1 and τ, for $|\varepsilon| < \eta^{c}$ and η small, the function $H^{b}$ together with the invariant tori equations and the splitting matrix determinant can be evaluated by perturbation theory and in particular the splitting determinant will be generically of order $e^{-O(\eta^{-1/2})}$, [GGM2].

¹ Furthermore the remarkable “twistless” property of the models, see [G3, GGM2], implies that the time average of A on the invariant torus is $A_0$, so that there is a simple relation between the frequency spectrum of the motion on the invariant torus and the parameter $A_0$ in (R1.1): there is “no twist” on the frequencies due to the perturbation (one says that the tori are “twistless”, [G3]): the dispersion relation (i.e. the relation between frequencies and average actions) remains the same in presence or absence of perturbation (at least for the non resonant tori that survive the perturbation).

The relation between $\omega(A_0)$ and $A_0$, after (R1.1), says that the spacing between the $A_0$ that correspond to invariant tori is essentially the same as the spacing between the frequencies of the invariant tori: hence the resonance “gaps” around such $A_0$’s are of size at most $O(e^{-\eta^{-\gamma}})$ by the “generosity” of the Diophantine condition that we can use. Hence the splitting is by far greater than the gaps if γ > 1/2 (this is of course, once more, the reason why we call our main result the “large splitting theorem”). The equation for a heteroclinic point between (existing) tori with average actions $A_{01}$ and $A_{02}$ is therefore, at ϕ = π:

$$A_{01} - A_{02} = -\bigl(H^{u}(A_{01}, \alpha) - H^{s}(A_{02}, \alpha)\bigr) \tag{R1.2}$$

which is an implicit equation for the point location α which at the trivial solution near $A_{01}$ (namely $A_{02} = A_{01}$, α = 0) has a Jacobian determinant precisely given by the determinant of the splitting matrix. Hence if $|A_{01} - A_{02}| < e^{-\eta^{-\gamma}}$ with 1 > γ > 1/2 we see that, since $|A_{01} - A_{02}|$ is far smaller than the value of the Jacobian determinant, Eq. (R1.2) has a solution $\alpha_{1,2}$. See [GGM2] for the proofs of the large density of the invariant tori.

R2 (Why the result remains valid if the analysis is marginal). We use the word in a technical sense: this means that the series that we study does not necessarily provide a good approximation if truncated at a fixed order. To each order one gets a splitting value that cannot be considered correct (although in the end it turns out to be so) until an analysis to all orders is performed: a typical situation that arises in problems in QFT for the expansions that involve running couplings of marginal operators. Of course the purpose of this paper is precisely to perform the analysis to all orders with all rigor and clarity possible, and the fact that the analysis is called “marginal” is not meant at all to diminish the result but to stress that there is work to do.

R3 (What is the import of the reference to Appendix 9 of [CG]). Simply to exhibit the form of the functions R0, S0; the nontriviality refers to the fact that this work (of Jacobi) founded the theory of elliptic functions, which we consider non trivial even though it deals with a one degree of freedom pendulum.

R4 (Theorem 1 is nowhere proved). A proof did not seem necessary to us, as many other consequences of KAM theory that are surprisingly appearing and being published even in prestigious journals, while deserving a place only as lecture notes or as chapters in monographs or books. In fact a proof can be found for instance in the paper [G5] (pp. 2 and 3 plus half of p. 10), which is in our opinion a trivial adaptation of the proofs in Sect. 5 of [CG], which deals with the harder anisochronous case. Further developments on the subject can be found in [GGM2].

R5 (Is (6.4) in [G3] and where?). It is there and, in any event, it does not require any proof because it simply expresses in a formula what is discussed in words in Sect. 6, A, of [G3]. One considers the tree expansion and replaces in it (6.1) of [G3] with (6.2) and (6.4) with (6.5), obtaining the result explained in the first sentence of p. 375 of [G3] whose translation in the formulae is (6.5). It is worth stressing that the quoted sentence is one of the main results in [G3], whose contents are assumed throughout the present work. The comment on the graphical meaning is by no means meant as a proof, but as a further suggestion to the reader on how to interpret it (it could, however, also be regarded as a proof if [G3] is assumed). Since Feynman, at least, graphs are concise ways to write (and even derive) involved formulae and to perform algebraic operations with them; all the results we present can be conveniently interpreted graphically, and yet perfectly rigorously.

R6 (What are the “omitted terms” in (6.12)?). They are a few terms whose properties are described in words and whose expression can be very easily derived by patiently evaluating the determinant of the 2 × 2 matrix in (6.10). That their contribution is trivial for our purposes is immediate and the difficulty, the whole difficulty, lies in bounding


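the second terms of (6.12). The reference to Appendix A2 is not meant as a proof of the properties of the omitted terms, but rather as a place where it is discussed why the terms omitted (which are integrals of functions which are analytic and contain “fast angle derivatives”) will provide negligible contributions to the leading asymptotics. In any event we write here the complete form of the determinant:

$$\det D = \Gamma^{(1)}_{11}\Gamma^{(1)}_{22}\,\varepsilon^{2} - 2\Bigl\{\Gamma_{11}M_{11} - 2(\Gamma^{(0)}_1)^{2}\Bigr\}(\Gamma^{(1)}_2)^{2}\det C + \sum_{k_1+k_2\ge 3}\Gamma^{(k_1)}_{11}\Gamma^{(k_2)}_{22}\,\varepsilon^{k_1+k_2}$$
$$\qquad + \Bigl[-\Gamma_{12}^{2} - 4\Gamma_{12}(\boldsymbol{\Gamma}_1, C\boldsymbol{\Gamma}_2) + 2\Gamma_{22}(\boldsymbol{\Gamma}_1, C\boldsymbol{\Gamma}_1)\Bigr]$$
$$\qquad + 4\Bigl\{(\Gamma^{(0)}_2)^{2}(\Gamma^{(1)}_1)^{2} - 2(\Gamma^{(0)}_1\Gamma^{(1)}_1)(\Gamma^{(0)}_2\Gamma^{(1)}_2)\Bigr\}\det C, \tag{R6.1}$$

from which (6.12) follows; in particular the terms in the last line of (R6.1) are the terms referred to as “a few others” after (6.12). They are just two terms. That the above equation really follows by performing the calculation of the determinant can be checked (explicitly) as follows. We compute the determinant of the matrix (6.10)

$$D = \begin{pmatrix} \Gamma_{11} + 2(\boldsymbol{\Gamma}_1, C\boldsymbol{\Gamma}_1) & \Gamma_{12} + 2(\boldsymbol{\Gamma}_2, C\boldsymbol{\Gamma}_1) \\ \Gamma_{21} + 2(\boldsymbol{\Gamma}_1, C\boldsymbol{\Gamma}_2) & \Gamma_{22} + 2(\boldsymbol{\Gamma}_2, C\boldsymbol{\Gamma}_2) \end{pmatrix}, \tag{R6.2}$$

where

$$C = \sigma(1 - \sigma M)^{-1}, \tag{R6.3}$$

with C, M, Γ symmetric. One has

$$\det D = \Gamma_{11}\Gamma_{22} - \Gamma_{12}^{2} + 2\Gamma_{11}(\boldsymbol{\Gamma}_2, C\boldsymbol{\Gamma}_2) + 2\Gamma_{22}(\boldsymbol{\Gamma}_1, C\boldsymbol{\Gamma}_1) + 4(\boldsymbol{\Gamma}_1, C\boldsymbol{\Gamma}_1)(\boldsymbol{\Gamma}_2, C\boldsymbol{\Gamma}_2) - 4\Gamma_{12}(\boldsymbol{\Gamma}_1, C\boldsymbol{\Gamma}_2) - 4(\boldsymbol{\Gamma}_1, C\boldsymbol{\Gamma}_2)^{2}. \tag{R6.4}$$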
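The expansion (R6.4) is a purely algebraic identity for a 2 × 2 determinant, so it can be spot-checked on random numbers. In the sketch below (an illustration only; the function names and the random sampling are our assumptions) the vectors Γ1, Γ2 are generic 2-vectors, C is a generic symmetric 2 × 2 matrix and Γ21 = Γ12, as implied by the symmetry of Γ.

```python
import random

def quad(u, c, v):
    """Bilinear form (u, C v) for 2-vectors u, v and a 2x2 matrix c."""
    return sum(u[i] * c[i][j] * v[j] for i in range(2) for j in range(2))

def det_direct(g11, g12, g22, v1, v2, c):
    """Determinant of the matrix D of (R6.2), using Gamma_21 = Gamma_12."""
    d11 = g11 + 2 * quad(v1, c, v1)
    d12 = g12 + 2 * quad(v2, c, v1)
    d21 = g12 + 2 * quad(v1, c, v2)
    d22 = g22 + 2 * quad(v2, c, v2)
    return d11 * d22 - d12 * d21

def det_expanded(g11, g12, g22, v1, v2, c):
    """Right-hand side of (R6.4)."""
    a = quad(v1, c, v1)
    b = quad(v2, c, v2)
    e = quad(v1, c, v2)
    return (g11 * g22 - g12 ** 2 + 2 * g11 * b + 2 * g22 * a
            + 4 * a * b - 4 * g12 * e - 4 * e ** 2)

def max_discrepancy(trials=200, seed=0):
    """Largest |direct - expanded| over random symmetric data."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        g11, g12, g22 = (rng.uniform(-1, 1) for _ in range(3))
        v1 = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        v2 = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        c00, c01, c11 = (rng.uniform(-1, 1) for _ in range(3))
        c = [[c00, c01], [c01, c11]]
        worst = max(worst, abs(det_direct(g11, g12, g22, v1, v2, c)
                               - det_expanded(g11, g12, g22, v1, v2, c)))
    return worst
```

The two evaluations should agree up to rounding errors, since the identity does not depend on any smallness assumption.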

By taking into account the fact that all contributions to (R6.4) containing factors $\Gamma_{12}$, $\Gamma_{22}$ and $\Gamma^{(0)}_2$ are trivially either exponentially small in ω and of order 3 at least in ε, or they are “exponentially smaller” (like $\Gamma_{12}^{2}$) than the first term in the r.h.s. (which is of order $\varepsilon^{2}$ unless it vanishes, which is not the case, generically), one has that the only not a priori exponentially small terms in (R6.4) are

$$2\Gamma_{11}(\Gamma^{(1)}_2)^{2}C_{11} + 4(\boldsymbol{\Gamma}_1, C\boldsymbol{\Gamma}_1)(\Gamma^{(1)}_2)^{2}C_{11} - 4\bigl(\Gamma^{(0)}_1C_{01} + \Gamma^{(1)}_1C_{11}\bigr)^{2}(\Gamma^{(1)}_2)^{2}, \tag{R6.5}$$

where

$$(\boldsymbol{\Gamma}_1, C\boldsymbol{\Gamma}_1)\,C_{11} = (\Gamma^{(0)}_1)^{2}C_{00}C_{11} + 2(\Gamma^{(0)}_1\Gamma^{(1)}_1)C_{01}C_{11} + (\Gamma^{(1)}_1)^{2}C_{11}^{2},$$
$$\bigl(\Gamma^{(0)}_1C_{01} + \Gamma^{(1)}_1C_{11}\bigr)^{2} = (\Gamma^{(0)}_1)^{2}C_{01}^{2} + 2(\Gamma^{(0)}_1\Gamma^{(1)}_1)C_{01}C_{11} + (\Gamma^{(1)}_1)^{2}C_{11}^{2}, \tag{R6.6}$$

so that (R6.5) becomes

$$\Bigl\{2\Gamma_{11}C_{11} + 4(\Gamma^{(0)}_1)^{2}C_{00}C_{11} - 4(\Gamma^{(0)}_1)^{2}C_{01}^{2}\Bigr\}(\Gamma^{(1)}_2)^{2}. \tag{R6.7}$$

Then, by using also that

$$\det C = C_{00}C_{11} - C_{01}^{2},\qquad C_{11} = -M_{11}\det C, \tag{R6.8}$$


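one can write (R6.7) as

$$\Bigl\{-2\Gamma_{11}M_{11} + 4(\Gamma^{(0)}_1)^{2}\Bigr\}(\Gamma^{(1)}_2)^{2}\det C, \tag{R6.9}$$

so that, if we note also that the only terms to second order in ε are $\Gamma_{11}\Gamma_{22}$ and $\Gamma_{12}^{2}$ (and that the latter decays twice as fast as the first one), (6.12) of [GGM] follows. The skeptical reader may find it useful to check that all the other terms (neglected so far) are really exponentially small, i.e. they have the properties claimed in Sect. 6. To see this, let us rewrite (R6.4) as

$$\det D = \Gamma_{11}\Gamma_{22} - \Gamma_{12}^{2} - 4\Gamma_{12}(\boldsymbol{\Gamma}_1, C\boldsymbol{\Gamma}_2) + 2\Gamma_{22}(\boldsymbol{\Gamma}_1, C\boldsymbol{\Gamma}_1) + G, \tag{R6.10}$$

where

$$G \equiv 2\Gamma_{11}(\boldsymbol{\Gamma}_2, C\boldsymbol{\Gamma}_2) + 4(\boldsymbol{\Gamma}_1, C\boldsymbol{\Gamma}_1)(\boldsymbol{\Gamma}_2, C\boldsymbol{\Gamma}_2) - 4(\boldsymbol{\Gamma}_1, C\boldsymbol{\Gamma}_2)^{2}. \tag{R6.11}$$

By performing explicitly the calculations one has

$$G = 2\Gamma_{11}\Bigl\{(\Gamma^{(0)}_2)^{2}C_{00} + (\Gamma^{(1)}_2)^{2}C_{11} + 2(\Gamma^{(0)}_2\Gamma^{(1)}_2)C_{01}\Bigr\}$$
$$\quad + 4\Bigl\{(\Gamma^{(0)}_1)^{2}(\Gamma^{(0)}_2)^{2}C_{00}^{2} + 4(\Gamma^{(0)}_1\Gamma^{(1)}_1)(\Gamma^{(0)}_2\Gamma^{(1)}_2)C_{01}^{2} + (\Gamma^{(1)}_1)^{2}(\Gamma^{(1)}_2)^{2}C_{11}^{2}$$
$$\qquad + 2\bigl[(\Gamma^{(0)}_1)^{2}(\Gamma^{(0)}_2\Gamma^{(1)}_2) + (\Gamma^{(0)}_2)^{2}(\Gamma^{(0)}_1\Gamma^{(1)}_1)\bigr]C_{00}C_{01}$$
$$\qquad + \bigl[(\Gamma^{(0)}_1)^{2}(\Gamma^{(1)}_2)^{2} + (\Gamma^{(0)}_2)^{2}(\Gamma^{(1)}_1)^{2}\bigr]C_{00}C_{11}$$
$$\qquad + 2\bigl[(\Gamma^{(0)}_1\Gamma^{(1)}_1)(\Gamma^{(1)}_2)^{2} + (\Gamma^{(0)}_2\Gamma^{(1)}_2)(\Gamma^{(1)}_1)^{2}\bigr]C_{01}C_{11}\Bigr\}$$
$$\quad - 4\Bigl\{(\Gamma^{(0)}_2)^{2}(\Gamma^{(0)}_1)^{2}C_{00}^{2} + \bigl[(\Gamma^{(0)}_2)^{2}(\Gamma^{(1)}_1)^{2} + 2(\Gamma^{(0)}_2\Gamma^{(1)}_2)(\Gamma^{(0)}_1\Gamma^{(1)}_1) + (\Gamma^{(1)}_2)^{2}(\Gamma^{(0)}_1)^{2}\bigr]C_{01}^{2}$$
$$\qquad + (\Gamma^{(1)}_2)^{2}(\Gamma^{(1)}_1)^{2}C_{11}^{2} + 2\bigl[(\Gamma^{(0)}_1)^{2}(\Gamma^{(0)}_2\Gamma^{(1)}_2) + (\Gamma^{(0)}_2)^{2}(\Gamma^{(0)}_1\Gamma^{(1)}_1)\bigr]C_{00}C_{01}$$
$$\qquad + 2(\Gamma^{(0)}_2\Gamma^{(1)}_2)(\Gamma^{(0)}_1\Gamma^{(1)}_1)C_{00}C_{11} + 2\bigl[(\Gamma^{(0)}_1\Gamma^{(1)}_1)(\Gamma^{(1)}_2)^{2} + (\Gamma^{(0)}_2\Gamma^{(1)}_2)(\Gamma^{(1)}_1)^{2}\bigr]C_{01}C_{11}\Bigr\}, \tag{R6.12}$$

which, by exploiting the fact that some terms cancel between each other, becomes

$$G = 2\Gamma_{11}\Bigl\{(\Gamma^{(0)}_2)^{2}C_{00} + (\Gamma^{(1)}_2)^{2}C_{11} + 2(\Gamma^{(0)}_2\Gamma^{(1)}_2)C_{01}\Bigr\}$$
$$\quad + 4\Bigl\{\bigl[2(\Gamma^{(0)}_1\Gamma^{(1)}_1)(\Gamma^{(0)}_2\Gamma^{(1)}_2) - (\Gamma^{(0)}_2)^{2}(\Gamma^{(1)}_1)^{2} - (\Gamma^{(1)}_2)^{2}(\Gamma^{(0)}_1)^{2}\bigr]C_{01}^{2}$$
$$\qquad - \bigl[2(\Gamma^{(0)}_1\Gamma^{(1)}_1)(\Gamma^{(0)}_2\Gamma^{(1)}_2) - (\Gamma^{(0)}_2)^{2}(\Gamma^{(1)}_1)^{2} - (\Gamma^{(1)}_2)^{2}(\Gamma^{(0)}_1)^{2}\bigr]C_{00}C_{11}\Bigr\}. \tag{R6.13}$$

The two expressions in square brackets in (R6.13) are equal and (parenthetically) are the same as

$$-\Bigl[\Gamma^{(0)}_2\Gamma^{(1)}_1 - \Gamma^{(1)}_2\Gamma^{(0)}_1\Bigr]^{2}, \tag{R6.14}$$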


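so that, by using also (R6.8), one can write

$$G = 2\Gamma_{11}\Bigl\{(\Gamma^{(0)}_2)^{2}C_{00} + 2(\Gamma^{(0)}_2\Gamma^{(1)}_2)C_{01}\Bigr\} + 4\Bigl\{(\Gamma^{(0)}_2)^{2}(\Gamma^{(1)}_1)^{2} - 2(\Gamma^{(0)}_1\Gamma^{(1)}_1)(\Gamma^{(0)}_2\Gamma^{(1)}_2)\Bigr\}\det C - 2\Bigl\{\Gamma_{11}M_{11} - 2(\Gamma^{(0)}_1)^{2}\Bigr\}(\Gamma^{(1)}_2)^{2}\det C. \tag{R6.15}$$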
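The passage from the definition (R6.11) of G to the closed form (R6.15) can also be spot-checked numerically. The sketch below is an illustration under stated assumptions: the components of Γ1 and Γ2 are called a0, a1 and b0, b1 (our names), C is a random symmetric 2 × 2 matrix, and M11 is simply defined through the second relation of (R6.8), i.e. M11 = −C11/det C, without any reference to the actual matrix M of the paper. The code compares (R6.11) with (R6.15) and also compares the square-bracket expression of (R6.13) with the square displayed in (R6.14), sign included (the bracket equals minus that square).

```python
import random

def max_mismatch(trials=200, seed=1):
    """Largest violation of (R6.11) = (R6.15) and of the bracket identity of (R6.13)-(R6.14)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        g11 = rng.uniform(-1, 1)
        a0, a1, b0, b1 = (rng.uniform(-1, 1) for _ in range(4))
        c00, c01, c11 = (rng.uniform(-1, 1) for _ in range(3))
        det_c = c00 * c11 - c01 ** 2
        if abs(det_c) < 1e-3:
            continue  # skip nearly singular samples, where M11 is ill-defined
        m11 = -c11 / det_c  # second relation of (R6.8), used here as a definition

        # Bilinear forms (Gamma_i, C Gamma_j) written out in components.
        qa = a0 * a0 * c00 + 2 * a0 * a1 * c01 + a1 * a1 * c11
        qb = b0 * b0 * c00 + 2 * b0 * b1 * c01 + b1 * b1 * c11
        qab = a0 * b0 * c00 + (a0 * b1 + a1 * b0) * c01 + a1 * b1 * c11

        g_definition = 2 * g11 * qb + 4 * qa * qb - 4 * qab ** 2          # (R6.11)
        g_closed = (2 * g11 * (b0 * b0 * c00 + 2 * b0 * b1 * c01)
                    + 4 * (b0 * b0 * a1 * a1 - 2 * a0 * a1 * b0 * b1) * det_c
                    - 2 * (g11 * m11 - 2 * a0 * a0) * b1 * b1 * det_c)      # (R6.15)

        bracket = 2 * a0 * a1 * b0 * b1 - b0 * b0 * a1 * a1 - b1 * b1 * a0 * a0
        minus_square = -(b0 * a1 - b1 * a0) ** 2

        worst = max(worst, abs(g_definition - g_closed), abs(bracket - minus_square))
    return worst
```

Both comparisons hold up to rounding errors; the underlying fact is the elementary identity (a, Ca)(b, Cb) − (a, Cb)² = det C · (a0 b1 − a1 b0)² for 2-vectors a, b and a symmetric matrix C.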

So one can conclude, combining (R6.10) and (R6.15), that the determinant of the matrix (R6.2) can be written as (R6.1), from which (6.12) follows.

R7 (Where is the proof of domination?). As we say, the proof is in Appendix A2: the reference to [G3] is here only to say that the same proof (i.e. combining the two estimates for the k-th order term and optimizing) appears there for the first time.

R8 (Is the Remark 1 after (9.4) a reproach to other authors?). This statement is by no means intended as a critique of the authors of [RW] for not having read the proof of the result that they quote. This is not what we mean here: it just seemed to us the right place to point out that [G3] gave a more general result which, while uninteresting for the purposes of [G3], was worth pointing out. In fact it was exploited in [BCG] and, recently, in [GGM3]. It is also true that reading the proof of the “quasi flat” bounds of [G3] would not only have shown the validity of (9.4), but also would have shown why the result claimed in [RW] could not be right, see [GGM4]: the work [GGM3] gives further improvements of (9.4) useful to derive results of the type considered in [RW].

R9 (Relation of the present results with [DGJS] and [RW]). The numerous discussions that have taken place since the present work appeared as a preprint have considerably clarified the relations between the above papers and ours. It is therefore useful to give our present view. The results in [RW] are not correct as there are serious flaws in the proof, [GGM4]. The work [DGJS] deals only with the isochronous case and, more important, it uses a very different definition of splitting: a definition that, if used instead of the one in [CG], would make the error there disappear! They do not study the determinant of the intersection matrix but only the difference of the values of a certain observable called “the analytic integral”, closely related to the energy of the free pendulum, evaluated on the two manifolds. This is interesting (and it seems related to preexisting numerical experiments) but it is different from the problems studied in [G3] (and [CG]). This cannot be considered as a measure of the splitting, see Appendix R1, in our sense. On the other hand the results in [G3] can be seen to imply, in some cases, already essentially all the results of [DGJS] and furthermore bounds on the splitting determinant in the isochronous and anisochronous cases. “Essentially” means that the results will follow for even interactions which are polynomials of very high degree and with non zero coefficients verifying the conditions of [DGJS], and replacing the “analytic integral” with the energy of the free pendulum (the relation between the two is also quite simple). This is explained in [GGM3]. Furthermore the results in [RW] and [DGJS] deal with the splitting on sections that are not the one we consider, namely ϕ = π. This is for us an important further difference as we are interested in showing existence of heteroclinic chains with the method of [CG], which deals with the section at ϕ = π; see [GGM5]. In fact an application of the present work to heteroclinic chains and Arnol’d diffusion is analyzed in [GGM2], and described above in Appendix R1: we do not see that the results of [DGJS], in the isochronous case, not to mention the anisochronous ones that they do not treat, suffice to prove the existence of heteroclinic chains as discussed in Appendix R1 above and in [GGM2].

Acknowledgement. We are indebted to P. Lochak for many discussions and for encouraging one of us to revise the previous work [CG] in order to present a simplified version. One of us (GiG) is deeply indebted to V. Gelfreich for pointing out, in a meeting organized and led by P. Lochak, the error in [CG] that is corrected in the present paper. We also thank C. Simó and A. Jorba for comments on the manuscript. This work is part of the research program of the European Network on: “Stability and Universality in Classical Mechanics”, # ERBCHRXCT940460.

References

[A1] Arnol’d, V.I.: Proof of a A.N. Kolmogorov theorem on conservation of conditionally periodic motions under small perturbations of the Hamiltonian function. Usp. Mat. Nauk 18, 13–40 (1963)
[A2] Arnol’d, V.I.: Instability of dynamical systems with several degrees of freedom. Sov. Math. Dokl. 5, 581–585 (1966)
[ACKR] Amick, C., Ching, E.S.C., Kadanoff, L.P., Rom–Kedar, V.: Beyond All Orders: Singular Perturbations in a Mapping. J. Nonlinear Sci. 2, 9–67 (1992)
[BCF] Benettin, G., Carati, A., Fassò, A.: On the conservation of adiabatic invariants for a system of coupled rotators. Physica D 104, 253–268 (1997)
[BCG] Benettin, G., Carati, A., Gallavotti, G.: A rigorous implementation of the Jeans–Landau–Teller approximation for adiabatic invariants. Nonlinearity 10, 479–507 (1997)
[BGGM] Bonetto, F., Gentile, G., Gallavotti, G., Mastropietro, V.: Lindstedt series, ultraviolet divergences and Moser’s theorem. Annali della Scuola Normale Superiore di Pisa Cl. Sci. Ser. IV 26, 545–593 (1998); Quasi linear flows on tori: Regularity of their linearization. Commun. Math. Phys. 192, 707–736 (1998)
[CG] Chierchia, L., Gallavotti, G.: Drift and diffusion in phase space. Annales de l’Institut Henri Poincaré B 60, 1–144 (1994). See also the erratum: B 68, 135 (1998)
[DGJS] Delshams, S., Gelfreich, V.G., Jorba, A., Seara, T.M.: Exponentially small splitting of separatrices under fast quasiperiodic forcing. Commun. Math. Phys. 189, 35–72 (1997)
[E] Eliasson, L.H.: Absolutely convergent series expansions for quasi-periodic motions. Math. Phys. Electronic J. 2 (1996)
[G1] Gallavotti, G.: The elements of Mechanics. Berlin–Heidelberg–New York: Springer, 1983. See also: Quasi integrable mechanical systems, Les Houches, XLIII (1984), vol. II, Ed. K. Osterwalder & R. Stora, Amsterdam: North Holland, 1986, pp. 539–624
[G2] Gallavotti, G.: Twistless KAM tori. Commun. Math. Phys. 164, 145–156 (1994)
[G3] Gallavotti, G.: Twistless KAM tori, quasi flat homoclinic intersections, and other cancellations in the perturbation series of certain completely integrable Hamiltonian systems. A review. Rev. Math. Phys. 6, 343–411 (1994)
[G4] Gallavotti, G.: Methods in the theory of quasi periodic motions. Expanded version of a talk at the Conference in honor of Lax and Nirenberg, Venezia, June 1996, in print, mp_arc@math.utexas.edu #96–498
[G5] Gallavotti, G.: Fast Arnold’s diffusion in isochronous systems. In: chao-dyn 9709011, revised in http://ipparco.roma1.infn.it. And Gallavotti, G.: Hamilton–Jacobi’s equation and Arnold’s diffusion near invariant tori in a priori unstable isochronous systems. chao-dyn #9710019, in print in Seminario Matematico di Torino
[G6] Gallavotti, G.: Reminiscences on science at I.H.E.S. A problem on homoclinic theory and a brief review. chao-dyn #9804044. In print in Publications Mathématiques de l’Institut des Hautes Études Scientifiques (I.H.E.S), 1998
[GG] Gallavotti, G., Gentile, G.: Majorant series convergence for twistless KAM tori. Ergodic Theory and Dyn. Syst. 15, 857–869 (1995)
[GGM1] Gallavotti, G., Gentile, G., Mastropietro, V.: Field theory and KAM tori. Math. Phys. Electronic J. 1 (1995)
[GGM2] Gentile, G., Gallavotti, G., Mastropietro, V.: Hamilton-Jacobi equation, heteroclinic chains and Arnol’d diffusion in three time scales systems. Archived in chao-dyn@xyz.lanl.gov, #9801004
[GGM3] Gentile, G., Gallavotti, G., Mastropietro, V.: Mel’nikov’s approximation dominance. Some examples. To appear in Rev. Math. Phys.
[GGM4] Gentile, G., Gallavotti, G., Mastropietro, V.: Homoclinic splitting, II. A possible counterexample to the main results of the Physica D paper 114, 3–80 (1998). chao-dyn #9804017
[GGM5] Gentile, G., Gallavotti, G., Mastropietro, V.: Homoclinic splitting. I. Comment on a Physica D paper of Rudnev and Wiggins. mp_arc #98–245
[Gl] Gelfreich, V.G.: Mel’nikov method and exponentially small splitting of separatrices. Physica D 101, 227–248 (1996)
[GLT] Gelfreich, V.G., Lazutkin, V.F., Tabanov, M.B.: Exponentially small splitting in Hamiltonian systems. Chaos 1, 137–142 (1991)
[Ge1] Gentile, G.: A proof of existence of whiskered tori with quasi flat homoclinic intersections in a class of almost integrable systems. Forum Mathematicum 7, 709–753 (1995)
[Ge2] Gentile, G.: Whiskered tori with prefixed frequencies and Lyapunov spectrum. Dynamics and Stability of Systems 10, 269–308 (1995)
[GM] Gentile, G., Mastropietro, V.: KAM theorem revisited. Physica D 90, 225–234 (1996); Tree expansion and multiscale analysis for KAM tori. Nonlinearity 8, 1159–1178 (1995); Methods for the analysis of the Lindstedt series for KAM tori and renormalizability in classical mechanics. A review with some applications. Rev. Math. Phys. 8, 393–444 (1996)
[GR] Gradshteyn, I.S., Ryzhik, I.M.: Table of integrals, series and products. London–New York: Academic Press, 1965
[Gr] Graff, S.M.: On the conservation for hyperbolic invariant tori for Hamiltonian systems. J. Differ. Eqs. 15, 1–69 (1974)
[HMS] Holmes, P., Marsden, J., Scheurle, J.: Exponentially small splittings of separatrices with applications to KAM theory and degenerate bifurcations. Contemp. Math. 81, 213–244 (1989)
[LST] Lazutkin, V.F., Schachmannski, I.G., Tabanov, M.B.: Splitting of separatrices for standard and semistandard mappings. Physica D 40, 235–248 (1989)
[Me] Mel’nikov, V.K.: On the stability of the center for time periodic perturbations. Trans. Moscow Math. Soc. 12, 1–57 (1963)
[RW] Rudnev, M., Wiggins, S.: Existence of exponentially small separatrix splittings and homoclinic connections between whiskered tori in weakly hyperbolic near integrable Hamiltonian systems. Physica D 114, 3–80 (1998)
[S] Simó, C.: Averaging under fast quasiperiodic forcing. In: Integrable and chaotic behaviour in Hamiltonian systems. Torun, Poland (1993), Ed. I. Seimenis, New York: Plenum, 1994, pp. 13–34
[T] Thirring, W.: Course in Math. Physics. vol. 1. Wien: Springer, 1983, p. 133

Communicated by J. L. Lebowitz

Commun. Math. Phys. 202, 237 – 253 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Intermittency of the Tracer Gradient

Leonid I. Piterbarg¹,*, Vladimir V. Piterbarg²

¹ Department of Mathematics, University of Southern California, Los Angeles, CA 90089-1113, USA. E-mail: [email protected]
² 233 South Wacker Drive, Suite 2800, Chicago, IL 60606, USA. E-mail: [email protected]

Received: 21 November 1997 / Accepted: 7 October 1998

Abstract: The problem of stirring a passive scalar (tracer) by a random velocity field is considered. For Gaussian velocity fields with infinitely small time and space scales it is shown that the tracer gradient is concentrated on a discrete set of points, which form a random point process independent of the initial tracer distribution. A complete description of this point process is given. If the initial tracer field is a random function with homogeneous increments, the full statistics of the jumps are also given.

* The first author was supported by ONR Grant N00014-99-0042.

1. Introduction

Numerous observations in hydrodynamics and oceanography demonstrate extremely sharp gradients of the temperature and other tracers in the presence of fluctuating currents (see e.g. [3], [12], [13, pp. 64–65]). The most interesting feature of these observations is that such a distribution of gradients occurs in statistically homogeneous environments. However, the theoretical explanation of these results is far from complete. The goal of this paper is to construct an exactly solvable one-dimensional stochastic model which exhibits this phenomenon and allows us to describe this behavior qualitatively and quantitatively. The model may be considered oversimplified and unrealistic by some, but it does provide meaningful insights into the formation of highly intermittent tracer distributions in a homogeneous medium. The reader interested in this subject is referred to [13, pp. 67–69], [9] and [10] for other approaches and results in this direction.

Let $u(t,x)$ be a random velocity field on the real line. We assume that it is a stationary Gaussian white noise in time $t$ and a smooth homogeneous function of the space coordinate $x \in \mathbb{R}^1$. Without loss of generality we also assume that $E\{u(t,x)\} = 0$. Hence
\[ E\{u(t,x)\,u(s,y)\} = \delta(t-s)\,B(x-y), \tag{1} \]


where $\delta(\cdot)$ is the Dirac delta function and $B(x)$ is a smooth covariance function. Let us renormalize the velocity field as follows,
\[ u_\varepsilon(t,x) = \varepsilon^{-1}\,u(t/\varepsilon^2,\, x/\varepsilon), \tag{2} \]
where $\varepsilon$ is a small positive parameter. Because of the white noise assumption this is equivalent to $u_\varepsilon(t,x) = u(t, x/\varepsilon)$. We consider the simplest equation describing the stirring of a passive scalar (tracer),
\[ \frac{\partial c_\varepsilon}{\partial t} + u_\varepsilon(t,x)\,\frac{\partial c_\varepsilon}{\partial x} = 0, \qquad c_\varepsilon(0,x) = c_0(x), \tag{3} \]
where $c_\varepsilon(t,x)$ is the tracer concentration, and the initial condition $c_0(x)$ is a continuous (random or deterministic) function that does not depend on $\varepsilon$. Our goal is to study the limiting behavior of $c_\varepsilon(t,x)$ as $\varepsilon \to 0$. Thus, in physical language we face a velocity field with infinitely small time and space correlation scales. We formulate our findings in terms of the gradient
\[ g_\varepsilon(t,x) = \frac{\partial c_\varepsilon(t,x)}{\partial x}. \]

(i) As $\varepsilon \to 0$,
\[ g_\varepsilon(t,x) \to \sum_k \zeta_k(t)\,\delta(x - a_k(t)) \tag{4} \]
in distribution, where $\{a_k = a_k(t)\}$ is a simple homogeneous point process on the real line and $\{\zeta_k = \zeta_k(t)\}$ is a sequence of random values for each fixed $t$. In other words, the initial tracer distribution, however smooth, is broken into a piecewise constant random function.

(ii) The point process of the singularity points, $A = \{a_k\}$, is completely defined by the velocity field and is completely independent of the initial condition. Moreover, the probabilistic distribution of $A$ is determined by a single parameter equal to the product $Dt$ (see Theorem 3.1), where $D = B(0)$. The process $A$ is not a renewal process.

(iii) If $c_0(x) = x$, then $\{\zeta_k\} = \{a_{k+1} - a_k\}$ in distribution. The sequences $\{\zeta_k\}$ and $\{a_k\}$ are not independent.

(iv) If $c_0(x)$ is a random function with stationary increments independent of the velocity, then $\{\zeta_k\}$ is a stationary sequence whose distribution can be written explicitly, see (42).

The general picture for the linear initial tracer distribution is illustrated in Fig. 1. We note that the most important statement (i) is discussed here only briefly, because it is a simple combination and interpretation of already proven results, see [1, 15, 16]. In this paper we focus on the computation of probabilistic characteristics of the singularity point set $A$ and the sequence of jumps $\{\zeta_k\}$.

The paper is organized as follows. Section 2 presents a rigorous formulation of the convergence results for one-dimensional stochastic Brownian flows. Section 3 is devoted to studying the point process $A$

and the jumps $\zeta_k$. We give a simple explicit formula for the probability $P(A \cap C = \emptyset)$, where $C$ is a union of segments in $\mathbb{R}^1$ (see Theorem 3.1). Using this result we derive expressions for the factorial moment densities of the number of points $N(I)$ of $A$ in any interval $I$, see (22). Explicit formulas for the second and third moments of $N(I)$ are obtained in (23). The section ends with formulas for joint densities of jumps (35) derived from the duality property of homogeneous Brownian flows. Section 4 augments our model with an arbitrary random initial condition with homogeneous increments. In this case the initial profile is also transformed into a piecewise constant function. The set of discontinuity points remains the same as for the linear profile, but the distribution of the jumps is affected by the distribution of the increments of the initial condition as given in Theorem 4.1. Finally, the role of compressibility and viscosity is discussed in Sect. 5.

Fig. 1. [Sketch: the initial profile $c_0(x)$ at $t = 0$ and the piecewise constant limit $\lim_{\varepsilon \to 0} c_\varepsilon(t,x)$ at $t > 0$, with jumps $\zeta_k$, $\zeta_{k+1}$ at the points $a_k$, $a_{k+1}$.]

2. Convergence of Brownian Flows to the Coalescing Flow

The coalescing Brownian flow, to be defined in a moment, plays a central role in our investigation. We should think of it as a rigorous interpretation of the stochastic flow with the velocity $u_0(t,x)$ having the covariance function
\[ E\{u_0(t,x)\,u_0(s,y)\} = D\,\delta(t-s)\,\mathbf{1}_{\{0\}}(x-y), \]
which is a "limit" for (1) under the rescaling (2). Here $\mathbf{1}_C(x)$ is the indicator of a set $C$. The coalescing Brownian flow in $\mathbb{R}^1$ (with rate $D$) is a process $K = \{K(t,x),\ x \in \mathbb{R}^1,\ t \in \mathbb{R}_+\}$, where $K(t,x)$ can be interpreted as the position at time $t$ of the particle started at position $x$ at time 0, such that


(i) For fixed $x \in \mathbb{R}^1$, $\{K(t,x),\ t \ge 0\}$ is a one-dimensional Brownian motion with diffusion coefficient $D$ started at $x$, $E\,(K(t,x) - K(s,x))^2 = D|t-s|$;
(ii) For any $x \ne y$, $\{K(t,x) = K(t,y)\}$ implies $\{K(s,x) = K(s,y)$ for any $s \ge t\}$;
(iii) The motions $\{K(t,x),\ t \ge 0\}$ for different $x$'s are independent until coalescence.

In plain English, a particle is started at every point on the real line. All particles move as independent Brownian motions with rate $D$, but once two particles meet, they stick together, or coalesce, and move as a single Brownian particle. It is known ([1]) that for the coalescing Brownian flow the set of particles surviving at time $t$ is discrete for any $t > 0$. In particular, for any $t > 0$ there exists a simple homogeneous point process $\{a_k(t)\}_{k=-\infty}^{\infty}$ such that
\[ K(t,x) = \sum_{k=-\infty}^{\infty} K(t, a_k(t))\,\mathbf{1}_{[a_k(t),\,a_{k+1}(t))}(x), \qquad x \in \mathbb{R}^1, \]
where $K(t,x)$ is assumed to be right continuous.

Let $u(t,x)$ be a zero mean Gaussian field which is white noise in time and homogeneous in space, so that (1) holds. The space covariance function $B(\cdot)$ is assumed to satisfy the following conditions,
• $B(\cdot) \in C^2(\mathbb{R})$;
• $B(x) \to 0$ as $|x| \to \infty$.
By $\{X(t,x),\ x \in \mathbb{R}^1,\ t \in \mathbb{R}_+\}$ we denote the space-homogeneous stochastic flow with the velocity field $u(t,x)$,
\[ X(t,x) = x + \int_0^t u(s, X(s,x))\,ds. \tag{5} \]

It is clear that the rescaled process
\[ X_\varepsilon(t,x) = \varepsilon\,X(t/\varepsilon^2,\, x/\varepsilon) \tag{6} \]
is also a space-homogeneous stochastic flow with the velocity field
\[ u_\varepsilon(t,x) = \frac{1}{\varepsilon}\,u(t/\varepsilon^2,\, x/\varepsilon). \]
It was proven in [15, 16] that the family of rescaled stochastic flows $X_\varepsilon(\cdot,\cdot)$ converges (in various senses) to the coalescing flow $K(\cdot,\cdot)$, and these results are the backbone of our investigation. Let us state them rigorously. The conditions on the covariance structure $B(\cdot)$ listed above are assumed to hold.

Proposition 2.1. For any $n \in \mathbb{Z}_+$ and for any $(x_1, \ldots, x_n) \in \mathbb{R}^n$, the sequence of rescaled $n$-particle motions emanating from $(x_1, \ldots, x_n)$ converges weakly to the corresponding $n$-particle coalescing Brownian motion, i.e.
\[ (X_\varepsilon(\cdot,x_1), \ldots, X_\varepsilon(\cdot,x_n)) \Rightarrow (K(\cdot,x_1), \ldots, K(\cdot,x_n)) \quad \text{as } \varepsilon \to 0. \]
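The plain-English description of the coalescing flow given above can be illustrated by a crude simulation. The sketch below is not the construction used in [1] or [15, 16]: it assumes a finite grid of starting points, a fixed time step, and the rule that a particle landing on or to the left of its left neighbour is merged into it. The function name and all parameters are hypothetical. The number of survivors can be compared with $\lambda \cdot \text{length}$, where $\lambda = 1/\sqrt{\pi D t}$ is the intensity derived in Sect. 3; the discrete time step biases the count upwards by a few percent.

```python
import math
import random


def simulate_coalescing_walks(D=1.0, t=1.0, length=200.0, spacing=0.25,
                              steps=4000, seed=1):
    """Crude discrete-time approximation of the coalescing Brownian flow.

    Particles start on a regular grid, take independent Gaussian steps with
    variance D*dt (so E(dK)^2 = D*dt, as in property (i)), and a particle
    that lands on or to the left of its left neighbour is merged into it.
    """
    rng = random.Random(seed)
    dt = t / steps
    step_sd = math.sqrt(D * dt)
    n0 = int(round(length / spacing))
    positions = [spacing * (i + 0.5) for i in range(n0)]
    for _ in range(steps):
        moved = [x + rng.gauss(0.0, step_sd) for x in positions]
        survivors = [moved[0]]
        for x in moved[1:]:
            if x > survivors[-1]:
                survivors.append(x)  # still a separate particle
            # otherwise the particle met its left neighbour and coalesced
        positions = survivors
    return positions


D, t, length = 1.0, 1.0, 200.0
survivors = simulate_coalescing_walks(D=D, t=t, length=length)
expected = length / math.sqrt(math.pi * D * t)  # lambda * length
print(len(survivors), round(expected, 1))
```

The time step is deliberately chosen small compared with the initial spacing; with a coarser step, crossings inside a step are missed and the survivor count is inflated noticeably.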


Let us fix $t > 0$. As a function of $x$, each particular realization of $X_\varepsilon(t,x)$ is an increasing continuous function. Also, $K(t,x)$ is a piecewise constant non-decreasing function of $x$. For any $\phi(x)$, a $C^1(\mathbb{R})$ function with compact support, the integrals
\[ \int \phi(x)\,dX_\varepsilon(t,x), \qquad \int \phi(x)\,dK(t,x) \]
are defined for each realization in the Lebesgue–Stieltjes sense. The quantities $dX_\varepsilon(t,x)$, $dK(t,x)$ can be regarded as gradients of the corresponding flows. The following proposition states a stronger kind of convergence than was claimed in Proposition 2.1.

Proposition 2.2. Let us fix $t > 0$. Then
\[ \int \phi(x)\,dX_\varepsilon(t,x) \to \int \phi(x)\,dK(t,x) \tag{7} \]
as $\varepsilon \to 0$ in distribution.

Define $\xi_k = K(t,a_k) - K(t,a_k - 0)$. The right-hand side of (7) is equal to $\sum_k \xi_k\,\phi(a_k)$, so the statement (7) can be viewed as the convergence of the gradient $g_\varepsilon(t,x) = dX_\varepsilon(t,x)/dx$ to $\sum_k \xi_k\,\delta(x - a_k)$ as stated in (4). Note that for the special initial condition $c_0(x) = x$ the notation for jumps is changed from $\zeta_k$ to $\xi_k$.

3. Set of Discontinuity Points and Jumps

In this section we give a complete characterization of the random set $A = \{a_k\}$ and of the sequence of jumps $\{\xi_k\}$. The time $t > 0$ is assumed fixed. We say that an interval in $\mathbb{R}^1$ is empty if its intersection with $A$ is empty. Let $I_1, \ldots, I_n$ be a collection of non-overlapping finite intervals separated by intervals $J_1, J_2, \ldots, J_{n-1}$ (see Fig. 2).

Fig. 2. [Sketch: the intervals $I_1, I_2, \ldots, I_n$ on the real line and the gaps $J_k$ between them.]

The lengths of the intervals are denoted by $z_{2k-1} = |I_k|$, $k = 1, \ldots, n$; $z_{2k} = |J_k|$, $k = 1, \ldots, n-1$. The probability that all $I_k$ are empty depends only on $z_j$, $j = 1, \ldots, 2n-1$. We define the zero-function by
\[ p_n(z_1, z_2, \ldots, z_{2n-1}) = P\Big(A \cap \Big(\bigcup_{k=1}^{n} I_k\Big) = \emptyset\Big). \tag{8} \]
It is well known (see e.g. [6, p. 10]) that the family of probabilities (8) completely determines the distribution of $A$. The following statement gives the explicit formula for (8).


Theorem 3.1. For any non-negative $z_1, z_2, \ldots, z_{2n-1}$,
\[ p_n(z_1, z_2, \ldots, z_{2n-1}) = \sum_{(i_1,j_1,\ldots,i_n,j_n)} \Phi(z_{i_1} + \cdots + z_{j_1-1})\,\Phi(z_{i_2} + \cdots + z_{j_2-1}) \cdots \Phi(z_{i_n} + \cdots + z_{j_n-1})\,(-1)^{s(i_1,j_1,\ldots,i_n,j_n)}, \tag{9} \]
where
\[ \Phi(x) = \Phi_D(t,x) = 1 - \frac{1}{\sqrt{\pi D t}} \int_0^x \exp\Big(-\frac{u^2}{4Dt}\Big)\,du, \tag{10} \]
the summation in (9) is over all permutations $(i_1, j_1, \ldots, i_n, j_n)$ of $(1, \ldots, 2n)$ such that $i_1 < j_1, \ldots, i_n < j_n$, $i_1 < i_2 < \cdots < i_n$, and $s(i_1, j_1, \ldots, i_n, j_n)$ is the parity of the permutation, so that $(-1)^s$ is its sign. Recall that $D = B(0)$, where $B(x)$ is given in (1).

Note that the number of terms on the right-hand side of (9) is $(2n-1)!!$. This formula is similar to the formula for the $2n$th order moment of a Gaussian process with the covariance function $\Phi(x)$, but in our case we have alternating signs. It follows from (9) that
\[ \begin{aligned} p_1(z_1) &= P\big(A \cap (0,z_1) = \emptyset\big) = \Phi(z_1), \\ p_2(z_1,z_2,z_3) &= P\big(A \cap (0,z_1) = \emptyset,\ A \cap (z_1+z_2,\, z_1+z_2+z_3) = \emptyset\big) \\ &= \Phi(z_1)\Phi(z_3) - \Phi(z_1+z_2)\Phi(z_2+z_3) + \Phi(z_1+z_2+z_3)\Phi(z_2). \end{aligned} \tag{11} \]
From (11) we readily obtain the expression for the intensity $\lambda$ of the point process $A$,
\[ \lambda = \lim_{z \to 0} \frac{1 - p_1(z)}{z} = \frac{1}{\sqrt{\pi D t}}. \]
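The pairing sum in (9) is easy to evaluate by brute force for small $n$. The sketch below is only an illustration under stated assumptions: the helper names (`pairings`, `perm_sign`, `zero_probability`) are hypothetical, $\Phi$ is written through the complementary error function, which agrees with (10), and the sign of each term is obtained by counting inversions of $(i_1, j_1, \ldots, i_n, j_n)$.

```python
import math


def Phi(x, D=1.0, t=1.0):
    """Phi_D(t, x) from (10), written as erfc(x / (2 sqrt(D t)))."""
    return math.erfc(x / (2.0 * math.sqrt(D * t)))


def pairings(items):
    """Yield all pairings [(i1, j1), ...] of a sorted list with
    i_k < j_k and i_1 < i_2 < ... (there are (2n-1)!! of them)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def perm_sign(seq):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for a in range(len(seq))
                     for b in range(a + 1, len(seq)) if seq[a] > seq[b])
    return -1 if inversions % 2 else 1


def zero_probability(z, D=1.0, t=1.0):
    """p_n(z_1, ..., z_{2n-1}) evaluated via the pairing sum (9)."""
    x = [0.0]                       # interval endpoints x_1 < ... < x_2n
    for zk in z:
        x.append(x[-1] + zk)
    total = 0.0
    for pairing in pairings(list(range(len(x)))):
        flat = [idx for pair in pairing for idx in pair]
        term = float(perm_sign(flat))
        for i, j in pairing:
            term *= Phi(x[j] - x[i], D, t)   # z_i + ... + z_{j-1} = x_j - x_i
        total += term
    return total


print(zero_probability([0.5, 0.3, 0.8]))
```

For $n = 2$ the routine reproduces the three terms of (11), and shrinking a gap $J_k$ to zero length merges the two neighbouring intervals, as the boundary condition (14) in the proof below requires.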

Proof. Define $\{x_k\}$ by $I_k = (x_{2k-1}, x_{2k})$, $k = 1, \ldots, n$, so that $x_{2k} - x_{2k-1} = z_{2k-1}$. Denote the probability on the right-hand side of (8) by $f_n(t; x_1, x_2, \ldots, x_{2n-1}, x_{2n}) = f_n(t; x)$ and set
\[ G_n = \{(x_1, \ldots, x_{2n}) : x_1 < x_2 < \cdots < x_{2n}\} \subset \mathbb{R}^{2n}, \]
\[ \Gamma_k = \{(x_1, \ldots, x_{2n}) : x_1 < x_2 < \cdots < x_k = x_{k+1} < \cdots < x_{2n}\} \subset \mathbb{R}^{2n}, \quad k = 1, 2, \ldots, 2n-1. \]
We assert that
\[ \frac{\partial f_n}{\partial t} = \Delta f_n, \quad x \in G_n, \tag{12} \]
\[ f_n|_{t=0} \equiv 0, \tag{13} \]
\[ f_n|_{x \in \Gamma_k} = f_{n-1}(t; x_1, \ldots, x_{k-1}, x_{k+2}, \ldots, x_{2n}), \quad k = 1, \ldots, 2n-1, \tag{14} \]
where $\Delta$ is the Laplace operator (multiplied by the constant $D/2$) in $\mathbb{R}^{2n}$. The fact that $f_n$ satisfies (12) follows from Theorem 2.1 in [16]. Informally, it can be explained as follows. Part of the definition of the finite-particle coalescing Brownian motion states that inside the domain $G_n$ the coalescing Brownian motion behaves just like an ordinary $2n$-dimensional Brownian motion (see e.g. Lemma 2.1 in [16]). Since Eq. (12) makes a statement about $f_n$ only up to the boundary of $G_n$, its validity follows from the same property of an ordinary $2n$-dimensional Brownian motion. The initial and boundary conditions (13), (14) are obviously satisfied for the zero-function. The relations (12), (13), (14) also follow from the fact that $f_n(t; x_1, \ldots, x_{2n})$ is the annihilation probability for the $2n$-dimensional Brownian annihilation motion, see [1].

The proof can be stopped here because it is reasonably straightforward to check that the function
\[ f_n(t; x) = \sum_{(i_1,j_1,\ldots,i_n,j_n)} \Phi(x_{j_1} - x_{i_1}) \cdots \Phi(x_{j_n} - x_{i_n})\,(-1)^{s(i_1,j_1,\ldots,i_n,j_n)} \tag{15} \]
satisfies (12), (13) and (14). However, we would like to go through the main steps of the original derivation in the hope of making it clear where this function $f_n$ came from. Set
\[ \varphi(x) = \varphi_D(t,x) = -\frac{\partial}{\partial x}\Phi(x) = \frac{1}{\sqrt{\pi D t}}\,e^{-x^2/4Dt}. \tag{16} \]

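Then the Green function of the heat equation in $G$ corresponding to the zero boundary conditions on $\partial G = \bigcup_{k=1}^{n-1} \Gamma_k \cup \{x_1 = -\infty\} \cup \{x_n = \infty\}$ is given by
\[ P(t,x,y) = 2^{-n} \sum_{(i_1,i_2,\ldots,i_{2n-1},i_{2n})} (-1)^{s(i_1,i_2,\ldots,i_{2n-1},i_{2n})}\,\varphi(x_1 - y_{i_1}) \cdots \varphi(x_n - y_{i_n}), \tag{17} \]
where the summation is over all permutations of $(1, 2, \ldots, 2n)$. In particular, we have $(2n)!$ terms in the sum in (17). Then (see e.g. [4])
\[ f_n(t; x) = \int_0^t \sum_{k=1}^{n-1} \int_{\Gamma_k} \frac{\partial P(t-s, x, y)}{\partial n_k}\,g_{k,n-1}(s,y)\,dy\,ds, \tag{18} \]
where $\partial/\partial n_k$ are the derivatives normal to $\Gamma_k$ with respect to $y$, and $g_{k,n-1}(s,y) = f_{n-1}(t,x)|_{x \in \Gamma_k}$. The normal derivatives in this case are easy to compute,
\[ \begin{aligned} \frac{\partial P}{\partial n_k} &= \frac{1}{\sqrt{2}} \Big(\frac{\partial P}{\partial y_{k-1}} - \frac{\partial P}{\partial y_k}\Big)\Big|_{y_k = y_{k+1}} = \frac{x_{i_k} - x_{i_{k+1}}}{2\sqrt{2}\,Dt}\,P|_{y_k = y_{k+1}} \\ &= \frac{2^{-n}}{\sqrt{2}\,Dt} \sum_{1 \le i_k \cdots} (x_{i_k} - x_{i_{k+1}})\,\varphi(y_k - x_{i_k})\,\varphi(y_k - x_{i_{k+1}}) \cdots \\ &\qquad \times \varphi(y_{k+2} - x_{i_{k+2}}) \cdots \varphi(y_{2n} - x_{i_{2n}}). \end{aligned} \tag{19} \]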


Let us first substitute (19) into (18), then integrate each term with respect to $y_k$ and at the end change the variables to $z_1 = y_1, \ldots, z_{k-1} = y_{k-1}$, $z_k = y_{k+2}, \ldots, z_{2n-2} = y_{2n}$. The result is
\[ f_n(t; x) = \int_0^t \int_{\{z_1 \le \cdots \le z_{2n-2}\}} G_n(t-s, x_1, x_2, \ldots, x_{2n-1}, x_{2n}; z_1, \ldots, z_{2n-2})\,f_{n-1}(s; z_1, z_2, \ldots, z_{2n-2})\,dz_1 \ldots dz_{2n-2}\,ds, \tag{20} \]
where
\[ \begin{aligned} G_n(t, x_1, \ldots, x_{2n}; y_1, \ldots, y_{2n-2}) &= \frac{2^{-n}}{\sqrt{2}\,Dt} \sum_{1 \le i_k \cdots} (x_{i_k} - x_{i_{k+1}})\,\varphi(x_{i_k} - x_{i_k}) \cdots \sum_{(i_1,\ldots,i_{k-1},i_{k+1},\ldots,i_{2n})} \cdots \\ &\qquad \times \varphi(y_k - x_{i_{k+2}}) \cdots \varphi(y_{2n-2} - x_{i_{2n}}). \end{aligned} \tag{21} \]
What is surprising is that the recurrent formula (20) with multiple integrals can be folded down into the explicit expression (15) with no integrals in it at all.

Now it is time to discuss some consequences of (9). First, the self-similarity of $A$ is pointed out. Formulas for the moments of the number $N(I)$ of points of $A$ in the interval $I$ are given next. In particular, an explicit expression for the variance is obtained. Finally the probabilities $p_n = P(N(I) = n)$ are computed using (9).

Let us fix the intensity $\lambda$ and denote the number of $A$-points in $C \subset \mathbb{R}^1$ by $N_\lambda(C)$. Then it can easily be seen from (9) that $N_{\lambda/k}(kC)$ has the same distribution as $N_\lambda(C)$ for any positive $k$ and any Borel set $C$. This property also follows from the self-similarity of $K(t,x)$ noted in [1],
\[ \frac{1}{\sqrt{k}}\,K(tk,\, x\sqrt{k}) = K(t,x) \]
in the sense of distributions.

Let us introduce the densities of the factorial moment measures following [7, 17],
\[ h_n(x_1, \ldots, x_{n-1}) = \left.\frac{\partial^n E\{N(I_1) N(I_2) \ldots N(I_n)\}}{\partial z_1\,\partial z_3 \cdots \partial z_{2n-1}}\right|_{\Theta}, \]
where
\[ \Theta = \{z : z_1 = z_3 = \cdots = z_{2n-1} = 0,\ z_2 = x_1,\ z_4 = x_2, \ldots, z_{2n-2} = x_{n-1}\}. \]
Let $B_k = \{N(I_k) > 0\}$. For small $z_1, z_3, \ldots, z_{2n-1}$ we have
\[ E\{N(I_1) N(I_2) \ldots N(I_n)\} \approx P(B_1 B_2 \ldots B_n) = 1 - \sum_k P(\bar B_k) + \sum_{k<l} P(\bar B_k \bar B_l) - \ldots + (-1)^n P(\bar B_1 \bar B_2 \ldots \bar B_n). \]


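Using the definition of the factorial moment densities and the explicit formulas for the probabilities on the right-hand side we get
\[ \begin{aligned} h_1 &= \lambda, \\ h_2(x_1) &= \lambda^2 - \varphi(x_1)^2 - \varphi'(x_1)\Phi(x_1), \\ h_3(x_1,x_2) &= \lambda^3 - \lambda\big(\varphi(x_1)^2 + \varphi(x_2)^2 + \varphi'(x_1)\Phi(x_1) + \varphi'(x_2)\Phi(x_2) + \varphi(x_1+x_2)^2 + \varphi'(x_1+x_2)\Phi(x_1+x_2)\big) \\ &\quad + \varphi'(x_1)\varphi(x_2)\Phi(x_1+x_2) + \varphi(x_1)\varphi'(x_2)\Phi(x_1+x_2) + 2\varphi(x_1)\varphi(x_2)\varphi(x_1+x_2) \\ &\quad - \varphi'(x_1)\Phi(x_2)\varphi(x_1+x_2) - \varphi'(x_2)\Phi(x_1)\varphi(x_1+x_2) \\ &\quad + \varphi(x_1)\Phi(x_2)\varphi'(x_1+x_2) + \varphi(x_2)\Phi(x_1)\varphi'(x_1+x_2). \end{aligned} \tag{22} \]
From the latter we get the moments using the following formulas (see [6, 17]):
\[ E N(0,a)^2 = \lambda a + 2\int_0^a (a-x)\,h_2(x)\,dx, \]
\[ E N(0,a)^3 = \lambda a + 6\int_0^a (a-x)\,h_2(x)\,dx + 6\int_0^a \int_0^{a-x} h_3(x,y)\,(a-x-y)\,dy\,dx. \]
For the central moments $\mu_2$, $\mu_3$, writing $\operatorname{erfc} = 1 - \operatorname{erf}$ for the complementary error function,
\[ \begin{aligned} \mu_2(a) &= (3 - 2\sqrt{2})\,\lambda a - \frac{\pi - 4}{\pi} - \frac{4}{\pi}\exp\Big(-\frac{\pi\lambda^2 a^2}{2}\Big) + 2\sqrt{2}\,\lambda a\,\operatorname{erfc}\Big(\frac{\sqrt{2\pi}\,\lambda a}{2}\Big) + \operatorname{erfc}^2\Big(\frac{\sqrt{\pi}\,\lambda a}{2}\Big), \\ \mu_3(a) &= \lambda a - 6\int_0^a (a-x)\,\psi_0(x)\,dx + 12\int_0^a \int_0^{a-x} (a-x-y)\,\psi_1(x,y)\,dy\,dx, \end{aligned} \tag{23} \]
where
\[ \psi_0(x) = \varphi(x)^2 + \varphi'(x)\Phi(x), \]
\[ \psi_1(x,y) = \varphi'(x)\varphi(y)\Phi(x+y) + \varphi(x)\varphi(y)\varphi(x+y) - \varphi'(x)\Phi(y)\varphi(x+y) + \varphi(x)\Phi(y)\varphi'(x+y). \]
The plots of $\mu_2(a)$, $\mu_3(a)$ are given in Fig. 3 together with $\mu_1(a) = \lambda a$ for $\lambda = 1$. Note that
\[ \lim_{a \to \infty} \frac{\mu_2(a)}{\lambda a} = 3 - 2\sqrt{2} \approx 0.1716, \qquad \lim_{a \to \infty} \frac{\mu_3(a)}{\lambda a} = 7 - 18\sqrt{2} + \frac{32\sqrt{3}}{3} \approx 0.0194. \]
An important application problem is to distinguish between a point process of type $A$ and a Poisson process. This question arises when one wants to know whether anomalous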

gradients come from random currents or from a random forcing.

Fig. 3. [Plot of $\mu_1(a)$, $\mu_2(a)$ and $\mu_3(a)$ for $0 \le a \le 5$ and $\lambda = 1$.]

For example, consider a heat balance equation alternative to (3),
\[ \frac{\partial c}{\partial t} + u\,\frac{\partial c}{\partial x} + \lambda c = f(t,x), \tag{24} \]

where the velocity $u$ and the cooling coefficient $\lambda$ are constant, and $f(t,x)$ is a Gaussian random field stationary in $t$ and homogeneous in $x$. In this case the stationary solution $c(t,x)$ is also Gaussian and homogeneous in $x$. It is well known (see [14, Chapter 4] and references therein) that the high peaks of a Gaussian stationary process form a point process close to a Poisson one. Thus, in model (24) the points of anomalous values of $\partial c(t,x)/\partial x$ must look as if they come from a Poisson process for each fixed $t$. Hence the problem of testing the model (3) against (24) can be formulated as the problem of distinguishing an $A$-process from a Poisson process. Such a test might be based on the sharp difference in behavior of the second and third moments of $N(0,a)$ for those processes (Fig. 3). Recall that for a Poisson process $\mu_2(a) = \mu_3(a) = \lambda a$.

Next on our agenda is computing the probabilities $P(N(0,a) = n)$. Let
\[ q_n(z_1, z_2, \ldots, z_{2n-1}) = P\Big(A \cap \Big(\bigcup_{k=1}^{n} I_k\Big) = \emptyset,\ A \cap J_k \ne \emptyset,\ k = 1, \ldots, n-1\Big) \tag{25} \]
be the probability that all $I_k$ are empty and all $J_k$ are non-empty. Recall that $z_{2k-1} = |I_k|$ and $z_{2k} = |J_k|$. We set
\[ S_n(x_1, \ldots, x_n) = \left.\frac{\partial^n q_n(z_1, z_2, \ldots, z_{2n-1})}{\partial z_2\,\partial z_4 \ldots \partial z_{2n-2}}\right|_{\Theta}, \tag{26} \]
where
\[ \Theta = \{z : z_2 = z_4 = \cdots = z_{2n-2} = 0,\ z_1 = x_1,\ z_3 = x_2, \ldots, z_{2n-1} = x_n\}. \]
It can be shown that
\[ P(N(0,a) = n) = \int_{\{x_1 + \cdots + x_n = a\}} S_n(x_1, \ldots, x_n)\,dx. \tag{27} \]

We need an effective way of computing $q_n$ (in terms of the quantities we know, $p_n$, calculated in (9)) to be able to use the relation (27) in computing the probabilities $P(N(0,a) = n)$. For any set of different indices $\{i_1, \ldots, i_s\} \subseteq \{1, \ldots, n-1\}$ denote
\[ q_{i_1,\ldots,i_s} = P\Big(A \cap J_{i_k} \ne \emptyset,\ k = 1, \ldots, s,\ A \cap \Big[I \setminus \bigcup_{k=1}^{s} J_{i_k}\Big] = \emptyset\Big), \]
the probability that $J_{i_1}, \ldots, J_{i_s}$ are non-empty and the rest of the $J_k$ and all $I_k$ are empty. Here
\[ I = \Big(\bigcup_{k=1}^{n} I_k\Big) \cup \Big(\bigcup_{k=1}^{n-1} J_k\Big) \]
and $s = 1, \ldots, n-1$. In particular,
\[ q_{i_1,\ldots,i_{n-1}} = q_{1,2,\ldots,n-1} = q_n(z_1, z_2, \ldots, z_{2n-1}), \]
\[ q_{i_1} = p_2(z_1 + z_2 + \cdots + z_{2i_1-1},\ z_{2i_1},\ z_{2i_1+1} + \cdots + z_{2n-1}) - p_1(z_1 + z_2 + \cdots + z_{2n-2} + z_{2n-1}). \]
It follows from the formula of total probability that
\[ q_n(z_1, z_2, \ldots, z_{2n-2}, z_{2n-1}) = p_n(z_1, z_2, \ldots, z_{2n-2}, z_{2n-1}) - p_1(z_1 + \cdots + z_{2n-1}) - \sum_{i_1=1}^{n-1} q_{i_1} - \sum_{1 \le i_1 < i_2} q_{i_1 i_2} - \ldots - \sum_{1 \le i_1 < \cdots < i_{n-2}} q_{i_1 \ldots i_{n-2}}, \tag{28} \]
exactly the kind of relation we were looking for. Note that the total number of terms on the right-hand side of (28) is $2^{n-1}$. We obtain from (26), (28) and (9) that
\[ \begin{aligned} S_1(x_1) &= \lambda, \\ S_2(x_1,x_2) &= -\Phi(x_1+x_2)\varphi(0) + \varphi(x_1)\Phi(x_2) + \varphi(x_2)\Phi(x_1), \\ S_3(x_1,x_2,x_3) &= \varphi(x_1)\varphi(x_2)\Phi(x_3) + \varphi(x_1)\varphi(x_3)\Phi(x_2) + \varphi(x_3)\varphi(x_2)\Phi(x_1) \\ &\quad - \varphi(0)\varphi(x_1)\Phi(x_2+x_3) - \varphi(0)\varphi(x_2+x_3)\Phi(x_1) - \varphi(0)\varphi(x_3)\Phi(x_1+x_2) - \varphi(0)\varphi(x_3)\Phi(x_1+x_2) \\ &\quad + \varphi(x_2)\varphi(x_1+x_2)\Phi(x_2+x_3) + \varphi(x_2)\varphi(x_2+x_3)\Phi(x_1+x_2) - \varphi(x_1+x_2)\varphi(x_2+x_3)\Phi(x_2) \\ &\quad + \varphi(0)^2\Phi(x_1+x_2+x_3) - \varphi(x_2)^2\Phi(x_1+x_2+x_3) - \varphi'(x_2)\Phi(x_1)\Phi(x_3) \\ &\quad + \varphi'(x_2)\Phi(x_1+x_2)\Phi(x_1+x_3) - \varphi'(x_2)\Phi(x_2)\Phi(x_1+x_2+x_3). \end{aligned} \tag{29} \]


From here and (27) it follows that
\[ \begin{aligned} P(N(0,a) = 0) &= \Phi(a), \\ P(N(0,a) = 1) &= -\lambda a\,\Phi(a) + 2\int_0^a \varphi(a-x)\,\Phi(x)\,dx. \end{aligned} \tag{30} \]
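The moment and counting formulas above lend themselves to a numerical sanity check. The sketch below makes several assumptions that are not in the text: it uses a plain composite Simpson rule, hypothetical helper names, and a closed form for the variance $\mu_2(a)$ written with the complementary error function (obtained by integrating $\lambda a - 2\int_0^a (a-x)\psi_0(x)\,dx$ by hand). It compares that closed form with direct quadrature and evaluates the probabilities (30).

```python
import math


def make_kernels(D=1.0, t=1.0):
    """Return (lam, phi, dphi, Phi) as in (10) and (16)."""
    Dt = D * t
    lam = 1.0 / math.sqrt(math.pi * Dt)

    def phi(x):
        return lam * math.exp(-x * x / (4.0 * Dt))

    def dphi(x):                      # derivative of phi with respect to x
        return -x / (2.0 * Dt) * phi(x)

    def Phi(x):
        return math.erfc(x / (2.0 * math.sqrt(Dt)))

    return lam, phi, dphi, Phi


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def variance_numeric(a, D=1.0, t=1.0):
    """mu_2(a) = lam*a - 2 * int_0^a (a - x) psi_0(x) dx, by quadrature."""
    lam, phi, dphi, Phi = make_kernels(D, t)
    psi0 = lambda x: phi(x) ** 2 + dphi(x) * Phi(x)
    return lam * a - 2.0 * simpson(lambda x: (a - x) * psi0(x), 0.0, a)


def variance_closed(a, D=1.0, t=1.0):
    """Closed form of mu_2(a) in terms of erfc (assumed form, see lead-in)."""
    la = a / math.sqrt(math.pi * D * t)          # lambda * a
    return ((3 - 2 * math.sqrt(2)) * la - (math.pi - 4) / math.pi
            - 4 / math.pi * math.exp(-math.pi * la * la / 2)
            + 2 * math.sqrt(2) * la * math.erfc(math.sqrt(math.pi / 2) * la)
            + math.erfc(math.sqrt(math.pi) * la / 2) ** 2)


def count_probabilities(a, D=1.0, t=1.0):
    """P(N(0,a) = 0) and P(N(0,a) = 1) from (30)."""
    lam, phi, dphi, Phi = make_kernels(D, t)
    p0 = Phi(a)
    p1 = -lam * a * Phi(a) + 2.0 * simpson(lambda x: phi(a - x) * Phi(x), 0.0, a)
    return p0, p1


print(variance_numeric(2.0), variance_closed(2.0))
print(count_probabilities(1.0))
```

For small $a$ the remainder $1 - P(N = 0) - P(N = 1)$ computed this way scales like $\lambda a^3/(12Dt)$, and for large $a$ the ratio $\mu_2(a)/(\lambda a)$ approaches $3 - 2\sqrt{2}$.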

We have already established that the point process $A$ is not a Poisson one. Could it be a Cox process, a "Poisson process with random intensity"? Let us show that the answer is still no. For the process $A$,
\[ P(N(0,a) > 1) = \frac{\lambda}{12Dt}\,a^3 + o(a^3), \tag{31} \]
and (see (27) and (29))
\[ P(N(0,a) > 2) = o(a^4). \tag{32} \]
However, for a Cox process (see [6, p. 48, Problem 1.33])
\[ P(N(0,a) = 2) = E\Big\{\frac{1}{2}\Big(\int_0^a \psi(x)\,dx\Big)^2 e^{-\int_0^a \psi(x)\,dx}\Big\}, \tag{33} \]
where $\psi(x)$ is a positive stationary process. Clearly the right-hand side of (33) behaves like $\tfrac{1}{2}a^2 E(\psi(x))^2$ as $a$ goes to zero, in contrast to (31) and (32).

Our next order of business is to give a formula for the joint density $p_{\xi_1,\ldots,\xi_n}(v_1, \ldots, v_n)$ of consecutive jumps $\xi_1, \ldots, \xi_n$ of $K(t,x)$. The point process defined by the jumps is equivalent to $A$ in distribution, see [1]. Thus, $p_{\xi_1,\ldots,\xi_n}(v_1, \ldots, v_n)$ is the Palm distribution density for the distances between $n+1$ consecutive points of $A$. The book [17] gives a general formula for the Palm density and additional details. We have
\[ p_{\xi_1,\ldots,\xi_n}(v_1, \ldots, v_n) = \frac{1}{\varphi_t(0)}\,\frac{\partial^2}{\partial v_1\,\partial v_n}\,S_n(v_1, \ldots, v_n). \tag{34} \]
The joint density of $\xi$'s of any order can be computed from (29) and (34),
\[ \begin{aligned} p_{\xi_1}(v_1) &= -\frac{\varphi'(v_1)}{\varphi(0)} = \frac{v_1}{2Dt}\,e^{-v_1^2/4Dt}, \\ p_{\xi_1,\xi_2}(v_1,v_2) &= \frac{1}{\varphi(0)}\big[\varphi(0)\varphi'(v_1+v_2) - \varphi'(v_1)\varphi(v_2) - \varphi(v_1)\varphi'(v_2)\big] \\ &= \frac{1}{\sqrt{\pi D t}}\,\frac{v_1+v_2}{2Dt}\,\big(e^{-(v_1^2+v_2^2)/4Dt} - e^{-(v_1+v_2)^2/4Dt}\big), \\ p_{\xi_1,\xi_2,\xi_3}(v_1,v_2,v_3) &= \frac{1}{\varphi(0)}\Big\{\varphi(v_3)\big[\varphi(0)\varphi(v_1+v_2) - \varphi(v_1)\varphi_t(v_2) - \varphi(v_1)\varphi(v_2)\big] \\ &\quad + \varphi'(v_3)\big[\varphi(0)\varphi(v_1+v_2) + \varphi'(v_1)\Phi(v_2) - \varphi(v_1)\varphi(v_2)\big] \\ &\quad + \varphi(v_2+v_3)\big[\varphi(0)\varphi'(v_1) - \varphi(v_2)\varphi'(v_1+v_2) + \varphi(v_1+v_2)\varphi'(v_2)\big] \\ &\quad + \varphi'(v_2+v_3)\big[\varphi(0)\varphi_t(v_1) - \Phi(v_2)\varphi'(v_1+v_2) - \varphi(v_1+v_2)\varphi(v_2)\big] \\ &\quad + \varphi'(v_1+v_2+v_3)\big[\varphi(v_2)^2 - \varphi(0)^2 + \Phi(v_2)\varphi'(v_2)\big]\Big\}, \end{aligned} \tag{35} \]
and so on. Note that the formula for the $n$th order density contains $(2n-1)!!$ summands. It follows from the second line that $A$ is not a renewal process.
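The claim that $A$ is not a renewal process can be checked numerically from the first two densities in (35): for a renewal process the joint density of two consecutive gaps would factor into the product of the one-gap densities. The sketch below is an illustration under assumptions: the function names are hypothetical, the quadrature is a plain Simpson rule on a truncated range, and the two-gap density is taken with the prefactor $\lambda = 1/\sqrt{\pi D t}$, which is the normalisation that makes it integrate to one and reproduce the one-gap density as its marginal.

```python
import math


def jump_density(v, D=1.0, t=1.0):
    """One-gap density: (v / 2Dt) exp(-v^2 / 4Dt), a Rayleigh law."""
    Dt = D * t
    return v / (2.0 * Dt) * math.exp(-v * v / (4.0 * Dt))


def joint_jump_density(v1, v2, D=1.0, t=1.0):
    """Two-gap density with the prefactor lambda (assumed normalisation)."""
    Dt = D * t
    lam = 1.0 / math.sqrt(math.pi * Dt)
    s = v1 + v2
    return lam * s / (2.0 * Dt) * (math.exp(-(v1 * v1 + v2 * v2) / (4.0 * Dt))
                                   - math.exp(-s * s / (4.0 * Dt)))


def simpson(f, a, b, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def marginal(v1, D=1.0, t=1.0, cutoff=40.0):
    """Integrate the joint density over the second gap (truncated range)."""
    upper = cutoff * math.sqrt(D * t)
    return simpson(lambda v2: joint_jump_density(v1, v2, D, t), 0.0, upper)


v = 1.0
print(marginal(v), jump_density(v))                       # should agree
print(joint_jump_density(v, v), jump_density(v) ** 2)     # should differ
```

The marginal agrees with the one-gap density, while the joint density at $(v, v)$ differs from the product of the marginals, so consecutive gaps are dependent.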


It will be shown below that $\{\xi_k\}$ and $\{a_k\}$ are not independent. Thus the full description of the processes $A$ and $\{\xi_k\}$ that we obtained does not give us a full description of the process $K(t,x)$ for a fixed $t$. That would require computing all moments $m_t(x_1, \ldots, x_n) = E\{K(t,x_1) \ldots K(t,x_n)\}$. A recurrent formula for the computation of moments can be given (much in the spirit of (12)–(14)). However, we have been unsuccessful so far in reducing the recurrent relations to an explicit formula. The covariance function, on the other hand, is easy,
\[ E\,(K(t,x_1) - x_1)(K(t,x_2) - x_2) = \frac{1}{2}(x_1 - x_2)^2 + Dt - \frac{1}{2}\rho(x_1 - x_2), \]
where
\[ \begin{aligned} \rho(x) = E\,(K(t,x) - K(t,0))^2 &= \frac{1}{2}\int_0^\infty (\varphi(x-y) - \varphi(x+y))\,y^2\,dy \\ &= \frac{1}{2\sqrt{\pi D t}}\int_0^\infty \big(e^{-(x-y)^2/4Dt} - e^{-(x+y)^2/4Dt}\big)\,y^2\,dy. \end{aligned} \tag{36} \]
The formula (36) implies dependence between the sequences $\{\xi_n\}$ and $\{a_n\}$, as we proceed to show. Indeed, let us assume the opposite. Then
\[ \rho(x) = p_1(x)\,E\xi_1^2 + p_2(x)\,E(\xi_1 + \xi_2)^2 + \cdots, \tag{37} \]
where $p_n(x) = P(N(0,x) = n)$. From (36) we have
\[ \rho(x) = 4\lambda D t\,x + \frac{1}{3}\lambda x^3 + o(x^3), \]
while from (30) and (31) we get that the right-hand side of (37) is
\[ 4\lambda D t\,x + \frac{1}{2}\lambda x^3 + o(x^3), \]
in contradiction with the independence assumption.

4. Limiting Tracer Distribution for Arbitrary Initial Condition

We go back to the transport equation
\[ \frac{\partial c_\varepsilon}{\partial t} + u_\varepsilon(t,x)\,\frac{\partial c_\varepsilon}{\partial x} = 0, \tag{38} \]
and consider it with an arbitrary (continuous) initial condition
\[ c_\varepsilon(0,x) = c_0(x). \tag{39} \]


The solution to (38)–(39) can be represented as $c_\varepsilon(t,x) = c_0\big(X_\varepsilon^{-1}(t,x)\big)$, where $X_\varepsilon^{-1}(t,x)$ is the inverse (with respect to $x$) function to the characteristic $X_\varepsilon(t,x)$. The characteristics for (38) are defined in (5), (6). It is known (see [5]) that $X_\varepsilon^{-1}(\cdot,\cdot) = X_\varepsilon(\cdot,\cdot)$ in distribution as flows. Therefore, if we are only interested in distributional properties of $c_\varepsilon(\cdot,\cdot)$, we can use the representation
\[ c_\varepsilon(t,x) = c_0(X_\varepsilon(t,x)) \tag{40} \]
(now understood in the sense of distribution). Let
\[ g_\varepsilon(t,x) = \frac{\partial c_\varepsilon(t,x)}{\partial x} \]
be the tracer gradient. Having established the convergence of $X_\varepsilon(t,x)$ to $K(t,x)$ in Sect. 2, we should expect that $g_\varepsilon(t,x)$ goes to
\[ \frac{\partial}{\partial x}\,c_0(K(t,x)) = c_0'(K(t,x))\,\frac{\partial K(t,x)}{\partial x} \]
(in some sense). In fact, the following convergence holds true.

Theorem 4.1. For any continuous initial condition $c_0(x)$,
\[ \lim_{\varepsilon \to 0} g_\varepsilon(t,x) = \sum_k \zeta_k(t)\,\delta(x - a_k(t)) \tag{41} \]
in the sense of Proposition 2.2, where $A = \{a_k(t)\}$ is a random set not depending on $c_0(\cdot)$, and $\zeta_k = c_0(K(t,a_k)) - c_0(K(t,a_k - 0))$. If $c_0(x)$ is a random function with homogeneous increments and is independent of $u(t,x)$, then the characteristic function of the jumps can be expressed as
\[ E\,e^{i(\lambda_1\zeta_1 + \cdots + \lambda_n\zeta_n)} = \int p_{\xi_1,\ldots,\xi_n}(v_1, \ldots, v_n)\,\chi(\lambda_1, \ldots, \lambda_n; v_1, \ldots, v_n)\,dv_1 \ldots dv_n, \tag{42} \]
where $p_{\xi_1,\ldots,\xi_n}(v_1, \ldots, v_n)$ is given in (34), (35) and
\[ \chi(\lambda_1, \ldots, \lambda_n; v_1, \ldots, v_n) = E \exp\big\{i\lambda_1(c_0(v_1) - c_0(0)) + i\lambda_2(c_0(v_1+v_2) - c_0(v_1)) + \cdots + i\lambda_n(c_0(v_1 + \cdots + v_n) - c_0(v_1 + \cdots + v_{n-1}))\big\}. \]


Convergence (41) follows from the continuity of $c_0(\cdot)$, the solution representation (40) and Proposition 2.2. We point out a simple but remarkable fact: for different initial conditions the set of discontinuity points $A$ remains the same, because it comes from the corresponding coalescing flow. Formula (42) follows from the independence of $c_0(x)$ and $\xi_1, \ldots, \xi_n$ and the fact that the sample $\{\zeta_1, \ldots, \zeta_n\}$ given $\xi_1 = v_1, \ldots, \xi_n = v_n$ has the same distribution as $\{c_0(v_1) - c_0(0), \ldots, c_0(v_1 + \cdots + v_n) - c_0(v_1 + \cdots + v_{n-1})\}$. In particular, for the first-order characteristic function we have
\[ E\,e^{i\lambda\zeta} = \frac{1}{2Dt}\int_0^\infty v\,e^{-v^2/4Dt}\,E\,e^{i\lambda(c_0(v) - c_0(0))}\,dv. \]
Furthermore, in the case of a homogeneous initial condition $c_0(x)$ with zero mean, the first- and second-order moments are
\[ E\zeta_k = 0, \qquad E\zeta_k\zeta_{k+1} = \frac{1}{2Dt}\int_0^\infty R_0(x)\,x\,\exp\Big(-\frac{x^2}{4Dt}\Big)\,dx, \]
where $R_0(y) = E\{c_0(x)\,c_0(x+y)\}$. For such an initial condition the limiting correlation function of the tracer
\[ R(t,x) = \lim_{\varepsilon \to 0} E\,c_\varepsilon(t,y)\,c_\varepsilon(t,y+x) \]
can be explicitly computed for any time $t$. Indeed, it follows from Theorem 4.1 that
\[ R(t,x) = E\{c_0(K(t,y))\,c_0(K(t,y+x))\} = \int_{-\infty}^{\infty} R_0(u)\,q(u; t, x)\,du, \]
where $q(\cdot; t, x)$ is the density of the difference $K(t,x) - K(t,0)$. The distribution of this difference coincides with the distribution of $w_{0,x}(t)$, a Brownian motion started at $x$ with absorption at the origin and rate $2D$. Thus for $x > 0$,
\[ R(t,x) = R_0(0)\,\Phi_D(t,x) + \frac{1}{2}\int_0^\infty R_0(y)\,\big(\varphi_D(t,x-y) - \varphi_D(t,x+y)\big)\,dy, \tag{43} \]
where $\Phi$ and $\varphi$ are defined in (10) and (16) respectively.

Let us use Theorem 4.1 for computing the limiting flatness (kurtosis) of the tracer spatial increments. Define
\[ \sigma_\varepsilon(t,x) = \frac{E\,(c_\varepsilon(t,x) - c_\varepsilon(t,0))^4}{\big(E\,(c_\varepsilon(t,x) - c_\varepsilon(t,0))^2\big)^2}. \]
By Theorem 4.1,
\[ \sigma(t,x) = \lim_{\varepsilon \to 0} \sigma_\varepsilon(t,x) = \frac{E\{[c_0(K(t,x)) - c_0(K(t,0))]^4\}}{\big(E\{[c_0(K(t,x)) - c_0(K(t,0))]^2\}\big)^2}. \]
Assume that the initial condition is a Gaussian homogeneous function with zero mean, independent of the velocity field. Then $\sigma(0,x) = 3$, and
\[ \sigma(t,x) = \frac{3\,E\{F^2(K(t,x) - K(t,0))\}}{\big(E\{F(K(t,x) - K(t,0))\}\big)^2}, \]
where $F(y) = R_0(0) - R_0(y)$. Using the fact that $F(0) = 0$ and applying Jensen's inequality,
\[ \sigma(t,x) = 3\,\frac{E F^2(w_{0,x}(t))}{\big(E F(w_{0,x}(t))\big)^2} = 3\,\frac{E\{F^2(w_{0,x}(t)) \mid w_{0,x}(t) \ne 0\}\,P\{w_{0,x}(t) \ne 0\}}{\big(E\{F(w_{0,x}(t)) \mid w_{0,x}(t) \ne 0\}\,P\{w_{0,x}(t) \ne 0\}\big)^2} \ge \frac{3}{P\{w_{0,x}(t) \ne 0\}} = \frac{3}{1 - \Phi(x)} = O(\sqrt{Dt}). \]
So it is not only greater than 3 for any $t > 0$, but it also goes to infinity as $t \to \infty$, indicating very heavy tails of the distribution for large $t$. Note that this asymptotic behavior does not depend on the initial condition.
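The flatness bound can be illustrated numerically. The sketch below is only an example under assumptions not made in the text: it takes the Gaussian-shaped initial correlation $R_0(y) = e^{-y^2/2}$, uses the density $q(y) = \tfrac{1}{2}(\varphi(x-y) - \varphi(x+y))$ of the absorbed Brownian motion on $y > 0$ (the atom at the origin does not contribute because $F(0) = 0$), truncates the integrals at a finite upper limit, and uses hypothetical function names.

```python
import math


def simpson(f, a, b, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def flatness(x, D=1.0, t=1.0, R0=lambda y: math.exp(-0.5 * y * y)):
    """sigma(t, x) = 3 E F^2(w) / (E F(w))^2 for the assumed R0."""
    Dt = D * t
    lam = 1.0 / math.sqrt(math.pi * Dt)
    phi = lambda y: lam * math.exp(-y * y / (4.0 * Dt))
    q = lambda y: 0.5 * (phi(x - y) - phi(x + y))   # density of w on y > 0
    F = lambda y: R0(0.0) - R0(y)
    upper = x + 12.0 * math.sqrt(2.0 * Dt)          # truncation of the tail
    m1 = simpson(lambda y: F(y) * q(y), 0.0, upper)
    m2 = simpson(lambda y: F(y) ** 2 * q(y), 0.0, upper)
    return 3.0 * m2 / (m1 * m1)


def flatness_lower_bound(x, D=1.0, t=1.0):
    """3 / (1 - Phi(x)) = 3 / erf(x / (2 sqrt(D t)))."""
    return 3.0 / math.erf(x / (2.0 * math.sqrt(D * t)))


for time in (1.0, 4.0, 25.0):
    print(time, flatness(1.0, t=time), flatness_lower_bound(1.0, t=time))
```

The computed flatness stays above the bound $3/(1 - \Phi(x))$ and grows with $t$, in line with the $O(\sqrt{Dt})$ estimate.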

5. Discussion We realize that our example is of limited physical significance. This is so because, first of all, any nontrivial flow in one dimension (d = 1) is compressible. In incompressible isotropic Brownian flows of higher dimensions (d > 1) the distance between two Lagrangian particles goes to infinity as time does so ([2, 8]). This would preclude coalescing in such flows. The same is true for a wide class of two dimensional anisotropic flows, [12]. However, for another class (that in particular includes isotropic potential flows in two dimensions) that distance typically goes to zero, see [12], and hence the discovered effect can still work for those flows. Another important question is about the role of molecular diffusivity (viscosity). Let us show that adding however a small amount of viscosity changes the picture drastically. The qualitative difference is seen in the change in the behavior of the tracer correlation function. We study the equation with a non-zero viscosity coefficient, ∂cε ∂ 2 cε ∂cε + uε (t, x) =κ 2, ∂t ∂x ∂x cε (0, x) = c0 (x), where as before the initial condition is a zero mean homogeneous Gaussian random function. The equation for the correlation function is easy to write (e.g. [13, p.55, Eq. (3.54)]), x ∂ 2 Rε ∂Rε = (2κ + B(0) − B( )) , ∂t ε ∂x2 Rε (0, x) = R0 (x).


For any κ > 0, the diffusion coefficient is non-degenerate for all x. A straightforward limiting procedure (ε → 0) yields
$$R(t,x) = \int_{-\infty}^{\infty} R_0(y)\,\frac{1}{\sqrt{4\pi(D+2\kappa)t}}\,e^{-(x-y)^2/4(D+2\kappa)t}\,dy = \frac{1}{2}\int_0^{\infty} R_0(y)\bigl(\varphi_{2\kappa+D}(t,x-y) - \varphi_{2\kappa+D}(t,x+y)\bigr)\,dy.$$
The “only” difference between (43) and this expression is the extra term in the former. However, this term is really important, because it makes the correlation function nondifferentiable at x = 0, while in the non-zero viscosity case the correlation function is smooth even in the limit κ → 0. Thus, the two limiting procedures do not commute. An interesting problem arises about the behavior of the correlation function when both κ and ε are small or, in other words, for large Peclet numbers and small space correlation scales. In particular, for which scalings $\kappa \sim \varepsilon^{\gamma}$ does the limiting correlation function remain non-differentiable at zero?

Acknowledgement. We thank the anonymous reviewers for comments which significantly improved the manuscript.

References
1. Arratia, R.: Coalescing Brownian Motions on the Line. Ph.D. Thesis, University of Wisconsin–Madison, 1979
2. Baxendale, P. and Harris, T.: Isotropic stochastic flows. Ann. Prob. 14, No. 4, 1155–1179 (1986)
3. Eckart, C.: An analysis of the stirring and mixing processes in incompressible fluids. J. Mar. Res. 262–275 (1948)
4. Friedman, A.: Partial Differential Equations of Parabolic Type. Englewood Cliffs, NJ: Prentice Hall, 1964
5. Harris, T.: Coalescing and non-coalescing stochastic flows in R1. Stochastic Process. Appl. 17, 188–210 (1984)
6. Karr, A.F.: Point Processes and their Statistical Inference. New York: Marcel Dekker Inc, 1986
7. Kuznecov, P.L. and Stratonovich, R.L.: On the mathematical theory of correlated random points. Selected Translations in Mathematical Statistics and Probability, v. 7, 1–16 (1968)
8. Le Jan, Y.: On isotropic Brownian motions. Z. Wahrsch. verw. Gebiete 70, 609–620 (1985)
9. Molchanov, S.A. and Piterbarg, L.I.: Turbulent diffusion of the tracer gradient. Dokl. AN SSSR 293, n. 5, 1092–1096 (1987)
10. Molchanov, S.A. and Piterbarg, L.I.: Heat propagation in random flows. Rus. J. Math. Phys. 1, 353–376 (1994)
11. Ostrovskii, A.G.: Signatures of stirring and mixing in the Japan Sea surface temperature patterns in Autumn 1993 and Spring 1994. Geophys. Res. Lett. 22, n. 17, 2357–2360 (1995)
12. Piterbarg, L.I.: Drift estimation for Brownian flows. Stochastic Process. Appl. 79, 132–149 (1998)
13. Piterbarg, L.I. and Ostrovskii, A.G.: Advection and Diffusion in Random Flows: Implications for Sea Surface Temperature Anomalies. Dordrecht: Kluwer, 1997
14. Piterbarg, V.I.: Asymptotic Methods in the Theory of Gaussian Processes and Fields. Translations of Mathematical Monographs, Vol. 148, Providence, RI: American Mathematical Society, 1996
15. Piterbarg, V.V.: Expansions and contractions of isotropic stochastic flows of homeomorphisms. Ann. Probab. 43, n. 2 (1998)
16. Piterbarg, V.V.: Expansions and Contractions of Stochastic Flows. Ph.D. thesis, University of Southern California, 1997
17. Srinivasan, S.K.: Stochastic Point Processes and their Applications. New York: Hafner Press, 1974

Communicated by Ya. G. Sinai

Commun. Math. Phys. 202, 255 – 265 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Mourre’s Method and Smoothing Properties of Dispersive Equations

Toshihiko Hoshiro*

Department of Mathematics, Himeji Institute of Technology, Shosha 2167, Himeji, Hyogo 671-2201, Japan. E-mail: [email protected]

Received: 30 June 1998 / Accepted: 2 October 1998

In memory of Professor Nobuhisa Iwasaki

Abstract: The purpose of this paper is to point out that Mourre’s method in spectral theory is useful and powerful in studying the smoothing effect of dispersive equations. In particular, we will see that the smoothing effect is a quite general phenomenon, even for operators with variable coefficients.

1. Introduction

In recent years, there have been a number of papers concerning smoothness properties of dispersive equations of the form
$$\begin{cases} i\,\dfrac{\partial u}{\partial t} + P(D)u = f, \qquad D = i^{-1}\Bigl(\dfrac{\partial}{\partial x_1},\dots,\dfrac{\partial}{\partial x_n}\Bigr),\\[1ex] u(0,x) = u_0(x), \qquad x\in\mathbb{R}^n, \end{cases} \tag{1.1}$$
where $P(\xi)$ is a real-valued symbol with suitable properties. Among others, Constantin and Saut [3] proved that, in the case where $P(\xi)$ behaves like $|\xi|^m$ as $|\xi|\to\infty$ and $f\equiv 0$, the solution $u(t,x)$ satisfies $|D|^{(m-1)/2}u \in L^2_{\mathrm{loc}}(\mathbb{R}^{n+1})$ if $u_0\in L^2(\mathbb{R}^n)$. In this paper, we shall apply a method initiated by Mourre [13] to give an approach to global smoothing effects of the initial value problem (1.1).

Here we would like to explain Mourre’s commutator method briefly. First of all, it was invented to prove spectral properties for some class of Hamiltonians $H = -\Delta + V$. In [13], Mourre proved that, if there exists a selfadjoint operator $A$ such that $[H, iA]$ is positive at $\lambda\in\mathbb{R}_+$ in some sense, then the value
$$\|(|A|+1)^{-1}(H-z)^{-1}(|A|+1)^{-1}\|_{\mathcal{L}(L^2)} \tag{1.2}$$

* Supported in part by Grant-in-Aid for Scientific Research (No. 40211544), Ministry of Education and Culture, Japan


remains bounded even if $z$ ($z\in\mathbb{C}$, $\operatorname{Im} z>0$) approaches $\lambda\in\mathbb{R}_+$. See Chapter 4 of Simon’s book [5] for the precise statement and developments after the epoch-making work [13].

Now let us observe that the value similar to (1.2) without the terms $(|A|+1)^{-1}$ cannot remain bounded as $z\to\lambda\in\mathbb{R}_+$ even in the case $V\equiv 0$. This is because the symbol $(|\xi|^2 - z)^{-1}$ is not bounded as $z\to\lambda\in\mathbb{R}_+$. On the other hand, let us notice that, roughly speaking, the symbol of the fundamental solution of (1.1) with $u_0(x)\equiv 0$ is $(-\tau + P(\xi))^{-1}$. It is singular on the hypersurface $\tau = P(\xi)$. So we cannot expect that $f\in L^2(\mathbb{R}^{n+1})$ implies $u\in L^2(\mathbb{R}^{n+1})$, although the solution $u(t,x)$ may be a bit smoother than $f(t,x)$. Thus we vaguely see that there exists some relation between Mourre’s method and the works on the initial value problem (1.1). In fact, both of them deal with some estimates for operators with non-regular symbols. Moreover there is a relation between the smoothing effects and the commutation properties. Concerning this, see Kato [9] or Doi [6].

In what follows we assume that

(A.1) $P(\xi)$ is real-valued and of class $C^1(\mathbb{R}^n)$.

(A.2) $P(\xi)$ is positively homogeneous of order $m$ ($m\in\mathbb{R}_+$), namely
$$P(\lambda\xi) = \lambda^m P(\xi), \qquad \forall\lambda>0,\ \forall\xi = (\xi_1,\dots,\xi_n)\in\mathbb{R}^n.$$

Also we introduce the notations as follows:
$$|P(D)|^{\alpha} = \mathcal{F}^{-1}|P(\xi)|^{\alpha}\mathcal{F} \quad (\mathcal{F}: \text{the Fourier transform w.r.t. } x = (x_1,\dots,x_n)),$$
$$A = \frac{1}{2i}\Bigl(\frac{\partial}{\partial x}\cdot x + x\cdot\frac{\partial}{\partial x}\Bigr), \qquad (|A|+1)^{\alpha} = \int (|\lambda|+1)^{\alpha}\,dE_\lambda,$$
where $E_\lambda$ ($-\infty<\lambda<\infty$) is the spectral resolution of the operator $A$. Our first result is the following:

Theorem 1.1. Let $u(t,x)$ be a solution of (1.1) with $u_0(x)\equiv 0$ and $f(t,x)\equiv 0$ for $t<0$. Then for $\alpha>1/2$, there exists a positive constant $C = C_\alpha$ such that
$$\|(|A|+1)^{-\alpha}|P(D)|u\|_{L^2(\mathbb{R}^{n+1})} \le C\,\|(|A|+1)^{\alpha}f\|_{L^2(\mathbb{R}^{n+1})}. \tag{1.3}$$

Remark. In the left-hand side of (1.3), $|P(D)|$ can be replaced by $P(D)$.

A result on the homogeneous initial value problem comes from the inhomogeneous one. The following theorem is proved in Sect. 3.

Theorem 1.2. Let $u(t,x)$ be a solution of (1.1) with $f(t,x)\equiv 0$. Then for $\alpha>1/2$ there exists a positive constant $C = C'_\alpha$ such that
$$\|(|A|+1)^{-\alpha}|P(D)|^{1/2}u\|_{L^2(\mathbb{R}^{n+1})} \le C\,\|u_0\|_{L^2(\mathbb{R}^n)}. \tag{1.4}$$

It seems that the presence of the fractional powers $(|A|+1)^{\pm\alpha}$ in the above theorems makes the results unclear concerning the smoothing properties of the equation. However these theorems give the following:


Corollary 1.3. (i) Let $u(t,x)$ be a solution of (1.1) with $u_0(x)\equiv 0$ and $f(t,x)\equiv 0$ for $t<0$. Then for $\alpha>1/2$,
$$\|\langle x\rangle^{-\alpha}\langle D\rangle^{-2\alpha}|P(D)|u\|_{L^2(\mathbb{R}^{n+1})} \le C\,\|\langle x\rangle^{\alpha}f\|_{L^2(\mathbb{R}^{n+1})}.$$
(Here $\langle x\rangle = (1+|x|^2)^{1/2}$, and $\langle D\rangle = (1-\Delta)^{1/2}$.)

(ii) Let $u(t,x)$ be a solution of (1.1) with $f(t,x)\equiv 0$. Then for $\alpha>1/2$,
$$\|\langle x\rangle^{-\alpha}\langle D\rangle^{-\alpha}|P(D)|^{1/2}u\|_{L^2(\mathbb{R}^{n+1})} \le C\,\|u_0\|_{L^2(\mathbb{R}^n)}.$$

From Corollary 1.3 we see that the solution $u(t,x)$ has a smoothness property, the degree of which is higher than that of $f(t,x)$ by $m-2\alpha$ and that of $u_0(x)$ by $\frac{m}{2}-\alpha$. Results of such a nature have been obtained by several authors (cf. [1, 3] or [11] for the homogeneous initial value problem, and [7] or [12] for the inhomogeneous one). However let us note that $P(\xi)$ is not assumed to be a polynomial nor of principal type. Moreover the assumption that $P(\xi)$ is homogeneous (i.e. (A.2)) can be replaced by another one. We devote Sect. 4 to such generalizations of the above results.

This paper is organized as follows: Sect. 2 is devoted to proving Theorem 1.1. We apply Mourre’s method to show the smoothness property of the solution. Section 3 shows that Theorem 1.2 and Corollary 1.3 follow from Theorem 1.1. Finally in Sect. 4, we extend the arguments in Sect. 2 and give some additional results.

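2. Mourre’s Commutator Method

Given a function $f(t,x)$ in $\mathbb{R}^{n+1}$ we denote by $\tilde f(\tau,\xi)$ its Fourier transform with respect to $t$ and $x$. Also define $u^{+}_{\varepsilon,\varepsilon'}(t,x)$ and $u^{-}_{\varepsilon,\varepsilon'}(t,x)$ ($\varepsilon,\varepsilon'>0$) respectively by
$$\tilde u^{\pm}_{\varepsilon,\varepsilon'}(\tau,\xi) = \frac{\tilde f(\tau,\xi)}{-\tau + P(\xi) \pm i\{\varepsilon' + \varepsilon|P(\xi)|\}}.$$
Clearly $f\in L^2(\mathbb{R}^{n+1})$ implies $u^{\pm}_{\varepsilon,\varepsilon'}\in L^2(\mathbb{R}^{n+1})$ for fixed $\varepsilon$ and $\varepsilon'>0$. Our main task in this section is to prove the following inequality concerning the maps $f\mapsto u^{\pm}_{\varepsilon,\varepsilon'}$: there exists a positive constant $C$ independent of $\varepsilon$ and $\varepsilon'>0$ such that
$$\|(|A|+1)^{-\alpha}|P(D)|u^{\pm}_{\varepsilon,\varepsilon'}\| \le C\,\|(|A|+1)^{\alpha}f\|, \tag{2.1}$$
where $\|\cdot\| = \|\cdot\|_{L^2(\mathbb{R}^{n+1})}$ and $\alpha>1/2$.

We shall start this section by deducing Theorem 1.1 from inequality (2.1). First by (2.1), the limits
$$u^{\pm} = \lim_{\varepsilon\to 0,\ \varepsilon'\to 0} u^{\pm}_{\varepsilon,\varepsilon'}$$
exist (in the distribution sense) and satisfy inequality (2.1), where $u^{\pm}_{\varepsilon,\varepsilon'}$ is replaced by $u^{\pm}$. Also it is easy to see that both $u^{+}$ and $u^{-}$ are solutions of the equation $i\partial_t u + P(D)u = f$, with the following initial values: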


$$u^{+}(0,x) = -i\int_{-\infty}^{0} e^{-isP(D)}f(s,x)\,ds, \tag{2.2}$$
$$u^{-}(0,x) = i\int_{0}^{\infty} e^{-isP(D)}f(s,x)\,ds. \tag{2.3}$$

Hence if $f(t,x)\equiv 0$ for $t<0$, $u^{+}$ is a solution of (1.1) with $u_0(x)\equiv 0$. This shows that inequality (2.1) induces Theorem 1.1.

Proof of (2.1). Here we prove (2.1) in the case where $\alpha = 1$ and $u = u^{+}_{\varepsilon,\varepsilon'}$. The proof for $u^{-}_{\varepsilon,\varepsilon'}$ is similar and the general case $\alpha>1/2$ will be shown at the end of the present section. First denoting $U_+$ and $U_-$ respectively by
$$U_+ = \{\xi\in\mathbb{R}^n \mid P(\xi)\ge 0\}, \qquad U_- = \{\xi\in\mathbb{R}^n \mid P(\xi)<0\},$$
we define $G_1$ and $G_2$ by operators with the following symbols:
$$\sigma(G_1) = \begin{cases} (-\tau + P(\xi) + i\{\varepsilon' + \varepsilon P(\xi)\})^{-1}, & \tau\in\mathbb{R},\ \xi\in U_+,\\ 0, & \tau\in\mathbb{R},\ \xi\in U_-,\end{cases}$$
$$\sigma(G_2) = \begin{cases} 0, & \tau\in\mathbb{R},\ \xi\in U_+,\\ (-\tau + P(\xi) + i\{\varepsilon' - \varepsilon P(\xi)\})^{-1}, & \tau\in\mathbb{R},\ \xi\in U_-.\end{cases}$$
To derive (2.1) for $\alpha = 1$ and $u = u^{+}_{\varepsilon,\varepsilon'}$, it is enough to show that, for $G = G_1$ and $G = G_2$, the value
$$\|(|A|+1)^{-1}|P(D)|G(|A|+1)^{-1}\|_{\mathcal{L}(L^2(\mathbb{R}^{n+1}))}$$
is bounded by a constant independent of $\varepsilon$ and $\varepsilon'>0$. Here we prove it only for the case $G = G_1$. The proof for the case $G = G_2$ will be quite similar. Set
$$F = (|A|+1)^{-1}PG_1(|A|+1)^{-1} \qquad (P = P(D)).$$
Then
$$\frac{d}{d\varepsilon}F = -i(|A|+1)^{-1}G_1P^2G_1(|A|+1)^{-1}.$$
Observe that the assumptions (A.1) and (A.2) imply Euler’s identity $[P, iA] = mP$, so that
$$[i\partial_t + P + i\{\varepsilon' + \varepsilon P\}, A] = (1+i\varepsilon)[P,A] = -i\,m\,(1+i\varepsilon)P. \tag{2.4}$$
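The step from homogeneity to the commutator relation can be written out on the Fourier side. The following is a sketch under an assumed convention, $\hat u(\xi) = \int e^{-ix\cdot\xi}u(x)\,dx$ (the text does not state one), so that $\partial/\partial x_j$ acts as multiplication by $i\xi_j$ and $x_j$ acts as $i\,\partial/\partial\xi_j$; $g$ denotes an arbitrary test function of $\xi$.

```latex
% Assumed convention: \hat u(\xi) = \int e^{-i x\cdot\xi} u(x)\,dx.
\begin{align*}
iA &= \tfrac12\Bigl(\tfrac{\partial}{\partial x}\cdot x + x\cdot\tfrac{\partial}{\partial x}\Bigr)
    = x\cdot\tfrac{\partial}{\partial x} + \tfrac n2
    \;\longmapsto\; -\,\xi\cdot\nabla_\xi - \tfrac n2 , \\
[P, iA]\,g &= -P(\xi)\,\xi\cdot\nabla_\xi g + \xi\cdot\nabla_\xi\bigl(P(\xi)\,g\bigr)
    = \bigl(\xi\cdot\nabla_\xi P(\xi)\bigr)\,g = m\,P(\xi)\,g , \\
[P, A] &= -i\,[P, iA] = -i\,m\,P .
\end{align*}
% The equality \xi\cdot\nabla_\xi P = mP is Euler's relation for a function
% positively homogeneous of order m, which is exactly assumption (A.2).
```

Since $i\partial_t$ and the constant $\varepsilon'$ commute with $A$, only the terms $P$ and $i\varepsilon P$ contribute to the commutator in (2.4), which gives the factor $(1+i\varepsilon)$.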


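Hence
$$\begin{aligned}
\frac{d}{d\varepsilon}F &= \frac{1}{i\,m(1+i\varepsilon)}\,(|A|+1)^{-1}PG_1\,[i\partial_t + P + i\{\varepsilon' + \varepsilon P\}, A]\,G_1(|A|+1)^{-1} \\
&= \frac{1}{i\,m(1+i\varepsilon)}\,(|A|+1)^{-1}P[A,G_1](|A|+1)^{-1} \\
&= \frac{1}{i\,m(1+i\varepsilon)}\,(|A|+1)^{-1}\{APG_1 - PG_1A + [P,A]G_1\}(|A|+1)^{-1} \\
&= \frac{1}{i\,m(1+i\varepsilon)}\,\{(|A|+1)^{-1}APG_1(|A|+1)^{-1} - (|A|+1)^{-1}PG_1A(|A|+1)^{-1} - i\,mF\}.
\end{aligned} \tag{2.5}$$
This expression induces a useful inequality as follows:
$$\Bigl\|\frac{d}{d\varepsilon}F\Bigr\| \le C\{\|PG_1(|A|+1)^{-1}\| + \|(|A|+1)^{-1}PG_1\| + \|F\|\}, \tag{2.6}$$
where $\|\cdot\| = \|\cdot\|_{\mathcal{L}(L^2(\mathbb{R}^{n+1}))}$.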

Remark. Note that (2.5) and (2.6) follow from the relation (2.4) alone. This suggests that the above argument is applicable to equations with variable coefficients. See Sect. 4.

Now we state a lemma, which plays an important role in Mourre’s procedure.

Lemma 2.1. (i)
$$\|F\| \le \varepsilon^{-1}, \tag{2.7}$$
(ii)
$$\|PG_1(|A|+1)^{-1}\| \le \varepsilon^{-1/2}\|F\|^{1/2}, \tag{2.8}$$
$$\|(|A|+1)^{-1}PG_1\| \le \varepsilon^{-1/2}\|F\|^{1/2}. \tag{2.9}$$

Proof. (i) The inequality (2.7) immediately comes from the bound $\|(|A|+1)^{-1}\|\le 1$ and
$$\Bigl|\frac{P(\xi)}{-\tau + P(\xi) + i\{\varepsilon' + \varepsilon P(\xi)\}}\Bigr| \le \frac{1}{\varepsilon}, \qquad \tau\in\mathbb{R},\ \xi\in U_+.$$
(ii) Observe that, for $f\in C_0^\infty(\mathbb{R}^{n+1})$,
$$\|PG_1(|A|+1)^{-1}f\|^2 = \bigl((|A|+1)^{-1}G_1^{*}P^2G_1(|A|+1)^{-1}f, f\bigr)_{L^2(\mathbb{R}^{n+1})}$$
and $G_1^{*} - G_1 = 2iG_1^{*}(\varepsilon' + \varepsilon P)G_1$. Hence
$$\|PG_1(|A|+1)^{-1}f\|^2 \le \frac{1}{2i\varepsilon}\bigl((|A|+1)^{-1}P(G_1^{*} - G_1)(|A|+1)^{-1}f, f\bigr)_{L^2(\mathbb{R}^{n+1})} \le \frac{1}{\varepsilon}\,\|(|A|+1)^{-1}PG_1(|A|+1)^{-1}f\|\,\|f\|,$$
which proves the estimate (2.8). The estimate (2.9) immediately follows from (2.8), by taking the adjoint of the left-hand side.


Conclusion of the proof of (2.1) for α = 1. As indicated, all we need to prove is that the inequality $\|F\|\le C$ holds with a constant $C$ independent of $\varepsilon$ and $\varepsilon'$. We begin with the bound (2.7). Putting this into (2.6) and applying (2.8), (2.9), we find
$$\Bigl\|\frac{dF}{d\varepsilon}\Bigr\| \le C\cdot\varepsilon^{-1},$$
so that, by integrating w.r.t. $\varepsilon$, we obtain $\|F\|\le C|\log\varepsilon|$. Putting this into (2.6) again and integrating, we obtain the required inequality.
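The two integrations can be made explicit. The sketch below assumes $0<\varepsilon\le 1$ and integrates from $\varepsilon$ up to $1$, where (2.7) gives $\|F(1)\|\le 1$; the constant $C$ is the one in (2.6), and the particular numerical factors are only illustrative.

```latex
% First pass: insert (2.7) into (2.8), (2.9), then into (2.6).
\begin{align*}
\|PG_1(|A|+1)^{-1}\|,\ \|(|A|+1)^{-1}PG_1\|
   &\le \varepsilon^{-1/2}\,\bigl(\varepsilon^{-1}\bigr)^{1/2} = \varepsilon^{-1},
 & \Bigl\|\tfrac{dF}{d\varepsilon}\Bigr\| &\le 3C\,\varepsilon^{-1}, \\
\|F(\varepsilon)\| &\le \|F(1)\| + \int_\varepsilon^1 3C\,s^{-1}\,ds
   \le 1 + 3C\,|\log\varepsilon| .
\end{align*}
% Second pass: insert the logarithmic bound into (2.8), (2.9), (2.6).
\begin{align*}
\Bigl\|\tfrac{dF}{d\varepsilon}\Bigr\|
   &\le C\Bigl\{2\,\varepsilon^{-1/2}\bigl(1+3C|\log\varepsilon|\bigr)^{1/2}
        + 1 + 3C|\log\varepsilon|\Bigr\}, \\
\|F(\varepsilon)\| &\le 1 + C\int_0^1
   \Bigl\{2\,s^{-1/2}\bigl(1+3C|\log s|\bigr)^{1/2} + 1 + 3C|\log s|\Bigr\}\,ds < \infty .
\end{align*}
% Both s^{-1/2}|\log s|^{1/2} and |\log s| are integrable at s = 0,
% so the final bound is uniform in \varepsilon and \varepsilon'.
```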

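Proof of (2.1) for 1/2 < α ≤ 1. We follow here Mourre’s argument (see [14] and [15]) briefly. Let $D = (|A|+1)^{-\alpha}(\eta|A|+1)^{\alpha-1}$ ($0<\eta\le 1$) and denote by $\tilde G_1$ the operator with symbol
$$\sigma(\tilde G_1) = \begin{cases} (-\tau + P(\xi) + i\{\varepsilon' + (\varepsilon+\eta)P(\xi)\})^{-1}, & \tau\in\mathbb{R},\ \xi\in U_+,\\ 0, & \tau\in\mathbb{R},\ \xi\in U_-.\end{cases}$$
Replace $F$ by
$$\tilde F = DP\tilde G_1D.$$
Observe that
$$\frac{d}{d\eta}\tilde F = \frac{dD}{d\eta}P\tilde G_1D + DP\frac{d\tilde G_1}{d\eta}D + DP\tilde G_1\frac{dD}{d\eta},$$
$\|D(|A|+1)\|\le\eta^{\alpha-1}$ and
$$\Bigl\|\frac{dD}{d\eta}\Bigr\| = (1-\alpha)\,\|(|A|+1)^{-\alpha}|A|(\eta|A|+1)^{\alpha-2}\| \le (1-\alpha)\,\eta^{\alpha-1}.$$
Hence by the argument to prove (2.8) and (2.9), we obtain
$$\begin{aligned}
\Bigl\|\frac{d\tilde F}{d\eta}\Bigr\| &\le 2(1-\alpha)\,\eta^{\alpha-1}\|P\tilde G_1D\| + \|D\tilde G_1P^2\tilde G_1D\| \\
&\le 2(1-\alpha)\,\eta^{\alpha-1}\cdot\eta^{-1/2}\|\tilde F\|^{1/2} + C\cdot\eta^{\alpha-1}\bigl(\|\tilde F\| + \eta^{-1/2}\|\tilde F\|^{1/2}\bigr) \\
&\le C'\eta^{\alpha-1}\bigl(\|\tilde F\| + \eta^{-1/2}\|\tilde F\|^{1/2}\bigr).
\end{aligned}$$
This implies that, if $\|\tilde F\|\le C_1\eta^{-\gamma}$ for some $0<\gamma\le 1$, then
$$\Bigl\|\frac{d\tilde F}{d\eta}\Bigr\| \le C_2\,\eta^{\alpha-1}\bigl(\eta^{-\gamma} + \eta^{-1/2-\gamma/2}\bigr),$$
so that, by integrating w.r.t. $\eta$,
$$\|\tilde F\| \le C_3\,\eta^{\alpha}\bigl(\eta^{-\gamma} + \eta^{-1/2-\gamma/2}\bigr) \le C_4\,\eta^{\alpha-1/2-\gamma/2}. \tag{2.10}$$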


Observe that, if $\alpha>1/2$ then $\alpha - 1/2 - \gamma/2 - (-\gamma) > \alpha - 1/2 > 0$ in the above exponent. Hence after a finite number of steps beginning with $\|\tilde F\|\le\eta^{-1}$ we arrive at the inequality $\|\tilde F\|\le C$ with a constant $C$ independent of $\varepsilon$, $\varepsilon'$ and $\eta$. By taking $\eta\to 0$ this implies (2.1) for $1/2<\alpha\le 1$.
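The finite-step claim can be illustrated by tracking only the exponent. In the sketch below, a bound of order η^(−γ) is replaced at each pass by one of order η^(−(γ/2 + 1/2 − α)), following (2.10), and the passes are counted until the exponent is strictly negative. The function name is hypothetical, and stopping at a strictly negative exponent is a simplifying assumption that also covers the borderline logarithmic case γ = 0.

```python
def bootstrap_exponents(alpha):
    """Exponents gamma_k in successive bounds ||F~|| <= C * eta**(-gamma_k).

    Starts from gamma_0 = 1 (the a priori bound eta**-1) and applies the
    update gamma -> gamma/2 + 1/2 - alpha read off from (2.10).
    Stops once the exponent is strictly negative, i.e. the bound is uniform.
    """
    if alpha <= 0.5:
        # For alpha <= 1/2 the update has a non-negative fixed point 1 - 2*alpha,
        # so the exponent never becomes negative.
        raise ValueError("the iteration terminates only for alpha > 1/2")
    gammas = [1.0]
    while gammas[-1] >= 0.0:
        gammas.append(gammas[-1] / 2.0 + 0.5 - alpha)
    return gammas

def steps_needed(alpha):
    """Number of integrations needed to reach a uniform bound."""
    return len(bootstrap_exponents(alpha)) - 1

if __name__ == "__main__":
    for a in (1.0, 0.75, 0.6, 0.51):
        print(a, steps_needed(a), bootstrap_exponents(a))
```

For α = 1 the exponents run through 1, 0 and −1/2, which matches the two integrations (first to a logarithm, then to a constant) used in the case α = 1. The step count grows as α approaches 1/2.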

3. Proofs of Theorem 1.2 and Corollary 1.3

In the first part of the present section we shall deduce Theorem 1.2 from Theorem 1.1. The second part is concerned with the proof of Corollary 1.3.

Proof of Theorem 1.2. First let
$$u = i\,(u^{+} - u^{-}).$$
From (2.2) and (2.3) it follows that $u$ satisfies
$$\begin{cases} i\partial_t u + P(D)u = 0,\\ u(0,x) = \int_{-\infty}^{\infty} e^{-isP(D)}f(s,x)\,ds.\end{cases}$$
Denote by $T$ the correspondence from functions in $\mathbb{R}^{n+1}$ to functions in $\mathbb{R}^n$ as follows: $f = f(t,x)\mapsto u_0 = u(0,x)$. Then the map of the homogeneous initial value problem $u_0\mapsto e^{itP(D)}u_0$ will become its adjoint $T^{*}$, and the correspondence $f\mapsto u$ can be represented as $T^{*}T$. Hence the proof of Theorem 1.1 yields the following inequality:
$$\begin{aligned}
\||P(D)|^{1/2}Tf\|^2_{L^2(\mathbb{R}^n)} &= \bigl(|P(D)|^{1/2}Tf, |P(D)|^{1/2}Tf\bigr)_{L^2(\mathbb{R}^n)} = \bigl(|P(D)|T^{*}Tf, f\bigr)_{L^2(\mathbb{R}^{n+1})} \\
&\le \|(|A|+1)^{-\alpha}|P(D)|T^{*}Tf\|_{L^2(\mathbb{R}^{n+1})}\,\|(|A|+1)^{\alpha}f\|_{L^2(\mathbb{R}^{n+1})} \\
&= \|(|A|+1)^{-\alpha}|P(D)|u\|_{L^2(\mathbb{R}^{n+1})}\,\|(|A|+1)^{\alpha}f\|_{L^2(\mathbb{R}^{n+1})} \\
&\le C\,\|(|A|+1)^{\alpha}f\|^2_{L^2(\mathbb{R}^{n+1})}.
\end{aligned}$$
Moreover observe that
$$\bigl(f, |P(D)|^{1/2}T^{*}u_0\bigr)_{L^2(\mathbb{R}^{n+1})} = \bigl(|P(D)|^{1/2}Tf, u_0\bigr)_{L^2(\mathbb{R}^n)} \le C\,\|(|A|+1)^{\alpha}f\|_{L^2(\mathbb{R}^{n+1})}\,\|u_0\|_{L^2(\mathbb{R}^n)},$$


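and hence
$$\|(|A|+1)^{-\alpha}|P(D)|^{1/2}T^{*}u_0\|_{L^2(\mathbb{R}^{n+1})} \le C\,\|u_0\|_{L^2(\mathbb{R}^n)},$$
which proves Theorem 1.2.

Proof of Corollary 1.3. All we need to prove is that, for $0\le\alpha\le 1$, the operators $\langle x\rangle^{-\alpha}\langle D\rangle^{-\alpha}(|A|+1)^{\alpha}$ and $(|A|+1)^{\alpha}\langle D\rangle^{-\alpha}\langle x\rangle^{-\alpha}$ are bounded in $L^2(\mathbb{R}^{n+1})$. This is obvious for $\alpha = 0$, or more generally for $\alpha\in\mathbb{C}$ satisfying $\operatorname{Re}\alpha = 0$. Observe that the operators
$$(A+i)^{-1}(|A|+1), \qquad \langle x\rangle^{-1}\Bigl(x\cdot D - i\frac{n}{2} + i\Bigr)\langle D\rangle^{-1} \qquad\text{and}\qquad [\langle D\rangle^{-1}, x\cdot D]$$
are bounded in $L^2(\mathbb{R}^{n+1})$. Hence we see that the operator
$$\begin{aligned}
\langle x\rangle^{-1}\langle D\rangle^{-1}(|A|+1) &= \langle x\rangle^{-1}\langle D\rangle^{-1}(A+i)(A+i)^{-1}(|A|+1) \\
&= \langle x\rangle^{-1}\Bigl(x\cdot D - i\frac{n}{2} + i\Bigr)\langle D\rangle^{-1}(A+i)^{-1}(|A|+1) + \langle x\rangle^{-1}[\langle D\rangle^{-1}, x\cdot D](A+i)^{-1}(|A|+1)
\end{aligned}$$
is bounded in $L^2(\mathbb{R}^{n+1})$. Such a procedure is applicable to $\langle x\rangle^{-\alpha}\langle D\rangle^{-\alpha}(|A|+1)^{\alpha}$ and $(|A|+1)^{\alpha}\langle D\rangle^{-\alpha}\langle x\rangle^{-\alpha}$ for $\alpha\in\mathbb{C}$ satisfying $\operatorname{Re}\alpha = 1$. Finally the complex interpolation gives the desired assertion.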

4. Additional Results

The final section of this paper gives several extensions. The following results can be obtained by our method of proof.

Operators with constant coefficients. The arguments in Sect. 2 work even if $P(\xi)$ is quasi-homogeneous, namely the assumption (A.2) is replaced by
$$P(\lambda^{\theta_1}\xi_1,\dots,\lambda^{\theta_n}\xi_n) = \lambda\,P(\xi_1,\dots,\xi_n) \tag{4.1}$$
for some $\theta_1,\dots,\theta_n>0$. Indeed, by replacing $A$ with
$$A = \frac{1}{2i}\sum_{j=1}^{n}\theta_j\Bigl(\frac{\partial}{\partial x_j}x_j + x_j\frac{\partial}{\partial x_j}\Bigr),$$
(4.1) implies $[P, iA] = P$. As remarked in Sect. 2, such a relation gives the inequality (2.6). More generally, for $a_{kj}\in\mathbb{R}$ ($k,j = 1,\dots,n$) set
$$A = \frac{1}{2i}\sum_{k,j=1}^{n}a_{kj}\Bigl(\frac{\partial}{\partial x_j}x_k + x_k\frac{\partial}{\partial x_j}\Bigr).$$
Then we have the following extension.


Theorem 4.1. Suppose that $P(\xi)$ satisfies (A.1) and
$$\sum_{k,j=1}^{n}a_{kj}\,\xi_j\,\frac{\partial P}{\partial\xi_k} = P. \tag{A.2'}$$
Then all the conclusions in Sect. 1 hold.

Proof. Assumption (A.2') implies $[P, iA] = P$. This induces the conclusions.

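Remark. Denote $A = (a_{kj})$ and let $\eta(\theta,\xi) = {}^{t}(\eta_1(\theta,\xi),\dots,\eta_n(\theta,\xi))$ ($\theta\in\mathbb{R}$, $\xi\in\mathbb{R}^n$) be a solution of the initial value problem:
$$\begin{cases}\dfrac{d\eta}{d\theta} = A\eta,\\[1ex] \eta(0,\xi) = \xi.\end{cases} \tag{4.2}$$
Then Assumption (A.2') means
$$\frac{d}{d\theta}P(\eta(\theta,\xi)) = P(\eta(\theta,\xi)),$$
so that
$$P(\eta(\theta,\xi)) = P(\xi)\,e^{\theta}.$$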

This equality restricts the behavior of $P(\xi)$ along the flow given by (4.2). It is well known that the phase portrait of such a flow depends on the type of the matrix $A$. See Hirsch and Smale [8].

Operators with variable coefficients. Let $P$ be a selfadjoint operator in $L^2(\mathbb{R}^n)$. We consider here the initial value problem of the form
$$\begin{cases} i\partial_t u + Pu = f,\\ u|_{t=0} = u_0.\end{cases} \tag{4.3}$$
In what follows, we denote by $A$ a selfadjoint operator and by $Q$ a positive selfadjoint operator in $L^2(\mathbb{R}^n)$. Also denote by $\alpha$ a positive number satisfying $\alpha>1/2$. We have

Theorem 4.2. Suppose that $P$, $Q$ and $A$ satisfy the following relations:
$$[P, iA] = Q, \qquad [Q, iA] = c\,Q\ \ (c: \text{a constant}) \qquad\text{and}\qquad [P,Q] = O. \tag{4.4}$$
Then:

(i) In the case where $u_0\equiv 0$ and $f\equiv 0$ for $t<0$ in (4.3), there exists a constant $C = C_\alpha$ such that
$$\|(|A|+1)^{-\alpha}Qu\|_{L^2(\mathbb{R}^{n+1})} \le C\,\|(|A|+1)^{\alpha}f\|_{L^2(\mathbb{R}^{n+1})}.$$


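(ii) In the case where $f\equiv 0$ in (4.3), there exists a constant $C' = C'_\alpha$ such that
$$\|(|A|+1)^{-\alpha}Q^{1/2}u\|_{L^2(\mathbb{R}^{n+1})} \le C'\,\|u_0\|_{L^2(\mathbb{R}^n)}.$$

Proof. In Sect. 2, we used the Fourier transform to obtain the inequality (2.7). So our task here is to give another proof of it. Observe that, by (4.4) it follows
$$\begin{aligned}
\|(i\partial_t + P + i\{\varepsilon' + \varepsilon Q\})u\|^2 &= \bigl((i\partial_t + P + i\{\varepsilon' + \varepsilon Q\})u,\ (i\partial_t + P + i\{\varepsilon' + \varepsilon Q\})u\bigr)_{L^2(\mathbb{R}^{n+1})} \\
&= \|(i\partial_t + P)u\|^2 + \|(\varepsilon' + \varepsilon Q)u\|^2 \\
&\ge \varepsilon^2\|Qu\|^2,
\end{aligned}$$
where $\|\cdot\| = \|\cdot\|_{L^2(\mathbb{R}^{n+1})}$. This immediately implies that the operator
$$F = (|A|+1)^{-1}Q(i\partial_t + P + i\{\varepsilon' + \varepsilon Q\})^{-1}(|A|+1)^{-1}$$
satisfies
$$\|F\|_{\mathcal{L}(L^2(\mathbb{R}^{n+1}))} \le \varepsilon^{-1}.$$
The remaining parts of the proof are completely the same as those in Sect. 2.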

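Remark. The above theorem is useful even for the initial value problem (1.1). For example, if $P(\xi)$ is written $P(\xi) = \xi_1^m + R(\xi')$ ($\xi = (\xi_1,\xi')$), then
$$A = \frac{1}{2i}\Bigl(\frac{\partial}{\partial x_1}x_1 + x_1\frac{\partial}{\partial x_1}\Bigr) \qquad\text{and}\qquad Q = m\Bigl(\frac{1}{i}\frac{\partial}{\partial x_1}\Bigr)^{m}$$
satisfy the relations in Theorem 4.2.

Finally we close this section by giving some examples, to which the above theorem is applicable.

Example 1. For an integer $\ell>1$ let
$$P = -\sum_{j=1}^{n}\frac{\partial}{\partial x_j}\,x_j^{2\ell}\,\frac{\partial}{\partial x_j}.$$
Then the following pairs satisfy the relations.

(i) $A = -\dfrac{1}{2i}\Bigl(\dfrac{\partial}{\partial x}\cdot x + x\cdot\dfrac{\partial}{\partial x}\Bigr)$, $\quad Q = 2(\ell-1)P$.

(ii) $A = -\dfrac{1}{2i}\Bigl(\dfrac{\partial}{\partial x_1}x_1 + x_1\dfrac{\partial}{\partial x_1}\Bigr)$, $\quad Q = -2(\ell-1)\dfrac{\partial}{\partial x_1}\,x_1^{2\ell}\,\dfrac{\partial}{\partial x_1}$.

Example 2. Let
$$P = -\Bigl(\frac{\partial^2}{\partial x_1^2} + x_1^2\,\frac{\partial^2}{\partial x_2^2}\Bigr).$$
Then the following pair satisfies the relations:
$$A = \frac{1}{2i}\Bigl\{\Bigl(\frac{\partial}{\partial x_1}x_1 + x_1\frac{\partial}{\partial x_1}\Bigr) + 2\Bigl(\frac{\partial}{\partial x_2}x_2 + x_2\frac{\partial}{\partial x_2}\Bigr)\Bigr\}, \qquad Q = 2P.$$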


References
1. Ben-Artzi, M. and Devinatz, A.: Local smoothing and convergence properties of Schrödinger type equations. J. Funct. Anal. 101, 231–254 (1991)
2. Ben-Artzi, M. and Klainerman, S.: Decay and regularity for the Schrödinger equation. J. d’Analyse Math. 58, 25–37 (1992)
3. Constantin, P. and Saut, J.C.: Local smoothing properties of dispersive equations. J. Am. Math. Soc. 1, 413–439 (1988)
4. Craig, W., Kappeler, T. and Strauss, W.: Microlocal dispersive smoothing for the Schrödinger equation. Comm. Pure Appl. Math. 48, 769–860 (1995)
5. Cycon, H.L., Froese, R.G., Kirsch, W. and Simon, B.: Schrödinger Operators. Berlin–Heidelberg–New York: Springer-Verlag, 1987
6. Doi, S.: On the Cauchy problem for Schrödinger type equations and the regularity of solutions. J. Math. Kyoto Univ. 34, 319–328 (1994)
7. Ginibre, J. and Velo, G.: Smoothing properties and retarded estimates for some dispersive equations. Commun. Math. Phys. 144, 163–144 (1992)
8. Hirsch, M.W. and Smale, S.: Differential Equations, Dynamical Systems and Linear Algebra. New York: Academic Press, 1974
9. Kato, T.: On the Cauchy problem for the (generalized) Korteweg–de Vries equation. Studies in Appl. Math., Adv. Math. Suppl. Studies 8, 93–128 (1983)
10. Kato, T. and Yajima, K.: Some examples of smooth operators and the associated smoothing effect. Rev. Math. Phys. 1, 481–496 (1989)
11. Kenig, C., Ponce, G. and Vega, L.: Oscillatory integrals and regularity of dispersive equations. Indiana Univ. Math. J. 40, 33–69 (1991)
12. Kenig, C., Ponce, G. and Vega, L.: Well-posedness and scattering results for the generalized Korteweg–de Vries equation via the contraction principle. Commun. Pure Appl. Math. 46, 527–620 (1993)
13. Mourre, E.: Absence of singular continuous spectrum for certain selfadjoint operators. Commun. Math. Phys. 78, 391–408 (1981)
14. Mourre, E.: Opérateurs conjugués et propriétés de propagation. Commun. Math. Phys. 91, 279–300 (1983)
15. Perry, P., Sigal, I.M. and Simon, B.: Spectral analysis of N-body Schrödinger operators. Ann. of Math. 114, 519–567 (1981)
16. Strichartz, R.S.: Restriction of Fourier transform to quadratic surfaces and decay of solutions to wave equations. Duke Math. J. 44, 705–714 (1977)
17. Yamazaki, M.: On the microlocal smoothing effect of dispersive partial differential equations I, Second-order linear equations. Algebraic Analysis II, Boston: Academic Press, 1988, pp. 911–926

Communicated by B. Simon

Commun. Math. Phys. 202, 267 – 290 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Bifurcation of Nonclassical Viscous Shock Profiles from the Constant State*

A. V. Azevedo1,**, D. Marchesin2, B. Plohr3, K. Zumbrun4

1 Departamento de Matemática, Universidade de Brasília, 70910-900 Brasília, DF, Brazil
2 Instituto de Matemática Pura e Aplicada, Estrada Dona Castorina 110, 22460 Rio de Janeiro, RJ, Brazil. E-mail: [email protected]
3 Departments of Mathematics and of Applied Mathematics and Statistics, State University of New York at Stony Brook, Stony Brook, NY 11794-3651, USA. E-mail: [email protected]
4 Department of Mathematics, Indiana University, Bloomington, IN 47405, USA. E-mail: [email protected]

Received: 2 December 1997 / Accepted: 6 October 1998

Abstract: We determine the bifurcation from the constant solution of nonclassical transitional and overcompressive viscous shock profiles, in regions of strict hyperbolicity. Whereas classical shock waves in systems of conservation laws involve a single characteristic field, nonclassical waves involve two fields in an essential way. This feature is reflected in the viscous profile differential equation, which undergoes codimension-three bifurcation of the kind studied by Dumortier et al., as opposed to the codimension-one bifurcation occurring in the classical case. We carry out a complete bifurcation analysis for systems of two quadratic conservation laws with constant, strictly parabolic viscosity matrices by reducing to a canonical form introduced by Fiddelaers. We show that all such systems, except possibly those on a codimension-one variety in parameter space, give rise to nonclassical shock waves, and we classify the number and types of their bifurcation points. One consequence of our analysis is that weak transitional waves arise in pairs, with profiles forming a 2-cycle configuration previously shown to lead to nonuniqueness of Riemann solutions and to nontrivial asymptotic dynamics of the conservation laws. Another consequence is that appearance of weak nonclassical waves is necessarily associated with change of stability in constant solutions of the parabolic system of conservation laws, rather than with change of type in the associated hyperbolic system.

* This work was supported in part by: the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior under Grant BEX0012/97-1; the Fundação de Amparo à Pesquisa do Distrito Federal under Grant 0821 193 431/95; the Conselho Nacional de Desenvolvimento Tecnológico e Científico under Grant CNPq/NSF 910087/92-0; the Conselho Nacional de Desenvolvimento Tecnológico e Científico under Grant 530054/93-0; the Financiadora de Estudos e Projetos under Grant 65920311-00; the National Science Foundation under Grant INT-9512873; the Applied Mathematics Subprogram of the U. S. Department of Energy under Grant DE-FG02-90ER25084; CNPq Grant 301411/95-6; the Office of Naval Research under Grant N00014-94-10456; and the National Science Foundation under Grant DMS-9107990 and Grant DMS-9706842.

** Current address: Departments of Applied Mathematics and Statistics, State University of New York at Stony Brook, Stony Brook, NY 11794-3600, USA. E-mail: [email protected]


1. Introduction In this paper, we investigate the emergence of weak transitional and overcompressive shock waves. These nonclassical waves have been found in solutions of Riemann problems for conservation laws of mixed type [43, 24, 25, 26, 17, 18, 19]. In particular, transitional shock waves occur in systems of two conservation laws related to threephase flow in porous media, and overcompressive shock waves occur in the equations of magnetohydrodynamics. Here we show how nonclassical waves arise in generic, strictly hyperbolic systems. We obtain a complete structural description, analogous to those of Refs. [29] and [36] for classical waves, of the bifurcation of nonclassical shock profiles from a constant solution. One motivation for this study is to demonstrate the existence of 2-cycle configurations of transitional shock waves, which were shown in Ref. [3] to lead to nonuniqueness of Riemann solutions for the hyperbolic system and to corresponding nontrivial asymptotic dynamics for its associated parabolic regularization. We show that such configurations arise generically wherever there exist arbitrarily weak transitional shock waves. Thus the occurrence of weak transitional shock waves generically implies nonuniqueness of Riemann solutions. Other examples of nonuniqueness of Riemann solutions may be found in Refs. [28, 42, 1, 2, 5]. Our results for strictly hyperbolic systems also shed light on the origins of nonclassical waves in mixed-type systems. We show that the appearance of weak transitional shock waves in a system of hyperbolic conservation laws entails linearized instability of nearby constant solutions with respect to the associated parabolic equations, but not necessarily loss of strict hyperbolicity. This provides evidence that linearized instability, not ellipticity, is the true origin of nonclassical waves. 
Nonclassical waves occur frequently in mixed-type systems only because the elliptic region is contained in the region of linear instability, so that change of type implies linear instability.

1.1. Background discussion. We consider a system of conservation laws
$$U_t + F(U)_x = 0, \tag{1.1}$$
where $U(x,t)\in O\subseteq\mathbb{R}^n$ for $x\in\mathbb{R}$ and $t>0$, and $F\in C^2(O,\mathbb{R}^n)$. Following Gel’fand [21] and Courant and Friedrichs [11], we also consider an associated viscous regularization,
$$U_t + F(U)_x = (B(U)U_x)_x, \tag{1.2}$$
where $B\in C^1(O, L(\mathbb{R}^n,\mathbb{R}^n))$. Our concern is with weak shock waves bifurcating from a constant solution $U(x,t)\equiv U^{*}\in O$. We assume that, at $U^{*}$, Eq. (1.1) is strictly hyperbolic and Eq. (1.2) is strictly parabolic, i.e.,

(N1) $F'(U^{*})$ has $n$ distinct real eigenvalues,

(N2) all eigenvalues of $B(U^{*})$ have positive real part.

The neighborhood $O$ is chosen small enough that strict hyperbolicity and parabolicity hold throughout. The eigenvalues of $F'(U)$ are denoted $\lambda_j(U)$, $j = 1,\dots,n$, and $r_j(U)$ and $\ell_j(U)$ denote smooth families of corresponding right- and left-eigenvectors, normalized so that $\ell_j(U)r_j(U) = 1$.
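Conditions (N1) and (N2) are easy to test numerically for n = 2. The sketch below assumes the Jacobian F'(U*) and the viscosity matrix B(U*) are given as plain 2×2 nested lists; the function names and the sample matrices are hypothetical illustrations, not taken from the paper.

```python
import cmath

def eig2(M):
    """Eigenvalues of a 2x2 matrix [[a, b], [c, d]] via trace and determinant."""
    (a, b), (c, d) = M
    tr = a + d
    det = a * d - b * c
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr - disc) / 2.0, (tr + disc) / 2.0

def strictly_hyperbolic(J, tol=1e-12):
    """(N1): the Jacobian has two distinct real eigenvalues."""
    l1, l2 = eig2(J)
    return abs(l1.imag) < tol and abs(l2.imag) < tol and abs(l1 - l2) > tol

def strictly_parabolic(B):
    """(N2): every eigenvalue of the viscosity matrix has positive real part."""
    return all(lam.real > 0.0 for lam in eig2(B))

if __name__ == "__main__":
    J_hyp = [[0.0, 1.0], [1.0, 0.0]]     # eigenvalues -1 and 1
    J_ell = [[0.0, 1.0], [-1.0, 0.0]]    # eigenvalues -i and i (elliptic)
    B_ok = [[1.0, -2.0], [2.0, 1.0]]     # eigenvalues 1 - 2i and 1 + 2i
    B_bad = [[1.0, 0.0], [0.0, -1.0]]    # one negative eigenvalue
    print(strictly_hyperbolic(J_hyp), strictly_hyperbolic(J_ell))
    print(strictly_parabolic(B_ok), strictly_parabolic(B_bad))
```

Note that (N2) allows complex eigenvalues of B(U*) as long as their real parts are positive, which is why the check uses only the real part.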


Equation (1.1) admits many shock wave solutions
$$U(x,t) = \tilde U(x - st) = \begin{cases} U_- & \text{for } x<st,\\ U_+ & \text{for } x>st.\end{cases} \tag{1.3}$$
Throughout this paper, shock waves are required to satisfy the viscous profile criterion: that there should exist a corresponding traveling wave solution of system (1.2), moving with the same speed $s$ and possessing the same asymptotic states $U_\pm$ at $x = \pm\infty$. Such a traveling wave $\tilde U$ is a solution of the system of ordinary differential equations
$$\dot U = \mathcal{U}(U; U_-, s) \tag{1.4}$$
satisfying $\lim_{\xi\to\pm\infty}\tilde U(\xi) = U_\pm$, where $\xi = x - st$ and
$$\mathcal{U}(U; U_-, s) = B(U)^{-1}\bigl[-s\,(U - U_-) + F(U) - F(U_-)\bigr]. \tag{1.5}$$
A necessary condition for existence of a viscous profile is that $(U_-, U_+, s)$ satisfy the Rankine–Hugoniot condition
$$F(U_+) - F(U_-) - s(U_+ - U_-) = 0, \tag{RH}$$
or equivalently that the corresponding solution (1.3) satisfy Eq. (1.1) weakly. We regard $\mathcal{U}$ as an $(n+1)$-parameter family of vector fields in $\mathbb{R}^n$, with parameters $(U_-, s)$, that unfolds the vector field
$$\dot U = \mathcal{U}(U; U^{*}, s^{*}). \tag{1.6}$$

A fundamental problem in conservation laws is to determine which of the triples satisfying condition (RH) have profiles. The first step in this direction is the structure theorem of Lax [29]. Considering condition (RH) as a bifurcation problem in the parameters $(U_-, s)$, Lax observed that shock triples $(U_-, U_+, s)$ bifurcate from the trivial solution $(U^{*}, U^{*}, s^{*})$ when $s^{*}$ is an eigenvalue of $F'(U^{*})$, say $s^{*} = \lambda_k(U^{*})$. Assuming strict hyperbolicity and genuine nonlinearity, he showed that, locally, the solutions of condition (RH) for each $U_-$ form a curve. Furthermore, solutions satisfy either the Lax characteristic inequalities,
$$\lambda_k(U_-) > s > \lambda_k(U_+) \quad\text{and}\quad \operatorname{sgn}(\lambda_j(U_-) - s) = \operatorname{sgn}(\lambda_j(U_+) - s) \neq 0 \ \text{ for } j\neq k, \tag{1.7}$$
or the “reverse” inequalities, with $U_-$ and $U_+$ interchanged. Lax showed that the initial-value problem for Eq. (1.1), linearized around a weak shock wave, is well-posed if and only if inequalities (1.7) are satisfied. We refer to viscous shock waves satisfying inequalities (1.7) as Lax shock waves, and we refer to viscous shock waves satisfying the reverse inequalities as reverse-Lax shock waves. The Lax admissibility criterion has been extended, in the form of the Liu admissibility criterion [39, 47, 30, 31, 33], to allow for failure of genuine nonlinearity. The Liu criterion implies, but is not implied by, the Lax criterion.

The relationship between the Lax and viscous profile criteria has been explored in many works (see, e.g., Refs. [16, 7, 8, 40, 32, 38, 9]). Most relevant for the present paper is the work of Majda and Pego [36]. Assuming strict hyperbolicity and absence of linear degeneracy, they showed that a weak viscous shock wave of family $k$ satisfies either the Liu admissibility condition or its reverse, according as $\ell_k(U^{*})B(U^{*})r_k(U^{*})$


is greater than or less than zero. They also observed (as did Kawashima [27]) that the stable viscosity matrix criterion,
$$\ell_j(U^{*})B(U^{*})r_j(U^{*}) \ge 0, \qquad j = 1,\dots,n, \tag{1.8}$$

is necessary for linearized bounded $L^2$ stability of the constant solution, $U(x, t) \equiv U^*$. For $n = 2$, they showed that this condition is sufficient as well. In this context, which is our main interest, inequalities (1.8) can be viewed as the condition for a state $U^*$ to belong to the region of linearized stability for the chosen viscosity matrix [6]. If $U^*$ belongs to the regions of linearized stability and strict hyperbolicity, the Liu criterion is necessary and sufficient for existence of viscous profiles for shock waves near $U^*$, so that all local profiles are of Lax type. In contrast, the examples of anomalous “rarefaction-shocks,” i.e., reverse-Lax connections, described in Ref. [7] occur outside of the region of linearized stability.

Other kinds of viscous shock waves, transitional and overcompressive waves, satisfy neither the Lax criterion nor its reverse. An overcompressive shock wave has a second compressive characteristic family $\ell$ such that

$$\lambda_\ell(U_-) > s > \lambda_\ell(U_+), \tag{1.9}$$

along with the compressive family $k$, whereas transitional (also known as undercompressive) shock profiles have no compressive family, i.e.,

$$\operatorname{sgn}(\lambda_j(U_-) - s) = \operatorname{sgn}(\lambda_j(U_+) - s) \neq 0 \quad \text{for all } j. \tag{1.10}$$
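The inequalities (1.7), (1.9), and (1.10) amount to a counting rule on the characteristic speeds at the two end states. The following is a minimal sketch of that rule and is not part of the original analysis. The function name `classify_shock`, and the convention that the speeds of each family are passed as two lists in matching order, are assumptions made for illustration. Configurations covered by none of the definitions quoted above (for example, higher degrees of compressivity) are reported as "other".

```python
def classify_shock(lam_minus, lam_plus, s):
    """Classify a shock triple from its characteristic speeds.

    lam_minus[j], lam_plus[j]: speed of family j at U_- and at U_+
    (hypothetical inputs, listed in the same family order on both sides).
    s: shock speed. Speeds equal to s are rejected, as in condition (1.11).
    """
    if len(lam_minus) != len(lam_plus):
        raise ValueError("need the same number of families on both sides")
    if any(lam == s for lam in list(lam_minus) + list(lam_plus)):
        raise ValueError("shock speed coincides with a characteristic speed")

    # Families with lambda_j(U_-) > s > lambda_j(U_+), as in (1.7) and (1.9).
    compressive = sum(1 for lm, lp in zip(lam_minus, lam_plus) if lm > s > lp)
    # Families satisfying the same inequality with U_- and U_+ interchanged.
    reversed_ = sum(1 for lm, lp in zip(lam_minus, lam_plus) if lm < s < lp)

    if compressive == 1 and reversed_ == 0:
        return "Lax"
    if compressive == 0 and reversed_ == 1:
        return "reverse-Lax"
    if compressive == 2 and reversed_ == 0:
        return "overcompressive"
    if compressive == 0 and reversed_ == 0:
        return "transitional"   # condition (1.10)
    return "other"
```

For instance, `classify_shock([-1.0, 1.0], [-2.0, 2.0], 0.0)` falls under condition (1.10), since neither family changes the sign of $\lambda_j - s$ across the shock.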

We refer to overcompressive and transitional viscous shock waves as nonclassical. One can also define higher degrees of over- and under-compressivity [35, 18], but we do not discuss these less generic cases here.

Notice that condition (1.10) defining transitional waves allows the possibility $U_+ = U_-$, in which case the shock profile is a homoclinic orbit. Strictly speaking, the corresponding traveling wave solution is not a shock wave, but rather a solitary wave, or pulse solution. Whereas transitional shock waves with heteroclinic orbits play an important role in hyperbolic theory as components of Riemann solutions, waves with homoclinic orbits play a more subtle role, as discussed in Ref. [3]: their hyperbolic limits are constant solutions, they can be unstable as parabolic solutions [20], and they appear to function as saddle points on the boundary between attracting time-asymptotic solutions. Transitional waves having homoclinic orbits also differ from those having heteroclinic orbits in their bifurcation from the constant state solution. Therefore we distinguish the homoclinic and heteroclinic cases when stating our results.

Notice also that we assume, as part of the definition of nonclassical shock waves, that

$$\det[F'(U_-) - sI] \neq 0 \quad\text{and}\quad \det[F'(U_+) - sI] \neq 0, \tag{1.11}$$

precluding shock waves with speed coinciding with a characteristic speed. In particular, the requirement of strict compressivity disallows the possibility of a Lax or overcompressive wave with a homoclinic orbit.

1.2. Summary of results. The main results of the paper, contained in Sects. 2–4, can be summarized as follows. First we derive necessary conditions for a point $(U^*, s^*)$ to spawn arbitrarily weak nonclassical shock waves with heteroclinic orbits.

Theorem 1.1. In addition to conditions (N1) and (N2), assume
(N3) $B(U^*)^{-1}[F'(U^*) - s^* I]$ has no purely imaginary eigenvalues other than 0.
Suppose that, for every $\epsilon > 0$, $(U^*, U^*, s^*)$ is an accumulation point of nonclassical shock waves $(U_-, U_+, s)$ with profiles lying in the ball of radius $\epsilon$ about $U^*$ and $U_+ \neq U_-$. Then $(U^*, s^*)$ satisfies:
(D1) $s^*$ is an eigenvalue of $F'(U^*)$;
(D2) $\ell^* F''(U^*)(r^*, r^*) = 0$;
(D3) $\ell^* B(U^*) r^* = 0$.
Here $r^*$ and $\ell^*$ denote, respectively, right and left eigenvectors of $F'(U^*)$ corresponding to $s^*$.

In other words, $s^*$ is a characteristic speed at $U^*$, genuine nonlinearity fails for the corresponding characteristic family at $U^*$, and $U^*$ lies on the boundary of the region of linearized stability. The state $U = U^*$ is a type of nilpotent equilibrium for Eq. (1.6) whose unfolding, which we describe below, was determined by Dumortier, Roussarie, and Sotomayor [14, 13]. Accordingly, we refer to a point $(U^*, s^*)$ satisfying conditions (D1)–(D3) as a DRS point, and we refer to the bifurcation described in Refs. [14, 13] as DRS bifurcation.

Second, we find a broad class of models for which conditions (D1)–(D3) imply the existence of arbitrarily weak nonclassical shock waves with heteroclinic profiles.

Theorem 1.2. Consider a system of two conservation laws (1.1) for which $F$ is quadratic and $B$ is constant. Let $(U^*, s^*)$ be a DRS point that satisfies the nondegeneracy conditions (N4)–(N11) below in addition to conditions (N1) and (N2). Then the family of vector fields $\mathcal{U}$ undergoes the DRS bifurcation at $(U^*, s^*)$. In particular, for every $\epsilon > 0$, $(U^*, U^*, s^*)$ is an accumulation point of nonclassical shock waves $(U_-, U_+, s)$ with orbits lying in the ball of radius $\epsilon$ about $U^*$ and $U_+ \neq U_-$.

More can be deduced, in fact, because the complete three-dimensional bifurcation diagrams are known for the three cases of DRS bifurcation, viz., saddle, elliptic, and focus.
For instance, transitional shock waves appear in the saddle case, and overcompressive shock waves appear in the elliptic case, but the focus case does not arise for quadratic models. We have restricted to $n = 2$, quadratic flux functions, and constant viscosity matrices primarily to simplify proofs and exposition; we expect that a similar result holds for general models, provided that suitably modified nondegeneracy conditions are satisfied. We view the DRS bifurcation as fundamental for nonclassical systems, just as is the Lax bifurcation for classical systems.

Remark. As mentioned above, the case of nonclassical homoclinic profiles, for which $U_+ = U_-$, is distinct. Necessarily transitional, such waves accumulate only near points satisfying conditions (D1) and (D3) (see the remark after the proof of Theorem 1.1). A converse result analogous to Theorem 1.2 was proved in Ref. [6]. Thus the bifurcation of homoclinic shock waves is a simpler, codimension-two phenomenon.

Finally, for a certain class of quadratic flux models, we show that DRS points occur generically. Let us define a nondegenerate DRS point to be a point satisfying nondegeneracy conditions (N1), (N2), and (N4)–(N11) along with conditions (D1)–(D3).

Theorem 1.3. For all but a codimension-one variety of model coefficients, systems of two conservation laws with quadratic flux functions and constant, strictly parabolic viscosity matrices have either one or three DRS points, all nondegenerate.

Moreover, as described in Sect. 4, recent results [23, 48] allow us to classify certain quadratic models according to the number and types of the DRS points that occur. 1.3. Consequences for conservation laws. The bifurcation diagrams for the elliptic and saddle cases of DRS bifurcation are shown in Figs. 1.1 and 1.2. The three-dimensional space of universal parameters contains bifurcation surfaces that, topologically, have conical structure. Each diagram depicts the intersections of these surfaces with a small sphere, centered at the DRS point; the sphere has been punctured at a point outside the “lips” and laid flat.

Fig. 1.1. DRS bifurcation in the elliptic case [14] (Reproduced by permission)

The bifurcation parameters $(U_-, s)$ for the family $\mathcal{U}$ are not universal, but rather map to universal parameters $(G_-, s)$ via the relation $G_- = F(U_-) - s U_-$. The bifurcation diagram corresponding to the parameters $(U_-, s)$ is the preimage of the universal diagram under this map. We establish that this map has a cusp along the inflection curve $U = U_I(s)$, where genuine nonlinearity fails, so that it is a three-fold cover of the region inside the lips. The DRS point separates the inflection curve into two branches, corresponding to the points $C_i$ and $C_s$ in the figures.

Each point of the bifurcation diagram represents a phase portrait for Eq. (1.4). Inside the lips, the phase portrait contains three equilibria, $U_1$, $U_2$, and $U_3$; outside there is only one equilibrium. Thus the triple covering of the region inside the lips results from the ambiguity in the parameterization of Eq. (1.4) by a particular choice of equilibrium $U_-$; indeed, the map $G_-$ does not distinguish equilibria, by the Rankine–Hugoniot condition (RH).
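The count of one or three equilibria can be illustrated on the normal form (3.2) of Sect. 3.1. As we read that normal form, its first component is $y$, so equilibria have $y = 0$ and $x$ a real root of the cubic $\mu_1 + \mu_2 x + \epsilon_1 x^3$. The sketch below is an illustration, not part of the paper; it counts these roots using the discriminant of the cubic. The function name `count_equilibria` is hypothetical, and parameter values on the discriminant locus, where equilibria merge, are rejected rather than classified.

```python
def count_equilibria(mu1, mu2, eps1):
    """Number of real roots of eps1*x**3 + mu2*x + mu1, i.e. of equilibria
    (x, 0) of the truncated normal form; eps1 must be +1 or -1.

    Returns 3 or 1 away from the discriminant locus and raises ValueError
    on that locus, where two or more equilibria coincide.
    """
    if eps1 not in (1, -1):
        raise ValueError("eps1 must be +1 or -1")
    # The discriminant of a*x^3 + c*x + d is -4*a*c^3 - 27*a^2*d^2; here a^2 = 1.
    disc = -4.0 * eps1 * mu2**3 - 27.0 * mu1**2
    if disc == 0.0:
        raise ValueError("parameters lie on the discriminant locus")
    return 3 if disc > 0.0 else 1
```

On this reading, three equilibria occur exactly where the discriminant is positive, and the count does not depend on the third parameter $\nu$.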

Fig. 1.2. DRS bifurcation in the saddle case [14] (Reproduced by permission)

In the overcompressive (elliptic) case, the three equilibria for a phase portrait inside the lips consist of a saddle point, an attractor, and a repeller. In region I of Fig. 1.1, there exist orbits connecting each pair of equilibria: a one-parameter family of node-node connections, corresponding to overcompressive shock waves, bounded on one side by a pair of saddle-node connections, corresponding to Lax shock waves. In the surrounding, smaller open regions inside the lips, there do not appear any node-node connections; also there are no more than two saddle-node connections. In addition, there are homoclinic saddle-saddle connections on the boundaries labeled $L_l$ and $L_r$, corresponding to the intersections of codimension-one surfaces with the sphere of the diagram.

In the transitional (saddle) case, the three equilibria for a phase portrait inside the lips consist of two saddle points, $U_1$ and $U_2$, and a node $U_3$. In the various labeled
open regions, there exist no more than two saddle-node shock connections and no other types of connections. However, on the boundaries between these regions, there can occur saddle-saddle, or transitional, shock connections. More precisely, there exists a codimension-one surface of parameters on which there is a saddle-saddle connection from $U_1$ to $U_2$, corresponding to a transitional shock. Likewise, there exists a (different) codimension-one surface on which there is a saddle-saddle connection from $U_2$ to $U_1$. In addition, there exist two other codimension-one surfaces on which there are homoclinic saddle-saddle connections from $U_1$ to itself or from $U_2$ to itself. All of these connections correspond to transitional shock waves. In Fig. 1.2, the four surfaces of saddle-saddle connections correspond, respectively, to the boundary curves $SC_s$, $SC_i$, $L_l$, and $L_r$.

In particular, there exists a codimension-two curve on which there exist simultaneous connections from $U_1$ to $U_2$ and from $U_2$ to $U_1$, forming a 2-cycle of shock profiles. This two-saddle-cycle curve corresponds to the point TSC in Fig. 1.2. In other words, we have the surprising result that weak transitional waves always arise in pairs. The 2-cycle configuration was shown previously, in Ref. [3], to lead to nonuniqueness of Riemann solutions of Eq. (1.1) and corresponding nontrivial asymptotic dynamics of Eq. (1.2). Thus we conclude that weak transitional waves are intimately connected with linearized instability, nonuniqueness of Riemann solutions, and nontrivial asymptotic dynamics.

We do not repeat in detail here the description of the nonuniqueness resulting from the occurrence of a two-saddle cycle between states $U_1$ and $U_2$; however, nonuniqueness is easily demonstrated at a heuristic level. Consider the Riemann problem with data $U_L = U_R = U_1$. This initial-value problem admits the usual constant solution $U(x, t) \equiv U_1$.
It also admits a second solution consisting of a shock wave from $U_1$ to $U_2$ followed by a shock wave from $U_2$ to $U_1$. As weak solutions of the hyperbolic conservation laws, these two solutions coincide, for the two shock waves in the second solution have the same speed. However, under perturbation of the data $U_L$ and $U_R$, the speeds of the two shock waves separate, giving a distinct second solution.

1.4. Physical implications. Our results have consequences for models of three-phase flow in a porous medium, such as Corey’s model [10] and Stone’s model [45]. Theorem 4.5 of Sect. 4 asserts that weak transitional shock waves appear generically for quadratic flux models in Cases I$_B$ and II$_B$ in the classification scheme of Hurley [23, 48]. When $B = I$, Cases I$_B$–IV$_B$ of this scheme reduce to Cases I–IV of the Schaeffer-Shearer normal form [41]; and for any $B$, Case I$_B$ coincides with Case I. Quadratic models in Cases I and II are the ones that arise from quadratic expansions around the umbilic points in Corey’s model with quadratic permeabilities [41, 37]. Similarly, in Stone’s model, which is closely related, a topological argument establishes the existence of states at which the system of conservation laws is not strictly hyperbolic; these states form umbilic points and elliptic regions [46, 37]. When there is a single interior umbilic point, it is itself a (degenerate) DRS point; when there is an elliptic region, an extension of the topological argument [4] shows that DRS points exist nearby. Analogy with Corey’s model suggests that the number and types of DRS points for Stone’s model are predicted by an associated quadratic model. If this is so, the phenomena associated with weak transitional shock waves occur in this physical setting. Perhaps the associated nonuniqueness has consequences in the design of oil recovery strategies, for which Stone’s model is the current standard. These are issues that we are currently investigating.
Theorems 1.1 and 1.2 provide further evidence that linearized instability, not ellipticity, is the true origin of nonclassical waves for nonlinear conservation laws. Nonclassical waves occur frequently in mixed-type systems only because the elliptic region is contained in (and, if $B = I$, coincides with) the region of linearized instability (as seen by
examining the dispersion relation at long wavelengths). Closer examination (see Theorems 1.1–1.3) reveals that even for a mixed-type system, weak nonclassical waves generically emerge in the region of strict hyperbolicity. At this point, perhaps the condition of linearized stability deserves further discussion. It may be tempting to dismiss nonclassical shock waves, along with the reverse-Lax shock waves of Ref. [7], as nonphysical on the grounds that the stability condition (1.8) is violated at nearby states. However, overcompressive and heteroclinic transitional waves appear to be stable under L2 perturbations as solutions of Eq. (1.2). Evidence for this statement is both analytical [34, 35, 20] and numerical [3]. Moreover, the end states U− and U+ , and sometimes the whole shock orbit, can lie within the region of linearized stability [3, 35]. Our point of view is rather that failure of condition (1.8) represents an exchange of stability from the constant state solution to more complicated behavior, analogous to that found in many physical systems, that is manifested by the appearance of multiple Riemann solutions, corresponding to multiple possible asymptotic states of solutions of Eq. (1.2). The challenge (cf. Ref. [3]) is to understand this more complicated behavior by careful study of Riemann solutions and of their stability as solutions of Eq. (1.2).

2. Weak Nonclassical Shock Waves

In this section, we prove Theorem 1.1, which gives necessary conditions for the existence of arbitrarily weak nonclassical shock waves with orbits in the vicinity of a point $U^*$.

Proof of Theorem 1.1. From the Rankine–Hugoniot condition $F(U_+) - F(U_-) - s(U_+ - U_-) = 0$ and the Mean Value Theorem we conclude that $[F'(\hat U) - sI](U_+ - U_-) = 0$ for some point $\hat U$ on the line segment between $U_-$ and $U_+$. Hence $\det[F'(\hat U) - sI] = 0$. Taking the limit $(U_-, U_+, s) \to (U^*, U^*, s^*)$, we obtain $\det[F'(U^*) - s^* I] = 0$, verifying condition (D1).

Next, suppose that condition (D2) fails, i.e., $U^*$ is a point of genuine nonlinearity. Then, by the Lax structure theorem [29, 44], any shock triple $(U_-, U_+, s)$ sufficiently close to $(U^*, U^*, s^*)$ must satisfy either the Lax admissibility criterion or its reverse. In particular, nonclassical shock waves are excluded, which is a contradiction. This verifies condition (D2).

Similarly, Theorem 3.1 of Ref. [36] states that if $\ell^* B(U^*) r^* \neq 0$, any viscous shock profiles $(U_-, U_+, s)$ near to $(U^*, U^*, s^*)$ must satisfy either the Liu admissibility criterion or its reverse, given conditions (N1)–(N3) and provided that the characteristic field associated with $r^*$ is not linearly degenerate in any neighborhood of $U^*$. Linear degeneracy is impossible, since it would imply that $U_-$ and $U_+$ lie on the same rarefaction curve and that $s$ is an eigenvalue of both $F'(U_-)$ and $F'(U_+)$, contradicting hypothesis (1.11). Since the Liu admissibility criterion implies the Lax admissibility criterion, we again find that nonclassical shock waves are precluded, which is a contradiction. We conclude that condition (D3) holds.

Remark. A related theorem can be proved concerning the accumulation point of shock waves with homoclinic orbits: again assuming conditions (N1)–(N3), the point $(U^*, s^*)$ must satisfy conditions (D1) and (D3).
However, condition (D2) does not necessarily hold, as the results of Ref. [6] show. The proof of conditions (D1) and (D3) can be based on the center-manifold argument of Ref. [36]. In this argument, Eq. (1.4) is supplemented by the equations $\dot U_- = 0$ and $\dot s = 0$, and the center-stable and center-unstable manifolds of
the equilibrium $(U^*, U^*, s^*)$ are constructed. The intersection of these manifolds contains the accumulating homoclinic orbits and therefore has dimension at least $n + 3$. Thus $B(U^*)^{-1}[F'(U^*) - s^* I]$ has a null space of dimension at least 2, and conditions (D1) and (D3) can be derived as a consequence.

Remark. Condition (N3) is superfluous in the case $n = 2$, since 0 is an eigenvalue of $B(U^*)^{-1}[F'(U^*) - s^* I]$ and the number of eigenvalues that are not real must be even.

The conditions defining a DRS point can be cast in alternative forms. First we show that condition (D3), concerning linearized stability, means that $U^*$ is a nilpotent equilibrium for Eq. (1.6); our argument is simply the contrapositive of that in Ref. [36].

Proposition 2.1. Assuming conditions (D1), (N1), and (N2), condition (D3) is equivalent to the following one:
(BT) $B(U^*)^{-1}[F'(U^*) - s^* I]$ has a generalized eigenvector for the eigenvalue 0.

Proof. By conditions (D1), (N1), and (N2), the null space of $B(U^*)^{-1}[F'(U^*) - s^* I]$ is spanned by $r^*$. Therefore condition (BT) says that there exists a vector $v$ such that

$$B(U^*)^{-1}[F'(U^*) - s^* I]\,v = r^*, \tag{2.1}$$

or equivalently that

$$[F'(U^*) - s^* I]\,v = B(U^*) r^*. \tag{2.2}$$

This equation can be solved for $v$ if and only if $B(U^*) r^*$ is annihilated by the vector $\ell^*$ spanning the left kernel of $F'(U^*) - s^* I$, i.e., condition (D3) holds.

In the situation of Ref. [36], viz., $\ell^* B(U^*) r^* \neq 0$, the dynamics of Eq. (1.6) is effectively scalar, reducing to a one-dimensional center manifold through $U^*$; this fact lies at the heart of the analysis therein. In contrast, when conditions (D1) and (D3) hold, the reduced dynamics is fundamentally nonscalar, since $U^*$ is a nilpotent equilibrium. For instance, under certain nondegeneracy conditions (one being genuine nonlinearity, the negation of condition (D2)), Eq.
(1.6) has a two-dimensional center manifold and undergoes Hirschberg-Knobloch, or “folded” Bogdanov-Takens, bifurcation at $(U^*, s^*)$ (see Ref. [6]). The unfolding of this codimension-two bifurcation, given in Ref. [22], contains complete information on the viscous profiles to be found near nondegenerate points on the boundary of the linearized stability region. The heteroclinic connections that occur are either of Lax or reverse-Lax type, depending on the location of $U_-$ relative to the stability boundary; there are also homoclinic connections on a codimension-one surface of parameters $(U_-, s)$. In the situation of our interest, where conditions (D1)–(D3) and certain nondegeneracy conditions hold, there is likewise a two-dimensional center manifold, but the unfolding has codimension three. In the next section, we determine this unfolding.

Another condition related to those for a DRS point is the viscosity angle condition,

$$B(U^*)^{-1} F''(U^*)(r^*, r^*) = c\,r^* \quad \text{for some } c \neq 0. \tag{V}$$

This condition arises naturally in the study of transitional waves in 2-component quadratic flux models [25]; in a heuristic sense, it corresponds to existence of infinitesimal straight-line profiles in the direction of r∗ .
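Conditions (D3), (BT), and (V) are straightforward to check numerically on a concrete model. The sketch below does so for the quadratic example that the paper gives later as Eq. (3.11), taking the DRS point at $U^* = 0$ with $s^* = 0$. It is an illustration under that assumption, and the function and variable names are ours. For this model $F'(0) = \operatorname{diag}(0, 1)$, so $r^* = (1, 0)^T$ and $\ell^* = (1, 0)$, and $F''(r^*, r^*) = (0, 2)^T$ for either sign of the $uv$ term.

```python
import numpy as np

def bt_and_v_check():
    """Check (D3), (BT), and (V) at U* = 0, s* = 0 for example (3.11).

    Returns (ell_B_r, A, v, c): the scalar in (D3), the matrix
    B^{-1}[F'(U*) - s* I], a vector v solving (2.1), and the constant c in (V).
    """
    B = np.array([[0.0, -2.0], [1.0, 1.0]])    # viscosity matrix of (3.11)
    Fp = np.array([[0.0, 0.0], [0.0, 1.0]])    # F'(0) for F = (+-u*v + v^2, v + u^2)
    r = np.array([1.0, 0.0])                   # right null vector of F'(0)
    ell = np.array([1.0, 0.0])                 # left null vector of F'(0)

    A = np.linalg.solve(B, Fp)                 # B^{-1}[F'(U*) - s* I] with s* = 0
    ell_B_r = ell @ B @ r                      # condition (D3): should vanish

    # Minimum-norm least-squares solution of A v = r*; exact when (BT) holds.
    v, *_ = np.linalg.lstsq(A, r, rcond=None)

    Frr = np.array([0.0, 2.0])                 # F''(r*, r*), second u-derivatives of F
    w = np.linalg.solve(B, Frr)                # B^{-1} F''(r*, r*)
    c = w @ r                                  # its component along r*
    if not np.allclose(w, c * r):
        raise ValueError("B^{-1} F''(r*, r*) is not parallel to r*")
    return ell_B_r, A, v, c
```

Here the matrix `A` comes out as a single nilpotent Jordan block, which is the algebraic content of condition (BT).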

Proposition 2.2. Assume conditions (N1) and (N2). Then conditions (D1), (V), and (BT) imply conditions (D1)–(D3). Conversely, if $n = 2$ and $F''(U^*)(r^*, r^*) \neq 0$, conditions (D1)–(D3) imply conditions (D1), (V), and (BT).

Proof. By Proposition 2.1, conditions (D1) and (BT) imply condition (D3). Furthermore, condition (D3) combines with condition (V) to yield condition (D2). Thus we deduce the first statement. Conversely: conditions (D1) and (D3) imply condition (BT), by Proposition 2.1; and condition (D2) implies that the vector $\hat\ell^* = \ell^* B(U^*)$ satisfies

$$\hat\ell^* B(U^*)^{-1} F''(U^*)(r^*, r^*) = 0, \tag{2.3}$$

whereas condition (D3) says that $\hat\ell^* r^* = 0$, so that condition (V) follows if $n = 2$ and $F''(U^*)(r^*, r^*) \neq 0$.

3. DRS Bifurcation

In this section, we prove Theorem 1.2 by transforming Eq. (1.5) to the normal form of Dumortier et al. [13]. To do this, we restrict to the case $n = 2$; furthermore we require the flux function $F$ to be quadratic and the viscosity matrix $B(U)$ to be independent of $U$. In this setting, the problem is simplified to the one of transforming Eq. (1.5) to a quadratic normal form of Fiddelaers [15], who has exhibited a transformation to the DRS normal form. We expect that a similar analysis could be carried out for more general models by a direct reduction to the DRS normal form, but the calculations would be more complicated.

3.1. Strategy. We describe a sequence of transformations of the family of vector fields (1.5). Each transformation is constructed from three smooth (i.e., $C^\infty$) maps: a map $\Phi_i$ of the phase plane, a map $\Psi_i$ of bifurcation parameters, and a map $\Xi_i$ scaling the vector field. Similarly, the composite of the transformations involves maps denoted $\Phi$, $\Psi$, and $\Xi$. Our result is that, under the change of phase plane variables $(x, y)^T = X = \Phi(U; U_-, s)$ and bifurcation parameters $(\mu_1, \mu_2, \nu)^T = \Lambda = \Psi(U_-, s)$, the system (1.4) becomes the system

$$\dot X = \Xi(X; \Lambda)\,\mathcal{X}(X; \Lambda), \tag{3.1}$$

where

$$\mathcal{X}(X; \Lambda) = \begin{pmatrix} y \\ \mu_1 + \mu_2 x + \epsilon_1 x^3 + y\left[\nu + b(\Lambda)\,x + \epsilon_2 x^2 + x^3 h(x, \Lambda)\right] + y^2 q(x, y, \Lambda) \end{pmatrix}. \tag{3.2}$$

Here $\epsilon_1$ and $\epsilon_2$ are $\pm 1$, and the functions $b$, $h$, and $q$ are smooth, with $b > 0$, $b \neq 2\sqrt{2}$ if $\epsilon_1 = -1$, and $q(x, y, \Lambda)$ of arbitrarily high order in $(x, y, \Lambda)$. The vector field (3.2) is the normal form of Dumortier et al. [13]. The saddle case corresponds to $\epsilon_1 = 1$, the elliptic case to $\epsilon_1 = -1$ and $b > 2\sqrt{2}$, and the focus case to $\epsilon_1 = -1$ and $b < 2\sqrt{2}$.

The nature of the composite transformation can be understood as follows. Let

$$G_s(U) = B^{-1}[F(U) - sU], \tag{3.3}$$

so that the family of vector fields (1.5) is

$$\mathcal{U}(U; U_-, s) = G_s(U) - G_s(U_-). \tag{3.4}$$

In terms of $G_s$, the bifurcation conditions (D1), (D2), and (D3) for the point $(U^*, s^*)$ are:
(D1′) $G'_{s^*}(U^*)$ is singular;
(D2′) $\hat\ell^* G''_{s^*}(U^*)(r^*, r^*) = 0$;
(D3′) $\hat\ell^* r^* = 0$.
Here $r^*$ and $\hat\ell^* = \ell^* B$ are right and left null vectors of $G'_{s^*}(U^*)$. Conditions (D1′) and (D2′) say that $G_{s^*}$ has a cusp singularity [49] at $U^*$ (provided that certain nondegeneracy conditions are satisfied). Therefore we expect that the parameters $(U_-, s)$ do not constitute universal parameters, i.e., the relationship $\Lambda = \Psi(U_-, s)$ is not invertible.

This observation suggests a two-stage approach for relating system (1.4) to system (3.1). In the first stage, we transform the family $\mathcal{U}$ to the auxiliary family of vector fields

$$\mathcal{U}_0(U; G_-, s) = G_s(U) - G_-, \tag{3.5}$$

parameterized by $(G_-, s)$. The map of bifurcation parameters from $(U_-, s)$ to $(G_-, s)$, where $G_- = G_s(U_-)$, has a cusp at $(U^*, s^*)$, as explained in Sect. 3.3. In the second stage, which is carried out in Sect. 3.4, we relate the family $\mathcal{U}_0$ to the normal form (3.2). The map of bifurcation parameters from $(G_-, s)$ to $\Lambda$ turns out to be invertible. Because the cusp map is surjective, the composite map of bifurcation parameters is surjective, and DRS bifurcation occurs.

3.2. Notation and assumptions. As $n = 2$, we may simplify our notation: the eigenvalues of $F'(U)$ are denoted $\lambda(U)$ and $\tilde\lambda(U)$, where $\lambda(U^*) = s^*$; the corresponding right eigenvectors are denoted $r(U)$ and $\tilde r(U)$; and the corresponding left eigenvectors are denoted $\ell(U)$ and $\tilde\ell(U)$. Let $R(U) = (r(U), \tilde r(U))$ and $L(U) = R(U)^{-1}$, so that the first and second rows of $L(U)$ are $\ell(U)$ and $\tilde\ell(U)$, respectively. For simplicity, we use an asterisk to indicate when a quantity has been evaluated at $U^*$; e.g., $R^* = R(U^*)$.

To carry out the bifurcation analysis, we make several nondegeneracy assumptions in addition to assuming conditions (N1) and (N2). First we require that
(N4) $\tilde\ell^* F''(U^*)(r^*, r^*) \neq 0$,
(N5) $\ell^* F''(U^*)(r^*, \tilde r^*) \neq 0$.
These conditions are used to prove that $G_{s^*}$ has a cusp singularity at $U^*$. Notice that condition (N4) implies the condition $F''(U^*)(r^*, r^*) \neq 0$ appearing in Proposition 2.2.

The following result is used in the first step of building the transformation.

Lemma 3.1. Assume conditions (D1′)–(D3′), (N1), (N2), (N4), and (N5). Then:

$$L^* B^{-1} R^* = \begin{pmatrix} \alpha^* & \beta^* \\ \gamma^* & 0 \end{pmatrix}, \tag{3.6}$$

where $\alpha^* > 0$ and $-\beta^* \gamma^* > 0$;

$$L^* G'_{s^*}(U^*) R^* = \begin{pmatrix} 0 & g^* \\ 0 & 0 \end{pmatrix} \tag{3.7}$$

with $g^* \neq 0$; and if $X = (x, y)^T$, then

$$\tfrac{1}{2} L^* G''_{s^*}(U^*)(R^* X, R^* X) = \begin{pmatrix} l^* x^2 + m^* xy + n^* y^2 \\ p^* xy + q^* y^2 \end{pmatrix}, \tag{3.8}$$

where $l^* \neq 0$ and $p^* \neq 0$.

Proof. By Cramer’s rule, the 2,2-component of the $2 \times 2$ matrix $L^* B^{-1} R^*$ is proportional to the 1,1-component of $L^* B R^*$, which vanishes by condition (D3). Therefore $L^* B^{-1} R^*$ has the form (3.6), with $\alpha^* = \operatorname{tr}(B^{-1}) > 0$ and $-\beta^* \gamma^* = (\det B)^{-1} > 0$ by assumption (N2). Since

$$L^* G'_{s^*}(U^*) R^* = L^* B^{-1} R^* \begin{pmatrix} 0 & 0 \\ 0 & \tilde\lambda^* - \lambda^* \end{pmatrix}, \tag{3.9}$$

this matrix has the form (3.7) with $g^* = \beta^* (\tilde\lambda^* - \lambda^*) \neq 0$ by condition (N1). By assumption (D3′), the vector $\hat\ell^* = \ell^* B$ is proportional to $\tilde\ell^*$, so that $\tilde\ell^* G''_{s^*}(U^*)(r^*, r^*) = 0$ by assumption (D2′). Hence Eq. (3.8) holds. Moreover, because of the equivalent condition (D2), the coefficient $l^* = \tfrac{1}{2} \ell^* G''_{s^*}(U^*)(r^*, r^*)$ is given by $l^* = \tfrac{1}{2} \ell^* B^{-1} R^* L^* F''(U^*)(r^*, r^*) = \tfrac{1}{2} \beta^* \tilde\ell^* F''(U^*)(r^*, r^*)$, so that $l^* \neq 0$ by condition (N4). Similarly, the coefficient $p^* = \tilde\ell^* G''_{s^*}(U^*)(r^*, \tilde r^*) = \gamma^* \ell^* F''(U^*)(r^*, \tilde r^*)$ is nonzero by condition (N5).

In terms of the coefficients just defined, the remaining nondegeneracy conditions are:
(N6) $2l^* + p^* \neq 0$;
(N7) $2l^* - p^* \neq 0$;
(N8) $3l^* - p^* \neq 0$;
(N9) $2l^* q^* - p^* q^* - p^* m^* \neq 0$;
(N10) $\gamma^* p^* m^* - \alpha^* (p^*)^2 - 4\gamma^* l^* q^* \neq 0$;
(N11) $l^* - p^* \neq 0$.
These conditions are used to verify the nondegeneracy conditions of Fiddelaers and the nonsingularity of the transformation of parameters.

We shall see that the sign of $l^* p^*$ determines the type of the DRS point: saddle type if $l^* p^* < 0$ and elliptic type if $l^* p^* > 0$. As $-\beta^* \gamma^* > 0$, the formulae for $l^*$ and $p^*$ derived in the preceding proof imply that $l^*$ and $p^*$ have the same sign if and only if $\tilde\ell^* F''(U^*)(r^*, r^*)$ and $\ell^* F''(U^*)(r^*, \tilde r^*)$ have opposite signs. Of course, condition (N6) is automatically satisfied in the elliptic case, whereas conditions (N7), (N8), and (N11) are automatic in the saddle case.
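The quantities in Lemma 3.1 can be computed directly for a concrete model. The sketch below, which is an illustration and not part of the paper, uses the simple example that the paper states immediately afterwards as Eq. (3.11), with $U^* = 0$ and $s^* = 0$. Since $F'(0) = \operatorname{diag}(0, 1)$ for that example, we take $R^* = L^* = I$. The function names and the dictionary keys are hypothetical.

```python
import numpy as np

def lemma31_coefficients(sign):
    """Coefficients of Lemma 3.1 for example (3.11) at U* = 0, s* = 0.

    sign = +1 or -1 selects the flux F = (sign*u*v + v^2, v + u^2).
    Because R* = L* = I here, L* B^{-1} R* is simply B^{-1}.
    """
    B = np.array([[0.0, -2.0], [1.0, 1.0]])       # viscosity matrix of (3.11)
    Binv = np.linalg.inv(B)
    # Hessians of the two flux components (constant for a quadratic flux).
    hess = np.array([[[0.0, sign], [sign, 2.0]],
                     [[2.0, 0.0], [0.0, 0.0]]])
    # Symmetric matrices Q[i] with (1/2) [B^{-1} F''(X, X)]_i = X^T Q[i] X.
    Q = 0.5 * np.einsum("ij,jkl->ikl", Binv, hess)
    c = {
        "alpha": Binv[0, 0], "beta": Binv[0, 1],
        "gamma": Binv[1, 0], "corner": Binv[1, 1],   # corner is the 2,2 entry in (3.6)
        "l": Q[0, 0, 0], "m": 2.0 * Q[0, 0, 1], "n": Q[0, 1, 1],
        "x2_in_second_row": Q[1, 0, 0],              # absent from (3.8)
        "p": 2.0 * Q[1, 0, 1], "q": Q[1, 1, 1],
    }
    c["type"] = "saddle" if c["l"] * c["p"] < 0 else "elliptic"
    return c

def nondegeneracy_values(c):
    """Left-hand sides of (N6)-(N11); each should be nonzero."""
    l, m, p, q = c["l"], c["m"], c["p"], c["q"]
    return {
        "N6": 2 * l + p, "N7": 2 * l - p, "N8": 3 * l - p,
        "N9": 2 * l * q - p * q - p * m,
        "N10": c["gamma"] * p * m - c["alpha"] * p**2 - 4 * c["gamma"] * l * q,
        "N11": l - p,
    }
```

Calling `lemma31_coefficients(+1)` and `lemma31_coefficients(-1)` reproduces the sign rule stated above: the product $l^* p^*$ is negative for the upper sign and positive for the lower sign.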
Lemma 3.1 shows that, under the hypotheses of Theorem 1.3, the partial differential equation (1.2) is linearly equivalent to the following normal form:

$$\begin{pmatrix} u \\ v \end{pmatrix}_t + \begin{pmatrix} \pm uv + q v^2 \\ v + u^2 + m uv + n v^2 \end{pmatrix}_x = \begin{pmatrix} -a v \\ b u + v \end{pmatrix}_{xx}. \tag{3.10}$$

In this normal form: the DRS point is located at the origin and has zero speed; the upper and lower signs correspond to the saddle and elliptic cases, respectively; and the nondegeneracy conditions (N1), (N2), and (N3)–(N11) read $ab > 0$, $b \neq 2a$, $b \neq 3a$ and $b \neq a$ in the elliptic case, $2aq \pm bq \mp am \neq 1$, and $\pm m \neq 4q$. The equivalence is shown by first changing $U$ to $L^*(U - U^*)$ and $x$ to $x - s^* t$, and then scaling $x$, $t$, and $U$ appropriately. Simple examples of the two cases are

$$\begin{pmatrix} u \\ v \end{pmatrix}_t + \begin{pmatrix} \pm uv + v^2 \\ v + u^2 \end{pmatrix}_x = \begin{pmatrix} -2v \\ u + v \end{pmatrix}_{xx}. \tag{3.11}$$

These examples show that there exist many conservation laws satisfying the combination of conditions (D1)–(D3), (N1), (N2), and (N3)–(N11).

Remark. Equation (3.10) might misleadingly suggest that the viscosity matrix must be “far” from the identity matrix in order to have a nondegenerate DRS point. In fact, Theorem 1.3 shows that for all but a codimension-one variety of quadratic flux parameters, nondegenerate DRS points occur for viscosity matrices chosen arbitrarily close to (but not equal to) the identity matrix.

3.3. Cusp singularity. To effect the transformation from family $\mathcal{U}$ to the auxiliary family $\mathcal{U}_0$, we use the transformation $(G_-, s)^T = \Psi_0(U_-, s)$ of bifurcation parameters defined by $G_- = G_s(U_-)$. The transformation of phase plane and the scaling of the vector field are trivial: $\Phi_0(U; U_-, s) = U$ and $\Xi_0(U; G_-, s) = 1$.

3.3.1. Inflection curve. As we shall see, the inflection curve, i.e., the set of points at which genuine nonlinearity fails, is the curve of cusp points for the map $\Psi_0$. First we establish the following result.

Lemma 3.2. The inflection curve is locally a smooth curve through $U^*$ that is transverse to $r^*$ and can be parameterized as $s \mapsto U_I(s)$ in such a way that $s = \lambda(U_I(s))$.

Proof. Let $I(U) = \ell(U) F''(U)(r(U), r(U))$, so that the inflection locus is defined by the equation $I(U) = 0$. According to condition (D2), the point $U^*$ belongs to the inflection locus. To establish that the inflection locus is locally a smooth curve through $U^*$ that is transverse to $r^*$, we show that $I'(U^*) r^* \neq 0$ and invoke the Implicit Function Theorem.
Using condition (D2), we find that

$$I'(U^*) r^* = [\ell'(U^*) r^*]\, F''(U^*)(r^*, r^*) + 2 \ell^* F''(U^*)(r^*, r'(U^*) r^*) \tag{3.12}$$

$$\phantom{I'(U^*) r^*} = [\ell'(U^*) r^*]\, \tilde r^*\; \tilde\ell^* F''(U^*)(r^*, r^*) + 2 \ell^* F''(U^*)(r^*, \tilde r^*)\; \tilde\ell^* r'(U^*) r^*. \tag{3.13}$$

The identities $F'(U) r(U) = \lambda(U) r(U)$ and $\ell(U) F'(U) = \lambda(U) \ell(U)$ imply that

$$(\lambda^* - \tilde\lambda^*)\, \tilde\ell^* r'(U^*) r^* = \tilde\ell^* F''(U^*)(r^*, r^*), \tag{3.14}$$

$$(\lambda^* - \tilde\lambda^*)\, [\ell'(U^*) r^*]\, \tilde r^* = \ell^* F''(U^*)(r^*, \tilde r^*). \tag{3.15}$$

Therefore, as $\lambda^* \neq \tilde\lambda^*$ by condition (N1),

$$I'(U^*) r^* = 3\, \ell^* F''(U^*)(r^*, \tilde r^*)\; \tilde\ell^* F''(U^*)(r^*, r^*) / (\lambda^* - \tilde\lambda^*), \tag{3.16}$$

which is nonzero by conditions (N4) and (N5). Thus the inflection locus can be parameterized by a curve $\sigma \mapsto \hat U(\sigma)$ such that $\hat U(0) = U^*$ and $\tilde\ell^* \hat U'(0) \neq 0$.

This curve can be reparameterized by the speed $s$ provided that the relationship $s = \lambda(\hat U(\sigma))$ can be solved for $\sigma$ in terms of $s$, which follows if $\lambda'(U^*) \hat U'(0) \neq 0$. To see this, notice that the identity $F'(U) r(U) = \lambda(U) r(U)$ implies that

$$\lambda'(U^*) r^* = \ell^* F''(U^*)(r^*, r^*), \tag{3.17}$$

$$\lambda'(U^*) \tilde r^* = \ell^* F''(U^*)(r^*, \tilde r^*). \tag{3.18}$$

Consequently,

$$\lambda'(U^*) \hat U'(0) = \ell^* F''(U^*)(r^*, \tilde r^*)\; \tilde\ell^* \hat U'(0) \tag{3.19}$$

is nonzero.
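Formula (3.16) can be checked numerically. The sketch below, which is an illustration rather than part of the proof, evaluates the inflection function $I(U)$ for the flux of example (3.11) with the upper sign, $F(u, v) = (uv + v^2,\; v + u^2)$, and compares a central finite difference of $I$ in the direction $r^* = (1, 0)^T$ at $U^* = 0$ with the right-hand side of (3.16). The helper names and the step size `h` are our own choices. Eigenvectors are normalized by NumPy to unit length, with signs fixed so that they vary continuously near $U^*$, and $L = R^{-1}$ as in Sect. 3.2.

```python
import numpy as np

# Constant Hessians of F1 = u*v + v^2 and F2 = v + u^2.
HESS = np.array([[[0.0, 1.0], [1.0, 2.0]],
                 [[2.0, 0.0], [0.0, 0.0]]])

def jacobian(U):
    """F'(U) for the flux F(u, v) = (u*v + v^2, v + u^2)."""
    u, v = U
    return np.array([[v, u + 2.0 * v], [2.0 * u, 1.0]])

def bilinear(a, b):
    """The vector F''(a, b)."""
    return np.array([a @ HESS[0] @ b, a @ HESS[1] @ b])

def eigen_frame(U):
    """Return (lam, lam_tilde, R, L): the family with speed nearest s* = 0 first."""
    vals, vecs = np.linalg.eig(jacobian(U))
    order = np.argsort(np.abs(vals))
    vals, vecs = vals[order], vecs[:, order]
    if vecs[0, 0] < 0:          # keep r close to (1, 0)
        vecs[:, 0] *= -1.0
    if vecs[1, 1] < 0:          # keep r-tilde close to (0, 1)
        vecs[:, 1] *= -1.0
    return vals[0], vals[1], vecs, np.linalg.inv(vecs)

def inflection_function(U):
    """I(U) = l(U) F''(r(U), r(U))."""
    _, _, R, L = eigen_frame(U)
    r = R[:, 0]
    return L[0] @ bilinear(r, r)

def derivative_along_r(h=1e-4):
    """Central difference of I at U* = 0 in the direction r* = (1, 0)."""
    step = np.array([h, 0.0])
    return (inflection_function(step) - inflection_function(-step)) / (2.0 * h)

def formula_3_16():
    """Right-hand side of (3.16) evaluated at U* = 0."""
    lam, lam_t, R, L = eigen_frame(np.zeros(2))
    r, r_t = R[:, 0], R[:, 1]
    return 3.0 * (L[0] @ bilinear(r, r_t)) * (L[1] @ bilinear(r, r)) / (lam - lam_t)
```

Both `derivative_along_r()` and `formula_3_16()` should agree up to discretization error, and both are nonzero, consistent with conditions (N4) and (N5) holding for this example.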

3.3.2. Cusp. Let us now fix $s$ and focus on the map $G_- = G_s(U_-)$. Following the procedure of Whitney [49], we prove the following result.

Lemma 3.3. The map $G_s$ has a cusp at $U_I(s)$ for each $s$ sufficiently near to $s^*$.

Proof. First we change independent variables from $U_-$ to $X_- = (x_-, y_-)^T$ using

$$U_- = U_I(s) + R(U_I(s)) X_-, \tag{3.20}$$

and we change dependent variables from $G_-$ to $Z_- = (w_-, z_-)^T$ using

$$G_- = G_s(U_I(s)) + B^{-1} R(U_I(s)) Z_-. \tag{3.21}$$

Then $Z_- = Z(X_-)$, where

$$Z(X_-) = L\,[F' - sI]\,R X_- + \tfrac{1}{2} L F''(R X_-, R X_-). \tag{3.22}$$

(Here and in the remainder of the present subsection, functions such as $L$, $R$, $F'$, etc., are evaluated at $U = U_I(s)$.) More explicitly,

$$Z(X_-) = \begin{pmatrix} \ell F''(r, \tilde r)\, x_- y_- + O(y_-^2) \\ (\tilde\lambda - \lambda)\, y_- + \tfrac{1}{2} \tilde\ell F''(r, r)\, x_-^2 + y_- O(|x_-| + |y_-|) \end{pmatrix}. \tag{3.23}$$

The singular set for $Z$ is the zero-set of the Jacobian determinant $J = \det Z'$. As

$$Z'(X_-) = \begin{pmatrix} \ell F''(r, \tilde r)\, y_- & \ell F''(r, \tilde r)\, x_- + O(y_-) \\ \tilde\ell F''(r, r)\, x_- + O(y_-) & \tilde\lambda - \lambda + O(|x_-| + |y_-|) \end{pmatrix}, \tag{3.24}$$

the Jacobian determinant is

$$J(X_-) = (\tilde\lambda - \lambda)\, \ell F''(r, \tilde r)\, y_- - \tilde\ell F''(r, r)\, \ell F''(r, \tilde r)\, x_-^2 + y_- O(|x_-| + |y_-|). \tag{3.25}$$

Because $J$ vanishes at $X_- = 0$, the map $Z$ is singular. Moreover, by virtue of conditions (N1) and (N5), the coefficient $(\tilde\lambda - \lambda)\, \ell F''(r, \tilde r)$ is nonzero (for $s$ sufficiently close to $s^*$), so that the singular set $J = 0$ forms a smooth curve $\mathcal{C}$ through $X_- = 0$ tangent to the $x_-$-axis. The vector field

$$\dot X_- = (-J_{y_-},\; J_{x_-})^T \tag{3.26}$$

$$\phantom{\dot X_-} = \left( -(\tilde\lambda - \lambda)\, \ell F''(r, \tilde r) + O(|x_-| + |y_-|),\;\; -2\, \tilde\ell F''(r, r)\, \ell F''(r, \tilde r)\, x_- + O(y_-) \right)^T \tag{3.27}$$

is tangent to the singular set at each point of this set. The image of this vector field is

$$\dot Z_- = Z'(X_-) \dot X_- \tag{3.28}$$

$$\phantom{\dot Z_-} = \left( O(y_-) + O(|x_-|^2 + |y_-|^2),\;\; -3 (\tilde\lambda - \lambda)\, \tilde\ell F''(r, r)\, \ell F''(r, \tilde r)\, x_- + O(y_-) \right)^T, \tag{3.29}$$

and this vector field vanishes at $X_- = 0$, as is necessary to have a cusp. On the other hand, differentiating the components of $\dot Z_-$ in the direction of $\dot X_-$ and evaluating at $X_- = 0$ yields

$$(\dot Z_-)' \dot X_- = \left( 0,\;\; 3 (\tilde\lambda - \lambda)^2\, \tilde\ell F''(r, r)\, [\ell F''(r, \tilde r)]^2 \right)^T. \tag{3.30}$$

Invoking conditions (N1), (N4), and (N5), we see that this vector is nonzero. Therefore $Z$ has a cusp at $X_- = 0$, i.e., $G_s$ has a cusp at $U_I(s)$ for each $s$ sufficiently near to $s^*$.

3.3.3. Structure of the cusp. From Eq. (3.25) we see that the fold curve $\mathcal{C}$ for $Z$ is given by the graph

$$y_- = \frac{\tilde\ell F''(r, r)}{\tilde\lambda - \lambda}\, x_-^2 + O(|x_-|^3), \tag{3.31}$$

as illustrated in Fig. 3.1 (a). The image of this curve is the graph

$$Z_- = \begin{pmatrix} \ell F''(r, \tilde r)\, \tilde\ell F''(r, r)\, x_-^3 / (\tilde\lambda - \lambda) + O(x_-^4) \\ \tfrac{3}{2} \tilde\ell F''(r, r)\, x_-^2 + O(|x_-|^3) \end{pmatrix}, \tag{3.32}$$

which is the cusp $G_s[\mathcal{C}]$ shown in Fig. 3.1 (b). The preimage of the cusp under $G_s$ consists of the fold curve $\mathcal{C}$ together with another curve $\mathcal{F}$ with graph

$$y_- = -\frac{1}{8}\, \frac{\tilde\ell F''(r, r)}{\tilde\lambda - \lambda}\, x_-^2 + O(|x_-|^3), \tag{3.33}$$

as is verified by calculating the image of this curve. The map $G_s$ carries region $A$ bijectively onto region $A'$, and it carries each of the three regions $B_1$, $B_2$, and $B_3$ bijectively onto region $B'$.

3.4. Transformations. We now proceed with the series of transformations of the auxiliary family of vector fields (3.5). In this stage of the calculation, it is not necessary to determine the exact form of the transformation; only terms of lowest order in

\[ \epsilon = \max\{\,|G_- - G_{s^*}(U^*)|,\; |s - s^*|\,\} \tag{3.34} \]

need to be retained, and we take advantage of this fact to keep the calculation manageable.

3.4.1. Step 1. The transformation $\Phi_1$ effects the change of variables from $U$ to $X = (x, y)^T$, where $x$ and $y$ are coordinates along the eigenvectors:

\[ U = U^* + x\,r^* + y\,\tilde r^*. \tag{3.35} \]

Therefore $X = \Phi_1(U; G_-, s) = L^*(U - U^*)$. Under this change of variables, the family (3.5) becomes the family

\[ L^*\big[G_s(U^* + R^*X) - G_-\big] = L^*\big[G_s(U^*) + G_s'(U^*)R^*X + \tfrac12 G_s''(U^*)(R^*X, R^*X) - G_-\big] \tag{3.36} \]

\[ \phantom{L^*} = L^*G_{s^*}'(U^*)R^*X + \tfrac12 L^*G_{s^*}''(U^*)(R^*X, R^*X) - (s - s^*)L^*B^{-1}R^*X - L^*\big[G_- - G_s(U^*)\big], \tag{3.37} \]

which we write as

\[ U_1(X; h, k, \sigma) = \begin{pmatrix} h - \sigma\alpha^* x + (g^* - \sigma\beta^*)\,y + l^*x^2 + m^*xy + n^*y^2 \\ k - \sigma\gamma^* x + p^*xy + q^*y^2 \end{pmatrix}. \tag{3.38} \]

Fig. 3.1. Behavior of the map $G_s$: (a) domain of $G_s$, with coordinates $(x_-, y_-)$; (b) codomain of $G_s$, with coordinates $(w_-, z_-)$. This map has a cusp at the point $U_I(s)$ on the inflection curve. The fold curve $C$ maps to the cusp $G_s[C]$, which has preimage consisting of the curve $F$ as well as $C$. Region $A$ is mapped bijectively onto region $A'$, and regions $B_1$, $B_2$, and $B_3$ are each mapped bijectively onto region $B'$.
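The cusp geometry of Fig. 3.1 and Eqs. (3.31)–(3.33) can be checked numerically on the leading-order part of the map $Z$ in Eq. (3.23). The following is a minimal sketch, not part of the original argument: the $O(\cdot)$ remainders are dropped, and the names `c`, `d`, `a` are hypothetical stand-ins for $\ell F''(r,\tilde r)$, $\tilde\ell F''(r,r)$ and $\tilde\lambda - \lambda$.

```python
# Leading-order model of Z in Eq. (3.23); all O(.) remainders are dropped.
# c, d, a are hypothetical stand-ins for  l F''(r, r~),  l~ F''(r, r),  lambda~ - lambda.

def truncated_map(x, y, c, d, a):
    """Truncated map: w = c*x*y, z = a*y + d*x^2/2."""
    return (c * x * y, a * y + 0.5 * d * x * x)

def jacobian_det(x, y, c, d, a):
    """Determinant of the derivative of the truncated map, cf. Eq. (3.25)."""
    return (c * y) * a - (c * x) * (d * x)

def fold_curve(x, d, a):
    """Fold curve C to leading order, Eq. (3.31): y = d*x^2/a."""
    return d * x * x / a

def second_preimage(x, d, a):
    """Curve F to leading order, Eq. (3.33): y = -(1/8)*d*x^2/a."""
    return -d * x * x / (8.0 * a)

if __name__ == "__main__":
    c, d, a = 1.3, 0.7, 2.0
    for x0 in (0.1, -0.25, 0.5):
        on_fold = truncated_map(x0, fold_curve(x0, d, a), c, d, a)
        # The point of F with abscissa -2*x0 has the same image as the fold point at x0.
        x1 = -2.0 * x0
        on_F = truncated_map(x1, second_preimage(x1, d, a), c, d, a)
        print(x0, on_fold, on_F, jacobian_det(x0, fold_curve(x0, d, a), c, d, a))
```

In this truncated model the factor $1/8$ in Eq. (3.33) comes from the root $t = -2$ of $t^3 - 3t + 2 = (t-1)^2(t+2)$, obtained by matching the image of the point with abscissa $t x_0$ to the image of the fold point with abscissa $x_0$.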

The correspondence $(h, k, \sigma) = \Psi_1(G_-, s)$ between bifurcation parameters is defined by

\[ (h, k)^T = -L^*\big[G_- - G_s(U^*)\big], \tag{3.39} \]

\[ \sigma = s - s^*, \tag{3.40} \]

and $\Xi_1(X; h, k, \sigma) = 1$. Notice that $h$, $k$, and $\sigma$ are $O(\epsilon)$.

3.4.2. Step 2. Next we apply the transformation

\[ (\bar x, \bar y)^T = \bar X = \Phi_2(X; h, k, \sigma) = \big((x - \zeta)/\Gamma,\; y - \eta\big)^T. \tag{3.41} \]

We find that, in the equation for $\dot{\bar x}$, the coefficient of $\bar x$ vanishes if

\[ \zeta = \frac{\sigma\alpha^* - \eta m^*}{2l^*}. \tag{3.42} \]

With $\zeta$ chosen in this way, the constant coefficient vanishes provided that $\eta$ is a root of

\[ h - \frac{(\sigma\alpha^*)^2}{4l^*} + \Big[g^* + \sigma\Big(\frac{m^*\alpha^*}{2l^*} - \beta^*\Big)\Big]\eta + \Big[n^* - \frac{(m^*)^2}{4l^*}\Big]\eta^2 = 0; \tag{3.43} \]

we choose the root such that $\eta = -(h/g^*)[1 + O(\epsilon)]$. Also, by taking

\[ \Gamma = g^* - \sigma\beta^* + \zeta m^* + 2\eta n^*, \tag{3.44} \]

the coefficient of $\bar y$ is 1. Therefore this transformation puts the family (3.38) in the normal form used by Fiddelaers [15, Eq. (3.19)]:

\[ U_2(\bar X; \lambda_1, \lambda_2, \lambda_3) = \begin{pmatrix} \bar y + \bar a\bar x^2 + \bar b\bar x\bar y + \bar c\bar y^2 \\ \lambda_1 + \lambda_2\bar x + \lambda_3\bar y + \bar e\bar x\bar y + \bar f\bar y^2 \end{pmatrix}. \tag{3.45} \]

The correspondence $(\lambda_1, \lambda_2, \lambda_3) = \Psi_2(h, k, \sigma)$ between bifurcation parameters is defined by

\[ \lambda_1 = k + \big[-\alpha^*\gamma^*\sigma^2 + (\gamma^*m^* + \alpha^*p^*)\sigma\eta + (2l^*q^* - m^*p^*)\eta^2\big]/(2l^*) = k + O(\epsilon^2), \tag{3.46} \]

\[ \lambda_2 = \Gamma\,(-\sigma\gamma^* + \eta p^*) = -\sigma\gamma^*g^* - hp^* + O(\epsilon^2), \tag{3.47} \]

\[ \lambda_3 = p^*\zeta + 2q^*\eta = \big[\sigma\alpha^*p^* + h\,(m^*p^* - 4l^*q^*)/g^* + O(\epsilon^2)\big]/(2l^*), \tag{3.48} \]

and $\Xi_2(\bar X; \lambda_1, \lambda_2, \lambda_3) = 1$. The other parameters appearing in the vector field $U_2$ are

\[ \bar a = \Gamma l^* = g^*l^*\,[1 + O(\epsilon)], \tag{3.49} \]

\[ \bar b = m^*, \tag{3.50} \]

\[ \bar c = n^*/\Gamma = (n^*/g^*)\,[1 + O(\epsilon)], \tag{3.51} \]

\[ \bar e = \Gamma p^* = g^*p^*\,[1 + O(\epsilon)], \tag{3.52} \]

\[ \bar f = q^*. \tag{3.53} \]
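The algebra of Step 2 can be confirmed numerically. The sketch below is an illustration only: it takes hypothetical numerical values for the starred coefficients, solves Eq. (3.43) for the small root $\eta$, forms $\zeta$ and $\Gamma$ from Eqs. (3.42) and (3.44), and reads off the coefficients of the transformed field by evaluating it at a few points (which is exact for quadratic polynomials). The function names and the sample values are assumptions, not taken from the paper.

```python
import math

def step2(h, k, sigma, al, be, ga, g, l, m, n, p, q):
    """Return (zeta, eta, Gamma) from Eqs. (3.42)-(3.44) for the family (3.38)."""
    # Quadratic (3.43) in eta:  A*eta^2 + Bc*eta + C = 0
    A = n - m * m / (4 * l)
    Bc = g + sigma * (m * al / (2 * l) - be)
    C = h - (sigma * al) ** 2 / (4 * l)
    if abs(A) < 1e-14:
        eta = -C / Bc
    else:
        disc = math.sqrt(Bc * Bc - 4 * A * C)
        # small root, eta ~ -h/g, written in a numerically stable form
        eta = -2 * C / (Bc + math.copysign(disc, Bc))
    zeta = (sigma * al - eta * m) / (2 * l)
    Gamma = g - sigma * be + zeta * m + 2 * eta * n
    return zeta, eta, Gamma

def U1(x, y, h, k, sigma, al, be, ga, g, l, m, n, p, q):
    """The family (3.38)."""
    return (h - sigma * al * x + (g - sigma * be) * y + l * x * x + m * x * y + n * y * y,
            k - sigma * ga * x + p * x * y + q * y * y)

def transformed_field(xb, yb, params):
    """Field in barred variables, using x = zeta + Gamma*xb and y = eta + yb."""
    zeta, eta, Gamma = step2(*params)
    fx, fy = U1(zeta + Gamma * xb, eta + yb, *params)
    return (fx / Gamma, fy)

def low_order_coefficients(params):
    """Constant, xb-linear and yb-linear coefficients of both components."""
    f = lambda xb, yb: transformed_field(xb, yb, params)
    const = f(0.0, 0.0)
    lin_x = tuple((u - v) / 2 for u, v in zip(f(1.0, 0.0), f(-1.0, 0.0)))
    lin_y = tuple((u - v) / 2 for u, v in zip(f(0.0, 1.0), f(0.0, -1.0)))
    return const, lin_x, lin_y

if __name__ == "__main__":
    # hypothetical sample values: h, k, sigma, alpha, beta, gamma, g, l, m, n, p, q
    params = (0.01, -0.02, 0.03, 0.5, -0.4, 0.8, 1.2, 0.9, -0.3, 0.6, 0.7, -0.5)
    const, lin_x, lin_y = low_order_coefficients(params)
    print("first component (expect 0, 0, 1):", const[0], lin_x[0], lin_y[0])
    print("lambda1, lambda2, lambda3:", const[1], lin_x[1], lin_y[1])
```

The first component should have vanishing constant and $\bar x$ coefficients and unit $\bar y$ coefficient, while the second component returns $\lambda_1$, $\lambda_2$, $\lambda_3$ before any expansion in $\epsilon$.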

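3.4.3. Step 3. Now we appeal to the work of Dumortier and Fiddelaers [12, 15], who prove that the family $U_2$ is a universal unfolding of a DRS point. In his thesis, Fiddelaers exhibits a transformation

\[ (\tilde x, \tilde y)^T = \tilde X = \Phi_3(\bar X; \lambda_1, \lambda_2, \lambda_3) \tag{3.54} \]

of the phase plane and a transformation $(\tilde\mu_1, \tilde\mu_2, \tilde\nu)^T = \tilde\Lambda = \Psi_3(\lambda_1, \lambda_2, \lambda_3)$ of the bifurcation parameters that brings the system $\dot{\bar X} = U_2(\bar X; \lambda_1, \lambda_2, \lambda_3)$ into the form

\[ \dot{\tilde X} = \Xi_3(\tilde X; \tilde\Lambda)\,U_3(\tilde X; \tilde\Lambda), \tag{3.55} \]

where

\[ U_3(\tilde X; \tilde\Lambda) = \begin{pmatrix} \tilde y \\ \tilde\mu_1 + \tilde\mu_2\tilde x + \epsilon_1\tilde x^3 + \tilde y\,\big[\tilde\nu + \tilde b(\tilde\Lambda)\,\tilde x + \tilde c(\tilde\Lambda)\,\tilde x^2 + \tilde x^3\,\tilde h(\tilde x, \tilde\Lambda)\big] + \tilde y^2\,\tilde q(\tilde x, \tilde y, \tilde\Lambda) \end{pmatrix}. \tag{3.56} \]

Here the functions $\tilde b$, $\tilde h$, and $\tilde q$ are smooth, with $\tilde b > 0$ and with $\tilde q(\tilde x, \tilde y, \tilde\Lambda)$ of arbitrarily high order in $(\tilde x, \tilde y, \tilde\Lambda)$. The scaling function is $\Xi_3(\tilde X; \tilde\Lambda) = \bar\phi_1 + O(\epsilon)$, and the transformation $\Psi_3$ of bifurcation parameters is specified by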


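\[ \tilde\mu_1 = \bar\phi_1\lambda_1 + O(\epsilon^2), \tag{3.57} \]

\[ \tilde\mu_2 = \bar\phi_1^2\,(\bar\phi_2 - \bar f)\,\lambda_1 + \bar\phi_1^2\lambda_2 + O(\epsilon^2), \tag{3.58} \]

\[ \tilde\nu = -\bar\phi_1\Big[\bar c + \frac{2\bar a + \bar e}{6\bar a\bar e}\,(3\bar f\bar\phi_2 - \bar c\bar e)\Big]\lambda_1 + \bar\phi_1\,\frac{2\bar a + \bar e}{6\bar a\bar e}\,(3\bar\phi_2 + \bar b - \bar f)\,\lambda_2 - \bar\phi_1\,\frac{2(\bar a - \bar e)}{3\bar e}\,\lambda_3 + O(\epsilon^2), \tag{3.59} \]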

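and the other parameters appearing in the vector field $U_3$ are

\[ \epsilon_1 = -\operatorname{sgn}(\bar a\bar e), \tag{3.60} \]

\[ \tilde b = \bar\phi_1^2\,(2\bar a + \bar e) + O(\epsilon), \tag{3.61} \]

\[ \tilde c = \bar\phi_1^3\,\bar\phi_2\,(3\bar a - \bar e) + O(\epsilon), \tag{3.62} \]

where

\[ \bar\phi_1 = |\bar a\bar e|^{-1/4}, \tag{3.63} \]

\[ \bar\phi_2 = \frac{2\bar a\bar f - \bar e\bar f - \bar e\bar b}{5\bar e}. \tag{3.64} \]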

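3.4.4. Step 4. Finally, we scale system (3.55) to obtain system (3.1), following Ref. [13, 4th Step]. The transformation is

\[ (x, y)^T = X = \Phi_4(\tilde X; \tilde\Lambda) = (\tilde x/\alpha,\; \tilde y/\beta)^T, \tag{3.65} \]

where

\[ \alpha = \operatorname{sgn}(\tilde b)\,|\tilde c|^{-1}, \tag{3.66} \]

\[ \beta = \operatorname{sgn}(\tilde b)\,|\tilde c|^{-2}. \tag{3.67} \]

The scaling is $\Xi_4(X; \Lambda) = |\tilde c|^{-1}$, and the transformation $(\mu_1, \mu_2, \nu)^T = \Lambda = \Psi_4(\tilde\mu_1, \tilde\mu_2, \tilde\nu)$ of bifurcation parameters is given by

\[ \mu_1 = \operatorname{sgn}(\tilde b)\,|\tilde c|^3\,\tilde\mu_1 + O(\epsilon^2), \tag{3.68} \]

\[ \mu_2 = |\tilde c|^2\,\tilde\mu_2 + O(\epsilon^2), \tag{3.69} \]

\[ \nu = |\tilde c|\,\tilde\nu + O(\epsilon^2), \tag{3.70} \]

and the other parameters appearing in the vector field are

\[ \epsilon_2 = \operatorname{sgn}\tilde c, \tag{3.71} \]

\[ b = |\tilde b|. \tag{3.72} \]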
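Composing Steps 2–4 at leading order gives the three quantities that fix the type of the normal form. The following sketch is a hedged illustration: it sets $\epsilon = 0$ (so $\Gamma$ is replaced by $g^*$ in Eqs. (3.49)–(3.53)), and the function names and sample values are hypothetical. The `case_name` helper encodes the usual trichotomy for this normal form as recalled from Ref. [13] (saddle for $\epsilon_1 = 1$; focus or elliptic for $\epsilon_1 = -1$ according to whether $b$ is below or above $2\sqrt2$), which should be treated as an assumption here.

```python
import math

def sgn(v):
    """Sign of a real number as -1, 0 or 1."""
    return (v > 0) - (v < 0)

def normal_form_parameters(g, l, m, n, p, q):
    """Leading-order (epsilon = 0) values of (eps1, eps2, b)."""
    # Step 2, Eqs. (3.49)-(3.53), with Gamma replaced by its leading value g
    a_bar, b_bar, c_bar, e_bar, f_bar = g * l, m, n / g, g * p, q
    # Step 3, Eqs. (3.60)-(3.64), dropping the O(epsilon) corrections
    phi1 = abs(a_bar * e_bar) ** -0.25
    phi2 = (2 * a_bar * f_bar - e_bar * f_bar - e_bar * b_bar) / (5 * e_bar)
    eps1 = -sgn(a_bar * e_bar)
    b_tilde = phi1 ** 2 * (2 * a_bar + e_bar)
    c_tilde = phi1 ** 3 * phi2 * (3 * a_bar - e_bar)
    # Step 4, Eqs. (3.71)-(3.72)
    return eps1, sgn(c_tilde), abs(b_tilde)

def case_name(eps1, b):
    """Assumed trichotomy; b = 2*sqrt(2) with eps1 = -1 is a degenerate case."""
    if eps1 == 1:
        return "saddle"
    return "elliptic" if b > 2 * math.sqrt(2) else "focus"

if __name__ == "__main__":
    # hypothetical sample values of g*, l*, m*, n*, p*, q*
    eps1, eps2, b = normal_form_parameters(1.5, 2.0, 0.3, 0.4, 1.0, -0.7)
    print(eps1, eps2, b, case_name(eps1, b))
```

In terms of the starred coefficients, the computed $b$ reduces to $|2l^* + p^*|/|l^*p^*|^{1/2}$ and $\epsilon_1$ to $-\operatorname{sgn}(l^*p^*)$, since the factors of $g^*$ cancel.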

3.5. Nondegeneracy. The nondegeneracy conditions for the normal form (3.2) of Dumortier et al. [13] are: (i) $\epsilon_1 \ne 0$; (ii) $\epsilon_2 \ne 0$; (iii) $b \ne 0$; and (iv) $b \ne 2\sqrt2$ if $\epsilon_1 = -1$. As in Ref. [15], these conditions are implied by conditions (N1)–(N9): (i) since $g^* \ne 0$ by conditions (N1) and (N2),

\[ \epsilon_1 = -\operatorname{sgn}[\,l^*p^*\,] \tag{3.73} \]

is nonzero by conditions (N4) and (N5); (ii) likewise,

\[ \epsilon_2 = \operatorname{sgn}\big[\,g^*(3l^* - p^*)(2l^*q^* - p^*q^* - p^*m^*)/p^*\,\big] \tag{3.74} \]


is nonzero by conditions (N1), (N2), (N5), (N8), and (N9); (iii) furthermore,

\[ b = \frac{|2l^* + p^*|}{|l^*p^*|^{1/2}} \tag{3.75} \]

is nonzero by condition (N6); and (iv) $b \ne 2\sqrt2$ by condition (N7). In fact, when $\epsilon_1 = -1$, i.e., $l^*p^* > 0$, the inequality $(2l^* + p^*)^2 \ge 8\,l^*p^*$ gives $b \ge 2\sqrt2$ and hence $b > 2\sqrt2$, so that the focus case does not occur for quadratic models. Therefore the possible cases are the saddle case ($\epsilon_1 = 1$, i.e., $l^*p^* < 0$) and the elliptic case ($\epsilon_1 = -1$, i.e., $l^*p^* > 0$).

Concerning the maps of the bifurcation parameters, we observe the following. First, the map $\Psi_1 : (G_-, s) \mapsto (h, k, \sigma)$ is invertible. Second, condition (N10) guarantees the invertibility of $\Psi_2 : (h, k, \sigma) \mapsto (\lambda_1, \lambda_2, \lambda_3)$. Indeed, the derivative of $\Psi_2$, evaluated at $(h, k, \sigma) = (0, 0, 0)$, is

\[ \Psi_2'(0, 0, 0) = \begin{pmatrix} 0 & 1 & 0 \\ -p^* & 0 & -\gamma^*g^* \\ (m^*p^* - 4l^*q^*)/(2l^*g^*) & 0 & \alpha^*p^*/(2l^*) \end{pmatrix}, \tag{3.76} \]

and the determinant of this matrix is nonzero by condition (N10). Third, the map $\Psi_3 : (\lambda_1, \lambda_2, \lambda_3) \mapsto (\tilde\mu_1, \tilde\mu_2, \tilde\nu)$, which has Jacobian determinant $2\bar\phi_1^4(\bar e - \bar a)/(3\bar e)$, is well-defined and invertible because: (i) $\bar a \ne 0$ by conditions (N1), (N2), and (N4); (ii) $\bar e \ne 0$ by conditions (N1), (N2), and (N5); and (iii) $\bar e \ne \bar a$ by condition (N11). Finally, the map $\Psi_4 : (\tilde\mu_1, \tilde\mu_2, \tilde\nu) \mapsto (\mu_1, \mu_2, \nu)$, which has Jacobian determinant $\operatorname{sgn}(\tilde b)\,|\tilde c|^6$, is invertible because of conditions (N1), (N2), (N6), (N8), and (N9).

The composite map of bifurcation parameters carries $(U_-, s)$ to $(G_-, s)^T = \Psi_0(U_-, s)$ to $(\mu_1, \mu_2, \nu)^T = \Psi_4 \circ \Psi_3 \circ \Psi_2 \circ \Psi_1(G_-, s)$. As the cusp map $\Psi_0$ is surjective and the maps $\Psi_1, \ldots, \Psi_4$ are invertible, the composite map is surjective. This concludes the proof of Theorem 1.2.

4. Existence of DRS Points for Planar Quadratic Models

As a codimension-three point in the three-parameter family of dynamical systems (1.4), a nondegenerate DRS point persists under arbitrary, sufficiently small perturbations of the parameters of model (1.1). Hence DRS points occur for an open set of models having regions of linearized instability. However, the occurrence of DRS points for a specific model or class of models has not yet been established. In this section, we investigate further the class of 2-component models with quadratic flux functions and constant viscosity matrices, classifying the number and types of DRS points that occur; in the process, we prove Theorem 1.3.

Recall that, by Proposition 2.2, the conditions for a DRS point are equivalent to conditions (D1), (V), and (BT), provided that $n = 2$ and conditions (N1), (N2), and (N4) hold. In this context, condition (D1) means that there exists a vector $r^* \ne 0$ such that

\[ (r^*)^\perp F'(U^*)\,r^* = 0, \tag{4.1} \]

in which case

\[ s^* = (r^*)^T F'(U^*)\,r^*/|r^*|^2. \tag{4.2} \]

Likewise, condition (BT) reduces to

\[ \operatorname{tr}\big\{B^{-1}\big[F'(U^*) - s^*I\big]\big\} = 0. \tag{4.3} \]


Indeed, the matrix $B^{-1}[F'(U^*) - s^*I]$ is singular by condition (D1), and it is nonzero by condition (N1). Also, because $F$ is quadratic, $F''$ is constant, so that condition (V) may be written

\[ (r^*)^\perp B^{-1}F''(0)(r^*, r^*) = 0. \tag{4.4} \]

(That the quantity $c$ appearing in condition (V) is nonzero follows from condition (N4).) Equation (4.4) is a homogeneous cubic in the components of $r^*$, and has nonzero real solutions forming one to three lines through the origin, the directions of which are known as viscosity angles [25]. Fixing $r^*$ to be one of these solutions, we reduce Eq. (4.1) to a linear equation for $U^*$, with solutions forming a line $L$ (called the viscosity line). The coefficients in this linear equation are homogeneous quadratic functions of the components of $r^*$. Moreover, Eq. (4.3) can be combined with Eq. (4.2) to obtain

\[ \operatorname{tr}\big\{B^{-1}\big[F'(U^*)\,|r^*|^2 - (r^*)^TF'(U^*)\,r^*\,I\big]\big\} = 0, \tag{4.5} \]

also a linear equation for $U^*$ with homogeneous quadratic coefficients; its solutions form a second line, $K$. The set of DRS points for a particular viscosity angle is given by the intersection of the two lines $L$ and $K$, usually a single point. More precisely, we have the following result, from which Theorem 1.3 follows as an immediate consequence.

Proposition 4.1. For all but a codimension-one variety of model coefficients, systems of two conservation laws with quadratic flux functions and constant, strictly parabolic viscosity matrices have precisely one DRS point for each viscosity angle, each of which is nondegenerate.

Proof. Denote the left-hand side of Eq. (4.4) by $P(r^*)$, which is a homogeneous cubic polynomial in the components of $r^*$. Given a solution $r^* \ne 0$ of $P(r^*) = 0$, Eqs. (4.1) and (4.3) have a unique mutual solution $U^*$ whenever the determinant of the matrix of linear coefficients, which is a homogeneous quartic polynomial $Q^{(0)}(r^*)$, does not vanish. The resultant of $P$ and $Q^{(0)}$ is a polynomial expression in the model parameters that vanishes if and only if $P$ and $Q^{(0)}$ have a common root. Let $N^{(0)}$ be the zero-set of this resultant in the space of model parameters. For a model with parameter set not belonging to $N^{(0)}$, let $r^*$ be a solution of Eq. (4.4), i.e., $P(r^*) = 0$. By definition of $N^{(0)}$, $Q^{(0)}(r^*)$ does not vanish, so that there exists a unique mutual solution of Eqs. (4.1) and (4.5), which is the unique DRS point associated with $r^*$. Notice that, by Cramer's rule, $U^*$ is a ratio of homogeneous quartic polynomials of $r^*$.

Next consider condition (N1), which may be written

\[ \operatorname{tr}\big[F'(U^*)\,|r^*|^2 - (r^*)^TF'(U^*)\,r^*\,I\big] \ne 0. \tag{4.6} \]

After using the formula for $U^*$ in terms of $r^*$ and clearing the denominator, the left-hand side becomes a homogeneous polynomial $Q^{(1)}(r^*)$ of degree six. If $N^{(1)}$ is the zero-set of the resultant of $P$ and $Q^{(1)}$, then condition (N1) holds off of $N^{(1)}$. Finally, consider the nondegeneracy conditions (N4)–(N10), each of which is the nonvanishing of a homogeneous polynomial $Q^{(j)}(r^*)$, $j = 4, \ldots, 10$. If $N^{(j)}$ is the zero-set of the resultant of $P$ and $Q^{(j)}$, then condition (Nj) holds off of $N^{(j)}$, $j = 4, \ldots, 10$.

Set $N = N^{(0)} \cup N^{(1)} \cup N^{(4)} \cup \cdots \cup N^{(10)}$. Since $N$ is an algebraic variety, either it has codimension at least one or else it is the whole of parameter space. But the examples (3.11) of quadratic models with nondegenerate DRS points, given in Sect. 3.2, preclude the latter possibility.
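The statement that Eq. (4.4) has one to three real solution lines can be illustrated by counting the real roots of the binary cubic $P$ through its discriminant. This is a sketch under stated assumptions rather than the authors' procedure: the flux components are encoded as $F_i(U) = \tfrac12(a_i u^2 + 2b_i uv + c_i v^2)$, the covector $(r)^\perp$ is taken to be $(-r_2, r_1)$, and all function names are hypothetical.

```python
def viscosity_cubic(Binv, Q1, Q2):
    """Coefficients (c0, c1, c2, c3) of
    P(r) = c0*r1^3 + c1*r1^2*r2 + c2*r1*r2^2 + c3*r2^3.

    Q_i = (a_i, b_i, c_i) encodes F_i(U) = (a_i u^2 + 2 b_i u v + c_i v^2)/2,
    so that F_i''(r, r) = a_i r1^2 + 2 b_i r1 r2 + c_i r2^2.
    """
    q = [(Q[0], 2 * Q[1], Q[2]) for Q in (Q1, Q2)]
    # W = Binv F''(r, r), stored as coefficients of (r1^2, r1*r2, r2^2)
    w = [[Binv[i][0] * q[0][j] + Binv[i][1] * q[1][j] for j in range(3)]
         for i in range(2)]
    # With r_perp = (-r2, r1):  P = r1*W_2 - r2*W_1
    return (w[1][0], w[1][1] - w[0][0], w[1][2] - w[0][1], -w[0][2])

def discriminant(c0, c1, c2, c3):
    """Discriminant of the binary cubic form with coefficients c0..c3."""
    return (18 * c0 * c1 * c2 * c3 - 4 * c1 ** 3 * c3 + c1 ** 2 * c2 ** 2
            - 4 * c0 * c2 ** 3 - 27 * c0 ** 2 * c3 ** 2)

def count_viscosity_angles(Binv, Q1, Q2):
    """3 or 1 distinct real lines; None in the degenerate repeated-root case."""
    d = discriminant(*viscosity_cubic(Binv, Q1, Q2))
    if d > 0:
        return 3
    if d < 0:
        return 1
    return None

if __name__ == "__main__":
    identity = [[1, 0], [0, 1]]
    # F = (u^2/2, v^2/2): P = r1*r2*(r2 - r1), three lines
    print(count_viscosity_angles(identity, (1, 0, 0), (0, 0, 1)))
    # F = (0, (u^2 + v^2)/2): P = r1*(r1^2 + r2^2), one line
    print(count_viscosity_angles(identity, (0, 0, 0), (1, 0, 1)))
```

Working with the binary form rather than a cubic in one slope variable means that a solution line along the $r_2$-axis is counted as well.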


Recent results concerning quadratic models permit us to be more specific about the existence and classification of DRS points. Recall that, if a quadratic model has an isolated umbilic point, then it can be put in the normal form of Schaeffer and Shearer [41], who classified such models into Cases I–IV based on rarefaction waves. An alternative classification, based on the existence of transitional shock waves for the chosen viscosity matrix $B$, is developed in Refs. [23, 48]. Assuming that $B$ is symmetric and positive definite when the flux is in Schaeffer-Shearer normal form, models with isolated umbilic points fall into Cases I$_B$–IV$_B$. For $B = I$, Cases I$_B$–IV$_B$ coincide with Cases I–IV; additionally, Case I$_B$ coincides with Case I for any $B$. This result is based on the observation [24] that a shock wave in a quadratic model has an orbit that lies along a straight line if the separation $r^* = U_+ - U_-$ satisfies Eq. (4.4), i.e., the direction of $r^*$ is a viscosity angle. The following two theorems are proved in Refs. [23, 48]:

Theorem 4.2. Let $r^* \ne 0$ be a root of Eq. (4.4) for a 2-component quadratic flux model with a constant, invertible viscosity matrix. Then there exist transitional shock waves with straight-line orbits along the direction of $r^*$ if and only if

\[ \det F''(0)(r^*, \cdot) < 0. \tag{4.7} \]

In this case, the viscosity angle is called active.

Theorem 4.3. Let $B$ be a symmetric positive definite viscosity matrix and consider homogeneous quadratic models in Schaeffer-Shearer normal form. There are three polynomials in the model parameters that define curves separating models into four cases, which are characterized as follows:

(i) Case I$_B$: three viscosity angles, all active;
(ii) Case II$_B$: three viscosity angles, two active and one not;
(iii) Case III$_B$: three viscosity angles, none active;
(iv) Case IV$_B$: one viscosity angle, which is not active.
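The criterion (4.7) is easy to evaluate once the viscosity angles are known. The sketch below is an illustration only: it locates the roots of the cubic in Eq. (4.4) by a sign-change scan with bisection and then tests $\det F''(0)(r,\cdot) < 0$. The Hessian encoding `H[i][j][k]` $= \partial^2 F_i/\partial U_j\partial U_k$, the choice $(r)^\perp = (-r_2, r_1)$, the function names, and the sample flux $F = (u^2 - v^2, -2uv)$ with $B = I$ are assumptions made for the example, not data from the paper.

```python
import math

def bilinear(H, r, s):
    """F''(0)(r, s) as a 2-vector, with H[i][j][k] the Hessian of F_i."""
    return [sum(H[i][j][k] * r[j] * s[k] for j in range(2) for k in range(2))
            for i in range(2)]

def cubic_P(theta, Binv, H):
    """Left-hand side of Eq. (4.4) at r = (cos theta, sin theta), r_perp = (-r2, r1)."""
    r = (math.cos(theta), math.sin(theta))
    f = bilinear(H, r, r)
    w = [Binv[i][0] * f[0] + Binv[i][1] * f[1] for i in range(2)]
    return r[0] * w[1] - r[1] * w[0]

def viscosity_angles(Binv, H, samples=3600, offset=0.01234):
    """Angles modulo pi at which P changes sign, refined by bisection."""
    roots = []
    step = math.pi / samples
    for i in range(samples):
        a, b = offset + i * step, offset + (i + 1) * step
        fa, fb = cubic_P(a, Binv, H), cubic_P(b, Binv, H)
        if fa == 0.0:
            roots.append(a % math.pi)
        elif fa * fb < 0.0:
            for _ in range(60):
                mid = 0.5 * (a + b)
                fm = cubic_P(mid, Binv, H)
                if fa * fm <= 0.0:
                    b = mid
                else:
                    a, fa = mid, fm
            roots.append(0.5 * (a + b) % math.pi)
    return sorted(roots)

def is_active(theta, H):
    """Criterion (4.7): det F''(0)(r, .) < 0 for r = (cos theta, sin theta)."""
    r = (math.cos(theta), math.sin(theta))
    M = [[sum(H[i][j][k] * r[j] for j in range(2)) for k in range(2)]
         for i in range(2)]
    return M[0][0] * M[1][1] - M[0][1] * M[1][0] < 0

if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    # Hessians of the sample flux F = (u^2 - v^2, -2uv)
    H = [[[2.0, 0.0], [0.0, -2.0]], [[0.0, -2.0], [-2.0, 0.0]]]
    for theta in viscosity_angles(identity, H):
        print(round(math.degrees(theta), 6), is_active(theta, H))
```

For this sample flux the cubic is $2r_2(r_2^2 - 3r_1^2)$, so there are three viscosity angles ($0^\circ$, $60^\circ$, $120^\circ$ modulo $180^\circ$), and $\det F''(0)(r,\cdot) = -4|r|^2$ makes all of them active, the pattern listed for Case I$_B$.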

The following lemma relates these results to the present context, giving the desired classification as a corollary.

Lemma 4.4. If $\det B > 0$ and $U^*$ is a strictly hyperbolic DRS point, then $U^*$ is of saddle type if and only if the corresponding viscosity angle is active.

Proof. Let $\alpha^*$, $\beta^*$, $\gamma^*$, $l^*$, and $p^*$ be as in Lemma 3.1. Since $\ell^*F''(0)(r^*, r^*) = 0$,

\[ \det F''(0)(r^*, \cdot) = \det L^*F''(0)(r^*, \cdot)R^* \tag{4.8} \]

\[ \phantom{\det F''(0)(r^*, \cdot)} = -\,\ell^*F''(0)(r^*, \tilde r^*)\;\tilde\ell^*F''(0)(r^*, r^*) \tag{4.9} \]

\[ \phantom{\det F''(0)(r^*, \cdot)} = -\frac{2l^*p^*}{\beta^*\gamma^*}. \tag{4.10} \]

From Lemma 3.1 we know that $-\beta^*\gamma^* = (\det B)^{-1} > 0$. Therefore $\det F''(0)(r^*, \cdot) < 0$, i.e., the viscosity angle is active, if and only if $l^*p^* < 0$, i.e., the DRS point is of saddle type.

Corollary 4.5. Consider a system of two conservation laws for which $F$ is quadratic, with homogeneous quadratic part $F_2$ in Schaeffer-Shearer normal form, and $B$ is constant and symmetric, positive definite. Then the number and types of DRS points are determined by the classification of $F_2$, as follows:

(i) Case I$_B$: three DRS points, all of saddle type;
(ii) Case II$_B$: three DRS points, two of saddle type and one of elliptic type;
(iii) Case III$_B$: three DRS points, all of elliptic type;
(iv) Case IV$_B$: one DRS point, which has elliptic type.

Acknowledgement. We thank Professor J. Sotomayor for inspiring discussions concerning this work.

References

1. Azevedo, A. and Marchesin, D.: Multiple Viscous Profile Riemann Solutions in Mixed Elliptic-Hyperbolic Models for Flow in Porous Media. In: Hyperbolic Equations that Change Type, B. Keyfitz and M. Shearer, eds. IMA Volumes in Mathematics and its Applications, 27, New York–Heidelberg–Berlin: Springer-Verlag, 1990, pp. 1–17
2. Azevedo, A. and Marchesin, D.: Multiple Viscous Solutions for Systems of Conservation Laws. Trans. Am. Math. Soc. 347, 3061–3078 (1995)
3. Azevedo, A., Marchesin, D., Plohr, B. and Zumbrun, K.: Nonuniqueness of Nonclassical Solutions of Riemann Problems. Z. angew. Math. Phys. 47, 977–998 (1996)
4. Azevedo, A., Marchesin, D., Plohr, B. and Zumbrun, K.: Capillary Instability in Models for Three-Phase Flow. In preparation, 1998
5. Čanić, S.: On the influence of viscosity on Riemann solutions. J. Dyn. Diff. Eqns. 9, 663–703 (1997)
6. Čanić, S. and Plohr, B.: Shock Wave Admissibility for Quadratic Conservation Laws. J. Differ. Eqs. 118, 293–335 (1995)
7. Conley, C. and Smoller, J.: Viscosity Matrices for two-dimensional nonlinear hyperbolic systems. Comm. Pure Appl. Math. XXIII, 867–884 (1970)
8. Conley, C. and Smoller, J.: Shock waves as limits of progressive wave solutions of higher order equations II. Comm. Pure Appl. Math. XXIV, 133–146 (1972)
9. Conlon, J.: A theorem in ordinary differential equations with an application to hyperbolic conservation laws. Adv. Math. 35, 1–18 (1980)
10. Corey, A., Rathjens, C., Henderson, J. and Wyllie, M.: Three-Phase Relative Permeability. Trans. AIME 207, 349–351 (1956)
11. Courant, R., Friedrichs, K.: Supersonic Flow and Shock Waves. New York: Springer-Verlag, 1976
12. Dumortier, F., Fiddelaers, P.: Quadratic Models for Generic Local 3-Parameter Bifurcations on the Plane. Trans. Am. Math. Soc. 326, 101–126 (1991)
13. Dumortier, F., Roussarie, R., Sotomayor, J. and Żołądek, H.: Nilpotent Singularities and Abelian Integrals. Lect. Notes Math., 1480, New York–Heidelberg–Berlin: Springer-Verlag, 1991
14. Dumortier, F. and Rousseau, C.: Cubic Liénard Equations with Linear Damping. Nonlinearity 3, 1015–1039 (1990)
15. Fiddelaers, P.: Local Bifurcations of Quadratic Vector Fields. Ph.D. thesis, Limburgs Universitair Centrum, Diepenbeek, Belgium, 1992
16. Foy, R.: Steady-State solutions of hyperbolic systems of conservation laws with viscosity terms. Comm. Pure Appl. Math. 17, 177–188 (1964)
17. Freistühler, H.: Central Degeneracy of Rotationally Symmetric Hyperbolic Systems of Conservation Laws. In: Nonlinear Hyperbolic Equations – Theory, Computational Methods, and Applications, J. Ballmann and R. Jeltsch, eds., Notes on Numerical Fluid Mechanics, Vol. 24, Vieweg, 1989
18. Freistühler, H.: Linear degeneracy and shock waves. Math. Z. 207, 583–596 (1991)
19. Freistühler, H.: Dynamical stability and vanishing viscosity: A case study of a non-strictly hyperbolic system. Comm. Pure Appl. Math. 45, 561–582 (1992)
20. Gardner, R. and Zumbrun, K.: A Geometric Condition for Stability of Undercompressive Viscous Shock Profiles. Comm. Pure Appl. Math. LI, 797–855 (1998)
21. Gel'fand, I.M.: Some Problems in Theory of Quasilinear Equations. Am. Math. Soc. Transl., Ser. 2, 29, 295–381 (1963), English transl.
22. Hirschberg, P. and Knobloch, E.: An Unfolding of the Takens-Bogdanov Singularity. Quart. J. Appl. Math. XLIX, 281–287 (1991)
23. Hurley, J.: Effects of viscous terms on solutions of Riemann problems. Ph.D. thesis, State Univ. of New York at Stony Brook, 1995
24. Isaacson, E., Marchesin, D., Plohr, B. and Temple, J.B.: The Riemann Problem Near a Hyperbolic Singularity: The Classification of Quadratic Riemann Problems I. SIAM J. Appl. Math. 48, 1009–1032 (1988)
25. Isaacson, E., Marchesin, D., and Plohr, B.: Transitional waves for conservation laws. SIAM J. Math. Anal. 21, 837–866 (1990)
26. Isaacson, E., Marchesin, D., Plohr, B. and Temple, J.B.: Multiphase flow models with singular Riemann problems. Mat. Apl. Comput. 11, 147–166 (1992)
27. Kawashima, S.: Systems of hyperbolic-parabolic composite type, with applications to the equations of magnetohydrodynamics. Ph.D. thesis, Kyoto University, Kyoto, Japan, 1983
28. Keyfitz, B. and Kranzer, H.: A system of non-strictly hyperbolic conservation laws arising in elasticity theory. Arch. Rational Mech. Anal. 72, 219–241 (1980)
29. Lax, P.: Hyperbolic systems of conservation laws II. Comm. Pure Appl. Math. 10, 537–566 (1957)
30. Li, T.-C., Xiao, L. Y., Zu-Wen, S. and Zu-Wen, Y.: Riemann problem for typical quasilinear hyperbolic system. Report, Chinese Science and Technology University, 1963
31. Liu, T.-P.: The Riemann problem for general systems of conservation laws. J. Differ. Eqs. 18, 218–234 (1975)
32. Liu, T.-P.: The entropy condition and the admissibility of shocks. J. Math. Anal. Appl. 54, 78–88 (1976)
33. Liu, T.-P.: Admissible solutions of hyperbolic conservation laws. Mem. Am. Math. Soc., Vol. 240, Providence, RI: American Mathematical Society, 1981
34. Liu, T.-P. and Zumbrun, K.: Nonlinear stability of an undercompressive shock for complex Burgers equation. Commun. Math. Phys. 168, 163–186 (1995)
35. Liu, T.-P. and Zumbrun, K.: On nonlinear stability of general undercompressive viscous shock waves. Commun. Math. Phys. 174, 319–345 (1995)
36. Majda, A. and Pego, R.: Stable viscosity matrices for system of conservation laws. J. Differ. Eqs. 56, 229–262 (1985)
37. Medeiros, H.: Stable hyperbolic singularities for three-phase flow models in oil reservoir simulation. Acta Applicandae Mathematicae 28, 135–159 (1992)
38. Mock, M. S.: A topological degree for orbits connecting critical points of autonomous systems. J. Differ. Eqs. 38, 176–191 (1980)
39. Oleĭnik, O.: On the uniqueness of the generalized solution of a Cauchy problem for a nonlinear system of equations occurring in mechanics. Uspekhi Mat. Nauk 73, 169–176 (1957)
40. Pego, R.: Viscosity matrices for systems of conservation laws. Report No. 2, Center for Pure and Applied Mathematics, University of California, Berkeley, 1972
41. Schaeffer, D., Shearer, M., Marchesin, D. and Paes-Leme, P.: The classification of 2 × 2 systems of non-strictly hyperbolic conservation laws, with application to oil recovery. Comm. Pure Appl. Math. 40, 141–178 (1987)
42. Serre, D.: Problèmes de Riemann singuliers. Applicable Analysis 35, 175–185 (1990)
43. Shearer, M., Schaeffer, D., Marchesin, D. and Paes-Leme, P.: Solution of the Riemann problem for a prototype 2 × 2 system of non-strictly hyperbolic conservation laws. Arch. Rational Mech. Anal. 97, 299–320 (1987)
44. Smoller, J.: Shock waves and reaction-diffusion equations. Second edition, New York–Heidelberg–Berlin: Springer-Verlag, 1994
45. Stone, H.: Probability model for estimating three-phase relative permeability. J. Petr. Tech. 22, 214–218 (1970)
46. Trangenstein, J.: Three-phase flow with gravity. In: Current Progress in Hyperbolic Systems: Riemann Problems and Computations (Bowdoin, 1988). B. Lindquist, ed., Contemporary Math. Vol. 100, Providence, RI: American Mathematical Society, 1989, pp. 147–159
47. Wendroff, B.: The Riemann problem for materials with non-convex equations of state: I. Isentropic flow; II. General flow. J. Math. Anal. and Appl. 38, 454–466; 640–658 (1972)
48. Wenstrom, J. and Plohr, B.: Classification of homogeneous quadratic conservation laws with viscous terms. In preparation, 1998
49. Whitney, H.: On singularities of mappings of Euclidean spaces. I. Mappings of the plane into the plane. Ann. Math. 62, 374–410 (1955)

Communicated by A. Kupiainen

Commun. Math. Phys. 202, 291 – 307 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Rieffel Type Discrete Deformation of Finite Quantum Groups

Shuzhou Wang

Department of Mathematics, University of California, Berkeley, CA 94720, USA. E-mail: [email protected]

Received: 24 November 1997 / Accepted: 10 October 1998

Abstract: We introduce a discrete deformation of Rieffel type for finite (quantum) groups. Using this, we give an example of a finite quantum group $A$ of order 18 such that neither $A$ nor its dual can be expressed as a crossed product of the form $C(G_1) \rtimes_\tau G_2$ with $G_1$ and $G_2$ ordinary finite groups. We also give a deformation of finite groups of Lie type by using their maximal abelian subgroups.

1. Introduction Since the work of Woronowicz [26, 27], the theory of compact quantum groups, notably the deformation theory of compact Lie groups, has been intensively studied and is now quite well understood (see e.g. [28, 19, 12, 15, 21, 20]). However, this is not the case for finite quantum groups. Both as objects of great mathematical interest, like finite groups, and as objects with potential important applications in theoretical physics [2, 3], the theory of finite quantum groups calls for more study. To start with, the theory needs an interesting supply of examples, which are still lacking, though a few non-trivial examples have been studied [8, 1, 5, 14]. In this paper, we construct a class of finite quantum groups by introducing a discrete deformation of Rieffel type for finite (nonabelian) groups. In fact, just as in our earlier paper [24], this deformation can be applied to finite quantum groups as well, not just finite groups. This construction is motivated by Rieffel’s deformation of compact Lie groups [16], which has its origins in the Weyl–von Neumann quantization (also called Moyal product) (cf. [15]). As a matter of fact, our formula for the discretely deformed product (see Definition 2.1) is an exact analog of the product formula of von Neumann and Rieffel [13, 15]. In [16] (resp. [24]), actions of finite dimensional vector spaces are used to deform Lie groups (resp. compact quantum groups) into new quantum groups. In this paper, we use actions of finite abelian groups to deform finite groups (and finite quantum groups) into new finite quantum groups. Because of the nature of the objects


we deal with, we are spared the analytical complications met in [15, 16, 24] for the actions of continuous abelian groups (viz. vector spaces). Hence the arguments in this paper are of a purely algebraic nature. Though the constructions of this paper are direct analogs of [15, 16, 24] adapted to the actions of finite abelian groups, the proofs of the main results given there do not directly generalize to the new situation. Thus we have to develop different proofs for our main results. The main cause of this is that many facts on Euclidean geometry of $\mathbb{R}^d$ as used in [15] do not have generalizations to finite abelian groups (e.g. orthogonal complements, polar decomposition of operators, etc.), though one can develop to some extent the "Euclidean geometry" on a finite abelian group with "inner product" given by a pairing which identifies the group with its Pontryagin dual.

The construction of deformation in this paper in the dual form is an example of Drinfeld's twistings [4], just as the constructions in [16, 17, 24] (cf. also [11, 12]). This again shows the relationship between the Rieffel type deformations (generalizations of the Weyl–von Neumann quantization) and Drinfeld's twistings. For Kac algebras, the most general form of twistings in the sense of Drinfeld [4] is studied by Enock and Vainerman [5, 18], following the work of Landstad and Raeburn [10, 9] on deformations of locally compact groups. Hence in the dual picture our construction constitutes a distinguished class of twistings of Kac algebras in the sense of Enock and Vainerman [5, 18]. Instead of imposing rather complicated cocycle conditions in addition to the existence of an abelian subgroup, such as the approach in [9, 5, 18], our construction of deformation is canonically associated with the abelian subgroup, and it is a natural generalization of the Weyl–von Neumann–Rieffel deformation. As a matter of fact, the twist $F$ (see formula (3.16)) for the dual of our construction does not satisfy the 2-cocycle condition on the Kac algebra, but the pseudo-2-cocycle condition, which is equivalent to the condition that the associated twisted coproduct is coassociative, a minimal requirement. It is interesting to note that it is not clear how to see that $F$ is a pseudo-2-cocycle directly in the dual picture! Also, unlike [1, 5], our construction does not give rise to the 8-dimensional quantum group of Kac-Palyutkin [8]. Note also that in [14], more specific examples along the lines of [9, 5, 18] are given; it is also shown there that the $K_0$ ring of the Hopf algebra is invariant under twists, which is obvious for our construction.

The plan of this paper is as follows. As preparation for Sect. 3, we construct in Sect. 2 a deformed $C^*$-algebra $A_J$ for every finite dimensional unital $C^*$-algebra $A$ that is endowed with an action $\alpha$ of a finite abelian group $H$, where $J$ is a skew-symmetric automorphism on $H$. See Theorem 2.7. The construction in this section parallels the one in [15]. In Sect. 3, for every finite quantum group $A$ containing a finite abelian subgroup $T$, we construct an action $\alpha$ of $H$ on the $C^*$-algebra $A$ and show that the deformation $A_J$ is also a finite quantum group containing $T$ as a subgroup, where $H = T \oplus T$, $J = S \oplus (-S)$, and $S$ is a skew-symmetric automorphism on $T$. See the main result, Theorem 3.2. This theorem parallels the main results in [16, 24], and is announced in Sect. 2 of [25] without proof. At the end of this section, we discuss the relationships of this construction with Drinfeld's twistings [4] and twistings of Landstad and Enock–Vainerman [9, 5, 18]. In Sect. 4, we construct a non-trivial finite quantum group $A$ of order 18 such that neither $A$ nor its dual can be expressed as a crossed product of the form $C(G_1) \rtimes_\tau G_2$, where $G_1$ and $G_2$ are ordinary finite groups. Finally, in Sect. 5, we deform finite groups of Lie type using their maximal abelian subgroups (tori).

A convention on terminology. When $A = C(G)$ is a Woronowicz Hopf $C^*$-algebra, we also call $A$ a compact quantum group, referring to the abstract dual $G$. Hence a representation of the quantum group $A$ is a representation of $G$ in the sense of [27],


which is also called a corepresentation of the Woronowicz Hopf $C^*$-algebra $A$ (cf. also [21]); while a representation of the algebra $A$ has an obviously different meaning.

2. Deformation of Algebras via Actions of Finite Abelian Groups

In this section, we adapt the construction of the monograph of Rieffel [15] to the situation of actions of finite abelian groups on $C^*$-algebras (as opposed to actions of $\mathbb{R}^d$ considered there by Rieffel). Namely, for every quadruple $(A, H, \alpha, J)$ consisting of a finite dimensional unital $C^*$-algebra $A$, an action $\alpha$ of a finite abelian group $H$ on the $C^*$-algebra $A$, and a skew-symmetric automorphism $J$ (with respect to a Pontryagin pairing, see the definition below) on $H$, we construct a deformed unital $C^*$-algebra $A_J$. It is not our intention to generalize in detail everything in [15] to this setting. As a matter of fact, many results in [15] do not generalize to this setting. Our primary task in this section is to give some details of those results that are needed in the next section for the deformation of finite quantum groups, the main one being the construction of the $C^*$-algebra $A_J$ mentioned above. We will also briefly indicate some other results that might be useful elsewhere.

Throughout this section, $A$ will denote a finite dimensional unital $C^*$-algebra on which a finite abelian group $H$ acts by $*$-automorphisms $\alpha$. The group operation of $H$ is written additively. Let

\[ H \times H \longrightarrow \mathbb{T}, \qquad (s, t) \longmapsto \langle s, t\rangle \]

be a pairing (with values in the circle group $\mathbb{T}$) that identifies $H$ with its Pontryagin dual $\hat H$ (we call such a pairing a Pontryagin pairing). More precisely, identifying $H$ with $\mathbb{Z}/n_1\mathbb{Z} \oplus \mathbb{Z}/n_2\mathbb{Z} \oplus \cdots \oplus \mathbb{Z}/n_l\mathbb{Z}$, where $n_1, n_2, \ldots, n_l$ are (not necessarily distinct) natural numbers, a pairing is given by

\[ \langle s, t\rangle = \langle (s_1, \ldots, s_l), (t_1, \ldots, t_l)\rangle = e^{2\pi i\,(s_1t_1/n_1 + s_2t_2/n_2 + \cdots + s_lt_l/n_l)}, \tag{2.1} \]

where $s_k, t_k \in \mathbb{Z}$, $k = 1, \ldots, l$.

Let $\mathrm{End}(H)$ be the ring of endomorphisms of the group $H$ and $GL(H)$ the group of automorphisms on $H$ (which is the same as the group of invertible elements in the ring $\mathrm{End}(H)$). Using a Pontryagin pairing $\langle s, t\rangle$ on $H$ as above, we can define the notion of transpose $J^t$ of an endomorphism $J \in \mathrm{End}(H)$. More generally, if $G$ and $H$ are two finite abelian groups endowed with Pontryagin pairings, then every group homomorphism $J$ from $G$ to $H$ admits a transpose $J^t$, which is a homomorphism from $H$ to $G$. Throughout this section, we assume that $H$ admits a nontrivial skew-symmetric automorphism. (Note that some finite abelian groups do not have such automorphisms! But the examples of groups we consider later in this paper do.) We can also define the group of orthogonal automorphisms $O(H)$ in the evident manner. Just as in the case of a vector space, by choosing $l$ cyclic generators of $H$, one for each of the subgroups $H_k \cong \mathbb{Z}/n_k\mathbb{Z}$ of $H$, we can also represent elements of $\mathrm{End}(H)$ in terms of matrices with entries consisting of group homomorphisms from $H_j$ to $H_k$, $j, k = 1, \ldots, l$. With each choice of cyclic generators of $H$, $GL(H)$ and $O(H)$ can be identified with the sets of invertible and orthogonal matrices, respectively. Note that the skew-symmetry of the matrix $J$ is independent of the choice of the generators of $H$.

294

S. Wang

For any finite group $H$, we will use $\int$ to denote the normalized Haar integral on $H$, i.e.
$$\int_{s\in H} f(s) = \frac{1}{|H|} \sum_{s\in H} f(s), \tag{2.2}$$
where $|H|$ is the number of elements in $H$ and $f$ is a function on $H$ taking values in some vector space. We will see that the normalization is convenient for the constructions of our deformed algebra and quantum group. The symbol $\int_{s_1, s_2, \dots, s_k \in H}$ will denote the corresponding $k$-fold integral. For the convenience of the reader, we recall here the orthogonality relations for group characters on $H$, namely the relations
$$\int_{t\in H} \langle s, t\rangle = \delta_{s,0}, \tag{2.3}$$
where $\langle s, t\rangle$ is a Pontryagin pairing on $H$. We will also need the Fourier inversion formula for $A$-valued functions $F(s)$ on $H$, which we also recall, as we will be using it a number of times: $\check{\hat{F}} = F$, i.e.,
$$|H| \int_{s,t\in H} F(s) \langle s, -t\rangle \langle t, x\rangle = F(x). \tag{2.4}$$
(The inversion formula is easily seen to be a consequence of the orthogonality relations (2.3).) In particular, we have
$$|H| \int_{s,t\in H} F(s) \langle s, t\rangle = F(0). \tag{2.5}$$
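The relations (2.3) and (2.4) can be checked numerically on a small example. The following is only a sketch under stated assumptions: the group $H = \mathbb{Z}/2 \oplus \mathbb{Z}/3 \oplus \mathbb{Z}/4$, the scalar-valued test function `F_example` (that is, $A = \mathbb{C}$), and all helper names are hypothetical choices made for illustration and are not taken from the paper.

```python
import cmath
from itertools import product

# Assumed example group: H = Z/2 + Z/3 + Z/4 (orders chosen only for illustration).
ORDERS = (2, 3, 4)
H = list(product(*(range(n) for n in ORDERS)))


def pairing(s, t):
    """Pontryagin pairing (2.1): exp(2*pi*i * sum_k s_k t_k / n_k)."""
    phase = sum(sk * tk / nk for sk, tk, nk in zip(s, t, ORDERS))
    return cmath.exp(2j * cmath.pi * phase)


def neg(t):
    """Inverse of t in the additively written group H."""
    return tuple((-tk) % nk for tk, nk in zip(t, ORDERS))


def haar(f, k=1):
    """Normalized k-fold Haar integral (2.2) of f over H^k."""
    return sum(f(*xs) for xs in product(H, repeat=k)) / len(H) ** k


def orthogonality_error():
    """Largest deviation from (2.3) over all s in H."""
    zero = tuple(0 for _ in ORDERS)
    return max(
        abs(haar(lambda t: pairing(s, t)) - (1 if s == zero else 0)) for s in H
    )


def inversion_error(F):
    """Largest deviation from the Fourier inversion formula (2.4) over all x in H."""
    return max(
        abs(
            len(H) * haar(lambda s, t: F(s) * pairing(s, neg(t)) * pairing(t, x), 2)
            - F(x)
        )
        for x in H
    )


def F_example(s):
    """An arbitrary scalar-valued test function on H (hypothetical data)."""
    return complex(s[0] + 2 * s[1] * s[2], s[2] - s[1])


print(orthogonality_error(), inversion_error(F_example))
```

Both printed deviations should be at the level of floating-point rounding. The factor $|H|$ in (2.4) compensates for the fact that a double normalized integral divides by $|H|^2$.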

Definition 2.1 (cf. [13, 15]). Let $J \in \mathrm{End}(H)$ be an endomorphism of $H$. The deformed product $\times_J$ (or $\times_J^\alpha$) on $A$ is defined by
$$a \times_J b = |H| \int_{s,t\in H} \alpha_s(a) \alpha_t(b) \langle Js, t\rangle, \quad a, b \in A, \tag{2.6}$$
where the products on the right-hand side are in $A$. Let $A_J$ (or $A_J^\alpha$) denote $(A, \times_J)$.

The number $|H|$ in the above formula ensures that the deformed algebra $A_J$ is unital (see (2) of the next proposition). The observant reader might have noticed that in the above formula we have chosen $J$ to appear in the dual pairing $\langle\cdot,\cdot\rangle$ instead of in the action $\alpha$, as is done in [15, 16, 24] (cf. also von Neumann [13]). We do this because if we replace $J$ with $\hbar J$, where $\hbar$ is any real number, the above formula still makes sense, whereas $\alpha_{\hbar J s}$ does not make sense for a finite group $H$ acting on $A$. See also the remarks at the end of this section.

Proposition 2.2. (1) For any $J \in \mathrm{End}(H)$, the deformed product $\times_J$ is associative. (2) If $J \in GL(H)$, then the unit of $A$ continues to be the unit of $A_J$.


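Proof. (1) This is the analog of Theorem 2.14 of Rieffel [15]. However, the proof given there does not work for finite abelian groups because their subgroups do not have orthogonal complements. We give a much simpler proof of this result. The key is to make the correct change of variables. We compute, by applying a change of variables twice (the second change of variables is a little bit tricky),
$$\begin{aligned}
(a \times_J b) \times_J c &= |H| \int_{s,t\in H} \alpha_s(a \times_J b) \alpha_t(c) \langle Js, t\rangle \\
&= |H|^2 \int_{s,t,u,v\in H} \alpha_{s+u}(a) \alpha_{s+v}(b) \alpha_t(c) \langle Js, t\rangle \langle Ju, v\rangle \\
&= |H|^2 \int_{s,t,u,v\in H} \alpha_s(a) \alpha_{s-u+v}(b) \alpha_t(c) \langle J(s-u), t\rangle \langle Ju, v\rangle \\
&= |H|^2 \int_{s,t,u,v\in H} \alpha_s(a) \alpha_{u+v}(b) \alpha_t(c) \langle Ju, t\rangle \langle J(s-u), v\rangle,
\end{aligned}$$
which, after exchanging the roles of $t$ and $v$,
$$\begin{aligned}
&= |H|^2 \int_{s,t,u,v\in H} \alpha_s(a) \alpha_{u+t}(b) \alpha_v(c) \langle Ju, v\rangle \langle J(s-u), t\rangle \\
&= |H|^2 \int_{s,t,u,v\in H} \alpha_s(a) \alpha_{u+t}(b) \alpha_v(c) \langle Ju, v-t\rangle \langle Js, t\rangle \\
&= |H|^2 \int_{s,t,u,v\in H} \alpha_s(a) \alpha_{u+t}(b) \alpha_{v+t}(c) \langle Ju, v\rangle \langle Js, t\rangle.
\end{aligned}$$
On the other hand, expanding $a \times_J (b \times_J c)$ we see that
$$a \times_J (b \times_J c) = |H|^2 \int_{s,t,u,v\in H} \alpha_s(a) \alpha_{t+u}(b) \alpha_{t+v}(c) \langle Ju, v\rangle \langle Js, t\rangle.$$

(2) If $J \in GL(H)$, then $\langle Js, t\rangle$ is a Pontryagin pairing for $H$, hence we can use (2.5) (replacing $\langle s, t\rangle$ in (2.5) by $\langle Js, t\rangle$):
$$a \times_J 1 = |H| \int_{s,t\in H} \alpha_s(a) \langle Js, t\rangle = \alpha_0(a) = a.$$
Similarly, $1 \times_J b = b$. This proves the proposition.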
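Proposition 2.2 can be tested numerically in a small case. The sketch below is illustrative only and rests on assumptions that are not in the text: it takes $H = \mathbb{Z}/3 \oplus \mathbb{Z}/3$, lets $H$ act on $A = C(H)$ by translation, uses the skew-symmetric automorphism with matrix $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, and picks arbitrary test elements `a`, `b`, `c`.

```python
import cmath
from itertools import product

N = 3  # assumed example: H = Z/3 + Z/3 acting on A = C(H) by translation
H = list(product(range(N), repeat=2))


def pairing(s, t):
    """Pontryagin pairing (2.1) on H."""
    return cmath.exp(2j * cmath.pi * (s[0] * t[0] + s[1] * t[1]) / N)


def J(s):
    """Skew-symmetric automorphism with matrix [[0, 1], [-1, 0]]."""
    return (s[1] % N, (-s[0]) % N)


def deformed(a, b):
    """Deformed product (2.6) with (alpha_s a)(x) = a(x - s).

    |H| times the normalized double integral equals the plain double sum
    divided by |H|.
    """
    out = {x: 0j for x in H}
    for s in H:
        for t in H:
            w = pairing(J(s), t) / len(H)
            for x in H:
                xs = ((x[0] - s[0]) % N, (x[1] - s[1]) % N)
                xt = ((x[0] - t[0]) % N, (x[1] - t[1]) % N)
                out[x] += w * a[xs] * b[xt]
    return out


def dist(a, b):
    """Sup-norm distance between two functions on H."""
    return max(abs(a[x] - b[x]) for x in H)


# Arbitrary test elements of A = C(H) (hypothetical data).
a = {x: complex(x[0] + 1, x[1]) for x in H}
b = {x: complex(x[0] * x[1], 2 - x[0]) for x in H}
c = {x: complex(1 + x[1] ** 2, x[0] - x[1]) for x in H}
one = {x: 1 + 0j for x in H}

assoc_error = dist(deformed(deformed(a, b), c), deformed(a, deformed(b, c)))
unit_error = max(dist(deformed(a, one), a), dist(deformed(one, a), a))
print(assoc_error, unit_error)
```

Both errors should vanish up to rounding. Associativity does not use the invertibility of $J$, whereas the unit check relies on $\langle Js, t\rangle$ being a Pontryagin pairing, as in part (2) of the proof.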

As in [15], let $A_u$ be the spectral subspace of $u \in H$:
$$A_u = \{a \in A \mid \alpha_s(a) = \langle u, s\rangle a, \ s \in H\}. \tag{2.7}$$

Proposition 2.3. Let $J \in GL(H)$ be a skew-symmetric automorphism: $J^t = -J$. Let $a \in A_u$, $b \in A_v$ (the spectral subspaces of $A$ corresponding to $u, v \in H$). Then
$$a \times_J b = \langle J^{-1} u, v\rangle ab. \tag{2.8}$$


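Proof. This is the analog of Proposition 2.22 in [15]. Instead of the Poisson summation formula, as is used in the proof of 2.22 in [15], we apply the Fourier inversion formula to the last line of the following computation:
$$\begin{aligned}
a \times_J b &= |H| \int_{s,t\in H} \alpha_s(a) \alpha_t(b) \langle Js, t\rangle = |H| \int_{s,t\in H} \langle s, u\rangle a \, \langle t, v\rangle b \, \langle Js, t\rangle \\
&= |H| \int_{s,t\in H} \langle J^{-1} s, u\rangle \langle t, -v\rangle \langle s, -t\rangle \, ab = \langle J^{-1}(-v), u\rangle ab.
\end{aligned}$$
That is,
$$a \times_J b = \langle J^{-1} u, v\rangle ab$$
by skew-symmetry of $J$.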
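Formula (2.8) can also be checked directly on characters. The sketch below is a hedged illustration, not part of the original argument; it assumes $H = \mathbb{Z}/3 \oplus \mathbb{Z}/3$ acting on $A = C(H)$ by translation, $(\alpha_s a)(x) = a(x - s)$, with $J$ given by the matrix $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$. Under this assumption the function $x \mapsto \langle u, -x\rangle$ lies in the spectral subspace $A_u$. All function names are hypothetical.

```python
import cmath
from itertools import product

N = 3  # assumed example: H = Z/3 + Z/3 acting on A = C(H) by translation
H = list(product(range(N), repeat=2))


def pairing(s, t):
    """Pontryagin pairing (2.1) on H."""
    return cmath.exp(2j * cmath.pi * (s[0] * t[0] + s[1] * t[1]) / N)


def J(s):
    """Skew-symmetric automorphism with matrix [[0, 1], [-1, 0]]."""
    return (s[1] % N, (-s[0]) % N)


def J_inv(u):
    """Inverse of J, with matrix [[0, -1], [1, 0]]."""
    return ((-u[1]) % N, u[0] % N)


def character(u):
    """Element of A_u: a(x) = <u, -x>, so that a(x - s) = <u, s> a(x)."""
    return {x: pairing(u, x).conjugate() for x in H}


def deformed(a, b):
    """Deformed product (2.6) for the translation action on C(H)."""
    out = {x: 0j for x in H}
    for s in H:
        for t in H:
            w = pairing(J(s), t) / len(H)
            for x in H:
                xs = ((x[0] - s[0]) % N, (x[1] - s[1]) % N)
                xt = ((x[0] - t[0]) % N, (x[1] - t[1]) % N)
                out[x] += w * a[xs] * b[xt]
    return out


def prop_2_3_error():
    """Largest deviation from (2.8) over all pairs (u, v) in H x H."""
    worst = 0.0
    for u in H:
        for v in H:
            a, b = character(u), character(v)
            lhs = deformed(a, b)
            phase = pairing(J_inv(u), v)
            worst = max(worst, max(abs(lhs[x] - phase * a[x] * b[x]) for x in H))
    return worst


print(prop_2_3_error())
```

The printed deviation should be at rounding level. Since the characters span $C(H)$, this determines the deformed product completely in this example.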

Remark. Note that if $A$ is commutative, then $a \times_J b = \langle 2J^{-1} u, v\rangle \, b \times_J a$, where $a, b$ are as in the above proposition. Hence, we see that if $2J^{-1} \neq 0$ and if the action $\alpha$ is non-trivial, then the algebra $A_J$ is noncommutative, even if $A$ is a commutative algebra. The condition $2J^{-1} = 0$ is related to the characteristic 2 phenomenon (see the last two sections for examples concerning this).

Proposition 2.4. Let $J \in \mathrm{End}(H)$ be a skew-symmetric homomorphism: $J^t = -J$. Then under the involution $*$ of the algebra $A$, we have $(a \times_J b)^* = b^* \times_J a^*$ for $a, b \in A_J$. Hence $A_J$ is a $*$-algebra.

Proof. Use $\langle Jy, x\rangle = \langle y, J^t x\rangle = \langle y, -Jx\rangle$.

Consider the Hilbert $A$-module $E = C(H) \otimes A$ under the $A$-valued inner product
$$\langle f, g\rangle_A = \int_{x\in H} f^*(x) g(x), \quad f, g \in C(H) \otimes A, \tag{2.9}$$
where $C(H)$ is the algebra of complex valued functions on $H$. Note that as a tensor product of two $C^*$-algebras, $E$ is also a $C^*$-algebra and $H$ acts on it by translation:
$$\tau_s(f)(x) = f(x - s). \tag{2.10}$$
If $J$ is a skew-symmetric automorphism, then from the propositions above, $E_J = (E, \times_J^\tau)$ is a unital $*$-algebra. Let $L$ denote the left regular multiplication on $E_J$:
$$L_f g = f \times_J^\tau g. \tag{2.11}$$
Under these assumptions, we have (cf. 4.2, 4.3, 4.6 of [15])

Proposition 2.5. The left regular multiplication $L$ is a faithful unital $*$-representation of the $*$-algebra $E_J$ by bounded operators on the Hilbert $A$-module $E$. More precisely, we have $L_f = 0$ if and only if $f = 0$, and
$$\langle f \times_J^\tau g, h\rangle_A = \langle g, f^* \times_J^\tau h\rangle_A, \quad f \in E_J, \ g, h \in E, \tag{2.12}$$
$$\|L_f\| \le \int_{s\in H} \|f(s)\| = \|f\|_1, \quad f \in E_J. \tag{2.13}$$


Proof. The identity (2.12) can be checked directly, without the complications involved in the proof of 4.2 in Rieffel [15]. We leave this to the reader. We show that $L$ is faithful. Let $L_f = 0$. Hence $\langle f \times_J^\tau g, f \times_J^\tau g\rangle_A = 0$ for $g \in E$. Let $g$ be the unit element of the $C^*$-algebra $E$, $g(s) = 1$, $s \in H$. Then by Proposition 2.2, $g$ is the unit of $E_J$. Hence $\langle f \times_J^\tau g, f \times_J^\tau g\rangle_A = \langle f, f\rangle_A = 0$. Note that $g$ plays two roles here: as the unit of the algebra $E_J$ and as a vector in the Hilbert $A$-module $E$. Hence $f = 0$.

The proof of the inequality (2.13) is the same as (and easier than) the proof of 4.3 in [15] (see also 4.6 of [15]). For the convenience of the reader, we sketch the proof here. A short computation shows that
$$L_f(g) = f \times_J^\tau g = \int_{s\in H} f(s) U_s(g), \quad \text{i.e.,} \quad L_f = \int_{s\in H} f(s) U_s,$$
where $U_s$ is the unitary operator on the Hilbert module $E$ defined by
$$U_s(g)(x) = \langle J(x - s), x\rangle \, \check{g}(J(x - s)),$$
$\check{g}$ being the inverse Fourier transform of $g$ (Plancherel's theorem). Hence (2.13) is immediate.

Let us come back to our algebra $A_J$. For $a \in A_J = A$, the element $\tilde{a}$ of $E_J$ defined by $\tilde{a}(s) = \alpha_s(a)$ is zero if and only if $a = 0$. Using the above result, we can define a $C^*$-norm on $A_J$ as follows.

Definition 2.6. Let $J$ be a skew-symmetric automorphism on $(H, \langle\cdot,\cdot\rangle)$. The deformed $C^*$-norm $\|\cdot\|_J$ on $A_J$ is defined by
$$\|a\|_J = \|L_{\tilde{a}}\|, \quad a \in A_J, \tag{2.14}$$
where $\|L_{\tilde{a}}\|$ is the operator norm of $L_{\tilde{a}}$ on the Hilbert $A$-module $E$.

Summarizing the above, we have the following main result of the section:

Theorem 2.7. Let $A$ be a finite dimensional unital $C^*$-algebra. Let $H$ be a finite abelian group acting on $A$ by automorphisms. Let $J \in GL(H)$ be a skew-symmetric automorphism: $J^t = -J$. Then $A_J$ is a unital $C^*$-algebra under the norm $\|\cdot\|_J$.

Remarks. (1) Note that on any finite dimensional $*$-algebra, there can be at most one $C^*$-norm. Hence the $C^*$-norm defined above is the unique one on $A_J$. (2) Note that we need $J$ to be a skew-symmetric automorphism in order to define the $C^*$-norm on $A_J$, while in Rieffel [15], $J$ can be any skew-symmetric endomorphism on a vector space. Also note that we do not have the analog of Theorems 2.15 and 6.5 of Rieffel [15]. Namely, $(A_J)_K = A_{J+K}$ is not true in general. However, using the orthogonality relations for group characters (see (2.3)), one can easily prove the following proposition.


Proposition 2.8. Under the assumption of the theorem above, $(A_J)_{-J} = A$.

Now we can state the analogs of Theorems 2.10, 5.7, 5.8, 5.12 and 7.7 of Rieffel [15].

Proposition 2.9. Let $J \in GL(H)$ be a skew-symmetric automorphism. Let $\alpha$ and $\beta$ be actions of $H$ on $A$ and $B$ respectively. Let $\theta : A \to B$ be an equivariant homomorphism.
(1) $\theta$ is still an equivariant homomorphism from $A_J$ to $B_J$ (denote this homomorphism by $\theta_J$, called the deformation of $\theta$).
(2) $\theta$ is injective (resp. surjective) if and only if $\theta_J$ is.
(3) Let $I$ be an ideal of $A$ that is invariant under the action $\alpha$. Let $Q = A/I$, and let $\alpha$ also denote the action of $H$ on $Q$, so we have an equivariant exact sequence $0 \to I \to A \to Q \to 0$. Then the corresponding sequence (see (1) above) $0 \to I_J \to A_J \to Q_J \to 0$ is also exact.

The proofs of these analogs follow directly from our definitions and the key assumption that $A$ is a finite dimensional $C^*$-algebra. We leave the checking to the reader. The reader is advised not to look up the proofs in [15] for clues (for otherwise the reader would be misled into complicating the proofs of these analogs), but to simply think about our definitions.

Remarks. (1) If we replace $J$ with $\hbar J$ in the construction above, a number of things in this section are still true, where $\hbar$ is real, and
$$\langle \hbar J s, t\rangle = e^{2\pi \hbar i (s_1 t_1/n_1 + s_2 t_2/n_2 + \cdots + s_l t_l/n_l)},$$
using the above identification of $H$ with the concrete abelian group as a direct sum of cyclic groups. For any skew-symmetric $J$, $A_{\hbar J}$ is an associative $*$-algebra, but it may not have a unit or a $C^*$-norm even if $J$ is an automorphism. So it is not clear how one constructs a strict deformation quantization. (2) For the practical purposes of the next section, we have restricted $A$ to be a finite dimensional $C^*$-algebra. If we remove this restriction, then the proofs of all the above results, except Proposition 2.9, are still valid (of course, in Theorem 2.7, we need a completion to obtain a $C^*$-algebra). We believe that Proposition 2.9 is still true in this case.

3. Deformation of Finite Quantum Groups via Finite Abelian Subgroups

In the theory of finite groups, the finite groups of Lie type are one of the most important classes of finite groups. In view of the fact that classical Lie groups have $q$-deformations, a natural question in this connection is

Problem 3.1. Do finite groups of Lie type have an analog of $q$-deformation into finite quantum groups?


This problem seems to be out of reach at the moment. In this section, we construct a deformation of Rieffel type for finite groups (as well as for finite quantum groups) that contain an abelian subgroup. This deformation is not the analog of the $q$-deformation; it is dual to Drinfeld twistings of the quantized universal enveloping algebras. We will see this at the end of this section.

We start with a finite quantum group $G = (A, \Phi)$ (in the sense that $A$ is the "function space" $C(G)$, where $\Phi$ is the coproduct on $A$ [27]). Assume that its maximal subgroup $X(A) = \{*\text{-homomorphisms from } A \text{ into } \mathbb{C}\}$ contains an abelian subgroup $T$ with a nontrivial skew-symmetric automorphism $S$. So there is a surjective morphism of Hopf $C^*$-algebras $\pi : A \to C(T)$. Let
$$H := T \oplus T, \tag{3.1}$$
and let
$$J := S \oplus (-S) \tag{3.2}$$
be the skew-symmetric automorphism on $H$. Define an action $\alpha$ of $H$ on the $C^*$-algebra $A$ as follows:
$$\alpha_{(s,u)} = \lambda_s \rho_u, \tag{3.3}$$
where
$$\lambda_s = (E_{-s}\pi \otimes \mathrm{id})\Phi, \qquad \rho_u = (\mathrm{id} \otimes E_u \pi)\Phi, \tag{3.4}$$
$\mathrm{id}$ being the identity map on $A$ and $E_u$ the evaluation functional on $C(T)$ corresponding to $u$. Using results of the previous section, we obtain a deformed $C^*$-algebra $A_J$ with new product $\times_J$ defined by (see formula (2.1))
$$a \times_J b = |T|^2 \int_{s,t,u,v\in T} \alpha_{(s,u)}(a) \alpha_{(t,v)}(b) \langle Ss, t\rangle \langle -Su, v\rangle, \tag{3.5}$$

where $a, b \in A$ and $\langle s, t\rangle$ is a Pontryagin pairing on $T$. The main result of this section is (cf. [16, 24])

Theorem 3.2. Under the same coproduct $\Phi$ of $A$, the deformation $(A, \times_J)$ is still a finite quantum group containing $T$ as a subgroup.

Remarks on the proof. We will show that $A_J$ satisfies the axioms of a finite dimensional Hopf $C^*$-algebra as given in Kac and Palyutkin [8], instead of the ones given in Appendix 2 of Woronowicz [27], though they are equivalent to each other. The proof is a modification of the proof of Theorem 3.9 of [24]. Unlike that theorem, because $A$ is of finite dimension here, we do not need to consider the analogs of Propositions 3.2 and 3.8 in [24], which are essential steps for the proof of that theorem. The subtlety in our situation here is that the method used there in the treatment of the deformed coproduct does not work anymore, because the existence of orthogonal complements is used in an essential way there (but, as pointed out before, a subgroup of a finite abelian group need not have an orthogonal complement). To deal with the deformed coproduct, we will show that the heuristic computation on p. 471 of [16] can be made rigorous in our setting (replacing the compact Lie group $G$ there by our finite quantum group).


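Proof. Let $F, G \in A_J \otimes A_J$, and let $\times_J$ also denote the product on $A_J \otimes A_J$. Using formula (2.1) we can find the formula for the product in the $C^*$-algebra $A_J \otimes A_J$ in terms of the product in the $C^*$-algebra $A \otimes A$, with the summation (integration) over repeated indices:
$$F \times_J G = |T|^4 \int \gamma_{(s,u,s',u')}(F) \, \gamma_{(t,v,t',v')}(G) \, \langle L(s,u,s',u'), (t,v,t',v')\rangle, \tag{3.6}$$
where $\gamma = \alpha \otimes \alpha$ is the tensor product action of $H \oplus H$ on $A \otimes A$, $L = J \oplus J$ is the corresponding skew-symmetric automorphism on $H \oplus H$, and
$$\langle (s,u,s',u'), (t,v,t',v')\rangle = \langle s, t\rangle \langle u, v\rangle \langle s', t'\rangle \langle u', v'\rangle. \tag{3.7}$$
Note that this identity is easily verified on tensors of the form $F = a_1 \otimes a_2$, $G = b_1 \otimes b_2$. Since $A$ is finite dimensional, this gives an isomorphism of $*$-algebras
$$(A \otimes A)^\gamma_L = A^\alpha_J \otimes A^\alpha_J.$$
It is easy to see this is actually an isomorphism of $C^*$-algebras (see Remark (1) after Theorem 2.7). This isomorphism is the analog of Corollary 2.2 of [16], where a more complicated proof is needed. For $a, b \in A_J$, we have by (3.6) and (3.3) (cf. [16])
$$\begin{aligned}
\Phi(a) \times_J \Phi(b) &= \Phi(a) \times_L \Phi(b) \\
&= |T|^4 \int_{s,u,t,v,s',u',t',v'\in T} (\lambda_s \rho_u \otimes \lambda_{s'} \rho_{u'})(\Phi(a)) \, (\lambda_t \rho_v \otimes \lambda_{t'} \rho_{v'})(\Phi(b)) \\
&\qquad\qquad \langle Ss, t\rangle \langle -Su, v\rangle \langle Ss', t'\rangle \langle -Su', v'\rangle,
\end{aligned}$$
which, by 2.7 of [24],
$$\begin{aligned}
&= |T|^4 \int_{s,u,t,v,s',u',t',v'\in T} (\lambda_s \rho_{u-s'} \otimes \rho_{u'})(\Phi(a)) \, (\lambda_t \otimes \lambda_{t'-v} \rho_{v'})(\Phi(b)) \\
&\qquad\qquad \langle Ss, t\rangle \langle -Su, v\rangle \langle Ss', t'\rangle \langle -Su', v'\rangle.
\end{aligned}$$
Making the change of variables $u - s' \mapsto u$, $t' - v \mapsto t'$, and using (2.5) twice (note that both $\langle -Su, v\rangle$ and $\langle Ss', t'\rangle$ are Pontryagin pairings on $T$!), the last expression
$$\begin{aligned}
&= |T|^4 \int_{s,u,t,v,s',u',t',v'\in T} (\lambda_s \rho_u \otimes \rho_{u'})(\Phi(a)) \, (\lambda_t \otimes \lambda_{t'} \rho_{v'})(\Phi(b)) \\
&\qquad\qquad \langle Ss, t\rangle \langle -Su, v\rangle \langle Ss', t'\rangle \langle -Su', v'\rangle \\
&= |T|^2 \int_{s,t,u',v'\in T} (\lambda_s \otimes \rho_{u'})(\Phi(a)) \, (\lambda_t \otimes \rho_{v'})(\Phi(b)) \, \langle Ss, t\rangle \langle -Su', v'\rangle,
\end{aligned}$$
which, by 2.7 of [24],
$$= |T|^2 \int_{s,t,u,v\in T} \Phi(\lambda_s \rho_u(a)) \, \Phi(\lambda_t \rho_v(b)) \, \langle Ss, t\rangle \langle -Su, v\rangle = \Phi(a \times_J b).$$
That is,
$$\Phi(a) \times_J \Phi(b) = \Phi(a \times_J b).$$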

As in [16], the action $\alpha$ restricts to an action on $C(T)$ and $\pi$ is equivariant. From Proposition 2.9 of the last section, this gives a surjective homomorphism $\pi_J$ from $A_J$ onto $C(T)_J$. It is also clear that
$$(\pi_J \otimes \pi_J)\Phi_J = \Phi_T \pi_J, \tag{3.8}$$
where $\Phi_T$ is the coproduct on $C(T)$. However, the method used in [16] for the proof of
$$C(T)_J = C(T) \tag{3.9}$$
does not work here, because it uses a result of [15] which is not true for finite abelian groups. We can prove this directly as follows. For $f \in C(T)$, we have $\alpha_{(s,u)}(f) = \lambda_s \rho_u(f) = \lambda_{s-u}(f)$. Hence
$$\begin{aligned}
f \times_J g &= |T|^2 \int_{s,u,t,v\in T} \lambda_{s-u}(f) \lambda_{t-v}(g) \langle Ss, t\rangle \langle -Su, v\rangle \\
&= |T|^2 \int_{s,u,t,v\in T} \lambda_s(f) \lambda_t(g) \langle Ss, t\rangle \langle Ss, v\rangle \langle Su, t\rangle \\
&= fg,
\end{aligned}$$
where we have used the orthogonality relations of the characters of a finite abelian group. This shows that $T$ will still be a subgroup of $A_J$ once $A_J$ is shown to be a quantum group.

The counit of $A_J$ is defined by
$$\varepsilon_J = \varepsilon_T \pi_J, \tag{3.10}$$
where $\varepsilon_T$ is the counit of $C(T)$. So as a linear map, $\varepsilon_J$ is the same as $\varepsilon$. The coinverse $\kappa_J$ on $A_J$ is defined to be the same as $\kappa$. The identity
$$\kappa_J(a \times_J b) = \kappa_J(b) \times_J \kappa_J(a) \tag{3.11}$$
is a direct consequence of the fact that $\kappa \alpha_{(s,u)} = \alpha_{(u,s)} \kappa$ (cf. 2.8 of [24]) and the skew-symmetry of $S$. Now we check the antipodal property
$$m_J(\mathrm{id}_J \otimes \kappa_J)\Phi_J = I_J \varepsilon_J = m_J(\kappa_J \otimes \mathrm{id}_J)\Phi_J. \tag{3.12}$$
By 2.8 and 2.6 of [24], we have for coefficients $a_{ij} \in A$ of a unitary representation $(a_{ij})$ of the quantum group $A$ that
$$\begin{aligned}
m_J(\mathrm{id}_J \otimes \kappa_J)\Phi_J(a_{ij}) &= m_J(\mathrm{id}_J \otimes \kappa_J)\Phi(a_{ij}) \\
&= |T|^2 \int_{s,u,t,v\in T} m(\alpha_{(s,u)} \otimes \alpha_{(t,v)} \kappa)\Phi(a_{ij}) \langle J(s,u), (t,v)\rangle \\
&= |T|^2 \int_{s,u,t,v\in T} m(\mathrm{id} \otimes \kappa)(\lambda_s \rho_u \otimes \lambda_v \rho_t)\Phi(a_{ij}) \langle Ss, t\rangle \langle -Su, v\rangle \\
&= |T|^2 \int_{s,u,t,v\in T} m(\mathrm{id} \otimes \kappa)(\lambda_s \rho_{u-v} \otimes \rho_t)\Phi(a_{ij}) \langle Ss, t\rangle \langle -Su, v\rangle \\
&= |T|^2 \int_{s,u,t,v\in T} m(\mathrm{id} \otimes \kappa)(\lambda_s \rho_u \otimes \rho_t)\Phi(a_{ij}) \langle Ss, t\rangle \langle -Su, v\rangle,
\end{aligned}$$
which, using (2.5), noting that $\langle -Su, v\rangle$ is a Pontryagin pairing on $T$,
$$\begin{aligned}
&= |T| \int_{s,t\in T} m(\mathrm{id} \otimes \kappa)(\lambda_s \otimes \rho_t)\Phi(a_{ij}) \langle Ss, t\rangle \\
&= |T| \int_{s,t\in T} m(\lambda_s \otimes \kappa \rho_t)\Big(\sum_k a_{ik} \otimes a_{kj}\Big) \langle Ss, t\rangle \\
&= |T| \int_{s,t\in T} \sum_{k,l,r} E_{-s}(\pi(a_{il})) \, a_{lk} a_{rk}^* \, E_t(\pi(a_{rj})) \langle Ss, t\rangle \\
&= |T| \int_{s,t\in T} (E_{-s} \otimes E_t)\Phi_T(\pi(a_{ij})) \langle Ss, t\rangle \\
&= |T| \int_{s,t\in T} E_{-s+t}(\pi(a_{ij})) \langle Ss, t\rangle \\
&= |T| \int_{s,t\in T} E_t(\pi(a_{ij})) \langle Ss, t\rangle,
\end{aligned}$$
which, using once again (2.5),
$$= E_0(\pi(a_{ij})) = \varepsilon_T(\pi(a_{ij})) = \varepsilon(a_{ij}) = \varepsilon_J(a_{ij}).$$
That is, on $A_J$ (note that $A_J = A$ as a vector space), $m_J(\mathrm{id}_J \otimes \kappa_J)\Phi_J = I_J \varepsilon_J$. Similarly, $m_J(\kappa_J \otimes \mathrm{id}_J)\Phi_J = I_J \varepsilon_J$.

We will denote the coproduct of $A_J$ by $\Phi_J$. Note that because of Proposition 2.8, $A_J$ can be deformed back to the original quantum group: $(A_J)_{-J} = A$.

Remarks. (1) In the above, we assumed that $A$ is a finite quantum group instead of a more general compact quantum group. In the latter case, we do not know how to show directly that $A_J$ is still a compact quantum group. The main difficulty is to rigorously define the coproduct. Note that we cannot define an analog of the map $\varrho$ as given near 3.3 of [24] (see also [16]). One possible approach is to use the Krein duality as given in our paper [23]. (2) Just as in [16, 24], the Haar measure of $A$ is still the Haar measure of $A_J$. One can easily see this from the uniqueness of the (left and right invariant) Haar measure and the fact that the coproduct of $A_J$ is that of $A$. This proof is also valid for the infinite dimensional cases of [16, 24] if we work with the dense Hopf $*$-algebra, and it is much easier than the proof in [16]. (3) From the above remark and the orthogonality relations for characters of irreducible representations, we see that the irreducible representations of the quantum group $A$ are still irreducible representations of the quantum group $A_J$. From these, we see that the representation ring of the quantum group $A$ is invariant under deformation, just as in [16, 24] and [14].


Now we describe the construction above in the dual picture. Let $B$ be a finite dimensional Hopf $C^*$-algebra with coproduct $\Phi$. Let $T$ be an abelian subgroup of the group of grouplike elements of $B$. Endow $T$ with a Pontryagin pairing. For notational convenience, we now assume that the group operation on $T$ is multiplicative instead of additive, and for a skew-symmetric automorphism $S$ on $T$, the matrix $-S$ will denote the automorphism $(-S)x = (Sx)^{-1}$. As before, let $J = (S, -S)$ be the skew-symmetric automorphism on $H = T \times T$. Then $B$ is a Hopf $C^*$-algebra under the original $C^*$-algebra structure, original counit and antipode, and the deformed coproduct given by
$$\Phi_J(b) = |T|^2 \int_{s,u,t,v\in T} (\beta_{(s,u)} \otimes \beta_{(t,v)})\Phi(b) \, \langle Ss, t\rangle \langle (Su)^{-1}, v\rangle, \tag{3.13}$$
where
$$\beta_{(s,u)}(b) = s b u^{-1}. \tag{3.14}$$
Using orthogonality relations for group characters, we have that
$$\Big(|T| \int_{s,t\in T} (s \otimes t) \langle Ss, t\rangle\Big)^{-1} = |T| \int_{u,v\in T} (u^{-1} \otimes v) \langle Su, v\rangle. \tag{3.15}$$
We can also check that
$$F = |T| \int_{s,t\in T} (s \otimes t) \langle Ss, t\rangle \tag{3.16}$$
is a unitary element in $B \otimes B$. We can now rewrite $\Phi_J(b)$ as
$$\Phi_J(b) = F \Phi(b) F^{-1}. \tag{3.17}$$
This shows that our deformation of finite quantum groups in this dual picture is an analog of the twistings of the quantized universal enveloping algebra of Drinfeld [4, 11, 12], just as the ones in [16, 24]. The element $F$ is not a 2-cocycle, though both $(\varepsilon \otimes 1)(F) = 1$ and $(1 \otimes \varepsilon)(F) = 1$ are satisfied. But since the twist $\Phi_J$ is coassociative because of Theorem 3.2, $F$ is a pseudo-2-cocycle in the sense of [5, 18]. However, it is not clear how to verify directly that $F$ satisfies the pseudo-2-cocycle condition for a general noncommutative and noncocommutative Hopf $C^*$-algebra $B$ without passing to the dual $A$ of $B$ (see Proposition 2.2 and Theorem 3.2)! The above twisting of $B$ can be viewed as the canonical one among the ones considered in [5, 18] that are associated with a finite abelian subgroup $T$. We will call the twist $F$ the Weyl–von Neumann–Rieffel twist associated with $(T, S)$, and denote the twist of $B$ by $B^S$ (to distinguish it from $A_J$). To gain a better understanding of this construction, it is desirable to solve the following (cf. remark (3) above and [14])

Problem 3.3. Characterize the finite quantum groups that can be obtained as deformations of finite groups in the manner above. Find their isomorphism invariants.

4. A Finite Quantum Group of Order 18

Let $T = (\mathbb{Z}/n\mathbb{Z})^{2k}$ with canonical generators $\varepsilon_j$, $j = 1, \dots, 2k$ ($\varepsilon_j$ is the element with the $j$-th component 1 and the other components 0). Let $\beta$ be the action of $\mathbb{Z}/2\mathbb{Z}$ on $T$ which exchanges $\varepsilon_j$ and $\varepsilon_{j+k}$, $j = 1, \dots, k$. Let $G$ be the corresponding semi-direct product of $T$ by $\mathbb{Z}/2\mathbb{Z}$ under this action: $G = T \rtimes_\beta \mathbb{Z}/2\mathbb{Z}$. Let $A = C(G)$. Consider the skew-symmetric automorphism $S$ of $T$ defined by
$$S(\varepsilon_j) = -\varepsilon_{k+j}, \qquad S(\varepsilon_{k+j}) = \varepsilon_j, \quad j = 1, \dots, k. \tag{4.1}$$
Alternatively, $S$ has matrix representation
$$\begin{pmatrix} 0 & I_k \\ -I_k & 0 \end{pmatrix}$$
with respect to the generators (or "basis") $\varepsilon_j$, $j = 1, \dots, 2k$, where $I_k$ represents the identity transformation on the group $(\mathbb{Z}/n\mathbb{Z})^k$. Then we are in the position to apply Theorem 3.2 to obtain a noncommutative deformation $A_J$, where $J = S \oplus (-S)$ (see the remark after Proposition 2.3).

Take $k = 1$ and $n = 3$. Then $A_J = C(G)_J$ is a finite quantum group of order 18, where by definition, the order of a finite group is the dimension of its function algebra. From the remark after Proposition 2.3, we see that this quantum group is noncommutative and noncocommutative. We now show that the quantum group $A_J$ (and its dual $B^S$) is not a crossed product of the form $C(G_1) \rtimes_\tau G_2$, where $G_1$ and $G_2$ are ordinary finite groups and $\tau$ is an action of $G_2$ on $C(G_1)$ by automorphisms of quantum groups (the triple $(C(G_1), G_2, \tau)$ is also called a Woronowicz Hopf $C^*$-dynamical system, see [8, 22]).

An easy application of the Mackey Machine shows that the group $G = (\mathbb{Z}/3\mathbb{Z})^2 \rtimes_\beta \mathbb{Z}/2\mathbb{Z}$ has 6 irreducible representations: 2 one-dimensional representations and 4 two-dimensional representations. Hence by remark (3) after the proof of Theorem 3.2, the quantum group $A_J$ also has 6 irreducible representations. If $A_J$ is a crossed product of the form $C(G_1) \rtimes_\tau G_2$, then we only need to consider four cases: (i) $|G_1| = 2$, $|G_2| = 9$, (ii) $|G_1| = 9$, $|G_2| = 2$, (iii) $|G_1| = 3$, $|G_2| = 6$, (iv) $|G_1| = 6$, $|G_2| = 3$. We claim that in each of the cases (i), (ii) and (iii), the quantum group $A_J = C(G_1) \rtimes_\tau G_2$ is a group $C^*$-algebra of an ordinary group and hence it has 18 irreducible representations instead of 6 (cf. [27], the irreducible representations of a compact quantum group of the form $C^*(\Gamma)$ are exactly the elements of the discrete group $\Gamma$). This claim implies that cases (i), (ii) and (iii) cannot happen. To prove the claim we note first that $G_1$ is abelian in each of these three cases.
Since $\tau$ is assumed to preserve the Hopf $C^*$-algebra structure of $C(G_1)$, which is isomorphic to $C^*(\hat{G}_1)$ (as a Hopf $C^*$-algebra), by transport of structure, $\tau$ is an action of $G_2$ by automorphisms on the group $\hat{G}_1$. Hence $C(G_1) \rtimes_\tau G_2 \cong C^*(\hat{G}_1 \rtimes_\tau G_2)$ as claimed.

Now consider case (iv). If $G_1$ is abelian, then the same reasoning as above leads to a contradiction. If $G_1$ is non-abelian, then $G_1$ is the group $S_3$, hence it has 3 irreducible representations. From the classification of irreducible representations of quantum groups associated with crossed products (see Theorem 3.7 of [22]), $C(G_1) \rtimes_\tau G_2$ has $3 \times 3 = 9$ irreducible representations. This again contradicts the fact that the quantum group $A_J$ has 6 irreducible representations. This shows that case (iv) cannot happen.

Similarly, we show that the dual $B^S$ of the above $A_J$ cannot be expressed in the form $C(G_1) \rtimes_\tau G_2$ either. As above we only need to look at case (iv) above with $G_1 = S_3$ and $\tau$ non-trivial. By remark (3) after the proof of Theorem 3.2 and the analysis in the previous paragraph, the algebra $B^S$ has 6 irreducible representations (which are exactly the irreducible representations of the quantum group $A_J$). By transport of structure, $\tau$ gives an automorphism of order three of the group $S_3$. An examination of the structure of $S_3$ then shows that the action $\tau$ is conjugation by a 3-cycle in $S_3$. From this, using the Mackey Machine we see that the algebra $C(S_3) \rtimes_\tau \mathbb{Z}/3\mathbb{Z}$ has 10 irreducible representations instead of 6: 9 one-dimensional representations and 1 three-dimensional representation. This completes the proof.

Remarks. (1) The group $D_4 = (\mathbb{Z}/2\mathbb{Z})^2 \rtimes_\beta \mathbb{Z}/2\mathbb{Z}$ is the only non-abelian group of order 8 that has a maximal abelian subgroup $T$ isomorphic to $(\mathbb{Z}/2\mathbb{Z})^2$, and $S$ defined above is the only non-trivial skew-symmetric automorphism on $T$. It is easy to see that $C(D_4)_J = C(D_4)$ (see the remark after Proposition 2.3). Since the 8-dimensional quantum group of Kac-Palyutkin [8] clearly contains $T$ as a subgroup, we conclude from Theorem 3.2 that this quantum group is not of the form $C(G)_J$ for any finite group $G$. (2) By the same method as above, we note that the duals of the two examples of 12-dimensional Kac algebras in [5, 14] (as twists of $D_6$ and $Q_4$ respectively) are not isomorphic to a Kac algebra of the form $C(S_3) \rtimes_\tau \mathbb{Z}/2\mathbb{Z}$ (like case (iv) above, this is the only non-trivial case), where $\mathbb{Z}/2\mathbb{Z}$ acts on $S_3$ non-trivially. The second tensor power of one of the 2 two-dimensional irreducible representations of the algebra $L(G)$ of [5] decomposes into a direct sum of one-dimensional subrepresentations, but one can verify using [22] that this does not happen for either of the 2 two-dimensional irreducible representations of the quantum group $C(S_3) \rtimes_\tau \mathbb{Z}/2\mathbb{Z}$. Similarly, we see that the group of one-dimensional representations of the algebra $L(G)$ of [14] (see Example 2.9.(iii) therein) is $\mathbb{Z}/4\mathbb{Z}$, while using [22] we see that the group of one-dimensional representations of the quantum group $C(S_3) \rtimes_\tau \mathbb{Z}/2\mathbb{Z}$ is isomorphic to $\mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/2\mathbb{Z}$.

5. Deformation of Finite Groups of Lie Type

Instead of constructing $q$-deformations of finite groups of Lie type (see Problem 3.1), we now construct a Rieffel type deformation for such finite groups. Let $q = p^d$ be a power of a prime $p$, and let $\mathbb{F}_q$ be the field with $q$ elements. Let $G$ be a finite group of Lie type over $\mathbb{F}_q$, e.g., $GL(n, \mathbb{F}_q)$, $SL(n, \mathbb{F}_q)$, $PSL(n, \mathbb{F}_q)$ or $Sp(n, \mathbb{F}_q)$.
Let $T$ be a maximal abelian subgroup of $G$ ($T$ can be viewed as the points over $\mathbb{F}_q$ of a maximal torus of the algebraic group corresponding to $G$). Apart from trivial cases like $GL(2, \mathbb{F}_2) = SL(2, \mathbb{F}_2) = PSL(2, \mathbb{F}_2) = S_3$, $T$ has non-trivial skew-symmetric automorphisms $S$. Letting $J = S \oplus (-S)$, we can use Theorem 3.2 to obtain a deformation $C(G)_J$. In general, the quantum groups $C(G)_J$ should be non-trivial in the sense of the previous sections.

Take for example the simplest case $GL(2, \mathbb{F}_q)$. Then $T$ consists of matrices of the form
$$\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}$$
with $a, b \in \mathbb{F}_q^*$. Hence $T$ is isomorphic to the group $(\mathbb{Z}/(q-1)\mathbb{Z})^2$. As in the previous section, we have on $T$ the canonical skew-symmetric automorphism
$$S = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$
Using Theorem 3.2, we can form the deformation $C(GL(2, \mathbb{F}_q))_J$. If $q \neq 2, 3$, then the remark after Proposition 2.3 shows that $C(GL(2, \mathbb{F}_q))_J$ is noncommutative and noncocommutative.

Apart from trivial cases such as those mentioned in the beginning of this section, it is not hard to see that $C(G)_J$ is noncommutative and noncocommutative for an arbitrary finite group of Lie type $G$. However, it is not so clear how to show that $C(G)_J$ cannot be expressed as a crossed product of the form $C(G_1) \rtimes_\tau G_2$ for some finite groups $G_1$ and $G_2$. We plan to study this further in the future. Since there are many skew-symmetric automorphisms on the maximal abelian subgroup of a finite group of Lie type $G$, $C(G)_J$ is not the analog of the $q$-deformation of Drinfeld and Jimbo of infinite Lie groups.

Acknowledgement. The author would like to thank Professor Marc Rieffel for continual encouragement and support, who urged the author to clarify the relationship of the construction in this paper and the construction of Enock and Vainerman [5]. He also thanks C.-S. Chu, A. Weinstein and S.L. Woronowicz for their helpful comments. The main results of this paper were obtained while the author was visiting IHES during the year July, 1995 – August, 1996. He thanks the IHES for its financial support and hospitality during this period. The author also wishes to thank the Department of Mathematics at UC-Berkeley for its support and hospitality while he held an NSF Postdoctoral Fellowship there during the final stage of this paper. He thanks the referee for pointing out an error at the end of the proof of Theorem 3.2 in the original manuscript.

References 1. Baaj, S. and Skandalis, G.: Unitaires multiplicatifs et dualite pour les produits croises de C ∗ -algebres. Ann. Sci. Ec. Norm. Sup. 26, 425–488 (1993) 2. Connes, A.: Noncommutative Geometry. London–New York: Academic Press, 1994 3. Connes, A.: Noncommutative geometry and reality. J. Math. Phys. 36 11, 6194–6231 (1995) 4. Drinfeld, V.G.: Quasi-Hopf algebras. Leningrad Math. J.1 (6), 1419–1457 (1990) 5. Enock, M. and Vainerman, L.: Deformation of a Kac algebra by an abelian subgroup. Commun. Math. Phys. 178, 571–596 (1996) 6. Hewitt, E. and Ross, K.: Abstract Harmonic Analysis II. Berlin–Heidelberg–New York: Springer-Verlag, 1970 7. Kac, G.: Certain arithmetic properties of ring groups. Funct. Anal. Appl. 6, 158–160 (1972) 8. Kac, G. and Palyutkin, V.: Finite ring groups. Trans. Moscow Math. Soc. 15, 251–294 (1966) 9. Landstad, M.B.: Quantizations arising from abelian subgroups. Int. J. Math. 5, 897–936 (1994) 10. Lanstad, M.B. and Raeburn, I.: Twisted dual-group algebras: Equivariant deformations of C0 (G). J. Funct. Anal. 132, 43–85 (1995) 11. Levendorskii, S.: Twisted algebra of functions on compact quantum group and their representations. St. Petersburg Math. J. 3 2, 405–423 (1992) 12. Levendorskii, S. and Soibelman, Y.: Algebra of functions on compact quantum groups Schubert cells, and quantum tori. Commun. Math. Phys. 139, 141–170 (1991) 13. von Neumann, J.: Die Eindeutigkeit der Schr¨odingerschen Operatoren. Math. Ann. 104, 570–578 (1931) 14. Nikshych, Dmitri: K0 rings and twisting of finite dimensional semisimple Hopf algebras. Preprint, National Technical University of Ukraine “Kiev Polytechnic Institute”, 1997 15. Rieffel, M.: Deformation quantization for actions of Rd . Memoirs A.M.S. no. 506, 1993 16. Rieffel, M.: Compact quantum groups associated with toral subgroups. Contemp. Math. 145, 465–491 (1993) 17. Rieffel, M.: Non-compact quantum groups associated with abelian subgroups. Commun. Math. Phys. 171, 181–201 (1995) 18. 
Vainerman, L.I.: 2-cocycles and twisting of Kac algebras. Commun. Math. Phys. 191 3, 697–721 (1998) 19. Vaksman, L. and Soibelman, Y.: The algebra of functions on quantum SU (2). Funct. Anal. ego Pril. 223, 1–14 (1988) 20. Van Daele, A. and Wang, S. Z.: Universal quantum groups. Int. J. Math 7 2, 255–264 (1996) 21. Wang, S.Z.: Free products of compact quantum groups. Commun. Math. Phys. 167 3, 671–692 (1995) 22. Wang, S.Z.: Tensor products and crossed products of compact quantum groups. Proc. London Math. Soc. 71 3, 695–720 (1995) 23. Wang, S.Z.: Krein duality for compact quantum groups. J. Math. Phys. 38 1, 524–534 (1997) 24. Wang, S. Z.: Deformations of compact quantum groups via Rieffel’s quantization. Commun. Math. Phys. 178 3, 747–764 (1996)

25. Wang, S.Z.: Problems in the theory of quantum groups. In: Quantum Groups and Quantum Spaces, Banach Center Publication 40 (1997), Inst. of Math., Polish Acad. Sci., Editors: R. Budzynski, W. Pusz, and S. Zakrzewski, pp. 67–78 26. Woronowicz, S.L.: Twisted SU (2) group. An example of noncommutative differential calculus. Publ. RIMS, Kyoto Univ. 23, 117–181 (1987) 27. Woronowicz, S.L.: Compact matrix pseudogroups. Commun. Math. Phys. 111, 613–665 (1987) 28. Woronowicz, S.L.: Tannaka–Krein duality for compact matrix pseudogroups. Twisted SU (N ) groups. Invent. Math. 93, 35–76 (1988) Communicated by A. Connes

Commun. Math. Phys. 202, 309 – 323 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Operations on Cyclic Homology, the X Complex, and a Conjecture of Deligne

Masoud Khalkhali

Department of Mathematics, University of Western Ontario, London ON, N6A 5B7, Canada. E-mail: [email protected]

Received: 24 December 1996 / Accepted: 10 October 1998

Abstract: We prove that there is a product on the Hochschild and cyclic chain complex of a homotopy Gerstenhaber algebra. By restricting to the special case of the algebra of Hochschild cochains (the so called deformation complex), we obtain operations on cyclic homology of associative algebras.

1. Introduction

The goal of this article is to relate recent developments in cyclic homology theory [3] and the theory of operads and homotopical algebra [6, 8], and hence to provide a general framework to define and study operations in cyclic homology theory. The link here is the bar construction. In [4], P. Deligne conjectured that the Hochschild cochain complex of an associative algebra, also called the deformation complex of the algebra, has a natural structure of an algebra over a singular chain operad of the little squares operad. This conjecture has now been proved by M. Kontsevich [16]. It should be noted, however, that the results of the present paper in no way depend on the geometric form of this conjecture; we do not use the conjecture in its original geometric form.

A closely related statement recently proved by Gerstenhaber and Voronov [6] states that the deformation complex of an associative algebra has a natural structure of a homotopy Gerstenhaber algebra, also called a homotopy G algebra. This result completes Gerstenhaber's earlier work in [5] in the sense that it reveals the full structure of higher homotopies in the deformation complex. In a sense, this result shows that there is a natural "quantum group" structure on the bar construction of the deformation complex. It is this algebraic version of the conjecture that is most useful for constructing the operations. In fact, we found it more conceptual to go beyond the deformation complex for associative algebras and define certain operations on the Hochschild and cyclic complexes of homotopy G algebras (Theorem 8 below). As an application, by specializing to the homotopy G algebra structure of the deformation complex, we obtain operations similar to those constructed by Nest and Tsygan in their study of algebraic index theorems [12]. In view of the increasing importance of homotopy G algebras, and of the theory of operads in general, in mathematics and mathematical physics (see, for example, [6, 8, 11, 15] and references therein), we hope that Theorem 8, and especially its method of proof, which is non-computational and lends itself to generalizations to algebras over operads, will prove useful in applications of noncommutative geometry and cyclic homology.

This paper is organized as follows. In Sect. 2 we recall the notion of homotopy G algebra and especially its formulation in terms of the bar construction from [6]. In Sect. 3 we define operations on the Hochschild and cyclic complex of homotopy G algebras. A central tool here is the notion of X complex and its refinements due to Cuntz and Quillen [3]. By a result of Quillen [14], cyclic and Hochschild chain complexes appear as the X complex of the bar construction. From this point of view, operations on the cyclic and Hochschild complex of homotopy G algebras are predicted by Künneth formulas for the X complex of differential graded coalgebras. Section 4 is mainly devoted to deriving explicit formulas in the context of Connes' b, B bicomplex and also to specializing to the case of the deformation complex.

I am much obliged to Maxim Kontsevich for a very informative communication on Deligne's conjecture.

2. Homotopy Gerstenhaber Algebras

This section is based on [6]. In an attempt to make the paper as self-contained as possible, we have reproduced the proofs of the main statements. Let $V = \bigoplus_{i \in \mathbb{Z}} V_i$ be a $\mathbb{Z}$-graded linear space. We use $|x|$ to denote the degree of a homogeneous element $x \in V$. A brace algebra structure on $V$ is given by a collection of linear homogeneous maps of degree zero, indexed by $n \ge 0$,

$$V \otimes V^{\otimes n} \longrightarrow V, \qquad x \otimes x_1 \otimes \cdots \otimes x_n \mapsto x\{x_1, \cdots, x_n\},$$

such that, for all $m, n$, the following higher pre-Jacobi identities are satisfied:

$$x\{x_1, \cdots, x_m\}\{y_1, \cdots, y_n\} = \sum_{0 \le i_1 \le j_1 \le \cdots \le i_m \le j_m \le n} (-1)^{\varepsilon}\, x\{y_1, \cdots, y_{i_1}, x_1\{y_{i_1+1}, \cdots, y_{j_1}\}, \cdots, x_m\{y_{i_m+1}, \cdots, y_{j_m}\}, \cdots, y_n\}, \tag{1}$$

where $x_i$ and $y_j$ are homogeneous elements and $\varepsilon = \sum_{p=1}^{m} \big(|x_p| \sum_{q=1}^{i_p} |y_q|\big)$. The degree zero assumption simply means that $|x\{x_1, \cdots, x_n\}| = |x| + \sum_i |x_i|$. We also assume that for $n = 0$ the resulting map $x \mapsto x\{\} : V \longrightarrow V$ is the identity. For example, as a consequence of (1), one checks that the bracket

$$[x, y] := x\{y\} - (-1)^{|x||y|}\, y\{x\}$$

defines a graded Lie algebra structure on $V$. Indeed, putting $m = n = 1$ in (1), one obtains

$$x\{y\}\{z\} - x\{y\{z\}\} = x\{y, z\} + (-1)^{|y||z|}\, x\{z, y\},$$


which measures the failure of the operation $(x, y) \mapsto x\{y\}$ to be associative. From this the graded Jacobi identity easily follows.

A brace algebra structure on $V$ has a particularly simple interpretation in terms of the tensor coalgebra of $V[1]$. Let $V[1]$ be the desuspension of $V$ defined by $V[1]_n = V_{n+1}$. Let

$$T(V[1]) = \bigoplus_{n \ge 0} (V[1])^{\otimes n}$$

be the tensor coalgebra of $V[1]$ with its coproduct

$$\Delta : T(V[1]) \longrightarrow T(V[1]) \otimes T(V[1]), \qquad \Delta(x_1, \cdots, x_n) = \sum_{i=0}^{n} (x_1, \cdots, x_i) \otimes (x_{i+1}, \cdots, x_n),$$

where we have denoted a tensor $x_1 \otimes \cdots \otimes x_n$ in $T(V[1])$ by $(x_1, \cdots, x_n)$. Note that $T(V[1])$ is bigraded. Its horizontal grading is denoted by $\deg$ and defined by $\deg(x_1, \cdots, x_n) = n$, and its vertical grading, denoted $|\ |_d$, is given by $|(x_1, \cdots, x_n)|_d = \sum_i |x_i| - n$. The total grading is hence given by $|(x_1, \cdots, x_n)| = \sum_i |x_i|$. In the definition of $T(V[1])$ we are implicitly assuming that we are working with the total complex, which is $\mathbb{Z}$-graded. A linear homogeneous (with respect to the total grading) map of total degree zero,

$$\cup : T(V[1]) \otimes T(V[1]) \longrightarrow T(V[1]),$$

is called left increasing if $\deg(\alpha \cup \beta) \ge \deg(\alpha)$. We always assume that $\cup$ is counital.

Lemma 1. There is a natural 1-1 correspondence between brace algebra structures on $V$ and left increasing bialgebra structures on $T(V[1])$.

Proof. Since $T(V[1])$ is the free coalgebra generated by the desuspension $V[1]$, we have a natural 1-1 correspondence between coalgebra morphisms $\cup : T(V[1]) \otimes T(V[1]) \longrightarrow T(V[1])$ and linear maps $m : T(V[1]) \otimes T(V[1]) \longrightarrow V[1]$. Given $m$, the $n$th component of $\cup$ is defined by

$$\cup_n = m^{\otimes n} \circ \tilde{\Delta}^{(n-1)}, \tag{2}$$

where $\tilde{\Delta}^{(n)}$ denotes the $n$th iteration of the coproduct $\tilde{\Delta}$ of the coalgebra $T(V[1]) \otimes T(V[1])$. Conversely, given $\cup$, $m$ is just the degree one component of $\cup$. It is clear that $\cup$ is left increasing iff, for $n \ge 2$, $m|_{V[1]^{\otimes n} \otimes T(V[1])} = 0$. In this case, let us denote the map $m : V[1] \otimes T(V[1]) \longrightarrow V[1]$ by $x\{x_1, \cdots, x_n\}$. Then, using (2), we get an explicit formula for the coalgebra map $\cup : T(V[1]) \otimes T(V[1]) \longrightarrow T(V[1])$.


It is given by

$$(x_1, \cdots, x_m) \cup (y_1, \cdots, y_n) = \sum_{0 \le i_1 \le j_1 \le \cdots \le i_m \le j_m \le n} (-1)^{\varepsilon}\, (y_1, \cdots, y_{i_1}, x_1\{y_{i_1+1}, \cdots, y_{j_1}\}, \cdots, x_m\{y_{i_m+1}, \cdots, y_{j_m}\}, \cdots, y_n), \tag{3}$$

where $\varepsilon$ is the same as in (1). It remains to check that $\cup$ is associative iff the braces satisfy the higher pre-Jacobi identities (1). Assume $\cup$ is associative. Then in particular we have $(\alpha \cup \beta) \cup \gamma = \alpha \cup (\beta \cup \gamma)$, where $\alpha = x$, $\beta = (x_1, \cdots, x_m)$ and $\gamma = (y_1, \cdots, y_n)$. Taking the degree one component of both sides, one obtains (1). Conversely, assume the braces satisfy the pre-Jacobi identities. It is easy to see that both maps $\cup \circ (\cup \otimes 1)$ and $\cup \circ (1 \otimes \cup) : T(V[1]) \otimes T(V[1]) \otimes T(V[1]) \longrightarrow T(V[1])$ are coalgebra maps. By the universal property of $T(V[1])$, the two maps are the same provided their degree one components coincide. One then checks that this is equivalent to the brace identity (1). The lemma is proved.

In the rest of this paper we only consider left increasing multiplications on $T(V[1])$. Here is an example of a brace algebra. This example is due to Getzler [7]. Let $A$ be a linear space and let $V_n = \mathrm{Hom}(A^{\otimes n}, A)$. One defines a brace algebra structure on $V[1]$ by setting

$$x\{x_1, \cdots, x_m\}(a_1, \cdots, a_n) = \sum_{0 \le i_1 \le \cdots \le i_m \le n} (-1)^{\varepsilon}\, x(a_1, \cdots, x_1(a_{i_1+1}, \cdots, a_{i_1+d_1}), \cdots, x_m(a_{i_m+1}, \cdots, a_{i_m+d_m}), \cdots, a_n), \tag{4}$$

where $d_i = |x_i| + 1$, $n = 1 + |x| + \sum_i |x_i|$ and $\varepsilon = \sum_{p=1}^{m} |x_p|\, i_p$. Checking (1) is straightforward.

A homotopy G algebra (G stands for Gerstenhaber) is a differential graded (DG) associative algebra equipped with a system of "higher homotopies" so that its cohomology is a graded Poisson algebra. More precisely, let $(V, \delta)$ be a DG algebra where we assume the differential has degree $+1$. A homotopy G algebra structure on $V$ is given by a brace algebra structure on the desuspension $V[1]$ of $V$ such that the following axioms are satisfied:

$$(x_1 x_2)\{y_1, \cdots, y_n\} = \sum_{k=0}^{n} (-1)^{\varepsilon}\, x_1\{y_1, \cdots, y_k\}\, x_2\{y_{k+1}, \cdots, y_n\}, \tag{5}$$

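where $\varepsilon = (|x_2| - 1)(|y_1| + \cdots + |y_k| - k)$, and

$$\begin{aligned} &\delta(x\{x_1, \cdots, x_{n+1}\}) - \delta x\{x_1, \cdots, x_{n+1}\} - (-1)^{|x|-1} \sum_{i=1}^{n+1} (-1)^{\varepsilon}\, x\{x_1, \cdots, \delta x_i, \cdots, x_{n+1}\} \\ &\quad = -(-1)^{(|x|-1)(|x_1|-1)}\, x_1 . x\{x_2, \cdots, x_{n+1}\} + (-1)^{|x|-1} \sum_{i=1}^{n} (-1)^{i+n}\, x\{x_1, \cdots, x_i x_{i+1}, \cdots, x_{n+1}\} - x\{x_1, \cdots, x_n\}\, x_{n+1}, \end{aligned} \tag{6}$$

where $\varepsilon = \sum_{k=1}^{i-1} |x_k|$.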


A large class of homotopy G algebras are constructed as follows [6]. Let $V$ be a brace algebra and $m \in V_1$ a degree one element such that $m\{m\} = 0$. One defines a DG algebra structure on $V[-1]$ by defining a differential and a product by

$$\delta x = (-1)^{|x|} [m, x], \qquad xy = m\{x, y\}. \tag{7}$$

Using the brace relations (1), one then checks that the axioms of a homotopy G algebra are satisfied.

The axioms of a homotopy G algebra structure on $V$ can be conceptually encoded in terms of the bar construction $BV$. Let us describe this correspondence. We first need a definition. Let $(V, \delta)$ be a DG algebra where we assume the differential has degree $+1$. Recall that the bar construction of $V$, denoted $BV$, is a differential graded coalgebra whose underlying coalgebra is the tensor coalgebra $T(V[2])$ and its differential is the total differential $b' + \delta$. The individual differentials $b', \delta : BV \longrightarrow BV$ are defined by

$$b'(x_1, \cdots, x_n) = \sum_{i=1}^{n-1} (-1)^{i-1} (x_1, \cdots, x_i x_{i+1}, \cdots, x_n),$$

$$\delta(x_1, \cdots, x_n) = (-1)^{n} \sum_{i=1}^{n} (-1)^{|x_1| + \cdots + |x_{i-1}|} (x_1, \cdots, \delta x_i, \cdots, x_n).$$

Note that both $b'$ and $\delta$ have total degree $+1$. Also note that the total degree of $\alpha = (x_1, \cdots, x_n) \in BV$ is given by $|\alpha| = \sum_i |x_i| - n$. Next recall that a DG bialgebra is by definition a bialgebra object in the abelian tensor category of DG linear spaces (cochain complexes). In particular the differential of a DG bialgebra is simultaneously a derivation and a coderivation.

Lemma 2. Let $V$ be a DG algebra. Then there is a natural 1-1 correspondence between homotopy G algebra structures on $V$ and DG bialgebra structures on the bar construction $BV$.

Proof. By the above lemma, we have a natural 1-1 correspondence between brace algebra structures on $V[1]$ and bialgebra structures on $BV$. So all that we need to prove is that the axioms of homotopy G algebras (5), (6) are equivalent to the differential $b' + \delta$ being a derivation of $BV$. That is, for all $\alpha, \beta \in BV$,

$$(b' + \delta)(\alpha \cup \beta) = (b' + \delta)\alpha \cup \beta + (-1)^{|\alpha|}\, \alpha \cup (b' + \delta)\beta. \tag{8}$$

Now, since both $b'$ and $\delta$ are coderivations of $BV$ and $\cup$ is a coalgebra map, it follows that (8) holds if and only if the degree one components of both sides coincide. Note that the only possible contributions to degree one components are from the following two choices: $\alpha = x$, $\beta = (x_1, \cdots, x_n)$ and $\alpha = (x_1, x_2)$, $\beta = (y_1, \cdots, y_m)$. Computing the first order terms in the expansions, we find that (8) is equivalent to (5), (6). The lemma is proved.

Given a homotopy G algebra $V$, let $H(V)$ denote the cohomology of the complex $(V, \delta)$. The product and the Lie bracket in $V$, being compatible with the differential $\delta$, descend to $H(V)$ and define an associative product and a Lie algebra structure on $H(V)$. Moreover, the homotopy formulas in (5) and (6) can be used to show that the associative product in $V$ is, up to homotopy, graded commutative and the Lie bracket is, again up to homotopy, a derivation with respect to the associative product. It thus follows that $H(V)$ is a graded Poisson algebra, also known as a Gerstenhaber algebra (G algebra). This simply means that the product in cohomology is graded commutative and the Lie bracket is a derivation with respect to the product.

Examples of graded Poisson algebras and homotopy G algebras abound in algebraic topology, geometry and mathematical physics. By a classical result of F. Cohen, the cohomology groups of configuration spaces form a universal model for graded Poisson algebras, in the sense that any graded Poisson algebra is an algebra over the latter as an operad. Concrete examples of G algebras include the semi-infinite cohomology of string theory [11], the algebra of polyvector fields on a manifold, the Koszul complex of Lie algebras and, more generally, the deformation cohomology of any associative algebra, to be discussed in more detail in the next paragraph. Examples of homotopy G algebra structures that are just emerging include the homotopy G algebra structure of topological field theory [11], the homotopy G algebra structure on singular cochains on a topological space [6] and finally the deformation complex of associative algebras, which we describe next. This structure also appears in the recent work of M. Kontsevich on deformation quantization of Poisson manifolds [15].

Let $A$ be an associative algebra and let $C(A, A)$ denote the deformation complex of $A$. This is the standard complex that calculates the Hochschild cohomology $H^{\bullet}(A, A)$. We have $C^n(A, A) = \mathrm{Hom}(A^{\otimes n}, A)$. It thus follows from (4) that there is a brace algebra structure defined on $C(A, A)[1]$. Let $m : A \otimes A \longrightarrow A$ be the multiplication of $A$. One has $m\{m\} = 0$, which is equivalent to associativity of $m$. It is easy to check that the differential and the product induced on $C(A, A)$ by (7) coincide, respectively, with the Hochschild coboundary and the cup product on $C(A, A)$.
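The statement that $m\{m\} = 0$ is equivalent to associativity can be checked on a toy example. The following is only a sketch under stated assumptions: cochains are modelled as pairs (arity, Python function) with integer values, the helper name brace1 is hypothetical, and only the single brace $x\{y\}$ (the case $m = 1$ of formula (4), with $|y|$ taken to be the arity of $y$ minus one) is implemented.

```python
from typing import Callable, Tuple

# A cochain is modelled as (arity, function); its degree in V[1] is arity - 1.
Cochain = Tuple[int, Callable[..., int]]


def brace1(x: Cochain, y: Cochain) -> Cochain:
    """Single brace x{y}, i.e. formula (4) with m = 1 (hypothetical helper)."""
    nx, fx = x
    ny, fy = y
    deg_y = ny - 1           # |y| in V[1]
    arity = nx + ny - 1      # n = 1 + |x| + |y|, written in terms of arities

    def result(*a: int) -> int:
        assert len(a) == arity
        total = 0
        for i in range(nx):  # i = number of arguments placed before y
            inner = fy(*a[i:i + ny])
            sign = -1 if (deg_y * i) % 2 else 1   # (-1)^{|y| i}
            total += sign * fx(*a[:i], inner, *a[i + ny:])
        return total

    return arity, result


# Associative multiplication: m{m}(a, b, c) = (ab)c - a(bc) vanishes.
mul: Cochain = (2, lambda a, b: a * b)
# Non-associative operation: the same expression does not vanish.
sub: Cochain = (2, lambda a, b: a - b)
```

For the integer multiplication the resulting 3-cochain is identically zero, while for subtraction it is not, in line with the equivalence stated above.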
One thus obtains a homotopy G algebra structure on $C(A, A)$, first discovered by Gerstenhaber and Voronov in [6]. As we will see in the next section, this homotopy G algebra structure is at the heart of operations on Hochschild and cyclic homology.

3. Operations on Homotopy G Algebras

Let $C$ and $D$ be DG coalgebras and let $A$ be an algebra. Our goal in this section is to show that any morphism of DG coalgebras $C \otimes D \longrightarrow BA$ induces a natural morphism of supercomplexes

$$\hat{X}(C) \otimes \hat{X}(D) \longrightarrow \hat{X}(BA), \tag{9}$$

where $X$ is the X complex functor of Cuntz and Quillen. We will then apply this result to the structure map of a homotopy G algebra to construct operations on the cyclic and Hochschild homology of homotopy G algebras. Note that, in general, there is no natural map $X(C) \otimes X(D) \longrightarrow X(C \otimes D)$; otherwise defining (9) would be a trivial matter. This is simply because $X$ only captures homological information up to dimension one. Instead, we obtain (9) as a composition

$$\hat{X}(C) \otimes \hat{X}(D) \longrightarrow \hat{X}^2(C \otimes D) \longrightarrow \hat{X}^2(BA) \longrightarrow \hat{X}(BA),$$

where $X^2$ is a certain refinement of $X$ to capture degree 2 homology classes. Despite the fact that there is no natural transformation $X^2 \longrightarrow X$, we can however make use of the fact that the underlying coalgebra of $BA$ is free and show that $\hat{X}(BA)$ is a deformation retract of $\hat{X}^2(BA)$. This gives the last map in the above sequence. The first map is


simply the DG coalgebra analogue of a map constructed by M. Puschnigg in his study of Künneth formulas in cyclic homology [13].

We need to adapt some basic definitions and constructions from [9] to our DG coalgebraic setup. Let $C$ be a DG coalgebra and let $(\Omega C, d)$ denote the DG coalgebra of universal codifferential forms over $C$. Let $\eta : C \longrightarrow k$ be the counit of $C$. We have $\Omega^n C = C \otimes \bar{C}^{\otimes n}$, where $\bar{C} = \ker \eta$. Let $b : \Omega^{\bullet} C \longrightarrow \Omega^{\bullet+1} C$ be the analogue of the Hochschild boundary operator and let $N$ be the number operator which multiplies a differential form by its degree. Let $\Omega^{\mathrm{norm}} C = \ker\{(b + dN)^2 : \Omega C \longrightarrow \Omega C\}$. Equipped with the differential $b + dN$ and with its natural $\mathbb{Z}/2$ grading, $(\Omega^{\mathrm{norm}} C, b + dN)$ can be regarded as a supercomplex. There is a decreasing filtration $\{F^n \Omega^{\mathrm{norm}} C\}_{n \ge 2}$ on $\Omega^{\mathrm{norm}} C$, where $F^n$ consists of forms of degree at least $n$. The successive quotient complexes $\Omega^{\mathrm{norm}} C / F^n$ approximate the normalized cyclic bicomplex for DG coalgebras. We need only the first two quotients, denoted by $X(C)$ and $X^2(C)$. These are the supercomplexes

$$X(C) : \quad C \;\underset{d}{\overset{b}{\rightleftarrows}}\; \Omega^1 C_{\natural}, \qquad\qquad X^2(C) : \quad C \oplus \Omega^2 C_{\natural} \;\underset{b+d}{\overset{b+2d}{\rightleftarrows}}\; \dot{\Omega}^1 C,$$

where $\natural$ denotes the cocommutator subspace and $\dot{\Omega}^1 C = \Omega^{\mathrm{norm},1} C$. Note that $\Omega^1 C_{\natural} \subset \dot{\Omega}^1 C$. We use $\partial_1$ (resp. $\partial_2$) to denote the horizontal (resp. vertical) differentials in $X(C)$ and $X^2(C)$. We are mostly interested in the total complexes of these bicomplexes, which we denote by $\hat{X}(C)$ and $\hat{X}^2(C)$. We express an even, or odd, element of $\hat{X}(C)$ as $\omega_0 + \omega_1$, where $\omega_0 \in C$ and $\omega_1 \in \Omega^1 C_{\natural}$. Similarly we write $\omega_0 + \omega_2 + \omega_1$ to denote an even, or odd, element of $\hat{X}^2(C)$. Note that we have a natural morphism of supercomplexes

$$I : \hat{X}(C) \longrightarrow \hat{X}^2(C),$$

obtained from the inclusions $C \longrightarrow C \oplus \Omega^2 C_{\natural}$ and $\Omega^1 C_{\natural} \longrightarrow \dot{\Omega}^1 C$.

In general, there is no natural map $\hat{X}^2(C) \longrightarrow \hat{X}(C)$. However, we would like to show that if $C = BA$ is the bar construction, then $I$ is a homotopy equivalence, and to find an explicit homotopy inverse $R : \hat{X}^2(C) \longrightarrow \hat{X}(C)$. The easiest way to find $R$ is to apply homological perturbation theory. Indeed, a simple version of the so-called perturbation lemma, which we recall now, is enough for our purpose. Recall that a (super)complex $(L, \partial_1)$ is a special deformation retract of a (super)complex $(M, \partial_1)$ if there are chain maps

$$L \xrightarrow{\;i\;} M \xrightarrow{\;r\;} L$$

and a homotopy $h : M \longrightarrow M$, of odd degree, such that $ri = 1_L$, $ir = 1_M + \partial_1 h + h \partial_1$ and $hi = 0$. In particular $i$ is a homotopy equivalence and $r$ is a homotopy inverse to $i$. Let us perturb the differentials to $\partial_1 + \partial_2$ and assume that $\partial_2 i = i \partial_2$. It is natural to ask if $(L, \partial_1 + \partial_2)$ remains a deformation retract of $(M, \partial_1 + \partial_2)$. It is not difficult to show that this is indeed the case, provided the operator $K = \sum_{n \ge 0} (\partial_2 h)^n$ can be rigorously defined. In this case one shows that the chain maps

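$$L \xrightarrow{\;I\;} M \xrightarrow{\;R\;} L$$

and the homotopy $H : M \longrightarrow M$ defined by $I = i$, $R = rK$ and $H = hK$, provide a special deformation retract of $(M, \partial_1 + \partial_2)$ to $(L, \partial_1 + \partial_2)$. In our applications $K$ will be a finite sum.

Let $C = BA$ be the bar construction of an algebra $A$ with its counit $\eta : BA \longrightarrow k$ and let $\overline{BA} = \ker \eta = \bigoplus_{n \ge 1} A^{\otimes n}$. We have $\Omega^1 BA = BA \otimes \overline{BA} \simeq BA \otimes A \otimes BA$, $\Omega^1 BA_{\natural} \simeq A \otimes BA$, and $\Omega^2 BA = BA \otimes \overline{BA} \otimes \overline{BA}$. We fix a left inverse $\theta : \Omega^1 BA \longrightarrow \Omega^1 BA_{\natural}$ for the inclusion $\Omega^1 BA_{\natural} \hookrightarrow \Omega^1 BA$, defined by $\theta(\alpha \otimes a \otimes \beta) = \eta(\alpha)\, a \otimes \beta$.

From [9], one knows that one can use connections to construct homotopy operators with good algebraic properties for the Hochschild and cyclic complex of DG coalgebras. Let us define an operator $\nabla : \Omega^2 BA \longrightarrow \Omega^1 BA$, which is supported on $BA \otimes \overline{BA} \otimes A$, by the formula

$$\nabla(\beta \otimes \alpha \otimes a) = \alpha \otimes a \otimes \beta,$$

where $\beta \in BA$, $\alpha \in \overline{BA}$, and $a \in A$. Define an odd operator $h_0 : \hat{X}^2(BA) \longrightarrow \hat{X}^2(BA)$ via the formula $h_0(\omega_0 + \omega_2 + \omega_1) = \nabla \omega_2$. Also define even operators $r_0 : \hat{X}^2(BA) \longrightarrow \hat{X}(BA)$ and $i_0 : \hat{X}(BA) \longrightarrow \hat{X}^2(BA)$ via the formulas $r_0(\omega_0 + \omega_2 + \omega_1) = \omega_0 + \theta \omega_1$, and $i_0 = I$.

Lemma 3. $(r_0, h_0, i_0)$ is a special deformation retract of $(\hat{X}^2(BA), b)$ to $(\hat{X}(BA), b)$.

Proof. The relations $r_0 i_0 = 1$ and $h_0 i_0 = 0$ are easy to verify. The relation $i_0 r_0 = 1 + b h_0 + h_0 b$ amounts to $\theta \omega_1 = \omega_2 + \omega_1 + b \nabla \omega_2 + \nabla b \omega_1$. This is equivalent to showing that, for all $\omega_1 \in \dot{\Omega}^1 BA$ and $\omega_2 \in \Omega^2 BA_{\natural}$,

$$\omega_1 + \nabla b \omega_1 = \theta \omega_1, \qquad \omega_2 + b \nabla \omega_2 = 0.$$

While it is possible to prove these relations by a direct computation, it is perhaps more instructive to prove the corresponding dual statements for the tensor coalgebra $TA$. In this case the connection $\nabla : \Omega^1 TA \longrightarrow \Omega^2 TA$ is given by $\nabla(\alpha \otimes a \otimes \beta) = \beta\, d\alpha\, da$. To prove the first relation let $\omega_1 = \alpha \otimes a \otimes \beta$. We have

$$b \nabla \omega_1 = b(\beta\, d\alpha\, da) = -b(da\, \beta\, d\alpha) = [da\, \beta, d\alpha] = da\, \beta \alpha - \alpha\, da\, \beta,$$

and hence

$$b \nabla \omega_1 + \omega_1 = da\, \beta \alpha = \theta \omega_1.$$

To prove the second relation, let $\omega_2 = a_0\, da_1\, da_2$. We have

$$\nabla b \omega_2 = -\nabla [a_0\, da_1, a_2] = -\nabla (a_0\, da_1\, a_2 - a_2 a_0\, da_1) = -a_2\, da_0\, da_1 + d(a_2 a_0)\, da_1 = da_2\, a_0\, da_1 = -a_0\, da_1\, da_2 = -\omega_2.$$

The lemma is proved.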
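For the reader's convenience, the two relations called easy to verify in the proof of Lemma 3 can be spelled out as follows. This is only a sketch, using that $i_0 = I$ is induced by the inclusions (so the $\omega_2$-component of $i_0(\omega_0 + \omega_1)$ is zero) and that $\theta$ is a left inverse of the inclusion $\Omega^1 BA_{\natural} \hookrightarrow \Omega^1 BA$:

```latex
\begin{aligned}
i_0(\omega_0 + \omega_1) &= \omega_0 + 0 + \omega_1, \\
r_0 i_0(\omega_0 + \omega_1) &= \omega_0 + \theta\omega_1 = \omega_0 + \omega_1
  \qquad (\omega_1 \in \Omega^1 BA_{\natural}), \\
h_0 i_0(\omega_0 + \omega_1) &= \nabla(0) = 0 .
\end{aligned}
```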


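To pass from the $b$-complex to the $\partial_1$-complex, we compute the operator $k : \hat{X}^2 \longrightarrow \hat{X}^2$. It is given by

$$k(\omega_0 + \omega_2 + \omega_1) = (\omega_0 + d \nabla \omega_2) + \omega_2 + \omega_1.$$

Invoking the perturbation lemma, let us now define the operators $h$, $r$ and $i$ by the formulas

$$h(\omega_0 + \omega_2 + \omega_1) = \nabla \omega_2, \qquad r(\omega_0 + \omega_2 + \omega_1) = (\omega_0 + d \nabla \omega_2) + \theta \omega_1,$$

and $i = I$.

Lemma 4. $(r, h, i)$ is a special deformation retract of $(\hat{X}^2(BA), \partial_1)$ to $(\hat{X}(BA), \partial_1)$.

We use the perturbation lemma once again to pass from the $\partial_1$-complex to the $\partial_1 + \partial_2$ complex. The operator $K$ is now given by

$$K(\omega_0 + \omega_2 + \omega_1) = \omega_0 + \omega_2 + (\omega_1 + \partial_2 \nabla \omega_2).$$

Let us define the operators $R$ and $H$ by

$$R(\omega_0 + \omega_2 + \omega_1) = (\omega_0 + d \nabla \omega_2) + \theta(\omega_1 + \partial_2 \nabla \omega_2), \qquad H(\omega_0 + \omega_2 + \omega_1) = \nabla \omega_2. \tag{10}$$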

Proposition 5. $(R, H, I)$ is a special deformation retract of $\hat{X}^2(BA)$ to $\hat{X}(BA)$.

Although we won't need it in this paper, we note that the above proposition and its proof remain valid in the more general case where $A$ is a DG algebra.

In his study of Künneth formulas in cyclic homology [13], M. Puschnigg constructed a natural map

$$X^2(A \otimes B) \longrightarrow X(A) \otimes X(B),$$

where $A$ and $B$ are algebras and the tensor product of supercomplexes is understood on the right-hand side. This map lifts Connes' external product in cyclic homology [1] to the level of chains in the X complex. It is given by

$$\begin{aligned} a_0 b_0 &\mapsto a_0 \otimes b_0, \\ a_0 b_0\, d(a_1 b_1) &\mapsto \tfrac{1}{2}\, a_0\, da_1 \otimes [b_0, b_1]_+ + \tfrac{1}{2}\, [a_0, a_1]_+ \otimes b_0\, db_1, \\ a_0 b_0\, d(a_1 b_1)\, d(a_2 b_2) &\mapsto \tfrac{1}{2}\, a_0\, da_1\, a_2 \otimes b_0 b_1\, db_2 - \tfrac{1}{2}\, a_0 a_1\, da_2 \otimes b_0\, db_1\, b_2 \\ &= \tfrac{1}{2}\, a_0\, d(a_1 a_2) \otimes b_0 b_1\, db_2 - \tfrac{1}{2}\, a_0 a_1\, da_2 \otimes b_0\, d(b_1 b_2), \end{aligned}$$

where $[a, b]_+ = ab + ba$ and, to keep the notation simple, we have suppressed the tensor product sign on the left-hand side. This map is functorial and can be dualized to a DG coalgebra context to define a morphism of supercomplexes

$$p : \hat{X}(C) \otimes \hat{X}(D) \longrightarrow \hat{X}^2(C \otimes D),$$

where $C$ and $D$ are DG coalgebras. Let $p_{i,j} = p|_{X_i(C) \otimes X_j(D)}$. We have

$$\begin{aligned} p_{0,0} &= \mathrm{id}, \\ p_{0,1} &= \tfrac{1}{2}\, R_{2,3} \circ (1 \otimes (1 + R_{1,2}) \Delta), \\ p_{1,0} &= \tfrac{1}{2}\, R_{2,3} \circ ((1 + R_{1,2}) \Delta \otimes 1), \\ p_{1,1} &= \tfrac{1}{2}\, R \circ (1 \otimes \Delta \otimes \Delta \otimes 1 - \Delta \otimes 1 \otimes 1 \otimes \Delta). \end{aligned}$$


In the above formulas $R_{i,j}$ denotes the signed exchange between the $i$th and $j$th factors in a tensor product and $R = R_{4,5} R_{3,5} R_{2,3}$. This observation, coupled with the above proposition, proves the following

Theorem 6. Let $C$ and $D$ be DG coalgebras and let $A$ be an algebra. Then any morphism of DG coalgebras $C \otimes D \longrightarrow BA$ induces a morphism of supercomplexes

$$\hat{X}(C) \otimes \hat{X}(D) \longrightarrow \hat{X}(BA).$$

In the remainder of this section we will apply this result to homotopy G algebras. So let $V = \bigoplus_{i \ge 0} V_i$ be a homotopy G algebra with structure map $\cup : BV \otimes BV \longrightarrow BV$. Using the inclusion $V_0 \longrightarrow V$ and the surjection $V \longrightarrow V_0$, we obtain a morphism of coalgebras $\cup_1 : BV \otimes BV_0 \longrightarrow BV_0$. It is given by

$$(x_1, \cdots, x_m) \cup_1 (a_1, \cdots, a_n) = \sum_{i_1 \le i_2 \le \cdots \le i_m \le n} (-1)^{\varepsilon}\, (a_1, \cdots, a_{i_1}, x_1\{a_{i_1+1}, \cdots, a_{i_1+d_1}\}, \cdots, x_m\{a_{i_m+1}, \cdots, a_{i_m+d_m}\}, \cdots, a_n), \tag{11}$$

where $\varepsilon = \sum_{p=1}^{m} (|x_p| - 1)\, i_p$ and $d_i = |x_i|$. In the special case where $V = C^{\bullet}(A, A)$, the braces in (11) are given by $x_k\{a_{i_k+1}, \cdots, a_{i_k+d_k}\} = x_k(a_{i_k+1}, \cdots, a_{i_k+d_k})$.

Lemma 7. $\cup_1$ is a morphism of DG coalgebras.

Proof. Note that the surjection $\pi : BV \longrightarrow BV_0$ is a DG coalgebra map. For $\alpha \in BV$ and $\beta \in BV_0$, we have

$$\begin{aligned} b'(\alpha \cup_1 \beta) = b' \pi(\alpha \cup \beta) &= \pi (b' + \delta)(\alpha \cup \beta) \\ &= \pi\big[(b' + \delta)\alpha \cup \beta + (-1)^{|\alpha|}\, \alpha \cup (b' + \delta)\beta\big] \\ &= (b' + \delta)\alpha \cup_1 \beta + (-1)^{|\alpha|}\big[\pi(\alpha \cup b'\beta) + \pi(\alpha \cup \delta\beta)\big] \\ &= (b' + \delta)\alpha \cup_1 \beta + (-1)^{|\alpha|}\, \alpha \cup_1 b'\beta, \end{aligned}$$

since $\pi(\alpha \cup \delta\beta) = 0$ as $\delta\beta$ has degree one. The lemma is proved.

Let

$$P : \hat{X}(BV) \otimes \hat{X}(BV_0) \longrightarrow \hat{X}^2(BV_0)$$

denote the composition $P = \cup_1\, p$. Now we can apply Theorem 6 to obtain

Theorem 8. Let $V$ be a homotopy G algebra. Then there are natural maps of supercomplexes

$$\hat{X}(BV) \otimes \hat{X}(BV) \longrightarrow \hat{X}(BV), \qquad \hat{X}(BV) \otimes \hat{X}(BV_0) \longrightarrow \hat{X}(BV_0).$$

We believe, although we have not checked it, that the first product is homotopy associative and in fact there should exist a full structure of higher homotopies in the sense of $A_\infty$-algebras. The same should be true of the corresponding pairing between homologies.


4. Higher Operations on Cyclic Bicomplex

Our goal in this section is to find explicit formulas for the second map in Theorem 8 and to relate it to the Hochschild and cyclic homology of the homotopy G algebra $V$. In the last part we apply our formulas to a very special homotopy G algebra, namely the deformation complex of an algebra, to obtain higher homotopy formulas in the cyclic and b, B complex of the algebra. The computations in this section are based on results and ideas from [9].

In [12], Nest and Tsygan have defined two different types of operations on cyclic homology. On the one hand, they have defined an action of the b, B complex of the deformation complex (as a DG algebra) of an algebra $A$ on the b, B complex of $A$. This corresponds to the pairing (14) below, though we do not know to what extent the explicit formulas match. As is explained in [12], this operation is very general and in particular yields Cartan homotopy formulas for the action of higher Hochschild cochains on the b, B complex. Secondly, they have defined an action of the Chevalley-Eilenberg complex of the deformation complex (as a DG Lie algebra) on the b, B complex of $A$. Most probably, this operation too is a consequence of the homotopy G algebra structure of the deformation complex, by a pattern similar to the one by which we derive (14) from Theorem 8.

Let $\Omega_0 = (x_1, \cdots, x_m) \in BV$, $\Omega_1 = y_0 \otimes (y_1, \cdots, y_n) \in \Omega^1 BV_{\natural}$, $\omega_0 = (a_1, \cdots, a_p) \in BV_0$ and $\omega_1 = b_0 \otimes (b_1, \cdots, b_q) \in \Omega^1 (BV_0)_{\natural}$. Let $\eta_0 + \eta_1 \in \hat{X}(BV_0)$ be the image of $(\Omega_0 + \Omega_1) \otimes (\omega_0 + \omega_1)$ under the second map in Theorem 8. Using formula (10) for the retraction $R$, we have

$$\eta_0 = P(\Omega_0 \otimes \omega_0) + d \nabla P(\Omega_1 \otimes \omega_1), \qquad \eta_1 = \theta P(\Omega_0 \otimes \omega_1 + \Omega_1 \otimes \omega_0) + \theta \partial_2 \nabla P(\Omega_1 \otimes \omega_1).$$

Note that $P(\Omega_0 \otimes \omega_0) = (x_1, \cdots, x_m) \cup_1 (a_1, \cdots, a_p)$. To simplify the notation, we resort to the following convention. It is better to consider the points $y_0, y_1, \cdots, y_n$ as located in the clockwise order on the circle. Let $\pi_k$ denote the set of all partitions of these points on the circle into $k$ intervals. We allow one or several of these intervals to be empty, in which case they represent 1. For example, for $n = 2$, $\pi_1$ has 3 elements while $\pi_2$ has 12 elements. We denote an element of $\pi_2$ by a pair $(\alpha, \beta)$, and similarly for elements of $\pi_k$. It is also convenient to write $X(Y)$ for $X \cup_1 Y$. To compute the other components of $\eta_0$ and $\eta_1$, first we have to find the image of $\Omega_1$ under the inclusion $\Omega^1 BV_{\natural} \longrightarrow \Omega^1 BV$. We have

$$y_0 \otimes (y_1, \cdots, y_n) \mapsto \sum_{i=0,\, j=0}^{n,\, n-i} (-1)^{\varepsilon}\, (y_{i+1}, \cdots, y_{i+j}) \otimes (y_{i+j+1}, \cdots, y_i) = \sum_{\substack{(\alpha, \beta) \\ y_0 \in \beta}} (-1)^{|\alpha||\beta|}\, \alpha \otimes \beta$$
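and similarly for $\omega_1$. Now we have

$$d \nabla P(\Omega_1 \otimes \omega_1) = \frac{1}{2} \sum_{i=0}^{n} \sum_{\substack{(\alpha, \beta, y_i) \\ y_0 \in (\beta, y_i)}} \sum_{\substack{(\alpha', \beta', \gamma') \\ b_0 \in \gamma'}} \pm (\beta(\beta'), y_i(\gamma'), \alpha(\alpha')) - \frac{1}{2} \sum_{(\alpha, \beta, y_0)} \sum_{\substack{(\alpha', \beta', \gamma') \\ b_0 \in (\beta', \gamma')}} \pm (\beta(\beta'), y_0(\gamma'), \alpha(\alpha')). \tag{12}$$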
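The sums above run over partitions of points on a circle into consecutive intervals. As an illustration only, the following sketch enumerates such partitions and recovers the counts quoted earlier for $n = 2$ (3 elements in $\pi_1$, 12 in $\pi_2$). The function name and the encoding of a partition as a tuple of tuples are assumptions made for this example, not notation from the paper.

```python
from itertools import combinations_with_replacement


def interval_partitions(points, k):
    """All ways to cut cyclically ordered `points` into k consecutive
    intervals (empty intervals allowed), returned as a set of tuples of tuples."""
    n = len(points)
    result = set()
    for start in range(n):  # position where the first interval begins
        rotated = [points[(start + t) % n] for t in range(n)]
        # choose k - 1 cut positions, with repetition, among 0..n
        for cuts in combinations_with_replacement(range(n + 1), k - 1):
            bounds = (0, *cuts, n)
            parts = tuple(
                tuple(rotated[bounds[j]:bounds[j + 1]]) for j in range(k)
            )
            result.add(parts)
    return result


pts = ("y0", "y1", "y2")          # the case n = 2
pi_1 = interval_partitions(pts, 1)
pi_2 = interval_partitions(pts, 2)
# pairs (alpha, beta) with y0 lying in beta, as in the sum for the inclusion map
pairs_with_y0_in_beta = [(a, b) for (a, b) in pi_2 if "y0" in b]
```

With this encoding, the pairs $(\alpha, \beta)$ with $y_0 \in \beta$ number 6 for $n = 2$, which agrees with the number of index pairs $(i, j)$, $0 \le i \le n$, $0 \le j \le n - i$, in the first form of the sum for the inclusion map.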

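The signs can be easily made explicit. Next we compute the contribution of $\Omega_0 \otimes \omega_1 + \Omega_1 \otimes \omega_0$ to $\eta_1$. First note that $\theta(\alpha \otimes \beta) \ne 0$ only if $\alpha = 1$ or $\alpha \in V_0$. In this case we have

$$\theta(a_0 \otimes (a_1, \cdots, a_n)) = a_0 \otimes (a_1, \cdots, a_n), \qquad \theta(1 \otimes (a_1, \cdots, a_n)) = -a_1 \otimes (a_2, \cdots, a_n).$$

Using the above information plus our formulas for $P$, we obtain

$$\theta P(\Omega_0 \otimes \omega_1) = -\sum_{(\alpha')} (x_1, \cdots, x_m)(\alpha') + \frac{1}{2} \sum_{\substack{\overrightarrow{(\alpha', \beta')} \\ b_0 \in \beta'}} \pm (x_1(\alpha'), (x_2, \cdots, x_m)(\beta')) + \frac{1}{2} \sum_{\substack{\overrightarrow{(\alpha', \beta')} \\ b_0 \in \beta'}} \pm (x_m(\alpha'), (x_1, \cdots, x_{m-1})(\beta')).$$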

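Similarly we have

$$\theta P(\Omega_1 \otimes \omega_0) = \frac{1}{2} \sum_{i=1}^{n} \sum_{(y_i, \beta)} \pm (y_i(a_1, \cdots, a_{d_i}), \beta(a_{d_i+1}, \cdots, a_p)) - \sum_{(\beta)} \beta(a_1, \cdots, a_p) + \frac{1}{2} \sum_{i=1}^{n} \sum_{(y_i, \beta)} \pm (y_i(a_{p-d_i}, \cdots, a_p), \beta(a_1, \cdots, a_{p-d_i-1})),$$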

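where $d_i = |y_i|$. Finally let us compute the contribution of $\Omega_1 \otimes \omega_1$ to $\eta_1$. Using the formulas for the connection $\nabla$ and for the induced differential $\partial_2$, we note that $\theta \partial_2 \nabla(\beta \otimes \alpha \otimes \gamma) \ne 0$ only if $\alpha \in V_0$ and $\gamma \in V_0$. In this case we have $\theta \partial_2 \nabla(\beta \otimes \alpha \otimes \gamma) = \alpha\gamma \otimes \beta$. From this we obtain

$$\theta \partial_2 \nabla P(\Omega_1 \otimes \omega_1) = \frac{1}{2} \sum_{\substack{\overrightarrow{(\alpha', \beta', \gamma')} \\ b_0 \in \gamma'}} \pm (y_0(\beta')\, y_1(\gamma'), (y_2, \cdots, y_n)(\alpha')) - \frac{1}{2} \sum_{\substack{\overrightarrow{(\alpha', \beta', \gamma')} \\ b_0 \in \beta'}} \pm (y_n(\beta')\, y_0(\gamma'), (y_1, \cdots, y_{n-1})(\alpha')). \tag{13}$$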


We define the periodic cyclic homology of a DG algebra $V = \bigoplus_{i \ge 0} V_i$ to be the homology of the supercomplex $\hat{X}(BV)$. In the special case when $V = A$ is an algebra, the bicomplex $X(BA)$ has been shown by Quillen [14] to be isomorphic, up to a shift in the vertical direction, to the cyclic bicomplex of $A$. The same argument works in the DG case. Let $C(V)$ denote the total complex of the cyclic bicomplex of $V$. We have

$$C_{ev}(V) = C_{odd}(V) = \prod_{n \ge 0} C_n(V),$$

where $C_n(V) = V^{\otimes (n+1)}$. Theorem 8 can now be reformulated as

Theorem 9. Let $V$ be a homotopy G algebra. Then there is a natural morphism of supercomplexes $C(V) \otimes C(V_0) \longrightarrow C(V_0)$.

Finally we turn to Connes' b, B bicomplex and the analogue of our formulas in that context. This is important because in many applications of cyclic homology and noncommutative geometry [1, 2] the b, B bicomplex appears in a natural way. Using an explicit homotopy equivalence between the cyclic and b, B bicomplexes, we can transform Theorem 9 into a morphism between b, B complexes. One obtains, however, simpler formulas if one restricts to the normalized b, B complex. Let $(V, \delta)$ be a unital DG algebra and let $B(V)$ denote its b, B complex. We have

$$B(V)_{ev} = \prod_{n \ge 0} C_{2n}(V), \qquad B(V)_{odd} = \prod_{n \ge 0} C_{2n+1}(V).$$

The differential is given by $b + B + \delta$, where $B : C_{\bullet}(V) \longrightarrow C_{\bullet+1}(V)$ is Connes' boundary operator. Let $\bar{V} = V/k$ and $\bar{C}_n(V) = V \otimes \bar{V}^{\otimes n}$. The normalized b, B complex of $V$, denoted $\bar{B}(V)$, is defined similarly except that we replace $C_n(V)$ by $\bar{C}_n(V)$. For $n \ge 0$, let $\lambda : C_n(V) \longrightarrow C_n(V)$ denote the cyclic shift operator, let $N$ be the corresponding norm operator and let $s : C_n(V) \longrightarrow C_{n+1}(V)$ be defined by $s(v_0, \cdots, v_n) = (1, v_0, \cdots, v_n)$. Recall the morphisms of complexes $I : B(V) \longrightarrow C(V)$, $J : C(V) \longrightarrow B(V)$ defined by

$$I = 1 + sN, \qquad J = 1 + s(1 - \lambda).$$

It is known that the operators $I$ and $J$ are homotopy inverse to each other [10]. Using the chain maps $I$ and $J$ it is clear that we can transform Theorem 9 into a morphism of complexes $B(V) \otimes B(V_0) \longrightarrow B(V_0)$. We specialize to the case where $V = C(A, A)$ is the deformation complex of a unital algebra $A$. Note that in this case $V_0 = C^0(A, A) = A$. A cochain $\varphi \in C^n(A, A)$ is said to be normalized if $\varphi(a_1, \cdots, a_n) = 0$ whenever $a_i = 1$ for some $i$. Let $C_{norm}(A, A)$ denote the subcomplex of normalized cochains. It is easy to check that $C_{norm}(A, A)$ is indeed a sub DG algebra of the DG algebra $C(A, A)$. Hence we have an inclusion $B(C_{norm}(A, A)) \longrightarrow B(C(A, A))$, and a morphism of supercomplexes

$$B(C_{norm}(A, A)) \otimes B(A) \longrightarrow B(A).$$


Now our explicit formulas show that the above map descends to define a morphism of supercomplexes

$$B(C_{norm}(A, A)) \otimes \bar{B}(A) \to \bar{B}(A). \tag{14}$$

We denote this map as well as the one in Theorem 9 by ∪. We obtain explicit formulas for ∪ as follows. Let D = (D0 , · · · , Dm ) ∈ B(Cnorm (A, A)) and a = (a0 , · · · , an ) ∈ ¯ B(A). Let ID = 0 + 1 = sN D + D and Ia = ω0 + ω1 = sN a + a. Also let ID ∪ Ia = η0 + η1 . We have D ∪ a = J(ID ∪ Ia) = J(η0 + η1 ) = s(1 − λ)η0 + η1 . Now we have s(1 − λ)η0 = s(1 − λ)(P (sN D ⊗ sN a) + d∇P (D ⊗ a)). Because of our normalization conditions we have s(1 − λ)P (sN D ⊗ sN a) = 0. Using (12) we get s(1 − λ)η0 = s(1 − λ)d∇P (D ⊗ a) m X 1X = 2 i=0 →

D0 ∈(β,Di )

−

1 2

±(β(β 0 ), Di (γ 0 ), α(α0 )))

(α,β,Di ) → (α0 ,β 0 ,γ 0 ) a0 ∈γ 0

X

(α,β,D0 )

X

X →

D0 ∈(β 0 ,γ 0 )

±(β(β 0 ), D0 (γ 0 ), α(α0 )).

(α0 ,β 0 ,γ 0 )

Similarly we have η1 = θP (sN D ⊗ α + D ⊗ sN a) + θ∂2 ∇P (D ⊗ a). Lemma 10. We have θP (D ⊗ sN a) = 0 and θP (sN D ⊗ a) = 0. Using the above lemma and (13), we get η1 = θ∂2 ∇P (D ⊗ a) X 1 ±(D0 (β 0 )D1 (γ 0 ), (D2 , · · · , Dn )(α0 )) = 2 0 0 0 → (α ,β ,γ )

a0 ∈γ 0

−

1 2

X

±(Dn (β 0 )D0 (γ 0 ), (D1 , · · · , Dn−1 )(α0 )).

→ (α0 ,β 0 ,γ 0 )

a0 ∈β 0

Note that because of homogeneity we can drop the unpleasant factor of $\frac{1}{2}$ from the formulas. In the unnormalized case, however, this cannot be done.

Acknowledgement. Supported in part by NSERC of Canada. I would also like to thank the mathematics division of the International Centre for Theoretical Physics, Trieste, Italy, for a visiting fellowship during the summer of 1996 where part of this work was completed.


References
1. Connes, A.: Non-commutative differential geometry. Pub. Math. IHES 62, 41–144 (1985)
2. Connes, A.: Noncommutative Geometry. London–New York: Academic Press, 1994
3. Cuntz, J. and Quillen, D.: Cyclic homology and nonsingularity. J. Am. Math. Soc. 8, 373–442 (1995)
4. Deligne, P.: Letter to Stasheff, Gerstenhaber, May, Schechtman and Drinfeld. May 17, 1993
5. Gerstenhaber, M.: The cohomology structure of an associative ring. Ann. Math. 78, 267–289 (1963)
6. Gerstenhaber, M. and Voronov, A.: Homotopy G-algebras and moduli space operad. Internat. Math. Res. Notices 3, 141–153 (1995)
7. Getzler, E.: Cartan homotopy formula and the Gauss-Manin connection in cyclic homology. Israel Math. Conf. Proc. 102, 256–283 (1993)
8. Getzler, E. and Jones, J.: Operads, homotopy algebra and iterated integrals for double loop spaces. Preprint (1994)
9. Khalkhali, M.: On Cartan homotopy formulas in cyclic homology. Manuscripta Math. 94, 111–132 (1997)
10. Khalkhali, M.: On the entire cyclic cohomology of Banach algebras. Comm. in Alg. 22, 5861–5875 (1994)
11. Kimura, T., Voronov, A. and Zuckerman, G.: Homotopy Gerstenhaber algebra and topological field theory. In: Operads: Proceedings of Renaissance Conferences, Contemp. Math. 202, 305–333 (1997)
12. Nest, R. and Tsygan, B.: Homological properties of the category of algebras and the bivariant JLO cochain. Preprint (1994)
13. Puschnigg, M.: Explicit product structures in cyclic homology theories. Preprint (1995)
14. Quillen, D.: Algebra cochains and cyclic homology. Pub. Math. IHES 68, 139–174 (1989)
15. Kontsevich, M.: Deformation quantization of Poisson manifolds. q-alg/9709040 (1997)
16. Kontsevich, M.: Private communication

Communicated by A. Connes

Commun. Math. Phys. 202, 325 – 357 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Projections in Rotation Algebras and Theta Functions*

Florin P. Boca1,2,**,***

1 Department of Mathematics, University of Wales, Singleton Park, Swansea SA2 8PP, UK
2 Institute of Mathematics of the Romanian Academy, P.O. Box 1-764, RO-70700 Bucharest, Romania

Received: 30 April 1998 / Accepted: 10 October 1998

Dedicated to Professor Marc A. Rieffel on the occasion of his 60th birthday

Abstract: For each $\alpha \in (0, 1)$, $A_\alpha$ denotes the universal C*-algebra generated by two unitaries $u$ and $v$, which satisfy the commutation relation $uv = e^{2\pi i\alpha} vu$. We consider the order four automorphism $\sigma$ of $A_\alpha$ defined by $\sigma(u) = v$, $\sigma(v) = u^{-1}$ and describe a method for constructing projections in the fixed point algebra $A_\alpha^\sigma$, using Rieffel's imprimitivity bimodules and Jacobi's theta functions. In the case $\alpha = q^{-1}$, $q \in \mathbb{Z}$, $q \ge 2$, we give explicit formulae for such projections and find a lower bound for the norm of the Harper operator $u + u^* + v + v^*$.

* Research supported by an EPSRC Advanced Fellowship and GAR 198/1998
** Paper presented in the C*-algebren Meeting, Mathematische Forschungsinstitut, Oberwolfach, February 1–7, 1998 and the 50th British Mathematical Colloquium, Manchester, April 6–9, 1998
*** Current address: School of Mathematics, Cardiff University, P.O. Box 926, Senghennydd Road, Cardiff CF2 4YH, UK. E-mail: [email protected]

The commutative algebra $C(\mathbb{T}^n)$ of continuous functions on the ordinary n-dimensional torus $\mathbb{T}^n = \{(z_1, \dots, z_n) ; |z_j| = 1\}$ is isomorphic to the universal C*-algebra generated by $n$ commuting unitary operators. A non-commutative n-torus $A_\alpha$ is the universal C*-algebra generated by $n$ unitaries $u_1, \dots, u_n$ subject to relations $u_j u_k = e^{2\pi i\alpha_{jk}} u_k u_j$, where $\alpha = (\alpha_{jk})_{1 \le j,k \le n}$ is a skew symmetric matrix with real entries. In some situations it is convenient to regard $\alpha$ as a real skew bilinear form on $\mathbb{Z}^n$ defined by $\alpha(e_j, e_k) = \alpha_{jk}$ and $A_\alpha$ as the twisted group C*-algebra $C^*(\mathbb{Z}^n, \beta)$, where $\beta : \mathbb{Z}^n \times \mathbb{Z}^n \to \mathbb{T}$ is a 2-cocycle such that $\beta(x, y)\,\overline{\beta(y, x)} = e^{2\pi i\alpha(x,y)}$. In this paper we only consider the case $n = 2$, when $\alpha$ is a real number and $A_\alpha$ is isomorphic to the crossed-product C*-algebra $C(\mathbb{T}) \rtimes_\beta \mathbb{Z}$, where $\beta$ is the automorphism of $C(\mathbb{T})$ defined by $\beta(\phi)(e^{2\pi it}) = \phi(e^{2\pi i(t+\alpha)})$, $\phi \in C(\mathbb{T})$, $t \in \mathbb{R}$. The C*-algebra $A_\alpha$, called the rotation algebra by angle $\alpha$, coincides with the universal C*-algebra generated by two unitaries $u$ and $v$ such that $uv = e^{2\pi i\alpha} vu$. It is endowed with the canonical faithful tracial state $\tau$ defined by

$$\tau\Big(\sum_{m,n} a_{m,n} u^m v^n\Big) = a_{0,0}.$$

The starting point in the study of rotation algebras is the existence of the Powers–Rieffel projections. They are projections $e_\alpha$ of trace the fractional part $\{\alpha\}$ of $\alpha$ in $A_\alpha$, which are of the form

$$e_\alpha = G(u)v + F(u) + \check{G}(u)v^{-1},$$

where $\check{G}(x) = G(-x)$, $x \in \mathbb{R}$ and $F$, $G$ are some smooth functions on $\mathbb{R}$ (see [16]). The classical results of Pimsner and Voiculescu ([14, 15]) show that $\tau_* K_0(A_\alpha) = \mathbb{Z} + \mathbb{Z}\alpha = \mathbb{Z} + \mathbb{Z}\tau(e_\alpha)$; in particular, for any irrational numbers $\alpha_1$ and $\alpha_2$, the rotation algebras $A_{\alpha_1}$ and $A_{\alpha_2}$ are isomorphic if and only if $\alpha_2 \pm \alpha_1 \in \mathbb{Z}$. The Powers–Rieffel projections provide a central class of examples of projective modules in A. Connes' noncommutative differential geometry ([6, 7]). They also play an important rôle in the recent results ([8, 4]) on the structure of noncommutative tori.

Each matrix $X = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z})$ defines a ∗-automorphism $\sigma_X$ of $A_\alpha$, acting on the canonical generators $u$ and $v$ of $A_\alpha$ by $\sigma_X(u) = u^a v^b$, $\sigma_X(v) = u^c v^d$. Throughout this paper, we will denote by $\sigma = \sigma_{\left(\begin{smallmatrix} 0 & 1 \\ -1 & 0 \end{smallmatrix}\right)}$ the "Fourier transform" automorphism of $A_\alpha$, acting on its generators by $\sigma(u) = v$ and $\sigma(v) = u^{-1}$. We also set

$$A_\alpha^\sigma = \{a \in A_\alpha ; \sigma(a) = a\} \quad \text{and} \quad H_\alpha^{(n)} = u^n + u^{-n} + v^n + v^{-n} \in A_\alpha^\sigma, \quad n \in \mathbb{Z}.$$
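For rational $\alpha = p/q$ the defining relation can be illustrated in a finite-dimensional model. The following sketch is not part of the paper's argument: it builds the $q \times q$ clock and shift matrices (the helper names `clock_shift`, `matmul` and `relation_gap` are ours) and checks numerically that they satisfy $uv = e^{2\pi i\alpha} vu$. This is only one representation of the relation, not the universal algebra $A_\alpha$ itself.

```python
import cmath

def clock_shift(p, q):
    """Return (u, v, omega): q x q clock and shift matrices, omega = exp(2*pi*i*p/q)."""
    omega = cmath.exp(2j * cmath.pi * p / q)
    # u is diagonal with entries omega^j; v sends the basis vector e_k to e_{k+1 mod q}
    u = [[omega ** j if j == k else 0j for k in range(q)] for j in range(q)]
    v = [[1 + 0j if j == (k + 1) % q else 0j for k in range(q)] for j in range(q)]
    return u, v, omega

def matmul(a, b):
    """Plain matrix product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[j][l] * b[l][k] for l in range(n)) for k in range(n)]
            for j in range(n)]

def relation_gap(p, q):
    """Largest entry of |uv - omega * vu|; should vanish up to rounding."""
    u, v, omega = clock_shift(p, q)
    uv, vu = matmul(u, v), matmul(v, u)
    return max(abs(uv[j][k] - omega * vu[j][k])
               for j in range(q) for k in range(q))

gap = relation_gap(2, 7)  # alpha = 2/7
```

In this model the Harper operator $u + u^* + v + v^*$ becomes a Hermitian $q \times q$ matrix, which is one common way to experiment with the rational case.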

We notice that for any $\alpha \in \mathbb{R} \setminus \mathbb{Q}$, the C*-algebra $A_\alpha^\sigma$ is generated only by the operators $H_\alpha = H_\alpha^{(1)}$ and $H_\alpha^{(2)}$ (see the appendix). The K-groups of $A_\alpha^\sigma$ were computed in the case $\alpha = \frac{p}{q} \in \mathbb{Q}$, $\gcd(p, q) = 1$ in [9], where it was shown that $K_1(A_\alpha^\sigma) = 0$ and $K_0(A_\alpha^\sigma) = \mathbb{Z}^9$ if $q \ge 5$. The problem of characterizing the spectrum of the self-adjoint operator $H_\alpha = u + u^* + v + v^*$, or of $H_{\alpha,\lambda} = u + u^* + \frac{\lambda}{2}(v + v^*)$, $\lambda \in \mathbb{R}$, in $A_\alpha$ is very important in the study of the quantum Hall effect ([2]). If $E_{\alpha,\lambda}$ denotes the spectral measure of $H_{\alpha,\lambda}$, then $\mu_{\alpha,\lambda} = \tau E_{\alpha,\lambda}$ is a measure with $\operatorname{supp}(\mu_{\alpha,\lambda}) \subset [-2 - \lambda, 2 + \lambda]$ and since $\tau$ is faithful, its support coincides with the spectrum of $H_{\alpha,\lambda}$. Actually one can gather information on $\operatorname{spec}(H_{\alpha,\lambda})$ from the K-theoretical properties of $A_\alpha$. In this respect the results of Pimsner and Voiculescu ([15]) show that for any irrational $\alpha$, any $\lambda \in \mathbb{R}$ and any $t$ which belongs to a gap of $\operatorname{spec}(H_{\alpha,\lambda})$, there exists an integer $n$ such that $\mu_{\alpha,\lambda}\big(\operatorname{spec}(H_{\alpha,\lambda}) \cap \chi_{(-\infty,t]}\big) = \{n\alpha\}$. In the irrational case, the label $n$ coincides with the first Chern character of Connes ([6]). Knowing more about the projections of $A_\alpha^\sigma$ and about $\tau\big(\mathcal{P}(A_\alpha^\sigma)\big)$ would presumably provide additional information on $\operatorname{spec}(H_\alpha)$.

Another important feature of the automorphism $\sigma$ is that it implements the Andre-Aubry duality. One easily checks that the set $\sigma_+(\alpha, \lambda) = \bigcup_\theta \sigma(\alpha, \lambda, \theta)$, as defined in [1], coincides with the spectrum of the operator $H_{\alpha,\lambda} \in A_\alpha$. Using the fact that $\sigma$ is an automorphism of $A_\alpha$ and

$$\sigma(H_{\alpha,\lambda}) = \sigma\Big(u + u^* + \frac{\lambda}{2}(v + v^*)\Big) = v + v^* + \frac{\lambda}{2}(u + u^*) = \frac{\lambda}{2}\Big(u + u^* + \frac{4}{\lambda} \cdot \frac{v + v^*}{2}\Big) = \frac{\lambda}{2} H_{\alpha, \frac{4}{\lambda}},$$

it follows that the operators $H_{\alpha,\lambda}$ and $\frac{\lambda}{2} H_{\alpha, \frac{4}{\lambda}}$ have the same spectrum. This provides (cf. J. Bellissard) a quick proof of the Andre-Aubry duality

$$\sigma_+(\alpha, \lambda) = \operatorname{spec}(H_{\alpha,\lambda}) = \frac{\lambda}{2} \operatorname{spec}\big(H_{\alpha, \frac{4}{\lambda}}\big) = \frac{\lambda}{2}\, \sigma_+\Big(\alpha, \frac{4}{\lambda}\Big).$$

The aim of this paper is to develop a method for constructing projections in the C*-algebra $A_\alpha^\sigma$. In the first section, we prove that for any $\alpha \in (0, 1)$, $A_\alpha^\sigma$ contains a projection $e_\alpha$ of trace $\alpha$ (the trace is the canonical trace on $A_\alpha$). Although we make use of Rieffel's formalism for constructing imprimitivity bimodules between twisted C*-group algebras associated with lattices in abelian locally compact groups ([17]), the nature of these projections is different from that of the customary Powers–Rieffel projections ([16]). Actually, the projections we construct in $A_\alpha^\sigma$ are related to the classical Jacobi theta functions ([13, 12]). An important ingredient in the proof is the inequality

$$\vartheta(0, it)^2 < \vartheta(0, it) + \vartheta\Big(\frac{1}{2}, it\Big), \quad t > \frac{1}{2},$$

which we prove in Proposition 1.3, employing the infinite product factorization of theta functions; we use the customary notation

$$\vartheta(z, \tau) = \sum_m e^{\pi i m^2 \tau + 2\pi i m z}, \quad z \in \mathbb{C}, \ \tau \in \mathbb{H} = \{w \in \mathbb{C} ; \operatorname{Im}(w) > 0\}.$$

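Another interesting thing is that the classical transformation formula for $\vartheta(x, it)$ (see e.g. [13, p.33])

$$\vartheta\Big(\frac{x}{it}, \frac{i}{t}\Big) = \sqrt{t}\, e^{\frac{\pi x^2}{t}}\, \vartheta(x, it), \quad x \in \mathbb{R}, \ t > 0,$$

arises as a mere consequence of Rieffel's trace formula ([17, Thm.3.5]).

In Sect. 2 we prove that if $p, q \in \mathbb{Z}$, $q \ge 2$, $0 < q\alpha - p < \frac{1}{2q}$ and $p$ is a quadratic residue of $q$, then $A_\alpha^\sigma$ contains projections of trace $q\alpha - p$. If $0 < p - q\alpha < \frac{1}{2q}$ and $-p$ is a quadratic residue of $q$, then $A_\alpha^\sigma$ contains a projection of trace $p - q\alpha$.

The third section considers the case $\alpha = q^{-1}$, $q \in \mathbb{Z}$, $q \ge 2$, in more detail. We prove the following estimates for the norm of the operator $H_\alpha$, $\alpha = q^{-1}$,

$$\|H_\alpha\| \ge \begin{cases} 4 e^{-\frac{\pi\alpha}{2}}\, \dfrac{\vartheta\big(\frac{1}{2}, \frac{i}{2\alpha}\big)\, \vartheta\big(\frac{i}{2}, \frac{i}{2\alpha}\big)}{\vartheta\big(0, \frac{i}{2\alpha}\big)^2}, & \text{if } q \text{ is even}, \\[3ex] 4 e^{-\frac{\pi\alpha}{2}}\, \dfrac{\vartheta\big(\frac{1}{2}, \frac{i}{2\alpha}\big)\, \vartheta\big(\frac{i}{2}, \frac{i}{2\alpha}\big) - 2\vartheta_{odd}\big(\frac{1}{2}, \frac{i}{2\alpha}\big)\, \vartheta_{odd}\big(\frac{i}{2}, \frac{i}{2\alpha}\big)}{\vartheta\big(0, \frac{i}{2\alpha}\big)^2 - 2\vartheta_{odd}\big(0, \frac{i}{2\alpha}\big)^2}, & \text{if } q \text{ is odd}, \end{cases}$$

where we set

$$\vartheta_{odd}(z, \tau) = \sum_{m \text{ odd}} e^{\pi i m^2 \tau + 2\pi i m z} = \vartheta(z, \tau) - \vartheta(2z, 4\tau), \quad z \in \mathbb{C}, \ \tau \in \mathbb{H}.$$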
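As a rough numerical illustration (not taken from the paper), the sketch below evaluates truncated theta series, checks the identity $\vartheta_{odd}(z, \tau) = \vartheta(z, \tau) - \vartheta(2z, 4\tau)$ at an arbitrary test point, and evaluates the even-$q$ lower bound $4e^{-\pi\alpha/2}\,\vartheta(\frac{1}{2}, \frac{i}{2\alpha})\,\vartheta(\frac{i}{2}, \frac{i}{2\alpha}) / \vartheta(0, \frac{i}{2\alpha})^2$ as we read it from the estimate above. The function names and the truncation parameter `terms` are our own assumptions.

```python
import cmath
import math

def theta(z, tau, terms=40):
    """Truncated series: sum over |m| <= terms of exp(pi*i*m^2*tau + 2*pi*i*m*z)."""
    return sum(cmath.exp(1j * math.pi * (m * m * tau + 2 * m * z))
               for m in range(-terms, terms + 1))

def theta_odd(z, tau, terms=40):
    """Same series restricted to odd m."""
    return sum(cmath.exp(1j * math.pi * (m * m * tau + 2 * m * z))
               for m in range(-terms, terms + 1) if m % 2 != 0)

def harper_lower_bound_even(q):
    """Even-q bound with alpha = 1/q, as read from the displayed estimate."""
    alpha = 1.0 / q
    tau = 1j / (2 * alpha)
    numerator = theta(0.5, tau) * theta(0.5j, tau)
    denominator = theta(0.0, tau) ** 2
    return (4 * math.exp(-math.pi * alpha / 2) * numerator / denominator).real

# Identity check at an arbitrary point of C x H
z, tau = 0.3 + 0.1j, 0.2 + 0.9j
identity_gap = abs(theta_odd(z, tau) - (theta(z, tau) - theta(2 * z, 4 * tau)))

bound_q2 = harper_lower_bound_even(2)
bound_q4 = harper_lower_bound_even(4)
```

For $q = 2$ the expression reduces, via the classical relations between the Jacobi theta constants at $\tau = i$, to $2\sqrt{2}$; in every case the bound must stay below the trivial upper bound $\|H_\alpha\| \le 4$.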


These estimates should be compared with kHα k = 4 − O q11 , where α = 1/ q1 + 1/ q2 + 1/(q3 + . . . ) is the continued fraction associated to α, proved by Helffer and Sj¨ostrand in [10, Thm.1]. We derive closed formulae for the projection e q1 , expressing them as averages of products of operator-valued ϑa,b functions as follows:  q−1  (q)  X − πirs iq  e q ϑ(q) ϑ r , s U1 , iq2  s r U2 , ,  2 q 2 q 2   r,s=0    , if q is even,   qϑ U2q , iq2 ϑ U1q , iq2     e q1 = 2q−1 X πirs +πi r s (2q)   q q ϑ s e q (U2 , 2iq) ϑ(2q) (U1 , 2iq) r   2q ,0 2q ,0   r,s=0   , if q is odd,   1  X  q q (2) (2)  q πiε ε 1 2  e ϑ ε2 ,0 (U2 , 2iq) ϑ ε1 ,0 (U1 , 2iq)   2 2 ε1 ,ε2 =0

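where we set for any unitary $U$ and any $\tau \in \mathbb{H}$, $b \in \mathbb{R}$, $N \in \mathbb{N}^*$, $a \in \mathbb{Z}$, $0 \le a < N$,

$$\vartheta^{(N)}_{\frac{a}{N}, b}(U, \tau) = \sum_m e^{\pi i \tau (m + \frac{a}{N})^2 + 2\pi i (m + \frac{a}{N}) b}\, U^{Nm + a},$$

$$\vartheta(U, \tau) = \vartheta^{(1)}_{\frac{0}{1}, 0}(U, \tau) = \sum_m e^{\pi i \tau m^2}\, U^m.$$

As a result, if we specialize for example to the case when $q$ is even, the following identity follows for any integer $k \ge 1$ and $t_1, t_2 \in \mathbb{R}$,

$$\sum_{m,n=0}^{2k-1} e^{-\frac{2\pi i m n}{k}}\, \vartheta_{\frac{n}{2k}, \frac{m}{2}}(t_1, ik)\, \vartheta_{\frac{n}{2k}, \frac{m}{2}}(-t_1, ik)\, \vartheta_{\frac{m}{2k}, \frac{n}{2}}(t_2, ik)\, \vartheta_{\frac{m}{2k}, \frac{n}{2}}(-t_2, ik) = 2k\, \vartheta(t_1, ik)^2\, \vartheta(t_2, ik)^2,$$

where

$$\vartheta_{a,b}(z, \tau) = \sum_m e^{\pi i (m+a)^2 \tau + 2\pi i (m+a)(z+b)}, \quad z \in \mathbb{C}, \ \tau \in \mathbb{H}.$$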

This should be compared with Riemann's identities for theta functions (see [13]).

1. Constructing Projections of Trace α in $A_\alpha^\sigma$

We start by recalling the framework from [17]. Let $M$ be an abelian locally compact group, let $\hat{M}$ denote its topological dual and consider $G = M \times \hat{M}$ equipped with the Haar–Plancherel measure. Let $\beta : G \times G \to \mathbb{T}$ be the Heisenberg bicharacter defined by

$$\beta\big((x_1, y_1), (x_2, y_2)\big) = \langle x_1, y_2 \rangle, \quad x_1, x_2 \in M, \ y_1, y_2 \in \hat{M},$$

where $\langle \cdot, \cdot \rangle : M \times \hat{M} \to \mathbb{T}$ is the canonical pairing of $M$ with $\hat{M}$.


c. The formula Set β(x, y) = β(x, y) and β ∗ (x, y) = β(y, x) = β(y, x), x ∈ M, y ∈ M c π(x0 ,x00 ) f (t) = ht, x00 i f (t + x0 ), t, x0 ∈ M, x00 ∈ M defines a square-integrable projective unitary representation π : G → U L2 (M )) such that πx πy = β(x, y) πx+y , πx πy = ββ ∗ (x, y) πy πx , (πx )∗ = β(x, x) π−x , x, y ∈ G. If D is a lattice in G, we denote by |G/D| its covolume and by C ∗ (D, β) the C ∗ algebra generated by πw , w ∈ D. The subgroup D⊥ = {w ∈ G ; ββ ∗ (D, w) = 1} = {w ∈ G ; β(x, w) = β(w, x), ∀ x ∈ D} ⊂ G is a lattice in G. We equip D with the Haar measure which assigns mass |G/D⊥ |−1 = |G/ D| to each point and D⊥ with the Haar measure assigning mass one to each point (see [17, p. 278]). The twisted C ∗ -algebra C ∗ (D, β) acts on the left on the space S(M ) of Schwartz functions on M by Z X a(w)πw f, f ∈ S(M ), a ∈ L1 (D, β). (1.1) af = a(w)πw f dw = |G/D| D

w∈D

Replacing as in [17, p. 269] πz by πz∗ , we regard C ∗ (D⊥ , β) as being generated by acting on the left on S(M ). This action commutes with the left action of C ∗ (D, β) for πw πz = πz πw , w ∈ D, z ∈ D⊥ . The opposite algebra of C ∗ (D⊥ , β) is C ∗ (D⊥ , β), which acts on the right on S(M ) by Z X b(z) πz∗ f dz = b(z)πz∗ f , fb = (1.2) z∈D ⊥ D⊥ πz∗

f ∈ S(M ), b ∈ L1 (D⊥ , β) = L1 (D⊥ , β)opp . Moreover, S(M ) becomes a C ∗ (D, β) − C ∗ (D⊥ , β) equivalence bimodule with respect to the C ∗ -valued inner products h , iD : S(M ) × S(M ) → C ∗ (D, β) and h , iD⊥ : S(M )×S(M ) → C ∗ (D⊥ , β) defined for any f, g ∈ S(M ), w = (w0 , w00 ) ∈ D and z = (z 0 , z 00 ) ∈ D⊥ by Z hf, giD (w) = hf, πw giL2 (M ) = f (s) g(s + w0 ) hs, w00 i ds, (1.3) M

Z hf, giD⊥ (z) = hπz g, f iL2 (M ) =

f (s) g(s + z 0 ) hs, z 00 i ds.

(1.4)

M

If τD and τD⊥ are the canonical (normalized) traces on C ∗ (D, β) and respectively on C ∗ (D⊥ , β), the following Poisson type formula holds ([17, Thm.3.5]) for all f, g ∈ S(M ), (1.5) τD hf, giD = |G/D| τD⊥ hg, f iD⊥ .


In this paper we will be interested inthe case M = Rm × F , where F is a finite m P xj yj , x = (x1 , . . . , xm ), y = (y1 , . . . , ym ) ∈ Rm cyclic group. Since hx, yi = e j=1 and h[n]q , [m]q i = e nm q , [n]q , [m]q ∈ Zq , where e(t) = exp(2πit) for all t ∈ R, we c such that hx, yi = hy, xi for all x, y ∈ M = M c. Therefore shall identify M with M R(w1 , w2 ) = (−w2 , w1 ) defines a group automorphism of G. We notice that R(D⊥ ) = (RD)⊥ . In the sequel we will assume that RD = D. The Fourier transform Z (Ff )(s) = f (x) hx, si dx, f ∈ S(M ), s ∈ M M

extends to a unitary on L2 (M ) such that for all g ∈ G, F πg = β(g, g) πRg F.

(1.6)

Since β(Rw, Rw) = β(w, w), we see that for all ξ1 , ξ2 ∈ S(M ) and w ∈ D, hFξ1 , Fξ2 iD (Rw) = hFξ1 , πRw Fξ2 iL2 (M ) = β(w, w) hFξ1 , F πw ξ2 iL2 (M ) (1.7) = β(w, w) hξ1 , πw ξ2 iL2 (M ) = β(w, w) hξ1 , ξ2 iD (w), which yields for all ξ1 , ξ2 , ξ3 ∈ S(M ), Z Z F hξ1 , ξ2 iD ξ3 = hξ1 , ξ2 iD (w) Fπw ξ3 dw = β(w, w) hξ1 , ξ2 iD (w) πRw Fξ3 dw D

D

Z hFξ1 , Fξ2 iD (Rw) πRw Fξ3 dw = hFξ1 , Fξ2 iD (Fξ3 ).

=

(1.8)

D

In a similar way, (1.4) and (1.6) yield for all ξ1 , ξ2 ∈ S(M ) and z ∈ D⊥ , hFξ1 , Fξ2 iD⊥ (Rz) = hπRz Fξ2 , Fξ1 iL2 (M ) = β(z, z) hπz ξ2 , ξ1 iL2 (M ) = β(z, z) hξ1 , ξ2 iD⊥ (z), and further on for all ξ1 , ξ2 , ξ3 ∈ S(M ), Z hξ2 , ξ3 iD⊥ (z) β(z, z) Fπ−z ξ1 dz F ξ1 hξ2 , ξ3 iD⊥ = DZ⊥

=

∗ hFξ2 , Fξ3 iD⊥ (Rz) πRz Fξ1 dz

(1.9)

D⊥

= (Fξ1 )hFξ2 , Fξ3 iD⊥ . According to [5], (1.8) and (1.9) show that Z4 = Z/4Z acts on the imprimitivity ∗ , inducing automorphisms σ ∈ Aut C (D, β) , bimodule S(M ) by gξ = Fξ, g ∈ Z 4 D ∗ ⊥ ∗ σD⊥ ∈ Aut C (D , β) such that for all ξ1 , ξ2 ∈ S(M ), a ∈ C (D, β), b ∈ C ∗ (D⊥ , β),


σD hξ1 , ξ2 iD = hFξ1 , Fξ2 iD ,

(1.10)

F(aξ1 ) = σD (a) (Fξ1 ),

(1.11)

σD⊥ hξ1 , ξ2 iD⊥ = hFξ1 , Fξ2 iD⊥ ,

(1.12)

F(ξ1 b) = (Fξ1 ) σD⊥ (b).

(1.13)

To find σD , remark that (1.10) and (1.7) yield Z Z σD hξ1 , ξ2 iD = hFξ1 , Fξ2 iD (Rw) πRw dw = hξ1 , ξ2 iD (w) β(w, w) πRw dw. D

D

On the other hand σD hξ1 , ξ2 iD =

Z hξ1 , ξ2 iD (w) σD (πw ) dw, D

hence for all w ∈ D, σD (πw ) = β(w, w) πRw .

(1.14)

A similar computation shows that for all z ∈ D⊥ , σD⊥ (πz ) = β(z, z) πRz .

(1.15)

2 4 (πw ) = π−w , w ∈ D, σD⊥ (πz ) = π−z , z ∈ D⊥ , σD = idC ∗ (D,β) We notice that σD 4 and σD⊥ = idC ∗ (D⊥ ,β) .

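Proposition 1.1. Let $\alpha \in (0, 1)$ and $D = \mathbb{Z}\varepsilon_1 + \mathbb{Z}\varepsilon_2$ be a lattice in $G = \mathbb{R}^2$ such that $R\varepsilon_1 = \varepsilon_2$ (so $R\varepsilon_2 = -\varepsilon_1$) and $\beta(\varepsilon_j, \varepsilon_j) = 1$, $j = 1, 2$, $\beta\beta^*(\varepsilon_1, \varepsilon_2) = e^{2\pi i\alpha}$. Set $U_j = \pi_{\varepsilon_j}$, $j = 1, 2$ (hence $\sigma = \sigma_D$ is an automorphism of $A_\alpha = C^*(D, \beta)$ such that $\sigma(U_1) = U_2$, $\sigma(U_2) = U_1^{-1}$). Assume that there exists $\lambda \in \{\pm 1, \pm i\}$ and $f \in \mathcal{S}(\mathbb{R})$ such that $\mathcal{F}f = \lambda f$ and the element $a = \langle f, f \rangle_{D^\perp}$ is invertible in $C^*(D^\perp, \beta)$. Then $p = \langle f a^{-1/2}, f a^{-1/2} \rangle_D$ is a projection in $A_\alpha$ such that $\tau_D(p) = |G/D| = \alpha$ and $\sigma(p) = p$.

Proof. The first part follows from [16, 17], so we only have to prove that $\sigma(p) = p$. Since $\mathcal{F}f = \lambda f$ and $R(D^\perp) = D^\perp$, (1.12) yields $\sigma_{D^\perp}(a) = a$, hence $\sigma_{D^\perp}(a^{-1/2}) = a^{-1/2}$. By (1.13) $\mathcal{F}(f a^{-1/2}) = \lambda f a^{-1/2}$ and applying (1.10) we get $\sigma_D(p) = p$.

Next, we choose $D = \mathbb{Z}\varepsilon_1 + \mathbb{Z}\varepsilon_2$, with $\varepsilon_1 = (\sqrt{\alpha}, 0)$, $\varepsilon_2 = (0, \sqrt{\alpha})$. The lattice $D$ has covolume $|G/D| = \alpha$ in $G = \mathbb{R}^2$ and $RD = D$. Since $\beta\beta^*(\varepsilon_1, \varepsilon_2) = e^{2\pi i\alpha}$, the unitaries $U_j = \pi_{\varepsilon_j}$ satisfy $U_1 U_2 = e^{2\pi i\alpha} U_2 U_1$, hence $C^*(D, \beta)$ is canonically isomorphic to $A_\alpha$. The automorphism $\sigma = \sigma_D$ acts on $A_\alpha = C^*(D, \beta)$ as $\sigma(\pi_{m_1\varepsilon_1 + m_2\varepsilon_2}) = e^{2\pi i m_1 m_2 \alpha}\, \pi_{-m_2\varepsilon_1 + m_1\varepsilon_2}$, $m_1, m_2 \in \mathbb{Z}$, so $\sigma(U_1) = U_2$ and $\sigma(U_2) = U_1^{-1}$. Notice also that $\sigma^2(\pi_{m_1\varepsilon_1 + m_2\varepsilon_2}) = \pi_{-m_1\varepsilon_1 - m_2\varepsilon_2}$. The orthogonal lattice of $D$ with respect to $\beta\beta^*$ is $D^\perp = \mathbb{Z}\delta_1 + \mathbb{Z}\delta_2$, with $\delta_1 = \big(0, \frac{1}{\sqrt{\alpha}}\big)$, $\delta_2 = \big(\frac{1}{\sqrt{\alpha}}, 0\big)$. It has covolume $\alpha^{-1}$ in $\mathbb{R}^2$ and $R(D^\perp) = D^\perp$. If we denote $V_1 = \pi_{\delta_1}^* = \pi_{-\delta_1}$ and $V_2 = \pi_{\delta_2}^* = \pi_{-\delta_2}$, then $\sigma_{D^\perp}(V_1) = V_2^{-1}$ and $\sigma_{D^\perp}(V_2) = V_1$.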


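Consider also the Schwartz function $f(s) = e^{-\pi s^2}$, $s \in \mathbb{R}$. Since $\mathcal{F}f = f$, Proposition 1.1 shows that if $a = \langle f, f \rangle_{D^\perp}$ were invertible, then $\big\langle f \langle f, f \rangle_{D^\perp}^{-1/2}, f \langle f, f \rangle_{D^\perp}^{-1/2} \big\rangle_D$ would be a projection of trace $\alpha$ in $A_\alpha^\sigma$. The following basic formula ([12, p. 5]) will be used repeatedly throughout the paper:

$$\int_{\mathbb{R}} e^{-2\pi s^2 + 2\pi a s}\, ds = \frac{1}{\sqrt{2}}\, e^{\frac{\pi a^2}{2}}, \quad a \in \mathbb{C}. \tag{1.16}$$

Note also that for all $t > 0$,

$$\int_{\mathbb{R}} e^{-\pi t s^2}\, ds = \frac{1}{\sqrt{t}}.$$

We make use of (1.4) and (1.16) to get

$$\langle f, f \rangle_{D^\perp}(m_1\delta_1 + m_2\delta_2) = \langle f, f \rangle_{D^\perp}\Big(\frac{m_2}{\sqrt{\alpha}}, \frac{m_1}{\sqrt{\alpha}}\Big) = \int_{\mathbb{R}} f(s)\, f\Big(s + \frac{m_2}{\sqrt{\alpha}}\Big)\, e^{\frac{2\pi i s m_1}{\sqrt{\alpha}}}\, ds$$

$$= \int_{\mathbb{R}} e^{-2\pi s^2 - \frac{\pi m_2^2}{\alpha} - \frac{2\pi m_2 s}{\sqrt{\alpha}} + \frac{2\pi i m_1 s}{\sqrt{\alpha}}}\, ds = \frac{1}{\sqrt{2}}\, e^{-\frac{\pi(m_1^2 + m_2^2)}{2\alpha} - \frac{\pi i m_1 m_2}{\alpha}}, \tag{1.17}$$

therefore

$$a = \langle f, f \rangle_{D^\perp} = \sum_{z \in D^\perp} \langle f, f \rangle_{D^\perp}(z)\, \pi_z^* = \sum_{z \in D^\perp} \langle f, f \rangle_{D^\perp}(-z)\, \beta(z, z)\, \pi_z$$

$$= \sum_{m_1, m_2} \langle f, f \rangle_{D^\perp}(-m_1\delta_1 - m_2\delta_2)\, e^{\frac{2\pi i m_1 m_2}{\alpha}}\, \pi_{m_1\delta_1 + m_2\delta_2} = \frac{1}{\sqrt{2}} \sum_{m_1, m_2} e^{-\frac{\pi(m_1^2 + m_2^2)}{2\alpha} + \frac{\pi i m_1 m_2}{\alpha}}\, V_1^{m_1} V_2^{m_2} \in C^*(D^\perp, \beta). \tag{1.18}$$

A direct computation based on (1.18) yields $2\tau_{D^\perp}\big(\langle f, f \rangle_{D^\perp}^2\big) = \vartheta\big(0, \frac{i}{\alpha}\big)^2$. A computation analogous to (1.17) yields $\sqrt{2}\, \langle f, f \rangle_D(m_1\varepsilon_1 + m_2\varepsilon_2) = e^{-\frac{\pi(m_1^2 + m_2^2)\alpha}{2} + \pi i m_1 m_2 \alpha}$, therefore

$$\sqrt{2}\, \langle f, f \rangle_D = \sqrt{2}\, |G/D| \sum_{w \in D} \langle f, f \rangle_D(w)\, \pi_w = \alpha \sum_{m_1, m_2} e^{-\frac{\pi(m_1^2 + m_2^2)\alpha}{2} + \pi i m_1 m_2 \alpha}\, U_2^{m_2} U_1^{m_1}, \tag{1.19}$$

which yields further $2\tau_D\big(\langle f, f \rangle_D^2\big) = \alpha^2 \vartheta(0, i\alpha)^2$. On the other hand, (1.5) yields for all $\phi, \psi \in \mathcal{S}(\mathbb{R})$,

$$\tau_D\big(\langle \phi, \psi \rangle_D\big) = |G/D|\, \tau_{D^\perp}\big(\langle \psi, \phi \rangle_{D^\perp}\big) = \alpha\, \tau_{D^\perp}\big(\langle \psi, \phi \rangle_{D^\perp}\big), \tag{1.20}$$

therefore for all $f_1, f_2, f_3, f_4 \in \mathcal{S}(\mathbb{R})$,

$$\tau_D\big(\langle f_1, f_2 \rangle_D \langle f_3, f_4 \rangle_D\big) = \tau_D\big(\langle \langle f_1, f_2 \rangle_D f_3, f_4 \rangle_D\big) = \alpha\, \tau_{D^\perp}\big(\langle f_4, \langle f_1, f_2 \rangle_D f_3 \rangle_{D^\perp}\big) = \alpha\, \tau_{D^\perp}\big(\langle f_4, f_1 \rangle_{D^\perp} \langle f_2, f_3 \rangle_{D^\perp}\big). \tag{1.21}$$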
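The Gaussian formula (1.16) and the closed form (1.17) for the coefficients of $\langle f, f \rangle_{D^\perp}$ are easy to sanity-check numerically. The sketch below is ours, not the paper's: it uses a plain trapezoid rule on a finite interval (the helper names, the interval $[-8, 8]$ and the step count are assumptions) and compares the integral of $f(s) f(s + m_2/\sqrt{\alpha})\, e^{2\pi i s m_1/\sqrt{\alpha}}$ with $\frac{1}{\sqrt{2}} e^{-\pi(m_1^2 + m_2^2)/(2\alpha) - \pi i m_1 m_2/\alpha}$.

```python
import cmath
import math

def integrate(func, lo=-8.0, hi=8.0, steps=4000):
    """Composite trapezoid rule for a rapidly decaying complex integrand."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for k in range(1, steps):
        total += func(lo + k * h)
    return total * h

def gaussian_lhs(a):
    """Left-hand side of (1.16), computed numerically."""
    return integrate(lambda s: cmath.exp(-2 * math.pi * s * s + 2 * math.pi * a * s))

def gaussian_rhs(a):
    """Right-hand side of (1.16)."""
    return cmath.exp(math.pi * a * a / 2) / math.sqrt(2)

def coefficient_numeric(m1, m2, alpha):
    """Integral defining <f, f>_{D-perp}(m1*delta1 + m2*delta2) for f(s) = exp(-pi s^2)."""
    f = lambda s: math.exp(-math.pi * s * s)
    shift = m2 / math.sqrt(alpha)
    freq = m1 / math.sqrt(alpha)
    return integrate(lambda s: f(s) * f(s + shift) * cmath.exp(2j * math.pi * s * freq))

def coefficient_closed(m1, m2, alpha):
    """Closed form (1.17)."""
    return cmath.exp(-math.pi * (m1 * m1 + m2 * m2) / (2 * alpha)
                     - 1j * math.pi * m1 * m2 / alpha) / math.sqrt(2)

gaussian_gap = abs(gaussian_lhs(0.4 - 0.7j) - gaussian_rhs(0.4 - 0.7j))
coefficient_gap = abs(coefficient_numeric(1, 1, 0.8) - coefficient_closed(1, 1, 0.8))
```

Because the integrands are Gaussians, the trapezoid rule converges extremely fast here, so a modest step count already gives agreement close to machine precision.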


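Taking $f_j = f$ in the previous equality we get

$$\vartheta(0, i\alpha) = \frac{1}{\sqrt{\alpha}}\, \vartheta\Big(0, \frac{i}{\alpha}\Big). \tag{1.22}$$

This is one of the modularity conditions satisfied by theta functions. Its appearance is not really surprising, for the Poisson summation formula plays an important rôle in the proof of (1.5) (and implicitly of (1.20)). Actually we can do better by taking $f_a(s) = e^{-\pi(s+a)^2}$, $a \in \mathbb{R}$. A computation similar to the previous ones gives

$$\tau_D\big(2 \langle f_a, f \rangle_D \langle f, f_a \rangle_D\big) = \alpha^2 e^{-\pi a^2}\, \vartheta(-ia\sqrt{\alpha}, i\alpha)\, \vartheta(0, i\alpha),$$

$$\tau_{D^\perp}\big(2 \langle f_a, f_a \rangle_{D^\perp} \langle f, f \rangle_{D^\perp}\big) = \vartheta\Big(\frac{a}{\sqrt{\alpha}}, \frac{i}{\alpha}\Big)\, \vartheta\Big(0, \frac{i}{\alpha}\Big),$$

and using (1.21) and (1.22),

$$\alpha e^{-\pi a^2}\, \vartheta(-ia\sqrt{\alpha}, i\alpha)\, \vartheta(0, i\alpha) = \vartheta\Big(\frac{a}{\sqrt{\alpha}}, \frac{i}{\alpha}\Big)\, \vartheta\Big(0, \frac{i}{\alpha}\Big) = \sqrt{\alpha}\, \vartheta\Big(\frac{a}{\sqrt{\alpha}}, \frac{i}{\alpha}\Big)\, \vartheta(0, i\alpha),$$

hence for all $a \in \mathbb{R}$ and $\alpha > 0$,

$$\vartheta(-ia\sqrt{\alpha}, i\alpha) = \frac{1}{\sqrt{\alpha}}\, e^{\pi a^2}\, \vartheta\Big(\frac{a}{\sqrt{\alpha}}, \frac{i}{\alpha}\Big).$$

Taking $x = \frac{a}{\sqrt{\alpha}}$, $t = \frac{1}{\alpha}$, we fully recover the basic transformation formula for $\vartheta$:

$$\vartheta\Big(\frac{x}{it}, \frac{i}{t}\Big) = \sqrt{t}\, e^{\frac{\pi x^2}{t}}\, \vartheta(x, it), \quad x \in \mathbb{R}, \ t > 0. \tag{1.23}$$

Lemma 1.2. The operator $X = \sum_m e^{-\pi m^2 \alpha_0}\, V_1^m$ is positive and invertible for all $\alpha_0 > 0$.

Proof. As $X \in C^*(V_1) = C(\mathbb{T})$, we have $X(\lambda) = \sum_m e^{-\pi m^2 \alpha_0} \lambda^m$, $\lambda \in \mathbb{T}$, hence the spectrum of $X$ coincides with $\vartheta\big([0, 1), i\alpha_0\big)$, where

$$\vartheta(z, \tau) = \sum_m e^{\pi i m^2 \tau + 2\pi i m z}, \quad z \in \mathbb{C}, \ \tau \in \mathbb{H} = \{\zeta \in \mathbb{C} ; \operatorname{Im} \zeta > 0\}$$

denotes the customary theta function. The operator $X$ is self-adjoint for

$$X(e^{2\pi i t}) = 1 + 2 \sum_{m \ge 1} e^{-\pi m^2 \alpha_0} \cos(2\pi m t) \in \mathbb{R}, \quad t \in \mathbb{R}.$$

On the other hand $X(1) = 1 + 2 \sum_{m \ge 1} e^{-\pi m^2 \alpha_0} > 0$, so we only have to show that $0 \notin \vartheta\big([0, 1), i\alpha_0\big)$. This is true for $\frac{1 + i\alpha_0}{2}$ is the only zero of the function $\theta_{\alpha_0}(z) = \vartheta(z, i\alpha_0)$ in the fundamental domain $\{z \in \mathbb{C} ; 0 \le \operatorname{Re} z < 1, \ 0 \le \operatorname{Im} z < \alpha_0\}$.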
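Both the transformation formula (1.23) and the positivity statement of Lemma 1.2 can be probed numerically with truncated series. The sketch below is only an illustration under our own choices (function names, truncation at 30 terms, sampling of $[0, 1)$ at 400 points); it does not replace the proof.

```python
import cmath
import math

def theta(z, tau, terms=30):
    """Truncated series: sum over |m| <= terms of exp(pi*i*m^2*tau + 2*pi*i*m*z)."""
    return sum(cmath.exp(1j * math.pi * (m * m * tau + 2 * m * z))
               for m in range(-terms, terms + 1))

def transformation_gap(x, t):
    """|theta(x/(it), i/t) - sqrt(t) * exp(pi x^2 / t) * theta(x, it)| from (1.23)."""
    lhs = theta(x / (1j * t), 1j / t)
    rhs = math.sqrt(t) * math.exp(math.pi * x * x / t) * theta(x, 1j * t)
    return abs(lhs - rhs)

def min_theta_on_circle(alpha0, samples=400):
    """Smallest sampled value of the symbol X(e^{2 pi i x}) = theta(x, i*alpha0), x in [0, 1)."""
    return min(theta(k / samples, 1j * alpha0).real for k in range(samples))

gap_123 = transformation_gap(0.3, 0.7)
smallest = min_theta_on_circle(0.5)
```

The sampled minimum is attained at $x = \frac{1}{2}$, in line with Proposition 1.3 (i) stated later in this section.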


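Let $\alpha \in (0, \infty)$. We set

$$\alpha_m = e^{-\frac{\pi m^2}{2\alpha}}, \qquad \beta_m = \vartheta\Big(-\frac{m}{2\alpha}, \frac{i}{2\alpha}\Big), \quad m \in \mathbb{Z},$$

$$c(t) = \min_{x \in \mathbb{R}} \vartheta(x, it) > 0, \quad t > 0,$$

$$C(t) = \max_{x \in \mathbb{R}} \vartheta(x, it) = \vartheta(0, it) = \sum_n e^{-\pi n^2 t}, \quad t > 0,$$

$$\Phi_m(t) = \vartheta\Big(-\frac{t}{\sqrt{\alpha}} + \frac{m}{2\alpha}, \frac{i}{2\alpha}\Big), \quad m \in \mathbb{Z}.$$

We have $|\Phi_m(t)| \le \sum_n e^{-\frac{\pi n^2}{2\alpha}} = \vartheta\big(0, \frac{i}{2\alpha}\big) = C\big(\frac{1}{2\alpha}\big)$ for all $m \in \mathbb{Z}$ and $\alpha > 0$, hence the multiplication operator $D_m = M_{\Phi_m}$ is bounded on $L^2(\mathbb{R})$ and $\|D_m\| = \|\Phi_m\|_\infty \le C\big(\frac{1}{2\alpha}\big)$. On the other hand $\Phi_0(t) = \vartheta\big(-\frac{t}{\sqrt{\alpha}}, \frac{i}{2\alpha}\big) \in \mathbb{R}$ and $\Phi_0(t) \ge c\big(\frac{1}{2\alpha}\big) > 0$ for all $t \in \mathbb{R}$, hence $D_0$ is invertible and for all $m \in \mathbb{Z}$,

$$\|D_0^{-1} D_m\| = \big\|M_{\Phi_m / \Phi_0}\big\| = \sup_{t \in \mathbb{R}} \Big|\frac{\Phi_m(t)}{\Phi_0(t)}\Big| \le \frac{C\big(\frac{1}{2\alpha}\big)}{c\big(\frac{1}{2\alpha}\big)}. \tag{1.24}$$

We need more information on the behaviour of the function $\vartheta(\cdot, it)$ on $\mathbb{R}$ and prove the following

Proposition 1.3. (i) For any $t > 0$, $c(t) = \vartheta\big(\frac{1}{2}, it\big)$.
(ii) For any $t > 0.527$,

$$\frac{C(t)}{c(t)}\, \big(C(t) - 1\big) < 1.$$

Proof. The proof relies on the infinite product expansion for theta functions ([13, Prop. 14.1]), which says that for all $z \in \mathbb{C}$ and $\tau \in \mathbb{H}$,

$$\vartheta(z, \tau) = \prod_{m \ge 1} \big(1 - e^{2\pi i m \tau}\big) \prod_{m \ge 0} \big(1 + e^{(2m+1)\pi i \tau - 2\pi i z}\big)\big(1 + e^{(2m+1)\pi i \tau + 2\pi i z}\big). \tag{1.25}$$

Actually we will only use the easier fact that

$$\vartheta(z, \tau) = k_\tau \prod_{m \ge 0} \big(1 + e^{(2m+1)\pi i \tau - 2\pi i z}\big)\big(1 + e^{(2m+1)\pi i \tau + 2\pi i z}\big), \tag{1.26}$$

for some constant $k_\tau \ne 0$ which does not depend on $z$. To prove (i), we remark that for any $r_m \ge 0$ and any $\rho \in \mathbb{T}$ we have

$$(1 + r_m \rho)(1 + r_m \bar{\rho}) \ge (1 - r_m)^2. \tag{1.27}$$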

Taking $r = e^{-\pi t} \in (0, 1)$, $t > 0$, $r_m = r^{2m+1}$ and $\rho = e^{2\pi i x}$, $x \in \mathbb{R}$, we obtain from (1.26) and (1.27),

$$\frac{\vartheta(x, it)}{\vartheta\big(\frac{1}{2}, it\big)} = \prod_{m \ge 0} \frac{(1 + r_m \rho)(1 + r_m \bar{\rho})}{(1 - r_m)^2} \ge 1.$$

Since $\vartheta(x, it) > 0$, $x \in \mathbb{R}$, (i) follows. To prove (ii), we fix $t > 0$ and denote $P = \sqrt{\frac{C(t)}{c(t)}} = \sqrt{\frac{\vartheta(0, it)}{\vartheta(\frac{1}{2}, it)}}$. Equality (1.26) yields

$$\ln P = \sum_{m \ge 0} \big(\ln(1 + r^{2m+1}) - \ln(1 - r^{2m+1})\big).$$

The mean value theorem yields for any $\varepsilon \in (0, 1)$,

$$\frac{2\varepsilon}{1 + \varepsilon} < \ln(1 + \varepsilon) - \ln(1 - \varepsilon) < \frac{2\varepsilon}{1 - \varepsilon},$$

hence

$$\ln P < 2 \sum_{m \ge 0} \frac{r^{2m+1}}{1 - r^{2m+1}} < \frac{2r}{1 - r} \sum_{m \ge 0} r^{2m} = \frac{2r}{(1 - r)(1 - r^2)},$$

and therefore

$$\frac{C(t)}{c(t)} = P^2 < h(t) = \exp\Big(\frac{4e^{-\pi t}}{(1 - e^{-\pi t})(1 - e^{-2\pi t})}\Big) = \exp\Big(\frac{4r}{(1 - r)(1 - r^2)}\Big). \tag{1.28}$$

Using also

$$C(t) - 1 = \sum_{n \ne 0} e^{-\pi n^2 t} = 2 \sum_{n \ge 1} r^{n^2} < 2 \sum_{n \ge 0} r^{3n+1} = \frac{2r}{1 - r^3},$$

we get

$$\frac{C(t)}{c(t)}\, \big(C(t) - 1\big) < g(r) = \frac{2r}{1 - r^3} \cdot \exp\Big(\frac{4r}{(1 - r)(1 - r^2)}\Big).$$

The derivative of

$$\varphi(r) = \ln g(r) = \frac{4r}{(1 - r)(1 - r^3)} + \ln 2 + \ln \frac{r}{1 - r^3}, \quad r \in (0, 1),$$

is

$$\varphi'(r) = \frac{4}{(1 - r)(1 - r^3)} + \frac{4r(4r^2 + r + 1)}{(1 - r)(1 - r^3)^2} + \frac{2r^3 + 1}{r(1 - r^3)} > 0, \quad \forall\, r \in (0, 1),$$

therefore $\varphi$ is monotonically increasing on $(0, 1)$. Moreover, the equation $\varphi(r) = 0$ has a unique solution $r_0 \approx 0.191374$ in $(0, 1)$. If we set $\psi(t) = \varphi(e^{-\pi t})$, the function $\psi$ is then monotonically decreasing on $(0, \infty)$ and has a unique root $t_0 = -\frac{1}{\pi} \log r_0 \approx 0.526334$, therefore $C(t)\big(C(t) - 1\big) < c(t)$, for all $t > t_0 = 0.526334$.
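The numerical constants in this proof can be reproduced with a few lines of code. The sketch below is ours: it locates the root of $\varphi$ by bisection, using the expression for $\varphi$ exactly as displayed above (with denominator $(1 - r)(1 - r^3)$ in the first term), and separately evaluates the quantity $\frac{C(t)}{c(t)}(C(t) - 1)$ from truncated series, with $c(t) = \vartheta(\frac{1}{2}, it)$ as in part (i). The function names, bracket and iteration count are assumptions.

```python
import math

def phi(r):
    """phi(r) as displayed in the proof of Proposition 1.3."""
    return (4 * r / ((1 - r) * (1 - r ** 3))
            + math.log(2) + math.log(r / (1 - r ** 3)))

def bisect_root(func, lo, hi, iterations=100):
    """Bisection for a function that changes sign exactly once on [lo, hi]."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if func(lo) * func(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def C(t, terms=40):
    """C(t) = theta(0, it) = 1 + 2 * sum_{n>=1} exp(-pi n^2 t)."""
    return 1 + 2 * sum(math.exp(-math.pi * n * n * t) for n in range(1, terms + 1))

def c(t, terms=40):
    """c(t) = theta(1/2, it) = 1 + 2 * sum_{n>=1} (-1)^n exp(-pi n^2 t)."""
    return 1 + 2 * sum((-1) ** n * math.exp(-math.pi * n * n * t)
                       for n in range(1, terms + 1))

def criterion(t):
    """The quantity C(t)/c(t) * (C(t) - 1) from Proposition 1.3 (ii)."""
    return C(t) / c(t) * (C(t) - 1)

r0 = bisect_root(phi, 0.01, 0.9)
t0 = -math.log(r0) / math.pi
```

With these definitions `r0` and `t0` agree with the values quoted in the text to the digits shown, and `criterion(0.527)` is comfortably below 1, which suggests that the explicit bound $g(r)$ is not tight.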


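Proposition 1.4. Let $\alpha \in (0, 0.948]$. Then $a_0 = \sum_m \alpha_m D_m V_2^m$ is a bounded invertible operator in $C^*(D^\perp, \beta) = A_{\frac{1}{\alpha}}$ and $a_0 = \sqrt{2}\, \langle f, f \rangle_{D^\perp}$.

Proof. Since $\frac{1}{2\alpha} \ge \frac{1}{2 \cdot 0.948} > 0.527$, (1.24) and the previous proposition yield

$$\sum_{m \ne 0} \alpha_m \|D_0^{-1} D_m V_2^m\| \le \frac{C\big(\frac{1}{2\alpha}\big)}{c\big(\frac{1}{2\alpha}\big)} \sum_{m \ne 0} \alpha_m \le \frac{C\big(\frac{1}{2\alpha}\big)}{c\big(\frac{1}{2\alpha}\big)} \Big(C\Big(\frac{1}{2\alpha}\Big) - 1\Big) < 1.$$

This shows that $I + \sum_{m \ne 0} \alpha_m D_0^{-1} D_m V_2^m$ defines a bounded invertible operator, hence so is $a_0 = \sum_m \alpha_m D_m V_2^m$. The operators $a_0$ and $\sqrt{2}\, \langle f, f \rangle_{D^\perp} \in B(L^2(\mathbb{R}))$ coincide. To see this, notice that for all $m_1, m_2 \in \mathbb{Z}$,

$$\big(V_1^{m_1} V_2^{m_2} \phi\big)(s) = e^{-\frac{2\pi i m_1 s}{\sqrt{\alpha}}}\, \phi\Big(s - \frac{m_2}{\sqrt{\alpha}}\Big),$$

hence we may use (1.18) to obtain for all $\phi \in L^2(\mathbb{R})$, $s \in \mathbb{R}$,

$$\big(\sqrt{2}\, \langle f, f \rangle_{D^\perp}(\phi)\big)(s) = \sum_{m_1, m_2} e^{-\frac{\pi(m_1^2 + m_2^2)}{2\alpha} + \frac{\pi i m_1 m_2}{\alpha} - \frac{2\pi i m_1 s}{\sqrt{\alpha}}}\, \phi\Big(s - \frac{m_2}{\sqrt{\alpha}}\Big) = \Big(\sum_{m_2} \alpha_{m_2} D_{m_2} V_2^{m_2} \phi\Big)(s) = (a_0 \phi)(s).$$

Corollary 1.5. For any $\alpha \in (0, 1)$, the rotation algebra $A_\alpha$ contains a projection $e = e_\alpha$ of trace $\alpha$ such that $\sigma(e) = e$.

Remarks. (i) For all $m \in \mathbb{Z}$ we have $D_m V_2 = V_2 D_{m+2}$.
(ii) If $\alpha$ is irrational, the rotation algebra $A_{\frac{1}{\alpha}} = C^*(D^\perp, \beta)$ is simple, thus isomorphic to the C*-algebra $C^*(W_1, W_2) \subset B(\ell^2(\mathbb{Z}))$ generated by the unitaries $W_1 \xi_k = \xi_{k+1}$, $W_2 \xi_k = e^{\frac{2\pi i k}{\alpha}} \xi_k$, where $\xi_k$ is an orthonormal basis of $\ell^2(\mathbb{Z})$. If $\rho = e^{\frac{2\pi i}{\alpha}}$, the matrix coefficients of $a = \langle f, f \rangle_{D^\perp}$ in this representation of $A_{\frac{1}{\alpha}}$ are

$$\langle a \xi_k, \xi_l \rangle = \frac{1}{\sqrt{2}} \sum_{m_1, m_2} \rho^{\frac{m_1 m_2}{2}} \alpha_{m_1} \alpha_{m_2} \langle W_1^{m_1} W_2^{m_2} \xi_k, \xi_l \rangle = \frac{1}{\sqrt{2}} \sum_{m_1, m_2} \rho^{\frac{m_1 m_2}{2} + m_2 k} \alpha_{m_1} \alpha_{m_2} \langle \xi_{k + m_1}, \xi_l \rangle$$

$$= \frac{1}{\sqrt{2}}\, \alpha_{l-k} \sum_{m_2} \rho^{\frac{(k+l) m_2}{2}} \alpha_{m_2} = \frac{1}{\sqrt{2}}\, \alpha_{l-k}\, \vartheta\Big(\frac{k+l}{2\alpha}, \frac{i}{2\alpha}\Big) = \frac{1}{\sqrt{2}}\, \alpha_{l-k}\, \beta_{l+k}, \quad k, l \in \mathbb{Z},$$

therefore $a\sqrt{2} = \sum_m \alpha_m D_m U^m$, where $U \xi_k = \xi_{k+1}$, $D_m \xi_k = \beta_{2k-m} \xi_k$, $m, k \in \mathbb{Z}$. The diagonal operators are bounded and invertible because $0 < c \le \beta_n \le C$ for all $n \in \mathbb{Z}$.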


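2. Existence of Projections of Trace qα − p and p − qα

Let $q \in \mathbb{N}^*$, $q \ge 2$ and $p \in \mathbb{Z}$ such that $0 < \gamma = \alpha - \frac{p}{q} \le \frac{1}{2}$ and there exists $p_0 \in \mathbb{Z}$ such that $p = p_0^2 \bmod q$. We choose $M = \mathbb{R} \times \mathbb{Z}_q$ and $D = \mathbb{Z}\varepsilon_1 + \mathbb{Z}\varepsilon_2 \subset G = M \times \hat{M}$, with

$$\varepsilon_1 = \big((\sqrt{\gamma}, [p_0]_q), (0, [0]_q)\big) \quad \text{and} \quad \varepsilon_2 = \big((0, [0]_q), (\sqrt{\gamma}, [p_0]_q)\big).$$

Then $D$ is a lattice in $G$ and $[0, \sqrt{\gamma}) \times \mathbb{Z}_q \times [0, \sqrt{\gamma}) \times \mathbb{Z}_q$ a fundamental domain for $G/D$. Since $\mathbb{Z}_q$ is endowed with the Haar–Plancherel measure (which assigns mass $q^{-1/2}$ to each point from $\mathbb{Z}_q$), we get $|G/D| = q\gamma = q\alpha - p$. An easy computation gives $D^\perp = \mathbb{Z}\delta_1 + \mathbb{Z}\delta_2$ with

$$\delta_1 = \Big((0, [0]_q), \Big(\frac{1}{q\sqrt{\gamma}}, [\bar{p}]_q\Big)\Big) \quad \text{and} \quad \delta_2 = \Big(\Big(\frac{1}{q\sqrt{\gamma}}, [\bar{p}]_q\Big), (0, [0]_q)\Big),$$

where $\bar{p} \in \mathbb{Z}$ is such that $p_0 \bar{p} = -1 \bmod q$. Set $V_j = \pi_{\delta_j}^* = \pi_{-\delta_j}$, $j = 1, 2$. For any $\phi \in \mathcal{S}(\mathbb{R})$, we consider $\phi_1, \phi_2 \in \mathcal{S}(M)$ defined by

$$\phi_1(s, [n]_q) = \phi(s), \qquad \phi_2(s, [n]_q) = \sqrt{q}\, \delta_{[0]_q, [n]_q}\, \phi(s), \quad s \in \mathbb{R}, \ [n]_q \in \mathbb{Z}_q,$$

where $\delta_{a,b}$, $a, b \in \mathbb{Z}_q$ denotes Kronecker's symbol. Denote also $\delta_a(b) = \delta_{a,b}$, $a, b \in \mathbb{Z}_q$. Notice that if $\mathcal{F}\phi = \phi$ on $\mathbb{R}$, then $\mathcal{F}(\phi_1 + \phi_2) = \phi_1 + \phi_2$ on $M$ because $\mathcal{F}\big(1 + \sqrt{q}\, \delta_{[0]_q}\big) = 1 + \sqrt{q}\, \delta_{[0]_q}$ (again, it is essential that the measure on $\mathbb{Z}_q$ is the Haar–Plancherel one). If we set

$$b_{m_1, m_2} = \int_{\mathbb{R}} \phi(s)\, \phi\Big(s + \frac{m_2}{q\sqrt{\gamma}}\Big)\, e\Big(\frac{s m_1}{q\sqrt{\gamma}}\Big)\, ds, \quad m_1, m_2 \in \mathbb{Z},$$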

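then

$$\langle \phi_1, \phi_1 \rangle_{D^\perp}(m_1\delta_1 + m_2\delta_2) = \langle \phi_1, \phi_1 \rangle_{D^\perp}\Big(\Big(\frac{m_2}{q\sqrt{\gamma}}, [m_2\bar{p}]_q\Big), \Big(\frac{m_1}{q\sqrt{\gamma}}, [m_1\bar{p}]_q\Big)\Big)$$

$$= \int_{\mathbb{R} \times \mathbb{Z}_q} \phi_1(s, [n]_q)\, \phi_1\Big(s + \frac{m_2}{q\sqrt{\gamma}}, [n + m_2\bar{p}]_q\Big)\, e\Big(\frac{s m_1}{q\sqrt{\gamma}} + \frac{n m_1 \bar{p}}{q}\Big)\, ds\, d[n]_q$$

$$= \frac{1}{\sqrt{q}} \sum_{n \in \mathbb{Z}_q} e\Big(\frac{n m_1 \bar{p}}{q}\Big)\, b_{m_1, m_2} = \begin{cases} 0, & \text{if } q \nmid m_1, \\ \sqrt{q}\, b_{m_1, m_2}, & \text{if } q \mid m_1, \end{cases}$$

$$\langle \phi_1, \phi_2 \rangle_{D^\perp}(m_1\delta_1 + m_2\delta_2) = \int_{\mathbb{R} \times \mathbb{Z}_q} \phi_1(s, [n]_q)\, \phi_2\Big(s + \frac{m_2}{q\sqrt{\gamma}}, [n + m_2\bar{p}]_q\Big)\, e\Big(\frac{s m_1}{q\sqrt{\gamma}} + \frac{n m_1 \bar{p}}{q}\Big)\, ds\, d[n]_q = e\Big(-\frac{m_1 m_2 \bar{p}^2}{q}\Big)\, b_{m_1, m_2}, \tag{2.1}$$

$$\langle \phi_2, \phi_1 \rangle_{D^\perp}(m_1\delta_1 + m_2\delta_2) = b_{m_1, m_2},$$

$$\langle \phi_2, \phi_2 \rangle_{D^\perp}(m_1\delta_1 + m_2\delta_2) = \begin{cases} 0, & \text{if } q \nmid m_2, \\ \sqrt{q}\, b_{m_1, m_2}, & \text{if } q \mid m_2, \end{cases}$$

and consequently for all $m_1, m_2 \in \mathbb{Z}$,

$$\langle \phi_1 + \phi_2, \phi_1 + \phi_2 \rangle_{D^\perp}(m_1\delta_1 + m_2\delta_2) = b_{m_1, m_2}\, c_{m_1, m_2},$$

where

$$c_{m_1, m_2} = \begin{cases} 1 + e\Big(-\dfrac{m_1 m_2 \bar{p}^2}{q}\Big), & \text{if } q \nmid m_1 \text{ and } q \nmid m_2, \\ 2 + \sqrt{q}, & \text{if } q \nmid m_1, \ q \mid m_2 \text{ or } q \mid m_1, \ q \nmid m_2, \\ 2 + 2\sqrt{q}, & \text{if } q \mid m_1 \text{ and } q \mid m_2. \end{cases}$$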

2 ˜ If φ(s) = e−πs √ , set φ = φ1 + φ2 . Computations similar to those from Sect. 1 (with q γ instead of α ) yield for all m1 , m2 ∈ Z,

√

2

2

) 2 − πim1 m2 1 − π(m2q12+m γ q2 γ , bm1 ,m2 = √ e 2

and X √ √ ∗ ˜ φi ˜ D⊥ = bm1 ,m2 cm1 ,m2 πm a 2 = 2 hφ, 1 δ1 +m2 δ2 m1 ,m2 √ X = 2 b−m1 ,−m2 c−m1 ,−m2 β(−m2 δ2 , −m1 δ1 ) πm1 δ1 +m2 δ2 1 ,m2 √ mX √ X = 2 b−m1 ,−m2 c−m1 ,−m2 V2m2 V1m1 = 2 bm1 ,m2 cm1 ,m2 V2m2 V1m1 m1 ,m2

=

X

π(m2 +m2 ) 1 2 − πim1 m2 − 2q 2 γ q2 γ

cm1 ,m2 e

m1 ,m2

=

X

−

e

πm2 2 2q 2 γ

m2

X

cm1 ,m2 e

m1 ,m2

V2m2 V1m1

2πim1 m2 p¯ 2 q

+

πim1 m2 q2 γ

−

πm2 1 2q 2 γ

! V1m1

V2m2 .

m1 2 − πm 2q 2 γ

We set αm = e

, m ∈ Z and X 1 mn + p¯2 cm,n αm e Dn = V1m , n ∈ Z. q 2qγ m

For all f ∈ L2 (R × Zq ),

mx mk p¯ f (x, [k]q ), V1m f (x, [k]q ) = e − √ − q γ q

hence Dn is the multiplication operator M8n on L2 (R × Zq ), with 8n ∈ L∞ (R × Zq ) given by X 1 mn mk p¯ mx cm,n αm e + p¯2 − √ − 8n (x, [k]q ) = q 2qγ q γ q m √ P and a 2 = αn Dn V2n . We have n

mx mk p¯ cm,0 αm e − √ − q γ q m mk p¯ lx mx √ X √ X + (2 + q) = (2 + 2 q) αql e − √ αm e − √ − γ q γ q l q6 | m i k p¯ x i x √ √ (2.2) + (2 + q ) ϑ − √ − , 2 = qϑ − √ , 2γ q q 2q γ γ γ 1 1 √ √ + > 0, q c ≥ (2 + q ) c 2q 2 γ 2γ

80 (x, [k]q ) =

X


where c(t) = min ϑ(x, it) = ϑ x∈R

|8n (x, [n]q )| ≤

X

1 2 , it

|cm,n | αm =


, t > 0 and

X

m

cql,n αql +

l

X

|cm,n | αm

q6 | m

i i i √ √ + (2 + q ) ϑ 0, 2 − ϑ 0, ≤ (2 + 2 q ) ϑ 0, 2γ 2q γ 2γ (2.3) 1 1 √ √ + qC , = (2 + q ) C 2q 2 γ 2γ where C(t) = max ϑ(x, it) = ϑ(0, it), t > 0.We combine (2.2) and (2.3) to obtain for x∈R

all n ∈ Z,

8n

= sup 8n (x, [k]q ) ; (x, [k]q ) ∈ R × Zq

80 (x, [k]q )

80 ∞ √ √ 1 (2 + q) C 2q12 γ + q C 2γ √ . ≤ √ 1 (2 + q) c 2q12 γ + q c 2γ

(2.4)

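The function $h(t)$ from (1.28) is monotonically decreasing on $(0, \infty)$ and $\frac{C(t)}{c(t)} < h(t)$ for all $t > 0$, therefore (2.4) yields for all $n \in \mathbb{Z}$,

$$\Big\|\frac{\Phi_n}{\Phi_0}\Big\|_\infty \le \max\Big\{h\Big(\frac{1}{2q^2\gamma}\Big), h\Big(\frac{1}{2\gamma}\Big)\Big\} = h\Big(\frac{1}{2q^2\gamma}\Big),$$

and

$$S = \sum_{n \ne 0} \alpha_n \Big\|\frac{\Phi_n}{\Phi_0}\Big\|_\infty \le 2h\Big(\frac{1}{2q^2\gamma}\Big) \sum_{n \ge 1} \alpha_n.$$

But

$$\sum_{n \ge 1} \alpha_n = \sum_{n \ge 1} e^{-\frac{\pi n^2}{2q^2\gamma}} < \sum_{n \ge 0} e^{-\frac{\pi(3n+1)}{2q^2\gamma}} = \frac{e^{-\frac{\pi}{2q^2\gamma}}}{1 - e^{-\frac{3\pi}{2q^2\gamma}}},$$

therefore

$$S \le 2h\Big(\frac{1}{2q^2\gamma}\Big) \cdot \frac{e^{-\frac{\pi}{2q^2\gamma}}}{1 - e^{-\frac{3\pi}{2q^2\gamma}}} = g(r),$$

where $r = e^{-\frac{\pi}{2q^2\gamma}}$ and $g$ is as in the proof of Proposition 1.3. We have already shown there that $g(r) < 1$ if $\frac{1}{2q^2\gamma} > 0.527$, hence in this case $I + \sum_{n \ne 0} \alpha_n D_0^{-1} D_n V_2^n$ is bounded and invertible on $L^2(\mathbb{R} \times \mathbb{Z}_q)$, thus $\sqrt{2}\, \langle \tilde{\phi}, \tilde{\phi} \rangle_{D^\perp} = \sum_n \alpha_n D_n V_2^n$ is invertible. Since the analogue of Proposition 1.1 with $\mathbb{R}$ replaced by $\mathbb{R} \times \mathbb{Z}_q$ holds, we get the following

Proposition 2.1. Let $\alpha \in (0, 1)$. If $q \ge 2$ and $p$ are integers such that $0 < \alpha - \frac{p}{q} < \frac{0.948}{q^2}$ and there exists $p_0 \in \mathbb{Z}$ such that $p = p_0^2 \bmod q$, then $a$ is invertible and $A_\alpha^\sigma$ contains a projection of trace $q\alpha - p$.

One proves in a similar way that if $0 < \gamma' = \frac{p}{q} - \alpha < \frac{0.948}{q^2}$ and there exists $p_1 \in \mathbb{Z}$ such that $p = -p_1^2 \bmod q$, then the corresponding $a$ is invertible, hence $A_\alpha^\sigma$ contains a projection of trace $p - q\alpha$. The only difference is that one starts with $\varepsilon_1 = \big((0, [0]_q), (\sqrt{\gamma'}, [p_1]_q)\big)$ and $\varepsilon_2 = \big((\sqrt{\gamma'}, [p_1]_q), (0, [0]_q)\big)$.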
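The hypotheses of Proposition 2.1 and of its companion statement are purely arithmetic, so the traces they produce for a given $\alpha$ can be listed mechanically. The helper below is a hypothetical illustration (the names `is_square_mod` and `projection_traces` and the search range are ours): for each $q$ up to `q_max` it tests both conditions and records the resulting trace $q\alpha - p$ or $p - q\alpha$.

```python
import math

def is_square_mod(n, q):
    """True if n is congruent to x*x modulo q for some integer x."""
    return any((x * x - n) % q == 0 for x in range(q))

def projection_traces(alpha, q_max, bound=0.948):
    """List (q, p, trace) for which the stated hypotheses hold, 2 <= q <= q_max."""
    found = []
    for q in range(2, q_max + 1):
        for p in range(0, q + 1):
            gamma = alpha - p / q
            if 0 < gamma < bound / q ** 2 and is_square_mod(p, q):
                # p = p0^2 mod q: projection of trace q*alpha - p
                found.append((q, p, q * alpha - p))
            elif 0 < -gamma < bound / q ** 2 and is_square_mod(-p, q):
                # p = -p1^2 mod q: projection of trace p - q*alpha
                found.append((q, p, p - q * alpha))
    return found

alpha = (math.sqrt(5) - 1) / 2  # example value, chosen by us
traces = projection_traces(alpha, 3)
```

For this example value of $\alpha$ the search up to $q = 3$ returns the pairs $(q, p) = (2, 1)$ and $(3, 2)$, with traces $2\alpha - 1$ and $2 - 3\alpha$ respectively.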


F. P. Boca

3. More on the Case α = q −1 , q ∈ Z, q ≥ 2 In this section we will focus mainly on the case α = q1 , q ∈ Z, q ≥ 2, obtaining lower bounds for the norm of Hα = πε1 + πε∗1 + πε2 + πε∗2 and closed formulae for the projection constructed in Corollary 1.5 (notation is as in Sect. 1). In the beginning we will assume that α ∈ (0, 1) is such that the operator hf, f iD⊥ from (1.18) is invertible (for example 0 < α < 0.948), hence D E −1 −1 e = f hf, f iD⊥2 , f hf, f iD⊥2 D

Aσα .

is a projection of trace α in C ∗ (D⊥ , β) ' A α1 defined by

Then, according to [16], 2 : eAα e ' eC ∗ (D, β)e →

−1

−1

2(exe) = hf, f iD⊥2 hf, xf iD⊥ hf, f iD⊥2 , x ∈ C ∗ (D, β) is a *-isomorphism. Direct computations based on (1.4) and (1.16) yield m2 m1 hf, π±ε1 f iD⊥ (m1 δ1 + m2 δ2 ) = hf, π±ε1 f iD⊥ √ , √ α α 2πism Z √ 1 m2 √ = f (s) f s + √ ± α e α ds α R Z √ 2 2πism m −πs2 −π s+ √α2 ± α + √α 1 ds = e R π(m2 +m2 ) πim1 m2 πα 1 1 2 = √ e− 2 − 2α − α ∓πm2 +πim1 2

and

Z hf, π±ε2 f iD⊥ (m1 δ1 + m2 δ2 ) = R

m2 ±2πi√α f (s)f s + √ e α

m

s+ √α2 +

2πism1 √ α

ds

π(m2 +m2 ) πim1 m2 πα 1 1 2 = √ e− 2 − 2α − α ∓πm1 +πim2 , 2

hence 1 X − πα − π(m21 +m22 ) + πim1 m2 2α α e 2 hf, Hα f iD⊥ = √ 2 m1 ,m2

(3.1)

· (eπm2 + e−πm2 )eπim1 + (eπm1 + e−πm1 )eπim2 V1m1 V2m2 .

If $\alpha = q^{-1}$, $q\in\mathbb{Z}$, $q\ge 2$, then $A_{1/\alpha}\simeq C^*(D^\perp,\beta)$ is isomorphic to $C(\mathbb{T}^2)$ and $\Theta(eH_\alpha e)$ is identified with the function
\[ \Theta(eH_\alpha e)(z_1,z_2) = \frac{F(z_1,z_2)}{G(z_1,z_2)},\qquad z_1,z_2\in\mathbb{T}, \]
where
\[ F(z_1,z_2) = \sum_{m_1,m_2} e^{-\frac{\pi\alpha}{2}-\frac{\pi(m_1^2+m_2^2)}{2\alpha}+\frac{\pi i m_1m_2}{\alpha}} \Big( (e^{\pi m_2}+e^{-\pi m_2})e^{\pi i m_1} + (e^{\pi m_1}+e^{-\pi m_1})e^{\pi i m_2} \Big) z_1^{m_1}z_2^{m_2}, \]
\[ G(z_1,z_2) = \sum_{m_1,m_2} e^{-\frac{\pi(m_1^2+m_2^2)}{2\alpha}+\frac{\pi i m_1m_2}{\alpha}} z_1^{m_1}z_2^{m_2}. \]

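Since
\[ \|H_\alpha\| \ge \|eH_\alpha e\| = \|\Theta(eH_\alpha e)\|_\infty \ge \frac{F(1,1)}{G(1,1)}, \]
we further get for $q$ even
\[ \|H_{1/q}\| \ge \phi_0\Big(\frac{1}{q}\Big) = 4e^{-\frac{\pi}{2q}}\,\frac{\vartheta\big(\frac{i}{2},\frac{iq}{2}\big)\,\vartheta\big(\frac{1}{2},\frac{iq}{2}\big)}{\vartheta\big(0,\frac{iq}{2}\big)^2} = 4\,\frac{\vartheta\big(\frac{1}{q},\frac{2i}{q}\big)\,\vartheta\big(\frac{1}{2},\frac{iq}{2}\big)}{\vartheta\big(0,\frac{2i}{q}\big)\,\vartheta\big(0,\frac{iq}{2}\big)}, \tag{3.2} \]
and for $q$ odd
\[ \|H_{1/q}\| \ge \phi_1\Big(\frac{1}{q}\Big) = 4e^{-\frac{\pi}{2q}}\,\frac{\vartheta\big(\frac{i}{2},\frac{iq}{2}\big)\,\vartheta\big(\frac{1}{2},\frac{iq}{2}\big) - 2\,\vartheta^{odd}\big(\frac{i}{2},\frac{iq}{2}\big)\,\vartheta^{odd}\big(\frac{1}{2},\frac{iq}{2}\big)}{\vartheta\big(0,\frac{iq}{2}\big)^2 - 2\,\vartheta^{odd}\big(0,\frac{iq}{2}\big)^2}, \tag{3.3} \]
where we set
\[ \vartheta^{odd}(z,\tau) = \sum_{m\ \mathrm{odd}} e^{\pi i m^2\tau + 2\pi i m z} = \vartheta(z,\tau) - \vartheta(2z,4\tau),\qquad z\in\mathbb{C},\ \tau\in\mathbb{H}. \]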

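Taking as usual
\[ \vartheta_{a,b}(z,\tau) = \sum_{m} e^{\pi i(m+a)^2\tau + 2\pi i(m+a)(z+b)},\qquad z\in\mathbb{C},\ \tau\in\mathbb{H},\ a,b\in\mathbb{R}, \tag{3.4} \]
we make use of $\vartheta\big(\frac{i}{2},it\big) = e^{\frac{\pi}{4t}}\,\vartheta_{\frac{1}{2t},0}(0,it)$ and $\vartheta\big(\frac{1}{2},it\big) = \vartheta_{0,\frac12}(0,it)$ to get
\[ \phi_0(t) = \frac{4\,\vartheta_{0,\frac12}\big(0,\frac{i}{2t}\big)\,\vartheta_{t,0}\big(0,\frac{i}{2t}\big)}{\vartheta\big(0,\frac{i}{2t}\big)^2}. \]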

The graphs of φ0 and φ1 are drawn in Figs. 3.1 and 3.2 using Mathematica. They should be compared with Hofstadter's butterfly ([11]). Estimates (3.2) and (3.3) are quite accurate, as suggested by Table 3.1. The norm of $H_{1/q}$ has been computed numerically using [1, Cor. 3.2].

We notice that equality holds in (3.2) if $q = 2$. To see this, remark first that $H_{1/2}^2 = 4 + H_0$, which yields $\|H_{1/2}\| = 2\sqrt{2}$. Using (3.4) and (1.15), we readily see that
\[ \vartheta_{0,\frac12}(0,i) = \vartheta\Big(\frac12,i\Big) = \vartheta_{\frac12,0}(0,i). \]
On the other hand, Jacobi's identity ([13, p. 23]) yields
\[ \vartheta(0,i)^4 = \vartheta_{0,\frac12}(0,i)^4 + \vartheta_{\frac12,0}(0,i)^4, \]
hence $\vartheta\big(\frac12,i\big) = 2^{-\frac14}\,\vartheta(0,i)$ and we get equality in (3.2) in this case.
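The case $q = 2$ can be checked numerically from the series (3.4). The block below is a minimal sketch, not part of the original argument: `theta_ab` is an illustrative helper that truncates the series at a cutoff chosen here for convenience, and the variable names are assumptions.

```python
import cmath
import math


def theta_ab(a, b, z, tau, cutoff=20):
    """Truncated series for theta_{a,b}(z, tau) as defined in (3.4)."""
    total = 0j
    for m in range(-cutoff, cutoff + 1):
        k = m + a
        total += cmath.exp(1j * math.pi * (k * k * tau + 2.0 * k * (z + b)))
    return total


tau = 1j
t00 = theta_ab(0.0, 0.0, 0.0, tau)     # theta(0, i)
t01 = theta_ab(0.0, 0.5, 0.0, tau)     # theta_{0,1/2}(0, i)
t10 = theta_ab(0.5, 0.0, 0.0, tau)     # theta_{1/2,0}(0, i)
t_half = theta_ab(0.0, 0.0, 0.5, tau)  # theta(1/2, i)

# Jacobi's identity at tau = i.
jacobi_residual = abs(t00 ** 4 - t01 ** 4 - t10 ** 4)

# phi_0(1/2) from the closed formula following (3.4); it should equal 2*sqrt(2).
phi0_at_half = (4.0 * t01 * t10 / t00 ** 2).real

print(jacobi_residual, phi0_at_half, 2.0 * math.sqrt(2.0))
```

The series converges very fast for $\tau = i$, so a small cutoff is enough for double precision.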


Fig. 3.1. The graph of φ0 on (0, 1/2]

Fig. 3.2. The graph of φ1 on (0, 1/2]


Table 3.1.

α        ‖Hα‖      φ0(α)     φ1(α)
1/2      2.82842   2.82842
1/3      2.73205             2.73205
1/4      2.82842   2.78648
1/5      2.96645             2.94109
1/6      3.09557   3.08292
1/7      3.20330             3.19690
1/8      3.29066   3.28709
1/9      3.36165             3.35943
1/10     3.42005   3.41855
1/11     3.46880             3.46771
1/12     3.51004   3.50922
1/13     3.54537             3.54473
1/50     3.87630   3.87628
1/51     3.87869             3.87867
1/100    3.93766   3.93765
1/101    3.93827             3.93827
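The lower bounds in Table 3.1 can be recomputed directly from (3.2) and (3.3). What follows is a minimal sketch under stated assumptions: the helper names (`theta`, `phi0`, `phi1`, `lower_bound`) and the truncation cutoff are illustrative choices, and the norm column of the table is not recomputed here.

```python
import cmath
import math


def theta(z, tau, odd_only=False, cutoff=30):
    """Truncated series for theta(z, tau); with odd_only=True, the sum over odd m."""
    total = 0j
    for m in range(-cutoff, cutoff + 1):
        if odd_only and m % 2 == 0:
            continue
        total += cmath.exp(1j * math.pi * (m * m * tau + 2.0 * m * z))
    return total


def phi0(q):
    """Right-hand side of (3.2) at alpha = 1/q (intended for q even)."""
    tau = 0.5j * q
    value = theta(0.5j, tau) * theta(0.5, tau) / theta(0.0, tau) ** 2
    return (4.0 * math.exp(-math.pi / (2.0 * q)) * value).real


def phi1(q):
    """Right-hand side of (3.3) at alpha = 1/q (intended for q odd)."""
    tau = 0.5j * q
    num = (theta(0.5j, tau) * theta(0.5, tau)
           - 2.0 * theta(0.5j, tau, odd_only=True) * theta(0.5, tau, odd_only=True))
    den = theta(0.0, tau) ** 2 - 2.0 * theta(0.0, tau, odd_only=True) ** 2
    return (4.0 * math.exp(-math.pi / (2.0 * q)) * num / den).real


def lower_bound(q):
    return phi0(q) if q % 2 == 0 else phi1(q)


for q in (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 50, 51, 100, 101):
    print(f"1/{q}: {lower_bound(q):.5f}")
```

The table entries appear to be truncated rather than rounded, so a comparison should allow a difference of about one unit in the fifth decimal place.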

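Notice also that the numerical computations above suggest that equality may also hold for α = 1/3, which would produce an interesting relation between $\vartheta\big(\frac12,\frac{3i}{2}\big)$, $\vartheta\big(\frac13,\frac{2i}{3}\big)$, $\vartheta\big(\frac16,\frac{i}{6}\big)$ and $\vartheta\big(0,\frac{i}{6}\big)$.

Next, we derive explicit formulae for the projection $e$ of trace $\alpha$ constructed in Proposition 1.1 when $\alpha = q^{-1}$, $q\in\mathbb{Z}$, $q\ge 2$. Again, notation is as in Sect. 1. For any unitary operator $U$ and $\tau\in\mathbb{H}$, $b\in\mathbb{R}$, $N\in\mathbb{N}^*$, $a\in\mathbb{Z}$, we define the bounded operators
\[ \vartheta(U,\tau) = \sum_{m} e^{\pi i\tau m^2}U^m, \]
\[ \vartheta_{0,b}(U,\tau) = \sum_{m} e^{\pi i\tau m^2 + 2\pi i m b}U^m, \]
\[ \vartheta^{(N)}_{\frac{a}{N},b}(U,\tau) = \sum_{m} e^{\pi i\tau\left(m+\frac{a}{N}\right)^2 + 2\pi i\left(m+\frac{a}{N}\right)b}\,U^{Nm+a}. \]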

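The explicit expression of the projection $e$ from Sect. 1 is given by the following

Proposition 3.1. (i) If $q$ is even, then
\[ e = \frac{\displaystyle\sum_{r,s=0}^{q-1} e^{-\frac{\pi i r s}{q}}\,\vartheta^{(q)}_{\frac{s}{q},\frac{r}{2}}\Big(U_2,\frac{iq}{2}\Big)\,\vartheta^{(q)}_{\frac{r}{q},\frac{s}{2}}\Big(U_1,\frac{iq}{2}\Big)}{q\,\vartheta\Big(U_2^q,\frac{iq}{2}\Big)\,\vartheta\Big(U_1^q,\frac{iq}{2}\Big)}. \]
(ii) If $q$ is odd, then
\[ e = \frac{\displaystyle\sum_{r,s=0}^{2q-1} e^{\frac{\pi i r s}{q} + \pi i\lfloor\frac{r}{q}\rfloor\lfloor\frac{s}{q}\rfloor}\,\vartheta^{(2q)}_{\frac{s}{2q},0}(U_2,2iq)\,\vartheta^{(2q)}_{\frac{r}{2q},0}(U_1,2iq)}{\displaystyle q\sum_{\varepsilon_1,\varepsilon_2=0}^{1} e^{\pi i\varepsilon_1\varepsilon_2}\,\vartheta^{(2)}_{\frac{\varepsilon_2}{2},0}(U_2^q,2iq)\,\vartheta^{(2)}_{\frac{\varepsilon_1}{2},0}(U_1^q,2iq)}. \]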


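Remark. The denominators are central elements in $A_{1/q}$.

Any $b\in C^*(D^\perp,\beta) = A_{1/\alpha}$ is represented as
\[ b = \sum_{z\in D^\perp} b(z)\pi_z^* = \sum_{m_1,m_2} b_{m_1,m_2}\,\pi^*_{m_1\delta_1+m_2\delta_2} = \sum_{m_1,m_2} b_{-m_1,-m_2}\,e^{\frac{2\pi i m_1m_2}{\alpha}}\,V_1^{m_1}V_2^{m_2}. \tag{3.5} \]
For such $b$ and $n_1,n_2\in\mathbb{Z}$, we set
\[ \alpha_{n_1,n_2} = \langle fb,fb\rangle_D(n_1\sqrt{\alpha},n_2\sqrt{\alpha}). \tag{3.6} \]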

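Lemma 3.2. Let $\alpha\in(0,1)$ and $b$, $\alpha_{n_1,n_2}$ be as in (3.5) and (3.6). Then for all $n_1,n_2\in\mathbb{Z}$,
\[ \alpha_{n_1,n_2} = \frac{1}{\sqrt2}\,e^{-\frac{\pi\alpha(n_1^2+n_2^2)}{2}+\pi i n_1n_2\alpha}\sum_{m_1,\dots,m_4} b_{m_1,m_2}\,\overline{b_{m_3,m_4}}\; e^{-\frac{\pi\left((m_1-m_3)^2+(m_2-m_4)^2\right)}{2\alpha}+\frac{\pi i(m_1+m_3)(m_2-m_4)}{\alpha}-\pi i(n_1+in_2)\left(m_3-m_1+i(m_4-m_2)\right)}. \]

Proof. Using
\[ (fb)(s) = \sum_{m_1,m_2} b_{m_1,m_2}\,e^{\frac{2\pi i m_1m_2}{\alpha}}\,e^{-\pi\left(s-\frac{m_2}{\sqrt\alpha}\right)^2-\frac{2\pi i s m_1}{\sqrt\alpha}},\qquad s\in\mathbb{R}, \]
and (1.3) we get
\[ \alpha_{n_1,n_2} = \int_{\mathbb{R}} (fb)(s)\,\overline{(fb)(s+n_1\sqrt\alpha)}\,e^{-2\pi i s n_2\sqrt\alpha}\,ds \]
\[ = \sum_{m_1,\dots,m_4} b_{m_1,m_2}\,\overline{b_{m_3,m_4}}\;e^{\frac{2\pi i(m_1m_2-m_3m_4)}{\alpha}-\frac{\pi(m_2^2+m_4^2)}{\alpha}-\pi n_1^2\alpha+2\pi n_1m_4} \times \int_{\mathbb{R}} e^{-2\pi s^2-2\pi s\left(-\frac{m_2+m_4}{\sqrt\alpha}+n_1\sqrt\alpha+i\left(\frac{m_1-m_3}{\sqrt\alpha}+n_2\sqrt\alpha\right)\right)}\,ds. \]
The statement follows now through a plain computation based on (1.16).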


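For all $r,s,n_1,n_2\in\mathbb{Z}$, we set
\[ a^{(n_1,n_2)}_{r,s} = e^{-\frac{\pi(r^2+s^2)}{2\alpha}-\frac{\pi i r s}{\alpha}-\pi i(n_1+in_2)(r+is)}, \tag{3.7} \]
\[ a^{(n_1,n_2)} = \sum_{r,s} a^{(n_1,n_2)}_{r,s}\,V_1^rV_2^s. \tag{3.8} \]
Since
\[ \sum_{r,s}\big|a^{(n_1,n_2)}_{r,s}\big| = \vartheta\Big(\frac{in_1}{2},\frac{i}{2\alpha}\Big)\,\vartheta\Big(\frac{in_2}{2},\frac{i}{2\alpha}\Big) < \infty, \]
it follows that $a^{(n_1,n_2)}\in C^*(D^\perp,\beta)$. Although we will not need it, notice that since $\sigma_{D^\perp}(V_2) = V_1$ and $\sigma_{D^\perp}(V_1) = V_2^{-1}$, we get $\sigma_{D^\perp}\big(a^{(n_1,n_2)}\big) = a^{(-n_2,n_1)}$.

If $\alpha = q^{-1}$, $q\in\mathbb{Z}$, $q\ge 2$, Lemma 3.2 yields for all $n_1,n_2\in\mathbb{Z}$,
\[ \alpha_{n_1,n_2} = \frac{1}{\sqrt2}\,e^{-\frac{\pi(n_1^2+n_2^2)}{2q}+\frac{\pi i n_1n_2}{q}}\sum_{m_1,\dots,m_4} b_{m_1,m_2}\,\overline{b_{m_3,m_4}}\;a^{(n_1,n_2)}_{m_3-m_1,m_4-m_2}. \tag{3.9} \]
Moreover, for any $b$ as in (3.5),
\[ \tau_{D^\perp}\big(b\,a^{(n_1,n_2)}\,b^*\big) = \sum_{m_1,\dots,m_4}\sum_{r,s} b_{m_1,m_2}\,\overline{b_{m_3,m_4}}\;a^{(n_1,n_2)}_{r,s}\,\tau_{D^\perp}\big(V_1^{m_1+r-m_3}V_2^{m_2+s-m_4}\big) = \sum_{m_1,\dots,m_4} b_{m_1,m_2}\,\overline{b_{m_3,m_4}}\;a^{(n_1,n_2)}_{m_3-m_1,m_4-m_2}, \]
which together with (3.9) gives
\[ \alpha_{n_1,n_2} = \frac{1}{\sqrt2}\,e^{-\frac{\pi(n_1^2+n_2^2)}{2q}+\frac{\pi i n_1n_2}{q}}\,\tau_{D^\perp}\big(b\,a^{(n_1,n_2)}\,b^*\big) = \frac{1}{\sqrt2}\,e^{-\frac{\pi(n_1^2+n_2^2)}{2q}+\frac{\pi i n_1n_2}{q}}\int_{[0,1]^2} a^{(n_1,n_2)}(e^{2\pi i t_1},e^{2\pi i t_2})\,\big|b(e^{2\pi i t_1},e^{2\pi i t_2})\big|^2\,dt_1\,dt_2. \]
In the sequel we will take $b = a^{-\frac12} = \langle f,f\rangle_{D^\perp}^{-\frac12}\in C(\mathbb{T}^2)$, hence
\[ \alpha_{n_1,n_2} = \Big\langle f\langle f,f\rangle_{D^\perp}^{-\frac12},\,f\langle f,f\rangle_{D^\perp}^{-\frac12}\Big\rangle_D(n_1\sqrt\alpha,n_2\sqrt\alpha) = \frac{1}{\sqrt2}\,e^{-\frac{\pi(n_1^2+n_2^2)}{2q}+\frac{\pi i n_1n_2}{q}}\int_{[0,1]^2}\frac{a^{(n_1,n_2)}(e^{2\pi i t_1},e^{2\pi i t_2})}{a(e^{2\pi i t_1},e^{2\pi i t_2})}\,dt_1\,dt_2 \tag{3.10} \]
and
\[ e = |G/D|\sum_{n_1,n_2}\alpha_{n_1,n_2}\,U_2^{n_2}U_1^{n_1} = \frac{1}{q}\sum_{n_1,n_2}\alpha_{n_1,n_2}\,U_2^{n_2}U_1^{n_1}. \tag{3.11} \]
For simplicity we shall denote throughout this section
\[ \vartheta(z) = \vartheta\Big(z,\frac{iq}{2}\Big). \]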


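Proof of Proposition 3.1 (i). If $q$ is even, then (1.18) yields for all $t_1,t_2\in\mathbb{R}$,
\[ a(e^{2\pi i t_1},e^{2\pi i t_2}) = \frac{1}{\sqrt2}\,\vartheta(t_1)\,\vartheta(t_2). \tag{3.12} \]
We employ (3.7) and (3.8) to obtain
\[ a^{(n_1,n_2)}(e^{2\pi i t_1},e^{2\pi i t_2}) = \vartheta\Big(t_1-\frac{n_1+in_2}{2}\Big)\,\vartheta\Big(t_2-\frac{n_2+in_1}{2}\Big). \tag{3.13} \]
Writing $n_1 = ql_1+r$, $n_2 = ql_2+s$, $l_1,l_2,r,s\in\mathbb{Z}$, $0\le r,s<q$ and employing (3.10), (3.12) and (3.13) we get
\[ \alpha_{n_1,n_2} = e^{-\frac{\pi q(l_1^2+l_2^2)}{2}-\pi(l_1r+l_2s)-\frac{\pi(r^2+s^2)}{2q}+\pi i(l_1s+l_2r)+\frac{\pi i r s}{q}} \times \int_0^1\!\!\int_0^1\frac{\vartheta\big(t_1-\frac{r+is+il_2q}{2}\big)\,\vartheta\big(t_2-\frac{s+ir+il_1q}{2}\big)}{\vartheta(t_1)\,\vartheta(t_2)}\,dt_1\,dt_2. \]
Using also the quasiperiodicity relation
\[ \vartheta\Big(z+\frac{ilq}{2}\Big) = e^{\frac{\pi q l^2}{2}-2\pi i l z}\,\vartheta(z),\qquad z\in\mathbb{C},\ l\in\mathbb{Z}, \]
we get
\[ \alpha_{ql_1+r,ql_2+s} = e^{-\frac{\pi(r^2+s^2)}{2q}+\frac{\pi i r s}{q}}\int_0^1\frac{e^{2\pi i l_2t_1}\,\vartheta\big(t_1-\frac{r+is}{2}\big)}{\vartheta(t_1)}\,dt_1 \times \int_0^1\frac{e^{2\pi i l_1t_2}\,\vartheta\big(t_2-\frac{s+ir}{2}\big)}{\vartheta(t_2)}\,dt_2. \]
Furthermore, we make use of
\[ \sum_{l} e^{2\pi i l y}\int_0^1\frac{e^{2\pi i l x}\,\vartheta\big(x-\frac{r+is}{2}\big)}{\vartheta(x)}\,dx = \sum_{l}\int_0^1\frac{e^{2\pi i l x}\,\vartheta\big(x-y-\frac{r+is}{2}\big)}{\vartheta(x-y)}\,dx = \frac{\vartheta\big(y+\frac{r+is}{2}\big)}{\vartheta(y)} \]
and of the fact that $U_1^q$ and $U_2^q$ are central in $A_{1/q}$ to get
\[ \sum_{l_1,l_2}\alpha_{ql_1+r,ql_2+s}\,U_2^{ql_2}U_1^{ql_1} = e^{-\frac{\pi(r^2+s^2)}{2q}+\frac{\pi i r s}{q}}\cdot\frac{\vartheta\big(e^{\pi i(r+is)}U_2^q,\frac{iq}{2}\big)\,\vartheta\big(e^{\pi i(s+ir)}U_1^q,\frac{iq}{2}\big)}{\vartheta\big(U_2^q,\frac{iq}{2}\big)\,\vartheta\big(U_1^q,\frac{iq}{2}\big)}. \]
The equality from Proposition 3.1 (i) follows from this and from
\[ e^{-\frac{\pi s^2}{2q}+\frac{\pi i r s}{q}}\,U^s\,\vartheta\Big(e^{\pi i(r+is)}U^q,\frac{iq}{2}\Big) = \vartheta^{(q)}_{\frac{s}{q},\frac{r}{2}}\Big(U,\frac{iq}{2}\Big). \tag{3.14} \]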


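Before completing the proof of Proposition 3.1 for $q$ odd, we introduce some notation by setting for all $z\in\mathbb{C}$, $\tau\in\mathbb{H}$, $a,b\in\mathbb{R}$,
\[ \vartheta^{odd}_{a,b}(z,\tau) = \sum_{m\ \mathrm{odd}} e^{\pi i\tau(m+a)^2+2\pi i(m+a)(z+b)} = \vartheta_{a,b}(z,\tau)-\vartheta_{\frac{a}{2},2b}(2z,4\tau) = \vartheta_{\frac{a+1}{2},2b}(2z,4\tau), \]
\[ \vartheta^{even}_{a,b}(z,\tau) = \sum_{m\ \mathrm{even}} e^{\pi i\tau(m+a)^2+2\pi i(m+a)(z+b)} = \vartheta_{\frac{a}{2},2b}(2z,4\tau), \]
\[ \vartheta^{\sharp}(z,\tau) = \vartheta^{\sharp}_{0,0}(z,\tau),\qquad \sharp\in\{even,odd\}. \]
For all $z\in\mathbb{C}$, $\tau\in\mathbb{H}$, $a,b\in\mathbb{R}$, $l\in\mathbb{Z}$ we have
\[ \vartheta^{even}\Big(z+\frac12,\tau\Big) = \vartheta^{even}(z,\tau), \tag{3.15} \]
\[ \vartheta^{odd}\Big(z+\frac12,\tau\Big) = -\vartheta^{odd}(z,\tau), \tag{3.16} \]
\[ \vartheta^{odd}_{a,b}(z+l\tau,\tau) = e^{-\pi i\tau l^2-2\pi i l(z+b)}\cdot\begin{cases}\vartheta^{odd}_{a,b}(z,\tau), & \text{if } l \text{ is even},\\ \vartheta^{even}_{a,b}(z,\tau), & \text{if } l \text{ is odd}.\end{cases} \tag{3.17} \]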
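The splitting of $\vartheta_{a,b}$ into sums over odd and even $m$ and the relations (3.15), (3.16), (3.17) can be tested numerically. The following is a minimal sketch: the helper `theta_ab`, its `parity` argument, and the sample values of $a$, $b$, $z$, $\tau$ are assumptions made for illustration only.

```python
import cmath
import math


def theta_ab(a, b, z, tau, parity=None, cutoff=30):
    """Truncated series (3.4); parity=1 keeps odd m only, parity=0 keeps even m only."""
    total = 0j
    for m in range(-cutoff, cutoff + 1):
        if parity is not None and m % 2 != parity:
            continue
        k = m + a
        total += cmath.exp(1j * math.pi * (k * k * tau + 2.0 * k * (z + b)))
    return total


# Arbitrary sample parameters with Im(tau) > 0.
a, b = 0.3, 0.2
z, tau = 0.1 + 0.05j, 0.2 + 1.1j

odd = theta_ab(a, b, z, tau, parity=1)
even = theta_ab(a, b, z, tau, parity=0)
full = theta_ab(a, b, z, tau)

# theta^odd = theta - theta_{a/2,2b}(2z,4tau) = theta_{(a+1)/2,2b}(2z,4tau)
split_defect = abs(odd - (full - theta_ab(a / 2, 2 * b, 2 * z, 4 * tau)))
odd_defect = abs(odd - theta_ab((a + 1) / 2, 2 * b, 2 * z, 4 * tau))
even_defect = abs(even - theta_ab(a / 2, 2 * b, 2 * z, 4 * tau))

# (3.15) and (3.16) for a = b = 0.
half_even = abs(theta_ab(0, 0, z + 0.5, tau, parity=0) - theta_ab(0, 0, z, tau, parity=0))
half_odd = abs(theta_ab(0, 0, z + 0.5, tau, parity=1) + theta_ab(0, 0, z, tau, parity=1))


def shift_defect(l):
    """Relative defect in (3.17) for the integer shift l."""
    lhs = theta_ab(a, b, z + l * tau, tau, parity=1)
    factor = cmath.exp(-1j * math.pi * (tau * l * l + 2 * l * (z + b)))
    rhs = factor * (odd if l % 2 == 0 else even)
    return abs(lhs - rhs) / max(1.0, abs(rhs))


print(split_defect, odd_defect, even_defect, half_even, half_odd)
print([shift_defect(l) for l in (-2, -1, 1, 2, 3)])
```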

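The following lemma is easy to prove.

Lemma 3.3. Let $\phi:\mathbb{R}^2\to\mathbb{C}$ be a continuous function which is periodic modulo $\mathbb{Z}^2$. Then
\[ \sum_{l_1,l_2\ \mathrm{even}}\int_{[0,1]^2} e^{2\pi i(l_1t_1+l_2t_2)}\phi(t_1,t_2)\,dt_1\,dt_2 = \frac14\Big(\phi(0,0)+\phi\big(\tfrac12,0\big)+\phi\big(0,\tfrac12\big)+\phi\big(\tfrac12,\tfrac12\big)\Big), \]
\[ \sum_{l_1\ \mathrm{even},\,l_2\ \mathrm{odd}}\int_{[0,1]^2} e^{2\pi i(l_1t_1+l_2t_2)}\phi(t_1,t_2)\,dt_1\,dt_2 = \frac14\Big(\phi(0,0)+\phi\big(\tfrac12,0\big)-\phi\big(0,\tfrac12\big)-\phi\big(\tfrac12,\tfrac12\big)\Big), \]
\[ \sum_{l_1\ \mathrm{odd},\,l_2\ \mathrm{even}}\int_{[0,1]^2} e^{2\pi i(l_1t_1+l_2t_2)}\phi(t_1,t_2)\,dt_1\,dt_2 = \frac14\Big(\phi(0,0)-\phi\big(\tfrac12,0\big)+\phi\big(0,\tfrac12\big)-\phi\big(\tfrac12,\tfrac12\big)\Big), \]
\[ \sum_{l_1,l_2\ \mathrm{odd}}\int_{[0,1]^2} e^{2\pi i(l_1t_1+l_2t_2)}\phi(t_1,t_2)\,dt_1\,dt_2 = \frac14\Big(\phi(0,0)-\phi\big(\tfrac12,0\big)-\phi\big(0,\tfrac12\big)+\phi\big(\tfrac12,\tfrac12\big)\Big). \]

Proof. We consider for instance the second equality. The others are similar. We set $l_1 = 2m_1$, $l_2 = 2m_2+1$, $x_1 = 2t_1$, $x_2 = 2t_2$, and divide $[0,2]^2$ into four equal squares to get:
\[ S_{01} = \sum_{l_1\ \mathrm{even},\,l_2\ \mathrm{odd}}\int_{[0,1]^2} e^{2\pi i(l_1t_1+l_2t_2)}\phi(t_1,t_2)\,dt_1\,dt_2 = \frac14\sum_{m_1,m_2}\int_{[0,2]^2} e^{2\pi i(m_1x_1+m_2x_2)}\,e^{\pi i x_2}\,\phi\Big(\frac{x_1}{2},\frac{x_2}{2}\Big)\,dx_1\,dx_2 = \frac14\sum_{m_1,m_2}\int_{[0,1]^2} e^{2\pi i(m_1x_1+m_2x_2)}\,\psi(x_1,x_2)\,dx_1\,dx_2, \]
where
\[ \psi(x_1,x_2) = e^{\pi i x_2}\phi\Big(\frac{x_1}{2},\frac{x_2}{2}\Big)+e^{\pi i x_2}\phi\Big(\frac{x_1+1}{2},\frac{x_2}{2}\Big)-e^{\pi i x_2}\phi\Big(\frac{x_1}{2},\frac{x_2+1}{2}\Big)-e^{\pi i x_2}\phi\Big(\frac{x_1+1}{2},\frac{x_2+1}{2}\Big). \]
Since $\psi(x_1+1,x_2) = \psi(x_1,x_2) = \psi(x_1,x_2+1)$, $x_1,x_2\in\mathbb{R}$, we get
\[ S_{01} = \frac14\,\psi(0,0) = \frac14\Big(\phi(0,0)+\phi\big(\tfrac12,0\big)-\phi\big(0,\tfrac12\big)-\phi\big(\tfrac12,\tfrac12\big)\Big). \]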
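Lemma 3.3 can be illustrated on a trigonometric polynomial, for which only finitely many of the integrals are nonzero. The block below is a minimal sketch and not part of the proof: the coefficient table `COEFFS` is a hypothetical test function, and the integrals are evaluated by a uniform Riemann sum, which is exact (up to rounding) for the low frequencies involved.

```python
import cmath
import math

# Hypothetical test function: phi(t1, t2) = sum of c * exp(2 pi i (a t1 + b t2)).
COEFFS = {
    (0, 0): 1.0,
    (1, 0): 0.5 - 0.2j,
    (0, 1): 0.3j,
    (2, -1): 0.25,
    (-3, 2): -0.1 + 0.4j,
    (1, 1): 0.7,
}


def phi(t1, t2):
    return sum(c * cmath.exp(2j * math.pi * (a * t1 + b * t2))
               for (a, b), c in COEFFS.items())


def fourier_integral(l1, l2, grid=16):
    """Integral over [0,1]^2 of exp(2 pi i (l1 t1 + l2 t2)) * phi(t1, t2)."""
    total = 0j
    for j1 in range(grid):
        for j2 in range(grid):
            t1, t2 = j1 / grid, j2 / grid
            total += cmath.exp(2j * math.pi * (l1 * t1 + l2 * t2)) * phi(t1, t2)
    return total / grid ** 2


def parity_sum(p1, p2, bound=4):
    """Sum of the integrals over l1 = p1 mod 2, l2 = p2 mod 2 (zero beyond |l| = 3 here)."""
    return sum(fourier_integral(l1, l2)
               for l1 in range(-bound, bound + 1)
               for l2 in range(-bound, bound + 1)
               if l1 % 2 == p1 and l2 % 2 == p2)


def half_lattice_average(p1, p2):
    """Right-hand side of Lemma 3.3 for the parity class (p1, p2)."""
    s1, s2 = (-1) ** p1, (-1) ** p2
    return 0.25 * (phi(0, 0) + s1 * phi(0.5, 0) + s2 * phi(0, 0.5)
                   + s1 * s2 * phi(0.5, 0.5))


defects = {(p1, p2): abs(parity_sum(p1, p2) - half_lattice_average(p1, p2))
           for p1 in (0, 1) for p2 in (0, 1)}
print(defects)
```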

For all a, b ∈ R, z ∈ C, we set iq iq , ϑa,b (z) = ϑa,b z, , ∈ {even, odd}, ϑa,b (z) = ϑa,b z, 2 2 and consider the following continuous functions on R2 /Z2 : odd ϑ t1 + r+is ϑ t2 + s+ir − 2ϑodd t1 + r+is ϑ t2 + s+ir 0,0 2 2 2 2 , φr,s (t1 , t2 ) = ϑ(t1 ) ϑ(t2 ) − 2ϑodd (t1 ) ϑodd (t2 ) ϑ t1 + r+is ϑ0, 21 t2 + s+ir + 2ϑeven t1 + r+is ϑodd t2 + s+ir 2 2 2 2 0,1 , φr,s (t1 , t2 ) = ϑ(t1 ) ϑ(t2 ) − 2ϑodd (t1 ) ϑodd (t2 ) ϑ0, 21 t1 + r+is ϑ t2 + s+ir + 2ϑodd t1 + r+is ϑeven t2 + s+ir 2 2 2 2 1,0 (t1 , t2 ) = , φr,s ϑ(t1 ) ϑ(t2 ) − 2ϑodd (t1 ) ϑodd (t2 ) ϑ0, 21 t1 + r+is ϑ0, 21 t2 + s+ir − 2ϑeven t1 + r+is ϑeven t2 + s+ir 2 2 2 2 1,1 . φr,s (t1 , t2 ) = ϑ(t1 ) ϑ(t2 ) − 2ϑodd (t1 ) ϑodd (t2 ) P αql1 +r,ql2 +s U2ql2 U1ql1 . This Proof of Proposition 3.1 (ii). Our first aim is to compute l1 ,l2

is split into the following four sums: X αql1 +r,ql2 +s e2πi(l1 x1 +l2 x2 ) , S00 =

S01 =

l1 ,l2 even

S10 =

X

l1 odd,l2 even

X

αql1 +r,ql2 +s e2πi(l1 x1 +l2 x2 ) ,

l1 even,l2 odd 2πi(l1 x1 +l2 x2 )

αql1 +r,ql2 +s e

, S11 =

X

l1 ,l2 odd

αql1 +r,ql2 +s e2πi(l1 x1 +l2 x2 ) .


From (1.18) we get for all t1 , t2 ∈ R 1 a(e2πit1 , e2πit2 ) = √ ϑ(t1 ) ϑ(t2 ) − 2 ϑodd (t1 ) ϑodd (t2 ) > 0, 2 whilst from (3.7) and (3.8)

(3.18)

n1 + in2 n2 + in1 ϑ t2 − ϑ t1 − 2 2 n n + in 1 2 2 + in1 ϑodd t2 − . (3.19) − 2ϑodd t1 − 2 2

1 a(n1 ,n2 ) (e2πit1 , e2πit2 ) = √ 2

To compute S00 , we notice that the quasi-periodicity properties (3.15), (3.16), (3.17) and

πql2 ilq = e 2 −2πilz ϑ(z), z ∈ C, l ∈ Z, ϑ z+ 2

(3.20)

yield for all t1 , t2 ∈ R, n1 , n2 ∈ Z, n1 = ql1 +r, n2 = ql2 +s, l1 , l2 , r, s ∈ Z, 0 ≤ r, s < q, 1 π(l21 +l22 )q (3.21) a(n1 ,n2 ) (e2πit1 , e2πit2 ) = √ e 2 −2πi(l2 t1 +l1 t2 )−πi(l2 r+l1 s)+π(l2 s+l1 r) 2 r + is s + ir r + is s + ir odd odd ϑ t2 − −2ϑ ϑ . × ϑ t1 − t1 − t2 − 2 2 2 2 0,0 We combine (3.10), (3.18), (3.21) and the definition of φr,s to get Z π(r 2 +s2 ) πirs 0,0 e2πi(l2 t1 +l1 t2 ) φr,s (−t1 , −t2 ) dt1 dt2 . αn1 ,n2 = e− 2q + q [0,1]2

Using Lemma 3.3 we gather S00 = e−

0,0 e2πil2 (t1 +x2 )+2πil1 (t2 +x1 ) φr,s (−t1 , −t2 ) dt1 dt2

l1 ,l2 even

= e−

π(r 2 +s2 ) 2q

X

+ πirs q

l1 ,l2 even

=

1 − e 4

π(r 2 +s2 ) 2q

+ πirs q

Z

X

π(r 2 +s2 ) πirs + q 2q

Z

[0,1]2

[0,1]2

0,0 e2πi(l1 t1 +l2 t2 ) φr,s (x2 − t2 , x1 − t1 ) dt1 dt2

1 1 0,0 0,0 0,0 + φr,s (x2 , x1 ) + φr,s φr,s x2 , x1 − x2 − , x1 2 2 1 1 0,0 . x2 − , x1 − + φr,s 2 2

(3.22)

Similar computations yield 1 − π(r2q2 +s2 ) + πirs 1 1 0,1 0,1 0,1 q − φr,s x2 − , x1 S01 = e φr,s (x2 , x1 ) + φr,s x2 , x1 − 4 2 2 1 1 0,1 (3.23) , x2 − , x1 − − φr,s 2 2


S10

1 π(r2 +s2 ) πirs = e− 2q + q 4

S11 =

1 − π(r2q2 +s2 ) + πirs q e 4

1,0 (x2 , x1 ) φr,s

−

1,0 φr,s

1 1 1,0 + φr,s x2 − , x1 x2 , x1 − 2 2 1 1 1,0 , (3.24) x2 − , x1 − + φr,s 2 2

1 1 1,1 1,1 1,1 − φr,s φr,s x2 , x1 − x2 − , x1 (x2 , x1 ) − φr,s 2 2 1 1 1,1 . (3.25) x2 − , x1 − + φr,s 2 2

Immediate computations based on the obvious equalities ϑ(z, τ ) = ϑeven (z, τ ) + ϑodd (z, τ )

ϑ0, 21 (z, τ ) = ϑeven (z, τ ) − ϑodd (z, τ )

and

0,0 0,1 1,0 1,1 = φr,s = φr,s = −φr,s = φr,s . Actually, for all r, s ∈ Z, t1 , t2 ∈ R, show that φr,s odd ϑ t1 + r+is ϑ t2 + s+ir − 2ϑodd t1 + r+is ϑ t2 + s+ir 2 2 2 2 . φr,s (t1 , t2 ) = ϑ(t1 ) ϑ(t2 ) − 2ϑodd (t1 ) ϑodd (t2 ) (3.26)

From (3.22)–(3.25), X αql1 +r,ql2 +s e2πi(l1 x1 +l2 x2 ) = S00 + S01 + S10 + S11 l1 ,l2

1 X 1 − π(r2q2 +s2 ) + πirs δ2 δ1 δ1 δ2 q . (3.27) = e (−1) φr,s x2 − , x1 − 2 2 2 δ1 ,δ2 =0

We identify X

U1q

with e2πix1 and

U2q

with e2πix2 to obtain

1 − π(r2q2 +s2 ) + πirs q e 2 l1 ,l2 −1 q iq q iq q iq q iq odd odd ϑ U1 , −2ϑ ϑ × ϑ U2 , U2 , U1 , 2 2 2 2 (3.28) 1 X δ1 δ2 πi(r+δ2 +is) q iq πi(s+δ1 +ir) q iq ϑ e ϑ e × (−1) U2 , U1 , 2 2 δ1 ,δ2 =0 iq iq ϑodd eπi(s+δ1 +ir) U1q , , − 2ϑodd eπi(r+δ2 +is) U2q , 2 2 αql1 +r,ql2 +s U2ql2 U1ql1 =

where we denote for any unitary U and τ ∈ H, q ∈ N∗ , a ∈ Z, b ∈ R, X 2 ϑodd (U, τ ) = eπiτ m U m = ϑ(2) (U, 4τ ), 1 ,0 2

m odd

ϑeven (U, τ ) =

X

m even

and

eπiτ m U m = ϑ(2) 0,0 (U, 4τ ) 2


X

odd ϑ(q) a ,b (U, τ ) = q

m+ a q

eπiτ

2

X

q

+2πi m+ a q b

U qm+a = ϑ(2q) (U, 4τ ), a 1 + ,2b 2q

m odd

even (U, τ ) = ϑ(q) a ,b


eπiτ

m+ a q

2

+2πi m+ a q b

2

U qm+a = ϑ(2q) a ,2b (U, 4τ ). 2q

m even

As ϑ(U, τ ) = ϑodd (U, τ ) + ϑeven (U, τ ), we get q iq q iq q iq q iq odd odd ϑ U1 , − 2ϑ ϑ U2 , U1 , ϑ U2 , 2 2 2 2 =

1 X

(−1)ε1 ε2 ϑ(2) (U2q , 2iq) ϑ(2) (U1q , 2iq). (3.29) ε2 ε1 ,0 ,0 2

ε1 ,ε2 =0

2

For any unitary U and n, s ∈ Z, 0 ≤ s < q πs2 iq iq even = ϑ(q) = ϑ(2q) U, (U, 2iq), e− 2q U s ϑeven eπi(n+is) U q , s s q ,0 2q ,0 2 2 πs2 iq iq odd = eπin ϑ(q) = eπin ϑ(2q) U, (U, 2iq), e− 2q U s ϑodd eπi(n+is) U q , s s+q q ,0 2q ,0 2 2 showing that e

2

2)

− π(r2q+s

U2s

πi(r+δ2 +is) q iq πi(s+δ1 +ir) q iq ϑ e U2 , U1 , ϑ e 2 2 iq odd πi(r+δ2 +is) q odd πi(s+δ1 +ir) q iq ϑ U1r U2 , U1 , −2 ϑ e e 2 2 1 X

=

πiε1 ε2 +πiε2 (r+δ2 )+πiε1 (s+δ1 ) (2q)

(2q)

2q

2q

e

ϑ s+qε2 ,0 (U2 , 2iq)ϑ r+qε1 ,0 (U1 , 2iq).

ε1 ,ε2 =0

1 P

We make use of (3.11), (3.28), (3.29), (3.30) and

δ1 ,δ2 =0

ε1 , ε2 = 0, 1 to get q−1 X 1 X

e=

e

(3.30)

πi(r+qε1 )(s+qε2 ) +πiε1 ε2 q

ϑ(2q) (U2 , 2iq)ϑ(2q) (U1 , 2iq) s+qε2 r+qε1 ,0 ,0 2q

r,s=0 ε1 ,ε2 =0 1 X ε1 ,ε2 =0

eπi(δ1 δ2 +ε1 ε2 +ε2 δ2 +ε1 δ1 ) = 2,

2q

, q q (2) 2 ,0 (U2 , 2iq)ϑ ε1 ,0 (U1 , 2iq)

eπiε1 ε2 ϑ(2) ε 2

which implies the formula in Proposition 3.1 (ii).

2

We continue with some considerations on the case α = 2−1 . According to Proposition 3.1, the formula   (2) (2) (2) (2) (2) (2) 1  ϑ0, 21 (U2 , i)ϑ 21 ,0 (U1 , i) ϑ 21 ,0 (U2 , i)ϑ0, 21 (U1 , i) iϑ 21 , 21 (U2 , i)ϑ 21 , 21 (U1 , i)  + − e = 1+ 2 ϑ(U22 , i)ϑ(U12 , i) ϑ(U22 , i)ϑ(U12 , i) ϑ(U22 , i)ϑ(U12 , i)


defines a projection of trace 21 in A 21 . We shall simply denote e = e1 = (+ + + −). We consider the automorphisms ρt1 ,t2 of A 21 acting on the generators U1 and U2 by ρt1 ,t2 (Uj ) = e2πitj Uj , j = 1, 2, t1 , t2 ∈ R and ρ acting by ρ(U1 ) = −U2 , ρ(U2 ) = −U1 . The relations (−U, τ ) = −ϑ(2) (U, τ ), ϑ(2) 1 1 ,b ,b 2

2

(2) (2) (2) b ∈ R , τ ∈ H, ϑ(2) 1 1 (U1 , τ ) ϑ 1 1 (U2 , τ ) = −ϑ 1 1 (U2 , τ ) ϑ 1 1 (U1 , τ ), , , , , 2 2

and the fact that

2 2

ϑ(U12 , i)

2 2

and

e1 = (+ + + −),

ϑ(U22 , i)

2 2

are central in A 21 show that

e2 = ρ 21 ,0 (e1 ) = (+ − + +), e3 = ρ0, 21 (e1 ) = (+ + − +),

e4 = ρ 21 , 21 (e1 ) = (+ − − −), e5 = ρ(e1 ) = (+ − − +). In particular ej , 1 ≤ j ≤ 5 are projections of trace

in A 21 such that

1 2

e1 + e5 = e + ρ(e) = 1, 1 X

e1 + e2 + e3 + e4 =

c1 ,c2 =0

ρ c1 , c2 (e) = 2. 2

2

(3.31) (3.32)

˜ 1 ) = U2 , ρ(U ˜ 2 ) = U1 , We also notice that if ρ˜ is the automorphism of A 21 defined by ρ(U then ρ(e ˜ 1 ) = (+ + + +). Actually, the analogue of (3.32) holds for any α = q −1 , q ∈ N∗ , q ≥ 2. To see this, we notice first that for all s, c ∈ Z, b ∈ R, 2πic 2πisc q U, τ = e q ϑ(q) ϑ(q) s s ,b e ,b (U, τ ), q

so if we write e =

q−1 P r,s=0

e−

πirs q

q

Ar,s as in Proposition 3.1, then q−1 2πisc2 2πirc1 1 X − πirs e q + q + q Ar,s q r,s=0

ρ c1 , c2 (e) = q

q

and furthermore q−1 q−1 q−1 q−1 X X 2πirc1 X 2πisc2 1 X − πirs ρ c1 , c2 (e) = e q e q e q Ar,s = qA0,0 = q. (3.33) q q q r,s=0 c ,c =0 c =0 c =0 1

2

1

2

Let π : A q1 → Mq (C), π(Uj ) = U˜ j , j = 1, 2 be the canonical finite dimensional representation of A 1 (i.e. U˜ q = U˜ q = Iq ). We consider the finite dimensional representa1

q

2

tions πt1 ,t2 = πρt1 ,t2 of A q1 and get πt1 ,t2 (Ujqm+s ) = e2πi(qm+s)tj U˜ js , j = 1, 2, 0 ≤ s < q. Furthermore, we obtain 2 X πiτ m+ qs +2πi m+ qs r2 2πi(qm+s)t2 ˜ s U2 = ϑ qs , r2 (qt2 , τ ) U˜ 2s , = e e πt1 ,t2 ϑ(q) s r (U2 , τ ) , q

πt1 ,t2

2

ϑ(q) r s (U1 , τ ) q,2

m

= ϑ rq , s2 (qt1 , τ ) U˜ 1r ,

πt1 ,t2 ϑ(Ujq , τ ) = ϑ(qtj , τ ) · Iq , j = 1, 2.


Therefore, for any non-negative even integer q, −1 −1 iq ϑ qt1 , πt1 ,t2 (e) = q 2 q−1 X iq iq ˜ s ˜ r − πirs q s r r s ϑ q , 2 qt1 , U2 U1 . · e ϑ q , 2 qt2 , 2 2 r,s=0 −1

iq ϑ qt2 , 2

(3.34)

We compute τ πt1 ,t2 (e)2 using (3.34)and obtain τ πt1 ,t2 (e) = q −1 . Therefore, for all t1 , t2 ∈ R, q−1 X

− 4πimn q

e

ϑ

m,n=0

m −n q ,− 2

iq iq iq m n m n ϑ ϑ− q ,− 2 t2 , ϑ q , 2 t2 , t1 , 2 2 2 2 2 (3.35) iq iq ϑ t2 , . = qϑ t1 , 2 2

iq t1 , 2

n m q, 2

Using ϑ−a,−b (z, τ ) = ϑa,b (−z, τ ), it follows that for any even integer q ∈ N∗ and any t1 , t2 ∈ R, q−1 X m,n=0

− 4πimn q

e

ϑ nq , m2

iq iq iq ϑ nq , m2 − t1 , ϑ mq , n2 t2 , ϑ mq , n2 − t2 , 2 2 2 2 2 (3.36) iq iq ϑ t2 , . = qϑ t1 , 2 2

iq t1 , 2

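Actually (3.36) can be regarded as a sort of Riemann theta relation. To see this, we specialize further to $q = 2$ and denote as in [13] $\vartheta_{00} = \vartheta$, $\vartheta_{10} = \vartheta_{\frac12,0}$, $\vartheta_{01} = \vartheta_{0,\frac12}$, $\vartheta_{11} = \vartheta_{\frac12,\frac12}$. Taking $q = 2$ in (3.36) and using the obvious equalities $\vartheta_{00}(-z,\tau) = \vartheta_{00}(z,\tau)$, $\vartheta_{10}(-z,\tau) = \vartheta_{10}(z,\tau)$, $\vartheta_{01}(-z,\tau) = \vartheta_{01}(z,\tau)$, $\vartheta_{11}(-z,\tau) = -\vartheta_{11}(z,\tau)$, we obtain for all $x,u\in\mathbb{R}$,
\[ \vartheta_{01}(x,i)^2\vartheta_{10}(u,i)^2+\vartheta_{10}(x,i)^2\vartheta_{01}(u,i)^2+\vartheta_{11}(x,i)^2\vartheta_{11}(u,i)^2 = \vartheta_{00}(x,i)^2\vartheta_{00}(u,i)^2. \tag{3.37} \]

Remark. Identity (3.37) should be compared with Riemann's theta formulae, for example with formula $(A_1)$ in [13, p. 21]
\[ \vartheta_{00}(x,\tau)^2\vartheta_{00}(u,\tau)^2+\vartheta_{11}(x,\tau)^2\vartheta_{11}(u,\tau)^2 = \vartheta_{01}(x,\tau)^2\vartheta_{01}(u,\tau)^2+\vartheta_{10}(x,\tau)^2\vartheta_{10}(u,\tau)^2 = \vartheta_{00}(x+u,\tau)\,\vartheta_{00}(x-u,\tau)\,\vartheta_{00}(0,\tau)^2. \tag{$A_1$} \]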
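Identity (3.37) lends itself to a direct numerical test at $\tau = i$. The block below is a minimal sketch: `theta_ab` truncates the series (3.4), and the sample points and function names are arbitrary illustrative choices.

```python
import cmath
import math


def theta_ab(a, b, z, tau, cutoff=20):
    """Truncated series for theta_{a,b}(z, tau) as in (3.4)."""
    total = 0j
    for m in range(-cutoff, cutoff + 1):
        k = m + a
        total += cmath.exp(1j * math.pi * (k * k * tau + 2.0 * k * (z + b)))
    return total


def squared_thetas(w, tau):
    """Squares of theta_00, theta_01, theta_10, theta_11 at the point w."""
    t00 = theta_ab(0.0, 0.0, w, tau)
    t01 = theta_ab(0.0, 0.5, w, tau)
    t10 = theta_ab(0.5, 0.0, w, tau)
    t11 = theta_ab(0.5, 0.5, w, tau)
    return t00 ** 2, t01 ** 2, t10 ** 2, t11 ** 2


def defect_337(x, u, tau=1j):
    """Absolute difference between the two sides of (3.37)."""
    x00, x01, x10, x11 = squared_thetas(x, tau)
    u00, u01, u10, u11 = squared_thetas(u, tau)
    return abs(x01 * u10 + x10 * u01 + x11 * u11 - x00 * u00)


points = [(0.13, 0.41), (0.5, 0.25), (-0.3, 0.77), (0.0, 0.0)]
defects_337 = [defect_337(x, u) for x, u in points]
print(defects_337)
```

Unlike $(A_1)$, which holds for every $\tau\in\mathbb{H}$, the check above is only meaningful at $\tau = i$, where $\vartheta_{01}(0,i) = \vartheta_{10}(0,i)$.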


A. Appendix

Let $\alpha\in[0,1)$, $\rho = e^{2\pi i\alpha}$, $A_\alpha = C^*(u,v\,;\,uv = \rho vu)$ and $\sigma$ be the order four automorphism of $A_\alpha$ defined by $\sigma(u) = v$, $\sigma(v) = u^{-1}$. Denote by $E$ the conditional expectation from $A_\alpha$ onto $A_\alpha^\sigma$ defined by
\[ E(x) = \frac14\big(x+\sigma(x)+\sigma^2(x)+\sigma^3(x)\big),\qquad x\in A_\alpha. \]
For any $n,m\in\mathbb{Z}$, set
\[ [n,m] = \rho^{-\frac{nm}{2}}\big(u^nv^m+u^{-n}v^{-m}\big) = 2\rho^{-\frac{nm}{2}}\,\frac{u^nv^m+\sigma^2(u^nv^m)}{2}, \]
\[ \{n,m\} = 4\rho^{-\frac{nm}{2}}E(u^nv^m) = \rho^{-\frac{nm}{2}}\big(u^nv^m+u^{-n}v^{-m}+v^nu^{-m}+v^{-n}u^m\big) = \rho^{-\frac{nm}{2}}\big(u^nv^m+u^{-n}v^{-m}\big)+\rho^{\frac{nm}{2}}\big(u^{-m}v^n+u^mv^{-n}\big) = [n,m]+[-m,n]. \]
The following properties of $[n,m]$ are easy to check ([3]):
\[ [n,m]^* = [n,m] = [-n,-m], \tag{A.1} \]
\[ [n,m]\,[k,l] = \rho^{\frac{nl-mk}{2}}[n+k,m+l]+\rho^{\frac{mk-nl}{2}}[n-k,m-l]. \tag{A.2} \]
Moreover, using (A.1) and (A.2) it is plain to check that for all $n,m,k,l\in\mathbb{Z}$,
\[ \{n,m\}^* = \{n,m\} = \{-m,n\} = \{-n,-m\} = \{m,-n\}, \tag{A.3} \]
\[ \{n,m\}\{k,l\} = \rho^{\frac{nl-mk}{2}}\{n+k,m+l\}+\rho^{\frac{mk-nl}{2}}\{n-k,m-l\}+\rho^{\frac{nk+ml}{2}}\{n-l,m+k\}+\rho^{\frac{-nk-ml}{2}}\{n+l,m-k\}. \tag{A.4} \]

Proposition A.1. If $\alpha\neq 0$ and $\alpha\neq\frac12$, then $A_\alpha^\sigma = C^*\big(\{n,0\}\,;\,n\ge 0\big)$.

Proof. The linear span of $(u^nv^m)_{n,m\in\mathbb{Z}}$ is dense in $A_\alpha$ and the conditional expectation $E$ is continuous, hence $\mathrm{span}\,\big(\{n,m\}\big)_{n,m\in\mathbb{Z}}$ is dense in $A_\alpha^\sigma$. Denote by $B$ the $C^*$-algebra generated by $\big(\{n,0\}\big)_{n\in\mathbb{N}}$. According to (A.3), it suffices to prove that for all $n,m\in\mathbb{N}$,
\[ \{n,m\}\in B. \tag{A.5} \]
According to (A.4) and (A.3)
\[ \{n,0\}\{k,0\} = \{n+k,0\}+\{n-k,0\}+\rho^{\frac{nk}{2}}\{n,k\}+\rho^{-\frac{nk}{2}}\{k,n\}, \tag{A.6} \]
whence $\{n,k\}\in B$ if and only if $\{k,n\}\in B$. The previous equality gives also for all $n,k\in\mathbb{Z}$,
\[ \rho^{\frac{nk}{2}}\{n,k\}+\rho^{-\frac{nk}{2}}\{k,n\}\in B. \tag{A.7} \]
In particular $\{1,1\}\in B$ and using again (A.4) and (A.3) we get for all $n\in\mathbb{Z}$,
\[ \{n,0\}\{1,1\} = \rho^{\frac{n}{2}}\{n+1,1\}+\rho^{-\frac{n}{2}}\{1,n-1\}+\rho^{\frac{n}{2}}\{n-1,1\}+\rho^{-\frac{n}{2}}\{1,n+1\}. \tag{A.8} \]


Let $m\ge 1$ and assume that $\{k,1\}\in B$ (hence also $\{1,k\}\in B$) for all $0\le k\le m$. By (A.8)
\[ \rho^{\frac{m}{2}}\{m+1,1\}+\rho^{-\frac{m}{2}}\{1,m+1\}\in B. \tag{A.9} \]
Taking $k = 1$, $n = m+1$ in (A.7) and using (A.9) and
\[ \begin{vmatrix}\rho^{\frac{m+1}{2}} & \rho^{-\frac{m+1}{2}}\\[2pt] \rho^{\frac{m}{2}} & \rho^{-\frac{m}{2}}\end{vmatrix} = \rho^{\frac12}-\rho^{-\frac12}\neq 0, \]
we conclude that $\{m+1,1\}\in B$, hence $\{n,1\}\in B$ and $\{1,n\}\in B$ for all $n\ge 0$.

Finally, we prove (A.5) by induction for $n,m\ge 0$. Assume that for some $k\ge 1$ we have
\[ \{n,m\}\in B\ \text{ for all } (n,m)\in\big([0,\infty)\times[0,k]\big)\cup\big([0,k]\times[0,\infty)\big). \tag{A.10} \]
To conclude, it will suffice, according to (A.7), to prove that for all $n\ge 0$, $\{n,k+1\}\in B$. This holds for $n = 0,1,\dots,k$. By (A.10) and
\[ \{k,k\}\{1,1\} = \{k+1,k+1\}+\{k-1,k-1\}+\rho^{k}\{k-1,k+1\}+\rho^{-k}\{k+1,k-1\}\in B, \]
\[ \{k+1,k\}\{1,1\} = \rho^{\frac12}\{k+2,k+1\}+\rho^{-\frac12}\{k,k-1\}+\rho^{k+\frac12}\{k,k+1\}+\rho^{-k-\frac12}\{k+2,k-1\}\in B, \]
\[ \{k+2,k\}\{1,1\} = \rho\{k+3,k+1\}+\rho^{-1}\{k+1,k-1\}+\rho^{k+1}\{k+1,k+1\}+\rho^{-k-1}\{k+3,k-1\}\in B, \]
etc., we get $\{k+1,k+1\},\{k+2,k+1\},\{k+3,k+1\},\dots\in B$.
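The product rules (A.2) and (A.4), on which the proof above rests, can be verified mechanically by computing with normally ordered monomials $u^nv^m$. The block below is a minimal sketch under stated assumptions: elements are encoded as dictionaries from exponent pairs to coefficients, `ALPHA` is an arbitrary sample value, $\rho^{x}$ is taken to be $e^{2\pi i\alpha x}$ for half-integer $x$, and all function names are illustrative.

```python
import cmath
import math
from collections import defaultdict
from itertools import product

ALPHA = 0.3141  # hypothetical sample value of alpha


def rho_pow(x):
    """rho^x with rho = exp(2 pi i alpha), using the branch exp(2 pi i alpha x)."""
    return cmath.exp(2j * math.pi * ALPHA * x)


def mul(a, b):
    """Product of dict-encoded elements; uses v^m u^k = rho^(-mk) u^k v^m."""
    out = defaultdict(complex)
    for (n, m), x in a.items():
        for (k, l), y in b.items():
            out[(n + k, m + l)] += x * y * rho_pow(-m * k)
    return dict(out)


def combo(*weighted):
    """Linear combination of (coefficient, element) pairs."""
    out = defaultdict(complex)
    for c, el in weighted:
        for key, x in el.items():
            out[key] += c * x
    return dict(out)


def sq(n, m):
    """[n, m] = rho^(-nm/2) (u^n v^m + u^-n v^-m)."""
    c = rho_pow(-n * m / 2)
    return combo((c, {(n, m): 1.0}), (c, {(-n, -m): 1.0}))


def cu(n, m):
    """{n, m} = [n, m] + [-m, n]."""
    return combo((1.0, sq(n, m)), (1.0, sq(-m, n)))


def dist(a, b):
    return max(abs(a.get(k, 0) - b.get(k, 0)) for k in set(a) | set(b))


def defect_A2(n, m, k, l):
    rhs = combo((rho_pow((n * l - m * k) / 2), sq(n + k, m + l)),
                (rho_pow((m * k - n * l) / 2), sq(n - k, m - l)))
    return dist(mul(sq(n, m), sq(k, l)), rhs)


def defect_A4(n, m, k, l):
    rhs = combo((rho_pow((n * l - m * k) / 2), cu(n + k, m + l)),
                (rho_pow((m * k - n * l) / 2), cu(n - k, m - l)),
                (rho_pow((n * k + m * l) / 2), cu(n - l, m + k)),
                (rho_pow(-(n * k + m * l) / 2), cu(n + l, m - k)))
    return dist(mul(cu(n, m), cu(k, l)), rhs)


samples = list(product(range(-2, 3), repeat=4))
worst_A2 = max(defect_A2(*s) for s in samples)
worst_A4 = max(defect_A4(*s) for s in samples)
print(worst_A2, worst_A4)
```

This only checks the algebraic identities on a finite set of indices; it says nothing about the density and generation statements, which are analytic.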

Corollary A.2. If $\alpha\notin\mathbb{Q}$, then
\[ A_\alpha^\sigma = C^*\big(\{1,0\},\{2,0\}\big) = C^*\big(u+v+u^{-1}+v^{-1},\ u^2+v^2+u^{-2}+v^{-2}\big). \]

Proof. Denote by $C$ the ∗-algebra generated by $\{1,0\}$ and $\{2,0\}$. According to the previous proposition it suffices to show that $\{n,0\}\in C$ for all $n\ge 3$. Firstly, notice that
\[ \{1,0\}^2 = \{2,0\}+\{0,0\}+\big(\rho^{\frac12}+\rho^{-\frac12}\big)\{1,1\}\in C \]
yields $\{1,1\}\in C$. Then, prove by induction that for all $n\ge 2$,
\[ \{1,0\},\dots,\{n,0\},\{1,1\},\dots,\{n-1,1\}\in C. \tag{A.11} \]
Assume that (A.11) holds for some $n\ge 2$. By (A.6),
\[ \rho^{\frac{n-2}{2}}\{n-2,1\}+\rho^{-\frac{n-2}{2}}\{1,n-2\} = \{n-2,0\}\{1,0\}-\{n-1,0\}-\{n-3,0\}\in C, \]
hence $\{1,n-2\} = \{2-n,1\}\in C$. By (A.8) and the induction hypotheses
\[ \rho^{\frac{n-1}{2}}\{n,1\}+\rho^{-\frac{n-1}{2}}\{1,n\} = \{n-1,0\}\{1,1\}-\rho^{-\frac{n-1}{2}}\{1,n-2\}-\rho^{\frac{n-1}{2}}\{n-2,1\}\in C. \tag{A.12} \]
On the other hand
\[ \{1,1\}\{n-1,0\} = \rho^{-\frac{n-1}{2}}\{n,1\}+\rho^{\frac{n-1}{2}}\{1,n\}+\rho^{\frac{n-1}{2}}\{2-n,1\}+\rho^{-\frac{n-1}{2}}\{n-2,1\}\in C, \]
whence
\[ \rho^{-\frac{n-1}{2}}\{n,1\}+\rho^{\frac{n-1}{2}}\{1,n\}\in C. \tag{A.13} \]
Since $\rho^m\neq 1$ for $m\in\mathbb{Z}\setminus\{0\}$, (A.12) and (A.13) show that $\{n,1\},\{1,n\}\in C$. Furthermore, using also
\[ \{n,0\}\{1,0\} = \{n+1,0\}+\{n-1,0\}+\rho^{\frac{n}{2}}\{n,1\}+\rho^{-\frac{n}{2}}\{1,n\}\in C, \]
we conclude that $\{n+1,0\}\in C$. Summarizing, we have shown that $\{n,0\},\{n,1\}\in C$ for all $n\ge 0$. Computing $\{1,0\}\{1,1\}$, $\{1,1\}^2$, $\{1,2\}\{1,1\},\dots$ we show that $\{n,2\}\in C$ for all $n\ge 0$. Finally, one checks by induction on $k\ge 0$ as above that $\{n,k\}\in C$ for all $n\ge 0$.

Addendum. After writing this paper, I received the preprint "Chern characters of Fourier modules" by S. Walters, where the author computes the Chern characters of nine basic modules over the crossed product $C^*$-algebra $A_\alpha\rtimes_\sigma\mathbb{Z}_4$. Those computations also involve Jacobi's theta functions. I was recently informed by A. Valette that estimates on the norm of the Harper operator have been obtained by different methods in the papers "On the spectrum of a random walk on the discrete Heisenberg group and the norm of Harper's operator", J. Geom. Phys. 21, 337–356 (1997), by C. Béguin, A. Valette, A. Zuk and in "Norm estimates of discrete Schrödinger operators", Colloq. Math. 76, 153–160 (1998) by R. Szwarc.

Acknowledgement. I would like to express my gratitude to Dai Evans for his continuous encouragement and support. I am grateful to Bernard Helffer for explaining to me the results from [10].

References

1. Avron, J., v. Mouche, P.H.M., Simon, B.: On the measure of the spectrum for the almost Mathieu operator. Commun. Math. Phys. 132, 103–118 (1990)
2. Bellissard, J.: Gap labelling theorems for Schrödinger operators. In: From number theory to physics, M. Waldschmidt et al (eds), Berlin–Heidelberg–New York: Springer, 1992, pp. 538–630
3. Bratteli, O., Elliott, G.A., Evans, D.E., Kishimoto, A.: Non-commutative spheres I. International J. Math. 3, 139–166 (1991)
4. Boca, F.P.: The structure of higher-dimensional noncommutative tori and metric diophantine approximation. J. Reine Angew. Math. 492, 179–219 (1997)
5. Combes, F.: Crossed products and Morita equivalence. Proc. London Math. Soc. 49, 289–306 (1984)
6. Connes, A.: C∗-algèbres et géométrie différentielle. C. R. Acad. Sci. Paris 290, 599–604 (1980)
7. Connes, A.: Noncommutative geometry. London–New York: Academic Press, 1995
8. Elliott, G.A., Evans, D.E.: The structure of the irrational rotation C∗-algebra. Ann. of Math. 138, 477–501 (1993)
9. Farsi, C., Watling, N.: Quartic algebras. Canad. J. Math. 44, 1167–1191 (1992)
10. Helffer, B., Sjöstrand, J.: Analyse semi-classique pour l'equation de Harper II. Mem. de la Soc. Mathem. de France, No 34, Tome 116, fasc. 4 (1988)
11. Hofstadter, D.R.: Energy levels and wave functions of Bloch electrons in a rational or irrational magnetic field. Phys. Rev. B 14, 2239–2249 (1976)
12. Igusa, J.: Theta Functions. Berlin–Heidelberg–New York: Springer Verlag, 1972
13. Mumford, D.: Tata Lectures on Theta I. Basel–Boston: Birkhäuser, 1983


14. Pimsner, M., Voiculescu, D.: Exact sequences for K-groups and Ext-groups of certain cross-product C∗-algebras. J. Operator Theory 4, 93–118 (1980)
15. Pimsner, M., Voiculescu, D.: Imbedding the irrational rotation C∗-algebra into an AF-algebra. J. Operator Theory 4, 201–210 (1980)
16. Rieffel, M.A.: C∗-algebras associated with irrational rotations. Pacific J. Math. 93, 415–429 (1981)
17. Rieffel, M.A.: Projective modules over higher-dimensional non-commutative tori. Canad. J. Math. 40, 257–338 (1988)

Communicated by A. Connes

Commun. Math. Phys. 202, 359 – 401 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Inhomogeneous Lattice Paths, Generalized Kostka Polynomials and A_{n−1} Supernomials

Anne Schilling, S. Ole Warnaar

Instituut voor Theoretische Fysica, Universiteit van Amsterdam, Valckenierstraat 65, 1018 XE Amsterdam, The Netherlands. E-mail: [email protected]; [email protected]

Received: 10 March 1998 / Accepted: 15 October 1998

Abstract: Inhomogeneous lattice paths are introduced as ordered sequences of rectangular Young tableaux thereby generalizing recent work on the Kostka polynomials by Nakayashiki and Yamada and by Lascoux, Leclerc and Thibon. Motivated by these works and by Kashiwara's theory of crystal bases we define a statistic on paths yielding two novel classes of polynomials. One of these provides a generalization of the Kostka polynomials, while the other, which we name the A_{n−1} supernomial, is a q-deformation of the expansion coefficients of products of Schur polynomials. Many well-known results for Kostka polynomials are extended leading to representations of our polynomials in terms of a charge statistic on Littlewood–Richardson tableaux and in terms of fermionic configuration sums. Several identities for the generalized Kostka polynomials and the A_{n−1} supernomials are proven or conjectured. Finally, a connection between the supernomials and Bailey's lemma is made.

1. Introduction

Lattice paths play an important role in combinatorics and exactly solvable lattice models of statistical mechanics. In particular, the one-dimensional configuration sums necessary for the calculation of order parameters of lattice models are generating functions of lattice paths (see for example [4, 12, 16]). A classic example of lattice paths is given by sequences of upward and downward steps. The number of such paths consisting of $\lambda_1$ steps up and $\lambda_2$ steps down is given by the binomial coefficient $\binom{\lambda_1+\lambda_2}{\lambda_1} = (\lambda_1+\lambda_2)!/\lambda_1!\lambda_2!$ which is the expansion coefficient of
\[ (x_1+x_2)^L = \sum_{\substack{\lambda_1,\lambda_2\\ \lambda_1+\lambda_2=L}}\binom{L}{\lambda_1}x_1^{\lambda_1}x_2^{\lambda_2}. \]


An important q-deformation of the binomial is the q-binomial
\[ \begin{bmatrix}\lambda_1+\lambda_2\\ \lambda_1\end{bmatrix} = \begin{cases}\dfrac{(q^{\lambda_2+1})_{\lambda_1}}{(q)_{\lambda_1}} & \text{for } \lambda_1,\lambda_2\in\mathbb{Z}_{\ge 0},\\[6pt] 0 & \text{otherwise},\end{cases} \tag{1.1} \]
where $(x)_n = (1-x)(1-xq)(1-xq^2)\cdots(1-xq^{n-1})$. The q-binomial can be interpreted as the generating function of all paths with $\lambda_1$ steps up and $\lambda_2$ steps down where each path is weighted as follows. Let $p_1,\dots,p_{\lambda_1+\lambda_2}$ denote the steps of the path where we label a step up by 1 and a step down by 2. Then the weight of the path is given by $\sum_{i=1}^{\lambda_1+\lambda_2-1} i\,\chi(p_i<p_{i+1})$, where $\chi(\text{true}) = 1$ and $\chi(\text{false}) = 0$.

Other q-functions have occurred, such as a q-deformation of the trinomial coefficients [3, 2] in the expansion of $(x_1^2+x_1x_2+x_2^2)^L$ or, more generally, of the $(N+1)$-nomial coefficients [24, 44, 53] in the expansion of $h_N^L$, where $h_N$ is the complete symmetric polynomial in the variables $x_1$ and $x_2$ of degree $N$. In a study of Rogers–Ramanujan-type identities the following generalizations of the multinomial coefficients were introduced [47]
\[ h_1^{L_1}\cdots h_N^{L_N} = \sum_{\substack{\lambda_1,\lambda_2\\ \lambda_1+\lambda_2=\ell_N}}\binom{L}{\lambda_1-\frac12\ell_N}x_1^{\lambda_1}x_2^{\lambda_2}, \tag{1.2} \]
where $L = \sum_{i=1}^N L_ie_i\in\mathbb{Z}^N_{\ge 0}$ with $e_i$ the $i$th unit vector in $\mathbb{Z}^N$ and $\ell_i = \sum_{j=1}^N\min\{i,j\}L_j$. Since Eq. (1.2) reduces to the definition of the $(i+1)$-nomial coefficient when $L = Le_i$ (up to a shift in the lower index), the expansion coefficient in (1.2) was coined ($A_1$) supernomial. In ref. [47] it was shown that many Rogers–Ramanujan-type identities admit bounded analogues involving the following q-deformation of the supernomial:
\[ \begin{bmatrix}L\\ a\end{bmatrix} = \sum_{j_1+\cdots+j_N=a+\frac{\ell_N}{2}} q^{\sum_{k=1}^{N-1}(\ell_{k+1}-\ell_k-j_{k+1})j_k}\begin{bmatrix}L_N\\ j_N\end{bmatrix}\begin{bmatrix}L_{N-1}+j_N\\ j_{N-1}\end{bmatrix}\cdots\begin{bmatrix}L_1+j_2\\ j_1\end{bmatrix} \tag{1.3} \]
for $L\in\mathbb{Z}^N_{\ge 0}$ and $a+\frac12\ell_N = 0,1,\dots,\ell_N$. However, the question whether (1.3) or the Rogers–Ramanujan-type identities involving (1.3) can be interpreted as generating functions of weighted lattice paths remained unanswered. Incidentally, the polynomials in Eq. (1.3) have occurred in Butler's study [7]–[9] of finite abelian groups.

In a seemingly unrelated development, Nakayashiki and Yamada [40] introduced the notion of "inhomogeneous" lattice paths by considering paths in which each of the elementary steps $p_i$ can be chosen from a different set $B_i$. The main result of their work is a new combinatorial representation of the Kostka polynomial as the generating function of inhomogeneous paths where either all $B_i$ are sets of fully symmetric (one-row) Young tableaux or all $B_i$ are sets of fully antisymmetric (one-column) Young tableaux. An equivalent description of the Kostka polynomials, formulated in terms of the plactic monoid, was found by Lascoux, Leclerc and Thibon [33].

The purpose of this paper is to elucidate the connection of the work of Nakayashiki and Yamada and of Lascoux, Leclerc and Thibon on the Kostka polynomials with that of ref. [47] on supernomials and to extend all of them. In particular, we introduce inhomogeneous lattice paths based on Young tableaux with mixed symmetries, or more precisely, on Young tableaux of rectangular shape. Motivated by the theory of crystal bases [19] we assign weights to these paths and relate their generating functions to

Inhomogeneous Lattice Paths, Generalized Kostka Polynomials and An−1 Supernomials


q-deformations of the ($A_{n-1}$) supernomials defined through products of Schur polynomials, in the spirit of Eq. (1.2). By imposing suitable restrictions on the inhomogeneous lattice paths, we obtain new polynomials that include the Kostka polynomials as a special case. For these generalized Kostka polynomials we derive several extensions of classical results such as a Lascoux–Schützenberger-type representation [35] in terms of a charge statistic, and a representation akin to that of Kirillov and Reshetikhin [27] based on rigged configurations. We furthermore prove and conjecture several identities involving the $A_{n-1}$ supernomials and generalized Kostka polynomials and briefly comment on a Bailey-type lemma [6] for “antisymmetric” supernomials. The $A_{n-1}$ supernomials include polynomials previously studied in [7]–[9], [15].

The rest of this paper is organized as follows. Section 2 serves to set the notation used throughout the paper and to review some basic definitions and properties of Young tableaux, words and Kostka polynomials. In Sect. 3 inhomogeneous lattice paths based on rectangular Young tableaux are introduced (Def. 3.1). A statistic on such paths originating from crystal-base theory is used to define $A_{n-1}$ supernomials (Def. 3.5) and generalized Kostka polynomials (Def. 3.9) as generating functions of inhomogeneous paths. We furthermore map the paths underlying the generalized Kostka polynomials onto Littlewood–Richardson (LR) tableaux (Def. 3.6). In Sect. 4 an initial cyclage and charge statistic on LR tableaux is defined (Def. 4.3) which enables us in the subsequent section to prove a Lascoux–Schützenberger-type representation for generalized Kostka polynomials (Cor. 5.2). Section 6 deals with more general λ-(co)cyclages on LR tableaux, showing that these cyclages impose a ranked poset structure on the set of LR tableaux (Thm. 6.3). These results are used in Sect. 7 to prove a duality formula for the generalized Kostka polynomials (Thm. 7.1) and recurrence relations for the $A_{n-1}$ supernomials and the generalized Kostka polynomials (Thm. 7.4). In Sect. 8 the recurrences are employed to obtain a Kirillov–Reshetikhin-type expression for the generalized Kostka polynomials (Thm. 8.2). We finally conclude in Sect. 9 with some conjectured polynomial identities and with a Bailey-like lemma involving the $A_{n-1}$ supernomials.

2. Young Tableaux, Words and Kostka Polynomials

This section reviews some definitions and properties of Young tableaux, words and Kostka polynomials and sets out the notation and terminology used throughout the paper. For more details the reader may consult refs. [9, 13, 37]. Throughout, we denote by $|A|$ the cardinality of a set $A$ and we define $|\mu| = \sum_i \mu_i$ for an array of numbers $\mu = (\mu_1, \mu_2, \dots)$.

2.1. Young tableaux and words. We begin by recalling some definitions regarding partitions. A partition $\lambda = (\lambda_1, \lambda_2, \dots)$ is a weakly decreasing sequence of non-negative integers such that only finitely many $\lambda_i \neq 0$. We write $\lambda \vdash n$ if $|\lambda| = n$. Partitions which differ only by a string of zeros are identified. Each partition can be depicted by a Young diagram, which (adopting the “French” convention) is a collection of boxes with left-adjusted rows of decreasing length from bottom to top. If $\lambda = (\lambda_1, \lambda_2, \dots)$ is a partition then the corresponding Young diagram has $\lambda_i$ boxes in the $i$th row from the bottom. For example

□
□ □
□ □ □ □


A. Schilling, S. O. Warnaar

is the Young diagram corresponding to the partition (4, 2, 1). The nonzero elements $\lambda_i$ are called the parts of $\lambda$. The height of $\lambda$ is the number of parts and its width equals the largest part. At times it is convenient to denote a partition $\lambda$ with $L_i$ parts equal to $i$ by $\lambda = (1^{L_1} 2^{L_2} \cdots)$. The partition $\lambda^\top$ is the partition corresponding to the transposed diagram of $\lambda$ obtained by reflecting along the diagonal, i.e., if $\lambda = (1^{L_1} 2^{L_2} \cdots N^{L_N})$ then $(\lambda^\top)_i = L_i + \cdots + L_N$. The addition $\lambda + \mu$ of the partitions $\lambda$ and $\mu$ is defined by the addition of their parts, $(\lambda+\mu)_i = \lambda_i + \mu_i$. The parts of the partition $\lambda \cap \mu$ are given by $(\lambda \cap \mu)_i = \min\{\lambda_i, \mu_i\}$, and $\lambda/\mu$ denotes the skew shape obtained by removing the boxes of $\mu$ from $\lambda$. By $\lambda \geq \mu$ in dominance order we mean $\lambda_1 + \cdots + \lambda_i \geq \mu_1 + \cdots + \mu_i$ for all $i$.

The set of “rectangular” partitions (i.e., partitions with rectangular diagram) is denoted by $\mathcal{R}$. In this paper we will often encounter arrays of rectangular partitions. For such an array $\mu = (\mu_1, \dots, \mu_L) \in \mathcal{R}^L$, define $\mu^\star = (\mu_1^\top, \dots, \mu_L^\top)$ and $|\mu| = |\mu_1| + \cdots + |\mu_L|$. When all components of $\mu$ have height 1 ordered according to decreasing width, i.e., $\mu_i = (k_i)$ with $k_1 \geq \cdots \geq k_L$, one can identify $\mu$ with the partition $(k_1, \dots, k_L)$. Notice, however, that $\mu^\star \neq \mu^\top$ in this case. There is the following partial order on $\mathcal{R}^L$ modulo reordering. Define $\lambda^{(a)}$ as the partition obtained from $\lambda \in \mathcal{R}^L$ by putting the widths of all components of $\lambda$ of height $a$ in decreasing order. Then $\lambda \geq \mu$ for $\lambda, \mu \in \mathcal{R}^L$ if $\lambda^{(a)} \geq \mu^{(a)}$ for all $a$ by the dominance order on partitions.

Next we consider Young tableaux. Let $X = \{x_1 < x_2 < \cdots < x_n\}$ be a totally ordered alphabet of non-commutative indeterminates. A Young tableau over $X$ is a filling of a Young diagram such that each row is weakly increasing from left to right and each column is strictly increasing from bottom to top.
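The partition operations above are easy to experiment with. The following is a minimal Python sketch, assuming a partition is stored as a list of parts in weakly decreasing order (bottom row first in the French convention); the function names `transpose`, `add`, `intersect` and `dominates` are illustrative choices, not notation from the paper.

```python
def transpose(lam):
    """Conjugate partition: reflect the Young diagram along the diagonal."""
    parts = [p for p in lam if p > 0]
    width = parts[0] if parts else 0
    # (lam^T)_i = number of parts of lam that are at least i
    return [sum(1 for p in parts if p > i) for i in range(width)]


def add(lam, mu):
    """Part-wise sum (lam + mu)_i = lam_i + mu_i."""
    n = max(len(lam), len(mu))
    a = list(lam) + [0] * (n - len(lam))
    b = list(mu) + [0] * (n - len(mu))
    return [x + y for x, y in zip(a, b)]


def intersect(lam, mu):
    """Part-wise minimum (lam ∩ mu)_i = min(lam_i, mu_i), zeros dropped."""
    return [m for m in (min(x, y) for x, y in zip(lam, mu)) if m > 0]


def dominates(lam, mu):
    """True if every partial sum of lam is >= the matching partial sum of mu."""
    n = max(len(lam), len(mu))
    a = list(lam) + [0] * (n - len(lam))
    b = list(mu) + [0] * (n - len(mu))
    sum_a = sum_b = 0
    for x, y in zip(a, b):
        sum_a += x
        sum_b += y
        if sum_a < sum_b:
            return False
    return True


# An array of one-row rectangles (3), (2): mu^star transposes each component,
# which differs from the transpose of the partition (3, 2).
mu = [[3], [2]]
mu_star = [transpose(component) for component in mu]
print(transpose([4, 2, 1]), mu_star, transpose([3, 2]))
```

The last line illustrates the remark that $\mu^\star \neq \mu^\top$ when all components have height 1.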
The Young diagram (or, equivalently, partition) underlying a Young tableau $T$ is called the shape of $T$ and the height of $T$ is the height of its shape. The content of a Young tableau $T$ is an array $\mu = (\mu_1, \dots, \mu_n)$, where $\mu_i$ is the number of boxes filled with $x_i$. The set of all Young tableaux of shape $\lambda$ and content $\mu$ is denoted by $\mathrm{Tab}(\lambda, \mu)$. It is clear that $\mathrm{Tab}(\lambda, \mu) = \emptyset$ unless $|\lambda| = |\mu|$. Young tableaux can also be represented by words over the alphabet $X$. Let $\mathcal{X}$ be the free monoid generated by $X$. By the Schensted bumping algorithm [43] one can associate a Young tableau to each word $w \in \mathcal{X}$ denoted by $[w]$. Knuth [30] introduced equivalence relations on words generated by

$$zxy \equiv xzy \quad (x \leq y < z), \qquad yxz \equiv yzx \quad (x < y \leq z), \qquad (2.1)$$

for $x, y, z \in X$ and showed that $[w] = [w']$ if and only if $w \equiv w'$. The word $w_T$ obtained from a Young tableau $T$ by reading its entries successively from left to right down the page is called a word in row-representation or, for short, a row-word. Since $T = [w_T]$, the Schensted and row-reading algorithms provide a one-to-one correspondence between the plactic monoid $\mathcal{X}/\equiv$ and the set of Young tableaux over $X$. Using the above correspondence, we say that a word has shape $\lambda$ and content $\mu$ if the corresponding Young tableau $[w]$ is in $\mathrm{Tab}(\lambda, \mu)$. Furthermore, the product of two Young tableaux $S$ and $T$ is defined as $S \cdot T = [w_S w_T]$, where $w_S w_T$ is the word formed by juxtaposing the row-words $w_S$ and $w_T$. Finally, the following definitions for words are needed. A word $w = w_1 w_2 \cdots w_k$ with $w_i \in X$ is called a Yamanouchi word if, for all $1 \leq i \leq k$, the sequence $w_k \cdots w_i$


contains at least as many $x_1$ as $x_2$, at least as many $x_2$ as $x_3$ and so on. We call a word balanced if all of the letters in $X$ occur an equal number of times. Let $w$ be a word on the two-letter alphabet $\{x < y\}$ and recursively connect all pairs $yx$ in $w$ as in the following example: $w = yyyxyxxxy$ (here the letters at positions (3,4), (5,6), (2,7) and (1,8) are connected, and the final $y$ stays unconnected). A pair $y \cdots x$ is called an inverted pair. All letters of $w$ which do not belong to an inverted pair are called non-inverted, and the subword of $w$ containing its non-inverted letters is of the form $x^r y^s$.

2.2. Kostka polynomials. Throughout this paper $x^\lambda := x_1^{\lambda_1} \cdots x_n^{\lambda_n}$, where $x_1, \dots, x_n$ are commutative variables (not to be confused with the noncommutative letters in the alphabet $X$) and $\lambda = (\lambda_1, \dots, \lambda_n)$. The Schur polynomial $s_\lambda$ in the variables $x_1, \dots, x_n$ is defined as

$$s_\lambda(x) = \sum_{T \in \mathrm{Tab}(\lambda, \cdot)} x^T, \qquad (2.2)$$

where $x^T := x^{\mathrm{content}(T)}$. The Kostka polynomials $K_{\lambda\mu}(q)$ arise as the connection coefficients between the Schur and Hall–Littlewood polynomials [37],

$$s_\lambda(x) = \sum_{\mu \vdash |\lambda|} K_{\lambda\mu}(q) P_\mu(x; q). \qquad (2.3)$$

Here $\lambda$ and $\mu$ are partitions and $K_{\lambda\mu}(q) \neq 0$ if and only if $|\lambda| = |\mu|$ and $\lambda \geq \mu$. A combinatorial interpretation of the Kostka polynomials was obtained by Lascoux and Schützenberger [35], who showed that

$$K_{\lambda\mu}(q) = \sum_{T \in \mathrm{Tab}(\lambda, \mu)} q^{c(T)}, \qquad (2.4)$$

where $c(T)$ is the charge of a Young tableau defined below. Let $T \in \mathrm{Tab}(\cdot, \mu)$ be a Young tableau of content $\mu$ over $X = \{x_1 < x_2 < \cdots < x_n\}$ and let $T_{\min} = T_{\min}(\mu) := x_1^{\mu_1} \cdots x_n^{\mu_n}$ be the one-row tableau of content $\mu$. When $T \neq T_{\min}$ and $w_T = x_i u$ the initial cyclage $\mathcal{C}$ on $T$ is defined as $\mathcal{C}(T) = [u x_i]$. The cocharge $\mathrm{co}(T)$ of $T \in \mathrm{Tab}(\cdot, \mu)$ is the number of times one has to apply $\mathcal{C}$ to obtain $T_{\min}$. The charge of $T$ is defined as $c(T) = \|\mu\| - \mathrm{co}(T)$, where $\|\mu\|$ is the cocharge of the Young tableau $T_{\max} := x_n^{\mu_n} \cdots x_1^{\mu_1}$, given by $\|\mu\| = \sum_{i<j} \min\{\mu_i, \mu_j\}$. To illustrate the above definitions take, for example, $T = [x_3 x_2 x_1^2 x_2]$. Then $\mathcal{C}(T) = [x_2 x_1^2 x_2 x_3]$, $\mathcal{C}^2(T) = [x_1^2 x_2 x_3 x_2] = [x_3 x_1^2 x_2^2]$, $\mathcal{C}^3(T) = [x_1^2 x_2^2 x_3]$, so that $\mathrm{co}(T) = 3$ and $c(T) = 1$ in this example. Another combinatorial description of the Kostka polynomials, in terms of rigged configurations, is due to Kirillov and Reshetikhin [27] and provides an explicit formula for calculating the Kostka polynomials as

$$K_{\lambda\mu}(q) = \sum_{\alpha} q^{C(\alpha)} \prod_{a, i \geq 1} \begin{bmatrix} P_i^{(a)}(\alpha) + \alpha_i^{(a)} - \alpha_{i+1}^{(a)} \\ \alpha_i^{(a)} - \alpha_{i+1}^{(a)} \end{bmatrix}. \qquad (2.5)$$
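The cocharge and charge defined through the initial cyclage can be computed directly from a word. The sketch below is a hedged illustration in Python: letters $x_i$ are encoded as integers $i$, tableaux are lists of rows with the bottom row first, and the helper names (`insert`, `tableau_of`, `row_word`, `cocharge`, `norm`, `charge`) are assumptions made for this example. It assumes the content is a partition, as in the Kostka setting above, so that repeated cyclage reaches $T_{\min}$.

```python
from bisect import bisect_right


def insert(tableau, x):
    """Schensted row insertion of x; rows are listed bottom row first."""
    rows = [list(r) for r in tableau]
    for row in rows:
        k = bisect_right(row, x)      # index of first entry strictly larger than x
        if k == len(row):
            row.append(x)
            return rows
        row[k], x = x, row[k]         # bump that entry into the next row up
    rows.append([x])
    return rows


def tableau_of(word):
    """The tableau [w] of a word given as a list of integers."""
    t = []
    for x in word:
        t = insert(t, x)
    return t


def row_word(t):
    """Row-word w_T: rows read from the top row down, each left to right."""
    return [x for row in reversed(t) for x in row]


def cocharge(word):
    """Number of initial cyclages [x_i u] -> [u x_i] needed to reach T_min."""
    t = tableau_of(word)
    t_min = [sorted(word)] if word else []
    steps = 0
    while t != t_min:
        w = row_word(t)
        t = tableau_of(w[1:] + w[:1])
        steps += 1
    return steps


def norm(mu):
    """||mu|| = sum over i < j of min(mu_i, mu_j)."""
    return sum(min(mu[i], mu[j])
               for i in range(len(mu)) for j in range(i + 1, len(mu)))


def charge(word):
    mu = [word.count(i) for i in range(1, max(word) + 1)]
    return norm(mu) - cocharge(word)


T = [3, 2, 1, 1, 2]                   # the word x3 x2 x1^2 x2 of the example
print(cocharge(T), charge(T))
```

Running this on the word of the example reproduces the three cyclage steps listed in the text; the same helpers also confirm the Knuth relations (2.1) on small words.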


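The summation is over sequences $\alpha = (\alpha^{(0)}, \alpha^{(1)}, \dots)$ of partitions such that $\alpha^{(0)} = \mu^\top$ and $|\alpha^{(a)}| = \lambda_{a+1} + \lambda_{a+2} + \cdots$. Furthermore

$$P_i^{(a)}(\alpha) = \sum_{k=1}^{i} \left( \alpha_k^{(a-1)} - 2\alpha_k^{(a)} + \alpha_k^{(a+1)} \right) \qquad (2.6)$$

and

$$C(\alpha) = \sum_{a, i \geq 1} \binom{\alpha_i^{(a-1)} - \alpha_i^{(a)}}{2}, \qquad (2.7)$$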

where $\binom{a}{2} = a(a-1)/2$ for $a \in \mathbb{Z}$. Expressions of the type (2.5) are often called fermionic as they can be interpreted as the partition function for a system of quasi-particles with fractional statistics obeying Pauli’s exclusion principle [22, 23]. In Sect. 3.2 a third combinatorial representation of the Kostka polynomials as the generating function of paths will be discussed. This representation is due to Lascoux, Leclerc and Thibon [33] and Nakayashiki and Yamada [40] and is the starting point for our generalized Kostka polynomials. As we will see in subsequent sections, these generalized Kostka polynomials also admit representations stemming from Eqs. (2.4) and (2.5).

3. $A_{n-1}$ Supernomials and Generalized Kostka Polynomials

This section deals with paths defined as ordered sequences of rectangular Young tableaux. Assigning weights to the paths, we consider the generating functions over two different sets of paths called unrestricted and classically restricted. These are treated in Sects. 3.1 and 3.2, respectively. As will be shown in Sect. 7, the generating functions over the set of unrestricted paths are $A_{n-1}$ generalizations of the $A_1$ supernomials (1.3). The generating functions over the set of classically restricted paths lead to generalizations of the Kostka polynomials.

3.1. Unrestricted paths and $A_{n-1}$ supernomials. Denote by $B_\lambda$ the set $\mathrm{Tab}(\lambda, \cdot)$ of Young tableaux of shape $\lambda$ over the alphabet $\{1, 2, \dots, n\}$. An element of $B_\lambda$ is called a step and an ordered sequence of $L$ steps is a path of length $L$ denoted by $p_L \otimes \cdots \otimes p_1$. We treat here only paths with rectangular steps $p_i$, i.e., $p_i \in B_{\mu_i}$ for $\mu_i \in \mathcal{R}$. Let us however emphasize that the steps in a path can have different shapes indicated by the subscript $i$ on $\mu_i$. Paths with this property are called inhomogeneous [40].
The tensor product notation for paths (treated here as ordered sequences of steps only) is used for notational convenience, but is motivated by the relation to the theory of crystal bases [19]. In this setting $B_{(i^a)}$ is usually labelled by $B_{i\Lambda_a}$, where $\Lambda_a$ are the fundamental weights of $A_{n-1}$. The set $B_{i\Lambda_a}$ is called a perfect crystal and parametrizes a basis of the irreducible highest weight module of $A_{n-1}$ with highest weight $i\Lambda_a$ [21]. There exist crystal bases for all integrable highest weight modules and they are compatible with the tensor product structure.

Definition 3.1 (Unrestricted paths). For fixed integers $n \geq 2$ and $L \geq 0$ let $\lambda \in \mathbb{Z}^n_{\geq 0}$ and $\mu = (\mu_1, \dots, \mu_L) \in \mathcal{R}^L$. The set of paths $\mathcal{P}_{\lambda\mu}$ is defined as

$$\mathcal{P}_{\lambda\mu} = \Big\{ p_L \otimes \cdots \otimes p_1 \;\Big|\; p_i \in B_{\mu_i} \text{ and } \sum_{i=1}^{L} \mathrm{content}(p_i) = \lambda \Big\}. \qquad (3.1)$$


To each path $P \in \mathcal{P}_{\lambda\mu}$ we assign an energy $h(P) \in \mathbb{Z}_{\geq 0}$ as

$$h(P) = \sum_{i=1}^{L-1} i\, h(p_{i+1} \otimes p_i), \qquad (3.2)$$

where $h(p \otimes p')$ for the steps $p \in B_\nu$ and $p' \in B_{\nu'}$ is defined as the number of boxes in the product $p \cdot p'$ that lie outside the Young diagram $\nu + \nu'$ or, more formally, as

$$h(p \otimes p') = |\nu + \nu'| - |\mathrm{shape}(p \cdot p') \cap (\nu + \nu')|. \qquad (3.3)$$
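The energy of Eqs. (3.2) and (3.3) can be evaluated mechanically from the Schensted product. The following Python sketch is an illustration under stated assumptions: a step is a list of rows with the bottom row first, a path $p_L \otimes \cdots \otimes p_1$ is passed as the list `[p_1, ..., p_L]`, and the names `energy` and `path_energy` are hypothetical.

```python
from bisect import bisect_right


def insert(tableau, x):
    """Schensted row insertion of x; rows are listed bottom row first."""
    rows = [list(r) for r in tableau]
    for row in rows:
        k = bisect_right(row, x)
        if k == len(row):
            row.append(x)
            return rows
        row[k], x = x, row[k]
    rows.append([x])
    return rows


def tableau_of(word):
    t = []
    for x in word:
        t = insert(t, x)
    return t


def row_word(t):
    """Rows read from the top row down, each left to right."""
    return [x for row in reversed(t) for x in row]


def energy(p, q):
    """h(p ⊗ q) of Eq. (3.3): boxes of p·q lying outside shape(p) + shape(q)."""
    product = tableau_of(row_word(p) + row_word(q))
    n = max(len(p), len(q))
    total = [(len(p[i]) if i < len(p) else 0) + (len(q[i]) if i < len(q) else 0)
             for i in range(n)]
    inside = sum(min(total[i], len(product[i]))
                 for i in range(min(n, len(product))))
    return sum(total) - inside


def path_energy(path):
    """h(P) of Eq. (3.2) for path = [p_1, ..., p_L]."""
    return sum((i + 1) * energy(path[i + 1], path[i])
               for i in range(len(path) - 1))


p2 = [[1, 2], [2, 3]]     # bottom row 1 2, top row 2 3
p1 = [[1, 2]]
print(tableau_of(row_word(p2) + row_word(p1)), energy(p2, p1))
```

Here `p2` and `p1` are the two steps of Example 3.1, so the printed product has shape (3, 2, 1).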

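Example 3.1. Let

$$P = p_2 \otimes p_1 = \begin{matrix} 2 & 3 \\ 1 & 2 \end{matrix} \;\otimes\; \begin{matrix} 1 & 2 \end{matrix}\,. \qquad \text{Then} \qquad p_2 \cdot p_1 = \begin{matrix} 3 & & \\ 2 & 2 & \\ 1 & 1 & 2 \end{matrix}$$

and $\mathrm{shape}(p_2 \cdot p_1) = (3, 2, 1)$. Hence $h(P) = |(4, 2)| - |(3, 2, 1) \cap (4, 2)| = 6 - 5 = 1$.

The cardinality $S_{\lambda\mu}$ of $\mathcal{P}_{\lambda\mu}$ does not depend on the ordering of $\mu$, i.e.,

$$S_{\lambda\mu} = S_{\lambda\tilde{\mu}}, \qquad (3.4)$$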

where $\tilde{\mu}$ is a permutation of $\mu$. In general the generating function of $\mathcal{P}_{\lambda\mu}$ with paths weighted by the energy function $h$ does not have this symmetry. To obtain a weight function such that the resulting generating function does respect this symmetry we introduce an isomorphism $\sigma : B_\alpha \otimes B_{\alpha'} \to B_{\alpha'} \otimes B_\alpha$ for $\alpha, \alpha' \in \mathcal{R}$ between two successive steps. Let $p \otimes p' \in B_\alpha \otimes B_{\alpha'}$. Then $\sigma(p \otimes p') = \tilde{p}' \otimes \tilde{p}$, where $\tilde{p}'$ and $\tilde{p}$ are the unique Young tableaux of shape $\alpha'$ and $\alpha$, respectively, which satisfy

$$p \cdot p' = \tilde{p}' \cdot \tilde{p}. \qquad (3.5)$$

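The uniqueness of the Young tableaux $\tilde{p}'$ and $\tilde{p}$ is ensured since the Littlewood–Richardson coefficients have the symmetry $c^{\beta}_{\alpha\alpha'} = c^{\beta}_{\alpha'\alpha}$ and for rectangular shapes $\alpha$ and $\alpha'$ obey $c^{\beta}_{\alpha\alpha'} \leq 1$. Notice that $\sigma$ is the identity if $p$ and $p'$ have the same shape.

Definition 3.2 (Isomorphism). For a path $P = p_L \otimes \cdots \otimes p_1 \in \mathcal{P}_{\lambda\mu}$ we define the isomorphism $\sigma_i$ as

$$\sigma_i(P) = p_L \otimes \cdots \otimes \sigma(p_{i+1} \otimes p_i) \otimes \cdots \otimes p_1. \qquad (3.6)$$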

The group generated by the isomorphisms $\sigma_i$ is the symmetric group, i.e., $\sigma_i^2 = \mathrm{Id}$, $\sigma_i \sigma_{i+1} \sigma_i = \sigma_{i+1} \sigma_i \sigma_{i+1}$ and $\sigma_i \sigma_j = \sigma_j \sigma_i$ for $|i - j| \geq 2$. The proof of the braiding relation is non-trivial (see [49, 51]).

Definition 3.3 (Orbit). The set $\mathcal{O}_P$ is the orbit of the path $P \in \mathcal{P}_{\lambda\mu}$ under the group generated by the isomorphisms $\sigma_i$.

The weight of a path $P$ is now given by the mean of the energy function $h$ over the orbit of $P$.


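Definition 3.4 (Weight). For $P \in \mathcal{P}_{\lambda\mu}$, the weight function $H : \mathcal{P}_{\lambda\mu} \to \mathbb{Z}_{\geq 0}$ is defined as

$$H(P) = \frac{1}{|\mathcal{O}_P|} \sum_{P' \in \mathcal{O}_P} h(P'). \qquad (3.7)$$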

It is not obvious from (3.7) that the weight $H(P)$ of a path $P$ is indeed an integer. This will follow from Theorem 5.1. Before we continue to define the generating functions over the set of paths $\mathcal{P}_{\lambda\mu}$, some remarks on the relation of our definitions to lattice paths of exactly solvable lattice models and the theory of crystal bases are in order.

Remark 3.1. For homogeneous paths, i.e., $P \in \mathcal{P}_{\lambda\mu}$ with $\mu_1 = \cdots = \mu_L$, the weight simplifies to $H(P) = h(P)$ which is the weight function of configuration sums of $A^{(1)}_{n-1}$ solvable lattice models. For example, for $p, p' \in B_{(N)}$, the energy function $h(p \otimes p')$ coincides with the one of refs. [16, 12] (and references therein) given by

$$h(p \otimes p') = \max_{\tau \in S_N} \Big\{ \sum_{i=1}^{N} \chi(p_i > p'_{\tau_i}) \Big\}. \qquad (3.8)$$

Here $p_i, p'_i \in \{1, 2, \dots, n\}$ are the letters in $p = [p_1 \cdots p_N]$ and $p' = [p'_1 \cdots p'_N]$, $S_N$ is the permutation group on $1, 2, \dots, N$, $\chi(\text{true}) = 1$ and $\chi(\text{false}) = 0$. An alternative combinatorial expression of (3.8) in terms of so-called nonmovable tableaux is given in [26]. When $p, p' \in B_{(1^N)}$, our energy function reduces to $h(p \otimes p') = \min_{\tau \in S_N} \{ \sum_{i=1}^{N} \chi(p_i > p'_{\tau_i}) \}$ of ref. [41]. Nakayashiki and Yamada [40] defined weight functions on inhomogeneous paths when either $\mu$ or $\mu^\star$ is a partition, i.e., when $|\mu_1| \geq \cdots \geq |\mu_L|$ and either $\mathrm{height}(\mu_i) = 1$ for all $i$ or $\mathrm{width}(\mu_i) = 1$ for all $i$. Their isomorphism, defined in terms of graphical rules (Rule 3.10 and 3.11 of ref. [40]), is a special case of the isomorphism $\sigma$. The expression for $H(P)$ that they give is quite different from that of Eq. (3.7) even though it is the same function for the subset of paths they consider. For example when $\mathrm{height}(\mu_i) = 1$ for all $i$, $H$ of ref. [40] is, in our normalization, given by

$$H(P) = \sum_{i=2}^{L} \sum_{j=1}^{i-1} h(p_i \otimes p_j^{(i-1)}),$$

where $P = p_L \otimes \cdots \otimes p_1 \in \mathcal{P}_{\lambda\mu}$, $p_i^{(i)} = p_i$ and $p_j^{(i)} = p'_i$ with $P' = \sigma_{i-1} \circ \sigma_{i-2} \circ \cdots \circ \sigma_j(P)$ for $j < i$. Lascoux, Leclerc and Thibon [33] defined a weight function $b(T)$ for Young tableaux $T$ as the mean over certain orbits very similar in spirit to Eq. (3.7) (see Thm. 5.1 in ref. [33]). In fact, when $\mathrm{height}(\mu_i) = 1$ for all $i$, each path $P \in \mathcal{P}_{\cdot\mu}$ can be mapped to a Young tableau $T \in \mathrm{Tab}(\cdot, \mu)$ (by virtue of the map $\omega$ of Eq. (3.16) below, i.e., $T = [\omega(P)]$), and in this case one finds that $H(P) = \|\mu\| - b(T)$, where we recall that $\|\mu\| = \sum_{i<j} \min\{\mu_i, \mu_j\}$.

Remark 3.2. Kashiwara [19] defined raising and lowering operators $f_i$ and $e_i$ ($0 \leq i \leq n-1$) acting on elements of a crystal $B_{i\Lambda_a}$. Set $B_k = B_{i_k \Lambda_{a_k}}$ ($k = 1, 2$). Then for $p_1 \in B_1$ and $p_2 \in B_2$ the lowering operators act on the tensor product $p_2 \otimes p_1$ as follows:

$$e_i(p_2 \otimes p_1) = \begin{cases} p_2 \otimes e_i p_1 & \text{if } \varphi_i(p_1) \geq \varepsilon_i(p_2), \\ e_i p_2 \otimes p_1 & \text{if } \varphi_i(p_1) < \varepsilon_i(p_2), \end{cases}$$

where $\varepsilon_i(b) = \max\{k \mid e_i^k(b) \neq 0\}$ and $\varphi_i(b) = \max\{k \mid f_i^k(b) \neq 0\}$. The action of $f_i$ on a tensor product is defined in a similar way. (Note that to conform with the rest of this paper the order of the tensor product is inverted in comparison to the usual definitions.) Let $e_i(p_2 \otimes p_1) \neq 0$. Up to an additive constant the energy function used in crystal theory is recursively defined as

$$E(e_i(p_2 \otimes p_1)) = \begin{cases} E(p_2 \otimes p_1) + 1 & \text{if } i = 0 \text{ and } \varphi_0(p_1) \geq \varepsilon_0(p_2),\ \varphi_0(\tilde{p}_2) \geq \varepsilon_0(\tilde{p}_1), \\ E(p_2 \otimes p_1) - 1 & \text{if } i = 0 \text{ and } \varphi_0(p_1) < \varepsilon_0(p_2),\ \varphi_0(\tilde{p}_2) < \varepsilon_0(\tilde{p}_1), \\ E(p_2 \otimes p_1) & \text{otherwise.} \end{cases} \qquad (3.9)$$

Here $p_2 \otimes p_1 \mapsto \tilde{p}_1 \otimes \tilde{p}_2$ with $p_1, \tilde{p}_1 \in B_1$ and $p_2, \tilde{p}_2 \in B_2$ is an isomorphism obeying certain conditions [17, 18, 20]. The isomorphism $\sigma$ defined through (3.5) yields the isomorphism of crystal theory. Up to a sign the energy $h(p_2 \otimes p_1)$ as defined in (3.3) provides an explicit expression for the recursively defined energy of (3.9), i.e., $E(p_2 \otimes p_1) = -h(p_2 \otimes p_1)$. We do not prove these statements in this paper.

Let us now define the $A_{n-1}$ supernomial as the generating function of the set of paths $\mathcal{P}_{\lambda\mu}$ weighted by $H$ of Definition 3.4.

Definition 3.5 (Supernomials). Let $\lambda \in \mathbb{Z}^n_{\geq 0}$ and $\mu \in \mathcal{R}^L$. Then the supernomial $S_{\lambda\mu}(q)$ is defined as

$$S_{\lambda\mu}(q) = \sum_{P \in \mathcal{P}_{\lambda\mu}} q^{H(P)}. \qquad (3.10)$$

Since $H(P) = H(P')$ for $P' \in \mathcal{O}_P$ it is clear that

$$S_{\lambda\mu}(q) = S_{\lambda\tilde{\mu}}(q), \qquad (3.11)$$

where $\tilde{\mu}$ is a permutation of $\mu$. To conclude this section we comment on the origin of the terminology supernomial as first introduced in ref. [47]. Recalling Definition 3.1 of the set of paths $\mathcal{P}_{\lambda\mu}$ and Eq. (2.2) for the Schur polynomial, one finds that

$$s_{\mu_1}(x) \cdots s_{\mu_L}(x) = \sum_{\lambda \vdash |\mu|} S_{\lambda\mu}\, x^\lambda, \qquad (3.12)$$

where $S_{\lambda\mu} := S_{\lambda\mu}(1) = |\mathcal{P}_{\lambda\mu}|$. For homogeneous paths, i.e., $\mu_1 = \cdots = \mu_L$, this is the usual definition for various kinds of multinomial coefficients. The supernomials (or, more precisely, q-supernomial coefficients) can thus be viewed as q-deformations of generalized multinomial coefficients.

3.2. Classically restricted paths and generalized Kostka polynomials. Analogous to the previous section we now introduce classically restricted paths and their generating function. To describe the set of classically restricted paths we first map paths onto words and then specify the restriction on these words. For our purposes it will be most convenient to label the alphabet underlying the words associated to paths as

$$X^a = \{ x_1^{(1)} < x_1^{(2)} < \cdots < x_1^{(a_1)} < x_2^{(1)} < \cdots < x_2^{(a_2)} < \cdots < x_L^{(1)} < \cdots < x_L^{(a_L)} \} \qquad (3.13)$$


for some fixed integers $1 \leq a_i \leq n$. As before, $\mathcal{X}^a$ denotes the free monoid generated by $X^a$. The $i$-subword of a word $w \in \mathcal{X}^a$ is the subword of $w$ consisting of the letters $x_i^{(j)}$ ($1 \leq j \leq a_i$) only. More generally, the $(i_1, \dots, i_\ell)$-subword of $w$ is the subword consisting of the letters with subscripts $i_1, \dots, i_\ell$ only. We are interested in the following subset of $\mathcal{X}^a$:

$$\mathcal{W} = \{ w \in \mathcal{X}^a \mid \text{each } i\text{-subword of } w \text{ is a balanced Yamanouchi word} \}. \qquad (3.14)$$

By the Schensted bumping algorithm each word $w \in \mathcal{W}$ corresponds to a Young tableau $[w]$ over the alphabet $X^a$. It is an easy matter to show that if $w \in \mathcal{W}$ is Knuth equivalent to $w' \in \mathcal{X}^a$, then $w' \in \mathcal{W}$. Hence it makes sense to consider the set of Young tableaux over $X^a$ corresponding to $\mathcal{W}/\equiv$. Instead of labelling the content of such a tableau by $\mu = (\mu_1^{(1)}, \dots, \mu_L^{(a_L)})$ (where $\mu_i^{(j)}$ is the number of $x_i^{(j)}$’s), we set $\mu = (\mu_1, \dots, \mu_L)$, where $\mu_i = (\mu_i^{(1)}, \dots, \mu_i^{(a_i)})$ so that $\mu \in \mathcal{R}^L$ with $\mathrm{height}(\mu_i) = a_i$. The set of words $w \in \mathcal{W}$ with $\mathrm{content}([w]) = \mu$ is denoted $\mathcal{W}_\mu$.

Definition 3.6 (Littlewood–Richardson tableaux). Let $L \geq 0$ and $n \geq 2$ be integers, $\lambda$ a partition and $\mu \in \mathcal{R}^L$. Then the set of LR tableaux of shape $\lambda$ and content $\mu$ is defined as

$$\mathrm{LRT}(\lambda, \mu) = \{ T \mid w_T \in \mathcal{W}_\mu \text{ and } \mathrm{shape}(T) = \lambda \}. \qquad (3.15)$$
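Membership in the set of Eq. (3.14) can be tested letter by letter. The Python sketch below is illustrative only: a letter $x_i^{(j)}$ is encoded as the tuple `(i, j)` (so tuple comparison reproduces the order of the alphabet (3.13)), the heights $a_1, \dots, a_L$ are passed as a list `a`, and the names `is_yamanouchi` and `in_W` are assumptions of this example.

```python
def is_yamanouchi(superscripts):
    """Every suffix must contain at least as many j-1 as j, for all j > 1."""
    counts = {}
    for j in reversed(superscripts):
        counts[j] = counts.get(j, 0) + 1
        # adding a letter j can only break the comparison between j-1 and j
        if j > 1 and counts[j] > counts.get(j - 1, 0):
            return False
    return True


def in_W(word, a):
    """True if each i-subword of `word` is a balanced Yamanouchi word."""
    if any(not 1 <= i <= len(a) for i, _ in word):
        return False
    for i, a_i in enumerate(a, start=1):
        sup = [j for (s, j) in word if s == i]
        counts = [sup.count(j) for j in range(1, a_i + 1)]
        balanced = len(set(counts)) <= 1 and sum(counts) == len(sup)
        if not balanced or not is_yamanouchi(sup):
            return False
    return True


# A word with heights a = (3, 1, 2); its 1-, 2- and 3-subwords are all
# balanced Yamanouchi words.
good = [(1, 3), (3, 2), (1, 2), (2, 1), (3, 1), (3, 2), (1, 1), (2, 1), (3, 1)]
# Cycling a letter x_2^(2) to the end of a word breaks the Yamanouchi
# condition on the 2-subword (heights a = (2, 2, 1)).
bad = [(3, 1), (1, 2), (2, 2), (1, 1), (2, 1), (2, 1), (2, 2)]
print(in_W(good, [3, 1, 2]), in_W(bad, [2, 2, 1]))
```

The second word shows why a naive cyclage of the first letter can leave the set of LR tableaux when some $a_i > 1$.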

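The set of LR tableaux $\mathrm{LRT}(\lambda, \mu)$ reduces to the set $\mathrm{Tab}(\lambda, \mu)$ of Young tableaux over $X = \{x_1 < \cdots < x_L\}$ when $a_1 = a_2 = \cdots = a_L = 1$ in (3.13). We now define a map

$$\omega : \mathcal{P}_{\lambda\mu} \to \mathcal{W}_\mu \qquad (3.16)$$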

in the following way. Let $P = p_L \otimes \cdots \otimes p_1 \in \mathcal{P}_{\lambda\mu}$ and let $(j, k)$ denote the $k$th row of $p_j$. Set $P' = P$, $w$ to the empty word and carry out the following procedure $|\mu|$ times: If $(j, k)$ labels the position of the rightmost, maximal entry in $P'$, obtain a new $P'$ by removing this maximal entry from $P'$ and append $x_j^{(k)}$ to $w$. The resulting word $w$ defines $\omega(P)$. Equivalently, $[\omega(P)]$ is the column insertion recording tableau of $\mathrm{col}(p_L) \dots \mathrm{col}(p_1)$, where $\mathrm{col}(T)$ is the column word of the tableau $T$. The word $\omega(P)$ obtained via the above procedure is indeed in $\mathcal{W}_\mu$. The Yamanouchi condition is guaranteed in each intermediate $w$ in the construction thanks to the fact that all steps $p_j$ are Young tableaux. Since all $p_j$ have rectangular shape, $\omega(P)$ is a balanced Yamanouchi word. Note that the integer $a_j$ in the alphabet (3.13) used in the definition (3.14) of $\mathcal{W}$ is exactly the height of step $p_j$.

Example 3.2. To illustrate the map $\omega$ take for example

$$P = \begin{matrix} 2 & 3 \\ 1 & 2 \end{matrix} \;\otimes\; \begin{matrix} 1 & 2 \end{matrix} \;\otimes\; \begin{matrix} 4 \\ 2 \\ 1 \end{matrix}\,.$$


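Then $\omega(P) = x_1^{(3)} x_3^{(2)} x_1^{(2)} x_2^{(1)} x_3^{(1)} x_3^{(2)} x_1^{(1)} x_2^{(1)} x_3^{(1)}$ and the corresponding LR tableau (identifying $x_j^{(k)}$ with $j^{(k)}$) is

$$\begin{matrix} 1^{(3)} & 3^{(2)} & & \\ 1^{(2)} & 3^{(1)} & 3^{(2)} & \\ 1^{(1)} & 2^{(1)} & 2^{(1)} & 3^{(1)} \end{matrix}\,. \qquad (3.17)$$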
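The removal procedure defining $\omega$ amounts to sorting the boxes of the path. The Python sketch below is a hedged illustration: a path $p_L \otimes \cdots \otimes p_1$ is passed as `[p_1, ..., p_L]`, each step is a list of rows with the bottom row first, and a letter $x_j^{(k)}$ is the tuple `(j, k)`. The function names are assumptions for this example, and "rightmost" is read as: smallest step index first, then largest column.

```python
from bisect import bisect_right


def insert(tableau, x):
    """Schensted row insertion of x; rows are listed bottom row first."""
    rows = [list(r) for r in tableau]
    for row in rows:
        k = bisect_right(row, x)
        if k == len(row):
            row.append(x)
            return rows
        row[k], x = x, row[k]
    rows.append([x])
    return rows


def tableau_of(word):
    t = []
    for x in word:
        t = insert(t, x)
    return t


def row_word(t):
    return [x for row in reversed(t) for x in row]


def omega(path):
    """Word omega(P): repeatedly remove the rightmost maximal entry of P."""
    cells = []
    for j, step in enumerate(path, start=1):
        for k, row in enumerate(step, start=1):
            for col, value in enumerate(row):
                # sort by value (descending), step index (ascending, since
                # p_1 is the rightmost step), then column (descending)
                cells.append((-value, j, -col, k))
    cells.sort()
    return [(j, k) for (_, j, _, k) in cells]


p1 = [[1], [2], [4]]          # column with entries 1, 2, 4 from the bottom
p2 = [[1, 2]]
p3 = [[1, 2], [2, 3]]
w = omega([p1, p2, p3])
print(w)
print(tableau_of(w))

# Shape comparison in the spirit of Lemma 3.1 for the steps p3 and p2.
sub = [letter for letter in w if letter[0] in (2, 3)]
shape_sub = [len(r) for r in tableau_of(sub)]
shape_prod = [len(r) for r in tableau_of(row_word(p3) + row_word(p2))]
```

With the path of Example 3.2 as input, `tableau_of(w)` returns the rows of the tableau (3.17), bottom row first, and the two shapes computed at the end agree.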

The map $\omega$ has the following properties which are needed later.

Lemma 3.1. Let $P = p_L \otimes \cdots \otimes p_1 \in \mathcal{P}_{\lambda\mu}$. Then:
(i) The shape of $p_{i+1} \cdot p_i$ is the same as the shape of the $(i, i+1)$-subword of $\omega(P)$.
(ii) Let $w = w_1 \dots w_{|\mu|} := \omega(P)$ and $w' = w'_1 \dots w'_{|\mu|} := \omega(\sigma_i(P))$. Then $\mathrm{shape}(w_\ell \dots w_{\ell'}) = \mathrm{shape}(w'_\ell \dots w'_{\ell'})$ for all $1 \leq \ell \leq \ell' \leq |\mu|$. In particular, $\omega(P)$ and $\omega(\sigma_i(P))$ have the same shape, and if $\omega(P)$ is in row-representation then so is $\omega(\sigma_i(P))$.

Proof. (i) Let $L(w, k)$ be the largest possible sum of the lengths of $k$ disjoint increasing sequences extracted from the word $w$. If $w$ has shape $\nu$ then $L(w, k) = \nu_1 + \cdots + \nu_k$ (see for example Lemma 1 on page 32 of ref. [13]). Since $\mathrm{shape}(p_{i+1} \cdot p_i) = \mathrm{shape}(w_{p_{i+1}} w_{p_i})$, (i) is proven if we can show that $L(w_{p_{i+1}} w_{p_i}, k) = L(\omega(p_{i+1} \otimes p_i), k)$ for all $k$. Let $s_1, \dots, s_k$ be disjoint increasing sequences of $w_{p_{i+1}} w_{p_i}$ such that the sum of their lengths is $L(w_{p_{i+1}} w_{p_i}, k)$. Now $\omega$ successively maps the rightmost largest element of $P$ to $x_\ell^{(m)}$ if its position is in step $\ell$ and row $m$. Interpreting $s_1, \dots, s_k$ as decreasing sequences from right to left we see that the image of $s_j$ under $\omega$ is an increasing sequence from left to right in $\omega(p_{i+1} \otimes p_i)$. Conversely, each increasing sequence of $\omega(p_{i+1} \otimes p_i)$ is a decreasing sequence from right to left of $w_{p_{i+1}} w_{p_i}$ which proves $L(w_{p_{i+1}} w_{p_i}, k) = L(\omega(p_{i+1} \otimes p_i), k)$.

(ii) Since $\sigma_i$ only changes the $(i, i+1)$-subword of $\omega(P)$ it suffices to prove (ii) for paths $P = p \otimes p'$ of length two. Set $\tilde{p}' \otimes \tilde{p} := \sigma(p \otimes p')$ so that $p \cdot p' = \tilde{p}' \cdot \tilde{p}$. In particular $p \cdot p'$ and $\tilde{p}' \cdot \tilde{p}$ have the same shape. Hence from (i) we conclude that $\omega(p \otimes p')$ and $\omega(\sigma(p \otimes p'))$ have the same shape as well. Denote by $w_p^{(\ell,\ell')} w_{p'}^{(\ell,\ell')}$ and $w_{\tilde{p}'}^{(\ell,\ell')} w_{\tilde{p}}^{(\ell,\ell')}$ the words obtained from $w_p w_{p'}$ and $w_{\tilde{p}'} w_{\tilde{p}}$ after successively removing the $\ell - 1$ rightmost biggest letters and the $|\mu| - \ell'$ leftmost smallest letters, respectively. Then the arguments of point (i) still go through, that is, $L(w_p^{(\ell,\ell')} w_{p'}^{(\ell,\ell')}, k) = L(w_\ell \dots w_{\ell'}, k)$ and $L(w_{\tilde{p}'}^{(\ell,\ell')} w_{\tilde{p}}^{(\ell,\ell')}, k) = L(w'_\ell \dots w'_{\ell'}, k)$ for all $k$. Since $w_p^{(\ell,\ell')} w_{p'}^{(\ell,\ell')} \equiv w_{\tilde{p}'}^{(\ell,\ell')} w_{\tilde{p}}^{(\ell,\ell')}$, it follows that $\mathrm{shape}(w_\ell \dots w_{\ell'}) = \mathrm{shape}(w'_\ell \dots w'_{\ell'})$. From this one can immediately deduce that $\omega(P)$ is in row-representation if and only if $\omega(\sigma_i(P))$ is in row-representation.

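Definition 3.7 (Classically restricted paths). Let $\lambda$ be a partition such that $\mathrm{height}(\lambda) \leq n$ and let $\mu \in \mathcal{R}^L$. The set of classically restricted paths $\overline{\mathcal{P}}_{\lambda\mu}$ is defined as

$$\overline{\mathcal{P}}_{\lambda\mu} = \{ P \in \mathcal{P}_{\lambda\mu} \mid \mathrm{shape}(\omega(P)) = \lambda \}. \qquad (3.18)$$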

Since each path $P \in \mathcal{P}_{\lambda\mu}$ contains $\lambda_i$ boxes filled with $i$, the condition that $\omega(P)$ has shape $\lambda$ implies that $\omega(P)$ is in row-representation. Hence $\overline{\mathcal{P}}_{\lambda\mu}$ is isomorphic to $\mathrm{LRT}(\lambda, \mu)$. Let us now introduce the restricted analogue of the supernomials of Def. 3.5.


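Definition 3.8. For $\lambda$ a partition with $\mathrm{height}(\lambda) \leq n$ and $\mu \in \mathcal{R}^L$ define

$$\tilde{K}_{\lambda\mu}(q) = \sum_{P \in \overline{\mathcal{P}}_{\lambda\mu}} q^{H(P)}. \qquad (3.19)$$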

When $\mu$ is a partition, i.e., $\mu \in \mathcal{R}^L$ such that its components $\mu_i$ are one-row partitions of decreasing size, $\tilde{K}_{\lambda\mu}(q)$ reduces to the cocharge Kostka polynomial. This follows from the work of Nakayashiki and Yamada [40] and Lascoux, Leclerc and Thibon [33] and the relation between the weight (3.7) and their statistics as explained in Remark 3.1, and is our motivation for the definition of generalized Kostka polynomials for all $\mu \in \mathcal{R}^L$.

Definition 3.9 (Generalized Kostka polynomials). For $\lambda$ a partition with $\mathrm{height}(\lambda) \leq n$ and $\mu \in \mathcal{R}^L$, the generalized Kostka polynomial $K_{\lambda\mu}(q)$ is defined as

$$K_{\lambda\mu}(q) = q^{\|\mu\|} \tilde{K}_{\lambda\mu}(1/q), \qquad (3.20)$$

where $\|\mu\| = \sum_{i<j} |\mu_i \cap \mu_j|$.

Lemma 3.1 ensures that if $P' \in \mathcal{O}_P$ with $P \in \overline{\mathcal{P}}_{\lambda\mu}$ then $\mathrm{shape}(\omega(P')) = \lambda$. Therefore $P' \in \overline{\mathcal{P}}_{\lambda\tilde{\mu}}$ for some permutation $\tilde{\mu}$ of $\mu$, and hence also $\tilde{K}_{\lambda\tilde{\mu}}(q) = \tilde{K}_{\lambda\mu}(q)$ in analogy with Eq. (3.11). We also find that

$$K_{\lambda\mu}(q) = K_{\lambda\tilde{\mu}}(q) \qquad (3.21)$$

since $\|\mu\| = \|\tilde{\mu}\|$. We remark that the lattice path representation of the Kostka polynomials was used in [31] to express the Demazure characters as a sum over Kostka polynomials. It also yields expressions for the $A^{(1)}_{n-1}/A_{n-1}$ branching functions in terms of the Kostka polynomials [40, 26] and has been employed in [15] to obtain various further $A^{(1)}_{n-1}$ branching and string function identities.

4. Initial Cyclage and Cocharge for LR Tableaux

In this section we define the notions of initial cyclage, cocharge and charge for LR tableaux which “pave the path” for Sect. 5 where an expression of the Lascoux–Schützenberger-type (2.4) for the generalized Kostka polynomials is derived. In the case when $\mathrm{LRT}(\lambda, \mu)$ coincides with $\mathrm{Tab}(\lambda, \mu)$ our definitions reduce to the usual definitions of the initial cyclage etc. as mentioned in Sect. 2.2. The definition of the initial cyclage for $T \in \mathrm{LRT}(\cdot, \mu)$ has to be altered when $a \neq (1, \dots, 1)$ for the alphabet $X^a$ as given in (3.13). Namely, if $T = x_i^{(a_i)} u \in \mathrm{LRT}(\cdot, \mu)$ with $x_i^{(a_i)} u$ in row-representation (by the Yamanouchi condition on each of the $i$-subwords the first letter has to be $x_i^{(a_i)}$ for some $i$), then $T' := u x_i^{(a_i)}$ obtained by cycling the first letter is not in $\mathrm{LRT}(\cdot, \mu)$ since the Yamanouchi condition on the $i$-subword of $u x_i^{(a_i)}$ is violated if $a_i > 1$. To repair this fault we define the initial cyclage for $T = [w]$ with $w = x_i^{(a_i)} u$ in row-representation by considering the following chain of transformations:

$$w \to w^{(a_i)} \to w^{(a_i - 1)} \to \cdots \to w^{(1)} \in \mathcal{W}_\mu. \qquad (4.1)$$


Here $w^{(a_i)} := u x_i^{(a_i)}$ and $w^{(j)}$ for $1 < j \leq a_i$ is a word such that (i) the $k$-subword for each $k \neq i$ is a balanced Yamanouchi word, (ii) the $i$-subword is balanced, (iii) the subword consisting of the letters $x_i^{(j)}$ and $x_i^{(j-1)}$ has one non-inverted $x_i^{(j-1)}$ and one non-inverted $x_i^{(j)}$ and (iv) the subword consisting of the letters $x_i^{(k)}$ and $x_i^{(k-1)}$ with $k \neq j$ has no non-inverted letters. The transformation $w^{(j+1)} \to w^{(j)}$ is defined as follows. Consider the subword of $w^{(j+1)}$ consisting of the letters $x_i^{(j)}$ and $x_i^{(j+1)}$ only. Determine all inverted pairs for this subword and exchange the two non-inverted letters $x_i^{(j)}$ and $x_i^{(j+1)}$ (making them an inverted pair). The resulting word is $w^{(j)}$. Clearly, this means that $w^{(1)} \in \mathcal{W}_\mu$.

Definition 4.1 (Initial cyclage). The initial cyclage $\mathcal{C}$ on $T \in \mathrm{LRT}(\cdot, \mu)$ is defined as

$$\mathcal{C}(T) = w^{(1)}, \qquad (4.2)$$

where $w^{(1)} \in \mathcal{W}_\mu$ is the last word in the chain of transformations (4.1).

Example 4.1. If

$$T = \begin{matrix} 2^{(2)} & 3^{(1)} & \\ 1^{(2)} & 2^{(2)} & \\ 1^{(1)} & 2^{(1)} & 2^{(1)} \end{matrix} \qquad \text{then} \qquad \mathcal{C}(T) = \begin{matrix} 3^{(1)} & & \\ 1^{(2)} & 2^{(2)} & 2^{(2)} \\ 1^{(1)} & 2^{(1)} & 2^{(1)} \end{matrix}\,,$$

where the words $w^{(2)}$ and $w^{(1)}$ in the chain (4.1) are given by

$$w^{(2)} = x_3^{(1)} x_1^{(2)} x_2^{(2)} x_1^{(1)} x_2^{(1)} x_2^{(1)} x_2^{(2)}$$

and

$$w^{(1)} = x_3^{(1)} x_1^{(2)} x_2^{(2)} x_1^{(1)} x_2^{(1)} x_2^{(2)} x_2^{(1)}.$$
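The chain of transformations (4.1) can be traced step by step. The Python sketch below is a hedged illustration: letters $x_i^{(j)}$ are tuples `(i, j)`, the input is a row-word whose first letter is $x_i^{(a_i)}$, inverted pairs are found with a stack (as when matching brackets), and the names `lr_cyclage_word`, `insert` and `tableau_of` are assumptions of this example.

```python
from bisect import bisect_right


def insert(tableau, x):
    """Schensted row insertion of x; rows are listed bottom row first."""
    rows = [list(r) for r in tableau]
    for row in rows:
        k = bisect_right(row, x)
        if k == len(row):
            row.append(x)
            return rows
        row[k], x = x, row[k]
    rows.append([x])
    return rows


def tableau_of(word):
    t = []
    for x in word:
        t = insert(t, x)
    return t


def lr_cyclage_word(w):
    """Return w^(1) of the chain (4.1) for a row-word w = x_i^(a_i) u."""
    i, a = w[0]
    w = list(w[1:]) + [w[0]]               # w^(a_i) = u x_i^(a_i)
    for j in range(a - 1, 0, -1):          # w^(j+1) -> w^(j)
        x, y = (i, j), (i, j + 1)
        open_y, free_x = [], []
        for pos, letter in enumerate(w):
            if letter == y:
                open_y.append(pos)         # a y waiting for a later x
            elif letter == x:
                if open_y:
                    open_y.pop()           # this x closes an inverted pair y...x
                else:
                    free_x.append(pos)     # non-inverted x
        if len(free_x) != 1 or len(open_y) != 1:
            raise ValueError("expected exactly one non-inverted letter of each kind")
        # exchange the two non-inverted letters, turning them into an inverted pair
        w[free_x[0]], w[open_y[0]] = y, x
    return w


w_T = [(2, 2), (3, 1), (1, 2), (2, 2), (1, 1), (2, 1), (2, 1)]
w1 = lr_cyclage_word(w_T)
print(w1)
print(tableau_of(w1))
```

The input `w_T` is the row-word of the tableau $T$ of Example 4.1, and the printed tableau lists the rows of $\mathcal{C}(T)$ from the bottom up.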

For $\mathrm{Tab}(\cdot, \mu)$ the initial cyclage defines a partial order ranked by the cocharge, that is, $\mathrm{co}(T) := \mathrm{rank}(T)$. In particular, $\mathrm{co}(T') = \mathrm{co}(T) + 1$ if $T = \mathcal{C}(T')$ and the minimal element in this poset is $T_{\min} = x_1^{\mu_1} \cdots x_n^{\mu_n}$. For $\mathrm{LRT}(\cdot, \mu)$ we would like to mimic this structure, that is, we wish to turn $\mathrm{LRT}(\cdot, \mu)$ into a ranked poset with minimal element defined as the LR tableau $T_{\min} = x_1^{\mu_1} x_2^{\mu_2} \cdots x_L^{\mu_L}$, where $x_i^{\mu_i}$ abbreviates $(x_i^{(a_i)} x_i^{(a_i - 1)} \cdots x_i^{(1)})^j$ for $\mu_i = (j^{a_i}) \in \mathcal{R}$. Note that for $a \neq (1, \dots, 1)$, $T_{\min}$ is no longer a one-row tableau but a tableau of shape $\mu_1 + \mu_2 + \cdots + \mu_L$ where the $j$th row is filled with the letters $x_i^{(j)}$ (all possible $i$) only. Having fixed $T_{\min}$ we observe two important differences between the sets $\mathrm{LRT}(\cdot, \mu)$ (for $a \neq (1, \dots, 1)$) and $\mathrm{Tab}(\cdot, \mu)$.

Remark 4.1.
1. If for $T, T_{\min} \in \mathrm{Tab}(\cdot, \mu)$, $w_T$ and $w_{T_{\min}}$ both start with the same letter then $T = T_{\min}$. Generally this is not true for $T, T_{\min} \in \mathrm{LRT}(\cdot, \mu)$. Indeed, the LR tableau (3.17) of Example 3.2 starts with $x_1^{(3)}$, but is not the minimal LR tableau, which reads

$$T_{\min} = \begin{matrix} 1^{(3)} & & & & \\ 1^{(2)} & 3^{(2)} & 3^{(2)} & & \\ 1^{(1)} & 2^{(1)} & 2^{(1)} & 3^{(1)} & 3^{(1)} \end{matrix}\,.$$


2. For $a = (1, \dots, 1)$ the initial cyclage $\mathcal{C}$ has no fixed points. In other words there is no $T \in \mathrm{Tab}(\cdot, \mu)$ with $T \neq T_{\min}$ such that $\mathcal{C}(T) = T$. For $\mathrm{LRT}(\cdot, \mu)$, however, fixed points may occur. The following LR tableau, for example, is not minimal but obeys $\mathcal{C}(T) = T$:

$$T = \begin{matrix} 4^{(3)} & & \\ 1^{(2)} & 3^{(1)} & 4^{(2)} \\ 1^{(1)} & 2^{(1)} & 4^{(1)} \end{matrix}\,. \qquad (4.3)$$

The second remark shows that the initial cyclage $\mathcal{C}$ does not induce a ranked poset structure on $\mathrm{LRT}(\cdot, \mu)$ and hence needs further modification. For this purpose we define the “dropping” and “insertion” operators $\mathcal{D}$ and $\mathcal{U}$, respectively. Let $T \in \mathrm{LRT}(\cdot, \mu)$ with $\mu \in \mathcal{R}^L$. If, for some fixed $i \in \{1, \dots, L\}$ and all $j = 1, \dots, \mathrm{height}(T)$ the $j$th row of $T$ contains $x_i^{(j)}$ (there may be more than one $x_i^{(j)}$ in row $j$), then drop all the boxes containing the letters $x_i^{(j)}$. Repeat this operation on the reduced tableau until no more letters can be dropped. The final tableau defines $\mathcal{D}(T)$. Obviously the condition for dropping occurs if and only if $\mathrm{height}(T) = a_i$ for some $i$. The operator $\mathcal{U}$ is somewhat intricate in that we only define $\mathcal{U} \circ \mathcal{O} \circ \mathcal{D}$, where $\mathcal{O}$ can be any content preserving operator acting on LR tableaux. So assume $T' = (\mathcal{O} \circ \mathcal{D})(T)$. Then $\mathcal{U}$ acts on $T'$ by reinserting all boxes that have been dropped by $\mathcal{D}$, inserting a box with filling $x_i^{(j)}$ in the $j$th row such that the conditions for a Young tableau are satisfied (i.e., each row remains non-decreasing and each column strictly increasing on $X^a$). The insertion operator $\mathcal{U}$ is not to be confused with the insertion of boxes defined by the Schensted algorithm. $\mathcal{U}$ never bumps any boxes.

Remark 4.2. Note that $\mathcal{U} \circ \mathcal{D} = \mathrm{Id}$, but $\mathcal{D} \circ \mathcal{U} \circ \mathcal{O} \circ \mathcal{D} = \mathcal{D} \circ \mathcal{O} \circ \mathcal{D}$ and not $\mathcal{O} \circ \mathcal{D}$.

Definition 4.2 (Modified initial cyclage). The modified initial cyclage $\overline{\mathcal{C}} : \mathrm{LRT}(\cdot, \mu) \to \mathrm{LRT}(\cdot, \mu)$ is defined as

$$\overline{\mathcal{C}} = \mathcal{U} \circ \mathcal{C} \circ \mathcal{D}. \qquad (4.4)$$

Note that $D(T) = T$ when $\mathrm{height}(T) > \max\{a_1,\dots,a_L\}$, in which case $\overline{C} = C$. Finally we are in the position to define the cocharge and charge of an LR tableau.

Definition 4.3 (Cocharge and charge). Let $T \in \mathrm{LRT}(\cdot,\mu)$. (i) The cocharge $\mathrm{co}(T)$ of $T$ is the number of times one has to apply $\overline{C}$ to obtain the minimal LR tableau $T_{\min}$. (ii) The charge is $c(T) = \|\mu\| - \mathrm{co}(T)$, where $\|\mu\| = \sum_{i<j} |\mu_i \cap \mu_j|$.

Example 4.2. For $T$ in (4.3) we have

$$D(T) = \begin{array}{l} 3^{(1)} \\ 2^{(1)} \end{array}; \qquad (C \circ D)(T) = 2^{(1)}\, 3^{(1)}; \qquad \overline{C}(T) = (U \circ C \circ D)(T) = \begin{array}{llll} 4^{(3)} \\ 1^{(2)} & 4^{(2)} \\ 1^{(1)} & 2^{(1)} & 3^{(1)} & 4^{(1)} \end{array}.$$

Since $\overline{C}(T) = T_{\min}$ and $\|\mu\| = 7$ we see that $\mathrm{co}(T) = 1$ and $c(T) = 6$.
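The value $\|\mu\| = 7$ quoted in Example 4.2 can be checked mechanically. The following is a minimal sketch, assuming each rectangle $\mu_i = (j^{a_i})$ is encoded as a pair (width, height); this encoding and the function names are illustrative choices, not notation from the paper. The content $((1^2),(1),(1),(1^3))$ is read off from the letters occurring in (4.3).

```python
from itertools import combinations

def overlap(r, s):
    """Number of boxes in the intersection of two rectangles (width, height)."""
    return min(r[0], s[0]) * min(r[1], s[1])

def norm(mu):
    """||mu|| = sum over i < j of |mu_i ∩ mu_j|."""
    return sum(overlap(r, s) for r, s in combinations(mu, 2))

def charge(mu, cocharge):
    """Charge c(T) = ||mu|| - co(T) as in Definition 4.3 (ii)."""
    return norm(mu) - cocharge

# Content of the tableau (4.3): x_1 has superscripts 1..2, x_2 and x_3 only 1,
# x_4 has superscripts 1..3, each letter occurring once per superscript.
mu = [(1, 2), (1, 1), (1, 1), (1, 3)]
print(norm(mu))       # 7
print(charge(mu, 1))  # 6
```

The cocharge itself is passed in by hand here, since computing it requires the modified initial cyclage, which this sketch does not implement.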


Further examples of the action of the modified initial cyclage can be found in Appendix B. As will be shown in Sect. 6, the modified initial cyclage $\overline{C}$ indeed turns $\mathrm{LRT}(\cdot,\mu)$ into a ranked poset with $\mathrm{co} := \mathrm{rank}$. This implies in particular that $\mathrm{co}(T)$ is a bounded non-negative integer. In fact, we will do more and define more general $\lambda$-cyclages which induce a ranked poset structure on $\mathrm{LRT}(\cdot,\mu)$. We also show in Sect. 6 that the charge is non-negative and that $\|\mu\| = \mathrm{co}(T_{\max})$, where $T_{\max} = x_L^{\mu_L} \cdots x_1^{\mu_1}$ has maximal cocharge. For convenience we write $\mathrm{co}(w)$ instead of $\mathrm{co}([w])$ for $w \in \mathcal{W}_\mu$.

5. Charge Statistic Representation for the Generalized Kostka Polynomials

There is a relation between the weight-function of Definition 3.4 and the cocharge of Definition 4.3 which is stated in Theorem 5.1. This relation enables us to derive an expression for the generalized Kostka polynomials stemming from the Lascoux and Schützenberger representation (2.4) given in Corollary 5.2. Section 5.2 is devoted to the proof of Theorem 5.1.

5.1. Relation between cocharge and weight. To state the precise relation between cocharge and weight, we need to introduce the anti-automorphism $\Omega$ on words in $\mathcal{W}$. Recall that for every alphabet $X = \{x_1 < x_2 < \cdots < x_L\}$ there exists the dual alphabet $X^* = \{x_L^* < x_{L-1}^* < \cdots < x_1^*\}$. Setting $(x_i^*)^* = x_i$ gives $(X^*)^* = X$. The letter $x_i^*$, which is often identified with $x_{L+1-i}$, is called dual to $x_i$. Under $\Omega$ a word $w = x_{i_1} x_{i_2} \cdots x_{i_k}$ in the monoid $X$ is mapped to $w^* = x_{i_k}^* x_{i_{k-1}}^* \cdots x_{i_1}^*$ in $X^*$. Obviously, $\Omega$ is an involution. For the alphabet $X^a$ of Eq. (3.13) one may identify $(x_i^{(j)})^*$ with $x_{L+1-i}^{(a_i+1-j)}$ so that $(X^a)^*$ becomes $X^{a^*}$ with $a^* = (a_L,\dots,a_1)$. One can easily show that if $w \in \mathcal{W}$ over $X^a$, then $\Omega(w) \in \mathcal{W}^*$ with $\mathcal{W}^* = \{w \in X^{a^*} \mid \text{each } i\text{-subword of } w \text{ is a balanced Yamanouchi word}\}$. Since $\Omega$ respects the Knuth equivalence relations, $\Omega$ is also well-defined on LR tableaux by setting $\Omega(T) = \Omega(w_T)$.

On paths $P = p_L \otimes \cdots \otimes p_1 \in \mathcal{P}_{\lambda\mu}$ we define $\Omega_p(P) = \Omega(p_1) \otimes \cdots \otimes \Omega(p_L)$. Recall the map $\omega : \mathcal{P}_{\lambda\mu} \to \mathcal{W}_\mu$ defined in (3.16) from paths to words. With this we can now state the following theorem.

Theorem 5.1 (Weight-cocharge relation). For $n \geq 2$ and $L \geq 0$ integers let $\lambda \in \mathbb{Z}_{\geq 0}^n$ and $\mu \in \mathcal{R}^L$. Then for $P \in \mathcal{P}_{\lambda\mu}$ the weight $H(P)$ is a non-negative integer and

$$H(P) = \mathrm{co}(\Omega \circ \omega(P)). \tag{5.1}$$
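The action of the anti-automorphism of Sect. 5.1 on words over $X^a$ can be illustrated with a few lines of code. This is only a sketch under stated assumptions: a letter $x_i^{(j)}$ is encoded as the pair `(i, j)`, the heights are passed as a tuple `a`, and the function name `omega_word` is hypothetical. It implements the rule "reverse the word and replace $x_i^{(j)}$ by $x_{L+1-i}^{(a_i+1-j)}$" and does not check the Yamanouchi condition.

```python
def omega_word(word, a):
    """Reverse a word over X^a and dualise every letter.

    word: list of pairs (i, j) standing for x_i^{(j)}, with 1 <= i <= L
          and 1 <= j <= a[i-1].
    a:    tuple of heights (a_1, ..., a_L).
    Returns the image word over X^{a*} and a* = (a_L, ..., a_1).
    """
    L = len(a)
    dual = [(L + 1 - i, a[i - 1] + 1 - j) for (i, j) in reversed(word)]
    return dual, tuple(reversed(a))

a = (2, 1, 3)
w = [(3, 3), (1, 2), (3, 2), (1, 1), (2, 1), (3, 1)]
w_star, a_star = omega_word(w, a)
print(w_star)   # [(1, 3), (2, 1), (3, 2), (1, 2), (3, 1), (1, 1)]
print(a_star)   # (3, 1, 2)

# Applying the map twice (with a* in the second step) returns the original word.
back, a_back = omega_word(w_star, a_star)
print(back == w and a_back == a)  # True
```

The round trip at the end reflects the statement that the map is an involution.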

In the special case when $n = 2$ and $\mu = (1^{|\lambda|})$ a similar relation was noticed in [11]. Theorem 5.1 generalizes Theorem 5.1 of ref. [33] valid when $\mu$ is a partition. It also implies that the generalized Kostka polynomials can be expressed as the generating function of LR tableaux with the charge statistic. This is summarized in the following corollary.

Corollary 5.2. The generalized Kostka polynomial $K_{\lambda\mu}(q)$ can be expressed as

$$K_{\lambda\mu}(q) = \sum_{T \in \mathrm{LRT}(\lambda,\mu)} q^{c(T)}. \tag{5.2}$$


Proof. We start by rewriting the generalized cocharge Kostka polynomial $\tilde{K}_{\lambda\mu}(q)$. Recalling that $\overline{\mathcal{P}}_{\lambda\mu}$ is isomorphic to $\mathrm{LRT}(\lambda,\mu)$ it follows from (3.19) and (5.1) that

$$\tilde{K}_{\lambda\mu}(q) = \sum_{T \in \mathrm{LRT}(\lambda,\mu)} q^{\mathrm{co}(\Omega(T))}. \tag{5.3}$$

Since $\Omega$ does not change the shape of a tableau (see for example ref. [13]), but changes its content $\mu = (\mu_1,\dots,\mu_L)$ to $(\mu_L,\dots,\mu_1)$, and since $\tilde{K}_{\lambda\mu}(q) = \tilde{K}_{\lambda\tilde{\mu}}(q)$ for a permutation $\tilde{\mu}$ of $\mu$, we can drop $\Omega$ in (5.3). Recalling Eq. (3.20) and $c(T) = \|\mu\| - \mathrm{co}(T)$ completes the proof of (5.2).

5.2. Proof of Theorem 5.1. The proof of Theorem 5.1 requires several steps. First we use the fact that all paths with Knuth equivalent words have the same weight.

Lemma 5.3. Let $P, P' \in \mathcal{P}_{\cdot\mu}$ such that $\omega(P) \equiv \omega(P')$. Then $H(P) = H(P')$.

Proof. Since $\omega(P) \equiv \omega(P')$ also $\omega(p_{i+1} \otimes p_i) \equiv \omega(p'_{i+1} \otimes p'_i)$ (see Lemma 3 on p. 33 of ref. [13]). In particular, they have the same shape and hence, by (i) of Lemma 3.1, $p_{i+1} \cdot p_i$ and $p'_{i+1} \cdot p'_i$ have the same shape. This implies $h(p_{i+1} \otimes p_i) = h(p'_{i+1} \otimes p'_i)$ for all $i$ and therefore $h(P) = h(P')$. Since $\sigma_i$ only changes steps $p_i$ and $p_{i+1}$ and since, by (ii) of Lemma 3.1, $\omega(P)$ and $\omega(\sigma_i(P))$ have the same shape, it follows that $\omega(\sigma_i(P)) \equiv \omega(\sigma_i(P'))$. Hence, repeating the argument, $h(\sigma_i(P)) = h(\sigma_i(P'))$. This implies $H(P) = H(P')$.

Thanks to the above lemma it suffices to prove (5.1) for just one representative path $P$ for each $T \in \mathrm{LRT}(\cdot,\mu)$ such that $[\omega(P)] = T$. Let us now find a suitable set of such paths. Define $\mathcal{P}_\mu := \mathcal{P}_{\lambda\mu}$ where $\lambda = (1^{|\mu|})$ and $|\mu| = \sum_i |\mu_i|$ so that for $P \in \mathcal{P}_\mu$ each letter $1,\dots,|\mu|$ occurs exactly once. There is a bijection between $\mathcal{P}_\mu$ and $\mathcal{W}_\mu$. The map from $\mathcal{P}_\mu$ to $\mathcal{W}_\mu$ is just given by $\omega$ of Eq. (3.16). The inverse map

$$\omega^{-1} : \mathcal{W}_\mu \to \mathcal{P}_\mu \tag{5.4}$$

is given as follows. Let w = w|µ| w|µ|−1 · · · w1 be a word in Wµ . Then reading w from left to right, place i in the rightmost empty box in row k of step j if wi = x(k) j . Obviously, since w ∈ Wµ the steps of the resulting path P are Young tableaux of rectangular shapes µi . Denote the set of paths P ∈ Pµ with ω(P ) in row-representation by P µ . Since LRT(·, µ) and Wµ / ≡ are isomorphic, the bijection between Pµ and Wµ also implies a bijection between P µ and LRT(·, µ) still denoted by ω. By Lemma 5.3 we are thus left to prove (5.1) for all P ∈ P µ . The bijection between LRT(·, µ) and P µ induces a modified initial cyclage C p : P µ → P µ defined as C p := ω −1 ◦ C ◦ ω. Before setting out for the proof of (5.1) let us study some of the properties of this induced function. First consider the induced map Cp := S ◦ ω −1 ◦ C ◦ ω of the initial cyclage C of Definition 4.1. Here S is a shift operator which decreases the letters in each step of P ∈ Pµ by one and hence makes Cp (P ) a path over {0, 1, . . . , |µ| − 1}. The reason for including S in the definition of Cp is merely for convenience so that Cp acts only on one step of paths in P µ as will be shown in Lemma 5.4. One may always undo the effect of S by acting with S −1 which adds one to each entry of a path. To state the precise action of Cp on a path, let us briefly review Sch¨utzenberger’s (inverse) sliding


mechanism [48]. Suppose there is an empty box with neighbours to the right and above. Then slide the smaller of the two neighbours into the hole; if both neighbours are equal choose the one above. Similarly, for the inverse sliding mechanism consider an empty box with neighbours to the left and below. Slide the bigger of the two neighbours into the hole; if both are equal choose the one below. If there is only one neighbour in either case slide this one into the empty box. The sliding and inverse sliding mechanisms are illustrated in Figs. 1 and 2, respectively.

Fig. 1. Sliding mechanism

Fig. 2. Inverse sliding mechanism
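The two sliding rules can be made concrete on a single rectangular step. The sketch below is illustrative only and rests on assumptions not fixed by the text: a step is stored as a list of rows with the bottom row first (so "above" means the next list), the hole is marked by `None`, and the helper names are hypothetical. The functions `cp_step` and `cp_inverse_step` mimic the one-step actions described in Lemma 5.4 (ii) and after Eq. (5.9); they act on whatever entry sits in the top-right or bottom-left box.

```python
def inverse_slide(rows, hole):
    """Inverse sliding: move the hole towards the bottom-left corner."""
    rows = [list(r) for r in rows]
    r, c = hole
    while True:
        left = rows[r][c - 1] if c > 0 else None
        below = rows[r - 1][c] if r > 0 else None
        if left is None and below is None:
            return rows, (r, c)
        # Slide the bigger neighbour; on a tie choose the one below.
        if left is None or (below is not None and below >= left):
            rows[r][c], rows[r - 1][c] = below, None
            r -= 1
        else:
            rows[r][c], rows[r][c - 1] = left, None
            c -= 1

def slide(rows, hole):
    """Sliding: move the hole towards the top-right corner."""
    rows = [list(r) for r in rows]
    r, c = hole
    while True:
        right = rows[r][c + 1] if c + 1 < len(rows[r]) else None
        above = rows[r + 1][c] if r + 1 < len(rows) else None
        if right is None and above is None:
            return rows, (r, c)
        # Slide the smaller neighbour; on a tie choose the one above.
        if right is None or (above is not None and above <= right):
            rows[r][c], rows[r + 1][c] = above, None
            r += 1
        else:
            rows[r][c], rows[r][c + 1] = right, None
            c += 1

def cp_step(rows):
    """Remove the top-right entry, inverse-slide the hole, insert 0."""
    rows = [list(r) for r in rows]
    top, last = len(rows) - 1, len(rows[-1]) - 1
    rows[top][last] = None
    rows, (r, c) = inverse_slide(rows, (top, last))
    rows[r][c] = 0
    return rows

def cp_inverse_step(rows, new_entry):
    """Remove the bottom-left entry, slide the hole, insert new_entry."""
    rows = [list(r) for r in rows]
    rows[0][0] = None
    rows, (r, c) = slide(rows, (0, 0))
    rows[r][c] = new_entry
    return rows

step = [[1, 2], [3, 4]]                 # bottom row first
moved = cp_step(step)
print(moved)                            # [[0, 2], [1, 3]]
print(cp_inverse_step(moved, 4))        # [[1, 2], [3, 4]]
```

In this small example the second call undoes the first, in line with the two mechanisms being inverse to each other up to the choice of the inserted entry.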

Lemma 5.4. Let P = pL ⊗ · · · ⊗ p1 ∈ P µ be a path over {1, 2, . . . , |µ|} and let the letter |µ| be contained in step pi . Then (i) Cp acts only on step pi of P , i.e., Cp (P ) = pL ⊗ · · · ⊗ Cp (pi ) ⊗ · · · ⊗ p1 , and (ii) Cp (pi ) is obtained by first removing |µ| from the top-right box of pi , then using the inverse sliding mechanism to move the empty box to the bottom-left corner and finally inserting 0 into the empty box. Proof. Since P ∈ P µ and since the largest letter |µ| occurs in step i, the word ω(P ) is in row-representation and of the form ω(P ) = xi(ai ) u. Let T = [ω(P )]. In the chain of transformations (4.1) with w = ω(P ) only the i-subword of w gets changed and all letters in w(1) not in the i-subword are shifted one position to the left. Hence S ◦ ω −1 ◦ C(T ) leaves all but the ith step in P invariant which implies (i). To prove (ii) observe that in row j + 1 the empty box moves to the left up to the point where the left neighbour is smaller than the neighbour below. Under the map ω these two neighbours correspond to (j+1) in w(j+1) of (4.1) used for the definition of C. the two non-inverted letters x(j) i and xi Some properties of the initial cyclage Cp , the map ω, the involution p and the isomorphism σi are summarized in the following lemma. For P ∈ Pλµ we set hi (P ) = h(pi+1 ⊗ pi ). Lemma 5.5. For λ ∈ Zn≥0 and µ ∈ RL we have on Pλµ ,

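$$h_{L-i} = h_i \circ \Omega_p, \tag{5.5}$$
$$\Omega_p \circ \sigma_i = \sigma_{L-i} \circ \Omega_p, \tag{5.6}$$

and on $\overline{\mathcal{P}}_\mu$,

$$\Omega_p = \omega^{-1} \circ \Omega \circ \omega, \tag{5.7}$$
$$[\sigma_i, C_p] = 0, \tag{5.8}$$
$$C_p = \Omega_p \circ C_p^{-1} \circ \Omega_p, \tag{5.9}$$

where $C_p^{-1}$ is defined as follows. It acts on the step with the smallest entry in $P \in \overline{\mathcal{P}}_\mu$ by removing the 1, moving the empty box by the sliding mechanism to the top right corner and inserting $|\mu| + 1$.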


Proof. Let P ∈ Pλµ . The energy hi (P ) is determined by the shape of pi+1 · pi . Hence hi (p (P )) is determined by the shape of (pL−i )·(pL+1−i ) = (pL+1−i ·pL−i ). But leaves the shape of a Young tableau invariant (see for example ref. [13]), yielding (5.5). Since the isomorphism σi acts only locally on pi+1 ⊗ pi and p reverses the order of the steps, it suffices to prove (5.6) for a path of length two. Define p˜1 ⊗ p˜2 = σ(p2 ⊗p1 ) so that p˜1 ·p˜2 = p2 ·p1 .Acting on the last equation with yields (p˜2 )·(p˜1 ) = (p1 )·(p2 ). Since does not change the shape of a Young tableau and because of the uniqueness of the decomposition into the product of two rectangular Young tableaux we conclude that σ((p1 ) ⊗ (p2 )) = (p˜2 ) ⊗ (p˜1 ) which proves (5.6). Eq. (5.7) follows in a straightforward manner from the definitions of ω and ω −1 . Let P ∈ P µ and let the letter |µ| be contained in step pj of P . By (i) of Lemma 5.4, Cp acts only on step pj , and σi acts only on pi+1 ⊗ pi . Hence the proof of (5.8) reduces to showing that [σ, Zp ] = 0 on P(µ1 ,µ2 ) . Here Zp = S ◦ ω −1 ◦ Z ◦ ω and Z : Wµ → Wµ is defined as Z(w) = w(1) , where w(1) as given in (4.1) (note that w need not be in row-representation). Let P = p2 ⊗ p1 ∈ P(µ1 ,µ2 ) and set w = w1 . . . w|µ| := ω(P ) and w˜ = w˜ 1 . . . w˜ |µ| := ω(σ(P )). The map ω : Pµ → Wµ is a bijection. Since for a given shape λ the set LRT(λ, (µ1 , µ2 )) can have at most one element, a word w ∈ W(µ1 ,µ2 ) is uniquely specified by shape(w1 . . . wk ) for all 1 ≤ k ≤ |µ|. Hence 0 0 := Z(w) and w˜ 0 = w˜ 10 . . . w˜ |µ| := (5.8) amounts to showing that, for w0 = w10 . . . w|µ| 0 0 0 0 Z(w), ˜ shape(w1 . . . wk ) = shape(w˜ 1 . . . w˜ k ) for all 1 ≤ k ≤ |µ|. By construction, shape(w10 . . . wk0 ) = shape(w2 . . . wk+1 ) and shape(w˜ 10 . . . w˜ k0 ) = shape(w˜ 2 . . . w˜ k+1 ) for all 1 ≤ k < |µ| and by Lemma 3.1 (ii) shape(w2 . . . wk ) = shape(w˜ 2 . . . w˜ k ). Hence we are left to show that shape(w0 ) = shape(w˜ 0 ). 
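This can be done explicitly. In particular, one may use that the shape of the product of two rectangular Young tableaux has the following form:

$$\mathrm{shape}(p_2 \cdot p_1) = \text{[diagram: two overlapping rectangles with the regions } A \text{ and } B \text{ marked]} \tag{5.10}$$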

where A and B are partitions and the two overlapping rectangles are the shapes of p1 and p2 ; one may be contained in the other. Note that A is the complement of B, so that knowing A (B) fixes the shape. By Lemma 3.1 (i) also ω(p2 ⊗ p1 ) has the shape (5.10). Let b = 0 )/shape(w2 . . . w|µ| ). shape(w1 . . . w|µ| )/shape(w2 . . . w|µ| ) and b0 = shape(w10 . . . w|µ| 0 0 One may show that (i) if b ∈ A then b ∈ B , (ii) if b ∈ B then b0 ∈ A0 and (iii) if b 6∈ A ∪ B then b0 6∈ A0 ∪ B 0 . Since b is the same for both w and w˜ this implies that shape(w0 ) = shape(w˜ 0 ). For the proof of (5.9) one can consider a path consisting of just a single step thanks to (i) of Lemma 5.4. Suppose p has M boxes filled with the numbers 1, . . . , M . From (ii) of Lemma 5.4 we know that Cp acts by the inverse sliding mechanism and by definition Cp−1 acts by the sliding mechanism. acts on rectangular Young tableaux by rotation of 180◦ and dualizing all letters. But since the inverse sliding mechanism is the same as the sliding mechanism after rotation of 180◦ and dualizing, as is easily seen from Figs. 1 and 2, Eq. (5.9) follows. After these preliminaries we come to the heart of the proof of Theorem 5.1. By Lemma 5.3 we are left to prove Eq. (5.1) for all P ∈ P µ and by (5.7) this is equivalent to H 0 (P ) := H(p (P )) = co(ω(P )).

(5.11)


We will show that $\overline{C}_p = \omega^{-1} \circ \overline{C} \circ \omega$ decreases the weight $H'$ of paths in $\overline{\mathcal{P}}_\mu$ by one, i.e.,

$$H'(P) - H'(\overline{C}_p(P)) = 1 \quad \text{for } P \in \overline{\mathcal{P}}_\mu \text{ and } P \neq P_{\min}, \tag{5.12}$$

where $P_{\min} := \omega^{-1}(w_{\min})$ with $w_{\min} = w_{\min}(\mu) = x_1^{\mu_1} \cdots x_L^{\mu_L}$ the word corresponding to the minimal LR tableau $T_{\min}$. By definition $\mathrm{co}(T_{\min}) = 0$ and one finds by direct computation that also $H'(P_{\min}) = 0$. (This can be deduced from the fact that in $P_{\min}$ the number $i$ cannot be contained in a step to the left of the step containing $i - 1$; this is also true for any $P \in \mathcal{O}_{P_{\min}}$ as $P = \omega^{-1}(w_{\min}(\tilde{\mu}))$ for some permutation $\tilde{\mu}$ of $\mu$.) The equation $H'(P_{\min}) = \mathrm{co}(T_{\min}) = 0$ together with (5.12) implies that $H'(P)$ and thus $H(P)$ are integers. By definition $H'(P)$ is finite and non-negative. Suppose there exists a $P \in \overline{\mathcal{P}}_\mu$ such that $m - 1 < H'(P) < m$ for some integer $m$. Then we conclude from (5.12) that $H'(\overline{C}_p^{\,m}(P)) < 0$, which contradicts the non-negativity of $H'$. Since $\mathrm{co}(T) - \mathrm{co}(\overline{C}(T)) = 1$, Eq. (5.12) implies (5.11) for all $P \in \overline{\mathcal{P}}_\mu$. Using (3.2), (3.7), (5.5), (5.6) and $\Omega_p^2 = \mathrm{Id}$ one finds that

$$H'(P) = H(\Omega_p(P)) = \frac{1}{|\mathcal{O}_P|} \sum_{P' \in \mathcal{O}_P} \sum_{i=1}^{L-1} (L - i)\, h_i(P'). \tag{5.13}$$

Hence to show (5.12) one needs to relate the energies $h_i(P)$ and $h_i(\overline{C}_p(P))$. Let us first focus on the relation between the energies of $P$ and $C_p(P)$. Following ref. [33] we decompose the orbit $\mathcal{O}_P$ of $P$ into chains. Let $U, V \in \mathcal{O}_P$ with largest entries in step $i$ and $i - 1$, respectively. Then write $U \to V$ if $\sigma_{i-1}(U) = V$ ($i = 2, 3, \dots, L$). Connected components of the resulting graph are called chains. With this notation we have the following lemma which is proven in Appendix A.

Lemma 5.6. For $P \in \overline{\mathcal{P}}_\mu$ with $\mu \in \mathcal{R}^L$ define the vector $h(P) = (h_1(P), h_2(P), \dots, h_{L-1}(P))$. For a chain $\gamma = \{P_m \to P_{m-1} \to \cdots \to P_\ell\}$ such that $\sigma_{k-1}(P_k) = P_{k-1}$ and $Q_j = C_p(P_j)$ the following relations hold:

$$\begin{aligned} h(Q_m) - h(P_m) &= e_m, \\ h(Q_k) - h(P_k) &= 0 \quad \text{for } \ell < k < m, \\ h(Q_\ell) - h(P_\ell) &= -e_{\ell-1}, \end{aligned} \tag{5.14}$$

and if $m = \ell$,

$$h(Q_m) - h(P_m) = e_m - e_{m-1}. \tag{5.15}$$

Here $e_m$ ($1 \leq m \leq L - 1$) are the canonical basis vectors of $\mathbb{Z}^{L-1}$ and $e_0 = e_L = 0$.

Thanks to Eq. (5.8) $\{Q_m, Q_{m-1}, \dots, Q_\ell\}$ is a subset of $\mathcal{O}_{C_p(P)}$. Defining $H'_\gamma(P) := \frac{1}{|\gamma|} \sum_{P' \in \gamma} \sum_{i=1}^{L-1} (L - i)\, h_i(P')$ for a subset $\gamma \subset \mathcal{O}_P$, Lemma 5.6 ensures that

$$H'_\gamma(P) - H'_{C_p(\gamma)}(C_p(P)) = 1 \tag{5.16}$$

for $\gamma = \{P_m \to P_{m-1} \to \cdots \to P_\ell\}$ as long as $\ell > 1$. For the case treated in ref. [33], where $[\omega(P)] \in \mathrm{Tab}(\cdot,\mu)$ is an ordinary Young tableau, $\ell$ is always bigger than one when $P \neq P_{\min}$ and hence the proof of Theorem 5.1 is complete in this case. For $[\omega(P)] \in \mathrm{LRT}(\cdot,\mu)$, however, $\ell$ can take the value one even if $P \neq P_{\min}$, due to point 4.1 of Remark 4.1. Hence (5.16) breaks down for $\ell = 1$, i.e., when there is a $P' \in \gamma$ such that the letter $|\mu|$ is contained in the first step. However, in this case we are saved by the following lemma. Therein, the height of a path $P = p_L \otimes \cdots \otimes p_1$ is defined as $\mathrm{height}(P) := \max_{1 \leq i \leq L} \{\mathrm{height}(p_i)\}$.
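The step from Lemma 5.6 to Eq. (5.16) can be spelled out by summing the relations (5.14) over a chain. The computation below is a sketch of that step; it uses only (5.14), (5.15) and the definition of $H'_\gamma$, and assumes that the chain $\gamma$ has $|\gamma| = m - \ell + 1$ elements with $\ell > 1$.

```latex
\begin{align*}
|\gamma| \,\bigl( H'_{C_p(\gamma)}(C_p(P)) - H'_\gamma(P) \bigr)
  &= \sum_{k=\ell}^{m} \sum_{i=1}^{L-1} (L-i)\,\bigl( h_i(Q_k) - h_i(P_k) \bigr) \\
  &= (L - m) - \bigl( L - (\ell - 1) \bigr) \\
  &= -(m - \ell + 1) \;=\; -|\gamma| .
\end{align*}
```

Dividing by $|\gamma|$ gives (5.16); the case $m = \ell$ follows in the same way from (5.15). For $\ell = 1$ the contribution $-e_{\ell-1} = -e_0$ vanishes, so the right-hand side becomes $L - m \geq 0$ and the argument no longer produces a decrease by one.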


Lemma 5.7. Let P ∈ P µ over {1, 2, . . . , |µ|}. Then there exists a path P 0 = p0L ⊗ · · · ⊗ p01 ∈ OP such that p01 contains the letter |µ| if and only if height(ω(P )) = height(P ). Proof. Let us first show that the existence of P 0 implies the condition on the height of ω(P ). Since p01 contains |µ| the word ω(P 0 ) starts with x1(a1 ) . By (ii) of Lemma 3.1 ω(P 0 ) is in row-representation. Hence the height of ω(P 0 ) equals the height of p01 and the first step is also (one of) the highest. Again by (ii) of Lemma 3.1 ω(P ) and ω(P 0 ) have the same shape so that the height of ω(P ) equals the height of P . To prove the reverse, consider P 0 ∈ OP such that the first step is highest. Employing again (ii) of Lemma 3.1 we see that the height of ω(P 0 ) equals the height of the first step. Now suppose that p01 does not contain |µ|. This means that ω(P 0 ) = xi(ai ) u for 1) in some u ∈ W with i > 1. Since P 0 is in row representation xi(ai ) must be above x(a 1 0 0 0 [ω(P )]. This contradicts the fact that the height of ω(P ) is the height of p1 . ··· P1 } The previous lemma shows that there exist chains γ such that γ = {Pm (so that (5.16) is violated) if and only if the modified initial charge C p differs from S −1 ◦Cp . This is the case because the dropping and insertion operators Dp := ω −1 ◦D ◦ω and Up := ω −1 ◦ U ◦ ω in the relation C p = Up ◦ S −1 ◦ Cp ◦ Dp only act non-trivially when the height of ω(P ) equals the height of P , or equivalently by Lemma 5.7, when ··· P1 }. The dropping operator, however, does not there exists a chain γ = {Pm change the weight of a path as shown in the following lemma: Lemma 5.8. For µ ∈ RL let P ∈ P µ such that height(P ) = height(ω(P )). Then H 0 (P ) = H 0 (Dp (P )). Lemmas 5.6–5.8 imply (5.12) and hence Theorem 5.1 for the following reason. For ··· P1 } thanks P ∈ P µ , the path Dp (P ) does not contain any chains γ = {Pm to Lemma 5.7. Hence H 0 (P ) = H 0 (Dp (P )) = H 0 (Cp ◦ Dp (P )) + 1. 
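On the other hand $H'(C_p \circ D_p(P)) = H'(\overline{C}_p(P))$ since $S^{-1}$ does not change the energy of a path and because of Remark 4.2 and Lemma 5.8. This proves (5.12).

Proof of Lemma 5.8. For a path in $\mathcal{P}_{\lambda\mu}$ we refer to $\mu$ as its content. Now suppose the path $P$ of Lemma 5.8 has $k$ steps of shape $\nu \in \mathcal{R}$ where $k \geq 1$ and $\mathrm{height}(\nu) = \mathrm{height}(P)$. Then all $P' \in \mathcal{O}_P$ have $k$ steps of shape $\nu$ and, by (ii) of Lemma 3.1, $\mathrm{height}(\omega(P')) = \mathrm{height}(\nu)$. Define $D_\nu(P')$ as the path obtained from $P'$ by dropping all steps of shape $\nu$. Let $\eta$ be the content of $D_\nu(P)$. Then for each permutation $\tilde{\eta}$ of $\eta$ define the suborbit $\mathcal{S}_{\tilde{\eta}} \subset \mathcal{O}_P$ as

$$\mathcal{S}_{\tilde{\eta}} = \{\tilde{P} \in \mathcal{O}_P \mid \mathrm{content}(D_\nu(\tilde{P})) = \tilde{\eta}\}. \tag{5.17}$$

Then clearly $\mathcal{O}_P$ is the disjoint union of $\mathcal{S}_{\tilde{\eta}}$ over all permutations $\tilde{\eta}$ of $\eta$, and $|\mathcal{S}_{\tilde{\eta}}| = \binom{L}{k}$. Let us now show that for any $Q \in \mathcal{S}_{\tilde{\eta}}$,

$$\sum_{\tilde{P} \in \mathcal{S}_{\tilde{\eta}}} \sum_{i=1}^{L-1} i\, h_{L-i}(\tilde{P}) = \binom{L}{k} \sum_{i=1}^{L-k-1} i\, h_{L-k-i}(D_\nu(Q)). \tag{5.18}$$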

Since Dp is a composition of Dν ’s, Eq. (5.18) clearly implies H 0 (P ) = H 0 (Dp (P )). To prove (5.18) we first study some properties of the energy hi (P˜ ) for P˜ ∈ Sη˜ . For P˜ = p˜L ⊗ · · · ⊗ p˜1 ∈ Sη˜ , let µ˜ = (µ˜ 1 , . . . , µ˜ L ) be the content of P˜ and define


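$L \geq m_1 > \cdots > m_k \geq 1$ such that $\tilde{\mu}_{m_i} = \nu$. Then $\tilde{P}' = \sigma_{m_i+1} \circ \sigma_{m_i}(\tilde{P})$ is also in $\mathcal{S}_{\tilde{\eta}}$ and, for $1 \leq i \leq k$,

$$h_{m_i}(\tilde{P}) = h_{m_i-1}(\tilde{P}) = 0, \tag{5.19}$$
$$h_{m_i}(\tilde{P}') = h_{m_i+1}(\tilde{P}). \tag{5.20}$$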

The proof of (5.19) and (5.20) makes extensive use of Lemma 3.1. For two steps p ∈ Bλ and p0 ∈ Bλ0 let us call the shape λ + λ0 minimal because h(p ⊗ p0 ) = 0 if shape(p · p0 ) = λ + λ0 . Eq. (5.19) states that p˜mi +1 · p˜mi and p˜mi · p˜mi −1 have minimal shape, or equivalently by (i) of Lemma 3.1, that ω(p˜mi +1 ⊗ p˜mi ) and ω(p˜mi ⊗ p˜mi −1 ) have shapes µ˜ mi +1 + ν and ν + µ˜ mi −1 , respectively. But since the height of ω(P˜ ) is the height of ν, the heights of ω(p˜mi +1 ⊗ p˜mi ) and ω(p˜mi ⊗ p˜mi −1 ) equal the height of ν, and hence their shape has to be minimal. We now turn to the proof of (5.20). Denote P˜ 0 = p˜0L ⊗ · · · ⊗ p˜01 . Since P˜ 0 = σmi +1 ◦ σmi (P˜ ) we know by (ii) of Lemma 3.1 that ω(p˜mi +2 ⊗ p˜mi +1 ⊗ p˜mi ) and ω(p˜0mi +2 ⊗ p˜0mi +1 ⊗ p˜0mi ) have the same shape. But since by ω(p˜mi +1 ⊗ p˜mi ) and ω(p˜0mi +2 ⊗ p˜0mi +1 ) have minimal shape by (5.19) we can conclude that ω(p˜mi +2 ⊗ p˜mi +1 ) and ω(p˜0mi +1 ⊗ p˜0mi ) have the same shape. Hence by (i) of Lemma 3.1 also p˜mi +2 · p˜mi +1 and p˜0mi +1 · p˜0mi have the same shape which implies (5.20). Analogous to the proof of (5.20) we find that for P˜ 0 = σmi (P˜ ) the tableaux p˜mi +1 · p˜mi −1 and p˜0mi · p˜0mi −1 have the same shape. Setting P˜ to Q in this argument shows that hL−k−i (Dν (Q)) is independent of Q ∈ Sη˜ . Hence we can restrict our attention to Q ∈ Sη˜ with steps 1 to k of shape ν in the following. If k = L or L−1 the right-hand side of (5.18) is zero due to the empty sum. Eq. (5.19) ensures that the left-hand side is zero as well. If 1 ≤ k ≤ L−2 set Xi := hL−k−i (Dν (Q)) for 1 ≤ i < L − k. Define rj as rj = L + 1 − mj − j for 1 ≤ j ≤ k and r0 = 0, rk+1 = L − k for a given P˜ where, as above, the mi are the positions of the steps of shape ν. Treating Xi as an indeterminate we see from (5.19) and (5.20) that the contribution PL−1 to Xi from i=1 ihL−i (P˜ ) is given by (i + j) for rj < i < rj+1 and 0 ≤ j ≤ k, 0 for i = rj and 1 ≤ j ≤ k. 
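Summing over all $\tilde{P} \in \mathcal{S}_{\tilde{\eta}}$ or, equivalently, over all possible $r_i$ we find that

$$\begin{aligned} \sum_{\tilde{P} \in \mathcal{S}_{\tilde{\eta}}} \sum_{i=1}^{L-1} i\, h_{L-i}(\tilde{P}) &= \sum_{i=1}^{L-k-1} X_i \sum_{j=0}^{k} (i + j) \sum_{r_0 \leq \cdots \leq r_j < i < r_{j+1} \leq \cdots \leq r_{k+1}} 1 \\ &= \sum_{i=1}^{L-k-1} X_i \sum_{j=0}^{k} (i + j) \binom{i+j-1}{j} \binom{L-i-j-1}{k-j} \\ &= \binom{L}{k} \sum_{i=1}^{L-k-1} i\, X_i, \end{aligned}$$

where the last step follows from (a special case of) the ${}_2F_1$ Gauß sum. Recalling that $X_i = h_{L-k-i}(D_\nu(Q))$ this proves Eq. (5.18) and hence Lemma 5.8.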
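The last equality in the display rests on the binomial identity $\sum_{j=0}^{k} (i+j)\binom{i+j-1}{j}\binom{L-i-j-1}{k-j} = i\binom{L}{k}$ for $1 \leq i \leq L-k-1$. As a sanity check (not a proof), the following sketch verifies it numerically for small parameters; the function name `inner_sum` is an illustrative choice and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def inner_sum(L, k, i):
    """Sum over j of (i + j) * C(i+j-1, j) * C(L-i-j-1, k-j)."""
    return sum((i + j) * comb(i + j - 1, j) * comb(L - i - j - 1, k - j)
               for j in range(k + 1))

# Check the identity for all 1 <= k <= L-2 and 1 <= i <= L-k-1 with L < 12.
ok = all(inner_sum(L, k, i) == i * comb(L, k)
         for L in range(2, 12)
         for k in range(1, L - 1)
         for i in range(1, L - k))
print(ok)  # True
```

The identity itself follows from $(i+j)\binom{i+j-1}{j} = i\binom{i+j}{j}$ together with a Chu-Vandermonde type summation, which is the special case of the Gauß sum referred to above.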


6. The Poset Structure on LRT(·, µ) As shown in Theorem 5.1, the weight and the cocharge are related as H(P ) = co( ◦ ω(P )). Since H(P ) is by its Definition 3.4 finite and for each LR tableau T ∈ LRT(·, µ) there exists a path P ∈ P·µ such that T = [ω(P )], Theorem 5.1 immediately implies that co(T ) is finite for all T ∈ LRT(·, µ). This means that each T ∈ LRT(·, µ) reaches the minimal LR tableau Tmin after a finite number of applications of C. This, in turn, ensures the following corollary. Corollary 6.1. The modified initial cyclage C induces a ranked poset structure on LRT(·, µ). The statement of this corollary can be extended to more general cyclages which generalize the λ-cyclages of Lascoux and Sch¨utzenberger [36, 32] for Young tableaux T ∈ Tab(·, µ). We define λ-cyclages for LR tableaux T ∈ LRT(·, µ) in Sect.6.1 and prove the analogue of Corollary 6.1 in Theorem 6.3. In Sect. 6.2 we deduce several important properties of the charge and cocharge which are needed in Sect. 7.2 to prove recurrences for the An−1 supernomials and generalized Kostka polynomials. 6.1. The λ-cyclage and λ-cocyclage. The λ-cyclage is a generalization of the initial i) cyclage C. Let us first define the cyclage operator Z on words w = x(a i u ∈ Wµ as (1) (1) Z(w) = w , where w is defined as in (4.1) by dropping the restriction that w is in row-representation. The only additional requirement is that all w0 in the orbit of w (i.e., all w0 such that ω −1 (w0 ) ∈ Oω−1 (w) with ω −1 as defined in (5.4)) start with a letter different from x1(a1 ) . If w violates this condition we set Z(w) = 0. Similarly one can define a cocyclage Z −1 on a word w = ux(1) i ∈ Wµ . This time (1) −1 0 Z (w) = 0 if there exists a w in the orbit of w ending with x1 . If not, set w(1) = w and construct the chain of transformations w(1) → w(2) → · · · → w(ai ) ,

(6.1)

where $w^{(j+1)}$ is obtained from $w^{(j)}$ by exchanging the last letter $x_i^{(j)}$ with its inverted partner $x_i^{(j+1)}$ in the subword of $w^{(j)}$ consisting of the letters $x_i^{(j)}$ and $x_i^{(j+1)}$ only. Then the cocyclage $Z^{-1}(w)$ is obtained by cycling the last letter $x_i^{(a_i)}$ in $w^{(a_i)}$ to the front of the word. Obviously, $Z^{-1}$ is the inverse of $Z$. The cyclage $Z$ and cocyclage $Z^{-1}$ defined on the set of words have analogues on the set of LR tableaux. A $\lambda$-cyclage $Z_\lambda$ on an LR tableau $T \in \mathrm{LRT}(\cdot,\mu)$ is defined as $Z_\lambda(T) = Z(w)$, where $w = x_i^{(a_i)} u \in \mathcal{W}_\mu$ is a word such that $T = [w]$ and $\mathrm{shape}(u) = \lambda$. Similarly, a $\lambda$-cocyclage $Z_\lambda^{-1}$ is defined as $Z_\lambda^{-1}(T) = Z^{-1}(w)$ if $w = u x_i^{(1)}$ is a word such that $T = [w]$ and $\mathrm{shape}(u) = \lambda$. In both cases if no such $w$ exists we set $Z_\lambda(T) = 0$ and $Z_\lambda^{-1}(T) = 0$, respectively. The $\lambda$-cyclage and $\lambda$-cocyclage obey

$$\begin{aligned} Z_\lambda \circ Z_\lambda^{-1}(T) &= T \quad \text{if } Z_\lambda^{-1}(T) \neq 0, \\ Z_\lambda^{-1} \circ Z_\lambda(T) &= T \quad \text{if } Z_\lambda(T) \neq 0. \end{aligned} \tag{6.2}$$

In analogy with (4.4) we also define the modified $\lambda$-cyclage as

$$\overline{Z}_\lambda = U \circ Z_\lambda \circ D, \tag{6.3}$$

where D(T ) was defined in Sect. 4 by successively dropping all xi(j) (1 ≤ j ≤ ai ) from T if height(T ) = height(µi ) (recall that height(µi ) = ai ) and U was defined by


reinserting all $x_i^{(j)}$ dropped by $D$ in row $j$ such that the Young tableau conditions are satisfied. Similarly we set

$$\overline{Z}'_\lambda = U' \circ Z_\lambda^{-1} \circ D', \tag{6.4}$$

where D0 (T ) is the tableau obtained by dropping all x(j) i (1 ≤ j ≤ ai ) if width(T ) = width(µi ) and U 0 reinserts all boxes dropped by D0 such that there is one x(j) i in each column and the resulting object is again in LRT(·, µ). The initial cyclage C is a special λ-cyclage since for each T ∈ LRT(·, µ) there always exists a partition λ such that C(T ) = Zλ (T ). Hence also C(T ) = Z λ (T ) for this λ. For ordinary Young tableaux T ∈ Tab(·, µ), the cyclages Zλ and Zλ−1 have been considered in refs. [36, 32]. It was shown in ref. [36] that the λ-cyclages induce a ranked poset structure on Tab(·, µ) with the cocharge of a Young tableau being its rank and the minimal element being Tmin = [xµ1 1 · · · xµLL ]. Thanks to Z λ (T ) = Zλ (T ) for T ∈ Tab(·, µ) also Z λ induces a ranked poset structure on Tab(·, µ). Note, however, that 0 0 Z λ 6 = Zλ−1 even on Tab(·, µ) since for example Zλ−1 (T ) = 0, but Z λ (T ) = [3211] for T = [2311] and λ = (1). 0 From Z λ and Z λ we may now define a cyclage- and cocyclage-graph. Definition 6.1 (Cyclage- and cocyclage-graph). For µ ∈ RL , the cyclage-graph T (µ) is defined by connecting all T, U ∈ LRT(·, µ) as T → U if there exists a partition λ such that U = Z λ (T ). Similarly, the cocyclage-graph T 0 (µ) is obtained by connecting 0 T → U if there exists a λ such that U = Z λ (T ). An example of a cyclage-graph is given in Appendix B. The cyclage- and cocyclagegraphs are related by an involution 3 : LRT(λ, µ) → LRT(λ> , µ? ) defined as follows. Let T = [w1 · · · w` ] ∈ LRT(λ, µ) with wi ∈ X a . Then 3(T ) = (j) 0 th (j) = x(k) [w`0 · · · w10 ] where wm i if wm = xi is the k xi in T from the left. One may easily check that 3 respects the Knuth equivalence relations (2.1) and is therefore indeed a function on LR tableaux. The ith row of T gets mapped to the ith column in 3(T ) and hence the shape of 3(T ) is indeed λ> . For example 2

(2)

3

(1)

2 3

7→ 2 (1) 2 (2) 3 (1) .

T = 1 (2) 2 (2) 1

(1)

2

(1)

(2)

2

(1)

1

(1)

1

(1)

2

(1)

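Lemma 6.2. For $\mu \in \mathcal{R}^L$, $\mathcal{T}(\mu) = \Lambda \mathcal{T}'(\mu^\star)$ or, equivalently, $\mathcal{T}'(\mu) = \Lambda \mathcal{T}(\mu^\star)$.

Proof. Observe that $D' = \Lambda \circ D \circ \Lambda$, $U' = \Lambda \circ U \circ \Lambda$ and $Z_\lambda = \Lambda \circ Z_{\lambda^\top}^{-1} \circ \Lambda$. This implies

$$\overline{Z}'_\lambda = \Lambda \circ \overline{Z}_{\lambda^\top} \circ \Lambda. \tag{6.5}$$

Hence, for $T, T' \in \mathrm{LRT}(\cdot,\mu)$ such that $T' = \overline{Z}'_\lambda(T)$ one finds $\Lambda(T') = \overline{Z}_{\lambda^\top} \circ \Lambda(T)$ which proves the lemma.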


We now wish to show that both $\mathcal{T}(\mu)$ and $\mathcal{T}'(\mu)$ induce a ranked poset structure on the set of LR tableaux $\mathrm{LRT}(\cdot,\mu)$. To prove this we extend the standardization embedding

$$\theta : \mathcal{T}(\mu) \hookrightarrow \mathcal{T}((1^{|\mu|})), \tag{6.6}$$

of Lascoux and Schützenberger [36, 32] (see also Chap. 2.6 of ref. [9]) when $\mu$ is a partition to the case when $\mu \in \mathcal{R}^L$. Define the map $\phi$ on LR tableaux as follows: change the rightmost $x_1^{(j)}$ to $x_2^{(j)}$ for all $1 \leq j \leq a_1$.

If $\mathrm{height}(\mu_1) = \mathrm{height}(\mu_2)$ and $\mathrm{width}(\mu_1) > \mathrm{width}(\mu_2)$ or $\mu_2 = 0$ then $\phi(T)$ is an LR tableau of the same shape as $T$ and of content $\mu' = (\mu_1 - (1^{a_1}), \mu_2 + (1^{a_1}), \mu_3, \dots, \mu_L)$. Denote by $\phi_0$ the map $\phi$ restricted to the case when $\mu_2 = 0$. One can show that $\overline{Z}_\lambda \circ \phi_0(T) = 0$ if and only if $\overline{Z}_\lambda(T) = 0$, and furthermore $[\phi_0, \overline{Z}_\lambda] = 0$. (These statements can, for example, be proven by going over to paths using the map $\omega$ and noting that $\omega^{-1} \circ \phi_0 \circ \omega$ only acts on steps one and two, the latter being empty, and $S \circ \omega^{-1} \circ \overline{Z}_\lambda \circ \omega$ only acts on the step containing the biggest entry in analogy to Lemma 5.4. For the first statement it is sufficient to consider a path of length three for which it can be explicitly verified. Assuming $\overline{Z}_\lambda(T) \neq 0$ the second statement then follows trivially since the two operators act on different steps in the path.) Denote by $G$ the group spanned by $\omega \circ \sigma_i \circ \omega^{-1}$, where $\sigma_i$ is the isomorphism of Definition 3.2 and $\omega$ and $\omega^{-1}$ are defined in (3.16) and (5.4), respectively. Then, in analogy to (6.6), there exists an embedding

$$\theta : \mathcal{T}(\mu) \hookrightarrow \mathcal{T}(\nu), \qquad \nu^\star \text{ a partition} \tag{6.7}$$

for $\mu \in \mathcal{R}^L$ by combining $\phi_0$ with the action of $G$. Since both $\tau \in G$ and $\phi_0$ are compatible with the cyclages (the proof of the first statement is analogous to the proof of (5.8)), we find that

$$[\theta, \overline{Z}_\lambda] = 0. \tag{6.8}$$

Example 6.1. If $T$ is the LR tableau of Eq. (3.17) then under $\theta$ its content $\mu$ will be changed to a new content $\nu$ [Young diagrams of $\mu$ and $\nu$ not recoverable]. The standardization $\theta(T)$ can be determined from a chain of maps built from $\tau_1, \tau_2, \tau_3 \in G$ and $\phi$ [chain of LR tableaux from $T$ to $\theta(T)$ not recoverable],

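where $\tau_1 = \omega \circ \sigma_2 \circ \sigma_3 \circ \sigma_1 \circ \omega^{-1}$, $\tau_2 = \omega \circ \sigma_2 \circ \sigma_3 \circ \sigma_4 \circ \sigma_1 \circ \sigma_2 \circ \sigma_3 \circ \omega^{-1}$ and $\tau_3 = \omega \circ \sigma_2 \circ \sigma_3 \circ \sigma_1 \circ \sigma_2 \circ \omega^{-1}$.

Theorem 6.3. Let $\mu \in \mathcal{R}^L$. Then the cyclage-graph $\mathcal{T}(\mu)$ imposes a ranked poset structure on $\mathrm{LRT}(\cdot,\mu)$ with minimal element $T_{\min} = [x_1^{\mu_1} \cdots x_L^{\mu_L}]$. Similarly, $\mathcal{T}'(\mu)$ imposes a ranked poset structure on $\mathrm{LRT}(\cdot,\mu)$ with minimal element $T_{\max} = [x_L^{\mu_L} \cdots x_1^{\mu_1}]$.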


Proof. Let us first consider $\mu$ to be a partition and show that in this case $\mathcal{T}'(\mu)$ is a ranked poset. For every $T \in \mathrm{Tab}(\cdot,\mu)$ with $T \neq T_{\max}$ there exists at least one partition $\lambda$ such that $\overline{Z}'_\lambda(T) \neq 0$ and one can show that

$$c(\overline{Z}'_\lambda(T)) = c(T) - 1. \tag{6.9}$$

Namely, if $D'(T) = T$, i.e., $\overline{Z}'_\lambda(T) = Z_\lambda^{-1}(T)$, then $c(Z_\lambda^{-1}(T)) = \|\mu\| - \mathrm{co}(Z_\lambda^{-1}(T)) = \|\mu\| - \mathrm{co}(T) - 1 = c(T) - 1$ by Eq. (6.2) and the fact that the cocharge is the rank of $\mathcal{T}(\mu)$ for a partition $\mu$ as shown by Lascoux and Schützenberger [36]. From the explicit prescription for calculating the charge of a Young tableau $T \in \mathrm{Tab}(\cdot,\mu)$ via indices (see for example [37] p. 242 or [9] p. 111) one may easily check that $c(T) = c(D'(T))$ which proves (6.9). This shows that for a partition $\mu$, $\mathcal{T}'(\mu)$ is a poset ranked by the charge with minimal element $T_{\max}$. From Lemma 6.2 and Eqs. (6.7) and (6.8) we deduce that also $\mathcal{T}(\mu)$ with $\mu \in \mathcal{R}^L$ is a ranked poset. Since for $T_{\max} = x_L^{\mu_L^\top} \cdots x_1^{\mu_1^\top}$ with $\mu \in \mathcal{R}^L$,

$$\Lambda(T_{\max}) = T_{\min} = x_1^{\mu_1} \cdots x_L^{\mu_L}, \tag{6.10}$$

the minimal element of $\mathcal{T}(\mu)$ is $T_{\min}$. According to Lemma 6.2 also $\mathcal{T}'(\mu)$ is a ranked poset for all $\mu \in \mathcal{R}^L$ with minimal element equal to $T_{\max}$.

The standardization embedding (6.7) can be refined by combining $\phi$ with the action of $G$ to obtain

$$\psi_{\nu\mu} : \mathcal{T}(\nu) \hookrightarrow \mathcal{T}(\mu), \qquad \nu \geq \mu \tag{6.11}$$

for $\mu, \nu \in \mathcal{R}^L$ with the ordering $\nu \geq \mu$ as defined in Sect. 2. Similar to (6.8)

$$[\psi_{\nu\mu}, \overline{Z}_\lambda] = 0. \tag{6.12}$$

Certainly, $[\psi_{\nu\mu}, C] = 0$ thanks to (5.8) and $[\phi, C] = 0$, which can be verified explicitly. To establish (6.12) for general $\overline{Z}_\lambda$ we are left to show $[\phi, \overline{Z}_\lambda] = 0$. Let us briefly sketch the proof here. Firstly, $\psi_{\nu\mu}$ only depends on $\nu$ and $\mu$, but not on its explicit composition in terms of $\phi$ and $\sigma_i$'s. This can be shown by induction on the cocharge using $[\psi_{\nu\mu}, C] = 0$. Secondly, $\overline{Z}_\lambda(T) = 0$ if and only if $\overline{Z}_\lambda \circ \phi(T) = 0$. This can be seen as follows. For every LR tableau there exists a standardization composed only of $\phi_0$ and $\sigma_i$'s. Denote by $\theta_1$ and $\theta_2$ such standardizations for $T$ and $\phi(T)$, respectively. Since the standardization is independent of the composition of $\phi$ and $\sigma_i$'s we conclude $\theta_1(T) = \theta_2 \circ \phi(T)$. Thanks to (6.8) this means that $\theta_1 \circ \overline{Z}_\lambda(T) = \overline{Z}_\lambda \circ \theta_1(T) = \overline{Z}_\lambda \circ \theta_2 \circ \phi(T) = \theta_2 \circ \overline{Z}_\lambda \circ \phi(T)$ which proves the assertion. When $\overline{Z}_\lambda(T) \neq 0$ the commutation relation $[\overline{Z}_\lambda, \phi] = 0$ can again be explicitly shown on paths using the maps $\omega$ and $\omega^{-1}$.

6.2. Properties of charge and cocharge. In this section we establish some properties of the charge and cocharge for LR tableaux. The cocharge of an LR tableau $T$ is its rank in the poset induced by the modified initial cyclage $\overline{C}$. Since the initial cyclage is a special $\lambda$-cyclage the cocharge is also the rank of the poset $\mathcal{T}(\mu)$. In Definition 4.3 the charge was defined as $c(T) = \|\mu\| - \mathrm{co}(T)$, where $\|\mu\| = \sum_{i<j} |\mu_i \cap \mu_j|$. We show now that $\|\mu\|$ is in fact the cocharge of $T_{\max} = x_L^{\mu_L} \cdots x_1^{\mu_1}$ and that the charge of an LR tableau is equal to its rank in the poset $\mathcal{T}'(\mu)$.


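Lemma 6.4. For $L \in \mathbb{Z}_{\ge 0}$ and $\mu \in \mathcal{R}^L$, $\|\mu\| = \mathrm{co}(T_{\max})$.

Proof. Set $P_{\max} := \omega^{-1}(w_{T_{\max}})$ with $\omega^{-1}$ as defined in (5.4). Using Theorem 5.1, $\|\mu\| = \mathrm{co}(T_{\max})$ is equivalent to $\|\mu\| = H'(P_{\max})$ with $H'$ as in (5.13). Defining $L_i^{(a)}$ as the number of components of $\mu$ equal to $(i^a)$ and setting $X_{ij}^{ab} := |(i^a) \cap (j^b)|$, we may rewrite $\|\mu\|$ as
\[ \|\mu\| = \frac{1}{2} \sum_{\substack{1 \le i,j \le N \\ 1 \le a,b \le n}} L_i^{(a)} \bigl(L_j^{(b)} - \delta_{ij}\delta_{ab}\bigr) X_{ij}^{ab}, \tag{6.13} \]
where $\delta_{ij} = 1$ if $i = j$ and zero otherwise. Let $P \in \mathcal{O}_{P_{\max}}$. Then $P \in \mathcal{P}_{\lambda\tilde\mu}$, where $\tilde\mu$ is a permutation of $\mu$. One may easily show that $h(p_{k+1} \otimes p_k) = |\tilde\mu_k \cap \tilde\mu_{k+1}|$ for all $P \in \mathcal{O}_{P_{\max}}$. Hence
\[ H'(P_{\max}) = \frac{1}{|\mathcal{O}_{P_{\max}}|} \sum_{\tilde\mu} \sum_{k=1}^{L-1} (L-k)\, |\tilde\mu_k \cap \tilde\mu_{k+1}|, \]
where the first sum is over all permutations of $\mu$. Notice that $|\tilde\mu_k \cap \tilde\mu_{k+1}| = X_{ij}^{ab}$ if $\tilde\mu_k = (i^a)$ and $\tilde\mu_{k+1} = (j^b)$. We now wish to determine the coefficient of $X_{ij}^{ab}$ in $H'(P_{\max})$. Since
\[ |\mathcal{O}_{P_{\max}}| = \text{number of permutations of } \mu = \frac{L!}{\prod_{c,k \ge 1} L_k^{(c)}!}, \]
the contribution to $X_{ij}^{ab}$ in $H'(P_{\max})$ is
\[ \frac{1}{|\mathcal{O}_{P_{\max}}|} \cdot \frac{(L-2)!}{\prod_{c,k \ge 1} \bigl(L_k^{(c)} - \delta_{ac}\delta_{ik} - \delta_{bc}\delta_{jk}\bigr)!} \sum_{k=1}^{L-1} (L-k) = \frac{1}{2} L_i^{(a)} \bigl(L_j^{(b)} - \delta_{ij}\delta_{ab}\bigr). \]
Summing over all $i, j, a, b$ yields (6.13) which completes the proof.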
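The counting argument in this proof can be checked numerically on a small example. The following is a minimal sketch, assuming that a rectangle $(i^a)$ is encoded as the pair `(width, height) = (i, a)` and that $|(i^a) \cap (j^b)| = \min\{i,j\}\min\{a,b\}$; the function names are illustrative and not taken from the paper. It compares the definition of $\|\mu\|$, the right-hand side of (6.13), and the average over all distinct permutations $\tilde\mu$ of $\sum_{k=1}^{L-1}(L-k)|\tilde\mu_k \cap \tilde\mu_{k+1}|$.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def overlap(r, s):
    """|r ∩ s| for rectangles encoded as (width, height) (assumed encoding)."""
    return min(r[0], s[0]) * min(r[1], s[1])


def norm(mu):
    """||mu|| = sum over pairs j < k of |mu_j ∩ mu_k|."""
    return sum(overlap(mu[j], mu[k])
               for j in range(len(mu)) for k in range(j + 1, len(mu)))


def norm_from_multiplicities(mu):
    """Right-hand side of (6.13), written with the multiplicities L_i^(a)."""
    mult = Counter(mu)
    total = 0
    for r, count_r in mult.items():
        for s, count_s in mult.items():
            total += count_r * (count_s - (1 if r == s else 0)) * overlap(r, s)
    return Fraction(total, 2)


def averaged_weight(mu):
    """Average over distinct permutations of sum_k (L-k)|mu~_k ∩ mu~_{k+1}|."""
    length = len(mu)
    orbit = set(permutations(mu))
    total = sum((length - k) * overlap(p[k - 1], p[k])
                for p in orbit for k in range(1, length))
    return Fraction(total, len(orbit))


mu = [(2, 1), (2, 1), (1, 2), (3, 2)]
print(norm(mu), norm_from_multiplicities(mu), averaged_weight(mu))
```

All three printed values agree for this array, as the proof of Lemma 6.4 predicts; the brute-force enumeration of permutations is only practical for short arrays.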

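The charge and cocharge are dual in the following sense:

Lemma 6.5. For $T \in \mathrm{LRT}(\cdot, \mu)$, $c(T) = \mathrm{co}(\Lambda(T))$.

Proof. Since $c(T) = \|\mu\| - \mathrm{co}(T)$, the lemma is equivalent to
\[ \mathrm{co}(\Lambda(T)) + \mathrm{co}(T) = \|\mu\|. \tag{6.14} \]
For $T_{\min} = [x_1^{\mu_1} \cdots x_L^{\mu_L}]$ we have $\Lambda(T_{\min}) = T_{\max} = x_L^{\mu_L^\top} \cdots x_1^{\mu_1^\top}$. Since $\mathrm{co}(T_{\min}) = 0$ and $\mathrm{co}(T_{\max}) = \|\mu^\star\| = \|\mu\|$ by Lemma 6.4, Eq. (6.14) holds for $T = T_{\min}$ and $T = T_{\max}$. Now assume that (6.14) holds for some $T \in \mathrm{LRT}(\cdot, \mu)$ so that $\mathcal{D}(T) = T$. Then (6.14) also holds for $\overline{\mathcal{Z}}_\lambda(T) = \mathcal{Z}_\lambda(T)$ if we can show that $\mathrm{co}(\Lambda(T)) = \mathrm{co}(\Lambda \circ \mathcal{Z}_\lambda(T)) - 1$ because $\mathrm{co}(T) = \mathrm{co}(\mathcal{Z}_\lambda(T)) + 1$. Since $\Lambda \circ \mathcal{Z}_\lambda = \mathcal{Z}_{\lambda^\top}^{-1} \circ \Lambda$ and $\mathcal{D} \circ \mathcal{Z}_{\lambda^\top}^{-1} = \mathcal{Z}_{\lambda^\top}^{-1}$ this is fulfilled. If on the other hand $\mathcal{D}'(T) = T$ or equivalently $\mathcal{D} \circ \Lambda(T) = \Lambda(T)$ then (6.14) also holds for $\overline{\mathcal{Z}}'_\lambda(T) = \mathcal{Z}_\lambda^{-1}(T)$ because $\mathrm{co}(T) = \mathrm{co}(\mathcal{Z}_\lambda^{-1}(T)) - 1$ and $\mathrm{co}(\Lambda(T)) = \mathrm{co}(\Lambda \circ \mathcal{Z}_\lambda^{-1}(T)) + 1$ thanks to $\Lambda \circ \mathcal{Z}_\lambda^{-1} = \mathcal{Z}_{\lambda^\top} \circ \Lambda$. Since $\mathcal{D}(T) = T$ if $\mathcal{D}'(T) \neq T$ and vice versa, unless $T$ is equal to both $T_{\min}$ and $T_{\max}$, this proves (6.14) for all $T \in \mathrm{LRT}(\cdot, \mu)$.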


As argued before the cocharge is the rank of the poset $\mathcal{T}(\mu)$ since the initial cyclage is a special $\lambda$-cyclage. Lemmas 6.5 and 6.2 show that the charge is the rank of the poset $\mathcal{T}'(\mu)$. This is summarized in the following corollary.

Corollary 6.6. For $\mu \in \mathcal{R}^L$, the cocharge is the rank of the poset $\mathcal{T}(\mu)$ and the charge is the rank of the poset $\mathcal{T}'(\mu)$. In addition $0 \le \mathrm{co}(T) \le \|\mu\|$ and $0 \le c(T) \le \|\mu\|$, with $\mathrm{co}(T_{\min}) = c(T_{\max}) = 0$ and $\mathrm{co}(T_{\max}) = c(T_{\min}) = \|\mu\|$.

7. Properties of the Supernomials and Generalized Kostka Polynomials

Several interesting properties of the supernomials (3.10) and generalized Kostka polynomials (3.20) are stated. In Sect. 7.1 a duality formula for the generalized Kostka polynomials as well as relations between the supernomials and the generalized (cocharge) Kostka polynomial are given. Recurrences for the $A_{n-1}$ supernomials and the generalized Kostka polynomials are established in Sect. 7.2. These will be used in Sect. 8 to obtain a representation of the generalized Kostka polynomials of the Kirillov–Reshetikhin type (2.5). In Sect. 7.3 we treat the $A_1$ supernomials in more detail and sketch an elementary proof of the Rogers–Ramanujan-type identities of ref. [47].

7.1. General properties. The results of the previous section imply the following duality formula for the generalized Kostka polynomials.

Theorem 7.1. For $\lambda$ a partition and $\mu \in \mathcal{R}^L$,
\[ K_{\lambda\mu}(q) = q^{\|\mu\|} K_{\lambda^\top \mu^\star}(1/q). \tag{7.1} \]

Proof. This follows from the charge representation of the generalized Kostka polynomials of Corollary 5.2, Lemma 6.5 and $c(T) = \|\mu\| - \mathrm{co}(T)$.

The supernomial $S_{\lambda\mu}(q)$ and the generalized cocharge Kostka polynomial $\tilde{K}_{\lambda\mu}(q)$ satisfy linear relations as follows.

Theorem 7.2. For $\lambda \in \mathbb{Z}^n_{\ge 0}$ and $\mu \in \mathcal{R}^L$,
\[ S_{\lambda\mu}(q) = \sum_{\eta \vdash |\lambda|} K_{\eta\lambda} \tilde{K}_{\eta\mu}(q), \tag{7.2} \]

where $K_{\eta\lambda} = K_{\eta\lambda}(1)$ is the Kostka number.

Proof. By definition the supernomial $S_{\lambda\mu}(q)$ is the generating function over all paths $P \in \mathcal{P}_{\lambda\mu}$ weighted by $H(P)$ and by (5.3) $\tilde{K}_{\eta\mu}(q)$ is the generating function over all LR tableaux $T \in \mathrm{LRT}(\eta, \mu)$ with cocharge statistic. Hence, since $[\omega(P)] \in \mathrm{LRT}(\cdot, \mu)$ and $H(P) = \mathrm{co}(\Omega \circ \omega(P))$ by Theorem 5.1 for $P \in \mathcal{P}_{\lambda\mu}$, Eq. (7.2) is proven if we can show that for all partitions $\eta$ of $|\lambda|$ and $T \in \mathrm{LRT}(\eta, \mu)$ there are $K_{\eta\lambda}$ paths such that $[\omega(P)] = T$. To this end let us show that for all partitions $\eta$ of $|\lambda|$ with $\eta \ge \lambda$ a pair $(T, t)$ with $T \in \mathrm{LRT}(\eta, \mu)$ and $t \in \mathrm{Tab}(\eta, \lambda)$ uniquely specifies a path $P = p_L \otimes \cdots \otimes p_1 \in \mathcal{P}_{\lambda\mu}$ by requiring that $p_L \cdot \ldots \cdot p_1 = t$ and $[\omega(P)] = T$. Firstly, by point (i) of Lemma 3.1 indeed $\mathrm{shape}(p_L \cdot \ldots \cdot p_1) = \mathrm{shape}([\omega(P)])$. Let us now construct $P \in \mathcal{P}_{\lambda\mu}$ from a given pair


$(T, t)$. Set $a_i = \mathrm{height}(\mu_i)$ and define $p_i^{(k)}$ and $t_i^{(k)}$ ($1 \le i \le L$; $1 \le k \le a_i$) recursively as follows. Set $t_{L+1}^{(1)} = t$ and decompose for $1 \le i \le L$,
\[ t_{i+1}^{(1)} = p_i^{(a_i)} \cdot t_i^{(a_i)} \quad\text{and}\quad t_i^{(k+1)} = p_i^{(k)} \cdot t_i^{(k)} \quad (1 \le k < a_i) \tag{7.3} \]
such that $\mathrm{shape}(p_i^{(k)}) = (\mathrm{width}(\mu_i))$ and $\mathrm{shape}(t_i^{(k)}) = \mathrm{shape}(T_i^{(k)})$, where $T_i^{(k)}$ is obtained from $T$ by dropping all letters $x \ge x_i^{(k)}$. The decompositions in (7.3) are unique by the Pieri formula. The desired path is $P = p_L \otimes \cdots \otimes p_1$, where $p_i := p_i^{(a_i)} \cdot \ldots \cdot p_i^{(1)}$ ($1 \le i \le L$) because $p_i$ has shape $\mu_i$ since $T \in \mathrm{LRT}(\eta, \mu)$, $p_L \cdot \ldots \cdot p_1 = t$ and $[\omega(P)] = T$ by construction.

From Eq. (7.2) one can infer that the special cases of the supernomials for which $\mu$ or $\mu^\star$ is a partition have previously occurred in the literature. In the study of finite abelian subgroups, Butler [7]–[9] defines polynomials $\alpha_\mu(S; q)$, where $\mu$ is a partition and $S = \{a_1 < \cdots < a_{n-1}\}$ an ordered set of $n - 1$ integers such that $a_{n-1} < |\mu|$, and shows that they satisfy
\[ h_{a_1}(x) h_{a_2 - a_1}(x) \cdots h_{m - a_{n-1}}(x) = \sum_{\mu \vdash m} \alpha_\mu(S; q^{-1})\, q^{\|\mu\|} P_\mu(x; q). \tag{7.4} \]
Here $h_k(x)$ is the $k$th homogeneous symmetric function and $P_\mu(x; q)$ is the Hall–Littlewood polynomial. Using $h_{\lambda_1}(x) \cdots h_{\lambda_n}(x) = \sum_\eta K_{\eta\lambda} s_\eta(x)$ and Eqs. (2.3) and (7.2) immediately yields that $\alpha_\mu(S; q) = S_{\lambda\mu}(q)$, where $\lambda = (a_1, a_2 - a_1, \ldots, |\mu| - a_{n-1})$. When $\mu^\star$ is a partition the supernomial has been studied by Hatayama et al. [15].

An immediate consequence of Theorem 7.2 is the inverse of relation (7.2).

Corollary 7.3. For $\lambda$ a partition with $\mathrm{height}(\lambda) \le n$ and $\mu \in \mathcal{R}^L$,
\[ \tilde{K}_{\lambda\mu}(q) = \sum_{\tau \in S_n} \epsilon(\tau)\, S_{(\lambda_1 + \tau_1 - 1, \ldots, \lambda_n + \tau_n - n)\mu}(q), \tag{7.5} \]
where $S_n$ is the permutation group on $1, 2, \ldots, n$ and $\epsilon(\tau)$ is the sign of $\tau$.

Proof. Substitute (7.2) into the right-hand side of (7.5) and use (see p. 76 of ref. [13])
\[ \sum_{\tau \in S_n} \epsilon(\tau)\, K_{\eta(\lambda_1 + \tau_1 - 1, \ldots, \lambda_n + \tau_n - n)} = \delta_{\eta\lambda}. \]
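The alternating-sum identity for Kostka numbers used in this proof can be tested directly for small $n$. The sketch below is illustrative only: it assumes that $K_{\eta\alpha}$ for a composition $\alpha$ is the number of semistandard tableaux of shape $\eta$ and content $\alpha$, taken to be zero when $\alpha$ has a negative entry, and it counts tableaux by peeling off horizontal strips. The helper names are hypothetical and do not come from the paper.

```python
from itertools import permutations


def horizontal_strip_removals(shape, size):
    """Yield the shapes obtained from `shape` by removing a horizontal strip of `size` cells."""
    rows = list(shape) + [0]

    def rec(r, remaining, acc):
        if r == len(shape):
            if remaining == 0:
                yield tuple(acc)
            return
        # Row r may shrink at most down to the length of the row below it.
        for removed in range(min(remaining, rows[r] - rows[r + 1]) + 1):
            yield from rec(r + 1, remaining - removed, acc + [rows[r] - removed])

    yield from rec(0, size, [])


def kostka(shape, content):
    """Number of semistandard tableaux of the given shape and content."""
    if any(c < 0 for c in content) or sum(shape) != sum(content):
        return 0
    if not content:
        return 1  # only the empty shape reaches this point
    *rest, last = content
    return sum(kostka(nu, rest) for nu in horizontal_strip_removals(shape, last))


def sign(tau):
    """Sign of a permutation given in one-line notation."""
    inversions = sum(1 for i in range(len(tau))
                     for j in range(i + 1, len(tau)) if tau[i] > tau[j])
    return -1 if inversions % 2 else 1


def alternating_sum(eta, lam):
    """sum over tau in S_n of sign(tau) * K_{eta, (lam_i + tau_i - i)_i}."""
    n = len(lam)
    total = 0
    for tau in permutations(range(1, n + 1)):
        content = [lam[i] + tau[i] - (i + 1) for i in range(n)]
        total += sign(tau) * kostka(eta, content)
    return total


shapes = [(4, 0, 0), (3, 1, 0), (2, 2, 0), (2, 1, 1)]
for lam in shapes:
    print(lam, [alternating_sum(eta, lam) for eta in shapes])
```

For $n = 3$ and partitions of 4, each printed row is a unit vector with the 1 in the position of $\eta = \lambda$, in agreement with the identity quoted from [13].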

7.2. Recurrences of the $A_{n-1}$ supernomials and generalized Kostka polynomials. We have seen in Eqs. (3.11) and (3.21) that the supernomials and generalized Kostka polynomials are independent of the ordering of $\mu$. We may therefore label the supernomials and generalized Kostka polynomials by a matrix $L$ with component $L_i^{(a)}$ in row $a$ and column $i$ where
\[ L_i^{(a)} = L_i^{(a)}(\mu) := \text{number of components of } \mu \text{ equal to } (i^a). \tag{7.6} \]
If $N := \max\{\mathrm{width}(\mu_k)\}$, then $L$ is an $n \times N$ matrix. We denote the supernomials and generalized Kostka polynomials with the label $L$ by $S(L, \lambda)$ and $K(L, \lambda)$, respectively, and from now on we identify $S(L, \lambda)$ and $S_{\lambda\mu}(q)$ (similarly $K(L, \lambda)$ and $K_{\lambda\mu}(q)$) if $\mu$ and $L$ are related as in (7.6). Define $e_i^{(a)}$ as the $n \times N$ matrix with the only non-zero element in row $a$ and column $i$ equal to 1 and furthermore set $L = \sum_{i=1}^N \sum_{a=1}^n L_i^{(a)}$,
\[ \ell_i^{(a)} = \sum_{j=1}^N \min\{i, j\} L_j^{(a)} \quad\text{and}\quad \overline{\ell}_i^{(a)} = \sum_{b=1}^n \min\{a, b\} L_i^{(b)}. \tag{7.7} \]
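The two quantities in (7.7) are easy to confuse, so a small numerical sketch may help. It assumes the same `(width, height)` encoding of a rectangle $(i^a)$ as pair `(i, a)`, writes `ell` for the sum over column indices $j$ and `ell_bar` for the sum over row indices $b$, and checks the two differences of $\|\mu\|$ that are used later in the proof of Theorem 7.4: splitting two copies of $(i^a)$ in the width direction lowers $\|\mu\|$ by `ell_bar - a`, and splitting in the height direction lowers it by `ell - i`. All function names are illustrative.

```python
def overlap(r, s):
    """|r ∩ s| for rectangles encoded as (width, height) (assumed encoding)."""
    return min(r[0], s[0]) * min(r[1], s[1])


def norm(mu):
    """||mu|| = sum over pairs j < k of |mu_j ∩ mu_k|."""
    return sum(overlap(mu[j], mu[k])
               for j in range(len(mu)) for k in range(j + 1, len(mu)))


def ell(mu, i, a):
    """sum_j min(i, j) L_j^(a): runs over the rectangles of height a."""
    return sum(min(i, w) for (w, h) in mu if h == a)


def ell_bar(mu, i, a):
    """sum_b min(a, b) L_i^(b): runs over the rectangles of width i."""
    return sum(min(a, h) for (w, h) in mu if w == i)


def split_pair(mu, i, a, direction):
    """Replace two copies of (i^a) by a wider/narrower or a taller/shorter pair."""
    rest = list(mu)
    rest.remove((i, a))
    rest.remove((i, a))
    if direction == "width":
        return rest + [(i + 1, a), (i - 1, a)]
    return rest + [(i, a + 1), (i, a - 1)]


mu = [(2, 1), (2, 1), (1, 2), (3, 1), (2, 2)]
i, a = 2, 1
print(norm(mu) - norm(split_pair(mu, i, a, "width")), ell_bar(mu, i, a) - a)
print(norm(mu) - norm(split_pair(mu, i, a, "height")), ell(mu, i, a) - i)
```

Each printed line shows two equal numbers for this example. Rectangles of width or height zero are kept in the list but contribute nothing to any overlap.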

With this notation we can state the following recurrence relations.

Theorem 7.4 (Recurrences). Let $i, a, N, n \in \mathbb{Z}_{\ge 0}$ such that $1 \le i < N$ and $1 \le a < n$. Let $L$ be an $n \times N$ matrix with non-negative integer components such that $L_i^{(a)} \ge 2$. Then for $\lambda \in \mathbb{Z}^n_{\ge 0}$,
\[ S(L, \lambda) = S(L + e_{i-1}^{(a)} - 2e_i^{(a)} + e_{i+1}^{(a)}, \lambda) + q^{\ell_i^{(a)} - i}\, S(L + e_i^{(a-1)} - 2e_i^{(a)} + e_i^{(a+1)}, \lambda) \tag{7.8} \]
and for $\lambda$ a partition
\[ K(L, \lambda) = q^{\overline{\ell}_i^{(a)} - a}\, K(L + e_{i-1}^{(a)} - 2e_i^{(a)} + e_{i+1}^{(a)}, \lambda) + K(L + e_i^{(a-1)} - 2e_i^{(a)} + e_i^{(a+1)}, \lambda). \tag{7.9} \]

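Proof. First we prove (7.8). Take $\mu \in \mathcal{R}^L$ corresponding to $L$ such that $\mu_{L-1} = \mu_L = (i^a)$ (which is possible since $L_i^{(a)} \ge 2$). Define $\mu'$ and $\mu''$ by $\mu'_L = ((i+1)^a)$, $\mu'_{L-1} = ((i-1)^a)$, $\mu''_L = (i^{a+1})$, $\mu''_{L-1} = (i^{a-1})$ and $\mu'_j = \mu''_j = \mu_j$ for $1 \le j \le L - 2$. Recalling Definition 3.5 of the supernomials, it is obvious that $\mathcal{P}_{\lambda\mu'}$ and $\mathcal{P}_{\lambda\mu''}$ are the sets of paths underlying the two terms on the right-hand side of (7.8). Furthermore $\mathcal{P}_{\lambda\mu'}$ and $\mathcal{P}_{\lambda\mu''}$ are disjoint. We now wish to establish a bijection between $\mathcal{P}_{\lambda\mu}$ and $\mathcal{P}_{\lambda\mu'} \cup \mathcal{P}_{\lambda\mu''}$. To this end define $\tau(p_L \otimes p_{L-1}) = \tilde{p}_L \otimes \tilde{p}_{L-1}$ for $p_{L-1}, p_L \in \mathcal{B}_{\mu_L}$ such that
\[ \tilde{p}_L \cdot \tilde{p}_{L-1} = p_L \cdot p_{L-1} \tag{7.10} \]
and either (a) $\tilde{p}_{L-1} \in \mathcal{B}_{\mu'_{L-1}}$, $\tilde{p}_L \in \mathcal{B}_{\mu'_L}$ if $\nu \cap ((i+1)^a) = ((i+1)^a)$ or (b) $\tilde{p}_{L-1} \in \mathcal{B}_{\mu''_{L-1}}$, $\tilde{p}_L \in \mathcal{B}_{\mu''_L}$ if $\nu \cap (i^{a+1}) = (i^{a+1})$, where $\nu = \mathrm{shape}(p_L \cdot p_{L-1})$. Indeed these conditions are mutually exclusive and determine $\tilde{p}_{L-1}$ and $\tilde{p}_L$ uniquely, i.e., the Littlewood–Richardson coefficient $c^{\nu}_{\mu'_{L-1}\mu'_L} = 1$ if and only if $c^{\nu}_{\mu''_{L-1}\mu''_L} = 0$ and vice versa. Conversely, if $\tilde{p}_{L-1} \in \mathcal{B}_{\mu'_{L-1}}$, $\tilde{p}_L \in \mathcal{B}_{\mu'_L}$ (or $\tilde{p}_{L-1} \in \mathcal{B}_{\mu''_{L-1}}$, $\tilde{p}_L \in \mathcal{B}_{\mu''_L}$) one can find unique $p_L \otimes p_{L-1} = \tau^{-1}(\tilde{p}_L \otimes \tilde{p}_{L-1})$ with $p_{L-1}, p_L \in \mathcal{B}_{\mu_L}$ by requiring (7.10). Hence $\tau : \mathcal{P}_{\lambda\mu} \to \mathcal{P}_{\lambda\mu'} \cup \mathcal{P}_{\lambda\mu''}$ with $\tau(P) := \tau(p_L \otimes p_{L-1}) \otimes p_{L-2} \otimes \cdots \otimes p_1$ for each path $P = p_L \otimes \cdots \otimes p_1 \in \mathcal{P}_{\lambda\mu}$, is the desired bijection. This proves (7.8) at $q = 1$.

To prove (7.8) at arbitrary base $q$ notice that if $\tau(P) \in \mathcal{P}_{\lambda\mu'}$, then the LR tableaux $T = [\Omega \circ \omega(P)]$ and $T' = [\Omega \circ \omega \circ \tau(P)]$ are related as $T' = \psi_{\mu\mu'}(T)$, with $\psi_{\mu\mu'}$ defined in (6.11). Because of (6.12) we have $\mathrm{co}(T) = \mathrm{co}(T')$. Hence Theorem 5.1 implies that also $H(P) = H(\tau(P))$ for all $P$ such that $\tau(P) \in \mathcal{P}_{\lambda\mu'}$. Therefore, the term $S(L + e_{i-1}^{(a)} - 2e_i^{(a)} + e_{i+1}^{(a)}, \lambda)$ in (7.8) comes without a power of $q$. Similarly, if $\tau(P) \in \mathcal{P}_{\lambda\mu''}$ then the LR tableaux $T = [\Omega \circ \omega(P)]$ and $T'' = [\Omega \circ \omega \circ \tau(P)]$ are related as $\Lambda(T'') = \psi_{\mu\mu''} \circ \Lambda(T)$ which implies
\[ \mathrm{co}(\Lambda(T)) = \mathrm{co}(\Lambda(T'')). \tag{7.11} \]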


Therefore, pulling all strings in our register, we derive
\begin{align*}
H(P) - H(\tau(P)) &= \mathrm{co}(T) - \mathrm{co}(T'') && \text{by Theorem 5.1,} \\
&= \|\mu\| - \mathrm{co}(\Lambda(T)) - \|\mu''\| + \mathrm{co}(\Lambda(T'')) && \text{by Lemma 6.5,} \\
&= \|\mu\| - \|\mu''\| && \text{by Eq. (7.11),} \\
&= \ell_i^{(a)} - i && \text{recalling } \|\mu\| = \sum_{j<k} |\mu_j \cap \mu_k|,
\end{align*}
which is the power of $q$ in front of the second term in (7.8). This concludes the proof of (7.8).

To prove (7.9), recall Definition 3.9 of the generalized Kostka polynomials. The generalized cocharge Kostka polynomials (3.19) obey the same recurrences (7.8) as the supernomials. This follows from the fact that $\overline{\mathcal{P}}_{\lambda\mu} \subset \mathcal{P}_{\lambda\mu}$ and that $\tau(P)$ is in $\overline{\mathcal{P}}_{\lambda\mu'}$ or $\overline{\mathcal{P}}_{\lambda\mu''}$ if $P \in \overline{\mathcal{P}}_{\lambda\mu}$ by the same arguments as in the proof of point (ii) of Lemma 3.1. Using (3.20), $\|\mu\| - \|\mu'\| = \overline{\ell}_i^{(a)} - a$ and $\|\mu\| - \|\mu''\| = \ell_i^{(a)} - i$ yields (7.9).

Rectangular Young tableaux over the alphabet $\{1, 2, \ldots, n\}$ of height $n$ are often identified with the empty tableau. When this identification is made for the steps of the paths in the generating functions defining the supernomials and generalized Kostka polynomials one obtains the following properties.

Lemma 7.5. Let $n \ge 2$, $N, i \ge 1$ be integers and $L$ an $n \times N$ matrix with non-negative entries. Then for $\lambda \in \mathbb{Z}^n_{\ge 0}$,
\[ S(L + e_i^{(n)}, \lambda + (i^n)) = S(L, \lambda), \tag{7.12} \]
and for $\lambda$ a partition with at most $n$ parts,
\[ K(L + e_i^{(n)}, \lambda + (i^n)) = q^{\sum_a a\, \ell_i^{(a)}} K(L, \lambda). \tag{7.13} \]

Proof. Writing $S(L + e_i^{(n)}, \lambda + (i^n))$ as a generating function over paths as in (3.10), each path $P$ in the sum has at least one step of height $n$. Hence $\mathrm{height}(\omega(P)) = n$ by Lemma 3.1. Denoting by $P'$ the path obtained from $P$ by dropping the step $p_k$ of shape $(i^n)$, we find from Lemma 5.8 and $\Omega(p_k) = p_k$ that $H(P) = H(P')$. This proves (7.12). The generalized cocharge Kostka polynomials $\tilde{K}$ obey the same relation (7.12) as $S$. Let $\mu$ and $\mu'$ be any of the arrays of rectangular partitions corresponding to $L + e_i^{(n)}$ and $L$, respectively, by (7.6). Then using $\|\mu\| - \|\mu'\| = \sum_a a\, \ell_i^{(a)}$ and recalling (3.20) one finds (7.13).

7.3. The $A_1$ supernomials. The $A_1$ supernomials are given by specializing Definition 3.5 to $n = 2$, i.e., $\lambda = (\lambda_1, \lambda_2) \in \mathbb{Z}^2_{\ge 0}$. By Lemma 7.5 it is sufficient to label all $A_1$ supernomials by a vector $L \in \mathbb{Z}^N_{\ge 0}$ instead of a two-row matrix. Recall that the supernomials vanish unless $\lambda_1 + \lambda_2 = |\mu| = \ell_N$, where $\ell_i = \sum_{j=1}^N \min\{i, j\} L_j$. We therefore set
\[ S_1(L, a) := S_{\lambda\mu}(q) = \sum_{P \in \mathcal{P}_{\lambda\mu}} q^{H(P)}, \tag{7.14} \]


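where $L \in \mathbb{Z}^N_{\ge 0}$, $a + \tfrac{1}{2}\ell_N = 0, 1, \ldots, \ell_N$ and
\[ \mu = (1^{L_1} 2^{L_2} \cdots N^{L_N}), \qquad \lambda = (\tfrac{1}{2}\ell_N + a, \tfrac{1}{2}\ell_N - a). \tag{7.15} \]
For $S_1(L, a)$ the recurrences (7.8) read
\[ S_1(L, a) = S_1(L + e_{i-1} - 2e_i + e_{i+1}, a) + q^{\ell_i - i} S_1(L - 2e_i, a), \tag{7.16} \]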

for $1 \le i < N$. The $e_i$ are the canonical basis vectors of $\mathbb{Z}^N$ and $e_0 = 0$. The above equation is in fact part of a larger family of recursion relations.

Lemma 7.6. Let $A, B, N$ be integers such that $1 \le A \le B < N$ and let $L \in \mathbb{Z}^N_{\ge 0}$ such that $L_1 = \cdots = L_{B-1} = 0$ if $A < B$. Then for $a + \tfrac{1}{2}\ell_N = 0, 1, \ldots, \ell_N$,
\[ S_1(L + e_A + e_B, a) = S_1(L + e_{A-1} + e_{B+1}, a) + q^{\ell_A + A} S_1(L + e_{B-A}, a). \tag{7.17} \]

Proof. When $A = B$ Eq. (7.17) reduces to (7.16) with $L$ replaced by $L + 2e_i$. For $A < B$ Eq. (7.17) follows from the recurrences
\[ S(L + e_A^{(1)} + e_B^{(1)}, \lambda) = S(L + e_{A-1}^{(1)} + e_{B+1}^{(1)}, \lambda) + q^{\ell_A^{(1)} + A} S(L + e_{B-A}^{(1)} + e_A^{(2)}, \lambda), \tag{7.18} \]
where $n = 2$ and $L$ is a $2 \times N$ matrix such that $L_1^{(1)} = \cdots = L_{B-1}^{(1)} = 0$. This can be seen by dropping all entries of the matrix in the second row using Lemma 7.5 and then replacing the matrix $L$ by a vector $L$. Eq. (7.18) can be proven in complete analogy to the proof of Theorem 7.4 as follows. Take $\mu$ corresponding to $L + e_A^{(1)} + e_B^{(1)}$, replace the components $\mu_{L-1}, \ldots, \mu''_L$ in the proof of Theorem 7.4 by $\mu_{L-1} = (B)$, $\mu_L = (A)$, $\mu'_{L-1} = (B+1)$, $\mu'_L = (A-1)$, $\mu''_{L-1} = (B-A)$, $\mu''_L = (A, A)$ and set $\tau(p_L \otimes p_{L-1}) = \tilde{p}_L \otimes \tilde{p}_{L-1}$, where again $\tilde{p}_L \cdot \tilde{p}_{L-1} = p_L \cdot p_{L-1}$ and now (a) $\tilde{p}_L \in \mathcal{B}_{\mu'_L}$, $\tilde{p}_{L-1} \in \mathcal{B}_{\mu'_{L-1}}$ if $\mathrm{shape}(p_L \cdot p_{L-1}) \neq (B, A)$ and (b) $\tilde{p}_L \in \mathcal{B}_{\mu''_L}$, $\tilde{p}_{L-1} \in \mathcal{B}_{\mu''_{L-1}}$ if $\mathrm{shape}(p_L \cdot p_{L-1}) = (B, A)$. Note that we have used here that $n = 2$ which ensures that the shape of the product of two steps has at most height 2. One may explicitly check that $\mathrm{co}(T) = \mathrm{co}(T')$ and $\mathrm{co}(T) = \mathrm{co}(T'') + \ell_A^{(1)} + A$ with $T, T', T''$ as defined in the proof of Theorem 7.4 which proves (7.18).

Using $S_{\lambda\mu}(q) = \alpha_\mu(S; q)$ for $\mu$ a partition (see the discussion after Theorem 7.2) and the explicit representations for $\alpha_\mu(S; q)$ in [7]–[9], one finds that
\[ S_1(L, a) = \begin{bmatrix} L \\ a \end{bmatrix} \tag{7.19} \]
with $\begin{bmatrix} L \\ a \end{bmatrix}$ given in Eq. (1.3). We now recall some identities of ref. [47] involving the $A_1$ supernomial and show how the recurrences of Lemma 7.6 yield an elementary proof. The identities unify and extend many of the known Bose–Fermi or Rogers–Ramanujan-type identities for one-dimensional configuration sums of solvable lattice models. Below we only quote the result of [47] corresponding to the Andrews–Baxter–Forrester model and its fusion hierarchy. Set
\[ T_1(L, a) = q^{\frac{1}{4}\sum_{i=1}^N L_i \ell_i - \frac{a^2}{N}} \begin{bmatrix} L \\ a \end{bmatrix}_{1/q}. \tag{7.20} \]


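Theorem 7.7. Let $a, b, p, N$ be integers such that $N < p - 2$, $1 \le a \le p - 1$ and $1 \le b \le p - N - 1$ and let $L \in \mathbb{Z}^N_{\ge 0}$. Then
\[ \begin{aligned}
&\sum_{j=-\infty}^{\infty} \Bigl\{ q^{\frac{j}{N}(p(p-N)j + pb - (p-N)a)}\, T_1\bigl(L, \tfrac{b-a}{2} + pj\bigr) - q^{\frac{1}{N}(pj+a)((p-N)j+b)}\, T_1\bigl(L, \tfrac{b+a}{2} + pj\bigr) \Bigr\} \\
&\qquad = q^{\frac{1}{4N}(b-a)(a-b-N)} \sum_{\substack{m \in \mathbb{Z}^{p-3}_{\ge 0} \\ m \equiv Q_{ab} \ (\mathrm{mod}\ 2)}} q^{\frac{1}{4} mCm - \frac{1}{2} m_{a-1}} \prod_{j=1}^{p-3} \begin{bmatrix} m_j + n_j \\ m_j \end{bmatrix},
\end{aligned} \tag{7.21} \]
where $C$ is the Cartan matrix of $A_{p-3}$ and $Q_{ab} = Q^{(a-1)} + Q^{(p-b-1)} + Q^{(p-2)} + \sum_{i=2}^N L_i Q^{(i)}$ with $Q^{(i)} = e_{i-1} + e_{i-3} + \cdots$. The expression $m \equiv Q \ (\mathrm{mod}\ 2)$ stands for $m_i \equiv Q_i \ (\mathrm{mod}\ 2)$ and $mCm = \sum_{i,j=1}^{p-3} C_{ij} m_i m_j$. The variable $m_0 = 0$ and $n$ is determined by
\[ n = \frac{1}{2}\Bigl(e_{a-1} + e_{p-b-1} + \sum_{i=1}^N L_i e_i - Cm\Bigr). \]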

For $L = L e_{N'}$, $1 \le N' \le N$, in the limit $L \to \infty$ the identity (7.21) yields an identity for branching functions of $A_1^{(1)}$ cosets.

If we can show that the (suitably normalized) $q \to 1/q$ forms of both sides of (7.21) satisfy the recurrence
\[ X(L + e_A + e_B) = X(L + e_{A-1} + e_{B+1}) + q^{\ell_A + A} X(L + e_{B-A}), \tag{7.22} \]
then the identity is proven if it holds for the trivial initial conditions $L = e_i$ ($i = 0, 1, \ldots, N$). The $q \to 1/q$ version of the left-hand side of (7.21) satisfies (7.22) by Lemma 7.6. For the right-hand side of (7.21) with $q \to 1/q$ it is readily shown [47] that (7.22) holds for $A = B$ and all $L \in \mathbb{Z}^N$ by using modified $q$-binomials obtained by extending the range of $\lambda_2$ in the top line of (1.1) to $\lambda_2 \in \mathbb{Z}$. This implies (7.22) thanks to the following lemma.

Lemma 7.8. Let $X$ be a function of $L \in \mathbb{Z}^N$ satisfying the recurrences
\[ X(L + 2e_A) = X(L + e_{A-1} + e_{A+1}) + q^{\ell_A + A} X(L) \tag{7.23} \]
for all $A = 1, 2, \ldots, N - 1$ and $L \in \mathbb{Z}^N$. Then $X$ fulfills the more general recurrences
\[ X(L + e_A + e_B) = X(L + e_{A-1} + e_{B+1}) + q^{\ell_A + A} X(L + e_{B-A}) \tag{7.24} \]
for all $1 \le A \le B < N$ and $L \in \mathbb{Z}^N$ such that $L_1 = \cdots = L_{B-1} = 0$ if $A < B$.

Proof. Assume that (7.24) is proven for all $1 \le A' \le B' < B$. This is certainly true for $A' = B' = 1$ thanks to (7.23). Using (7.23) successively (with $A$ replaced by $i$) for $i = B, B - 1, \ldots, A$ yields
\[ X(L + e_A + e_B) = X(L + e_{A-1} + e_{B+1}) + \sum_{i=A}^{B} q^{\ell_i + A} X(L + e_A - e_i - e_{i+1} + e_{B+1}). \tag{7.25} \]


Applying (7.23) with $A$ replaced by $B$ to the second term on the right-hand side in (7.25) one obtains
\[ \sum_{i=A}^{B-1} q^{\ell_i + A} \bigl\{ X(L + e_A - e_i - e_{i+1} - e_{B-1} + 2e_B) - q^{\ell_B - 2i + A} X(L + e_A - e_i - e_{i+1} - e_{B-1}) \bigr\} + q^{\ell_B + A} X(L + e_A - e_B). \tag{7.26} \]
Telescoping the last term with the negative terms in the sum at $i = B - 1, B - 2, \ldots, A$ using (7.23) with $A \to i - 1$ yields $q^{\ell_B + A} X(L + e_{A-1} - e_{B-1})$. The positive terms in the sum in (7.26) can be simplified to $q^{\ell_A + A} X(L + e_{B-A-1} - e_{B-1} + e_B)$ by combining successively the term $i = A$ with $i = A + 1, \ldots, B - 1$ using (7.24) with $A \to i - A$ and $B \to i$. Therefore (7.25) becomes
\[ X(L + e_A + e_B) = X(L + e_{A-1} + e_{B+1}) + q^{\ell_A + A} X(L + e_{B-A-1} - e_{B-1} + e_B) + q^{\ell_B + A} X(L + e_{A-1} - e_{B-1}). \]
The last two terms can be combined to $q^{\ell_A + A} X(L + e_{B-A})$ employing (7.24) with $A \to B - A$ and $B \to B - 1$ and using that $L_1 = \cdots = L_{B-1} = 0$. This yields (7.24).

Let us make some comments about the above outlined proof of Theorem 7.7. First, Lemma 7.6 requires $N > 1$. However, thanks to
\[ T_1\bigl((L_1, \ldots, L_N, 0, \ldots, 0), a\bigr) = q^{\frac{M-N}{MN} a^2}\, T_1\bigl((L_1, \ldots, L_N), a\bigr), \]
where the dimension of the vector on the left-hand side is $M$, one can derive the identities (7.21) for all $N \ge 1$ except when $p = 4$. Second, we note that for $L \in \mathbb{Z}^N_{\ge 0}$ the polynomials on the right-hand side of (7.21) indeed remain unchanged by replacing the $q$-binomial with the modified $q$-binomial. Finally, the proof given in [47] used the identities at $L = L e_1$ as initial conditions. The knowledge of these non-trivial identities is not necessary in the above proof. In the discussion section we will conjecture higher-rank analogues of (7.21).

8. Fermionic Representation of the Generalized Kostka Polynomials

In this section we give a fermionic representation of the generalized Kostka polynomials generalizing the Kirillov–Reshetikhin expression (2.5). Recalling the definitions (7.7) we introduce the following function.

Definition 8.1. Let $n \ge 2$ and $N \ge 1$ be integers, $\lambda$ a partition with $\mathrm{height}(\lambda) \le n$ and $L$ an $n \times N$ matrix with entry $L_i^{(a)} \in \mathbb{Z}_{\ge 0}$ in row $a$ and column $i$. Then set $F(L, \lambda) = 0$ if $|\lambda| \neq \sum_{a,i \ge 1} a i L_i^{(a)}$ and otherwise
\[ F(L, \lambda) = \sum_{\alpha} q^{C(\alpha)} \prod_{a,i \ge 1} \begin{bmatrix} P_i^{(a)} + \alpha_i^{(a)} - \alpha_{i+1}^{(a)} \\ \alpha_i^{(a)} - \alpha_{i+1}^{(a)} \end{bmatrix}, \tag{8.1} \]
where the sum is over sequences $\alpha = (\alpha^{(1)}, \alpha^{(2)}, \ldots)$ of partitions such that $|\alpha^{(a)}| = \sum_{j \ge 1} j\, \overline{\ell}_j^{(a)} - (\lambda_1 + \cdots + \lambda_a)$. Furthermore, with the convention that $\alpha_i^{(0)} = 0$,


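\[ P_i^{(a)} = \sum_{k=1}^{i} \bigl(\alpha_k^{(a-1)} - 2\alpha_k^{(a)} + \alpha_k^{(a+1)}\bigr) + \ell_i^{(a)}, \tag{8.2} \]
\[ C(\alpha) = \sum_{a,i \ge 1} \binom{A_i^{(a)} + \alpha_i^{(a-1)} - \alpha_i^{(a)}}{2}, \qquad A_i^{(a)} = \sum_{k \ge i} \sum_{b \ge a} L_k^{(b)}. \tag{8.3} \]
Recalling that $K_{\lambda\mu}(q) = 0$ unless $|\lambda| = |\mu| = \sum_{a,i \ge 1} a i L_i^{(a)}$ we find that $F(L, \lambda) = K(L, \lambda)$ if $L_i^{(a)} = 0$ for $a > 1$ by comparing (8.1) with (2.5). We wish to show that $F(L, \lambda)$ equals the generalized Kostka polynomial $K(L, \lambda)$ for more general $L$. We begin by showing that $F$ obeys the same recurrence relation as $K$.

Lemma 8.1. Let $i, a, N, n \in \mathbb{Z}_{\ge 0}$ such that $1 \le i < N$ and $1 \le a < n$ and let $\lambda$ be a partition with $\mathrm{height}(\lambda) \le n$. Let $L$ be an $n \times N$ matrix with non-negative integer entries such that $L_i^{(a)} \ge 2$. Then
\[ F(L, \lambda) = q^{\overline{\ell}_i^{(a)} - a}\, F(L + e_{i-1}^{(a)} - 2e_i^{(a)} + e_{i+1}^{(a)}, \lambda) + F(L + e_i^{(a-1)} - 2e_i^{(a)} + e_i^{(a+1)}, \lambda). \tag{8.4} \]

Proof. Under the substitution $L \to L + e_{i-1}^{(a)} - 2e_i^{(a)} + e_{i+1}^{(a)}$ the variable $P_j^{(b)}$ and the function $C(\alpha)$ transform as
\[ P_j^{(b)} \to P_j^{(b)} - \delta_{ij}\delta_{ab}, \qquad C(\alpha) \to C(\alpha) - \overline{\ell}_i^{(a)} + a + \alpha_i^{(a)} - \alpha_{i+1}^{(a)}. \tag{8.5} \]
On the other hand, replacing $L \to L + e_i^{(a-1)} - 2e_i^{(a)} + e_i^{(a+1)}$ induces the changes
\[ P_j^{(b)} \to P_j^{(b)} + \min\{i, j\}(\delta_{a-1,b} - 2\delta_{ab} + \delta_{a+1,b}), \qquad C(\alpha) \to C(\alpha) - m_i^{(a)} + i. \tag{8.6} \]
Now apply the $q$-binomial recurrence
\[ \begin{bmatrix} m + n \\ n \end{bmatrix} = q^n \begin{bmatrix} m + n - 1 \\ n \end{bmatrix} + \begin{bmatrix} m + n - 1 \\ n - 1 \end{bmatrix} \tag{8.7} \]
to the $(a, i)$th term in the product in (8.1) (this term cannot be $\begin{bmatrix} 0 \\ 0 \end{bmatrix}$ because of the condition $L_i^{(a)} \ge 2$). Thanks to (8.5) one can immediately recognize the first term of the resulting expression as $q^{\overline{\ell}_i^{(a)} - a} F(L + e_{i-1}^{(a)} - 2e_i^{(a)} + e_{i+1}^{(a)}, \lambda)$. In the second term we perform the variable change $\alpha_j^{(b)} \to \alpha_j^{(b)} + \chi(j \le i)\delta_{ab}$ where recall that $\chi(\mathrm{true}) = 1$ and $\chi(\mathrm{false}) = 0$. Since this leads to exactly the same change in $P_j^{(b)}$ and $C(\alpha)$ as in (8.6), the second term indeed yields $F(L + e_i^{(a-1)} - 2e_i^{(a)} + e_i^{(a+1)}, \lambda)$.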
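The $q$-binomial recurrence (8.7) is the only analytic input of this proof, and it can be confirmed independently of any closed formula. The following sketch assumes the standard combinatorial reading of the $q$-binomial $\begin{bmatrix} m+n \\ n \end{bmatrix}$ as the generating function of partitions with at most $n$ parts, each at most $m$, and sets it to zero when $m$ or $n$ is negative. Polynomials are plain coefficient lists, lowest degree first; the function names are illustrative.

```python
def q_binomial(m, n):
    """Coefficient list of the q-binomial [m+n; n] (zero if m < 0 or n < 0)."""
    if m < 0 or n < 0:
        return [0]
    top = m * n
    # table[c][s] = number of partitions of s into exactly c parts,
    # using only the part sizes processed so far
    table = [[0] * (top + 1) for _ in range(n + 1)]
    table[0][0] = 1
    for part in range(1, m + 1):
        for c in range(1, n + 1):
            for s in range(part, top + 1):
                table[c][s] += table[c - 1][s - part]
    return [sum(table[c][s] for c in range(n + 1)) for s in range(top + 1)]


def rhs_of_recurrence(m, n):
    """Coefficient list of q^n [m+n-1; n] + [m+n-1; n-1]."""
    first = q_binomial(m - 1, n)    # [m+n-1; n]
    second = q_binomial(m, n - 1)   # [m+n-1; n-1]
    out = [0] * max(len(first) + n, len(second))
    for k, coeff in enumerate(first):
        out[k + n] += coeff
    for k, coeff in enumerate(second):
        out[k] += coeff
    return out


def trim(poly):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    poly = list(poly)
    while len(poly) > 1 and poly[-1] == 0:
        poly.pop()
    return poly


for m in range(4):
    for n in range(4):
        if (m, n) != (0, 0):
            print(m, n, trim(q_binomial(m, n)) == trim(rhs_of_recurrence(m, n)))
```

The excluded case $m = n = 0$ is exactly the term $\begin{bmatrix} 0 \\ 0 \end{bmatrix}$ that the proof rules out, since there the right-hand side vanishes while the left-hand side equals 1.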

Theorem 8.2. Let $N \ge 1$, $n \ge 2$ be integers, $\lambda$ a partition with $\mathrm{height}(\lambda) \le n$ and $L$ an $n \times N$ matrix with components $L_i^{(a)} \in \mathbb{Z}_{\ge 0}$ in row $a$ and column $i$. If either $L_i^{(a)} \ge L_i^{(a+2)}$ for all $1 \le a \le n - 2$ and $1 \le i \le N$ or $L_i^{(a)} \ge L_{i+2}^{(a)}$ for all $1 \le a \le n$ and $1 \le i \le N - 2$, then $F(L, \lambda) = K(L, \lambda)$.


Proof. We use $F(L, \lambda) = K(L, \lambda)$ for $L$ such that $L_i^{(a)} = 0$ when $a > 1$ as initial condition. Since $K$ and $F$ both satisfy the recurrences
\[ X(L + e_i^{(a+1)}) = X(L - e_i^{(a-1)} + 2e_i^{(a)}) - q^{\overline{\ell}_i^{(a)} + 1}\, X(L - e_i^{(a-1)} + e_{i-1}^{(a)} + e_{i+1}^{(a)}) \tag{8.8} \]
(compare with (7.9) and (8.4), respectively) the theorem follows immediately for the first set of restrictions on $L$. The second set of restrictions comes about by using the symmetry (7.1) of the generalized Kostka polynomials.

The recurrences (8.8) are not sufficient to prove Theorem 8.2 for $L$ with arbitrary entries $L_i^{(a)} \in \mathbb{Z}_{\ge 0}$. However, we nevertheless believe the theorem to be true for this case as well.

Conjecture 8.3. Let $N \ge 1$, $n \ge 2$ be integers, $\lambda$ a partition with $\mathrm{height}(\lambda) \le n$ and $L$ an $n \times N$ matrix with nonnegative integer entries. Then $F(L, \lambda) = K(L, \lambda)$.

9. Discussion

We believe that there exist many further results for the generalized Kostka polynomials and supernomials. For example, (7.21) admits higher-rank analogues in terms of
\[ T(L, \lambda) = q^{\frac{1}{2}\sum_{i=1}^N \sum_{a,b=1}^{n-1} L_i^{(a)} C^{-1}_{ab} \ell_i^{(b)} - \frac{1}{2N}\sum_{i=1}^n (\lambda_i - \frac{1}{n}|\lambda|)^2}\, S(L, \lambda)\big|_{1/q}, \tag{9.1} \]
where $C^{-1}$ is the inverse of the Cartan matrix of $A_{n-1}$. For integers $n \ge 2$, $N, p \ge 1$ such that $N < p - n$ and any $n \times N$ matrix $L$ with non-negative integer entries such that $\sum_{b=1}^{n-1} C^{-1}_{ab} L_i^{(b)} \in \mathbb{Z}$ for all $i$ and $a$, we conjecture that
\[ \sum_{k_1 + \cdots + k_n = 0} \sum_{\tau \in S_n} \epsilon(\tau)\, q^{\sum_{i=1}^n \{ \frac{1}{2N}(p k_i + \tau_i - i)^2 - \frac{p}{2} k_i^2 + i k_i \}}\, T(L, \lambda(k, \tau)) = \sum_m q^{\frac{1}{2} m (C^{-1} \otimes C) m} \begin{bmatrix} m + n \\ m \end{bmatrix}, \tag{9.2} \]
where the following notation has been used. On the left-hand side the components of $\lambda(k, \tau)$ are given by $\lambda_j(k, \tau) = \frac{1}{n}\sum_{a,i \ge 1} a i L_i^{(a)} + p k_j + \tau_j - j$. On the right-hand side the sum runs over all $m = \sum_{a=1}^{n-1} \sum_{i=1}^{p-n-1} m_i^{(a)} (e_a \otimes e_i)$ with $m_i^{(a)} \in \mathbb{Z}_{\ge 0}$ such that $\sum_{b=1}^{n-1} C^{-1}_{ab} m_i^{(b)} \in \mathbb{Z}$ for all $a = 1, \ldots, n - 1$ and $i = 1, \ldots, p - n - 1$. The variable $n$ is fixed by
\[ (C \otimes I) n + (I \otimes C) m = \sum_{a=1}^{n-1} \sum_{i=1}^{N} L_i^{(a)} (e_a \otimes e_i), \tag{9.3} \]
where $I$ is the identity matrix and $C$ is the Cartan matrix of an A-type Lie algebra. The dimension of the first space in the tensor product is $n - 1$ and that of the second space is $p - n - 1$. Finally we used the notation


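\[ (A \otimes B) m = \sum_{a,b=1}^{n-1} \sum_{i,j=1}^{p-n-1} A_{ab} B_{ij}\, m_j^{(b)} (e_a \otimes e_i), \qquad n (A \otimes B) m = \sum_{a,b=1}^{n-1} \sum_{i,j=1}^{p-n-1} A_{ab} B_{ij}\, n_i^{(a)} m_j^{(b)} \]
and
\[ \begin{bmatrix} m + n \\ m \end{bmatrix} = \prod_{a=1}^{n-1} \prod_{i=1}^{p-n-1} \begin{bmatrix} m_i^{(a)} + n_i^{(a)} \\ m_i^{(a)} \end{bmatrix}. \]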

The identities (9.2) are polynomial analogues of branching function identities of the Rogers–Ramanujan type for $A^{(1)}_{n-1}$ cosets. For $n = 2$ they follow from Theorem 7.7 with $a = b = 1$ and for $L = L(e_1 \otimes e_1)$ they were claimed in [10].¹ Unfortunately, the recurrences of Theorem 7.4 are not sufficient to prove (9.2) for general $n$ and $L$. A proof would require a more complete set of recurrences for the $A_{n-1}$ supernomials analogous to those stated in Lemma 7.6 for $n = 2$.

¹ The proof in [10] seems to be incomplete.

The left-hand side of Eq. (9.2) can be interpreted in terms of paths of a level-$(p - n)$ $A^{(1)}_{n-1}$ lattice model. Denote by $\Lambda_k$ ($0 \le k \le n - 1$) the dominant integral weights of $A^{(1)}_{n-1}$. Then the states $a$ of the lattice model underlying (9.2) are given by the level-$(p - n)$ dominant integral weights, i.e., $a = \sum_{k=0}^{n-1} a_k \Lambda_k$ such that $\sum_{k=0}^{n-1} a_k = p - n$. Define the adjacency matrices $A$ labelled by two states $a, b$ and a Young tableau as $A^{\emptyset}_{a,b} = \chi(a = b)$, $A^{[i]}_{a,b} = \chi(b = a + \Lambda_i - \Lambda_{i-1})$ ($i = 1, \ldots, n$; $\Lambda_n = \Lambda_0$) and recursively $\sum_b A^{T}_{a,b} A^{[i]}_{b,c} = A^{T \cdot [i]}_{a,c}$. Call a path $P = p_L \otimes \cdots \otimes p_1 \in \mathcal{P}_{\lambda\mu}$ admissible with initial state $a^{(1)}$ if $\prod_{i=1}^{L} A^{p_i}_{a^{(i)}, a^{(i+1)}} = 1$, where $a^{(i+1)} = a^{(i)} + \sum_{k=0}^{n-1} \Lambda_k (\lambda_k^{(i)} - \lambda_{k+1}^{(i)})$ for $i = 1, \ldots, L$ and $\lambda^{(i)} = \mathrm{content}(p_i)$. Then, up to an overall factor, (9.2) is an identity for the generating function of admissible paths $P \in \mathcal{P}_{\lambda\mu}$ starting at $a^{(1)} = (p - n)\Lambda_0$ with $\mu$ and $L$ related as in (7.6) and $\lambda = (\frac{|\mu|}{n}, \ldots, \frac{|\mu|}{n})$. The weights of the paths are given by $-H(P)$ with $H$ as defined in (3.7).

Our initial motivation for studying the $A_{n-1}$ supernomials is their apparent relevance to a higher-rank generalization of Bailey’s lemma [6]. Indeed, a Bailey-type lemma involving the supernomials $S_{\lambda\mu}(q)$ such that $\mu^\star$ (or any permutation thereof) is a partition can be formulated. Here we briefly sketch some of our findings. Further details about a Bailey lemma and Bailey chain for $A_2$ supernomials are given in [5], whereas we hope to report more on the general $A_{n-1}$ case in a future publication. Set
\[ f(L, \lambda) = S_{\lambda\mu}(q)/(q)_L \tag{9.4} \]
for $L \in \mathbb{Z}^{n-1}_{\ge 0}$ and $\lambda \in \mathbb{Z}^n_{\ge 0}$ and zero otherwise where $(q)_L = (q)_{L_1} \cdots (q)_{L_{n-1}}$ and $\mu = ((1^{n-1})^{L_{n-1}}, \ldots, (1)^{L_1})$. Here $(1^i)^{L_i}$ denotes $L_i$ components $(1^i)$. Note that $\mu^\star = (1^{L_1} 2^{L_2} \cdots (n-1)^{L_{n-1}})$ is indeed a partition. Let $L = (L_1, \ldots, L_{n-1})$ and $k = (k_1, \ldots, k_n)$ such that $k_1 + \cdots + k_n = 0$ denote arrays of integers and let $\alpha = \{\alpha_k\}_{k_1 \ge \cdots \ge k_n}$, $\gamma = \{\gamma_k\}_{k_1 \ge \cdots \ge k_n}$, $\beta = \{\beta_L\}$ and $\delta = \{\delta_L\}$ be sequences. Then $(\alpha, \beta)$ and $(\gamma, \delta)$ such that
\[ \beta_L = \sum_{\substack{k_1 + \cdots + k_n = 0 \\ k_1 \ge \cdots \ge k_n}} \alpha_k\, f(CL + \ell e_1, L_{n-1}\rho - k + \ell e_n), \tag{9.5} \]
and
\[ \gamma_k = \sum_{L \in \mathbb{Z}^{n-1}} \delta_L\, f(CL + \ell e_1, L_{n-1}\rho - k + \ell e_n) \tag{9.6} \]
are called an $A_{n-1}$ Bailey pair relative to $q^\ell$ and an $A_{n-1}$ conjugate Bailey pair relative to $q^\ell$, respectively. Here $\ell \in \mathbb{Z}_{\ge 0}$, $C$ is the Cartan matrix of $A_{n-1}$ and $\rho$ is the $n$-dimensional Weyl vector $\rho = e_1 + \cdots + e_n$. When $n = 2$, $f(L, \lambda) = 1/(q)_{\lambda_1}(q)_{\lambda_2}$ for $L = \lambda_1 + \lambda_2$ and zero otherwise, and (9.5) and (9.6) reduce (up to factors of $(q)_\ell$) to the usual definition [6] of a Bailey pair and conjugate Bailey pair (after identifying $k = (k_1, k_2) = (k, -k)$),
\[ \beta_L = \sum_{k=0}^{L} \frac{\alpha_k}{(q)_{L-k}(q)_{L+k+\ell}} \quad\text{and}\quad \gamma_k = \sum_{L=k}^{\infty} \frac{\delta_L}{(q)_{L-k}(q)_{L+k+\ell}}. \]
Analogous to the $A_1$ case the $A_{n-1}$ Bailey pair and conjugate Bailey pair satisfy
\[ \sum_{\substack{k_1 + \cdots + k_n = 0 \\ k_1 \ge \cdots \ge k_n}} \alpha_k \gamma_k = \sum_{L \in \mathbb{Z}^{n-1}} \beta_L \delta_L. \tag{9.7} \]
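For $n = 2$ the relation (9.7) is an interchange of summations in the two formulas for $\beta_L$ and $\gamma_k$ quoted above, which can be illustrated with exact rational arithmetic. The sketch below assumes finitely supported sequences $\alpha$ and $\delta$ (given as Python lists, so all sums are finite) and evaluates everything at a rational value of $q$; the sample sequences, the value of $q$ and the function names are arbitrary choices made for the illustration.

```python
from fractions import Fraction


def q_pochhammer(q, k):
    """(q)_k = (1 - q)(1 - q^2) ... (1 - q^k)."""
    result = Fraction(1)
    for j in range(1, k + 1):
        result *= 1 - q ** j
    return result


def kernel(q, big_l, k, ell):
    """1 / ((q)_{L-k} (q)_{L+k+ell}), taken to be zero when L < k."""
    if big_l < k:
        return Fraction(0)
    return 1 / (q_pochhammer(q, big_l - k) * q_pochhammer(q, big_l + k + ell))


def beta(q, alpha, big_l, ell):
    """beta_L = sum_{k=0}^{L} alpha_k * kernel, for finitely supported alpha."""
    upper = min(big_l, len(alpha) - 1)
    return sum(alpha[k] * kernel(q, big_l, k, ell) for k in range(upper + 1))


def gamma(q, delta, k, ell):
    """gamma_k = sum_{L >= k} delta_L * kernel, for finitely supported delta."""
    return sum(delta[big_l] * kernel(q, big_l, k, ell)
               for big_l in range(k, len(delta)))


q = Fraction(1, 3)
ell = 1
alpha = [Fraction(1), Fraction(-2), Fraction(3)]
delta = [Fraction(2), Fraction(0), Fraction(5), Fraction(-1)]

lhs = sum(alpha[k] * gamma(q, delta, k, ell) for k in range(len(alpha)))
rhs = sum(beta(q, alpha, big_l, ell) * delta[big_l] for big_l in range(len(delta)))
print(lhs == rhs)
```

The comparison is exact because `Fraction` arithmetic avoids rounding; for infinite sequences one would additionally need the convergence conditions of Bailey's lemma, which this sketch does not address.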

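For $n \ge 2$, $N \ge 1$ we now claim the following $A_{n-1}$ conjugate Bailey pair relative to $q^\ell$. Choose integers $\lambda_j^{(a)} \ge 0$ ($a = 1, \ldots, n - 1$, $j = 1, \ldots, N - 1$) and $\sigma$ such that
\[ \ell - \sum_{a=1}^{n-1} \sum_{i=1}^{N-1} a i \lambda_i^{(a)} + \sigma N \equiv 0 \pmod{n}. \tag{9.8} \]
Setting $\lambda = \sum_{a=1}^{n-1} \sum_{i=1}^{N-1} \lambda_i^{(a)} (e_a \otimes e_i)$ and $k = k(L)$, such that $k_i(L) = L_i - L_{i+1}$ ($L_n = 0$, $L_{n+1} = L_1$, so that $\sum_{i=1}^n k_i = 0$) the $(\gamma, \delta)$ pair
\[ \begin{aligned}
\gamma_{k(L)} &= \frac{q^{\frac{1}{2N}(LCL + 2\ell L_1)}}{(q)_\infty^{n-1}} \sum_n \frac{q^{\frac{1}{2} n (C \otimes C^{-1}) n - n (I \otimes C^{-1}) \lambda}}{(q)_n}, \\
\delta_L &= q^{\frac{1}{2N}(LCL + 2\ell L_1)} \sum_n q^{\frac{1}{2} n (C \otimes C^{-1}) n - n (I \otimes C^{-1}) \lambda} \begin{bmatrix} m + n \\ n \end{bmatrix}
\end{aligned} \tag{9.9} \]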

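satisfies (9.6). The summations in (9.9) run over all $n = \sum_{a=1}^{n-1} \sum_{i=1}^{N-1} n_i^{(a)} (e_a \otimes e_i)$ such that
\[ \sum_{j=1}^{N-1} C^{-1}_{1j} n_j^{(a)} \in \mathbb{Z} + \frac{L_a + \ell C^{-1}_{a1} - \sum_{b=1}^{n-1} \sum_{k=1}^{N-1} k\, C^{-1}_{ab} \lambda_k^{(b)}}{N} - \frac{a\sigma}{n}, \qquad a = 1, \ldots, n - 1. \tag{9.10} \]
In the expression for $\delta_L$ the variable $m$ is related to $n$ by
\[ (C \otimes I) n + (I \otimes C) m = (CL + \ell e_1) \otimes e_{N-1} + \lambda. \tag{9.11} \]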

Inserting (9.9) into (9.7) yields a rank $n - 1$ and level $N$ version of Bailey’s lemma. Indeed, when $\lambda = e_a \otimes e_i$, $\gamma_k$ is proportional to the level-$N$ $A^{(1)}_{n-1}$ string function in


the representation given by Georgiev [14]. When $n = 2$ the pair $(\gamma, \delta)$ of Eq. (9.9) reduces to the conjugate Bailey pair of refs. [45, 46]. The identities in (9.2) provide $A_{n-1}$ Bailey pairs relative to 1. We remark that Milne and Lilly [38, 39] also considered higher-rank generalizations of Bailey’s lemma. However, their definition of an $A_{n-1}$ Bailey pair is different from the one above, and in particular we note that the function $f$ is not $q$-hypergeometric for $n > 2$.

Acknowledgement. We would like to thank M. Okado for generously sharing some of his unpublished notes on energy functions of lattice models and for drawing our attention to ref. [40]. Many thanks to A. Lascoux for providing us with copies of his papers and to D. Dei Cont for travelling all the way to Italy to get us a copy of ref. [36]. Furthermore, we would like to acknowledge useful discussions with P. Bouwknegt, O. Foda, R. Kedem, A. Kuniba, B. McCoy and A. Nakayashiki. AS has been supported by the “Stichting Fundamenteel Onderzoek der Materie” which is financially supported by the Dutch foundation for scientific research NWO. SOW has been supported by a fellowship of the Royal Netherlands Academy of Arts and Sciences.

Note added. After submission, several papers [25, 29], [49]–[52] with considerable overlap with this work have appeared. The generalized Kostka polynomials studied in this paper were also introduced in [52] as special types of Poincaré polynomials and further studied in [49]–[51]. In refs. [25, 29] it was conjectured that the generalized Kostka polynomials coincide with special cases of spin generating functions of ribbon tableaux [34] and that the fermionic representation (8.1) of the generalized Kostka polynomials is the generating function of rigged configurations. This last conjecture has now been established in [28]. We are indebted to Mark Shimozono for his questions and comments which led to several refinements of the paper. We also thank him for pointing out that the analogues of the recurrences of Lemma 7.6 for the (cocharge) Kostka polynomials have occurred in [42].

Appendix A. Proof of Lemma 5.6

Obviously, the following lemma implies Lemma 5.6.

Lemma A.1. Let $P \in \mathcal{P}_\mu$ be a path over $\{1, 2, \ldots, M\}$, where $M = |\mu|$ and set $Q = \mathcal{C}_p(P)$. Suppose $M$ is contained in step $p_i$ of $P = p_L \otimes \cdots \otimes p_1$.
(i) If $M$ is contained in the $(i-1)$th step of $\sigma_{i-1}(P)$, then $h_{i-1}(Q) - h_{i-1}(P) = 0$.
(ii) If $M$ is contained in the $i$th step of $\sigma_{i-1}(P)$, then $h_{i-1}(Q) - h_{i-1}(P) = -1$.
(iii) If $M$ is contained in the $(i+1)$th step of $\sigma_i(P)$, then $h_i(Q) - h_i(P) = 0$.
(iv) If $M$ is contained in the $i$th step of $\sigma_i(P)$, then $h_i(Q) - h_i(P) = 1$.

This lemma, in turn, follows from the next lemma. The height of an entry $M$ in a Young tableau is defined to be $i$ if $M$ is in the $i$th row from the bottom.

Lemma A.2. For $\mu, \mu' \in \mathcal{R}$, let $p \in \mathcal{B}_\mu$ and $p' \in \mathcal{B}_{\mu'}$ such that each entry in $p \cdot p'$ occurs at most once and $p$ contains the largest entry $M$. Then
(i) $\mathrm{shape}(\mathcal{C}_p(p) \cdot p') \neq \mathrm{shape}(p \cdot p')$ if and only if the height of $M$ in $p \cdot p'$ is bigger than $\mathrm{height}(p)$, and
(ii) $0 \le h(p \otimes p') - h(\mathcal{C}_p(p) \otimes p') \le 1$.

Before we prove Lemma A.2 let us first show that it indeed implies Lemma A.1.

Proof of Lemma A.1. (i) Let p̃_i ⊗ p̃_{i−1} := σ(p_i ⊗ p_{i−1}). The steps p_i and p̃_{i−1} have the same shape and by assumption they both contain M. Since p_i · p_{i−1} = p̃_i · p̃_{i−1} we conclude that the height of M in p_i · p_{i−1} has to be height(p_i). By (i) of Lemma A.2 it follows that shape(C_p(p_i) · p_{i−1}) = shape(p_i · p_{i−1}), which proves h_{i−1}(Q) − h_{i−1}(P) = 0.

(ii) Again we denote p̃_i ⊗ p̃_{i−1} := σ(p_i ⊗ p_{i−1}). By assumption p_i and p̃_i contain M. The equation p_i · p_{i−1} = p̃_i · p̃_{i−1} can only hold if the box with entry M has been bumped at least once. But this implies that the height of M in p_i · p_{i−1} is bigger than height(p_i). By (i) of Lemma A.2 this means that h_{i−1}(Q) ≠ h_{i−1}(P), and by (ii) of Lemma A.2 the difference has to be −1.

(iii) This point can be proven analogously to (i).

(iv) Let us show that this case follows from (ii) by considering P′ = C_p^{−1} ∘ Ω_p(P). The path P′ satisfies the conditions of case (ii) with i → L + 1 − i since σ_i commutes with C_p^{−1} due to (5.8) and (5.9) and since (5.6) holds. Hence h_{L−i}(C_p(P′)) − h_{L−i}(P′) = −1, which is equivalent to h_{L−i}(Ω_p(P)) − h_{L−i}(Ω_p ∘ C_p(P)) = −1 by inserting the definition of P′ and using (5.9) and Ω_p² = Id. Finally employing (5.5) proves (iv).

Proof of Lemma A.2. Let p′ = [w] with w = w_N · · · w_1 in row-representation. Define p^(0) = p and p^(i+1) = p^(i) · [w_{N−i}] for i = 0, 1, . . . , N − 1. Then obviously p^(N) = p · p′. We will show inductively that either M got bumped in p^(i) (which implies that the height of M is bigger than height(p)) or the action of C_p on p^(i) is still described by the inverse sliding mechanism starting at the largest element M and ending in the bottom left corner. We prove this claim by induction on i. The initial condition is satisfied since C_p acts on p^(0) = p by the inverse sliding mechanism by definition.
To prove the induction step suppose that M did not yet get bumped in p^(i) (if it has been bumped in p^(i) then this is also true for p^(k) with i ≤ k ≤ N and we are finished). By the induction hypothesis the action of C_p on p^(i) is still given by the inverse sliding mechanism. It is useful to denote the boxes in p^(i) affected by the inverse sliding pictorially by drawing one of two line segments [diagrams not reproduced] for a > b and a < b, respectively, if the corresponding boxes of p^(i) contain the entries a, c, b [box diagram not reproduced]. [An example tableau p^(i) with the line drawn in is not reproduced.]

The dot indicates the position of M in p^(i). Comparing with Fig. 2 we see that the line traces exactly the movement of an empty box under the inverse sliding mechanism. We now wish to insert [w_{N−i}] by the Schensted bumping algorithm to obtain p^(i+1), i.e., [w_{N−i}] gets inserted in the first row of p^(i) and possibly bumps another box to the second row and so on. Let us label the boxes of p^(i) which get bumped when inserting [w_{N−i}] by a cross. Two things may happen:

(1) None of the boxes depicted by [diagram omitted] and [diagram omitted] contain a cross. (We include [diagram omitted] in the set of boxes [diagram omitted].)
or
(2) There are boxes [diagram omitted] which contain a cross.

If (1) occurs there can be at most one box containing both a line and a cross. One may easily see that the line of p^(i) also describes the route of the inverse sliding mechanism in p^(i+1) and that M does not get bumped. Hence we are finished in this case. If (2) occurs, all boxes vertically above [diagram omitted] or [diagram omitted], up to and including either (a) [diagram omitted] or (b) [diagram omitted], must also contain a cross. In case (a) the line indicating the inverse sliding changes from p^(i) to p^(i+1) as illustrated in Fig. 3.

Fig. 3. Change of inverse sliding route from p^(i) to p^(i+1). [Figure not reproduced; it shows two configurations, each with the route in p^(i) and the resulting route in p^(i+1).]

In case (b) the box marked by the dot contains a cross and hence M got bumped. This concludes the proof of the claim.

Observe that, as long as cases (1) or (2a) occur, M does not get bumped and shape(p^(i+1)) = shape(C_p(p^(i+1))) since C_p is still described by the inverse sliding mechanism. If, however, case (2b) occurs for p^(i), which implies that M got bumped in p^(i+1), then shape(p^(i+1)) and shape(C_p(p^(i+1))) differ. This is so since p^(i) must contain one of two configurations [diagrams omitted], where the dashed lines indicate possible other boxes and the number of the vertically aligned crossed boxes may of course vary (but at least the box containing the dot must also contain a cross). Suppose the lowest crossed box in the vertical line below [diagram omitted] is in row k. In comparison with p^(i), the shape of p^(i+1) has one more box above height(p). In the shape of C_p(p^(i+1)) = C_p(p^(i)) · [w_{N−i}], this box has been moved to row k. One may also easily see that shape(p^(j)) and shape(C_p(p^(j))) for j > i differ by moving exactly one box from above height(p) to the kth row. This proves Lemma A.2.
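The Schensted bumping step used in the proof of Lemma A.2 can be illustrated with a minimal sketch. It assumes the standard row-insertion convention (rows weakly increasing, the first row listed first, an entry bumps the leftmost entry strictly larger than itself); the function name `row_insert` and the sample tableau are illustrative and not taken from the paper.

```python
from bisect import bisect_right


def row_insert(tableau, x):
    """Schensted row insertion of x into a tableau given as a list of rows.

    Returns the new tableau and the list of entries that were bumped,
    in the order in which they moved to the next row.
    """
    rows = [list(r) for r in tableau]  # work on a copy
    bumped = []
    for row in rows:
        j = bisect_right(row, x)       # leftmost entry strictly greater than x
        if j == len(row):
            row.append(x)              # x fits at the end: insertion stops here
            return rows, bumped
        row[j], x = x, row[j]          # x takes the place of row[j], which is bumped
        bumped.append(x)
    rows.append([x])                   # the last bumped entry starts a new row
    return rows, bumped


# Inserting 2 bumps 3 out of the first row, and 3 in turn bumps the
# largest entry 6 into a new row, so the shape grows above the old height.
new_tableau, bumped_entries = row_insert([[1, 3, 5], [2, 6]], 2)
```

In this example the largest entry is bumped, which is the situation called case (2b) in the proof; if no bumping chain reaches the largest entry, the new box is added elsewhere in the shape.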


B. Example of a Cyclage-graph

Figure 4 shows the poset structure of LRT(·, µ) for µ = ((2), (2), (1^2)). A black arrow from LR tableau T to LR tableau T′ means T′ = C(T). A white arrow indicates that T′ and T are related by a modified λ-cyclage (as defined in Sect. 6) other than the modified initial cyclage, i.e. T′ = Z_λ(T) for some shape λ but T′ ≠ C(T).

Fig. 4. The cyclage-graph T(µ) for µ = ((2), (2), (1^2)). [Figure not reproduced.]

References

1. Andrews, G.E.: Multiple series Rogers–Ramanujan type identities. Pacific J. Math. 114, 267–283 (1984)
2. Andrews, G.E.: Schur’s theorem, Capparelli’s conjecture and q-trinomial coefficients. Contemp. Math. 166, 141–154 (1994)
3. Andrews, G.E. and Baxter, R.J.: Lattice gas generalization of the hard hexagon model. III. q-trinomial coefficients. J. Stat. Phys. 47, 297–330 (1987)
4. Andrews, G.E., Baxter, R.J. and Forrester, P.J.: Eight-vertex SOS model and generalized Rogers–Ramanujan-type identities. J. Stat. Phys. 35, 193–266 (1984)
5. Andrews, G.E., Schilling, A. and Warnaar, S.O.: An A_2 Bailey lemma and Rogers–Ramanujan-type identities. math.QA/9807125, to appear in J. Amer. Math. Soc.
6. Bailey, W.N.: Identities of the Rogers–Ramanujan type. Proc. London Math. Soc. (2) 50, 1–10 (1949)
7. Butler, L.M.: A unimodality result in the enumeration of subgroups of a finite abelian group. Proc. Am. Math. Soc. 101, 771–775 (1987)
8. Butler, L.M.: Generalized flags in finite abelian p-groups. Discrete Appl. Math. 34, 67–81 (1991)
9. Butler, L.M.: Subgroup lattices and symmetric functions. Memoirs of the Am. Math. Soc., no. 539, vol. 112, 1994
10. Dasmahapatra, S.: On the combinatorics of row and corner transfer matrices of the A^(1)_{n−1} restricted face models. Int. J. Mod. Phys. A 12, 3551–3586 (1997)
11. Dasmahapatra, S. and Foda, O.: Strings, paths, and standard tableaux. Int. J. Mod. Phys. A 13, 501–522 (1998)
12. Date, E., Jimbo, M., Kuniba, A., Miwa, T. and Okado, M.: Paths, Maya diagrams and representations of ŝl(r, C). Adv. Stud. Pure Math. 19, 149–191 (1989)
13. Fulton, W.: Young tableaux: with applications to representation theory and geometry. London Math. Soc. student texts 35, Cambridge: Cambridge University Press, 1997
14. Georgiev, G.: Combinatorial constructions of modules for infinite-dimensional Lie algebras, II. Parafermionic space. q-alg/9504024
15. Hatayama, G., Kirillov, A.N., Kuniba, A., Okado, M., Takagi, T. and Yamada, Y.: Character formulae of ŝl_n-modules and inhomogeneous paths. Nucl. Phys. B 536 [PM], 575–616 (1998)
16. Jimbo, M., Miwa, T. and Okado, M.: Local state probabilities of solvable lattice models: An A^(1)_{n−1} family. Nucl. Phys. B 300 [FS22], 74–108 (1988)
17. Kang, S.-J., Kashiwara, M., Misra, K.C., Miwa, T., Nakashima, T., Nakayashiki, A.: Affine crystals and vertex models. Int. J. Mod. Phys. A Suppl. 1A, 449–484 (1992)
18. Kang, S.-J., Kashiwara, M., Misra, K.C., Miwa, T., Nakashima, T., Nakayashiki, A.: Perfect crystals of quantum affine Lie algebras. Duke Math. J. 68, 499–607 (1992)
19. Kashiwara, M.: On crystal bases of the q-analogue of universal enveloping algebras. Duke Math. J. 63, 465–516 (1991)
20. Kashiwara, M.: Crystal bases of modified quantized enveloping algebras. Duke Math. J. 73, 383–413 (1994)
21. Kashiwara, M. and Nakashima, T.: Crystal graph for representations of the q-analogue of classical Lie algebras. J. Alg. 165, 295–345 (1994)
22. Kedem, R., Klassen, T.R., McCoy, B.M. and Melzer, E.: Fermionic quasi-particle representations for characters of (G^(1))_1 × (G^(1))_1 / (G^(1))_2. Phys. Lett. B 304, 263–270 (1993)
23. Kedem, R., Klassen, T.R., McCoy, B.M. and Melzer, E.: Fermionic sum representations for conformal field theory characters. Phys. Lett. B 307, 68–76 (1993)
24. Kirillov, A.N.: Dilogarithm identities. Prog. Theor. Phys. Suppl. 118, 61–142 (1995)
25. Kirillov, A.N.: New combinatorial formula for modified Hall–Littlewood polynomials. math.QA/9803006
26. Kirillov, A.N., Kuniba, A. and Nakanishi, T.: Skew Young diagram method in spectral decomposition of integrable lattice models II: Higher levels. q-alg/9711009
27. Kirillov, A.N. and Reshetikhin, N.Yu.: The Bethe Ansatz and the combinatorics of Young tableaux. J. Soviet Math. 41, 925–955 (1988)
28. Kirillov, A.N., Schilling, A. and Shimozono, M.: A bijection from Littlewood–Richardson tableaux to rigged configurations. math.CO/9901037
29. Kirillov, A.N. and Shimozono, M.: A generalization of the Kostka–Foulkes polynomials. math.QA/9803062
30. Knuth, D.E.: Permutations, matrices and generalized Young tableaux. Pacific J. Math. 34, 709–727 (1970)
31. Kuniba, A., Misra, K.C., Okado, M., Takagi, T. and Uchiyama, J.: Paths, Demazure crystals and symmetric functions. To appear in Nankai-CRM proceedings Extended and quantum algebras and their applications to physics, q-alg/9612018
32. Lascoux, A.: Cyclic permutations on words, tableaux and harmonic polynomials. In: Proc. Hyderabad Conference on Algebraic Groups 1989, Madras: Manoj Prakashan, 1991, pp. 323–347
33. Lascoux, A., Leclerc, B. and Thibon, J.-Y.: Crystal graphs and q-analogues of weight multiplicities for the root system A_n. Lett. Math. Phys. 35, 359–374 (1995)
34. Lascoux, A., Leclerc, B. and Thibon, J.-Y.: Ribbon tableaux, Hall–Littlewood functions, quantum affine algebras, and unipotent varieties. J. Math. Phys. 38, 1041–1068 (1997)
35. Lascoux, A. and Schützenberger, M.P.: Sur une conjecture de H.O. Foulkes. CR Acad. Sci. Paris 286A, 323–324 (1978)
36. Lascoux, A. and Schützenberger, M.P.: Le monoid plaxique. Quaderni della Ricerca scientifica 109, 129–156 (1981)
37. Macdonald, I.G.: Symmetric functions and Hall polynomials, Oxford: Oxford University Press, second edition, 1995
38. Milne, S.C. and Lilly, G.M.: The A_ℓ and C_ℓ Bailey transform and lemma. Bull. Am. Math. Soc. (N.S.) 26, 258–263 (1992)
39. Milne, S.C. and Lilly, G.M.: Consequences of the A_ℓ and C_ℓ Bailey transform and Bailey lemma. Discrete Math. 139, 319–346 (1995)
40. Nakayashiki, A. and Yamada, Y.: Kostka polynomials and energy functions in solvable lattice models. Selecta Math. (N.S.) 3, 547–599 (1997)
41. Okado, M.: Private communication
42. Regonati, F.: Sui numeri dei sottogruppi di dato ordine dei p-gruppi abeliani finiti. Istit. Lombardo (Rend. Sc.) A 122, 369–380 (1988)
43. Schensted, C.: Longest increasing and decreasing subsequences. Canad. J. Math. 13, 179–191 (1961)
44. Schilling, A.: Multinomials and polynomial bosonic forms for the branching functions of the ŝu(2)_M × ŝu(2)_N / ŝu(2)_{N+M} conformal coset models. Nucl. Phys. B 467, 247–271 (1996)
45. Schilling, A. and Warnaar, S.O.: A higher-level Bailey lemma. Int. J. Mod. Phys. B 11, 189–195 (1997)
46. Schilling, A. and Warnaar, S.O.: A higher-level Bailey lemma: Proof and application. The Ramanujan Journal 2, 327–349 (1998)
47. Schilling, A. and Warnaar, S.O.: Supernomial coefficients, polynomial identities and q-series. The Ramanujan Journal 2, 459–494 (1998)
48. Schützenberger, M.P.: Quelques remarques sur une construction de Schensted. Math. Scand. 12, 117–128 (1963)
49. Shimozono, M.: A cyclage poset structure for Littlewood–Richardson tableaux. math.QA/9804037
50. Shimozono, M.: Multi-atoms and monotonicity of generalized Kostka polynomials. math.QA/9804038
51. Shimozono, M.: Affine type A crystal structure on tensor products of rectangles, Demazure characters, and nilpotent varieties. math.QA/9804039
52. Shimozono, M. and Weyman, J.: Graded characters of modules supported in the closure of a nilpotent conjugacy class. math.QA/9804036
53. Warnaar, S.O.: The Andrews–Gordon identities and q-multinomial coefficients. Commun. Math. Phys. 184, 203–232 (1997)

Communicated by T. Miwa

Commun. Math. Phys. 202, 403 – 409 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

The Global Defect Index

Stefan Bechtluft-Sachs, Marco Hien

Naturwissenschaftliche Fakultät I, Universität Regensburg, Universitätsstraße 31, 93053 Regensburg, Germany. E-mail: [email protected]; [email protected]

Received: 8 June 1998 / Accepted: 21 October 1998

Abstract: We show how far the local defect index determines the behaviour of an ordered medium in the vicinity of a defect.

1. Introduction

A rough model for an ordered medium may be constructed by a manifold M encoding the positions of the particles in space and a map M → V from M to the so-called order parameter space V or, more generally, a section σ : M → E in some fibre bundle E over M with typical fibre V, which describes additional degrees of freedom. We are interested in the consequences imposed on this situation merely by topology, i.e. by continuity assumptions on σ only. In general a bundle E → M does not admit a section on all of M but only on the complement M \ Δ̃ of some defect Δ̃ ⊂ M. Even if the bundle is trivial there may occur defects Δ̃ which cannot be removed by changing σ in the vicinity of Δ̃ only.

In a variety of examples the defect set is a submanifold (see e.g. [4]). In this case the section is called regularly defected. In the present investigation regularity will be tacitly assumed. An arc component Δ ⊂ Δ̃ of the defect then has a well defined normal bundle N → Δ, and the behaviour of σ in the vicinity of this defect component is described by the restriction of σ to the sphere bundle SN of N, i.e. by a bundle map σ : SN → E|_Δ.

Definition 1. The local defect index of a regularly defected cross section at p ∈ Δ̃ is the homotopy class ι_p(σ) := [σ_p] ∈ [SN_p, E_p], where σ_p : SN_p → E_p denotes the restriction of σ to the fibres over p ∈ Δ̃. A regularly defected cross section is called topologically stable, if for every arc component Δ ⊂ Δ̃ the local defect index ι_p(σ) at some (hence every) point p ∈ Δ is nontrivial.


The main objective of this work is to show that the local defect index does not in general suffice to determine the global behaviour of a defect mapping along the defect component Δ. This is more precisely described by the fibre homotopy class of σ over Δ, which we will refer to as the global defect index of σ. Recall that two mappings σ_0, σ_1 : SN → E|_Δ over Δ are called fibre homotopic, if there is a homotopy H between them consisting of mappings H_t which commute with the projections of the two bundles. By [SN, E]_Δ we denote the set of fibre homotopy classes of mappings SN → E|_Δ over Δ and by [SN, E]^α_Δ the set of fibre homotopy classes of maps σ : SN → E|_Δ over Δ, whose local defect index at p ∈ Δ equals a given α ∈ [SN_p, E_p].

In the examples we have in mind we may assume that the normal bundle as well as E are trivial. Nontrivial bundles are treated in [1]. There is a long exact sequence involving the Whitehead product (Theorem 1), from which the set [SN, E]^α_Δ can be computed by dividing out the action of the fundamental group of the mapping space Map_α(S^{n−1}, V) of the fibres, see (1). As examples we explicitly treat nematics (Proposition 1), the superfluid dipole-free A-phase of ³He (Proposition 2), and (in Proposition 3) the case where V is an H-space, a Lie group for instance. The latter appears in the theory of the superfluid dipole-locked A-phase of ³He where V = SO(3). In the case of nematics (see also [3]) there are 4 different types of global defect indices sharing the only nontrivial local defect index. In the other cases above even infinitely many global defect indices with the same local defect index occur.

Single unknotted ring defects in R^3 were considered in [9]. The configurations with only one unknotted ring defect are described by the set [R^3 \ S^1, V] = [S^2 ∨ S^1, V] = π_2(V) × π_1(V)/θ, where θ is the action of π_1(V). Our treatment admits other defects but identifies configurations which are homotopic near the defect component Δ.
Thus we are interested in the set [S^1 × S^1, V], which we compute from the long exact sequence of Theorem 1. After dividing out the action of π_1(V), this is related to [S^2 ∨ S^1, V] by an exact sequence

\[ \pi_2(V) \to [S^2 \vee S^1, V]_{bp} \to [S^1 \times S^1, V]_{bp} \to \pi_1(V), \]

where [·, ·]_bp denotes homotopy classes relative to the basepoint.

2. The Whitehead-Sequence

Let SN := Δ × S^{n−1} and E := Δ × V be trivial bundles. The Exponential Law (see e.g. [2], p. 438) gives us a canonical bijection

\[ [\Delta \times S^{n-1}, \Delta \times V]_\Delta \to [\Delta, \mathrm{Map}(S^{n-1}, V)] \]

between the set of fibre homotopy classes of mappings SN → E and a set of ordinary homotopy classes. Here we denote by Map(X, Y) the mapping space equipped with the compact-open topology. For a class α ∈ [X, Y] we write Map_α(X, Y) for the arc-component of Map(X, Y) containing any (hence every) representative of α.

Let us take a closer look at the case Δ = S^m. If we consider a fixed local defect index α as an element in [S^{n−1}, V] we now know that after choosing basepoints we have a canonical bijection

\[ [S^m \times S^{n-1}, S^m \times V]^\alpha_{S^m} \approx \pi_m(\mathrm{Map}_\alpha(S^{n-1}, V)) / \pi_1(\mathrm{Map}_\alpha(S^{n-1}, V)), \tag{1} \]

where the right-hand side is the quotient of the canonical action of the fundamental group π_1(Map_α(S^{n−1}, V)) on the higher homotopy group of this space. In order to calculate the set [S^m × S^{n−1}, S^m × V]^α_{S^m} we therefore have to determine the m-th homotopy group of the mapping space Map_α(S^{n−1}, V) and the action of its fundamental group. The first part was done by G. W. Whitehead in [12] and we will summarize his results: Let s_0 ∈ S^{n−1} and v_0 ∈ V be the basepoints and let F_α denote the subspace Map_α((S^{n−1}, s_0), (V, v_0)) of basepoint preserving mappings homotopic to α ∈ π_{n−1}(V, v_0). Then the evaluation map

\[ \tau_\alpha : G_\alpha := \mathrm{Map}_\alpha(S^{n-1}, V) \to V, \qquad f \mapsto f(s_0) \]

is a Hurewicz-fibration with fibre F_α and so induces the long exact homotopy sequence

\[ \cdots \to \pi_{m+1}(V) \xrightarrow{\partial} \pi_m(F_\alpha) \to \pi_m(G_\alpha) \to \pi_m(V) \to \cdots . \tag{2} \]

The homeomorphism S^m ∧ S^{n−1} ≅ S^{m+n−1} induces an isomorphism

\[ \varphi_\alpha : \pi_m(F_\alpha) \cong \pi_{m+n-1}(V). \tag{3} \]

The composition

\[ \pi_{m+1}(V) \xrightarrow{\partial} \pi_m(F_\alpha) \xrightarrow{\varphi_\alpha} \pi_{m+n-1}(V) \]

is the Whitehead product from the left with α ([12]). Inserting this into the sequence (2) we obtain the following result.

Theorem 1 (G.W. Whitehead). If ρ_α : π_{m+1}(V) → π_{m+n−1}(V), β ↦ [α, β] denotes the Whitehead product with α ∈ π_{n−1}(V), we have the following long exact sequence:

\[ \cdots \to \pi_{m+1}(V) \xrightarrow{\rho_\alpha} \pi_{m+n-1}(V) \to \pi_m(\mathrm{Map}_\alpha(S^{n-1}, V)) \to \pi_m(V) \to \cdots . \]

3. Applications

We now want to give some applications of Theorem 1 of physical importance. For this we consider regularly defected cross sections of the trivial bundle S^3 × V → S^3 with a defect set Δ̃ := Δ_1 ∪ . . . ∪ Δ_r ∪ p_1 ∪ . . . ∪ p_s consisting of connected closed 1-dimensional submanifolds Δ_i ⊂ S^3 and points p_j ∈ S^3. Such a regularly defected cross section is just a continuous mapping S^3 \ Δ̃ → V to the order parameter space V. The physical interpretation is that this mapping defines an ordering of the considered medium, continuous everywhere except at the defect set Δ̃.

In order to study the behaviour of such a mapping at a defect component Δ ≅ S^1 we consider the induced mapping σ : SN → Δ × V from the sphere normal bundle SN → Δ of Δ ⊂ S^3 and its fibre homotopy class, as we have done before. If we require its orientability then the bundle SN → Δ is automatically trivial and hence we can restrict ourselves to the case SN = S^1 × S^1 → S^1. Thus the desired homotopy classes will be elements of [S^1 × S^1, V]. We shall denote by [S^1 × S^1, V]^* the subset of all those classes whose restriction to the fibre S^1 × 1 is nontrivial.

3.1. Nematics. Consider the case V := RP^2. This is the order parameter space for nematic liquid crystals and in this situation we have the following result that can also be found in [3] where a different proof is given:


Proposition 1. We have

\[ \#[S^1 \times S^1, \mathbb{RP}^2]^* = 4. \tag{4} \]

Proof. From π_1(RP^2) = Z_2 we know that [S^1, RP^2] has two elements and therefore we have

\[ [S^1 \times S^1, \mathbb{RP}^2]^* = [S^1 \times S^1, \mathbb{RP}^2]^\alpha \approx \pi_1(\mathrm{Map}_\alpha(S^1, \mathbb{RP}^2)) / \sim, \]

where α ∈ [S^1, RP^2] denotes the nontrivial element and the right-hand side of the equation denotes the quotient of the conjugation operation of the fundamental group on itself. But the group π_1(Map_α(S^1, RP^2)) may be found in the exact sequence of Theorem 1

\[ \pi_2(\mathbb{RP}^2) \xrightarrow{\rho^2_\alpha} \pi_2(\mathbb{RP}^2) \to \pi_1(\mathrm{Map}_\alpha(S^1, \mathbb{RP}^2)) \to \pi_1(\mathbb{RP}^2) \xrightarrow{\rho^1_\alpha} \pi_1(\mathbb{RP}^2). \tag{5} \]

The action of π_1(RP^2) on π_2(RP^2) is nontrivial. We have ρ^2_α(β) = α · β − β = −2β for every β ∈ π_2(RP^2). At the right end of the diagram the groups are abelian. Hence the Whitehead product ρ^1_α is trivial. Thus (5) yields the exact sequence

\[ \mathbb{Z} \xrightarrow{\cdot(-2)} \mathbb{Z} \to \pi_1(\mathrm{Map}_\alpha(S^1, \mathbb{RP}^2)) \to \mathbb{Z}_2 \to 0, \]

which gives immediately that #π_1(Map_α(S^1, RP^2)) = 4. As every group with four elements is abelian, the number of elements does not change when passing to free homotopy classes and the assertion is proved.

These four homotopy classes can be explicitly described as follows (see [3]). For any ζ ∈ S^1 we denote by [ζ] ∈ RP^2 its image under the mapping S^1 ↪ S^2 → RP^2, where the second map is the projection π. For k = 0, . . . , 3 let ψ_k : S^1 × S^1 → RP^2 be the mapping induced by

\[ [0,1] \times [0,1] \to \mathbb{RP}^2, \qquad (t, s) \mapsto [e^{\pi i (t + ks)}]. \]

We claim that

\[ [S^1 \times S^1, \mathbb{RP}^2]^* = \{ [\psi_k] \mid k = 0, \ldots, 3 \}. \]

Proof. When restricted to S^1 × 1 each ψ_k represents the nontrivial class α ∈ π_1(RP^2), so that [ψ_k] ∈ [S^1 × S^1, RP^2]^* for all k = 0, . . . , 3. Clearly [ψ_0|_{1×S^1}] = [ψ_2|_{1×S^1}] = 0 ∈ π_1(RP^2), and [ψ_1|_{1×S^1}] = [ψ_3|_{1×S^1}] = α ∈ π_1(RP^2). Considered as elements of π_1(Map_α(S^1, RP^2)) the ψ_k satisfy [ψ_3] = [ψ_2] + [ψ_1] and [ψ_1] = [ψ_0] + [ψ_1]. It suffices therefore to prove that [ψ_0] ≠ [ψ_2]. A straightforward calculation shows that the homomorphism ι : π_2(RP^2) → π_1(Map_α(S^1, RP^2)) maps 0 to [ψ_0] and the generator of π_2(RP^2) to [ψ_2]. From the exactness of the sequence (5) we infer [ψ_0] ≠ [ψ_2].
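The count of four elements in Proposition 1 can be spelled out as a short worked step. This is only a restatement of the argument in the proof; the symbols coker and ker are the usual cokernel and kernel, and the short exact sequence is the standard one extracted from the five-term sequence (5).

```latex
\[
0 \to \operatorname{coker}(\rho^2_\alpha) \to
\pi_1(\mathrm{Map}_\alpha(S^1,\mathbb{RP}^2)) \to
\ker(\rho^1_\alpha) \to 0,
\]
\[
\operatorname{coker}(\rho^2_\alpha) = \mathbb{Z}/2\mathbb{Z}, \qquad
\ker(\rho^1_\alpha) = \mathbb{Z}_2, \qquad
\#\pi_1(\mathrm{Map}_\alpha(S^1,\mathbb{RP}^2)) = 2 \cdot 2 = 4 .
\]
```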


3.2. Superfluid dipole-free A-phase of ³He. Here we have to consider the order parameter space V := S^2 ×_{Z_2} SO(3), where the generator of Z_2 acts on S^2 by reversing the sign and on SO(3) via

\[
\begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{pmatrix}
\mapsto
\begin{pmatrix} -a_1 & -b_1 & c_1 \\ -a_2 & -b_2 & c_2 \\ -a_3 & -b_3 & c_3 \end{pmatrix}
\]

(see [5–7]). The first two homotopy groups of V are (see [8]):

\[ \pi_1(V) = \mathbb{Z}_4 \quad\text{and}\quad \pi_2(V) = \mathbb{Z}. \]

As π_1(V) is abelian, we can consider the local defect index α ∈ [S^1, V] as an element of π_1(V). We have the fibration p : V = S^2 ×_{Z_2} SO(3) → RP^2 associated to the Z_2-principal fibration S^2 → RP^2. Its homotopy sequence yields that the induced homomorphism
(i) p_* : π_2(V) → π_2(RP^2) is injective and
(ii) p_* : π_1(V) → π_1(RP^2) is surjective.

Since the operation of the fundamental group on the higher homotopy groups is natural we have the following equation for the action of a generator ι ∈ π_1(V) ≅ Z_4 on an arbitrary β ∈ π_2(V) ≅ Z:

\[ p_*(\iota \cdot \beta) = p_*(\iota) \cdot p_*(\beta) = (-1) \cdot p_*(\beta) = -p_*(\beta) = p_*(-\beta). \]

From (i) we deduce that ι · β = −β and therefore we have calculated the operation of the fundamental group of V on π_2(V) as follows:

\[ \pi_1(V) \times \pi_2(V) \to \pi_2(V), \qquad (\iota^k, \beta) \mapsto (-1)^k \beta, \]

and so we are able to prove the following

Proposition 2. Let V := S^2 ×_{Z_2} SO(3) and let ι be the generator of π_1(V) ≅ Z_4. If we denote by [S^1 × S^1, V]^{ι^k} the set of homotopy classes of mappings whose restrictions to S^1 × 1 equal ι^k, k ∈ Z_4, then we have the following two cases:
(i) For k = 1, 3 mod 4 we have #[S^1 × S^1, V]^{ι^k} = 8,
(ii) for k = 0, 2 mod 4 we have #[S^1 × S^1, V]^{ι^k} = ∞.

Proof. For the local defect index α := ι^k we have the exact sequence

\[ \pi_2(V) \to \pi_2(V) \to \pi_1(\mathrm{Map}_\alpha(S^1, V)) \to \pi_1(V) \to \pi_1(V), \]

in which both the first and the last map are given by β ↦ α · β − β. It becomes

\[ \mathbb{Z} \xrightarrow{\rho} \mathbb{Z} \to \pi_1(\mathrm{Map}_\alpha(S^1, V)) \to \mathbb{Z}_4 \xrightarrow{0} \mathbb{Z}_4, \]

where ρ = (−2) in case (i) and ρ = 0 in case (ii). In the first case it follows that π_1(Map_α(S^1, V)) is abelian and the set [S^1 × S^1, V]^{ι^k} = π_1(Map_α(S^1, V)) has 8 elements. In the second case it must have infinitely many conjugacy classes and the proposition is proved.
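The orders quoted in Propositions 1 and 2 follow the same counting pattern, which the sketch below makes explicit. It assumes an exact sequence of the form Z → Z → G → Z_n → Z_n in which the first map is multiplication by ρ and the last map is zero, so that the order of G is the order of the cokernel of ρ times n. The function name `middle_group_order` is illustrative only, and the sketch counts elements of G, not conjugacy classes.

```python
import math


def middle_group_order(rho, n):
    """Order of G in an exact sequence Z --(*rho)--> Z -> G -> Z_n --0--> Z_n.

    The cokernel of multiplication by rho on Z has |rho| elements when
    rho != 0 and is infinite when rho == 0; the kernel of the zero map
    on Z_n has n elements.
    """
    coker_order = abs(rho) if rho != 0 else math.inf
    return coker_order * n


nematic = middle_group_order(-2, 2)      # Proposition 1: rho = -2, Z_2 on the right
odd_k = middle_group_order(-2, 4)        # Proposition 2 (i): rho = -2, Z_4 on the right
even_k = middle_group_order(0, 4)        # Proposition 2 (ii): rho = 0
```

For the cases of finite order the group is abelian, as stated in the proofs, so the number of free homotopy classes agrees with the order computed here.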


3.3. H-Space as Fibre. We now assume Δ = S^m and that the fibre V of E is an H-space. Recall that on an H-space all the Whitehead products vanish. In particular the action of the fundamental group on the higher homotopy groups is trivial, so that we may regard the local defect index as an element in π_{n−1}(V). The following is immediate from Theorem 1.

Proposition 3. If V is an H-space, then for every α ∈ π_{n−1}(V) we have the equation

\[ \#\pi_m(\mathrm{Map}_\alpha(S^{n-1}, V)) = \#\pi_{m+n-1}(V) \cdot \#\pi_m(V). \]

As a concrete example let V = S^3 and assume n = 4, such that SN also has S^3 as its fibre. For the number of possible fibre homotopy classes with local defect index 1 ∈ π_3(S^3) we have:

Proposition 4. For every m ∈ N we have

\[ \#[S^m \times S^3, S^m \times S^3]^1_{S^m} = \#\pi_m(S^3) \cdot \#\pi_{m+3}(S^3). \]

In particular, for m ≠ 3 we get that #[S^m × S^3, S^m × S^3]^1_{S^m} < ∞.

Proof. From Proposition 3 we know that #π_m(Map_1(S^3, S^3)) = #π_m(S^3) · #π_{m+3}(S^3). There is a 1-1 correspondence

\[ [S^m \times S^3, S^m \times S^3]^1_{S^m} \approx \pi_m(\mathrm{Map}_1(S^3, S^3)) / \pi_1(\mathrm{Map}_1(S^3, S^3)). \]

Hence the first assertion follows from the fact that Map_1(S^3, S^3) is also an H-space and thus the action of its fundamental group on the m-th homotopy group is trivial. From [11] we know that the groups π_m(S^3) are all finite except for m = 3 and therefore the second assertion is proved as well.

With the help of Table A.3.6 in [10] we obtain the following list:

m                                   | 1 | 2 | 3 | 4 | 5 | 6  | 7  | 8 | ...
#[S^m × S^3, S^m × S^3]^1_{S^m}     | 2 | 2 | ∞ | 4 | 4 | 36 | 30 | 4 | <∞
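The list above can be reproduced from the formula of Proposition 4. The sketch below hard-codes the orders of π_k(S^3) for k = 1, . . . , 11; these are the standard values from tables of homotopy groups of spheres and are an input assumed here, not something computed. The names `PI_S3_ORDER` and `count_classes` are illustrative.

```python
import math

# Orders of pi_k(S^3) for k = 1..11 (assumed from standard tables);
# math.inf marks the infinite cyclic group pi_3(S^3) = Z.
PI_S3_ORDER = {
    1: 1, 2: 1, 3: math.inf, 4: 2, 5: 2, 6: 12,
    7: 2, 8: 2, 9: 3, 10: 15, 11: 2,
}


def count_classes(m):
    """Number of fibre homotopy classes from Proposition 4:
    #pi_m(S^3) * #pi_{m+3}(S^3)."""
    return PI_S3_ORDER[m] * PI_S3_ORDER[m + 3]


# One entry per column m = 1, ..., 8 of the list.
table = [count_classes(m) for m in range(1, 9)]
```

For example, the entry for m = 6 is the product of the orders of π_6(S^3) and π_9(S^3), that is 12 · 3 = 36.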

Acknowledgement. We are grateful to Prof. Jänich for suggesting the present investigation and for many inspiring discussions.

References

1. Bechtluft-Sachs, S., Hien, M.: Sets of fibre homotopy classes. Preprint 1998
2. Bredon, G.: Topology and geometry. Berlin–Heidelberg–New York: Springer-Verlag, 1993
3. Jänich, K.: Topological properties of ordinary nematics in 3-space. Acta Appl. Math. 8, 65–74 (1987)
4. Jänich, K., Rost, M.: Regularity of line defects in 3-dimensional media. Topology 24, No. 3, 353–360 (1985)
5. Leggett, A.J.: A theoretical description of the new phases of liquid ³He. Rev. Mod. Phys. 47, 331–414 (1975)
6. Mermin, N.D.: The topological theory of defects in ordered media. Rev. Mod. Phys. 51, 591–648 (1979)
7. Mermin, N.D., Lee, D.M.: Superfluid Helium 3. Scient. Am. 235, 56–71 (1976)
8. Michel, L.: Symmetry, defects and broken symmetry. Configurations. Hidden symmetry. Rev. Mod. Phys. 52, 617–651 (1980)
9. Nakanishi, H., Hayashi, K., Mori, H.: Topological classification of unknotted ring defects. Commun. Math. Phys. 117, 203–213 (1988)
10. Ravenel, D.C.: Complex cobordism and stable homotopy groups of spheres. London–New York: Academic Press, 1986
11. Serre, J.-P.: Groupes d’homotopie et classes de groupes abéliens. Ann. of Math. 58, 258–294 (1953)
12. Whitehead, G.W.: On products in homotopy groups. Ann. of Math. 47, 460–475 (1946)

Communicated by H. Araki

This article was processed by the author using the LaTeX style file cljour1 from Springer-Verlag

Commun. Math. Phys. 202, 411 – 419 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Skein Theory and Witten–Reshetikhin–Turaev Invariants of Links in Lens Spaces

Patrick M. Gilmer

Department of Mathematics, Louisiana State University, Baton Rouge, LA 70803-4918, USA. E-mail: [email protected]

Received: 3 February 1998 / Accepted: 23 October 1998

Abstract: We study the behavior of the Witten–Reshetikhin–Turaev SU(2) invariants of an arbitrary link in L(p, q) as a function of the level r − 2. They are given by 1/√r times one of p Laurent polynomials evaluated at e^{2πi/(4pr)}. The congruence class of r modulo p determines which polynomial is applicable. If p ≡ 0 (mod 4), the meridian of L(p, q) is non-trivial in the skein module but has trivial Witten–Reshetikhin–Turaev SU(2) invariants. On the other hand, we show that one may recover the element in the Kauffman bracket skein module of L(p, q) represented by a link from the collection of the WRT invariants at all levels if p is a prime or twice an odd prime. By a more delicate argument, this is also shown to be true for p = 9.

1. Introduction

We consider the Witten–Reshetikhin–Turaev SU(2) invariants {w_r(M, J)}_{r≥2} ∈ C [W, RT]. These are the invariants of Witten at level r − 2. We specify the precise version of the WRT invariant that we study near the beginning of Sect. 2. Let M be a closed oriented 3-manifold. Let S(M) be the Kauffman bracket skein module. Recall this is a module over Λ = Z[A, A^{−1}]. Elements of S(M) are linear combinations of links over Λ modulo the Kauffman relations. Let J be a framed link in M, then J represents an element in S(M), called its skein class. We will use the symbol J to represent a general skein element. It is immediate from the definition of w_r that w_r(M, J) only depends on the skein class of J. In fact we may extend the definition of w_r to S(M), by extending linearly, and sending A to −a_r. Here and throughout, we adopt the notation that ξ_g = e^{2πi/g}, and a_r = ξ_{4r}. We may study the behavior of w_r on links in M by calculating w_r for generators of S(M).

This research was partially supported by a grant from the Louisiana Education Quality Support Fund.


Hoste and Przytycki [HP] have calculated the skein module for $L(p,q)$. Throughout this paper $p$ will be an integer greater than one. $S(L(p,q))$ is the free $\Lambda$-module generated by $[p/2]+1$ generators: $x_0, x_1, x_2, \dots, x_{[p/2]}$. Here $x_0$ is given by the empty link. We denote this module by $\Lambda(x_0, x_1, \dots, x_{[p/2]})$. When $L(p,q)$ is presented as $-p/q$ surgery to the unknot, $x_1$ is given by a meridian to this unknot, and $x_i$ is given by $i$ parallel meridians. We may consider $\mu_c$, a meridian colored by $c$, as an element in $S(L(p,q))$. By this, we mean one should place the skein element $e_c \in S(S^1 \times D^2)$ defined in [BHMV1] into a tubular neighborhood of a meridian. The recursive definition in $S(S^1 \times D^2)$ for $e_c$ is $e_c = \alpha e_{c-1} - e_{c-2}$, $e_0 = 1$, $e_{-1} = 0$. It is useful to define $e_c$ for $c$ negative as well, by running the above recursion formula backwards. The recursive definition implies that $\alpha^c \in \Lambda(e_0, \dots, e_c)$, for $c \ge 0$. Here we let $\alpha$ denote $S^1 \times 0$ with the standard framing in $S^1 \times D^2$. Thus $S(L(p,q))$ is also the $\Lambda$-module generated by $\mu_0, \mu_1, \mu_2, \dots, \mu_{[p/2]}$. We wish to study $w_r(L(p,q), \mu_c)$ for $r \ge 2$ and $p \ge 2$. We will be particularly interested in $0 \le c \le [p/2]$. The answer involves generalized Gauss sums:

\[ G_p(a,b) = \sum_{n=0}^{p-1} \xi_p^{an^2+bn}. \]
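As a numerical sanity check on this definition, the generalized Gauss sum can be evaluated directly. The following is a minimal Python sketch; the function name `gauss_sum` is our own choice and is not taken from [LL]. It reduces the exponent modulo $p$ before exponentiating, and compares $G_p(1,0)$ with the classical quadratic Gauss sum values $\sqrt{p}$ (for a prime $p \equiv 1 \pmod 4$) and $i\sqrt{p}$ (for a prime $p \equiv 3 \pmod 4$).

```python
import cmath
import math


def gauss_sum(p, a, b):
    """G_p(a, b) = sum_{n=0}^{p-1} xi_p^(a n^2 + b n), with xi_p = exp(2 pi i / p)."""
    total = 0j
    for n in range(p):
        exponent = (a * n * n + b * n) % p  # xi_p has order p, so reduce mod p
        total += cmath.exp(2j * cmath.pi * exponent / p)
    return total


if __name__ == "__main__":
    # Classical quadratic Gauss sums for p = 5 and p = 7.
    print(abs(gauss_sum(5, 1, 0) - math.sqrt(5)))
    print(abs(gauss_sum(7, 1, 0) - 1j * math.sqrt(7)))
    # G_p(a, b) depends only on a and b modulo p.
    print(abs(gauss_sum(6, 2 + 6, 5) - gauss_sum(6, 2, 5 - 6)))
```

The three printed values should all be at the level of floating-point rounding error. The periodicity in $a$ and $b$ is what makes $G_\pm(p,q,c,k)$ depend only on $k$ modulo $p$.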

Section Two of [LL] gives two convenient theorems which allow the evaluation of such sums. We will refer to these theorems frequently. For $k \in \mathbb{Z}$, consider the following generalized Gauss sum: $G_\pm(p,q,c,k) = G_p(qk,\, qc+q\pm 1)$. Note that

\[ G_\pm(p,q,c,k) = G_\pm(p,q,c,k+p) = G_\pm(p,q,c+p,k) = \overline{G_\pm(p,q,c,-k)}. \]

For $0 \le k < p$, define $f_{p,q,c,k}(z)$ to be the Laurent polynomial in $\mathbb{C}[z,z^{-1}]$:

\[ f_{p,q,c,k}(z) = \frac{i(-1)^{c+1}}{\sqrt{2p}}\, z^{12p\,s(q,p)+q(c^2+2c)} \left( G_+(p,q,c,k)\, z^{2(c+1)} - G_-(p,q,c,k)\, z^{-2(c+1)} \right). \]

Here $s(q,p)$ is a Dedekind sum. Then $\bar f_{p,q,c,k}(z) = -f_{p,q,c,p-k}(z)$, where the bar denotes complex conjugation of the coefficients. The following theorem for the case $c = 0$ is Jeffrey's formula [J] for $w_r(L(p,q))$. In Sect. 2, we will prove this theorem by adapting Jeffrey's proof.

Theorem 1. If $r \equiv k \pmod p$,

\[ \sqrt{r}\, w_r(L(p,q), \mu_c) = f_{p,q,c,k}(\xi_{4pr}). \]

Suppose $J$ is a link in $L(p,q)$, and let $[J] = \sum_{c=0}^{[p/2]} C_{J,c}(A)[\mu_c] \in S(L(p,q))$. Note $C_{J,c}(A) \in \Lambda$. Then $\sqrt{r}\, w_r(L(p,q), J) = \sum_{c=0}^{[p/2]} f_{p,q,c,r}(\xi_{4pr})\, C_{J,c}(-a_r)$. For $0 \le k < p$, let

\[ f_{J,k}(z) = \sum_{c=0}^{[p/2]} f_{p,q,c,k}(z)\, C_{J,c}(-z^p) \in \mathbb{C}[z,z^{-1}]. \qquad (1.1) \]

Since $C_{J,c}$ has integral and therefore real coefficients,

\[ \bar f_{J,k}(z) = -f_{J,p-k}(z). \qquad (1.2) \]
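The conjugation symmetry between $f_{p,q,c,k}$ and $f_{p,q,c,p-k}$ can be checked numerically on small cases. The sketch below is only an illustration under stated assumptions: all function names are ours, the Dedekind sum is computed from its standard sawtooth definition, the bar is read as conjugation of coefficients, and $c \ge 0$ is assumed so that the two exponents of $z$ are distinct.

```python
import cmath
import math
from fractions import Fraction


def gauss_sum(p, a, b):
    """G_p(a, b) = sum_{n=0}^{p-1} xi_p^(a n^2 + b n)."""
    return sum(cmath.exp(2j * cmath.pi * ((a * n * n + b * n) % p) / p)
               for n in range(p))


def g_pm(sign, p, q, c, k):
    """G_{+-}(p, q, c, k) = G_p(qk, qc + q +- 1); sign is +1 or -1."""
    return gauss_sum(p, q * k, q * c + q + sign)


def sawtooth(x):
    """((x)) = x - floor(x) - 1/2 for non-integral x, and 0 for integral x."""
    if x.denominator == 1:
        return Fraction(0)
    return x - math.floor(x) - Fraction(1, 2)


def dedekind_sum(q, p):
    """s(q, p) = sum_{i=1}^{p-1} ((i/p)) ((q i/p)), as an exact Fraction."""
    return sum(sawtooth(Fraction(i, p)) * sawtooth(Fraction(q * i, p))
               for i in range(1, p))


def f_coefficients(p, q, c, k):
    """Coefficients of f_{p,q,c,k}(z) as a dict {exponent of z: coefficient}."""
    base = 12 * p * dedekind_sum(q, p) + q * (c * c + 2 * c)
    assert base.denominator == 1  # 12 p s(q, p) is an integer when gcd(p, q) = 1
    base = int(base)
    prefactor = 1j * (-1) ** (c + 1) / math.sqrt(2 * p)
    return {
        base + 2 * (c + 1): prefactor * g_pm(+1, p, q, c, k),
        base - 2 * (c + 1): -prefactor * g_pm(-1, p, q, c, k),
    }


def conjugation_symmetry_holds(p, q, c, tol=1e-9):
    """Check conj(coefficients of f_k) == -(coefficients of f_{p-k}) for 0 < k < p."""
    for k in range(1, p):
        f_k = f_coefficients(p, q, c, k)
        f_pk = f_coefficients(p, q, c, p - k)
        for exponent, coeff in f_k.items():
            if abs(coeff.conjugate() + f_pk[exponent]) > tol:
                return False
    return True
```

For example, `conjugation_symmetry_holds(5, 2, 1)` and `conjugation_symmetry_holds(8, 3, 2)` exercise one odd and one even value of $p$.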

Skein Theory and WRT Invariants


Corollary 1. If $J$ is a link in $L(p,q)$ (or any element of $S(L(p,q))$), then for each $0 \le k < p$, there is a polynomial $f_{J,k}(z)$ in $\mathbb{C}[z,z^{-1}]$ such that for $r \equiv k \pmod p$,

\[ \sqrt{r}\, w_r(L(p,q), J) = f_{J,k}(\xi_{4pr}). \]

Thus there are $p$ Laurent polynomials which describe $\sqrt{r}\, w_r(L(p,q), J)$ as a function of $\xi_{4pr}$. Each function is applicable if one restricts $r$ to a particular congruence class modulo $p$. Moreover the last $p - [p/2] - 1$ of these polynomials are determined in a simple way by the first $[p/2] + 1$ of these polynomials.

In [G1], we showed that if $J$ is a link in a connected sum of $g$ copies of $S^1 \times S^2$, then $w_r(S^3)^{g-1}\, w_r(\#^g S^1 \times S^2, J)$ is given by a rational function in $a_r$ for $r$ large enough (depending on $J$). We also gave examples where this formula fails for small $r$. Moreover if $g$ is zero or one, this rational function is a Laurent polynomial. We were interested in whether similar statements could be made about links in lens spaces. This was the genesis of the present paper.

Lawrence and Rozansky have described the Witten–Reshetikhin–Turaev $SU(2)$ invariants of a Seifert rational homology sphere by a single holomorphic function of $r$ [LR]. See also [L1, L2]. They also show that Ohtsuki's power series in $h = q - 1$, evaluated by setting $q$ to be an $r$th root of unity, converges $r$-adically to the $r$th Witten–Reshetikhin–Turaev invariant (associated to $SO(3)$), if $r$ is an odd prime which is relatively prime to the order of the first homology. Suppose we normalize $w_r$ to be 1 on $S^3$ by dividing by $w_r(S^3)$. Then $w_r(S^3)^{-1} w_r(L(p,q), J)$ for a fixed congruence class of $r$ modulo $p$ is given by a rational function of $\xi_{4pr}$. As we vary the congruence class of $r$ modulo $p$, we obtain different functions. Note the case $c = 0$ in Theorem 1, which is really just a reinterpretation of Jeffrey's result [J, Theorem (3.4)], concerns $L(p,q)$ with the empty link. Thus work of Lawrence and Rozansky applies in this case, but describes the dependence on $r$ in a different way.
We use several rational functions of ξ4pr . Lawrence and Rozansky use a single holomorphic function of r. This holomorphic function is defined by a line integral. Recently the r-adic convergence mentioned above has been extended to links in a general rational homology sphere [R]. Using [LL], one may see that if p ≡ 0 (mod 4), and c is odd, then G± (p, q, c, k) is zero. Corollary 2. If p ≡ 0 (mod 4) and c is odd, wr (L(p, q), µc ) = 0. Similarly, if p ≡ 0 (mod 4) and c is odd and positive, wr (L(p, q), xc ) = 0. To see the last statement, we note: If c is odd and positive, αc ∈ 3(e1 , e3 · · · , ec ) ⊂ S(S 1 ×D2 ). In particular for any n ≥ 1, the meridian in L(4n, q) represents a non-trivial skein class but has all WRT invariants trivial. On the other hand, we can sometimes show that the skein √ class of a link may be recovered from its WRT invariants. Suppose we are given r wr (L(p, q), J) for all r, then fJ,k is determined for all k in the range 0 ≤ k < p. This is because a nonzero Laurent polynomial cannot have infinitely many roots. Let R denote the field of rational functions in z with complex coefficients. Suppose the p × [ p2 ] + 1 matrix [fp,q,c,k (z)]0≤k

If [fp,q,c,k (z)]0≤k2r . We can now specify the version of the WRT SU(2) invariant that we study. wr (M, J) = ψ(< (M, J, 0) >2r ). Here ψ : k2r → C is the homomorphism [MR, note p.134] which sends A to −ar , and sends κ to ζa−1 r . Here ζ = ξ8 . L(p, q) is obtained by gluing together two copies of S 1 × D2 by the self-homeomorphism U of S 1 × S 1 given with respect to a meridian-longitude basis by the matrix ( pq db ) ∈ SL(2, Z). L(p, q) weighted by an integer n as a morphism of the cobordism category is denoted (L(p, q), n). Let 6 denote the torus S 1 × S 1 equipped with the Lagrangian subspace generated by the meridian [S 1 × 1]. For each integer c, let Hc be the morphism from the empty set to 6 given by S 1 × D2 weighted zero, with the core S 1 × 0 with the standard framing colored by c. Let H˜ c be this same manifold (with the opposite orientation) viewed as a morphism from 6 to the empty set. In C2r , we have that (L(p, q), 0) is the composition H˜ 0 ◦ C(U, 0) ◦ H0 . We let C(U, w) denote the mapping cylinder of U weighted by an integer w. More generally, we have that L(p, q) weighted zero with the meridian colored c is given by the composition H˜ c ◦ C(U, 0) ◦ H0 . Hc represents an element βc in V2r (6). As c ranges over 0 ≤ c ≤ r − 2, these elements form an orthonormal basis B. We will also need elements νs = (−1)s−1 βs−1 . By [BHMV1, 6.3], they have the following symmetries


\[ \nu_l = \nu_{2r+l} \quad\text{and}\quad \nu_l = -\nu_{-l} \qquad \forall l \in \mathbb{Z}. \qquad (2.1) \]

Note that $w_r(L(p,q), (-1)^{l-1}\mu_{l-1})$ will possess the same symmetries. Let $B$ denote the basis with elements $\nu_s$ for $1 \le s \le r-1$. As usual, we let $S = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ and $T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$. Let $J(m) = T^m S$. Then $C(J(m), 0) = C(T,0)^m \circ C(S,0)$, as the contribution of $\sigma$ (Maslov index) in the formula for the computation of the weight of a composition is zero. Following [J] we expand $U = J(m_t) \circ J(m_{t-1}) \cdots J(m_1)$, where $m_t = 0$ and $t > 1$. Let $U_i = J(m_i) \cdots J(m_1)$. Let $c_i$ denote the lower left entry in $U_i$. For $1 \le i \le t$, define $w_i$ by $C(U_i, w_i) = C(J(m_i), 0) \circ \cdots \circ C(J(m_1), 0)$. So $w_1 = 0$. For $1 < i \le t$, we have $C(U_i, w_i) = C(J(m_i), 0) \circ C(U_{i-1}, w_{i-1})$. Computing the contribution of $\sigma$, one has that $w_i = w_{i-1} + \mathrm{sign}(c_{i-1} c_i)$. Thus $w_t = \sum_{i=2}^{t} \mathrm{sign}(c_{i-1} c_i)$. This is the signature of the matrix $W_L$ of [J, Eq. 3.6], at least assuming this matrix is a definite matrix, which we may insist without loss of generality. Note that $\mathrm{Trace}(W_L) = \sum_i m_i$.

With respect to $B$, $Z_{2r}(T,0)$ is given by $\hat T_{l,j} = \delta_l^j (-A)^{l^2-1}$, and $Z_{2r}(S,0)$ is given by $\hat S_{l,j} = \eta_{2r}[lj]$ [G2]. Here $j$ and $l$ range from 1 to $r-1$. Also $[k]$ denotes the quantum integer

\[ [k] = \frac{A^{2k} - A^{-2k}}{A^2 - A^{-2}}, \]

and $\eta_{2r}$ is the scalar of [BHMV2] with

\[ \psi(\eta_{2r}) = -\frac{a_r^2 - a_r^{-2}}{\sqrt{2r}}\, i. \]

ˆ and Witten [W] used a unitary matrix representation Rr of SL(2, Z), with Rr (S) = ψ(S) 2 j −1 l Rr (T ) = ar ζ ψ(Tˆ ) = δl ζar . In [J], Jeffrey found an explicit formula for Rr [ ac db ] in terms of a, b, c, and d. With respect to B, we have 3Sign(WL ) Trace(WL ) ψ (Z2r (C(U, 0))) = ψ(Z2r (C(U, wt ))) = (ζa−1 Rr (U ). (ζa−1 r ) r )

Since the Rademacher $\Phi$-function is given by [J, 3.2], [KM] $\Phi(U) = \mathrm{Trace}(W_L) - 3\,\mathrm{Sign}(W_L)$, we have

\[ \psi(Z_{2r}(C(U,0))) = (\zeta a_r^{-1})^{\Phi(U)} R_r(U). \]

This is exactly the correction factor Jeffrey used to find the Witten invariant of $L(p,q)$ in the canonical framing [J, Lemma 3.3, Theorem 3.4]. We have for $1 \le l \le r-1$:

\[ w_r(L(p,q), (-1)^{l-1}\mu_{l-1}) = \psi(\langle \nu_l, Z_r(C(U,0))\nu_1 \rangle) = (\zeta a_r^{-1})^{\Phi(U)} R_r(U)_{l,1}. \]

Jeffrey [J, Theorem 3.4] derived and simplified an expression for $(\zeta a_r^{-1})^{\Phi(U)} R_r(U)_{1,1}$. One may adapt her calculation as follows. Extending [J, Eq. (3.8$\tfrac12$)] to the case $l \ne 1$, we have in our notation

\[ w_r(L(p,q), (-1)^{l-1}\mu_{l-1}) = \frac{-i}{\sqrt{2rp}}\, a_r^{-\Phi(U)}\, \xi_{4rq}^{\,b} \sum_{n=1}^{p} \sum_{\pm} \pm\, \xi_{4rpq}^{(q\gamma \pm 1)^2}, \qquad (2.2) \]

where $\gamma = l + 2rn$, and $1 \le l \le r-1$. Note that $q\gamma^2 \pm 2\gamma$ modulo $4rp$ only depends on $n$ modulo $p$. Let $S_\pm(l)$ denote the unordered list, with multiplicity, of the values of $q\gamma^2 \pm 2\gamma$ modulo $4rp$ as $n$ varies from 1 to $p$. Then $S_\pm(l+2r) = S_\pm(l)$, and $S_\pm(-l) = S_\mp(l)$. So

\[ \sum_{n=1}^{p} \sum_{\pm} \pm\, \xi_{4rpq}^{(q\gamma \pm 1)^2} = \xi_{4rpq} \sum_{n=1}^{p} \sum_{\pm} \pm\, \xi_{4rp}^{q\gamma^2 \pm 2\gamma} \]

remains the same when $l$ is replaced by $l + 2r$, and changes sign when $l$ is replaced by $-l$. By (2.1), it follows that (2.2) holds for all integers $l$. As

\[ (q\gamma \pm 1)^2 = q^2 l^2 + 1 \pm 2ql + 4rq(qrn^2 + qln \pm n), \]


we have:

\[ w_r(L(p,q), \mu_c) = \frac{i(-1)^{c+1}}{\sqrt{2rp}} \sum_{\pm} \pm\, \xi_{4rpq}^{\,pb - pq\Phi(U) + q^2 l^2 + 1 \pm 2ql} \sum_{n=1}^{p} \xi_p^{(qr)n^2 + (ql \pm 1)n} \]

\[ = \frac{i(-1)^{c+1}}{\sqrt{2rp}}\, \xi_{4rpq}^{\,pb - pq\Phi(U) + q^2 l^2 + 1} \sum_{\pm} \pm\, \xi_{2rp}^{\pm l}\, G_\pm(p,q,c,r). \]

Here $l$ denotes $c+1$. This holds for all integers $c$. But now making use of $1 = \det(U) = qd - bp$, and the definition of $\Phi$, we have:

\[ pb - pq\Phi(U) + q^2 l^2 + 1 = q^2(l^2 - 1) + 12pq\, s(d,p) = q^2(l^2-1) + 12pq\, s(q,p). \]

As in [J], we have noted that $d \equiv q^* \pmod p$, where $q^*$ is an integer with $qq^* \equiv 1 \pmod p$, and that $s(d,p) = s(q^*,p) = s(q,p)$. Thus

\[ \sqrt{r}\, w_r(L(p,q), \mu_c) = \frac{i(-1)^{c+1}}{\sqrt{2p}}\, \xi_{4rp}^{\,12p\,s(q,p) + q(c^2+2c)} \sum_{\pm} \pm\, \xi_{4rp}^{\pm 2(c+1)}\, G_\pm(p,q,c,r). \]

A key observation for this paper is that $r$ enters the right hand side of this last formula in only two places: in $\xi_{4rp}$, which we may think of as a variable $z$, and in $G_\pm(p,q,c,r)$, where its contribution only depends on $r \pmod p$. Thus if $r \equiv k \pmod p$, then

\[ \sqrt{r}\, w_r(L(p,q), \mu_c) = f_{p,q,c,k}(\xi_{4pr}). \]

3. Proof of Theorem 2 (Only if Part)

Using [LL] one may prove:

Lemma 1. If $a \not\equiv 0 \pmod p$, and $b^2 \equiv b'^2 \pmod p$, then $G_p(a,b) = G_p(a,b')$.
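Lemma 1 can be spot-checked numerically. The following sketch reads the lemma as stating that $G_p(a,b)$ depends on $b$ only through $b^2$ modulo $p$, and it tests this reading only for odd primes $p$; both the reading and the restriction are assumptions made here for illustration, and the function names are ours. For an odd prime $p$ the condition on $b$ and $b'$ means $b' \equiv \pm b$, and the substitution $n \mapsto -n$ in the sum gives the equality.

```python
import cmath


def gauss_sum(p, a, b):
    """G_p(a, b) = sum_{n=0}^{p-1} xi_p^(a n^2 + b n)."""
    return sum(cmath.exp(2j * cmath.pi * ((a * n * n + b * n) % p) / p)
               for n in range(p))


def lemma1_holds_for_prime(p, tol=1e-9):
    """Brute-force check over all a != 0 and all pairs b, b2 with b^2 = b2^2 mod p."""
    for a in range(1, p):
        for b in range(p):
            for b2 in range(p):
                if (b * b - b2 * b2) % p != 0:
                    continue
                if abs(gauss_sum(p, a, b) - gauss_sum(p, a, b2)) > tol:
                    return False
    return True


if __name__ == "__main__":
    for prime in (3, 5, 7, 11):
        print(prime, lemma1_holds_for_prime(prime))
```

This is a finite check on a few primes, not a proof, and it says nothing about composite $p$, where the precise hypotheses on $a$ matter.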

Let #p denote the number of squares modulo p. Lemma 2. The number of distinct columns appearing in either of the matrices [G+ (p, q, c, k)]1≤k

(ii) $a = s^{2t}$ with $s$ an odd prime and $t \ge 1$. If $t = 1$, assume $s \ge 5$, i.e. $a \ne 9$. Then every square is the square of a number in the range from zero to $[a/2]$. However zero, $s^t$, and $2s^t$ are in this range and have the same square.

(iii) $a = s^{2t+1}$ with $s$ an odd prime, and $t \ge 1$. Then every square is the square of a number in the range from zero to $[a/2]$. However $0^2 = (s^{t+1})^2$, and $\left(\frac{s^{t+1}+s^t}{2}\right)^2 = \left(\frac{s^{t+1}-s^t}{2}\right)^2$ are the squares of numbers from this range.

(iv) $a = s_1 \cdot s_2$, where $s_1$ and $s_2$ are distinct odd primes. So $\#_a = \#_{s_1 \cdot s_2} = \#_{s_1} \cdot \#_{s_2} = \frac{(s_1+1)}{2} \cdot \frac{(s_2+1)}{2} < \frac{s_1 \cdot s_2}{2} = \frac{a}{2}$.

(v) $a$ is four times an odd prime, nine times an odd prime (other than 3), $2 \cdot 9$, or $4 \cdot 9$. These cases are dealt with as in (iv).

These lemmas yield the only if part of Theorem 2, except for the cases $p = 4$ and $p = 9$. When p = 4, [fp,q,c,k (z)]0≤k
2 . Also (p) is one or i Here q λp denotes the primitive pth root of unity ξp a accordingly as p is one or three modulo four, and b denotes a Jacobi symbol. So the rows of det[G+ (p, q, cˆ, kp∗ )]1≤k≤ p2 ,1≤c≤ p2 are non-zero multiples of the rows of a Vandermonde2 matrix with a non-zero determinant. Here we make use of the fact that the numbers q λcp , as c varies from 1 to [ p2 ] are all distinct. For this it is important that as c ranges over 1 ≤ c ≤ [ p2 ], c2 does not repeat itself modulo p.

The case p is twice an odd prime s. For this case, we define 0 by saying that 0(0), 0(1), ˆ 2, ˆ 4, ˆ · · · , (s − ˆ 1),1, ˆ 3,· ˆ · · s. · · · , 0(s) in sequence are given by 0, ˆ Also define 1 by saying ∗ ∗ ∗ 1(0),1(1),· · · , 1(s) in sequence are given by 0, 2(1∗s ), 2(2s∗ ), · · · , 2( s−1 2 )s , 1p , 3p , · · · ,


(s − 2)∗p , s. Since p is even, q is odd, and cˆ ≡ c (mod 2). Thus for both 0 and 1, even values come first, and then odd values follow. Next notice that by [LL] ( 0, if c is relatively prime to s G+ (2s, q, cˆ, s) = G2s (qs, c) = 2s, if c = s. Thus the last row of [G+ (p, q, 0(c), 1(k))]1≤k≤s,1≤c≤s consists of all zeros except for the last entry of 2s. Thus we only need to show [G+ (p, q, 0(c), 1(k))]1≤k≤s−1,1≤c≤s−1 has a nonzero determinant. For j relatively prime to s, and c 6≡ j (mod 2) by [LL]: G+ (2s, q, cˆ, j) = G2s (qj, c) = 0. Thus our matrix is a block matrix of the form [ A0 B0 ]. We now study the matrix A. For 1 ≤ d ≤ s−1 2 , and 1 ≤ j ≤ ˆ 2js∗ ) = G2s (2qjs∗ , 2d) G+ (2s, q, (2d), = 2Gs (qjs∗ , d) = 2(s)

qjs∗ s

s−1 2 ,

by [LL] we have:

√ d2 j s q λs ,

−q ∗ ( s+1 )2

where now we let q λs be the primitive sth root of unity ξs s 2 . Thus the rows of A are nonzero multiples of the rows of a Vandermonde matrix with a non-zero determinant. 2 Here it is important that as d ranges over 1 ≤ d ≤ s−1 2 , d does not repeat itself modulo s. Now we study the matrix B. For 1 ≤ d ≤ s − 2, and 1 ≤ j ≤ s − 2, with d and j odd , by [LL] we have: 2qjp∗ √ d2 j ∗ ∗ ˆ s q ωs , G+ (2s, q, d, jp ) = G2s (qjp , d) = 2(s) s −q ∗ ( s+1 )3

where now we let q ωs be the primitive sth root of unity ξs p 2 . Let q θs be the primitive 2 j−1 2 j 2 2 = q ωs d q θsd . We may take a nonzero factor of sth root of unity q ωs2 . So q ωsd d2 q ωs

out of each column. The rows of the resulting matrix are non-zero multiples of the rows of a Vandermonde matrix with a non-zero determinant. Here it is important that as $d$ ranges over odd numbers such that $1 \le d \le s-2$, $d^2$ does not repeat itself modulo $s$. This completes the proof.

5. Proof of Theorem 3

There are only two lens spaces of order nine up to (not necessarily orientation preserving) diffeomorphism: $L(9,1)$ and $L(9,4)$. Thus we need only concern ourselves with $q = 1$ and $q = 4$. Using Mathematica we have found that the subset of $S(9,1)$ with trivial WRT invariants is spanned by

\[ (-z^{15} + z^{27})\mu_1 + (z^{12} - z^{24})\mu_2 - z^{15}\mu_3 + \mu_4. \]

Similarly the subset of $S(9,4)$ with trivial WRT invariants is spanned by

\[ (-z^{84} + z^{108})\mu_0 + (z^{60} - z^{72})\mu_2 - z^{30}\mu_3 + \mu_4. \]

No non-zero multiple of these elements lies in the ordinary skein module over $\Lambda$.


References

[BHMV1] Blanchet, C., Habegger, N., Masbaum, G., Vogel, P.: Three manifold invariants derived from the Kauffman bracket. Topology 31, 685–699 (1992)
[BHMV2] Blanchet, C., Habegger, N., Masbaum, G., Vogel, P.: Topological quantum field theories derived from the Kauffman bracket. Topology 34, 883–927 (1995)
[G1] Gilmer, P.: A TQFT for wormhole cobordisms over the field of rational functions. In: Knot Theory (Warsaw 1995), V.F.R. Jones, J. Kania-Bartoszynska, J.H. Przytycki, P. Traczyk, V. Turaev, eds., Banach Center Publications 42, 1998, pp. 119–127
[G2] Gilmer, P.: On the WRT representations of mapping class groups. Proc. A.M.S. (to appear)
[HP] Hoste, J., Przytycki, J.: The (2, ∞) skein module of lens spaces; a generalization of the Jones Polynomial. J. Knot Th. and Ramif. 2, 321–333 (1993)
[J] Jeffrey, L.: Chern–Simons–Witten invariants of lens spaces and torus bundles, and the semiclassical approximation. Commun. Math. Phys. 147, 563–604 (1992)
[L1] Lawrence, R.: Asymptotic expansions of Witten–Reshetikhin–Turaev invariants for some simple 3-manifolds. J. Math. Phys. 36, 6106–6129 (1995)
[L2] Lawrence, R.: Witten–Reshetikhin–Turaev invariants of 3-manifolds as holomorphic functions. In: Geometry and physics (Aarhus, 1995), Lecture Notes in Pure and Appl. Math. 184, New York: Dekker, 1997, pp. 363–377
[LR] Lawrence, R., Rozansky, L.: Witten–Reshetikhin–Turaev invariants of Seifert manifolds. Preprint March 1997
[LL] Li, B.H., Li, T.J.: Generalized Gaussian sums and Chern–Simons–Witten–Jones invariants of lens spaces. J. Knot Th. and Ramif. 5, 184–224 (1996)
[KM] Kirby, R., Melvin, P.: Dedekind sums, µ-invariants, and the signature cocycle. Math. Ann. 299, 231–267 (1994)
[MR] Masbaum, G., Roberts, J.: On central extensions of mapping class groups. Math. Annalen 302, 131–150 (1995)
[RT] Reshetikhin, N., Turaev, V.: Invariants of 3-manifolds via link-polynomials and quantum groups. Invent. Math. 103, 547–597 (1991)
[R] Rozansky, L.: On p-adic properties of the Witten–Reshetikhin–Turaev invariant. Reprint: mathQA/9806075
[Wa] Walker, K.: On Witten's 3-manifold invariants. Preprint (1991)
[W] Witten, E.: Quantum field theory and the Jones polynomial. Commun. Math. Phys. 121, 351–399 (1989)

Communicated by A. Jaffe

Commun. Math. Phys. 202, 421 – 444 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Properties of Free Entropy Related to Polar Decomposition

Fumio Hiai*, Dénes Petz**

Department for Mathematical Analysis, Technical University of Budapest, H-1521 Budapest XI, Sztoczek u. 2, Hungary

Received: 6 October 1998 / Accepted: 25 October 1998

* Permanent address: Graduate School of Information Sciences, Tohoku University, Aoba-ku, Sendai, 980-8577, Japan
** Supported by OTKA F023447

Abstract: The free entropies $\hat\chi(a_1, \dots, a_N)$ of non-selfadjoint random variables and $\chi_u(u_1, \dots, u_N)$ of unitary random variables are introduced and discussed by the methods of Voiculescu's free analysis. The additivity $\chi_u(u_1, \dots, u_N) = \sum_i \chi_u(u_i)$ is shown to be equivalent to freeness. The relation among $\hat\chi$, $\chi_u$ and $\chi$ is investigated in the case when $a_i = u_i h_i$ is the polar decomposition. The subadditivity $\hat\chi(a_1, \dots, a_N) \le \chi_u(u_1, \dots, u_N) + \chi(h_1^2, \dots, h_N^2) + \text{constant}$ is proven and applications to some maximization problems for $\hat\chi$ are given.

Introduction

A highlight of free probability theory is the free entropy, which has been extensively developed with several applications by Dan Voiculescu [12–17]. Up to now there are two kinds of free entropies $\chi$ and $\chi^*$ (for selfadjoint random variables). The free entropy $\chi(a_1, \dots, a_N)$ studied in [12–15, 17] is the matricial analogue of the classical Boltzmann–Gibbs entropy and is defined as the asymptotic growth rate of the volume of $N$-tuples of selfadjoint matrices approximating $(a_1, \dots, a_N)$ in the sense of joint moments when the matrix size is going to infinity. Its most significant property is the additivity: $\chi(a_1, \dots, a_N) = \chi(a_1) + \cdots + \chi(a_N)$ in the case when (and only when) $a_1, \dots, a_N$ are in free relation. On the other hand, the free entropy $\chi^*(a_1, \dots, a_N)$ from [16] was defined as a certain integral of the free analogue of the Fisher information. The equality $\chi(a) = \chi^*(a)$ was shown in [16], but the coincidence of the two concepts in the general multi-variable case is not known. Random matrices are often models of free random variables and the bridge between random matrix theory and free probability is the asymptotic freeness result of Voiculescu


[11, 17]. The free entropy $\chi(a)$ of a single variable $a$ coincides, up to the sign and an additive constant, with the so-called logarithmic energy of the distribution measure of $a$, which is familiar from potential theory. The one-variable free entropy is the main component of the rate function in large deviation theorems obtained first by Ben Arous and Guionnet [1] and subsequently by the present authors [4, 5, 10].

In this paper we discuss the matricial free entropies $\hat\chi$ of non-selfadjoint random variables and $\chi_u$ of unitary random variables. In fact, the non-selfadjoint version $\hat\chi$ of Voiculescu's entropy has already been used implicitly in several places in the literature. Our main goal is not really the entropy of non-selfadjoint variables but rather the study of the entropy of unitaries, which is natural as well. Our motivation was the random matrix model of the Haar unitary distribution and the related large deviation results [4]. We make clear the intrinsic interrelation among the different free entropies $\hat\chi$, $\chi_u$ and $\chi$. This relation gives a useful method in the free entropy analysis.

The paper is organized as follows. In Sect. 1 we define the free entropies $\hat\chi(a_1, \dots, a_N)$ of non-selfadjoint random variables and $\chi_u(u_1, \dots, u_N)$ of unitary random variables, and their basic properties are given. The free entropy $\hat\chi(a_1, \dots, a_N)$ is equal to the free entropy of the real and imaginary parts of $a_1, \dots, a_N$, so the properties of $\hat\chi$ are direct translations from those of $\chi$. If the polar decompositions $a_i = u_i h_i$ are taken, then the random variables $a_1, \dots, a_N$ may be considered as a combination of unitary $u_1, \dots, u_N$ and positive $h_1, \dots, h_N$. In Sect. 2 we introduce, as a technical device, the free entropy $\chi_{(u,+)}(u_1, \dots, u_N; h_1, \dots, h_N)$ of mixed strings of unitary and positive random variables. By making use of this mixed free entropy $\chi_{(u,+)}$ we obtain the following relation among $\hat\chi$, $\chi_u$ and $\chi$ when $u_i$ is the unitary part of $a_i$:

\[ \hat\chi(a_1, \dots, a_N) \le \chi_u(u_1, \dots, u_N) + \chi(a_1^* a_1, \dots, a_N^* a_N) + \frac{N}{2}\left( \log\frac{\pi}{2} + \frac{3}{2} \right). \]

In Sect. 3 we show that the equality holds true in the above inequality under a suitable freeness assumption. Furthermore, we show the additivity properties of $\chi_u$ and $\hat\chi$ based on the above analysis through $\chi_{(u,+)}$. The result on approximate freeness for matrices in [17] and the formula for a separate change of variables in [15] are useful for our purpose. Our method has applications also to the maximization problems for $\hat\chi(a)$, for instance, under the fixed distribution of $a^* a$. The R-diagonal element introduced by A. Nica and R. Speicher [9] appears as the maximizer in this type of maximization of $\hat\chi(a)$. Section 4 treats the similar maximization for the free entropy in the case of a matrix of noncommutative random variables. Our Sect. 4 is strongly inspired by a recent paper of Nica, Shlyakhtenko and Speicher [8] where the same problems were considered for $\chi^*$. The authors are grateful for having access to [8] prior to its publication. Just after this paper was completed, we received a preprint of Nica, Shlyakhtenko and Speicher [19] where the same kind of maximization problems for $\chi$ were considered.

1. Free Entropies of Non-Selfadjoint and Unitary Random Variables

In this section we introduce the free entropies $\hat\chi(a_1, \dots, a_N)$ of non-selfadjoint random variables and $\chi_u(u_1, \dots, u_N)$ of unitary random variables. Throughout the paper let $(M, \tau)$ be a tracial W*-probability space, that is, $M$ is a von Neumann algebra with a faithful normal tracial state $\tau$. Let $M^{sa}$ be the set of selfadjoint elements in $M$. The free entropy $\chi(a_1, \dots, a_N)$ of an $N$-tuple of $a_1, \dots, a_N$ in $M^{sa}$ was introduced in [13].
One can define the free entropy of an N -tuple of (non-selfadjoint) elements in M with an appropriate slight modification of Voiculescu’s original definition.


Let $M_n$ denote the algebra of $n \times n$ complex matrices and $M_n^{sa}$ the space of selfadjoint matrices in $M_n$. Let $\mathrm{tr}_n$ stand for the normalized trace on $M_n$ while $\mathrm{Tr}_n$ is the usual trace. The Lebesgue measure $\hat\Lambda_n$ on $M_n$ (resp. $\Lambda_n$ on $M_n^{sa}$) is transformed from the usual Lebesgue measure on $\mathbb{R}^{2n^2}$ (resp. $\mathbb{R}^{n^2}$) via the natural isometry $M_n \cong \mathbb{R}^{2n^2}$ (resp. $M_n^{sa} \cong \mathbb{R}^{n^2}$) between the Hilbert–Schmidt norm of $M_n$ (resp. $M_n^{sa}$) and the Euclidean norm of $\mathbb{R}^{2n^2}$ (resp. $\mathbb{R}^{n^2}$). Consider the map $A \mapsto (B, C) \in (M_n^{sa})^2$ given by the Descartes decomposition $A = B + iC$. Since $\|A\|_{HS} = (\|B\|_{HS}^2 + \|C\|_{HS}^2)^{1/2}$ for the Hilbert–Schmidt norm, the following is obvious.

Lemma 1.1. Under the map $A \in M_n \mapsto (B, C) \in (M_n^{sa})^2$ above, $\hat\Lambda_n$ on $M_n$ corresponds to $\Lambda_n \otimes \Lambda_n$ on $(M_n^{sa})^2$.

Let $a_1, \dots, a_N \in M$. For $n, r \in \mathbb{N}$, $\varepsilon > 0$ and $R > 0$ define

\[ \hat\Gamma_R(a_1, \dots, a_N; n, r, \varepsilon) := \{ (A_1, \dots, A_N) \in (M_n)^N : \|A_i\| \le R,\ |\mathrm{tr}_n(A'_{i_1} \cdots A'_{i_k}) - \tau(a'_{i_1} \cdots a'_{i_k})| \le \varepsilon \text{ for all } 1 \le i_1, \dots, i_k \le 2N,\ 1 \le k \le r \}, \]

where

\[ (a'_1, \dots, a'_{2N}) := (a_1, \dots, a_N, a_1^*, \dots, a_N^*), \qquad (A'_1, \dots, A'_{2N}) := (A_1, \dots, A_N, A_1^*, \dots, A_N^*). \]

Moreover,

\[ \hat\chi_R(a_1, \dots, a_N; r, \varepsilon) := \limsup_{n \to \infty} \left[ \frac{1}{n^2} \log \hat\Lambda\bigl( \hat\Gamma_R(a_1, \dots, a_N; n, r, \varepsilon) \bigr) + N \log n \right] \]

($\hat\Lambda$ is used for $\hat\Lambda_n^{\otimes N}$ for brevity),

\[ \hat\chi_R(a_1, \dots, a_N) := \lim_{r \to \infty,\ \varepsilon \to +0} \hat\chi_R(a_1, \dots, a_N; r, \varepsilon), \]

\[ \hat\chi(a_1, \dots, a_N) := \sup_{R > 0} \hat\chi_R(a_1, \dots, a_N). \]

Then $\hat\chi(a_1, \dots, a_N)$ is called the free entropy of the $N$-tuple $(a_1, \dots, a_N)$.

Proposition 1.2. Let $a_1, \dots, a_N \in M$ and $b_i := (a_i + a_i^*)/2$, $c_i := (a_i - a_i^*)/2i$. Then

\[ \chi_R(b_1, c_1, \dots, b_N, c_N) \ge \hat\chi_R(a_1, \dots, a_N) \ge \chi_{R/2}(b_1, c_1, \dots, b_N, c_N), \]

\[ \hat\chi(a_1, \dots, a_N) = \chi(b_1, c_1, \dots, b_N, c_N). \]

Proof. The following are easy to check:

\[ \Gamma_R(b_1, c_1, \dots, b_N, c_N; n, r, \varepsilon) \supset \{ (B_1, C_1, \dots, B_N, C_N) \in (M_n^{sa})^{2N} : (B_1 + iC_1, \dots, B_N + iC_N) \in \hat\Gamma_R(a_1, \dots, a_N; n, r, \varepsilon) \}, \]

\[ \hat\Gamma_{2R}(a_1, \dots, a_N; n, r, \varepsilon) \supset \{ (B_1 + iC_1, \dots, B_N + iC_N) \in (M_n)^N : (B_1, C_1, \dots, B_N, C_N) \in \Gamma_R(b_1, c_1, \dots, b_N, c_N; n, r, \varepsilon/2^r) \}. \]


By Lemma 1.1 these imply that

\[ \chi_R(b_1, c_1, \dots, b_N, c_N; r, \varepsilon) \ge \hat\chi_R(a_1, \dots, a_N; r, \varepsilon), \]

\[ \hat\chi_{2R}(a_1, \dots, a_N; r, \varepsilon) \ge \chi_R(b_1, c_1, \dots, b_N, c_N; r, \varepsilon/2^r). \]

Hence we get the conclusions.
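The Descartes decomposition behind Lemma 1.1 and Proposition 1.2 is easy to illustrate on a concrete matrix. The sketch below is a minimal pure-Python illustration; the helper names are ours, and matrices are represented as nested lists of complex numbers. It computes $B = (A + A^*)/2$ and $C = (A - A^*)/2i$ and lets one verify that both are selfadjoint, that $A = B + iC$, and that $\|A\|_{HS}^2 = \|B\|_{HS}^2 + \|C\|_{HS}^2$, which is the isometry used to identify $\hat\Lambda_n$ with $\Lambda_n \otimes \Lambda_n$.

```python
def adjoint(A):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]


def descartes_decomposition(A):
    """Return (B, C) with B = (A + A*)/2 and C = (A - A*)/(2i), so A = B + iC."""
    n = len(A)
    A_star = adjoint(A)
    B = [[(A[i][j] + A_star[i][j]) / 2 for j in range(n)] for i in range(n)]
    C = [[(A[i][j] - A_star[i][j]) / 2j for j in range(n)] for i in range(n)]
    return B, C


def is_selfadjoint(X, tol=1e-12):
    """True if X equals its conjugate transpose up to tol."""
    n = len(X)
    X_star = adjoint(X)
    return all(abs(X[i][j] - X_star[i][j]) <= tol
               for i in range(n) for j in range(n))


def hs_norm_squared(X):
    """Squared Hilbert-Schmidt norm: the sum of |x_ij|^2."""
    return sum(abs(x) ** 2 for row in X for x in row)


if __name__ == "__main__":
    A = [[1 + 2j, 3 - 1j], [0.5j, -2.0]]
    B, C = descartes_decomposition(A)
    print(is_selfadjoint(B), is_selfadjoint(C))
    print(hs_norm_squared(A) - hs_norm_squared(B) - hs_norm_squared(C))
```

The norm identity follows from $\mathrm{Tr}(A^*A) = \mathrm{Tr}(B^2) + \mathrm{Tr}(C^2) + i\,\mathrm{Tr}(BC - CB)$, where the last trace vanishes.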

The above proposition enables us to reformulate the results on free entropy of selfadjoint random variables as those of non-selfadjoint ones. For instance, $\hat\chi(a_1, \dots, a_N)$ is subadditive and upper semicontinuous, similarly to the selfadjoint case treated in [13].

Proposition 1.3. Let $a_1, \dots, a_N \in M$ and $C > 0$. When $\tau(a_1^* a_1 + \cdots + a_N^* a_N) \le C$,

\[ \hat\chi(a_1, \dots, a_N) \le N \log \frac{\pi e C}{N}, \]

and the equality is attained if and only if $a_1, \dots, a_N$ are *-free circular elements of the same radius $\sqrt{4C/N}$.

Proof. Let $b_i, c_i$ be as above. Then $a_1, \dots, a_N$ are *-free circular elements of radius $2\sqrt{C/N}$ if and only if $b_1, c_1, \dots, b_N, c_N$ are free semicircular elements of radius $\sqrt{2C/N}$. Hence the result is just the translation of [15, Proposition 2.4] (also [3, Theorem 4.1]).

According to a strong result in [14] we know that $\chi(a_1, a_2) = -\infty$ if $a_1, a_2 \in M^{sa}$ commute. (This can be seen also by using the change of variable formula in [15].) By Proposition 1.2 and the subadditivity it follows that $\hat\chi(a_1, \dots, a_N) = -\infty$ whenever there is a normal element among $a_1, \dots, a_N$.

Next we turn to the entropy of unitary random variables. Let $\gamma_n$ denote the Haar probability measure on the unitary group $U(n)$. Let $u_1, \dots, u_N \in M$ be unitaries. For $n, r \in \mathbb{N}$ and $\varepsilon > 0$ define

\[ \Gamma_u(u_1, \dots, u_N; n, r, \varepsilon) := \{ (U_1, \dots, U_N) \in (U(n))^N : |\mathrm{tr}_n(U'_{i_1} \cdots U'_{i_k}) - \tau(u'_{i_1} \cdots u'_{i_k})| \le \varepsilon \text{ for all } 1 \le i_1, \dots, i_k \le 2N,\ 1 \le k \le r \}, \]

where

\[ (u'_1, \dots, u'_{2N}) := (u_1, \dots, u_N, u_1^*, \dots, u_N^*), \qquad (U'_1, \dots, U'_{2N}) := (U_1, \dots, U_N, U_1^*, \dots, U_N^*). \]

The free entropy $\chi_u(u_1, \dots, u_N)$ of the $N$-tuple $(u_1, \dots, u_N)$ is defined as follows:

\[ \chi_u(u_1, \dots, u_N; r, \varepsilon) := \limsup_{n \to \infty} \frac{1}{n^2} \log \gamma\bigl( \Gamma_u(u_1, \dots, u_N; n, r, \varepsilon) \bigr) \]

($\gamma$ is for $\gamma_n^{\otimes N}$ on $(U(n))^N$),

\[ \chi_u(u_1, \dots, u_N) := \lim_{r \to \infty,\ \varepsilon \to +0} \chi_u(u_1, \dots, u_N; r, \varepsilon). \]


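For the case of a single unitary we have

Proposition 1.4. Let $u \in M$ be a unitary and $\mu$ be the distribution measure of $u$ on $\mathbb{T}$. Then the limit

\[ \chi_u(u; r, \varepsilon) = \lim_{n \to \infty} \frac{1}{n^2} \log \gamma_n\bigl( \Gamma_u(u; n, r, \varepsilon) \bigr) \qquad (1.1) \]

exists for every $r \in \mathbb{N}$ and $\varepsilon > 0$, and

\[ \chi_u(u) = \Sigma(\mu) := \iint_{\mathbb{T}^2} \log|\zeta - \eta| \, d\mu(\zeta)\, d\mu(\eta). \]

In particular, if $\chi_u(u) > -\infty$ then $\mu$ is non-atomic.

Proof. It is convenient to use large deviation theory; see [2] for basics on large deviations. Let $M(\mathbb{T})$ be the space of all probability measures on $\mathbb{T}$ with the weak topology. Let $P_n$ be the empirical eigenvalue distribution of $\gamma_n$, which is a probability measure on $M(\mathbb{T})$. It is known [4] that $(P_n)$ satisfies the large deviation principle in the scale $n^{-2}$ with rate function $I(\mu) := -\Sigma(\mu)$. For $r \in \mathbb{N}$ and $\varepsilon > 0$ set a closed neighborhood of $\mu \in M(\mathbb{T})$

\[ F(r, \varepsilon) := \{ \nu \in M(\mathbb{T}) : |m_k(\nu) - m_k(\mu)| \le \varepsilon,\ -r \le k \le r \} \]

and an open neighborhood $G(r, \varepsilon)$ by replacing $\le \varepsilon$ by $< \varepsilon$ in the above, where $m_k(\mu)$ denotes the $k$th moment of $\mu$. Then the above large deviation theorem implies that

\[ \limsup_{n \to \infty} \frac{1}{n^2} \log P_n(F(r, \varepsilon)) \le \sup\{ \Sigma(\nu) : \nu \in F(r, \varepsilon) \}, \]

\[ \liminf_{n \to \infty} \frac{1}{n^2} \log P_n(G(r, \varepsilon)) \ge \sup\{ \Sigma(\nu) : \nu \in G(r, \varepsilon) \}. \]

But it is straightforward to see that

\[ P_n(G(r, \varepsilon)) \le P_n(F(r, \varepsilon)) = \gamma_n(\Gamma_u(u; n, r, \varepsilon)), \qquad \sup\{ \Sigma(\nu) : \nu \in F(r, \varepsilon) \} = \sup\{ \Sigma(\nu) : \nu \in G(r, \varepsilon) \}. \]

Therefore,

\[ \chi_u(u; r, \varepsilon) = \lim_{n \to \infty} \frac{1}{n^2} \log \gamma_n(\Gamma_u(u; n, r, \varepsilon)) = \sup\{ \Sigma(\nu) : \nu \in F(r, \varepsilon) \}. \]

Since the latter tends to $\Sigma(\mu)$ as $r \to \infty$ and $\varepsilon \to +0$, we have $\chi_u(u) = \Sigma(\mu)$.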
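To get a feeling for the functional $\Sigma(\mu)$, one can evaluate a discretized version of the double integral on a finite point configuration, omitting the diagonal terms. This is only a heuristic illustration and not part of the proof; the function names are ours. For the $n$th roots of unity, which discretize the Haar (uniform) measure on $\mathbb{T}$, the product of the distances from one root to all the others equals $n$, so the off-diagonal sum is exactly $n \log n$ and the normalized value $(\log n)/n$ tends to $0$, consistent with $\chi_u(u) = 0$ for a Haar unitary.

```python
import cmath
import math


def roots_of_unity(n):
    """The n-th roots of unity, a discretization of the uniform measure on T."""
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]


def discrete_log_energy(points):
    """(1/n^2) * sum over i != j of log|z_i - z_j| (diagonal terms omitted)."""
    n = len(points)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += math.log(abs(points[i] - points[j]))
    return total / (n * n)


if __name__ == "__main__":
    for n in (10, 50, 200):
        print(n, discrete_log_energy(roots_of_unity(n)), math.log(n) / n)
```

Clustering the points, for instance by placing them all on a short arc, makes the discrete energy strongly negative, in line with the statement that $\chi_u(u) > -\infty$ forces $\mu$ to be non-atomic.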

Remark 1.5. When $a \in M^{sa}$, similarly to the above proof one can use the large deviation technique to show that the limit

\[ \chi_R(a; r, \varepsilon) = \lim_{n \to \infty} \left[ \frac{1}{n^2} \log \Lambda_n\bigl( \Gamma_R(a; n, r, \varepsilon) \bigr) + \frac{1}{2} \log n \right] \]

exists for every $r \in \mathbb{N}$, $\varepsilon > 0$ and $R \ge \|a\|$. This slightly improves the result in [13]. The large deviation used here is concerned with the empirical eigenvalue distribution for the normalized Lebesgue measure on $\{ A \in M_n^{sa} : \|A\| \le R \}$. (The details are in [6].)

The negativity $\chi_u(u_1, \dots, u_N) \le 0$ is obvious. The subadditivity and the upper semicontinuity of $\chi_u(u_1, \dots, u_N)$ are easily shown as in the selfadjoint case in [13]. The following is a unitary counterpart of [13, Proposition 3.8].


Proposition 1.6. Let $u_1, \dots, u_N, v_1, \dots, v_N \in M$ be unitaries. If $v_1 = u_1$ and $v_i u_i^* \in \{u_1, \dots, u_{i-1}\}''$ for $2 \le i \le N$, then $\chi_u(u_1, \dots, u_N) = \chi_u(v_1, \dots, v_N)$.

Proof. Since the assumption implies also that $u_i v_i^* \in \{v_1, \dots, v_{i-1}\}''$ for $2 \le i \le N$, it suffices to show that $\chi_u(u_1, \dots, u_N) \le \chi_u(v_1, \dots, v_N)$. One can choose selfadjoint noncommutative polynomials $P_{m,i}(X_1, X_2, \dots, X_{2i-2})$ for $2 \le i \le N$, $m \in \mathbb{N}$, such that

\[ \exp\left( i P_{m,i}\left( \frac{u_1 + u_1^*}{2}, \frac{u_1 - u_1^*}{2i}, \dots, \frac{u_{i-1} + u_{i-1}^*}{2}, \frac{u_{i-1} - u_{i-1}^*}{2i} \right) \right) \to v_i u_i^* \]

strongly* as $m \to \infty$. Set $v_{m,1} := v_1 = u_1$ and for $2 \le i \le N$,

\[ v_{m,i} := \exp\left( i P_{m,i}\left( \frac{u_1 + u_1^*}{2}, \frac{u_1 - u_1^*}{2i}, \dots, \frac{u_{i-1} + u_{i-1}^*}{2}, \frac{u_{i-1} - u_{i-1}^*}{2i} \right) \right) u_i. \]

Then $v_{m,i} \to v_i$ strongly* as $m \to \infty$. If a map $\Phi : U(n)^N \to U(n)^N$, $\Phi(U_1, \dots, U_N) = (V_1, \dots, V_N)$, is defined by $V_1 := U_1$ and for $2 \le i \le N$,

\[ V_i := \exp\left( i P_{m,i}\left( \frac{U_1 + U_1^*}{2}, \frac{U_1 - U_1^*}{2i}, \dots, \frac{U_{i-1} + U_{i-1}^*}{2}, \frac{U_{i-1} - U_{i-1}^*}{2i} \right) \right) U_i, \]

then it is obvious that $\gamma \circ \Phi = \gamma$ holds due to the multiplication invariance of $\gamma$. For any $m, r \in \mathbb{N}$ and $\varepsilon > 0$ one can easily see that there are $r_1 \in \mathbb{N}$ and $\varepsilon_1 > 0$ such that

\[ \Phi(\Gamma_u(u_1, \dots, u_N; n, r_1, \varepsilon_1)) \subset \Gamma_u(v_{m,1}, \dots, v_{m,N}; n, r, \varepsilon) \qquad (n \in \mathbb{N}). \]

This yields $\chi_u(u_1, \dots, u_N; r_1, \varepsilon_1) \le \chi_u(v_{m,1}, \dots, v_{m,N}; r, \varepsilon)$ so that $\chi_u(u_1, \dots, u_N) \le \chi_u(v_{m,1}, \dots, v_{m,N})$. Hence the desired inequality follows as $m \to \infty$ thanks to the upper semicontinuity.

2. Relation Among Different Free Entropies

Let $u_1, \dots, u_N \in M$ be unitaries and $h_1, \dots, h_N \in M^+$, where $M^+$ denotes the set of positive elements in $M$. The free entropy $\hat\chi(u_1 h_1, \dots, u_N h_N)$ may also be considered as the free entropy of the $2N$-tuple $(u_1, \dots, u_N, h_1, \dots, h_N)$, or rather $(u_1, \dots, u_N, h_1^2, \dots, h_N^2)$, of unitary and positive random variables mixed. In this section we will first introduce the free entropy of the mixed tuple of this kind and next obtain its connection with $\hat\chi(u_1 h_1, \dots, u_N h_N)$. In this way, we can construct a bridge between the free entropy of unitary random variables and that of non-selfadjoint ones (thus selfadjoint ones).

For a non-singular $A \in M_n$ (the singular case is negligible) one has a unique polar decomposition $A = UH$ with $U \in U(n)$ and $H = |A| \in M_n^+$, where $M_n^+$ denotes the set of positive matrices in $M_n$. Let $\Lambda_{+,n}$ be the measure on $M_n^+$ induced from $\hat\Lambda_n$ on $M_n$ via the map $A \mapsto A^* A$. (This measure is more convenient than that induced via $A \mapsto |A|$.) The next lemma shows that $\hat\Lambda_n$ on $M_n$ corresponds (up to a constant) to the product of $\gamma_n$ on $U(n)$ and the restriction of $\Lambda_n$ on $M_n^+$.


ˆ n is transformed to the product measure γn ⊗ 3+,n under Lemma 2.1. The measure 3 the map A ∈ Mn 7→ (U, A∗ A) ∈ U(n) × Mn+ (U is the unitary part of A). Furthermore, the measure 3+,n is a constant multiple of the restriction of 3n on Mn+ : 3+,n = Cn 3n |Mn+ with Cn = 2

−n(n−1)/2 n(n+1)/2

π

n−1 Y −1 j! . j=1

Proof. We consider the coordinate change $H \in M_n^+ \leftrightarrow (V, D) \in \mathcal{U}(n)/T \times (\mathbb{R}^+)^n_\le$ given by the diagonalization $H = VDV^*$, where $T$ is the group of diagonal unitaries and $(\mathbb{R}^+)^n_\le := \{(t_1,\dots,t_n) : 0 \le t_1 \le \dots \le t_n\}$. Let $\dot\gamma_n$ be the probability measure on $\mathcal{U}(n)/T$ induced from $\gamma_n$. Write $A^*A = VDV^*$ and $A = UVD^{1/2}V^*$ with $U, V \in \mathcal{U}(n)$ and $D = \mathrm{Diag}(t_1,\dots,t_n)$. Differentiating $A = UVD^{1/2}V^*$ and using the standard method for random matrices (see [7]) one can easily see that $\hat\Lambda_n$ is transformed to the measure
\[ \gamma_n \otimes \dot\gamma_n \otimes C_n' \prod_{i<j} (t_i - t_j)^2 \prod_{i=1}^n dt_i \]
on $\mathcal{U}(n) \times \mathcal{U}(n)/T \times (\mathbb{R}^+)^n_\le$ under the map $A \mapsto (U, V, D)$. To determine the normalizing constant $C_n'$, use the standard $n \times n$ non-selfadjoint Gaussian matrix having the distribution $(\pi/n)^{-n^2} \exp(-n^2\, \mathrm{tr}_n(A^*A))$ and compute
\[ 1 = \Bigl(\frac{\pi}{n}\Bigr)^{-n^2} C_n' \int_{(\mathbb{R}^+)^n_\le} \exp\Bigl(-n \sum_{i=1}^n t_i\Bigr) \prod_{i<j} (t_i - t_j)^2 \, dt = \pi^{-n^2} C_n' \Bigl( \prod_{j=1}^{n-1} j! \Bigr)^2 \]
by the Selberg integral [7, p. 354]. Hence, under the coordinate change $H \leftrightarrow (V, D)$, the measure $\Lambda_{+,n}$ is written as
\[ \dot\gamma_n \otimes \pi^{n^2} \Bigl( \prod_{j=1}^{n-1} j! \Bigr)^{-2} \prod_{i<j} (t_i - t_j)^2 \prod_{i=1}^n dt_i. \]
On the other hand, $\Lambda_n$ is given as
\[ \dot\gamma_n \otimes (2\pi)^{n(n-1)/2} \Bigl( \prod_{j=1}^{n-1} j! \Bigr)^{-1} \prod_{i<j} (t_i - t_j)^2 \prod_{i=1}^n dt_i. \]
(See the proof of [13, Lemma 4.2] for the latter normalizing constant.) Comparing the above two we get the conclusion.

The above constant $C_n$ satisfies
\[ \lim_{n\to\infty} \Bigl( \frac{1}{n^2} \log C_n + \frac{1}{2} \log n \Bigr) = \frac{1}{2} \log\frac{\pi}{2} + \frac{3}{4}, \tag{2.1} \]
as is readily checked from the Stirling formula.
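The limit (2.1) can be sanity-checked numerically. The following is only a sketch: the function names `log_C` and `normalized` are illustrative choices, not notation from the paper, and the check uses `math.lgamma` to evaluate $\log j!$ without overflow.

```python
import math

def log_C(n):
    """log C_n for C_n = 2^{-n(n-1)/2} * pi^{n(n+1)/2} / prod_{j=1}^{n-1} j!."""
    # lgamma(j + 1) = log(j!), summed over j = 1, ..., n-1
    log_prod_factorials = sum(math.lgamma(j + 1) for j in range(1, n))
    return (-(n * (n - 1) / 2) * math.log(2)
            + (n * (n + 1) / 2) * math.log(math.pi)
            - log_prod_factorials)

def normalized(n):
    """The quantity (1/n^2) log C_n + (1/2) log n appearing in (2.1)."""
    return log_C(n) / n**2 + 0.5 * math.log(n)

# Claimed limit in (2.1)
LIMIT = 0.5 * math.log(math.pi / 2) + 0.75

for n in (10, 100, 1000):
    print(n, normalized(n), normalized(n) - LIMIT)
```

The printed differences should shrink as $n$ grows, which is consistent with the Stirling-type asymptotics invoked in the text.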


Let $u_1,\dots,u_N \in \mathcal{M}$ be unitaries and $h_1,\dots,h_L \in \mathcal{M}^+$. For $n, r \in \mathbb{N}$, $\varepsilon > 0$ and $R > 0$ we define
\[ \Gamma_{(u,+),R}(u_1,\dots,u_N; h_1,\dots,h_L; n, r, \varepsilon) := \bigl\{ (U_1,\dots,U_N; H_1,\dots,H_L) \in (\mathcal{U}(n))^N \times (M_n^+)^L : \|H_i\| \le R,\ |\mathrm{tr}_n(B_{i_1} \cdots B_{i_k}) - \tau(b_{i_1} \cdots b_{i_k})| \le \varepsilon \text{ for all } 1 \le i_1,\dots,i_k \le 2N+L,\ 1 \le k \le r \bigr\}, \]
where
\[ (b_1,\dots,b_{2N+L}) := (u_1,\dots,u_N, u_1^*,\dots,u_N^*, h_1,\dots,h_L), \]
\[ (B_1,\dots,B_{2N+L}) := (U_1,\dots,U_N, U_1^*,\dots,U_N^*, H_1,\dots,H_L), \]
and further define
\[ \chi_{(u,+),R}(u_1,\dots,u_N; h_1,\dots,h_L; r, \varepsilon) := \limsup_{n\to\infty} \Bigl( \frac{1}{n^2} \log (\gamma \otimes \Lambda_+)\bigl(\Gamma_{(u,+),R}(u_1,\dots,u_N; h_1,\dots,h_L; n, r, \varepsilon)\bigr) + L \log n \Bigr), \]
where $\gamma \otimes \Lambda_+$ is the abbreviation of $\gamma_n^{\otimes N} \otimes \Lambda_{+,n}^{\otimes L}$. Now the definition of the free entropy $\chi_{(u,+)}(u_1,\dots,u_N; h_1,\dots,h_L)$ is as before. When $L = 0$ or $(h_1,\dots,h_L)$ is void, this is nothing but $\chi_u(u_1,\dots,u_N)$ in the previous section. On the other hand, when no unitaries are present, we write $\Gamma_{+,R}(h_1,\dots,h_L; n, r, \varepsilon)$, $\chi_+(h_1,\dots,h_L)$, etc. It is obvious that the free entropy $\chi_{(u,+)}$ has the subadditivity property as $\chi$ and $\chi_u$. Moreover, the following upper semicontinuity of $\chi_{(u,+)}$ can be shown in a way similar to [13, Props. 2.4 and 2.6].

Proposition 2.2. Let $u_1,\dots,u_N$ and $u_{m,1},\dots,u_{m,N}$ ($m \in \mathbb{N}$) be unitaries in $\mathcal{M}$. Let $h_1,\dots,h_L$ and $h_{m,1},\dots,h_{m,L}$ ($m \in \mathbb{N}$) be in $\mathcal{M}^+$. If $(u_{m,1},\dots,u_{m,N}, h_{m,1},\dots,h_{m,L}) \to (u_1,\dots,u_N, h_1,\dots,h_L)$ in *-distribution and $\sup_m \|h_{m,i}\| < +\infty$ ($1 \le i \le L$), then
\[ \chi_{(u,+)}(u_1,\dots,u_N; h_1,\dots,h_L) \ge \limsup_{m\to\infty} \chi_{(u,+)}(u_{m,1},\dots,u_{m,N}; h_{m,1},\dots,h_{m,L}). \]

The next proposition says that the free entropy $\chi_+$ is nothing but $\chi$ restricted on positive random variables up to additive constants.

Proposition 2.3. The equality
\[ \chi_+(h_1,\dots,h_N) = \chi(h_1,\dots,h_N) + \frac{N}{2} \Bigl( \log\frac{\pi}{2} + \frac{3}{2} \Bigr) \]
holds for every $N \in \mathbb{N}$ and $h_1,\dots,h_N \in \mathcal{M}^+$.

Proof. For $r \in \mathbb{N}$, $\varepsilon > 0$ and $R > 0$, it is obvious that $\Gamma_{+,R}(h_1,\dots,h_N; n, r, \varepsilon) \subset \Gamma_R(h_1,\dots,h_N; n, r, \varepsilon)$ (the right-hand side is taken in $(M_n^{sa})^N \supset (M_n^+)^N$). Hence it immediately follows from Lemma 2.1 and (2.1) that
\[ \chi_+(h_1,\dots,h_N) \le \chi(h_1,\dots,h_N) + \frac{N}{2} \Bigl( \log\frac{\pi}{2} + \frac{3}{2} \Bigr). \]


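To show the reverse inequality, we choose $(h_1 + \delta 1,\dots,h_N + \delta 1)$ instead of $(h_1,\dots,h_N)$ and also $R > \max_i \|h_i\| + \delta$ for $\delta > 0$. From [13, Prop. 2.4] and the translation invariance of $\Lambda_n$ we can estimate
\begin{align*}
\chi(h_1,\dots,h_N) &= \chi(h_1 + \delta 1,\dots,h_N + \delta 1) \\
&= \lim_{\substack{r\to\infty \\ \varepsilon\to+0}} \limsup_{n\to\infty} \Bigl( \frac{1}{n^2} \log \Lambda_n\bigl(\Gamma_{+,R}(h_1 + \delta 1,\dots,h_N + \delta 1; n, r, \varepsilon)\bigr) + \frac{N}{2} \log n \Bigr) \\
&= \lim_{\substack{r\to\infty \\ \varepsilon\to+0}} \limsup_{n\to\infty} \Bigl( \frac{1}{n^2} \log \Lambda_{+,n}\bigl(\Gamma_{+,R}(h_1 + \delta 1,\dots,h_N + \delta 1; n, r, \varepsilon)\bigr) + N \log n - N \Bigl( \frac{1}{n^2} \log C_n + \frac{1}{2} \log n \Bigr) \Bigr) \\
&\le \chi_+(h_1 + \delta 1,\dots,h_N + \delta 1) - \frac{N}{2} \Bigl( \log\frac{\pi}{2} + \frac{3}{2} \Bigr).
\end{align*}
Using the upper semicontinuity (Proposition 2.2) as $\delta \to +0$ we obtain the result.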

Remarks 2.4. (1) For $\chi_+(h)$ of a single $h \in \mathcal{M}^+$ we have
\[ \chi_+(h) = \chi(h) + \frac{1}{2} \log\frac{\pi}{2} + \frac{3}{4} = \Sigma(\mu) + \log\pi + \frac{3}{2}, \]
where $\mu$ is the distribution of $h$ and $\Sigma(\mu) := \iint \log|s - t| \, d\mu(s) \, d\mu(t)$. Moreover, by Remark 1.5 and Lemma 2.1 we observe that the limit
\[ \chi_{+,R}(h; r, \varepsilon) = \lim_{n\to\infty} \Bigl[ \frac{1}{n^2} \log \Lambda_{+,n}\bigl(\Gamma_{+,R}(h; n, r, \varepsilon)\bigr) + \log n \Bigr] \tag{2.2} \]
exists for every $r \in \mathbb{N}$, $\varepsilon > 0$ and $R \ge \|h\|$.

(2) Note (see [3]) that, among $h \in \mathcal{M}^+$ with $\tau(h) \le C$, the free entropy $\chi_+(h)$ attains the maximal value $\log(\pi e C)$ when (and only when) $h$ has the distribution
\[ \frac{\sqrt{4Ct - t^2}}{2\pi C t} \, \chi_{[0,4C]}(t) \, dt \]
or equivalently $h^{1/2}$ is a quarter-circular element of radius $2\sqrt{C}$.

The following relation between two free entropies $\hat\chi$ and $\chi_{(u,+)}$ is naturally expected from the definitions in the light of Lemma 2.1.

Theorem 2.5. If $u_1,\dots,u_N \in \mathcal{M}$ are unitaries and $h_1,\dots,h_N \in \mathcal{M}^+$, then
\[ \hat\chi(u_1h_1,\dots,u_Nh_N) = \chi_{(u,+)}(u_1,\dots,u_N; h_1^2,\dots,h_N^2) \le \chi_u(u_1,\dots,u_N) + \chi(h_1^2,\dots,h_N^2) + \frac{N}{2} \Bigl( \log\frac{\pi}{2} + \frac{3}{2} \Bigr). \]
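The additive constant $\frac{1}{2}(\log\frac{\pi}{2} + \frac{3}{2})$ can be cross-checked against Remarks 2.4 in the one-variable case. The sketch below assumes Voiculescu's one-variable formula $\chi(h) = \Sigma(\mu) + \frac{3}{4} + \frac{1}{2}\log 2\pi$ (not restated in this section) and the value $\Sigma(\mu) = -\frac{1}{2}$ for the maximizing distribution with $C = 1$, which is the $\lambda = 1$ case of the formula quoted in Example 3.12; the names `SHIFT`, `chi_single` and `chi_plus_single` are illustrative only.

```python
import math

# Per-variable constant from Proposition 2.3 and Theorem 2.5
SHIFT = 0.5 * (math.log(math.pi / 2) + 1.5)

def chi_single(sigma):
    """Assumed one-variable formula: chi(h) = Sigma(mu) + 3/4 + (1/2) log(2 pi)."""
    return sigma + 0.75 + 0.5 * math.log(2 * math.pi)

def chi_plus_single(sigma):
    """chi_+(h) = chi(h) + SHIFT, as in Remarks 2.4 (1)."""
    return chi_single(sigma) + SHIFT

# Remarks 2.4 (1): chi_+(h) should equal Sigma(mu) + log(pi) + 3/2 for any Sigma(mu).
sigma = -0.5  # assumed value for the maximizer with C = 1
print(chi_plus_single(sigma), sigma + math.log(math.pi) + 1.5)

# Remarks 2.4 (2): rescaling h by C shifts Sigma(mu) by log C,
# so the maximal value should be log(pi * e * C).
C = 3.0
print(chi_plus_single(sigma + math.log(C)), math.log(math.pi * math.e * C))
```

Under these assumptions each printed pair agrees up to rounding, so for a circular element $uh$ (with $\chi_u(u) = 0$) the right-hand side of the inequality in Theorem 2.5 equals $\log(\pi e)$.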

To prove the theorem, we need to approximate the unitary part of $A$ by polynomials of $A$, $A^*$. The approximation here must be uniform for $A \in M_n$ with $\|A\| \le R$ in some sense. The next lemma provides the right approximation procedure for our purpose. Let $a \in \mathcal{M}$ and assume that the distribution of $|a|$ is non-atomic. Let $a = u|a|$ be the polar decomposition. Note that $u \in \mathcal{M}$ must be a unitary because $\ker a = \{0\}$ from the assumption (and $\mathcal{M}$ is a finite von Neumann algebra). Let $\|\cdot\|_p$ denote the Schatten $p$-norm with respect to $\tau$ or $\mathrm{tr}_n$.


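Lemma 2.6. With the above assumption and notation, for every $p \ge 1$, $\varepsilon > 0$ and $R \ge \|a\|$, there exist $n_0, r \in \mathbb{N}$, $\delta > 0$ and a real polynomial $P(t)$ such that $\|u - aP(a^*a)\|_p \le \varepsilon$, and such that, for each $n \ge n_0$, if $A \in M_n$ with $\|A\| \le R$ is non-singular and $U$ is the unitary part of $A$ and if
\[ |\mathrm{tr}_n((A^*A)^k) - \tau((a^*a)^k)| \le \delta \quad (1 \le k \le r), \tag{2.3} \]
then $\|U - AP(A^*A)\|_p \le \varepsilon$.

Proof. Let $\mu$ be the distribution of $|a|$. For every $\alpha, \beta > 0$, since
\[ u - a(|a| + \alpha 1)^{-1} = u\bigl(1 - |a|(|a| + \alpha 1)^{-1}\bigr) = \alpha u (|a| + \alpha 1)^{-1}, \]
we have
\[ \|u - a(|a| + \alpha 1)^{-1}\|_p^p = \|\alpha(|a| + \alpha 1)^{-1}\|_p^p = \int_0^\infty \Bigl( \frac{\alpha}{t + \alpha} \Bigr)^p d\mu(t) \le \mu([0, \beta]) + \Bigl( \frac{\alpha}{\beta} \Bigr)^p. \]
Similarly for any non-singular $A \in M_n$ with $A = U|A|$, we have
\[ \|U - A(|A| + \alpha I)^{-1}\|_p^p = \frac{1}{n} \sum_{i=1}^n \Bigl( \frac{\alpha}{\lambda_i + \alpha} \Bigr)^p \le \frac{1}{n} \#\{i : \lambda_i \le \beta\} + \Bigl( \frac{\alpha}{\beta} \Bigr)^p, \]
where $(0 <)\ \lambda_1 \le \lambda_2 \le \dots \le \lambda_n$ are the eigenvalues of $|A|$.

Now for each $n \in \mathbb{N}$, since $\mu$ is non-atomic, one can choose $0 < \xi_1^{(n)} < \xi_2^{(n)} < \dots < \xi_n^{(n)} = \|a\|$ such that $\mu([0, \xi_i^{(n)}]) = i/n$ ($1 \le i \le n$). Then it immediately follows that
\[ \tau((a^*a)^k) = \int_0^\infty t^{2k} \, d\mu(t) = \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^n (\xi_i^{(n)})^{2k} \quad (k \in \mathbb{N}). \]
Let $\beta > 0$ be fixed so that $\mu([0, 2\beta]) < \varepsilon^p/2$. By [13, Lemma 4.3] there are $r \in \mathbb{N}$ and $\delta > 0$ such that, for every $n \in \mathbb{N}$, if $(\lambda_1,\dots,\lambda_n) \in (\mathbb{R}^+)^n_\le$ satisfies
\[ \Bigl| \frac{1}{n} \sum_{i=1}^n \lambda_i^{2k} - \frac{1}{n} \sum_{i=1}^n (\xi_i^{(n)})^{2k} \Bigr| \le 2\delta \quad (1 \le k \le r), \]
then
\[ \frac{1}{n} \sum_{i=1}^n \bigl( \lambda_i^2 - (\xi_i^{(n)})^2 \bigr)^2 \le \beta^4 \varepsilon^p. \tag{2.4} \]
Next, choose $n_0 \in \mathbb{N}$ such that
\[ \Bigl| \frac{1}{n} \sum_{i=1}^n (\xi_i^{(n)})^{2k} - \tau((a^*a)^k) \Bigr| \le \delta \quad (1 \le k \le r) \]
whenever $n \ge n_0$. Then, for any $n \ge n_0$, (2.4) is valid if $A \in M_n$ satisfies (2.3). Furthermore, when (2.3) is satisfied, we have
\[ \frac{1}{n} \#\{i : \lambda_i \le \beta\} \le \frac{11}{18} \varepsilon^p. \tag{2.5} \]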


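Indeed, put $l := \#\{i : \lambda_i \le \beta\}$ and $m := \#\{i : \xi_i^{(n)} \le 2\beta\}$. If $m < i \le l$, then $\lambda_i \le \lambda_l \le \beta$, but $2\beta < \xi_{m+1}^{(n)} \le \xi_i^{(n)}$, so $(\lambda_i^2 - (\xi_i^{(n)})^2)^2 \ge (4\beta^2 - \beta^2)^2 = 9\beta^4$. Hence (2.4) implies $\frac{1}{n}(l - m) \cdot 9\beta^4 \le \beta^4 \varepsilon^p$, so that $l/n \le m/n + \varepsilon^p/9$. Since
\[ \frac{m}{n} = \mu([0, \xi_m^{(n)}]) \le \mu([0, 2\beta]) \le \frac{\varepsilon^p}{2}, \]
we have $l/n \le \varepsilon^p/2 + \varepsilon^p/9$, showing (2.5). By the above estimates altogether, we infer that, for each $\alpha > 0$ and $n \ge n_0$, if $A \in M_n$ with $\|A\| \le R$ is non-singular and satisfies (2.3), then
\[ \|U - A(|A| + \alpha I)^{-1}\|_p^p \le \frac{11}{18} \varepsilon^p + \Bigl( \frac{\alpha}{\beta} \Bigr)^p \]
as well as
\[ \|u - a(|a| + \alpha 1)^{-1}\|_p^p \le \frac{\varepsilon^p}{2} + \Bigl( \frac{\alpha}{\beta} \Bigr)^p. \]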
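The spectral bound used in these estimates depends only on the eigenvalues of $|A|$, so it can be illustrated without any matrix library. The sketch below is a minimal check on a hypothetical spectrum; the function names and the sample values of $\alpha$, $\beta$, $p$ are assumptions made for illustration.

```python
def regularization_error(eigs, alpha, p):
    """(1/n) * sum_i (alpha / (lam_i + alpha))^p,
    i.e. ||U - A(|A| + alpha I)^{-1}||_p^p expressed through the eigenvalues of |A|."""
    n = len(eigs)
    return sum((alpha / (lam + alpha)) ** p for lam in eigs) / n

def upper_bound(eigs, alpha, beta, p):
    """(1/n) * #{i : lam_i <= beta} + (alpha / beta)^p."""
    n = len(eigs)
    small = sum(1 for lam in eigs if lam <= beta)
    return small / n + (alpha / beta) ** p

# Hypothetical spectrum of |A|: 100 eigenvalues spread over (0, 1]
eigs = [0.01 * i for i in range(1, 101)]
alpha, beta, p = 1e-3, 0.05, 2

err = regularization_error(eigs, alpha, p)
bound = upper_bound(eigs, alpha, beta, p)
print(err, bound)
```

Each eigenvalue at most $\beta$ contributes at most $1$ to the sum, and each larger one contributes less than $(\alpha/\beta)^p$, which is why the bound holds for any spectrum and any choice of $\alpha, \beta > 0$.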

Choose $\alpha > 0$ such that $(\alpha/\beta)^p \le \varepsilon^p/18$, and next choose a polynomial $P(t)$ such that
\[ |P(t) - (\sqrt{t} + \alpha)^{-1}| \le \frac{1}{R} \Bigl( 1 - \Bigl( \frac{2}{3} \Bigr)^{1/p} \Bigr) \varepsilon \quad \text{on } [0, R^2]. \]
Then for each $n \ge n_0$ and $A$ as above, we obtain
\[ \|U - A(|A| + \alpha I)^{-1}\|_p \le \Bigl( \frac{2}{3} \Bigr)^{1/p} \varepsilon, \]
\[ \|AP(A^*A) - A(|A| + \alpha I)^{-1}\|_p \le \|A\| \, \|P(A^*A) - (|A| + \alpha I)^{-1}\| \le \Bigl( 1 - \Bigl( \frac{2}{3} \Bigr)^{1/p} \Bigr) \varepsilon, \]
so $\|U - AP(A^*A)\|_p \le \varepsilon$ holds, and similarly $\|u - aP(a^*a)\|_p \le \varepsilon$.

Proof of Theorem 2.5. First, the inequality in the theorem is a consequence of the subadditivity of $\chi_{(u,+)}$ and Proposition 2.3. Define $\Psi : M_n \to \mathcal{U}(n) \times M_n^+$ by $\Psi(A) := (U, A^*A)$, where $U$ is the unitary part of $A$. This is bijective except for the negligible singular elements (in $M_n$ and $M_n^+$). Put $a_i := u_i h_i$ so that $h_i^2 = a_i^* a_i$. Let $n, r \in \mathbb{N}$, $\varepsilon > 0$ and $R > \max\{1, \|h_1\|,\dots,\|h_N\|\}$. It is straightforward to see that there are $r_1 \in \mathbb{N}$ and $\varepsilon_1 > 0$ such that
\[ \Psi\bigl(\hat\Gamma_R(a_1,\dots,a_N; n, r_1, \varepsilon_1)\bigr) \subset \mathcal{U}(n) \times \Gamma_{+,R^2}(h_1^2,\dots,h_N^2; n, r, \varepsilon), \]
and by Lemma 2.1
\[ \hat\Lambda\bigl(\hat\Gamma_R(a_1,\dots,a_N; n, r_1, \varepsilon_1)\bigr) \le \Lambda_+\bigl(\Gamma_{+,R^2}(h_1^2,\dots,h_N^2; n, r, \varepsilon)\bigr) \]
for all $n \in \mathbb{N}$. This yields $\hat\chi(a_1,\dots,a_N) \le \chi_+(h_1^2,\dots,h_N^2)$. Hence we may assume that $\chi(h_1^2,\dots,h_N^2) > -\infty$, so the distribution of each $h_i$ is non-atomic. Let $\varepsilon_0 > 0$ be such that $r\varepsilon_0(R^2 + \varepsilon_0)^{r-1} \le \varepsilon/3$. By Lemma 2.6 there exist $n_0, r_0 \in \mathbb{N}$, $\delta > 0$ and real polynomials $P_i(t)$ ($1 \le i \le N$) such that $\|u_i - a_i P_i(a_i^* a_i)\|_r \le \varepsilon_0$, and such that, for each $1 \le i \le N$ and $n \ge n_0$, if $A_i \in M_n$ is non-singular with $A_i = U_i|A_i|$, $\|A_i\| \le R$, and
\[ |\mathrm{tr}_n((A_i^* A_i)^k) - \tau((a_i^* a_i)^k)| \le \delta \quad (1 \le k \le r_0), \]


then $\|U_i - A_i P_i(A_i^* A_i)\|_r \le \varepsilon_0$. For $A_i \in M_n$ ($1 \le i \le N$) satisfying the above conditions, we set
\[ (B_1,\dots,B_{3N}) := (U_1,\dots,U_N, U_1^*,\dots,U_N^*, A_1^*A_1,\dots,A_N^*A_N), \]
\[ (B_1',\dots,B_{3N}') := (A_1P_1(A_1^*A_1),\dots,A_NP_N(A_N^*A_N), P_1(A_1^*A_1)A_1^*,\dots,P_N(A_N^*A_N)A_N^*, A_1^*A_1,\dots,A_N^*A_N), \]
as well as
\[ (b_1,\dots,b_{3N}) := (u_1,\dots,u_N, u_1^*,\dots,u_N^*, a_1^*a_1,\dots,a_N^*a_N), \]
\[ (b_1',\dots,b_{3N}') := (a_1P_1(a_1^*a_1),\dots,a_NP_N(a_N^*a_N), P_1(a_1^*a_1)a_1^*,\dots,P_N(a_N^*a_N)a_N^*, a_1^*a_1,\dots,a_N^*a_N). \]
Then for any $n \ge n_0$ and $1 \le i_1,\dots,i_k \le 3N$ ($1 \le k \le r$), by using the Hölder inequality, it is checked that
\[ |\mathrm{tr}_n(B_{i_1} \cdots B_{i_k}) - \mathrm{tr}_n(B_{i_1}' \cdots B_{i_k}')| \le \|B_{i_1} \cdots B_{i_k} - B_{i_1}' \cdots B_{i_k}'\|_1 \le k\varepsilon_0(R^2 + \varepsilon_0)^{k-1} \le \frac{\varepsilon}{3}, \]
and similarly $|\tau(b_{i_1} \cdots b_{i_k}) - \tau(b_{i_1}' \cdots b_{i_k}')| \le \varepsilon/3$. Now choose $r_1$ ($\ge 2r_0$) large enough and $\varepsilon_1$ ($\le \delta$) small enough such that if $(A_1,\dots,A_N) \in \hat\Gamma_R(a_1,\dots,a_N; n, r_1, \varepsilon_1)$ then $|\mathrm{tr}_n(B_{i_1}' \cdots B_{i_k}') - \tau(b_{i_1}' \cdots b_{i_k}')| \le \varepsilon/3$ for all $1 \le i_1,\dots,i_k \le 3N$ ($1 \le k \le r$). Therefore, for $n \ge n_0$ we obtain
\[ \Psi\bigl(\hat\Gamma_R(a_1,\dots,a_N; n, r_1, \varepsilon_1)\bigr) \subset \Gamma_{(u,+),R^2}(u_1,\dots,u_N; h_1^2,\dots,h_N^2; n, r, \varepsilon) \]
(up to negligible sets) and hence by Lemma 2.1,
\[ \hat\Lambda\bigl(\hat\Gamma_R(a_1,\dots,a_N; n, r_1, \varepsilon_1)\bigr) \le (\gamma \otimes \Lambda_+)\bigl(\Gamma_{(u,+),R^2}(u_1,\dots,u_N; h_1^2,\dots,h_N^2; n, r, \varepsilon)\bigr). \]
This implies that $\hat\chi(a_1,\dots,a_N) \le \chi_{(u,+)}(u_1,\dots,u_N; h_1^2,\dots,h_N^2)$.

Conversely, given $r \in \mathbb{N}$, $\varepsilon > 0$ and $R > 0$, by approximating $\sqrt{t}$ on $[0, R^2]$ by a polynomial, it is seen that there are $r_1 \in \mathbb{N}$ and $\varepsilon_1 > 0$ such that
\[ \Gamma_{(u,+),R^2}(u_1,\dots,u_N; h_1^2,\dots,h_N^2; n, r_1, \varepsilon_1) \subset \Psi\bigl(\hat\Gamma_R(a_1,\dots,a_N; n, r, \varepsilon)\bigr) \]
(up to negligible sets) for all $n \in \mathbb{N}$. This gives the reverse inequality.

Theorem 2.5 gives
\[ \hat\chi(a_1,\dots,a_N) \le \chi_u(u_1,\dots,u_N) + \chi(a_1^*a_1,\dots,a_N^*a_N) + \frac{N}{2} \Bigl( \log\frac{\pi}{2} + \frac{3}{2} \Bigr) \]
for every $a_1,\dots,a_N \in \mathcal{M}$ and all unitaries $u_1,\dots,u_N \in \mathcal{M}$ satisfying $a_i = u_i|a_i|$. In particular, we have the following corollary. Its proof was indeed given in the first paragraph of the proof of Theorem 2.5.

Corollary 2.7. Let $a_1,\dots,a_N \in \mathcal{M}$. If $\hat\chi(a_1,\dots,a_N) > -\infty$, then the distribution of $a_i^*a_i$ is non-atomic (hence $\ker a_i = \{0\}$) for every $1 \le i \le N$.


3. Additivity of Free Entropies

In this section we first show that the inequality in Theorem 2.5 can be replaced by the equality in some cases of the free relation. Second, we discuss the additivity properties of the free entropies $\chi_u$ and $\hat\chi$. The characterization of the additivity of $\chi_u$ is completely analogous to the case of $\chi$.

First, we take a free family $\{h_1,\dots,h_N\}$ which is also free from $\{u_1,\dots,u_N, u_1^*,\dots,u_N^*\}$. Then an exact relation among $\hat\chi$, $\chi_u$ and $\chi$ is obtained as follows. Hence we have a formula for $\chi_u$ in terms of $\hat\chi$ (hence $\chi$).

Theorem 3.1. Let $u_1,\dots,u_N \in \mathcal{M}$ be unitaries and $h_1,\dots,h_N \in \mathcal{M}^+$. If $\{u_1,\dots,u_N, u_1^*,\dots,u_N^*\}, h_1,\dots,h_N$ are free, then
\[ \hat\chi(u_1h_1,\dots,u_Nh_N) = \chi_u(u_1,\dots,u_N) + \sum_{i=1}^N \chi(h_i^2) + \frac{N}{2} \Bigl( \log\frac{\pi}{2} + \frac{3}{2} \Bigr). \]
In particular, if $h_1,\dots,h_N$ are free standard (i.e. of radius 2) quarter-circular elements and they are free from $\{u_1,\dots,u_N, u_1^*,\dots,u_N^*\}$, then
\[ \chi_u(u_1,\dots,u_N) = \hat\chi(u_1h_1,\dots,u_Nh_N) - N\log(\pi e) = \chi(b_1, c_1,\dots,b_N, c_N) - N\log(\pi e), \]
where $u_ih_i = b_i + \mathrm{i}\,c_i$ with selfadjoint $b_i, c_i$.

In the proof below we use Voiculescu's result on approximate freeness for standard unitary random matrices. The notion of approximate freeness for matrices was introduced in [17]. Let $(M_n^{\star N}, \mathrm{tr}_n^{\star N})$ be the free product of $N$ copies of $(M_n, \mathrm{tr}_n)$ and $j_i$ the injection of $M_n$ into the $i$th copy in $M_n^{\star N}$. When $\Omega_i \subset M_n$ ($1 \le i \le N$), $r \in \mathbb{N}$ and $\varepsilon > 0$ are given, the subsets $\Omega_1,\dots,\Omega_N$ are said to be $(r, \varepsilon)$-free if
\[ |\mathrm{tr}_n(A_1 \cdots A_k) - \mathrm{tr}_n^{\star N}(\tilde A_1 \cdots \tilde A_k)| \le \varepsilon \]
for all $A_1,\dots,A_k \in \bigsqcup_{i=1}^N \Omega_i$, $1 \le k \le r$, where $\tilde A := j_i(A)$ for $A \in \Omega_i$.

Lemma 3.2. Let $u_1,\dots,u_N, h_1,\dots,h_N$ be as in Theorem 3.1, and assume that $\chi_u(u_1,\dots,u_N) > -\infty$ and $\chi_+(h_i^2) > -\infty$ ($1 \le i \le N$). Then, for every $r \in \mathbb{N}$, $\varepsilon > 0$ and $R > \max_i \|h_i\|^2$, there exists $\varepsilon_1 > 0$ such that
\[ \lim_{n\to\infty} \frac{(\gamma \otimes \Lambda_+)\bigl(\Xi_n(r, \varepsilon_1) \cap \Theta_n(r, \varepsilon)\bigr)}{(\gamma \otimes \Lambda_+)\bigl(\Xi_n(r, \varepsilon_1)\bigr)} = 1, \]
where
\[ \Xi_n(r, \varepsilon_1) := \Gamma_u(u_1,\dots,u_N; n, r, \varepsilon_1) \times \prod_{i=1}^N \Gamma_{+,R}(h_i^2; n, r, \varepsilon_1), \]
\[ \Theta_n(r, \varepsilon) := \Gamma_{(u,+),R}(u_1,\dots,u_N; h_1^2,\dots,h_N^2; n, r, \varepsilon). \]

Proof. Thanks to the freeness of $\{u_1,\dots,u_N, u_1^*,\dots,u_N^*\}, h_1^2,\dots,h_N^2$, one can choose $\varepsilon_1 > 0$ such that if $(U_1,\dots,U_N; H_1,\dots,H_N) \in \Xi_n(r, \varepsilon_1)$ and $\{U_1,\dots,U_N, U_1^*,\dots,U_N^*\}, \{H_1\},\dots,\{H_N\}$ are $(r, \varepsilon_1)$-free, then $(U_1,\dots,U_N; H_1,\dots,H_N) \in \Theta_n(r, \varepsilon)$. For every $\theta > 0$, according to [17, Cor. 2.13], there exists $n_0 \in \mathbb{N}$ such that

\[ \gamma\bigl(\{(V_1,\dots,V_N) \in (\mathcal{U}(n))^N : \{U_1,\dots,U_N, U_1^*,\dots,U_N^*\}, \{V_1H_1V_1^*\},\dots,\{V_NH_NV_N^*\} \text{ are } (r, \varepsilon_1)\text{-free}\}\bigr) \ge 1 - \theta \tag{3.1} \]
whenever $n \ge n_0$ independently of the choice of any $U_i \in \mathcal{U}(n)$ and $H_i \in M_n^+$ with $\|H_i\| \le R$ ($1 \le i \le N$). By the assumption that $\chi_u(u_1,\dots,u_N) > -\infty$ and $\chi_+(h_i^2) > -\infty$, it follows that the $\gamma \otimes \Lambda_+$-measure of $\Xi_n(r, \varepsilon_1)$ is positive (at least if $n$ is large). So, for any large $n$ ($\ge n_0$) one can define the probability measure $\sigma_n$ on $\Xi_n(r, \varepsilon_1)$ by normalizing the restriction of $\gamma \otimes \Lambda_+$ to $\Xi_n(r, \varepsilon_1)$. Then, since $\sigma_n$ is invariant under the action of $(\mathcal{U}(n))^N$ on $\Xi_n(r, \varepsilon_1)$ given by
\[ (U_1,\dots,U_N; H_1,\dots,H_N) \mapsto (U_1,\dots,U_N; V_1H_1V_1^*,\dots,V_NH_NV_N^*) \]
for $(V_1,\dots,V_N) \in (\mathcal{U}(n))^N$, we have
\[ \frac{(\gamma \otimes \Lambda_+)\bigl(\Xi_n(r, \varepsilon_1) \cap \Theta_n(r, \varepsilon)\bigr)}{(\gamma \otimes \Lambda_+)\bigl(\Xi_n(r, \varepsilon_1)\bigr)} = \int_{\Xi_n(r,\varepsilon_1)} \int_{(\mathcal{U}(n))^N} \psi(U_1,\dots,U_N; V_1H_1V_1^*,\dots,V_NH_NV_N^*) \, d\gamma(V_1,\dots,V_N) \, d\sigma_n, \]
where $\psi$ is the characteristic function of $\Xi_n(r, \varepsilon_1) \cap \Theta_n(r, \varepsilon)$. The choice of $\varepsilon_1$ and (3.1) show that
\[ \int_{(\mathcal{U}(n))^N} \psi(U_1,\dots,U_N; V_1H_1V_1^*,\dots,V_NH_NV_N^*) \, d\gamma(V_1,\dots,V_N) \ge 1 - \theta \]
for all $(U_1,\dots,U_N; H_1,\dots,H_N) \in \Xi_n(r, \varepsilon_1)$. Therefore, we infer that
\[ \frac{(\gamma \otimes \Lambda_+)\bigl(\Xi_n(r, \varepsilon_1) \cap \Theta_n(r, \varepsilon)\bigr)}{(\gamma \otimes \Lambda_+)\bigl(\Xi_n(r, \varepsilon_1)\bigr)} \ge 1 - \theta \]
whenever $n$ is large enough, and the result follows.

Proof of Theorem 3.1. By Theorem 2.5 and Proposition 2.3 it suffices to show that
\[ \chi_{(u,+)}(u_1,\dots,u_N; h_1^2,\dots,h_N^2) \ge \chi_u(u_1,\dots,u_N) + \sum_{i=1}^N \chi_+(h_i^2), \tag{3.2} \]
so we may assume that $\chi_u(u_1,\dots,u_N) > -\infty$ and $\chi_+(h_i^2) > -\infty$ ($1 \le i \le N$). For any $r \in \mathbb{N}$, $\varepsilon > 0$ and $R > \max_i \|h_i\|^2$, let $\varepsilon_1 > 0$ be as in Lemma 3.2. Then we have
\begin{align*}
\chi_{(u,+),R}(u_1,\dots,u_N; h_1^2,\dots,h_N^2; r, \varepsilon)
&\ge \limsup_{n\to\infty} \Bigl( \frac{1}{n^2} \log (\gamma \otimes \Lambda_+)\bigl(\Xi_n(r, \varepsilon_1)\bigr) + N \log n \Bigr) \\
&= \limsup_{n\to\infty} \Bigl( \frac{1}{n^2} \log \gamma\bigl(\Gamma_u(u_1,\dots,u_N; n, r, \varepsilon_1)\bigr) + \sum_{i=1}^N \Bigl( \frac{1}{n^2} \log \Lambda_{+,n}\bigl(\Gamma_{+,R}(h_i^2; n, r, \varepsilon_1)\bigr) + \log n \Bigr) \Bigr) \\
&= \chi_u(u_1,\dots,u_N; r, \varepsilon_1) + \sum_{i=1}^N \chi_{+,R}(h_i^2; r, \varepsilon_1).
\end{align*}
Above we used the fact that lim sup becomes limit in (2.2). Thus (3.2) is shown. The second part is clear from Remark 2.4 (2) and Proposition 1.2.


When the roles of $u_1,\dots,u_N$ and $h_1,\dots,h_N$ are exchanged in Theorem 3.1, we have

Theorem 3.3. Let $u_1,\dots,u_N \in \mathcal{M}$ be unitaries and $h_1,\dots,h_N \in \mathcal{M}^+$. If $\{u_1, u_1^*\},\dots,\{u_N, u_N^*\}, \{h_1,\dots,h_N\}$ are free, then
\[ \hat\chi(u_1h_1,\dots,u_Nh_N) = \sum_{i=1}^N \chi_u(u_i) + \chi(h_1^2,\dots,h_N^2) + \frac{N}{2} \Bigl( \log\frac{\pi}{2} + \frac{3}{2} \Bigr). \]
If $u_1,\dots,u_N$ are Haar unitaries in addition, then
\[ \hat\chi(u_1h_1,\dots,u_Nh_N) = \chi(h_1^2,\dots,h_N^2) + \frac{N}{2} \Bigl( \log\frac{\pi}{2} + \frac{3}{2} \Bigr). \]

Proof. By Theorem 2.5 and Proposition 2.3 we may show that
\[ \chi_{(u,+)}(u_1,\dots,u_N; h_1^2,\dots,h_N^2) \ge \sum_{i=1}^N \chi_u(u_i) + \chi_+(h_1^2,\dots,h_N^2), \tag{3.3} \]
and we may assume $\chi_u(u_i) > -\infty$ and $\chi_+(h_1^2,\dots,h_N^2) > -\infty$. For $n, r \in \mathbb{N}$, $\varepsilon > 0$ and $R > 0$ set
\[ \Xi_n(r, \varepsilon) := \prod_{i=1}^N \Gamma_u(u_i; n, r, \varepsilon) \times \Gamma_{+,R}(h_1^2,\dots,h_N^2; n, r, \varepsilon), \]
and $\Theta_n(r, \varepsilon)$ is the same as in Lemma 3.2. By the freeness assumption there is $\varepsilon_1 > 0$ such that if $(U_1,\dots,U_N; H_1,\dots,H_N) \in \Xi_n(r, \varepsilon_1)$ and $\{U_1, U_1^*\},\dots,\{U_N, U_N^*\}, \{H_1,\dots,H_N\}$ are $(r, \varepsilon_1)$-free, then $(U_1,\dots,U_N; H_1,\dots,H_N) \in \Theta_n(r, \varepsilon)$. For every $\theta > 0$ by [17, Cor. 2.13] there exists $n_0 \in \mathbb{N}$ such that
\[ \gamma\bigl(\{(V_1,\dots,V_N) \in (\mathcal{U}(n))^N : \{V_1U_1V_1^*, V_1U_1^*V_1^*\},\dots,\{V_NU_NV_N^*, V_NU_N^*V_N^*\}, \{H_1,\dots,H_N\} \text{ are } (r, \varepsilon_1)\text{-free}\}\bigr) \ge 1 - \theta \]
whenever $n \ge n_0$ independently of the choice of any $U_i \in \mathcal{U}(n)$ and $H_i \in M_n^+$ with $\|H_i\| \le R$ ($1 \le i \le N$). Then as in the proof of Lemma 3.2 we have
\[ \frac{(\gamma \otimes \Lambda_+)\bigl(\Xi_n(r, \varepsilon_1) \cap \Theta_n(r, \varepsilon)\bigr)}{(\gamma \otimes \Lambda_+)\bigl(\Xi_n(r, \varepsilon_1)\bigr)} \ge 1 - \theta \]
for large $n$. Therefore,
\[ \lim_{n\to\infty} \frac{(\gamma \otimes \Lambda_+)\bigl(\Xi_n(r, \varepsilon_1) \cap \Theta_n(r, \varepsilon)\bigr)}{(\gamma \otimes \Lambda_+)\bigl(\Xi_n(r, \varepsilon_1)\bigr)} = 1. \]


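This implies that
\begin{align*}
\chi_{(u,+),R}(u_1,\dots,u_N; h_1^2,\dots,h_N^2; r, \varepsilon)
&\ge \limsup_{n\to\infty} \Bigl( \frac{1}{n^2} \log (\gamma \otimes \Lambda_+)\bigl(\Xi_n(r, \varepsilon_1)\bigr) + N \log n \Bigr) \\
&= \limsup_{n\to\infty} \Bigl( \sum_{i=1}^N \frac{1}{n^2} \log \gamma_n\bigl(\Gamma_u(u_i; n, r, \varepsilon_1)\bigr) + \frac{1}{n^2} \log \Lambda_+\bigl(\Gamma_{+,R}(h_1^2,\dots,h_N^2; n, r, \varepsilon_1)\bigr) + N \log n \Bigr) \\
&= \sum_{i=1}^N \chi_u(u_i; r, \varepsilon_1) + \chi_{+,R}(h_1^2,\dots,h_N^2; r, \varepsilon_1)
\end{align*}
thanks to (1.1), so (3.3) is obtained.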

Next, we apply the relation shown above to get the additivity properties of the free entropies $\chi_u$ and $\hat\chi$. We first give the change of variable formula similar to [15, Prop. 3.1] for $\chi_{(u,+)}$. To do so, we need a smoothing technique like [15, Lemma 4.1].

We denote by $\mathcal{F}_{\mathbb{T}}$ the set of all functions $f : \mathbb{T} \to \mathbb{T}$ which are given as $f(e^{\mathrm{i}t}) = e^{\mathrm{i}\varphi(t)}$ by a continuous increasing function $\varphi$ on $[0, 2\pi]$ with $\varphi(0) = 0$, $\varphi(2\pi) = 2\pi$. An $f \in \mathcal{F}_{\mathbb{T}}$ is said to be $C^\infty$ if $\varphi$ is also. Note that if $\varphi$ is differentiable at $t \in [0, 2\pi]$, then
\[ \lim_{\eta\to\zeta} \Bigl| \frac{f(\eta) - f(\zeta)}{\eta - \zeta} \Bigr| = \varphi'(t) \quad \text{for } \zeta = e^{\mathrm{i}t}. \]
In this case we write $|f'(e^{\mathrm{i}t})|$ instead of $\varphi'(t)$. For each unitary $u \in \mathcal{M}$ and $f \in \mathcal{F}_{\mathbb{T}}$ one can define the unitary $f(u)$ by functional calculus, that is, $f(u) := \int_{\mathbb{T}} f(\zeta) \, de(\zeta)$ for the spectral decomposition $u = \int_{\mathbb{T}} \zeta \, de(\zeta)$.

Lemma 3.4. Let $u \in \mathcal{M}$ be a unitary with $\chi_u(u) > -\infty$, and let $f \in \mathcal{F}_{\mathbb{T}}$. Then there exists a sequence $(f_m)$ of $C^\infty$-functions in $\mathcal{F}_{\mathbb{T}}$ such that $|f_m'| > 0$ on $\mathbb{T}$, $\|f_m(u) - f(u)\| \to 0$ and $\chi_u(f_m(u)) \to \chi_u(f(u))$.

On the other hand, we denote by $\mathcal{F}_{\mathbb{R}^+}$ the set of all continuous increasing functions $g : \mathbb{R}^+ \to \mathbb{R}^+$ with $g(0) = 0$.

Lemma 3.5. Let $h \in \mathcal{M}^+$, $\chi(h) > -\infty$, and $g \in \mathcal{F}_{\mathbb{R}^+}$. Then there exists a sequence $(g_m)$ of $C^\infty$-functions in $\mathcal{F}_{\mathbb{R}^+}$ such that $g_m' > 0$ on $\mathbb{R}^+$, $\|g_m(h) - g(h)\| \to 0$ and $\chi(g_m(h)) \to \chi(g(h))$.

Lemma 3.5 is essentially included in [15, Lemma 4.1]. The proof of Lemma 3.4 is similar with some modifications, and it may be omitted here.

Lemma 3.6. Let $u_1,\dots,u_N \in \mathcal{M}$ be unitaries with $\chi_u(u_i) > -\infty$ and $h_1,\dots,h_L \in \mathcal{M}^+$ with $\chi_+(h_j) > -\infty$. Then
\[ \chi_{(u,+)}(f_1(u_1),\dots,f_N(u_N); g_1(h_1),\dots,g_L(h_L)) \ge \chi_{(u,+)}(u_1,\dots,u_N; h_1,\dots,h_L) + \sum_{i=1}^N \bigl( \chi_u(f_i(u_i)) - \chi_u(u_i) \bigr) + \sum_{j=1}^L \bigl( \chi(g_j(h_j)) - \chi(h_j) \bigr) \]
for every $f_1,\dots,f_N \in \mathcal{F}_{\mathbb{T}}$ and $g_1,\dots,g_L \in \mathcal{F}_{\mathbb{R}^+}$.


Proof. By Lemmas 3.4 and 3.5 together with Proposition 2.2 we may show the following two cases:

(a) If $f$ is a $C^\infty$-function in $\mathcal{F}_{\mathbb{T}}$ with $|f'| > 0$ on $\mathbb{T}$, then
\[ \chi_{(u,+)}(f(u_1), u_2,\dots,u_N; h_1,\dots,h_L) \ge \chi_{(u,+)}(u_1,\dots,u_N; h_1,\dots,h_L) + \chi_u(f(u_1)) - \chi_u(u_1). \]

(b) If $g$ is a $C^\infty$-function in $\mathcal{F}_{\mathbb{R}^+}$ with $g' > 0$ on $\mathbb{R}^+$, then
\[ \chi_{(u,+)}(u_1,\dots,u_N; g(h_1), h_2,\dots,h_L) \ge \chi_{(u,+)}(u_1,\dots,u_N; h_1,\dots,h_L) + \chi(g(h_1)) - \chi(h_1). \]

The proof of (b) is the same as [15, Prop. 3.1]. We sketch the similar proof of (a). For $\zeta, \eta \in \mathbb{T}$ define
\[ K(\zeta, \eta) := \begin{cases} \Bigl| \dfrac{f(\zeta) - f(\eta)}{\zeta - \eta} \Bigr| & \text{if } \zeta \ne \eta, \\ |f'(\zeta)| & \text{if } \zeta = \eta, \end{cases} \]
then $L(\zeta, \eta) := \log K(\zeta, \eta)$ is continuous on $\mathbb{T}^2$ and
\[ \chi_u(f(u_1)) - \chi_u(u_1) = (\tau \otimes \tau)\bigl(L(u_1 \otimes 1, 1 \otimes u_1)\bigr). \]
Write $F(U_1,\dots,U_N; H_1,\dots,H_L) := (f(U_1), U_2,\dots,U_N; H_1,\dots,H_L)$ on $(\mathcal{U}(n))^N \times (M_n^+)^L$. For every $r \in \mathbb{N}$ and $\varepsilon > 0$, by approximating $f$ by a trigonometric polynomial, we notice that
\[ F\bigl(\Gamma_{(u,+),R}(u_1,\dots,u_N; h_1,\dots,h_L; n, r_1, \varepsilon_1)\bigr) \subset \Gamma_{(u,+),R}(f(u_1), u_2,\dots,u_N; h_1,\dots,h_L; n, r, \varepsilon) \quad (n \in \mathbb{N}) \]
for some $r_1 \in \mathbb{N}$ and $\varepsilon_1 > 0$. Since
\[ \frac{d(\gamma_n \circ f)}{d\gamma_n}(U_1) = \prod_{i<j} \Bigl| \frac{f(\zeta_i) - f(\zeta_j)}{\zeta_i - \zeta_j} \Bigr|^2 \prod_{i=1}^n |f'(\zeta_i)| = \exp\bigl( (\mathrm{Tr}_n \otimes \mathrm{Tr}_n)(L(U_1 \otimes I, I \otimes U_1)) \bigr) \]
($\zeta_1,\dots,\zeta_n$ are the eigenvalues of $U_1$), we can show as in the proof of [15, Prop. 3.1] that for any $\delta > 0$ there are $r_1 \in \mathbb{N}$ and $\varepsilon_1 > 0$ such that
\[ \Bigl| \frac{1}{n^2} \log \frac{d(\gamma_n \circ f)}{d\gamma_n}(U_1) - \bigl( \chi_u(f(u_1)) - \chi_u(u_1) \bigr) \Bigr| \le 3\delta \]
for all $(U_1,\dots,U_N; H_1,\dots,H_L) \in \Gamma_{(u,+),R}(u_1,\dots,u_N; h_1,\dots,h_L; n, r_1, \varepsilon_1)$, $n \in \mathbb{N}$, and the inequality in (a) is obtained.

If $f_1,\dots,f_N \in \mathcal{F}_{\mathbb{T}}$ and $g_1,\dots,g_L \in \mathcal{F}_{\mathbb{R}^+}$ are strictly increasing (in terms of angle for $f_i$), then the inequality in Lemma 3.6 can be replaced by the equality.

Proposition 3.7. If $u_1,\dots,u_N \in \mathcal{M}$ are unitaries, then $\chi_u(u_1,\dots,u_N) = 0$ if and only if $u_1,\dots,u_N$ are *-free Haar unitaries.


Proof. Choose free standard quarter-circular elements $h_1,\dots,h_N$ which are free from $\{u_1,\dots,u_N, u_1^*,\dots,u_N^*\}$. Theorem 3.1 says that $\chi_u(u_1,\dots,u_N) = 0$ if and only if $\hat\chi(u_1h_1,\dots,u_Nh_N) = N\log(\pi e)$. According to Proposition 1.3 the latter equality holds if and only if $u_1h_1,\dots,u_Nh_N$ are *-free circular elements, which is equivalent to saying that $u_1,\dots,u_N$ are *-free Haar unitaries.

Theorem 3.8. Let $u_1,\dots,u_N \in \mathcal{M}$ be unitaries. If $u_1,\dots,u_N$ are *-free, then
\[ \chi_u(u_1,\dots,u_N) = \chi_u(u_1) + \dots + \chi_u(u_N). \]
Conversely, if $\chi_u(u_i) > -\infty$ for $1 \le i \le N$ and the above equality holds, then $u_1,\dots,u_N$ are *-free.

Proof. When $(h_1,\dots,h_N)$ is void in the proof of (3.3), it can be read as a proof of the first part here. (This part can also be shown in a way similar to the selfadjoint case in [13].) Now we prove the second part. Assume that $\chi_u(u_i) > -\infty$ for $1 \le i \le N$ and the additivity holds. For each $i$, since the distribution of $u_i$ is non-atomic, there is a (unique) $f_i \in \mathcal{F}_{\mathbb{T}}$ such that the distribution of $f_i(u_i)$ is the Haar probability measure on $\mathbb{T}$, so $\chi_u(f_i(u_i)) = 0$. Then, by Lemma 3.6 (in case of $L = 0$) and the additivity assumption, we get
\[ \chi_u(f_1(u_1),\dots,f_N(u_N)) \ge \sum_{i=1}^N \chi_u(f_i(u_i)) = 0. \]
So Proposition 3.7 implies that $f_1(u_1),\dots,f_N(u_N)$ are *-free, and hence $u_1,\dots,u_N$ are *-free because $u_i \in \{f_i(u_i)\}''$.

Theorem 3.9. Let $a_1,\dots,a_N \in \mathcal{M}$ be such that $a_i = u_ih_i$ with a *-free pair of a unitary $u_i \in \mathcal{M}$ and $h_i \in \mathcal{M}^+$. If $a_1,\dots,a_N$ are *-free, then
\[ \hat\chi(a_1,\dots,a_N) = \hat\chi(a_1) + \dots + \hat\chi(a_N). \]
Conversely, if $\hat\chi(a_i) > -\infty$ for $1 \le i \le N$ and the above equality holds, then $a_1,\dots,a_N$ are *-free.

Proof. If $a_1,\dots,a_N$ are *-free, then $u_1,\dots,u_N, h_1,\dots,h_N$ are *-free due to the *-freeness of $u_i, h_i$. Hence Theorems 3.1 and 3.8 imply that
\[ \hat\chi(a_1,\dots,a_N) = \sum_{i=1}^N \chi_u(u_i) + \sum_{i=1}^N \chi(h_i^2) + \frac{N}{2} \Bigl( \log\frac{\pi}{2} + \frac{3}{2} \Bigr) = \sum_{i=1}^N \hat\chi(a_i). \]
Conversely, assume that $\hat\chi(a_i) > -\infty$ for $1 \le i \le N$ and the additivity holds. Since $\chi_u(u_i) > -\infty$ and $\chi(h_i^2) > -\infty$, one can choose $f_i \in \mathcal{F}_{\mathbb{T}}$ and $g_i \in \mathcal{F}_{\mathbb{R}^+}$ such that $f_i(u_i)$ is a Haar unitary and $g_i(h_i)^2$ is a standard quarter-circular. Then, letting $b_i := f_i(u_i)g_i(h_i)$ and using Theorem 2.5, Lemma 3.6 (applied to $f_i$, $g_i(t^{1/2})^2$) and Theorem 3.1, we get
\begin{align*}
\hat\chi(b_1,\dots,b_N) &= \chi_{(u,+)}(f_1(u_1),\dots,f_N(u_N); g_1(h_1)^2,\dots,g_N(h_N)^2) \\
&\ge \chi_{(u,+)}(u_1,\dots,u_N; h_1^2,\dots,h_N^2) + \sum_{i=1}^N \bigl( \chi_u(f_i(u_i)) - \chi_u(u_i) \bigr) + \sum_{i=1}^N \bigl( \chi(g_i(h_i)^2) - \chi(h_i^2) \bigr) \\
&= \hat\chi(a_1,\dots,a_N) - \sum_{i=1}^N \hat\chi(a_i) + N\log(\pi e) = N\log(\pi e).
\end{align*}


So Proposition 1.3 implies that $b_1,\dots,b_N$ are *-free standard circulars. Hence $a_1,\dots,a_N$ are *-free because of $a_i \in \{b_i, b_i^*\}''$.

In [9] the notion of R-diagonal pairs was introduced in connection with the two-variable R-transform. Instead of giving its definition here, we remark the following characterization shown in [9]: If $a \in \mathcal{M}$ and $\ker a = \{0\}$, then $a$ is an R-diagonal element (i.e. $(a, a^*)$ is an R-diagonal pair) if and only if $a$ is written as $uh$ by a *-free pair of a Haar unitary $u \in \mathcal{M}$ and $h \in \mathcal{M}^+$. It was also shown in [9] that an R-diagonal element $a$ is circular if and only if the real and imaginary parts of $a$ are free. Theorem 3.9 can be applied in particular when $a_1,\dots,a_N$ are R-diagonal. Specialized to the case $\hat\chi(a)$ of a single non-selfadjoint $a \in \mathcal{M}$ we state

Proposition 3.10. Let $a \in \mathcal{M}$ with $\hat\chi(a) > -\infty$, and let $a = uh$ be the polar decomposition. Then
\[ \hat\chi(a) \le \chi_u(u) + \chi(a^*a) + \frac{1}{2} \log\frac{\pi}{2} + \frac{3}{4} \]
and the equality is attained if and only if $u, h$ are *-free. Moreover, $\hat\chi(a) = \chi(a^*a) + \frac{1}{2} \log\frac{\pi}{2} + \frac{3}{4}$ if and only if $a$ is R-diagonal.

Proof. Theorem 3.1 includes the "if" part of the first assertion. To see the "only if", choose $f \in \mathcal{F}_{\mathbb{T}}$ and $g \in \mathcal{F}_{\mathbb{R}^+}$ such that $f(u)$ is a Haar unitary and $g(h)^2$ is a standard quarter-circular. Then the equality $\hat\chi(uh) = \chi_u(u) + \chi_+(h^2)$ implies $\hat\chi(f(u)g(h)) = \log(\pi e)$ as in the proof of Theorem 3.9, and this means that $f(u)g(h)$ is a standard circular and so $u, h$ are *-free. The second assertion is immediate from the first.

The above proposition shows

Corollary 3.11. Let $\mu$ be a probability measure on $\mathbb{R}^+$ with compact support and $\Sigma(\mu) > -\infty$. When $a \in \mathcal{M}$ is such that $a^*a$ has the distribution $\mu$,
\[ \hat\chi(a) \le \Sigma(\mu) + \log\pi + \frac{3}{2}, \]
and the equality is attained if and only if $a$ is R-diagonal.

Example 3.12. For each $\lambda \ge 1$ the free analogue of the Poisson distribution (see [18]) is given by
\[ \mu_\lambda := \frac{\sqrt{4\lambda - (t - 1 - \lambda)^2}}{2\pi t} \, \chi(t) \, dt, \]
where $\chi(t)$ is the characteristic function of the interval $\bigl( (\sqrt{\lambda} - 1)^2, (\sqrt{\lambda} + 1)^2 \bigr)$. This measure is also called the Marchenko-Pastur distribution. From a computation in [5] (also [6]) we have
\[ \Sigma(\mu_\lambda) = -1 + \frac{1}{2} \bigl( \lambda + \log\lambda + (\lambda - 1)^2 \log(1 - \lambda^{-1}) \bigr). \]
If $a$ is an R-diagonal element such that $a^*a$ has the distribution $\mu_\lambda$, then Corollary 3.11 gives
\[ \hat\chi(a) = \log\pi + \frac{1}{2} \bigl( 1 + \lambda + \log\lambda + (\lambda - 1)^2 \log(1 - \lambda^{-1}) \bigr). \]
The case $\lambda = 1$ is a circular element of radius 2.
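The two formulas of Example 3.12 can be evaluated directly. The sketch below uses illustrative function names, and it assumes that the term $(\lambda - 1)^2 \log(1 - \lambda^{-1})$ is read as its limit $0$ at $\lambda = 1$. It checks that combining $\Sigma(\mu_\lambda)$ with Corollary 3.11 reproduces the stated closed form, and that $\lambda = 1$ gives $\log(\pi e)$.

```python
import math

def sigma_mp(lam):
    """Sigma(mu_lambda) = -1 + (1/2)(lam + log lam + (lam-1)^2 log(1 - 1/lam)), lam >= 1."""
    if lam < 1:
        raise ValueError("the formula is stated for lambda >= 1")
    # (lam - 1)^2 * log(1 - 1/lam) tends to 0 as lam -> 1 (assumed convention at lam = 1)
    tail = 0.0 if lam == 1 else (lam - 1) ** 2 * math.log(1 - 1 / lam)
    return -1 + 0.5 * (lam + math.log(lam) + tail)

def chi_hat_r_diagonal(lam):
    """Equality case of Corollary 3.11: Sigma(mu_lambda) + log(pi) + 3/2."""
    return sigma_mp(lam) + math.log(math.pi) + 1.5

def chi_hat_closed_form(lam):
    """Closed form stated in Example 3.12."""
    tail = 0.0 if lam == 1 else (lam - 1) ** 2 * math.log(1 - 1 / lam)
    return math.log(math.pi) + 0.5 * (1 + lam + math.log(lam) + tail)

for lam in (1.0, 2.0, 5.0):
    print(lam, chi_hat_r_diagonal(lam), chi_hat_closed_form(lam))
```

At $\lambda = 1$ both expressions reduce to $\log\pi + 1 = \log(\pi e)$, in agreement with the circular case mentioned in the example.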


4. Maximization of Free Entropy for a Matrix of Random Variables

A maximization result similar to Corollary 3.11 was recently shown in [8] for the version $\chi^*$ of free entropy introduced in [16]. Moreover, this maximization result for $\chi^*$ was extended to the case of a matrix $[a_{ij}]_{i,j=1}^d$ of random variables. In this section we consider the $\chi$-version of the maximization problem from [8].

For each $d \in \mathbb{N}$ we have a tracial $W^*$-probability space $(M_d(\mathcal{M}) \equiv \mathcal{M} \otimes M_d, \tau \otimes \mathrm{tr}_d)$. Let $a_{ij}$ ($1 \le i, j \le d$) be a family of (non-selfadjoint) elements of $\mathcal{M}$, and set $b := [a_{ij}]_{i,j=1}^d \in M_d(\mathcal{M})$. We have the free entropy $\hat\chi((a_{ij})_{1 \le i,j \le d})$ of the $d^2$-tuple of $a_{ij}$. On the other hand, following [14], one can define the (conditional) free entropy of $b$ in the presence of $M_d(\mathbb{C}1)$ ($\equiv \mathbb{C}1 \otimes M_d \subset M_d(\mathcal{M})$). Let $(e_{ij})_{1 \le i,j \le d}$ be the usual matrix units of $M_d(\mathbb{C}1)$. For $n, r \in \mathbb{N}$, $\varepsilon > 0$ and $R > 0$ define
\[ \hat\Gamma_R(b, (e_{ii})_{1 \le i \le d}, (e_{ij})_{1 \le i<j \le d}; n, r, \varepsilon) := \bigl\{ (B, (B_{ii})_{1 \le i \le d}, (B_{ij})_{1 \le i<j \le d}) \in M_n \times (M_n^{sa})^d \times (M_n)^{d(d-1)/2} : \|B\|, \|B_{ij}\| \le R \ (1 \le i \le j \le d),\ |\mathrm{tr}_n(Y_{i_1} \cdots Y_{i_k}) - (\tau \otimes \mathrm{tr}_d)(y_{i_1} \cdots y_{i_k})| \le \varepsilon \text{ for all } 0 \le i_1,\dots,i_k \le d^2,\ 1 \le k \le r \bigr\}, \]
where
\[ (y_0, y_1,\dots,y_{d^2}) := (b, (e_{ii})_{1 \le i \le d}, (e_{ij})_{1 \le i<j \le d}, (e_{ji})_{1 \le i<j \le d}), \]
\[ (Y_0, Y_1,\dots,Y_{d^2}) := (B, (B_{ii})_{1 \le i \le d}, (B_{ij})_{1 \le i<j \le d}, (B_{ij}^*)_{1 \le i<j \le d}), \]
and further define
\[ \hat\Gamma_R(b : (e_{ii})_{1 \le i \le d}, (e_{ij})_{1 \le i<j \le d}; n, r, \varepsilon) := \bigl\{ B \in M_n : (B, (B_{ii})_{1 \le i \le d}, (B_{ij})_{1 \le i<j \le d}) \in \hat\Gamma_R(b, (e_{ii}), (e_{ij})_{i<j}; n, r, \varepsilon) \text{ for some } ((B_{ii})_{1 \le i \le d}, (B_{ij})_{1 \le i<j \le d}) \in (M_n^{sa})^d \times (M_n)^{d(d-1)/2} \bigr\}. \]
Then $\hat\chi(b : (e_{ii})_{1 \le i \le d}, (e_{ij})_{1 \le i<j \le d})$ is defined as before, which we denote by $\hat\chi(b : M_d(\mathbb{C}1))$. In fact, choose any family $\{c_1,\dots,c_l\}$ in $M_d(\mathbb{C}1)^{sa}$ which generates $M_d(\mathbb{C}1)$ as a *-algebra, and define $\hat\chi(b : c_1,\dots,c_l)$ in the same way by taking $(M_n^{sa})^l$ in place of $(M_n^{sa})^d \times (M_n)^{d(d-1)/2}$ in the above. Then it follows (see [14]) that $\hat\chi(b : M_d(\mathbb{C}1)) = \hat\chi(b : c_1,\dots,c_l)$.

Proposition 4.1. With the above notations,
\[ \hat\chi((a_{ij})_{1 \le i,j \le d}) \le d^2 \hat\chi(b : M_d(\mathbb{C}1)) - d^2 \log d \le d^2 \hat\chi(b) - d^2 \log d. \]
Furthermore, if $\{b, b^*\}$ is free from $M_d(\mathbb{C}1)$ then
\[ \hat\chi((a_{ij})_{1 \le i,j \le d}) = d^2 \hat\chi(b) - d^2 \log d. \tag{4.1} \]

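Proof. First, the second inequality is trivial. Define a linear map $\Phi : (M_n)^{d^2} \to M_{nd}$ by $\Phi((A_{ij})_{1 \le i,j \le d}) := [A_{ij}]_{i,j=1}^d$. Since $\|[A_{ij}]_{i,j=1}^d\|_{HS} = \bigl( \sum_{i,j=1}^d \|A_{ij}\|_{HS}^2 \bigr)^{1/2}$, it is obvious that
\[ \hat\Lambda_n^{\otimes d^2} = \hat\Lambda_{nd} \circ \Phi. \tag{4.2} \]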
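The Hilbert-Schmidt identity behind (4.2) says that assembling $d^2$ blocks into one $nd \times nd$ matrix preserves the Euclidean norm. A small numerical sketch with random complex blocks follows; the helper names `assemble` and `hs_norm_sq` are illustrative, and the unnormalized Hilbert-Schmidt norm is assumed.

```python
import random

def hs_norm_sq(mat):
    """Squared (unnormalized) Hilbert-Schmidt norm: sum of |entry|^2."""
    return sum(abs(x) ** 2 for row in mat for x in row)

def assemble(blocks):
    """Build the (n*d) x (n*d) matrix [A_ij] from a d x d grid of n x n blocks."""
    d = len(blocks)
    n = len(blocks[0][0])
    return [
        [blocks[i][j][k][l] for j in range(d) for l in range(n)]
        for i in range(d) for k in range(n)
    ]

random.seed(0)
n, d = 3, 2

def random_block():
    return [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]
            for _ in range(n)]

blocks = [[random_block() for _ in range(d)] for _ in range(d)]
big = assemble(blocks)

block_total = sum(hs_norm_sq(blocks[i][j]) for i in range(d) for j in range(d))
print(hs_norm_sq(big), block_total)
```

Since the assembling map only rearranges entries, it is a linear isometry between Euclidean spaces of equal dimension, which is the property the text uses to transport the Lebesgue-type measure.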


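Let $(E_{ij})_{1 \le i,j \le d}$ be the usual matrix units of $M_d = I_n \otimes M_d \subset M_{nd}$. Now let $(A_{ij})_{1 \le i,j \le d} \in \hat\Gamma_R((a_{ij})_{1 \le i,j \le d}; n, r, \varepsilon)$, and set $B := \Phi((A_{ij})_{1 \le i,j \le d})$ and
\[ (x_1,\dots,x_{d^2}) := (a_{ij})_{1 \le i,j \le d}, \qquad (X_1,\dots,X_{d^2}) := (A_{ij})_{1 \le i,j \le d}, \]
\[ (y_0, y_1,\dots,y_{d^2}) := (b, (e_{ii})_{1 \le i \le d}, (e_{ij})_{1 \le i<j \le d}, (e_{ji})_{1 \le i<j \le d}), \]
\[ (Y_0, Y_1,\dots,Y_{d^2}) := (B, (E_{ii})_{1 \le i \le d}, (E_{ij})_{1 \le i<j \le d}, (E_{ji})_{1 \le i<j \le d}). \]
For any $0 \le i_1,\dots,i_k \le d^2$ ($1 \le k \le r$), one can write
\[ \mathrm{tr}_{nd}(Y_{i_1} \cdots Y_{i_k}) = \frac{1}{d} \sum \mathrm{tr}_n(X_{j_1} \cdots X_{j_l}), \tag{4.3} \]
\[ (\tau \otimes \mathrm{tr}_d)(y_{i_1} \cdots y_{i_k}) = \frac{1}{d} \sum \tau(x_{j_1} \cdots x_{j_l}), \tag{4.4} \]
where the summations in (4.3) and (4.4) are in the same pattern, the number of terms in the sum is at most $d^k$ and $1 \le j_1,\dots,j_l \le d^2$, $1 \le l \le k$. Hence we obtain
\[ |\mathrm{tr}_{nd}(Y_{i_1} \cdots Y_{i_k}) - (\tau \otimes \mathrm{tr}_d)(y_{i_1} \cdots y_{i_k})| \le \frac{1}{d} \cdot d^k \cdot \varepsilon \le d^{r-1} \varepsilon. \]
Therefore, noting $\|B\| \le dR$, we infer that
\[ \Phi\bigl(\hat\Gamma_R((a_{ij})_{1 \le i,j \le d}; n, r, \varepsilon)\bigr) \subset \hat\Gamma_{dR}(b : (e_{ii})_{1 \le i \le d}, (e_{ij})_{1 \le i<j \le d}; nd, r, d^{r-1}\varepsilon). \]
Thanks to (4.2) this yields
\[ \hat\Lambda_n^{\otimes d^2}\bigl(\hat\Gamma_R((a_{ij})_{1 \le i,j \le d}; n, r, \varepsilon)\bigr) \le \hat\Lambda_{nd}\bigl(\hat\Gamma_{dR}(b : (e_{ii})_{1 \le i \le d}, (e_{ij})_{1 \le i<j \le d}; nd, r, d^{r-1}\varepsilon)\bigr) \]
and hence
\[ \frac{1}{n^2} \log \hat\Lambda_n^{\otimes d^2}\bigl(\hat\Gamma_R((a_{ij})_{1 \le i,j \le d}; n, r, \varepsilon)\bigr) + d^2 \log n \le d^2 \Bigl( \frac{1}{n^2d^2} \log \hat\Lambda_{nd}\bigl(\hat\Gamma_{dR}(b : (e_{ii})_{1 \le i \le d}, (e_{ij})_{1 \le i<j \le d}; nd, r, d^{r-1}\varepsilon)\bigr) + \log(nd) \Bigr) - d^2 \log d. \]
So we have
\[ \hat\chi_R((a_{ij})_{1 \le i,j \le d}; r, \varepsilon) \le d^2 \hat\chi_{dR}(b : (e_{ii})_{1 \le i \le d}, (e_{ij})_{1 \le i<j \le d}; r, d^{r-1}\varepsilon) - d^2 \log d. \]
Take the limits as $r \to \infty$ and $d^{r-1}\varepsilon \to 0$ to obtain
\[ \hat\chi_R((a_{ij})_{1 \le i,j \le d}) \le d^2 \hat\chi_{dR}(b : M_d(\mathbb{C}1)) - d^2 \log d, \]
so we have $\hat\chi((a_{ij})_{1 \le i,j \le d}) \le d^2 \hat\chi(b : M_d(\mathbb{C}1)) - d^2 \log d$.

Next, assume that $\{b, b^*\}$ is free from $M_d(\mathbb{C}1)$. Let $r \in \mathbb{N}$, $\varepsilon > 0$ and $R \ge 1$ be given, and let $r_1 := 2r + 1$. For each $m \in \mathbb{N}$ let $n$ be the integer part of $m/d$, and let $(E_{ij})_{1 \le i,j \le d}$ be the usual matrix units of $M_d$, where $M_d$ is embedded in $M_m$ as $M_d = (I_n \otimes M_d) \oplus 0_{m-nd} \subset M_m$ so that the rank of the $E_{ii}$'s is $n$. It is clear that the joint distribution of $(E_{ij})_{1 \le i,j \le d}$ in $M_m$ converges to that of $(e_{ij})_{1 \le i,j \le d}$ in $M_d(\mathcal{M})$ as $m \to \infty$. Hence, thanks to the freeness of $\{b, b^*\}$ and $\{e_{ij}\}_{1 \le i,j \le d}$, one can choose $\varepsilon_1 > 0$ and $m_0 \in \mathbb{N}$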


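such that if $m \ge m_0$, $B \in \hat\Gamma_R(b; m, r_1, \varepsilon_1)$ and $\{B\}, \{E_{ij}\}_{1 \le i,j \le d}$ are $(r_1, \varepsilon_1)$-free, then $(B, (E_{ii})_{1 \le i \le d}, (E_{ij})_{1 \le i<j \le d}) \in \hat\Gamma_R(b, (e_{ii})_{1 \le i \le d}, (e_{ij})_{1 \le i<j \le d}; m, r_1, \varepsilon)$. Set
\[ \Xi_m(r_1, \varepsilon_1) := \hat\Gamma_R(b; m, r_1, \varepsilon_1), \]
\[ \Theta_m(r_1, \varepsilon) := \bigl\{ B \in M_m : (B, (E_{ii})_{1 \le i \le d}, (E_{ij})_{1 \le i<j \le d}) \in \hat\Gamma_R(b, (e_{ii})_{1 \le i \le d}, (e_{ij})_{1 \le i<j \le d}; m, r_1, \varepsilon) \bigr\}. \]
Then, as in the proofs of Lemma 3.2 and Theorem 3.3, one can show that
\[ \lim_{m\to\infty} \frac{\hat\Lambda_m\bigl(\Xi_m(r_1, \varepsilon_1) \cap \Theta_m(r_1, \varepsilon)\bigr)}{\hat\Lambda_m\bigl(\Xi_m(r_1, \varepsilon_1)\bigr)} = 1, \]
and this implies that
\[ \hat\chi_R(b; r_1, \varepsilon_1) \le \limsup_{m\to\infty} \Bigl( \frac{1}{m^2} \log \hat\Lambda_m\bigl(\Theta_m(r_1, \varepsilon)\bigr) + \log m \Bigr). \tag{4.5} \]
Now define a linear map $\Psi : M_m \to (M_n)^{d^2} \times \mathbb{C}^q$ ($q := m^2 - n^2d^2$) by
\[ \Psi(B) := \bigl( (B_{ij})_{1 \le i,j \le d}, (b_{ij})_{(i,j) \in R_m} \bigr) \quad \text{for } B = [b_{ij}]_{i,j=1}^m, \]
where
\[ B_{ij} := [b_{kl}]_{(i-1)d+1 \le k \le id,\ (j-1)d+1 \le l \le jd}, \qquad R_m := \{1,\dots,m\}^2 \setminus \{1,\dots,nd\}^2. \]
Since $\Psi$ is an isometry with respect to the Euclidean norm, we have
\[ \hat\Lambda_m = \bigl( \hat\Lambda_n^{\otimes d^2} \otimes \lambda_q \bigr) \circ \Psi, \tag{4.6} \]
where $\lambda_q$ is the usual Lebesgue measure on $\mathbb{C}^q$. Let $1 \le i_1,\dots,i_k, j_1,\dots,j_k \le d$ with $1 \le k \le r$. One can write
\[ \mathrm{tr}_n(B_{i_1j_1} B_{i_2j_2} \cdots B_{i_kj_k}) = \frac{m}{n} \, \mathrm{tr}_m(E_{1i_1} B E_{j_1i_2} B E_{j_2i_3} \cdots E_{j_{k-1}i_k} B E_{j_k1}), \]
\[ \tau(a_{i_1j_1} a_{i_2j_2} \cdots a_{i_kj_k}) = d \, (\tau \otimes \mathrm{tr}_d)(e_{1i_1} b e_{j_1i_2} b e_{j_2i_3} \cdots e_{j_{k-1}i_k} b e_{j_k1}). \]
For every $B \in \Theta_m(r_1, \varepsilon)$, since
\[ |\mathrm{tr}_m(E_{1i_1} B E_{j_1i_2} \cdots B E_{j_k1}) - (\tau \otimes \mathrm{tr}_d)(e_{1i_1} b e_{j_1i_2} \cdots b e_{j_k1})| \le \varepsilon, \]
we have
\[ |\mathrm{tr}_n(B_{i_1j_1} \cdots B_{i_kj_k}) - \tau(a_{i_1j_1} \cdots a_{i_kj_k})| \le 2d\varepsilon \]
whenever $m$ is large enough. In this way, we infer that for large $m$,
\[ \Psi\bigl(\Theta_m(r_1, \varepsilon)\bigr) \subset \hat\Gamma_R((a_{ij})_{1 \le i,j \le d}; n, r, 2d\varepsilon) \times \{\zeta \in \mathbb{C} : |\zeta| \le R\}^q \]
so that thanks to (4.6)
\[ \hat\Lambda_m\bigl(\Theta_m(r_1, \varepsilon)\bigr) \le \hat\Lambda_n^{\otimes d^2}\bigl(\hat\Gamma_R((a_{ij})_{1 \le i,j \le d}; n, r, 2d\varepsilon)\bigr) \times (\pi R^2)^q. \]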


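Since $q < 2m(d-1)$, this and (4.5) yield

$$\hat\chi_R(b; r_1, \varepsilon_1) \le \limsup_{m\to\infty}\Bigl(\frac{1}{m^2}\log \hat\Lambda_n^{\otimes d^2}\bigl(\hat\Gamma_R((a_{ij})_{1\le i,j\le d}; n, r, 2d\varepsilon)\bigr) + \log m\Bigr) = \frac{1}{d^2}\hat\chi_R((a_{ij})_{1\le i,j\le d}; r, 2d\varepsilon) + \log d.$$

Therefore, we obtain

$$\hat\chi_R(b) \le \frac{1}{d^2}\hat\chi_R((a_{ij})_{1\le i,j\le d}) + \log d,$$

completing the proof.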
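The counting bound $q < 2m(d-1)$ used in the last step can be checked by brute force. The sketch below is only an illustration: the function names are hypothetical, and it assumes $d \ge 2$ (for $d = 1$ both sides vanish, so the strict inequality is not meaningful there).

```python
def leftover_entries(m: int, d: int) -> int:
    """q = m^2 - n^2 d^2, the number of entries of an m x m matrix lying
    outside the leading (nd) x (nd) block, where n is the integer part of m/d."""
    n = m // d
    return m * m - (n * d) ** 2


def bound_holds(m: int, d: int) -> bool:
    """Check the inequality q < 2m(d - 1) quoted in the proof."""
    return leftover_entries(m, d) < 2 * m * (d - 1)


# Scan a finite range of (m, d); an empty list means no counterexample was found.
violations = [(m, d) for d in range(2, 12) for m in range(1, 400)
              if not bound_holds(m, d)]
print(violations)
```

The bound follows from $q = (m - nd)(m + nd)$ with $m - nd \le d - 1$ and $m + nd \le 2m$, which is why $q/m^2 \to 0$ and the factor $(\pi R^2)^q$ does not contribute in the limit.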

In particular, $\hat\chi(b : M_d(\mathbb{C}1)) = \hat\chi(b)$ if $\{b, b^*\}$ is free from $M_d(\mathbb{C}1)$. In fact, this is a consequence of [17, Proposition 3.10]. Let $\mu$ be as in Corollary 3.11. In the situation of Proposition 4.1 we observe that if $b^* b$ has the distribution $\mu$, then

$$\hat\chi((a_{ij})_{1\le i,j\le d}) \le d^2\Bigl(\Sigma(\mu) + \log \pi + \frac{3}{2}\Bigr) - d^2 \log d, \tag{4.7}$$

and the equality is attained when $b$ is R-diagonal and $\{b, b^*\}$ is free from $M_d(\mathbb{C}1)$. An example of $(a_{ij})_{1\le i,j\le d}$ for which $b$ satisfies these conditions is easy to construct by using the free product (see [8]). It may be possible that the equality

$$\hat\chi((a_{ij})_{1\le i,j\le d}) = d^2 \hat\chi(b : M_d(\mathbb{C}1)) - d^2 \log d$$

holds in general. It would also be interesting to know whether the freeness of $\{b, b^*\}$ from $M_d(\mathbb{C}1)$ is necessary for the equality (4.1) (or at least for the equality case in (4.7)).

Let $a_{ii} \in \mathcal{M}^{sa}$ for $1 \le i \le d$ and $a_{ij} \in \mathcal{M}$ for $1 \le i < j \le d$. Set $a_{ji} := a_{ij}^*$ and $b := [a_{ij}]_{i,j=1}^d \in M_d(\mathcal{M})^{sa}$. Then one can define

$$\hat\chi((a_{ii})_{1\le i\le d}, (a_{ij})_{1\le i<j\le d}) = \chi((a_{ii})_{1\le i\le d}, (b_{ij}, c_{ij})_{1\le i<j\le d}),$$

where $a_{ij} = b_{ij} + \mathrm{i}\, c_{ij}$ with selfadjoint $b_{ij}, c_{ij}$, and also $\chi(b : M_d(\mathbb{C}1))$ as before. The proof of the following is a slight modification of that of Proposition 4.1, and we omit it.

Proposition 4.2. With the above notations,

$$\hat\chi((a_{ii})_{1\le i\le d}, (a_{ij})_{1\le i<j\le d}) \le d^2 \chi(b : M_d(\mathbb{C}1)) - \frac{d^2}{2}\log d \le d^2 \chi(b) - \frac{d^2}{2}\log d.$$

Furthermore, if $b$ is free from $M_d(\mathbb{C}1)$ then

$$\hat\chi((a_{ii})_{1\le i\le d}, (a_{ij})_{1\le i<j\le d}) = d^2 \chi(b) - \frac{d^2}{2}\log d.$$

Acknowledgement. The first-named author thanks the János Bolyai Mathematical Society for a Paul Erdős Visiting Professorship and the Department of Analysis of the Technical University of Budapest where this joint work was completed.


References

1. Ben Arous, G. and Guionnet, A.: Large deviation for Wigner's law and Voiculescu's noncommutative entropy. Prob. Theory Rel. Fields 108, 517–542 (1997)
2. Dembo, A. and Zeitouni, O.: Large Deviation Techniques and Applications. Boston–London: Jones and Bartlett, 1993
3. Hiai, F. and Petz, D.: Maximizing free entropy. Acta Math. Hungar. 80, 335–356 (1998)
4. Hiai, F. and Petz, D.: A large deviation theorem for the empirical eigenvalue distribution of random unitary matrices. Preprint No. 17/1997, Mathematical Institute HAS, Budapest
5. Hiai, F. and Petz, D.: Eigenvalue density of the Wishart matrix and large deviations. Infinite Dimensional Anal., Quant. Prob. and Related Topics 1, 633–646 (1998)
6. Hiai, F. and Petz, D.: The Semicircle Law, Free Random Variables and Entropy. In preparation
7. Mehta, M.L.: Random Matrices. Boston: Academic Press, 1991
8. Nica, A., Shlyakhtenko, D. and Speicher, R.: Some maximization problems for the free analogue of the Fisher information. To appear in Adv. Math.
9. Nica, A. and Speicher, R.: R-diagonal pairs – A common approach to Haar unitaries and circular elements. In: Free Probability Theory, D.V. Voiculescu (ed.), Fields Institute Communications, Vol. 12, Providence, RI: Am. Math. Soc., 1997, pp. 149–188
10. Petz, D. and Hiai, F.: Logarithmic energy as entropy functional. In: Advances in Differential Equations and Mathematical Physics, E. Carlen et al. (eds.), Contemp. Math. Vol. 217, Providence, RI: Am. Math. Soc., 1998, pp. 205–221
11. Voiculescu, D.: Limit laws for random matrices and free products. Invent. Math. 104, 201–220 (1991)
12. Voiculescu, D.: The analogues of entropy and of Fisher's information measure in free probability theory, I. Commun. Math. Phys. 155, 71–92 (1993)
13. Voiculescu, D.: The analogues of entropy and of Fisher's information measure in free probability theory, II. Invent. Math. 118, 411–440 (1994)
14. Voiculescu, D.: The analogues of entropy and of Fisher's information measure in free probability theory, III: The absence of Cartan subalgebras. Geom. Funct. Anal. 6, 172–199 (1996)
15. Voiculescu, D.: The analogues of entropy and of Fisher's information measure in free probability theory, IV: Maximum entropy and freeness. In: Free Probability Theory, D.V. Voiculescu (ed.), Fields Institute Communications, Vol. 12, Providence, RI: Am. Math. Soc., 1997, pp. 293–302
16. Voiculescu, D.: The analogues of entropy and of Fisher's information measure in free probability theory, V: Noncommutative Hilbert transforms. Invent. Math. 132, 189–227 (1998)
17. Voiculescu, D.: A strengthened asymptotic freeness result for random matrices with applications to free entropy. Internat. Math. Res. Notices 41–63 (1998)
18. Voiculescu, D.V., Dykema, K.J. and Nica, A.: Free Random Variables. CRM Monograph Ser., Vol. 1, Providence, RI: Am. Math. Soc., 1992
19. Nica, A., Shlyakhtenko, D. and Speicher, R.: Maximality of the microstates free entropy for R-diagonal elements. To appear in Pacific J. Math.

Communicated by H. Araki

Commun. Math. Phys. 202, 445 – 461 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Universal Construction of $W_{q,p}$ Algebras

J. Avan$^1$, L. Frappat$^2$, M. Rossi$^2$, P. Sorba$^2$

1 LPTHE, CNRS-UMR 7589, Universités Paris VI/VII, Paris, France
2 Laboratoire d'Annecy-le-Vieux de Physique Théorique LAPTH, CNRS-URA 1436 LAPP, BP 110, F-74941 Annecy-le-Vieux Cedex, France

Received: 2 October 1998 / Accepted: 5 November 1998

Abstract: We present a direct construction of the abstract generators for q-deformed $W_N$ algebras. New quantum algebraic structures of $W_{q,p}$ type are thus obtained. This procedure hinges upon a twisted trace formula for the elliptic algebra $\mathcal{A}_{q,p}(\widehat{sl}(N)_c)$ generalizing the previously known formulae for quantum groups. It represents the q-deformation of the construction of $W_N$ algebras from Lie algebras.

1. Introduction

The connection between q-deformed Virasoro (more generally q-deformed W) algebras and the elliptic quantum algebras $\mathcal{A}_{q,p}(\widehat{sl}(2)_c)$ (resp. $\mathcal{A}_{q,p}(\widehat{sl}(N)_c)$) was investigated in our recent papers [1–4]. It was shown that q-deformed Virasoro and W structures [5–7] were present inside the $\mathcal{A}_{q,p}(\widehat{sl}(N)_c)$ elliptic algebra [8, 9]: at the quantum level, once a particular relation existed between the central charge $c$, the elliptic nome $p$ and the deformation parameter $q$, namely $(-p^{1/2})^{NM} = q^{-c-N}$ for some integer $M$; and at the classical limit, obtained when setting an additional relation $p = q^{Nh}$ for some integer $h$. This classical limit was identified in particular cases with the classical q-deformed algebras constructed in [5]. It also yielded different classical deformed algebras. In this way, one obtained directly a set of quantizations of these classical q-deformed Poisson algebras, in particular from [5], interestingly distinct from the original quantization [7] obtained from explicit bosonic realizations.

The construction was achieved at the abstract level in that only the abstract algebraic relations for $\mathcal{A}_{q,p}(\widehat{sl}(N)_c)$, defined by the eight vertex model R-matrix [10], were used to derive the q-deformed structures. It was assumed throughout the derivations that the initial formal series relations [8] were in fact extended to the level of analytic relations, thereby leading from one single exchange relation for this generating operator functional of the algebras to an infinite $\mathbb{Z}$-labeled set of exchange relations for the modes, depending


upon the choice of a relevant series expansion in a crown-shaped sector for the ratio of spectral parameters in the elliptic structure function.

In our initial approach [3], the extension of the construction to $sl(N)$ was achieved by defining the abstract generators of higher spin simply as shifted ordered products of the spin-one generators $t(z) = \mathrm{Tr}\bigl[L^+(zq^{c/2})(L^-(z))^{-1}\bigr]$. In this respect, the first derivation cannot be considered as the q-deformed version of the $W_N$ algebra construction [11], which takes as generators combinations of the current algebra generators from which one then extracts $sl(N)$ scalar objects; the detailed study developed in [11] allows us to appreciate the successes and the difficulties of this approach. The construction [3] however gave rise to perfectly consistent non-trivial algebraic structures. Indeed the shift in the spectral parameters when defining the product of spin-one generators precludes any identification of the subsequent $W_{q,p}$ algebras as subsets of the enveloping algebras of any particular integer-labeled $Vir_q(sl(2))$ derived from the spin-one generators within a given choice of sector for the ratio of spectral parameters. In particular, as indicated before, the classical limit of these algebras $W_{q,p}$ did lead in given cases to the original [5] classical q-W Poisson algebra, characterizing the quantum structure as a genuine q-deformed $W_N$ algebra.

The question of a universal construction of $W_{q,p}$ algebras, which would not give any privileged role to the spin-one operators as "building blocks" of the full algebra, thus remained open. Our purpose here is to present such a construction and compute the related q-deformed algebraic structures. With this intention we shall rely upon basic algebraic structures derived from the properties of the operator $T(z) \equiv L^+(q^{c/2} z)(L^-(z))^{-1}$, which was the fundamental object in our previous derivation.
In a first section we recall the main results obtained in [3], and the main notations and definitions concerning the elliptic algebras. In a second section, we prove that $T(z)$ obeys an exchange relation of the type $R\,T\,R'\,T = R'\,T\,R\,T$. Originally derived and discussed in [13, 14], these exchange relations then lead us to define new surfaces in the $(p, q, c)$ space on which quantum, then classical, q-Virasoro algebras of the same type as in [1] arise. The classical structures are the same as in [1]. The quantum structures by contrast are more general, for $N \ge 3$, than the original algebras derived in [3], which one recovers as particular cases.

One cannot however directly derive higher order generators from such an exchange algebra, contrary to the simpler case $RLL = LLR$, where a famous twisted trace formula exists [15, 16] to generate quantum commuting Hamiltonians. But since the definition of the basic elliptic algebra involves two distinct R-matrices as $RLL = LLR^*$, one cannot apply [15, 16] to it either.

In a third, central section we show how to define a suitable twisted trace formula, involving $\mathcal{L}(z) \equiv L^+(q^{c/2} z)^t (L^-(z)^{-1})^t$ and the R-matrix of the algebra $\mathcal{A}_{q,p}(\widehat{sl}(N)_c)$. This formula now provides us with the sought-for q-deformation of the canonical procedure [11] for $W_N$ algebras. It leads to closed exchange algebras of the quantum $W_{q,p}$ type, when the generalized relation $(-p^{1/2})^n = q^{-c-N}$ is obeyed. Here $n$ is any integer, not necessarily a multiple of $N$. One then gets classical q-W Poisson algebras when $p = q^{Nh}$. The quantum and classical algebras $W_{q,p}[sl(N)]$ thus constructed contain, when $n = kN$ for integer $k$, the structures obtained in [3] by using the shifted products.

As emphasized, we now have a really universal construction of the higher q-deformed currents generating these $W_{q,p}$ algebras. In addition, we have obtained the elliptic algebra version of the twisted trace construction, bypassing thereby the difficulty generated a


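priori by the asymmetrical Yang–Baxter relation $RLL = LLR^*$ ($R^* \neq R$) defining the elliptic algebra.

2. The Elliptic Algebra $\mathcal{A}_{q,p}(\widehat{sl}(N)_c)$ and the Algebras $Vir_{q,p}(sl(N))$

We first recall the most important notations, definitions and results concerning the elliptic algebras and their subsequent q-deformed Virasoro and $W_N$ algebras.

2.1. Definition of the elliptic R-matrix. We start by defining the R-matrix of the $\mathbb{Z}_N$ vertex model ($\mathbb{Z}_N$ is the congruence ring modulo $N$) [10, 12]:

$$\widetilde R(z, q, p) = z^{2/N-2}\,\frac{1}{\kappa(z^2)}\,\frac{\vartheta\begin{bmatrix}1/2\\ 1/2\end{bmatrix}(\zeta, \tau)}{\vartheta\begin{bmatrix}1/2\\ 1/2\end{bmatrix}(\xi + \zeta, \tau)} \times \sum_{(\alpha_1,\alpha_2)\in\mathbb{Z}_N\times\mathbb{Z}_N} W_{(\alpha_1,\alpha_2)}(\xi, \zeta, \tau)\; I_{(\alpha_1,\alpha_2)} \otimes I^{-1}_{(\alpha_1,\alpha_2)}, \tag{2.1}$$

where the variables $z, q, p$ are related to the $\xi, \zeta, \tau$ variables by

$$z = e^{i\pi\xi}, \qquad q = e^{i\pi\zeta}, \qquad p = e^{2i\pi\tau}. \tag{2.2}$$

The functions $\vartheta$ are the Jacobi theta functions with rational characteristics $(\gamma_1, \gamma_2) \in \mathbb{Z}/N \times \mathbb{Z}/N$:

$$\vartheta\begin{bmatrix}\gamma_1\\ \gamma_2\end{bmatrix}(\xi, \tau) = \sum_{m\in\mathbb{Z}} \exp\bigl(i\pi(m + \gamma_1)^2\tau + 2i\pi(m + \gamma_1)(\xi + \gamma_2)\bigr). \tag{2.3}$$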
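The theta series (2.3) converges rapidly for $\mathrm{Im}\,\tau > 0$, so it can be evaluated by truncation. The sketch below is a minimal numerical illustration; the function name `theta`, the truncation length and the sample values are assumptions made here, not part of the construction. It checks two quasi-periodicity properties that follow termwise from (2.3): the shift $\xi \to \xi + 1$ multiplies the series by $e^{2i\pi\gamma_1}$, and the shift $\xi \to \xi + \tau$ multiplies it by $e^{-i\pi\tau - 2i\pi(\xi + \gamma_2)}$.

```python
import cmath


def theta(gamma1, gamma2, xi, tau, terms=40):
    """Truncated Jacobi theta series with characteristics (gamma1, gamma2):
    the sum over m of exp(i*pi*(m+gamma1)^2*tau + 2*i*pi*(m+gamma1)*(xi+gamma2)).
    Requires Im(tau) > 0 for convergence; `terms` is an illustrative cutoff."""
    total = 0j
    for m in range(-terms, terms + 1):
        s = m + gamma1
        total += cmath.exp(1j * cmath.pi * s * s * tau
                           + 2j * cmath.pi * s * (xi + gamma2))
    return total


# Sample values (hypothetical): N = 3 and characteristics 1/2 + alpha/N.
tau = 0.3 + 0.8j
xi = 0.17 + 0.05j
g1, g2 = 0.5 + 1 / 3, 0.5 + 2 / 3

# Shift xi -> xi + 1: each term picks up exp(2*i*pi*(m + gamma1)) = exp(2*i*pi*gamma1).
shift_one = theta(g1, g2, xi + 1, tau)
expected_one = cmath.exp(2j * cmath.pi * g1) * theta(g1, g2, xi, tau)

# Shift xi -> xi + tau: reindexing m -> m + 1 gives the factor below.
shift_tau = theta(g1, g2, xi + tau, tau)
expected_tau = cmath.exp(-1j * cmath.pi * tau
                         - 2j * cmath.pi * (xi + g2)) * theta(g1, g2, xi, tau)

print(abs(shift_one - expected_one), abs(shift_tau - expected_tau))
```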

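The normalization factor is given by:

$$\frac{1}{\kappa(z^2)} = \frac{(q^{2N}z^{-2}; p, q^{2N})_\infty\,(q^2 z^2; p, q^{2N})_\infty\,(p z^{-2}; p, q^{2N})_\infty\,(p q^{2N-2} z^2; p, q^{2N})_\infty}{(q^{2N}z^{2}; p, q^{2N})_\infty\,(q^2 z^{-2}; p, q^{2N})_\infty\,(p z^{2}; p, q^{2N})_\infty\,(p q^{2N-2} z^{-2}; p, q^{2N})_\infty}. \tag{2.4}$$

The functions $W_{(\alpha_1,\alpha_2)}$ are defined as follows:

$$W_{(\alpha_1,\alpha_2)}(\xi, \zeta, \tau) = \frac{\vartheta\begin{bmatrix}\frac{1}{2} + \alpha_1/N\\ \frac{1}{2} + \alpha_2/N\end{bmatrix}(\xi + \zeta/N, \tau)}{N\,\vartheta\begin{bmatrix}\frac{1}{2} + \alpha_1/N\\ \frac{1}{2} + \alpha_2/N\end{bmatrix}(\zeta/N, \tau)}, \tag{2.5}$$

and the matrices $I_{(\alpha_1,\alpha_2)}$ by:

$$I_{(\alpha_1,\alpha_2)} = g^{\alpha_2} h^{\alpha_1}, \tag{2.6}$$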

where gij = ω i δij , hij = δi+1,j are N × N matrices (the addition of indices being understood modulo N ) and ω = e2iπ/N .
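The matrices $g$ and $h$ are the usual clock and shift matrices. As a small illustration, the sketch below builds them for a given $N$ and checks $g^N = h^N = 1$ together with the commutation rule $hg = \omega\, gh$, which follows directly from the definitions above. The 0-based index range and the helper names are assumptions of this sketch.

```python
import cmath


def clock_and_shift(N):
    """Return (omega, g, h) with g_ij = omega^i delta_ij and h_ij = delta_{i+1,j},
    indices taken modulo N (0-based here, an assumption of this sketch)."""
    omega = cmath.exp(2j * cmath.pi / N)
    g = [[omega ** i if i == j else 0j for j in range(N)] for i in range(N)]
    h = [[1 + 0j if (i + 1) % N == j else 0j for j in range(N)] for i in range(N)]
    return omega, g, h


def matmul(A, B):
    """Plain matrix product for small square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(A, k):
    """k-th power of a square matrix, k >= 0."""
    n = len(A)
    result = [[1 + 0j if i == j else 0j for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, A)
    return result


def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))


N = 4
omega, g, h = clock_and_shift(N)
identity = matpow(g, 0)
hg = matmul(h, g)
omega_gh = [[omega * entry for entry in row] for row in matmul(g, h)]
print(close(hg, omega_gh), close(matpow(g, N), identity), close(matpow(h, N), identity))
```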


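The R-matrix $\widetilde R$ is $\mathbb{Z}_N$-symmetric:

$$\widetilde R^{\,a+s,\,b+s}_{\,c+s,\,d+s} = \widetilde R^{\,a,\,b}_{\,c,\,d}, \qquad a, b, c, d, s \in \mathbb{Z}_N. \tag{2.7}$$

We introduce the "gauge-transformed" matrix:

$$R(z, q, p) = (g^{1/2} \otimes g^{1/2})\,\widetilde R(z, q, p)\,(g^{-1/2} \otimes g^{-1/2}). \tag{2.8}$$

It satisfies the following properties:

– Yang–Baxter equation:

$$R_{12}(z)\, R_{13}(w)\, R_{23}(w/z) = R_{23}(w/z)\, R_{13}(w)\, R_{12}(z), \tag{2.9}$$

– unitarity:

$$R_{12}(z)\, R_{21}(z^{-1}) = 1, \tag{2.10}$$

– crossing symmetry:

$$R_{12}(z)^{t_2}\, R_{21}(q^{-N} z^{-1})^{t_2} = 1, \tag{2.11}$$

– antisymmetry:

$$R_{12}(-z) = \omega\,(g^{-1} \otimes I)\, R_{12}(z)\,(g \otimes I), \tag{2.12}$$

– quasi-periodicity:

$$\widehat R_{12}(-p^{1/2} z) = (g^{1/2} h g^{1/2} \otimes I)^{-1}\, \widehat R_{21}(z^{-1})^{-1}\, (g^{1/2} h g^{1/2} \otimes I), \tag{2.13}$$

where

$$\widehat R_{12}(x) = \tau_N(q^{1/2} x^{-1})\, R_{12}(x), \tag{2.14}$$

and

$$\tau_N(z) = z^{2/N-2}\, \frac{\Theta_{q^{2N}}(q z^2)}{\Theta_{q^{2N}}(q z^{-2})}. \tag{2.15}$$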
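The function $\tau_N$ of (2.15) can be evaluated numerically once a concrete form of $\Theta_a$ is fixed. The sketch below assumes the standard definition $\Theta_a(x) = (x; a)_\infty\,(a x^{-1}; a)_\infty\,(a; a)_\infty$, which is not restated in this section, and truncates the infinite products; all function names and sample values are hypothetical. It probes the period $q^N$ and the inversion property of $\tau_N$ on one sample point.

```python
def qpoch(x, a, terms=200):
    """Truncated infinite product (x; a)_inf = prod_{k>=0} (1 - x a^k), |a| < 1."""
    result = 1.0
    power = 1.0
    for _ in range(terms):
        result *= 1 - x * power
        power *= a
    return result


def Theta(x, a):
    """Theta_a(x) = (x; a)_inf (a/x; a)_inf (a; a)_inf (assumed standard definition)."""
    return qpoch(x, a) * qpoch(a / x, a) * qpoch(a, a)


def tau_N(z, q, N):
    """tau_N(z) = z^(2/N - 2) Theta_{q^{2N}}(q z^2) / Theta_{q^{2N}}(q z^{-2}), Eq. (2.15)."""
    a = q ** (2 * N)
    return z ** (2 / N - 2) * Theta(q * z * z, a) / Theta(q / (z * z), a)


# Sample real values with 0 < q < 1 and z > 0 so that the power of z is unambiguous.
q, N, z = 0.5, 3, 0.8
period_ratio = tau_N(q ** N * z, q, N) / tau_N(z, q, N)
inversion_product = tau_N(1 / z, q, N) * tau_N(z, q, N)
print(period_ratio, inversion_product)
```

Under the assumed definition, the periodicity comes from $\Theta_a(a x) = -x^{-1}\Theta_a(x)$: the two theta factors contribute $q^{2N-2}$, which cancels the factor $q^{2-2N}$ produced by the prefactor $z^{2/N-2}$.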

The function $\tau_N(z)$ is periodic with period $q^N$: $\tau_N(q^N z) = \tau_N(z)$, and satisfies $\tau_N(z^{-1}) = \tau_N(z)^{-1}$.

2.2. Definition of $\mathcal{A}_{q,p}(\widehat{sl}(N)_c)$. We now recall the definition of the elliptic quantum algebra $\mathcal{A}_{q,p}(\widehat{sl}(N)_c)$ [8, 9]. It is an algebra of operators $L_{ij}(z) \equiv \sum_{n\in\mathbb{Z}} L_{ij}(n)\, z^n$, where $i, j \in \mathbb{Z}_N$:

$$L(z) = \begin{pmatrix} L_{11}(z) & \cdots & L_{1N}(z)\\ \vdots & & \vdots\\ L_{N1}(z) & \cdots & L_{NN}(z) \end{pmatrix}. \tag{2.16}$$


The q-determinant is given by ($\varepsilon(\sigma)$ being the signature of the permutation $\sigma$):

$$q\text{-}\det L(z) \equiv \sum_{\sigma\in S_N} \varepsilon(\sigma) \prod_{i=1}^{N} L_{i,\sigma(i)}(z q^{i-N-1}). \tag{2.17}$$
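The sum over permutations in (2.17) has the same combinatorial structure as the Leibniz formula for an ordinary determinant. The sketch below illustrates only that structure, under the simplifying assumption (made here for illustration) that the entries are commuting numbers independent of $z$; in the algebra itself the ordering of the factors and the shifts $z q^{i-N-1}$ matter and are not modelled.

```python
from itertools import permutations


def signature(perm):
    """Signature of a permutation given as a tuple of 0-based images,
    computed from the parity of its number of inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def leibniz_det(M):
    """Sum over permutations of signature * prod_i M[i][sigma(i)],
    with the factors taken in the order i = 1, ..., N as in (2.17)."""
    n = len(M)
    total = 0
    for sigma in permutations(range(n)):
        term = signature(sigma)
        for i in range(n):
            term *= M[i][sigma[i]]
        total += term
    return total


sample = [[2, 0, 1],
          [1, 3, 2],
          [0, 1, 1]]
print(leibniz_det(sample))
```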

$\mathcal{A}_{q,p}(\widehat{gl}(N)_c)$ is defined by imposing the following constraints on the $L(z)$ generators:

$$\widehat R_{12}(z/w)\, L_1(z)\, L_2(w) = L_2(w)\, L_1(z)\, \widehat R^*_{12}(z/w), \tag{2.18}$$

where $L_1(z) \equiv L(z) \otimes I$, $L_2(z) \equiv I \otimes L(z)$ and $\widehat R^*_{12}(z, q, p) \equiv \widehat R_{12}(z, q, p^* = p q^{-2c})$. The q-determinant is in the center of $\mathcal{A}_{q,p}(\widehat{gl}(N)_c)$ and one sets

$$\mathcal{A}_{q,p}(\widehat{sl}(N)_c) = \mathcal{A}_{q,p}(\widehat{gl}(N)_c)/\langle q\text{-}\det L - q^{c/2}\rangle. \tag{2.19}$$

2.3. $Vir_{q,p}(sl(N))$ algebras from elliptic algebras. It is here useful to introduce the following two matrices:

$$L^+(z) \equiv L(q^{c/2} z), \qquad L^-(z) \equiv (g^{1/2} h g^{1/2})\, L(-p^{1/2} z)\, (g^{1/2} h g^{1/2})^{-1}, \tag{2.20}$$

which obey coupled exchange relations following from (2.18) and periodicity/unitarity properties of the matrices $\widehat R_{12}$ and $\widehat R^*_{12}$:

$$\widehat R_{12}(z/w)\, L^\pm_1(z)\, L^\pm_2(w) = L^\pm_2(w)\, L^\pm_1(z)\, \widehat R^*_{12}(z/w),$$
$$\widehat R_{12}(q^{c/2} z/w)\, L^+_1(z)\, L^-_2(w) = L^-_2(w)\, L^+_1(z)\, \widehat R^*_{12}(q^{-c/2} z/w). \tag{2.21}$$

We now recall some of the main results of refs. [1–3]:

Theorem 1. In the three-dimensional parameter space generated by $p, q, c$, one defines a two-dimensional surface $\Sigma_{N,NM}$ for any integer $M \in \mathbb{Z}$ by the set of triplets $(p, q, c)$ connected by the relation $(-p^{1/2})^{NM} = q^{-c-N}$. One defines the following operators in $\mathcal{A}_{q,p}(\widehat{sl}(N)_c)$:

$$t(z) \equiv \mathrm{Tr}\bigl(L^+(q^{c/2} z)\, L^-(z)^{-1}\bigr) = \mathrm{Tr}\bigl(L^+(q^{c/2} z)^t\, \widetilde L^-(z)\bigr) \equiv \mathrm{Tr}(\mathcal{L}(z)), \tag{2.22}$$

where $\widetilde L^-(z) \equiv (L^-(z)^{-1})^t$.

1) On the surface $\Sigma_{N,NM}$, the operators $t(z)$ realize an exchange algebra with the generators $L(w)$ of $\mathcal{A}_{q,p}(\widehat{sl}(N)_c)$:

$$t(z)\, L(w) = F_N\Bigl(NM, \frac{w}{z}\Bigr)\, L(w)\, t(z), \tag{2.23}$$

where

$$F_N(r, x) = \begin{cases} q^{2r(1-\frac{1}{N})} \displaystyle\prod_{k=0}^{r-1} \frac{\Theta_{q^{2N}}(x^{-2} p^{-k})\, \Theta_{q^{2N}}(x^{2} p^{k})}{\Theta_{q^{2N}}(x^{-2} q^2 p^{-k})\, \Theta_{q^{2N}}(x^{2} q^2 p^{k})} & \text{for } r > 0,\\[3ex] q^{-2|r|(1-\frac{1}{N})} \displaystyle\prod_{k=1}^{|r|} \frac{\Theta_{q^{2N}}(x^{-2} q^2 p^{k})\, \Theta_{q^{2N}}(x^{2} q^2 p^{-k})}{\Theta_{q^{2N}}(x^{-2} p^{k})\, \Theta_{q^{2N}}(x^{2} p^{-k})} & \text{for } r < 0. \end{cases} \tag{2.24}$$


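2) On the surface $\Sigma_{N,NM}$, $t(z)$ closes a quadratic subalgebra:

$$t(z)\, t(w) = Y_N\Bigl(NM, \frac{w}{z}\Bigr)\, t(w)\, t(z), \tag{2.25}$$

where

$$Y_N(r, x) = \begin{cases} \displaystyle\prod_{k=1}^{r} \frac{\Theta^2_{q^{2N}}(x^2 p^{-k})\, \Theta_{q^{2N}}(x^2 q^2 p^{k})\, \Theta_{q^{2N}}(x^2 q^{-2} p^{k})}{\Theta^2_{q^{2N}}(x^2 p^{k})\, \Theta_{q^{2N}}(x^2 q^2 p^{-k})\, \Theta_{q^{2N}}(x^2 q^{-2} p^{-k})} & \text{for } r > 0,\\[3ex] \displaystyle\prod_{k=0}^{|r|-1} \frac{\Theta^2_{q^{2N}}(x^2 p^{-k})\, \Theta_{q^{2N}}(x^2 q^2 p^{k})\, \Theta_{q^{2N}}(x^2 q^{-2} p^{k})}{\Theta^2_{q^{2N}}(x^2 p^{k})\, \Theta_{q^{2N}}(x^2 q^2 p^{-k})\, \Theta_{q^{2N}}(x^2 q^{-2} p^{-k})} & \text{for } r < 0. \end{cases} \tag{2.26}$$

3) In particular, at the critical level $c = -N$, the operators $t(z)$ lie in the center of $\mathcal{A}_{q,p}(\widehat{sl}(N)_c)$ and commute with each other.

3. The New $Vir_{q,p}(sl(N))$ Algebras

We first prove the main basic result of this section.

3.1. A generalized quadratic exchange algebra.

Theorem 2. The operators $T(z)$ defined by

$$T(z) \equiv L^+(q^{c/2} z)\, L^-(z)^{-1} \tag{3.1}$$

satisfy the following exchange relation:

$$\widehat R_{12}(z/w)\, T_1(z)\, \widehat R_{21}(q^c w/z)\, T_2(w) = T_2(w)\, \widehat R_{12}(q^c z/w)\, T_1(z)\, \widehat R_{21}(w/z). \tag{3.2}$$

Proof. One can derive from Eqs. (2.21) further exchange relations between the operators $L^+$ and $\widetilde L^-$ [3]. One has therefore:

$$\begin{aligned} T_1(z)\, \widehat R_{21}(q^c w/z)\, T_2(w) &= L^+_1(q^{c/2} z)\, \widetilde L^-_1(z)^{t_1}\, \widehat R_{21}(q^c w/z)\, L^+_2(q^{c/2} w)\, \widetilde L^-_2(w)^{t_2}\\ &= L^+_1(q^{c/2} z)\, L^+_2(q^{c/2} w)\, \widehat R^*_{21}(w/z)\, \widetilde L^-_1(z)^{t_1}\, \widetilde L^-_2(w)^{t_2}\\ &= \widehat R^{-1}_{12}(z/w)\, L^+_2(q^{c/2} w)\, L^+_1(q^{c/2} z)\, \widehat R^*_{12}(z/w)\, \widehat R^*_{21}(w/z)\, \widetilde L^-_1(z)^{t_1}\, \widetilde L^-_2(w)^{t_2}\\ &= \mathcal{T}\, \widehat R^{-1}_{12}(z/w)\, L^+_2(q^{c/2} w)\, L^+_1(q^{c/2} z)\, \widetilde L^-_1(z)^{t_1}\, \widetilde L^-_2(w)^{t_2}\\ &= \mathcal{T}\, \widehat R^{-1}_{12}(z/w)\, L^+_2(q^{c/2} w)\, L^+_1(q^{c/2} z)\, \widehat R^*_{12}(z/w)\, \widetilde L^-_2(w)^{t_2}\, \widetilde L^-_1(z)^{t_1}\, \widehat R^{-1}_{12}(z/w)\\ &= \widehat R^{-1}_{12}(z/w)\, L^+_2(q^{c/2} w)\, \widetilde L^-_2(w)^{t_2}\, \widehat R_{12}(q^c z/w)\, L^+_1(q^{c/2} z)\, \widetilde L^-_1(z)^{t_1}\, \widehat R_{21}(w/z), \end{aligned} \tag{3.3}$$

where $\mathcal{T}$ stands for $\tau_N(q^{1/2} w/z)\, \tau_N(q^{1/2} z/w)$ and we used the relations $\widehat R^*_{12}(z/w)\, \widehat R^*_{21}(w/z) = \mathcal{T}$ and $\mathcal{T}\, \widehat R^{-1}_{12}(z/w) = \widehat R_{21}(w/z)$. Multiplying the last equality of (3.3) by $\widehat R_{12}(z/w)$ on the left, one obtains the desired equation (3.2).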


3.2. An alternative construction: New surfaces in $\mathcal{A}_{q,p}(\widehat{sl}(N)_c)$. Mixed exchange relations of the type described in Theorem 2 were considered in [13, 14]. It was then shown in [14] that, provided that a c-number matrix $\gamma$ exists such that

$$\widehat R_{12}(z/w)\, \gamma_1\, \widehat R_{21}(q^c w/z)\, \gamma_2 = \gamma_2\, \widehat R_{12}(q^c z/w)\, \gamma_1\, \widehat R_{21}(w/z),$$

one may construct commuting generators defined as $Q \equiv \mathrm{tr}(\tilde\gamma^t\, T)$, where $\tilde\gamma$ is the matrix dual to $\gamma$. This, together with the properties of quasi-periodicity and unitarity of the R-matrix, leads us to consider the following operator:

$$t(z) \equiv \mathrm{Tr}\bigl[a^{-n}\, T(z)\bigr] = \mathrm{Tr}\bigl[a^{-n}\, L^+(q^{c/2} z)\, \widetilde L^-(z)^t\bigr], \tag{3.4}$$

where $n \in \mathbb{Z}$ and the matrix $a$ is given by:

$$a = g^{1/2}\, h\, g^{1/2}. \tag{3.5}$$
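Since the construction relies on powers of $a$, it is useful to see explicitly that $a^N$ is a multiple of the identity. The sketch below checks this numerically for a few values of $N$. It assumes 0-based indices and the square root $g^{1/2} = \mathrm{diag}(\omega^{i/2})$, both of which are conventions chosen here for illustration; the helper names are hypothetical.

```python
import cmath


def twist_matrix(N):
    """a = g^{1/2} h g^{1/2} with g^{1/2} = diag(omega^{i/2}) and h_ij = delta_{i+1,j}
    (indices modulo N, 0-based; the branch of the square root is an assumption)."""
    omega = cmath.exp(2j * cmath.pi / N)
    sqrt_g = [omega ** (i / 2) for i in range(N)]
    return [[sqrt_g[i] * sqrt_g[j] if (i + 1) % N == j else 0j
             for j in range(N)] for i in range(N)]


def matmul(A, B):
    """Plain matrix product for small square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def power(A, k):
    """k-th power of a square matrix, k >= 1."""
    result = A
    for _ in range(k - 1):
        result = matmul(result, A)
    return result


def scalar_multiple_of_identity(A, tol=1e-10):
    """Return the scalar s if A = s * identity up to tol, otherwise None."""
    n = len(A)
    s = A[0][0]
    for i in range(n):
        for j in range(n):
            target = s if i == j else 0
            if abs(A[i][j] - target) > tol:
                return None
    return s


scalars = {N: scalar_multiple_of_identity(power(twist_matrix(N), N))
           for N in range(2, 7)}
print(scalars)
```

With these conventions $a$ is a weighted cyclic shift, so $a^N$ equals the product of the weights times the identity, which evaluates to $(-1)^{N-1}$.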

By analogy with the construction in Theorem 1, we expect that the mechanism of construction of directly commuting Hamiltonians in [14] will here turn into a two-step procedure, with a first constraint on $p, q, c$ leading to a closed exchange algebra and a second constraint leading to commuting operators and a subsequent Poisson structure. Indeed we first establish the exchange properties of (3.4) on the surfaces $\Sigma_{N,n}$ of the three-dimensional space of parameters $p, q, c$, given by the equation ($n \in \mathbb{Z}$, $n \neq 0$):

$$(-p^{1/2})^n = q^{-c-N}. \tag{3.6}$$
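For orientation, the constraint (3.6) can be solved for $c$ in a simple special case. The sketch below is an illustration under assumptions made here: $p$ and $q$ are real with $0 < p, q < 1$, and $n$ is even, so that $(-p^{1/2})^n = p^{n/2}$ is positive and a real $c$ exists. Odd $n$ makes the left-hand side negative for real $p$ and is deliberately not handled. The function name is hypothetical.

```python
import math


def central_charge_on_surface(p, q, N, n):
    """Real solution c of (-p^{1/2})^n = q^{-c-N} for even n and 0 < p, q < 1.
    For even n the constraint reads p^{n/2} = q^{-c-N}, hence
    c = -N - (n/2) * ln(p) / ln(q)."""
    if n % 2 != 0:
        raise ValueError("this sketch only treats even n (real positive left-hand side)")
    return -N - (n / 2) * math.log(p) / math.log(q)


# Example: impose in addition p = q^{N h}, the classical-limit relation of the text.
q, N, h, n = 0.5, 3, 2, 4
p = q ** (N * h)
c = central_charge_on_surface(p, q, N, n)
residual = q ** (-c - N) - (-math.sqrt(p)) ** n
print(c, residual)
```

In this example $\ln p / \ln q = Nh$, so $c = -N - nNh/2 = -15$.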

It is relevant for later purposes to rewrite the operator $t(z)$ as $t(z) = \mathrm{Tr}\bigl[\mathcal{L}^{(n)}(z)\bigr]$, where $\mathcal{L}^{(n)}(z)$ is defined by

$$\mathcal{L}^{(n)}(z) = \bigl(a^{-n}\, L^+(q^{c/2} z)\bigr)^t\, \widetilde L^-(z). \tag{3.7}$$

We prove the following lemma.

Lemma 1. On the surfaces $\Sigma_{N,n}$, the operators $\mathcal{L}^{(n)}(z)$ defined by (3.7) have the following exchange properties with the generators $L(w)$ of $\mathcal{A}_{q,p}(\widehat{sl}(N)_c)$:

$$\mathcal{L}^{(n)}_1(z)\, L_2(w) = F_N\Bigl(n, \frac{w}{z}\Bigr)\, L_2(w)\, \bigl(\widehat R^*_{21}(q^{-c} w/z)^{-1}\bigr)^{t_1}\, \mathcal{L}^{(n)}_1(z)\, \widehat R^*_{21}(q^{-c} w/z)^{t_1}. \tag{3.8}$$

Proof. It is easier to formulate the proof in terms of $L^+(w)$. One has:

$$\mathcal{L}^{(n)}_1(z)\, L^+_2(w) = L^+_1(z q^{c/2})^{t_1}\, (a_1^{-n})^{t_1}\, \widetilde L^-_1(z)\, L^+_2(w). \tag{3.9}$$

To exchange $t(z)$ with $L^+(w)$, we need the following exchange relations, coming directly from (2.21):

$$\bigl(\widehat R_{21}(q^{c/2} w/z)^{t_1}\bigr)^{-1}\, L^+_2(w)\, \widetilde L^-_1(z) = \widetilde L^-_1(z)\, L^+_2(w)\, \bigl(\widehat R^*_{21}(q^{-c/2} w/z)^{t_1}\bigr)^{-1}, \tag{3.10}$$

$$L^+_1(z)^{t_1}\, \widehat R_{12}(z/w)^{t_1}\, L^+_2(w) = L^+_2(w)\, \widehat R^*_{12}(z/w)^{t_1}\, L^+_1(z)^{t_1}. \tag{3.11}$$


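Using (3.10) we have:

$$\begin{aligned} \mathcal{L}^{(n)}_1(z)\, L^+_2(w) &= L^+_1(q^{c/2} z)^{t_1}\, (a_1^{-n})^{t_1}\, \bigl(\widehat R_{21}(q^{c/2} w/z)^{t_1}\bigr)^{-1}\, L^+_2(w)\, \widetilde L^-_1(z)\, \widehat R^*_{21}(q^{-c/2} w/z)^{t_1}\\ &= L^+_1(q^{c/2} z)^{t_1}\, (a_1^{-n})^{t_1}\, \bigl(\widehat R_{21}(q^{c/2} w/z)^{t_1}\bigr)^{-1}\, (a_1^{n})^{t_1}\, L^+_2(w)\, (a_1^{-n})^{t_1}\, \widetilde L^-_1(z)\, \widehat R^*_{21}(q^{-c/2} w/z)^{t_1}. \end{aligned} \tag{3.12}$$

On the other hand, using the crossing-symmetry property we have:

$$(a_1^{-n})^{t_1}\, \bigl(\widehat R_{21}(q^{c/2} w/z)^{t_1}\bigr)^{-1}\, (a_1^{n})^{t_1} = (a_1^{-n})^{t_1}\, \bigl(\widehat R_{21}(q^{\frac{c}{2}+N} w/z)^{-1}\bigr)^{t_1}\, (a_1^{n})^{t_1} = \bigl(a_1^{n}\, \widehat R_{21}(q^{\frac{c}{2}+N} w/z)^{-1}\, a_1^{-n}\bigr)^{t_1}. \tag{3.13}$$

We now apply $n$ times the following relation coming from unitarity and quasi-periodicity:

$$\widehat R_{21}\bigl(z^{-1}(-p^{1/2})\bigr)^{-1} = \tau_N(q^{1/2} z)\, \tau_N(q^{1/2} z^{-1})\, a_1\, \widehat R_{21}(z^{-1})^{-1}\, a_1^{-1}. \tag{3.14}$$

We see here the role of the quasi-periodicity operator in implementing the general power of $-p^{1/2}$ in the R-matrix, leading to:

$$\widehat R_{21}\bigl(z^{-1}(-p^{1/2})^n\bigr)^{-1} = G_N(n, z)\, a_1^{n}\, \widehat R_{21}(z^{-1})^{-1}\, a_1^{-n}, \tag{3.15}$$

where

$$G_N(n, z) = \prod_{k=0}^{n-1} \tau_N\bigl[z q^{1/2} (-p^{1/2})^{-k}\bigr]\, \tau_N\bigl[z^{-1} q^{1/2} (-p^{1/2})^{k}\bigr] \quad \text{for } n > 0,$$
$$G_N(n, z) = \prod_{k=1}^{|n|} \tau_N^{-1}\bigl[z q^{1/2} (-p^{1/2})^{k}\bigr]\, \tau_N^{-1}\bigl[z^{-1} q^{1/2} (-p^{1/2})^{-k}\bigr] \quad \text{for } n < 0. \tag{3.16}$$

Applying (3.15) to (3.13) and using Eq. (3.6), we have:

$$(a_1^{-n})^{t_1}\, \bigl(\widehat R_{21}(q^{c/2} w/z)^{t_1}\bigr)^{-1}\, (a_1^{n})^{t_1} = G_N^{-1}\bigl(n, q^{c/2} (-p^{1/2})^n z/w\bigr) \times \bigl(\widehat R_{21}(q^{-c/2} w/z)^{-1}\bigr)^{t_1}. \tag{3.17}$$

We remark that from the definition of $\tau_N(x)$ (2.15) and the relation (3.6), it follows that:

$$G_N^{-1}\bigl(n, q^{c/2} (-p^{1/2})^n x^{-1}\bigr) = F_N\bigl(n, q^{c/2} x\bigr), \tag{3.18}$$

where $F_N$ is given by (2.24). Inserting (3.17), (3.18) into (3.12), we have:

$$\mathcal{L}^{(n)}_1(z)\, L^+_2(w) = F_N\Bigl(n, q^{c/2}\frac{w}{z}\Bigr)\, L^+_1(q^{c/2} z)^{t_1}\, \bigl(\widehat R_{21}(q^{-c/2} w/z)^{-1}\bigr)^{t_1} \times L^+_2(w)\, (a_1^{-n})^{t_1}\, \widetilde L^-_1(z)\, \widehat R^*_{21}(q^{-c/2} w/z)^{t_1}. \tag{3.19}$$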


Now we use Eq. (3.11) to obtain:

$$\mathcal{L}^{(n)}_1(z)\, L^+_2(w) = F_N\Bigl(n, q^{c/2}\frac{w}{z}\Bigr)\, L^+_2(w)\, \bigl(\widehat R^*_{21}(q^{-c/2} w/z)^{-1}\bigr)^{t_1} \times L^+_1(q^{c/2} z)^{t_1}\, (a_1^{-n})^{t_1}\, \widetilde L^-_1(z)\, \widehat R^*_{21}(q^{-c/2} w/z)^{t_1}. \tag{3.20}$$

We are now able to state the following theorem:

Theorem 3. On the surfaces $\Sigma_{N,n}$, the operators $t(z)$ defined by (3.4) satisfy the following exchange relations with the generators $L(w)$ of $\mathcal{A}_{q,p}(\widehat{sl}(N)_c)$:

$$t(z)\, L(w) = F_N\Bigl(n, \frac{w}{z}\Bigr)\, L(w)\, t(z). \tag{3.21}$$

Proof. We formulate the proof in terms of $L^+(w)$. One has

$$t(z)\, L^+_2(w) = \mathrm{Tr}_1\Bigl[L^+_1(z q^{c/2})^{t_1}\, (a_1^{-n})^{t_1}\, \widetilde L^-_1(z)\, L^+_2(w)\Bigr]. \tag{3.22}$$

From Eq. (3.20), one obtains immediately:

$$t(z)\, L^+_2(w) = F_N\Bigl(n, q^{c/2}\frac{w}{z}\Bigr)\, L^+_2(w)\, \mathrm{Tr}_1\Bigl[\bigl(\widehat R^*_{21}(q^{-c/2} w/z)^{-1}\bigr)^{t_1} \times L^+_1(q^{c/2} z)^{t_1}\, (a_1^{-n})^{t_1}\, \widetilde L^-_1(z)\, \widehat R^*_{21}(q^{-c/2} w/z)^{t_1}\Bigr]. \tag{3.23}$$

Using the very useful property:

$$\mathrm{Tr}_1\bigl(R_{21}\, Q_1\, R'_{21}\bigr) = \mathrm{Tr}_1\Bigl(Q_1\, \bigl(R'^{\,t_2}_{21}\, R^{\,t_2}_{21}\bigr)^{t_2}\Bigr), \tag{3.24}$$

we get:

$$t(z)\, L^+_2(w) = F_N\Bigl(n, q^{c/2}\frac{w}{z}\Bigr)\, L^+_2(w)\, \mathrm{Tr}_1\Bigl[L^+_1(q^{c/2} z)^{t_1}\, (a_1^{-n})^{t_1}\, \widetilde L^-_1(z)\, \Bigl(\widehat R^*_{21}(q^{-c/2} w/z)^{t_1 t_2} \times \bigl(\widehat R^*_{21}(q^{-c/2} w/z)^{-1}\bigr)^{t_1 t_2}\Bigr)^{t_2}\Bigr]. \tag{3.25}$$

Now the product of the two R-matrices in (3.25) reduces to the identity, leaving:

$$t(z)\, L^+(w) = F_N\Bigl(n, q^{c/2}\frac{w}{z}\Bigr)\, L^+(w)\, t(z). \tag{3.26}$$

Theorem 3 then follows immediately from Definitions (2.20).

In particular, since $a^N$ is proportional to the identity, one recovers the exchange algebras in [3] when $n = NM$, $M \in \mathbb{Z}$. A simple corollary of Theorem 3 is the exchange relation between $t(z)$ and $t(w)$. Indeed using the relations:

$$\widetilde L^-(z) = \bigl(L^-(z)^t\bigr)^{-1}, \qquad L^-(w) = a\, L^+(-p^{1/2} q^{-c/2} w)\, a^{-1}, \tag{3.27}$$

we derive from (3.21) the following result:


Corollary 1. On the surface $\Sigma_{N,n}$ the operators $t(z)$ defined by (3.4) satisfy the following algebra:

$$t(z)\, t(w) = Y_N(n, w/z)\, t(w)\, t(z), \tag{3.28}$$

where $Y_N$ is the function defined by (2.26).

Remark. For $n = -1$ Eq. (3.28) reads as:

$$t(z)\, t(w) = t(w)\, t(z), \tag{3.29}$$

that is, on the surface $q^{-c-N} = (-p^{1/2})^{-1}$, the operators $t(z)$ commute. However they do not belong to the center of $\mathcal{A}_{q,p}(\widehat{sl}(N)_c)$, because the exchange factor of (3.21) is different from unity.

As when $c = -N$ this is an occurrence of a one-step mechanism where one obtains directly a commuting algebra of operators with one single constraint on $p, q, c$. However, contrary to that previous case, where the elliptic algebra and the quantum group shared this feature, this one is characteristic of the elliptic algebra structure, involving as it does the elliptic nome $p$.

Exchange relations (3.28) are to be understood as realizations of $Vir_{q,p}(sl(N))$ algebras in the framework of $\mathcal{A}_{q,p}(\widehat{sl}(N)_c)$. This conclusion derives from the following results.

Theorem 4. On the surface $\Sigma_{N,n}$, when $p = q^{Nh}$ with $h \in \mathbb{Z}\setminus\{0\}$, the function $Y_N(n, x)$ is equal to 1. Hence $t(z)$ realizes an Abelian subalgebra in $\mathcal{A}_{q,p}(\widehat{sl}(N)_c)$.

Proof. Theorem 4 is easily proved using the periodicity properties of the $\Theta_{q^{2N}}$ functions.

Theorem 5. Setting $q^{Nh} = p^{1-\beta}$ for any integer $h \neq 0$, the $h$-labeled Poisson structure defined by:

$$\{t(z), t(w)\}_{(h)} = \lim_{\beta\to 0} \frac{1}{\beta}\bigl(t(z)\, t(w) - t(w)\, t(z)\bigr) \tag{3.30}$$

has the following expression:

$$\{t(z), t(w)\}_{(h)} = f_h(w/z)\, t(z)\, t(w), \tag{3.31}$$

where

$$\begin{aligned} f_h(x) = {} & 2Nh \ln q \Biggl[\sum_{\ell\ge 0}\Biggl\{ E\Bigl(\frac{n}{2}\Bigr)\Bigl(E\Bigl(\frac{n}{2}\Bigr) + 1\Bigr) \Bigl(\frac{2x^2 q^{2N\ell}}{1 - x^2 q^{2N\ell}} - \frac{x^2 q^{2N\ell+2}}{1 - x^2 q^{2N\ell+2}} - \frac{x^2 q^{2N\ell-2}}{1 - x^2 q^{2N\ell-2}}\Bigr)\\ & + E\Bigl(\frac{n+1}{2}\Bigr)^2 \Bigl(\frac{2x^2 q^{2N\ell+N}}{1 - x^2 q^{2N\ell+N}} - \frac{x^2 q^{2N\ell+N+2}}{1 - x^2 q^{2N\ell+N+2}} - \frac{x^2 q^{2N\ell+N-2}}{1 - x^2 q^{2N\ell+N-2}}\Bigr)\Biggr\}\\ & - \frac{1}{2} E\Bigl(\frac{n}{2}\Bigr)\Bigl(E\Bigl(\frac{n}{2}\Bigr) + 1\Bigr) \Bigl(\frac{2x^2}{1 - x^2} - \frac{x^2 q^2}{1 - x^2 q^2} - \frac{x^2 q^{-2}}{1 - x^2 q^{-2}}\Bigr) - (x \leftrightarrow x^{-1})\Biggr] \quad \text{for } h \text{ odd},\\ f_h(x) = {} & Nh\, n(n+1) \ln q \Biggl[\sum_{\ell\ge 0} \Bigl(\frac{2x^2 q^{2N\ell}}{1 - x^2 q^{2N\ell}} - \frac{x^2 q^{2N\ell+2}}{1 - x^2 q^{2N\ell+2}} - \frac{x^2 q^{2N\ell-2}}{1 - x^2 q^{2N\ell-2}}\Bigr)\\ & - \frac{1}{2}\Bigl(\frac{2x^2}{1 - x^2} - \frac{x^2 q^2}{1 - x^2 q^2} - \frac{x^2 q^{-2}}{1 - x^2 q^{-2}}\Bigr) - (x \leftrightarrow x^{-1})\Biggr] \quad \text{for } h \text{ even}. \end{aligned} \tag{3.32}$$

Here the notation $E(m)$ means the integer part of the number $m$.

Proof. By direct calculation.
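Theorem 4 lends itself to a quick numerical sanity check. The sketch below evaluates the $r > 0$ branch of $Y_N(r, x)$ from (2.26) with truncated products and tests that it equals 1 when $p = q^{Nh}$. It relies on two assumptions made here: the standard definition $\Theta_a(x) = (x; a)_\infty (a x^{-1}; a)_\infty (a; a)_\infty$, and the reading of (2.26) in which the numerator carries $\Theta^2(x^2 p^{-k})\,\Theta(x^2 q^2 p^{k})\,\Theta(x^2 q^{-2} p^{k})$. Function names and sample values are hypothetical.

```python
def qpoch(x, a, terms=200):
    """Truncated infinite product (x; a)_inf = prod_{k>=0} (1 - x a^k), |a| < 1."""
    result = 1 + 0j
    power = 1.0
    for _ in range(terms):
        result *= 1 - x * power
        power *= a
    return result


def Theta(x, a):
    """Theta_a(x) = (x; a)_inf (a/x; a)_inf (a; a)_inf (assumed standard definition)."""
    return qpoch(x, a) * qpoch(a / x, a) * qpoch(a, a)


def Y_N(r, x, p, q, N):
    """The r > 0 branch of Y_N(r, x) as read from Eq. (2.26)."""
    if r <= 0:
        raise ValueError("only the r > 0 branch is sketched here")
    a = q ** (2 * N)
    x2 = x * x
    value = 1 + 0j
    for k in range(1, r + 1):
        numerator = (Theta(x2 * p ** (-k), a) ** 2
                     * Theta(x2 * q * q * p ** k, a)
                     * Theta(x2 * p ** k / (q * q), a))
        denominator = (Theta(x2 * p ** k, a) ** 2
                       * Theta(x2 * q * q * p ** (-k), a)
                       * Theta(x2 * p ** (-k) / (q * q), a))
        value *= numerator / denominator
    return value


q, N = 0.6, 3
x = 0.9 + 0.2j  # generic complex point, away from the zeros of Theta
deviations = [abs(Y_N(r, x, q ** (N * h), q, N) - 1)
              for h in (1, 2, -1) for r in (1, 2, 3)]
print(max(deviations))
```

The mechanism is the one named in the proof: with $p = q^{Nh}$ one has $p^{2k} = (q^{2N})^{hk}$, so each $\Theta(y p^{k})$ differs from $\Theta(y p^{-k})$ by an explicit monomial factor, and these factors cancel between the three arguments $x^2$, $x^2 q^2$ and $x^2 q^{-2}$.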

Formula (3.32) is a trivial generalization of formulas (5.3) of [3] provided that the formal substitution $NM \to n$ is done. Therefore the discussion between Theorem 7 and Proposition 5 of [3] is still valid and the Poisson bracket in the sector $k = 0$ corresponding to (3.32) is given, modulo the substitution indicated, by the formula in Proposition 5 of [3], which for even $h$ yields the algebra $Vir_q(sl(N))$ [5]. Therefore we may conclude that (3.28) realizes the exchange relations of bona fide quantum $Vir_{q,p}(sl(N))$ algebras.

When $N = 2$ the construction does not lead to new q-deformed Virasoro algebras. However when $N \ge 3$, one obtains new exchange algebraic structures corresponding to the surfaces $\Sigma_{N,n}$ when $n \neq NM$, $M \in \mathbb{Z}$, and these structures will now be extended to complete q-deformed $W_N$ algebras.

4. Universal Construction of $W_{q,p}$ Algebras

Extending the construction to higher spin currents for $N \ge 3$ compels us to use the direct exchange relation in Lemma 1 instead of the quadratic intertwined exchange relation in Theorem 2, for which no generalizations to higher powers of $T$ exist.

4.1. Quantum $W_{q,p}$ algebras.

Theorem 6. We define the operators $w_s(z)$ ($s = 1, \dots, N-1$) by:

$$w_s(z) \equiv \mathrm{Tr}\Biggl[\ \overset{\curvearrowleft}{\prod_{1\le i\le s}}\ \overset{\curvearrowleft}{\prod_{j>i}} P_{ij}\ \prod_{1\le i\le s}\Bigl(\mathcal{L}^{(n)}_i(z_i) \prod_{j>i} \widehat R^*_{ij}(q^{-N} z_i/z_j)^{t_i t_j}\Bigr)\Biggr], \tag{4.1}$$

where

$$\mathcal{L}^{(n)}_i(z) \equiv \bigl(a^{-n}\, L^+_i(q^{c/2} z)\bigr)^{t_i}\, \widetilde L^-_i(z) \equiv \underbrace{I \otimes \cdots \otimes I}_{i-1} \otimes \bigl(a^{-n}\, L^+(q^{c/2} z)\bigr)^{t}\, \widetilde L^-(z) \otimes \underbrace{I \otimes \cdots \otimes I}_{s-i} \tag{4.2}$$

with $n \in \mathbb{Z}$, $z_i = z q^{i - \frac{s+1}{2}}$, and $P_{ij}$ is the permutation operator between the spaces $i$ and $j$ including the spectral parameters. On the surface $\Sigma_{N,n}$ defined by $(-p^{1/2})^n = q^{-c-N}$, the operators $w_s(z)$ realize an exchange algebra with the generators $L(w)$ of $\mathcal{A}_{q,p}(\widehat{sl}(N)_c)$:

$$w_s(z)\, L(w) = F_N^{(s)}\Bigl(n, \frac{w}{z}\Bigr)\, L(w)\, w_s(z), \tag{4.3}$$

where

$$F_N^{(s)}\Bigl(n, \frac{w}{z}\Bigr) = \prod_{i=1}^{s} F_N\Bigl(n, \frac{w}{z_i}\Bigr). \tag{4.4}$$
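The shifted spectral parameters $z_i = z q^{i - (s+1)/2}$ entering (4.1) and (4.4) are symmetric around $z$. The short sketch below lists the exponents of $q$ for a given $s$ using exact fractions; the function name is hypothetical and the formula is the reading of the definition adopted in the statement of Theorem 6.

```python
from fractions import Fraction


def shift_exponents(s):
    """Exponents e_i such that z_i = z * q**e_i, with e_i = i - (s + 1)/2 for i = 1..s."""
    return [Fraction(2 * i - s - 1, 2) for i in range(1, s + 1)]


for s in (1, 2, 3, 4):
    print(s, [str(e) for e in shift_exponents(s)])
```

For $s = 2$ this gives $z_1 = z q^{-1/2}$, $z_2 = z q^{1/2}$ and for $s = 3$ it gives $z_1 = z q^{-1}$, $z_2 = z$, $z_3 = z q$, which are the values used in the proof. The exponents sum to zero, so $z_1 \cdots z_s = z^s$.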

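Proof. For simplicity, we will only prove the theorem for $w_2(z)$ and $w_3(z)$ (the proof for $w_1(z) \equiv t(z)$ has been done in [2, 3], see Theorem 1 above). The proof is based on the exchange relation (3.20) between $\mathcal{L}^{(n)}(z)$ and $L^+(w)$ on the surface $\Sigma_{N,n}$ defined by $(-p^{1/2})^n = q^{-c-N}$:

$$\mathcal{L}^{(n)}_i(z)\, L^+_\alpha(w) = F_N\Bigl(n, q^{c/2}\frac{w}{z}\Bigr)\, L^+_\alpha(w)\, \bigl(\widehat R^*_{\alpha i}(q^{-c/2} w/z)^{-1}\bigr)^{t_i} \times \mathcal{L}^{(n)}_i(z)\, \widehat R^*_{\alpha i}(q^{-c/2} w/z)^{t_i}. \tag{4.5}$$

Consider the operator $w_2(z)$. By definition (with $z_1 = z q^{-1/2}$ and $z_2 = z q^{1/2}$):

$$w_2(z) = \mathrm{Tr}\Bigl[P_{12}\, \mathcal{L}^{(n)}_1(z_1)\, \widehat R^*_{12}(q^{-N} z_1/z_2)^{t_1 t_2}\, \mathcal{L}^{(n)}_2(z_2)\Bigr]. \tag{4.6}$$

From the exchange relation (4.5) between $\mathcal{L}^{(n)}(z)$ and $L^+(w)$, one gets immediately:

$$\begin{aligned} w_2(z)\, L^+_\alpha(w) &= \mathrm{Tr}_{12}\Bigl[P_{12}\, \mathcal{L}^{(n)}_1(z_1)\, \widehat R^*_{12}(q^{-N} z_1/z_2)^{t_1 t_2}\, \mathcal{L}^{(n)}_2(z_2)\Bigr]\, L^+_\alpha(w)\\ &= F_N^{(2)}\Bigl(n, q^{c/2}\frac{w}{z}\Bigr)\, L^+_\alpha(w)\, \mathrm{Tr}_{12}\Bigl[P_{12}\, \bigl(\widehat R^*_{\alpha 1}(q^{-c/2} w/z_1)^{-1}\bigr)^{t_1}\, \mathcal{L}^{(n)}_1(z_1)\, \widehat R^*_{\alpha 1}(q^{-c/2} w/z_1)^{t_1}\\ &\qquad \times \widehat R^*_{12}(q^{-N} z_1/z_2)^{t_1 t_2}\, \bigl(\widehat R^*_{\alpha 2}(q^{-c/2} w/z_2)^{-1}\bigr)^{t_2}\, \mathcal{L}^{(n)}_2(z_2)\, \widehat R^*_{\alpha 2}(q^{-c/2} w/z_2)^{t_2}\Bigr], \end{aligned} \tag{4.7}$$

where $F_N^{(2)}\bigl(n, q^{c/2}\frac{w}{z}\bigr)$ is given by:

$$F_N^{(2)}\Bigl(n, q^{c/2}\frac{w}{z}\Bigr) = F_N\Bigl(n, q^{c/2}\frac{w}{z_1}\Bigr)\, F_N\Bigl(n, q^{c/2}\frac{w}{z_2}\Bigr). \tag{4.8}$$

In order to reorganize the R-matrices in (4.7), one uses the Yang–Baxter equation for the matrix $\widehat R^*$ (Eq. (4.9) is a consequence of (2.9), the normalization factor entering in the definition (2.14) of the $\widehat R$ matrices being the same in the l.h.s. and in the r.h.s. of (4.9)):

$$\widehat R^*_{\alpha 1}(x_1)\, \widehat R^*_{\alpha 2}(x_2)\, \widehat R^*_{12}(x_2/x_1) = \widehat R^*_{12}(x_2/x_1)\, \widehat R^*_{\alpha 2}(x_2)\, \widehat R^*_{\alpha 1}(x_1), \tag{4.9}$$

from which it follows (with a shift $x_2 \to q^{-N} x_2$)

$$\widehat R^*_{\alpha 1}(x_1)^{t_1}\, \widehat R^*_{12}(q^{-N} x_2/x_1)^{t_1 t_2}\, \bigl(\widehat R^*_{\alpha 2}(x_2)^{-1}\bigr)^{t_2} = \bigl(\widehat R^*_{\alpha 2}(x_2)^{-1}\bigr)^{t_2}\, \widehat R^*_{12}(q^{-N} x_2/x_1)^{t_1 t_2}\, \widehat R^*_{\alpha 1}(x_1)^{t_1}. \tag{4.10}$$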

Therefore, one obtains h t1 c w c ∗ bα1 L+α (w) Tr12 P12 R (q − 2 w/z1 )−1 w2 (z) L+α (w) = FN(2) n, q 2 z t2 c b∗ − 2 w/z2 )−1 L(n) 1 (z1 ) Rα2 (q i t1 t2 c ∗ ∗ ∗ − c2 b12 bα1 b R (q −N z1 /z2 )t1 t2 R (q − 2 w/z1 ) L(n) (z ) R (q w/z ) 2 2 α2 2 h t1 t2 c c c w ∗ ∗ bα1 bα2 L+α (w) Tr12 P12 R R (q − 2 w/z1 )−1 (q − 2 w/z2 )−1 = FN(2) n, q 2 z i t2 c (n) ∗ ∗ b b∗ − c2 w/z1 )t1 R bα2 L1 (z1 ) R12 (q −N z1 /z2 )t1 t2 L(n) (q − 2 w/z2 ) 2 (z2 ) Rα1 (q h t2 t1 c w b∗ (q − c2 w/z2 )−1 b∗ (q − c2 w/z1 )−1 L+α (w) Tr12 R R P12 = FN(2) n, q 2 α2 α1 z i t t c 2 ∗ b∗ −N z1 /z2 )t1 t2 L(n) (z2 ) R b∗ (q − c2 w/z1 ) 1 R bα2 L(n) (q − 2 w/z2 ) , (4.11) α1 1 (z1 ) R12 (q 2 the last equality being obtained by the action of the permutation operator P12 . One then uses the fact that under a trace over the space β, for any c-number matrices Aαβ and Bαβ and operator matrix Qα , one has h i h tα i . (4.12) Trβ Aαβ Qβ Bαβ = Trβ Qβ (Bαβ )tα (Aαβ )tα 0 0 Rα2 , one gets Applying (4.12) to β ≡ 1 ⊗ 2, Aαβ ≡ Rα2 Rα1 and Bαβ ≡ Rα1 h c w b∗ −N z1 /z2 )t1 t2 L(n) (z2 ) L+α (w) Tr12 P12 L(n) w2 (z) L+α (w) = FN(2) n, q 2 1 (z1 ) R12 (q 2 z t t t t c c 2 α 1 α ∗ ∗ bα2 bα1 R (q − 2 w/z2 ) R (q − 2 w/z1 ) t1 tα t2 tα tα i c c ∗ ∗ bα1 bα2 R R (q − 2 w/z1 )−1 (q − 2 w/z2 )−1 h c w b∗ −N z1 /z2 )t1 t2 L(n) (z2 ) L+α (w) Tr12 P12 L(n) = FN(2) n, q 2 1 (z1 ) R12 (q 2 z t t t t c c 2 α 1 α ∗ ∗ bα1 bα2 R (q − 2 w/z2 ) R (q − 2 w/z1 ) −1 −1 tα i c c ∗ ∗ bα1 bα2 R R (q − 2 w/z1 )t1 tα (q − 2 w/z2 )t2 tα c w L+α (w) Tr12 = FN(2) n, q 2 z h i b∗ −N z1 /z2 )t1 t2 L(n) (z2 ) . P12 L(n) (4.13) 1 (z1 ) R12 (q 2

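It follows that

$$w_2(z)\,L^{+}_{\alpha}(w) = F_N^{(2)}\Bigl(n, q^{c/2}\frac{w}{z}\Bigr)\,L^{+}_{\alpha}(w)\,w_2(z). \tag{4.14}$$

Recalling the fact that $L^{+}_{\alpha}(w) = L_{\alpha}(q^{c/2} w)$, one gets the desired result.

Consider now the case of $w_3(z)$. By definition (with $z_1 = zq^{-1}$, $z_2 = z$ and $z_3 = zq$):

$$w_3(z) = \mathrm{Tr}\Bigl[P_{12}P_{13}P_{23}\,L^{(n)}_1(z_1)\,\widehat{R}^{*}_{12}(q^{-N}z_1/z_2)^{t_1t_2}\,\widehat{R}^{*}_{13}(q^{-N}z_1/z_3)^{t_1t_3}\,L^{(n)}_2(z_2)\,\widehat{R}^{*}_{23}(q^{-N}z_2/z_3)^{t_2t_3}\,L^{(n)}_3(z_3)\Bigr]. \tag{4.15}$$

From the exchange relation (4.5) between $L^{(n)}(z)$ and $L^{+}(w)$, one gets:

$$\begin{aligned}
w_3(z)\,L^{+}_{\alpha}(w) &= \mathrm{Tr}_{123}\Bigl[P_{12}P_{13}P_{23}\,L^{(n)}_1(z_1)\,\widehat{R}^{*}_{12}(q^{-N}z_1/z_2)^{t_1t_2}\,\widehat{R}^{*}_{13}(q^{-N}z_1/z_3)^{t_1t_3}\,L^{(n)}_2(z_2)\,\widehat{R}^{*}_{23}(q^{-N}z_2/z_3)^{t_2t_3}\,L^{(n)}_3(z_3)\,L^{+}_{\alpha}(w)\Bigr] \\
&= F_N^{(3)}\Bigl(n, q^{c/2}\frac{w}{z}\Bigr)\,L^{+}_{\alpha}(w)\,\mathrm{Tr}_{123}\Bigl[P_{12}P_{13}P_{23}\,\bigl(\widehat{R}^{*}_{\alpha 1}(q^{-c/2}w/z_1)^{-1}\bigr)^{t_1}\,L^{(n)}_1(z_1)\,\widehat{R}^{*}_{\alpha 1}(q^{-c/2}w/z_1)^{t_1} \\
&\qquad \times \widehat{R}^{*}_{12}(q^{-N}z_1/z_2)^{t_1t_2}\,\widehat{R}^{*}_{13}(q^{-N}z_1/z_3)^{t_1t_3}\,\bigl(\widehat{R}^{*}_{\alpha 2}(q^{-c/2}w/z_2)^{-1}\bigr)^{t_2}\,L^{(n)}_2(z_2)\,\widehat{R}^{*}_{\alpha 2}(q^{-c/2}w/z_2)^{t_2} \\
&\qquad \times \widehat{R}^{*}_{23}(q^{-N}z_2/z_3)^{t_2t_3}\,\bigl(\widehat{R}^{*}_{\alpha 3}(q^{-c/2}w/z_3)^{-1}\bigr)^{t_3}\,L^{(n)}_3(z_3)\,\widehat{R}^{*}_{\alpha 3}(q^{-c/2}w/z_3)^{t_3}\Bigr],
\end{aligned} \tag{4.16}$$

where $F_N^{(3)}\bigl(n, q^{c/2}\frac{w}{z}\bigr)$ is given by:

$$F_N^{(3)}\Bigl(n, q^{c/2}\frac{w}{z}\Bigr) = F_N\Bigl(n, q^{c/2}\frac{w}{z_1}\Bigr)\,F_N\Bigl(n, q^{c/2}\frac{w}{z_2}\Bigr)\,F_N\Bigl(n, q^{c/2}\frac{w}{z_3}\Bigr). \tag{4.17}$$

Applying three times the Yang–Baxter equation (4.10), one obtains:

$$\begin{aligned}
w_3(z)\,L^{+}_{\alpha}(w) &= F_N^{(3)}\Bigl(n, q^{c/2}\frac{w}{z}\Bigr)\,L^{+}_{\alpha}(w)\,\mathrm{Tr}_{123}\Bigl[P_{12}P_{13}P_{23}\,\bigl(\widehat{R}^{*}_{\alpha 1}(q^{-c/2}w/z_1)^{-1}\bigr)^{t_1}\,\bigl(\widehat{R}^{*}_{\alpha 2}(q^{-c/2}w/z_2)^{-1}\bigr)^{t_2}\,\bigl(\widehat{R}^{*}_{\alpha 3}(q^{-c/2}w/z_3)^{-1}\bigr)^{t_3} \\
&\qquad \times L^{(n)}_1(z_1)\,\widehat{R}^{*}_{12}(q^{-N}z_1/z_2)^{t_1t_2}\,\widehat{R}^{*}_{13}(q^{-N}z_1/z_3)^{t_1t_3}\,L^{(n)}_2(z_2)\,\widehat{R}^{*}_{23}(q^{-N}z_2/z_3)^{t_2t_3}\,L^{(n)}_3(z_3) \\
&\qquad \times \widehat{R}^{*}_{\alpha 1}(q^{-c/2}w/z_1)^{t_1}\,\widehat{R}^{*}_{\alpha 2}(q^{-c/2}w/z_2)^{t_2}\,\widehat{R}^{*}_{\alpha 3}(q^{-c/2}w/z_3)^{t_3}\Bigr].
\end{aligned} \tag{4.18}$$

Finally, after action of the permutation operators and using (4.12) applied to $\beta \equiv 1\otimes 2\otimes 3$, the R matrices $\widehat{R}^{*}_{\alpha i}$ in (4.18) simplify. It follows that

$$w_3(z)\,L^{+}_{\alpha}(w) = F_N^{(3)}\Bigl(n, q^{c/2}\frac{w}{z}\Bigr)\,L^{+}_{\alpha}(w)\,w_3(z). \tag{4.19}$$

Again, one gets the desired result since $L^{+}_{\alpha}(w) = L_{\alpha}(q^{c/2} w)$.

The proof for a generic operator $w_s(z)$ is obtained by using the basic exchange relation (4.5) between $L^{(n)}(z)$ and $L^{+}(w)$ and applying $\frac{1}{2}s(s-1)$ times the Yang–Baxter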


equation in the form (4.10). Successive uses of this relation yield an expression involving: the group of permutation operators; the product of all R matrices appearing at the left of the $L^{(n)}$ operator in Lemma 1; the terms of the monomial $w_s(z)$; finally the product of all R matrices appearing at the right of the $L^{(n)}$ operator in Lemma 1. Commutation of the permutation operators with the “left” R matrices then brings this group of R matrices in position to use the transposition procedure (4.12), and precisely rearranges the indices of these R matrices in the exact way required to cancel the two “left” and “right” groups of exchange-generated R matrices once the generalization of the procedure (4.12) is used.

Therefore the exchange relation becomes an exchange algebra with purely scalar functional structure coefficients. An immediate consequence is therefore the exchange relation between the operators $w_s(z)$ and the generators $L^{(n)}(w)$:

$$w_s(z)\,L^{(n)}(w) = \frac{F_N^{(s)}\bigl(n, q^{c}\frac{w}{z}\bigr)}{F_N^{(s)}\bigl(n, -p^{1/2}\frac{w}{z}\bigr)}\,L^{(n)}(w)\,w_s(z) = \prod_{i=1}^{s} Y_N\Bigl(n, \frac{w}{z_i}\Bigr)\,L^{(n)}(w)\,w_s(z), \tag{4.20}$$

where $Y_N$ is the function defined by (2.26) and $z_i = zq^{\,i-\frac{s+1}{2}}$. It is now immediate to derive the exchange algebra among the operators $w_s(z)$. One gets the following theorem:

Theorem 7. On the surface $\Sigma_{N,n}$, the operators $w_s(z)$ realize an exchange algebra

$$w_i(z)\,w_j(w) = \prod_{u=-\frac{i-1}{2}}^{\frac{i-1}{2}}\ \prod_{v=-\frac{j-1}{2}}^{\frac{j-1}{2}} Y_N\Bigl(n, q^{v-u}\frac{w}{z}\Bigr)\,w_j(w)\,w_i(z). \tag{4.21}$$

The proof is obvious and follows immediately from Definition 4.1 and Eq. (4.20).

Remarks. 1) The critical level $c = -N$ can be seen as a limiting case of the relation $(-p^{1/2})^{n} = q^{-c-N}$ by taking $n = 0$, $p$ and $q$ having arbitrary generic values ($|q| < 1$, $|p| < 1$). In this limiting case, it is easy to note that the factor $G_N(0, x)$ is equal to 1 (see e.g. (3.15)) and hence $F(0, x) = 1$. Therefore, at the critical level, the operators $w_s(z)$ provide a set of commuting quantities belonging to the center of the elliptic quantum algebra $A_{q,p}(\widehat{sl}(N)_c)$.

2) When one chooses $n = NM$ for $M \in \mathbb{Z}$, one recovers the structure functions of the $q$-$W_N$ algebras constructed in [3]. General values of $n$ lead to original structure functions. This time however the construction does not go through the two steps of first constructing the trace $s(z)$ and then shift-multiply the same derived abstract generator to get the full $q$-$W_N$ algebra: the trace procedure and shifted multiplication are applied in one single stroke to the original elliptic algebra generators, hence our denomination of this as a “universal” procedure.

4.2. Classical limit.

Theorem 8. On the surface $\Sigma_{N,n}$, when an additional relation $p = q^{Nh}$ with $h \in \mathbb{Z}\setminus\{0\}$ is imposed, the function $Y$ is equal to 1. Hence, the operators $w_s(z)$ realize an Abelian subalgebra in $A_{q,p}(\widehat{sl}(N)_c)$.

Proof. The proof is straightforward by using the explicit expression of the function $F_s(M, x)$ and the periodicity properties of the $\Theta_{q^{2N}}$ functions.

Remark. Although the $w_s(z)$ realize an Abelian algebra, they do not belong to the center of $A_{q,p}(\widehat{sl}(N)_c)$, in contrast to the case of the critical level.

Theorem 8 allows us to define Poisson structures on the corresponding Abelian subalgebras. As usual, they are obtained as limits of the exchange algebras when $p = q^{Nh}$ with $h \in \mathbb{Z}\setminus\{0\}$.

Theorem 9. Setting $q^{Nh} = p^{1-\beta}$ for any integer $h \neq 0$, the $h$-labeled Poisson structure defined by:

$$\{w_i(z), w_j(w)\}^{(h)} = \lim_{\beta\to 0}\frac{1}{\beta}\bigl(w_i(z)\,w_j(w) - w_j(w)\,w_i(z)\bigr) \tag{4.22}$$

has the following expression:

$$\{w_i(z), w_j(w)\} = \sum_{u=-(i-1)/2}^{(i-1)/2}\ \sum_{v=-(j-1)/2}^{(j-1)/2} f_h\Bigl(q^{v-u}\frac{w}{z}\Bigr)\,w_i(z)\,w_j(w), \tag{4.23}$$

where fh (x) is given by (3.32). One recovers here the same Poisson algebra as in [3], identifying therefore the exchange algebras in Theorem 7 as new Wp,q [sl(N )] algebras. The main issue now is to obtain explicit realizations of these algebras. Curiously enough the bosonic realization achieved in [5] corresponds to a WN algebra which does not belong to the set constructed here. We hope to address this issue in the future. Acknowledgement. This work was supported in part by CNRS, Foundation Angelo della Riccia and EC network n. FMRX-CT96-0012. M.R. and J.A. wish to thank the LAPTH for its kind hospitality.

References

1. Avan, J., Frappat, L., Rossi, M., Sorba, P.: Poisson structures on the center of the elliptic algebra $A_{q,p}(\widehat{sl}(2)_c)$. Phys. Lett. A 235, 323 (1997)
2. Avan, J., Frappat, L., Rossi, M., Sorba, P.: New $W_{q,p}(sl(2))$ algebras from the elliptic algebra $A_{q,p}(\widehat{sl}(2)_c)$. Phys. Lett. A 239, 27 (1998)
3. Avan, J., Frappat, L., Rossi, M., Sorba, P.: Deformed $W_N$ algebras from elliptic $sl(N)$ algebras. Commun. Math. Phys. 199, 697 (1999); math.QA/9801105
4. Avan, J., Frappat, L., Rossi, M., Sorba, P.: Central extensions of classical and quantum q-Virasoro algebras. Submitted to Phys. Lett. A, math.QA/9806065
5. Frenkel, E., Reshetikhin, N.: Quantum affine algebras and deformations of the Virasoro and W-algebras. Commun. Math. Phys. 178, 237 (1996)
6. Awata, H., Kubo, H., Odake, S., Shiraishi, J.: Quantum $W_N$ algebras and Macdonald polynomials. Commun. Math. Phys. 179, 401 (1996), and Shiraishi, J., Kubo, H., Awata, H., Odake, S.: A Quantum Deformation of the Virasoro Algebra and the Macdonald Symmetric Functions. Lett. Math. Phys. 38, 33 (1996)
7. Feigin, B., Frenkel, E.: Quantum W algebras and elliptic algebras. Commun. Math. Phys. 178, 653 (1996)
8. Foda, O., Iohara, K., Jimbo, M., Kedem, R., Miwa, T., Yan, H.: An elliptic quantum algebra for $\widehat{sl}_2$. Lett. Math. Phys. 32, 259 (1994)
9. Jimbo, M., Konno, H., Odake, S., Shiraishi, J.: Quasi-Hopf twistors for elliptic quantum groups. q-alg/9712029
10. Baxter, R.J.: Exactly solved models in statistical mechanics. London: Academic Press, 1982
11. Bais, F., Bouwknegt, P., Schoutens, K., Surridge, M.: Extensions of the Virasoro algebra constructed from Kac–Moody algebras using higher order Casimir invariants. Nucl. Phys. B 304, 348 (1988), and Coset construction for extended Virasoro algebras. Nucl. Phys. B 304, 371 (1988)
12. Belavin, A.A.: Dynamical symmetry of integrable quantum systems. Nucl. Phys. B 180, 189 (1981)
13. Reshetikhin, N.Yu., Semenov-Tian-Shansky, M.A.: Central extensions of quantum current groups. Lett. Math. Phys. 19, 133 (1990)
14. Freidel, L., Maillet, J.M.: Quadratic algebras and integrable systems. Phys. Lett. B 262, 278 (1991)
15. Maillet, J.M.: Lax equations and quantum groups. Phys. Lett. B 245, 480 (1990)
16. Avan, J., Babelon, O., Billey, E.: The Gervais–Neveu–Felder equation and the quantum Calogero–Moser systems. Commun. Math. Phys. 178, 281 (1996)

Communicated by G. Felder

Commun. Math. Phys. 202, 463 – 480 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

A Proof of the Gutzwiller Semiclassical Trace Formula Using Coherent States Decomposition

Monique Combescure1, James Ralston2, Didier Robert3

1 Laboratoire de Physique Théorique, UMR 8627 du CNRS, bâtiment 211, Université Paris-Sud, 91405 Orsay, France. E-mail: [email protected]
2 Department of Mathematics, UCLA, Los Angeles, CA 90095-1555, USA. E-mail: [email protected]
3 Laboratoire de Mathématiques, UMR 6629 du CNRS, Université de Nantes, BP 99208, 44322 Nantes cedex 3, France. E-mail: [email protected]

Received: 15 June 1998 / Accepted: 9 November 1998

Abstract: The Gutzwiller trace formula links the eigenvalues of the Schrödinger operator $\hat{H}$ as Planck’s constant goes to zero (the semiclassical régime) with the closed orbits of the corresponding classical mechanical system. Gutzwiller gave a heuristic proof of this trace formula, using the Feynman integral representation for the propagator of $\hat{H}$. Later, using the theory of Fourier integral operators, mathematicians gave rigorous proofs of the formula in various settings. Here we show how the use of coherent states allows us to give a simple and direct proof.

1. Introduction

Our goal in this paper is to give a simple proof of the “semiclassical Gutzwiller trace formula”. The pioneering works in quantum physics of Gutzwiller [17] (1971) and Balian-Bloch [4, 5] (1972–74) showed that the trace of a quantum observable $\hat{A}$, localized in a spectral neighborhood of size $O(\hbar)$ of an energy $E$ for the quantum Hamiltonian $\hat{H}$, can be expressed in terms of averages of the classical observable $A$ associated with $\hat{A}$ over invariant sets for the flow of the classical Hamiltonian $H$ associated with $\hat{H}$. This is related to the spectral asymptotics for $\hat{H}$ in the semi-classical limit, and it can be understood as a “correspondence principle” between classical and quantum mechanics as Planck’s constant $\hbar$ goes to zero.

Between 1973 and 1975 several authors gave rigorous derivations of related results, generalizing the classical Poisson summation formula from $d^2/d\theta^2$ on the circle to elliptic operators on compact manifolds: Colin de Verdière [8], Chazarain [7], Duistermaat-Guillemin [14]. The first article is based on a parametrix construction for the associated heat equation, while the second two replace this with a parametrix, constructed as a Fourier integral operator, for the associated wave equation. More recently, papers by Guillemin–Uribe (1989), Paul–Uribe (1991, 1995), Meinrenken (1992) and Dozias (1994) have developed the necessary tools from microlocal analysis in a nonhomogeneous (semiclassical) setting to deal with Schrödinger-type Hamiltonians. Extensions and simplifications of these methods have been given by Petkov–Popov [31] and Charbonnel–Popov [6].

The coherent states approach presented here seems particularly suitable when one wishes to compare the phase space quantum picture with the phase space classical flow. Furthermore, it avoids problems with caustics, and the Maslov indices appear naturally. In short, it implies the Gutzwiller trace formula in a very simple and transparent way, without any use of the global theory of Fourier integral operators. In their place we use the coherent states approximation (Gaussian beams) and the stationary phase theorem.

The use of Gaussian wave packets is such a useful idea that one can trace it back to the very beginning of quantum mechanics, for instance, Schrödinger [35] (1926). However, the realization that these approximations are universally applicable, and that they are valid for arbitrarily long times, has developed gradually. In the mathematical literature these approximations have never become textbook material, and this has led to their repeated rediscovery with a variety of different names, e.g. coherent states and Gaussian beams. The first place that we have found where they are used in some generality is Babich [2] (1968) (see also [3]). Since then they have appeared, often as independent discoveries, in the work of Arnaud [1] (1973), Keller [24] (1974), Heller [20] (1975, 1987), Ralston [33, 34] (1976, 1982), Hagedorn [18] (1980, 1985), and Littlejohn [25] (1986), and probably many more that we have not found. Their use in trace formulas was proposed by Wilkinson [36] (1987). The propagation formulas of [18] were extended in Combescure–Robert [10], with a detailed estimate on the error both in time and in Planck’s constant. This propagation formula, whose proof is sketched in the appendix of this paper, allows us to avoid the whole machinery of Fourier integral operator theory. The early application of these methods in [2] was for the construction of quasi-modes, and this has been pursued further in [33] and Paul–Uribe [27]. There have also been recent applications to the pointwise behaviour of semiclassical measures [28].

2. The Semiclassical Gutzwiller Trace Formula

We consider a quantum system in $L^2(\mathbb{R}^n)$ with Hamiltonian

$$\hat{H} = -\hbar^2\Delta + V(x), \tag{1}$$

where $\Delta$ is the Laplacian in $L^2(\mathbb{R}^n)$ and $V(x)$ a real, $C^\infty(\mathbb{R}^n)$ potential. The corresponding Hamiltonian for the classical motion is $H(q, p) = p^2 + V(q)$, and for a given energy $E$ ($\in \mathbb{R}$) we denote by $\Sigma_E$ the “energy shell”

$$\Sigma_E := \{(q, p) \in \mathbb{R}^{2n} : H(q, p) = E\}. \tag{2}$$

More generally we shall consider Hamiltonians $\hat{H}$ obtained by the $\hbar$-Weyl quantization of the classical Hamiltonian $H$, so that $\hat{H} = Op^w_\hbar(H)$, where

$$Op^w_\hbar(H)\psi(x) = (2\pi\hbar)^{-n}\int_{\mathbb{R}^{2n}} H\Bigl(\frac{x+y}{2}, \xi\Bigr)\,\psi(y)\,e^{i(x-y)\cdot\xi/\hbar}\,dy\,d\xi. \tag{3}$$

The Hamiltonian $H$ is assumed to be a smooth, real valued function of $z = (x, \xi) \in \mathbb{R}^{2n}$, and to satisfy the following global estimates:

• (H.0) There exist non-negative constants $C$, $m$, $C_\gamma$ such that

$$|\partial_z^\gamma H(z)| \le C_\gamma \langle H(z)\rangle, \quad \forall z \in \mathbb{R}^{2n},\ \forall\gamma \in \mathbb{N}^{2n}, \tag{4}$$

$$\langle H(z)\rangle \le C\,\langle H(z')\rangle\cdot\langle z - z'\rangle^{m}, \quad \forall z, z' \in \mathbb{R}^{2n}, \tag{5}$$

where we have used the notation $\langle u\rangle = (1 + |u|^2)^{1/2}$ for $u \in \mathbb{R}^m$.

Remark 2.1. i) $H(q, p) = p^2 + V(q)$ satisfies (H.0), if $V(q)$ is bounded below by some $a > 0$ and satisfies the property (H.0) in the variable $q$.
ii) The technical condition (H.0) implies in particular that $\hat{H}$ is essentially self-adjoint on $L^2(\mathbb{R}^n)$ for $\hbar$ small enough and that $\chi(\hat{H})$ is a $\hbar$-pseudodifferential operator if $\chi \in C_0^\infty(\mathbb{R})$ (see [21]).

Let us denote by $\phi^t$ the classical flow induced by Hamilton’s equations with Hamiltonian $H$, and by $S(q, p; t)$ the classical action along the trajectory starting at $(q, p)$ at time $t = 0$, and evolving during time $t$,

$$S(q, p; t) = \int_0^t \bigl(p_s\cdot\dot{q}_s - H(q, p)\bigr)\,ds, \tag{6}$$

where $(q_t, p_t) = \phi^t(q, p)$, and the dot denotes the derivative with respect to time. We shall also use the notation: $\alpha_t = \phi^t(\alpha)$, where $\alpha = (q, p) \in \mathbb{R}^{2n}$ is a phase space point.

An important role in what follows is played by the “linearized flow” around the classical trajectory, which is defined as follows. Let

$$H''(\alpha_t) = \frac{\partial^2 H}{\partial\alpha^2}\Big|_{\alpha=\alpha_t} \tag{7}$$

be the Hessian of $H$ at point $\alpha_t = \phi^t(\alpha)$ of the classical trajectory. Let $J$ be the symplectic matrix

$$J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}, \tag{8}$$

where $0$ and $I$ are respectively the null and identity $n\times n$ matrices. Let $F(t)$ be the $2n\times 2n$ real symplectic matrix solution of the linear differential equation

$$\dot{F}(t) = J\,H''(\alpha_t)\,F(t), \qquad F(0) = \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix} = I. \tag{9}$$

$F(t)$ depends on $\alpha = (q, p)$, the initial point for the classical trajectory, $\alpha_t$. Let $\gamma$ be a closed orbit on $\Sigma_E$ with period $T_\gamma$, and let us denote simply by $F_\gamma$ the matrix $F_\gamma = F(T_\gamma)$. $F_\gamma$ is usually called the “monodromy matrix” of the closed orbit $\gamma$. Of course, $F_\gamma$ does depend on $\alpha$, but its eigenvalues do not, since the monodromy matrix with a different initial point on $\gamma$ is conjugate to $F_\gamma$. $F_\gamma$ has 1 as an eigenvalue of algebraic multiplicity at least equal to 2. In all that follows, we shall use the following definition:

Definition 2.2. We say that $\gamma$ is a nondegenerate orbit if the eigenvalue 1 of $F_\gamma$ has algebraic multiplicity 2.
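As an illustration of the linearized flow (9), the following sketch integrates $\dot F = J H'' F$ numerically for a one-dimensional example. The choice $H(q,p) = p^2 + q^2$ with $n = 1$, the RK4 integrator and all function names are assumptions made for this illustration only; they are not taken from the paper. For this Hamiltonian every orbit has period $\pi$, and the computed monodromy matrix should be close to the identity, so that the eigenvalue 1 has algebraic multiplicity 2.

```python
import math

def mat_mul(X, Y):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def add_scaled(X, Y, c):
    """Return X + c * Y without modifying the inputs."""
    return [[X[i][j] + c * Y[i][j] for j in range(2)] for i in range(2)]

J = [[0.0, 1.0], [-1.0, 0.0]]        # symplectic matrix (8) for n = 1
HESSIAN = [[2.0, 0.0], [0.0, 2.0]]   # H'' for the assumed example H = p^2 + q^2
GENERATOR = mat_mul(J, HESSIAN)      # constant matrix J H'' on the right of (9)

def rhs(F):
    return mat_mul(GENERATOR, F)

def linearized_flow(t, steps=2000):
    """Integrate F' = J H'' F with F(0) = I using the classical RK4 scheme."""
    F = [[1.0, 0.0], [0.0, 1.0]]
    h = t / steps
    for _ in range(steps):
        k1 = rhs(F)
        k2 = rhs(add_scaled(F, k1, h / 2))
        k3 = rhs(add_scaled(F, k2, h / 2))
        k4 = rhs(add_scaled(F, k3, h))
        for i in range(2):
            for j in range(2):
                F[i][j] += h * (k1[i][j] + 2 * k2[i][j] + 2 * k3[i][j] + k4[i][j]) / 6
    return F

def symplectic_defect(F):
    """Largest entry of |F^T J F - J|; it vanishes for an exactly symplectic F."""
    S = mat_mul(transpose(F), mat_mul(J, F))
    return max(abs(S[i][j] - J[i][j]) for i in range(2) for j in range(2))

PERIOD = math.pi                      # period of every orbit of H = p^2 + q^2
F_gamma = linearized_flow(PERIOD)     # monodromy matrix of the example orbit
```

For this example the exact solution is $F(t) = \cos(2t)\,I + \sin(2t)\,J$, which gives a direct check of the integrator at intermediate times.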

Let $\sigma$ denote the usual symplectic form on $\mathbb{R}^{2n}$,

$$\sigma(\alpha, \alpha') = p\cdot q' - p'\cdot q, \qquad \alpha = (q, p);\ \alpha' = (q', p') \tag{10}$$

($\cdot$ is the usual scalar product in $\mathbb{R}^n$). We denote by $\{\alpha_1, \alpha_1'\}$ the eigenspace of $F_\gamma$ belonging to the eigenvalue 1, and by $V$ its orthogonal complement in the sense of the symplectic form $\sigma$,

$$V = \{\alpha \in \mathbb{R}^{2n} : \sigma(\alpha, \alpha_1) = \sigma(\alpha, \alpha_1') = 0\}. \tag{11}$$

Then, the restriction $P_\gamma$ of $F_\gamma$ to $V$ is called the (linearized) “Poincaré map” for $\gamma$.

In some cases the Hamiltonian flow will contain manifolds of periodic orbits with the same energy. When this happens, the periodic orbits will necessarily be degenerate, but the techniques we use here can still apply. The precise hypothesis for this (“Hypothesis C”) will be given in Sect. 4. Following Duistermaat and Guillemin we call this a “clean intersection hypothesis”, but it is more explicit than other versions of this assumption. Since the statement of the trace formula is simpler and more informative when one does assume that the periodic orbits are nondegenerate, we will give that formula here.

We shall now assume the following. Let $(\Gamma_E)_T$ be the set of all periodic orbits on $\Sigma_E$ with periods $T_\gamma$, $0 < |T_\gamma| \le T$ (including repetitions of primitive orbits and assigning negative periods to primitive orbits traced in the opposite sense).

• (H.1) There exists $\delta E > 0$ such that $H^{-1}([E - \delta E, E + \delta E])$ is a compact set of $\mathbb{R}^{2n}$ and $E$ is a noncritical value of $H$ (i.e. $H(z) = E \Rightarrow \nabla H(z) \neq 0$).
• (H.2) All $\gamma$ in $(\Gamma_E)_T$ are nondegenerate, i.e. 1 is not an eigenvalue for the corresponding “Poincaré map”, $P_\gamma$. In particular, this implies that for any $T > 0$, $(\Gamma_E)_T$ is a discrete set, with periods $-T \le T_{\gamma_1} < \cdots < T_{\gamma_N} \le T$.

We can now state the Gutzwiller trace formula. Let $\hat{A} = Op^w_\hbar(A)$ be a quantum observable, such that $A$ satisfies the following:

• (H.3) There exist $\delta \in \mathbb{R}$, $C_\gamma > 0$ ($\gamma \in \mathbb{N}^{2n}$), such that $|\partial_z^\gamma A(z)| \le C_\gamma \langle H(z)\rangle^{\delta}$, $\forall z \in \mathbb{R}^{2n}$.
• (H.4) $g$ is a $C^\infty$ function whose Fourier transform $\hat{g}$ is of compact support with $\mathrm{Supp}\,\hat{g} \subset [-T, T]$.
• Let $\chi$ be a smooth function with a compact support contained in $]E - \delta E, E + \delta E[$, equal to 1 in a neighborhood of $E$.

Then the following “regularized density of states” $\rho_A(E)$ is well defined:

$$\rho_A(E) = \mathrm{Tr}\Bigl(\chi(\hat{H})\,\hat{A}\,\chi(\hat{H})\,g\Bigl(\frac{E - \hat{H}}{\hbar}\Bigr)\Bigr). \tag{12}$$

Note that (H.1) implies that the spectrum of $\hat{H}$ is purely discrete in a neighborhood of $E$ so that $\rho_A(E)$ is well defined. We have also, more explicitly,

$$\rho_A(E) = \sum_{1\le j\le N} g\Bigl(\frac{E - E_j}{\hbar}\Bigr)\,\chi^2(E_j)\,\langle \hat{A}\varphi_j, \varphi_j\rangle, \tag{13}$$

where $E_1 \le \cdots \le E_N$ are the eigenvalues of $\hat{H}$ in $]E - \delta E, E + \delta E[$ (with multiplicities) and $\varphi_j$ is the corresponding eigenfunction ($\hat{H}\varphi_j = E_j\varphi_j$). Now we can state the Gutzwiller Trace Formula.

Theorem 2.3. Assume (H.0)–(H.2) are satisfied for $H$, (H.3) for $A$ and (H.4) for $g$. Then the following asymptotic expansion holds true, modulo $O(\hbar^\infty)$:

$$\begin{aligned}
\rho_A(E) \equiv{}& (\pi)^{-n/2}\,\hat{g}(0)\,\hbar^{-(n-1)}\int_{\Sigma_E} A(\alpha)\,d\sigma_E(\alpha) + \sum_{k\ge -n+2} c_k(\hat{g})\,\hbar^k \\
&+ \sum_{\gamma\in(\Gamma_E)_T} (2\pi)^{n/2-1}\,e^{i(S_\gamma/\hbar + \sigma_\gamma\pi/2)}\Bigl\{\hat{g}(T_\gamma)\,|\det(I - P_\gamma)|^{-1/2}\int_0^{T_\gamma^*} A(\alpha_s)\,ds + \sum_{j\ge 1} d_j^\gamma(\hat{g})\,\hbar^j\Bigr\},
\end{aligned} \tag{14}$$

where $A(\alpha)$ is the classical Weyl symbol of $\hat{A}$, $T_\gamma^*$ is the primitive period of $\gamma$, $\sigma_\gamma$ is the Maslov index of $\gamma$ ($\sigma_\gamma \in \mathbb{Z}$), $S_\gamma = \oint_\gamma p\,dq$ is the classical action along $\gamma$, $c_k(\hat{g})$ are distributions in $\hat{g}$ with support in $\{0\}$, $d_j^\gamma(\hat{g})$ are distributions in $\hat{g}$ with support $\{T_\gamma\}$, and $d\sigma_E$ is the Liouville measure on $\Sigma_E$: $d\sigma_E = \dfrac{d\Sigma_E}{|\nabla H|}$ ($d\Sigma_E$ is the Euclidean measure on $\Sigma_E$).

Remark 2.4. We can include more general Hamiltonians depending explicitly on $\hbar$, $H = \sum_{j=0}^{K}\hbar^j H^{(j)}$, such that $H^{(0)}$ satisfies (H.0) and for $j \ge 1$,

$$|\partial^\gamma H^{(j)}(z)| \le C_{\gamma,j}\,\langle H^{(0)}(z)\rangle. \tag{15}$$

It is useful for applications to consider Hamiltonians like $H^{(0)} + \hbar H^{(1)}$, where $H^{(1)}$ may be, for example, a spin term. In that case the formula (14) is true with different coefficients. In particular the first term in the contribution of $T_\gamma$ is multiplied by $\exp\bigl(-i\int_0^{T_\gamma^*} H^{(1)}(\alpha_s)\,ds\bigr)$.

Remark 2.5. For Schrödinger operators we only need smoothness of the potential $V$. In this case the trace formula (14) is still valid without any assumptions at infinity for $V$ when we restrict ourselves to a compact energy surface, assuming $E < \liminf_{|x|\to\infty} V(x)$. Using exponential decrease of the eigenfunctions [22] we can prove that, modulo an error term of order $O(\hbar^{+\infty})$, the potential $V$ can be replaced by a potential $\tilde{V}$ satisfying the assumptions of Remark (2.1).

3. Preparations for the Proof

We shall make use of “coherent states” which can be defined as follows. Let

$$\psi_0(x) = (\pi\hbar)^{-n/4}\exp\Bigl(-\frac{|x|^2}{2\hbar}\Bigr) \tag{16}$$

be the ground state of the $n$-dimensional harmonic oscillator, and for $\alpha = (q, p) \in \mathbb{R}^{2n}$,

$$T(\alpha) = \exp\Bigl(\frac{i}{\hbar}\,(p\cdot x - q\cdot\hbar D_x)\Bigr) \tag{17}$$

is the Weyl–Heisenberg operator of translation by $\alpha$ in phase space, where $D_x = \dfrac{\partial}{i\,\partial x}$. We also denote by

$$\varphi_\alpha = T(\alpha)\,\psi_0 \tag{18}$$

the usual coherent states centered at the point $\alpha$. Then it is known that any operator $B$ with a symbol decreasing sufficiently rapidly is in trace class (see [15]), and its trace can be computed by

$$\mathrm{Tr}\,B = (2\pi\hbar)^{-n}\int \langle\varphi_\alpha, B\varphi_\alpha\rangle\,d\alpha. \tag{19}$$

The regularized density of states $\rho_A(E)$ can now be rewritten as

$$\rho_A(E) = (2\pi)^{-n-1}\,\hbar^{-n}\int \hat{g}(t)\,e^{iEt/\hbar}\,\langle\varphi_\alpha, \hat{A}_\chi U(t)\varphi_\alpha\rangle\,dt\,d\alpha, \tag{20}$$

where $U(t)$ is the quantum unitary group:

$$U(t) = e^{-it\hat{H}/\hbar} \tag{21}$$

and $\hat{A}_\chi = \chi(\hat{H})\,\hat{A}\,\chi(\hat{H})$. Our strategy for computing the behavior of $\rho_A(E)$ as $\hbar$ goes to zero is first to compute the bracket

$$m(\alpha, t) = \langle\hat{A}_\chi\varphi_\alpha, U(t)\varphi_\alpha\rangle, \tag{22}$$

where we drop the subscript $\chi$ in $A_\chi$ for simplicity. It is useful to rewrite (16) as

$$\psi_0 = \Lambda_\hbar\,\tilde{\psi}_0, \tag{23}$$

where $\Lambda_\hbar$ is the following scaling operator:

$$(\Lambda_\hbar\psi)(x) = \hbar^{-n/4}\,\psi\bigl(x\hbar^{-1/2}\bigr) \quad\text{and}\quad \tilde{\psi}_0(x) = \pi^{-n/4}\exp\bigl(-|x|^2/2\bigr). \tag{24}$$
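The coherent states (16)–(18) can be checked numerically in one dimension. The sketch below uses the closed form $\varphi_\alpha(x) = e^{i(p x - pq/2)/\hbar}\,\psi_0(x - q)$, which is what one obtains from (17) by the Baker–Campbell–Hausdorff formula; this closed form, the midpoint quadrature, the sample values of $\hbar$ and $\alpha$, and the function names are assumptions of this illustration rather than statements from the paper. It verifies that $\psi_0$ has unit norm and that $|\langle\varphi_\alpha, \psi_0\rangle|^2 = e^{-|\alpha|^2/(2\hbar)}$, i.e. that two coherent states decouple rapidly once their centers are farther apart than $\sqrt{\hbar}$.

```python
import cmath
import math

def psi0(x, hbar):
    """Ground state (16) of the harmonic oscillator for n = 1."""
    return (math.pi * hbar) ** -0.25 * math.exp(-x * x / (2 * hbar))

def coherent_state(x, q, p, hbar):
    """phi_alpha(x) for alpha = (q, p), via the assumed closed form of T(alpha) psi0."""
    phase = cmath.exp(1j * (p * x - p * q / 2) / hbar)
    return phase * psi0(x - q, hbar)

def inner(f, g, lo=-8.0, hi=8.0, steps=8000):
    """Approximate <f, g> = integral of conj(f) * g by the midpoint rule."""
    h = (hi - lo) / steps
    total = 0j
    for k in range(steps):
        x = lo + (k + 0.5) * h
        total += complex(f(x)).conjugate() * g(x)
    return total * h

HBAR = 0.1                 # sample value of Planck's constant (assumption)
Q, P = 0.3, -0.2           # sample phase space point alpha = (q, p) (assumption)

ground = lambda x: psi0(x, HBAR)
shifted = lambda x: coherent_state(x, Q, P, HBAR)

norm_squared = inner(ground, ground).real
overlap_squared = abs(inner(shifted, ground)) ** 2
expected_overlap = math.exp(-(Q * Q + P * P) / (2 * HBAR))
```

The Gaussian decay of this overlap in $|\alpha|/\sqrt{\hbar}$ is the mechanism that localizes the integral (20) near the points where $\alpha_t$ returns to $\alpha$.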

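First of all we shall use the following lemma, giving the action of an $\hbar$-pseudodifferential operator on a Gaussian.

Lemma 3.1. Assume that $A$ satisfies (H.0). Then we have

$$\hat{A}\varphi_\alpha = \sum_\gamma \hbar^{|\gamma|/2}\,\frac{\partial^\gamma A(\alpha)}{\gamma!}\,\Psi_{\gamma,\alpha} + O(\hbar^\infty) \tag{25}$$

in $L^2(\mathbb{R}^n)$, where $\gamma \in \mathbb{N}^{2n}$, $|\gamma| = \sum_{1}^{2n}\gamma_j$, $\gamma! = \prod_{1}^{2n}\gamma_j!$ and

$$\Psi_{\gamma,\alpha} = T(\alpha)\,\Lambda_\hbar\,Op^w_1(z^\gamma)\,\tilde{\psi}_0, \tag{26}$$

where $Op^w_1(z^\gamma)$ is the 1-Weyl quantization of the monomial: $(x, \xi)^\gamma = x^{\gamma'}\xi^{\gamma''}$, $\gamma = (\gamma', \gamma'') \in \mathbb{N}^{2n}$.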

This lemma is easily proved using a scaling argument and Taylor expansion for the symbol $A$ around the point $\alpha$. Thus $m(t, \alpha)$ is a linear combination of terms like

$$m_\gamma(\alpha, t) = \langle\Psi_{\gamma,\alpha}, U(t)\varphi_\alpha\rangle. \tag{27}$$

Now we compute $U(t)\varphi_\alpha$, using the semiclassical propagation of coherent states result as it was formulated in Combescure-Robert [10]. We recall that $F(t)$ is a time dependent symplectic matrix (Jacobi matrix) defined by the linear equation (9). $\mathrm{Met}(F)$ denotes the metaplectic representation of the linearized flow $F$ (see for example Folland [15]), and the $\hbar$-dependent metaplectic representation is defined by

$$\mathrm{Met}_\hbar(F) = \Lambda_\hbar\,\mathrm{Met}(F)\,\Lambda_\hbar^{-1}. \tag{28}$$

We will also use the notation

$$\delta(\alpha, t) = \int_0^t p_s\cdot\dot{q}_s\,ds - tH(\alpha) - \frac{p_t\cdot q_t - p\cdot q}{2}. \tag{29}$$

From Theorem (3.5) of [10] (and its proof, see also the appendix in this paper for a sketched proof) we have the following propagation estimates in the $L^2$-norm: for every $N \in \mathbb{N}$ and every $T > 0$ there exists $C_{N,T}$ such that

$$\Bigl\|U(t)\varphi_\alpha - \exp\Bigl(\frac{i\delta(\alpha, t)}{\hbar}\Bigr)\,T(\alpha_t)\,\mathrm{Met}_\hbar(F(t))\,\Lambda_\hbar\,P_N(x, D_x, t, \hbar)\,\tilde{\psi}_0\Bigr\| \le C_{N,T}\,\hbar^N, \tag{30}$$

where $P_N(t, \hbar)$ is the $(\hbar, t)$-dependent differential operator defined by

$$P_N(x, D_x, t, \hbar) = I + \sum_{(k,j)\in I_N}\hbar^{k/2-j}\,p^w_{kj}(x, D, t) \tag{31}$$

with $I_N = \{(k, j) \in \mathbb{N}\times\mathbb{N},\ 1 \le j \le 2N - 1,\ k \ge 3j,\ 1 \le k - 2j < 2N\}$, where the differential operators $p^w_{kj}(x, D_x, t)$ are products of $j$ Weyl quantizations of homogeneous polynomials of degree $k_s$ with $\sum_{1\le s\le j} k_s = k$ (see [10] Theorem (3.5) and its proof). So that we get

$$p^w_{kj}(x, D_x, t)\,\tilde{\psi}_0 = Q_{kj}(x)\,\tilde{\psi}_0(x), \tag{32}$$

where $Q_{kj}(x)$ is a polynomial (with coefficients depending on $(\alpha, t)$) of degree $k$ having the same parity as $k$. This is clear from the following facts: homogeneous polynomials have a definite parity, Weyl quantization behaves well with respect to symmetries ($Op^w(A)$ commutes with the parity operator $\Sigma f(x) = f(-x)$ if and only if $A$ is an even symbol and anticommutes with $\Sigma$ if and only if $A$ is an odd symbol), and $\tilde{\psi}_0(x)$ is an even function. So we get

$$m(\alpha, t) = \sum_{(j,k)\in I_N;\ |\gamma|\le 2N} c_{k,j,\gamma}\,\hbar^{\frac{k+|\gamma|}{2}-j}\,\exp\Bigl(\frac{i\delta(\alpha, t)}{\hbar}\Bigr)\cdot\bigl\langle T(\alpha)\Lambda_\hbar Q_\gamma\tilde{\psi}_0,\ T(\alpha_t)\Lambda_\hbar Q_{k,j}\,\mathrm{Met}(F(t))\tilde{\psi}_0\bigr\rangle + O(\hbar^N), \tag{33}$$

where $Q_{k,j}$ respectively $Q_\gamma$ are polynomials in the $x$ variable with the same parity as $k$ respectively $|\gamma|$. This remark will be useful in proving that we have only integer powers in $\hbar$ in (14), even though half integer powers appear naturally in the asymptotic propagation of coherent states. By an easy computation we have

$$\bigl\langle T(\alpha)\Lambda_\hbar Q_\gamma\tilde{\psi}_0,\ T(\alpha_t)\Lambda_\hbar Q_{k,j}\,\mathrm{Met}(F(t))\tilde{\psi}_0\bigr\rangle = \exp\Bigl(-i\,\frac{\sigma(\alpha, \alpha_t)}{2\hbar}\Bigr)\Bigl\langle T_1\Bigl(\frac{\alpha - \alpha_t}{\sqrt{\hbar}}\Bigr)Q_\gamma\tilde{\psi}_0,\ Q_{k,j}\,\mathrm{Met}(F(t))\tilde{\psi}_0\Bigr\rangle, \tag{34}$$

where $T_1(\cdot)$ is the Weyl translation operator with $\hbar = 1$. We set

$$m_{k,j,\gamma}(\alpha, t) = \Bigl\langle T_1\Bigl(\frac{\alpha - \alpha_t}{\sqrt{\hbar}}\Bigr)Q_\gamma\tilde{\psi}_0,\ Q_{k,j}\,\mathrm{Met}(F(t))\tilde{\psi}_0\Bigr\rangle, \qquad m_0(\alpha, t) = \Bigl\langle T_1\Bigl(\frac{\alpha - \alpha_t}{\sqrt{\hbar}}\Bigr)\tilde{\psi}_0,\ \mathrm{Met}(F(t))\tilde{\psi}_0\Bigr\rangle. \tag{35}$$

We compute $m_0(\alpha, t)$ first. We shall use the fact that the metaplectic group transforms Gaussian wave packets into Gaussian wave packets in a very explicit way. If we denote by $A, B, C, D$ the four $n\times n$ matrices of the block form of $F(t)$,

$$F(t) = \begin{pmatrix} A & B \\ C & D \end{pmatrix}, \tag{36}$$

it is clear, since $F$ is symplectic, that $U = A + iB$ is invertible. So we can define

$$M = VU^{-1}, \quad\text{where } V = C + iD. \tag{37}$$
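The definition (37) is easy to explore for $n = 1$, where a $2\times 2$ real matrix is symplectic exactly when $AD - BC = 1$. The sketch below computes $M = VU^{-1}$ for two assumed examples, the free flow of $H = p^2$ and the rotation generated by $H = p^2 + q^2$; the function names and the examples are illustrative choices, not part of the paper. In one dimension one finds $\mathrm{Im}\,M = (AD - BC)/|U|^2 = 1/|U|^2 > 0$, which is what keeps the propagated wave packet a decaying Gaussian.

```python
import math

def gaussian_parameter(F):
    """Return M = V / U with U = A + iB and V = C + iD, for a 2x2 matrix F (n = 1)."""
    (A, B), (C, D) = F
    U = complex(A, B)
    V = complex(C, D)
    if U == 0:
        raise ZeroDivisionError("U = A + iB vanishes, so F is not symplectic")
    return V / U

def free_flow(t):
    """Linearized flow of the assumed example H = p^2: F(t) = [[1, 2t], [0, 1]]."""
    return ((1.0, 2.0 * t), (0.0, 1.0))

def oscillator_flow(t):
    """Linearized flow of the assumed example H = p^2 + q^2 (rotation by angle 2t)."""
    c, s = math.cos(2.0 * t), math.sin(2.0 * t)
    return ((c, s), (-s, c))

M_free = gaussian_parameter(free_flow(0.5))        # spreading packet
M_rot = gaussian_parameter(oscillator_flow(0.7))   # packet of constant width
```

For the rotation one gets $M = i$ at all times, so the Gaussian keeps its shape, while for the free flow $\mathrm{Im}\,M = 1/(1 + 4t^2)$ decreases in time, reflecting the spreading of the packet.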

We have ([15], Ch. 4)

$$m_0(\alpha, t) = (\det U)_c^{-1/2}\,\pi^{-n/2}\int_{\mathbb{R}^n}\exp\Bigl(\frac{i}{2}(M + iI)x\cdot x - \frac{i}{\sqrt{\hbar}}\Bigl(x - \frac{q - q_t}{2}\Bigr)\cdot\bigl(p - p_t + i(q - q_t)\bigr)\Bigr)\,dx. \tag{38}$$

Remark 3.2. The notation $(z(t))_c^{1/2}$ has the following meaning: if $t \mapsto z(t)$ is a continuous mapping from $\mathbb{R}$ into $\mathbb{C}\setminus\{0\}$ such that $z(0) > 0$ then $(z(t))_c^{1/2}$ denotes the square root defined by continuity in $t$ starting from $\sqrt{z(0)} > 0$. Thus the factor $(\det U)_c^{-1/2}$ in (38) records the winding of $\det U(t)$ as $t$ varies. This takes the place of the “Maslov line bundle” in this construction.

If we make the change of variables $x \mapsto (y - q_t)/\sqrt{\hbar}$ in (32) and hence in (38), then the formula for the regularized density of states in (19) takes the form

$$\rho_A(E) = \int_{\mathbb{R}} dt\int_{\mathbb{R}^{2n}} d\alpha\int_{\mathbb{R}^n} a(t, \alpha, y, \hbar)\,e^{\frac{i}{\hbar}\Phi_E(y,\alpha,t)}\,dy. \tag{39}$$

The phase function $\Phi_E$ is given by

$$\Phi_E(t, y, \alpha) = S(\alpha, t) + q\cdot p + (y - q_t)\cdot p_t + \frac{1}{2}(y - q_t)\cdot M(t)(y - q_t) + \frac{i}{2}|y - q|^2 - y\cdot p + Et, \tag{40}$$

where $\cdot$ denotes the usual bilinear product in $\mathbb{C}^n$, and $\alpha = (q, p)$, $\alpha_t = \phi^t(\alpha)$ as before. Our plan is to prove Theorem 2.3 by expanding (39) by the method of stationary phase. The necessary stationary phase lemma for complex phase functions can easily be derived from Theorem 7.7.5 in [22, Vol. 1]. There is also an extended discussion of complex phase functions depending on parameters in [22] leading to Theorem 7.7.12, but the form of the stationary manifold here permits us to use the following:

Theorem 3.3 (Stationary Phase Expansion). Let $O \subset \mathbb{R}^d$ be an open set, and let $a, f \in C^\infty(O)$ with $\mathrm{Im} f \ge 0$ in $O$ and $\mathrm{supp}\,a \subset O$. We define $M = \{x \in O,\ \mathrm{Im} f(x) = 0,\ f'(x) = 0\}$, and assume that $M$ is a smooth, compact and connected submanifold of $\mathbb{R}^d$ of dimension $k$ such that for all $x \in M$ the Hessian, $f''(x)$, of $f$ is nondegenerate on the normal space $N_x$ to $M$ at $x$. Under the conditions above, the integral $J(\omega) = \int_{\mathbb{R}^d} e^{i\omega f(x)}a(x)\,dx$ has the following asymptotic expansion as $\omega \to +\infty$, modulo $O(\omega^{-\infty})$:

$$J(\omega) \equiv \Bigl(\frac{2\pi}{\omega}\Bigr)^{\frac{d-k}{2}}\sum_{j\ge 0} c_j\,\omega^{-j}. \tag{41}$$

The coefficient $c_0$ is given by

$$c_0 = e^{i\omega f(m_0)}\int_M\Bigl[\det\Bigl(\frac{f''(m)|_{N_m}}{i}\Bigr)\Bigr]_*^{-1/2} a(m)\,dV_M(m), \tag{42}$$

where $dV_M(m)$ is the canonical Euclidean volume in $M$, $m_0 \in M$ is arbitrary, and $[\det P]_*^{-1/2}$ denotes the product of the reciprocals of square roots of the eigenvalues of $P$ chosen with positive real parts. Note that, since $\mathrm{Im} f \ge 0$, the eigenvalues of $\dfrac{f''(m)|_{N_m}}{i}$ lie in the closed right half plane.

Sketch of proof. Using a partition of unity, we can assume that $O$ is small enough that we have normal, geodesic coordinates in a neighborhood of $M$. So we have a diffeomorphism $\chi : U \to O$, where $U$ is an open neighborhood of $(0, 0)$ in $\mathbb{R}^k\times\mathbb{R}^{d-k}$, such that

$$\chi(x', x'') \in M \iff x'' = 0,$$

and if $m = \chi(x', 0) \in M$ we have

$$\chi'(x', 0)(\mathbb{R}^k_{x'}) = T_m M, \qquad \chi'(x', 0)(\mathbb{R}^{d-k}_{x''}) = N_m M$$

($N_m M$ is the normal space at $m \in M$). So the change of variables $x = \chi(x', x'')$ gives the integral

$$J(\omega) = \int_{\mathbb{R}^d} e^{i\omega f(\chi(x', x''))}\,a(x', x'')\,|\det\chi'(x', x'')|\,dx'\,dx''. \tag{43}$$

The phase $\tilde{f}(x', x'') := f(\chi(x', x''))$ clearly satisfies

$$\{\tilde{f}'_{x''}(x', x'') = 0,\ \mathrm{Im}\tilde{f}(x', x'') = 0\} \iff x'' = 0. \tag{44}$$

Hence, we can apply the stationary phase Theorem 7.7.5 of [22], (Vol. 1), in the variable $x''$, to the integral (43), where $x'$ is a parameter (the assumptions of [22] are satisfied, uniformly for $x'$ close to 0). We remark that all the coefficients $c_j$ of the expansion can be computed using the above local coordinates and Theorem 7.7.5.
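The leading term of Theorem 3.3 can be tested on the simplest complex phase. The sketch below takes $d = 1$, $k = 0$, $f(x) = (1 + i)x^2/2$ and amplitude $a \equiv 1$, so that $M = \{0\}$, $f''/i = 1 - i$ and (41)–(42) predict $J(\omega) \approx (2\pi/\omega)^{1/2}(1 - i)^{-1/2}$ with the square root of positive real part. The phase, the value of $\omega$, the quadrature and the function names are assumptions of this illustration; in particular $a \equiv 1$ is not compactly supported, and the Gaussian decay coming from $\mathrm{Im} f$ is used in its place. For a purely quadratic phase the leading term is in fact exact, which makes the comparison sharp.

```python
import cmath
import math

def phase(x):
    """Assumed complex phase f(x) = (1 + i) x^2 / 2, with Im f >= 0 and f'(0) = 0."""
    return (1 + 1j) * x * x / 2

def oscillatory_integral(omega, lo=-3.0, hi=3.0, steps=60000):
    """Midpoint-rule approximation of J(omega) = integral of exp(i omega f(x)) dx."""
    h = (hi - lo) / steps
    total = 0j
    for k in range(steps):
        x = lo + (k + 0.5) * h
        total += cmath.exp(1j * omega * phase(x))
    return total * h

def leading_term(omega):
    """(2 pi / omega)^(1/2) * [det(f''/i)]^(-1/2) from (41)-(42), with f(m0) = 0."""
    hessian_over_i = (1 + 1j) / 1j            # equals 1 - i
    root = cmath.sqrt(hessian_over_i)         # principal root has positive real part
    return math.sqrt(2 * math.pi / omega) / root

OMEGA = 40.0
J_numeric = oscillatory_integral(OMEGA)
J_leading = leading_term(OMEGA)
```

With a non-quadratic phase or a non-constant amplitude the two values would differ at relative order $1/\omega$, which is the content of the higher coefficients $c_j$ in (41).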

4. The Stationary Phase Computation In this section we compute the stationary phase expansion of (39) with phase 8E given by (40). Note that a(t, α, y, ~) is actually, according to (32), a polynomial in ~1/2 and ~−1/2 . Hence the stationary phase theorem (with ~ independent symbol a) applies to each coefficient of this polynomial. The first order derivatives of 8E (t, y, α) (up to O((y − q)2 , (α − αt )2 ) terms) are given by ∂t 8E = E − H(α) + (y − qt ) · p˙t − q˙t · M (y − qt ), ∂y 8E = pt − p + i(y − q) + M (y − qt ), ∂q 8e = i(q − qt ) − t A(p − pt ) + ( t C − t AM − iI)(y − qt ), ∂p 8E = q − qt + ( t D − t BM − I)(y − qt ). Furthermore, since F is symplectic, one has 2Im 8E = |y − q|2 + |(A + iB)−1 (y − qt )|2 . This implies that 8E (y, α, t) is critical on the set: CE = {(y, α, t) ∈ Rny × R2n α × Rt : y = qt , αt = α, H(α) = E}. Thus each component Mγ of CE has the form Mγ = {(y, α, t) = (q, α, T (α)) : α = (p, q) ∈ γ, αT (α) = α, H(α) = E} .

(45)

We will assume that each γ is a smooth compact manifold. One sees immediately that the manifolds γ are unions of periodic classical trajectories of energy E. We will also assume a “clean intersection” hypothesis which we will state shortly. Thus we have assumed that CE = {0} × 6 ∪ {Mγ1 , . . . , MγN },

(46)

where each Mγk has the form (45) with γk in the fixed point set of the mapping α 7→ αTk . The first thing to check, in order to apply the stationary phase theorem is that the support of α in (39) can be taken as compact, up to an error O(~∞ ). We do this in the following way: let us recall some properties of ~-pseudodifferential calculus proved in [21, 12]. The function m(z) =< H(z) > is a weight function. In [12] it is proved that ˆ = Hˆ χ , where Hχ ∈ S(m−k ), for every k (χ is like in (H.4)). More precisely, we χ(H) have in the ~ asymptotic sense in S(m−k ),

Proof of Gutzwiller Semiclassical Trace Formula

Hχ =

473

X

Hχj ~j

j≥0

and $\mathrm{supp}[H_{\chi,j}]$ is in a fixed compact set for every $j$ (see (H.4) and [21] for the computations of $H_{\chi,j}$). Let us recall that the symbol space $S(m)$ is equipped with the family of semi-norms

\[ \sup_{z\in\mathbb{R}^{2n}} m^{-1}(z)\left|\frac{\partial^\gamma}{\partial z^\gamma}u(z)\right|. \]

Now we can prove the following lemma.

Lemma 4.1. There is a compact set $K$ in $\mathbb{R}^{2n}$ such that for $m(\alpha,t)=\langle\hat A_\chi\varphi_\alpha, U(t)\varphi_\alpha\rangle$ we have

\[ \int_{\mathbb{R}^{2n}\setminus K}|m(\alpha,t)|\,d\alpha = O(\hbar^{+\infty}), \]

uniformly in every bounded interval in $t$.

Proof. Let $\tilde\chi\in C_0^\infty(]E-\delta E,E+\delta E[)$ be such that $\chi\tilde\chi=\chi$. Using (H.3) and the composition rule for $\hbar$-pseudodifferential operators we can see that $\hat A\chi(\hat H)$ is bounded on $L^2(\mathbb{R}^n)$. So there exists a $C>0$ such that

\[ |m(\alpha,t)|\le C\,\|\tilde\chi(\hat H)\varphi_\alpha\|^2. \]

But we can write

\[ \|\tilde\chi(\hat H)\varphi_\alpha\|^2 = \langle\tilde\chi(\hat H)^2\varphi_\alpha,\varphi_\alpha\rangle. \]

Let us introduce the Wigner function $w_\alpha$ for $\varphi_\alpha$ (i.e. the Weyl symbol of the orthogonal projection on $\varphi_\alpha$). We have

\[ \langle\tilde\chi(\hat H)^2\varphi_\alpha,\varphi_\alpha\rangle = (\pi\hbar)^{-n}\int H_{\chi^2}(z)\,w_\alpha(z)\,dz, \]

where

\[ w_\alpha(z) = (\pi\hbar)^{-n}e^{-|z-\alpha|^2/\hbar}. \]

Using remainder estimates from [21] we have, for every $N$ large enough,

\[ \hat H_{\chi^2} = \sum_{0\le j\le N} H_{\chi^2,j}\,\hbar^j + \hbar^{N+1}R_N(\hbar), \]

where the following estimate in Hilbert-Schmidt norm holds:

\[ \sup_{0<\hbar\le 1}\|R_N(\hbar)\|_{HS} < +\infty. \]

Now there is an $R>0$ such that for every $j$ we have $\mathrm{Supp}[H_{\chi^2,j}]\subseteq\{z : |z|<R\}$. So the proof of the lemma follows from

\[ \|R_N(\hbar)\|_{HS}^2 = (2\pi\hbar)^{-n}\int\|R_N(\hbar)\varphi_\alpha\|^2\,d\alpha, \]

and from the elementary estimate, which holds for some $C,c>0$,

\[ \int_{|z|\le R,\ |\alpha|\ge R+1} e^{-|\alpha-z|^2/\hbar}\,dz\,d\alpha \le C e^{-c/\hbar}. \]
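The elementary Gaussian estimate closing the proof can be illustrated numerically. The following is only a sketch in one real dimension for each of $z$ and $\alpha$ (the paper works in $\mathbb{R}^{2n}$); the names `tail_integral` and `bound`, the choice $R=2$, and the constants $C=2R\sqrt{\pi\hbar}$, $c=1$ are assumptions made for this illustration. The bound uses $\mathrm{erfc}(x)\le e^{-x^2}$ for $x\ge 0$ and the fact that $|\alpha-z|\ge 1$ on the domain of integration.

```python
import math

def tail_integral(hbar, R=2.0, n=4000):
    """Simpson approximation of the integral of exp(-(alpha - z)^2 / hbar)
    over |z| <= R and |alpha| >= R + 1, in one dimension."""
    s = math.sqrt(hbar)

    def inner(z):
        # Closed form of the alpha-integral over both tails
        # alpha >= R + 1 and alpha <= -(R + 1).
        return 0.5 * math.sqrt(math.pi * hbar) * (
            math.erfc((R + 1 - z) / s) + math.erfc((R + 1 + z) / s))

    h = 2 * R / n
    total = inner(-R) + inner(R)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * inner(-R + k * h)
    return total * h / 3

def bound(hbar, R=2.0):
    """C * exp(-c / hbar) with the illustrative choices
    C = 2 R sqrt(pi hbar) and c = 1."""
    return 2 * R * math.sqrt(math.pi * hbar) * math.exp(-1.0 / hbar)

for hbar in (1.0, 0.5, 0.2, 0.1):
    print(hbar, tail_integral(hbar), bound(hbar))
```

The printed pairs decay exponentially as $\hbar$ decreases, which is the qualitative content of the estimate.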


The next step is the computation of the Hessian of $\Phi_E$ on a $M_{\gamma_k}$. After an easy but tedious computation, with the variables ordered as $(t,y,p,q)$, the Hessian $\Phi''_E$ is the following $(1+3n)\times(1+3n)$ matrix:

\[
\Phi''_E = \begin{pmatrix}
H_p\cdot(H_q+MH_p) & -H_q-H_pM & -H_p(D-MB) & -H_p(C-MA)\\
-H_q-MH_p & M+iI & D-MB-I & C-MA-iI\\
-({}^tD-{}^tBM)H_p & {}^tD-{}^tBM-I & {}^tBMB-{}^tDB & {}^tBMA-{}^tBC\\
-({}^tC-{}^tAM)H_p & {}^tC-{}^tAM-iI & {}^tAMB-{}^tCB & {}^tAMA-{}^tCA+iI
\end{pmatrix}, \tag{47}
\]

where $H_p$ (resp. $H_q$) denotes the vector $\partial_pH|_{\alpha=\alpha_t}$ (resp. $\partial_qH|_{\alpha=\alpha_t}$), $A$, $B$, $C$, $D$ are the $n\times n$ matrices given by (36), ${}^tA$ is the transpose of $A$, and $M$ is defined by (37). (Recall $I$ is the identity matrix.) We are going to perform elementary row and column operations on (47) to compute the nullspace of $\Phi''_E$, and the determinant of $\Phi''_E$ restricted to the normal space to the critical manifold. To begin with we have $H_1={}^tR_0\,\Phi''_E\,R_0$, where

\[
R_0 = \begin{pmatrix} 1&0&0&0\\ H_p&I&0&0\\ 0&0&I&0\\ 0&0&0&I \end{pmatrix}
\]

and $H_1$ is given by

\[
H_1 = \begin{pmatrix}
H_p\cdot(-H_q+iH_p) & -H_q+iH_p & -H_p & -iH_p\\
-H_q+iH_p & M+iI & D-MB-I & C-MA-iI\\
-H_p & {}^tD-{}^tBM-I & {}^tBMB-{}^tDB & {}^tBMA-{}^tBC\\
-iH_p & {}^tC-{}^tAM-iI & {}^tAMB-{}^tCB & {}^tAMA-{}^tCA+iI
\end{pmatrix}.
\]

Multiplying $H_1$ on the right by

\[
R_3 = \begin{pmatrix} 1&0&0&0\\ 0&I&B&A\\ 0&0&I&0\\ 0&0&0&I \end{pmatrix}
\]

changes it to

\[
H_2 = \begin{pmatrix}
H_p\cdot(-H_q+iH_p) & -H_q+iH_p & -H_p+(-H_q+iH_p)B & (-H_q+iH_p)A-iH_p\\
-H_q+iH_p & M+iI & D-I+iB & C+i(A-I)\\
-H_p & {}^tD-{}^tBM-I & -B & I-A\\
-iH_p & {}^tC-{}^tAM-iI & -iB & -i(A-I)
\end{pmatrix}.
\]

The key simplification comes from (37), which gives $M=(C+iD)(A+iB)^{-1}$, and hence, since $F$ is symplectic,

\[ {}^tD-{}^tBM = [{}^tD(A+iB)-{}^tB(C+iD)](A+iB)^{-1} = (A+iB)^{-1}, \]

\[ {}^tC-{}^tAM = [{}^tC(A+iB)-{}^tA(C+iD)](A+iB)^{-1} = -i(A+iB)^{-1}. \]
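The two identities above can be checked numerically on a sample symplectic matrix. The sketch below is an illustration under stated assumptions: it takes $n=2$, assumes the block convention $F=\begin{pmatrix}A&B\\ C&D\end{pmatrix}$ (the definition (36) is not reproduced here), and builds $F$ as a product of three elementary symplectic factors; the matrices `S1`, `S2`, `G` and all helper names are hypothetical.

```python
def matmul(X, Y):
    """Product of two matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def lincomb(X, Y, c):
    """Entrywise X + c * Y."""
    return [[x + c * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def inv2(X):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# Hypothetical data: S1 and S2 symmetric, G invertible.
I2 = [[1.0, 0.0], [0.0, 1.0]]
S1 = [[0.3, -0.7], [-0.7, 1.1]]
S2 = [[-0.4, 0.2], [0.2, 0.9]]
G = [[1.5, 0.4], [-0.2, 0.8]]
G_inv_t = transpose(inv2(G))

# Blocks of F = [[I, S1], [0, I]] [[I, 0], [S2, I]] [[G, 0], [0, G^{-T}]].
A = matmul(lincomb(I2, matmul(S1, S2), 1.0), G)
B = matmul(S1, G_inv_t)
C = matmul(S2, G)
D = G_inv_t

U_inv = inv2(lincomb(A, B, 1j))        # (A + iB)^{-1}
M = matmul(lincomb(C, D, 1j), U_inv)   # M = (C + iD)(A + iB)^{-1}

lhs1 = lincomb(transpose(D), matmul(transpose(B), M), -1.0)  # tD - tB M
lhs2 = lincomb(transpose(C), matmul(transpose(A), M), -1.0)  # tC - tA M
err1 = max_abs_diff(lhs1, U_inv)
err2 = max_abs_diff(lhs2, [[-1j * u for u in row] for row in U_inv])
err_sym = max_abs_diff(M, transpose(M))  # M is symmetric for symplectic F
print(err1, err2, err_sym)
```

All three printed errors should be at the level of floating-point rounding.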

Thus, subtracting the appropriate multiples of the third row in $H_2$ from the other rows we get

\[
H_3 = \begin{pmatrix}
0 & (-H_q+iH_p)(A+iB)^{-1} & -H_p & -H_q\\
-H_q & (C+iD+iI)(A+iB)^{-1} & D-I & C\\
-H_p & (I-A-iB)(A+iB)^{-1} & -B & I-A\\
0 & -2i(A+iB)^{-1} & 0 & 0
\end{pmatrix}.
\]

Finally, using the fourth row to remove the three upper entries in the second column, multiplying the third row by $-1$, interchanging the second and fourth rows, and the third and fourth columns, we arrive at the simple form

\[
H_4 = \begin{pmatrix}
0 & 0 & -H_q & -H_p\\
0 & -2i(A+iB)^{-1} & 0 & 0\\
H_p & 0 & A-I & B\\
-H_q & 0 & C & D-I
\end{pmatrix} \tag{48}
\]

and $H_4=R_1\Phi''_ER_2$, where $R_1$ and $R_2$ can be computed by repeating the elementary row and column operations that we have performed on the identity matrix, and in particular $\det R_1=1$ and $\det R_2=(-1)^n$. In order to apply the stationary phase theorem the null space of $\Phi''_E$ must be the tangent space to the critical set $C_E$. However, one can read off the null space of $H_4$ from (48):

\[
\mathrm{Null}\,H_4 = R_2^{-1}\,\mathrm{Null}\,\Phi''_E = \Big\{(\tau,0,v,w) : (F-I)\begin{pmatrix}v\\ w\end{pmatrix}+\tau\begin{pmatrix}H_p\\ -H_q\end{pmatrix}=0 \ \text{and}\ H_q\cdot v+H_p\cdot w=0\Big\}. \tag{49}
\]
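The chain of row and column operations leading from (47) to (48) can be replayed numerically in one degree of freedom. The sketch below is a sanity check under stated assumptions, not part of the proof: it takes $n=1$ (so all blocks are scalars and transposes are trivial), assumes $F=\begin{pmatrix}A&B\\ C&D\end{pmatrix}$ with $AD-BC=1$, and uses arbitrary sample values for $A$, $B$, $C$, $H_p$, $H_q$.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# Sample data (hypothetical): a 2x2 symplectic F and a gradient (Hq, Hp).
A, B, C = 1.3, 0.7, -0.4
D = (1 + B * C) / A                  # enforces A*D - B*C = 1
U_inv = 1 / (A + 1j * B)
M = (C + 1j * D) * U_inv
Hp, Hq = 0.9, -1.7
w = -Hq + 1j * Hp

# Hessian (47) for n = 1, variables ordered (t, y, p, q).
hessian = [
    [Hp * (Hq + M * Hp), -Hq - Hp * M, -Hp * (D - M * B), -Hp * (C - M * A)],
    [-Hq - M * Hp, M + 1j, D - M * B - 1, C - M * A - 1j],
    [-(D - B * M) * Hp, D - B * M - 1, B * M * B - D * B, B * M * A - B * C],
    [-(C - A * M) * Hp, C - A * M - 1j, A * M * B - C * B, A * M * A - C * A + 1j],
]
R0 = [[1, 0, 0, 0], [Hp, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
R3 = [[1, 0, 0, 0], [0, 1, B, A], [0, 0, 1, 0], [0, 0, 0, 1]]

H1 = matmul(matmul(transpose(R0), hessian), R0)
H2 = matmul(H1, R3)

# H3: add multiples of the third row of H2 to the other rows.
H3 = [row[:] for row in H2]
for i, c in ((0, w), (1, 1j), (3, -1j)):
    H3[i] = [x + c * y for x, y in zip(H2[i], H2[2])]

# H4: clear the second column with the fourth row, negate the third row,
# swap rows 2 and 4, then swap columns 3 and 4.
H4 = [row[:] for row in H3]
for i in range(3):
    c = H4[i][1] / H4[3][1]
    H4[i] = [x - c * y for x, y in zip(H4[i], H4[3])]
H4[2] = [-x for x in H4[2]]
H4[1], H4[3] = H4[3], H4[1]
H4 = [[r[0], r[1], r[3], r[2]] for r in H4]

H4_text = [
    [0, 0, -Hq, -Hp],
    [0, -2j * U_inv, 0, 0],
    [Hp, 0, A - 1, B],
    [-Hq, 0, C, D - 1],
]
err_H4 = max_abs_diff(H4, H4_text)
print(err_H4)
```

Agreement with (48) up to rounding confirms that the listed operations are mutually consistent in this scalar case; it says nothing about the signs of $\det R_1$ and $\det R_2$.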

This leads us to impose the following "clean flow condition".

Hypothesis C. Assume that $D_E := \{(\alpha,t)\in\Sigma_E\times\mathbb{R} : \phi^t(\alpha)=\alpha\}$ is a submanifold of $\mathbb{R}^{1+2n}$. Then we say that $D_E$ satisfies the clean flow condition if, for any $(\alpha,t)\in D_E$, the tangent space equals

\[
T_{\alpha,t}D_E = \Big\{(v,w,\tau)\in\mathbb{R}^{1+2n} : (F-I)\begin{pmatrix}v\\ w\end{pmatrix}+\tau\begin{pmatrix}H_p\\ -H_q\end{pmatrix}=0 \ \text{and}\ H_q\cdot v+H_p\cdot w=0\Big\}. \tag{50}
\]

Since $C_E=\{(y,\alpha,t) : (\alpha,t)\in D_E \text{ and } y=q\}$, the tangent space $T_{y,\alpha,t}C_E$ equals

\[
\Big\{(\tau,v,w,v) : (F-I)\begin{pmatrix}v\\ w\end{pmatrix}+\tau\begin{pmatrix}H_p\\ -H_q\end{pmatrix}=0 \ \text{and}\ H_q\cdot v+H_p\cdot w=0\Big\},
\]

and, assuming Hypothesis C, this does equal the null space of $\Phi''_E$, since

\[
R_2\begin{pmatrix}\tau\\ 0\\ v\\ w\end{pmatrix} = \begin{pmatrix}\tau\\ Av+Bw+\tau H_p\\ w\\ v\end{pmatrix} = \begin{pmatrix}\tau\\ v\\ w\\ v\end{pmatrix}
\]

for $(\tau,v,w)$ as in (49). Therefore, if $P$ denotes the orthogonal projection on the null space of $\Phi''_E$, then $\det(\Phi''_E+P)$ will be the determinant of the Hessian of the phase restricted to the normal space, and setting

\[ \tilde P = R_1PR_2 \tag{51} \]


we have $\det(H_4+\tilde P) = -(-1)^n\det(\Phi''_E+P)$. Hence the computations of our paper provide a proof for the existence of a Gutzwiller trace formula under Hypothesis C. However, as stated earlier, we will only carry out the computations for the case that $\gamma$ consists of a single trajectory here. In this case Hypothesis C reduces to the assumption (H.2) of isolated nondegenerate periodic orbits, and we may complete the computation in the following way.

To compute $\det(H_4+\tilde P)$ we will use a special basis $\mathcal{B}$. We denote by $E_\lambda$ the (algebraic) eigenspace of $F$ belonging to the eigenvalue $\lambda$. Then under assumption (H.2),

\[ \dim\bigoplus_{\lambda\ne 1}E_\lambda = 2n-2, \]

$\dim E_1=2$, and $\sigma(E_\lambda,E_1)=0$ for $\lambda\ne 1$, where $\sigma$ is the symplectic form, as in (10). Let $(z_1,z_2)$ be a basis for $E_1$ with $z_1=(2H_p^2+H_q^2)^{-1/2}(H_p,-H_q)$, and $(F-I)z_2=\beta z_1$. Let $m_1,\dots,m_{2n-2}$ be a (real) basis for the span of $\bigoplus_{\lambda\ne 1}E_\lambda$, and let $e_0,\dots,e_n$ be the Euclidean basis for $\mathbb{R}^{n+1}$. Then we take $\mathcal{B}$ to be the basis

{(e0 , 0) · · · (en , 0)} ∪ {(0, m1 ) · · · , (0, m2n−2 )} ∪ {(0, z1 ), (0, z2 )}. Since the vector P˜ z01 spans the range of P˜ and H4 z01 = 0, we can use column operations to remove the contribution of P˜ from all columns of the matrix H4 + P˜ with respect to B, except the one corresponding to z1 . Then we can use column operations to remove all entries in the z1 - and z2 -columns corresponding to the basis vectors (e1 , 0) · · · (en , 0), and (0, m1 ) · · · (0, m2n−2 ). Note that this does not change the entries in the first row of the matrix, since σ(z1 , mj ) = 0, j = 1, . . . 2n − 2. After these simplifications which do not change the determinant, the matrix of H4 + P˜ with respect to B becomes:   0 0 0 b −1 0 0   0 −2i(A + iB) . (52) 0 0 Pγ − I 0  a 0 0 The vector a is just ((2Hp2 + Hq2 )1/2 , 0, · · · 0) and b = x, −(2Hp2 + Hq2 )1/2 σ(z1 , z2 ) . Therefore the determinant of H4 + P˜ equals −1 A + iB n ˜ det(Pγ − I) det , (−i) det 2 where ˜ =

0 b a

 0 x −(2Hp2 + Hq2 )σ(z1 , z2 ) , =  (2Hp2 + Hq2 )1/2 x x 0 c 0

(53)



(54)

where x is used for entries that do not enter the calculation, and c is the component of P˜ (0, z1 ) along the basis vector z2 .


Now it is not difficult to calculate    0  0  1 P˜   =  v 2 w

 x x  . −w + Bv − Aw  3v − Cw + Dv

(55)

We let P˜1 z1 denote the last 2n components of P˜ (0, z1 ). Since t F Jz1 = Jz1 , the normalization in the definition of z1 gives, σ(z1 , P˜1 z1 ) = 1. Therefore, if P˜1 z1 = cz2 + dz1 we clearly have c = σ(z1 , z2 )−1 . Thus (54) yields ˜ = −(2Hp2 + Hq2 ) det

(56)

and, combining this with (53) and (56), we have det 800E |N (Mγ ) = (−1)n−1 (−i)n det

U 2

−1 |(0, Hp , −Hq , Hp )|2 · det(Pγ − I). (57)

Using (42) and (57), we conclude dγ0

= g(T ˆ γ )eiSγ /~

Z

Tγ∗

0

U × det 2

−1/2

"

(−1)1−n |(0, q˙s , p˙s , q˙s )|2 det(Pγ − I) det U2

#−1/2 ∗

A(αs )dV (s).

c

Using |(0, p˙s , q˙s , p˙s )|−1 dV (s) = ds we get the result for dγ0 in (13). Since det(Pγ − I) = 0 (−1)σ | det(Pγ − I)|, where σ 0 is the number of real eigenvalues of Pγ which are greater than 1, we see that "

(−1)1−n det(Pγ − I) det U2

#−1/2 det ∗

U 2

−1/2 c

0

= ±in−1+σ | det(Pγ − I)|−1/2 . (58)
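The sign rule $\det(P_\gamma-I)=(-1)^{\sigma'}|\det(P_\gamma-I)|$ used above can be illustrated in the smallest case, where the Poincaré map is a real $2\times 2$ matrix of determinant 1 (that is, $n=2$). The sketch below is an illustration only; the sample matrices and the helper names `det_P_minus_I`, `sigma_prime`, and `sign_rule_holds` are hypothetical.

```python
import math

def det_P_minus_I(P):
    (a, b), (c, d) = P
    return (a - 1) * (d - 1) - b * c

def sigma_prime(P):
    """Number of real eigenvalues of a real 2x2 matrix that exceed 1."""
    (a, b), (c, d) = P
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det
    if disc < 0:
        return 0  # complex conjugate pair (elliptic case)
    root = math.sqrt(disc)
    return sum(1 for lam in ((tr + root) / 2, (tr - root) / 2) if lam > 1)

def sign_rule_holds(P):
    d = det_P_minus_I(P)
    return abs(d - (-1) ** sigma_prime(P) * abs(d)) < 1e-12

theta = 0.7
examples = {
    "hyperbolic": [[2.0, 0.0], [0.0, 0.5]],
    "hyperbolic, sheared": [[2.0, 1.0], [1.0, 1.0]],
    "inverse hyperbolic": [[-2.0, 0.0], [0.0, -0.5]],
    "elliptic": [[math.cos(theta), -math.sin(theta)],
                 [math.sin(theta), math.cos(theta)]],
}
for name, P in examples.items():
    print(name, det_P_minus_I(P), sigma_prime(P), sign_rule_holds(P))
```

In the hyperbolic cases exactly one eigenvalue exceeds 1 and $\det(P-I)<0$; in the inverse hyperbolic and elliptic cases no eigenvalue exceeds 1 and $\det(P-I)>0$.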

Note that the role of the Maslov index in (13) is to determine the sign in (58), and $\sigma_\gamma$ in (13) is either $n-1+\sigma'$ or $n+1+\sigma'$. The other coefficients $d_{\gamma j}$ are spectral invariants which have been studied by Guillemin and Zelditch. In principle we can compute them using this explicit approach. This completes the proof of Theorem 2.3.

A. Propagation of Coherent States

For the reader's convenience we include here a sketch of the proof for the propagation of coherent states. For simplicity we will first explain the one term approximation with a remainder estimate $O(\hbar^{1/2})$. The result to be proved is

\[ \|U(t)\varphi_\alpha - e^{(i/\hbar)\delta(\alpha,t)}T(\alpha_t)\,\mathrm{Met}_\hbar(F(t))\psi_0\| \le C\hbar^{1/2}, \]

where $C>0$ is uniform for $0\le t\le T$ and $\{\alpha : |\alpha|\le R\}$, for every fixed $T$, $R$.


A.1. Quadratic Hamiltonians. First of all let us introduce the quadratic Hamiltonian

\[ H_2(z;t) = H(\alpha_t) + \langle H'(\alpha_t), z-\alpha_t\rangle + \frac12\langle H''(\alpha_t)(z-\alpha_t),(z-\alpha_t)\rangle, \]

where $H'$ is the first derivative and $H''(z)$ the Hessian matrix of $H$ in the variable $z\in\mathbb{R}^{2n}$. $\hat H_2(t)$ denotes the $\hbar$-Weyl quantization of $H_2(\cdot,t)$ and $U_2(t,\tau)$ the time-dependent propagator, i.e.

\[ i\hbar\,\partial_tU_2(t,\tau) = \hat H_2(t)U_2(t,\tau),\qquad U_2(\tau,\tau)=1. \]

Let us denote by $U_{qe}(t,\tau)$ the propagator defined in the same way by the quadratic form

\[ H_{qe}(z,t) = \frac12\langle H''(\alpha_t)z,z\rangle. \]

In [10] we have proved the formula

\[ U_2(t,\tau) = e^{(i/\hbar)(\delta_t-\delta_\tau)}\,T(\alpha_t)\,U_{qe}(t,\tau)\,T(-\alpha_\tau). \tag{A.1} \]

Furthermore the $U_{qe}(t,\tau)$ are metaplectic transformations. More precisely, $U_{qe}(t,\tau)=U_0(t)U_0(\tau)^{-1}$ where $U_0(t)=U_{qe}(t,0)$. Then $U_0(t)$ defines a continuous family of time-dependent metaplectic transformations, starting from $I$ at $t=0$, associated with the symplectic transformations $F(t)$ (cf. [10]). Using a classical result (see [15]) we get explicitly the propagation of Gaussians by $U_0(t)$:

\[ U_0(t)\psi_0(x) = (\pi\hbar)^{-n/4}\,{\det}^{-1/2}(A+iB)\,e^{iMx\cdot x/\hbar}. \]

A.2. The Duhamel formula.

\[ U(t)-U_2(t,0) = \frac{1}{i\hbar}\int_0^t U(t-s)\big(\hat H-\hat H_2(s)\big)U_2(s,0)\,ds. \]
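The structure of the Duhamel formula can be checked in a toy setting. The sketch below is a deliberately simplified illustration: it replaces the operators by commuting, time-independent scalars (hypothetical values `H` and `H2`), so that $U(t)=e^{-iHt/\hbar}$ and $U_2(t,0)=e^{-iH_2t/\hbar}$, and compares both sides by Simpson quadrature. It does not capture the time dependence of $\hat H_2(s)$ or any operator ordering.

```python
import cmath

hbar = 0.5
H, H2 = 1.3, 0.8  # hypothetical scalar stand-ins for the two Hamiltonians

def U(t):
    return cmath.exp(-1j * H * t / hbar)

def U2(t):
    return cmath.exp(-1j * H2 * t / hbar)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

t = 1.7
lhs = U(t) - U2(t)
rhs = simpson(lambda s: U(t - s) * (H - H2) * U2(s), 0.0, t) / (1j * hbar)
print(abs(lhs - rhs))
```

The printed difference should be at the level of the quadrature error.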

A.3. Taylor formula and remainder term estimates. Using the Taylor expansion with integral remainder, we have X ν T (αt )Opw Hˆ − Hˆ 2 (t) = ~ z rν,t (z) T (−αt ) |ν|=3

and using (A) we get, for t ≥ 0, the L2 estimate t X ν sup kU0 (t)−1 Opw k(U (t) − U2 (t, 0))ϕα k ≤ ~ z rν,t (z) U0 (t)ψ0 k. ~ 0≤τ ≤t |ν|=3

By definition of the metaplectic group we have ν w ν U0 (t)−1 Opw ~ z rν,t (z) U0 (t) = Op~ (F (τ )z) rν,t (F (τ )z) . The remainder term estimate in the propagation of coherent states is obtained from the following estimate, which is proved by standard semiclassical analysis techniques, ν 3/2 , |Opw ~ (F (τ )z) rν,t (F (τ )z) | ≤ C~ where C depends only on t and α.


A.4. Towards mod $O(\hbar^N)$ approximation. The method is to expand $H(z)$ around $\alpha_t$ by the Taylor formula and to iterate the Duhamel formula. Let us explain this for $N=1$. Let us denote

\[ H_{(3)}(z,t) = \frac16\sum_{|\nu|=3}\partial_z^\nu H(\alpha_t)\,z^\nu,\qquad H_3(z,t) = H_2(z,t)+H_{(3)}(z-\alpha_t,t). \]

Then we get

\[ U(t)\varphi_\alpha = \Big(1+(i\hbar)^{-1}\int_0^tU_{qe}(t,s)\hat H_{(3)}(s)U_{qe}(s,t)\,ds\Big)U_2(t,0)\varphi_\alpha + r_{\alpha,t}, \]

where

\[ r_{\alpha,t} = \frac{1}{i\hbar}\int_0^tU(t-s)\big(\hat H-\hat H_3(s)\big)U_2(s,0)\varphi_\alpha\,ds - \frac{1}{\hbar^2}\int_0^t\!\!\int_0^{t-s}U(t-s-\tau)\big(\hat H-\hat H_2(\tau)\big)U_2(\tau,s)\,d\tau\,\big(\hat H-\hat H_2(s)\big)U_2(s,0)\varphi_\alpha\,ds. \]

Using the representation formula (A.1) for $U_2$ and the metaplectic property, we get easily as in Step 3, $\|r_{\alpha,t}\|\le C\hbar$.

Acknowledgement. The authors thank J. Sjöstrand for helpful discussions of this topic, J. Ramanathan for valuable comments on the preliminary version of this paper and the referee for useful remarks.

References

1. Arnaud, J. A.: Hamiltonian theory of beam mode propagation. Progress in Optics XI, ed. E. Wolf, Amsterdam: North Holland, 1973, 249–304
2. Babich, V.M.: Eigenfunctions concentrated in a neighborhood of a closed geodesic. In: Math. Problems in Wave Propagation Theory, V.M. Babich, ed., Sem. Math., V.A. Steklov Math. Inst. 9, 1968, Leningrad. Translated by Consultants Bureau, New York, 1970
3. Babich, V.M., Buldyrev, V.S.: Asymptotic Methods in Short Wave Diffraction Problems. Vol. 1 (Russian), Moscow: Nauka, 1972
4. Balian, R. and Bloch, C.: Distribution of eigenfrequencies for the wave equation in a finite domain. Ann. Phys. 69, vol. 1, 76–160 (1972)
5. Balian, R. and Bloch, C.: Solution of the Schrödinger equation in terms of classical paths. Ann. Phys. 85, 514–545 (1974)
6. Charbonnel, A-M., Popov, G.: Semiclassical asymptotics for several commuting operators. Comm. in PDE 24, 283–323 (1998)
7. Chazarain, J.: Formule de Poisson pour les variétés Riemanniennes. Inv. Math. 24, 65–82 (1974)
8. Colin de Verdière, Y.: Spectre du Laplacien et longueurs des géodésiques périodiques I. Compos. Math. 27, 83–106 (1973)
9. Combescure, M. and Robert, D.: Semiclassical sum rules and generalized coherent states. J. Math. Phys. 36, 6596–6610 (1995)
10. Combescure, M. and Robert, D.: Semiclassical spreading of quantum wave packets and applications near unstable fixed points of the classical flow. Asymptotic Anal. 14, 377–404 (1997)
11. Combescure, M. and Robert, D.: Propagation d'états cohérents par l'équation de Schrödinger et approximation semi-classique. C. R. Acad. Sci. Paris, t. 323 Série I, 871–876 (1996)
12. Dimassi, M., Sjöstrand, J.: Trace asymptotics via almost analytic extensions. PNLDE 21, Basel–Boston: Birkhäuser, pp. 126–142
13. Dozias, S.: Opérateurs h-pseudodifférentiels à flot périodique. Thèse, Paris 13, 1994
14. Duistermaat, J. J. and Guillemin, V.: The spectrum of positive elliptic operators and periodic bicharacteristics. Invent. Math. 29, 39–79 (1975)
15. Folland, G. B.: Harmonic Analysis in Phase Space. Ann. of Math. Studies, 122, Princeton: Princeton University Press, 1989
16. Guillemin, V. and Uribe, A.: Circular symmetry and the trace formula. Invent. Math. 96, 385–423 (1989)
17. Gutzwiller, M.: Periodic orbits and classical quantization conditions. J. Math. Phys. 12, 343–358 (1971) and book Chaos in classical and quantum mechanics. Berlin–Heidelberg–New York: Springer-Verlag, 1990
18. Hagedorn, G.: Semiclassical Quantum Mechanics. (I) Commun. Math. Phys. 71, 77–93 (1980); (II) Ann. Inst. H. Poincaré 42, 363–374 (1985)
19. Hall, K.R., Meyer, G.R.: Introduction to Hamiltonian Dynamical Systems and the N-body problem. Applied Mathematical Sciences 90, Berlin–Heidelberg–New York: Springer-Verlag, 1991
20. Heller, E. J.: Time dependent approach to semiclassical dynamics. J. Chem. Phys. 62, 1544–1555 (1975); Quantum localization and the rate of exploration of phase space. Phys. Rev. A35, 1360–1370 (1987)
21. Helffer, B., Robert, D.: Calcul fonctionnel par la transformée de Mellin. J. Funct. Anal. V. 153, 246–268 (1983)
22. Helffer, B. and Sjöstrand, J.: Multiple wells in the semi-classical limit I. Com. in PDE, 9 (4), 337–408 (1984)
23. Hörmander, L.: The analysis of partial differential operators. 1–4, Berlin: Springer, 1983
24. Keller, J.B.: J. Opt. Soc. Am. 61, 40 (1971)
25. Littlejohn, R.: The semiclassical evolution of wave packets. Physics Rep. 138, 193–291 (1986)
26. Meinrenken, E.: Semiclassical principal symbols and Gutzwiller's trace formula. Reports on Math. Phys. 31, 279–295 (1992)
27. Paul, T. and Uribe, A.: Sur la formule semi-classique des traces. C. R. Acad. Sci. Paris 313 I, 217–222 (1991)
28. Paul, T. and Uribe, A.: A construction of quasimodes using coherent states. Ann. Inst. H. Poincaré 59, 357–381 (1993)
29. Paul, T. and Uribe, A.: The semi-classical trace formula and propagation of wave packets. J. Funct. Anal. 132, 192–249 (1995)
30. Paul, T. and Uribe, A.: On the pointwise behaviour of semiclassical measures. Commun. Math. Phys. 175, 229–258 (1996)
31. Petkov, V., Popov, G.: Semiclassical trace formula and clustering of eigenvalues for Schrödinger operators. Ann. Inst. Henri Poincaré, sect. Phys. Th. 68, 17–83 (1998)
32. Popov, G.: On the contribution of degenerate periodic trajectories to the wave-trace. Commun. Math. Phys. 196, 363–383 (1998)
33. Ralston, J.: On the construction of quasimodes associated with stable periodic orbits. Commun. Math. Phys. 51, 219–242 (1976); Erratum, 67, 91
34. Ralston, J.: Gaussian beams and the propagation of singularities. Studies in PDE, Stud. Math. 23, 207–248 (1982)
35. Schrödinger, E.: Naturwissenschaften 14, 664 (1926)
36. Wilkinson, M.: A semiclassical sum rule for matrix elements of classically chaotic systems. J. Phys. A: Math. Gen. 20, 2415–2423 (1987)

Communicated by B. Simon

Commun. Math. Phys. 202, 481 – 500 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Stability of One-Electron Molecules in the Brown–Ravenhall Model

A. A. Balinsky, W. D. Evans

School of Mathematics, Cardiff University, 23 Senghennydd Road, P. O. Box 926, Cardiff CF2 4YH, Great Britain. E-mail: [email protected]; [email protected]

Received: 23 June 1998 / Accepted: 19 November 1998

Abstract: In appropriate units, the Brown–Ravenhall Hamiltonian for a system of 1 electron relativistic molecules with K fixed nuclei having charge and position Zk , Rk , k = 1, 2, . . . , K, is of the form B1,K = 3+ D0 +αVc 3+ , where 3+ is the projection onto PK αZk + the positive spectral subspace of the free Dirac operator D0 and Vc = − k=1 |x−R k| PK αZk Zl k
1+π/2)

1. Introduction

In [4] I. Daubechies and E. Lieb investigated the "stability of matter" problem for a system of one-electron relativistic molecules with $K$ static nuclei modelled by the Hamiltonian

\[ H_{1,K}(Z,R) = \big(-\hbar^2c^2\Delta+m^2c^4\big)^{1/2} - \sum_{k=1}^{K}\frac{e^2Z_k}{|x-R_k|} + \sum_{k<l}^{K}\frac{e^2Z_kZ_l}{|R_k-R_l|}. \tag{1.1} \]

Here $2\pi\hbar$ is Planck's constant, $c$ the velocity of light, $m$ the electron mass, $-e$ the electron charge, $Z_k$, $R_k$ the charge and position respectively of the $k$th nucleus, and we have written $Z=(Z_1,\dots,Z_K)$, $R=(R_1,\dots,R_K)$. They prove that if $\alpha Z_k\le 2/\pi$, where $\alpha=e^2/(\hbar c)$ is Sommerfeld's fine structure constant, which is about $1/137.037$, then the system is stable in the sense that, for some constant $C$, the natural operator defined by (1.1) satisfies

\[ H_{1,K}(Z,R) \ge CK. \tag{1.2} \]


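The subject of this paper is to investigate the analogous problem for the Brown–Ravenhall Hamiltonian

\[ B_{1,K}(Z,R) = \Lambda_+\Big(D_0 - \sum_{k=1}^{K}\frac{e^2Z_k}{|x-R_k|} + \sum_{k<l}^{K}\frac{e^2Z_kZ_l}{|R_k-R_l|}\Big)\Lambda_+. \tag{1.3} \]

In (1.3), the notation is as follows (see [5]):

• $D_0$ is the free Dirac operator

\[ D_0 = c\,\alpha\cdot\frac{\hbar}{i}\nabla + mc^2\beta \equiv c\sum_{j=1}^{3}\alpha_j\frac{\hbar}{i}\frac{\partial}{\partial x_j} + mc^2\beta, \]

where $\alpha=(\alpha_1,\alpha_2,\alpha_3)$ and $\beta$ are the Dirac matrices given by

\[ \alpha_j = \begin{pmatrix}0_2&\sigma_j\\ \sigma_j&0_2\end{pmatrix},\qquad \beta = \begin{pmatrix}1_2&0_2\\ 0_2&-1_2\end{pmatrix}, \]

with $0_2$, $1_2$ the zero and unit $2\times 2$ matrices respectively and $\sigma_j$ the Pauli matrices

\[ \sigma_1 = \begin{pmatrix}0&1\\ 1&0\end{pmatrix},\qquad \sigma_2 = \begin{pmatrix}0&-i\\ i&0\end{pmatrix},\qquad \sigma_3 = \begin{pmatrix}1&0\\ 0&-1\end{pmatrix}. \]

• $\Lambda_+$ denotes the projection of $L^2(\mathbb{R}^3)\otimes\mathbb{C}^4$ onto the positive spectral subspace of $D_0$, that is $\chi_{(0,\infty)}(D_0)$, where $\chi_{(0,\infty)}$ is the characteristic function of $(0,\infty)$. If we set

\[ \hat f(p) \equiv \mathcal{F}(f)(p) = \Big(\frac{1}{2\pi\hbar}\Big)^{3/2}\int_{\mathbb{R}^3}e^{-i\,x\cdot p/\hbar}f(x)\,dx \tag{1.4} \]

for the Fourier transform of $f$, then it follows that $(\Lambda_+f)^\wedge(p)=\Lambda_+(p)\hat f(p)$, where

\[ \Lambda_+(p) = \frac12 + \frac{c\,\alpha\cdot p+mc^2\beta}{2e(p)},\qquad e(p) = \sqrt{c^2p^2+m^2c^4}, \tag{1.5} \]

with $p=|p|$.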
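That the symbol $\Lambda_+(p)$ in (1.5) is an orthogonal projection of rank 2 follows from the anticommutation relations of the Dirac matrices, which give $(c\,\alpha\cdot p+mc^2\beta)^2=e(p)^2$. The sketch below checks this numerically at one sample momentum; the helper names and the sample values of `p`, `m`, `c` are hypothetical.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]
BETA = block(I2, Z2, Z2, [[-1, 0], [0, -1]])

def lambda_plus(p, m=1.0, c=1.0):
    """Symbol (1.5): 1/2 + (c alpha.p + m c^2 beta) / (2 e(p))."""
    e = math.sqrt(c * c * sum(x * x for x in p) + m * m * c ** 4)
    L = []
    for i in range(4):
        row = []
        for j in range(4):
            d0 = c * sum(p[k] * ALPHA[k][i][j] for k in range(3)) + m * c * c * BETA[i][j]
            row.append((0.5 if i == j else 0.0) + d0 / (2 * e))
        L.append(row)
    return L

L = lambda_plus((0.3, -1.2, 2.0))
idempotent_err = max_abs_diff(matmul(L, L), L)
trace = sum(L[i][i] for i in range(4))
print(idempotent_err, trace)
```

The idempotency error should vanish up to rounding and the trace should equal 2, the dimension of the positive-energy subspace at fixed momentum.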

The underlying Hilbert space in which $B_{1,K}(Z,R)$ acts is

\[ \mathcal{H}_+ = \Lambda_+\big(L^2(\mathbb{R}^3)\otimes\mathbb{C}^4\big). \tag{1.6} \]

As usual we use the same notations for the formal Hamiltonians (1.1) and (1.3) and the self-adjoint operators defined as Friedrichs extensions when the associated quadratic forms are bounded below. The form domain of $B_{1,K}(Z,R)$ is $Q_4(|p|)=H^{1/2}(\mathbb{R}^3)\otimes\mathbb{C}^4$. To simplify notation, we follow [4] and rescale in the units of the Compton wavelength of the electron, namely $\hbar/(mc)$: replacing $x$, $R_k$ by $(mc/\hbar)x$, $(mc/\hbar)R_k$ respectively, and $B_{1,K}$ by $(mc^2)^{-1}B_{1,K}$, we obtain

\[ B_{1,K}(Z,R) = \Lambda_+\Big(D_0 - \sum_{k=1}^{K}\frac{\alpha Z_k}{|x-R_k|} + \sum_{k<l}^{K}\frac{\alpha Z_kZ_l}{|R_k-R_l|}\Big)\Lambda_+, \tag{1.7} \]

where now

\[ D_0 = \frac1i\,\alpha\cdot\nabla+\beta,\qquad \Lambda_+(p) = \frac12+\frac{\alpha\cdot p+\beta}{2e(p)},\qquad e(p)=\sqrt{p^2+1}. \tag{1.8} \]

We shall put $\hbar=1$ in the Fourier transform $\mathcal{F}$ in (1.4) hereafter. The case $K=1$ of this operator (with nucleus at the origin),

\[ B_{1,1}(Z) = \Lambda_+\Big(\frac1i\,\alpha\cdot\nabla+\beta-\frac{\alpha Z}{|x|}\Big)\Lambda_+, \tag{1.9} \]

was the object of study in [1, 3, 5, 9, 15, 16]. In [5] it was shown that $B_{1,1}(Z)$ is bounded below in $\mathcal{H}_+$ if and only if

\[ Z\le Z_c := \frac{\gamma_c}{\alpha},\qquad \gamma_c = \frac{2}{\frac{\pi}{2}+\frac{2}{\pi}}, \tag{1.10} \]

confirming a prediction of Hardekopf and Sucher in [9] based on numerical considerations. In fact, under (1.10), the operator (1.9) is proved to be positive in [3] and [15], a strict positive lower bound being exhibited in [15] even for $Z=Z_c$. For $\alpha Z\le\gamma_c$, $\sigma_{ess}(B_{1,1}(Z))=[1,\infty)$, and $\sigma_{sc}(B_{1,1}(Z))=\emptyset$ for $\alpha Z<\gamma_c$ (see [5]), where $\sigma_{ess}$ and $\sigma_{sc}$ denote the essential and singular continuous spectra respectively, while in [1] a virial theorem is established which implies that for $\alpha Z\le 3/4$ there are no embedded eigenvalues and hence the spectrum of $B_{1,1}(Z)$ in $[1,\infty)$ is absolutely continuous. Below 1, when $Z<Z_c$, $B_{1,1}(Z)$ has an infinite number of eigenvalues accumulating at 1 [16]; the latter result is also a consequence of the result in [8] that the $n$th eigenvalue of $B_{1,1}(Z)$ is not greater than that of the Dirac operator $D=D_0-\alpha Z/|\cdot|$, but they require $Z\le\frac{\sqrt3}{2}\frac{1}{\alpha}$. A significant feature of all these results for (1.9) is that they hold for $Z<Z_c\approx 124.2$, which includes all known elements. Therefore, one has some justification in believing that (1.9) is a better relativistic model than the Herbst operator

\[ H_{1,1}(Z) = (-\Delta+1)^{1/2}-\frac{\alpha Z}{|x|}, \tag{1.11} \]

which is the special case $K=1$ of the rescaled version of (1.1). The objective in this paper is to establish the result that (1.7) is stable for the whole range of nuclear charges allowed by (1.10), namely

\[ Z_k\le Z_c := \frac{2}{\big(\frac{\pi}{2}+\frac{2}{\pi}\big)\alpha},\qquad k=1,\dots,K. \tag{1.12} \]

As usual for many-particle problems, there is a restriction on the values of $\alpha$: we require

\[ \alpha\le\frac{2\pi}{(\pi^2+4)\big(2+\sqrt{1+\pi/2}\big)}\approx 0.125721, \]

which covers the physical value $\alpha\approx 1/137\approx 0.0073$. Note that (1.3) is a restriction to $\mathcal{H}_+$ of $H_{1,K}(Z,R)\,1_4$ in $L^2(\mathbb{R}^3)\otimes\mathbb{C}^4$. Thus the stability of $H_{1,K}(Z,R)$ in $L^2(\mathbb{R}^3)$ for $Z_k\le\frac{2}{\pi\alpha}$ implies that of $B_{1,K}(Z,R)$ in the subspace $\mathcal{H}_+$. The problem for the Brown–Ravenhall operator is to extend the range $Z_k\le 2/(\pi\alpha)$ to $Z_k\le Z_c$ and hence cover all known elements.
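The numerical constants quoted in this introduction are easy to reproduce. The short sketch below evaluates $\gamma_c$, $Z_c$, the bound on $\alpha$, and the Daubechies–Lieb range $2/(\pi\alpha)$; the variable names are chosen here for illustration, and the value $1/137.037$ for $\alpha$ is the one quoted in the text.

```python
import math

alpha_phys = 1 / 137.037                    # fine structure constant as quoted
gamma_c = 2.0 / (math.pi / 2 + 2 / math.pi)
Z_c = gamma_c / alpha_phys                  # critical charge in (1.10)

root = math.sqrt(1 + math.pi / 2)
alpha_max = 2 * math.pi / ((math.pi ** 2 + 4) * (2 + root))
alpha_max_alt = gamma_c / (4 + 2 * root)    # form used in the Main Theorem

Z_dl = 2 / (math.pi * alpha_phys)           # range covered by (1.1) via [4]

print("gamma_c   =", gamma_c)
print("Z_c       =", Z_c)
print("alpha_max =", alpha_max, alpha_max_alt)
print("2/(pi a)  =", Z_dl)
```

The two expressions for the bound on $\alpha$ agree because $\gamma_c=4\pi/(\pi^2+4)$, and the physical value of $\alpha$ lies well below the bound.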


The proof in [4] rests heavily on an inequality in [4, Lemma 2.1], which is a refinement of the result (see [11])

\[ \big\|\,|x|^{-1/2}|p|^{-1/2}\,\big\| = \Big(\frac{\pi}{2}\Big)^{1/2} \]

for compactly supported functions, and makes use of inequalities for symmetric decreasing rearrangements. Our proof depends on an analogous result, but we have to contend with the absence of rearrangement inequalities for the two spinors involved and the fact that the operator $\frac{p\cdot\sigma}{|p|^2}$, unlike $\frac{1}{|p|}$, does not have a positive kernel. We use the partial wave decomposition in configuration ($x$-) space and an interesting sharp inequality

\[ \frac{p\cdot\sigma}{|p|^2} \le \frac{x\cdot\sigma}{|x|}\,\frac{1}{|p|}\,\frac{x\cdot\sigma}{|x|} \tag{1.13} \]

(see Lemma 3.2) which we feel might have other useful applications. Our main result is the following.

Main Theorem. Suppose that $\alpha Z_k\le\gamma_c$, $k=1,2,\dots,K$, and $\alpha\le\gamma_c/\big(4+2\sqrt{1+\pi/2}\big)$, where $\gamma_c=2/\big(\frac{\pi}{2}+\frac{2}{\pi}\big)$, and $R_k\ne R_l$ for $k\ne l$. Then $B_{1,K}(Z,R)$ in (1.7) is stable in the sense that there exists a constant $C$ such that $B_{1,K}(Z,R)\ge CK$.

Remark. Let (1.7) be written as

\[ B_{1,K}(Z,R) = B^0_{1,K}(Z,R) + \sum_{k<l}^{K}\frac{\alpha Z_kZ_l}{|R_k-R_l|}. \tag{1.14} \]

The argument in [5] can be used to give $\sigma_{ess}\big(B^0_{1,K}(Z,R)\big)=[1,\infty)$ for $Z_k\le Z_c$ and hence

\[ \sigma_{ess}\big(B_{1,K}(Z,R)\big) = \Big[1+\sum_{k<l}^{K}\frac{\alpha Z_kZ_l}{|R_k-R_l|},\ \infty\Big). \]
2. Preliminaries

The Fourier transform of any spinor $\psi$ in the positive spectral subspace of $D_0=\frac1i\alpha\cdot\nabla+\beta$ can be written

\[ \hat\psi(p) = \frac{1}{n(p)}\begin{pmatrix}[e(p)+1]\,u(p)\\ (p\cdot\sigma)\,u(p)\end{pmatrix}, \tag{2.1} \]

where $u\in L^2(\mathbb{R}^3)\otimes\mathbb{C}^2$, a Pauli spinor, $e(p)=\sqrt{p^2+1}$ and $n(p)=[2e(p)(e(p)+1)]^{1/2}$ (see [5, §2]). Conversely any Dirac spinor of the form (2.1) is in the image of $\mathcal{H}_+$ under the Fourier transform. It is readily shown that, formally (note that $B_{1,1}(Z)=B^0_{1,1}(Z)$),

\[ \big(\psi,B^0_{1,1}(Z)\psi\big) = \big(u,b^0_{1,1}(Z)u\big) = \int_{\mathbb{R}^3}e(p)|u(p)|^2\,dp - \frac{\alpha Z}{2\pi^2}\iint_{\mathbb{R}^3\times\mathbb{R}^3}u(p')^*k(p',p)u(p)\,dp'\,dp, \tag{2.2} \]

where $*$ denotes the Hermitian conjugate, $|u(p)|^2=u(p)^*u(p)$, and the kernel $k$ is the $2\times 2$ matrix-valued function

\[ k(p',p) = \frac{[e(p')+1][e(p)+1]\,1_2+(p'\cdot\sigma)(p\cdot\sigma)}{n(p')\,|p-p'|^2\,n(p)}. \tag{2.3} \]

From [1, Lemma 2.4] we have

Lemma 2.1. The operator b01,1 Z in (2.2) has the same domain in L2 (R3 ) ⊗ C2 as the massless operator b01,1 Z defined in the form sense by 1 (2.4) b01,1 Z = |p| − γ W + P W P , γ = αZ, 2 where p· σ u(p), P u (p) = |p| Z

1 W u (p) = 2π 2

0

dp

0

u(p ) R3

(2.5)

0

2

|p − p |

.

(2.6)

Also , b01,1 Z − b01,1 Z can be extended to a bounded operator on L2 (R3 ) ⊗ C2 . While the Brown–Ravenhall operator is simply expressed in momentum space, it will be necessary for us to work in x−space with the operator e b01,1 Z = F −1 b01,1 Z F p· σ 1 p· σ 1 1 + . (2.7) = |p| − γ 2 |x| |p| |x| |p| 1 1 In (2.7) we have used the fact that F |·| u(x), = 2π1 2 |·|1 2 , and hence F −1 W F u(x) = |x| and used the standard notation |p| for F −1 |p| F, and similarly for 1 |x|

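$\frac{p\cdot\sigma}{|p|}$. We also write $\frac{1}{|x|}$ for the operator of multiplication by $\frac{1}{|x|}$ in $x$-space, and omit the tilde in $\tilde b^0_{1,1}$.

The operator $\frac{p\cdot\sigma}{|p|}$ is a unitary (and self-adjoint) involution; let $P_\pm$ denote the projections onto its eigenspaces corresponding to its eigenvalues at $\pm1$. Then we have $P_\pm=\frac12\big(1\pm\frac{p\cdot\sigma}{|p|}\big)$ and

\[ P_\pm|p| = |p|P_\pm,\qquad P_\pm\frac{1}{|p|^{1/2}} = \frac{1}{|p|^{1/2}}P_\pm,\qquad P_\pm^2=P_\pm,\qquad P_\pm^*=P_\pm. \tag{2.8} \]

Lemma 2.2. Let $K := \gamma_c\,\frac{1}{|x|^{1/2}}P_+\frac{1}{|p|}P_+\frac{1}{|x|^{1/2}}$. Then $K\ge0$ and

\[ \sup_{\|\psi\|=1}(\psi,K\psi) = 1. \tag{2.9} \]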
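The algebraic properties of $P_\pm$ used in (2.8) can be verified at a fixed momentum, where $\frac{p\cdot\sigma}{|p|}$ is a $2\times2$ matrix. The sketch below is an illustration only; the helper names and the sample momentum are hypothetical.

```python
import math

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul2(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def helicity(p):
    """The matrix (p . sigma) / |p| at a fixed momentum p."""
    norm = math.sqrt(sum(x * x for x in p))
    return [[sum(p[k] * SIGMA[k][i][j] for k in range(3)) / norm
             for j in range(2)] for i in range(2)]

def projections(p):
    """Return P_+ and P_- = (1 +/- p.sigma/|p|) / 2."""
    S = helicity(p)
    eye = [[1.0, 0.0], [0.0, 1.0]]
    P_plus = [[(eye[i][j] + S[i][j]) / 2 for j in range(2)] for i in range(2)]
    P_minus = [[(eye[i][j] - S[i][j]) / 2 for j in range(2)] for i in range(2)]
    return P_plus, P_minus

p = (0.4, -1.1, 0.7)
S = helicity(p)
P_plus, P_minus = projections(p)
zero = [[0.0, 0.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

err_involution = max_abs_diff(matmul2(S, S), identity)
err_idempotent = max_abs_diff(matmul2(P_plus, P_plus), P_plus)
err_orthogonal = max_abs_diff(matmul2(P_plus, P_minus), zero)
print(err_involution, err_idempotent, err_orthogonal)
```

Since $P_\pm$ are functions of $p$ alone, they commute with any multiplier in $p$, which is the content of the first two relations in (2.8).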


Proof. The operator b01,1 Z is homogeneous, and hence non-negative, if and only if it is bounded below. We therefore infer from Lemma 2.1 and [5] that b01,1 Z ≥ 0 if and only if γ ≤ γc . For ϕ ∈ H 1/2 (R3 ) ⊗ C2 =: Q2 ( |p|), 1 0 P+ ϕ . ϕ, P+ b1,1 Z P+ ϕ = P+ ϕ, |p| P+ ϕ − γ ϕ, P+ |x| Hence, with ψ = |p|

1/2

ϕ, γ ≤ γc is equivalent to 1 1 2 kP+ ψk = P+ 1/2 ψ, |p| P+ 1/2 ψ |p| |p| 1 1 1 P ψ, P ψ ≥γ + + 1/2 |x| |p|1/2 |p|

2

1 1

= γ 1/2 P+ 1/2 ψ

. |x| |p|

Let A := |x|11/2 P+ |p|11/2 . Then A∗ := (2.10), and since A = AP+ , that

1 P 1 |p|1/2 + |x|1/2

and K = γc AA∗ . It follows from

2 sup ψ, Kψ = γc kA∗ k = 1.

kψk=1

From ∧

[(· × p)u(·)] (p) = and [(x × ·)u(·)]∨ (x) =

(2.10)

1 ∇p × p u(p) ˆ i

1 ∇x × x u(x), ˇ i

it follows that in both x and p spaces, the total angular momentum operator J ≡ and comJ1 , J2 , J3 = x × p + 21 σ = L+ 21 σ depends only onthe angular co-ordinates, mutes with the Brown–Ravenhall operators b01,1 Z and b01,1 Z (see [5, § 2.2]). The eigenvectors l,m,s , (l, m, s) ∈ I say, of J 2 form an orthonormal basis for L2 (S3 ) ⊗ C2 , and are the spherical spinors q  l+s+m  1 (ω) Y   q 2(l+s) l,m− 2  s = 21 ,   l+s−m  1 2(l+s) Yl,m+ 2 (ω)  (2.11) l,m,s (ω) :=  q l+s−m+1  1 (ω) Y −  l,m− 2  2(l+s)+2 1  q  s=− ,   2 l+s+m+1  1 (ω) Y 2(l+s)+2

l,m+ 2

with l = 0, 1, 2, ... and m = −l − 21 , ..., l + 21 , that do not vanish. Here Yl,k are normalized spherical harmonics on the unit sphere S2 (see, e.g., [13], p. 421) with the convention that Yl,k = 0, if |k| > l. We denote the corresponding index set by I, i.e., I := {(l, m, s)|l ∈ N0 , m = −l− 21 , ..., l+ 21 , s = ± 21 , l,m,s 6 = 0}. The l,m,s are simultaneous eigenvectors of J 2 , J3 , L2 , and σ · L + 1, where L ≡ L1 , L2 , L3 = x × p, with corresponding eigenvalues (l + s)(l + s + 1), m, l(l + 1) and −k respectively, where

\[ k := \begin{cases} -l-1 & \text{if } s=\frac12,\\ l & \text{if } s=-\frac12. \end{cases} \tag{2.12} \]

We have the isomorphism $\mathcal{H}=L^2(\mathbb{R}^3)\otimes\mathbb{C}^2\longrightarrow L^2\big((0,\infty),r^2\,dr;\,L^2(S^2)\otimes\mathbb{C}^2\big)$: any $f\in\mathcal{H}$ can be written, with $r=|x|$, $\omega=x/|x|$,

\[ f(x) = \sum_{(l,m,s)\in I}c_{l,m,s}(r)\,\Omega_{l,m,s}(\omega) \tag{2.13} \]

and

\[ \|f\|^2 = \sum_{(l,m,s)\in I}\int_0^\infty|c_{l,m,s}(r)|^2r^2\,dr. \tag{2.14} \]

The Fourier transform takes angular momentum channels into angular momentum channels, and so (2.13) and (2.14) continue to hold when $x$ is replaced by $p$. We also need the following identities:

\[ \int_{S^2}\!\int_{S^2}\frac{1}{|x-y|^2}\,\Omega^*_{l,m,s}(\omega_x)\,\Omega_{l',m',s'}(\omega_y)\,d\omega_x\,d\omega_y = \frac{2\pi}{|x||y|}\,Q_l\Big(\frac{|x|^2+|y|^2}{2|x||y|}\Big)\delta_{ll'}\delta_{mm'}\delta_{ss'}, \tag{2.15} \]

\[ \frac{x\cdot\sigma}{|x|}\,\Omega_{l,m,s}(\omega) = -\Omega_{l+2s,m,-s}(\omega),\qquad \omega=\frac{x}{|x|}, \tag{2.16} \]

\[ \frac{1}{|p|}\,\frac{g(r)}{r}\,\Omega_{l,m,s}(\omega) = \frac{1}{\pi r}\int_0^\infty g(s)\,Q_l\Big(\frac{r^2+s^2}{2rs}\Big)ds\ \Omega_{l,m,s}(\omega), \tag{2.17} \]

\[ (p\cdot\sigma)\,ig(r)\,\Omega_{l,m,s}(\omega) = -\Big(\frac{dg}{dr}+\frac{k+1}{r}\,g(r)\Big)\Omega_{l+2s,m,-s}(\omega), \tag{2.18} \]

see Flügge [6], Greiner [7, p. 171, (12)]. The functions $Q_l$ are Legendre functions of the second kind, i.e.,

\[ Q_l(z) = \frac12\int_{-1}^{1}\frac{P_l(t)}{z-t}\,dt, \]

where the $P_l$ are Legendre polynomials. [See Stegun [14] for the notation and some properties of these special functions. The Legendre functions of the second kind appear here for exactly the same reasons as in the treatment of the Schrödinger equation for the hydrogen atom (Flügge [6], Problem 77).] It follows from (2.17) and (2.18) that $1/|p|$ preserves the angular momentum channels, but $(p\cdot\sigma)/|p|$ does not. However $(p\cdot\sigma)/|p|$ and $P_+$ have the following invariant subspaces. We can write the decomposition represented by (2.13) as

\[ \mathcal{H}\equiv L^2(\mathbb{R}^3)\otimes\mathbb{C}^2 = \bigoplus_{l\ge0,\ m=-l-1/2,\dots,l+1/2}\mathcal{H}_{l,m}, \]

where $\mathcal{H}_{l,m}$ is the space of functions of the form

\[ c_{l,m,\frac12}(r)\,\Omega_{l,m,\frac12}(\omega) + c_{l+1,m,-\frac12}(r)\,\Omega_{l+1,m,-\frac12}(\omega). \tag{2.19} \]

Then, by (2.17) and (2.18),

\[ P_+ : \mathcal{H}_{l,m}\to\mathcal{H}_{l,m}. \tag{2.20} \]
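The integral definition of the Legendre functions of the second kind given above can be evaluated directly and compared with the standard closed forms $Q_0(z)=\frac12\ln\frac{z+1}{z-1}$ and $Q_1(z)=zQ_0(z)-1$ for $z>1$. The sketch below is illustrative; the function names, the Simpson quadrature, and the sample radii are assumptions. The argument $(r^2+s^2)/(2rs)$ appearing in (2.15) and (2.17) is at least 1, with equality only at $r=s$, where $Q_l$ has a logarithmic singularity.

```python
import math

def legendre_p(l, t):
    """Legendre polynomial P_l(t) via the three-term recurrence."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, t
    for k in range(1, l):
        p_prev, p = p, ((2 * k + 1) * t * p - k * p_prev) / (k + 1)
    return p

def legendre_q(l, z, n=2000):
    """Q_l(z) = (1/2) * integral over [-1, 1] of P_l(t) / (z - t) dt,
    for z > 1, by the composite Simpson rule."""
    h = 2.0 / n

    def f(t):
        return legendre_p(l, t) / (z - t)

    total = f(-1.0) + f(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(-1.0 + k * h)
    return 0.5 * total * h / 3

def q_argument(r, s):
    """The argument (r^2 + s^2) / (2 r s) used in (2.15) and (2.17)."""
    return (r * r + s * s) / (2 * r * s)

z = q_argument(1.0, 2.0)  # equals 1.25
q0_closed = 0.5 * math.log((z + 1) / (z - 1))
q1_closed = z * q0_closed - 1
print(legendre_q(0, z), q0_closed)
print(legendre_q(1, z), q1_closed)
```

For larger $l$ or arguments very close to 1 a recurrence or a library routine would be preferable to direct quadrature.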

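Finally in this section, we give a brief survey of the properties of the Mellin transform that we shall need subsequently. The Mellin transform $\mathcal{M}$ is defined on $L^2(\mathbb{R}_+)\equiv L^2\big((0,\infty);\frac{dr}{r}\big)$ by

\[ \psi^\sharp(s) := (\mathcal{M}\psi)(s) := \frac{1}{\sqrt{2\pi}}\int_0^\infty p^{-1-is}\psi(p)\,dp,\qquad s\in\mathbb{R}, \tag{2.21} \]

and has inverse

\[ (\mathcal{M}^{-1}\psi^\sharp)(p) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}p^{is}\psi^\sharp(s)\,ds,\qquad p\in\mathbb{R}_+. \tag{2.22} \]

It is a unitary map from $L^2(\mathbb{R}_+)$ onto $L^2(\mathbb{R})$. Convolution on $L^2(\mathbb{R}_+)$ is defined by

\[ (\psi*\varphi)(s) = \int_0^\infty\psi(p)\,\varphi\Big(\frac{s}{p}\Big)\frac{dp}{p} \tag{2.23} \]

and we have that

\[ \mathcal{M}(\psi*\varphi) = \psi^\sharp\cdot\varphi^\sharp. \tag{2.24} \]

The Mellin transform of the Legendre functions of the second kind satisfies

\[ \mathcal{M}\Big[Q_l\Big(\frac12\Big(\cdot+\frac{1}{\cdot}\Big)\Big)\Big](z) = \frac12\sqrt{\frac{\pi}{2}}\ \frac{\Gamma([l+1-iz]/2)\,\Gamma([l+1+iz]/2)}{\Gamma([l+2-iz]/2)\,\Gamma([l+2+iz]/2)} \tag{2.25} \]

(see [12]).

3. Fundamental Inequalities

The following result is the analogue of Lemma 2.1 in [4], and has a crucial role in the proof of our main theorem. It is a refinement of Lemma 2.2 to compactly supported functions. Let $B(a,R)$ denote the ball $\{x : |x-a|<R\}$.

Theorem 3.1. Let $K=\gamma_c\,\frac{1}{|x|^{1/2}}P_+\frac{1}{|p|}P_+\frac{1}{|x|^{1/2}}$. Then for all $\psi\in L^2(\mathbb{R}^3)\otimes\mathbb{C}^2$ with $\mathrm{supp}\,\psi\subset B(0,R)$ we have

\[ (\psi,K\psi) \le \|\psi\|^2 - \frac{C_0}{R^2}\Big(\int_{\mathbb{R}^3}|\psi(x)|\,\frac{dx}{\sqrt{|x|}}\Big)^2, \tag{3.1} \]

where $C_0=1/(\pi^3+4\pi)$.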


To prove the theorem we need some preliminary lemmas. Lemma 3.2. p·σ 2

|p|

≤

x·σ 1 x·σ . |x| |p| |x|

(3.2)

Proof. Let γc 1 K˜ = 2 |x|1/2

x·σ 1 x·σ 1 + |p| |x| |p| |x|

1 1/2

|x|

.

(3.3)

We shall prove that for all ψ ∈ L2 (R3 ) ⊗ C2 ,

ψ, Kψ ≤

˜ ψ, Kψ .

whence the lemma. Let ψ ∈ Hl,m be of the form 1 1 f (r)l,m, 21 (ω) + g(r)l+1,m,− 21 (ω) r r

(3.4)

with f, g ∈ C01 (0, ∞); such spinors are dense in Hl,m . In γc ψ, Kψ = 2

1 1 1 γc p·σ 1 ψ, ψ + ψ, ψ 2 1/2 |p| |x|1/2 2 |x|1/2 |p| |x|1/2 |x| 1

it follows from (2.17) and (2.18) that the first term is diagonal and the second off-diagonal in f and g . The diagonal term is γc 4π 2

Z Z

f¯(r) f (r0 ) 1 ∗ (ω)l,m, 21 (ω 0 ) d x d x0 + 2 l,m, 21 3/2 3/2 0 r r |x − x0 | R3 R3 Z Z g(r) ¯ g(r0 ) 1 γc 0 0 ∗ + 2 1 (ω)l+1,m,− 1 (ω ) d x d x 2 4π r3/2 r0 3/2 |x − x0 |2 l+1,m,− 2 R3 R3

γc = 2π

2 Z∞ Z∞ ¯ 2 f (r) f (r0 ) r + r0 √ √ Ql d r d r0 2rr0 r r0 0

0

γc + 2π

Z∞ Z∞ 0

0

2 2 r + r0 g(r) ¯ g(r0 ) √ √ Ql+1 d r d r0 2rr0 r r0

from (2.15). The off diagonal term in ψ, Kψ

is

(3.5)

490

A. A. Balinsky, W. D. Evans

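$$\frac{\gamma_c}{8\pi}\int_{\mathbb{R}^3}\!\int_{\mathbb{R}^3} \frac{\bar f(r')}{r'^{3/2}}\,\Omega^*_{l,m,\frac12}(\omega')\,\frac{1}{|x-x'|}\,(p\cdot\sigma)\left\{\frac{g(r)}{r^{3/2}}\,\Omega_{l+1,m,-\frac12}(\omega)\right\}dx\,dx' + \frac{\gamma_c}{8\pi}\int_{\mathbb{R}^3}\!\int_{\mathbb{R}^3} \frac{\bar g(r')}{r'^{3/2}}\,\Omega^*_{l+1,m,-\frac12}(\omega')\,\frac{1}{|x-x'|}\,(p\cdot\sigma)\left\{\frac{f(r)}{r^{3/2}}\,\Omega_{l,m,\frac12}(\omega)\right\}dx\,dx' =: \gamma_c\,(I_1 + I_2) \tag{3.6}$$
say; in (3.6) we have used the fact that $\frac{1}{|p|^2}$ has kernel $\frac{1}{4\pi|x-x'|}$. We shall also need
$$\int_{S^2}\!\int_{S^2} \frac{1}{|x-x'|}\,\Omega^*_{l,m,s}(\omega_x)\,\Omega_{l,m,s}(\omega_{x'})\,d\omega_x\,d\omega_{x'} = \frac{4\pi}{2l+1}\,\frac{r_<^{\,l}}{r_>^{\,l+1}}, \tag{3.7}$$
where $r_< = \min(|x|, |x'|)$ and $r_> = \max(|x|, |x'|)$. In $I_1$, we set $G(r) = g(r)/r^{3/2}$. From (2.18) we have
$$(p\cdot\sigma)\{G(r)\,\Omega_{l+1,m,-\frac12}(\omega)\} = i\,\Omega_{l,m,\frac12}(\omega)\left(\frac{dG}{dr} + \frac{l+2}{r}\,G(r)\right),$$
and, on substituting in (3.6), we get
$$I_1 = \frac{i}{2(2l+1)}\int_0^\infty\!\!\int_0^\infty \frac{\bar f(r')}{r'^{3/2}}\,\frac{r_<^{\,l}}{r_>^{\,l+1}}\left(\frac{dG}{dr} + \frac{l+2}{r}\,G(r)\right) r^2 r'^2\,dr\,dr'$$
$$= \frac{i}{2(2l+1)}\int_0^\infty\!\!\int_0^\infty \bar f(r')\left(\frac{r_<}{r_>}\right)^{l+\frac12}\left(\frac{dg}{dr} + \frac{l+\frac12}{r}\,g(r)\right)dr\,dr'$$
$$= \frac{i}{2}\int_0^\infty\!\!\int_0^r \bar f(r')\,g(r)\left(\frac{r'}{r}\right)^{l+\frac12}\frac{dr\,dr'}{r}. \tag{3.8}$$
Similarly
$$I_2 = -\frac{i}{2}\int_0^\infty\!\!\int_0^r \bar g(r)\,f(r')\left(\frac{r'}{r}\right)^{l+\frac12}\frac{dr\,dr'}{r}. \tag{3.9}$$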

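Hence, for $\psi \in H_{l,m}$ given by (3.4), we have
$$(\psi, K\psi) = \frac{\gamma_c}{2\pi}\int_0^\infty\!\!\int_0^\infty \frac{\bar f(r)}{\sqrt r}\frac{f(r')}{\sqrt{r'}}\,Q_l\!\left(\frac{r^2+r'^2}{2rr'}\right)dr\,dr' + \frac{\gamma_c}{2\pi}\int_0^\infty\!\!\int_0^\infty \frac{\bar g(r)}{\sqrt r}\frac{g(r')}{\sqrt{r'}}\,Q_{l+1}\!\left(\frac{r^2+r'^2}{2rr'}\right)dr\,dr' - \gamma_c\,\mathrm{Im}\int_0^\infty\!\!\int_0^r \bar f(r')\,g(r)\left(\frac{r'}{r}\right)^{l+\frac12}\frac{dr\,dr'}{r}. \tag{3.10}$$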

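In
$$(\psi, \tilde K\psi) = \frac{\gamma_c}{2}\left(\psi, \frac{1}{|x|^{1/2}}\frac{1}{|p|}\frac{1}{|x|^{1/2}}\psi\right) + \frac{\gamma_c}{2}\left(\psi, \frac{1}{|x|^{1/2}}\frac{x\cdot\sigma}{|x|}\frac{1}{|p|}\frac{x\cdot\sigma}{|x|}\frac{1}{|x|^{1/2}}\psi\right) \tag{3.11}$$
the first term is the same as that in $(\psi, K\psi)$, whilst in the second we have from (2.16)
$$\sigma\cdot\frac{x}{|x|}\,\psi(x) = -\frac{1}{r} f(r)\,\Omega_{l+1,m,-\frac12}(\omega) - \frac{1}{r} g(r)\,\Omega_{l,m,\frac12}(\omega).$$
Hence both terms in (3.11) are diagonal and
$$(\psi, \tilde K\psi) = \frac{\gamma_c}{2\pi}\int_0^\infty\!\!\int_0^\infty \frac{\bar f(r)}{\sqrt r}\frac{f(r')}{\sqrt{r'}}\,Q_l\!\left(\frac{r^2+r'^2}{2rr'}\right)dr\,dr' + \frac{\gamma_c}{2\pi}\int_0^\infty\!\!\int_0^\infty \frac{\bar g(r)}{\sqrt r}\frac{g(r')}{\sqrt{r'}}\,Q_{l+1}\!\left(\frac{r^2+r'^2}{2rr'}\right)dr\,dr'$$
$$\quad + \frac{\gamma_c}{2\pi}\int_0^\infty\!\!\int_0^\infty \frac{\bar g(r)}{\sqrt r}\frac{g(r')}{\sqrt{r'}}\,Q_l\!\left(\frac{r^2+r'^2}{2rr'}\right)dr\,dr' + \frac{\gamma_c}{2\pi}\int_0^\infty\!\!\int_0^\infty \frac{\bar f(r)}{\sqrt r}\frac{f(r')}{\sqrt{r'}}\,Q_{l+1}\!\left(\frac{r^2+r'^2}{2rr'}\right)dr\,dr'. \tag{3.12}$$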

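Hence, from (3.10) and (3.12),
$$(\psi, \tilde K\psi) - (\psi, K\psi) = \frac{\gamma_c}{2\pi}\int_0^\infty\!\!\int_0^\infty \frac{\bar g(r)}{\sqrt r}\frac{g(r')}{\sqrt{r'}}\,Q_l\!\left(\frac{r^2+r'^2}{2rr'}\right)dr\,dr' + \frac{\gamma_c}{2\pi}\int_0^\infty\!\!\int_0^\infty \frac{\bar f(r)}{\sqrt r}\frac{f(r')}{\sqrt{r'}}\,Q_{l+1}\!\left(\frac{r^2+r'^2}{2rr'}\right)dr\,dr' + \gamma_c\,\mathrm{Im}\int_0^\infty\!\!\int_0^r \bar f(r')\,g(r)\left(\frac{r'}{r}\right)^{l+\frac12}\frac{dr\,dr'}{r}. \tag{3.13}$$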

We need to prove that the right-hand side of (3.13) is non-negative. For this we use the Mellin transform. Let $2\pi\,\hat Q_l^\sharp(z)$ denote the function on the right-hand side of (2.25). We then have in the first term of (3.10), with $F(r) = \sqrt{r}\,f(r)$,
$$\frac{1}{2\pi}\int_0^\infty\!\!\int_0^\infty \frac{\bar f(r)}{\sqrt r}\frac{f(r')}{\sqrt{r'}}\,Q_l\!\left(\frac{r^2+r'^2}{2rr'}\right)dr\,dr' = \frac{1}{2\pi}\int_0^\infty\!\!\int_0^\infty \bar F(r)\,F(r')\,Q_l\!\left(\frac{r^2+r'^2}{2rr'}\right)\frac{dr\,dr'}{rr'} = \int_{-\infty}^{\infty} |F^\sharp(s)|^2\,\hat Q_l^\sharp(s)\,ds.$$


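Similarly for the second term of (3.10). In (3.8) and (3.9) we set $\Theta(x) = x^{l+1}\chi(x)$, where $\chi$ is the characteristic function of $[0, 1]$. Then, with $G(r) = \sqrt{r}\,g(r)$,
$$I_1 = \frac{i}{2}\int_0^\infty\!\!\int_0^\infty \bar F(r')\,G(r)\,\Theta\!\left(\frac{r'}{r}\right)\frac{dr\,dr'}{rr'} = \frac{i}{2}\int_{-\infty}^{\infty} \overline{F^\sharp(s)}\,G^\sharp(s)\,\Theta^\sharp(s)\,ds,$$
where
$$\Theta^\sharp(s) = \frac{1}{\sqrt{2\pi}}\int_0^1 p^{-is+l}\,dp = \frac{1}{\sqrt{2\pi}}\,(1 + l - is)^{-1}.$$
Similarly,
$$I_2 = -\frac{i}{2}\int_{-\infty}^{\infty} F^\sharp(s)\,\overline{G^\sharp(s)}\;\overline{\Theta^\sharp(s)}\,ds.$$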

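Hence, we can write
$$(\psi, K\psi) = \gamma_c\int_{-\infty}^{\infty}\left(\overline{F^\sharp(s)},\ \overline{G^\sharp(s)}\right)\begin{pmatrix}\hat Q_l^\sharp(s) & \frac{i}{2}\Theta^\sharp(s)\\[2pt] -\frac{i}{2}\overline{\Theta^\sharp(s)} & \hat Q_{l+1}^\sharp(s)\end{pmatrix}\begin{pmatrix}F^\sharp(s)\\ G^\sharp(s)\end{pmatrix}ds. \tag{3.14}$$
Since $(\psi, K\psi) \ge 0$, the smooth $2\times 2$ matrix in (3.14) is pointwise non-negative, and hence so is the matrix
$$\begin{pmatrix}\hat Q_{l+1}^\sharp(s) & -\frac{i}{2}\Theta^\sharp(s)\\[2pt] \frac{i}{2}\overline{\Theta^\sharp(s)} & \hat Q_l^\sharp(s)\end{pmatrix}$$
since the two matrices have the same eigenvalues. Hence
$$\gamma_c\int_{-\infty}^{\infty}\left(\overline{F^\sharp(s)},\ \overline{G^\sharp(s)}\right)\begin{pmatrix}\hat Q_{l+1}^\sharp(s) & -\frac{i}{2}\Theta^\sharp(s)\\[2pt] \frac{i}{2}\overline{\Theta^\sharp(s)} & \hat Q_l^\sharp(s)\end{pmatrix}\begin{pmatrix}F^\sharp(s)\\ G^\sharp(s)\end{pmatrix}ds \ge 0.$$
But this is precisely the right-hand side of (3.13). Hence $(\psi, K\psi) \le (\psi, \tilde K\psi)$ for all the $C_0^1(\mathbb{R}^3)$ functions $\psi$ in $H_{l,m}$ considered, and hence for all $\psi \in H_{l,m}$ since $K$ and $\tilde K$ are bounded. Since $K$ and $\tilde K$ map $H_{l,m}$ into itself, it follows that $K \le \tilde K$ on $\mathcal{H}$ and hence the lemma is proved.

Lemma 3.3. For all $\psi \in L^2(\mathbb{R}^3) \otimes \mathbb{C}^2$,
$$(\psi, \tilde K\psi) \le \|\psi\|^2.$$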

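Proof. First observe from (3.12) that since $Q_0 \ge Q_1 \ge \dots \ge 0$, it is sufficient to prove the result for $\psi \in H_{0,m}$. In (3.12), with $l = 0$, we may use Hilbert's double integral inequality given in [10, Sect. 9.3 (319)], since $\frac{1}{\sqrt r\sqrt{r'}}\,Q_l\!\left(\frac{r^2+r'^2}{2rr'}\right)$ is homogeneous of degree $-1$. We have
$$(\psi, \tilde K\psi) \le \frac{\gamma_c}{2\pi}\left\{k_1\int_0^\infty\left(|f(r)|^2 + |g(r)|^2\right)dr + k_2\int_0^\infty\left(|f(r)|^2 + |g(r)|^2\right)dr\right\},$$
where
$$k_1 = \int_0^\infty Q_0\!\left(\frac12\Big(r + \frac1r\Big)\right)\frac{dr}{r} = \int_0^\infty \ln\left|\frac{r+1}{r-1}\right|\frac{dr}{r} = \frac{\pi^2}{2}$$
and
$$k_2 = \int_0^\infty Q_1\!\left(\frac12\Big(r + \frac1r\Big)\right)\frac{dr}{r} = \int_0^\infty\left\{\frac12\Big(r + \frac1r\Big)\ln\left|\frac{r+1}{r-1}\right| - 1\right\}\frac{dr}{r} = 2.$$
Hence
$$(\psi, \tilde K\psi) \le \int_0^\infty\left(|f(r)|^2 + |g(r)|^2\right)dr = \|\psi\|^2,$$
and the lemma is proved.

Lemma 3.4. Let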

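$$\psi(x) = \sum_{(l,m,s)\in I}\frac{1}{r} f_{l,m,s}(r)\,\Omega_{l,m,s}(\omega),$$
$$\varphi(x) = \frac{1}{r}\left(\sum_{(l,m,s)\in I}|f_{l,m,s}(r)|^2\right)^{1/2}\Omega_{0,\frac12,\frac12}(\omega), \qquad \Omega_{0,\frac12,\frac12}(\omega) = \frac{1}{2\sqrt\pi}\begin{pmatrix}1\\0\end{pmatrix}.$$
Then
$$\|\psi\| = \|\varphi\|, \tag{3.15}$$
$$(\psi, \tilde K\psi) \le (\varphi, \tilde K\varphi), \tag{3.16}$$
$$\int_{\mathbb{R}^3}\frac{1}{\sqrt{|x|}}\,|\psi(x)|\,dx \le \int_{\mathbb{R}^3}\frac{1}{\sqrt{|x|}}\,|\varphi(x)|\,dx. \tag{3.17}$$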

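Proof. The equality (3.15) is obvious. With
$$\psi_{l,m}(x) = \frac1r f_{l,m,\frac12}(r)\,\Omega_{l,m,\frac12}(\omega) + \frac1r f_{l+1,m,-\frac12}(r)\,\Omega_{l+1,m,-\frac12}(\omega),$$
we have from (3.12),
$$(\psi_{l,m}, \tilde K\psi_{l,m}) \le \frac{\gamma_c}{2\pi}\int_0^\infty\!\!\int_0^\infty \frac{1}{\sqrt r\sqrt{r'}}\left\{|f_{l,m,\frac12}(r)|\,|f_{l,m,\frac12}(r')| + |f_{l+1,m,-\frac12}(r)|\,|f_{l+1,m,-\frac12}(r')|\right\}\left[Q_0\!\left(\frac{r^2+r'^2}{2rr'}\right) + Q_1\!\left(\frac{r^2+r'^2}{2rr'}\right)\right]dr\,dr'.$$
Hence
$$(\psi, \tilde K\psi) = \sum_{l,m}(\psi_{l,m}, \tilde K\psi_{l,m}) \le \frac{\gamma_c}{2\pi}\sum_{(l,m,s)\in I}\int_0^\infty\!\!\int_0^\infty \frac{1}{\sqrt r\sqrt{r'}}\,|f_{l,m,s}(r)|\,|f_{l,m,s}(r')|\left[Q_0\!\left(\frac{r^2+r'^2}{2rr'}\right) + Q_1\!\left(\frac{r^2+r'^2}{2rr'}\right)\right]dr\,dr'$$
$$\le \frac{\gamma_c}{2\pi}\int_0^\infty\!\!\int_0^\infty \frac{1}{\sqrt{rr'}}\left(\sum_{(l,m,s)\in I}|f_{l,m,s}(r)|^2\right)^{\frac12}\left(\sum_{(l,m,s)\in I}|f_{l,m,s}(r')|^2\right)^{\frac12}\left[Q_0\!\left(\frac{r^2+r'^2}{2rr'}\right) + Q_1\!\left(\frac{r^2+r'^2}{2rr'}\right)\right]dr\,dr' = (\varphi, \tilde K\varphi),$$
again by (3.12) with $g = 0$, $f = \left(\sum_{(l,m,s)\in I}|f_{l,m,s}|^2\right)^{1/2}$, $l = 0$ and $m = \frac12$.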

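Finally, if $d\mu$ denotes the Haar measure on the orthogonal group $O(3)$, we have for $g \in O(3)$,
$$\int_{\mathbb{R}^3}\frac{1}{\sqrt{|x|}}\,|\psi(x)|\,dx = \int_{\mathbb{R}^3}\frac{1}{\sqrt{|x|}}\,|\psi(gx)|\,dx =: I(g)$$
$$= \int_{O(3)} I(g)\,d\mu(g) = \int_{\mathbb{R}^3}\frac{dx}{\sqrt{|x|}}\int_{O(3)}|\psi(gx)|\,d\mu(g)$$
$$\le \int_{\mathbb{R}^3}\frac{1}{\sqrt{|x|}}\left(\int_{O(3)}|\psi(gx)|^2\,d\mu(g)\right)^{\frac12}dx$$
$$= \frac{1}{2\sqrt\pi}\int_{\mathbb{R}^3}\frac{1}{\sqrt{|x|}}\,\frac1r\left(\sum_{(l,m,s)\in I}|f_{l,m,s}(r)|^2\right)^{\frac12}dx = \int_{\mathbb{R}^3}\frac{1}{\sqrt{|x|}}\,|\varphi(x)|\,dx.$$
The proof is therefore complete.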

Proof of Theorem 3.1. In view of Lemmas 3.2-3.4 and change of scale, it is enough to prove (3.1) with $K$ replaced by $\tilde K$ and $\psi$ of the form
$$\psi(x) = \frac{f(r)}{r}\,\Omega_{0,\frac12,\frac12}(\omega),$$
with $f \ge 0$ and supp $f \subset B(0, 1)$, that is, from (3.12),
$$(\psi, \tilde K\psi) \equiv \frac{\gamma_c}{2\pi}\int_0^\infty\!\!\int_0^\infty \frac{f(r)f(r')}{\sqrt r\sqrt{r'}}\left[Q_0\!\left(\frac{r^2+r'^2}{2rr'}\right) + Q_1\!\left(\frac{r^2+r'^2}{2rr'}\right)\right]dr\,dr' \le \int_0^1 |f(r)|^2\,dr - C_0\left(2\sqrt\pi\int_0^1\sqrt r\,f(r)\,dr\right)^2. \tag{3.18}$$

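Define
$$\tilde f(r) := \begin{cases} f(r), & r \in [0, 1],\\[2pt] \dfrac1r\,f\!\left(\dfrac1r\right), & r > 1.\end{cases}$$
Then
$$2\int_0^1 |f(r)|^2\,dr = \int_0^\infty |\tilde f(r)|^2\,dr \tag{3.19}$$
and, from Lemma 3.3,
$$\frac{\gamma_c}{2\pi}\int_0^\infty\!\!\int_0^\infty \frac{\tilde f(r)\tilde f(r')}{\sqrt r\sqrt{r'}}\left[Q_0\!\left(\frac{r^2+r'^2}{2rr'}\right) + Q_1\!\left(\frac{r^2+r'^2}{2rr'}\right)\right]dr\,dr' \le \int_0^\infty |\tilde f(r)|^2\,dr = 2\int_0^1 |f(r)|^2\,dr. \tag{3.20}$$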

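The left-hand side can be written as
$$2\gamma_c\left\{\frac{1}{2\pi}\int_0^1\!\!\int_0^1 \frac{f(r)f(r')}{\sqrt r\sqrt{r'}}\left[Q_0\!\left(\frac{r^2+r'^2}{2rr'}\right) + Q_1\!\left(\frac{r^2+r'^2}{2rr'}\right)\right]dr\,dr' + \frac{1}{2\pi}\int_0^1\!\!\int_1^\infty \frac{f(r)f(1/r')}{\sqrt r\,r'\sqrt{r'}}\left[Q_0\!\left(\frac{r^2+r'^2}{2rr'}\right) + Q_1\!\left(\frac{r^2+r'^2}{2rr'}\right)\right]dr\,dr'\right\} = 2\gamma_c\,(J_1 + J_2)$$
say. We have
$$J_2 = \frac{1}{2\pi}\int_0^1\!\!\int_0^1 \frac{f(r)f(r')}{\sqrt r\sqrt{r'}}\left[Q_0\!\left(\frac12\Big(rr' + \frac{1}{rr'}\Big)\right) + Q_1\!\left(\frac12\Big(rr' + \frac{1}{rr'}\Big)\right)\right]dr\,dr'$$
$$\ge \frac{1}{2\pi}\int_0^1\!\!\int_0^1 \sqrt r\,f(r)\,\sqrt{r'}\,f(r')\,\frac{1}{rr'}\,Q_0\!\left(\frac12\Big(rr' + \frac{1}{rr'}\Big)\right)dr\,dr' \ge \frac1\pi\int_0^1\!\!\int_0^1 \sqrt r\,f(r)\,\sqrt{r'}\,f(r')\,dr\,dr' \tag{3.21}$$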


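since $\inf_{0\le u\le 1}\frac1u\ln\left(\frac{1+u}{1-u}\right) = 2$. Hence from (3.20) and (3.21),
$$2\int_0^1 |f(r)|^2\,dr \ge 2\,(\psi, \tilde K\psi) + 2\gamma_c\,\frac1\pi\int_0^1\!\!\int_0^1 \sqrt r\,f(r)\,\sqrt{r'}\,f(r')\,dr\,dr',$$
whence
$$(\psi, \tilde K\psi) \le \|\psi\|^2 - \frac{\gamma_c}{\pi}\left(\int_0^1 \sqrt r\,f(r)\,dr\right)^2 = \|\psi\|^2 - C_0\left(\int_{\mathbb{R}^3}|\psi(x)|\,\frac{dx}{\sqrt{|x|}}\right)^2$$
with $C_0 = \frac{\gamma_c}{4\pi^2} = \frac{1}{\pi^3 + 4\pi}$.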
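The constants appearing in Lemma 3.3 and Theorem 3.1 can be checked numerically. The following is a minimal sketch, not part of the proof. It assumes the substitution $r = \tanh(v/2)$ (our choice), under which $k_1 = 2\int_0^\infty v/\sinh v\,dv$ and $k_2 = 2\int_0^\infty (v\coth v - 1)/\sinh v\,dv$, and it takes $\gamma_c = 4\pi/(\pi^2+4)$, the value implied by $C_0 = \gamma_c/(4\pi^2) = 1/(\pi^3+4\pi)$. The function names are hypothetical.

```python
import math


def midpoint_rule(func, lower, upper, steps):
    """Composite midpoint rule; never evaluates func at the endpoints."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))


def k1_numeric(cutoff=40.0, steps=40000):
    # k1 = int_0^inf ln|(r+1)/(r-1)| dr/r, rewritten with r = tanh(v/2).
    return 2.0 * midpoint_rule(lambda v: v / math.sinh(v), 0.0, cutoff, steps)


def k2_numeric(cutoff=40.0, steps=40000):
    # k2 = int_0^inf {(r+1/r)/2 * ln|(r+1)/(r-1)| - 1} dr/r, same substitution.
    return 2.0 * midpoint_rule(
        lambda v: (v / math.tanh(v) - 1.0) / math.sinh(v), 0.0, cutoff, steps
    )


# gamma_c inferred from C0 = gamma_c / (4 pi^2) = 1 / (pi^3 + 4 pi).
gamma_c = 4.0 * math.pi / (math.pi ** 2 + 4.0)
c0 = gamma_c / (4.0 * math.pi ** 2)

# Constant multiplying ||psi||^2 in the proof of Lemma 3.3.
hilbert_constant = gamma_c * (k1_numeric() + k2_numeric()) / (2.0 * math.pi)

print("k1 =", k1_numeric(), "expected", math.pi ** 2 / 2.0)
print("k2 =", k2_numeric(), "expected", 2.0)
print("gamma_c (k1 + k2) / (2 pi) =", hilbert_constant)
print("C0 =", c0)
```

With these values the factor $\gamma_c(k_1+k_2)/(2\pi)$ should come out as 1 up to quadrature error, which is the step that turns the Hilbert inequality into $(\psi,\tilde K\psi)\le\|\psi\|^2$.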

4. Proof of the Main Theorem

Now that we have available Theorem 3.1, we are able to proceed along similar lines to those in [4]. The argument in [4], based on their Lemma 2.3, to justify the sufficiency of taking $Z_k = Z_c$ for all $k$, continues to hold for us. We shall prove that when $\alpha Z_k = \alpha Z = \gamma_c$, $k = 1, \dots, K$, there exist constants $A$ and $C$ such that
$$\left(\psi, B'_{1,K}(Z, R)\,\psi\right) \ge -A\gamma_c\sum_{j\ne k}\frac{1}{|R_j - R_k|}\,\|\psi\|^2 + CK\,\|\psi\|^2 \tag{4.1}$$
for all $\psi \in \Lambda_+ Q^4(|p|)$. We have
$$V(x) := \sum_{k=1}^K\frac{\alpha Z_k}{|x - R_k|} - A\gamma_c\sum_{j\ne k}\frac{1}{|R_j - R_k|} \le \sum_{k=1}^K V_k^2(x) \le \left(\sum_{k=1}^K V_k(x)\right)^2 =: U(x),$$
where
$$V_k^2(x) := \gamma_c\left[\frac{1}{|x - R_k|} - A\sum_{j\ne k}\frac{1}{|R_j - R_k|}\right]_+, \tag{4.2}$$
and the subscript $+$ denotes the positive part. Hence, (4.1) is satisfied if
$$(\psi, D_0\psi) \ge (\psi, U\psi) + CK\,\|\psi\|^2. \tag{4.3}$$

We now translate the problem to one for 2-spinors, using (2.1). In view of Lemma 2.1 (applied $K$ times with the origin shifted to $R_k$, $k = 1, \dots, K$, to accommodate the $K$ nuclei), (4.3) will follow if we can prove that for all $\varphi \in Q^2(|p|)$,
$$\left(\varphi, \left\{|p| - \frac12\left[U + \frac{p\cdot\sigma}{|p|}\,U\,\frac{p\cdot\sigma}{|p|}\right]\right\}\varphi\right) \ge 0. \tag{4.4}$$
Clearly, $\frac{p\cdot\sigma}{|p|}$ commutes with $U + \frac{p\cdot\sigma}{|p|}\,U\,\frac{p\cdot\sigma}{|p|} =: \tilde U$, and if $L$ is the operator $\varphi(x_1, x_2, x_3) \to \sigma_3\,\varphi(x_1, x_2, -x_3)$ we have $L\,\frac{p\cdot\sigma}{|p|} = -\frac{p\cdot\sigma}{|p|}\,L$, and $L\tilde U = \tilde U' L$, where $\tilde U'$ coincides with $\tilde U$ with a minus sign inserted before the third components of the $R_k$. Thus we need only prove (4.4) for $\varphi \in P_+ Q^2(|p|)$, where $P_+$ is the projection onto the eigenspace at 1 for $\frac{p\cdot\sigma}{|p|}$. In other words, it is sufficient to prove that for all $\varphi \in Q^2(|p|)$,
$$(\varphi, |p|\,\varphi) \ge (\varphi, P_+ U P_+\varphi),$$
or, equivalently, for all $u \in \mathcal{H}$,
$$\|u\|^2 \ge \left(u, \frac{1}{\sqrt{|p|}}\,P_+ U P_+\,\frac{1}{\sqrt{|p|}}\,u\right). \tag{4.5}$$

The functions $V_k$ in (4.2) are supported in balls $B(R_k, t_k)$, where
$$t_k^{-1} = A\sum_{j\ne k}\frac{1}{|R_j - R_k|}. \tag{4.6}$$
Hence
$$|R_j - R_k| - (t_j + t_k) \ge \left(1 - \frac2A\right)|R_j - R_k|,$$
so that the balls are disjoint if $A > 2$. Let $\chi_k$ denote the characteristic function of $B(R_k, t_k)$. Then
$$V_k^2(x) = V_k^2(x)\,\chi_k(x) \le \frac{\gamma_c}{|x - R_k|}\,\chi_k(x)$$
and
$$\sqrt{U(x)} \le \gamma_c^{\frac12}\sum_{k=1}^K\left\{|x - R_k|^{-\frac12}\chi_k(x)\right\} =: \sqrt{U_1(x)}.$$
Clearly (4.5) is satisfied if, for all $u \in \mathcal{H}$,
$$\|u\|^2 \ge \left(u, \frac{1}{\sqrt{|p|}}\,P_+ U_1 P_+\,\frac{1}{\sqrt{|p|}}\,u\right).$$
With $T := \frac{1}{\sqrt{|p|}}\,P_+\sqrt{U_1}$, this becomes $(u, TT^*u) \le \|u\|^2$, and so $\|T^*\| \le 1$, which in turn is implied by
$$(u, T^*T u) \le \|u\|^2, \quad u \in \mathcal{H}, \tag{4.7}$$
or
$$\left(u, \sqrt{U_1}\,P_+\frac{1}{|p|}P_+\sqrt{U_1}\,u\right) \le \|u\|^2, \quad u \in \mathcal{H}. \tag{4.8}$$
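The separation of the supporting balls can be illustrated on a concrete configuration. The sketch below is an assumption-laden example rather than part of the argument: the nuclear positions are made up, and the helper names are hypothetical. It computes the radii $t_k$ from (4.6) and checks $|R_j - R_k| - (t_j + t_k) \ge (1 - 2/A)|R_j - R_k|$ for every pair.

```python
import itertools
import math


def ball_radii(nuclei, a_const):
    """Radii t_k with 1/t_k = A * sum_{j != k} 1/|R_j - R_k|, as in (4.6)."""
    radii = []
    for k, r_k in enumerate(nuclei):
        inverse_sum = sum(
            1.0 / math.dist(r_j, r_k) for j, r_j in enumerate(nuclei) if j != k
        )
        radii.append(1.0 / (a_const * inverse_sum))
    return radii


def separation_holds(nuclei, a_const, slack=1e-12):
    """Check the pairwise separation inequality for all j < k."""
    radii = ball_radii(nuclei, a_const)
    for j, k in itertools.combinations(range(len(nuclei)), 2):
        distance = math.dist(nuclei[j], nuclei[k])
        gap = distance - (radii[j] + radii[k])
        if gap < (1.0 - 2.0 / a_const) * distance - slack:
            return False
    return True


# Hypothetical nuclear positions, chosen only for illustration.
sample_nuclei = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (3.0, 1.0, 1.0)]
a_value = 2.0 + math.sqrt(1.0 + math.pi / 2.0)

print(ball_radii(sample_nuclei, a_value))
print(separation_holds(sample_nuclei, a_value))
```

The inequality holds for any configuration because each $t_k$ is at most $|R_j - R_k|/A$; the example only makes that visible.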

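Thus, we need to prove that for all $u \in \mathcal{H}$,
$$\|u\|^2 \ge \gamma_c\sum_{j,k=1}^K\left(u, \chi_j(x)\frac{1}{|x - R_j|^{\frac12}}\,P_+\frac{1}{|p|}P_+\,\frac{1}{|x - R_k|^{\frac12}}\chi_k(x)\,u\right)$$
$$= \gamma_c\sum_{j=1}^K\left(u, \chi_j(x)\frac{1}{|x - R_j|^{\frac12}}\,P_+\frac{1}{|p|}P_+\,\frac{1}{|x - R_j|^{\frac12}}\chi_j(x)\,u\right) + \gamma_c\sum_{j\ne k}\left(u, \chi_j(x)\frac{1}{|x - R_j|^{\frac12}}\,P_+\frac{1}{|p|}P_+\,\frac{1}{|x - R_k|^{\frac12}}\chi_k(x)\,u\right) = I_1 + I_2$$
say. By Theorem 3.1, with $C_0 = (\pi^3 + 4\pi)^{-1}$,
$$I_1 \le \sum_{j=1}^K\left\{\|\chi_j u\|^2 - \frac{C_0}{t_j^2}\left(\int\frac{1}{|x - R_j|^{\frac12}}\,|(\chi_j u)(x)|\,dx\right)^2\right\}.$$
In $I_2$,
$$P_+\frac{1}{|p|}P_+ = \frac12\left(\frac{1}{|p|} + \frac{p\cdot\sigma}{|p|^2}\right);$$
$\frac{1}{|p|}$, $\frac{p\cdot\sigma}{|p|^2}$ have kernels $\frac{1}{2\pi^2|x-y|^2}$, $\frac{i\,(x-y)\cdot\sigma}{4\pi|x-y|^3}$ respectively. Thus
$$I_2 \le \frac{1}{4\pi}\left(\frac1\pi + \frac12\right)\gamma_c\sum_{j\ne k}\int_{\mathbb{R}^3}\!\int_{\mathbb{R}^3}\frac{1}{|x - R_j|^{\frac12}}\,|(\chi_j u)(x)|\,\frac{1}{|x-y|^2}\,\frac{1}{|y - R_k|^{\frac12}}\,|(\chi_k u)(y)|\,dx\,dy$$
$$\le \frac{1}{4\pi}\left(\frac1\pi + \frac12\right)\gamma_c\left(1 - \frac2A\right)^{-2}\sum_{j\ne k}|R_j - R_k|^{-2}\,m_j m_k$$
by (4.7), where
$$m_j := \int_{\mathbb{R}^3}\frac{\chi_j(x)\,|u(x)|}{|x - R_j|^{\frac12}}\,dx.$$
Thus
$$I_1 + I_2 \le \|u\|^2 - C_0\sum_{j=1}^K\frac{1}{t_j^2}\,m_j^2 + \frac{1}{4\pi}\left(\frac1\pi + \frac12\right)\gamma_c\left(1 - \frac2A\right)^{-2}\sum_{j\ne k}|R_j - R_k|^{-2}\,\frac12\left(m_j^2 + m_k^2\right) \le \|u\|^2$$
if
$$\frac{C_0}{t_j^2} - \frac{1}{4\pi}\left(\frac1\pi + \frac12\right)\gamma_c\left(1 - \frac2A\right)^{-2}\sum_{k\ne j}|R_j - R_k|^{-2} \ge 0$$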

for all $j$. On substituting (4.6), the last inequality is satisfied if
$$A^2 \ge \left(1 + \frac\pi2\right)\left(1 - \frac2A\right)^{-2},$$
that is $A \ge 2 + \sqrt{1 + \pi/2}$. We have therefore established (4.1), and can take $A = 2 + \sqrt{1 + \pi/2}$. Consequently, we have
$$B_{1,K}(Z, R) \ge -2A\gamma_c\sum_{j<k}\frac{1}{|R_k - R_j|} + \alpha Z_c^2\sum_{j<k}\frac{1}{|R_k - R_j|} + CK \ge CK$$
if $2\alpha\left[2 + \sqrt{1 + \pi/2}\right] \le \gamma_c$, that is
$$\alpha \le \frac{2\pi}{(\pi^2 + 4)\left(2 + \sqrt{1 + \pi/2}\right)} \approx 0.125721;$$
this includes the physical value of $\alpha$. The proof is therefore complete.
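The closing arithmetic can be reproduced in a few lines. This is only a numerical sanity check under two assumptions: $\gamma_c = 4\pi/(\pi^2+4)$, as implied by $C_0 = \gamma_c/(4\pi^2) = 1/(\pi^3+4\pi)$, and an approximate physical fine-structure constant of $1/137.036$, which is not quoted in the text.

```python
import math

# Smallest admissible A: equality in A^2 >= (1 + pi/2) (1 - 2/A)^(-2),
# equivalently (A - 2)^2 = 1 + pi/2.
a_const = 2.0 + math.sqrt(1.0 + math.pi / 2.0)

# gamma_c inferred from C0 = gamma_c / (4 pi^2) = 1 / (pi^3 + 4 pi).
gamma_c = 4.0 * math.pi / (math.pi ** 2 + 4.0)

# Stability condition 2 * alpha * A <= gamma_c.
alpha_max = gamma_c / (2.0 * a_const)
closed_form = 2.0 * math.pi / ((math.pi ** 2 + 4.0) * a_const)

# Assumed approximate physical value of the fine-structure constant.
alpha_physical = 1.0 / 137.036

print("A         =", a_const)
print("alpha_max =", alpha_max)
print("physical alpha within range:", alpha_physical <= alpha_max)
```

The two expressions for the bound agree, and the assumed physical value sits well below it.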

Acknowledgement. The authors are grateful to the European Union for support under the TMR grant FMRXCT 96-0001. They also gratefully acknowledge the hospitality of the Erwin Schrödinger Institute during the Workshop on Schrödinger Operators with Magnetic Fields, 2–12 June 1998. Helpful comments of G. Hoever and N. Röhrl were greatly appreciated.

References

1. Balinsky, A.A. and Evans, W.D.: On the virial theorem for the relativistic operator of Brown and Ravenhall, and the absence of embedded eigenvalues. Lett. Math. Phys. 44, 233–248 (1998)
2. Brown, G.E. and Ravenhall, D.G.: On the interaction of two electrons. Proc. Roy. Soc. London A 208, 552–559 (1952)
3. Burenkov, V.I. and Evans, W.D.: On the evaluation of the norm of an integral operator associated with the stability of one-electron atoms. Proc. Roy. Soc. Edinburgh 128 A, 993–1005 (1998)
4. Daubechies, I. and Lieb, E.H.: One-electron relativistic molecules with Coulomb interaction. Commun. Math. Phys. 90, 497–510 (1983)
5. Evans, W.D., Perry, P. and Siedentop, H.: The spectrum of relativistic one-electron atoms according to Bethe and Salpeter. Commun. Math. Phys. 178, 733–746 (1996)
6. Flügge, S.: Practical Quantum Mechanics I. Volume 177 of Grundlehren der mathematischen Wissenschaften, Berlin: Springer-Verlag, 1st edition, 1982
7. Greiner, W.: Relativistic Quantum Mechanics. Volume 3 of Theoretical Physics – Text and Exercise Books, Berlin: Springer, 1st edition, 1990
8. Griesemer, M. and Siedentop, H.: A minimax principle for the eigenvalues in spectral gaps. To appear in J. London Math. Soc.
9. Hardekopf, G. and Sucher, J.: Critical coupling constants for relativistic wave equations and vacuum breakdown in quantum electrodynamics. Phys. Rev. A 31 (4), 2020–2029 (1985)
10. Hardy, G.H., Littlewood, J.E. and Polya, G.: Inequalities. 2nd edition, Cambridge: Cambridge University Press, 1952
11. Herbst, I.: Spectral theory of the operator $(p^2 + m^2)^{1/2} - Ze^2/r$. Commun. Math. Phys. 53, 285–294 (1977)
12. Magnus, W., Oberhettinger, F. and Soni, R.: Formulas and theorems for the special functions of mathematical physics. New York: Springer-Verlag, 1966
13. Messiah, A.: Mécanique Quantique. Volume 1, Paris: Dunod, 2nd edition, 1969
14. Stegun, I.A.: Legendre functions. In: Milton Abramowitz and Irene A. Stegun, editors, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, chapter 8, New York: Dover Publications, 1965, pp. 331–353
15. Tix, C.: Strict positivity of a relativistic Hamiltonian due to Brown and Ravenhall. Bull. London Math. Soc. 30, 283–290 (1998)
16. Tix, C.: Self-adjointness and spectral properties of a pseudo-relativistic Hamiltonian due to Brown and Ravenhall. Preprint mp-arc/97-441

Communicated by A. Jaffe

Commun. Math. Phys. 202, 501 – 515 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Non-Topological Solutions of the Relativistic SU(3) Chern–Simons Higgs Model*

Guofang Wang**, Liqun Zhang

Institute of Mathematics, Academia Sinica, Beijing 100080, P. R. China

Received: 11 July 1998 / Accepted: 1 November 1998

* Partially supported by NNSF of China
** Current address: Max-Planck Institute for Mathematics in the Sciences, Inselstrasse 22–26, D-04103 Leipzig, Germany

Abstract: In this paper, we obtain non-topological solutions of the self-dual equations arising in the relativistic SU(3) Chern–Simons Higgs model.

1. Introduction

In 1990, motivated largely by the physics of high-temperature superconductivity, the relativistic self-dual Abelian Chern–Simons Higgs model was introduced independently by Hong–Kim–Pac [HKP] and Jackiw–Weinberg [JW]. Since then, the existence of solutions of this model has been well understood. The existence of topological solutions of the corresponding self-dual equations (Bogomol'nyi type equations) was established in [W] by a variational method, which was used first by Taubes (see [JT]), and in [SY1] by an iteration approach. Unlike the non-relativistic model, the relativistic self-dual Abelian Chern–Simons Higgs model admits non-topological solutions, which were first obtained rigorously by Spruck–Yang in [SY2] for the one vertex case (see also [CHMY]). Very recently, Chae–Imanuvilov [CI1] constructed a general type of non-topological multivortex solutions. On the other hand, two different kinds of doubly periodic solutions were obtained, at least for one and two vortices, in [CY, Ta, DJLW1 and NT] (see also [DJLW2]). One solution resembles the topological solution in the non-compact setting and another resembles the non-topological one.

In this paper, we are interested in the relativistic non-Abelian Chern–Simons Higgs model which was studied earlier by Kao–Lee [KL] and Dunne [D1] (see also [L1, L2, D2 and DJPT]). Unlike the Abelian case, the self-dual non-Abelian Chern–Simons Higgs equations are, in general, reduced to a system of differential equations. Hence the results or methods developed for the Abelian case may not apply. Very recently, Yang [Y] developed a powerful variational principle and obtained the topological solutions of the relativistic self-dual Chern–Simons model for the non-Abelian group admitting a symmetric Cartan matrix, e.g., $SU(N)$, $N \ge 3$. In this paper, we will extend the method in [CI1] (see also [CI2]) to obtain the non-topological solutions of the relativistic self-dual $SU(3)$ Chern–Simons model. We first give a Green function representation of linear Liouville systems. Then we apply the Banach contraction mapping theorem to obtain a solution. These solutions are close, in some sense, to the solutions of the Liouville systems.

2. The Relativistic SU(3) Chern–Simons Higgs Model

The relativistic non-Abelian Chern–Simons Higgs model in (2+1) dimensions studied in [KL] and [D1] is described by the following Lagrangian density:
$$\mathcal{L} = -\kappa\,\epsilon^{\mu\nu\rho}\,\mathrm{tr}\!\left(\partial_\mu A_\nu A_\rho + \frac23 A_\mu A_\nu A_\rho\right) - \mathrm{tr}\!\left((D_\mu\phi)^\dagger D^\mu\phi\right) - V(\phi, \phi^\dagger) \tag{2.1}$$

for a Higgs field $\phi$ in the adjoint representation of the compact gauge group $G$, where the associated semi-simple Lie algebra is denoted by $\mathcal{G}$ and the $\mathcal{G}$-valued gauge field $A_\alpha$ on 2 + 1 dimensional Minkowski space $\mathbb{R}^{1,2}$ with metric diag$\{-1, 1, 1\}$. Here $\kappa > 0$ is the Chern–Simons coupling parameter, tr is the trace in the matrix representation of $\mathcal{G}$ and $V$ is the potential energy density of the Higgs field given by
$$V(\phi, \phi^\dagger) = \frac{1}{4\kappa^2}\,\mathrm{tr}\!\left(([[\phi, \phi^\dagger], \phi] - v^2\phi)^\dagger\,([[\phi, \phi^\dagger], \phi] - v^2\phi)\right),$$
where $v > 0$ is a constant which measures either the scale of the broken symmetry or the subcritical temperature of the system. For simplicity, from now on, we denote $\phi^\dagger\phi = |\phi|^2$. In this paper, we are only interested in stationary solutions of the Euler–Lagrange equation of $\mathcal{L}$. The energy functional corresponding to the Lagrangian is
$$E(\phi, A) = \int ds^2\,\{\mathrm{tr}|D_0\phi|^2 + \mathrm{tr}|D_i\phi|^2 + V(\phi, \phi^\dagger)\}, \tag{2.2}$$
supplemented by the Gauss law
$$[\phi^\dagger, D_0\phi] - [(D_0\phi)^\dagger, \phi] = 2\kappa F_{12}. \tag{2.3}$$
With the help of the Gauss law (2.3), the energy functional can be rewritten as
$$E(\phi, A) = \int ds^2\,\left\{\mathrm{tr}\left|D_0\phi + \frac{i}{\kappa}\left([\phi, [\phi^\dagger, \phi]] - v^2\phi\right)\right|^2 + \mathrm{tr}|(D_1 + iD_2)\phi|^2\right\} + \frac{v^2}{\kappa}\int ds^2\,\rho_Q, \tag{2.4}$$
where $\rho_Q = \mathrm{tr}((D_0\phi)^\dagger\phi - \phi^\dagger D_0\phi) = \kappa F_{12}$. The energy functional may achieve its absolute minimum, which is a solution of the following self-dual Chern–Simons Higgs system
$$\begin{cases} D_+\phi = 0,\\[2pt] F_{+-} = -\dfrac{1}{\kappa}\,[[\phi, [\phi^\dagger, \phi]] - v^2\phi,\ \phi^\dagger],\end{cases} \tag{2.5}$$

where $D_+ = D_1 + iD_2$ and $F_{+-} = \partial_- A_+ - \partial_+ A_- + [A_-, A_+]$ with $A_\pm = A_1 \pm iA_2$ and $\partial_\pm = \partial_1 \pm i\partial_2$. Certainly, a solution of (2.5) (with the Gauss law (2.3)) is a solution of the Euler–Lagrange equations of $\mathcal{L}$. The equations of (2.5), however, are difficult to handle. The only easy case is the corresponding zero energy solution, which satisfies the algebraic equation $[[\phi, \phi^\dagger], \phi] = v^2\phi$. Here, we are interested in a simplified form of this self-dual system proposed by Dunne in [D2] (see also [KL, L and D1]). Let $\{H_a\}_{1\le a\le r}$ and $\{E_n\}_{1\le n\le s}$ refer to the Cartan subalgebra generators and the simple root step operators of $\mathcal{G}$, respectively ($r$ is the rank of $\mathcal{G}$ and $s = \frac12(\dim\mathcal{G} - r)$), satisfying
$$[H_a, H_b] = 0, \qquad [H_a, E_{\pm n}] = \pm\alpha_n^a E_{\pm n}, \qquad [E_n, E_{-n}] = \sum_{a=1}^r\alpha_n^a H_a, \qquad [E_n, E_{n'}] = N_{nn'}E_{n+n'},$$
where $\alpha_n = (\alpha_n^1, \alpha_n^2, \cdots, \alpha_n^r)$ ($n = \pm1, \pm2, \cdots, \pm s$) are the root vectors and the $N_{nn'}$ are suitable constants. Under the following ansatz proposed in [D2]
$$\phi = \sum_{a=1}^r\phi^a E_a \quad\text{and}\quad A_\mu = i\sum_{a=1}^r A_\mu^a H_a, \tag{2.6}$$

the system (2.5) is reduced to a system describing $r$ Abelian Chern–Simons gauge fields $A_\mu^a$ coupled to $r$ complex scalar fields $\phi^a$. The Lagrangian density is reduced to
$$\mathcal{L}_{res} = -\sum_{a=1}^r\left|\partial_\mu\phi^a + i\Big(\sum_{b=1}^r A_\mu^b\alpha_a^b\Big)\phi^a\right|^2 - \kappa\sum_{a=1}^r\epsilon^{\alpha\beta\gamma}\partial_\alpha A_\beta^a A_\gamma^a - V, \tag{2.7}$$
where the potential $V$ becomes
$$\frac{v^4}{4\kappa^2}\sum_{a=1}^r|\phi^a|^2 - \frac{v^2}{2\kappa^2}\sum_{a,b=1}^r|\phi^a|^2 K_{ab}|\phi^b|^2 + \frac{1}{4\kappa^2}\sum_{a,b,c=1}^r|\phi^a|^2 K_{ab}|\phi^b|^2 K_{bc}|\phi^c|^2.$$
Here $(K_{ab})$ is the Cartan matrix of the Lie algebra $\mathcal{G}$. So we need only solve the following equations:
$$\partial_+\partial_-\log|\phi^a|^2 = -\frac{v^2}{\kappa^2}\sum_{b=1}^r K_{ab}|\phi^b|^2 + \frac{1}{\kappa^2}\sum_{b,c=1}^r|\phi^b|^2 K_{bc}|\phi^c|^2 K_{ac} \tag{2.8}$$
(see [D1]). When $(K_{ab})$ is symmetric, Yang [Y] obtained the topological solutions by a variational approach. In fact, he considered a more complicated system. In this paper, we are only interested in the case $G = SU(N)$, $N \ge 3$, whose Cartan matrix is an $(N-1)\times(N-1)$ matrix having the following form:
$$\begin{pmatrix} 2 & -1 & 0 & \cdots & \cdots & 0\\ -1 & 2 & -1 & 0 & \cdots & 0\\ 0 & -1 & 2 & -1 & \cdots & 0\\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\ 0 & \cdots & \cdots & -1 & 2 & -1\\ 0 & \cdots & \cdots & 0 & -1 & 2\end{pmatrix}.$$
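The reduction from the general right-hand side of (2.8) to the $SU(3)$ system can be checked mechanically. The sketch below is illustrative only and makes several assumptions: it sets $v = \kappa = 1$, writes $x_b$ for $|\phi^b|^2$ (equivalently $e^{u_b}$), takes the linear term with a minus sign so that the result matches the $SU(3)$ equations (2.9), and ignores the delta-function sources. The function names are hypothetical.

```python
def cartan_matrix_su(n):
    """Cartan matrix of SU(n): 2 on the diagonal, -1 next to it, 0 elsewhere."""
    size = n - 1
    return [
        [2 if i == j else -1 if abs(i - j) == 1 else 0 for j in range(size)]
        for i in range(size)
    ]


def reduced_rhs(cartan, x):
    """-sum_b K_ab x_b + sum_{b,c} x_b K_bc x_c K_ac, for each index a."""
    size = len(x)
    values = []
    for a in range(size):
        linear = -sum(cartan[a][b] * x[b] for b in range(size))
        quadratic = sum(
            x[b] * cartan[b][c] * x[c] * cartan[a][c]
            for b in range(size)
            for c in range(size)
        )
        values.append(linear + quadratic)
    return values


def su3_rhs(x1, x2):
    """Smooth part of the SU(3) equations, with x1 = e^{u1}, x2 = e^{u2}."""
    first = -2 * x1 + x2 + 4 * x1 ** 2 - 2 * x2 ** 2 - x1 * x2
    second = x1 - 2 * x2 - 2 * x1 ** 2 + 4 * x2 ** 2 - x1 * x2
    return [first, second]


print(cartan_matrix_su(3))
print(reduced_rhs(cartan_matrix_su(3), [0.3, 0.7]))
print(su3_rhs(0.3, 0.7))
```

For $N = 3$ the two evaluations agree, which is the content of the reduction to a two-component system.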

When $N = 3$, $K = \begin{pmatrix} 2 & -1\\ -1 & 2\end{pmatrix}$. In this case, (2.8) can be reduced to the following equations:
$$\begin{cases}\Delta u_1 = -2e^{u_1} + e^{u_2} + 4e^{2u_1} - 2e^{2u_2} - e^{u_1+u_2} + 4\pi\sum_{j=1}^{N_1}\delta_{p_j},\\[2pt] \Delta u_2 = e^{u_1} - 2e^{u_2} - 2e^{2u_1} + 4e^{2u_2} - e^{u_1+u_2} + 4\pi\sum_{j=1}^{N_2}\delta_{q_j}.\end{cases} \tag{2.9}$$
In this paper, we will prove the existence of the non-topological solutions of the self-dual non-Abelian Chern–Simons Higgs model.

Theorem 1. Let $\{p_j\}_{j=1}^{N_1}, \{q_j\}_{j=1}^{N_2} \subset \mathbb{R}^2$ and $\beta \in (0, 1)$. There exists a solution $(\phi, A)$ (with $\phi = (\phi_1, \phi_2)$) of (2.5) such that $\phi_1$ has the zeros $\{p_j\}_{j=1}^{N_1}$ and $\phi_2$ has the zeros $\{q_j\}_{j=1}^{N_2}$ and $E(\phi, A) < \infty$. Moreover, as $|x| \to \infty$,
$$\left(|\phi_1|^2 + |\phi_2|^2 + |F_{12}|^2 + |D_i\phi_1|^2 + |D_i\phi_2|^2\right)(x) \le O\!\left(|x|^{-(2\min\{N_1, N_2\} + 4 - \beta)}\right).$$
Obviously, the methods presented here can be generalized to deal with the group $G = SU(N)$, $N \ge 4$.

3. The Liouville System and Green Function

We consider the following Liouville type system
$$\begin{cases} -\Delta u_1 = 2e^{u_1} - e^{u_2},\\ -\Delta u_2 = -e^{u_1} + 2e^{u_2}\end{cases} \quad\text{in } \mathbb{R}^2, \tag{3.1}$$

which is also called the Toda system. Kostant [Ko] and Leznov–Saveliev [LS1, LS2] showed that, like the classical Liouville equation [Li], general solutions of (3.1) can be expressed in terms of two arbitrary holomorphic functions. Here we use a simpler form (see [D1]). Let $f_1$ and $f_2$ be two holomorphic functions. Set $g(z) = f_2'(z)/f_1'(z)$, where $f'(z) = \frac{\partial f}{\partial z}(z)$. Then $(w_1, w_2)$ defined by
$$\begin{cases} w_1 = \log\dfrac{4\,(1 + |g|^2 + |f_2 - f_1 g|^2)\,|f_1'|^2}{(1 + |f_1|^2 + |f_2|^2)^2},\\[8pt] w_2 = \log\dfrac{4\,(1 + |f_1|^2 + |f_2|^2)\,|g'|^2}{(1 + |g|^2 + |f_2 - f_1 g|^2)^2}\end{cases} \tag{3.2}$$
satisfies (3.1) away from singularities.

Let $\{p_j\}_{j=1}^{N_1}, \{q_j\}_{j=1}^{N_2} \subset \mathbb{R}^2$. We define $f_1$ and $f_2$ as follows:
$$f_1 = \int_0^z f_1'(t)\,dt \quad\text{with}\quad f_1'(z) = \prod_{j=1}^{N_1}(z - p_j), \tag{3.3}$$
and
$$f_2 = \int_0^z f_2'(t)\,dt \quad\text{with}\quad f_2' = f_1'\,g, \tag{3.4}$$
where
$$g = \int_0^z g'(t)\,dt \quad\text{with}\quad g'(z) = \prod_{j=1}^{N_2}(z - q_j). \tag{3.5}$$
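The construction (3.3)–(3.5) and the formula (3.2) can be tested numerically. The sketch below is an illustration under stated assumptions, not the authors' procedure: polynomials are stored as coefficient lists, the Laplacian is approximated by a five-point stencil, and the equations are checked in the form $\Delta w_1 = -2e^{w_1} + e^{w_2}$, $\Delta w_2 = e^{w_1} - 2e^{w_2}$, which matches the leading terms of (2.9). The zeros and the evaluation point are arbitrary choices, and all function names are hypothetical.

```python
import math


def poly_from_roots(roots):
    """Coefficients (ascending powers) of prod_j (z - root_j)."""
    coeffs = [1 + 0j]
    for root in roots:
        shifted = [0j] + coeffs                      # z * p(z)
        scaled = [-root * c for c in coeffs] + [0j]  # -root * p(z)
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs


def poly_mul(a, b):
    out = [0j] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_primitive(coeffs):
    """Primitive that vanishes at z = 0."""
    return [0j] + [c / (k + 1) for k, c in enumerate(coeffs)]


def poly_eval(coeffs, z):
    value = 0j
    for c in reversed(coeffs):
        value = value * z + c
    return value


def toda_pair(p_zeros, q_zeros):
    """Return z -> (w1, w2) built from (3.2) with f1, f2, g as in (3.3)-(3.5)."""
    df1 = poly_from_roots(p_zeros)
    dg = poly_from_roots(q_zeros)
    f1 = poly_primitive(df1)
    g = poly_primitive(dg)
    f2 = poly_primitive(poly_mul(df1, g))

    def w(z):
        f1z, f2z, gz = poly_eval(f1, z), poly_eval(f2, z), poly_eval(g, z)
        a = 1 + abs(f1z) ** 2 + abs(f2z) ** 2
        b = 1 + abs(gz) ** 2 + abs(f2z - f1z * gz) ** 2
        w1 = math.log(4 * b * abs(poly_eval(df1, z)) ** 2 / a ** 2)
        w2 = math.log(4 * a * abs(poly_eval(dg, z)) ** 2 / b ** 2)
        return w1, w2

    return w


def laplacian(func, z, h=1e-4):
    """Five-point finite-difference Laplacian in the plane."""
    return (func(z + h) + func(z - h) + func(z + 1j * h) + func(z - 1j * h)
            - 4 * func(z)) / h ** 2


def toda_residuals(w, z):
    w1 = lambda t: w(t)[0]
    w2 = lambda t: w(t)[1]
    e1, e2 = math.exp(w1(z)), math.exp(w2(z))
    return laplacian(w1, z) + 2 * e1 - e2, laplacian(w2, z) - e1 + 2 * e2


sample = toda_pair([1.0, -0.5 + 0.3j], [0.2j])
print(toda_residuals(sample, 0.7 + 0.4j))
```

Both residuals should be zero up to discretisation error at any point away from the prescribed zeros $p_j$ and $q_j$.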

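For $\mu_1, \mu_2 > 0$, set
$$\begin{cases}\rho^1_{\mu_1,\mu_2} = \log\dfrac{4\,(1 + \mu_2|g|^2 + \mu_1\mu_2|f_2 - f_1 g|^2)\,\mu_1|f_1'|^2}{(1 + \mu_1|f_1|^2 + \mu_1\mu_2|f_2|^2)^2},\\[8pt] \rho^2_{\mu_1,\mu_2} = \log\dfrac{4\,(1 + \mu_1|f_1|^2 + \mu_1\mu_2|f_2|^2)\,\mu_2|g'|^2}{(1 + \mu_2|g|^2 + \mu_1\mu_2|f_2 - f_1 g|^2)^2}\end{cases} \tag{3.6}$$
with $f_1$ and $f_2$ satisfying (3.3)–(3.5). Clearly, $\rho^1_{\mu_1,\mu_2}$ and $\rho^2_{\mu_1,\mu_2}$ satisfy the equations
$$\begin{cases} -\Delta\rho^1 = 2e^{\rho^1} - e^{\rho^2} - 4\pi\sum_{j=1}^{N_1}\delta(z - p_j),\\[2pt] -\Delta\rho^2 = -e^{\rho^1} + 2e^{\rho^2} - 4\pi\sum_{j=1}^{N_2}\delta(z - q_j)\end{cases} \tag{3.7}$$
with $\rho^1, \rho^2 \to -\infty$ as $|z| \to +\infty$. Now we want to find Green's function for the system (3.7). Let $x \in \mathbb{R}^2$ and $\epsilon > 0$. We set
$$f_1^\epsilon(z, x) = \int_0^z f_1'(t)\,(t - x)^\epsilon\,dt \quad\text{and}\quad f_2^\epsilon = \int_0^z f_1'(t)\,g(t)\,(t - x)^\epsilon\,dt,$$
where $z^\epsilon = e^{\epsilon\log z} = e^{\epsilon(\log r + i\theta)}$, and $f_1'$ and $g$ are defined in (3.3) and (3.5). One can check directly that $\rho^{1,\epsilon}_{\mu_1,\mu_2}$ and $\rho^{2,\epsilon}_{\mu_1,\mu_2}$, defined by (3.6) using $f_1^\epsilon$ and $f_2^\epsilon$, satisfy
$$\begin{cases} -\Delta\rho^{1,\epsilon} = 2e^{\rho^{1,\epsilon}} - e^{\rho^{2,\epsilon}} - 4\pi\sum_{j=1}^{N_1}\delta(z - p_j) - 4\pi\epsilon\,\delta(z - x),\\[2pt] -\Delta\rho^{2,\epsilon} = -e^{\rho^{1,\epsilon}} + 2e^{\rho^{2,\epsilon}} - 4\pi\sum_{j=1}^{N_2}\delta(z - q_j).\end{cases} \tag{3.8}$$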

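Let $(G_{1,1}(z, x), G_{1,2}(z, x)) = (G_{1,1}^{\mu_1,\mu_2}(z, x), G_{1,2}^{\mu_1,\mu_2}(z, x))$ be defined by
$$G_{1,1}(z, x) = \frac{\partial}{\partial\epsilon}\Big|_{\epsilon=0}\rho^{1,\epsilon} \quad\text{and}\quad G_{1,2}(z, x) = \frac{\partial}{\partial\epsilon}\Big|_{\epsilon=0}\rho^{2,\epsilon}. \tag{3.9}$$
One can easily check

Lemma 1. For $\mu_1, \mu_2 > 0$, the functions $G_{1,1}(z, x)$ and $G_{1,2}(z, x)$ defined by (3.9) satisfy
$$\begin{cases}\Delta G_{1,1} = -2e^{\rho^1}G_{1,1} + e^{\rho^2}G_{1,2} + 4\pi\delta(z - x),\\[2pt] \Delta G_{1,2} = e^{\rho^1}G_{1,1} - 2e^{\rho^2}G_{1,2}.\end{cases} \tag{3.10}$$

Here and in the following proof, we will omit subscripts $\mu_1$ and $\mu_2$ if there is no confusion. It is easy to check that
$$G_{1,1}(z, x) = 2\log|z - x| + \frac{\mu_1\mu_2\,\mathrm{Re}\left\{\overline{(f_2 - f_1 g)}\left(\frac{\partial f_2^\epsilon}{\partial\epsilon}\big|_{\epsilon=0} - g\,\frac{\partial f_1^\epsilon}{\partial\epsilon}\big|_{\epsilon=0}\right)\right\}}{1 + \mu_2|g|^2 + \mu_1\mu_2|f_2 - g f_1|^2} - 2\,\frac{\mathrm{Re}\left\{\mu_1\bar f_1\frac{\partial f_1^\epsilon}{\partial\epsilon}\big|_{\epsilon=0} + \mu_1\mu_2\bar f_2\frac{\partial f_2^\epsilon}{\partial\epsilon}\big|_{\epsilon=0}\right\}}{1 + \mu_1|f_1|^2 + \mu_1\mu_2|f_2|^2}, \tag{3.11}$$
$$G_{1,2}(z, x) = \frac{\mathrm{Re}\left\{\mu_1\bar f_1\frac{\partial f_1^\epsilon}{\partial\epsilon}\big|_{\epsilon=0} + \mu_1\mu_2\bar f_2\frac{\partial f_2^\epsilon}{\partial\epsilon}\big|_{\epsilon=0}\right\}}{1 + \mu_1|f_1|^2 + \mu_1\mu_2|f_2|^2} - 2\,\frac{\mu_1\mu_2\,\mathrm{Re}\left\{\overline{(f_2 - f_1 g)}\left(\frac{\partial f_2^\epsilon}{\partial\epsilon}\big|_{\epsilon=0} - g\,\frac{\partial f_1^\epsilon}{\partial\epsilon}\big|_{\epsilon=0}\right)\right\}}{1 + \mu_2|g|^2 + \mu_1\mu_2|f_2 - g f_1|^2}, \tag{3.12}$$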

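with
$$\frac{\partial f_1^\epsilon}{\partial\epsilon}\Big|_{\epsilon=0} = \int_0^z f_1'(t)\log|t - x|\,dt \quad\text{and}\quad \frac{\partial f_2^\epsilon}{\partial\epsilon}\Big|_{\epsilon=0} = \int_0^z f_1'(t)\,g(t)\log|t - x|\,dt. \tag{3.13}$$
Similarly, we define
$$G_{2,1} = -2\,\frac{\mathrm{Re}\left\{\mu_1\mu_2\bar f_2\frac{\partial f_2^\epsilon}{\partial\epsilon}\big|_{\epsilon=0}\right\}}{1 + \mu_1|f_1|^2 + \mu_1\mu_2|f_2|^2} + \frac{\mathrm{Re}\left\{\mu_2\bar g\,\frac{\partial g^\epsilon}{\partial\epsilon}\big|_{\epsilon=0} + \mu_1\mu_2\overline{(f_2 - f_1 g)}\left(\frac{\partial f_2^\epsilon}{\partial\epsilon}\big|_{\epsilon=0} - f_1\frac{\partial g^\epsilon}{\partial\epsilon}\big|_{\epsilon=0}\right)\right\}}{1 + \mu_2|g|^2 + \mu_1\mu_2|f_2 - f_1 g|^2}, \tag{3.14}$$
$$G_{2,2} = 2\log|z - x| + \frac{\mathrm{Re}\left\{\mu_1\mu_2\bar f_2\frac{\partial f_2^\epsilon}{\partial\epsilon}\big|_{\epsilon=0}\right\}}{1 + \mu_1|f_1|^2 + \mu_1\mu_2|f_2|^2} - 2\,\frac{\mathrm{Re}\left\{\mu_2\bar g\,\frac{\partial g^\epsilon}{\partial\epsilon}\big|_{\epsilon=0} + \mu_1\mu_2\overline{(f_2 - f_1 g)}\left(\frac{\partial f_2^\epsilon}{\partial\epsilon}\big|_{\epsilon=0} - f_1\frac{\partial g^\epsilon}{\partial\epsilon}\big|_{\epsilon=0}\right)\right\}}{1 + \mu_2|g|^2 + \mu_1\mu_2|f_2 - f_1 g|^2}. \tag{3.15}$$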

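As above, $G_{2,1}$ and $G_{2,2}$ satisfy
$$\begin{cases}\Delta G_{2,1} = -2e^{\rho^1}G_{2,1} + e^{\rho^2}G_{2,2},\\[2pt] \Delta G_{2,2} = e^{\rho^1}G_{2,1} - 2e^{\rho^2}G_{2,2} + 4\pi\delta(z - x).\end{cases} \tag{3.16}$$
Set
$$G = \begin{pmatrix} G_{1,1} & G_{2,1}\\ G_{1,2} & G_{2,2}\end{pmatrix}.$$
Clearly, from (3.10) and (3.16), $G$ satisfies
$$\Delta G = \begin{pmatrix} -2e^{\rho^1} & e^{\rho^2}\\ e^{\rho^1} & -2e^{\rho^2}\end{pmatrix} G + 4\pi\begin{pmatrix}\delta(z - x) & 0\\ 0 & \delta(z - x)\end{pmatrix}. \tag{3.17}$$
Now we consider the following system:
$$\begin{cases}\Delta u_1 + 2e^{\rho^1}u_1 - e^{\rho^2}u_2 = g_1,\\[2pt] \Delta u_2 - e^{\rho^1}u_1 + 2e^{\rho^2}u_2 = g_2\end{cases} \quad\text{in } \mathbb{R}^2. \tag{3.18}$$
It is easy to check
$$\begin{pmatrix} v_1\\ v_2\end{pmatrix}(z) = \int_{\mathbb{R}^2} G(z, x)\begin{pmatrix} g_1\\ g_2\end{pmatrix}(x)\,dx \tag{3.19}$$
is a solution of (3.18). Now we introduce the function spaces used in [CI1]. For $\alpha \in (0, 1)$, let
$$X_\alpha = \left\{u \in L^2(\mathbb{R}^2)\ \Big|\ \int_{\mathbb{R}^2}(1 + |x|^{2+\alpha})\,u^2\,dx < \infty\right\},$$
with the norm $\|u\|^2_{X_\alpha} = \int_{\mathbb{R}^2}(1 + |x|^{2+\alpha})\,u^2\,dx$, and
$$Y_\alpha = \left\{u \in W^{2,2}_{loc}(\mathbb{R}^2)\ \Big|\ \|\Delta u\|^2_{X_\alpha} + \left\|\frac{u}{(1 + |x|^{2+\alpha})^{\frac12}}\right\|^2_{L^2(\mathbb{R}^2)} < \infty\right\}
$$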

with the norm $\|u\|^2_{Y_\alpha} = \|\Delta u\|^2_{X_\alpha} + \left\|\frac{u}{(1 + |x|^{2+\alpha})^{1/2}}\right\|^2_{L^2(\mathbb{R}^2)}$. It is easy to see that
$$X_\alpha \to L^1(\mathbb{R}^2), \qquad Y_\alpha \subset C^0_{loc}(\mathbb{R}^2).$$
It was proven in [CI1] that

Lemma 2. Let $\alpha \in (0, 1)$. Then there exists $C_0 > 0$ such that for any $v \in Y_\alpha$, $x \in \mathbb{R}^2$,
$$|v(x)| \le C_0\,\|v\|_{Y_\alpha}\,(\log^+|x| + 1),$$
where $\log^+|x| = \max\{0, \log|x|\}$.

We have the following estimates as in [CI1].

Lemma 3. Let $\alpha \in (0, 1)$ and $(g_1, g_2) \in X_\alpha \times X_\alpha$. Then $v_1$ and $v_2$ defined by (3.19) have the following estimates:
$$|v_1|(z) + |v_2|(z) \le C_1\,(\|g_1\|_{X_\alpha} + \|g_2\|_{X_\alpha})\,(\log^+|z| + 1), \quad\text{for any } z \in \mathbb{R}^2, \tag{3.20}$$
for some constant $C_1 > 0$.

Lemma 3 follows from

Lemma 4. There exists a constant $C_2$ independent of $\mu_1$, $\mu_2$ and $x$ such that
$$\sum_{i,j=1}^2 |G_{i,j}|(z, x) \le C_2\,(\log|z - x| + 1) \quad\text{for any } z \in \mathbb{R}^2. \tag{3.21}$$

Proof. By Lemma 2, as in [CI1], we can show
$$\left|\frac{\partial f_1^\epsilon}{\partial\epsilon}\Big|_{\epsilon=0}\right| = \left|\int_0^z f_1'(t)\log|t - x|\,dt\right| \le C_3\,(|z|^{N_1} + 1)\,(1 + |\log|z - x||) \tag{3.22}$$
and
$$\left|\frac{\partial f_2^\epsilon}{\partial\epsilon}\Big|_{\epsilon=0}\right| = \left|\int_0^z f_1'(t)\,g(t)\log|t - x|\,dt\right| \le C_3\,(|z|^{N_1+N_2+1} + 1)\,(1 + |\log|z - x||). \tag{3.23}$$
From (3.22), (3.23), (3.11)-(3.15), it is easy to check that (3.21) is valid. □

4. Potential Estimates

We consider some potential estimates in this section which will be used in the proof of the main theorem. For convenience, let $\rho^1_\mu$, $\rho^2_\mu$ denote $\rho^1_{\mu_1,\mu_2}$, $\rho^2_{\mu_1,\mu_2}$, which are given in (3.6).

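Lemma 5. Let $f_1$ and $f_2$ satisfy (3.3)–(3.5), $\mu_2 = \mu_1^{\frac{N_2+2}{N_1+1}}$ and $\rho^1_\mu$, $\rho^2_\mu$ as above. Then there exists $\mu_0 \in (0, 1)$ such that for any $(\mu_1, \epsilon) \in (0, \mu_0) \times (0, 1)$,
$$\int_{\mathbb{R}^2}(1 + |x|^{2+\alpha})(1 + \log^+|x|)^2\,e^{2\rho^1_\mu} \le \frac{C_4}{\epsilon^2}\,\mu_1^{-\frac{\alpha+\epsilon}{2(N_1+1)}}, \tag{4.1}$$
$$\int_{\mathbb{R}^2}(1 + |x|^{2+\alpha})(1 + \log^+|x|)^2\,e^{2\rho^2_\mu} \le \frac{C_4}{\epsilon^2}\,\mu_1^{-\frac{\alpha+\epsilon}{2(N_1+1)}}, \tag{4.2}$$
where $0 < \alpha < 1$ and the constant $C_4$ is independent of $\mu_1$ and $\epsilon$.

Proof. We only prove (4.1), since the proof of (4.2) is very similar. Put
$$\int_{\mathbb{R}^2}(1 + |x|^{2+\alpha})(1 + \log^+|x|)^2\,e^{2\rho^1_\mu} = \int_{|x|\le R} + \int_{|x|\ge R} = I + II,$$
where $R$ is to be determined.
$$I = \int_{|x|\le R}(1 + |x|^{2+\alpha})(1 + \log^+|x|)^2\left[\frac{4\mu_1|f_1'|^2\,(1 + \mu_2|g|^2 + \mu_1\mu_2|f_2 - f_1 g|^2)}{(1 + \mu_1|f_1|^2 + \mu_1\mu_2|f_2|^2)^2}\right]^2 \le 16\mu_1^2\int_{|x|\le R}(1 + |x|^{2+\alpha})(1 + \log^+|x|)^2\,|f_1'|^4\,(1 + \mu_2|g|^2 + \mu_1\mu_2|f_2 - f_1 g|^2)^2.$$
By (3.3)-(3.5), we obtain that there exists a constant $C_5$ such that for $x \in \mathbb{R}^2$,
$$|f_1'(x)| \le C_5\,(1 + |x|^{N_1}), \tag{4.3}$$
$$|g(x)| \le C_5\,(1 + |x|^{N_2+1}), \tag{4.4}$$
and
$$|f_2(x) - f_1(x)g(x)| \le C_5\,(1 + |x|^{N_1+N_2+2}). \tag{4.5}$$
Moreover, for $\epsilon \in (0, 1)$ we have for $x \in \mathbb{R}^2$,
$$\log^+|x| \le \frac{2}{\epsilon}\,|x|^{\frac{\epsilon}{2}}. \tag{4.6}$$
Then we have
$$I \le \frac{C_6\mu_1^2}{\epsilon^2}\int_{|x|\le R}(1 + |x|^{2+\alpha+\epsilon+4N_1})\,(1 + \mu_2|x|^{2(N_2+1)} + \mu_1\mu_2|x|^{2(N_1+N_2+2)})^2$$
$$\le \frac{C_7\mu_1^2}{\epsilon^2}\int_{|x|\le R}\left(1 + |x|^{2+\alpha+\epsilon+4N_1} + \mu_2^2|x|^{2+\alpha+\epsilon+4(N_2+N_1)} + \mu_1^2\mu_2^2|x|^{4(N_1+N_2+2)+4N_1+2+\alpha+\epsilon}\right)$$
$$\le \frac{C_7\mu_1^2}{\epsilon^2}\left(R^2 + R^{4+\alpha+\epsilon+4N_1} + \mu_2^2R^{4+\alpha+\epsilon+4(N_2+N_1)} + \mu_1^2\mu_2^2R^{4(N_1+N_2+2)+4(N_1+1)+\alpha+\epsilon}\right).$$
Let $R = \mu_1^{-\frac{1}{2(N_1+1)}}$, then
$$I \le \frac{C_8}{\epsilon^2}\,\mu_1^{-\frac{\alpha+\epsilon}{2(N_1+1)}}. \tag{4.7}$$
For the estimate of $II$, we note that for $R_0 = \mu_0^{-\frac{1}{2(N_1+1)}} > 1$, there exists $C_9$ such that for $|x| > R_0$, $|f_1(x)| \ge C_9|x|^{N_1+1}$, $|f_2(x)| \ge C_9|x|^{N_1+N_2+2}$. We also have
$$|f_1|^2 + \mu_2|f_2|^2 \ge 2\sqrt{\mu_2}\,|f_1||f_2|.$$
Then
$$II \le \int_{|x|\ge R}(1 + |x|^{2+\alpha})(1 + \log^+|x|)^2\,\frac{4|f_1'|^4\,(1 + \mu_2|g|^2 + \mu_1\mu_2|f_2 - f_1 g|^2)^2}{\mu_1^2\,(|f_1|^2 + \mu_2|f_2|^2)^4}$$
$$\le \frac{1}{\epsilon^2\mu_1^2}\int_{|x|\ge R}(1 + |x|^{2+\alpha+\epsilon})\left[\frac{4|f_1'|^4}{|f_1|^8} + \frac{|f_1'|^4\,(|g|^2 + \mu_1|f_2 - f_1 g|^2)^2}{4|f_1|^4|f_2|^4}\right]$$
$$\le \frac{C_{10}}{\epsilon^2\mu_1^2}\int_{|x|\ge R}(1 + |x|^{2+\alpha+\epsilon})\left[|x|^{-4(N_1+2)} + \frac{|x|^{4N_1}\,(|x|^{4(N_2+1)} + \mu_1^2|x|^{4(N_1+N_2+2)})}{|x|^{4(N_1+1)+4(N_1+N_2+2)}}\right]$$
$$\le \frac{C_{11}}{\epsilon^2\mu_1^2}\int_{|x|\ge R}(1 + |x|^{2+\alpha+\epsilon})\,(2|x|^{-4(N_1+2)} + \mu_1^2|x|^{-4}) \le \frac{C_{11}}{\epsilon^2\mu_1^2}\left(R^{-4(N_1+1)+\alpha+\epsilon} + \mu_1^2R^{\alpha+\epsilon}\right).$$
By choosing $R = \mu_1^{-\frac{1}{2(N_1+1)}}$, we have
$$II \le \frac{C_{11}}{\epsilon^2}\,\mu_1^{-\frac{\alpha+\epsilon}{2(N_1+1)}}. \tag{4.8}$$
Then (4.1) follows from (4.7) and (4.8). □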

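Lemma 6. Under the assumptions of Lemma 5, there exists $\mu_0 \in (0, 1)$ such that for any $(\mu_1, \epsilon) \in (0, \mu_0) \times (0, 1)$,
$$\int_{\mathbb{R}^2}(1 + |x|^{2+\alpha})(1 + |x|^\epsilon)\,e^{4\rho^1_\mu} \le C_{12}\,\mu_1^{\frac{4-\alpha-\epsilon}{2(N_1+1)}}, \tag{4.9}$$
$$\int_{\mathbb{R}^2}(1 + |x|^{2+\alpha})(1 + |x|^\epsilon)\,e^{4\rho^2_\mu} \le C_{12}\,\mu_1^{\frac{4-\alpha-\epsilon}{2(N_1+1)}}, \tag{4.10}$$
where $0 < \alpha < 1$ and the constant $C_{12}$ is independent of $\epsilon$ and $\mu_1$.

Proof. We again only prove (4.9). Let $R = \mu_1^{-\frac{1}{2(N_1+1)}} > 1$. Put
$$\int_{\mathbb{R}^2}(1 + |x|^{2+\alpha})(1 + |x|^\epsilon)\,e^{4\rho^1_\mu} = \int_{|x|\le R} + \int_{|x|\ge R} = I + II.$$
As before, we have
$$I = \int_{|x|\le R}(1 + |x|^{2+\alpha})(1 + |x|^\epsilon)\left[\frac{4\mu_1|f_1'|^2\,(1 + \mu_2|g|^2 + \mu_1\mu_2|f_2 - f_1 g|^2)}{(1 + \mu_1|f_1|^2 + \mu_1\mu_2|f_2|^2)^2}\right]^4$$
$$\le 4^4\mu_1^4\int_{|x|\le R}(1 + |x|^{2+\alpha})(1 + |x|^\epsilon)\left[|f_1'|^2\,(1 + \mu_2|g|^2 + \mu_1\mu_2|f_2 - f_1 g|^2)\right]^4$$
$$\le C_{13}\,\mu_1^4\int_{|x|\le R}\left(1 + |x|^{2+\alpha+\epsilon+8N_1} + \mu_2^4|x|^{2+\alpha+\epsilon+8(N_1+N_2+1)} + \mu_1^4\mu_2^4|x|^{8(2N_1+N_2+2)+2+\alpha+\epsilon}\right)$$
$$\le C_{14}\,\mu_1^4\left(R^{4+\alpha+\epsilon+8N_1} + \mu_2^4R^{4+\alpha+\epsilon+8(N_1+N_2+1)} + \mu_1^4\mu_2^4R^{8(2N_1+N_2+2)+4+\alpha+\epsilon}\right).$$
Put $R = \mu_1^{-\frac{1}{2(N_1+1)}}$. Recall $\mu_2 = \mu_1^{\frac{N_2+2}{N_1+1}}$, then
$$I \le C_{15}\,\mu_1^{\frac{4-\alpha-\epsilon}{2(N_1+1)}}. \tag{4.11}$$
For the estimate of $II$, we have
$$II = \int_{|x|\ge R}(1 + |x|^{2+\alpha})(1 + |x|^\epsilon)\,\frac{4|f_1'|^8\,(1 + \mu_2|g|^2 + \mu_1\mu_2|f_2 - f_1 g|^2)^4}{\mu_1^4\,(|f_1|^2 + \mu_2|f_2|^2)^8} \le \frac{C_{16}}{\mu_1^4}\int_{|x|\ge R}(1 + |x|^{2+\alpha+\epsilon})\left[\frac{|f_1'|^8}{|f_1|^{16}} + \frac{|f_1'|^8\,(|g|^2 + \mu_1|f_2 - f_1 g|^2)^4}{2^4|f_1|^8|f_2|^8}\right].$$
As in the proof of Lemma 5,
$$II \le \frac{C_{17}}{\mu_1^4}\int_{|x|\ge R}|x|^{2+\alpha+\epsilon}\,(2|x|^{-8(N_1+2)} + \mu_1^4|x|^{-8}) \le \frac{C_{17}}{\mu_1^4}\left(R^{4+\alpha+\epsilon-8(N_1+2)} + \mu_1^4R^{-4+\alpha+\epsilon}\right).$$
Therefore, by putting $R = \mu_1^{-\frac{1}{2(N_1+1)}}$,
$$II \le C_{18}\,\mu_1^{\frac{4-\alpha-\epsilon}{2(N_1+1)}}. \tag{4.12}$$
Then we can deduce (4.9) from (4.11) and (4.12). □

5. Proof of the Main Results

We consider the problem (2.9) with the prescribed singularities $\{p_j\}_{j=1}^{N_1}$ and $\{q_j\}_{j=1}^{N_2}$. Let $\rho^1_\mu$, $\rho^2_\mu$ be given as in (3.6) with $f_1$ and $f_2$ satisfying (3.3)-(3.5). Put
$$w_1 = u_1 - \rho^1_\mu, \qquad w_2 = u_2 - \rho^2_\mu. \tag{5.1}$$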

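Then we need to solve
$$\begin{cases}\Delta w_1 + 2e^{\rho^1_\mu}w_1 - e^{\rho^2_\mu}w_2 = g_1(\mu, w_1, w_2),\\[2pt] \Delta w_2 - e^{\rho^1_\mu}w_1 + 2e^{\rho^2_\mu}w_2 = g_2(\mu, w_1, w_2),\end{cases} \tag{5.2}$$
where
$$\begin{cases} g_1 = 4e^{2(\rho^1_\mu + w_1)} - 2e^{2(\rho^2_\mu + w_2)} - e^{\rho^1_\mu + \rho^2_\mu + w_1 + w_2} - e^{\rho^1_\mu + \xi_1}w_1^2 + \frac12 e^{\rho^2_\mu + \xi_2}w_2^2,\\[2pt] g_2 = -2e^{2(\rho^1_\mu + w_1)} + 4e^{2(\rho^2_\mu + w_2)} - e^{\rho^1_\mu + \rho^2_\mu + w_1 + w_2} + \frac12 e^{\rho^1_\mu + \xi_1}w_1^2 - e^{\rho^2_\mu + \xi_2}w_2^2\end{cases}$$
and $(\xi_1, \xi_2)$ is between $(0, 0)$ and $(w_1, w_2)$. Now we introduce the set $E_\delta$ in $Y_\alpha \times Y_\alpha$,
$$E_\delta = \{(w_1, w_2) \in Y_\alpha \times Y_\alpha \mid \|w_1\|^2_{Y_\alpha} + \|w_2\|^2_{Y_\alpha} \le \delta^2\}.$$
For $w = (w_1, w_2) \in Y_\alpha \times Y_\alpha$, let $\|w\|_{Y_\alpha} = (\|w_1\|^2_{Y_\alpha} + \|w_2\|^2_{Y_\alpha})^{\frac12}$. We shall find solutions of (5.2) in $E_\delta$ for some small $\delta$.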

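Lemma 7. Under the assumptions of Lemma 5, there exist $\mu_0 \in (0, 1)$ and $C_{19}$, independent of $\mu_1$, such that for $(w_1, w_2) \in E_\delta$ and $\mu_1 \in (0, \mu_0)$,
$$\|e^{2(\rho^1_\mu + w_1)}\|_{X_\alpha} \le C_{19}\,\mu_1^{\frac{4-\alpha-2C_0\delta}{4(N_1+1)}}, \tag{5.3}$$
$$\|e^{2(\rho^2_\mu + w_2)}\|_{X_\alpha} \le C_{19}\,\mu_1^{\frac{4-\alpha-2C_0\delta}{4(N_1+1)}}, \tag{5.4}$$
and
$$\|e^{\rho^1_\mu + \rho^2_\mu + w_1 + w_2}\|_{X_\alpha} \le C_{19}\,\mu_1^{\frac{4-\alpha-2C_0\delta}{4(N_1+1)}}. \tag{5.5}$$

Proof. We only need to prove (5.3), since the proof of (5.4) is similar and (5.5) follows from (5.3), (5.4) and the Hölder inequality. By Lemma 2, we have for $x \in \mathbb{R}^2$,
$$|w_1(x)| \le C_0\,\|w\|_{Y_\alpha}\,(\log^+|x| + 1). \tag{5.6}$$
Then
$$\|e^{2(\rho^1_\mu + w_1)}\|^2_{X_\alpha} \le \int_{\mathbb{R}^2}(1 + |x|^{2+\alpha})\,e^{2C_0\delta(1 + \log^+|x|) + 4\rho^1_\mu} \le e^{2C_0\delta}\int_{\mathbb{R}^2}(1 + |x|^{2+\alpha})(1 + |x|^{2C_0\delta})\,e^{4\rho^1_\mu}.$$
Then (5.3) follows by Lemma 6 and this inequality. □

Lemma 8. Under the assumptions of Lemma 5, there exist $\mu_0 \in (0, 1)$ and $C_{20}$, independent of $\mu_1$ and $\epsilon$, such that for $(\mu_1, \epsilon) \in (0, \mu_0) \times (0, 1)$ and $(w_1, w_2) \in E_\delta$ with $\alpha + 2C_0\delta < 1$,
$$\|e^{\rho^1_\mu + \xi_1}w_1^2\|_{X_\alpha} \le \frac{C_{20}\,\delta^2}{\epsilon^2}\,\mu_1^{-\frac{\alpha+2\epsilon+2C_0\delta}{4(N_1+1)}}, \tag{5.7}$$
$$\|e^{\rho^2_\mu + \xi_2}w_2^2\|_{X_\alpha} \le \frac{C_{20}\,\delta^2}{\epsilon^2}\,\mu_1^{-\frac{\alpha+2\epsilon+2C_0\delta}{4(N_1+1)}}. \tag{5.8}$$

Proof. We note that $|\xi_1(x)| \le |w_1|$, and by (5.6) we have
$$\|e^{\rho^1_\mu + \xi_1}w_1^2\|^2_{X_\alpha} \le \int_{\mathbb{R}^2}(1 + |x|^{2+\alpha})(1 + |x|^{2C_0\delta})\,C_0^4\delta^4\,(1 + \log^+|x|)^4\,e^{2\rho^1_\mu}$$
$$\le C_{21}\,\delta^4\int_{\mathbb{R}^2}(1 + |x|^{2+\alpha+2C_0\delta})\left(1 + \frac{|x|^\epsilon}{\epsilon^2}\,(\log^+|x|)^2\right)e^{2\rho^1_\mu} \le \frac{C_{22}\,\delta^4}{\epsilon^2}\int_{\mathbb{R}^2}(1 + |x|^{2+\alpha+2C_0\delta+\epsilon})(1 + \log^+|x|)^2\,e^{2\rho^1_\mu}.$$
By Lemma 5, we obtain (5.7) when $\delta$ is small so that $\alpha + 2C_0\delta < 1$. Similarly, we can prove (5.8). □

Now for given $(g_1(\mu, w), g_2(\mu, w)) \in X_\alpha \times X_\alpha$, we define an operator $T$ by
$$T\begin{pmatrix} g_1(\mu, w)\\ g_2(\mu, w)\end{pmatrix} = \begin{pmatrix} v_1\\ v_2\end{pmatrix}(z) = \int_{\mathbb{R}^2} G(z, x)\begin{pmatrix} g_1(\mu, w)\\ g_2(\mu, w)\end{pmatrix}(x)\,dx, \tag{5.9}$$
where $w = (w_1, w_2) \in E_\delta$. In the following, we shall find a fixed point of the operator $T$ in $E_\delta$ for some small $\delta$.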


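Lemma 9. Under the assumptions of Lemma 5, let $(\alpha, \varepsilon) \in (0, 1) \times (0, 1)$. Then there exist $\mu_0$ and $\delta_0$ such that for $0 < \mu_1 < \mu_0$, $0 < \delta < \delta_0$,

$$\|Tg\|_{Y_\alpha} \le \frac{C_{23}}{\varepsilon} \Big( \mu_1^{\frac{4 - 2\alpha - 2C_0\delta - \varepsilon}{4(N_1+1)}} + \frac{\delta^2}{\varepsilon^2}\, \mu_1^{-\frac{2\alpha + 3\varepsilon + 2C_0\delta}{4(N_1+1)}} \Big), \tag{5.10}$$

where $C_{23}$ is independent of $\mu_1$, $\delta$ and $\varepsilon$.

Proof. By (3.20) and the fact that $\int_{\mathbb{R}^2} \frac{\log^+ |x| + 1}{1 + |x|^{2+\alpha}} < +\infty$, we have

$$\Big\| \frac{v_1}{(1 + |x|^{2+\alpha})^{1/2}} \Big\|_{L^2} + \Big\| \frac{v_2}{(1 + |x|^{2+\alpha})^{1/2}} \Big\|_{L^2} \le C_{24} (\|g_1\|_{X_\alpha} + \|g_2\|_{X_\alpha}). \tag{5.11}$$

By Lemma 5 and (3.20), we deduce

$$\|e^{\rho_\mu^1} v_1\|_{X_\alpha} \le \frac{C_{25}}{\varepsilon}\, \mu_1^{-\frac{\alpha + \varepsilon}{4(N_1+1)}} (\|g_1\|_{X_\alpha} + \|g_2\|_{X_\alpha}), \tag{5.12}$$

$$\|e^{\rho_\mu^2} v_2\|_{X_\alpha} \le \frac{C_{25}}{\varepsilon}\, \mu_1^{-\frac{\alpha + \varepsilon}{4(N_1+1)}} (\|g_1\|_{X_\alpha} + \|g_2\|_{X_\alpha}). \tag{5.13}$$

Since

$$\|\Delta v_1\|_{X_\alpha} \le \|g_1\|_{X_\alpha} + 2\|e^{\rho_\mu^1} v_1\|_{X_\alpha} + \|e^{\rho_\mu^2} v_2\|_{X_\alpha},$$

then

$$\|\Delta v_1\|_{X_\alpha} \le \frac{C_{26}}{\varepsilon}\, \mu_1^{-\frac{\alpha + \varepsilon}{4(N_1+1)}} (\|g_1\|_{X_\alpha} + \|g_2\|_{X_\alpha}). \tag{5.14}$$

Therefore

$$\|v_1\|_{Y_\alpha} \le \frac{C_{27}}{\varepsilon}\, \mu_1^{-\frac{\alpha + \varepsilon}{4(N_1+1)}} (\|g_1\|_{X_\alpha} + \|g_2\|_{X_\alpha}). \tag{5.15}$$

Similarly we can prove that $v_2$ also satisfies the same estimate (5.15). We now apply Lemma 7 and Lemma 8 to $g_1$, $g_2$; then (5.10) follows easily for $0 < \delta < \delta_0$ when $\delta_0$ satisfies

$$4 - 2\alpha - 2C_0\delta - \varepsilon > 0, \qquad \frac{C_{27}\,\delta_0^2}{\varepsilon^2}\, \mu_0^{-\frac{2\alpha + 3\varepsilon + 2C_0\delta}{4(N_1+1)}} < \frac{1}{2}.$$

□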

Now we prove that the operator $T$ maps $E_\delta$ into $E_\delta$ for $\delta$ small enough.

Lemma 10. Under the assumption of Lemma 9, there exist $\mu_0$, $\delta_0$ such that for $0 < \mu_1 < \mu_0$, $0 < \delta < \delta_0$ and any $w \in E_\delta$,

$$\|Tg\|_{Y_\alpha} \le \delta. \tag{5.16}$$

Proof. For given $N_1 \ge 1$, we choose $\alpha = \varepsilon = \frac{1}{16}$. Then we choose $\delta_0$ small so that

$$\frac{4 - 2\alpha - 2\varepsilon + 2C_0\delta_0}{4(N_1+1)} > \frac{1}{2(N_1+1)}, \tag{5.17}$$

$$\frac{2\alpha + 3\varepsilon + 2C_0\delta}{4(N_1+1)} < \frac{1}{4(N_1+1)}. \tag{5.18}$$

Then we choose $\mu_0$ sufficiently small so that

$$\frac{C_{23}}{\varepsilon}\, \mu_1^{\frac{1}{2(N_1+1)}} + \frac{C_{23}}{\varepsilon^3}\, \delta^2 \mu_1^{-\frac{1}{4(N_1+1)}} = \delta \tag{5.19}$$

has a solution $\delta$ in $(0, \delta_0)$. In fact, if

$$1 - 4\, \frac{C_{23}^2}{\varepsilon^4}\, \mu_1^{\frac{1}{4(N_1+1)}} \ge 0,$$

we can choose

$$\delta = \frac{\varepsilon^3 \mu_1^{\frac{1}{4(N_1+1)}}}{2C_{23}} \Big( 1 - \sqrt{1 - 4\, \frac{C_{23}^2}{\varepsilon^4}\, \mu_1^{\frac{1}{4(N_1+1)}}} \Big). \tag{5.20}$$

Since $\delta \to 0$ in (5.20) as $\mu_1 \to 0$, such a $\mu_0$ exists. From (5.10) and (5.19) we deduce (5.16).

Next we prove that the operator $T$ is a contraction mapping for $\mu_0$ sufficiently small. For $w, w' \in E_\delta$, we have

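$$\begin{aligned} g_1(\mu, w) - g_1(\mu, w') = {} & 4e^{2(\rho_\mu^1 + \eta_1)}(w_1 - w_1') - 2e^{2(\rho_\mu^2 + \eta_2)}(w_2 - w_2') \\ & - e^{\rho_\mu^1 + \rho_\mu^2 + \eta_1 + \eta_2}(w_1 - w_1' + w_2 - w_2') \\ & - e^{\rho_\mu^1 + \xi_1}(w_1 - w_1')^2 + \tfrac{1}{2} e^{\rho_\mu^2 + \xi_2}(w_2 - w_2')^2, \end{aligned} \tag{5.21}$$

where $(\eta_1, \eta_2)$, $(\xi_1, \xi_2)$ are between $w$ and $w'$. In particular, if $w, w' \in E_\delta$, then for $i = 1, 2$ and $x \in \mathbb{R}^2$,

$$|\eta_i(x)| \le C_0\delta(\log^+ |x| + 1), \qquad |\xi_i(x)| \le C_0\delta(\log^+ |x| + 1).$$

Similarly we have

$$\begin{aligned} g_2(\mu, w) - g_2(\mu, w') = {} & 4e^{2(\rho_\mu^2 + \eta_2)}(w_2 - w_2') - 2e^{2(\rho_\mu^1 + \eta_1)}(w_1 - w_1') \\ & - e^{\rho_\mu^1 + \rho_\mu^2 + \eta_1 + \eta_2}(w_1 - w_1' + w_2 - w_2') \\ & - e^{\rho_\mu^2 + \xi_2}(w_2 - w_2')^2 + \tfrac{1}{2} e^{\rho_\mu^1 + \xi_1}(w_1 - w_1')^2. \end{aligned} \tag{5.22}$$

□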
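The choice of $\delta$ in (5.20) is the smaller root of the quadratic equation (5.19). The following minimal sketch illustrates this numerically. The constants `A` and `B` are hypothetical stand-ins for the prefactors in (5.19), which depend on $C_{23}$ and $\varepsilon$; only the powers of $\mu_1$ are taken from the text, and the value $N_1 = 1$ is chosen purely for illustration.

```python
import math

def smaller_root(a: float, c: float) -> float:
    """Smaller root of a*d**2 - d + c = 0, assuming 1 - 4*a*c >= 0."""
    disc = 1.0 - 4.0 * a * c
    if disc < 0.0:
        raise ValueError("no real root: need 1 - 4*a*c >= 0")
    return (1.0 - math.sqrt(disc)) / (2.0 * a)

def delta_of_mu(mu: float, n1: int = 1, A: float = 1.0, B: float = 1.0) -> float:
    """Solve B*mu**(1/(2(n1+1))) + A*d**2*mu**(-1/(4(n1+1))) = d for small d.

    A and B are hypothetical constants (assumptions of this sketch).
    """
    a = A * mu ** (-1.0 / (4 * (n1 + 1)))   # coefficient of d**2
    c = B * mu ** (1.0 / (2 * (n1 + 1)))    # constant term
    return smaller_root(a, c)

if __name__ == "__main__":
    for mu in (1e-8, 1e-12, 1e-16):
        d = delta_of_mu(mu)
        # Both d and d * mu**(-1/8) shrink as mu decreases (here n1 = 1).
        print(mu, d, d * mu ** (-1.0 / 8))
```

In this toy setting the discriminant condition reads $4AB\,\mu_1^{1/(4(N_1+1))} \le 1$, which is why a sufficiently small $\mu_0$ is needed before the root exists.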

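Lemma 11. Under the assumptions of Lemma 5, there exist $\mu_0 \in (0, 1)$ and $C_{28}$, independent of $\varepsilon$ and $\mu_1$, such that for $\mu_1 \in (0, \mu_0)$ and $w, w' \in E_\delta$,

$$\|g(\mu, w) - g(\mu, w')\|_{X_\alpha} \le C_{28}\, \mu_1^{\frac{4 - \alpha - \varepsilon - 2C_0\delta}{4(N_1+1)}} \|w - w'\|_{Y_\alpha} + \frac{C_{28}}{\varepsilon^2}\, \mu_1^{-\frac{\alpha + 2\varepsilon + 2C_0\delta}{4(N_1+1)}}\, \delta\, \|w - w'\|_{Y_\alpha}. \tag{5.23}$$

Proof. As in the proof of Lemma 7, we have

$$\begin{aligned} \|e^{2(\rho_\mu^1 + \eta_1)}(w_1 - w_1')\|_{X_\alpha} &\le \Big[ \int_{\mathbb{R}^2} C_0 (1 + |x|^{2+\alpha})(\log^+ |x| + 1)^2\, e^{2C_0\delta(\log^+ |x| + 1) + 4\rho_\mu^1} \Big]^{1/2} \|w_1 - w_1'\|_{Y_\alpha} \\ &\le C_{29}\, \mu_1^{\frac{4 - \alpha - \varepsilon - 2C_0\delta}{4(N_1+1)}} \|w_1 - w_1'\|_{Y_\alpha}. \end{aligned} \tag{5.24}$$

It is easy to see that the same estimates hold for the other terms which are linear in $w - w'$ in (5.21) and (5.22). For the remaining quadratic terms in (5.21) and (5.22), the estimate is similar to Lemma 8. This proves Lemma 11. □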


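Lemma 12. Under the assumptions of Lemma 9, there exist $\lambda, \mu_0 \in (0, 1)$ such that for $\mu_1 \in (0, \mu_0)$, $w, w' \in E_\delta$ and $\delta$ chosen as in (5.20),

$$\|T(g(\mu, w) - g(\mu, w'))\|_{Y_\alpha} \le \lambda \|w - w'\|_{Y_\alpha}. \tag{5.25}$$

Proof. From (5.15) and (5.23) we deduce

$$\|T(g(\mu, w) - g(\mu, w'))\|_{Y_\alpha} \le \frac{C_{30}}{\varepsilon} \Big( \mu_1^{\frac{4 - 2\alpha - 2\varepsilon - 2C_0\delta}{4(N_1+1)}} + \frac{\delta}{\varepsilon^2}\, \mu_1^{-\frac{2\alpha + 3\varepsilon + 2C_0\delta}{4(N_1+1)}} \Big) \|w - w'\|_{Y_\alpha}, \tag{5.26}$$

where $C_{30}$ is independent of $\mu_1$ and $\varepsilon$. As in the proof of Lemma 10, we choose $\alpha = \varepsilon = \frac{1}{16}$, and $\delta_0$ satisfies (5.17) and (5.18). We note that $\delta$ given in (5.20) satisfies

$$\delta\, \mu_1^{-\frac{1}{4(N_1+1)}} \to 0 \quad \text{as } \mu_0 \to 0.$$

Then we may choose $\mu_0$ sufficiently small, so that

$$\frac{C_{30}}{\varepsilon} \Big( \mu_1^{\frac{1}{2(N_1+1)}} + \frac{\delta}{\varepsilon^2}\, \mu_1^{-\frac{1}{4(N_1+1)}} \Big) \le \lambda. \tag{5.27}$$

Then (5.26) and (5.27) imply (5.25). □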

Proof of Theorem 1. By Lemma 10, Lemma 12 and the Banach contraction mapping theorem, (5.2) has a solution in $E_\delta$. This gives the solution $(u_1, u_2) = (\rho_\mu^1 + w_1, \rho_\mu^2 + w_2)$ of problem (2.9). As $|x| \to \infty$,

$$\rho_\mu^1(x) = -2(N_2 + 2) \log |x| + o(\log |x|), \qquad \rho_\mu^2(x) = -2(N_1 + 2) \log |x| + o(\log |x|),$$

and for some $\beta \in (0, 1)$ we can choose $\mu_0$ small so that $|w_i(x)| \le \beta(\log^+ |x| + 1)$ for $i = 1, 2$. Hence

$$e^{u_1(x)} = O\Big( \frac{1}{|x|^{2N_2 + 4 - \beta}} \Big), \qquad e^{u_2(x)} = O\Big( \frac{1}{|x|^{2N_1 + 4 - \beta}} \Big).$$

Set

$$\phi_1 = \exp\Big\{ \frac{1}{2} \Big( u_1 + \sum_{j=1}^{N_1} \arg(z - p_j) \Big) \Big\}, \tag{5.28}$$

$$\phi_2 = \exp\Big\{ \frac{1}{2} \Big( u_2 + \sum_{j=1}^{N_2} \arg(z - q_j) \Big) \Big\}, \tag{5.29}$$

and

$$\bar{A}_i = -2i\, \bar{\partial} \log \phi_i, \tag{5.30}$$

where $\bar{A}_i = A_i^1 + iA_i^2$. It is clear that $(\phi, A)$ defined by (2.6), (5.28)–(5.30) is a solution of (2.5). Other statements in the theorem are easy to check. □
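The structure of the argument above (Lemma 10 gives that $T$ maps $E_\delta$ into itself, Lemma 12 gives a contraction constant $\lambda < 1$, and the Banach theorem then yields a fixed point) can be sketched on a scalar toy model. The map `T(w) = c + a*w**2` below is only an assumed stand-in with "small source plus quadratic term" structure; it is not the operator $T$ of (5.9), and the values of `a` and `c` are hypothetical.

```python
def banach_iterate(T, w0, tol=1e-14, max_iter=200):
    """Iterate w_{k+1} = T(w_k) until successive iterates differ by < tol."""
    w = w0
    for _ in range(max_iter):
        w_next = T(w)
        if abs(w_next - w) < tol:
            return w_next
        w = w_next
    raise RuntimeError("fixed-point iteration did not converge")

# Hypothetical toy data: a plays the role of the large quadratic coefficient,
# c the role of the small source term.
a, c = 10.0, 0.01

def T(w):
    return c + a * w * w

# Radius of the invariant ball: smaller root of a*d**2 - d + c = 0,
# so that |w| <= delta implies |T(w)| <= c + a*delta**2 = delta.
delta = (1.0 - (1.0 - 4.0 * a * c) ** 0.5) / (2.0 * a)

# Lipschitz constant of T on [-delta, delta]: |T'(w)| = 2*a*|w| <= 2*a*delta.
lam = 2.0 * a * delta

w_star = banach_iterate(T, 0.0)

if __name__ == "__main__":
    print("delta =", delta, "lambda =", lam, "fixed point =", w_star)
```

Since `lam` is below 1 here, the iteration converges geometrically from any starting point in the ball.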


Acknowledgement. We are grateful to Professor Wang Guang Yin for telling us the form of the Green function for systems which enables us to simplify our earlier proof of the main theorem.

Note added in Proof. After submitting this paper, we learnt from Professor Miwa that Chae and Imanuvilov obtained a similar result by using the Newton–Kantorovich scheme in their preprint, Non-topological multivortex solutions of the self-dual Maxwell–Chern–Simons–Higgs systems.

References

[CaY] Caffarelli, L. and Yang, Y.S.: Vortex condensation in the Chern–Simons Higgs model: An existence theorem. Commun. Math. Phys. 168, 321–336 (1995)
[CI1] Chae, D. and Imanuvilov, O. Yu.: The existence of non-topological multivortex solutions in the relativistic self-dual Chern–Simons theory. Preprint, 1997
[CI2] Chae, D. and Imanuvilov, O. Yu.: Non-topological multivortex solutions of the self-dual Maxwell–Chern–Simons–Higgs systems. Preprint, 1998
[CHMY] Chen, X., Hastings, S., McLeod, J. B. and Yang, Y.: A nonlinear elliptic equation arising from gauge field theory and cosmology. Proc. R. Soc. Lond. 446, 453–478 (1994)
[DJLW1] Ding, W., Jost, J., Li, J. and Wang, G.: An analysis of the two-vortex case in the Chern–Simons Higgs model. Calc. Var. and P.D.E. 7, 87–97 (1998)
[DJLW2] Ding, W., Jost, J., Li, J. and Wang, G.: Multiplicity results for the two-vortex Chern–Simons Higgs model on the two sphere. Comment. Math. Helv. (in press)
[D1] Dunne, G.: Self-dual Chern–Simons Theories. Lecture Notes in Physics, vol. m36, Berlin: Springer-Verlag, 1995
[D2] Dunne, G.: Mass degeneracies in self-dual model. Phys. Lett. B345, 452–457 (1995)
[DJPT] Dunne, G., Jackiw, R., Pi, S.-Y. and Trugenberger, C.: Self-dual Chern–Simons solitons and two dimensional nonlinear equation. Phys. Rev. D43, 1332 (1991)
[HKP] Hong, J., Kim, Y. and Pac, P.Y.: Multivortex solutions of the Abelian Chern–Simons theory. Phys. Rev. Lett. 64, 2230–2233 (1990)
[JW] Jackiw, R. and Weinberg, E.: Self-dual Chern–Simons vortices. Phys. Rev. Lett. 64, 2234–2237 (1990)
[JT] Jaffe, A. and Taubes, C. H.: Vortices and Monopoles. Boston: Birkhäuser, 1980
[KL] Kao, H. C. and Lee, K.: Self-dual SU(3) Chern–Simons Higgs systems. Phys. Rev. D50, 6626–6632 (1994)
[Ko] Kostant, B.: The solution to a generalized Toda lattice and representation theory. Adv. Math. 34, 195–338 (1979)
[L1] Lee, K.: Self-dual nonabelian Chern–Simons solitons. Phys. Rev. Lett. 66, 553–555 (1991)
[L2] Lee, K.: Relativistic nonabelian Chern–Simons systems. Phys. Lett. B255, 381–384 (1991)
[LS1] Leznov, A. N. and Saveliev, M. V.: Representation of zero curvature of the system of nonlinear partial differential equation $x_{\alpha, z\bar{z}} = \exp(kx)_\alpha$ and its integrability. Lett. Math. Phys. 3, 489–494 (1979)
[LS2] Leznov, A. N. and Saveliev, M. V.: Representation theory and integration of nonlinear spherically symmetric equations to gauge theories. Commun. Math. Phys. 74, 111–118 (1980)
[L] Liouville, J.: Sur l'équation aux différences partielles $\frac{d^2 \log \lambda}{du\, dv} \pm \frac{\lambda}{2a^2} = 0$. J. Math. Pures Appl. 18, 71 (1853)
[NT] Nolasco, M. and Tarantello, G.: Double vortex condensates in the Chern–Simons–Higgs theory. Preprint, 1998
[SY1] Spruck, J. and Yang, Y.: Topological solutions in the self-dual Chern–Simons theory: Existence and approximation. Ann. Inst. H. P., Anal. Non-linéaire 12, 75–97 (1995)
[SY2] Spruck, J. and Yang, Y.: The existence of non-topological solitons in the self-dual Chern–Simons theory. Commun. Math. Phys. 149, 361–376 (1992)
[Ta] Tarantello, G.: Multiple condensate solutions for the Chern–Simons Higgs theory. J. Math. Phys. 37, 3769–3796 (1996)
[W] Wang, R.: The existence of Chern–Simons vortices. Commun. Math. Phys. 137, 587–597 (1991)
[Y] Yang, Y.: The relativistic non-Abelian Chern–Simons equations. Commun. Math. Phys. 186, 199–218 (1997)

Communicated by T. Miwa

Commun. Math. Phys. 202, 517 – 546 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Quantization of Equivariant Vector Bundles

Eli Hawkins

Center for Gravitational Physics and Geometry, The Pennsylvania State University, University Park, PA 16802, USA. E-mail: [email protected]

Received: 27 February 1998 / Accepted: 5 November 1998

Abstract: The quantization of vector bundles is defined. Examples are constructed for the well-controlled case of equivariant vector bundles over compact coadjoint orbits. (A coadjoint orbit is a symplectic manifold with a transitive, semisimple symmetry group.) In preparation for the main result, the quantization of coadjoint orbits is discussed in detail. This subject should not be confused with the quantization of the total space of a vector bundle such as the cotangent bundle.

Contents

1 Introduction
2 Quantization
2.1 Direct and inverse limit quantization
3 Quantized Vector Bundles
3.1 Direct and inverse limits
4 Classical Homogeneous Spaces
4.1 The set of coadjoint orbits
4.2 Equivariant bundles
5 Quantized Coadjoint Orbits
5.1 Generators and relations picture
5.2 Limit quantization picture
5.3 Convergence
5.4 Polynomials
6 Quantization of Vector Bundles over $\mathcal{O}_\Lambda$
6.1 General quantized bundles
6.2 Limit quantized bundles
6.3 Identification with bundles
6.4 The allowed weights
7 Further Remarks on Bundles
7.1 Uniqueness
7.2 Geometric quantization
7.3 Bimodules
8 The Case of the 2-Sphere
9 Final Remarks
A Sections
B Limits
B.1 Direct limit of algebras
B.2 Direct limit of modules
B.3 Inverse limit of algebras
B.4 Inverse limit of modules
C Review of Representation Theory
D Coadjoint Orbits
D.1 Symplectic structure
D.2 Why coadjoint orbits?
D.3 Structure of coadjoint orbits
D.4 Examples
E Projective Space

1. Introduction Quantization is a vaguely defined process by which a noncommutative algebra is generated from some ordinary, commutative space. Traditionally this space has been the phase space of some system in classical mechanics; the algebra is then meant to consist of observables for a corresponding quantum system. A more recent use of quantization is with a space that is thought of geometrically; the quantization is then thought of as giving noncommutative geometries which approximate the original space being considered. The existing theory of quantization is limited for this purpose in that it only gives an algebra. This corresponds to just having the topology of the quantized space (see [4]). If the original space has more interesting structures than just its topology, then it would be desirable to in some sense “quantize” these as well. Arguably, vector bundles are the most important structures beyond point set topology. Most structures used in geometry are, or involve, vector bundles. The vector fields, differential forms, and spinor fields are sections of vector bundles. K-theory is constructed from vector bundles. A Riemannian metric is a section of a bundle. Differential operators, such as the Dirac operator, act on sections of vector bundles. Indeed, in physics most fields are sections of vector bundles. This paper is a first step towards a theory of the quantization of vector bundles. In pursuit of this goal, I present a plausible definition for the quantization of a vector bundle, and illustrate it with a large class of examples. I give a more general construction of quantization of vector bundles in [14]. I only consider compact manifolds in this paper for several reasons. One is that this is inevitably the simplest case to deal with, since almost anything that will work generally will work in the compact case. Another is physically motivated. 
The most natural quantizations of compact manifolds give finite-dimensional algebras; as a result, the degrees of freedom of anything on the space should become finite after quantization. This can, therefore, be used as a regularization technique for quantum field theories (see [8]). Outside of some definitions, I will assume the space is a compact manifold M and the quantizations are finite-dimensional.


In order to get simple examples I will assume that the geometry is also highly symmetrical. Suppose that some compact, semisimple Lie group acts transitively1 on M, and that everything is equivariant under the action of this group. A manifold that can be quantized (to give finite-dimensional algebras) in a reasonable sense must have a symplectic structure (App. D). A symplectic manifold with transitive symmetry by a compact, semisimple Lie group must be equivalent to a coadjoint orbit of that group (App. D). The coadjoint orbits are therefore the only spaces that can be quantized nicely with this much symmetry. Luckily, coadjoint orbits of compact Lie groups have a very simple systematic quantization (Sect. 5). I begin in Sect. 2 with a general definition of quantization structure similar to that given by Berezin in [1]. This definition involves a minimum of structure. However, greater structure can be useful for some purposes. The perspective of noncommutative geometry [4] holds that a noncommutative algebra should correspond to the “true” geometry, and that the “classical limit” is merely a convenient approximation to this [3, 5]. This suggests that the classical algebra of functions should be secondary, constructed as the limit of a sequence of noncommutative algebras. Based on this philosophy (and other motivations described in Sect. 9), in Sect. 2.1 I outline an approach to quantization based on a directed or inverse system of algebras whose limit is the classical algebra of continuous functions; I call these structures direct and inverse limit quantizations. The technical details of these limits are discussed in Appendices B.1 and B.3. In Sect. 3, I give a definition for the quantization of a vector bundle. Like the quantization of an algebra, the quantization of a vector bundle can be viewed in terms of a directed or inverse system. This is described in Sect. 3.1 and detailed in Appendices B.2 and B.4. 
The most relevant properties of homogeneous spaces and their vector bundles are described in Sect. 4. In Appendix D, I describe the reasons that the spaces considered here are all coadjoint orbits, and then discuss some properties of these spaces. Appendix D.3 describes the classification of the coadjoint orbits for a given group, and gives a diagrammatic technique for expressing a coadjoint orbit as a coset space. The standard quantization of coadjoint orbits is reviewed, and described in perhaps new ways, in Sect. 5. The quantization is constructed using generators and relations in Sect. 5.1. The directed and inverse limit quantizations are constructed in 5.2. Appendix E gives some additional details which are relevant to the discussion of convergence of the direct and inverse limit quantizations in 5.3. Section 6 contains the main results of this paper. I first construct quantized vector bundles, and then determine what bundles these are quantizations of. I show that all equivariant vector bundles over coadjoint orbits may be quantized. I then discuss some matters arising from this construction. In Subsect. 7.1 I explain the extent to which the construction is unique. In Subsect. 7.2 I note an interesting relationship to geometric quantization. In Subsect. 7.3 I note a property that these quantizations fail to have. In order to illustrate the constructions in this paper, I describe some of the details in the simplest possible case, that of S 2 , in Sect. 8. Appendices A and C serve to fill in some background and fix notation. Appendix A is background mainly for Appendix B. Some of the relevant facts about Lie groups are reviewed in Appendix C in a perspective appropriate to this paper. 1

I. e., the group can take any point to any other point.


This topic unfortunately requires using a great many symbols. A table of notations is provided at the end of the paper.

2. Quantization

Generally, quantization refers to some sort of correspondence between an algebra of functions on some space, and some noncommutative algebra. This might involve a map that identifies functions with operators in the noncommutative algebra, or perhaps vice versa. The idea of a “classical limit” is that the algebra of quantum operators becomes the algebra of classical functions in some limiting sense. To make this meaningful requires having not one, but a whole sequence (discrete or continuous) of quantum algebras.

This idea can be made more concrete. Let all algebras involved be C∗-algebras. Call the space $M$; the algebra of functions is the algebra $C_0(M)$ of continuous functions (vanishing at infinity in the noncompact case). The set of quantum algebras may be parameterized either continuously (say, over $I = \mathbb{R}_+$) or discretely (say, over $I = \mathbb{N}$). Compactify the parameter space $I$ by adjoining some “∞” where the classical limit belongs. The algebras form a bundle $\mathcal{A}_{\hat I}$ over this completed parameter space $\hat I = I \cup \{\infty\}$; each quantum algebra is the fiber over its parameter and $C_0(M)$ is the fiber over ∞. This $\mathcal{A}_{\hat I}$ should in fact be a continuous field of C∗-algebras; see [6].

I am taking the perspective in this paper that quantization gives noncommutative approximations to the topology of $M$. From this perspective, the most essential information about the quantum-classical correspondence is encoded in the topology of the bundle $\mathcal{A}_{\hat I}$. A sequence of operators in each of the quantum algebras can be reasonably identified with a certain function only if these together form a continuous section of $\mathcal{A}_{\hat I}$. The space of continuous sections over $\hat I$ is naturally a C∗-algebra, $\mathbb{A} := \Gamma(\mathcal{A}_{\hat I})$ (see App. A). There is a natural surjection $P : \mathbb{A} \twoheadrightarrow C_0(M)$ which is simply evaluation at the point $\infty \in \hat I$. This algebra and surjection are the most succinct and bare-bones quantization structure. This will be referred to as a general quantization. This is almost the same as the structure of quantization given by Berezin in [1]². It is also a generalization of the structure of a strict deformation quantization [19]; in that case the index set $\hat I$ is required to be an interval.

Other quantization structures contain more (possibly irrelevant) information. Suppose that we are given a quantization of a space $M$ in the form of a sequence of algebras $\{\mathcal{A}_N\}_{N=1}^{\infty}$ and maps $P_N : C_0(M) \twoheadrightarrow \mathcal{A}_N$. This is a pretty typical quantization structure; the operator $P_N(f)$ is considered to be the quantization of the function $f$. The topology I give to $\mathcal{A}_{\hat{\mathbb{N}}} = \mathcal{A}_{\mathbb{N}} \cup C(M)$ is the weakest such that for each $f \in C_0(M)$ the section taking $N \mapsto P_N(f)$ and $\infty \mapsto f$ is continuous. Two sets of $P_N$'s that give the same topology to the bundle are equivalent for the purposes of my perspective.

This structure of general quantization is not tied to any particular method of quantization. Indeed, it need not correspond to something that would usually be called quantization. The point of it is that a large class of concepts of quantization can be used to construct a general quantization structure, and it is this structure which is relevant to defining the quantization of a vector bundle in Sect. 3.

The strategy for constructing general quantizations that is used here is that $\mathbb{A} \equiv \Gamma(\mathcal{A}_{\hat I})$ is a subalgebra of $\Gamma_b(\mathcal{A}_I)$ (the C∗-algebra of bounded sections over $I$; see App. A). The difference between these two types of sections is the behavior approaching ∞; elements of $\Gamma(\mathcal{A}_{\hat I})$ must be continuous at ∞. The key is to describe the condition of continuity at ∞ purely in terms of $I \not\ni \infty$.

² The major difference is that Berezin used smooth rather than continuous functions.

2.1. Direct and inverse limit quantization. In this section I make the assumption that $M$ is compact and the quantum algebras are finite-dimensional. Since dimensions change discretely, the simplest choice of parameter space is $I = \mathbb{N}$.

One perspective on quantization is that the classical algebra is literally the limit of the sequence of quantum algebras. A limit of algebraic objects is generally constructed from either a “directed system” or “inverse system”, so those are what I use here. The former is a bundle of algebras $\mathcal{A}_{\mathbb{N}}$ and a sequence of maps $i_N : \mathcal{A}_N \hookrightarrow \mathcal{A}_{N+1}$ linking them together. In the latter the maps are in the opposite direction, $p_N : \mathcal{A}_N \twoheadrightarrow \mathcal{A}_{N-1}$. If constructed properly, these types of systems have limits $\varinjlim\{\mathcal{A}_*, i_*\}$ and $\varprojlim\{\mathcal{A}_*, p_*\}$ which are C∗-algebras; these are detailed in Appendices B.1 and B.3. Intuitively, the directed system can be thought of as

$$\mathcal{A}_1 \overset{i_1}{\hookrightarrow} \mathcal{A}_2 \overset{i_2}{\hookrightarrow} \cdots \hookrightarrow \varinjlim\{\mathcal{A}_*, i_*\}. \tag{2.1}$$

For every $N$ there is a composed injection $I_N : \mathcal{A}_N \hookrightarrow \varinjlim\{\mathcal{A}_*, i_*\}$. These satisfy a consistency condition with the $i_N$'s that $I_N = I_{N+1} \circ i_N$. Similarly, the inverse system can be thought of as

$$\mathcal{A}_1 \overset{p_2}{\twoheadleftarrow} \mathcal{A}_2 \overset{p_3}{\twoheadleftarrow} \cdots \twoheadleftarrow \varprojlim\{\mathcal{A}_*, p_*\}. \tag{2.2}$$

There are composed surjections $P_N : \varprojlim\{\mathcal{A}_*, p_*\} \twoheadrightarrow \mathcal{A}_N$. These also satisfy a consistency condition that $P_N = p_{N+1} \circ P_{N+1}$. These $I_N$'s and $P_N$'s are part of the general constructions of directed and inverse limits. The general quantization algebra $\mathbb{A}$ is also a natural byproduct of these constructions.

The maps $i_N$ and $p_N$ used in these must not be assumed to be (multiplicative) homomorphisms in general. That assumption would actually restrict $M$ to be a totally disconnected space, which is almost certainly not what we want. Instead we must allow these maps to be some more general type of morphisms, such as unital completely positive maps³. This is discussed a little more in Appendix B.1.

³ The property of complete positivity will not be used here, although it will be mentioned several times. For definition and discussion see [16].

3. Quantized Vector Bundles

Suppose that we are given a finitely generated vector bundle $V \twoheadrightarrow M$ (see [20]). If the algebra of functions $C_0(M)$ is quantized, then what should be meant by the quantization of $V$? In noncommutative geometry, all geometrical structures are dealt with algebraically. In order to find the appropriate definition for quantization of $V$, we must first treat $V$ algebraically.

The algebraic approach comes from the fact that the continuous sections $\Gamma_0(V)$ form a finitely generated, projective module of the algebra $C_0(M)$. Indeed, this gives a one-to-one correspondence between finitely generated, locally trivial, vector bundles and finitely generated, projective modules (see [4]). The “quantization” of $V$ should give modules for each of the quantum algebras $\mathcal{A}_N$; in other words, a bundle of modules over $I$.

I define a quantization of the bundle $V$ to be a bundle of modules $\mathcal{V}_{\hat I}$ over $\hat I$ such that the topology is consistent with that of $\mathcal{A}_{\hat I}$, and the fiber at ∞ is the module $\Gamma_0(V)$. The space of sections $\mathbb{V} := \Gamma(\mathcal{V}_{\hat I})$ is a module of $\mathbb{A}$. This gives another way of describing the quantization of $V$. A quantization of $V$ may be equivalently defined as a finitely generated, projective module $\mathbb{V}$ of $\mathbb{A}$ satisfying the sole condition that the push-forward by $P$ to a module of $C_0(M)$ is $\Gamma_0(V)$. The condition that $\mathcal{A}_{\hat I}$ and $\mathcal{V}_{\hat I}$ have consistent topologies is implicitly encoded in this definition.

Just as a continuous function is not uniquely determined by its value at a single point, there is not a single, unique quantization of a given $V$. Indeed, when $I$ is discrete, any finite subset of $V_N$'s can be changed arbitrarily. However, there may be a uniquely natural choice for almost all $V_N$'s given by a single formula. This is so in the case discussed in this paper. This issue is discussed further in Sect. 7.1.

The guiding principle for quantizing vector bundles will be that we already have one example. The sections of the trivial line bundle $V = M \times \mathbb{C}$ are simply the continuous functions $C_0(M)$. This means that $\mathbb{V} = \mathbb{A}$ should always be a good quantization of this bundle.

3.1. Direct and inverse limits. Return to the assumptions of Sect. 2.1 (compactness, etc.). As with quantizing $C(M)$, it is possible to use additional structure in the quantization of a vector bundle. A quantized vector bundle can be constructed from a directed system $\{V_*, \iota_*\}$ or an inverse system $\{V_*, \pi_*\}$ of modules. In these systems, each $V_N$ is an $\mathcal{A}_N$-module; the maps are linear maps $\iota_N : V_N \hookrightarrow V_{N+1}$ and $\pi_N : V_N \twoheadrightarrow V_{N-1}$. The details of this are described in Appendices B.2 and B.4. There are again composed injections $I_N^V$ and surjections $P_N^V$, satisfying the same sort of compatibility conditions as for $I_N$ and $P_N$ in Sect. 2.1.

4. Classical Homogeneous Spaces

Again, and throughout the rest of this paper, I assume that $M$ is a compact manifold, the parameter space is $I = \mathbb{N}$, and the algebras $\mathcal{A}_N$ are finite-dimensional. In order to get some control of the system, and construct some quantizations explicitly, let us assume that some group $G$ acts transitively on $M$ (i.e., $M$ is homogeneous) and that everything we do will be $G$-equivariant.

It is a standard construction (see [15]) that $M$ can be written as a coset space $M = G/H$, where the isotropy group is $H := \{h \in G \mid h(o) = o\}$ for some arbitrary basepoint $o \in M$. Since $M$ is a manifold, $G$ is best chosen to be a Lie group. If we assume $G$ to be compact and semisimple⁴, then the set of $M$'s we are interested in is (up to equivalence) the set of “coadjoint orbits” (see App. D.2).

⁴ Assuming $G$ semisimple is equivalent to assuming $M$ is not a torus or the product of a torus with something else.

4.1. The set of coadjoint orbits. The coadjoint space is $\mathfrak{g}^*$, the linear dual of the Lie algebra $\mathfrak{g}$ of $G$. There is a natural, linear action of $G$ on $\mathfrak{g}^*$. A coadjoint orbit is simply the orbit of some point in $\mathfrak{g}^*$ under that $G$ action. The relevant definitions concerning Lie groups are summarized in Appendix C.

The classification of coadjoint orbits is strikingly similar to the classification of irreducible representations. The irreducible representations are classified by the dominant weights, which are the vectors on the weight lattice that lie in the positive Weyl chamber $\mathcal{C}_+ \subset \mathfrak{g}^*$. The coadjoint orbits are classified by all vectors in $\mathcal{C}_+$ (see App. D.3). Denote by $\mathcal{O}_\Lambda$ the coadjoint orbit of $\Lambda \in \mathcal{C}_+ \subset \mathfrak{g}^*$.

Since a coadjoint orbit is a homogeneous space, it can always be expressed as a coset space $\mathcal{O}_\Lambda \cong G/H$; it is natural to identify the basepoint $o = eH \in G/H$ with $\Lambda \in \mathcal{O}_\Lambda$. A diagrammatic method of calculating $H$ from $\Lambda$ is described in Appendix D.3.

The structures of the sets of irreducible representations of $G$ and of $H$ are closely related. The weight lattices of $G$ and $H$ are naturally identified. However, the sets of weights which are dominant (and thus actually correspond to representations) are different. This is relevant in Sect. 6.3.

4.2. Equivariant bundles.

Notation. In this paper I will generally refer to a representation space (group module) simply as a representation.

Suppose that $V$ is an equivariant vector bundle over $M = G/H$. This simply means that $\Gamma(V)$ is a representation of $G$. The fiber $V_o$ at the basepoint $o = eH$ is a vector space and is acted on by $H$, so $V_o$ is a representation of $H$.

Suppose that $W$ is a representation of $H$. The set $W \times_H G := W \times G/\!\sim$, where $(w, g) \sim (hw, gh^{-1})$, is naturally an equivariant vector bundle over $M$. The bundle surjection $W \times_H G \twoheadrightarrow G/H$ is $[(w, g)] \mapsto gH$; the action of $g' \in G$ is $[(w, g)] \mapsto [(w, g'g)]$. Up to equivalence, all equivariant vector bundles may be constructed in this way. The fiber of $W \times_H G$ at $o$ is simply $W$, so there is a one-to-one correspondence between $H$-representations and equivariant vector bundles over $M$. The semigroup of equivariant vector bundles under direct sum is generated by the set of irreducible bundles, those corresponding to irreducible representations.

It is not the case that all vector bundles over $M$ can be made equivariant. Nevertheless, I am only considering equivariant bundles in this paper. Every bundle over a homogeneous space which is mentioned in this paper is a finitely generated, locally trivial, equivariant, vector bundle; but I will frequently omit some of these adjectives.

5. Quantized Coadjoint Orbits

Notation. The irreducible representations of G are in one-to-one correspondence with dominant weights (App. C). Denote the space of the representation corresponding to the weight λ by (λ). This is the G-representation with "highest weight" λ (App. C). Denote A_N := End(NΛ), the algebra of matrices on the vector space (NΛ); the notation A_N will be justified in the following.

5.1. Generators and relations picture. The action of g on (NΛ) can be expressed as a map g → End(NΛ) = A_N. The associative algebra A_N is generated by the image of the Lie algebra g. Let {J_i} ⊂ g be a basis of self-adjoint generators of g acting on (NΛ); A_N can be written in terms of this set of generators and the following relations. First, the commutation relations state that

$$[J_i, J_j]_- = i\,C^{k}{}_{ij}\,J_k, \tag{5.1}$$

where C^k_{ij} are the structure coefficients. Second, the Casimir relations state that

$$C_n(J) = c_n(N\Lambda) \quad \forall n, \tag{5.2}$$


where the Casimirs C_n are G-invariant, symmetrically ordered, homogeneous polynomials in the J's, and the c_n's are the corresponding eigenvalues. Finally, the Serre relations state that certain linear combinations of J_i's are nilpotent, the order of nilpotency rising linearly with N; an example of this is given in Sect. 8.

The Casimir eigenvalues c_n(NΛ) are polynomials in NΛ of the same order as C_n. In fact the leading order (in N) term is C_n(Λ) N^{Ord(C_n)}. The reason that it is meaningful to evaluate C_n on a point of g* (such as Λ) as well as on the J_i's is that the J_i's together form a sort of Lie algebra valued vector in g*.

The Serre relations are actually equivalent to the condition that the J_i's generate a C*-algebra. Suppose that the J_i's do lie inside a C*-algebra and satisfy the commutation and Casimir relations. Then this C*-algebra can be faithfully represented on a Hilbert space ℋ. The commutation relations imply that the J_i's generate a unitary representation of G on ℋ. The Casimir relations imply that ℋ can only be (NΛ) or some Hilbert space direct sum of copies of (NΛ). This means that the C*-subalgebra generated by the J_i's is End(NΛ), which implies that the Serre relations are satisfied.

Now, regard the A_N's as forming a bundle A_ℕ over the discrete parameter space ℕ. We can think of N and the generators J_i as sections in Γ(A_ℕ), but neither is bounded, so they are not in Γ_b(A_ℕ) (the C*-algebra of bounded sections; see App. A). However, the combinations X_i = N⁻¹J_i are bounded, as can be seen by considering the quadratic Casimir⁵ C_1. This means that X_i ∈ Γ_b(A_ℕ). Define A to be the C*-subalgebra of Γ_b(A_ℕ) generated by the X_i's. Define A_0 := Γ_0(A_ℕ) to be the algebra of sections vanishing at ∞ (see App. A); since in fact⁶ A_0 is contained in A, it is an ideal there⁷. Define P : A ↠ A_∞ := A/A_0 to be the corresponding quotient homomorphism; this essentially just evaluates the N → ∞ limit.
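For G = SU(2) (the case treated in Sect. 8) the three kinds of relations and the boundedness of X_i = N⁻¹J_i can be checked numerically. The following is a minimal sketch, assuming the standard spin-N/2 matrices in a J_3 eigenbasis; the function names `spin_matrices` and `relation_report` are illustrative and do not come from the paper.

```python
import numpy as np

def spin_matrices(N):
    """Self-adjoint generators (J1, J2, J3) on the SU(2) representation (N/2), of dimension N + 1."""
    s = N / 2.0
    m = s - np.arange(N + 1)                       # J3 eigenvalues s, s-1, ..., -s
    # Matrix elements of the raising operator J1 + iJ2 between adjacent weights.
    c = np.sqrt(s * (s + 1) - m[1:] * (m[1:] + 1))
    up = np.diag(c, k=1).astype(complex)
    J1 = (up + up.conj().T) / 2
    J2 = (up - up.conj().T) / 2j
    J3 = np.diag(m).astype(complex)
    return J1, J2, J3

def relation_report(N):
    """Check the commutation, Casimir and Serre relations and the size of X_i = J_i / N."""
    J1, J2, J3 = spin_matrices(N)
    up = J1 + 1j * J2                              # proportional to J_+ of Sect. 8
    X = [J / N for J in (J1, J2, J3)]
    casimir = J1 @ J1 + J2 @ J2 + J3 @ J3
    return {
        "commutation": np.allclose(J1 @ J2 - J2 @ J1, 1j * J3),
        "casimir": np.allclose(casimir, (N / 2) * (N / 2 + 1) * np.eye(N + 1)),
        # Nilpotent of order exactly N + 1: the (N+1)-th power vanishes, the N-th does not.
        "serre": bool(np.allclose(np.linalg.matrix_power(up, N + 1), 0)
                      and not np.allclose(np.linalg.matrix_power(up, N), 0)),
        "norm_X3": np.linalg.norm(X[2], 2),
        "norm_commutator": np.linalg.norm(X[0] @ X[1] - X[1] @ X[0], 2),
    }

if __name__ == "__main__":
    for N in (1, 5, 50):
        print(N, relation_report(N))
```

In this sketch the operator norm of X_3 stays fixed while the norm of [X_1, X_2] shrinks like 1/N, which is the mechanism behind the commutativity of the limit algebra.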
By construction, the images x_i := P(X_i) generate the quotient algebra A_∞. The relations these satisfy all derive from the relations satisfied by the X_i's. These generators commute, since

$$[x_i, x_j]_- = P([X_i, X_j]_-) = P(i N^{-1} C^{k}{}_{ij} X_k) = 0, \tag{5.3}$$

so A_∞ is a commutative C*-algebra (and therefore is the algebra of continuous functions on some space). The x_i's transform under G in the same way as Cartesian coordinates on g*, so A_∞ is the algebra of continuous functions on some subspace of g*. The non-Serre relations alone define a C*-algebra; therefore the Serre relations do not give any additional relations for A_∞. The only other relations the x_i's satisfy are the polynomial relations

$$C_n(x) = \lim_{N\to\infty} N^{-\mathrm{Ord}(C_n)}\, c_n(N\Lambda) = C_n(\Lambda), \tag{5.4}$$

which make A_∞ the algebra of continuous functions on the algebraic subspace M ⊂ g* determined by these polynomials.

The Casimir polynomials are a complete system of G-invariant polynomials; therefore M must be a single coadjoint orbit. Obviously, x = Λ satisfies C_n(x) = C_n(Λ), so Λ ∈ M; therefore M is the orbit O_Λ. This shows that in the sense of Sect. 2.5.1, the system P : A ↠ C(O_Λ) is a general quantization of O_Λ.

⁵ The eigenvalue of the quadratic C_1(X) is C_1(Λ) plus a term proportional to N⁻¹, therefore it is bounded as N → ∞, therefore it is a polynomial of bounded operators.
⁶ It is essentially sufficient to show that A contains one function on ℕ that nontrivially converges to 0.
⁷ Because A_0 is an ideal in Γ_b(A_ℕ).

In this construction the Λ was required to be integral (a weight) rather than an arbitrary Λ ∈ C_+. However, this is not a serious restriction. Rescaling Λ simply rescales O_Λ, therefore a more appropriate parameter space for distinct coadjoint orbits is the projectivisation PC_+. The image of the weights is dense in PC_+ (it is the set of "rational" points), so the quantizable coadjoint orbits are dense in the space of distinct coadjoint orbits.

5.2. Limit quantization picture. Notation. The linear dual of an irreducible representation is also an irreducible representation; we can therefore define λ* by the property (λ*) = (λ)*. This is a linear transformation on the weights (see App. C). With this notation A_N ≡ End(NΛ) = (NΛ) ⊗ (NΛ*). Given a choice of Cartan subalgebra and positive Weyl chamber, there is a preferred, 1-dimensional "highest weight subspace" in (NΛ); choose a normalized basis vector Ψ^{NΛ} there and call it the highest weight vector (see App. C).

Not only do the coadjoint orbits have equivariant general quantizations, but they also admit equivariant direct and inverse limit quantizations. There are standard constructions of maps A_N ↪ C(O_Λ) and C(O_Λ) ↠ A_N which are suitable to be used as I_N and P_N. I present these first.

We need an equivariant, linear injection I_N : A_N ↪ C(O_Λ). If we have such an I_N, then for every point x ∈ O_Λ, evaluation at x determines a linear function I_N(·)(x) : A_N → ℂ; in other words, x gives an element of the dual A_N*. Such an I_N is in fact equivalent to an injection I_N* : O_Λ ↪ A_N* = (NΛ*) ⊗ (NΛ). Since I_N* must be equivariant, it is completely specified by the image of the basepoint o = eH. This image must be H-invariant. The highest weight vector Ψ^{NΛ} ∈ (NΛ) is H-invariant, modulo phase.
Its conjugate vector Ψ^{−NΛ} ∈ (NΛ*) transforms by the opposite phase, so the product Ψ^{−NΛ} ⊗ Ψ^{NΛ} ∈ A_N* is H-invariant. In fact, H is the largest subgroup that this is invariant under. Define the image of the basepoint to be I_N*(o) := Ψ^{−NΛ} ⊗ Ψ^{NΛ} ∈ A_N*. With this choice, I_N is given by

$$I_N(a)(gH) = \langle g\Psi^{N\Lambda} |\, a \,| g\Psi^{N\Lambda} \rangle \tag{5.5}$$

for any gH ∈ O_Λ.

There is some apparent arbitrariness in this construction. There were choices made of Cartan subalgebra, positive Weyl chamber, and phase of the highest weight vector. However, the resulting I_N is only arbitrary by the freedom to rotate O_Λ about o (by H), and this freedom was inevitable.

We now need to construct injections i_N : A_N ↪ A_{N+1}. The question is how to get from something acting on (NΛ) to something acting on ([N+1]Λ). The key is that precisely one copy of ([N+1]Λ) always occurs as a subrepresentation of (Λ) ⊗ (NΛ) (see App. C). There is a unique, natural projection

$$\Pi_+ \in \mathrm{Hom}_G[(\Lambda) \otimes (N\Lambda),\, ([N+1]\Lambda)]$$


which maps a vector in (Λ) ⊗ (NΛ) to its component in the irreducible subrepresentation ([N+1]Λ) ⊂ (Λ) ⊗ (NΛ). Using Π_+, an element A ∈ End[(Λ) ⊗ (NΛ)] can be mapped to Π_+ A Π_+* ∈ A_{N+1}. Now, that algebra is End[(Λ) ⊗ (NΛ)] = End(Λ) ⊗ End(NΛ) = A_1 ⊗ A_N. There is a very simple map A_N ↪ A_1 ⊗ A_N taking a ↦ 1 ⊗ a. Composing these gives, as desired, a map i_N : A_N ↪ A_{N+1} by the formula

$$i_N(a) = \Pi_+ (1 \otimes a)\, \Pi_+^*. \tag{5.6}$$

This (and any map that can be written in this form) is a completely positive map (see [16]). To verify that our i_N really satisfies the consistency condition I_{N+1} ∘ i_N = I_N, it is sufficient to check this at the basepoint o ∈ O_Λ. So, ∀a ∈ A_N,

$$\begin{aligned}
I_{N+1} \circ i_N(a)(o) &= \langle \Psi^{(N+1)\Lambda} |\, i_N(a) \,| \Psi^{(N+1)\Lambda} \rangle \\
&= \langle \Psi^{(N+1)\Lambda} |\, \Pi_+ (1 \otimes a) \Pi_+^* \,| \Psi^{(N+1)\Lambda} \rangle \\
&= \langle \Psi^{\Lambda} \otimes \Psi^{N\Lambda} |\, (1 \otimes a) \,| \Psi^{\Lambda} \otimes \Psi^{N\Lambda} \rangle \\
&= \langle \Psi^{N\Lambda} |\, a \,| \Psi^{N\Lambda} \rangle = I_N(a)(o)
\end{aligned}$$

and it is consistent.

The surjections come about similarly. There is a related function e_N taking O_Λ to projections in A_N. This maps e_N : o ↦ |Ψ^{NΛ}⟩⟨Ψ^{NΛ}|. Using e_N, the injection I_N can be written as

$$I_N(a)(x) = \mathrm{tr}[a\, e_N(x)]; \tag{5.7}$$

and the surjection P_N is defined as

$$P_N(f) = \dim(N\Lambda) \int_{O_\Lambda} f\, e_N\, \epsilon, \tag{5.8}$$

where ε is an invariant volume form normalized to give O_Λ volume 1. This map is unital and positive. It is actually the adjoint of the map I_N if we put natural inner products on A_N and C(O_Λ). The inner product on A_N is ⟨a, b⟩ = t̃r_{(NΛ)}(a*b), where t̃r_{(NΛ)} is the trace over (NΛ), normalized to give t̃r_{(NΛ)} 1 = 1. The inner product on C(O_Λ) is ⟨f_1, f_2⟩ = ∫_{O_Λ} f_1* f_2 ε.

We will automatically satisfy the consistency with the P_N's if we choose p_N to be the adjoint of i_{N−1}. The immediately obtained formula is

$$p_N(a) = [\widetilde{\mathrm{tr}}_{(\Lambda)} \otimes \mathrm{id}_{A_{N-1}}](a \oplus 0); \tag{5.9a}$$

where this is a partial trace of the action of a on (NΛ) ⊂ (Λ) ⊗ ([N−1]Λ). This can actually be written in essentially the same form as the i_N's. Precisely one copy of ([N−1]Λ) always occurs as a subrepresentation of (Λ*) ⊗ (NΛ), so there is a corresponding projection Π_− ∈ Hom_G[(Λ*) ⊗ (NΛ), ([N−1]Λ)]. With this, define p_N : A_N ↠ A_{N−1} by

$$p_N(a) = \Pi_- (1 \otimes a)\, \Pi_-^*. \tag{5.9b}$$

To see that this is equivalent to (5.9a), it is sufficient to check that these agree for a = e_N(o) = |Ψ^{NΛ}⟩⟨Ψ^{NΛ}|. These p_N's are also completely positive.
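The maps (5.6) and (5.9b) can be made concrete for G = SU(2), where (Λ) = (½) is self-dual. The sketch below is an illustration under stated assumptions, not the paper's construction: it obtains the isometric inclusion Π_±* numerically as an eigenspace of the total Casimir on (½) ⊗ (N/2), so the resulting matrices are expressed in an arbitrary orthonormal basis of the subrepresentation (that is, they are determined only up to a unitary change of basis). The names `spin_matrices`, `inclusion` and `corner` are hypothetical.

```python
import numpy as np

def spin_matrices(N):
    """Self-adjoint generators (J1, J2, J3) on the SU(2) representation (N/2)."""
    s = N / 2.0
    m = s - np.arange(N + 1)
    c = np.sqrt(s * (s + 1) - m[1:] * (m[1:] + 1))
    up = np.diag(c, k=1).astype(complex)
    return (up + up.conj().T) / 2, (up - up.conj().T) / 2j, np.diag(m).astype(complex)

def inclusion(N, M):
    """Isometry whose columns span the copy of (M/2) inside (1/2) ⊗ (N/2), for M = N ± 1.

    It plays the role of Pi_+^* (M = N + 1) or Pi_-^* (M = N - 1).
    """
    total = [np.kron(a, np.eye(N + 1)) + np.kron(np.eye(2), b)
             for a, b in zip(spin_matrices(1), spin_matrices(N))]
    casimir = sum(T @ T for T in total)
    vals, vecs = np.linalg.eigh(casimir)
    return vecs[:, np.isclose(vals, (M / 2) * (M / 2 + 1))]

def corner(a, M):
    """Compress 1 ⊗ a to the (M/2) subrepresentation: i_N(a) if M = N + 1, p_N(a) if M = N - 1."""
    N = a.shape[0] - 1
    V = inclusion(N, M)
    return V.conj().T @ np.kron(np.eye(2), a) @ V

if __name__ == "__main__":
    N = 3
    rng = np.random.default_rng(1)
    a = rng.normal(size=(N + 1, N + 1))
    # Highest weight vector of (1/2) ⊗ (N/2), expressed inside the ((N+1)/2) subrepresentation.
    top = np.zeros(2 * (N + 1))
    top[0] = 1.0
    psi = inclusion(N, N + 1).conj().T @ top
    print("I_{N+1}(i_N(a))(o) =", np.vdot(psi, corner(a, N + 1) @ psi))
    print("I_N(a)(o)          =", a[0, 0])
```

Because the highest weight vector of the tensor product lies entirely in the ((N+1)/2) summand, the two printed numbers should agree, which is the consistency condition I_{N+1} ∘ i_N = I_N evaluated at the basepoint.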


5.3. Convergence. I will now show that these direct and inverse limit quantizations are both convergent by considering the "product" I_N[P_N(f_1)P_N(f_2)] for any two functions f_1, f_2 ∈ C(O_Λ). This is not an associative product (compare Eq. (D.1)), since P_N ∘ I_N ≠ id, but as N → ∞ it nevertheless converges to the product of functions. This "product" can be written in terms of an integration kernel as

$$I_N[P_N(f_1)P_N(f_2)](x) = \iint_{O_\Lambda} K_N(x,y,z)\, f_1(y) f_2(z)\, \epsilon_y \epsilon_z. \tag{5.10}$$

The volume form ε is again the G-invariant volume form giving O_Λ total volume 1. From the construction of the maps I_N and P_N in (5.7) and (5.8) it is immediate that

$$K_N(x,y,z) = [\dim(N\Lambda)]^2\, \mathrm{tr}[e_N(x)\, e_N(y)\, e_N(z)]. \tag{5.11}$$

If we use the identification O_Λ = G/H, this can be factorized as

$$K_N(gH, g'H, g''H) = [\dim(N\Lambda)]^2\, \langle g\Psi^{N\Lambda} | g'\Psi^{N\Lambda} \rangle \langle g'\Psi^{N\Lambda} | g''\Psi^{N\Lambda} \rangle \langle g''\Psi^{N\Lambda} | g\Psi^{N\Lambda} \rangle. \tag{5.12}$$

The factor of [dim(NΛ)]² serves to normalize K_N so that

$$I_N[P_N(1)P_N(1)] = 1, \tag{5.13}$$

as it should be since P_N(1) = 1 and I_N(1) = 1.

The inner products in (5.12) have several nice properties. By construction, these are certainly smooth functions. The absolute value |⟨gΨ^{NΛ}|g′Ψ^{NΛ}⟩| only depends on the points gH, g′H ∈ O_Λ, and is equal to 1 for gH = g′H; but for any gH ≠ g′H, |⟨gΨ^{NΛ}|g′Ψ^{NΛ}⟩| < 1. The fact that (see App. C) Ψ^{NΛ} = Ψ^Λ ⊗ ··· ⊗ Ψ^Λ gives the convenient identity

$$\langle g\Psi^{N\Lambda} | g'\Psi^{N\Lambda} \rangle = \langle g\Psi^{\Lambda} | g'\Psi^{\Lambda} \rangle^{N}. \tag{5.14}$$

These properties imply that for any gH ≠ g′H,

$$\langle g\Psi^{N\Lambda} | g'\Psi^{N\Lambda} \rangle \xrightarrow[N\to\infty]{} 0$$

exponentially. The factor [dim(NΛ)]² only increases polynomially; therefore, outside any neighborhood of x = y = z, K_N(x, y, z) vanishes uniformly as N → ∞. This means that in order to investigate the N → ∞ limit, it is sufficient to consider x, y, and z close together.

Since O_Λ is homogeneous, we can let x = o without loss of generality. In order to construct an approximation for K_N near o, we need a coordinate patch about o. Coadjoint orbits are always Kähler manifolds, so complex coordinates are convenient. The (real) tangent fiber T_oO_Λ is naturally a complex Hermitian space and in fact can be identified with a subspace of (Λ) which is orthogonal to Ψ^Λ. A suitable complex coordinate patch can be constructed by using this identification along with the exponential map; thus a neighborhood of o is coordinatised by vectors in a subspace of (Λ). Let υ and ζ be the complex coordinates of y and z respectively. Using these coordinates, to second order

$$[\dim(\Lambda)]^{-2} K_1(o,y,z) \approx 1 - \|\upsilon\|^2 - \|\zeta\|^2 + \langle \upsilon | \zeta \rangle \tag{E.4}$$


(see App. E). A formula for K_N (with N ≫ 1) can be constructed by raising this to the Nth power and recalling the normalization (5.13). This gives

$$K_N(o,y,z)\, \epsilon_y \epsilon_z \approx \left(\frac{N}{\pi}\right)^{2n} e^{-N\left(\|\upsilon\|^2 + \|\zeta\|^2 - \langle \upsilon | \zeta \rangle\right)}\, d^{2n}\upsilon\, d^{2n}\zeta, \tag{5.15}$$

where 2n = dim O_Λ. The L¹ norm of the error in this expression is of order N^{−3/2} and thus goes to 0 as N → ∞.

It is a standard result that as N → ∞ a complex Gaussian such as (5.15) converges as a C^{−∞} distribution to the delta distribution δ^{2n}(υ) δ^{2n}(ζ) d^{2n}υ d^{2n}ζ. This means for smooth functions f_i ∈ C^∞(O_Λ) that I_N[P_N(f_1)P_N(f_2)](o) → f_1(o)f_2(o) as N → ∞, and (using the homogeneity of O_Λ)

$$I_N[P_N(f_1)P_N(f_2)] \xrightarrow[N\to\infty]{} f_1 f_2. \tag{5.16}$$
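The identity (5.14) and the resulting concentration of the kernel can be illustrated for G = SU(2), where (N/2) is the symmetric power Sym^N(ℂ²) and the orbit points are rays in ℂ². This is a minimal sketch under those assumptions; the helper names `coherent_state` and `kernel` are illustrative, and dim(N/2) = N + 1 is used for the prefactor in (5.12).

```python
import numpy as np
from math import comb, sqrt

def coherent_state(psi, N):
    """Components of psi ⊗ ... ⊗ psi (N factors) in an orthonormal basis of Sym^N(C^2) = (N/2)."""
    a, b = (complex(c) for c in np.asarray(psi, dtype=complex) / np.linalg.norm(psi))
    return np.array([sqrt(comb(N, k)) * a ** (N - k) * b ** k for k in range(N + 1)])

def kernel(N, x, y, z):
    """K_N(x, y, z) = (N + 1)^2 <x|y><y|z><z|x>, following the factorized form (5.12)."""
    vx, vy, vz = (coherent_state(p, N) for p in (x, y, z))
    return (N + 1) ** 2 * np.vdot(vx, vy) * np.vdot(vy, vz) * np.vdot(vz, vx)

if __name__ == "__main__":
    x, y, z = (1.0, 0.3), (0.8, 0.6j), (0.2, -1.0 + 0.5j)
    for N in (1, 10, 50, 200):
        print(N, abs(kernel(N, x, y, z)), abs(kernel(N, x, x, x)))
```

For three distinct points the modulus of the kernel decays exponentially in N, while at coincident points it grows like (N + 1)², so the kernel becomes sharply peaked on the diagonal.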

If, instead of smooth functions, we have continuous functions f_i ∈ C(O_Λ) then we can approximate these with smooth functions f̃_i. Because the maps I_N and P_N are completely positive, they are norm-contracting; this implies that the norm-difference

$$\left\| I_N[P_N(f_1)P_N(f_2)] - I_N[P_N(\tilde f_1)P_N(\tilde f_2)] \right\|$$

is bounded uniformly as N → ∞ and goes to 0 as f̃_i → f_i. This means that (5.16) is true for all continuous functions.

Using the fact that P_N(1) = 1, this also shows that I_N and P_N are asymptotically inverse, in the sense that I_N ∘ P_N(f) → f as N → ∞. This property means that we can replace P_N by a left inverse of I_N, and Eq. (5.16) will continue to hold. This shows that the direct limit converges (see App. B.1). Likewise, we can replace I_N by a right inverse of P_N, and Eq. (5.16) will continue to hold. This shows that the inverse limit converges (see App. B.3).

5.4. Polynomials. In Appendix B.1, the limit lim→{A_*, i_*} is constructed by first constructing the limit Vec-lim→{A_*, i_*} as a sequence of vector spaces and then completing to a C*-algebra. In the particular case of coadjoint orbits, Vec-lim→{A_*, i_*} is itself interesting.

The algebra C(O_Λ) is, as a G-representation, a closure of the direct sum of all its irreducible subrepresentations. On the other hand, each A_N is finite-dimensional and is therefore just a direct sum of irreducibles; any element of the limit Vec-lim→{A_*, i_*} is in the image of some A_N; therefore, Vec-lim→{A_*, i_*} is the "algebraic" direct sum of irreducibles. Since C(O_Λ) is a closure of this, Vec-lim→{A_*, i_*} must be the direct sum of all the irreducible subrepresentations of C(O_Λ).

The polynomial functions ℂ[O_Λ] on O_Λ are defined as the restrictions to O_Λ of polynomials on g*. The space of polynomials of a given degree is a direct sum of irreducible representations. Any polynomial has finite degree; therefore ℂ[O_Λ] is a direct sum of irreducible representations. Since ℂ[O_Λ] is dense in C(O_Λ), it must be the direct sum of the irreducible subrepresentations of C(O_Λ).

This shows that Vec-lim→{A_*, i_*} = ℂ[O_Λ], and so the vector space direct limit is in this case an algebra. Whether this is true in any more general case remains to be seen.
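The maps (5.7) and (5.8) and the asymptotic inverse property of Sect. 5.3 can be checked by direct integration for G = SU(2), with the orbit taken as the sphere of radius ½ used in Sect. 8. The sketch below assumes spherical coordinates (θ, φ) with basepoint θ = 0 and replaces the integral in (5.8) by a product quadrature (Gauss-Legendre in cos θ, uniform in φ), which is exact for the polynomial integrands used here; the names `coherent_state`, `quantize` and `symbol` are illustrative.

```python
import numpy as np
from math import comb

def coherent_state(theta, phi, N):
    """Highest weight vector of (N/2) rotated to the point (theta, phi); theta = 0 is the basepoint o."""
    k = np.arange(N + 1)
    binom = np.array([comb(N, int(i)) for i in k], dtype=float)
    return (np.sqrt(binom) * np.cos(theta / 2) ** (N - k)
            * np.sin(theta / 2) ** k * np.exp(1j * k * phi))

def quantize(f, N):
    """P_N(f) = (N + 1) * integral of f e_N over the sphere, with total volume normalized to 1."""
    u, w = np.polynomial.legendre.leggauss(N + 2)          # nodes in cos(theta), weights sum to 2
    phis = 2 * np.pi * np.arange(2 * N + 3) / (2 * N + 3)
    out = np.zeros((N + 1, N + 1), dtype=complex)
    for ui, wi in zip(u, w):
        theta = np.arccos(ui)
        for phi in phis:
            v = coherent_state(theta, phi, N)
            out += (wi / 2) / len(phis) * f(theta, phi) * np.outer(v, v.conj())
    return (N + 1) * out

def symbol(a, N, theta, phi):
    """I_N(a)(x) = tr[a e_N(x)] = <v|a|v> for the coherent state v at x."""
    v = coherent_state(theta, phi, N)
    return np.vdot(v, a @ v)

if __name__ == "__main__":
    x3 = lambda theta, phi: 0.5 * np.cos(theta)            # a coordinate function on the sphere
    for N in (1, 4, 16):
        ratio = symbol(quantize(x3, N), N, 0.9, 0.4) / x3(0.9, 0.4)
        print(N, ratio.real)                               # tends to 1 as N grows
```

In this example the composite I_N ∘ P_N multiplies the coordinate function by N/(N + 2), so it is not the identity at any finite N but approaches it as N → ∞.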


6. Quantization of Vector Bundles over O_Λ

6.1. General quantized bundles. A_N ≡ End(NΛ) is a full (a.k.a. simple) matrix algebra. The classification of the modules of a full matrix algebra is elementary. Any module is a tensor product of the fundamental module with some vector space. In this case the fundamental module is (NΛ), and the vector space should be a G-representation. Any irreducible, equivariant module of A_N must be of the form

$$V_N = (N\Lambda) \otimes (\nu), \tag{6.1}$$

with the algebra only acting on the first factor. Any finitely generated, equivariant A_N-module is a direct sum of such irreducibles. Because A_N is finite-dimensional, this V_N is automatically projective.

The defining property of a finitely generated, projective module is that it is a (complemented) submodule of the algebra A_N tensored with some vector space. This submodule can be picked out by a projection (idempotent). In the G-equivariant case, "vector space" becomes "G-representation", and the projection must be G-invariant. In the case of this V_N, the representation we tensor with can be chosen to be irreducible; call it (µ). This means that we can identify V_N with a submodule of A_N ⊗ (µ) in the form

$$V_N = [A_N \otimes (\mu)] \cdot Q_N, \tag{6.2}$$

where Q_N = Q_N². The factor (NΛ) is treated as a space of column vectors, but the factor (NΛ*) ⊗ (µ) is treated as a space of row vectors, i.e., Q_N multiplies them from the right. Acting from the left, Q_N would multiply the corresponding (dual) space of column vectors (NΛ) ⊗ (µ*); therefore Q_N ∈ End[(NΛ) ⊗ (µ*)] = A_N ⊗ End(µ*). We can choose µ such that Q_N is the unique invariant projection from (NΛ) ⊗ (µ*) to the irreducible subrepresentation (ν*).

The injection i_N : A_N ↪ A_{N+1} can be applied to the tensor product of A_N with a fixed algebra, in this case End(µ*). Let us apply this to Q_N and call the result Q_{N+1}; by Eq. (5.6), this is

$$Q_{N+1} := [i_N \otimes \mathrm{id}](Q_N) = (\Pi_+ \otimes 1)(1 \otimes Q_N)(\Pi_+^* \otimes 1). \tag{6.3}$$

Q_{N+1} is an endomorphism on ([N+1]Λ) ⊗ (µ*) and is clearly self-adjoint. Let ψ ∈ ([N+1]Λ) ⊗ (µ*) be a normalized vector, and look at the product

$$\langle \psi | Q_{N+1} | \psi \rangle = \langle (\Pi_+^* \otimes 1)\psi |\, (1 \otimes Q_N) \,| (\Pi_+^* \otimes 1)\psi \rangle. \tag{6.4}$$

Note that Π_+* ⊗ 1 is just the natural isometric inclusion of ([N+1]Λ) into (Λ) ⊗ (NΛ). The product (6.4) is equal to 1 if and only if (Π_+* ⊗ 1)ψ is in the image (Λ) ⊗ (ν*) of Q_N; but since ψ ∈ ([N+1]Λ) ⊗ (µ*), this is equivalent to ψ lying in the intersection (Λ + ν*). Conversely, (6.4) is 0 if (Π_+* ⊗ 1)ψ is orthogonal to (Λ) ⊗ (ν*), or equivalently, if ψ is orthogonal to (Λ + ν*). This shows that Q_{N+1} is the projection with image (Λ + ν*).

Note that Π_+*Π_+ is the self-adjoint idempotent acting on (Λ) ⊗ (NΛ), with image ([N+1]Λ). Using the same sort of reasoning as in the last paragraph, the image of (1 ⊗ Q_N)(Π_+* ⊗ 1) is in ([N+1]Λ) ⊗ (µ*), so there is the identity

$$(1 \otimes Q_N)(\Pi_+^* \otimes 1) = (\Pi_+^* \Pi_+ \otimes 1)(1 \otimes Q_N)(\Pi_+^* \otimes 1) = (\Pi_+^* \otimes 1)\, Q_{N+1}. \tag{6.5}$$

In words, moving 1 ⊗ Q_N right past Π_+* ⊗ 1 transforms it into Q_{N+1}.


The new projection Q_{N+1} gives an A_{N+1}-module V_{N+1} = [A_{N+1} ⊗ (µ)] · Q_{N+1} = ([N+1]Λ) ⊗ (ν + Λ*). Repeating this process gives a whole sequence of modules. Since the weight in the second factor is changed by Λ* with each step, it is simpler to write in terms of λ = ν − NΛ*. The sequence of modules is now

$$V_N^\lambda := (N\Lambda) \otimes (N\Lambda^* + \lambda). \tag{6.6}$$

Each of these can be realized as a submodule of A_N ⊗ (µ) in the form

$$V_N^\lambda = [A_N \otimes (\mu)] \cdot Q_N^\lambda. \tag{6.7}$$

The projections are related by the recursion⁸

$$Q_{N+1}^\lambda = [i_N \otimes \mathrm{id}](Q_N^\lambda). \tag{6.8a}$$

Because the construction of the p_N's is so similar to that of the i_N's, the same reasoning shows that [p_N ⊗ id](Q_N) is a projection as well. In fact, the same sequence of projections given by (6.8a) also satisfies

$$Q_{N-1}^\lambda = [p_N \otimes \mathrm{id}](Q_N^\lambda). \tag{6.8b}$$

Now, we can put all these Q_N^λ's together to form Q^λ ∈ Γ(A_ℕ) ⊗ End(µ*). The constructions of A in Appendices B.1 and B.3 say essentially that Q^λ ∈ A ⊗ End(µ*) if and only if one of the relations (6.8) is true in a limiting sense as N → ∞. Since Eqs. (6.8) are true for finite N, we have more than we need to show that Q^λ ∈ A ⊗ End(µ*). By construction, this Q^λ is obviously a projection. Using this, we define

$$\mathbb{V}^\lambda := [A \otimes (\mu)] \cdot Q^\lambda. \tag{6.9}$$

This is a well defined, finitely generated, projective module of A, and the restriction to each A_N is V_N^λ. This shows that 𝕍^λ is a general quantization of some bundle V^λ over O_Λ. Although (µ) was used in this construction, λ completely determines 𝕍^λ as an A-module.

6.2. Limit quantized bundles. We can use i_N to map i_N ⊗ id : A_N ⊗ (µ) ↪ A_{N+1} ⊗ (µ). For some ψ ∈ A_N ⊗ (µ), look at what happens to the product ψQ_N^λ; using (6.5),

$$\begin{aligned}
[i_N \otimes \mathrm{id}](\psi Q_N^\lambda) &= \Pi_+ (1 \otimes [\psi Q_N^\lambda])(\Pi_+^* \otimes 1) \\
&= \Pi_+ (1 \otimes \psi)(1 \otimes Q_N^\lambda)(\Pi_+^* \otimes 1) \\
&= \Pi_+ (1 \otimes \psi)(\Pi_+^* \otimes 1) \cdot Q_{N+1}^\lambda && \text{(6.10a)} \\
&= [i_N \otimes \mathrm{id}](\psi) \cdot Q_{N+1}^\lambda. && \text{(6.10b)}
\end{aligned}$$

This implies that i_N ⊗ id maps the image V_N^λ of Q_N^λ to the image V_{N+1}^λ of Q_{N+1}^λ, so we can restrict i_N ⊗ id to V_N^λ and get a well defined injection ι_N : V_N^λ ↪ V_{N+1}^λ. These injections make a directed system out of the V_N^λ's. Because of the simple relationship with the directed system {A_*, i_*}, the system {V_*^λ, ι_*} inherits its convergence. In an essentially identical way, we can construct π_N : V_N^λ ↠ V_{N−1}^λ as the restriction of p_N ⊗ id. This gives a convergent inverse system {V_*^λ, π_*}.

⁸ Actually, this is not quite always true; see Sect. 6.4.

In spite of the way that they were constructed, these ι_N's and π_N's are independent of the (µ) that we use. We can use the unique natural projection Π_+^λ ∈ Hom_G[(Λ) ⊗ (NΛ + λ*), ([N+1]Λ + λ*)] to write (in a slight modification of (6.10a))

$$\iota_N(\psi) = \Pi_+ (1 \otimes \psi)\, \Pi_+^{\lambda *}. \tag{6.11}$$

In this form ι_N manifestly depends only on Λ, N, and λ. There is again a precisely analogous form for π_N. It is easy to see that Π_+^0 = Π_+, so ι_N in (6.11) is a simple generalization of i_N in (5.6).

Analogous to the maps I_N and P_N for the algebras, there are maps I_N^{V^λ} : V_N^λ ↪ V_∞^λ ≡ Γ(V^λ) and P_N^{V^λ} : V_∞^λ ↠ V_N^λ. These are easily constructed as restrictions of I_N ⊗ id and P_N ⊗ id.

These limit quantizations both produce the same 𝕍^λ as was constructed using Q^λ in the previous section. These are, therefore, all quantizations of the same bundle V^λ.

6.3. Identification with bundles. Notation. Since the Lie algebras g and h share the same Cartan subalgebra, their weights are naturally identified (App. D.3). Denote the H-representation with highest weight λ by [λ]. Note that Ψ^λ ∈ [λ] ⊂ (λ). Beware that [λ]* and [λ*] are not generally the same.

I have established that the irreducible equivariant bundles are given by dominant weights of H, and irreducible equivariant quantized bundles are given by weights of G. So, what is the correspondence?

Using the quotient homomorphism P : A ↠ C(O_Λ), define the limit projection Q_∞^λ := [P ⊗ id](Q^λ) ∈ C(O_Λ) ⊗ End(µ*); this is naturally thought of as a projection-valued function on O_Λ. The bundle V^λ can be realized as the subbundle of O_Λ × (µ) determined by Q_∞^λ. At each point x ∈ O_Λ, the fiber of V^λ is V_x^λ = (µ) · Q_∞^λ(x) ⊂ (µ).

The injection I_N is heuristically the limit of applying i_N, then i_{N+1}, then i_{N+2}, and so on. The recursion relation (6.8a) thus implies that [I_N ⊗ id](Q_N^λ) = Q_∞^λ.

As explained in Sect. 4.2, the equivariant bundle V^λ is completely determined by its fiber at o ∈ O_Λ. This fiber is given by Q_∞^λ as V_o^λ = (µ) · Q_∞^λ(o). It is more convenient to first determine the dual (V_o^λ)* = Q_∞^λ(o) · (µ*).

The H-representation (V_o^λ)* is the image of Q_∞^λ(o). This is actually an irreducible representation, so it is determined by its highest weight. Let ψ ∈ (µ*) be a normalized vector of a given weight. If (and only if) ψ ∈ (V_o^λ)* then ⟨ψ|Q_∞^λ(o)|ψ⟩ = 1. So, evaluate this expression; it is (using (5.5))

$$\begin{aligned}
\langle \psi | Q_\infty^\lambda(o) | \psi \rangle &= \langle \psi |\, [(I_N \otimes \mathrm{id})(Q_N^\lambda)](o) \,| \psi \rangle \\
&= \langle \Psi^{N\Lambda} \otimes \psi |\, Q_N^\lambda \,| \Psi^{N\Lambda} \otimes \psi \rangle.
\end{aligned}$$


This is 1 if and only if Ψ^{NΛ} ⊗ ψ ∈ (NΛ + λ*). Since NΛ + λ* is the highest weight of (NΛ + λ*), the highest weight that ψ can have under this condition is λ*. This means that (V_o^λ)* = [λ*]; therefore V_o^λ = [λ*]*. Finally, this gives

$$V^\lambda = [\lambda^*]^* \times_H G. \tag{6.12}$$

6.4. The allowed weights. The recursion relation (6.8a) is actually not true for quite all values of λ and N. If a weight ν is not dominant, then there really is no representation (ν). It is, however, convenient to define (ν) := 0 in that case. The condition that 𝕍^λ ≠ 0 is that NΛ* + λ is dominant for some N. If λ satisfies this condition but is not itself dominant, then for low N values V_N^λ = 0, but for sufficiently large N values V_N^λ ≠ 0. In this case there is some N such that V_N^λ = 0 ≠ V_{N+1}^λ. This means that Q_N^λ = 0 ≠ Q_{N+1}^λ, so obviously [i_N ⊗ id](Q_N^λ) = 0 ≠ Q_{N+1}^λ and (6.8a) fails. However, this is the only time that (6.8a) is not true, so there is no real trouble from this. Equation (6.8b), on the other hand, is always true.

The condition that V^λ, as given by (6.12), is a nonzero bundle is that λ* is dominant as an H-weight. This is actually exactly equivalent to the condition just described for 𝕍^λ ≠ 0. This means that any finitely generated, locally trivial, equivariant vector bundle can be equivariantly quantized.

7. Further Remarks on Bundles

7.1. Uniqueness. Equivariant bundles and modules are classified by equivariant K-theory. The equivariant vector bundles over O_Λ all have equivalence classes in K_G^0(O_Λ). As has been mentioned (Sect. 4.2), these bundles are classified by representations of H. From this it is easy to see that an equivariant bundle is uniquely specified by its K-class. Similarly, an equivariant module of A is uniquely specified by its K-class in K_0^G(A). The equivariant general quantizations of vector bundles are equivariant modules of A, and are thus classified by K_0^G(A).

Since C(O_Λ) = A/A_0, there is a corresponding six-term periodic exact sequence in K-theory. Part of this sequence reads

$$K_0^G(A_0) \to K_0^G(A) \to K_G^0(O_\Lambda) \to K_1^G(A_0). \tag{7.1}$$

The ideal A_0 = Γ_0(A_ℕ) is the C*-direct sum of the algebras A_N; therefore K_*^G(A_0) = ⊕_{N=1}^∞ K_*^G(A_N). Because A_N is the matrix algebra on a simple representation of G, its equivariant K-theory is very simple. In degree 0, K_0^G(A_N) = R(G), the unitary representation ring of G. In degree 1, K_1^G(A_N) = 0. This simplifies the exact sequence (7.1). Now it reads

$$R(G)^{\oplus\infty} \to K_0^G(A) \twoheadrightarrow K_G^0(O_\Lambda) \to 0. \tag{7.2}$$

Firstly, this shows that, at the level of K-theory, any equivariant bundle has an equivariant quantization, since it has a preimage in K_0^G(A). This corroborates the conclusion of Sect. 6.3.

Secondly, this describes the variety of possible quantizations of a given bundle. If two equivariant A-modules quantize the same bundle, then the difference of their K-classes is in the image of R(G)^{⊕∞}, but that is an algebraic direct sum; it consists of sequences with only finitely many nonzero terms, and each term concerns a single N. This means that if both V_N and V′_N are quantizations of V, then for all N sufficiently large, V_N ≅ V′_N.


Given this conclusion, the choice of V_N's in Eq. (6.6) must be the unique one given by a simple formula.

7.2. Geometric quantization. For each N, the fundamental module (NΛ) of A_N is of course a module. It is tempting to ask if these together form the quantization of some bundle, but they do not. The A-module formed by assembling these is not projective. It is reasonable to instead ask, separately for each N, what bundle's equivariant quantization (by the construction of Sect. 6) has V_N = (NΛ)? This is easily answered:

$$(N\Lambda) = (N\Lambda) \otimes (0) = (N\Lambda) \otimes (N\Lambda^* - N\Lambda^*) = V_N^{-N\Lambda^*}. \tag{7.3}$$

Using the identity that [NΛ]* = [−NΛ] (see App. D.3), the corresponding bundle is V^{−NΛ*} = [NΛ] ×_H G. The H-representation [NΛ] is one-dimensional; this bundle is therefore of rank 1, i.e., it is a line bundle. In geometric quantization of O_Λ, the fundamental module (NΛ) of A_N is constructed as the space of holomorphic sections of this very line bundle.

7.3. Bimodules. For a commutative algebra, any module can automatically be considered a bimodule; simply define right multiplication to be equal to left multiplication. However, it is not generally the case that when a vector bundle is quantized, the corresponding module continues to be a bimodule. The right side (the row-vector factor) of V_N^λ is (NΛ* + λ) and does not in general admit any equivariant right multiplication by A_N ≡ End(NΛ).

If V_N is an A_N-bimodule, then it must contain a factor of (NΛ) to accommodate the left multiplication, and a separate factor of (NΛ*) to accommodate the right multiplication. It must therefore be the tensor product of A_N itself by some representation. The corresponding classical bundle is then the trivial bundle with fiber equal to that representation. This is an unpleasantly restrictive class.

A slightly broader class of bundles results if we allow the quantum modules to be multiplied from the left and right by different A_N's. This is enough to make 𝕍 an A-bimodule, but it is contrary to the philosophy of each N being a separate step along the way to the classical limit. The irreducibles of this class of modules are of the form

$$V_N = (N\Lambda) \otimes ([N+m]\Lambda^*) \otimes (\lambda) = V_N^{m\Lambda^*} \otimes (\lambda). \tag{7.4}$$

The corresponding classical bundles are a slightly more interesting class than trivial bundles, but still quite restrictive. This can be extended a little further in some cases by using a larger parameter set I. It remains to be seen whether this class of modules is useful.

8. The Case of the 2-Sphere

The group SU(2) is the most elementary compact, simple Lie group, so the simplest example of what has been described here is for G = SU(2). There is only one distinct coadjoint orbit for SU(2); it is the 2-sphere. As a coset space S² = SU(2)/U(1). The positive Weyl chamber of SU(2) is C_+ = ℝ_+. Thought of as the parameter space for S²'s, this is the set of radii. In deference to standard physics notation, I will identify the dominant weights with positive half-integers. The irreducible representations are thus (0), (½), (1), et cetera. The most appropriate choice for Λ is ½.


The Lie algebra su(2) is generated by J1 , J2 , and J3 , with the commutation relations [Ji , Jj ]− = ikij Jk ; that is, [J1 , J2 ]− = iJ3 , et cetera. The standard choice for the Cartan subalgebra C is the one-dimensional span of J3 . The weights are just the eigenvalues of J3 . In the representation ( N2 ) the highest weight vector satisfies J3 9N/2 = N2 9N/2 . There is a single (quadratic) Casimir operator C1 (J) = J 2 ≡ J12 + J22 + J32 . Its eigenvalue on the representation ( N2 ) is N2 [ N2 + 1]. There is a single Serre relation for End( N2 ). In terms of the element J+ := 21 (J1 +iJ2 ), the relation is that J+N +1 = 0. Although this is expressed in a noninvariant way, this condition really is invariant; it could equivalently be expressed in terms of many other possible combinations of J’s. The logic of the Serre relation is that the representation ( N2 ) is N + 1 dimensional. It can be decomposed into one-dimensional weight subspaces (J3 eigenspaces). The operator J+ shifts these weight subspaces; it maps the subspace with weight m to the subspace with weight m + 1 (the next higher possible weight). J+ can be applied to some J3 -eigenvector no more than N times before there are no more eigenvalues available, and the result must be 0. Therefore J+N +1 applied to anything in ( N2 ) must give 0. We can construct a general quantization by the method of Sect. 5.1. The generators xi := P(N −1 Ji ) of the resulting A∞ satisfy the relations of commutativity and x21 + x22 + x23 = 41 . Obviously this shows A∞ to be the continuous functions on the sphere of radius 21 in su(2)∗ ∼ = R3 . All SU(2)-representations are self-dual. Because of this, the constructions of iN and pN are even closer than in the general case in Sect. 5.2. Decompose the tensor product ( 21 ) ⊗ ( N2 ) = ( N2+1 ) ⊕ ( N 2−1 ). 
There is a representation of AN on this that acts trivially on the ( 21 ) factor; for an element a ∈ AN , the ( N2+1 ) corner of this representation matrix is iN (a) ∈ AN +1 ; the ( N 2−1 ) corner is pN (a) ∈ AN −1 . In this case it is possible to construct a simple and (partly) explicit formula for the “product” kernel KN . The key is to use the identification S 2 = CP1 = P( 21 ). The geodesic distances on S 2 are given by the Fubini-Study metric; for two points [ψ], [ϕ] ∈ P( 21 ) the distance dS 2 ([ψ], [ϕ]) is determined by 2

cos2 [dS 2 ([ψ], [ϕ])] :=

|hψ|ϕi| . hψ|ψi hϕ|ϕi

Recall that this is the sphere of radius 21 , so 0 ≤ dS 2 (x, y) ≤ the formulas (5.12) or (E.2) for K1 gives that

π 2.

Comparing this with

|K1 (x, y, z)| = 4 · cos[dS 2 (x, y)] cos[dS 2 (y, z)] cos[dS 2 (z, x)]. Noting (5.14), this gives for arbitrary N that |KN (x, y, z)| = (N + 1)2 cosN [dS 2 (x, y)] cosN [dS 2 (y, z)] cosN [dS 2 (z, x)]. What remains to be determined is the phase. This has no simple formula, but is easily understood geometrically: arg KN (x, y, z) is 2N times the area of the geodesic triangle on S 2 with vertices x, y, and z. To see this, show that this is true for infinitesimal triangles and that this quantity is additive when a triangle is decomposed into smaller triangles. Clearly KN does indeed become sharply peaked as N → ∞. Since the isotropy group of S 2 is H = U(1), the classification of equivariant vector bundles over S 2 is extremely simple. The irreducible bundles are classified by irreducible


representations of U(1), which are in turn indexed by half integers. Denote these representations by $[m]$ for any $m \in \frac{1}{2}\mathbb{Z}$. Since these representations are all one-dimensional, the irreducible bundles are all rank-one. Under the restriction $\mathrm{SU}(2) \hookleftarrow \mathrm{U}(1)$, an irreducible representation of SU(2) decomposes into a direct sum of irreducible U(1)-representations. This is simply $(j) \to [-j] \oplus [-j + 1] \oplus \cdots \oplus [j]$ for any nonnegative $j \in \frac{1}{2}\mathbb{Z}$. Let $W^m := [m] \times_{\mathrm{U}(1)} \mathrm{SU}(2)$ be the equivariant vector bundle over $S^2$ with fiber $W^m_o = [m]$. The space of continuous sections $\Gamma(W^m)$ is a completion of the space of polynomial sections $\Gamma_{\mathrm{poly}}(W^m)$. As an SU(2)-representation, $\Gamma_{\mathrm{poly}}(W^m)$ is a direct sum of irreducibles and is easily computed. The representation $(j)$ occurs precisely once in $\Gamma_{\mathrm{poly}}(W^m)$ if and only if $[m]$ occurs in the decomposition of $(j)$; in other words, when $j \equiv m \bmod 1$ and $j \ge |m|$. So, $\Gamma_{\mathrm{poly}}(W^m) = (|m|) \oplus (|m| + 1) \oplus (|m| + 2) \oplus \cdots$. So, what is the quantization of $W^m$? We need to know for which $j$ we have $W^m = V^j$. Equation (6.12) shows that $[m] = [j^*]^*$. Since all SU(2)-representations are self-dual, $j^* = j$. All irreducible U(1)-representations are one-dimensional, so (see App. D.3) $[j]^* = [-j]$. This means that $j = -m$, and the quantization of $W^m$ is $V^{-m}$. As an SU(2)-representation, $V_N^{-m} = (\frac{N}{2}) \otimes (\frac{N}{2} - m) = (|m|) \oplus (|m| + 1) \oplus \cdots \oplus (N - m)$. Clearly, modulo completion, $V_N^{-m}$ in the limit $N \to \infty$ becomes the same SU(2)-representation as $\Gamma(W^m)$. This applies in particular when $\lambda = 0$, so $V_N^0 = \mathcal{A}_N$ and $\Gamma(W^0) = C(S^2)$. With a little more work, this consistency check can be carried out for all the complex projective spaces $\mathbb{CP}^n$. In a way, it may seem odd to be using SU(2) as the symmetry group for $S^2$. The group of distinct orientation-preserving isometries of $S^2$ is SO(3); the group SU(2) is its simply connected double cover. If we had used the smaller group, we would have artificially excluded all the $\mathcal{A}_N$'s with $N$ odd.
Although SO(3) acts on all the algebras $\mathcal{A}_N$, we need the SU(2)-representation $(N\Lambda)$ in order to construct $\mathcal{A}_N$. Another reason is that many of the vector bundles on $S^2$ are SU(2)-equivariant, but not SO(3)-equivariant. It is generally the case that the simply connected $G$ is not the minimal symmetry group of a coadjoint orbit. Indeed, the minimal symmetry group of $\mathcal{O}_\Lambda$ is the group $G' = G/Z(G)$ (the "adjoint group") which maximizes the fundamental group $\pi_1(G')$. Nevertheless, the simply connected $G$ is the easiest to deal with, and most fruitful, choice.

9. Final Remarks

One motivation for considering the limit quantization approach for bundles comes from physics. If this sort of quantization is used as a regularization technique, then it would be desirable to do a "renormalization group" analysis. This involves going from one level of regularization to a coarser one with fewer degrees of freedom. In order to do this we need a sort of coarse-graining map that associates a given field configuration with a coarser field configuration, ignoring some of the degrees of freedom of the original.


In $n$-dimensional lattice regularization, the space is approximated by a lattice. The coarse-graining is accomplished by grouping the lattice points into groups of $2^n$ and averaging the field values at those $2^n$ points. This field value is then given to a single point of the new, coarsened lattice, which has $2^{-n}$ times as many points. The degrees of freedom are thus reduced (drastically) by a factor of $2^n$. Classically, field configurations are sections of vector bundles. If quantization is used as a regularization technique, the field configurations are the vectors in the quantum modules $V_N$. Coarse-graining means going from $N$ to $N - 1$. The coarse-graining map is $\pi_N$. The degrees of freedom vary as only a polynomial function of $N$, so $\dim V_N / \dim V_{N-1} \approx 1$ for large $N$. This is far gentler than lattice regularization. I hope to discuss this and related matters in a future paper. Another reason for using constructions in terms of limits, as I have here, is simply that it is the most convenient approach when dealing with coadjoint orbits. When dealing with the quantization of a more general symplectic manifold, objects such as the Hilbert space $\mathcal{H}_N$ are constructed as spaces of sections over the manifold; everything is constructed from the manifold. In the case of coadjoint orbits, however, $\mathcal{H}_N$ is constructed directly as a $G$-representation. We can actually deal more explicitly with the algebra $\mathcal{A}_N$ than with the algebra $C(\mathcal{O}_\Lambda)$. For this reason, it is more convenient to construct the classical structures from the quantum structures, rather than vice versa. The construction of the maps $I_N$ and $P_N$ in Sect. 5.2 is standard [1, 12, 18]. In the terminology of Berezin [1], $P_N(a)$ is the contravariant symbol of $a$, and an element of the preimage $I_N^{-1}(a)$ is a covariant symbol of $a$. The idea of directed limit quantization here is based on a construction by Grosse, Klimčík, and Prešnajder in [9]. In that case the quantization of $S^2$ was being discussed.
Their choice of $i_N$ is different and is based on the criterion of preserving the $L^2$-norm from one algebra to the next. My choice is based on the criterion of compatibility with the standard $I_N$'s. It can easily be checked that $I_N$ never preserves the $L^2$-norms, and therefore my choice of $i_N$'s never satisfies their criterion. In [10], Grosse, Klimčík, and Prešnajder constructed quantized vector bundles for the special case of $S^2$. Their result is the same as mine for that case (see Sect. 8). To reiterate, the main conclusion of this paper is that when the coadjoint orbit $\mathcal{O}_\Lambda = G/H$ through $\Lambda$ is quantized to give a sequence of matrix algebras $\mathcal{A}_N = \mathrm{End}(N\Lambda)$, the equivariant vector bundle

$$V^\lambda = [\lambda^*]^* \times_H G \tag{6.12}$$

quantizes to a corresponding sequence of $\mathcal{A}_N$-modules

$$V_N^\lambda = (N\Lambda) \otimes (N\Lambda^* + \lambda). \tag{6.6}$$

In [14] I will continue by describing analogous results in the more general case of compact Kähler manifolds.
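The contrast drawn above between lattice coarse-graining and the step from $N$ to $N - 1$ can be made concrete for $S^2$. The sketch below is only an illustration under stated assumptions: it uses NumPy, the helper names `coarse_grain`, `module_spins`, and `module_dim` are hypothetical, and the module dimensions are read off from the decomposition $V_N^{-m} = (|m|) \oplus \cdots \oplus (N - m)$ of Sect. 8 rather than from any construction of the maps $\pi_N$ themselves.

```python
import numpy as np
from fractions import Fraction

def coarse_grain(field):
    """Average a field on an n-dimensional lattice over blocks of 2^n points."""
    field = np.asarray(field, dtype=float)
    n = field.ndim
    shape = []
    for side in field.shape:
        if side % 2:
            raise ValueError("each lattice side must be even")
        shape += [side // 2, 2]
    # Axes 1, 3, 5, ... index the position inside each block of 2^n points.
    return field.reshape(shape).mean(axis=tuple(range(1, 2 * n, 2)))

def module_spins(N, m):
    """Spins j occurring in V_N^{-m} = (N/2) ⊗ (N/2 - m) for S^2 (empty if N < 2m)."""
    m = Fraction(m)
    j, top = abs(m), N - m
    spins = []
    while j <= top:
        spins.append(j)
        j += 1
    return spins

def module_dim(N, m):
    """Dimension of V_N^{-m}, summing 2j + 1 over the spins that occur."""
    return int(sum(2 * j + 1 for j in module_spins(N, m)))

if __name__ == "__main__":
    field = np.arange(16.0).reshape(4, 4)
    coarse = coarse_grain(field)
    print("lattice reduction factor:", field.size / coarse.size)
    for N in (10, 100, 1000):
        print(N, module_dim(N, 0) / module_dim(N - 1, 0))
```

On a two-dimensional lattice each coarse-graining step removes three quarters of the degrees of freedom, whereas the printed ratios $\dim V_N / \dim V_{N-1}$ approach 1 as $N$ grows.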

A. Sections

Before discussing the construction of limits, it is worthwhile to clarify the notations for different spaces of sections of the bundles of algebras and modules. Given a noncompact base space, there are several useful types of continuous sections of a vector bundle, all of which are equivalent for a compact base space. For the base space $\mathbb{N}$, sections are the same thing as sequences. For legibility, I will sometimes write sections as sequences in that case.

The space of all continuous sections of a vector bundle $E$ is denoted $\Gamma(E)$. If $E$ is a bundle of algebras, then $\Gamma(E)$ is an algebra. However, for a bundle $\mathcal{A}_I$ of C*-algebras, $\Gamma(\mathcal{A}_I)$ is not a C*-algebra since the sup-norm diverges. For a discrete base space, this is the algebraic direct product.

The space of continuous sections with compact support is denoted $\Gamma_c(E)$. For a bundle of algebras, this is an ideal inside $\Gamma(E)$. For the C*-bundle $\mathcal{A}_I$ this space $\Gamma_c(\mathcal{A}_I)$ has a C*-norm, but is not complete and therefore not C*. For a discrete base space, this is the algebraic direct sum.

If the fibers of $E$ are normed (as C*-algebras are), then two more types of section can be defined. $\Gamma_b(E)$ is the space of sections of bounded norm. For the C*-bundle $\mathcal{A}_I$, $\Gamma_b(\mathcal{A}_I)$ is a C*-algebra; the norm of a section is the supremum of the norms at all points of $I$. For C*-algebras over a discrete base space this is the C*-direct product.

$\Gamma_0(E)$ is the space of sections such that the norms converge to 0 approaching $\infty$. To be precise, any arbitrarily low bound on the norms is satisfied on the complement of some compact set. This is the norm closure of $\Gamma_c(E)$. For the C*-bundle $\mathcal{A}_I$, $\Gamma_0(\mathcal{A}_I)$ is a closed ideal in $\Gamma_b(\mathcal{A}_I)$. For C*-algebras over a discrete base space, this is the C*-direct sum.

These spaces of sections are related by $\Gamma_c \subset \Gamma_0 \subset \Gamma_b \subset \Gamma$. The appropriate notion of a bundle of C*-algebras is that of a continuous field of C*-algebras. This is discussed extensively in [6].

B. Limits

B.1. Direct limit of algebras. Since we are assuming the index set to be $\mathbb{N}$, sections of $\mathcal{A}_{\mathbb{N}}$ can also be thought of as sequences. In the category of vector spaces, the limit of a directed system of algebras is

$$\text{Vec-}\varinjlim\{\mathcal{A}_*, i_*\} := \mathring{\mathbb{A}}/\Gamma_c(\mathcal{A}_{\mathbb{N}}), \tag{B.1}$$

where

$$\mathring{\mathbb{A}} := \{a \in \Gamma(\mathcal{A}_{\mathbb{N}}) \mid \exists M\ \forall N \ge M : a_{N+1} = i_N(a_N)\}. \tag{B.2}$$

The injections $i_N$ are meant to identify $a_N$ with $i_N(a_N)$; (B.1) therefore gives the set of sequences which for sufficiently large $N$ become constant, modulo the sequences which for sufficiently large $N$ are 0. Thinking of $\mathcal{A}_N \subset \mathcal{A}_{N+1}$, the limit is heuristically the union $\bigcup_{N \in \mathbb{N}} \mathcal{A}_N$ of this nested sequence. Usually, one works in the category of C*-algebras in which the morphisms are *-homomorphisms. If the $i_N$'s are assumed to be *-homomorphisms, then the C*-algebraic limit (see [7]) of finite-dimensional algebras will be (by definition) an AF-algebra. This is far too restrictive a class of algebras in this context; a commutative AF-algebra is isomorphic to the continuous functions on a totally disconnected, zero-dimensional space (see [21]). In order to avoid this restriction, we must allow the $i_N$'s to be some more general type of morphism. Firstly, these must be linear, and I will assume (perhaps unnecessarily) that they are unital (i.e., $i_N(1) = 1$). Several convergence conditions on the $i_N$'s will also be needed. The first condition is that the $i_N$'s be norm-contracting maps; this means $\forall a \in \mathcal{A}_N$, $\|i_N(a)\| \le \|a\|$. There is a fairly nice class of norm-contracting maps for C*-algebras; these are the completely positive maps (see [16]). All of the $i_N$'s and $p_N$'s constructed in this paper are completely positive; however, I am not relying on that property in general. The norm-contracting condition ensures $\mathring{\mathbb{A}} \subset \Gamma_b(\mathcal{A}_{\mathbb{N}})$. Since each $\mathcal{A}_N$ is a C*-algebra, each has a C*-norm. The natural norm on the limit is the limit of these; that is, for any equivalence class $[a] \in \text{Vec-}\varinjlim\{\mathcal{A}_*, i_*\}$ define

$$\|[a]\| := \lim_{N \to \infty} \|a_N\|. \tag{B.3}$$

The norm-contracting condition guarantees that this is well defined, since it is a limit of a sequence that is (for sufficiently large $N$) nonincreasing and bounded from below (by 0). To ensure that this is truly a norm requires a second condition: that it be nondegenerate. That is, $a \ne 0 \implies \|a\| \ne 0$. This is equivalent to the condition that $\mathring{\mathbb{A}} \cap \mathbb{A}_0 = \Gamma_c(\mathcal{A}_{\mathbb{N}})$, where $\mathbb{A}_0 = \Gamma_0(\mathcal{A}_{\mathbb{N}})$. This means that $\Gamma_c(\mathcal{A}_{\mathbb{N}})$ can be replaced by $\mathbb{A}_0$ in (B.1), and $\mathring{\mathbb{A}}/\mathbb{A}_0$ is naturally embedded in the C*-algebra $\Gamma_b(\mathcal{A}_{\mathbb{N}})/\mathbb{A}_0$. The norm (B.3) agrees with the natural norm on this quotient. Now define $\mathcal{A}_\infty = \varinjlim\{\mathcal{A}_*, i_*\}$ as the closure of $\mathring{\mathbb{A}}/\mathbb{A}_0$ in $\Gamma_b(\mathcal{A}_{\mathbb{N}})/\mathbb{A}_0$, or equivalently as the abstract norm completion of $\mathring{\mathbb{A}}/\mathbb{A}_0$. Also define $\mathbb{A} \subset \Gamma_b(\mathcal{A}_{\mathbb{N}})$ as the norm closure of $\mathring{\mathbb{A}} \subset \Gamma_b(\mathcal{A}_{\mathbb{N}})$. Another construction of $\mathcal{A}_\infty$ is $\mathcal{A}_\infty = \mathbb{A}/\mathbb{A}_0$; this shows that if we view sections in $\Gamma_b(\mathcal{A}_{\mathbb{N}})$ as sequences, then $\mathbb{A}$ is the subspace of sequences which converge into $\mathcal{A}_\infty$. It is not a priori true that $\mathcal{A}_\infty$ is an algebra; this requires a third (and final) condition. Require that $\mathcal{A}_\infty$ be algebraically closed in $\Gamma_b(\mathcal{A}_{\mathbb{N}})/\mathbb{A}_0$. This is equivalent to requiring that $\mathbb{A} \subset \Gamma_b(\mathcal{A}_{\mathbb{N}})$ be algebraically closed. Assuming these conditions, both $\mathcal{A}_\infty$ and $\mathbb{A}$ are norm-closed subalgebras of C*-algebras; they are therefore C*-algebras themselves. For each $N$, there is a canonical injection

$$I_N : \mathcal{A}_N \hookrightarrow \text{Vec-}\varinjlim\{\mathcal{A}_*, i_*\} \subset \mathcal{A}_\infty \tag{B.4}$$

which takes $a \mapsto [(0, \dots, 0, a, i_N(a), i_{N+1} \circ i_N(a), \dots)]$. Heuristically, $I_N = \dots \circ i_{N+1} \circ i_N$. If we are trying to prove that a given directed system $\{\mathcal{A}_*, i_*\}$ truly converges to a given $\mathcal{A}_\infty$, the third convergence condition is the most critical. Using the notation $i_{N,M} := i_{M-1} \circ \dots \circ i_N : \mathcal{A}_N \hookrightarrow \mathcal{A}_M$, an equivalent statement is that $\forall N\ \forall a, b \in \mathcal{A}_N$,

$$\lim_{m \to \infty} I_{N+m}[i_{N,N+m}(a)\, i_{N,N+m}(b)] = I_N(a) I_N(b). \tag{B.5}$$

If there are left inverses $I_N^{\mathrm{inv}}$ (such that $I_N^{\mathrm{inv}} \circ I_N = \mathrm{id}$), chosen so that the sections $N \mapsto I_N^{\mathrm{inv}}(f)$ are continuous, then there is a simpler statement. This convergence condition becomes $\forall f_1, f_2 \in \mathcal{A}_\infty$,

$$I_N[I_N^{\mathrm{inv}}(f_1)\, I_N^{\mathrm{inv}}(f_2)] \xrightarrow[N \to \infty]{} f_1 f_2. \tag{B.6}$$

This is the form used in Sect. 5.3. In this circumstance it is also necessary to check that $\text{Vec-}\varinjlim\{\mathcal{A}_*, i_*\} \subset \mathcal{A}_\infty$ really is dense. This means that $I_N$ needs to be "asymptotically onto". Using the $I_N^{\mathrm{inv}}$'s, this simplifies to the requirement that $\forall f \in \mathcal{A}_\infty$, $I_N \circ I_N^{\mathrm{inv}}(f) \to f$ as $N \to \infty$.
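To see how a directed system with non-multiplicative $i_N$'s can have a limit that is not AF, it may help to look at a toy commutative analogue. The example below is not taken from the paper; it is a sketch under the assumption that $\mathcal{A}_N = \mathbb{C}^{N+1}$ with pointwise product (real entries are used for simplicity), that $i_N$ is Bernstein degree elevation, and that $I_N(a)$ is the Bernstein polynomial on $[0, 1]$ with coefficients $a$. All function names are hypothetical. Degree elevation is linear, unital, and norm-contracting, and the limit is expected to be $C([0, 1])$; the function `product_defect` measures how far condition (B.5) is from holding at a finite stage.

```python
import numpy as np
from math import comb

def elevate(a):
    """Toy i_N : A_N -> A_{N+1}, Bernstein degree elevation on coefficient vectors."""
    a = np.asarray(a, dtype=float)
    N = len(a) - 1
    w = np.arange(N + 2) / (N + 1)
    # New coefficient k is the convex combination w_k a_{k-1} + (1 - w_k) a_k.
    return w * np.concatenate(([0.0], a)) + (1 - w) * np.concatenate((a, [0.0]))

def elevate_to(a, M):
    """Toy i_{N,M} = i_{M-1} ∘ ... ∘ i_N."""
    a = np.asarray(a, dtype=float)
    while len(a) - 1 < M:
        a = elevate(a)
    return a

def to_function(a, t):
    """Toy I_N(a): the Bernstein polynomial with coefficients a, evaluated at t."""
    N = len(a) - 1
    t = np.asarray(t, dtype=float)
    return sum(a[k] * comb(N, k) * t**k * (1 - t)**(N - k) for k in range(N + 1))

def product_defect(a, b, M, t):
    """Sup over the sample points t of |I_M[i(a) i(b)] - I_N(a) I_N(b)|, cf. (B.5)."""
    lhs = to_function(elevate_to(a, M) * elevate_to(b, M), t)
    rhs = to_function(a, t) * to_function(b, t)
    return np.max(np.abs(lhs - rhs))

if __name__ == "__main__":
    t = np.linspace(0.0, 1.0, 5)
    a = [0.0, 1.0]                      # I_1(a) is the coordinate function t
    for M in (10, 100):
        print(M, product_defect(a, a, M, t))
```

For $a = (0, 1)$ the elevated coefficients are $k/M$, and the Bernstein polynomial of their squares is $t^2 + t(1 - t)/M$, so the defect on this grid is $1/(4M)$ and tends to 0, as (B.5) requires.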


Although this was done for the index set $\mathbb{N}$, it can trivially be generalized to any directed set.

B.2. Direct limit of modules. Given a directed system $\{V_*, \iota_*\}$ of finitely generated, projective modules of each $\mathcal{A}_N$, we would like to construct a limit module of the limit algebra $\mathcal{A}_\infty$. The construction must work in the special case that the system is just $\{\mathcal{A}_*, i_*\}$. The vector space direct limit $\text{Vec-}\varinjlim\{V_*, \iota_*\}$ is not itself an $\mathcal{A}_\infty$-module; it needs to be completed somehow. Completion is usually done with some norm, but there is generally no natural norm on the $V_N$'s. Instead, complete algebraically. The algebraic direct product $\Gamma(V_{\mathbb{N}})$ is a $\Gamma(\mathcal{A}_{\mathbb{N}})$-module, and by restriction an $\mathbb{A}$-module. From the construction of the vector space direct limit, start with the vector space

$$\mathring{\mathbb{V}} := \{\psi \in \Gamma(V_{\mathbb{N}}) \mid \exists M\ \forall N \ge M : \psi_{N+1} = \iota_N(\psi_N)\}. \tag{B.7}$$

Now define $\mathbb{V}$ as the span of $\mathbb{A}\mathring{\mathbb{V}} \subset \Gamma(V_{\mathbb{N}})$. I insist that $\mathbb{V}$ be a finitely generated $\mathbb{A}$-module, so there is a convergence condition that any element of $\mathbb{V}$ can be written as the sum of a bounded number of elements of $\mathbb{A}\mathring{\mathbb{V}}$. In other words, $\mathbb{A}\mathring{\mathbb{V}} + \cdots + \mathbb{A}\mathring{\mathbb{V}}$ stabilizes for some finite number of summands. It is now easy to construct an $\mathcal{A}_\infty$-module. The ideal $\mathbb{A}_0$ induces a submodule $\mathbb{A}_0\mathbb{V} \subset \mathbb{V}$, and the quotient $V_\infty := \mathbb{V}/\mathbb{A}_0\mathbb{V}$ is an $\mathcal{A}_\infty$-module. This is the direct limit of modules. Note that its construction requires the map $P : \mathbb{A} \twoheadrightarrow \mathcal{A}_\infty$ but does not require any other quantization structure for the algebras.

B.3. Inverse limit of algebras. The limit of the inverse system of algebras is easier to construct. It is

$$\mathcal{A}_\infty = \varprojlim\{\mathcal{A}_*, p_*\} := \{a \in \Gamma_b(\mathcal{A}_{\mathbb{N}}) \mid \forall N : a_{N-1} = p_N(a_N)\}. \tag{B.8}$$

Again, the $p_N$'s should not be required to be homomorphisms, and again, convergence conditions are necessary. This limit also inherits a norm $\|a\| := \lim_{N \to \infty} \|a_N\|$. This is well defined if the $p_N$'s are required to be norm-contracting. It is then the limit of a nondecreasing sequence that is bounded from above. No additional condition is required to make this nondegenerate since $\|a\| \ge \|a_N\|$. $\mathcal{A}_\infty$ is already complete with respect to this norm. Since $\mathcal{A}_\infty$ consists of sequences of nondecreasing norm, the intersection with $\mathbb{A}_0 = \Gamma_0(\mathcal{A}_{\mathbb{N}})$ is 0. This means that $\mathcal{A}_\infty$ injects naturally into $\Gamma_b(\mathcal{A}_{\mathbb{N}})/\mathbb{A}_0$. Define $\mathbb{A}$ to be the preimage of $\mathcal{A}_\infty$ by the quotient homomorphism $\Gamma_b(\mathcal{A}_{\mathbb{N}}) \twoheadrightarrow \Gamma_b(\mathcal{A}_{\mathbb{N}})/\mathbb{A}_0$; this gives $\mathbb{A} = \mathcal{A}_\infty + \mathbb{A}_0 \subset \Gamma_b(\mathcal{A}_{\mathbb{N}})$. This $\mathcal{A}_\infty$ is also not a priori an algebra. We again need the condition that $\mathcal{A}_\infty \subset \Gamma_b(\mathcal{A}_{\mathbb{N}})/\mathbb{A}_0$ be algebraically closed. This is equivalent to requiring that $\mathbb{A} \subset \Gamma_b(\mathcal{A}_{\mathbb{N}})$ be algebraically closed. If $\mathcal{A}_\infty$ and $\mathbb{A}$ are algebraically closed, then they are C*-algebras. For each $N$, there is a canonical surjection $P_N : \varprojlim\{\mathcal{A}_*, p_*\} \twoheadrightarrow \mathcal{A}_N$ which simply takes $a \mapsto a_N$. Heuristically $P_N = p_{N+1} \circ p_{N+2} \circ \dots$. This last convergence condition is again the most critical. If we are testing whether $\varprojlim\{\mathcal{A}_*, p_*\} \overset{?}{=} \mathcal{A}_\infty$, then an equivalent statement is $\forall f_1, f_2 \in \mathcal{A}_\infty$,

$$\lim_{N \to \infty} \|P_N(f_1) P_N(f_2) - P_N(f_1 f_2)\| = 0. \tag{B.9}$$


If there are right inverses $P_N^{\mathrm{inv}}$ (such that $P_N \circ P_N^{\mathrm{inv}} = \mathrm{id}$), chosen so that $P_N^{\mathrm{inv}} \circ P_N(f) \to f$ as $N \to \infty$, then there is a simpler statement. This convergence condition becomes $\forall f_1, f_2 \in \mathcal{A}_\infty$,

$$P_N^{\mathrm{inv}}[P_N(f_1) P_N(f_2)] \xrightarrow[N \to \infty]{} f_1 f_2. \tag{B.10}$$

This is the form used in Sect. 5.3.

B.4. Inverse limit of modules. This construction is very much the same as in B.2 for a direct limit of modules. For an inverse system $\{V_*, \pi_*\}$ of modules, first construct the vector space

$$\mathring{\mathbb{V}} := \{\psi \in \Gamma(V_{\mathbb{N}}) \mid \forall N : \psi_{N-1} = \pi_N(\psi_N)\}. \tag{B.11}$$

Again define $\mathbb{V}$ as the span of $\mathbb{A}\mathring{\mathbb{V}}$, and the convergence condition is that $\mathbb{A}\mathring{\mathbb{V}} + \cdots + \mathbb{A}\mathring{\mathbb{V}}$ stabilizes for some finite number of summands. Define $\varprojlim\{V_*, \pi_*\} := \mathbb{V}/\mathbb{A}_0\mathbb{V}$.

C. Review of Representation Theory

Let $G$ be a compact, simply connected, semisimple Lie group. This always contains a Cartan subgroup $T$. This is a maximal abelian subgroup which is always of the form $\mathrm{U}(1)^\ell$ (a torus group). Any two Cartan subgroups of $G$ are conjugate, so it is irrelevant which one we now fix and call the Cartan subgroup. Since the irreducible representations of U(1) are one-dimensional and classified by $\mathbb{Z}$, the irreducible representations of $T$ are one-dimensional and classified by the lattice $\mathbb{Z}^\ell$. The Cartan subalgebra $\mathcal{C} \subset \mathfrak{g}$ is the Lie algebra of $T$. Any vector in an irreducible representation of $T$ is an eigenvector of any element of $\mathcal{C}$; the eigenvalue depends linearly on the position of the representation in the above lattice (and on the element of $\mathcal{C}$). The lattice is therefore naturally thought of as lying in the dual $\mathcal{C}^*$ of the Cartan subalgebra. It is called the weight lattice. There is a natural inner product (the Cartan-Killing form) on the Lie algebra $\mathfrak{g}$; using this, there is a natural sense in which $\mathcal{C}^* \subset \mathfrak{g}^*$. There are some symmetries of $\mathcal{C}^*$, residual from the action of $G$ on $\mathfrak{g}^*$. The symmetry group of $\mathcal{C}^*$ is the subgroup of $G$ that preserves $\mathcal{C}^* \subset \mathfrak{g}^*$, modulo the subgroup that acts trivially on $\mathcal{C}^*$. This is called the Weyl group $W$ and is finite. Since both are naturally constructed from the pair $\mathcal{C} \subset \mathfrak{g}$, the Weyl group preserves the weight lattice. The Weyl group is generated by a set of reflections across hyperplanes in $\mathcal{C}^*$. These hyperplanes divide $\mathcal{C}^*$ into wedges called Weyl chambers; each Weyl chamber is a fundamental domain of the $W$ action on $\mathcal{C}^*$. This means that the $W$-orbit of any point of $\mathcal{C}^*$ intersects a given closed Weyl chamber at least once and intersects the interior of a given Weyl chamber at most once.
We can choose a basis of the weight lattice; that is, a set of fundamental weights $\{\pi_j\}$ such that the weight lattice is the integer span $\sum_j \mathbb{Z}\pi_j$. Given the choice of $\mathcal{C}$, the fundamental weights are unique modulo the freedom to change their signs. Fix a set of fundamental weights. The natural index set for the fundamental weights is the set of vertices of the Dynkin diagram of $\mathfrak{g}$. The positive span of the fundamental weights $\sum_j \mathbb{R}_+ \pi_j$ is precisely a (closed) Weyl chamber. Call this the positive Weyl chamber $\mathcal{C}_+$. The weights that lie in $\mathcal{C}_+$ are nonnegative integer combinations of the fundamental weights and are called dominant weights. Since it is a fundamental domain of the $W$ action, the positive Weyl chamber $\mathcal{C}_+$ can naturally be identified with $\mathcal{C}^*/W$. Given an irreducible representation of $G$, it can also be regarded as a $T$-representation. The representation space therefore naturally decomposes into a direct sum of subspaces associated with different weights. The set of weights that occur is $W$-invariant. The subspace associated with the dominant weight furthest from 0 is always one-dimensional; that weight is called the highest weight of the representation. Nonisomorphic irreducible representations have distinct highest weights, and any dominant weight is the highest weight of some representation. The irreducible representations of $G$ are therefore exactly classified by dominant weights. I denote the representation space with highest weight $\lambda$ as $(\lambda)$. Weights are additive under the tensor product. If two vectors have weights $\lambda$ and $\mu$, then their tensor product has weight $\lambda + \mu$. Because of this, the highest weight of the (reducible) representation $(\lambda) \otimes (\mu)$ is $\lambda + \mu$. The decomposition of $(\lambda) \otimes (\mu)$ into irreducibles will therefore always contain precisely one copy of $(\lambda + \mu)$; this irreducible representation is called the Cartan product of $(\lambda)$ and $(\mu)$. For each irreducible representation, we can choose a normalized vector $\Psi^\lambda \in (\lambda)$ in the highest weight subspace. This is called a highest weight vector. Their phases are arbitrary, but can be chosen consistently so that $\Psi^\lambda \otimes \Psi^\mu = \Psi^{\lambda+\mu} \in (\lambda + \mu) \subset (\lambda) \otimes (\mu)$. The linear dual of an irreducible representation is also an irreducible representation; we can therefore define $\lambda^*$ by the property $(\lambda^*) = (\lambda)^*$. This is a linear transformation on the weight lattice; it simply permutes the fundamental weights and is given by an automorphism (possibly trivial) of the Dynkin diagram. Whenever $\lambda - \mu$ is a dominant weight, $(\lambda) \otimes (\mu^*)$ will contain precisely one copy of $(\lambda - \mu)$.
In particular, if $\lambda = \mu$ this says that $(\lambda) \otimes (\lambda^*)$ contains one copy of the trivial representation; this is little more than the definition of the dual.

D. Coadjoint Orbits

The purpose of this appendix is to describe the rationale for restricting attention to coadjoint orbits, and then to discuss some of the structure of coadjoint orbits. Toward this goal, I first discuss a more general structure.

D.1. Symplectic structure. Thus far I have entirely avoided mentioning something which is usually mentioned first in discussions of quantization: the symplectic structure. Assume $\mathcal{M}$ to be a manifold. Suppose that part of our quantization structure is a system of maps $I_N : \mathcal{A}_N \hookrightarrow C(\mathcal{M})$, identifying quantum operators with classical functions. We can choose a system of left inverses; that is, maps $I_N^{\mathrm{inv}} : C(\mathcal{M}) \twoheadrightarrow \mathcal{A}_N$ such that $I_N^{\mathrm{inv}} \circ I_N$ is the identity map $\mathcal{A}_N \to \mathcal{A}_N$. Using these, we can pull back the products on each of the $\mathcal{A}_N$'s to $C(\mathcal{M})$, giving a sequence of products

$$f *_N f' = I_N[I_N^{\mathrm{inv}}(f)\, I_N^{\mathrm{inv}}(f')]. \tag{D.1}$$

By construction, these converge to the ordinary product of functions as $N \to \infty$. Suppose that the quantization is compatible with the smooth structure of $\mathcal{M}$ in the sense that for smooth functions $f, f' \in C^\infty(\mathcal{M})$ the correction $f *_N f' - f f'$ is of order $\frac{1}{N}$.$^{9}$ I will assume that any quantization of interest satisfies this. This compatibility means that the function

$$\{f, f'\} := \lim_{N \to \infty} -iN \left(f *_N f' - f' *_N f\right) \tag{D.2}$$

is well defined. This is the Poisson bracket of $f$ and $f'$; it is easily seen to be, by construction, antisymmetric and a derivation in both arguments. This means that there exists an antisymmetric, contravariant, rank-2 tensor$^{10}$ $\pi^{ij}$ such that the Poisson bracket is given by $\{f, f'\} = \langle\pi, df \wedge df'\rangle \equiv \pi^{ij}\, (df)_i\, (df')_j$. With the assumption that the algebras $\mathcal{A}_N$ are finite-dimensional, the $\pi$ should be nondegenerate if thought of as a map from 1-forms to tangent vectors. This means that it has an inverse $\omega = \pi^{-1}$, which is naturally a 2-form. The Poisson bracket also satisfies the Jacobi identity, and this implies that $\omega$ is a closed 2-form ($d\omega = 0$). This $\omega$ is the symplectic form. Although left inverses are not unique, the Poisson bracket and symplectic form are independent of the specific choice of the $I_N^{\mathrm{inv}}$'s here.

$^{9}$ This can be generalized slightly by replacing $\frac{1}{N}$ with some other function $\hbar(N)$ that goes to 0 as $N \to \infty$. The implication (existence of Poisson bracket) remains the same.

D.2. Why coadjoint orbits? Let $\mathcal{M}$ be a compact manifold and assume that a compact, semisimple Lie group $G$ acts smoothly and transitively on $\mathcal{M}$. This implies that $\pi_1(\mathcal{M})$ is finite, and thus $H^1(\mathcal{M}; \mathbb{R}) = 0$. Everything we do should be $G$-equivariant. Because $G$ acts smoothly on $\mathcal{M}$, the elements of the Lie algebra $\mathfrak{g}$ of $G$ define certain vector fields on $\mathcal{M}$. Since the quantization is assumed to be $G$-equivariant, the symplectic form must be $G$-invariant. This implies that for any $\xi \in \mathfrak{g}$ thought of as a vector field on $\mathcal{M}$, $0 = \mathcal{L}_\xi \omega = d(\xi \lrcorner\, \omega) + \xi \lrcorner\, d\omega = d(\xi \lrcorner\, \omega)$; so (using $H^1 = 0$) there is a "Hamiltonian" $h(\xi) \in C^\infty(\mathcal{M})$ such that $\xi \lrcorner\, \omega = dh(\xi)$, which is well defined modulo constants. The constant can be fixed by requiring that the average of $h(\xi)$ over $\mathcal{M}$ is 0. This gives a well-defined linear map $h : \mathfrak{g} \to C(\mathcal{M})$. For any $x \in \mathcal{M}$, the evaluation $\xi \mapsto h(\xi)(x)$ is a linear map $\mathfrak{g} \to \mathbb{C}$; in other words, $h$ lets us map $x$ into the linear dual $\mathfrak{g}^*$. That map is the "moment map" $\Phi : \mathcal{M} \to \mathfrak{g}^*$ (see [13]). Because $\mathcal{M}$ is homogeneous and compact, the moment map turns out to be an embedding, so effectively $\mathcal{M} \subset \mathfrak{g}^*$.
By transitivity, $\mathcal{M}$ is precisely the orbit of any of its points under the natural "coadjoint" action $\mathrm{Ad}^*_G$ of $G$ on $\mathfrak{g}^*$, so any of the homogeneous spaces we are considering is a coadjoint orbit. Since $\mathfrak{g}^{**} = \mathfrak{g}$, any element of $\mathfrak{g}$ is naturally thought of as a linear function on $\mathfrak{g}^*$. The Lie bracket on these of course satisfies the Jacobi identity, and extends to a unique Poisson bracket for all functions on $\mathfrak{g}^*$. If $x_i$ are linear coordinates on $\mathfrak{g}^*$ then the Poisson bivector $\pi$ on $\mathfrak{g}^*$ is given by $\pi_{ij} = C^k{}_{ij} x_k$. This Poisson bivector is degenerate, but restricts to a nondegenerate one on any coadjoint orbit. This makes any coadjoint orbit symplectic. The set of homogeneous spaces we are interested in is therefore precisely the set of coadjoint orbits of compact Lie groups. The single point $\{0\} \subset \mathfrak{g}^*$ is trivially a coadjoint orbit. It is an exception to some of the statements in this paper, but an utterly uninteresting one, so I will not mention it again.

$^{10}$ This is also called a bivector.

D.3. Structure of coadjoint orbits. We are interested in all coadjoint orbits, but all coadjoint orbits intersect $\mathcal{C}^* \subset \mathfrak{g}^*$, so it is sufficient to consider the orbits of all $\Lambda \in \mathcal{C}^*$. These are still not all distinct; $\mathcal{O}_\Lambda = \mathcal{O}_{\Lambda'}$ if (and only if) $\Lambda$ and $\Lambda'$ are mapped to one another by the Weyl group $W$. The set of distinct coadjoint orbits is therefore $\mathfrak{g}^*/G \cong \mathcal{C}^*/W \cong \mathcal{C}_+$, using the fact that the Weyl chamber $\mathcal{C}_+$ is a fundamental domain of the $W$ action (App. C).

We would like to express the coadjoint orbit $\mathcal{O}_\Lambda$ of $\Lambda \in \mathcal{C}_+$ as $G/H$. So what is $H$? It is the subgroup of $G$ leaving $\Lambda$ invariant, or equivalently the centralizer

$$H = Z_\Lambda(G) \equiv \{h \in G \mid \mathrm{Ad}_h(\Lambda) = \Lambda\} \tag{D.3}$$

if $\Lambda$ is identified with an element of $\mathfrak{g}$ using the inner product. In this sense, $\Lambda \in \mathcal{C}$, so because $\mathcal{C}$ is Abelian, $\mathcal{C} \subset \mathfrak{h}$. This implies that the Cartan subgroup $T$ is a subgroup of $H$, so $T$ can be used as the Cartan subgroup of $H$, and weights of $G$ and $H$ are naturally identified. There are, however, weights which are dominant for $H$ that are not for $G$, and the Weyl groups are different. Expand $\Lambda$ in the basis $\{\pi_j\}$, and mark the vertices $j \in \mathrm{Dynkin}(\mathfrak{g})$ for which $\pi_j$ has a nonzero coefficient in $\Lambda$. The vertices of the Dynkin diagram are also the natural index set for the dual basis of fundamental roots. In the standard root decomposition of $\mathfrak{g}_{\mathbb{C}}$, $E_\alpha$ commutes with $\Lambda$ and is thus in $\mathfrak{h}_{\mathbb{C}}$ if and only if $\alpha$ is orthogonal to $\Lambda$. This is true precisely if, in the expansion in fundamental roots, $\alpha$ has 0 coefficients for all the marked vertices of $\mathrm{Dynkin}(\mathfrak{g})$. This means that $\mathfrak{h}_{\mathbb{C}}$ is spanned by $\mathcal{C}$ and the $E_\alpha$'s that are supported on the unmarked vertices. This gives a simple, diagrammatic way of calculating $\mathfrak{h}$: the Lie algebra $\mathfrak{h}$ of $H$ is the sum of a copy of $\mathfrak{u}(1)$ for every marked vertex and the Lie algebra of whatever Dynkin diagram is left after deleting all the marked vertices (and adjoining edges). (This diagrammatic method is also described in [2]; the only difference is that the complementary set of vertices is marked.) This shows that, up to homeomorphism, the orbit $\mathcal{O}_\Lambda$ depends only on which coefficients are nonzero. On the other hand, the symplectic structure and metric do vary with $\Lambda$. Since the number of marked vertices is the number of nonzero coefficients for $\Lambda$, this is the number of parameters that orbits in a given homeomorphism class vary by. One of these degrees of freedom simply corresponds to rescaling. I use the notation $[\lambda]$ for the irreducible $H$-representation with highest weight $\lambda$. If the weight $\lambda$ is a combination of fundamental weights corresponding to marked vertices of $\mathrm{Dynkin}(\mathfrak{g})$, then the semisimple part of $\mathfrak{h}$ acts trivially on $[\lambda]$.
In this case $[\lambda]$ is one-dimensional and is just a representation of the abelian part of $\mathfrak{h}$. The weights $N\Lambda$ are of this type. In general, if $[\lambda]$ is one-dimensional then $[\lambda]^* = [-\lambda]$. If $[\lambda]$ is one-dimensional and $\mu$ is arbitrary, then $[\lambda] \otimes [\mu] = [\lambda + \mu]$.

D.4. Examples. The existence of the symplectic structure implies that a coadjoint orbit must be even-dimensional. The table lists the lowest-dimensional coadjoint orbits (all those with dimension ≤ 6). Note that $\mathbb{CP}^3$ occurs in two forms.

Coadjoint Orbits with dim ≤ 6

dim.  Name                               $\mathcal{O}_\Lambda$               G/H                    Diagram
2     Sphere                             $S^2$                               SU(2)/U(1)             •
4     Complex projective space           $\mathbb{CP}^2$                     SU(3)/U(2)             •−−◦
6     Complex projective space           $\mathbb{CP}^3$                     SU(4)/U(3)             •−−◦−−◦
      "                                  "                                   Sp(4)/U(1) × Sp(2)     ◦=⇒•
6     Complex flag variety                                                   SU(3)/U(1) × U(1)      •−−•
6     Double cover of real Grassmannian  $\tilde{G}^{\mathbb{R}}_{2,5}$      SO(5)/SO(2) × SO(3)    •=⇒◦


A less trivial example is given by the diagram •−−◦−−•−−◦=⇒◦. In this case $G = \widetilde{\mathrm{SO}}(11)$, and (modulo coverings) $H \approx \mathrm{U}(1) \times \mathrm{SU}(2) \times \mathrm{U}(1) \times \mathrm{SO}(5)$. The dimension is $\dim G - \dim H = 55 - (1 + 3 + 1 + 10) = 40$. Notably, $S^2$ is the only sphere which is a coadjoint orbit. In fact it is the only sphere which admits a symplectic structure, equivariant or not. This is because the symplectic form on a compact manifold always has a nontrivial cohomology class, implying $H^2(\mathcal{M}) \ne 0$. The 2-sphere is the only sphere such that $H^2(S^n) \ne 0$. This means that with the reasonable-seeming condition of respecting the smooth structure (as described in App. D.1), no other sphere may be quantized. For a claim to the contrary, see [11].
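The diagrammatic rule of App. D.3 is easy to mechanize when the Dynkin diagram is a simple chain. The following sketch covers only type $A_n$, that is $G = \mathrm{SU}(n+1)$; the function name `type_a_orbit_dimension` and the encoding of a marked diagram as a list of booleans are assumptions made for illustration. Each marked vertex contributes a $\mathfrak{u}(1)$, and each maximal run of $k$ unmarked vertices contributes $\mathfrak{su}(k+1)$, of dimension $k(k+2)$.

```python
def type_a_orbit_dimension(marks):
    """Dimension of the coadjoint orbit G/H for G = SU(n+1), n = len(marks).

    marks[j] is True when vertex j of the A_n Dynkin diagram is marked,
    i.e. when the corresponding fundamental weight has a nonzero coefficient.
    """
    n = len(marks)
    dim_g = (n + 1) ** 2 - 1              # dim SU(n+1)
    dim_h = sum(bool(mark) for mark in marks)   # one u(1) per marked vertex
    run = 0                               # length of the current unmarked run
    for marked in list(marks) + [True]:   # sentinel closes the final run
        if marked:
            dim_h += run * (run + 2)      # dim su(run+1) = (run+1)^2 - 1
            run = 0
        else:
            run += 1
    return dim_g - dim_h

if __name__ == "__main__":
    examples = {
        "S^2   (•)": [True],
        "CP^2  (•−−◦)": [True, False],
        "CP^3  (•−−◦−−◦)": [True, False, False],
        "flags (•−−•)": [True, True],
    }
    for name, marks in examples.items():
        print(name, type_a_orbit_dimension(marks))
```

The four type-A rows of the table should be reproduced by this count. Diagrams with multiple bonds, such as the SO(11) example, would need the dimensions of the other classical algebras and are not handled by this sketch.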

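E. Projective Space

There is a (very) slightly different perspective on how the formula (5.5) for the injection $I_N : \mathcal{A}_N \hookrightarrow C(\mathcal{O}_\Lambda)$ comes about. It can be thought of as resulting from a natural embedding of $\mathcal{O}_\Lambda$ into the projectivisation $\mathbb{P}(N\Lambda)$ of the representation $(N\Lambda)$. The idea is simply that since $\Psi^{N\Lambda}$ is fixed modulo phase by $H$, its projective equivalence class $[\Psi^{N\Lambda}] \in \mathbb{P}(N\Lambda)$ is exactly fixed by $H$. Indeed $H$ is the entire isotropy group of this point. This means that the equivariant map that takes $\mathcal{O}_\Lambda \ni o \mapsto [\Psi^{N\Lambda}]$ is an embedding $\mathcal{O}_\Lambda \hookrightarrow \mathbb{P}(N\Lambda)$. Any point $[\psi] \in \mathbb{P}(N\Lambda)$ determines a state (a normalized element of the dual) of $\mathcal{A}_N$. This takes

$$a \mapsto \frac{\langle\psi|a|\psi\rangle}{\langle\psi|\psi\rangle}. \tag{E.1}$$

So, we can naturally map $\mathcal{O}_\Lambda \hookrightarrow \mathbb{P}(N\Lambda) \to \mathcal{A}_N^*$. From this point the story continues in the same way as in Sect. 5.2. Suppose that $|x\rangle$, $|y\rangle$, and $|z\rangle$ are (unnormalized) vectors in $(\Lambda)$ such that $x \mapsto [\,|x\rangle\,] \in \mathbb{P}(\Lambda)$, et cetera. The formula (5.12) for $K_1$ can be rewritten as

$$[\dim(\Lambda)]^{-2} K_1(x, y, z) = \frac{\langle x|y\rangle \langle y|z\rangle \langle z|x\rangle}{\langle x|x\rangle \langle y|y\rangle \langle z|z\rangle}. \tag{E.2}$$

A continuous choice of these vectors cannot be made globally, but it can be made in a small neighborhood of $o$. As in Sect. 5.3, let's fix $x = o$. The obvious choice for $|o\rangle$ is $|\Psi^\Lambda\rangle$. The arbitrariness in the other vectors is the freedom to multiply by a scalar. If we fix these vectors by letting $\langle\Psi^\Lambda|y\rangle = 1$ (and likewise for $z$), then (E.2) simplifies to

$$[\dim(\Lambda)]^{-2} K_1(o, y, z) = \frac{\langle\Psi^\Lambda|y\rangle \langle y|z\rangle \langle z|\Psi^\Lambda\rangle}{\langle\Psi^\Lambda|\Psi^\Lambda\rangle \langle y|y\rangle \langle z|z\rangle} = \frac{\langle y|z\rangle}{\langle y|y\rangle \langle z|z\rangle}. \tag{E.3}$$

Now suppose that we have a complex coordinate system for $y$ and $z$, that the coordinates are vectors $\upsilon$ and $\zeta$ in a subspace of $(\Lambda)$, and that to first order $|y\rangle$ is given by $|y\rangle \approx |\Psi^\Lambda\rangle + |\upsilon\rangle$ (and $|z\rangle$ likewise by $\zeta$). From this, the inner product $\langle y|z\rangle$ can be calculated to second order:

$$\langle y|z\rangle = 1 + \langle y|z - \Psi^\Lambda\rangle = 1 + \langle y - \Psi^\Lambda|z - \Psi^\Lambda\rangle \approx 1 + \langle\upsilon|\zeta\rangle.$$

Inserting this into (E.3) gives a formula for $K_1$ to second order:

$$[\dim(\Lambda)]^{-2} K_1(o, y, z) \approx \frac{1 + \langle\upsilon|\zeta\rangle}{\left(1 + \|\upsilon\|^2\right)\left(1 + \|\zeta\|^2\right)} \approx 1 - \|\upsilon\|^2 - \|\zeta\|^2 + \langle\upsilon|\zeta\rangle. \tag{E.4}$$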
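Formula (E.2) can be tried out numerically in the simplest case $\Lambda = \frac{1}{2}$, where $(\Lambda) = \mathbb{C}^2$ and $\mathbb{P}(\Lambda) = S^2$. The sketch below assumes NumPy, and the names `fs_cos` and `kernel_K1` are illustrative rather than the paper's. It evaluates (E.2) on three unnormalized vectors and compares the modulus with the product of Fubini-Study cosines from Sect. 8. The three sample vectors are assumed to represent three mutually orthogonal directions under the usual Bloch-sphere identification, so that the geodesic triangle is one octant of the sphere.

```python
import numpy as np

def fs_cos(psi, phi):
    """Cosine of the Fubini-Study distance between the classes [psi] and [phi]."""
    overlap = abs(np.vdot(psi, phi))
    return overlap / np.sqrt(np.vdot(psi, psi).real * np.vdot(phi, phi).real)

def kernel_K1(x, y, z):
    """K_1(x, y, z) from (E.2), for unnormalized vectors in the same representation."""
    dim = len(x)
    numerator = np.vdot(x, y) * np.vdot(y, z) * np.vdot(z, x)
    denominator = np.vdot(x, x).real * np.vdot(y, y).real * np.vdot(z, z).real
    return dim**2 * numerator / denominator

if __name__ == "__main__":
    # Three unnormalized vectors in C^2 (assumed octant triangle on S^2).
    x = np.array([1, 0], dtype=complex)
    y = np.array([1, 1], dtype=complex)
    z = np.array([1, 1j], dtype=complex)
    K = kernel_K1(x, y, z)
    print("|K_1|      :", abs(K))
    print("4 cos cos cos:", 4 * fs_cos(x, y) * fs_cos(y, z) * fs_cos(z, x))
    print("|arg K_1|  :", abs(np.angle(K)))
```

An octant of the sphere of radius $\frac{1}{2}$ has area $\pi/8$, so the statement that $\arg K_N$ is $2N$ times the triangle area predicts a phase of magnitude $\pi/4$ for $N = 1$; the sign depends on the orientation convention, which this sketch does not fix.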

Notation [ · , · ]− AIˆ AN A ˚ A A0 Cn cn (N 3) C C+ eN End 0 0b 00 0c 0poly I Iˆ Ji N ˆ N (λ) [λ] λ∗ O3 P P 5 9λ QλN ter VN

Commutator, [a, b]− = ab − ba. (5.1). ˆ Sect. 2. The bundle of algebras over I. Quantum algebra at index N ∈ I, later End(N 3). Sects. 2, 5. C∗ -algebra of continuous sections of AIˆ . Sects. 2, 5.1, 5.1, B.1, B.3. Preliminary vector space, dense in A. (B.2). = 00 (AI ), an ideal in A, the Kernel of P : A A∞ . Sects. 5.1, B.1, B.3. The nth Casimir polynomial. (5.2). The eigenvalue of the Casimir operator Cn (J) acting on (N 3). (5.2). Cartan subalgebra of g. App. C. Positive Weyl chamber in C∗ . Sects. 4.1, C.

The function eN : O3 ,→ AN , gH 7→ g9N 3 g9N 3 . (5.7) Endomorphisms, the algebra of matrices Ron some vector space. Sect. 5. Volume form on O3 , normalized so that O = 1. (5.8). 3 The space of continuous sections of a bundle. App. A. The space of norm-bounded sections. App. A. The space of continuous sections vanishing at ∞. App. A. The space of compactly supported sections. App. A. The space of polynomial sections. Sect. 8. Index set of the quantization. Sect. 2 = I ∪ {∞}. Sect. 2 The basis of Hermitian generators of g. Sect. 5.1. = {1, 2, . . . }. = {1, 2, . . . , ∞}. G-representation space with highest weight λ. Sects. 5, C. H-representation space with highest weight λ. Sects. 6.3, D.3. Weight vector such that (λ∗ ) = (λ)∗ . Sects. 5.2, C. Coadjoint orbit passing through the weight vector 3 ∈ C+ ⊂ g∗ . Sect. 4.1. The surjection A C(M). Sect. 2 Projectivization of a vector space. App. E. Projection onto some subrepresentation. Sects. 5.2, 6.2. Normalized highest weight vector in (λ). Sects. 5.2, C. Projection such that VNλ = [AN ⊗ (µ)]QN . Sect. 6.1 Trace normalized so that ter 1 = 1. Sect. 5.2. Module of the algebra AN in the quantization of the bundle V. Sect. 3


V∞ VNλ V ˚ V

= 0(V ), which is a module of C(M). Sect. 3 The bundle of quantum modules associated to the G-weight λ. Sect. 6.1. A-module expressing a quantization of V . Sect. 3 Vector space which generates V as an A-module. App. B.2, B.4.

Acknowledgement. I wish to thank Ranee Brylinski and Nigel Higson for extensive discussions. This material is based upon work supported under a National Science Foundation Graduate Fellowship. Also supported in part by NSF grant PHY95-14240 and by the Eberly Research Fund of the Pennsylvania State University.

References

1. Berezin, F. A.: General Concept of Quantization. Commun. Math. Phys. 40, 153–174 (1975)
2. Bordemann, M., Forger, M., Römer, H.: Homogeneous Kähler Manifolds: Paving the Way Towards New Supersymmetric Sigma Models. Commun. Math. Phys. 102, 605–647 (1986)
3. Chamseddine, A., Connes, A.: The Spectral Action Principle. E-print, hep-th/9606001. Commun. Math. Phys. 186, 731–750 (1997)
4. Connes, A.: Noncommutative Geometry. New York: Academic Press, 1994
5. Connes, A.: Gravity Coupled with Matter and the Foundation of Noncommutative Geometry. E-print, hep-th/9603053. Commun. Math. Phys. 182, 155–176 (1996)
6. Dixmier, J.: C*-algebras. Amsterdam: North Holland, 1982
7. Fillmore, P. A.: A User's Guide to Operator Algebras. New York: Wiley Interscience, 1996
8. Grosse, H., Klimčík, C., Prešnajder, P.: Towards Finite Quantum Field Theory in Noncommutative Geometry. E-print, hep-th/9505175. Int. J. Theor. Phys. 35, 231–244 (1996)
9. Grosse, H., Klimčík, C., Prešnajder, P.: Field Theory on a Supersymmetric Lattice. E-print, hep-th/9507074. Commun. Math. Phys. 185, 155–175 (1997)
10. Grosse, H., Klimčík, C., Prešnajder, P.: Topologically Nontrivial Field Configurations in Noncommutative Geometry. E-print, hep-th/9510083. Commun. Math. Phys. 178, 507–526 (1996)
11. Grosse, H., Klimčík, C., Prešnajder, P.: On Finite 4-D Quantum Field Theory in Noncommutative Geometry. E-print, hep-th/9602115. Commun. Math. Phys. 180, 429–438 (1996)
12. Grosse, H., Prešnajder, P.: The Construction of Noncommutative Manifolds Using Coherent States. Lett. Math. Phys. 28, 239–250 (1993)
13. Guillemin, V., Sternberg, S.: Symplectic Techniques in Physics. Cambridge: Cambridge University Press, 1984
14. Hawkins, E.: Geometric Quantization of Vector Bundles. E-print, math.QA/9808116
15. Helgason, S.: Differential Geometry, Lie Groups, and Symmetric Spaces. Pure and Applied Mathematics, Volume 80, New York: Academic Press, 1978
16. Lance, E. C.: Hilbert C*-modules. London Mathematical Society Lecture Note Series, no. 210, Cambridge: Cambridge University Press, 1995
17. Onishchik, A. L., Vinberg, E. B.: Lie Groups and Algebraic Groups. New York: Springer-Verlag, 1988
18. Perelomov, A.: Generalized Coherent States and their Applications. Berlin: Springer-Verlag, 1986
19. Rieffel, M. A.: Quantization and C*-algebras. In: C*-algebras 1943–1993: A 50 Year Celebration, R. Doran, ed., Contemp. Math. 167, 66–97 (1994)
20. Schwartz, J. T.: Differential Geometry and Topology. New York: Gordon and Breach, 1968
21. Wegge-Olsen, N. E.: K-theory and C*-algebras. Oxford: Oxford University Press, 1993

Communicated by A. Connes

Commun. Math. Phys. 202, 547 – 569 (1999)

Communications in Mathematical Physics

© Springer-Verlag 1999

Dynamics and Stability of a Weak Detonation Wave

Anders Szepessy

Matematiska Institutionen, Kungl. Tekniska Högskolan, S-100 44 Stockholm, Sweden. E-mail: [email protected]

Received: 17 August 1998 / Accepted: 13 November 1998

Abstract: One dimensional weak detonation waves of a basic reactive shock wave model are proved to be nonlinearly stable, i.e. initially perturbed waves tend asymptotically to translated weak detonation waves. This model system was derived as the low Mach number limit of the one component reactive Navier-Stokes equations by Majda and Roytburd [SIAM J. Sci. Stat. Comput. 43, 1086–1118 (1983)], and its weak detonation waves have been numerically observed as stable. The analysis shows in particular the key role of the new nonlinear dynamics of the position of the shock wave. The shock translation solves a nonlinear integral equation, obtained by Green's function techniques, and its solution is estimated by observing that the kernel can be split into a dominating convolution operator and a remainder. The inverse operator of the convolution and detailed properties of the traveling wave reduce, by monotonicity, the remainder to a small $L^1$ perturbation.

1. Introduction and Main Result

We shall study the stability of traveling weak detonation waves of the Majda–Rosales combustion model, see [RM],

\[
\begin{aligned}
u_t + \Bigl(\frac{u^2}{2} - q_0 z\Bigr)_x - \beta u_{xx} &= 0, && x \in \mathbb{R},\ t > 0,\\
z_x &= K\phi(u)z, && x \in \mathbb{R},\ t > 0,\\
u(\cdot,0) = u_0, \qquad \lim_{x\to\infty} z(x,t) &= 1,
\end{aligned}
\tag{1.1}
\]

where $u$ represents a lumped temperature variable, $z$ the fuel concentration, which satisfies $0 \le z \le 1$, and $\phi$ is the ignition temperature kinetics

\[
\phi(u) \equiv
\begin{cases}
0, & u \le 0,\\
1, & u > 0.
\end{cases}
\tag{1.2}
\]


Here the heat release $q_0$, the reaction rate $K$ and the viscosity $\beta$ are given positive constants. The initial temperature $u_0$ is given and the fuel concentration is 1 at $x = +\infty$ for all time. Majda and Rosales derived equation (1.1) as the low Mach number limit of the one component reactive Navier–Stokes equations, see [RM]. In [CMR], Colella, Majda and Roytburd demonstrated by numerical experiments that weak detonation waves of (1.1–2) and the reactive Navier–Stokes equations are stable. Gasser and Szmolyan have proved the existence of strong and weak detonation waves of the one dimensional Navier–Stokes equations with small viscosity, heat conductivity and diffusion, based on a single reactant, [GS1], and a multistep reaction system, [GS2]. Their constructive proof uses geometric singular perturbation theory, where the singular solutions are related to the ZND-model. Gardner proved, in [G1], the existence of detonation waves for the case without diffusion.

Here we shall prove that weak detonation waves of (1.1–2) are stable, i.e., the solution to (1.1–2), which initially is a slightly perturbed weak detonation wave, tends asymptotically to a translated weak detonation wave. The proof is based on three fundamental observations, where the first is a surprising property of the special system (1.1–2). The main new idea in the paper is to find a suitable time dependent translation and analyze the dynamics of the shock position. It turns out there is a translation which in fact completely decouples the conservation law for $u$ and the reaction equation. This new decoupling, which clearly simplifies the analysis, is fundamental for our work. Then the Hopf–Cole transformation is used to reduce the study to a linear variable coefficient problem, as in [GSZ] and [SZ]. Finally, the Green's function of this linear variable coefficient problem is found by a refined parametrix method, using a technique based on the analysis in [SZ].

With these three observations, the perturbations are analyzed pointwise, using the Green's function and its derivatives, inspired by the previous work on pointwise estimates for nonlinear waves in [L1, LZ1, SX, LZ2, SZ, L2]. The dynamics of the translation $\bar\delta(t)$ for the weak detonation wave, with speed $s$ starting in $x = 0$ at $t = 0$, is based on the position of the ignition determined by $u(st + \bar\delta(t), t) = 0$. This special translation is central to our study and implies that the concentration $z$ in fact becomes a translation of the traveling wave concentration, since the ignition kinetics $\phi(u)$ now depends only on the ignition position, for suitable small perturbations. The Hopf–Cole transformation and the Green's function can be applied to the decoupled conservation law for $u$. In particular the condition $u(st + \bar\delta(t), t) = 0$ yields a certain nonlinear integral equation for $\bar\delta'(t)$. The solution of the integral equation is estimated by observing that the kernel can be split into a dominating convolution operator and a remainder, where the inverse operator of the convolution and use of detailed properties of the traveling wave reduce, by monotonicity, the remainder to a small $L^1$ perturbation.

A related study of determining shock translation is in the work [LZ2] on stability of undercompressive shock waves, by Liu and Zumbrun. They construct time invariants, based on a linearized operator, which yield a transformation that approximately decouples small perturbations for undercompressive shocks in one space dimension. The translation is determined by a linear functional, which can be identified with an eigenvector corresponding to a zero eigenvalue of the dual linearized operator. Our problem (1.1) is not included in the framework of [LZ2] for undercompressive waves. The translation is in our case instead determined by the nonlinear problem $u(st + \bar\delta(t), t) = 0$.
Shock translations, based on conservation of mass, are studied for scalar two dimensional shocks by spectral methods in [G2, GM], for interaction of boundary layers and stationary shocks by pointwise methods in [LYu1], and for the zero dissipation limit of solutions with shocks of hyperbolic systems in [Y] treating the coupling with initial layers and diffusion waves.

To present the theorem and then its proof it is convenient to scale the problem (1.1) with

\[
x^* = \frac{x}{\beta}, \quad t^* = \frac{t}{\beta}, \quad K^* = K\beta, \quad
u^*(x^*,t^*) = u(x,t), \quad z^*(x^*,t^*) = z(x,t),
\]

which transforms (1.1) to

\[
\begin{aligned}
u_t + \Bigl(\frac{u^2}{2} - q_0 z\Bigr)_x - u_{xx} &= 0, && x \in \mathbb{R},\ t > 0,\\
z_x &= K\phi(u)z, && x \in \mathbb{R},\ t > 0,
\end{aligned}
\tag{1.3a}
\]

with the initial and boundary conditions

\[
u(x,0) = u_0(x), \qquad \lim_{x\to\infty} z(x,t) = 1,
\tag{1.3b}
\]

where we omit the superscript $*$ on everything. Majda has proved that the model (1.3) has a traveling reactive shock wave, see [M, RM],

\[
u(x,t) = U(x - st), \quad z(x,t) = Z(x - st), \qquad \lim_{x\to\pm\infty} U(x) = u_\pm,
\tag{1.4}
\]

with speed $s$, called a weak detonation wave, characterized by

\[
u_+ < 0 < u_-,
\tag{1.5a}
\]
\[
s = \frac{1}{2}(u_- + u_+) + \frac{q_0}{u_- - u_+} > u_-,
\tag{1.5b}
\]

provided the heat release satisfies

\[
q_0 = q_{cr}(K, u_-, u_+),
\tag{1.5c}
\]

where $q_{cr}$ is a certain function which satisfies

\[
c\,\frac{u_-(u_- - u_+)}{-u_+}\,K \le q_{cr} \le C\,\frac{u_-(u_- - u_+)}{-u_+}\,K,
\tag{1.5d}
\]

for some positive constants $c$ and $C$. Inserting the ansatz (1.4) in (1.3) shows that the wave $(U(x - st), Z(x - st))$ solves

\[
\begin{aligned}
U' &= \frac{U^2}{2} - \frac{u_-^2}{2} - q_0 Z - s(U - u_-),\\
Z' &= K\phi(U)Z.
\end{aligned}
\tag{1.6}
\]
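As an illustration of (1.5b) and (1.6), the following minimal sketch evaluates the wave speed and the right-hand side of the traveling-wave system, and confirms that both end states $(u_-, 0)$ and $(u_+, 1)$ are rest points. The function names and the numerical values of $u_\pm$, $q_0$ and $K$ are assumptions chosen for illustration only; in particular $q_0$ is not claimed to equal $q_{cr}$ here.

```python
def ignition(u):
    """Ignition kinetics (1.2): 0 for u <= 0, 1 for u > 0."""
    return 1.0 if u > 0 else 0.0


def wave_speed(u_minus, u_plus, q0):
    """Wave speed from (1.5b): s = (u- + u+)/2 + q0/(u- - u+)."""
    return 0.5 * (u_minus + u_plus) + q0 / (u_minus - u_plus)


def profile_rhs(U, Z, u_minus, u_plus, q0, K):
    """Right-hand side (U', Z') of the traveling-wave system (1.6)."""
    s = wave_speed(u_minus, u_plus, q0)
    dU = 0.5 * U**2 - 0.5 * u_minus**2 - q0 * Z - s * (U - u_minus)
    dZ = K * ignition(U) * Z
    return dU, dZ


# Illustrative (hypothetical) parameter values, not taken from the paper.
u_minus, u_plus, q0, K = 0.05, -0.05, 0.1, 1.0
s = wave_speed(u_minus, u_plus, q0)

# Both end states of the wave should be rest points of (1.6).
rest_left = profile_rhs(u_minus, 0.0, u_minus, u_plus, q0, K)
rest_right = profile_rhs(u_plus, 1.0, u_minus, u_plus, q0, K)
```

The right end state is a rest point precisely because $s$ satisfies the jump relation in (1.5b); with these values the speed also exceeds $u_-$, as required for a weak detonation.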

The wave is a heteroclinic orbit starting from the saddle point (U, Z) = (u− , 0), following the unstable manifold to (U, Z) = (0, qcr /q0 ), and then connecting (0, qcr /q0 )


to the node (u+ , qcr /q0 ) which gives the condition (1.5c). A figure in [RM] illustrates the heteroclinic orbits of detonation waves. What happens with the weak detonation wave, connecting (u− , 0) to (u+ , 1), if we slightly perturb the heat release q0 around q0 = qcr ? Numerical experiments show that the solution adjusts its left state to (u∗− , 0) so that, for the new q0 , there is a weak detonation between (u+ , 1) and (u∗− , 0) (with a slightly changed speed) connected to a slower pure fluid dynamical rarefaction or shock wave between (u∗− , 0) and (u− , 0). Therefore, the weak detonation wave is an important wave which is present in more general cases and not only for the specific heat release (1.5c). Moreover, if we gradually decrease the reaction rate K, for a weak detonation wave, then the speed s of the wave will decrease until s < limx→−∞ u(x, t) = u− when the detonation has turned into a strong detonation wave. The strong detonation wave connects the node (u− , 0) to the node (u+ , 1), where the shock speed satisfies u+ < s < u− , see [M, RM]. Its existence is proved in [M], and the stability in [LY, Li2] using weighted energy methods. Stability of Chapman-Jouguet waves is proved in [Li1], also based on careful use of weighted energy estimates. By modifying the ignition temperature kinetics (1.2) to be non-monotone with an induction zone and including a stabilizing zero order term in the first equation of (1.1), to model multidimensional effects of curved fronts, Li has numerically demonstrated the existence of dynamically unstable oscillatory moving fronts in the Majda Rosales model, [Li3]. Therefore, our proof of stability of fronts is subtle with crucial use of the monotone discontinuous kinetics (1.2). 
While this paper was refereed the author learned about the related stability analysis by Liu and Yu in [LYu2] for weak detonation waves of (1.1–2), where the fluid flux function $u^2/2$ is replaced by a general convex function $f(u)$ satisfying $f' > 0$. Their study includes large waves, not treated here; on the other hand they require exponentially decaying initial perturbations while the work here only assumes the slow algebraic decay (1.9). Therefore, the combination of the study here, on larger global perturbations of small waves, and the analysis of [LYu2], on small perturbations for large waves, together gives a more complete understanding of the stability of weak detonation waves. The stability of strong detonation waves, with zero mass perturbations, for the reactive Navier–Stokes equations is studied in [TT]. In the stability studies of strong detonations and Chapman–Jouguet waves, the asymptotic shock location can be a priori determined, as for classical non-reactive shocks. This is in contrast to weak detonation waves. A main ingredient in our analysis is the new treatment of dynamics of the shock position.

Our main point in this paper is the following stability result of weak detonation waves with wave speed $s$ strictly larger than $|u_- - u_+|$, excluding Chapman–Jouguet waves where $s = u_-$. We denote by $c$, $C$ various positive constants independent of the shock strength $|u_- - u_+|$.

Theorem 1.1. There is a small positive constant $\gamma$ such that if

\[
u_+ < 0 < u_-,
\tag{1.7a}
\]
\[
|u_+ - u_-| \le \gamma,
\tag{1.7b}
\]
\[
\gamma \le c\,\Bigl|\frac{u_+}{u_-}\Bigr| \le K \le C\,\Bigl|\frac{u_+}{u_-}\Bigr|,
\tag{1.7c}
\]

then there is a weak detonation wave $(U(x - st), Z(x - st))$, which satisfies (1.6) with $q_0 = q_{cr}(K, u_-, u_+)$, normalized so that $U(0) = 0$. The wave speed $s$ satisfies (1.5b) and

\[
c < s < C.
\tag{1.8}
\]

Furthermore, there is a positive function $c_0 = o\bigl(\min(u_-^2, u_+^2)/\gamma\bigr)$ tending to zero as $\gamma \to 0+$, such that if, in addition to assumption (1.7), the initial perturbation $w_0 \equiv u_0 - U$ satisfies

\[
\|w_0\|_{L^\infty} + \|w_0'\|_{L^\infty} + \|w_0''\|_{L^\infty} + \|w_0\|_{L^1} + \|w_0'\|_{L^1} \le c_0,
\tag{1.9a}
\]
\[
|w_0(x)| + \Bigl|\frac{d}{dx}w_0(x)\Bigr| \le \frac{c_0}{(1 + |x|)^{1+\rho}},
\tag{1.9b}
\]

for some constant $\rho > 0$, then the solution of (1.3) exists, and there is a positive constant $C_0 = O\bigl(c_0/\min(u_-, -u_+)\bigr)$ and a translation $\bar\delta(t)$ such that

\[
\|u(\cdot,t) - U(\cdot - st - \bar\delta(t))\|_{L^\infty} \le \frac{C_0\gamma}{1 + t^{1/2}},
\tag{1.10a}
\]
\[
|\bar\delta'(t)| \le \frac{C_0}{(1 + t)^{1+\rho}},
\tag{1.10b}
\]
\[
\|\bar\delta'\|_{L^1} \le C_0.
\tag{1.10c}
\]

In the remainder of this section we give an introduction and overview of the proof. Write

\[
w(x,t) \equiv u(x,t) - U(x - st - \bar\delta), \qquad
\zeta(x,t) \equiv z(x,t) - Z(x - st - \bar\delta),
\tag{1.11}
\]

then by (1.3) and (1.4),

\[
w_t + \Bigl(Uw + \frac{w^2}{2}\Bigr)_x - w_{xx} = \bar\delta' U'.
\tag{1.12}
\]

We shall solve this equation by the Hopf–Cole transformation, following [GSZ, SZ]. The integrated variable

\[
v(x,t) \equiv -\int_x^\infty w(y,t)\,dy
\]

satisfies by (1.12),

\[
v_t + Uv_x + \frac{v_x^2}{2} - v_{xx} = \bar\delta'(U - u_+).
\]

By defining $v = -2\log H$, we have

\[
\begin{aligned}
H_t + UH_x - H_{xx} &= -\frac{\bar\delta'(U - u_+)H}{2},\\
H(\cdot,0) &= \exp\Bigl(\frac{1}{2}\int_x^\infty w(y,0)\,dy\Bigr),
\end{aligned}
\tag{1.13}
\]

and

\[
w = \frac{-2H_x}{H},
\tag{1.14a}
\]
\[
w_x = \frac{-2H_{xx}}{H} + 2\Bigl(\frac{H_x}{H}\Bigr)^2.
\tag{1.14b}
\]
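The substitution behind (1.13) is short enough to write out. The following worked computation is a sketch that uses only the definitions above ($v_x = w$ and $v = -2\log H$); it is an added illustration, not a passage from the paper.

```latex
% Derivatives of v = -2 log H:
\begin{align*}
v_t &= -\frac{2H_t}{H}, \qquad
v_x = -\frac{2H_x}{H}, \qquad
v_{xx} = -\frac{2H_{xx}}{H} + \frac{2H_x^2}{H^2}.
\end{align*}
% Insert into the equation for v; the quadratic terms cancel:
\begin{align*}
v_t + U v_x + \tfrac12 v_x^2 - v_{xx}
  &= -\frac{2}{H}\bigl(H_t + U H_x - H_{xx}\bigr)
     + \frac{2H_x^2}{H^2} - \frac{2H_x^2}{H^2} \\
  &= -\frac{2}{H}\bigl(H_t + U H_x - H_{xx}\bigr).
\end{align*}
% Equating with \bar\delta'(U - u_+) and multiplying by -H/2:
\begin{align*}
H_t + U H_x - H_{xx} = -\tfrac12\,\bar\delta'\,(U - u_+)\,H .
\end{align*}
% Since w = v_x, one also reads off w = -2H_x/H, which is (1.14a).
```

The point of the transformation is visible in the second display: the nonlinear term $v_x^2/2$ is cancelled exactly, leaving a linear equation for $H$ with a variable coefficient $U$.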


Introducing the Dirac $\delta$-function and the dual functions

\[
-\varphi_{i,t} - (U\varphi_i)_x - \varphi_{i,xx} = 0, \quad t < T, \qquad
\varphi_i(x,T) = \frac{\partial^i}{\partial x^i}\delta(x - \bar x), \quad i = 0, 1, 2,
\tag{1.15}
\]

and

\[
-\psi_{i,t} - U\psi_{i,x} - \psi_{i,xx} = 0, \qquad
\psi_i(x,T) = \frac{\partial^i}{\partial x^i}\delta(x - \bar x), \quad i = 0, 1,
\tag{1.16}
\]

we obtain by (1.13), for $i = 0, 1$,

\[
\frac{\partial^i}{\partial x^i}H(\bar x,T) = \int_{\mathbb{R}} H(x,0)\varphi_i(x,0;\bar x,T)\,dx
- \frac{1}{2}\int_0^T\!\!\int_{\mathbb{R}} \bar\delta'(U - u_+)H(x,t)\varphi_i(x,t;\bar x,T)\,dx\,dt,
\tag{1.17}
\]

and

\[
H_{xx}(\bar x,T) = -\int_{\mathbb{R}} H_x(x,0)\psi_1(x,0;\bar x,T)\,dx
- \frac{1}{2}\int_0^T\!\!\int_{\mathbb{R}} \bar\delta'\bigl((U - u_+)H\bigr)_x\psi_1\,dx\,dt,
\tag{1.18}
\]

for any $T \ge \varepsilon > 0$. We shall use (1.14) and (1.17–18) to verify (1.10), provided the initial restrictions (1.7–9) are satisfied. Define the quantity

\[
\bar U_x \equiv \inf_{|x| \le c/K} |U_x(x)|,
\tag{1.19}
\]

measuring the derivative $U'$ near $U = 0$.

The proof of Theorem 1.1 is divided into the following Steps I–VI, studied in Sects. 2–7, respectively. In Steps I–IV, we shall first assume that $w$, $w_x$ and $\bar\delta$ satisfy (1.20) below. Then, in Step V we prove that (1.20) indeed holds.

Step I: The translation. Assume that

\[
\|w\|_{L^\infty} + \|w_x\|_{L^\infty} \ll \min(u_-, -u_+),
\tag{1.20a}
\]
\[
(u_- - u_+)\bigl(\|\bar\delta'\|_{L^1} + \|\bar\delta'\|_{L^\infty}\bigr) \ll 1,
\tag{1.20b}
\]
\[
\lim_{x\to\pm\infty} w(x,t) = 0,
\tag{1.20c}
\]

and determine a unique translation $\bar\delta(t)$ by the equation $u(st + \bar\delta(t), t) = 0$ to conclude that $\phi(u(\cdot,t))$ is monotone. Here and in the sequel the relation $a \ll b$ and the relation $b \gg a$ are equivalent to $a = b\,o(1)$, as the shock strength parameter $\gamma$ tends to zero.

Step II: The dual functions. A refined parametrix method is used to estimate the dual functions $\varphi_i$, $i = 0, 1, 2$ and $\psi$, in Sects. 3 and 7. The method, introduced in [SZ] to estimate Green's functions for rarefaction waves, is a technique of general interest to obtain precise estimates of Green's functions based on characteristic information for variable coefficient convection diffusion problems.


Step III: The estimates of $\partial_x^i H$. Estimate by Steps I and II the functions $H$, $H_x$ and $H_{xx}$ in (1.17–18), depending on the translation $\bar\delta$.

Step IV: The estimate of the translation. Estimate the translation $\bar\delta$. Combining the equation $u(st + \bar\delta(t), t) = 0$, (1.14) and (1.17) yields an integral equation for $\bar\delta(t)$. The solution of the integral equation is estimated by observing that the kernel can be split into a dominating convolution part and a remainder. Careful use of properties of the traveling wave (1.4) and the corresponding Green's function (1.15) shows that the inverse operator of the convolution reduces, by monotonicity arguments, the remainder to a small $L^1$ perturbation.

Step V: Verification of assumption (1.20) and conclusion of (1.10). Use the results of Steps I–IV, (1.9), (1.14) and induction in $t$ to conclude that (1.20) holds for all $t$. Then, use (1.14) and Step III to evaluate $w$. Combine Steps III and IV to conclude that (1.10) holds.

Step VI: Proof of Lemma 3.1 used in Step II.

2. The Translation

We assume that $w$ and $w_x$ satisfy (1.20), which will be verified in Step V. Then for fixed time $t$, the function $u(\cdot,t)$ is continuous and by the boundary condition $\lim_{x\to\pm\infty} u(x,t) = u_\pm$ in (1.20), we have

\[
u(st + \bar\delta, t) = 0,
\tag{2.1}
\]

for at least one $\bar\delta \in \mathbb{R}$. Since by (1.20) and (1.19),

\[
u_x(st + x + \bar\delta, t) = U_x(x) + w_x(x + \bar\delta + st, t) < 0 \quad \text{for } |x| \le cK^{-1}
\]

and

\[
|u(st + x + \bar\delta, t)| = |U(x) + w(x + \bar\delta + st, t)| > 0 \quad \text{for } |x| \ge cK^{-1},
\]

we conclude that the translation $\bar\delta = \bar\delta(t)$ is uniquely defined by (2.1), provided (1.20) holds. As a consequence the ignition temperature kinetics function $\phi(u(\cdot,t))$ has precisely one discontinuity, for each $t$.

3. The Dual Functions

Define the special backward characteristic $\sigma^*(t) = \sigma^*(t;\bar x,T)$, and the backward $\sigma(t) = \sigma(t;\bar x,T)$ and forward characteristic $\tau(t) = \tau(t;\underline{x},\underline{t})$ curves

\[
\begin{aligned}
\frac{d\sigma^*}{dt} &= U(\bar\alpha(t)), & \sigma^*(T) &= \bar x,\\
\frac{d\sigma}{dt} &= U(\sigma(t)), & \sigma(T) &= \bar x,\\
\frac{d\tau}{dt} &= U(\tau(t)), & \tau(\underline{t}) &= \underline{x},\\
\bar\alpha(t) &\equiv \tau(t) + (\sigma^*(t) - \tau(t))\,\frac{c_1(t - \underline{t})}{c_2(T - t) + c_1(t - \underline{t})},
\end{aligned}
\tag{3.1}
\]


where $c_1$, $c_2$ are certain positive constants to be defined below. Let $\chi$ be the Gaussian function

\[
\chi(x,t) = \frac{\exp\bigl[-\frac{(x - \sigma^*(t))^2}{4(T - t)}\bigr]}{\sqrt{4\pi(T - t)}},
\tag{3.2}
\]

satisfying the backward problem

\[
-\chi_t - \frac{d\sigma^*}{dt}\chi_x - \chi_{xx} = 0, \qquad \chi(x,T) = \delta(x - \bar x).
\tag{3.3}
\]
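A quick numerical sanity check of (3.2)–(3.3) can be sketched as follows. The characteristic used here is a hypothetical straight line with constant speed `a` ending at `x_bar` at time `T` (an assumption made for illustration; in the paper $\sigma^*$ is defined through (3.1)). The sketch checks by finite differences that the Gaussian satisfies the backward equation and that it carries unit mass, as a kernel converging to $\delta(x - \bar x)$ should.

```python
import math


def chi(x, t, T, sigma_star):
    """Gaussian (3.2), centred on sigma_star(t), valid for t < T."""
    return math.exp(-(x - sigma_star(t)) ** 2 / (4.0 * (T - t))) / math.sqrt(
        4.0 * math.pi * (T - t)
    )


# Hypothetical data: straight characteristic with constant speed a.
T, x_bar, a = 2.0, 0.3, -0.7


def sigma_star(t):
    return x_bar + a * (t - T)


def residual(x, t, h=1e-3):
    """Central-difference residual of -chi_t - sigma*'(t) chi_x - chi_xx."""
    chi_t = (chi(x, t + h, T, sigma_star) - chi(x, t - h, T, sigma_star)) / (2 * h)
    chi_x = (chi(x + h, t, T, sigma_star) - chi(x - h, t, T, sigma_star)) / (2 * h)
    chi_xx = (
        chi(x + h, t, T, sigma_star)
        - 2 * chi(x, t, T, sigma_star)
        + chi(x - h, t, T, sigma_star)
    ) / h**2
    return -chi_t - a * chi_x - chi_xx


def mass(t, half_width=40.0, n=8000):
    """Trapezoidal approximation of the integral of chi(., t) over the line."""
    dx = 2 * half_width / n
    total = 0.0
    for i in range(n + 1):
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * chi(-half_width + i * dx, t, T, sigma_star)
    return total * dx
```

Evaluating `residual(0.5, 1.0)` should give a value at the level of the finite-difference error, and `mass(1.0)` should be close to one.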

Define the function $S$ by the forward problem

\[
S_t + (US)_x - S_{xx} = 0, \qquad S(x,\underline{t};\underline{x},\underline{t}) = \delta'(x - \underline{x}).
\tag{3.4}
\]

Since

\[
-\varphi_{1,t} - (U\varphi_1)_x - \varphi_{1,xx} = 0, \qquad \varphi_1(x,T;\bar x,T) = \delta'(x - \bar x),
\]

we have

\[
|\varphi_1(x,t;\bar x,T)| = |S(\bar x,T;x,t)|.
\tag{3.5}
\]

Therefore, we can estimate $\varphi_1$ from $S$. We have

\[
S(\bar x,T;\underline{x},\underline{t}) = \chi_x(\underline{x},\underline{t}) + \int_{\underline{t}}^T\!\!\int_{\mathbb{R}} \bigl[U(x,t) - U(\bar\alpha(t),t)\bigr]\chi_x S\,dx\,dt.
\tag{3.6}
\]

The proof of the following lemma, given in Section 7, uses that $U(\cdot,t)$ is monotone and it is based on Lemma 2.4–2.6 in [SZ], where a similar estimate is proven for a rarefaction wave.

Lemma 3.1. Assume that

\[
|u_- - u_+| \ll 1,
\]

and that there are positive constants $c$, $C$ such that

\[
c < s - u_\pm < C,
\tag{3.7}
\]

which is equivalent to the assumption (1.7c). Then there are constants $c$, $C$ such that

\[
|S(x,t;\underline{x},\underline{t})| \le \frac{\exp\bigl[-\frac{(\underline{x} - \sigma(\underline{t};x,t))^2}{c(t - \underline{t})}\bigr]}{C(t - \underline{t})},
\tag{3.8}
\]

where $\sigma(\cdot;x,t)$ is the backward characteristic curve defined in (3.1).

Remark. The condition (3.7) guarantees that the wave $\varphi_1$, with the characteristic $\sigma$, is transversal to the shock. This condition is a consequence of the assumption (1.7c) and the following lemma.

Lemma 3.2. Assume that (1.7) holds. Then there are constants $c$, $C$ such that

\[
c\,\frac{u_-(u_- - u_+)}{-u_+}\,K \le q_{cr} \le C\,\frac{u_-(u_- - u_+)}{-u_+}\,K,
\tag{3.9a}
\]
\[
c\,u_- K \le \bar U_x \le C\,u_- K,
\tag{3.9b}
\]
\[
\frac{c\,u_- K}{-u_+} \le s \le \frac{C\,u_- K}{-u_+}.
\tag{3.9c}
\]


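Proof. For $p \equiv U - u_-$ and $0 \le U \le u_-$, equation (1.6) takes the form

\[
\begin{aligned}
p' &= -p\Bigl(s - \frac{U + u_-}{2}\Bigr) - q_0 Z,\\
Z' &= KZ,
\end{aligned}
\tag{3.10}
\]

which implies

\[
\frac{dp}{dZ} = -\frac{p}{Z}\,\frac{s - (U + u_-)/2}{K} - \frac{q_0}{K}.
\]

Integrating this, we obtain

\[
u_- = \frac{q_0}{K}\int_0^1 \exp\Bigl[-\int_x^1 \Bigl(\frac{s}{K} - \frac{U(x') + u_-}{2K}\Bigr)\frac{dx'}{x'}\Bigr]dx
\sim \frac{q_0}{K}\,\frac{1}{\frac{s}{K} + 1} = \frac{q_0}{s + K},
\]

which combined with

\[
s = \frac{1}{2}(u_- + u_+) + \frac{q_0}{u_- - u_+} > u_-,
\tag{3.11}
\]

and (1.7) yield (3.9a). Then (3.9a) and (3.11) imply (3.9c). Finally (3.9a,c) inserted in (3.10) proves (3.9b). □

We have

Lemma 3.3. Let the assumptions in Lemma 3.1 hold and let $0 < \varepsilon < 1$. Then there are positive constants $c_2$, $C$ such that

\[
|\varphi_1(x,t;\bar x,T)| \le \frac{C}{T - t}\exp\Bigl[-\frac{(x - \sigma(t;\bar x,T))^2}{c_2(T - t)}\Bigr],
\tag{3.12}
\]
\[
|\varphi_0(x,t;\bar x,T)| \le \frac{C}{\sqrt{T - t}}\exp\Bigl[-\frac{(x - \sigma(t;\bar x,T))^2}{c_2(T - t)}\Bigr],
\tag{3.13}
\]
\[
|\varphi_2(x,t;\bar x,T)| \le
\begin{cases}
\dfrac{C}{(T - t)^{3/2}}\exp\Bigl[-\dfrac{(x - \sigma(t;\bar x,T))^2}{c_2(T - t)}\Bigr], & T - \varepsilon \le t < T,\\[2ex]
\dfrac{C}{\varepsilon^{1/2}(T - t)}\exp\Bigl[-\dfrac{(x - \sigma(t;\bar x,T))^2}{c_2(T - t)}\Bigr], & t < T - \varepsilon,
\end{cases}
\tag{3.14}
\]
\[
|\psi_0(x,t;\bar x,T)| \le \frac{C}{\sqrt{T - t}}\exp\Bigl[-\frac{(x - \sigma(t;\bar x,T))^2}{c_2(T - t)}\Bigr],
\tag{3.15}
\]
\[
|\psi_1(x,t;\bar x,T)| \le \frac{C}{T - t}\exp\Bigl[-\frac{(x - \sigma(t;\bar x,T))^2}{c_2(T - t)}\Bigr], \quad T - \varepsilon \le t < T.
\tag{3.16}
\]

Proof. The estimate (3.12) follows directly from (3.5) and Lemma 3.1. To obtain (3.13), we note that

\[
\varphi_0(x,t;\bar x,T) = \int_{-\infty}^{\bar x} S(y,T;x,t)\,dy = \int_{-\infty}^{\bar x} \varphi_1(x,t;y,T)\,dy.
\]

Thus, using $\int_{\mathbb{R}} S(y,T;x,t)\,dy = 0$ and integrating (3.8) gives (3.13).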


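The estimates of $\varphi_2$ and $\psi_1$ for $T - \varepsilon < t < T$ in (3.14) and (3.16) are well known, cf. [F]. To prove (3.14) for $t \le T - \varepsilon$ we use that

\[
\begin{aligned}
\varphi_2(x,t;\bar x,T) &= \int_{\mathbb{R}} \varphi_2(x',T - \varepsilon;\bar x,T)\,\varphi_0(x,t;x',T - \varepsilon)\,dx'\\
&= \int_{\mathbb{R}}\int_{-\infty}^{x'} \varphi_2(y,T - \varepsilon;\bar x,T)\,dy\;\varphi_1(x,t;x',T - \varepsilon)\,dx',
\end{aligned}
\]

which combined with (3.12) and the facts

\[
\Bigl\|\int_{-\infty}^{\cdot} \varphi_2(y,T - \varepsilon;\bar x,T)\,dy\Bigr\|_{L^1} \le \frac{C}{\sqrt{\varepsilon}}, \qquad
\int_{\mathbb{R}} \varphi_2(x,T - \varepsilon;\bar x,T)\,dx = 0,
\]

yield (3.14). Since $\int_{\mathbb{R}} \varphi_1(x,t;\cdot)\,dx = 0$, we have by (3.12),

\[
|\psi_0(x,t)| \equiv \Bigl|\int_{-\infty}^{x} \varphi_1(x',t;\bar x,T)\,dx'\Bigr| \le C\,\frac{\exp[-(x - \sigma(t))^2/(c(T - t))]}{\sqrt{T - t}},
\tag{3.17}
\]

which proves (3.15). □

4. Estimates of $\partial_x^i H$

Estimate of $H$. Consider (1.17) for $i = 0$. The assumption (1.9a) on the initial data gives

\[
\Bigl\|\int_{\mathbb{R}} H(x,0)\varphi_0(x,0;\cdot)\,dx - 1\Bigr\|_{L^\infty} \ll 1.
\tag{4.1}
\]

By Lemma 3.3 we have

\[
\Bigl|\int_0^T\!\!\int_{\mathbb{R}} \bar\delta'(U - u_+)H\varphi_0\,dx\,dt\Bigr| \le C(u_- - u_+)\|H\|_{L^\infty}\|\bar\delta'\|_{L^1}.
\tag{4.2}
\]

Combining (4.1) and (4.2) in the integral equation (1.17), for $i = 0$, implies

\[
\|H - 1\|_{L^\infty} \ll 1,
\tag{4.3}
\]

provided that $(u_- - u_+)\|\bar\delta'\|_{L^1} \ll 1$, which we assumed in (1.20) and will verify in Step V. □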


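Estimate of $H_x$. In this step we estimate $H_x$ based on (1.17), for $i = 1$. Using (4.3), Lemmas 3.2 and 3.3, we obtain

\[
\begin{aligned}
|II| &\equiv \Bigl|\int_0^{T/2}\!\!\int_{\mathbb{R}} \bar\delta'(U - u_+)H\varphi_1\,dx\,dt + \int_{T/2}^{T}\!\!\int_{\mathbb{R}} \bar\delta'(U - u_+)H\varphi_1\,dx\,dt\Bigr|\\
&\le C(u_- - u_+)\min\Bigl(\frac{\|\bar\delta'\|_{L^1}}{T^{1/2}} + \frac{\sup_{\tau > T/2}|\bar\delta'(\tau)\tau|}{T^{1/2}},\ \|\bar\delta'\|_{L^\infty} + \|\bar\delta'\|_{L^1}\Bigr).
\end{aligned}
\tag{4.4}
\]

Then estimates (3.15) and (4.1), the assumption (1.9) and the identity (1.14) show that

\[
I \equiv \int_{\mathbb{R}} H\varphi_1\,dx = \frac{1}{2}\int_{\mathbb{R}} w(x,0)H(x,0)\psi_0(x,0;\bar x,T)\,dx
\]

has the estimate

\[
|I| \le C\min\Bigl(\|w_0\|_{L^\infty},\ \frac{\|w_0\|_{L^1}}{\sqrt{T}}\Bigr).
\tag{4.5}
\]

Combining (1.17), (4.4), (3.17), and (4.5) yields

\[
|H_x(\bar x,T)| \le C\,\frac{\|w_0\|_{L^\infty} + \|w_0\|_{L^1}}{(1 + T)^{1/2}}
+ C(u_- - u_+)\min\Bigl[\frac{\|\bar\delta'\|_{L^1} + \|\bar\delta'(t)(1 + t)\|_{L^\infty}}{(1 + T)^{1/2}},\ \|\bar\delta'\|_{L^\infty} + \|\bar\delta'\|_{L^1}\Bigr].
\tag{4.6}
\]

□

Estimate of $H_{xx}$. The estimate of $H_{xx}$ is based on (1.18), i.e.,

\[
H_{xx}(\bar x,T) = -\int_{\mathbb{R}} H_x(x,0)\psi_1(x,0)\,dx + \int_0^T\!\!\int_{\mathbb{R}} \bar\delta'\bigl((U - u_+)H\bigr)_x\psi_1\,dx\,dt \equiv III + IV.
\]

Using Lemma 3.2, (3.10–11) and (1.8) we see that $|U'| \le C\gamma$. Then, by Lemma 3.1, assumption (1.10a) and the estimate of $H_x$ in (4.6) we can bound the function $((U - u_+)H)_x$ appearing in $IV$. Using also

\[
\|\psi_1(\cdot,t)\|_{L^1} \le \frac{C}{\sqrt{T - t}},
\]

which follows from (3.16), we obtain by (4.6),

\[
|IV| \le C\gamma\bigl(\|\bar\delta'\|_{L^\infty} + \|\bar\delta'\|_{L^1}\bigr)\bigl(1 + \|\bar\delta'\|_{L^1} + \|\bar\delta'\|_{L^\infty}\bigr).
\tag{4.7}
\]

Finally, as in (4.5), there holds

\[
|III| = \Bigl|\int_{\mathbb{R}} H_x(x,0)\psi_1(x,0)\,dx\Bigr| \le \frac{C\|w_0\|_{L^\infty}}{\sqrt{T}}.
\]

For short time, $T < 1$, we modify the estimate above as follows. Let

\[
v = \int_{-\infty}^{x} \psi_1\,dx.
\]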


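Then

\[
III = \int_{\mathbb{R}} H_{xx}v\,dx,
\tag{4.8}
\]

where

\[
-v_t - Uv_x - v_{xx} = \int_{-\infty}^{x} U_x\psi_1\,dx', \qquad v(x,T) = \delta(x - \bar x).
\tag{4.9}
\]

The right-hand side in (4.9) is bounded and smooth. A standard short time estimate, cf. [F], yields for compact sets $\mathcal{K} \subset \mathbb{R}$,

\[
\|v(\cdot,t)\|_{L^1(\mathcal{K})} \le C_{\mathcal{K}}.
\]

Combining this with (4.8) shows that for $T < 1$,

\[
|III| \le C\bigl(\|w_0\|_{L^\infty} + \|w_{0,x}\|_{L^\infty} + \|w_0\|_{L^1} + \|w_{0,x}\|_{L^1}\bigr),
\]

which together with (4.7) leads to

\[
\begin{aligned}
|H_{xx}(\bar x,T)| &\le C\bigl(\|w_0\|_{L^\infty} + \|w_{0,x}\|_{L^\infty} + \|w_0\|_{L^1} + \|w_{0,x}\|_{L^1}\bigr)\\
&\quad + C\gamma\bigl(\|\bar\delta'\|_{L^\infty} + \|\bar\delta'\|_{L^1}\bigr)\bigl(1 + \|\bar\delta'\|_{L^1} + \|\bar\delta'\|_{L^\infty}\bigr).
\end{aligned}
\tag{4.10}
\]

□

5. The Estimate of the Translation

In this section we determine the translation $\bar\delta$ defined in (2.1). First we note that the normalization $U(0) = 0$ and (2.1) give

\[
w(st + \bar\delta, t) = 0.
\]

Therefore, (1.14) and (1.17) yield the following equation for the translation $\bar\delta(t)$:

\[
\begin{aligned}
0 &= \int_{\mathbb{R}} H(y,0)\varphi_1(y,0;st + \bar\delta(t),t)\,dy\\
&\quad + \int_0^t\!\!\int_{\mathbb{R}} \bar\delta'(\tau)\bigl(U(x - s\tau - \bar\delta(\tau)) - u_+\bigr)H(x,\tau)\varphi_1(x,\tau;st + \bar\delta(t),t)\,dx\,d\tau.
\end{aligned}
\tag{5.1}
\]

It is convenient to introduce the notation

\[
V_0(t) \equiv \int_{\mathbb{R}} H(x,0)\varphi_1(x,0;st + \bar\delta(t),t)\,dx,
\tag{5.2a}
\]
\[
\alpha(\tau,t) \equiv \int_{\mathbb{R}} \bigl(U(x - s\tau - \bar\delta(\tau)) - u_+\bigr)H(x,\tau)\varphi_1(x,\tau;st + \bar\delta(t),t)\,dx,
\tag{5.2b}
\]

where by (5.1),

\[
\int_0^t \bar\delta'(\tau)\alpha(\tau,t)\,d\tau = V_0(t).
\tag{5.3}
\]

Let now, for a positive constant $C''$ chosen below, $\alpha_0(t) \equiv \exp(-C''t)\alpha(t,t)$.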


Then (5.3) can be rewritten

\[
\int_0^t \bar\delta'(\tau)\alpha_0(t - \tau)\,d\tau = V_0(t) + \int_0^t \bar\delta'(\tau)\bigl(\alpha_0(t - \tau) - \alpha(\tau,t)\bigr)\,d\tau.
\tag{5.4}
\]

The convolution operator on the left-hand side of (5.4) can be inverted by the Laplace transform, which yields

\[
\bar\delta'(t) = \frac{V_0'(t) + C''V_0(t)}{\alpha(t,t)} - \int_0^t \bar\delta'(\tau)\,\frac{\alpha_t(\tau,t) + C''\alpha(\tau,t)}{\alpha(t,t)}\,d\tau.
\tag{5.5}
\]

To solve this equation we shall use the following lemmas.

Lemma 5.1. Assume that the assumptions in Theorem 1.1 hold. Then there are positive constants $c$, $c_0$, $C''$, $C$ such that

\[
\frac{\int_0^t |\alpha_t(\tau,t) + C''\alpha(\tau,t)|\,d\tau}{|\alpha(t,t)|} \le c < 1,
\]
\[
c\min(u_-, -u_+) \le |\alpha(t,t)| \le C\gamma,
\]

and

\[
\frac{|\alpha_t(\tau,t)| + |\alpha(\tau,t)|}{|\alpha(t,t)|} \le C\exp(-c_0(t - \tau))\Bigl(1 + \frac{1}{\sqrt{t - \tau}}\Bigr), \quad \tau < t.
\]

Lemma 5.2. Assume that the assumptions in Theorem 1.1 hold. Then there is a positive constant $C = o\bigl(\min(u_-^2, u_+^2)/\gamma\bigr)$ such that

\[
|V_0(t)| \le \frac{C}{(1 + t)^{1+\rho}},
\tag{5.6a}
\]
\[
|V_0'(t)| \le C\,\frac{1 + |\bar\delta'(t)|}{(1 + t)^{1+\rho}}.
\tag{5.6b}
\]

Combining Lemmas 5.1–2 yields

Lemma 5.3. Provided the assumptions in Theorem 1.1 hold, the translation satisfies the bounds

\[
\|\bar\delta'\|_{L^1} \le \frac{C\bigl(\|V_0'\|_{L^1} + \|V_0\|_{L^1}\bigr)}{\min(u_-, -u_+)},
\tag{5.7a}
\]
\[
|\bar\delta'(t)| \le C\,\frac{\|V_0\|_{L^1} + \|V_0'\|_{L^1} + \|V_0\|_{L^\infty} + \|V_0'\|_{L^\infty}}{\min(u_-, -u_+)\,(1 + t)^{1+\rho}}.
\tag{5.7b}
\]
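The structure of the inversion formula (5.5) can be checked numerically on a model kernel. The sketch below assumes a purely exponential kernel $\alpha(\tau,t) = A\exp(-C''(t - \tau))$, for which $\alpha_t + C''\alpha = 0$ and the remainder integral in (5.5) vanishes, so that (5.5) reduces to $\bar\delta'(t) = (V_0'(t) + C''V_0(t))/A$. The constants `A`, `C2` and the test function `delta_prime` are hypothetical choices, not quantities from the paper.

```python
import math

# Hypothetical model data: alpha(tau, t) = A * exp(-C2 * (t - tau)).
A, C2 = 0.5, 2.0


def delta_prime(tau):
    """A test translation speed with algebraic decay."""
    return 1.0 / (1.0 + tau) ** 2


def V0(t, n=4000):
    """Composite Simpson rule for the left-hand side of (5.3) with the model kernel."""
    if t == 0.0:
        return 0.0
    h = t / n  # n is even, as Simpson's rule requires
    total = 0.0
    for i in range(n + 1):
        tau = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * delta_prime(tau) * A * math.exp(-C2 * (t - tau))
    return total * h / 3.0


def recovered_delta_prime(t, eps=1e-4):
    """Leading term of (5.5): (V0'(t) + C2 * V0(t)) / alpha(t, t)."""
    dV0 = (V0(t + eps) - V0(t - eps)) / (2 * eps)
    return (dV0 + C2 * V0(t)) / A
```

For this kernel `recovered_delta_prime(t)` should agree with `delta_prime(t)` up to quadrature and differencing error. In the paper the kernel is only approximately of this form, which is why Lemma 5.1 is needed to control the remainder term.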

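Proof of Lemma 5.3. The inequality (5.7a) follows by taking the $L^1$-norm of (5.5) and using Lemmas 5.1–2. To prove (5.7b) partition the integral in the right-hand side of (5.5) into

\[
\begin{aligned}
\Bigl|\int_0^t \bar\delta'(\tau)\,\frac{\alpha_t(\tau,t) + C''\alpha(\tau,t)}{\alpha(t,t)}\,d\tau\Bigr|
&\le \Bigl|\int_0^{t/2} \bar\delta'(\tau)\,\frac{\alpha_t(\tau,t) + C''\alpha(\tau,t)}{\alpha(t,t)}\,d\tau\Bigr| + \Bigl|\int_{t/2}^{t} \bar\delta'(\tau)\,\frac{\alpha_t(\tau,t) + C''\alpha(\tau,t)}{\alpha(t,t)}\,d\tau\Bigr|\\
&\le \|\bar\delta'\|_{L^1}\sup_{\tau < t/2}\frac{|\alpha_t(\tau,t) + C''\alpha(\tau,t)|}{|\alpha(t,t)|} + \int_{t/2}^{t}\frac{|\alpha_t(\tau,t) + C''\alpha(\tau,t)|}{|\alpha(t,t)|}\,d\tau\,\sup_{\tau > t/2}|\bar\delta'(\tau)|.
\end{aligned}
\]

By Lemma 5.1 we have $\int_0^t |\alpha_t(\tau,t) + C''\alpha(\tau,t)|\,d\tau/|\alpha(t,t)| < 1$. Inequality (5.7b) then follows by combining (5.7a), the third and first estimate in Lemma 5.1, induction in $t$ and the estimates of $V_0'$ and $V_0$ in Lemma 5.2. □

Proof of Lemma 5.2. By (3.17) and assumption (1.9) we have

\[
|V_0(t)| = \Bigl|\int_{\mathbb{R}} H(y,0)\varphi_1(y,0;st + \bar\delta(t),t)\,dy\Bigr|
= \Bigl|\int_{\mathbb{R}} \frac{w_0(y)H(y,0)}{2}\,\psi_0(y,0;st + \bar\delta(t),t)\,dy\Bigr|.
\]

Split the last integral into

\[
\int_{\mathbb{R}} \ldots\,dy = \int_{-\infty}^{st/2} \ldots\,dy + \int_{st/2}^{\infty} \ldots\,dy,
\tag{5.8}
\]

and use the algebraic decay of $w_0(x)$ in (1.9b) and the exponential decay of $\psi_0(st/2,t;st + \bar\delta,t)$ to obtain (5.6a).

The translation invariance

\[
\varphi_1(x,0;st + \bar\delta(t),t) = \varphi_1(x + s\tau,\tau;s(t + \tau) + \bar\delta(t),t + \tau),
\]

which follows from equation (1.15) and $(\partial_t + s\partial_x)U(x - st) = 0$, implies

\[
\frac{d}{dt}\varphi_1(y,0;st + \bar\delta(t),t) = \bar\delta'(t)\,\partial_{\bar\delta}\varphi_1(y,0;st + \bar\delta(t),t) - (\partial_\tau + s\partial_y)\varphi_1(y,\tau;st + \bar\delta(t),t)\big|_{\tau=0}.
\]

Therefore we have

\[
\begin{aligned}
V_0'(t) &= \frac{d}{dt}\int_{\mathbb{R}} H(y,0)\varphi_1(y,0;st + \bar\delta(t),t)\,dy\\
&= -\int_{\mathbb{R}} H(y,0)(\partial_\tau + s\partial_y)\varphi_1(y,\tau;st + \bar\delta(t),t)\big|_{\tau=0}\,dy
+ \bar\delta'(t)\int_{\mathbb{R}} H(y,0)\frac{\partial}{\partial\bar\delta}\varphi_1(y,0;st + \bar\delta(t),t)\,dy \equiv I + II.
\end{aligned}
\]

Equation (1.15) gives

\[
\partial_\tau\varphi_1(y,\tau) = -\bigl(U(y,\tau)\varphi_1(y,\tau)\bigr)_y - \varphi_{1,yy}(y,\tau),
\tag{5.9}
\]

and integrating by parts yields

\[
I = \int_{\mathbb{R}} \bigl[(U(y,0) - s)H_y(y,0) - H_{yy}(y,0)\bigr]\varphi_1(y,0;st + \bar\delta(t),t)\,dy.
\]

Using $\partial_y\psi_0 = \varphi_1$ we conclude that

\[
\begin{aligned}
|I| \le \min\Bigl[&\Bigl|\int_{\mathbb{R}} \bigl[(U(y,0) - s)H_y(y,0) - H_{yy}(y,0)\bigr]\varphi_1(y,0;st + \bar\delta(t),t)\,dy\Bigr|,\\
&\Bigl|\int_{\mathbb{R}} \bigl[(U(y,0) - s)H_{yy}(y,0) - U_yH_y(y,0) - H_{yyy}(y,0)\bigr]\psi_0(y,0;st + \bar\delta(t),t)\,dy\Bigr|\Bigr].
\end{aligned}
\tag{5.10}
\]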


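Furthermore, (1.15) implies

\[
\frac{\partial}{\partial\bar\delta}\varphi_1(\cdot;st + \bar\delta,t) = \varphi_2(\cdot;st + \bar\delta,t).
\tag{5.11}
\]

Therefore

\[
II = \bar\delta'(t)\int_{\mathbb{R}} H(y,0)\varphi_2(y,0;st + \bar\delta(t),t)\,dy
= -\bar\delta'(t)\int_{\mathbb{R}} H_y(y,0)\int_{-\infty}^{y} \varphi_2(y',0;st + \bar\delta(t),t)\,dy'\,dy.
\tag{5.12}
\]

As in (3.17) we have $\int_{\mathbb{R}} \varphi_2(y,\tau;\cdot)\,dy = 0$, which combined with (3.14) yields

\[
|\psi_1(x,\tau;\bar x,T)| = \Bigl|\int_{-\infty}^{x} \varphi_2(y,\tau;\bar x,T)\,dy\Bigr| \le C\,\frac{\exp\bigl(-\frac{(x - \sigma(\tau))^2}{c(T - \tau)}\bigr)}{\sqrt{T - \tau}},
\tag{5.13a}
\]

for $T - \tau > 1$, and for $0 < T - \tau \le 1$ we can use a short time estimate, cf. [F], to derive

\[
|\psi_1(x,\tau;\bar x,T)| \le C\,\frac{\exp\bigl(-\frac{(x - \sigma(\tau))^2}{c(T - \tau)}\bigr)}{(T - \tau)^{3/2}}.
\tag{5.13b}
\]

Therefore,

\[
|II| \le |\bar\delta'(t)|\min\Bigl[\int_{\mathbb{R}} |H_{yy}(y,0)|\int_{-\infty}^{y} |\psi_1(y',0;st + \bar\delta(t),t)|\,dy'\,dy,\ \int_{\mathbb{R}} |H_y(y,0)\psi_1(y,0;st + \bar\delta(t),t)|\,dy\Bigr].
\]

Differentiating (1.14) gives

\[
\begin{aligned}
H_x(\cdot,0) &= -w_0H(\cdot,0)/2,\\
H_{xx}(\cdot,0) &= (-w_{0,x}/2 + w_0^2/4)H(\cdot,0),\\
H_{xxx}(\cdot,0) &= (-w_{0,xx}/2 + 3w_{0,x}w_0/4 - w_0^3/8)H(\cdot,0).
\end{aligned}
\tag{5.14}
\]

By combining (3.17), (1.9) and (5.8–14) we conclude that

\[
|V_0(t)| \le \frac{o\bigl(\min(u_-^2, u_+^2)/\gamma\bigr)}{(1 + t)^{1+\rho}},
\tag{5.15a}
\]
\[
|V_0'(t)| \le \frac{o\bigl(\min(u_-^2, u_+^2)/\gamma\bigr)\bigl(1 + |\bar\delta'(t)|\bigr)}{(1 + t)^{1+\rho}}.
\tag{5.15b}
\]

□

Proof of Lemma 5.1. This proof is divided into Propositions 5.1–4 below. Let us first split $\alpha$ into two parts

\[
\alpha(\tau,t) \equiv \alpha_1(\tau,t) + \alpha_2(\tau,t),
\]

with the main term

\[
\alpha_1(\tau,t) = \int_{\mathbb{R}} U_x(x - s\tau - \bar\delta(\tau))\psi_0(x,\tau;st + \bar\delta(t),t)\,dx,
\]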


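and the remainder

\[
\begin{aligned}
\alpha_2(\tau,t) &= \int_{\mathbb{R}} U_x(x - s\tau - \bar\delta(\tau))\bigl(H(x,\tau) - 1\bigr)\psi_0(x,\tau;st + \bar\delta(t),t)\,dx\\
&\quad + \int_{\mathbb{R}} \bigl(U(x - s\tau - \bar\delta(\tau)) - u_+\bigr)H_x(x,\tau)\psi_0(x,\tau;st + \bar\delta(t),t)\,dx.
\end{aligned}
\]

Proposition 5.4. There holds

\[
\|\alpha_{2,t} + C''\alpha_2\|_{L^1} \le C\|w\|_{L^\infty},
\tag{5.16}
\]
\[
|\alpha_2(t,t)| \le C\|w\|_{L^\infty}|\alpha_1(t,t)|.
\tag{5.17}
\]

We postpone the proof of this proposition to the end of this section. To estimate $\alpha_{1,t}$ we shall use its sign as follows. Differentiate $\alpha_1$ and use the translation invariance (5.9–12) to obtain

\[
\begin{aligned}
\alpha_{1,t}(\tau,t) &= \frac{d}{dt}\int_{\mathbb{R}} U_x\psi_0\,dx\\
&= \int_{\mathbb{R}} U_x\Bigl(-\frac{\partial}{\partial t} - s\frac{\partial}{\partial x}\Bigr)\psi_0\,dx + \bar\delta'(t)\int_{\mathbb{R}} U_x\psi_1\,dx\\
&= \int_{\mathbb{R}} \bigl((U - s)U_x - U_{xx}\bigr)\psi_{0,x}\,dx + \bar\delta'(t)\int_{\mathbb{R}} U_x\psi_1\,dx.
\end{aligned}
\tag{5.18}
\]

Equation (1.6) then implies

\[
\alpha_{1,t} = \int_{\mathbb{R}} q_0Z_x\psi_{0,x}\,dx + \bar\delta'(t)\int_{\mathbb{R}} U_x\psi_1\,dx.
\tag{5.19}
\]

Proposition 5.5. There are positive constants $c$, $C$, where $c < 1$, such that

\[
\frac{\int_0^t |\alpha_{1,t}(\tau,t) + C''\alpha_1(\tau,t)|\,d\tau}{|\alpha_1(t,t)|} \le c < 1.
\]

Proof. Split the dual functions $\psi_0$ and $\psi_{0,x}$, defined in (1.16) with initial data $\psi_0(x,T;\bar x,T) = \delta(x - \bar x)$, as

\[
\psi_0 = \psi_{01} + \psi_{02}, \qquad \psi_{0,x} = \psi_{01,x} + \psi_{02,x},
\]

where the dominating terms $\psi_{01}$ and $\psi_{01,x}$ are defined by the explicit Gaussian solution $\chi$ in (3.2–3) and the special backward characteristic $\sigma^*$ in (3.1) starting in $(\bar x,T)$,

\[
\begin{aligned}
\psi_{01}(x,t;\bar x,T) &= \chi(x,t;\bar x,T) = \frac{\exp\bigl[-\frac{(x - \sigma^*(t))^2}{4(T - t)}\bigr]}{\sqrt{4\pi(T - t)}},\\
\psi_{01,x}(x,t;\bar x,T) &= \chi_x(x,t;\bar x,T) = \frac{\partial}{\partial x}\,\frac{\exp\bigl[-\frac{(x - \sigma^*(t))^2}{4(T - t)}\bigr]}{\sqrt{4\pi(T - t)}},\\
\psi_{02} &= \psi_0 - \psi_{01}.
\end{aligned}
\]

The following proposition shows that $\psi_{02} \ll \psi_{01}$.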

Stability of Weak Detonation Wave

563

Proposition 5.6. There holds ¯ )) exp[− (x−σc(T(t;−x,T t) ] p ¯ T )| ≤ C|u− − u+ | , |ψ02 (x, t; x, C (T − t) ¯ ))2 exp[− (x−σc(T(t;−x,T t) ] ¯ T )| ≤ C|u− − u+ | . |ψ02,x (x, t; x, C(T − t) 2

(5.20)

(5.21)

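Proof of Proposition 5.3. This proposition is a direct consequence of the relation (3.5), the integral equation (3.6) and the estimates (7.3), (3.12), (3.15) in Lemma 7.1 and Lemma 3.3. □

Proposition 5.3 and (5.19) imply

$$\alpha_1 = \int_{\mathbb{R}} U_x\psi_{01}\,dx + \int_{\mathbb{R}} U_x\psi_{02}\,dx, \tag{5.22}$$
$$\begin{aligned}
\alpha_{1,t} &= \int_{\mathbb{R}} q_0 Z_x\psi_{0,x}\,dx + \bar\delta'(t)\int_{\mathbb{R}} U_x\psi_1\,dx \\
&= \int_{\mathbb{R}} q_0 Z_x\psi_{01,x}\,dx + \int_{\mathbb{R}} q_0 Z_x\psi_{02,x}\,dx + \bar\delta'(t)\int_{\mathbb{R}} U_x\psi_1\,dx,
\end{aligned} \tag{5.23}$$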

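where the terms on the right-hand sides satisfy the following.

Proposition 5.7. There are positive constants $C$, $C''$, where $C'' = O(1)$, such that

$$\int_{\mathbb{R}} q_0 Z_x\psi_{01,x}\,dx + C''\int_{\mathbb{R}} U_x\psi_{01}\,dx \ge 0, \tag{5.24}$$
$$\int_0^t \Big|\int_{\mathbb{R}} q_0 Z_x\psi_{02,x}\,dx + C''\int_{\mathbb{R}} U_x\psi_{02}\,dx\Big|\,d\tau + |\bar\delta'(t)|\,\Big|\int_{\mathbb{R}} U_x\psi_1\,dx\Big| \le o(1)|\alpha_1(t,t)|. \tag{5.25}$$

Proof of Proposition 5.4. We have

$$\int_{-\infty}^{0} e^{ax}\psi_{01}\,dx \;
\begin{cases}
\le C\min\Big(1,\ \dfrac{\exp(-(s-u_+)^2 t/4)}{\big(a + \frac{s-u_+}{2}\big)\sqrt{4\pi t}}\Big), \\[3mm]
\ge c\min\Big(1,\ \dfrac{\exp(-(s-u_+)^2 t/4)}{\big(a + \frac{s-u_+}{2}\big)\sqrt{4\pi t}}\Big),
\end{cases}
\qquad \text{for } a > 0,$$

and

$$\int_{0}^{\infty} e^{ax}\psi_{01}\,dx \;
\begin{cases}
\le C\min\Big(1,\ \dfrac{-\exp(-s^2 t/4)}{\big(a + \frac{s}{2}\big)\sqrt{4\pi t}}\Big), \\[3mm]
\ge c\min\Big(1,\ \dfrac{-\exp(-s^2 t/4)}{\big(a + \frac{s}{2}\big)\sqrt{4\pi t}}\Big),
\end{cases}
\qquad \text{for } a < -\frac{s}{2}.$$

Combining this with the estimates

$$Z_x = \begin{cases} K\exp(Kx), & x < 0, \\ 0, & x > 0, \end{cases}$$

and

$$-U_x \le \begin{cases} C_1|u_+|\exp(-cSx), & x > 0, \\ C_1 u_-\exp(-cKx), & x < 0, \end{cases}
\qquad
-U_x \ge \begin{cases} C_2|u_+|\exp(-cSx), & x > 0, \\ C_2 u_-\exp(-cKx), & x < 0, \end{cases}$$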

yield (5.24). The estimate (5.25) follows from the estimates (5.20–21) of $\psi_{02}$ and the assumption (1.20b) that $\bar\delta'(t)$ is small. □

Proof of Proposition 5.2. The first term in (5.24) is positive since $\psi_{01,x}(x,t) > 0$ in the support $\{x \in \mathbb{R} : x < 0\}$ of $Z_x$. The second term in (5.24) is negative since $U_x \le 0$ and $\psi_{01} \ge 0$. Combining the good signs of the two terms in (5.24) with the bound (5.25) in Proposition 5.4 therefore implies that $\alpha_{1,t}$ and $\alpha_1$ have opposite signs and, more precisely,

$$\int_0^t |\alpha_{1,t} + C''\alpha_1|\,d\tau \le |\alpha_1(t,t)|\,[(1 - c) + c|u_- - u_+|],$$

which proves Proposition 5.2. □

Proof of Proposition 5.1. To prove (5.16), differentiate (5.2b) with respect to $t$, as in (5.22–23), and use the translation invariance (5.9–12) to obtain

$$\begin{aligned}
\alpha_{2,t}(\tau,t) &= \int_{\mathbb{R}} (U(x - s\tau - \bar\delta(\tau)) - u_+)(H(x,\tau) - 1)\,\frac{d}{dt}\varphi_1(x,\tau; st+\bar\delta(t), t)\,dx \\
&= \int_{\mathbb{R}} \Big[(U(y,\tau) - s)\big((U(y - s\tau - \bar\delta(\tau)) - u_+)(H(y,\tau) - 1)\big)_y \\
&\qquad\quad - \big((U(y - s\tau - \bar\delta(\tau)) - u_+)(H(y,\tau) - 1)\big)_{yy}\Big]\,\varphi_1(y,\tau; st+\bar\delta(t), t)\,dy \\
&\quad - \bar\delta'(t)\int_{\mathbb{R}} \big((U(y - s\tau - \bar\delta(\tau)) - u_+)(H(y,\tau) - 1)\big)_y \int_{-\infty}^{y} \varphi_2(y',\tau; st+\bar\delta(t), t)\,dy'\,dy.
\end{aligned} \tag{5.26}$$

Then let

$$x^*(\tau) \equiv \frac{s\tau + \sigma(\tau; st, t)}{2}$$

be the mid-point between the shock and the backward characteristic starting at the shock at time $t$. We have

$$x^*(\tau) - s\tau > c(t - \tau). \tag{5.27}$$

Split the integrals over $\mathbb{R}$ in the right-hand side of (5.26) as follows, and use (5.27), the decay of $U - u_+$ and the estimate of $\varphi_1$ in Lemma 3.3 to obtain

$$\begin{aligned}
&\Big|\int_{\mathbb{R}} (U(y,\tau) - s)\big((U(y - s\tau - \bar\delta(\tau)) - u_+)H(y,\tau)_y\big)\,\varphi_1(y,\tau; st+\bar\delta(t), t)\,dy\Big| \\
&\quad \le \|w\|_{L^\infty}\int_{-\infty}^{x^*(\tau)} |\varphi_1(y,\tau; st+\bar\delta(t), t)|\,dy + \|w\|_{L^\infty}\int_{x^*(\tau)}^{\infty} |\varphi_1(y,\tau; st+\bar\delta(t), t)|\,(U - u_+)\,dy \\
&\quad \le C\|w\|_{L^\infty}\Big(\frac{e^{-c(t-\tau)}}{\sqrt{t-\tau}} + \frac{|(U - u_+)(x^*(\tau))|}{\sqrt{t-\tau}}\Big) \le C\|w\|_{L^\infty}\frac{e^{-c(t-\tau)}}{\sqrt{t-\tau}},
\end{aligned}$$

which is integrable in $t$. The estimates of the remaining terms in (5.26) follow similarly. □

Conclusion of the Proof of Lemma 5.1. Proposition 5.2 and the estimate of $\alpha_2$ in terms of $\|w\|_{L^\infty}$ in Proposition 5.1 imply the first statement in Lemma 5.1 for sufficiently small $|u_- - u_+|$ and $\|w\|_{L^\infty}$. Finally, the exponential decay of $U - u_+$ and of $\varphi_i$, $\psi_i$ implies, by (5.2b), (5.23) and (5.26), the upper bounds in the second and third estimates of Lemma 5.1. The lower bound in the second estimate of Lemma 5.1 follows by (5.22), (5.20) and the fact that $-U_x$ and $\psi_{01}$ are positive. □

6. Verification of the Assumptions (1.20) and Conclusion of (1.10)

In this section we complete the proof of the theorem by verifying the assumption (1.20), based on the following continuation argument. Assume that $t = t^*$ is the smallest time such that

$$\|w(\cdot,t^*)\|_{L^\infty} + \|w_x(\cdot,t^*)\|_{L^\infty} = c_a, \qquad (u_- - u_+)\big(\|\bar\delta'\|_{L^1(0,t^*)} + \|\bar\delta'\|_{L^\infty(0,t^*)}\big) = c_b.$$

By a standard short time estimate of (1.12) and (5.5) we see that $t^* > 0$, since the left-hand side in (1.20) can be made arbitrarily small initially by choosing $w_0$ suitably small. From (5.7), (1.9a) and Lemma 5.2 we conclude that

$$(u_- - u_+)\big(\|\bar\delta'\|_{L^1(0,t^*)} + \|\bar\delta'\|_{L^\infty(0,t^*)}\big) \le CC'\gamma/\min(u_-,u_+) = o(\min(u_-,u_+)). \tag{6.1}$$

Next, by (4.6) and Lemmas 5.2–3 we see that

$$\|w\|_{L^\infty} \le C\big(\|w_0\|_{L^\infty} + (u_- - u_+)\|\bar\delta'\|_{L^1}\big) \le CC'\Big(1 + \frac{\gamma}{\min(u_-,u_+)}\Big). \tag{6.2}$$

Finally, (4.10) and (6.2) imply

$$\|w_x\|_{L^\infty} \le C\Big(\|w_0\|_{L^\infty} + \|w_0\|_{L^1} + \|w_{0,x}\|_{L^\infty} + C'\frac{\gamma}{\min(u_-,u_+)}\Big). \tag{6.3}$$


Thus, by choosing $w_0$ suitably small in (1.9), we see by (6.1–3) that the left-hand sides in (1.20ab) are bounded by $c_a/2$ and $c_b/2$, respectively, up to time $t^*$; a short time estimate then implies that the left-hand sides in (1.20) are bounded by $c_a$ and $c_b$ up to time $t^{**} > t^*$, contradicting the existence of a finite $t^*$. Therefore, for sufficiently small initial data, the estimate (1.20ab) holds for all time. The boundary condition in (1.20c) follows from (4.4–6), by observing that $|w(x,t)| \le C/|x|$ for large $|x|$ using Lemmas 5.2–3. Lemma 5.3 proves (1.10b) and (1.10c), and combining the lemma with (4.6) proves (1.10a).

7. Proof of Lemma 3.1

This section proves Lemma 3.1. The following lemma motivates the choice of the special backward characteristic $\sigma^*$ in (3.1).

Lemma 7.1. Suppose that the functions $\chi$ and $S$ in (3.3) and (3.4) satisfy

$$|\chi_x(x,t)| \le C\,\frac{e^{-(x-\sigma^*(t))^2/s_1}}{s_1}, \tag{7.1}$$
$$|S(x,t)| \le C\,\frac{e^{-(x-\tau(t))^2/s_2}}{s_2}, \tag{7.2}$$

where $s_1 = c_2(T - t)$, $s_2 = c_1(t - \underline{t})$. Then,

$$\begin{aligned}
\int_{\mathbb{R}} (U(x,t) - U(\bar\alpha(t),t))\,\chi_x(x,t)S(x,t)\,dx
&\le \exp\Big[\frac{-(\tau(t) - \sigma^*(t))^2}{s_1 + s_2}\Big] \int_{\mathbb{R}} (U(x,t) - U(\bar\alpha(t),t))\exp\Big[-(x - \bar\alpha(t))^2\,\frac{s_1 + s_2}{s_1 s_2}\Big]\frac{dx}{s_1 s_2} \\
&\le |\tilde U_x|\,\frac{e^{-(\tau(t)-\sigma^*(t))^2/(s_1+s_2)}}{s_1 + s_2},
\end{aligned} \tag{7.3}$$

where $\tilde U_x$ is a function satisfying

$$|\tilde U_x| = \frac{O(\gamma)}{\sqrt{\min(T - t,\, t - \underline{t})}}\Big(e^{-c|\bar\alpha - st|} + e^{-c\frac{(\bar\alpha - st)^2}{\min(T-t,\,t-\underline{t})}}\Big), \tag{7.4}$$

and

$$\bar\alpha \equiv \tau + (\sigma^* - \tau)\frac{s_2}{s_1 + s_2}.$$

Moreover, there holds

$$\int_{\underline{t}}^{T} |\tilde U_x|\, e^{-\big(\frac{\tau(T)-\sigma^*(T)}{T-\underline{t}}\big)^2 c(T-t)}\,dt \le C\gamma. \tag{7.5}$$
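The first inequality in (7.3) rests on completing the square in the product of the two Gaussian weights from (7.1) and (7.2): the product of a Gaussian centred at $\sigma^*$ with variance parameter $s_1$ and one centred at $\tau$ with parameter $s_2$ is a Gaussian centred at $\bar\alpha$, times a factor depending only on $\tau - \sigma^*$. The following is an illustrative numerical check of that identity, not taken from the paper; the function names and sample values are hypothetical.

```python
import math

def gaussian_product(x, sigma, tau, s1, s2):
    """Product of the weights exp(-(x-sigma)^2/s1) and exp(-(x-tau)^2/s2)."""
    return math.exp(-(x - sigma) ** 2 / s1) * math.exp(-(x - tau) ** 2 / s2)

def completed_square(x, sigma, tau, s1, s2):
    """Same product after completing the square around alpha_bar."""
    alpha_bar = tau + (sigma - tau) * s2 / (s1 + s2)
    prefactor = math.exp(-(tau - sigma) ** 2 / (s1 + s2))
    return prefactor * math.exp(-(x - alpha_bar) ** 2 * (s1 + s2) / (s1 * s2))

def max_relative_error(samples):
    """Largest relative gap between the two forms over (x, sigma, tau, s1, s2) samples."""
    worst = 0.0
    for args in samples:
        a = gaussian_product(*args)
        b = completed_square(*args)
        worst = max(worst, abs(a - b) / abs(a))
    return worst

if __name__ == "__main__":
    samples = [(0.3, 1.1, -0.4, 0.7, 1.9), (-1.0, 0.2, 0.9, 2.5, 0.3), (2.0, 1.5, 1.0, 1.0, 1.0)]
    print(max_relative_error(samples))
```

The two forms agree up to floating-point rounding for any positive `s1`, `s2`, which is the algebraic content behind the choice of $\bar\alpha$ in the lemma.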

Proof. The first inequality in (7.3) follows directly from the assumptions (7.1–2). To prove the second inequality, split the integral over $\mathbb{R}$ in (7.3) as

$$\int_{\mathbb{R}} \dots\,dx = \int_{-\infty}^{(\bar\alpha+st)/2} \dots\,dx + \int_{(\bar\alpha+st)/2}^{\infty} \dots\,dx.$$

For $\pm(\bar\alpha - st) > 0$ we have

$$|U(x,t) - U(\bar\alpha(t),t)| \le \begin{cases} C\gamma, & \pm(x - (\bar\alpha + st)/2) < 0, \\ C\gamma e^{-c|\bar\alpha - st|}, & \pm(x - (\bar\alpha + st)/2) > 0. \end{cases}$$

Combining this estimate and the splitting of the integral yields (7.4). The inequality (7.5) follows by inserting the estimate (7.4) in the left-hand side of (7.5) and evaluating the integrals over $t$ as follows. For the case when $\bar\alpha$ is transversal, meaning that $\big|\frac{\tau(T)-\sigma^*(T)}{(s-U)(T-t)}\big| < \frac{1}{2}$, we use that (7.4) implies

$$0 \le \int_{\underline{t}}^{T} |\tilde U_x|\,dt \le C\gamma.$$

For the remaining case when $\bar\alpha$ is non-transversal, i.e. for $\big|\frac{\tau(T)-\sigma^*(T)}{(s-U)(T-t)}\big| > 1/2$, the curve $\bar\alpha(t)$ may be inside the shock wave region for all $t$ and we use instead

$$\int_{\underline{t}}^{T} |\tilde U_x|\, e^{-\frac{(\tau(T)-\sigma^*(T))^2\,c(T-t)}{(T-\underline{t})^2}}\,dt \le C\gamma.$$
□

Lemma 7.2. There are positive constants $c$, $C$ such that for $t_1, t_2 \in [\underline{t}, T]$,

$$1 - C\gamma \le \frac{|\tau(t_2) - \sigma^*(t_2)|}{|\tau(t_1) - \sigma^*(t_1)|} \le 1 + C\gamma,$$

and

$$\exp\Big(-\frac{(\tau(t) - \sigma^*(t))^2}{s_1 + s_2}\Big) \le \exp\Big(-\frac{(\bar x - \tau(T))^2}{c_1(T - \underline{t})}\Big)\exp\Big(-\frac{(\bar x - \tau(T))^2}{c_1(T - \underline{t})}\,\frac{c(T - t)}{T - \underline{t}}\Big), \qquad \underline{t} \le t \le T.$$

Proof. Let $x(t) = \sigma^*(t) - \tau(t)$. Then, by (3.1), we have

$$\frac{dx}{dt} = U\Big(\frac{s_2}{s_1 + s_2}\,x + \tau,\ t\Big) - U(\tau, t),$$

and

$$U\Big(\tau + \frac{s_2 x}{s_1 + s_2},\ t\Big) - U(\tau, t) = U_x(\xi, t)\,\frac{s_2 x}{s_1 + s_2},$$

for some $\xi \in [\tau, \bar\alpha] \cup [\bar\alpha, \tau]$. The shock wave $U(x,t)$ is monotone decreasing in $x$, cf. Lemma 3.2 and [M], therefore we have

$$\frac{dx^2}{dt} \le 0,$$

which together with the relation

$$s_1 + s_2 = c_1(T - \underline{t})\Big(1 - \Big(1 - \frac{c_2}{c_1}\Big)\frac{T - t}{T - \underline{t}}\Big)$$

proves the second statement in the lemma. The first statement follows from integrating the differential equation for $x$ above, using that $\frac{s_2}{s_1+s_2} \le 1$ and, in the case that $\bar\alpha$ is transversal to the shock wave, i.e. $\big|\frac{\tau(T)-\sigma^*(T)}{(s-U)(T-t)}\big| < 1/2$, we have

$$0 \le \int_{\underline{t}}^{T} U_x(\xi, t)\,dt \le C\gamma.$$

In the case that $\bar\alpha$ is non-transversal, i.e. $\big|\frac{\tau(T)-\sigma^*(T)}{(s-U)(T-t)}\big| \ge 1/2$, we instead use that

$$|U(\bar\alpha) - U(\tau)| \le C\gamma.$$
□

Lemma 7.3. For $c_1$ sufficiently large, the linear integral equation (3.6) is a contraction in the weighted $L^\infty$-space

$$|S|_w \equiv \sup_{\bar x \in \mathbb{R},\ T > \underline{t}} \big|(T - \underline{t})\,e^{(\bar x - \tau(T))^2/c_1(T - \underline{t})}\,S(\bar x, T)\big|.$$

Proof. By Lemmas 7.1 and 7.2, we have, for sufficiently large $c_1$, that

$$\begin{aligned}
\int_{\underline{t}}^{T}\!\!\int_{\mathbb{R}} |U(x,t) - U(\bar\alpha(t),t)|\,\frac{e^{-(x-\tau(t))^2/c_1(t-\underline{t})}}{t - \underline{t}}\,|\chi_x|\,dx\,dt
&\le \frac{e^{-(\tau(T)-\sigma^*(T))^2/c_1(T-\underline{t})}}{T - \underline{t}} \int_{\underline{t}}^{T} \tilde U_x\, e^{-\big(\frac{\tau(T)-\sigma^*(T)}{T-\underline{t}}\big)^2 c(T-t)}\,dt \\
&\le \frac{e^{-(\tau(T)-\sigma^*(T))^2/c_1(T-\underline{t})}}{T - \underline{t}} \cdot C\gamma.
\end{aligned}$$

In the final inequality, we have used (7.5) from Lemma 7.1. Therefore the linear integral operator, as a function of $S$, in (3.3) is a contraction in the weighted norm $|\cdot|_w$. Since moreover $|\chi_x|_w$ is bounded in $|\cdot|_w$, the function $S$ is bounded in $|\cdot|_w$, proving (7.2) and Lemma 3.1. □

Acknowledgement. This work was supported by TFR grant 92961 and TMR project HCL ERBFMRXCT 960033.

References

[CMR] Colella, P., Majda, A. and Roytburd, V.: Theoretical and numerical structure for reacting shock waves. SIAM J. Sci. Stat. Comput. 7, 1059–1080 (1986)
[F] Friedman, A.: Partial Differential Equations of Parabolic Type. New York: Prentice-Hall, 1964
[G1] Gardner, R.: On the detonation of a combustible gas. Trans. Am. Math. Soc. 277, 431–468 (1983)
[G2] Goodman, J.: Stability of viscous scalar shock fronts in several space dimensions. Trans. Am. Math. Soc. 311, 683–695 (1989)
[GM] Goodman, J. and Miller, J.R.: Large-time behavior of scalar viscous fronts in two dimensions. Preprint, 1997
[GS1] Gasser, I. and Szmolyan, P.: A geometric singular perturbation analysis of detonation and deflagration waves. SIAM J. Math. Anal. 24, 968–986 (1993)
[GS2] Gasser, I. and Szmolyan, P.: Detonation and deflagration waves with multistep reaction schemes. SIAM J. Appl. Math. 55, 175–191 (1995)
[GSZ] Goodman, J., Szepessy, A. and Zumbrun, K.: A remark on the stability of shock waves. SIAM J. Math. Anal. 25, 1463–1467 (1994)
[M] Majda, A.: A qualitative model for dynamic combustion. SIAM J. Appl. Math. 41, 70–93 (1981)
[Li1] Li, T.: Rigorous asymptotic stability of a Chapman–Jouguet detonation wave in the limit of small resolved heat release. Combust. Theory Modeling 1, 259–270 (1997)
[Li2] Li, T.: Stability of strong detonation waves and rates of convergence. Electronic J. of Differential Equations 1998, 1–17 (1998)
[Li3] Li, T.: Stability and instability of detonation waves. In: Jeltsch, R. (ed.), Proceedings of the Seventh International Conference on Hyperbolic Problems, Theory, Numerics and Applications, Zürich, 1998
[L1] Liu, T.P.: Interaction of nonlinear hyperbolic waves. In: Liu, F.C., Liu, T.P. (eds.) Nonlinear Analysis. Singapore: World Scientific, 1991, pp. 171–184
[L2] Liu, T.P.: Pointwise convergence to shock waves for the system of viscous conservation laws. Comm. Pure Appl. Math. 50, 1113–1182 (1997)
[LY] Liu, T.P. and Ying, L.: Nonlinear stability of strong detonation for a viscous combustion model. SIAM J. Math. Anal. 26, 519–528 (1995)
[LYu1] Liu, T.P. and Yu, S.H.: Propagation of stationary shock layer under the effect of boundary. Arch. Rat. Mech. Anal. 139, 57–82 (1997)
[LYu2] Liu, T.P. and Yu, S.H.: Nonlinear stability of weak detonation waves for a combustion model. Preprint, 1998
[LZ1] Liu, T.P. and Zumbrun, K.: Nonlinear stability of an undercompressive shock for complex Burgers equation. Commun. Math. Phys. 168, 163–186 (1993)
[LZ2] Liu, T.P. and Zumbrun, K.: On the nonlinear stability of general undercompressive viscous shock waves. Commun. Math. Phys. 174, 319–345 (1995)
[RM] Rosales, R. and Majda, A.: Weakly nonlinear detonation waves. SIAM J. Appl. Math. 43, 1086–1118 (1983)
[SX] Szepessy, A. and Xin, Z.: Nonlinear stability of viscous shock waves. Arch. Rat. Mech. Anal. 122, 53–103 (1993)
[SZ] Szepessy, A. and Zumbrun, K.: Stability of rarefaction waves in viscous media. Arch. Rat. Mech. Anal. 133, 249–298 (1996)
[TT] Tan, D. and Tesei, A.: Nonlinear stability of strong detonation waves in a gas dynamical combustion. Nonlinearity 10, 355–376 (1997)
[Y] Yu, S.H.: Zero dissipation limit to solution with shocks for systems of hyperbolic conservation laws. Preprint, to appear in Arch. Rat. Mech. Anal.

Communicated by A. Kupiainen

Commun. Math. Phys. 202, 571 – 592 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Branes and Calibrated Geometries

Jerome P. Gauntlett1, Neil D. Lambert2, Peter C. West2

1 Physics Department, Queen Mary and Westfield College, Mile End Rd, London E1 4NS, UK. E-mail: [email protected]
2 Department of Mathematics, King’s College, The Strand, London, WC2R 2LS, UK. E-mail: [email protected], [email protected]

Received: 20 April 1998 / Accepted: 16 November 1998

Abstract: The fivebrane worldvolume theory in eleven dimensions is known to contain BPS threebrane solitons which can also be interpreted as a fivebrane whose worldvolume is wrapped around a Riemann surface. By considering configurations of intersecting fivebranes and hence intersecting threebrane solitons, we determine the Bogomol’nyi equations for more general BPS configurations. We obtain differential equations, generalising Cauchy–Riemann equations, which imply that the worldvolume of the fivebrane is wrapped around a calibrated submanifold.

1. Introduction

The dynamics of branes have played an important role in elucidating the structure of M-theory (for a review see [26]). In particular the fivebrane has received substantial interest recently due to its intricate worldvolume theory. This theory has been shown to contain supersymmetric threebrane [19] and self-dual string [18] solitons. A remarkable feature of these solitons, and closely related solitons on the worldvolumes of D-branes, is that they incorporate their spacetime interpretation [19,18,8,14,6,11]. For example, the self-dual string corresponds to a membrane ending on the fivebrane. Similarly, the simplest threebrane soliton solution can be interpreted as the orthogonal intersection of two fivebranes lying along flat hyperplanes. In fact, for this case the Bogomol’nyi equations are precisely the Cauchy–Riemann equations. Thus there are more general solutions corresponding to desingular deformations of this configuration which can be interpreted as a single fivebrane with its worldvolume wrapped around an arbitrary Riemann surface. There are solutions of the supergravity equations of motion corresponding to orthogonal intersections of branes, but the BPS solutions that are known at present are typically not fully localised [25,27,12] (for a review see [9]).
The description of intersecting branes given by examining the worldsheet theory thus provides a useful avenue of obtaining more insights into the properties of M-branes. Moreover, the existence of


branes with non-trivial worldvolumes has important applications in relation to the low energy dynamics of quantum Yang–Mills theories, e.g. the derivation of the Seiberg–Witten curve [29] (see also [23]) and indeed all of the Seiberg–Witten dynamics [20] from the fivebrane. It is natural to enquire if there are other BPS solutions of the worldvolume that correspond to intersecting threebranes and self-dual strings. From the supergravity point of view this seems rather natural: supersymmetric configurations of orthogonal intersecting membranes and fivebranes are known and we might expect to see analogous configurations in the worldvolume theory. For example, a supersymmetric configuration is given by a fivebrane in the $\{x^1, x^2, x^3, x^4, x^5\}$ plane orthogonally intersecting another fivebrane in the $\{x^3, x^4, x^5, x^6, x^7\}$ plane, with a membrane in the $\{x^2, x^6\}$ plane, a configuration that we will denote

M5 : 1 2 3 4 5
M5 :     3 4 5 6 7
M2 :   2       6          (1)

Considering the first fivebrane’s worldvolume theory we expect this configuration to correspond, in the simplest setting, to a BPS solution consisting of a threebrane soliton in the $x^3, x^4, x^5$ directions orthogonally intersecting a self-dual string in the $x^2$ direction. This self-dual string then acts as a source for the three form field $h$ on the fivebrane worldvolume. More general solutions should correspond to BPS solitons in the fivebrane approach to N=2 super Yang–Mills theory [29,17].

As a first step towards studying all supersymmetric configurations of branes, in this paper we will consider configurations with only fivebranes. In the simplest setting these should correspond to intersecting configurations of threebranes on the worldvolume, but more generally they can be interpreted as the worldvolume of a single fivebrane with a non-trivial worldvolume, i.e. these BPS states may simply be viewed as a single fivebrane wrapped on a non-trivial submanifold embedded in eleven dimensions. Since there are no membranes and we are considering solitons with only scalars active, our discussion is universal to all types of branes by dimensional reduction and T-duality. The fivebrane in eleven dimensions is particularly useful in this sense because it has both a large worldvolume and transverse space. We will address the issue of configurations involving fivebranes, membranes and momentum modes in a future paper. In our analysis we will choose the target space to be flat space throughout, although the generalisation to a curved space should be straightforward and will be briefly discussed in the conclusion.

The supersymmetry of (Euclidean) membranes wrapped on three cycles of a Calabi–Yau manifold and threebranes wrapped around three cycles and four cycles of exceptional holonomy manifolds has been studied in [4,3]. From those results we expect the supersymmetric configurations of fivebranes to correspond to calibrated submanifolds.
In this work we shall focus on a full description of the non-linear worldvolume theory of the fivebrane and its supersymmetry. In this way we hope to obtain a more detailed picture. In particular our derivation shows that such surfaces satisfy elegant differential equations, generalising Cauchy–Riemann equations, which appear in the work of Harvey and Lawson [16] as necessary and sufficient conditions for the manifold to be calibrated. In addition, since we will directly show that the surfaces must be calibrated using similar ideas to [4,3], our results can be viewed as a supersymmetric proof of some of the results in [16]. The plan of the rest of the paper is as follows. In the next section we obtain a list of orthogonally intersecting fivebranes which preserve some fraction of eleven-dimensional


spacetime supersymmetry. The purpose of this section is to characterise some features of potential supersymmetric solutions on the fivebrane. In particular we will identify which transverse scalars we expect to be active in the solutions and determine sets of projection operators acting on the supersymmetry parameters that will be useful in later sections. Following that we turn our attention to the non-linear worldvolume theory of the fivebrane in section three. For the reader who is not interested in all the details of this section, we point them to Eq. (42), which is the condition for the fivebrane to preserve some supersymmetry in cases where the self-dual three form vanishes. Following this equation we present the argument that the fivebranes must be wrapped along calibrated submanifolds. In section four we combine the results of sections two and three to derive the Bogomol’nyi equations for supersymmetric fivebrane configurations.

2. Intersecting Fivebranes

In this section we construct a number of orthogonally intersecting fivebrane configurations which preserve some fraction of eleven-dimensional spacetime supersymmetry (see also [5]) and list the corresponding supersymmetry projectors. This will provide a guide in our search for Bogomol’nyi conditions for supersymmetric solutions in the fivebrane worldvolume theory. We first note that a fivebrane in the $\{x^0, x^1, x^2, x^3, x^4, x^5\}$ plane preserves the supersymmetries $\Gamma^{012345}\epsilon = \epsilon$, where $\Gamma^a$ are the flat eleven-dimensional $\Gamma$-matrices, $a = 0, 1, 2, \dots, 10$. (The notation we use here is e.g. $(\epsilon\Gamma^a)_\beta = \epsilon^\alpha(\Gamma^a)_{\alpha\beta}$ and is further explained in Sect. 3.) The addition of other fivebranes will therefore imply further projections on $\epsilon$. We shall list the various configurations in the order of the amount of supersymmetry that they preserve. It turns out that in many configurations the supersymmetry conditions allow for additional fivebranes to be included, without breaking more supersymmetries.

Thus the number of fivebranes can be rather large and does not immediately reflect the amount of supersymmetry preserved. We follow the practice of always including these extra fivebranes, which make the configurations more symmetric. However we only list an independent set of projectors for each configuration. The reader will note in the following that there is clearly some choice between adding fivebranes or anti-fivebranes, although only for those fivebranes corresponding to independent projectors. Once these fivebranes are fixed, there is no choice for the others. In this section however, we merely wish to motivate the choice of projections used in the worldvolume analysis in the following sections. Clearly one could find other solitons by changing fivebranes to anti-fivebranes and vice versa. However this would only lead to trivial changes in our analysis and correspond to changing the signs of the coordinates.

2.1. 1/4 Supersymmetry.

M5 : 1 2 3 4 5
M5 :     3 4 5 6 7          (2)

$$\Gamma^{012345}\epsilon = \epsilon, \qquad \Gamma^{1267}\epsilon = -\epsilon. \tag{3}$$
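The supersymmetry fractions quoted for the projector sets in this section can be checked mechanically. The sketch below is illustrative and not taken from the paper: it assumes the Clifford algebra $\{\Gamma^a, \Gamma^b\} = 2\eta^{ab}$ with $\eta = \mathrm{diag}(-1, +1, \dots, +1)$, represents each antisymmetrised product as a signed tuple of indices, and uses the fact that every such product is traceless unless it is proportional to the identity. The names `gamma_mul` and `preserved_fraction` are hypothetical. The fraction is the normalised trace of the product of projectors $\frac{1}{2}(1 + s\,\Gamma)$ over all conditions $\Gamma\epsilon = s\epsilon$.

```python
from fractions import Fraction
from itertools import combinations

def gamma_mul(a, b):
    """Multiply signed Clifford monomials given as (sign, sorted index tuple).

    Assumes {Gamma^a, Gamma^b} = 2 eta^{ab} with eta = diag(-1, +1, ..., +1).
    """
    sign = a[0] * b[0]
    idx = list(a[1])
    for g in b[1]:
        pos = len(idx)
        # Anticommute Gamma^g leftwards past every larger index.
        while pos > 0 and idx[pos - 1] > g:
            pos -= 1
            sign = -sign
        if pos > 0 and idx[pos - 1] == g:
            sign *= -1 if g == 0 else 1  # (Gamma^0)^2 = -1, all others +1
            idx.pop(pos - 1)
        else:
            idx.insert(pos, g)
    return sign, tuple(idx)

def preserved_fraction(conditions):
    """Fraction of supersymmetries obeying Gamma^{idx} eps = s eps for every (idx, s)."""
    gens = [(s, tuple(idx)) for idx, s in conditions]  # s * Gamma acts as +1 on eps
    one = (1, ())
    for g in gens:
        assert gamma_mul(g, g) == one, "each condition must square to one"
        for h in gens:
            assert gamma_mul(g, h) == gamma_mul(h, g), "conditions must commute"
    total = 0
    for r in range(len(gens) + 1):
        for subset in combinations(gens, r):
            prod = one
            for g in subset:
                prod = gamma_mul(prod, g)
            if prod[1] == ():
                total += prod[0]  # trace of plus or minus the identity
            elif len(prod[1]) == 11:
                raise ValueError("product of all eleven Gammas is representation dependent")
            # every other monomial is traceless and contributes nothing
    return Fraction(total, 2 ** len(gens))

if __name__ == "__main__":
    quarter = [((0, 1, 2, 3, 4, 5), +1), ((1, 2, 6, 7), -1)]  # conditions (3)
    print(preserved_fraction(quarter))
```

Restricting to index tuples keeps the check independent of any explicit 32 × 32 matrix representation; the price is that a product of all eleven $\Gamma$-matrices, whose sign depends on the representation, is rejected rather than evaluated.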

This spacetime configuration should manifest itself as two active scalars $(X^6, X^7)$ depending on two worldvolume coordinates $(x^1, x^2)$, i.e. a two-dimensional surface embedded in four dimensions. As mentioned above the differential equations that the scalars satisfy in BPS solutions are simply Cauchy–Riemann equations, and hence this situation corresponds to a fivebrane wrapped around a Riemann surface.

2.2. 1/8 Supersymmetry.

M5 : 1 2 3 4 5
M5 :     3 4 5 6 7
M5 :     3 4 5     8 9      (4)

$$\Gamma^{012345}\epsilon = \epsilon, \qquad \Gamma^{1267}\epsilon = -\epsilon, \qquad \Gamma^{1289}\epsilon = -\epsilon. \tag{5}$$

BPS worldvolume solutions corresponding to this configuration should have four active scalars depending on two worldvolume coordinates. Thus it should appear as a two-dimensional surface embedded in six dimensions (and moreover it must not be possible to embed the surface in four dimensions). In fact it corresponds to a Riemann surface but this time embedded in a six dimensional space. We note that one also has $\Gamma^{10}\epsilon = -\epsilon$.

M5 : 1 2 3 4 5
M5 :     3 4 5 6 7
M5 : 1 2     5 6 7          (6)

$$\Gamma^{012345}\epsilon = \epsilon, \qquad \Gamma^{1267}\epsilon = -\epsilon, \qquad \Gamma^{3467}\epsilon = -\epsilon. \tag{7}$$

For this case we expect two active scalars depending on four worldsurface coordinates. BPS solutions should appear as a four surface embedded in six dimensions and in fact correspond to a complex manifold. Note that $\Gamma^{05}\epsilon = -\epsilon$ so that we could add a pp-wave in the $x^5$ direction without breaking any more supersymmetries.

M5 : 1 2 3 4 5
M5 :     3 4 5 6 7
M5 :   2   4 5 6   8
M5 : 1     4 5   7 8        (8)

$$\Gamma^{012345}\epsilon = \epsilon, \qquad \Gamma^{1267}\epsilon = \epsilon, \qquad \Gamma^{1368}\epsilon = \epsilon. \tag{9}$$

This configuration should correspond to solutions with three active scalars depending on three worldvolume coordinates. We will see that this corresponds to a three-dimensional special Lagrangian manifold embedded in six dimensions.

2.3. 1/16 Supersymmetry.

M5 : 1 2 3 4 5
M5 : 1     4 5 6     9
M5 : 1     4 5   7 8
M5 : 1 2     5     8 9
M5 : 1 2     5 6 7
M5 : 1   3   5 6   8
M5 : 1   3   5   7   9      (10)

$$\Gamma^{012345}\epsilon = \epsilon, \qquad \Gamma^{2369}\epsilon = -\epsilon, \qquad \Gamma^{2378}\epsilon = \epsilon, \qquad \Gamma^{3489}\epsilon = \epsilon. \tag{11}$$

For this configuration we should have four scalars depending on three worldvolume coordinates. We will see below that it describes an associative three surface in seven dimensions. Note that we also have $\Gamma^{10}\epsilon = \epsilon$ for this configuration.

M5 : 1 2 3 4 5
M5 :     3 4 5     8 9
M5 :   2   4 5   7   9
M5 : 1     4 5   7 8
M5 : 1 2     5     8 9
M5 :   2 3   5   7 8
M5 : 1   3   5   7   9      (12)

$$\Gamma^{012345}\epsilon = \epsilon, \qquad \Gamma^{1289}\epsilon = -\epsilon, \qquad \Gamma^{1379}\epsilon = \epsilon, \qquad \Gamma^{2378}\epsilon = \epsilon. \tag{13}$$

Here we should look for solutions with three scalars depending on four worldvolume coordinates. We will see below that this corresponds to a coassociative four surface in seven dimensions. Note that we have $\Gamma^{05}\epsilon = \epsilon$ so that we could add a pp-wave in the $x^5$ direction without breaking any more supersymmetries.

M5 : 1 2 3 4 5
M5 :     3 4 5 6 7
M5 : 1 2     5     8 9
M5 :     3 4 5     8 9
M5 : 1 2     5 6 7
M5 :         5 6 7 8 9      (14)

$$\Gamma^{012345}\epsilon = \epsilon, \qquad \Gamma^{1267}\epsilon = -\epsilon, \qquad \Gamma^{3489}\epsilon = -\epsilon, \qquad \Gamma^{1289}\epsilon = -\epsilon. \tag{15}$$

This configuration corresponds to four scalars depending on four worldvolume coordinates. We will see that it corresponds to a complex four dimensional surface embedded in eight dimensions. Note that $\Gamma^{05}\epsilon = -\epsilon$, $\Gamma^{10}\epsilon = -\epsilon$ and $\Gamma^{0\,5\,10}\epsilon = \epsilon$ so we could add a pp-wave in the $x^5$ direction and a membrane in the $\{x^0, x^5, x^{10}\}$ plane. The presence of the membrane is related to the fact that the second and third fivebranes intersect over a string, rather than a threebrane. We have not considered this string soliton by itself because there is no known worldvolume solution to describe it. Such configurations will appear again but unlike this case, where the orthogonal intersection is necessary to obtain the corresponding projections, the fivebranes which contribute string intersections could be discarded.

M5 : 1 2 3 4 5
M5 :     3 4 5 6 7
M5 :   2   4 5 6   8
M5 :   2 3   5 6     9
M5 : 1     4 5   7 8
M5 :         5 6 7 8 9
M5 : 1   3   5   7   9
M5 : 1 2     5     8 9      (16)

$$\Gamma^{012345}\epsilon = \epsilon, \qquad \Gamma^{1267}\epsilon = \epsilon, \qquad \Gamma^{1368}\epsilon = \epsilon, \qquad \Gamma^{1469}\epsilon = \epsilon. \tag{17}$$

Here we again have four scalars depending on four worldvolume coordinates. We will see below that this corresponds to a four-dimensional special Lagrangian surface embedded in eight dimensions. Note that we also have $\Gamma^{0\,5\,10}\epsilon = \epsilon$ so again we could add a membrane in the $\{x^0, x^5, x^{10}\}$ plane.

2.4. 1/32 Supersymmetry.

M5 : 1 2 3 4 5
M5 :     3 4 5 6 7
M5 :   2   4 5 6   8
M5 : 1 2     5 6 7
M5 : 1   3   5 6   8
M5 :   2 3   5   7 8
M5 : 1     4 5   7 8
M5 :         5 6 7 8 9
M5 :   2 3   5 6     9
M5 :     3 4 5     8 9
M5 :   2   4 5   7   9
M5 : 1   3   5   7   9
M5 : 1     4 5 6     9
M5 : 1 2     5     8 9      (18)

$$\Gamma^{012345}\epsilon = \epsilon, \quad \Gamma^{1267}\epsilon = \epsilon, \quad \Gamma^{1368}\epsilon = \epsilon, \quad \Gamma^{1469}\epsilon = \epsilon, \quad \Gamma^{1289}\epsilon = -\epsilon. \tag{19}$$

In this configuration we expect four scalars depending on four worldvolume coordinates. We will see below that this solution is described by a Cayley four surface in eight dimensions. Note that here we have $\Gamma^{0\,5\,10}\epsilon = -\epsilon$, $\Gamma^{05}\epsilon = -\epsilon$ and $\Gamma^{10}\epsilon = \epsilon$. Thus we could add membranes in the $\{x^0, x^5, x^{10}\}$ plane and pp-waves in the $x^5$ direction without breaking any additional supersymmetry.

M5 : 1 2 3 4 5
M5 :     3 4 5 6 7
M5 :   2   4 5 6   8
M5 :   2 3   5 6     9
M5 :   2 3 4   6       10
M5 : 1     4 5   7 8
M5 : 1   3   5   7   9
M5 : 1 2     5     8 9
M5 : 1   3 4     7     10
M5 : 1 2   4       8   10
M5 : 1 2 3           9 10
M5 :         5 6 7 8 9
M5 :       4   6 7 8   10
M5 :     3     6 7   9 10
M5 :   2       6   8 9 10
M5 : 1           7 8 9 10   (20)

$$\Gamma^{012345}\epsilon = \epsilon, \quad \Gamma^{1267}\epsilon = \epsilon, \quad \Gamma^{1368}\epsilon = \epsilon, \quad \Gamma^{1469}\epsilon = \epsilon, \quad \Gamma^{1\,5\,6\,10}\epsilon = \epsilon. \tag{21}$$

In this configuration all five scalars are active and depend on all five worldvolume coordinates. We will see that it manifests itself as a five-dimensional special Lagrangian surface in ten dimensions. Again there are fivebranes intersecting over strings and $\Gamma^{0\,5\,10}\epsilon = \epsilon$, $\Gamma^{049}\epsilon = \epsilon$, $\Gamma^{038}\epsilon = \epsilon$, $\Gamma^{027}\epsilon = \epsilon$ and $\Gamma^{016}\epsilon = \epsilon$ so that we can add membranes in the $\{x^0, x^1, x^6\}$, $\{x^0, x^2, x^7\}$, $\{x^0, x^3, x^8\}$, $\{x^0, x^4, x^9\}$ and $\{x^0, x^5, x^{10}\}$ planes.

3. Supersymmetry and the Fivebrane

In this paper we are interested in bosonic solutions of the fivebrane equations of motion that preserve some supersymmetry. This will be the case if there exist constant spinors $\epsilon$ such that the variation of the spinor field of the fivebrane theory vanishes: the resulting condition is the Bogomol’nyi equation for the bosonic fields. We will see that the Bogomol’nyi condition will determine the geometry of the fivebrane configuration. In this section we derive an explicit expression for the supersymmetric variation of the spinor field of the fivebrane for the case of the vanishing self-dual three form, generalising and refining the discussion found in [18].

We use the fivebrane dynamics and conventions of [22]. In our paper the fivebrane is embedded in flat eleven-dimensional Minkowski superspace. We must distinguish between world and tangent indices, fermionic and bosonic indices and indices associated with the target space $\underline{M}$ and the fivebrane worldvolume $M$. On the fivebrane worldvolume the bosonic tangent space indices are denoted by $a, b, \dots = 0, 1, 2, \dots, 5$ and bosonic world indices by $m, n, \dots = 0, 1, 2, \dots, 5$. For example, the inverse vielbein of the bosonic sector of the fivebrane worldvolume is denoted by $E_a{}^m$. The bosonic indices of the tangent space of the target space $\underline{M}$ are denoted by the same symbols, but underlined, i.e. the inverse vielbein in the bosonic sector is given by $E_{\underline{a}}{}^{\underline{m}}$. The fermionic indices follow the same pattern, those in the tangent space are denoted by $\alpha$ and $\underline{\alpha}$ for worldvolume $M$ and target space $\underline{M}$ respectively, while the world spinor indices are denoted by $\mu$ and $\underline{\mu}$.
The fivebrane sweeps out a superspace $M$ in the target superspace $\underline{M}$ which is specified in local coordinates $Z^{\underline{M}} = (X^{\underline{m}}, \Theta^{\underline{\mu}})$, $\underline{m} = 0, 1, \dots, 10$, $\underline{\mu} = 1, \dots, 32$. These coordinates are functions of the worldvolume superspace parameterised by $z^M = (x^m, \theta^\mu)$, $m = 0, 1, \dots, 5$; $\mu = 1, \dots, 16$. The $\theta^\mu$ expansion of the $Z^{\underline{M}}$ contains $x^m$ dependent fields of which the only independent ones are their $\theta^\mu = 0$ components, also denoted $X^{\underline{m}}$ and $\Theta^{\underline{\mu}}$, and a self-dual tensor $h_{abc}$ which occurs at level $\theta^\mu$ in $\Theta^{\underline{\mu}}$. Despite the redundancy of notation it will be clear from the context when we are discussing the component fields and the superfields.

The bosonic target space indices tangent to $\underline{M}$ may be decomposed as those that lie in the fivebrane worldvolume and those that lie in the space transverse to the fivebrane; we denote these indices by $a$ and $a'$ respectively (i.e. $\underline{a} = (a, a')$, $a = 0, 1, \dots, 5$; $a' = 1', \dots, 5'$)1 with a similar convention for world indices. The initially thirty-two component spinor indices $\underline{\alpha}$ are split into a pair of sixteen component spinor indices (i.e. $\underline{\alpha} = (\alpha, \alpha')$, $\alpha = 1, \dots, 16$; $\alpha' = 1', \dots, 16'$) corresponding to the breaking of half of the supersymmetries by the fivebrane.

We will use the super-reparameterisations of the worldvolume to choose the so-called static gauge. In this gauge we identify the bosonic embedding coordinates along the worldvolume directions with the bosonic coordinates on the worldvolume (i.e. $X^n = x^n$, $n = 0, 1, \dots, 5$) and set the fermionic fields $\Theta^\alpha = 0$, $\alpha = 1, \dots, 16$. For a flat background $\Theta^{\underline{\mu}} = \Theta^{\underline{\alpha}}\delta_{\underline{\alpha}}{}^{\underline{\mu}}$. The component field content of the fivebrane is $X^{a'}$ ($a' = 1', \dots, 5'$), $\Theta^{\alpha'}$ ($\alpha' = 1', \dots, 16'$) and the self-dual field strength $h_{abc}$.

1 We will also use $a' = 6, 7, 8, 9, 10$.

We recall some of the salient points of the super-embedding formalism. The frame vector fields on the target manifold $\underline{M}$ and the fivebrane worldvolume submanifold $M$ are given by $E_{\underline{A}} = E_{\underline{A}}{}^{\underline{M}}\partial_{\underline{M}}$ and $E_A = E_A{}^M\partial_M$ respectively. The coefficients $E_A{}^{\underline{A}}$ encode the relationship between the vector fields $E_A$ and $E_{\underline{A}}$, i.e. $E_A = E_A{}^{\underline{A}}E_{\underline{A}}$. Applying this relationship to the coordinate $Z^{\underline{M}}$ we find the equation

$$E_A{}^{\underline{A}} = E_A{}^N\partial_N Z^{\underline{M}}E_{\underline{M}}{}^{\underline{A}}. \tag{22}$$

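In this paper we will be primarily interested in fivebranes whose worldvolumes have $h_{abc} = 0$. In this case the geometry of the fivebrane simplifies considerably. The vector fields $E_{\bar\alpha}{}^{\underline\beta} \equiv (E_\alpha{}^{\underline\beta}, E_{\alpha'}{}^{\underline\beta})$ and $E_{\bar a}{}^{\underline b} \equiv (E_a{}^{\underline b}, E_{a'}{}^{\underline b})$ on the fivebrane can be chosen to be equal to the Spin(1, 10) and SO(1, 10) matrices $u_{\bar\alpha}{}^{\underline\beta}$ and $u_{\bar a}{}^{\underline b}$ respectively. For example

$$E_\alpha{}^{\underline\beta} = u_\alpha{}^{\underline\beta}, \qquad E_{\alpha'}{}^{\underline\beta} = u_{\alpha'}{}^{\underline\beta}, \qquad E_{a'}{}^{\underline a} = u_{a'}{}^{\underline a}. \tag{23}$$

The matrix $u_{\bar a}{}^{\underline b} \equiv (u_a{}^{\underline b}, u_{a'}{}^{\underline b})$ is an element of SO(1, 10) and the matrix $u_{\bar\alpha}{}^{\underline\beta} \equiv (u_\alpha{}^{\underline\beta}, u_{\alpha'}{}^{\underline\beta})$ forms an element of Spin(1, 10). As is clear from the notation, the indices with an overbar take the same range as those with an underline. We recall that the connection between the Lorentz and spin groups is given by

$$u_{\bar\alpha}{}^{\underline\gamma}\,u_{\bar\beta}{}^{\underline\delta}\,(\Gamma^{\underline a})_{\underline\gamma\underline\delta} = (\Gamma^{\bar b})_{\bar\alpha\bar\beta}\,u_{\bar b}{}^{\underline a}. \tag{24}$$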
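As a low-dimensional illustration of the kind of relation expressed by Eq. (24), the following sketch checks the analogous statement for SU(2) and SO(3): conjugating the Pauli matrices by a spin matrix $U$ reproduces them rotated by an orthogonal matrix $R$, i.e. $U\sigma_aU^\dagger = R_a{}^b\sigma_b$. This is only a toy analogue chosen for brevity, not the Spin(1, 10) computation of the text, and the helper names (`matmul`, `dagger`, `rotation_from_spinor`, `spin_rotation_about_z`) are introduced here purely for illustration.

```python
import cmath
import math

# Pauli matrices, playing the role of the Gamma-matrices in this toy analogue.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[a[c][r].conjugate() for c in range(2)] for r in range(2)]


def rotation_from_spinor(u):
    """Return R such that U sigma_a U^dagger = sum_b R[a][b] sigma_b.

    Components are extracted with R[a][b] = (1/2) tr(sigma_b U sigma_a U^dagger).
    """
    rot = []
    for a in range(3):
        conjugated = matmul(matmul(u, SIGMA[a]), dagger(u))
        row = []
        for b in range(3):
            prod = matmul(SIGMA[b], conjugated)
            row.append(((prod[0][0] + prod[1][1]) / 2).real)
        rot.append(row)
    return rot


def spin_rotation_about_z(theta):
    """Spin-1/2 matrix exp(-i theta sigma_3 / 2) (assumed sign convention)."""
    return [[cmath.exp(-0.5j * theta), 0], [0, cmath.exp(0.5j * theta)]]


# A quarter turn in spin space induces a quarter turn of the vector index.
R = rotation_from_spinor(spin_rotation_about_z(math.pi / 2))
```

With the sign convention assumed above, `R` is (up to rounding) the rotation that maps $\sigma_1 \to \sigma_2$ and $\sigma_2 \to -\sigma_1$; it is orthogonal, as the Lorentz matrix in Eq. (24) must be in its own setting.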

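For a flat target superspace the super-reparameterisation invariance reduces to translations and rigid supersymmetry transformations. The latter take the form

$$\delta x^{\underline n} = \frac{i}{2}\,\Theta\Gamma^{\underline n}\epsilon, \qquad \delta\Theta^{\underline\mu} = \epsilon^{\underline\mu}. \tag{25}$$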

Unlike other formulations, the super-embedding approach of [22,21] is invariant under super-reparameterisations of the worldvolume, that is, invariant under

$$\delta z^M = -v^M, \tag{26}$$

where $v^M$ is a supervector field on the fivebrane worldvolume. The corresponding motion induced on the target space $\underline M$ is given by

$$\delta Z^{\underline B} = v^A E_A{}^{\underline B}, \tag{27}$$

where $v^M = v^A E_A{}^M$ and, rather than use the embedding coordinates $Z^{\underline N}$, we have referred the variation to the background tangent space, i.e. $\delta Z^{\underline B} \equiv \delta Z^{\underline M}E_{\underline M}{}^{\underline B}$. We are interested in supersymmetry transformations and so consider $v^a = 0$, $v^\alpha \neq 0$; with this choice, and including the rigid supersymmetry transformation of the target space of Eq. (25), the transformation of $\Theta^{\underline\alpha}$ is given by [22]

$$\delta\Theta^{\underline\alpha} = v^\beta E_\beta{}^{\underline\alpha} + \epsilon^{\underline\alpha}. \tag{28}$$

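The local supersymmetry transformations $v^\alpha$ are used to set $\Theta^\alpha = 0$, which is part of the static gauge choice. However, by combining these transformations with those of the rigid supersymmetry of the target space $\epsilon^\alpha$ we find a residual rigid worldvolume supersymmetry which is determined by the requirement that the gauge choice $\Theta^\alpha = 0$ is preserved. Consequently, we require $v^\beta E_\beta{}^\alpha = -\epsilon^\alpha$. Following the discussion in [18] the variation of the remaining spinor is given by

$$\delta\Theta^{\alpha'} = v^\beta E_\beta{}^{\alpha'} = v^\beta E_\beta{}^{\underline\gamma}\,(E^{-1})_{\underline\gamma}{}^{\delta}\,E_\delta{}^{\alpha'}, \tag{29}$$

where we have set the non-linearly realized symmetry parameterized by $\epsilon^{\alpha'}$ to zero. Introducing the projectors [22]

$$(E^{-1})_{\underline\alpha}{}^{\beta}E_\beta{}^{\underline\gamma} = \frac12(1+\Gamma)_{\underline\alpha}{}^{\underline\gamma}, \qquad (E^{-1})_{\underline\alpha}{}^{\beta'}E_{\beta'}{}^{\underline\gamma} = \frac12(1-\Gamma)_{\underline\alpha}{}^{\underline\gamma}, \tag{30}$$

we then find that the supersymmetry transformation for the fermions is given by

$$\delta\Theta^{\alpha'} = -\frac12\epsilon^\gamma(1+\Gamma)_\gamma{}^{\alpha'} + \frac12\delta\Theta^{\gamma'}(1+\Gamma)_{\gamma'}{}^{\alpha'}. \tag{31}$$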

Hence we may write the variation of the spinor as

$$\delta\Theta^{\gamma'}\left(\frac{1-\Gamma}{2}\right)_{\gamma'}{}^{\alpha'} = -\frac12\epsilon^\gamma(\Gamma)_\gamma{}^{\alpha'}. \tag{32}$$

Note that since only primed indices occur, the matrix $\frac12(1-\Gamma)_{\gamma'}{}^{\alpha'}$ is invertible. Therefore by multiplying by its inverse we find the variation $\delta\Theta^{\gamma'}$. Bosonic configurations will preserve some supersymmetry if there exist spinors $\epsilon^\gamma$ such that $\delta\Theta^{\gamma'}$ vanishes in the limit $\Theta^{\alpha'} = 0$. It will actually be more convenient to look for the conditions required for the vanishing of the right hand side of (32). We thus write (32) as

$$\hat\delta\Theta^{\alpha'} = -\frac12\epsilon^\gamma(\Gamma)_\gamma{}^{\alpha'}, \tag{33}$$

where we have absorbed the factor of $\frac12(1-\Gamma)$ into the definition of $\hat\delta\Theta^{\alpha'}$. To further analyse this expression we are required to find $E_A{}^{\underline A}$, or equivalently the $u$'s of SO(1, 10) and Spin(1, 10), in terms of the component fields in the limit $\Theta^{\alpha'} = 0$. Using Eq. (22), the Lorentz condition $u_c{}^{\underline a}\eta_{\underline a\underline b}u_d{}^{\underline b} = \eta_{cd}$ and the static gauge choice $X^n = x^n$ we find that

$$(u_a{}^{b},\ u_a{}^{b'}) = (e_a{}^n\delta_n{}^b,\ e_a{}^n\partial_nX^{b'}), \tag{34}$$

where $g_{nm} = e_n{}^a e_m{}^b\eta_{ab} = \eta_{nm} + \partial_nX^{a'}\partial_mX^{b'}\delta_{a'b'}$. Using the remaining Lorentz conditions we find, up to a local SO(5) rotation, that the full Lorentz matrix $u$ is given by

$$u = \begin{pmatrix} e^{-1} & e^{-1}\partial X \\ -d^{-1}(\partial X)^T(\eta_1)^T & d^{-1} \end{pmatrix}, \tag{35}$$

where the matrix $d$ is defined by the condition $dd^T = I + (\partial X)^T\eta_1(\partial X)$, $(\partial X)^T$ is the transpose of the matrix $(\partial_nX^{a'})$ and $\eta_1$ is the Minkowski metric on the fivebrane, given by $\eta_1 = \mathrm{diag}(-1, 1, 1, 1, 1, 1)$. The $u_\alpha{}^{\underline\beta} \in$ Spin(1, 10) corresponding to the above $u_a{}^{\underline b} \in$ SO(1, 10) are found using Eq. (24).

We now consider in more detail the decomposition of the spinor indices. We recall that the bosonic indices of the fields on the fivebrane can be decomposed into longitudinal and transverse indices, i.e. $\underline a = (a, a')$, according to the decomposition of the Lorentz group SO(1, 10) into SO(1, 5) × SO(5). The corresponding decomposition of the spin group is Spin(1, 10) → Spin(1, 5) × USp(4). The spinor indices of the groups Spin(1, 5) and USp(4) are denoted by $\alpha, \beta, \ldots = 1, \ldots, 4$ and $i, j, \ldots = 1, \ldots, 4$ respectively. Six-dimensional Dirac spinor indices normally take eight values; however, the spinor indices we use for Spin(1, 5) correspond to Weyl spinors. Although we began with spinor indices $\underline\alpha$ taking thirty-two values, which were split into a pair of indices each taking sixteen values, $\underline\alpha = (\alpha, \alpha')$, in the final six-dimensional expressions the spinor indices are further decomposed according to the above decomposition of the spin groups: we take $\alpha \to \alpha i$ and $\alpha' \to {}_\alpha{}^i$ when appearing as superscripts, and $\alpha \to \alpha i$ and $\alpha' \to {}^\alpha{}_i$ when appearing as subscripts [22]. It should be clear whether we mean $\alpha$ to be sixteen or four dimensional depending on the absence or presence of $i, j, \ldots$ indices respectively. For example, we will write $\Theta^{\alpha'} \to \Theta_\alpha{}^i$. Using the corresponding decomposition of the spinor indices, the eleven dimensional $\Gamma$-matrices can be written as

$$(\Gamma^a)_\alpha{}^\beta = \delta_i{}^j\begin{pmatrix} 0 & (\gamma^a)_{\alpha\beta} \\ (\tilde\gamma^a)^{\alpha\beta} & 0 \end{pmatrix}, \qquad (\Gamma^{a'})_\alpha{}^\beta = (\gamma^{a'})_i{}^j\begin{pmatrix} \delta_\alpha{}^\beta & 0 \\ 0 & -\delta^\alpha{}_\beta \end{pmatrix}, \tag{36}$$

where $\tilde\gamma^0 = -\gamma^0$ and $\tilde\gamma^a = \gamma^a$ for $a \neq 0$. Note that these $\Gamma$-matrices can appear with either underlined or overlined spinor indices.

Using this equation the eleven dimensional $\Gamma$-matrices with several indices can be expressed as

$$(\Gamma^{a_1\ldots a_{2n}b_1'\ldots b_m'})_\alpha{}^\beta = (\gamma^{b_1'\ldots b_m'})_i{}^j\begin{pmatrix} (\gamma^{a_1\ldots a_{2n}})_\alpha{}^\beta & 0 \\ 0 & (-1)^m(\tilde\gamma^{a_1\ldots a_{2n}})^\alpha{}_\beta \end{pmatrix},$$

$$(\Gamma^{a_1\ldots a_{2n+1}b_1'\ldots b_m'})_\alpha{}^\beta = (\gamma^{b_1'\ldots b_m'})_i{}^j\begin{pmatrix} 0 & (-1)^m(\gamma^{a_1\ldots a_{2n+1}})_{\alpha\beta} \\ (\tilde\gamma^{a_1\ldots a_{2n+1}})^{\alpha\beta} & 0 \end{pmatrix}, \tag{37}$$

where, for example, $\gamma^{a_1\ldots a_{2n}} \equiv \gamma^{[a_1}\tilde\gamma^{a_2}\gamma^{a_3}\cdots\tilde\gamma^{a_{2n}]}$. We will need the relationship

$$(\gamma^{a_1a_2\ldots a_n}) = -(-1)^{\frac{n(n+1)}{2}}\frac{1}{(6-n)!}\,\epsilon^{a_1a_2\ldots a_na_{n+1}\ldots a_6}\,\gamma_{a_{n+1}\ldots a_6}, \tag{38}$$

for the chiral six dimensional $\gamma$-matrices. The other chiral six dimensional $\tilde\gamma$-matrices satisfy an identical condition except for an additional minus sign on the right hand side. Using the expressions for the supervielbeins of Eq. (23) in terms of the SO(1, 10) matrices, the variation of the spinor can be written as

$$\hat\delta\Theta^{\gamma'} = -\epsilon^\gamma(u^{-1})_\gamma{}^\beta u_\beta{}^{\gamma'} = -\frac12\epsilon^\gamma(u^{-1})_\gamma{}^\beta\left(1 - \frac{1}{6!}\epsilon^{a_1a_2a_3a_4a_5a_6}\Gamma_{a_1a_2a_3a_4a_5a_6}\right)_\beta{}^{\delta}u_\delta{}^{\gamma'}. \tag{39}$$

The last step in the above equation used the relation

$$-\frac{1}{6!}\epsilon^{a_1a_2a_3a_4a_5a_6}(\Gamma_{a_1a_2a_3a_4a_5a_6})_\alpha{}^\beta = \delta_i{}^j\begin{pmatrix} \delta_\alpha{}^\beta & 0 \\ 0 & -\delta^\alpha{}_\beta \end{pmatrix}. \tag{40}$$


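Using Eq. (24) we find that

$$\hat\delta\Theta^{\gamma'} = \frac{1}{2\cdot 6!}\,\epsilon^{a_1a_2a_3a_4a_5a_6}\,u_{a_1}{}^{\underline b_1}u_{a_2}{}^{\underline b_2}u_{a_3}{}^{\underline b_3}u_{a_4}{}^{\underline b_4}u_{a_5}{}^{\underline b_5}u_{a_6}{}^{\underline b_6}\,\epsilon^\alpha(\Gamma_{\underline b_1\underline b_2\underline b_3\underline b_4\underline b_5\underline b_6})_\alpha{}^{\gamma'}. \tag{41}$$

Equation (41) however contains an eleven dimensional $\Gamma$-matrix that involves the upper off diagonal block and as such it vanishes unless the $b_i$ indices take values in the longitudinal direction an odd number of times. Substituting in this matrix we find that

$$\hat\delta\Theta_\beta{}^j = \frac12\det(e^{-1})\,\epsilon^{\alpha i}\Big\{\partial_aX^{c'}(\gamma^a)_{\alpha\beta}(\gamma_{c'})_i{}^j - \frac{1}{3!}\partial_{a_1}X^{c_1'}\partial_{a_2}X^{c_2'}\partial_{a_3}X^{c_3'}(\gamma^{a_1a_2a_3})_{\alpha\beta}(\gamma_{c_1'c_2'c_3'})_i{}^j + \frac{1}{5!}\partial_{a_1}X^{c_1'}\cdots\partial_{a_5}X^{c_5'}(\gamma^{a_1\ldots a_5})_{\alpha\beta}(\gamma_{c_1'\ldots c_5'})_i{}^j\Big\}. \tag{42}$$

When deriving this equation we have used Eq. (38) and (35) for the $u$'s. In the next section we will derive Bogomol'nyi equations for bosonic configurations with a vanishing self-dual three form which preserve some worldvolume supersymmetry, i.e. configurations associated with the vanishing of (42). We will do this by further manipulating (42) by imposing the projections on the spinor $\epsilon$ that we obtained in the last section from considerations of orthogonally intersecting branes.

Before proceeding to that analysis, it is interesting to consider the conditions for the preservation of supersymmetry without using static gauge. Clearly $\delta\Theta^{\underline\alpha} = 0$ implies that $v^\beta E_\beta{}^{\underline\alpha} = -\epsilon^{\underline\alpha}$. Multiplying by the inverse of the embedding matrix this condition is equivalent to the two conditions $v^\beta = -\epsilon^{\underline\alpha}(E^{-1})_{\underline\alpha}{}^\beta$ and $\epsilon^{\underline\alpha}(E^{-1})_{\underline\alpha}{}^{\beta'} = 0$. Since $v^\beta$ is an arbitrary function, the first of these equations is automatically satisfied. The second condition is equivalent to $\epsilon^{\underline\alpha}(E^{-1})_{\underline\alpha}{}^{\beta'}E_{\beta'}{}^{\underline\gamma} = 0$, which using the projectors of equation (30) we may rewrite as

$$\epsilon^{\underline\alpha}(1-\Gamma)_{\underline\alpha}{}^{\underline\gamma} = 0. \tag{43}$$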

Hence this is the necessary and sufficient condition for the preservation of supersymmetry. We can now make contact with the work of [4,3]. For the static configurations which are studied in this paper the matrix $\Gamma$ takes the form

$$\Gamma = -\frac{1}{5!}\det(e^{-1})\,\epsilon^{m_1m_2m_3m_4m_5}\,\partial_{m_1}X^{\underline b_1}\partial_{m_2}X^{\underline b_2}\partial_{m_3}X^{\underline b_3}\partial_{m_4}X^{\underline b_4}\partial_{m_5}X^{\underline b_5}\,\Gamma_0\Gamma_{\underline b_1\underline b_2\underline b_3\underline b_4\underline b_5}, \tag{44}$$

where the sums exclude the value 0. Although the matrix $\Gamma$ is in general not a hermitian matrix, it is for the case of static configurations. One can also verify that it is symmetric in its spinor indices. Following similar arguments to those of [4] for the case of the Euclidean two brane, and using $\Gamma^2 = 1$, we conclude that

$$\eta^\dagger(1-\Gamma)(1-\Gamma)\eta = 2\,\eta^\dagger(1-\Gamma)\eta \geq 0, \tag{45}$$

where $\eta = \epsilon^\dagger$. The transverse coordinates will not depend on all the longitudinal coordinates of the brane. Let us suppose that they depend on $q$ spatial coordinates, leaving $p = 5 - q$ spatial coordinates upon which there is no dependence. In static gauge the matrix $\Gamma$ then further simplifies to

$$\Gamma = -\frac{1}{q!}\det(e^{-1})\,\epsilon^{m_1\ldots m_q}\,\partial_{m_1}X^{\underline b_1}\cdots\partial_{m_q}X^{\underline b_q}\,\Gamma_{0\ldots p}\Gamma_{\underline b_1\ldots\underline b_q}, \tag{46}$$

where the sum now excludes $0, \ldots, p$ and $\det e$ is the determinant of the vielbein induced on the embedded surface. Integrating the bound (45) over the $q$ longitudinal coordinates of the brane and using Eq. (46) we find that

$$\int d^qx\,(\det e)\,\eta^\dagger\eta \geq \int d^qx\,(\det e)\,\eta^\dagger\Gamma\eta = -\frac{1}{q!}\int d^qx\,\epsilon^{m_1m_2m_3\ldots m_q}\,\partial_{m_1}X^{\underline b_1}\cdots\partial_{m_q}X^{\underline b_q}\,\eta^\dagger\Gamma_{0\ldots p}\Gamma_{\underline b_1\ldots\underline b_q}\eta. \tag{47}$$

Hence we find that the volume of the embedded surface is greater than or equal to the integral of the form $-\frac{1}{q!}\epsilon^{m_1m_2m_3\ldots m_q}\partial_{m_1}X^{\underline b_1}\cdots\partial_{m_q}X^{\underline b_q}\,\eta^\dagger\Gamma_{0\ldots p}\Gamma_{\underline b_1\ldots\underline b_q}\eta$. This expression is just the pull back to the worldvolume of a closed (in fact in our case constant) $q$-form in flat spacetime which is a calibration [16]. The embedded surface is calibrated if and only if the bound is saturated, which is equivalent, as we have seen above, to preserving some supersymmetry.

To illustrate how this works in more detail let us consider the particular example of (18). In this case four of the transverse fields of the fivebrane are active and they depend on only four of the longitudinal coordinates of the fivebrane (i.e. $q = 4$). Thus we have a four dimensional space embedded in eight dimensions, which are made up of the four longitudinal coordinates of the fivebrane and the four active transverse coordinates. In this case the form on the right hand side of (47) has the components

$$-\partial_{m_1}X^{\underline b_1}\cdots\partial_{m_4}X^{\underline b_4}\,\eta^\dagger\Gamma_{05}\Gamma_{\underline b_1\ldots\underline b_4}\eta, \tag{48}$$

where the sum over the $b_i$ excludes the values 0, 5, 10. Since $\epsilon\Gamma_{05} = -\epsilon$, this is just the pull back to the fivebrane world surface of the four form $\eta^\dagger\Gamma_{b_1\ldots b_4}\eta$. This form lives on the eight-dimensional space and, given the projections in (19), is none other than the Spin(7) invariant self-dual four form which lives on this eight-dimensional space (see for example [1]). One can work out the calibrating form for all the spaces considered in this paper in a similar manner.

Finally, it is interesting to compare the worldsurface supersymmetry of the spinor with that of κ-supersymmetry. In fact κ-supersymmetry is just a consequence of worldvolume supersymmetry, which is found by taking [22] $v^\beta = \kappa^{\underline\gamma}E_{\underline\gamma}{}^\beta$. Making this replacement in Eq. (28) and using the projector of Eq. (30) we find the standard result for the κ transformation

$$\delta\Theta^{\underline\alpha} = \frac12\kappa^{\underline\gamma}(1+\Gamma)_{\underline\gamma}{}^{\underline\alpha} + \epsilon^{\underline\alpha}. \tag{49}$$

In addition setting $\Theta^\alpha = 0$ in static gauge requires $\frac12\kappa^{\underline\gamma}(1+\Gamma)_{\underline\gamma}{}^\alpha + \epsilon^\alpha = 0$, and following the same argument as before we find the variation of the remaining spinor is given by

$$\delta\Theta^{\beta'}(1-\Gamma)_{\beta'}{}^{\alpha'} = \frac12\kappa^{\underline\gamma}(1+\Gamma)_{\underline\gamma}{}^{\beta}(1+\Gamma)_\beta{}^{\alpha'} = -\epsilon^\beta(1+\Gamma)_\beta{}^{\alpha'}, \tag{50}$$

again setting $\epsilon^{\alpha'} = 0$, which is the same as (32). Thus one can find the conditions for supersymmetry preservation by studying either worldvolume or κ-supersymmetry. Given that the origin of κ-supersymmetry is worldsurface supersymmetry this is to be expected.

4. Geometry and Calibrations

In section two above we wrote down static intersecting brane configurations which preserve some fraction of spacetime supersymmetry. Let us now examine these configurations from the point of view of the worldvolume of the first fivebrane. In particular we shall further manipulate the full non-linear supersymmetry conditions on the worldvolume theory (42) using the projection operators associated with each of the configurations in section two. We will obtain differential equations for the coordinates of all the manifolds constructed above which correspond precisely to the necessary and sufficient conditions of Harvey and Lawson for these to be calibrated submanifolds. We will see that all of these configurations correspond to the standard Kähler, Special Lagrangian and exceptional calibrations of the mathematical literature. As calibrated submanifolds they all have minimal area in their homology class [16]. Thus they all solve the field equations of the fivebrane with the three form set to zero.

4.1. Kähler submanifolds. Let us consider the case of an $n$ complex dimensional manifold embedded in $C^m \cong R^{2m}$ with $m > n$. It is helpful to introduce the complex coordinates

$$z^\mu = x^{2\mu-1} + ix^{2\mu},\quad \mu = 1, 2, 3, \ldots, n, \qquad Z^\alpha = X^{2\alpha+4} + iX^{2\alpha+5},\quad \alpha = 1, 2, 3, \ldots, m-n, \tag{51}$$

and their complex conjugates $z^{\bar\mu}$ and $Z^{\bar\alpha}$. Let us denote the corresponding $\gamma$-matrices by $\gamma^\mu$ and $\gamma'^\alpha = \frac12\gamma'_{\bar\alpha}$. Here and in the rest of this paper we denote the transverse $\gamma$-matrices with primes to distinguish them from the worldvolume $\gamma$-matrices. These furnish commuting representations of the $2n$-dimensional and $2m$-dimensional Clifford algebras respectively;

$$\{\gamma^\mu, \gamma^\nu\} = \{\gamma^{\bar\mu}, \gamma^{\bar\nu}\} = 0, \qquad \{\gamma^\mu, \gamma^{\bar\nu}\} = 2\delta^{\mu\bar\nu},$$

$$\{\gamma'_\alpha, \gamma'_\beta\} = \{\gamma'_{\bar\alpha}, \gamma'_{\bar\beta}\} = 0, \qquad \{\gamma'_\alpha, \gamma'_{\bar\beta}\} = 2\delta_{\alpha\bar\beta}. \tag{52}$$

We then consider the projections

$$\epsilon\,\gamma^\mu\gamma'_\alpha = 0. \tag{53}$$

One can easily check that these form a commuting set of $n(m-n)$ projectors, although they are not always independent. Indeed for $(n, m) = (1, 2), (1, 3), (2, 3)$ one finds the configurations (2), (4), (6) which preserve 1/2, 1/4, 1/4 of worldsheet supersymmetry respectively. The only other case occurring on the fivebrane (i.e. with $n \leq 2$ and $m - n \leq 2$) is the configuration (14) where $(n, m) = (2, 4)$, and this preserves 1/8 of worldsheet supersymmetry (i.e. only three of the four projections are independent). We now consider the linear term in (42)

$$0 = \epsilon\left[\gamma^\mu\partial_\mu Z^\alpha\gamma'_\alpha + \gamma^{\bar\mu}\partial_{\bar\mu}Z^\alpha\gamma'_\alpha + \text{c.c.}\right]. \tag{54}$$


Clearly the first term is zero as a result of the projections and the equation is satisfied if and only if the scalars are holomorphic functions; ∂µ¯ Z α = 0. For all the above cases with the exception of n = 2, m = 4, the higher order terms vanish automatically. Thus the only supersymmetric configurations correspond to holomorphic embeddings. For the n = 2, m = 4 case one finds a non-trivial third order term coming from (42). Vanishing of the full non-linear supersymmetry then yields the equation h n 3 ¯ ∂µ Z γ ∂ ν Zγ δβα¯¯ − ∂µ Zβ¯ ∂ ν Z α¯ 0 = γ µ γα¯0 ∂ν Z β δβα¯¯ δµν − 2 −δµν ∂ ρ¯ Z γ ∂ρ¯ Zγ δβα¯¯ + δµν ∂ ρ¯ Zβ¯ ∂ρ¯ Z α¯

i

o + c.c. . (55)

Clearly $\partial_\mu Z^{\bar\alpha} = 0$ is a solution; however, we have not checked that it is the only solution. Note that the corresponding complex submanifolds are calibrated by powers of the Kähler form $\omega$, $\frac{1}{n!}\omega^n$ [16].

4.2. Special Lagrangian submanifolds. Here we consider the case of an $n$-dimensional manifold embedded into $R^{2n} \equiv C^n$. Let $i = 1, 2, 3, \ldots, n$ and introduce the notation

$$X_i = X^{i+5}, \qquad \gamma'^i = \gamma'^{\,i+5}, \tag{56}$$

and again the two Clifford algebras $\gamma^i$ and $\gamma'^i$ commute. We now consider the projections

$$\epsilon\,\gamma^1\gamma^i\gamma'^1\gamma'^i = \epsilon, \tag{57}$$

where there is no sum over $i$. These projections in turn imply that

$$\epsilon\,\gamma^i\gamma'^j = -\epsilon\,\gamma^j\gamma'^i, \qquad i \neq j. \tag{58}$$

It is easy to see that these form a set of $n - 1$ independent commuting projectors which correspond to the preservation of $2^{-(n-1)}$ of the worldvolume supersymmetry. Clearly the $n = 1$ case is trivial and the $n = 2$ case corresponds to the $n = 1$, $m = 2$ complex case above. Let us now consider the supersymmetry condition. First take $n = 3$, corresponding to the configuration (8) preserving 1/4 of worldvolume supersymmetry. A little algebra shows that (42) may be written as

$$0 = \epsilon\left[\sum_{i<j}\gamma^i\gamma'^j(\partial_iX_j - \partial_jX_i) + \gamma^1\gamma'^1\Big(\sum_i\partial_iX_i - \det(\partial X)\Big)\right]. \tag{59}$$

Therefore we find from the first term that

$$\partial_iX_j = \partial_jX_i, \tag{60}$$

and so we take $X_j = \partial_jF$ for some $F$. The second term then gives

$$\partial^2F = \det(\mathrm{Hess}F), \tag{61}$$

where $(\mathrm{Hess}F)_{ij} = \partial_i\partial_jF$.


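Now consider $n = 4$, describing the configuration (16) preserving 1/8 of worldvolume supersymmetry. Here we find

$$0 = \epsilon\left[\sum_{i<j}\gamma^i\gamma'^j\Big[\partial_iX_j - \partial_jX_i - 3\,\partial_{[k}X^k\partial_lX^l\partial_{i]}X_j + 3\,\partial_{[k}X^k\partial_lX^l\partial_{j]}X_i\Big] + \gamma^1\gamma'^1\sum_i\Big(\partial_iX_i - \det\nolimits_{i|i}(\partial X)\Big)\right], \tag{62}$$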

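where $\det_{i|j}(\partial X)$ is the determinant of the matrix found by deleting the $i$-th row and $j$-th column of the matrix $\partial X$. The simple condition $\partial_iX_j - \partial_jX_i = 0$ has now become non-linear. Some work shows that it can be written as

$$0 = (\partial_mX_n - \partial_nX_m)\Big(\delta_i^m\delta_j^n - \delta_i^n\delta_j^m - \frac12\delta_i^m\delta_j^n\big[(\partial\cdot X)^2 - \partial_lX_k\partial^lX^k\big] + \partial^mX_j\partial^nX_i - (\partial\cdot X)\big[\delta_j^m\partial^nX_i + \delta_i^n\partial^mX_j\big] - \delta_i^m\partial^nX_k\partial^kX_j - \delta_j^n\partial^mX_k\partial^kX_i\Big). \tag{63}$$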

From this one readily sees that $\partial_jX_i - \partial_iX_j = 0$ is still a solution (although we have not checked that it is the only solution). Again write $X_j = \partial_jF$ so that the first line in (62) vanishes. The second line then yields the equation

$$\partial^2F = \sum_i\det\nolimits_{i|i}(\mathrm{Hess}F). \tag{64}$$

Finally we consider $n = 5$. This describes the configuration (20) preserving 1/16 of the worldvolume supersymmetry. Here we find

$$0 = \epsilon\left[\sum_{i<j}\gamma^i\gamma'^j\Big[\partial_iX_j - \partial_jX_i - 3\,\partial_{[k}X^k\partial_lX^l\partial_{i]}X_j + 3\,\partial_{[k}X^k\partial_lX^l\partial_{j]}X_i\Big] + \gamma^1\gamma'^1\Big(\sum_i\partial_iX_i - \sum_{i\neq j}\det\nolimits_{ij|ij}(\partial X) + \det(\partial X)\Big)\right], \tag{65}$$

where $\det_{ij|kl}(\partial X)$ is the determinant of the matrix found by deleting the $i$-th and $j$-th rows and $k$-th and $l$-th columns of the matrix $\partial X$. Again Eq. (63) appears and so we write $X_i = \partial_iF$ and we arrive at the equation

$$\partial^2F = \sum_{i\neq j}\det\nolimits_{ij|ij}(\mathrm{Hess}F) - \det(\mathrm{Hess}F). \tag{66}$$

Equations (61), (64) and (66) above are precisely the necessary and sufficient conditions derived by Harvey and Lawson [16] for the embedded manifold in $C^n$ to be Special Lagrangian. By definition such manifolds are calibrated by the form $\mathrm{Re}(dz^1\wedge\cdots\wedge dz^n)$, where the $z^\mu$ are complex coordinates of $C^n$.
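The Special Lagrangian equations can be checked on simple examples. The sketch below evaluates the residuals of Eqs. (61) and (64), i.e. $\partial^2F - \det(\mathrm{Hess}F)$ for $n = 3$ and $\partial^2F - \sum_i\det_{i|i}(\mathrm{Hess}F)$ for $n = 4$, for a given constant Hessian. The function names and the sample potentials (quadratic $F$, and $F = x_1x_2$) are assumptions made for illustration and do not come from the text.

```python
def det(m):
    """Determinant by cofactor expansion (adequate for the small matrices here)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * m[0][col] * det(minor)
    return total


def trace(m):
    """Trace of the Hessian, i.e. the Laplacian of F."""
    return sum(m[k][k] for k in range(len(m)))


def principal_minor(m, dropped):
    """Determinant after deleting row and column `dropped` (det_{i|i})."""
    keep = [k for k in range(len(m)) if k != dropped]
    return det([[m[r][c] for c in keep] for r in keep])


def residual_n3(hess):
    """Left minus right hand side of Eq. (61) for a 3x3 Hessian."""
    return trace(hess) - det(hess)


def residual_n4(hess):
    """Left minus right hand side of Eq. (64) for a 4x4 Hessian."""
    return trace(hess) - sum(principal_minor(hess, i) for i in range(4))


def diag(values):
    """Diagonal matrix, the Hessian of F = (1/2) sum_k values[k] * x_k**2."""
    n = len(values)
    return [[values[r] if r == c else 0 for c in range(n)] for r in range(n)]


# Hessian of the hypothetical potential F = x_1 x_2 in three dimensions.
HESS_X1X2 = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
```

For instance, $F = \frac12(x_1^2 + 2x_2^2 + 3x_3^2)$ has Hessian diag(1, 2, 3), whose trace and determinant both equal 6, so Eq. (61) holds, whereas diag(1, 1, 1) leaves a non-zero residual. In four dimensions diag(1, −1, 2, −2) satisfies Eq. (64), since both the trace and the sum of the principal minors vanish.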


4.3. Exceptional submanifolds. We are now left with only a few of the configurations in section two to analyse. As we will see, these cases correspond to the exceptional calibrated submanifolds discussed in [16]. For these cases it will be convenient to work with an explicit representation of gamma matrices (36) using quaternions. Specifically we choose

$$\gamma_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},\quad \gamma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},\quad \gamma_2 = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix},\quad \gamma_3 = \begin{pmatrix} 0 & j \\ -j & 0 \end{pmatrix},\quad \gamma_4 = \begin{pmatrix} 0 & k \\ -k & 0 \end{pmatrix},\quad \gamma_5 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, \tag{67}$$

and

$$\gamma'_6 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},\quad \gamma'_7 = \begin{pmatrix} 0 & i' \\ -i' & 0 \end{pmatrix},\quad \gamma'_8 = \begin{pmatrix} 0 & j' \\ -j' & 0 \end{pmatrix},\quad \gamma'_9 = \begin{pmatrix} 0 & k' \\ -k' & 0 \end{pmatrix},\quad \gamma'_{10} = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, \tag{68}$$

where $(i, j, k)$ and $(i', j', k')$ are two commuting sets of quaternions that can be realised as Pauli matrices.

4.3.1. Cayley submanifolds. As before, the aim is now to reinterpret the spacetime configuration (18) as a supersymmetric configuration on the first fivebrane. For this case four transverse scalars are excited and they should be functions of four coordinates on the fivebrane, i.e., the configurations should correspond to a four surface in eight dimensions. We will now show that the conditions for preserved supersymmetry after imposing the projections lead to the Cayley differential equation in [16] corresponding to Cayley submanifolds, i.e. submanifolds that are calibrated by the Spin(7) invariant self-dual four-form $\Phi$. Before we present the derivation, we first note that the projections (19) can be rewritten in the elegant form

$$\frac34\,\epsilon\Big(\Gamma_{ij} + \frac16\Phi_{ijkl}\Gamma^{kl}\Big) = 0, \tag{69}$$

where $i = \{1, 2, 3, 4, 6, 7, 8, 9\}$ and we have taken the only non-zero components of $\Phi$ to be

$$+1 = \Phi_{1234} = \Phi_{6789} = \Phi_{3489} = \Phi_{2479} = \Phi_{2378} = \Phi_{1379} = \Phi_{1267} = \Phi_{1368} = \Phi_{1469} = \Phi_{2468},$$

$$-1 = \Phi_{1289} = \Phi_{1478} = \Phi_{3467} = \Phi_{2369}, \tag{70}$$

which are the same as those in [16] after the redefinition 6789 → 5678. Thinking of $\epsilon$ as an SO(8) spinor, (69) says that under the decomposition SO(8) → Spin(7), it is in fact a Spin(7) singlet. To see this note that the adjoint of SO(8) decomposes as 28 → 21 + 7, where 21 is the adjoint of Spin(7), and that the matrix that appears in (69) is precisely the operator that projects onto the 21 (see, for example [2,1]). We thus conclude that the projection operators (19) that we obtained from considerations of orthogonally intersecting branes are equivalent to the more abstract statement that we are working with a spinor that is a Spin(7) singlet. One implication of this observation is that we expect the same projections to appear for just two fivebranes rotated by a Spin(7) rotation [24].

Let us now begin the derivation of the Cayley equation. We first rewrite the projections using the explicit basis (67), (68). We conclude the following:

1 0 0 0

0 0 0 i

0 0 0 j

0 0 0 k

= 0,

1 0 0 0

0

0 0 = − 0 i0

0 0 0 j0

= −

0 0 = − 0 k0

0

0 0

= 0,

,

,

,

(71)

which allows one to trade Spin(5, 1) matrices for Spin(5) matrices when acting on the spinor $\epsilon$. The signs are necessary and essentially arise from the fact that the Spin(5, 1) $\gamma$-matrices commute with the Spin(5) matrices. We look for configurations with $\partial_0 = \partial_5 = 0$ and all transverse scalars excited except $X^{10}$. It is convenient to introduce the quaternion valued fields and derivatives

$$X' = X^6 + i'X^7 + j'X^8 + k'X^9, \qquad \partial = \partial_1 + i\partial_2 + j\partial_3 + k\partial_4, \qquad \bar\partial = \partial_1 - i\partial_2 - j\partial_3 - k\partial_4. \tag{72}$$

We first consider the terms in the supersymmetry variation that are linear in X: 0 ← − X b γb0 ∂ a γ a =

0 X0 0 X 0

0 ∂ , ∂ 0

0

0 0 1 0 1 , 1 0 1 0

0

0 0 1 0 1 , 1 0 1 0

0 0 0 0 X

=

0 0 = 0 X∂

0 0 0 ∂

(73)

where $X = X^6 + iX^7 + jX^8 + kX^9$, $X\bar\partial \equiv \partial_1X - \partial_2Xi - \partial_3Xj - \partial_4Xk$ and we have used (71).


Next we turn to the terms in the supersymmetry variation that are cubic in X. By performing similar steps we obtain 1 0 0 0 ∂a X b1 ∂a2 Xb2 ∂a3 Xb3 (γ a1 a2 a3 )(γb10 b20 b30 ) 3! 1 0 0 1 0 0 0 1 γ a1 a2 a3 , = − 0 0 1 0 3! 0 ∂a1 X ∂a2 X 0 ∂a3 X 0 1 0 0 0 1 γ a1 a2 a3 , = + 1 0 3! 0 ∂[a1 X∂a2 X∂a3 ] X 0 1 0 1 0 γ a1 a2 a3 , = + 1 0 3! 0 ∂a1 X × ∂a2 X × ∂a3 X −

(74)

where we have introduced the triple cross product of quaternions defined by

$$x\times y\times z = \frac12(x\bar yz - z\bar yx), \tag{75}$$

and we have used the fact that it is alternating. Next we let the indices $a_1a_2a_3$ run over the values 1, 2, 3, 4 and substitute the explicit form of the $\gamma$-matrices using (67), (68). Combining with the terms linear in $X$ we conclude that the condition for preserved supersymmetry is encapsulated by the differential equation

$$\partial_1X - \partial_2Xi - \partial_3Xj - \partial_4Xk = \partial_2X\times\partial_3X\times\partial_4X + \partial_1X\times\partial_3X\times\partial_4X\,i - \partial_1X\times\partial_2X\times\partial_4X\,j + \partial_1X\times\partial_2X\times\partial_3X\,k. \tag{76}$$

This is the Cayley equation derived in [16] for submanifolds that are calibrated by the Cayley calibration.

4.3.2. Associative submanifolds. Next consider the configuration (10) preserving 1/16 of the spacetime supersymmetry. For this case four transverse scalars are excited and they should be functions of three coordinates on the fivebrane, i.e. the configurations should correspond to a three surface in seven dimensions. We will now show that the conditions for preserved supersymmetry after imposing the projections (11) lead to the associator equation in [16] for associative submanifolds, i.e. submanifolds calibrated by the $G_2$ invariant three form $\varphi$. The non-zero components of $\varphi$ can be taken to be

$$+1 = \varphi_{234} = \varphi_{267} = \varphi_{469} = \varphi_{379} = \varphi_{368}, \qquad -1 = \varphi_{289} = \varphi_{478}. \tag{77}$$
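The alternating property of the triple cross product of Eq. (75), which is used in arriving at the Cayley equation (76), can be checked numerically. The following sketch represents quaternions as 4-tuples (real, $i$, $j$, $k$); the helper names `qmul`, `qconj`, `triple` and `neg`, and the sample quaternions, are introduced here for illustration only.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (real, i, j, k)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def qconj(q):
    """Quaternionic conjugate."""
    a, b, c, d = q
    return (a, -b, -c, -d)


def triple(x, y, z):
    """Triple cross product (x conj(y) z - z conj(y) x) / 2, as in Eq. (75)."""
    first = qmul(qmul(x, qconj(y)), z)
    second = qmul(qmul(z, qconj(y)), x)
    return tuple((s - t) / 2 for s, t in zip(first, second))


def neg(q):
    """Componentwise negative of a quaternion."""
    return tuple(-c for c in q)


ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

# Arbitrary integer quaternions, so that all arithmetic is exact.
x, y, z = (1, 2, 3, 4), (0, -1, 5, 2), (3, 0, -2, 1)
```

Swapping any two arguments of `triple` flips the overall sign, and the product vanishes when two arguments coincide. The underlying reason is that $x\bar y + y\bar x$ is a real number, which commutes with every quaternion.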

As in the Cayley case, we first note that our projections (11) can be recast in the form

$$\frac23\,\epsilon\Big(\Gamma_{ij} + \frac14\psi_{ijkl}\Gamma^{kl}\Big) = 0, \tag{78}$$

where the four-form $\psi$ is the Hodge-dual of $\varphi$ in the directions $i = \{2, 3, 4, 6, 7, 8, 9\}$. Specifically the non-zero components of $\psi$ are

$$+1 = \psi_{6789} = \psi_{3489} = \psi_{2479} = \psi_{2378} = \psi_{2468}, \qquad -1 = \psi_{3467} = \psi_{2369}, \tag{79}$$


which are simply the components of in (70) without a 1 component. Thinking of as a Spin(7) spinor, (78) says that it is actually a G2 singlet under the decomposition Spin(7) → G2 . This is because the adjoint of Spin(7) decomposes as 21 → 14 + 7 and the matrix appearing in (78) projects onto the 14, the adjoint of G2 . For this case, after we rewrite the projections (11) using our explicit basis (67),(68), we conclude the following: 0 1 0 = 0, 0 0 0 0 0 0 , = − 0 i 0 i0 0 j 0 0 0 , = − 0 j 0 j0 0 k 0 0 0 . (80) = − 0 k 0 k0 Now we turn to the supersymmetry variation. We now define X0 = X6 + i 0 X 7 + + k 0 X 9 and ∂ = +i∂2 + j ∂3 + k∂4 = −∂. The terms linear in X can now be processed as follows: 0 0 0 0 0 −1 0 1 ∂ 0 b0 0 a , X γb ∂a γ = 0 1 0 1 0 0 ∂ 0 X 0 0 0 0 −1 0 1 . (81) = 0 1 0 1 0 0 ∂ 0X

j 0 X8

Similarly, the cubic terms can be rewritten 0 0 0 0 0 1 γ 234 − 0 0 1 0 0 ∂[2 X ∂3 X 0 ∂4] X 0 0 0 0 0 1 0 −1 = . 0 0 0 1 0 1 0 0 ∂2 X × ∂3 X × ∂4 X

(82)

After taking the hermitian conjugate the condition for unbroken supersymmetry is given by the differential equation (after dropping the primes)

$$-\partial_2Xi - \partial_3Xj - \partial_4Xk = \partial_2X\times\partial_3X\times\partial_4X, \tag{83}$$

which is the associator equation that appears in [16]. Recently solutions to this equation have been studied in relation to domain walls in MQCD [28].

4.3.3. Coassociative submanifolds. Next consider the configuration (12) preserving 1/16 of the spacetime supersymmetry. For this case three transverse scalars are excited and they should be functions of four coordinates on the fivebrane, i.e. the configurations should correspond to a four surface in seven dimensions. We will now show that the conditions for preserved supersymmetry after imposing the projections (13) lead to the coassociator differential equation in [16] for coassociative submanifolds, i.e. manifolds calibrated by the $G_2$ invariant four-form $\psi$ that is Hodge dual to the three-form $\varphi$ in the associative case. For this case the projectors can be recast in the form (78) where the


components of ψ are now given by (79) after the relabelling {2346789} → {7891234}. Thus, the projections for this case also imply that the spinor is a G2 singlet. To obtain the corresponding differential equation we begin by rewriting the projections using the explicit basis (67),(68), to conclude: 1 0 = 0, 0 0 0 0 0 0 i 0 = − , 0 i 0 i0 0 0 0 0 j 0 = − , 0 j 0 j0 0 0 0 0 k 0 = − . (84) 0 k 0 k0 We now have X = i 0 X 7 + j 0 X 8 + k 0 X9 = −X and ∂ = ∂1 + ∂2 i + ∂3 j + ∂4 k. The terms in the supersymmetry variation that are linear in X can be reexpressed 0 0 0 0 0 0 1 X 0 0 1 b0 0 a , X γb ∂a γ = −1 0 1 0 0 X0 0 ∂ 0 0 0 0 1 0 1 . (85) = −1 0 1 0 0 −X∂ Similarly, the cubic terms can be rewritten 0 0 0 1 γ − 1 0 789 0 ∂X[7 ∂X 8 ∂X9] 0 0 0 0 1 0 1 = − . 1 0 −1 0 0 ∂X7 × ∂X8 × ∂X9

(86)

After taking the hermitian conjugate the condition for unbroken supersymmetry is then given by the differential equation

$$-\partial X^7i - \partial X^8j - \partial X^9k = \partial X^7\times\partial X^8\times\partial X^9, \tag{87}$$

which is the coassociator equation that appears in [16].

5. Conclusion

In this paper we have analysed the conditions necessary for the fivebrane worldvolume theory to preserve some supersymmetry, when the self-dual three form is set to zero. Our approach was to first consider spacetime configurations of orthogonally intersecting fivebranes in order to derive a set of projection operators acting on the worldvolume supersymmetry parameters. By manipulating the supersymmetry variation using these projections we derived a set of differential equations for the transverse scalar fields of the worldvolume. These Bogomol'nyi equations are none other than the equations resulting from calibrated geometries [16]. It would be interesting to find explicit solutions to these equations, i.e. calibrated submanifolds, for simple cases such as the orthogonal brane configurations of section two, or branes in flat space rotated by elements of groups associated with special holonomy [7,10,24] (since we might expect that these configurations are associated with the same projection operators as for the orthogonal configurations that we explicitly considered in section two).

It will be very interesting to extend the analysis of this paper to include membranes by allowing for a non-zero self-dual three form. We expect that the resulting differential equations will be associated with a generalised notion of calibrated geometries [13]. We pointed out in section two that some configurations allow for pp-waves and membranes to be introduced without breaking any additional supersymmetry. One may expect that this has a simple interpretation in the resulting generalised calibrations.

In this paper we have studied fivebranes in a flat target space. It is well known that calibrated geometries can be defined in manifolds with special holonomy. For example, in eight dimensions it is well known that in a curved manifold with reduced Spin(7) holonomy the self-dual four form $\Phi$ is globally defined, and this allows one to have submanifolds calibrated by $\Phi$. With a flat target space we saw that the preserved supersymmetries for the calibrated submanifolds are Spin(7) invariant spinors. This also has a natural generalisation since manifolds with Spin(7) holonomy contain parallel spinors. In a similar manner the associative and coassociative cases will generalise to seven dimensional manifolds with $G_2$ holonomy, while the Kähler and special Lagrangian cases can be generalised to manifolds with SU(n) holonomy. It would be interesting to generalise our analysis to find the analogues of the differential equations of [16] in a curved manifold of special holonomy. We leave this to future work, but we would like to mention how some of the analysis of section three could be generalised to a curved target space. Examining Eq. (22) we find that now

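$$u_a{}^{\underline b} = e_a{}^m\,\partial_mX^{\underline N}E_{\underline N}{}^{\underline b}, \tag{88}$$

where $e_a{}^n g_{nm}e_d{}^m = \eta_{ad}$ and $g_{nm} = \partial_nX^{\underline N}E_{\underline N}{}^{\underline b}\,\eta_{\underline b\underline d}\,\partial_mX^{\underline R}E_{\underline R}{}^{\underline d}$. The equation for the matrix $\Gamma$ is still given by

$$-\frac{1}{6!}\,\epsilon^{a_1a_2a_3a_4a_5a_6}\,u_{a_1}{}^{\underline b_1}u_{a_2}{}^{\underline b_2}u_{a_3}{}^{\underline b_3}u_{a_4}{}^{\underline b_4}u_{a_5}{}^{\underline b_5}u_{a_6}{}^{\underline b_6}\,\epsilon^\alpha(\Gamma_{\underline b_1\underline b_2\underline b_3\underline b_4\underline b_5\underline b_6})_\alpha{}^{\gamma'}. \tag{89}$$

But if we define $v_c{}^{\underline b}$ by $u_a{}^{\underline b} = e_a{}^m(f^{-1})^c{}_m\,v_c{}^{\underline b}$, where $(f^{-1})^c{}_m$ is the matrix $(f^{-1})^c{}_m = e_m{}^b u_b{}^c$, then we may write $\Gamma$ as

$$-\frac{1}{6!}\left(\det(ef)^{-1}\right)\epsilon^{a_1a_2a_3a_4a_5a_6}\,v_{a_1}{}^{\underline b_1}v_{a_2}{}^{\underline b_2}v_{a_3}{}^{\underline b_3}v_{a_4}{}^{\underline b_4}v_{a_5}{}^{\underline b_5}v_{a_6}{}^{\underline b_6}\,\epsilon^\alpha(\Gamma_{\underline b_1\underline b_2\underline b_3\underline b_4\underline b_5\underline b_6})_\alpha{}^{\gamma'}. \tag{90}$$

The net effect of these changes is just to make the replacement

$$\delta_c{}^n\,\partial_nX^{b'} \to (f^{-1})_c{}^m\big(E_m{}^{b'} + \partial_mX^{n'}E_{n'}{}^{b'}\big), \tag{91}$$

in all formulae for the supervariation of the spinor.

Acknowledgements. We would like to thank B. Acharya, F. Dowker, J. Figueroa O'Farrill, G. Gibbons and G. Papadopoulos for useful discussions. JPG is supported in part by EPSRC. While this paper was being prepared we learnt of the work [15] which has some overlap with this work.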


References

1. Acharya, B.S., Figueroa-O'Farrill, J.M., O'Loughlin, M. and Spence, B.: Nucl. Phys. B514, 583 (1998), hep-th/9707118
2. Acharya, B.S., O'Loughlin, M. and Spence, B.: Nucl. Phys. B503, 657 (1997)
3. Becker, K., Becker, M., Morrison, D., Ooguri, H., Oz, Y., and Yin, Z.: Nucl. Phys. B480, 225 (1996), hep-th/9608116
4. Becker, K., Becker, M. and Strominger, A.: Nucl. Phys. B456, 130 (1995)
5. Bergshoeff, E., de Roo, M., Eyras, E., Janssen, B. and van der Schaar, J.P.: Nucl. Phys. B494, 119 (1997), hep-th/9612095
6. Bergshoeff, E., Gomis, J. and Townsend, P.K.: Phys. Lett. B421, 109 (1998), hep-th/9711043
7. Berkooz, M., Douglas, M.R. and Leigh, R.G.: Nucl. Phys. B480, 265 (1996), hep-th/9606139
8. Callan, C.G., Jr., and Maldacena, J.M.: Nucl. Phys. B513, 198 (1998), hep-th/9708147
9. Gauntlett, J.P.: Intersecting Branes. hep-th/9705011
10. Gauntlett, J.P., Gibbons, G.W., Papadopoulos, G. and Townsend, P.K.: Nucl. Phys. B500, 133, hep-th/9702202
11. Gauntlett, J.P., Gomis, J. and Townsend, P.K.: J. High Energy Phys. 01, 003 (1998), hep-th/9711205
12. Gauntlett, J.P., Kastor, D.A. and Traschen, J.: Nucl. Phys. B478, 544 (1996), hep-th/9604179
13. Gauntlett, J.P., Lambert, N.D. and West, P.C.: Work in progress
14. Gibbons, G.W.: Nucl. Phys. B514, 603 (1998), hep-th/9709027
15. Gibbons, G.W. and Papadopoulos, G.: Calibrations and Intersecting Branes. hep-th/9803163
16. Harvey, R. and Lawson, H.B., Jr.: Calibrated Geometries. Acta Math. 148, 47 (1982)
17. Henningson, M. and Yi, P.: Phys. Rev. D57, 1291 (1998), hep-th/9707251
18. Howe, P.S., Lambert, N.D. and West, P.C.: Nucl. Phys. B515, 203 (1998), hep-th/9709014
19. Howe, P.S., Lambert, N.D. and West, P.C.: Phys. Lett. B419, 79 (1998), hep-th/9710033
20. Howe, P.S., Lambert, N.D. and West, P.C.: Phys. Lett. B418, 85 (1998), hep-th/9710034
21. Howe, P.S. and Sezgin, E.: Phys. Lett. B394, 62 (1997), hep-th/9611008
22. Howe, P.S., Sezgin, E. and West, P.C.: Phys. Lett. B399, 49 (1997), hep-th/9702008
23. Klemm, A., Lerche, W., Vafa, C. and Warner, N.: Nucl. Phys. B477, 746 (1996), hep-th/9604034
24. Ohta, N. and Townsend, P.K.: Phys. Lett. B418, 77 (1998), hep-th/9710129
25. Papadopoulos, G. and Townsend, P.K.: Phys. Lett. B380, 273 (1996), hep-th/9603087
26. Townsend, P.K.: Four Lectures on M-Theory. hep-th/9612121; M Theory From Its Superalgebra. hep-th/9712004
27. Tseytlin, A.A.: Nucl. Phys. B475, 149 (1996), hep-th/9604035
28. Volovich, A.: Domain Wall in MQCD and Supersymmetric Cycles in Exceptional Holonomy Manifolds. hep-th/9710120; Domain Walls in MQCD and Monge-Ampere Equation. hep-th/9801166
29. Witten, E.: Nucl. Phys. B500, 3 (1997), hep-th/9703166

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 202, 593 – 619 (1999)

Communications in Mathematical Physics © Springer-Verlag 1999

Calibrations and Intersecting Branes

G. W. Gibbons, G. Papadopoulos

DAMTP, University of Cambridge, Silver Street, Cambridge CB3 9EW, United Kingdom

Received: 20 May 1998 / Accepted: 16 November 1998

Abstract: We investigate the solutions of Nambu–Goto-type actions associated with calibrations. We determine the supersymmetry preserved by these solutions using the contact set of the calibration and examine their bulk interpretation as intersecting branes. We show that the supersymmetry preserved by such solutions is closely related to the spinor singlets of the subgroup G of Spin(9, 1) or Spin(10, 1) that rotates the tangent spaces of the brane. We find that the supersymmetry projections of the worldvolume solutions are precisely those of the associated bulk configurations. We also investigate the supersymmetric solutions of a Born–Infeld action. We show that in some cases this problem again reduces to counting spinor singlets of a subgroup of Spin(9, 1) acting on the associated spinor representations. We also find new worldvolume solutions which preserve 1/8 of the supersymmetry of the bulk and give their bulk interpretation.

1. Introduction

Many of the insights into the relations amongst the superstring theories and M-theory have been found by investigating the soliton-like solutions of the ten- and eleven-dimensional supergravity theories. Most attention so far has been concentrated on supersymmetric solutions, i.e. those that preserve a proportion of the supersymmetry of the underlying theory. Some of the supersymmetric solutions also saturate a Bogomol’nyi bound and so they are BPS. It is remarkable that a large class of supersymmetric solutions of the supergravity theories can be constructed by superposing elementary soliton solutions which preserve 1/2 of the spacetime supersymmetry [1]. These elementary solutions are the various brane solutions of supergravity theories, the pp-wave and the KK-monopole. After such a superposition, the resulting solutions have the interpretation of intersecting branes or branes ending on other branes [2, 1, 3] and, whenever appropriate, in the background of a pp-wave or a KK-monopole.
The small fluctuations of the (elementary) p-branes of superstrings and M-theory are described by (p+1)-dimensional worldvolume actions of Dirac–Born–Infeld type. Recently, the soliton-like solutions of these worldvolume actions have been investigated [4–6]. It has been found that some of these so-called worldvolume solutions, viewed from the bulk (supergravity) perspective, also have the interpretation of intersecting branes or branes ending on other branes. One way to explain this correspondence between the supergravity solutions and those of the worldvolume actions is to consider the supergravity coupled to the worldvolume action of a D-p-brane. The latter is a (p+1)-dimensional Dirac–Born–Infeld (DBI) action. Such an action can be written schematically as

$$S = \frac{1}{g_s^2}\, S_{NS\otimes NS} + S_{R\otimes R} + \frac{1}{g_s}\, S_{DBI}, \tag{1.1}$$

where $g_s$ is the string coupling constant. The first two terms are the NS⊗NS and R⊗R parts of the supergravity action, respectively, while the last term is the Born–Infeld action for a D-brane propagating in a supergravity background. In the limit that the string coupling constant becomes very small, the coefficient $1/g_s^2$ of $S_{NS\otimes NS}$ diverges faster than the coefficient $1/g_s$ of $S_{DBI}$. In order to keep the action small, we set $S_{NS\otimes NS} = 0$, which can be achieved by taking the flat spacetime background and setting the field strength of the fundamental string equal to zero. This is the weak coupling limit, which can be described by string perturbation theory. Note that $S_{R\otimes R}$ remains in this limit; it can be eliminated, though, by choosing the D-brane field strengths to be zero. Next suppose that the string coupling becomes large. In this limit, using a similar reasoning, one can set $S_{DBI} = 0$, leading to the usual supergravity (bulk) description of the various branes. Although these limits are not consistent truncations, it is expected that the various BPS configurations survive in the various limits of the string coupling constant, which explains the presence of intersecting brane configurations as solutions of the Dirac–Born–Infeld action. Moreover, the worldvolume solitons of other branes are related to those of D-branes by T- and S-duality transformations. Therefore, the above correspondence between bulk configurations and worldvolume ones is valid for all branes.

The worldvolume actions of branes are described by matter, vector and tensor multiplets. For convenience we shall refer to the vector and tensor fields of the vector and tensor multiplets as Born–Infeld fields. The worldvolume actions of branes described by matter multiplets are of Nambu–Goto type. Such actions include those of the IIA string and the M-2-brane [7, 8]. The worldvolume actions of branes described by vector and tensor multiplets for vanishing Born–Infeld fields can be consistently truncated to Nambu–Goto ones.
Such actions include those of D-branes [9–11], the M-5-brane [12–14] and the NS-5-brane of IIB theory. Therefore solutions of the worldvolume theories of all branes which do not involve Born–Infeld fields are solutions of Nambu–Goto actions. The solutions of the Nambu–Goto action are minimal surfaces. A large class of minimal surfaces is that constructed using calibrations [15, 16]. Such solutions saturate a bound, so they are BPS, and they preserve a proportion of the spacetime supersymmetry. The supersymmetry of calibrations in Calabi–Yau manifolds and in manifolds of exceptional holonomy has been investigated in [17, 18]. The bulk interpretation of the calibrated worldvolume solutions is that of intersecting branes.

Another consistent truncation of the worldvolume actions of branes is to set all the matter fields of vector and tensor multiplets to be constant. The resulting action for vector multiplets is that of non-linear electrodynamics, i.e. the Born–Infeld action. The bulk interpretation of the solutions of worldvolume theories that involve only Born–Infeld fields is that of branes within branes [19] (see [20, 21] for the supergravity solutions). The associated bound states can be at or below threshold. Finally, there are worldvolume solutions that involve both Born–Infeld and matter fields. Such solutions have the bulk interpretation of branes ending on branes.


The presence of Born–Infeld fields is then required by the Gauss law. We remark that all Dirac–Born–Infeld actions relevant to branes, apart from that of the M-5-brane, can be constructed by dimensionally reducing the ten-dimensional Born–Infeld action to an appropriate dimension. Therefore all solutions of the Dirac–Born–Infeld actions can be thought of as solutions of the ten-dimensional Born–Infeld action. In fact one has the striking geometric property (see e.g. [5]) that any calibrated p-dimensional submanifold of $\mathbb{E}^n$ may be regarded as a solution of the Born–Infeld action for a U(1) gauge field in $\mathbb{E}^{n+p}$.

In this paper, we shall investigate the solutions of Nambu–Goto actions associated with calibrations. We shall first show that all these solutions are supersymmetric. Our proof is in three steps: (i) we shall use the Killing spinor equations [22], which are derived from the kappa-symmetry transformations of brane worldvolume actions, (ii) we shall exploit the fact that the collection of the tangent spaces of a calibration spans a subspace of the contact set, and (iii) we shall apply the notion of branes at angles [23, 24]. The main point of the proof is that the contact set of a calibration is a subspace of a homogeneous space, G/H, and that G leaves invariant one-dimensional subspaces (singlets) when acting on the spinor representations of Spin(9, 1) (or Spin(10, 1)) ⊃ G. The number of such singlets determines the proportion of the supersymmetry preserved. The groups G that arise in calibrations are some of those that appear in the context of special holonomies, i.e. G is SU(n), $G_2$ or Spin(7). Our approach provides an alternative way to define calibrations in terms of spinors, as opposed to their usual definition in terms of forms.

Then we shall systematically explore the bulk interpretation of all the calibrated solutions of Nambu–Goto actions. We shall find that in most cases there is a supergravity intersecting brane configuration associated with every worldvolume solution. Moreover we shall show that all the supersymmetry projections associated with a bulk configuration are precisely those that arise in the corresponding worldvolume solution. Next we shall investigate a certain class of singular solutions which may arise as limits of regular ones and shall examine the supersymmetry preserved by such solutions. Then we shall investigate the supersymmetry preserved by worldvolume solutions which involve only a non-vanishing Born–Infeld field. The term in the Killing spinor equation that involves the Born–Infeld field can be interpreted as a spinor rotation [22]. Using this, we shall find that the supersymmetry condition on the Born–Infeld field associated with a vector multiplet has a group theoretic interpretation.
In particular the Born–Infeld field can be thought of as generating an infinitesimal spinor rotation induced by a group G, where G is again a special holonomy group. Next we shall examine solutions of the Dirac–Born–Infeld action. Such solutions involve non-vanishing vector or tensor fields as well as scalars. Although the supersymmetry condition for such solutions can also be expressed in terms of spinor rotations, there is no straightforward interpretation of these conditions in terms of calibrations. We shall summarize the solutions found which preserve 1/4 of the supersymmetry and we shall present some new ones which preserve 1/8 of the supersymmetry.

This paper is organized as follows: In section two, we give the definition of calibrations and summarize some of their properties. In section three, we show that all calibrations preserve a proportion of the spacetime supersymmetry. In section four, we explain the correspondence between worldvolume solutions and those of the bulk. In sections five, six and seven, we give the correspondence between the worldvolume solutions associated with calibrations and those of the bulk. In addition, we present all the associated supersymmetry projections. In section eight, we give some of the singular worldvolume solutions. In section nine, we find the worldvolume solutions which have only Born–Infeld fields. In section ten, we investigate the worldvolume solutions which include both vectors and scalars and give a new solution which preserves 1/8 of the supersymmetry. Finally in section eleven, we present our conclusions.

2. Calibrations

2.1. The bound. To investigate the supersymmetry of certain Born–Infeld configurations and establish bounds for their energy, we shall need a few facts about calibrations [15, 16]. We shall consider calibrations in $\mathbb{E}^n$ equipped with the standard Euclidean inner product. Let $G(p, \mathbb{E}^n)$ be the grassmannian, i.e. the space of (oriented) p-dimensional subspaces of $\mathbb{E}^n$. Given a p-dimensional subspace ξ of $\mathbb{E}^n$, $\xi \in G(p, \mathbb{E}^n)$, we can always find an orthonormal basis $\{e_1, \dots, e_n\}$ in $\mathbb{E}^n$ such that $\{e_1, \dots, e_p\}$ is a basis in ξ. We denote the co-volume of ξ by

$$\vec{\xi} = e_1 \wedge \cdots \wedge e_p .$$

A p-form φ on an open subset U of $\mathbb{E}^n$ is a calibration of degree p if

(i) $$d\phi = 0, \tag{2.1}$$

(ii) for every point $x \in U$, the form $\phi_x$ satisfies $\phi_x(\vec{\xi}) \leq 1$ for all $\xi \in G(p, \mathbb{E}^n)$, and such that the “contact set”

$$G(\phi) = \{\xi \in G(p, \mathbb{E}^n) : \phi(\vec{\xi}) = 1\} \tag{2.2}$$

is not empty.

One of the applications of calibrations is that they provide a bound for the volume of p-dimensional submanifolds of $\mathbb{E}^n$. Let N be a p-dimensional submanifold of $\mathbb{E}^n$. At every point $x \in N$, one can find an oriented orthonormal basis in $\mathbb{E}^n$ such that $\{e_1, \dots, e_p\}$ is an oriented orthonormal basis of $T_x N$. The co-volume of N at x is $\vec{N} = e_1 \wedge \cdots \wedge e_p$ and the volume form of N at x is $\mu_N = \alpha_1 \wedge \cdots \wedge \alpha_p$, where $\{\alpha_1, \dots, \alpha_p\}$ is the dual basis to $\{e_1, \dots, e_p\}$. The fundamental theorem of calibrations states the following:

• Let φ be a calibration of degree p on $\mathbb{E}^n$. The p-dimensional submanifold N for which

$$\phi(\vec{N}) = 1 \tag{2.3}$$

is volume minimizing.

We shall refer to such minimal submanifolds as calibrated submanifolds, or a calibration, for short, of degree p. To prove the above statement, we choose an open subset U of N with boundary ∂U and assume that there is another submanifold W of $\mathbb{E}^n$ and an open set V of W with the same boundary, ∂U = ∂V. Since $\phi(\vec{N}) = 1$ on U and φ is closed, Stokes’ theorem gives

$$\mathrm{vol}(U) = \int_U \phi = \int_V \phi = \int_V \phi(\vec{V})\, \mu_V \leq \int_V \mu_V = \mathrm{vol}(V). \tag{2.4}$$

Calibrations give a large class of bounds for the volume of subspaces in $\mathbb{E}^n$. The “charge” density associated with the bound is given by the calibration form. If X is the map from the “worldvolume”, $\mathbb{E}^p$, into $\mathbb{E}^n$, the above bound can be re-expressed as

$$\int d^p u\, \sqrt{\det(g_{\mu\nu})} \geq \int X^* \phi, \tag{2.5}$$

where $g_{\mu\nu}$ is the induced metric on $\mathbb{E}^p$ with respect to the map X and $\{u^1, \dots, u^p\}$ are local coordinates of $\mathbb{E}^p$.

The tangent spaces of a p-dimensional submanifold N of $\mathbb{E}^n$, parallel transported to the origin of $\mathbb{E}^n$, span a subspace of $G(p, \mathbb{E}^n)$; this is the Gauss map. If moreover N saturates the bound associated with the calibration φ, then the image of the Gauss map is in G(φ). In many cases G(φ) = G/H, where G is a subgroup of SO(n). As we shall see, the group G arises naturally in the investigation of the supersymmetry of such configurations.

There are many examples of calibrations. Here we shall mention three examples which will prove useful in the study of supersymmetry and in the construction of actual solutions of the field equations of the Nambu–Goto action. There is a relation between calibrations and special holonomy groups. This is because special holonomy groups are characterized by the existence of certain invariant forms. These forms can be used as calibrations. Manifolds with special holonomy also admit covariantly constant spinors (see the table below). Such spinors are also invariant under the action of the holonomy groups; the invariant spinors and forms are related [25].

Table 1. Covariantly Constant Forms and Spinors. The first column contains the special holonomy groups, the second column contains the degrees (multiplicities) of covariantly constant forms (4+ denotes a self-dual four-form), the third column contains the dimension of the space of covariantly constant spinors and the last column contains the dimension of the special holonomy manifold.

Holonomy    Forms          Spinors    Dim
SU(n)       2(1), n(1)     2          2n
Spin(7)     4+(1)          1          8
G2          3(1), 4(1)     1          7
Sp(n)       2(3)           n+1        4n

The investigation of supersymmetric solutions of the Nambu–Goto action provides a further connection between calibrations and special holonomy groups. This connection makes use of the invariant spinors of special holonomy groups. In particular, the spinor representations¹ of Cliff(9, 1) or Cliff(10, 1) have singlets when decomposed under the special holonomy subgroups of Spin(9, 1) or Spin(10, 1). As we shall show, if a calibration, viewed as a solution of the Nambu–Goto action, is associated with a special holonomy group, then the invariant spinors serve as Killing spinors and some of the spacetime supersymmetry is preserved. The proportion of the supersymmetry preserved is related to the number of singlets in the decomposition of the spinor representations of Spin(9, 1) under the special holonomy subgroup. We shall remark further on this relationship between special holonomy, calibrations and supersymmetry in section eleven.

¹ Our metric convention is η = diag(1, . . . , 1, −1).

2.2. Kähler calibrations. We begin by introducing coordinates $\{x^i, y^i ; i = 1, \dots, n\}$ on $\mathbb{E}^{2n}$ and the metric

$$ds^2 = \sum_{i=1}^{n} \left( dx^i dx^i + dy^i dy^i \right). \tag{2.6}$$

Then we choose a complex structure on $\mathbb{E}^{2n}$ such that the associated Kähler form is

$$\omega = \frac{i}{2} \sum_i dz^i \wedge d\bar{z}^i, \tag{2.7}$$

where $z^i = x^i + i y^i$. Next we set

$$\phi = \frac{1}{p!}\, \omega^p. \tag{2.8}$$

The form φ is a calibration of degree 2p and the contact set G(φ) is the space of complex p-dimensional planes in $\mathbb{C}^n = \mathbb{E}^{2n}$, $G(\phi) = G_{\mathbb{C}}(p, \mathbb{C}^n)$, where we have identified $\mathbb{E}^{2n}$ with $\mathbb{C}^n$ using the above complex structure. To prove this, one uses Wirtinger’s inequality, which states that

$$\phi(\vec{\xi}) \leq 1 \tag{2.9}$$

for every $\xi \in G(2p, \mathbb{E}^{2n})$, with equality if and only if $\xi \in G_{\mathbb{C}}(p, \mathbb{C}^n)$ (for the proof see [15, 16]). A consequence of this is that all complex submanifolds of $\mathbb{C}^n$ are volume minimizing. The form ω, and therefore φ, is invariant under U(n). The contact set can be written as the coset space

$$G_{\mathbb{C}}(p, \mathbb{C}^n) = U(n)/\big(U(p) \times U(n-p)\big) = SU(n)/S\big(U(p) \times U(n-p)\big). \tag{2.10}$$

Thus SU(n) acts transitively on the space of p-dimensional complex planes in $\mathbb{C}^n$. As we shall see, this fact will be used to find the proportion of spacetime supersymmetry preserved by the worldvolume solutions associated with Kähler calibrations.
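As an illustration only (it is not part of the original argument), Wirtinger's inequality (2.9) and the bound (2.5) can be checked numerically in the simplest case n = 2, p = 1. The sketch below assumes the coordinate ordering (x^1, y^1, x^2, y^2), so that ω = Σ dx^i ∧ dy^i, and uses the surfaces w = z² and w = z̄² in C² as hypothetical test cases; all function names are ours.

```python
import numpy as np

def kahler_form(u, v):
    """Evaluate omega = sum_i dx^i ^ dy^i on two vectors of R^{2n}.

    Coordinates are assumed to be ordered (x^1, y^1, ..., x^n, y^n).
    """
    u = np.asarray(u, dtype=float).reshape(-1, 2)
    v = np.asarray(v, dtype=float).reshape(-1, 2)
    return float(np.sum(u[:, 0] * v[:, 1] - u[:, 1] * v[:, 0]))

def multiply_by_i(u):
    """Complex structure z^i -> i z^i, i.e. (x, y) -> (-y, x) in each pair."""
    u = np.asarray(u, dtype=float).reshape(-1, 2)
    return np.stack([-u[:, 1], u[:, 0]], axis=1).ravel()

def random_orthonormal_pair(rng, dim):
    """Orthonormal basis of a random 2-plane in R^dim (via a QR factorization)."""
    q, _ = np.linalg.qr(rng.standard_normal((dim, 2)))
    return q[:, 0], q[:, 1]

def graph_tangents(u1, u2, holomorphic=True):
    """Tangent vectors at z = u1 + i u2 of the surface w = z^2 (holomorphic)
    or w = conj(z)^2 (not holomorphic) in C^2 = R^4."""
    s = 1.0 if holomorphic else -1.0
    t1 = np.array([1.0, 0.0, 2.0 * u1, s * 2.0 * u2])
    t2 = np.array([0.0, 1.0, -2.0 * u2, s * 2.0 * u1])
    return t1, t2

def area_density(t1, t2):
    """sqrt(det g) for the induced metric g built from the two tangent vectors."""
    g = np.array([[t1 @ t1, t1 @ t2], [t2 @ t1, t2 @ t2]])
    return float(np.sqrt(np.linalg.det(g)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    values = [kahler_form(*random_orthonormal_pair(rng, 4)) for _ in range(1000)]
    print("largest |omega| on random 2-planes:", max(abs(v) for v in values))
    for flag in (True, False):
        t1, t2 = graph_tangents(0.3, -0.7, holomorphic=flag)
        print("holomorphic" if flag else "anti-holomorphic",
              "area density:", area_density(t1, t2),
              "pulled-back omega:", kahler_form(t1, t2))
```

On a complex line the form takes the value one, and for the holomorphic graph the area density coincides with the pull-back of ω, while for the non-holomorphic graph the inequality is strict away from the origin.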

2.3. Special Lagrangian calibrations. To describe the special Lagrangian calibrations we begin with the metric (2.6) and the Kähler form ω (2.7) as in the previous section. In addition, we introduce the (n, 0)-form

$$\psi = dz^1 \wedge \cdots \wedge dz^n. \tag{2.11}$$

The data, which include the metric, the Kähler form and the (n, 0)-form ψ, are invariant under SU(n). The calibration form² in this case is

$$\phi = \mathrm{Re}\, \psi. \tag{2.12}$$

The inequality necessary for φ to be a calibration has been demonstrated in [15, 16] and the planes that saturate the bound are called special Lagrangian. The contact set in this case is

$$G(\phi) = SU(n)/SO(n). \tag{2.13}$$

As in the case of Kähler calibrations, SU(n) acts transitively on the space of special Lagrangian planes.

² We can define special Lagrangian calibrations with a more general (n, 0)-form. However, for this paper this choice will suffice.
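A small numerical sketch, added here for illustration, evaluates φ = Re ψ on n-planes of E^{2n}. It assumes the coordinate ordering (x^1, y^1, ..., x^n, y^n) and uses the hypothetical family of planes spanned by e_j = cos θ_j ∂/∂x^j + sin θ_j ∂/∂y^j, for which ψ evaluates to exp(i Σ θ_j); the plane is calibrated exactly when the angles sum to zero modulo 2π, i.e. when the rotation diag(e^{iθ_j}) lies in SU(n).

```python
import numpy as np

def re_psi(frame):
    """Re(dz^1 ^ ... ^ dz^n) evaluated on the n rows of `frame`.

    Each row is a vector of R^{2n} with coordinates ordered
    (x^1, y^1, ..., x^n, y^n); the value is Re det[dz^i(e_j)].
    """
    frame = np.asarray(frame, dtype=float)
    z = frame[:, 0::2] + 1j * frame[:, 1::2]
    return float(np.real(np.linalg.det(z)))

def rotated_plane(thetas):
    """Orthonormal frame e_j = cos(theta_j) d/dx^j + sin(theta_j) d/dy^j."""
    n = len(thetas)
    frame = np.zeros((n, 2 * n))
    for j, theta in enumerate(thetas):
        frame[j, 2 * j] = np.cos(theta)
        frame[j, 2 * j + 1] = np.sin(theta)
    return frame

def random_plane(rng, n):
    """Orthonormal frame of a random n-plane in R^{2n}."""
    q, _ = np.linalg.qr(rng.standard_normal((2 * n, n)))
    return q.T

if __name__ == "__main__":
    print("SU(3) angles :", re_psi(rotated_plane([0.4, -0.1, -0.3])))
    print("U(3) angles  :", re_psi(rotated_plane([0.4, 0.2, 0.1])))
    rng = np.random.default_rng(0)
    print("random planes:", max(re_psi(random_plane(rng, 3)) for _ in range(1000)))
```

For angles that do not sum to zero the value drops to cos(Σ θ_j), and random planes never exceed one, in line with the calibration inequality.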


2.4. Exceptional cases. There are three exceptional cases.

(i) The calibration form is the 3-form, ϕ, in $\mathbb{E}^7$ invariant under the exceptional group $G_2$. This gives rise to a calibration of degree three in $\mathbb{E}^7$. The contact set is

$$G(\varphi) = G_2/SO(4), \tag{2.14}$$

which is a subset of $G(3, \mathbb{E}^7)$.

(ii) The calibration form is the dual χ of ϕ in $\mathbb{E}^7$, $\chi = \star\varphi$. This gives rise to a calibration of degree four in $\mathbb{E}^7$. The contact set is again

$$G(\chi) = G_2/SO(4), \tag{2.15}$$

which is now thought of as a subset of $G(4, \mathbb{E}^7)$.

(iii) The calibration form is the Spin(7)-invariant self-dual 4-form, Φ, in $\mathbb{E}^8$. This gives rise to a calibration of degree four in $\mathbb{E}^8$. The contact set is

$$G(\Phi) = Spin(7)/H, \tag{2.16}$$

which is a subset of $G(4, \mathbb{E}^8)$, where $H = SU(2) \times SU(2) \times SU(2)/\mathbb{Z}_2$. This calibration is called a Cayley calibration and the 4-planes that saturate the bound are called Cayley planes.

3. Supersymmetry and Calibrations

The dynamics of a large class of extended objects is described by actions of Dirac–Born–Infeld type. The fields include the embedding maps X of the extended object into spacetime, which we shall take to be ten-dimensional Minkowski, $\mathbb{E}^{(9,1)}$. Apart from the embedding maps, the action may depend on other worldvolume fields like a Born–Infeld 2-form field strength F (as in the case of D-branes) and the various fermionic partners θ, which are spacetime fermions. These actions are invariant under fermionic transformations which are commonly called kappa-symmetries. These act on θ as

$$\delta\theta = (1 + \tilde{\Gamma})\kappa, \tag{3.1}$$

where $\tilde{\Gamma} \in$ Cliff(9, 1) is a traceless hermitian product structure, i.e. $\tilde{\Gamma}^2 = 1$, and κ is the parameter, which is a spacetime fermion. The product structure $\tilde{\Gamma}$ depends on the embedding maps and the other worldvolume fields, i.e. $\tilde{\Gamma}$ depends on the worldvolume coordinates. It was shown in [22] that the Killing spinor equation in this context is the algebraic equation

$$(1 - \tilde{\Gamma})\epsilon = 0, \tag{3.2}$$

where ε is the spacetime supersymmetry parameter. Since we have chosen the spacetime geometry to be Minkowski, the spacetime (supergravity) Killing spinor equations imply that ε is constant³. The condition for supersymmetry (3.2) is universal and applies to all types of branes: fundamental strings, solitonic 5-branes, D-branes and M-branes. Supersymmetric (worldvolume) configurations are solutions of the Born–Infeld field equations which satisfy (3.2) for some non-vanishing ε. The proportion of the bulk supersymmetry preserved by such a configuration depends on the number of linearly independent solutions of (3.2) in terms of ε.

³ If we relax the condition that ε is constant, then the Killing spinor equation (3.2) always has solutions.
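The counting of linearly independent solutions of (3.2) can be made concrete with a short numerical sketch. It is an illustration under stated assumptions rather than the computation of the paper: we take eleven-dimensional gamma matrices with signature diag(−1, +1, ..., +1) built from Pauli matrices, we assume the standard planar membrane product structure Γ = Γ_0Γ_1Γ_2 (and its analogues along other directions), and we count complex solutions without imposing a Majorana condition, which does not change the fraction. All function names are ours.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def kron_all(factors):
    """Kronecker product of a list of matrices."""
    return reduce(np.kron, factors)

def gamma_matrices_10_1():
    """32x32 matrices Gamma_0..Gamma_10 with {Gamma_M, Gamma_N} = 2 eta_MN,
    eta = diag(-1, +1, ..., +1) (a convention assumed for this sketch)."""
    euclid = []
    for k in range(5):
        head, tail = [SZ] * k, [I2] * (4 - k)
        euclid.append(kron_all(head + [SX] + tail))
        euclid.append(kron_all(head + [SY] + tail))
    euclid.append(kron_all([SZ] * 5))      # eleventh anticommuting matrix
    return [1j * euclid[0]] + euclid[1:]   # direction 0 becomes timelike

def brane_gamma(gammas, directions):
    """Product of gamma matrices along the worldvolume directions of a planar brane."""
    return reduce(np.matmul, [gammas[d] for d in directions])

def preserved_fraction(product_structures):
    """Fraction of spinor components solving (1 - Gamma) eps = 0 for every Gamma."""
    dim = product_structures[0].shape[0]
    constraints = np.vstack([np.eye(dim) - g for g in product_structures])
    nullity = dim - np.linalg.matrix_rank(constraints)
    return nullity / dim

if __name__ == "__main__":
    g = gamma_matrices_10_1()
    one = [brane_gamma(g, [0, 1, 2])]
    two = one + [brane_gamma(g, [0, 3, 4])]
    three = two + [brane_gamma(g, [0, 5, 6])]
    for name, config in (("one", one), ("two", two), ("three", three)):
        print(name, "membrane(s):", preserved_fraction(config))
```

Each additional commuting, traceless projection halves the space of solutions, which is the mechanism behind the fractions 1/2, 1/4 and 1/8.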


The product structure $\tilde{\Gamma}$ for D-branes and the M-5-brane can be written [22] as

$$\tilde{\Gamma} = e^{-\frac{a}{2}}\, \Gamma\, e^{\frac{a}{2}}, \tag{3.3}$$

where Γ depends only on the embedding map of the worldvolume into spacetime and a contains the dependence of $\tilde{\Gamma}$ on the Born–Infeld fields. If we take the Born–Infeld fields to vanish, then a = 0 and the product structure becomes

$$\tilde{\Gamma} = \Gamma. \tag{3.4}$$

3.1. Supersymmetry without Born–Infeld fields. A consistent truncation of all Dirac–Born–Infeld type actions is to allow the Born–Infeld fields to vanish. Let X be the embedding map of the extended object into Minkowski spacetime. The bosonic part of the truncated action is

$$S_{NG} = \int d^{p+1}x\, \sqrt{|\det(g_{\mu\nu})|}, \tag{3.5}$$

where

$$g_{\mu\nu} = \partial_\mu X^M \partial_\nu X^N \eta_{MN} \tag{3.6}$$

is the induced metric on the worldvolume with coordinates $\{x^\mu ; \mu = 0, \dots, p\}$, and η is the ten-dimensional Minkowski metric. The product structure Γ is then expressed in terms of the embedding maps and the spacetime gamma matrices. The particular expression for Γ depends on the type of brane that we are considering. Here we shall investigate the Killing spinor equations for IIA D-branes. However, our argument is universal and applies equally well to IIB D-branes, M-branes, fundamental strings and heterotic 5-branes. To continue, we define

$$\gamma_\mu = \partial_\mu X^M \Gamma_M, \tag{3.7}$$

where $\{\Gamma_M ; M = 0, \dots, 9\}$ are the spacetime gamma matrices. We remark that

$$\gamma_\mu \gamma_\nu + \gamma_\nu \gamma_\mu = 2 g_{\mu\nu}, \tag{3.8}$$

i.e. the $\{\gamma_\mu ; \mu = 0, \dots, p\}$ obey the Clifford algebra with respect to the induced metric. The product structure Γ for IIA D-branes is

$$\Gamma = \frac{1}{(p+1)!\, \sqrt{g}}\, \epsilon^{\mu_0 \dots \mu_p} \gamma_{\mu_0} \cdots \gamma_{\mu_p}\, \Gamma_{11}^{p}. \tag{3.9}$$
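The relation (3.8) follows directly from the Clifford algebra of the spacetime gamma matrices and the definition (3.6) of the induced metric. As a hedged numerical illustration, the sketch below checks it in a toy setting: a four-dimensional ambient space with η = diag(1, 1, 1, −1) instead of the ten-dimensional one, 4×4 gamma matrices built from Pauli matrices, and a random matrix standing in for the derivatives ∂_μ X^M. These choices and all function names are assumptions made for the example.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def gamma_matrices_3_1():
    """Four 4x4 matrices with {Gamma_M, Gamma_N} = 2 eta_MN, eta = diag(1, 1, 1, -1)."""
    gammas = [np.kron(SX, I2), np.kron(SY, I2),
              np.kron(SZ, SX), 1j * np.kron(SZ, SY)]
    eta = np.diag([1.0, 1.0, 1.0, -1.0])
    return gammas, eta

def induced_gammas(dX, gammas):
    """gamma_mu = d_mu X^M Gamma_M for a Jacobian dX of shape (p+1, D)."""
    return [sum(dX[mu, M] * gammas[M] for M in range(len(gammas)))
            for mu in range(dX.shape[0])]

def induced_metric(dX, eta):
    """g_{mu nu} = d_mu X^M d_nu X^N eta_MN."""
    return dX @ eta @ dX.T

def clifford_defect(dX):
    """Largest entry of {gamma_mu, gamma_nu} - 2 g_{mu nu} over all mu, nu."""
    gammas, eta = gamma_matrices_3_1()
    small = induced_gammas(dX, gammas)
    g = induced_metric(dX, eta)
    identity = np.eye(4)
    worst = 0.0
    for mu in range(len(small)):
        for nu in range(len(small)):
            anti = small[mu] @ small[nu] + small[nu] @ small[mu]
            worst = max(worst, float(np.abs(anti - 2 * g[mu, nu] * identity).max()))
    return worst

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dX = rng.standard_normal((2, 4))   # a 2-dimensional worldvolume in the toy spacetime
    print("defect of relation (3.8):", clifford_defect(dX))
```

The defect vanishes up to rounding for any Jacobian, since the check only uses bilinearity and the ambient Clifford relation.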

For example, the product structure of a planar IIA D-p-brane which spans the first p directions of $\mathbb{E}^{(9,1)}$ is

$$\Gamma = \Gamma_0 \cdots \Gamma_p\, \Gamma_{11}^{p}. \tag{3.10}$$

There is another way to describe the product structure Γ. For this we remark that there is a map c which assigns to every q-form,

$$\nu = \frac{1}{q!}\, \nu_{M_1 \dots M_q}\, dx^{M_1} \wedge \cdots \wedge dx^{M_q}, \tag{3.11}$$

in $\mathbb{E}^{(9,1)}$ an element

$$c(\nu) = \frac{1}{q!}\, \nu_{M_1 \dots M_q}\, \Gamma^{M_1} \cdots \Gamma^{M_q} \tag{3.12}$$

of the Clifford algebra Cliff(9, 1). Note that

$$c(\star\nu) = c(\nu)\, \Gamma_{11}, \tag{3.13}$$

where $\star\nu$ is the Hodge dual of ν. Let N be the submanifold of $\mathbb{E}^{(9,1)}$ that describes a solution of the Born–Infeld field equations. Then the product structure Γ associated with N is

$$\Gamma = c(\mu)\, \Gamma_{11}^{p}, \tag{3.14}$$

where µ is the volume form of N. Therefore, the product structure Γ written in terms of an oriented orthonormal frame at a point y becomes the product structure of a planar p-brane tangent to N at y. Thus solving the Killing spinor equation of N is equivalent to requiring that the Killing spinor equations of the planar D-p-branes tangent to N at each point y have a common solution.

One way to find a solution is to recall what happens when two planar branes intersect at an “angle”. Supersymmetry requires that the element of SO(9, 1) or SO(10, 1) relating the hyper-planes is in a subgroup G ⊂ SO(9, 1) or SO(10, 1). Such angles are called “G-angles”. For a general non-planar brane we proceed analogously. For this we choose a reference point $y_0$ and an orthonormal frame of the tangent space $T_{y_0} N$ at this point, which we extend to an orthonormal frame in $\mathbb{E}^{(9,1)}$. Then we choose any other point y in N and introduce another orthonormal frame in the same way. The two orthonormal frames are related by a Lorentz transformation in $\mathbb{E}^{(9,1)}$ (a spatial orthogonal rotation for static configurations). Moreover the supersymmetry condition at y can be written in terms of the product structure at $y_0$ as

$$S^{-1}\, \Gamma(y_0)\, S\, \epsilon = \epsilon, \tag{3.15}$$

where S is a spinor rotation induced by the Lorentz rotation that relates the above orthonormal frames at $y_0$ and y. For most of the solutions of the Dirac–Born–Infeld equation, the rotations required are generic elements of the ten-dimensional Lorentz group and all supersymmetry is broken. However, some cases involve rotations which lie in a subgroup of the Lorentz group. In particular, if the rotations lie in a subgroup of the Lorentz group for which the decomposition of the Majorana spinor representation of Spin(9, 1) has singlets, then Eq. (3.15) reduces to

$$\Gamma(y_0)\, \epsilon = \epsilon, \tag{3.16}$$

provided that ε is a linear combination of the singlets. Therefore the Killing spinor equation of N in such a case reduces to the Killing spinor equation on a single D-p-brane, which can be easily solved. Examples of subgroups of the orthogonal groups for which the decomposition of the ten-dimensional spinor representation has singlets are the special holonomy groups like SU(k), Sp(k), $G_2$ and Spin(7).

It is clear from the above arguments that preservation of a proportion of the supersymmetry of the bulk by a Dirac–Born–Infeld configuration is closely related to calibrations. Let Φ be the Gauss map which takes the tangent space of N at every point y into the grassmannian $G(p+1, \mathbb{E}^{(9,1)})$. If the image

$$S_0 = \Phi(N) \tag{3.17}$$

of such a map in $G(p+1, \mathbb{E}^{(9,1)})$ is a subspace of the homogeneous space G/H, then clearly the relevant rotations amongst the frames associated with the tangent spaces are in G. If N is a calibration, then $S_0$ is a subspace of the contact set G(φ). In all the examples of calibrations that we have presented, the contact set is a homogeneous space with G being one of the groups that appear in the context of special holonomies. From this we conclude that all these calibrations preserve a proportion of the supersymmetry of the bulk. Clearly the case of intersecting planar branes is a special case of this construction. For more about this see section eight.

3.2. Supersymmetry with Born–Infeld fields. The Dirac–Born–Infeld action with Born–Infeld field $F_{\mu\nu}$ is

$$I = \int d^{p+1}x\, \sqrt{|\det(g_{\mu\nu} + F_{\mu\nu})|}, \tag{3.18}$$

where $g_{\mu\nu}$ is an induced metric. This action describes the dynamics of D-p-branes as well as the dynamics of IIB NS-NS 5-branes. Here we shall investigate the Killing spinor equations for the IIA D-branes. However, our argument can be easily extended to apply in the case of IIB D-branes, in the case of the IIB NS-NS 5-brane, as well as in the case of the M-5-brane. For the IIA D-p-brane [22],

$$a = Y_{\mu\nu}\, \gamma^\mu \gamma^\nu\, \Gamma_{11}, \tag{3.19}$$

where

$$F = \text{“tan”}\, Y. \tag{3.20}$$

The inclusion of non-vanishing Born–Infeld fields modifies the conditions for a configuration to be supersymmetric. In this case, there is no direct relation between calibrations and supersymmetry conditions. Nevertheless the analysis of the proportion of supersymmetry preserved by a solution of the Dirac–Born–Infeld field equations proceeds as in the previous case without Born–Infeld fields. We begin with a submanifold N in $\mathbb{E}^{(9,1)}$ which is a solution of the Born–Infeld field equations together with a non-vanishing Born–Infeld field F. Then we introduce orthonormal frames at the tangent spaces $T_{y_0} N$ and $T_{y} N$ of the points $y_0$ and y of N, which we extend to orthonormal frames in $\mathbb{E}^{(9,1)}$. Again there is a Lorentz rotation in $\mathbb{E}^{(9,1)}$ which relates the two frames. Using a similar argument to that of the previous section, the Killing spinor equation at a point y can be written in terms of the Killing spinor equation at $y_0$ as follows:

$$U^{-1}\, e^{-\frac{a(y_0)}{2}}\, \Gamma(y_0)\, e^{\frac{a(y_0)}{2}}\, U\, \epsilon = \epsilon, \tag{3.21}$$

where

$$U = e^{-\frac{a(y_0)}{2}}\, S\, e^{\frac{a(y)}{2}} \tag{3.22}$$

and S is the induced rotation on the spinors from the Lorentz rotation that relates the two frames. These supersymmetry projections can be simplified if we assume that the Born–Infeld field vanishes at the reference point $y_0$. (In many applications, the Born–Infeld field vanishes at some point, usually at infinity.) The supersymmetry projection at y then becomes

$$e^{-\frac{a(y)}{2}}\, S^{-1}\, \Gamma(y_0)\, S\, e^{\frac{a(y)}{2}}\, \epsilon = \epsilon. \tag{3.23}$$

Solutions of these conditions can be found by assuming that the spinor rotations U leave invariant some constant Majorana spinor ε. In such a case, the Killing spinor equation for N degenerates to the Killing spinor equation of the reference point,

$$\Gamma(y_0)\, \epsilon = \epsilon, \tag{3.24}$$

where ε is a linear combination of the singlets.
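The mechanism by which a rotated projection reduces to the one at the reference point can be illustrated in the simplest setting of vanishing Born–Infeld field, a = 0, where U = S and the condition is (3.15). The sketch below is an assumption-laden example, not the construction of the paper: it uses eleven-dimensional gamma matrices with signature diag(−1, +1, ..., +1), the standard planar membrane product structure Γ_0Γ_1Γ_2 as the reference, and a rotation S = exp(θ_13 Γ_1Γ_3/2) exp(θ_24 Γ_2Γ_4/2). For θ_24 = −θ_13 the rotation lies in an SU(2) subgroup and common solutions survive; a rotation in a single plane leaves none.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def gamma_matrices_10_1():
    """32x32 matrices with {Gamma_M, Gamma_N} = 2 eta_MN, eta = diag(-1, +1, ..., +1)
    (a convention assumed for this sketch)."""
    euclid = []
    for k in range(5):
        head, tail = [SZ] * k, [I2] * (4 - k)
        euclid.append(reduce(np.kron, head + [SX] + tail))
        euclid.append(reduce(np.kron, head + [SY] + tail))
    euclid.append(reduce(np.kron, [SZ] * 5))
    return [1j * euclid[0]] + euclid[1:]

def rotor(gammas, i, j, theta):
    """Spinor rotation exp(theta/2 Gamma_i Gamma_j) for spatial directions i != j."""
    gij = gammas[i] @ gammas[j]            # squares to minus the identity
    return np.cos(theta / 2) * np.eye(gij.shape[0]) + np.sin(theta / 2) * gij

def common_solutions(product_structures):
    """Number of independent spinors with Gamma eps = eps for every Gamma given."""
    dim = product_structures[0].shape[0]
    constraints = np.vstack([np.eye(dim) - g for g in product_structures])
    return dim - int(np.linalg.matrix_rank(constraints))

def tilted_pair(theta_13, theta_24):
    """Common solutions for a membrane along (1, 2) and a copy rotated
    by theta_13 in the (1, 3) plane and theta_24 in the (2, 4) plane."""
    g = gamma_matrices_10_1()
    gamma_ref = g[0] @ g[1] @ g[2]
    S = rotor(g, 1, 3, theta_13) @ rotor(g, 2, 4, theta_24)
    gamma_rot = np.linalg.inv(S) @ gamma_ref @ S
    return common_solutions([gamma_ref, gamma_rot])

if __name__ == "__main__":
    print("SU(2) angle      :", tilted_pair(0.3, -0.3), "of 32")
    print("single rotation  :", tilted_pair(0.3, 0.0), "of 32")
```

In the SU(2) case the surviving spinors are exactly those left invariant by S, so the rotated condition collapses to the reference one as in (3.16) and (3.24).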


A consistent truncation of the Dirac–Born–Infeld action is to set $\{X^M\} = \{x^\mu, y^i = 0\}$. The truncated action is the Born–Infeld action

$$I = \int d^{p+1}x\, \sqrt{|\det(\eta_{\mu\nu} + F_{\mu\nu})|}. \tag{3.25}$$

For this class of configurations, N is a (p+1)-dimensional Minkowski subspace of $\mathbb{E}^{(9,1)}$. Therefore we can choose an orthonormal frame in $\mathbb{E}^{(9,1)}$ such that N spans the first p spatial directions of $\mathbb{E}^{(9,1)}$. The Killing spinor equations become

$$e^{-\frac{a(y)}{2}}\, \Gamma\, e^{\frac{a(y)}{2}}\, \epsilon = \epsilon, \tag{3.26}$$

where Γ is a constant product structure. In the linearized limit, this condition on the Born–Infeld field F reduces to the familiar condition for a (Yang–Mills) configuration to preserve a proportion of supersymmetry⁴ together with a projection associated with Γ. Viewing F as the infinitesimal transformation of SO(p, 1) rotations acting on the Killing spinor with the induced spinor representation, supersymmetry is preserved provided that F takes values in an appropriate subalgebra of so(p, 1). Such subalgebras are those of the special holonomy subgroups of SO(p, 1). For example sp(1) is related to the self-dual connections in four dimensions, su(k) is related to the Einstein-Yang-Mills connections [26], sp(k) is related to the instanton solutions found in [27] (see also [31]) and similarly for $g_2$ and spin(7) [28, 29, 30].

4. Static Born–Infeld Solitons, Brane Boundaries and Brane Intersections

As we have already mentioned, the worldvolume solitons of the Dirac–Born–Infeld action of a p-brane may arise as the intersection of the p-brane with other branes or the boundary of other branes ending on the p-brane. Therefore, such worldvolume solitons can be found by utilizing the bulk boundary and intersection rules for branes. These rules are known either from string theory or from supergravity. Worldvolume solitons that arise as boundaries or intersections of branes have the following qualitative properties:

• The bulk brane configuration and the associated worldvolume soliton exhibit the same manifest Poincaré invariance.
• The proportion of the bulk supersymmetry preserved by the worldvolume soliton is the same as that of the associated bulk configuration. We shall show that the supersymmetry projections in both cases are identical.
• The worldvolume solitons of a p-brane that are associated with other branes ending on it have non-vanishing Born–Infeld fields. This is a consequence of the conservation of the flux at the intersection.
• The number of non-vanishing scalar fields associated with a worldvolume soliton of a p-brane is equal to the number of the worldvolume directions of the other branes involved in the associated bulk configuration which are transverse to the p-brane.

For worldvolume solitons associated with the intersection of two branes, we can also impose the following boundary conditions:

• Away from the boundary or the intersection, the induced metric on the p-brane should approach that of $\mathbb{E}^{(p,1)}$.
• Near the boundary or the intersection on a p-brane, the induced metric should approach the worldvolume Minkowski metric of the “incoming” brane from the bulk.

⁴ We have assumed that F is not a constant two-form field strength.

There is a very large number of brane configurations that have the interpretation of branes ending on or intersecting with other branes. We shall not attempt to present a complete list here. However, we shall mention some examples below (see [1, 32, 33]). These bulk configurations are most easily classified according to the supersymmetry that they preserve.

(i) Configurations that preserve 1/2 of the supersymmetry of the bulk are bound states of branes that lie within other branes. These bound states are below threshold. Examples of such bound states include the D-(p-2)-branes within D-p-branes and the M-2-brane within the M-5-brane.

(ii) Configurations that preserve 1/4 of the supersymmetry of the bulk include the following: In M-theory, two membranes intersecting at a 0-brane, two M-5-branes intersecting at a 3-brane, and a membrane ending on a 5-brane with boundary a string. In string theory we have the fundamental string ending on a D-p-brane with a 0-brane boundary, a D-string ending on a IIB NS-NS 5-brane with a 0-brane boundary, two D-p-branes intersecting at a (p-2)-brane and two NS 5-branes intersecting at a 3-brane.

(iii) Configurations that preserve 1/8 of the supersymmetry of the bulk include the following: In M-theory, three M-2-branes intersecting at a 0-brane, three M-5-branes intersecting at a string and two M-2-branes ending on an M-5-brane with their string boundaries intersecting on a 0-brane.

Using the correspondence between bulk configurations and worldvolume solitons, it is worth mentioning that there are two 0-brane worldvolume solitons on the D-2-brane. One is due to the IIA fundamental string ending on the D-2-brane and the other is due to the intersection of two D-2-branes at a 0-brane. The two 0-brane solitons are charged with respect to different fields. The first one is charged with respect to the Born–Infeld one-form gauge potential and the other is charged with respect to the gauge potential associated with the dual of a transverse scalar. The D-4-brane also has two different 0-brane worldvolume solitons. One is due to the fundamental string ending on the D-4-brane and it is charged with respect to the Born–Infeld gauge potential. The other is due to a D-0-brane within the D-4-brane and the worldvolume solution is a Born–Infeld field instanton.

5.
Static Solutions and Kähler Calibrations The worldvolume solitons associated with Kähler calibrations are complex submanifolds of Cn . If the calibration has contact set SU (n)/S(U (p)×U (n−p)), then a generic soliton will preserve 1/2n of the supersymmery of the bulk. As we shall see, the singlets of the ten-dimensional representations under SU (n) ⊂ SO(9, 1) satisfy n − 1 relations each reducing the number of components of the Killing spinor by half. The supersymmetry projection associated with the p-brane also reduces the components of the Killing spinor by another half leading to the preservation of the 1/2n of the bulk supersymmetry. The worldvolume solitons associated with the Kähler calibrations have vanishing Born–Infeld type of fields and therefore the analysis below applies universally to all types of branes, M-branes, D-branes and NS-branes. Let us describe the solution of a k-brane soliton of a p-brane with ` transverse scalars. For this we choose the static gauge


for the p-brane,

$$X^M = (x^\mu, y^i), \tag{5.1}$$

where $\{x^\mu; \mu = 0, \dots, p\}$ are the worldvolume coordinates and $\{y^i; i = 1, \dots, 9 - p\}$ are the transverse coordinates of the p-brane. In this gauge, the Born–Infeld solution that we would like to describe is a map y from $\mathbb{E}^{p-k} \subset \mathbb{E}^{(p,1)}$ into $\mathbb{E}^{\ell} \subset \mathbb{E}^{(9,1)}$. We then introduce complex structures I and J on $\mathbb{E}^{p-k}$ and $\mathbb{E}^{\ell}$, respectively, and we write the equation

$$I^b{}_a\, \partial_b y^m = J^m{}_n\, \partial_a y^n, \tag{5.2}$$

where $a, b = 1, \dots, p - k$ and $m, n = 1, \dots, \ell$; the dimensions of $\mathbb{E}^{p-k}$ and $\mathbb{E}^{\ell}$ are even. It is straightforward to show that if y satisfies (5.2) then the embedding $\{X^M\} = \{(x^\mu, y^m, 0)\}$ solves the Born–Infeld field equations. Choosing complex coordinates with respect to these complex structures, $x^a = (z^\alpha, \bar z^{\bar\alpha})$ and $y^m = (s^A, \bar s^{\bar A})$, the solutions of (5.2) are holomorphic functions

$$s^A = s^A(z). \tag{5.3}$$

Therefore the above solution is a Kähler calibration in $\mathbb{E}^{p-k+\ell}$ of degree p − k. The Kähler form is given by the complex structure I ⊕ J and the contact set is

$$SU\!\left(\tfrac{p-k+\ell}{2}\right) \Big/ S\!\left(U\!\left(\tfrac{p-k}{2}\right) \times U\!\left(\tfrac{\ell}{2}\right)\right). \tag{5.4}$$
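As an illustration of (5.2) and (5.3) in the simplest case of one complex worldvolume coordinate and one complex transverse coordinate, the following minimal sketch checks numerically that a holomorphic map satisfies the Cauchy–Riemann equations, while its complex conjugate does not. The sample function s(z) = c/z, the value of c, the test point and the finite-difference step are all illustrative assumptions and are not taken from the text.

```python
# Finite-difference check of the Cauchy-Riemann equations.
# The sample map, test point and step size are illustrative assumptions.

def s(z: complex, c: complex = 1.0 + 0.5j) -> complex:
    """Sample holomorphic map from the worldvolume coordinate z to the transverse coordinate s."""
    return c / z


def s_bar(z: complex) -> complex:
    """Complex conjugate of s, which is antiholomorphic in z."""
    return s(z).conjugate()


def cauchy_riemann_residual(f, z: complex, h: float = 1e-6) -> float:
    """Return |d_x f + i d_y f|, which vanishes when f is holomorphic at z."""
    d_x = (f(z + h) - f(z - h)) / (2 * h)            # derivative along Re z
    d_y = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # derivative along Im z
    return abs(d_x + 1j * d_y)


if __name__ == "__main__":
    z0 = 1.0 + 1.0j
    print("holomorphic residual:    ", cauchy_riemann_residual(s, z0))
    print("antiholomorphic residual:", cauchy_riemann_residual(s_bar, z0))
```

The first residual is zero up to discretization error, whereas the second is of order one.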

5.1. SU(2) Kähler calibrations. The simplest of all Kähler calibrations is the one describing a one-dimensional complex space in $\mathbb{C}^2$. The associated Born–Infeld brane solitons have already been found in [5]. They correspond to (p-2)-brane worldvolume solitons that appear in all p-branes (p ≥ 2). From the bulk (supergravity) perspective, the existence of these solitons is due to the “(p-2)-intersection rule” which states that two p-branes intersect on a (p-2)-brane [1]. The solution can be written in an implicit form as

$$F(s, z) = 0. \tag{5.5}$$

Different choices of F give different solutions. Some examples are the following. (i) The 3-brane soliton of the M-5-brane for which F is the equation of a Riemann surface [34]. (ii) In [37], the 0-brane soliton of the M-2-brane for some choice of F was interpreted as the M-theory analogue of the IIB (p,q)-string triple junctions [35, 36] (for more references to earlier work see [38]). Another example is to choose

$$F(s, z) = sz - c, \tag{5.6}$$

where c is a complex number. If c = 0, then the soliton is singular. The bulk interpretation of such a soliton is that of two planar p-branes intersecting at a (p-2)-brane with the singularity located at the intersection. Such solutions will be investigated in Sect. 8. If $c \neq 0$, the singularity is blown up and the induced metric on the p-brane for this configuration has two asymptotically flat regions, one at |z| → ∞ and the other at |z| → 0. These asymptotic regions are identified with the two p-branes involved in the intersection. It turns out that the two asymptotic regions are orthogonal in the bulk metric and therefore the intersection is at right angles. Next, we take

$$F(s, z) = (s - bz)z - c, \tag{5.7}$$


where b, c are complex numbers. The induced metric on the p-brane again has two asymptotically flat regions, one at |z| → ∞ and the other at |z| → 0, which can be identified with the p-branes involved in the intersection. However, in this case the intersection is not orthogonal. This can be easily seen by observing that |s| → ∞ as |z| → 0 and s → bz as |z| → ∞. Therefore in the (s, z) plane, one asymptotic region is along the s plane while the other is along the plane determined by the equation s = bz. The two angles between these two planes are determined by the parameter b; for b a real number,

$$\tan\theta = \frac{1}{b}, \tag{5.8}$$

where θ is the angle. As we have already mentioned, all the above configurations preserve 1/4 of the bulk supersymmetry. To be more explicit let us consider the case of IIA D-branes in more detail. We can always arrange, without loss of generality, that the product structure $\Gamma$ at the reference point $y_0$ is

$$\Gamma(y_0) = \Gamma_0 \cdots \Gamma_p\, (\Gamma_{11})^{\frac{p+2}{2}}. \tag{5.9}$$

We shall consider solutions which depend on the worldvolume coordinates $(x^{p-1}, x^p) = (X^{p-1}, X^p)$ and with transverse coordinates $(y^1, y^2) = (X^{p+1}, X^{p+2})$. We then introduce the complex coordinates $z = x^{p-1} + i x^p$ and $s = y^1 + i y^2$. The supersymmetry projection associated with $y_0$ is

$$\Gamma_0 \cdots \Gamma_p\, (\Gamma_{11})^{\frac{p+2}{2}}\, \epsilon = \epsilon. \tag{5.10}$$

For such a Kähler calibration, the SU(2) rotations of the tangent planes of the solution take place in the space spanned by the coordinates $\{X^{p-1}, \dots, X^{p+2}\}$. For the above choice of complex structure, the spinor singlets of any SU(2) rotation in these directions satisfy

$$\Gamma_{p-1}\Gamma_p\Gamma_{p+1}\Gamma_{p+2}\, \epsilon = -\epsilon; \tag{5.11}$$

(the sign depends on the choice of complex structure). Combining the conditions (5.10) and (5.11), we conclude that the supersymmetry preserved by the worldvolume solution is 1/4 of that of the bulk. It is worth pointing out that the two supersymmetry conditions can be rewritten as

$$\Gamma_0 \cdots \Gamma_p\, (\Gamma_{11})^{\frac{p+2}{2}}\, \epsilon = \epsilon, \qquad \Gamma_0 \cdots \Gamma_{p-2}\, \Gamma_{p+1}\Gamma_{p+2}\, (\Gamma_{11})^{\frac{p+2}{2}}\, \epsilon = \epsilon. \tag{5.12}$$

These are precisely the supersymmetry conditions of the supergravity solution with the interpretation of two IIA D-p-branes intersecting on a (p-2)-brane$^5$. This turns out to be a common feature of all supersymmetric worldvolume solitons. If the worldvolume soliton has a bulk interpretation, then the supersymmetry conditions of the soliton are identical to those of the bulk configuration. The above argument for supersymmetry, with a slight modification, applies to all (p-2)-brane worldvolume solitons of p-branes.

$^5$ The intersection can be at any SU(2) angle.
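The asymptotic behaviour of the solution (5.7) and the angle formula (5.8) can be explored numerically. The sketch below solves (5.7) for s, giving s = bz + c/z, and evaluates the angle from (5.8) for real b. The function names and the sample values of b and c are illustrative assumptions.

```python
import math


def s_of_z(z: complex, b: complex, c: complex) -> complex:
    """Solve F(s, z) = (s - b z) z - c = 0 for s (valid for z != 0)."""
    return b * z + c / z


def asymptotic_angle(b: float) -> float:
    """Angle theta with tan(theta) = 1/b for real b; b = 0 gives a right angle."""
    return math.atan2(1.0, b)


if __name__ == "__main__":
    b, c = 2.0, 3.0
    print("s/z at large |z|:", s_of_z(1e6, b, c) / 1e6)   # approaches b
    print("|s| at small |z|:", abs(s_of_z(1e-6, b, c)))   # grows without bound
    print("angle for b = 1 (degrees):", math.degrees(asymptotic_angle(1.0)))
    print("angle for b = 0 (degrees):", math.degrees(asymptotic_angle(0.0)))
```

For b = 0 the function reduces to the solution of (5.6) and the angle is a right angle, in agreement with the orthogonal intersection described above.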


5.2. SU(3) Kähler calibrations. There are two cases of Kähler calibrations associated with the group SU(3) to consider. The first case is a degree two calibration in $\mathbb{C}^3$ and the second case is a degree four calibration in $\mathbb{C}^3$. Both cases have the same contact set. The worldvolume solitons of a p-brane associated with the calibration of degree two are described by the zero locus of the holomorphic functions

$$F^1(s^1, s^2, z) = 0, \qquad F^2(s^1, s^2, z) = 0, \tag{5.13}$$

where z is a complex coordinate on the p-brane, p ≥ 2, and $s^1, s^2$ are complex coordinates transverse to the p-brane. Next suppose that $z = x^{p-1} + i x^p$, $s^1 = y^1 + i y^2$ and $s^2 = y^3 + i y^4$ ($y^i = X^{p+i}$). For this choice of complex structure, the spinor singlets of SU(3) acting on $z, s^1, s^2$ satisfy the conditions

$$\Gamma_{p-1}\Gamma_p\Gamma_{p+1}\Gamma_{p+2}\, \epsilon = -\epsilon, \qquad \Gamma_{p-1}\Gamma_p\Gamma_{p+3}\Gamma_{p+4}\, \epsilon = -\epsilon. \tag{5.14}$$
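The counting of preserved supersymmetry can be sketched with a small amount of Clifford algebra bookkeeping. The code below is a minimal sketch under stated assumptions, not a general-purpose tool: it writes Γ for the gamma matrices, assumes that distinct gammas anticommute, that Γ_0 squares to −1 and the spatial gammas square to +1, and that each operator considered is a traceless product of gammas. Under these assumptions, k mutually commuting and independent operators that square to +1 each halve the number of spinor components, leaving a fraction 1/2^k. The example takes p = 2 with the membrane projector Γ_0Γ_1Γ_2 (an assumed illustrative choice) together with the two operators in (5.14). All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def normal_order(indices):
    """Sort a product of gamma matrices and return (sign, sorted indices).

    Assumed conventions: distinct gammas anticommute, Gamma_0 squares to -1
    and every spatial gamma squares to +1.
    """
    idx = list(indices)
    sign = 1
    n = len(idx)
    for i in range(n):                      # bubble sort: one sign flip per swap
        for j in range(n - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    out = []
    for k in idx:                           # cancel adjacent equal pairs
        if out and out[-1] == k:
            out.pop()
            sign *= -1 if k == 0 else 1
        else:
            out.append(k)
    return sign, tuple(out)


def multiply(a, b):
    """Product of two gamma monomials as (sign, sorted indices)."""
    return normal_order(tuple(a) + tuple(b))


def gf2_rank(masks):
    """Rank over GF(2) of a list of bitmasks (index sets of the monomials)."""
    pivots = {}
    for m in masks:
        while m:
            h = m.bit_length() - 1
            if h not in pivots:
                pivots[h] = m
                break
            m ^= pivots[h]
    return len(pivots)


def preserved_fraction(projectors):
    """Fraction 1/2^k left by k independent, commuting operators squaring to +1."""
    for a in projectors:
        assert multiply(a, a) == (1, ()), f"{a} does not square to +1"
    for a, b in combinations(projectors, 2):
        assert multiply(a, b) == multiply(b, a), f"{a} and {b} do not commute"
    masks = [sum(1 << i for i in normal_order(p)[1]) for p in projectors]
    return Fraction(1, 2 ** gf2_rank(masks))


if __name__ == "__main__":
    m2 = (0, 1, 2)                          # membrane projector, p = 2
    su3 = [(1, 2, 3, 4), (1, 2, 5, 6)]      # the operators appearing in (5.14)
    print(preserved_fraction([m2] + su3))   # 1/8
    print(multiply(m2, su3[0]))             # (-1, (0, 3, 4))
```

The second line of output shows that Γ_0Γ_1Γ_2 times Γ_1Γ_2Γ_3Γ_4 equals −Γ_0Γ_3Γ_4, so the first condition in (5.14) combined with the membrane projection is equivalent to the projection of a second membrane along the directions 0, 3, 4.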

We then choose the base point $y_0$ as in the SU(2) case above so that the projector of the p-brane is given as in (5.10). Therefore a generic worldvolume soliton preserves 1/8 of the bulk supersymmetry as expected. From the bulk perspective, the above worldvolume solitons are associated with the common intersection of three p-branes. There are several examples of such configurations. The typical example is that of three M-2-branes (or equivalently three D-2-branes) intersecting on a 0-brane. The intersection that we are considering is not necessarily orthogonal. However, to simplify notation, we shall present here the orthogonal bulk intersection which is

$$\begin{array}{llllllll}
\text{(i) M-2:} & 0, & 1, & 2, & -, & -, & -, & - \\
\text{(ii) M-2:} & 0, & -, & -, & 3, & 4, & -, & - \\
\text{(iii) M-2:} & 0, & -, & -, & -, & -, & 5, & 6.
\end{array} \tag{5.15}$$

In this notation, the numbers denote the bulk directions which are identified with the worldvolume directions of the associated brane and some of the transverse directions are denoted by −. From the perspective of one of the three 2-branes, say the first one, z spans the worldvolume directions 1, 2 and $s^1, s^2$ span the worldvolume directions 3, 4, 5, 6 of the other two 2-branes. Using reduction from M-theory to IIA and T-duality, we can construct many other cases, for example that of three D-3-branes intersecting at a string. An example of a worldvolume soliton that corresponds to the above intersection is

$$F^1 = (s^1 - b_1 z)z - c_1, \qquad F^2 = (s^2 - b_2 z)z - c_2, \tag{5.16}$$

where $b_1, b_2, c_1, c_2$ are complex numbers. The induced metric on one of the M-2-branes has two asymptotically flat regions, one as |z| → 0 and the other as |z| → ∞. Identifying $s^1, s^2$ with the coordinates of the “incoming” branes, comparing the asymptotic behaviour of the solution as |z| → 0 and |z| → ∞ and using a similar argument to that of the SU(2) case, we find that the angles are determined by $b_1$ and $b_2$; for $b_1, b_2$ real numbers, we have

$$\tan\theta_1 = \frac{1}{b_1}, \qquad \tan\theta_2 = \frac{1}{b_2}. \tag{5.17}$$

If the constants $\{b_1, b_2\}$ vanish, then the branes intersect orthogonally. It is worth mentioning that the supersymmetry projections associated with this worldvolume soliton are the same as the supersymmetry projections associated with the bulk configuration (5.15).

Next we shall consider the case of degree four calibrations in $\mathbb{C}^3$. The worldvolume solitons of a p-brane are described by the zero locus of the holomorphic function

$$F(z^1, z^2, s) = 0, \tag{5.18}$$

where $z^1, z^2$ are complex coordinates of the p-brane and s is a complex coordinate transverse to the brane. The rotation group SU(3) acts on the coordinates $z^1, z^2, s$. Choosing the complex structure in a way similar to that of the previous case, the spinor singlets of SU(3) satisfy the same conditions as in (5.14). Using these two conditions together with the supersymmetry projection of the p-brane, one concludes that a generic worldvolume soliton will preserve 1/8 of the bulk supersymmetry. These worldvolume solitons correspond to intersecting brane configurations which are magnetic duals (in the sense of [1]) to the intersecting brane configurations associated with the solitons of the previous case. A typical example is the M-theory configuration of three M-5-branes pairwise intersecting on 3-branes and altogether at a string. Again here the intersection is at SU(3) angles. However, for simplicity we give the orthogonally intersecting configuration which is

$$\begin{array}{lllllllll}
\text{(i) M-5:} & 0, & 1, & 2, & 3, & 4, & 5, & & \\
\text{(ii) M-5:} & 0, & 1, & 2, & 3, & -, & -, & 6, & 7, \\
\text{(iii) M-5:} & 0, & 1, & -, & -, & 4, & 5, & 6, & 7.
\end{array} \tag{5.19}$$
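The intersection pattern encoded in tables such as (5.15) and (5.19) can be read off mechanically by treating each brane as the set of bulk directions it occupies. The sketch below does this with Python sets; the dictionary names and helper functions are illustrative assumptions, and direction 0 is taken to be time.

```python
from itertools import combinations

# Worldvolume directions of each brane, as listed in (5.15) and (5.19).
M2_CONFIG = {"i": {0, 1, 2}, "ii": {0, 3, 4}, "iii": {0, 5, 6}}
M5_CONFIG = {
    "i": {0, 1, 2, 3, 4, 5},
    "ii": {0, 1, 2, 3, 6, 7},
    "iii": {0, 1, 4, 5, 6, 7},
}


def common_directions(config):
    """Directions shared by all branes in the configuration."""
    return set.intersection(*config.values())


def pairwise_intersections(config):
    """Directions shared by each pair of branes."""
    return {(a, b): config[a] & config[b] for a, b in combinations(sorted(config), 2)}


def spatial_dimension(directions):
    """Number of spatial directions, with 0 taken as the time direction."""
    return len(directions - {0})


if __name__ == "__main__":
    for name, config in (("M-2", M2_CONFIG), ("M-5", M5_CONFIG)):
        common = common_directions(config)
        print(name, "common intersection:", sorted(common),
              "->", f"{spatial_dimension(common)}-brane")
        for pair, shared in pairwise_intersections(config).items():
            print("  ", pair, sorted(shared), "->", f"{spatial_dimension(shared)}-brane")
```

For (5.19) every pair of M-5-branes shares a 3-brane and all three share the string along 0, 1; for (5.15) the three membranes share only the time direction, that is, a 0-brane.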

This configuration is a magnetic dual to the configuration of three M-2-branes intersecting on a 0-brane (5.15). The degree four calibration associated with an M-5-brane describes a string worldvolume soliton. From the perspective of one of the M-5-branes involved in the intersection (5.19), say the first one, the string is along the directions 0, 1, the complex coordinates $z^1, z^2$ span the directions 2, 3, 4, 5 and s is along the directions 6, 7.

5.3. SU(4) Kähler calibrations. There are three Kähler calibrations associated with the group SU(4). These are the degree two, degree four and degree six calibrations in $\mathbb{C}^4$. The worldvolume soliton of a p-brane described by a degree two Kähler calibration in $\mathbb{C}^4$ is the zero locus of the holomorphic functions

$$F^1(s^1, s^2, s^3, z) = 0, \qquad F^2(s^1, s^2, s^3, z) = 0, \qquad F^3(s^1, s^2, s^3, z) = 0, \tag{5.20}$$


where z is a complex worldvolume coordinate and $s^1, s^2, s^3$ are complex coordinates transverse to the p-brane. Next suppose that $z = x^{p-1} + i x^p$, $s^1 = y^1 + i y^2$, $s^2 = y^3 + i y^4$ and $s^3 = y^5 + i y^6$ ($y^i = X^{p+i}$). The SU(4) rotation group acts on $s^1, s^2, s^3, z$ and with this choice of complex structure the spinor singlets of SU(4) satisfy

$$\Gamma_{p-1}\Gamma_p\Gamma_{p+1}\Gamma_{p+2}\, \epsilon = -\epsilon, \qquad \Gamma_{p-1}\Gamma_p\Gamma_{p+3}\Gamma_{p+4}\, \epsilon = -\epsilon, \qquad \Gamma_{p-1}\Gamma_p\Gamma_{p+5}\Gamma_{p+6}\, \epsilon = -\epsilon. \tag{5.21}$$

We then choose the base point $y_0$ as in the SU(2) case above so that the projector of the p-brane is given as in (5.10). Using the p-brane projection operator together with (5.21), we find that the proportion of the bulk supersymmetry preserved is 1/16 as expected. As in the SU(2) case the above projections can be rewritten as the projections of four p-branes intersecting on a (p-2)-brane. An example of such a worldvolume soliton is the one that is associated with the bulk configuration of four M-2-branes intersecting on a 0-brane, i.e.

$$\begin{array}{llllllllll}
\text{(i) M-2:} & 0, & 1, & 2, & -, & -, & -, & -, & -, & -, \\
\text{(ii) M-2:} & 0, & -, & -, & 3, & 4, & -, & -, & -, & -, \\
\text{(iii) M-2:} & 0, & -, & -, & -, & -, & 5, & 6, & -, & -, \\
\text{(iv) M-2:} & 0, & -, & -, & -, & -, & -, & -, & 7, & 8.
\end{array} \tag{5.22}$$

From the perspective of one of the four 2-branes, say the first one, z spans the worldvolume directions 1, 2 and $s^1, s^2, s^3$ span the worldvolume directions 3, 4, 5, 6, 7, 8 of the other three M-2-branes. The explicit solutions that we have given for the case of three M-2-branes intersecting on a 0-brane can be easily generalized to this case and we shall not repeat the analysis here.

Next we shall consider the case of degree four Kähler calibrations in $\mathbb{C}^4$. The worldvolume soliton of a p-brane associated with such a calibration is described by the zero locus of two holomorphic functions

$$F^1(z^1, z^2, s^1, s^2) = 0, \qquad F^2(z^1, z^2, s^1, s^2) = 0, \tag{5.23}$$

where $z^1, z^2$ are two complex worldvolume coordinates and $s^1, s^2$ are two complex coordinates transverse to the p-brane. The spinor singlets of SU(4) satisfy the same conditions (5.21) of the previous case for a similar choice of complex structure. The proportion of the supersymmetry preserved is 1/16 of the bulk supersymmetry. From the bulk perspective there are five intersecting brane configurations that correspond to this worldvolume soliton. These are two M-5-branes intersecting at a string, two IIA and IIB NS 5-branes intersecting at a string, two D-5-branes intersecting at a string and two D-4-branes intersecting at a 0-brane. All these configurations are related by reduction from M-theory to IIA and T-duality from IIA to IIB. So let us consider the case of two intersecting D-4-branes at a 0-brane. From the perspective of one of the D-4-branes, the 0-brane soliton is described by a degree four Kähler calibration with $z^1, z^2$ the worldvolume coordinates of the chosen brane and $s^1, s^2$ the worldvolume coordinates of the other. However, there is a puzzle: it is well known that when the intersection of two D-4-branes is orthogonal the proportion of the supersymmetry preserved by the configuration is 1/4 of the bulk. This is unlike the cases that we have studied previously where the orthogonally intersecting configuration and the intersecting configuration at SU(n) angles preserved the same proportion of supersymmetry. One resolution of the puzzle is that the worldvolume soliton that is associated with the orthogonally intersecting configuration does not utilize the full SU(4) group of rotations of the tangent bundle of the submanifold. We shall give such solutions in Sect. 8. Such solutions though are singular at the intersection.

The last case is that of worldvolume solitons of p-branes associated with Kähler calibrations of degree six in $\mathbb{C}^4$. This requires that p ≥ 6 and that there are at least two transverse directions. These solitons are given by the zero locus of the holomorphic function

$$F(s, z^1, z^2, z^3) = 0, \tag{5.24}$$

where z1 , z2 , z3 are complex worldvolume coordinates of the p-brane and s is a complex transverse coordinate. The investigation of the properties of these solitons, like supersymmetry, is similar to that of the previous cases and we shall not pursue this further here.

6. Static Solutions and Special Lagrangian Calibrations

The special Lagrangian calibrations (SLAG) are closely related to the Kähler ones which we have investigated in the previous sections. The contact set of a SLAG calibration is $G(\phi) = SU(n)/SO(n)$. Therefore a generic worldvolume soliton associated with a SLAG calibration preserves $1/2^n$ of the supersymmetry of the bulk. To describe the solutions of the Nambu–Goto action associated with SLAG calibrations, we again choose the static gauge (5.1) of a p-brane as in the case of Kähler calibrations. The SLAG calibration is then a map $y = \{y^i; i = 1, \dots, n\}$ from $\mathbb{E}^n \subset \mathbb{E}^{(p,1)}$ with coordinates $\{x^i; i = 1, \dots, n\}$ into $\mathbb{E}^n \subset \mathbb{E}^{(9,1)}$. In this gauge, the conditions on y required by the SLAG calibration [15, 16] are

$$y^i = \partial_i f(x), \tag{6.1}$$

and

$$\mathrm{Im}\, \det(\delta_{ij} + i\, \partial_i \partial_j f) = 0, \tag{6.2}$$

where f is a real function of the coordinates $\{x^i; i = 1, \dots, n\}$ and $\partial_i f = \frac{\partial}{\partial x^i} f$. Some examples of SLAG calibrations have been given in [15, 16] and we shall not repeat them here.

6.1. SU(2) SLAG calibrations. The SU(2) SLAG calibrations are the same as the SU(2) Kähler ones. This is because the calibration form in this case is

$$\phi = dz^1 \wedge dz^2 + d\bar z^1 \wedge d\bar z^2 \tag{6.3}$$

which is the Kähler form of another complex structure J on E4 . Therefore this SLAG calibration is a Kähler calibration with respect to J . So the investigation of the properties of the corresponding solutions of the Born–Infeld field equations is the same as that described in Sect. 5.1 for the corresponding Kähler calibration.
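The condition (6.2) can be made concrete for small n. If H denotes the Hessian of f, expanding the determinant in the eigenvalues of H gives Im det(1 + iH) = tr H for n = 2 and Im det(1 + iH) = tr H − det H for n = 3, so (6.2) reduces to Laplace's equation for n = 2. The sketch below checks these statements for constant Hessians, that is, for quadratic f; the sample matrices and function names are illustrative assumptions.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (adequate for small n)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def slag_residual(hessian):
    """Left-hand side of (6.2): Im det(delta_ij + i d_i d_j f) for a given Hessian."""
    n = len(hessian)
    m = [[(1 if i == j else 0) + 1j * hessian[i][j] for j in range(n)] for i in range(n)]
    return det(m).imag


if __name__ == "__main__":
    # n = 2, f = x1^2 - x2^2 is harmonic, so the residual vanishes.
    print(slag_residual([[2, 0], [0, -2]]))
    # n = 3, f = (x1^2 + 2 x2^2 + 3 x3^2) / 2 has tr H = det H = 6.
    print(slag_residual([[1, 0, 0], [0, 2, 0], [0, 0, 3]]))
    # Generic symmetric Hessian: the residual equals tr H - det H.
    h = [[1, 2, 0], [2, -1, 1], [0, 1, 3]]
    print(slag_residual(h), sum(h[i][i] for i in range(3)) - det(h))
```

The second example corresponds to the linear map y = (x^1, 2x^2, 3x^3) in (6.1), so it describes a plane rather than a curved soliton.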


6.2. SU(3) SLAG calibrations. The SU(3) SLAG calibration is a degree three calibration in $\mathbb{E}^6$. To interpret this calibration as a soliton of a p-brane, we take three $\{x^i; i = 1, 2, 3\}$ of the coordinates of $\mathbb{E}^6$ to be worldvolume directions of a p-brane (p > 2) and the other three $\{y^i; i = 1, 2, 3\}$ to be transverse to it. The SU(3) rotations act on $\mathbb{E}^6$ with complex coordinates $z^i = x^i + i y^i$. The supersymmetry projections of SU(3) rotations of the contact set can be easily found using the above choice of complex structure of the SLAG calibration (compare with (5.14)). A straightforward computation reveals that the supersymmetry preserved by such a soliton is 1/8 of the bulk supersymmetry; a factor of 1/4 comes from the SU(3) rotations and a further factor of 1/2 from the supersymmetry projection associated with the p-brane. A simple example of a bulk configuration which is associated to the solitons of the SU(3) SLAG calibration is that of three M-5-branes intersecting on a membrane, i.e.

$$\begin{array}{llllllllll}
\text{(i) M-5:} & 0, & 1, & 2, & 3, & 4, & 5, & -, & -, & -, \\
\text{(ii) M-5:} & 0, & 1, & 2, & -, & -, & 5, & 6, & 7, & -, \\
\text{(iii) M-5:} & 0, & 1, & 2, & -, & 4, & -, & 6, & -, & 8.
\end{array} \tag{6.4}$$

From the perspective of one of the three M-5-branes, say the first one, this bulk configuration is associated to a 2-brane worldvolume soliton in the directions 0, 1, 2. The transverse scalars (y 1 , y 2 , y 3 ), (y i = Xp+i ), of the worldvolume soliton depend on the worldvolume coordinates x 3 , x 4 , x 5 . It is straightforward to check that this configuration of M-5-branes preserves 1/8 of the bulk supersymmetry. The associated supersymmetry projections are identical with those of the 2-brane worldvolume soliton. Reducing this configuration to IIA theory along x 1 and T-dualizing to IIB along x 2 , we find that the same calibration describes a string soliton on a D-4-brane and a 0-brane soliton on a D-3-brane, respectively. The above interpretation of the worldvolume solitons of the SU (3) SLAG calibrations is not unique. This calibration can also be interpreted as the soliton of two M-5-branes intersecting on a 2-brane at SU (3) angles. We remark though that if the two M-5-branes are brought in an orthogonal position all supersymmetry will break.

6.3. SU(4) and SU(5) SLAG calibrations. The investigation of the worldvolume solitons of the SU(4) and SU(5) SLAG calibrations is similar to the one described above for the worldvolume solitons of the SU(3) SLAG calibrations, so we shall not present these cases in detail. Here we shall present two examples of bulk configurations that can be associated to these worldvolume solitons as follows:

(i) A bulk configuration associated to the SU(4) SLAG calibration and preserving 1/16 of the supersymmetry is

$$\begin{array}{lllllllllll}
\text{(i) M-5:} & 0, & 1, & 2, & 3, & 4, & 5, & -, & -, & -, & -, \\
\text{(ii) M-5:} & 0, & 1, & -, & -, & 4, & 5, & 6, & 7, & -, & -, \\
\text{(iii) M-5:} & 0, & 1, & -, & 3, & -, & 5, & 6, & -, & 8, & -, \\
\text{(iv) M-5:} & 0, & 1, & -, & 3, & 4, & -, & 6, & -, & -, & 9.
\end{array} \tag{6.5}$$

Another configuration can be found by placing the third M-5-brane above in the directions 0, 1, 2, 5, 7, 8. The soliton is a string on the M-5-brane and lies in the directions 0, 1.

(ii) A bulk configuration associated to the SU(5) SLAG calibration and preserving 1/32 of the supersymmetry is

$$\begin{array}{llllllllllll}
\text{(i) M-5:} & 0, & 1, & 2, & 3, & 4, & 5, & -, & -, & -, & -, & -, \\
\text{(ii) M-5:} & 0, & -, & -, & 3, & 4, & 5, & 6, & 7, & -, & -, & -, \\
\text{(iii) M-5:} & 0, & 1, & -, & 3, & -, & 5, & -, & 7, & -, & 9, & -, \\
\text{(iv) M-5:} & 0, & -, & 2, & -, & 4, & 5, & 6, & -, & 8, & -, & -, \\
\text{(v) M-5:} & 0, & 1, & 2, & 3, & -, & -, & -, & -, & -, & 9, & 10.
\end{array} \tag{6.6}$$

We remark that the M-5-branes can be placed at different directions from those indicated above, still leading to five M-5-branes intersecting at a 0-brane. The worldvolume soliton is a 0-brane on the M-5-brane. Apart from the interpretation given above, SU(4) and SU(5) SLAG calibration solitons can also be associated with two M-5-branes intersecting at SU(4) and SU(5) angles on a 1-brane and a 0-brane, respectively. The latter case may be of interest since it is suitable for describing M-5-brane junctions but this will not be investigated further here. The bulk configuration of two orthogonally intersecting M-5-branes at a string preserves 1/4 of the supersymmetry and as we shall show later there is a singular worldvolume soliton associated with it. However, the bulk configuration of two orthogonal M-5-branes intersecting at a 0-brane breaks all the supersymmetry and therefore for such a configuration to exist the intersection should occur at SU(5) angles.

7. Exceptional Calibrations

7.1. G2 calibrations. The investigation of the solutions of the Born–Infeld action associated with the group G2 is similar to that of the Kähler and SLAG calibrations. However, in this case there does not seem to be a straightforward interpretation of these calibrations in terms of intersecting M-branes. It is tempting though to suggest that these calibrations are associated with intersecting M-5-branes at G2 angles on a string. Supposing this, we take the M-5-brane to lie in the directions 0, 1, 2, 6, 7, 8 and the string to lie in the directions 0, 8. For the degree three calibration, we take $\mathbb{E}^7$ to span the directions 1, 2, 3, 4, 5, 6, 7. The spinor singlets under G2 satisfy

$$G_{mn}\, \epsilon = 0, \tag{7.1}$$

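where

$$G_{mn} = \Gamma_{mn} + \frac{1}{4}\, {\star\varphi}_{mn}{}^{pq}\, \Gamma_{pq} \tag{7.2}$$

are the generators of G2, m, n, p, q = 1, . . . , 7, and $\star\varphi$ is the Hodge dual of the structure constants $\varphi$ of the octonions which we have chosen as

$$\varphi_{123} = \varphi_{246} = \varphi_{435} = \varphi_{516} = \varphi_{572} = \varphi_{471} = \varphi_{673} = 1. \tag{7.3}$$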

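The condition (7.1) on $\epsilon$ yields the supersymmetry projections

$$\Gamma_{1346}\, \epsilon = \epsilon, \qquad \Gamma_{2356}\, \epsilon = \epsilon, \qquad \Gamma_{4567}\, \epsilon = \epsilon. \tag{7.4}$$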

There are many equivalent ways to present these projections for the above choice of ϕ. The consistency of these projections together with the projector associated with the M-5-brane reveals that a generic G2 calibration preserves 1/16 of the bulk supersymmetry.
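Two combinatorial features of the choice (7.3) can be checked directly. First, the seven index triples form a Fano plane: every pair of distinct indices from 1 to 7 lies in exactly one triple, as expected for octonionic structure constants. Second, the complement of each triple in {1, ..., 7} gives the index set of a non-vanishing component of the Hodge dual, and the index sets appearing in (7.4) are among them. The sketch below verifies both statements at the level of index sets only; it does not track signs, and the helper names are illustrative assumptions.

```python
from itertools import combinations

# Index triples of the structure constants chosen in (7.3).
PHI_TRIPLES = [(1, 2, 3), (2, 4, 6), (4, 3, 5), (5, 1, 6), (5, 7, 2), (4, 7, 1), (6, 7, 3)]


def pair_coverage(triples):
    """Count how many triples contain each unordered pair of indices from 1..7."""
    counts = {pair: 0 for pair in combinations(range(1, 8), 2)}
    for triple in triples:
        for pair in combinations(sorted(triple), 2):
            counts[pair] += 1
    return counts


def dual_index_sets(triples):
    """Index sets of the dual four-form components (complements of the triples)."""
    everything = set(range(1, 8))
    return [tuple(sorted(everything - set(triple))) for triple in triples]


if __name__ == "__main__":
    coverage = pair_coverage(PHI_TRIPLES)
    print("every pair in exactly one triple:", all(v == 1 for v in coverage.values()))
    print("dual index sets:", dual_index_sets(PHI_TRIPLES))
```

The index sets (1, 3, 4, 6), (2, 3, 5, 6) and (4, 5, 6, 7) in the output are those of the three projections in (7.4).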


Table 2. Calibrations and Supersymmetry. This table contains the type of calibration, the associated contact set and the proportion of the supersymmetry preserved by the calibration.

Calibration    Contact set                   Supersymmetry
Kähler         SU(n)/S(U(p) × U(n−p))        2^{−n}
SLAG           SU(n)/SO(n)                   2^{−n}
G2             G2/SO(4)                      2^{−4}
Spin(7)        Spin(7)/H                     2^{−5}

For the degree four calibration, we again take $\mathbb{E}^7$ to span the directions 1, 2, 3, 4, 5, 6, 7. The supersymmetry projectors associated with G2 are as in (7.4). The supersymmetry preserved is also 1/16 of the bulk.

7.2. Spin(7) calibrations. The Spin(7) calibrations are degree four calibrations in $\mathbb{E}^8$ and a bulk interpretation is as two M-5-branes at Spin(7) angles intersecting on a string. As in the G2 case above, we take one of the M-5-branes to lie in the directions 0, 1, 2, 6, 7, 9, the string to lie in the directions 0, 9 and the calibration to take place in the directions 1, 2, 3, 4, 5, 6, 7, 8. The spinor singlets under Spin(7) satisfy

$$G_{IJ}\, \epsilon = 0, \tag{7.5}$$

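where

$$G_{IJ} = \frac{3}{4}\left(\Gamma_{IJ} + \frac{1}{6}\, \Omega_{IJ}{}^{KL}\, \Gamma_{KL}\right) \tag{7.6}$$

are the generators of Spin(7), I, J, K, L = 1, . . . , 8, and

$$\Omega_{mnp8} = \varphi_{mnp}, \qquad \Omega_{mnpq} = {\star\varphi}_{mnpq} \tag{7.7}$$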

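is the Spin(7)-invariant self-dual 4-form. The supersymmetry projections associated with Spin(7) acting on $\mathbb{E}^8$ spanned by the above directions are

$$\Gamma_{1346}\, \epsilon = \epsilon, \qquad \Gamma_{2356}\, \epsilon = \epsilon, \qquad \Gamma_{4567}\, \epsilon = \epsilon, \qquad \Gamma_{1238}\, \epsilon = \epsilon. \tag{7.8}$$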

The first three projections are similar to those of G2. Using the last two projections, we find that

$$\Gamma_{12345678}\, \epsilon = \epsilon. \tag{7.9}$$

This implies that if $\epsilon$ is anti-chiral in the 8-dimensional sense all supersymmetry is broken$^6$. The consistency of these projections together with the projector associated with the M-5-brane reveals that a generic Spin(7) calibration preserves 1/32 of the bulk supersymmetry. We summarize some of our results in Sects. 5–7 in Table 2.

$^6$ We remark that this depends on the orientation of $\mathbb{E}^8$. If the other orientation is chosen, then the chiral representation does not have singlets under Spin(7).
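The step from the last two projections to the chirality condition can be spelled out as follows. This is a short sketch that writes Γ for the gamma matrices and ε for the Killing spinor, and assumes only that distinct gamma matrices anticommute, so that moving Γ_8 to the right past Γ_4, Γ_5, Γ_6, Γ_7 costs four sign changes.

```latex
\begin{align*}
\Gamma_{1238}\,\Gamma_{4567}
  &= \Gamma_1\Gamma_2\Gamma_3\,\Gamma_8\,\Gamma_4\Gamma_5\Gamma_6\Gamma_7
   = (-1)^4\,\Gamma_1\Gamma_2\cdots\Gamma_8
   = \Gamma_{12345678}, \\
\Gamma_{12345678}\,\epsilon
  &= \Gamma_{1238}\,\Gamma_{4567}\,\epsilon
   = \Gamma_{1238}\,\epsilon
   = \epsilon .
\end{align*}
```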


8. Singular Solutions and Piecewise Planar Branes

The Dirac–Born–Infeld field equations admit a large class of solutions for which the transverse scalars y are piece-wise linear functions and the Born–Infeld field F is piece-wise constant. In this class of solutions, we can also include double valued solutions, for example the configurations whose graph consists of two or more geometrically intersecting planes in $\mathbb{E}^{(9,1)}$. Some of these solutions are limits of the solitons discussed in the previous sections. The bulk interpretation of such solutions is that of planar branes intersecting at angles with a non-vanishing Born–Infeld field. The singularities of these solutions are at the intersections where the transverse scalars and their first derivatives are discontinuous. We shall call such solutions of the Dirac–Born–Infeld field equations “singular solutions”. The investigation of the supersymmetry preserved by a singular solution with vanishing Born–Infeld field is very similar to that of two or more planar branes placed in the ten-dimensional Minkowski background. Such an analysis has already been done in [22] and we shall not repeat it here. Instead, we shall give one example to illustrate the way that such a computation can be done. For this we shall consider singular solutions of the M-2-brane action. Let $x^1, x^2$ be the two spatial worldvolume coordinates of an M-2-brane and $y^1 = X^3$, $y^2 = X^4$ two of its transverse scalars. A singular worldvolume solution of the M-2-brane is

$$y^i = \begin{cases} c^i, & \text{for } x^1 \cdot x^2 > 0 \\ x^i, & \text{for } x^1 \cdot x^2 < 0, \end{cases} \tag{8.1}$$

where $\{c^i; i = 1, 2\}$ are constants. The supersymmetry conditions associated with this solution are

$$\Gamma_0\Gamma_1\Gamma_2\, \epsilon = \epsilon, \quad x^1 \cdot x^2 > 0, \qquad \frac{1}{2}\, \Gamma_0(\Gamma_1 + \Gamma_3)(\Gamma_2 + \Gamma_4)\, \epsilon = \epsilon, \quad x^1 \cdot x^2 < 0. \tag{8.2}$$

Using the first projector the second one can be rewritten as

$$\Gamma_0\Gamma_3\Gamma_4\, \epsilon = \epsilon. \tag{8.3}$$

Therefore the supersymmetry preserved is 1/4 of the bulk. The projectors of this solution are those of two M-2-branes intersecting at a 0-brane. We can easily include piece-wise constant Born–Infeld fields in the above singular solutions. If the Born–Infeld field is everywhere constant, then it is straightforward to see that its inclusion does not break any additional supersymmetry. However, if it is piece-wise constant, additional supersymmetry may be broken (see [22]).

9. Solutions with Only Born–Infeld Fields

A consistent truncation of the Dirac–Born–Infeld action is to fix the location of the p-brane, as in Sect. 3.2, and reduce it to a Born–Infeld action. Choosing a Lorentz frame in the bulk adapted to the p-brane, the supersymmetry condition, given also in Sect. 3.2, becomes

$$e^{-\frac{a}{2}}\, \Gamma\, e^{\frac{a}{2}}\, \epsilon = \epsilon \tag{9.1}$$

since S = 1; the projector $\Gamma$ is a product of (constant) Gamma matrices. As has already been mentioned in Sect. 3.2, linearizing this condition in the Born–Infeld fields, it becomes the familiar supersymmetry condition of the maximally supersymmetric Maxwell multiplet in (p+1) dimensions together with the projection associated with the planar p-brane. So at least in the linear approximation, Born–Infeld fields that satisfy the self-duality condition and its generalizations, for example the Einstein–Yang–Mills (SU(n)) condition or the Sp(k) condition, preserve a proportion of the spacetime supersymmetry. We remark that the worldvolume solitons involving only Born–Infeld fields have the bulk interpretation of branes within branes. A general analysis of the conditions that Y (see (3.20)) should satisfy for (9.1) to admit non-trivial solutions has already been given in Sect. 3.2. Here we shall examine some special cases. First we shall consider the case where Y is a self-dual 2-form in the directions 1, 2, 3, 4 of the p-brane (p ≥ 4). If this is the case, then the singlets of the rotation $e^a$ satisfy

$$\Gamma_{1234}\, \epsilon = -\epsilon. \tag{9.2}$$

We remark that the same condition on $\epsilon$ is imposed by a self-dual Born–Infeld field in the linearized limit. If $\epsilon$ is such a singlet, then the supersymmetry condition reduces to

$$\Gamma \epsilon = \epsilon. \tag{9.3}$$

The compatibility conditions of the two projections (9.2) and (9.3) determine the proportion of the bulk supersymmetry preserved by the configuration. It turns out that self-dual configurations on all D-p-branes with p ≥ 4 preserve 1/4 of the bulk supersymmetry. It remains to express the self-duality condition on Y in terms of a condition on F. It turns out that if Y is self-dual so is F, as can be easily seen using the identity

$$Y_{ac} Y_{db}\, \delta^{cd} = -\frac{1}{4}\, Y_{cd} Y^{cd}\, \delta_{ab} \tag{9.4}$$

and the relation $F = \tan Y$, where a, b, c, d = 1, . . . , 4. Moreover a short calculation reveals that F satisfies the field equations of the Born–Infeld action as a consequence of the self-duality condition and the Bianchi identity. A simple explicit example with a self-dual Maxwell field is obtained by setting

$$F_{ab} = \partial_a A_b - \partial_b A_a \qquad (9.5)$$

and

$$A_a = I_a{}^{b}\,\partial_b H, \qquad (9.6)$$

where I is a constant complex structure of $\mathbb{E}^4$, $I_a{}^{b} I_b{}^{c} = -\delta_a{}^{c}$, with anti-self-dual Kähler form, and H is a harmonic function on $\mathbb{E}^4$. This worldvolume solution has several bulk interpretations depending on the context in which it is used. One example is that of a D-0-brane within a D-4-brane. Other conditions on Y which lead to preservation of some of the supersymmetry of the bulk have already been mentioned in Sect. 3.2. However there does not seem to be such a simple relation between these conditions on Y and those on F. For example, if Y satisfies the Kähler-Yang-Mills condition (i.e. Y is in the Lie algebra SU(k), k > q), one can easily see that F does not satisfy the same condition. We remark, however, that if Y is tri-hermitian (i.e. Y is in the Lie algebra Sp(k)), then F is tri-hermitian as well. Such a condition has been studied in [24], where it was found that the supersymmetry preserved is 3/16 of the bulk. Explicit solutions can be easily constructed by superposing those of (9.6) using the method developed in [31].
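The step from a self-dual Y to a self-dual F can be checked numerically. The sketch below is only an illustration, assuming that tan in F = tan Y denotes the matrix function defined by its power series and using one particular basis of self-dual 2-forms on $\mathbb{E}^4$; the helper names (`self_dual_form`, `identity_residual`, `tan_of_self_dual`) are hypothetical and not taken from the paper. Identity (9.4) says that Y squared is a multiple of the identity, so cos Y is a multiple of the identity, sin Y is a multiple of Y, and F = (tanh r / r) Y with r² = Y_cd Y^cd / 4.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def self_dual_form(a, b, c):
    """Antisymmetric matrix a(e12+e34) + b(e13-e24) + c(e14+e23) (assumed basis)."""
    return [[0.0, a, b, c],
            [-a, 0.0, c, -b],
            [-b, -c, 0.0, a],
            [-c, b, -a, 0.0]]

def is_self_dual(Y, tol=1e-12):
    """Check Y_12 = Y_34, Y_13 = -Y_24, Y_14 = Y_23 (indices counted from 1)."""
    return (abs(Y[0][1] - Y[2][3]) < tol and
            abs(Y[0][2] + Y[1][3]) < tol and
            abs(Y[0][3] - Y[1][2]) < tol)

def identity_residual(Y):
    """Largest entry of Y_ac Y_db delta^cd + (1/4) Y_cd Y^cd delta_ab, cf. (9.4)."""
    Y2 = matmul(Y, Y)
    norm2 = sum(Y[c][d] ** 2 for c in range(4) for d in range(4))
    return max(abs(Y2[i][j] + 0.25 * norm2 * (1.0 if i == j else 0.0))
               for i in range(4) for j in range(4))

def sin_cos(Y, terms=30):
    """Matrix sine and cosine from their Taylor series."""
    n = len(Y)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    sin_y = [[0.0] * n for _ in range(n)]
    cos_y = [[0.0] * n for _ in range(n)]
    fact = 1.0
    for m in range(terms):
        if m > 0:
            power = matmul(power, Y)   # Y^m
            fact *= m                  # m!
        sign = (-1.0) ** (m // 2)
        target = cos_y if m % 2 == 0 else sin_y
        for i in range(n):
            for j in range(n):
                target[i][j] += sign * power[i][j] / fact
    return sin_y, cos_y

def tan_of_self_dual(Y):
    """tan(Y) for self-dual Y, where cos(Y) is a multiple of the identity."""
    sin_y, cos_y = sin_cos(Y)
    scale = cos_y[0][0]
    return [[sin_y[i][j] / scale for j in range(4)] for i in range(4)]

Y = self_dual_form(0.3, -0.5, 0.2)
F = tan_of_self_dual(Y)
r = math.sqrt(0.3 ** 2 + 0.5 ** 2 + 0.2 ** 2)
print("residual of (9.4):", identity_residual(Y))
print("F self-dual:", is_self_dual(F))
print("F_12 versus tanh(r)/r * Y_12:", F[0][1], math.tanh(r) / r * Y[0][1])
```

The proportionality of F to Y is what makes the self-dual case simple; for the other conditions on Y mentioned above, no such shortcut is claimed here.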

10. Solitons with Born–Infeld Fields

Solitons of the Dirac–Born–Infeld field equations of a p-brane that involve both nonvanishing scalars and Born–Infeld fields are those that arise at the boundary of one brane ending on another. Such solitons preserve 1/4 of the bulk supersymmetry or less. We have not been able to find a calibration-like construction to deal with these solitons. So we shall use the bulk picture to determine the existence of such solutions and we shall give some new examples. It is convenient to categorize them according to the supersymmetry that they preserve.

10.1. Solitons preserving 1/4 of the supersymmetry. Most of these solitons have already been found. These include (i) the 0-brane soliton on all D-branes due to a fundamental string ending on them [5], (ii) the 0-brane soliton on the IIB NS-5-brane due to a D-string ending on it, (iii) the self-dual string soliton [39] on the M-5-brane due to a membrane ending on it, and (iv) the 2-brane soliton [39] of the IIB NS-5-brane due to a three brane ending on it. These solitons are related using T- and S-duality transformations of the associated bulk theories [40]. Apart from these soliton solutions, which are charged with respect to the standard 2-form Born–Infeld field, there are other solitons which are charged with respect to other p-form fields. One example is the domain wall solutions on the IIB D-5- and NS-5-branes [41]. We remark that in all the above solitons, the supersymmetry projectors are those of the associated bulk configuration. For example, the supersymmetry projectors of the self-dual string soliton on the M-5-brane are those of the M-5-brane and the M-2-brane. The energy bounds for some of the above solutions have been investigated in [42] and the geometry of their moduli spaces has been studied in [43].

10.2. Solitons preserving 1/8 of the supersymmetry. There are many such solitons. This can be seen either by studying the charges that appear in the various worldvolume supersymmetry algebras or by examining the bulk intersecting brane configurations. We shall not present a complete account of all such configurations. Instead we shall investigate a class of electrically charged solutions associated with D-branes. Let F be the Born–Infeld field of a D-p-brane and $\{y^i;\ i = 1, \dots, 9-p\}$ be the transverse scalars. For electrically charged solutions, we use the ansatz

$$F_{0a} = -\partial_a\phi, \quad a = 1, \dots, k, \qquad y^1 = \phi, \qquad y^m = y^m(x), \quad m = 2, \dots, 9-p. \qquad (10.1)$$

A standard computation using the results of [5] reveals that the field equations of the D-p-brane become

$$g^{ab}\nabla_a\partial_b\phi = 0, \qquad g^{ab}\nabla_a\partial_b y^m = 0, \qquad (10.2)$$

where

$$g_{ab} = \delta_{ab} + \partial_a y^m\,\partial_b y^n\,\delta_{mn} \qquad (10.3)$$

is the induced metric and ∇ is the Levi–Civita connection of g. The second equation in (10.2) can be solved by taking the maps {y m } to describe a degree k calibration

in $\mathbb{E}^{8+k-p}$. The first equation in (10.2) is just the harmonic function condition on the calibrated surface. Equation (10.2) can be easily solved if the associated calibration is Kähler. In this case the induced metric g on the calibrated manifold is Kähler. So the first equation of (10.2) can be solved by taking φ to be the real part of a holomorphic function h, i.e.

$$\phi = h(z) + \bar{h}(\bar{z}). \qquad (10.4)$$

There are many bulk configurations that correspond to the above solutions by allowing for different calibrations. One example is the 0-brane worldvolume soliton on the D-3-brane corresponding to the intersection

(i) D-3: 0, 1, 2, 3, −, −, −,
(ii) D-3: 0, 1, −, −, 4, 5, −,
(iii) F-1: 0, −, −, −, −, −, 6.    (10.5)

The associated calibration is a degree two SU(2) Kähler calibration in the directions 2, 3, 4, 5 and $y^1 = X^6$. The proportion of the supersymmetry preserved is 1/8. We shall present a complete discussion of the worldvolume solitons that preserve 1/8 of the supersymmetry or less elsewhere.

11. Conclusions

We have investigated the Killing spinor equations associated with the kappa-symmetry transformations of the worldvolume brane actions and shown that all the worldvolume solitons associated with calibrations are supersymmetric. The proof is based on the properties of the contact set of a calibration. We have then presented a bulk interpretation of all the calibrations in terms of intersecting branes. Next we have examined other supersymmetric worldvolume solutions, like singular solutions and solutions involving only Born–Infeld fields. Finally, we have found new worldvolume solutions of the Dirac–Born–Infeld action. In particular, we have presented a 0-brane worldvolume solution with the bulk interpretation of two intersecting D-3-branes with a fundamental string ending on them. Such a solution preserves 1/8 of the supersymmetry. So far we have considered solutions of the Nambu–Goto-type of actions which are embeddings into the ten- or eleven-dimensional Minkowski spacetimes. Alternatively, we can take as a spacetime any solution of ten- or eleven-dimensional supergravities. In particular, we can take $M_{(10)} = M_k \times \mathbb{E}^{(9-k,1)}$ or $M_{(11)} = M_k \times \mathbb{E}^{(10-k,1)}$, where $M_k$ is a manifold with one of the special holonomies of Table 1. Since for all these manifolds the Ricci tensor vanishes, the above spaces are solutions of the supergravity field equations by setting the rest of the fields either zero or constant. One can then consider solutions of the Nambu–Goto-type of actions for the spacetimes given as above (see also [17, 18]). There is a natural definition of a calibration in the manifolds with special holonomy using the covariantly constant forms (see Table 1). 
It turns out that such solutions of the Nambu–Goto action associated with these calibrations preserve the same proportion of supersymmetry as those we have studied for Minkowski spacetimes. It would be of interest to investigate further the Dirac–Born–Infeld type of actions. In particular, there may be a generalization of calibrations which also involves the nonvanishing Born–Infeld field. But unlike for calibrations, in this case it is not clear which geometric quantity is minimized. However, the definition of such a generalized calibration in terms of spinors is straightforward and it has been done in Sect. 3. Since it is

expected that there should be a consistent generalization of Dirac–Born–Infeld actions involving a non-abelian Born–Infeld field (see [44]), there may also exist a “non-abelian” generalization of calibrations.

Acknowledgements. We thank Jerome Gauntlett for helpful discussions and for telling us about related work which will appear in a forthcoming paper of his with Neil Lambert and Peter West. G.P. is supported by a University Research Fellowship from the Royal Society.

References

1. Papadopoulos, G. and Townsend, P.K.: Intersecting M-branes. Phys. Lett. B380, 273 (1996)
2. Strominger, A.: Open p-branes. Phys. Lett. B383, 44; hep-th/9512059
3. Townsend, P.K.: Brane Surgery. Nucl. Phys. Proc. Suppl. 58, 163 (1997)
4. Callan, C.G., Jr. and Maldacena, J.M.: Brane dynamics from the Born–Infeld action. hep-th/9708147
5. Gibbons, G.W.: Born–Infeld particles and Dirichlet p-branes. hep-th/9709027
6. Gibbons, G.W.: Wormholes on the Worldvolume: Born–Infeld Particles and Dirichlet p-Branes. hep-th/9801106
7. Green, M.B. and Schwarz, J.H.: Covariant Description of Superstrings. Phys. Lett. 136B, 367 (1984)
8. Bergshoeff, E., Sezgin, E. and Townsend, P.K.: Supermembranes and Eleven-Dimensional Supergravity. Phys. Lett. 189B, 75 (1987)
9. Cederwall, M., von Gussich, A., Nilsson, A., Sundell, P. and Westerberg, A.: The Dirichlet Super P-Branes in Ten-Dimensional Type IIA and IIB Supergravity. Nucl. Phys. B490, 179 (1997)
10. Aganagic, M., Popescu, C. and Schwarz, J.H.: D-Brane Actions with Local Kappa Symmetry. Phys. Lett. B393, 311 (1997)
11. Bergshoeff, E. and Townsend, P.K.: Super D-Branes. Nucl. Phys. B490, 145 (1997)
12. Howe, P.S. and Sezgin, E.: D=11, p=5. Phys. Lett. B394, 62 (1997)
13. Bandos, I., Lechner, K., Nurmagambetov, A., Pasti, P., Sorokin, P. and Tonin, M.: Covariant Action for the Super-Five-Brane of M-theory. hep-th/9703127
14. Aganagic, M., Park, J., Popescu, C. and Schwarz, J.H.: Worldvolume Action of the M-Theory Five-Brane. hep-th/9701166
15. Harvey, R. and Blaine Lawson, H., Jr.: Calibrated Geometries. Acta Math. 148, 47 (1982)
16. Harvey, F.R.: Spinors and Calibrations. New York: Academic Press, 1990
17. Becker, K., Becker, M. and Strominger, A.: Fivebranes, Membranes and Nonperturbative String Theory. Nucl. Phys. B456, 130 (1995)
18. Becker, K., Becker, M., Morrison, D.R., Ooguri, H., Oz, Y. and Yin, Z.: Supersymmetric Cycles in Exceptional Holonomy Manifolds and Calabi-Yau 4-folds. hep-th/9608116
19. Douglas, M.R.: Branes within Branes. hep-th/9512077
20. Izquierdo, J.M., Lambert, N.D., Papadopoulos, G. and Townsend, P.K.: Dyonic Branes. Nucl. Phys. B460, 560 (1996)
21. Papadopoulos, G. and Townsend, P.K.: Kaluza–Klein on the brane. Phys. Lett. B393, 59 (1997)
22. Bergshoeff, E., Kallosh, R., Papadopoulos, R. and Ortin, T.: kappa-symmetry supersymmetry and intersecting branes. Nucl. Phys. B502, 149 (1997)
23. Berkooz, M., Douglas, M.R. and Leigh, R.G.: Branes Intersecting at Angles. Nucl. Phys. B480, 265 (1996)
24. Gauntlett, J.P., Gibbons, G.W., Papadopoulos, G. and Townsend, P.K.: Hyper-Kähler Manifolds and Multiply Intersecting Branes. Nucl. Phys. B500, 133 (1997)
25. Wang, McKenzie Y.: Parallel Spinors and Parallel Forms. Ann. Global Anal. Geom. 7, 59 (1989)
26. Donaldson, S.K.: Anti Self-dual Yang-Mills Connections over Complex Algebraic Surfaces and Stable Bundles. Proc. London Math. Soc. 50, 1 (1985)
27. Corrigan, E., Goddard, P. and Kent, A.: Some Comments on the ADHM Construction in 4k Dimensions. Commun. Math. Phys. 100, 1 (1985)
28. Fubini, S. and Nicolai, H.: The Octonionic Instanton. Phys. Lett. 155B, 369 (1985)
29. Ivanova, T.A.: Octonions, Self-Duality and Strings. Phys. Lett. B315, 277 (1993)
30. Gunaydin, M. and Nicolai, H.: Seven-Dimensional Octonionic Yang-Mills Instanton and its Extension to an Heterotic String Soliton. Phys. Lett. B351, 169 (1995)
31. Papadopoulos, G. and Teschendorff, A.: Instantons at Angles. hep-th/9708116
32. Tseytlin, A.A.: Harmonic Superposition of M-Branes. Nucl. Phys. B475, 149 (1996)
33. Gauntlett, J.P., Kastor, D.A. and Traschen, J.: Overlapping Branes in M-theory. Nucl. Phys. B478, 544 (1996)
34. Witten, E.: Solutions of four-dimensional field theories via M-theory. Nucl. Phys. B500, 3 (1997)
35. Schwarz, J.H.: Lectures on Superstrings and M-theory Dualities. Nucl. Phys. Proc. Suppl. 55B, 1 (1997)
36. Sen, A.: String Network. hep-th/9711130
37. Krogh, M. and Lee, S.: String Network from M-theory. hep-th/9712050
38. Rey, Soo-Jong and Yee, Jung-Tay: BPS Dynamics of Triple (p,q) String Junction. hep-th/9711202
39. Howe, P.S., Lambert, N.D. and West, P.: The self-dual string soliton. hep-th/9709014
40. Papadopoulos, G.: T-duality and the worldvolume solitons of five-branes and KK-monopoles. hep-th/9712162
41. Bergshoeff, E., van der Schaar, J.P. and Papadopoulos, G.: Domain Walls on the Brane. hep-th/9801158
42. Gauntlett, J., Gomis, J. and Townsend, P.K.: BPS Bounds for Worldvolume Branes. hep-th/9711205
43. Papadopoulos, G. and Gutowski, J.: The moduli spaces of worldvolume brane solitons. hep-th/9802186
44. Tseytlin, A.A.: A Non-Abelian Generalization of Born–Infeld Action in String Theory. Nucl. Phys. B501, 41 (1997)

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 202, 621 – 628 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

On the Zero Set of the Wave Function in Superconductivity

Jorge Berger¹, Jacob Rubinstein²

¹ Department of Physics, Technion, 32000 Haifa, Israel. E-mail: [email protected]
² Department of Mathematics, Technion, 32000 Haifa, Israel. E-mail: [email protected]

Received: 8 July 1997/ Accepted: 28 January 1999

Abstract: We consider the Ginzburg Landau functional in a multiply connected planar domain with enclosed magnetic flux. Particular attention is given to the zero set of the order parameter. We show that there exist applied fields for which the zero set is of codimension 1.

1. Introduction

The Ginzburg Landau (GL) model of superconductivity concerns an order parameter u, which is a complex valued function, and the magnetic field vector potential A. The problem is forced by an applied magnetic field $H_e$. The local minimizers describe the possible equilibrium configurations under the external constraints. The square of the absolute value of u measures the density of the superconducting electrons, while the phase of u is related to the supercurrent. We shall consider here the GL functional in a smooth two dimensional domain that is homeomorphic to an annulus. We shall refer to such domains as rings. Of particular interest are the zeros of u. It is often argued that the zero set, which we shall denote by S, should consist of isolated points. The heuristic reasoning is that since u is complex, its zeros are the intersection of the zero level lines of its real and imaginary parts. Moreover, Elliott et al. have recently proved [7] that under some smoothness assumptions on the given data, S is indeed a finite collection of points. We shall show that there exist fields $H_e$ for which S consists of a curve connecting the inner boundary of the ring to its outer boundary. When this occurs we say that u is in the singly connected state [2]. The solution we discovered exists for some temperature range below the critical temperature $T_c$ at which the transition to superconductivity occurs. A by-product of our result is that the Elliott–Matano–Qi theorem [7] is valid only for simply connected domains. When the ring is very thin, the GL functional can be approximated by a one dimensional model on a curve [12]. We have studied this model in detail [3] and derived an

intricate picture of phase transitions when there exist nonuniformities in the ring’s cross section. Our results here give a rigorous justification to the phase transition diagram of [3]. In the thin ring limit the line of zeros collapses to a point. The appearance of a singly connected state has profound effects on the behavior of the supercurrent as a function of $H_e$. Experimental aspects of the new phase are discussed in [4].

2. Formulation

We write the GL functional in nondimensional form

$$G_\lambda(u, A) = \int_\Omega \left( \lambda\left[-|u|^2 + |u|^4/2\right] + |(\nabla - iA)u|^2 \right) dx + \kappa^2\lambda^{-1}\int_{\mathbb{R}^2} |\nabla\times A - H_e|^2\,dx. \qquad (2.1)$$

Here λ is a parameter that vanishes at $T_c$ and increases with $T_c - T$, where $T$ ($T_c$) is the temperature (critical temperature), and κ is a material parameter. The order parameter u is a complex valued function, A is the magnetic vector potential, and the applied magnetic field is denoted by $H_e$. Defining the inner perimeter of the ring as 2πR, we choose R as the lengthscale (rather than the coherence length or the penetration length). The external field is given in dimensional units by $\Phi_0 H_e/2\pi R^2$, and the dimensional vector potential is $\Phi_0 A/2\pi R$, where $\Phi_0$ is the fundamental flux quantum. We comment that the form (2.1) is slightly different from the form used in other mathematical texts ([6, 11]). In our scaling the temperature appears explicitly in the functional, which is advantageous for the purpose of studying phase transitions. We assume that $H_e$ is a smooth vector field in the direction orthogonal to the plane, and that its support does not intersect $\Omega$. Let $A_e$ be a smooth solution to $\nabla\times A_e = H_e$, $\nabla\cdot A_e = 0$. It is convenient to split A into the sum of the applied vector potential $A_e$ and an induced potential $A_i$. The Euler Lagrange equations associated with (2.1) are

$$-(\nabla - iA)^2 u + \lambda u(|u|^2 - 1) = 0, \quad x \in \Omega, \qquad (2.2)$$

$$\nabla\times(\nabla\times A - H_e) = \lambda\kappa^{-2}\,\mathrm{Im}\big(\bar{u}(\nabla - iA)u\big)\,\mathbf{1}_\Omega, \quad x \in \mathbb{R}^2, \qquad (2.3)$$

$$(\nabla - iA)u\cdot\nu = 0, \quad x \in \partial\Omega, \qquad (2.4)$$

where ν is the unit normal to $\partial\Omega$. We use the gauge invariance of (2.1) [6] to impose the additional constraint $\nabla\cdot A = 0$. We can replace Eq. (2.3) for A by an equation for $A_i$:

$$\nabla\times\nabla\times A_i = \lambda\kappa^{-2}\,\mathrm{Im}\big(\bar{u}(\nabla - iA)u\big)\,\mathbf{1}_\Omega, \quad x \in \Omega\cup\Omega_0, \qquad (2.5)$$

supplemented with the boundary condition

$$\nabla\times A_i = 0, \quad x \in \partial\Omega\setminus\partial\Omega_0. \qquad (2.6)$$

Here $\Omega_0$ is the hole bounded by the ring. The normal state is described by the pair $(u, A) = (0, A_e)$, which is clearly a solution to (2.2)–(2.4). We expect the normal state to be a stable solution for high temperatures (small λ), and to lose stability as λ is increased past a critical value that we denote by

$\lambda_p$ (associated with the phase transition temperature). The bifurcation equation at $\lambda_p$ is obtained by linearizing (2.2)–(2.4) about the normal state:

$$-(\nabla - iA_e)^2 u - \lambda_p u = 0, \quad x \in \Omega, \qquad (2.7)$$

$$(\nabla - iA_e)u\cdot\nu = 0, \quad x \in \partial\Omega. \qquad (2.8)$$

Since $\lambda_p$ is the first λ for which there is a nontrivial solution for (2.7)–(2.8), there is a variational characterization for it:

$$\lambda_p = \inf GL(u) = \inf \int_\Omega |(\nabla - iA_e)u|^2\,dx, \qquad (2.9)$$

where the minimization is taken under the constraint $\int_\Omega |u|^2\,dx = 1$. It has been recognized since the classical work of Little and Parks [8] that the flux of $H_e$ through the hole bounded by the ring, i.e. the integral of $H_e$ over this area, is an important physical quantity. The flux divided by 2π will be denoted by $\Phi$. It will be shown that a necessary condition for the singly connected state to appear is that $\Phi$ is as far as possible from an integer. The following geometrical-spectral characterization will be useful in the sequel:

Definition Q. Let $\Omega$ be a ring-like domain, and let $\Gamma$ be a curve connecting the inner boundary of the ring to its outer boundary. We denote the class of such curves by L. We introduce the scalar eigenvalue problem

$$\mu_\Omega(\Gamma) = \inf_{y\in\mathcal{A}} \int_\Omega |\nabla y|^2\,dx, \qquad (2.10)$$

where $\mathcal{A} = \{y \in H^1(\Omega),\ \int_\Omega y^2\,dx = 1,\ y|_\Gamma = 0\}$. Let

$$\mu_\Omega = \inf_{\Gamma\in L} \mu_\Omega(\Gamma). \qquad (2.11)$$

A domain $\Omega$ is said to be of type Q if the infimum in (2.11) is achieved at a single curve $\Gamma^*$.

3. The Linear Problem

We first establish our result for the linear bifurcation problem (2.7)–(2.8).

Theorem 1. Assume $\Omega$ is of type Q. Let

$$2\Phi = 2l + 1 \qquad (3.1)$$

for some integer l. Then the solution of (2.7)–(2.8) with the lowest eigenvalue vanishes along a curve $\Gamma$ of class L. Moreover, the polar form of the solution is $u(x) = Y(x)e^{i\phi(x)}$, where $\phi(x)$ satisfies

$$\nabla\phi = A_e. \qquad (3.2)$$

The curve $\Gamma$, the amplitude Y and the eigenvalue $\lambda_p$ are the solution to the minimization problem (2.10)–(2.11).
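The role of the half-integer flux condition (3.1) can be illustrated with a short calculation. The sketch below assumes, as in the Formulation section, that the flux divided by 2π is the quantity entering (3.1), so that the circulation of the applied potential around the hole is 2π times that number; the function names `holonomy` and `classify` are hypothetical. If the phase satisfies (3.2), one circuit around the ring multiplies exp(iφ) by the holonomy factor, and a real amplitude Y can keep u single valued only if that factor is +1 (no zero forced) or −1 (Y must change sign, hence vanish somewhere on every circuit).

```python
import cmath
import math

def holonomy(flux):
    """exp(i * circulation of A_e around the hole), with circulation = 2*pi*flux."""
    return cmath.exp(2j * math.pi * flux)

def classify(flux, tol=1e-9):
    """Consequence of grad(phi) = A_e for a real amplitude Y at the given flux."""
    h = holonomy(flux)
    if abs(h - 1) < tol:
        return "no zero forced"
    if abs(h + 1) < tol:
        return "amplitude must change sign"
    return "real amplitude impossible"

for flux in (0.0, 0.5, 1.0, 2.5, 0.3):
    print(flux, classify(flux))
```

Only the half-integer values produce the sign change that the proof of Theorem 1 turns into a curve of zeros of class L.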

Proof. Since the solutions of (2.7)–(2.8) are critical points of (2.9), the real and imaginary parts of u are analytic functions [9]. We recall Theorem 2.6 of [7], which states that the zero set S consists only of isolated points or curves which end on the boundary of $\Omega$. While this theorem is actually stated for the nonlinear functional, it obviously applies also to the linear case. Thus the phase is locally well defined almost everywhere. Moreover, using the same arguments as in Corollary 2.7 of [7], the only possible curves in S must be of class L. Let $\bar\phi$ be a solution to (3.2). All other solutions are of the form $\phi_c = \bar\phi + c$ for some constant c. Consider first a solution of (2.7) of the form $u_c = Y e^{i\phi_c}$ for some c. Let C be a curve circulating once around the ring. Integrating (3.2) along C, using (3.1), and applying Stokes' theorem, we find that $u_c$ is not single valued unless it has a zero somewhere on C. It follows that Y must vanish along some curve $\Gamma_c$ of class L. Also, normalizing Y to have a unit $L^2$ norm, we see by substituting (3.2) into (2.9) that

$$\int_\Omega |(\nabla - iA_e)u_c|^2\,dx = \mu_\Omega(\Gamma_c). \qquad (3.3)$$

Assume now in contradiction that there exists a solution $u_1 = Y_1 e^{i\phi_1}$ of (2.7)–(2.8) whose phase does not satisfy (3.2). Using (3.1), we see that $u_2 = Y_1 e^{i(2\phi_c - \phi_1)}$ is also a solution. But since (2.7)–(2.8) is linear, any linear combination of $u_1$ and $u_2$ is a solution too with the same eigenvalue $\lambda_p$. Consider for example

$$u^*_c = u_1 - u_2 = 2iY_1 \sin(\phi_1 - \phi_c)\,e^{i\phi_c} \qquad (3.4)$$

for some c. Clearly we can find a value c such that the equation $\phi_c(x) - \phi_1(x) = 0$ defines a curve $\gamma_c$. Since $\gamma_c$ consists of zeros of $u^*_c$, it must be of class L. Returning to (2.9), normalizing $u^*_c$ to have a unit $L^2$ norm, and using (3.2), we obtain the following identity for $u^*_c$:

$$\lambda_p = \int_\Omega |(\nabla - iA_e)u^*_c|^2\,dx = \mu_\Omega(\gamma_c). \qquad (3.5)$$

But varying c we obtain a smooth family of curves $\gamma_c$ satisfying (3.5), in contradiction to our assumption that $\Omega$ is of type Q. Moreover, property Q implies that the zero set of u consists of the special curve $\Gamma^*$, the eigenvalue $\lambda_p$ equals $\mu_\Omega$, and the amplitude of the order parameter is the minimizer defined in Q.

4. The Nonlinear Equations

The following theorem establishes the existence of singly connected solutions also to the nonlinear equations (2.2)–(2.4) for some temperature interval below $T_c$.

Theorem 2. Assume $\Omega$ is of type Q, and (3.1) holds. Then there exists an interval $I_\delta = (\lambda_p, \lambda_p + \delta)$, δ > 0, such that for every $\lambda \in I_\delta$, there exists a solution to (2.2)–(2.4) that vanishes along a curve of class L. The phase of the solution satisfies (3.2) and the current vanishes everywhere.

Proof. We have shown (Theorem 1) that a unique solution $u_p$ bifurcates at $\lambda = \lambda_p$, $2\Phi = 2l + 1$. Applying the Crandall–Rabinowitz abstract bifurcation theory ([5, Theorem 2.4]), one can show that there is a unique branch originating at $u_p$. The details are similar to the examples in [5] and to Theorem 8.6 of [1], so we do not spell them out.

Fix $\lambda > \lambda_p$, and let $(u, A) = (Y e^{i\phi}, A_e + A_i)$ be a solution along this branch. If φ satisfies (3.2), then $A_i = 0$, and the result follows by the same arguments as in Theorem 1. If, on the other hand, φ does not satisfy (3.2), then from Eq. (2.5) $A_i$ cannot vanish identically. Consider now

$$(v, B) = (Y e^{i(2\phi_c - \phi)}, A_e - A_i), \qquad (4.1)$$

where $\phi_c$ is a solution of (3.2). Condition (3.1) implies that (v, B) is smooth, and it is easy to check that (v, B) is also a solution of (2.2)–(2.4) and, moreover, $G_\lambda(v, B) = G_\lambda(u, A)$. We remark that u and v are of different homotopy types: Let the homotopy type of u be $d_1$. This means $\int_C \nabla\phi = d_1$ (in units of 2π), where C is any curve circulating once about the ring. But then the homotopy type of v is $d_2 = 2l + 1 - d_1$, and since $d_1$ and l are integers, $d_1 \neq d_2$ and therefore $v \neq u$. It follows that at least in a small λ interval to the right of $\lambda_p$, the phase φ must satisfy (3.2). We point out that the energy of the singly connected solutions is smaller than the energy of the normal state. To see this we substitute (3.2) and $A_i = 0$ in (2.2)–(2.4). We find that the amplitude Y solves

$$-\Delta Y + \lambda Y(Y^2 - 1) = 0, \quad x \in \Omega, \qquad (4.2)$$

$$\nu\cdot\nabla Y = 0, \quad x \in \partial\Omega. \qquad (4.3)$$

Multiplying (4.2) by Y, integrating by parts and using (3.2) again, we get

$$G_\lambda = -\frac{\lambda}{2}\int_\Omega Y^4\,dx < 0. \qquad (4.4)$$

5. The One Dimensional Model

When the ring is very thin, the GL model can be approximated by a functional on a curve M forming the ring’s skeleton. The one dimensional model was used by several authors (see e.g. [10]). It was rigorously derived in [12]:

$$G^1_\lambda(u) = \int_M \left( \lambda\left[-|u|^2 + |u|^4/2\right] + |(\partial_\theta - iA_{e\tau})u|^2 \right) D(\theta)\,d\theta. \qquad (5.1)$$

Here θ is a variable along M, D(θ) is a strictly positive smooth function that measures the thickness of the ring at θ, and $A_{e\tau}$ is the tangential (along M) component of $A_e$. The Euler Lagrange equation associated with (5.1) is

$$(\partial_\theta - iA_{e\tau})D(\partial_\theta - iA_{e\tau})u + \lambda D u(1 - |u|^2) = 0, \qquad (5.2)$$

with periodic boundary conditions. The Q property is now replaced by

Definition Q1. Consider the scalar eigenvalue problem

$$\mu_M(\alpha) = \inf_{y\in\mathcal{B}} \int_M |\partial_\theta y(\theta)|^2 D(\theta + \alpha)\,d\theta, \qquad (5.3)$$

where $\mathcal{B} = \{y \in H^1(M),\ \int_M D y^2 = 1,\ y(0) = 0\}$. Let

$$\mu_M = \inf_\alpha \mu_M(\alpha). \qquad (5.4)$$

A pair (M, D) is said to be of type Q1 if the infimum in (5.4) is achieved at a single point α.

In [3] we systematically studied the phase diagram for the model (5.1) under the assumption that the ring is weakly nonuniform, i.e. $D(\theta) = 1 + \epsilon D_1(\theta)$, where $\epsilon$ is a small positive parameter. The following theorem provides theoretical support to the overall structure of the diagram for any thickness function D(θ) that satisfies condition Q1:

Theorem 3. Assume (M, D) is of type Q1, and $\Phi$ satisfies (3.1). Then
1. There exists a critical value $\lambda^1_p$, where the trivial solution to (5.2) bifurcates into a singly connected state, and an interval $I^1_\delta = (\lambda^1_p, \lambda^1_p + \delta)$, δ > 0, where the solution to (5.2) that bifurcated from the normal state is singly connected.
2. There is another value $\lambda^1_q$, such that for all $\lambda \in (\lambda^1_q, \infty)$ the global minimizer is not singly connected.

Proof. The first part of the theorem follows by the same arguments as in Theorem 2. The critical value $\lambda^1_p$ is now the eigenvalue of the one dimensional linearization of (5.2). It can be written variationally as

$$\lambda^1_p = \inf \int_M |(\partial_\theta - iA_{e\tau})u|^2 D(\theta)\,d\theta, \qquad (5.5)$$

where the minimization is taken over all functions u satisfying $\int_M |u|^2 D\,d\theta = 1$. Since the magnetic vector potential is fixed in this case, it follows from (5.1) and (5.5) that the normal state (u ≡ 0) is the global minimizer for all $\lambda \le \lambda^1_p$. The bifurcating singly connected state Y(θ) satisfies an inequality similar to (4.4),

$$G^1_\lambda = -\frac{\lambda}{2}\int_M D Y^4\,d\theta < 0, \qquad (5.6)$$

which implies its stability. We further remark that in the one dimensional setup, the equation $\partial_\theta\phi = A_{e\tau}$, replacing (3.2), is always solvable. To prove the second part, we recall [3] that in the one dimensional case the Euler Lagrange equation for the phase of the order parameter can be integrated explicitly in terms of the applied magnetic field. The energy functional for doubly connected functions (i.e. functions that do not vanish on M [2]) takes the form

$$G^1_\lambda(Y) = \int_M \left( \lambda\left[-Y^2 + Y^4/2\right] + (\partial_\theta Y)^2 \right) D(\theta)\,d\theta + \frac{(2\pi k)^2}{\Lambda}. \qquad (5.7)$$

Here Y is the amplitude of u, $k = N - \Phi$, where N is the circulation of the phase divided by 2π, and $\Lambda = \int_M \frac{d\theta}{D Y^2}$. Clearly, to minimize the energy, N is chosen to minimize $k^2$. Our assumption on $\Phi$ implies $k^2 = 1/4$. The energy functional for singly connected functions is similar except that the last term is omitted. The Euler Lagrange equation for the amplitude of the singly connected state is

$$\partial_\theta(D\,\partial_\theta Y) + \lambda D(Y - Y^3) = 0, \quad Y(0) = Y(2\pi) = 0, \qquad (5.8)$$

where, without loss of generality, we chose the angle at which Y vanishes to be θ = 0. It is easy to verify that the solution to (5.8) satisfies 0 ≤ Y ≤ 1. Set $e = D(\partial_\theta Y)^2 + \lambda D(Y^2 - Y^4/2)$. Differentiating e with respect to θ and using (5.8) we get

$$\partial_\theta e + D^{-1}\partial_\theta D\,e = 2\lambda D^{-1}\partial_\theta D\,(Y^2 - Y^4/2). \qquad (5.9)$$

If the singly connected state is a global minimizer, its energy should be negative. Therefore there must be at least one point where $|\partial_\theta Y| \le C\lambda^{1/2}$ for some constant C. Applying Gronwall’s lemma to (5.9), we conclude that $|\partial_\theta Y| \le C\lambda^{1/2}$ for all θ. Multiplying (5.8) by Y and integrating by parts, we get for the singly connected state [3]

$$G^1_\lambda(Y) = -\frac{\lambda}{2}\int_M D Y^4\,d\theta. \qquad (5.10)$$

We also compute

$$G^1_\lambda(Y = 1) = -\frac{\lambda}{2}\int_M D + \frac{\pi^2}{\int_M D^{-1}}. \qquad (5.11)$$
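As a sanity check on (5.11), the energy (5.7) can be evaluated numerically at the constant amplitude Y ≡ 1. The sketch below is a minimal illustration under stated assumptions: the curve M is parametrized by θ in [0, 2π) (as suggested by the boundary condition in (5.8)), the integrals are approximated by the midpoint rule, and the function names and the sample thickness profiles are hypothetical choices, not taken from the paper. For a half-integer flux the best integer N gives k² = 1/4, so the last term of (5.7) becomes π² divided by the integral of 1/D.

```python
import math

def min_k_squared(phi):
    """Smallest value of (N - phi)**2 over integers N."""
    n0 = math.floor(phi)
    return min((n - phi) ** 2 for n in (n0, n0 + 1))

def integrate(f, n=2000):
    """Midpoint rule over one period [0, 2*pi)."""
    h = 2.0 * math.pi / n
    return h * sum(f((i + 0.5) * h) for i in range(n))

def energy_constant_state(lam, D, phi):
    """Energy (5.7) evaluated at Y = 1 for thickness D(theta) and flux phi."""
    int_D = integrate(D)
    big_lambda = integrate(lambda t: 1.0 / D(t))   # Lambda in (5.7) at Y = 1
    k2 = min_k_squared(phi)
    return -0.5 * lam * int_D + (2.0 * math.pi) ** 2 * k2 / big_lambda

uniform = lambda t: 1.0                        # uniform ring (used only as a check)
bumpy = lambda t: 1.0 + 0.1 * math.cos(t)      # hypothetical nonuniform thickness

print(min_k_squared(2.5))
print(energy_constant_state(1.0, uniform, 0.5), -math.pi / 2)
print(energy_constant_state(5.0, bumpy, 0.5))
```

For the uniform profile and λ = 1 the value should agree with −π + π/2 = −π/2, which matches (5.11) term by term.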

The singly connected state vanishes at some point, and the derivative estimate we have obtained implies that, for large λ, Y is small on some interval of size $\lambda^{-p}$ about this point for every p ∈ (1/2, 1). Thus

$$\frac{\lambda}{2}\int_M D - \frac{\lambda}{2}\int_M D Y^4\,d\theta \ge O(\lambda^{1-p}), \quad p \in (1/2, 1). \qquad (5.12)$$

Therefore, for λ sufficiently large, the constant order parameter Y ≡ 1 has lower energy than any singly connected state.

6. Discussion

We have shown that the zero set of the order parameter can be of codimension 1. Our construction is based on the solvability of the phase equation (3.2), which requires the applied magnetic field to vanish in $\Omega$, while still allowing for nonzero flux. Therefore the topology of the ring is crucial. Elliott, Matano and Qi proved in [7] that the zero set of the minimizers to the Ginzburg Landau functional in two dimensional domains consists of isolated points. We have given here an example of domains for which the zero set consists of a curve. The reason for the disagreement is that the construction in Lemma 2.7 of [7] is valid only for simply connected domains. The bifurcation equation (2.7)–(2.8) can also be interpreted as the Schrödinger equation for a charged particle in a ring under an external magnetic field and with zero probability current through the boundary. Thus Theorem 1 implies that whenever the magnetic flux, measured in fundamental flux units, is an integer plus a half, there exists a “forbidden curve” for the ground state. The GL energy in thin annular domains can be shown [12] to converge to the weighted GL energy on a circle, where the weight is determined by the local thickness of the annulus. In [2] and [3] we have used asymptotic and numerical methods to study a weighted GL energy on a circle. The calculations were done under the assumption that the thickness is nearly uniform or piecewise constant. 
We predicted the existence of a temperature range in which a new superconducting phase will emerge if the weight is nontrivial, which corresponds to asymmetric thickness. The new phase is characterized by a smooth transition between states with different angular quantum numbers. Theorem 3 provides a theoretical support to the phase transition picture for arbitrary thickness, provided condition Q1 holds.

The geometrical-spectral characterization of $\Omega$ that we introduced in Definition Q poses a novel and challenging problem in the calculus of variations. Essentially condition Q (or Q1) means that the domain is not symmetric. It does not hold for the symmetric annulus confined between two concentric circles. We conjecture that Q is a generic property.

Acknowledgement. This work was supported by the US-Israel Binational Science Foundation. We thank G. Wolansky for helpful conversations.

References

1. Bauman, P., Phillips, D. and Qi, T.: Stable nucleation of the Ginzburg Landau system with an applied magnetic field. Arch. Rat. Mech. Anal. 142, 1–43 (1998)
2. Berger, J. and Rubinstein, J.: Topology of the order parameter in the Little Parks experiment. Phys. Rev. Lett. 75, 320–322 (1995)
3. Berger, J. and Rubinstein, J.: Bifurcation analysis for phase transitions in nonuniform superconducting rings. SIAM J. Appl. Math. 58, 103–121 (1998)
4. Berger, J. and Rubinstein, J.: Signatures for the second critical point in the phase diagram of a superconducting ring. Phys. Rev. B 56, 5124–5127 (1997)
5. Crandall, M.G. and Rabinowitz, P.H.: Bifurcation from simple eigenvalues. J. Func. Anal. 8, 321–340 (1971)
6. Du, Q., Gunzburger, M.D. and Peterson, J.: Analysis and approximation of the Ginzburg Landau model of superconductivity. SIAM Review 34, 529–560 (1992)
7. Elliott, C.M., Matano, H. and Qi, T.: Zeros of complex Ginzburg Landau order parameter with application to superconductivity. Europ. J. Appl. Math. 5, 431–448 (1994)
8. Little, W.A. and Parks, R.D.: Observation of quantum periodicity in the transition temperature of a superconducting cylinder. Phys. Rev. Lett. 9, 9–12 (1962)
9. Morrey, C.B.: Multiple Integrals in the Calculus of Variations. Berlin–Heidelberg–New York: Springer-Verlag, 1966
10. Pannetier, B.: Superconducting wire networks. In: Quantum Coherence in Mesoscopic Systems, ed. B. Kramer, New York: Plenum Press, 1991, pp. 457–484
11. Rubinstein, J.: Six lectures on superconductivity. In: Boundaries, Interfaces and Transitions, M. Delfour, ed., CRM Lecture Notes 13, Providence, RI: American Mathematical Society, 1998
12. Rubinstein, J. and Schatzman, M.: Asymptotics for thin superconducting rings. J. Math. Pure Appl. 77, 801–820 (1998)
13. Tinkham, M.: Introduction to Superconductivity. New York: McGraw-Hill, 1996

Communicated by J. L. Lebowitz

Commun. Math. Phys. 202, 629 – 649 (1999)

Communications in Mathematical Physics © Springer-Verlag 1999

Nodal Sets for Groundstates of Schrödinger Operators with Zero Magnetic Field in Non Simply Connected Domains*

B. Helffer¹, M. Hoffmann-Ostenhof², T. Hoffmann-Ostenhof³,⁴, M. P. Owen⁴

¹ Département de Mathématiques, Bâtiment 425, Université Paris-Sud, F-91405 Orsay Cedex, France. E-mail: [email protected]
² Institut für Mathematik, Universität Wien, Strudthofgasse 4, A-1090 Wien, Austria. E-mail: [email protected]
³ Institut für Theoretische Chemie, Universität Wien, Währingerstrasse 17, A-1090 Wien, Austria. E-mail: [email protected]
⁴ International Erwin Schrödinger Institute for Mathematical Physics, Boltzmanngasse 9, A-1090 Wien, Austria. E-mail: [email protected]

Received: 23 July 1998 / Accepted: 17 November 1998

Abstract: We investigate nodal sets of magnetic Schrödinger operators with zero magnetic field, acting on a non simply connected domain in $\mathbb{R}^2$. For the case of circulation 1/2 of the magnetic vector potential around each hole in the region, we obtain a characterisation of the nodal set, and use this to obtain bounds on the multiplicity of the groundstate. For the case of one hole and a fixed electric potential, we show that the first eigenvalue takes its highest value for circulation 1/2.

1. Introduction and Statement of Results

Let $\Omega \subset \mathbb{R}^2$ be a region with smooth ($C^\infty$) boundary, which is homeomorphic to a disk with $k$ holes, and consider the magnetic Schrödinger operator

$$H_{A,V} := (i\nabla + A)^2 + V \tag{1.1}$$

acting on $L^2(\Omega)$ with Neumann boundary conditions. The potential $V$ is assumed to be smooth, and we consider a smooth magnetic vector potential $A$ which corresponds to a zero magnetic field. That is,

$$B := \operatorname{curl} A = 0 \tag{1.2}$$

in $\Omega$. Assumption (1.2) implies that in any simply connected, open subset of $\Omega$, there exists a gauge function $\phi$ such that

$$\nabla\phi = A. \tag{1.3}$$

We shall see that the operator $H_{A,V}$ is unitarily equivalent to the non-magnetic Schrödinger operator $H_{0,V}$ if and only if one can extend this local gauge $e^{i\phi}$ to a globally defined function such that $\phi$ (which might not be a single-valued function) satisfies (1.3). We shall see that this can be done precisely when each of the circulations

$$\Phi_i = \frac{1}{2\pi}\oint_{\sigma_i} A\cdot dx \tag{1.4}$$

of $A$ round the $i$th hole ($i = 1,\dots,k$) takes an integer value. Here $\sigma_i$ is a closed path¹ which parametrises the boundary $\Sigma_i$ of the $i$th hole and turns once in an anti-clockwise direction. Furthermore, if the circulations $\Phi = (\Phi_1,\dots,\Phi_k)$ of two distinct vector potentials $A$ and $A'$ are equal modulo $\mathbb{Z}^k$ then the corresponding operators $H_{A,V}$ and $H_{A',V}$ are unitarily equivalent under a gauge transformation.

* Funded by the European Union TMR grant FMRX-CT 96-0001

Theorem 1.1. Let $\Omega \subset \mathbb{R}^2$ be a region with smooth boundary, which is homeomorphic to a disk with $k$ holes. For a given smooth potential $V$, the first eigenvalue $\lambda_1$ of the magnetic Schrödinger operator $H_{A,V}$, where $A$ satisfies (1.2), depends only on the circulations $\Phi = (\Phi_1,\dots,\Phi_k)$ of $A$. The function $\lambda_1(\Phi)$ has the following properties (in which $l \in \mathbb{Z}^k$ is arbitrary):

$$\lambda_1(\Phi + l) = \lambda_1(\Phi), \tag{1.5}$$

$$\lambda_1(l/2 + \Phi) = \lambda_1(l/2 - \Phi), \tag{1.6}$$

$$\lambda_1(\Phi) > \lambda_1(0,\dots,0) \quad \text{for } \Phi \notin \mathbb{Z}^k. \tag{1.7}$$

For the case $k = 1$, we have in addition to Eq. (1.7) that

$$\lambda_1(\Phi) < \lambda_1(1/2) \tag{1.8}$$

for $\Phi \notin 1/2 + \mathbb{Z}$.

Equations (1.5), (1.6) and inequality (1.7) are straightforward, and are proved in Sect. 2 (see also Remark 2.2). In this context we should also mention the recent very interesting results [HN97] by Herbst and Nakamura concerning large magnetic fields.

We choose Neumann boundary conditions on $H_{A,V}$ in this article because we were motivated by questions arising in the Ginzburg-Landau model of superconductivity. Our results are also valid for the case of Dirichlet boundary conditions (see Remark 1.5 (vi)). Dirichlet boundary conditions are related to the Aharonov-Bohm effect for bound states. See [LO77, Hel88a, Hel88b, Hel94]. Such models also arise in the description of the Little-Parks experiment [LP62].

Inequality (1.8) appears, to the best of our knowledge, for the first time. Our proof of this result (see Sect. 4) uses a connection between the maximality of the first eigenvalue for flux 1/2 and the structure of the nodal set of groundstates. The nodal sets for the single hole case with flux 1/2 were recently investigated by Berger and Rubinstein [BR97]. Part of our work is motivated by their preprint. Using semiclassical arguments as in [Hel88a], we can show that in general the first eigenvalue is not necessarily maximised for circulation $(1/2,\dots,1/2)$.

Definition 1.2. The nodal set $N(u)$ of an eigenfunction $u$ of a magnetic Schrödinger operator on a manifold $\Omega$ with smooth boundary is defined in $\overline{\Omega}$ by

$$N(u) := \overline{\{x \in \Omega : u(x) = 0\}}. \tag{1.9}$$

¹ A piecewise smooth mapping $\gamma : [0,1] \to X$ is called a path in $X$. The point $\gamma(0)$ is called the initial point and $\gamma(1)$ is called the final point. The image $\Gamma = \gamma([0,1])$ of the path is called a curve.

Some useful information on nodal sets of real valued eigenfunctions of non-magnetic Schrödinger equations in two dimensions is given in Proposition 4.1. In particular we see that such nodal sets consist of the finite union of smoothly immersed circles and lines. It is "generically" the case that the nodal set of every complex eigenfunction of a magnetic Schrödinger operator consists of isolated points of intersection of the lines of zeros of the real and imaginary parts of the function. See [EMQ94].

The local properties of the nodal sets of eigenfunctions of the operator $H_{A,V}$ are the same as the local properties of complex solutions of non-magnetic Schrödinger equations. More precisely, since we may find at every point a local gauge $e^{i\phi}$ satisfying (1.3), we may multiply any eigenfunction of $H_{A,V}$ by a local gauge so that the product solves a non-magnetic Schrödinger equation. The nodal set is invariant under local gauge transformations.

We shall see in what follows that although the local properties of nodal sets of eigenfunctions of our magnetic Schrödinger operator are the same as the properties of a non-magnetic Schrödinger operator, the global properties differ in the case where $\Phi = (1/2,\dots,1/2)$. In particular, in the non-magnetic case we see that (since a real eigenfunction must change sign at the nodal set) an even number of nodal lines (or perhaps no nodal lines) of an eigenfunction emerges from each boundary component of the region. In Theorem 1.4 we show that for $\Phi = (1/2,\dots,1/2)$, an odd number of nodal lines of the groundstate emerge from each component.

Definition 1.3. We say that a (nodal) set $N$ slits $\Omega$ if it is the union of a collection of piecewise smooth, immersed lines such that
(i) each line starts and finishes at the boundary $\partial\Omega$ and leaves the boundary transversally;
(ii) internal intersections between lines are transversal;
(iii) the complement $\Omega \setminus N$ is connected;
(iv) an odd number of nodal lines leaves each interior boundary component.

We shall say that a collection of paths slits $\Omega$ if the union of the images of the paths slits $\Omega$. See Fig. 1 for some examples of regions which are slit. Note that part (iii) of the above definition is the reason why a nodal set which slits $\Omega$ contains no immersed circles, and also implies that each line of a slitting set links together a unique pair $\{\Sigma_i, \Sigma_j\}$ of distinct (i.e. $i \neq j$) boundary components. Note also that for the single hole case, a set which slits $\Omega$ consists of one line which joins the outer boundary of $\Omega$ to the inner boundary. In Corollary 4.3 we show that if a collection of paths slits a region then no sub- or supercollection of these paths can also slit the region. In Proposition 5.1 we show that the number $n$ of paths of such a collection must satisfy $k/2 \le n \le k$.

Theorem 1.4. Let $\Omega$ be a region with smooth boundary, which is homeomorphic to a disk with $k$ holes. Let $V$ be a smooth potential and let $A$ be a smooth magnetic vector potential satisfying Eq. (1.2), such that the value of the circulations around each hole lie in $1/2 + \mathbb{Z}$ (that is, $\Phi = (1/2,\dots,1/2)$ modulo $\mathbb{Z}^k$).

(i) If the first eigenvalue of $H_{A,V}$ is simple then the nodal set of the corresponding eigenfunction slits $\Omega$. Otherwise there exists an orthonormal basis $\{u_1,\dots,u_m\}$ of the groundstate eigenspace such that the nodal set of any non-zero combination $\sum_{i=1}^m a_i u_i$, with $a_i \overline{a_j} \in \mathbb{R}$ for each $1 \le i, j \le m$, slits $\Omega$.

Fig. 1. Examples of some sets which slit $\Omega$

(ii) The multiplicity $m$ of the first eigenvalue of $H_{A,V}$ satisfies

$$m \le \begin{cases} 2, & k = 1, 2; \\ k, & k \text{ odd},\ k \ge 3; \\ k - 1, & k \text{ even},\ k \ge 4. \end{cases} \tag{1.10}$$

(iii) For $k = 1, 2$ with groundstate multiplicity two, the nodal sets of two linearly independent groundstates do not intersect. It follows that the nodal set of a combination $a_1 u_1 + a_2 u_2$ is empty whenever $a_1 \overline{a_2} \notin \mathbb{R}$.

Here we make some remarks connected to the above theorem.

Remarks 1.5.
(i) The above bound on the multiplicity of the first eigenvalue is sharp in the case of one hole (see Example 5.3), but it is not expected to be sharp for many holes. It would be interesting to know an asymptotic result about the growth of the maximum multiplicity with the number of holes.
(ii) We prove the bound by taking advantage of topological obstructions to nodal sets caused by the holes. These obstructions prevent the existence of high dimensional groundstate eigenspaces. Our type of method was first discovered in [Che76] and has since been taken up and used by others, e.g. [Nad88, HOHON98, HOMN]. See also [Col93] for explicit constructions of examples with high multiplicity.
(iii) Our result bears similarities to bounds on multiplicities of higher eigenvalues of non-magnetic Schrödinger operators on surfaces with boundary. Some related literature on this topic is given in [Col93, Nad88, HOHON98, HOMN].
(iv) It has been shown in [BCC98] that no upper bound on the multiplicity exists when one adds a general magnetic field, even on the sphere.

(v) For the cases $k \ge 3$ we expect that there could be intersection of nodal sets of two independent groundstates, and correspondingly that the nodal set of a combination $a_1 u_1 + a_2 u_2$ will not in general be empty when $a_1 \overline{a_2} \notin \mathbb{R}$.
(vi) If we assume that $H_{A,V}$ has Dirichlet boundary conditions then Theorems 1.1 and 1.4 hold with suitable changes to the proofs. More precisely, in Proposition 4.1 the Taylor expansion (4.2) for a zero of order $l$ at a point $x \in \partial\Omega$ becomes $f(x) = a r^l \sin l\omega + O(r^{l+1})$, and from Lemma 4.5 through to the proof of Theorem 1.4 (ii), all arguments which involve a function which has a zero of order $l = k$ (for example) should be replaced by the same argument involving a function with a zero of order $l = k + 1$.

2. Some Basic Results

The quadratic form corresponding to the operator $H_{A,V}$ is

$$Q_{A,V}(u) = \int_\Omega |(i\nabla + A)u|^2 + V|u|^2 \, d^2x, \tag{2.1}$$

with domain $Q^{\mathrm{Neu}} = W^{1,2}(\Omega) = H^1(\Omega)$. This choice of quadratic form domain corresponds to Neumann boundary conditions for $H_{A,V}$. For the case of Dirichlet boundary conditions (see Remark 1.5 (vi)) the relevant quadratic form domain is $Q^{\mathrm{Dir}} = W_0^{1,2}(\Omega)$.

Remark 2.1. Neumann boundary conditions for a magnetic Schrödinger operator mean that functions in the domain of the operator satisfy

$$i\,\frac{\partial u}{\partial n} = -A\cdot n\, u \tag{2.2}$$

on $\partial\Omega$, where $n$ is normal to $\partial\Omega$. One can always assume that the vector potential satisfies the additional properties

$$\nabla\cdot A = 0 \ \text{ in } \Omega, \qquad A\cdot n = 0 \ \text{ on } \partial\Omega. \tag{2.3}$$

The reason is as follows: There is a solution $\phi$ (unique up to a constant) to the oblique derivative problem

$$\Delta\phi = -\nabla\cdot A \ \text{ in } \Omega, \qquad \nabla\phi\cdot n = -A\cdot n \ \text{ on } \partial\Omega. \tag{2.4}$$

See [GT83, Theorem 6.31 and the following remark]. Setting $A' = A + \nabla\phi$, the operator $H_{A',V}$ is unitarily equivalent to $H_{A,V}$ under the gauge transformation $e^{i\phi}$, and $A'$ satisfies the properties (2.3).

Proof of Eq. (1.5). Let $A$ and $A'$ be magnetic vector potentials with circulations that differ by an element of $\mathbb{Z}^k$. For any closed path $\sigma$,

$$\frac{1}{2\pi}\oint_\sigma (A' - A)\cdot dx \in \mathbb{Z},$$

and hence there exists a smooth, multivalued function $\phi$ such that $e^{i\phi}$ is univalued and $\nabla\phi = A' - A$. For $u \in H^1(\Omega)$ we have $(i\nabla + A')e^{i\phi}u = e^{i\phi}(i\nabla + A)u$, and therefore the operators $H_{A,V}$ and $H_{A',V}$ are unitarily equivalent.
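The gauge identity used in the proof of Eq. (1.5) can be checked by a direct computation. The following is a sketch, writing $A'$ for the second vector potential and assuming that $u$ is smooth enough to differentiate pointwise (the proof itself only requires $u \in H^1(\Omega)$):

```latex
\begin{align*}
(i\nabla + A')\,(e^{i\phi}u)
  &= i\,(i\nabla\phi)\,e^{i\phi}u + e^{i\phi}\,i\nabla u + A' e^{i\phi}u \\
  &= e^{i\phi}\,\bigl(i\nabla + A' - \nabla\phi\bigr)\,u \\
  &= e^{i\phi}\,(i\nabla + A)\,u ,
  \qquad \text{since } \nabla\phi = A' - A .
\end{align*}
```

Applying the identity twice suggests $H_{A',V}\,e^{i\phi} = e^{i\phi}\,H_{A,V}$, and multiplication by $e^{i\phi}$ is unitary on $L^2(\Omega)$ because $\phi$ is real valued and $e^{i\phi}$ is univalued.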

Remark 2.2. For any magnetic vector potential $A$ satisfying (1.2) there exists a gauge function $\phi$ such that

$$A(x,y) - \sum_{i=1}^{k} \frac{\Phi_i}{2\pi r_i^2} \begin{pmatrix} -y + y_i \\ x - x_i \end{pmatrix} = (\nabla\phi)(x,y),$$

where $(x_i, y_i)$ is a fixed point in the $i$th hole, $r_i^2 = (x - x_i)^2 + (y - y_i)^2$ and $\Phi_i$ is the circulation of $A$ round the $i$th hole. Defining

$$A'(x,y) = \sum_{i=1}^{k} \frac{\Phi_i}{2\pi r_i^2} \begin{pmatrix} -y + y_i \\ x - x_i \end{pmatrix},$$

we see, for a fixed $V$, that

$$H_{A',V} = e^{-i\phi} H_{A,V}\, e^{i\phi}$$

and thus $H_{A,V}$ is unitarily equivalent to $H_{A',V}$. This means that the magnetic vector potential is determined up to a gauge transformation by its circulations $\Phi$, and verifies that the spectrum of $H_{A,V}$ is determined by $\Phi$.

Proof of Eq. (1.6). Let $A$ be a magnetic vector potential with circulation $\Phi$, and let $u$ be a groundstate of $H_{A,V}$. It is easy to show that $\overline{u}$ is a groundstate of $H_{-A,V}$ with the same eigenvalue, and hence

$$\lambda_1(-\Phi) = \lambda_1(\Phi). \tag{2.5}$$

We obtain Eq. (1.6) by combining (2.5) and (1.5) as follows:

$$\lambda_1(l/2 + \Phi) = \lambda_1(-l/2 - \Phi) = \lambda_1(l/2 - \Phi).$$
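For orientation, the properties (1.5)–(1.8) can be seen explicitly in the rotationally symmetric single-hole case. The following worked example is only a sketch under assumptions that are not part of the text: the domain is the annulus $\{a < |x| < b\}$, the potential $V$ is radial, and the gauge is chosen as $A = (\Phi/r)\,\hat e_\theta$, so that the circulation (1.4) equals $\Phi$ and $A\cdot n = 0$ on both boundary circles. The symbol $\mu(c)$ is introduced here purely for illustration.

```latex
% Assumptions (illustrative, not from the text):
%   \Omega = \{ a < |x| < b \},  V = V(r) radial,  A = (\Phi/r)\,\hat e_\theta .
% \mu(c) denotes the lowest eigenvalue of the radial Neumann problem
%   -f'' - f'/r + (c/r^2 + V(r)) f = \mu f,   f'(a) = f'(b) = 0 .
\begin{align*}
u(r,\theta) &= f(r)\,e^{in\theta}, \qquad n \in \mathbb{Z}, \\
Q_{A,V}(u) &= 2\pi \int_a^b \Bigl( |f'(r)|^2
      + \frac{(n-\Phi)^2}{r^2}\,|f(r)|^2 + V(r)\,|f(r)|^2 \Bigr)\, r\,dr , \\
\lambda_1(\Phi) &= \min_{n \in \mathbb{Z}} \mu\bigl((n-\Phi)^2\bigr)
      = \mu\bigl(\operatorname{dist}(\Phi,\mathbb{Z})^2\bigr) .
\end{align*}
```

The last equality uses that $\mu(c)$ is strictly increasing in $c$, which follows from the variational characterisation of the lowest eigenvalue. Since $\operatorname{dist}(\Phi,\mathbb{Z})$ is 1-periodic and even, vanishes exactly on $\mathbb{Z}$, and attains its maximum $1/2$ exactly on $1/2 + \mathbb{Z}$, this special case reproduces (1.5), (1.6), (1.7) and (1.8).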

Proof of Inequality (1.7). Suppose for a contradiction that $\Phi \notin \mathbb{Z}^k$ and that $\lambda_1(\Phi) \le \lambda_1(0)$, where $\Phi$ is the circulation vector of some magnetic vector potential $A$. Let $u_0$ denote the unique normalised positive groundstate of the operator $H_{0,V}$ and let $u_A$ be a normalised groundstate of the operator $H_{A,V}$. Using the diamagnetic inequality [Sim79] we have

$$Q_{0,V}(|u_A|) \le Q_{A,V}(u_A) = \lambda_1(\Phi) \le \lambda_1(0) = Q_{0,V}(u_0), \tag{2.6}$$

and thus $|u_A| = u_0$. It follows that $u_A = e^{i\phi}u_0$ for some smooth, real valued, multivalued function $\phi$, and hence

$$\begin{aligned}
\int_\Omega |A - \nabla\phi|^2\,|u_0|^2\, d^2x
  &= \int_\Omega |(i\nabla + A - \nabla\phi)u_0|^2\, d^2x - \int_\Omega |\nabla u_0|^2\, d^2x \\
  &= \int_\Omega |(i\nabla + A)u_A|^2\, d^2x - \int_\Omega |\nabla u_0|^2\, d^2x \\
  &= Q_{A,V}(u_A) - Q_{0,V}(u_0) = 0,
\end{aligned}$$

and therefore $A = \nabla\phi$ in $\Omega$. Thus for each $i = 1,\dots,k$ we have

$$\Phi_i = \frac{1}{2\pi}\oint_{\sigma_i} A\cdot dx = \frac{1}{2\pi}\oint_{\sigma_i} d\phi \in \mathbb{Z},$$

where $\sigma_i$ is a closed path which parametrises the boundary $\Sigma_i$ of the $i$th hole and turns once in an anticlockwise direction. This contradicts our assumption that $\Phi \notin \mathbb{Z}^k$.

The proof of inequality (1.7) is an alternative to the proofs given in [LO77] and [Hel88a]. It has the advantage of being simpler and being independent of whether the boundary conditions are Neumann or Dirichlet. See also [HN97]. We leave the proof of inequality (1.8) until Sect. 4 because it depends on Theorem 1.4 (i).
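The first equality in the chain of integrals in the proof of inequality (1.7) rests on a pointwise identity. A short verification, assuming only that $u_0$ is real valued and that $A$ and $\nabla\phi$ are real vector fields:

```latex
\begin{align*}
|(i\nabla + A - \nabla\phi)\,u_0|^2
  &= \bigl|\, i\,\nabla u_0 + (A - \nabla\phi)\,u_0 \,\bigr|^2 \\
  &= |\nabla u_0|^2 + |A - \nabla\phi|^2\,|u_0|^2 ,
\end{align*}
```

since $|iX + Y|^2 = |X|^2 + |Y|^2$ for real vectors $X$ and $Y$. Integrating over $\Omega$ and subtracting $\int_\Omega |\nabla u_0|^2\,d^2x$ gives the stated equality.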

3. A Twofold Riemannian Covering Manifold

In this section we consider the case where the circulations of the magnetic vector potential $A$ satisfy

$$\Phi_i \in 1/2 + \mathbb{Z} \tag{3.1}$$

for each $1 \le i \le k$. The proofs of our results use a twofold Riemannian covering manifold $\tilde\Omega$ of the domain $\Omega$ (see Remark 3.4 however). For the case of more than one hole, there exists more than one twofold Riemannian covering manifold of $\Omega$. We shall take a particular choice of covering manifold on which the circulation of the lifted magnetic (1-form) potential $\tilde A$ along any closed curve is an integer. Before the precise definition, we introduce some basic notation. For further details see for example [Kos80] or [GHL90].

Notation 3.1. Let $\tilde\Omega$ be a covering manifold of $\Omega$, and let $\Pi$ be the associated covering map. We denote the lifts of various quantities as follows: For a set $N \subset \Omega$ define $\tilde N = \{x \in \tilde\Omega : \Pi(x) \in N\}$. For a function $f : \Omega \to \mathbb{C}$, define $\tilde f : \tilde\Omega \to \mathbb{C}$ by $\tilde f = f \circ \Pi$. For a path $\sigma : [0,1] \to \Omega$ and a point $x \in \tilde\Omega$ such that $\Pi(x) = \sigma(0)$, let $\tilde\sigma : [0,1] \to \tilde\Omega$ denote the unique lifted path such that $\tilde\sigma(0) = x$ and $\Pi \circ \tilde\sigma = \sigma$.

We endow the covering manifold with the metric obtained by lifting the flat Euclidean metric of $\Omega$ to $\tilde\Omega$. This is the unique metric which makes $\Pi$ a local isometry, and therefore a Riemannian covering map. Let $\tilde\Delta = \operatorname{div}\operatorname{grad}$ denote the Laplace-Beltrami operator on $L^2(\tilde\Omega)$ induced by the lifted metric on $\tilde\Omega$, and let $\tilde A$ be the 1-form on $\tilde\Omega$ obtained by lifting the 1-form associated with the smooth vector potential $A$ defined on $\Omega$.

Let $\tilde\Omega_\infty$ be the universal covering manifold of $\Omega$ and let $\Pi_\infty$ be the associated covering map. The universal covering of any manifold is simply connected. Note that due to (3.1), if two points $x_\infty, y_\infty \in \tilde\Omega_\infty$ satisfy $\Pi_\infty(x_\infty) = \Pi_\infty(y_\infty)$ then for any path $\sigma$ joining $x_\infty$ to $y_\infty$, the integral

$$\frac{1}{2\pi}\oint_{\Pi_\infty\circ\sigma} A\cdot dx \tag{3.2}$$

lies either in $1/2 + \mathbb{Z}$ or in $\mathbb{Z}$. The value of (3.2) is independent of the path $\sigma$ because $\operatorname{curl} A = 0$ and because the universal covering manifold is simply connected. We therefore construct the twofold covering manifold (as a quotient of the universal covering manifold) as follows:

Definition 3.2.
(i) We define the twofold covering manifold $\tilde\Omega$ by identifying points $x_\infty, y_\infty$ in $\tilde\Omega_\infty$ according to the equivalence relation $x_\infty \sim y_\infty$ if and only if

$$\Pi_\infty(x_\infty) = \Pi_\infty(y_\infty) \tag{3.3}$$

and for each path $\sigma$ in $\tilde\Omega_\infty$ joining $x_\infty$ to $y_\infty$ we have

$$\frac{1}{2\pi}\int_{\Pi_\infty\circ\sigma} A\cdot dx \in \mathbb{Z}. \tag{3.4}$$

The covering map $\Pi : \tilde\Omega \to \Omega$ is defined by $\Pi(x) = \Pi_\infty(x_\infty)$, where $x = [x_\infty]$ is the equivalence class (under $\sim$) containing $x_\infty$.

Fig. 2. Realization of a twofold covering manifold

(ii) On our twofold covering manifold we define the symmetry map $G : \tilde\Omega \to \tilde\Omega$ by setting $Gx$ to be the other point in $\tilde\Omega$ which lies above $\Pi(x) \in \Omega$. Note that $\Pi^{-1}(\Pi(x)) = \{x, Gx\}$.
(iii) We say that a function $f : \tilde\Omega \to \mathbb{C}$ is symmetric if $f(Gx) = f(x)$ for all $x \in \tilde\Omega$, and antisymmetric if $f(Gx) = -f(x)$ for all $x \in \tilde\Omega$.

Note that the identity map and $G$ form a group $\mathcal{G} = \{I, G\}$, with the composition $G^2 = I$, which acts freely on $\tilde\Omega$. The quotient of $\tilde\Omega$ by $\mathcal{G}$ is the original manifold $\Omega$. The lift $\tilde f$ of a function $f$ on $\Omega$ is symmetric. Using Eq. (3.4) we have

$$\frac{1}{2\pi}\oint_\sigma \tilde A\cdot d\tilde x = \frac{1}{2\pi}\oint_{\Pi\circ\sigma} A\cdot dx \in \mathbb{Z} \tag{3.5}$$

for any closed path $\sigma$ in $\tilde\Omega$. Hence there exists a smooth, multivalued function $\theta$ on $\tilde\Omega$ such that $\exp i\theta$ is univalued and

$$\operatorname{grad}\theta = \tilde A. \tag{3.6}$$

Lemma 3.3. The operator $\mathcal{L} : L^2(\Omega) \to L^2(\tilde\Omega)$ defined by

$$\mathcal{L}u = \frac{1}{\sqrt{2}}\, e^{i\theta}\,\tilde u \tag{3.7}$$

is an isometry onto the antisymmetric functions in $L^2(\tilde\Omega)$, and maps eigenfunctions of $H_{A,V}$ onto antisymmetric eigenfunctions of the Schrödinger operator

$$\tilde H_{0,V} = -\tilde\Delta + \tilde V \tag{3.8}$$

acting on $L^2(\tilde\Omega)$ with Neumann boundary conditions.

Proof. We shall first show that the function $e^{i\theta}$ is antisymmetric (under $G$). For any point $x \in \tilde\Omega$, let $\sigma : [0,1] \to \tilde\Omega$ be a path which joins $x$ to $Gx$. Using the terminology of Definition 3.2 we have $\Pi(x) = \Pi(Gx)$ but $x \not\sim Gx$, and hence

$$\frac{1}{2\pi}\oint_{\Pi\circ\sigma} A\cdot dx = l + 1/2$$

for some $l \in \mathbb{Z}$. Keeping in mind that $\theta$ is multivalued, we get

$$\theta(Gx) - \theta(x) = \int_\sigma d\theta = \int_\sigma \tilde A\cdot d\tilde x = \oint_{\Pi\circ\sigma} A\cdot dx = (2l+1)\pi.$$

Hence $\exp[i\theta(Gx)] = -\exp[i\theta(x)]$ as claimed.

The action of $\mathcal{L}$ upon a function $u \in L^2(\Omega)$ consists of two steps. The first step is to lift $u$ to the symmetric function $\tilde u$. This is a bijection onto the space of symmetric functions of $L^2(\tilde\Omega)$. The second step is to multiply $\tilde u$ by the antisymmetric function $e^{i\theta}$. This step is a bijection from the space of symmetric functions onto the space of antisymmetric functions in $L^2(\tilde\Omega)$. To see that $\mathcal{L}$ is an isometry onto its range, we take two functions $u, v \in L^2(\Omega)$ and note that

$$\langle \mathcal{L}u, \mathcal{L}v\rangle_{L^2(\tilde\Omega)} = \frac{1}{2}\int_{\tilde\Omega} e^{i\theta}\tilde u \cdot e^{-i\theta}\,\overline{\tilde v}\, d\tilde x = \int_\Omega u\,\overline{v}\, dx = \langle u, v\rangle_{L^2(\Omega)}.$$

For every eigenfunction $u$ of $H_{A,V}$, the lift $\tilde u$ is an eigenfunction of the lifted magnetic Schrödinger operator

$$\tilde H_{A,V} = (i\operatorname{div} + \tilde A)(i\operatorname{grad} + \tilde A) + \tilde V \tag{3.9}$$

on $\tilde\Omega$, where $\tilde V$ and $\tilde A$ are the lifts of $V$ and $A$ respectively. We now multiply by the gauge $e^{i\theta}$. Using Eq. (3.6), the function $e^{i\theta}\tilde u$ is an eigenfunction of the non-magnetic Schrödinger operator $\tilde H_{0,V}$.

The spectrum of $H_{A,V}$ consists of the eigenvalues corresponding to the antisymmetric eigenfunctions of $\tilde H_{0,V}$. It turns out to be useful (see Lemma 4.4) to single out the case where a function $u$ has the following property:

Property P. The function $u$ is a groundstate of the operator $H_{A,V}$, and the corresponding eigenfunction $\mathcal{L}u$ of $\tilde H_{0,V}$ has a constant phase. In other words, there exists a constant $\alpha \in \mathbb{C} \setminus \{0\}$ such that $\mathcal{L}(\alpha u)$ is a real valued function.

Due to the symmetry of $\tilde\Omega$, the groundstate of the operator $\tilde H_{0,V}$ is symmetric. In contrast, if $u$ has Property P then $\mathcal{L}(\alpha u)$ is an antisymmetric eigenfunction (and therefore an excited state) of $\tilde H_{0,V}$. Consequently both $\mathcal{L}(\alpha u)$ and $u$ have a nonempty nodal set.

Remark 3.4. It is not necessary to use the covering manifold to describe Property P. An alternative is to formulate the property in terms of an antilinear operator $K$. We define the operator below. Since $\Phi_i \in 1/2 + \mathbb{Z}$ for each $i = 1,\dots,k$, we see that

$$\frac{1}{2\pi}\oint_\sigma 2A\cdot dx \in \mathbb{Z}$$

for all closed paths $\sigma$ in $\Omega$. It follows that there exists a smooth, multivalued function $\psi$ such that $e^{i\psi}$ is univalued and $\nabla\psi = 2A$. The multivalued function $\theta$ given in Eq. (3.6) is related to $\psi$ by the formula $\psi\circ\Pi = 2\theta + c$ for some constant $c$. We define $K$ by the formula

$$K = e^{-i\psi}\,\Gamma, \tag{3.10}$$

where $\Gamma$ is the operator $\Gamma u = \overline{u}$. Then $K^2 = \mathrm{Id}$ and $K$ commutes with $H_{A,V}$. It turns out that a function $u \in L^2(\Omega)$ has Property P if and only if it is an eigenfunction of both $H_{A,V}$ and $K$. One could in fact completely dispense with the covering manifold, but at the expense of a clear geometrical picture in the following sections.
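The claim $K^2 = \mathrm{Id}$ in Remark 3.4 can be checked in one line. This sketch takes the second factor in (3.10) to be complex conjugation, which is what makes $K$ antilinear, and assumes that $\psi$ is real valued:

```latex
\begin{align*}
K^2 u
  = e^{-i\psi}\,\overline{\,e^{-i\psi}\,\overline{u}\,}
  = e^{-i\psi}\,e^{i\psi}\,u
  = u .
\end{align*}
```

In particular the only possible eigenvalues of $K$ on a real-linear subspace are $\pm 1$, which is in line with the description of Property P as a constant-phase condition.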

4. Characterisation of the Nodal Set

We first collect some well known facts about eigenfunctions of non-magnetic Schrödinger operators acting on two dimensional Riemannian manifolds:

Proposition 4.1 (Non-magnetic Schrödinger operators). Let $f$ be a real valued eigenfunction of a non-magnetic Schrödinger operator with smooth potential and Neumann boundary conditions, on a two dimensional locally flat Riemannian manifold $\Omega$ with smooth boundary. Then $f \in C^\infty(\overline{\Omega})$. Furthermore, $f$ has the following properties:

(i) If $f$ has a zero of order $l$ at a point $x_0 \in \overline{\Omega}$ then the Taylor expansion of $f$ is

$$f(x) = p_l(x - x_0) + O(|x - x_0|^{l+1}), \tag{4.1}$$

where $p_l$ is a real valued, non-zero, harmonic, homogeneous polynomial of degree $l$. Moreover if $x_0 \in \partial\Omega$, the Neumann boundary conditions imply that

$$f(x) = a r^l \cos l\omega + O(r^{l+1}) \tag{4.2}$$

for some non-zero $a \in \mathbb{R}$, where $(r, \omega)$ are polar coordinates of $x$ around $x_0$. The angle $\omega$ is chosen so that the tangent to the boundary at $x_0$ is given by the equation $\sin\omega = 0$.

(ii) The nodal set $N(f)$ is the union of finitely many smoothly immersed circles in $\Omega$, and smoothly immersed lines which connect points of $\partial\Omega$. Each of these immersions is called a nodal line. Note that self-intersections are allowed. The connected components of $\Omega \setminus N(f)$ are called nodal domains.

(iii) If $f$ has a zero of order $l$ at a point $x_0 \in \Omega$ then exactly $l$ segments of nodal lines pass through $x_0$. The tangents to the nodal lines at $x_0$ dissect the full circle into $2l$ equal angles. If $f$ has a zero of order $l$ at a point $x_0 \in \partial\Omega$ then exactly $l$ segments of nodal lines meet the boundary at $x_0$. The tangents to the nodal lines at $x_0$ are given by the equation $\cos l\omega = 0$, where $\omega$ is chosen as in (4.2).

Proof. The proof that $f \in C^\infty(\overline{\Omega})$ can be found in [Wlo82, Theorem 20.4]. The proof of part (i) is trivial because $V$ and $f$ are smooth functions so the Taylor expansion (with remainder) exists. The properties of the first term of the expansion follow by substituting the Taylor expansion into the groundstate eigenvalue equation. See [Ber55, Che76] for proofs of the other parts.

Proposition 4.1 can be generalised to include eigenfunctions of magnetic Schrödinger operators with a smooth magnetic vector potential $A$. The eigenfunctions still lie in $C^\infty(\overline{\Omega})$ and the expansions (4.1) and (4.2) hold, except that the polynomial $p_l$ and the constant $a$ are allowed to be complex. However statements (ii) and (iii) about the nodal set do not carry over.

Theorem 4.2. Let $N \subset \overline{\Omega}$ be the union of finitely many smoothly immersed circles and smoothly immersed lines which connect points of $\partial\Omega$. The following statements are equivalent:

(i) $\Omega \setminus N$ is connected (therefore $N$ contains no smoothly immersed circles), and an odd number of lines emanate from each hole.
(ii) In the twofold covering manifold, the open set $\tilde\Omega \setminus \tilde N$ decomposes into two open path connected subsets $D_1, D_2$ such that $D_2 = GD_1$ and $\partial D_1 \cap \tilde\Omega = \partial D_2 \cap \tilde\Omega = \tilde N$.

Proof.
(i) ⇒ (ii) Let $D_1$ be a connected component of $\tilde\Omega \setminus \tilde N$. Suppose for a contradiction that this is the only component. Due to the symmetry of the manifold, $GD_1 = D_1$, and thus for any point $x \in D_1$ there exists a path $\sigma$ lying in $D_1$ (i.e. not intersecting $\tilde N$), which joins $x$ and $Gx$. Using the terminology of Definition 3.2 we have $\Pi(x) = \Pi(Gx)$ but $x \not\sim Gx$, and hence

$$\frac{1}{2\pi}\oint_{\Pi\circ\sigma} A\cdot dx \in 1/2 + \mathbb{Z}.$$

The closed path $\Pi\circ\sigma$ must therefore circulate an odd number of holes. Since an odd number of lines of $N$ emanate from each hole, the path $\Pi\circ\sigma$ must intersect with one of them. This contradicts the fact that $\sigma$ does not intersect $\tilde N$.

Since $\Omega \setminus N$ is connected there can only be two connected components $D_1, D_2$ of $\tilde\Omega \setminus \tilde N$. As above, we see that $GD_1 \neq D_1$, and therefore $D_2 = GD_1$.

Suppose now for a contradiction that $\partial D_1 \cap \tilde\Omega \neq \tilde N$. Then there exists a point $x \in \partial D_1 \cap \tilde\Omega$ such that $x \notin \partial D_2 \cap \tilde\Omega$. The set $D_1$ borders with itself at $x$, and since $D_1$ is path connected there exists a closed path $\sigma$ such that $\sigma(0) = \sigma(1) = x$, which intersects $\tilde N$ transversally at $x$ and which does not intersect $\tilde N$ anywhere else. Since $\sigma$ is closed,

$$\frac{1}{2\pi}\oint_{\Pi\circ\sigma} A\cdot dx \in \mathbb{Z},$$

and therefore $\Pi\circ\sigma$ circulates an even number of holes. Since an odd number of lines emanate from each hole, $\Pi\circ\sigma$ intersects $N$ an even number of times. This contradicts the fact that $\sigma$ intersects $\tilde N$ only once.

(ii) ⇒ (i) Since $D_2 = GD_1$ we see that $\Pi D_1 = \Pi D_2$, and hence $\Omega \setminus N = \Pi(D_1 \cup D_2) = \Pi D_1 \cup \Pi D_2 = \Pi D_1$. Since $\Pi$ is continuous, $\Omega \setminus N$ is connected. Let $\sigma \subset \Omega$ be a closed path which circulates the $i$th hole. Due to the construction of $\tilde\Omega$, $\sigma$ may be lifted to a path $\tilde\sigma$ in $\tilde\Omega$ which begins at a point $x \in D_1$ and ends at $Gx \in D_2$. Since $D_1$ and $D_2$ coborder, the path $\tilde\sigma$ crosses $\tilde N$ an odd number of times and therefore $\sigma$ crosses $N$ an odd number of times. By choosing $\sigma \subset \Omega$ sufficiently close to $\sigma_i$ we see that an odd number of segments of lines leave the $i$th boundary component. Since $\Omega \setminus N$ is connected, each of these line endings belongs to a distinct line, and hence an odd number of lines leaves each boundary component.

Corollary 4.3. Suppose that a collection of paths slits a region. Then no subcollection of these paths can slit the region. Also, no supercollection of these paths (i.e. a collection of paths which contains the original collection) can slit the region.

Proof. Suppose that the union $N$ of a collection of lines $\{\Gamma_1,\dots,\Gamma_n\}$ slits $\Omega$. Using Theorem 4.2, we see that in the twofold covering manifold the open set $\tilde\Omega \setminus \tilde N$ decomposes into two cobordering, open, path connected subsets $D_1, D_2$. Let $S$ be the union of a strict subcollection of the lines. The non-empty set $\tilde N \setminus \tilde S$ connects together the two regions $D_1$ and $D_2$, and thus $\tilde\Omega \setminus \tilde S = D_1 \cup D_2 \cup (\tilde N \setminus \tilde S)$ is connected. Using Theorem 4.2 in the reverse direction, we see that $S$ does not slit $\Omega$.

It follows easily that no supercollection $S$ of $N$ can slit $\Omega$, because then $N$ would be a strict subset of a set $S$ which slits $\Omega$, and this is not possible by the above paragraph.

Lemma 4.4. If a groundstate $u$ of $H_{A,V}$ has Property P then the nodal set of $u$ slits $\Omega$.

Proof.
By multiplying the function $u$ by a non-zero complex constant we may assume that the eigenfunction $\mathcal{L}u$ of $\tilde H_{0,V}$ is real valued. Since $\mathcal{L}u$ is an antisymmetric function on the covering manifold $\tilde\Omega$, the nodal domains $D_1,\dots,D_l$ of $\mathcal{L}u$ have the property that for each $i = 1,\dots,l$, we have $GD_i = D_j$ for some $j \neq i$. Suppose for a contradiction that $l > 2$. Then there exist two cobordering domains $D_1, D_2$ such that $GD_1 \neq D_2$. Define $D = \operatorname{Interior}(\overline{D_1 \cup D_2})$, so that $D$ is the union of $D_1$, $D_2$ and the border between them. Let $\tilde Q^D_{0,V}$ denote the quadratic form corresponding to the Schrödinger operator

$$\tilde H^D_{0,V} = -\tilde\Delta + \tilde V$$

on $D$ with Dirichlet boundary conditions on $\tilde S = \partial D \cap \tilde\Omega$ and Neumann boundary conditions on $\partial D \cap \partial\tilde\Omega$, and let $g$ denote the corresponding positive groundstate. Since the boundary of $D$ is piecewise smooth, the restriction $\mathcal{L}u|_D$ lies in the quadratic form domain of $\tilde Q^D_{0,V}$. Define the antisymmetric function $h$ on $\tilde\Omega$ by

$$h(y) = \begin{cases} g(y), & y \in D, \\ -g(Gy), & y \in GD, \\ 0, & \text{otherwise.} \end{cases}$$

Let $\tilde Q_{0,V}$ denote the quadratic form of the operator $\tilde H_{0,V}$, which we define in Eq. (3.8). Since $\mathcal{L}u$ is an antisymmetric eigenfunction which corresponds to a groundstate of $H_{A,V}$, it has the least energy of all antisymmetric functions, and therefore

$$\frac{\tilde Q_{0,V}(\mathcal{L}u)}{\|\mathcal{L}u\|^2_{L^2(\tilde\Omega)}} \le \frac{\tilde Q_{0,V}(h)}{\|h\|^2_{L^2(\tilde\Omega)}} = \frac{\tilde Q^D_{0,V}(g)}{\|g\|^2_{L^2(D)}} \le \frac{\tilde Q^D_{0,V}(\mathcal{L}u|_D)}{\|\mathcal{L}u|_D\|^2_{L^2(D)}} = \frac{\tilde Q_{0,V}(\mathcal{L}u)}{\|\mathcal{L}u\|^2_{L^2(\tilde\Omega)}}. \tag{4.3}$$

We have in fact equality in (4.3), and therefore, by uniqueness of the groundstate, we have that $\mathcal{L}u|_D = \lambda g$ for some $\lambda \neq 0$. This contradicts the fact that $\mathcal{L}u|_D$ is zero on $\partial D_1 \cap D$. Hence $l = 2$. This means that the nodal set $N$ of $u$ satisfies statement (ii) in Theorem 4.2. Using the equivalence proved in Theorem 4.2 we see that parts (iii) and (iv) of the definition of slitting are satisfied. Parts (i) and (ii) follow from the fact that $u$ can be approximated locally by harmonic polynomials. See Proposition 4.1.

Proof of Theorem 1.4 (i). Let $U$ denote the groundstate eigenspace of $H_{A,V}$. For all $u \in U$, the functions $\operatorname{Re}[\mathcal{L}u]$ and $\operatorname{Im}[\mathcal{L}u]$ lie in $\mathcal{L}U$ and are eigenfunctions of $\tilde H_{0,V}$, if they are not identically zero. It follows that we may find an orthonormal basis $\{f_1,\dots,f_m\}$ of real valued functions for $\mathcal{L}U$. Since $\mathcal{L}$ is an isometry, the functions $\{u_1,\dots,u_m\}$ defined by $u_i = \mathcal{L}^{-1}f_i$ are an orthonormal basis of $U$. Now let $u = \sum_{i=1}^m \alpha_i u_i$, where $\alpha_i\overline{\alpha_j} \in \mathbb{R}$ for each $1 \le i, j \le m$. Take some $\alpha_j \neq 0$. Then

$$\mathcal{L}(\overline{\alpha_j}\,u) = \sum_{i=1}^m \alpha_i\overline{\alpha_j}\, f_i$$

is a real valued function, and so $u$ has Property P. The result now follows from Lemma 4.4.

Lemma 4.5. If a groundstate $u$ of $H_{A,V}$ has a zero of order $l$ at a point $x \in \partial\Omega$ then $l \le k$. Moreover, if $k$ is even and $x$ lies on an interior boundary component ($\Sigma_1$, say) then $l \le k - 1$.

Proof. Assume first that $u$ has Property P, and suppose for a contradiction that $l \ge k + 1$. Let $\Sigma_i$ denote the boundary component on which $x$ lies, where $i \in \{0, 1,\dots,k\}$. At least $k + 1$ distinct nodal lines emerge from $\Sigma_i$. Since there are only $k$ boundary components distinct from $\Sigma_i$ there must exist two nodal lines which both start at $\Sigma_i$ and finish at $\Sigma_j$ for some $j \neq i$. In both cases, such a nodal set would split $\Omega$ into more than one nodal domain, thus contradicting the assumption that $N(u)$ slits $\Omega$. Hence $l \le k$.

If $u$ does not have Property P then we can obtain a contradiction using the same methods above on the function $\mathcal{L}^{-1}[\operatorname{Re}[\mathcal{L}u]]$. This function is a groundstate of $H_{A,V}$, has a zero of order at least $l$ at $x$, and does have Property P.

Suppose that $k$ is even, that $x \in \Sigma_i$ (with $i \in \{1,\dots,k\}$) and that $l = k$. Since $N(u)$ slits $\Omega$ there must be an odd number of nodal lines leaving $\Sigma_i$. Therefore at least $k + 1$ nodal lines leave $\Sigma_i$, and we obtain a contradiction as before.

Lemma 4.6. Suppose that the groundstate eigenspace $U$ of $H_{A,V}$ is $m$ dimensional.

(i) For each point $x \in \partial\Omega$ there exists a function $u_x \in U$ which has Property P and which has a zero of order at least $m - 1$ at $x$.
(ii) If $m = k + 1$ then for each point $x$ lying on the outer boundary $\Sigma_0$ of $\Omega$ there exists a unique $u_x \in U$ (up to multiplication by a complex constant) which has a zero of order $k$ at $x$. The function $u_x$ has Property P. The nodal set of $u_x$ consists of $k$ lines which emanate from $x$ (which is the only point of intersection of lines), and which end at each of the $k$ distinct interior boundary components of $\Omega$. Each nodal line depends smoothly on $x$.

642

B. Helffer, M. Hoffmann-Ostenhof, T. Hoffmann-Ostenhof, M. P. Owen

(iii) If k is even and m = k then for each point x lying on an interior component of the boundary of there exists a unique ux ∈ U (up to multiplication by a complex constant) which has a zero of order k − 1 at x. The function ux has Property P.

[Fig. 3. Marked point: $x$.]

[Fig. 4. Marked point: $x$.]

For pictorial representations of cases (ii) and (iii), see Figs. 3 and 4 respectively.

Proof. (i) We shall first prove by induction the following statement: If $U_m$ is an $m$ dimensional vector space of groundstates of $H_{A,V}$ then for each point $x \in \partial\Omega$ there exists a function $f \in U_m$ which has a zero of order at least $m - 1$ at $x$. The first step of the induction, for $m = 1$, is trivial. Assume now that the above statement is true for some general $m$. Suppose that $U_{m+1}$ is an $m + 1$ dimensional vector space of groundstates of $H_{A,V}$. Let $U_m$ be any $m$ dimensional subspace of $U_{m+1}$. Then there exists a function $f_1 \in U_m$ which has a zero of order at least $m - 1$ at $x$. We can assume that the order of the zero is exactly $m - 1$, otherwise we have found a function with a zero of order at least $m$, and the argument for the induction step would finish. Now take

$$U_m' = \{f \in U_{m+1} : f \perp f_1\}.$$

By the same argument, there exists a function $f_2 \in U_m'$ which has a zero of order $m - 1$ at $x$. Using the Taylor expansions

$$f_i(r, \omega) = a_i r^{m-1} \cos((m - 1)\omega) + O(r^m), \qquad i = 1, 2,$$

(written in polar coordinates based at $x$, with $a_i \in \mathbb{C} \setminus \{0\}$), we see that the function $f = a_2 f_1 - a_1 f_2$ is not identically zero, and has a zero of order at least $m$ at $x$. This finishes the induction step. If $f$ has Property P then we choose $u = f$. Otherwise, if $f$ does not have Property P then $\mathrm{Re}[Lf]$ is not identically zero, and has a zero of order at least $m - 1$ at points $y \in \tilde\Omega$ such that $\Pi(y) = x$. Using Lemma 3.3 we see that $u := L^{-1}(\mathrm{Re}[Lf])$ has Property P, and has a zero of order at least $m - 1$ at $x$.

(ii) For this part we consider the case $m = k + 1$ and take any point $x \in \Sigma_0$. Part (i) shows that there exists a function $u_x \in U$ with Property P and which has a zero of order at least $k$ at $x$. Lemma 4.5 shows that the zero is of order $k$, and therefore $k$ nodal lines emanate from $x$. To prove uniqueness, suppose that $v_x$ is a linearly independent function which also has a zero of order $k$ at $x$. As above, using the Taylor expansions of $u_x$ and $v_x$ at $x$, we may find a linear combination of $u_x$ and $v_x$ which is not identically zero and which has a zero of order at least $k + 1$ at $x$. This contradicts Lemma 4.5.
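The cancellation of leading terms used in the induction step of (i), and again in the uniqueness argument of (ii), can be written out explicitly. The following is only a sketch based on the two expansions stated above; the remainder terms are tracked no further than their order in $r$.

```latex
\begin{align*}
a_2 f_1(r,\omega) - a_1 f_2(r,\omega)
  &= a_2\bigl(a_1 r^{m-1}\cos((m-1)\omega) + O(r^m)\bigr)
   - a_1\bigl(a_2 r^{m-1}\cos((m-1)\omega) + O(r^m)\bigr) \\
  &= (a_2 a_1 - a_1 a_2)\, r^{m-1}\cos((m-1)\omega) + O(r^m) \\
  &= O(r^m).
\end{align*}
```

Since $f_1 \perp f_2$ and both are non-zero, they are linearly independent, so the combination with non-zero coefficients $a_2$ and $-a_1$ cannot vanish identically.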


Due to Lemma 4.4, $\Omega \setminus N$ is connected, and therefore each pair of nodal lines only intersect at $x$. The nodal lines must also end at distinct interior boundary components. Since zeros of order larger than 1 only occur at points of intersection of nodal lines, there can only occur zeros of order 1 away from $x$. At such zeros, the gradient of $u_x$ is non-zero. We may multiply $u_x$ by the local gauge $e^{i\varphi}$, where $\varphi$ is given in Eq. (1.3), to make it a real valued function. The function $w_x = e^{i\varphi} u_x$ has locally the same nodal set as $u_x$. Note that $w_x$ depends smoothly on $x$. In order to see this, one should note that a linear combination of eigenfunctions with a zero of order $m - 1$ at $x$ can be found by solving a system of linear equations which, by uniqueness (see above), has full rank. Since the gradient of $w_x$ is non-zero at the nodal set away from $x$, the nodal lines depend smoothly on $x$.

(iii) The proof of this part is similar.

Proof of Theorem 1.4 (ii). Let $m$ denote the multiplicity of the first eigenvalue of $H_{A,V}$. Lemma 4.6 (i) shows that for any point $x \in \partial\Omega$ there exists a groundstate of $H_{A,V}$ which has a zero of order $l \ge m - 1$ at $x$. Lemma 4.5 shows that $l \le k$. This gives the universal bound $m \le k + 1$, and in particular shows that for $k = 1$ we have $m \le 2$.

We consider now the case when $k \ge 2$ and suppose for a contradiction that $m = k + 1$. Lemma 4.6 (ii) shows that for each point $x$ lying on $\Sigma_0$ there exists a unique eigenfunction $u_x$ which has a zero of order $k$ at $x$. Since each $u_x$ has Property P, the nodal set of each $u_x$ slits $\Omega$. The nodal set of each individual $u_x$ has $k$ nodal lines $\{\Gamma_{x,1}, \dots, \Gamma_{x,k}\}$, emanating from $x$, and each line ends at a distinct interior boundary component. We may parametrise each line $\Gamma_{x,i}$ by a path $\gamma_{x,i}$ chosen so that $\gamma_{x,i}(0) = x$ and $\gamma_{x,i}(1) \in \Sigma_i$ for each $i$. Each path $\gamma_{x,i}$ varies smoothly with $x$.

[Fig. 5. Labels: $\sigma_0$, $\sigma_1$, $\gamma_{x_0,1}$, $\gamma_{x_t,1}$, $x_0$, $x_t$, $y_0$, $y_t$.]

[Fig. 6. Labels: $x_0$, $x_t$.]

We shall see that if we move $x$ round the boundary $\Sigma_0$, the nodal sets of the corresponding functions wind round the holes. After one complete turn, we cannot obtain the original nodal set, thus contradicting uniqueness of the original eigenfunction. We obtain the contradiction formally as follows: Let $\sigma_0$ be a closed path which parametrises the outer boundary component $\Sigma_0$ of $\Omega$, and which turns once in a clockwise direction. For $s \in [0, 1]$, let $x_s = \sigma_0(s)$ and let $y_s = \gamma_{x_s,1}(1)$. Since $\sigma_0$ is closed, $x_0 = x_1$. Also, since $\gamma_{x_s,1}$ depends smoothly on $x_s$, which in turn depends smoothly on $s$, the point $y_s$ moves smoothly round the inner boundary component $\Sigma_1$. For a fixed $t \in [0, 1]$ define

$$\sigma_{0,t}(s) = \sigma_0(st) = x_{st}, \qquad \sigma_{1,t}(s) = y_{st}.$$

The paths $\sigma_{0,t}$ and $\sigma_{1,t}$ are parametrisations of segments of $\Sigma_0$ and $\Sigma_1$ respectively. Note that $\sigma_{0,1} = \sigma_0$ and $\sigma_{1,1} = \sigma_1^p$ for some $p \in \mathbb{Z}$, where $\sigma_1^p$ means running $p$ times around the closed path $\sigma_1$. For all $t \in [0, 1]$ we have

$$\sigma_{0,t}^{-1} \circ \gamma_{x_t,1}^{-1} \circ \sigma_{1,t} \circ \gamma_{x_0,1} \sim 0, \tag{4.4}$$

where $\circ$ denotes gluing of paths and $\sim$ denotes homotopy. This means that the left hand side of (4.4) is a closed path that does not enclose any holes. See Fig. 5. Setting $t = 1$ we get

$$\sigma_0^{-1} \circ \gamma_{x_0,1}^{-1} \circ \sigma_1^p \circ \gamma_{x_0,1} \sim 0,$$

and therefore

$$\sigma_1^p \sim \gamma_{x_0,1}^{-1} \circ \sigma_1^p \circ \gamma_{x_0,1} \sim \sigma_0.$$

This gives us a contradiction because the path $\sigma_1^p$ is not homotopic to $\sigma_0$. Hence $m \le k$.

Finally we consider the case where $k$ is even and $k \ge 4$. Let $\Omega'$ denote the closure of our region $\Omega$ with the points of the outer boundary identified. Let $D_{k-1} \subset \mathbb{R}^2$ denote an open disk with $k - 1$ smaller, disjoint, closed disks removed. There exists a homeomorphism

$$X : \Omega' \to D_{k-1} \tag{4.5}$$

such that $X$ restricted to $\Omega$ is smooth, and such that the boundary component $\Sigma_1$ maps to the outer boundary of $D_{k-1}$. See Fig. 7.

[Fig. 7. Labels: $\Sigma_0$, $\Sigma_1$, $p$, $X$, $D_{k-1}$.]

One can imagine $X$ as a composition of mapping $\Omega$ onto the surface of a sphere, deforming it so that $\Sigma_1$ becomes very large and $\Sigma_0$ very small, and then finally pulling $\Omega$ off the sphere. Let $p := X(\Sigma_0) \in D_{k-1}$, so that $X(\Omega) = D_{k-1} \setminus \{p\}$. Let $N$ be a set which slits $\Omega$. We claim that $X(N)$ slits $D_{k-1}$. Indeed, since $k$ is even, the number of nodal lines hitting the outer boundary component $\Sigma_0$ is even (possibly zero). This corresponds to an even number of paths in $X(N)$ starting or finishing at $p$. These paths can be paired together to link distinct boundary components. Since $X^{-1}$ is a smooth bijection away from $p$, the resulting paths are still piecewise smooth. It is easy to verify that all the other slitting conditions are satisfied.


Suppose for a contradiction that $m = k$. For $s \in [0, 1]$, let $x_s = \sigma_1(s)$ be a point on the interior boundary component $\Sigma_1$ of $\Omega$. Lemma 4.6 (iii) shows that there exists a unique $u_{x_s} \in U$ (up to multiplication by a complex constant) which has a zero of order $k - 1$ at $x_s$. The nodal set $N(u_{x_s})$ consists of $k - 1$ nodal lines emanating from $x_s$. As shown above, the set $S_s := X(N(u_{x_s}))$ slits $D_{k-1}$ and consists of $k - 1$ lines emanating from the point $y_s = X(x_s)$ on the outer boundary of $D_{k-1}$. We have thus constructed a family of slitting sets $S_s$ which depends continuously on the parameter $s \in [0, 1]$, and such that $S_0 = S_1$. By moving the point $y_s$ round the outer boundary of $D_{k-1}$ and using the homotopy argument above, we obtain a similar contradiction. Hence $m \le k - 1$.

Proof of Theorem 1.4 (iii). Suppose that $k = 1$ and that the multiplicity of the first eigenvalue is two. Suppose for a contradiction that there exist two linearly independent groundstates $v_1$ and $v_2$ such that the set $S = N(v_1) \cap N(v_2)$ is non-empty, and let $z$ be any point in $S$. Since $\{v_1, v_2\}$ is a basis of the groundstate eigenspace $U$ of $H_{A,V}$, the nodal set of every function $u \in U$ contains the point $z$. From Lemma 4.6 (ii) we see that for each point $x$ on the outer boundary $\Sigma_0$ of $\Omega$ there exists a unique eigenfunction $u_x \in U$ such that $x \in N(u_x)$. If we start $x$ at the point $x_0 = \sigma_0(0)$ and then move $x$ continuously round the outer boundary $\Sigma_0$ once in a clockwise direction then the segment of the nodal line joining $x$ to $z$ deforms continuously and winds around the inner boundary $\Sigma_1$ (see Fig. 8). The resulting nodal line is different from the original, thus contradicting uniqueness of the eigenfunction $u_{x_0}$. This argument can be formalised using a homotopy argument similar to that found in the proof of part (ii).

[Fig. 8. Labels: $x_0$, $z$.]

Suppose that $u = \alpha_1 u_1 + \alpha_2 u_2$, where $\alpha_1 \bar\alpha_2 \notin \mathbb{R}$. Since each function $Lu_i$ is real valued (see the construction of the $u_i$ in the proof of Theorem 1.4 (i)), and since the coefficient $\alpha_1 \bar\alpha_2$ is not real while $|\alpha_2|^2$ is, we have

$$N(L(\bar\alpha_2 u)) = N(\alpha_1 \bar\alpha_2 Lu_1 + |\alpha_2|^2 Lu_2) = N(Lu_1) \cap N(Lu_2).$$

Since the nodal sets of $u_1$ and $u_2$ do not intersect, we have

$$N(u) = \Pi(N(Lu)) \subseteq \Pi(N(Lu_1)) \cap \Pi(N(Lu_2)) = N(u_1) \cap N(u_2) = \emptyset.$$

For the case $k = 2$, the proof uses the map $X : \Omega' \to D_1$ (see Eq. (4.5)) to essentially reduce the region with two holes to the single hole case.


Proof of Inequality (1.8) from Theorem 1.1. Suppose that $k = 1$, and let $A_1$ and $A_2$ be magnetic vector potentials, where $A_1$ has circulation $1/2$. Let $\Phi$ denote the circulation of $A_2$. Suppose for a contradiction that $\Phi \notin 1/2 + \mathbb{Z}$ and that $\lambda_1(H_{A_2,V}) \ge \lambda_1(H_{A_1,V})$. Using Theorem 1.4 (i), there exists a groundstate $u_1$ of $H_{A_1,V}$ which has a nodal set $N$ which slits $\Omega$. As we are in the single hole case, the nodal set consists of a single line $\Gamma$ which joins the outer boundary to the inner boundary. We shall need an operator $H_{\Gamma,A_2,V}$, which has extra Dirichlet boundary conditions imposed along the line $\Gamma$. This is defined formally as the self-adjoint operator corresponding to the restriction of the closed quadratic form $Q_{A_2,V}$ (defined in (2.1)) to the domain

$$Q^{\mathrm{Neu}}_\Gamma = \{u \in Q^{\mathrm{Neu}} = W^{1,2}(\Omega) : u|_\Gamma = 0\}.$$

Using our supposition, and the fact that the nodal set of $u_1$ consists of the line $\Gamma$, we have

$$\lambda_1(H_{A_2,V}) \ge \lambda_1(H_{A_1,V}) = \lambda_1(H_{\Gamma,A_1,V}). \tag{4.6}$$

Since $\Omega \setminus \Gamma$ is simply connected, $H_{\Gamma,A_1,V}$ is unitarily equivalent to $H_{\Gamma,A_2,V}$, and therefore

$$\lambda_1(H_{\Gamma,A_1,V}) = \lambda_1(H_{\Gamma,A_2,V}) = \inf_{u \in Q^{\mathrm{Neu}}_\Gamma(\Omega)} Q_{A_2,V}(u) \ge \lambda_1(H_{A_2,V}). \tag{4.7}$$

We have equality in (4.6) and (4.7), and therefore the groundstate $u_2 \in Q^{\mathrm{Neu}}_\Gamma$ of $H_{\Gamma,A_2,V}$ is also a groundstate of $H_{A_2,V}$. The nodal sets of $u_1$ and $u_2$ both contain $\Gamma$. Since $\operatorname{curl} A_1 = \operatorname{curl} A_2 = 0$ in the simply connected set $\Omega \setminus \Gamma$, there exist smooth functions $\varphi_1, \varphi_2 : \Omega \setminus \Gamma \to \mathbb{R}$ such that $\nabla\varphi_i = A_i$. The functions $\varphi_1$ and $\varphi_2$ supply us with gauge transformations $e^{i\varphi_1}$ and $e^{i\varphi_2}$, from which we see that both $e^{i\varphi_1} u_1$ and $e^{i\varphi_2} u_2$ are groundstates of $H_{\Gamma,0,V}$. By uniqueness of the groundstate of a non magnetic Schrödinger operator, we have $u_2 = \lambda e^{i(\varphi_2 - \varphi_1)} u_1$ for some constant $\lambda \in \mathbb{C} \setminus \{0\}$. Let $\varphi_3 = \varphi_2 - \varphi_1$. Since both $u_1$ and $u_2$ are smooth functions on $\Omega$ we may extend $\varphi_3$ to a $C^1$ multivalued function on $\Omega$. The values that $\varphi_3$ takes at a point differ by multiples of $2\pi$. Hence for a path $\sigma$ which circulates once there is an integer $l$ such that

$$\frac{1}{2\pi}\int_\sigma A_2 \cdot dx = \frac{1}{2\pi}\int_\sigma A_1 \cdot dx + \frac{1}{2\pi}\int_\sigma (A_2 - A_1) \cdot dx = \frac{1}{2} + \frac{1}{2\pi}\int_\sigma d\varphi_3 = \frac{1}{2} + l.$$

This contradicts our assumption that $\Phi \notin 1/2 + \mathbb{Z}$.

Remark 4.7. Using semiclassical arguments as in [Hel88a], we can show that for $k \ge 2$, the first eigenvalue is not necessarily maximised for circulation $(1/2, \dots, 1/2)$. However, we may use methods similar to those in the above proof to show that

$$\lambda_1(1/2, \dots, 1/2) = \inf_{S \in \mathcal{S}} \lambda_1(H_{S,0,V}), \tag{4.8}$$

where $\mathcal{S}$ is the collection of all sets $S$ which slit $\Omega$, and where $H_{S,0,V}$ is defined (as in the above proof) to have extra Dirichlet boundary conditions along $S \in \mathcal{S}$.


5. Additional Results and Examples

Proposition 5.1. If a collection of paths $\{\gamma_1, \dots, \gamma_n\}$ slits a region $\Omega$ with $k$ holes then $k/2 \le n \le k$.

Proof. The lower bound on $n$ is elementary because there are an odd number of lines (i.e. at least one) leaving each of the $k$ holes. There must therefore be at least $k/2$ lines. We finally prove the upper bound on $n$. Let $\sigma_0$ be a closed path which parametrises the outer boundary $\Sigma_0$ of $\Omega$, and let $\sigma_1, \dots, \sigma_k$ be closed paths which parametrise the $k$ other boundary components $\Sigma_1, \dots, \Sigma_k$. Define

$$S_0 = \bigcup_{i=0}^{k} \sigma_i(0), \tag{5.1}$$

$$S_1 = \Bigl(\bigcup_{i=0}^{k} \{\sigma_i(0, 1)\}\Bigr) \cup \Bigl(\bigcup_{j=1}^{n} \{\gamma_j[0, 1]\}\Bigr), \tag{5.2}$$

$$S_2 = \{\Omega \setminus N\}. \tag{5.3}$$

Let $(N_0, N_1, N_2) = (k + 1, k + 1 + n, 1)$ be the triple of integers associated to this decomposition, in which $N_i$ is the number of elements in the collection $S_i$. The decomposition $D$ is not a standard CW decomposition of $\Omega$, and therefore the number $N := N_0 - N_1 + N_2$ will not yield the Euler number $\chi(\Omega) = -k + 1$. It is however possible to modify the decomposition to make it into a proper CW decomposition in two steps:

(i) We first add vertices where intersections of elements of $S_1$ occur at points which are not in $S_0$. This step will decompose some elements of $S_1$ into smaller parts but leaves the element $\Omega \setminus N$ of $S_2$ unaltered. Let $S_0'$ denote the new collection of vertices.

(ii) If $\Omega \setminus N$ is not simply connected then the second step is to add some extra lines, which begin and end at already existing vertices in $S_0'$, in order to break up (without disconnecting) the region into a single simply connected 2-cell.

Note that after each step, $S_2'$ still consists of just one connected open set, so $N_2' = 1$, whilst the number $N_0' - N_1'$ of vertices minus lines does not increase. It follows that $N_0 - N_1 + N_2 \ge N_0' - N_1' + N_2' = \chi(\Omega)$. Substituting in $N_0 = k + 1$, $N_1 = k + 1 + n$, $N_2 = 1$, and $\chi(\Omega) = -k + 1$, we obtain $n \le k$.
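The final substitution in the proof of Proposition 5.1 can be spelled out as a short worked computation. This uses only the counts and the Euler number quoted in the proof; nothing else is assumed.

```latex
\begin{align*}
N_0 - N_1 + N_2 &= (k+1) - (k+1+n) + 1 = 1 - n, \\
\chi(\Omega)    &= 1 - k, \\
1 - n \;\ge\; 1 - k \quad &\Longrightarrow \quad n \le k .
\end{align*}
```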

Example 5.2. The example of the circle $S^1$ is interesting to analyse. Consider the operator $P_\alpha = -(\partial_\varphi - i\alpha)^2$ on $L^2(S^1)$. The spectrum can be easily seen to be $\sigma(P_\alpha) = \{(n - \alpha)^2 : n \in \mathbb{Z}\}$, and therefore

$$\lambda_1(P_\alpha) = \min_{n \in \mathbb{Z}} (n - \alpha)^2.$$

When $\alpha$ is an integer, the first eigenvalue is 0 and is simple, and the corresponding eigenfunction is $\exp(i\alpha\varphi)$. The first eigenvalue is actually simple whenever $\alpha$ is not a half-integer. On the other hand, if $\alpha$ is a half-integer, the first eigenvalue is $1/4$, with multiplicity two. If, for example, $\alpha = 1/2$, the corresponding eigenspace is spanned by the functions $1$ and $\exp(i\varphi)$ (or alternatively by the functions $\exp(i\varphi/2)\cos(\varphi/2)$ and $\exp(i\varphi/2)\sin(\varphi/2)$), and one can parametrise all the resulting eigenfunctions, in terms of a parameter $\varphi_0$, by $\exp(i\varphi/2)\sin((\varphi - \varphi_0)/2)$. It is easy to see how the degeneracy of the first eigenvalue disappears when considering $P_{\alpha,\varepsilon,v} = -(\partial_\varphi - i\alpha)^2 + \varepsilon v(\varphi)$ perturbatively, for $\varepsilon \neq 0$ small, provided $v(\varphi)$ satisfies the condition

$$\int_0^{2\pi} v(\varphi) e^{i\varphi}\, d\varphi \neq 0.$$
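The spectral statements of Example 5.2 are easy to check by direct computation. The following is a minimal sketch, assuming only the formula $\sigma(P_\alpha) = \{(n - \alpha)^2 : n \in \mathbb{Z}\}$ quoted above; the function name `first_eigenvalue` is hypothetical and chosen for illustration.

```python
from fractions import Fraction
import math


def first_eigenvalue(alpha):
    """Return (lambda_1, modes) for P_alpha = -(d/dphi - i*alpha)^2 on the circle.

    lambda_1 is min over integers n of (n - alpha)^2, and modes lists the
    integers n attaining the minimum, i.e. the Fourier modes exp(i*n*phi)
    spanning the first eigenspace.
    """
    alpha = Fraction(alpha)
    n0 = math.floor(alpha)
    # The integers closest to alpha are always among floor(alpha) and floor(alpha) + 1.
    values = {n: (n - alpha) ** 2 for n in (n0, n0 + 1)}
    lam = min(values.values())
    modes = sorted(n for n, v in values.items() if v == lam)
    return lam, modes


if __name__ == "__main__":
    for a in (0, Fraction(1, 3), Fraction(1, 2), Fraction(-1, 2)):
        lam, modes = first_eigenvalue(a)
        print(f"alpha = {a}: lambda_1 = {lam}, multiplicity = {len(modes)}, modes = {modes}")
```

Exact rational arithmetic is used so that the tie between two modes at half-integer $\alpha$ is detected without floating-point tolerance.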

Example 5.3. In [Hel88b, Subsect. 7.3], an example is given in which the multiplicity of the first eigenvalue is two. The domain and potential $V$ are symmetric under the map $S : z \mapsto -z$, and the magnetic potential is given explicitly by

$$A = \frac{\Phi}{2\pi r^2} \begin{pmatrix} -y \\ x \end{pmatrix}.$$
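As a sanity check on the explicit potential of Example 5.3, the line integral of $A$ around a circle centred at the origin can be evaluated numerically; it equals $\Phi$ for every radius, which reflects the fact that $A$ is curl-free away from the origin. The sketch below is illustrative only: the function name and the midpoint-rule discretisation are choices made here, not part of the cited example.

```python
import math


def line_integral_around_origin(phi, radius, steps=10_000):
    """Approximate the integral of A . dx over the circle of given radius.

    A = phi / (2*pi*r^2) * (-y, x), with r^2 = x^2 + y^2, and the circle is
    traversed once anticlockwise. A midpoint rule in the angle is used.
    """
    total = 0.0
    dtheta = 2.0 * math.pi / steps
    for j in range(steps):
        theta = (j + 0.5) * dtheta
        x, y = radius * math.cos(theta), radius * math.sin(theta)
        r2 = x * x + y * y
        ax = -phi * y / (2.0 * math.pi * r2)
        ay = phi * x / (2.0 * math.pi * r2)
        # Tangent vector of the parametrisation, multiplied by dtheta.
        dx = -radius * math.sin(theta) * dtheta
        dy = radius * math.cos(theta) * dtheta
        total += ax * dx + ay * dy
    return total


if __name__ == "__main__":
    for radius in (0.5, 1.0, 3.0):
        print(radius, line_integral_around_origin(math.pi, radius))
```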

If we take the case when the flux is a half-integer and we compose the operator $K$ (see Remark 3.4) with the operator $S$ defined by $(Su)(z) = u(Sz)$, the operator $M = SK$ commutes with $P_{A,V}$ and satisfies $M^2 = -I$. Kramers' theorem shows that the multiplicity is at least two. One can indeed show that $u$ and $Mu$ are linearly independent. An alternative proof is simply to say that $Su$ is also an eigenvector with nodal set $S\gamma$, where $\gamma$ is the nodal set of $u$. Since $S\gamma$ is not equal to $\gamma$, the function $Su$ is linearly independent of $u$.

Acknowledgement. The authors wish to thank L. Friedlander and P. Michor for many interesting discussions and J. Rubinstein for useful correspondence. B. Helffer and T. Hoffmann-Ostenhof are grateful to A. Laptev for inviting them to Stockholm, where this research was initiated.

References

[BR97] Berger, J. and Rubinstein, J.: On the zero set of the wave function in superconductivity. Preprint, 1997
[Ber55] Bers, L.: Local behaviour of solutions of general linear equations. Commun. Pure Appl. Math. 8, 473–496 (1955)
[BCC98] Besson, G., Colbois, B. and Courtois, G.: Sur la multiplicité de la première valeur propre de l'opérateur de Schrödinger avec champ magnétique sur la sphère $S^2$. Trans. Amer. Math. Soc. 350, 331–345 (1998)
[Che76] Cheng, S.Y.: Eigenfunctions and nodal sets. Comment. Math. Helv. 51, 43–55 (1976)
[Col93] Colin de Verdière, Y.: Multiplicités des valeurs propres. Laplaciens discrets et laplaciens continus. Rend. Mat. Appl. VII, 433–460 (1993)
[EMQ94] Elliott, C.M., Matano, H. and Qi, T.: Zeros of complex Ginzburg–Landau order parameter with applications to superconductivity. Eur. J. Appl. Math. 5, 431–448 (1994)
[GHL90] Gallot, S., Hulin, D. and Lafontaine, J.: Riemannian geometry. 2nd ed., Universitext, Berlin–Heidelberg–New York: Springer-Verlag, 1990
[GT83] Gilbarg, D. and Trudinger, N.S.: Elliptic partial differential equations of second order. 2nd ed., Grundlehren der mathematischen Wissenschaften, no. 224, Berlin–Heidelberg–New York: Springer-Verlag, 1983
[Hel88a] Helffer, B.: Effet d'Aharonov-Bohm sur un état borné de l'équation de Schrödinger. Commun. Math. Phys. 119, 315–329 (1988)
[Hel88b] Helffer, B.: Semi-classical analysis for the Schrödinger operator and applications. Lecture notes in mathematics, no. 1336, Berlin–Heidelberg–New York: Springer-Verlag, 1988
[Hel94] Helffer, B.: On spectral theory for Schrödinger operators with magnetic potentials. Adv. Stud. Pure Math. 23, 113–141 (1994)
[HN97] Herbst, I. and Nakamura, S.: Schrödinger operators with strong magnetic fields: Quasiperiodicity of spectral orbits and topology. Amer. Math. Soc. Transl. (2), 189, 105–123 (1999)
[HOHON98] Hoffmann-Ostenhof, M., Hoffmann-Ostenhof, T. and Nadirashvili, N.: On the multiplicity of eigenvalues of the Laplacian on surfaces. Ann. Global Anal. Geom. (1998), to appear
[HOMN] Hoffmann-Ostenhof, T., Michor, P. and Nadirashvili, N.: Bounds on the multiplicity of eigenvalues for fixed membranes. Preprint 1998
[Kos80] Kosniowski, C.: A first course in algebraic topology. Cambridge: Cambridge University Press, 1980
[LO77] Lavine, R. and O'Carroll, M.: Ground state properties and lower bounds for energy levels of a particle in a uniform magnetic field and external potential. J. Math. Phys. 18, 1908–1912 (1977)
[LP62] Little, W.A. and Parks, R.D.: Observation of quantum periodicity in the transition temperature of a superconducting cylinder. Phys. Rev. Lett. 9, 9–12 (1962)
[Nad88] Nadirashvili, N.S.: Multiple eigenvalues of the Laplace operator. Math. USSR, Sb. 61, 225–238 (1988)
[Sim79] Simon, B.: Kato's inequality and the comparison of semigroups. J. Funct. Anal. 32, 97–101 (1979)
[Wlo82] Wloka, J.: Partielle Differentialgleichungen. Sobolevräume und Randwertaufgaben. B. G. Teubner, 1982

Communicated by B. Simon

Commun. Math. Phys. 202, 651 – 667 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Canonical Representations of Sp(1, n) Associated with Representations of Sp(1)

G. van Dijk, A. Pasquale*,**

Mathematical Institute, Leiden University, Niels Bohrweg 1, 2333 CA Leiden, The Netherlands. E-mail: [email protected]; [email protected]

Received: 1 May 1998 / Accepted: 18 November 1998

* Supported by a grant from the Dutch Organization for Scientific Research (N.W.O.) and by the Thomas Stieltjes Institute for Mathematics.
** Present address: Institut für Mathematik, TU-Clausthal, Erzstrasse 1, 38678 Clausthal-Zellerfeld, Germany. E-mail: [email protected]

Abstract: Canonical representations of $Sp(1, n)$ associated with finite dimensional irreducible representations of $Sp(1)$ are defined using vector-valued Berezin kernels. Their decomposition into irreducible representations is determined by decomposing the corresponding reproducing distributions in terms of positive definite trace spherical functions on $Sp(1, n)$. The canonical representations are also identified with the restriction to $Sp(1, n)$ of certain maximal degenerate representations of $SL(n + 1, \mathbb{H})$.

Introduction

Canonical representations are a special type of reducible unitary representations. They originated in the same spirit as the complementary series, that is as the completion of $\mathcal{D}(G/K)$ with respect to a new $G$-invariant inner product depending on a positive real parameter $\lambda$. $L^2(G/K)$ is recovered as a limiting case, by letting $\lambda \to +\infty$. The $G$-invariant inner product arises from the so-called Berezin kernel. The main task is the decomposition of the canonical representations into their irreducible constituents.

Canonical representations have been defined for $SU(1, 1)$ by Gel'fand, Graev and Vershik [12]. Van Dijk and Hille [1] extended their definition and decomposition first to $SU(1, n; \mathbb{F})$ ($\mathbb{F} = \mathbb{R}$, $\mathbb{C}$ or $\mathbb{H}$), and then to Hermitian symmetric spaces [2]. In the latter case, they also considered the notion of "canonical representation associated with a character". For example, in $SU(1, n; \mathbb{C})$, the character is a representation of $U(1; \mathbb{C})$.

In this paper we define and study the canonical representations of $Sp(1, n)$ $(= SU(1, n; \mathbb{H}))$ associated with finite dimensional irreducible representations of $Sp(1)$. This is a family of unitary representations of $Sp(1, n)$ which depends on two parameters: a nonnegative half-integer $l$ parametrizing the equivalence classes of unitary irreducible representations of $Sp(1)$ and a continuous real parameter $\lambda > l + 1$. The canonical representations of $Sp(1, n)$ defined in [1], which are associated with the trivial representation of $Sp(1)$, are obtained for $l = 0$.

After having given the definition of canonical representations associated with representations of $Sp(1)$, we describe their decomposition. This decomposition is presented in terms of the spectral decomposition of the corresponding representing distribution. It requires the harmonic analysis on vector bundles over $Sp(1, n)/Sp(1) \times Sp(n)$ which we developed in [5]. For small values of the parameters $l$ and $\lambda$, finitely many complementary series occur in the decomposition. This interesting phenomenon generalizes the one observed in [1] for the case $l = 0$.

Canonical representations appear in many different contexts; see for example [4] for a survey about canonical representations of $SU(1, n; \mathbb{C})$. In the final section we present a situation in which our canonical representations naturally occur, namely by taking the restriction to $Sp(1, n)$ of some maximal degenerate representations of $SL(n + 1, \mathbb{H})$.

1. Preliminaries

Let $\mathbb{H}$ denote the skew field of the quaternions, with units $1, i, j, k$ ($i^2 = j^2 = k^2 = -1$). If $q = a + ib + jc + kd \in \mathbb{H}$ (with $a, b, c, d \in \mathbb{R}$), then the quaternionic conjugate of $q$ is $\bar q = a - ib - jc - kd$. The real and the imaginary parts of q are respectively
(1)

for $x = (x_0, x_1, \dots, x_n)$ and $y = (y_0, y_1, \dots, y_n)$ in $\mathbb{H}^{n+1}$. $G$ is connected and simply connected. For an integer $m \ge 1$, the group $Sp(m)$ consists of all $m \times m$ matrices over $\mathbb{H}$ which preserve the inner product in $\mathbb{H}^m$

$$(x, y) = \bar y_1 x_1 + \cdots + \bar y_m x_m. \tag{2}$$

In particular, $Sp(1)$ is identified with the group of quaternions of norm equal to one. The subgroup $K$ of $G$ formed by the matrices of the form

$$\begin{pmatrix} u & 0 \\ 0 & h \end{pmatrix}, \qquad u \in Sp(1),\ h \in Sp(n),$$

is often identified with $Sp(1) \times Sp(n)$. It is maximally compact in $G$. Let $J = \operatorname{diag}(-1, 1, \dots, 1)$. The Lie algebra $\mathfrak{g}$ of $G$ consists of all $(n + 1) \times (n + 1)$ matrices over $\mathbb{H}$ satisfying $JX + \bar X^t J = 0$, i.e. of the form

$$\begin{pmatrix} z_1 & Z_2 \\ \bar Z_2^t & Z_3 \end{pmatrix}$$

with $z_1 \in \mathbb{H}$ satisfying $z_1 + \bar z_1 = 0$ and $Z_3$ a skew-Hermitian $n \times n$ matrix. The symbol $t$ denotes transposition.


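The subspace of $\mathfrak{g}$ formed by its Hermitian elements is denoted by $\mathfrak{p}$. The real span of the element

$$L = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}$$

is a maximal abelian subspace $\mathfrak{a}$ of $\mathfrak{p}$. If $\exp$ denotes the exponential map of $G$, then $A = \exp \mathfrak{a}$ consists of the matrices of the form

$$a_t = \begin{pmatrix} \cosh t & 0 & \sinh t \\ 0 & I & 0 \\ \sinh t & 0 & \cosh t \end{pmatrix}, \qquad t \in \mathbb{R},$$

$I$ being the identity matrix. We set $\rho := 2n + 1$. Let

$$K_1 = \left\{ \begin{pmatrix} u & 0 \\ 0 & I \end{pmatrix} : u \in Sp(1) \right\}, \qquad K_2 = \left\{ \begin{pmatrix} 1 & 0 \\ 0 & h \end{pmatrix} : h \in Sp(n) \right\}.$$

According to the Cartan decomposition of $G$, every element $g = [g_{ij}]_{i,j=0}^{n}$ can be written as $g = u h_1 a_t h_2$ for uniquely determined

$$u = \frac{g_{00}}{|g_{00}|} \in Sp(1), \qquad \cosh t = |g_{00}| \tag{3}$$

and for some $h_1, h_2 \in Sp(n)$. Moreover, if $g \notin K$, then $t > 0$ and $h_1$ and $h_2$ are uniquely determined modulo the subgroup

$$\left\{ \begin{pmatrix} 1 & 0 & 0 \\ 0 & V & 0 \\ 0 & 0 & 1 \end{pmatrix} : V \in Sp(n - 1) \right\}.$$

The norm of $x = (x_0, x_1, \dots, x_n) \in \mathbb{H}^{n+1}$ is $\|x\| = (x, x)^{1/2}$. The unit sphere in $\mathbb{H}^{n+1}$ is $S = \{\omega \in \mathbb{H}^{n+1} : \|\omega\| = 1\}$. We consider two $G$-actions: the usual matrix multiplication of $g \in G$ by elements $x \in \mathbb{H}^{n+1}$ considered as column vectors, denoted $gx$, and the action of $G$ on $S$ defined by

$$g \cdot \omega = \frac{g\omega}{\|g\omega\|}. \tag{4}$$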
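The matrices $a_t$ have real entries, so some of the statements above can be checked with ordinary floating-point arithmetic. The sketch below verifies that $a_t$ preserves $J = \operatorname{diag}(-1, 1, \dots, 1)$, that $a_s a_t = a_{s+t}$, and that the normalised image $a_t \cdot e_0$ of $e_0 = (1, 0, \dots, 0)$ under the action (4) stays on the unit sphere. All function names are hypothetical. The sign convention $[\omega, \omega] = |\omega_0|^2 - \sum_{i \ge 1} |\omega_i|^2$ used in the last check is an assumption, since the definition of $[\cdot, \cdot]$ is not reproduced in this part of the text.

```python
import math


def a_t(t, n):
    """Real (n+1) x (n+1) matrix a_t: cosh/sinh in the corners, identity elsewhere."""
    size = n + 1
    m = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    m[0][0] = m[n][n] = math.cosh(t)
    m[0][n] = m[n][0] = math.sinh(t)
    return m


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def form_matrix(n):
    """J = diag(-1, 1, ..., 1) of size n+1."""
    return [[(-1.0 if i == 0 else 1.0) if i == j else 0.0 for j in range(n + 1)]
            for i in range(n + 1)]


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def act_on_e0(t, n):
    """Normalised first column of a_t, i.e. a_t . e_0 under the action (4)."""
    col = [row[0] for row in a_t(t, n)]
    norm = math.sqrt(sum(c * c for c in col))
    return [c / norm for c in col]


if __name__ == "__main__":
    n, s, t = 2, -0.3, 0.7
    g, J = a_t(t, n), form_matrix(n)
    print("a_t^T J a_t - J :", max_abs_diff(matmul(transpose(g), matmul(J, g)), J))
    print("a_s a_t - a_{s+t}:", max_abs_diff(matmul(a_t(s, n), g), a_t(s + t, n)))
    w = act_on_e0(t, n)
    # Assumed sign convention for [w, w]; see the lead-in.
    print("[w, w] (assumed form):", w[0] ** 2 - sum(c * c for c in w[1:]))
```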

The orbit of the point $e_0 = (1, 0, \dots, 0) \in S$ under this second action is $O = \{\omega \in S : [\omega, \omega] > 0\}$, an open subspace of $S$. The action is transitive, and it gives a diffeomorphism between $G/K_2$ and $O$.

2. The Canonical Representations

It is proved in [1], p. 126 that the $K$-biinvariant function defined by

$$\psi_\lambda(a_t) = (\cosh t)^{-2\lambda}, \qquad t \in \mathbb{R}, \tag{5}$$

is positive definite when $\lambda > 1$. The canonical representation of $Sp(1, n)$ of parameter $\lambda$ is then defined as the unique unitary representation associated with $\psi_\lambda$ according to the classical construction by Gel'fand and Raikov. We will generalize this definition by replacing the condition of $K$-biinvariance with the condition of having type $\tau$ for some fixed finite dimensional unitary irreducible representation $\tau$ of $Sp(1)$. The construction of Gel'fand and Raikov has been extended to this setting by Kunze [10].

Recall that the set of equivalence classes of finite dimensional irreducible representations of $K_1 \equiv Sp(1)$ is parametrized by the set $\mathbb{N}/2$ of nonnegative half-integers $\{0, 1/2, 1, 3/2, \dots\}$. We denote with the same symbol $\tau_l$ either the equivalence class corresponding to the parameter $l$ or a fixed unitary representative for it. Thus $\tau_l$ is a unitary irreducible representation of $K_1$ in a Hilbert space $(V_l, \langle\cdot,\cdot\rangle_l)$ of dimension $d_l = 2l + 1$. We extend $\tau_l$ to a representation of $K$ by setting $\tau_l \equiv 1$ on $K_2$. Each $\tau_l$ is self-dual, i.e. unitarily equivalent to its contragredient representation. It follows in particular that the character $\chi_l = \operatorname{tr} \tau_l$ of $\tau_l$ satisfies $\chi_l(k^{-1}) = \chi_l(k)$, $k \in K$. An operator-valued function $F : G \to \operatorname{End}(V_l)$ is said to have type $\tau_l$ if

$$F(kgk') = \tau_l(k) F(g) \tau_l(k'), \qquad k, k' \in K,\ g \in G. \tag{6}$$

$F$ is normalized if $F(e) = I$ ($e$ the unit of $G$, $I$ the identity of $V_l$). $F$ is said to be positive definite (written $F \gg 0$) if

$$\sum_{i,j=1}^{n} \langle F(g_j^{-1} g_i) v_i, v_j \rangle_l \ge 0 \tag{7}$$

for every integer $n \ge 1$ and for all $g_1, \dots, g_n \in G$ and $v_1, \dots, v_n \in V_l$. If $F$ is continuous this is equivalent to the condition

$$\int_G \int_G \langle F(g_1^{-1} g_2) f(g_1), f(g_2) \rangle_l \, dg_1 \, dg_2 \ge 0 \tag{8}$$

for every $f \in \mathcal{D}(G; V_l)$, the space of $C^\infty$ compactly supported functions $f : G \to V_l$. Let $C^\infty(G; \tau_l)$ denote the $\mathbb{C}$-vector space of smooth functions $F : G \to \operatorname{End}(V_l)$ of type $\tau_l$, and let $C^\infty(G; \chi_l)$ be the space of smooth functions $f : G \to \mathbb{C}$ satisfying

$$f * d_l \chi_l = f \tag{9}$$

and

$$f^0 = f, \tag{10}$$

where

$$f^0(x) := \int_K f(k x k^{-1}) \, dk. \tag{11}$$
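Condition (7) can be probed numerically in the simplest situation: the scalar case $l = 0$ with all group elements taken in the subgroup $A$, where $a_{t_j}^{-1} a_{t_i} = a_{t_i - t_j}$ and the function (5) gives the matrix entries $(\cosh(t_i - t_j))^{-2\lambda}$. The sketch below builds this matrix and attempts a Cholesky factorisation, which succeeds exactly when the matrix is positive definite. It is only a necessary check on finitely many points of $A$, not a proof of positive definiteness on $G$, and it does not probe the threshold $\lambda > 1$ stated in the text. The function names and sample points are chosen here for illustration.

```python
import math


def psi(t, lam):
    """The function (5) evaluated on a_t: (cosh t)^(-2*lam)."""
    return math.cosh(t) ** (-2.0 * lam)


def gram_matrix(ts, lam):
    """Matrix with entries psi(t_i - t_j), i.e. psi evaluated on a_{t_j}^{-1} a_{t_i}."""
    return [[psi(ti - tj, lam) for tj in ts] for ti in ts]


def cholesky(a):
    """Return the lower-triangular Cholesky factor of a, or None if a is not positive definite."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return None
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def quadratic_form(a, v):
    """Sum over i, j of a[i][j] * v[i] * v[j] for a real vector v."""
    return sum(a[i][j] * v[i] * v[j] for i in range(len(v)) for j in range(len(v)))


if __name__ == "__main__":
    points = [-1.0, 0.0, 0.5, 2.0]
    m = gram_matrix(points, lam=1.5)
    print("Cholesky factor exists:", cholesky(m) is not None)
    print("quadratic form:", quadratic_form(m, [1.0, -2.0, 1.5, -0.5]))
```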

Then $C^\infty(G; \tau_l)$ and $C^\infty(G; \chi_l)$ are isomorphic under the map $F \mapsto d_l \operatorname{tr} F$. This isomorphism restricts to an algebra isomorphism of the convolution algebras $\mathcal{D}(G; \tau_l)$ and $\mathcal{D}(G; \chi_l)$ respectively consisting of the compactly supported elements of $C^\infty(G; \tau_l)$ and $C^\infty(G; \chi_l)$ (see e.g. [3], Theorem 1.1). The inverse isomorphism is $f \mapsto f * \tau_l$. The algebras $\mathcal{D}(G; \tau_l)$ and $\mathcal{D}(G; \chi_l)$ are commutative, see [5], Sect. 2. Moreover, for every $f \in C^\infty(G; \chi_l)$ we have

$$f(g^{-1}) = f(g), \qquad g \in G, \tag{12}$$

and

$$f(g) = f(u a_t) = \frac{1}{d_l} \chi_l(u) f(a_t) \qquad \text{if } g = u h_1 a_t h_2 \text{ with } u \in U,\ t \in \mathbb{R},\ h_1, h_2 \in Sp(n). \tag{13}$$

For $l \in \mathbb{N}/2$, $\lambda \in \mathbb{C}$ and $q \in \mathbb{H}$, $q \neq 0$, we set

$$q^{l,\lambda} = |q|^\lambda \tau_l(q/|q|). \tag{14}$$

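Definition 1. For $l \in \mathbb{N}/2$ and $\lambda \in (0, \infty)$, define $\Psi_{l,\lambda} : G \to \operatorname{End}(V_l)$ and $B_{l,\lambda} : O \times O \to \operatorname{End}(V_l)$ by

$$\Psi_{l,\lambda}(g) = [g e_0, e_0]^{l,-2\lambda} = |g_{00}|^{-2\lambda} \tau_l\bigl(g_{00}/|g_{00}|\bigr), \qquad g \in G,$$

$$B_{l,\lambda}(\omega, \upsilon) = \left( \frac{[\omega, \omega][\upsilon, \upsilon]}{[\omega, \upsilon][\upsilon, \omega]} \right)^{\lambda} \tau_l\!\left( \frac{[\omega, \upsilon]}{|[\omega, \upsilon]|} \right), \qquad \omega, \upsilon \in O.$$

In analogy with the case $l = 0$, we call the function $B_{l,\lambda}$ the Berezin kernel of parameter $\lambda$ associated with $\tau_l$. Observe that $[\omega, \upsilon] \neq 0$ for $\omega, \upsilon \in O$. Indeed $[\omega, \upsilon] = g_{00}$ if $\omega = g_1 \cdot e_0$, $\upsilon = g_2 \cdot e_0$ and $g_2^{-1} g_1 = [g_{ij}]_{i,j=0}^{n}$, and $|g_{00}|^2 = 1 + \sum_{j=1}^{n} |g_{0j}|^2 \ge 1$ for $g \in G$.

Lemma 1.
(a) $\Psi_{l,\lambda}(k_1 g k_2) = \tau_l(k_1) \Psi_{l,\lambda}(g) \tau_l(k_2)$, $\quad k_1, k_2 \in K$, $g \in G$.
(b) $\Psi_{l,\lambda}(a_t) = (\cosh t)^{-2\lambda} I$, $\quad a_t \in A$.
(c) $\Psi_{l,\lambda}(g_1^{-1} g_2) = B_{l,\lambda}(g_1 \cdot e_0, g_2 \cdot e_0)$, $\quad g_1, g_2 \in G$.

The following corollary will be a consequence of Theorem 1.

Corollary 1. $\Psi_{l,\lambda}$ is a positive-definite operator-valued function for $\lambda > l + 1$.

Definition 2. For $\lambda > l + 1$ the canonical representation $\Pi_{l,\lambda}$ of parameter $\lambda$ associated with $\tau_l$ is the representation corresponding to the normalized positive-definite operator-valued function $\Psi_{l,\lambda}$ of type $\tau_l$. We call $\Psi_{l,\lambda}$ the reproducing distribution of $\Pi_{l,\lambda}$.

The representation $\Pi_{l,\lambda}$ can be realized as follows. Let $\mathcal{D}(G/K; \tau_l)$ denote the set of compactly supported $C^\infty$ sections of the homogeneous vector bundle over $G/K$ associated with $\tau_l$, that is the functions $f : G \to V_l$ which are $C^\infty$, compactly supported and satisfy

$$f(gk) = \tau_l(k)^{-1} f(g), \qquad g \in G,\ k \in K. \tag{15}$$

$\mathcal{D}(G/K; \tau_l)$ can be identified with the space $\mathcal{D}(O; \tau_l)$ of $C^\infty$ functions $\varphi : S \to V_l$ with compact support contained in $O$ and satisfying

$$\varphi(\omega u) = \tau_l(u)^{-1} \varphi(\omega), \qquad \omega \in S,\ u \in Sp(1). \tag{16}$$

The identification is obtained via

$$f(\omega) \equiv f(g) \qquad \text{if } g \cdot e_0 = \omega. \tag{17}$$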


Let $d\omega$ denote the $Sp(n + 1)$-invariant measure on $S$ normalized by the condition $d\omega(S) = 1$. The measure on $O$ given by $d\nu(\omega) = d\omega/[\omega, \omega]^{2(n+1)}$ is $G$-invariant. The Hermitian form on $\mathcal{D}(G/K; \tau_l)$

$$\langle f_1, f_2 \rangle_{l,\lambda} = \int_S \int_S \langle B_{l,\lambda}(\omega, \upsilon) f_1(\omega), f_2(\upsilon) \rangle_l \, d\nu(\omega) \, d\nu(\upsilon) \tag{18}$$

is therefore $G$-invariant too. It induces a $G$-invariant Hermitian form, also denoted $\langle\cdot,\cdot\rangle_{l,\lambda}$, on the quotient $\mathcal{D}(G/K; \tau_l)/N$, where $N = \{f \in \mathcal{D}(G/K; \tau_l) : \langle f, f \rangle_{l,\lambda} = 0\}$. $\Pi_{l,\lambda}$ is defined as the left regular action of $G$ on the Hilbert completion $\mathcal{H}_{l,\lambda}$ of $\mathcal{D}(G/K; \tau_l)/N$ with respect to $\langle\cdot,\cdot\rangle_{l,\lambda}$.

3. Spectral Decomposition

In this section we describe the spectral decomposition of the reproducing distribution of $\Pi_{l,\lambda}$. Let us first consider the scalar-valued reproducing distribution, which is by definition the function

$$\psi_{l,\lambda}(g) = d_l \operatorname{tr} \Psi_{l,\lambda}(g), \qquad g \in G. \tag{19}$$

Then $\psi_{l,\lambda} \in C^\infty(G; \chi_l)$ and $\psi_{l,\lambda}(a_t) = d_l^2 (\cosh t)^{-2\lambda}$ for $t \in \mathbb{R}$. Let $L^2(G; \chi_l)$ denote the closure of $\mathcal{D}(G; \chi_l)$ in $L^2(G)$. The building blocks of the harmonic analysis of $L^2(G; \chi_l)$ are the trace $\tau_l$-spherical functions. A continuous function $\zeta : G \to \mathbb{C}$ satisfying

$$\zeta(e) = 1, \qquad \zeta^0 = \zeta, \qquad \zeta * d_l \chi_l = \zeta$$

is a trace $\tau_l$-spherical function if it satisfies one of the following equivalent properties:

1. The map $\mathcal{D}(G; \chi_l) \to \mathbb{C}$ which maps $f$ to $\int_G f(g) \zeta(g) \, dg$ is an algebra homomorphism.
2. $\zeta$ is a common eigenfunction of $U(\mathfrak{g})^K$, the set of all left invariant differential operators on $G$ which are right-$K$-invariant.

The set of all trace $\tau_l$-spherical functions of $G$ has been explicitly determined in [5]. They are exactly the functions $\zeta_{l,s} \in C^\infty(G; \chi_l)$, $s \in \mathbb{C}$, given by

$$\zeta_{l,s}(a_t) = (\cosh t)^{2l} \, {}_2F_1\!\left( \frac{\rho - s}{2} + l, \frac{\rho + s}{2} + l; 2n; -\sinh^2 t \right), \qquad t \in \mathbb{R},$$

where ${}_2F_1(a, b; c; d)$ denotes the Gauss hypergeometric function. Moreover $\zeta_{l,s} = \zeta_{l,s'}$ if and only if $s' = \pm s$. The positive definite $\tau_l$-spherical functions have been singled out in Theorem 5.6 of [5], and the Plancherel Formula for the $\tau_l$-spherical transform has been determined in Theorem 7.3 of the same paper. Let $\mathbb{R}_+$ denote the set of nonnegative real numbers. Then each $\zeta_{l,is}$ with $s \in \mathbb{R}_+$ is a trace $\tau_l$-spherical function arising from a principal series representation of $G$. Let

$$D_l = \{s_j := 2(l - j - n) + 1 : j = 0, 1, \dots \text{ with } s_j > 0\},$$

$$S_{l,\lambda} = \{s_m(\lambda) := \rho - 2\lambda - 2m : m = 0, 1, \dots \text{ with } s_m(\lambda) > 0\}.$$
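The two parameter sets just defined are finite and can be enumerated directly from their definitions. The following sketch does so with exact rational arithmetic, using $\rho = 2n + 1$; the function names are hypothetical. It makes it easy to see for which $l$, $n$ and $\lambda$ each set is empty.

```python
from fractions import Fraction


def discrete_parameters(l, n):
    """D_l: the values s_j = 2(l - j - n) + 1, j = 0, 1, ..., that are positive."""
    l = Fraction(l)
    out, j = [], 0
    while True:
        s = 2 * (l - j - n) + 1
        if s <= 0:
            return out
        out.append(s)
        j += 1


def complementary_parameters(lam, n):
    """S_{l,lambda}: the values s_m = rho - 2*lam - 2*m, m = 0, 1, ..., that are positive."""
    lam = Fraction(lam)
    rho = 2 * n + 1
    out, m = [], 0
    while True:
        s = rho - 2 * lam - 2 * m
        if s <= 0:
            return out
        out.append(s)
        m += 1


if __name__ == "__main__":
    # l = 5/2, n = 2: here 2l = 5 exceeds 2n - 1 = 3.
    print(discrete_parameters(Fraction(5, 2), 2))
    # l = 3/2, n = 2: here 2l = 3 equals 2n - 1.
    print(discrete_parameters(Fraction(3, 2), 2))
    # lambda = 3/2, n = 3 (compatible with l = 0, since lambda > l + 1).
    print(complementary_parameters(Fraction(3, 2), 3))
```

The defining formula for $S_{l,\lambda}$ does not involve $l$ explicitly; the dependence on $l$ enters only through the standing constraint $\lambda > l + 1$, which the caller is expected to respect.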

Then $D_l = \emptyset$ for $2l \le 2n-1$ and $S_{l,\lambda} = \emptyset$ for $2l \ge 2n-1$. $D_l$ parametrizes the trace $\tau_l$-spherical functions $\zeta_{l,s}$ arising from discrete series and $S_{l,\lambda}$ parametrizes finitely many of those arising from the complementary series. We remark that for a fixed $\tau_l$ there cannot be at the same time trace $\tau_l$-spherical functions arising from the discrete series and trace $\tau_l$-spherical functions arising from the complementary series. The c-function for the $\tau_l$-spherical transform is the meromorphic function
\[ c_l(s) = 2^{\rho-s}\, \frac{\Gamma(2n)\,\Gamma(s)}{\Gamma\!\left(\frac{\rho+s}{2}+l\right)\Gamma\!\left(\frac{\rho+s}{2}-l-1\right)} \tag{20} \]

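(cf. Formula (60) in [5]). In terms of the shifted factorials
\[ (a)_\alpha := a(a+1)\cdots(a+\alpha-1) \tag{21} \]
we have for $s_j \in D_l$
\[ \operatorname*{Res}_{s=s_j} \frac{1}{c_l(s)c_l(-s)} = 2^{-2\rho}\,[2(l-n-j)+1]\, \frac{(j+1)_{2n-1}\,(2(l-n+1)-j)_{2n-1}}{((2n-1)!)^2}. \tag{22} \]
Moreover, if $s \in \mathbb{R}$, then
\[ |c_l(is)|^{-2} = \frac{2^{-2\rho}}{\pi\,\Gamma(2n)^2}\; s \sinh(\pi s)\, \left|\Gamma\!\left(\frac{\rho+is}{2}+l\right)\right|^2 \left|\Gamma\!\left(\frac{\rho+is}{2}-l-1\right)\right|^2 . \]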

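The main result of the paper is the following theorem, which we will prove in Sect. 4.

Theorem 1. (Decomposition Theorem). If $\lambda > l+1$, then, in the sense of distributions on $G$, we have the following decomposition
\[ \psi_{l,\lambda} = \sum_{s_m(\lambda)\in S_{l,\lambda}} a_{l,s_m(\lambda)}(\lambda)\,\zeta_{l,s_m(\lambda)} + \sum_{s_j\in D_l} a_{l,s_j}(\lambda)\,\zeta_{l,s_j} + \int_0^\infty a_{l,is}(\lambda)\,\zeta_{l,is}\,ds. \tag{23} \]
For $s \in S_{l,\lambda} \cup D_l \cup i\mathbb{R}_+$, the functions $a_{l,s} : (l+1,\infty) \to [0,\infty)$ are given by
\begin{align*}
a_{l,s_m(\lambda)}(\lambda) &= 2^{2(\lambda+m)}\, \frac{(\lambda+l)_m\,(\lambda-l-1)_m}{(\rho-2\lambda-2m+1)_m\, m!}\; \frac{1}{c_l(s_m(\lambda))}, & s_m(\lambda) &\in S_{l,\lambda},\\
a_{l,s_j}(\lambda) &= \frac{1}{C_l}\, \hat\psi_{l,\lambda}(s_j)\, \operatorname*{Res}_{s=s_j} \frac{1}{c_l(s)c_l(-s)}, & s_j &\in D_l,\\
a_{l,is}(\lambda) &= \frac{1}{2\pi C_l}\, \hat\psi_{l,\lambda}(is)\, \frac{1}{|c_l(is)|^2}, & s &\in \mathbb{R},
\end{align*}
where
\[ \hat\psi_{l,\lambda}(s) = \frac{\pi^{2n}}{(2l+1)^2}\; \frac{\Gamma\!\left(\lambda+\frac{s-\rho}{2}\right)\Gamma\!\left(\lambda-\frac{s+\rho}{2}\right)}{\Gamma(\lambda-l-1)\,\Gamma(\lambda+l)} \tag{24} \]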

and Cl =

π 2n 1 1 1 . 2 (2l + 1)2 4 0(2n)

(25)

In (23), the sum over Sl,λ does not appear either if 2l ≥ 2n − 1 or if 2l < 2n − 1 and 2λ ≥ 2n + 1. The sum over Dl does not appear if 2l < 2n − 1. If 2l < 2n − 1 and 2λ > 2n + 1 or if 2l ≥ 2n − 1, then the function ψl,λ ∈ L1 (G) ∩ L2 (G) and the decomposition (23) is exactly the inversion formula for the τl -spherical transform of ψl,λ . The situation described by Theorem 1 is depicted in Fig. 1.

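[Fig. 1. Spectral decomposition for different values of $\lambda$. Case $2l < 2n-1$: $D_l = \emptyset$; for $l+1 < \lambda < \frac{2n+1}{2}$ one has $S_{l,\lambda} \ne \emptyset$ (principal series and finitely many complementary series), and for $\lambda \ge \frac{2n+1}{2}$ one has $S_{l,\lambda} = \emptyset$ (principal series, $\psi_{l,\lambda} \in L^1(G) \cap L^2(G)$). Case $2l \ge 2n-1$, $\lambda > l+1$: $D_l \ne \emptyset$, $S_{l,\lambda} = \emptyset$ (principal series and finitely many discrete series, $\psi_{l,\lambda} \in L^1(G) \cap L^2(G)$).]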

The functions $Z_{l,s} : G \to \operatorname{End}(V_l)$ defined by $Z_{l,s} = \zeta_{l,s} * \tau_l$ are the (operator-valued) $\tau_l$-spherical functions of $G$. Lemma 5.2 in [5] exhibits the $Z_{l,s}$ as the projection on the $K$-type $\tau_l$ of certain degenerate principal series representations of $G$. Moreover, $Z_{l,s}$ is positive definite if and only if $\zeta_{l,s}$ is (cf. the remark on p. 15 of [13]). In particular, the $Z_{l,s}$ corresponding to the $\zeta_{l,s}$ of Formula (23) are all positive definite. We can therefore prove Corollary 1.

Proof of Corollary 1. For every $f \in \mathcal{D}(G;V_l)$, the map $h_f$ defined by
\[ h_f(g) = \int_G \int_K \langle \tau_l(k) f(xgk), f(x)\rangle_l \,dx\,dk \]
belongs to $\mathcal{D}(G)$. If $\Psi \in C^\infty(G;\tau)$ and $\psi \in C^\infty(G;\chi_l)$ are linked by the relation $\Psi = \psi * \tau_l$, then
\[ \int_G \int_G \langle \Psi(g_2^{-1}g_1) f(g_1), f(g_2)\rangle_l \,dg_1\,dg_2 = \int_G \psi(g)\, h_f(g)\,dg. \]
The positive definiteness of $\Psi_{l,\lambda}$ for $\lambda > l+1$ follows then from (23) since all the coefficients $a_{l,s}$ are $\ge 0$. $\square$

Let $T_{l,s}$ denote the irreducible representation of $G$ whose projection on the $K$-type $\tau_l$ is $Z_{l,s}$ (see Corollary 5.5 in [5]). Formulas (23) and (3) imply that the reproducing distribution $\Psi_{l,\lambda}$ of $\Pi_{l,\lambda}$ is the direct integral of the reproducing distributions $Z_{l,s}$ of the $T_{l,s}$:
\[ \Psi_{l,\lambda} = \sum_{s_m(\lambda)\in S_{l,\lambda}} a_{l,s_m(\lambda)}(\lambda)\, Z_{l,s_m(\lambda)} + \sum_{s_j\in D_l} a_{l,s_j}(\lambda)\, Z_{l,s_j} + \int_0^\infty a_{l,is}(\lambda)\, Z_{l,is}\,ds. \tag{26} \]
Considering the unitary representation corresponding to the reproducing distribution on the right-hand side of (26), we obtain the following corollary.

Corollary 2. The unitary representation $\Pi_{l,\lambda}$, $\lambda > l+1$, has the following direct integral decomposition into unitary irreducible representations:
\[ \Pi_{l,\lambda} = \sum_{s_m(\lambda)\in S_{l,\lambda}} a_{l,s_m(\lambda)}(\lambda)\, T_{l,s_m(\lambda)} + \sum_{s_j\in D_l} a_{l,s_j}(\lambda)\, T_{l,s_j} + \int_0^\infty a_{l,is}(\lambda)\, T_{l,is}\,ds. \tag{27} \]

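4. Proof of Theorem 1

Let $\psi$ be a locally integrable function on $G$ satisfying $\psi^0 = \psi$ and $\psi * d_l\chi_l = \psi$. As usual, $\psi$ defines an element of $\mathcal{D}'(G)$ by
\[ (\psi, f) := \int_G \psi(g) f(g)\,dg, \qquad f \in \mathcal{D}(G). \tag{28} \]
Since $(\psi,f) = (\psi, f^0 * \chi_l)$, the distribution $\psi$ is completely determined by its values on $\mathcal{D}(G;\chi_l)$ and, on this space, we have
\[ (\psi, f) = C_l \int_0^\infty \psi(a_t) f(a_t) \Delta(t)\,dt, \qquad f \in \mathcal{D}(G;\chi_l), \tag{29} \]
where we have set
\[ \Delta(t) := 2^{2\rho} (\sinh t)^{4n-1} \cosh^3 t. \tag{30} \]
Observe that $\hat{\bar f} = \bar{\hat f}$ since $\overline{\zeta_{l,s}(g)} = \zeta_{l,s}(g)$. If $\psi \in L^1(G;\chi_l) \cap L^2(G;\chi_l)$, then the Plancherel Theorem implies
\begin{align*}
(\psi, f) &= \int_{i\mathbb{R}_+ \cup D_l} \hat\psi(s)\, \hat f(s)\, d\sigma_l(s) \\
&= \int_{i\mathbb{R}_+ \cup D_l} \hat\psi(s)\, (f, \zeta_{l,s})\, d\sigma_l(s),
\end{align*}
where $d\sigma_l(s)$ denotes the Plancherel measure (cf. [5], Theorem 7.3). Let $\psi_{l,\lambda} \in C^\infty(G;\chi_l)$ be the map defined by $\psi_{l,\lambda}(a_t) = (\cosh t)^{-2\lambda}$. For the time being we keep the general situation of $\lambda \in \mathbb{C}$. Let $\sigma_0$ be the constant defined by
\[ \sigma_0 = \begin{cases} \max D_l = 2(l-n)+1 & \text{if } 2l \ge 2n-1, \\ 0 & \text{otherwise.} \end{cases} \tag{31} \]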

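Lemma 2.3 in [8] proves the existence of a constant $K > 0$ so that for all $s \in \mathbb{C}$ and $t \ge 0$
\[ |\zeta_{l,s}(t)| \le K (1+|t|)\, e^{(|\Re s| - \rho)t}. \tag{32} \]
Therefore the map $g \mapsto \psi_{l,\lambda}(g)\zeta_{l,s}(g)$ belongs to $L^1(G;\chi_l)$ for all $s \in i\mathbb{R}_+ \cup D_l$ if $2\Re\lambda > \rho + \sigma_0$. In this case, the condition $\psi_{l,\lambda} \in L^2(G;\chi_l)$ is also automatically satisfied. Formula 20.2(9) in [7] gives for $2\Re\lambda > \rho + \sigma_0$
\begin{align*}
\hat\psi_{l,\lambda}(s) &= 2^{2\rho} C_l \int_0^\infty (\cosh t)^{-2\lambda+2l+3} (\sinh t)^{4n-1}\; {}_2F_1\!\left(\frac{\rho-s}{2}+l,\ \frac{\rho+s}{2}+l;\ 2n;\ -\sinh^2 t\right) dt \\
&= 2^{2\rho-1} C_l \int_0^\infty (1+x)^{-\lambda+l+1} x^{2n-1}\; {}_2F_1\!\left(\frac{\rho-s}{2}+l,\ \frac{\rho+s}{2}+l;\ 2n;\ -x\right) dx \\
&= \frac{\pi^{2n}}{(2l+1)^2}\; \frac{\Gamma\!\left(\lambda+\frac{s-\rho}{2}\right)\Gamma\!\left(\lambda-\frac{s+\rho}{2}\right)}{\Gamma(\lambda-l-1)\,\Gamma(\lambda+l)}. \tag{33}
\end{align*}
This formula has also been obtained in [9] (9.1) by means of the Abel transform. The right-hand side of (33) provides a meromorphic extension of $\hat\psi_{l,\lambda}$ to $\mathbb{C}^2$.

Remark 1. If $\lambda \in \mathbb{R}$ and $s \in \mathbb{R}$, then
\[ \hat\psi_{l,\lambda}(is) = \frac{\pi^{2n}}{(2l+1)^2}\; \frac{\left|\Gamma\!\left(\lambda+\frac{is-\rho}{2}\right)\right|^2}{[\Gamma(\lambda-l-1)]^2}\; \prod_{h=0}^{2l} \frac{1}{\lambda-l-1+h}. \]
In particular, $\hat\psi_{l,\lambda}(is) > 0$ for all $s \in \mathbb{R}$ when $\lambda$ is real with $\lambda > l+1$. Moreover, if $\lambda > l+1$, $2l > 2n-1$ and $s_j \in D_l$, then $\lambda + \frac{s_j-\rho}{2} \ge \frac{s_j}{2} > 0$ and $\lambda - \frac{s_j+\rho}{2} > j \ge 0$, so $\hat\psi_{l,\lambda}(s_j) > 0$.

Let $\Phi_{l,-s}$ be the function defined by
\[ \Phi_{l,-s}(t) = (2\cosh t)^{-s-\rho}\; {}_2F_1\!\left(\frac{\rho+s}{2}+l,\ \frac{\rho+s}{2}-l-1;\ 1+s;\ 1-\tanh^2 t\right). \tag{34} \]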
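The closed form (24)/(33) for $\hat\psi_{l,\lambda}$ can be evaluated directly for real arguments, which gives a quick numerical look at the positivity statement of Remark 1. The sketch below is only an illustration: the function name is hypothetical and $\rho = 2n+1$ is assumed rather than taken from this section.

```python
import math


def psi_hat(s, l, lam, n):
    """Evaluate the Gamma-quotient in (24) for real s.

    Assumes rho = 2n + 1. math.gamma raises ValueError at the poles
    (non-positive integer arguments), so the caller must avoid them.
    """
    rho = 2 * n + 1
    numerator = math.gamma(lam + (s - rho) / 2) * math.gamma(lam - (s + rho) / 2)
    denominator = math.gamma(lam - l - 1) * math.gamma(lam + l)
    return math.pi ** (2 * n) / (2 * l + 1) ** 2 * numerator / denominator
```

For $n = 1$, $l = 3/2$, $\lambda = 3$ the only element of $D_l$ is $s_0 = 2$, and the Gamma factors reduce by hand to $\hat\psi_{l,\lambda}(2) = \pi^2/140 > 0$.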

Lemmas 2.1 and 2.2 in [8] prove the following estimates:
1. For each $r > 0$ there is a constant $K_r > 0$ such that
\[ |c_l(s)|^{-1} \le K_r (1+|s|)^{2n-\frac{1}{2}} \tag{35} \]
if $\Re s \ge 0$ and $c_l(s') \ne 0$ for $|s'-s| \le r$.
2. For each $\delta > 0$ there is a constant $K_\delta > 0$ such that
\[ |\Phi_{l,-s}(t)| \le K_\delta\, e^{-(\Re s+\rho)t} \tag{36} \]
if $\Re s \ge 0$ and $t \ge \delta$.

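If $f \in \mathcal{D}(G;\chi_l)$, then, by the Paley-Wiener Theorem and by the inversion formula for the $\tau_l$-spherical transform ([5], Theorem 7.1),
\[ f(a_t) = \frac{1}{2\pi C_l} \int_{-\infty}^{+\infty} \hat f(\sigma+i\nu)\, \Phi_{l,-\sigma-i\nu}(t)\, \frac{d\nu}{c_l(\sigma+i\nu)} \tag{37} \]
for all $t > 0$ provided $\sigma > \sigma_0$. Apply the relation ([6], 2.9(2))
\[ {}_2F_1(a,b;c;z) = (1-z)^{c-a-b}\, {}_2F_1(c-a,c-b;c;z) \]
to (34) and obtain
\[ \Phi_{l,-s}(t) = 2^{-s-\rho} (\sinh t)^{2-4n} (\cosh t)^{\rho-s-4}\; {}_2F_1\!\left(\frac{s-\rho}{2}+1-l,\ \frac{s-\rho}{2}+2+l;\ 1+s;\ 1-\tanh^2 t\right). \tag{38} \]
Since ([6], 2.8(46))
\[ {}_2F_1(a,b;c;1) = \frac{\Gamma(c)\,\Gamma(c-a-b)}{\Gamma(c-a)\,\Gamma(c-b)} \qquad \text{if } c \ne 0,-1,\dots \text{ and } \Re(c-a-b) > 0, \]
we find that for $s \ne -1,-2,\dots$
\[ \Phi_{l,-s}(t) \sim C_1(s)\, t^{2-4n} \qquad \text{as } t \to 0^+, \tag{39} \]
where
\[ C_1(s) = 2^{-s-\rho}\, \frac{\Gamma(2n-1)\,\Gamma(s+1)}{\Gamma\!\left(\frac{s+\rho}{2}+l\right)\Gamma\!\left(\frac{s+\rho}{2}-l-1\right)} \tag{40} \]
is holomorphic in $\Re s > -1$ and bounded on $\Re s \ge 0$ because of Stirling's Formula. The asymptotic estimate (36), the inversion formula (37) and (39) imply that for every fixed $\lambda \in \mathbb{C}$, every $f \in \mathcal{D}(G;\chi_l)$ and every $\sigma > \max(\sigma_0, \rho-2\Re\lambda)$
\begin{align*}
(\psi_{l,\lambda}, f) &= C_l \int_0^\infty \psi_{l,\lambda}(a_t) f(a_t) \Delta(t)\,dt \\
&= \frac{1}{2\pi} \int_{-\infty}^{+\infty} \hat f(\sigma+i\nu)\, b_{l,\lambda}(\sigma+i\nu)\, \frac{d\nu}{c_l(\sigma+i\nu)}, \tag{41}
\end{align*}
where we have set
\[ b_{l,\lambda}(s) = \int_0^\infty \Phi_{l,-s}(t)\, \psi_{l,\lambda}(t)\, \Delta(t)\,dt. \tag{42} \]
$\Phi_{l,-s}$ is holomorphic on $\Re s > -1$. Hence $b_{l,\lambda}(s)$ is a well-defined holomorphic function on $V = \{(\lambda,s) \in \mathbb{C}^2 : \Re s > \max(-1, \rho-2\Re\lambda)\}$. Using the explicit expression of $\Phi_{l,-s}$, we can meromorphically extend $b_{l,\lambda}(s)$ to the whole $\mathbb{C}^2$.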

Lemma 2. $b_{l,\lambda}(s)$ extends as a meromorphic function on $\mathbb{C}^2$ with singularities at most along the hyperplanes $\{s = -k\}$ ($k = 1,2,\dots$) and $\{s = s_m(\lambda)\}$ ($m = 0,1,2,\dots$). Here we have set
\[ s_m(\lambda) = \rho - 2\lambda - 2m. \tag{43} \]
Consider $b_{l,\lambda}$ as a meromorphic function of $s \in \mathbb{C}$ and assume $\lambda$ real, $\lambda > l+1$. Then every $s_m(\lambda) > 0$ is a simple pole of $b_{l,\lambda}$ with residue
\[ \operatorname*{Res}_{s=s_m(\lambda)} b_{l,\lambda}(s) = 2^{2(\lambda+m)}\, \frac{(\lambda+l)_m\,(\lambda-l-1)_m}{(\rho-2\lambda-2m+1)_m\, m!}. \]
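The residue formula of Lemma 2 is a finite product, so it is easy to tabulate. The following sketch lists the poles $s_m(\lambda) > 0$ together with their residues; the function names are hypothetical and $\rho = 2n+1$ is an assumption of the sketch.

```python
import math


def pochhammer(a, m):
    """Shifted factorial (a)_m = a (a + 1) ... (a + m - 1), with (a)_0 = 1."""
    result = 1.0
    for k in range(m):
        result *= a + k
    return result


def residues_of_b(l, lam, n):
    """List (s_m, residue) for the poles s_m(lam) > 0, using the formula of Lemma 2.

    Assumes rho = 2n + 1 and real lam > l + 1.
    """
    rho = 2 * n + 1
    out = []
    m = 0
    while rho - 2 * lam - 2 * m > 0:
        s_m = rho - 2 * lam - 2 * m
        residue = (2.0 ** (2 * (lam + m))
                   * pochhammer(lam + l, m) * pochhammer(lam - l - 1, m)
                   / (pochhammer(s_m + 1, m) * math.factorial(m)))
        out.append((s_m, residue))
        m += 1
    return out
```

For $n = 3$, $l = 0$, $\lambda = 3/2$ there are two poles, $s_0 = 4$ and $s_1 = 2$, and both residues evaluate to 8. Every factor in the formula is positive when $\lambda > l+1$ and $s_m(\lambda) > 0$, so the residues are positive.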

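Proof. Because of (30) and (38), we have on $V$
\begin{align*}
b_{l,\lambda}(s) &= 2^{\rho-s} \int_0^\infty (\cosh t)^{\rho-s-1-2\lambda} \sinh t\; {}_2F_1\!\left(\frac{s-\rho}{2}+1-l,\ \frac{s-\rho}{2}+2+l;\ 1+s;\ 1-\tanh^2 t\right) dt \\
&= 2^{\rho-s-1} \int_0^1 x^{\frac{s-\rho}{2}-1+\lambda}\; {}_2F_1\!\left(\frac{s-\rho}{2}+1-l,\ \frac{s-\rho}{2}+2+l;\ 1+s;\ x\right) dx \qquad (\text{by substituting } x = 1-\tanh^2 t) \\
&= 2^{\rho-s-1}\, \frac{1}{\frac{s-\rho}{2}+\lambda}\; {}_3F_2\!\left(\frac{s-\rho}{2}+1-l,\ \frac{s-\rho}{2}+2+l,\ \frac{s-\rho}{2}+\lambda;\ 1+s,\ \frac{s-\rho}{2}+\lambda+1;\ 1\right). \tag{44}
\end{align*}
The last equality is a consequence of Formula 20.2(5) in [7]:
\[ \int_0^1 x^{d-1}(1-x)^{f-1}\, {}_2F_1(a,b;c;x)\,dx = \frac{\Gamma(d)\Gamma(f)}{\Gamma(d+f)}\, {}_3F_2(a,b,d;c,d+f;1) \qquad \text{if } \Re d > 0,\ \Re f > 0,\ \Re(c+f-a-b) > 0. \]
The function $[\Gamma(d)\Gamma(f)]^{-1}\, {}_3F_2(a,b,c;d,f;1)$ extends to a holomorphic function of $a,b,c,d,f$ on the region $\Re(a+b+c-d-f) < 0$, the series
\[ \sum_{m=0}^\infty \frac{(a)_m (b)_m (c)_m}{\Gamma(d)(d)_m\, \Gamma(f)(f)_m\, m!} \]
which defines it being uniformly convergent on the compact sets of this region. The right-hand side of (44) gives therefore a meromorphic extension of $b_{l,\lambda}(s)$ with the required singularities. The expression for the residues at the poles $s_m(\lambda)$ is given by the explicit form of the series expansion of ${}_3F_2(\ ,\ ,\ ;\ ,\ ;1)$ in our case, that is
\[ 2^{\rho-s-1}\, \frac{1}{\frac{s-\rho}{2}+\lambda}\; {}_3F_2\!\left(\frac{s-\rho}{2}+1-l,\ \frac{s-\rho}{2}+2+l,\ \frac{s-\rho}{2}+\lambda;\ 1+s,\ \frac{s-\rho}{2}+\lambda+1;\ 1\right) = 2^{\rho-s} \sum_{m=0}^\infty \frac{\left(\frac{s-\rho}{2}+1-l\right)_m \left(\frac{s-\rho}{2}+2+l\right)_m}{(1+s)_m\, m!}\; \frac{1}{s-s_m(\lambda)}. \qquad \square \]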

Remark 2. Assume $\lambda$ real, $\lambda > l+1$. There exist nonnegative integers $m$ for which $s_m(\lambda) > 0$ if and only if $2\lambda < 2n+1$. In this case
i) $s_m(\lambda) > 0$ for finitely many indices $m$;
ii) $2l < 2n-1$ (case of no discrete series in the Plancherel Formula);
iii) $\operatorname*{Res}_{s=s_m(\lambda)} b_{l,\lambda}(s) > 0$.

Lemma 3. The function $b_{l,\lambda}$ satisfies the recurrence relation
\[ [s^2-(\rho-2\lambda)^2]\, b_{l,\lambda}(s) - [4\lambda(1-\lambda)+4l(l+1)]\, b_{l,\lambda+1}(s) = 2s\,c_l(s). \]
Proof. The differential operator
\[ L_{l,\lambda} := \frac{d^2}{dt^2} + [(4n-1)\coth t + 3\tanh t]\,\frac{d}{dt} + \frac{4l(l+1)}{\cosh^2 t} - 4\lambda(\lambda-\rho) \]
is symmetric in the space $L^2(\Delta(t)\,dt)$ of $L^2$-integrable functions on $(0,+\infty)$ with respect to the positive measure $\Delta(t)\,dt$. It satisfies
\[ L_{l,\lambda}\psi_{l,\lambda} = [4\lambda(1-\lambda)+4l(l+1)]\,\psi_{l,\lambda+1}, \qquad L_{l,\lambda}\Phi_{l,-s} = [s^2-(\rho-2\lambda)^2]\,\Phi_{l,-s}. \]
$\Phi_{l,-s} \in L^2(\Delta(t)\,dt)$ if $s > 0$ and $\psi_{l,\lambda} \in L^2(\Delta(t)\,dt)$ if $2\Re\lambda > \rho$. Under these assumptions and with $[f,g](t) := \Delta(t)[f'(t)g(t) - f(t)g'(t)]$, we have
\begin{align*}
[s^2-(\rho-2\lambda)^2]\, b_{l,\lambda}(s) - [4\lambda(1-\lambda)+4l(l+1)]\, b_{l,\lambda+1}(s)
&= \int_0^\infty \left[ (L_{l,\lambda}\Phi_{l,-s})(t)\,\psi_{l,\lambda}(t) - \Phi_{l,-s}(t)\,(L_{l,\lambda}\psi_{l,\lambda})(t) \right] \Delta(t)\,dt \\
&= \lim_{t\uparrow+\infty} [\Phi_{l,-s},\psi_{l,\lambda}](t) - \lim_{t\downarrow 0} [\Phi_{l,-s},\psi_{l,\lambda}](t) \\
&= -\lim_{t\downarrow 0} \Delta(t)\,\Phi'_{l,-s}(t) \\
&= 2s\,c_l(s)
\end{align*}
because $\zeta_{l,s}(0) = 1$, $\zeta'_{l,s}(0) = 0$ and $[\zeta_{l,s},\Phi_{l,-s}] = 2s\,c_l(s)$ (cf. Formula (69) in [5]). This equality extends to an identity of meromorphic functions on $\mathbb{C}^2$. $\square$

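Corollary 3. For every fixed $\lambda \in \mathbb{C}$ and every $\sigma > 0$, $b_{l,\lambda}(s)$ remains bounded as $|s| \to \infty$ on the strip $0 \le \Re s \le \sigma$.

Proof. Formulas (36) and (39) imply that for some $t_0 > 0$ and for some positive constants $M$, $K$
\[ |\Phi_{l,-s}(t)\,\psi_{l,\lambda}(t)\,\Delta(t)| \le M t\, \chi_{(0,t_0)}(t) + K e^{-(\Re s + 2\Re\lambda - \rho)t}\, \chi_{[t_0,\infty)}(t) \]
($\chi_A$ is the characteristic function of $A \subset \mathbb{R}$). Hence $b_{l,\lambda}(s)$ is bounded on $\Re s \ge \rho - 2\Re\lambda + 1$. The conclusion immediately follows from the previous lemma once it is noticed that $s\,c_l(s)$ remains bounded on $\Re s \ge 0$ as $|s| \to \infty$. In fact, since
\[ \frac{\Gamma(z+\alpha)}{\Gamma(z+\beta)} = z^{\alpha-\beta} \left[ 1 + \frac{1}{2z}(\alpha-\beta)(\alpha+\beta-1) + O(z^{-2}) \right] \qquad \text{if } |\arg z| < \pi \]
(cf. [6], 1.18(4)), there exist constants $K$, $K'$ so that
\[ |s\,c_l(s)| = K |s| \left| \frac{\Gamma\!\left(\frac{s}{2}+\frac{1}{2}\right)\Gamma\!\left(\frac{s}{2}\right)}{\Gamma\!\left(\frac{s}{2}+\frac{\rho}{2}+l\right)\Gamma\!\left(\frac{s}{2}+\frac{\rho}{2}-l-1\right)} \right| \sim K' |s|^{-\rho+\frac{5}{2}} \]
as $|s| \to \infty$ in $|\arg s| \le \pi/2$. $\square$

Proof of Theorem 1. Fix $f \in \mathcal{D}(G;\chi_l)$ and consider the function
\[ g_{l,\lambda}(s) := \frac{1}{2\pi}\, \frac{\hat f(s)\, b_{l,\lambda}(s)}{c_l(s)}. \]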

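Because of the Paley-Wiener Theorem ([5], Theorem 6.4), $\hat f$ is even, of exponential type and rapidly decreasing. $c_l(s)^{-1}$ has positive poles in $D_l$ provided $2l > 2n-1$, and a simple zero at $s = 0$. It has polynomial growth in $s$. $b_{l,\lambda}$ is meromorphic with at most simple poles. Its positive poles are in $S_{l,\lambda}$. It remains bounded as $|s| \to \infty$ in every strip $0 \le \Re s \le \sigma$. If $2\lambda \ge 2n+1$, then $b_{l,\lambda}(s)$ is holomorphic in $\Re s > 0$ (cf. Remark 2). Fix $\sigma > \max(\sigma_0, \rho-2\lambda)$, and let $\gamma_R$ be the rectangular contour of vertices $\pm iR$ and $\sigma \pm iR$. Integrating $g_{l,\lambda}$ along $\gamma_R$ and letting $R \to \infty$, we obtain from (41)
\begin{align*}
(\psi_{l,\lambda}, f) &= \frac{1}{2\pi} \int_{-\infty}^{+\infty} \hat f(\sigma+i\nu)\, b_{l,\lambda}(\sigma+i\nu)\, \frac{d\nu}{c_l(\sigma+i\nu)} \\
&= \sum_{s_j \in D_l} \operatorname*{Res}_{s=s_j} \left[ \frac{\hat f(s)\, b_{l,\lambda}(s)}{c_l(s)} \right] + \sum_{s_m(\lambda) \in S_{l,\lambda}} \operatorname*{Res}_{s=s_m(\lambda)} \left[ \frac{\hat f(s)\, b_{l,\lambda}(s)}{c_l(s)} \right] + \frac{1}{2\pi} \int_{-\infty}^{+\infty} b_{l,\lambda}(is)\, \hat f(is)\, \frac{ds}{c_l(is)}.
\end{align*}
Because of the relation
\[ \zeta_{l,s} = c_l(s)\,\Phi_{l,s} + c_l(-s)\,\Phi_{l,-s} \tag{45} \]
we have
\[ \hat\psi_{l,\lambda}(is) = C_l \int_0^\infty \psi_{l,\lambda}(t)\, \zeta_{l,is}(t)\, \Delta(t)\,dt = C_l\, [c_l(is)\, b_{l,\lambda}(-is) + c_l(-is)\, b_{l,\lambda}(is)] \]
for all $s \in \mathbb{R}$ and $\lambda \in \mathbb{C}$ with $2\Re\lambda > \rho$. This equality extends to an identity of meromorphic functions on $\mathbb{C}^2$. Moreover, if $s = s_j$, we have $c_l(s_j) = 0$, so $\hat\psi_{l,\lambda}(s_j) = C_l\, c_l(-s_j)\, b_{l,\lambda}(s_j)$. $\square$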

5. Maximal Degenerate Representations of SL(n + 1, H)

In this section we exhibit the canonical representations as the restriction to $Sp(1,n)$ of some maximal degenerate representations of $SL(n+1,\mathbb{H})$. The group $G_{\mathbb{H}} = SL(n+1,\mathbb{H})$ is the simply connected real Lie group consisting of the $(n+1)\times(n+1)$ matrices over $\mathbb{H}$ with Dieudonné determinant equal to one. It contains $G$ and $U = Sp(n+1)$ as subgroups. Moreover, $U$ is maximally compact in $G_{\mathbb{H}}$. The Lie algebra $\mathfrak{g}_{\mathbb{H}}$ of $G_{\mathbb{H}}$ is formed by all $(n+1)\times(n+1)$ matrices $X$ over $\mathbb{H}$ satisfying
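\[ \begin{pmatrix} a & b \\ 0 & C \end{pmatrix} \qquad \text{and} \qquad \begin{pmatrix} a & 0 \\ b & C \end{pmatrix} \]
with $a \in \mathbb{H} \setminus \{0\}$ and $C$ an $n \times n$ matrix. For $\lambda \in \mathbb{C}$ define representations $\omega^\pm_{l,\lambda}$ of $P^\pm$ by
\[ \omega^\pm_{l,\lambda}(p) = |a|^{\lambda \pm \rho_{\mathbb{H}}}\, \tau_l(a/|a|), \qquad p = \begin{pmatrix} a & \cdot \\ \cdot & \cdot \end{pmatrix} \in P^\pm. \tag{46} \]
Let $\pi^\pm_{l,\lambda}$ denote the representation of $G_{\mathbb{H}}$ induced from $P^\pm$:
\[ \pi^\pm_{l,\lambda} = \operatorname{Ind}_{P^\pm}^{G_{\mathbb{H}}}\, \omega^\pm_{l,\lambda}. \tag{47} \]
A compact realization of $\pi^\pm_{l,\lambda}$ on the space $\mathcal{D}(S;\tau_l)$ of all smooth functions $\varphi : S \to V_l$ satisfying $\varphi(\omega u) = \tau_l(u)^{-1}\varphi(\omega)$ for all $\omega \in S$ and $u \in Sp(1)$ is given by
\begin{align*}
[\pi^+_{l,\lambda}(g)\varphi](\omega) &= \varphi(g^{-1}\cdot\omega)\, \|g^{-1}\omega\|^{-(\lambda+\rho_{\mathbb{H}})}, & g \in G_{\mathbb{H}},\ \varphi \in \mathcal{D}(S;\tau_l),\ \omega \in S, \\
[\pi^-_{l,\lambda}(g)\varphi](\omega) &= \varphi(\theta(g^{-1})\cdot\omega)\, \|\theta(g^{-1})\omega\|^{\lambda-\rho_{\mathbb{H}}}, & g \in G_{\mathbb{H}},\ \varphi \in \mathcal{D}(S;\tau_l),\ \omega \in S.
\end{align*}
In particular, $\pi^-_{l,\lambda} = \pi^+_{l,-\lambda} \circ \theta$. The representations $\pi^\pm_{l,\lambda}$ have been studied in [11]. Their main properties are summarized in the following theorem.

Theorem 2. 1. The representations $\pi^\pm_{l,\lambda}$ are irreducible except for the following cases:
(a) $\lambda \in \mathbb{Z}$, $\lambda \ge 2l+\rho$, $\lambda \equiv 2l \pmod 2$.
(b) $\lambda \in \mathbb{Z}$, $\lambda \le -(2l+\rho)$, $\lambda \equiv 2l \pmod 2$.
In Case (a), $\pi^\pm_{l,\pm\lambda}$ has one infinite dimensional irreducible submodule and one finite dimensional irreducible quotient. In Case (b), $\pi^\pm_{l,\pm\lambda}$ has one finite dimensional irreducible submodule and one infinite dimensional irreducible quotient.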

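2. An irreducible $\pi^\pm_{l,\lambda}$ is unitarizable if and only if $\lambda \in i\mathbb{R}$. In this case the inner product making $\pi^\pm_{l,\lambda}$ unitary is the standard inner product in $L^2(S,V_l)$, that is
\[ (\varphi|\psi)_l = \int_S \langle \varphi(\omega), \psi(\omega)\rangle_l \,d\omega. \tag{48} \]
When $\pi^\pm_{l,\lambda}$ is reducible, neither its irreducible submodule nor its irreducible quotients are unitarizable.

For all $\lambda$ for which $\pi^\pm_{l,\lambda}$ is irreducible, there exist intertwining operators for the pairs $(\pi^+_{l,\lambda}, \pi^-_{l,\lambda})$ and $(\pi^-_{l,-\lambda}, \pi^+_{l,-\lambda})$, that is continuous linear maps $A_{l,\lambda} : \mathcal{D}(S;\tau_l) \to \mathcal{D}(S;\tau_l)$ satisfying
\begin{align*}
A_{l,\lambda} \circ \pi^+_{l,\lambda}(g) &= \pi^-_{l,\lambda}(g) \circ A_{l,\lambda}, & g &\in G_{\mathbb{H}}, \\
A_{l,\lambda} \circ \pi^-_{l,-\lambda}(g) &= \pi^+_{l,-\lambda}(g) \circ A_{l,\lambda}, & g &\in G_{\mathbb{H}}.
\end{align*}
They can be explicitly written as the integral operators
\[ A_{l,\lambda}\varphi(\omega) = \int_S |(\omega,\upsilon)|^{\lambda-\rho_{\mathbb{H}}}\, \tau_l\big((\omega,\upsilon)/|(\omega,\upsilon)|\big)\, \varphi(\upsilon)\,d\upsilon, \qquad \omega \in S. \tag{49} \]
The integral on the right-hand side of (49) is norm convergent for all $\varphi \in \mathcal{D}(S;\tau_l)$ provided $\Re\lambda > 2(n-1)$, and it defines on this region a $V_l$-valued holomorphic function of $\lambda$. Its regularization gives a meromorphic extension of $A_{l,\lambda}$ to $\mathbb{C}$.

The Cartan involution $\theta$ satisfies $\theta(g) = JgJ$ for $g \in G$, where $J = \operatorname{diag}(-1,1,\dots,1)$. The restrictions to $G$ of $\pi^+_{l,\lambda}$ and $\pi^-_{l,-\lambda}$ are therefore equivalent representations, intertwined by the operator $E$ given by
\[ (E\varphi)(\omega) = \varphi(J\omega), \qquad \varphi \in \mathcal{D}(S;\tau_l),\ \omega \in S. \tag{50} \]
The space $\mathcal{D}(O;\tau_l)$ of (16) is invariant under the restriction of $\pi^\pm_{l,\pm\lambda}$ to $G$. We denote by $\tilde\pi_{l,\lambda}$ the restriction of $\pi^+_{l,\lambda}$ to $G$ considered as a representation of $\mathcal{D}(O;\tau_l)$:
\[ \tilde\pi_{l,\lambda}(g) = \pi^+_{l,\lambda}(g)\big|_{\mathcal{D}(O;\tau_l)}, \qquad g \in G. \tag{51} \]
Recall that the representations $\pi^+_{l,\lambda}$ are generally not unitarizable. This is no longer true for the $\tilde\pi_{l,\lambda}$.

Proposition 1. $\tilde\pi_{l,\lambda}$ is a unitary representation of $G$ on $\mathcal{D}(O;\tau_l)$ with respect to the inner product
\[ (\varphi_1,\varphi_2)_{l,\lambda} = \int_S \langle \varphi_1(\omega), \varphi_2(\omega)\rangle_l\, [\omega,\omega]^{\Re\lambda}\,d\omega. \tag{52} \]
Proof. The linear isomorphism $\alpha : \mathcal{D}(O;\tau_l) \to \mathcal{D}(G/K;\tau_l)$ defined by
\[ (\alpha\varphi)(g) = \varphi(g\cdot e_0)\, \|g e_0\|^{-(\lambda+\rho_{\mathbb{H}})} \tag{53} \]
intertwines $\tilde\pi_{l,\lambda}$ and the left regular representation on $\mathcal{D}(G/K;\tau_l)$. $\square$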

We now suppose $\lambda \in \mathbb{R}$, and consider another $\tilde\pi_{l,\lambda}$-invariant Hermitian form, namely the one coming from the intertwining operator. Let $\tilde A_{l,\lambda} = A_{l,\lambda} \circ E$. Since $(J\omega,\upsilon) = [\omega,\upsilon]$, we have for all $\varphi \in \mathcal{D}(S;\tau_l)$
\[ (\tilde A_{l,\lambda}\varphi)(\omega) = \int_S [\omega,\upsilon]^{l,\lambda-\rho_{\mathbb{H}}}\, \varphi(\upsilon)\,d\upsilon. \tag{54} \]
$\tilde A_{l,\lambda}$ is an intertwining operator:
\[ \tilde A_{l,\lambda} \circ \tilde\pi_{l,\lambda}(g) = \tilde\pi_{l,-\lambda}(g) \circ \tilde A_{l,\lambda}, \qquad g \in G. \tag{55} \]
For all $\lambda \in \mathbb{C}$, the standard inner product of $L^2(S;V_l)$ is invariant for $(\pi^+_{l,\lambda}, \pi^+_{l,-\bar\lambda})$, i.e. $(\pi^+_{l,\lambda}(g)\varphi|\psi)_l = (\varphi|\pi^+_{l,-\bar\lambda}(g)\psi)_l$, for $\varphi, \psi \in \mathcal{D}(S;\tau_l)$. It follows that
\[ \langle \varphi_1|\varphi_2\rangle_{l,\lambda} = (\varphi_1|\tilde A_{l,\lambda}\varphi_2)_l \tag{56} \]
defines a $\tilde\pi_{l,\lambda}$-invariant inner product on $\mathcal{D}(O;\tau_l)$.

Theorem 3. Let $\lambda > l+1$ and let $\tilde\lambda = 2\lambda + \rho_{\mathbb{H}}$. The $G$-invariant Hermitian form associated with the Berezin kernel $B_{l,\lambda}$ (i.e. to $\Psi_{l,\lambda}$) is the form induced on $\mathcal{D}(G/K;\tau_l)$ by the inner product $(\cdot,\cdot)_{l,\tilde\lambda}$ on $\mathcal{D}(O;\tau_l)$ under the intertwining isomorphism $\alpha$. In other words, the canonical representation $\Pi_{l,\lambda}$ ($\lambda > l+1$) on $\mathcal{D}(G/K;\tau_l)$ is unitarily equivalent to the restriction of $\pi^+_{l,\tilde\lambda}$ to $G$ (when considered as a representation of $\mathcal{D}(O;\tau_l)$).

References

1. van Dijk, G., Hille, S.C.: Canonical representations related to hyperbolic spaces. J. Funct. Anal. 147, 109–139 (1997)
2. van Dijk, G., Hille, S.C.: Maximal degenerate representations, Berezin kernels and canonical representations. In: Komrakov, B., Krasil'shchik, J., Litvinov, G., Sossinky, A. (eds.) Lie groups and Lie algebras, their representations, generalizations, and applications. Dordrecht: Kluwer Academic, 1997, pp. 1–15
3. van Dijk, G.: Spherical functions on the p-adic group PGL(2). I. Indag. Math. 31, 213–225 (1969)
4. van Dijk, G.: Canonical representations. In: Tambov Summer School Seminar, August 26–31. Proceedings, Tambov, 1996. Vol. 2, Tambov: Vestnik Tambov Univ., 1997, pp. 350–366
5. van Dijk, G., Pasquale, A.: Harmonic analysis on vector bundles over Sp(1, n)/Sp(1) × Sp(n). Report W 97-17, Leiden University, 1997
6. Erdélyi, A., Magnus, W., Oberhettinger, F., Tricomi, F.G.: Higher transcendental functions. Vol. 1, New York: McGraw-Hill, 1953
7. Erdélyi, A., Magnus, W., Oberhettinger, F., Tricomi, F.G.: Tables of integral transforms. Vol. 2, New York: McGraw-Hill, 1954
8. Koornwinder, T.H.: A new proof for a Paley-Wiener type theorem for the Jacobi transform. Ark. Mat. 13, 145–159 (1975)
9. Koornwinder, T.H.: Jacobi functions and analysis on noncompact semisimple Lie groups. In: Askey, R.A., Koornwinder, T.H., Schempp, N. (eds.), Special functions: Group theoretical aspects and applications. Dordrecht: Reidel Publishing Company, 1984, pp. 1–85
10. Kunze, R.A.: Positive-definite operator-valued kernels and unitary representations. In: Gelbaum, B.R. (ed.), Functional analysis. Proceedings, University of California, Irvine, 1966, Washington D.C.: Thompson Book Co. 1967, pp. 235–247
11. Pasquale, A.: Maximal degenerate representations of SL(n + 1, H). Preprint. Leiden University, 1998
12. Vershik, A.M., Gel'fand, I.M., Graev, M.I.: Representations of the group SL(2, R) where R is a ring of functions. Russ. Math. Surv. 28, 87–132 (1973)
13. Warner, G.: Harmonic analysis on semi-simple Lie groups. Vol. 2, Berlin: Springer Verlag, 1972

Communicated by H. Araki

Commun. Math. Phys. 202, 669 – 699 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Kähler Moduli Space for a D-Brane at Orbifold Singularities

Kenji Mohri*

Theory Group, Institute of Particle and Nuclear Studies, High Energy Accelerator Research Organization (KEK), 1-1 O-ho, Tsukuba, Ibaraki 305-0801, Japan. E-mail: [email protected]

Received: 9 June 1998 / Accepted: 18 November 1998

* Present address: Inst. of Physics, Univ. of Tsukuba, 1-1-1 Ten-no dai, Tsukuba, Ibaraki 305-8571, Japan. E-mail: [email protected]

Abstract: We develop a method to analyze systematically the configuration space of a D-brane localized at the orbifold singular point of a Calabi–Yau $d$-fold of the form $\mathbb{C}^d/\Gamma$ using the theory of toric quotients. This approach elucidates the structure of the Kähler moduli space associated with the problem. As an application, we compute the toric data of the $\Gamma$-Hilbert scheme.

1. Introduction

The configuration space of a D-brane localized at the orbifold singularity of a Calabi–Yau $d$-fold of the form $\mathbb{C}^d/\Gamma$, where $\Gamma$ is a finite subgroup of $SU(d)$, is an interesting object to study, because it represents the ultra-short distance geometry felt by the D-brane probe [12], which may be different from the geometry of bulk string. On the mathematical side, the D-brane configuration space corresponds to a generalization of the Kronheimer construction of the ADE type hyper-Kähler manifolds [26] to higher dimensions, which has been studied by Sardo Infirri [38,39]. He has shown that the D-brane configuration space is a blow-up of the orbifold $\mathbb{C}^d/\Gamma$, the topology of which depends on the Kähler (or Fayet–Iliopoulos) moduli parameters; moreover he has conjectured that for $d = 3$, the D-brane configuration space is a smooth Calabi–Yau three-fold for a generic choice of the Kähler moduli parameters.

The case in which $\Gamma$ is Abelian is of particular importance, because then the configuration space is a toric variety, which enables us to employ various methods of toric geometry to study it. Using toric geometry, several aspects of the D-brane configuration space have been studied so far [10,11,13,18,27,30,31,36]. Our aim in this article is to give a method to analyze systematically the structure of the Kähler moduli space associated with the D-brane configuration space which releases one from the previous brute force calculations, for example see [27, (53–74)]. It turns out that the theory of toric quotients developed by Thaddeus [41] provides us with the most powerful tool to investigate the D-brane configuration space. This approach has already been taken in [39], where the analysis of the toric data is reduced to the network flow problem on the McKay quiver defined by the orbifold. To simplify the notation, we consider only cyclic groups for $\Gamma$, but the generalization to an arbitrary Abelian group, that is, a product of several cyclic groups, should be straightforward.

The organization of this article is as follows: In Sect. 2, we explain in detail the construction by Thaddeus [41] of quasi-projective toric varieties and their quotients by subtori in terms of rational convex polyhedra. This formulation gives us a clear picture of the Kähler moduli space associated with a toric quotient [24,41]. In Sect. 3, we describe the configuration space of a D-brane localized at the orbifold singularity as a toric variety obtained by a toric quotient of an affine variety closely following the treatment by Sardo Infirri [39]. Then we give typical examples of phases of the D-brane configuration spaces for Calabi–Yau four-fold models. Section 4 is devoted to an application of our construction of the D-brane configuration space to the $\Gamma$-Hilbert scheme [22,23,32,33,37], which is roughly the moduli space of $|\Gamma|$ points on $\mathbb{C}^d$ invariant under the action of $\Gamma$, in the hope that the investigation of various Hilbert schemes sheds light on the geometrical aspect of D-branes on Calabi–Yau varieties [4,5,35]. For textbooks or monographs dealing with various aspects of toric varieties and related topics, consult [1,7,15–17,34,40,44], as well as the physics articles [2,28,42], which contain introductory materials intended for physicists.

2. Toric Varieties and Its Quotients

2.1. Polyhedra and quasi-projective varieties. Let $N$ be a lattice of rank $p$ and $M = N^*$ be the dual lattice. Let $T = \operatorname{Hom}(M,\mathbb{C}^*) \cong N \otimes_{\mathbb{Z}} \mathbb{C}^* \cong (\mathbb{C}^*)^p$ be the associated torus.
Then we have the following identification:
\begin{align}
M &= \operatorname{Hom}(T,\mathbb{C}^*), \quad \text{characters of } T, \tag{2.1} \\
N &= \operatorname{Hom}(\mathbb{C}^*,T), \quad \text{1-parameter subgroups of } T. \tag{2.2}
\end{align}
Let $P$ be a $p$-dimensional convex polyhedron in the vector space $M_{\mathbb{Q}}$. We want to associate a quasi-projective toric variety to the data $(M,P)$, which we denote by $X(M,P)$ or simply by $X(P)$ if no confusion occurs. $P$ can be represented as an intersection of half-spaces as follows:
\[ P = \{ m \in M_{\mathbb{Q}} \mid \langle m, v^a\rangle \ge t_a,\ \forall a \in \Lambda \}, \tag{2.3} \]
where $v^a \in N$ and $t_a \in \mathbb{Q}$ and $\Lambda$ is an index set. For technical reasons, we put the following assumptions on $P$:
1. Each $v^a$ is a primitive vector, that is, for any integer $n > 1$, $(1/n)\,v^a \notin N$.
2. The expression of $P$ (2.3) is reduced in the sense that the omission of the $a$-th inequality in (2.3) gives rise to a polyhedron strictly larger than $P$ for any $a \in \Lambda$.
3. The vector space defined by $\{m \in M_{\mathbb{Q}} \mid \langle m, v^a\rangle = 0,\ \forall a \in \Lambda\}$, which is the maximal vector subspace in $P$, is equal to $\{0\}$.

The $a$-th facet of $P$, which we denote by $F_a$, is given by
\[ F_a := \{ m \in P \mid \langle m, v^a\rangle = t_a \}, \tag{2.4} \]
which shows that $v^a$ is an inner normal vector to $P$ at $F_a$. Here let us describe the combinatorics of $P$ [44, Lec. 2.2]. By the face lattice of $P$, we mean the set of all the faces of $P$ partially ordered by inclusion, which is denoted by $L(P)$. We also denote the proper part of it by $\bar L(P) := L(P) \setminus \{\emptyset, P\}$. For each $F \in \bar L(P)$, we define a subset $I(F)$ of $\Lambda$ by
\[ I(F) := \{ a \in \Lambda \mid F \subset F_a \}, \tag{2.5} \]
where $\operatorname{card} I(F) \ge \operatorname{codim} F$. Then each $F \in \bar L(P)$ can be represented as an intersection of facets as follows:
\[ F = \bigcap_{a \in I(F)} F_a. \tag{2.6} \]
It is also convenient to set formally $I(P) = \emptyset$, $I(\emptyset) = \Lambda$ and to regard (2.6) valid even for $F = \emptyset, P$. Then the intersection $\cap$ of any two elements of $L(P)$ can be described in an obvious manner, that is,
\[ F_1 \cap F_2 = \bigcap_{a \in I(F_1) \cup I(F_2)} F_a. \tag{2.7} \]
Again for $F_1, F_2 \in L(P)$, let $F_1 \cup F_2 \in L(P)$ be the smallest among those which contain both $F_1$ and $F_2$. The operation $\cup$ is called join. We see that for $F_1, F_2 \in L(P)$,
\[ F_1 \cup F_2 = \bigcap_{a \in I(F_1) \cap I(F_2)} F_a. \tag{2.8} \]
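The index-set description (2.5)–(2.8) of intersections and joins can be tried out on a small polytope. The sketch below uses the unit square in $\mathbb{Q}^2$ as a stand-in example (it is not taken from the text), represents a face by its set of vertices, and uses hypothetical facet labels.

```python
from itertools import product

# Unit square P = {m : <m, v_a> >= t_a}; each entry is (inner normal v_a, t_a).
FACETS = {
    "left": ((1, 0), 0),      # m1 >= 0
    "bottom": ((0, 1), 0),    # m2 >= 0
    "right": ((-1, 0), -1),   # m1 <= 1
    "top": ((0, -1), -1),     # m2 <= 1
}
VERTICES = list(product((0, 1), repeat=2))


def face(indices):
    """Vertices of the intersection of the facets F_a, a in indices, cf. (2.6)."""
    return frozenset(
        v for v in VERTICES
        if all(v[0] * FACETS[a][0][0] + v[1] * FACETS[a][0][1] == FACETS[a][1]
               for a in indices)
    )


def index_set(vertices):
    """I(F) = {a : F is contained in F_a}, cf. (2.5), for a face given by its vertices."""
    return frozenset(a for a in FACETS if vertices <= face({a}))


def meet(f1, f2):
    """Intersection of two faces via the union of their index sets, cf. (2.7)."""
    return face(index_set(f1) | index_set(f2))


def join(f1, f2):
    """Smallest face containing both, via the intersection of index sets, cf. (2.8)."""
    return face(index_set(f1) & index_set(f2))
```

With these conventions the empty face has index set $\Lambda$ and $P$ itself has empty index set, matching the formal convention above. The join of the vertices $(0,0)$ and $(1,0)$ is the bottom edge, and the join of two opposite vertices is the whole square.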

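Define a rank $(p+1)$ lattice by $\tilde M := \mathbb{Z} \times M$ and define a cone $C(P)$ in $\tilde M_{\mathbb{Q}}$, which is called the homogenization of $P$ [44, Lect. 1.5], by
\begin{align}
C(P) &= \text{closure of } \{ \lambda(1,m) \mid \lambda \in \mathbb{Q}_{\ge 0},\ m \in P \} \text{ in } \tilde M_{\mathbb{Q}} \notag \\
&= \{ \lambda(1,m) \mid \lambda \in \mathbb{Q}_{>0},\ m \in P \} + \{0\} \times \operatorname{rec} P, \tag{2.9}
\end{align}
where a Minkowski sum is used in the second line and $\operatorname{rec} P$ is the recession cone of $P$ defined by
\[ \operatorname{rec} P = \{ m \in M_{\mathbb{Q}} \mid m' + \lambda m \in P,\ \forall m' \in P,\ \forall \lambda \in \mathbb{Q}_{>0} \}. \tag{2.10} \]
In our case a more concrete expression is possible:
\[ \operatorname{rec} P \cong \{ m \in M_{\mathbb{Q}} \mid \langle m, v^a\rangle \ge 0,\ \forall a \in \Lambda \}. \tag{2.11} \]
$C(P) \cap \tilde M$ has a structure of a graded $\operatorname{rec} P$-algebra graded by its first component, that is, $(C(P) \cap \tilde M)_k := C(P) \cap (\{k\} \times M)$ and $(C(P) \cap \tilde M)_0 = \operatorname{rec} P$, which leads us to the following definition of $X(P)$ as a quasi-projective variety which is projective over an affine variety [41, (2.9)]
\[ X(P) := \operatorname{Proj}\big(C(P) \cap \tilde M\big) \longrightarrow X_0(P) := \operatorname{Spec}(\operatorname{rec} P). \tag{2.12} \]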

Strictly speaking, every scheme $X$ in this article, either affine or projective, should be replaced by the set of its $\mathbb{C}$-valued points $X(\mathbb{C}) := \operatorname{Hom}_{\mathbb{C}}(\operatorname{Spec}\mathbb{C}, X)$ [34]. To be more explicit, we construct $X(P)$ by the following procedure. First let $(k_1,m_1), \dots, (k_s,m_s)$ be the generators of $C(P) \cap \tilde M$. Then we have an embedding of $X(P)$ in the weighted projective space $\mathbb{P}(k_1,\dots,k_s)$, where a degree $k_j$ may be 0; more precisely, the degree zero generators of $C(P) \cap \tilde M$ are those of $\operatorname{rec} P \cap M$. The ambient space $\mathbb{P}(k_1,\dots,k_s)$ of $X(P)$ admits the following symplectic quotient realization:
\[ \mathbb{P}(k_1,\dots,k_s) = \Big\{ (z_1,\dots,z_s) \in \mathbb{C}^s \;\Big|\; \sum_{j=1}^s k_j |z_j|^2 = 1 \Big\} \Big/ U(1). \tag{2.13} \]
Second let $\psi$ be the lattice surjection from $\mathbb{Z}^s$ to $C(P) \cap \tilde M$ defined by
\[ \psi(c) := \sum_{j=1}^s c_j (k_j, m_j). \tag{2.14} \]
Then $\operatorname{Ker}\psi$ is the lattice that represents the relations between the generators of $C(P) \cap \tilde M$. We convert them to equations for the homogeneous coordinates $(z_j)$ of $\mathbb{P}(k_1,\dots,k_s)$, which are called the F-flatness equations in physics terminology:
\[ \prod_{c_j > 0} z_j^{c_j} = \prod_{c_j < 0} z_j^{-c_j}, \qquad c \in \operatorname{Ker}\psi, \tag{2.15} \]
where the degree of $z_j$ is $k_j$. We now get a symplectic quotient realization of $X(P)$:
\[ X(M,P) := \Big\{ (z_j) \in \mathbb{C}^s \;\Big|\; \sum_{j=1}^s k_j |z_j|^2 = 1,\ \text{F-flatness equations (2.15)} \Big\} \Big/ U(1). \tag{2.16} \]
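The step from an element of $\operatorname{Ker}\psi$ to a binomial equation (2.15) can be made concrete as follows. The sketch takes $M = \mathbb{Z}^2$ and the polyhedron $P = \operatorname{conv}\{3e_1, e_1+e_2, 3e_2\} + \operatorname{cone}\{e_1, e_2\}$; the exponent vectors are read off from its vertices and recession cone, and it is assumed here, not checked, that they generate the whole semigroup. The coordinate names are labels chosen for the sketch.

```python
# Assumed generators (k_j, m_j) of C(P) ∩ (Z × M), written as (degree, m1, m2).
GENERATORS = {
    "x1": (0, 1, 0),   # degree 0: generator e1 of rec P
    "x2": (0, 0, 1),   # degree 0: generator e2 of rec P
    "T1": (1, 3, 0),   # degree 1: vertex 3 e1
    "T2": (1, 1, 1),   # degree 1: vertex e1 + e2
    "T3": (1, 0, 3),   # degree 1: vertex 3 e2
}


def psi(c):
    """The map (2.14): c -> sum_j c_j (k_j, m_j), with c given as {name: c_j}."""
    return tuple(sum(cj * GENERATORS[name][i] for name, cj in c.items())
                 for i in range(3))


def f_flatness(c):
    """Turn c in Ker(psi) into the exponents of the two monomials in (2.15)."""
    if psi(c) != (0, 0, 0):
        raise ValueError("c is not in the kernel of psi")
    lhs = {name: cj for name, cj in c.items() if cj > 0}
    rhs = {name: -cj for name, cj in c.items() if cj < 0}
    return lhs, rhs
```

For example, `f_flatness({"x1": 1, "T3": 1, "x2": -2, "T2": -1})` returns the exponents of the relation $x_1 T_3 = x_2^2 T_2$.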

If $P$ itself is a polyhedral cone in $M_{\mathbb{Q}}$, then $C(P) \cong \mathbb{Q}_{\ge 0} \times P$ so that $\operatorname{Proj}\big(C(P) \cap \tilde M\big)$ is isomorphic to $\operatorname{Spec}(P \cap M)$, that is, $X(P)$ is an affine variety. Another extreme case is when $P$ is a bounded polyhedron, that is, a polytope. Then $X(P)$ is a projective variety.

Example. Let $M = \mathbb{Z}^2$ and $P = \operatorname{conv}\{0, (1/2)e_1, (1/3)e_2\} \subset M_{\mathbb{Q}}$. Then $C(P) \cap \tilde M$ is freely generated by $(1,0)$, $(2,e_1)$ and $(3,e_2)$, so that $X(P) = \mathbb{P}(1,2,3)$.

Example. Let $M = \mathbb{Z}^2$ and $P = \operatorname{conv}\{3e_1, e_1+e_2, 3e_2\} + \operatorname{cone}\{e_1, e_2\}$. Then
\[ X(P) = \left\{ (x_1,x_2;T_1,T_2,T_3) \in \mathbb{C}^2 \times \mathbb{P}^2 \;\middle|\; \begin{array}{l} x_1 T_3 - x_2^2 T_2 = 0,\quad x_2 T_1 - x_1^2 T_2 = 0, \\ T_1 T_3 - x_1 x_2 T_2^2 = 0 \end{array} \right\}, \]
which is projective over the affine variety $X(\operatorname{rec} P) = \mathbb{C}^2$.

The $T$-action on the homogeneous coordinates is given by
\[ z_j \to \lambda^{\langle m_j, n\rangle} z_j, \qquad n \in N,\ \lambda \in \mathbb{C}^*, \tag{2.17} \]
where we regard $n \in N$ as a 1-parameter subgroup of $T$ according to (2.2). In an evident way, (2.17) induces a $T$-action on $C(P) \cap \tilde M$, which defines a linearization, that is, a lifting to an ample line bundle, of the $T$-action on the base $X(P)$.
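For the triangle with vertices $0$, $(1/2)e_1$, $(1/3)e_2$, the graded pieces of $C(P) \cap \tilde M$ can be enumerated directly: the degree-$k$ piece consists of the lattice points of $kP$, and each such point should correspond to exactly one monomial of weighted degree $k$ in coordinates of weights $1, 2, 3$. The sketch below checks this count; the function names are hypothetical.

```python
def lattice_points_in_dilate(k):
    """Lattice points of k*P for P = conv{0, e1/2, e2/3}: m1, m2 >= 0 with 2*m1 + 3*m2 <= k."""
    return [(m1, m2)
            for m1 in range(k // 2 + 1)
            for m2 in range(k // 3 + 1)
            if 2 * m1 + 3 * m2 <= k]


def weighted_monomials(k):
    """Exponent triples (a, b, c) with a + 2b + 3c = k, i.e. degree-k monomials on P(1, 2, 3)."""
    return [(k - 2 * b - 3 * c, b, c)
            for b in range(k // 2 + 1)
            for c in range(k // 3 + 1)
            if k - 2 * b - 3 * c >= 0]
```

The correspondence sends a lattice point $(m_1, m_2)$ of $kP$ to the exponent triple $(k - 2m_1 - 3m_2,\ m_1,\ m_2)$, which is the decomposition of $(k, m)$ in terms of the generators $(1,0)$, $(2,e_1)$, $(3,e_2)$. In degree 6 both lists have seven elements.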

2.2. Toric varieties from fans. Now that we have given a variety $X(M,P)$ associated with a polyhedron $P \subset M_{\mathbb{Q}}$, it is natural to ask for the fan in $N_{\mathbb{Q}}$ that yields $X(M,P)$ as a toric variety. To describe the fan associated with $X(M,P)$, let us first define the following function:
\[ h(n) := \min\{ \langle m', n\rangle : m' \in P \}, \tag{2.18} \]
which is called the support function of $P \subset M_{\mathbb{Q}}$ [34, Appendix]. Note that the domain of definition of $h$, which we denote by $\operatorname{dom} h$, is
\[ \operatorname{dom} h = \operatorname{cone}\{ v^a \mid a \in \Lambda \} \subset N_{\mathbb{Q}}, \tag{2.19} \]
which is $p$ dimensional owing to the third assumption on $P$ that we put earlier. Now define a cone $C(F)$ in $N_{\mathbb{Q}}$ for $F \in L(P) \setminus \emptyset$ by
\[ C(F) := \{ n \in \operatorname{dom} h \mid \langle m, n\rangle = h(n),\ \forall m \in F \} \subset N_{\mathbb{Q}}, \tag{2.20} \]
which is called the normal cone of $F$. To be more explicit, for the $a$-th facet of $P$, $C(F_a) = \operatorname{cone}\{v^a\} = \mathbb{Q}_{\ge 0}\, v^a$ and for a lower dimensional face $F$,
\[ C(F) = \operatorname{cone}\{ v^a \mid a \in I(F) \} = \sum_{a \in I(F)} \mathbb{Q}_{\ge 0}\, v^a. \tag{2.21} \]
We also see that $C(P) = \{0\} \subset N_{\mathbb{Q}}$ because we always assume that $\dim P = p$. Note that $\dim F + \dim C(F) = p$ and $F \in L(P) \setminus \emptyset$ can be recovered from $C(F)$ by $F = \{ m \in P \mid \langle m, n\rangle = h(n),\ \forall n \in C(F) \}$. Moreover for $F_1, F_2 \in L(P) \setminus \emptyset$, $C(F_1)$ is a face of $C(F_2)$ if and only if $F_2$ is a face of $F_1$, and $C(F_1 \cup F_2) = C(F_1) \cap C(F_2)$ is a common face of $C(F_1)$ and $C(F_2)$. Thus we can define a fan in $N_{\mathbb{Q}}$ by
\[ \mathcal{N}(P) := \{ C(F) \mid F \in L(P) \setminus \emptyset \}, \tag{2.22} \]
which we call the normal fan of $P$, and the support of which is $\operatorname{dom} h$. We denote by $X^*(N, \mathcal{N}(P))$ the toric variety associated with the data $(N, \mathcal{N}(P))$. By definition, $X^*(N, \mathcal{N}(P))$ has the following affine open covering:
\[ X^*(N, \mathcal{N}(P)) = \bigcup_{F \in L(P) \setminus \emptyset} X(M, C(F)^*), \tag{2.23} \]
where $X(M, C(F)^*) = \operatorname{Spec}(M \cap C(F)^*)$, and for a cone $C \subset N_{\mathbb{Q}}$, its dual cone $C^* \subset M_{\mathbb{Q}}$ is defined by
\[ C^* := \{ m \in M_{\mathbb{Q}} \mid \langle m, n\rangle \ge 0,\ \forall n \in C \}. \tag{2.24} \]

Proposition 2.1. $X^*(N, \mathcal{N}(P))$ is isomorphic to $X(M,P)$.

This follows from the fact that the affine open covering of $X^*(N, \mathcal{N}(P))$ described in (2.23) is identical with that of $X(M,P)$ given in [41, Proposition (2.17)]. The shape and the size of the polyhedron $P$ carry information about the Kähler moduli parameters of $X(M,P)$, which are lost in converting $P$ into its normal fan $\mathcal{N}(P)$. Two polyhedra $P_1$, $P_2$ in $M_{\mathbb{Q}}$ are said to be normally equivalent if their normal fans are isomorphic to each other, that is, $\mathcal{N}(P_1) \cong \mathcal{N}(P_2)$.
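For a polytope the support function (2.18) is attained at vertices, so the face on which a direction $n$ is minimized, and hence the normal cone containing $n$, can be found by brute force. The sketch below does this for two axis-parallel rectangles chosen for illustration and compares the resulting faces direction by direction. The helper names and the list of test directions are assumptions of the sketch, and agreement on finitely many directions is only a spot check of normal equivalence, not a proof.

```python
def support(vertices, n):
    """h(n) = min over the polytope of <m, n>; for a polytope it is attained at a vertex."""
    return min(v[0] * n[0] + v[1] * n[1] for v in vertices)


def minimizing_face(vertices, n):
    """Indices of the vertices where <m, n> = h(n); n lies in the normal cone of that face."""
    h = support(vertices, n)
    return [i for i, v in enumerate(vertices) if v[0] * n[0] + v[1] * n[1] == h]


# Two rectangles with parallel edges, vertices listed in corresponding order.
P1 = [(0, 0), (1, 0), (0, 1), (1, 1)]
P2 = [(0, 0), (4, 0), (0, 3), (4, 3)]
DIRECTIONS = [(1, 0), (0, 1), (-1, 0), (0, -1),
              (1, 1), (-1, 1), (1, -1), (-1, -1), (2, -5)]


def same_faces_on(directions, p, q):
    """True if every listed direction is minimized on corresponding faces of p and q."""
    return all(minimizing_face(p, n) == minimizing_face(q, n) for n in directions)
```

Here `minimizing_face(P1, (1, 0))` picks out the edge through the vertices with indices 0 and 2, as expected for the inner normal $e_1$, and `same_faces_on(DIRECTIONS, P1, P2)` holds even though the two rectangles have different sizes.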


Example. Let us take $M = \mathbb{Z}^2$ and a pair of normally equivalent polyhedra $P_1 = \operatorname{conv}\{0, e_1, e_2, e_1 + e_2\}$, $P_2 = \operatorname{conv}\{0, 4e_1, 3e_2, 4e_1 + 3e_2\}$. Both $X(M, P_1) \subset \mathbb{P}^3$ and $X(M, P_2) \subset \mathbb{P}^{19}$ are isomorphic to $\mathbb{P}^1 \times \mathbb{P}^1$; the Kähler moduli of the former and the latter are $(1, 1)$ and $(4, 3)$ respectively. The use of the normal fan, however, is a far more efficient way to obtain the toric variety $X(M, P)$.

2.3. Toric quotient. Let $P \subset M_{\mathbb{Q}}$ be a polyhedron, and $X(M, P)$ be the associated quasi-projective variety. Suppose that there is an exact sequence of lattices

$$0 \to N' \xrightarrow{\ \pi^*\ } N \xrightarrow{\ i^*\ } \bar{N} \to 0, \tag{2.25}$$

where $\operatorname{rank} N' = p - q$ and $\operatorname{rank} \bar{N} = q$; then the dual sequence is also exact:

$$0 \to \bar{M} \xrightarrow{\ i\ } M \xrightarrow{\ \pi\ } M' \to 0. \tag{2.26}$$

A sublattice $N' \subset N$ defines a subtorus $T' = N' \otimes \mathbb{C}^* = \operatorname{Hom}(M', \mathbb{C}^*)$ of rank $p - q$, which acts on $X(M, P)$. Now we want to define the geometric invariant theory (GIT) quotient of $X(M, P)$ by the action of $T'$. The graded ring $C(P) \cap \tilde{M}$ admits a natural $T'$-action and the $T'$-invariant part $(C(P) \cap \tilde{M})^{T'}$ is also a graded ring. Then we define the quotient variety by

$$X(M, P)/\!/T' := \operatorname{Proj}\, (C(P) \cap \tilde{M})^{T'}, \tag{2.27}$$

which is again projective over the affine variety defined by the affine GIT quotient

$$X_0(M, P)/\!/T' := \operatorname{Spec}\, (\operatorname{rec} P \cap M)^{T'}, \tag{2.28}$$

where $(\operatorname{rec} P \cap M)^{T'}$ is the degree zero part of $(C(P) \cap \tilde{M})^{T'}$. We immediately see that the GIT quotient variety admits the following toric realization:

$$X(M, P)/\!/T' = X\big(\bar{M},\, P \cap \pi_{\mathbb{Q}}^{-1}(0)\big), \tag{2.29}$$

where $\bar{M} = \operatorname{Ker} \pi = M \cap \pi_{\mathbb{Q}}^{-1}(0)$ is the sublattice of $M$ fixed by $T'$.

The corresponding symplectic quotient construction can be done as follows: In addition to the D-flatness equation in (2.13),

$$\sum_{j=1}^{s} k_j |z_j|^2 = 1 \tag{2.30}$$

for the ambient space $\mathbb{P}(k_1, \dots, k_s)$, we put $p - q$ D-flatness equations associated with $T'$-action on $(z_j)$ with the Kähler (or Fayet–Iliopoulos) parameters $r = 0 \in M'_{\mathbb{Q}}$ followed by quotienting by $U(1)^{p-q}$. More concretely, let $n'_1, \dots, n'_{p-q}$ be the generators of $N'$, each of which corresponds to a 1-parameter subgroup of $T' \cong (\mathbb{C}^*)^{p-q}$. Then the additional $p - q$ D-flatness equations can be written as

$$\sum_{j=1}^{s} \langle \pi(m_j), n'_l \rangle\, |z_j|^2 = 0, \quad l = 1, \dots, p - q. \tag{2.31}$$

A useful abbreviation of (2.31) is

$$\sum_{j=1}^{s} \pi(m_j)\, |z_j|^2 = 0, \tag{2.32}$$

where we say that $z_j$ has $T'$-charge $\pi(m_j)$.

Now we want to consider the toric quotient of $X(P)$ by $T'$ with a nonzero Kähler moduli parameter $r \in M'_{\mathbb{Q}}$. To this end let us take $\hat{r} \in M_{\mathbb{Q}}$ such that $\pi_{\mathbb{Q}}(\hat{r}) = r$ and consider the shifted polyhedron $P - \hat{r} \subset M_{\mathbb{Q}}$. The original generators $(k_j, m_j)$ of $C(P) \cap \tilde{M}$ are now shifted to $(k_j, m_j - k_j \hat{r})$ so that the $T'$-charge of $z_j$ becomes $(\pi(m_j) - k_j r)$. This $T'$ charge assignment for $(z_j)$ defines a new action of $T'$ on $(C(P) \cap \tilde{M})$, which we denote by $T'(r)$. Then we can define the GIT quotient of $X(M, P)$ by $T'(r)$ as

$$X(M, P)/\!/T'(r) := \operatorname{Proj}\, (C(P) \cap \tilde{M})^{T'(r)}, \tag{2.33}$$

which is also projective over the affine variety

$$X_0(M, P)/\!/T'(r) := \operatorname{Spec}\, (\operatorname{rec} P \cap M)^{T'(r)}. \tag{2.34}$$

The ambiguity in the choice of $\hat{r}$, which is isomorphic to $\bar{M}_{\mathbb{Q}}$, does not affect the definitions (2.33), (2.34). In fact it only affects the $\bar{T} := T/T'$-linearization of the quotient variety, which is irrelevant to us.

To see that the definition of $X(M, P)/\!/T'(r)$ above corresponds to the change of the Kähler parameters to $r \in M'_{\mathbb{Q}}$, we have only to describe the corresponding symplectic quotient construction of $X(M, P)/\!/T'(r)$. The D-flatness equations associated with $T'(r)$ are

$$\sum_{j=1}^{s} (\pi(m_j) - k_j r)\, |z_j|^2 = 0. \tag{2.35}$$

Combining (2.30) and (2.35), we obtain the D-flatness equations associated with $T'$ with the Kähler moduli parameters $r$:

$$\sum_{j=1}^{s} \pi(m_j)\, |z_j|^2 = r. \tag{2.36}$$


Thus we get $X(M, P)/\!/T'(r)$ by the following symplectic quotient of $X(P)$ by the $U(1)^{p-q}$-action with $r$ as the Kähler parameters:

$$
\begin{aligned}
X(M, P)/\!/T'(r) &\cong \Big\{ (z_j) \in X(M, P) \;\Big|\; \sum_{j=1}^{s} \pi(m_j)\, |z_j|^2 = r \Big\} \Big/ U(1)^{p-q} \\
&\cong \Big\{ (z_j) \in \mathbb{C}^s \;\Big|\; \sum_{j=1}^{s} k_j |z_j|^2 = 1,\ \sum_{j=1}^{s} \pi(m_j)\, |z_j|^2 = r,\ \text{F-flatness equations (2.15)} \Big\} \Big/ U(1)^{p-q+1}.
\end{aligned}
\tag{2.37}
$$

In the following we show that the GIT quotient $X(M, P)/\!/T'(r)$ defined above can be realized as a quasi-projective toric variety, generalizing (2.29):

Proposition 2.2. Fix $\hat{r} \in M_{\mathbb{Q}}$ such that $\pi_{\mathbb{Q}}(\hat{r}) = r$ for $r \in M'_{\mathbb{Q}}$, and let $Q(\hat{r}) \subset \bar{M}_{\mathbb{Q}}$ be the polyhedron defined by $Q(\hat{r}) := (P - \hat{r}) \cap \bar{M}_{\mathbb{Q}}$. Then we have

$$X(M, P)/\!/T'(r) = X\big(\bar{M}, Q(\hat{r})\big). \tag{2.38}$$

Proof. We see that (2.38) holds when $r \in M'$ and $\hat{r} \in M$ because upon the shift by $\hat{r}$, each element of $C(P) \cap \tilde{M}$ turns to one of $C(P - \hat{r}) \cap \tilde{M}$.

Let $e$ be the least positive integer such that $er \in M'$. Without loss of generality, we can restrict $\hat{r} \in \pi_{\mathbb{Q}}^{-1}(r)$ to those which satisfy $e\hat{r} \in M$. To deal with this case, we use the dilatation invariance of the toric data: For a graded ring $G := \bigoplus_{k \geq 0} G_k$, define its $e$th Segre transform $G^{(e)}$ for $e \in \mathbb{N}$ by $G^{(e)}_k = G_{ek}$ and $G^{(e)} := \bigoplus_{k \geq 0} G^{(e)}_k = \bigoplus_{k \geq 0} G_{ek}$. Then we have

$$\operatorname{Proj} G \cong \operatorname{Proj} G^{(e)}. \tag{2.39}$$

We easily see that the $e$th Segre transform of $C(P) \cap \tilde{M}$ coincides with $C(eP) \cap \tilde{M}$, so that

$$X(M, P) \cong X(M, eP). \tag{2.40}$$

Then we have

$$X\big(\bar{M}, (P - \hat{r}) \cap \bar{M}_{\mathbb{Q}}\big) \cong X\big(\bar{M}, (eP - e\hat{r}) \cap \bar{M}_{\mathbb{Q}}\big) \cong \operatorname{Proj}\, (C(eP) \cap \tilde{M})^{T'(er)}. \tag{2.41}$$

To finish the proof of the proposition, we have only to prove the following lemma:

Lemma 2.2.1. The graded ring $(C(eP) \cap \tilde{M})^{T'(er)}$ coincides with $(C(P) \cap \tilde{M})^{T'(r)}$.


Proof of Lemma 2.2.1. For simplicity, we set temporarily $G_k := (C(P) \cap \tilde{M})_k = kP \cap M$, $G := C(P) \cap \tilde{M}$, and $G^{(e)} := C(eP) \cap \tilde{M}$. Any element of $G^{T'(r)}$ can be written as $\sum_{j=1}^{L} (k_j, m_j)$, where the total $T'(r)$ charge is $\sum_{j=1}^{L} (\pi(m_j) - k_j r) = 0$, that is, $\big(\sum_{j=1}^{L} k_j\big)\, r = \sum_{j=1}^{L} \pi(m_j) \in M'$, which implies that $\sum_{j=1}^{L} k_j$, which is the degree of $\sum_{j=1}^{L} (k_j, m_j)$, should be a multiple of $e$. Thus we see that if we define a subring $H$ of $G$ by $H_k = G_k$ if $k \equiv 0 \bmod e$ and $H_k = 0$ otherwise, then we have $G^{T'(r)} = H^{T'(r)}$.

Now take an arbitrary element $(ek, m) \in H_{ek} = G^{(e)}_k$. When regarded as an element of $H_{ek}$, its $T'(r)$ charge is $(\pi(m) - ekr)$, which is the same as its $T'(er)$ charge regarded as an element of $G^{(e)}_k$. □

Then the combination of (2.41) and Lemma 2.2.1 proves Proposition 2.2. □

2.4. Kähler moduli space. We consider here the $r$-dependence of the topology, or the phase in physics terminology, of the quotient toric variety (2.38). The quotient variety is the toric variety associated with the normal fan of the polyhedron $Q(\hat{r})$, which is given by the slice $P \cap \pi_{\mathbb{Q}}^{-1}(r)$ of $P$ translated by $-\hat{r}$. Therefore the topology of the quotient variety is determined virtually by the shape of the slice $P \cap \pi_{\mathbb{Q}}^{-1}(r)$, which depends on the faces of $P$ that intersect with the affine subspace $\pi_{\mathbb{Q}}^{-1}(r)$ of $M_{\mathbb{Q}}$.

This observation leads us to define the following decomposition of the polyhedron $\pi_{\mathbb{Q}}(P)$ induced by the $\pi_{\mathbb{Q}}$-images of the faces of $P$ [24]. First for each $r \in \pi_{\mathbb{Q}}(P)$, let $L(r)$ be the subset of $L(P)$, the proper faces of $P$, defined by $L(r) := \{ F \in L(P) \mid r \in \pi_{\mathbb{Q}}(F) \}$. Then define an equivalence relation $\sim$ in $\pi_{\mathbb{Q}}(P)$ by $r_1 \sim r_2$ if and only if $L(r_1) = L(r_2)$, for $r_1, r_2 \in \pi_{\mathbb{Q}}(P)$. We call an equivalence class $K^0$ in $\pi_{\mathbb{Q}}(P)/\!\sim$ a chamber. The polyhedron $\pi_{\mathbb{Q}}(P)$ admits the decomposition into the disjoint sum of these chambers:

$$\pi_{\mathbb{Q}}(P) = \coprod_{K^0 \in \pi_{\mathbb{Q}}(P)/\sim} K^0, \tag{2.42}$$

and the topology of the quotient variety $X(\bar{M}, Q(\hat{r}))$ is constant in each chamber [24]. Therefore we see that the decomposition (2.42) of the parameter space $\pi_{\mathbb{Q}}(P)$ represents the phase structure of the toric quotient.

We also define a closed polyhedron $K$ to be the closure in $M'_{\mathbb{Q}}$ of the chamber $K^0 \in \pi_{\mathbb{Q}}(P)/\!\sim$. Conversely $K^0$ is recovered as the relative interior of $K$. Then the collection of the polyhedra $K$ defined by

$$\mathcal{K} := \{ K \mid K^0 \in \pi_{\mathbb{Q}}(P)/\!\sim \} \tag{2.43}$$

constitutes a polyhedral complex [44, Lect. 5.1] in $M'_{\mathbb{Q}}$, which means that for each $K \in \mathcal{K}$, every face of $K$ belongs to $\mathcal{K}$ and the intersection $K_1 \cap K_2$ of any two elements of $\mathcal{K}$ is a face of both $K_1$ and $K_2$; in particular $\mathcal{K}$ is a fan if it consists of polyhedral cones, which is true if $P$ itself is a cone. We call the polyhedral decomposition of $\pi_{\mathbb{Q}}(P)$ defined by the complex (2.43) the Kähler moduli space associated with the toric quotient.
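The chamber decomposition can be made concrete on a toy example. The sketch below is not taken from the paper: it assumes $P$ is the unit square in $\mathbb{Q}^2$ and the projection is $\pi(m) = m_1 + m_2$ onto $\mathbb{Q}$, computes the set $L(r)$ of proper faces whose image contains $r$, and groups sample values of $r$ by that set. Only the sampled points are classified, so this illustrates the equivalence relation rather than computing the chambers exactly.

```python
from fractions import Fraction

# Illustrative data (assumptions of this sketch): unit square and pi(m) = m1 + m2.
VERTICES = [(0, 0), (1, 0), (0, 1), (1, 1)]
EDGES = [((0, 0), (1, 0)), ((0, 0), (0, 1)), ((1, 0), (1, 1)), ((0, 1), (1, 1))]
FACES = [(v,) for v in VERTICES] + EDGES  # proper faces, each given by its vertices

def pi(m):
    """Projection pi_Q : M_Q -> M'_Q = Q chosen for this example."""
    return Fraction(m[0] + m[1])

def faces_over(r):
    """L(r): proper faces F with r in pi_Q(F); the image of F is the interval [min, max]."""
    r = Fraction(r)
    hits = []
    for face in FACES:
        images = [pi(v) for v in face]
        if min(images) <= r <= max(images):
            hits.append(face)
    return frozenset(hits)

def chambers(samples):
    """Group the sample points by L(r); each group lies in one chamber."""
    groups = {}
    for r in samples:
        groups.setdefault(faces_over(r), []).append(Fraction(r))
    return sorted(groups.values())
```

With samples $0, \tfrac14, \dots, 2$ the groups are $\{0\}$, $(0,1)$, $\{1\}$, $(1,2)$ and $\{2\}$: the three points are the images of the vertices and the two open intervals are the chambers of maximal dimension.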


We define the Kähler walls to be the $\pi_{\mathbb{Q}}$-image of the skeleton of $P$ consisting of all the faces of codimensions $q + 1$. The Kähler walls form the region where the toric quotient construction degenerates in the sense that for each $r$ in the Kähler walls, there is a face $F$ of $P$ such that $F$ and $\pi_{\mathbb{Q}}^{-1}(r)$ intersect despite the fact that the sum of their codimensions in $M_{\mathbb{Q}}$ exceeds $p = \dim M_{\mathbb{Q}}$.

We are thus mainly interested in the Kähler moduli parameters in the complement of the Kähler walls in $\pi_{\mathbb{Q}}(P)$, which is the disjoint union of the chambers of the maximal dimensions [41], which we call the maximal chambers. We also call the closure of a maximal chamber in $M'_{\mathbb{Q}}$ a “maximal polyhedron”. The $\pi_{\mathbb{Q}}$-image of each face of $P$ of codimensions less than $q + 1$ is a union of several maximal polyhedra.

Let $L(P)^{(k)}$ be the subset of $L(P)$ consisting of the faces of codimensions $k$. For each $F \in L(P)^{(k)}$, where $k \leq q$, we can define a $k$-cone in $\bar{N}_{\mathbb{Q}}$ by

$$\bar{C}(F) := i^*_{\mathbb{Q}}(C(F)) = \operatorname{cone}\{ \bar{v}_a \mid a \in I(F) \}, \tag{2.44}$$

where $i^*$ is the lattice surjection from $N$ to $\bar{N}$ (2.25). Then we see that for any $r \in \operatorname{int} \pi_{\mathbb{Q}}(F)$, the normal fan $\mathcal{N}(Q(\hat{r}))$ of the quotient has the $k$-cone $\bar{C}(F)$ defined above. This is because if $r \in \operatorname{int} \pi_{\mathbb{Q}}(F)$, then the slice $P \cap \pi_{\mathbb{Q}}^{-1}(r)$ has the face $F \cap \pi_{\mathbb{Q}}^{-1}(r) = \bigcap_{a \in I(F)} F_a \cap \pi_{\mathbb{Q}}^{-1}(r)$ of codimensions $k$, the normal cone of which is precisely $\bar{C}(F)$.

The following two cases are of particular importance: first for the $a$th facet $F_a$ and for any $r \in \pi_{\mathbb{Q}}(F_a)$, the slice has the facet $F_a \cap \pi_{\mathbb{Q}}^{-1}(r)$, so that the normal fan $\mathcal{N}(Q(\hat{r}))$ has the 1-cone $\operatorname{cone}\{ \bar{v}_a \}$, which means that the quotient variety has the exceptional divisor corresponding to $\bar{v}_a$. We say two vectors $\bar{v}_a$ and $\bar{v}_b$ in $\bar{N}$ are incompatible if $\operatorname{int} \pi_{\mathbb{Q}}(F_a)$ and $\operatorname{int} \pi_{\mathbb{Q}}(F_b)$ have no common point; then the two vectors cannot appear simultaneously in the quotient fan outside the Kähler walls; second for $F \in L(P)^{(q)}$ and $r \in \operatorname{int} \pi_{\mathbb{Q}}(F)$, the slice has the vertex $\bigcap_{a \in I(F)} F_a \cap \pi_{\mathbb{Q}}^{-1}(r)$, which corresponds to the maximal cone $\bar{C}(F) \subset \bar{N}_{\mathbb{Q}}$ of the normal fan.

Because the normal fan $\mathcal{N}(Q(\hat{r}))$ is determined by listing its maximal cones, we obtain the following description of the phase structure of the quotient variety outside the Kähler walls. Let us call a subset $S$ of $L(P)^{(q)}$ coherent if the collection of the cones in $\bar{N}_{\mathbb{Q}}$,

$$\Sigma(S) := \{ L(\bar{C}(F)) \mid F \in S \}, \tag{2.45}$$

defines a fan, where $L(\bar{C}(F))$, the face lattice of $\bar{C}(F)$, is the set of all the faces of $\bar{C}(F)$, and if the subspace of $\pi_{\mathbb{Q}}(P)$ defined by

$$K(S) := \bigcap_{F \in S} \pi_{\mathbb{Q}}(F) = \bigcap_{F \in S} \pi_{\mathbb{Q}}\Big( \bigcap_{a \in I(F)} F_a \Big) \tag{2.46}$$

has an interior point, that is, if $K(S)$ is a maximal polyhedron.

Proposition 2.3. The Kähler moduli space $\mathcal{K}$ associated with the toric quotient is

$$\mathcal{K} = \{ L(K(S)) \mid S \subset L(P)^{(q)} : \text{coherent} \}. \tag{2.47}$$


Proposition 2.4. For each coherent subset $S \subset L(P)^{(q)}$, we have

$$X\big(\bar{M}, Q(\hat{r})\big) \cong X^*\big(\bar{N}, \Sigma(S)\big), \quad \forall r \in \operatorname{int} K(S), \tag{2.48}$$

where $X^*(\bar{N}, \Sigma(S))$ is the toric variety defined by the fan $\Sigma(S)$.

Note that (2.47) and (2.46) generalize the descriptions of the GKZ secondary fan [17] and its maximal cones given in [6, (4.2)], where $M = \mathbb{Z}^p$, $P = \operatorname{cone}\{ e_1, \dots, e_p \} \cong (\mathbb{Q}_{\geq 0})^p$ is the basic simplicial cone, and $X(M, P) \cong \mathbb{C}^p$, which has been used in the investigation of the Kähler moduli space of bulk string compactified on a Calabi–Yau manifold [21].

3. D-Brane Configuration Space

3.1. Calabi–Yau orbifolds. Let $\{a_1, \dots, a_d\}$ be a $d$-tuple of the integers, and $\omega$ be a primitive $n$th root of unity. We define $\Gamma$ to be a group isomorphic to the cyclic group $\mathbb{Z}_n := \mathbb{Z}/n\mathbb{Z}$ and define the action of the generator $g \in \Gamma$ on $\mathbb{C}^d$ by

$$g \cdot x_\mu = \omega^{a_\mu} x_\mu, \quad 1 \leq \mu \leq d. \tag{3.1}$$

We denote the quotient space by $\mathbb{C}^d/\Gamma$. The following are well-known:

• $\mathbb{C}^d/\Gamma$ has an isolated singularity at the origin if and only if $(a_\mu, n) = 1$, $\forall \mu$.
• $\mathbb{C}^d/\Gamma$ is a Calabi–Yau variety if and only if $\sum_{\mu=1}^{d} a_\mu \equiv 0 \bmod n$.

We restrict ourselves to the models in which the orbifold $\mathbb{C}^d/\Gamma$ is a Calabi–Yau variety with an isolated singularity unless otherwise stated, because our main interest is the study of the configuration space $\mathcal{M}$ of a D-brane localized at the singular point of the Calabi–Yau variety $\mathbb{C}^d/\Gamma$. We denote the model characterized by the integers $(a_1, \dots, a_d; n)$ above by $1/n(a_1, \dots, a_d)$ for simplicity.

Here we give some facts about the Calabi–Yau orbifolds. An advanced introduction to this subject can be found in [9]. First of all, $\mathbb{C}^d/\Gamma$ is a toric variety. A useful choice of the dual pair of the lattices to describe $\mathbb{C}^d/\Gamma$ is the following:

$$N_0 := \mathbb{Z}^d + \frac{1}{n}(a_1, \dots, a_d)\, \mathbb{Z}, \tag{3.2}$$

$$M_0 := \{ m \in \mathbb{Z}^d \mid m \cdot a \equiv 0 \bmod n \}, \quad a := (a_1, \dots, a_d). \tag{3.3}$$

Let $\{e_1^*, \dots, e_d^*\}$ and $\{e_1, \dots, e_d\}$ be the set of the fundamental vectors of $(N_0)_{\mathbb{Q}}$ and $(M_0)_{\mathbb{Q}}$ respectively, which generate the dual pair of simplicial cones:

$$C_0^* = \operatorname{cone}\{ e_1^*, \dots, e_d^* \} \cong (\mathbb{Q}_{\geq 0})^d \subset (N_0)_{\mathbb{Q}}, \tag{3.4}$$

$$C_0 = \operatorname{cone}\{ e_1, \dots, e_d \} \cong (\mathbb{Q}_{\geq 0})^d \subset (M_0)_{\mathbb{Q}}. \tag{3.5}$$

Then we have

$$\mathbb{C}^d/\Gamma = X(M_0, C_0) = \operatorname{Spec}(M_0 \cap C_0) = X^*(N_0, C_0^*). \tag{3.6}$$

To see this, it suffices to note that the affine coordinate ring of $\mathbb{C}^d/\Gamma$ is the $\Gamma$-invariant part of $\mathbb{C}[x_1, \dots, x_d]$, which is precisely $(M_0 \cap C_0)$. The simplicial cone $C_0^* \subset (N_0)_{\mathbb{Q}}$ is the fan associated with $\mathbb{C}^d/\Gamma$. Thus a toric blow-up of $\mathbb{C}^d/\Gamma$ corresponds to a subdivision of the cone $C_0^*$ by incorporating new 1-cones, the primitive vectors of which correspond to exceptional divisors. For simplicity, we will not distinguish the primitive vector of a 1-cone from the exceptional divisor associated with it.

Let $T = \operatorname{conv}\{ e_1^*, \dots, e_d^* \}$ be the fundamental simplex in $(N_0)_{\mathbb{Q}}$ associated with the orbifold. A primitive vector $v \in N_0$ is classified by its age, which is defined to be the positive integer $k$ such that $v \in kT$. Incorporation of $v \in N_0$ in subdivision of the fan $C_0^*$ preserves the Calabi–Yau property if and only if its age is 1. Thus a primitive vector of age 1 is said to be crepant. A crepant toric blow-up of $\mathbb{C}^d/\Gamma$ corresponds to a subdivision of $T$ using lattice points in $T$. We define the weight vector $w$ associated with a primitive vector $v \in N_0$ by $w := nv \in \mathbb{Z}^d$.

We can read the physical Hodge numbers of bulk string $(h^{p,p})$ “compactified” on $\mathbb{C}^d/\Gamma$ from the Ehrhart series for $(N_0, T)$ [3] as

$$\sum_{k \geq 0} l(kT)\, y^k = \frac{1}{(1 - y)^d} \sum_{p=0}^{d-1} h^{p,p}\, y^p, \tag{3.7}$$

where $l(kT)$ is the number of the lattice points in the dilated simplex $kT$, that is, $l(kT) = \operatorname{card}(kT \cap N_0)$. In particular, the number of the crepant divisors $h^{1,1} = l(T) - d$ equals the dimension of the Kähler moduli space of bulk string “compactified” on $\mathbb{C}^d/\Gamma$.

There is a striking difference between the $d = 4$ orbifolds and the $d = 2, 3$ ones: in general, incorporation of the crepant divisors only is not enough to resolve $\mathbb{C}^d/\Gamma$ completely into a smooth variety for $d = 4$ as opposed to the $d = 2, 3$ cases. In [27], we have divided the $d = 4$ models into the following three classes:

(A) the models that admit a crepant resolution,
(B) those that have no crepant divisors, the singularities of which are called terminal, consisting of the models of the form $1/n(1, a, n - a, n - 1)$ where $(n, a) = 1$ [29],
(C) those that have at least one crepant divisor, but do not admit any crepant resolutions.

The complete identification of the (A) class, that is, the classification of the isolated cyclic quotient Gorenstein singularities in four or higher dimensions for which crepant resolutions are possible, is a very interesting but unsolved mathematical problem [37], the physical meaning of which is yet to be elucidated. It is clear that the examples of the (A) class shown in [27],

$$1/(3m + 1)(1, 1, 1, 3m - 2), \quad 1/(4m)(1, 1, 2m - 1, 2m - 1), \quad m \in \mathbb{N}, \tag{3.8}$$

are only the tip of the iceberg. Recently, however, considerable progress in this subject has been made in [8,9]. A remarkable new series in the (A) class, the $m$th member of which is called the 4-dimensional geometric progress singularity-series of ratio $m$ (GPSS(4; $m$)), is given in [9, Conjecture 10.2]:

Conjecture (Dais–Henk). The $1/\{(1 + m)(1 + m^2)\}(1, m, m^2, m^3)$ model admits a crepant resolution for each $m \in \mathbb{N}$.

The same conjecture was also made by the author, who has only checked, up to $m = 10$, that the Delaunay triangulation [14], [44, p. 146] of $T$ by the lattice points in it yields a crepant resolution, which is not unique for $m \geq 3$.
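The two criteria above and the notion of age are easy to evaluate for a given model $1/n(a_1, \dots, a_d)$. The sketch below is an illustration under stated assumptions: it assumes the Calabi–Yau condition (so that ages are integers), it lists the candidate weight vectors $w = (k a_\mu \bmod n)$ for $k = 1, \dots, n-1$ without testing primitivity, and the function names are hypothetical.

```python
from math import gcd

def is_isolated(n, a):
    """Isolated singularity at the origin: gcd(a_mu, n) = 1 for every mu."""
    return all(gcd(x, n) == 1 for x in a)

def is_calabi_yau(n, a):
    """Calabi-Yau condition: a_1 + ... + a_d = 0 mod n."""
    return sum(a) % n == 0

def twisted_sectors(n, a):
    """For k = 1..n-1 return (w, age) with w = (k*a_mu mod n) and age = sum(w)/n.

    Assumes the Calabi-Yau condition, which makes sum(w) divisible by n."""
    sectors = []
    for k in range(1, n):
        w = tuple((k * x) % n for x in a)
        sectors.append((w, sum(w) // n))
    return sectors

def crepant_weights(n, a):
    """Weight vectors of age 1, i.e. lattice points of T other than its vertices."""
    return [w for w, age in twisted_sectors(n, a) if age == 1]
```

For $1/12(1,1,5,5)$ this returns the three weights $(1,1,5,5)$, $(3,3,3,3)$, $(5,5,1,1)$, so $h^{1,1} = 3$, while $1/5(1,2,3,4)$ has no age 1 sector at all. The GPSS(4; 2) model $1/15(1,2,4,8)$ passes both the isolatedness and the Calabi–Yau test.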


3.2. D-brane configuration space. We consider the configuration space of a D1-brane localized at the singular point of $\mathbb{C}^d/\Gamma$. This can be realized as follows: First we consider $n = |\Gamma|$ D1-branes localized at the origin of $\mathbb{C}^d$. We assign the Chan–Paton indices $i \bmod n$ to the D1-branes. Then the world sheet theory on the D1-branes is $U(n)$ gauge theory with $(8, 8)$ supersymmetry. The configuration of the D1-branes is described by the $d$-tuple of the matrices $\{(X_\mu)_{ij}\}$ taking values in the adjoint representation of $U(n)$ [43]; second, taking into account the $\Gamma$-actions on the Lorentz indices $(\mu)$ and the Chan–Paton indices $(i)$, on which $\Gamma$ acts as cyclic permutations, we define the configuration of a D1-brane on the orbifold $\mathbb{C}^d/\Gamma$ to be that of $n$ D1-branes on $\mathbb{C}^d$ invariant under the simultaneous action of $\Gamma$ on the Lorentz and the Chan–Paton indices [11,13]. In the next section, we use a closely related idea in the definition of Hilbert schemes of $n$ points on $\mathbb{C}^d$.

The world sheet supersymmetry is reduced, at this point, to $(4, 4)$, $(2, 2)$ and $(0, 2)$ for $d = 2$, $3$ and for $d = 4$ respectively, with the exception that the supersymmetry of the $d = 4$ (B) model is enhanced to $(0, 4)$ [27]. We can also consider a model with $\sum_\mu a_\mu \not\equiv 0 \bmod n$, where the supersymmetry is completely broken [11].

Let $R_a$ be the one dimensional representation of $\Gamma$ over $\mathbb{C}$ on which the generator $g \in \Gamma$ acts as multiplication by $\omega^a$. Then the D-brane matrices $(X_\mu)$ take values in $(Q \otimes \operatorname{End}(R))^{\Gamma} \cong \operatorname{Hom}_{\Gamma}(R, R \otimes Q)$ [26,38,39], where the two $\Gamma$-modules,

$$R = \bigoplus_{i=1}^{n} R_i, \quad Q = \bigoplus_{\mu=1}^{d} R_{a_\mu}, \tag{3.9}$$

carry the Chan–Paton and the Lorentz $\Gamma$-quantum numbers of the matrices respectively. Note that we have done the discrete Fourier transformation on the Chan–Paton indices, so that the $\Gamma$-action on those is diagonalized. To be explicit, the matrix elements that can be nonzero are

$$x_\mu^{(i)} := (X_\mu)_{i,\, i + a_\mu}, \tag{3.10}$$

and the configuration space of the D1-brane on $\mathbb{C}^d/\Gamma$ is the solution space of the following equations:

$$[X_\mu, X_\nu] = O, \quad \text{F-flatness equation}, \tag{3.11}$$

$$\sum_{\mu=1}^{d} \big[ X_\mu, X_\mu^\dagger \big] - \operatorname{diag}(r_1, \dots, r_n) = O, \quad \text{D-flatness equation}, \tag{3.12}$$

divided by the action of $U(1)^n/U(1)_{\mathrm{diag}}$, where $x_\mu^{(i)}$ has the $i$th $U(1)$ charge $1$ and the $(i + a_\mu)$th $U(1)$ charge $-1$, and the others $0$ as seen from (3.12). To have a solution to (3.12), the Fayet–Iliopoulos (or Kähler) moduli parameters $r := (r_i)$ must satisfy $\sum_{i=1}^{n} r_i = 0$.

The F-flatness equation (3.11) can be solved as follows [11]: We redefine the generator of $\Gamma$ so that $a_d = -1 \bmod n$. Then the matrix elements (3.10) can be represented by $x_d^{(i)}$, $i = 1, \dots, n$ and $x_\mu^{(0)}$, $\mu = 1, \dots, d - 1$ as

$$x_\mu^{(i)} = x_\mu^{(0)}\, \frac{\prod_{j=1}^{i} x_d^{(j)} \cdot \prod_{j=1}^{a_\mu} x_d^{(j)}}{\prod_{j=1}^{i + a_\mu} x_d^{(j)}}. \tag{3.13}$$
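The parameterization (3.13) can be checked numerically against the F-flatness equation (3.11). The following is a sketch under stated assumptions: upper indices of $x_d^{(j)}$ are read mod $n$ with $x_d^{(n)}$ identified with $x_d^{(0)}$, $a_\mu$ is represented in $\{0, \dots, n-1\}$, the last entry of `a` satisfies $a_d \equiv -1 \bmod n$, and exact rational arithmetic is used so that the commutators vanish identically. All function names are hypothetical.

```python
from fractions import Fraction

def brane_matrices(n, a, x_d, x0):
    """Build X_1..X_d with (X_mu)[i][(i + a_mu) % n] = x_mu^(i) as in (3.10), (3.13).

    x_d: dict j -> x_d^(j) for j = 1..n (index n plays the role of index 0);
    x0:  list of x_mu^(0) for mu = 1..d-1."""
    d = len(a)
    assert a[-1] % n == n - 1, "the generator must be chosen so that a_d = -1 mod n"

    def D(k):
        """Product of x_d^(j) for j = 1..k, with the index j read mod n."""
        prod = Fraction(1)
        for j in range(1, k + 1):
            prod *= x_d[(j - 1) % n + 1]
        return prod

    mats = []
    for mu in range(d):
        X = [[Fraction(0)] * n for _ in range(n)]
        for i in range(n):
            if mu == d - 1:
                val = x_d[i] if i != 0 else x_d[n]
            else:
                am = a[mu] % n
                val = x0[mu] * D(i) * D(am) / D(i + am)  # right-hand side of (3.13)
            X[i][(i + a[mu]) % n] = val
        mats.append(X)
    return mats

def matmul(A, B):
    """Plain matrix product of two square matrices given as lists of rows."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def commute(A, B):
    """True when [A, B] = O, i.e. the F-flatness equation holds for this pair."""
    return matmul(A, B) == matmul(B, A)
```

For instance, for the $1/5(1,2,3,4)$ model with arbitrary nonzero rational values of $x_d^{(j)}$ and $x_\mu^{(0)}$, every pair of the resulting matrices commutes.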


We see that the solution space of the F-flatness equation (3.11), which we denote by $\mathcal{A}$, is the $(n - 1 + d)$-dimensional affine variety embedded in $\mathbb{C}^{nd}$ defined by the equations of monomial type (3.13), which shows that $\mathcal{A}$ is a toric variety. The configuration space of the D1-brane, which we denote by $\mathcal{M}(r)$, is also toric because it is obtained as a toric quotient of $\mathcal{A}$ (3.12). In the next subsection, we give a toric description of $\mathcal{M}(r)$, based on the formalism developed in the last section, which elucidates the structure of the Kähler moduli space associated with the toric quotient $\mathcal{A}/\!/(\mathbb{C}^*)^{n-1}(r)$, as well as provides us with an efficient method to compute the configuration space $\mathcal{M}(r)$ for any $r \in \mathbb{Q}^{n-1}$.

3.3. Toric description of the D-brane configuration space. According to (3.13), we propose the following toric description of $\mathcal{A}$ [39]: Let $M^{(0)}$ be a lattice of rank $nd$ generated by $e_\mu^{(i)}$, $0 \leq i \leq n - 1$, $1 \leq \mu \leq d$, and $M^{(1)}$ be the sublattice of rank $(n - 1)(d - 1)$ of $M^{(0)}$ generated by

$$f_\mu^{(i)} := e_\mu^{(i)} - e_\mu^{(0)} + \sum_{j=1}^{i + a_\mu} e_d^{(j)} - \sum_{j=1}^{i} e_d^{(j)} - \sum_{j=1}^{a_\mu} e_d^{(j)}, \quad \mu \neq d,\ i \neq 0, \tag{3.14}$$

with the injection $j : M^{(1)} \to M^{(0)}$. Let $M = M^{(0)}/M^{(1)}$ be the quotient lattice of rank $n - 1 + d$ and $p : M^{(0)} \to M$ be the projection. If we define $C_{\mathrm{basic}}$ to be the basic simplicial cone in $M^{(0)}_{\mathbb{Q}}$, that is,

$$C_{\mathrm{basic}} = \operatorname{cone}\{ e_\mu^{(i)} \mid 1 \leq \mu \leq d,\ 0 \leq i \leq n - 1 \}, \tag{3.15}$$

then its $p_{\mathbb{Q}}$-image $P := p_{\mathbb{Q}}(C_{\mathrm{basic}})$ is a cone in $M_{\mathbb{Q}}$ and we have [39]

$$\mathcal{A} = X(M, P) = \operatorname{Spec}(P \cap M). \tag{3.16}$$

The D-brane configuration space $\mathcal{M}(r)$ can be realized as the toric quotient of $\mathcal{A}$ as follows [39]: Let $M' \subset \mathbb{Z}^n$ be the lattice of rank $n - 1$ generated by $e_i - e_{i+1}$, $1 \leq i \leq n - 1$, where $(e_i)$ are the generators of $\mathbb{Z}^n$, and $\pi' : M^{(0)} \to M'$ the lattice projection

$$\pi'(e_\mu^{(i)}) := e_i - e_{i + a_\mu}, \tag{3.17}$$

which is determined according to the $U(1)^n$ charge assignment of $x_\mu^{(i)}$. It is easily seen that $\pi'$ factors through $p$, that is, there is a projection $\pi : M \to M'$ such that $\pi' = \pi \circ p$. Finally we define a sublattice of rank $d$ of $M$ by $\bar{M} := \operatorname{Ker} \pi \cong \operatorname{Ker} \pi'/\operatorname{Im} j$. Note that $\pi_{\mathbb{Q}}(P) = M'_{\mathbb{Q}}$. Then we have

$$\mathcal{M}(r) := X(M, P)/\!/T'(r) = X\big(\bar{M}, Q(\hat{r})\big), \tag{3.18}$$

where $T'$ is the subtorus of $T$ associated with the sublattice $N' = (M')^* \subset N = M^*$, and we regard the Fayet–Iliopoulos parameter $r$ as a point of $M'_{\mathbb{Q}}$. We can obtain the fan of $\mathcal{M}(r)$ as the normal fan of the $d$-polyhedron $Q(\hat{r})$.

Remark. It may be confusing to have two lattices of rank $d$, both of which are associated with the configuration space $\mathcal{M}(r)$: $\bar{M}$ symbolically represents the lattice for a general quotient toric variety (2.26); on the other hand, $M_0$, which was originally introduced as a useful lattice to describe the orbifold $\mathbb{C}^d/\Gamma$ in (3.3), is also suited for its blow-up $\mathcal{M}(r)$. Our intention is that we use $M_0$ for the concrete descriptions of the toric data of $\mathcal{M}(r)$ below.
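The statements that $\pi'$ lands in a lattice of rank $n - 1$ and factors through $p$ can be checked directly: every charge vector (3.17) has vanishing coordinate sum, and each generator $f_\mu^{(i)}$ of (3.14) is sent to zero. The sketch below assumes indices are read mod $n$, $\mu$ is 0-based, and the last entry of `a` is $\equiv -1 \bmod n$; the helper names are hypothetical.

```python
from fractions import Fraction

def unit(n, i):
    """Standard basis vector e_i of Z^n, with the index read mod n."""
    e = [0] * n
    e[i % n] = 1
    return e

def charge(n, a, mu, i):
    """pi'(e_mu^(i)) = e_i - e_{i + a_mu} as in (3.17); mu is 0-based here."""
    return [x - y for x, y in zip(unit(n, i), unit(n, i + a[mu]))]

def relation_charge(n, a, mu, i):
    """pi'(f_mu^(i)) for the generator f_mu^(i) of (3.14), computed term by term."""
    d = len(a)
    am = a[mu] % n
    terms = [(1, mu, i), (-1, mu, 0)]
    terms += [(1, d - 1, j) for j in range(1, i + am + 1)]
    terms += [(-1, d - 1, j) for j in range(1, i + 1)]
    terms += [(-1, d - 1, j) for j in range(1, am + 1)]
    total = [0] * n
    for sign, nu, j in terms:
        total = [t + sign * c for t, c in zip(total, charge(n, a, nu, j))]
    return total

def rank(rows):
    """Rank over Q by Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((k for k in range(r, len(rows)) if rows[k][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for k in range(len(rows)):
            if k != r and rows[k][c] != 0:
                factor = rows[k][c] / rows[r][c]
                rows[k] = [x - factor * y for x, y in zip(rows[k], rows[r])]
        r += 1
    return r
```

For the $1/5(1,2,3,4)$ model the twenty charge vectors span a space of rank $4 = n - 1$, and all twelve relations $f_\mu^{(i)}$ ($\mu \neq d$, $i \neq 0$) have zero charge.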


Fig. 3.1. Sequences of Lattices (diagram relating $M^{(1)}$, $M^{(0)}$, $M$, $\bar{M}$ and $M'$ through the maps $j$, $p$, $i$, $\pi$ and $\pi'$)

3.4. Some properties of the Kähler moduli space. We define the action of a generator of $\Gamma$ on the Chan–Paton indices by $\varphi(i) := i + 1 \pmod n$, which can be extended to an action of $\Gamma$ as an automorphism on each lattice shown in Fig. 3.1 in such a manner that any lattice homomorphism in Fig. 3.1 becomes $\Gamma$-equivariant, which we denote by $\varphi^{(1)}$, $\varphi^{(0)}$, $\varphi$, $\varphi'$, $\bar{\varphi}$ for $M^{(1)}$, $M^{(0)}$, $M$, $M'$, $\bar{M}$ respectively. For example, the action on the generators of $M^{(0)}$ reads as follows:

$$\varphi^{(0)}(e_\mu^{(i)}) = e_\mu^{(i+1)}, \tag{3.19}$$

while the action on those of $M^{(1)}$ is

$$\varphi^{(1)}(f_\mu^{(i)}) = f_\mu^{(i+1)} - f_\mu^{(1)}. \tag{3.20}$$

The Kähler moduli space of $1/n(a_1, \dots, a_d)$, which we denote by $\mathcal{K}_n(a_1, \dots, a_d)$, is the complete fan in $M'_{\mathbb{Q}}$ obtained as the subdivision of $M'_{\mathbb{Q}}$ induced by the $\pi_{\mathbb{Q}}$-images of the faces of the cone $P$ in $M_{\mathbb{Q}}$. Propositions 3.1–3 stated below are immediate consequences of our definitions:

Proposition 3.1. If $\{a_1, \dots, a_d\} = \{b_1, \dots, b_e\}$ as sets, then the two models $1/n(a_1, \dots, a_d)$ and $1/n(b_1, \dots, b_e)$ have the same Kähler moduli space, that is,

$$\mathcal{K}_n(a_1, \dots, a_d) = \mathcal{K}_n(b_1, \dots, b_e), \tag{3.21}$$

where the two models above need not necessarily satisfy the Calabi–Yau condition.

We say that the $d$-fold model $1/n(a_1, \dots, a_d)$ can be reduced to $e$ dimensions, when (3.21) occurs with $d > e$.


Example. We have the reductions of the Calabi–Yau four-fold models to two dimensions according to the following identifications:

$$\mathcal{K}_m(1, 1, m - 1, m - 1) = \mathcal{K}_m(1, m - 1), \tag{3.22}$$

$$\mathcal{K}_{3m+1}(1, 1, 1, 3m - 2) = \mathcal{K}_{3m+1}(1, 3m - 2), \tag{3.23}$$

$$\mathcal{K}_{4m}(1, 1, 2m - 1, 2m - 1) = \mathcal{K}_{4m}(1, 2m - 1). \tag{3.24}$$

A toric two-fold has the virtue that the listing of the 1-cones alone determines its fan. The four-fold models entering in (3.22–3.24) inherit this property from the corresponding two-fold models, which considerably simplifies the analysis of the Kähler moduli space of these four-fold models.

Let us take the $1/m(1, m - 1)$ model. The maximal chambers of the Kähler moduli space $\mathcal{K}_m(1, m - 1)$ can be identified with the Weyl chambers of $SU(m)$ [26], in which the D-brane configuration space $\mathcal{M}(r)$ is in the minimal blow-up phase. Correspondingly, the phase of the four-fold $1/m(1, 1, m - 1, m - 1)$ in the Weyl chambers turns out to be the non-Calabi–Yau smooth phase with Euler number $4(m - 1)$, the fan of which is given by the following collection of the maximal cones:

$$
\begin{gathered}
\langle 1, 3, 4, 5 \rangle,\ \langle 2, 3, 4, 5 \rangle,\ \langle 1, 2, 3, m + 3 \rangle,\ \langle 1, 2, 4, m + 3 \rangle, \\
\langle 1, 3, l, l + 1 \rangle,\ \langle 1, 4, l, l + 1 \rangle,\ \langle 2, 3, l, l + 1 \rangle,\ \langle 2, 4, l, l + 1 \rangle, \quad 5 \leq l \leq m + 2,
\end{gathered}
\tag{3.25}
$$

where the weight vectors above are given by

$$
\begin{gathered}
w_1 = (m, 0, 0, 0),\ w_2 = (0, m, 0, 0),\ w_3 = (0, 0, m, 0),\ w_4 = (0, 0, 0, m), \\
w_l = (l - 4,\ l - 4,\ m + 4 - l,\ m + 4 - l), \quad 5 \leq l \leq m + 3.
\end{gathered}
\tag{3.26}
$$

Here our convention for the expression of the maximal cone [21] is:

$$\langle s_1, \dots, s_k \rangle := \operatorname{cone}\{ w_{s_1}, \dots, w_{s_k} \} \subset (N_0)_{\mathbb{Q}}. \tag{3.27}$$

Brute force calculations for $m = 2, 3$ cases can be found in [27, Sect. 6.2].

As for the $1/(4m)(1, 1, 2m - 1, 2m - 1)$ model and $1/(3m + 1)(1, 1, 1, 3m - 2)$ model, the D-brane configuration space $\mathcal{M}(r)$ of the four-fold model is in the smooth Calabi–Yau phase if and only if that of the corresponding two-fold model is in the minimal blow-up phase; the former is in the non-Calabi–Yau smooth phases if and only if the latter is in the non-minimal blow-up phases.

In the same way, a two-parameter model $1/n(1, \dots, 1, a, b)$ treated in [8] is one which can be reduced to three dimensions.

Proposition 3.2. The polyhedron $P$ admits an action of $\Gamma$, that is, $\varphi_{\mathbb{Q}}(P) = P$.

Corollary 3.2.1. The set of the facets of $P$, which we previously denoted by $L(P)^{(1)} = \{ F_a \mid a \in \Lambda \}$, is decomposed into $\Gamma$-orbits. Within each $\Gamma$-orbit, the facets share a common weight vector for the quotient toric variety.

Each model has the following $d$ $\Gamma$-singlets:

$$F_\mu := \operatorname{cone}\{ p_\nu^{(i)} \mid \nu \neq \mu \}, \quad 1 \leq \mu \leq d, \tag{3.28}$$

where we set $p(e_\mu^{(i)}) = p_\mu^{(i)} \in M$ for simplicity. The weight vector associated with $F_\mu$ is $w_\mu = n e_\mu$ for $1 \leq \mu \leq d$.


The remaining $\Gamma$-orbits are denoted by

$$\big\{ F_k^{(j)} := \varphi_{\mathbb{Q}}^{\,j}\big(F_k^{(0)}\big) \;\big|\; 0 \leq j \leq m_k - 1 \big\}, \quad k \geq d + 1, \tag{3.29}$$

where $m_k$ is the length of the $k$th $\Gamma$-orbit, and we denote the weight vector associated with the $k$th orbit by $w_k$, which we call the $k$th exceptional divisor.

Example. For the $1/5(1, 2, 3, 4)$ model, the exceptional divisors are

$$
\begin{aligned}
w_5 &= (1, 2, 3, 4), & w_6 &= (2, 4, 1, 3), \\
w_7 &= (3, 1, 4, 2), & w_8 &= (4, 3, 2, 1), \\
w_9 &= (4, 3, 2, 6), & w_{10} &= (2, 4, 6, 3), \\
w_{11} &= (3, 6, 4, 2), & w_{12} &= (6, 2, 3, 4).
\end{aligned}
\tag{3.30}
$$

To describe the $k$th $\Gamma$-orbit, it suffices to show its 0th member $F_k^{(0)}$ as in (3.29). Then we have for the age 2 exceptional divisors,

$$F_5^{(0)} = \operatorname{cone}\{ p_1^{(1)}, p_1^{(2)}, p_1^{(3)}, p_1^{(4)}, p_2^{(1)}, p_2^{(2)}, p_2^{(3)}, p_3^{(1)}, p_3^{(2)}, p_4^{(1)} \}, \tag{3.31}$$

$$F_6^{(0)} = \operatorname{cone}\{ p_1^{(1)}, p_1^{(3)}, p_1^{(4)}, p_2^{(3)}, p_3^{(1)}, p_3^{(2)}, p_3^{(3)}, p_3^{(4)}, p_4^{(1)}, p_4^{(3)} \}, \tag{3.32}$$

$$F_7^{(0)} = \operatorname{cone}\{ p_1^{(0)}, p_1^{(2)}, p_2^{(0)}, p_2^{(1)}, p_2^{(2)}, p_2^{(4)}, p_3^{(0)}, p_4^{(0)}, p_4^{(2)}, p_4^{(4)} \}, \tag{3.33}$$

$$F_8^{(0)} = \operatorname{cone}\{ p_1^{(0)}, p_2^{(0)}, p_2^{(4)}, p_3^{(0)}, p_3^{(3)}, p_3^{(4)}, p_4^{(0)}, p_4^{(2)}, p_4^{(3)}, p_4^{(4)} \}, \tag{3.34}$$

and for the age 3 exceptional divisors

$$F_9^{(0)} = \operatorname{cone}\{ p_1^{(2)}, p_1^{(4)}, p_2^{(1)}, p_2^{(4)}, p_3^{(0)}, p_3^{(2)}, p_3^{(4)}, p_4^{(4)} \}, \tag{3.35}$$

$$F_{10}^{(0)} = \operatorname{cone}\{ p_1^{(2)}, p_1^{(3)}, p_1^{(4)}, p_2^{(2)}, p_2^{(3)}, p_3^{(2)}, p_4^{(1)}, p_4^{(2)} \}, \tag{3.36}$$

$$F_{11}^{(0)} = \operatorname{cone}\{ p_1^{(0)}, p_1^{(1)}, p_2^{(0)}, p_3^{(0)}, p_3^{(4)}, p_4^{(0)}, p_4^{(3)}, p_4^{(4)} \}, \tag{3.37}$$

$$F_{12}^{(0)} = \operatorname{cone}\{ p_1^{(0)}, p_2^{(0)}, p_2^{(2)}, p_2^{(4)}, p_3^{(0)}, p_3^{(3)}, p_4^{(0)}, p_4^{(2)} \}. \tag{3.38}$$

The action of $\Gamma$ on a facet is as follows:

$$\varphi_{\mathbb{Q}}\big(F_5^{(0)}\big) = \operatorname{cone}\{ p_1^{(2)}, p_1^{(3)}, p_1^{(4)}, p_1^{(0)}, p_2^{(2)}, p_2^{(3)}, p_2^{(4)}, p_3^{(2)}, p_3^{(3)}, p_4^{(2)} \}. \tag{3.39}$$

We see that the length of each $\Gamma$-orbit above is 5.

Example. For the case of the $(1/12)(1, 1, 5, 5)$ model, the exceptional divisors and the lengths of the $\Gamma$-orbits are as follows:

$$
\begin{aligned}
w_5 &= (1, 1, 5, 5) : && 12, \\
w_6 &= (3, 3, 3, 3) : && 12 + 12 + 12 + 4, \\
w_7 &= (5, 5, 1, 1) : && 12, \\
w_8 &= (4, 4, 8, 8) : && 6 + 6 + 3, \\
w_9 &= (8, 8, 4, 4) : && 6 + 6 + 3.
\end{aligned}
\tag{3.40}
$$


The representatives of $\Gamma$-orbits for the crepant divisors are, with $\mu = 1, 2$ and $\nu = 3, 4$ throughout,

$$
\begin{aligned}
F_5^{(0)} &= \operatorname{cone}\{ p_\mu^{(i)}\ (i = 1,2,3,4,5,6,7,8,9,10,11),\ \ p_\nu^{(i)}\ (i = 1,2,4,6,7,9,11) \}, \\
{}^{1}F_6^{(0)} &= \operatorname{cone}\{ p_\mu^{(i)}\ (i = 0,1,2,5,6,8,9,10,11),\ \ p_\nu^{(i)}\ (i = 1,2,4,5,6,8,9,10,11) \}, \\
{}^{2}F_6^{(0)} &= \operatorname{cone}\{ p_\mu^{(i)}\ (i = 3,4,5,6,7,8,9,10,11),\ \ p_\nu^{(i)}\ (i = 1,2,3,4,6,7,8,9,11) \}, \\
{}^{3}F_6^{(0)} &= \operatorname{cone}\{ p_\mu^{(i)}\ (i = 0,2,3,6,7,8,9,10,11),\ \ p_\nu^{(i)}\ (i = 0,2,3,5,6,7,8,10,11) \}, \\
{}^{4}F_6^{(0)} &= \operatorname{cone}\{ p_\mu^{(i)}\ (i = 0,1,2,4,5,6,8,9,10),\ \ p_\nu^{(i)}\ (i = 0,1,2,4,5,6,8,9,10) \}, \\
F_7^{(0)} &= \operatorname{cone}\{ p_\mu^{(i)}\ (i = 5,6,7,8,9,10,11),\ \ p_\nu^{(i)}\ (i = 1,2,3,4,5,6,7,8,9,10,11) \},
\end{aligned}
\tag{3.41}
$$

while those for the age 2 divisors are

$$
\begin{aligned}
{}^{1}F_8^{(0)} &= \operatorname{cone}\{ p_\mu^{(i)}\ (i = 0,1,2,4,6,7,8,10),\ \ p_\nu^{(i)}\ (i = 0,4,6,10) \}, \\
{}^{2}F_8^{(0)} &= \operatorname{cone}\{ p_\mu^{(i)}\ (i = 0,3,4,5,6,9,10,11),\ \ p_\nu^{(i)}\ (i = 2,3,8,9) \}, \\
{}^{3}F_8^{(0)} &= \operatorname{cone}\{ p_\mu^{(i)}\ (i = 0,1,3,4,6,7,9,10),\ \ p_\nu^{(i)}\ (i = 0,3,6,9) \}, \\
{}^{1}F_9^{(0)} &= \operatorname{cone}\{ p_\mu^{(i)}\ (i = 3,4,9,10),\ \ p_\nu^{(i)}\ (i = 0,1,2,3,6,7,8,9) \}, \\
{}^{2}F_9^{(0)} &= \operatorname{cone}\{ p_\mu^{(i)}\ (i = 0,2,6,8),\ \ p_\nu^{(i)}\ (i = 0,2,4,5,6,8,10,11) \}, \\
{}^{3}F_9^{(0)} &= \operatorname{cone}\{ p_\mu^{(i)}\ (i = 2,5,8,11),\ \ p_\nu^{(i)}\ (i = 1,2,4,5,7,8,10,11) \}.
\end{aligned}
\tag{3.42}
$$

Note that the reducibility of this model to two dimensions (3.24) is reflected in the structure of the facets of $P$.

Proposition 3.3. $\Gamma$ acts on $M'_{\mathbb{Q}}$ as a symmetry of the toric quotient:

$$\mathcal{M}(r) \cong \mathcal{M}(\varphi'_{\mathbb{Q}}(r)), \quad r \in M'_{\mathbb{Q}}. \tag{3.43}$$
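The orbit lengths quoted in (3.40) can be cross-checked against the facet representatives of (3.41) and (3.42), since by (3.19) the generator of $\Gamma$ shifts every upper index by one. The sketch below assumes the index sets as transcribed here from (3.41)–(3.42) (with `mu` for $\mu = 1, 2$ and `nu` for $\nu = 3, 4$) and takes the orbit length to be the smallest positive shift mod 12 that preserves both sets; the dictionary keys are hypothetical labels.

```python
N = 12  # order of the group for the (1/12)(1,1,5,5) model

# label -> (superscripts i of p_mu^(i) for mu = 1,2, superscripts of p_nu^(i) for nu = 3,4)
FACETS = {
    "F5":   (set(range(1, 12)), {1, 2, 4, 6, 7, 9, 11}),
    "1F6":  ({0, 1, 2, 5, 6, 8, 9, 10, 11}, {1, 2, 4, 5, 6, 8, 9, 10, 11}),
    "2F6":  (set(range(3, 12)), {1, 2, 3, 4, 6, 7, 8, 9, 11}),
    "3F6":  ({0, 2, 3, 6, 7, 8, 9, 10, 11}, {0, 2, 3, 5, 6, 7, 8, 10, 11}),
    "4F6":  ({0, 1, 2, 4, 5, 6, 8, 9, 10}, {0, 1, 2, 4, 5, 6, 8, 9, 10}),
    "F7":   (set(range(5, 12)), set(range(1, 12))),
    "1F8":  ({0, 1, 2, 4, 6, 7, 8, 10}, {0, 4, 6, 10}),
    "2F8":  ({0, 3, 4, 5, 6, 9, 10, 11}, {2, 3, 8, 9}),
    "3F8":  ({0, 1, 3, 4, 6, 7, 9, 10}, {0, 3, 6, 9}),
    "1F9":  ({3, 4, 9, 10}, {0, 1, 2, 3, 6, 7, 8, 9}),
    "2F9":  ({0, 2, 6, 8}, {0, 2, 4, 5, 6, 8, 10, 11}),
    "3F9":  ({2, 5, 8, 11}, {1, 2, 4, 5, 7, 8, 10, 11}),
}

def shifted(indices, k):
    """Image of an index set under the k-th power of the generator: i -> i + k mod N."""
    return {(i + k) % N for i in indices}

def orbit_length(mu_set, nu_set):
    """Smallest k >= 1 such that shifting by k preserves both index sets."""
    for k in range(1, N + 1):
        if shifted(mu_set, k) == mu_set and shifted(nu_set, k) == nu_set:
            return k

ORBIT_LENGTHS = {name: orbit_length(mu, nu) for name, (mu, nu) in FACETS.items()}
```

With these index sets the lengths come out as $12$ for $F_5$ and $F_7$, $12, 12, 12, 4$ for the four representatives over $w_6$, and $6, 6, 3$ for those over $w_8$ and over $w_9$.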


3.5. Phases of Calabi–Yau four-fold models. Here we describe some typical phases of Calabi–Yau four-fold models, leaving the cases of d = 3 as an exercise for the reader. In this subsection, we identify $M' = \{\, r \in \mathbb{Z}^n \mid \sum_{i=0}^{n-1} r_i = 0 \,\}$ with $\mathbb{Z}^{n-1}$ by discarding its zeroth component $r_0$.

3.5.1. (1/12)(1,1,5,5) model. First of all, we need to choose a representative of the facets for each weight vector $w_k$, $k = 5, \dots, 9$, which we denote by $F_k$, so that they are compatible, that is,

$\bigcap_{k=5}^{9} \pi_{\mathbb{Q}}(F_k)$

(3.44)

is a 4 dimensional cone in $M'_{\mathbb{Q}}$. Our choice is as follows:

$F_5 = F_5^{(3)}, \quad F_6 = {}^{4}F_6^{(0)}, \quad F_7 = F_7^{(1)}, \quad F_8 = {}^{1}F_8^{(0)}, \quad F_9 = {}^{2}F_9^{(0)}.$

(3.45)

Consider the following candidates of the phases realized in maximal cones of the Kähler moduli space $K_{12}(1, 1, 5, 5)$:

Phase I ($\Sigma_{\mathrm{I}}$)
⟨1, 2, 3, 7⟩, ⟨1, 2, 4, 7⟩, ⟨1, 3, 4, 5⟩, ⟨2, 3, 4, 5⟩, ⟨2, 4, 5, 6⟩, ⟨2, 4, 6, 7⟩,
⟨1, 4, 6, 7⟩, ⟨2, 3, 6, 7⟩, ⟨1, 4, 5, 6⟩, ⟨1, 3, 6, 7⟩, ⟨2, 3, 5, 6⟩, ⟨1, 3, 5, 6⟩.

(3.46)

Phase II ($\Sigma_{\mathrm{II}}$)
⟨1, 2, 3, 7⟩, ⟨2, 4, 6, 8⟩, ⟨1, 4, 6, 8⟩, ⟨2, 4, 6, 7⟩,
⟨1, 2, 4, 7⟩, ⟨1, 4, 6, 7⟩, ⟨1, 3, 5, 8⟩, ⟨2, 3, 5, 8⟩,
⟨1, 3, 4, 5⟩, ⟨2, 4, 5, 8⟩, ⟨2, 3, 6, 8⟩, ⟨2, 3, 6, 7⟩,
⟨2, 3, 4, 5⟩, ⟨1, 4, 5, 8⟩, ⟨1, 3, 6, 8⟩, ⟨1, 3, 6, 7⟩.

(3.47)

Phase III ($\Sigma_{\mathrm{III}}$)
⟨1, 2, 3, 7⟩, ⟨1, 4, 6, 9⟩, ⟨1, 4, 7, 9⟩, ⟨2, 3, 7, 9⟩,
⟨1, 2, 4, 7⟩, ⟨2, 4, 5, 6⟩, ⟨1, 3, 6, 9⟩, ⟨2, 3, 6, 9⟩,
⟨1, 3, 4, 5⟩, ⟨2, 4, 6, 9⟩, ⟨1, 3, 7, 9⟩, ⟨2, 3, 5, 6⟩,
⟨2, 3, 4, 5⟩, ⟨2, 4, 7, 9⟩, ⟨1, 4, 5, 6⟩, ⟨1, 3, 5, 6⟩.

(3.48)

Phase IV ($\Sigma_{\mathrm{IV}}$)
⟨1, 2, 3, 7⟩, ⟨2, 4, 6, 8⟩, ⟨2, 4, 7, 9⟩, ⟨2, 3, 7, 9⟩, ⟨2, 3, 6, 8⟩,
⟨1, 2, 4, 7⟩, ⟨2, 4, 5, 8⟩, ⟨1, 4, 6, 8⟩, ⟨1, 3, 7, 9⟩, ⟨2, 3, 5, 8⟩,
⟨1, 3, 4, 5⟩, ⟨2, 4, 6, 9⟩, ⟨1, 4, 7, 9⟩, ⟨1, 4, 5, 8⟩, ⟨1, 3, 5, 8⟩,
⟨2, 3, 4, 5⟩, ⟨1, 4, 6, 9⟩, ⟨1, 3, 6, 9⟩, ⟨2, 3, 6, 9⟩, ⟨1, 3, 6, 8⟩.

(3.49)


[Figure: diagram with arrows relating Phases IV, II, III and I]

Fig. 3.2. Blow-Down Diagram

Here $\Sigma_{\mathrm{I–IV}}$ denotes the corresponding fan. Phase I is the smooth Calabi–Yau phase; phases II–IV are smooth non-Calabi–Yau phases, which are blow-ups of phase I. Each of the fans $\Sigma_{\mathrm{I–IV}}$ defines a coherent subset $S_{\mathrm{I–IV}}$ of $L(P)^{(4)}$, the set of the codimension four faces of P. Then, according to (2.46), we can construct the maximal cones $K_{\mathrm{I–IV}} := K(S_{\mathrm{I–IV}})$ of the Kähler moduli space $K_{12}(1, 1, 5, 5)$. The result is as follows:

$K_{\mathrm{I}} = \operatorname{cone}\{\, -e_3+e_4-e_7+e_8-e_{11},\ -e_4+e_9,\ -e_8+e_9,\ -e_7,\ -e_3+e_8,\ -e_3+e_4,\ -e_1+e_2,\ -e_4+e_5,\ -e_5+e_{10},\ -e_{11},\ -e_7+e_9-e_{11},\ -e_3+e_6,\ -e_4+e_6-e_8+e_9,\ -e_7+e_{10},\ -e_1+e_2-e_4+e_6,\ -e_5+e_6-e_8+e_{10},\ -e_1+e_2-e_5+e_6-e_9+e_{10},\ -e_3+e_5-e_7+e_8,\ -e_1+e_6,\ -e_4+e_5-e_7+e_9 \,\}$,

(3.50)

$K_{\mathrm{II}} = \operatorname{cone}\{\, -e_7,\ -e_3+e_4-e_5+e_6-e_9+e_{10}-e_{11},\ -e_3+e_6,\ -e_5+e_6-e_8+e_{10},\ -e_1+e_2-e_5+e_6-e_9+e_{10},\ -e_3+e_8,\ -e_3+e_4,\ -e_5+e_{10},\ -e_9+e_{10},\ -e_1+e_6,\ -e_{11},\ -e_3+e_4-e_7+e_8-e_{11} \,\}$,

(3.51)

$K_{\mathrm{III}} = \operatorname{cone}\{\, -e_{11},\ -e_1+e_2,\ -e_1+e_2-e_3+e_6-e_7+e_8-e_9,\ -e_1+e_2-e_4+e_6,\ -e_1+e_2-e_5+e_6-e_9+e_{10},\ -e_4+e_5,\ -e_3+e_8,\ -e_3+e_4,\ -e_1+e_6,\ -e_3+e_6,\ -e_7,\ -e_3+e_5-e_7+e_8,\ -e_3+e_4-e_7+e_8-e_{11} \,\}$,

(3.52)

$K_{\mathrm{IV}} = \operatorname{cone}\{\, -e_{11},\ -e_3+e_8,\ -e_3+e_4-e_5+e_6-e_9+e_{10}-e_{11},\ -e_7,\ -e_3+e_4,\ -e_1+e_2-e_3+e_6-e_7+e_8-e_9,\ -e_1+e_2-e_5+e_6-e_9+e_{10},\ -e_1+e_6,\ -e_3+e_6,\ -e_9+e_{10},\ -e_3+e_4-e_7+e_8-e_{11} \,\}$.

(3.53)

3.5.2. 1/5(1,2,3,4) model. Let us first choose the following representatives for the $\Gamma$-orbits:

$F_5 = F_5^{(4)}, \quad F_6 = F_6^{(2)}, \quad F_7 = F_7^{(0)}, \quad F_8 = F_8^{(0)},$
$F_9 = F_9^{(1)}, \quad F_{10} = F_{10}^{(3)}, \quad F_{11} = F_{11}^{(0)}, \quad F_{12} = F_{12}^{(0)},$

(3.54)


which satisfy the compatibility condition:

$\bigcap_{k=5}^{12} \pi_{\mathbb{Q}}(F_k) = \operatorname{cone}\{\, -e_1, -e_2, -e_3, -e_4 \,\}.$

(3.55)

If we define Phase I by

Phase I ($\Sigma_{\mathrm{I}}$)
⟨2, 3, 10, 11⟩, ⟨1, 4, 7, 12⟩, ⟨3, 5, 7, 10⟩, ⟨2, 3, 8, 11⟩, ⟨1, 7, 8, 12⟩, ⟨2, 6, 8, 11⟩,
⟨2, 3, 4, 5⟩, ⟨1, 3, 7, 8⟩, ⟨2, 4, 5, 6⟩, ⟨1, 2, 6, 8⟩, ⟨1, 4, 6, 9⟩, ⟨1, 2, 4, 6⟩,
⟨1, 3, 4, 7⟩, ⟨1, 2, 3, 8⟩, ⟨1, 4, 9, 12⟩, ⟨3, 4, 5, 7⟩, ⟨2, 3, 5, 10⟩, ⟨4, 5, 6, 9⟩,
⟨3, 7, 8, 10, 11⟩, ⟨2, 5, 6, 10, 11⟩, ⟨4, 5, 7, 9, 12⟩, ⟨1, 6, 8, 9, 12⟩,
⟨5, 6, 7, 8, 9, 10, 11, 12⟩,

(3.56)

the associated cone $K_{\mathrm{I}} := K(S_{\mathrm{I}})$ coincides with (3.55). Therefore Phase I is the only possible phase under the choice of the representatives for the exceptional divisors (3.54). The second choice of the compatible representatives for the exceptional divisors is:

$F_5 = F_5^{(4)}, \quad F_6 = F_6^{(2)}, \quad F_7 = F_7^{(0)}, \quad F_8 = F_8^{(1)},$
$F_9 = F_9^{(1)}, \quad F_{10} = F_{10}^{(3)}, \quad F_{11} = F_{11}^{(1)}.$

(3.57)

Note the absence of the facet associated with $w_{12}$ in (3.57). Consider the following two phases:

Phase II ($\Sigma_{\mathrm{II}}$)
⟨5, 8, 10, 11⟩, ⟨2, 3, 4, 5⟩, ⟨3, 4, 5, 7⟩, ⟨4, 5, 6, 8⟩, ⟨1, 2, 3, 8⟩, ⟨2, 3, 8, 11⟩,
⟨1, 4, 5, 8⟩, ⟨3, 8, 10, 11⟩, ⟨2, 5, 10, 11⟩, ⟨1, 4, 6, 8⟩, ⟨1, 3, 4, 7⟩, ⟨2, 3, 10, 11⟩,
⟨1, 2, 6, 8⟩, ⟨2, 3, 5, 10⟩, ⟨2, 4, 5, 6⟩, ⟨3, 5, 7, 10⟩, ⟨1, 4, 5, 7⟩, ⟨1, 2, 4, 6⟩,
⟨1, 3, 7, 8, 10⟩, ⟨1, 5, 7, 8, 10⟩, ⟨2, 5, 6, 8, 11⟩;

(3.58)

Phase III ($\Sigma_{\mathrm{III}}$)
⟨3, 8, 10, 11⟩, ⟨3, 5, 7, 10⟩, ⟨1, 2, 3, 8⟩, ⟨2, 3, 4, 5⟩, ⟨2, 3, 8, 11⟩, ⟨1, 2, 6, 8⟩,
⟨5, 8, 10, 11⟩, ⟨2, 3, 10, 11⟩, ⟨2, 5, 10, 11⟩, ⟨1, 2, 4, 6⟩, ⟨3, 4, 5, 7⟩, ⟨4, 5, 6, 9⟩,
⟨1, 4, 5, 9⟩, ⟨2, 3, 5, 10⟩, ⟨2, 4, 5, 6⟩, ⟨1, 3, 4, 7⟩, ⟨1, 4, 6, 9⟩, ⟨1, 4, 5, 7⟩,
⟨2, 5, 6, 8, 11⟩, ⟨1, 3, 7, 8, 10⟩, ⟨1, 5, 7, 8, 10⟩, ⟨1, 5, 6, 8, 9⟩.

(3.59)

Using (2.46), we see that these two phases can be realized in the maximal cones $K_{\mathrm{II}}$ and $K_{\mathrm{III}}$ defined by

$K_{\mathrm{II}} = \operatorname{cone}\{\, -e_3,\ e_1 - e_3 - e_4,\ e_1 - e_2 - e_3 - e_4,\ e_1 - e_2 - e_3 \,\},$ (3.60)
$K_{\mathrm{III}} = \operatorname{cone}\{\, -e_3,\ e_1 - e_3 - e_4,\ e_1 - e_2 - e_3 - e_4,\ -e_4 \,\}.$ (3.61)
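Whether a phase has a simplicial fan can be read off directly from the lists of maximal cones. The following is a minimal sketch for Phase II (3.58), under the assumption that the labels inside each bracket are exactly the ray generators of that four-dimensional cone, so that a cone is simplicial precisely when it has four labels; the names `PHASE_II` and `non_simplicial` are illustrative only.

```python
# Maximal cones of Phase II (3.58), each given by the labels of its generators.
PHASE_II = [
    (5, 8, 10, 11), (2, 3, 4, 5), (3, 4, 5, 7), (4, 5, 6, 8), (1, 2, 3, 8),
    (2, 3, 8, 11), (1, 4, 5, 8), (3, 8, 10, 11), (2, 5, 10, 11), (1, 4, 6, 8),
    (1, 3, 4, 7), (2, 3, 10, 11), (1, 2, 6, 8), (2, 3, 5, 10), (2, 4, 5, 6),
    (3, 5, 7, 10), (1, 4, 5, 7), (1, 2, 4, 6),
    (1, 3, 7, 8, 10), (1, 5, 7, 8, 10), (2, 5, 6, 8, 11),
]


def non_simplicial(cones, dim=4):
    """Return the cones with more generators than the dimension.

    Assumption: every listed label is a ray of the cone, so a full-dimensional
    cone is simplicial exactly when it has `dim` generators.
    """
    return [cone for cone in cones if len(cone) > dim]


offending = non_simplicial(PHASE_II)
print(len(PHASE_II), "maximal cones,", len(offending), "of them non-simplicial")
for cone in offending:
    print(cone)
```

The same function can be applied to the other cone lists of this section by replacing the input list.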


So far we have seen only phases whose fan is not simplicial, which means that the singularity of M(r) is worse than an orbifold singularity in these phases. In fact, the combinatorics of the facets of P admits neither the smooth phases incorporating two age 3 divisors, for example,

⟨3, 7, 8, 10⟩, ⟨1, 4, 7, 8⟩, ⟨5, 8, 10, 11⟩, ⟨1, 4, 6, 8⟩, ⟨3, 5, 7, 10⟩,
⟨5, 6, 8, 11⟩, ⟨2, 4, 5, 6⟩, ⟨5, 7, 8, 10⟩, ⟨2, 6, 8, 11⟩, ⟨1, 3, 7, 8⟩,
⟨3, 8, 10, 11⟩, ⟨1, 2, 4, 6⟩, ⟨4, 5, 6, 8⟩, ⟨1, 3, 4, 7⟩, ⟨1, 2, 6, 8⟩,
⟨2, 5, 10, 11⟩, ⟨3, 4, 5, 7⟩, ⟨2, 5, 6, 11⟩, ⟨2, 3, 10, 11⟩, ⟨1, 2, 3, 8⟩,
⟨2, 3, 4, 5⟩, ⟨4, 5, 7, 8⟩, ⟨2, 3, 8, 11⟩, ⟨2, 3, 5, 10⟩,

(3.62)

nor the phase with the simplicial fan incorporating all the exceptional divisors:

⟨1, 4, 7, 12⟩, ⟨2, 5, 10, 11⟩, ⟨1, 3, 4, 5⟩, ⟨4, 5, 7, 9⟩, ⟨2, 3, 10, 11⟩, ⟨5, 7, 9, 10⟩, ⟨1, 2, 6, 8⟩,
⟨1, 4, 9, 12⟩, ⟨1, 7, 8, 12⟩, ⟨1, 3, 4, 7⟩, ⟨2, 6, 8, 11⟩, ⟨3, 8, 10, 11⟩, ⟨2, 5, 6, 11⟩, ⟨1, 2, 3, 8⟩,
⟨8, 9, 10, 11⟩, ⟨7, 8, 9, 12⟩, ⟨1, 2, 4, 6⟩, ⟨1, 4, 6, 9⟩, ⟨5, 9, 10, 11⟩, ⟨2, 3, 5, 10⟩, ⟨1, 3, 7, 8⟩,
⟨5, 6, 9, 11⟩, ⟨1, 8, 9, 12⟩, ⟨3, 4, 5, 7⟩, ⟨6, 8, 9, 11⟩, ⟨3, 7, 8, 10⟩, ⟨3, 5, 7, 10⟩,
⟨4, 7, 9, 12⟩, ⟨2, 3, 4, 5⟩, ⟨2, 3, 8, 11⟩, ⟨4, 5, 6, 9⟩, ⟨7, 8, 9, 10⟩, ⟨1, 6, 8, 9⟩,

(3.63)

the fans of which are found by the Delaunay triangulations [14], [44, p. 146] of $\Sigma_T$.

4. $\Gamma$-Hilbert Scheme

4.1. Symplectic quotient construction. Let X be a quasi-projective variety with a fixed embedding in a projective space. The Hilbert scheme $\mathrm{Hilb}^P(X)$ is the moduli space that parametrizes all the closed subschemes of X with a fixed Hilbert polynomial P(z), where $P(l) \in \mathbb{Z}$ for all $l \in \mathbb{Z}$. See [19], [25, Chapter I] for more detailed information. Let us take the following pair: $X = \mathbb{C}^d$, P(z) = n (constant), and consider the moduli space of zero-dimensional closed subschemes of length n in $\mathbb{C}^d$, which we denote by $\mathrm{Hilb}^n(\mathbb{C}^d)$ [22,23,32,37]. A point $Z \in \mathrm{Hilb}^n(\mathbb{C}^d)$ corresponds to an ideal $I \subset A$ of colength n, where $A := \mathbb{C}[x_1, \dots, x_d]$ is the coordinate ring of $\mathbb{C}^d$. Therefore we have

$\mathrm{Hilb}^n(\mathbb{C}^d) = \{\, \text{ideal } I \subset A \mid \dim_{\mathbb{C}} A/I = n \,\}.$

(4.1)

For d, n ≥ 3, $\mathrm{Hilb}^n(\mathbb{C}^d)$ is a singular variety. The action of $\Gamma$ on $\mathbb{C}^d$ is naturally extended to that on $\mathrm{Hilb}^n(\mathbb{C}^d)$. Let $(\mathrm{Hilb}^n(\mathbb{C}^d))^\Gamma$ be the subset of $\mathrm{Hilb}^n(\mathbb{C}^d)$ which is fixed by the action of $\Gamma$. Each point of $(\mathrm{Hilb}^n(\mathbb{C}^d))^\Gamma$ corresponds to a $\Gamma$-invariant ideal I of A. Consequently, for $I \in (\mathrm{Hilb}^n(\mathbb{C}^d))^\Gamma$, $A/I = H^0(Z, \mathcal{O}_Z)$ becomes a $\Gamma$-module of rank n. For example, the $\Gamma$-orbit of a point $p \in \mathbb{C}^d \setminus 0$ is a point of $(\mathrm{Hilb}^n(\mathbb{C}^d))^\Gamma$, and it constitutes the regular representation R (3.9) of $\Gamma$. Now we give a definition of the $\Gamma$-Hilbert scheme following [22]:

$\mathrm{Hilb}^\Gamma(\mathbb{C}^d) := \{\, I \in (\mathrm{Hilb}^n(\mathbb{C}^d))^\Gamma \mid A/I \cong R \,\},$

(4.2)


which means that the $\Gamma$-Hilbert scheme $\mathrm{Hilb}^\Gamma(\mathbb{C}^d)$ parametrizes all the zero-dimensional closed subschemes $Z \subset \mathbb{C}^d$ such that $H^0(Z, \mathcal{O}_Z)$ is isomorphic to the regular representation R of $\Gamma$. The mathematical aspect of the $\Gamma$-Hilbert scheme $\mathrm{Hilb}^\Gamma(\mathbb{C}^d)$ for d = 2, 3 has been largely uncovered: for d = 2, $\mathrm{Hilb}^\Gamma(\mathbb{C}^2)$ is a minimal resolution of the singularity 1/m(1, m − 1) [23]; moreover, I. Nakamura has shown in [33] that even for d = 3, $\mathrm{Hilb}^\Gamma(\mathbb{C}^3)$ is a crepant resolution of the Calabi–Yau three-fold singularity $\mathbb{C}^3/\Gamma$, despite the fact that $\mathrm{Hilb}^n(\mathbb{C}^3)$ itself is singular. Thus our interest here is also concentrated on the d = 4 case. We will show later that $\mathrm{Hilb}^\Gamma(\mathbb{C}^4)$ is singular in general.

The definitions of the Hilbert schemes $\mathrm{Hilb}^n(\mathbb{C}^d)$ and $\mathrm{Hilb}^\Gamma(\mathbb{C}^d)$ given above may seem abstract. However, Y. Ito and H. Nakajima have shown that they can be realized as holomorphic (GIT)/symplectic quotients of flat spaces associated with the gauge groups U(n) and $U(1)^n$ respectively [22,32], that is, we can identify $\mathrm{Hilb}^n(\mathbb{C}^d)$ and $\mathrm{Hilb}^\Gamma(\mathbb{C}^d)$ as the classical Higgs moduli spaces of supersymmetric gauge theories [28,42]. In particular, $\mathrm{Hilb}^\Gamma(\mathbb{C}^d)$ can be described as a toric variety. Furthermore, it is isomorphic to the D-brane configuration space M(r) with a particular choice of the Fayet–Iliopoulos parameter $r \in M'_{\mathbb{Q}}$ [22], which is the main point of this subsection.

Let us first explain a holomorphic quotient construction of $\mathrm{Hilb}^n(\mathbb{C}^d)$. Fix $I \in \mathrm{Hilb}^n(\mathbb{C}^d)$ and let V = A/I be the associated n dimensional vector space. The multiplication by $x_\mu$ on V defines a d-tuple of elements of End(V), which we denote by $X_\mu$. To be more explicit, we choose an arbitrary basis $\epsilon_i$, i = 1, . . . , n, of V, and we define the matrix elements of $X_\mu$ by $(x_\mu + I) \cdot \epsilon_i = \sum_{j=1}^{n} \epsilon_j (X_\mu)^j_i$. If we also define a basis of $\mathbb{C}^d$ by $\beta_\mu$, µ = 1, . . . , d, then we define an element X of $\mathrm{Hom}(V, \mathbb{C}^d \otimes V)$ by

$X(\epsilon_i) := \sum_{\mu=1}^{d} \sum_{j=1}^{n} \beta_\mu \otimes \epsilon_j \, (X_\mu)^j_i.$

(4.3)

Similarly, the image of the map $1 \hookrightarrow A \to A/I$ defines a non-zero element of V which we denote by $p(1) = \sum_{i=1}^{n} p^i \epsilon_i$, where we mean by p the associated element of $\mathrm{Hom}(\mathbb{C}, V)$, that is, $p(\lambda) := \lambda\, p(1)$ for $\lambda \in \mathbb{C}$. It is clear by construction that (X, p) satisfies the following two conditions:

(i) $[X_\mu, X_\nu] = O$ (F-flatness).
(ii) p(1) is a cyclic vector, that is, V is generated by $X_\mu$ over p(1) (stability).

Conversely, let W be the vector space $\mathrm{Hom}(\mathbb{C}^n, \mathbb{C}^d \otimes \mathbb{C}^n) \oplus \mathrm{Hom}(\mathbb{C}, \mathbb{C}^n)$, and take an element $(X, p) \in W$ that satisfies the above two conditions (i), (ii) with $V = \mathbb{C}^n$. Then (X, p) defines a point $I \in \mathrm{Hilb}^n(\mathbb{C}^d)$ as follows. First, we define a surjective homomorphism $\kappa : A \to \mathbb{C}^n$ of vector spaces over $\mathbb{C}$ by

$\kappa(x_{\mu_1} \cdots x_{\mu_s}) := X_{\mu_1} \cdots X_{\mu_s} \cdot p(1) = \sum_{i_1, \dots, i_s, j} \epsilon_{i_1} (X_{\mu_1})^{i_1}_{i_2} (X_{\mu_2})^{i_2}_{i_3} \cdots (X_{\mu_s})^{i_s}_{j} \, p^j,$

(4.4)

where we define $X_\mu(\epsilon_i) := \sum_{j=1}^{n} \epsilon_j (X_\mu)^j_i$ and $p(1) := \sum_{i=1}^{n} p^i \epsilon_i$ for a basis $\epsilon_i$ of $\mathbb{C}^n$. Second, let $I := \mathrm{Ker}\, \kappa \subset A$, which is an ideal of A; then $A/I \cong \mathbb{C}^n$ as a vector space, which implies $I \in \mathrm{Hilb}^n(\mathbb{C}^d)$ according to (4.1). Third, it is clear that two elements (X, p) and (X′, p′) of W define the same point of $\mathrm{Hilb}^n(\mathbb{C}^d)$ if and only if $(X', p') = (gXg^{-1}, gp)$ for some $g \in GL(n, \mathbb{C})$.
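The passage from an ideal to a pair (X, p) satisfying conditions (i) and (ii) can be illustrated on a small example. The sketch below is restricted to monomial ideals, for which a basis of A/I is a set of monomials and each multiplication matrix maps a basis monomial to another basis monomial or to zero; in that special case cyclicity of p(1) reduces to reachability from the monomial 1. The function names, and the example ideal generated by the quadratic monomials in two variables, are illustrative assumptions and are not taken from the text.

```python
def multiplication_matrices(basis, d):
    """Matrices X_mu of multiplication by x_mu on A/I for a monomial ideal I.

    `basis` lists the exponent vectors of the monomials not in I.
    Entry X[j][i] is the coefficient of basis[j] in x_mu * basis[i].
    """
    index = {m: k for k, m in enumerate(basis)}
    n = len(basis)
    mats = []
    for mu in range(d):
        X = [[0] * n for _ in range(n)]
        for i, m in enumerate(basis):
            shifted = tuple(e + 1 if k == mu else e for k, e in enumerate(m))
            j = index.get(shifted)  # None means x_mu * m lies in I
            if j is not None:
                X[j][i] = 1
        mats.append(X)
    return mats


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def f_flat(mats):
    """Condition (i): all commutators [X_mu, X_nu] vanish."""
    return all(matmul(A, B) == matmul(B, A) for A in mats for B in mats)


def is_cyclic(mats, start):
    """Condition (ii) in the monomial case: every basis vector is reached
    from basis vector `start` by repeated application of the X_mu."""
    n = len(mats[0])
    reached, frontier = {start}, [start]
    while frontier:
        i = frontier.pop()
        for X in mats:
            for j in range(n):
                if X[j][i] and j not in reached:
                    reached.add(j)
                    frontier.append(j)
    return len(reached) == n


# Example: d = 2, ideal generated by x1^2, x1*x2, x2^2; A/I has basis 1, x1, x2.
basis = [(0, 0), (1, 0), (0, 1)]
mats = multiplication_matrices(basis, d=2)
print(f_flat(mats), is_cyclic(mats, start=0))
```

For a general (non-monomial) ideal the cyclicity test would need a rank computation on the span of the words in the X_µ applied to p(1); the reachability shortcut is valid only under the monomial assumption stated above.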


Thus we have arrived at the following holomorphic quotient construction of $\mathrm{Hilb}^n(\mathbb{C}^d)$:

$\mathrm{Hilb}^n(\mathbb{C}^d) \cong \{\, (X, p) \in W \mid \text{condition (i): F-flatness},\ \text{condition (ii): stability} \,\} \,/\, GL(n, \mathbb{C}).$

(4.5)

We can also obtain the corresponding symplectic quotient construction of $\mathrm{Hilb}^n(\mathbb{C}^d)$ by replacing the stability condition (ii) and the quotient by $GL(n, \mathbb{C})$ above by the D-flatness condition

$D_r := \sum_{\mu=1}^{d} \bigl[ X_\mu, X_\mu^\dagger \bigr] + p \cdot p^\dagger - r \,\mathrm{diag}(1, \dots, 1) = O,$

(4.6)

and the quotient by U(n), where r > 0 is a unique Fayet–Iliopoulos parameter associated with the U(1) factor of U(n). If we set r = 0, we obtain the symmetric product $(\mathbb{C}^d)^n / S_n$ as a quotient variety, reflecting the existence of the Hilbert–Chow morphism $\mathrm{Hilb}^n(\mathbb{C}^d) \to (\mathbb{C}^d)^n / S_n$.

Let us turn to the holomorphic quotient construction of $\mathrm{Hilb}^\Gamma(\mathbb{C}^d)$ based on that of $\mathrm{Hilb}^n(\mathbb{C}^d)$ given above. The only difference from the previous treatment of $\mathrm{Hilb}^n(\mathbb{C}^d)$ is that this time we must assign the $\Gamma$-quantum numbers to the objects $x_\mu$, $\epsilon_i$, $\beta_\mu$ and p. However, we would not mind repeating almost the same argument for convenience. Let us first redefine the generator g of $\Gamma$ so that its action on $x_\mu$ becomes $g \cdot x_\mu = \omega^{-a_\mu} x_\mu$ for consistency. Second, take a point $I \in \mathrm{Hilb}^\Gamma(\mathbb{C}^d)$ and define V = A/I, which is now isomorphic to the regular representation R as a $\Gamma$-module by definition. Let $\epsilon_i \in V$ be a generator of $R_i$ for i = 1, . . . , n, that is, $g \cdot \epsilon_i = \omega^i \epsilon_i$ and $V = \bigoplus_{i=0}^{n-1} \mathbb{C}\, \epsilon_i$ is the irreducible decomposition of $\Gamma$-modules. We also introduce somewhat abstractly $\beta_\mu$ as a generator of $R_{a_\mu}$ for µ = 1, . . . , d and let $Q = \bigoplus_{\mu=1}^{d} \mathbb{C}\, \beta_\mu$ be a $\Gamma$-module. Then we define a $\Gamma$-equivariant homomorphism from V to $Q \otimes V$, which we call X, by

$X : f \in V \mapsto \sum_{\mu=1}^{d} \beta_\mu \otimes (f \cdot x_\mu) \in Q \otimes V,$

(4.7)

where the product of polynomials $f \cdot x_\mu$ is evaluated modulo I. In particular, the µth component of $X(\epsilon_{i+a_\mu})$ becomes

$x_\mu \cdot \epsilon_{i+a_\mu} = (X_\mu)^{i}_{i+a_\mu} \, \epsilon_i, \quad \exists\, (X_\mu)^{i}_{i+a_\mu} \in \mathbb{C}.$

(4.8)

Thus we get the matrices $(X_\mu)$ of the same content as those for a D-brane at the orbifold singularity. The map $1 \hookrightarrow A \to A/I$ now induces an element $0 \neq p \in \mathrm{Hom}_\Gamma(\mathbb{C}, V)$, where $p(1) = p^0 \epsilon_0 \in V$. Thus an element $I \in \mathrm{Hilb}^\Gamma(\mathbb{C}^d)$ defines an element $(X, p) \in \mathrm{Hom}_\Gamma(V, Q \otimes V) \oplus \mathrm{Hom}_\Gamma(\mathbb{C}, V)$, and it is clear by construction that (X, p) satisfies the conditions (i) (F-flatness) and (ii) (stability) above. Conversely, take an element (X, p) of $W^\Gamma := \mathrm{Hom}_\Gamma(R, Q \otimes R) \oplus \mathrm{Hom}_\Gamma(\mathbb{C}, R)$ such that the (n, n) matrices $(X_\mu)$ and the (n, 1) matrix p(1) defined by

$X(\epsilon_i) = \sum_{\mu=1}^{d} (X_\mu)^{i-a_\mu}_{i} \, \beta_\mu \otimes \epsilon_{i-a_\mu}, \quad p(1) = p^0 \epsilon_0,$


satisfy the conditions (i), (ii) with V = R. Then we can define a $\Gamma$-equivariant surjective homomorphism $\kappa : A \to R$ by

$\kappa(x_{\mu_1} \cdots x_{\mu_s}) = \sum_{i_1, \dots, i_s} \epsilon_{i_1} (X_{\mu_1})^{i_1}_{i_2} (X_{\mu_2})^{i_2}_{i_3} \cdots (X_{\mu_s})^{i_s}_{0} \, p^0,$

(4.9)

so that the ideal $I := \mathrm{Ker}\, \kappa$ is $\Gamma$-invariant and we obtain the $\Gamma$-module isomorphism $A/I \cong R$, that is, (X, p) defines an element I of $\mathrm{Hilb}^\Gamma(\mathbb{C}^d)$. With the basis $\beta_\mu$ of Q fixed, two elements (X, p) and (X′, p′) of $W^\Gamma$ which satisfy the conditions (i) and (ii) define the same point $I \in \mathrm{Hilb}^\Gamma(\mathbb{C}^d)$ if and only if they are related as $(X', p') = (uXu^{-1}, up)$ by an element $u := (u_i) \in \mathrm{Aut}_\Gamma(R) \cong \prod_{i=1}^{n} \mathrm{Aut}(R_i) \cong (\mathbb{C}^*)^n$, where $u_i$ acts on $\epsilon_i$ by $\epsilon_i \to u_i^{-1} \epsilon_i$. Thus we get the holomorphic quotient construction of $\mathrm{Hilb}^\Gamma(\mathbb{C}^d)$:

$\mathrm{Hilb}^\Gamma(\mathbb{C}^d) \cong \{\, (X, p) \in W^\Gamma \mid \text{condition (i): F-flatness},\ \text{condition (ii): stability} \,\} \,\Big/\, \prod_{i=1}^{n} \mathrm{Aut}(R_i),$

(4.10)

which in particular shows that $\mathrm{Hilb}^\Gamma(\mathbb{C}^d)$ is toric. The associated symplectic quotient can be obtained by replacing the stability condition (ii) and the quotient by $\prod_i \mathrm{Aut}(R_i)$ by the D-flatness condition that takes the same form as (4.6), followed by the quotient by $\prod_i U(R_i) \cong U(1)^n$. Consequently, the Fayet–Iliopoulos parameter associated with $U(1)^n$ is r(1, . . . , 1). To sum up, we have the symplectic quotient realization of $\mathrm{Hilb}^\Gamma(\mathbb{C}^d)$:

$\mathrm{Hilb}^\Gamma(\mathbb{C}^d) \cong \{\, (X, p) \in W^\Gamma \mid \text{F-flatness: } [X_\mu, X_\nu] = O,\ \text{D-flatness: } D_r = O \,\} \,/\, U(1)^n.$

(4.11)

The relation between $\mathrm{Hilb}^\Gamma(\mathbb{C}^d)$ and M(r) can be easily seen if we write down the D-flatness equations for $\mathrm{Hilb}^\Gamma(\mathbb{C}^d)$ in components:

$\sum_{\mu=1}^{d} \bigl( |x_\mu^{(i)}|^2 - |x_\mu^{(i-a_\mu)}|^2 \bigr) + |p^0|^2 \delta_{i,0} = r, \quad i = 0, 1, \dots, n-1,$

(4.12)

where we set $x_\mu^{(i)} := (X_\mu)^{i}_{i+a_\mu}$ as before (3.10). We can delete $p^0$ and the diagonal U(1) from the symplectic quotient construction owing to the Higgs mechanism [22]:

$|p^0|^2 = n r.$

(4.13)

Then we are left with the matrices $(X_\mu)$, which satisfy the same equations as those of D-brane matrices (3.11, 3.12) with the Fayet–Iliopoulos parameter

$(r_0, r_1, \dots, r_{n-1}) = r\,(-(n-1), 1, \dots, 1) \in M'_{\mathbb{Q}}.$

(4.14)
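The step from (4.12) to (4.13) and (4.14) can be made explicit. The following short derivation is a sketch that uses only the fact that, for each fixed µ, the shift $i \mapsto i - a_\mu$ permutes $\mathbb{Z}_n$, so that the matrix terms cancel when (4.12) is summed over i; here $p^0$ denotes the single non-zero component of p(1).

```latex
\begin{align*}
\sum_{i=0}^{n-1}\Bigl[\,\sum_{\mu=1}^{d}\bigl(|x_\mu^{(i)}|^2-|x_\mu^{(i-a_\mu)}|^2\bigr)
   +|p^0|^2\,\delta_{i,0}\Bigr]
 &= \sum_{\mu=1}^{d}\Bigl(\sum_{i}|x_\mu^{(i)}|^2-\sum_{i}|x_\mu^{(i-a_\mu)}|^2\Bigr)+|p^0|^2
  = |p^0|^2 ,\\
\sum_{i=0}^{n-1} r &= n\,r
 \quad\Longrightarrow\quad |p^0|^2 = n\,r ,\\
\sum_{\mu=1}^{d}\bigl(|x_\mu^{(i)}|^2-|x_\mu^{(i-a_\mu)}|^2\bigr)
 &= r - n\,r\,\delta_{i,0}
  = \begin{cases} -(n-1)\,r, & i = 0,\\ r, & i \neq 0 .\end{cases}
\end{align*}
```

The components on the right-hand side of the last line sum to zero, as they must for a parameter lying in the hyperplane $\sum_i r_i = 0$.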

Thus we come to the conclusion:

$\mathrm{Hilb}^\Gamma(\mathbb{C}^d) \cong M(r\,\mathbf{1}), \quad \mathbf{1} := (\underbrace{1, \dots, 1}_{n-1}),$

(4.15)

where we have identified $M'$ with $\mathbb{Z}^{n-1}$ by neglecting the zeroth component. We also note the existence of the Hilbert–Chow morphism $\mathrm{Hilb}^\Gamma(\mathbb{C}^d) \to \mathbb{C}^d/\Gamma$, which comes from the isomorphism $M(0) \cong \mathbb{C}^d/\Gamma$ [38].


4.2. Another algorithm for computation. The aim of this subsection is to translate the algorithm to compute the $\Gamma$-Hilbert scheme given by Reid in [37], which seems quite different from the one given in the previous subsection, into the language of convex polyhedra. Closely related topics can be found in [1,40]. Let $A = \mathbb{C}[x_1, \dots, x_d]$ be the coordinate ring of $\mathbb{C}^d$, where $g \in \Gamma$ acts on $x_\mu$ as multiplication by $\omega^{a_\mu}$, which defines the action of $\Gamma$ on $\mathbb{C}^d$ from the right. For i = 0, . . . , n − 1, we define $L_i$ to be the “orbifold line bundle” on $\mathbb{C}^d/\Gamma$ associated with the irreducible representation $R_{-i}$ of $\Gamma$, where $g \in \Gamma$ acts as multiplication by $\omega^{-i}$. The global section of $L_i$ is $(R_{-i} \otimes A)^\Gamma$, that is, the weight i subspace of A. Note that as a $\Gamma$-module, $A \cong \mathrm{Sym}\, Q = \bigoplus_{n=0}^{\infty} S^n Q$. The set of the monomial generators over $\mathbb{C}$ of $(R_{-i} \otimes A)^\Gamma$, which we denote by $\mathcal{M}_i$, is given by

$\mathcal{M}_i = \{\, m \in (\mathbb{Z}_{\geq 0})^d \mid m \cdot a \equiv i \bmod n \,\},$

(4.16)

where $m = (m_1, \dots, m_d)$ and $a = (a_1, \dots, a_d)$. $\mathcal{M}_0$ coincides with the coordinate ring of $\mathbb{C}^d/\Gamma$, and each $\mathcal{M}_i$ has a structure of a finitely generated $\mathcal{M}_0$-module, the set of the generators of which we denote by $\mathcal{B}_i$. Let $P_i = \operatorname{conv} \mathcal{M}_i$ be the Newton polyhedron of the global monomial sections of $L_i$, which can be regarded as a polyhedron in $(M')_{\mathbb{Q}}$, where the lattice $M'$ is defined in (3.3). Then the toric variety $X(M', P_i)$ defines the blow-up of $\mathbb{C}^d/\Gamma = X(M', P_0)$ by $L_i$, which is denoted by $\mathrm{Bl}_i(\mathbb{C}^d/\Gamma)$. The normal fan $\mathcal{N}(P_i)$ in $(N')_{\mathbb{Q}}$ (3.2) is the fan associated with $\mathrm{Bl}_i(\mathbb{C}^d/\Gamma)$. Evidently, $P_i$ can be expressed as the Minkowski sum of the polytope $\operatorname{conv} \mathcal{B}_i$ and the cone $C_0 = P_0$ (3.5). On the other hand, a celebrated theorem of E. Noether adapted to the $1/n(a_1, \dots, a_d, n - i)$ model, which is not Calabi–Yau, tells us that all the members of $\mathcal{B}_i$ can be found among those in $\mathcal{M}_i$ of degree ≤ n, which implies the following way to construct the Newton polyhedron $P_i$ without any knowledge of $\mathcal{B}_i$:

$P_i \cong \operatorname{conv} \mathcal{B}'_i + C_0, \quad \mathcal{B}'_i := \Bigl\{\, m \in \mathcal{M}_i \Bigm| \sum_{\mu=1}^{d} m_\mu \leq n \,\Bigr\} \supset \mathcal{B}_i.$

(4.17)

Example. We take the 1/5(1, 2, 3, 4) model. The four convex polyhedra are given by

$P_1 = \operatorname{conv}\{\, e_1,\ 3e_2,\ 2e_3,\ 4e_4,\ e_2 + e_4 \,\} + C_0,$
$P_2 = \operatorname{conv}\{\, 2e_1,\ e_2,\ 4e_3,\ 3e_4,\ e_3 + e_4 \,\} + C_0,$
$P_3 = \operatorname{conv}\{\, 3e_1,\ 4e_2,\ e_3,\ 2e_4,\ e_1 + e_2 \,\} + C_0,$
$P_4 = \operatorname{conv}\{\, 4e_1,\ 2e_2,\ 3e_3,\ e_4,\ e_1 + e_3 \,\} + C_0.$

(4.18)
According to [37], $\mathrm{Hilb}^\Gamma(\mathbb{C}^d)$ is the toric variety associated with the fan in $(N')_{\mathbb{Q}}$ that is the coarsest common refinement of the normal fans $\mathcal{N}(P_i)$, i = 1, . . . , n − 1, which we denote by $\mathcal{N}(P_1) \cap \cdots \cap \mathcal{N}(P_{n-1})$. To put it differently, $\mathrm{Hilb}^\Gamma(\mathbb{C}^d)$ is the toric variety associated with the polyhedron $P_{\mathrm{Hilb}}$ defined by

$P_{\mathrm{Hilb}} := P_1 + \cdots + P_{n-1} = \operatorname{conv}(\mathcal{B}_1 + \cdots + \mathcal{B}_{n-1}) + C_0,$

(4.19)

because of the formula [44, Proposition 7.12]:

$\mathcal{N}(P_1) \cap \cdots \cap \mathcal{N}(P_{n-1}) = \mathcal{N}(P_1 + \cdots + P_{n-1}).$

(4.20)


It is clear by construction that $\mathrm{Hilb}^\Gamma(\mathbb{C}^d)$ is projective over $\mathbb{C}^d/\Gamma = X(M', C_0)$, and that each $P_i$ defines a line bundle generated by global sections, and $P_{\mathrm{Hilb}}$ an ample one on $\mathrm{Hilb}^\Gamma(\mathbb{C}^d)$. Note that $P_{\mathrm{Hilb}}$ defined in (4.19) is by no means a unique candidate for a polyhedron yielding the $\Gamma$-Hilbert scheme: indeed any polyhedron of the form $\sum_{i=1}^{n-1} k_i P_i$, where $k_i > 0$ for all i, fits the job. A distinguished feature of $P_{\mathrm{Hilb}}$ (4.19) among the family $\sum_{i=1}^{n-1} k_i P_i$ is the following:

Conjecture. Two polyhedra $(M, Q(\hat{\mathbf{1}}))$ and $(M', P_{\mathrm{Hilb}})$ are isomorphic to each other modulo translation.

Recall that $\hat{\mathbf{1}}$ is an element of $M_{\mathbb{Q}}$ which satisfies $\pi_{\mathbb{Q}}(\hat{\mathbf{1}}) = \mathbf{1}$.

4.3. Computations. Here we compute the $\Gamma$-Hilbert schemes of some Calabi–Yau four-fold models to show the power of the formula (2.38) of the toric quotient combined with (4.15). Another method (4.19), though less effective, serves as a consistency check of the result of (4.15).

4.3.1. (1/17)(1,1,6,9) model. The fan of the $\Gamma$-Hilbert scheme is given by the following collection of the maximal cones:

⟨2, 3, 4, 5⟩, ⟨1, 2, 4, 7⟩, ⟨2, 5, 7, 8⟩, ⟨1, 5, 6, 8⟩, ⟨1, 3, 5, 6⟩, ⟨1, 2, 7, 9⟩,
⟨1, 2, 3, 6⟩, ⟨2, 5, 6, 8⟩, ⟨2, 3, 5, 6⟩, ⟨1, 2, 8, 9⟩, ⟨1, 7, 8, 9⟩, ⟨2, 7, 8, 9⟩,
⟨2, 4, 5, 7⟩, ⟨1, 2, 6, 8⟩, ⟨1, 4, 5, 7⟩, ⟨1, 3, 4, 5⟩, ⟨1, 5, 7, 8⟩,

(4.21)

where the weight vectors are

$w_5 = (1, 1, 6, 9), \quad w_6 = (2, 2, 12, 1), \quad w_7 = (3, 3, 1, 10), \quad w_8 = (4, 4, 7, 2), \quad w_9 = (6, 6, 2, 3).$

(4.22)
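The weight vectors of this model can be cross-checked numerically. The sketch below assumes that the exceptional weight vectors of a crepant resolution are the reduced multiples $k a \bmod n$, k = 1, . . . , n − 1, of age one, where the age is taken to be the sum of the reduced entries divided by n; this convention and the function name are assumptions made for the illustration.

```python
def age_one_weights(a, n):
    """Reduced multiples k*a mod n (k = 1, ..., n-1) whose entries sum to n."""
    weights = []
    for k in range(1, n):
        w = tuple((k * ak) % n for ak in a)
        if sum(w) == n:  # age = sum(w) / n equals one
            weights.append(w)
    return weights


for w in age_one_weights((1, 1, 6, 9), 17):
    print(w)
```

Since the first two entries of a = (1, 1, 6, 9) are equal, the first two entries of every reduced multiple agree as well.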

The fan (4.21) defines one of the five crepant resolutions of the (1/17)(1, 1, 6, 9) model. For other Calabi–Yau four-fold models which admit crepant resolutions, we only give the following conjecture.

Conjecture. The $\Gamma$-Hilbert schemes of the 1/(3m + 1)(1, 1, 1, 3m − 2) and 1/(4m)(1, 1, 2m − 1, 2m − 1) models (3.8) are the crepant resolutions of the corresponding orbifolds described in [27].

4.3.2. 1/5(1,2,3,4) model. The $\Gamma$-Hilbert scheme coincides with Phase I (3.56) found in the previous section.

4.3.3. 1/7(1,2,5,6) model. The exceptional divisors appearing in the $\Gamma$-Hilbert scheme are as follows:

$w_5 = (1, 2, 5, 6)$, $w_6 = (2, 4, 3, 5)$, $w_7 = (3, 6, 1, 4)$, $w_8 = (4, 1, 6, 3)$,
$w_9 = (5, 3, 4, 2)$, $w_{10} = (6, 5, 2, 1)$, $w_{11} = (2, 4, 10, 5)$, $w_{12} = (3, 6, 8, 4)$,
$w_{13} = (4, 8, 6, 3)$, $w_{14} = (5, 3, 4, 9)$, $w_{15} = (5, 10, 4, 2)$, $w_{16} = (6, 5, 2, 8)$,
$w_{17} = (8, 2, 5, 6)$, $w_{18} = (9, 4, 3, 5)$, $w_{19} = (9, 4, 3, 12)$, $w_{20} = (12, 3, 4, 9)$.

(4.23)


The fan of the $\Gamma$-Hilbert scheme is given by

⟨1, 2, 3, 10⟩, ⟨2, 4, 6, 7⟩, ⟨1, 2, 7, 10⟩, ⟨2, 3, 13, 15⟩, ⟨2, 6, 12, 13⟩, ⟨2, 3, 12, 13⟩,
⟨3, 4, 5, 8⟩, ⟨2, 3, 10, 15⟩, ⟨1, 2, 4, 7⟩, ⟨2, 3, 11, 12⟩, ⟨2, 3, 5, 11⟩, ⟨2, 3, 4, 5⟩,
⟨1, 3, 8, 9⟩, ⟨1, 9, 10, 18⟩, ⟨4, 5, 6, 14⟩, ⟨1, 4, 19, 20⟩, ⟨1, 3, 9, 10⟩, ⟨2, 4, 5, 6⟩,
⟨3, 9, 12, 13⟩, ⟨6, 9, 12, 13⟩, ⟨1, 4, 17, 20⟩, ⟨1, 3, 4, 8⟩, ⟨1, 4, 16, 19⟩, ⟨1, 4, 8, 17⟩,
⟨1, 4, 7, 16⟩, ⟨2, 7, 10, 15⟩, ⟨4, 6, 7, 16⟩, ⟨3, 5, 8, 11⟩, ⟨1, 8, 9, 17⟩,
⟨3, 8, 9, 11, 12⟩, ⟨2, 6, 7, 13, 15⟩, ⟨1, 9, 17, 18, 20⟩, ⟨4, 6, 14, 16, 19⟩, ⟨3, 9, 10, 13, 15⟩,
⟨2, 5, 6, 11, 12⟩, ⟨1, 16, 18, 19, 20⟩, ⟨4, 14, 17, 19, 20⟩, ⟨1, 7, 10, 16, 18⟩, ⟨4, 5, 8, 14, 17⟩,
⟨6, 7, 9, 10, 13, 15, 16, 18⟩, ⟨4, 5, 8, 9, 11, 12, 14, 17⟩, ⟨6, 9, 14, 16, 17, 18, 19, 20⟩.

(4.24)

4.3.4. 1/7(1,1,2,3) model. This model has seven weight vectors:

$w_5 = (1, 1, 2, 3)$, $w_6 = (3, 3, 6, 2)$, $w_7 = (4, 4, 1, 5)$, $w_8 = (5, 5, 3, 1)$,
$w_9 = (6, 6, 5, 4)$, $w_{10} = (8, 8, 2, 3)$, $w_{11} = (9, 9, 4, 6)$.

(4.25)

The fan of the $\Gamma$-Hilbert scheme, which is a smooth non-Calabi–Yau four-fold, has only five of them:

⟨2, 4, 5, 7⟩, ⟨2, 4, 5, 8⟩, ⟨1, 5, 7, 10⟩, ⟨1, 3, 6, 8⟩, ⟨1, 2, 7, 10⟩, ⟨1, 3, 5, 6⟩,
⟨1, 2, 8, 10⟩, ⟨2, 3, 4, 5⟩, ⟨1, 4, 5, 7⟩, ⟨1, 2, 3, 8⟩, ⟨1, 4, 5, 8⟩, ⟨2, 3, 5, 6⟩,
⟨1, 2, 4, 7⟩, ⟨1, 3, 4, 5⟩, ⟨2, 3, 6, 8⟩, ⟨1, 5, 8, 10⟩, ⟨2, 5, 8, 10⟩, ⟨2, 5, 7, 10⟩.

(4.26)

4.3.5. (1/16)(1,3,5,7) model. The weight vectors appearing in the $\Gamma$-Hilbert scheme are given by

$w_5 = (17, 3, 21, 7)$, $w_6 = (18, 6, 10, 14)$, $w_7 = (14, 42, 6, 18)$,
$w_8 = (5, 15, 9, 3)$, $w_9 = (11, 33, 7, 13)$, $w_{10} = (12, 4, 12, 20)$,
$w_{11} = (12, 4, 12, 4)$, $w_{12} = (8, 24, 8, 8)$, $w_{13} = (20, 12, 4, 12)$,
$w_{14} = (13, 7, 33, 11)$, $w_{15} = (3, 9, 15, 5)$, $w_{16} = (6, 2, 14, 10)$,
$w_{17} = (8, 8, 24, 8)$, $w_{18} = (7, 5, 3, 1)$, $w_{19} = (22, 2, 14, 10)$,
$w_{20} = (18, 6, 42, 14)$, $w_{21} = (30, 10, 6, 18)$, $w_{22} = (18, 6, 10, 30)$,
$w_{23} = (17, 3, 5, 7)$, $w_{24} = (1, 3, 5, 7)$, $w_{25} = (7, 21, 3, 17)$,
$w_{26} = (13, 7, 1, 11)$, $w_{27} = (23, 5, 3, 17)$, $w_{28} = (10, 14, 2, 6)$,
$w_{29} = (14, 10, 6, 18)$, $w_{30} = (7, 5, 3, 17)$, $w_{31} = (11, 1, 7, 13)$,
$w_{32} = (4, 12, 4, 12)$, $w_{33} = (17, 3, 5, 23)$, $w_{34} = (10, 14, 2, 22)$.

(4.27)


The fan of the $\Gamma$-Hilbert scheme is defined by the following 104 maximal cones:

⟨1, 3, 11, 18⟩, ⟨1, 4, 31, 33⟩, ⟨2, 4, 24, 32⟩, ⟨1, 18, 21, 23⟩, ⟨2, 9, 12, 32⟩, ⟨3, 11, 17, 18⟩, ⟨3, 4, 16, 24⟩, ⟨18, 24, 29, 32⟩,
⟨6, 18, 24, 29⟩, ⟨1, 13, 18, 21⟩, ⟨1, 11, 18, 23⟩, ⟨2, 3, 8, 18⟩, ⟨2, 25, 28, 34⟩, ⟨3, 8, 15, 18⟩, ⟨2, 26, 28, 34⟩, ⟨2, 4, 26, 34⟩,
⟨2, 3, 15, 24⟩, ⟨2, 8, 12, 18⟩, ⟨2, 3, 4, 24⟩, ⟨2, 8, 15, 24⟩, ⟨6, 11, 16, 24⟩, ⟨2, 9, 12, 18⟩, ⟨2, 8, 12, 24⟩, ⟨8, 15, 18, 24⟩,
⟨3, 14, 17, 24⟩, ⟨4, 22, 30, 33⟩, ⟨11, 17, 18, 24⟩, ⟨3, 15, 17, 24⟩, ⟨4, 26, 30, 34⟩, ⟨15, 17, 18, 24⟩, ⟨1, 2, 4, 26⟩, ⟨1, 21, 23, 27⟩,
⟨4, 10, 22, 24⟩, ⟨6, 10, 22, 24⟩, ⟨4, 10, 16, 24⟩, ⟨6, 11, 18, 24⟩, ⟨6, 10, 16, 24⟩, ⟨18, 28, 29, 32⟩, ⟨1, 4, 26, 27⟩, ⟨1, 3, 5, 11⟩,
⟨1, 2, 26, 28⟩, ⟨2, 7, 25, 28⟩, ⟨1, 13, 26, 28⟩, ⟨1, 2, 18, 28⟩, ⟨1, 13, 18, 28⟩, ⟨13, 18, 28, 29⟩, ⟨6, 11, 18, 23⟩, ⟨6, 23, 29, 30⟩,
⟨8, 12, 18, 24⟩, ⟨13, 18, 21, 29⟩, ⟨1, 3, 5, 19⟩, ⟨2, 3, 8, 15⟩, ⟨3, 11, 14, 17⟩, ⟨6, 18, 23, 29⟩, ⟨1, 2, 3, 18⟩, ⟨2, 4, 25, 34⟩,
⟨18, 21, 23, 29⟩, ⟨3, 5, 16, 19⟩, ⟨4, 22, 24, 30⟩, ⟨6, 22, 24, 30⟩, ⟨4, 26, 27, 30⟩, ⟨1, 19, 23, 31⟩, ⟨1, 3, 4, 31⟩, ⟨1, 3, 19, 31⟩,
⟨2, 7, 9, 32⟩, ⟨3, 5, 11, 20⟩, ⟨3, 16, 19, 31⟩, ⟨3, 4, 16, 31⟩, ⟨6, 24, 29, 30⟩, ⟨4, 10, 16, 31⟩, ⟨9, 12, 18, 32⟩, ⟨3, 11, 14, 20⟩,
⟨2, 7, 25, 32⟩, ⟨12, 18, 24, 32⟩, ⟨2, 12, 24, 32⟩, ⟨2, 4, 25, 32⟩, ⟨7, 25, 28, 32⟩, ⟨1, 23, 27, 33⟩, ⟨24, 29, 30, 32⟩, ⟨4, 24, 30, 32⟩,
⟨1, 4, 27, 33⟩, ⟨3, 15, 17, 18⟩, ⟨23, 27, 30, 33⟩, ⟨4, 27, 30, 33⟩, ⟨1, 23, 31, 33⟩, ⟨3, 5, 16, 20⟩, ⟨11, 14, 17, 24⟩, ⟨5, 11, 16, 20⟩,
⟨1, 13, 21, 26, 27⟩, ⟨2, 7, 9, 18, 28⟩, ⟨21, 23, 27, 29, 30⟩, ⟨6, 22, 23, 30, 33⟩, ⟨4, 10, 22, 31, 33⟩,
⟨7, 9, 18, 28, 32⟩, ⟨1, 5, 11, 19, 23⟩, ⟨4, 25, 30, 32, 34⟩, ⟨3, 14, 16, 20, 24⟩, ⟨11, 14, 16, 20, 24⟩,
⟨6, 10, 22, 23, 31, 33⟩, ⟨5, 6, 11, 16, 19, 23⟩, ⟨13, 21, 26, 27, 29, 30⟩,
⟨25, 28, 29, 30, 32, 34⟩, ⟨13, 26, 28, 29, 30, 34⟩, ⟨6, 10, 16, 19, 23, 31⟩.

(4.28)

We see that in general the $\Gamma$-Hilbert scheme of a Calabi–Yau orbifold for d = 4 is neither smooth nor Calabi–Yau, in contrast with the cases of d = 2, 3.

Acknowledgement. I would like to thank Mitsuko Abe (Tokyo Inst. of Technology) for many discussions.

References 1. Arnold, V.I., Gusein-Zade, S.M., Varchenko, A.N.: Singularities of Differentiable Maps II. Monographs in Mathematics 82. Boston: Birkhäuser, 1988 2. Aspinwall, P.S., Greene, B.R., Morrison, D.R.: Calabi–Yau Moduli Space, Mirror Manifolds and Spacetime Topology Change in String Theory. Nucl. Phys. B416, 414–480 (1994), hep-th/9309097 3. Batyrev, V.V., Dais, D.I.: Strong McKay Correspondence, String Theoretic Hodge Numbers and Mirror Symmetry. Topology 35, 901–929 (1996), alg-geom/9410001 4. Becker, K., Becker, M., Morrison, D.R., Ooguri, H., Oz,Y.,Yin, Z.: Supersymmetric Cycles in Exceptional Holonomy Manifolds and Calabi–Yau Four-Folds. Nucl. Phys. B480, 225–238 (1996), hep-th/9608116 5. Bershadsky, M., Vafa, C., Sadov, V.: D-Branes and Topological Field Theories. Nucl. Phys. B463, 420–434 (1996), hep-th/9511222 6. Billera, L.J., Fillman, P., Sturmfels, B.: Constructions and Complexity of Secondary Polytopes. Adv. Math. 83, 155–179 (1990)


7. Cox, D.A.: The Homogeneous Coordinate Ring of a Toric Variety. J. Alg. Geom. 4, 17–50 (1995), alggeom/9210008; Cox, D.A.: Recent Developments in Toric Geometry. In : Kollár, J., Lazarsfeld, R., Morrison, D.R. (eds.) Algebraic Geometry–Santa Cruz 1995. Proceedings of Symposia in Pure Mathematics 62, Part 2, Providence, RI: American Mathematical Society, 1997, pp. 389–436 alg-geom/9606016 8. Dais, D.I., Hause, U.-U., Henk, M.: On Crepant Resolutions of 2-Parameter Series of Gorenstein Cyclic Quotient Singularities. Result. Math. 33, 208–265 (1998), math.AG/9803096 9. Dais, D.I., Henk, M.: On a Series of Gorenstein Cyclic Quotient Singularities Admitting a Unique Projective Crepant Resolutions. math.AG/9803094, to appear In: Ewald, G.M., Teissier, B. (eds.) Combinatorial Convex Geometry and Toric Varieties. Boston: Birkhäuser 10. Douglas, M.R., Greene, B.R.: Metrics on D-Brane Orbifolds. Adv. Theor. Math. Phys. 1, 184–196 (1998), hep-th/9707214 11. Douglas, M.R., Greene, B.R., Morrison, D.R.: Orbifold Resolutions by D-Branes. Nucl. Phys. B506, 84–106 (1997), hep-th/9704151 12. Douglas, M.R., Kabat, D., Pouliot, P., Shenker, S.H.: D-Branes and Short Distances in String Theory. Nucl. Phys. B485, 85–127 (1996), hep-th/9608024 13. Douglas, M.R., Moore, G.: D-Branes, Quivers and ALE Instantons. hep-th/9603167 14. Edelsbrunner, H.: Algorithms in Computational Geometry. EATCS Monographs in Theoretical Computer Science 10. Berlin: Springer-Verlag, 1987 15. Ewald, G.M.: Combinatoric Convexity and Algebraic Geometry. Graduate Texts in Mathematics 168. New York: Springer-Verlag, 1996 16. Fulton, W.: Introduction to Toric Varieties. Annals of Mathematics Studies 131. Princeton: Princeton Univ. Press, 1993 17. Gelfand, I.M., Kapranov, M.M., Zelevinski, A.V.: Discriminants, Resultants and Multidimensional Determinants. Mathematics: Theory & Applications. Boston: Birkhäuser, 1994 18. Greene, B.R.: D-Brane Topology Changing Transitions. Nucl. Phys. 
B525, 284–296 (1998), hepth/9711124 19. Grothendieck, A.: Fondements de la Géométrie Algébrique. Extraits du Séminaire Bourbaki 1957–1962, mimeographed notes. 11 rue Pierre Curie, Paris 5e: Secrétariat mathématique 1962 20. Hartshorne, R.: Algebraic Geometry. Graduate Texts in Mathematics 52, New York: Springer-Verlag, 1977 21. Hosono, S., Lian, B.H., Yau, S.-T.: GKZ-Generalized Hypergeometric Systems in Mirror Symmetry of Calabi–Yau Hypersurfaces. Commun. Math. Phys. 182, 535–577 (1996), alg-geom/9511001. 22. Ito, Y., Nakajima, H.: McKay Correspondence and Hilbert Schemes in Dimension Three. math.AG/9803120 23. Ito, Y., Nakamura, I.: McKay Correspondence and Hilbert Schemes. Proc. Japan Acad. 72A, 135–137 (1996); Ito, Y., Nakamura, I.: Hilbert Schemes and Simple Singularities An and Dn . Hokkaido Univ. preprint 348 (1996); Nakamura, I.: Hilbert Schemes and Simple Singularities E6 , E7 and E8 . Hokkaido Univ. preprint 362 (1996) 24. Kapranov, M.M., Sturmfels, B., Zelevinsky, A.V.: Quotients of Toric Varieties. Math. Ann. 290, 643–655 (1991) 25. Kollár, J.: Rational Curves on Algebraic Varieties. Ergebnisse der Mathematik und ihrer Grenzgebiete. 3. Folge 32. Berlin: Springer-Verlag, 1996 26. Kronheimer, P.B.: The Construction of ALE Spaces as Hyper-Kähler Quotients. J. Diff. Geom. 28, 665– 683 (1989) 27. Mohri, K.: D-Branes and Quotient Singularities of Calabi–Yau Four-Folds. Nucl. Phys. B521, 161–182 (1998), hep-th/9707012 28. Morrison, D.R., Plesser, M.R.: Summing the Instantons: Quantum Cohomology and Mirror Symmetry in Toric Varieties. Nucl. Phys. B440, 279–354 (1995), hep-th/9412236 29. Morrison, D.R., Stevens, G.: Terminal Quotient Singularities in Dimensions Three and Four. Proc. Amer. Math. Soc. 90, 15–20 (1984) 30. Mukhopadhyay, S., Ray, K.: Conifolds from D-Branes. Phys. Lett. B423, 247–254 (1998), hep-th/9711131 31. Muto, T.: D-Branes on Orbifolds and Topology Change. Nucl. Phys. B521, 183–201 (1998), hepth/9711090 32. 
Nakajima, H.: Lectures on Hilbert Schemes of Points on Surfaces. preprint (1996), available via http://www.kusm.kyoto-u.ac.jp/˜nakajima/TeX.html 33. Nakamura, I.: Hilbert Schemes of Abelian Group Orbits. preprint (1998), cited in [22] 34. Oda, T.: Convex Bodies and Algebraic Geometry: An Introduction to the Theory of Toric Varieties. Ergebnisse der Mathematik und ihrer Grenzgebiete. 3. Folge 15. Berlin: Springer-Verlag, 1988 35. Ooguri, H., Oz,Y.,Yin, Z.: D-Branes on Calabi–Yau Spaces and Their Mirrors. Nucl. Phys. B477, 407–430 (1996), hep-th/9606112 36. Ray, K.: A Ricci-Flat Metric on D-Brane Orbifolds. Phys. Lett. B433, 307–317 (1998), hep-th/9803192 37. Reid, M.: McKay Correspondence. alg-geom/9702016


38. Sardo Infirri, A.V.: Partial Resolutions of Orbifold Singularities via Moduli Space of HYM-type Bundles. alg-geom/9610004
39. Sardo Infirri, A.V.: Resolutions of Orbifold Singularities and the Transportation Problem on the McKay Quiver. alg-geom/9610005
40. Sturmfels, B.: Gröbner Bases and Convex Polytopes. University Lecture Series 8. Providence, RI: American Mathematical Society, 1995
41. Thaddeus, M.: Toric Quotients and Flips. In: Fukaya, K., Furuta, M., Kohno, T., Kotschick, D. (eds.) Topology, Geometry and Field Theory, pp. 193–213. Singapore: World Scientific, 1994
42. Witten, E.: Phases of N = 2 Theories in Two Dimensions. Nucl. Phys. B403, 159–222 (1993), hep-th/9301042
43. Witten, E.: Bound States of Strings and p-Branes. Nucl. Phys. B460, 335–350 (1996), hep-th/9510135
44. Ziegler, G.M.: Lectures on Polytopes. Graduate Texts in Mathematics 152, New York: Springer-Verlag, 1994

Communicated by H. Araki

Commun. Math. Phys. 202, 701 – 733 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

An Extension of the Character Ring of sl(3) and Its Quantisation

P. Furlan1,2, A. Ch. Ganchev3, V. B. Petkova2,3

1 Dipartimento di Fisica Teorica dell’Università di Trieste, Strada Costiera 11, 34100 Trieste, Italy.

E-mail: [email protected]

2 Istituto Nazionale di Fisica Nucleare (INFN), Sezione di Trieste, Italy.

E-mail: [email protected]

3 Institute for Nuclear Research and Nuclear Energy, Tzarigradsko Chaussee 72, 1784 Sofia, Bulgaria.

E-mail: [email protected]

Received: 22 September 1998 / Accepted: 24 November 1998

Abstract: We construct a commutative ring with identity which extends the ring of characters of finite dimensional representations of sl(3). It is generated by characters with values in the group ring $\mathbb{Z}[\tilde W]$ of the extended affine Weyl group of $\widehat{sl}(3)_k$ at $k \notin \mathbb{Q}$. The “quantised” version at rational level $k + 3 = 3/p$ realises the fusion rules of a WZW conformal field theory based on admissible representations of $\widehat{sl}(3)_k$.

1. Introduction

The aim of this work is to describe the characters (one-dimensional representations) of the fusion algebra of $g = \widehat{sl}(n)_k$ WZW conformal models at rational (shifted) level $\kappa := k + n = p'/p$, for $n = 3 = p'$, $(p, 3) = 1$, and their “classical” counterparts. In [9] we have realised this fusion algebra as a matrix algebra $F \subset \mathrm{Mat}_{p^2}(\mathbb{C})$ with integer nonnegative structure constants ${}^{(p)}N_{y,x}^{z}$, and a basis $\{N_y,\ (N_y)_x^z = {}^{(p)}N_{y,x}^{z},\ N_1 = 1_{p^2}\}$. The labels of the basis run over the highest weights of the admissible representations at $\kappa = 3/p$ [12], which conveniently are parametrised by a subset $W_p^{(+)}$ of the affine Weyl group $W$, see the text for precise definitions. $F$ is a matrix realisation of a commutative, associative algebra with identity, a distinguished basis, and an involution $*$. The definition of the fusion algebra, an example of a C-algebra (Character algebra) in the terminology of [2], implies that the $N_y$ are normal, hence simultaneously diagonalisable by a unitary matrix $\psi_y^{(\mu)}$ labelled by some set $E_p = \{\mu\}$ of $p^2$ indices,

$$ {}^{(p)}N_{x,y}^{z} = \sum_{\mu\in E_p} \frac{\psi_x^{(\mu)}}{\psi_1^{(\mu)}}\, \psi_y^{(\mu)}\, \psi_z^{(\mu)\,*} = \sum_{\mu\in E_p} \frac{\chi_x^{(p)}(\mu)\, \chi_y^{(p)}(\mu)\, \chi_z^{(p)\,*}(\mu)}{\sum_{u\in W_p^{(+)}} |\chi_u^{(p)}(\mu)|^2}. \tag{1.1} $$

The eigenvalues $\chi_x^{(p)}(\mu) = \psi_x^{(\mu)}/\psi_1^{(\mu)}$ of $N_x$ provide $p^2$ linear representations $N_y \to \chi_y^{(p)}(\mu)$, i.e., characters of $F$, labelled by the set $E_p$,

$$ \chi_x^{(p)}(\mu)\, \chi_y^{(p)}(\mu) = \sum_{z\in W_p^{(+)}} {}^{(p)}N_{x,y}^{z}\, \chi_z^{(p)}(\mu). \tag{1.2} $$

With an appropriate reinterpretation of the labels this is the general setting for any RCFT. In particular for WZW models based on the $\widehat{sl}(n)_k$ integrable representations (a subclass of the admissible representations at integer level $k = p' - n$), both indices of the unitary matrix $\psi_\lambda^{(\mu)}$ belong to the alcove $P_+^k$, $\psi_\lambda^{(\mu)}$ is a symmetric matrix which coincides with the modular matrix, $\psi_\lambda^{(\mu)} = S_{\lambda\mu}^{(p')}$, while (1.1) reduces to the Verlinde formula [15] for the fusion rule (FR) multiplicities, i.e., the dimensions of the spaces of chiral vertex operators. In [9] we have specified the above general setting to the case of the generic subseries $\kappa = 3/p$ of the admissible representations, describing explicitly one of the fusion matrices, a “fundamental” fusion matrix $N_f$. Here $f$ is some analog of the $\bar g = sl(3)$ fundamental weight $\bar\Lambda_1 = (1, 0)$, but the product of $N_f$ with any $N_x$ produces generically seven terms, each appearing with multiplicity one. The numbers ${}^{(p)}N_{f,y}^{z} = 0, 1$ were found by solving a set of algebraic equations coming from the decoupling of singular vectors in $g = \widehat{sl}(3)_k$ Verma modules. More precisely the decoupling of the “horizontal” singular vectors (which exist also for generic values of the level, $\kappa \notin \mathbb{Q}$) determines the generic seven points, while the additional truncation conditions, including the ones at rational level, were obtained making also some assumptions suggested from explicit computations at small values of $p$. Although $N_f$ together with its conjugate $N_{f^*}$ are not sufficient to build up the full polynomial fusion ring, diagonalising this matrix in an orthonormal basis, i.e., finding the eigenvector matrix $\psi_y^{(\mu)}$ (common to all fusion matrices), recovers through (1.1) all FR multiplicities. We called (1.1) a Pasquier–Verlinde type formula since similar formulæ, though with a different interpretation of the structure constants (no longer required to be nonnegative integers), were first discussed in [13] in the context of lattice ADE models. Such formulæ, together with their dual counterparts, describing the structure constants of a “dual” algebra, have been furthermore exploited in the study of nondiagonal modular invariants of the integrable WZW conformal models [5, 14].
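As an illustration of how a formula of the type (1.1) produces integer fusion multiplicities from a unitary matrix, the following minimal sketch evaluates the Verlinde formula in the simplest integrable case, su(2) at integer level. It is not taken from the paper: the function names are hypothetical, the labels a = 0, ..., k stand for twice the spin, and the standard su(2) modular matrix is assumed as input.

```python
import math

def su2_modular_S(k):
    """Modular S-matrix of su(2) at integer level k (labels a = 0..k are twice the spin)."""
    n = k + 2
    return [[math.sqrt(2.0 / n) * math.sin(math.pi * (a + 1) * (b + 1) / n)
             for b in range(k + 1)] for a in range(k + 1)]

def verlinde(S):
    """N[x][y][z] = sum_mu S[x][mu] S[y][mu] S[z][mu] / S[0][mu].

    This is the pattern of (1.1) with psi = S; S is real here, so the
    complex conjugation on the z index is trivial.
    """
    size = len(S)
    N = [[[0] * size for _ in range(size)] for _ in range(size)]
    for x in range(size):
        for y in range(size):
            for z in range(size):
                val = sum(S[x][m] * S[y][m] * S[z][m] / S[0][m] for m in range(size))
                N[x][y][z] = round(val)  # the sums are integers up to rounding error
    return N
```

For level 2 the call `verlinde(su2_modular_S(2))[1][1]` gives the fusion of the spin-1/2 field with itself, which contains the identity and the spin-1 field once each.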
Diagonalising a $p^2 \times p^2$ matrix becomes a tedious task for big $p$ (we have done this exercise for $p = 4, 5$) so it is preferable to have an explicit analytic formula for $\psi_y^{(\mu)}$, or equivalently, for the characters $\chi_y^{(p)}(\mu)$ of the fusion algebra. The path we follow in this paper to find the characters $\chi_y^{(p)}(\mu)$ was suggested by the second formula for the admissible FR multiplicities conjectured (with slightly changed notation) in [9],

$$ {}^{(p)}N_{x,y}^{z} = \sum_{w'\in W^{[z]}} \det(w')\, m^{x}_{w' z y^{-1}}. \tag{1.3} $$

This formula is analogous to the formula for the integrable FR multiplicities derived in [11, 16, 8], which is a truncated version of the classical Weyl–Steinberg formula for the tensor product multiplicities of finite dimensional representations of the horizontal subalgebra $\bar g$ (with the role of the horizontal Weyl group $\bar W$ taken by the affine Weyl group $W$). In (1.3) the summation runs over the Kac–Wakimoto (KW) affine group $W^{[z]}$, generated by Kac–Kazhdan reflections corresponding to singular vectors of the $g$ Verma module of highest weight parametrised by the element $z \in W$. In the integrable case the counterpart of $m^x_v$ is the multiplicity $m^\lambda_\mu$ of the weight $\mu$ of the $\bar g$ finite dimensional module of highest weight $\lambda$. Here the integers $m^x_v$, $x, v \in W$, describe a generalised (finite) weight diagram. The realisation that one has to generalise the classical notion of weight diagram in order to describe the rational level FR was the main lesson of the, otherwise still incomplete, analysis of the null-decoupling equations of [9]. While the $\bar g$ weight diagrams are subsets of the root lattice $Q$ of $\bar g$, attached to the highest weight $\lambda$, in our case the role of a “root lattice” is taken by the affine Weyl group $W = \bar W \ltimes t_Q$, at a generic ($\kappa \notin \mathbb{Q}$) level. Now the question is, can one find formal characters encoding this information about the generalised weight diagrams, which furthermore are closed under multiplication and recover a “classical” analogue of (1.3) at generic levels, with the affine KW groups replaced by their horizontal counterparts $\bar W^{[z]}$? Then by analogy with the integrable case one can “quantise” these “classical” characters imposing the periodicity conditions, accounting for the rational level, and thus recover the linear representations of $F$.

Though it is not necessary, to understand our way of reasoning and the motivating idea it is helpful to imagine that there is a finite dimensional algebraic object playing the role of $\bar g$. The irreps of the assumed hidden algebra are labelled in general by $x \in \tilde W^{(+)}$, a fundamental subset of the extended affine Weyl group $\tilde W = \bar W \ltimes t_P$, with respect to the right action of $\bar W$ (equivalent to the action of the horizontal KW Weyl group, see [12] and the text below). The same set also parametrises maximally reducible Verma modules of $g$ at generic level and the corresponding irreducible quotients.
In the simplest $\widehat{sl}(2)_k$ case solved in [1, 6] this intrinsic algebra is the $\mathbb{Z}_2$ graded algebra osp(1|2) [7]; the Weyl group $\bar W \simeq \mathbb{Z}_2$ of sl(2) distinguishes between even and odd vacuum state finite dimensional irreps of osp(1|2). Alternatively, mapping $\iota : \tilde W \to Q$, one can use the horizontal subalgebra $\bar g = sl(2)$ itself, but restricting to its representations with highest weights on the root lattice $Q$ (integer isospins). In general this map $\iota$ (see Sect. 2) allows one to describe the generalised modules through subsets of the supports of standard modules of $\bar g$ (with highest weights of n-ality zero); the case sl(2) is trivial since then $\mathrm{Im}(\iota) \equiv Q$.

The construction of the “classical characters” at generic level, $k \notin \mathbb{Q}$, takes a considerable part of this work and is its main novel result. It is done in the first part of the paper (Sects. 2, 3, 4) by full analogy with the standard case starting with some analogs of the Verma module characters, used as ingredients in a generalised Weyl character formula. While the input about the relevant generalised supports of the finite or infinite “modules” (the set of weights and their multiplicities) is essentially taken over from [9], slightly rephrased and generalised, the main effort here is to find the proper multiplicative structure of the formal characters. They are elements of the group ring $\mathbb{Z}[\tilde W]$ of the extended affine Weyl group of $\widehat{sl}(3)_k$, closed under multiplication. The structure constants of the resulting commutative subring $\tilde{\mathcal W}$ of $\mathbb{Z}[\tilde W]$ satisfy a generalised Weyl–Steinberg formula. As a side result one obtains an explicit formula for the cardinality of the generalised weight diagrams, i.e., for the dimensions of the “finite dimensional modules” of the unknown algebra possibly generalising osp(1|2). Some steps of the construction in this first part hold, or are straightforwardly generalisable, for $g = \widehat{sl}(n)_k$, arbitrary $n$. So we keep the exposition general although we present the details fully for the case $g = \widehat{sl}(3)_k$; the general case will be elaborated elsewhere.

In the second part (Sects. 5, 6, 7) we consider rational values $\kappa = 3/p$, thus a generic subseries of the admissible representations of $g = \widehat{sl}(3)_k$. The formal characters are “quantised” imposing periodicity constraints and realised as $\mathbb{C}$-valued functions. They satisfy the orthogonality and completeness relations equivalent to the unitarity of $\psi_y^{(\mu)}$, so that one recovers the Pasquier–Verlinde type formula (1.1). Thus given the characters both formulæ for the FR multiplicities of the admissible representations (1.1) and (1.3) derive from the classical analog of the Weyl–Steinberg formula established in the first part of the paper. Furthermore one proves a third formula for the FR multiplicities, conjectured in [9], which also has a “classical” counterpart,

$$ {}^{(p)}N_{x,y}^{z} = {}^{(3p)}\bar N_{\iota(x)\,\iota(y)}^{\iota(z)}, \tag{1.4} $$

where ${}^{(3p)}\bar N_{\iota(x)\,\iota(y)}^{\iota(z)}$ are structure constants of the integrable fusion algebra at $\kappa = 3p$, while $\iota$ is the map sending $\tilde W$ to a subset of the triality zero weights. Presumably the fusion rules of $\widehat{sl}(n)$ WZW at $\kappa = n/p$ are given by the above formula with $n$ substituting 3.

2. Preliminaries

We start with fixing some notation. Let $\bar\Delta$, $\bar\Delta_+$, $\bar\Pi = \{\alpha_1, \dots, \alpha_{n-1}\}$ be, respectively, the sets of roots, positive roots, and the simple roots of $\bar g = sl(n)$, and $\Delta^{re}$, $\Delta^{re}_+ = \bar\Delta_+ \cup (\bar\Delta + \mathbb{Z}_{>0}\,\delta)$, $\Pi = \{\alpha_0 = \delta - \sum_i \alpha_i,\ \bar\Pi\}$ the sets of real roots, real positive roots and the simple roots of the affine algebra $g = \widehat{sl}(n)_k$. Let $\bar\Lambda_i$ be the fundamental weights of $\bar g$, i.e., $\langle \bar\Lambda_i, \alpha_j \rangle = \delta_{ij}$, with respect to the Killing–Cartan bilinear form $\langle\cdot,\cdot\rangle$ on the dual $\bar h^*$ of the Cartan algebra. With $Q = \oplus_i \mathbb{Z}\,\alpha_i$ and $P = \oplus_i \mathbb{Z}\,\bar\Lambda_i$ we denote the root and weight lattices of $\bar g$. Their positive cones are $Q_+ = \oplus_i \mathbb{Z}_{\ge 0}\,\alpha_i$ and $P_+ = \oplus_i \mathbb{Z}_{\ge 0}\,\bar\Lambda_i$. The negative cone $-Q_+$ is the support of the $\bar g$ Verma module of 0 highest weight, while $P_+$ is the chamber of integral dominant weights, i.e., the highest weights of the finite dimensional representations of $\bar g$. The form $\langle\cdot,\cdot\rangle$ extends to $\tilde h^* = \bar h^* \oplus \mathbb{C}\Lambda_0 \oplus \mathbb{C}\delta$ with $\langle \bar h^*, \mathbb{C}\Lambda_0 \oplus \mathbb{C}\delta \rangle = 0$, $\langle \Lambda_0, \Lambda_0 \rangle = 0 = \langle \delta, \delta \rangle$, $\langle \delta, \Lambda_0 \rangle = 1$. The fundamental weights of $g$ are $\{\Lambda_i = \Lambda_0 + \bar\Lambda_i\}$, $\bar\Lambda_0 = 0$. The “horizontal” projection $\overline{(\cdot)} : \tilde h^* \to \bar h^*$ is defined as having the kernel $\mathbb{C}\delta \oplus \mathbb{C}\Lambda_0$. The Weyl vector is $\rho = \sum_j \Lambda_j$.

The Weyl group $\bar W$ of $\bar g$ is a finite Coxeter group generated by the simple reflections $w_i = w_{\alpha_i}$ with relations $(w_i)^2 = 1 = (w_i w_{i+1})^3$, $w_i w_j = w_j w_i$, $j \ne i \pm 1$, $i, j = 1, 2, \dots, n-1$. The affine Weyl group $W$ is generated by the simple reflections $w_j$, $j = 0, 1, \dots, n-1$ with the same type of relations (for $n > 2$), identifying $w_n = w_0$. These groups can be depicted by their Cayley graphs, in which the vertices correspond to elements of the group, edges to the generators, and the elementary polygons to the relations. E.g., Fig. 1 depicts (a finite part of) the Cayley graph of the affine Weyl group $W$ of $g = \widehat{sl}(3)$, presented as $\{w_i : (w_i)^2 = 1 = (w_i w_j)^3,\ \{i, j\} \subset \{0, 1, 2\}\}$, while any of the “12” elementary hexagons in Fig. 1 is the Cayley graph of the corresponding horizontal Weyl group $\bar W$. The labels $i$ on the edges correspond to the generators $w_i$, $i = 0, 1, 2$. The “origin” of the graph, the vertex corresponding to the group unit 1 is denoted by JI. The three types of hexagons (“12”, “01”, and “20”) correspond to the three Artin type relations among the generators. It is convenient to introduce the following shorthand notation: $w_{ijk\dots} = w_i w_j w_k \cdots$.

We will also need the extended affine group $\tilde W$ defined as the semi-direct product $\tilde W = \bar W \ltimes t_P$ (while $W = \bar W \ltimes t_Q$), $t_P$ being the subgroup of translations in the weight lattice $P$. The elements $t_\beta$, $\beta \in P$, act on $\tilde h^*$ as

$$ t_\beta(\Lambda) = \Lambda + \langle \Lambda, \delta \rangle\, \beta - \Big( \langle \Lambda, \beta \rangle + \tfrac{1}{2} \langle \beta, \beta \rangle \langle \Lambda, \delta \rangle \Big)\, \delta, \tag{2.1} $$

and for $\alpha \in \bar\Delta$, $l\delta - \alpha \in \Delta^{re}_+$, $y \in \tilde W$, $\beta \in P$, one has the properties

$$ t_{l\alpha} = w_{l\delta - \alpha}\, w_\alpha, \qquad y\, t_\beta\, y^{-1} = t_{\bar y(\beta)}, \qquad y\, w_\alpha\, y^{-1} = w_{y(\alpha)}. \tag{2.2} $$

Fig. 1. Parts of the Cayley graph of $W$, the chamber $W^{(+)}$ and the KW action on it
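The translation action (2.1) can be checked by direct computation. The sketch below is an illustration only and makes several assumptions not found in the text: it specialises to sl(3), stores an element of the extended space as a triple (horizontal part in fundamental-weight coordinates, level, coefficient of $\delta$), and uses hypothetical function names. It lets one verify that the translations compose additively and preserve the extended bilinear form.

```python
from fractions import Fraction

def form(l, m):
    """Killing form of sl(3) on weights given in fundamental-weight coordinates."""
    (a, b), (c, d) = l, m
    return (2 * a * c + a * d + b * c + 2 * b * d) * Fraction(1, 3)

def pair(L, M):
    """Extended form on triples (horizontal part, level, delta coefficient).

    Uses <Lambda_0, delta> = 1 and <Lambda_0, Lambda_0> = 0 = <delta, delta>.
    """
    (l, k, d), (m, kk, dd) = L, M
    return form(l, m) + k * dd + kk * d

def translate(beta, L):
    """t_beta acting as in (2.1); <Lambda, delta> is the level k of Lambda."""
    l, k, d = L
    new_l = (l[0] + k * beta[0], l[1] + k * beta[1])
    new_d = d - (form(l, beta) + Fraction(1, 2) * form(beta, beta) * k)
    return (new_l, k, new_d)
```

Exact rational arithmetic is used so that the equalities can be tested without rounding issues; a generic (irrational) level is of course not representable this way, so the check only probes the algebraic identities.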

We have denoted by overbar the projection of $\tilde W$ onto the horizontal subgroup $\bar W$ sending the affine translations to the unit element. The group $\tilde W$ can be also written as $\tilde W = W \rtimes A$, where $A$ is the subgroup of $\tilde W$ which keeps invariant the set of simple roots $\Pi$ of $g$. It is a cyclic group generated by $\gamma = t_{\bar\Lambda_1} \bar\gamma$, where $\bar\gamma = w_1 \cdots w_{n-1}$ is a Coxeter element in $\bar W$ generating the cyclic subgroup $\bar A$ of $\bar W$. One has $\gamma(\alpha_j) = \alpha_{j+1} = \gamma^{j+1}(\alpha_0)$ for $j = 0, 1, 2, \dots, n-1$ identifying $\alpha_n \equiv \alpha_0$. In the case of g = sl(3) we will think of the Cayley graph of $\tilde W$ as a 3-sheeted covering of the graph of $W$ with, for example, the “fiber” over the edge “0” connecting the vertices 1 and $w_0$ being the set $U = \{A, A\, w_0\}$, and this part of the graph of $\tilde W$ is depicted in Fig. 2. The oriented edges correspond to $\gamma$ and the squares to the implementation of the automorphism of $W$, $w_\alpha \to \gamma\, w_\alpha\, \gamma^{-1} = w_{\gamma(\alpha)}$, $\alpha \in \Pi$.

Introduce the set $\mathcal{P} = \{\Lambda = y \cdot k\Lambda_0,\ y \in \tilde W\}$, where the shifted action of $w \in \tilde W$ on $\tilde h^*$ is given by $w \cdot \Lambda = w(\Lambda + \rho) - \rho$. In this and the following two sections we shall assume that the level $k$ is generic, $k \notin \mathbb{Q}$, which in particular ensures that if $y \cdot k\Lambda_0 = k\Lambda_0$ then $y \equiv 1$.

Fig. 2. The set $U$

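According to the general criterion of Kac–Kazhdan, if $\langle \Lambda + \rho, \beta \rangle$ is a positive integer for some $\beta \in \Delta^{re}_+$, the Verma module $M_\Lambda$, $\Lambda \in \mathcal{P}$, of $g$ is reducible, containing a Verma submodule $M_{w_\beta \cdot \Lambda}$. In particular if for some $y = \bar y\, t_{-\lambda} \in \tilde W$ and $\alpha \in \bar\Delta_+$ we have $y(\alpha) = \langle \lambda, \alpha \rangle\, \delta + \bar y(\alpha) \in \Delta^{re}_+$, then the KK condition is fulfilled for $\Lambda = y \cdot k\Lambda_0$ and the root $\beta = y(\alpha)$, and using (2.2),

$$ w_{y(\alpha)} \cdot \Lambda = y\, w_\alpha\, y^{-1} \cdot \Lambda = y\, w_\alpha \cdot k\Lambda_0 = \bar y\, w_\alpha\, t_{-w_\alpha^{-1}(\lambda)} \cdot k\Lambda_0. \tag{2.3} $$

If $y(\alpha_i) \in \Delta^{re}_+$, $\forall\, \alpha_i \in \bar\Pi$, the reflections $w_{y(\alpha_i)}$ generate a group $\bar W^{[\Lambda]}$ isomorphic to $\bar W$ (to be denoted also $\bar W^{[y]}$), introduced by Kac–Wakimoto [12]. We shall refer to these groups as (horizontal) KW (Weyl) groups. According to (2.2) we can identify the action of a KW group with the right action of $\bar W$ on $\tilde W$.

Now we introduce the subset $\mathcal{P}_+ \subset \mathcal{P}$ of weights $\Lambda$ such that the corresponding Verma modules $M_\Lambda$ are maximally reducible. From the above discussion it is clear that $\mathcal{P}_+ = \tilde W^{(+)} \cdot k\Lambda_0$, where

$$ \tilde W^{(+)} = \{ y \in \tilde W \mid y(\alpha_i) \in \Delta^{re}_+ \ \text{for}\ \forall\, \alpha_i \in \bar\Pi \}. \tag{2.4} $$

The Bruhat ordering on the orbit $\bar W^{[\Lambda]} \cdot k\Lambda_0$ describes the embedding pattern among the Verma modules $\{M_{\Lambda'},\ \Lambda' \in \bar W^{[\Lambda]} \cdot k\Lambda_0\}$. Denote $W^{(+)} = \tilde W^{(+)} \cap W$. One has

Proposition 2.1. $\mathcal{P}_+$ is a fundamental domain in $\mathcal{P}$ with respect to the action of the KW Weyl groups, or, equivalently, $\tilde W^{(+)}$ ($W^{(+)}$) is a fundamental domain in $\tilde W$ ($W$) with respect to the right action of $\bar W$.

Proof. Let us introduce some notation first. Let $H_\alpha = \{\lambda \in \bar h^* \mid \langle \lambda, \alpha \rangle = 0\}$ be the hyperplane orthogonal to the root $\alpha$. For $w \in \bar W$ let $I(w) = \{ i \mid w(\alpha_i) < 0,\ \alpha_i \in \bar\Pi \}$ and

$$ P_+^{(w)} := \{ \lambda \in P_+ \mid \langle \lambda, \alpha_i \rangle > 0,\ i \in I(w) \} = P_+ \setminus \bigcup_{i \in I(w)} \big( H_{\alpha_i} \cap P_+ \big) = P_+ + \sum_{i \in I(w)} \bar\Lambda_i. $$

The definition (2.4) can be obviously rewritten as

$$ \tilde W^{(+)} = \{ y \in \tilde W \mid y = w\, t_{-\lambda},\ w \in \bar W,\ \lambda \in P_+^{(w)} \}. \tag{2.5} $$

Hence the statement of the proposition follows from

$$ \bigcup_{w \in \bar W} \tilde W^{(+)}\, w = \bigcup_{w \in \bar W} \bigcup_{w' \in \bar W} t_{-w'(P_+^{(w')})}\, w'\, w = \bigcup_{w \in \bar W} t_{-P}\, w = \tilde W, $$

once the following lemma is established:

Lemma 2.2. $P = \bigcup_{w \in \bar W} w(P_+^{(w)})$ is a partition, i.e., a disjoint union.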
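The statement of Lemma 2.2 can be sanity-checked numerically in the sl(3) case. The following sketch is an illustration under stated assumptions, not part of the original argument: weights are written in fundamental-weight coordinates $(a, b)$, the six Weyl group elements are encoded as words in the two simple reflections, and all function names are hypothetical. It counts how many pairs $(w, \lambda)$ with $\lambda \in P_+^{(w)}$ are mapped to each weight.

```python
from itertools import product

def s1(l):
    a, b = l
    return (-a, a + b)   # reflection in alpha_1

def s2(l):
    a, b = l
    return (a + b, -b)   # reflection in alpha_2

# The six elements of the sl(3) Weyl group as words in the simple reflections.
WORDS = [(), (1,), (2,), (1, 2), (2, 1), (1, 2, 1)]
SIMPLE = {1: (2, -1), 2: (-1, 2)}  # simple roots in fundamental-weight coordinates

def act(word, l):
    """Apply the word to l, rightmost letter first."""
    for i in reversed(word):
        l = s1(l) if i == 1 else s2(l)
    return l

def is_negative_root(r):
    """A root is negative iff one of its simple-root coefficients is negative."""
    x, y = r
    return 2 * x + y < 0 or x + 2 * y < 0

def descent_set(word):
    """I(w) = {i | w(alpha_i) < 0}."""
    return {i for i in (1, 2) if is_negative_root(act(word, SIMPLE[i]))}

def cover_counts(bound):
    """Count how often each weight arises as w(lam), lam in P_+^{(w)}, with 0 <= a, b <= bound."""
    counts = {}
    for word in WORDS:
        needed = descent_set(word)
        for lam in product(range(bound + 1), repeat=2):
            if all(lam[i - 1] > 0 for i in needed):
                image = act(word, lam)
                counts[image] = counts.get(image, 0) + 1
    return counts
```

If the lemma holds, every weight reached is reached exactly once, and every weight in a box around the origin is reached once the bound on the dominant weights is large enough.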

To prove the lemma one has to exploit several standard properties of the Weyl group $\bar W$ which can be found e.g., in [10], chapter I.

Proof. Any $\lambda' \in P$ is represented as $\lambda' = w(\lambda)$ for some $\lambda \in P_+$, $w \in \bar W$. Denote $X = X_\lambda = \{ i \mid \langle \lambda, \alpha_i \rangle = 0 \}$. According to Proposition 1.10c of [10] the element $w$ splits uniquely into a product of two elements of $\bar W$, $w = uv$, s.t. $v \in \bar W_X$ (the group generated by the simple reflections labelled by the subset $X$), and $u(\alpha_i) > 0$ for any $i \in X$. We have $v(\lambda) = \lambda$, $I(u) \cap X = \emptyset$, hence $\lambda \in P_+^{(u)}$ and $\lambda' = u(\lambda) \in u(P_+^{(u)})$. This proves that $P$ is covered by the union of subsets $w(P_+^{(w)})$. The uniqueness of the above splitting proves also the disjointness. □

We shall refer to $\tilde W^{(+)}$ (or, equivalently, $\mathcal{P}_+$) as a ‘dominant chamber’. The left action of the group $A$ (the shifted action of $A$) keeps $\tilde W^{(+)}$ ($\mathcal{P}_+$) invariant and $\bigcup_{a \in A} a\, W^{(+)} = \tilde W^{(+)}$. Similarly $\mathcal{P}_+$ is $\mathbb{Z}/n\mathbb{Z}$ graded by the n-ality $\tau(\Lambda = \bar y\, t_{-\lambda} \cdot k\Lambda_0) := \tau(\lambda)$, where $\tau(\lambda) = \sum_i i\, \langle \lambda, \alpha_i \rangle = n\, \langle \lambda, \bar\Lambda_{n-1} \rangle \mod n$ is the standard grading in $P$. In the case sl(3) the chamber $\tilde W^{(+)}$ can be also expressed as $\tilde W^{(+)} = U\, t_{-P_+}$ in terms of the subset $U = \{A, A\, w_0\} \subset \tilde W$, depicted in Fig. 2.

Next we introduce a map $\iota$ of $\tilde W$ (or $\mathcal{P}$) into $Q$,

$$ \iota :\ \tilde W \ni y = \bar y\, t_{-\lambda} \ \mapsto\ n\lambda + \bar y^{-1} \cdot 0 = n\lambda - \sum_{\beta > 0,\ \bar y(\beta) < 0} \beta \ \in Q. \tag{2.6} $$

See [11] (exercise 3.12 of ch. 3) for the last equality. $A$ is mapped by $\iota$ to zero. The map $\iota$ has the “twisted log” property

$$ \iota(xy) = y^{-1}(\iota(x)) + \iota(y). \tag{2.7} $$

Compare with the horizontal projection map

$$ h :\ \tilde W \ni y = \bar y\, t_{-\lambda} \ \mapsto\ h(y) = \overline{y \cdot k\Lambda_0} = \bar y \cdot (-\kappa\lambda) \in \kappa P + \bar W \cdot 0, \qquad h(xy) = h(x) + \bar x(h(y)). \tag{2.8} $$

The map (2.6) provides another equivalent definition of the chamber $\tilde W^{(+)}$. Indeed comparing with (2.5) and using that $1 \le \langle \rho, \alpha \rangle \le n - 1$ for any $\alpha \in \bar\Delta_+$ one easily checks

$$ \tilde W^{(+)} = \{ y \in \tilde W \mid \iota(y) \in P_+ \}. \tag{2.9} $$

The relation (2.7) implies that $\iota$ intertwines between the KW action (equivalent according to (2.3) to the right action of $\bar W$ on $\tilde W$) and the ordinary shifted action of $\bar W$, i.e.,

$$ \iota(yw) = w^{-1} \cdot \iota(y), \qquad w \in \bar W. \tag{2.10} $$

Both subsets of $Q$, the image $\mathrm{Im}(\iota)$ and its complement, are invariant under the shifted action of the Weyl group $\bar W$. The $\bar g$ Verma modules of highest weight $\iota(y)$ are reducible iff the corresponding $g$ Verma modules of highest weight $\Lambda = y \cdot k\Lambda_0$ are reducible and the pattern of embeddings of submodules in both cases is identical.

In Fig. 3 we have illustrated the map $\frac{1}{3}\iota$ for the case g = sl(3). It maps the even (under the gradation $\det(\bar w) = \pm 1$) elements of $W$ to $P$ and the odd elements to $P + \theta/3$ ($\theta = \alpha_1 + \alpha_2$) in such a way that the vertices of the Cayley graph (if we make it into a “rigid” geometrical graph by fixing the length of each edge to be $\sqrt{2}/3$, assuming as usual that the roots of sl(3) are of length $\sqrt{2}$) geometrically “sit” at the same places as the vertices of the two lattices $P$ and $P + \theta/3$. (See the figure.) In other words refining by 3 the weight lattice, $\bar\Lambda_i \to \bar\Lambda_i/3$ (or equivalently, rescaling $\kappa \to \kappa/3$), the points of the Cayley graph can be identified with a subset of the triality zero weights $\{\lambda = \sum_i 3 n_i\, \bar\Lambda_i/3\} \cup \{\lambda = \sum_i (3 n_i + 1)\, \bar\Lambda_i/3\}$ in the refined lattice, while the “excluded” triality zero points $\lambda = \sum_i (3 n_i - 1)\, \bar\Lambda_i/3 \in P - \theta/3$ correspond to the centers of the elementary hexagons.

Fig. 3. The map $\iota$

Remark. The above analysis properly extends to a region larger than $\mathcal{P}$, $\{\Lambda = y \cdot (\lambda' + k\Lambda_0),\ y \in \tilde W,\ \lambda' \in P;\ k \notin \mathbb{Q}\}$. It contains a subset with $y \in \tilde W^{(+)}$, $\lambda' \in P_+$, providing highest weights of “maximally reducible” $g$ Verma modules. For our purposes it is sufficient to choose $\lambda' = 0$ thus restricting to weights $\Lambda$ parametrised by $y \in \tilde W$.

3. Characters

Let us start by recalling the supports and characters of ordinary Verma and finite dimensional modules of $\bar g$. The latter characters are elements of the group ring $\mathbb{Z}[t_P]$ of the group of translations by the weight lattice. Keeping with tradition, the translations $t_{-\lambda}$ from the previous section will be written as formal exponentials $e^{-\kappa\lambda}$; $\kappa = -1$ recovers the standard notation, see, e.g., [3] for standard definitions. The character of the Verma module $V_\lambda$ of highest weight $\lambda$ is given by

$$ \mathrm{ch}(V_\lambda) = \sum_{\mu \in \lambda - Q_+} K^\lambda_\mu\, e^{-\kappa\mu} = e^{-\kappa\lambda} \sum_{\beta \in Q_+} K_\beta\, e^{\kappa\beta} = e^{-\kappa\lambda} \prod_{\alpha \in \bar\Delta_+} (1 - e^{\kappa\alpha})^{-1} = e^{-\kappa(\lambda + \rho)}\, d_\kappa, \tag{3.1} $$

where the multiplicity $K^\lambda_\mu$ of a weight $\mu$ is expressed via the Kostant partition function $K^\lambda_\mu := K_{\lambda - \mu} \in \mathbb{Z}_{\ge 0}$, while $d_\kappa$ is a $\bar W$ invariant (up to a sign) quantity, $w(d_\kappa) = \det(w)\, d_\kappa$ for $w(e^{\kappa\beta}) := e^{\kappa w(\beta)}$. The support of a module is the set of weights $\mu$ of nonzero multiplicity $K^\lambda_\mu$, thus $\mathrm{supp}\, V_\lambda = \lambda - Q_+$. The irreducible finite dimensional modules can be resolved in terms of Verma modules, i.e., each Verma module $V_{w \cdot \lambda}$, with $w \in \bar W$ and $\lambda \in P_+$, contains submodules $V_{w' \cdot \lambda}$ for all $w' > w$ in the Bruhat ordering of $\bar W$ (in a convention in which 1 is the smallest element) and, grading $\bar W$ by the reduced length of words, the Verma module inclusions organize in a BGG (Bernstein–Gelfand–Gelfand) resolution. For the characters $\chi^\lambda$, $\lambda \in P_+$, of irreducible finite dimensional modules the BGG resolution gives immediately the Weyl character formula

$$ \chi^\lambda = \sum_{w \in \bar W} \det(w)\, \mathrm{ch}(V_{w \cdot \lambda}) = d_\kappa \sum_{w \in \bar W} \det(w)\, e^{-\kappa w(\lambda + \rho)} = \sum_{\mu \in P} m^\lambda_\mu\, e^{-\kappa\mu}, \qquad m^\lambda_\mu = \sum_{w \in \bar W} \det(w)\, K_{w \cdot \lambda - \mu}, \tag{3.2} $$

and from $\chi^0 = 1$ the factor $d_\kappa$ is expressed as $1/d_\kappa = \sum_{w \in \bar W} \det(w)\, e^{-\kappa w(\rho)}$. The support (weight diagram) is $\Gamma_\lambda = \{ \mu \in P \mid m^\lambda_\mu \ne 0 \}$. From (3.2) it follows that

$$ w(\chi^\lambda) = \chi^\lambda, \qquad m^\lambda_{w(\mu)} = m^\lambda_\mu, \tag{3.3} $$

and, extending the first line of (3.2) to $\lambda \in P$,

$$ \chi^{w \cdot \lambda} = \det(w)\, \chi^\lambda. \tag{3.4} $$
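The multiplicity formula in (3.2) is easy to evaluate explicitly for sl(3), where the Kostant partition function of $n_1\alpha_1 + n_2\alpha_2$ equals $\min(n_1, n_2) + 1$ for $n_1, n_2 \ge 0$. The sketch below is an illustration only, with hypothetical function names; weights are written in fundamental-weight coordinates $(a, b)$, and the six Weyl group elements are encoded as words in the simple reflections.

```python
# Weyl group of sl(3) as words in the simple reflections; det(w) = (-1)^length.
WORDS = [(), (1,), (2,), (1, 2), (2, 1), (1, 2, 1)]

def reflect(i, l):
    a, b = l
    return (-a, a + b) if i == 1 else (a + b, -b)

def act(word, l):
    """Apply the word to l, rightmost letter first."""
    for i in reversed(word):
        l = reflect(i, l)
    return l

def shifted(word, lam):
    """Shifted action w . lam = w(lam + rho) - rho, with rho = (1, 1)."""
    a, b = act(word, (lam[0] + 1, lam[1] + 1))
    return (a - 1, b - 1)

def kostant(beta):
    """Kostant partition function K_beta of sl(3), beta in fundamental-weight coordinates."""
    x, y = beta
    if (2 * x + y) % 3 or (x + 2 * y) % 3:
        return 0  # beta is not in the root lattice
    n1, n2 = (2 * x + y) // 3, (x + 2 * y) // 3  # beta = n1*alpha_1 + n2*alpha_2
    return min(n1, n2) + 1 if n1 >= 0 and n2 >= 0 else 0

def multiplicity(lam, mu):
    """m^lam_mu = sum_w det(w) K_{w.lam - mu}, cf. (3.2)."""
    total = 0
    for word in WORDS:
        top = shifted(word, lam)
        total += (-1) ** len(word) * kostant((top[0] - mu[0], top[1] - mu[1]))
    return total

def weight_diagram(lam):
    """All weights mu = lam - n1*alpha_1 - n2*alpha_2 of nonzero multiplicity."""
    a, b = lam
    diagram = {}
    for n1 in range(a + b + 1):
        for n2 in range(a + b + 1):
            mu = (a - 2 * n1 + n2, b + n1 - 2 * n2)
            m = multiplicity(lam, mu)
            if m:
                diagram[mu] = m
    return diagram
```

For the adjoint representation, `weight_diagram((1, 1))` contains the six roots with multiplicity one and the zero weight with multiplicity two; the symmetry (3.3) can be checked on the resulting dictionaries.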

The map $\iota$ introduced in (2.6) establishes a correspondence between (highest weights of) Verma modules of the affine algebra $g$ and of the horizontal subalgebra $\bar g$, with identical reducibility structure. Now we shall introduce another class of “Verma modules” $V_y$ (of a yet unknown finite dimensional algebra) described through their supports, i.e., weights and their multiplicities. The supports are parametrised by elements of $aW \subset \tilde W$ (for a fixed $a \in A$) and mapped by $\iota$ into subsets of supports of $\bar g$ Verma modules. Motivated by the intertwining property (2.10) we shall call such a module reducible if its $\iota$ image is a reducible Verma module. In particular the maximally reducible “Verma modules” have highest weights $\Lambda = y \cdot k\Lambda_0$, $y \in \tilde W^{(+)}$, whence the name “dominant chamber” for the latter subset of $\tilde W$. Furthermore in full analogy with the representation theory of $\bar g$ we shall introduce “finite dimensional modules” obtained factorising maximal “Verma submodules” of reducible “Verma modules” by an analog of the BGG resolution in which the role of $\bar W$ is replaced by the action of the KW group, equivalent according to (2.3) to the right action of $\bar W$ on $\tilde W$. We have already explained this construction on the example of g = sl(3) in [9] on the level of multiplicities of weights; here we add a realisation of the characters of these “Verma- and finite dimensional modules”. It recovers the prescribed multiplicities but furthermore allows one to consider a tensor product of “finite dimensional modules” realised by a multiplication of their characters. The result is a new commutative ring which extends the ring of $\bar g$ characters. It will be used in Sects. 6, 7 as a basis for the construction of quantised “q”-characters which realise the fusion rules of the admissible representations at rational level.

We can describe the analogs of the multiplicities in (3.1) via the map $\iota$, so what remains to be done is to generalise the formal exponentials entering (3.1). The naive extension of the map $\iota$ to the characters, thus leading to ordinary formal exponentials with arguments involving the weights $\iota(y)$ (or the horizontal projections $\bar\Lambda = h(y) \in \bar h^*$), is possible but is not consistent (except in the case g = sl(2)) since in general a sum of such weights goes beyond the image of $\iota$. (This is in agreement with the fact that there is no nontrivial subset of the n-ality zero representations of $\bar g$ closed under tensor products.) So our main idea is, instead of exponentials (elements of the group algebra of the group of translations $t_P$) assigned to weights, to consider the elements $w\, e^{-\kappa\lambda} = y \in \tilde W$, i.e., the generating elements of the group algebra of the group $\tilde W$ (the extension of $t_P$ over $\bar W$). (We shall use the same notation for both interpretations denoting as before sometimes the affine translations by formal exponentials.)^1 Unlike the formal exponentials these generating elements do not commute any more (rather they satisfy the multiplication rules of $\tilde W$ in (2.2)), but nevertheless the “finite dimensional module” characters we obtain do commute and multiply according to the rules conjectured in our previous paper.

Now we introduce the supports, $\mathcal{V}_u$ and $\mathcal{G}_u$, and characters, $\mathrm{ch}(V_u)$ and $\chi_u$, of, respectively, “Verma” and “finite dimensional modules”. The supports are certain infinite or finite subsets of the extended affine Weyl group $\tilde W$ while the characters are certain (formal) series or finite sums with integer coefficients of elements of $\tilde W$. Let $\Lambda = y \cdot k\Lambda_0$, $y \in aW$, $a \in A$. We define

$$ \mathcal{V}_y := \{ z \in aW \mid \iota(z) \in \iota(y) - Q_+ \} = \{ xy \mid x \in W,\ \iota(x) \in -y(Q_+) \}. \tag{3.5} $$

Alternatively, using that any $\nu \in Q_+$ can be represented uniquely as $\nu = n\beta + \lambda$, with some $\beta \in Q_+$ and $\lambda \in Q_+/nQ_+ = \{ \sum_{i=1}^{n-1} k_i \alpha_i \mid 0 \le k_i \le n - 1 \}$, we can bring $\mathcal{V}_y$ into the form

$$ \mathcal{V}_y = T^y\, y\, t_{Q_+} = \{ t_{uy(\beta)}\, uy \mid u \in T^y,\ \beta \in Q_+ \}. \tag{3.6} $$

Here the finite subsets $T^y \subset W$, which project horizontally to $\bar W$, are subject to the condition

$$ T^y = \{ u \in W \mid -\bar y^{-1}(\iota(u)) \in Q_+/nQ_+ \}. \tag{3.7} $$

E.g., if an element $u \in T^y$ projects to a reflection $w_\alpha \in \bar W$, $\alpha \in \bar\Delta_+$, then it equals $u = w_\alpha$, or $u = w_\alpha t_{-\alpha} = w_{\delta - \alpha}$, if $y^{-1}(\alpha) \in \bar\Delta_+$, or $y^{-1}(-\alpha) \in \bar\Delta_+$, respectively. Thus the support $\mathcal{V}_y$ in (3.5) naturally generalises the support of a $\bar g$ Verma module, being defined as a collection of $|\bar W|$ “positive” (for a choice of a set of simple roots) root lattice cones $t_{uy(Q_+)}$, applied to $uy \in T^y y$. The two descriptions of $\mathcal{V}_y$ reflected in (3.6) and in (3.5), namely as a collection of supports of ordinary $\bar g$ modules of highest weights $h(uy) = \overline{uy \cdot k\Lambda_0}$, or as the support of a $\bar g$ module of highest weight $\iota(y)$ with some “excluded” points, will be both useful in what follows. Next we define the multiplicity of a weight $z$ through its $\iota$ image,

$$ K^y_z := K_{\iota(y) - \iota(z)} \quad \text{or} \quad K^y_{xy} = K_{-y^{-1}(\iota(x))}, \tag{3.8} $$

1 From now on $w\, e^{\kappa\lambda}$ will stand for the multiplication of two elements in the group ring $\mathbb{Z}[\tilde W]$; accordingly the standard notation for the action of the (horizontal) Weyl group on the formal exponentials, $w(e^{-\kappa\lambda}) = e^{-\kappa w(\lambda)}$, will be replaced by $e^{-\kappa w(\lambda)} = w\, e^{-\kappa\lambda}\, w^{-1}$, cf. (2.2).

where $K_\beta$ is the Kostant partition function in (3.1). Then the characters generalising (3.1) are defined according to

$$ \mathrm{ch}(V_y) := \sum_{\substack{z \in \tilde W,\ z y^{-1} \in W \\ \iota(z) \in \iota(y) - Q_+}} z\, K^y_z = \sum_{\substack{x \in W \\ \iota(x) \in -y(Q_+)}} xy\, K_{-y^{-1}(\iota(x))}. \tag{3.9} $$

Equivalently, using that $-\bar y^{-1}(\iota(x)) = -\bar y^{-1}(\iota(u)) + n\beta$ for $xy = uy\, t_\beta$, and denoting

$$ P_r\, e^{-\kappa\rho}\, d_\kappa := \sum_{\beta \in Q_+} K_{n\beta + r}\, e^{\kappa\beta}, \tag{3.10} $$

we rewrite (3.9) as

$$ \mathrm{ch}(V_y) = \sum_{u \in T^y} uy\, P_{-\bar y^{-1}(\iota(u))}\, e^{-\kappa\rho}\, d_\kappa. \tag{3.11} $$

Multiplying both sides of (3.10) by $e^{\frac{\kappa}{n} r}$, summing over $r \in Q_+/nQ_+$ and making a change of variables $n\beta + r = \beta'$, we recover in the r.h.s. the factor $e^{-\frac{\kappa}{n}\rho}\, d_{\kappa/n}$, cf. (3.1). Thus $P_r$ are polynomials of translations in $Q_+$ determined through the quotient of the standard denominators, $d_{\kappa/n}$ and $d_\kappa$,

$$ \sum_{r \in Q_+/nQ_+} P_r\, e^{\frac{\kappa}{n} r} = \frac{e^{-\frac{\kappa}{n}\rho}\, d_{\kappa/n}}{e^{-\kappa\rho}\, d_\kappa} = \prod_{\alpha > 0}\ \sum_{k_\alpha = 0}^{n-1} e^{\frac{\kappa}{n} k_\alpha \alpha} = \sum_{\mu \in Q_+} {}^{(n)}K_\mu\, e^{\frac{\kappa\mu}{n}}, \tag{3.12} $$

$$ P_r = \sum_{\nu \in Q_+} {}^{(n)}K_{r + n\nu}\, e^{\kappa\nu}. \tag{3.13} $$

The partition function ${}^{(n)}K_\mu$, defined through the last equality in (3.12), is evidently nonzero only for a finite subset of $Q_+$, i.e., the summation in (3.13) is finite. The relations (3.10), (3.13) imply

$$ K_{n\beta + r} = \sum_{\nu \in Q_+} {}^{(n)}K_{n\nu + r}\, K_{\beta - \nu}. \tag{3.14} $$
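The identity (3.14) can be checked directly for sl(3) with $n = 3$. In the sketch below, which is an illustration with hypothetical function names, a root lattice element $n_1\alpha_1 + n_2\alpha_2$ is represented by the pair $(n_1, n_2)$, the positive roots are $\alpha_1$, $\alpha_2$ and $\theta = \alpha_1 + \alpha_2$, and the truncated partition function of (3.12) counts decompositions in which each positive root is used at most $n - 1$ times.

```python
from itertools import product

def kostant(n1, n2):
    """Kostant partition function of sl(3) at n1*alpha_1 + n2*alpha_2."""
    return min(n1, n2) + 1 if n1 >= 0 and n2 >= 0 else 0

def truncated(n1, n2, n=3):
    """(n)K_mu of (3.12): ways to write mu = k1*alpha_1 + k2*alpha_2 + k3*theta, 0 <= k_i <= n-1."""
    return sum(1 for k1, k2, k3 in product(range(n), repeat=3)
               if k1 + k3 == n1 and k2 + k3 == n2)

def lhs_314(beta, r, n=3):
    """K_{n*beta + r}."""
    return kostant(n * beta[0] + r[0], n * beta[1] + r[1])

def rhs_314(beta, r, n=3):
    """sum over nu in Q_+ of (n)K_{n*nu + r} K_{beta - nu}; only nu <= beta contributes."""
    return sum(truncated(n * v1 + r[0], n * v2 + r[1], n) * kostant(beta[0] - v1, beta[1] - v2)
               for v1 in range(beta[0] + 1) for v2 in range(beta[1] + 1))
```

Evaluating `truncated` on a fixed residue class $r$ also reproduces the coefficients of the polynomials $P_r$ in (3.13); for instance the class of $r = 0$ gives the coefficients 1 (at $\nu = 0$) and 2 (at $\nu = \theta$).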

We have the symmetry properties

$$ \mathrm{ch}(V_{ay}) = a\, \mathrm{ch}(V_y), \qquad a^{-1}\, T^{ay}\, a = T^y, \qquad P_{-(ay)^{-1}(\iota(a u a^{-1}))} = P_{-\bar y^{-1}(\iota(u))}, \tag{3.15} $$

using that $\iota(a) = 0$, $\iota(a^{-1} x a) = \bar a^{-1}(\iota(x))$ for $a \in A$.

Now consider $\Lambda = y \cdot k\Lambda_0 \in \mathcal{P}_+$, $y \in \tilde W^{(+)}$, hence $\iota(y) \in P_+$. According to our definition $V_y$ is reducible with submodules $V_{yw}$, $w \in \bar W$, since $V_{\iota(y)}$ is reducible with submodules $V_{w^{-1} \cdot \iota(y)}$. In parallel with (3.2) we define the characters of “finite dimensional modules” by a “resolution” formula with respect to the KW Weyl group,

$$ \chi_y := \sum_{w' \in \bar W^{[\Lambda]}} \det(\bar w')\, \mathrm{ch}(V_{w' y}) = \sum_{w \in \bar W} \det(w)\, \mathrm{ch}(V_{yw}) = \sum_{z \in \tilde W,\ z y^{-1} \in W} m^y_z\, z. \tag{3.16} $$

Fig. 4. The weight diagram $\mathcal{G}_{w_{1210}}$, $w_{1210} = t_{-\theta}$, with weight multiplicities indicated in the circles

We extend this definition to the whole $\tilde W$, i.e.,

$$ \chi_{yw} = \det(w)\, \chi_y, \qquad y \in \tilde W^{(+)},\ w \in \bar W. \tag{3.17} $$

Using (3.8) and the intertwining property (2.10) of the map $\iota$, (3.16) gives for the multiplicities $m^y_z$ (cf. (3.2))

$$ m^y_z = \sum_{w \in \bar W} \det(w)\, K^{yw}_z = \sum_{w \in \bar W} \det(w)\, K_{\iota(yw) - \iota(z)} = \sum_{w \in \bar W} \det(w)\, K_{w \cdot \iota(y) - \iota(z)} = m^{\iota(y)}_{\iota(z)}. \tag{3.18} $$

Having an explicit description for the multiplicities we can introduce the supports $\mathcal{G}_y$ of “finite dimensional modules” as $\mathcal{G}_y = \{ z \in \tilde W \mid m^y_z \ne 0 \}$. From (3.18) and from the definition of the map $\iota$ it follows that these “generalised weight diagrams” have the structure of n-ality zero sl(n) weight diagrams $\Gamma_{\iota(y)}$ with the points $\mu \notin \mathrm{Im}(\iota)$ excluded. An sl(3) example is illustrated in Fig. 4, see [9] for more examples.

Let us point out some symmetry properties of the generalised weight diagrams. We have from (3.15)

$$ \chi_{ay} = a\, \chi_y, \qquad a \in A, \tag{3.19} $$

which implies that $m^{ay}_{az} = m^y_z$. The invariance of the $\bar g$ multiplicities ($m^{\iota(y)}_{\bar a(\iota(z))} = m^{\iota(y)}_{\iota(z)} = m^{\iota(y)}_{\iota(a z a^{-1})}$) implies

$$ m^y_{a^{-1} z a} = m^y_z, \qquad a \in A, \tag{3.20} $$

and hence $a\, \chi_y\, a^{-1} = \chi_y$. For $z = a\, t_{-\mu}$, $a \in A$, the symmetry property (3.20) extends to

$$ m^y_{a\, t_{-w(\mu)}} = m^y_{a\, t_{-\mu}}, \qquad w \in \bar W, \tag{3.21} $$

Extension of Character Ring of sl(3) and Its Quantisation

Fig. 5. The beginning of the "Verma" support $V_{t_{-\lambda}}$ with multiplicities indicated in circles

Fig. 6. The beginning of the "Verma" support $V_{w_0 t_{-\lambda}}$ with multiplicities in circles

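using once again (2.7) to obtain $\iota(a w a^{-1} z w^{-1}) = w(\iota(z))$ as well as the invariance of the g multiplicities $m^{\iota(y)}_{w(\iota(z))} = m^{\iota(y)}_{\iota(z)}$ for any $w \in W$.

Now we examine the case g = sl(3) in detail. The 6 element sets $T^y$ are $T^{\bar a} = a\,W a^{-1}$, while $T^{\bar a w_\theta} = a\,T^{w_\theta} a^{-1}$, and $T^{w_\theta} = \{1, w_{010}, w_{020}, w_{10}, w_{20}, w_0\}$, see Figs. 5, 6, where the two basic supports are "visualized" as certain subgraphs of the Cayley graph of $W$. The set of polynomials $P_{-\bar y^{-1}(\iota(u))}$ in (3.11) associated with $T^y$ and the corresponding characters for $y = 1, w_\theta$ read

$$\mathrm{ch}(V_{t_{-\lambda}}) = \sum_{w \in W} w\, P_{-\iota(w)}\, e^{-\kappa(\lambda+\rho)}\, d_\kappa$$
$$= \Big( (1 + 2e^{\kappa\theta}) + w_{12}(2 + e^{\kappa\alpha_1}) + w_{21}(2 + e^{\kappa\alpha_2}) + w_1(1 + e^{\kappa\theta} + e^{\kappa\alpha_2}) + w_2(1 + e^{\kappa\theta} + e^{\kappa\alpha_1}) + 3 w_\theta \Big)\, e^{-\kappa(\lambda+\rho)}\, d_\kappa, \tag{3.22}$$

$$\mathrm{ch}(V_{w_\theta t_{-\lambda}}) = \sum_{u \in T^{w_\theta}} u\, w_\theta\, P_{-w_\theta^{-1}(\iota(u))}\, e^{-\kappa(\lambda+\rho)}\, d_\kappa$$
$$= \Big( w_0 w_\theta (2 + e^{\kappa\theta}) + w_{010} w_\theta (1 + 2e^{\kappa\alpha_2}) + w_{020} w_\theta (1 + 2e^{\kappa\alpha_1}) + w_{20} w_\theta (2 + e^{\kappa\alpha_1}) + w_{10} w_\theta (2 + e^{\kappa\alpha_2}) + w_\theta (1 + 2e^{\kappa\theta}) \Big)\, e^{-\kappa(\lambda+\rho)}\, d_\kappa. \tag{3.23}$$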

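Now we rewrite the "finite dimensional module" characters (3.16) expressing them in terms of the ordinary sl(3) characters, the elements of $A$ and the "class" element

$$F \equiv \sum_{a \in A} a\, w_0\, a^{-1} = w_0 + w_1 + w_2. \tag{3.24}$$

Proposition 3.1. For any $x = \bar x\, t_{-\lambda} = t_{-\nu}\, \bar x \in \tilde W^{(+)}$, $\nu = \bar x(\lambda)$, we have

$$\chi_x = \det(\bar x) \Big( \chi_{\nu} + \gamma\, \chi_{\nu - 2\Lambda_1} + \gamma^{-1} \chi_{\nu - 2\Lambda_2} + (F + 2) \big( \chi_{\nu - \rho} + \gamma\, \chi_{\nu - \Lambda_2} + \gamma^{-1} \chi_{\nu - \Lambda_1} \big) \Big)$$
$$= \chi_{\lambda + \bar x^{-1}\cdot 0} + \gamma\, \chi_{\lambda + \bar x^{-1}\cdot(-2\Lambda_1)} + \gamma^{-1} \chi_{\lambda + \bar x^{-1}\cdot(-2\Lambda_2)} + (F + 2) \big( \chi_{\lambda + \bar x^{-1}\cdot(-\rho)} + \gamma\, \chi_{\lambda + \bar x^{-1}\cdot(-\Lambda_2)} + \gamma^{-1} \chi_{\lambda + \bar x^{-1}\cdot(-\Lambda_1)} \big). \tag{3.25}$$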

Proof. First the "Verma module" characters (3.22), (3.23) can be rewritten for any $x = \bar x\, t_{-\mu} \in \tilde W$ as (recall that $\gamma = w_{12}\, e^{-\kappa\Lambda_2}$)

$$\mathrm{ch}(V_x) = \Big( (1 + \gamma\, e^{\kappa \bar x^{-1}(2\Lambda_1)} + \gamma^{-1} e^{\kappa \bar x^{-1}(2\Lambda_2)}) + (F + 2)(e^{\kappa \bar x^{-1}(\rho)} + \gamma\, e^{\kappa \bar x^{-1}(\Lambda_2)} + \gamma^{-1} e^{\kappa \bar x^{-1}(\Lambda_1)}) \Big)\, d_\kappa\, e^{-\kappa(\mu + \bar x^{-1}(\rho))}. \tag{3.26}$$

Next apply the "resolution" formula (3.16) in the form

$$\chi_x = \sum_{w' \in W^{[x]}} \det(\bar w')\, \mathrm{ch}(V_{w'x}) = \sum_{w \in W} \det(w)\, \mathrm{ch}(V_{\bar x w\, t_{-w^{-1}(\lambda)}}) = \det(\bar x) \sum_{w \in W} \det(w)\, \mathrm{ch}(V_{w\, t_{-w^{-1}(\bar x(\lambda))}}).$$

To obtain the second equality in (3.25), i.e., to express the characters in terms of sl(3) characters $\chi_\lambda$ with $\lambda \in P_+$, we have used (3.4), so that each term $\det(\bar x)\, \chi_{\nu - \nu_a}$ turns into $\chi_{\lambda + \bar x^{-1}\cdot(-\nu_a)}$. □

Taking into account the same property (3.4) (which in particular implies that $\chi_\lambda = 0$ for weights on the shifted reflecting hyperplanes, i.e., $w_\alpha \cdot \lambda = \lambda$ for some $\alpha \in \Delta_+$), some terms in the general expression (3.25) may vanish or compensate each other for some "boundary" weights. The character formula (3.25) can also be rewritten as a decomposition over $W$, or $U$,

$$\chi_y = \sum_{w \in W} w\, B^{(w)}_y = \sum_{u \in U} u\, C^{(u)}_y,$$

with some coefficients $B^{(w)}_y$ ($C^{(u)}_y$) given by linear combinations of sl(3) characters and formal exponentials. From this decomposition the

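multiplicity $m^y_z$ in (3.18) can be rewritten as a sum of sl(3) multiplicities. E.g., rewrite (3.25) for the character $\chi_x$:

$$\chi_x = \sum_{\mu \in P} P^\lambda_\mu\, e^{-\kappa\mu}$$
$$= \sum_{a \in A,\ \mu \in P} \Big( \big( m_\mu^{\lambda + \bar x^{-1}\cdot(-\nu_a)} + 2\, m_\mu^{\lambda + \bar x^{-1}\cdot(-\nu'_a)} \big)\, a + m_\mu^{\lambda + \bar x^{-1}\cdot(-\nu'_a)}\, F a \Big)\, e^{-\kappa\mu}$$
$$= \sum_{a \in A,\ \mu \in P} \big( m_\mu^{\lambda + \bar x^{-1}\cdot(-\nu_a)} + 2\, m_\mu^{\lambda + \bar x^{-1}\cdot(-\nu'_a)} \big)\, a\, e^{-\kappa\mu} + \sum_{a \in A,\ \mu \in P}\ \sum_{b \in A} m_{\mu + \mu_{b,a}}^{\lambda + \bar x^{-1}\cdot(-\nu'_b)}\, a\, w_0\, e^{-\kappa\mu}$$
$$= \sum_{z = a t_{-\mu},\ a \in A,\ \mu \in P} m^x_z\, a\, e^{-\kappa\mu} + \sum_{z = a w_0 t_{-\mu},\ a \in A,\ \mu \in P} m^x_z\, a\, w_0\, e^{-\kappa\mu}, \tag{3.27}$$

where the weights $\nu_a, \nu'_a \in P$ can be read from (3.25), and the polynomials $P^\lambda_\mu = P^\lambda_{w(\mu)}$ are invariant under the nonshifted action of $W$. The weights $\mu_{b,a} \in P_+$ result from the resummation of the "odd" term in (3.25), setting $w_j a = b\, w_0\, t_{-\mu_{a,b}}$, $b \in A$ (since $\tilde W = U t_P$), which allows comparing the second and the third lines in (3.27) to get an expression for the multiplicities $m^x_z$ in terms of the sl(3) ones. Let us give some examples,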

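$$\chi_a = a, \quad a \in A,$$
$$\chi_{w_0} = 2 + w_0 + w_1 + w_2 = 2 + F,$$
$$\chi_f \equiv \chi_{t_{-\bar\Lambda_1}} = \chi_{\bar\Lambda_1} + (1 + F)\, \gamma^{-1} = \chi_{\gamma^{-1}}\, \chi_{w_{20}} = \gamma^{-1} (w_{20} + w_{12} + w_{01} + w_0 + w_1 + w_2 + 1),$$
$$\chi_{f^*} \equiv \chi_{t_{-\bar\Lambda_2}} = \chi_{\bar\Lambda_2} + (1 + F)\, \gamma = \chi_\gamma\, \chi_{w_{10}} = \gamma\, (w_{10} + w_{21} + w_{02} + w_0 + w_1 + w_2 + 1). \tag{3.28}$$

Finally we define dimensions of the "finite dimensional modules" by setting in (3.16) every (generating) element $w\, e^{-\kappa\mu}$ in $\mathbb Z[\tilde W]$ to 1, i.e., for $y = \bar y\, t_{-\lambda} \in \tilde W^{(+)}$,

$$D_y = \sum_{z \in \tilde W} m^y_z. \tag{3.29}$$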

Using the decomposition formula (3.25) the dimension can be expressed as a sum of dimensions $D^\lambda$ of sl(3) finite dimensional representations; the final expression can be cast into the form

$$D_y = 9 \prod_{\alpha > 0} \Big\langle \bar y(\lambda) + \frac{\rho}{3},\ \bar y(\alpha) \Big\rangle + \frac{\det(\bar y)}{3} = \frac{1}{3} \prod_{\alpha > 0} \langle \iota(y) + \rho, \alpha \rangle + \frac{\det(\bar y)}{3} = \frac{2}{3}\, D^{\iota(y)} + \frac{\det(\bar y)}{3}. \tag{3.30}$$

Since $\iota(a) = 0$, $a \in A$, we have $D_y = D_{ay}$. Here are the first few numbers produced from (3.30): 1, 5, 7, 19, 23, 43, 83, 103, . . . .
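The last form of (3.30) is easy to evaluate numerically. The following is a minimal sketch, assuming the standard Weyl dimension formula for sl(3) in Dynkin labels; the function names and the list of sample pairs ($\iota(y)$, $\det(\bar y)$) are illustrative assumptions chosen to reproduce the numbers quoted above, not data taken from the paper.

```python
from fractions import Fraction


def sl3_dim(a, b):
    """Weyl dimension of the sl(3) irrep with Dynkin labels (a, b)."""
    return Fraction((a + 1) * (b + 1) * (a + b + 2), 2)


def extended_dim(iota, det_y):
    """D_y from the last form of (3.30): (2 * D^{iota(y)} + det(y_bar)) / 3."""
    a, b = iota
    return (2 * sl3_dim(a, b) + det_y) / 3


# Hypothetical sample pairs (iota(y) in Dynkin labels, det(y_bar)).
# They are assumptions chosen for illustration only.
SAMPLES = [((0, 0), 1), ((1, 1), -1), ((3, 0), 1), ((6, 0), 1),
           ((4, 1), -1), ((3, 3), 1), ((4, 4), -1), ((3, 6), 1)]

dims = [extended_dim(weight, sign) for weight, sign in SAMPLES]
print([int(d) for d in dims])  # [1, 5, 7, 19, 23, 43, 83, 103]
```

In this sketch the sign $\det(\bar y)$ is an explicit input. Integrality of $D_y$ forces $\det(\bar y) \equiv D^{\iota(y)} \pmod 3$, so weights with $D^{\iota(y)}$ divisible by 3 cannot occur.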


Remark. In the case g = sl(2) we have $\iota(W) \equiv Q = 2P$ and $\tilde W^{(+)} \equiv A\, t_{-P_+}$. The supports of the "Verma modules" and the weight diagrams $G_y$ are isomorphic to the supports of g = sl(2) Verma modules $V_{\iota(y)}$ and weight diagrams $\Gamma_{\iota(y)}$, respectively. Alternatively the map $\frac{1}{2}\iota$ recovers the supports of modules of highest weight $\frac{1}{2}\iota(a\, t_{-\lambda}) = \lambda \in P_+$ of the superalgebra osp(1|2), with $a = 1, \gamma$ labelling the two types of modules for a given $\lambda$. One has $\chi_{a t_{-\lambda}} = a\,(\chi_\lambda + \gamma\, \chi_{\lambda - \alpha/2})$ and we can replace $\gamma$ simply by a sign $= -1$, thus recovering the supercharacters of the finite dimensional representations of osp(1|2). The formula (3.30) is replaced by $D_y = \langle \iota(y) + \rho, \alpha \rangle = 2\, \langle \lambda + \frac{\rho}{2}, \alpha \rangle$, $y = a\, t_{-\lambda}$.

4. Character Ring and Its Structure Constants

Denote by $\tilde{\mathfrak W}$ the ring of "characters of finite dimensional modules", the subring of $\mathbb Z[\tilde W]$ generated by the set of characters $\{\chi_y \mid y \in \tilde W^{(+)}\}$ with $\chi_y$ defined in (3.16). Thus the multiplication in $\tilde{\mathfrak W}$ is inherited from the multiplication in the group ring $\mathbb Z[\tilde W]$. The main aim of this section is to obtain a formula for the structure constants of the ring.

Proposition 4.1. The ring $\tilde{\mathfrak W}$ is a commutative ring.

Proof. The statement follows from the fact that $A$ is commutative and commutes with $F$, and (3.3) is equivalent to $w\, \chi_\lambda\, w^{-1} = \chi_\lambda$, i.e., the ordinary characters commute with elements of $\tilde W$. □

Next we have

Lemma 4.2.

$$\chi_{\Lambda_1}\, \chi_y = \sum_{i=1,2,3} \chi_{t_{-e_i} y}, \qquad \chi_{\Lambda_2}\, \chi_y = \sum_{i=1,2,3} \chi_{t_{e_i} y}, \qquad F\, \chi_y = \sum_{j=0,1,2} \chi_{w_j y}. \tag{4.1}$$

Proof. The first two of these equalities (in which $\sum_i e_i = 0$, $e_1 = \Lambda_1$, $e_3 = -\Lambda_2$) follow from the decomposition (3.25) of $\chi_y$ in sl(3) characters and the analogous multiplication rules of the latter with the characters of the fundamental representations $\Lambda_i$; recall that $\chi_{\Lambda_1} = \sum_i e^{-\kappa e_i}$. The derivation of the third is based on the straightforward relation

$$F^2 = 3 + \gamma\, \chi_{\bar\Lambda_1} + \gamma^{-1} \chi_{\bar\Lambda_2}, \tag{4.2}$$

and the use of (3.19) and the first two of the equalities in (4.1), taking into account the splitting of (3.25) into "even" and "odd" part, $\chi_y = \chi^{(+)}_y + F\, \chi^{(-)}_y$, $\chi^{(\pm)}_y$ being linear combinations of elements of $A$ with sl(3) characters as coefficients (or, elements in the group ring $\mathcal W[A]$ of $A$, over the ring $\mathcal W$ of sl(3) characters). □

The quantities in (4.1) are the basic ingredients of the examples (3.28) in the previous section and thus from the above lemma and from (3.19) we have

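Corollary 4.3.

$$\chi_\gamma\, \chi_y = \chi_{\gamma y},$$
$$\chi_{w_0}\, \chi_y = 2\chi_y + \chi_{w_0 y} + \chi_{w_1 y} + \chi_{w_2 y} = \sum_{x \in G_{w_0}} m^{w_0}_x\, \chi_{x y} = \sum_{z \in G_{w_0} y} m^{w_0}_{z y^{-1}}\, \chi_z,$$
$$\chi_f\, \chi_y = \chi_{f y} + \chi_{w_1 f y} + \chi_{w_{21} f y} + \chi_{w_{121} f y} + \chi_{w_{0121} f y} + \chi_{w_{021} f y} + \chi_{w_{2021} f y} = \sum_{z \in G_f} \chi_{z y} = \sum_{z \in G_f y} \chi_z, \quad f = t_{-\Lambda_1}, \tag{4.3}$$
$$\chi_{f^*}\, \chi_y = \chi_{f^* y} + \chi_{w_2 f^* y} + \chi_{w_{12} f^* y} + \chi_{w_{212} f^* y} + \chi_{w_{0212} f^* y} + \chi_{w_{012} f^* y} + \chi_{w_{1012} f^* y} = \sum_{z \in G_{f^*}} \chi_{z y} = \sum_{z \in G_{f^*} y} \chi_z, \quad f^* = t_{-\Lambda_2}.$$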

These Pieri type formulæ hold for generic $y$; in general there could be cancellations on KW orbits due to (3.17). In (4.3) the shifted weight diagram $G_f\, y$, consisting generically of 7 points, appears, thus we recover the multiplication rule of the "fundamental" representation $f$ obtained in [9]. We recall that it was found solving the null-decoupling equations resulting from a pair of singular vectors of weight $w_{f(\alpha_i)} \cdot \Lambda$, $i = 1, 2$ (i.e., $w_{\delta+\alpha_1} \cdot \Lambda$, $w_{\alpha_2} \cdot \Lambda$), in the g Verma module of highest weight $\Lambda = t_{-\Lambda_1} \cdot k\Lambda_0$. Similarly the general property (3.19) (first line in (4.3)) was confirmed in [9] in the case sl(3) by analysing the decoupling conditions corresponding to singular vectors of weight $w_{\gamma(\alpha_i)} \cdot \Lambda$, i.e., $w_0 \cdot \Lambda$, $w_2 \cdot \Lambda$, in the g Verma module of highest weight $\Lambda = \gamma \cdot k\Lambda_0$. The second of the product rules in (4.3) appeared in [9] as a consequence of a conjectured general Weyl-Steinberg type formula for the structure constants. We shall now prove that this formula indeed holds for the multiplication of the characters constructed in the previous section. We proceed in full analogy with the proof of its standard sl(3) analog. First we establish

Lemma 4.4. For any $x, y \in W^{(+)}$ we have

$$\chi_x\, \chi_y = \sum_{z \in W} m^x_z\, \chi_{z y} = \sum_{z \in W \cap G_x y} m^x_{z y^{-1}}\, \chi_z. \tag{4.4}$$

The formula (4.4) is the analog of the sl(3) geometric algorithm for the decomposition of the product, namely one takes the generalized "weight diagram" $G_x$ and "translates" it by $y$. The proof of Lemma 4.4 relies on the decomposition (3.25) of the characters $\chi_y$ in terms of ordinary characters $\chi$. (Below we will give an alternative proof.) Recall that the proof of the standard analogs of (4.4) is based on the invariance of the multiplicities $m^\nu_{w(\mu)} = m^\nu_\mu$ (3.3) under the ordinary (unshifted) action of $W$, so that $\chi_\lambda = \sum_{\mu \in P_+} \bar m^\lambda_\mu \sum_{w \in W} e^{-\kappa w(\mu)}$, and the property

$$\sum_{w \in W} e^{-\kappa w(\mu)}\, \chi_\lambda = \sum_{w \in W} \chi_{w(\mu) + \lambda}. \tag{4.5}$$

Let us now prove Lemma 4.4.

Proof. Using that $\sum_{w \in W} e^{-\kappa w(\mu)}$ commutes with any $w \in \tilde W$, (4.5) implies

$$\sum_{w \in W} e^{-\kappa w(\mu)}\, \chi_y = \sum_{w \in W} \chi_{t_{-w(\mu)} y}. \tag{4.6}$$

Now we start from the decomposition (3.27) for $\chi_x$ and we insert (4.6) in $\chi_x \chi_y$ using furthermore (4.3),

$$\chi_x\, \chi_y = \sum_{\mu \in P} P^\lambda_\mu\, e^{-\kappa\mu}\, \chi_y = \sum_{\mu \in P_+} P^\lambda_\mu \sum_{w \in W} e^{-\kappa w(\mu)}\, \chi_y = \sum_{\mu \in P_+} P^\lambda_\mu \sum_{w \in W} \chi_{t_{-w(\mu)} y} = \sum_{\mu \in P} P^\lambda_\mu\, \chi_{t_{-\mu} y}$$
$$= \sum_{a \in A,\ \mu \in P} \Big( \big( m_\mu^{\lambda + \bar x^{-1}\cdot(-\nu_a)} + 2\, m_\mu^{\lambda + \bar x^{-1}\cdot(-\nu'_a)} \big)\, \chi_{a t_{-\mu} y} + m_\mu^{\lambda + \bar x^{-1}\cdot(-\nu'_a)} \sum_{j=0,1,2} \chi_{w_j a t_{-\mu} y} \Big)$$
$$= \sum_{a \in A,\ \mu \in P} \Big( \big( m_\mu^{\lambda + \bar x^{-1}\cdot(-\nu_a)} + 2\, m_\mu^{\lambda + \bar x^{-1}\cdot(-\nu'_a)} \big)\, \chi_{a t_{-\mu} y} + \sum_{b \in A} m_{\mu + \mu_{b,a}}^{\lambda + \bar x^{-1}\cdot(-\nu'_b)}\, \chi_{a w_0 t_{-\mu} y} \Big), \tag{4.7}$$

repeating the resummation in the second line of (3.27); it remains to use the last line in (3.27) to recover (4.4).² □

The lemma extends to $x, y \in \tilde W^{(+)}$ using (3.19), (3.20). Next we have

Proposition 4.5. Let $x \in a\, W^{(+)} \subset \tilde W^{(+)}$, $y \in a' W^{(+)}$, $a, a' \in A$. Then

$$\chi_x\, \chi_y = \sum_{z \in a a' W^{(+)}} N^z_{x,y}\, \chi_z, \tag{4.8}$$

$$N^z_{x,y} = \sum_{w' \in W^{[z]}} \det(w')\, m^x_{w' z y^{-1}} = \sum_{w \in W} \det(w)\, m^x_{z w y^{-1}} = \sum_{w \in W} \det(w)\, m^{\iota(x)}_{\iota(z w y^{-1})}. \tag{4.9}$$

Note that the summation in (4.8) runs effectively over the shifted weight diagram $\tilde W^{(+)} \cap G_x\, y$ (of "shifted highest weight" $xy$) since from the expression of the structure constants $N^z_{x,y}$ it follows that $zw \in G_x\, y$ for any $w \in W$. To make contact with the notation in [9], where we have used the horizontal projections of the weights $\Lambda_y = y \cdot k\Lambda_0$, note that the "shifted highest weight" was denoted in [9] by $\Lambda_x \circ \Lambda_y = h(x) \circ h(y) := h(x) + \bar x(h(y))$, which coincides, according to (2.8), with $h(xy)$.

² Strictly speaking the summation over $W$ in the first line of (4.7), which is rather a summation over orbits, should contain an additional factor for weights $\mu$ with nontrivial stationary subgroup of $W$; the same remark applies to the summation in the last line before (4.5).

Proof. Using Lemma 4.4 it remains to account for cancellations on KW orbits due to (3.17), using that $\tilde W^{(+)}$ is a fundamental domain in $\tilde W$ with respect to the KW Weyl group. □

Finally we prove another property announced in [9]. The structure constants $N^z_{x,y}$ of the character ring $\tilde{\mathfrak W}$ can be expressed by the structure constants of the sl(3) character ring $\mathcal W$ through the map $\iota$:

Proposition 4.6.

$$N^z_{x,y} = \bar N^{\iota(z)}_{\iota(x)\, \iota(y)}. \tag{4.10}$$

Proof. Using (2.7) we have $\iota(z w y^{-1}) = \bar y(\iota(zw)) + \iota(y^{-1}) = \bar y(\iota(zw) - \iota(y))$. Hence

$$m^x_{z w y^{-1}} = m^{\iota(x)}_{\iota(z w y^{-1})} = m^{\iota(x)}_{\bar y(\iota(zw) - \iota(y))} = m^{\iota(x)}_{\iota(zw) - \iota(y)} = m^{\iota(x)}_{w^{-1}\cdot\iota(z) - \iota(y)}$$

(using (2.10) in the last step), which inserted in (4.9) converts it into the classical Weyl–Steinberg formula for the sl(3) tensor product multiplicities $\bar N^{\iota(z)}_{\iota(x)\, \iota(y)}$. □

Corollary 4.7. The structure constants $N^z_{x,y}$ are nonnegative integers.
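The classical Weyl–Steinberg formula to which (4.9) reduces can be checked on small sl(3) examples. The sketch below is an assumption-laden illustration and not the authors' computation: weights are written in Dynkin labels, weight multiplicities are obtained from Kostant's formula with the sl(3) partition function, and all function names are hypothetical.

```python
RHO = (1, 1)


def s1(w):
    """Simple reflection w_1 on Dynkin labels."""
    a, b = w
    return (-a, a + b)


def s2(w):
    """Simple reflection w_2 on Dynkin labels."""
    a, b = w
    return (a + b, -b)


# The six elements of the horizontal Weyl group with their determinants.
WEYL = [
    (lambda w: w, 1),
    (s1, -1),
    (s2, -1),
    (lambda w: s1(s2(w)), 1),
    (lambda w: s2(s1(w)), 1),
    (lambda w: s1(s2(s1(w))), -1),
]


def add(u, v):
    return (u[0] + v[0], u[1] + v[1])


def sub(u, v):
    return (u[0] - v[0], u[1] - v[1])


def kostant(beta):
    """Number of ways to write beta as a sum of positive roots of sl(3)."""
    a, b = beta
    if (2 * a + b) % 3 or (a + 2 * b) % 3:
        return 0  # beta is not in the root lattice
    n1, n2 = (2 * a + b) // 3, (a + 2 * b) // 3  # beta = n1*alpha1 + n2*alpha2
    return min(n1, n2) + 1 if n1 >= 0 and n2 >= 0 else 0


def mult(lam, mu):
    """Multiplicity of the weight mu in the irrep of highest weight lam."""
    return sum(det * kostant(sub(w(add(lam, RHO)), add(mu, RHO)))
               for w, det in WEYL)


def tensor(lam, mu, nu):
    """Weyl-Steinberg: multiplicity of nu in lam (x) mu, via the shifted action on nu."""
    return sum(det * mult(lam, sub(sub(w(add(nu, RHO)), RHO), mu))
               for w, det in WEYL)


def dim(lam):
    a, b = lam
    return (a + 1) * (b + 1) * (a + b + 2) // 2
```

For instance, `tensor((1, 1), (1, 1), (1, 1))` gives the multiplicity of the adjoint representation in its own square, and summing `tensor(lam, mu, nu) * dim(nu)` over dominant `nu` should reproduce `dim(lam) * dim(mu)`.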

We introduce an involution in $\tilde W$ induced by the $\mathbb Z_2$ automorphism of the g Dynkin diagram $\alpha_i \to \alpha_i^* := \alpha_{n-i}$, $i = 1, 2, \dots, n-1$, according to $w_\alpha^* = w_{\alpha^*}$, $t_\lambda^* = t_{\lambda^*}$, $\langle \lambda^*, \alpha_i \rangle = \langle \lambda, \alpha_i^* \rangle$, and then $\iota(x^*) = (\iota(x))^*$. The involution extends to an automorphism of the ring $\tilde{\mathfrak W}$ with $\chi_y^* = \chi_{y^*}$. Proposition 4.6 implies the standard properties of the structure constants

$$N^1_{x,y} = \delta_{x, y^*}, \qquad N^z_{x,y} = N^{y^*}_{x, z^*} = N^{z^*}_{x^*, y^*}, \tag{4.11}$$

along with

$$N^z_{x,y} = N^{az}_{ax, y}, \qquad a \in A. \tag{4.12}$$

Remark. The "classical" dimensions $D_x = D_{x^*}$ (3.29) provide a numerical realisation of the product rule (4.8),

$$D_x\, D_y = \sum_z N^z_{x,y}\, D_z = \sum_z \bar N^{\iota(z)}_{\iota(x)\, \iota(y)}\, D_z.$$

The ring $\tilde{\mathfrak W}$ has a set of "fundamental" characters that generate it as a polynomial ring. One possibility for this fundamental set is $\{\chi_{w_0}, \chi_f, \chi_{f^*}, \chi_\gamma = \gamma\}$. The group $A$ is like a set of simple currents or, in the language of rings, it is a group of units of the ring $\tilde{\mathfrak W}$. Thus $\tilde{\mathfrak W} = \mathfrak W[A]$, where $\mathfrak W$ is the triality zero subring of $\tilde{\mathfrak W}$ having as a fundamental set $\{\chi_{w_0}, \chi_{w_{20}}, \chi_{w_{10}}\} = \{\chi_{w_0}, \gamma\, \chi_f, \gamma^{-1} \chi_{f^*}\}$. It is convenient to introduce the notation $f_0 = \chi_{w_0}$, $f_1 = \chi_{w_{20}}$, $f_2 = \chi_{w_{10}}$. The weight diagrams $G_{f_j}$, $j = 0, 1, 2$, are depicted in Fig. 7. Since $\chi_{\bar\Lambda_1} = \gamma^{-1} \sum_{a \in A} a\, \bar\gamma\, a^{-1}$ and $\chi_{\bar\Lambda_2} = \gamma \sum_{a \in A} a\, \bar\gamma^{-1} a^{-1}$, the relations (4.1) read equivalently

$$\sum_{a \in A} a\, x\, a^{-1}\, \chi_y = \sum_{a \in A} \chi_{a x a^{-1} y}$$

Fig. 7. The supports of the three "fundamentals"

for any $y \in \tilde W$ and $x = w_0, w_{10}, w_{20}$. Similarly the Pieri type formulæ (4.3) can be rewritten for the triality zero counterparts $\{f_j,\ j = 0, 1, 2\}$. With the shorthand notation $M^0_u = m^{w_0}_u$, $M^1_u = m^{w_{20}}_u$, $M^2_u = m^{w_{10}}_u$, $u \in W$, we have

$$f_j\, \chi_y = \sum_{u \in G_{f_j}} M^j_u\, \chi_{u y} = \sum_{z \in G_{f_j} y} M^j_{z y^{-1}}\, \chi_z. \tag{4.13}$$

Before we proceed with the next proposition we introduce on the chambers $W^{(+)}$ and $\tilde W^{(+)}$ a filtration and related gradation using the reduced length of words: $W^{(+)}_{\le k} = \{x \in W^{(+)} \mid \ell(x) \le k\}$ and $W^{(+)}_{=k} = \{x \in W^{(+)} \mid \ell(x) = k\}$.

Lemma 4.8. Let $x \in W^{(+)}_{=k}$; then the sets $G_{f_i}\, x$, for $i = 1, 2$, have a single element in $W^{(+)}_{=k+2}$ while the rest are in $W^{(+)}_{\le k+1}$.

Proof. The statement is proved by a direct check, cf. Figs. 1, 7, using (4.13). □

Proposition 4.9. The ring $\mathfrak W$ is generated as a polynomial ring by $\chi_{w_0}, \chi_{w_{10}}, \chi_{w_{20}}$, subject to one algebraic relation

$$\chi_{w_0}\, \chi_{w_0} = 2\, \chi_{w_0} + 1 + \chi_{w_{10}} + \chi_{w_{20}}. \tag{4.14}$$
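As a quick consistency check, relation (4.14) and the relation (4.2) for $F^2$ can be evaluated under the dimension map of (3.29), which sends every group element to 1 (so $F \to 3$ and $\gamma \to 1$). The sketch below only uses the dimensions 1, 5, 7 read off from (3.28) and (3.30); the dictionary keys and the helper name are illustrative assumptions.

```python
# Dimensions under the map of (3.29): every element w * exp(-kappa*mu) -> 1.
F_DIM = 3                     # F = w0 + w1 + w2
FUND_DIM = 3                  # dimension of either fundamental sl(3) character
D = {"1": 1, "w0": 2 + F_DIM, "w10": 7, "w20": 7}


def check_relations():
    """Return (lhs, rhs) pairs for the dimension images of (4.14) and (4.2)."""
    rel_4_14 = (D["w0"] * D["w0"], 2 * D["w0"] + D["1"] + D["w10"] + D["w20"])
    rel_4_2 = (F_DIM ** 2, 3 + FUND_DIM + FUND_DIM)
    return rel_4_14, rel_4_2


print(check_relations())  # ((25, 25), (9, 9))
```

This is only a necessary condition: agreement of dimensions does not by itself establish the ring relations.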

Proof. We want to show that any $\chi_y$, $y \in W^{(+)}$, can be represented as

$$\chi_y = \sum_{\varepsilon = 0,1}\ \sum_{n_1, n_2 \in \mathbb Z_{\ge 0}} c^y_{\varepsilon, n_1, n_2}\, f_0^{\varepsilon} f_1^{n_1} f_2^{n_2}, \qquad c^y_{\varepsilon, n_1, n_2} \in \mathbb Z. \tag{4.15}$$

First note that $W^{(+)}_{\le 1} = \{1, w_0\}$, and $W^{(+)}_{=2} = \{w_{10}, w_{20}\}$. Using Lemma 4.8 and (4.13), (4.14) we see that these polynomials are determined inductively going up the gradation. □

Due to the relation (4.14) the ring can be generated also by only two fundamental characters, i.e., either by $f_0, f_1$ or by $f_0, f_2$. Using (4.13) and (4.15) one can give an alternative proof of Lemma 4.4:

Proof. Let us do it for $\mathfrak W$. By Proposition 4.9 we have

$$\chi_x = \sum_{\varepsilon = 0,1}\ \sum_{n_1, n_2 \in \mathbb Z_{\ge 0}} c^x_{\varepsilon, n_1, n_2}\, f_0^{\varepsilon} f_1^{n_1} f_2^{n_2}$$
$$= \sum_{\varepsilon = 0,1}\ \sum_{n_1, n_2 \in \mathbb Z_{\ge 0}} c^x_{\varepsilon, n_1, n_2} \sum_{\{u, \{u_{i,j}\}\}} (M^0_u)^{\varepsilon} \prod_{i=1,2}\ \prod_{j=1}^{n_i} M^i_{u_{i,j}} \times u^{\varepsilon}\, u_{1,1} \cdots u_{1,n_1}\, u_{2,1} \cdots u_{2,n_2},$$

using (3.16) for each $f_j$ with the notation for the multiplicities as in (4.13). Note that all of the above sums are actually finite sums. Writing $\chi_x = \sum_z m^x_z\, z$ we get

$$m^x_z = \sum_{n_1, n_2 \in \mathbb Z_{\ge 0}} \Big( c^x_{0, n_1, n_2} \sum_{\{u_{i,j}\}}\ \prod_{i=1,2}\ \prod_{j=1}^{n_i} M^i_{u_{i,j}} + c^x_{1, n_1, n_2} \sum_{\{u, \{u_{i,j}\}\}} M^0_u \prod_{i=1,2}\ \prod_{j=1}^{n_i} M^i_{u_{i,j}} \Big),$$

where the two internal sums are over subsets $\{u_{i,j}\}$ and $\{u, \{u_{i,j}\}\}$ of $W$ such that $u_{1,1} \cdots u_{1,n_1} u_{2,1} \cdots u_{2,n_2} = z$ and $u\, u_{1,1} \cdots u_{1,n_1} u_{2,1} \cdots u_{2,n_2} = z$ respectively. Using repeatedly associativity together with the Pieri type formulæ (4.13), e.g.,

$$\Big( \sum_{u_{i,j}} M^i_{u_{i,j}}\, u_{i,j} \Big)\, \chi_{u_{i,j+1} \cdots y} = \sum_{u_{i,j}} M^i_{u_{i,j}}\, \chi_{u_{i,j} u_{i,j+1} \cdots y},$$

we are done. □

Remark. The ring $\tilde{\mathfrak W}$ can be viewed as a "quadratic" extension of the ring of sl(3) characters $\mathcal W$, a subring of $\tilde{\mathfrak W}$ according to (3.28). I.e., $\tilde{\mathfrak W}$ is a polynomial ring with coefficients in $\tilde{\mathcal V} := \mathcal W[A]$ modulo a quadratic relation, $\tilde{\mathfrak W} = \tilde{\mathcal V}[F] / \{F^2 - C = 0\}$; $C \in \mathcal V$ is given by the r.h.s. of (4.2). Here $\mathcal V \subset \mathcal W[A]$ is a triality zero subring of $\tilde{\mathcal V}$. Similarly $\mathfrak W = \mathcal V[F] / \{F^2 - C = 0\}$.

5. Alcove of Admissible Weights at Level k + 3 = 3/p

Beginning with this section we will analyse the case of $g = \widehat{sl}(3)_k$ at rational levels, namely $\kappa = k + 3 = 3/p$ with $p \in \mathbb Z_{\ge 2} \setminus 3\mathbb Z$. This selects a subseries of the general set of admissible weights of [12] which we will describe in more detail.³ The rationality of $\kappa$ has two consequences: for $y \in \tilde W$ the map $y \mapsto y \cdot k\Lambda_0$ is not injective, and the KW groups are isomorphic to the affine Weyl group $W$.

Let $\Pi^{[p]} = \{\alpha_0^{[p]} = p\delta - \theta\} \cup \Pi$, and $\Delta^{re}_{+,p} = \Delta_+ \cup \{m p \delta + \alpha,\ \alpha \in \Delta,\ m \in \mathbb Z_{>0}\}$. Denote by $W^{[p]}$ the subgroup of $W$, isomorphic to $W$, generated by the reflections $\{w_\alpha,\ \alpha \in \Pi^{[p]}\}$. We have $W^{[p]} = t_{pQ} \rtimes W$. The subgroup $A^{[p]}$ of $\tilde W$ generated by $\gamma_{[p]} := t_{p \bar\Lambda_1}\, \bar\gamma = t_{(p-1) \bar\Lambda_1}\, \gamma$ keeps invariant the set $\Pi^{[p]}$ and hence $a\, W^{[p]} a^{-1} = W^{[p]}$ for $a \in A^{[p]}$. We have $\tilde W = W \rtimes A^{[p]}$ and for $\kappa = 3/p$,

$$A^{[p]} \cdot k\Lambda_0 = k\Lambda_0. \tag{5.1}$$

³ This subset is generic since it is expected that, as in the $\widehat{sl}(2)$ case, there is an effective factorisation of the FR multiplicities for the general admissible representations at $\kappa = p'/p$ into the multiplicities for the two subseries, at $\kappa = n/p$ and the integrable one at $\kappa = p'$, the former represented by the r.h.s. of (1.4), which extends to arbitrary $p \in \mathbb Z_{\ge 1}$.

Let $y \in \tilde W$ and $P = \{\Lambda = y \cdot k\Lambda_0 \mid y \in \tilde W\}$. From the Kac-Kazhdan condition and from the analog of (2.3) with $\alpha \in \Pi^{[p]}$ it is clear that if $y(\alpha) \in \Delta^{re}_+$, $\forall \alpha \in \Pi^{[p]}$, the reflections $\{w_{y(\alpha)},\ \alpha \in \Pi^{[p]}\}$ generate a KW group $W^{[\Lambda]}$ (to be denoted also $W^{[y]}$) such that its shifted action on $P$ gives the weights of the Verma submodules of $M_\Lambda$. As in (2.3) the shifted action of $W^{[y]}$ on the weights in $P$ is intertwined with the right action of $W^{[p]}$ on $\tilde W$. Moreover $M_\Lambda$ is a maximally reducible Verma module with infinitely many singular vectors. Hence we are led to the definition of the alcove of admissible weights as $P_{+,p} = \tilde W^{(+)}_p \cdot k\Lambda_0 = W^{(+)}_p \cdot k\Lambda_0$, where

$$\tilde W^{(+)}_p = \{y \in \tilde W \mid y(\Pi^{[p]}) \subset \Delta^{re}_+\} \quad \text{and} \quad W^{(+)}_p = \tilde W^{(+)}_p \cap W. \tag{5.2}$$

Denote $P^{(w)}_{+,p} = \{\lambda \in P_+ \mid \langle \lambda, \theta \rangle < p \text{ if } w(-\theta) < 0 \text{ or } \langle \lambda, \theta \rangle \le p \text{ if } w(-\theta) > 0\}$, $w \in W$. In particular $P^{(1)}_{+,p}$ coincides with the integrable alcove $P^k_+$ at level $k = p - 1$. It is easy to see that the definition of $\tilde W^{(+)}_p$ is equivalent to

$$\tilde W^{(+)}_p = \{y = \bar y\, t_{-\lambda} \in \tilde W \mid \lambda \in P^{(\bar y)}_{+,p}\} = A\, t_{-P^{p-1}_+} \cup A\, w_0\, t_{-P^{p-2}_+} = t_{-P^{p-1}_+} A^{[p]} \cup w_0\, t_{-P^{p-2}_+} A^{[p]}, \tag{5.3}$$

$$W^{(+)}_p = \{y = \bar y\, t_{-\lambda} \in W \mid \lambda \in P^{(\bar y)}_{+,p} \cap Q\}, \qquad \tilde W^{(+)}_p = \cup_{a \in A^{[p]}} W^{(+)}_p\, a. \tag{5.4}$$

The second equality in (5.3), representing the alcove as a disjoint union of two leaves, parametrised by the two alcoves $P^{p-1}_+$ and $P^{p-2}_+$, takes into account the equivalence of elements in $\tilde W^{(+)}_p$ implemented by the right action of the group $A^{[p]}$, or, more explicitly,

$$t_{-\lambda}\, \gamma^l_{[p]} = \gamma^l\, t_{-\sigma^{-l}_{[p-1]}(\lambda)}, \quad \lambda \in P^{p-1}_+, \qquad w_0\, t_{-\lambda}\, \gamma^l_{[p]} = \gamma^{-l}\, w_0\, t_{-\sigma^{-l}_{[p-2]}(\lambda)}, \quad \lambda \in P^{p-2}_+. \tag{5.5}$$

Here $\sigma_{[k]}(\lambda) := \gamma(\lambda + k\Lambda_0) = w_{12}(\lambda) + k \bar\Lambda_1$ denotes the automorphism of the alcove $P^k_+$ at integer level $k$ induced by the action of $A$. Alternatively, due to (5.1), the admissible alcove is parametrised by the elements of the fundamental domain $W^{(+)}_p$ of $W$ (i.e., triality zero points on any orbit of $A^{[p]}$ in $\tilde W^{(+)}_p$), as indicated in (5.4).
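The alcove automorphism $\sigma_{[k]}$ is easy to experiment with in Dynkin labels. The sketch below assumes that $w_{12} = w_1 w_2$ acts by applying $w_2$ first, which gives $\sigma_{[k]}(\lambda_1, \lambda_2) = (k - \lambda_1 - \lambda_2,\ \lambda_1)$; this coordinate formula and the helper names are assumptions made for illustration.

```python
def sigma(lam, k):
    """sigma_[k](lam) = w_12(lam) + k*Lambda_1 in Dynkin labels (assumed convention)."""
    l1, l2 = lam
    return (k - l1 - l2, l1)


def alcove(k):
    """Integrable alcove P_+^k: dominant weights with <lam, theta> <= k."""
    return [(a, b) for a in range(k + 1) for b in range(k + 1 - a)]


def fixed_points(k):
    """Weights of P_+^k left invariant by sigma_[k]."""
    return [lam for lam in alcove(k) if sigma(lam, k) == lam]


for k in range(1, 8):
    points = alcove(k)
    # sigma permutes the alcove and has order 3
    assert sorted(sigma(lam, k) for lam in points) == sorted(points)
    assert all(sigma(sigma(sigma(lam, k), k), k) == lam for lam in points)

print(fixed_points(6))  # [(2, 2)]
```

Under this convention a fixed point exists only when 3 divides $k$, namely $\lambda = \frac{k}{3}\rho$.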

In analogy with Lemma 2.2 one can show that $P = \cup_{w \in W}\, w(P^{(w)}_{+,p}) + pQ$ is a partition and hence one has that $\tilde W^{(+)}_p$, respectively $W^{(+)}_p$, is a fundamental domain in $\tilde W$, respectively $W$, for the right action of $W^{[p]}$. Again the map $\iota$ intertwines the right action of $W^{[p]}$ on $W$ with the action of the affine Weyl group at level $3p - 3$; it is sufficient to check, taking into account (2.10), that

$$\iota(y\, w_{p\delta - \theta}) = w_0 \cdot (\iota(y) + (3p - 3)\Lambda_0). \tag{5.6}$$

Fig. 8. The p = 5 alcove

Accordingly $\iota(\tilde W^{(+)}_p)$ is represented by a formula analogous to (2.9), with $P_+$ replaced by $P^{3p-3}_+$. As an example we depict the alcove of admissible weights $P_{+,p}$, for $p = 5$, in Fig. 8. It is parametrised by $\{w\, t_{-\lambda} \in \tilde W^{(+)}_p \mid w \in \{e, w_\theta\}\}$ with a circle or box in case of $w = e$ or $w = w_\theta = w_0 t_\theta$ respectively, the numbers inside being the labels of $\lambda$. Equivalently, keeping only the triality zero labels $\lambda$, the same figure depicts the alternative representation of the admissible alcove through the elements of the fundamental domain $W^{(+)}_p$. Unlike [9] the latter choice will be mostly used in what follows. Sometimes it will also be useful to work with the full domain $\tilde W^{(+)}_p$, imposing the constraints implemented by the right action of $A^{[p]}$. Define a triality preserving order 3 automorphism of $\tilde W$ (and hence of $W$)

$$\sigma_p(x) := \gamma\, x\, \gamma^{-p}_{[p]}, \quad x \in \tilde W, \qquad \iota(\sigma_p(x)) = \sigma^{p}_{[3p-3]}(\iota(x)). \tag{5.7}$$

Geometrically $\sigma_p$ fixes the "middle" point of the alcove $W^{(+)}_p$ ($t_{-\frac{p-1}{3}\rho}$, or $w_0\, t_{-\frac{p-2}{3}\rho} = w_\theta\, t_{-\frac{p+1}{3}\rho}$, cf. (5.5)), and "rotates" it sending the "corners" into one another, i.e., it behaves like the usual "simple current" automorphism of an integrable alcove.

6. Quantised "q"-Characters

Recall first the integrable case where the "classical" g characters $\chi_\lambda$, $\lambda \in P_+$, are converted into $\mathbb C$-valued "q"-characters, labelled by the set $\{\lambda \in P^k_+\}$ of integrable highest weights at (positive) integer level $k$. Essentially one turns the formal exponentials $e^\lambda$, $\lambda \in P$, into "true" exponentials,

$$e^\lambda \to e^\lambda(\mu) := e^{-\frac{2\pi i}{k+n} \langle \lambda, \mu + \rho \rangle}, \qquad \mu \in P. \tag{6.1}$$

This "quantises" the "classical" g characters into "periodic" characters, $\chi^{(h)}_\lambda(\mu) = \chi^{(h)}_{\lambda + hQ}(\mu)$, $h = k + n$, i.e., (skew-)invariant under the full affine Weyl group at level $k$,

$$\det(w)\, \chi^{(h)}_{w \cdot (\lambda + k\Lambda_0)}(\mu) = \chi^{(h)}_\lambda(\mu) = \chi^{(h)}_\lambda(w \cdot (\mu + k\Lambda_0)), \qquad w \in W, \tag{6.2}$$

so that we can restrict the "dual" set (the set of $\mu$'s) to the integrable alcove $P^k_+$ itself.⁴ The "q"-characters are given explicitly by a ratio $\chi^{(h)}_\lambda(\mu) = S^{(h)}_{\lambda\mu} / S^{(h)}_{0\mu}$ of matrix elements of the integrable modular matrix $S^{(h)}_{\lambda\mu}$, a unitary, symmetric matrix. It is recovered up to an overall constant by the second equality in (3.2), with $\kappa = -1$ and exponentials transformed as in (6.1), i.e., (3.2) turns into the Kac–Peterson formula [11]. Thus the complex numbers $\{\chi^{(h)}_\lambda(\mu),\ \mu \in P^{h-n}_+\}$ can be interpreted as eigenvalues of the matrix $N_\lambda$ of fusion rule coefficients $(N_\lambda)^\beta_\alpha = N^\beta_{\lambda\alpha}$ of the integrable WZW conformal models. This relates the Verlinde formula for $N^\beta_{\lambda\alpha}$ to the classical Weyl–Steinberg formula [11, 16, 8]. In what follows we shall also need $\chi^{(h)}_\lambda(\mu)$ for $\mu$ belonging to some of the shifted hyperplanes $H^{(lh)}_\alpha := \{\mu \in \bar h^* \mid \langle \mu + \rho, \alpha \rangle = l h\}$, $\alpha \in \Delta_+$, $l \in \mathbb Z$. While the Kac–Peterson formula makes no sense on these hyperplanes, since both the numerator and the denominator vanish, the characters $\chi^{(h)}_\lambda(\mu)$ are well defined through the analog of the last equality in (3.2), or any of the standard determinant formulæ for the classical sl(3) characters.

Following the analogy with the integrable case the idea is to replace the affine Weyl group with the affine KW group at level $\kappa - 3 = 3/p - 3$, i.e., to extend the invariance (3.17) of the "classical" characters with respect to the right action of the horizontal Weyl group $W$ to invariance with respect to the right action of the affine group $W^{[p]}$. This will lead to (1.2) with the structure constants given by the formula (1.3) conjectured in [9], which now derives from the "classical" formula (4.9). Finally, inverting (1.2), we will recover in Sect. 7 the Pasquier–Verlinde type formula (1.1). Apparently there are two problems to be solved. We have to find an analog of the discrete set $\{\mu \in P^k_+\}$ and furthermore the elements of the group algebra of $\tilde W$ have to be converted into some $\mathbb C$-valued functions on this set.

⁴ Alternatively the "q"-characters are obtained restricting the standard group characters to the discrete subset of elements $\{\mathrm{diag}(e^{-\frac{2\pi i}{h} \langle e_i, \mu + \rho \rangle},\ i = 1, 2, \dots, n),\ \mu \in P^{h-n}_+\}$ in the Cartan subgroup of SU(n). Here $\sum_{i=1}^n e_i = 0$, $e_i = \bar\Lambda_i - \bar\Lambda_{i-1}$, $\bar\Lambda_0 = 0 = \bar\Lambda_n$.

Denote by $E_p$ the "double alcove" region

$$E_p = \{\mu \in P_+ \mid 0 \le \langle \mu, \alpha_i \rangle \le p - 1,\ i = 1, 2\} = P^{p-3}_+ \cup \big( w_\theta(P^{p-3}_+) + (p - 2)\theta \big) \cup \big( \cup_{\alpha \in \Delta_+} H^{(p)}_\alpha \cap P^{2p-2}_+ \big) \subset P^{3p-3}_+. \tag{6.3}$$

This set, which can also be looked at as $P^{p-1}_+ \cup \{w_\theta(P^{p+1}_{++}) + p\,\theta\}$, contains $p^2$ weights, $|E_p| \equiv |P_{+,p}|$, and we shall argue below that it is the analog for $k = 3/p - 3$ of the integrable "dual" set $\{\mu \in P^{h-3}_+\}$, see Figs. 9a, 9b, where $E_p$ is depicted for $p = 5$ and $p = 4$ (the dotted lines indicate the hyperplanes $H^{(lp)}_\alpha$). For $p = 2$, $E_p$ consists of the alcove $P^1_+$ and the weight $(p-1, p-1) = (1, 1)$ and thus represents the $\mathbb Z_3$ factorisation of the integrable alcove at level $3p - 3$, $P^{3p-3}_+$, obtained after identifying the points $\sigma^l_{[3p-3]}(\lambda)$ along an orbit of the $\sigma$ automorphism of $P^{3p-3}_+$, including the $\sigma$ stable point $(p-1, p-1)$. For $p > 2$ this factorisation leads to a subset of the alcove $P^{3p-3}_+$ which is of cardinality $|E_p| + |P^{p-3}_+| > |E_p|$.

We look for a solution of the invariance condition

$$\chi_{yw}(\cdot) = \det(w)\, \chi_y(\cdot), \qquad w \in W^{[p]}, \tag{6.4}$$

together with

$$\chi_{ya}(\cdot) = \chi_y(\cdot), \qquad a \in A^{[p]},\ y \in \tilde W^{(+)}_p. \tag{6.5}$$

Accounting for the invariance of the characters with respect to the horizontal Weyl group $W$ (3.17), the requirement (6.4) reduces to the periodicity condition

$$\chi_{y\, t_{p\nu}}(\cdot) = \chi_y(\cdot), \qquad \nu \in Q. \tag{6.6}$$
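The counting statements about the dual set $E_p$ of (6.3) can be checked by direct enumeration. The sketch below works in Dynkin labels $\mu = (m_1, m_2)$, so that $\langle \mu, \alpha_i \rangle = m_i$ and $\langle \mu, \theta \rangle = m_1 + m_2$; the function names are hypothetical and the code is only an illustration of the decomposition stated in (6.3).

```python
def dual_set(p):
    """E_p: dominant weights with 0 <= m_i <= p - 1."""
    return {(m1, m2) for m1 in range(p) for m2 in range(p)}


def alcove(k):
    """P_+^k: dominant weights with m1 + m2 <= k (empty for k < 0)."""
    return {(a, b) for a in range(k + 1) for b in range(k + 1 - a)}


def decomposition(p):
    """The three pieces on the right-hand side of (6.3)."""
    inner = alcove(p - 3)
    # w_theta(l1, l2) = (-l2, -l1), then shift by (p - 2) * theta
    mirror = {(p - 2 - l2, p - 2 - l1) for (l1, l2) in inner}
    # hyperplanes <mu + rho, alpha> = p for alpha = alpha_1, alpha_2, theta
    walls = {(m1, m2) for (m1, m2) in alcove(2 * p - 2)
             if m1 + 1 == p or m2 + 1 == p or m1 + m2 + 2 == p}
    return inner, mirror, walls


for p in (2, 4, 5, 7):
    inner, mirror, walls = decomposition(p)
    assert inner | mirror | walls == dual_set(p)
    assert len(inner) + len(mirror) + len(walls) == p * p   # pieces are disjoint
    assert len(alcove(p - 1)) + len(alcove(p - 2)) == p * p  # |P_{+,p}| from (5.3)
```

The last assertion compares $|E_p|$ with the number of admissible weights obtained from the two leaves $P^{p-1}_+$ and $P^{p-2}_+$ in (5.3).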

The formula (3.25) for the characters $\chi_y(\cdot)$ involves the three basic ingredients: the elements of the group $A$, the sl(3) characters $\chi_\lambda$, and the combination $F$ in (3.24), so we have to give meaning to some $\mathbb C$-valued counterparts $\gamma(\cdot)$, $\chi_\lambda(\cdot)$, $F(\cdot)$. The natural realisation for the generator of the group $A$, isomorphic to the cyclic group $\mathbb Z_3$, reads

$$\gamma \to \gamma(\mu) := e^{\frac{2\pi i}{3}\, m\, \tau(\mu)}, \qquad m = 1, 2 \bmod 3. \tag{6.7}$$

Fig. 9a. The dual set $E_p$ for $p = 5$

Fig. 9b. The dual set $E_p \subset P^{3p-3}_+$ for the case $p = 4$

The periodicity requirement (6.6) suggests to look for a realisation of the sl(3) characters in (3.25) in terms of the integrable characters $\chi^{(p)}_\lambda(\mu)$ at (shifted) level $p$, determined for $\mu \in P^{p-3}_+$,

$$\chi_\lambda \to \chi_\lambda(\mu) := \varepsilon_{\lambda,\mu}\, \chi^{(p)}_\lambda(\mu), \qquad \varepsilon^3_{\lambda,\mu} = 1. \tag{6.8}$$

In (6.8) we have allowed for an arbitrary overall phase constant $\varepsilon_{\lambda,\mu}$, invariant with respect to both indices under the shifted action of the affine Weyl group. We can choose

$$\varepsilon_{\lambda,\mu} = e^{-\frac{2\pi i}{3}\, l\, \tau(\mu)\, \tau(\lambda)}, \qquad l = 1, 2 \bmod 3, \tag{6.9}$$

which effectively leads to the realisation of the formal exponentials as

$$e^{-\kappa\lambda} \to e^{-\kappa\lambda}(\mu) := e^{-\frac{2\pi i}{3}\, l\, \tau(\mu)\, \tau(\lambda)}\, e^{-\frac{2\pi i}{p} \langle \lambda, \mu + \rho \rangle}. \tag{6.10}$$

The need for this phase is dictated by the requirement (6.5), which combined with (3.19), (5.5) reads for each of the parts $\chi^{(\pm)}_y$ in $\chi_y = \chi^{(+)}_y + F\, \chi^{(-)}_y$ (treating for the time being $F(\mu)$ as a formal variable)

$$\gamma(\mu)\, \chi^{(\pm)}_y(\mu) = \chi^{(\pm)}_{t_{-\sigma_{[p-1]}(\lambda)}}(\mu) = \chi^{(\pm)}_{t_{-\sigma_{[p-3]}(\lambda - 2\bar\Lambda_2)}}(\mu) \quad \text{for } y = t_{-\lambda},$$
$$\gamma(\mu)\, \chi^{(\pm)}_y(\mu) = \chi^{(\pm)}_{w_0 t_{-\sigma^2_{[p-2]}(\lambda)}}(\mu) = \chi^{(\pm)}_{w_0 t_{-\sigma^2_{[p-3]}(\lambda - \bar\Lambda_1)}}(\mu) \quad \text{for } y = w_0\, t_{-\lambda}. \tag{6.11}$$

The above conditions and the corresponding standard property of the integrable "q"-characters

$$\chi^{(p)}_{\sigma_{[p-3]}(\lambda)}(\mu) = e^{\frac{2\pi i}{3} \tau(\mu)}\, \chi^{(p)}_\lambda(\mu), \tag{6.12}$$

fix the integer $l$ to $l = p \bmod 3$ (using that $p^2 - 1 = 0 \bmod 3$), and keep arbitrary the power $m$ in the phase in (6.7). Without loss of generality we can choose $m = l = p$ since otherwise the remaining phases can be absorbed using the symmetry analogous to (6.12) with respect to the index $\mu$,

$$\chi^{(p)}_\lambda(\sigma_{[p-3]}(\mu)) = e^{\frac{2\pi i}{3} \tau(\lambda)}\, \chi^{(p)}_\lambda(\mu), \tag{6.13}$$

thus changing the value of $\mu$ to $\mu' = \sigma^{m-p}_{[p-3]}(\mu) \in P^{p-3}_+$; we can do this since the three terms in each of $\chi^{(\pm)}_y$ are described by sl(3) characters of weights of different triality $\tau = 0, 1, 2$.

Now we turn to the operator $F = w_0 + w_1 + w_2$. We recall that it commutes with the elements of $A$ as well as with the sl(3) characters. Preserving the relation (4.2), which is the basic relation used to derive the character ring structure constants, we see that the square of $F(\cdot)$ can be determined by the (fundamental) integrable characters, i.e.,

$$F^2 \to F^2(\mu) := 3 + \chi^{(p)}_{\bar\Lambda_1}(\mu) + \chi^{(p)}_{\bar\Lambda_2}(\mu) \tag{6.14}$$

for any $\mu \in P$. This determines $F(\mu)$ up to a sign, $F(\mu) = \varepsilon(\mu) \sqrt{F^2(\mu)}$, $\varepsilon(\mu) = \pm 1$. The r.h.s. of (6.14) is equivalently reproduced by $F^2(\mu) = |R(\mu)|^2$,

$$R(\mu) = \sum_{\bar a \in \bar A} e^{-\frac{2\pi i}{3p} \langle \bar a(\theta), \mu + \rho \rangle} = e^{-\frac{2\pi i}{3p} \langle \theta, \mu + \rho \rangle} + e^{\frac{2\pi i}{3p} \langle \alpha_1, \mu + \rho \rangle} + e^{\frac{2\pi i}{3p} \langle \alpha_2, \mu + \rho \rangle}. \tag{6.15}$$
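The equality of the right-hand side of (6.14) with $|R(\mu)|^2$ from (6.15) can be confirmed numerically. The sketch below is a minimal check under stated assumptions: weights are given in Dynkin labels, the pairing uses the inverse sl(3) Cartan matrix, and the fundamental "q"-characters are taken as the sums of the exponentials (6.1) with $h = p$ over the weights of the two fundamental representations. Function names are hypothetical.

```python
import cmath
import math

WEIGHTS_3 = [(1, 0), (-1, 1), (0, -1)]      # weights of the fundamental rep
WEIGHTS_3BAR = [(0, 1), (1, -1), (-1, 0)]   # weights of its conjugate


def pairing(lam, x):
    """<lam, x> for weights in Dynkin labels (roots of squared length 2)."""
    return (2 * lam[0] * x[0] + lam[0] * x[1] + lam[1] * x[0] + 2 * lam[1] * x[1]) / 3


def q_char(weights, mu, p):
    """Sum of exp(-2*pi*i/p * <nu, mu + rho>) over the given weights nu."""
    x = (mu[0] + 1, mu[1] + 1)
    return sum(cmath.exp(-2j * math.pi * pairing(nu, x) / p) for nu in weights)


def f_squared(mu, p):
    """Right-hand side of (6.14)."""
    return 3 + q_char(WEIGHTS_3, mu, p) + q_char(WEIGHTS_3BAR, mu, p)


def r_sum(mu, p):
    """R(mu) from (6.15), using <alpha_i, x> = x_i and <theta, x> = x_1 + x_2."""
    x1, x2 = mu[0] + 1, mu[1] + 1
    c = 2j * math.pi / (3 * p)
    return cmath.exp(-c * (x1 + x2)) + cmath.exp(c * x1) + cmath.exp(c * x2)


for p in (2, 4, 5, 7):
    for m1 in range(p):
        for m2 in range(p):
            lhs = f_squared((m1, m2), p)
            assert abs(lhs - abs(r_sum((m1, m2), p)) ** 2) < 1e-9
```

The agreement reflects the fact that the pairwise differences of the three phases in $R(\mu)$ are exactly the six weights of the two fundamental representations, since $\alpha_1 + \theta = 3\Lambda_1$ and $\alpha_2 + \theta = 3\Lambda_2$.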

One has the relations

$$i\,3p\sqrt{3}\,S^{(3p)}_{0\mu} = 1/d_{\kappa/3}(\mu) = R(\mu) - \bar R(\mu), \tag{6.16}$$

$$\begin{aligned}
i\,p\sqrt{3}\,S^{(p)}_{0\mu} = 1/d_{\kappa}(\mu) &= \sum_{\bar a\in\bar A} e^{-\bar a(\theta)\kappa}(\mu) - \sum_{\bar a\in\bar A} e^{\bar a(\theta)\kappa}(\mu) = (R(\mu))^3 - (\bar R(\mu))^3 \\
&= i\,3p\sqrt{3}\,S^{(3p)}_{0\mu}\,\big(R(\mu)+\bar R(\mu)-|R(\mu)|\big)\big(R(\mu)+\bar R(\mu)+|R(\mu)|\big).
\end{aligned} \tag{6.17}$$

(Here $\bar R(\mu)$ is the complex conjugate of $R(\mu)$.)

It remains to determine the sign $\varepsilon(\mu)$. Since the parts $\chi_y^{(\pm)}(\mu)$ in $\chi_y = \chi_y^{(+)} + F\,\chi_y^{(-)}$, as well as $F^2(\mu)$, coincide for $\mu$ and its reflected images according to (6.2), we can assign $\varepsilon(\mu) = 1$ for $\mu \in P_+^{p-3}$ and $\varepsilon(\mu) = -1$ for $\mu$ sitting on the “mirror” (with respect to the hyperplane $H_\theta^{(p)}$) alcove in $E_p$. On the intersection of $E_p$ with the reflection hyperplanes $H_\alpha^{(p)}$ we choose $\varepsilon(\mu) = 1$ for $\alpha = \theta$, $\varepsilon(\mu) = -1$ for $\alpha = \alpha_1, \alpha_2$, and the justification of this choice will become clear below. The domain $E_p$ splits into two disjoint subsets $E_p^{(\pm)}$, $E_p^{(+)} := P_+^{p-2}$, thus

$$\varepsilon(\mu) := \pm 1 \quad \text{for } \mu \in E_p^{(\pm)}. \tag{6.18}$$
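The two equalities in (6.17) that involve only $R(\mu)$ are elementary and can be checked numerically. The sketch below assumes, as an illustration only, that $S^{(h)}_{0\mu}$ is proportional to the alternating Weyl group sum of $e^{-2\pi i\langle w(\rho),\mu+\rho\rangle/h}$, so that the sums at shifted levels $p$ and $3p$ differ only by the value of $h$; the labels $a_i = \langle\alpha_i,\mu+\rho\rangle$ and the function names are assumptions.

```python
import cmath


def even_orbit_sum(a1, a2, h):
    """Sum of exp(-2 pi i <w(rho), mu + rho> / h) over the three even Weyl elements w.

    The orbit of rho = theta under these elements is {theta, -alpha_1, -alpha_2};
    a_i = <alpha_i, mu + rho> are assumed to be the Dynkin labels of mu + rho.
    """
    u = 2j * cmath.pi / h
    return cmath.exp(-u * (a1 + a2)) + cmath.exp(u * a1) + cmath.exp(u * a2)


def check_6_17(a1, a2, p):
    """Return the absolute deviations of the two equalities checked here."""
    R = even_orbit_sum(a1, a2, 3 * p)       # R(mu) of (6.15)
    X = even_orbit_sum(a1, a2, p)           # the same sum at shifted level p
    Rb = R.conjugate()
    alternating = X - X.conjugate()         # full alternating Weyl sum at level p
    cubes = R ** 3 - Rb ** 3
    factored = (R - Rb) * (R + Rb - abs(R)) * (R + Rb + abs(R))
    return abs(alternating - cubes), abs(cubes - factored)


if __name__ == "__main__":
    print(check_6_17(1, 2, 5))
```

The first deviation vanishes because the three exponentials in $R$ multiply to 1, which makes the mixed terms of $R^3$ real; the second is the factorisation $R^3-\bar R^3 = (R-\bar R)\big((R+\bar R)^2-|R|^2\big)$.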

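Summarising, we are led to the following expression for the quantised characters $\chi_y^{(p)}(\mu)$, $y = \bar y\,t_{-\lambda} \in \tilde W_p^{(+)}$:

$$\begin{aligned}
\chi_y^{(p)}(\mu) := e^{-\frac{2\pi i p}{3}\tau(\mu)\,\tau(\lambda)}\Big[\,&\chi^{(p)}_{\lambda+\bar y^{-1}\cdot(0)}(\mu) + \chi^{(p)}_{\lambda+\bar y^{-1}\cdot(-2\bar\Lambda_1)}(\mu) + \chi^{(p)}_{\lambda+\bar y^{-1}\cdot(-2\bar\Lambda_2)}(\mu) \\
&+ (F(\mu)+2)\Big(\chi^{(p)}_{\lambda+\bar y^{-1}\cdot(-\rho)}(\mu) + \chi^{(p)}_{\lambda+\bar y^{-1}\cdot(-\bar\Lambda_2)}(\mu) + \chi^{(p)}_{\lambda+\bar y^{-1}\cdot(-\bar\Lambda_1)}(\mu)\Big)\Big].
\end{aligned} \tag{6.19}$$

For $y \in W_p^{(+)}$ the overall phase in (6.19) disappears. Taking $\mu = 0$ we define “q”-dimensions $D_y^{(p)} := \chi_y^{(p)}(0)$, expressed by the “q”-dimensions of the integrable level $p - 3$ case.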

Proposition 6.1. Let $x, y \in W_p^{(+)}$, $\mu \in E_p$. Then

$$\chi_x^{(p)}(\mu)\,\chi_y^{(p)}(\mu) = \sum_{z\in W_p^{(+)}} {}^{(p)}N_{x,y}^{z}\;\chi_z^{(p)}(\mu), \tag{6.20}$$

where

$${}^{(p)}N_{x,y}^{z} = \sum_{w'\in W^{[z\cdot k\Lambda_0]}} \det(w')\; m^{x}_{w' z y^{-1}} = \sum_{w\in W^{[p]}} \det(w)\; m^{x}_{z w y^{-1}} = \sum_{w\in W^{[p]}} \det(w)\; N_{x,y}^{zw}. \tag{6.21}$$

Furthermore the equality (1.4) holds true.

Proof. Since the basic relations (4.1), (4.2) are conserved, the map $\chi_y \to \chi_y^{(p)}(\mu)$ is a ring homomorphism, so (4.4) holds and it remains to use (6.4) to recover (6.20), (6.21). Finally the derivation of (1.4) parallels that of (4.10) using (5.6). □


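The statement extends to $\chi_y^{(p)}(\mu)$, $y \in \tilde W_p^{(+)}$. Given $y \in \tilde W_p^{(+)}$ take $\gamma^m y \in W_p^{(+)}$ with the appropriate $m$. Then $\chi_y^{(p)}(\mu) = e^{-\frac{2\pi i p m}{3}\tau(\mu)}\,\chi^{(p)}_{\gamma^m y}(\mu)$ and the product of characters $\chi_y^{(p)}(\mu)$, $y \in \tilde W_p^{(+)}$, reduces to (6.20), (6.21) due to the symmetry ${}^{(3p)}\bar N^{\iota(a z)}_{\iota(a x)\,\iota(y)} = {}^{(3p)}\bar N^{\iota(z)}_{\iota(x)\,\iota(y)}$, $a \in A$, i.e., the symmetries (4.11), (4.12) extend to ${}^{(p)}N_{x,y}^{z}$:

$${}^{(p)}N_{x,y}^{z} = {}^{(p)}N_{x,ay}^{az} = {}^{(p)}N_{x,z^*}^{y^*} = {}^{(p)}N_{x^*,y^*}^{z^*}, \qquad {}^{(p)}N_{x,y}^{1} = \delta_{x,y^*}. \tag{6.22}$$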

The action of the involution $*$ on the characters coincides with the complex conjugation

$$\chi^{(p)}_{y^*}(\mu) = \chi^{(p)}_{y}(\mu^*) = \chi^{(p)\,*}_{y}(\mu) \;\big(= \overline{\chi^{(p)}_{y}(\mu)}\big). \tag{6.23}$$

The second equality follows from $\varepsilon(\mu) = \varepsilon(\mu^*)$ and the analogous equality for the integrable characters. Using (5.7) the first relation in (6.22) can also be rephrased in terms of elements of $W_p^{(+)}$ only, since $\chi^{(p)}_{\gamma y} = \chi^{(p)}_{\sigma_p(y)}$ ($\sigma_p(y)$, $y \in W_p^{(+)}$, being the triality zero representative of $\gamma y \in \tilde W_p^{(+)}$ on its $A^{[p]}$ orbit),

$${}^{(p)}N_{x,\sigma_p(y)}^{\sigma_p(z)} = {}^{(p)}N_{x,y}^{z}, \qquad \chi^{(p)}_{\sigma_p(1)}\,\chi^{(p)}_{y} = \chi^{(p)}_{\sigma_p(y)}. \tag{6.24}$$

The analogs of the basic examples in (3.28) read

$$\begin{aligned}
\chi^{(p)}_{\gamma}(\mu) &= e^{\frac{2\pi i}{3}p\,\tau(\mu)} \;\Big(= \chi^{(p)}_{t_{-\sigma_{[p-1]}(0)}}(\mu) = \chi^{(p)}_{\gamma^2 t_{-\sigma^2_{[p-1]}(0)}}(\mu)\Big) = \chi^{(p)}_{\sigma_p(1)}(\mu), \\
\chi^{(p)}_{w_0}(\mu) &= 2 + F(\mu) = \chi^{(3p)}_{(1,1)}(\mu) - R(\mu) - \bar R(\mu) + \varepsilon(\mu)\,|R(\mu)|, \\
\chi^{(p)}_{w_{20}}(\mu) &= e^{\frac{2\pi i}{3}p\,\tau(\mu)}\,\chi^{(p)}_{t_{-\bar\Lambda_1}}(\mu) = e^{\frac{2\pi i}{3}p\,\tau(\mu)}\,\chi^{(p)}_{\sigma_p^{-1}(w_{20})}(\mu) = \chi^{(p)}_{(1,0)}(\mu) + 1 + F(\mu) \\
&= \chi^{(3p)}_{(3,0)}(\mu) - R(\mu) - \bar R(\mu) + \varepsilon(\mu)\,|R(\mu)|.
\end{aligned} \tag{6.25}$$

In (6.25) we have expressed the characters in terms of the integrable characters $\chi^{(3p)}_{\iota(y)}(\mu)$ at (shifted) level $3p$. Since $E_p \subset P_+^{3p-3}$, taking $\mu \in E_p$ gives well defined expressions. On the hyperplanes $H_\alpha^{(p)} \cap E_p$ these characters reduce (up to a sign) to the corresponding integrable characters $\chi^{(3p)}_{\iota(y)}(\mu)$ at (shifted) level $3p$. Indeed one proves

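Lemma 6.2. Let $\mu \in E_p \cap \big(\cup_{\alpha\in\Delta_+} H_\alpha^{(p)}\big)$. Then

$$r_\varepsilon(\mu) := R(\mu) + \bar R(\mu) - \varepsilon(\mu)\,|R(\mu)| = 0 \tag{6.26}$$

for $\varepsilon(\mu)$ as in (6.18).

Proof. One easily checks that for $\mu \in E_p \cap \big(\cup_{\alpha\in\Delta_+} H_\alpha^{(p)}\big)$ and $\varepsilon(\mu)$ chosen as in (6.18), $R(\mu)$ can be cast into the form $R(\mu) = -\varepsilon(\mu)\, e^{-\frac{2\pi i}{3}\varepsilon(\mu)}\,|R(\mu)|$, which implies the lemma. □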
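Lemma 6.2 and the phase form of $R(\mu)$ used in its proof can be illustrated numerically. The sketch below is not part of the proof and rests on assumptions that are not spelled out in this section: $H_\alpha^{(p)}$ is taken to be the hyperplane $\langle\alpha,\mu+\rho\rangle = p$, the labels $a_i = \langle\alpha_i,\mu+\rho\rangle$ are the Dynkin labels of $\mu+\rho$, and on the walls of $\alpha_1$ and $\alpha_2$ the remaining label is assumed to run over $1,\dots,p-1$. The signs $\varepsilon$ are those chosen in the text: $+1$ on the wall of $\theta$ and $-1$ on the walls of $\alpha_1$, $\alpha_2$.

```python
import cmath


def R(a1, a2, p):
    """R(mu) of (6.15) with a_i = <alpha_i, mu + rho> (assumed Dynkin labels)."""
    u = 2j * cmath.pi / (3 * p)
    return cmath.exp(-u * (a1 + a2)) + cmath.exp(u * a1) + cmath.exp(u * a2)


def wall_points(p):
    """Yield (a1, a2, eps) on the three walls, with eps as chosen in the text.

    Assumptions: H_alpha^(p) is the hyperplane <alpha, mu + rho> = p, and on the
    alpha_1 and alpha_2 walls the free label runs over 1, ..., p - 1.
    """
    for a in range(1, p):
        yield a, p - a, +1      # wall of theta: a1 + a2 = p
        yield p, a, -1          # wall of alpha_1: a1 = p
        yield a, p, -1          # wall of alpha_2: a2 = p


def r_eps(a1, a2, eps, p):
    """r_eps(mu) = R + conj(R) - eps |R| of (6.26)."""
    r = R(a1, a2, p)
    return 2 * r.real - eps * abs(r)


def phase_form_defect(a1, a2, eps, p):
    """| R - ( -eps * exp(-2 pi i eps / 3) * |R| ) |, the form used in the proof."""
    r = R(a1, a2, p)
    return abs(r + eps * cmath.exp(-2j * cmath.pi * eps / 3) * abs(r))


if __name__ == "__main__":
    for a1, a2, eps in wall_points(5):
        print(a1, a2, eps, r_eps(a1, a2, eps, 5), phase_form_defect(a1, a2, eps, 5))
```

For instance, for $p = 2$ and $a_1 = a_2 = 1$ one finds $R = e^{i\pi/3}$, so $R+\bar R = 1 = |R|$ and $r_\varepsilon = 0$ with $\varepsilon = 1$.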

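The alternative expressions in (6.25) representing the characters $\chi_y^{(p)}(\mu)$ in terms of the integrable “q”-characters at level $3p - 3$ generalise to arbitrary $y \in \tilde W_p^{(+)}$, $\mu \in E_p$. To simplify notation we shall omit the explicit dependence on $\mu$, denoting the overall phase in (6.19) by $\varepsilon_y$. Thus for any $y = \bar y\,t_{-\lambda} \in \tilde W_p^{(+)}$ we obtain by straightforward computation using (6.15), (6.16), (6.17), (6.25),

$$\chi_y^{(p)} = \varepsilon_y\Big[\chi^{(3p)}_{\iota(y)} - r_\varepsilon\Big(\chi^{(p)}_{\lambda+\bar y^{-1}\cdot(-\rho)} + \chi^{(p)}_{\lambda+\bar y^{-1}\cdot(-\bar\Lambda_2)} + \chi^{(p)}_{\lambda+\bar y^{-1}\cdot(-\bar\Lambda_1)}\Big)\Big]. \tag{6.27}$$

The second term in (6.27) also admits a representation entirely in terms of integrable “q”-characters $\chi^{(3p)}_{\nu}(\mu)$ at level $3p - 3$, with weights $\nu \notin \mathrm{Im}(\iota)$, using that $\chi^{(3p)}_{3\lambda+2\rho} = r_\varepsilon\, r_{-\varepsilon}\, \chi^{(p)}_{\lambda}$. From Lemma 6.2 and the relations (6.16), (6.17) it follows that $r_{-\varepsilon}(\mu) \neq 0$ for any $\mu \in E_p$. Finally we can also cast (6.27) into the form

$$\begin{aligned}
\chi_y^{(p)}(\mu) = \varepsilon_y(\mu)\, d_\kappa(\mu)\Big[\,&\big(R^2 + F\bar R\big)(\mu) \sum_{w\in\bar A} e^{-\frac{2\pi i}{3p}\langle w(\iota(y)+\rho),\,\mu+\rho\rangle} \\
&- \big(\bar R^2 + F R\big)(\mu) \sum_{w\in\bar A} e^{\frac{2\pi i}{3p}\langle w(\iota(y)^*+\rho),\,\mu+\rho\rangle}\Big].
\end{aligned} \tag{6.28}$$

Lemma 6.2 and (6.27) imply

Corollary 6.3. For any $y \in W_p^{(+)}$ and $\mu \in E_p \cap \big(\cup_{\alpha\in\Delta_+} H_\alpha^{(p)}\big)$,

$$\chi_y^{(p)}(\mu) = \chi^{(3p)}_{\iota(y)}(\mu). \tag{6.29}$$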

Despite the relation (1.4) between the structure constants, the product of characters $\chi_y^{(p)}(\mu)$ differs in general from that of the integrable characters $\chi^{(3p)}_{\iota(y)}(\mu)$ at level $3p - 3$, since the decomposition of the latter also contains terms $\chi^{(3p)}_{\lambda}(\mu)$ with $\lambda \notin \mathrm{Im}(\iota)$. On the other hand the equality (1.4) together with (6.29) (the latter property being enforced by the choice (6.18) of the sign of $F(\mu)$) requires that on the intersection of the hyperplanes $H_\alpha^{(p)}$ with $E_p$ the product of the triality zero integrable characters at shifted level $3p$ reduces to that of the characters (6.19). Otherwise we run into a contradiction, i.e., the choice of sign (6.18) would turn out to be inconsistent. However it is easy to prove the above property of the standard integrable characters at level $3p - 3$, thus justifying a posteriori the choice (6.18). Namely we have

Lemma 6.4. For $\mu \in P_+^{3p-3} \cap \big(\cup_{\alpha\in\Delta_+,\, l\in\mathbb{Z}} H_\alpha^{(lp)}\big)$,

$$\chi^{(3p)}_{\iota(x)}(\mu)\,\chi^{(3p)}_{\iota(y)}(\mu) = \sum_{\lambda\in\mathrm{Im}(\iota)} {}^{(3p)}N^{\lambda}_{\iota(x)\,\iota(y)}\;\chi^{(3p)}_{\lambda}(\mu), \qquad x, y \in \tilde W_p^{(+)}. \tag{6.30}$$

Proof. The proof of the lemma reduces to the proof of the following property of the integrable characters at level $3p - 3$, $p > 2$: for $\mu \in P_+^{3p-3} \cap H_\alpha^{(lp)}$, $\alpha \in \Delta_+$, $l \in \mathbb{Z}$, and $\lambda \in P_+^{3p-3}$, $\tau(\lambda) = 0$, $\lambda \notin \mathrm{Im}(\iota)$,

$$\chi^{(3p)}_{\lambda}(\mu) = 0. \tag{6.31}$$

If $\tau(\lambda) = 0$ and $\lambda \notin \mathrm{Im}(\iota)$ then $\lambda + \rho = 3\lambda'$ for some $\lambda' \in P_+^{p-3} + \rho$. Hence

$$\chi^{(3p)}_{\lambda}(\mu) = \frac{S^{(p)}_{\lambda'-\rho,\,\mu}}{3\,S^{(3p)}_{0\mu}}$$

and (6.31) follows from the vanishing of $S^{(p)}_{\lambda'-\rho,\,\mu}$ for $\mu \in H_\alpha^{(lp)} \cap P_+^{3p-3}$. □
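The vanishing (6.31) can be illustrated numerically. The sketch below assumes the standard Weyl character formula evaluated at the point $\mu+\rho$ (the overall sign convention in the exponent is an assumption and does not affect the vanishing), writes weights in the basis of fundamental weights, and again reads $H_\alpha^{(lp)}$ as the hyperplane $\langle\alpha,\mu+\rho\rangle = lp$. All names are hypothetical.

```python
import cmath

# Weyl group of sl(3) acting on weights (a, b) in the basis of fundamental
# weights, stored as pairs (determinant, action).
WEYL = [
    (+1, lambda a, b: (a, b)),
    (-1, lambda a, b: (-a, a + b)),        # s_1
    (-1, lambda a, b: (a + b, -b)),        # s_2
    (+1, lambda a, b: (-a - b, a)),        # s_1 s_2
    (+1, lambda a, b: (b, -a - b)),        # s_2 s_1
    (-1, lambda a, b: (-b, -a)),           # s_1 s_2 s_1
]


def pairing(x, y):
    """<x, y> for weights in the fundamental-weight basis (roots of length^2 = 2)."""
    return (2 * x[0] * y[0] + x[0] * y[1] + x[1] * y[0] + 2 * x[1] * y[1]) / 3


def weyl_sum(nu, m, h):
    """Alternating sum over w of det(w) exp(-2 pi i <w(nu), m> / h)."""
    return sum(
        det * cmath.exp(-2j * cmath.pi * pairing(w(*nu), m) / h)
        for det, w in WEYL
    )


def character(lam, m, h):
    """Weyl character of weight lam at the point m = mu + rho, shifted level h."""
    shifted = (lam[0] + 1, lam[1] + 1)     # lam + rho
    return weyl_sum(shifted, m, h) / weyl_sum((1, 1), m, h)


if __name__ == "__main__":
    p = 4
    lam = (2, 5)                           # lam + rho = 3 * (1, 2)
    for m in [(4, 3), (3, 5), (2, 4), (8, 1)]:
        print(m, abs(character(lam, m, 3 * p)))
```

In the example $p = 4$ and $\lambda+\rho = 3\lambda'$ with $\lambda' = (1,2)$; the four points $\mu+\rho$ lie on walls $\langle\alpha,\mu+\rho\rangle = lp$ with $\alpha = \alpha_1, \theta, \alpha_2, \alpha_1$ respectively. The numerator vanishes because the terms for $w$ and $s_\alpha w$ carry opposite signs and exponents differing by an integer multiple of $2\pi i$.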

Remark. The case $p = 2$ is degenerate (trivial) since the solutions of (6.26) coincide with the whole $E_p$ and accordingly the triality zero points in $P_+^{3p-3} \equiv P_+^{3}$ are all in $\mathrm{Im}(\iota)$. Hence the characters (6.19) with $y = 1, w_{20}, w_{10}, w_0$ coincide with the corresponding integrable characters at level $3p - 3 = 3$; they realise the triality zero fusion subalgebra at this level, labelled by $\{\lambda = (0,0), (3,0), (0,3), (1,1)\}$. Thus the $\widehat{sl}(3)_k$ case $\kappa = k + 3 = 3/2$ is analogous to the $\widehat{sl}(2)_k$ case at $\kappa = k + 2 = 2/p$, $p$ odd, where the admissible “q”-characters $\chi_y^{(p)}(\mu) = \chi^{(2p)}_{\iota(y)}(\mu)$, $y \in W_p^{(+)}$, close the integer isospin ($\tau(\lambda) = 0$) fusion subalgebra of the $\widehat{sl}(2)$ integrable representations at shifted level $2p$; the representative $W_p^{(+)}$ of the admissible alcove is defined as in (5.4), the latter formula being universal for any $\widehat{sl}(n)_k$ and $k + n = n/p$.

7. Pasquier–Verlinde Type Formula

We have found $p^2$ vectors $\chi(\mu) = \{\chi_y^{(p)}(\mu),\ y \in W_p^{(+)}\}$ with $\mu \in E_p$, which according to (6.20) provide eigenvectors common to all fusion matrices $N_y$, $y \in W_p^{(+)}$, $(N_y)_x^{z} = {}^{(p)}N_{y,x}^{z}$, and for any $y$ the numbers $\chi_y^{(p)}(\mu)$ are eigenvalues of $N_y$ labelled by the set $E_p$.

Lemma 7.1. Let $\mu, \mu' \in E_p$. If $\chi(\mu) = \chi(\mu')$ then $\mu = \mu'$.

Proof. Recall that the domain $E_p$ splits into two disjoint subsets $E_p^{(\pm)}$, each being a subset of a fundamental domain in $P$ with respect to the shifted action of $W$ at level $p - 3$. From $\chi^{(p)}_{f_j}(\mu) = \chi^{(p)}_{f_j}(\mu')$, $j = 0, 1, 2$, it follows that:

i) $\varepsilon(\mu) = \varepsilon(\mu')$, which implies that both $\mu, \mu' \in E_p^{(+)}$, or $\mu, \mu' \in E_p^{(-)}$,
ii) $\chi^{(p)}_{\bar\Lambda_i}(\mu) = \chi^{(p)}_{\bar\Lambda_i}(\mu')$, $i = 1, 2$, which implies that $\mu' = w \cdot (\mu + (p-3)\Lambda_0)$, $w \in W$.

Hence $\mu = \mu'$. □

Following standard arguments and taking into account the properties (6.22) of the structure constants ${}^{(p)}N_{x,y}^{z}$, the lemma immediately leads to:

Corollary 7.2.

$$\sum_{y\in W_p^{(+)}} \chi_y^{(p)}(\mu)\,\chi_y^{(p)\,*}(\mu') = 0, \qquad \forall\, \mu, \mu' \in E_p,\ \mu \neq \mu', \tag{7.1}$$

and hence $\{\chi(\mu),\ \mu \in E_p\}$ is a linearly independent set of (common) eigenvectors.
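The mechanism behind Lemma 7.1 and Corollary 7.2 (characters as common eigenvectors of the fusion matrices, orthogonal for distinct labels) is easiest to see in a simpler setting. The sketch below deliberately swaps in the integrable $\widehat{sl}(2)$ case at integer level $k$ rather than the admissible $\widehat{sl}(3)$ case treated here: it uses the well-known $su(2)_k$ “q”-characters $\sin\frac{\pi(a+1)(\mu+1)}{k+2}\big/\sin\frac{\pi(\mu+1)}{k+2}$ and the truncated Clebsch–Gordan fusion rule. It is an analogy only, and the function names are hypothetical.

```python
import math


def chi(a, mu, k):
    """su(2) level-k 'q'-character chi_a(mu) = S_{a mu} / S_{0 mu}, labels 0..k."""
    h = k + 2
    return math.sin(math.pi * (a + 1) * (mu + 1) / h) / math.sin(math.pi * (mu + 1) / h)


def fusion(a, b, c, k):
    """su(2) level-k fusion coefficient N_{ab}^c (truncated Clebsch-Gordan rule)."""
    in_range = abs(a - b) <= c <= min(a + b, 2 * k - a - b)
    return 1 if in_range and (a + b + c) % 2 == 0 else 0


def eigen_defect(k):
    """max | sum_c N_{ab}^c chi_c(mu) - chi_a(mu) chi_b(mu) | over all a, b, mu."""
    labels = range(k + 1)
    return max(
        abs(sum(fusion(a, b, c, k) * chi(c, mu, k) for c in labels)
            - chi(a, mu, k) * chi(b, mu, k))
        for a in labels for b in labels for mu in labels
    )


def orthogonality_defect(k):
    """max | sum_a chi_a(mu) chi_a(nu) | over mu != nu (characters are real here)."""
    labels = range(k + 1)
    return max(
        abs(sum(chi(a, mu, k) * chi(a, nu, k) for a in labels))
        for mu in labels for nu in labels if mu != nu
    )


if __name__ == "__main__":
    for k in (1, 2, 3, 6):
        print(k, eigen_defect(k), orthogonality_defect(k))
```

In this toy case the first check is the analogue of (6.20) and the second the analogue of (7.1); both defects are at the level of rounding error.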

Normalising the eigenvectors $\chi(\mu)$ (recall that $\chi_1^{(p)}(\mu) = 1$),

$$\psi_y^{(\mu)} = \chi_y^{(p)}(\mu)\,\psi_1^{(\mu)}, \qquad \frac{1}{|\psi_1^{(\mu)}|^2} = \sum_{y\in W_p^{(+)}} |\chi_y^{(p)}(\mu)|^2, \tag{7.2}$$

we can choose $\psi_1^{(\mu)}$ real positive, so that $\psi_y^{(\mu)\,*} = \psi_y^{(\mu^*)} = \psi_{y^*}^{(\mu)}$ (see also (6.23)). Due to (7.1) the square matrix $\psi_y^{(\mu)}$ is nonsingular and hence both its column and row vectors are linearly independent. Thus we obtain a unitary matrix $\psi_y^{(\mu)}$,

$$\sum_{y\in W_p^{(+)}} \psi_y^{(\mu)}\,\psi_y^{(\mu')\,*} = \delta_{\mu\mu'}, \qquad \sum_{\mu\in E_p} \psi_y^{(\mu)}\,\psi_x^{(\mu)\,*} = \delta_{yx}, \tag{7.3}$$

which diagonalises all $N_y$. Indeed, using the second (completeness) relation in (7.3) the formula (6.20) converts into (1.1), providing an equivalent expression for the “q” analog of the Weyl–Steinberg type formula (6.21). Hence we recover the Pasquier–Verlinde type formula for the fusion rule multiplicities of the admissible representations at level $k + 3 = 3/p$ proposed in [9], with a now explicitly determined eigenvector matrix $\psi_y^{(\mu)}$.

A remaining technical problem is to perform explicitly the summation in (7.2). At least for $\mu \in E_p \cap \big(\cup_{\alpha\in\Delta_+} H_\alpha^{(p)}\big)$ this can be easily done, giving an explicit expression for the constant $\psi_1^{(\mu)}$, and hence for the corresponding matrix elements of $\psi_y^{(\mu)}$, for this particular subset of weights in $E_p$. Indeed we have

Lemma 7.3. Let $y = \bar y\,t_{-\lambda} \in W_p^{(+)}$ and $\mu \in E_p \cap \big(\cup_{\alpha\in\Delta_+} H_\alpha^{(p)}\big)$. Then

$$\psi_y^{(\mu)} = \sqrt{3}\,S^{(3p)}_{\iota(y)\,\mu}, \quad \mu + \rho \neq p\rho; \qquad \psi_y^{(\mu)} = S^{(3p)}_{\iota(y)\,\mu}, \quad \mu + \rho = p\rho. \tag{7.4}$$

Proof. According to (7.2) and (6.29) it is sufficient to prove the statement for $y = 1$. From (6.29), (6.31) it follows that for $\mu \in E_p \cap \big(\cup_{\alpha\in\Delta_+} H_\alpha^{(p)}\big)$ one has

$$\sum_{y\in W_p^{(+)}} |\chi_y^{(p)}(\mu)|^2 = \sum_{\lambda\in P_+^{3p-3},\ \tau(\lambda)=0} |\chi^{(3p)}_{\lambda}(\mu)|^2 = \frac{1}{\big(\sqrt{3}\,S^{(3p)}_{0\mu}\big)^2} \sum_{l=0}^{2} \delta_{\mu,\,\sigma^{l}_{[3p-3]}(\mu)}.$$

The last equality holds for any $\mu \in P_+^{3p-3}$, exploiting standard properties of the modular matrices $S^{(3p)}_{\lambda\mu}$; see, e.g., [14]. Since the point $\mu + \rho = p\rho$ is a fixed point for the $\sigma_{[3p-3]}$ automorphism, the factor $\sqrt{3}$ does not appear in the second equality of (7.4). □

According to the last remark in the previous section, in the case $p = 2$ the formulæ (7.4) describe all matrix elements of the eigenvector matrix $\psi_y^{(\mu)}$, and analogous formulæ (with the factor 3 substituted by 2) hold for the whole $sl(2)$ subseries at level $k + 2 = 2/p$.

We conclude with the remark that the character ring constructed here is an extension of the ring of integrable “q”-characters at shifted level $p$, with the two roots of the quadratic polynomial (6.14) of $F$. The latter characters are elements of the subring $\mathbb{Z}[\omega]$ of the cyclotomic extension $\mathbb{Q}[\omega]$ of the rational numbers for $\omega^{3p} = 1$, see [4].

Acknowledgements. We would like to thank V. Dobrev, V. Molotkov, Tch. Palev, I. Penkov, and J.-B. Zuber for useful discussions, remarks, or suggestions. We also thank all colleagues who have shown interest in this work and/or have endured our explanations. A. Ch. G. thanks the A. v. Humboldt Foundation for financial support and the Universities of Kaiserslautern and Bonn for hospitality. V. B. P. acknowledges the financial support and hospitality of INFN, Sezione di Trieste, and the partial support of the Bulgarian National Research Foundation (contract 8 − 643).


Note added. The multiplication rule encoded in the product of the character χw0 with any χy , see (4.3), has been reproduced – including the multiplicity two contribution, by an explicit solution of the singular vectors decoupling equations at generic level [17]. Thus filling a gap in the computations in [9] all Pieri type formulae (4.3) are now confirmed.

References

1. Awata, H., and Yamada, Y.: Fusion rules for the fractional level $\widehat{sl}(2)$ algebra. Mod. Phys. Lett. A7, 1185–1195 (1992)
2. Bannai, E., and Ito, T.: Algebraic Combinatorics I: Association Schemes. New York: Benjamin/Cummings, 1984
3. Bourbaki, N.: Groupes et algèbres de Lie. Paris: Hermann, 1968
4. De Boer, J., and Goeree, J.: Markov traces and II(1) factors in conformal field theory. Commun. Math. Phys. 139, 267–304 (1991)
5. Di Francesco, P., and Zuber, J.-B.: SU(N) lattice integrable models associated with graphs. Nucl. Phys. B338, 602–646 (1990); Di Francesco, P., and Zuber, J.-B.: SU(N) lattice integrable models and modular invariance. In: Recent Developments in Conformal Field Theories, Trieste Conference 1989, Randjbar-Daemi, S., Sezgin, E., and Zuber, J.-B., eds. Singapore: World Scientific, 1990; Di Francesco, P.: Integrable lattice models, graphs and modular invariant conformal field theories. Int. J. Mod. Phys. A7, 407–500 (1992)
6. Feigin, B.L., and Malikov, F.G.: Fusion algebra at a rational level and cohomology of nilpotent subalgebras of $\widehat{sl}(2)$. Lett. Math. Phys. 31, 315–325 (1994)
7. Feigin, B.L., and Malikov, F.G.: Modular functor and representation theory of $\widehat{sl}(2)$ at a rational level. In: Operads: Proceedings of Renaissance Conferences, Cont. Math. 202, Loday, J.-L., Stasheff, J.D., and Voronov, A.A., eds. Providence, RI: AMS, 1997, p. 357
8. Furlan, P., Ganchev, A.Ch., and Petkova, V.B.: Quantum groups and fusion rule multiplicities. Nucl. Phys. B343, 205–227 (1990)
9. Furlan, P., Ganchev, A.Ch., and Petkova, V.B.: Fusion rules for admissible representations of affine algebras: The case of $A_2^{(1)}$. Nucl. Phys. B518 [PM], 645–668 (1998)
10. Humphreys, J.E.: Reflection Groups and Coxeter Groups. Cambridge: Cambridge University Press, 1990
11. Kac, V.G.: Infinite-dimensional Lie Algebras. Third edition, Cambridge: Cambridge University Press, 1990
12. Kac, V.G., and Wakimoto, M.: Modular invariant representations of infinite-dimensional Lie algebras and superalgebras. Proc. Natl. Acad. Sci. USA 85, 4956–4960 (1988); Kac, V.G., and Wakimoto, M.: Classification of modular invariant representations of affine algebras. Adv. Ser. Math. Phys. vol 7, Singapore: World Scientific, 1989, pp. 138–177; Kac, V.G., and Wakimoto, M.: Branching functions for winding subalgebras and tensor products. Acta Applicandae Math. 21, 3–39 (1990)
13. Pasquier, V.: Operator content of the ADE lattice models. J. Phys. A20, 5707–5717 (1987)
14. Petkova, V.B., and Zuber, J.-B.: From CFT to graphs. Nucl. Phys. B463, 161–193 (1996)
15. Verlinde, E.: Fusion rules and modular transformations in 2D conformal field theory. Nucl. Phys. B300 [FS22], 360–376 (1988)
16. Walton, M.: Fusion rules in Wess–Zumino–Witten models. Nucl. Phys. B340, 777–790 (1990)
17. Ganchev, A.Ch., Petkova, V.B., and Watts, G.M.T.: to appear

Communicated by G. Felder