Commun. Math. Phys. 220, 1 – 12 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
On the Definition of SRB-Measures for Coupled Map Lattices
Esa Järvenpää, Maarit Järvenpää
University of Jyväskylä, Department of Mathematics, P.O. Box 35, 40351 Jyväskylä, Finland. E-mail: [email protected]; [email protected]
Received: 23 June 2000 / Accepted: 4 January 2001
Abstract: We consider SRB-measures of coupled map lattices. The emphasis is given to a definition according to which a SRB-measure is an invariant probability measure whose projections onto finite-dimensional subsystems are absolutely continuous with respect to the Lebesgue measure. We show that coupled map lattices which are close to an uncoupled expanding map have typically an infinite number of SRB-measures. In particular, we give a counterexample to the Bricmont–Kupiainen conjecture.
1. Introduction

The SRB-measure (Sinai, Ruelle, Bowen) is by definition a “natural” invariant probability measure of a dynamical system (X, T), where X is a manifold and T : X → X is a differentiable mapping. The meaning of the word “natural” comes from the interpretation that the dynamical system is a model of some physical system. The natural measure should tell how typical points behave asymptotically, that is, what the long time behaviour of the system is for typical initial values. Typical points are determined by the set-up of the actual experiment. If the phase space of the system is a manifold then one may argue that the Lebesgue measure or some smooth modification of it is the right distribution for the initial values. Having found an invariant measure µ and a set A ⊂ X with positive Lebesgue measure such that the Birkhoff average $\frac{1}{n}\sum_{i=1}^{n}\delta_{T^i(x)}$ tends to µ in the weak∗-topology as $n\to\infty$ for all x ∈ A, it is reasonable to say that µ is a SRB-measure. Here $\delta_x$ is the probability measure concentrated at the point x. The existence of several other definitions for the SRB-measure found in the literature stems from the fact that this is a difficult condition to test. One definition is that the SRB-measure is an invariant probability measure whose conditional distributions on unstable leaves are absolutely continuous with respect to the corresponding Lebesgue measure. According to another definition it is an equilibrium state for a certain potential function obtained from the derivative of the map. A third definition states that the SRB-measure is a limit of the
Lebesgue measure under the iteration of the dynamics. For nice finite-dimensional systems like expanding maps on compact manifolds or axiom A systems all these definitions agree and give the same unique SRB-measure. When adopting the aforementioned definitions into the infinite-dimensional setting of coupled map lattices, one should take into consideration that in an experiment it is possible to measure only a finite number of quantities, in particular, a finite number of coordinates. Thus it seems quite natural to demand that the finite dimensional projections of a SRB-measure are absolutely continuous with respect to the corresponding Lebesgue measure. The extension of the equilibrium state definition to the infinite dimensional setting is not trivial because of the difficulties caused by infinite determinants and matrices. The third definition is obtained by studying finite dimensional approximations of the whole system, taking the limit of the (finite) Lebesgue measure under these approximations, and letting the subsystem size tend to infinity. Even for expanding maps one possibility is to demand that finite dimensional conditional distributions are absolutely continuous.

All of the above definitions have been used in the literature. Bunimovich and Sinai [BS] studied expanding maps of the unit interval with a special diffusive coupling over the one-dimensional lattice $\mathbb{Z}$. They showed that the system has an invariant Gibbs state whose projections onto finite-dimensional subsystems are absolutely continuous with respect to the Lebesgue measure. In [BK1] Bricmont and Kupiainen used the first mentioned definition, proved the existence of a SRB-measure for analytic expanding circle maps in the regime of small analytic coupling over the d-dimensional lattice $\mathbb{Z}^d$, and conjectured the uniqueness of this SRB-measure. They extended the existence result for special Hölder continuous functions in [BK2].
They also verified that the SRB-measure is unique in the class of measures for which the logarithm of the density is Hölder continuous. In [J] it was shown that all these results remain true if one replaces the circle by any compact Riemannian manifold. Jiang and Pesin [JP] considered weakly coupled Anosov maps. They managed to extend the equilibrium state definition to this setting and proved the existence and uniqueness of the SRB-measure. Recently, Keller and Zweimüller [KZ] studied piecewise expanding interval maps with a special unidirectional coupling using the last mentioned definition. They established the existence and uniqueness of the SRB-measure in this setting. Finally, the proofs of [BK2, JP] give the uniqueness of the SRB-measure given as in the third definition above.

The purpose of this paper is to show that the first mentioned definition is not equivalent to the second and third ones in an infinite dimensional setting. We will construct a coupled map lattice which has an infinite number of SRB-measures according to the first mentioned definition (see Theorem 3.4). (Three of these are also (space) translation invariant.) We also argue that our example is not just a curious artificial system but that it manifests a typical behaviour. Thus, although being perhaps the most natural of the above definitions at the heuristic level, this definition has the drawback of being non-unique. Our results also imply that for each finite subsystem X one can find a set A of positive Lebesgue measure such that for each x ∈ A there are boundary conditions $y_1(x)$ and $y_2(x)$ such that
$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=1}^{n}\delta_{T^k(x\vee y_i)}=\mu_i,\qquad i=1,2,$$
where $\mu_1\neq\mu_2$ and x ∨ y is the natural element of the phase space X. Hence the boundary conditions do have an effect. Note that one cannot draw the conclusion that there is a physical phase transition since for each x ∈ A one has to choose the boundary
condition in a very special way in order to see another SRB-measure than the one whose existence was proved in [BK2].
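The Birkhoff-average characterisation used in the Introduction can be illustrated on the single-site map t → 3t mod 1. The following is a minimal sketch, not part of the paper's argument: it assumes that a Lebesgue-typical point can be modelled by independent uniform base-3 digits and a point typical for the Cantor measure by independent uniform digits in {0, 2}. The names `birkhoff_digit_frequencies`, `lebesgue_typical` and `cantor_typical` are hypothetical.

```python
import random

def birkhoff_digit_frequencies(digits):
    """Birkhoff averages of the indicators of the three thirds of [0, 1).

    For x with base-3 digits d_1 d_2 ..., the point T^i(x) under
    T(t) = 3t mod 1 lies in the k-th third exactly when d_{i+1} = k
    (ignoring triadic rationals), so averaging over i = 0, ..., n-1
    is the same as counting digit frequencies.
    """
    n = len(digits)
    return [digits.count(k) / n for k in range(3)]

rng = random.Random(0)   # fixed seed so the run is reproducible
n = 30000

# Assumption: i.i.d. uniform digits model a Lebesgue-typical point.
lebesgue_typical = [rng.randrange(3) for _ in range(n)]
# Assumption: i.i.d. uniform digits in {0, 2} model a Cantor-typical point.
cantor_typical = [rng.choice((0, 2)) for _ in range(n)]

leb = birkhoff_digit_frequencies(lebesgue_typical)
can = birkhoff_digit_frequencies(cantor_typical)
print("Lebesgue-typical:", leb)
print("Cantor-typical:  ", can)
```

The two orbits give different limiting averages: the first equidistributes over the three thirds, while the second never visits the middle third. Both limits are invariant measures, but only the first is seen from a set of initial points of positive Lebesgue measure.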
2. Preliminaries

Our main motivation comes from the well-known projection results in $\mathbb{R}^n$ stating that the projections of a Radon measure µ onto almost all m-planes are absolutely continuous with respect to the m-dimensional Lebesgue measure provided that the m-energy of µ is finite [M, Theorem 9.7]. Our strategy is to use the fact that expanding maps have small invariant sets (and measures) in the sense that their dimensions are less than the dimension of the ambient manifold. For example, the 1/3-Cantor set is invariant under the map x ↦ 3x mod 1. If one takes a finite n-fold product of these Cantor sets, one will obtain a set which is invariant under the corresponding n-fold product map. Of course, the dimension of this product set is less than n, and so the natural Hausdorff measure living on the set, although being invariant, is not a SRB-measure since it is not absolutely continuous with respect to the n-dimensional Lebesgue measure. However, as n grows, the dimension of the product Cantor set grows. In particular, for each integer m one can find n such that the dimension of the n-fold Cantor set is greater than m. By the above mentioned projection result typical projections of the n-fold Hausdorff measure onto m-dimensional subspaces are absolutely continuous with respect to the m-dimensional Lebesgue measure. Of course, for this system the m-dimensional coordinate subsystems are atypical and the projections onto them are not absolutely continuous. Our idea is that a small coupling will make these coordinate planes typical ones. However, one has to be careful since in [HK] Hunt and Kaloshin proved that these projection results are not valid in infinite dimensional spaces. The projection theorems also have converse statements according to which the set of exceptional directions may have positive dimension although having zero measure (see [F]). Thus one cannot expect anything more than “almost all”-results.
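The dimension count behind this strategy is elementary: the n-fold product of the 1/3-Cantor set has dimension n·s with s = log 2 / log 3, so one needs n > m/s factors before projections onto m coordinates can be absolutely continuous. A minimal sketch of this count follows; the helper name `min_factors` is hypothetical.

```python
import math

# Hausdorff dimension of the 1/3-Cantor set, s = log 2 / log 3.
S = math.log(2) / math.log(3)

def min_factors(m):
    """Smallest n such that n * S > m.

    S is irrational, so m / S is never an integer and
    floor(m / S) + 1 is the first n with n * S > m.
    """
    return math.floor(m / S) + 1

for m in (1, 2, 3, 10):
    n = min_factors(m)
    print(f"m = {m}: need n = {n} factors (dimension {n * S:.3f})")
```

This is the same condition that appears later as the requirement that the larger index set, counted with weight s, exceeds the size of the smaller one.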
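We adopt the very general formulation of the projection theorem due to Peres and Schlag [PS]. We begin by recalling the notation from [PS] which we will use later.

Definition 2.1. Let $(X,d)$ be a compact metric space, $Q\subset\mathbb{R}^n$ an open connected set, and $\Pi: Q\times X\to\mathbb{R}^m$ a continuous map with $n\ge m$. For any multi-index $\eta=(\eta_1,\dots,\eta_n)\in\mathbb{N}^n$, let $|\eta|=\sum_{i=1}^{n}\eta_i$ be its length, and
$$\partial^{\eta}=\frac{\partial^{|\eta|}}{(\partial\varepsilon_1)^{\eta_1}\cdots(\partial\varepsilon_n)^{\eta_n}},$$
where $\epsilon=(\varepsilon_1,\dots,\varepsilon_n)\in Q$. Let $L$ be a positive integer and $\delta\in[0,1)$. We say that $\Pi\in C^{L,\delta}(Q)$ if for any compact set $Q'\subset Q$ and for any multi-index $\eta$ with $|\eta|\le L$ there exist constants $C_{\eta,Q'}$ and $C_{\delta,Q'}$ such that
$$|\partial^{\eta}\Pi(\epsilon,x)|\le C_{\eta,Q'}\quad\text{and}\quad\sup_{|\eta'|=L}|\partial^{\eta'}\Pi(\epsilon,x)-\partial^{\eta'}\Pi(\epsilon',x)|\le C_{\delta,Q'}|\epsilon-\epsilon'|^{\delta}$$
for all $\epsilon,\epsilon'\in Q'$ and $x\in X$.

Next we will give a definition of a subclass of $C^{L,\delta}(Q)$ from [PS].

Definition 2.2. Let $\Pi\in C^{L,\delta}(Q)$ for some $L$ and $\delta$. Define for all $x\ne y\in X$,
$$\Phi_{x,y}(\epsilon)=\frac{\Pi(\epsilon,x)-\Pi(\epsilon,y)}{d(x,y)}.$$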
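Let $\beta\in[0,1)$. The set $Q$ is a region of transversality of order $\beta$ for $\Pi$ if there exists a constant $C_\beta$ such that for all $\epsilon\in Q$ and for all $x\ne y\in X$ the condition $|\Phi_{x,y}(\epsilon)|\le C_\beta\,d(x,y)^{\beta}$ implies
$$\det\bigl(D\Phi_{x,y}(\epsilon)(D\Phi_{x,y}(\epsilon))^{T}\bigr)\ge C_\beta^{2}\,d(x,y)^{2\beta}.$$
Here the derivative with respect to $\epsilon$ is denoted by $D$ and $A^{T}$ is the transpose of a matrix $A$. Further, $\Pi$ is $(L,\delta)$-regular on $Q$ if there exists a constant $C_{\beta,L,\delta}$ and for all multi-indices $\eta$ with $|\eta|\le L$ there exists a constant $C_{\beta,\eta}$ such that for all $\epsilon,\epsilon'\in Q$ and for all distinct $x,y\in X$,
$$|\partial^{\eta}\Phi_{x,y}(\epsilon)|\le C_{\beta,\eta}\,d(x,y)^{-\beta|\eta|}$$
and
$$\sup_{|\eta'|=L}|\partial^{\eta'}\Phi_{x,y}(\epsilon)-\partial^{\eta'}\Phi_{x,y}(\epsilon')|\le C_{\beta,L,\delta}\,|\epsilon-\epsilon'|^{\delta}\,d(x,y)^{-\beta(L+\delta)}.$$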
Remark 2.3. Note that if the determinant in Definition 2.2 is bounded away from zero then $Q$ is a region of transversality of order $\beta$ for all $\beta\in[0,1)$.

Definition 2.4. Let $\mu$ be a Borel measure on $X$ and $\alpha\in\mathbb{R}$. The $\alpha$-energy of $\mu$ is
$$E_\alpha(\mu)=\int_X\int_X d(x,y)^{-\alpha}\,d\mu(x)\,d\mu(y).$$

We denote the image of a measure $\mu$ under a map $f:X\to Y$ by $f_*\mu$, that is, $f_*\mu(A)=\mu(f^{-1}(A))$ for all $A\subset Y$. The following theorem from [PS] gives a relation between Sobolev norms of images of measures under $C^{L,\delta}(Q)$-mappings and energies of original measures.

Theorem 2.5. Let $Q\subset\mathbb{R}^n$ and $\Pi\in C^{L,\delta}(Q)$ such that $L+\delta>1$. Let $\beta\in[0,1)$. Assume that $Q$ is a region of transversality of order $\beta$ for $\Pi$ and that $\Pi$ is $(L,\delta)$-regular on $Q$. Let $\mu$ be a finite Borel measure on $X$ such that $E_\alpha(\mu)<\infty$ for some $\alpha>0$. Then there exists a constant $a_0$ depending only on $m$, $n$, and $\delta$ such that for any compact $Q'\subset Q$,
$$\int_{Q'}\|(\Pi_\epsilon)_*\mu\|_{2,\gamma}^{2}\,d\mathcal{L}^n(\epsilon)\le C_\gamma E_\alpha(\mu)$$
for some constant $C_\gamma$ provided that $0<(m+2\gamma)(1+a_0\beta)\le\alpha$ and $2\gamma<L+\delta-1$. Here $\Pi_\epsilon=\Pi(\epsilon,\cdot)$ and $\|\cdot\|_{2,\gamma}$ is the Sobolev norm, that is,
$$\|\nu\|_{2,\gamma}^{2}=\int_{\mathbb{R}^m}|\hat\nu(\xi)|^{2}|\xi|^{2\gamma}\,d\mathcal{L}^m(\xi)$$
for any finite compactly supported Borel measure $\nu$ on $\mathbb{R}^m$, where
$$\hat\nu(\xi)=\int_{\mathbb{R}^m}e^{-i\xi\cdot x}\,d\nu(x)$$
is the Fourier transform of $\nu$.

Proof. [PS, Theorem 7.3].
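Remark 2.6. Let $\nu$ be a finite compactly supported Borel measure on $\mathbb{R}^n$. If $\|\nu\|_{2,0}<\infty$ then $\nu$ is absolutely continuous with respect to the Lebesgue measure $\mathcal{L}^n$ and its Radon-Nikodym derivative is $L^2$-integrable, that is, $D(\nu,\mathcal{L}^n)\in L^2(\mathbb{R}^n)$ (see (3.5)). Indeed, if $\hat\nu\in L^2(\mathbb{R}^n)$ then by the surjectivity of the Fourier transform [SW, Theorem 2.3, p. 17] there exists $f\in L^2(\mathbb{R}^n)$ such that $\hat f=\hat\nu$. Thus by [T, Definition 1.7, p. 262] $f=\nu$ as a distribution, meaning that $f=D(\nu,\mathcal{L}^n)$. Note also that $\|\nu\|_{2,\gamma}<\infty$ for $\gamma\ge n+2$ implies that $D(\nu,\mathcal{L}^n)$ has $L^2$-integrable derivatives of order $\gamma$, that is, $D(\nu,\mathcal{L}^n)\in W_2^{\gamma}(\mathbb{R}^n)$. So by [SW, Lemma 3.17, p. 26] $D(\nu,\mathcal{L}^n)$ is continuously differentiable.

3. Results

Let $\Omega=\prod_{\mathbb{Z}^d}S^1$, where $d\ge1$ is an integer and $S^1\subset\mathbb{C}$ is the unit circle. We use the notation $\Omega_\Lambda=\prod_{\Lambda}S^1$ for all $\Lambda\subset\mathbb{Z}^d$. For $\Lambda\subset\tilde\Lambda\subset\mathbb{Z}^d$ let $\pi_\Lambda:\Omega\to\Omega_\Lambda$ and $\pi_{\Lambda,\tilde\Lambda}:\Omega_{\tilde\Lambda}\to\Omega_\Lambda$ be the natural projections. Let $\varepsilon_0>0$ and let $A_\epsilon:\Omega\to\Omega$ be such that its lift $\tilde A_\epsilon:\tilde\Omega\to\tilde\Omega$, where $\tilde\Omega=\prod_{\mathbb{Z}^d}\mathbb{R}$, is
$$\tilde A_\epsilon(x)_i=x_i+\sum_{l\in\mathbb{Z}^d}\varepsilon_{il}\,2^{-|i-l|}g(x_l)\tag{3.1}$$
for all $i\in\mathbb{Z}^d$, where $|\cdot|$ is a metric on $\mathbb{Z}^d$, $\varepsilon_{il}\in(-\varepsilon_0,\varepsilon_0)$ for all $i,l\in\mathbb{Z}^d$ and $g$ is continuously differentiable and 1-periodic. (We use the covering map $p:\tilde\Omega\to\Omega$ such that $\prod_{\mathbb{Z}^d}[0,1]$ is a covering domain. Then $A_\epsilon=p\circ\tilde A_\epsilon\circ p^{-1}$.) For the discussion of the explicit form of the conjugacy $A_\epsilon$, see Remarks 3.5. Set $E=\prod_{\mathbb{Z}^d\times\mathbb{Z}^d}(-\varepsilon_0,\varepsilon_0)$ and denote by $L$ the product over $\mathbb{Z}^d\times\mathbb{Z}^d$ of normalized Lebesgue measures on $(-\varepsilon_0,\varepsilon_0)$. It is not difficult to see that $A_\epsilon$ is invertible for all $\epsilon\in E$ provided $\varepsilon_0$ is small enough (depending on $|g'|$). We fix such $\varepsilon_0$ and set $T_\epsilon=A_\epsilon\circ F\circ A_\epsilon^{-1}$, where $F:\Omega\to\Omega$ is the product over $\mathbb{Z}^d$ of maps $z\mapsto z^3$ (or $t\mapsto 3t \bmod 1$ if $S^1$ is viewed as $[0,1]$). Let $\mathbf{K}=\prod_{\mathbb{Z}^d}K$ and $\mu=\prod_{\mathbb{Z}^d}H^s|_K$, where $K$ is the 1/3-Cantor set on $S^1$ (or $[0,1]$) and $H^s|_K$ is the restriction of the $s$-dimensional Hausdorff measure to $K$ with $s=\frac{\log 2}{\log 3}$. (Note that $s$ is the Hausdorff dimension of $K$.) Now $(A_\epsilon)_*\mu$ is clearly $T_\epsilon$-invariant, that is, $(T_\epsilon)_*(A_\epsilon)_*\mu=(A_\epsilon)_*\mu$. Our aim is to show that for $L$-almost all $\epsilon$ the projection $(\pi_\Lambda)_*(A_\epsilon)_*\mu$ is absolutely continuous with respect to the Lebesgue measure on $\Omega_\Lambda$ for all finite $\Lambda\subset\mathbb{Z}^d$.

Let $\Lambda\subset\mathbb{Z}^d$. We denote the restriction of $A_\epsilon$ to $\Omega_\Lambda$ by $A_{\epsilon,\Lambda}$, that is,
$$A_{\epsilon,\Lambda}(x)_i=x_i+\sum_{l\in\Lambda}\varepsilon_{il}\,2^{-|i-l|}g(x_l)$$
for all $i\in\Lambda$. Set $\mu_\Lambda=\prod_{\Lambda}H^s|_K$ and $\mathbf{K}_\Lambda=\prod_{\Lambda}K$. Let $\Lambda\subset\tilde\Lambda\subset\mathbb{Z}^d$ be finite such that $|\tilde\Lambda|s>|\Lambda|$, where the number of elements in $\Lambda$ is denoted by $|\Lambda|$. Let $E_{\Lambda\times\tilde\Lambda}=\prod_{\Lambda\times\tilde\Lambda}(-\varepsilon_0,\varepsilon_0)$ and let $L^{\Lambda\times\tilde\Lambda}$ be the restriction of $L$ to $E_{\Lambda\times\tilde\Lambda}$. We will first show that for $L^{\Lambda\times\tilde\Lambda}$-almost all $\epsilon\in E_{\Lambda\times\tilde\Lambda}$ the measure $(\pi_{\Lambda,\tilde\Lambda}\circ A_{\epsilon,\tilde\Lambda})_*\mu_{\tilde\Lambda}$ is absolutely continuous with respect to the Lebesgue measure on $\Omega_\Lambda$. As will be indicated in the proof of Proposition 3.2, this claim follows from Theorem 2.5. In order to apply Theorem 2.5 we have to give some conditions on $g$. Since $g$ is 1-periodic and continuously differentiable there necessarily exists $t_0\in[0,1]$ such that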
$g'(t_0)=0$. In order to satisfy the transversality assumption in Theorem 2.5, we demand that $g'\ne0$ on $K$. More precisely, let $b>0$ and let $g$ be increasing on $[0,1/6]$ such that $g(0)=0$ and $g'(t)\ge b$ for all $t\in[0,t_1]$ for some $1/9<t_1<1/6$. Define $g(t+1/6)=g(1/6-t)$ for $t\in[0,1/6]$ and $g(1-t)=-g(t)$ for $t\in[0,1/3]$. We extend $g$ to the interval $[1/3,2/3]$ such that $g$ is continuously differentiable, $g([0,1])\subset[-1,1]$, for some $B\ge b$ we have $|g'(t)|\le B$ for all $t\in[0,1]$, and $|g'(t)|\ge b$ for all $t\in[1/3,1/3+t_2]\cup[2/3-t_2,2/3]$, where $0<t_2<1/9$.

Consider the second step in the construction of the Cantor set $K$. Call the chosen intervals $I_i$, $i=1,\dots,4$, that is, $I_1=[0,1/9]$, $I_2=[2/9,1/3]$, $I_3=[2/3,7/9]$, and $I_4=[8/9,1]$. Let $x\in\mathbf{K}$ and $\Lambda\subset\mathbb{Z}^d$. Define $\tilde x\in\mathbf{K}$ in the following way: For all $i\in\Lambda$, let $\tilde x_i=x_i$. For $j\in\Lambda^c=\mathbb{Z}^d\setminus\Lambda$ set $\tilde x_j=x_j$ if $x_j\in I_1\cup I_4$, $\tilde x_j=1/6-(x_j-1/6)$ if $x_j\in I_2$, and $\tilde x_j=5/6+5/6-x_j$ if $x_j\in I_3$. Note that with these definitions $g(\tilde x_j)=g(x_j)$ for all $j\in\mathbb{Z}^d$, implying that $\pi_\Lambda\circ A_\epsilon(\tilde x)=\pi_\Lambda\circ A_\epsilon(x)$. Further, if $x_j\notin[-t_1,t_1]$ for some $j\in\Lambda^c$ then $\tilde x_j\in[-t_1,t_1]$.

Let $x,y\in\mathbf{K}$ such that $x_i\in I_1$ and $y_i\in I_2$ for some $i\in\Lambda$. Then
$$A_\epsilon(y)_i-A_\epsilon(x)_i\ge y_i-x_i-\sum_{l\in\mathbb{Z}^d}|\varepsilon_{il}|\,2^{-|i-l|}|g(y_l)-g(x_l)|\ge y_i-x_i-\sum_{l\in\mathbb{Z}^d}|\varepsilon_{il}|\,2^{-|i-l|}B|y_l-x_l|\ge y_i-x_i-C\varepsilon_0\ge\frac{1}{18}\tag{3.2}$$
for $\varepsilon_0$ small enough since $y_i-x_i\ge1/9$. Thus the cubes at the second stage of the construction of $\mathbf{K}$ with $i$-th side $I_1$ will not overlap with cubes with $i$-th side $I_2$ under the projection $\pi_\Lambda\circ A_\epsilon$ provided that $i\in\Lambda$. (The same argument works in other cases as well, see (3.3) below.) More precisely, there exists a constant $c>0$ such that
$$|\pi_\Lambda\circ A_\epsilon(x)-\pi_\Lambda\circ A_\epsilon(y)|\ge c\tag{3.3}$$
for all $x,y\in\mathbf{K}$ with $x_i\in I_1\cup I_4$ and $y_i\in I_2\cup I_3$ (or $x_i\in I_2$ and $y_i\in I_3$) for some $i\in\Lambda$. Further, as in (3.2) we see that there exists $\tilde c>0$ such that $|A_\epsilon(x)_i-1/6|\ge\tilde c$ for all $i\in\Lambda$ and $x\in\mathbf{K}$, giving the existence of $\delta>0$ such that
$$\pi_{\{i\}}\circ A_\epsilon(\mathbf{K})\cap\Bigl(\frac16-\delta,\frac16+\delta\Bigr)=\emptyset\tag{3.4}$$
for all $i\in\Lambda$. We fix $\varepsilon_0$ and $\delta$ such that the above results hold.
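To make the coupling (3.1) concrete, here is a minimal numerical sketch on a finite one-dimensional block of sites. It is an illustration under stated assumptions, not the paper's construction: the function `g` below is a generic 1-periodic smooth function with values in [-1, 1] (it does not satisfy the symmetry and derivative conditions imposed above), the lattice is truncated to `n_sites` sites with |i - l| the usual distance on the integers, and all names are hypothetical.

```python
import math
import random

def g(t):
    # Assumption: an illustrative 1-periodic smooth function with |g| <= 1.
    # It is NOT the specific g constructed in the text.
    return math.sin(2.0 * math.pi * t)

def coupling_lift(x, eps):
    """Finite-block version of (3.1): x_i + sum_l eps[i][l] * 2^{-|i-l|} * g(x_l)."""
    n = len(x)
    return [
        x[i] + sum(eps[i][l] * 2.0 ** (-abs(i - l)) * g(x[l]) for l in range(n))
        for i in range(n)
    ]

rng = random.Random(1)          # fixed seed for reproducibility
n_sites, eps0 = 6, 0.05         # hypothetical block size and coupling bound
eps = [[rng.uniform(-eps0, eps0) for _ in range(n_sites)] for _ in range(n_sites)]
x = [rng.random() for _ in range(n_sites)]

y = coupling_lift(x, eps)
max_shift = max(abs(yi - xi) for xi, yi in zip(x, y))
# Since sum_l 2^{-|i-l|} < 3 on the integers and |g| <= 1, each coordinate
# moves by less than 3 * eps0.
print("largest displacement:", max_shift, "bound:", 3 * eps0)
```

The displacement bound printed at the end is the same kind of estimate as the term Cε₀ in (3.2): for small ε₀ the coupled coordinates stay close to the uncoupled ones, which is what keeps the images of the second-stage Cantor intervals apart.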
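Lemma 3.1. Let $\Lambda\subset\tilde\Lambda\subset\mathbb{Z}^d$ be finite such that $|\tilde\Lambda|s>|\Lambda|$. Set $X_{\tilde\Lambda}=\prod_{\tilde\Lambda}[-t_1,t_1]$. Define $\Pi:E_{\Lambda\times\tilde\Lambda}\times X_{\tilde\Lambda}\to\Omega_\Lambda$ by $\Pi(\epsilon,x)=\pi_{\Lambda,\tilde\Lambda}\circ A_{\epsilon,\tilde\Lambda}(x)$. Then the assumptions of Theorem 2.5 are valid for $\delta=0$, $\beta=0$, and for all integers $L>1$. Further, $E_\alpha(\mu_{\tilde\Lambda})<\infty$ for any $|\Lambda|<\alpha<|\tilde\Lambda|s$.

Proof. We may replace $\Omega_\Lambda$ by $\mathbb{R}^m$, where $m=|\Lambda|$. Let $i_0\in\Lambda$. Note that $X_{\tilde\Lambda}$ is a compact metric space equipped with the metric
$$d(x,y)^2=\sum_{l\in\tilde\Lambda}2^{-2|i_0-l|}|x_l-y_l|^2.$$
Clearly $\Pi\in C^{L,0}(E_{\Lambda\times\tilde\Lambda})$ for all positive integers $L$ since all the first order partial derivatives are constants. Note that $Q'$ in Definition 2.1 will not play any role here since all the estimates are independent of $Q'$.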
To check the transversality assumption in Definition 2.2, define for all $x\ne y\in X_{\tilde\Lambda}$,
$$\Phi_{x,y}(\epsilon)=\frac{\Pi(\epsilon,x)-\Pi(\epsilon,y)}{d(x,y)}.$$
Fix $i\in\Lambda$, $k=(k_1,k_2)\in\Lambda\times\tilde\Lambda$, and $x,y\in X_{\tilde\Lambda}$ such that $x\ne y$. Then
$$D\Phi_{x,y}(\epsilon)_{i,k}=\delta_{i,k_1}2^{-|i-k_2|}\,\frac{g(x_{k_2})-g(y_{k_2})}{d(x,y)},$$
where $\delta_{i,j}$ is the Kronecker delta. Thus for $i,j\in\Lambda$,
$$\bigl(D\Phi_{x,y}(\epsilon)\,D\Phi_{x,y}(\epsilon)^{T}\bigr)_{i,j}=\frac{\delta_{i,j}}{d(x,y)^2}\sum_{l\in\tilde\Lambda}2^{-|i-l|-|j-l|}\bigl(g(x_l)-g(y_l)\bigr)^2\ge\delta_{i,j}\,b^2\,2^{-|i-i_0|-|j-i_0|}.$$
By Remark 2.3 the transversality assumption is valid for $\beta=0$ with the constant $C_0=b^m2^{-\sum_{i\in\Lambda}|i-i_0|}$. Finally, $\Pi$ is obviously $(L,0)$-regular (in fact $(L,\delta)$-regular for all $\delta\in[0,1)$) on $E_{\Lambda\times\tilde\Lambda}$ for all positive integers $L$. The last assertion follows from the well-known properties of the Hausdorff measure $H^s|_K$ (see [M, Chapter 8]).

The following absolute continuity result follows from Theorem 2.5 and Lemma 3.1.
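Proposition 3.2. Let $\Lambda\subset\tilde\Lambda\subset\mathbb{Z}^d$ be finite such that $|\tilde\Lambda|s>|\Lambda|$. Then for $L^{\Lambda\times\tilde\Lambda}$-almost all $\epsilon\in E_{\Lambda\times\tilde\Lambda}$ the measure $(\pi_{\Lambda,\tilde\Lambda}\circ A_{\epsilon,\tilde\Lambda})_*\mu_{\tilde\Lambda}$ is absolutely continuous with respect to the Lebesgue measure on $\Omega_\Lambda$.

Proof. By the arguments given before stating Lemma 3.1 we may replace $\Omega_{\tilde\Lambda}$ by $X_{\tilde\Lambda}=\prod_{\tilde\Lambda}[-t_1,t_1]$. Lemma 3.1 and Theorem 2.5 give $\|(\pi_{\Lambda,\tilde\Lambda}\circ A_{\epsilon,\tilde\Lambda})_*\mu_{\tilde\Lambda}\|_{2,0}<\infty$ for $L^{\Lambda\times\tilde\Lambda}$-almost all $\epsilon\in E_{\Lambda\times\tilde\Lambda}$, which by Remark 2.6 implies the claim.

In Proposition 3.3 we will prove that one may replace $A_{\epsilon,\tilde\Lambda}$ by $A_\epsilon$ and $\mu_{\tilde\Lambda}$ by $\mu$ in Proposition 3.2. For this purpose we use differentiation theory of measures. Let $\nu$ and $\lambda$ be Radon measures on $\mathbb{R}^n$. Recall that the lower derivative of $\nu$ with respect to $\lambda$ at a point $x\in\mathbb{R}^n$ is defined by
$$\underline{D}(\nu,\lambda,x)=\liminf_{r\to0}\frac{\nu(B(x,r))}{\lambda(B(x,r))},\tag{3.5}$$
where $B(x,r)$ is the closed ball with centre at $x$ and with radius $r$. If the limit exists it is called the Radon-Nikodym derivative of $\nu$ with respect to $\lambda$ and is denoted by $D(\nu,\lambda,x)$. Further, $\nu$ is absolutely continuous with respect to $\lambda$ if and only if $\underline{D}(\nu,\lambda,x)<\infty$ for $\nu$-almost all $x\in\mathbb{R}^n$ [M, Theorem 2.12].

Proposition 3.3. Let $\Lambda\subset\tilde\Lambda\subset\mathbb{Z}^d$ be finite such that $|\tilde\Lambda|s>|\Lambda|$ and let $\epsilon_1\in E_{\Lambda\times\tilde\Lambda}$ be such that the conclusion of Proposition 3.2 is valid. Then for all $\epsilon\in E$ with $\epsilon_{\Lambda\times\tilde\Lambda}=\epsilon_1$ we have $\underline{D}((\pi_\Lambda\circ A_\epsilon)_*\mu,\mathcal{L}_\Lambda,x)<\infty$ for $(\pi_\Lambda\circ A_\epsilon)_*\mu$-almost all $x\in\Omega_\Lambda$. Here $\mathcal{L}_\Lambda$ is the Lebesgue measure on $\Omega_\Lambda$ and $\epsilon_{\Lambda\times\tilde\Lambda}=(\varepsilon_{ij})_{(i,j)\in\Lambda\times\tilde\Lambda}$.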
Proof. Let , 0 ∈ E such that × ˜ = 1 , × ˜c = ˜ ˜ = (0 ) × ˜ ˜ , and (0 )Zd × (0 ) ˜ c ×Zd = 0. Set ν = (π ◦ A )∗ µ and ν0 = (π , ◦ A ) µ . Then ν and ν 0 are ˜ ∗ ˜ ˜ 0 , Radon measures with compact supports [M, Theorem 1.18]. It follows directly from (3.1) that (A0 , ˜ )∗ µ ˜ = (π ˜ ◦ A0 )∗ µ, meaning that ν0 = (π ◦ A0 )∗ µ. By Proposition 3.2 the measure ν0 is absolutely continuous with respect to L . Set m = | |. We will first show that there exists a constant C > 0 such that for all r > 0, √ ν (B(x, r))dν (x) ≤ C ν0 (B(x, mr))dν0 (x). (3.6) '
'
By [FO, Lemma 2.6] it is enough to prove that ν (Q)2 ≤ C Q∈D (r, )
ν0 (Q)2 ,
(3.7)
Q∈D (r, )
where D(r, ) is the family of r-mesh cubes in R , that is, cubes of the form [l1 r, (l1 + 1)r) × · · · × [lm r, (lm + 1)r), where li ∈ Z for all i = 1, . . . , m. Let r > 0. Consider the cubes at the nth stage of the construction of K, where 3−n < r. Call this nth stage approximation K(n). Setting V0 = A0 , ˜ (K ˜ (n)) × K ˜ c (n) = A, ˜ (K ˜ (n)) × K ˜ c (n), we get A0 (spt µ) ⊂ V0 implying that spt ν0 ⊂ π (V0 ). Here the support of a measure λ is denoted by spt λ. ˜ and x, y ∈ X = Zd [−t1 , t1 ] such that xk = yk for all k ∈ , ˜ then If i ∈ A (x)i − A (y)i = εil 2−|i−l| (g(xl ) − g(yl )). (3.8) ˜c l∈
(Recall the discussion before Lemma 3.1 according to which we can assume that xi ∈ ˜ c. [−t1 , t1 ] for all i ∈ Zd ). Note that the difference in (3.8) depends only on xj for j ∈ Defining V = A (K(n)), we have spt ν ⊂ π (V ). Further, A (x)i = A, ˜ (x)i for ˜ c meaning that the restriction of V to the subspace ˜ if xj = 0 for all j ∈ all i ∈ ' ˜ ⊂ ' equals A, ˜ (K ˜ (n)) = A0 , ˜ (K ˜ (n)). So by (3.8) V is obtained from V0 by tilting the rows of “cubes” above each “cube” in A, ˜ (K ˜ (n)) in such a way that the ˜ Thus ν is obtained from ν0 by amount of translation does not depend on xi for i ∈ . spreading around the “cubes” defining ν0 . Let Q ∈ D(r, ). If there is Q ∈ D(r, ) such that a part of the “cubes” above it in V0 are tilted above Q then the corresponding “cubes” above Q (in V0 ) are removed away by (3.8). Define AQ = {Q ∈ D(r, ) | π (A (A−1˜ (Q × X \ ) × X ˜ c )) ∩ Q = ∅}. ˜ ,
Then for all Q ∈ AQ with π (V ) ∩ π (A (A−1˜ (Q × X \ ) × X ˜ c )) ∩ Q = ∅ we ˜ , c have V0 ∩ (Q × X ) = ∅. Further, Q × X c = PQ (Q ), (3.9) Q ∈D (r, ) Q∈AQ
where
PQ (Q ) = {x ∈ Q × X c | π (A (A−1˜ (x ˜ ) × x ˜ c )) ∈ Q }. ,
Observe that (A0 )∗ µ(PQ (Q )) = (A )∗ µ(A (A−1 0 (PQ (Q )))).
(3.10)
Note that by (3.8) the geometric shape of this partition is independent of Q, that is, if Q1 ∈ D(r, ) with Q1 × X c =
PQ1 (Q ),
Q ∈D (r, ) Q1 ∈AQ
then for all Q2 = τ (Q1 ) ∈ D(r, ) (τ is a translation) we have
Q2 × X c =
τ (PQ1 (Q )).
Q ∈D (r, ) Q1 ∈AQ
Naturally, this partition can be restricted to V0 . Hence for all Q ∈ D(r, ) there are 1 non-negative numbers pQ (Q ) = ν0 (Q) (A0 )∗ µ(PQ (Q )) adding to 1 such that
ν0 (Q) = (A0 )∗ µ(Q × X c ) =
(A0 )∗ µ(PQ (Q ))
Q ∈D (r, ) Q∈AQ
=
(3.11)
pQ (Q )ν0 (Q).
Q ∈D (r, ) Q∈AQ
This gives by (3.10) that ν (Q) =
Q ∈AQ
(A0 )∗ µ(PQ (Q)) =
pQ (Q)ν0 (Q ).
(3.12)
Q ∈AQ
The numbers pQ (Q ) depend on both Q and PQ (Q ). Enumerating the partition of Q×X c given in (3.9) we get Q×X c = ∪i PQ (i), where the geometric shape of PQ (i) ∈ D(r, ) we have PQ may vary as i varies. However, for all i and Q, Q (i) = τ (PQ (i)), = τ (Q). Hence the differences in PQ (i) as Q varies where τ is the translation with Q and i is kept fixed are due to the fact that the measure is not evenly distributed inside horizontal | |-dimensional slices of Q × X c . Note that if such a horizontal slice intersects an element PQ (Q ) of the partition (3.9), then, by (3.8), it may intersect only the elements PQ (Q ), where Q is a neighbour of Q in D(r, ). Let N = 3| | be the number of neighbours. We say that Q and Q are related (Q ∼ Q ) if there exists Q
such that Q , Q ∈ AQ . Then by (3.11) and (3.12) N
ν0 (Q)2 −
Q∈D (r, )
=N
ν (Q)2
Q∈D (r, )
pQ (Q )pQ (Q )ν0 (Q)2
Q∈D (r, ) Q ∈D (r, ) Q ∈D (r, ) Q∈AQ Q∈AQ
−
pQ (Q)pQ (Q)ν0 (Q )ν0 (Q )
Q∈D (r, ) Q ∈AQ Q ∈AQ
=
pQ (Q)pQ (Q)(ν0 (Q ) − ν0 (Q ))2 + P ≥ 0
Q ,Q ∈D (r, ) Q∈D (r, ) Q ,Q ∈AQ Q ∼Q
since the remainder $P$ (which is due to the occasionally very generous compensation factor $N$) is non-negative. This concludes the proof of (3.7).

Let $\alpha$ be the $\mathcal{L}_\Lambda$-measure of the $m$-dimensional unit ball. By [M, Theorem 2.12] $D(\nu_0,\mathcal{L}_\Lambda,x)$ exists and is finite for $\mathcal{L}_\Lambda$-almost all $x$. By Proposition 3.2 the same is true for $\nu_0$-almost all $x$. By Remark 2.6 we can choose $D(\nu_0,\mathcal{L}_\Lambda)$ as smooth as we like by increasing $\tilde\Lambda$. In particular, it can be chosen to be uniformly continuous so that one can find $r_0>0$ such that $\nu_0(B(x,r))\alpha^{-1}r^{-m}\le\max\{2D(\nu_0,\mathcal{L}_\Lambda,x),1\}$ for all $0<r<r_0$ and $x\in\Omega_\Lambda$. Thus using Fatou's lemma, inequality (3.6), the theorem of dominated convergence, and Theorem 2.5 together with Plancherel's formula [SW, Theorem 2.1, p. 16], we have
$$\begin{aligned}\int\underline{D}(\nu_\epsilon,\mathcal{L}_\Lambda,x)\,d\nu_\epsilon(x)&=\int\liminf_{r\to0}\nu_\epsilon(B(x,r))\alpha^{-1}r^{-m}\,d\nu_\epsilon(x)\\&\le\liminf_{r\to0}\int\nu_\epsilon(B(x,r))\alpha^{-1}r^{-m}\,d\nu_\epsilon(x)\\&\le\liminf_{r\to0}C\int\nu_0(B(x,\sqrt{m}\,r))\alpha^{-1}r^{-m}\,d\nu_0(x)\\&=C(\sqrt{m})^{m}\int D(\nu_0,\mathcal{L}_\Lambda,x)\,d\nu_0(x)=C(\sqrt{m})^{m}\int D(\nu_0,\mathcal{L}_\Lambda,x)^2\,d\mathcal{L}_\Lambda(x)<\infty.\end{aligned}$$
Thus $\underline{D}(\nu_\epsilon,\mathcal{L}_\Lambda,x)$ is finite for $\nu_\epsilon$-almost all $x$.
Theorem 3.4. For $L$-almost all $\epsilon$ the map $T_\epsilon$ has infinitely many SRB-measures.

Proof. For all finite $\Lambda\subset\mathbb{Z}^d$, let
$$E_g(\Lambda)=\{\epsilon\in E\mid(\pi_\Lambda\circ A_\epsilon)_*\mu\text{ is absolutely continuous with respect to }\mathcal{L}_\Lambda\}.$$
By Propositions 3.2 and 3.3 and [M, Theorem 2.12] we get for all finite $\Lambda\subset\mathbb{Z}^d$, $L(E_g(\Lambda))=1$. Defining
$$E_g=\bigcap_{\substack{\Lambda\subset\mathbb{Z}^d\\|\Lambda|<\infty}}E_g(\Lambda)$$
we have $L(E_g)=1$. Further, for all $\epsilon\in E_g$ the measure $(\pi_\Lambda\circ A_\epsilon)_*\mu$ is absolutely continuous with respect to $\mathcal{L}_\Lambda$ for all finite $\Lambda\subset\mathbb{Z}^d$. Since by (3.4) the measure $(A_\epsilon)_*\mu$ is different from the SRB-measure constructed by Bricmont and Kupiainen there are at least two SRB-measures. Instead of considering the standard Cantor set one can study the Cantor set where the first two intervals are chosen and the third one is removed. Defining $g$ properly the above proofs work for both of these sets. Since at each direction one can choose either of these Cantor sets and each choice gives a different measure there is an infinite number of SRB-measures.

Remarks 3.5.
1) Taking any coupled map lattice which is close to $T_0$ in the sense that it has an invariant set close to $\mathbf{K}$, one can repeat the above arguments. Thus it is possible to decompose a suitable space of coupled map lattices into leaves such that inside each leaf almost every system has infinitely many SRB-measures. This shows that the uniqueness of the SRB-measure is a very atypical situation. The explicit form of the conjugacy $A_\epsilon$ is irrelevant. It is simply enough to find one. In order to apply Theorem 2.5 it is essential that the map depends on all coordinates such that the decay rate is not faster than the one in the definition of the metric. More precisely, there has to be some lower bound so that one can define an appropriate auxiliary metric used in the proof of Lemma 3.1. In Theorem 2.5 the essential feature of $g$ is that its derivative is not zero close to the Cantor set $K$. The symmetry assumptions make it easier to take the infinite limit in Proposition 3.3.
2) Similar methods can be used for coupled axiom A diffeomorphisms to show that typically they also have an infinite number of invariant probability measures whose projections onto finite dimensional subsystems are absolutely continuous with respect to the corresponding Lebesgue measure. This indicates that also the equilibrium state SRB-measure and the one obtained as the limit of the Lebesgue measure have the same property. However, our method cannot be used directly to prove this since the support of this SRB-measure is dense (see (3.3)).
3) Note that by Theorem 2.5 and Remark 2.6 the densities of $(\pi_\Lambda\circ A_\epsilon)_*\mu$ are smooth, in particular, Hölder continuous. The uniqueness proof of Bricmont and Kupiainen fails for these measures because there are regions where the density is zero (see (3.4)), and so one cannot take the logarithm of the densities.

Acknowledgements. EJ and MJ acknowledge the financial support of the Academy of Finland (projects 46208 and 38955).
References
[BK1] Bricmont, J. and Kupiainen, A.: Coupled analytic maps. Nonlinearity 8, 379–396 (1995)
[BK2] Bricmont, J. and Kupiainen, A.: High temperature expansions and dynamical systems. Commun. Math. Phys. 178, 703–732 (1996)
[BS] Bunimovich, L.A. and Sinai, Ya.G.: Spacetime chaos in coupled map lattices. Nonlinearity 1, 491–516 (1988)
[F] Falconer, K.J.: Hausdorff dimension and the exceptional set of projections. Mathematika 29, 109–115 (1982)
[FO] Falconer, K.J. and O’Neil, T.C.: Convolutions and the geometry of multifractal measures. Math. Nachr. 204, 61–82 (1999)
[HK] Hunt, B.R. and Kaloshin, V.Yu.: Regularity of embeddings of infinite-dimensional fractal sets into finite-dimensional spaces. Nonlinearity 12, 1263–1275 (1999)
[J] Järvenpää, E.: A note on weakly coupled expanding maps on compact manifolds. Ann. Acad. Sci. Fenn. Math. 24, 511–517 (1999)
[JP] Jiang, M. and Pesin, Ya.: Equilibrium measures for coupled map lattices: Existence, uniqueness and finite-dimensional approximations. Commun. Math. Phys. 193, 675–711 (1998)
[KZ] Keller, G. and Zweimüller, R.: Weakly coupled map lattices – between differentiable dynamics and statistical mechanics: The case of unidirectional interactions. Preprint
[M] Mattila, P.: Geometry of Sets and Measures in Euclidean Spaces: Fractals and rectifiability. Cambridge: Cambridge University Press, 1995
[PS] Peres, Y. and Schlag, W.: Smoothness of projections, Bernoulli convolutions, and the dimension of exceptions. Duke Math. J. 102, 193–251 (2000)
[SW] Stein, E. and Weiss, G.: Introduction to Fourier Analysis on Euclidean Spaces. Princeton, NJ: Princeton University Press, 1971
[T] Torchinsky, A.: Real-Variable Methods in Harmonic Analysis. San Diego, CA: Academic Press, Inc., 1986
Communicated by A. Kupiainen
Commun. Math. Phys. 220, 13 – 40 (2001)
Triviality of Hierarchical Ising Model in Four Dimensions
Takashi Hara(1), Tetsuya Hattori(1), Hiroshi Watanabe(2)
(1) Graduate School of Mathematics, Nagoya University, Chikusa-ku, Nagoya 464-8602, Japan. E-mail: [email protected]; [email protected]
(2) Department of Mathematics, Nippon Medical School, 2-297-2, Kosugi, Nakahara, Kawasaki 211-0063, Japan. E-mail: [email protected]
Received: 27 April 2000 / Accepted: 5 January 2001
Abstract: Existence of critical renormalization group trajectory for a hierarchical Ising model in 4 dimensions is shown. After 70 iterations of renormalization group transformations, the critical Ising model is mapped into a vicinity of the Gaussian fixed point. Convergence of the subsequent trajectory to the Gaussian fixed point is shown by power decay of the effective coupling constant. The analysis in the strong coupling regime is computer-aided and Newman’s inequalities on truncated correlations are used to give mathematical rigor to the numerical bounds. In order to obtain a criterion for convergence to the Gaussian fixed point, characteristic functions and Newman’s inequalities are systematically used.
1. Introduction and Main Result

Dyson’s hierarchical spin system is an equilibrium statistical mechanical system defined as follows [4, 16, 3, 6, 14]. Let $\Lambda$ be a positive integer, and denote the $2^\Lambda$ variables (spin variables) $\phi_\theta$, the Hamiltonian $H_\Lambda$, and the expectation values $\langle\cdot\rangle_{\Lambda,h}$, respectively, by
$$\phi_\theta=\phi_{\theta_\Lambda,\dots,\theta_1},\qquad\theta=(\theta_\Lambda,\dots,\theta_1)\in\{0,1\}^\Lambda,$$
$$H_\Lambda(\phi)=-\frac12\sum_{n=1}^{\Lambda}\Bigl(\frac{c}{4}\Bigr)^{n}\sum_{\theta_\Lambda,\dots,\theta_{n+1}}\Bigl(\sum_{\theta_n,\dots,\theta_1}\phi_{\theta_\Lambda,\dots,\theta_1}\Bigr)^{2},$$
$$\langle F\rangle_{\Lambda,h}=\frac{1}{Z_{\Lambda,h}}\int d\phi\,F(\phi)\exp(-\beta H_\Lambda(\phi))\prod_\theta h(\phi_\theta),$$
$$Z_{\Lambda,h}=\int d\phi\,\exp(-\beta H_\Lambda(\phi))\prod_\theta h(\phi_\theta),$$
where $h$ is a single spin measure density normalized as
$$\int_{\mathbb{R}}h(x)\,dx=1.$$
In the following, we shall fix the so far arbitrary normalization of the spin variables by
$$\beta=\frac1c-\frac12.\tag{1.1}$$

Hierarchical models are so designed that the block-spin renormalization group transformation $\mathcal{R}$ has a simple form. In fact, $\mathcal{R}$ is a non-linear transformation of functions on $\mathbb{R}$, defined as follows. Define the block spins $\phi'$ by
$$\phi'_\tau=\frac{\sqrt{c}}{2}\sum_{\theta_1=0,1}\phi_{\tau\theta_1},\qquad\tau=(\tau_{\Lambda-1},\dots,\tau_1).$$
If a function $F(\phi)$ depends on $\phi$ through $\phi'$ only, namely, if there is a function $F'(\phi')$ on the block spins such that $F(\phi)=F'(\phi')$, then it holds that
$$\langle F\rangle_{\Lambda,h}=\langle F'\rangle_{\Lambda-1,\mathcal{R}h},$$
where
$$\mathcal{R}h(x)=\text{const.}\exp\Bigl(\frac{\beta}{2}x^2\Bigr)\int_{\mathbb{R}}h\Bigl(\frac{x}{\sqrt{c}}+y\Bigr)h\Bigl(\frac{x}{\sqrt{c}}-y\Bigr)dy,\qquad x\in\mathbb{R}.\tag{1.2}$$
Note that
$$h_G(x)=\text{const.}\exp\Bigl(-\frac14x^2\Bigr)\tag{1.3}$$
is a fixed point of $\mathcal{R}$, which we shall refer to as the density function of the massless Gaussian measure.

By looking into the asymptotics of, e.g., the susceptibility for the hierarchical massless Gaussian model defined by (1.3), and comparing it with that of standard nearest neighbor massless Gaussian models on a d-dimensional regular lattice, we see that the dimensionality $d$ of the system may be identified (at least for the Gaussian fixed point) as
$$c=2^{1-2/d}\qquad\Bigl(\beta=\frac12\bigl(2^{2/d}-1\bigr)\Bigr).\tag{1.4}$$
We shall extend the correspondences to hierarchical models with arbitrary measures, and use the terminology d-dimensional hierarchical models whenever (1.4) holds.

Asymptotic properties of the renormalization group trajectories
$$h_N=\mathcal{R}^Nh_0,\qquad N=0,1,2,\dots,\tag{1.5}$$
are extensively investigated in a “weak coupling regime”, i.e., in a “neighborhood” of $h_G$ [16, 3, 6–8]. In particular, it is known that, if d ≥ 4, then there are no non-Gaussian
Triviality of Hierarchical Ising Model in Four Dimensions
fixed points in a “neighborhood” of $h_G$, and that a “continuum limit” constructed from a critical trajectory with an initial function in a “neighborhood” of $h_G$ is trivial (Gaussian). However, in order to study asymptotic properties of strongly coupled models, we have to analyze trajectories (1.5) with initial functions in a “strong coupling regime” far away from the Gaussian fixed point. As a typical example, we consider in this paper the hierarchical Ising model, which is defined by the Ising spin measure density parameterized by $s \ge 0$:
$$h_{I,s}(x) = \frac12\big(\delta(x - s) + \delta(x + s)\big), \tag{1.6}$$
which may be regarded as a strong coupling limit of the $\phi^4$ measures:
$$h_{\mu,\lambda}(x) = \mathrm{const.}\, \exp(-\mu x^2 - \lambda x^4), \qquad \mu = -2\lambda s^2, \quad \lambda \to \infty.$$
Here and in the following, we use the standard notation $\delta(x - s)\,dx$ denoting a probability measure with unit mass on a single point $x = s$. The hierarchical Ising model has an infinite volume limit $\Lambda \to \infty$ if $0 < c < 2$ ($d > 0$), and has a phase transition if $1 < c < 2$ ($d > 2$) [4]. It has been widely believed without proof that the hierarchical Ising model in $d \ge 4$ dimensions has a critical trajectory converging to the Gaussian fixed point and that the “continuum limit” of the hierarchical Ising model in $d \ge 4$ dimensions will be trivial. In this paper, we prove this fact.

In the present analysis, it is crucial that the critical Ising model is mapped into a weak coupling regime after a small number of renormalization group transformations (in fact, 70 iterations for $d = 4$). Moreover, using a framework essentially different from that of [16, 7], we see in the weak coupling regime that the “effective coupling constant” of a critical model decays as $c_1/(N + c_2)$ after $N$ iterations in $d = 4$ dimensions (exponentially for $d > 4$). Our framework in the weak coupling regime is designed especially for a critical trajectory starting at the strong coupling regime so that the criterion of convergence to the Gaussian fixed point can be checked numerically with mathematical rigor.

Corresponding results, triviality of the $\phi^4_4$ spin model on a regular lattice (“full model”), are far harder, and a proof of triviality of the Ising model on the 4 dimensional regular lattice is, though widely believed, still open. We should here note the excellent and hard work of [9, 10] where the existence of a critical trajectory in the weak coupling regime (near the Gaussian fixed point; “weak triviality”) is solved by rigorous block spin renormalization group transformation. Our main theorem is the following:

Theorem 1.1. If $d \ge 4$ (i.e. $c \ge \sqrt2$), there exists a “critical trajectory” converging to the Gaussian fixed point starting from the hierarchical Ising models. Namely, there exists a positive real number $s_c$ such that if $h_N$, $N = 0, 1, 2, \dots$, are defined by (1.5) with $h_0 = h_{I,s_c}$, then the sequence of measures $h_N(x)\,dx$, $N = 0, 1, 2, \dots$, converges weakly to the massless Gaussian measure $h_G(x)\,dx$.

Remark. Our proof is partially computer-aided and shows for $d = 4$ that $s_c \in [1.7925671170092624,\ 1.7925671170092625]$.

In the following sections, we give a proof of Theorem 1.1. We will concentrate on the case $d = 4$, since the cases $d > 4$ can be proved along similar lines (with weaker bounds).
2. Strategy The proof of Theorem 1.1 is decomposed into two parts: Theorem 2.1(analysis in the weak coupling regime) and Theorem 2.2 (analysis in the strong coupling regime). They are stated in Sect. 2.3, and their proofs are given in Sect. 4 and Sect. 5, respectively. Theorem 1.1 is proved at the end of this section assuming them. (1) In Theorem 2.1, we control the renormalization group flow in a weak coupling regime by means of a finite number of truncated correlations (Taylor coefficients of logarithm of characteristic functions), and, in terms of the truncated correlations, we give a criterion, a set of sufficient conditions, for the measure to be in a domain of attraction of the Gaussian fixed point. (2) In Theorem 2.2, we prove, by rigorous computer-aided calculations, that there is a trajectory whose initial point is an Ising measure and for which the criterion in Theorem 2.1 is satisfied after a small number of iterations. The first part (Theorem 2.1) is essentially the Bleher–Sinai argument [1, 2, 16]. However, the criteria introduced in the references [16, 7] seem to be difficult to handle when “strong coupling constants” are present in the model, as in the Ising models. In order to overcome this difficulty, we use characteristic functions of single spin distributions and Newman’s inequalities for truncated correlations. The second part (Theorem 2.2) is basically simple numerical calculations of truncated correlations up to 8 points to ensure the criterion. The results are double checked by Mathematica and C++ programs, and furthermore they are made mathematically rigorous by means of Newman’s inequalities. It should be noted that rigorous computer-aided proofs are employed in [14] to Dyson’s hierarchical model in d = 3 dimensions, to prove, with [13], an existence of a non-Gaussian fixed point. (The “physics” are of course different between d = 3 and d = 4.) 
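We also focus on a complete mathematical proof, by combining rigorous computer-aided bounds with mathematical methods such as Newman’s inequalities and the Bleher–Sinai arguments.

2.1. Characteristic function. Denote the characteristic function of the single spin distribution $h_N$ as
$$\hat h_N(\xi) = \mathcal F h_N(\xi) = \int_{\mathbb R} e^{\sqrt{-1}\,\xi x}\, h_N(x)\, dx. \tag{2.1}$$
The renormalization group transformation for $\hat h_N$ is
$$\hat h_{N+1} = \mathcal F \mathcal R \mathcal F^{-1} \hat h_N, \tag{2.2}$$
which has a decomposition
$$\mathcal F \mathcal R \mathcal F^{-1} = \mathcal T \mathcal S, \tag{2.3}$$
where
$$\mathcal S g(\xi) = g\Big(\frac{\sqrt c}{2}\,\xi\Big)^{2}, \tag{2.4}$$
$$\mathcal T g(\xi) = \mathrm{const.}\, \exp\Big(-\frac{\beta}{2}\Delta\Big) g(\xi), \tag{2.5}$$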
and the constant is so defined that $\mathcal T g(0) = 1$. The transformation (2.2) has the same form as the $N = 2$ case of the Gallavotti hierarchical model [5, 11, 12]. Note that only for $N = 2$ the Gallavotti model is equivalent (by Fourier transform) to Dyson’s hierarchical model. We introduce a “potential” $V_N$ for the characteristic function $\hat h_N$ and its Taylor coefficients $\mu_{n,N}$ by
$$\hat h_N(\xi) = e^{-V_N(\xi)}, \tag{2.6}$$
$$V_N(\xi) = \sum_{n=1}^{\infty} \mu_{n,N}\, \xi^{n}. \tag{2.7}$$
(Note that $\hat h_N(0) = 1$.) The coefficient $\mu_{n,N}$ is called a truncated $n$ point correlation. The coefficients are functions of the Ising parameter $s$ in $h_0 = h_{I,s}$, but to simplify expressions, we will always suppress the dependence on $s$ in the following. In particular, for the initial condition $h_0 = h_{I,s}$, we have
$$\hat h_0(\xi) = \hat h_{I,s}(\xi) = \mathcal F h_{I,s}(\xi) = \cos(s\xi),$$
$$\mu_{2,0} = \frac12 s^2, \quad \mu_{4,0} = \frac{1}{12} s^4, \quad \mu_{6,0} = \frac{1}{45} s^6, \quad \mu_{8,0} = \frac{17}{2520} s^8, \quad \text{etc.},$$
and
$$h_1(x) = \mathcal R h_{I,s}(x) = \mathrm{const.}\, \Big[ e^{\beta c s^2/2} \big( \delta(x - s\sqrt c) + \delta(x + s\sqrt c) \big) + 2\delta(x) \Big],$$
$$\hat h_1(\xi) = \frac{1}{1+k}\big( 1 + k \cos(\sqrt c\, s\, \xi) \big), \qquad \text{with } k = e^{\beta c s^2/2},$$
$$\mu_{2,1} = k\ell, \quad \mu_{4,1} = \frac{k}{6}(2k - 1)\ell^2, \quad \mu_{6,1} = \frac{k}{90}(16k^2 - 13k + 1)\ell^3,$$
$$\mu_{8,1} = \frac{k}{2520}(272k^3 - 297k^2 + 60k - 1)\ell^4, \quad \text{etc.}, \qquad \text{with } \ell = \frac{c s^2}{2(k+1)}.$$
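The closed forms above can be cross-checked with exact rational arithmetic. The following is a minimal sketch, not part of the original proof: it expands $-\log$ of a power series in the variable $y \propto \xi^2$ and compares the result with the stated coefficients. The function names and the rational test value for $k$ are assumptions chosen for illustration; the combination $c s^2/(2(k+1))$ is represented in units where $c s^2 = 1$.

```python
from fractions import Fraction
from math import factorial


def neg_log_series(h, order):
    """Coefficients v_1..v_order of V(y) = -log h(y), where h(y) = sum_n h[n] y^n and h[0] = 1.

    Uses h' = -V' h, i.e. n h_n = -sum_{j=1}^{n} j v_j h_{n-j}, solved for v_n.
    """
    v = [Fraction(0)] * (order + 1)
    for n in range(1, order + 1):
        acc = sum((j * v[j] * h[n - j] for j in range(1, n)), Fraction(0))
        v[n] = -(h[n] + acc / n)
    return v[1:]


def cos_like(p, order):
    """Taylor coefficients in y of 1 - p (1 - cos(sqrt(y))), indices 0..order."""
    return [Fraction(1)] + [p * Fraction((-1) ** n, factorial(2 * n))
                            for n in range(1, order + 1)]


# N = 0: h_hat_0 = cos(s xi), with y = (s xi)^2, so v_n = mu_{2n,0} / s^(2n).
ising = neg_log_series(cos_like(Fraction(1), 4), 4)

# N = 1: h_hat_1 = (1 + k cos(sqrt(c) s xi)) / (1 + k), with y = c s^2 xi^2.
k = Fraction(3, 2)            # hypothetical rational stand-in for exp(beta c s^2 / 2)
w = 1 / (2 * (k + 1))         # c s^2 / (2 (k + 1)) in units where c s^2 = 1
step1 = neg_log_series(cos_like(k / (1 + k), 4), 4)
expected_step1 = [
    k * w,
    k * (2 * k - 1) * w ** 2 / 6,
    k * (16 * k ** 2 - 13 * k + 1) * w ** 3 / 90,
    k * (272 * k ** 3 - 297 * k ** 2 + 60 * k - 1) * w ** 4 / 2520,
]
```

Because both sides are polynomial identities in $k$, agreement at a rational test value is a consistency check rather than a proof.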
2.2. Newman’s inequalities. The function $V_N$ has a remarkable positivity property and its Taylor coefficients obey Newman’s inequalities (for a brief review of the relevant part, see Appendix A):
$$0 \le \mu_{2n,N} \le \frac{1}{n}(2\mu_{4,N})^{n/2}, \qquad n = 3, 4, 5, \dots. \tag{2.8}$$
These inequalities follow from [15, Theorem 3, 6], since we have chosen the Ising spin distribution $h_0 = h_{I,s}$ and the function of $\eta$ defined by
$$\int e^{\eta x}\, h_N(x)\, dx = \Big\langle \exp\Big( \eta \Big(\frac{\sqrt c}{2}\Big)^{N} \sum_\theta \phi_\theta \Big) \Big\rangle_{N,h_{I,s}} \tag{2.9}$$
has only pure imaginary zeros as is shown in [15, Theorem 1]. Note also that (1.2) and (1.6) imply
$$\mu_{2n+1,N} = 0, \qquad n = 0, 1, 2, \dots. \tag{2.10}$$
The bounds (2.8) are extensively used in this paper. We here note the following facts:
(1) The right-hand side of (2.7) has a nonzero radius of convergence.
(2) It suffices to prove $\lim_{N\to\infty} \mu_{4,N} = 0$ in order to ensure that $\mu_{2n,N}$, $n \ge 3$, converges to zero, hence the trajectory converges to the Gaussian fixed point.

2.3. Proof of Theorem 1.1. Let $h_0 = h_{I,s}$ and $d = 4$. Note the following simple observations on the “mass term” $\mu_{2,N}$, which is half the variance of $h_N(x)\,dx$.
(1) $\mu_{2,N}$ is continuous in the Ising parameter $s$, because $h_N(x)\,dx$ is a result of a finite number of renormalization group transformations (1.2).
(2) $\mu_{2,N}$ is increasing in $s$, vanishes at $s = 0$, and diverges as $s \to \infty$.
We then put, for $N = 0, 1, 2, \dots$,
$$\underline{s}_N = \inf\{ s > 0 \mid \mu_{2,N} \ge 1 \}, \tag{2.11}$$
$$\overline{s}_N = \inf\Big\{ s > 0 \;\Big|\; \mu_{2,N} \ge \min\Big\{ 1 + \frac{3}{\sqrt2}\mu_{4,N},\ 2 + \sqrt2 \Big\} \Big\}. \tag{2.12}$$
Obviously, we have $0 < \underline{s}_N \le \overline{s}_N < \infty$. Note also that
$$1 \le \mu_{2,N} \le 1 + \frac{3}{\sqrt2}\mu_{4,N} \tag{2.13}$$
holds for $s \in [\underline{s}_N, \overline{s}_N]$. As is seen in Sect. 4, (2.13) is necessary for the model to be critical. We call this a critical mass condition. The following theorem states our result in the weak coupling regime and is proved in Sect. 4.

Theorem 2.1. Let $h_0 = h_{I,s}$ and $d = 4$. Assume that there exist integers $N_0$ and $N_1$, satisfying $N_0 \le N_1$, such that, for $s \in [\underline{s}_{N_1}, \overline{s}_{N_1}]$, the bounds
$$0 \le \mu_{4,N_0} \le 0.0045, \tag{2.14}$$
$$1.6\,\mu_{4,N_0}^2 \le \mu_{6,N_0} \le 6.07\,\mu_{4,N_0}^2, \tag{2.15}$$
$$0 \le \mu_{8,N_0} \le 48.469\,\mu_{4,N_0}^3, \tag{2.16}$$
and
$$\mu_{2,N} < 2 + \sqrt2, \qquad N_0 \le N < N_1, \tag{2.17}$$
hold. Then there exists an $s_c \in [\underline{s}_{N_1}, \overline{s}_{N_1}]$ such that if $s = s_c$ then
$$\lim_{N\to\infty} \mu_{4,N} = 0, \qquad \lim_{N\to\infty} \mu_{2,N} = 1.$$
Fig. 2.1. A schematic view of trajectories on the $(\mu_2, \mu_4)$-plane in Theorem 2.1. Trajectories for $s = \underline{s}_{N_1}$ and for $s = \overline{s}_{N_1}$ (solid lines) and the critical trajectory for $s = s_c$ (broken line) are shown. The Gaussian fixed point corresponds to the point (1.0, 0). The region defined by inequalities for $(\mu_2, \mu_4)$ analogous to (2.13) and (2.14) (and (2.17)) is shaded
Remark. The original Bleher–Sinai argument takes $N_0 = N_1$. We include the $N_0 < N_1$ case, which makes it possible to complete our proof by evaluating various quantities only at the 2 endpoints of the interval under consideration for the Ising parameter $s$, instead of all values in the interval, as is implicit in the assumptions of Theorem 2.1. This point will be clarified at the end of Sect. 5.3.

The following theorem states our result in the strong coupling regime and is proved in Sect. 5.

Theorem 2.2. The assumptions of Theorem 2.1 are satisfied for $N_0 = 70$ and $N_1 = 100$, where $\underline{s}_{N_1}$ and $\overline{s}_{N_1}$ satisfy
$$1.7925671170092624 \le \underline{s}_{N_1}, \qquad \overline{s}_{N_1} \le 1.7925671170092625.$$
Proof of Theorem 1.1 for $d = 4$ assuming Theorem 2.1 and Theorem 2.2. Theorem 2.1 and Theorem 2.2 imply that there exists $s_c \in [\underline{s}_{N_1}, \overline{s}_{N_1}]$ such that, for $s = s_c$, $\lim_{N\to\infty} \mu_{4,N} = 0$ and $\lim_{N\to\infty} \mu_{2,N} = 1$ hold. Then (2.6), (2.7), and (2.8) imply
$$\lim_{N\to\infty} \hat h_N(\xi) = e^{-\xi^2},$$
uniformly in $\xi$ on any closed interval in $\mathbb R$. It is easy to see that $e^{-\xi^2}$ is the characteristic function of the massless Gaussian measure $h_G$, hence Theorem 1.1 holds for $d = 4$. The bounds on $\underline{s}_{N_1}$ and $\overline{s}_{N_1}$ in Theorem 2.2 imply
$$1.7925671170092624 \le s_c \le 1.7925671170092625.$$
3. Truncated Correlations

In this section, we prepare basic (recursive) bounds on the truncated correlations that will be used in Sect. 4. The renormalization group transformation is decomposed as (2.3). Since the mapping $\mathcal S$ is simple, the essential part of our work is an analysis of $\mathcal T$. The consequence in this section is Proposition 3.1.

3.1. Recursions. Note first that in terms of $V_N$ the mapping $\mathcal S$ can be expressed as
$$\mathcal S e^{-V_N}(\xi) = \exp\Big( -2 V_N\Big( \frac{\sqrt c}{2}\,\xi \Big) \Big). \tag{3.1}$$
Using (2.7), (2.10), (1.4) we also have
$$2 V_N\Big( \frac{\sqrt c}{2}\,\xi \Big) = \sum_{n=1}^{\infty} 2^{1 - (1 + 2/d)n}\, \mu_{2n,N}\, \xi^{2n}. \tag{3.2}$$
Next, write (2.5) as
$$\mathcal T g = \mathrm{const.}\, g_{\beta/2}, \qquad g_t = \exp(-t\Delta) g, \tag{3.3}$$
where $\Delta g(\xi) = \dfrac{d^2 g}{d\xi^2}(\xi)$, and $\beta = \frac12(\sqrt2 - 1)$ for $d = 4$. $g_t$ is a solution to
$$\frac{\partial g_t}{\partial t} = -\Delta g_t, \qquad g_0 = g.$$
Hence, if we put $g_t(\xi) = \exp(-V_t(\xi))$, then $V_t$ satisfies
$$\frac{d}{dt} V_t = (\nabla V_t)^2 - \Delta V_t, \tag{3.4}$$
where $\nabla V_t(\xi) = \dfrac{\partial V_t}{\partial \xi}(\xi)$. In other words, $V_{N+1}$ is given as a solution of (3.4) at $t = \beta/2$ (modulo constant term), with the initial condition (3.2) at $t = 0$. If we write
$$V_t(\xi) = \sum_{n=0}^{\infty} \mu_{2n}(t)\, \xi^{2n},$$
then (3.4) implies
$$\frac{d}{dt}\mu_{2n}(t) = -(2n+2)(2n+1)\,\mu_{2n+2}(t) + \sum_{\ell=1}^{n} (2\ell)(2n - 2\ell + 2)\, \mu_{2\ell}(t)\, \mu_{2n-2\ell+2}(t). \tag{3.5}$$
In particular, we have
$$\frac{d}{dt}\mu_2(t) = 4\mu_2(t)^2 - 12\mu_4(t), \tag{3.6}$$
$$\frac{d}{dt}\mu_4(t) = 16\mu_2(t)\mu_4(t) - 30\mu_6(t), \tag{3.7}$$
$$\frac{d}{dt}\mu_6(t) = 24\mu_2(t)\mu_6(t) + 16\mu_4(t)^2 - 56\mu_8(t), \tag{3.8}$$
$$\frac{d}{dt}\mu_8(t) = 32\mu_2(t)\mu_8(t) + 48\mu_4(t)\mu_6(t) - 90\mu_{10}(t). \tag{3.9}$$
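As a consistency check on the general recursion (3.5), the coefficients of (3.6)–(3.9) can be regenerated mechanically. The sketch below is illustrative only; the function name `flow_rhs` and the dictionary encoding of the quadratic terms are assumptions made here, not notation from the paper.

```python
from collections import Counter


def flow_rhs(n):
    """Coefficients of d mu_{2n}/dt according to (3.5).

    Returns (linear, quadratic): `linear` multiplies mu_{2n+2}, and
    `quadratic[(i, j)]` multiplies mu_i * mu_j with i <= j.
    """
    linear = -(2 * n + 2) * (2 * n + 1)
    quadratic = Counter()
    for l in range(1, n + 1):
        i, j = 2 * l, 2 * n - 2 * l + 2
        quadratic[tuple(sorted((i, j)))] += i * j   # merge the symmetric terms
    return linear, dict(quadratic)


for n in range(1, 5):
    print(n, flow_rhs(n))
```

For $n = 3$, for example, the two cross terms $\mu_2\mu_6$ and $\mu_6\mu_2$ each contribute 12, which gives the coefficient 24 in (3.8).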
Thus, $\mu_{2n,N}$ and $\mu_{2n,N+1}$ are related for $d = 4$ by, e.g.,
$$\mu_2(0) = \frac{1}{\sqrt2}\mu_{2,N}, \quad \mu_4(0) = \frac14\mu_{4,N}, \quad \mu_6(0) = \frac{1}{8\sqrt2}\mu_{6,N}, \quad \mu_8(0) = \frac{1}{32}\mu_{8,N},$$
$$\mu_{2,N+1} = \mu_2\Big(\frac\beta2\Big), \quad \mu_{4,N+1} = \mu_4\Big(\frac\beta2\Big), \quad \mu_{6,N+1} = \mu_6\Big(\frac\beta2\Big), \quad \mu_{8,N+1} = \mu_8\Big(\frac\beta2\Big).$$

3.2. Bounds. We first note that the quantities $\mu_n(t)$ obey Newman’s inequalities: by comparing (2.5) and (3.3) we see that the correspondence $V_N \to V(t)$ is obtained by a replacement $\beta \to 2t$ in (1.2). Therefore $\mu_n(t)$ also is a truncated $n$ point correlation of a measure to which arguments in [15] apply, hence an analogue of (2.8) holds:
$$0 \le \mu_{2n}(t) \le \frac1n (2\mu_4(t))^{n/2}, \qquad n = 3, 4, 5, \dots. \tag{3.10}$$
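We have to show decay of $\mu_{4,N}$ as $N \to \infty$. In case $d > 4$, the decay follows from (3.6) and (3.7) with $d$-dependent coefficients, namely, if we throw out the negative contributions $-\mu_4(t)$ and $-\mu_6(t)$ to the right-hand sides of (3.6) and (3.7), respectively, then we have upper bounds on $\mu_2(t)$ and $\mu_4(t)$. This argument eventually yields exponential decay of $\mu_{4,N}$. In case $d = 4$, the situation is more subtle, since the decay of $\mu_{4,N}$ is weak, i.e., powerlike instead of exponential. In order to derive the delicate bound on $\mu_4(t)$, a lower bound for $\mu_6(t)$ must be incorporated, which in turn needs an upper bound on $\mu_8(t)$. Thus, we have to deal with Eqs. (3.6)–(3.9). This is the principle of our estimation. The result is the following:

Proposition 3.1. Let $d = 4$ and $N$ be a positive integer, and put
$$r_N = \frac{1}{1 - (\sqrt2 - 1)(\mu_{2,N} - 1)} = \frac{1}{\sqrt2 - (\sqrt2 - 1)\mu_{2,N}}, \tag{3.11}$$
$$\zeta_N = \frac{\sqrt2\, r_N - 1}{\sqrt2\, \mu_{2,N}} = \frac{r_N}{\mu_{2,N}} - \frac{1}{\sqrt2\, \mu_{2,N}}. \tag{3.12}$$
(i) If
$$\mu_{2,N} < 2 + \sqrt2, \tag{3.13}$$
then
$$\mu_{2,N+1} \le r_N \mu_{2,N}, \tag{3.14}$$
$$\mu_{2,N+1} \ge r_N \mu_{2,N} - 3 r_N^2 \zeta_N \mu_{4,N}. \tag{3.15}$$
(ii) If, furthermore,
$$\frac14 \mu_{4,N} \ge \frac{15}{8\sqrt2}\zeta_N \mu_{6,N} + \frac{21}{4}\zeta_N^2 \mu_{4,N}^2, \tag{3.16}$$
$$\frac{\mu_{6,N}}{8\sqrt2} + \frac12 \zeta_N \mu_{4,N}^2 \ge 24 \zeta_N^3 \mu_{4,N}^3 + \frac{123}{8\sqrt2}\zeta_N^2 \mu_{4,N}\mu_{6,N} + \frac78 \zeta_N \mu_{8,N}, \tag{3.17}$$
$$\frac32 \zeta_N \mu_{4,N} \ge 12 \zeta_N^3 \mu_{4,N}^2 + \frac{45}{8\sqrt2}\zeta_N^2 \mu_{6,N}, \tag{3.18}$$
then
$$\mu_{2,N+1} \le r_N \mu_{2,N} - 3 r_N^2 \Big( \zeta_N \mu_{4,N} - 8\zeta_N^3 \mu_{4,N}^2 - \frac{15}{4\sqrt2}\zeta_N^2 \mu_{6,N} \Big), \tag{3.19}$$
$$\mu_{4,N+1} \ge r_N^4 \Big( \mu_{4,N} - \frac{15}{2\sqrt2}\zeta_N \mu_{6,N} - 21\zeta_N^2 \mu_{4,N}^2 \Big), \tag{3.20}$$
$$\mu_{4,N+1} \le r_N^4 \Big( \mu_{4,N} - \frac{15}{2\sqrt2}\zeta_N \mu_{6,N} - 21\zeta_N^2 \mu_{4,N}^2 + \frac{705}{2\sqrt2}\zeta_N^3 \mu_{4,N}\mu_{6,N} + 447\zeta_N^4 \mu_{4,N}^3 + \frac{105}{4}\zeta_N^2 \mu_{8,N} \Big), \tag{3.21}$$
$$\mu_{6,N+1} \le r_N^6 \Big( \frac{\mu_{6,N}}{\sqrt2} + 4\zeta_N \mu_{4,N}^2 \Big), \tag{3.22}$$
$$\mu_{6,N+1} \ge r_N^6 \Big( \frac{\mu_{6,N}}{\sqrt2} + 4\zeta_N \mu_{4,N}^2 - 192\zeta_N^3 \mu_{4,N}^3 - \frac{123}{\sqrt2}\zeta_N^2 \mu_{4,N}\mu_{6,N} - 7\zeta_N \mu_{8,N} \Big), \tag{3.23}$$
$$\mu_{8,N+1} \le r_N^8 \Big( \frac{\mu_{8,N}}{2} + \frac{12}{\sqrt2}\zeta_N \mu_{4,N}\mu_{6,N} + 24\zeta_N^2 \mu_{4,N}^3 \Big). \tag{3.24}$$
The rest of this section is devoted to a proof of Proposition 3.1.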
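Proof. Now, observe that $\bar\mu_2(t)$ defined by
$$\frac{d}{dt}\bar\mu_2(t) = 4\bar\mu_2(t)^2, \qquad \bar\mu_2(0) = \frac{1}{\sqrt2}\mu_{2,N}, \tag{3.25}$$
is an upper bound of $\mu_2(t)$:
$$\mu_2(t) \le \bar\mu_2(t) = \frac{1}{\sqrt2}\, \frac{\mu_{2,N}}{1 - 2\sqrt2\, \mu_{2,N}\, t}. \tag{3.26}$$
This, at $t = \dfrac\beta2 = \dfrac{\sqrt2 - 1}{4}$ for $d = 4$, implies (3.14). Put
$$M(t) = \frac{1}{1 - 2\sqrt2\, \mu_{2,N}\, t}, \qquad m(t) = \bar\mu_2(t) - \mu_2(t).$$
We have $m(t) \ge 0$, and (3.13) implies that $M(t)$ is increasing in $t \in [0, \beta/2]$. By a change of variable $z = M(t) - 1$ ($dz = 2\sqrt2\, \mu_{2,N} M(t)^2\, dt$) and by putting
$$\hat m(z) = m(t)/M(t)^2, \quad \hat\mu_4(z) = \mu_4(t)/M(t)^4, \quad \hat\mu_6(z) = \mu_6(t)/M(t)^6, \quad \hat\mu_8(z) = \mu_8(t)/M(t)^8,$$
we have, from (3.6)–(3.9),
$$\hat\mu_4(z) = \frac{\mu_{4,N}}{4} + \frac{1}{\sqrt2\,\mu_{2,N}} \int_0^z \big( -8\hat m(z')\hat\mu_4(z') - 15\hat\mu_6(z') \big)\, dz', \tag{3.27}$$
$$\hat\mu_6(z) = \frac{\mu_{6,N}}{8\sqrt2} + \frac{1}{\sqrt2\,\mu_{2,N}} \int_0^z \big( 8\hat\mu_4(z')^2 - 12\hat m(z')\hat\mu_6(z') - 28\hat\mu_8(z') \big)\, dz', \tag{3.28}$$
$$\hat\mu_8(z) = \frac{\mu_{8,N}}{32} + \frac{1}{\sqrt2\,\mu_{2,N}} \int_0^z \big( 24\hat\mu_4(z')\hat\mu_6(z') - 16\hat m(z')\hat\mu_8(z') - 45\hat\mu_{10}(z') \big)\, dz', \tag{3.29}$$
$$\hat m(z) = \frac{1}{\sqrt2\,\mu_{2,N}} \int_0^z \big( 6\hat\mu_4(z') - 2\hat m(z')^2 \big)\, dz'. \tag{3.30}$$
Eqs. (3.27)–(3.30) with positivity of $\mu_{2n}(t)$ imply
$$\hat\mu_4(z) \le \frac{\mu_{4,N}}{4}, \tag{3.31}$$
$$\hat\mu_6(z) \le \frac{\mu_{6,N}}{8\sqrt2} + \frac{1}{\sqrt2\,\mu_{2,N}} \int_0^z 8\hat\mu_4(z')^2\, dz' \le \frac{\mu_{6,N}}{8\sqrt2} + \frac{\mu_{4,N}^2}{2\sqrt2\,\mu_{2,N}}\, z, \tag{3.32}$$
$$\hat\mu_8(z) \le \frac{\mu_{8,N}}{32} + \frac{1}{\sqrt2\,\mu_{2,N}} \int_0^z 24\hat\mu_4(z')\hat\mu_6(z')\, dz' \le \frac{\mu_{8,N}}{32} + \frac38 \frac{\mu_{4,N}\mu_{6,N}}{\mu_{2,N}}\, z + \frac34 \frac{\mu_{4,N}^3}{\mu_{2,N}^2}\, z^2, \tag{3.33}$$
$$\hat m(z) \le \frac{1}{\sqrt2\,\mu_{2,N}} \int_0^z 6\hat\mu_4(z')\, dz' \le \frac{3\mu_{4,N}}{2\sqrt2\,\mu_{2,N}}\, z. \tag{3.34}$$
In particular, (3.34) at $t = \beta/2$ ($z = M(\beta/2) - 1 = \sqrt2\, r_N - 1$ for $d = 4$) implies (3.15). Using (3.31), (3.32), (3.34) in (3.27), we have
$$\hat\mu_4(z) \ge \frac{\mu_{4,N}}{4} - \frac{15\mu_{6,N}}{16\mu_{2,N}}\, z - \frac{21\mu_{4,N}^2}{8\mu_{2,N}^2}\, z^2. \tag{3.35}$$
Using (3.32), (3.33), (3.34), (3.35) in (3.28) and (3.30) we further have
$$\hat\mu_6(z) \ge \frac{\mu_{6,N}}{8\sqrt2} + \frac{\mu_{4,N}^2}{2\sqrt2\,\mu_{2,N}}\, z - \frac{12\mu_{4,N}^3}{\sqrt2\,\mu_{2,N}^3}\, z^3 - \frac{123\mu_{4,N}\mu_{6,N}}{16\sqrt2\,\mu_{2,N}^2}\, z^2 - \frac{7\mu_{8,N}}{8\sqrt2\,\mu_{2,N}}\, z, \tag{3.36}$$
$$\hat m(z) \ge \frac{3\mu_{4,N}}{2\sqrt2\,\mu_{2,N}}\, z - \frac{6\mu_{4,N}^2}{\sqrt2\,\mu_{2,N}^3}\, z^3 - \frac{45\mu_{6,N}}{16\sqrt2\,\mu_{2,N}^2}\, z^2. \tag{3.37}$$
When $d = 4$, $\dfrac\beta2 = \dfrac{\sqrt2 - 1}{4}$ and $z = M(\beta/2) - 1 = \sqrt2\, r_N - 1$ (that is, $M(\beta/2) = \sqrt2\, r_N$). Then the assumptions (3.16)–(3.18) of Proposition 3.1 imply that the right-hand sides of (3.35), (3.36), and (3.37) are non-negative at $t = \beta/2$. On the other hand, they are concave in $z$ for $z \ge 0$. Recall also that $z = M(t) - 1$ is increasing in $t \in [0, \beta/2]$. Therefore, they are non-negative for all $t \in [0, \beta/2]$. Using (3.35), (3.36), and (3.37) in (3.27), we therefore have
$$\hat\mu_4(z) \le \frac{\mu_{4,N}}{4} - \frac{1}{\sqrt2\,\mu_{2,N}} \int_0^z \Big[ 8 \Big( \frac{3\mu_{4,N}}{2\sqrt2\,\mu_{2,N}}\, z' - \frac{6\mu_{4,N}^2}{\sqrt2\,\mu_{2,N}^3}\, z'^3 - \frac{45\mu_{6,N}}{16\sqrt2\,\mu_{2,N}^2}\, z'^2 \Big) \Big( \frac{\mu_{4,N}}{4} - \frac{15\mu_{6,N}}{16\mu_{2,N}}\, z' - \frac{21\mu_{4,N}^2}{8\mu_{2,N}^2}\, z'^2 \Big)$$
$$\qquad + 15 \Big( \frac{\mu_{6,N}}{8\sqrt2} + \frac{\mu_{4,N}^2}{2\sqrt2\,\mu_{2,N}}\, z' - \frac{12\mu_{4,N}^3}{\sqrt2\,\mu_{2,N}^3}\, z'^3 - \frac{123\mu_{4,N}\mu_{6,N}}{16\sqrt2\,\mu_{2,N}^2}\, z'^2 - \frac{7\mu_{8,N}}{8\sqrt2\,\mu_{2,N}}\, z' \Big) \Big]\, dz'$$
$$\le \frac{\mu_{4,N}}{4} - \frac{15\mu_{6,N}}{16\mu_{2,N}}\, z - \frac{21\mu_{4,N}^2}{8\mu_{2,N}^2}\, z^2 + \frac{705\mu_{4,N}\mu_{6,N}}{32\mu_{2,N}^3}\, z^3 + \frac{447\mu_{4,N}^3}{16\mu_{2,N}^4}\, z^4 + \frac{105\mu_{8,N}}{32\mu_{2,N}^2}\, z^2. \tag{3.38}$$
Recalling that at $t = \beta/2$ ($z = M(\beta/2) - 1 = \sqrt2\, r_N - 1$) we have
$$\bar\mu_2\Big(\frac\beta2\Big) = r_N \mu_{2,N},$$
$$\mu_{2,N+1} = r_N \mu_{2,N} - \hat m(\sqrt2\, r_N - 1)\, M\Big(\frac\beta2\Big)^2,$$
$$\mu_{4,N+1} = \hat\mu_4(\sqrt2\, r_N - 1)\, M\Big(\frac\beta2\Big)^4, \quad \mu_{6,N+1} = \hat\mu_6(\sqrt2\, r_N - 1)\, M\Big(\frac\beta2\Big)^6, \quad \mu_{8,N+1} = \hat\mu_8(\sqrt2\, r_N - 1)\, M\Big(\frac\beta2\Big)^8,$$
we see that (3.37), (3.35), (3.38), (3.32), (3.36), (3.33) imply (3.19)–(3.24), respectively. This completes a proof of Proposition 3.1.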
4. Bleher–Sinai Argument

In order to show Theorem 2.1, we confirm existence of a critical parameter $s = s_c$ by means of the Bleher–Sinai argument, and, at the same time, we derive the expected decay of $\mu_{4,N}$. In the Bleher–Sinai argument, monotonicity of $\underline{s}_N$ and $\overline{s}_N$ with respect to $N$ is essential.

Proposition 4.1. Let $d = 4$. Then the following hold:
(1) If $\mu_{2,N} - 1 < 0$ then $\mu_{2,N+1} < \mu_{2,N}$.
(2) If $\dfrac14 > \mu_{2,N} - 1 \ge \dfrac{3}{\sqrt2}\mu_{4,N}$ then $\mu_{2,N+1} \ge \mu_{2,N}$.

Proof. Note that for both cases in the statement, the assumption (3.13) in Proposition 3.1 holds. Hence, (3.14), with (3.11) and monotonicity of $\mu_{2,N}$, implies
$$\mu_{2,N} - 1 < 0 \ \Rightarrow\ r_N < 1 \ \Rightarrow\ \mu_{2,N+1} < \mu_{2,N}. \tag{4.1}$$
Next we see that (3.15), with (3.11) and (3.12), implies
$$\mu_{2,N} - 1 \ge \frac{3 r_N (\sqrt2\, r_N - 1)}{(2 - \sqrt2)\,\mu_{2,N}^2}\, \mu_{4,N} \ \Rightarrow\ \mu_{2,N+1} \ge \mu_{2,N}. \tag{4.2}$$
Put
$$L_1(x) = \frac{3}{\sqrt2\, x\, \big( \sqrt2 - (\sqrt2 - 1)x \big)^2}.$$
Then by straightforward calculation we see
$$1 \le x \le \frac54 \ \Rightarrow\ L_1(x) \le L_1(1) = \frac{3}{\sqrt2},$$
and (3.11) implies
$$L_1(\mu_{2,N}) = \frac{3 r_N (\sqrt2\, r_N - 1)}{(2 - \sqrt2)\,\mu_{2,N}^2}.$$
Therefore (4.2) implies that
$$\frac14 > \mu_{2,N} - 1 \ge \frac{3}{\sqrt2}\mu_{4,N} \ \Rightarrow\ \mu_{2,N+1} \ge \mu_{2,N}. \tag{4.3}$$
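The “straightforward calculation” for $L_1$ can be spot-checked numerically. The sketch below evaluates $L_1$ on a uniform grid in ordinary floating point; it is an illustration under that assumption and not a rigorous bound, and the function names and grid size are choices made here.

```python
import math

SQRT2 = math.sqrt(2.0)


def L1(x):
    """L_1(x) = 3 / (sqrt(2) x (sqrt(2) - (sqrt(2) - 1) x)^2) from the proof of Proposition 4.1."""
    return 3.0 / (SQRT2 * x * (SQRT2 - (SQRT2 - 1.0) * x) ** 2)


def max_L1_on_grid(lo=1.0, hi=1.25, points=10001):
    """Largest value of L_1 on a uniform grid of [lo, hi] (floating point, non-rigorous)."""
    return max(L1(lo + (hi - lo) * i / (points - 1)) for i in range(points))


print(L1(1.0), max_L1_on_grid(), 3.0 / SQRT2)
```

The denominator $x(\sqrt2 - (\sqrt2 - 1)x)^2$ first increases and then decreases on $[1, 5/4]$, returning to a value only slightly above 1 at $x = 5/4$, which is why the upper limit $5/4$ in the statement cannot be enlarged much.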
Corollary 4.2. Let $d = 4$. Then, for the $\underline{s}_N$ defined in (2.11), it holds that $\underline{s}_N \le \underline{s}_{N+1}$.

Proof. Since $\mu_{2,N}$ is increasing in $s$, if $s < \underline{s}_N$ then $\mu_{2,N} < 1$, hence Proposition 4.1 implies $\mu_{2,N+1} < \mu_{2,N} < 1$, further implying $s < \underline{s}_{N+1}$. Hence the statement holds.
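For later convenience, define
$$r_N^* = \frac{1}{1 - (\sqrt2 - 1)\dfrac{3}{\sqrt2}\mu_{4,N}}, \tag{4.4}$$
$$\zeta_{*N} = 1 - \frac{1}{\sqrt2}, \tag{4.5}$$
$$\zeta_N^* = \frac{\sqrt2\, r_N^* - 1}{\sqrt2 \Big( 1 + \dfrac{3}{\sqrt2}\mu_{4,N} \Big)}. \tag{4.6}$$
Then we see that if (2.13) holds, then we have, from (3.11) and (3.12),
$$1 \le r_N \le r_N^*, \tag{4.7}$$
$$\zeta_{*N} \le \zeta_N \le \zeta_N^*. \tag{4.8}$$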
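Proposition 4.3. Let $d = 4$ and put
$$\alpha_0 = 0.0045, \quad \alpha_1 = 1.6, \quad \alpha_2 = 6.07, \quad \alpha_3 = 48.469.$$
Assume that there exists an integer $N$ such that (3.13) and
$$(0 \le)\ \mu_{4,N} \le \alpha_0, \tag{4.9}$$
$$\alpha_1 \mu_{4,N}^2 \le \mu_{6,N} \le \alpha_2 \mu_{4,N}^2, \tag{4.10}$$
$$(0 \le)\ \mu_{8,N} \le \alpha_3 \mu_{4,N}^3, \tag{4.11}$$
hold. Then (3.16)–(3.18) hold, and the following also hold:
$$(0 \le)\ \mu_{4,N+1} \le \mu_{4,N}(1 - 0.08\mu_{4,N})\ (\le \alpha_0), \tag{4.12}$$
$$\alpha_1 \mu_{4,N+1}^2 \le \mu_{6,N+1} \le \alpha_2 \mu_{4,N+1}^2, \tag{4.13}$$
$$(0 \le)\ \mu_{8,N+1} \le \alpha_3 \mu_{4,N+1}^3. \tag{4.14}$$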
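Proof. For $x \ge 0$ put
$$\ell_r(x) = \frac{1}{1 - (\sqrt2 - 1)\dfrac{3}{\sqrt2}x}, \qquad \ell_d(x) = 1 - \frac{1}{\sqrt2}, \qquad \ell_u(x) = \frac{\sqrt2\, \ell_r(x) - 1}{\sqrt2 \Big( 1 + \dfrac{3}{\sqrt2}x \Big)},$$
$$L_2(x) = 1 - \Big( \frac{15}{2\sqrt2}\alpha_2 \ell_u(x) + 21 \ell_u(x)^2 \Big) x. \tag{4.15}$$
In particular, (4.4), (4.5), (4.6) imply
$$r_N^* = \ell_r(\mu_{4,N}), \qquad \zeta_N^* = \ell_u(\mu_{4,N}), \qquad \zeta_{*N} = \ell_d(\mu_{4,N}).$$
By explicit calculation, we see that
$$L_2(x) > 0, \qquad 0 \le x \le \alpha_0. \tag{4.16}$$
The right-hand side of (3.16) is then bounded from above by
$$\frac14 \mu_{4,N}\big( 1 - L_2(\mu_{4,N}) \big) \le \frac14 \mu_{4,N},$$
hence (3.16) holds. Similarly, (3.18) is seen to hold for $0 \le \mu_{4,N} \le \alpha_0$, if we note that the right-hand side of (3.18) is bounded from above by
$$\zeta_N \mu_{4,N}^2 \Big( 12 \zeta_N^{*2} + \frac{45}{8\sqrt2}\zeta_N^* \alpha_2 \Big) \le \frac34 \zeta_N \big( 1 - L_2(\mu_{4,N}) \big)\mu_{4,N} \le \frac32 \zeta_N \mu_{4,N}.$$
The condition (3.17) is seen to hold with a similar argument, if we note that the right-hand side is bounded from above by
$$\zeta_N \mu_{4,N}^3 \Big( 24\zeta_N^2 + \frac{123}{8\sqrt2}\zeta_N \alpha_2 + \frac78 \alpha_3 \Big),$$
while the left-hand side is bounded from below by
$$\mu_{4,N}^2 \Big( \frac{\alpha_1}{8\sqrt2} + \frac12 \zeta_N \Big).$$
Therefore, the conclusions of Proposition 3.1 hold; in particular, (3.20)–(3.24) imply
$$\mu_{4,N+1} \ge r_N^4\, \mu_{4,N} \Big( 1 - \Big( \frac{15}{2\sqrt2}\zeta_N \alpha_2 + 21\zeta_N^2 \Big)\mu_{4,N} \Big), \tag{4.17}$$
$$\mu_{4,N+1} \le r_N^4 \Big[ 1 - \Big( \frac{15}{2\sqrt2}\zeta_N \alpha_1 + 21\zeta_N^2 \Big)\mu_{4,N} + \Big( \frac{705}{2\sqrt2}\zeta_N^3 \alpha_2 + 447\zeta_N^4 + \frac{105}{4}\zeta_N^2 \alpha_3 \Big)\mu_{4,N}^2 \Big] \mu_{4,N}, \tag{4.18}$$
$$\frac{\mu_{6,N+1}}{\mu_{4,N+1}^2} \le \Big( \frac{\mu_{4,N}}{\mu_{4,N+1}} \Big)^2 r_N^6 \Big( \frac{\alpha_2}{\sqrt2} + 4\zeta_N \Big), \tag{4.19}$$
$$\frac{\mu_{6,N+1}}{\mu_{4,N+1}^2} \ge \Big( \frac{\mu_{4,N}}{\mu_{4,N+1}} \Big)^2 r_N^6 \Big[ \frac{\alpha_1}{\sqrt2} + 4\zeta_N - \Big( 192\zeta_N^3 + \frac{123}{\sqrt2}\zeta_N^2 \alpha_2 + 7\zeta_N \alpha_3 \Big)\mu_{4,N} \Big], \tag{4.20}$$
$$\frac{\mu_{8,N+1}}{\mu_{4,N+1}^3} \le \Big( \frac{\mu_{4,N}}{\mu_{4,N+1}} \Big)^3 r_N^8 \Big( \frac{\alpha_3}{2} + \frac{12}{\sqrt2}\zeta_N \alpha_2 + 24\zeta_N^2 \Big). \tag{4.21}$$
Rewriting (4.17), using (4.7) and (4.8), we have
$$\frac{\mu_{4,N}}{\mu_{4,N+1}} \le \frac{1}{r_N^4}\, \frac{1}{1 - \Big( \dfrac{15}{2\sqrt2}\zeta_N \alpha_2 + 21\zeta_N^2 \Big)\mu_{4,N}} \le \frac{1}{L_2(\mu_{4,N})}. \tag{4.22}$$
This and (4.19) imply
$$\frac{\mu_{6,N+1}}{\mu_{4,N+1}^2} \le \frac{\dfrac{1}{\sqrt2}\alpha_2 + 4\ell_u(\mu_{4,N})}{L_2(\mu_{4,N})^2}.$$
By explicit calculation, we see that
$$0 \le x \le \alpha_0 \ \Rightarrow\ \frac{\dfrac{1}{\sqrt2}\alpha_2 + 4\ell_u(x)}{L_2(x)^2} \le \alpha_2.$$
Therefore the upper bound in (4.13) holds. In a similar way, we note that (4.21) and (4.22) imply
$$\frac{\mu_{8,N+1}}{\mu_{4,N+1}^3} \le \frac{\dfrac12 \alpha_3 + \dfrac{12}{\sqrt2}\ell_u(\mu_{4,N})\alpha_2 + 24\ell_u(\mu_{4,N})^2}{L_2(\mu_{4,N})^3}.$$
By explicit calculation, we see that
$$0 \le x \le \alpha_0 \ \Rightarrow\ \frac{\dfrac12 \alpha_3 + \dfrac{12}{\sqrt2}\ell_u(x)\alpha_2 + 24\ell_u(x)^2}{L_2(x)^3} \le \alpha_3.$$
Therefore (4.14) holds. Similarly, from (4.20) and (4.18), we see that if $0 \le \mu_{4,N} \le \alpha_0$, then
$$\frac{\mu_{6,N+1}}{\mu_{4,N+1}^2} \ge \frac{\dfrac{\alpha_1}{\sqrt2} + 4\ell_d(\mu_{4,N}) - \Big( 192\ell_u(\mu_{4,N})^3 + \dfrac{123}{\sqrt2}\ell_u(\mu_{4,N})^2 \alpha_2 + 7\ell_u(\mu_{4,N})\alpha_3 \Big)\mu_{4,N}}{\ell_r(\mu_{4,N})^2\, D^2} \ge \alpha_1,$$
where
$$D = 1 - \Big( \frac{15}{2\sqrt2}\ell_d(\mu_{4,N})\alpha_1 + 21\ell_d(\mu_{4,N})^2 \Big)\mu_{4,N} + \Big( \frac{705}{2\sqrt2}\ell_u(\mu_{4,N})^3 \alpha_2 + 447\ell_u(\mu_{4,N})^4 + \frac{105}{4}\ell_u(\mu_{4,N})^2 \alpha_3 \Big)\mu_{4,N}^2. \tag{4.23}$$
The lower bound in (4.13) therefore holds. Finally, from (4.18), we have, again with a similar argument,
$$\frac{\mu_{4,N+1}}{\mu_{4,N}} \le \ell_r(\mu_{4,N})^4 \Big[ 1 - \Big( \frac{15}{2\sqrt2}\ell_d(\mu_{4,N})\alpha_1 + 21\ell_d(\mu_{4,N})^2 \Big)\mu_{4,N} + \Big( \frac{705}{2\sqrt2}\ell_u(\mu_{4,N})^3 \alpha_2 + 447\ell_u(\mu_{4,N})^4 + \frac{105}{4}\ell_u(\mu_{4,N})^2 \alpha_3 \Big)\mu_{4,N}^2 \Big] \le 1 - 0.08\mu_{4,N},$$
if $0 \le \mu_{4,N} \le \alpha_0$. Therefore (4.12) holds.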
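The steps marked “by explicit calculation” in the proof above are one-variable inequalities on $0 \le x \le \alpha_0$, so they can be spot-checked on a grid. The following sketch does this in ordinary floating point; it is an illustration and not a substitute for a rigorous evaluation. The names `ell_r`, `ell_d`, `ell_u` stand for the three auxiliary functions of (4.15) (whose original symbol is not recoverable from the source), and the grid size is an arbitrary choice.

```python
import math

SQRT2 = math.sqrt(2.0)
A0, A1, A2, A3 = 0.0045, 1.6, 6.07, 48.469   # alpha_0 .. alpha_3 of Proposition 4.3


def ell_r(x):
    return 1.0 / (1.0 - (SQRT2 - 1.0) * 3.0 / SQRT2 * x)


def ell_d(x):
    return 1.0 - 1.0 / SQRT2


def ell_u(x):
    return (SQRT2 * ell_r(x) - 1.0) / (SQRT2 * (1.0 + 3.0 / SQRT2 * x))


def L2(x):
    u = ell_u(x)
    return 1.0 - (15.0 / (2 * SQRT2) * A2 * u + 21.0 * u * u) * x


def D(x):
    d, u = ell_d(x), ell_u(x)
    return (1.0 - (15.0 / (2 * SQRT2) * d * A1 + 21.0 * d * d) * x
            + (705.0 / (2 * SQRT2) * u ** 3 * A2 + 447.0 * u ** 4
               + 105.0 / 4 * u * u * A3) * x * x)


def check(x):
    """True if all one-variable inequalities used in the proof hold at x."""
    u, d = ell_u(x), ell_d(x)
    upper6 = (A2 / SQRT2 + 4 * u) / L2(x) ** 2
    upper8 = (A3 / 2 + 12 / SQRT2 * u * A2 + 24 * u * u) / L2(x) ** 3
    lower6 = ((A1 / SQRT2 + 4 * d
               - (192 * u ** 3 + 123 / SQRT2 * u * u * A2 + 7 * u * A3) * x)
              / (ell_r(x) ** 2 * D(x) ** 2))
    decay = ell_r(x) ** 4 * D(x)
    return (L2(x) > 0 and upper6 <= A2 and upper8 <= A3
            and lower6 >= A1 and decay <= 1 - 0.08 * x)


grid_ok = all(check(A0 * i / 1000) for i in range(1001))
print(grid_ok)
```

The upper bound with $\alpha_3$ is nearly saturated at $x = \alpha_0$, which suggests the constants were tuned to be close to optimal; slightly larger $x$ already violates it.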
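Corollary 4.4. Let $d = 4$, and assume that for some $N$ the assumptions (4.9)–(4.11) in Proposition 4.3 hold for all $s$ satisfying $\underline{s}_N \le s \le \overline{s}_N$, where $\underline{s}_N$ and $\overline{s}_N$ are defined in (2.11) and (2.12). Then it holds that $\overline{s}_{N+1} \le \overline{s}_N$.

Proof. By (4.9), $1 + \dfrac{3}{\sqrt2}\mu_{4,N} < 2 + \sqrt2$, if $\underline{s}_N \le s \le \overline{s}_N$. Hence, by (2.12),
$$\overline{s}_N = \inf\Big\{ s > 0 \;\Big|\; \mu_{2,N} \ge 1 + \frac{3}{\sqrt2}\mu_{4,N} \Big\},$$
and, from monotonicity of $\mu_{2,N}$ in $s$, (3.13) holds if $s \le \overline{s}_N$. Continuity of $\mu_{2,N}$ and $\mu_{4,N}$ in $s$ imply
$$\mu_{2,N} = 1 + \frac{3}{\sqrt2}\mu_{4,N}, \qquad \text{if } s = \overline{s}_N.$$
(In particular, we may assume that $\dfrac54 > \mu_{2,N}$.) Hence Proposition 4.1 implies
$$\mu_{2,N+1} \ge 1 + \frac{3}{\sqrt2}\mu_{4,N}, \qquad \text{for } s = \overline{s}_N. \tag{4.24}$$
By assumptions at $s = \overline{s}_N$, we see, from Proposition 4.3, that $\mu_{4,N+1} \le \mu_{4,N}$, which, with (4.24), implies
$$\mu_{2,N+1} \ge 1 + \frac{3}{\sqrt2}\mu_{4,N+1}.$$
This proves $\overline{s}_{N+1} \le \overline{s}_N$.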
Proof of Theorem 2.1. Note first that Corollary 4.2 implies
$$\underline{s}_N \le \underline{s}_{N+1}, \qquad N = N_1, N_1 + 1, N_1 + 2, \dots. \tag{4.25}$$
With assumptions of the theorem and by induction on $N$, Proposition 4.3 implies that for any $s$ satisfying $\underline{s}_{N_1} \le s \le \overline{s}_{N_1}$, the bounds (4.9)–(4.11) hold for $N = N_1$. Hence Corollary 4.4 implies $\overline{s}_{N_1+1} \le \overline{s}_{N_1}$. Also, since $s \le \overline{s}_{N_1}$ implies (3.13) for $N = N_1$, Proposition 4.3 implies that (4.9)–(4.11) hold for $N = N_1 + 1$ and $\underline{s}_{N_1+1} \le s \le \overline{s}_{N_1+1}$. We can proceed with induction on $N$ and repeat this argument to conclude that (4.12)–(4.14) hold for $\underline{s}_N \le s \le \overline{s}_N$, $N = N_1, N_1 + 1, N_1 + 2, \dots$, and
$$\overline{s}_{N+1} \le \overline{s}_N, \qquad N = N_1, N_1 + 1, N_1 + 2, \dots. \tag{4.26}$$
The bounds (4.25) and (4.26) imply that a sequence of closed intervals on $\mathbb R$,
$$[\underline{s}_{N_1}, \overline{s}_{N_1}] \supset [\underline{s}_{N_1+1}, \overline{s}_{N_1+1}] \supset [\underline{s}_{N_1+2}, \overline{s}_{N_1+2}] \supset \cdots,$$
is contracting, hence there exists an $s_c$, satisfying $\underline{s}_{N_1} \le s_c \le \overline{s}_{N_1}$, such that
$$\underline{s}_N \le s_c \le \overline{s}_N, \qquad N = N_1, N_1 + 1, N_1 + 2, \dots.$$
Hence, in particular, (4.12) holds for all integers $N \ge N_1$ at $s = s_c$. This implies
$$\lim_{N\to\infty} \mu_{4,N} = 0$$
at $s = s_c$. Also we see that if $s = s_c$ then (2.13) holds for all $N \ge N_1$. Therefore we have
$$\lim_{N\to\infty} \mu_{2,N} = 1$$
at $s = s_c$. This completes a proof of Theorem 2.1.
5. Strong Coupling Problem

We shall prove Theorem 2.2 by (computer-aided) brute force evaluation of the Taylor coefficients of $\hat h_N(\xi)$ instead of $V_N(\xi)$.

5.1. Taylor expansion. Define the Taylor coefficients $a_{n,N}$, $n \in \mathbb Z_+$, of $\hat h_N$ by
$$\hat h_N(\xi) = \sum_{n=0}^{\infty} (-1)^n \frac{1}{n!}\, a_{n,N}\, \xi^{2n}. \tag{5.1}$$
In particular, $a_{0,N} = \hat h_N(0) = 1$. Note also that
$$a_{n,N} \ge 0, \qquad n \in \mathbb Z_+.$$
$\mu_{n,N}$ and $a_{n,N}$ are related, e.g., as
$$\mu_{2,N} = a_{1,N}, \qquad \mu_{4,N} = \frac{a_{1,N}^2 - a_{2,N}}{2}, \qquad \mu_{6,N} = \frac{a_{1,N}^3}{3} - \frac{a_{1,N}a_{2,N}}{2} + \frac{a_{3,N}}{6},$$
$$\mu_{8,N} = \frac{a_{1,N}^4}{4} - \frac{a_{1,N}^2 a_{2,N}}{2} + \frac{a_{2,N}^2}{8} + \frac{a_{1,N}a_{3,N}}{6} - \frac{a_{4,N}}{24}.$$
For the Ising measure $h_0 = h_{I,s}$,
$$a_{n,0} = (-1)^n \frac{n!}{(2n)!}\, \frac{d^{2n}\hat h_0}{d\xi^{2n}}(0) = \frac{n!}{(2n)!} \int x^{2n} h_{I,s}(x)\, dx = \frac{n!}{(2n)!}\, s^{2n}, \qquad n \in \mathbb Z_+. \tag{5.2}$$
Note that one of the Newman inequalities (see (A.6)), or the Gaussian inequalities, imply that
$$a_{n,N} \le a_{1,N}^n = \mu_{2,N}^n, \qquad n \in \mathbb Z_+.$$
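The relations between $\mu_{2n,N}$ and $a_{n,N}$ can be exercised on the Ising initial data (5.2), for which the truncated correlations are known in closed form ($\mu_{2,0} = s^2/2$, $\mu_{4,0} = s^4/12$, $\mu_{6,0} = s^6/45$, $\mu_{8,0} = 17s^8/2520$). The sketch below uses exact fractions; the helper names `mu_from_a` and `ising_a` are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction
from math import factorial


def mu_from_a(a1, a2, a3, a4):
    """Truncated correlations mu_2, mu_4, mu_6, mu_8 from Taylor coefficients a_1..a_4."""
    mu2 = a1
    mu4 = (a1 ** 2 - a2) / 2
    mu6 = a1 ** 3 / 3 - a1 * a2 / 2 + a3 / 6
    mu8 = a1 ** 4 / 4 - a1 ** 2 * a2 / 2 + a2 ** 2 / 8 + a1 * a3 / 6 - a4 / 24
    return mu2, mu4, mu6, mu8


def ising_a(n, s=Fraction(1)):
    """a_{n,0} = n!/(2n)! * s^(2n) for the Ising measure, as in (5.2)."""
    return Fraction(factorial(n), factorial(2 * n)) * s ** (2 * n)


mus = mu_from_a(*(ising_a(n) for n in range(1, 5)))
print(mus)
```

With $s = 1$ the result reproduces the coefficients of $-\log\cos\xi$, and the bound $a_{n,0} \le a_{1,0}^n$ can be read off directly from the values of `ising_a`.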
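Define $b_{n,N}$, $n \in \mathbb Z_+$, by
$$(\mathcal S \hat h_N)(\xi) = \hat h_N\Big( \frac{\sqrt c}{2}\,\xi \Big)^{2} = \sum_{n=0}^{\infty} (-1)^n \frac{1}{n!}\, b_{n,N}\, \xi^{2n}, \tag{5.3}$$
where $\mathcal S$ is in (2.4). Then
$$b_{n,N} = \Big( \frac{c}{4} \Big)^{n} \sum_{\ell=0}^{n} \binom{n}{\ell} a_{\ell,N}\, a_{n-\ell,N}, \qquad n \in \mathbb Z_+. \tag{5.4}$$
With (5.3) we have
$$b_{n,N} \le \Big( \frac{c\,\mu_{2,N}}{2} \Big)^{n}, \qquad n \in \mathbb Z_+. \tag{5.5}$$
Next define $\tilde a_{n,N}$, $n \in \mathbb Z_+$, by
$$\sum_{m=0}^{\infty} \frac{1}{m!} \Big( -\frac\beta2 \Big)^{m} \frac{d^{2m}}{d\xi^{2m}}\, \mathcal S \hat h_N(\xi) = \sum_{n=0}^{\infty} (-1)^n \frac{1}{n!}\, \tilde a_{n,N}\, \xi^{2n}.$$
Then
$$\tilde a_{n,N} = \sum_{m=0}^{\infty} \Big( \frac\beta2 \Big)^{m} b_{m+n,N}\, \frac{(2m+2n)!\, n!}{m!\, (m+n)!\, (2n)!}, \qquad n \in \mathbb Z_+, \tag{5.6}$$
and (2.5) implies
$$\hat h_{N+1}(\xi) = \frac{1}{\tilde a_{0,N}} \sum_{n=0}^{\infty} (-1)^n \frac{1}{n!}\, \tilde a_{n,N}\, \xi^{2n},$$
where we fixed the constant in the definition of $\mathcal T$ by $\hat h_{N+1}(0) = 1$. Comparing this with (5.1) we obtain a recursion relation in $N$ for $a_{n,N}$:
$$a_{n,N+1} = \frac{\tilde a_{n,N}}{\tilde a_{0,N}}, \qquad n \in \mathbb Z_+,\ N \in \mathbb Z_+. \tag{5.7}$$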
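The recursion (5.4), (5.6), (5.7) can be sketched in a few lines of floating-point code. This is a plain truncation for illustration only: it drops all coefficients above a cutoff and carries no error bounds, so it is not the rigorous scheme used in the proof. The function name `rg_step` and the cutoff `K` are assumptions made here. For one step from Ising initial data, the output can be compared with the closed forms $\mu_{2,1} = k\,cs^2/(2(k+1))$ and $\mu_{4,1} = \frac{k}{6}(2k-1)\big(cs^2/(2(k+1))\big)^2$, $k = e^{\beta c s^2/2}$, which follow from $\hat h_1$.

```python
import math


def rg_step(a, beta, c):
    """One step a_{.,N} -> a_{.,N+1} via (5.4), (5.6), (5.7), truncated at index len(a) - 1."""
    K = len(a) - 1
    # (5.4): b_n = (c/4)^n sum_l C(n, l) a_l a_{n-l}, exact for n <= K given a_0..a_K
    b = [(c / 4) ** n * sum(math.comb(n, l) * a[l] * a[n - l] for l in range(n + 1))
         for n in range(K + 1)]

    def a_tilde(n):
        # (5.6), keeping only the terms with m + n <= K (plain truncation, no tail bound)
        total = 0.0
        for m in range(K - n + 1):
            weight = (math.factorial(2 * m + 2 * n) * math.factorial(n)
                      / (math.factorial(m) * math.factorial(m + n) * math.factorial(2 * n)))
            total += (beta / 2) ** m * b[m + n] * weight
        return total

    at = [a_tilde(n) for n in range(K + 1)]
    return [x / at[0] for x in at]          # (5.7)


d = 4
c = 2 ** (1 - 2 / d)                        # (1.4)
beta = (2 ** (2 / d) - 1) / 2
s, K = 1.0, 30                              # illustrative Ising parameter and cutoff
a0 = [math.factorial(n) / math.factorial(2 * n) * s ** (2 * n) for n in range(K + 1)]
a1 = rg_step(a0, beta, c)

k = math.exp(beta * c * s * s / 2)
w = c * s * s / (2 * (k + 1))
print(a1[1], k * w)
```

For Ising initial data the coefficients decay factorially, so the truncation error after one step is far below double precision; for long trajectories near criticality a controlled scheme with explicit tail bounds is needed instead.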
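5.2. Truncation. We will evaluate a finite number, say $M$, of $a_{n,N}$’s ($n = 1, 2, \dots, M$) explicitly with aid of computer calculations, by evaluating $a_{n,N}$, $n > M$, “theoretically”. For this, we need to give bounds of series in (5.4) and (5.6) in terms of sums of finite terms. The following proposition serves for this purpose.

Proposition 5.1. Let $M$ be a positive integer, and define
$$\underline{b}_{n,N},\ \overline{b}_{n,N}, \qquad n = 0, 1, 2, \dots, 2M,$$
and
$$\underline{\tilde a}_{n,N},\ \overline{\tilde a}_{n,N},\ \underline{a}_{n,N},\ \overline{a}_{n,N}, \qquad n = 0, 1, 2, \dots, M,$$
inductively in $N \in \mathbb Z_+$, by
$$\underline{a}_{n,0} = \overline{a}_{n,0} = \frac{n!}{(2n)!}\, s^{2n}, \qquad n = 0, 1, 2, \dots, M,$$
and
$$\underline{b}_{n,N} = \Big( \frac c4 \Big)^{n} \times \begin{cases} \displaystyle\sum_{\ell=0}^{n} \binom{n}{\ell} \underline{a}_{\ell,N}\, \underline{a}_{n-\ell,N}, & 0 \le n \le M, \\[2ex] \displaystyle\sum_{\ell=n-M}^{M} \binom{n}{\ell} \underline{a}_{\ell,N}\, \underline{a}_{n-\ell,N}, & M < n \le 2M, \end{cases} \tag{5.8}$$
$$\overline{b}_{n,N} = \begin{cases} \displaystyle\Big( \frac c4 \Big)^{n} \sum_{\ell=0}^{n} \binom{n}{\ell} \overline{a}_{\ell,N}\, \overline{a}_{n-\ell,N}, & 0 \le n \le M, \\[2ex] \displaystyle\min\Big\{ \Big( \frac c4 \Big)^{n} \sum_{n-M \le \ell \le M} \binom{n}{\ell} \overline{a}_{\ell,N}\, \overline{a}_{n-\ell,N} + \Delta\overline{b}_{n,N},\ \Big( \frac{c\,\overline{a}_{1,N}}{2} \Big)^{n} \Big\}, & M < n \le 2M, \end{cases} \tag{5.9}$$
$$\underline{\tilde a}_{n,N} = \sum_{m=0}^{2M-n} \Big( \frac\beta2 \Big)^{m} \underline{b}_{m+n,N}\, \frac{(2m+2n)!\, n!}{m!\, (m+n)!\, (2n)!}, \qquad 0 \le n \le M, \tag{5.10}$$
$$\overline{\tilde a}_{n,N} = \sum_{m=0}^{2M-n} \Big( \frac\beta2 \Big)^{m} \overline{b}_{m+n,N}\, \frac{(2m+2n)!\, n!}{m!\, (m+n)!\, (2n)!} + \Delta\overline{a}_{n,N}, \qquad 0 \le n \le M, \tag{5.11}$$
$$\underline{a}_{n,N+1} = \frac{\underline{\tilde a}_{n,N}}{\overline{\tilde a}_{0,N}}, \qquad \overline{a}_{n,N+1} = \frac{\overline{\tilde a}_{n,N}}{\underline{\tilde a}_{0,N}}, \qquad 1 \le n \le M, \tag{5.12}$$
and $\underline{a}_{0,N+1} = \overline{a}_{0,N+1} = 1$, where we put
$$\Delta\overline{b}_{n,N} = 2 \Big( \frac{c\,\overline{a}_{1,N}}{4} \Big)^{n} \times \binom{n}{n-M-1} \frac{1}{1 - \dfrac{n-M}{M+1}\, e^{-1/(M+1)}} \times \frac{\overline{a}_{M,N}}{\underline{a}_{1,N}^{M}}, \tag{5.13}$$
and
$$\Delta\overline{a}_{n,N} = \Big( \frac{1}{2\beta} \Big)^{n} \times \frac{(\beta c\, \overline{a}_{1,N})^{2M+1}}{1 - 2\beta c\, \overline{a}_{1,N}} \binom{2M+1}{n} \times \frac{\overline{a}_{M,N}}{\underline{a}_{1,N}^{M}}. \tag{5.14}$$
If for an integer $N_1$ it holds that
$$\overline{a}_{1,N} < \frac{1}{2\beta c}, \qquad 0 \le N \le N_1, \tag{5.15}$$
then $a_{n,N}$, $b_{n,N}$, $\tilde a_{n,N}$, $n \in \mathbb Z_+$, $N \in \mathbb Z_+$, defined inductively by (5.2), (5.4), (5.6), (5.7), satisfy, for all $N \le N_1$,
$$\underline{b}_{n,N} \le b_{n,N} \le \overline{b}_{n,N}, \quad n = 0, 1, 2, \dots, 2M,$$
$$\underline{\tilde a}_{n,N} \le \tilde a_{n,N} \le \overline{\tilde a}_{n,N}, \quad n = 0, 1, 2, \dots, M, \tag{5.16}$$
$$\underline{a}_{n,N} \le a_{n,N} \le \overline{a}_{n,N}, \quad n = 0, 1, 2, \dots, M.$$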
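The rest of this subsection is devoted to a proof of this proposition.

Proof. The claimed bounds on $a_{n,N}$ in (5.16) hold for $N = 0$. We proceed by induction on $N$, and assume that they hold for $N$. By comparing (5.4) with (5.8), and noting that $a_{n,N}$ are non-negative, we see that the lower bound for $b_{n,N}$ in (5.16) holds. Assume for a moment that the upper bound for $b_{n,N}$ in (5.16) also holds. Then comparing (5.6) with (5.10), we see that the lower bound for $\tilde a_{n,N}$ in (5.16) holds. If the upper bound for $\tilde a_{n,N}$ also holds, then (5.7) and (5.12) imply that the bounds for $a_{n,N+1}$ in (5.16) also hold. Hence we are left with proving the upper bounds for $b_{n,N}$ and $\tilde a_{n,N}$ in (5.16).

Upper bound on $b_{n,N}$. Note first that if $n \le M$, then
$$b_{n,N} = \Big( \frac c4 \Big)^{n} \sum_{\ell=0}^{n} \binom{n}{\ell} a_{\ell,N}\, a_{n-\ell,N} \le \Big( \frac c4 \Big)^{n} \sum_{\ell=0}^{n} \binom{n}{\ell} \overline{a}_{\ell,N}\, \overline{a}_{n-\ell,N} = \overline{b}_{n,N},$$
hence $b_{n,N} \le \overline{b}_{n,N}$ holds. Also, (5.5) implies
$$b_{n,N} \le \Big( \frac{c\,\mu_{2,N}}{2} \Big)^{n} \le \Big( \frac{c\,\overline{a}_{1,N}}{2} \Big)^{n},$$
hence it suffices to prove
$$b_{n,N} \le \Big( \frac c4 \Big)^{n} \sum_{n-M \le \ell \le M} \binom{n}{\ell} \overline{a}_{\ell,N}\, \overline{a}_{n-\ell,N} + \Delta\overline{b}_{n,N}, \qquad M < n \le 2M. \tag{5.17}$$
To prove (5.17), first note
$$\Delta b_{n,N} = b_{n,N} - \Big( \frac c4 \Big)^{n} \sum_{n-M \le \ell \le M} \binom{n}{\ell} \overline{a}_{\ell,N}\, \overline{a}_{n-\ell,N} \le \Big( \frac c4 \Big)^{n} \sum_{\substack{0 \le \ell < n-M \\ M < \ell \le n}} \binom{n}{\ell} a_{\ell,N}\, a_{n-\ell,N}. \tag{5.18}$$
Using the Newman inequalities (A.6) we see that if $\ell > M$,
$$a_{\ell,N} \le a_{M,N}\, a_{\ell-M,N} \le a_{M,N}\, a_{1,N}^{\ell-M}. \tag{5.19}$$
Hence, since the two ranges of $\ell$ in (5.18) contribute equally by the symmetry $\ell \leftrightarrow n - \ell$,
$$\Delta b_{n,N} \le 2 \Big( \frac c4 \Big)^{n} \sum_{0 \le \ell < n-M} \binom{n}{\ell} a_{\ell,N}\, a_{M,N}\, a_{1,N}^{n-\ell-M} \le 2 \Big( \frac{c\, a_{1,N}}{4} \Big)^{n} \frac{a_{M,N}}{a_{1,N}^{M}} \sum_{\ell=0}^{n-M-1} \binom{n}{\ell}, \tag{5.20}$$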
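where we also used (5.5). Write the summation in the right-hand side as
$$\sum_{\ell=0}^{n-M-1} \binom{n}{\ell} = \binom{n}{n-M-1} \Big( 1 + \frac{n-M-1}{M+2} + \frac{n-M-1}{M+2}\, \frac{n-M-2}{M+3} + \frac{n-M-1}{M+2}\, \frac{n-M-2}{M+3}\, \frac{n-M-3}{M+4} + \cdots \Big). \tag{5.21}$$
Noting that
$$\frac{a - x}{1 + x} \le a\, e^{-2x}, \qquad a \in (0, 1], \quad x \in [0, 1], \tag{5.22}$$
we find, by putting $a = \dfrac{n-M}{M+1}$ and $\epsilon = \dfrac{1}{M+1}$,
$$\frac{n-M-k}{M+k+1} = \frac{a - k\epsilon}{1 + k\epsilon} \le a\, e^{-2k\epsilon}. \tag{5.23}$$
Hence (5.21) has a bound
$$\sum_{\ell=0}^{n-M-1} \binom{n}{\ell} \le \binom{n}{n-M-1} \times \sum_{k=0}^{\infty} a^{k} e^{-k(k+1)\epsilon} \le \binom{n}{n-M-1} \times \frac{1}{1 - a e^{-\epsilon}}, \qquad \epsilon = \frac{1}{M+1}, \quad a = \frac{n-M}{M+1},$$
which implies
$$\Delta b_{n,N} \le \Delta\overline{b}_{n,N}, \tag{5.24}$$
where $\Delta\overline{b}_{n,N}$ is defined in (5.13). This proves (5.17).

Upper bound on $\tilde a_{n,N}$. Put
$$\Delta\tilde a_{\ell,N} = \tilde a_{\ell,N} - \sum_{m=0}^{2M-\ell} \Big( \frac\beta2 \Big)^{m} \overline{b}_{m+\ell,N}\, \frac{(2m+2\ell)!\, \ell!}{m!\, (m+\ell)!\, (2\ell)!} \le \sum_{m=2M+1-\ell}^{\infty} \Big( \frac\beta2 \Big)^{m} b_{m+\ell,N}\, \frac{(2m+2\ell)!\, \ell!}{m!\, (m+\ell)!\, (2\ell)!}$$
$$= \sum_{m=2M+1-\ell}^{\infty} (2\beta)^{m}\, b_{m+\ell,N}\, \frac{(2m+2\ell-1)!!}{(2m)!!\, (2\ell-1)!!}. \tag{5.25}$$
Using (5.19) and (5.5), we see that if $n > 2M$,
$$b_{n,N} = \Big( \frac c4 \Big)^{n} \sum_{\ell=0}^{n} \binom{n}{\ell} a_{\ell,N}\, a_{n-\ell,N} \le \Big( \frac c4 \Big)^{n} \sum_{\ell=0}^{n} \binom{n}{\ell} a_{1,N}^{n} \times \frac{a_{M,N}}{a_{1,N}^{M}} = \Big( \frac{c\, a_{1,N}}{2} \Big)^{n} \frac{a_{M,N}}{a_{1,N}^{M}}. \tag{5.26}$$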
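Therefore
$$\Delta\tilde a_{\ell,N} \le \frac{a_{M,N}}{a_{1,N}^{M}} \sum_{m=2M+1-\ell}^{\infty} (2\beta)^{m} \Big( \frac{c\, a_{1,N}}{2} \Big)^{m+\ell} \frac{(2m+2\ell-1)!!}{(2m)!!\, (2\ell-1)!!} \le \frac{a_{M,N}}{a_{1,N}^{M}} \Big( \frac{c\, a_{1,N}}{2} \Big)^{\ell} \sum_{m=2M+1-\ell}^{\infty} (\beta c\, a_{1,N})^{m} \binom{m+\ell}{\ell}$$
$$= \frac{a_{M,N}}{a_{1,N}^{M}} \Big( \frac{c\, a_{1,N}}{2} \Big)^{\ell} (\beta c\, a_{1,N})^{2M+1-\ell} \sum_{k=0}^{\infty} (\beta c\, a_{1,N})^{k} \binom{2M+1+k}{\ell} = \frac{a_{M,N}}{a_{1,N}^{M}} \Big( \frac{1}{2\beta} \Big)^{\ell} (\beta c\, a_{1,N})^{2M+1} \sum_{k=0}^{\infty} (\beta c\, a_{1,N})^{k} \binom{2M+1+k}{\ell}. \tag{5.27}$$
Here,
$$T_{2M+1,\ell}(r) = \sum_{k=0}^{\infty} (\beta c\, a_{1,N})^{k} \binom{2M+1+k}{\ell} = \sum_{k=0}^{\infty} r^{k} \binom{2M+1+k}{\ell} = \frac{1}{1-r} \sum_{m=0}^{\ell} \binom{2M+1}{\ell-m} q^{m}, \tag{5.28}$$
where $r = \beta c\, a_{1,N}$ and $q = \dfrac{r}{1-r}$. By assumption $r < \frac12$. The binomial coefficient in the summand is largest when $m = 0$, because $2M + 1 > 2M \ge 2\ell$. Therefore,
$$T_{2M+1,\ell}(r) \le \frac{1}{1-r} \binom{2M+1}{\ell} \sum_{m=0}^{\ell} q^{m} \le \frac{1}{1-r}\, \frac{1}{1-q} \binom{2M+1}{\ell} = \frac{1}{1-2r} \binom{2M+1}{\ell}. \tag{5.29}$$
This proves
$$\Delta\tilde a_{\ell,N} \le \Big( \frac{1}{2\beta} \Big)^{\ell} \frac{(\beta c\, a_{1,N})^{2M+1}}{1 - 2\beta c\, a_{1,N}} \binom{2M+1}{\ell} \times \frac{a_{M,N}}{a_{1,N}^{M}} \le \Delta\overline{a}_{\ell,N}, \tag{5.30}$$
where $\Delta\overline{a}_{\ell,N}$ is defined in (5.14). This proves $\tilde a_{n,N} \le \overline{\tilde a}_{n,N}$.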
Remark. We can “improve” Proposition 5.1 by employing (correct) bounds, in a similar way as the term proportional to $\Big( \dfrac{c\,\overline{a}_{1,N}}{2} \Big)^{n}$ in (5.9). In actual calculations, we improve $\overline{a}_{n,N+1}$, $n = 1, 2, \dots, M$, in (5.12), the upper bounds for the $a_{n,N+1}$’s, using (A.6) (as well as its special case (5.5)). To be more specific, we compare $\overline{a}_{4,N+1}$ in (5.12) with $\overline{a}_{2,N+1}^2$ and replace the definition if the latter is smaller. Then we go on to “improve” $\overline{a}_{6,N+1}$ by comparing with $\overline{a}_{2,N+1}\overline{a}_{4,N+1}$, and so on. Conceptually there is nothing really new here, but this procedure improves the actual value of the bounds in Proposition 5.1.
5.3. Computer results. In this subsection we prove Theorem 2.2 on computers using Proposition 5.1. We double-checked the computation with Mathematica and with C++ programs based on interval arithmetic. Here we give results from the C++ programs. Our program employs interval arithmetic, which gives rigorous numerical bounds. The idea is to express a number by a pair of “vectors”, each of which consists of an array of length M of “digits”, taking values in {0, 1, 2, ..., 9}, and an integer corresponding to the “exponent”. To give a simple example, let M = 2. One can view 0.0523 as expressed in the program, for example, as $I_1 = [5.2 \times 10^{-2}, 5.3 \times 10^{-2}]$, and 3 as $I_2 = [3.0 \times 10^{0}, 3.0 \times 10^{0}]$. When the division $I_1/I_2$ is performed, our program routines are designed so that they give correct bounds as an output. Namely, the computer output of $I_1/I_2$ will be $[1.7 \times 10^{-2}, 1.8 \times 10^{-2}]$. We may occasionally lose the best possible bounds, but the program is designed so that we never lose the correctness of the bounds. Thus all the outputs are rigorous bounds of the corresponding quantities. In actual calculation we took M = 70 digits, which turned out to be sufficient.

We also note that interval arithmetic is employed in [14] for the hierarchical model in d = 3 dimensions. We took an independent approach in programming (we focused on ease in implementing the interval arithmetic in main programs developed for standard floating point calculations), so that the structure and details of the programs are quite different. However, our numerical calculations are not heavy enough to require anything special. For the program which we used for our proof, see the supplement to [17].

As will be explained below, we only need to consider 2 values for the initial Ising parameter s: $s_- = 1.7925671170092624$ and $s_+ = 1.7925671170092625$. We perform explicit recursion on computers for each $s = s_\pm$ using Proposition 5.1. We summarize what is left to be proved:

(1) $\bar a_{1,N} < \frac{1}{2\beta_c}$, $0 \le s \le \bar s_{N_1}$, $0 \le N \le N_1$, where $N_1 = 100$. This condition is from (5.15), imposed because we are going to do evaluation using Proposition 5.1. Note that this condition is stronger than (2.17) in the assumptions in Theorem 2.1, because $\frac{1}{2\beta_c} = \frac{1}{2}(2 + \sqrt{2}) = 1.707\cdots$ for d = 4.

(2) $s_- \le \underline{s}_{N_1}$ and $\bar s_{N_1} \le s_+$. To prove this, it is sufficient (as seen from the definitions (2.11) and (2.12)) to prove
\[ \mu_{2,N_1} < 1 \ \text{ when } s = s_-, \qquad \text{and} \qquad \mu_{2,N_1} > 1 + \frac{3}{\sqrt{2}}\,\mu_{4,N_1} \ \text{ when } s = s_+. \tag{5.31} \]

(3) For any s satisfying $s_- \le s \le s_+$, the bounds
\[ (0 \le)\ \mu_{4,N_0} \le 0.0045, \tag{5.32} \]
\[ 1.6\,\mu_{4,N_0}^2 \le \mu_{6,N_0} \le 6.07\,\mu_{4,N_0}^2, \tag{5.33} \]
\[ (0 \le)\ \mu_{8,N_0} \le 48.469\,\mu_{4,N_0}^3, \tag{5.34} \]
hold for $N_0 = 70$. This condition comes from the assumptions in Theorem 2.1 (sufficient, if $s_- \le \underline{s}_{N_1}$ and $\bar s_{N_1} \le s_+$).

We now summarize our results from explicit calculations.
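Before turning to those results, the M = 2 division example given earlier in this subsection can be reproduced with a short sketch. This is not the authors' C++ program (for that, see the supplement to [17]); it is a minimal illustration that assumes Python's standard `decimal` module with directed rounding, and the names `DOWN`, `UP` and `divide` are hypothetical.

```python
from decimal import Context, Decimal, ROUND_CEILING, ROUND_FLOOR

M = 2  # number of significant decimal digits kept, as in the M = 2 example

# Two contexts with the same precision but opposite directed rounding.
DOWN = Context(prec=M, rounding=ROUND_FLOOR)
UP = Context(prec=M, rounding=ROUND_CEILING)


def divide(num, den):
    """Enclose {x / y : x in num, y in den} for intervals given as (lo, hi).

    The divisor interval must not contain zero.
    """
    (a, b), (c, d) = num, den
    if c <= 0 <= d:
        raise ZeroDivisionError("divisor interval contains zero")
    pairs = [(a, c), (a, d), (b, c), (b, d)]
    lo = min(DOWN.divide(x, y) for x, y in pairs)  # lower end, rounded down
    hi = max(UP.divide(x, y) for x, y in pairs)    # upper end, rounded up
    return lo, hi


I1 = (Decimal("0.052"), Decimal("0.053"))  # encloses 0.0523
I2 = (Decimal("3.0"), Decimal("3.0"))      # the exact number 3
quotient = divide(I1, I2)
print(quotient)
```

Rounding the lower end down and the upper end up is what keeps the enclosure valid even though each endpoint carries only M digits; the price is that the interval may be slightly wider than the best possible one.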
(1) We have $\bar a_{1,N} \le \frac{1}{2} s_+^2 = 1.6066\cdots$, $0 \le s \le s_+$, $0 \le N \le N_1$. The largest value for $\bar a_{1,N}$ in the range of parameters is actually obtained at $s = s_+$ and $N = 0$.

(2) Our calculations turned out to be accurate enough to obtain more than 40 digits below the decimal point correctly for $\mu_{2,100}$ and $\mu_{4,100}$ at $s = s_\pm$, which is more than enough to prove (5.31). In fact, we have
\[ 0.99609586499804791366176669341357334889503943 \le \underline{a}_{1,100} \le \mu_{2,100} \le \bar a_{1,100} \le 0.99609586499804791366176669341357334889503972 \]
at $s = s_-$, and
\[ 1.0131857903720691722396611098376636943838027 \le \underline{a}_{1,100} \le \mu_{2,100} \le \bar a_{1,100} \le 1.0131857903720691722396611098376636943838031, \]
\[ 0.00281027097809098768088795100753480139767915 \le \tfrac{1}{2}\left(-\bar a_{2,100} + \underline{a}_{1,100}^2\right) \le \mu_{4,100} \le \tfrac{1}{2}\left(-\underline{a}_{2,100} + \bar a_{1,100}^2\right) \le 0.00281027097809098768088795100753480139767969 \]
at $s = s_+$.

(3) To prove (5.32)–(5.34), we note the following. Let us write the s dependences of $a_{n,N}$ and $\mu_{n,N}$ explicitly as $a_{n,N}(s)$ and $\mu_{n,N}(s)$. For any integer N and for any s satisfying $s_- \le s \le s_+$, the monotonicity of $a_{n,N}(s)$ with respect to s implies
\[ \mu_{4,N}(s) = \frac{1}{2}\left(-a_{2,N}(s) + a_{1,N}(s)^2\right) \le \frac{1}{2}\left(-a_{2,N}(s_-) + a_{1,N}(s_+)^2\right) =: \bar\mu_{4,N}. \tag{5.35} \]
Hence if we can prove $\bar\mu_{4,70} \le 0.0045$, then we have proved (5.32). In a similar way, sufficient conditions for (5.33) and (5.34) are
\[ 1.6 \le \frac{\underline{\mu}_{6,70}}{\bar\mu_{4,70}^2}, \qquad \frac{\bar\mu_{6,70}}{\underline{\mu}_{4,70}^2} \le 6.07, \qquad \frac{\bar\mu_{8,70}}{\underline{\mu}_{4,70}^3} \le 48.469, \]
with obvious definitions (as in (5.35) for $\bar\mu_{4,N}$) for $\underline{\mu}_{n,70}$ and $\bar\mu_{n,70}$. The bounds we have for these quantities are (we shall not waste space by writing too many digits):
\[ \bar\mu_{4,70} \le 0.004144, \qquad 3.6459 \le \frac{\underline{\mu}_{6,70}}{\bar\mu_{4,70}^2}, \qquad \frac{\bar\mu_{6,70}}{\underline{\mu}_{4,70}^2} \le 3.7542, \qquad \frac{\bar\mu_{8,70}}{\underline{\mu}_{4,70}^3} \le 38.488. \]
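As a sanity check on the arithmetic only (it does not replace the rigorous interval computation), the quoted bounds can be compared against the thresholds in (5.31)–(5.34) with exact decimal comparisons. The variable names below are hypothetical, and 2.1214 is used as an assumed safe upper bound for $3/\sqrt{2}$, which the code verifies.

```python
from decimal import Decimal as D

# Bounds copied from the text (results (2) and (3) above).
mu2_upper_s_minus = D("0.99609586499804791366176669341357334889503972")
mu2_lower_s_plus = D("1.0131857903720691722396611098376636943838027")
mu4_upper_s_plus = D("0.00281027097809098768088795100753480139767969")

# 3/sqrt(2) = 2.1213..., so 2.1214 is an upper bound: 2 * 2.1214^2 > 9.
three_over_sqrt2_upper = D("2.1214")
assert 2 * three_over_sqrt2_upper ** 2 > 9

checks = {
    "(5.31) at s = s-": mu2_upper_s_minus < 1,
    "(5.31) at s = s+": mu2_lower_s_plus > 1 + three_over_sqrt2_upper * mu4_upper_s_plus,
    "(5.32)": D("0.004144") <= D("0.0045"),
    "(5.33) lower": D("1.6") <= D("3.6459"),
    "(5.33) upper": D("3.7542") <= D("6.07"),
    "(5.34)": D("38.488") <= D("48.469"),
}
for name, ok in checks.items():
    print(name, ok)
```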
This completes a proof of Theorem 2.2, and therefore Theorem 1.1 is proved.

Acknowledgement. The authors would like to thank Yoichiro Takahashi for his interest in the present work and for discussions. Part of this work was done while T. Hara was at the Department of Mathematics, Tokyo Institute of Technology. The research of T. Hara and T. Hattori is partially supported by a Grant-in-Aid for Scientific Research (C) of the Ministry of Education, Science, Sports and Culture.
A. Newman’s Inequalities

Let X be a stochastic variable which is in class $\mathcal{L}$ of [15]. $X \in \mathcal{L}$ has the Lee-Yang property, which states that the zeros of the moment generating function $E[e^{HX}]$ are pure imaginary. In fact, it is shown in [15, Prop. 2] using Hadamard’s Theorem that $E[e^{HX}]$ has the following expression:
\[ E\left[e^{HX}\right] = e^{bH^2} \prod_j \left(1 + \frac{H^2}{\alpha_j^2}\right), \tag{A.1} \]
where b is a non-negative constant and $\alpha_j$, $j = 1, 2, 3, \ldots$, is a positive nondecreasing sequence satisfying $\sum_{j=1}^{\infty} \alpha_j^{-2} < \infty$.

Consequences of (A.1) in terms of inequalities among moments (n point functions) are given in [15], among which we note the following:

1. Positivity [15, Theorem 3]. Put
\[ \mu_{2n} = -\frac{1}{(2n)!} \left. \frac{d^{2n}}{d\xi^{2n}} \log E\left[e^{\sqrt{-1}\,\xi X}\right] \right|_{\xi = 0}. \tag{A.2} \]
Then,
\[ \mu_{2n} \ge 0, \quad n = 0, 1, 2, \ldots. \tag{A.3} \]
(Note that (A.1) implies $\mu_{2n+1} = 0$.)

2. Newman’s bound [15, Theorem 6]. Put $v_{2n} = n\mu_{2n}$. Then,
\[ v_{4n} \le v_4^{\,n}, \qquad v_6 \le \sqrt{v_4 v_8}, \qquad v_{4n+2} \le v_6\, v_4^{\,n-1}, \tag{A.4} \]
where the first and third inequalities follow from (2.10) of [15], while the second one is (2.12) of [15]. These imply $v_{2n} \le v_4^{\,n/2}$, $n \ge 2$, and therefore
\[ \mu_{2n} \le \frac{(2\mu_4)^{n/2}}{n}, \quad n = 2, 3, 4, \ldots. \tag{A.5} \]
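The inequalities (A.3)–(A.5) can be checked on a concrete example. The sketch below assumes the single Ising spin X = ±1 with equal probability, for which $E[e^{HX}] = \cosh H$ has purely imaginary zeros, and computes $\mu_{2n}$ from (A.2) with exact rational arithmetic. The function name `mu_even` is hypothetical; the recursion used is the standard one for the logarithm of a power series.

```python
from fractions import Fraction
from math import factorial


def mu_even(even_moments):
    """Return [mu_0, mu_2, ..., mu_2K] of (A.2) from [E X^0, E X^2, ..., E X^2K]."""
    K = len(even_moments) - 1
    # Coefficients of E[exp(i xi X)] as a power series in t = xi^2.
    f = [Fraction((-1) ** k * even_moments[k], factorial(2 * k)) for k in range(K + 1)]
    # Coefficients L_k of log E[exp(i xi X)], from k f_k = sum_{j=1}^{k} j L_j f_{k-j}.
    L = [Fraction(0)] * (K + 1)
    for k in range(1, K + 1):
        L[k] = f[k] - sum((j * L[j] * f[k - j] for j in range(1, k)), Fraction(0)) / k
    return [-c for c in L]


# Ising spin X = +1 or -1 with probability 1/2: every even moment equals 1.
mu = mu_even([1] * 6)                       # mu[n] is mu_{2n}, n = 0..5
v = {2 * n: n * mu[n] for n in range(1, 6)}  # v_{2n} = n mu_{2n}

results = {
    "positivity (A.3)": all(m >= 0 for m in mu),
    "v_8 <= v_4^2": v[8] <= v[4] ** 2,
    "v_6 <= sqrt(v_4 v_8)": v[6] ** 2 <= v[4] * v[8],
    "v_10 <= v_6 v_4": v[10] <= v[6] * v[4],
    # (A.5) squared, to stay within rational arithmetic.
    "(A.5), n = 2..5": all((n * mu[n]) ** 2 <= (2 * mu[2]) ** n for n in range(2, 6)),
}
print(mu[2], mu[3], mu[4])
print(results)
```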
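Furthermore, we will prove the following.

Proposition A.1. Put $a_N = \frac{N!}{(2N)!} E\left[X^{2N}\right]$, $N \in \mathbb{Z}_+$. Then,
\[ a_{M+N} \le a_M a_N, \quad N, M = 0, 1, 2, \ldots. \tag{A.6} \]

Proof. Put $y_j = \alpha_j^{-2} > 0$. Then
\[ E\left[e^{HX}\right] = e^{bH^2} \prod_j \left(1 + H^2 y_j\right). \tag{A.7} \]
Expand the infinite product to obtain
\[ \prod_j \left(1 + H^2 y_j\right) = 1 + H^2 \sum_j y_j + \frac{H^4}{2!} \sum_{i,j}{}^{\prime}\, y_i y_j + \frac{H^6}{3!} \sum_{i,j,k}{}^{\prime}\, y_i y_j y_k + \ldots = \sum_{n=0}^{\infty} \frac{H^{2n}}{n!}\, c_n, \tag{A.8} \]
with
\[ c_n = \sum_{i_1, i_2, \ldots, i_n}{}^{\prime}\, y_{i_1} y_{i_2} y_{i_3} \cdots y_{i_n}, \tag{A.9} \]
where primed summations denote summations over non-coinciding indices. Hence we have
\[ E\left[e^{HX}\right] = \sum_{N=0}^{\infty} H^{2N} \sum_{m,n:\, m+n=N} \frac{b^m}{m!} \frac{c_n}{n!} = \sum_{N=0}^{\infty} H^{2N} \sum_{n=0}^{N} \frac{b^{N-n} c_n}{(N-n)!\, n!}. \tag{A.10} \]
Comparing with $E\left[e^{HX}\right] = \sum_{N=0}^{\infty} \frac{a_N}{N!} H^{2N}$, we obtain
\[ a_N = \sum_{n=0}^{N} \binom{N}{n} b^{N-n} c_n. \]
Note that (A.9) implies
\[ c_{n+m} \le c_m c_n, \tag{A.11} \]
because the primed summation on the left-hand side runs over a smaller set of indices (all n + m indices must be distinct), and every term is non-negative. This with $b \ge 0$ implies
\begin{align*}
a_M a_N &= \sum_{m=0}^{M} \sum_{n=0}^{N} \binom{M}{m} \binom{N}{n} b^{M+N-m-n} c_m c_n \\
&\ge \sum_{m=0}^{M} \sum_{n=0}^{N} \binom{M}{m} \binom{N}{n} b^{M+N-m-n} c_{m+n} \\
&= \sum_{\ell=0}^{M+N} b^{M+N-\ell} c_\ell \sum_{m:\, 0 \le m \le M,\ 0 \le \ell - m \le N} \binom{M}{m} \binom{N}{\ell - m} \\
&= \sum_{\ell=0}^{M+N} b^{M+N-\ell} c_\ell \binom{M+N}{\ell} = a_{M+N},
\end{align*}
where, in the last line, we also used
\[ \sum_{m:\, 0 \le m \le M,\ 0 \le \ell - m \le N} \binom{M}{m} \binom{N}{\ell - m} = \binom{M+N}{\ell}, \tag{A.12} \]
which is seen to hold if we compare the coefficients of $x^\ell$ in the identity $(1 + x)^{M+N} = (1 + x)^M (1 + x)^N$.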
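The algebra in this proof can be exercised numerically. The sketch below is an illustration under stated assumptions: it takes an arbitrary finite list of positive $y_j$ and a non-negative b (the values are made up), builds $c_n$ from (A.9) as n! times the n-th elementary symmetric polynomial, forms $a_N = \sum_n \binom{N}{n} b^{N-n} c_n$, and checks (A.6) and (A.12) exactly. It also checks (A.6) for the Ising spin X = ±1, where $a_N = N!/(2N)!$. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def c_coefficients(ys, n_max):
    """c_n of (A.9) for a finite list ys: n! times the n-th elementary symmetric polynomial."""
    e = [Fraction(1)] + [Fraction(0)] * n_max
    for y in ys:
        for n in range(n_max, 0, -1):  # descending, so each y is used at most once
            e[n] += y * e[n - 1]
    return [factorial(n) * e[n] for n in range(n_max + 1)]


def a_sequence(b, ys, n_max):
    """a_N = sum_n binom(N, n) b^(N-n) c_n for N = 0..n_max."""
    c = c_coefficients(ys, n_max)
    return [
        sum(comb(N, n) * b ** (N - n) * c[n] for n in range(N + 1))
        for N in range(n_max + 1)
    ]


K = 6
b = Fraction(1, 5)                                      # illustrative b >= 0
ys = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 7)]   # illustrative y_j > 0
a = a_sequence(b, ys, 2 * K)
pairs = [(M, N) for M in range(K + 1) for N in range(K + 1)]

subadditive = all(a[M + N] <= a[M] * a[N] for M, N in pairs)

vandermonde = all(
    sum(comb(M, m) * comb(N, l - m) for m in range(max(0, l - N), min(M, l) + 1))
    == comb(M + N, l)
    for M, N in pairs
    for l in range(M + N + 1)
)

ising = [Fraction(factorial(N), factorial(2 * N)) for N in range(2 * K + 1)]
ising_ok = all(ising[M + N] <= ising[M] * ising[N] for M, N in pairs)

print(subadditive, vandermonde, ising_ok)
```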
References

1. Bleher, P.M. and Sinai, Ya.G.: Investigation of the critical point in models of the type of Dyson’s hierarchical model. Commun. Math. Phys. 33, 23–42 (1973)
2. Bleher, P.M. and Sinai, Ya.G.: Critical indices for Dyson’s asymptotically hierarchical models. Commun. Math. Phys. 45, 247–278 (1975)
3. Collet, P. and Eckmann, J.-P.: A renormalization group analysis of the hierarchical model in statistical physics. Springer Lecture Note in Physics 74, 1978
4. Dyson, F.J.: Existence of a phase-transition in a one-dimensional Ising ferromagnet. Commun. Math. Phys. 12, 91–107 (1969)
5. Gallavotti, G.: Some aspects of the renormalization problems in statistical mechanics. Memorie dell’ Accademia dei Lincei 15, 23–59 (1978)
6. Gawędzki, K. and Kupiainen, A.: Triviality of $\phi^4_4$ and all that in a hierarchical model approximation. J. Stat. Phys. 29, 683–699 (1982)
7. Gawędzki, K. and Kupiainen, A.: Non-Gaussian fixed points of the block spin transformation. Hierarchical model approximation. Commun. Math. Phys. 89, 191–220 (1983)
8. Gawędzki, K. and Kupiainen, A.: Nongaussian Scaling limits. Hierarchical model approximation. J. Stat. Phys. 35, 267–284 (1984)
9. Gawędzki, K. and Kupiainen, A.: Asymptotic freedom beyond perturbation theory. In: K. Osterwalder and R. Stora, eds., Critical Phenomena, Random Systems, Gauge Theories. Les Houches 1984, Amsterdam: North-Holland, 1986
10. Gawędzki, K. and Kupiainen, A.: Massless lattice $\phi^4_4$ Theory: Rigorous control of a renormalizable asymptotically free model. Commun. Math. Phys. 99, 199–252 (1985)
11. Koch, H. and Wittwer, P.: A non-Gaussian renormalization group fixed point for hierarchical scalar lattice field theories. Commun. Math. Phys. 106, 495–532 (1986)
12. Koch, H. and Wittwer, P.: On the renormalization group transformation for scalar hierarchical models. Commun. Math. Phys. 138, 537–568 (1991)
13. Koch, H. and Wittwer, P.: A nontrivial renormalization group fixed point for the Dyson–Baker hierarchical model. Commun. Math. Phys. 164, 627–647 (1994)
14. Koch, H. and Wittwer, P.: Bounds on the zeros of a renormalization group fixed point. Mathematical Physics Electronic Journal 1, No. 6 (24pp.) (1995)
15. Newman, C.M.: Inequalities for Ising models and field theories which obey the Lee–Yang theorem. Commun. Math. Phys. 41, 1–9 (1975)
16. Sinai, Ya.G.: Theory of phase transition: Rigorous results. New York: Pergamon Press, 1982
17. Hara, T., Hattori, T., and Watanabe, H.: Triviality of hierarchical Ising Model in four dimensions. Archived in mp_arc (Mathematical Physics Preprint Archive, http://www.ma.utexas.edu/mp_arc/) 00-397

Communicated by D. C. Brydges
Commun. Math. Phys. 220, 41 – 67 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Geometric Optics and Long Range Scattering for One-Dimensional Nonlinear Schrödinger Equations
Rémi Carles
Antenne de Bretagne de l’ENS Cachan and IRMAR, Campus de Ker Lann, 35 170 Bruz, France. E-mail: [email protected]
Received: 23 May 2000 / Accepted: 8 January 2001
Abstract: With the methods of geometric optics used in [2], we provide a new proof of some results of [11], to construct modified wave operators for the one-dimensional cubic Schrödinger equation. We improve the rate of convergence of the nonlinear solution towards the simplified evolution, and get better control of the loss of regularity in Sobolev spaces. In particular, using the results of [9], we deduce the existence of a modified scattering operator with small data in some Sobolev spaces. We show that in terms of geometric optics, this gives rise to a “random phase shift” at a caustic.

Contents
1. Introduction
2. Formal Computations
3. Estimates on Some Oscillatory Integrals
4. Energy Estimates
5. Justification of Nonlinear Geometric Optics Before the Caustic
6. Interpretation
7. Construction of the Modified Scattering Operator and Application
1. Introduction

In this article, we consider the nonlinear Schrödinger equation in one space dimension,
\[ i\partial_t \psi + \frac{1}{2}\partial_x^2 \psi = \lambda |\psi|^p \psi, \quad \lambda \in \mathbb{R}, \tag{1.1} \]
in the particular case where p = 2. Define the Fourier transform by
\[ \mathcal{F}v(\xi) = \hat v(\xi) = \int e^{-ix.\xi} v(x)\,dx. \]
For p > 2, it is well known that to any asymptotic state $\psi_- \in H^1 \cap \mathcal{F}(H^1) =: \Sigma$, one can associate a solution ψ of (1.1) that behaves asymptotically as the free evolution of $\psi_-$, that is,
\[ \| U_0(-t)\psi(t) - \psi_- \|_{\Sigma} \xrightarrow[t \to -\infty]{} 0, \]
where $U_0(t) := e^{i\frac{t}{2}\partial_x^2}$ denotes the unitary group of the free Schrödinger equation. The operator $W_- : \psi_- \mapsto \psi|_{t=0}$ is called a wave operator.

The case p = 2 is different (long range case). It is proved (see [1,12,13,5]) that if $\psi_- \in L^2$ and $\| U_0(-t)\psi(t) - \psi_- \|_{L^2} \to 0$ as $t \to -\infty$, where ψ solves
\[ i\partial_t \psi + \frac{1}{2}\partial_x^2 \psi = |\psi|^2 \psi, \]
then $\psi = \psi_- = 0$. One cannot compare the nonlinear dynamics with the free dynamics. In [11], the author constructs modified wave operators that allow one to compare the nonlinear dynamics of (1.1) when p = 2 with a simpler one, yet more complicated than the free dynamics. Assuming the asymptotic state $\psi_-$ is sufficiently smooth and small in a certain Hilbert space, Ozawa defines a new operator (that depends on $\psi_-$) such that the evolution of $\psi_-$ under this dynamics can be compared to the asymptotic behavior of a certain solution of
\[ i\partial_t \psi + \frac{1}{2}\partial_x^2 \psi = \lambda |\psi|^2 \psi. \tag{1.2} \]
Using the methods of geometric optics as in [2], we rediscover these modified operators, and improve some convergence estimates (Corollary 1). Moreover, we have better control of the (possible) loss of regularity, which, along with the results of [9], makes it possible to define a modified scattering operator ($S = W_+^{-1} W_-$) for small data in Σ (Corollary 2). This enables us to describe the validity of nonlinear geometric optics with focusing initial data. In particular, we show that the caustic crossing is described in terms of the scattering operator (as in [2]), plus a “random phase shift” (Corollary 3).

In [8], Ginibre and Velo construct modified wave operators in Gevrey spaces. They make no size restriction on the data, but require analyticity for the asymptotic states. In the present article, we cannot leave out the smallness assumption, but our asymptotic states are less regular. Denote
\[ \mathcal{H} := \{ f \in H^3(\mathbb{R});\ xf \in H^2(\mathbb{R}) \} = \{ f \in \mathcal{S}'(\mathbb{R});\ \|f\|_{\mathcal{H}} := \|(1 + x^2)^{1/2}(1 - \partial_x^2) f\|_{L^2} + \|(1 - \partial_x^2)^{3/2} f\|_{L^2} < \infty \}. \]
Recall one of the results in [11].

Theorem 1 ([11], Theorem 2). There exists γ > 0 with the following properties. For any $\psi_- \in \mathcal{F}(\mathcal{H})$ with $\|\hat\psi_-\|_{L^\infty} < \gamma$, (1.2) has a unique solution $\psi \in C(\mathbb{R}; H^1) \cap L^4_{loc}(\mathbb{R}; W^{1,\infty})$ such that for any α with 1/2 < α < 1,
\[ \| \psi(t) - e^{iS^-(t)} U_0(t)\psi_- \|_{H^1} = O(|t|^{-\alpha}), \tag{1.3} \]
\[ \left( \int_{-\infty}^{t} \| \psi(\tau) - e^{iS^-(\tau)} U_0(\tau)\psi_- \|_{W^{1,\infty}}^4 \, d\tau \right)^{1/4} = O(|t|^{-\alpha}) \tag{1.4} \]
as $t \to -\infty$, where the phase shift $S^-$ is defined by
\[ S^-(t, x) := \frac{\lambda}{2\pi} \left| \hat\psi_-\!\left(\frac{x}{t}\right) \right|^2 \log |t|. \tag{1.5} \]
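Remark 1. Theorem 1 in [11] gives an asymptotic in $L^2$ instead of $H^1$, and requires less regularity on the asymptotic state $\psi_-$. Yet, it is still required to be small in the same space as in Theorem 2.

Now we recall why the method of geometric optics can be closely related to scattering theory in the case of the nonlinear Schrödinger equation. In [2], we consider the initial value problem
\[ \begin{cases} i\varepsilon\partial_t u^\varepsilon + \frac{1}{2}\varepsilon^2 \partial_x^2 u^\varepsilon = \lambda \varepsilon^\alpha |u^\varepsilon|^\beta u^\varepsilon, & (t, x) \in \mathbb{R}_+ \times \mathbb{R}, \\ u^\varepsilon_{|t=0} = e^{-i\frac{x^2}{2\varepsilon}} f(x), \end{cases} \tag{1.6} \]
where α ≥ 1, β > 0 and 0 < ε ≤ 1 is a parameter going to zero. With the initial phase $-x^2/2$, rays of geometric optics (which are the projection on the (t, x) space of the bicharacteristics) focus at the point (t, x) = (1, 0). We proved in [2] that in the case where β = 2α > 2 (“nonlinear caustic”), the asymptotic behavior, as ε goes to zero, of the solution near t = 1 is easily expressed in terms of f and the wave operator $W_-$. To see that point, we introduced the scaling
\[ u^\varepsilon(t, x) = \frac{1}{\sqrt{\varepsilon}}\, \psi^\varepsilon\!\left(\frac{t - 1}{\varepsilon}, \frac{x}{\varepsilon}\right), \tag{1.7} \]
that satisfies
\[ U_0\!\left(\frac{1}{\varepsilon}\right) \psi^\varepsilon\!\left(-\frac{1}{\varepsilon}\right) \xrightarrow[\varepsilon \to 0]{} \psi_- := \frac{1}{\sqrt{2i\pi}}\, \hat f(x). \tag{1.8} \]
Define the function ψ by
\[ \begin{cases} i\partial_t \psi + \frac{1}{2}\partial_x^2 \psi = \lambda |\psi|^\beta \psi, \\ \psi_{|t=0} = W_- \psi_-. \end{cases} \]
Then ψ is a concentrating profile for $u^\varepsilon$, that is
\[ u^\varepsilon(t, x) \underset{\varepsilon \to 0}{\sim} \frac{1}{\sqrt{\varepsilon}}\, \psi\!\left(\frac{t - 1}{\varepsilon}, \frac{x}{\varepsilon}\right). \]
In this paper, we treat the limiting case of (1.6), that is α = 1, β = 2. We study the validity of nonlinear geometric optics, for positive times, for the solutions of the following initial value problem,
\[ \begin{cases} i\varepsilon\partial_t u^\varepsilon + \frac{1}{2}\varepsilon^2 \partial_x^2 u^\varepsilon = \lambda \varepsilon |u^\varepsilon|^2 u^\varepsilon, & (t, x) \in \mathbb{R}_+ \times \mathbb{R}, \\ u^\varepsilon_{|t=0} = e^{-i\frac{x^2}{2\varepsilon} + i\lambda |f(x)|^2 \log\frac{1}{\varepsilon}} f(x). \end{cases} \tag{1.9} \]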
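We altered the initial data by adding the term $e^{i\lambda |f(x)|^2 \log\frac{1}{\varepsilon}}$ in order to recover the same modified wave operator as in [11]. Nonlinear geometric optics could be justified as well without this term, by the same methods as those used in this article, but this would not make it possible to deduce the existence of modified wave operators for (1.2). From now on, the function f in the initial data is supposed to belong to $\mathcal{H}$, and to be nonzero. Then for every (fixed) ε > 0, (1.9) has a unique global solution, which belongs to $C(\mathbb{R}_t, \Sigma)$ (see for instance [5,6]). The following definition, which follows the spirit of [2], will be motivated in Sect. 2.

Definition 1. Let $g^\varepsilon$ be defined for t < 1 by
\[ g^\varepsilon(t, \xi) := \lambda |f(-\xi)|^2 \log\frac{1 - t}{\varepsilon}. \tag{1.10} \]
The approximate solution $u^\varepsilon_{app}$ is defined for t < 1 by
\[ u^\varepsilon_{app}(t, x) := \frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x.\xi}{\varepsilon} + i g^\varepsilon(t, \xi)}\, a_0(\xi)\, \frac{d\xi}{2\pi}, \tag{1.11} \]
with
\[ a_0(\xi) := \sqrt{\frac{2\pi}{i}}\, f(-\xi). \tag{1.12} \]
We define the symbol $a^\varepsilon(t, \xi)$ by
\[ u^\varepsilon(t, x) = \frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x.\xi}{\varepsilon} + i g^\varepsilon(t, \xi)}\, a^\varepsilon(t, \xi)\, \frac{d\xi}{2\pi}, \tag{1.13} \]
which makes sense since $u^\varepsilon \in L^2$ and
\[ a^\varepsilon(t, \xi) = \frac{1}{\sqrt{\varepsilon}}\, e^{i\frac{t-1}{2\varepsilon}\xi^2 - i g^\varepsilon(t, \xi)}\, \widehat{u^\varepsilon}\!\left(t, \frac{\xi}{\varepsilon}\right). \tag{1.14} \]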
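We can now state the main result.

Theorem 2. Let $f \in \mathcal{H}$. There exist $C^* = C^*(f)$ and $\varepsilon^* = \varepsilon^*(f) > 0$ such that for $0 < \varepsilon \le \varepsilon^*$, nonlinear geometric optics is valid before the focus, with the following distinctions.
– If $\|f\|_{L^\infty} < |2\lambda|^{-1/2}$, then in $\{1 - t \ge C^* \varepsilon\}$ and for any 0 ≤ s ≤ 1,
\[ \| a^\varepsilon(t, .) - a_0 \|_{H^s \cap \mathcal{F}(H^s)} = O\!\left( \frac{\varepsilon}{1 - t} \left( \log\frac{1 - t}{\varepsilon} \right)^{2+s} \right). \]
– If $\|f\|_{L^\infty} \ge |2\lambda|^{-1/2}$, then denote $C_0 := 2|\lambda| \|f\|_{L^\infty}^2$. For any α > 0, there exists $C^*_\alpha$ such that in $\left\{ 1 - t \ge C^*_\alpha \left( \varepsilon |\log \varepsilon|^{5/2} \right)^{1/(C_0 + \alpha)} \right\}$, and for any 0 ≤ s ≤ 1,
\[ \| a^\varepsilon(t, .) - a_0 \|_{H^s \cap \mathcal{F}(H^s)} = O\!\left( \frac{\varepsilon}{(1 - t)^{C_0 + 2s\alpha}}\, |\log \varepsilon|^{2+s} \right). \]
The above estimates are uniform on the time intervals we consider.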
Define the Galilean operator J (see for instance [5,6]) by $J(t) := x + it\partial_x$.

Corollary 1. Let $\psi_- \in \mathcal{F}(\mathcal{H})$ with $\|\hat\psi_-\|_{L^\infty} < (\pi/|\lambda|)^{1/2}$. Then there exists a unique $\psi \in C(\mathbb{R}, \Sigma)$ solution of (1.2) such that for any 0 ≤ s ≤ 1, as $t \to -\infty$,
\[ \| \psi(t) - e^{iS^-(t)} U_0(t)\psi_- \|_{H^s} = O\!\left( \frac{(\log |t|)^{2+s}}{|t|} \right), \tag{1.15} \]
\[ \| J(t)\psi - J(t) e^{iS^-(t)} U_0(t)\psi_- \|_{L^2} = O\!\left( \frac{(\log |t|)^3}{|t|} \right). \tag{1.16} \]
In particular, we have
\[ \| \psi(t) - e^{iS^-(t)} U_0(t)\psi_- \|_{L^\infty} = O\!\left( \frac{(\log |t|)^{5/2}}{|t|^{3/2}} \right). \tag{1.17} \]
Actually, we will prove uniqueness under weaker conditions, as stated in the following proposition. Recall that f and $\psi_-$ are related by (1.8).

Proposition 1. Let $f \in H^2(\mathbb{R})$. Suppose $\|f\|_{L^\infty} < |2\lambda|^{-1/2}$. Then there exists at most one function $\psi \in C(\mathbb{R}_t, L^2 \cap L^\infty)$ solution of (1.2) satisfying the following property: there exists 1/2 < α < 1, with $\alpha > 2|\lambda| \|f\|_{L^\infty}^2$, such that, as $t \to -\infty$,
\[ \| \psi(t) - e^{iS^-(t)} U_0(t)\psi_- \|_{L^2 \cap L^\infty} = O\!\left( \frac{1}{|t|^\alpha} \right). \]

Remark 2. Our method does not recover the convergence in $L^4_t(L^\infty_x)$ of the derivatives, stated in Theorem 1. However, we recover all the others, with a better convergence rate.

Remark 3. The other improvement involves the regularity of the function ψ thus constructed. We get some regularity of the momenta of ψ, namely $x\psi \in L^2$, which did not appear in [11]. Thanks to this regularity, we can use the results of asymptotic completeness stated in [9], in order to define a long range scattering operator for small data.

Corollary 2. We can define a modified scattering operator for (1.2), for small data in $\mathcal{H}$. There exists δ > 0 such that to any $\psi_- \in \mathcal{F}(\mathcal{H})$ satisfying $\|\psi_-\| \le \delta$, we can associate a unique $\psi \in C(\mathbb{R}_t, \Sigma)$ solution of (1.2) and $\psi_+ \in L^2$ such that
\[ \psi(t) \underset{t \to \pm\infty}{\sim} e^{iS^\pm(t)} U_0(t)\psi_\pm \quad \text{in } L^2, \tag{1.18} \]
where $S^\pm$ are defined by (1.5). The map $S : \psi_- \mapsto \psi_+$ is the modified scattering operator.

Corollary 3. Let $f \in \mathcal{H}$. Assume f is sufficiently small. Then nonlinear geometric optics is valid in $L^2$ for the problem (1.9), before and after the caustic. The caustic crossing is described by the modified scattering operator S and a “random phase shift”. One has the following asymptotics in $L^2$,
– if t < 1, then
\[ u^\varepsilon(t, x) \underset{\varepsilon \to 0}{\sim} \frac{e^{i\frac{\pi}{4}}}{\sqrt{2\pi(1 - t)}}\, e^{i\frac{x^2}{2\varepsilon(t-1)} + i\frac{\lambda}{2\pi} \left| \hat\psi_-\left(\frac{x}{t-1}\right) \right|^2 \log\frac{1-t}{\varepsilon}}\, \hat\psi_-\!\left(\frac{x}{t - 1}\right), \]
– if t > 1, then
\[ u^\varepsilon(t, x) \underset{\varepsilon \to 0}{\sim} \frac{e^{-i\frac{\pi}{4}}}{\sqrt{2\pi(t - 1)}}\, e^{i\frac{x^2}{2\varepsilon(t-1)} + i\frac{\lambda}{2\pi} \left| \hat\psi_+\left(\frac{x}{t-1}\right) \right|^2 \log\frac{t-1}{\varepsilon}}\, \hat\psi_+\!\left(\frac{x}{t - 1}\right), \]
where $\psi_-$ is defined by (1.8) and $\psi_+ = S\psi_-$.

Remark 4. The phase shift of −π/2 between the two asymptotics is classical, and appears even in the linear case ([4]). The change in the profile, measured by a scattering operator, was proved in [2]. The new phenomenon here is the phase shift
\[ \frac{\lambda}{2\pi} \left| \hat\psi_+\!\left(\frac{x}{t - 1}\right) \right|^2 \log\frac{t - 1}{\varepsilon} - \frac{\lambda}{2\pi} \left| \hat\psi_-\!\left(\frac{x}{t - 1}\right) \right|^2 \log\frac{t - 1}{\varepsilon}, \]
which is “very nonlinear”, and depends on ε, hence can be called “random”.

Remark 5. From a physical point of view, the nonlinearity $\lambda |\psi|^2 \psi$ appears as the first term of a Taylor expansion of a more general nonlinearity $h(|\psi|^2)\psi$. For instance, h may be bounded (to model the phenomenon of saturation). For large times, ψ is small and we can write
\[ h(|\psi|^2)\psi = \lambda |\psi|^2 \psi + R(|\psi|^2)\psi, \tag{1.19} \]
with $R(|\psi|^2) = O(|\psi|^4)$. One can check that replacing $\lambda |\psi|^2 \psi$ with the right-hand side of (1.19), Corollary 1 still holds, as well as Corollary 2, since the results in [9] still hold with (1.19).

Notations. We will denote
\[ \bar{d}\xi := \frac{d\xi}{2\pi}, \]
so that the Fourier inverse formula writes
\[ \mathcal{F}^{-1} f(x) = \int e^{ix\xi} f(\xi)\, \bar{d}\xi. \]
For $x \in \mathbb{R}$, we denote $\langle x \rangle := (1 + x^2)^{1/2}$.

2. Formal Computations

In this section, we recall how the oscillatory integrals were introduced in the nonlinear short range case ([2]), and give a formal argument that leads to Definition 1 before the focus, that is for t < 1. Suppose $u^\varepsilon$ solves the initial value problem
\[ \begin{cases} i\varepsilon\partial_t u^\varepsilon + \frac{1}{2}\varepsilon^2 \partial_x^2 u^\varepsilon = 0, & (t, x) \in \mathbb{R}_+ \times \mathbb{R}, \\ u^\varepsilon_{|t=0} = e^{-i\frac{x^2}{2\varepsilon}} f(x). \end{cases} \tag{2.1} \]
For t < 1, the asymptotics when ε goes to zero is given by WKB methods,
\[ u^\varepsilon(t, x) \underset{\varepsilon \to 0}{\sim} \frac{1}{\sqrt{1 - t}}\, f\!\left(\frac{x}{1 - t}\right) e^{i\frac{x^2}{2\varepsilon(t-1)}}. \tag{2.2} \]
Near the focus, this description fails to be valid. Neither the profile nor the phase in (2.2) is defined for t = 1. For much more general cases, Duistermaat showed that a uniform description can be obtained in terms of oscillatory integrals ([4]), that is, in this case,
\[ u^\varepsilon(t, x) = \frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon}}\, a^\varepsilon(\xi)\, \bar{d}\xi. \tag{2.3} \]
It is easy to check that $a^\varepsilon$ has an asymptotic expansion in powers of ε, and in particular, $a^\varepsilon \to a_0$ as $\varepsilon \to 0$, with $a_0$ defined by (1.12). For t < 1 the usual stationary phase formula applied to the above integral with $a^\varepsilon$ replaced by $a_0$ gives the asymptotics (2.2). For t > 1, one has almost the same asymptotics; the main difference is a phase shift of −π/2 due to the caustic crossing.

For the nonlinear case (1.6), we generalized the previous representation as follows ([2]),
\[ u^\varepsilon(t, x) = \frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon}}\, a^\varepsilon(t, \xi)\, \bar{d}\xi. \tag{2.4} \]
This formula makes sense as soon as $u^\varepsilon \in L^2$, since
\[ a^\varepsilon(t, \xi) = \frac{1}{\sqrt{\varepsilon}}\, e^{i\frac{t-1}{2\varepsilon}\xi^2}\, \widehat{u^\varepsilon}\!\left(t, \frac{\xi}{\varepsilon}\right). \]
The nonlinear term $\varepsilon^\alpha |u^\varepsilon|^\beta u^\varepsilon$ is negligible when $\partial_t a^\varepsilon$ goes to zero. With this natural definition, we proved that the nonlinear term can have different influences away from the caustic, and near t = 1, which led us to use the same vocabulary as in [10]: linear/nonlinear propagation, linear/nonlinear caustic. We also proved that the four cases can be encountered. When the propagation is nonlinear (α = 1), a formal computation based on the stationary phase formula suggests as a limit transport equation for the symbol $a^\varepsilon$,
\[ i\partial_t a(t, \xi) = \frac{\lambda}{|2\pi(1 - t)|^{\beta/2}}\, |a|^\beta a(t, \xi), \tag{2.5} \]
at least away from the caustic, with initial data $a_{|t=0} = a_0(\xi)$. Multiplying (2.5) by $\bar a$, one notices that the modulus of a is constant. If we write $a = a_0 e^{ig(t, \xi)}$, the equation for g is:
\[ \partial_t g(t, \xi) = -\frac{\lambda}{|1 - t|^{\beta/2}}\, |f(-\xi)|^\beta. \tag{2.6} \]
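The formal statements around (2.5) and (2.6) can be illustrated numerically in the case β = 2 at one fixed frequency ξ. The sketch below is an assumption-laden illustration, not part of the argument: the values of `lam` and `f_val` are arbitrary, the integrator is a plain fourth-order Runge-Kutta scheme, and the phase is normalized by g(0) = 0. It checks that |a| stays constant and that a(t) = a_0 e^{ig(t)} with g(t) = λ|f(−ξ)|² log(1 − t), which is what integrating (2.6) gives for t < 1.

```python
import cmath
import math

lam = 0.7            # coupling constant lambda (illustrative value)
f_val = 0.9 + 0.4j   # value of f(-xi) at one fixed frequency xi (illustrative)
a0 = cmath.sqrt(2 * math.pi / 1j) * f_val  # a_0(xi) as in (1.12)


def rhs(t, a):
    """da/dt from (2.5) with beta = 2: i a' = lam |a|^2 a / (2 pi (1 - t))."""
    return -1j * lam * abs(a) ** 2 * a / (2 * math.pi * (1 - t))


def rk4(a, t0, t1, steps):
    """Classical fourth-order Runge-Kutta integration from t0 to t1."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = rhs(t, a)
        k2 = rhs(t + h / 2, a + h / 2 * k1)
        k3 = rhs(t + h / 2, a + h / 2 * k2)
        k4 = rhs(t + h, a + h * k3)
        a += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return a


t_end = 0.5  # stays before the focus t = 1
a_num = rk4(a0, 0.0, t_end, 2000)

# Closed form: integrate (2.6) with beta = 2 and g(0) = 0.
g_exact = lam * abs(f_val) ** 2 * math.log(1 - t_end)
a_exact = a0 * cmath.exp(1j * g_exact)

print(abs(a_num) - abs(a0), abs(a_num - a_exact))
```

The logarithm in `g_exact` is the source of the log((1 − t)/ε) factor in Definition 1 once the initial value g(0) = λ|f(−ξ)|² log(1/ε) is used instead of g(0) = 0.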
If we wish to get as a limit transport equation the relation $\partial_t \tilde a = 0$, it seems natural to define a modified symbol $\tilde a^\varepsilon$ as
\[ u^\varepsilon(t, x) = \frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x.\xi}{\varepsilon} + i g(t, \xi)}\, \tilde a^\varepsilon(t, \xi)\, \bar{d}\xi, \tag{2.7} \]
with $g_{|t=0} = 0$. In the case of a linear caustic (β < 2), we proved that indeed,
\[ \tilde a^\varepsilon(t, \xi) \xrightarrow[\varepsilon \to 0]{} a_0(\xi) \quad \text{in } L^\infty_{t,loc}(\Sigma). \]
In the case we want to study now, β = 2, the integration of (2.6) is possible only for t < 1. With the initial data $g_{|t=0} = \lambda |f(-\xi)|^2 \log\frac{1}{\varepsilon}$, it gives the result introduced in Definition 1. As in the cases recalled above, the transport equation for the modified symbol $\tilde a^\varepsilon$ must be, for t < 1,
\[ \partial_t \tilde a^\varepsilon \xrightarrow[\varepsilon \to 0]{} 0, \]
which leads us to the definition of the approximate solution (1.11). From now on, we will leave out the tilde symbol for a, and adopt the notation (1.13).

Remark 6. The function g is defined only for t < 1, not near t = 1. One must remember that the formal computations that lead to the definition of g are based on the application of the stationary phase formula. When the phase $\frac{1-t}{2}\xi^2 + x\xi$ does not have non-degenerate critical points, one must not expect this formal argument to be valid in the general case. On the other hand, recall that the case we study (α = 1 and β = 2) corresponds to a nonlinear propagation and a nonlinear caustic. The phase g takes the nonlinear effects of the propagation before the caustic into account. To take the nonlinear effects of the caustic into account, one has to define a (long range) scattering operator for the cubic Schrödinger equation (see Sect. 7).

For t < 1, the function $u^\varepsilon_{app}$ satisfies the equation
\begin{align*}
i\varepsilon\partial_t u^\varepsilon_{app} + \frac{1}{2}\varepsilon^2 \partial_x^2 u^\varepsilon_{app} &= -\sqrt{\varepsilon} \int \partial_t g^\varepsilon(t, \xi)\, e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon} + i g^\varepsilon(t, \xi)}\, a_0(\xi)\, \bar{d}\xi \\
&= \lambda\varepsilon\, \frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon} + i g^\varepsilon(t, \xi)}\, \frac{|f(-\xi)|^2}{1 - t}\, a_0(\xi)\, \bar{d}\xi. \tag{2.8}
\end{align*}
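For t < 1, one can formally apply the stationary phase formula to the integral defining $u^\varepsilon_{app}$,
\[ u^\varepsilon_{app}(t, x) \underset{\varepsilon \to 0}{\sim} \frac{1}{(1 - t)^{1/2}}\, f\!\left(\frac{x}{1 - t}\right) e^{i\frac{x^2}{2\varepsilon(t-1)} + i g^\varepsilon\left(t, \frac{x}{t-1}\right)} =: u^\varepsilon_1(t, x). \tag{2.9} \]
On the other hand, if one applies the stationary phase formula to the right-hand side of (2.8), one obtains $\lambda\varepsilon |u^\varepsilon_1|^2 u^\varepsilon_1(t, x)$, so formally, $u^\varepsilon_{app}$ is an approximate solution of (1.9). In the following section, we estimate precisely the remainders when one applies the stationary phase formula as above.

3. Estimates on Some Oscillatory Integrals

3.1. The fundamental estimate. We first estimate precisely the remainder of the usual stationary phase formula applied to the first order, in $L^2$.

Lemma 1. Let σ(t, ξ) be locally bounded in time with values in $L^2(\mathbb{R})$. Denote
\[ H^\varepsilon(t, x) := \frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon}}\, \sigma(t, \xi)\, \bar{d}\xi, \]
and $\Phi^\varepsilon$ the first term given by the stationary phase formula,
\[ \Phi^\varepsilon(t, x) := \sqrt{\frac{i}{2\pi(1 - t)}}\, e^{-i\frac{x^2}{2\varepsilon(1-t)}}\, \sigma\!\left(t, \frac{x}{t - 1}\right). \]
1. There exists a continuous function h, with h(0) = 0, such that
\[ \| H^\varepsilon(t, .) - \Phi^\varepsilon(t, .) \|_{L^2} = h\!\left(\frac{\varepsilon}{1 - t}\right). \]
2. If $\sigma(t, .) \in H^2(\mathbb{R})$, the rate of continuity of h can be estimated,
\[ \| H^\varepsilon(t, .) - \Phi^\varepsilon(t, .) \|_{L^2} \le C \frac{\varepsilon}{|1 - t|}\, \| \sigma(t, .) \|_{H^2_\xi}. \]

Proof. From the definition of $H^\varepsilon$,
\begin{align*}
H^\varepsilon(t, x) &= e^{-i\frac{x^2}{2\varepsilon(1-t)}}\, \frac{1}{\sqrt{\varepsilon}} \int e^{i\frac{1-t}{2\varepsilon}\left(\xi + \frac{x}{1-t}\right)^2} \sigma(t, \xi)\, \bar{d}\xi \\
&= e^{-i\frac{x^2}{2\varepsilon(1-t)}}\, \frac{1}{\sqrt{\varepsilon}} \int e^{i\frac{1-t}{2\varepsilon}\xi^2}\, \sigma\!\left(t, \xi - \frac{x}{1 - t}\right) \bar{d}\xi,
\end{align*}
hence from Parseval formula,
\begin{align*}
H^\varepsilon(t, x) &= e^{-i\frac{x^2}{2\varepsilon(1-t)}} \sqrt{\frac{i}{2\pi(1 - t)}} \int e^{i\frac{\varepsilon}{2(1-t)} y^2}\, e^{i\frac{xy}{1-t}}\, \mathcal{F}^{-1}_{\xi \to y}\sigma(t, y)\, dy \\
&= \sqrt{\frac{i}{2\pi(1 - t)}}\, e^{-i\frac{x^2}{2\varepsilon(1-t)}}\, \sigma\!\left(t, \frac{x}{t - 1}\right) \\
&\quad + \sqrt{\frac{i}{2\pi(1 - t)}}\, e^{-i\frac{x^2}{2\varepsilon(1-t)}} \int \left( e^{i\frac{\varepsilon}{2(1-t)} y^2} - 1 \right) e^{i\frac{xy}{1-t}}\, \mathcal{F}^{-1}\sigma(t, y)\, dy,
\end{align*}
and the last term can also be written as
\[ \sqrt{\frac{i}{2\pi(1 - t)}}\, e^{-i\frac{x^2}{2\varepsilon(1-t)}}\, \mathcal{F}\!\left[ \left( e^{i\frac{\varepsilon}{2(1-t)} y^2} - 1 \right) \mathcal{F}^{-1}\sigma \right]\!\left(t, \frac{x}{t - 1}\right). \]
Now from the Plancherel formula,
\[ \| H^\varepsilon(t, .) - \Phi^\varepsilon(t, .) \|_{L^2_x} = h\!\left(t, \frac{\varepsilon}{1 - t}\right), \]
with
\[ h(t, z) = \left\| \left( e^{i\frac{z}{2} y^2} - 1 \right) \mathcal{F}^{-1}\sigma \right\|_{L^2_y}. \]
Then the first point follows from the dominated convergence theorem. When $\sigma(t, .) \in H^2$, we have
\begin{align*}
h(t, z) &= \left\| 2 \sin\!\left(\frac{z}{4} y^2\right) \mathcal{F}^{-1}\sigma(t, .) \right\|_{L^2_y} \le \left\| 2\, \frac{z}{4}\, y^2\, \mathcal{F}^{-1}\sigma(t, .) \right\|_{L^2_y} \\
&\le |z|\, \left\| y^2 \mathcal{F}^{-1}\sigma(t, .) \right\|_{L^2_y} = C |z|\, \| \sigma(t, .) \|_{H^2}.
\end{align*}
This inequality completes the proof of Lemma 1.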
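The pointwise facts behind the last chain of inequalities are $|e^{i\theta} - 1| = 2|\sin(\theta/2)| \le |\theta|$ with $\theta = z y^2/2$. A small numerical spot check is sketched below; the sample grid and the tolerance are arbitrary choices, and the helper name is hypothetical.

```python
import cmath
import math


def pointwise_bounds_hold(z, y, tol=1e-12):
    """Check |exp(i z y^2 / 2) - 1| = 2 |sin(z y^2 / 4)| <= |z| y^2 / 2 <= |z| y^2."""
    theta = z * y * y / 2
    modulus = abs(cmath.exp(1j * theta) - 1)
    sine_form = 2 * abs(math.sin(theta / 2))
    return (
        abs(modulus - sine_form) < tol
        and sine_form <= abs(theta) + tol
        and abs(theta) <= abs(z) * y * y
    )


zs = [0.001, 0.1, 1.0, 5.0]
ys = [k / 4 for k in range(-12, 13)]  # y in [-3, 3]
all_ok = all(pointwise_bounds_hold(z, y) for z in zs for y in ys)
print(all_ok)
```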
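3.2. Convergence of the initial data. To obtain asymptotics in Σ for the symbols as stated in Theorem 2, we have to notice the following properties. If
\[ v^\varepsilon(t, x) = \frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon}}\, b^\varepsilon(t, \xi)\, \bar{d}\xi, \]
then
\[ \frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon}}\, \xi\, b^\varepsilon(t, \xi)\, \bar{d}\xi = -i\varepsilon\partial_x v^\varepsilon(t, x), \]
and
\[ \frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon}}\, \partial_\xi b^\varepsilon(t, \xi)\, \bar{d}\xi = -i J^\varepsilon(t) v^\varepsilon(t, x), \]
where we denoted
\[ J^\varepsilon(t) := \frac{x}{\varepsilon} + i(t - 1)\partial_x. \tag{3.1} \]
(The factor −i, which follows from differentiating under the integral and from an integration by parts in ξ, plays no role in the norm estimates below.) The operator $J^\varepsilon$ is nothing else than the usual Galilean operator, rescaled according to our problem.

Lemma 2. The operator $J^\varepsilon$ satisfies the following properties.
– The commutation relation,
\[ \left[ J^\varepsilon(t),\ i\varepsilon\partial_t + \frac{1}{2}\varepsilon^2 \partial_x^2 \right] = 0. \tag{3.2} \]
– Denote $M^\varepsilon(t) = e^{i\frac{x^2}{2\varepsilon(t-1)}}$, then $J^\varepsilon(t)$ writes
\[ J^\varepsilon(t) = i(t - 1) M^\varepsilon(t)\, \partial_x\, M^\varepsilon(2 - t). \tag{3.3} \]
– The modified Sobolev inequality,
\[ \| w(t) \|_{L^\infty} \le \frac{C}{\sqrt{|1 - t|}}\, \| w(t) \|_{L^2}^{1/2}\, \| J^\varepsilon(t) w(t) \|_{L^2}^{1/2}. \tag{3.4} \]
– For any function $F \in C^1(\mathbb{C}, \mathbb{C})$ satisfying the gauge invariance condition
\[ \exists G \in C^1(\mathbb{R}_+, \mathbb{R}), \quad F(z) = z G'(|z|^2), \]
one has
\[ J^\varepsilon(t) F(w) = \partial_z F(w)\, J^\varepsilon(t) w - \partial_{\bar z} F(w)\, \overline{J^\varepsilon(t) w}. \tag{3.5} \]
The first step to prove Theorem 2 is to study the convergence of the initial value of the symbol $a^\varepsilon$.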
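Lemma 3. The following convergence holds in Σ,
\[ a^\varepsilon(0, \xi) \xrightarrow[\varepsilon \to 0]{} a_0(\xi). \]
More precisely, there exists $C = C(\|f\|_{\mathcal{H}})$ such that
\begin{align*}
\| a^\varepsilon(0, .) - a_0 \|_{L^2} &\le C\varepsilon \left( \log\frac{1}{\varepsilon} \right)^2, \\
\| \xi (a^\varepsilon(0, \xi) - a_0(\xi)) \|_{L^2},\ \| \partial_\xi (a^\varepsilon(0, \xi) - a_0(\xi)) \|_{L^2} &\le C\varepsilon \left( \log\frac{1}{\varepsilon} \right)^3. \tag{3.6}
\end{align*}
Moreover, the same estimates hold with $a^\varepsilon(0, \xi) - a_0(\xi)$ replaced with $(a^\varepsilon(0, \xi) - a_0(\xi))\, e^{-i\lambda |f(-\xi)|^2 \log \varepsilon}$.

Proof. From (1.14) and the initial value of $u^\varepsilon$, one has
\[ a^\varepsilon(0, \xi) = e^{i\lambda |f(-\xi)|^2 \log \varepsilon}\, \frac{1}{\sqrt{\varepsilon}} \int e^{-\frac{i}{2\varepsilon}(x + \xi)^2 + i\lambda |f(x)|^2 \log\frac{1}{\varepsilon}} f(x)\, dx. \]
Denote $h^\varepsilon(x) := e^{i\lambda |f(x)|^2 \log\frac{1}{\varepsilon}} f(x)$. From Parseval formula, one also has
\[ a^\varepsilon(0, \xi)\, e^{-i\lambda |f(-\xi)|^2 \log \varepsilon} = \frac{1}{\sqrt{2i\pi}} \int e^{-iy\xi - i\varepsilon\frac{y^2}{2}}\, \widehat{h^\varepsilon}(y)\, dy, \]
hence
\[ (a^\varepsilon(0, \xi) - a_0(\xi))\, e^{-i\lambda |f(-\xi)|^2 \log \varepsilon} = \frac{1}{\sqrt{2i\pi}} \int e^{-iy\xi} \left( e^{-i\varepsilon\frac{y^2}{2}} - 1 \right) \widehat{h^\varepsilon}(y)\, dy. \]
Following the proof of Lemma 1, one then proves that the $L^2$-norm of the above quantity is $O(\varepsilon |\log \varepsilon|^2)$, and its Σ-norm is $O(\varepsilon |\log \varepsilon|^3)$. The estimates (3.6) of Lemma 3 are then straightforward.

3.3. Estimating the approximate solution. To estimate the remainder $u^\varepsilon - u^\varepsilon_{app}$, we will need some information about the $L^\infty$-norm of the approximate solution. The following lemma provides some.

Lemma 4. Let β > 0. There exists $C_* = C_*(\beta, \|f\|_{H^2})$ such that in the region $\{1 - t \ge C_* \varepsilon\}$, $u^\varepsilon_{app}(t)$ satisfies almost the same estimate as $u^\varepsilon_1(t)$ in $L^\infty$, that is,
\[ \| u^\varepsilon_{app}(t) \|_{L^\infty} \le \frac{\|f\|_{L^\infty} + \beta}{\sqrt{1 - t}}. \]

Proof. Write
\[ \| u^\varepsilon_{app}(t) \|_{L^\infty} \le \| u^\varepsilon_1(t) \|_{L^\infty} + \| u^\varepsilon_{app}(t) - u^\varepsilon_1(t) \|_{L^\infty}, \]
and denote $d^\varepsilon(t, x) := u^\varepsilon_{app}(t, x) - u^\varepsilon_1(t, x)$. From the modified Sobolev inequality,
\[ \| d^\varepsilon(t) \|_{L^\infty} \le \frac{C}{\sqrt{1 - t}}\, \| d^\varepsilon(t) \|_{L^2}^{1/2}\, \| J^\varepsilon(t) d^\varepsilon \|_{L^2}^{1/2}. \tag{3.7} \]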
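Now the $L^2$-norms can be estimated thanks to Lemma 1, with
\[ \sigma^\varepsilon(t, \xi) := e^{i g^\varepsilon(t, \xi)} a_0(\xi). \]
Thus,
\[ \| d^\varepsilon(t) \|_{L^2} \le C \frac{\varepsilon}{1 - t}\, \| \sigma^\varepsilon(t, .) \|_{H^2_\xi}. \]
It is a straightforward computation to see that since $H^1(\mathbb{R}) \subset L^\infty(\mathbb{R})$, there are some constants such that
\[ \| d^\varepsilon(t) \|_{L^2} \le C(\|f\|_{H^2})\, \frac{\varepsilon}{1 - t} \left( \log\frac{1 - t}{\varepsilon} \right)^2. \]
Since $J^\varepsilon(t)$ acts as the differentiation with respect to ξ on the symbols, the first part of Lemma 1 gives
\[ \| J^\varepsilon(t) d^\varepsilon \|_{L^2} = \log\frac{1 - t}{\varepsilon}\ h\!\left(\frac{\varepsilon}{1 - t}\right), \]
where $h \in C(\mathbb{R})$ satisfies h(0) = 0. Then from (3.7),
\[ \| d^\varepsilon(t) \|_{L^\infty} \le \frac{C(\|f\|_{H^2})}{\sqrt{1 - t}} \left( \frac{\varepsilon}{1 - t} \right)^{1/2} \left( \log\frac{1 - t}{\varepsilon} \right)^{3/2} h\!\left(\frac{\varepsilon}{1 - t}\right)^{1/2}. \]
Hence, for $1 - t \gg \varepsilon$, $d^\varepsilon(t)$ is negligible compared to $u^\varepsilon_1(t)$ in $L^\infty$. This completes the proof of Lemma 4.

The proof of the next lemma is similar, and uses the regularity $f \in \mathcal{H}$.

Lemma 5. There exists $C_* = C_*(\|f\|_{\mathcal{H}})$ such that in the region $\{1 - t \ge C_* \varepsilon\}$, the derivatives of $u^\varepsilon_{app}$ satisfy almost the same estimates as the derivatives of $u^\varepsilon_1$ in $L^\infty$, that is, there exists $C = C(\|f\|_{\mathcal{H}})$ such that
\[ \| \varepsilon\partial_x u^\varepsilon_{app}(t) \|_{L^\infty} \le \frac{C}{\sqrt{1 - t}}, \tag{3.8} \]
\[ \| J^\varepsilon(t) u^\varepsilon_{app} \|_{L^\infty} \le \frac{C}{\sqrt{1 - t}}\, \log\frac{1 - t}{\varepsilon}. \tag{3.9} \]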
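3.4. The equation satisfied by the approximate solution. From Sect. 2 and more precisely from Eq. (2.8), the approximate solution $u^\varepsilon_{app}$ solves the cubic nonlinear Schrödinger equation up to the error term
\[ \Delta^\varepsilon(t, x) := |u^\varepsilon_{app}|^2 u^\varepsilon_{app}(t, x) - \frac{1}{\sqrt{\varepsilon}} \int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\xi}{\varepsilon} + i g^\varepsilon(t, \xi)}\, \frac{|f(-\xi)|^2}{1 - t}\, a_0(\xi)\, \bar{d}\xi. \tag{3.10} \]

Lemma 6. There exist $C = C(\|f\|_{H^2})$ and $C_* = C_*(\|f\|_{H^2})$ such that uniformly in the region $\{1 - t \ge C_* \varepsilon\}$,
\[ \| \Delta^\varepsilon(t) \|_{L^2_x} \le C \frac{\varepsilon}{(1 - t)^2} \left( \log\frac{1 - t}{\varepsilon} \right)^2. \tag{3.11} \]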
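Proof. Write
\[ \Delta^\varepsilon(t, x) = \Delta^\varepsilon(t, x) + |u^\varepsilon_1|^2 u^\varepsilon_1(t, x) - |u^\varepsilon_1|^2 u^\varepsilon_1(t, x), \]
and introduce
\[ \tilde\Delta^\varepsilon(t, x) := |u^\varepsilon_{app}|^2 u^\varepsilon_{app}(t, x) - |u^\varepsilon_1|^2 u^\varepsilon_1(t, x). \]
We prove that $\tilde\Delta^\varepsilon$ satisfies the estimate stated in Lemma 6. The other estimate needed to complete the proof of Lemma 6 is easier and will be left out. First remark that
\[ \| \tilde\Delta^\varepsilon(t, .) \|_{L^2_x} \le C \left( \| u^\varepsilon_{app}(t) \|_{L^\infty_x}^2 + \| u^\varepsilon_1(t) \|_{L^\infty_x}^2 \right) \| (u^\varepsilon_{app} - u^\varepsilon_1)(t) \|_{L^2_x}. \]
One has obviously
\[ \| u^\varepsilon_1(t) \|_{L^\infty_x}^2 \le \frac{1}{1 - t}\, \|f\|_{L^\infty}^2. \]
From Lemma 4, $u^\varepsilon_{app}$ satisfies the same estimate in the region we are considering. Hence,
\[ \| \tilde\Delta^\varepsilon(t, .) \|_{L^2_x} \le C(\|f\|_{H^2})\, \frac{1}{1 - t}\, \| (u^\varepsilon_{app} - u^\varepsilon_1)(t) \|_{L^2_x}. \tag{3.12} \]
From Lemma 1 with $\sigma^\varepsilon = e^{i g^\varepsilon(t, \xi)} a_0(\xi)$, we finally have
\[ \| (u^\varepsilon_{app} - u^\varepsilon_1)(t) \|_{L^2_x} \le C \frac{\varepsilon}{1 - t}\, \left\| e^{i g^\varepsilon(t, \xi)} a_0(\xi) \right\|_{H^2_\xi}, \]
and it is easy to check that $\tilde\Delta^\varepsilon$ satisfies the estimate announced in Lemma 6.

The following lemma is the extension of Lemma 6 we will need for the proof of Theorem 2, and its proof is similar.

Lemma 7. There exist $C = C(\|f\|_{\mathcal{H}})$ and $C_* = C_*(\|f\|_{\mathcal{H}})$ such that uniformly in the region $\{1 - t \ge C_* \varepsilon\}$,
\begin{align*}
\| J^\varepsilon(t) \Delta^\varepsilon(t) \|_{L^2_x} &\le C \frac{\varepsilon}{(1 - t)^2} \left( \log\frac{1 - t}{\varepsilon} \right)^3, \\
\| \varepsilon\partial_x \Delta^\varepsilon(t) \|_{L^2_x} &\le C \frac{\varepsilon}{(1 - t)^2} \left( \log\frac{1 - t}{\varepsilon} \right)^3. \tag{3.13}
\end{align*}

4. Energy Estimates

In this section, we derive the three energy estimates we will use to justify nonlinear geometric optics. Recall that the exact solution $u^\varepsilon$ and the approximate solution $u^\varepsilon_{app}$ satisfy
\begin{align*}
i\varepsilon\partial_t u^\varepsilon + \frac{1}{2}\varepsilon^2 \partial_x^2 u^\varepsilon &= \lambda\varepsilon |u^\varepsilon|^2 u^\varepsilon, \\
i\varepsilon\partial_t u^\varepsilon_{app} + \frac{1}{2}\varepsilon^2 \partial_x^2 u^\varepsilon_{app} &= \lambda\varepsilon |u^\varepsilon_{app}|^2 u^\varepsilon_{app} - \varepsilon\Delta^\varepsilon,
\end{align*}
where $\Delta^\varepsilon$ is defined by (3.10) and is estimated in Lemmas 6 and 7. Introduce the remainder $w^\varepsilon := u^\varepsilon - u^\varepsilon_{app}$. Subtracting the previous two equations, one has
\[ i\varepsilon\partial_t w^\varepsilon + \frac{1}{2}\varepsilon^2 \partial_x^2 w^\varepsilon = \lambda\varepsilon \left( |u^\varepsilon|^2 u^\varepsilon - |u^\varepsilon_{app}|^2 u^\varepsilon_{app} \right) + \varepsilon\Delta^\varepsilon. \tag{4.1} \]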
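Multiplying the previous equation by $\overline{w^\varepsilon}$ and taking the imaginary part of the result integrated in $x$, it follows that
\[
\begin{aligned}
\partial_t\|w^\varepsilon(t)\|_{L^2} &\le C\left(\|u^\varepsilon(t)\|^2_{L^\infty} + \|u^\varepsilon_{app}(t)\|^2_{L^\infty}\right)\|w^\varepsilon(t)\|_{L^2} + C\|\Sigma^\varepsilon(t)\|_{L^2} \\
&\le C\left(\|w^\varepsilon(t)\|^2_{L^\infty} + \|u^\varepsilon_{app}(t)\|^2_{L^\infty}\right)\|w^\varepsilon(t)\|_{L^2} + C\|\Sigma^\varepsilon(t)\|_{L^2}.
\end{aligned} \tag{4.2}
\]
Differentiating (4.1) with respect to $x$ and multiplying by $\varepsilon\partial_x\overline{w^\varepsilon}$, one has similarly
\[
\begin{aligned}
\partial_t\|\varepsilon\partial_x w^\varepsilon(t)\|_{L^2} \le{}& C\left(\|w^\varepsilon(t)\|^2_{L^\infty} + \|u^\varepsilon_{app}(t)\|^2_{L^\infty}\right)\|\varepsilon\partial_x w^\varepsilon(t)\|_{L^2} \\
&+ C\|w^\varepsilon(t)\|_{L^2}\|u^\varepsilon_{app}(t)\|_{L^\infty}\|\varepsilon\partial_x u^\varepsilon_{app}(t)\|_{L^\infty} \\
&+ C\|w^\varepsilon(t)\|^2_{L^\infty}\|\varepsilon\partial_x u^\varepsilon_{app}(t)\|_{L^2} + C\|\varepsilon\partial_x\Sigma^\varepsilon(t)\|_{L^2}.
\end{aligned} \tag{4.3}
\]
Finally, since from Lemma 2 $J^\varepsilon$ commutes with the Schrödinger operator and acts on the nonlinearity we are considering as a differentiation, we also have
\[
\begin{aligned}
\partial_t\|J^\varepsilon(t)w^\varepsilon\|_{L^2} \le{}& C\left(\|w^\varepsilon(t)\|^2_{L^\infty} + \|u^\varepsilon_{app}(t)\|^2_{L^\infty}\right)\|J^\varepsilon(t)w^\varepsilon\|_{L^2} \\
&+ C\|w^\varepsilon(t)\|_{L^2}\|u^\varepsilon_{app}(t)\|_{L^\infty}\|J^\varepsilon(t)u^\varepsilon_{app}\|_{L^\infty} \\
&+ C\|w^\varepsilon(t)\|^2_{L^\infty}\|J^\varepsilon(t)u^\varepsilon_{app}\|_{L^2} + C\|J^\varepsilon(t)\Sigma^\varepsilon\|_{L^2}.
\end{aligned} \tag{4.4}
\]
The main idea to justify nonlinear geometric optics is to integrate these three energy estimates so long as $\|w^\varepsilon(t)\|_{L^\infty}$ is not greater than $\|u^\varepsilon_{app}(t)\|_{L^\infty}$. Since $w^\varepsilon$ is expected to be a remainder, this case actually occurs in "sufficiently" large regions, as we will see in the next section.

5. Justification of Nonlinear Geometric Optics Before the Caustic

We now illustrate the method announced above. From Lemma 4, the "so long" condition reads for instance
\[ \|w^\varepsilon(t)\|_{L^\infty} \le \frac{4\|f\|_{L^\infty}}{\sqrt{1-t}}. \tag{5.1} \]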
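From Inequality (3.4) and Lemma 3,
\[ \|w^\varepsilon(0)\|_{L^\infty} \le C\|w^\varepsilon(0)\|^{1/2}_{L^2}\|J^\varepsilon(0)w^\varepsilon\|^{1/2}_{L^2} \le C\varepsilon|\log\varepsilon|^{5/2}. \]
Hence there exists $\varepsilon_* = \varepsilon_*(\|f\|_{\mathcal{H}}) > 0$ such that for $0 < \varepsilon \le \varepsilon_*$, $\|w^\varepsilon(0)\|_{L^\infty} \le 2\|f\|_{L^\infty}$. By continuity, Condition (5.1) is satisfied for $0 \le t \le T_\varepsilon$ for some $T_\varepsilon > 0$. Then, so long as (5.1) holds, we can integrate the three energy estimates using the Gronwall lemma. Since (4.3) and (4.4) are very similar, introduce the following norm:
\[ \|w^\varepsilon(t)\|_Y := \max\left(\|\varepsilon\partial_x w^\varepsilon(t)\|_{L^2}, \|J^\varepsilon(t)w^\varepsilon\|_{L^2}\right). \tag{5.2} \]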
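Now we can write (3.4) as
\[ \|w^\varepsilon(t)\|_{L^\infty} \le \frac{C}{\sqrt{1-t}}\,\|w^\varepsilon(t)\|^{1/2}_{L^2}\,\|w^\varepsilon(t)\|^{1/2}_Y. \]
So long as (5.1) holds, estimate (4.2) can be written as follows,
\[ \partial_t\|w^\varepsilon(t)\|_{L^2} \le C_1\frac{\|f\|^2_{L^\infty}}{1-t}\|w^\varepsilon(t)\|_{L^2} + C\|\Sigma^\varepsilon(t)\|_{L^2}, \tag{5.3} \]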
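where $C_1$ is a universal constant that does not depend on $f$. Denote $C_0 := C_1\|f\|^2_{L^\infty}$. From the Gronwall lemma, we can integrate the previous inequality as follows,
\[ \|w^\varepsilon(t)\|_{L^2} \le \frac{\|w^\varepsilon(0)\|_{L^2}}{(1-t)^{C_0}} + C\int_0^t\left(\frac{1-s}{1-t}\right)^{C_0}\|\Sigma^\varepsilon(s)\|_{L^2}\,ds. \tag{5.4} \]
From Lemma 5 and from Lemma 7, Inequalities (4.3) and (4.4) can also be written
\[
\begin{aligned}
\partial_t\|w^\varepsilon(t)\|_Y \le{}& \frac{C_0}{1-t}\|w^\varepsilon(t)\|_Y + \frac{C}{1-t}\|w^\varepsilon(t)\|_{L^2}\log\frac{1-t}{\varepsilon} \\
&+ \frac{C}{1-t}\|w^\varepsilon(t)\|_{L^2}\|w^\varepsilon(t)\|_Y\log\frac{1-t}{\varepsilon} \\
&+ C\,\frac{\varepsilon}{(1-t)^2}\left(\log\frac{1-t}{\varepsilon}\right)^3,
\end{aligned} \tag{5.5}
\]
where we possibly increased the value of $C_1$ (this question will be addressed more precisely in Sect. 6). To estimate the integral on the right-hand side of (5.4), we use Lemmas 6 and 7. The integral is not greater than
\[ \frac{C\varepsilon}{(1-t)^{C_0}}\int_0^t\frac{1}{(1-s)^{2-C_0}}\left(\log\frac{1-s}{\varepsilon}\right)^2ds. \tag{5.6} \]
For $j > 0$, we are thus led to study
\[ \int_0^t\frac{1}{(1-s)^{2-C_0}}\left(\log\frac{1-s}{\varepsilon}\right)^j ds. \]
We take $j > 0$ and not only $j = 2$ because to estimate $\|w^\varepsilon(t)\|_Y$, we will have to deal with similar integrals with $j = 3$. With the substitution $\sigma = \frac{1-s}{\varepsilon}$, it becomes
\[ \varepsilon^{C_0-1}\int_{\frac{1-t}{\varepsilon}}^{\frac{1}{\varepsilon}}\frac{(\log\sigma)^j}{\sigma^{2-C_0}}\,d\sigma. \tag{5.7} \]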
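For reference, the change of variables behind (5.7) can be written out as follows. This is only a sketch of the computation, under the assumption that the substitution is made in the integration variable $s$, that is $\sigma = (1-s)/\varepsilon$ and $ds = -\varepsilon\,d\sigma$.

```latex
% Sketch: substitution \sigma = (1-s)/\varepsilon in the integral preceding (5.7).
% s = 0 corresponds to \sigma = 1/\varepsilon, and s = t to \sigma = (1-t)/\varepsilon.
\begin{align*}
\int_0^t \frac{1}{(1-s)^{2-C_0}}\left(\log\frac{1-s}{\varepsilon}\right)^{j} ds
 &= \int_{(1-t)/\varepsilon}^{1/\varepsilon}
    \frac{(\log\sigma)^j}{(\varepsilon\sigma)^{2-C_0}}\,\varepsilon\,d\sigma \\
 &= \varepsilon^{C_0-1}\int_{(1-t)/\varepsilon}^{1/\varepsilon}
    \frac{(\log\sigma)^j}{\sigma^{2-C_0}}\,d\sigma .
\end{align*}
```

The lower limit $(1-t)/\varepsilon$ is large exactly in the region where $1-t$ is large compared to $\varepsilon$, which is the region covered by the lemmas.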
Since in Lemmas 4, 6 and 7 we had to restrict our attention to the region $1 - t \gg \varepsilon$, we can replace $\log\sigma$ with $|\log\sigma|$ in the previous integral with no change in the asymptotics, and we have to study
\[ \int_{\frac{1-t}{\varepsilon}}^{\frac{1}{\varepsilon}}\frac{(\log\sigma)^j}{\sigma^{2-C_0}}\,d\sigma, \quad j > 0. \tag{5.8} \]
To estimate these integrals, we have to distinguish two cases, namely $C_0 < 1$ and $C_0 \ge 1$.
5.1. Case $C_0 < 1$. In this case, one has obviously $2 - C_0 > 1$, hence the integral (5.8) is convergent. More precisely, we have to estimate the remainder of a converging integral. Integration by parts shows that for $b > a \gg 1$,
\[ \int_a^b\frac{(\log\sigma)^j}{\sigma^{2-C_0}}\,d\sigma = O\left(\frac{(\log a)^j}{a^{1-C_0}}\right). \]
With $a = \frac{1-t}{\varepsilon}$, it follows that the energy estimate (5.4) becomes
\[ \|w^\varepsilon(t)\|_{L^2} \le C_2\,\frac{\varepsilon}{1-t}\left(\log\frac{1-t}{\varepsilon}\right)^2. \tag{5.9} \]
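The integration by parts invoked for $C_0 < 1$ can be sketched as follows. The abbreviation $p := 1 - C_0$, the name $I(a,b)$ and the threshold on $a$ are notation introduced here for illustration only; the contribution of $w^\varepsilon(0)$ to (5.4) is not treated in this sketch.

```latex
% Sketch, with p := 1 - C_0 > 0 (our notation) and 1 < a < b.
\begin{align*}
I(a,b) := \int_a^b \frac{(\log\sigma)^j}{\sigma^{1+p}}\,d\sigma
 &= \left[-\frac{(\log\sigma)^j}{p\,\sigma^{p}}\right]_a^b
    + \frac{j}{p}\int_a^b \frac{(\log\sigma)^{j-1}}{\sigma^{1+p}}\,d\sigma \\
 &\le \frac{(\log a)^j}{p\,a^{p}} + \frac{j}{p\,\log a}\,I(a,b).
\end{align*}
% If \log a \ge 2j/p, the last term is at most I(a,b)/2 and can be absorbed:
\[
I(a,b) \le \frac{2}{p}\,\frac{(\log a)^j}{a^{1-C_0}} .
\]
% With a = (1-t)/\varepsilon and j = 2, the integral term of (5.4),
% bounded through (5.6) and (5.7), is therefore at most
\[
\frac{C\varepsilon}{(1-t)^{C_0}}\;\varepsilon^{C_0-1}
\left(\frac{\varepsilon}{1-t}\right)^{1-C_0}
\left(\log\frac{1-t}{\varepsilon}\right)^{2}
 = \frac{C\varepsilon}{1-t}\left(\log\frac{1-t}{\varepsilon}\right)^{2} .
\]
```

The powers of $\varepsilon$ and of $1-t$ recombine exactly into the factor $\varepsilon/(1-t)$ of (5.9).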
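Let $\alpha > 0$. Then if $1 - t \ge C_*\varepsilon$, where $C_*$ is such that
\[ C_2\,\frac{(\log C_*)^3}{C_*} = \alpha, \]
where $C_2$ is the constant in (5.9), Inequality (5.5) becomes
\[ \partial_t\|w^\varepsilon(t)\|_Y \le \frac{C_0 + \alpha}{1-t}\|w^\varepsilon(t)\|_Y + \frac{C\varepsilon}{(1-t)^2}\left(\log\frac{1-t}{\varepsilon}\right)^3. \tag{5.10} \]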
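Taking $\alpha > 0$ sufficiently small, we have $C_0 + \alpha < 1$, hence we can replace $C_0 + \alpha$ with $C_0$ with no change in the result. Now we can apply the Gronwall lemma to (5.10):
\[ \|w^\varepsilon(t)\|_Y \le \frac{\|w^\varepsilon(0)\|_Y}{(1-t)^{C_0}} + \frac{C\varepsilon}{(1-t)^{C_0}}\int_0^t\frac{1}{(1-s)^{2-C_0}}\left(\log\frac{1-s}{\varepsilon}\right)^3ds. \]
The previous estimate with $j = 3$ yields
\[ \|w^\varepsilon(t)\|_Y \le C\,\frac{\varepsilon}{1-t}\left(\log\frac{1-t}{\varepsilon}\right)^3. \tag{5.11} \]
From (3.4),
\[ \|w^\varepsilon(t)\|_{L^\infty} \le \frac{C}{\sqrt{1-t}}\,\frac{\varepsilon}{1-t}\left(\log\frac{1-t}{\varepsilon}\right)^{5/2}. \]
Hence, there exists $C_* = C_*(f)$ such that for $0 < \varepsilon \le \varepsilon_*$ and in the region $\{1 - t \ge C_*\varepsilon\}$, condition (5.1) is always satisfied, and estimates (5.9) and (5.11) hold, which we can summarize in the following proposition.

Proposition 2. Define $\delta := C_1^{-1/2}$. If $f \in \mathcal{H}$ satisfies $\|f\|_{L^\infty} < \delta$, then nonlinear geometric optics is uniformly valid in the region $\{1 - t \ge C_*\varepsilon\}$ for some (large) $C_* = C_*(f)$, with the estimates
\[
\begin{aligned}
\|a^\varepsilon(t,.) - a_0\|_{L^2} &\le C\,\frac{\varepsilon}{1-t}\left(\log\frac{1-t}{\varepsilon}\right)^2, \\
\|\partial_\xi(a^\varepsilon(t,.) - a_0)\|_{L^2},\ \|\xi(a^\varepsilon(t,\xi) - a_0(\xi))\|_{L^2} &\le C\,\frac{\varepsilon}{1-t}\left(\log\frac{1-t}{\varepsilon}\right)^3.
\end{aligned}
\]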
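5.2. Case $C_0 \ge 1$. Now the integral (5.8) is divergent. Integration by parts shows that for $b > a \gg 1$,
\[ \int_a^b\frac{(\log\sigma)^j}{\sigma^{2-C_0}}\,d\sigma = O\left(b^{C_0-1}(\log b)^j\right). \]
Then the energy estimate (5.4) becomes
\[ \|w^\varepsilon(t)\|_{L^2} \le C\,\frac{\varepsilon}{(1-t)^{C_0}}\,|\log\varepsilon|^2. \tag{5.12} \]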
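For $C_0 > 1$ the growth bound used in this case also follows from a cruder argument than integration by parts, sketched below with $q := C_0 - 1$ (notation introduced for this illustration). The sketch assumes $C_0 > 1$ strictly; at $C_0 = 1$ the same direct computation gives $(\log b)^{j+1}/(j+1)$ rather than $(\log b)^j$.

```latex
% Sketch, assuming q := C_0 - 1 > 0 and 1 \le a < b (q is our notation).
% On [a,b] the logarithm is nonnegative and increasing, so it is bounded by \log b.
\[
\int_a^b \frac{(\log\sigma)^j}{\sigma^{2-C_0}}\,d\sigma
 = \int_a^b (\log\sigma)^j\,\sigma^{q-1}\,d\sigma
 \le (\log b)^j \int_a^b \sigma^{q-1}\,d\sigma
 \le \frac{b^{C_0-1}(\log b)^j}{C_0-1} .
\]
% With b = 1/\varepsilon and j = 2, the integral term of (5.4),
% bounded through (5.6) and (5.7), is at most
\[
\frac{C\varepsilon}{(1-t)^{C_0}}\;\varepsilon^{C_0-1}\,
\varepsilon^{-(C_0-1)}\,|\log\varepsilon|^{2}
 = \frac{C\varepsilon}{(1-t)^{C_0}}\,|\log\varepsilon|^{2} .
\]
```

In contrast with the case $C_0 < 1$, the dominant contribution now comes from the upper limit $1/\varepsilon$, which is why the bound (5.12) involves $|\log\varepsilon|$ instead of $\log\frac{1-t}{\varepsilon}$.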
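Let $\alpha > 0$. First, we restrict our study to the region
\[ C\,\frac{\varepsilon}{(1-t)^{C_0}}\,|\log\varepsilon|^2\,\log\frac{1-t}{\varepsilon} \le 2\alpha. \tag{5.13} \]
Then the energy estimate (5.5) becomes, when we take only the "worst" terms into account,
\[ \partial_t\|w^\varepsilon(t)\|_Y \le \frac{C_0 + 2\alpha}{1-t}\|w^\varepsilon(t)\|_Y + \frac{C\varepsilon}{(1-t)^{1+C_0}}\,|\log\varepsilon|^2\,\log\frac{1-t}{\varepsilon}. \tag{5.14} \]
Applying the Gronwall lemma and proceeding as for the $L^2$-norm yields
\[ \|w^\varepsilon(t)\|_Y \le C\,\frac{\varepsilon|\log\varepsilon|^3}{(1-t)^{C_0+2\alpha}}. \tag{5.15} \]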
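From (3.4),
\[ \|w^\varepsilon(t)\|_{L^\infty} \le \frac{C}{\sqrt{1-t}}\,\frac{\varepsilon}{(1-t)^{C_0+\alpha}}\,|\log\varepsilon|^{5/2}. \]
Hence, for $1 - t \gg \left(\varepsilon|\log\varepsilon|^{5/2}\right)^{1/(C_0+\alpha)}$ and $\varepsilon$ sufficiently small, condition (5.1) is always satisfied, and estimates (5.12) and (5.15) hold. Notice that in this region, for $\varepsilon$ sufficiently small, (5.13) is automatically satisfied.

Proposition 3. Take $\delta$ as in Proposition 2. Assume $f \in \mathcal{H}$ satisfies $\|f\|_{L^\infty} \ge \delta$. Let $\alpha > 0$. Then there exists $C_\alpha^*$ such that nonlinear geometric optics is uniformly valid in the region $\left\{1 - t \ge C_\alpha^*\left(\varepsilon|\log\varepsilon|^{5/2}\right)^{1/(C_0+\alpha)}\right\}$, where $C_0 = \|f\|^2_{L^\infty}/\delta^2$, with the estimates
\[
\begin{aligned}
\|a^\varepsilon(t,.) - a_0\|_{L^2} &\le C\,\frac{\varepsilon}{(1-t)^{C_0}}\,|\log\varepsilon|^2, \\
\|\partial_\xi(a^\varepsilon(t,.) - a_0)\|_{L^2},\ \|\xi(a^\varepsilon(t,\xi) - a_0(\xi))\|_{L^2} &\le C\,\frac{\varepsilon}{(1-t)^{C_0+2\alpha}}\,|\log\varepsilon|^3.
\end{aligned}
\]
Propositions 2 and 3 imply Theorem 2, up to the computation of the smallness constant we find with this method, which we shall perform in the next section.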
6. Interpretation

6.1. Computation of $\delta$. We now focus on the case $\|f\|_{L^\infty} < \delta$, and compute the best constant given by our method. From Sect. 5.1, we have to compute the coefficient of $\|w^\varepsilon(t)\|_{L^2}$ in Inequality (4.2), and the constant that appears in the first line of the right-hand side of (4.3) and (4.4). Indeed, for these last two inequalities, we proved that the other terms can be either absorbed (provided we remain sufficiently "far" from the caustic), or considered as a small source term.

For Inequality (4.2), we multiplied (4.1) by $\overline{w^\varepsilon}$, then took the imaginary part of the result integrated in space. Write
\[ |u^\varepsilon|^2u^\varepsilon - |u^\varepsilon_{app}|^2u^\varepsilon_{app} = |u^\varepsilon|^2w^\varepsilon + \left(|u^\varepsilon|^2 - |u^\varepsilon_{app}|^2\right)u^\varepsilon_{app}. \]
With the method of energy estimates, the first term will vanish, and the second is written
\[ \left(|w^\varepsilon|^2 + 2\,\mathrm{Re}\left(\overline{w^\varepsilon}u^\varepsilon_{app}\right)\right)u^\varepsilon_{app}. \]
Hence, we can rewrite (4.2) more precisely as
\[ \partial_t\|w^\varepsilon(t)\|_{L^2} \le 2|\lambda|\left(2\|w^\varepsilon(t)\|_{L^\infty} + \|u^\varepsilon_{app}(t)\|_{L^\infty}\right)\|u^\varepsilon_{app}(t)\|_{L^\infty}\|w^\varepsilon(t)\|_{L^2} + \text{source term}. \tag{6.1} \]
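The algebra behind (6.1) can be checked directly. The following is a sketch, writing $u = u^\varepsilon$, $v = u^\varepsilon_{app}$ and $w = u - v$ for brevity (shorthand introduced here, not taken from the text).

```latex
% Shorthand (ours): u = u^\varepsilon, v = u^\varepsilon_{app}, w = u - v.
\begin{align*}
|u|^2 &= |v + w|^2 = |v|^2 + |w|^2 + 2\,\mathrm{Re}(\overline{w}\,v), \\
|u|^2u - |v|^2v &= |u|^2 w + \left(|u|^2 - |v|^2\right)v
  = |u|^2 w + \left(|w|^2 + 2\,\mathrm{Re}(\overline{w}\,v)\right)v .
\end{align*}
% After multiplication by \overline{w} and integration, the first term is real,
% so it does not contribute to the imaginary part:
\[
\mathrm{Im}\int |u|^2\, w\,\overline{w}\,dx = \mathrm{Im}\int |u|^2|w|^2\,dx = 0 ,
\]
% while the second term is controlled pointwise by
\[
\left|\left(|w|^2 + 2\,\mathrm{Re}(\overline{w}\,v)\right)v\,\overline{w}\right|
 \le \left(|w| + 2|v|\right)|v|\,|w|^2 .
\]
```

Taking the $L^\infty$ norm of the first two factors and the $L^2$ norm of $w$ gives a bound of the form (6.1); it is consistent with the constant written there because $|w| + 2|v| \le 2(2|w| + |v|)$.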
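For Inequality (4.3), we differentiate $|u^\varepsilon|^2u^\varepsilon - |u^\varepsilon_{app}|^2u^\varepsilon_{app}$, with the result
\[ (u^\varepsilon)^2\,\varepsilon\partial_x\overline{u^\varepsilon} - (u^\varepsilon_{app})^2\,\varepsilon\partial_x\overline{u^\varepsilon_{app}} + 2\left(|u^\varepsilon|^2\,\varepsilon\partial_xu^\varepsilon - |u^\varepsilon_{app}|^2\,\varepsilon\partial_xu^\varepsilon_{app}\right). \]
The very last term will be considered as a source term. The term before is written
\[ |u^\varepsilon|^2\,\varepsilon\partial_xu^\varepsilon = |u^\varepsilon|^2\,\varepsilon\partial_xw^\varepsilon + |u^\varepsilon|^2\,\varepsilon\partial_xu^\varepsilon_{app}. \]
When we take the imaginary part, the term $|u^\varepsilon|^2\,\varepsilon\partial_xw^\varepsilon$ vanishes, and the other term is made of source terms and of "absorbed" terms. Finally, the only relevant term will be $(u^\varepsilon)^2\,\varepsilon\partial_x\overline{w^\varepsilon}$, and we can rewrite (4.3) as
\[ \partial_t\|\varepsilon\partial_xw^\varepsilon(t)\|_{L^2} \le 2|\lambda|\left(\|w^\varepsilon(t)\|_{L^\infty} + \|u^\varepsilon_{app}(t)\|_{L^\infty}\right)^2\|\varepsilon\partial_xw^\varepsilon(t)\|_{L^2} + \text{absorbed terms} + \text{source terms}. \tag{6.2} \]
Since from (3.5), $J^\varepsilon$ acts on the nonlinearity as a differentiation, the computation is exactly the same as with $\varepsilon\partial_x$, and we have
\[ \partial_t\|J^\varepsilon(t)w^\varepsilon\|_{L^2} \le 2|\lambda|\left(\|w^\varepsilon(t)\|_{L^\infty} + \|u^\varepsilon_{app}(t)\|_{L^\infty}\right)^2\|J^\varepsilon(t)w^\varepsilon\|_{L^2} + \text{absorbed terms} + \text{source terms}. \tag{6.3} \]
Now notice that in Lemma 4, we could have obtained the estimate
\[ \|u^\varepsilon_{app}(t)\|_{L^\infty} \le \frac{1+\beta}{\sqrt{1-t}}\,\|f\|_{L^\infty}, \]
for any $\beta > 0$, provided that we take $C_*$ sufficiently large. Similarly, for Condition (5.1), we could have taken
\[ \|w^\varepsilon(t)\|_{L^\infty} \le \frac{\beta}{\sqrt{1-t}}\,\|f\|_{L^\infty}. \]
Obviously, the smaller $\beta$ is, the smaller $\varepsilon_*$ is, and the larger $C^*$ in Theorem 2. We see that for any $\beta > 0$, we can take $C_0 = 2(1+2\beta)^2|\lambda|\|f\|^2_{L^\infty}$, which proves that we can take $\delta = |2\lambda|^{-1/2}$, and completes the proof of Theorem 2.

6.2. Proof of Corollary 1. Existence. Recall the scaling (1.7). For every $\varepsilon > 0$, $\psi^\varepsilon$ is the unique solution in $C(\mathbb{R}, \Sigma)$ of the initial value problem
\[
\begin{cases}
i\partial_t\psi^\varepsilon + \dfrac{1}{2}\partial_x^2\psi^\varepsilon = \lambda|\psi^\varepsilon|^2\psi^\varepsilon, \\[1ex]
\psi^\varepsilon_{|t=-1/\varepsilon} = \sqrt{\varepsilon}\,f(\varepsilon x)\,e^{-i\varepsilon\frac{x^2}{2} - i\lambda|f(\varepsilon x)|^2\log\varepsilon}.
\end{cases} \tag{6.4}
\]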
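For $t < 0$, define $\psi_{app}$ by
\[ \psi_{app}(t,x) := \int e^{-i\frac{t}{2}\xi^2 + ix\xi + i\lambda|f(-\xi)|^2\log|t|}\,a_0(\xi)\,đ\xi. \]
From Theorem 2, if $\|f\|_{L^\infty} < |\lambda|^{-1/2}$, then there exist $\varepsilon_*$ and $C^*$ such that for $0 < \varepsilon \le \varepsilon_*$ and $0 \le s \le 1$,
\[ \|\psi^\varepsilon(t) - \psi_{app}(t)\|_{H^s} \le C\,\frac{(\log|t|)^{2+s}}{|t|}, \tag{6.5} \]
uniformly for $-1/\varepsilon \le t \le -C^*$. Moreover, since the operator $J^\varepsilon$ is nothing but the classical Galilean operator $J(t)$ up to the scaling (1.7), we also have
\[ \|J(t)\psi^\varepsilon - J(t)\psi_{app}\|_{L^2} \le C\,\frac{(\log|t|)^3}{|t|}, \tag{6.6} \]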
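uniformly for $-1/\varepsilon \le t \le -C^*$.

Proposition 4. Assume $\|f\|_{L^\infty} < |\lambda|^{-1/2}$. Then there exists $C^* > 0$ such that $(\psi^\varepsilon(-C^*))_{0<\varepsilon\le 1}$ is a Cauchy sequence in $\Sigma$.

Proof. The proof is organized as follows. First, we prove that for $C^*$ sufficiently large, $(\psi^\varepsilon(t))_{0<\varepsilon\le 1}$ is a Cauchy sequence in $L^2$ for any $t \le -C^*$. Then we deduce that $(\partial_x\psi^\varepsilon(-C^*))_{0<\varepsilon\le 1}$ and $(J(-C^*)\psi^\varepsilon)_{0<\varepsilon\le 1}$ are also Cauchy sequences in $L^2$.

First step. For $C^*$ sufficiently large, $(\psi^\varepsilon(t))_{0<\varepsilon\le 1}$ is a Cauchy sequence in $L^2$ for any $t \le -C^*$. Let $0 < \varepsilon_1 < \varepsilon_2 \le \varepsilon_*$. From the previous paragraph, the "sharp" energy estimate in $L^2$ for the difference $\phi := \psi^{\varepsilon_2} - \psi^{\varepsilon_1}$ reads
\[ \partial_t\|\phi(t)\|_{L^2} \le |\lambda|\left(2\|\phi(t)\|_{L^\infty} + \|\psi^{\varepsilon_1}(t)\|_{L^\infty}\right)\|\psi^{\varepsilon_1}(t)\|_{L^\infty}\|\phi(t)\|_{L^2}. \tag{6.7} \]
Let $C^*$ be the constant given by Theorem 2. As before, we might increase the value of $C^*$ in the following computations, but only a finite number of times. From the modified Sobolev inequality and Inequalities (6.5), (6.6), we have, for $-1/\varepsilon \le t \le -C^*$,
\[ \|\psi^\varepsilon(t) - \psi_{app}(t)\|_{L^\infty} \le C\,\frac{(\log|t|)^{5/2}}{|t|^{3/2}}, \]
hence
\[ \|\phi(t)\|_{L^\infty} \le C\,\frac{(\log|t|)^{5/2}}{|t|^{3/2}}, \]
provided $-1/\varepsilon_2 \le t \le -C^*$, and
\[ \|\psi^\varepsilon(t)\|_{L^\infty} \le \frac{\|f\|_{L^\infty} + \alpha}{|t|^{1/2}} + C\,\frac{(\log|t|)^{5/2}}{|t|^{3/2}}, \tag{6.8} \]
where $\alpha > 0$ is chosen sufficiently small so that $\|f\|_{L^\infty} + \alpha < |\lambda|^{-1/2}$. Then we can write (6.7) as
\[ \partial_t\|\phi(t)\|_{L^2} \le \left(\frac{C}{|t|} + h(t)\right)\|\phi(t)\|_{L^2}, \tag{6.9} \]
with $C < 1$ and $h \in L^1$. Now the Gronwall lemma gives, for $-1/\varepsilon_2 \le t_0 \le t \le -C^*$,
\[ \|\phi(t)\|_{L^2} \le \|\phi(t_0)\|_{L^2}\left(\frac{t_0}{t}\right)^{C}e^{\int_{t_0}^{t}h(s)\,ds} \le C\left(\frac{t_0}{t}\right)^{C}\|\phi(t_0)\|_{L^2}. \tag{6.10} \]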
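The Gronwall step leading to (6.10) can be made explicit. The following is a sketch, writing $y(t) := \|\phi(t)\|_{L^2}$ (shorthand introduced here) and using that $t_0 \le t \le -C^* < 0$, so that $|s| = -s$ on the interval of integration.

```latex
% Sketch: y(t) := \|\phi(t)\|_{L^2} (our shorthand), t_0 \le t \le -C^* < 0,
% and y satisfies y'(t) \le (C/|t| + h(t)) y(t) as in (6.9).
\begin{align*}
y(t) &\le y(t_0)\exp\left(\int_{t_0}^{t}\left(\frac{C}{|s|} + h(s)\right)ds\right), \\
\int_{t_0}^{t}\frac{ds}{|s|} &= -\int_{t_0}^{t}\frac{ds}{s}
  = \log|t_0| - \log|t| = \log\frac{t_0}{t}, \\
y(t) &\le y(t_0)\left(\frac{t_0}{t}\right)^{C}
  \exp\left(\int_{t_0}^{t}h(s)\,ds\right)
  \le e^{\|h\|_{L^1}}\left(\frac{t_0}{t}\right)^{C} y(t_0).
\end{align*}
```

For fixed $t$, the factor $(t_0/t)^{C}$ grows like $|t_0|^{C}$ as $t_0 \to -\infty$, so the condition $C < 1$ is what allows a decay of order $1/|t_0|$ (up to logarithms) in $y(t_0)$ to win.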
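But
\[ \|\phi(t)\|_{L^2} \le \|\psi^{\varepsilon_2}(t) - \psi_{app}(t)\|_{L^2} + \|\psi^{\varepsilon_1}(t) - \psi_{app}(t)\|_{L^2} \le C\,\frac{(\log|t|)^2}{|t|}, \]
hence (6.10) becomes
\[ \|\phi(t)\|_{L^2} \le C\left(\frac{t_0}{t}\right)^{C}\frac{(\log|t_0|)^2}{|t_0|}, \tag{6.11} \]
with $C < 1$. Take $t_0 = -1/\varepsilon_2$. We see that when $\varepsilon_2$ goes to zero, the right-hand side of (6.10) goes to zero for any fixed $t \le -C^*$, which completes the first step.

Second step. Recall the energy estimates for $\partial_x\phi$. They can be written
\[
\begin{aligned}
\partial_t\|\partial_x\phi(t)\|_{L^2} \le{}& |\lambda|\,\|\phi(t) + \psi^{\varepsilon_1}(t)\|^2_{L^\infty}\|\partial_x\phi(t)\|_{L^2} \\
&+ \|\phi(t)\|_{L^\infty}\left(\|\phi(t)\|_{L^\infty} + \|\psi^{\varepsilon_1}(t)\|_{L^\infty} + \|\psi^{\varepsilon_2}(t)\|_{L^\infty}\right)\|\partial_x\psi^{\varepsilon_1}(t)\|_{L^2}.
\end{aligned}
\]
The first line of the right-hand side is almost the same as for the $L^2$ case, and is handled in the same way. For the second line, we use the Sobolev inequality
\[ \|\phi(t)\|_{L^\infty} \le \frac{C}{|t|^{1/2}}\,\|\phi(t)\|^{1/2}_{L^2}\,\|J(t)\phi\|^{1/2}_{L^2}, \]
along with estimates (6.6) and (6.8), to obtain
\[ \partial_t\|\partial_x\phi(t)\|_{L^2} \le \left(\frac{C}{|t|} + h(t)\right)\|\partial_x\phi(t)\|_{L^2} + C\,\frac{(\log|t|)^3}{|t|^2}\,\|\phi(t)\|^{1/2}_{L^2}. \tag{6.12} \]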
Applying the Gronwall lemma as in the first step, we find
\[
\begin{aligned}
\|\partial_x\phi(t)\|_{L^2} &\le C\left(\frac{t_0}{t}\right)^{C}\|\partial_x\phi(t_0)\|_{L^2} + C\int_{t_0}^{t}\frac{(\log|s|)^3}{|s|^2}\left(\frac{s}{t}\right)^{C}\|\phi(s)\|^{1/2}_{L^2}\,ds \\
&\le C\,\frac{(\log|t_0|)^4}{|t_0|}\left(\frac{t_0}{t}\right)^{C} + C\int_{t_0}^{t}\frac{(\log|s|)^3}{|s|^2}\left(\frac{s}{t}\right)^{C}\|\phi(s)\|^{1/2}_{L^2}\,ds.
\end{aligned} \tag{6.13}
\]
As in the first step, we take $t_0 = -1/\varepsilon_2$. Here we take $t = -C^*$. Since $C < 1$, the first term of the right-hand side goes to zero with $\varepsilon_2$, and the map
\[ s \mapsto \frac{(\log|s|)^3}{|s|^2}\left(\frac{s}{t}\right)^{C} \]
is integrable. From the first step and the dominated convergence theorem, the second term of the right-hand side also goes to zero with $\varepsilon_2$, which proves that $(\partial_x\psi^\varepsilon(-C^*))_{0<\varepsilon\le\varepsilon_*}$ is a Cauchy sequence in $L^2$, provided that $C^*$ is large enough. The proof that $(J(-C^*)\psi^\varepsilon)_{0<\varepsilon\le\varepsilon_*}$ is a Cauchy sequence in $L^2$ is similar, for the only change would be a larger power for the logarithms, which does not affect the computations.

Remark 7. Evidently, we cannot apply this method when $\|f\|_{L^\infty} \ge |\lambda|^{-1/2}$, for in Theorem 2, we prove the validity of nonlinear geometric optics before a boundary layer around the caustic, which is big compared to $\varepsilon$. Thus, with the same scaling as above, we would obtain estimates on time intervals where both bounds go to $-\infty$ when $\varepsilon$ goes to zero.

From the preceding proposition, there exist $C^* > 0$ and $\varphi \in \Sigma$ such that
\[ \psi^\varepsilon(-C^*) \underset{\varepsilon\to 0}{\longrightarrow} \varphi \quad \text{in } \Sigma. \]
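Define the function $\psi$ as the solution of the initial value problem
\[
\begin{cases}
i\partial_t\psi + \dfrac{1}{2}\partial_x^2\psi = \lambda|\psi|^2\psi, \\[1ex]
\psi_{|t=-C^*} = \varphi.
\end{cases} \tag{6.14}
\]
From the local well-posedness of (1.2) in $\Sigma$ (see for instance [5,6]), for any $T > 0$,
\[ \sup_{-T\le t\le T}\|\psi^\varepsilon(t) - \psi(t)\|_{\Sigma} \underset{\varepsilon\to 0}{\longrightarrow} 0. \]
Let $t \le -C^*$, $\varepsilon \le -1/t$ and $0 \le s \le 1$. Write
\[ \|\psi(t) - \psi_{app}(t)\|_{H^s} \le \|\psi(t) - \psi^\varepsilon(t)\|_{H^s} + \|\psi^\varepsilon(t) - \psi_{app}(t)\|_{H^s}. \]
For $\varepsilon$ sufficiently small, the first term of the right-hand side is smaller than
\[ \frac{(\log|t|)^{2+s}}{|t|}. \]
But for any $0 < \varepsilon \le \min(\varepsilon_*, -1/t)$, the second term is smaller than
\[ C\,\frac{(\log|t|)^{2+s}}{|t|}, \]
hence
\[ \|\psi(t) - \psi_{app}(t)\|_{H^s} \le C\,\frac{(\log|t|)^{2+s}}{|t|}. \]
Of course, the same computation gives
\[ \|J(t)\psi - J(t)\psi_{app}\|_{L^2} \le C\,\frac{(\log|t|)^3}{|t|}. \]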
Thus, we have proved the existence part of Corollary 1, up to the following lemma.

Lemma 8. Let $f \in \mathcal{H}$, and $\psi_-$, $S_-$ defined by (1.8) and (1.5). Then there exists $C > 0$ such that for $t \le -2$,
\[ \left\|\psi_{app}(t) - e^{iS_-(t)}U_0(t)\psi_-\right\|_{H^1} \le C\,\frac{(\log|t|)^2}{|t|}, \tag{6.15} \]
and
\[ \left\|J(t)\psi_{app} - J(t)e^{iS_-(t)}U_0(t)\psi_-\right\|_{L^2} \le C\,\frac{(\log|t|)^3}{|t|}. \tag{6.16} \]
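Proof. We prove the estimate for the $L^2$-norm; the other estimates are similar. With the scaling (1.7), we have
\[
\left\|\psi_{app}\left(\frac{t-1}{\varepsilon},x\right) - e^{iS_-\left(\frac{t-1}{\varepsilon}\right)}\left(U_0\left(\frac{t-1}{\varepsilon}\right)\psi_-\right)(x)\right\|_{L^2_x}
= \left\|u^\varepsilon_{app}(t,x) - \frac{1}{\sqrt{\varepsilon}}\,e^{iS_-\left(\frac{t-1}{\varepsilon}\right)}\left(U_0\left(\frac{t-1}{\varepsilon}\right)\psi_-\right)\left(\frac{x}{\varepsilon}\right)\right\|_{L^2_x}.
\]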
With this approach, we can see that both terms have the same asymptotics given by the stationary phase formula, and the estimates of Lemma 8 are given by Lemma 1. Remark 8. Equations (6.4) provide an algorithm to construct ψ as the image of ψ− under the modified wave operator. Uniqueness follows from Proposition 1, which we will now prove.
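6.3. Proof of Proposition 1. The method to prove uniqueness follows the proof of Proposition 4, in its first step. From Lemma 8, the assumption of Proposition 1 can be replaced by: there exists $1/2 < \alpha < 1$, with $\alpha > 2|\lambda|\|f\|^2_{L^\infty}$, such that, as $t \to -\infty$,
\[ \|\psi(t) - \psi_{app}(t)\|_{L^2\cap L^\infty} = O\left(\frac{1}{|t|^\alpha}\right). \tag{6.17} \]
Suppose $\psi_1$ and $\psi_2$ are two such functions, and denote $\phi := \psi_2 - \psi_1$. We already saw that the energy estimate in $L^2$ can be written
\[ \partial_t\|\phi(t)\|_{L^2} \le 2|\lambda|\,\|\psi_1(t)\|_{L^\infty}\left(\|\psi_1(t)\|_{L^\infty} + \|\phi(t)\|_{L^\infty}\right)\|\phi(t)\|_{L^2}. \tag{6.18} \]
Take $\beta > 0$ such that $|\lambda|\left(\|f\|_{L^\infty} + \beta\right)^2 < \alpha < 1$. From Lemma 4, there exists $C^* > 0$ such that for any $t \le -C^*$,
\[ \|\psi_{app}(t)\|_{L^\infty} \le \frac{\|f\|_{L^\infty} + \beta}{|t|^{1/2}}. \]
Then for $t \le -C^*$ and from (6.17), (6.18) becomes
\[ \partial_t\|\phi(t)\|_{L^2} \le \left(\frac{C}{|t|} + \frac{C}{|t|^{\alpha+1/2}}\right)\|\phi(t)\|_{L^2}, \]
with $C := |\lambda|\left(\|f\|_{L^\infty} + \beta\right)^2 < 1$. For $t_0 \le t \le -C^*$, the Gronwall lemma gives
\[ \|\phi(t)\|_{L^2} \le C\left(\frac{t_0}{t}\right)^{C}\|\phi(t_0)\|_{L^2}. \]
Using the assumptions again, we have
\[ \|\phi(t)\|_{L^2} \le C\,\frac{1}{|t_0|^\alpha}\left(\frac{t_0}{t}\right)^{C}. \]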
Given our choice for $\beta$, $\alpha > C$. Fix $t = -C^*$. The right-hand side goes to zero when $t_0$ goes to $-\infty$. Hence $\phi(-C^*) = 0$, and $\phi \equiv 0$ from the uniqueness for (1.2) in $C(\mathbb{R}_t, L^2\cap L^\infty)$ (see [7]). This proves Proposition 1 and completes the proof of Corollary 1.

Remark 9. For Proposition 1, we need the assumption $f \in H^2(\mathbb{R})$ because it is the minimum regularity we assumed for Lemma 4.

7. Construction of the Modified Scattering Operator and Application

7.1. Proof of Corollary 2. We first recall the main result in [9] for the nonlinear Schrödinger equation, which corresponds to the notion of asymptotic completeness of the modified wave operators introduced in [11].

Theorem 3 ([9], Theorem 1.2, case $n = 1$). Let $\varphi \in \Sigma$, with $\|\varphi\|_{\Sigma} = \delta' \le \delta$, where $\delta$ is sufficiently small. Let $\psi \in C(\mathbb{R}_t, \Sigma)$ be the solution of the initial value problem (6.14), with $C^* = 0$. Then there exist unique functions $W \in L^2\cap L^\infty$ and $\phi \in L^\infty$ such that for $t \ge 1$,
\[ \left\|\mathcal{F}\left(U_0(-t)\psi\right)(t)\,\exp\left(-i\frac{\lambda}{2\pi}\int_1^t|\hat\psi(\tau)|^2\frac{d\tau}{\tau}\right) - W\right\|_{L^2\cap L^\infty} \le C\delta'\,t^{-\alpha+C(\delta')^2}, \tag{7.1} \]
\[ \left\|\frac{\lambda}{2\pi}\int_1^t|\hat\psi(\tau)|^2\frac{d\tau}{\tau} - \lambda|W|^2\log t - \phi\right\|_{L^\infty} \le C\delta'\,t^{-\alpha+C(\delta')^2}, \tag{7.2} \]
where $C\delta' < \alpha < 1/4$, and $\phi$ is a real valued function. Furthermore we have the asymptotic formula for large time $t$,
\[ \psi(t,x) = \frac{1}{(it)^{1/2}}\,W\left(\frac{x}{t}\right)\exp\left(i\frac{x^2}{2t} + i\lambda\left|W\left(\frac{x}{t}\right)\right|^2\log t + i\phi\left(\frac{x}{t}\right)\right) + O\left(\delta'\,t^{-1/2-\alpha+C(\delta')^2}\right) \tag{7.3} \]
and the estimate
\[ \left\|\mathcal{F}\left(U_0(-t)\psi\right)(t) - W\exp\left(i\lambda|W|^2\log t + i\phi\right)\right\|_{L^2\cap L^\infty} \le C\delta'\,t^{-\alpha+C(\delta')^2}. \tag{7.4} \]
Remark 10. Uniqueness follows from (7.1) and (7.2), which make it possible to define $W$ and $\phi$. The asymptotics (7.3) and (7.4) are immediate consequences of (7.1) and (7.2). In particular, if we replace $(W, \phi)$ with $(We^{i\tilde\phi}, \phi - \tilde\phi)$, where $\tilde\phi \in L^\infty$, then (7.3) and (7.4) still hold.

Remark 11. This theorem states "almost" asymptotic completeness for small data for the modified wave operators introduced in [11]. Indeed, no regularity for the momenta of $\psi$ is proved in [11]. In Corollary 1, we limit the loss of regularity, and in particular obtain for $\psi$ that required in Theorem 3.

Proof of Corollary 2. Let $\psi_- \in \mathcal{F}(\mathcal{H})$, with $\|\widehat{\psi_-}\|_{L^\infty} < (\pi/|\lambda|)^{1/2}$. From Corollary 1, there exists a unique $\psi \in C(\mathbb{R}, \Sigma)$ solution of (1.2) satisfying (1.15), (1.16). The first step is then to check that for $\psi_-$ sufficiently small, $\|\psi(0)\|_{\Sigma} < \delta$, so that we can use the results of Theorem 3. The second step consists in defining $\psi_+$. From Duhamel's formula, one has
\[ \psi(0) = U_0(C^*)\psi(-C^*) - i\lambda\int_{-C^*}^{0}U_0(-s)\,|\psi|^2\psi(s)\,ds. \]
On the other hand, we saw that for $C^* \gg 1$,
\[
\begin{aligned}
\|U_0(C^*)\psi(-C^*)\|_{\Sigma} &\le \|U_0(C^*)\psi_{app}(-C^*)\|_{\Sigma} + \|U_0(C^*)(\psi - \psi_{app})(-C^*)\|_{\Sigma} \\
&\le C\|\psi_-\|\log C^* + C\,\frac{(\log C^*)^4}{C^*}.
\end{aligned}
\]
From local estimates for (1.2), we see that there exist functions $h_j$, $j = 1, 2, 3$, with $h_1(x) \to 0$ as $x \to +\infty$, $h_2$ increasing, and $h_3(x) \to 0$ as $x \to 0$, such that
\[ \|\psi(0)\|_{\Sigma} \le h_1(C^*) + h_2(C^*)\,h_3(\|\psi_-\|). \tag{7.5} \]
Taking first $C^*$ sufficiently large, then $\psi_-$ sufficiently small, we see that we can have $\|\psi(0)\|_{\Sigma} < \delta$. Then Theorem 3 provides (unique) functions $W$ and $\phi$. Define $\psi_+$ by
\[ \psi_+ := \mathcal{F}^{-1}\left(We^{i\phi}\right) \in L^2(\mathbb{R}). \tag{7.6} \]
From (7.3) and (7.4), we have, in $L^2$,
\[ \psi(t) \underset{t\to+\infty}{\sim} e^{i\lambda\left|\widehat{\psi_+}\left(\frac{x}{t}\right)\right|^2\log t}\,U_0(t)\psi_+, \]
which, along with Corollary 1, yields Corollary 2.
7.2. Consequences for nonlinear geometric optics. In Sect. 2, we mentioned the fact that to describe the asymptotics of $u^\varepsilon$ after the caustic, one needs a modified scattering operator. Now that we have one, we can describe $u^\varepsilon$ globally. We first give a heuristic approach, then prove Corollary 3. We already noticed that the phase $g^\varepsilon$ (hence the symbol $a^\varepsilon$) is defined only for $t < 1$. If we want a global description, we have to replace $g^\varepsilon$ with a phase $\phi^\varepsilon$ which is defined for all $t$, and coincides asymptotically with $g^\varepsilon$ for $t < 1$. To guess which possible $\phi^\varepsilon$ we can choose, recall Scaling (1.7). The function $\psi^\varepsilon$ solves (1.2), and we saw that, for $t \in\, ]-\infty, T]$, where $T$ is finite,
\[ \psi^\varepsilon(t) \underset{\varepsilon\to 0}{\longrightarrow} \psi(t) \quad \text{in } L^2\cap L^\infty. \]
Hence we have
\[ i\partial_t\psi^\varepsilon + \frac{1}{2}\partial_x^2\psi^\varepsilon = \lambda|\psi|^2\psi^\varepsilon + \text{small}. \tag{7.7} \]
Forget the "small" term. We now have to study a linear Schrödinger equation, with a time-dependent potential $\lambda|\psi|^2$. According to the vocabulary used in [3], this is not a short range potential, for it does not belong to $L^1_t(L^\infty_x)$. A scattering theory for long range potentials is available (see for instance [3]). The first idea is due to Dollard and consists in studying
\[ \psi^\varepsilon(t,x)\exp\left(-i\lambda\int_0^t|\psi|^2(s,s\xi)\,ds\right) \]
in order to get rid of the long range part. In our context, this means that we can replace $g^\varepsilon$ with
\[ \phi^\varepsilon(t,\xi) := -\lambda\int_{-1/\varepsilon}^{\frac{t-1}{\varepsilon}}|\psi|^2(s,s\xi)\,ds + \lambda|f(-\xi)|^2\log\frac{1}{\varepsilon}. \tag{7.8} \]
The symbol $a^\varepsilon$ is now defined (globally in time) by
\[ u^\varepsilon(t,x) = \frac{1}{\sqrt{\varepsilon}}\int e^{-i\frac{t-1}{2\varepsilon}\xi^2 + i\frac{x\cdot\xi}{\varepsilon} + i\phi^\varepsilon(t,\xi)}\,a^\varepsilon(t,\xi)\,đ\xi. \tag{7.9} \]
Now from Corollary 1, one has, for $t < 1$,
\[ |\psi(s,s\xi)| = \frac{1}{|2\pi s|^{1/2}}\,|\widehat{\psi_-}(\xi)| + o(1) \quad \text{in } L^1_t(L^\infty_x), \]
hence, in $L^\infty_{t,loc}(0,1;L^\infty_x)$,
\[ \phi^\varepsilon(t,\xi) = g^\varepsilon(t,\xi) + o(1). \tag{7.10} \]
Therefore, even with this new definition of $a^\varepsilon$, we have, for $t < 1$,
\[ a^\varepsilon(t,\xi) \underset{\varepsilon\to 0}{\longrightarrow} a_0(\xi) \quad \text{in } L^2. \]
Similarly, for $t > 1$ and from Theorem 3, there exists a function $H$ (that depends on $\psi$) such that in $L^\infty_{t,loc}(1,2;L^\infty_x)$,
\[ \phi^\varepsilon(t,\xi) = -\lambda|W(\xi)|^2\log\frac{t-1}{\varepsilon} + H(\xi) + o(1). \tag{7.11} \]
In particular, since
\[ a^\varepsilon(t,\xi) = e^{-i\phi^\varepsilon(t,\xi)}\,\mathcal{F}\left(U_0\left(\frac{1-t}{\varepsilon}\right)\psi^\varepsilon\left(\frac{t-1}{\varepsilon}\right)\right) \]
and the map $\varphi \mapsto (W,\phi)$ in Theorem 3 is continuous,
\[ a^\varepsilon(t,\xi) \underset{\varepsilon\to 0}{\longrightarrow} e^{-iH(\xi)+i\phi(\xi)}\,W(\xi) = e^{-iH(\xi)}\,\widehat{\psi_+} \quad \text{in } L^2. \tag{7.12} \]
Apparently, the limit of $a^\varepsilon$ depends on this function $H$. One must bear in mind that this function $H$ is closely related to our choice in the definition of the new phase $\phi^\varepsilon$. For instance, one can check that replacing $\phi^\varepsilon$ with
\[ \phi^\varepsilon(t,\xi) + h_1(\xi)\int_{-1/\varepsilon}^{\frac{t-1}{\varepsilon}}h_2(s)\,ds, \]
where $h_1 \in L^\infty$, $h_2 \in L^1$, would just alter the definition of $H$. Thus this function appears as a parameter in the definition of $a^\varepsilon$. Nevertheless, the asymptotics for $u^\varepsilon$ is independent of $H$. It is given, in $L^2$, by (7.12), (7.11) and the first part of Lemma 1. This leads to the asymptotics given in Corollary 3 for $t > 1$. The asymptotics for $t < 1$ is a simple consequence of Theorem 2 and (7.10). This completes the proof of Corollary 3.

Acknowledgements. I would like to thank Professor A. Bressan for his invitation to SISSA, where this work was achieved. This research was supported by the European TMR ERBFMRXCT960033.
References

1. Barab, J. E.: Nonexistence of asymptotically free solutions for nonlinear Schrödinger equation. J. Math. Phys. 25, 3270–3273 (1984)
2. Carles, R.: Geometric optics with caustic crossing for some nonlinear Schrödinger equations. Indiana Univ. Math. J. 49, 475–551 (2000)
3. Dereziński, J., and Gérard, C.: Scattering theory of quantum and classical N-particle systems. Texts and Monographs in Physics, Berlin–Heidelberg: Springer Verlag, 1997
4. Duistermaat, J. J.: Oscillatory integrals, Lagrangian immersions and unfolding of singularities. Comm. Pure Appl. Math. 27, 207–281 (1974)
5. Ginibre, J.: Introduction aux équations de Schrödinger non linéaires. Cours de DEA, Paris Onze Édition (1995)
6. Ginibre, J.: An introduction to nonlinear Schrödinger equations. In: Nonlinear waves (Sapporo, 1995). Gakkōtosho, R. Agemi and Y. Giga and T. Ozawa (eds.), GAKUTO International Series, Math. Sciences and Appl., 1997, pp. 85–133
7. Ginibre, J., and Velo, G.: On a class of nonlinear Schrödinger equations. III. Special theories in dimensions 1, 2 and 3. Annales de l'Institut Henri Poincaré. Section A. Physique Théorique. Nouvelle Série 28, 287–316 (1978)
8. Ginibre, J., and Velo, G.: Long Range Scattering and Modified Wave Operators for some Hartree Type Equations III. Gevrey spaces and low dimensions. J. Diff. Eq., to appear
9. Hayashi, N., and Naumkin, P.: Asymptotics for large time of solutions to the nonlinear Schrödinger and Hartree equations. Am. J. Math. 120, 369–389 (1998)
10. Hunter, J., and Keller, J.: Caustics of nonlinear waves. Wave Motion 9, 429–443 (1987)
11. Ozawa, T.: Long range scattering for nonlinear Schrödinger equations in one space dimension. Commun. Math. Phys. 139, 479–493 (1991)
12. Strauss, W.: Nonlinear scattering theory. In: Scattering theory in mathematical physics, J. Lavita and J. P. Marchands (eds.), Dordrecht: Reidel, 1974
13. Strauss, W.: Nonlinear scattering theory at low energy. J. Funct. Anal. 41, 110–133 (1981)

Communicated by A. Kupiainen
Commun. Math. Phys. 220, 69 – 94 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Regularized Products and Determinants Georg Illies IHES, Le Bois-Marie, 35, Route de Chartres, 91440 Bures-sur-Yvette, France. E-mail:
[email protected] Received: 4 April 2000 / Accepted: 15 January 2001
Abstract: Zeta-regularized products are used to define determinants of operators in infinite dimensional spaces. This article provides a general theory of regularized products and determinants which delivers a better approach to their existence and explicit determination.

1. Introduction

The zeta-regularized product of a sequence $a_k \in \mathbb{C}^*$ is defined by
\[ \prod_{k=1}^{\infty}a_k := \exp\left(-\frac{d}{ds}\sum_{k=1}^{\infty}a_k^{-s}\Big|_{s=0}\right), \tag{1.1} \]
provided that the Dirichlet series converges absolutely in a half plane and can be meromorphically continued to the left of $\Re(s) = 0$; the evaluation at $s = 0$ means the constant term of the Laurent expansion. This obviously generalizes the ordinary finite product. Zeta-regularization was first used to define analytic torsion [RS] and since then has played a role in global analysis, the theory of dynamical zeta functions, and Arakelov theory. Theoretical physicists use zeta-regularization as a method for renormalization in quantum field theories [EORBZ] and various papers (e.g. [Ef, Ko1, Ko2, Ko3, Sa]) have calculated the regularized determinant of Laplacians. Zeta-regularization also appeared in a conjectural cohomological approach to motivic L-functions ([De1, De2, De3, Ma]). In that context the question appeared as to which meromorphic functions of finite order (e.g. motivic L-functions) are zeta-regularized, i.e. can be represented as
\[ f(z) = \prod(z-\rho)^{\pm 1}, \tag{1.2} \]
where the product is over all zeroes and poles $\rho$ of $f(z)$ with multiplicities and the sign of the exponent being positive for zeroes and negative for poles. This turns out to be the basic problem of zeta regularized determinants and it was the starting point of the following investigation which, we hope, gives a satisfying answer to the question.

Regularization entails several technical problems because of the meromorphic continuation of the Dirichlet series. For example, the regularized product of all primes does not exist as $\sum_p p^{-s}$ has the natural boundary $\Re(s) = 0$ ([LW]). The aim of this paper is to give a better approach to regularized products improving the formalism in [Vo, CV, QHS] and [JL1] (compare Sect. 6 below) which is based on the representation of the Dirichlet series as the Mellin transform of the series
\[ \theta(t) := \sum_{k=1}^{\infty}e^{a_kt}. \tag{1.3} \]

Present address: Algebra und Zahlentheorie, Fachbereich Mathematik, Universität – Gesamthochschule Siegen, Walter-Flex-Str. 3, 57068 Siegen, Germany. E-mail: [email protected]
In many applications $\arg(a_k)$ varies in such a way that this series does not converge for any $t$, for example in the product (1.2). This problem can be solved by instead using a kind of Hankel integral of the Dirichlet series (see Sect. 7). To treat the product (1.2) one has by definition to regard the function $\zeta(s,z) := \sum\pm(z-\rho)^{-s}$. This paper is also thought of as an examination of the analytic and asymptotic properties of this generalized Hurwitz zeta function, which should be interesting for its own sake.

Before giving a short overview we introduce the notion of a divisor, which is basic for all that follows. A divisor $D$ is given by a function $m_D : \mathbb{C} \to \mathbb{Z}$ such that there is a $\beta > 0$ with
\[ \sum_{\rho\in\mathbb{C}}\frac{|m_D(\rho)|}{|\rho|^{\beta}} < \infty. \tag{1.4} \]
Condition (1.4) reflects that the Dirichlet series in Definition (1.1) must converge absolutely in a half plane. We recall a fundamental fact from the theory of entire functions of finite order (compare [Ti] for the proof): a function $m_D : \mathbb{C} \to \mathbb{Z}$ gives rise to a divisor if and only if there is a meromorphic function $f(z)$ of finite order (i.e. $f(z)$ is the quotient of two entire functions of finite order) such that
\[ m_D(\rho) = \operatorname*{ord}_{z=\rho}f(z), \quad \rho \in \mathbb{C}, \]
thus $D$ is the divisor of $f(z)$ in the usual sense. And this function $f(z)$ is determined by $D$ up to an exponential polynomial, i.e. a function $g(z)$ is meromorphic of finite order with divisor $D$ if and only if there is a polynomial $P(z)$ with $g(z) = e^{P(z)}f(z)$.

After introducing some notation (Sect. 2) we define a general class of regularized products in Sect. 3, zeta-regularization being just an example; the right-hand side of (1.2) with the multiplicities $m_D(\rho)$, in the case of its existence, is called the regularized determinant and denoted by
\[ \Delta_D(z) := \prod(z-\rho)^{\pm 1}. \tag{1.5} \]
We prove that $\Delta_D(z)$ is a meromorphic function of finite order with divisor $D$, thus equals $e^{P(z)}f(z)$ for a certain polynomial $P(z)$. Regularization means finding this polynomial. Section 4, a sort of theoretical excursion, discusses axiomatic generalizations of the regularization process and shows that a theory of regularization should deal with quasi-directed divisors (defined in Sect. 2).
If the Dirichlet series in the definition of regularized products also satisfies certain exponential estimates and does not have too many poles, we speak of bounded regularizability (Sect. 5). In that case one can apply certain integral transformations, especially the Mellin transform, the mentioned Hankel integral and the Laplace transform, to get Theorems 3, 4 and 6 of Sects. 6, 7 and 9. They give the equivalence of bounded regularizability with certain asymptotics for $\theta_D(t)$, $\zeta_D(s,z)$ and for the function $\theta_D(t,s)$, which is defined as the Laplace transform of $\zeta_D(s,z)$. As a corollary of Theorem 4 one gets Theorem 5 in Sect. 7, which is the fundamental theorem of the theory of regularization. It states that $D$ is bounded regularizable if and only if for some $0 < \psi_i < \pi$, $i = 1, 2$, and $\varepsilon > 0$ an asymptotic
\[ \log f(z) = \sum_i z^{\alpha_i}\log^{n_i}z + o(|z|^{-\varepsilon}), \quad |z| \to \infty, \tag{1.6} \]
with finite sum, $\alpha_i \in \mathbb{C}$ and $n_i \in \mathbb{N}_0$, is valid for $-\psi_2 < \arg(z) < \psi_1$. $\Delta_D(z)$ exists in that case, and the polynomial $P(z)$ can also be determined in terms of this asymptotic of $\log f(z)$, which is very intrinsic.

These results deliver a satisfying theory of regularization and apply to a large class of examples. In Sect. 8.1 for instance it is shown that every meromorphic function of finite order representable by a Dirichlet series is regularized; this improves results of Jorgenson and Lang ([JL1, JL2]), who had to assume that it also satisfies a functional equation. In 8.2 we regularize higher $\Gamma$-functions. Thus Sect. 8 is applicable to various kinds of zeta and L-functions. The function $\theta_D(t,s)$, introduced in Sect. 9, is a type of multivalued theta function and plays a central role in [Il2]. There is also an alternative approach to regularization via renormalizing certain divergent integrals (Sect. 10). The following three sections contain technical proofs which were postponed.

The article reproduces the main results of Chapter 2 of my thesis [Il1] in a more special context. In some cases we only give sketches of the proofs; for complete proofs, generalizations and further results the reader is referred to [Il1]. In [Il2] it is shown how to apply the theory of regularization to generalize results of Cramér ([Cr]) and Guinand ([Gui]), thus improving results of [JL2].
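2. Notation

In the sequel $f(z)$ denotes a meromorphic function of finite order and $D$ its divisor. We define two important parameters: the exponent $r$ of $f$ and $D$ is the infimum of all $\beta > 0$ satisfying (1.4); the genus $g$ of $f$ and $D$ is the smallest $n \in \mathbb{N}_0$ such that (1.4) is satisfied for $\beta = n + 1$; note $g + 1 \ge r \ge g$. We will say that $D$ lies in a set $M \subset \mathbb{C}$ if $m_D(\rho) \ne 0$ implies that $\rho \in M$. Let $0 < \varphi_i < \pi$, $i = 1, 2$; then we define open connected sets
\[ Wr_{\varphi_1,\varphi_2} := \{z \in \mathbb{C}^* \mid -\varphi_2 < \arg(z) < \varphi_1\}, \qquad Wl_{\varphi_1,\varphi_2} := \mathbb{C}^*\setminus\overline{Wr_{\varphi_1,\varphi_2}}, \]
and a contour $C_{\varphi_1,\varphi_2}$ consisting of the ray from $e^{-\varphi_2i}\infty$ to $0$ and the ray from $0$ to $e^{\varphi_1i}\infty$; thus $\mathbb{C} = Wl_{\varphi_1,\varphi_2}\cup C_{\varphi_1,\varphi_2}\cup Wr_{\varphi_1,\varphi_2}$ is a disjoint union.

[Figure: the contour $C_{\varphi_1,\varphi_2}$ in the complex $z$-plane (axes $\Re(z)$, $\Im(z)$), with opening angles $\varphi_1$ and $\varphi_2$, separating the sector $Wr_{\varphi_1,\varphi_2}$ on the right from $Wl_{\varphi_1,\varphi_2}$ on the left.]

A divisor $D$ is called directed if it lies in a $Wl_{\varphi_1,\varphi_2}$. It is called quasi-directed if it is directed with the exception of finitely many $\rho$, and it is called strictly directed if it lies in a $Wl_{\varphi_1,\varphi_2}$ with $\varphi_1 > \frac{\pi}{2}$ and $\varphi_2 > \frac{\pi}{2}$. We will also write $\rho \in D$ instead of $m_D(\rho) \ne 0$ and use the following notation:
\[ \sum_{\rho\in D}\varphi(\rho) := \sum_{\rho\in\mathbb{C}}m_D(\rho)\varphi(\rho). \]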
3. Xi Functions and Regularization

Definition 3.1. If $D$ is a directed divisor, $U_D := \{z \in \mathbb{C} \mid |z| < |\rho|,\ \rho \in D\}$ and the argument is chosen so that $-\pi < \arg(z-\rho) < \pi$, then
\[ \xi_D(s,z) := \sum_{\rho\in D}\frac{\Gamma(s)}{(z-\rho)^s}, \quad \Re(s) > r, \quad z \in U_D, \tag{3.1} \]
is called the Hurwitz xi function of $D$; $\xi_D(s) := \xi_D(s,0)$ is called the xi function of $D$. Convergence is absolute and $\xi_D(s,z)$ is holomorphic in both variables.

Proposition 3.2. $\xi_D(s,z)$ satisfies the following differential equation:
\[ \frac{d}{dz}\xi_D(s,z) = -\xi_D(s+1,z). \tag{3.2} \]
A function $f(z)$ is meromorphic of finite order with divisor $D$ if and only if for some $l \ge g$:
\[ \left(\frac{d}{dz}\right)^{l+1}\log f(z) = (-1)^l\,\xi_D(l+1,z). \tag{3.3} \]
Proof. Equation (3.2) follows by taking the term by term derivative. For (3.3) check that $\log\Delta^{\mathrm{Wei},l}_{D,a}(z)$ defined in (3.6) below satisfies (3.3) and observe that the operation $\left(\frac{d}{dz}\right)^{l+1}$ exactly kills exponential polynomials of degree $\le l$.
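Both identities of Proposition 3.2 can be checked on a single term of the sum. The following sketch assumes, as in Definition 3.1, that each term of the Hurwitz xi function has the form $\Gamma(s)(z-\rho)^{-s}$, and it treats one factor $z-\rho$ of $f$ at a time.

```latex
% Term-by-term check of (3.2), using s\,\Gamma(s) = \Gamma(s+1):
\[
\frac{d}{dz}\,\frac{\Gamma(s)}{(z-\rho)^{s}}
 = -\,\frac{s\,\Gamma(s)}{(z-\rho)^{s+1}}
 = -\,\frac{\Gamma(s+1)}{(z-\rho)^{s+1}} .
\]
% Single-factor check of (3.3), using \Gamma(l+1) = l!:
\[
\left(\frac{d}{dz}\right)^{l+1}\log(z-\rho)
 = \left(\frac{d}{dz}\right)^{l}\frac{1}{z-\rho}
 = \frac{(-1)^{l}\,l!}{(z-\rho)^{l+1}}
 = (-1)^{l}\,\frac{\Gamma(l+1)}{(z-\rho)^{l+1}} .
\]
```

Summing over $\rho$ with the multiplicities $m_D(\rho)$ reproduces the right-hand sides of (3.2) and (3.3). Any polynomial of degree at most $l$ added to $\log f$ is annihilated by the $(l+1)$-st derivative, which is why (3.3) only determines $f$ up to such an exponential factor.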
Proposition 3.3. For $z \in U_D$ the following absolutely convergent Taylor series expansion is valid:
\[ \xi_D(s, z) = \sum_{m=0}^{\infty} (-1)^m\, \xi_D(s + m)\, \frac{z^m}{m!}. \tag{3.4} \]
If $\xi_D(s)$ is meromorphic for $\Re(s) > -p$ then $\xi_D(s, z)$ is also meromorphic for $\Re(s) > -p$ and holomorphic for $z \in U$ for any simply connected $U \subset \mathbb{C}$ with $U_D \subset U$ and $\rho \notin U$ for all $\rho \in D$.

Proof. The Taylor series follows from (3.2); for the meromorphy in $s$ observe that shifting the coefficients does not change the convergence radius. The continuation in $z$ is obtained by treating finitely many $\rho \in D$ separately.

Definition 3.4. A regularization sequence $\delta$ is a sequence of complex numbers $\delta_0, \delta_1, \ldots$ with $\delta_0 = 1$. Formally let $\delta(s) := \delta_0 + \delta_1 s + \delta_2 s^2 + \ldots$. A directed divisor $D$ is called regularizable if $\xi_D(s)$ is meromorphic in a half plane $\Re(s) > -\varepsilon$ with $\varepsilon > 0$. For $z \in U$
\[ \Delta_D(z) := \exp\big(-\mathrm{CT}_{s=0}(\delta(s)\, \xi_D(s, z))\big) \tag{3.5} \]
is called the $\delta$-regularized determinant of $D$. One calls $\Delta_D(0)$ the $\delta$-regularized product of $D$. Note that $\mathrm{CT}_{s=0}$ means the constant term in the Laurent expansion at $s = 0$. If $\delta(s)$ is a divergent series, then one has to develop $\xi_D(s, z)$ in a Laurent series and multiply it formally with the formal series for $\delta(s)$. In the sequel there will often appear formulas which must be interpreted in this formal sense.

Examples. 1) xi-regularized determinant (Jorgenson, Lang): $\delta(s) = 1$, 2) zeta-regularized determinant: $\delta(s) = \Gamma^{-1}(s + 1)$, 3) zero-renormalized determinant: $\delta(s) = \Gamma(1 - s)$.

Remark. The factor $\delta(s)$ is introduced for several reasons. First, one wants to handle "scaled" products $\prod a(z - \rho)$ (compare [De1, De2, De3]). It also turned out that the canonical way of renormalization (see Theorem 7 in Sect. 10) differs from zeta-regularization. A further reason is that in [JL1] xi-regularization was used, which is technically the simplest regularization. While zeta-regularization as well as zero-renormalization generalize the ordinary finite product (as every regularization with $\delta_1 = \gamma$ does, $\gamma$ the Euler-Mascheroni constant), xi-regularization does not. Zeta-regularization satisfies the product rule $\prod \rho^n = (\prod \rho)^n$ and so comes closest to what one would expect for a product.

Fix $a \in \mathbb{C}$ with $m_D(a) = 0$ and $\ell \ge g$; then we define the absolutely convergent Weierstrass product
\[ \Delta^{\mathrm{Wei},\ell}_{D,a}(z) := \prod_{\rho \in \mathbb{C}} \left[ \Big(1 - \frac{z - a}{\rho - a}\Big) \exp\Big( \sum_{k=1}^{\ell} \frac{1}{k} \Big(\frac{z - a}{\rho - a}\Big)^{k} \Big) \right]^{m_D(\rho)}, \tag{3.6} \]
which is a meromorphic function of finite order with divisor $D$. For $a = 0$ and $g = \ell$ one has the usual canonical Weierstrass product (compare [Ti]).
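The three example sequences of Sect. 3 can be compared on the simplest possible divisor. The following worked computation is an illustration added here, not part of the original argument; it assumes a one-point divisor $D = \{\rho\}$ with $m_D(\rho) = 1$ and writes $a := z - \rho$ as a shorthand.

```latex
% Assumption: one-point divisor D = {\rho}, m_D(\rho) = 1, shorthand a := z - \rho,
% so that \xi_D(s,z) = \Gamma(s) a^{-s}.  Near s = 0:
%   \Gamma(s) = 1/s - \gamma + O(s),   a^{-s} = 1 - s\log a + O(s^2).
\[
  \delta(s)\,\Gamma(s) = \frac{1}{s} + (\delta_1 - \gamma) + O(s)
  \quad\Longrightarrow\quad
  \mathrm{CT}_{s=0}\big(\delta(s)\,\Gamma(s)\,a^{-s}\big) = (\delta_1 - \gamma) - \log a ,
\]
\[
  \Delta_D(z) = \exp\big(\log a + \gamma - \delta_1\big) = e^{\gamma - \delta_1}\,(z - \rho).
\]
% xi-regularization:    \delta(s) = 1,              \delta_1 = 0       =>  \Delta_D(z) = e^{\gamma}(z-\rho)
% zeta-regularization:  \delta(s) = 1/\Gamma(s+1),  \delta_1 = \gamma  =>  \Delta_D(z) = z-\rho
% zero-renormalization: \delta(s) = \Gamma(1-s),    \delta_1 = \gamma  =>  \Delta_D(z) = z-\rho
```

In this toy case the sequences with $\delta_1 = \gamma$ reproduce the ordinary factor $z - \rho$, while xi-regularization picks up an extra factor $e^{\gamma}$.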
Theorem 1. $\Delta_D(z)$ is a meromorphic function of finite order with divisor $D$. The explicit relation to Weierstrass products is given by
\[ \Delta_D(z) = e^{P(z)}\, \Delta^{\mathrm{Wei},\ell}_{D,a}(z), \tag{3.7} \]
where with a suitable branch of the logarithm
\[ P(z) = \sum_{m=0}^{\ell} \frac{(z - a)^m}{m!}\, \log^{(m)} \Delta_D(a), \tag{3.8} \]
\[ \log^{(m)} \Delta_D(z) = (-1)^{m+1}\, \mathrm{CT}_{s=0}(\delta(s)\, \xi_D(s + m, z)), \qquad m = 0, 1, \ldots. \]
In the sequel a meromorphic function of finite order $f(z)$ with divisor $D$ will be called $\delta$-regularized if it equals $\Delta_D(z)$.

Proof of Theorem 1. We have
\begin{align}
\Big(\frac{d}{dz}\Big)^{m+1} \log \Delta_D(z) &= (-1)^m\, \mathrm{CT}_{s=0}(\delta(s)\, \xi_D(s + m + 1, z)) \tag{3.9} \\
&= (-1)^m\, \xi_D(m + 1, z) \quad \text{for } m \ge g; \tag{3.10}
\end{align}
the first equation holds because of (3.2) and is valid for all $m \in \mathbb{N}_0$. It is also true that $\xi_D(s, z)$ is holomorphic for $s = m + 1$ if $m \ge g$ by the definition of $g$. Comparison with (3.3) proves the first assertion. One also easily checks
\[ \Big(\frac{d}{dz}\Big)^{m+1} \log \Delta^{\mathrm{Wei},\ell}_{D,a}(z)\Big|_{z=a} = \begin{cases} 0 & \text{for } m = -1, 0, \ldots, \ell - 1, \\ (-1)^m\, \xi_D(m + 1, a) & \text{for } m \ge \ell. \end{cases} \]
Using this as well as (3.9) and (3.10), the explicit relation to Weierstrass products follows by subtracting the Taylor series expansion around $z = a$ for $\log \Delta^{\mathrm{Wei},\ell}_{D,a}(z)$ from that for $\log \Delta_D(z)$.

4. Determinant Systems

We will call a function $f(z)$ associated to a divisor $D$ if there is a polynomial $P(z)$ such that
\[ f(z) = e^{P(z)}\, \Delta^{\mathrm{Wei},g}_{D,a}(z) \quad \text{and} \quad \deg P \le g \]
or equivalently that (3.3) is satisfied for $l = g$ (compare the proof of (3.3)). Observe that in this definition we have set $\ell = g$ in (3.6). Note also that if $\Delta^{\mathrm{Wei},g}_{D,a}(z)$ is, in addition, entire then its order is exactly $r$ and no entire function with divisor $D$ can have a smaller order (compare [Ti]). So associated functions have minimal order because of $g \le r$. By Theorem 1 regularization means picking out a certain associated function to a divisor. Now we ask for extensions of this process to non-regularizable divisors.

For $\alpha \in \mathbb{C}$ we define the translated divisor $D|_{+\alpha}$ by $m_{D|_{+\alpha}}(z + \alpha) := m_D(z)$ and the sum $D_1 + D_2$ by $m_{D_1 + D_2}(z) := m_{D_1}(z) + m_{D_2}(z)$. Let $\mathcal{D}_{\mathrm{fin}}$ be the abelian group of all finite divisors, $\mathcal{D}_{\mathrm{breg}}$ that of all bounded regularizable quasi-directed divisors (compare Sect. 5), $\mathcal{D}_{\mathrm{reg}}$ that of all regularizable quasi-directed divisors, i.e. those which are regularizable directed after eliminating finitely many $\rho$, $\mathcal{D}_{\mathrm{qd}}$ that of all quasi-directed divisors, and $\mathcal{D}$ that of all divisors. These are all translation-invariant with proper inclusions $\mathcal{D}_{\mathrm{fin}} \subset \mathcal{D}_{\mathrm{breg}} \subset \mathcal{D}_{\mathrm{reg}} \subset \mathcal{D}_{\mathrm{qd}} \subset \mathcal{D}$. The following definition arises from the demand for generalizations of the characteristic polynomial to the infinite dimensional case.
Definition 4.1. Let $\mathcal{D}' \subseteq \mathcal{D}$ be a translation-invariant subgroup. A determinant $\Delta$ on $\mathcal{D}'$ attaches to every $D \in \mathcal{D}'$ an associated function $\Delta_D(z)$, such that:
i) $\Delta_{D|_{+\alpha}}(z + \alpha) = \Delta_D(z)$ (translation-invariance)
ii) $\Delta_{D_1 + D_2}(z) = \Delta_{D_1}(z)\, \Delta_{D_2}(z)$ (linearity)
for $D, D_1, D_2 \in \mathcal{D}'$, $\alpha \in \mathbb{C}$. $(\mathcal{D}', \Delta)$ is called a determinant system.

Examples. 1) $(\mathcal{D}_{\mathrm{fin}}, \Delta)$ with the characteristic "polynomial", which is a rational function defined by $\Delta_D(z) := \prod_{\rho \in \mathbb{C}} (z - \rho)^{m_D(\rho)}$. 2) $(\mathcal{D}_{\mathrm{reg}}, \Delta)$ with the $\delta$-regularized determinant $\Delta$. For a $D \in \mathcal{D}_{\mathrm{reg}}$ which is not directed, $\Delta_D(z)$ can be defined by translation-invariance.

Theorem 2 answers the question of how large determinant systems can be.

Theorem 2. a) There is no determinant system $(\mathcal{D}, \Delta)$. b) For every regularization sequence $\delta$ there is a determinant system $(\mathcal{D}_{\mathrm{qd}}, \Delta)$ which is an extension of the $\delta$-regularized determinant system $(\mathcal{D}_{\mathrm{reg}}, \Delta)$.

Proof. a) Let $D$ be the divisor that consists only of zeroes of order one lying at the lattice points $\rho = m + ni$, $m, n \in \mathbb{Z}$. From translation invariance one gets $\Delta_D(z + 1) = \Delta_D(z)$ and $\Delta_D(z + i) = \Delta_D(z)$. Hence $\Delta_D(z)$ must be a doubly-periodic entire non-constant function, which is impossible by Liouville's theorem. b) (Idea) One has to choose the exponential polynomials consistently with linearity and translation-invariance. This leads to a system of infinitely many linear equations with infinitely many variables which can be reduced to finite systems by Zorn's Lemma. (See Sect. 11 for a complete proof.)

Remark. Not every determinant system is extendable, so b) is an aesthetic property of regularization. The proof is non-constructive and its extensions are not uniquely determined. The meaning of regularization is that it gives large constructively defined determinant systems.

5. Bounded Regularizability, Singularities and Asymptotics

In this section we introduce the special case of bounded regularizability of divisors and give all the necessary technical definitions to formulate the results of Sects. 6, 7 and 9 which state its equivalence to various asymptotics.

Definition 5.1. Let $D$ be a directed divisor, $0 < \sigma_i < \pi$ for $i = 1, 2$ and $p \in \mathbb{R} \cup \{\infty\}$; then $D$ resp. $\xi_D(s)$ are called $(\sigma_1, \sigma_2)$-bounded $p$-regular if:
i) $\xi_D(s)$ is meromorphic for $\Re(s) > -p$.
ii) $\xi_D(s)$ has only finitely many poles in the strip $\alpha_1 < \Re(s) < \alpha_2$ for any $-p < \alpha_1 < \alpha_2$.
iii) For all $-p < \alpha_1 < \alpha_2$ and $\sigma_1' < \sigma_1$, $\sigma_2' < \sigma_2$,
\[ \xi_D(s) = \begin{cases} O\big(e^{(\frac{\pi}{2} - \sigma_2')\, \Im(s)}\big) & \text{for } \Im(s) \to \infty, \\ O\big(e^{-(\frac{\pi}{2} - \sigma_1')\, \Im(s)}\big) & \text{for } \Im(s) \to -\infty \end{cases} \]
in the strip $\alpha_1 < \Re(s) < \alpha_2$.
We simply say bounded $p$-regular if there are $0 < \sigma_i < \pi$ such that $(\sigma_1, \sigma_2)$-bounded $p$-regularity is valid. We have bounded regularizability if $p > 0$. Note that every directed divisor $D$ in $\mathrm{Wl}_{\varphi_1,\varphi_2}$ is $(\varphi_1, \varphi_2)$-bounded $(-r)$-regular, as follows from Stirling's formula.

Definition 5.2. A pB-System consists of:
1. A finite or infinite sequence of pairs $(p_n, B_n(z))_{n=0,1,2,\ldots}$ with complex numbers $p_n$ satisfying $\Re(p_0) \le \Re(p_1) \le \ldots \le \Re(p_n) \le \ldots$ and polynomials $B_n(z) \in \mathbb{C}[z]$, $B_n(z) = \sum_k b_{n,k} z^k$.
2. An abscissa $p \in \mathbb{R} \cup \{\infty\}$ such that $p > \Re(p_n)$ for all $n$, and in addition for infinite sequences: $p = \lim_{n \to \infty} \Re(p_n)$.
pB-systems capture the simultaneous information about the occurring singular part distributions and the occurring asymptotics.

Example. If the divisor $D$ is $(\sigma_1, \sigma_2)$-bounded $p$-regular, then there is a pB-system $(p_n, B_n(z))_{n=0,1,2,\ldots}$ with abscissa $p$ such that the poles of $\xi_D(s)$ in the half plane $\Re(s) > -p$ lie exactly at the values $s = -p_n$ and the Laurent expansions have the singular parts
\[ B_n(\partial_s)\Big[\frac{1}{s + p_n}\Big] = \sum_k b_{n,k}\, \frac{(-1)^k\, k!}{(s + p_n)^{k+1}}. \]
\[ \Big(-p_n,\ B_n(\partial_s)\Big[\frac{1}{s + p_n}\Big]\Big)_{n=0,1,\ldots} \]
is called the singular part distribution of $\xi_D(s)$ in that case.

Definition 5.3. Let $(p_n, B_n(z))_{n=0,1,\ldots}$ be a pB-system with abscissa $p$ as above and $0 < \sigma_i \le \varphi_i < \pi$, $i = 1, 2$. A function $\theta : \mathrm{Wr}_{\varphi_1,\varphi_2} \to \mathbb{C}$ satisfies the Cramér asymptotic with abscissa $p$ in $\mathrm{Wr}_{\sigma_1,\sigma_2}$,
\[ \theta(t) \sim \sum_{n=0}^{\infty} t^{p_n} B_n(\log t) \quad \text{for } |t| \to 0, \]
if the estimate
\[ \theta(t) - \sum_{\Re(p_n) < q} t^{p_n} B_n(\log t) = O(|t|^{q'}) \quad \text{for } |t| \to 0, \quad t \in \mathrm{Wr}_{\sigma_1',\sigma_2'}, \]
is valid for any fixed $q, q' \in \mathbb{R}$ with $q' < q < p$ and $0 < \sigma_i' < \sigma_i$, $i = 1, 2$. Analogously, for a function $g : \mathrm{Wr}_{\varphi_1,\varphi_2} \to \mathbb{C}$ one defines the meaning of the following Stirling asymptotic with abscissa $p$ in $\mathrm{Wr}_{\sigma_1,\sigma_2}$:
\[ g(z) \sim \sum_{n=0}^{\infty} z^{-p_n} B_n(\log z) \quad \text{for } |z| \to \infty. \]
In the definition we choose of course $\arg(t), \arg(z) \in\, ]-\varphi_2, \varphi_1[$.
6. Mellin Transform and Cramér Asymptotics

A typical method for showing that a Dirichlet series has a meromorphic continuation is using the Mellin transform.

Definition 6.1. For a strictly directed divisor $D$ in $\mathrm{Wl}_{\varphi_1,\varphi_2}$ ($\frac{\pi}{2} < \varphi_i < \pi$), one defines the partition function
\[ \theta_D(t) := \sum_{\rho \in D} e^{\rho t}, \qquad t \in \mathrm{Wr}_{(\varphi_2 - \frac{\pi}{2}),(\varphi_1 - \frac{\pi}{2})}. \tag{6.1} \]
The series converges absolutely and is holomorphic in $t$.

Proposition 6.2. One has the Mellin transform representation for $\Re(s) > r$,
\[ \xi_D(s) = \int_0^{\infty} \theta_D(t)\, t^{s-1}\, dt, \tag{6.2} \]
and its inverse for $t \in \mathrm{Wr}_{(\varphi_2 - \frac{\pi}{2}),(\varphi_1 - \frac{\pi}{2})}$ and $c > r$,
\[ \theta_D(t) = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \xi_D(s)\, t^{-s}\, ds, \tag{6.3} \]
with absolute convergence of the integrals.

Proof. Because of the theorem about Mellin inversion it suffices to prove (6.3) and, by majorized convergence, this is reduced to the case of a one-point-divisor. In that case (6.3) is the inverse formula for Euler's Mellin integral for $\Gamma(s)$.

This approach is only possible for strictly directed divisors as otherwise the defining series for $\theta_D(t)$ does not converge for any $t$.

Theorem 3. Let $\frac{\pi}{2} < \sigma_i \le \varphi_i < \pi$ for $i = 1, 2$ and $D$ be strictly directed in $\mathrm{Wl}_{\varphi_1,\varphi_2}$, and let $(p_n, B_n(z))_{n=0,1,\ldots}$ be a pB-system with abscissa $p$. Then the following statements are equivalent:
A) $\xi_D(s)$ is $(\sigma_1, \sigma_2)$-bounded $p$-regular with singular part distribution
\[ \Big(-p_n,\ B_n(\partial_s)\Big[\frac{1}{s + p_n}\Big]\Big)_{n=0,1,\ldots}. \]
C) $\theta_D(t)$ satisfies a Cramér asymptotic with abscissa $p$ of the form
\[ \theta_D(t) \sim \sum_{n=0}^{\infty} t^{p_n} B_n(\log t) \quad \text{for } |t| \to 0 \]
in $\mathrm{Wr}_{(\sigma_2 - \frac{\pi}{2}),(\sigma_1 - \frac{\pi}{2})}$.

Proof (sketch). C) ⇒ A) is shown by (6.2): the poles and singular parts of $\xi_D(s)$ arise by integrating the terms of the Cramér asymptotic, and the exponential estimation for $\xi_D(s)$ in vertical strips can be shown by rotating the ray of integration in (6.2) in $\mathrm{Wr}_{(\sigma_2 - \frac{\pi}{2}),(\sigma_1 - \frac{\pi}{2})}$. A) ⇒ C) follows from (6.3) by replacing the abscissa $c$ of the line of integration by a smaller $c > -p$ and applying the residue theorem. The residues of the integrand produce the terms of the Cramér asymptotic.
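To see Theorem 3 at work, one can take the divisor of simple zeros at the negative integers. This example is an illustration added here under that assumption; it uses only the classical expansion of $1/(e^t - 1)$ and the classical poles of $\Gamma(s)\zeta(s)$.

```latex
% Assumption: D = { -1, -2, -3, ... } with m_D(-k) = 1, which is strictly directed
% and has exponent r = 1.
\[
  \theta_D(t) = \sum_{k=1}^{\infty} e^{-kt} = \frac{1}{e^{t}-1}
  = \frac{1}{t} - \frac{1}{2} + \frac{t}{12} - \frac{t^{3}}{720} + \ldots
  \quad (|t| \to 0),
  \qquad
  \xi_D(s) = \sum_{k=1}^{\infty} \frac{\Gamma(s)}{k^{s}} = \Gamma(s)\,\zeta(s).
\]
% Reading off C) (here B_n are the constant polynomials of the pB-system, not
% Bernoulli numbers):  p_0 = -1, B_0 = 1;  p_1 = 0, B_1 = -1/2;  p_2 = 1, B_2 = 1/12; ...
% with abscissa p = \infty.  Statement A) then predicts simple poles of \xi_D(s)
% at s = -p_n with singular parts B_n/(s + p_n):
\[
  \Gamma(s)\zeta(s) = \frac{1}{s-1} + \ldots \ (s \to 1), \qquad
  \Gamma(s)\zeta(s) = \frac{-1/2}{s} + \ldots \ (s \to 0), \qquad
  \Gamma(s)\zeta(s) = \frac{1/12}{s+1} + \ldots \ (s \to -1),
\]
% consistent with \zeta(0) = -1/2, \zeta(-1) = -1/12 and Res_{s=-1}\Gamma(s) = -1.
% The missing t^2 term corresponds to \zeta(-2) = 0 cancelling the pole of \Gamma at s = -2.
```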
Remark 1. This theorem is well known in the context of the Mellin and Laplace transform (e.g. [Do, II, Chap. 5], where a complete proof can be found). Using it one can decide whether a strictly directed divisor is bounded regularizable or not, by checking for the existence of a suitable Cramér asymptotic for the partition function. For example the regularized product of the eigenvalues of Laplacians on manifolds exists because of Cramér asymptotics arising from heat kernel expansions (comp. [Ef, Ko1, Ko2, Ko3, Sa]).

Remark 2. In the case of strictly directed divisors the implication A) ⇒ B') of Theorem 4 and, in particular, the asymptotic (7.8) can also be obtained by the Mellin integral
\[ \xi_D(s, z) = \int_0^{\infty} \theta_D(t)\, e^{-zt}\, t^{s-1}\, dt \]
using
\[ \int_0^{\infty} t^{p_n} B_n(\log t)\, e^{-zt}\, t^{s-1}\, dt = B_n(\partial_s)\Big[\frac{\Gamma(s + p_n)}{z^{s + p_n}}\Big]. \]
The Mellin transform approach to regularized products and determinants (the details can be found in Sect. 2.4 in [Il1]) was also extensively studied by Jorgenson and Lang ([JL1]).
7. Hankel Integrals and Stirling Asymptotics

The Mellin integral method has two shortcomings: it is possible only for strictly directed divisors, and the partition function is a non-intrinsic construction, whereas one wants criteria in terms of associated functions. In this section we solve these problems, postponing the technical proofs until Sect. 12. For powers $a^s$ and $\log(a)$ we always use $-\pi < \arg(a) < \pi$.

Proposition 7.1. Let $D$ be a directed divisor in $\mathrm{Wl}_{\varphi_1,\varphi_2}$ ($0 < \varphi_i < \pi$).
a)
\[ \xi_D(s, z) = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \frac{\Gamma(s - s')}{z^{s - s'}}\, \xi_D(s')\, ds' \tag{7.1} \]
for $z \in \mathrm{Wr}_{\varphi_1,\varphi_2}$ and $\Re(s) > c > r$ with absolute convergence of the integral.
b) Let $0 < \sigma_i < \varphi_i$ for $i = 1, 2$, $C = C_{\sigma_1,\sigma_2}$ and $z \in \mathrm{Wr}_{\sigma_1,\sigma_2}$. Then for $\Re(s) > r$ and $\Re(s_0) > r$ one has the absolutely convergent integral representation
\[ \xi_D(s, z) = \frac{1}{2\pi i} \int_C \frac{\Gamma(s - s_0 + 1)}{(z - w)^{s - s_0 + 1}}\, \xi_D(s_0, w)\, dw. \tag{7.2} \]

In the sequel the representations (7.1) and (7.2) play a similar role as (6.2) and (6.3) in Sect. 6. For the explicit description of the Stirling asymptotics we need the following definition.
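Definition 7.2. Let $\delta$ be a fixed regularization sequence. Then for any $q \in \mathbb{C}$ we define the linear map
\[ [q] : \mathbb{C}[z] \to \mathbb{C}[z], \qquad B(z) \mapsto B^{[q]}(z), \]
by
\[ \mathrm{CT}_{s=0}\Big(\delta(s)\, B(\partial_s)\Big[\frac{\Gamma(s + q)}{z^{s + q}}\Big]\Big) = z^{-q}\, B^{[q]}(\log z). \tag{7.3} \]
For $P_k(z) := z^k$ we get
\[ P_k^{[q]}(z) = \sum_{j=0}^{k} (-1)^j \binom{k}{j}\, \Gamma^{(k-j)}(q)\, z^j \tag{7.4} \]
in the case that $q \neq -n$ for all $n \in \mathbb{N}_0$, while for $q = -n$,
\[ P_k^{[q]}(z) = \sum_{j=0}^{k} (-1)^j \binom{k}{j}\, \mathrm{CT}_{s=q}\big(\Gamma^{(k-j)}(s)\big)\, z^j + \frac{(-1)^n}{n!} \Big( \frac{(-1)^{k+1}}{k+1}\, z^{k+1} + (-1)^k\, k!\, \delta_{k+1} \Big). \tag{7.5} \]

Special case $B(z) = b_0$. Easy calculations using the fact that $\Gamma(z)$ is holomorphic for $-z \notin \mathbb{N}_0$ as well as the expansion $\Gamma(s) = \frac{1}{s} - \gamma + \ldots$ ($\gamma$ the Euler–Mascheroni constant) and $\Gamma(s - n) = \Gamma(s)\, ((s - 1)(s - 2) \cdots (s - n))^{-1}$ deliver
\[ B^{[q]}(z) = \begin{cases} b_0\, \Gamma(q) & \text{for } q \neq -n, \\ b_0\, \dfrac{(-1)^{n+1}}{n!} \Big( z + \gamma - \delta_1 - \sum_{j=1}^{n} \dfrac{1}{j} \Big) & \text{for } q = -n \end{cases} \tag{7.6} \]
(for zeta-regularization as well as for zero-renormalization one has $\delta_1 = \gamma$). The following basic properties of $[q]$ are clear by (7.4), (7.5) and the definition.

Proposition 7.3. a) $[q]$ is bijective for $q \neq 0, -1, -2, \ldots$ with $\deg B^{[q]} = \deg B$.
b) $[q]$ is injective for $q = 0, -1, -2, \ldots$ with $\deg B^{[q]} = \deg B + 1$ and with $\dim \mathrm{Coker}([q]) = 1$.
c)
\[ \frac{d}{dz} \Big( z^{-q}\, B^{[q]}(\log z) \Big) = -z^{-(q+1)}\, B^{[q+1]}(\log z). \tag{7.7} \]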
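For constant polynomials, (7.6) and (7.7) can be checked directly from (7.3). The computation below is a sketch added for illustration; it treats only $B(z) = b_0$ in the two cases $q = 0$ and $q \notin -\mathbb{N}_0$, and writes $L := \log z$ as a shorthand.

```latex
% Assumptions: B(z) = b_0 constant, L := \log z, \delta(s) = 1 + \delta_1 s + O(s^2).
% Case q = 0 (n = 0 in (7.6)): expand each factor at s = 0,
%   \Gamma(s) = 1/s - \gamma + O(s),   z^{-s} = 1 - sL + O(s^2).
\[
  \mathrm{CT}_{s=0}\big(\delta(s)\,b_0\,\Gamma(s)\,z^{-s}\big)
  = b_0\,(\delta_1 - \gamma - L)
  \quad\Longrightarrow\quad
  B^{[0]}(z) = -\,b_0\,(z + \gamma - \delta_1),
\]
% which is (7.6) for n = 0 (the sum over j is empty).
% Case q \notin -\mathbb{N}_0: the bracket in (7.3) is holomorphic at s = 0 and \delta_0 = 1, so
\[
  B^{[q]}(z) = b_0\,\Gamma(q).
\]
% Check of (7.7) at q = 0, using B^{[1]} = b_0\,\Gamma(1) = b_0:
\[
  \frac{d}{dz}\Big( z^{0}\,B^{[0]}(\log z) \Big)
  = \frac{d}{dz}\Big( -b_0(\log z + \gamma - \delta_1) \Big)
  = -\frac{b_0}{z}
  = -\,z^{-1}\,B^{[1]}(\log z).
\]
```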
Remark 3. In particular, every Stirling asymptotic with abscissa $p$ can be represented as a linear combination of terms of the form $z^{-q} B^{[q]}$ up to a polynomial $P(z)$ with¹ $P(z)\, z^p \to 0$ for $|z| \to 0$, and this polynomial is uniquely determined. This shows that the asymptotics in B') of Theorem 4 and B) of Theorem 5 are general Stirling asymptotics which are written in a special manner. This also means that the Stirling asymptotic (7.8) is an effective method to determine the regularized determinant among all associated functions for $D$.

¹ Observe: Terms $z^{-q}$ with $q \ge p$ make no sense in Stirling asymptotics with abscissa $p$.
Remark 4. Part c) of the proposition together with (3.2) shows that the Stirling asymptotics in B') and B) can be differentiated term by term.

Theorem 4. Let $0 < \sigma_i \le \varphi_i < \pi$ for $i = 1, 2$ and $D$ be a directed divisor in $\mathrm{Wl}_{\varphi_1,\varphi_2}$, let $p' \in \mathbb{R} \cup \{\infty\}$ and $\xi_D(s)$ be meromorphic for $\Re(s) > -p'$ (compare Prop. 3.3). Then for any regularization sequence $\delta$, $s_0$ with $\Re(s_0) > -p'$ and a pB-system $(p_n, B_n(z))_{n=0,1,\ldots}$ with abscissa $p \in \mathbb{R} \cup \{\infty\}$ the following statements are equivalent:
A) $\xi_D(s)$ is $(\sigma_1, \sigma_2)$-bounded $p$-regular with the singular part distribution
\[ \Big(-p_n,\ B_n(\partial_s)\Big[\frac{1}{s + p_n}\Big]\Big)_{n=0,1,\ldots}. \]
B') There is a polynomial $P_{s_0}(z)$ with $P_{s_0}(z)\, z^{p + s_0} \to 0$ for $|z| \to 0$ and such that the Stirling asymptotic with abscissa $p + \Re(s_0)$
\[ \mathrm{CT}_{s=0}(\delta(s)\, \xi_D(s + s_0, z)) \sim P_{s_0}(z) + \sum_{n=0}^{\infty} z^{-(p_n + s_0)}\, B_n^{[p_n + s_0]}(\log z) \quad \text{for } |z| \to \infty \]
is valid in $\mathrm{Wr}_{\sigma_1,\sigma_2}$.
The polynomial in B') is then uniquely determined: $P_{s_0}(z) = 0$.

The idea of the proof given in Sect. 12 is rather similar to the proof of Theorem 3. To get the Stirling asymptotic B') from the singular part distribution A) one uses (7.1), shifts the line of integration and applies the residue theorem. The other direction is a little bit more difficult, but the basic idea is of course to use (7.2) and integrate the Stirling asymptotic term by term. Some technical difficulties arise because (7.2) is not valid for $z = 0$. Using Eqs. (3.2), (3.3) and (3.5) one obtains the following theorem as an easy corollary of Theorem 4.

Theorem 5. Let $0 < \sigma_i \le \varphi_i < \pi$ for $i = 1, 2$ and $D$ be a directed divisor in $\mathrm{Wl}_{\varphi_1,\varphi_2}$; let $f(z)$ be a meromorphic function of finite order with divisor $D$. Then for a regularization sequence $\delta$, $m \in \mathbb{N}_0$ and a pB-system $(p_n, B_n(z))_{n=0,1,\ldots}$ with abscissa $p \in \mathbb{R} \cup \{\infty\}$ the following statements are equivalent:
A) $\xi_D(s)$ is $(\sigma_1, \sigma_2)$-bounded $p$-regular with singular part distribution
\[ \Big(-p_n,\ B_n(\partial_s)\Big[\frac{1}{s + p_n}\Big]\Big)_{n=0,1,\ldots}. \]
B) There is a polynomial $P_f(z)$ with $P_f(z)\, z^p \to 0$ for $|z| \to 0$, such that the Stirling asymptotic with abscissa $(p + m)$
\[ \log^{(m)} f(z) \sim P_f^{(m)}(z) + (-1)^{m-1} \sum_{n=0}^{\infty} z^{-(p_n + m)}\, B_n^{[p_n + m]}(\log z) \quad \text{for } |z| \to \infty \]
is valid in $\mathrm{Wr}_{\sigma_1,\sigma_2}$.
$P_f(z)$ in B) can then be chosen independent of $m$; it is (up to the choice of the logarithm) uniquely determined. If, in addition, $p > 0$, so that $D$ is bounded regularizable, then for the $\delta$-regularized determinant one has $P_{\Delta_D} = 0$, i.e. the following Stirling asymptotic with abscissa $p$ in $\mathrm{Wr}_{\sigma_1,\sigma_2}$ is valid:
\[ \log \Delta_D(z) \sim -\sum_{n=0}^{\infty} z^{-p_n}\, B_n^{[p_n]}(\log z) \quad \text{for } |z| \to \infty. \tag{7.8} \]

Theorem 5 can be regarded as the fundamental theorem about bounded regularizability: by Remark 3 it states that whenever a $\log^{(m)} f(z)$ satisfies any Stirling asymptotic with abscissa greater than zero, then $f(z)$ and its divisor $D$ are bounded regularizable, and (7.8) allows one to determine its regularized determinant, i.e. the polynomial $P(z)$ mentioned in the introduction. The triple equivalence A) ⇔ B) (⇔ C)) given by Theorems 3 and 5, where the latter equivalence is valid only for strictly directed divisors, will be generalized in Sect. 9 (Theorems 4 and 6) to an equivalence A) ⇔ B') ⇔ C') valid for all directed divisors, which summarizes all information about singular part distributions and asymptotics of $\xi_D(s, z)$ and $\tilde\theta_D(t, s)$.

8. Examples

8.1. Dirichlet series.

Corollary 8.1. If $f(z)$ is meromorphic of finite order and has an absolutely convergent Dirichlet series representation
\[ f(z) = 1 + \sum_{n=0}^{\infty} \frac{\beta_n}{\alpha_n^{z}}, \qquad \Re(z) > \sigma_0, \]
with $\beta_n \in \mathbb{C}$ and $\alpha_n \in \mathbb{R}_{>1}$ with $\lim_{n \to \infty} \alpha_n = \infty$, then $f(z)$ is $\delta$-regularized for every regularization sequence $\delta$, i.e. $f(z) = \Delta_D(z)$; in particular, $f(z)$ is associated to its divisor (compare Sect. 4). $\xi_D(s, z)$ is holomorphic for $s \in \mathbb{C}$.

Proof. By the Taylor series expansion for $\log(1 + x)$ it is clear that the trivial Stirling asymptotic $\log f(z) \sim 0$ as $|z| \to \infty$ with abscissa $+\infty$ is valid in $\mathrm{Wr}_{(\frac{\pi}{2} - \varepsilon),(\frac{\pi}{2} - \varepsilon)}$, so by Theorem 5 and Proposition 3.3 the assertion is clear.

Remark 5. Using only the Mellin integral method one needs to assume that $f(z)$ also satisfies a functional equation and examines
\[ \theta_{D^+}(t) := \sum_{\rho \in D,\ \Im(\rho) > 0} e^{\rho t}, \qquad \theta_{D^-}(t) := \sum_{\rho \in D,\ \Im(\rho) \le 0} e^{\rho t} \]
separately. For $f(z) = \zeta(z)$, the Riemann zeta function, a classical result of Cramér ([Cr]) delivers the Cramér asymptotics for $\theta_{D^+}(t)$ and $\theta_{D^-}(t)$ (with logarithmic terms, in contrast to the examples from the spectra of Laplacians mentioned in Sect. 6) and thus regularizability of $\zeta(z)$ ([So, ScSo]). In [JL2] Cramér's result was generalized to a class of $f(z)$ as in the above corollary which in addition satisfy a functional equation, and their result implies regularizability
of all these functions. Corollary 8.1 shows regularizability for a much larger class and moreover one no longer needs Cramér's result. The methods of this section of [Il2] also apply to the "polynomial Bessel fundamental class" introduced in [JL3]. Nevertheless a functional equation is necessary if one wants to get information about $\theta_{D^+}(t)$ (compare [Il2]).

Remark 6. Theorem 5 gives satisfying criteria for deciding whether a function is bounded regularizable or not. For example, consider the function
\[ f(z) = \frac{1}{\sqrt{2\pi}}\, (z^2 + 1)(1 + e^{-z})\, \Gamma(z) + e^{-z^2}\, \Gamma'(z). \]
It can be immediately seen that it is zeta-regularized: $(\sqrt{2\pi})^{-1}\, \Gamma(z)$ is zeta-regularized because of (8.5) below, and this is true for $(z^2 + 1)$ because it is a characteristic polynomial, and this holds for $(1 + e^{-z})$ because it is a Dirichlet series; the second summand is small (in an angular domain) compared to the first and does not change the Stirling asymptotic of $\log f(z)$.

8.2. $\Gamma$-functions.

In §2.8 of [Vi] the functions $\Gamma_n(z)$ were defined which appear in the functional equations of Selberg zeta functions and which are special cases of the general higher $\Gamma$-functions introduced by Barnes ([Bar]). They are simple examples for regularization with non-trivial Stirling asymptotics, and their zeta-regularization can already be found in [Va, Ku] and [Ma, §3.3]. We give the following definition, which is equivalent to that of Vigneras.

Definition 8.2. The sequence $(\Gamma_n(z))_{n=0,1,\ldots}$ of $\Gamma$-functions of order $n$ is defined by the following conditions:
0) $\Gamma_0(z) = \frac{1}{z}$.
1) $\Gamma_n^{-1}(z)$, $n \in \mathbb{N}$, is an entire function of finite order and the divisor $D_n$ of $\Gamma_n(z)$ consists exactly of the $\rho = -k$, $k \in \mathbb{N}_0$, with multiplicity $-\binom{n+k-1}{n-1}$.
2) $\Gamma_n(1) = 1$ for all $n \in \mathbb{N}_0$.
3) For all $n \in \mathbb{N}_0$ the following functional equation is valid:
\[ \Gamma_{n+1}(z + 1) = \frac{\Gamma_{n+1}(z)}{\Gamma_n(z)}. \]

Using higher Bernoulli polynomials ([No]) one has for $n \in \mathbb{N}_0$,
\[ \theta_{D_n}(t) = -(-\theta_{D_1}(t))^n = -\frac{1}{(1 - e^{-t})^n} = -\sum_{\nu=0}^{\infty} \frac{(-1)^\nu B_\nu^n(0)}{\nu!}\, t^{\nu - n} \quad \text{for } 0 < |t| < 2\pi, \tag{8.1} \]
thus by Theorem 3 the $D_n$ are bounded regularizable. Applying (7.6) and (7.8) one gets the Stirling asymptotic with abscissa $+\infty$ for the $\delta$-regularized determinant
\[ \log \Delta_{D_n}(z) \sim (-1)^{n+1} \sum_{k=0}^{n} \frac{B_{n-k}^n(0)}{(n-k)!\, k!} \Big( \log z + \gamma - \delta_1 - \sum_{j=1}^{k} \frac{1}{j} \Big) z^k + \sum_{k=1}^{\infty} (-1)^{n+k}\, \frac{(k-1)!\, B_{n+k}^n(0)}{(n+k)!}\, z^{-k} \quad \text{for } |z| \to \infty \tag{8.2} \]
in $\mathrm{Wr}_{(\pi - \varepsilon),(\pi - \varepsilon)}$.

Proposition 8.3. The functions $\Gamma_n(z)$ are well defined; one has $\Gamma_n(z) = e^{-P_n(z)}\, \Delta_{D_n}(z)$ with polynomials $P_n(z)$ of degree $\le n$ which are determined (e.g. using Lagrange interpolation) by the relations
\[ P_n(j) = \sum_{i=0}^{j-1} (-1)^i \binom{j-1}{i} \log \Delta_{D_{n-i}}(1), \qquad j = 1, \ldots, n + 1. \tag{8.3} \]
The values $\log \Delta_{D_n}(1) := -\mathrm{CT}_{s=0}(\delta(s)\, \xi_{D_n}(s, 1))$ can be expressed in terms of the Riemann zeta function:
\[ \log \Delta_{D_n}(1) = \sum_{l=0}^{n-1} \tau_{n,l}\, \zeta'(-l) + (\delta_1 - \gamma) \sum_{l=0}^{n-1} \tau_{n,l}\, \zeta(-l) \tag{8.4} \]
for $n \ge 1$ and $\log \Delta_{D_0}(1) = \delta_1 - \gamma$, with the Euler-Mascheroni constant $\gamma$ and $\tau_{n,l}$ from the development
\[ \binom{n + x - 1}{n - 1} = \sum_{l=0}^{n-1} \tau_{n,l}\, (x + 1)^l. \]

Proof. One shows that there exists exactly one choice of polynomials $P_n(z)$ with
\[ P_{n+1}(z + 1) + P_n(z) - P_{n+1}(z) = 0, \qquad P_n(1) = \log \Delta_{D_n}(1) \]
for $n \in \mathbb{N}_0$, with $\deg P_0 = 0$. (Because of $\deg P_0 = 0$ and the first equation one gets $\deg P_n \le n$; by induction it is easy to prove that the $P_n(j)$ are given as in the proposition, and in the other direction, that the uniquely determined $P_n(z)$ with $\deg P_n \le n$ and with these $P_n(1)$ satisfy the two equations.) The expression for $\log \Delta_{D_n}(1)$ follows from $\delta(s)\Gamma(s) = \frac{1}{s} + (\delta_1 - \gamma) + \ldots$ and
\[ \xi_{D_n}(s, 1) = -\Gamma(s) \sum_{k=0}^{\infty} \binom{n + k - 1}{n - 1} (k + 1)^{-s} = -\Gamma(s) \sum_{l=0}^{n-1} \tau_{n,l}\, \zeta(s - l). \]

$\Gamma_1(z)$ is the usual $\Gamma$-function, $\Gamma_2^{-1}(z) = G(z)$ is known as Barnes' G-function. For these two functions we will give the result more explicitly. It is well known that $\zeta(0) = -\frac{1}{2}$, $\zeta'(0) = -\log \sqrt{2\pi}$ and $\zeta(-1) = -\frac{1}{12}$. With the Kinkelin-Glaisher constant $A$ one can express $\zeta'(-1) = \frac{1}{12} - \log A$ (compare [Vo, pp. 461–464], [Al, p. 357]), but we use just $\zeta'(-1)$.

Corollary 8.4. For $n = 1, 2$ one has
\[ \Delta_{D_1}(z) = \frac{\Gamma_1(z)}{\sqrt{2\pi}}\, e^{-(\delta_1 - \gamma)(z - \frac{1}{2})}, \tag{8.5} \]
\[ \Delta_{D_2}(z) = \frac{\Gamma_2(z)}{\sqrt{2\pi}}\, e^{\zeta'(-1) + z \log \sqrt{2\pi} + \frac{\delta_1 - \gamma}{2}\left((z - 1)^2 - \frac{1}{6}\right)}. \tag{8.6} \]
By combining (8.5) and (8.2) one gets the usual Stirling formula for $\Gamma(z)$.
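The last remark can be made explicit. The following computation is a sketch added for illustration; it specializes (8.2) to $n = 1$ and to zeta-regularization ($\delta_1 = \gamma$), and it assumes the normalization $B^1_\nu(0) = B_\nu$ (ordinary Bernoulli numbers with $B_1 = -\frac{1}{2}$) for the higher Bernoulli polynomials of [No].

```latex
% Assumptions: n = 1, \delta_1 = \gamma, and B^1_\nu(0) = B_\nu with B_1 = -1/2.
% The two terms k = 0, 1 of the first sum in (8.2) are
%   k = 0:  B_1\,\log z = -\tfrac12 \log z,      k = 1:  B_0\,(\log z - 1)\,z = z\log z - z,
% and (k-1)!/(k+1)! = 1/(k(k+1)) in the second sum, so
\[
  \log \Delta_{D_1}(z) \sim z\log z - z - \tfrac12 \log z
  + \sum_{k=1}^{\infty} \frac{(-1)^{k+1} B_{k+1}}{k(k+1)}\, z^{-k}
  \qquad (|z| \to \infty).
\]
% By (8.5) with \delta_1 = \gamma:  \log\Gamma(z) = \log\Delta_{D_1}(z) + \tfrac12\log(2\pi), hence
\[
  \log \Gamma(z) \sim \big(z - \tfrac12\big)\log z - z + \tfrac12 \log(2\pi)
  + \frac{1}{12\,z} - \frac{1}{360\,z^{3}} + \ldots ,
\]
% using B_2 = 1/6, B_3 = 0, B_4 = -1/30.
```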
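9. The Function $\tilde\theta_D(t, s)$

We now define and examine the function $\tilde\theta_D(t, s)$ for a directed divisor. This function is a Laplace transform of $\xi_D(s, z)$ for the variable $z$ which turns out to be a "mixture" of $\theta_D(t)$ and $\xi_D(s)$ and is an essential tool in [Il2]. We give without proof a sort of generalization of Theorem 3 to directed divisors in terms of this function.

In the sequel $\mathrm{Wl}_{\varphi_1,\varphi_2}$ is regarded as the subset of all those $t \in \widetilde{\mathbb{C}}^*$ with $\varphi_1 < \arg(t) < -\varphi_2 + 2\pi$ (and we use these arguments for $\log t$). Then $\mathrm{Wl}_{\varphi_1,\varphi_2}$ is also defined for $\varphi_i \le 0$, which is needed in what follows.

Definition 9.1. For $\rho \in \mathbb{C} \setminus \mathbb{R}_{\ge 0}$, $s \in \mathbb{C}$ with $-s \notin \mathbb{N}_0$ and $t \in \widetilde{\mathbb{C}}^*$ we define
\[ \widetilde{\exp}(\rho, t, s) := \frac{e^{-\pi i (s - 1)}}{2\pi i}\, \Gamma(s)\, \Gamma(1 - s, \rho t) \cdot e^{\rho t}. \tag{9.1} \]
For a directed divisor $D$ in $\mathrm{Wl}_{\varphi_1,\varphi_2}$ (with $0 < \varphi_i < \pi$) we define
\[ \tilde\theta_D(t, s) := \sum_{\rho \in D} \widetilde{\exp}(\rho, t, s) \quad \text{for } t \in \mathrm{Wl}_{(\frac{\pi}{2} - \varphi_1),(\frac{\pi}{2} - \varphi_2)},\ \Re(s) > r. \tag{9.2} \]
In the definition of $\widetilde{\exp}(\rho, t, s)$, a type of multivalued exponential function, the incomplete Gamma function (obviously holomorphic in $\alpha$ and $z$)
\[ \Gamma(\alpha, z) := \int_z^{\infty} e^{-\tau}\, \tau^{\alpha - 1}\, d\tau, \qquad \alpha \in \mathbb{C},\ z \in \widetilde{\mathbb{C}}^* \]
is used. Properties of $\Gamma(\alpha, z)$ are well known (e.g. [EMOT, II, Chap. 9]). We state the needed properties of $\widetilde{\exp}(\rho, t, s)$ in Lemma 13.1 and give a selfcontained proof. In particular, by the lemma one can see that the defining sum for $\tilde\theta_D(t, s)$ converges absolutely and is holomorphic in the given domains.

Proposition 9.2. With $D$ as in the above definition, $t \in \mathrm{Wl}_{(\frac{\pi}{2} - \varphi_1),(\frac{\pi}{2} - \varphi_2)}$ and $\Re(s) > r$ one has
\[ \tilde\theta_D(t, s) = \frac{t^{-(s-1)}}{2\pi i} \int_0^{e^{i\alpha}\infty} e^{wt}\, \xi_D(s, w)\, dw \tag{9.3} \]
for every $\alpha \in\, ]-\varphi_2, \varphi_1[$ satisfying $\Re(e^{i\alpha} t) < 0$, and the integral converges absolutely. $\tilde\theta_D(t, s)$ satisfies the following functional equations:
\[ \tilde\theta_D(t, s + 1) - \tilde\theta_D(t, s) = \frac{t^{-s}}{2\pi i} \cdot \xi_D(s) \tag{9.4} \]
and, if $D$ is strictly directed in $\mathrm{Wl}_{\varphi_1,\varphi_2}$ with $\frac{\pi}{2} < \varphi_i < \pi$,
\[ \tilde\theta_D(t, s) - e^{2\pi i (s - 1)}\, \tilde\theta_D(\exp(2\pi i)\, t, s) = \theta_D(t), \qquad t \in \mathrm{Wr}_{(\varphi_2 - \frac{\pi}{2}),(\varphi_1 - \frac{\pi}{2})}, \tag{9.5} \]
where the overlap in $\mathrm{Wl}_{(\frac{\pi}{2} - \varphi_1),(\frac{\pi}{2} - \varphi_2)}$, the domain of definition for $\tilde\theta_D(t, s)$, is identified with $\mathrm{Wr}_{(\varphi_2 - \frac{\pi}{2}),(\varphi_1 - \frac{\pi}{2})}$, which is the domain of definition for the partition function $\theta_D(t)$ (compare Definition 6.1).

Proof. By majorized convergence using Lemma 12.1 the Laplace integral representation is obtained from (13.4). The functional equations follow from the corresponding ones for $\widetilde{\exp}(\rho, t, s)$ given in Lemma 13.1.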
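The functional equation (9.4) can also be made plausible directly from the Laplace integral (9.3). The following is a heuristic sketch added here, not the proof via Lemma 13.1 referred to above; it writes $\tilde\theta_D(t, s)$ for the two-variable function of Proposition 9.2 and assumes that the boundary term at infinity vanishes along the ray of integration.

```latex
% Assumptions: \Re(s) > r, the ray of integration is as in (9.3), and
% e^{wt}\,\xi_D(s,w) \to 0 as w \to e^{i\alpha}\infty (plausible since \Re(e^{i\alpha}t) < 0).
% Use (3.2) in the form \xi_D(s+1,w) = -\partial_w \xi_D(s,w) and integrate by parts:
\[
  \int_0^{e^{i\alpha}\infty} e^{wt}\,\xi_D(s+1,w)\,dw
  = -\Big[ e^{wt}\,\xi_D(s,w) \Big]_{w=0}^{w=e^{i\alpha}\infty}
    + t \int_0^{e^{i\alpha}\infty} e^{wt}\,\xi_D(s,w)\,dw
  = \xi_D(s) + t \int_0^{e^{i\alpha}\infty} e^{wt}\,\xi_D(s,w)\,dw .
\]
% Multiplying by t^{-s}/(2\pi i) and using (9.3) for s+1 and for s:
\[
  \tilde\theta_D(t,s+1) = \frac{t^{-s}}{2\pi i}\,\xi_D(s) + \tilde\theta_D(t,s),
\]
% which is (9.4).
```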
Remark 7. The proposition shows that $\tilde\theta_D(t, s)$ behaves like $\xi_D(s)$ in the variable $s$ and like $\theta_D(t)$ in the variable $t$. In particular, $\tilde\theta_D(t, s)$ is meromorphic for $\Re(s) > -p$ if and only if this is true for $\xi_D(s)$.

For $q \in \mathbb{C}$, a regularization sequence $\delta$ and $B(z) \in \mathbb{C}[z]$ we define the polynomial $B^{[[q]]}(z) \in \mathbb{C}[z]$ by
\[ -\frac{1}{2\pi i}\, \mathrm{CT}_{s=0}\Big(\delta(s)\, B(\partial_s)\Big[\frac{\pi\, (-z)^{s+q}}{\sin(\pi (s + q))}\Big]\Big) = B^{[[q]]}(\log z)\, z^q, \]
with $\arg(-z) := \arg(z) - \pi$.

Theorem 6. Let $0 < \sigma_i \le \varphi_i < \pi$ for $i = 1, 2$ and $D$ be a directed divisor in $\mathrm{Wl}_{\varphi_1,\varphi_2}$ such that $\xi_D(s)$ is meromorphic for $\Re(s) > -p'$ (compare Remark 1). For $s_0 \in \mathbb{C}$ with $\Re(s_0) > -p'$, a regularization sequence $\delta$ and a pB-system $(p_n, B_n(z))_{n=0,1,\ldots}$ with abscissa $p < \infty$, the following statements are equivalent:
A) $\xi_D(s)$ is $(\sigma_1, \sigma_2)$-bounded $p$-regular with singular part distribution
\[ \Big(-p_n,\ B_n(\partial_s)\Big[\frac{1}{s + p_n}\Big]\Big)_{n=0,1,\ldots}. \]
C') There exists a polynomial $\tilde P_{s_0}(t, t^{-1})$ with $\tilde P_{s_0}(t, t^{-1})\, t^{-(p + s_0 - 1)} \to 0$ for $|t| \to \infty$ and such that the Cramér asymptotic with abscissa $p + \Re(s_0) - 1$,
\[ \mathrm{CT}_{s=0}\big(\delta(s)\, t^{s + s_0 - 1}\, \tilde\theta_D(t, s + s_0)\big) \sim \tilde P_{s_0}(t, t^{-1}) + \sum_{n=0}^{\infty} t^{p_n + s_0 - 1}\, B_n^{[[p_n + s_0 - 1]]}(\log t) \tag{9.6} \]
for $|t| \to 0$ is valid in $\mathrm{Wl}_{(\frac{\pi}{2} - \sigma_1),(\frac{\pi}{2} - \sigma_2)}$.
The polynomial in C') is then uniquely determined:
\[ \tilde P_{s_0}(t, t^{-1}) = \frac{1}{2\pi i} \sum_{k=0}^{n-1} \mathrm{CT}_{s=0}(\delta(s)\, \xi_D(s + s_0 - k - 1))\, t^k \]
with $n$ such that $n - 1 < p + \Re(s_0) - 1 \le n$.

As this theorem has no direct application to regularization we omit the analogue of Proposition 7.3 for $[[q]]$, which shows that the Cramér asymptotic in C') is a general one written in a special manner, and we give only the idea of the proof of Theorem 6.

Proof. (idea) By Theorem 4 it suffices to prove B') ⇔ C'). This equivalence can be shown using (9.3) and its inversion by a Hankel integral ((2.6.11) in [Il2]), integrating the asymptotics term by term. For details see Sect. 2.6.1 in [Il2].

Remark 8. $B^{[[q]]}(\log t)\, t^q - B^{[[q]]}(\log(\exp(2\pi i)\, t))\, (\exp(2\pi i)\, t)^q = B(\log t)\, t^q$ and (9.5) lead one to rediscover the implication A) ⇒ C) in Theorem 3, but now one has with Theorem 4 the general equivalence A) ⇔ B') ⇔ C') for all directed divisors already mentioned in Sect. 7. Because of (9.3), C') is an explicit determination of the Cramér asymptotic of the Laplace transform of $\xi_D(s, z)$, which is the basic meaning of Theorem 6.
10. Renormalized Determinants

The following is a generalization of ideas from §5 of [Vo]. Because of (3.2) and (3.3) every meromorphic function $\Delta_D(z)$ of finite order with divisor $D$ has representations of the form
\[ \log \Delta_D(z) = \int_{a_{\ell+1}}^{z} \int_{a_\ell}^{\lambda_\ell} \cdots \int_{a_1}^{\lambda_1} (-1)^\ell\, \xi_D(\ell + 1, \lambda_0)\, d\lambda_0\, d\lambda_1 \cdots d\lambda_\ell \tag{10.1} \]
for certain $\ell \ge g$ and $a_i \in \mathbb{C}$, e.g. $\Delta^{\mathrm{Wei},\ell}_{D,a}(z)$ defined by Eq. (3.6) for $a_i = a$. Easy considerations show that one must have $|a_i| = \infty$ in order to get a determinant (compare Sect. 4) by this. But then (10.1) is divergent, so one has to renormalize the divergent integral. If $D$ is quasi-directed and bounded regularizable, then according to Theorem 4 one has a Stirling asymptotic for $\xi_D(\ell + 1, z)$, and (10.1) with $a_i = \infty$ can be renormalized if one has a renormalization for every integral of the form $\int_\infty^z \lambda^{-q} B(\log \lambda)\, d\lambda$, $q \in \mathbb{C}$, $B(z) \in \mathbb{C}[z]$ (taking of course the value of the integral in case of absolute convergence). In the sequel for $B(z) \in \mathbb{C}[z]$ and $q \in \mathbb{C}$ we define $B^{\{q-1\}}(z) \in \mathbb{C}[z]$ by
\[ \frac{d}{dz} \Big( z^{-(q-1)}\, B^{\{q-1\}}(\log z) \Big) = z^{-q}\, B(\log z), \qquad B^{\{0\}}(0) = 0. \]
Thus the $z^{-(q-1)} B^{\{q-1\}}(\log z)$ are just those primitives of the $z^{-q} B(\log z)$ whose constant terms are zero.

Definition 10.1. A renormalization sequence $\omega$ is a sequence $(\omega_n)_{n=0,1,\ldots}$ of complex numbers. For such a renormalization sequence $\omega$ and $D \in \mathcal{D}_{\mathrm{breg}}$, i.e. $D$ is a quasi-directed bounded regularizable divisor, the $\omega$-renormalized determinant $\Delta_D(z)$ of $D$ is defined by
\[ \log \Delta_D(z) = \int_{\infty}^{z} \int_{\infty}^{\lambda_\ell} \cdots \int_{\infty}^{\lambda_1} (-1)^\ell\, \xi_D(\ell + 1, \lambda_0)\, d\lambda_0\, d\lambda_1 \cdots d\lambda_\ell \]
for $\ell \ge g$, integrating $(\ell + 1)$ times using the Stirling asymptotic for $\xi_D(\ell + 1, z)$ from Theorem 4 and following the renormalization rule:
\[ \int_{\infty}^{z} \lambda^{-q} B(\log \lambda)\, d\lambda := \begin{cases} z^{-(q-1)}\, B^{\{q-1\}}(\log z) & \text{for } q \neq 1, \\ B^{\{0\}}(\log z) + \omega(B) & \text{for } q = 1 \end{cases} \tag{10.2} \]
with $\omega(B) := \sum_{k \ge 0} \omega_k b_k$ for $B(z) = \sum_k b_k z^k$.

One can easily prove that the definition of $\Delta_D(z)$ is independent of $\ell \ge g$ and of the lines of integration and that it delivers indeed a determinant system on $\mathcal{D}_{\mathrm{breg}}$. The Stirling asymptotic that determines $\log \Delta_D(z)$ in the same way as (7.8) for the $\delta$-regularized determinant is derived by integrating the Stirling asymptotic for $\xi_D(\ell + 1, z)$ term by term following (10.2). The next theorem shows that renormalization and regularization in fact are essentially the same.
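The polynomials $B^{\{q-1\}}$ are easy to write down in low degree. The following examples are an illustration added here, computed directly from the defining relation of Sect. 10; they are not taken from the original text.

```latex
% Defining relation: d/dz ( z^{-(q-1)} B^{\{q-1\}}(\log z) ) = z^{-q} B(\log z),  B^{\{0\}}(0) = 0.
% Case B(z) = 1:
\[
  B^{\{q-1\}}(z) = \frac{1}{1-q} \quad (q \neq 1), \qquad B^{\{0\}}(z) = z \quad (q = 1),
\]
% i.e. the primitives z^{1-q}/(1-q) and \log z of z^{-q} and z^{-1}.
% Case B(z) = z:
\[
  B^{\{q-1\}}(z) = \frac{z}{1-q} - \frac{1}{(1-q)^{2}} \quad (q \neq 1), \qquad
  B^{\{0\}}(z) = \frac{z^{2}}{2} \quad (q = 1).
\]
% Consistency with absolute convergence: for \Re(q) > 1 and B = 1 the rule (10.2) gives
\[
  \int_{\infty}^{z} \lambda^{-q}\,d\lambda = \frac{z^{1-q}}{1-q},
\]
% which is the ordinary value of the convergent integral.
```

Only the case $q = 1$ involves the free constants $\omega_k$, which is where the choice of renormalization sequence enters.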
Theorem 7. There is a bijection between the set of regularization sequences $\delta$ and the set of renormalization sequences $\omega$ such that the $\delta$-regularized determinant and the $\omega$-renormalized determinant deliver the same determinant system on $\mathcal{D}_{\mathrm{breg}}$. The $\omega^0$-renormalized determinant with $\omega^0_n := 0$ for all $n \in \mathbb{N}_0$ delivers the zero-renormalization as defined in Example 3 in Sect. 3.

Proof. By Theorem 5 and the properties of the map $[q]$, in particular (7.7) and the fact that Stirling asymptotics for $\log^{(m)} f(z)$ can be differentiated term by term (Remark 4 in Sect. 7), one easily sees that it is sufficient to observe the following:
1. $\delta(s) = \Gamma(1 - s)$ is a regularization sequence with $B^{[0]}(0) = 0$ for all $B(z) \in \mathbb{C}[z]$. This follows from (7.5), as then one has $\mathrm{CT}_{s=0}(\Gamma^{(k)}(s)) = (-1)^{k+1}\, k!\, \delta_{k+1}$ for all $k \in \mathbb{N}_0$.
2. Let $\Omega_1$ be the $\mathbb{C}$-vector space of all renormalization sequences and $\Omega_2$ that of all regularization sequences. Define $\Lambda$ to be the $\mathbb{C}$-vector space of all $\mathbb{C}$-linear maps from $\mathbb{C}[z]$ to $\mathbb{C}$. Then regard the maps
\[ \alpha_1 : \Omega_1 \to \Lambda, \quad \omega \mapsto (B \mapsto \omega(B)), \qquad \alpha_2 : \Omega_2 \to \Lambda, \quad \delta \mapsto \big(B \mapsto (B^{[0]_\delta} - B^{[0]_0})\big), \]
where in the latter definition $[q]_\delta$ means the map $[q]$ for the regularization sequence $\delta$ while $[q]_0$ means $[q]$ for the special regularization sequence of zero renormalization ($\delta(s) = \Gamma(1 - s)$). These maps are obviously isomorphisms and $\alpha_1^{-1} \circ \alpha_2$ is the demanded isomorphism between $\Omega_2$ and $\Omega_1$.

11. Proof of Theorem 2b)

Given a system of relations
\[ \sum_k D_1|_{+\alpha^{(i)}_{1,k}} + \ldots + \sum_k D_n|_{+\alpha^{(i)}_{n,k}} = D^{(i)}, \qquad i \in I, \tag{11.1} \]
with $D^{(i)} \in \mathcal{D}_{\mathrm{reg}}$, $D_1, \ldots, D_n \in \mathcal{D}_{\mathrm{qd}} \setminus \mathcal{D}_{\mathrm{reg}}$, $\alpha^{(i)}_{m,k} \in \mathbb{C}$ for $i \in I$, $m = 1, \ldots, n$ and with finite sums over the index $k$. We first regard logarithms of associated functions for large real $z$. We choose the logarithms of the regularized determinants $\log \Delta_{D^{(i)}}(z) := -\mathrm{CT}_{s=0}(\delta(s)\, \xi_{D^{(i)}}(s, z))$ and logarithms $\log \Delta^{\mathrm{in}}_{D_m}(z)$ of certain associated functions $\Delta^{\mathrm{in}}_{D_m}(z)$. We search for polynomials
\[ P_m(z) = \sum_{l=0}^{g_m} (-1)^{l+1}\, \frac{x_{m,l}}{l!}\, z^l, \qquad m = 1, \ldots, n \]
such that $\log \Delta_{D_m}(z) = P_m(z) + \log \Delta^{\mathrm{in}}_{D_m}(z)$ is consistent with i) and ii) of Definition 4.1 under (11.1). With the polynomials
\[ P^{(i)}(z) := \log \Delta_{D^{(i)}}(z) - \sum_k \log \Delta^{\mathrm{in}}_{D_1}\big(z - \alpha^{(i)}_{1,k}\big) - \ldots - \sum_k \log \Delta^{\mathrm{in}}_{D_n}\big(z - \alpha^{(i)}_{n,k}\big) \tag{11.2} \]
this is equivalent to (i) (i) P1 z − α1,k + . . . + Pn z − αn,k , P(i) (z) = k
i ∈ I,
(11.3)
k
and this is equivalent to a system of linear equations for the xm,l . Now by Zorn’s lemma a system of linear equations has a solution if every finite subsystem has one. Thus it suffices to prove that there is always a solution if |I | < ∞. So wlog we may assume that δ(s) is a polynomial (as the finitely many log D (i) (z) depend only on finitely many δn ). And after a trivial translation we also assume that all (i) divisors are directed and that there is an α > 0 such that |z| < α implies |z − αm,k | < |ρ| for all ρ that occur in the D (i) , Dm ; we always assume |z| < α. If we choose log inm (z) = Wei,g log Dm ,0 m (z) (compare Eq. (3.6)), then we have with the coefficients given in the proof of Theorem 1 b) and using Proposition 3.3: (i) log inDm z − αm,k (i) l gm z − αm,k (i) − CTs=0 δ(s) ξDm s, z − αm,k − ξDm (s + l) (−1)l l! l=0
and thus by Eq. (11.2)
(i) l g1 z − α1,k (−1)l ξD1 (s + l) P(i) (z) = − CTs=0 δ(s) l! l=0
... +
gn l=0
k
(−1)l
(i) l
z − αn,k l!
k
(11.4)
ξDn (s + l) , i ∈ I
(i) (i) (as k ξD1 (s, z −α1,k )+. . .+ k ξDn (s, z −αn,k ) = ξD (i) (s, z)). Comparison of (11.4) and (11.3) leads one to introduce the functions xm,l (s) := δ(s)ξDm (s + l), Pm (s, z) :=
gm l=0
P(i) (s, z) :=
(−1)l+1
xm,l (s) l z, l! (i)
P1 (s, z − α1,k ) + . . . +
k
(11.5) k
(i)
Pn (s, z − αn,k ),
where the xm,l (s) and the Pm (s, z) are all holomorphic for (s) > rmax := max rm while P(i) (s, z) is meromorphic for (s) > 0 with CTs=0 (P(i) (s, z)) = P(i) (z),
i ∈ I,
(11.6)
for $|z| < \alpha$ as is seen from Eq. (11.4). We now expand (11.5) and (11.3) by powers of $z$:
$$P_{(i)}(s,z) = \sum_{l=0}^{g_{\max}} p_{(i),l}(s)\, z^l$$
and $P_{(i)}(z) = \sum_{l=0}^{g_{\max}} p_{(i),l}\, z^l$. Regard the $p_{(i),l}(s)$ and correspondingly the $p_{(i),l}$ as the components of vectors $p(s), p \in \mathbb{C}^M$, and regard the $x_{m,l}(s)$ and $x_{m,l}$ as the components of vectors $x(s), x \in \mathbb{C}^N$. The expansion of (11.3) and (11.5) by powers of $z$ delivers a matrix $B \in \mathrm{Mat}(N \times M, \mathbb{C})$ such that
$$p(s) = B \cdot x(s) \quad \text{for } \Re(s) > r_{\max}, \tag{11.7}$$
and it has to be shown that there is an $x \in \mathbb{C}^N$ such that $p = B \cdot x$. But there is a matrix $\hat B \in \mathrm{Mat}(M \times N, \mathbb{C})$ such that a solution exists if and only if $\hat B \cdot p = 0$. With this $\hat B$ one has
$$\hat B \cdot p = \hat B \cdot \mathrm{CT}_{s=0}(p(s)) = \mathrm{CT}_{s=0}(\hat B \cdot p(s)) = 0,$$
where the first equality is obtained from (11.6) and the last from (11.7).
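The linear-algebra step at the end of this proof can be spelled out as follows. This is a minimal sketch, assuming that $\hat B$ is chosen so that its rows span the annihilator of the image of $B$; the proof only asserts that some such matrix exists, so this particular choice is an assumption.

```latex
\begin{align*}
\exists\, x:\ p = B\cdot x
  &\iff p \in \operatorname{im} B
   \iff \hat B\cdot p = 0, \qquad \text{where } \hat B\cdot B = 0,\\
\hat B\cdot p(s) &= \hat B\cdot B\cdot x(s) = 0
   \quad \text{for } \Re(s) > r_{\max} \text{ by (11.7)},\\
\hat B\cdot p &= \mathrm{CT}_{s=0}\bigl(\hat B\cdot p(s)\bigr) = 0 .
\end{align*}
```

Since $\hat B \cdot p(s)$ vanishes identically on a half-plane, its meromorphic continuation vanishes as well, and so does its constant term at $s = 0$.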
Remark. Observe that in the proof the operation $\mathrm{CT}_{s=0}$ is applied to functions $f(s) = f_1(s) + f_2(s)$ with $f_1(s)$ being meromorphic around $s = 0$ and $f_2(s)$ defined only for $\Re(s) > 0$ but continuous at $s = 0$.
12. Proof of Theorem 4 The following estimate is needed to apply majorized convergence to integrals over ξD (s, z). Lemma 12.1. Let 0 < ϕi < ϕi < π , i + 1, 2 and let D be a directed divisor in Wlϕ1 ,ϕ2 and given r ≥ r such that c := ρ∈D |mD (ρ)ρk−r | < ∞. Then for (s0 ) > r, mD (ρ) r −(s )
0 (z − ρ)s0 = O |z|
(12.1)
ρ∈D
for z ∈ Wrϕ1 ,ϕ2 and |z| → ∞. Proof. We split the series in
|ρ|< 21 |z| and
|ρ|≥ 21 |z| and treat these two series separately. 1 r |ρ|<x |mD (ρ)| ≤ cx for any 2 |z| and use
For the first series we estimate |z − ρ| > x > 0 which follows immediately from the definition of c. This last inequality on the other hand implies x1 ≤|ρ|<x2
x2 mD (ρ) 1 ≤ cx r 1 + cr y r −1 α dy 1 α ρα x1 y x1
(12.2)
x for all α ∈ R>0 and 0 < x1 < x2 (as the rhs obviously maximizes x12 y −α dµ(y) under x the condition x1 dµ(y) ≤ cx r for all x ∈ [x1 , x2 ]). Observing that there is a β > 0 such that |z − ρ| > β|ρ| for all ρ ∈ D and z ∈ Wrϕ1 ,ϕ2 and using (12.2) we get the estimate also for the second series.
In the sequel we often tacitly use the following estimate: For $0 \le \varphi < \frac{\pi}{2}$, $\alpha_1 < \alpha_2$ and $m \in \mathbb{N}_0$,
$$\Gamma^{(m)}(s) = O\bigl(e^{-\varphi|\Im(s)|}\bigr), \quad \Im(s) \to \pm\infty \tag{12.3}$$
for $\alpha_1 < \Re(s) < \alpha_2$. For $m = 0$ this is part of the Stirling formula, for $m > 0$ it follows by applying Cauchy's inequalities.

Proof of Proposition 7.1. By majorized convergence (for b) apply the above Lemma) the two integral representations have to be proved only for one-point-divisors. In the sequel for expressions $a^s$ we always use $\arg(a) \in\, ]-\pi, \pi[$.

a) Let $\Re(s) > c > 0$ and $\Re(\rho) < 0$, $\Re(z) > 0$, then by Euler's Mellin integral for $\Gamma(s)$ and its inversion one has
$$\frac{\Gamma(s)}{(z-\rho)^s} = \int_0^\infty e^{-zt}\, t^{s-1} e^{\rho t}\, dt = \int_0^\infty e^{-zt}\, t^{s-1} \left( \frac{1}{2\pi i} \int_{c-i\infty}^{c+i\infty} \frac{\Gamma(s')}{(-\rho)^{s'}}\, t^{-s'}\, ds' \right) dt = \frac{1}{2\pi i} \int_{c-i\infty}^{c+i\infty} \frac{\Gamma(s-s')}{z^{s-s'}}\, \frac{\Gamma(s')}{(-\rho)^{s'}}\, ds',$$
the last equation by interchanging the integrations (Fubini). Using the identity theorem one gets this formula as needed for $\rho \in Wl_{\varphi_1,\varphi_2}$ and $z \in Wr_{\varphi_1,\varphi_2}$ because both sides are holomorphic in the variables $\rho$ and $z$.

b) For $\rho \in Wl_{\varphi_1,\varphi_2}$, $z \in Wr_{\varphi_1,\varphi_2}$ and $\Re(s) > 0$ we will prove
$$\frac{\Gamma(s)}{(z-\rho)^s} = \frac{1}{2\pi i} \int_C \frac{\Gamma(s-s_0+1)}{(z-w)^{s-s_0+1}}\, \frac{\Gamma(s_0)}{(w-\rho)^{s_0}}\, dw. \tag{12.4}$$
For $z_0 \in Wr_{\varphi_1,\varphi_2}$ and $\Re(s_0) < 1$ one has
$$\frac{1}{2\pi i} \int_C \frac{1}{(z_0-w)^{s-s_0+1}\, w^{s_0}}\, dw = z_0^{-s}\, \frac{e^{i\pi s_0} - e^{-i\pi s_0}}{2\pi i} \int_0^\infty \frac{v^{-s_0}}{(1+v)^{s-s_0+1}}\, dv = z_0^{-s}\, \frac{\sin \pi s_0}{\pi}\, \frac{\Gamma(1-s_0)\Gamma(s)}{\Gamma(s-s_0+1)} = \frac{1}{z_0^{s}}\, \frac{\Gamma(s)}{\Gamma(s_0)\Gamma(s-s_0+1)},$$
the first equation by substituting $w \to -z_0 v$ and deforming the contour (residue theorem), the second because of the representation 1.5 (2) in [EMOT] for Euler's beta function $B(u,v) = \Gamma(u)\Gamma(v)\Gamma^{-1}(u+v)$ and the third because of the equation $\Gamma(1-s_0)\Gamma(s_0) = \pi / \sin(\pi s_0)$. Now for $\rho \in Wl_{\varphi_1,\varphi_2}$ such that $z = z_0 + \rho \in Wr_{\varphi_1,\varphi_2}$, replace the contour $C$ by the shifted contour $C - \rho$. The value of this integral is independent of $\rho$ (residue theorem) and (by majorized convergence for $\rho \to 0$) equals the value of the above integral. Applying the substitution $w \to (w - \rho)$ and the identity theorem yields (12.4) in the demanded generality.
Proof of Theorem 4. A) ⇒ B’). Let −p < −q < r < c with (pn ) = q for all n. Then by the residue theorem (7.1) for (s) > c and z ∈ Wrϕ1 ,ϕ2 becomes −q+i∞ "(s − s ) 1 ξD (s, z) = ξD (s )ds 2π i −q−i∞ zs−s "(s − s ) + Ress =−pn ξD (s ) zs−s (pn )
n zs+pn
and this is also valid for all (s) > −q as is seen by the identity theorem. From this B’) easily follows for (s0 ) > −p. If −p < (s0 ) ≤ −p then first take the Stirling asymptotic for s0 = s0 + k with k ∈ N such that −p < (s0 ) ≤ −p + 1 and integrate k times. B’) ⇒ A). First note that we just need to show that ξD (s) is (σ1 , σ2 )-bounded pregular, but we do not need to determine the singular part distribution as then because of A) ⇒ B’) and the properties of the map [q] it must be the demanded one. Let now 0 < σi < σi < σ . We have for (s) > r and C = Cσ1 ,σ2 , 1 "(s − s0 + 1) CTs1 =0 (δ(s1 )ξD (s1 + s0 , w)) dw, (12.5) ξD (s, z) = 2πi C (z − w)s−s0 +1 which is obtained by applying partial integration to (7.2) using (3.2) where the necessary estimates for |w| → ∞ are derived by integrating (12.1). Now as (12.5) is not valid for z = 0 one has to use a little trick: One deforms the contour C and uses a "shifted" Stirling asymptotics. Let ε > 0 then by Taylor series expansion one obtains a pB-systems n ) with abscissa p such that ( pn , B q (z) := CTs1 =0 (δ(s1 )ξD (s1 + s0 , z)) R B n (log(z + ε)) s0 (z) − −P (z + ε)pn +s0 ( pn )
= O(|z|
−(q +(s0 ))
)
for |z| → ∞
is valid for z ∈ Wrσ1 σ2 for all fixed q < q < p. Now we choose a contour C which equals C for large |w| but is a little bit deformed near z = 0 such that z = 0 lies to its right side and (−ε) and all ρ ∈ D ly to its left side. Then (12.5) is also valid for z = 0 and C instead of C. Thus one gets for s0 ): (s) > max(r, −(p0 ), deg P 1 Bn (log(w + ε)) "(s − s0 + 1) ξD (s) = dw 2π i C (w + ε)pn +s0 (−w)s−s0 +1 ( pn )
s0 vanishes because of the residue theorem and one applies a where the integral over P slight generalization of (12.4) (obtained by deforming the contour and using the identity theorem) to the integral 1 (w + ε)y−s0 Bn (log(w + ε)) dw = Bn (∂y ) dw . p +s s−s0 +1 s−s0 +1 C (w + ε) n 0 (−w) C (−w) |y=− pn q one easily sees that ξD (s) Using (12.3) and choosing C suitably for the integral over R is (σ1 , σ2 )-bounded p-regular. 13. Properties of exp(ρ, t, s) Lemma 13.1. a) e xp(ρ, t, s) satisfies the functional equations e xp(ρ, t, s) − e2πi(s−1) e xp(ρ, exp(2π i)t, s) = eρt
(13.1)
and e xp(ρ, t, s + 1) − e xp(ρ, t, s) =
t −s · "(s)(−ρ)−s . 2π i
(13.2)
b) Let K ⊂ {s ∈ C| (s) > 0} be a compact set and ε > 0. Then there is a c > ∗ with arg(−ρ) ∈] − π, π [ and 0 such that for ρ ∈ C\R≥0 , s ∈ K and t ∈ C π 5π − arg(−ρ) − 2 + ε < arg(t) < − arg(−ρ) + 2 − ε, one has the estimate | exp(ρ, t, s)| < c|ρt|−(s) . Proof. For
π 2
< arg(t) <
(13.3)
3π 2
we have by definition ∞ t −(s−1) ewt e xp(ρ, t, s) = dw "(s) 2π i (w − ρ)s 0
(13.4)
with absolute convergence. Equation (13.2) is obtained by partial integration. Rotating the ray of integration (residue theorem) one first gets for (s) < 1 and − arg(−ρ)− π2 < arg(t) < − arg(−ρ) + π2 with arg(−ρ) = α − π and a small δ > 0, e xp(ρ, t, s) − e2πi(s−1) e xp(ρ, exp(2π i)t, s) i(α+δ) ei(α−δ) ∞ wt e ∞ t −(s−1) e = − dw "(s)eρt 2πi ws 0 0 ! " (e−π i(s−1) −eπ i(s−1) )t s−1 "(1−s)
=
1 "(s)"(1 − s)2i sin(π s)eρt = eρt , 2πi
thus (13.1), by the identity theorem also in general. It remains to prove (13.3). We assume $\arg(\rho t) \in\, ]-\varepsilon', \varepsilon'[$ for $0 < \varepsilon' < \frac{\pi}{2}$, the general case follows then by rotating the ray of integration,
$$\widetilde{\exp}(\rho,t,s) = \frac{\Gamma(s)}{2\pi i}\, (-\rho t)^{-(s-1)} \int_0^\infty \frac{e^{(-\rho t)w}}{(w+1)^s}\, dw,$$
which immediately gives the estimate $|\widetilde{\exp}(\rho,t,s)| < c_1 |\rho t|^{-(\Re(s)-1)}$ for $|\rho t| \ge 1$ for a suitable $c_1 > 0$ and thus (13.3) by (13.2). For $|\rho t| \le 1$ on the other hand with $0 < \alpha := \Re(\rho t) \le 1$ and the trivial estimate $e^{-x} \le \frac{1}{x+\alpha}$ for $x \in \mathbb{R}_{\ge 0}$ one has
$$\int_0^\infty \frac{e^{-\Re(\rho t)w}}{|(w+1)^s|}\, dw = \alpha^{\Re(s)-1} \int_0^\infty \frac{e^{-x}}{(x+\alpha)^{\Re(s)}}\, dx \le \alpha^{\Re(s)-1} \int_0^\infty \frac{dx}{(x+\alpha)^{\Re(s)+1}},$$
and the assertion easily follows also for $|\rho t| \le 1$.
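The proof states that (13.2) follows by partial integration and calls one estimate trivial; both steps can be written out. The following is a sketch under two assumptions: the function of this section is written $\widetilde{\exp}$, and (13.4) is read as $\widetilde{\exp}(\rho,t,s) = \frac{\Gamma(s)}{2\pi i}\, t^{-(s-1)} \int_0^\infty e^{wt}(w-\rho)^{-s}\, dw$ for $\frac{\pi}{2} < \arg(t) < \frac{3\pi}{2}$, so that $\Re(t) < 0$ and the boundary term at infinity vanishes.

```latex
\begin{align*}
\int_0^\infty \frac{e^{wt}}{(w-\rho)^{s}}\,dw
 &= \Bigl[\frac{e^{wt}}{t\,(w-\rho)^{s}}\Bigr]_{w=0}^{\infty}
    + \frac{s}{t}\int_0^\infty \frac{e^{wt}}{(w-\rho)^{s+1}}\,dw
  = -\frac{(-\rho)^{-s}}{t}
    + \frac{s}{t}\int_0^\infty \frac{e^{wt}}{(w-\rho)^{s+1}}\,dw ,\\
\widetilde{\exp}(\rho,t,s)
 &= -\frac{t^{-s}}{2\pi i}\,\Gamma(s)\,(-\rho)^{-s}
    + \widetilde{\exp}(\rho,t,s+1)
    \qquad \text{(using } s\,\Gamma(s) = \Gamma(s+1)\text{)},\\
\frac{d}{dx}\bigl[(x+\alpha)e^{-x}\bigr] &= (1-\alpha-x)\,e^{-x}
 \;\Longrightarrow\;
 (x+\alpha)e^{-x} \le e^{-(1-\alpha)} \le 1
 \quad \text{for } x \ge 0,\ 0 < \alpha \le 1 .
\end{align*}
```

The second line is (13.2) after rearranging, and the third line gives the estimate $e^{-x} \le \frac{1}{x+\alpha}$.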
14. Miscellanea

In Chapter 2 of [Il1] the formalism of regularized determinants was developed more generally: Following Jorgenson and Lang ([JL1]), divisors with non-integer multiplicities $m_D : \mathbb{C} \to \mathbb{C}$ (instead of $\mathbb{C} \to \mathbb{Z}$) were regarded; then everything can be carried out with almost no difficulties, except that the associated functions become multivalued with the $\rho \in D$ as branch points. Also essential singularities for $\xi_D(s)$ were allowed. In that case it is necessary that the formal power series $\delta(s)$ is convergent near zero. With this assumption almost everything can be done in general although some not completely trivial convergence problems occur.

The maps $[q]$ and $[[q]]$ defined in Sects. 7 and 8 are special cases of the following construction: For $q \in \mathbb{C}$, a regularization sequence $\delta$ and a function $h(s)$ which is meromorphic in a neighborhood of $q$, we define a linear map $[h,q] : \mathbb{C}[z] \to \mathbb{C}[z]$ (notation: $B(z) \to B^{[h,q]}(z)$) by
$$\mathrm{CT}_{s=0}\bigl(\delta(s) B(\partial_s)[h(s) z^{s+q}]\bigr) = B^{[h,q]}(\log z)\, z^{q}.$$
If $h_1(s)$ and $h_2(s)$ are two such functions, then if $h_1(s)$ is, in addition, holomorphic at $s = q$ the composition law $[h_2,q] \circ [h_1,q] = [h_1 \cdot h_2, q]$ is easily checked. For example this implies $[[q]] = -\frac{1}{2\pi i}[1-q] \circ [q]$ for q = −n, n ∈ N0. Also a sort of inverse of $[q]$ can be defined (compare Satz 2.3.6 in [Il1]).

Acknowledgements. I would like to thank C. Deninger for supervising my Ph.D. thesis as well as M. Schröter, I. Vardi, C. Bree, C. Soulé, A. Voros, J. B. Bost and J. Jorgenson for helpful discussions and improvements. Parts of the article were written during a visit at the IHES.
References

[Al] Almquist, G.: Asymptotic Formulas and Generalized Dedekind Sums. Exp. Math. 7, 343–359 (1998)
[Bar] Barnes, E.W.: On the Theory of the Multiple Gamma Function. Phil. Trans. of the Royal Soc. (A) 19, 374–439 (1904)
[Cr] Cramér, H.: Studien über die Nullstellen der Riemannschen Zetafunktion. Math. Zeitschrift 4, 104–130 (1919)
[CV] Cartier, P., Voros, A.: Une nouvelle interpretation de la formule des traces de Selberg. In: The Grothendieck Festschrift, Vol. 2, Basel–Boston: Birkhäuser, 1991, pp. 1–67
[De1] Deninger, C.: Motivic L-functions and regularized determinants. In: Motives, Proc. of Symp. Pure Math. 55/1, Providence, RI: AMS, 1994, pp. 707–743
[De2] Deninger, C.: Motivic L-functions and regularized determinants II. In: F. Catanese (Hrsg.) Proc. Arithmetic Geometry, Cortona, 1994
[De3] Deninger, C.: Some Analogies between Number Theory and dynamical Systems on foliated Spaces. Documenta Mathematica, extra vol. ICM 1998, I, Plenary Talks, pp. 23–46
[Do] Doetsch, G.: Handbuch der Laplacetransformation I/II, Basel: Birkhäuser, 1950/1955
[Ef] Efrat, L.: Determinants of Laplacians on surfaces of finite volume. Commun. Math. Phys. 119, 443–451 (1988); Erratum. Commun. Math. Phys. 138, 607 (1991)
[EMOT] Erdelyi, A., Magnus, W., Oberhettinger, F., Tricomi, F.G.: Higher transcendental functions I, II, III. New York: McGraw-Hill, 1953
[EORBZ] Elizalde, E., Odintsov, S.D., Romeo, A., Bytsenko, A.A., Zerbini, S.: Zeta regularization techniques with applications. Singapore: World Scientific, 1994
[Gui] Guinand, A.D.: Fourier reciprocities and the Riemann zeta function. Proc. London Math. Soc. (2) 51, 401–414 (1950)
[Il1] Illies, G.: Regularized products, trace formulas and Cramér functions. Ph.D.-thesis (in German), Schriftenreihe des mathematischen Instituts der Universität Münster, 3. Serie, Heft 22, 1998
[Il2] Illies, G.: Cramér functions and Guinand equations. IHES-preprint 1999
[JL1] Jorgenson, J., Lang, S.: Basic Analysis of regularized series and products. LNM 1564, Berlin: Springer, 1994
[JL2] Jorgenson, J., Lang, S.: On Cramér’s theorem for general Euler products with functional equation. Math. Ann. 297/3, 383–416 (1993)
[JL3] Jorgenson, J., Lang, S.: Extension of analytic number theory and the theory of regularized harmonic series from Dirichlet series to Bessel series. Math. Ann. 306, 75–124 (1996)
[Ko1] Koyama, S.Y.: Determinant expressions of Selberg zeta functions I. Trans. AMS 324, 149–168 (1991)
[Ko2] Koyama, S.Y.: Determinant expressions of Selberg zeta functions II. Trans. AMS 329, 755–772 (1992)
[Ko3] Koyama, S.Y.: Determinant expressions of Selberg zeta functions III. Proc. AMS 113, 303–311 (1991)
[Ku] Kurokawa, N.: Multiple sine functions and Selberg zeta functions. Proc. Japan Acad. 67A, 61–64 (1991)
[LW] Landau, E., Walfisz, A.: Über die Nichtfortsetzbarkeit einiger durch Dirichletsche Reihen definierter Funktionen. Rend. di Palermo 44, 82–86 (1919)
[Ma] Manin, Y.I.: Lectures on zeta functions and motives. Preprint MPI Bonn, 1992
[No] Norlund, N.E.: Memoire sur les polynomes de Bernoulli. Acta Mathematica 43, 121–196 (1920)
[QHS] Quine, J.R., Heydari, S.H., Song, R.Y.: Zeta-regularized products. Trans. of the AMS 338, 1, 213–231 (1993)
[RS] Ray, D., Singer, I.: Analytic torsion for analytic manifolds. Ann. Math. 98, 154–177 (1973)
[Sa] Sarnak, P.: Determinants of Laplacians. Commun. Math. Phys. 110, 113–120 (1987)
[ScSo] Schröter, M., Soulé, C.: On a Result of Deninger Concerning Riemann’s Zeta Function. In: Motives, Proc. of Symp. Pure Math. 55/1, Providence, RI: AMS, 1994, pp. 745–747
[So] Soulé, C.: Letter to C. Deninger, 13.2.1991, as: M. Schröter, S. Soulé: On a result of Deninger concerning Riemann’s zeta function. In: Motives, Proc. of Symp. Pure Math. 55/1, Providence, RI: AMS, 1994, pp. 745–747
[Ti] Titchmarsh, E.C.: The Theory of Functions. 2nd ed., Oxford: Oxford University Press, 1939
[Va] Vardi, I.: Determinants of Laplacians and multiple Gamma Functions. Siam J. Math. Anal. 19, 1, 493–507 (1988)
[Vi] Vigneras, M.F.: L’equation fonctionelle de la fonction zeta de Selberg du groupe modulaire SL(2, Z). Asterisque 61, 235–249 (1979)
[Vo] Voros, A.: Spectral Functions, Special Functions and the Selberg Zeta Function. Commun. Math. Phys. 110, 439–465 (1987)

Communicated by P. Sarnak
Commun. Math. Phys. 220, 95 – 104 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Super Brockett Equations: A Graded Gradient Integrable System R. Felipe1 , F. Ongay2 1 ICIMAF, Havana, Cuba, and Universidad de Antioquia, Medellín, Colombia 2 CIMAT, Guanajuato, Mexico. E-mail:
[email protected]
Received: 9 February 2000 / Accepted: 18 January 2001
Abstract: Rather recently, equations of Lax type defined by a double commutator, the so-called Brockett equations, have received considerable attention. In this paper we prove that a supersymmetric version of a Brockett hierarchy is an infinite dimensional integrable gradient system. As far as we know, this is the only graded system of this type existing in the literature.

0. Introduction

Ever since the discovery in 1968 by Gardner, Greene, Kruskal and Miura of the inverse scattering method to solve the KdV equations, the theory of infinite dimensional integrable systems, sometimes also known as the theory of soliton equations, has been the subject of a great deal of work, and many results and applications have stemmed from this newfound attention to the subject. As is well known, one of the first major developments came with the realization that these systems can be put in the so-called Lax form, $\dot L = [L, N]$, since this description is particularly well suited to stress some of the geometrical interpretations of the equations, in particular allowing one to place them into a Hamiltonian framework.

On the other hand, some ten years ago, ODEs of Lax type defined by more than one Lie bracket were introduced by R. Brockett (see [B1] and [B2]), in connection with some least squares matching and sorting problems. Surprisingly enough, these so-called Brockett systems exhibit many remarkable features besides the original intended ones: to name one, it was discovered by A. Bloch, R. Brockett and T. Ratiu (see e.g. [B-B-R]) that the equations corresponding to the celebrated Toda lattice can be cast into this mold. But moreover, another property of these equations, still more relevant to our purposes, was also proved in [B-B-R], where it was shown that these finite dimensional systems are completely integrable, but of gradient type (the existence of a suitable Hamiltonian structure remaining an open question).

Partially supported by CONACYT, Mexico, project 28-492E and CODI project “Complete integrability of Brockett type equations”, University of Antioquia, Colombia.

Quite recently, the theory of Brockett equations was adapted by one of us for PDEs (reference [F]), and it was proved that many important properties, such as the complete integrability and the property of being a gradient system, were still valid in this infinite dimensional context, but also that this analog of the Brockett equation belongs to a hierarchy, similar to the well known KdV or KP hierarchies.

In this work we consider yet another extension of the Brockett system: Following the approach to supersymmetric (i.e., $\mathbb{Z}_2$-graded) versions of the KP hierarchy, studied for example by Manin and Radul ([M-R]), Mulase ([Mu2]), or Rabin ([R]), we define and study a supersymmetric extension of the Brockett hierarchy introduced in [F]. In particular, our main results will show that the properties of being completely integrable and a gradient flow also extend to this graded hierarchy; to the best of our knowledge, this is the first example of a graded system possessing these properties. Furthermore, the flows associated to this new hierarchy naturally “live” on a flag in the space of gauge operators, and we conjecture that this geometric feature of our construction might be of some use in the algebro-geometric study of deformations of line bundles over algebraic curves, both in the classical and graded case.

1. A $\mathbb{Z}_2$-Graded Brockett Hierarchy

We will consider in this work a rather standard (1, 1) dimensional setting, namely, the one studied by Manin and Radul, which we now briefly recall, referring the reader to the basic reference [M-R] for more details (see also [Mu2]): First of all, let $x$ denote an even variable, $\xi$ an odd one (the parity of an object will be denoted by a tilde, so that for instance $\tilde x = 0$; $\tilde \xi = 1$), and fix some ring of “superfunctions” in these variables (for instance, we may take the ring of formal power series in $x$ and $\xi$), $B$, where the operator $\theta = \partial_\xi + \xi\partial_x$ acts as an odd derivation (recall that $\theta^2 = \partial_x$). Then one considers the ring of (formal) super pseudo-differential operators, $B((\theta^{-1}))$, with coefficients in $B$. To avoid confusion with the action of the derivations on the operators, the product in this ring will be denoted by $\circ$, and by $\theta^{-1}$ we will denote the (formal) inverse of $\theta$. Thus, every operator $L \in B((\theta^{-1}))$ can be written as a formal series
$$L = \sum_{i \le m} b_i \theta^i,$$
and, as usual, we will write
$$L_+ = \sum_{0 \le i} b_i \theta^i; \qquad L_- = \sum_{i < 0} b_i \theta^i, \tag{1}$$
where the first part is the differential operator part of $L$, and the second the strictly pseudodifferential (or “integral”) part. Clearly, this determines a direct sum decomposition, $B((\theta^{-1})) = D \oplus E^{(-1)}$, and, as a matter of notation, the even and odd parts of these spaces (in fact, Lie superalgebras, with the standard supercommutator bracket) will be denoted by a subscript 0 or 1, respectively. It is also worthwhile to note that $E^{(-1)}$ is naturally filtered by $E^{(k)} = \bigl\{ \sum_{i \le k} b_i \theta^i \bigr\}$,
$k \le -1$, and that the affine subspace $1 + E^{(-1)}$ is a Lie supergroup, whose Lie superalgebra, i.e., the tangent space at the identity, can be identified with $E^{(-1)}$ (and, of course, the same holds for the tangent space at any other point).

Given this setting, to construct the hierarchy we start by letting $L = \theta + \sum_{i \le 0} b_i \theta^i$ be a fixed odd graded Lax operator (this implies $\tilde b_i = i + 1 \bmod 2$), and, in order to have a dressing presentation of $L$ (cf. Sect. 2), we assume that the coefficients of $L$ satisfy the relation
$$b_{-1} + \tfrac{1}{2}\,\theta b_0 = 0. \tag{2}$$
Next, we consider an infinite set of “time” variables, $t_i$, where $\tilde t_i = \tilde i$, and introduce the family of flows (i.e., derivations of $B((\theta^{-1}))$), $\theta_i$, defined by
$$\theta_{2k} = \partial_{t_{2k}}, \qquad \theta_{2k-1} = \partial_{t_{2k-1}} + \sum_{j=1}^{\infty} t_{2j-1}\, \partial_{t_{2(k+j-1)}},$$
and we assume that these flows supercommute with $\theta$. Although we shall not use this property, it is well known (cf. [M-R]) that they satisfy the commutation relations
$$[\theta_{2i}, \theta_{2j}] = 0; \qquad [\theta_{2i}, \theta_{2j-1}] = 0; \qquad [\theta_{2i-1}, \theta_{2j-1}] = 2\theta_{2(i+j-1)}.$$
We can finally define the super Brockett hierarchy:

Definition 1. Let $L$ be a pseudo-differential operator satisfying (2). The super, or $\mathbb{Z}_2$-graded, Brockett hierarchy is the system of equations
$$\theta_k L = [L, [L, L^{k+1}_+]]; \qquad k = 1, 2, 3, \ldots. \tag{3}$$
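For orientation, the classical finite-dimensional double-bracket equation $\dot L = [L, [L, N]]$ for real symmetric matrices can be integrated numerically. The sketch below is not the graded hierarchy of Definition 1; it is the ordinary matrix analog, and the function names, the RK4 integrator and the sample matrices are all illustrative assumptions. For symmetric $L$ and $N$ the flow keeps $\operatorname{tr}(L^2)$ fixed and makes $\operatorname{tr}(LN)$ non-decreasing, which is the finite-dimensional shadow of the gradient property discussed in this paper.

```python
# Classical finite-dimensional double-bracket (Brockett) flow dL/dt = [L, [L, N]]
# for real symmetric matrices, integrated with a hand-written RK4 step.
# All names below are illustrative assumptions, not taken from the paper.

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def comm(a, b):
    """Ordinary commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def axpy(a, b, h):
    """Return a + h*b entrywise."""
    n = len(a)
    return [[a[i][j] + h * b[i][j] for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def rhs(L, N):
    """Right-hand side [L, [L, N]] of the double-bracket equation."""
    return comm(L, comm(L, N))

def rk4_step(L, N, h):
    """One classical Runge-Kutta step of size h."""
    k1 = rhs(L, N)
    k2 = rhs(axpy(L, k1, h / 2), N)
    k3 = rhs(axpy(L, k2, h / 2), N)
    k4 = rhs(axpy(L, k3, h), N)
    incr = axpy(axpy(axpy(k1, k2, 2.0), k3, 2.0), k4, 1.0)  # k1 + 2k2 + 2k3 + k4
    return axpy(L, incr, h / 6)

def flow(L, N, h, steps):
    """Integrate the flow; return the final matrix and the history of tr(L N)."""
    values = [trace(matmul(L, N))]
    for _ in range(steps):
        L = rk4_step(L, N, h)
        values.append(trace(matmul(L, N)))
    return L, values

if __name__ == "__main__":
    L0 = [[1.0, 2.0], [2.0, -1.0]]   # symmetric initial condition (sample data)
    N = [[2.0, 0.0], [0.0, 1.0]]     # fixed diagonal matrix (sample data)
    L1, vals = flow(L0, N, 1e-3, 1000)
    print("tr(LN) start/end:", vals[0], vals[-1])
    print("tr(L^2) start/end:", trace(matmul(L0, L0)), trace(matmul(L1, L1)))
```

Running the script prints the initial and final values of $\operatorname{tr}(LN)$ and $\operatorname{tr}(L^2)$ for the sample data, so the monotonicity and the conservation can be inspected directly.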
First of all we have the following useful result, giving an equivalent way to write the hierarchy (this is analogous to, and in fact simpler than, formula (7) in [M-R]):

Lemma 1. For all $k \in \mathbb{N}$,
$$[L, [L, L^k_+]] = -[L, [L, L^k_-]]. \tag{4}$$

Proof. It is convenient to divide the proof into the cases $k$ odd and $k$ even: If $k = 2n$, then $L^{2n}$ is an even operator, so that
$$[L, L^{2n}] = L \circ L^{2n} - L^{2n} \circ L = 0.$$
Since obviously
$$[L, L^{2n}_+] = [L, L^{2n}] - [L, L^{2n}_-],$$
the result follows immediately in this case. If $k = 2n + 1$, then $[L, L^{2n+1}] = 2L^{2n+2}$, and the above reasoning no longer holds. However, from $[L, L^{2n+1}_+] = -[L, L^{2n+1}_-] + 2L^{2n+2}$ we get
$$[L, [L, L^{2n+1}_+]] = -[L, [L, L^{2n+1}_-]] + 2[L, L^{2n+2}] = -[L, [L, L^{2n+1}_-]],$$
since now $L^{2n+2}$ is even; this completes the proof of the lemma.
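The two parity computations used in this proof follow directly from the definition of the supercommutator. As a short worked check, writing $L$ for the odd Lax operator (the letter is a notational assumption here) and using the standard sign rule for homogeneous operators:

```latex
\begin{align*}
[X, Y] &:= X\circ Y - (-1)^{\tilde X \tilde Y}\, Y\circ X ,\\
[L, L^{2n}] &= L\circ L^{2n} - L^{2n}\circ L = 0
   && (\tilde L = 1,\ \widetilde{L^{2n}} = 0),\\
[L, L^{2n+1}] &= L\circ L^{2n+1} + L^{2n+1}\circ L = 2L^{2n+2}
   && (\tilde L = 1,\ \widetilde{L^{2n+1}} = 1).
\end{align*}
```

So the bracket of $L$ with an even power vanishes, while with an odd power it is an anticommutator and doubles the next even power.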
2. Graded Residues and Conservation Laws

As is well known, the definition of complete integrability in the infinite dimensional case is more subtle than in the finite dimensional one. The one we will use here is the existence of an infinite number of conserved quantities, and this fact will be established in this section for the super Brockett hierarchy. We start with the following result, which gives the action of the time flows on the powers of the Lax operator.

Lemma 2. If $L$ satisfies the super Brockett equations then, for all $k \in \mathbb{N}$,
$$\theta_k L^{2n} = (-1)^k [L^{2n}, [L, L^{k+1}_+]]; \qquad n = 1, 2, 3, \ldots \tag{5}$$
and
$$\theta_k L^{2n+1} = [L^{2n+1}, [L, L^{k+1}_+]]; \qquad n = 1, 2, 3, \ldots \tag{6}$$
Proof. We will show first that (5) holds for n = 1, and then use (3) and induction on n, to prove (5) and (6) for arbitrary n. ˜ m = m, Now, to start the induction, since ˜ we have ˜ ˜
θk 2 = (θk ) ◦ + (−1)θk ◦ (θk ) = (θk ) ◦ + (−1)k ◦ (θk ) k+1 k = [, [, k+1 + ]] ◦ + (−1) ◦ [, [, + ]] k+1 k = (−1)k { ◦ ( ◦ [, k+1 + ] − (−1) [, + ] ◦ ) k+1 k + (−1)k ( ◦ [, k+1 + ] − (−1) [, + ] ◦ )} k+1 2k 2 = (−1){2 ◦ [, k+1 + ] − (−1) [, + ] ◦ }
= (−1)k [2 , [, k+1 + ]] as desired. Thus, for even powers we have θk 2n+2 = (θk 2 ) ◦ 2n + 2 ◦ (θk 2n ) k+1 2n 2 2n = (−1)k {[2 , [, k+1 + ]] ◦ + ◦ [ , [, + ]]} k+1 2n+2 } = (−1)k {2n+2 ◦ [, k+1 + ] − [, + ] ◦
= (−1)k ([2(n+1) , [, k+1 + ]] while for odd powers we have θk 2n+3 = (θk 2 ) ◦ 2n+1 + 2 ◦ (θk 2n+1 ) 2n+1 } + 2 ◦ [2n+1 , [, k+1 = (−1)k {[2 , [, k+1 + ]] ◦ + ]] k+1 2 2n+1 = (−1)k (2 ◦ [, k+1 + ] − [, + ] ◦ ) ◦ k+1 k 2n+1 + 2 ◦ (2n+1 ◦ [, k+1 ) + ] − (−1) [, + ] ◦
= [2n+3) , [, k+1 + ]], and this completes the proof of the lemma.
As we mentioned (and as is also the case in the non-graded setting), due to the relation (2) Lax superoperators can be presented in a so-called dressed form; that is, there exists an even operator (called a dressing or gauge operator)
$$S = 1 + \sum_{i<0} a_i \theta^i, \qquad \sum_{i<0} a_i \theta^i \in E^{(-1)}_0,$$
such that $L = S \circ \theta \circ S^{-1}$. We also recall that gauge operators are unique, modulo left multiplication by an element of $1 + E^{(-1)}_0$ with constant coefficients (i.e., independent of $x$ and $\xi$). We then have the following result, which gives yet another useful reformulation of the hierarchy.

Lemma 3. Let $L$ be a Lax superoperator. Then $L$ satisfies (3) if and only if there exists a dressing operator $S$ that satisfies
$$\theta_n S = (-1)^n [L, L^{n+1}_-] \circ S. \tag{7}$$
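Proof. The “if” part is rather straightforward: In fact, if we let $S$ be any dressing operator, from $\theta_n(S \circ S^{-1}) = 0$ and since $S$ is even we have the relation $\theta_n S^{-1} = -S^{-1} \circ (\theta_n S) \circ S^{-1}$. Using this, and the fact that $\theta$ and $\theta_n$ supercommute, we obtain
$$\begin{aligned}
\theta_n L &= \theta_n (S \circ \theta \circ S^{-1}) = \theta_n S \circ \theta \circ S^{-1} + S \circ \theta_n(\theta \circ S^{-1}) \\
&= \theta_n S \circ \theta \circ S^{-1} + (-1)^n S \circ \theta \circ \theta_n S^{-1} \\
&= \theta_n S \circ S^{-1} \circ S \circ \theta \circ S^{-1} - (-1)^n S \circ \theta \circ S^{-1} \circ \theta_n S \circ S^{-1} \\
&= \theta_n S \circ S^{-1} \circ L - (-1)^n L \circ \theta_n S \circ S^{-1} \\
&= [\theta_n S \circ S^{-1}, L].
\end{aligned} \tag{8}$$
Therefore, the super Brockett system (3) is equivalent to
$$[\theta_n S \circ S^{-1}, L] = -[L, [L, L^{n+1}_-]]$$
or, rearranging terms,
$$[\theta_n S \circ S^{-1} - (-1)^n [L, L^{n+1}_-],\, L] = 0, \tag{9}$$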
so (7) clearly implies (3). From (9) it is also clear that for the “only if” part, to get a solution of (7), we will need to choose a particular dressing operator, which following [Mu1], we will call a Sato–Wilson gauge operator. Let be a Lax operator satisfying (3), let T denote any dressing operator for it, and consider the operator Hn = θn T ◦ T −1 − (−1)n [, n+1 − ]. (−1)
Observe that H˜ n = n˜ + 1, and in fact Hn ∈ En˜ , since both summands have order less than or equal to −1. Then, by applying twice the dressing procedure, we get [Hn , ] = T ◦ [T −1 ◦ Hn ◦ T , θ] ◦ T −1 so (9) is equivalent to [T −1 ◦ Hn ◦ T , θ] = [Bn , θ ] = 0.
Thus, since $\theta^2 = \partial_x$, if $L$ satisfies the super Brockett equations, then the operator $B_n$ commutes with $\partial_x$, and therefore has constant coefficients. Furthermore, it is clear that $B_n$ also belongs to $E^{(-1)}_{\tilde n}$, and so we can construct an operator $C_n = \sum_{i \le 0} c_i \theta^i$, with constant coefficients, such that $\theta_n C_n = -B_n \circ C_n$. This is actually done by recursively solving a system of ordinary differential equations, whose first few equations are
$$\theta_n c_0 = 0; \qquad \theta_n c_{-1} = b_{-1}; \qquad \theta_n c_{-2} = b_{-2}, \ \text{etc.}$$
(−1)
It follows that we can choose Cn ∈ 1 + E0 (so that in particular it is invertible), and hence S = T ◦ Cn is also a dressing operator. But now we have S −1 ◦ (θ S ◦ S −1 − (−1)n [, n+1 − ]) ◦ S = (T ◦ Cn )−1 ◦ θn (T ◦ Cn ) ◦ (T ◦ Cn )−1 − (−1)n [, n+1 − ] ◦ (T ◦ Cn ) = Cn−1 ◦ T −1 ◦ (θn T ◦ Cn + T ◦ θn Cn ) ◦ Cn−1 ◦ T −1 − (−1)n [, n+1 − ] ◦ T ◦ Cn = Cn−1 ◦ T −1 ◦ θn T + θn Cn ◦ Cn−1 − (−1)n T −1 ◦ [, n+1 − ] ◦ T ◦ Cn = Cn−1 ◦ − Bn + T −1 ◦ (θn T ◦ T −1 − (−1)n [, n+1 − ]) ◦ T ◦ Cn = Cn−1 ◦ − Bn + T −1 ◦ Hn ◦ T ◦ Cn = 0. Hence S is a Sato–Wilson operator for , as desired.
We are now in position to see in what sense the super Brockett hierarchy is integrable: First of all, as in [M-R], if $P \in B((\theta^{-1}))$ we denote by $\mathrm{Res}_\theta P$ the coefficient of $\theta^{-1}$ in $P$; also, it is shown in that work that there exists a universal polynomial in two variables, which we denote by $F$, such that $\mathrm{Res}_\theta [P, Q] = \theta F(P, Q)$ for $P, Q \in B((\theta^{-1}))$. Hence, if we set $S_{n,k} = F(L^n, (-1)^k [L, L^{k+1}_+])$ for $n$ even, and $S_{n,k} = F(L^n, [L, L^{k+1}_+])$ for $n$ odd, and $R_n = \mathrm{Res}_\theta L^n$, the two results proved in this section can be rewritten as $\theta_k R_n = \theta S_{n,k}$, where $k, n$ are arbitrary. These equations can be thought of as conservation laws in the usual way, so we have proved one of our main assertions:

Theorem 1. The super Brockett hierarchy is completely integrable in the sense that it admits the infinite family of conservation laws
$$\theta_k R_n = \theta S_{n,k} \tag{10}$$
for $k, n$ arbitrary positive integers.
3. Graded Adler Functionals and the Gradient Nature of Super Brockett Flows

We will now look at the above construction from the point of view of Adler functionals, to show that they define gradient flows on the space of gauge operators; it will turn out that these flows are defined only on suitable subspaces of $1 + E^{(-1)}$. First, we recall that, in this context, the Berezin integral of a superfunction $f \in B$ is given by
$$\int f \, d(\xi, x) = \int \partial_\xi f \, dx.$$
For an abstract ring $B$ this integral is in principle only a formal object (see [D] for a good discussion of this point), but the point is that it can be used to define Adler type functionals for pseudo-differential operators: for $P \in B((\theta^{-1}))$, this is the functional
$$\int \mathrm{Res}_\theta P \, d(\xi, x).$$
These Adler functionals in turn allow the construction of a pairing for pseudo-differential operators, by means of
$$\langle X, Y \rangle = \int \mathrm{Res}_\theta (XY) \, d(\xi, x),$$
and it is well known that for the supercommutator of two operators one has
$$\int \mathrm{Res}_\theta [X, Y] \, d(\xi, x) = 0. \tag{11}$$
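Two elementary facts sit behind this paragraph and behind the conservation laws (10): the identity $\theta^2 = \partial_x$ for $\theta = \partial_\xi + \xi\partial_x$, and the fact that the Berezin integral of $\theta g$ is the integral of a total $x$-derivative. A short check, assuming a superfunction is written $g = g_0 + \xi g_1$ with $g_0, g_1$ independent of $\xi$ (this component notation is an assumption made for the sketch):

```latex
\begin{align*}
\theta^2 &= (\partial_\xi + \xi\partial_x)^2
  = \partial_\xi^2 + (\partial_\xi \xi + \xi \partial_\xi)\,\partial_x + \xi^2 \partial_x^2
  = \partial_x ,
  \qquad \text{since } \partial_\xi^2 = 0,\ \xi^2 = 0,\ \partial_\xi\xi + \xi\partial_\xi = 1,\\
\theta g &= g_1 + \xi\,\partial_x g_0 ,\\
\int \theta g \; d(\xi, x) &= \int \partial_\xi\bigl(g_1 + \xi\,\partial_x g_0\bigr)\, dx
  = \int \partial_x g_0 \, dx .
\end{align*}
```

Formally, the last integral of a total derivative vanishes, which is why a relation of the form $\theta_k R_n = \theta S_{n,k}$ can be read as a conservation law for $\int R_n \, d(\xi, x)$.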
Now let be a Lax superoperator, S = 1 + i<0 ai θ i a gauge operator and k a positive integer. We will consider the special type of functional Fk (S) = − Res θ k+1 − ◦ d(ξ, x) (12) = − Res θ (S ◦ θ k+1 ◦ S −1 )− ◦ S ◦ θ ◦ S −1 d(ξ, x), as a functional of S, and we want to calculate the (again formal) gradient of these functionals at a fixed point S, relative to the pairing above. As usual, this is obtained by computing the linear term in the expansion in powers of ' of Fk (S + '() − F (S), where ' is an even parameter and where ( is some even differential superoperator in E (−1) (upon which we will impose some further restrictions later on), so we have to compute (S + '()−1 . Now, (S + '()−1 = (S ◦ (1 + 'S −1 ◦ ())−1 = (1 + 'S −1 ◦ ()−1 ◦ S −1 , and the last part can be written using a formal Neumann series as (1 + 'S −1 ◦ ()−1 =
∞ k=0
(−1)k (S −1 ◦ ()k ' k .
(13)
Therefore we have the explicit expression (S + '()−1 =
∞
(−1)k (S −1 ◦ ()k ◦ S −1 ' k
k=0 −1
=S
− S −1 ◦ ( ◦ S −1 ' + O(' 2 ).
Upon substitution of this in (12), and gathering of terms according to powers of ', we have Fk (S + '() − Fk (S) = − ' Res θ (S ◦ θ k+1 ◦ S −1 )− ◦ (( ◦ θ ◦ S −1 − S ◦ θ ◦ S −1 ◦ ( ◦ S −1 ) + (( ◦ θ k+1 ◦ S −1 − S ◦ θ k+1 ◦ S −1 )− ◦ S ◦ θ k+1 ◦ S −1 d(ξ, x) + O(' 2 ) (14) −1 = − ' Res θ k+1 − ◦ ( ◦ S −1 ) − ◦ (( ◦ θ ◦ S + (( ◦ θ k+1 ◦ S −1 − k+1 ◦ ( ◦ S −1 )− ◦ d(ξ, x) + O(' 2 ). We now impose the following condition on (: (−1) can be written as ( = φk ◦ S, for some even operator Since every operator in E0 φk , we assume that the latter satisfies [φk , k+1 ] ∈ E (−k−2) . (−1) Observe that as k varies, this determines a flag in E0 ; that is, a flag of tangent directions in the space of gauge operators. But the important point for us now is – as an easy calculation shows – , that this condition ensures that the order of the second term in the integral is at most −2, and therefore has residue zero. Under this hypothesis, the coefficient of the term linear in ' in (14) can now be rewritten as − Res θ k+1 ◦ [φk , ] + ([φk , k+1 ])− ◦ d(ξ, x) = − Res θ k+1 ◦ [φk , ] d(ξ, x). In fact, we do not yet have an expression for the gradient, since we would have to “isolate” ( in the above equation; but, because the residue of a super commutator is zero, to achieve this it will suffice to show that k+1 ◦ [φk , ] + S −1 [k+1 − , ] ◦ ( is a sum of supercommutators. So let us consider k+1 k+1 −1 −1 ◦ [k+1 −k+1 − ◦ [φk , ] − S [− , ] ◦ ( = −− ◦ [φk , ] − S − , ] ◦ φk ◦ S.
˜ n = n, ˜ expanding the last expression we have Then, since S˜ = 0, and −1 ◦ [k+1 − k+1 − ◦ [φk , ] − S − , ] ◦ φk ◦ S −1 k+1 ◦ (k+1 ◦ k+1 = − k+1 − ◦ (φk ◦ − ◦ φk ) + S − ◦ − (−1) − ) ◦ φk ◦ S −1 ◦ ) − S ◦ (S −1 ◦ k+1 = − (k+1 − ◦ φk ◦ S) ◦ (S − ◦ ◦ φk ) k+1 −1 (S ◦ )(k+1 + (S −1 ◦ k+1 − ◦ ◦ φk ) ◦ S − (−1) − ◦ φk ◦ S).
We can now gather together the first and fourth, and the second and third, terms as the supercommutators of the operators indicated by the parentheses, to get k+1 k+1 −1 −1 k+1 ◦ ] + [S −1 ◦ k+1 − ◦ ◦ φk ]. − [φk , ] + S [− , ]φk S = [− (, S
Therefore
−1 Resθ k+1 ◦ [k+1 − ◦ [φk , ] = −Resθ S − , ](
and we get as gradient of Fk (S) grad Fk (S) = −S −1 ◦ [k+1 − , ].
(15)
These computations essentially end the proof of our last result: Theorem 2. The super Brockett equations define a graded gradient system. More precisely: For each k > 0, Eq. (7) gives the flow of the gradient of the graded Adler functional Fk (S) on the affine subspace 1 + E (−k−2) . Proof. Indeed, to end the proof of our claim, it remains only to observe that, from Lemma 2, we have θk S −1 = −S −1 ◦ θk S ◦ S −1 = (−1)k+1 S −1 ◦ [, k+1 − ]. Therefore, modulo an inessential sign, the right-hand side of (15) is in fact equivalent to the right-hand side of (7), which we have already shown to be equivalent to the super Brockett system. Remark. The graded hierarchy that we have constructed in this paper preserves, and in a definite sense generalizes, several of the remarkable features of the standard Brockett equation. But moreover, we have also seen that these super Brockett equations will induce a flow on an infinite Grassmannian, of a different type to that given by the known super KP flows. We conjecture, therefore, that this hierarchy might also be of value, for instance, for the algebro-geometric study of deformations of superline bundles over supercurves, etc. (and it is clear that this remark also applies to the non-graded case; see also [F]). We hope to clarify some of these questions in a future work. Acknowledgements. Both authors wish to express their indebtedness to Prof. J. Rabin, who patiently listened to our expositions of a preliminary version of this work, and made several valuable comments. The bulk of this paper was done during reciprocal visits by each author to his coauthor’s respective institution; both of us thankfully acknowledge their hospitality during these stays. Finally, we are grateful to one of the referees, who pointed out an error in the original manuscript.
References
[B-B-R] Bloch, A.M., Brockett, R.W., and Ratiu, T.S.: Completely integrable gradient flows. Commun. Math. Phys. 147, 57–74 (1992)
[B1] Brockett, R.W.: Least squares matching problems. Linear Algebra Appl. 122, 761–777 (1989)
[B2] Brockett, R.W.: Dynamical systems that sort lists, diagonalize matrices, and solve linear programming problems. Linear Algebra Appl. 146, 79–91 (1991)
[D] Dickey, L.A.: Soliton equations and Hamiltonian systems. Advanced Series in Math. Phys. 12. Singapore: World Scientific, 1991
[F] Felipe, R.: Algebraic aspects of Brockett type equations. Physica D 132, 287–297 (1999)
[M-R] Manin, Yu.I., and Radul, O.A.: A supersymmetric extension of the Kadomtsev–Petviashvili hierarchy. Commun. Math. Phys. 98, 65–77 (1985)
[Mu1] Mulase, M.: Complete integrability of the Kadomtsev–Petviashvili equation. Adv. Math. 54, 57–66 (1984)
[Mu2] Mulase, M.: A new super KP system and a characterization of the Jacobians of arbitrary algebraic supercurves. J. Diff. Geom. 34, 651–680 (1991)
[R] Rabin, J. M.: The geometry of super KP flows. Commun. Math. Phys. 137, 533–552 (1991)
Communicated by T. Miwa
Commun. Math. Phys. 220, 105 – 164 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Fermionic Formulas for Level-Restricted Generalized Kostka Polynomials and Coset Branching Functions
Anne Schilling1,*, Mark Shimozono2,**
1 Department of Mathematics, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA. E-mail: [email protected]
2 Department of Mathematics, Virginia Tech, Blacksburg, VA 24061-0123, USA. E-mail: [email protected]
* New address as of July 2001: Department of Mathematics, University of California, One Shields Ave., Davis, CA 95616-8633, USA. E-mail: [email protected]
** Partially supported by NSF grant DMS-9800941.
Received: 9 April 2000 / Accepted: 26 January 2001
Abstract: Level-restricted paths play an important rôle in crystal theory. They correspond to certain highest weight vectors of modules of quantum affine algebras. We show that the recently established bijection between Littlewood–Richardson tableaux and rigged configurations is well-behaved with respect to level-restriction and give an explicit characterization of level-restricted rigged configurations. As a consequence a new general fermionic formula for the level-restricted generalized Kostka polynomial is obtained. Some coset branching functions of type A are computed by taking limits of these fermionic formulas.
1. Introduction
Generalized Kostka polynomials [26, 33, 35–38] are q-analogues of the tensor product multiplicity
\[ c_R^{\lambda} = \dim \operatorname{Hom}_{sl_n}\bigl(V^{\lambda},\, V^{R_1} \otimes \cdots \otimes V^{R_L}\bigr), \tag{1.1} \]
where $\lambda$ is a partition, $R = (R_1, \dots, R_L)$ is a sequence of rectangles and $V^{\lambda}$ is the irreducible integrable highest weight module of highest weight $\lambda$ over the quantized enveloping algebra $U_q(sl_n)$. The generalized Kostka polynomials can be expressed as generating functions of classically restricted paths [30, 33, 37]. In terms of the theory of $U_q(sl_n)$-crystals [16, 17] these paths correspond to the highest weight vectors of tensor products of perfect crystals. The statistic is given by the energy function on paths.
The $U_q(sl_n)$-crystal structure on paths can be extended to a $U_q(\widehat{sl}_n)$-crystal structure [18]. The level-restricted paths are the subset of classically restricted paths which, after tensoring with the crystal graph of a suitable integrable highest weight $U_q(\widehat{sl}_n)$-module, are affine highest weight vectors. Hence it is natural to consider the generating functions of level-restricted paths, giving rise to level-restricted generalized Kostka polynomials which will take a lead rôle in this paper. The notion of level-restriction is also very important in the context of restricted-solid-on-solid (RSOS) models in statistical mechanics [3] and fusion models in conformal field theory [39]. The one-dimensional configuration sums of RSOS models are generating functions of level-restricted paths (see for example [2, 9, 14]). The structure constants of the fusion algebras of Wess–Zumino–Witten conformal field theories are exactly the level-restricted analogues of the Littlewood–Richardson coefficients in (1.1) as shown by Kac [15, Exercise 13.35] and Walton [40, 41]. q-Analogues of these level-restricted Littlewood–Richardson coefficients in terms of ribbon tableaux were proposed in ref. [10].
The generalized Kostka polynomial admits a fermionic (or quasi-particle) formula [25]. Fermionic formulas originate from the Bethe Ansatz [4], which is a technique to construct eigenvectors and eigenvalues of row-to-row transfer matrices of statistical mechanical models. Under certain assumptions (the string hypothesis) it is possible to count the solutions of the Bethe equations, resulting in fermionic expressions which look like sums of products of binomial coefficients. The Kostka numbers arise in the study of the XXX model in this way [22–24]. Fermionic formulas are of interest in physics since they reflect the particle structure of the underlying model [20, 21] and also reveal information about the exclusion statistics of the particles [5–7].
The fermionic formula of the Kostka polynomial can be combinatorialized by taking a weighted sum over sets of rigged configurations [22–24]. In ref.
[25] the fermionic formula for the generalized Kostka polynomial was proven by establishing a statistic-preserving bijection between Littlewood–Richardson tableaux and rigged configurations.
In this paper we show that this bijection is well-behaved with respect to level-restriction and we give an explicit characterization of level-restricted rigged configurations (see Definition 5.5 and Theorem 8.2). This enables us to obtain a combinatorial formula for the level-restricted generalized Kostka polynomials as the generating function of level-restricted rigged configurations (see Theorem 5.7). As an immediate consequence this proves a new general fermionic formula for the level-restricted generalized Kostka polynomial (see Theorem 6.2 and Eq. (6.7)). Special cases of this formula were conjectured in refs. [8, 12, 13, 27, 33, 42]. As opposed to some definitions of “fermionic formulas” the expression of Theorem 6.2 involves in general explicit negative signs. However, we would like to point out that because of the equivalent combinatorial formulation in terms of rigged configurations as given in Theorem 5.7 the fermionic sum is manifestly positive (i.e., a polynomial with positive coefficients).
The branching functions of type A can be described in terms of crystal graphs of irreducible integrable highest weight $U_q(\widehat{sl}_n)$-modules. For certain triples of weights they can be expressed as limits of level-restricted generalized Kostka polynomials. The structure of the rigged configurations allows one to take this limit, thereby yielding a fermionic formula for the corresponding branching functions (see Eq. (7.10)). The derivation of this formula requires the knowledge of the ground state energy, which is obtained from the explicit construction of certain local isomorphisms of perfect crystals (see Theorem 7.3). A more complete set of branching functions can be obtained by considering “skew” level-restricted generalized Kostka polynomials. We conjecture that rigged configurations are also well-behaved with respect to skew shapes (see Conjecture 8.3).
The paper is structured as follows. Section 2 sets out notation used in the paper. In Sect. 3 we review some crystal theory, in particular the definition of level-restricted paths, which are used to define the level-restricted generalized Kostka polynomials. Littlewood–Richardson tableaux and their level-restricted counterparts are defined in Sect. 4. The formulation of the generalized Kostka polynomials in terms of Littlewood–Richardson tableaux with charge statistic is necessary for the proof of the fermionic formula, which makes use of the bijection between Littlewood–Richardson tableaux and rigged configurations. The latter are the subject of Sect. 5, which also contains the new definition of level-restricted rigged configurations and our main Theorem 5.7. The proof of this theorem is reserved for Sect. 8. The fermionic formulas for the level-restricted Kostka polynomial and the type A branching functions are given in Sects. 6 and 7, respectively.
2. Notation
All partitions are assumed to have $n$ parts, some of which may be zero. Let $R = (R_1, R_2, \dots, R_L)$ be a sequence of partitions whose Ferrers diagrams are rectangles. Let $R_j$ have $\mu_j$ columns and $\eta_j$ rows for $1 \le j \le L$. We adopt the English notation for partitions and tableaux. Unless otherwise specified, all tableaux are assumed to be column-strict (that is, the entries in each row weakly increase from left to right and in each column strictly increase from top to bottom).
3. Paths
The main goal of this section is to define the level-restricted generalized Kostka polynomials. These polynomials are defined in terms of certain finite $U_q(\widehat{sl}_n)$-crystal graphs whose elements are called paths.
The theory of crystal graphs was invented by Kashiwara [16], who showed that the quantized universal enveloping algebras of Kac–Moody algebras and their integrable highest weight modules admit special bases whose structure at $q = 0$ is specified by a colored graph known as the crystal graph. The crystal graphs for the finite-dimensional irreducible modules for the classical Lie algebras were computed explicitly by Kashiwara and Nakashima [17]. The theory of perfect crystals gave a realization of the crystal graphs of the irreducible integrable highest weight modules for affine Kac–Moody algebras, as certain eventually periodic sequences of elements taken from finite crystal graphs [19]. This realization is used for the main application, some new explicit formulas for coset branching functions of type A.
3.1. Crystal graphs. Let $U_q(\mathfrak{g})$ be the quantized universal enveloping algebra for the Kac–Moody algebra $\mathfrak{g}$. Let $I$ be an indexing set for the Dynkin diagram of $\mathfrak{g}$, $P$ the weight lattice of $\mathfrak{g}$, $P^*$ the dual lattice, $\{\alpha_i \mid i \in I\}$ the (not necessarily linearly independent) simple roots, $\{h_i \mid i \in I\}$ the simple coroots, and $\{\Lambda_i \mid i \in I\}$ the fundamental weights. Let $\langle \cdot\,, \cdot \rangle$ denote the natural pairing of $P^*$ and $P$.
Suppose $V$ is a $U_q(\mathfrak{g})$-module with crystal graph $B$. Then $B$ is a directed graph whose vertex set (also denoted $B$) indexes a basis of weight vectors of $V$, and has directed edges colored by the elements of the set $I$. The edges may be viewed as a combinatorial version of the action of Chevalley generators. This graph has the property that for every $b \in B$ and $i \in I$, there is at most one edge colored $i$ entering (resp. leaving) $b$. If there is an edge $b \to b'$ colored $i$, denote this by $f_i(b) = b'$ and $e_i(b') = b$. If there is no edge
colored $i$ leaving $b$ (resp. entering $b'$) then say that $f_i(b)$ (resp. $e_i(b')$) is undefined. The $f_i$ and $e_i$ are called Kashiwara lowering and raising operators. Define $\phi_i(b)$ (resp. $\epsilon_i(b)$) to be the maximum $m \in \mathbb{N}$ such that $f_i^m(b)$ (resp. $e_i^m(b)$) is defined. There is a weight function $\mathrm{wt} : B \to P$ that satisfies the following properties:
\[ \mathrm{wt}(f_i(b)) = \mathrm{wt}(b) - \alpha_i, \qquad \mathrm{wt}(e_i(b)) = \mathrm{wt}(b) + \alpha_i, \qquad \langle h_i, \mathrm{wt}(b) \rangle = \phi_i(b) - \epsilon_i(b). \tag{3.1} \]
$B$ is called a $P$-weighted $I$-crystal. Let $P^+ = \{\Lambda \in P \mid \langle h_i, \Lambda \rangle \ge 0,\ \forall i \in I\}$ be the set of dominant integral weights. For $\Lambda \in P^+$ denote by $V(\Lambda)$ the irreducible integrable highest weight $U_q(\mathfrak{g})$-module of highest weight $\Lambda$. Let $B(\Lambda)$ be its crystal graph. Say that an element $b \in B$ of the $P$-weighted $I$-crystal $B$ is a highest weight vector if $\epsilon_i(b) = 0$ for all $i \in I$. Let $u_\Lambda$ be the highest weight vector in $B(\Lambda)$. By (3.1), for all $i \in I$,
\[ \epsilon_i(u_\Lambda) = 0, \qquad \phi_i(u_\Lambda) = \langle h_i, \Lambda \rangle. \tag{3.2} \]
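Let $B'$ be the crystal graph of a $U_q(\mathfrak{g})$-module $V'$. A morphism of $P$-weighted $I$-crystals is a map $\tau : B \to B'$ such that $\mathrm{wt}(\tau(b)) = \mathrm{wt}(b)$ and $\tau(f_i(b)) = f_i(\tau(b))$ for all $b \in B$ and $i \in I$. In particular $f_i(b)$ is defined if and only if $f_i(\tau(b))$ is.
Suppose $V$ and $V'$ are $U_q(\mathfrak{g})$-modules with crystal graphs $B$ and $B'$ respectively. Then $V \otimes V'$ admits a crystal graph denoted $B \otimes B'$ which is equal to the direct product $B \times B'$ as a set. We use the opposite of the convention used in the literature. Define
\[ f_i(b \otimes b') = \begin{cases} b \otimes f_i(b') & \text{if } \phi_i(b') > \epsilon_i(b), \\ f_i(b) \otimes b' & \text{if } \phi_i(b') \le \epsilon_i(b) \text{ and } \phi_i(b) > 0, \\ \text{undefined} & \text{otherwise.} \end{cases} \tag{3.3} \]
Equivalently,
\[ e_i(b \otimes b') = \begin{cases} e_i(b) \otimes b' & \text{if } \phi_i(b') < \epsilon_i(b), \\ b \otimes e_i(b') & \text{if } \phi_i(b') \ge \epsilon_i(b) \text{ and } \epsilon_i(b') > 0, \\ \text{undefined} & \text{otherwise.} \end{cases} \tag{3.4} \]
One has
\[ \phi_i(b \otimes b') = \phi_i(b) + \max\{0, \phi_i(b') - \epsilon_i(b)\}, \qquad \epsilon_i(b \otimes b') = \max\{0, \epsilon_i(b) - \phi_i(b')\} + \epsilon_i(b'). \tag{3.5} \]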
Finally $\mathrm{wt} : B \otimes B' \to P$ is defined by $\mathrm{wt}(b \otimes b') = \mathrm{wt}_B(b) + \mathrm{wt}_{B'}(b')$, where $\mathrm{wt}_B : B \to P$ and $\mathrm{wt}_{B'} : B' \to P$ are the weight functions for $B$ and $B'$. This construction is “associative”, that is, the $P$-weighted $I$-crystals form a tensor category.
Remark 3.1. It follows from (3.4) that if $b = b_L \otimes \cdots \otimes b_1$ and $e_i(b)$ is defined, then $e_i(b) = b_L \otimes \cdots \otimes b_{j+1} \otimes e_i(b_j) \otimes b_{j-1} \otimes \cdots \otimes b_1$ for some $1 \le j \le L$.
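The tensor product rule can be checked mechanically for a single color $i$. The following is a minimal sketch, assuming each crystal element is represented only by its pair $(\phi_i, \epsilon_i)$; the names `tensor` and `factor_hit_by_f` are illustrative and not taken from the paper. It encodes (3.5) and the case distinction of (3.3), then verifies by brute force on small values that the rule is associative and that $f_i$ is defined exactly when $\phi_i(b \otimes b') > 0$.

```python
from itertools import product

def tensor(x, y):
    """(phi_i, eps_i) of b (x) b' computed from x = (phi_i(b), eps_i(b))
    and y = (phi_i(b'), eps_i(b')), following (3.5)."""
    phi, eps = x
    phi2, eps2 = y
    return (phi + max(0, phi2 - eps), max(0, eps - phi2) + eps2)

def factor_hit_by_f(x, y):
    """Tell which tensor factor f_i changes in b (x) b', following (3.3)."""
    phi, eps = x
    phi2, _ = y
    if phi2 > eps:
        return "right"   # f_i(b (x) b') = b (x) f_i(b')
    if phi > 0:
        return "left"    # f_i(b (x) b') = f_i(b) (x) b'
    return None          # f_i is undefined on b (x) b'

# Brute-force check on small values (an illustration, not a proof).
values = [(p, q) for p in range(4) for q in range(4)]
for a, b, c in product(values, repeat=3):
    assert tensor(tensor(a, b), c) == tensor(a, tensor(b, c))
for a, b in product(values, repeat=2):
    assert (factor_hit_by_f(a, b) is not None) == (tensor(a, b)[0] > 0)
```

Working with the pairs alone is enough here because (3.3) and (3.5) depend on $b$ and $b'$ only through these two statistics.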
3.2. $U_q(sl_n)$-crystal graphs on tableaux. Let $J = \{1, 2, \dots, n-1\}$ be the indexing set for the Dynkin diagram of type $A_{n-1}$, with weight lattice $P_{\mathrm{fin}}$, simple roots $\{\overline{\alpha}_i \mid i \in J\}$, fundamental weights $\{\overline{\Lambda}_i \mid i \in J\}$, and simple coroots $\{h_i \mid i \in J\}$. Let $\lambda = (\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n) \in \mathbb{N}^n$ be a partition. There is a natural projection $\mathbb{Z}^n \to P_{\mathrm{fin}}$ denoted $\lambda \mapsto \overline{\lambda} = \sum_{i=1}^{n-1} (\lambda_i - \lambda_{i+1}) \overline{\Lambda}_i$. Let $V(\overline{\lambda})$ be the irreducible integrable highest weight module of highest weight $\overline{\lambda}$ over the quantized universal enveloping algebra $U_q(sl_n)$ [17]. By abuse of notation we shall write $V^\lambda = V(\overline{\lambda})$ and denote the crystal graph of $V^\lambda$ by $B_\lambda$. As a set $B_\lambda$ may be realized as the set of tableaux of shape $\lambda$ over the alphabet $\{1, 2, \dots, n\}$. Define the content of $b \in B_\lambda$ by $\mathrm{content}(b) = (c_1, \dots, c_n) \in \mathbb{N}^n$, where $c_j$ is the number of times the letter $j$ appears in $b$. The weight function $\mathrm{wt} : B_\lambda \to P_{\mathrm{fin}}$ is given by sending $b$ to the image of $\mathrm{content}(b)$ under the projection $\mathbb{Z}^n \to P_{\mathrm{fin}}$.
The row-reading word of $b$ is defined by $\mathrm{word}(b) = \cdots w_2 w_1$, where $w_r$ is the word obtained by reading the $r$-th row of $b$ from left to right. This definition is useful even in the context that $b$ is a skew tableau.
The edges of $B_\lambda$ are given as follows. First let $v$ be a word in the alphabet $\{1, 2, \dots, n\}$. View each letter $i$ (resp. $i+1$) of $v$ as a closing (resp. opening) parenthesis, ignoring other letters. Now iterate the following step: declare each adjacent pair of matched parentheses to be invisible. Repeat this until there are no matching pairs of visible parentheses. At the end the result must be a sequence of closing parentheses (say $p$ of them) followed by a sequence of opening parentheses (say $q$ of them). The unmatched (visible) subword is of the form $i^p (i+1)^q$. If $p > 0$ (resp. $q > 0$) then $f_i(v)$ (resp. $e_i(v)$) is obtained from $v$ by replacing the unmatched subword $i^p (i+1)^q$ by $i^{p-1} (i+1)^{q+1}$ (resp. $i^{p+1} (i+1)^{q-1}$). Then $\phi_i(v) = p$, $\epsilon_i(v) = q$, and $f_i(v)$ (resp. $e_i(v)$) is defined if and only if $p > 0$ (resp. $q > 0$).
For the tableau $b \in B_\lambda$, let $f_i(b)$ be undefined if $f_i(\mathrm{word}(b))$ is; otherwise define $f_i(b)$ to be the unique (not necessarily column-strict) tableau of shape $\lambda$ such that $\mathrm{word}(f_i(b)) = f_i(\mathrm{word}(b))$. It is easy to verify that when defined, $f_i(b)$ is a column-strict tableau. Consequently $\phi_i(b) = \phi_i(\mathrm{word}(b))$. The operator $e_i$ and the quantity $\epsilon_i(b)$ are defined similarly.
3.3. $U_q'(\widehat{sl}_n)$-crystal structure on rectangular tableaux. There is an inclusion of algebras $U_q(sl_n) \subset U_q'(\widehat{sl}_n)$, where $U_q'(\widehat{sl}_n)$ is the quantized universal enveloping algebra corresponding to the derived subalgebra $\widehat{sl}_n'$ of the affine Kac–Moody algebra $\widehat{sl}_n$ [15]. Let $I = \{0, 1, 2, \dots, n-1\}$ be the index set for the Dynkin diagram of $A^{(1)}_{n-1}$. Let $P_{\mathrm{cl}}$ be the weight lattice of $\widehat{sl}_n'$, with (linearly dependent) simple roots $\{\alpha_i^{\mathrm{cl}} \mid i \in I\}$, simple coroots $\{h_i \mid i \in I\}$, and fundamental weights $\{\Lambda_i^{\mathrm{cl}} \mid i \in I\}$. The simple roots satisfy the relation $\alpha_0^{\mathrm{cl}} = -\sum_{i \in J} \alpha_i^{\mathrm{cl}}$. There is a natural projection $P_{\mathrm{cl}} \to P_{\mathrm{fin}}$ with kernel $\mathbb{Z}\Lambda_0^{\mathrm{cl}}$ such that $\Lambda_i^{\mathrm{cl}} \mapsto \overline{\Lambda}_i$ for $i \in J$ and $\Lambda_0^{\mathrm{cl}} \mapsto 0$. Let $\mathrm{cl} : P_{\mathrm{fin}} \to P_{\mathrm{cl}}$ be the section of the above projection defined by $\mathrm{cl}(\overline{\Lambda}_i) = \Lambda_i^{\mathrm{cl}} - \Lambda_0^{\mathrm{cl}}$ for $i \in J$. Let $c \in \widehat{sl}_n'$ be the canonical central element. The level of a weight $\Lambda \in P_{\mathrm{cl}}$ is defined by $\langle c, \Lambda \rangle$. Let $(P_{\mathrm{cl}}^+)_\ell = \{\Lambda \in P_{\mathrm{cl}}^+ \mid \langle c, \Lambda \rangle = \ell\}$.
Suppose $V$ is a finite-dimensional $U_q'(\widehat{sl}_n)$-module that has a crystal graph $B$ (not all do); $B$ is a $P_{\mathrm{cl}}$-weighted $I$-crystal. A weight function $\mathrm{wt}_{\mathrm{cl}} : B \to P_{\mathrm{cl}}$ may be given by $\mathrm{wt}_{\mathrm{cl}}(b) = \mathrm{cl}(\mathrm{wt}(b))$, where $\mathrm{wt} : B \to P_{\mathrm{fin}}$ is the weight function on the set $B$ viewed as a $U_q(sl_n)$-crystal graph. In addition to being a $U_q(sl_n)$-crystal graph, $B$ also has some
edges colored 0. The action of $U_q(sl_n)$ on $V^\lambda$ extends to an action of $U_q'(\widehat{sl}_n)$ which admits a crystal structure, if and only if the partition $\lambda$ is a rectangle [18, 30]. If $\lambda$ is the rectangle with $k$ rows and $m$ columns, then write $V^{k,m}$ for the $U_q'(\widehat{sl}_n)$-module with $U_q(sl_n)$-structure $V^\lambda$ and denote its crystal graph by $B^{k,m}$.
If one of $m$ or $k$ is 1, then it is easy to give $e_0$ and $f_0$ explicitly on $B^{k,m}$, for in this case the weight spaces of $V^{k,m}$ are one-dimensional, and the zero edges can be deduced from (3.1) [18]. The general case is given as follows [37]. We shall first define a content-rotating bijection $\psi^{-1} : B^{k,m} \to B^{k,m}$. Let $b \in B^{k,m}$ be a tableau, say of content $(c_1, c_2, \dots, c_n)$. Then $\psi^{-1}(b)$ will have content $(c_2, c_3, \dots, c_n, c_1)$. Remove all the letters 1 from $b$, leaving a vacant horizontal strip of size $c_1$ in the northwest corner of $b$. Compute Schensted's $P$ tableau [34] of the row-reading word of this skew subtableau. It can be shown that this yields a tableau of the shape obtained by removing $c_1$ cells from the last row of the rectangle $(m^k)$. Subtract one from the value of each entry of this tableau, and then fill in the $c_1$ vacant cells in the last row of the rectangle $(m^k)$ with the letter $n$. It can be shown that $\psi^{-1}$ is a well-defined bijection, whose inverse $\psi$ can be given by a similar algorithm. Then
\[ f_i = \psi^{-1} \circ f_{i+1} \circ \psi, \qquad e_i = \psi^{-1} \circ e_{i+1} \circ \psi \tag{3.6} \]
for all $i$, where indices are taken modulo $n$; in particular for $i = 0$ this defines explicitly the operators $e_0$ and $f_0$.
3.4. Sequences of rectangular tableaux. For a sequence of rectangles $R$, consider the tensor product $V^{R_L} \otimes \cdots \otimes V^{R_1}$. Its $U_q'(\widehat{sl}_n)$-crystal graph has underlying set $\mathcal{P}_R = B_{R_L} \otimes \cdots \otimes B_{R_1}$, where the tensor symbols denote the Cartesian product of sets. A typical element of $\mathcal{P}_R$ is called a path and is written $b = b_L \otimes \cdots \otimes b_2 \otimes b_1$, where $b_j \in B_{R_j}$ is a tableau of shape $R_j$. The edges of the crystal graph $\mathcal{P}_R$ are given explicitly as follows. Define the word of a path $b$ by $\mathrm{word}(b) = \mathrm{word}(b_L) \cdots \mathrm{word}(b_2)\, \mathrm{word}(b_1)$. Then for $i = 1, 2, \dots, n-1$ (as in the definition of $f_i$ for $b \in B_\lambda$), if $f_i(\mathrm{word}(b))$ is undefined, let $f_i(b)$ be undefined; otherwise it is not hard to see that there is a unique path $f_i(b) \in \mathcal{P}_R$ such that $\mathrm{word}(f_i(b)) = f_i(\mathrm{word}(b))$. To define $f_0$, let $\psi(b) = \psi(b_L) \otimes \cdots \otimes \psi(b_1)$ and $f_0 = \psi^{-1} \circ f_1 \circ \psi$. This definition is equivalent to that given by taking the above definition of $f_i$ on the crystals $B_{R_j}$ and then applying the rule for lowering operators on tensor products (3.3). The action of $e_i$ for $i \in I$ is defined analogously.
3.5. Integrable affine crystals. Consider the affine Kac–Moody algebra $\widehat{sl}_n$, with weight lattice $P_{\mathrm{af}}$, independent simple roots $\{\alpha_i \mid i \in I\}$, simple coroots $\{h_i \mid i \in I\}$, and fundamental weights $\{\Lambda_i \mid i \in I\}$. Let $\delta \in P_{\mathrm{af}}$ be the null root. There is a natural projection which we shall by abuse of notation also call $\mathrm{cl} : P_{\mathrm{af}} \to P_{\mathrm{cl}}$ such that $\mathrm{cl}(\delta) = 0$ and $\mathrm{cl}(\Lambda_i) = \Lambda_i^{\mathrm{cl}}$ for $i \in I$. Write $\mathrm{af} : P_{\mathrm{cl}} \to P_{\mathrm{af}}$ for the section of $\mathrm{cl}$ given by $\mathrm{af}(\Lambda_i^{\mathrm{cl}}) = \Lambda_i$ for $i \in I$.
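The parenthesis-matching rule of Sect. 3.2, which by Sect. 3.4 also governs the operators $f_i$ and $e_i$ ($1 \le i \le n-1$) on paths through their words, can be sketched as follows. This is a minimal illustration that assumes words are stored as Python lists of integers; the function names are hypothetical, and the operators for $i = 0$ (which need the bijection $\psi$) are not covered.

```python
def unmatched(word, i):
    """Positions of the unmatched letters i (closing) and i+1 (opening)."""
    closings, stack = [], []
    for pos, letter in enumerate(word):
        if letter == i + 1:          # opening parenthesis
            stack.append(pos)
        elif letter == i:            # closing parenthesis
            if stack:
                stack.pop()          # matched pair becomes invisible
            else:
                closings.append(pos)
    return closings, stack           # stack now holds the unmatched openings

def phi(word, i):
    return len(unmatched(word, i)[0])    # p in the text

def eps(word, i):
    return len(unmatched(word, i)[1])    # q in the text

def f(word, i):
    """i^p (i+1)^q -> i^(p-1) (i+1)^(q+1); None if undefined."""
    closings, _ = unmatched(word, i)
    if not closings:
        return None
    out = list(word)
    out[closings[-1]] = i + 1        # rightmost unmatched i becomes i+1
    return out

def e(word, i):
    """i^p (i+1)^q -> i^(p+1) (i+1)^(q-1); None if undefined."""
    _, openings = unmatched(word, i)
    if not openings:
        return None
    out = list(word)
    out[openings[0]] = i             # leftmost unmatched i+1 becomes i
    return out

def is_sl_n_highest_weight(word, n):
    """eps_i(word) = 0 for all i in J = {1, ..., n-1}."""
    return all(eps(word, i) == 0 for i in range(1, n))

v = [1, 2, 2, 1, 1, 2]               # for i = 1 this reads ) ( ( ) ) (
assert (phi(v, 1), eps(v, 1)) == (1, 1)
assert f(v, 1) == [2, 2, 2, 1, 1, 2]
assert e(f(v, 1), 1) == v
```

A single left-to-right pass with a stack produces the same unmatched subword as the iterative "make matched pairs invisible" description, since every unmatched closing parenthesis is met while the stack is empty.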
Let $\Lambda \in P_{\mathrm{cl}}^+$ be a dominant integral weight and $B(\Lambda)$ the crystal graph of the irreducible integrable highest weight $U_q'(\widehat{sl}_n)$-module of highest weight $\Lambda$. If $\Lambda \ne 0$ then $B(\Lambda)$ is infinite. The set of weights in $P_{\mathrm{af}}$ that project by $\mathrm{cl}$ to $\Lambda$ is given by $\mathrm{cl}^{-1}(\Lambda) = \{\mathrm{af}(\Lambda) + j\delta \mid j \in \mathbb{Z}\}$. Now fix $j$. The irreducible integrable highest weight $U_q(\widehat{sl}_n)$-crystal graph $B(\mathrm{af}(\Lambda) + j\delta)$ may be identified with $B(\Lambda)$ as sets and as $I$-crystals (independent of $j$). The weight functions for $B(\mathrm{af}(\Lambda) + j\delta)$ and $B(\mathrm{af}(\Lambda))$ differ by the global constant $j\delta$. The weight function $B(\Lambda) \to P_{\mathrm{cl}}$ is obtained by composing the weight function for $B(\mathrm{af}(\Lambda) + j\delta)$ with the projection $\mathrm{cl} : P_{\mathrm{af}} \to P_{\mathrm{cl}}$. The set $B(\Lambda)$ is then endowed with an induced $\mathbb{Z}$-grading $E : B(\Lambda) \to \mathbb{N}$ defined by $E(b) = -\langle d, \mathrm{wt}(b) \rangle$, where $B(\Lambda)$ is identified with $B(\mathrm{af}(\Lambda))$, $\mathrm{wt} : B(\mathrm{af}(\Lambda)) \to P_{\mathrm{af}}$ is the weight function and $d \in P_{\mathrm{af}}^*$ is the degree generator. The map $\langle d, \cdot \rangle$ takes the coefficient of $\delta$ of an element in $P_{\mathrm{af}}$ when written in the basis $\{\Lambda_i \mid i \in I\} \cup \{\delta\}$.
3.6. Energy function on finite paths. The set of paths $\mathcal{P}_R$ has a natural statistic called the energy function. The definitions here follow [30].
Consider first the case that $R = (R_1, R_2)$ is a sequence of two rectangles. Let $B_j = B_{R_j}$ for $1 \le j \le 2$. Since $B_2 \otimes B_1$ is a connected crystal graph, there is a unique $U_q'(\widehat{sl}_n)$-crystal graph isomorphism
\[ \sigma : B_2 \otimes B_1 \cong B_1 \otimes B_2. \tag{3.7} \]
This is called the local isomorphism (see Sect. 4.4 for an explicit construction). Write $\sigma(b_2 \otimes b_1) = b_1' \otimes b_2'$. Then there is a unique (up to a global additive constant) map $H : B_2 \otimes B_1 \to \mathbb{Z}$ such that
\[ H(e_i(b_2 \otimes b_1)) = H(b_2 \otimes b_1) + \begin{cases} -1 & \text{if } i = 0,\ e_0(b_2 \otimes b_1) = e_0 b_2 \otimes b_1 \text{ and } e_0(b_1' \otimes b_2') = e_0 b_1' \otimes b_2', \\ 1 & \text{if } i = 0,\ e_0(b_2 \otimes b_1) = b_2 \otimes e_0 b_1 \text{ and } e_0(b_1' \otimes b_2') = b_1' \otimes e_0 b_2', \\ 0 & \text{otherwise.} \end{cases} \tag{3.8} \]
This map is called the local energy function. By definition it is invariant under the local isomorphism and under $f_i$ and $e_i$ for $i \in J$. Let us normalize it by the condition that $H(u_2 \otimes u_1) = |R_1 \cap R_2|$, where $u_j$ is the $U_q(sl_n)$ highest weight vector of $B_j$ for $1 \le j \le 2$, $R_1 \cap R_2$ is the intersection of the Ferrers diagrams of $R_1$ and $R_2$, and $|R_1 \cap R_2|$ is the number of cells in this intersection. Explicitly $|R_1 \cap R_2| = \min\{\eta_1, \eta_2\} \min\{\mu_1, \mu_2\}$. If $\eta_1 + \eta_2 \le n$ then the local energy function attains precisely the values from 0 to $|R_1 \cap R_2|$.
Now let $R = (R_1, \dots, R_L)$ be a sequence of rectangles and $b = b_L \otimes \cdots \otimes b_1 \in \mathcal{P}_R$. For $1 \le p \le L-1$ let $\sigma_p$ denote the local isomorphism that exchanges the tensor factors in the $p$-th and $(p+1)$-th positions. For $1 \le i < j \le L$, let $b_j^{(i+1)}$ be the $(i+1)$-th tensor factor in $\sigma_{i+1} \sigma_{i+2} \cdots \sigma_{j-1}(b)$. Then define the energy function
\[ E(b) = \sum_{1 \le i < j \le L} H\bigl(b_j^{(i+1)} \otimes b_i\bigr). \tag{3.9} \]
The value of the energy function is unchanged under local isomorphisms and under $e_i$ and $f_i$ for $i \in J$, since the local energy function has this property. The next lemma follows from the definition of the local energy function.
Lemma 3.2. Suppose $b = b_L \otimes \cdots \otimes b_1 \in \mathcal{P}_R$ is such that $e_0(b)$ is defined and for any image $b' = b_L' \otimes \cdots \otimes b_1'$ of $b$ under a composition of local isomorphisms, $e_0(b') = b_L' \otimes \cdots \otimes b_{j+1}' \otimes e_0(b_j') \otimes b_{j-1}' \otimes \cdots \otimes b_1'$, where $j \ne 1$. Then $E(e_0(b)) = E(b) - 1$.
If all rectangles $R_j$ are the same then each of the local isomorphisms is the identity and
\[ E(b) = \sum_{1 \le i \le L-1} (L - i)\, H(b_{i+1} \otimes b_i). \tag{3.10} \]
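The weighted sum (3.10) is easy to mistype when indices shift from 1-based to 0-based, so a short sketch may help. It assumes the path is given as a list `[b_1, ..., b_L]` and that the local energy is supplied by the caller as a function `H(x, y)` standing for $H(x \otimes y)$. The function `toy_H` below is a purely hypothetical stand-in used to exercise the formula; it is not the local energy function of the paper.

```python
def energy_equal_rectangles(path, H):
    """E(b) = sum_{i=1}^{L-1} (L - i) * H(b_{i+1} (x) b_i), following (3.10).

    path = [b_1, ..., b_L]; H(x, y) is a caller-supplied local energy.
    """
    L = len(path)
    # 0-based index t corresponds to i = t + 1, so the weight L - i is L - 1 - t.
    return sum((L - 1 - t) * H(path[t + 1], path[t]) for t in range(L - 1))

def toy_H(x, y):
    """Hypothetical placeholder, NOT the local energy function of the paper."""
    return 1 if x > y else 0

# For L = 3 the formula reads E(b) = 2 H(b_2 (x) b_1) + H(b_3 (x) b_2).
assert energy_equal_rectangles([1, 2, 1], toy_H) == 2 * toy_H(2, 1) + toy_H(1, 2)
```

Passing `H` as a parameter keeps the sketch independent of how the local energy function is actually computed for a given pair of rectangles.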
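Say that $b \in \mathcal{P}_R$ is classically restricted if it is an $sl_n$-highest weight vector, that is, $\epsilon_i(b) = 0$ for all $i \in J$. Equivalently, $\mathrm{word}(b)$ is a (reverse) lattice permutation (every final subword has partition content). Let $\mathcal{P}_{\Lambda R}$ be the set of classically restricted paths in $\mathcal{P}_R$ of weight $\Lambda \in P_{\mathrm{cl}}$. It was shown in [37] that the generalized Kostka polynomial (which was originally defined in terms of Littlewood–Richardson tableaux; see (4.3)) can be expressed as
\[ K_{\lambda R}(q) = \sum_{b \in \mathcal{P}_{\mathrm{cl}(\lambda) R}} q^{E(b)}. \tag{3.11} \]
This extends the path formulation of the Kostka polynomial by Nakayashiki and Yamada [30].
3.7. Level-restricted paths. Let $B$ be any $P_{\mathrm{cl}}$-weighted $I$-crystal and $\Lambda \in P_{\mathrm{cl}}^+$. Say that $b \in B$ is $\Lambda$-restricted if $b \otimes u_\Lambda$ is a highest weight vector in the $P_{\mathrm{cl}}$-weighted $I$-crystal $B \otimes B(\Lambda)$, that is, $\epsilon_i(b \otimes u_\Lambda) = 0$ for all $i \in I$. Equivalently $\epsilon_i(b) \le \langle h_i, \Lambda \rangle$ for all $i \in I$ by (3.5) and (3.2). Denote by $\mathcal{H}(\Lambda, B)$ the set of elements $b \in B$ that are $\Lambda$-restricted. If $\Lambda' \in P_{\mathrm{cl}}^+$ has the same level as $\Lambda$, define $\mathcal{H}(\Lambda, B, \Lambda')$ to be the set of $b \in \mathcal{H}(\Lambda, B)$ such that $\mathrm{wt}(b) = \Lambda' - \Lambda \in P_{\mathrm{cl}}$, that is, the set of $b \in B$ such that $b \otimes u_\Lambda$ is a highest weight vector of weight $\Lambda'$.
Say that the element $b$ is restricted of level $\ell$ if it is $(\ell\Lambda_0)$-restricted. Such paths are also classically restricted since $\langle h_i, \ell\Lambda_0 \rangle = 0$ for $i \in J$. Let $\mathcal{P}^{\ell}_{\Lambda R}$ denote the set of paths in $\mathcal{P}_{\Lambda R}$ that are restricted of level $\ell$. Letting $B = \mathcal{P}_R$, this is the same as saying $\mathcal{P}^{\ell}_{\Lambda R} = \mathcal{H}(\ell\Lambda_0, B, \Lambda + \ell\Lambda_0)$.
Define the level-restricted generalized Kostka polynomial by
\[ K^{\ell}_{\lambda R}(q) = \sum_{b \in \mathcal{P}^{\ell}_{\mathrm{cl}(\lambda) R}} q^{E(b)}. \tag{3.12} \]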
3.8. Perfect crystals. This section is needed to compute the coset branching functions in Sect. 7. We follow [19], stating the definitions in the case of $\widehat{sl}_n$. For any $U_q'(\widehat{sl}_n)$-crystal $B$, define $\epsilon, \phi : B \to P_{\mathrm{cl}}$ by $\epsilon(b) = \sum_{i \in I} \epsilon_i(b) \Lambda_i$ and $\phi(b) = \sum_{i \in I} \phi_i(b) \Lambda_i$. Now let $\ell$ be a positive integer and $B$ the crystal graph of a finite dimensional irreducible $U_q'(\widehat{sl}_n)$-module $V$. Say that $B$ is perfect of level $\ell$ if
(1) $B \otimes B$ is connected.
(2) There is a weight $\Lambda' \in P_{\mathrm{cl}}$ such that $B$ has a unique vector of weight $\Lambda'$ and all other vectors in $B$ have lower weight in the Chevalley order, that is, $\mathrm{wt}(B) \subset \Lambda' - \sum_{i \in J} \mathbb{N}\alpha_i$.
(3) $\ell = \min_{b \in B} \langle c, \epsilon(b) \rangle$.
(4) The maps $\epsilon$ and $\phi$ restrict to bijections $B_{\min} \to (P_{\mathrm{cl}}^+)_\ell$, where $B_{\min} \subset B$ is the set of $b \in B$ achieving the minimum in (3).
For $\widehat{sl}_n$ the perfect crystals of level $\ell$ are precisely those of the form $B^{k,\ell}$ for $1 \le k \le n-1$ [18, 30]. Let $B = B^{k,\ell}$. The weight $\Lambda'$ can be taken to be $\ell(\Lambda_k^{\mathrm{cl}} - \Lambda_0^{\mathrm{cl}})$.
Example 3.3. We describe the bijections $\epsilon, \phi : B_{\min} \to (P_{\mathrm{cl}}^+)_\ell$ in this example. Let $B = B^{k,\ell}$. For this example let $n = 6$, $k = 3$, $\ell = 5$, and consider the weight $\Lambda = 2\Lambda_0 + \Lambda_1 + \Lambda_2 + \Lambda_4$. As usual subscripts are identified modulo $n$. The unique tableau $b \in B^{k,\ell}$ such that $\phi(b) = \Lambda$ is constructed as follows. First let $T$ be the following tableau of shape $(\ell^k)$. Its bottom row contains $\langle h_i, \Lambda \rangle$ copies of the letter $i$ for $1 \le i \le n$ (here it is 12466 since the sequence of $\langle h_i, \Lambda \rangle$ for $1 \le i \le 6$ is $(1, 1, 0, 1, 0, 2)$). Let every letter in $T$ have value one smaller than the letter directly below it. Here we have
\[ T = \begin{matrix} -1 & 0 & 2 & 4 & 4 \\ 0 & 1 & 3 & 5 & 5 \\ 1 & 2 & 4 & 6 & 6 \end{matrix}. \]
Let $T_-$ be the subtableau of $T$ consisting of the entries that are nonpositive and $T_+$ the rest. Say $T_-$ has shape $\nu$ (here $\nu = (2, 1)$). Let $\widetilde{\nu} = (\ell^k) - (\nu_k, \nu_{k-1}, \dots, \nu_1)$ (here $\widetilde{\nu} = (5, 4, 3)$). The desired tableau $b$ is defined as follows. The restriction of $b$ to the shape $\widetilde{\nu}$ is $P(T_+)$, or equivalently, the tableau obtained by taking the skew tableau $T_+$ and first pushing all letters straight upwards to the top of the bounding rectangle $(\ell^k)$, and then pushing all letters straight to the left inside $(\ell^k)$. The restriction of $b$ to $(\ell^k)/\widetilde{\nu}$ is the tableau of that skew shape in the alphabet $\{1, 2, \dots, n\}$ with maximal entries, that is, its bottom row is filled with the letter $n$, the next-to-bottom row is filled with the letter $n-1$, etc. In the example,
\[ b = \begin{matrix} 1 & 1 & 2 & 4 & 4 \\ 2 & 3 & 5 & 5 & 5 \\ 4 & 6 & 6 & 6 & 6 \end{matrix}. \]
To construct the unique element $b' \in B^{k,\ell}$ such that $\epsilon(b') = \Lambda$, let $U$ be the tableau whose first row has $\langle h_i, \Lambda \rangle$ copies of the letter $i+1$ for $1 \le i \le n$, again identifying subscripts modulo $n$; here $U$ has first row 11235. Now let the rest of $U$ be defined by letting each entry have value one greater than the entry above it. So
\[ U = \begin{matrix} 1 & 1 & 2 & 3 & 5 \\ 2 & 2 & 3 & 4 & 6 \\ 3 & 3 & 4 & 5 & 7 \end{matrix}. \]
Let $U_-$ be the subtableau of $U$ consisting of the values that are at most $n$. Let $\mu$ be the shape of $U_-$ and $\widetilde{\mu} = (\ell^k) - (\mu_k, \mu_{k-1}, \dots, \mu_1)$. Here $\mu = (5, 5, 4)$ and $\widetilde{\mu} = (1, 0, 0)$. The element $b'$ is defined as follows. Its restriction to the skew shape $(\ell^k)/\widetilde{\mu}$ is the unique skew tableau $V$ of that shape such that $P(V) = U_-$, or equivalently, this restriction is obtained by taking the tableau $U_-$, pushing all letters directly down within the rectangle $(\ell^k)$ and then pushing all letters to the right within $(\ell^k)$. The restriction of $b'$ to the
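shape $\widetilde{\mu}$ is filled with the smallest letters possible, so that the first row of this subtableau consists of ones, the second row consists of twos, etc. Here
\[ b' = \begin{matrix} 1 & 1 & 1 & 2 & 3 \\ 2 & 2 & 3 & 4 & 5 \\ 3 & 3 & 4 & 5 & 6 \end{matrix}. \]
The main theorem for perfect crystals is:
Theorem 3.4 ([19]). Let $B$ be a perfect crystal of level $\ell'$ and $\Lambda \in (P_{\mathrm{cl}}^+)_\ell$ with $\ell \ge \ell'$. Then there is an isomorphism of $U_q'(\widehat{sl}_n)$-crystals
\[ B \otimes B(\Lambda) \cong \bigoplus_{b \in \mathcal{H}(\Lambda, B)} B(\Lambda + \mathrm{wt}(b)). \tag{3.13} \]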
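The two constructions of Example 3.3 are purely mechanical, and the following sketch reproduces them under a few assumptions of ours: the weight $\Lambda$ is passed as the list of values $\langle h_i, \Lambda \rangle$ for $i = 1, \dots, n$ (the index $n$ being identified with 0), tableaux are lists of rows from top to bottom, and the function names are illustrative. The sketch follows the "push up, then push left" and "push down, then push right" descriptions rather than computing Schensted's $P$ tableau.

```python
def phi_preimage(coroots, k, level):
    """Tableau b of shape (level^k) with phi(b) = Lambda, as in Example 3.3.
    coroots[i-1] = <h_i, Lambda> for i = 1..n."""
    n = len(coroots)
    bottom = [i for i in range(1, n + 1) for _ in range(coroots[i - 1])]
    assert len(bottom) == level
    # T: every entry is one smaller than the entry directly below it.
    T = [[x - (k - 1 - r) for x in bottom] for r in range(k)]
    # T_+ column by column (positive entries); listing them from the top
    # amounts to pushing the letters straight upwards.
    cols = [[T[r][c] for r in range(k) if T[r][c] > 0] for c in range(level)]
    # Reading row r across the columns that reach it pushes letters left.
    rows = [[col[r] for col in cols if len(col) > r] for r in range(k)]
    # Remaining cells get maximal entries: n in the bottom row, n-1 above, ...
    return [row + [n - (k - 1 - r)] * (level - len(row))
            for r, row in enumerate(rows)]

def eps_preimage(coroots, k, level):
    """Tableau b' of shape (level^k) with eps(b') = Lambda, as in Example 3.3."""
    n = len(coroots)
    top = sorted((i % n) + 1 for i in range(1, n + 1)
                 for _ in range(coroots[i - 1]))
    assert len(top) == level
    # U: every entry is one greater than the entry directly above it.
    U = [[x + r for x in top] for r in range(k)]
    # U_- column by column (entries at most n).
    cols = [[U[r][c] for r in range(k) if U[r][c] <= n] for c in range(level)]
    # Push down: a column of length m occupies the bottom m rows.
    rows = [[col[r - (k - len(col))] for col in cols if r >= k - len(col)]
            for r in range(k)]
    # Push right, then fill the vacated cells of row r with the letter r+1.
    return [[r + 1] * (level - len(row)) + row for r, row in enumerate(rows)]

# Lambda = 2*Lambda_0 + Lambda_1 + Lambda_2 + Lambda_4 with n = 6, k = 3, level 5.
coroots = [1, 1, 0, 1, 0, 2]
assert phi_preimage(coroots, 3, 5) == [[1, 1, 2, 4, 4],
                                       [2, 3, 5, 5, 5],
                                       [4, 6, 6, 6, 6]]
assert eps_preimage(coroots, 3, 5) == [[1, 1, 1, 2, 3],
                                       [2, 2, 3, 4, 5],
                                       [3, 3, 4, 5, 6]]
```

Both assertions compare against the tableaux $b$ and $b'$ printed in the example; the sketch has only been checked on this one input.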
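Suppose now that $B$ is perfect of level $\ell$ and $\Lambda \in (P_{\mathrm{cl}}^+)_\ell$. Write $b(\Lambda)$ for the unique element of $B$ such that $\phi(b(\Lambda)) = \Lambda$. Theorem 3.4 (with $\Lambda$ therein replaced by $\Lambda' = \epsilon(b(\Lambda))$) says that $B \otimes B(\epsilon(b(\Lambda))) \cong B(\Lambda)$ with corresponding highest weight vectors $b(\Lambda) \otimes u_{\epsilon(b(\Lambda))} \mapsto u_\Lambda$. This isomorphism can be iterated. Let $\sigma : B_{\min} \to B_{\min}$ be the unique bijection defined by $\phi \circ \sigma = \epsilon$. Then there are isomorphisms $B^{\otimes N} \otimes B(\phi(\sigma^N(b(\Lambda)))) \cong B(\Lambda)$ such that the highest weight vector of the left-hand side is given by $b(\Lambda) \otimes \sigma(b(\Lambda)) \otimes \sigma^2(b(\Lambda)) \otimes \cdots \otimes \sigma^{N-1}(b(\Lambda)) \otimes u_{\phi(\sigma^N(b(\Lambda)))}$.
For the $U_q'(\widehat{sl}_n)$ perfect crystals $B^{k,\ell}$, it can be shown that the map $\sigma$ is none other than the power $\psi^{-k}$ of the content rotating map $\psi$. Moreover if $\sigma$ is extended to a bijection $\sigma : B^{k,\ell} \to B^{k,\ell}$ by defining $\sigma = \psi^{-k}$, then the extended function also satisfies $\phi(\sigma(b)) = \epsilon(b)$ for all $b \in B^{k,\ell}$, not just for $b \in B_{\min}$. Since the bijection $\psi$ on $B^{k,\ell}$ has order $n$, the bijection $\sigma$ has order $n/\gcd(n, k)$.
The ground state path for the pair $(\Lambda, B)$ is by definition the infinite periodic sequence $\overline{b} = \overline{b}_1 \otimes \overline{b}_2 \otimes \cdots$, where $\overline{b}_i = \sigma^{i-1}(b(\Lambda))$. Let $\mathcal{P}(\Lambda, B)$ be the set of all semi-infinite sequences $b = b_1 \otimes b_2 \otimes \cdots$ of elements in $B$ such that $b$ eventually agrees with the ground state path $\overline{b}$ for $(\Lambda, B)$. Then the set $\mathcal{P}(\Lambda, B)$ has the structure of the crystal $B(\Lambda)$ with highest weight vector $u_\Lambda = \overline{b}$ and weight function $\mathrm{wt}(b) = \sum_{i \ge 1} (\mathrm{wt}(b_i) - \mathrm{wt}(\overline{b}_i))$. To recover the weight function of the $U_q(\widehat{sl}_n)$-crystal $B(\mathrm{af}(\Lambda))$, define the energy function on $\mathcal{P}(\Lambda, B)$ by
\[ E(b) = \sum_{i \ge 1} i \bigl( H(b_i \otimes b_{i+1}) - H(\overline{b}_i \otimes \overline{b}_{i+1}) \bigr) \tag{3.14} \]
and define the map $B(\mathrm{af}(\Lambda)) \to P_{\mathrm{af}}$ by $b \mapsto \mathrm{wt}(b) - E(b)\delta$, where $\mathrm{wt} : B(\Lambda) \to P_{\mathrm{cl}}$.
$\mathcal{P}(\Lambda, B)$ can be regarded as a direct limit of the finite crystals $B^{\otimes N}$. Define the embedding $i_N : B^{\otimes N} \to \mathcal{P}(\Lambda, B)$ by $b_1 \otimes \cdots \otimes b_N \mapsto b_1 \otimes b_2 \otimes \cdots \otimes b_N \otimes \overline{b}_{N+1} \otimes \overline{b}_{N+2} \otimes \cdots$. Define $E_N : B^{\otimes N} \to \mathbb{Z}$ by $E_N(b_1 \otimes \cdots \otimes b_N) = E(b_1 \otimes \cdots \otimes b_N \otimes \overline{b}_{N+1})$, where the $E$ on the right-hand side is the energy function for the finite path space $B^{\otimes N+1}$. By definition for all $p = b_1 \otimes \cdots \otimes b_N \in B^{\otimes N}$, $E(i_N(p)) = E_N(p) - E_N(\overline{b}_1 \otimes \cdots \otimes \overline{b}_N)$. Note that the last fixed step $\overline{b}_{N+1}$ is necessary to make the energy function on the finite paths stable under the embeddings into $\mathcal{P}(\Lambda, B)$.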
3.9. Standardization embeddings. We require certain embeddings of finite path spaces. Given a sequence of rectangles R, let r(R) denote the sequence of rectangles given by splitting the rectangles of R into their constituent rows. For example, if R = ((1), (2, 2)), then r(R) = ((1), (2), (2)). There is a unique embedding
\[
i_R : P_R \hookrightarrow P_{r(R)} \tag{3.15}
\]
defined as follows. Its explicit computation is based on transforming R into r(R) using two kinds of steps.

(1) Suppose $R_1$ has more than one row ($\eta_1 > 1$). Then use the transformation $R \to R^< = ((\mu_1), (\mu_1^{\eta_1 - 1}), R_2, R_3, \ldots, R_L)$. Informally, $R^<$ is obtained from R by splitting off the first row of $R_1$. There is an associated embedding of $U_q(\widehat{sl}_n)$-crystal graphs $i_R^< : P_R \to P_{R^<}$ defined by the property that $\mathrm{word}(i_R^<(b)) = \mathrm{word}(b)$ for all $b \in P_R$. Here it is crucial that the rectangle being split horizontally is the first one, for otherwise the embedding does not preserve the edges labeled by 0.

(2) If $\eta_1 = 1$, then use a transformation of the form $R \to s_p R$ for some p. Here $s_p R$ denotes the sequence of rectangles obtained by exchanging the p-th and (p+1)-th rectangles in R. The associated isomorphism of $U_q(\widehat{sl}_n)$-crystal graphs is the local isomorphism $\sigma_p : P_R \to P_{s_p R}$ defined before.

It is clear that one can transform R into r(R) using these two kinds of steps. Now fix one such sequence of steps leading from R to r(R), say $R = R^{(0)} \to R^{(1)} \to \cdots \to R^{(N)} = r(R)$, where each $R^{(m)}$ is a sequence of rectangles and each step $R^{(m-1)} \to R^{(m)}$ is one of the two types defined above. Define the map $i^{(m)} : P_{R^{(m-1)}} \hookrightarrow P_{R^{(m)}}$ by $i^{(m)} = i^<_{R^{(m-1)}}$ if the step is of the first kind, and by $i^{(m)} = \sigma_p$ if it is of the second kind. Let $i_R : P_R \to P_{r(R)}$ be the composition $i_R = i^{(N)} \circ \cdots \circ i^{(1)}$. It can be shown that the map $i_R$ does not depend on the sequence of the $R^{(m)}$; this is proven in the equivalent language of Littlewood–Richardson tableaux in [36].
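The purely combinatorial part of these steps, namely what happens to the sequence of rectangles, can be sketched as follows. This is only an illustration: rectangles are stored as tuples of equal parts, and the function names `split_into_rows`, `split_first_row` and `swap` are ours, not notation from the text. The crystal embeddings themselves are not modelled.

```python
def split_into_rows(R):
    """Return r(R): every rectangle of R replaced by its constituent rows."""
    return [(row,) for rect in R for row in rect]


def split_first_row(R):
    """Step of the first kind: R -> R^<, splitting the first row off R_1.

    Only meaningful when the first rectangle has more than one row.
    """
    first = R[0]
    if len(first) < 2:
        raise ValueError("first rectangle must have more than one row")
    return [first[:1], first[1:]] + list(R[1:])


def swap(R, p):
    """Step of the second kind: R -> s_p R, exchanging the p-th and (p+1)-th
    rectangles (p is 1-based, as in the text)."""
    R = list(R)
    R[p - 1], R[p] = R[p], R[p - 1]
    return R


# The example from the text: R = ((1), (2, 2)) gives r(R) = ((1), (2), (2)).
R = [(1,), (2, 2)]
print(split_into_rows(R))            # [(1,), (2,), (2,)]
print(split_first_row(swap(R, 1)))   # [(2,), (2,), (1,)]
```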
4. Littlewood–Richardson Tableaux

We now review some formulations of type A tensor product multiplicities that use tableaux. These tableaux, which we call Littlewood–Richardson (LR) tableaux, are the intermediate combinatorial objects between paths and rigged configurations, which give rise to fermionic expressions. For the most part, the material in this section is taken from [33, 35–37].

4.1. Three formulations. Let $I_1, I_2, \ldots, I_L$ be intervals of integers such that if $i < j$, $x \in I_i$ and $y \in I_j$, then $x < y$. Set $I = \bigcup_{j=1}^{L} I_j$. For each $1 \le j \le L$, fix a tableau $Z_j$ of shape $R_j$ in the alphabet $I_j$. Define the set SLR(λ; Z) to be the set of tableaux Q of shape λ in the alphabet I such that $P(Q|_{I_j}) = Z_j$ for all j, where $Q|_{I_j}$ denotes the skew subtableau of Q obtained by restricting to the alphabet $I_j$, and P(S) denotes the Schensted P-tableau [34] of the row-reading word of the skew tableau S. It is well-known that $|\mathrm{SLR}(\lambda; Z)| = c_R^\lambda$, where $c_R^\lambda$ was defined in (1.1).

We shall define three kinds of LR tableaux given by SLR(λ; Z) for various choices of intervals $I_j$ and tableaux $Z_j$.
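As an illustration of the membership condition for SLR(λ; Z), the following sketch computes the Schensted P-tableau by ordinary row insertion and tests $P(Q|_{I_j}) = Z_j$ for every j. The function names are ours; tableaux are lists of rows, intervals are pairs (lo, hi), and the row-reading word is taken bottom row first, each row left to right. The data are the intervals $B_1 = \{1\}$, $B_2 = \{2, 3, 4, 5\}$ and the columnwise tableaux of Example 4.1 below.

```python
from bisect import bisect_right


def schensted_p(word):
    """P-tableau of a word under row insertion (rows weakly increasing)."""
    rows = []
    for x in word:
        for row in rows:
            i = bisect_right(row, x)      # first entry strictly greater than x
            if i == len(row):
                row.append(x)
                break
            row[i], x = x, row[i]         # bump that entry into the next row
        else:
            rows.append([x])              # bumped out of the last row
    return rows


def restricted_word(Q, interval):
    """Row-reading word of Q restricted to the letters in [lo, hi]."""
    lo, hi = interval
    return [x for row in reversed(Q) for x in row if lo <= x <= hi]


def in_slr(Q, intervals, Z):
    """Check P(Q|I_j) == Z_j for all j."""
    return all(schensted_p(restricted_word(Q, I)) == Zj
               for I, Zj in zip(intervals, Z))


intervals = [(1, 1), (2, 5)]
ZC = [[[1]], [[2, 4], [3, 5]]]
print(in_slr([[1, 2, 4], [3, 5]], intervals, ZC))   # True
print(in_slr([[1, 3, 5], [2, 4]], intervals, ZC))   # False
```

The second tableau fails because the restriction to $B_2$ has reading word 2 4 3 5, whose P-tableau has rows 2 3 5 and 4.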
(1) LR(λ; R): Define the set of intervals of integers $I_j = A_j = [\eta_1 + \cdots + \eta_{j-1} + 1, \eta_1 + \cdots + \eta_{j-1} + \eta_j]$. Let $Z_j = Y_j$ be the tableau of shape $R_j$ whose r-th row is filled with copies of the r-th smallest letter of $A_j$, namely, $\eta_1 + \cdots + \eta_{j-1} + r$. Define LR(λ; R) := SLR(λ; Y). When R consists of single rows (that is, $\eta_j = 1$ for all j), then LR(λ; R) = CST(λ; µ), the (column-strict) tableaux of shape λ and content µ.

(2) CLR(λ; R) (Columnwise LR): Let $ZC_1$ be the standard tableau of shape $R_1$ obtained by placing the numbers 1 through $\eta_1$ down the first column, the next $\eta_1$ numbers down the second column, etc. Continue this process to obtain $ZC_2$, starting with the next available number, namely, $\eta_1\mu_1 + 1$. Explicitly, for $1 \le j \le L$, the (r, c)-th entry in the j-th tableau $ZC_j$ is equal to $\eta_1\mu_1 + \cdots + \eta_{j-1}\mu_{j-1} + (c-1)\eta_j + r$. Let $B_j$ be the interval consisting of the entries of the tableau $ZC_j$. Define CLR(λ; R) := SLR(λ; ZC).

(3) RLR(λ; R) (Rowwise LR): Define this similarly to CLR(λ; R) but label by rows, so that the (r, c)-th entry of $ZR_j$ is $\eta_1\mu_1 + \cdots + \eta_{j-1}\mu_{j-1} + (r-1)\mu_j + c$. Then let RLR(λ; R) := SLR(λ; ZR).

Example 4.1. Let R = ((1), (2, 2)) and λ = (3, 2). Here $A_1 = \{1\}$, $A_2 = \{2, 3\}$, and
\[
Y_1 = 1 \quad\text{and}\quad Y_2 = \begin{array}{cc} 2 & 2 \\ 3 & 3 \end{array}.
\]
We have $B_1 = \{1\}$, $B_2 = \{2, 3, 4, 5\}$,
\[
ZC_1 = 1 \quad\text{and}\quad ZC_2 = \begin{array}{cc} 2 & 4 \\ 3 & 5 \end{array}, \qquad
ZR_1 = 1 \quad\text{and}\quad ZR_2 = \begin{array}{cc} 2 & 3 \\ 4 & 5 \end{array}.
\]
Observe that
\[
T = \begin{array}{ccc} 1 & 2 & 4 \\ 3 & 5 & \end{array}
\]
is in CLR(λ; R) since $P(T|_{B_1}) = 1 = ZC_1$ and $P(T|_{B_2}) = ZC_2$. On the other hand
\[
T' = \begin{array}{ccc} 1 & 3 & 5 \\ 2 & 4 & \end{array}
\]
is not in CLR(λ; R) since
\[
P(T'|_{B_2}) = \begin{array}{ccc} 2 & 3 & 5 \\ 4 & & \end{array} \ne ZC_2.
\]

4.2. Obvious bijections among the various LR tableaux. There are trivial relabeling bijections between the various kinds of LR tableaux defined above. We give them explicitly here for later use.

(1) The bijection $\gamma_R$ : CLR(λ; R) → RLR(λ; R) is given by the following relabeling. Consider an entry x in a standard tableau S ∈ CLR(λ; R). Then x appears in one of the ZC tableaux, say, it is the (r, c)-th entry of $ZC_j$. Let y be the (r, c)-th entry of the rowwise tableau $ZR_j$. Then replace x by y in S. Performing all such replacements simultaneously yields $\gamma_R(S)$ ∈ RLR(λ; R).
(2) The bijection std : LR(λ; R) → RLR(λ; R) is given by Schensted's standardization map [34]. Let Q ∈ LR(λ; R) and i be some entry in Q. Suppose i is the r-th smallest value in the subinterval $A_j$. Replace the occurrences of the letter i in Q from left to right by the consecutive integers given by the r-th row of $ZR_j$. The result of these substitutions is std(Q) ∈ RLR(λ; R).

(3) Define a bijection $\beta_R$ : LR(λ; R) → CLR(λ; R) by $\gamma_R^{-1} \circ \mathrm{std}$.

(4) Observe that ordinary transposition of standard tableaux restricts to a bijection tr : RLR(λ; R) ↔ CLR($\lambda^t$; $R^t$), where $\lambda^t$ denotes the transpose partition of λ and $R^t = (R_1^t, R_2^t, \ldots, R_L^t)$.

(5) There is a bijection $\mathrm{tr}_{LR}$ : CLR(λ; R) → CLR($\lambda^t$; $R^t$) defined by $\mathrm{tr}_{LR} = \mathrm{tr} \circ \gamma_R$.

4.3. Paths to tableau pairs. The Robinson–Schensted–Knuth correspondence allows one to pass from paths to pairs of tableaux. This bijection gives a combinatorial decomposition of the crystal graph of $P_R$ into $U_q(sl_n)$ irreducible components and encodes the energy function in the recording tableau. The column insertion version of the Robinson–Schensted–Knuth correspondence restricts to a bijection
\[
\mathrm{RSK} : P_R \to \bigcup_{\lambda} \mathrm{CST}(\lambda; \cdot) \times \mathrm{LR}(\lambda; R) \tag{4.1}
\]
as follows. Let $b = b_L \otimes \cdots \otimes b_2 \otimes b_1 \in P_R$. Define P(b) := P(word(b)). This can be computed by the column insertion of word(b) starting from the right end. Recall that $b_j$ and $Y_j$ are column-strict tableaux of shape $R_j$. Let Q(b) be the tableau obtained by recording the insertion of a letter in $b_j$ by the letter in the corresponding position in $Y_j$. It can be shown that Q(b) ∈ LR(λ; R), and that the map (4.1) given by b → (P(b), Q(b)) is a bijection.

Remark 4.2.
(1) This bijection is a morphism of $U_q(sl_n)$-crystal graphs in the sense that $P(e_i(b)) = e_i(P(b))$ for i ∈ J. In particular, b ∈ $P_R$ is classically restricted if and only if P(b) is a Yamanouchi tableau, that is, its r-th row is filled with copies of the letter r for all 1 ≤ r ≤ n.
(2) The energy function on paths can be transferred easily to a statistic on LR(λ; R) called the generalized charge (written $c_R$) such that $c_R(Q(b)) = E(b)$. The generalized charge is defined explicitly in (4.2) below.

Example 4.3. Let R = ((1), (2, 2)) and b ∈ $P_R$ given by
\[
b = \begin{array}{cc} 1 & 1 \\ 2 & 2 \end{array} \otimes 1.
\]
Then word(b) = 2211 1 and
\[
P(b) = \begin{array}{ccc} 1 & 1 & 1 \\ 2 & 2 & \end{array}, \qquad
Q(b) = \begin{array}{ccc} 1 & 2 & 2 \\ 3 & 3 & \end{array}.
\]
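The construction of (P(b), Q(b)) can be traced with a short sketch. It assumes the standard column insertion rule (a letter x replaces the smallest entry ≥ x in a column, and the replaced entry moves to the next column) and reads word(b) from its right end as described above. The function names and the list-of-columns representation are our own choices for illustration.

```python
def column_insert(cols, x):
    """Column-insert x; cols is a list of strictly increasing columns.

    Returns the (row, column) position of the cell that is created.
    """
    c = 0
    while True:
        if c == len(cols):
            cols.append([x])
            return (0, c)
        col = cols[c]
        pos = next((r for r, y in enumerate(col) if y >= x), None)
        if pos is None:                 # x is larger than the whole column
            col.append(x)
            return (len(col) - 1, c)
        col[pos], x = x, col[pos]       # bump into the next column
        c += 1


def rows_of(cells):
    """Convert a {(row, col): value} dictionary into a list of rows."""
    nrows = 1 + max(r for r, _ in cells)
    lengths = [sum(1 for (rr, _) in cells if rr == r) for r in range(nrows)]
    return [[cells[r, c] for c in range(lengths[r])] for r in range(nrows)]


def rsk(path):
    """path = [(b_1, Y_1), ..., (b_L, Y_L)], each tableau a list of rows."""
    cols, Q = [], {}
    for b, Y in path:                   # b_1 is the right end of word(b)
        for r, row in enumerate(b):     # reversed reading: top row first,
            for c in reversed(range(len(row))):   # each row right to left
                Q[column_insert(cols, row[c])] = Y[r][c]
    P = {(r, c): col[r] for c, col in enumerate(cols) for r in range(len(col))}
    return rows_of(P), rows_of(Q)


# Example 4.3: b_1 = 1 with Y_1 = 1, b_2 = [[1,1],[2,2]] with Y_2 = [[2,2],[3,3]].
P, Q = rsk([([[1]], [[1]]), ([[1, 1], [2, 2]], [[2, 2], [3, 3]])])
print(P)   # [[1, 1, 1], [2, 2]]
print(Q)   # [[1, 2, 2], [3, 3]]
```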
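4.4. Generalized Automorphisms of Conjugation. For the moment let R = (R1, R2) and Bj = BRj for 1 ≤ j ≤ 2. Recall that the local isomorphism (3.7) is the unique isomorphism of $U_q(\widehat{sl}_n)$-crystal graphs $B_2 \otimes B_1 \to B_1 \otimes B_2$ or equivalently $P_{(R_1,R_2)} \to P_{(R_2,R_1)}$. Let us make this more explicit. By Remark 4.2 we have a commutative diagram of bijections
\[
\begin{array}{ccc}
P_{(R_1,R_2)} & \xrightarrow{\ \mathrm{RSK}\ } & \bigcup_\lambda \mathrm{CST}(\lambda) \times \mathrm{LR}(\lambda; (R_1, R_2)) \\
{\scriptstyle \sigma}\downarrow & & \downarrow{\scriptstyle 1 \times s} \\
P_{(R_2,R_1)} & \xrightarrow{\ \mathrm{RSK}\ } & \bigcup_\lambda \mathrm{CST}(\lambda) \times \mathrm{LR}(\lambda; (R_2, R_1))
\end{array}
\]
such that P(σ(b)) = P(b). This induces a bijection
\[
s : \mathrm{LR}(\lambda; (R_1, R_2)) \to \mathrm{LR}(\lambda; (R_2, R_1))
\]
for each λ.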
The tensor product $V^{R_2} \otimes V^{R_1}$ is multiplicity-free. Therefore the domain and codomain of s are both empty or both singletons. Hence the bijection s is unique and can be computed from the definition of the set LR. Then σ(b) can be computed by applying RSK to obtain (P(b), Q(b)), then applying s to get (P(b), s(Q(b))), and finally, the inverse of RSK to obtain σ(b).

The local energy function is recovered using only the shape of the tableau pair. For a tableau Q ∈ LR(λ; (R1, R2)) let d(Q) be the number of cells in Q that lie strictly to the right of the $\max\{\mu_1, \mu_2\}$-th column, or equivalently, strictly to the right of the shape $R_1 \cup R_2$. Then H(b) = d(Q(b)).

Then the $U_q(\widehat{sl}_n)$-crystal graph isomorphism $\sigma_p : P_R \to P_{s_p R}$ induces involutions $s_p$ : LR(λ; R) → LR(λ; $s_p R$) such that the diagram commutes:
\[
\begin{array}{ccc}
P_R & \xrightarrow{\ \mathrm{RSK}\ } & \bigcup_\lambda \mathrm{CST}(\lambda) \times \mathrm{LR}(\lambda; R) \\
{\scriptstyle \sigma_p}\downarrow & & \downarrow{\scriptstyle 1 \times s_p} \\
P_{s_p R} & \xrightarrow{\ \mathrm{RSK}\ } & \bigcup_\lambda \mathrm{CST}(\lambda) \times \mathrm{LR}(\lambda; s_p R).
\end{array}
\]
The map sp is computed explicitly as follows [37]. Let Q ∈ LR(λ; R) and Aj be the alphabets as in the definition of LR(λ; R). Remove the skew subtableau U = Q|Ap ∪Ap+1 . Use the usual column insertion of its row reading word, obtaining a pair of tableaux (P , Q ), where P ∈ LR(ρ; (Rp , Rp+1 )) for some partition ρ and Q is the standard column insertion tableau. Next replace P by s(P ), where s is the unique bijection LR(ρ; (Rp , Rp+1 )) → LR(ρ; (Rp+1 , Rp )). Finally, pull back the pair of tableaux (s(P ), Q ) under column insertion to obtain a word which turns out to be the row reading word of a skew column-strict tableau V of the same shape as U . Then sp (Q) is obtained by replacing U by V . The bijections sp specialize to the automorphisms of conjugation of Lascoux and Schützenberger [29] in the case that R consists of single rows. It is shown in [37] that the bijections σp and sp define an action of the symmetric group SL on paths and LR tableaux respectively. Specifically, for w ∈ SL let w = si1 si2 . . . siN be any factorization of w into adjacent transpositions si = (i, i + 1). For b ∈ PR , define wb = σi1 σi2 . . . σiN b ∈ PwR . For Q ∈ LR(λ; R) define wQ = si1 si2 . . . siN Q ∈ LR(λ; wR).
4.5. Generalized charge. The generalized charge on Q ∈ LR(λ; R) is defined by [35, 33]
\[
c_R(Q) = \frac{1}{L!} \sum_{w \in S_L} \sum_{i=1}^{L-1} (L - i)\, d_{i,wR}(wQ), \tag{4.2}
\]
where $d_{i,R}(Q) = d(P(\mathrm{word}(Q|_{A_i \cup A_{i+1}})))$ and d is understood to be the function $d : \mathrm{LR}(\rho; (R_i, R_{i+1})) \to \mathbb{N}$. It was shown in [33, Section 6] and [35] that $\mathrm{LR}(R) = \bigcup_\lambda \mathrm{LR}(\lambda; R)$ has the structure of a graded poset with covering relation given by the R-cocyclage and grading function given by the generalized charge. The generalized Kostka polynomial is by definition the generating function of LR tableaux with the charge statistic [33, 35]
\[
K_{\lambda R}(q) = \sum_{T \in \mathrm{LR}(\lambda; R)} q^{c_R(T)}. \tag{4.3}
\]
This extends the charge representation of the Kostka polynomial $K_{\lambda\mu}(q)$ of Lascoux and Schützenberger [28, 29]. For a path b ∈ $P_R$ one has $E(b) = c_R(Q(b))$ [37], so the formulas (3.11) and (4.3) are equivalent.

4.6. Embeddings of LR tableaux. The embeddings (3.15) of sets of paths induce embeddings
\[
i_R : \mathrm{LR}(\lambda; R) \hookrightarrow \mathrm{LR}(\lambda; r(R)) \tag{4.4}
\]
via RSK. These maps are defined in [33, 36]. In the notation of [25, Sect. 8.4] they are denoted $\theta_R^{r(R)}$. They are given by compositions of the generalized automorphisms of conjugation $s_p$ and by the embeddings of the form $i_R^< : \mathrm{LR}(\lambda; R) \to \mathrm{LR}(\lambda; R^<)$ (which is just the inclusion map). These embeddings preserve the R-cocyclage poset structure and the generalized charge, since they are induced by maps that preserve the $U_q(\widehat{sl}_n)$-crystal graph structure.

4.7. Level-restricted LR tableaux. Say that a tableau Q ∈ LR(λ; R) is restricted of level ℓ if there is a level-restricted path $b \in P^\ell_{\lambda R}$ such that Q = Q(b). Denote the set of such tableaux by $\mathrm{LR}^\ell(\lambda; R)$.

Example 4.4. Suppose each rectangle is a single row so that LR(λ; R) = CST(λ; µ). In this case let us write $\mathrm{CST}^\ell(\lambda; \mu) = \mathrm{LR}^\ell(\lambda; R)$. The following explicit rule appears in [11]. Let Q ∈ CST(λ; µ). The tableau Q may be viewed as a sequence of shapes $\emptyset = \lambda^{(0)} \subset \lambda^{(1)} \subset \cdots \subset \lambda^{(L)} = \lambda$, where $\lambda^{(j)}$ is the shape of $Q|_{[1,j]}$. Then Q is restricted of level ℓ if
\[
\lambda_1^{(j)} - \lambda_n^{(j-1)} \le \ell \quad \text{for all } 1 \le j \le L. \tag{4.5}
\]
In the further special case that $R_j = (1)$ for all j, write ST(λ) = LR(λ; R) for the set of standard tableaux of shape λ and write $\mathrm{ST}^\ell(\lambda) = \mathrm{LR}^\ell(\lambda; R)$ for the level-restricted subset. For S ∈ ST(λ), associate the chain of shapes $\lambda^{(j)}$ as above. Since passing from $\lambda^{(j-1)}$ to $\lambda^{(j)}$ adds only one additional cell, the condition (4.5) simplifies to
\[
\lambda_1^{(j)} - \lambda_n^{(j)} \le \ell \quad \text{for all } 1 \le j \le L. \tag{4.6}
\]
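The rule of Example 4.4 is easy to test mechanically. The sketch below checks condition (4.5), read as $\lambda_1^{(j)} - \lambda_n^{(j-1)} \le \ell$, for a column-strict tableau given as a list of rows. The function name and the argument order are ours, and n must be at least the number of rows of Q.

```python
def is_level_restricted(Q, n, level):
    """Check condition (4.5) for a column-strict tableau Q with at most n rows."""
    largest = max(letter for row in Q for letter in row)
    prev = [0] * n                                  # shape of Q|[1, j-1]
    for j in range(1, largest + 1):
        # shape of Q restricted to letters 1..j, padded with zeros to n rows
        cur = [sum(1 for x in row if x <= j) for row in Q]
        cur += [0] * (n - len(cur))
        if cur[0] - prev[n - 1] > level:
            return False
        prev = cur
    return True


# Standard tableaux with n = 2:
print(is_level_restricted([[1, 2], [3]], 2, 2))   # True
print(is_level_restricted([[1, 2, 3]], 2, 2))     # False: first row reaches 3
```

For standard tableaux this agrees with the simplified condition (4.6), since each step adds a single cell.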
For general R it is possible to transfer the condition of level-restriction on paths to an explicit condition on LR tableaux. However for our purposes it is more convenient to use the following description of $\mathrm{LR}^\ell(\lambda; R)$. Since the embedding (4.4) is induced by the embedding (3.15) that preserves the $U_q(\widehat{sl}_n)$-crystal graph structure, it follows that
\[
\mathrm{LR}^\ell(\lambda; R) = \{ Q \in \mathrm{LR}(\lambda; R) \mid i_R(Q) \in \mathrm{CST}^\ell(\lambda; r(R)) \}. \tag{4.7}
\]
Hence an expression for the level-restricted generalized Kostka polynomials equivalent to (3.12) is
\[
K^\ell_{\lambda R}(q) = \sum_{T \in \mathrm{LR}^\ell(\lambda; R)} q^{c_R(T)}.
\]
5. Rigged Configurations

This section follows [25, Sect. 2.2], with the notational difference that here $R_j$ is a rectangle with $\mu_j$ columns and $\eta_j$ rows. The reason for this is that here we work with RC(λ; R) rather than RC($\lambda^t$; $R^t$) as in [25].

5.1. Review of definitions. A (λ; R)-configuration is a sequence of partitions $\nu = (\nu^{(1)}, \nu^{(2)}, \ldots)$ with the size constraints
\[
|\nu^{(k)}| = \sum_{j > k} \lambda_j - \sum_{a=1}^{L} \mu_a \max\{\eta_a - k, 0\} \tag{5.1}
\]
for k ≥ 0, where by convention $\nu^{(0)}$ is the empty partition. If λ has at most n parts all partitions $\nu^{(k)}$ for k ≥ n are empty. For a partition ρ, define $m_i(\rho)$ to be the number of parts equal to i and
\[
Q_i(\rho) = \rho_1^t + \rho_2^t + \cdots + \rho_i^t = \sum_{j \ge 1} \min\{i, \rho_j\},
\]
the size of the first i columns of ρ. Let $\xi^{(k)}(R)$ be the partition whose parts are the widths of the rectangles in R of height k. The vacancy numbers for the (λ; R)-configuration ν are the numbers (indexed by k ≥ 1 and i ≥ 0) defined by
\[
P_i^{(k)}(\nu) = Q_i(\nu^{(k-1)}) - 2 Q_i(\nu^{(k)}) + Q_i(\nu^{(k+1)}) + Q_i(\xi^{(k)}(R)). \tag{5.2}
\]
In particular $P_0^{(k)}(\nu) = 0$ for all k ≥ 1. The (λ; R)-configuration ν is said to be admissible if $P_i^{(k)}(\nu) \ge 0$ for all k, i ≥ 1, and the set of admissible (λ; R)-configurations is denoted by C(λ; R). Following [26, (3.2)], set
\[
cc(\nu) = \sum_{k, i \ge 1} \alpha_i^{(k)} \left( \alpha_i^{(k)} - \alpha_i^{(k+1)} \right),
\]
where $\alpha_i^{(k)}$ is the size of the i-th column in $\nu^{(k)}$. Define the charge c(ν) of a configuration ν ∈ C(λ; R) by
\[
c(\nu) = \|R\| - cc(\nu) - |P| \quad\text{with}\quad
\|R\| = \sum_{1 \le i < j \le L} |R_i \cap R_j| \quad\text{and}\quad
|P| = \sum_{k, i \ge 1} m_i(\nu^{(k)}) P_i^{(k)}(\nu).
\]
Observe that c(ν) depends on both ν and R but cc(ν) depends only on ν.

Example 5.1. Let λ = (3, 2, 2, 1) and R = ((2), (2, 2), (1, 1)). Then ν = ((2), (2, 1), (1)) is a (λ; R)-configuration with $\xi^{(1)}(R) = (2)$ and $\xi^{(2)}(R) = (2, 1)$. The configuration ν may be represented as
\[
\begin{array}{rl} 1 & \square\square \end{array} \qquad
\begin{array}{rl} 0 & \square\square \\ 0 & \square \end{array} \qquad
\begin{array}{rl} 0 & \square \end{array}
\]
where the vacancy numbers are indicated to the left of each part. In addition cc(ν) = 3, $\|R\| = 5$, |P| = 1 and c(ν) = 1.

Define the q-binomial by
\[
\begin{bmatrix} m + p \\ m \end{bmatrix} = \frac{(q)_{m+p}}{(q)_m (q)_p}
\]
for m, p ∈ N and zero otherwise, where $(q)_m = (1 - q)(1 - q^2) \cdots (1 - q^m)$. The following fermionic or quasi-particle expression of the generalized Kostka polynomials is a variant of [25, Theorem 2.10].

Theorem 5.2. For λ a partition and R a sequence of rectangles
\[
K_{\lambda R}(q) = \sum_{\nu \in C(\lambda; R)} q^{c(\nu)} \prod_{k, i \ge 1} \begin{bmatrix} P_i^{(k)}(\nu) + m_i(\nu^{(k)}) \\ m_i(\nu^{(k)}) \end{bmatrix}. \tag{5.3}
\]
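The definitions of this subsection can be evaluated directly for small cases. The following is a minimal sketch, not an optimized implementation: rectangles are stored as pairs (µ, η) = (width, height), polynomials in q are coefficient lists with the lowest degree first, and all function names are ours. The function `stats` computes the vacancy numbers (5.2), cc(ν), ‖R‖, |P| and c(ν); `fermionic_kostka` sums (5.3) over all configurations with the sizes (5.1), assuming that λ has n parts and that no rectangle is taller than n.

```python
from itertools import product


def partitions(n, max_part=None):
    """Yield the partitions of n as weakly decreasing tuples."""
    if max_part is None or max_part > n:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(max_part, 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def Q(i, rho):
    return sum(min(i, part) for part in rho)


def col(rho, i):
    """Length of the i-th column of rho (the number alpha_i)."""
    return sum(1 for part in rho if part >= i)


def q_binomial(n, k):
    """Gaussian binomial [n; k] via [n;k] = [n-1;k-1] + q^k [n-1;k]."""
    if k < 0 or k > n:
        return [0]
    if k == 0 or k == n:
        return [1]
    a = q_binomial(n - 1, k - 1)
    b = [0] * k + q_binomial(n - 1, k)
    size = max(len(a), len(b))
    return [(a[d] if d < len(a) else 0) + (b[d] if d < len(b) else 0)
            for d in range(size)]


def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def stats(nu, R):
    """Return (P, cc, ||R||, |P|, c) for nu = (nu^(1), ..., nu^(K))."""
    K = len(nu)
    pad = [()] + list(nu) + [()]           # nu^(0) and nu^(K+1) are empty
    xi = lambda k: [mu for mu, eta in R if eta == k]
    imax = max([1] + [p for rho in nu for p in rho] + [mu for mu, _ in R])
    P = {(k, i): Q(i, pad[k - 1]) - 2 * Q(i, pad[k]) + Q(i, pad[k + 1]) + Q(i, xi(k))
         for k in range(1, K + 1) for i in range(1, imax + 1)}
    cc = sum(col(pad[k], i) * (col(pad[k], i) - col(pad[k + 1], i))
             for k in range(1, K + 1) for i in range(1, imax + 1))
    norm_R = sum(min(R[a][0], R[b][0]) * min(R[a][1], R[b][1])
                 for a in range(len(R)) for b in range(a + 1, len(R)))
    abs_P = sum(P[k, part] for k in range(1, K + 1) for part in pad[k])
    return P, cc, norm_R, abs_P, norm_R - cc - abs_P


def fermionic_kostka(lam, R):
    """Evaluate the right-hand side of (5.3) as a coefficient list."""
    n = len(lam)
    sizes = [sum(lam[k:]) - sum(mu * max(eta - k, 0) for mu, eta in R)
             for k in range(1, n)]
    total = [0]
    for nu in product(*[list(partitions(s)) for s in sizes]):
        P, cc, norm_R, abs_P, c = stats(nu, R)
        if P and min(P.values()) < 0:
            continue                        # nu is not admissible
        term = [0] * c + [1]                # the monomial q^c
        for k, rho in enumerate(nu, start=1):
            for i in set(rho):
                m = rho.count(i)
                term = poly_mul(term, q_binomial(P[k, i] + m, m))
        size = max(len(total), len(term))
        total = [(total[d] if d < len(total) else 0) +
                 (term[d] if d < len(term) else 0) for d in range(size)]
    return total


# Example 5.1: cc = 3, ||R|| = 5, |P| = 1, c = 1.
print(stats(((2,), (2, 1), (1,)), [(2, 1), (2, 2), (1, 2)])[1:])   # (3, 5, 1, 1)
```

As a second check, the data λ = (3, 2, 1), R = ((2), (1)^4) used in Example 5.8 below give the coefficient list of $q^2 + 2q^3 + 2q^4 + 2q^5 + q^6$.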
Expression (5.3) can be reformulated as the generating function over rigged configurations. To this end we need to define certain labelings of the rows of the partitions in a configuration. For this purpose one should view a partition as a multiset of positive integers. A rigged partition is by definition a finite multiset of pairs (i, x), where i is a positive integer and x is a nonnegative integer. The pairs (i, x) are referred to as strings; i is referred to as the length of the string and x as the label or quantum number of the string. A rigged partition is said to be a rigging of the partition ρ if the multiset consisting of the lengths of the strings is the partition ρ. So a rigging of ρ is a labeling of the parts of ρ by nonnegative integers, where one identifies labelings that differ only by permuting labels among equal-sized parts of ρ. A rigging J of the (λ; R)-configuration ν is a sequence of riggings of the partitions $\nu^{(k)}$ such that for every part of $\nu^{(k)}$ of length i and label x,
\[
0 \le x \le P_i^{(k)}(\nu). \tag{5.4}
\]
The pair (ν, J) is called a rigged configuration. The set of riggings of admissible (λ; R)-configurations is denoted by RC(λ; R). Let $(\nu, J)^{(k)}$ be the k-th rigged partition of (ν, J).
A string $(i, x) \in (\nu, J)^{(k)}$ is said to be singular if $x = P_i^{(k)}(\nu)$, that is, its label takes on the maximum value. Observe that the definition of the set RC(λ; R) is completely insensitive to the order of the rectangles in the sequence R. However the notation involving the sequence R is useful when discussing the bijection between LR tableaux and rigged configurations, since the ordering on R is essential in the definition of LR tableaux. Define the cocharge and charge of (ν, J) ∈ RC(λ; R) by
\[
cc(\nu, J) = cc(\nu) + |J|, \qquad c(\nu, J) = c(\nu) + |J|, \qquad |J| = \sum_{k, i \ge 1} |J_i^{(k)}|,
\]
where $J_i^{(k)}$ is the partition inside the rectangle of height $m_i(\nu^{(k)})$ and width $P_i^{(k)}(\nu)$ given by the labels of the parts of $\nu^{(k)}$ of size i. Since the q-binomial $\begin{bmatrix} m+p \\ m \end{bmatrix}$ is the generating function of partitions with at most m parts each not exceeding p [1, Theorem 3.1], Theorem 5.2 is equivalent to the following theorem.

Theorem 5.3. For λ a partition and R a sequence of rectangles
\[
K_{\lambda R}(q) = \sum_{(\nu, J) \in \mathrm{RC}(\lambda; R)} q^{c(\nu, J)}. \tag{5.5}
\]
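The fact that links Theorem 5.2 to Theorem 5.3, namely that the q-binomial counts partitions fitting in an m × p box, can be confirmed for small m and p. The sketch below computes $(q)_{m+p} / ((q)_m (q)_p)$ as $\prod_{i=1}^{m} (1 - q^{p+i}) / \prod_{i=1}^{m} (1 - q^i)$ by exact polynomial division and compares it with a brute-force count. Function names are ours; polynomials are coefficient lists with the lowest degree first.

```python
from itertools import combinations_with_replacement


def q_binomial(m, p):
    """Coefficients of (q)_{m+p} / ((q)_m (q)_p)."""
    poly = [1]
    for i in range(1, m + 1):          # multiply by (1 - q^(p+i))
        shifted = [0] * (p + i) + poly
        poly = [a - b for a, b in zip(poly + [0] * (p + i), shifted)]
    for i in range(1, m + 1):          # divide exactly by (1 - q^i)
        for d in range(i, len(poly)):
            poly[d] += poly[d - i]
    return poly[: m * p + 1]           # higher coefficients vanish


def box_partition_counts(m, p):
    """counts[s] = number of partitions of s with at most m parts, each <= p."""
    counts = [0] * (m * p + 1)
    for parts in combinations_with_replacement(range(p + 1), m):
        counts[sum(parts)] += 1
    return counts


print(q_binomial(2, 2))             # [1, 1, 2, 1, 1]
print(box_partition_counts(2, 2))   # [1, 1, 2, 1, 1]
```

A multiset of m values in {0, ..., p} is the same thing as a partition with at most m parts of size at most p, which is why `combinations_with_replacement` enumerates exactly the objects being counted.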
5.2. Switching between quantum and coquantum numbers. Let $\theta_R$ : RC(λ; R) → RC(λ; R) be the involution that complements quantum numbers. More precisely, for (ν, J) ∈ RC(λ; R), replace every string $(i, x) \in (\nu, J)^{(k)}$ by $(i, P_i^{(k)}(\nu) - x)$. The notation here differs from that in [25], in which $\theta_R$ is an involution on RC($\lambda^t$; $R^t$).

Lemma 5.4. $c(\theta_R(\nu, J)) = \|R\| - cc(\nu, J)$ for all (ν, J) ∈ RC(λ; R).

Proof. Let $\theta_R(\nu, J) = (\nu', J')$. It follows immediately from the definitions that $\nu' = \nu$. In particular $\nu'$ and ν have the same vacancy numbers and $|J'| = |P| - |J|$. Then
\[
c(\theta_R(\nu, J)) = c(\nu', J') = \|R\| - cc(\nu') - |P| + |J'| = \|R\| - cc(\nu) - |J| = \|R\| - cc(\nu, J). \qquad \square
\]

There is a bijection $\mathrm{tr}_{RC}$ : RC(λ; R) → RC($\lambda^t$; $R^t$) that has the property
\[
cc(\mathrm{tr}_{RC}(\nu, J)) = \|R\| - cc(\nu, J) \tag{5.6}
\]
for all (ν, J) ∈ RC(λ; R); see the proof of [26, Prop. 11].
5.3. RC's and level-restriction. Here we introduce the most important new definition in this paper, namely, that of a level-restricted rigged configuration. Say that a partition λ is restricted of level ℓ if $\lambda_1 - \lambda_n \le \ell$, recalling that it is assumed that all partitions have at most n parts, some of which may be zero. Fix a shape λ and a sequence of rectangles R that are all restricted of level ℓ. Define $\ell' = \ell - (\lambda_1 - \lambda_n)$, which is nonnegative by assumption. Set $\lambda' = (\lambda_1 - \lambda_n, \ldots, \lambda_{n-1} - \lambda_n)^t$ and denote the set of all column-strict tableaux of shape $\lambda'$ over the alphabet $\{1, 2, \ldots, \lambda_1 - \lambda_n\}$ by CST($\lambda'$). Define a table of modified vacancy numbers depending on ν ∈ C(λ; R) and t ∈ CST($\lambda'$) by
\[
P_i^{(k)}(\nu, t) = P_i^{(k)}(\nu) - \sum_{j=1}^{\lambda_k - \lambda_n} \chi(i \ge \ell' + t_{j,k}) + \sum_{j=1}^{\lambda_{k+1} - \lambda_n} \chi(i \ge \ell' + t_{j,k+1}) \tag{5.7}
\]
for all i, k ≥ 1, where χ(S) = 1 if the statement S is true and χ(S) = 0 otherwise, and $t_{j,k}$ is the (j, k)-th entry of t. Finally let $x_i^{(k)}$ be the largest part of the partition $J_i^{(k)}$; if $J_i^{(k)}$ is the empty set, $x_i^{(k)} = 0$.

Definition 5.5. Say that (ν, J) ∈ RC(λ; R) is restricted of level ℓ provided that
(1) $\nu_1^{(k)} \le \ell$ for all k.
(2) There exists a tableau t ∈ CST($\lambda'$), such that for every i, k ≥ 1,
\[
x_i^{(k)} \le P_i^{(k)}(\nu, t).
\]
Let $C^\ell(\lambda; R)$ be the set of all ν ∈ C(λ; R) such that the first condition holds, and denote by $\mathrm{RC}^\ell(\lambda; R)$ the set of (ν, J) ∈ RC(λ; R) that are restricted of level ℓ.

Note in particular that the second condition requires that $P_i^{(k)}(\nu, t) \ge 0$ for all i, k ≥ 1.

Example 5.6. Let us consider Definition 5.5 for two classes of shapes λ more closely:
(1) Vacuum case: Let $\lambda = (a^n)$ be rectangular with n rows. Then $\lambda' = \emptyset$ and $P_i^{(k)}(\nu, \emptyset) = P_i^{(k)}(\nu)$ for all i, k ≥ 1 so that the modified vacancy numbers are equal to the vacancy numbers.
(2) Two-corner case: Let $\lambda = (a^\alpha, b^\beta)$ with α + β = n and a > b. Then $\lambda' = (\alpha^{a-b})$ and there is only one tableau t in CST($\lambda'$), namely the Yamanouchi tableau of shape $\lambda'$. Since $t_{j,k} = j$ for 1 ≤ k ≤ α we find that
\[
P_i^{(k)}(\nu, t) = P_i^{(k)}(\nu) - \delta_{k,\alpha} \max\{i - \ell', 0\}
\]
for 1 ≤ i ≤ ℓ and 1 ≤ k < n. We wish to thank Anatol Kirillov for communicating this formula to us [27].

Our main result is the following formula for the level-restricted generalized Kostka polynomial:

Theorem 5.7. Let ℓ be a positive integer. For λ a partition and R a sequence of rectangles both restricted of level ℓ,
\[
K^\ell_{\lambda R}(q) = \sum_{(\nu, J) \in \mathrm{RC}^\ell(\lambda; R)} q^{c(\nu, J)}.
\]
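The proof of this theorem is given in Sect. 8.

Example 5.8. Consider n = 3, ℓ = 2, λ = (3, 2, 1) and $R = ((2), (1)^4)$. Then
\[
\begin{array}{rl} 0 & \square \\ 0 & \square \\ 0 & \square \end{array} \quad
\begin{array}{rl} 1 & \square \end{array}
\qquad\text{and}\qquad
\begin{array}{rl} 1 & \square\square \\ 2 & \square \end{array} \quad
\begin{array}{rl} 0 & \square \end{array} \tag{5.8}
\]
are in $C^\ell(\lambda; R)$, where again the vacancy numbers are indicated to the left of each part. The set CST($\lambda'$) consists of the two elements
\[
\begin{array}{cc} 1 & 1 \\ 2 & \end{array}
\qquad\text{and}\qquad
\begin{array}{cc} 1 & 2 \\ 2 & \end{array}.
\]
Since $\ell' = 0$ the three rigged configurations
\[
\begin{array}{lr} \square & 0 \\ \square & 0 \\ \square & 0 \end{array} \quad
\begin{array}{lr} \square & 0 \end{array},
\qquad
\begin{array}{lr} \square\square & 0 \\ \square & 0 \end{array} \quad
\begin{array}{lr} \square & 0 \end{array}
\qquad\text{and}\qquad
\begin{array}{lr} \square\square & 0 \\ \square & 1 \end{array} \quad
\begin{array}{lr} \square & 0 \end{array}
\]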
are restricted of level 2 with charges 2, 3, 4, respectively. The riggings are given on the right of each part. Hence $K^2_{\lambda R}(q) = q^2 + q^3 + q^4$.

In contrast to this, the Kostka polynomial $K_{\lambda\mu}(q)$ is obtained by summing over both configurations in (5.8) with all possible riggings below the vacancy numbers. This amounts to $K_{\lambda\mu}(q) = q^2 + 2q^3 + 2q^4 + 2q^5 + q^6$.

In Sect. 7 we will use Theorem 5.7 to obtain explicit expressions for type A branching functions. The results suggest that it is also useful to consider the following sets of rigged configurations with imposed minima on the set of riggings. Let ρ ⊂ λ be a partition and $R_\rho = ((1^{\rho_1^t}), (1^{\rho_2^t}), \ldots, (1^{\rho_n^t}))$, the sequence of single columns of height $\rho_i^t$. Set $\rho' = (\rho_1 - \rho_n, \ldots, \rho_{n-1} - \rho_n)^t$ and
\[
M_i^{(k)}(t) = \sum_{j=1}^{\rho_k - \rho_n} \chi(i \le \rho_1 - \rho_n - t_{j,k}) - \sum_{j=1}^{\rho_{k+1} - \rho_n} \chi(i \le \rho_1 - \rho_n - t_{j,k+1})
\]
for all t ∈ CST($\rho'$). Then define $\mathrm{RC}^\ell(\lambda, \rho; R)$ to be the set of all $(\nu, J) \in \mathrm{RC}^\ell(\lambda; R_\rho \cup R)$ such that there exists a t ∈ CST($\rho'$) such that $M_i^{(k)}(t) \le x$ for $(i, x) \in (\nu, J)^{(k)}$ and $M_i^{(k)}(t) \le P_i^{(k)}(\nu)$ for all i, k ≥ 1. Note that the second condition is obsolete if i occurs as a part in $\nu^{(k)}$ since by definition $M_i^{(k)}(t) \le x \le P_i^{(k)}(\nu)$ for all $(i, x) \in (\nu, J)^{(k)}$. Conjecture 8.3 asserts that the set $\mathrm{RC}^\ell(\lambda, \rho; R)$ corresponds to the set of all level-ℓ restricted Littlewood–Richardson tableaux with a fixed subtableau of shape ρ.
6. Fermionic Expression of Level-Restricted Generalized Kostka Polynomials

6.1. Fermionic expression. Similarly to the Kostka polynomial case, one can rewrite the expression of the level-restricted generalized Kostka polynomials of Theorem 5.7 in fermionic form.

Lemma 6.1. For all $\nu \in C^\ell(\lambda, R)$, t ∈ CST($\lambda'$) and 1 ≤ k < n, we have $P_i^{(k)}(\nu, t) = 0$ for i ≥ ℓ.

Proof. Since $\nu_1^{(k)} \le \ell$ it follows from [26, (11.2)] that $P_i^{(k)}(\nu) = \lambda_k - \lambda_{k+1}$ for i ≥ ℓ. Since t is over the alphabet $\{1, 2, \ldots, \lambda_1 - \lambda_n\}$ this implies for i ≥ ℓ,
\[
\begin{aligned}
P_i^{(k)}(\nu, t) &= P_i^{(k)}(\nu) - \sum_{j=1}^{\lambda_k - \lambda_n} \chi(i \ge \ell' + t_{j,k}) + \sum_{j=1}^{\lambda_{k+1} - \lambda_n} \chi(i \ge \ell' + t_{j,k+1}) \\
&= \lambda_k - \lambda_{k+1} - (\lambda_k - \lambda_n) + (\lambda_{k+1} - \lambda_n) = 0. \qquad \square
\end{aligned}
\]
Let SCST($\lambda'$) be the set of all nonempty subsets of CST($\lambda'$). Furthermore set $P_i^{(k)}(\nu, S) = \min\{P_i^{(k)}(\nu, t) \mid t \in S\}$ for S ∈ SCST($\lambda'$). Then by inclusion-exclusion the set of allowed riggings for a given configuration $\nu \in C^\ell(\lambda; R)$ is given by
\[
\sum_{S \in \mathrm{SCST}(\lambda')} (-1)^{|S|+1} \{ J \mid x_i^{(k)} \le P_i^{(k)}(\nu, S) \}.
\]
Since the q-binomial $\begin{bmatrix} m+p \\ m \end{bmatrix}$ is the generating function of partitions with at most m parts each not exceeding p and since $P_\ell^{(k)}(\nu, S) = 0$ by Lemma 6.1, the level-ℓ restricted generalized Kostka polynomial has the following fermionic form.

Theorem 6.2.
\[
K^\ell_{\lambda R}(q) = \sum_{S \in \mathrm{SCST}(\lambda')} (-1)^{|S|+1} \sum_{\nu \in C^\ell(\lambda; R)} q^{c(\nu)} \prod_{i=1}^{\ell-1} \prod_{k=1}^{n-1} \begin{bmatrix} m_i(\nu^{(k)}) + P_i^{(k)}(\nu, S) \\ m_i(\nu^{(k)}) \end{bmatrix}.
\]
In Sect. 7 we will derive new expressions for branching functions of type A as limits of the level-restricted generalized Kostka polynomials. To this end we need to reformulate the fermionic formula of Theorem 6.2 in terms of a so-called (m, n)-system. Set
\[
m_i^{(a)} = P_i^{(a)}(\nu, S) = P_i^{(a)}(\nu) + f_i^{(a)}(S), \qquad n_i^{(a)} = m_i(\nu^{(a)}),
\]
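and $L_i^{(a)} = \sum_{j=1}^{L} \chi(i = \mu_j) \chi(a = \eta_j)$ for 1 ≤ i ≤ ℓ and 1 ≤ a ≤ n, which is the number of rectangles in R of shape $(i^a)$. Then
\[
\begin{aligned}
& -m_{i-1}^{(a)} + 2 m_i^{(a)} - m_{i+1}^{(a)} - n_i^{(a-1)} + 2 n_i^{(a)} - n_i^{(a+1)} \\
&\quad = \alpha_i^{(a-1)} - 2\alpha_i^{(a)} + \alpha_i^{(a+1)} - \left( \alpha_{i+1}^{(a-1)} - 2\alpha_{i+1}^{(a)} + \alpha_{i+1}^{(a+1)} \right) \\
&\qquad + \sum_{k=1}^{L} \delta_{a,\eta_k} \left( -\min\{i-1, \mu_k\} + 2\min\{i, \mu_k\} - \min\{i+1, \mu_k\} \right) \\
&\qquad - f_{i-1}^{(a)}(S) + 2 f_i^{(a)}(S) - f_{i+1}^{(a)}(S) \\
&\qquad - \left( \alpha_i^{(a-1)} - \alpha_{i+1}^{(a-1)} \right) + 2 \left( \alpha_i^{(a)} - \alpha_{i+1}^{(a)} \right) - \left( \alpha_i^{(a+1)} - \alpha_{i+1}^{(a+1)} \right) \\
&\quad = L_i^{(a)} - f_{i-1}^{(a)}(S) + 2 f_i^{(a)}(S) - f_{i+1}^{(a)}(S).
\end{aligned}
\]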
At this stage it is convenient to introduce vector notation. For a matrix $v_i^{(a)}$ with indices 1 ≤ i ≤ ℓ − 1 and 1 ≤ a ≤ n − 1 define
\[
v = \sum_{i=1}^{\ell-1} \sum_{a=1}^{n-1} v_i^{(a)}\, e_i \otimes e_a,
\]
where $e_i$ and $e_a$ are the canonical basis vectors of $\mathbb{Z}^{\ell-1}$ and $\mathbb{Z}^{n-1}$, respectively. Define
\[
u_i^{(a)}(S) = -f_{i-1}^{(a)}(S) + 2 f_i^{(a)}(S) - f_{i+1}^{(a)}(S),
\]
which in vector notation reads
\[
u(S) = (C \otimes I) f(S) + \sum_{a=1}^{n-1} (\lambda_a - \lambda_{a+1})\, e_{\ell-1} \otimes e_a, \tag{6.1}
\]
where C is the Cartan matrix of type A and I is the identity matrix. Since $n_i^{(0)} = n_i^{(n)} = m_0^{(k)} = 0$ and $m_\ell^{(k)} = 0$ by Lemma 6.1 it follows that
\[
(C \otimes I) m + (I \otimes C) n = L + u(S). \tag{6.2}
\]
In terms of the new variables the condition (5.1) on $|\nu^{(a)}|$ becomes
\[
n_\ell^{(a)} = -e_{\ell-1} \otimes e_a\, (C^{-1} \otimes I)\, n - \frac{1}{\ell} \sum_{j=1}^{a} \lambda_j + \frac{1}{\ell} \sum_{i=1}^{\ell} \sum_{b=1}^{n} i \min\{a, b\} L_i^{(b)}, \tag{6.3}
\]
where we used $C^{-1}_{ij} = \min\{i, j\} - ij/\ell$ if C is (ℓ − 1) × (ℓ − 1)-dimensional and $\sum_{b=1}^{n} \sum_{i=1}^{\ell} i b L_i^{(b)} = |\lambda|$.

Lemma 6.3. In terms of the above (m, n)-system
\[
c(\nu) = \frac{1}{2} m (C \otimes C^{-1}) m - m (I \otimes C^{-1}) u(S) + \frac{1}{2} u(S) (C^{-1} \otimes C^{-1}) u(S) + g(R, \lambda), \tag{6.4}
\]
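where
\[
g(R, \lambda) = \|R\| - \frac{1}{2} \sum_{a,b=1}^{n-1} \sum_{j=1}^{\ell} C^{-1}_{ab} L_j^{(a)} \overline{L}_j^{(b)} + \frac{1}{2\ell} \sum_{j=1}^{n} \left( \lambda_j - \frac{1}{n} |\lambda| \right)^2
\]
and $\overline{L}_i^{(a)} = \sum_{j=1}^{\ell} \min\{i, j\} L_j^{(a)}$.

Proof. By definition $c(\nu) = \|R\| - cc(\nu) - |P|$. Note that
\[
\begin{aligned}
|P| &= \sum_{i=1}^{\ell} \sum_{k=1}^{n-1} m_i(\nu^{(k)}) P_i^{(k)}(\nu) \\
&= \sum_{i=1}^{\ell} \sum_{k=1}^{n-1} \left( \alpha_i^{(k)} - \alpha_{i+1}^{(k)} \right) \left( \sum_{j=1}^{i} \left( \alpha_j^{(k-1)} - 2\alpha_j^{(k)} + \alpha_j^{(k+1)} \right) + \overline{L}_i^{(k)} \right) \\
&= -2 cc(\nu) + \sum_{i=1}^{\ell} \sum_{k=1}^{n-1} n_i^{(k)} \overline{L}_i^{(k)}.
\end{aligned}
\]
Hence eliminating cc(ν) in favor of |P| yields
\[
c(\nu) = \|R\| - \frac{1}{2} |P| - \frac{1}{2} \sum_{i=1}^{\ell} \sum_{k=1}^{n-1} n_i^{(k)} \overline{L}_i^{(k)}.
\]
On the other hand, using $n_i^{(k)} = m_i(\nu^{(k)})$ and $P_\ell^{(k)}(\nu) = \lambda_k - \lambda_{k+1}$,
\[
|P| = n (I \otimes I) P(\nu) + \sum_{k=1}^{n-1} n_\ell^{(k)} (\lambda_k - \lambda_{k+1})
\]
so that
\[
c(\nu) = \|R\| - \frac{1}{2} n (I \otimes I)(P(\nu) + \overline{L}) - \frac{1}{2} \sum_{k=1}^{n-1} n_\ell^{(k)} \left( \lambda_k - \lambda_{k+1} + \overline{L}_\ell^{(k)} \right). \tag{6.5}
\]
Eliminating n in favor of m using (6.2) and substituting P(ν) = m − f(S) yields
\[
\begin{aligned}
-\frac{1}{2} n (I \otimes I)(P(\nu) + \overline{L}) &= \frac{1}{2} m \left\{ C \otimes C^{-1} (m + \overline{L} - f(S)) - I \otimes C^{-1} (L + u(S)) \right\} \\
&\quad - \frac{1}{2} (L + u(S)) (I \otimes C^{-1}) (\overline{L} - f(S)).
\end{aligned}
\]
Similarly, replacing n by m in (6.3) we obtain
\[
n_\ell^{(a)} = e_{\ell-1} \otimes e_a \left( I \otimes C^{-1} m - C^{-1} \otimes C^{-1} u(S) \right) - \frac{1}{\ell} \sum_{j=1}^{a} \left( \lambda_j - \frac{1}{n} |\lambda| \right) + \sum_{b=1}^{n-1} C^{-1}_{ab} L_\ell^{(b)}. \tag{6.6}
\]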
Inserting these equations into (6.5), trading f(S) for u(S) by (6.1) and using
\[
(C \otimes I) \overline{L} - L - \sum_{a=1}^{n-1} e_{\ell-1} \otimes e_a\, \overline{L}_\ell^{(a)} = 0
\]
results in the claim of the lemma. $\square$
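The closed form for the inverse Cartan matrix quoted after (6.3), $C^{-1}_{ij} = \min\{i, j\} - ij/\ell$ for the (ℓ − 1) × (ℓ − 1) Cartan matrix of type A, is used repeatedly in the proof. It can be confirmed with exact rational arithmetic for small ranks; the helper names below are ours.

```python
from fractions import Fraction


def cartan_matrix(r):
    """Type A Cartan matrix of size r x r: 2 on the diagonal, -1 next to it."""
    return [[2 if i == j else -1 if abs(i - j) == 1 else 0
             for j in range(r)] for i in range(r)]


def claimed_inverse(r):
    """Entries min(i, j) - i*j/(r + 1) for 1 <= i, j <= r."""
    level = r + 1
    return [[Fraction(min(i, j)) - Fraction(i * j, level)
             for j in range(1, r + 1)] for i in range(1, r + 1)]


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def identity(r):
    return [[1 if i == j else 0 for j in range(r)] for i in range(r)]


for r in range(1, 8):
    assert mat_mul(cartan_matrix(r), claimed_inverse(r)) == identity(r)
print(claimed_inverse(2))   # [[Fraction(2, 3), Fraction(1, 3)], [Fraction(1, 3), Fraction(2, 3)]]
```

In particular the last row of the inverse has entries i/ℓ, which is what turns the size constraint (5.1) into (6.3).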
As a corollary of Lemma 6.3 and Theorem 6.2 we obtain the following expression for the level-restricted generalized Kostka polynomial
\[
K^\ell_{\lambda R}(q) = q^{g(R, \lambda)} \sum_{S \in \mathrm{SCST}(\lambda')} (-1)^{|S|+1} q^{\frac{1}{2} u(S) C^{-1} \otimes C^{-1} u(S)} \sum_{m} q^{\frac{1}{2} m C \otimes C^{-1} m - m I \otimes C^{-1} u(S)} \begin{bmatrix} m + n \\ m \end{bmatrix}, \tag{6.7}
\]
where n is determined by (6.2), the sum over m is such that
\[
e_{\ell-1} \otimes e_a \left( I \otimes C^{-1} m - C^{-1} \otimes C^{-1} u(S) \right) - \frac{1}{\ell} \sum_{j=1}^{a} \left( \lambda_j - \frac{1}{n} |\lambda| \right) + \sum_{b=1}^{n-1} C^{-1}_{ab} L_\ell^{(b)} \in \mathbb{Z}
\]
for all 1 ≤ a ≤ n − 1 and
\[
\begin{bmatrix} m + n \\ m \end{bmatrix} = \prod_{i=1}^{\ell-1} \prod_{k=1}^{n-1} \begin{bmatrix} m_i^{(k)} + n_i^{(k)} \\ m_i^{(k)} \end{bmatrix}.
\]
Now consider the second case of Example 5.6, namely $\lambda = (a^\alpha, b^\beta)$ with a > b and α + β = n. Then SCST($\lambda'$) only contains the element S = {t}, where t is the Yamanouchi tableau of shape $\lambda'$ and $u(S) = e_{\ell'} \otimes e_\alpha$. In the vacuum case, that is, when $\lambda = ((|\lambda|/n)^n)$, the set SCST($\lambda'$) only contains S = {∅} and u(S) = f(S) = 0. In this case (6.7) simplifies to
\[
K^\ell_{\lambda R}(q) = q^{g(R, \lambda)} \sum_{m} q^{\frac{1}{2} m C \otimes C^{-1} m} \begin{bmatrix} m + n \\ m \end{bmatrix}.
\]
When R is a sequence of single boxes this proves [8, Theorem 1].¹ When R is a sequence of single rows or single columns this settles [12, Conjecture 5.7].

6.2. Polynomial Rogers–Ramanujan-type identities. Let W be the Weyl group of $sl_n$, $M = \{\beta \in \mathbb{Z}^n \mid \sum_{i=1}^{n} \beta_i = 0\}$ be the root lattice, ρ the half-sum of the positive roots, and (·|·) the standard symmetric bilinear form. Recall the energy function (3.9). It was shown in [31] that
\[
K^\ell_{\lambda R}(q) = \sum_{\tau \in W} \sum_{\beta \in M} \sum_{\substack{b \in P_R \\ \mathrm{wt}(b) = -\rho + \tau^{-1}(\lambda - (\ell + n)\beta + \rho)}} (-1)^{\tau} q^{-\frac{1}{2}(\ell + n)(\beta|\beta) + (\lambda + \rho|\beta) + E(b)}. \tag{6.8}
\]
Equating (6.7) and (6.8) gives rise to polynomial Rogers–Ramanujan-type identities. For the vacuum case, that is, when the partition λ is rectangular with n rows, this proves [33, Eq. (9.2)].²

¹ We believe that the proof given in [8] is incomplete.
² The definition of level-restricted path as given in [33, p. 394] only works when R (or µ therein) consists of single rows; otherwise the description of Sect. 3.7 should be used.
7. New Expressions for Type A Branching Functions

The coset branching functions $b^{\Lambda}_{\Lambda'\Lambda''}$ labeled by the three weights $\Lambda, \Lambda', \Lambda''$ have a natural finitization in terms of $(\Lambda' + \Lambda'')$-restricted crystals. For certain triples of weights these can be reformulated in terms of level-restricted paths, which in turn yield an expression of the type A branching functions as a limit of the level-restricted generalized Kostka polynomials. Together with the results of the last section this implies new fermionic expressions for type A branching functions at certain triples of weights.

7.1. Branching function in terms of paths. Let $\Lambda, \Lambda', \Lambda'' \in P_{cl}$ be dominant integral weights of levels $\ell$, $\ell'$, and $\ell''$ respectively, where $\ell = \ell' + \ell''$. The branching function $b^{\Lambda}_{\Lambda'\Lambda''}(z)$ is the formal power series defined by
\[
b^{\Lambda}_{\Lambda'\Lambda''}(z) = \sum_{m \ge 0} z^m\, c^{\mathrm{af}(\Lambda) - m\delta}_{\mathrm{af}(\Lambda'), \mathrm{af}(\Lambda'')},
\]
where $c^{\mathrm{af}(\Lambda) - m\delta}_{\mathrm{af}(\Lambda'), \mathrm{af}(\Lambda'')}$ is the multiplicity of the irreducible integrable highest weight $U_q(\widehat{sl}_n)$-module $V(\mathrm{af}(\Lambda) - m\delta)$ in the tensor product $V(\mathrm{af}(\Lambda')) \otimes V(\mathrm{af}(\Lambda''))$. The desired multiplicity is equal to the number of $\widehat{sl}_n$-highest weight vectors of weight $\mathrm{af}(\Lambda) - m\delta$ in the tensor product $B(\mathrm{af}(\Lambda')) \otimes B(\mathrm{af}(\Lambda''))$, that is, the number of elements $b' \otimes b'' \in B(\mathrm{af}(\Lambda')) \otimes B(\mathrm{af}(\Lambda''))$ such that $\mathrm{wt}(b' \otimes b'') = \mathrm{af}(\Lambda) - m\delta$ and $\varepsilon_i(b' \otimes b'') = 0$ for all i ∈ I. By (3.5), $b'' = u_{\Lambda''}$, $b'$ is $\Lambda''$-restricted, and $\mathrm{wt}(b') = \mathrm{af}(\Lambda - \Lambda'') - m\delta$.

Let B be a perfect crystal of level $\ell'$. Using the isomorphism $B(\Lambda') \cong P(\Lambda', B)$ let $b' = b_1 \otimes b_2 \otimes \cdots$ and $\bar b \in P(\Lambda', B)$ be the ground state path. Suppose N is such that for all j > N, $b_j = \bar b_j$. Write $b = b_1 \otimes \cdots \otimes b_N$. In type $A^{(1)}_{n-1}$ the period of the ground state path $\bar b$ always divides n. Choose N to be a multiple of n, so that $b' = b \otimes \bar b$ and $\bar b_{N+1} = \bar b_1$. Then the above desired highest weight vectors have the form $b' \otimes b'' = (b \otimes u_{\Lambda'}) \otimes u_{\Lambda''} \in B^{\otimes N} \otimes B(\mathrm{af}(\Lambda')) \otimes B(\mathrm{af}(\Lambda''))$. But there is an embedding $B(\mathrm{af}(\Lambda' + \Lambda'')) \hookrightarrow B(\mathrm{af}(\Lambda')) \otimes B(\mathrm{af}(\Lambda''))$ defined by $u_{\Lambda' + \Lambda''} \mapsto u_{\Lambda'} \otimes u_{\Lambda''}$. With this rephrasing of the conditions on b and taking limits, we have
\[
b^{\Lambda}_{\Lambda'\Lambda''}(z) = \lim_{\substack{N \to \infty \\ N \in n\mathbb{Z}}} z^{-E_N(\bar b_1 \otimes \cdots \otimes \bar b_N)} \sum_{b \in H(\Lambda' + \Lambda'', B^{\otimes N}, \Lambda)} z^{E_N(b)}, \tag{7.1}
\]
where $E_N : B^{\otimes N} \to \mathbb{Z}$ is given by $E_N(b) = E(b \otimes \bar b_{N+1}) = E(b \otimes \bar b_1)$ and E is the energy function on finite paths.

Our goal is to express (7.1) in terms of level-restricted generalized Kostka polynomials. We find that this is possible for certain triples of weights. Using the results of Sect. 6 this provides explicit formulas for the branching functions.

7.2. Reduction to level-restricted paths. The first step in the transformation of (7.1) is to replace the condition of $(\Lambda' + \Lambda'')$-restrictedness by level ℓ restrictedness. This is achieved at the cost of appending a fixed inhomogeneous path.

Consider any tensor product $B'$ of perfect crystals each of which has level at most $\ell'$ (the level of $\Lambda'$), such that there is an element $y' \in H(\ell' \Lambda_0, B', \Lambda')$. We indicate how such a $B'$ and $y'$ can be constructed explicitly. Let λ be the partition with strictly
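less than $n$ rows with $\langle h_i, \Lambda'\rangle$ columns of length $i$ for $1\le i\le n-1$. Let $Y_\lambda$ be the Yamanouchi tableau of shape $\lambda$. Then any factorization (in the plactic monoid) of $Y_\lambda$ into a sequence of rectangular tableaux yields such a $B'$ and $y'$.
Example 7.1. Let $n = 6$, $\ell' = 5$, $\Lambda' = \Lambda_0 + 2\Lambda_2 + \Lambda_3 + \Lambda_4$. Then $\lambda = (4,4,2,1)$ (its transpose is $\lambda^t = (4,3,2,2)$) and
$$Y_\lambda = \begin{array}{cccc} 1&1&1&1\\ 2&2&2&2\\ 3&3&&\\ 4&&& \end{array}.$$
One way is to factorize into single columns: $B' = B^{2,1}\otimes B^{2,1}\otimes B^{3,1}\otimes B^{4,1}$ and $y' = y'_4\otimes y'_3\otimes y'_2\otimes y'_1$, where each $y'_j$ is an $sl_n$ highest weight vector, namely, the $j$-th column of $Y_\lambda$. Another way is to factorize into the minimum number of rectangles by slicing $Y_\lambda$ vertically. This yields $B' = B^{2,2}\otimes B^{3,1}\otimes B^{4,1}$; again the factors of $y' = y'_3\otimes y'_2\otimes y'_1$ are the $sl_n$ highest weight vectors, namely,
$$y'_3 = \begin{array}{cc} 1&1\\ 2&2 \end{array},\qquad y'_2 = \begin{array}{c} 1\\ 2\\ 3 \end{array},\qquad y'_1 = \begin{array}{c} 1\\ 2\\ 3\\ 4 \end{array}.$$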
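The construction in Example 7.1 can be traced with a short sketch. The function names below (`column_partition`, `yamanouchi`, `columns`) and the dictionary encoding of the weight are illustrative assumptions, not notation from the paper; the code only builds the partition with a prescribed number of columns of each length, its Yamanouchi tableau, and the single-column factorization.

```python
def column_partition(h):
    """Build the partition with h[i] columns of length i.

    `h` is a hypothetical encoding of the weight: h[i] stands for the
    coefficient of the i-th fundamental weight (i >= 1).
    Returns (lam, lam_t): the partition and its transpose.
    """
    # Column lengths, listed in weakly decreasing order (this is lam^t).
    lam_t = sorted((i for i, m in h.items() for _ in range(m)), reverse=True)
    # Row r (0-based) has one cell for every column longer than r.
    height = max(lam_t, default=0)
    lam = [sum(1 for c in lam_t if c > r) for r in range(height)]
    return lam, lam_t


def yamanouchi(lam):
    """Yamanouchi tableau of shape lam: row r is filled with the letter r."""
    return [[r + 1] * length for r, length in enumerate(lam)]


def columns(tableau):
    """Columns of a tableau, read top to bottom, from left to right."""
    width = len(tableau[0]) if tableau else 0
    return [[row[c] for row in tableau if len(row) > c] for c in range(width)]


# Example 7.1: two columns of length 2, one of length 3, one of length 4.
lam, lam_t = column_partition({2: 2, 3: 1, 4: 1})
Y = yamanouchi(lam)
single_columns = columns(Y)
```

Here `single_columns[j - 1]` plays the role of the $j$-th column of $Y_\lambda$ in the factorization into single columns; the coarser factorization into rectangles would group equal adjacent columns.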
Consider also a tensor product B of perfect crystals such that there is an element ∈ H(* 0 , B , ). Then y = y ⊗ y ∈ H(*0 , B ⊗ B , + ). Instead of b ∈ H( + , B ⊗N , ), we work with b ⊗ y, where b ⊗ y is restricted of level *. This trick doesn’t help unless one can recover the correct energy function directly from b ⊗ y. Let p be the first N steps of the ground state path b ∈ P( , B). Define the normalized energy function on B ⊗N by E(b) = E(b ⊗ y ) − E(p ⊗ y ). A priori it depends on , B, and y . The energy function occurring in the branching function is E (b) = E(b ⊗ b1 ) − E(p ⊗ b1 ). y
Lemma 7.2. E = E . Proof. It suffices to show that the function B ⊗N → Z given by b → E(b ⊗ y ) − E(b ⊗ b1 ) is constant. Using the definition (3.9) and the fact that b is homogeneous of length N, we have E(b ⊗ y ) = E(b) + N E(bN ⊗ y ) − (N − 1)E(y ). Similarly E(b ⊗ b1 ) = E(b) + N E(bN ⊗ b1 ). Therefore E(b ⊗ y ) − E(b ⊗ b1 ) = N(E(bN ⊗ y ) − E(bN ⊗ b1 )) − (N − 1)E(y ). Thus it suffices to show that the function B → Z given by b → E(b ⊗ y ) − E(b ⊗ b1 ) is a constant function. Suppose first that i (b ) > hi , for some 1 ≤ i ≤ n − 1. By the construction of y and b1 , φi (y ) = hi , = φi (b1 ) for 1 ≤ i ≤ n − 1, since φ(b1 ) = . Then ei (b ⊗ y ) = ei (b ) ⊗ y and ei (b ⊗ b1 ) = ei (b ) ⊗ b1 by (3.4). Passing from b to ei (b ) repeatedly, the values of the energy functions are constant, so it may be assumed that b ⊗ y is a sln highest weight vector; in particular, i (b ) ≤ hi , for all 1 ≤ i ≤ n − 1.
Next suppose that 0 (b ) > h0 , . Now φ0 (y ) = 0 and φ0 (b1 ) = h0 , . By (3.4) e0 (b ⊗ b1 ) = e0 (b ) ⊗ b1 and e0 (b ⊗ y ) = e0 (b ) ⊗ y . By (3.8) and the fact that the local isomorphism on B ⊗B is the identity, we have E(e0 (b ⊗b1 )) = E(b ⊗b1 )−1. To show that E(e0 (b ⊗y )) = E(b ⊗y )−1 we check the conditions of Lemma 3.2. By (3.1) 0 (y ) = φ0 (y ) − h0 , wt(y ) = 0 − h0 , − * 0 = * − h0 , . Also by (3.5), since φ0 (y ) = 0, we have 0 (b ⊗ y ) = 0 (b ) + 0 (y ) > h0 , + * − h0 , = * . Let z ⊗ x be the image of b ⊗ y under an arbitrary composition of local isomorphisms. Since b ⊗ y is an sln highest weight vector, so is z ⊗ x and x. Now x is the sln -highest weight vector in a perfect crystal of level at most * , so φ0 (x) = 0 and 0 (x) ≤ * . But * < 0 (b ⊗ y ) = 0 (z ⊗ x) = 0 (z) + 0 (x) so that 0 (z) > 0. By (3.4) e0 (z ⊗ x) = e0 (z) ⊗ x. So E(e0 (b ⊗ y )) = E(b ⊗ y ) − 1 by Lemma 3.2. ) ≤ h , . But then ) ≤ (b (b By induction we may now assume that 0 0 i i i hi , , or c , (b ) ≤ c , = * . Since b ∈ B and B is a perfect crystal of level * , b must be the unique element of B such that (b ) = . Thus the function B → Z given by b → E(b ⊗ y ) − E(b ⊗ b1 ) is constant on B if it is constant on the singleton set { −1 ( )}, which it obviously is. " # 7.3. Explicit ground state energy. To go further, an explicit formula for the value E(p ⊗ y ) is required. This is achieved in (7.2). The derivation makes use of the following explicit construction of the local isomorphism. Theorem 7.3. Let B = B k,* be a perfect crystal of level *, , ∈ (Pcl+ )* , B a perfect crystal of level * ≤ *, and b ∈ H( , B , ). Let x ∈ B (resp. y ∈ B) be the unique element such that (x) = (resp. (y) = ). Then under the local isomorphism B ⊗ B ∼ = ψ k (b) ⊗ y. = B ⊗ B, we have x ⊗ b ∼ The proof requires several technical lemmas and is given in the next section. Example 7.4. Let n = 5, * = 4, k = 2, = 0 +1 +3 +4 , = 0 +1 +2 + 4 , * = 2, B = B 2,2 . 
Here the set H( , B , ) consists of two elements, namely, 1 2 4 5
and
1 4 2 5.
Let b be the second tableau. The theorem says that 1 1 2 3 2 3 4 5
⊗
1 1 2 4 1 4 ∼ 1 3 ⊗ = 2 3 5 5. 2 4 2 5
Proposition 7.5. Let ∈ (Pcl+ )* , B = B k,* a perfect crystal of level *, b ∈ P(, B) the ground state path, p a finite path (say of length N , where N is a multiple of n) such that p ⊗ b = b, B the tensor product of perfect crystals each of level at most *, and y ∈ H(*0 , B , ). Let p be the path of length N such that p ⊗ b = b , where b ∈ P(*0 , B) is the ground state path. Then under the composition of local isomorphisms B ⊗N ⊗ B ∼ = y ⊗ p . = B ⊗ B ⊗N we have p ⊗ y ∼ Proof. Induct on the length of the path y. Suppose B = B1 ⊗ B2 and y = y1 ⊗ y2 , where yj ∈ Bj and Bj is a perfect crystal. Let = − wt(y1 ). By the definitions y2 ∈ H(*0 , B2 , ). By induction the first N steps p of the ground state path of
∼ y2 ⊗ p under the composition of local isomorphisms P( , B) satisfy p ⊗ y2 = ⊗N ⊗N ∼ B ⊗ B2 = B2 ⊗ B . Tensoring on the left with y1 , it remains to show that p ⊗ y1 ∼ = y1 ⊗ p under the composition of local isomorphisms B ⊗N ⊗ B1 ∼ = B1 ⊗ B ⊗N . Now ∈ B are the unique elements such that (p ) = and (p ) = . pN ∈ B and pN N N . Now p ⊗ y ∈ H( , B ⊗ Applying Theorem 7.3 we obtain pN ⊗ y1 ∼ = ψ k (y1 ) ⊗ pN N 1 ∈ H( , B ⊗ B, φ(p )). This implies that ψ k (y ) ∈ B1 , φ(pN )) so that ψ k (y1 ) ⊗ pN 1 N 1 ) and (p H(φ(pN ), B1 , φ(pN )). Now by definition (pN−1 ) = φ(pN N−1 ) = φ(pN ). . Continuing in Applying Theorem 7.3 we obtain pN−1 ⊗ ψ k (y1 ) ∼ = ψ 2k (y1 ) ⊗ pN−1 j k (j +1)k ∼ (y1 ) ⊗ pN−j for 0 ≤ j ≤ N − 1. this manner it follows that pN−j ⊗ ψ (y1 ) = ψ Composing these local isomorphisms it follows that p ⊗ y1 ∼ = ψ Nk (y1 ) ⊗ p . But ψ N is the identity since the order of ψ divides n which divides N . Therefore p ⊗ y1 ∼ = y1 ⊗ p under the composition of local isomorphisms and we are done. " # In the notation in the previous section, E(p ⊗ y ) = E(y ⊗ p ), where p is the first N steps of the ground state path of P(* 0 , B). Write N = nM and B = B k,* . Then using the generalized cocyclage one may calculate explicitly the generalized charge of the LR tableau corresponding to the level * restricted (and hence classically restricted) path y ⊗ p . Let |y | denote the total number of cells in the tableaux comprising y . Then kM . (7.2) E(y ⊗ p ) = E(y ) + |y |kM + n* 2 Example 7.6. Let n = 5, * = 3, = 0 + 3 + 4 , k = 2 and M = 1. Then p is the path 4 4 4 5 5 5
⊗
2 2 2 3 3 3
⊗
1 1 1 5 5 5
⊗
3 3 3 4 4 4
⊗
1 1 1 2 2 2.
The element y can be taken to be the tensor product 1
1
2
2⊗ 3
3 4.
Let $\lambda = (8,8,8,7,6)$. Then the tableau $Q \in LR(\lambda;R)$ (resp. $Y$) that records the path $y'\otimes p'$ (resp. $y'$) is given by
$$Q = \begin{array}{cccccccc} 1&1&1&5&5&5&11&15\\ 2&2&2&7&7&7&12&16\\ 3&3&3&8&8&8&13&17\\ 4&4&4&9&9&9&14&\\ 6&6&6&10&10&10&& \end{array},\qquad Y = \begin{array}{cc} 1&5\\ 2&6\\ 3&7\\ 4& \end{array}$$
with $R = ((3,3),(3,3),(3,3),(3,3),(3,3),(1,1,1,1),(1,1,1))$ and subalphabets $\{1,2\}$, $\{3,4\}$, $\{5,6\}$, $\{7,8\}$, $\{9,10\}$, $\{11,12,13,14\}$, $\{15,16,17\}$. The generalized charge
$c_R(Q)$ is equal to the energy $E(y'\otimes p')$ [37, Theorem 23]. Here the widest rectangle in the path is of width $\ell'$. For any tableau $T \in LR(\rho;R)$ for some partition $\rho$, define $V(T) = P((w_0^R T_e)(w_0^R T_w))$, where $P$ is the Schensted $P$ tableau, $w_0^R$ is the automorphism of conjugation that reverses each of the subalphabets, and $T_w$ and $T_e$ are the west and east subtableaux obtained by slicing $T$ between the $\ell'$-th and $(\ell'+1)$-th columns. It can be shown that there is a composition of $|T_e|$ generalized $R$-cocyclages leading from $T$ to $V(T)$, where $|T_e|$ denotes the number of cells in $T_e$. It follows from the ideas in [35, Sect. 3] and the intrinsic characterization of $c_R$ in [35, Theorem 21] that
$$c_R(T) = c_R(V(T)) + |T_e|. \tag{7.3}$$
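For the above tableau $Q$ we have
$$Q_w = \begin{array}{ccc} 1&1&1\\ 2&2&2\\ 3&3&3\\ 4&4&4\\ 6&6&6 \end{array},\qquad w_0^R Q_w = \begin{array}{ccc} 1&1&1\\ 2&2&2\\ 3&3&3\\ 4&4&4\\ 5&5&5 \end{array}$$
and
$$Q_e = \begin{array}{ccccc} 5&5&5&11&15\\ 7&7&7&12&16\\ 8&8&8&13&17\\ 9&9&9&14&\\ 10&10&10&& \end{array},\qquad w_0^R Q_e = \begin{array}{ccccc} 6&6&6&11&15\\ 7&7&7&12&16\\ 8&8&8&13&17\\ 9&9&9&14&\\ 10&10&10&& \end{array}.$$
Then
$$V(Q) = \begin{array}{ccccc} 1&1&1&11&15\\ 2&2&2&12&16\\ 3&3&3&13&17\\ 4&4&4&14&\\ 5&5&5&&\\ 6&6&6&&\\ 7&7&7&&\\ 8&8&8&&\\ 9&9&9&&\\ 10&10&10&& \end{array}\qquad\text{and}\qquad V(V(Q)) = \begin{array}{ccc} 1&1&1\\ 2&2&2\\ 3&3&3\\ 4&4&4\\ 5&5&5\\ 6&6&6\\ 7&7&7\\ 8&8&8\\ 9&9&9\\ 10&10&10\\ 11&15&\\ 12&16&\\ 13&17&\\ 14&& \end{array}.$$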
We have $c_R(V(V(Q))) = c_R(Y) = E(y')$ by [35, Theorem 21] and $c_R(Q) = c_R(V(Q)) + |Q_e| = c_R(V(Q)) + \ell' n + |Y|$, and $c_R(V(Q)) = c_R(V(V(Q))) + |Y|$ by (7.3). This implies $c_R(Q) = \ell' n + E(y') + 2|Y|$.
7.4. Proof of Theorem 7.3. The proof of Theorem 7.3 requires several lemmas. Words of length $L$ in the alphabet $\{1,2,\dots,n\}$ are identified with the elements of the crystal basis of the $L$-fold tensor product $(B^{1,1})^{\otimes L}$.
Lemma 7.7. Let $u$ and $v$ be words such that $uv$ is an $A_{n-1}$ highest weight vector. Then $v$ is an $A_{n-1}$ highest weight vector and $\epsilon_j(u) \le \varphi_j(v)$ for all $1\le j\le n-1$.
Proof. Let $uv$ be an $A_{n-1}$ highest weight vector and $1\le j\le n-1$. By (3.5)
$$0 = \epsilon_j(uv) = \epsilon_j(v) + \max\{0, \epsilon_j(u) - \varphi_j(v)\}.$$
Since both summands on the right-hand side are nonnegative and sum to zero they must both be zero. □
Lemma 7.8. Let $w$ be a word in the alphabet $\{1,2\}$ and $\hat w$ a word obtained by removing a letter $i$ of $w$. Then
(1) $\epsilon_1(\hat w) \le \epsilon_1(w) + 1$ with equality only if $i = 1$.
(2) $\epsilon_1(w) \le \epsilon_1(\hat w) + 1$ with equality only if $i = 2$.
Proof. Write $w = uiv$ and $\hat w = uv$. By (3.5)
$$\epsilon_1(ui) = \epsilon_1(i) + \max\{0, \epsilon_1(u) - \varphi_1(i)\} = \begin{cases} \max\{0, \epsilon_1(u) - 1\} & \text{if } i = 1,\\ 1 + \epsilon_1(u) & \text{if } i = 2. \end{cases} \tag{7.4}$$
In particular $\epsilon_1(ui) \ge \epsilon_1(u) - 1$. Applying (3.5) to both $\epsilon_1(uv)$ and $\epsilon_1(uiv)$ and subtracting, we obtain
$$\epsilon_1(uv) - \epsilon_1(uiv) = \max\{0, \epsilon_1(u) - \varphi_1(v)\} - \max\{0, \epsilon_1(ui) - \varphi_1(v)\} \le \max\{0, \epsilon_1(u) - \varphi_1(v)\} - \max\{0, \epsilon_1(u) - 1 - \varphi_1(v)\} \le 1.$$
Moreover if $\epsilon_1(uv) - \epsilon_1(uiv) = 1$ then all of the inequalities are equalities. In particular it must be the case that $\epsilon_1(ui) = \epsilon_1(u) - 1$, which by (7.4) implies that $i = 1$, proving the first assertion. On the other hand, (7.4) also implies $\epsilon_1(ui) \le 1 + \epsilon_1(u)$. Subtracting $\epsilon_1(uv)$ from $\epsilon_1(uiv)$ and computing as before, the second part follows. □
Say that $w$ is an almost highest weight vector with defect $i$ if there is an index $1\le i\le n-1$ such that $\epsilon_j(w) = \delta_{ij}$ for $1\le j\le n-1$, and also $\epsilon_{i-1}(e_i(w)) = 0$ if $i > 1$.
Lemma 7.9. Let $w$ be an almost highest weight vector with defect $i$ for $1\le i\le n-1$. Then $e_i(w)$ is either an $A_{n-1}$ highest weight vector or an almost highest weight vector of defect $i+1$.
Proof. For $j \notin \{i-1, i, i+1\}$, the restrictions of the words $w$ and $e_i(w)$ to the alphabet $\{j, j+1\}$ are identical, so that $\epsilon_j(e_i(w)) = \epsilon_j(w) = 0$ by the definition of an almost highest weight vector. Also $\epsilon_i(w) = 1$ implies that $\epsilon_i(e_i(w)) = 0$. Again by the definition of an almost highest weight vector, $\epsilon_{i-1}(e_i(w)) = 0$. If $i = n-1$ we have shown that $e_i(w)$ is an $A_{n-1}$ highest weight vector. So it may be assumed that $i < n-1$. It is enough to show that one of the two following possibilities occurs.
(1) $\epsilon_{i+1}(e_i(w)) = 0$.
(2) $\epsilon_{i+1}(e_i(w)) = 1$ and $\epsilon_i(e_{i+1}e_i(w)) = 0$.
Recall that $e_i(w)$ is obtained from $w$ by changing an $i+1$ into an $i$. Write $w = u(i+1)v$ such that $e_i(w) = uiv$. In this notation we have $\varphi_i(v) = 0$ and $\epsilon_i(u) = 0$. By Lemma 7.8 (1) with $\{1,2\}$ replaced by $\{i+1, i+2\}$ and using that $w$ is an almost highest weight vector of defect $i$, we have $\epsilon_{i+1}(e_i(w)) \le \epsilon_{i+1}(w) + 1 = 1$. It is now enough to assume that $\epsilon_{i+1}(e_i(w)) = 1$ and to show that $\epsilon_i(e_{i+1}e_i(w)) = 0$. By (3.5)
$$0 = \epsilon_{i+1}(w) = \epsilon_{i+1}(u(i+1)v) = \epsilon_{i+1}(v) + \max\{0, \epsilon_{i+1}(u) - \varphi_{i+1}((i+1)v)\}.$$
In particular $\epsilon_{i+1}(v) = 0$. Hence $e_{i+1}(e_i(w)) = e_{i+1}(uiv) = e_{i+1}(u)iv$. Similar computations starting with $\epsilon_i(w) = 1$, and which use the fact that $\epsilon_i(u) = \varphi_i(v) = 0$, yield $\epsilon_i(v) = 0$. We have
$$\epsilon_i(e_{i+1}e_i(w)) = \epsilon_i(e_{i+1}(u)iv) = \epsilon_i(iv) + \max\{0, \epsilon_i(e_{i+1}(u)) - \varphi_i(iv)\} = 0 + \max\{0, \epsilon_i(e_{i+1}(u)) - 1\}.$$
But $\epsilon_i(u) = 0$ and in passing from $u$ to $e_{i+1}(u)$ an $i+2$ is changed into an $i+1$. By Lemma 7.8 (2) applied to the restriction of $u$ to the alphabet $\{i, i+1\}$, we have $\epsilon_i(e_{i+1}(u)) \le \epsilon_i(u) + 1 = 1$. It follows that $\epsilon_i(e_{i+1}e_i(w)) = 0$, and that $e_i(w)$ is an almost highest weight vector of defect $i+1$. □
Lemma 7.10. Suppose $w$ is an $A_{n-1}$ highest weight vector and $\hat w$ is a word obtained by removing a letter (say $i$) from $w$. Then there is an index $r$ such that $i\le r\le n$ and $e_{r-1}e_{r-2}\cdots e_i(\hat w)$ is an $A_{n-1}$ highest weight vector.
Proof. By Lemma 7.9 it suffices to show that $\hat w$ is either an $A_{n-1}$ highest weight vector or an almost highest weight vector of defect $i$. First it is shown that $\epsilon_j(\hat w) = 0$ for $j \ne i$. For $j \notin \{i-1, i\}$, the restrictions of $w$ and $\hat w$ to the alphabet $\{j, j+1\}$ are the same, so that $\epsilon_j(\hat w) = \epsilon_j(w) = 0$. For $j = i-1$, by Lemma 7.8 (1) and the assumption that $w$ is an $A_{n-1}$ highest weight vector, it follows that $\epsilon_{i-1}(\hat w) \le \epsilon_{i-1}(w) + 1 = 1$. But equality cannot hold since the removed letter is $i$ as opposed to $i-1$. Thus $\epsilon_{i-1}(\hat w) = 0$. Next we observe that $\epsilon_i(\hat w) \le \epsilon_i(w) + 1 = 1$ by Lemma 7.8 (1) and the fact that $w$ is an $A_{n-1}$ highest weight vector. If $\epsilon_i(\hat w) = 0$ then $\hat w$ is an $A_{n-1}$ highest weight vector. So it may be assumed that $\epsilon_i(\hat w) = 1$. It suffices to show that $\epsilon_{i-1}(e_i(\hat w)) = 0$. Write $w = uiv$ and $\hat w = uv$. Now $\epsilon_j(v) = 0$ for all $1\le j\le n-1$ by Lemma 7.7 since $w$ is an $A_{n-1}$ highest weight vector. In particular $\epsilon_i(v) = 0$ so that $e_i(\hat w) = e_i(uv) = e_i(u)v$. We have
$$\epsilon_{i-1}(e_i(\hat w)) = \epsilon_{i-1}(e_i(u)v) = \epsilon_{i-1}(v) + \max\{0, \epsilon_{i-1}(e_i(u)) - \varphi_{i-1}(v)\} = \max\{0, \epsilon_{i-1}(e_i(u)) - \varphi_{i-1}(v)\},$$
since $\epsilon_{i-1}(v) = 0$ by Lemma 7.7. It is enough to show that $\epsilon_{i-1}(e_i(u)) \le \varphi_{i-1}(v)$. But
$$\epsilon_{i-1}(e_i(u)) \le \epsilon_{i-1}(u) + 1 = \epsilon_{i-1}(ui) \le \varphi_{i-1}(v).$$
The first inequality holds by an application of Lemma 7.8 (2) since the restrictions of $u$ and $e_i(u)$ to the alphabet $\{i-1, i\}$ differ by inserting a letter $i$. The last inequality holds by Lemma 7.7 since $w = uiv$ is an $A_{n-1}$ highest weight vector. □
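The word lemmas above lend themselves to a brute-force sanity check on short words. The sketch below is not from the paper: it assumes the bracketing convention that is consistent with (7.4) and the tensor rule quoted from (3.5), namely that a letter $j+1$ is paired with a letter $j$ to its right, that $\epsilon_j$ counts unpaired letters $j+1$, that $\varphi_j$ counts unpaired letters $j$, and that $e_j$ changes the leftmost unpaired $j+1$ into $j$. All function names are hypothetical.

```python
from itertools import product


def signature(word, j):
    """Return (eps_j, phi_j, pos) for a word given as a tuple of letters.

    pos is the index of the leftmost unpaired letter j+1, or None.
    Assumed convention: each j is paired with the nearest unpaired j+1
    to its left.
    """
    stack, phi = [], 0
    for pos, letter in enumerate(word):
        if letter == j + 1:
            stack.append(pos)
        elif letter == j:
            if stack:
                stack.pop()
            else:
                phi += 1
    return len(stack), phi, (stack[0] if stack else None)


def eps(word, j):
    return signature(word, j)[0]


def phi(word, j):
    return signature(word, j)[1]


def e(word, j):
    """Raising operator e_j on words; None if eps_j(word) = 0."""
    pos = signature(word, j)[2]
    return None if pos is None else word[:pos] + (j,) + word[pos + 1:]


def is_highest(word, n):
    return all(eps(word, j) == 0 for j in range(1, n))


def words(n, max_len):
    """All words of length at most max_len in the alphabet {1, ..., n}."""
    for length in range(max_len + 1):
        yield from product(range(1, n + 1), repeat=length)


def check_tensor_rule(n, max_len):
    """eps_j(uv) = eps_j(v) + max(0, eps_j(u) - phi_j(v)), as used in Lemma 7.7."""
    return all(
        eps(u + v, j) == eps(v, j) + max(0, eps(u, j) - phi(v, j))
        for u in words(n, max_len)
        for v in words(n, max_len)
        for j in range(1, n)
    )


def check_lemma_7_8(max_len):
    """Both parts of Lemma 7.8 for every word in {1, 2} and every removed letter."""
    for w in words(2, max_len):
        for pos, i in enumerate(w):
            hat = w[:pos] + w[pos + 1:]
            a, b = eps(hat, 1), eps(w, 1)
            if a > b + 1 or (a == b + 1 and i != 1):
                return False
            if b > a + 1 or (b == a + 1 and i != 2):
                return False
    return True


def raise_to_highest(hat, i, n):
    """Smallest r with i <= r <= n such that e_{r-1}...e_i(hat) is highest weight."""
    x, r = hat, i
    while True:
        if is_highest(x, n):
            return r
        if r > n - 1:
            return None
        x = e(x, r)
        if x is None:
            return None
        r += 1


def check_lemma_7_10(n, max_len):
    """Lemma 7.10 for all highest weight words of length at most max_len."""
    for w in words(n, max_len):
        if not is_highest(w, n):
            continue
        for pos, i in enumerate(w):
            if raise_to_highest(w[:pos] + w[pos + 1:], i, n) is None:
                return False
    return True
```

For instance, removing the letter 1 from the highest weight word 321 leaves 32, and `raise_to_highest((3, 2), 1, 3)` follows the chain 32, 31, 21 produced by $e_1$ and then $e_2$. An exhaustive check of this kind is only evidence on small cases and does not replace the proofs.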
Lemma 7.11. Let B = B k,* be a perfect crystal of level * ≤ *, ∈ (Pcl+ )* , B a finite (possibly empty) tensor product of perfect crystals of level at most *, x ∈ B and b ∈ B such that x ⊗ b ∈ H(, B ⊗ B). Let i ∈ J such that hi , > 0 and set = − i + i−1 . Then there is an index 0 ≤ s ≤ k such that ei+s−1 · · · ei+1 ei (x ⊗ b) = x ⊗ ei+s−1 · · · ei+1 ei (b)
(7.5)
and ei+s−1 · · · ei (b) ∈ H( , B), where the subscripts are taken modulo n. Moreover if * = * then s = k. (1)
Proof. Since the Dynkin diagram An−1 has an automorphism given by rotation, it may be assumed that i = 1. Let λ be the partition of length less than n, given by hj , = λj − λj +1 for 1 ≤ j ≤ n − 1 and λn = 0. Since h1 , > 0 it follows that λ has t
a column of size 1. Let m = λ1 and yi be the An−1 -highest weight vector in B λj ,1 for 1 ≤ j ≤ m. Write y = ym ⊗ · · · ⊗ y1 and y = ym−1 ⊗ · · · ⊗ y1 . Observe that t t ,1 λ m y ⊗ u*0 is an affine highest weight vector in B ⊗ · · · ⊗ B λ1 ,1 ⊗ B(*0 ) and has weight so its connected component is isomorphic to B(). A similar statement holds for y ⊗ u*0 and B( ). In particular, b ⊗ y is an An−1 highest weight vector. The map x ⊗ b ⊗ y → word(x)word(b)word(y) gives an embedding of An−1 -crystals into a tensor product of crystals B 1,1 . By Lemma 7.10, there exists an index 1 ≤ r ≤ n such that er−1 er−2 · · · e1 (word(x)word(b)word( y )) is an An−1 highest weight vector. Since y is an An−1 highest weight vector it follows that er−1 · · · e1 (word(x)word(b)word( y )) = er−1 · · · e1 (word(x)word(b))word( y ). Let pj be the position of the letter in ej −1 . . . e1 (word(x)word(b)) that changes from a j + 1 to j upon the application of ej , for 1 ≤ j ≤ r − 1. It follows from the proof of Lemma 7.9 that pr−1 < pr−2 < · · · < p2 < p1 .
(7.6) b
Let s be the maximal index such that ps is located in word(b). Write = es · · · e1 (b). It follows that es es−1 · · · e1 (x ⊗ b) = x ⊗ b and that b ⊗ y is an An−1 highest weight vector. It remains to show that 0 (b ⊗ y ⊗ u*0 ) = 0 and that s ≤ k with equality if * = *.
(7.7)
Consider the corresponding positions in the tableau b. Since b → word(b) is an An−1 crystal morphism, es · · · e1 (word(b)) = word(es · · · e1 (b)). Let (i1 , j1 ) be the position in the tableau b corresponding to the position p1 in word(b), and analogously define (i2 , j2 ), (i3 , j3 ), and so on. Since the rows of all tableaux (and in particular b, e1 (b), e2 e1 (b), etc.) are weakly increasing and (7.6) holds, it follows that i1 < i2 < i3 < · · · < is . But b has k rows, so s ≤ k. The next goal is to prove (7.7). Suppose first that s < n − 1. In this case the letters 1 and n are undisturbed in passing from e1 (b) to es · · · e1 (b). Using this and the Dynkin diagram rotation it follows that y ⊗ u*0 ) = 0 (e1 (b) ⊗ u ) 0 (es · · · e2 e1 (b) ⊗ = max{0, 0 (e1 (b)) − φ0 (u )} = max{0, 0 (e1 (b)) − φ0 (u ) − 1}.
(7.8)
But φ0 (u ) ≥ 0 (b) ≥ 0 (e1 (b)) − 1 by the fact that 0 (b ⊗ u ) = 0 and Lemma 7.8 point 7.8 applied after rotation of the Dynkin diagram. By (7.8) the desired result (7.7) follows. Otherwise assume s = n − 1. Here k = n − 1 since s ≤ k < n with the inequality holding by the perfectness of B. By (7.6) and the fact that b is a tableau, it must be the case that e1 acting on b changes a 2 in the first row of b into a 1, e2 acting on e1 (b) changes a 3 in the second row of e1 (b) into a 2, etc. Since b is a tableau with n − 1 rows with entries between 1 and n, there are integers 0 ≤ νn−1 ≤ νn−2 ≤ · · · ≤ ν1 < * such that the i th row of b consists of νi copies of the letter i and * − νi copies of the letter i + 1. For tableaux b of this very special form, the explicit formula for e0 in [37, (3.11)] yields 0 (b) = * − mn (b), where mn (b) is the number of occurrences of the letter n in b. Since b = en−1 · · · e1 (b) also has the same form (with νi replaced by νi + 1 for 1 ≤ i ≤ n − 1) and mn (b ) = mn (b) − 1, it follows that 0 (b ) = 0 (b) + 1. We have y ⊗ u*0 ) = 0 (b ⊗ u ) 0 (b ⊗
= max{0, 0 (b ) − φ0 (u )} = max{0, 0 (b) + 1 − (φ0 (u ) + 1)} = 0
since b ∈ H(, B). Finally, assuming * = *, it must be shown that s = k. Since the level of B is the same as that of the weights and , it follows from the perfectness of B that both b and b are uniquely defined by the property that (b) = and (b ) = . Let = n−1 i=0 zi i . By the explicit construction of b in Example 3.3, wt(b) =
n−1 k j =1 i=0
zi (i+j − i+j −1 ) =
n−1
zi (i+k − i )
i=0
with indices taken modulo n. Subtracting the analogous formula for wt(b ), wt(b) − wt(b ) = − kj =1 αj . Using (3.1) it follows that k = s. " # Proof of Theorem 7.3. First observe that x ⊗ b ∈ H( , B ⊗ B , φ(x)) by (3.1), b ∈ H( , B , ), and (x) = . Let c ∈ B and z ∈ B be such that x ⊗ b ∼ = c⊗z under the local isomorphism. Then c ⊗ z ∈ H( , B ⊗ B, φ(x)) which means that z is -restricted. Hence z ∈ H( , B, φ(z)) and c ∈ H(φ(z), B , φ(x)). The former together with the perfectness of B implies that y = z. From the latter it follows that
ψ −k (c) ∈ H( , B , ). However the set H( , B , ) might have multiplicities so it is not obvious why b = ψ −k (c) or equivalently c = ψ k (b). The proof proceeds by an induction that changes the weight to a weight that is “closer to" *0 . Suppose first that there is a root direction i = 0 such that = − i + i−1 . By Lemma 7.11 applied for the weight hi , > 0 and , simple root αi , and element x ⊗ b ∈ H( , B ⊗ B ), there is an 0 ≤ s < n such , B , ), where = − s+i + s+i−1 and that b = ei+s−1 · · · ei+1 ei (b) ∈ H( ei+s−1 · · · ei (x ⊗ b) = x ⊗ b. Applying Lemma 7.11 with , αs+i , and x ∈ H(, B), , B). it follows that x = ek+s+i−1 · · · es+i (x) ∈ H( , B ⊗ B ). The above computations imply ek+s+i−1 · · · ei (x ⊗ b) = x ⊗ b ∈ H( , B ⊗ B) since x ⊗ b → c ⊗ y under We have ek+s+i−1 · · · ei+1 ei (c ⊗ y) ∈ H( the local isomorphism. It must be seen which of these raising operators act on the tensor factor in B and which act in B. By Lemma 7.11 applied with , αi , and c ⊗ y ∈ , B) and that ek+i−1 · · · ei (c⊗ H( , B ⊗B), it follows that y = ek+i−1 · · · ei (y) ∈ H( (1) y) = c⊗ y . Since y ⊗u is an An−1 highest weight vector, the rest of the raising operators es+k−1 · · · ek+i must act on the first tensor factor. Let c = ek+s+i−1 · · · ek+i (c). Then ek+s+i−1 · · · ei (c ⊗ y) = c ⊗ y . But the local isomorphism is a crystal morphism so it sends x ⊗ b → c ⊗ y . By induction c = ψ k ( b). By (3.6) it follows that c = ψ k (b). Otherwise there is no index i = 0 such that hi , > 0. This means = *0 . But the sets H(*0 , B, ) and H(*0 , B , φ(y)) are singletons whose lone elements are given by the An−1 highest weight vectors in B and B respectively. Since B ⊗ B is An−1 multiplicity-free it follows that the sets H(φ(y), B , φ(x)) and H(, B, φ(x)) are singletons. In this case it follows directly that c = ψ k (b) since both c and ψ k (b) are elements of the set H(φ(y), B , φ(x)). " # 7.5. Branching function by restricted generalized Kostka polynomials. 
The appropriate map from LR tableaux to rigged configurations sends the generalized charge of the LR tableau to the charge of the rigged configuration. Unfortunately in general it is not clear what happens when one uses the statistic coming from the energy function $E(b\otimes y')$ but using the path $b\otimes y'\otimes y''$. It is only known that the statistic $E(b\otimes y'\otimes y'')$ on the path $b\otimes y'\otimes y''$ is well-behaved. So to continue the computation we require that $y'' = \emptyset$. This is achieved when $\Lambda'' = \ell''\Lambda_0$. So let us assume this. The other problem is that we do not consider all paths in $H(\ell\Lambda_0, B^{\otimes N}\otimes B', \Lambda)$, but only those of the form $b\otimes y'$, where $y'\in B'$ is a fixed path. Passing to LR tableaux, this is equivalent to imposing an additional condition that the subtableaux corresponding to the first several rectangles must be in fixed positions. Conjecture 8.3 asserts that the corresponding sets of rigged configurations are well-behaved. The special case that requires no extra work is when $B'$ consists of a single perfect crystal. This is achievable when $\Lambda'$ has the form $\Lambda' = r\Lambda_s + (\ell'-r)\Lambda_0$; in this case $B' = B^{s,r}$ and $y'$ is the $sl_n$-highest weight element of $B^{s,r}$. This is the same as requiring that the first subtableau of the LR tableau be fixed. But this is always the case. Let $R^{(M)}$ consist of the single rectangle $(r^s)$ followed by $N = Mn$ copies of the rectangle $(\ell'^{\,k})$, where $B = B^{k,\ell'}$. Let $\lambda^{(M)}$ be the partition of the same size as the total size of $R^{(M)}$, such that $\lambda^{(M)}$ projects to $\Lambda - \ell\Lambda_0$. Then the set of paths $H(\ell\Lambda_0, B^{\otimes N}\otimes B^{s,r}, \Lambda)$ is equal to $P^{\ell}_{\Lambda-\ell\Lambda_0,\,R^{(M)}}$. This is summarized by
$$b^{\Lambda}_{\Lambda'\Lambda''}(q) = \lim_{M\to\infty} q^{-rskM - n\ell'\binom{kM}{2}}\, K^{\ell}_{\lambda^{(M)},R^{(M)}}(q), \tag{7.9}$$
where $\Lambda$ is arbitrary, $\Lambda' = r\Lambda_s + (\ell'-r)\Lambda_0$, and $\Lambda'' = \ell''\Lambda_0$.
Inserting expression (6.7) for the generalized Kostka polynomial in (7.9) and taking the limit yields the following fermionic expression for the branching function: b (q) = q
×
rs(s−n) 1 2n + 2*
n
|λ| 2 j =1 (λj − n )
(−1)|S|+1 q 2 u(S)C 1
−1 ⊗C −1 u(S)
S∈SCST(λ )
q 2 mC⊗C 1
−1 m−mI ⊗C −1 u(S)
m
*−1 n−1 m(a) +n(a) n−1 i
i=1 a=1 i=*
(a) mi
i
a=1
1 , (q)m(a)
(7.10)
*
where λ is any partition which projects to − *0 and u(S) as defined in (6.1). The n−1 (a) (a) sum over m runs over all m = *−1 a=1 mi ei ⊗ ea such that mi ∈ Z and i=1 e*−1 ⊗ ea (I ⊗ C −1 m − C −1 ⊗ C −1 u(S)) −
1 1 λj − |λ| ∈ Z * n a
j =1
(a)
for all 1 ≤ a ≤ n − 1. The variables ni (a)
ni
are given by
= ei ⊗ ea −C ⊗ C −1 m + I ⊗ C −1 (u(s) + er ⊗ es )
for all $1\le a<n$ and $1\le i<\ell$, $i\ne\ell'$.
8. Proof of Theorem 5.7 To prove Theorem 5.7 it clearly suffices to show that there is a bijection $\overline{\psi}_R : RLR^{\ell}(\lambda;R)\to RC^{\ell}(\lambda;R)$ that is charge-preserving, that is, $c_R(T) = c(\overline{\psi}_R(T))$ for all $T\in RLR^{\ell}(\lambda;R)$. Here we identify $LR(\lambda;R)$ with $RLR(\lambda;R)$ via the standardization bijection std. Also define the charge $c_R : CLR(\lambda;R)\to\mathbb{N}$ as the composite of $\gamma_R$ with the charge $c_R : RLR(\lambda;R)\to\mathbb{N}$. It will be shown that one of the standard bijections $\overline{\psi}_R : RLR(\lambda;R)\to RC(\lambda;R)$ is charge-preserving, and that it restricts to a bijection $RLR^{\ell}(\lambda;R)\to RC^{\ell}(\lambda;R)$. With this in mind let us review the bijections from LR tableaux to rigged configurations.
8.1. Bijections from LR tableaux to rigged configurations. A bijection $\overline{\phi}_R : CLR(\lambda;R)\to RC(\lambda^t;R^t)$ was defined recursively in [25, Definition-Proposition 4.1]. It is one of four natural bijections from LR tableaux to rigged configurations:
(1) Column index quantum: $\overline{\phi}_R : CLR(\lambda;R)\to RC(\lambda^t;R^t)$,
(2) Column index coquantum: $\widetilde{\phi}_R : CLR(\lambda;R)\to RC(\lambda^t;R^t)$, defined by $\widetilde{\phi}_R = \theta_{R^t}\circ\overline{\phi}_R$,
(3) Row index quantum: $\overline{\psi}_R : RLR(\lambda;R)\to RC(\lambda;R)$, defined by $\overline{\psi}_R = \overline{\phi}_{R^t}\circ\mathrm{tr}$, and
(4) Row index coquantum: $\widetilde{\psi}_R : RLR(\lambda;R)\to RC(\lambda;R)$, defined by $\widetilde{\psi}_R = \theta_R\circ\overline{\psi}_R$.
Of these four, the one that is compatible with level-restriction is $\overline{\psi}$. First we show that it is charge-preserving. This fact is a corollary of the difficult result [25, Theorem 9.1].
Proposition 8.1. $c(\overline{\psi}_R(T)) = c_R(T)$ for all $T\in RLR(\lambda;R)$.
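Proof. Consider the following diagram, which commutes by the definitions and [25, Theorem 7.1]. Its objects are $RLR(\lambda;R)$, $CLR(\lambda;R)$, $CLR(\lambda^t;R^t)$, $RC(\lambda^t;R^t)$ and $RC(\lambda;R)$, and its arrows are
$$\gamma_R^{-1} : RLR(\lambda;R)\to CLR(\lambda;R),\quad \mathrm{tr} : RLR(\lambda;R)\to CLR(\lambda^t;R^t),\quad \mathrm{tr}_{LR} : CLR(\lambda;R)\to CLR(\lambda^t;R^t),$$
$$\overline{\psi}_R : RLR(\lambda;R)\to RC(\lambda;R),\quad \overline{\phi}_R : CLR(\lambda;R)\to RC(\lambda^t;R^t),\quad \overline{\phi}_{R^t} : CLR(\lambda^t;R^t)\to RC(\lambda;R),\quad \mathrm{tr}_{RC} : RC(\lambda^t;R^t)\to RC(\lambda;R).$$
In particular $\overline{\psi}_R = \mathrm{tr}_{RC}\circ\overline{\phi}_R\circ\gamma_R^{-1}$. Let $T\in RLR(\lambda;R)$ and $Q = \gamma_R^{-1}(T)$. Then, using $\mathrm{tr}_{RC}\circ\theta_{R^t} = \theta_R\circ\mathrm{tr}_{RC}$, $\overline{\psi}_R(T) = \theta_R(\mathrm{tr}_{RC}(\widetilde{\phi}_R(Q)))$. Let $(\nu,J) = \mathrm{tr}_{RC}(\widetilde{\phi}_R(Q))$. Then
$$c(\overline{\psi}_R(T)) = c(\theta_R(\nu,J)) = \|R\| - cc(\nu,J) = \|R\| - cc(\mathrm{tr}_{RC}(\widetilde{\phi}_R(Q))) = cc(\widetilde{\phi}_R(Q)) = c_R(Q) = c_R(T)$$
by Lemma 5.4, (5.6) and [25, Theorem 9.1] to pass from $cc$ to $c_R$. □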
In light of Proposition 8.1, to prove Theorem 5.7 it suffices to establish the following result.
Theorem 8.2. The bijection $\overline{\psi}_R : RLR(\lambda;R)\to RC(\lambda;R)$ restricts to a well-defined bijection $\overline{\psi}_R : RLR^{\ell}(\lambda;R)\to RC^{\ell}(\lambda;R)$.
Computer data suggest that the bijection $\overline{\psi}_R$ is not only well-behaved with respect to level-restriction, but also with respect to fixing certain subtableaux. It was argued in Sect. 7.5 that the branching functions can be expressed in terms of generating functions of tableaux with certain fixed subtableaux. Let $\rho\subset\lambda$ be partitions, $R_\rho = ((1^{\rho^t_1}),\dots,(1^{\rho^t_n}))$ and $T_\rho$ the unique tableau in $RLR(\rho;R_\rho)$. Define $RLR^{\ell}(\lambda,\rho;R)$ to be the set of tableaux $T\in RLR^{\ell}(\lambda;R_\rho\cup R)$ such that $T$ restricted to shape $\rho$ equals $T_\rho$. Recall the set of rigged configurations $RC^{\ell}(\lambda,\rho;R)$ defined in Sect. 5.3.
Conjecture 8.3. The bijection $\overline{\psi}_R : RLR(\lambda;R)\to RC(\lambda;R)$ restricts to a well-defined bijection $\overline{\psi}_R : RLR^{\ell}(\lambda,\rho;R)\to RC^{\ell}(\lambda,\rho;R)$.
8.2. Reduction to single rows. In this section it is shown that to prove Theorem 8.2 it suffices to consider the case where $R$ consists of single rows. Recall the nontrivial embedding $i_R : LR(\lambda;R)\hookrightarrow LR(\lambda;r(R))$. We identify $LR(\lambda;R)$ and $RLR(\lambda;R)$ via std, and therefore have an embedding $i_R : RLR(\lambda;R)\hookrightarrow RLR(\lambda;r(R))$. Define a map $j_R : RC(\lambda;R)\to RC(\lambda;r(R))$ as follows. Let $(\nu,J)\in RC(\lambda;R)$. For each rectangle of $R$ having $k$ rows and $m$ columns, add $k-j$ strings $(m,0)$ of length $m$ and label zero to the rigged partition $(\nu,J)^{(j)}$ for $1\le j\le k-1$. The resulting rigged configuration is $j_R(\nu,J)$.
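The definition of $j_R$ is a purely combinatorial recipe, and a minimal sketch may help fix it. The data layout is an assumption made for illustration only: a rigged configuration is a list whose entry `j - 1` is the rigged partition $(\nu,J)^{(j)}$, stored as a list of `(length, label)` pairs, and $R$ is a list of `(rows, columns)` pairs. Padding with empty rigged partitions and sorting the strings are presentation choices of this sketch, not part of the definition.

```python
def j_R(rigged_configuration, rectangles):
    """Add k - j strings (m, 0) to the j-th rigged partition for each
    rectangle with k rows and m columns, for 1 <= j <= k - 1.

    rigged_configuration: list of lists of (length, label) pairs
        (hypothetical encoding; index j - 1 holds the j-th rigged partition).
    rectangles: list of (k, m) pairs, k rows and m columns.
    """
    result = [list(strings) for strings in rigged_configuration]
    for k, m in rectangles:
        for j in range(1, k):
            # Assumption: missing trailing rigged partitions are empty.
            while len(result) < j:
                result.append([])
            result[j - 1].extend([(m, 0)] * (k - j))
    # Sort strings by length (then label), longest first, for readability.
    return [sorted(strings, reverse=True) for strings in result]


# A single rectangle with 3 rows and 2 columns, starting from the empty
# rigged configuration: two strings go to level 1 and one to level 2.
image = j_R([[], []], [(3, 2)])

# Single rows (k = 1) contribute nothing.
unchanged = j_R([[(2, 1)], []], [(1, 4), (1, 2)])
```

In the first call the sizes of the resulting partitions are 4 and 2, which is what the size constraint for three single rows of length 2 and $\lambda = (2,2,2)$ requires.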
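Proposition 8.4. The following diagram commutes:
$$\begin{array}{ccc} RLR(\lambda;R) & \xrightarrow{\;i_R\;} & RLR(\lambda;r(R))\\ \downarrow{\scriptstyle\overline{\psi}_R} & & \downarrow{\scriptstyle\overline{\psi}_{r(R)}}\\ RC(\lambda;R) & \xrightarrow{\;j_R\;} & RC(\lambda;r(R)). \end{array}$$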
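It must be shown that similar diagrams commute in which $i_R$ is replaced by either $i_{R^<}$ or $s_p$, the maps that occur in the definition of $i_R$. Let $j_{R^<} : RC(\lambda;R)\to RC(\lambda;R^<)$ be defined by adding a string $(\mu_1, 0)$ to each of the first $\eta_1 - 1$ rigged partitions in $(\nu,J)\in RC(\lambda;R)$.
Lemma 8.5. $j_{R^<}$ is well-defined and the following diagram commutes:
$$\begin{array}{ccc} RLR(\lambda;R) & \xrightarrow{\;i_{R^<}\;} & RLR(\lambda;R^<)\\ \downarrow{\scriptstyle\overline{\psi}_R} & & \downarrow{\scriptstyle\overline{\psi}_{R^<}}\\ RC(\lambda;R) & \xrightarrow{\;j_{R^<}\;} & RC(\lambda;R^<). \end{array}$$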
Proof. Consider the following diagram, which consists of the square of the lemma together with the arrows
$$\mathrm{tr} : RLR(\lambda;R)\to CLR(\lambda^t;R^t),\quad \mathrm{tr} : RLR(\lambda;R^<)\to CLR(\lambda^t;R^{<t}),\quad i^\vee : CLR(\lambda^t;R^t)\to CLR(\lambda^t;R^{<t}),$$
$$\overline{\phi}_{R^t} : CLR(\lambda^t;R^t)\to RC(\lambda;R),\quad \overline{\phi}_{R^{<t}} : CLR(\lambda^t;R^{<t})\to RC(\lambda;R^<).$$
Let us view this diagram as a prism in which the large rectangular face is the front, and the other faces with four sides are the top and bottom, and the faces with three sides are the left and right. We want to show that the front face commutes. For this it suffices to show that all other faces commute. The left and right faces commute by the definition of $\overline{\psi}$. Let us define $i^\vee : CLR(\lambda^t;R^t)\to CLR(\lambda^t;R^{<t})$ so that the top face commutes. It suffices to show the bottom face commutes. Observe that $i^\vee$ is the embedding for CLR that splits off the first column of the first rectangle in $R^t$. But then the bottom face commutes by [25, Lemma 5.4] applied to $R^t$ in place of $R$. □
Lemma 8.6. The following diagram commutes:
$$\begin{array}{ccc} RLR(\lambda;R) & \xrightarrow{\;s_p\;} & RLR(\lambda;s_pR)\\ \downarrow{\scriptstyle\overline{\psi}_R} & & \downarrow{\scriptstyle\overline{\psi}_{s_pR}}\\ RC(\lambda;R) & = & RC(\lambda;s_pR). \end{array}$$
Proof. We use the same kind of diagram as in the previous lemma. Of course $(s_pR)^t = s_p(R^t)$. The diagram consists of the square of the lemma together with the arrows
$$\mathrm{tr} : RLR(\lambda;R)\to CLR(\lambda^t;R^t),\quad \mathrm{tr} : RLR(\lambda;s_pR)\to CLR(\lambda^t;(s_pR)^t),\quad s_p : CLR(\lambda^t;R^t)\to CLR(\lambda^t;(s_pR)^t),$$
$$\overline{\phi}_{R^t} : CLR(\lambda^t;R^t)\to RC(\lambda;R),\quad \overline{\phi}_{(s_pR)^t} : CLR(\lambda^t;(s_pR)^t)\to RC(\lambda;s_pR).$$
We argue as in the previous lemma. The left and right faces commute by the definition of $\overline{\psi}$, the top face commutes by [36, Prop. 32], and the bottom face commutes by [25, Lemma 8.5]. □
Proof of Proposition 8.4. Consider the diagram consisting of the square of the proposition together with the arrows
$$\mathrm{tr} : RLR(\lambda;R)\to CLR(\lambda^t;R^t),\quad \mathrm{tr} : RLR(\lambda;r(R))\to CLR(\lambda^t;r(R)^t),\quad I^\vee : CLR(\lambda^t;R^t)\to CLR(\lambda^t;r(R)^t),$$
$$\overline{\phi}_{R^t} : CLR(\lambda^t;R^t)\to RC(\lambda;R),\quad \overline{\phi}_{r(R)^t} : CLR(\lambda^t;r(R)^t)\to RC(\lambda;r(R)).$$
The left and right faces commute by the definition of $\overline{\psi}$. Let us define $I^\vee$ so that the top face commutes. It suffices to show the bottom face commutes. By the previous two lemmas, the bottom face commutes if $j_R$ is given by the composition of maps of the form $j_{R^<}$ and the identity map, corresponding to the way that $i_R$ was computed. But it is easy to see that the effect of this composition of maps is precisely $j_R$. □
By the definition of $j_R$ and Definition 5.5 of the level-restriction for rigged configurations, we have
$$RC^{\ell}(\lambda;R) = \{(\nu,J)\in RC(\lambda;R) \mid j_R(\nu,J)\in RC^{\ell}(\lambda;r(R))\}. \tag{8.1}$$
We now show that Theorem 8.2 follows from the special case when R consists of single rows. The proof is a diagram chase using the commutative diagram in Proposition 8.4. Since r(R) consists of single rows, it is assumed that ψ_{r(R)} : LR(λ; r(R)) → RC(λ; r(R)) restricts to a bijection LR^ℓ(λ; r(R)) → RC^ℓ(λ; r(R)). In particular ψ_{r(R)}(LR^ℓ(λ; r(R))) = RC^ℓ(λ; r(R)). Since ψ_R : LR(λ; R) → RC(λ; R) is a bijection, it is enough to show that ψ_R(LR^ℓ(λ; R)) = RC^ℓ(λ; R).

For the inclusion ψ_R(LR^ℓ(λ; R)) ⊂ RC^ℓ(λ; R), suppose that x ∈ LR^ℓ(λ; R). By (4.7) i_R(x) ∈ LR^ℓ(λ; r(R)). By assumption, ψ_{r(R)}(i_R(x)) ∈ RC^ℓ(λ; r(R)). But ψ_{r(R)} ∘ i_R = j_R ∘ ψ_R by Proposition 8.4, so j_R(ψ_R(x)) ∈ RC^ℓ(λ; r(R)). By (8.1), ψ_R(x) ∈ RC^ℓ(λ; R).

For the other inclusion, suppose y ∈ RC^ℓ(λ; R). Let x ∈ LR(λ; R) be the unique element such that ψ_R(x) = y. Now ψ_{r(R)}(i_R(x)) = j_R(ψ_R(x)) = j_R(y). By (8.1) j_R(y) ∈ RC^ℓ(λ; r(R)). By assumption i_R(x) ∈ LR^ℓ(λ; r(R)). By (4.7) x ∈ LR^ℓ(λ; R), that is, y ∈ ψ_R(LR^ℓ(λ; R)).

8.3. Single row quantum number bijection. We must prove Theorem 8.2 when R consists of single rows. For the rest of the paper we shall assume this is the case. Then η_j = 1 for all j, R_j = (μ_j) for 1 ≤ j ≤ L, LR(λ; R) = CST(λ; μ), LR^ℓ(λ; R) = CST^ℓ(λ; μ), and R^t consists of single columns. We also write ψ_μ for ψ_R in this case. Again using std we identify LR(λ; R) with RLR(λ; R), and LR^ℓ(λ; R) with its image in RLR(λ; R) under std.

Now [25, Sect. 4.2] gives a direct description of φ_{R^t} that is particularly simple when R^t consists of single columns. This is easily translated to the following algorithm to compute the bijection ψ_R : RLR(λ; R) → RC(λ; R). First, ν ∈ C(λ; R) requires that
\[
|\nu^{(k)}| = \sum_{j>k} \lambda_j
\]
for k ≥ 1. The vacancy numbers may be given by
\[
P_i^{(k)}(\nu) = Q_i(\nu^{(k-1)}) - 2Q_i(\nu^{(k)}) + Q_i(\nu^{(k+1)}),
\]
where ν^{(0)} = μ and (since μ is not necessarily a partition)
\[
Q_i(\mu) := \sum_j \min\{\mu_j, i\}.
\]
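The size condition and the vacancy numbers translate directly into a few lines of Python. The following is only a minimal sketch and not part of the original text: the function names are hypothetical, ν is assumed to be stored as a list [ν^(1), ..., ν^(n−1)] of lists of parts, and ν^(n) is taken to be empty, which is consistent with the size condition above.

```python
def Q(i, parts):
    """Q_i(mu) = sum_j min(mu_j, i); `parts` need not be a partition."""
    return sum(min(p, i) for p in parts)


def vacancy(i, k, nu, mu):
    """P_i^(k)(nu) = Q_i(nu^(k-1)) - 2 Q_i(nu^(k)) + Q_i(nu^(k+1)).

    nu is [nu^(1), ..., nu^(n-1)]; nu^(0) = mu as in the text, and
    nu^(n) is taken to be empty (an assumption of this sketch).
    """
    seq = [list(mu)] + [list(p) for p in nu] + [[]]
    return Q(i, seq[k - 1]) - 2 * Q(i, seq[k]) + Q(i, seq[k + 1])


def required_sizes(lam):
    """Sizes |nu^(k)| = sum_{j>k} lambda_j for 1 <= k < n."""
    return [sum(lam[k:]) for k in range(1, len(lam))]


# Sizes forced by the shape lambda = (3, 3, 2, 1).
print(required_sizes([3, 3, 2, 1]))
# A made-up toy configuration, used only to exercise the formula.
print(vacancy(1, 1, [[1]], [2, 1]), vacancy(2, 1, [[1]], [2, 1]))
```

Storing μ as ν^(0) and an empty list as ν^(n) keeps the formula uniform in k, so no boundary cases are needed.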
Now let us describe the bijection ψ_R : RLR(λ; R) → RC(λ; R). Start with T ∈ RLR(λ; R). Write T^- for the tableau obtained by removing the largest letter from T (which occurs in row r, say) and λ^- for the shape of T^-. Let R^- = ((μ_1), (μ_2), ..., (μ_{L−1}), (μ_L − 1)). Since T^- ∈ RLR(λ^-; R^-), by induction ψ_R(T^-) = (ν̄, J̄) is defined. Let s^{(r)} = ∞. For k = r − 1 down to 1, select the longest singular string in (ν̄, J̄)^{(k)} of length s^{(k)} (possibly of zero length) such that s^{(k)} ≤ s^{(k+1)}. With the convention s^{(0)} = μ_L − 1, it can be shown that s^{(0)} ≤ s^{(1)} as well. Then ψ_R(T) := (ν, J) is obtained from (ν̄, J̄) by lengthening each of the selected strings by one, resetting their labels to make them singular with respect to the vacancy numbers in the definition of RC(λ; R), and leaving all other strings unchanged. Denote the transformation (ν̄, J̄) → (ν, J) by δ^{-1}.

The inverse of δ^{-1}, denoted δ, is obtained as follows. Set ℓ^{(0)} = μ_L. Select inductively a singular string of length ℓ^{(k)} in (ν, J)^{(k)} with ℓ^{(k)} smallest such that ℓ^{(k)} ≥ ℓ^{(k−1)}. If no such singular string exists set ℓ^{(k)} = ∞. Then (ν̄, J̄) is obtained from (ν, J) by shortening all selected strings by one, making them singular again and leaving all other strings unchanged.

Remark 8.7. Up to the relabeling bijection std this is precisely the description of the bijection CST(λ; μ) → RC(λ; (μ_1), ..., (μ_L)) that was given in terms of the map called π∗ in [24].
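Lengthening one string of length s^{(k)} in the k-th partition raises Q_i of that partition by χ(i > s^{(k)}), so the vacancy numbers before and after applying δ^{-1} differ by a telescoping pair of indicator functions; this is the relation used later in the proof of Theorem 8.9. The Python sketch below checks that relation numerically. It is an illustration only: the data are made up, they are not claimed to form a valid rigged configuration (the identity depends only on the parts), and all names are hypothetical.

```python
import math


def Q(i, parts):
    """Q_i(mu) = sum_j min(mu_j, i)."""
    return sum(min(p, i) for p in parts)


def vacancy(i, k, seq):
    """Vacancy number; seq[0] plays the role of mu and seq[-1] is empty."""
    return Q(i, seq[k - 1]) - 2 * Q(i, seq[k]) + Q(i, seq[k + 1])


def lengthen(parts, s):
    """Replace one part of length s by s + 1; s = inf means nothing is selected."""
    out = list(parts)
    if s == math.inf:
        return out
    if s == 0:
        out.append(1)  # a string of zero length becomes a part of size 1
    else:
        out[out.index(s)] = s + 1
    return out


def chi(cond):
    return 1 if cond else 0


# Made-up data: the sequence before delta^{-1} and the selected lengths.
seq_before = [[2, 2, 1], [2, 1], [1], [1], []]
s = [1, 1, 1, math.inf, math.inf]  # s^(0) <= s^(1) <= s^(2), and s^(k) = inf for k >= r = 3
seq_after = [lengthen(p, sk) for p, sk in zip(seq_before, s)]


def relation_holds(i, k):
    lhs = vacancy(i, k, seq_before)
    rhs = (vacancy(i, k, seq_after)
           - chi(s[k - 1] < i <= s[k])
           + chi(s[k] < i <= s[k + 1]))
    return lhs == rhs


print(all(relation_holds(i, k) for k in (1, 2, 3) for i in range(1, 6)))
```

The check relies on the selected lengths being weakly increasing in k, which is exactly the condition s^{(k)} ≤ s^{(k+1)} imposed by the algorithm.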
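Example 8.8. Take μ = (2, 2, 2, 2, 1), λ = (3, 3, 2, 1) and

    T =  1 2 6      so that   T^- =  1 2 6
         3 4 8                       3 4 8
         5 9                         5
         7                           7

and r = 3. The rigged configuration corresponding to T^- is

[Diagram of the rigged configuration of T^-; the box diagrams are not recoverable from the source. All vacancy numbers and labels shown are 0, and two strings are marked ∗.]

where the labels are written to the right of each part and the vacancy numbers to the left. The selected strings under δ^{-1} with r = 3 are indicated by ∗. Hence the rigged configuration corresponding to T is

[Diagram of the rigged configuration of T; the box diagrams are not recoverable from the source.]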
8.4. Proof of the single row case. Now we come to the proof of Theorem 8.2 when R is a sequence of single rows. More precisely we will prove the following theorem.

Theorem 8.9. Let λ = (λ_1, λ_2, ..., λ_n) be a partition of level ℓ and μ = (μ_1, μ_2, ..., μ_L) an array of positive integers not exceeding ℓ. Then (ν, J) is in the image of CST^ℓ(λ, μ) under ψ_μ if and only if

(1) ν_1^{(k)} ≤ ℓ for all 1 ≤ k < n, and
(2) there exists a column-strict tableau t ∈ CST(λ') such that
\[
x_i^{(k)} \le P_i^{(k)}(\nu) - \sum_{j=1}^{\lambda_k-\lambda_n} \chi(i \ge \ell' + t_{j,k}) + \sum_{j=1}^{\lambda_{k+1}-\lambda_n} \chi(i \ge \ell' + t_{j,k+1}) \tag{8.2}
\]
for all 1 ≤ k < n and 1 ≤ i ≤ ℓ.

Remark 8.10. The first column of t ∈ CST(λ') has length λ_1 − λ_n. Since t is a column-strict tableau over the alphabet {1, 2, ..., λ_1 − λ_n} this requires that t_{j,1} = j.

Remark 8.11. Since t_{j,k} ≤ t_{j,k+1} the bounds in (8.2) can be rewritten as
\[
x_i^{(k)} \le P_i^{(k)}(\nu) - \sum_{j=1}^{\lambda_{k+1}-\lambda_n} \chi(\ell' + t_{j,k} \le i < \ell' + t_{j,k+1}) - \sum_{j=\lambda_{k+1}-\lambda_n+1}^{\lambda_k-\lambda_n} \chi(i \ge \ell' + t_{j,k}). \tag{8.3}
\]
Fig. 1. An example for nonintersecting paths illustrating (8.2)
For the proof of Theorem 8.9 it will be useful to have the following graphical description of (8.3) in mind. Consider n − 1 strips of length ℓ and height λ_1 − λ_n arranged on top of each other. Assign the label k to the k-th strip from the top. Within each strip assign a height label with height 1 at the bottom of the strip and height λ_1 − λ_n at the top of the strip. Call the coordinate along the horizontal axis the position. Then draw a horizontal line from position ℓ' + t_{j,k} to position ℓ' + t_{j,k+1} at height j in the k-th strip with a closed dot at position ℓ' + t_{j,k} and an open dot at position ℓ' + t_{j,k+1} to indicate that the first position belongs to the line, whereas the second one does not. If t_{j,k} = t_{j,k+1} draw an open dot. If t_{j,k+1} does not exist draw a horizontal line from position ℓ' + t_{j,k}, indicated by a closed dot, to position ℓ. If there is an open dot at position ℓ' + t_{j,k+1} of height j in strip k, then there is also a dot at the same position and height in strip k + 1. Connect all such dots by a vertical line. This way one obtains λ_1 − λ_n paths which all end at position ℓ. The j-th path in strip 1 starts at position ℓ' + j by Remark 8.10. Furthermore, since t_{j,k} < t_{j+1,k} the paths do not intersect. The k-th strip contains λ_k − λ_n paths. The other λ_1 − λ_k paths already ended at position ℓ in previous strips.

An example for a set of such nonintersecting paths is given in Fig. 1. It corresponds to n = 5, ℓ = 8, λ = (6, 5, 4, 3, 1) and

    t =  1 1 2 4
         2 2 3 5
         3 4 4
         4 5
         5

The dashed lines separate the various strips.

To read off the bound on x_i^{(k)} from the picture, draw a vertical line at position i. Suppose that m paths cross this line horizontally in strip k (when the vertical line goes through a closed/open dot we consider this as crossing/not crossing). Then P_i^{(k)}(ν) − m is the maximal possible rigging for strings of length i in (ν, J)^{(k)}. For example, the vertical line at position 6 in Fig. 1 crosses one line in strip 1, no line in strips 2 and 4, and two lines in strip 3, so that x_6^{(1)} ≤ P_6^{(1)}(ν) − 1, x_6^{(2)} ≤ P_6^{(2)}(ν), x_6^{(3)} ≤ P_6^{(3)}(ν) − 2, and x_6^{(4)} ≤ P_6^{(4)}(ν).

Recall that the rigging of a singular string of length i in (ν, J)^{(k)} equals the vacancy number P_i^{(k)}(ν). Hence the above graphical description of the bounds shows that (ν, J)^{(k)} cannot contain singular strings of length i in the intervals
\[
\ell' + t_{j,k} \le i < \ell' + t_{j,k+1} \quad \text{for } 1 \le j \le \lambda_{k+1}-\lambda_n,
\qquad\text{and}\qquad
\ell' + t_{\lambda_{k+1}-\lambda_n+1,k} \le i \quad \text{if } \lambda_{k+1} < \lambda_k,
\]
since in these intervals a vertical line at position i would cross at least one path. Conversely, if i is the length of a singular string in (ν, J)^{(k)} then it must be in the complements of these intervals, that is,
\[
\begin{aligned}
& 1 \le i < \ell' + t_{1,k}, \\
\text{or}\quad & \ell' + t_{j-1,k+1} \le i < \ell' + t_{j,k} && \text{for } 1 < j \le \lambda_{k+1}-\lambda_n, \\
\text{or}\quad & \ell' + t_{\lambda_{k+1}-\lambda_n,k+1} \le i < \ell' + t_{\lambda_{k+1}-\lambda_n+1,k} && \text{if } \lambda_{k+1} < \lambda_k, \\
\text{or}\quad & \ell' + t_{\lambda_{k+1}-\lambda_n,k+1} \le i \le \ell && \text{if } \lambda_{k+1} = \lambda_k.
\end{aligned}
\tag{8.4}
\]
Since t_{j,k} ≤ t_{j,k+1}, the intervals in (8.4) are pairwise disjoint, but some of them can of course be empty. Graphically the conditions in (8.4) require that i lies between two paths. More precisely, the first case in (8.4) states that i lies to the left of the first path, the second condition requires that i lies between the (j − 1)-th and j-th path, and the third case applies if there are more than λ_{k+1} − λ_n paths in the k-th strip, in which case i lies between paths λ_{k+1} − λ_n and λ_{k+1} − λ_n + 1. The last condition applies if there are exactly λ_{k+1} − λ_n paths in strip k. None of these ends at ℓ in this strip and the condition implies that i lies to the right of the rightmost path.

Remark 8.12. We use the following conventions throughout the proof: t_{0,k} = −ℓ' and t_{j,k} = λ_1 − λ_n + 1 for j > λ_k − λ_n.

Without further ado we present the gory details of the proof of Theorem 8.9.

Proof of Theorem 8.9. We prove the theorem by induction on |λ|. The theorem is true for λ = ∅ since then T = ∅ and (ν, J) = ∅. In this case T is of level ℓ ≥ 0 and conditions (1) and (2) of Theorem 8.9 are trivially satisfied.

Proof of the forward direction. Let T ∈ CST^ℓ(λ, μ) and (ν, J) = ψ_μ(T) be its image under the row-wise quantum number bijection. Let T^- be the tableau obtained from T by removing the rightmost largest entry. Set λ̄ = shape(T^-), (ν̄, J̄) = δ(ν, J) and denote by r the row index of the cell λ/λ̄. Set λ^0 = shape(T^{−μ_L}). The tableau T^- is of level ℓ since λ̄_1 − λ^0_n ≤ λ_1 − λ^0_n ≤ ℓ by the condition that T is of level ℓ. By induction the theorem holds for T^- so that ν̄_1^{(k)} ≤ ℓ and there exists a column-strict tableau t̄ of shape (λ̄_1 − λ̄_n, ..., λ̄_{n−1} − λ̄_n)^t such that by (8.3),
\[
\bar x_i^{(k)} \le P_i^{(k)}(\bar\nu) - \sum_{j=1}^{\bar\lambda_{k+1}-\bar\lambda_n} \chi(\bar\ell' + \bar t_{j,k} \le i < \bar\ell' + \bar t_{j,k+1}) - \sum_{j=\bar\lambda_{k+1}-\bar\lambda_n+1}^{\bar\lambda_k-\bar\lambda_n} \chi(i \ge \bar\ell' + \bar t_{j,k}) \tag{8.5}
\]
for all 1 ≤ k < n and 1 ≤ i ≤ ℓ. Here ℓ̄' = ℓ − λ̄_1 + λ̄_n, and x̄_i^{(k)} is the largest part of J̄_i^{(k)} and zero if J̄_i^{(k)} is empty. The aim is to show that conditions (1) and (2) of the theorem hold for (ν, J).

Denote by s^{(k)} the length of the selected singular string in (ν̄, J̄)^{(k)} under δ^{-1}. By definition μ_L − 1 = s^{(0)} ≤ s^{(1)} ≤ ··· ≤ s^{(r−1)} and s^{(k)} = ∞ for k ≥ r. We claim that there exist indices j^{(k)} for 0 ≤ k < r such that
\[
\bar\ell' + \bar t_{j^{(k)}-1,k+1} \le s^{(k)} < \bar\ell' + \bar t_{j^{(k)},k} \quad \text{for } 1 \le k < r, \tag{8.6}
\]
\[
\bar\ell' + \bar t_{j^{(0)}-1,1} \le s^{(0)} < \bar\ell' + \bar t_{j^{(0)},1}, \tag{8.7}
\]
and
\[
1 \le j^{(0)} \le j^{(1)} \le \cdots \le j^{(r-1)} \le \lambda_r - \lambda_n + \delta_{r,n}, \tag{8.8}
\]
where by definition t̄_{0,k} = −ℓ̄'. The proof proceeds by descending induction on k for 1 ≤ k < r. We make frequent use of (8.4) applied to (ν̄, J̄), where the first and third line are viewed as the cases j = 1 and j = λ̄_{k+1} − λ̄_n + 1 of the general interval appearing in the second line of (8.4).

First assume k = r − 1. Note that λ̄_r < λ̄_{r−1} since λ̄_r + 1 = λ_r ≤ λ_{r−1} = λ̄_{r−1}. Hence the existence of j^{(r−1)} follows from (8.4) since the last case does not apply. In particular we have j^{(r−1)} ≤ λ̄_r − λ̄_n + 1 = λ_r − λ_n + δ_{r,n}. Now consider 0 ≤ k < r − 1 and assume that ℓ̄' + t̄_{j^{(k)}−1,k+1} ≤ s^{(k)} for some j^{(k)} > j^{(k+1)}. Then by induction and the column-strictness of t̄,
\[
s^{(k+1)} < \bar\ell' + \bar t_{j^{(k+1)},k+1} \le \bar\ell' + \bar t_{j^{(k)}-1,k+1} \le s^{(k)},
\]
which is a contradiction. Hence j^{(k)} ≤ j^{(k+1)}. Since s^{(k)} is the length of a singular string and by induction j^{(k)} ≤ λ̄_r − λ̄_n + 1 ≤ λ̄_{k+1} − λ̄_n for 1 ≤ k < r − 1, s^{(k)} must be in the first or second set of the intervals in (8.4) with all quantities replaced by their barred counterparts, which proves (8.6). Equation (8.7) follows since t̄_{j,1} = j by Remark 8.10.

Let us now prove condition (1) of the theorem. By construction (ν, J) is obtained from (ν̄, J̄) by increasing the length of the selected strings in (ν̄, J̄)^{(k)} by one for 1 ≤ k < r, making them singular again and leaving all other strings unchanged. For r = 1 this means that (ν, J) = (ν̄, J̄) so that condition (1) of the theorem is satisfied by induction. Now assume r > 1. Since t̄_{j^{(r−1)},k} ∈ {1, 2, ..., λ̄_1 − λ̄_n} it follows from (8.6) with k = r − 1 that s^{(r−1)} < ℓ̄' + t̄_{j^{(r−1)},r−1} ≤ ℓ̄' + λ̄_1 − λ̄_n = ℓ. Hence μ_L − 1 ≤ s^{(1)} ≤ ··· ≤ s^{(r−1)} < ℓ which ensures condition (1) of the theorem for 1 < r ≤ n.

It remains to prove that the second condition of the theorem holds. The vacancy numbers of ν̄ and ν are related as follows:
\[
P_i^{(k)}(\bar\nu) = P_i^{(k)}(\nu) - \chi(s^{(k-1)} < i \le s^{(k)}) + \chi(s^{(k)} < i \le s^{(k+1)}).
\]
By construction x_i^{(k)} ≤ x̄_i^{(k)} for i ≠ s^{(k)} + 1 and x^{(k)}_{s^{(k)}+1} = P^{(k)}_{s^{(k)}+1}(ν) for 1 ≤ k < r. Hence
\[
x_i^{(k)} \le P_i^{(k)}(\nu) - \chi(s^{(k-1)} < i \le s^{(k)}) + \chi(s^{(k)} < i \le s^{(k+1)}) - \sum_{j=1}^{\bar\lambda_k-\bar\lambda_n} \chi(i \ge \bar\ell' + \bar t_{j,k}) + \sum_{j=1}^{\bar\lambda_{k+1}-\bar\lambda_n} \chi(i \ge \bar\ell' + \bar t_{j,k+1}) \tag{8.9}
\]
for i ≠ s^{(k)} + 1. In the remainder of the proof of the forward direction it will be shown that (8.2) holds for i = s^{(k)} + 1 and that (8.9) implies (8.2) for i ≠ s^{(k)} + 1. We distinguish the cases r = 1, 1 < r < n, and r = n.

Case r = 1. In this case λ̄_1 = λ_1 − 1, λ̄_k = λ_k for 1 < k ≤ n, ℓ̄' = ℓ' + 1 and t̄ is a column-strict tableau over the alphabet {1, 2, ..., λ_1 − λ_n − 1}. Furthermore s^{(0)} = μ_L − 1 and s^{(k)} = ∞ for k ≥ 1. By (8.9) we have for all 1 ≤ k < n and 1 ≤ i ≤ ℓ,
\[
x_i^{(k)} \le P_i^{(k)}(\nu) - \chi(i \ge \mu_L)\,\delta_{k,1} - \sum_{j=1}^{\lambda_k-\lambda_n-\delta_{k,1}} \chi(i \ge \ell' + 1 + \bar t_{j,k}) + \sum_{j=1}^{\lambda_{k+1}-\lambda_n} \chi(i \ge \ell' + 1 + \bar t_{j,k+1}). \tag{8.10}
\]
Remark 8.10 requires that t_{j,1} = j for 1 ≤ j ≤ λ_1 − λ_n. Hence
\[
-\chi(i \ge \mu_L) - \sum_{j=1}^{\lambda_1-\lambda_n-1} \chi(i \ge \ell' + 1 + \bar t_{j,1}) \le -\sum_{j=1}^{\lambda_1-\lambda_n} \chi(i \ge \ell' + t_{j,1}) + \chi(\ell' + 1 \le i < \mu_L). \tag{8.11}
\]
If μ_L ≤ ℓ' + 1 the term χ(ℓ' + 1 ≤ i < μ_L) vanishes. In this case set t_{j,k} = t̄_{j,k} + 1 for 1 < k < n and 1 ≤ j ≤ λ_k − λ_n which defines a column-strict tableau of shape (λ_1 − λ_n, ..., λ_{n−1} − λ_n)^t over {1, 2, ..., λ_1 − λ_n}. Then (8.10) implies (8.2).

The case μ_L > ℓ' + 1 is considerably harder to establish due to the extra term χ(ℓ' + 1 ≤ i < μ_L). Our strategy is as follows. The term χ(ℓ' + 1 ≤ i < μ_L) can be absorbed by defining t_{j,2} appropriately except in certain cases. In general this introduces extra terms for the bounds at k = 2. These in turn can be absorbed by defining t_{j,3} appropriately (except in certain cases) and so on. If all t_{j,k} for 1 ≤ k < n can be defined and all bounds for 1 ≤ k < n are written in the form of (8.2) we are done. In the exceptional cases (when (8.10) does not imply (8.2)) it can be shown that the corresponding tableau T is not of level ℓ which contradicts the assumptions.

Let us now plunge into the details. Define t_{j,1} = j for 1 ≤ j ≤ λ_1 − λ_n and set d = μ_L − ℓ' − 1. Since r = 1, we have x̄_i^{(k)} = x_i^{(k)} and P_i^{(k)}(ν, t) equals the right-hand side of (8.10). Let a_j^{(1)} for 1 ≤ j ≤ λ_2 − λ_n + 1 be the minimal index i ∈ [t̄_{j−1,2} + 1, t̄_{j,2}] ∩ [1, d] such that x^{(1)}_{ℓ'+i} = P^{(1)}_{ℓ'+i}(ν, t), where t̄_{0,k} = −ℓ̄' and t̄_{j,k} = λ̄_1 − λ̄_n + 1 for j > λ̄_k − λ̄_n. If no such i exists set a_j^{(1)} = t̄_{j,2} + 1. By definition x^{(1)}_{ℓ'+i} < P^{(1)}_{ℓ'+i}(ν, t) for t̄_{j−1,2} < i < a_j^{(1)} and 1 ≤ i ≤ d so that we can sharpen the bounds in (8.10) for k = 1 by adding
\[
-\sum_{j=1}^{\lambda_2-\lambda_n+1} \chi(\ell' + \bar t_{j-1,2} < i < \ell' + a_j^{(1)})\,\chi(\ell' + 1 \le i \le \ell' + d).
\]
Note that t̄_{λ_2−λ_n+1,2} + 1 = λ̄_1 − λ̄_n + 2 = λ_1 − λ_n + 1 = t_{λ_2−λ_n+1,2}. The case a^{(1)}_{λ_2−λ_n+1} < t_{λ_2−λ_n+1,2} will be dealt with later. Suppose that a^{(1)}_{λ_2−λ_n+1} = t_{λ_2−λ_n+1,2}.
Then one finds using (8.11),
\[
x_i^{(1)} \le P_i^{(1)}(\nu) - \sum_{j=1}^{\lambda_1-\lambda_n} \chi(i \ge \ell' + t_{j,1}) + \sum_{j=1}^{\lambda_2-\lambda_n} \chi(i \ge \ell' + 1 + \bar t_{j,2}) + \sum_{j=1}^{\lambda_2-\lambda_n} \chi(\ell' + a_j^{(1)} \le i \le \ell' + \bar t_{j,2}). \tag{8.12}
\]
Define t_{j,2} = min{a_j^{(1)}, t_{j+1,2} − 1} recursively by descending 1 ≤ j ≤ λ_2 − λ_n. From its definition it is clear that a_j^{(1)} lies in the interval [t̄_{j−1,2} + 1, t̄_{j,2} + 1]. By descending induction on j it also follows that t_{j,2} ∈ [t̄_{j−1,2} + 1, t̄_{j,2} + 1] and that either t_{j,2} = a_j^{(1)} or a_j^{(1)} − 1. The latter case only occurs when a_j^{(1)} = t_{j+1,2} = t̄_{j,2} + 1. In addition there must exist an index j' > j such that a_{j'}^{(1)} = t̄_{j'−1,2} + 1 if t_{j,2} = a_j^{(1)} − 1. This is because t_{j+1,2} = t̄_{j,2} + 1 is at its lower bound in the interval [t̄_{j,2} + 1, t̄_{j+1,2} + 1] and this can happen in only two ways; either t_{j+1,2} = a_{j+1}^{(1)} which proves the assertion with j' = j + 1 or t_{j+1,2} = a_{j+1}^{(1)} − 1 in which case the assertion must be true by induction since the initial case is t_{λ_2−λ_n+1,2} = a^{(1)}_{λ_2−λ_n+1}. Note that it also follows by induction that a_{j'}^{(1)} = a_j^{(1)} + j' − j − 1 if j' is minimal. From its definition it follows that t_{j,2} < t_{j+1,2} and furthermore t_{j,2} ≥ t̄_{j−1,2} + 1 ≥ t̄_{j−1,1} + 1 = j which are the conditions for column-strictness for the first two columns of t. Hence (8.12) yields (8.2) for k = 1.

We proceed inductively on 1 < k < n. Assume that by induction t_{j,k'} ∈ [t̄_{j−1,k'} + 1, t̄_{j,k'} + 1] is already defined for 1 ≤ k' ≤ k. In terms of t_{j,k} (8.10) reads
\[
x_i^{(k)} \le P_i^{(k)}(\nu) - \sum_{j=1}^{\lambda_k-\lambda_n} \chi(i \ge \ell' + t_{j,k}) + \sum_{j=1}^{\lambda_{k+1}-\lambda_n} \chi(i \ge \ell' + 1 + \bar t_{j,k+1}) + \sum_{j=1}^{\lambda_k-\lambda_n} \chi(\ell' + t_{j,k} \le i \le \ell' + \bar t_{j,k}). \tag{8.13}
\]
For 1 < k < n define a_j^{(k)} for 1 ≤ j ≤ λ_{k+1} − λ_n + 1 to be the minimal index i ∈ [t̄_{j−1,k+1} + 1, t̄_{j,k+1}] ∩ ⋃_{h=1}^{λ_k−λ_n} [t_{h,k}, t̄_{h,k}] such that x^{(k)}_{ℓ'+i} = P^{(k)}_{ℓ'+i}(ν, t). If no such i exists set a_j^{(k)} = t̄_{j,k+1} + 1. Note that t̄_{λ_{k+1}−λ_n+1,k+1} + 1 = λ_1 − λ_n + 1 = t_{λ_{k+1}−λ_n+1,k+1}. The case a^{(k)}_{λ_{k+1}−λ_n+1} < t_{λ_{k+1}−λ_n+1,k+1} will be dealt with later. Now assume that a^{(k)}_{λ_{k+1}−λ_n+1} = t_{λ_{k+1}−λ_n+1,k+1}, and define recursively t_{j,k+1} = min{a_j^{(k)}, t_{j+1,k+1} − 1} on descending 1 ≤ j ≤ λ_{k+1} − λ_n. By definition a_j^{(k)} ∈ [t̄_{j−1,k+1} + 1, t̄_{j,k+1} + 1]. As in the case k = 1 it follows by descending induction on j that t_{j,k+1} ∈ [t̄_{j−1,k+1} + 1, t̄_{j,k+1} + 1] and that either t_{j,k+1} = a_j^{(k)} or a_j^{(k)} − 1. By definition t_{j,k+1} < t_{j+1,k+1}.

Let us now show that also t_{j,k} ≤ t_{j,k+1} which would prove the column-strictness of t. By definition t_{h,k} ≤ a_j^{(k)} ≤ t̄_{h,k} for some h. Assume h < j. Then t̄_{h,k} ≥ a_j^{(k)} ≥ t̄_{j−1,k+1} + 1 which violates the column-strictness of t̄. Hence h ≥ j. If t_{j,k+1} = a_j^{(k)} then t_{j,k+1} ≥ t_{h,k} ≥ t_{j,k} as desired. If t_{j,k+1} = a_j^{(k)} − 1 then a problem can only occur if h = j and a_j^{(k)} = t_{j,k}. However in this case a_j^{(k)} = t_{j,k+1} + 1 = t_{j+1,k+1} = t̄_{j,k+1} + 1 > t̄_{j,k} = t̄_{h,k} which is a contradiction. This proves t_{j,k} ≤ t_{j,k+1}. By the same arguments as in the case k = 1 there must exist an index j' > j such that a_{j'}^{(k)} = t̄_{j'−1,k+1} + 1 if t_{j,k+1} = a_j^{(k)} − 1. For minimal j' it follows again by induction that a_{j'}^{(k)} = a_j^{(k)} + j' − j − 1.

Since by definition x̄^{(k)}_{ℓ'+i} = x^{(k)}_{ℓ'+i} < P^{(k)}_{ℓ'+i}(ν, t) for t̄_{j−1,k+1} < i < a_j^{(k)} and t_{h,k} ≤ i ≤ t̄_{h,k} for some 1 ≤ h ≤ λ_k − λ_n, one can add
\[
-\sum_{j=1}^{\lambda_{k+1}-\lambda_n+1} \chi(\ell' + \bar t_{j-1,k+1} < i < \ell' + a_j^{(k)}) \sum_{h=1}^{\lambda_k-\lambda_n} \chi(\ell' + t_{h,k} \le i \le \ell' + \bar t_{h,k})
\]
to (8.13). If a^{(k)}_{λ_{k+1}−λ_n+1} = t_{λ_{k+1}−λ_n+1,k+1} then the sum of this term and Σ_{j=1}^{λ_k−λ_n} χ(ℓ' + t_{j,k} ≤ i ≤ ℓ' + t̄_{j,k}) does not exceed Σ_{j=1}^{λ_{k+1}−λ_n} χ(ℓ' + a_j^{(k)} ≤ i ≤ ℓ' + t̄_{j,k+1}) and (8.2) is proven for 1 < k < n.

It remains to treat the case when there exists a 1 ≤ k < n such that a^{(k)}_{λ_{k+1}−λ_n+1} < λ_1 − λ_n + 1. Let κ be minimal with this property. We will show that in this case T is not of level ℓ which contradicts the assumptions. We claim that there exist indices h_k and j_k for 1 ≤ k ≤ κ such that
\[
\bar t_{j_k-1,k+1} + 1 \le a^{(k)}_{j_k} \le \bar t_{j_k,k+1}, \tag{8.14}
\]
\[
t_{h_k,k} \le a^{(k)}_{j_k} \le \bar t_{h_k,k}, \tag{8.15}
\]
h_1 ≥ h_2 ≥ ··· ≥ h_κ and h_k ≥ j_k ≥ h_{k+1}. The inequalities (8.14) and (8.15) hold for k = κ with j_κ = λ_{κ+1} − λ_n + 1 and some h_κ by the definition of κ. Now suppose that k < κ and that h_{k'} and j_{k'} for k < k' ≤ κ satisfying (8.14) and (8.15) have been defined by induction. Recall that either t_{j,k+1} = a_j^{(k)} or a_j^{(k)} − 1.

First assume that t_{h_{k+1},k+1} = a^{(k)}_{h_{k+1}}. This implies in particular that a^{(k)}_{h_{k+1}} = t_{h_{k+1},k+1} ≤ t̄_{h_{k+1},k+1} by (8.15) and hence t̄_{h_{k+1}−1,k+1} + 1 ≤ a^{(k)}_{h_{k+1}} ≤ t̄_{h_{k+1},k+1} by the definition of a_j^{(k)}. Set j_k = h_{k+1} and choose h_k such that (8.15) holds which must be possible by the definition of a_j^{(k)}. Also h_k ≥ j_k = h_{k+1} since otherwise t̄_{h_k,k} ≤ t̄_{j_k−1,k+1} by the column-strictness of t̄ which yields a contradiction since then (8.14) and (8.15) cannot hold simultaneously.

Next assume t_{h_{k+1},k+1} = a^{(k)}_{h_{k+1}} − 1. Let j_k > h_{k+1} be minimal such that a^{(k)}_{j_k} = t̄_{j_k−1,k+1} + 1 ≤ t̄_{j_k,k+1}; the existence of j_k was proved before. In addition it was shown that a^{(k)}_{j_k} = a^{(k)}_{h_{k+1}} + j_k − h_{k+1} − 1. The existence of h_k follows again from the definition of a_j^{(k)}. As before h_k ≥ j_k ≥ h_{k+1}.

By definition x̄^{(k)}_{ℓ'+a^{(k)}_{j_k}} = x^{(k)}_{ℓ'+a^{(k)}_{j_k}} = P^{(k)}_{ℓ'+a^{(k)}_{j_k}}(ν, t). Since P_i^{(k)}(ν, t) is given by the right-hand side of (8.13), it follows from (8.14), (8.15) and the fact that t_{j,k} ∈ [t̄_{j−1,k} + 1, t̄_{j,k} + 1] that
\[
x^{(k)}_{\ell'+a^{(k)}_{j_k}} = P^{(k)}_{\ell'+a^{(k)}_{j_k}}(\nu) - h_k + j_k \quad \text{for } 1 \le k \le \kappa. \tag{8.16}
\]
Define T^b = T^{−(μ_L−b)} for 0 ≤ b ≤ μ_L with corresponding rigged configurations (ν_b, J_b) = δ^{μ_L−b}(ν, J). Let r_b be the row index of the cell T^b/T^{b−1}. Denote the length of the selected string in (ν_b, J_b)^{(k)} under δ by ℓ_b^{(k)}. We claim that (8.16) implies
\[
\ell^{(k)}_{\ell'+j_k} \le \ell' + a^{(k)}_{j_k} \quad \text{for } 1 \le k \le \kappa. \tag{8.17}
\]
This is shown by induction on k. By construction b = ℓ_b^{(0)} ≤ ℓ_b^{(1)} ≤ ℓ_b^{(2)} ≤ ··· ≤ ℓ_b^{(n−1)} and ℓ_1^{(k)} < ℓ_2^{(k)} < ··· < ℓ_{μ_L}^{(k)}. In addition
\[
P_i^{(k)}(\nu_{b-1}) = P_i^{(k)}(\nu_b) - \chi(\ell_b^{(k-1)} \le i < \ell_b^{(k)}) + \chi(\ell_b^{(k)} \le i < \ell_b^{(k+1)}). \tag{8.18}
\]
If ℓ^{(1)}_{ℓ'+a^{(1)}_{j_1}} ≤ ℓ' + a^{(1)}_{j_1} then (8.17) follows immediately since j_1 ≤ a^{(1)}_{j_1} and ℓ^{(1)}_{b−1} < ℓ^{(1)}_b. Hence assume that ℓ^{(1)}_{ℓ'+a^{(1)}_{j_1}} > ℓ' + a^{(1)}_{j_1}. Since ℓ_b^{(0)} = b, the vacancy number at i = ℓ' + a^{(1)}_{j_1} is decreased by one with each application of δ until ℓ_b^{(1)} ≤ ℓ' + a^{(1)}_{j_1} because of (8.18) at k = 1. By (8.16) it takes h_1 − j_1 applications of δ until there is a singular string of length ℓ' + a^{(1)}_{j_1} in the first rigged partition. Since a^{(1)}_{j_1} = h_1 by (8.15) and Remark 8.10 this means that the singular string occurs at b = ℓ' + a^{(1)}_{j_1} − h_1 + j_1 = ℓ' + j_1 which proves (8.17) at k = 1.

Now consider the cases 1 < k ≤ κ and assume that (8.17) holds for k' < k. First assume that t_{h_k,k} = a^{(k−1)}_{h_k}. In this case j_{k−1} = h_k and by (8.15) a^{(k−1)}_{j_{k−1}} = t_{h_k,k} ≤ a^{(k)}_{j_k} so that ℓ^{(k−1)}_{ℓ'+j_{k−1}} ≤ ℓ' + a^{(k)}_{j_k} by (8.17) at k − 1. If ℓ^{(k)}_{ℓ'+j_{k−1}} ≤ ℓ' + a^{(k)}_{j_k} there is nothing to show since ℓ^{(k)}_{b−1} < ℓ^{(k)}_b and j_{k−1} ≥ j_k. Hence assume that ℓ^{(k)}_{ℓ'+j_{k−1}} > ℓ' + a^{(k)}_{j_k}. Again by (8.16) and (8.18) it takes h_k − j_k applications of δ until there is a singular string of length ℓ' + a^{(k)}_{j_k} in the k-th rigged partition. Since j_{k−1} = h_k the singular string occurs at b = ℓ' + j_{k−1} − (h_k − j_k) = ℓ' + j_k which proves (8.17).

Next assume that t_{h_k,k} = a^{(k−1)}_{h_k} − 1. Then a^{(k−1)}_{j_{k−1}} = a^{(k−1)}_{h_k} + j_{k−1} − h_k − 1 = t_{h_k,k} + j_{k−1} − h_k so that by (8.15) a^{(k−1)}_{j_{k−1}} ≤ a^{(k)}_{j_k} + j_{k−1} − h_k. Since ℓ^{(k−1)}_{ℓ'+j_{k−1}} ≤ ℓ' + a^{(k−1)}_{j_{k−1}} it takes at most j_{k−1} − h_k applications of δ before there is a singular string in the (k − 1)-th rigged partition of length not exceeding ℓ' + a^{(k)}_{j_k}. After that, by (8.16) and (8.18), it takes h_k − j_k applications of δ until there is a singular string of length ℓ' + a^{(k)}_{j_k} in the k-th rigged partition. Hence altogether the existence of a singular string of length ℓ' + a^{(k)}_{j_k} is assured at b = ℓ' + j_{k−1} − (j_{k−1} − h_k) − (h_k − j_k) = ℓ' + j_k which concludes the proof of (8.17).

Recall that j_κ = λ_{κ+1} − λ_n + 1. Therefore (8.17) implies that ℓ_b^{(κ)} is finite for 1 ≤ b ≤ ℓ' + λ_{κ+1} − λ_n + 1. If ℓ_b^{(k)} is finite this means that r_b > k so that r_b > κ for 1 ≤ b ≤ ℓ' + λ_{κ+1} − λ_n + 1. Since at most λ_{κ+1} − λ_n boxes can be removed from T^{ℓ'+λ_{κ+1}−λ_n+1} in rows with index κ < r_b < n it follows that r_{ℓ'+1} = n. This implies
λ^0_n ≤ λ_n − ℓ' − 1, where λ^0 = shape(T^0). Hence λ_1 − λ^0_n ≥ ℓ + 1 which contradicts the assumption that T is of level ℓ. This concludes the proof of the case r = 1.

Case 1 < r < n. In this case λ̄_r = λ_r − 1, λ̄_k = λ_k for k ≠ r and ℓ̄' = ℓ'. It is convenient to introduce p^{(1)} = j^{(1)} − 1,
\[
p^{(k)} = \begin{cases} \max\{j^{(k)} - 1, p^{(k-1)}\} & \text{for } s^{(k)} < \ell' + \bar t_{j^{(k)},k-1} - 1, \\ j^{(k)} & \text{for } s^{(k)} \ge \ell' + \bar t_{j^{(k)},k-1} - 1, \end{cases} \tag{8.19}
\]
for 1 < k < r and p^{(r)} = λ_r − λ_n. Note that for 1 ≤ k < r either p^{(k)} = j^{(k)} or p^{(k)} = j^{(k)} − 1 and that p^{(k)} ≤ p^{(k+1)}. Define t_{j,1} = j for 1 ≤ j ≤ λ_1 − λ_n,
\[
t_{j,k} = \begin{cases} \bar t_{j,k} & \text{for } 1 \le j < j^{(k-1)} \text{ and } p^{(k)} < j \le \lambda_k - \lambda_n, \\ s^{(k-1)} - \ell' + 1 & \text{for } j = j^{(k-1)} = p^{(k-1)}, \\ \max\{\bar t_{j-1,k}, \bar t_{j,k-1}\} & \text{for } p^{(k-1)} < j \le p^{(k)}, \end{cases} \tag{8.20}
\]
for 1 < k ≤ r and t_{j,k} = t̄_{j,k} for r < k < n and 1 ≤ j ≤ λ_k − λ_n. By (8.6) we have t̄_{j^{(k−1)}−1,k} < t̄_{j^{(k−1)},k−1} for 1 < k ≤ r so that
\[
t_{j^{(k-1)},k} = \bar t_{j^{(k-1)},k-1} \quad \text{for } p^{(k-1)} = j^{(k-1)} - 1 < p^{(k)}. \tag{8.21}
\]
It needs to be shown that t indeed defines a column-strict tableau over the alphabet {1, 2, ..., λ_1 − λ_n}. Since t̄_{j,k} ∈ {1, 2, ..., λ_1 − λ_n} and s^{(k)} < ℓ for all 1 ≤ k < r the condition t_{j,k+1} ∈ {1, 2, ..., λ_1 − λ_n} might only be violated if p^{(k)} = j^{(k)} and s^{(k)} < ℓ' for 1 ≤ k < r. By (8.6) the latter condition requires j^{(k)} = 1 so that 1 = j^{(1)} = ··· = j^{(k)} by (8.8). Since 0 ≤ s^{(1)} ≤ ··· ≤ s^{(k)} < ℓ' the first condition in (8.19) applies for p^{(h)} for 2 ≤ h ≤ k. However, since p^{(1)} = j^{(1)} − 1 = 0 this implies that p^{(k)} = 0 which contradicts the requirement j^{(k)} = p^{(k)}. This shows that t_{j,k+1} ∈ {1, 2, ..., λ_1 − λ_n}.

Next we check that t is column-strict. The condition t_{j,k} < t_{j+1,k} only needs to be checked for 1 < k ≤ r and j^{(k−1)} − 1 ≤ j ≤ p^{(k)} since in all other cases it automatically follows from the column-strictness of t̄. First assume p^{(k−1)} = j^{(k−1)}. Then t_{j^{(k−1)}−1,k} = t̄_{j^{(k−1)}−1,k} < s^{(k−1)} − ℓ' + 1 = t_{j^{(k−1)},k} by (8.6). Furthermore t_{j^{(k−1)},k} = s^{(k−1)} − ℓ' + 1 ≤ t̄_{j^{(k−1)},k−1} < t_{j^{(k−1)}+1,k} by (8.6) and (8.20). Next assume p^{(k−1)} = j^{(k−1)} − 1. Then for p^{(k−1)} < p^{(k)}, t_{j^{(k−1)}−1,k} = t̄_{j^{(k−1)}−1,k} < t̄_{j^{(k−1)},k−1} = t_{j^{(k−1)},k} by (8.6) and (8.21). For p^{(k−1)} = p^{(k)} the column-strictness is trivial. Furthermore t̄_{j−1,k} < t̄_{j,k} ≤ max{t̄_{j,k}, t̄_{j+1,k−1}} and t̄_{j,k−1} < t̄_{j+1,k−1} ≤ max{t̄_{j,k}, t̄_{j+1,k−1}} so that t_{j,k} < t_{j+1,k} for p^{(k−1)} < j < p^{(k)}. And finally t_{p^{(k)},k} = max{t̄_{p^{(k)}−1,k}, t̄_{p^{(k)},k−1}} ≤ t̄_{p^{(k)},k} < t̄_{p^{(k)}+1,k} = t_{p^{(k)}+1,k}.

The conditions t_{j,k} ≤ t_{j,k+1} only need to be verified for j^{(1)} ≤ j ≤ p^{(2)} and k = 1, for j^{(k−1)} ≤ j ≤ p^{(k+1)} and 1 < k < r, and for j^{(r−1)} ≤ j and k = r. First assume k = 1. Then t_{j,1} = j ≤ max{t̄_{j−1,2}, t̄_{j,1}} = t_{j,2} for j^{(1)} ≤ j ≤ p^{(2)}. Now assume 1 < k < r. For p^{(k−1)} = j^{(k−1)} < j^{(k)} we have t_{j^{(k−1)},k} = s^{(k−1)} − ℓ' + 1 ≤ t̄_{j^{(k−1)},k−1} ≤ t̄_{j^{(k−1)},k+1} = t_{j^{(k−1)},k+1} by (8.6). For p^{(k−1)} = j^{(k−1)} = j^{(k)} we have t_{j^{(k−1)},k} = s^{(k−1)} − ℓ' + 1 ≤ s^{(k)} − ℓ' + 1 = t_{j^{(k)},k+1}. For p^{(k−1)} < j < j^{(k)} one obtains t_{j,k} = max{t̄_{j−1,k}, t̄_{j,k−1}} ≤ t̄_{j,k+1} = t_{j,k+1}. Next assume p^{(k−1)} < p^{(k)} = j^{(k)}. Then the second case of (8.19) applies so that s^{(k)} − ℓ' + 1 ≥ t̄_{j^{(k)},k−1}. By (8.6) also s^{(k)} − ℓ' + 1 ≥ t̄_{j^{(k)}−1,k+1} ≥ t̄_{j^{(k)}−1,k}. This implies t_{j^{(k)},k} = max{t̄_{j^{(k)}−1,k}, t̄_{j^{(k)},k−1}} ≤ s^{(k)} − ℓ' + 1 = t_{j^{(k)},k+1}. And finally for p^{(k)} < j ≤ p^{(k+1)} we have t_{j,k} = t̄_{j,k} ≤ max{t̄_{j−1,k+1}, t̄_{j,k}} = t_{j,k+1}. In a similar fashion one shows that t_{j,r} ≤ t_{j,r+1}. Hence t forms a column-strict tableau of shape (λ_1 − λ_n, ..., λ_{n−1} − λ_n)^t.

Fig. 2. Illustration of the extra terms in (8.23) and (8.24)

We will now show that (8.2) holds with t as defined in (8.20). First assume that i = s^{(k)} + 1. In this case x_i^{(k)} = P_i^{(k)}(ν) if 1 ≤ k < r. Hence it needs to be shown that in this case P_i^{(k)}(ν, t) = P_i^{(k)}(ν). To this end it suffices to show that there exists an index j such that
\[
\ell' + t_{j-1,k+1} \le s^{(k)} + 1 < \ell' + t_{j,k}. \tag{8.22}
\]
Assume that p^{(k)} = j^{(k)}, so that ℓ' + t_{j^{(k)},k+1} = s^{(k)} + 1. By (8.6) s^{(k)} + 1 ≤ ℓ̄' + t̄_{j^{(k)},k} < ℓ̄' + t̄_{j^{(k)}+1,k} = ℓ' + t_{j^{(k)}+1,k} so that (8.22) holds with j = j^{(k)} + 1. Next assume that p^{(k)} = j^{(k)} − 1. Then by (8.19), s^{(k)} + 1 < ℓ̄' + t̄_{j^{(k)},k−1} ≤ ℓ̄' + t̄_{j^{(k)},k} = ℓ' + t_{j^{(k)},k}. Furthermore by (8.6), ℓ' + t_{j^{(k)}−1,k+1} = ℓ̄' + t̄_{j^{(k)}−1,k+1} ≤ s^{(k)} + 1 which implies (8.22) with j = j^{(k)}. In summary (8.22) holds for j = p^{(k)} + 1.

It remains to show that for i ≠ s^{(k)} + 1 the bounds (8.9) imply (8.2) with t as in (8.20). First assume 1 ≤ k < r. For i such that ℓ̄' + t̄_{j−1,k+1} ≤ i < ℓ̄' + t̄_{j,k} with 1 ≤ j ≤ λ_r − λ_n (8.5) simply reads x̄_i^{(k)} ≤ P_i^{(k)}(ν̄). By construction there are no singular strings of length s^{(k)} < i ≤ s^{(k+1)} in (ν̄, J̄)^{(k)}. Hence, for 1 ≤ k ≤ r − 2 we can sharpen the bounds in (8.5) and therefore also those in (8.9) by adding the terms
\[
-\chi(s^{(k)} < i \le \min\{s^{(k+1)}, \bar\ell' + \bar t_{j^{(k)},k} - 1\}) \tag{8.23}
\]
if j^{(k)} = j^{(k+1)} and
\[
-\chi(s^{(k)} < i < \bar\ell' + \bar t_{j^{(k)},k}) - \sum_{j=j^{(k)}}^{j^{(k+1)}-2} \chi(\bar\ell' + \bar t_{j,k+1} \le i < \bar\ell' + \bar t_{j+1,k}) - \chi(\bar\ell' + \bar t_{j^{(k+1)}-1,k+1} \le i \le \min\{s^{(k+1)}, \bar\ell' + \bar t_{j^{(k+1)},k} - 1\}) \tag{8.24}
\]
if j^{(k)} < j^{(k+1)}. In terms of the paths, this corresponds to adding a horizontal line segment (which is equivalent to extra minus signs) in the k-th strip in the interval s^{(k)} < i ≤ s^{(k+1)} whenever there is a horizontal gap between two neighboring paths. An example is given in Fig. 2. It depicts the k-th strip and the zigzag lines correspond to the added line segments.
The sum of extra terms (8.23) or (8.24) and χ(s^{(k)} < i ≤ s^{(k+1)}) does not exceed
\[
\begin{aligned}
& \sum_{j=j^{(k)}}^{p^{(k+1)}} \chi(\bar\ell' + \max\{\bar t_{j-1,k+1}, \bar t_{j,k}\} \le i < \bar\ell' + \bar t_{j,k+1}) \\
&\quad = \sum_{j=j^{(k)}}^{p^{(k+1)}} \chi(\ell' + t_{j,k+1} \le i < \bar\ell' + \bar t_{j,k+1}) - \chi(s^{(k)} < i < \bar\ell' + \bar t_{p^{(k)},k}).
\end{aligned}
\tag{8.25}
\]
To obtain the first line of (8.25) we have used t̄_{j^{(k)}−1,k+1} < t̄_{j^{(k)},k} by (8.6) for the term j = j^{(k)}, the definition (8.19) of p^{(k+1)} and s^{(k+1)} < ℓ̄' + t̄_{j^{(k+1)},k+1} when p^{(k+1)} = j^{(k+1)} which follows from (8.6). When p^{(k)} = j^{(k)} the second line follows directly using (8.20). When p^{(k)} = j^{(k)} − 1 use that ℓ̄' + t̄_{j^{(k)}−1,k} ≤ ℓ̄' + t̄_{j^{(k)}−1,k+1} ≤ s^{(k)} by (8.6) so that the last term vanishes.

Similarly for k = r − 1 the bounds in (8.9) can be sharpened by adding
\[
-\chi(s^{(r-1)} < i < \bar\ell' + \bar t_{j^{(r-1)},r-1}) - \sum_{j=j^{(r-1)}}^{\lambda_r-\lambda_n-1} \chi(\bar\ell' + \bar t_{j,r} \le i < \bar\ell' + \bar t_{j+1,r-1}).
\]
Together with χ(s^{(r−1)} < i ≤ s^{(r)}) = χ(s^{(r−1)} < i) this yields by similar reasons as before
\[
\sum_{j=j^{(r-1)}}^{\lambda_r-\lambda_n-1} \chi(\ell' + t_{j,r} \le i < \bar\ell' + \bar t_{j,r}) + \chi(i \ge \ell' + t_{\lambda_r-\lambda_n,r}) - \chi(s^{(r-1)} < i < \bar\ell' + \bar t_{p^{(r-1)},r-1}). \tag{8.26}
\]
Note that t̄_{j−1,k} ≤ t_{j,k} ≤ t̄_{j,k} for j^{(k−1)} ≤ j ≤ p^{(k)} and 1 ≤ k < r. In addition s^{(k−1)} < ℓ' + t_{j^{(k−1)},k} for 1 ≤ k ≤ r. For k = 1 this follows from (8.7) and Remark 8.10, for 1 < k ≤ r and p^{(k−1)} = j^{(k−1)} this follows from (8.20), and for p^{(k−1)} = j^{(k−1)} − 1 one exploits the first condition of (8.19) and (8.21). Also s^{(k)} ≥ ℓ̄' + t̄_{j^{(k)}−1,k} for 1 ≤ k < r thanks to (8.6). This implies for 1 ≤ k < r,
\[
-\chi(s^{(k-1)} < i \le s^{(k)}) \le -\sum_{j=j^{(k-1)}}^{p^{(k)}} \chi(\ell' + t_{j,k} \le i < \bar\ell' + \bar t_{j,k}) + \chi(s^{(k)} < i < \bar\ell' + \bar t_{p^{(k)},k}). \tag{8.27}
\]
From (8.25) (or (8.26) for k = r − 1) and (8.27) it is straightforward to see that (8.9) implies (8.2) for 1 ≤ k < r.
Now consider k = r. Then s^{(r)} = ∞, s^{(r−1)} < ℓ' + t_{j^{(r−1)},r} as shown above (8.27) and t̄_{j,r} ≤ t_{j+1,r} for j^{(r−1)} ≤ j < λ_r − λ_n so that
\[
\begin{aligned}
-\chi(s^{(r-1)} < i \le s^{(r)}) - \sum_{j=1}^{\bar\lambda_r-\bar\lambda_n} \chi(i \ge \bar\ell' + \bar t_{j,r})
&\le -\chi(i \ge \ell' + t_{j^{(r-1)},r}) - \sum_{j=1}^{j^{(r-1)}-1} \chi(i \ge \ell' + t_{j,r}) - \sum_{j=j^{(r-1)}}^{\lambda_r-\lambda_n-1} \chi(i \ge \ell' + t_{j+1,r}) \\
&= -\sum_{j=1}^{\lambda_r-\lambda_n} \chi(i \ge \ell' + t_{j,r}).
\end{aligned}
\]
For r < k < n we have s^{(k−1)} = s^{(k)} = ∞ and t̄_{j,k} = t_{j,k} so that −χ(s^{(k−1)} < i ≤ s^{(k)}) = 0 and −Σ_{j=1}^{λ̄_k−λ̄_n} χ(i ≥ ℓ̄' + t̄_{j,k}) = −Σ_{j=1}^{λ_k−λ_n} χ(i ≥ ℓ' + t_{j,k}). For r ≤ k < n we have s^{(k)} = s^{(k+1)} = ∞ and t̄_{j,k+1} = t_{j,k+1} so that χ(s^{(k)} < i ≤ s^{(k+1)}) = 0 and Σ_{j=1}^{λ̄_{k+1}−λ̄_n} χ(i ≥ ℓ̄' + t̄_{j,k+1}) = Σ_{j=1}^{λ_{k+1}−λ_n} χ(i ≥ ℓ' + t_{j,k+1}). This concludes the proof that (8.9) implies (8.2) for 1 < r < n.

Case r = n. In this case λ̄_k = λ_k for 1 ≤ k < n, λ̄_n = λ_n − 1, ℓ̄' = ℓ' − 1 and t̄ is a tableau over the alphabet {1, 2, ..., λ_1 − λ_n + 1}. It follows from (8.8) that j^{(0)} = ··· = j^{(n−1)} = 1. Note that this requires in particular that s^{(0)} = μ_L − 1 < ℓ̄' + 1 = ℓ'. Define t_{j,k} = t̄_{j+1,k} − 1 for 1 ≤ k < n and 1 ≤ j ≤ λ_k − λ_n. Then by the column-strictness of t̄ we have t_{j,k} < t_{j+1,k} and t_{j,k} ≤ t_{j,k+1}. Note in particular that t_{j,1} = t̄_{j+1,1} − 1 = j so that t is a column-strict tableau over the alphabet {1, 2, ..., λ_1 − λ_n}. In addition it follows from (8.6) that 0 ≤ s^{(k)} + 1 ≤ ℓ̄' + t̄_{1,k} < ℓ̄' + t̄_{2,k} = ℓ' + t_{1,k} so that (8.22) holds for j = 1. This ensures (8.2) for i = s^{(k)} + 1. Using the fact that there are no singular strings of length s^{(k)} < i < ℓ̄' + t̄_{1,k} in (ν̄, J̄)^{(k)} and that s^{(k+1)} < ℓ̄' + t̄_{1,k+1} by (8.6) the term χ(s^{(k)} < i ≤ s^{(k+1)}) in (8.9) can be safely replaced by χ(ℓ̄' + t̄_{1,k} ≤ i < ℓ̄' + t̄_{1,k+1}). Furthermore dropping the term −χ(s^{(k−1)} < i ≤ s^{(k)}) Eq. (8.9) becomes
\[
x_i^{(k)} \le P_i^{(k)}(\nu) - \sum_{j=2}^{\lambda_k-\lambda_n+1} \chi(i \ge \bar\ell' + \bar t_{j,k}) + \sum_{j=2}^{\lambda_{k+1}-\lambda_n+1} \chi(i \ge \bar\ell' + \bar t_{j,k+1}).
\]
Using ℓ̄' = ℓ' − 1 and the definition of t this is exactly (8.2). This concludes the proof of the forward direction of the theorem.
Proof of the reverse direction. Let us now prove the reverse direction. To this end consider a rigged configuration (ν, J ) corresponding to a column-strict tableau T of (k) shape λ and content µ under ψ µ which satisfies ν1 ≤ * for all 1 ≤ k < n and (8.2). We need to show that then T is of level *. This is equivalent to showing that T − is of level * and that λ1 − λ0n ≤ * if r = 1, where r is the row index of the cell λ/λ, λ = shape(T − )
µ
and λ0 = shape(T − L ). By induction the statement that T − is of level * is equivalent (k) * = * − λ1 + λn ≥ 0 and (ν, J) = δ(ν, J ) satisfies ν 1 ≤ * for all to the statement that 1 ≤ k < n and (k)
xi
(k)
≤ Pi (ν) −
λ k −λn
χ (i ≥ * + t j,k ) +
λk+1 −λn
j =1
χ (i ≥ * + t j,k+1 )
(8.28)
j =1
for some column-strict tableau t of shape (λ1 − λn , . . . , λn−1 − λn )t . To prove *≥0 it suffices to show that r = n cannot occur when * = 0. Let *(k) (1 ≤ k < r) be the length of the selected singular string in (ν, J )(k) under δ. By definition µL = *(0) ≤ *(1) ≤ *(2) ≤ · · · ≤ *(r−1) ≤ *, and the rigged configuration (ν, J) is obtained from (ν, J ) by shortening the selected strings by one, making them (k) singular again and leaving all other strings unchanged. Since ν1 ≤ * this immediately (k) implies ν 1 ≤ * for all 1 ≤ k < n. The vacancy numbers are related by (k)
(k)
Pi (ν) = Pi (ν) − χ (*(k−1) ≤ i < *(k) ) + χ (*(k) ≤ i < *(k+1) ). (k)
Furthermore x i (8.2) implies (k)
xi
(k)
≤ xi
(k)
(8.29)
(k)
for i = *(k) − 1 and x *(k) −1 = P*(k) −1 (ν) for 1 ≤ k < r so that
(k)
≤ Pi (ν) + χ (*(k−1) ≤ i < *(k) ) − χ (*(k) ≤ i < *(k+1) ) −
λ k −λn j =1
χ (i ≥ * + tj,k ) +
λk+1 −λn
χ (i ≥ * + tj,k+1 )
(8.30)
j =1
for i = *(k) − 1. Since *(k) is the length of a singular string in (ν, J )(k) it must be in one of the intervals in (8.4). Let j (k) for 1 ≤ k < r be the index such that * + tj (k) −1,k+1 ≤ *(k) < * + tj (k) ,k ,
(8.31)
* and tj,k = λ1 − λn + 1 for all j > λk − λn . By similar where we recall that t0,k+1 = − arguments as in the derivation of (8.8) one finds 1 ≤ j (1) ≤ · · · ≤ j (r−1) ≤ λr − λn + 1.
(8.32)
Case r = 1. In this case λk = λk for 1 < k ≤ n, λ1 = λ1 − 1, * = * − 1 and (k) tj,k ∈ {1, 2, . . . , λ1 − λn } = {1, 2, . . . , λ1 − λn + 1}. Let a (1 ≤ k < n) be maximal such that tj,k = j for all 1 ≤ j ≤ a (k) . It follows from Remark 8.10 and tj,k ≤ tj,k+1 that 0 ≤ a (n−1) ≤ · · · ≤ a (2) ≤ a (1) = λ1 − λn . Set t j,1 = j for 1 ≤ j ≤ λ1 − λn and j for 1 ≤ j ≤ a (k) , t j,k = tj,k − 1 for a (k) < j ≤ λk − λn , for 1 < k < n. The definition of a (k) and column-strictness of t ensure the columnstrictness of t. Note that the terms j = 1, 2, . . . , a (k+1) in the two sums in (8.30) cancel
each other. Recall that *(0) = µL and *(k) = ∞ for k ≥ 1. Assume k = 1. The term χ (*(0) ≤ i < *(1) ) = χ (i ≥ µL ) in (8.30) can be replaced by χ (i ≥ * + ta (2) +1,1 ) = (2) (2) χ (i ≥ * + a + 1). If µL ≤ * + a + 1 this follows from the fact that by construction there are no singular strings of length ≥ µL in (ν, J )(1) . For µL > * + a (2) + 1 we have * − 1, * + a (2) + 1). Hence using *= χ (i ≥ µL ) ≤ χ (i ≥ (1)
xi
(1)
≤ Pi (ν) + χ (i ≥ * + a (2) ) −
+
λ 2 −λn
λ1 −λ n +1
χ (i ≥ * + j)
j =a (2) +1
χ (i ≥ * + tj,2 )
j =a (2) +1
=
(1) Pi (ν) −
λ 1 −λn
χ (i ≥ * + j) +
j =a (2) +1
λ 2 −λn
χ (i ≥ * + t j,2 ),
j =a (2) +1
which is (8.28) for k = 1. Now assume 1 < k < n. Since *(k) = ∞ for 1 ≤ k ≤ n the terms involving *(k) in (8.30) vanish and (k)
xi
(k)
≤ Pi (ν) − (k)
≤ Pi (ν) −
λ k −λn
χ (i ≥ * + tj,k ) +
j =a (k+1) +1 λ k −λn j =a (k+1) +1
λk+1 −λn
χ (i ≥ * + tj,k+1 )
j =a (k+1) +1
χ (i ≥ * + t j,k ) +
λk+1 −λn
χ (i ≥ * + t j,k+1 )
j =a (k+1) +1
which is (8.28) for 1 < k < n. This concludes the proof that (8.2) implies (8.28) for r = 1. *= *. Set p (r) = j (r) = Case 1 < r < n. Here λk = λk for k = r, λr = λr − 1 and λr − λn and j (k) − 1 for *(k) ≤ * + tj (k) −1,k+2 , (k) p = (8.33) (k) (k+1) min{j , p } for *(k) > * + tj (k) −1,k+2 , for 1 ≤ k < r, where we recall that tj,r+1 = λ1 − λn + 1 for j > λr+1 − λn . Note that p(k) = j (k) or j (k) − 1 and p(k) ≤ p(k+1) due to (8.32). Define t j,1 = j for 1 ≤ j ≤ λ1 − λn , for 1 ≤ j < p(k−1) and j (k) ≤ j ≤ λk − λn , tj,k t j,k = min{tj,k+1 , tj +1,k } for p (k−1) ≤ j < p(k) , (8.34) *(k) − (k) (k) * for j = p = j − 1, for 1 < k ≤ r and t j,k = tj,k for r < k < n and 1 ≤ j ≤ λk − λn . Recall that tj,k = λ1 − λn + 1 for j > λk − λn .
It needs to be shown that t is a tableau over the alphabet {1, 2, . . . , λ1 − λn }. Since tj,k ∈ {1, 2, . . . , λ1 − λn } = {1, 2, . . . , λ1 − λn } the only problematic case is the third case in (8.34). Condition 8.9 of the theorem implies that *(k) ≤ * for 1 ≤ k < r so that *(k) − * ≤ λ1 − λn . By (8.31) the condition 1 ≤ *(k) − * can only be violated if j (k) = 1. (k) Assume that j = 1 for some 1 ≤ k < r. Let h be maximal such that j (k) = 1 for all 1 ≤ k ≤ h. Then *(k) > * + tj (k) −1,k+2 = 0 for all 1 ≤ k ≤ h so that the second case in (8.33) applies. If h < r − 1 we have p(h+1) ≥ j (h+1) − 1 ≥ 1 by the maximality of h. If h = r − 1, p(r) = j (r) = λr − λn ≥ 1. In both cases it follows that p (k) = j (k) = 1 for all 1 ≤ k ≤ h. Hence by (8.34) the case t p(k) ,k = *(k) − * < 1 does not occur. This proves that t j,k ∈ {1, 2, . . . , λ1 − λn }. It remains to show that t is column-strict. The condition t j,k < t j +1,k only needs to be considered for p(k−1) −1 ≤ j < j (k) and 1 < k < r and for p (r−1) −1 ≤ j ≤ λr+1 −λn and k = r by the column-strictness of t. In these cases t j,k < t j +1,k can be deduced from the following inequalities: (a) (b)
tj,k < min{tj +1,k+1 , tj +2,k }, min{tj,k+1 , tj +1,k } ≤ tj +1,k < tj +2,k , min{tj,k+1 , tj +1,k } ≤ tj,k+1 < tj +1,k+1 ,
(c)
min{tj (k) −2,k+1 , tj (k) −1,k } ≤ tj (k) −2,k+1 < tj (k) −1,k+1 ≤ *(k) − *, * < tj (k) ,k for 1 ≤ k < r, *(k) −
(d)
min{tj (k) −1,k+1 , tj (k) ,k } = tj (k) −1,k+1 < tj (k) ,k
for 1 ≤ k < r,
where (8.31) was employed extensively. The condition t j,k ≤ t j,k+1 needs to be verified for k = 1 and p(1) ≤ j < j (2) , for 1 < k < r and p (k−1) ≤ j < j (k+1) and for k = r and p(r−1) ≤ j ≤ λr+1 − λn . In these cases t j,k ≤ t j,k+1 can be deduced from the following inequalities: (a) (b)
min{tj,k+1 , tj +1,k } ≤ tj,k+1 , tj,k ≤ min{tj,k+2 , tj +1,k+1 },
(c)
tj (k+1) −1,k ≤ tj (k+1) −1,k+2 ≤ *(k+1) − *,
where again (8.31) was employed. In addition for 1 < k < r we have *(k) − * ≤ min{tj (k) −1,k+2 , tj (k) ,k+1 } if *(k) ≤ * + tj (k) −1,k+2 . If *(k) > * + tj (k) −1,k+2 then p (k) = j (k) − 1 is only possible if p (k) = p(k+1) = j (k) − 1 which implies that j (k) = j (k+1) . However in this case t j (k) −1,k = *(k) − * ≤ *(k+1) − * = t j (k) −1,k+1 . This proves the column-strictness of t. (k) (k) By definition x *(k) −1 = P*(k) −1 (ν) for 1 ≤ k < r. Hence we need to check that (k)
(k)
Pi (ν, t) = Pi (ν) for i = *(k) − 1. It suffices to show that there exists an index j such that * + t j −1,k+1 ≤ *(k) − 1 < * + t j,k .
(8.35)
Assume that p(k) = j (k) − 1. Then * + t j (k) −1,k = *(k) and by (8.31) *(k) ≥ *+ (k) tj (k) −1,k+1 > * + tj (k) −2,k+1 = * + t j (k) −2,k+1 so that (8.35) holds for j = j − 1. * + tj (k) −1,k+2 ≥ * + tj (k) −1,k+1 = Now assume p(k) = j (k) . Then by (8.33), *(k) > (k) * + t j (k) −1,k+1 . Furthermore by (8.31) * < * + tj (k) ,k = * + t j (k) ,k so that (8.35) holds for j = j (k) .
For i = *(k) −1 we need to show that (8.30) implies (8.28). Note that for 1 ≤ k < r we have t j,k+1 ≤ tj +1,k+1 for p (k) ≤ j < j (k+1) since min{tj,k+2 , tj +1,k+1 } ≤ tj +1,k+1 and *(k+1) − * < tj (k+1) ,k+1 for 1 ≤ k < r − 1 by (8.31). In addition *(k+1) ≥ * + t j (k+1) −1,k+1 . For p(k+1) = j (k+1) − 1 this follows directly from (8.34), and for * + tj (k+1) −1,k+2 ≥ * + t j (k+1) −1,k+1 by (8.31). Since p(k+1) = j (k+1) we have *(k+1) ≥ (k) furthermore * < * + tj (k) ,k+1 by (8.31) we have for 1 ≤ k < r, * + tp(k) ,k+1 ≤ i < *(k) − χ (*(k) ≤ i < *(k+1) ) ≤ χ −
j (k+1) −1
χ ( * + tj,k+1 ≤ i < * + t j,k+1 ) − δk+1,r χ (i ≥ * + tλr −λn ,r ), (8.36)
j =p(k)
where the last term occurs since *(r) = ∞. Observe that tj,k+1 ≤ t j,k+1 for p(k) ≤ j < j (k+1) and 1 ≤ k < r since min{tj,k+2 , tj +1,k+1 } ≥ tj,k+1 and *(k+1) − * ≥ tj (k+1) −1,k+2 ≥ tj (k+1) −1,k+1 for 1 ≤ k < r − 1 by (8.31). Hence using (8.36) and (8.34) we have for 1 ≤ k < r λk+1 −λn (k) (k+1) −χ * ≤ i < * χ (i ≥ * + tj,k+1 ) + j =1 λk+1 −λn
≤
j =1
χ i ≥ * + t j,k+1
(8.37)
+χ * + tp(k) ,k+1 ≤ i < *(k) .
Since tj,1 = j by Remark 8.10 and tj,1 ≤ tj,2 Eq. (8.31) implies that either *(0) ≤ < * + 1 for j (1) = 1 or *(1) = * + j (1) − 1 and tj,2 = j for 1 ≤ j < j (1) ≤ (1) (1) λr − λn + 1. Note that in both cases (8.2) reads xi ≤ Pi (ν) for 1 ≤ i < *(1) . Since by construction there are no singular strings of length *(0) ≤ i < *(1) in (ν, J )(1) we can add the term −χ (*(0) ≤ i < *(1) ) to (8.2) for k = 1. This has the effect that the term χ (*(0) ≤ i < *(1) ) in (8.30) for k = 1 can be dropped. Note that for j (1) = 1 we have p(1) = 1 so that the term χ ( * + tp(1) ,2 ≤ i < *(1) ) in (8.37) is zero. For j (1) > 1 * + j (1) − 1. Since this term is also zero since tp(1) ,2 ≥ tj (1) −1,2 = j (1) − 1 and *(1) = λ1 −λn λ1 −λn * + tj,1 ) = − j =1 χ (i ≥ * + t j,1 ), this proves that in addition − j =1 χ (i ≥ (8.30) implies (8.28) for k = 1. Now assume that 1 < k < r. By construction there are no singular strings of length *(k−1) ≤ i < *(k) in (ν, J )(k) . Therefore the bounds (8.2) and hence also the bounds in (8.30) can be sharpened by adding −χ max *(k−1) , * + tj (k) −1,k+1 ≤ i < *(k) *(1)
for j (k−1) = j (k) and − χ max *(k−1) , * + tj (k−1) ,k * + tj (k−1) −1,k+1 ≤ i < −
(k) −1 j
j =j (k−1) +1
χ * + tj −1,k+1 ≤ i < * + tj,k − χ * + tj (k) −1,k+1 ≤ i < *(k)
for j (k−1) < j (k) . Adding these to χ (*(k−1) ≤ i < *(k) ) does not exceed (k) −1 j
χ ( * + tj,k ≤ i < * + min{tj,k+1 , tj +1,k })
j =p(k−1)
=
(k) −1 j
χ ( * + tj,k ≤ i < * + t j,k ) − χ ( * + tp(k) ,k+1 ≤ i < *(k) ).
j =p(k−1)
Using again that tj,k ≤ t j,k for p(k−1) ≤ j < j (k) this can be combined with the term k −λn k −λn − λj =1 χ (i ≥ *+tj,k ) of (8.30) to yield − λj =1 χ (i ≥ *+t j,k )−χ ( *+tp(k) ,k+1 ≤ (k) i < * ). Together with (8.37) this proves that (8.30) implies (8.28) for 1 < k < r. (r) (r) Consider k = r. Recall that λr+1 < λr so that (8.2) implies xi ≤ Pi (ν) − 1 for i ≥ * + tλr+1 −λn +1,r . By construction there are no singular strings of length i ≥ *(r−1) in (ν, J )(r) . Hence for j (r−1) ≤ λr+1 − λn + 1 the bounds in (8.2) and (8.30) can be sharpened by adding * + tj (r−1) −1,r+1 − χ max *(r−1) , ≤i < * + tj (r−1) ,r −
λr+1 −λn +1
χ ( * + tj −1,r+1 ≤ i < * + tj,r ),
j =j (r−1) +1
which added to χ (i ≥ *(r−1) ) does not exceed λr+1 −λn
χ * + tj,r ≤ i < * + min{tj,r+1 , tj +1,r } + χ (i ≥ * + tλr+1 −λn +1,r )
j =p(r−1)
=
λr −λ n −1
χ * + tj,r ≤ i < * + min{tj,r+1 , tj +1,r } + χ (i ≥ * + tλr −λn ,r ),
j =p(r−1)
(8.38) where in the last line we used that min{tj,r+1 , tj +1,r } = tj +1,r for j > λr+1 − λn since by definition tj,r+1 = λ1 − λn + 1 in this case. The last line of (8.38) also makes sense for j (r−1) > λr+1 − λn + 1 since then p (r−1) = j (r−1) − 1 and * + tj (r−1) −1,r ≤ *(r−1) λr −λn by (8.31). The last line of (8.38) combined with − j =1 χ (i ≥ * + tj,r ) yields λr −λn − j =1 χ (i ≥ * + t j,r ) using (8.34). For r < k < n the term χ (*(k−1) ≤ i < k −λn k −λn *(k) ) vanishes and − λj =1 χ (i ≥ * + tj,k ) = − λj =1 χ (i ≥ * + t j,k ). Similarly λ k+1 −λn (k) (k+1) −χ (* ≤ i < * ) is zero for r ≤ k < n and j =1 χ (i ≥ * + tj,k+1 ) = λk+1 −λn χ (i ≥ * + t j,k+1 ). Together these results prove (8.28) for r ≤ k < n. j =1 This concludes the proof of the reverse direction of the theorem for 1 < r < n.
*= * − 1. Then by Case r = n. In this case λk = λk for 1 ≤ k < n, λn = λn − 1 and (8.32) it follows that j (1) = · · · = j (n−1) = 1. In particular from (8.31), *(1) < *+t1,1 = * + 1 which yields a contradiction when * = 0 since by assumption *(1) ≥ µL ≥ 1. Hence the case r = n cannot occur when * = 0. Define t 1,k = 1 and t j,k = tj −1,k + 1 for 1 < j ≤ λk − λn for all 1 ≤ k < n. Since tj,k ∈ {1, 2, . . . , λ1 − λn − 1} it follows that t j,k ∈ {1, 2, . . . , λ1 − λn }. The column-strictness of t immediately implies the column-strictness of t. *+t1,k and there are no singular strings of length *(k−1) ≤ i < *(k) Since µL ≤ *(k) < (k) in (ν, J ) we may drop the term χ (*(k−1) ≤ i < *(k) ) in (8.30). In addition dropping the term −χ (*(k) ≤ i < *(k+1) ) (8.30) implies for i = *(k) − 1, (k)
xi
(k)
≤ Pi (ν) − (k)
= Pi (ν) −
λk −λ n −1
* + 1 + tj,k + χ i ≥
j =1
λk+1 −λn −1
* + 1 + tj,k+1 χ i ≥
j =1
λ k −λn
λk+1 −λn χ i ≥ χ i ≥ * + t j,k + * + t j,k+1 .
j =2
j =2
The terms j = 1 can be added to both sums since they just cancel each other so that we have (8.28) for i = *(k) − 1. * + t1,k for 1 ≤ k < n so that Finally consider the case i = *(k) − 1. We have *(k) < (k) * − 1 < * + t 2,k . Since the terms j = 1 in the two sums cancel, (8.28) for i = *(k) − 1 (k) (k) (k) (k) reduces to x *(k) −1 ≤ P*(k) −1 (ν), or equivalently P*(k) −1 (ν, t) = P*(k) −1 (ν) as desired. This concludes the proof that T − is of level *. Zu guter Letzt. It remains to show that λ1 − λ0n ≤ *. µ −b Define T b = T − L for 0 ≤ b ≤ µL with corresponding rigged configurations µ −b (k) (ν b , J b ) = δ L (ν, J ), and λb = shape(T b ). Let (x b )i be the largest rigging occurb b ring for the strings of length i in (ν , J ) and let rb be the row index of the cell λb /λb−1 for 1 ≤ b ≤ µL . Then n ≥ r1 ≥ r2 ≥ · · · ≥ rµL ≥ 1. Denote the length of the selected (k) string in (ν b , J b )(k) under δ by *b . Let 1 ≤ β ≤ µL be maximal such that rβ = n. If no such β exists set β = 0. Then λ0n = λn − β. Hence proving λ1 − λ0n ≤ * is equivalent to showing that β ≤ *. * then also β ≤ µL ≤ *. Hence assume that µL = * + d with d ≥ 1. For If µL ≤ b > * set b = b − *. We will show by descending induction on *
(x b )i
(k)
≤ Pi (ν b ) −
min{b,λ k −λn } j =1
χ (i ≥ * + tj,k ) +
min{b,λk+1 −λn }
χ (i ≥ * + tj,k+1 )
(8.39)
j =1
for all 1 ≤ k < n and 1 ≤ i < * + tb,k+1 , where we recall that tj,k = λ1 − λn + 1 if j > λk − λn .
Let us first show that (8.39) holds for b = *+d. This follows directly from (8.2) using λk −λn min{d,λk −λn } λk+1 −λn that − j =1 χ (i ≥ * + tj,k ) ≤ − j =1 χ (i ≥ * + tj,k ) and j =1 χ (i ≥ min{d,λk+1 −λn } * + tj,k+1 ) = j =1 χ (i ≥ * + tj,k+1 ) thanks to the fact that by assumption 1≤i < * + td,k+1 and tj,k+1 < tj +1,k+1 . Now assume (8.39) to be true for some * < b ≤ µL . We will prove that then rb < n and that (8.39) holds for b − 1 if b > * + 1. We claim that (k)
*b ≥ * + tb,k+1
(8.40)
for all 1 ≤ k < n, where by definition tj,n = λ1 − λn + 1. Assume the opposite, namely (κ) * + tb,κ+1 . By (8.39) there are no let 1 ≤ κ < n be the smallest index such that *b < (κ) in (ν b , J b )(κ) so that * < *+t . singular strings of lengths * + t ≤ i < * + t b,κ
b,κ+1
b,κ
b
(0) (κ) (κ−1) *+tb,1 this implies that *b < *b By the minimality of κ and the fact that *b = b = which is a contradiction. This proves (8.40). Note that similar to Remark 8.11 Eq. (8.39) can be interpreted in terms of b non-intersecting paths which all end at position *. In th (k) this language the condition (8.40) states that *b is to the right of the b path. Since (k) all paths end at * and there are no parts of length greater than * in νb this implies that rb < n. More precisely, rb ≤ k if b > λk+1 − λn for all 1 ≤ k < n. Let us now prove (8.39) at b − 1. It follows from (8.29) that (k)
(k)
(k−1)
Pi (ν b ) = Pi (ν b−1 ) + χ (*b (k−1)
By (8.40), *b b ≤ λk − λn ,
(k)
(k−1)
≥ * + tb,k . Hence χ (*b
(k−1)
χ (*b
(k)
≤ i < *b ) − ≤−
(k)
(k+1)
≤ i < *b ) − χ (*b ≤ i < *b
).
(k) ≤ i < *b ) ≤ χ ( * + tb,k ≤ i) so that for
min{b,λ k −λn }
χ (i ≥ * + tj,k )
j =1
min{b−1,λ k −λn }
χ (i ≥ * + tj,k )
j =1 (k−1)
as desired. When b > λk − λn then *b = ∞ so that the above inequality still holds. (k) (k+1) ) does not Since we only consider 1 ≤ i < * + tb−1,k+1 the term −χ (*b ≤ i < *b min{b,λk+1 −λn } contribute by (8.40) and in addition b can be replaced by b − 1 in j =1 χ (i ≥ * + tj,k+1 ). This proves (8.39) for b − 1. Since we have shown that (8.39) implies that rb < n for * < b ≤ µL it follows that β ≤ *. This concludes the proof of Theorem 8.9. Acknowledgements. We are deeply indebted to Anatol Kirillov for stimulating discussions. We would also like to thank Peter Bouwknegt, Srinandan Dasmahapatra, Atsuo Kuniba, and Masato Okado for helpful discussions.
Communicated by T. Miwa
Commun. Math. Phys. 220, 165 – 229 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Rational Surfaces Associated with Affine Root Systems and Geometry of the Painlevé Equations
Hidetaka Sakai
Division of Mathematics, Graduate School of Science, Kyoto University, Kyoto 606, Japan.
Received: 18 September 1999 / Accepted: 29 January 2001
Abstract: We present a geometric approach to the theory of Painlevé equations based on rational surfaces. Our starting point is a compact smooth rational surface X which has a unique anti-canonical divisor D of canonical type. We classify all such surfaces X. To each X, there corresponds a root subsystem of $E_8^{(1)}$ inside the Picard lattice of X. We realize the action of the corresponding affine Weyl group as the Cremona action on a family of these surfaces. We show that the translation part of the affine Weyl group gives rise to discrete Painlevé equations, and that the above action constitutes their group of symmetries by Bäcklund transformations. The six Painlevé differential equations appear as degenerate cases of this construction. In the latter context, X is Okamoto’s space of initial conditions and D is the pole divisor of the symplectic form defining the Hamiltonian structure.
Contents
1. Introduction
2. From Discrete Equations to Surface Theory
3. Preliminaries on Rational Surfaces and Root Systems
4. Generalized Halphen Surface
5. Parameterization of Isomorphism Classes of Surfaces
6. Cremona Action
7. Painlevé Equations
A. Correspondence Between Standard Basis of 10 and Canonical Root Bases
B. Realization of Generalized Halphen Surfaces with $\dim|-K_X| = 0$
C. Cremona Action
1. Introduction

1. As is well known, the Painlevé differential equations have two independent origins. The first is due to P. Painlevé and coworkers [21, 8]. In their original work, the Painlevé equations were discovered as differential equations whose solutions do not have movable singularities other than poles. This feature, now called the Painlevé property, has become a guiding principle in the theory of integrable systems. The second is the monodromy preserving deformation of linear differential equations [7, 12]. Given a system of linear differential equations, the problem is to deform it in such a way that the monodromy group stays constant. Here the Painlevé equations arise as the condition for the invariance of the monodromy, written in terms of the coefficients of the underlying linear system. Our aim in this paper is to characterize the Painlevé equations from yet another point of view: the theory of rational surfaces. Before entering into the detailed discussions, we begin by reviewing previous works which directly motivated the present study. In the following, we shall use the standard notation $P_{\rm I}, \dots, P_{\rm VI}$ to represent the six types of the Painlevé equations.

2. In the work [19], K. Okamoto introduced the notion of space of initial conditions for the Painlevé equations. The basic idea is as follows. Consider one of the Painlevé equations written in the form

\[ \frac{dy}{dt} = P(t, y, z), \qquad \frac{dz}{dt} = Q(t, y, z), \tag{1} \]
where P and Q are polynomials in y, z, depending rationally on t with no poles in a domain B ⊂ C (for example, B = C\{0, 1} for PV I ). Very naïvely the space of initial conditions for (1) is C2 , since its solutions holomorphic at a fixed base point t0 ∈ B are uniquely specified by initial values (y(t0 ), z(t0 )). However we must also take into account solutions which have a pole at t0 . For that purpose we compactify C2 . After this it can happen that many solutions pass through a common point, and we need to separate them. This requires the operation of blowing-up the space. Starting with a trivial bundle Q = ( × B, π, B) (where is a compactification of C2 ) and repeating the above process, Okamoto constructed for each PI –PVI a fiber bundle P = (E, π, B) with the following properties: There is a foliation F on P extending the Painlevé equations on Q such that a) Each leaf of F intersects with each fiber transversally; b) Each path γ on B can be lifted to a leaf γp that runs through a given point p ∈ π −1 (γ (0)); c) π|γp : γp → B is surjective and γp is a covering space of B by π . In this situation, each fiber of P is called the space of initial conditions. Note that any two fibers are isomorphic by (b) and (c). The space E is not compact, since at the last stage of the construction it is necessary to remove vertical leaves which are the pole divisor of the symplectic form of the Hamiltonian structure for the Painlevé equations. Remarkably, the configurations of irreducible components of vertical leaves are described by Dynkin diagrams [19], in a way very similar to Kodaira’s classification of singular fibers of elliptic surfaces (see Table 1 below). Subsequently, the study of the space of initial conditions was pursued further by K. Takano et al. [24]. 
In particular, they obtained the following uniqueness result: There exists only one Hamiltonian structure (that of the Painlevé equation) which is holomorphic on $E$ and extends rationally to its compactification $\overline{E}$. In other words, the pair $(E, \overline{E})$ carries all information about the Painlevé equations.
Table 1. Configuration of vertical leaves and symmetry

                  P_I          P_II         P_III           P_IV         P_V          P_VI
vertical leaves   E_8^{(1)}    E_7^{(1)}    D_6^{(1)}       E_6^{(1)}    D_5^{(1)}    D_4^{(1)}
symmetry          -            A_1^{(1)}    (2A_1)^{(1)}    A_2^{(1)}    A_3^{(1)}    D_4^{(1)}
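As a quick consistency check on Table 1, one can tally the ranks of the finite root systems underlying the affine types in each column. The following is only a minimal sketch in Python; the dictionaries `RANK` and `TABLE_1` and the function `rank_sums` are illustrative names introduced here, not notation from the paper. For every equation the two ranks add up to 8, the rank of the finite root system $E_8$.

```python
# Ranks of the finite root systems underlying the affine types in Table 1.
# "2A1" stands for two orthogonal copies of A1; None stands for "no symmetry".
RANK = {"E8": 8, "E7": 7, "E6": 6, "D6": 6, "D5": 5, "D4": 4,
        "A1": 1, "2A1": 2, "A2": 2, "A3": 3, None: 0}

# (type of vertical leaves, type of symmetry) for each Painleve equation,
# read off from Table 1 with the affine superscript (1) dropped.
TABLE_1 = {
    "P_I":   ("E8", None),
    "P_II":  ("E7", "A1"),
    "P_III": ("D6", "2A1"),
    "P_IV":  ("E6", "A2"),
    "P_V":   ("D5", "A3"),
    "P_VI":  ("D4", "D4"),
}

def rank_sums(table):
    """Return, for each equation, rank(vertical leaves) + rank(symmetry)."""
    return {eq: RANK[leaves] + RANK[sym] for eq, (leaves, sym) in table.items()}

if __name__ == "__main__":
    for eq, total in rank_sums(TABLE_1).items():
        print(eq, total)
```

Equal rank sums are only a necessary condition for two root systems to be orthogonal complements of each other; the sketch does not verify orthogonality itself.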
3. Another aspect of the theory of Painlevé equations relevant to us is their symmetries under affine Weyl groups. This subject was also initiated and developed by Okamoto [20]. Let us illustrate it on the example of $P_{\rm II}$,

\[ y' = y^2 - 2z + s, \tag{2} \]
\[ z' = a_1 - 2yz, \tag{3} \]
\[ s' = a_1 + a_0, \tag{4} \]

where $a_0, a_1$ are parameters and the prime denotes derivation. (It is possible to normalize $a_1 + a_0$ to be 1 as is usually done. However we find it convenient to keep this redundancy for describing the symmetry.) Let $\mathbb{C}(a_0, a_1, s, y, z)$ be the differential field defined by the derivation rule (2)–(4) along with $a_0' = a_1' = 0$. Then the following maps are automorphisms of differential fields on $\mathbb{C}(a_0, a_1, s, y, z)$:

\[ w_1 : \; a_1 \mapsto -a_1, \quad a_0 \mapsto a_0 + 2a_1, \quad s \mapsto s, \quad y \mapsto y - \frac{a_1}{z}, \quad z \mapsto z, \tag{5} \]
\[ \sigma : \; a_1 \mapsto a_0, \quad a_0 \mapsto a_1, \quad s \mapsto s, \quad y \mapsto -y, \quad z \mapsto y^2 + s - z. \tag{6} \]
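The statement that $w_1$ and $\sigma$ are compatible with the derivation (2)–(4) can be spot-checked with exact rational arithmetic. The sketch below is not part of the paper: the function names `flow`, `w1`, `sigma` and the sample point are assumptions made for illustration, and the derivatives of the transformed variables are worked out by hand with the chain rule. It checks at one rational point that both maps are involutions and that they carry the right-hand sides of (2)–(4) into themselves.

```python
from fractions import Fraction as F

def flow(a0, a1, s, y, z):
    """Right-hand sides of (2)-(4): returns (y', z', s')."""
    return (y * y - 2 * z + s, a1 - 2 * y * z, a1 + a0)

def w1(a0, a1, s, y, z):
    """The map (5), written as a map on points (a0, a1, s, y, z); needs z != 0."""
    return (a0 + 2 * a1, -a1, s, y - a1 / z, z)

def sigma(a0, a1, s, y, z):
    """The map (6), written as a map on points (a0, a1, s, y, z)."""
    return (a1, a0, s, -y, y * y + s - z)

def w1_commutes(p):
    """Derivative of the w1-image (chain rule) equals the flow at the image."""
    a0, a1, s, y, z = p
    dy, dz, ds = flow(*p)
    # (y - a1/z)' = y' + a1 z'/z^2, while z and s are unchanged by w1.
    derived = (dy + a1 * dz / (z * z), dz, ds)
    return derived == flow(*w1(*p))

def sigma_commutes(p):
    """Derivative of the sigma-image (chain rule) equals the flow at the image."""
    a0, a1, s, y, z = p
    dy, dz, ds = flow(*p)
    # (-y)' = -y' and (y^2 + s - z)' = 2 y y' + s' - z'.
    derived = (-dy, 2 * y * dy + ds - dz, ds)
    return derived == flow(*sigma(*p))

# An arbitrary rational sample point (a0, a1, s, y, z) with z != 0.
SAMPLE = (F(3, 7), F(-2, 5), F(1, 3), F(5, 2), F(-4, 9))

if __name__ == "__main__":
    print(w1(*w1(*SAMPLE)) == SAMPLE, sigma(*sigma(*SAMPLE)) == SAMPLE)
    print(w1_commutes(SAMPLE), sigma_commutes(SAMPLE))
```

A check at a single point is of course not a proof; the underlying identities are polynomial in $a_0, a_1, s, y$ and $1/z$ and can be confirmed by expanding both sides.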
Moreover $w_1$, $\sigma$ generate the extended affine Weyl group of type $A_1^{(1)}$. In the literature, such transformations of differential equations are often called Bäcklund transformations. Similar actions of affine Weyl groups are available for the other Painlevé equations as well. We quote the result from [19, 20] in Table 1. (See Sect. 4.2 for the notation $(2A_1)^{(1)}$.) For comparison, we have included in the table the type of the configuration of vertical leaves mentioned above. From the list we observe that the root systems of the configurations and the symmetries are orthogonal complements of each other in a common root lattice of type $E_8^{(1)}$. We will come back to this point shortly.

In recent years, difference versions of the Painlevé equations have been studied extensively by several authors [9, 10, 13, 22]. These are discrete mappings, which in an appropriate continuous limit reduce to the Painlevé differential equations, and more significantly, are “integrable” in a certain sense. (For difference equations it is not so clear how to formulate the notion of integrability. An important step in this direction is the singularity confinement test proposed by A. Ramani and B. Grammaticos as a difference counterpart of the Painlevé property. We shall touch upon this method from our viewpoint in Sect. 2.) It has gradually become clear that, in the discrete setting, difference equations and their symmetries should be treated on an equal footing. In fact, many of the discrete Painlevé equations found by Ramani and Grammaticos [23] turned out to be the translation part of Okamoto’s affine Weyl group symmetries.

4. The result of Takano et al. suggests the possibility of a purely geometric framework which enables us to understand various features of the Painlevé equations. The present article is an attempt toward this goal. Unlike Okamoto, who started from differential
equations and obtained the space of initial conditions, we will start from varieties with certain properties and end up with equations. The main objects of our study are compact rational surfaces X which admit a unique anti-canonical divisor D of a special kind (called “canonical type” in [1]). Such a surface X can be obtained by blowing up the projective plane $\mathbb{P}^2$ centered at nine points. Surfaces obtained by blowing up $\mathbb{P}^2$ have been studied in depth from the moduli theoretical viewpoint, that is, as a theory of invariants of finite point sets in a projective plane [5]. A well-studied example is the del Pezzo surface, which is a smooth rational surface with ample anti-canonical divisor class. This corresponds to m points in $\mathbb{P}^2$ ($1 \le m \le 8$) satisfying the following conditions: a) there are no infinitely near points, b) no three points are collinear, c) no six points lie on a conic, and d) in the case m = 8, these points do not lie on a cubic with a singular point at one of the points. We consider the case of nine points in more degenerate configurations. Generalizations of the del Pezzo surface have been studied, in particular in connection with the Weyl group, by P. du Val [6], M. Nagata [17], Y. Manin [16], M. Demazure [3], E. Looijenga [15] and B. Harbourne [11]. Among others, Looijenga [15] gave a detailed analysis for degenerate cases of what we call “multiplicative type” in this paper (Sect. 4). We shall closely follow his method and consider all cases of elliptic, multiplicative and additive types.

In the Picard group Pic(X), the orthogonal complement of D (with respect to the intersection pairing) becomes a root lattice of type $E_8^{(1)}$. The latter has two interesting root subsystems of affine type, $R$ and $R^{\perp}$. The root lattice of $R$ is the span $Q(R) = \mathbb{Z}[D_i]$ of the classes of irreducible components $D_i$ of D, and that of $R^{\perp}$ is its orthogonal complement $Q(R^{\perp})$. We classify all possible surfaces X by classifying R.
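The general-position conditions b) and c) quoted above for the del Pezzo case are determinant conditions on homogeneous coordinates, so they are easy to test exactly. The following Python sketch is an illustration only: the function names and the sample points are assumptions introduced here, conditions a) and d) are not checked, and the points are assumed to be distinct and given by integer or rational homogeneous coordinates. Three points are collinear exactly when their 3×3 coordinate determinant vanishes, and six points lie on a common (possibly degenerate) conic exactly when the 6×6 determinant of their degree-two monomials vanishes.

```python
from fractions import Fraction
from itertools import combinations

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, sign, result = len(m), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return sign * result

def collinear(p, q, r):
    """Condition b) fails for p, q, r iff this returns True."""
    return det([p, q, r]) == 0

def on_common_conic(six):
    """Condition c) fails for these six points iff this returns True."""
    rows = [[x * x, x * y, y * y, x * z, y * z, z * z] for (x, y, z) in six]
    return det(rows) == 0

def satisfies_b_and_c(points):
    """Check conditions b) and c) only, for distinct points of P^2."""
    if any(collinear(*triple) for triple in combinations(points, 3)):
        return False
    return not any(on_common_conic(six) for six in combinations(points, 6))

# Six points on the conic x^2 + y^2 = z^2, from the parametrisation
# (1 - t^2 : 2t : 1 + t^2); they violate condition c).
ON_CONIC = [(1 - t * t, 2 * t, 1 + t * t) for t in (0, 1, 2, 3, -1, -2)]

# Replacing the last point by (1 : 1 : 1), which is off that conic.
GENERIC = ON_CONIC[:5] + [(1, 1, 1)]
```

Here `satisfies_b_and_c(ON_CONIC)` is false because all six points lie on one conic, while `satisfies_b_and_c(GENERIC)` is true.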
The Weyl group of R ⊥ acts as automorphisms of Pic(X) preserving the above structure. We call them the Cremona isometries. (For a more precise definition, see Sect. 4.) This action can be lifted further to a birational action of the (families of) surfaces X. We then show that the translation part of the Weyl group gives rise to discrete Painlevé equations, whereas the whole group acts as their group of symmetries (Bäcklund transformations). We also obtain as the degenerate cases the space of initial conditions of the six Painlevé differential equations. More precisely, the latter is the noncompact surface obtained by removing the anti-canonical divisor D (the vertical leaves) from X. The present geometric setting makes transparent algebraic properties of the Painlevé equations besides the Bäcklund transformations. It is known that there exist special solutions of Riccati type when the parameters lie on the reflection hyperplanes. From our standpoint, this condition is interpreted as that of the existence of an effective root (P1 with self-intersection −2) which is left invariant under the flow of the differential equation. The Painlevé equations have also the differential equation for elliptic functions as a limit. For example, if a0 + a1 = 0 in (2)–(4) then the equation can be solved by quadratures using elliptic functions. This corresponds to a degeneration of the space of initial conditions into rational elliptic surfaces. In this paper our treatment of the Painlevé differential equations is somewhat indirect, in that they appear as a limit of the translation part of the affine symmetry. A more direct approach based on the Kodaira-Spencer deformation theory is being developed in recent work of H. Umemura and M.-H. Saito. Their work can be viewed as a further extension of the framework of the present paper. 5. Let us mention further related problems. A natural question that comes to mind is whether our approach is related to monodromy preserving deformations. 
We imagine that there may be a correspondence between some invariants of differential equations and the family X in this paper.
Another problem is an extension to the multi-variable cases. The Painlevé equations have Cremona transformations as their origin. We believe that a similar correspondence between birational geometry and integrable systems exists in higher dimensional situations as well, though the algebraic geometry becomes more difficult. The theory of generalized del Pezzo varieties [5] may give us some hints about this idea. A different approach to the Painlevé equations, also based on the affine symmetry and discrete systems, has been developed by M. Noumi and Y. Yamada [18]. They constructed representations of affine Weyl groups on the field of rational functions and found a new presentation of the Painlevé equations. This formulation makes the symmetry apparent and allows for a straightforward extension to the higher dimensional case. Our approach offers a geometric characterization of various structures of the Painlevé equations, while their theory provides an effective algorithmic tool for studying them.

6. The text is organized as follows. In Sect. 2 we motivate our construction with the example of a q-analog of the sixth Painlevé equation. We take an example from discrete systems because they seem to be less well known than the differential case, and also because they are easier to describe. This section is meant to be an introduction to our approach. Beginning in the next section, we forget about the differential/difference equations for a while and concentrate on surface theory. In Sect. 3 we recall known results about rational surfaces. We also recall root systems and the Weyl groups. In Sect. 4, we classify the surfaces and the anti-canonical divisors which satisfy our conditions. For us, of particular significance is the Cremona action. We introduce in Sect. 4 Cremona isometries as certain automorphisms of $\mathrm{Pic}(X)$, and study them in detail in Sect. 6. In Sect. 5 we parametrize our surfaces and construct a family of them.
We then realize the Cremona action as birational transformations of this family in Sect. 6. In Sect. 7 we construct discrete Painlevé equations in terms of the Cremona action. We obtain the Painlevé differential equations by degenerating surfaces. We obtain, in particular, an “elliptic” difference equation, which seems to be the most generic one (see Sect. 7, 4).

Notation and conventions. Throughout this paper the base field is $\mathbb{C}$. We use the following symbols:
• $X$: a smooth projective surface,
• $K = K_X$: the canonical divisor class on $X$,
• $b_i = \mathrm{rank}\, H^i(X, \mathbb{Z})$: the Betti number,
• $h^{p,q} = \dim H^q(X, \Omega^p)$: the Hodge number,
• $q = q(X) = h^{0,1}$: the irregularity,
• $p_g = p_g(X) = h^{0,2}$: the geometric genus,
• $c_i = c_i(X)$: the Chern classes,
• $\chi(\mathcal{F}) := \sum (-1)^i \dim H^i(X, \mathcal{F})$: the Euler characteristic of $\mathcal{F}$,
• $\pi(C) = \frac{1}{2}((C^2) + K_X \cdot C) + 1$: the virtual genus of a curve $C$.

We have $b_i = \sum_{p+q=i} h^{p,q}$, $h^{p,q} = h^{q,p} = h^{2-p,2-q}$. In the case of a rational surface $X$ we have $q(X) = p_g(X) = 0$, and more generally, $H^0(X, lK_X) = 0$ for $l \ge 1$. By $\sim$ we denote the linear equivalence of divisors. The group of classes of divisors with respect to the linear (resp. numerical) equivalence is denoted by $\mathrm{Pic}(X)$ (resp. $\mathrm{Num}(X)$). For a rational surface $X$ we have $\mathrm{Pic}(X) = \mathrm{Num}(X)$. $\mathrm{Pic}(X)$ is also regarded as the group of invertible sheaves $H^1(X, \mathcal{O}^*)$. We write operations on invertible sheaves additively.
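As a small numerical illustration of the virtual genus formula in the list above, the following Python sketch evaluates $\pi(C)$ from the two intersection numbers. The function names `virtual_genus` and `plane_curve_genus` are illustrative assumptions and are not notation from this paper; the plane-curve inputs use the standard values $(C^2) = d^2$ and $K \cdot C = -3d$ for a curve of degree $d$ in $\mathbb{P}^2$.

```python
def virtual_genus(c_squared: int, k_dot_c: int) -> int:
    """pi(C) = ((C^2) + K_X . C) / 2 + 1 for a curve C on a smooth surface."""
    total = c_squared + k_dot_c
    if total % 2 != 0:
        # On a smooth surface (C^2) + K_X . C is always even.
        raise ValueError("(C^2) + K.C must be even")
    return total // 2 + 1


def plane_curve_genus(d: int) -> int:
    """Degree-d curve in P^2 (illustrative case): (C^2) = d^2, K . C = -3d."""
    return virtual_genus(d * d, -3 * d)
```

For instance, `virtual_genus(-1, -1)` and `virtual_genus(-2, 0)` both return 0, matching the exceptional curves of the first kind and the nodal curves that appear in Sect. 3.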
For a divisor $D$ or a divisor class $\mathcal{D} = [D]$, we denote by $|D| = |\mathcal{D}|$ the set of all positive divisors $C$ on $X$ such that $C \sim D$. It is clear that $\dim |D| = \dim H^0(X, \mathcal{D}) - 1$. We call an element of $|-K_X|$ an anti-canonical divisor. A component of a divisor is an irreducible component of its support. We denote by $\mathrm{Comp}(D)$ the set of components of $D$. We use the same symbol $\mathrm{Comp}(D)$ to denote the set of classes of components in $\mathrm{Pic}(X)$, when there is no fear of confusion. For a divisor $D = \sum m_i D_i$, we write $D_{\mathrm{red}} = \sum D_i$. Let $\mathcal{F}$ be a divisor class. We say $\mathcal{F}$ is effective if $|\mathcal{F}| \neq \emptyset$, numerically effective if $\mathcal{F} \cdot [C] \ge 0$ holds for any irreducible curve $C$, and irreducible if there exists an irreducible curve $C \in |\mathcal{F}|$. We denote by $\mathcal{F} \cdot \mathcal{G} = c_1(\mathcal{F}) \cdot c_1(\mathcal{G}) \in \mathbb{Z}$ the intersection index of divisors or divisor classes on $X$, and by $(\mathcal{F}^2)$ the self-intersection index. In particular, for effective classes $\mathcal{F} = [F]$ and $\mathcal{G} = [G]$ such that $F$ and $G$ have no common irreducible components, we have

$$\mathcal{F} \cdot \mathcal{G} = \sum_{x \in X} \dim_k \mathcal{O}_{X,x}/(f_x, g_x) \ge 0,$$
where $f_x = 0$ (resp. $g_x = 0$) defines $F$ (resp. $G$) locally near $x$.

The type of root system is denoted by the symbol $R$. In this paper, $R$ is assumed to be symmetrizable.
• $\Pi = \Pi(R) = \{\alpha_1, \dots, \alpha_l\}$: the root basis, $\Pi^\vee = \Pi^\vee(R) = \{\alpha_1^\vee, \dots, \alpha_l^\vee\}$: the coroot basis,
• $(V, \Pi, \Pi^\vee)$: a realization of $R$ ($\Pi \subset V^*$, $\Pi^\vee \subset V$), $\nu : V \to V^*$: the isomorphism induced from the bilinear form,
• $Q = Q(R) = \sum_{i=1}^{l} \mathbb{Z}\alpha_i$: the root lattice,
• $W = W(R) \subset GL(V^*)$: the Weyl group,
• $\Delta = \Delta(R)$: the set of roots, $\Delta_+$: the set of positive roots, $\Delta^{re}$: the set of real roots, $\Delta^{im}$: the set of imaginary roots.

In the case of affine type, we enumerate the roots by $i = 0, \dots, l - 1$ for convenience.
2. From Discrete Equations to Surface Theory The study of discrete Painlevé equations is currently the focus of intensive activity. These discrete Painlevé systems are of interest, as they share many properties in common with the differential ones, including the affine Weyl group symmetry, the Lax pair, Hirota’s bilinear form and the special solutions. We can add the simplicity of their algebraic and geometric structures to the reasons why we study discrete systems. Introduction of the singularity confinement method brought forth intriguing developments [22]. Rational mappings of a discrete equation may have poles which give rise to infinities at particular input values of solutions. The singularity confinement demands that after a finite iteration of steps these infinities disappear and memory of the input data be recovered [9]. We can regard it as a discrete counterpart of the Painlevé property. In this section we review briefly the singularity confinement method taking the sixth q-Painlevé equation (q-PVI ) as an example, and give a geometric interpretation to it in non-technical terms.
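Originally $q$-$P_{VI}$ was derived from the connection preserving deformation of a linear q-difference equation [13]. It is the following system of nonlinear q-difference equations:

$$q\text{-}P_{VI}: \qquad \frac{f(qt)\,f(t)}{b_7 b_8} = \frac{\bigl(g(qt) - q b_1 t\bigr)\bigl(g(qt) - q b_2 t\bigr)}{\bigl(g(qt) - b_3\bigr)\bigl(g(qt) - b_4\bigr)}, \qquad (7)$$

$$\frac{g(qt)\,g(t)}{b_3 b_4} = \frac{\bigl(f(t) - b_5 t\bigr)\bigl(f(t) - b_6 t\bigr)}{\bigl(f(t) - b_7\bigr)\bigl(f(t) - b_8\bigr)}. \qquad (8)$$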
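Here the $b_i$ are constants subject to the constraint $q b_1 b_2 / b_3 b_4 = b_5 b_6 / b_7 b_8$. Provided that $|q| \neq 1$, the behavior of $f$ and $g$ near $t = 0$ determines the values of $f(t)$ and $g(t)$ for arbitrary $t \in \mathbb{C} \setminus \{0\}$, because the calculation of $f(t)$ and $g(t)$ can be reduced to the values of $f(q^n t)$, $g(q^n t)$ or $f(q^{-n} t)$, $g(q^{-n} t)$. Here we view (7), (8) as a chain of birational mappings $\cdots \to (\underline{f}, \underline{g}) \to (f, g) \to (\bar{f}, \bar{g}) \to \cdots$, where $\underline{f} = f(q^{-1} t)$, $\bar{f} = f(qt)$, and so forth. Each step is an automorphism of the field of rational functions $\mathbb{C}(f, g)$. To be precise, since the system is not autonomous we should consider the mapping

$$\Bigl(\begin{matrix} b_1 t & b_2 t & b_3 & b_4 \\ b_5 t & b_6 t & b_7 & b_8 \end{matrix};\ f, g\Bigr) \mapsto \Bigl(\begin{matrix} q b_1 t & q b_2 t & b_3 & b_4 \\ q b_5 t & q b_6 t & b_7 & b_8 \end{matrix};\ \bar{f}, \bar{g}\Bigr).$$

To simplify the notation let us rename $b_1 t, b_2 t, b_5 t, b_6 t$ as $b_1, b_2, b_5, b_6$, so that $\bar{b}_i = q b_i$ for $i = 1, 2, 5, 6$. Then we have an automorphism of the field $\mathbb{C}(b_1, b_2, b_3, b_4, b_5, b_6, b_7, b_8, f, g)$,

$$\Bigl(\begin{matrix} b_1 & b_2 & b_3 & b_4 \\ b_5 & b_6 & b_7 & b_8 \end{matrix};\ f, g\Bigr) \mapsto \Bigl(\begin{matrix} q b_1 & q b_2 & b_3 & b_4 \\ q b_5 & q b_6 & b_7 & b_8 \end{matrix};\ \bar{f}, \bar{g}\Bigr),$$

$$\bar{g} = \frac{b_3 b_4}{g}\,\frac{(f - b_5)(f - b_6)}{(f - b_7)(f - b_8)}, \qquad \bar{f} = \frac{b_7 b_8}{f}\,\frac{(\bar{g} - q b_1)(\bar{g} - q b_2)}{(\bar{g} - b_3)(\bar{g} - b_4)}, \qquad q = \frac{b_3 b_4 b_5 b_6}{b_1 b_2 b_7 b_8}.$$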
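The automorphism above can be iterated with exact rational arithmetic. The following Python sketch is only an illustration under stated assumptions: the names `q_of` and `qp6_step`, the tuple layout $(b_1, \dots, b_8)$ and the sample parameter values are hypothetical choices, not part of the paper. It implements one step $(b; f, g) \mapsto (\bar{b}; \bar{f}, \bar{g})$ and lets one check that $q$ is unchanged by the step.

```python
from fractions import Fraction as Fr


def q_of(b):
    """q = b3 b4 b5 b6 / (b1 b2 b7 b8), as fixed by the constraint on the b_i."""
    b1, b2, b3, b4, b5, b6, b7, b8 = b
    return (b3 * b4 * b5 * b6) / (b1 * b2 * b7 * b8)


def qp6_step(b, f, g):
    """One step (b; f, g) -> (b_bar; f_bar, g_bar) of the renamed q-P_VI map."""
    b1, b2, b3, b4, b5, b6, b7, b8 = b
    q = q_of(b)
    # g_bar is computed first, because f_bar depends on it.
    g_bar = (b3 * b4 / g) * ((f - b5) * (f - b6)) / ((f - b7) * (f - b8))
    f_bar = (b7 * b8 / f) * ((g_bar - q * b1) * (g_bar - q * b2)) / (
        (g_bar - b3) * (g_bar - b4)
    )
    b_bar = (q * b1, q * b2, b3, b4, q * b5, q * b6, b7, b8)
    return b_bar, f_bar, g_bar


# Hypothetical generic sample data (any values avoiding the poles would do).
b = tuple(Fr(n) for n in (2, 3, 5, 7, 11, 13, 17, 19))
f0, g0 = Fr(1, 2), Fr(1, 3)
b_bar, f_bar, g_bar = qp6_step(b, f0, g0)
```

Exact fractions are used instead of floating point so that identities such as the invariance of $q$ can be tested by equality rather than within a tolerance.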
Let us see how the singularity confinement works in our case of $q$-$P_{VI}$. Singularities occur for the values $f = b_7, b_8$ or $g = b_3, b_4$. We must also consider $f = b_5, b_6$ and $g = b_1, b_2$, which are the sources of the singularities for the next values of $1/f$ and $1/g$. For example, the input datum $(\underline{f}, \underline{g}) = (b_7, g_0)$ gives us $(f, g) \sim (b_8, \infty)$. In order to calculate the next value $\bar{g}$, we take the input datum $(b_7 + \varepsilon, g_0)$ and take the limit $\varepsilon \to 0$. Then

$$g = \frac{1}{\varepsilon g_0}\,\frac{b_3 b_4 (b_7 - b_5/q)(b_7 - b_6/q)}{b_7 - b_8},$$

$$f - b_8 = \varepsilon \left( \frac{g_0 b_8 (b_7 - b_8)(b_3 + b_4 - b_1 - b_2)}{b_3 b_4 (b_7 - b_5/q)(b_7 - b_6/q)} - \frac{b_8}{b_7} \right) + O(\varepsilon^2)$$

($O(\varepsilon^2)$ means the usual symbol of Landau). We see that even though $g = \infty$, $\bar{g}$ is finite, and moreover the input value $g_0$ can be recovered from it. The secret is the fact that $g(f - b_8)$ remains finite when the infinity appears in this process. So let us introduce the coordinate $h = g(f - b_8)$ and construct a surface by gluing $\mathbb{C}^2 \cup \mathbb{C}^2 = (f, 1/h) \cup (1/g, h)$
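via the relation $(f - b_8)/h = 1/g$. This is nothing but the operation of blowing-up. The equation then makes sense on this surface.

Let us construct the space of initial conditions for $q$-$P_{VI}$. For this purpose let us first compactify $\mathbb{C}^2$ and take $\mathbb{P}^1 \times \mathbb{P}^1$. Writing $b = (b_1, \dots, b_8)$, let $X_b$ be the surface obtained by blowing up $\mathbb{P}^1 \times \mathbb{P}^1$ with the centers $(b_7, \infty)$, $(b_8, \infty)$, $(\infty, b_3)$, $(\infty, b_4)$, and $(b_5, 0)$, $(b_6, 0)$, $(0, b_1)$, $(0, b_2)$.

[Figure: the eight centers $(\infty, b_3)$, $(\infty, b_4)$, $(b_7, \infty)$, $(b_8, \infty)$, $(b_5, 0)$, $(b_6, 0)$, $(0, b_1)$, $(0, b_2)$ on $\mathbb{P}^1 \times \mathbb{P}^1$.]

Notice that the positions of these eight points are not arbitrary. There are four pairs of points, each of which lies on a line $D_i$ ($i = 0, \dots, 3$). Two of the lines have bidegree (1, 0) and the other two have bidegree (0, 1). The meaning of these lines is that they are pole divisors of a 2-form which is invariant under the mapping $q$-$P_{VI}$. Indeed, let $\omega = \dfrac{df \wedge dg}{fg}$. Then it is easy to see that

$$\frac{d\bar{f} \wedge d\bar{g}}{\bar{f}\,\bar{g}} = -\frac{df \wedge d\bar{g}}{f\,\bar{g}} = \frac{df \wedge dg}{fg},$$

where we regard $t$ as constant. The intersection matrix for these divisors $D_i$ is the generalized Cartan matrix of type $A_3^{(1)}$ multiplied by $-1$. For this reason, we call $X_b$ an $A_3^{(1)}$-surface. The merit of introducing $X_b$ is that $q$-$P_{VI}$ gives (not just a birational but) a one-to-one correspondence between $X_b$ and $X_{\bar{b}}$.

Proposition 1. The sixth Painlevé q-difference equation gives an isomorphism from $X_b$ to $X_{\bar{b}}$.

Thus we can think of $X_b$ as the space of initial conditions of the sixth Painlevé q-difference equation. Let us show the proposition. As a mapping, $q$-$P_{VI}$ can be decomposed into more elementary mappings between spaces of initial conditions as follows:

$$q\text{-}P_{VI} = \sigma_2 \circ w_2 \circ w_1 \circ w_0 \circ w_2 \circ \sigma_1 \circ w_3 \circ w_5 \circ w_4 \circ w_3, \qquad (9)$$

$$\begin{aligned}
\Bigl(\begin{matrix} b_1 & b_2 & b_3 & b_4 \\ b_5 & b_6 & b_7 & b_8 \end{matrix};\ f, g\Bigr)
&\xrightarrow{\ w_3\ } \Bigl(\begin{matrix} b_1 \frac{b_7}{b_5} & b_2 \frac{b_7}{b_5} & b_3 & b_4 \\ b_7 & b_6 & b_5 & b_8 \end{matrix};\ f,\ g\,\frac{f - b_7}{f - b_5}\Bigr) \\
&\xrightarrow{\ w_4\ } \Bigl(\begin{matrix} b_1 \frac{b_7}{b_5} & b_2 \frac{b_7}{b_5} & b_3 & b_4 \\ b_7 & b_6 & b_8 & b_5 \end{matrix};\ f,\ g\,\frac{f - b_7}{f - b_5}\Bigr) \\
&\xrightarrow{\ w_5\ } \Bigl(\begin{matrix} b_1 \frac{b_7}{b_5} & b_2 \frac{b_7}{b_5} & b_3 & b_4 \\ b_6 & b_7 & b_8 & b_5 \end{matrix};\ f,\ g\,\frac{f - b_7}{f - b_5}\Bigr) \\
&\xrightarrow{\ w_3\ } \Bigl(\begin{matrix} b_1 \frac{b_7 b_8}{b_5 b_6} & b_2 \frac{b_7 b_8}{b_5 b_6} & b_3 & b_4 \\ b_8 & b_7 & b_6 & b_5 \end{matrix};\ f,\ g\,\frac{(f - b_7)(f - b_8)}{(f - b_5)(f - b_6)}\Bigr) \\
&\xrightarrow{\ \sigma_1\ } \Bigl(\begin{matrix} q b_2 & q b_1 & b_4 & b_3 \\ b_6 & b_5 & b_8 & b_7 \end{matrix};\ f,\ \bar{g} = \frac{b_3 b_4}{g}\,\frac{(f - b_5)(f - b_6)}{(f - b_7)(f - b_8)}\Bigr) \\
&\xrightarrow{\ w_2\ } \Bigl(\begin{matrix} b_4 & q b_1 & q b_2 & b_3 \\ b_6 \frac{b_4}{q b_2} & b_5 \frac{b_4}{q b_2} & b_8 & b_7 \end{matrix};\ f\,\frac{\bar{g} - b_4}{\bar{g} - q b_2},\ \bar{g}\Bigr) \\
&\xrightarrow{\ w_0\ } \Bigl(\begin{matrix} b_4 & q b_1 & b_3 & q b_2 \\ b_6 \frac{b_4}{q b_2} & b_5 \frac{b_4}{q b_2} & b_8 & b_7 \end{matrix};\ f\,\frac{\bar{g} - b_4}{\bar{g} - q b_2},\ \bar{g}\Bigr) \\
&\xrightarrow{\ w_1\ } \Bigl(\begin{matrix} q b_1 & b_4 & b_3 & q b_2 \\ b_6 \frac{b_4}{q b_2} & b_5 \frac{b_4}{q b_2} & b_8 & b_7 \end{matrix};\ f\,\frac{\bar{g} - b_4}{\bar{g} - q b_2},\ \bar{g}\Bigr) \\
&\xrightarrow{\ w_2\ } \Bigl(\begin{matrix} b_3 & b_4 & q b_1 & q b_2 \\ b_6 \frac{b_4}{q b_2}\frac{b_3}{q b_1} & b_5 \frac{b_4}{q b_2}\frac{b_3}{q b_1} & b_8 & b_7 \end{matrix};\ f\,\frac{(\bar{g} - b_4)(\bar{g} - b_3)}{(\bar{g} - q b_2)(\bar{g} - q b_1)},\ \bar{g}\Bigr) \\
&\xrightarrow{\ \sigma_2\ } \Bigl(\begin{matrix} q b_1 & q b_2 & b_3 & b_4 \\ q b_5 & q b_6 & b_7 & b_8 \end{matrix};\ \bar{f} = \frac{b_7 b_8}{f}\,\frac{(\bar{g} - q b_1)(\bar{g} - q b_2)}{(\bar{g} - b_3)(\bar{g} - b_4)},\ \bar{g}\Bigr).
\end{aligned}$$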
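The decomposition (9) can be checked numerically with exact rationals. The sketch below is a hedged illustration only: the Python functions `w0`, ..., `w5`, `sigma1`, `sigma2` encode the elementary maps as they are read off from the chain displayed above (this reading of the parameter rows is an assumption on our part), and the state layout `(top, bot, f, g)` with `top` $= (b_1, b_2, b_3, b_4)$, `bot` $= (b_5, b_6, b_7, b_8)$ is a hypothetical choice. It compares the ten-step composition with a direct evaluation of $q$-$P_{VI}$.

```python
from fractions import Fraction as Fr


def qp6_step(top, bot, f, g):
    """Direct evaluation of q-P_VI on (b1..b4; b5..b8; f, g)."""
    b1, b2, b3, b4 = top
    b5, b6, b7, b8 = bot
    q = (b3 * b4 * b5 * b6) / (b1 * b2 * b7 * b8)
    g_bar = (b3 * b4 / g) * ((f - b5) * (f - b6)) / ((f - b7) * (f - b8))
    f_bar = (b7 * b8 / f) * ((g_bar - q * b1) * (g_bar - q * b2)) / (
        (g_bar - b3) * (g_bar - b4))
    return [q * b1, q * b2, b3, b4], [q * b5, q * b6, b7, b8], f_bar, g_bar


def swap(row, i, j):
    row = list(row)
    row[i], row[j] = row[j], row[i]
    return row


def w3(top, bot, f, g):
    # Exchange bot[0] and bot[2]; rescale top[0], top[1] and g accordingly.
    r = bot[2] / bot[0]
    new_top = [top[0] * r, top[1] * r, top[2], top[3]]
    return new_top, swap(bot, 0, 2), f, g * (f - bot[2]) / (f - bot[0])


def w4(top, bot, f, g):
    return list(top), swap(bot, 2, 3), f, g


def w5(top, bot, f, g):
    return list(top), swap(bot, 0, 1), f, g


def w2(top, bot, f, g):
    # Mirror image of w3 with the roles of the two rows (and of f, g) exchanged.
    r = top[2] / top[0]
    new_bot = [bot[0] * r, bot[1] * r, bot[2], bot[3]]
    return swap(top, 0, 2), new_bot, f * (g - top[2]) / (g - top[0]), g


def w0(top, bot, f, g):
    return swap(top, 2, 3), list(bot), f, g


def w1(top, bot, f, g):
    return swap(top, 0, 1), list(bot), f, g


def sigma1(top, bot, f, g):
    c = top[2] * top[3]
    return [c / t for t in top], [bot[2], bot[3], bot[0], bot[1]], f, c / g


def sigma2(top, bot, f, g):
    c = bot[2] * bot[3]
    return [top[2], top[3], top[0], top[1]], [c / s for s in bot], c / f, g


def composed(state):
    # (9) reads right to left, so w3 is applied first and sigma2 last.
    for step in (w3, w4, w5, w3, sigma1, w2, w0, w1, w2, sigma2):
        state = step(*state)
    return state


# Hypothetical generic sample data.
start = ([Fr(2), Fr(3), Fr(5), Fr(7)], [Fr(11), Fr(13), Fr(17), Fr(19)],
         Fr(1, 2), Fr(1, 3))
via_decomposition = composed(start)
direct = qp6_step(*start)
```

With these sample values the two results agree exactly, which is consistent with (9); the check does not replace the geometric argument that each step is an isomorphism.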
We have only to show that each step is an isomorphism. But this is just an interchange of blowing-down structures of the same surface. For example, $w_3$ is obtained by blowing up $\mathbb{P}^1 \times \mathbb{P}^1$ at $(f, g) = (b_5, 0), (b_7, \infty)$ and blowing down the exceptional curves of the first kind: the pull-backs of $(f - b_5 = 0)$ and $(f - b_7 = 0)$.

Moreover these mappings generate the extended affine Weyl group of $D_5^{(1)}$; $w_i$ is the reflection with respect to a simple root and $\sigma_i$ comes from an automorphism of the Dynkin diagram. It naturally emerges from configurations of exceptional curves on the space of initial conditions. So we have two root systems $A_3^{(1)}$ and $D_5^{(1)}$ associated with one surface $X_b$. The root system $A_3^{(1)}$ arises as the pole divisors of a 2-form $\omega$, and $D_5^{(1)}$ plays the role of the symmetry. We will see later the same situation in more general cases.

Remark. For uniformity of description, we will consider in the following sections $X_b$ as a surface obtained by blowing up $\mathbb{P}^2$ with the centers at nine points, rather than taking $\mathbb{P}^1 \times \mathbb{P}^1$. The correspondence between the parameters in this section and those of Appendix B is 1/a2 1/a3 1/a1 a8 y z b1 b 2 b 3 b 4 ; f, g = ; , , b5 b6 b7 b8 a6 a7 − 1/a4 a1 − 1/a5 a1 x − a1 z x where $(x : y : z)$ is the homogeneous coordinate of $\mathbb{P}^2$. As for the $a_i$, see the item “$A_3^{(1)}$-surface” in Appendix B. The Dynkin automorphism $\sigma_1$ (resp. $\sigma_2$) is described as −σ(13) (resp. −σ(20)) there.

From the next section we take the position that the surfaces teach us everything. We forget about dynamical systems for the moment and begin with the characterization of surfaces.

3. Preliminaries on Rational Surfaces and Root Systems

This section is devoted to a review of known results which we will use later. Rational surfaces have been studied by many authors. Most relevant to us are the works of
P. du Val [6], M. Nagata [17], Y. Manin [16], M. Demazure [3], E. Looijenga [15] and B. Harbourne [11]. For the reader’s convenience we give brief proofs of some of the statements. See the notation and conventions in the Introduction for the definition of symbols.

1. First we recall some useful formulas which hold for a smooth, complete surface $X$. They are called by the names of Gauss–Bonnet, Noether, Riemann–Roch and Serre, respectively.
• $c_2(X) = \chi(X)$,
• $\chi(\mathcal{O}) = 1 - q + p_g = \frac{1}{12}(c_1^2 + c_2)$,
• $\chi(\mathcal{F}) = \frac{1}{2}\mathcal{F} \cdot (\mathcal{F} - K_X) + \chi(\mathcal{O})$, in particular, $\dim H^0(X, \mathcal{F}) + \dim H^2(X, \mathcal{F}) \ge \frac{1}{2}\mathcal{F} \cdot (\mathcal{F} - K_X) + \chi(\mathcal{O})$,
• $\dim H^i(X, \mathcal{F}) = \dim H^{2-i}(X, K_X - \mathcal{F})$.

2. A smooth compact connected surface free of exceptional curves of the first kind is said to be relatively minimal. A rational relatively minimal surface is either $\mathbb{P}^2$ or one of the Hirzebruch surfaces $F_l = \mathbb{P}(\mathcal{O}_{\mathbb{P}^1} \oplus \mathcal{O}_{\mathbb{P}^1}(-l))$ ($l = 0, 2, 3, 4, \dots$). The latter is a $\mathbb{P}^1$-bundle over $\mathbb{P}^1$ having a section $S$ with $S \cdot S = -l$. We consider the Picard group $\mathrm{Pic}(X) = H^1(X, \mathcal{O}^*)$. The exact sequence $0 \to \mathbb{Z} \to \mathcal{O} \xrightarrow{\exp} \mathcal{O}^* \to 0$ leads to $H^1(X, \mathcal{O}) \to H^1(X, \mathcal{O}^*) \to H^2(X, \mathbb{Z}) \to H^2(X, \mathcal{O})$, while the rationality implies $q = p_g = 0$. Hence $\mathrm{Pic}(X) \simeq H^2(X, \mathbb{Z})$ holds for a rational surface $X$.

Proposition 2. Let $X$ be a rational surface with $(K_X^2) = 0$. Then
a) $\mathrm{rank}\, \mathrm{Pic}(X) = 10$.
b) $|-K_X| \neq \emptyset$.
c) Suppose that $|-K_X|$ has a divisor $D = \sum_i m_i D_i$ such that $[D_i] \cdot K_X = 0$ for any $i$. Let $\rho : X \to Z$ be a birational morphism of $X$ onto a relatively minimal surface $Z$. Then $Z$ is either $\mathbb{P}^2$, $F_0$ ($= \mathbb{P}^1 \times \mathbb{P}^1$) or $F_2$.
d) Under the same condition for $|-K_X|$, there exists a birational morphism $\rho : X \to \mathbb{P}^2$.

Proof. a) The Noether formula says that $1 = \frac{1}{12}(c_1^2 + c_2)$, where $c_2(X) = \chi(X) = \sum_{i=0}^{4} (-1)^i b_i = 2 - 2b_1 + b_2$ and $b_1 = 2q = 0$. It follows from $c_1^2 = (K_X^2) = 0$ that $\mathrm{rank}\, \mathrm{Pic}(X) = b_2 = 10$.
b) From the Riemann–Roch inequality we have $\dim H^0(X, 2K_X) + \dim H^0(X, -K_X) \ge 1$. The first term on the left-hand side vanishes.
c) If $Z = F_l$ ($l \ge 3$), then $X$ has a nonsingular rational curve $S$ with $(S^2) \le -3$. For such a curve we have $S \cdot K_X = 2(\pi(S) - 1) - (S^2) = -2 - (S^2) \ge 1$. Then $S \notin \mathrm{Comp}(D)$, because $S \in \mathrm{Comp}(D)$ would mean $S \cdot D = 0$. But $S \notin \mathrm{Comp}(D)$ leads to $S \cdot D \ge 0$, while $S \cdot D = -S \cdot K_X \le -1$. This is a contradiction.
d) Let $\rho : X \to Z$ be a birational morphism onto a relatively minimal model $Z$. Suppose that $Z \neq \mathbb{P}^2$. Then $\mathrm{rank}\, \mathrm{Pic}(Z) = 2$ and $\mathrm{rank}\, \mathrm{Pic}(X) = 10$, so that $\rho$ can be written as a composition $\rho = \rho_1 \circ \rho_2 \circ \cdots \circ \rho_8$, where $\rho_i$ is the blowing-down of an exceptional curve $E_i$ ($1 \le i \le 8$). Set $p = \rho_1(E_1)$ and $\rho' = \rho_2 \circ \rho_3 \circ \cdots \circ \rho_8$. If $Z = F_0 = \mathbb{P}^1 \times \mathbb{P}^1$, we take $H_0$ and $H_1$ to be the lines passing through $p$ of bidegree (0, 1) and (1, 0), respectively. Then we can blow down the proper transform of $H_0 \cup H_1$, and the image is $\mathbb{P}^2$. In the case $Z = F_2$, $p$ does not lie on the curve $S$ with $(S^2) = -2$ for the reason indicated in the proof of c). Let $F$ be the fiber of $F_2$ passing through $p$. We can blow down the proper transform of $F \cup S$, and the image is $\mathbb{P}^2$. □
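The arithmetic in part a) of the proof can be replayed mechanically. The following Python sketch is a minimal illustration, assuming only the Noether and Gauss–Bonnet formulas recalled above together with $\chi(\mathcal{O}) = 1$ and $b_1 = 0$ for a rational surface; the function name `second_betti_number` is a hypothetical label, not notation from the text.

```python
def second_betti_number(k_squared: int) -> int:
    """b_2 (= rank Pic) of a rational surface with (K_X^2) = k_squared."""
    # Noether formula with chi(O) = 1:  12 = (K_X^2) + c_2.
    c2 = 12 - k_squared
    # Gauss-Bonnet: c_2 = b_0 - b_1 + b_2 - b_3 + b_4 = 2 - 2*b_1 + b_2, b_1 = 0.
    return c2 - 2
```

With `k_squared = 0` this gives 10, as in Proposition 2 a); the values 9 and 8 give ranks 1 and 2, the ranks of $\mathrm{Pic}(\mathbb{P}^2)$ and $\mathrm{Pic}(F_l)$ used in part d).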
3. Suppose that $X$ admits a birational morphism $\rho : X \to \mathbb{P}^2$, where $\rho = \rho_1 \circ \cdots \circ \rho_n$ is the composition of blowing-downs $\rho_i$ with the center at $p_i \in X_i = \rho_i \circ \cdots \circ \rho_n(X)$. We call such a pair $(X, \rho)$ of a surface and a composition of blowing-down morphisms a blowing-down structure. Two blowing-down structures $(X, \rho)$ and $(X', \rho')$ are isomorphic if there exist isomorphisms $\psi_i : X_i \to X'_i$ such that $\rho'_i \circ \psi_i = \psi_{i-1} \circ \rho_i$ ($i = 0, \dots, 9$). We denote by $E_0$ (resp. $E_i$ ($i = 1, \dots, n$)) the class of the total transform of a line in $\mathbb{P}^2$ (resp. of the closed point $p_i$ which is the center of $\rho_i$). Then the Picard group of $X$ is described as follows.

$$\mathrm{Pic}(X) = \mathbb{Z}E_0 + \mathbb{Z}E_1 + \cdots + \mathbb{Z}E_n, \quad (E_0^2) = 1, \quad (E_i^2) = -1\ (i \neq 0), \quad E_i \cdot E_j = 0\ (i \neq j), \quad K_X = -3E_0 + E_1 + \cdots + E_n.$$

Let $(E_0, E_1, \dots, E_n)$ be a basis of $\mathrm{Pic}(X)$ with the intersection pairing given as above. If there exists a series of blowing-down morphisms whose class of total transform of a line (resp. of an exceptional curve of the first kind) is $E_0$ (resp. $E_i$ ($i = 1, \dots, n$)), then we say that $(E_i)_{i=0}^{n}$ defines a blowing-down structure. Now $E_0$ is numerically effective, because $E_0$ is irreducible and $E_0 \cdot E_0 = 1$. So the next lemma is immediate. (Part b) follows from a) for $K_X - \mathcal{F}$ and the Serre duality.)

Lemma 3. a) If $\mathcal{F} \cdot E_0 \le -1$, then $H^0(X, \mathcal{F}) = 0$. b) If $\mathcal{F} \cdot E_0 \ge -2$, then $H^2(X, \mathcal{F}) = 0$.

Next we study the semi-group of effective classes $\mathrm{Pic}^+(X)$. Let $C$ be an irreducible curve on $X$ which is not a component of $D \in |-K_X|$. The virtual genus of $C$ is non-negative and given by

$$\pi(C) = 1 + \frac{1}{2}[C] \cdot ([C] + K) \le 1 + \frac{1}{2}(C^2).$$

Hence $(C^2) \ge -2$. If $(C^2) = -2$, then $[C] \cdot K_X = 0$ and $\pi(C) = 0$. It implies that $C$ is a smooth rational curve. A smooth rational curve with self-intersection $-2$ is called a nodal curve. A nodal curve is either an irreducible component of $D \in |-K_X|$ or disjoint from $K_X$. If $(C^2) = -1$, then $C \cdot K_X = -1$ and $\pi(C) = 0$. So $C$ is an exceptional curve of the first kind and meets $-K_X$ simply in a unique irreducible component. We set the following notation:
NEFF: the set of numerically effective classes,
$\mathrm{EX} = \{\mathcal{F} \in \mathrm{Pic}(X) \mid (\mathcal{F}^2) = -1,\ \mathcal{F} \cdot K_X = -1\}$: the set of exceptional classes,
$\Delta^{nod}$: the set of classes of nodal curves disjoint from the irreducible components of $|-K_X|$.

Lemma 4. An exceptional class $\mathcal{F} \in \mathrm{EX}$ is effective.

Proof. Let $\mathcal{G} = (\mathcal{F} \cdot E_0)K_X + 3\mathcal{F}$; then $E_0 \cdot \mathcal{G} = 0$ and so $(\mathcal{G}^2) \le 0$. This is equivalent to $-6(\mathcal{F} \cdot E_0) - 9 \le 0$, so $\mathcal{F} \cdot E_0 \ge -\frac{9}{6} \ge -2$. By Lemma 3 we have $H^2(X, \mathcal{F}) = 0$, and $\dim H^0(X, \mathcal{F}) \ge 1 + \frac{1}{2}\mathcal{F} \cdot (\mathcal{F} - K_X) = 1$ by the Riemann–Roch inequality. □

Proposition 5. $\mathrm{Pic}^+(X)$ is generated by $\mathrm{NEFF} \cap \mathrm{Pic}^+(X)$, $\mathrm{EX}$, $\Delta^{nod}$ and $\mathrm{Comp}(D)$ for $D \in |-K_X|$.
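Proof. Suppose $\mathcal{F} \in \mathrm{Pic}^+(X)$ has a negative intersection number with an irreducible curve $C$. Then $C \subset \bigcap_{F \in |\mathcal{F}|} \mathrm{Supp}(F)$ and $[C] \in \mathrm{EX}$, $\Delta^{nod}$ or $\mathrm{Comp}(D)$ for $D \in |-K_X|$. We get a numerically effective class by subtracting these $[C]$ from $\mathcal{F}$. □

Proposition 6. Let $D \in |-K_X|$. Then $D$ is connected.

Proof. First we show 1) $\dim H^1(D, \mathcal{O}_D) = 1$ and 2) $H^1(D', \mathcal{O}_{D'}) = 0$ for a divisor $D'$ with $0 < D' < D$. The exact sequence $0 \to \mathcal{O}_X(-D) \to \mathcal{O}_X \to \mathcal{O}_D \to 0$ leads to $H^1(\mathcal{O}_X) \to H^1(\mathcal{O}_D) \to H^2(\mathcal{O}_X(-D)) \to H^2(\mathcal{O}_X)$. Now $H^1(\mathcal{O}_X) = H^2(\mathcal{O}_X) = 0$ and $H^2(\mathcal{O}_X(-D)) \simeq H^0(\mathcal{O}_X(K_X + D)) = H^0(\mathcal{O}_X) = \mathbb{C}$. Hence 1) is proved. From $K_X + D' < 0$ we have $H^2(\mathcal{O}_X(-D')) \simeq H^0(\mathcal{O}_X(K_X + D')) = 0$, so 2) is proved similarly. If $D = D' + D''$, $D' \cap D'' = \emptyset$, then $\mathbb{C} \simeq H^1(\mathcal{O}_D) \simeq H^1(\mathcal{O}_{D'}) \oplus H^1(\mathcal{O}_{D''}) \simeq 0$. Hence $D$ is connected. □

4. The standard Lorentzian lattice $\Lambda_n$ is a free $\mathbb{Z}$-module of rank $n$, $\mathbb{Z}v_0 \oplus \mathbb{Z}v_1 \oplus \cdots \oplus \mathbb{Z}v_{n-1}$, equipped with a symmetric bilinear form $(\cdot|\cdot)$ given by

$$\Bigl|\sum_{i=0}^{n-1} a_i v_i\Bigr|^2 = -a_0^2 + a_1^2 + \cdots + a_{n-1}^2,$$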
where $|v|^2 = (v|v)$.

Proposition 7. The notation being as above, let $\delta_n = 3v_0 - v_1 - \cdots - v_n$. Then $(\delta_n)^\perp = \{v \in \Lambda_{n+1} \mid (v|\delta_n) = 0\}$ is the root lattice corresponding to the generalized Cartan matrix with the following Dynkin diagram:

[Dynkin diagram with nodes $r_0, r_1, \dots, r_{n-1}$: the chain $r_1 - r_2 - \cdots - r_{n-1}$, with $r_0$ joined to $r_3$.]

A root basis can be chosen as

$$r_0 = v_0 - v_1 - v_2 - v_3, \qquad r_i = v_i - v_{i+1} \quad (i = 1, \dots, n - 1).$$
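The root basis above can be checked directly against the Lorentzian form. The Python sketch below is illustrative only (the helper names `form`, `root_basis`, `delta` and `gram` are assumptions, not notation from the text): it builds $r_0, \dots, r_{n-1}$ in $\Lambda_{n+1}$, computes the Gram matrix $((r_i|r_j))$ and the pairings with $\delta_n$.

```python
def form(u, v):
    """Lorentzian form: (u|v) = -u_0 v_0 + u_1 v_1 + ... + u_n v_n."""
    return -u[0] * v[0] + sum(x * y for x, y in zip(u[1:], v[1:]))


def root_basis(n):
    """r_0 = v_0 - v_1 - v_2 - v_3 and r_i = v_i - v_{i+1} (1 <= i <= n-1)."""
    if n < 3:
        raise ValueError("n must be at least 3")
    dim = n + 1  # coordinates with respect to v_0, ..., v_n
    r0 = [0] * dim
    r0[0] = 1
    r0[1] = r0[2] = r0[3] = -1
    roots = [r0]
    for i in range(1, n):
        r = [0] * dim
        r[i], r[i + 1] = 1, -1
        roots.append(r)
    return roots


def delta(n):
    """delta_n = 3 v_0 - v_1 - ... - v_n."""
    return [3] + [-1] * n


def gram(n):
    roots = root_basis(n)
    return [[form(r, s) for s in roots] for r in roots]
```

For $n = 9$ the diagonal entries are 2, the only nonzero off-diagonal entries are $-1$ between neighbours in the chain $r_1, \dots, r_8$ and between $r_0$ and $r_3$, and $\delta_9 = 3r_0 + 2r_1 + 4r_2 + 6r_3 + 5r_4 + 4r_5 + 3r_6 + 2r_7 + r_8$ is isotropic, as expected for an affine diagram.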
Proposition 8. Let $X$ be a rational surface with $\mathrm{rank}\, \mathrm{Pic}(X) = n \ge 3$. Then the Picard group $\mathrm{Pic}(X)$ with the bilinear form $(\mathcal{F}|\mathcal{G}) = -\mathcal{F} \cdot \mathcal{G}$ ($\mathcal{F}, \mathcal{G} \in \mathrm{Pic}(X)$) is isomorphic to $\Lambda_n$. Moreover there exists an isomorphism $\varphi : \Lambda_n \to \mathrm{Pic}(X)$ such that $\delta_{n-1} \mapsto -K_X$.

We have special interest in the case $n = 9$. In this case the root system of type $E_8^{(1)}$ appears. We quote from Kac’s book [14] some results about root systems. See the Introduction for the notation $R$, $Q$, $\Pi$, etc.

Proposition 9 ([14] § 5.10). Let $R$ be a root system of finite, affine or hyperbolic type. Then
a) $\Delta^{re} = \{\alpha \in Q \mid (\alpha|\alpha) = 2\}$, when $R$ is symmetric.
b) $\Delta^{im} = \{\alpha \in Q \setminus \{0\} \mid (\alpha|\alpha) \le 0\}$.
Proposition 10 ([14] § 5.6, § 4.8). Let $R$ be an indecomposable root system.
a) If $R$ is of finite type, then the set $\Delta^{im}$ is empty.
b) If $R$ is of affine type, then there is a unique element $\delta = \sum_{i=0}^{l-1} a_i \alpha_i \in \Delta$ such that $\Delta^{im}_+ = \Delta^{im} \cap \Delta_+ = \{n\delta\ (n = 1, 2, \dots)\}$.

5. Now we turn to the description of the Weyl groups of affine type. The Weyl group $W = W(R)$ is generated by the simple reflections $w_i$, where

$$w_i(\lambda) = \lambda - \langle \alpha_i^\vee, \lambda \rangle\, \alpha_i \qquad \text{for } \lambda \in V^*.$$

For any real root $\alpha$, $w_\alpha$ ($w_\alpha(\lambda) = \lambda - \langle \alpha^\vee, \lambda \rangle\, \alpha$) belongs to $W$. In fact if $\alpha = w(\alpha_i)$ for some $w \in W$, then $w_\alpha = w \circ w_i \circ w^{-1}$.

Proposition 11 ([14] § 6.5). Let $\alpha \in \Delta^{re}_+ = \Delta^{re} \cap \Delta_+$ be such that $\beta = \delta - a\alpha \in \Delta^{re}_+$ for some $a \in \mathbb{Z}$. Then, for $\lambda \in V^*$, we have

$$w_\alpha \circ w_\beta(\lambda) = \lambda + \langle K, \lambda \rangle\, \nu(\beta^\vee) - \Bigl(\langle \beta^\vee, \lambda \rangle + \frac{1}{2}|\beta^\vee|^2 \langle K, \lambda \rangle\Bigr)\delta,$$

where $K$ is the canonical central element.

Let $\mathring{R}$ be the root system obtained from the generalized Cartan matrix of $R$ by deleting the 0th row and column. Denote by $\mathring{W}$ the subgroup of $W$ generated by $w_1, \dots, w_{l-1}$. Let $M = \nu(\mathbb{Z}(\mathring{W} \cdot \theta^\vee)) \subset \mathring{V}^*$, $\theta = \delta - a_0\alpha_0$, and set $T = \{t_\alpha \mid \alpha \in M\}$, where

$$t_\alpha(\lambda) = \lambda + \langle K, \lambda \rangle\, \alpha - \Bigl((\alpha|\lambda) + \frac{1}{2}|\alpha|^2 \langle K, \lambda \rangle\Bigr)\delta \qquad \text{for } \lambda \in V^*.$$

We have $M = \mathring{Q} = Q(\mathring{R})$ if $R$ is symmetric and of affine type.

Proposition 12 ([14] § 6.5). $W \simeq \mathring{W} \ltimes T$.

Let $\mathrm{Aut}(R)$ be the group of Dynkin automorphisms. The extension $\mathrm{Aut}(R) \ltimes W(R)$ is called the extended Weyl group and is denoted by $\widetilde{W}(R)$.

Proposition 13 ([14] § 5.10).
a) If $R$ is indecomposable, then the group of all automorphisms of $Q$ preserving $\Delta$ is $\pm\widetilde{W}(R)$.
b) If $R$ is symmetric and of finite, affine or hyperbolic type, then the group of all automorphisms of $Q$ preserving $(\cdot|\cdot)$ is $\pm\widetilde{W}(R)$.

6. In the rest of this section, suppose $X$ is a surface obtained by nine successive blowings-up from $\mathbb{P}^2$ with the center at each point. Then $w \in W(E_8^{(1)})$ acts not only on $Q(E_8^{(1)}) \subset \mathrm{Pic}(X)$ but on $\mathrm{Pic}(X)$ by $w_{r_i}(\mathcal{F}) = \mathcal{F} + (r_i \cdot \mathcal{F})\, r_i$, where $r_0 = E_0 - E_1 - E_2 - E_3$ and $r_i = E_i - E_{i+1}$ ($i = 1, \dots, 8$). We set $\Pi(E_8^{(1)}) = \{r_i\ (i = 0, \dots, 8)\}$. Furthermore we assume that $\mathcal{D}_i \cdot K_X = 0$ for any $\mathcal{D}_i = [D_i] \in \mathrm{Comp}(D)$ and any $D \in |-K_X|$. In this case we have such $\mathcal{D}_i$'s in $\Delta^{re}(E_8^{(1)})$ because $\mathcal{D}_i \perp K_X$ and $(\mathcal{D}_i^2) = -2$ (this is because we have $(\mathcal{D}_i^2) \ge -2$ from the virtual genus $\pi(D_i)$ of the irreducible curve $D_i$). The same argument shows that $\Delta^{nod} \subset \Delta^{re}(E_8^{(1)})$.
Lemma 14. $W(E_8^{(1)})$ acts transitively on $\mathrm{EX}$.

Proof. Let $\mathcal{F} = a_0 E_0 - \sum_{i=1}^{9} a_i E_i \in \mathrm{EX}$. We have $3a_0 - \sum_{i>0} a_i = 1$ and $a_0^2 - \sum_{i>0} a_i^2 = -1$. From the effectivity of $\mathcal{F}$ and the numerical effectivity of $E_0$, we find $a_0 \ge 0$. If $a_0 = 0$, then there is an $E_i$ ($i > 0$) such that $\mathcal{F} = E_i$, and these classes are permuted transitively by $S_9 \subset W(E_8^{(1)})$. Let $a_0 > 0$. Acting with $S_9$ as necessary, we assume $a_1 \ge a_2 \ge \cdots \ge a_9$. We show $a_0 < a_1 + a_2 + a_3$. Suppose the contrary; then we have

$$-1 = \sum_{i>0}\Bigl(a_i - \frac{a_0}{3}\Bigr) \le (a_1 + a_2 + a_3 - a_0) + 6\Bigl(a_3 - \frac{a_0}{3}\Bigr) \le 2(3a_3 - a_0) \le 0.$$

This means that $a_1 - \frac{a_0}{3} = a_2 - \frac{a_0}{3} = a_3 - \frac{a_0}{3} = 0$. Let $a_1 = a_2 = a_3 = n$; then we have $(a_i) = (3n, n, n, n, n, \dots, n, n - 1)$, because $3a_i - a_0 \le 0$ and $\sum_{i=1}^{9}(a_i - \frac{a_0}{3}) = -1$. Now $a_0^2 - \sum_{i>0} a_i^2 = 9n^2 - (9n^2 - 2n + 1) = -1$ leads to $a_0 = 0$, and this is a contradiction. Thus $a_0 < a_1 + a_2 + a_3$. In this case we can operate $w_0$ on $\mathcal{F}$, where $w_0$ is the reflection by the root $r_0 = E_0 - E_1 - E_2 - E_3$. This operation reduces $a_0$, so recursive operations lead to the case of $a_0 = 0$. □
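The proof of Lemma 14 is effectively an algorithm: sort the multiplicities with $S_9$, reflect in $r_0$, and repeat until $a_0 = 0$. The Python sketch below is a hedged illustration of that loop. It assumes a class is stored by its coefficient vector $(c_0, \dots, c_9)$ in the basis $E_0, \dots, E_9$ (so $c_i = -a_i$ for $i > 0$), and the names `dot`, `reflect` and `reduce_to_exceptional_curve` are hypothetical. It works purely on the lattice and takes the effectivity hypothesis of the lemma for granted.

```python
def dot(u, v):
    """Intersection form on Pic(X): (E_0^2) = 1, (E_i^2) = -1, E_i . E_j = 0."""
    return u[0] * v[0] - sum(x * y for x, y in zip(u[1:], v[1:]))


K = [-3] + [1] * 9                      # K_X = -3 E_0 + E_1 + ... + E_9
R0 = [1, -1, -1, -1, 0, 0, 0, 0, 0, 0]  # r_0 = E_0 - E_1 - E_2 - E_3


def reflect(cls, root):
    """w_r(F) = F + (r . F) r for a root r with (r^2) = -2."""
    m = dot(root, cls)
    return [c + m * r for c, r in zip(cls, root)]


def is_exceptional_class(cls):
    return dot(cls, cls) == -1 and dot(cls, K) == -1


def reduce_to_exceptional_curve(cls, max_steps=1000):
    """Follow the proof of Lemma 14; returns (final class, number of w_0 moves)."""
    if not is_exceptional_class(cls):
        raise ValueError("input is not an exceptional class")
    steps = 0
    while cls[0] > 0:
        if steps >= max_steps:
            raise RuntimeError("no termination; is the class effective?")
        # S_9 move: reorder E_1..E_9 so that a_1 >= a_2 >= ... >= a_9.
        cls = [cls[0]] + sorted(cls[1:])
        cls = reflect(cls, R0)
        steps += 1
    return cls, steps


# Class of a conic through p_1, ..., p_5 (an illustrative exceptional class).
conic = [2, -1, -1, -1, -1, -1, 0, 0, 0, 0]
final_conic, n_conic = reduce_to_exceptional_curve(conic)
```

For the conic the loop passes through the class of a line through two points and stops at one of the $E_i$ after two reflections in $r_0$. Since every move is an isometry fixing $K_X$, the conditions $(\mathcal{F}^2) = -1$ and $\mathcal{F} \cdot K_X = -1$ hold at each stage.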
Lemma 15. Let $\mathcal{F} \in \mathrm{Pic}(X)$ and $\mathcal{F} \cdot r_i \ge 0$ for $r_i \in \Pi(E_8^{(1)})$. Then $w(\mathcal{F}) - \mathcal{F} \in \mathbb{Z}_{\ge 0}\Pi(E_8^{(1)})$ for $w \in W(E_8^{(1)})$.

Proof. We take a reduced expression $w = w_{i_1} \circ \cdots \circ w_{i_t}$. For a simple reflection $w_{i_t}$, $w_{i_t}(\mathcal{F}) - \mathcal{F} \in \mathbb{Z}_{\ge 0}\, r_{i_t}$. If $i_s$ is the smallest number such that $w_{i_{s+1}} \circ \cdots \circ w_{i_t}(\mathcal{F}) - \mathcal{F} \notin \mathbb{Z}_{\ge 0}\Pi$, then $r_{i_s} \cdot w_{i_{s+1}} \circ \cdots \circ w_{i_t}(\mathcal{F}) < 0$. Therefore $w_{i_t} \circ \cdots \circ w_{i_{s+1}}(r_{i_s}) \notin \Delta_+(E_8^{(1)})$, and this contradicts the fact that this is a reduced expression ([14] § 3.11). □

Lemma 16. If $w(E_0)$ is numerically effective, then the linear system of $w(E_0)$ has no base points and is effective.

Proof. Since $w(E_0)$ is numerically effective, we have $E_0 \cdot w(E_0) \ge 0$. From b) of Lemma 3 and the Riemann–Roch inequality, $\dim H^0(X, w(E_0)) \ge 3$. Hence $w(E_0)$ is effective. Suppose $p \in X$ is a base point of $w(E_0)$ and $\pi : \tilde{X} \to X$ is the blowing-up of $X$ at $p$. Then the total transform $\tilde{E}_0$ of $w(E_0)$ is numerically effective and the total transform of $p$ is a fixed component of $\tilde{E}_0$. Thus it is enough to show that $\tilde{E}_0$ has no fixed component. Let $\mathcal{F}$ be the class of the fixed part of $\tilde{E}_0$ and write $\tilde{E}_0 = \mathcal{H} + \mathcal{F}$. Then $\mathcal{H}$ has no fixed components, so $(\mathcal{H}^2) \ge 0$. Because $\tilde{E}_0$ is numerically effective, $\tilde{E}_0 \cdot \mathcal{H} \ge 0$. If $\tilde{E}_0 \cdot \mathcal{H} = 0$, then $(\mathcal{H}^2) = 0$, and it follows that $\mathcal{H} = 0$ because $\tilde{E}_0^\perp \subset \mathrm{Pic}(\tilde{X})$ has the negative definite intersection form; hence $\dim H^0(\tilde{X}, \mathcal{H}) = 1$. This contradicts $\dim H^0(\tilde{X}, \mathcal{H}) = \dim H^0(\tilde{X}, \tilde{E}_0) \ge 3$. On the other hand, if $\tilde{E}_0 \cdot \mathcal{H} > 1$, then $\tilde{E}_0 \cdot \mathcal{F} < 0$. This is against the effectivity of $\mathcal{F}$. Thus $\tilde{E}_0 \cdot \mathcal{H} = 1$. Moreover we have $(\mathcal{H}^2) = 0$, because $(\mathcal{H}^2) > 0$ implies that $\mathcal{F} = 0$ from $\mathcal{F} \perp \tilde{E}_0$. The global sections of $\mathcal{H}$ induce a morphism $\phi : \tilde{X} \to \mathbb{P}^l$ whose image is a curve $C$. Since $\tilde{X}$ is nonsingular and rational, and $\phi$ is induced by a complete linear system, we conclude that $C \simeq \mathbb{P}^1$. Because $\tilde{E}_0 \cdot \mathcal{H} = 1$, $\mathcal{H}$ is not a nontrivial multiple of any element of $\mathrm{Pic}(\tilde{X})$, and a general hyperplane $L$ of $\mathbb{P}^l$ meets $C$ at a single point. Thus $C$ is a linear subvariety of $\mathbb{P}^l$. Since $\phi$ is induced by a linear system, this means $\mathbb{P}^l = C \simeq \mathbb{P}^1$. Therefore $H^0(\tilde{X}, \mathcal{H}) = H^0(\mathbb{P}^1, \mathcal{O}(1))$, and $\dim H^0(\mathbb{P}^1, \mathcal{O}(1)) = 2$ leads to a contradiction. □
Lemma 17. If $w(E_0)$ is numerically effective, then $\dim H^0(X, w(E_0)) = 3$.

Proof. From the above lemma, $w(E_0)$ is effective and has no base points. Since $(E_0^2) > 0$, the linear system of $w(E_0)$ is not composite with a pencil. Bertini’s theorem says that $w(E_0)$ is irreducible. The arithmetic genus of $w(E_0)$ is zero. So $w(E_0)$ has a divisor $C \simeq \mathbb{P}^1$. We consider the exact sequence $0 \to \mathcal{O}_X \to \mathcal{O}_X \otimes w(E_0) \to \mathcal{O}_C \otimes w(E_0) \to 0$. Then $\deg \mathcal{O}_C \otimes w(E_0) = 1$, and from the Riemann–Roch theorem for curves, we have $\dim H^0(C, \mathcal{O}_C \otimes w(E_0)) = 2$. By taking the cohomology of the above exact sequence, noting that $\dim H^0(X, \mathcal{O}_X) = 1$ and $\dim H^1(X, \mathcal{O}_X) = 0$, we have $\dim H^0(X, w(E_0)) = 3$. □

Proposition 18. Let $E = (E_0, \dots, E_9)$ define a blowing-down structure.
a) $\Delta^{nod} \cup \bigcup_{D \in |-K_X|} \mathrm{Comp}(D) \subset \Delta_+(E_8^{(1)})$.
b) Let $w \in W(E_8^{(1)})$. Then $wE = (w(E_0), \dots, w(E_9))$ defines a blowing-down structure if and only if $w\bigl(\Delta^{nod} \cup \bigcup_{D \in |-K_X|} \mathrm{Comp}(D)\bigr) \subset \Delta_+(E_8^{(1)})$.
c) Let $w_i \in W(E_8^{(1)})$ be the reflection by a simple root $r_i$. Then $w_i E$ defines a blowing-down structure if and only if $r_i$ is not effective.

Proof. a) If $r \in \Delta^{nod} \cup \bigcup_{D \in |-K_X|} \mathrm{Comp}(D)$, then $E_0 \cdot r \ge 0$. If $E_0 \cdot r > 0$, then $r$, as a sum of $r_0 = E_0 - E_1 - E_2 - E_3$ and $r_i = E_i - E_{i+1}$ ($i = 1, \dots, 8$), has $r_0$ with a positive coefficient, hence $r$ is positive. If $E_0 \cdot r = 0$, then $r$ is of the form $E_i - E_j$. From the effectivity of $r$, $p_j$ is infinitely near $p_i$, so $i < j$ and $r$ is positive.
b) If $wE$ defines a blowing-down structure, then $w$ preserves the positivity of $\Delta^{nod}$ and of an element of $\mathrm{Comp}(D)$ from a). To prove the converse we show the numerical effectivity of $w(E_0)$ first. But since an irreducible class $\mathcal{F}$ with $\mathcal{F} \cdot \mathcal{F} \le -1$ is either an element of $\mathrm{Comp}(D)$, $\Delta^{nod}$ or $\mathrm{EX}$, it is enough to show the effectivity of $w(E_0)$ and $\mathcal{F} \cdot w(E_0) \ge 0$ for $\mathcal{F} \in \bigcup_{D \in |-K_X|} \mathrm{Comp}(D) \cup \Delta^{nod} \cup \mathrm{EX}$. If $\mathcal{F} \in \Delta^{nod} \cup \bigcup_{D \in |-K_X|} \mathrm{Comp}(D)$, then $\mathcal{F} \cdot w(E_0) \ge 0$ from the hypothesis. Let $\mathcal{F} \in \mathrm{EX}$. From Lemmas 14 and 15 an element $\mathcal{F}' \in \mathrm{EX}$ has the form $E_9 + \mathcal{G}$, where $\mathcal{G} \in \mathbb{Z}_{\ge 0}\Pi(E_8^{(1)})$. Taking $\mathcal{F}' = w^{-1}(\mathcal{F})$, we have $w(E_0) \cdot \mathcal{F} = E_0 \cdot w^{-1}(\mathcal{F}) \ge 0$. Let us show that $w(E_0)$ is effective. By Lemma 15, $w(E_0) \cdot E_0 \ge 0$ holds, so $\dim H^2(X, w(E_0)) = 0$ by Lemma 3. The Riemann–Roch inequality says $\dim H^0(X, w(E_0)) \ge 3$. Therefore $w(E_0)$ is numerically effective. Lemmas 16 and 17 say that $w(E_0)$ induces a morphism $\phi : X \to \mathbb{P}^2$ which is birational since $(w(E_0)^2) = 1$. $\phi$ is a contraction of nine exceptional curves of the first kind which do not meet $w(E_0)$. The intersection form determines these classes up to ordering, so these are $w(E_i)$ ($i = 1, \dots, 9$). But from the hypothesis, if $w(E_i) - w(E_j)$ ($0 < i \neq j$) is effective, then $i < j$. Thus we obtain the statement.
c) Because $w_i$ preserves the positivity of all roots except $r_i$, c) is a consequence of b). The converse is clear because $w_i(r_i) = -r_i \notin \Delta_+(E_8^{(1)})$. □

Lemma 19. Let $(E_0, E_1, \dots, E_9)$ define a blowing-down structure, and suppose $\mathcal{F}$ is a class such that $\mathcal{F} \cdot r_i \ge 0$ ($i = 0, \dots, 8$), and $\mathcal{F} \cdot E_i \ge 0$ ($i = 0, \dots, 9$). Then $\mathcal{F}$ is a nonnegative sum of the classes $E_0$, $E_0 - E_1$, $2E_0 - E_1 - E_2$ and $-K_i$ for $i \ge 3$.
Proof. If $F \cdot E_i = 0$ for every $i$, then $F = 0$. Let $k$ be the largest index such that $F \cdot E_k > 0$. If $k = 0$, then $F = aE_0$, $a > 0$. If $k = 1$, then $F = aE_0 - bE_1$ ($a \ge b > 0$), so $F = (a - b)E_0 + b(E_0 - E_1)$. If $k = 2$, then $F = aE_0 - bE_1 - cE_2$ and $a \ge b + c > b \ge c > 0$, so $F = (a - b - c)E_0 + (b - c)(E_0 - E_1) + c(2E_0 - E_1 - E_2)$. If $k \ge 3$ and $F = \sum_{i=0}^{9} a_i E_i$, then $a_0 \ge -(a_1 + a_2 + a_3) \ge -3a_3$ and $-a_1 \ge -a_2 \ge \cdots \ge -a_k > 0 = a_{k+1} = \cdots = a_9$. Then $F' = F - a_k K_k$ satisfies the conditions above and has a smaller $k$. By induction on $k$, $F$ has the required form. □

Proposition 20. $F \in \mathrm{NEFF}$ if and only if there exists a blowing-down structure such that $F$ is a sum of the classes $E_0$, $E_0 - E_1$, $2E_0 - E_1 - E_2$ and $-K_i = -K_X + E_9 + \cdots + E_{i+1}$ ($i \ge 3$) with nonnegative coefficients. In particular $F$ is effective.

Proof. First we show that $F = E_0$, $E_0 - E_1$, $2E_0 - E_1 - E_2$, $-K_i$ ($i \ge 3$) are numerically effective. Since $F$ is effective, it is enough to show $F \cdot G \ge 0$ for an arbitrary irreducible divisor class $G$ with $(G^2) < 0$. If $G$ is a nodal curve, then $G$ belongs to $\Delta^+(E_8^{(1)})$ by a) of Proposition 18 and $F \cdot r_i \ge 0$ for all $i$. Thus we have only to prove it for $G \in EX$. When we set $G = \sum a_i E_i$ then $a_0 \ge 0$. From $a_0^2 - \sum_{i=1}^{9} a_i^2 = -1$, we easily have $F \cdot G \ge 0$ for $F = E_0$, $E_0 - E_1$ and $2E_0 - E_1 - E_2$. In the case $F = -K_i$ we also have $F \cdot G > 0$. For $G \in EX(X_i)$, we have that $-K_{X_i} \cdot G = -K_i \cdot G = 1$ in $X_i = \rho_i \circ \cdots \circ \rho_9(X)$, and the total transform of a numerically effective divisor class is numerically effective.

Conversely, if $F$ is numerically effective then we have only to show $F \cdot r_i \ge 0$ for some $(E_i)_{i=0}^{9}$ by the above lemma. Set $F = \sum_{i=0}^{9} a_i E_i$ in terms of some $(E_i)_{i=0}^{9}$. If $F \cdot r_i < 0$ then $r_i$ is not effective and so $(w(E_i))$ defines a blowing-down structure, where $w$ is the reflection by the root $r_i$. In the case $i = 0$ this operation decreases $a_0$, and in the case $i > 0$ this operation decreases $\sum_{i=1}^{8} \max\{-F \cdot r_i, 0\} = \sum_{i=1}^{8} \max\{a_i - a_{i+1}, 0\}$ with $a_0$ unchanged. If we do not achieve $(E_i)$ such that $F \cdot r_i \ge 0$ for $i = 0, \ldots, 8$ by these processes then $a_0$ becomes negative. But this contradicts the fact that $E_0$ is effective. □

4. Generalized Halphen Surface

1. We assume $X$ to be a rational surface such that $|-K_X|$ has an effective divisor of canonical type (we denote it by $D$). Here, by a divisor of canonical type we mean the following.

Definition 1. Let $D = \sum_{i \in I} m_i D_i$ be an effective divisor on $X$ with irreducible components $D_i$. We say that $D$ is of canonical type if $K_X \cdot [D_i] = D \cdot D_i = 0$ for all $i$.

In this case we have $c_1^2(X) = (-K_X)^2 = -\sum_{i \in I} m_i K_X \cdot [D_i] = 0$. If $|-K_X|$ has positive dimension, $X$ is a rational elliptic surface called a Halphen surface of index one (a Halphen surface of index $l$ is characterized by the condition that $\dim|-lK_X| > 0$). It is known that the anti-canonical divisors of a Halphen surface are of canonical type [1]. So our $X$ is a generalization of this rational elliptic surface.

Definition 2. Let $X$ be a smooth projective rational surface. We call $X$ a generalized Halphen surface if $X$ has an anti-canonical divisor of canonical type.
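As a concrete instance of Definition 1, one can check the canonical-type conditions for the $E_7^{(1)}$ configuration whose component classes are written out later in this section (D0 = E0 − E1 − E4 − E5, ..., D7 = E0 − E1 − E2 − E3). The sketch below is a hedged illustration: it assumes the standard intersection form E0² = 1, Ei² = −1 on Pic(X), the class −K_X = 3E0 − E1 − ··· − E9, and the multiplicities (1, 2, 3, 4, 3, 2, 1, 2), which are the usual Kac labels of $E_7^{(1)}$ and are not quoted from the text. The helper names `dot` and `cls` are hypothetical.

```python
def dot(u, v):
    """Intersection form on Pic(X) in the basis (E0, ..., E9): diag(1, -1, ..., -1)."""
    return u[0] * v[0] - sum(x * y for x, y in zip(u[1:], v[1:]))


def cls(plus, minus):
    """Class sum_{i in plus} E_i - sum_{j in minus} E_j as a coefficient vector."""
    v = [0] * 10
    for i in plus:
        v[i] += 1
    for j in minus:
        v[j] -= 1
    return v


# Component classes D0, ..., D7 of the E7^(1) example (canonical root basis).
D = [
    cls([0], [1, 4, 5]),  # D0 = E0 - E1 - E4 - E5
    cls([1], [2]),        # D1 = E1 - E2
    cls([2], [3]),        # D2 = E2 - E3
    cls([3], [6]),        # D3 = E3 - E6
    cls([6], [7]),        # D4 = E6 - E7
    cls([7], [8]),        # D5 = E7 - E8
    cls([8], [9]),        # D6 = E8 - E9
    cls([0], [1, 2, 3]),  # D7 = E0 - E1 - E2 - E3
]
m = [1, 2, 3, 4, 3, 2, 1, 2]  # assumed multiplicities (Kac labels of E7^(1))

K = [-3] + [1] * 9            # canonical class K_X
total = [sum(mi * Di[j] for mi, Di in zip(m, D)) for j in range(10)]

print(total == [-k for k in K])           # D = sum m_i D_i represents -K_X
print([dot(K, Di) for Di in D])           # K_X . D_i for every component
print([dot(total, Di) for Di in D])       # D . D_i for every component
print(dot(K, K))                          # (-K_X)^2
```

Under these assumptions every listed intersection number vanishes, so D is of canonical type, in line with the identity $(-K_X)^2 = 0$ stated after Definition 1.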
Table 2. Classification of generalized Halphen surfaces with dim |−K_X| = 0

Type                  R
Elliptic type         $A_0^{(1)}$ (= I_0)
Multiplicative type   $A_0^{(1)*}$ (= I_1), $A_1^{(1)}$ (= I_2), $A_2^{(1)}$ (= I_3), ..., $A_7^{(1)}$, $A_7^{(1)\prime}$ (= I_8), $A_8^{(1)}$ (= I_9)
Additive type         $A_0^{(1)**}$ (= II), $A_1^{(1)*}$ (= III), $A_2^{(1)*}$ (= IV), $D_4^{(1)}$ (= I_0^*), ..., $D_8^{(1)}$ (= I_4^*), $E_6^{(1)}$ (= IV^*), $E_7^{(1)}$ (= III^*), $E_8^{(1)}$ (= II^*)
Generalized Halphen surfaces are divided into two types; • dim | − KX | = 1: a Halphen surface of index one, • dim | − KX | = 0. We classify the surfaces through the classification of anti-canonical divisors, but a Halphen surface of index one has many types of anti-canonical divisors. A Halphen surface is a well-known object [1] and the elliptic functions appear instead of the Painlevé equations. We study mainly generalized Halphen surfaces with dim | − KX | = 0. If | − KX | has a unique divisor D, then X is classified according to the type R of D, where R is in the Table 2. We call X an R-surface and denote it by X(R). The symbols in the parentheses correspond to the ones from Kodaira’s elliptic singular fibers. We divide all cases with dim | − KX | = 0 into three classes according to rank H1 (Dred , Z) = 2, 1 or 0, where Dred = ∪Di for D = mi Di . We call them the elliptic type, the multiplicative type and the additive type respectively. We will see later that this classification corresponds to the types of discrete equations from the affine Weyl group symmetry: what we call elliptic-difference equation, q-difference equation and usual difference equation. We will obtain this classification, first by studying the divisor classes Di = [Di ] ∈ Pic(X), and then by a much finer investigation on the divisors Di . (1)
2. We have two important root subsystems R and R ⊥ in Q(E8 ) = (−KX )⊥ ⊂ Pic(X). (The symbol R in the classification of surfaces in Table 2 needs more detailed distinction than that of the root system R. See 3.) The root lattice of R is the span Q(R) = ZDi of the classes of irreducible components Di of D, and that of R ⊥ is its orthogonal complement Q(R ⊥ ) := Q(R)⊥ = {v ∈ Pic(X) | (v|Di ) = 0 for all i}. The lists of R and the corresponding R ⊥ are the following: Table 3. R
Table 4. $R^{\perp}$

The arrows mean inclusions ($R \to R' \Leftrightarrow R^{\perp} \to R'^{\perp} \Leftrightarrow Q(R) \subset Q(R') \Leftrightarrow Q(R^{\perp}) \supset Q(R'^{\perp})$). Let $R = A_0^{(1)}$ stand for the lattice $Q(R) = \mathbb{Z}K_X$ in the case that $D$ itself is irreducible. Now $A_7^{(1)}$ has two types of realizations in the lattice $Q(E_8^{(1)})$. We use the symbol $A_7^{(1)}$ for the one which has no orthogonal real root of $E_8^{(1)}$ in its complement, and $A_7^{(1)\prime}$ for the other one. The symbol $A^{(1)}_{1,|\alpha|^2=l}$ means the root subsystem of type $A_1^{(1)}$ whose square length of roots is $l$, that is, it has the intersection form
$$-\begin{pmatrix} l & -l \\ -l & l \end{pmatrix}.$$
We use unusual symbols (2A1 )(1) , (A2 + A1 )(1) and (A1 + A1,|α|2 =14 )(1) . They mean that they have the following intersection forms: 2
(1) (2A1 )(1) = A1 + A1 : − −2
0
(A2 + A1 )
0 0 2 0 0 2 −2 , −2 2 2
2 ∼ : − 0 0 2 −1 −1 0 −1 2 −1 0 (1) = A2 + A1 : − −1 −1 2 0 (1) A1 + A 1
(1)
−2 0
0
0
0
2
−1
0
−1 (1) ∼ A2 + A 1 : − 0 0
2 0 0
2 0
0 , 2 −2 −2 2 0
2
(1)
0
0 0 0 14 2 0 0 : − 0 14 −14 . 0 −14 14
(1) (A1 + A1,|α|2 =14 )(1) = A1 + A1,|α|2 =14 : − −1
∼ A1 + A1,|α|2 =14
−1
2
(1)
The list of $R$ can be obtained from the list of root subsystems of $E_8^{(1)}$ which are indecomposable and of affine type (see [2]). Notice that $R$ is of affine type because the null root $-K_X$ belongs to $Q(R) = \sum \mathbb{Z}D_i$, and is indecomposable because of Proposition 6.

Remark. In Table 3, the surfaces corresponding to the $R$'s enclosed in the frame are the spaces of initial conditions of the Painlevé differential equations [19]. In Sect. 6, a subsystem $R^{\perp}$ plays an important role in the constructions of symmetry. A discrete Painlevé equation is a translation part of this symmetry. The operation of taking the continuous limit to the Painlevé differential equation has a corresponding arrow in Table 4 and a corresponding degeneration of the surface. For details, see Sect. 7.

3. The root system $R$ in Table 3 slightly differs from the $R$ in the classification of surfaces (Table 2). We need a more detailed study of $D = \sum m_i D_i$. According to Proposition 2 we have a birational morphism $\rho : X \to \mathbb{P}^2$, where $\rho$ is a composition of nine successive blowings-up of points, possibly infinitely near to each other, in $\mathbb{P}^2$:
$$X = X_9 \xrightarrow{\rho_9} X_8 \xrightarrow{\rho_8} \cdots \xrightarrow{\rho_2} X_1 \xrightarrow{\rho_1} X_0 = \mathbb{P}^2.$$
Each $\rho_i$ is a blowing-up of a closed point $p_i \in X_{i-1}$. We denote the image of $p_i$ in $\mathbb{P}^2$ by the same letter $p_i$. In the generic case, there exists a unique cubic curve passing through the nine points $p_i$. Then the proper transform of this cubic curve in $X$ is $D$ because $-K_X = 3E_0 - E_1 - \cdots - E_9$.

We have classified $D$ via the span of Comp($D$) in Pic($X$). If $D$ is irreducible, corresponding to $A_0^{(1)}$, $D$ is either a smooth elliptic curve, a rational curve with a node, or a rational curve with a cusp. Similarly $A_1^{(1)}$ and $A_2^{(1)}$ have different types of divisors $D$. We distinguish these as follows:
$R = A_0^{(1)}$: $D$ is a smooth elliptic curve,
$R = A_0^{(1)*}$: $D$ is a rational curve with a node,
$R = A_0^{(1)**}$: $D$ is a rational curve with a cusp,
$R = A_1^{(1)}$: $D = D_0 + D_1$ and $D_0$ intersects $D_1$ at two points,
$R = A_1^{(1)*}$: $D = D_0 + D_1$ and $D_0$ is tangent to $D_1$ at one point; $D_0 \cdot D_1 = 2$,
$R = A_2^{(1)}$: $D = D_0 + D_1 + D_2$ and $D_0 \cap D_1 \cap D_2 = \emptyset$; $D_i \cdot D_j = 1$ ($i \ne j$),
$R = A_2^{(1)*}$: $D = D_0 + D_1 + D_2$ and $D_0 \cap D_1 \cap D_2 \ne \emptyset$; $D_i \cdot D_j = 1$ ($i \ne j$).
Remark. If the points $p_i$ admit two distinct cubic curves $F_0(x : y : z) = 0$, $F_1(x : y : z) = 0$ passing through them, then $|-K_X|$ is a pencil which consists of the proper transforms of $\mu F_0 + \nu F_1 = 0$. Then the morphism $X \supset (\mu F_0 + \nu F_1 = 0) \ni x \mapsto (\mu : \nu) \in \mathbb{P}^1$ is the elliptic fibration of a Halphen surface $X$.

4. We can describe the isomorphism classes of the surfaces by the ordered point sets $(p_i)_{i=1}^{9}$, without using detailed information about $X$ as a surface obtained by gluing affine spaces. That is, naïvely the parameter space of surfaces is
$$\mathrm{PGL}(3) \,\backslash \left\{ \begin{pmatrix} x_1 & x_2 & \cdots & x_9 \\ y_1 & y_2 & \cdots & y_9 \\ z_1 & z_2 & \cdots & z_9 \end{pmatrix} \right\} / (\mathbb{C}^*)^9,$$
where pi = (xi : yi : zi ). However we need more degenerated configurations of points including many infinitely near points and more specific configurations of the anticanonical divisor. In the next section we will closely investigate the parameterization of isomorphism classes of our surfaces, together with the blowing-down structures. This parameterization is very convenient to describe the surfaces, but it depends on the blowing-down structures. The blowing-down structure is determined by a choice of divisor classes (E0 , . . . , E9 ). It corresponds to the standard basis of 10 .An isomorphism ϕ : 10 → Pic(X) defined by ϕ(vi ) = Ei (i = 0, . . . , 9) via the blowing-down structure on X is called a strict geometric marking of X. If ϕ is a strict geometric marking, then ϕ(δ) = −KX , where δ = δ9 = 3v0 − v1 − (1) · · · − v9 . We can choose a unique root basis for ϕ((δ)⊥ ) ! Q(E8 ) in Pic(X) through this marking. We call it the canonical root basis. This holds in the case of sublattice (1) Q(R) or Q(R ⊥ ) ⊂ Q(E8 ). (1) We illustrate the setting above with the case of X(E7 ) as an example. There are the following two types of possible blowing-down structures: t D7 2 t D0
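[Figure: the two Dynkin-type diagrams of the components $D_0, \ldots, D_7$ corresponding to the two blowing-down structures; the vertex marks and numbers are not recoverable from the extracted text.]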
Each vertex Di is a divisor class of an irreducible component Di and a line between vertices means that these two divisor classes intersect at one point. The number associated with each vertex is the number of irreducible exceptional classes which are blown down by ρ : X → P2 and intersect with corresponding Di . Here • means a divisor class of type E0 −Ei1 −Ei2 −Ei3 (i1 , i2 , i3 ≥ 1) and ◦ means a divisor class of type Ej1 −Ej2 (1 ≤ j1 < j2 ).
We can blow down the surface X(E7 ) in each manner. From the first diagram we choose (Ei )9i=0 by using the elements of Comp(D) as follows: We assign E9 to an irreducible exceptional class which intersects D6 ; E8 = E9 + D6 ; E7 = E8 + D5 ; E6 = E7 + D4 ; E5 and E4 are assigned to irreducible exceptional classes which intersect D0 ; E3 = E6 + D3 ; E2 = E3 + D2 ; E1 = E2 + D1 . Conversely we have the canonical root basis from (Ei )9i=0 ; D0 = E0 − E1 − E4 − E5 , D1 = E1 − E2 , D2 = E2 − E3 , D3 = E3 − E6 , D4 = E6 − E7 , D5 = E7 − E8 , D6 = E8 − E9 , D7 = E0 − E1 − E2 − E3 . We choose the correspondence between the standard basis of 10 and each canonical root basis of Q(R) and Q(R ⊥ ) as in Appendix A. 5. The blowing-down structure is not determined even if we fix the span of Di in Pic(X). (1) For example, in the case of X(E7 ), we could have chosen E9 = 3E0 − E1 − E2 − E3 − 2E4 − E6 − E7 − E8 instead of E9 , if E9 is irreducible. We could have changed the indices as (E4 , E5 ) = (E5 , E4 ) or (Di )7i=0 = (D6 , D5 , D4 , D3 , D2 , D1 , D0 , D7 ). A change of blowing-down structures causes a change of parameters of isomorphism classes of the surface X and gives rise to a Cremona transformation. Here a Cremona transformation means an automorphism of the field K(P2 ) ! K(X). We will see that such transformations are an origin of the Painlevé equations (Sect. 7). Let us illustrate the correspondence between changes of blowing-down structures and Cremona transformations on a simple example. Recall the standard Cremona transformation on P2 : (x : y : z) → (yz : zx : xy), ✡
[Figure: the triangle of lines through $p_1, p_2, p_3$ and its image, with the lines labelled $E_0 - E_2 - E_3 = \bar E_1$, $E_0 - E_3 - E_1 = \bar E_2$, $E_0 - E_1 - E_2 = \bar E_3$ on one side and $E_1 = \bar E_0 - \bar E_2 - \bar E_3$, $E_2 = \bar E_0 - \bar E_3 - \bar E_1$, $E_3 = \bar E_0 - \bar E_1 - \bar E_2$ on the other.]

$$Q_{p_3}Q_{p_2}Q_{p_1}(\mathbb{P}^2) = Q_{\bar p_3}Q_{\bar p_2}Q_{\bar p_1}(\mathbb{P}^2), \qquad \mathbb{P}^2 \dashrightarrow \mathbb{P}^2,$$

where $Q_p(X)$ is the surface obtained by blowing up $X$ with the center $p$. This transformation is brought about by the change of the blowing-down structures:
$$\begin{pmatrix} \bar E_0 \\ \bar E_1 \\ \bar E_2 \\ \bar E_3 \end{pmatrix} = \begin{pmatrix} 2 & -1 & -1 & -1 \\ 1 & 0 & -1 & -1 \\ 1 & -1 & 0 & -1 \\ 1 & -1 & -1 & 0 \end{pmatrix} \begin{pmatrix} E_0 \\ E_1 \\ E_2 \\ E_3 \end{pmatrix}.$$

In order to investigate these transformations, it is enough to consider choices of blowing-down structures, i.e. choices of standard bases of Pic($X$). We need to know these as a whole. So we prepare the terminology: an automorphism $\sigma$ of Pic($X$) is called a Cremona isometry (cf. [15, 4]) if the following properties are satisfied:
(Cr1) $\sigma$ preserves the intersection form in Pic($X$),
(Cr2) $\sigma$ leaves the canonical class of $X$ fixed,
(Cr3) $\sigma$ leaves the semi-group of effective classes $\mathrm{Pic}^+(X)$ invariant.
We will determine the group of Cremona isometries Cr($X$) for each surface $X = X(R)$ and calculate the corresponding Cremona transformations in Sect. 6.
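The first two conditions (Cr1) and (Cr2) are easy to test by machine for the base change induced by the standard Cremona transformation. The sketch below works in the rank-4 sublattice spanned by E0, E1, E2, E3 with intersection form diag(1, −1, −1, −1), and uses the reflection F ↦ F + (α·F)α in the class α = E0 − E1 − E2 − E3. The function names are hypothetical, and (Cr3), which concerns effectivity, is not checked here.

```python
def dot(u, v):
    """Intersection form diag(1, -1, -1, -1) in the basis (E0, E1, E2, E3)."""
    return u[0] * v[0] - sum(x * y for x, y in zip(u[1:], v[1:]))


def reflect(alpha, F):
    """Reflection F -> F + (alpha . F) alpha for a class alpha with alpha^2 = -2."""
    c = dot(alpha, F)
    return [f + c * a for f, a in zip(F, alpha)]


alpha = [1, -1, -1, -1]                    # E0 - E1 - E2 - E3
basis = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
images = [reflect(alpha, e) for e in basis]  # rows: images of E0, E1, E2, E3

# Matrix of the standard Cremona base change (row i gives the new class
# in terms of E0, ..., E3).
CREMONA = [
    [2, -1, -1, -1],
    [1, 0, -1, -1],
    [1, -1, 0, -1],
    [1, -1, -1, 0],
]
K = [-3, 1, 1, 1]                          # canonical class restricted to this sublattice

print(images == CREMONA)                   # the base change is this reflection
print(all(dot(images[i], images[j]) == dot(basis[i], basis[j])
          for i in range(4) for j in range(4)))   # (Cr1)
print(reflect(alpha, K) == K)              # (Cr2)
print(all(reflect(alpha, reflect(alpha, e)) == e for e in basis))  # involution
```

The matrix rows coincide with the images of E0, ..., E3 under the reflection, so the change of blowing-down structure is an isometric involution fixing the canonical class.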
5. Parameterization of Isomorphism Classes of Surfaces

1. In the subsequent sections $X$ is a generalized Halphen surface with $\dim|-K_X| = 0$. We usually parameterize the surfaces which are obtained by blowing up the projective plane in terms of the moduli of ordered point sets on the projective plane. But we need more degenerate configurations of points, including many infinitely near points, and more specific configurations of irreducible components of the anti-canonical divisors. We can normalize the surface by using projective transformations. The list is in Appendix B.

We will look closely at the case of the $E_7^{(1)}$-surface. First of all the images of $D_0$ and $D_7$ by $\rho$ are lines on $\mathbb{P}^2$. We may take them to be $(x = 0)$ and $(z = 0)$ by a choice of coordinates. The surface obtained by blowing up the affine space $\mathrm{Spec}\,\mathbb{C}[x, y]$ with the center $(x, y) = (0, 0)$ is
$$W = \{(x, y; \zeta_0 : \zeta_1) \in \mathrm{Spec}\,\mathbb{C}[x, y] \times \mathbb{P}^1 \mid x\zeta_0 = y\zeta_1\} = \mathrm{Spec}\,\mathbb{C}[x, \zeta_0/\zeta_1] \cup \mathrm{Spec}\,\mathbb{C}[\zeta_1/\zeta_0, y].$$
In general a surface obtained by blowing-up is constructed by gluing such affine schemes. For economy of symbols we write $x/y$ and $y/x$ to stand for $\zeta_1/\zeta_0$ and $\zeta_0/\zeta_1$ respectively.

First, $p_1$ is the point which is expressed in this coordinate as $(x : y : z) = (0 : 1 : 0)$ because $E_1$ intersects $D_0$ and $D_7$. We write it as $p_1 : (x : y : z) = (0 : 1 : 0)$. We can take $p_4 : (0 : 0 : 1)$ and $p_5 : (0 : a_1 : 1)$ for $a_1 \in \mathbb{C}$. The point $p_2$ is infinitely near $p_1$ and $p_2 \in \mathrm{Spec}\,\mathbb{C}\left[\frac{x}{y}, \frac{z}{y} \big/ \frac{x}{y}\right]$. Because $E_2$ intersects $D_7$, $p_2 : (x/y, z/x) = (0, 0)$. Similarly $p_3 : (x/y, yz/x^2) = (0, 0)$. Now $p_6 \in \mathrm{Spec}\,\mathbb{C}\left[\frac{x}{y}, \frac{y^2 z}{x^3}\right]$ and $p_6 : (x/y, y^2z/x^3) = (0, c)$ for $c \in \mathbb{C}^*$, but by a projective transformation $(x : y : z) \to (x : y : cz)$ we can set $p_6 : (x/y, y^2z/x^3) = (0, 1)$. Next $p_7 \in \mathrm{Spec}\,\mathbb{C}\left[\frac{x}{y}, \frac{y(y^2z - x^3)}{x^4}\right]$ and $p_7 : (x/y, y(y^2z - x^3)/x^4) = (0, c')$ for $c' \in \mathbb{C}$, but by a projective transformation $(x : y : z) \to (x : y + \frac{c'}{2}x : z)$ we can set $p_7 : (x/y, y(y^2z - x^3)/x^4) = (0, 0)$. The point $p_8$ is in $\mathrm{Spec}\,\mathbb{C}\left[\frac{x}{y}, \frac{y^2(y^2z - x^3)}{x^5}\right]$ and $p_8 : (x/y, y^2(y^2z - x^3)/x^5) = (0, s)$ for $s \in \mathbb{C}$. Finally, $p_9 : (x/y, y(y^2(y^2z - x^3) - sx^5)/x^6) = (0, a_0)$ for $a_0 \in \mathbb{C}$.
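As a result, we have

[Figure: the lines $x = 0$ and $z = 0$ meeting at $p_1$, with $p_4$, $p_5$ on $x = 0$ and the chain $p_1 \leftarrow p_2 \leftarrow p_3 \leftarrow p_6 \leftarrow p_7 \leftarrow p_8 \leftarrow p_9$ of infinitely near points.]

$p_1 : (0 : 1 : 0)$
$\leftarrow p_2 : \left(\frac{x}{y}, \frac{z}{x}\right) = (0, 0)$
$\leftarrow p_3 : \left(\frac{x}{y}, \frac{yz}{x^2}\right) = (0, 0)$
$\leftarrow p_6 : \left(\frac{x}{y}, \frac{y^2z}{x^3}\right) = (0, 1)$
$\leftarrow p_7 : \left(\frac{x}{y}, \frac{y(y^2z - x^3)}{x^4}\right) = (0, 0)$
$\leftarrow p_8 : \left(\frac{x}{y}, \frac{y^2(y^2z - x^3)}{x^5}\right) = (0, s)$
$\leftarrow p_9 : \left(\frac{x}{y}, \frac{y(y^2(y^2z - x^3) - sx^5)}{x^6}\right) = (0, a_0)$,
$p_4 : (0 : 0 : 1)$, $p_5 : (0 : a_1 : 1)$.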
Here an arrow $p_i \leftarrow p_j$ means that $p_j$ is infinitely near $p_i$. It is possible that $p_5$ is infinitely near $p_4$, but in this case $p_5 : (y/z, x/y) = (0, 0)$ because $E_5$ intersects $D_0$. In particular $p_5$ does not have an extra dimension of parameters, so this case is simply parameterized by $a_1 = 0$. Furthermore we have a projective transformation:
$$(a_1, a_0; s; x : y : z) \simeq (\mu^3 a_1, \mu^3 a_0; \mu^2 s; x : \mu y : \mu^{-2} z) \quad \text{for } \mu \in \mathbb{C}^*.$$
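The weights in this rescaling can be checked directly on the chart coordinates used for $p_6, \ldots, p_9$. The sketch below is a hedged numerical check with exact rational arithmetic. It assumes that z is rescaled by $\mu^{-2}$, which is the weight forced by keeping the normalization $y^2z/x^3 = 1$ at $p_6$; the extracted text shows the exponent of µ on z without a sign. The function names are hypothetical.

```python
from fractions import Fraction


def chart_coords(x, y, z, s):
    """Second chart coordinates used for p6, p7, p8, p9 (first one is always x/y)."""
    w = y * y * z - x ** 3
    u6 = y * y * z / x ** 3
    u7 = y * w / x ** 4
    u8 = y * y * w / x ** 5
    u9 = y * (y * y * w - s * x ** 5) / x ** 6
    return u6, u7, u8, u9


def rescale(mu, x, y, z, s):
    """Assumed action (x : y : z) -> (x : mu*y : mu^-2 * z) together with s -> mu^2 * s."""
    return x, mu * y, z / mu ** 2, mu ** 2 * s


# Arbitrary sample values (exact rationals, x != 0).
x, y, z, s = Fraction(2), Fraction(3), Fraction(5), Fraction(7)
mu = Fraction(3, 2)

before = chart_coords(x, y, z, s)
after = chart_coords(*rescale(mu, x, y, z, s))

# Expected weights: u6 -> u6, u7 -> mu*u7, u8 -> mu^2*u8, u9 -> mu^3*u9.
print(after == (before[0], mu * before[1], mu ** 2 * before[2], mu ** 3 * before[3]))

# p5 = (0 : a1 : 1) goes to (0 : mu*a1 : mu^-2), i.e. y/z is multiplied by mu^3.
a1 = Fraction(11)
print((mu * a1) / (1 / mu ** 2) == mu ** 3 * a1)
```

Under this assumption the normalizations at $p_6$ and $p_7$ are preserved, while $s$, $a_0$ and $a_1$ pick up the factors $\mu^2$, $\mu^3$ and $\mu^3$ stated in the text.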
We have another way to parameterize an isomorphism class of surfaces. We use the period mapping, which maps the elements of the second homology to $\mathbb{C}$. However we have no holomorphic 2-form on $X$, so we have to use $H_2(X - D_{red}; \mathbb{Z})$ instead of $H_2(X; \mathbb{Z})$. Therefore this is a parameterization for the non-compact variety $X - D_{red}$. Let $\omega$ be a meromorphic 2-form on $X$ with $\mathrm{div}(\omega) = -D$. Then $\omega$ determines the period mapping $\hat\chi : H_2(X - D_{red}; \mathbb{Z}) \to \mathbb{C}$ which sends $M \in H_2(X - D_{red}; \mathbb{Z})$ to $\int_M \omega$. We investigate $H_2(X - D_{red}; \mathbb{Z})$. The exact sequence of the homology for the pair $(X, X - D_{red})$ is the following:
$$0 = H_3(X; \mathbb{Z}) \to H_3(X, X - D_{red}; \mathbb{Z}) \xrightarrow{\partial_*} H_2(X - D_{red}; \mathbb{Z}) \xrightarrow{i_*} H_2(X; \mathbb{Z}) \xrightarrow{j_*} H_2(X, X - D_{red}; \mathbb{Z}) \to$$
We identify Hk (X, X − Dred ; Z) with H 4−k (Dred ; Z) by the Poincaré duality. The kernel of j ∗ is the root lattice Q(R ⊥ ) because H2 (X, X − Dred ; Z) = H 2 (Dred ; Z) is spanned by the irreducible components Di of D.
Now, from the Poincaré duality $H^1(D_{red}; \mathbb{Z}) = H_1(D_{red}; \mathbb{Z})$ and the short exact sequence
$$0 \to H^1(D_{red}; \mathbb{Z}) \to H_2(X - D_{red}; \mathbb{Z}) \to Q(R^{\perp}) \to 0,$$
we obtain the mapping
$$\chi : Q(R^{\perp}) \to \mathbb{C} \mod \hat\chi(H_1(D_{red}; \mathbb{Z}))$$
through the period mapping $\hat\chi$. But $\omega$ is not determined uniquely and we need a normalization. Here $\omega$ is specified as follows:
a) elliptic type: We fix a basis $(\gamma_0, \gamma_1)$ of $H_1(D_{red}; \mathbb{Z})$ and set $\hat\chi(\gamma_0) = 1$ and $\tau = \hat\chi(\gamma_1) \in \mathbb{H} = \{v \in \mathbb{C} \mid \mathrm{Im}\, v > 0\}$.
b) multiplicative type: $H_1(D_{red}; \mathbb{Z})$ has a generator $\gamma$ which determines an orientation of $D$, so we set $\hat\chi(\gamma) = 1$.
c) additive type: $H_2(X - D_{red}; \mathbb{Z}) = Q(R^{\perp})$, so we set $\omega$ so that $\chi([D]) = 1$. The fact that $\chi([D]) \ne 0$ follows from Proposition 23.

Next we give some concrete computation of $\chi(\alpha_i)$.

Lemma 21. Let $\alpha_i$ be a simple root such that $(\alpha_i^2) = -2$ and $\alpha_i \notin \Delta^{nod}$. Then there exist irreducible exceptional curves $C_{i,0}$, $C_{i,1}$ and a $D_j$ such that $\alpha_i = [C_{i,0} - C_{i,1}]$ and $D_j \cdot C_{i,0} = D_j \cdot C_{i,1} = 1$. Furthermore we have
$$\chi(\alpha_i) = 2\pi\sqrt{-1} \int_{D_j \cap C_{i,0}}^{D_j \cap C_{i,1}} \mathrm{Res}_{D_j}\,\omega \mod \hat\chi(H_1(D_{red}; \mathbb{Z})).$$
Proof. The first part of this lemma is obvious from the concrete expressions of $\alpha_i$ given in Appendix A. Let $T$ be a closed tubular neighborhood of $D_j$ in $X$ such that $T \cap C_{i,0}$ and $T \cap C_{i,1}$ are fibers. Let $l$ be an injective path in $D_j$ from $C_{i,0} \cap D_j$ to $C_{i,1} \cap D_j$. We take $M = (C_{i,0} \setminus (C_{i,0} \cap T)) \cup \partial T|_l \cup (C_{i,1} \setminus (C_{i,1} \cap T))$. Then we can choose an orientation such that $M$ is homologous to $C_{i,0} - C_{i,1}$. We have $\int_{C_{i,0} \setminus (C_{i,0} \cap T)} \omega = \int_{C_{i,1} \setminus (C_{i,1} \cap T)} \omega = 0$. Therefore $\int_M \omega = \int_{\partial T|_l} \omega$. By the residue formula we have
$$\int_{\partial T|_l} \omega = 2\pi\sqrt{-1} \int_l \mathrm{Res}_{D_j}\,\omega. \qquad \square$$

Because $\mathrm{div}(\omega) = -D$,
$$\mathrm{div}(\mathrm{Res}_{D_j}\,\omega) = -\sum_{D_k \cap D_j \ne \emptyset} m_k (D_k \cap D_j),$$
where $m_k$ is the multiplicity of $D_k$ in $D$ (the Kac label in the language of root systems). This divisor on $D_j$ determines $\mathrm{Res}_{D_j}(\omega)$ on $D_j$ up to a constant. Let us calculate some examples.

Ell 1. We take $D = E_{(g_2, g_3)} = \{y^2z = 4x^3 - g_2xz^2 - g_3z^3\}$. Then we have
$$\mathrm{Res}_D\,\omega = \frac{1}{2\pi\sqrt{-1}} \frac{dx}{y},$$
the holomorphic 1-form on the elliptic curve. The isomorphism $\mathbb{C}/(\mathbb{Z} + \tau\mathbb{Z}) \simeq D$ pulls back $\frac{1}{2\pi\sqrt{-1}} \frac{dx}{y}$ to $\frac{1}{2\pi\sqrt{-1}}\,dw$, where $w$ is the coordinate on $\mathbb{C}$ which is invariant under the additive action and $w = 0$ corresponds to an inflection point $(0 : 1 : 0) \in E_{(g_2, g_3)}$. For an $\alpha_k$ of the form $E_i - E_j$, $\chi(\alpha_k)$ is easy to calculate. If $p_i$ and $p_j$ are written as $(\wp(\theta_i) : \wp'(\theta_i) : 1)$, $(\wp(\theta_j) : \wp'(\theta_j) : 1)$ then $\chi(\alpha_k) = \theta_j - \theta_i \mod \mathbb{Z} + \tau\mathbb{Z}$. For $\alpha_8$ we regard it as $(E_0 - E_1 - E_2) - E_3$; then the first part is represented by the line passing through $p_1$ and $p_2$, and this line intersects $D$ at $(\wp(-\theta_1 - \theta_2) : \wp'(-\theta_1 - \theta_2) : 1)$. Hence we have $\chi(\alpha_8) = \theta_1 + \theta_2 + \theta_3 \mod \mathbb{Z} + \tau\mathbb{Z}$. This is the same when we view $\alpha_8$ as $(E_0 - E_2 - E_3) - E_1$ or $(E_0 - E_3 - E_1) - E_2$.

Mult 3. Since $D_j$ intersects $D_{j+1}$ and $D_{j-1}$ with multiplicity 1, respectively ($j \in \mathbb{Z}/3\mathbb{Z}$), we have
$$\mathrm{Res}_{D_j}\,\omega = \frac{1}{(2\pi\sqrt{-1})^2} \frac{dw}{w},$$
where $w$ is a coordinate of $D_j \simeq \mathbb{P}^1$ such that $D_j$ intersects $D_{j-1}$ (resp. $D_{j+1}$) at $0$ (resp. $\infty$) ($j \in \mathbb{Z}/3\mathbb{Z}$). For an $\alpha_k$ of the form $E_i - E_j$, $\chi(\alpha_k)$ is easy to calculate. If $p_i$ and $p_j$ are written as $(a_i : 1)$, etc. on $D_l \simeq \mathbb{P}^1$ then
$$\chi(\alpha_k) = \frac{1}{2\pi\sqrt{-1}} \int_{p_i}^{p_j} \frac{dw}{w} = \frac{1}{2\pi\sqrt{-1}} \log\frac{a_j}{a_i} \mod \mathbb{Z}.$$
For $\alpha_3$ we regard it as $(E_0 - E_1 - E_4) - E_6$; then the first part is represented by the line passing through $p_1$ and $p_4$, and this line intersects $D_1$ at $p_{1,4} : (1 : -1/a_1a_4 : 0)$, hence
$$\chi(\alpha_3) = \frac{1}{2\pi\sqrt{-1}} \int_{p_{1,4}}^{p_6} \frac{dw}{w} = \frac{1}{2\pi\sqrt{-1}} \log(-a_1a_4a_6) \mod \mathbb{Z}.$$
A similar calculation works in the case of the other multiplicative types.

Add 3. $\mathrm{Res}_{D_l}\,\omega$ has a double pole at $(0 : 1 : 0)$ and is holomorphic elsewhere. So $\mathrm{Res}_{D_l}\,\omega = \mathrm{const.}\,dw$, where $w$ is the coordinate on $\mathbb{P}^1$ for which $w = \infty$ coincides with $(0 : 1 : 0)$. We first calculate $\chi(\alpha_k)$ for an $\alpha_k$ of the form $E_i - E_j$. If $p_i$ and $p_j$ are written as $(a_i : 1)$, etc. on $D_l \simeq \mathbb{P}^1$ then
$$\chi(\alpha_k) = \frac{1}{\lambda} \int_{p_i}^{p_j} dw = \frac{a_j - a_i}{\lambda},$$
where $\lambda = \chi([D])$. For $\alpha_3$ we regard it as $(E_0 - E_1 - E_4) - E_6$; then the first part is represented by the line passing through $p_1$ and $p_4$, and this line intersects $D_1$ at $p_{1,4} : (1 : -a_1 - a_4 : 0)$, hence
$$\chi(\alpha_3) = \frac{1}{\lambda} \int_{p_{1,4}}^{p_6} dw = \frac{a_1 + a_4 + a_6}{\lambda}.$$
A similar calculation works in the case of the other additive types.

Proposition 22. Let $\alpha \in \Delta(R^{\perp})$ with $(\alpha^2) = -2$. Denote by $W^{nod}$ the subgroup of $W(E_8^{(1)})$ which is generated by the reflections with respect to the elements of $\Delta^{nod}$. Then $\alpha \in W^{nod} \cdot \Delta^{nod}$ if and only if $\chi(\alpha) = 0$.
Proof. If [N ] ∈ 3nod , then N ⊂ X − Dred and the restriction of ω to N is holomorphic. Hence χ ([N ]) = 0. Furthermore χ ([N ]) = 0 for [N ] ∈ W nod · 3nod because w([N ]) = [N ] mod Z3nod for w ∈ W nod . Conversely, suppose α ∈ 3(R ⊥ ) is such that (α 2 ) = −2 and χ (α) = 0. Then α = w(αi ) for some αi ∈ .(R ⊥ ) satisfying (αi2 ) = −2. We can write w = w ◦ w
with w ∈ W nod and w
such that w (3nod ) ⊂ 3+ . From Proposition 18, w
(αi ) can be expressed as [C0 −C1 ], where C0 and C1 are exceptional curves which have no common component with D and meet D in the same component Dj . In particular C0 ∩ C1 = ∅ or C0 ⊃ C1 . Then χ ([C0 −C1 ]) = χ (w ([C0 −C1 ])) = 0. It means that C0 ∩Dj = C1 ∩Dj because of the concrete expression of ResDj ω and Lemma 21. Therefore [C0 − C1 ] is effective and α = w ([C0 − C1 ]) ∈ W nod · 3nod . " # Remark. This proposition is due to Looijenga [15]. This has a significance in Painlevé analysis because it gives a characterization of parameters which have the classical solutions of Riccati type. The Riccati differential equation is derived from a flow in P1 and this, as a special solution of the Painlevé equation, comes in sight as a flow in a nodal root. It happens from the invariance of nodal roots under the time evolution of the Painlevé equation. The fact that the Painlevé differential equation is obtained as a limit of an element of Cr(X) explains this invariance. " # Now we characterize a Halphen surface of index one by using the period mapping, but a Halphen surface of index one allows | − KX | to have many elements D’s. The definition of χ depends on the choice of ω. In this case we denote this mapping by χω . Proposition 23. Let X be a generalized Halphen surface. Then χω ([D]) = χω (−KX ) = 0 if and only if X is a Halphen surface of index one. Proof. We know that X is a Halphen surface of index one if and only if dim | − KX | = 1. An element of | − KX | is a proper transform of a cubic curve passing through the nine points pi ’s. If we show that dim | − KX | = 1 implies χω ([D]) = 0, the converse can be concluded because the arbitrary eight points determine the ninth point uniquely such that there exists a pencil of cubic curves passing through these nine points. We assume dim | − KX | = 1. The value of χω ([D]) depends on the choice of D ∈ | − KX |, but we can vary D and ω continuously. 
So we show it in the generic case, namely the case that D is a proper transform of an irreducible smooth cubic curve. It means that θ1 + · · · + θ9 = 0 mod Z + Zτ from the calculus in the case of elliptic type, where pi = (℘ (θi ) : ℘ (θi ) : 1) with a certain coordinate of P2 , and this reduces to the next well-known classical result. " # Lemma 24. Let D be a plane cubic curve, and let C be a plane curve of degree m intersecting D at 3m points pi (i = 1, . . . , 3m). Then p1 + · · · + p3m = 0 with respect to the group operation on the elliptic curve D. Proof. Recall the group operation of an elliptic curve. We set 0 to be an inflection point. Given a point P , −P is defined to be the other point where D intersects a line passing through P and 0. The operation P + Q = R means that P , Q and −R are intersection points of D and a line. We show the lemma by induction. When m = 1, p1 + p2 + p3 = 0 means that these three points lie on a line, so it is trivial. We assume this lemma for a curve of degree m − 1 and show it in the case of a curve C of degree m.
Table 5. Extra-parameter

The type of surface   ζ           B
$D_4^{(1)}$           s           C \ {0, 1}
$D_5^{(1)}$           s/λ         C*
$D_6^{(1)}$           s/λ²        C*
$D_7^{(1)}$           s/λ³        C*
$D_8^{(1)}$           s/λ⁴        C*
$E_6^{(1)}$           s²/λ        C
$E_7^{(1)}$           s³/λ²       C
$E_8^{(1)}$           s²/λ        C
Let Q be −(p1 +p2 ), let R be −(p3 +p4 ) and let S be −(Q+R). Then p1 , . . . , p3m , Q, R and S lie on the curve of degree m + 1 which is the product of C and a line passing through Q, R and S. We know the fact that if three points, which are among 3k intersecting points of a cubic curve and a curve of degree k, lie on a line, then there exists a curve of degree k − 1 such that it intersects this cubic curve at the remaining 3(k − 1) points. Now p1 , p2 , Q and p3 , p4 , R lie on a line respectively, so p5 , . . . , p3m , S lie on a curve of degree k − 1. From the assumption, we have p5 + · · · + p3m + S = 0. From p1 + p2 + Q = 0, p3 + p4 + R = 0 and Q + R + S = 0, we arrive at the conclusion. " # Remark. This proposition tells the fact that the spaces of initial conditions of the Painlevé equations have an elliptic surface as a limit. It corresponds to an elliptic function as a limit of the Painlevé transcendents. See Remark at the last part of Sect. 7, 1. (1)
3. In the case R = E7 , notice that the value of the period mapping χ (α1 ) = a1 /λ ∈ C does not determine the isomorphism class of the surfaces. It means that the compact (1) variety X(E7 ) has one more dimension of parameters than the non-compact variety (1) X(E7 ) − Dred , and we need an extra-parameter ζ = s 3 /λ ∈ C. We see that the (1) parameters χ (α1 ) = a1 /λ and s 3 /λ determine the isomorphism class of the E7 -surfaces from Appendix B. The following is the list of surfaces which have such an extra-parameter ζ ∈ B. This ζ ∈ B is defined through the construction in Appendix B, for the surface X in the list and the blowing-down structure. The symbol s is as in Appendix B. Then, adding this extra-parameter, the parameterization obtained from period mapping is complete in some sense. Here is a theorem of Torelli type. be R-surfaces. Denote the irreducible components of the antiTheorem 25. Let X and X canonical divisor, the set of nodal roots, the blowing-down structures and the values of by the usual symbols and the period mapping with respect to the root lattice of X and X the tilded ones respectively. (1) (1) is an isometry such that For R = Dl , El , if φ: Pic(X) → Pic(X) i , a) φ(Di ) = D nod , b) φ(3nod ) = 3 ∗ c) φ (χ˜ ) = χ , (1) d) τ = τ˜ ∈ H when R = A0 ,
τ is not a fixed point of an element of PSL(2, Z),
ρ) then φ is induced by an isomorphism S : (X, ˜ → (X, ρ), and this S is unique when (1) (1) R = A7 , A8 . (1) (1) For R = Dl , El , under the same conditions, (φ(Ei ))9i=0 defines a blowing-down structure ρ, ˜ where (Ei )9i=0 is a standard basis of Pic(X), which defines a blowing-down
ρ)) structure ρ. Furthermore, denote the extra-parameter defined for (X, ρ) (resp. (X, ˜ ρ) by ζ (resp. ζ˜ ) for R listed in Table 5. If (X, ρ), (X, ˜ satisfies e) ζ = ζ˜ ∈ B, ρ) then φ is induced by a unique isomorphism S : (X, ˜ → (X, ρ). Proof. First we show that (φ(Ei ))9i=0 defines a blowing-down structure, where (Ei )9i=0 is a standard basis of Pic(X), which defines a blowing-down structure ρ. To apply i in satisfying the correspondence with D Proposition 18, we take a basis (Ei ) of Pic(X) AppendixA which defines a blowing-down structure. The change of basis w : (φ(Ei )) → (1) i is invariant. mi D (Ei ) is induced from an automorphism of Q(E8 ) because −KX = (1) (1) (1) The conclusion follows from Aut(Q(E8 )) = ±W (E8 ) + w and w(3+ (E8 )) ⊃ ∪ Comp(D). 3nod ∪ D∈|− KX | The existence of an isomorphism S follows from the description given in Appendix B. (1) (1) We prove the uniqueness for X(R), where R = A7 , A8 . If we have two different isomorphisms, then we have a non-trivial automorphism of the blowing-down structure of X (Sect. 3, 3), which does not change the values of τ , χ and ζ . It induces an automorphism of P2 . From Appendix B, it is clear that there is no such automorphism in the (1) case R = A0 . (1) When R = A0 , such a transformation induces an automorphism of D: y 2 z = 3 2 4x − g2 xz − g3 z3 . From the viewpoint of D = C/(Z + Zτ ), it is a composition of a translation and a change of a basis of the lattice (a translation permutes nine inflection k lτ points + of D). A change of basis moves τ and a translation changes χ (α8 ), so 3 3 there is no such automorphism. " # (1)
(1)∗
Remark. There is an ambiguity in the parameterization in the case of A0 , A0 and (1) (1)∗ (1) A1 . For R = A0 , A1 , it is caused by the orientation of D, i.e. the choice of a basis (1) of H1 (Dred , Z). For R = A0 , it is caused by the choice of a basis of H1 (Dred , Z) and the choice of the inflection point (0 : 1 : 0). 4. We have parameterized pi for i = 1, . . . , 9. From these data we could construct each R-surface as a surface obtained by blowing up the projective plane. The parameters are the gluing data of affine schemes and we have considered them as complex numbers. If we regard the parameters as transcendental elements, then we obtain a family of (1) X(R). We illustrate it by taking the case of E7 as an example. We regard a1 , a0 and s as elements of C[a1 , a0 , s, 1/(a1 + a0 )] and glue each affine plane which is defined over (1) (1) C[a1 , a0 , s, 1/(a1 +a0 )]. Then we get a family of X(E7 ). We denote it by X = X (E7 ). We have the canonical morphism π : X → Spec C[a1 , a0 , s, 1/(a1 + a0 )] and a fiber X = X(a1 ,a0 ,s) = π −1 ((a1 , a0 , s) = (c, c , c
)) is a generalized Halphen surface, where c, c , c
∈ C such that c + c = 0. It is redundant for the parameterization (1) of X(E8 ), but we take this family for the purpose of a clear description of Cremona transformations. In the next section we investigate an action of Cr(X) on X . It is called the Cremona action.
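6. Cremona Action

1. In order to study the discrete and differential Painlevé equations we need to construct the elements of Cr($X$). The main result of this section is the next theorem. Looijenga calculated $\mathrm{Cr}(X(R), D)$ for $R = A_0^{(1)*}$, $A_1^{(1)}$, $A_2^{(1)}$, $A_3^{(1)}$ and $A_4^{(1)}$ [15]. This theorem is an extension. Here $\mathrm{Cr}(X, D)$ is the group of the Cremona isometries which leave each class $D_i$ fixed. We use the notation $G_S = \{g \in G \mid g(S) = S\} \subset G$. We have defined $\Delta^{nod}$ in Sect. 3, 3. For $\alpha$ with $(\alpha^2) = -2$, $w_\alpha \in W(R^{\perp})$ acts not only on $Q(R^{\perp})$ but on Pic($X$) by $w_\alpha(F) = F + (\alpha \cdot F)\,\alpha$.

Theorem 26.
a) $\mathrm{Cr}(X(R), D) = W(R^{\perp})_{\Delta^{nod}}$ for $R \ne A_6^{(1)}$, $A_7^{(1)}$ or $D_7^{(1)}$.
b) For $R \ne A_6^{(1)}$, $A_7^{(1)}$, $A_7^{(1)\prime}$, $A_8^{(1)}$, $D_7^{(1)}$ or $D_8^{(1)}$, $\sigma \in \mathrm{Aut}(R)$ defines an action on Pic($X$) and $\mathrm{Cr}(X(R)) = (\mathrm{Aut}(R) \ltimes W(R^{\perp}))_{\Delta^{nod}} \simeq (\widetilde{W}(R^{\perp}))_{\Delta^{nod}}$.
c) $\mathrm{Cr}(X(A_6^{(1)}), D) = (W(A_1^{(1)}) \times \{\sigma^{7n} \mid n \in \mathbb{Z}\})_{\Delta^{nod}}$, $\mathrm{Cr}(X(A_6^{(1)})) = ((W(A_1^{(1)}) \times \{\sigma^{n} \mid n \in \mathbb{Z}\}) \rtimes S_2)_{\Delta^{nod}}$, where $\sigma : (E_0, \ldots, E_9) \to (\bar E_0, \ldots, \bar E_9)$;
$\bar E_0 = 2E_0 - E_1 - E_7 - E_8$, $\bar E_1 = E_4$, $\bar E_2 = E_3$, $\bar E_3 = E_0 - E_1 - E_8$, $\bar E_4 = E_9$,
$\bar E_5 = E_0 - E_1 - E_7$, $\bar E_6 = E_5$, $\bar E_7 = E_0 - E_7 - E_8$, $\bar E_8 = E_2$, $\bar E_9 = E_6$.
d) $\mathrm{Cr}(X(A_7^{(1)}), D) = \{\sigma^{8n} \mid n \in \mathbb{Z}\}$, $\mathrm{Cr}(X(A_7^{(1)})) = \{\sigma^{n} \mid n \in \mathbb{Z}\} \rtimes S_2$, where $\sigma : (E_0, \ldots, E_9) \to (\bar E_0, \ldots, \bar E_9)$;
$\bar E_0 = 2E_0 - E_1 - E_7 - E_8$, $\bar E_1 = E_9$, $\bar E_2 = E_0 - E_1 - E_8$, $\bar E_3 = E_0 - E_7 - E_8$, $\bar E_4 = E_2$,
$\bar E_5 = E_6$, $\bar E_6 = E_0 - E_1 - E_7$, $\bar E_7 = E_3$, $\bar E_8 = E_4$, $\bar E_9 = E_5$.
e) $\mathrm{Cr}(X(D_7^{(1)}), D) = \{(\sigma \circ \sigma')^{4n} \mid n \in \mathbb{Z}\}$, $\mathrm{Cr}(X(D_7^{(1)})) = \langle \sigma, \sigma' \rangle \simeq \widetilde{W}(A_1^{(1)})$, where $\sigma' : E_4 \leftrightarrow E_5$, $E_6 \leftrightarrow E_7$, and $\sigma : (E_0, \ldots, E_9) \to (\bar E_0, \ldots, \bar E_9)$;
$\bar E_0 = 3E_0 - E_1 - E_2 - E_3 - 2E_4 - E_6$, $\bar E_1 = E_0 - E_3 - E_4$, $\bar E_2 = E_0 - E_2 - E_4$, $\bar E_3 = E_0 - E_1 - E_4$,
$\bar E_4 = 2E_0 - E_1 - E_2 - E_3 - E_4 - E_6$, $\bar E_5 = E_8$, $\bar E_6 = E_0 - E_4 - E_6$, $\bar E_7 = E_9$, $\bar E_8 = E_5$, $\bar E_9 = E_7$.
f) $\mathrm{Cr}(X(A_7^{(1)\prime})) = (D_8 \ltimes W(A_1^{(1)}))_{\Delta^{nod}}$, $\mathrm{Cr}(X(A_8^{(1)})) = D_6$, $\mathrm{Cr}(X(D_8^{(1)})) = S_2$, where $D_{2n}$ is the dihedral group of order $2n$ and $S_n$ is the symmetric group of order $n!$.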
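The base changes listed in parts c), d) and e) can be sanity-checked against conditions (Cr1) and (Cr2) by linear algebra alone. The sketch below is illustrative and assumes the reading of the garbled displays in which each $\bar E_i$ is assigned as in the three lists (ten classes per map), together with the intersection form diag(1, −1, ..., −1) and $K_X = -3E_0 + E_1 + \cdots + E_9$. The helper names `c`, `e`, `apply` and the dictionary keys are hypothetical. Effectivity (Cr3) and the group-theoretic statements are not tested.

```python
def dot(u, v):
    """Intersection form diag(1, -1, ..., -1) in the basis (E0, ..., E9)."""
    return u[0] * v[0] - sum(x * y for x, y in zip(u[1:], v[1:]))


def c(coeffs):
    """Class with the given {index: coefficient} entries, zero elsewhere."""
    v = [0] * 10
    for i, k in coeffs.items():
        v[i] = k
    return v


def e(i):
    """Basis class E_i."""
    return c({i: 1})


# Images (E0bar, ..., E9bar) for the three maps named sigma in parts c), d), e).
SIGMA = {
    "A6": [c({0: 2, 1: -1, 7: -1, 8: -1}), e(4), e(3), c({0: 1, 1: -1, 8: -1}), e(9),
           c({0: 1, 1: -1, 7: -1}), e(5), c({0: 1, 7: -1, 8: -1}), e(2), e(6)],
    "A7": [c({0: 2, 1: -1, 7: -1, 8: -1}), e(9), c({0: 1, 1: -1, 8: -1}),
           c({0: 1, 7: -1, 8: -1}), e(2), e(6), c({0: 1, 1: -1, 7: -1}), e(3), e(4), e(5)],
    "D7": [c({0: 3, 1: -1, 2: -1, 3: -1, 4: -2, 6: -1}), c({0: 1, 3: -1, 4: -1}),
           c({0: 1, 2: -1, 4: -1}), c({0: 1, 1: -1, 4: -1}),
           c({0: 2, 1: -1, 2: -1, 3: -1, 4: -1, 6: -1}), e(8),
           c({0: 1, 4: -1, 6: -1}), e(9), e(5), e(7)],
}


def apply(images, F):
    """Linear extension: F = sum F_i E_i  ->  sum F_i * images[i]."""
    return [sum(F[i] * images[i][j] for i in range(10)) for j in range(10)]


def is_isometry(images):
    return all(dot(images[i], images[j]) == dot(e(i), e(j))
               for i in range(10) for j in range(10))


K = [-3] + [1] * 9

for name, images in SIGMA.items():
    print(name, is_isometry(images), apply(images, K) == K)

# The D7 map, as read here, squares to the identity on the basis.
print(all(apply(SIGMA["D7"], apply(SIGMA["D7"], e(i))) == e(i) for i in range(10)))
```

All three maps pass both checks under this reading, which also supports the way the scattered equations were assigned to parts c), d) and e).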
Proof. a) Let $w \in W(R^{\perp})$ be such that $w(\Delta^{nod}) = \Delta^{nod}$ ($R \ne A_6^{(1)}$, $A_7^{(1)}$ or $D_7^{(1)}$). Then $w$ leaves $EX = \{F \in \mathrm{Pic}(X) \mid (F^2) = -1,\ F \cdot K_X = -1\}$ invariant. Furthermore NEFF is also invariant by Proposition 20, because $w$ leaves the blowing-down structures invariant by Proposition 18. Proposition 5 says that $\mathrm{Pic}^+(X)$ is preserved by the action of $w$.

Conversely let $\sigma \in \mathrm{Cr}(X, D)$. Then $\sigma$ induces an automorphism of $Q(R^{\perp})$. Therefore there exists a $w \in \mathrm{Aut}(Q(R^{\perp})) = \pm\mathrm{Aut}(R^{\perp}) \ltimes W(R^{\perp})$ (Proposition 13) induced by $\sigma$. However, every automorphism of the Dynkin diagram of $R^{\perp}$ other than the identity moves some $D_j$. This is because each Dynkin automorphism of $R^{\perp}$ moves a terminal vertex $\alpha_i$ which intersects an exceptional curve $F$ such that $F \cdot \alpha_i = F \cdot D_j = 1$ for some $j$ and $F \cdot \alpha_k = F \cdot D_l = 0$ for $k \ne i$, $l \ne j$. So $w \in W(R^{\perp})$ and of course $w(\Delta^{nod}) = \Delta^{nod}$. Now $w \in \mathrm{Cr}(X, D)$ and $\sigma \circ w^{-1}$ fixes the simple roots $\alpha_i$ and the $D_i$. Let $(E_i)$ and $(\alpha_j, D_k)$ be the standard basis and the simple roots in Appendix A. If we set $(E_i') = (\sigma \circ w^{-1}(E_i))$ then $(E_i)$ and $(E_i')$ have the same intersections with $(\alpha_j, D_k)$. Since $(E_i' - E_i) \cdot \alpha_j = (E_i' - E_i) \cdot D_k = 0$, we have $E_i' = E_i + n_i\delta$. But $|E_0'|^2 = |E_0|^2 + 2n_0(E_0|\delta) = 1$, so $n_0 = 0$ and $E_0' = E_0$. Similarly $E_i' = E_i$ for $i = 1, \ldots, 9$. Hence $\sigma \circ w^{-1}$ acts trivially on Pic($X$).

b) There is a homomorphism $1 \to \mathrm{Cr}(X)/\mathrm{Cr}(X, D) \to \mathrm{Aut}(R)$, so we have only to construct a homomorphism $\mathrm{Aut}(R)_{\Delta^{nod}} \to \mathrm{Cr}(X)$. It can be shown in each case respectively. (We will construct it as the Cremona action in Appendix C.) We will see the $E_7^{(1)}$-surface as an example. We assume that the generator $\sigma$ of $\mathrm{Aut}(E_7^{(1)})$ belongs to $\mathrm{Aut}(E_7^{(1)})_{\Delta^{nod}}$. We want the action of $\sigma : (E_i) \to (\bar E_i)$ on Pic($X$). Then $\bar E_9$ is an irreducible exceptional class which intersects $\sigma(D_6) = D_0$. We take $\bar E_9 = E_5$ and it determines the action of $\sigma$ on Pic($X$). In order to show this we look at $\bar E_i$ for all $i$:
$\bar E_9 = E_5$,
$\bar E_8 = \bar E_9 + \sigma(D_6) = E_5 + D_0 = E_0 - E_1 - E_4$,
$\bar E_7 = E_0 - E_2 - E_4$,
$\bar E_6 = E_0 - E_3 - E_4$,
$\bar E_5 = E_9$,
$\bar E_3 = E_0 - E_4 - E_6$,
$\bar E_2 = E_0 - E_4 - E_7$,
$\bar E_1 = E_0 - E_4 - E_8$,
$\bar E_4 = \alpha_0 + E_9 = 3E_0 - E_1 - E_2 - E_3 - 2E_4 - E_6 - E_7 - E_8$,
$\bar E_0 = \sigma(D_7 + E_1 + E_2 + E_3) = 4E_0 - E_1 - E_2 - E_3 - 3E_4 - E_6 - E_7 - E_8$.
Now $\sigma$ is a Cremona isometry because $\sigma \in W(E_8^{(1)}) = \mathrm{Aut}(Q(E_8^{(1)}))/\{\pm 1\}$ and $\sigma(\mathrm{Comp}(D)) = \mathrm{Comp}(D)$. It can be verified that $\mathrm{Aut}(R) \ltimes W(R^{\perp}) \simeq \widetilde{W}(R^{\perp})$ in each case respectively.
d) According to the discussion above, Cr(X, D) ⊂ W (A1,|α|2 =8 ), but wα0 and wα1 are not elements of Aut(Pic(X)) but elements of Aut(Pic(X) ⊗ Q). An element of (1) W (A1,|α|2 =8 ) is expressed as (wα0 ◦ wα1 )±n or ((wα0 ◦ wα1 )n ◦ wα0 )± . The smallest ele(1)
ment which is in Aut(Pic(X)) is (wα0 ◦wα1 )4 . It also induces an element of Aut(Q(E8 )), so belongs to Cr(X, D). We have σ 8 = (wα0 ◦ wα1 )4 by a simple calculation. Let σ be a reflection in D16 such that σ 2 = 1. Then σ can be constructed as an element of Aut(Pic(X)) (see Appendix C). Here σ is an element of the inverse image (1) of a rotation in Aut(A7 ) = D16 . For any σ
∈ Cr(X), σ
◦ σ k ◦ σ i is an element of Cr(X, D) for some 0 ≤ k ≤ 7 and i = 0 or 1. So the second statement is concluded. Similarly, c) and e) can be proved.
f) For R = A7 , D8 ⊂ Aut(A7 ) = D16 . We can construct σ ∈ D8 as an automorphism of Pic(X) (see Appendix C). We have to show that the elements of D16 \ D8 do not induce an automorphism of Pic(X). This is obvious because D2 , D4 , D6 and D0 intersect an irreducible exceptional divisor respectively and the other Di ’s do not. The (1) (1) # case of R = A8 and D8 can be treated similarly. " 2. Next we calculate a concrete expression of Cremona actions of Cr(X) as a change of coordinates. Let X be the family of R-surfaces which we constructed in the previous section. We describe these actions by using corresponding actions on K(X ). They are automorphisms of K(X ) and birational actions on X . These actions are obtained from the actions on blowing-down structures. So, in order to obtain these, it is enough to blow up the projective plane and blow it down with another blowing-down structure. In particular it is just a replacement of parameterizations of the same surface so the restriction of the action to each fiber is an isomorphism between R-surfaces. We have the list of results of calculations in Appendix C. Let us explain the meaning (1) of the symbols. For instance, for the E6 -surface the symbol σ(10)(26) corresponds to 1234560 (1) the permutation (10)(26) = 0634521 ∈ S7 , where we view Aut(E6 ) T→ S7 . Now (10)(26) ∈ S7 means that the vertices D1 and D0 , D2 and D6 are interchanged. We (1) (1) have an isomorphism Aut(E6 ) → Aut(A2 ) and σ(10)(26) interchanges α2 and α0 . Notice this interchange is denoted by σ(10)(26) and not by σ(20) . The minus means that −σ transforms λ = a1 + a2 + a0 to −λ. (1) Here we explain the calculations in the case of the E7 -surfaces as an example. It has (1) (1) (1) (A ) symmetry. Let σ(15)(24)(60) ∈ Aut(E (1) ) ! Aut(A(1) ) Aut(E7 ) W (A1 ) ! W 1 7 1 (1) and w1 , w0 ∈ W (A1 ) be the generators. We have only to get the descriptions of w1 and σ(15)(24)(60) because w0 = σ(15)(24)(60) ◦ w1 ◦ σ(15)(24)(60) . The reflection w1 is easy. 
It is obtained from permuting the divisor classes E4 and E5. This means permuting the distinct points p4 and p5, and it means permuting the order of blowings-down. We have

\[ w_1 : (a_1, a_0; s; x : y : z) \longmapsto (-a_1,\ a_0 + 2a_1;\ s;\ x : y - a_1 z : z). \]

The Cremona isometry $\sigma_{(15)(24)(60)} \in \mathrm{Cr}(X(E_7^{(1)}))$ is described as follows:

\[
\begin{aligned}
\bar{E}_9 &= E_5, & \bar{E}_5 &= E_9, \\
\bar{E}_8 &= E_0 - E_1 - E_4, & \bar{E}_1 &= E_0 - E_4 - E_8, \\
\bar{E}_7 &= E_0 - E_2 - E_4, & \bar{E}_2 &= E_0 - E_4 - E_7, \\
\bar{E}_6 &= E_0 - E_3 - E_4, & \bar{E}_3 &= E_0 - E_4 - E_6, \\
\bar{E}_4 &= 3E_0 - E_1 - E_2 - E_3 - 2E_4 - E_6 - E_7 - E_8, \\
\bar{E}_0 &= 4E_0 - E_1 - E_2 - E_3 - 3E_4 - E_6 - E_7 - E_8.
\end{aligned}
\]

In order to obtain the action of $\sigma_{(15)(24)(60)}$ on $\mathcal{X}$, we have to blow up $\mathbb{P}^2$ successively and get the surface X with the blowing-down structure $\rho : X \to \mathbb{P}^2$. Then we blow down the obtained surface to $\mathbb{P}^2$ with another blowing-down structure $\bar{\rho} : X \to \mathbb{P}^2$. But we take these operations alternately for saving descriptions and we can omit some operations. For example $\bar{E}_9 = E_5$ means that we have to blow up the point p5 and blow down the exceptional curve, and we can omit these operations because these two operations are canceled out.
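The list of image classes above can be checked mechanically. The following is a minimal sketch, assuming the standard intersection form on Pic(X) (E0·E0 = 1, Ei·Ei = −1 for i ≥ 1, all other products zero) and taking the classes Di, α1, α0 of the E7(1)-surface from Appendix A (Add 10). The helper names (`cls`, `dot`, `sigma`, `PERM`) are illustrative only and do not come from the paper.

```python
# Pic(X) has basis E0, ..., E9 with E0.E0 = 1, Ei.Ei = -1 (i >= 1), Ei.Ej = 0 (i != j).
N = 10

def cls(**coeffs):
    """Divisor class from keyword coefficients, e.g. cls(E0=1, E1=-1) = E0 - E1."""
    v = [0] * N
    for name, c in coeffs.items():
        v[int(name[1:])] = c
    return tuple(v)

def dot(u, v):
    """Intersection number of two classes."""
    return u[0] * v[0] - sum(a * b for a, b in zip(u[1:], v[1:]))

# Images of E_i under sigma_(15)(24)(60), copied from the list in the text.
BAR = [
    cls(E0=4, E1=-1, E2=-1, E3=-1, E4=-3, E6=-1, E7=-1, E8=-1),  # image of E0
    cls(E0=1, E4=-1, E8=-1),                                     # image of E1
    cls(E0=1, E4=-1, E7=-1),                                     # image of E2
    cls(E0=1, E4=-1, E6=-1),                                     # image of E3
    cls(E0=3, E1=-1, E2=-1, E3=-1, E4=-2, E6=-1, E7=-1, E8=-1),  # image of E4
    cls(E9=1),                                                   # image of E5
    cls(E0=1, E3=-1, E4=-1),                                     # image of E6
    cls(E0=1, E2=-1, E4=-1),                                     # image of E7
    cls(E0=1, E1=-1, E4=-1),                                     # image of E8
    cls(E5=1),                                                   # image of E9
]

def sigma(v):
    """Linear extension of E_i -> BAR[i] to all of Pic(X)."""
    return tuple(sum(v[i] * BAR[i][j] for i in range(N)) for j in range(N))

def is_isometry():
    """True if the images have the same Gram matrix as the standard basis."""
    basis = [cls(**{f"E{i}": 1}) for i in range(N)]
    return all(dot(BAR[i], BAR[j]) == dot(basis[i], basis[j])
               for i in range(N) for j in range(N))

# Anticanonical class -K_X = 3E0 - E1 - ... - E9.
ANTI_K = cls(E0=3, **{f"E{i}": -1 for i in range(1, N)})

# Components D_0..D_7 and simple roots of the E7(1)-surface (Appendix A, Add 10).
D = [
    cls(E0=1, E1=-1, E4=-1, E5=-1), cls(E1=1, E2=-1), cls(E2=1, E3=-1),
    cls(E3=1, E6=-1), cls(E6=1, E7=-1), cls(E7=1, E8=-1), cls(E8=1, E9=-1),
    cls(E0=1, E1=-1, E2=-1, E3=-1),
]
ALPHA1 = cls(E4=1, E5=-1)
ALPHA0 = cls(E0=3, E1=-1, E2=-1, E3=-1, E4=-2, E6=-1, E7=-1, E8=-1, E9=-1)

# Expected permutation (15)(24)(60) of the indices of the D_i.
PERM = {0: 6, 1: 5, 2: 4, 3: 3, 4: 2, 5: 1, 6: 0, 7: 7}
```

Under these assumptions, `is_isometry()` confirms that the map preserves the intersection form, `sigma(ANTI_K) == ANTI_K` that it fixes the canonical class, and `sigma(D[i]) == D[PERM[i]]` that it permutes the components of D as the subscript (15)(24)(60) indicates, while exchanging α1 and α0.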
First, we construct the surface obtained by blowing up $\mathbb{P}^2$ with the centers at p1 and p4. We have $\bar{E}_9 = E_5$, so we can regard that $\bar{E}_9$ has been already blown down. Next $\bar{E}_8 = E_0 - E_1 - E_4$ is to be blown down. This is the proper transform of the line which passes through p1 and p4, namely the line (x = 0). Blowing it down, we get $\mathbb{P}^1 \times \mathbb{P}^1$. We abbreviate $\mathbb{P}^2 = \mathrm{Spec}\,\mathbb{C}[\tfrac{x}{z}, \tfrac{y}{z}] \cup \mathrm{Spec}\,\mathbb{C}[\tfrac{y}{x}, \tfrac{z}{x}] \cup \mathrm{Spec}\,\mathbb{C}[\tfrac{z}{y}, \tfrac{x}{y}]$ as $\mathbb{P}^2 = (\tfrac{x}{z}, \tfrac{y}{z}) \cup (\tfrac{y}{x}, \tfrac{z}{x}) \cup (\tfrac{z}{y}, \tfrac{x}{y})$. This process is described as follows:

[Diagram: coordinate charts for the blow-ups $\rho_4$, $\rho_1$ of $\mathbb{P}^2 = (\tfrac{x}{z}, \tfrac{y}{z}) \cup (\tfrac{y}{x}, \tfrac{z}{x}) \cup (\tfrac{z}{y}, \tfrac{x}{y})$, followed by the blow-down to $\mathbb{P}^1 \times \mathbb{P}^1 = (\tfrac{y}{x}, \tfrac{z}{x}) \cup (\tfrac{y}{x}, \tfrac{x}{z}) \cup (\tfrac{x}{y}, \tfrac{z}{x}) \cup (\tfrac{x}{y}, \tfrac{x}{z})$.]

We regard each coordinate as an element of the rational function field $K(X) \cong K(\mathbb{P}^2)$. Proceeding with these operations, we obtain the next description:

[Diagram: the full sequence of coordinate charts, starting from $\mathbb{P}^2$ and passing through $\mathbb{P}^1 \times \mathbb{P}^1$, $F_1$, $F_2$, $F_1$ and $\mathbb{P}^1 \times \mathbb{P}^1$ by the blow-ups $\rho_4$, $\rho_1$, $\rho_2$, $\rho_3$, $\rho_6$, $\rho_7$, $\rho_8$ and the intermediate blow-downs, and ending at]

\[ \mathbb{P}^2 = \bigl( x(y^2 z - x^3 + s x^2 z) : y(y^2 z - x^3 + s x^2 z) : x^3 z \bigr). \]
Thus we have

\[ \sigma_{(15)(24)(60)} : (a_1, a_0; s; x : y : z) \longmapsto \bigl( a_0, a_1;\ s;\ x(y^2 z + s z x^2 - x^3) : -y(y^2 z + s z x^2 - x^3) : z x^3 \bigr). \]

Taking f = y/x and g = x/z as generators of $K(\mathcal{X})$ with a1, a0, s, we have a simpler description of $w_1$ and $\sigma_{(15)(24)(60)}$:

\[ w_1 : (a_1, a_0; s; f, g) \longmapsto \Bigl( -a_1,\ a_0 + 2a_1;\ s;\ f - \frac{a_1}{g},\ g \Bigr), \]
\[ \sigma_{(15)(24)(60)} : (a_1, a_0; s; f, g) \longmapsto (a_0, a_1;\ s;\ -f,\ f^2 + s - g). \]

They are isomorphisms of fibers: $X_{(a_1, a_0, s)} \to X_{(\bar{a}_1, \bar{a}_0, \bar{s})}$, where $\bar{\xi}$ is the image of $\xi$ by $\sigma_{(15)(24)(60)}$ or $w_1$. This is because $X_{(a_1, a_0, s)}$ and $X_{(\bar{a}_1, \bar{a}_0, \bar{s})}$ are the same surfaces and we have changed only blowing-down structures. When a1 = a0, $\sigma_{(15)(24)(60)}$ is an automorphism different from the identity.

7. Painlevé Equations

1. In the case of an R-surface in the list of Table 4, X(R) is isomorphic to the space of initial conditions of $P_J$ for J = I, II, III, IV, V, VI [19]. Furthermore the action of Cr(X) coincides with the Bäcklund transformations of $P_J$ [20]. Each Painlevé equation is expressed by elements of $K(\mathcal{X})$.

$P_{\mathrm{VI}}$:

\[ \frac{df}{dt} = \frac{1}{s(s-1)} \bigl[ 2f(f-1)(f-s)g + (a_1 + 2a_2) f(f-1) + a_3 (s-1) f + a_4 s (f-1) \bigr], \tag{10} \]
\[ \frac{dg}{dt} = -\frac{1}{s(s-1)} \bigl[ \{3f^2 - 2(s+1)f + s\} g^2 + \{(a_1 + 2a_2)(2f-1) + a_3 (s-1) + a_4 s\} g + a_2 (a_1 + a_2) \bigr], \tag{11} \]
\[ \frac{ds}{dt} = \lambda = a_1 + a_2 + a_3 + a_4 + a_0, \tag{12} \]
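Table 6. X(R), Cr(X) and Painlevé equations

Type of surface    Painlevé equation               Cremona action
$D_4^{(1)}$        $P_{\mathrm{VI}}$               $\widetilde{W}(D_4^{(1)})$
$D_5^{(1)}$        $P_{\mathrm{V}}$                $\widetilde{W}(A_3^{(1)})$
$D_6^{(1)}$        $P_{\mathrm{III}}^{D_6^{(1)}}$  $\mathrm{Aut}(D_6^{(1)}) \ltimes W((2A_1)^{(1)})$
$D_7^{(1)}$        $P_{\mathrm{III}}^{D_7^{(1)}}$  $\widetilde{W}(A_1^{(1)})$
$D_8^{(1)}$        $P_{\mathrm{III}}^{D_8^{(1)}}$  $S_2$
$E_6^{(1)}$        $P_{\mathrm{IV}}$               $\widetilde{W}(A_2^{(1)})$
$E_7^{(1)}$        $P_{\mathrm{II}}$               $\widetilde{W}(A_1^{(1)})$
$E_8^{(1)}$        $P_{\mathrm{I}}$                −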
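where f = z/(z − x) and g = y(z − x)/xz.

$P_{\mathrm{V}}$:

\[ \frac{df}{dt} = \frac{1}{s} \bigl[ (2g + s) f(f-1) - (a_1 + a_3) f + a_1 \bigr], \tag{13} \]
\[ \frac{dg}{dt} = -\frac{1}{s} \bigl[ (2f - 1) g(g + s) - (a_1 + a_3) g + a_2 s \bigr], \tag{14} \]
\[ \frac{ds}{dt} = \lambda = a_1 + a_2 + a_3 + a_0, \tag{15} \]

where f = x/(x − z) and g = y(x − z)/xz.

$P_{\mathrm{III}}^{D_6^{(1)}}$:

\[ \frac{df}{dt} = \frac{4}{s} \Bigl[ 2f^2 g - \Bigl( f^2 + (a_1 + b_1) f - \frac{s}{4} \Bigr) \Bigr], \tag{16} \]
\[ \frac{dg}{dt} = -\frac{4}{s} \bigl[ 2f g^2 - \bigl( 2f + (a_1 + b_1) g + a_1 \bigr) \bigr], \tag{17} \]
\[ \frac{ds}{dt} = 4\lambda = 4(a_1 + a_0) = 4(b_1 + b_0), \tag{18} \]

where f = y(z − x)/xz and g = x/(z − x).

$P_{\mathrm{III}}^{D_7^{(1)}}$:

\[ \frac{df}{dt} = -\frac{1}{2s} \bigl[ 2f^2 g + a_1 f - 2s \bigr], \tag{19} \]
\[ \frac{dg}{dt} = \frac{1}{2s} \Bigl[ 2f g^2 + a_1 g + \frac{1}{2} \Bigr], \tag{20} \]
\[ \frac{ds}{dt} = \frac{\lambda}{2} = \frac{a_1 + a_0}{2}, \tag{21} \]

where f = −2yz/x² and g = x/2z.

$P_{\mathrm{III}}^{D_8^{(1)}}$:

\[ \frac{df}{dt} = -\frac{1}{4s} \bigl[ 2f^2 g + f \bigr], \tag{22} \]
\[ \frac{dg}{dt} = \frac{1}{4s} \Bigl[ 2f g^2 + g + \frac{2s}{f^2} - \frac{1}{2} \Bigr], \tag{23} \]
\[ \frac{ds}{dt} = \frac{\lambda}{4}, \tag{24} \]

where f = 2z/x and g = (2y − x)/4z.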
Remark. Usually $P_{\mathrm{III}}$ is written in the form

\[ \frac{d^2 f}{dt^2} = \frac{1}{f} \Bigl( \frac{df}{dt} \Bigr)^2 - \frac{1}{t} \frac{df}{dt} + \frac{1}{t} (\alpha f^2 + \beta) + \gamma f^3 + \frac{\delta}{f}. \tag{25} \]

It is equivalent to

\[ \frac{d^2 f}{dt^2} = \frac{1}{f} \Bigl( \frac{df}{dt} \Bigr)^2 - \frac{1}{t} \frac{df}{dt} + \frac{f^2}{4t^2} (\gamma f + \alpha) + \frac{\beta}{4t} + \frac{\delta}{4f}, \tag{26} \]

by replacing in the latter equation, t by t² and f by tf. If γδ ≠ 0, then we can normalize γ = −δ = 4 without loss of generality. In this case we get $P_{\mathrm{III}}^{D_6^{(1)}}$ (16)–(18). The parameters (δ, β) and (−α, −γ) are symmetric; if one of γ and δ equals zero (not both), then we have $P_{\mathrm{III}}^{D_7^{(1)}}$ (19)–(21). When γ = δ = 0, we have $P_{\mathrm{III}}^{D_8^{(1)}}$ (22)–(24). Moreover $P_{\mathrm{V}}$

\[ \frac{d^2 f}{dt^2} = \Bigl( \frac{1}{2f} + \frac{1}{f-1} \Bigr) \Bigl( \frac{df}{dt} \Bigr)^2 - \frac{1}{t} \frac{df}{dt} + \frac{(f-1)^2}{t^2} \Bigl( \alpha f + \frac{\beta}{f} \Bigr) + \frac{\gamma}{t} f + \delta \frac{f(f+1)}{f-1} \tag{27} \]

is also reduced to $P_{\mathrm{III}}^{D_6^{(1)}}$ (16)–(18) when δ = 0. If δ ≠ 0, then we can normalize it as (13)–(15). In detail, see [20].

$P_{\mathrm{IV}}$:
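\[ \frac{df}{dt} = f(2g - f - s) - a_1, \tag{28} \]
\[ \frac{dg}{dt} = g(2f - g + s) + a_2, \tag{29} \]
\[ \frac{ds}{dt} = 2\lambda = 2(a_1 + a_2 + a_0), \tag{30} \]

where f = x/z and g = y/x.

$P_{\mathrm{II}}$:

\[ \frac{df}{dt} = f^2 - s - 2g, \tag{31} \]
\[ \frac{dg}{dt} = a_1 - 2fg, \tag{32} \]
\[ \frac{ds}{dt} = \lambda = a_1 + a_0, \tag{33} \]

where f = y/x and g = x/z.

$P_{\mathrm{I}}$:

\[ \frac{df}{dt} = g, \tag{34} \]
\[ \frac{dg}{dt} = 6f^2 + 2s, \tag{35} \]
\[ \frac{ds}{dt} = \lambda, \tag{36} \]

where f = x/z and g = 2y/z.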
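As a consistency check on the first-order systems, one can eliminate g by differentiating the first equation of a system and substituting the other two. The short derivation below is only a sketch for $P_{\mathrm{II}}$ and $P_{\mathrm{I}}$; it uses nothing beyond (31)–(36), and the constant of integration mentioned afterwards is not part of the original text.

```latex
\begin{aligned}
P_{\mathrm{II}}:\quad \frac{d^2 f}{dt^2}
  &= 2f\,\frac{df}{dt} - \frac{ds}{dt} - 2\,\frac{dg}{dt}
   = 2f\,(f^2 - s - 2g) - \lambda - 2\,(a_1 - 2fg)
   = 2f^3 - 2sf - (\lambda + 2a_1), \\[4pt]
P_{\mathrm{I}}:\quad \frac{d^2 f}{dt^2}
  &= \frac{dg}{dt} = 6f^2 + 2s, \\[4pt]
\frac{d}{dt}\bigl( g^2 - 4f^3 - 4sf \bigr)
  &= 2g\,(6f^2 + 2s) - 12 f^2 g - 4\lambda f - 4sg
   = -4\lambda f .
\end{aligned}
```

In both cases s enters only through ds/dt = λ, so for λ ≠ 0 it acts as a rescaled independent variable and the equations take the familiar second-order shape in f. For λ = 0 the last line shows that g² − 4f³ − 4sf is constant along solutions of $P_{\mathrm{I}}$, so that (df/dt)² = 4f³ + 4sf + C for some constant C, which is an equation of Weierstrass type.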
Remark. If λ = 0, then the surface X is a Halphen surface of index one. Each Painlevé equation is reduced to the equation which is satisfied by an elliptic function in this limit. If λ ≠ 0, then we can normalize each Painlevé equation as λ = 1 without loss of generality.

2. The space of initial conditions X(R) is constructed and the Bäcklund transformation occurs from the Cremona isometries. We have achieved this from the theory of rational surfaces without the differential equation. Then, can we obtain the differential equation itself from surface theory?

In the case of the $E_7^{(1)}$-surface, the $\widetilde{W}(A_1^{(1)})$-symmetry is an action with respect to a1 and a0, but we have another parameter s. This s is not a parameter arising from the period mapping with respect to H2(X − Dred, Z). In the theory of Painlevé differential equations the continuous action from differential equation is a deformation of X with respect to s, but this is a trivial deformation of X − Dred because the Painlevé property ensures that $X_{(a_1, a_0, s_0)} - D_{\mathrm{red}} \to X_{(a_1, a_0, s_0 + s)} - D_{\mathrm{red}}$ is an isomorphism as complex analytic spaces. Anyway we want to construct this deformation. In the following, the second Painlevé differential equation is obtained not directly but as a limit of discrete action on further generic rational surfaces.

The $E_6^{(1)}$-surface and the $D_6^{(1)}$-surface have the $E_7^{(1)}$-surface as its degeneration. We choose the $E_6^{(1)}$-surface among them. This surface is constructed in a similar way to the case $R = E_7^{(1)}$ above:

[Figure: configuration of the points p1, ..., p9 relative to the lines z = 0 and x = 0, with the chains of infinitely near points p1 ← p2 ← p7 ← p8 ← p9 and p3 ← p6.]

\[
\begin{aligned}
& p_1 : (0 : 1 : 0) \leftarrow p_2 : \Bigl( \frac{x}{y}, \frac{z}{x} \Bigr) = (0, 0) \leftarrow p_7 : \Bigl( \frac{x}{y}, \frac{yz}{x^2} \Bigr) = (0, 1) \\
& \qquad \leftarrow p_8 : \Bigl( \frac{x}{y}, \frac{y(yz - x^2)}{x^3} \Bigr) = (0, s) \leftarrow p_9 : \Bigl( \frac{x}{y}, \frac{y(y(yz - x^2) - s x^3)}{x^4} \Bigr) = (0, -a_0 + s^2), \\
& p_4 : (0 : 0 : 1), \qquad p_5 : (0 : a_1 : 1), \qquad p_3 : (1 : 0 : 0) \leftarrow p_6 : \Bigl( \frac{z}{x}, \frac{y}{z} \Bigr) = (0, -a_2).
\end{aligned}
\]
Now $Q(E_6^{(1)})^{\perp} = Q(A_2^{(1)})$ and we can write the action of $\mathrm{Aut}(E_6^{(1)}) \ltimes W(A_2^{(1)}) \cong \widetilde{W}(A_2^{(1)})$ as

\[
\begin{aligned}
-\sigma_{(15)(24)} &: (a_1, a_2, a_0; s; x : y : z) \longmapsto (-a_2, -a_1, -a_0;\ s;\ yz : -xy : -zx), \\
-\sigma_{(10)(26)} &: (a_i; s; x : y : z) \longmapsto \bigl( -a_1, -a_0, -a_2;\ s;\ xz : -(yz - x^2 - sxz) : z^2 \bigr), \\
w_1 &: (a_i; s; x : y : z) \longmapsto (-a_1,\ a_2 + a_1,\ a_0 + a_1;\ s;\ x : y - a_1 z : z), \\
w_2 &= \sigma_{(15)(24)} \circ w_1 \circ \sigma_{(15)(24)}, \qquad w_0 = \sigma_{(10)(26)} \circ w_2 \circ \sigma_{(10)(26)}.
\end{aligned}
\]

Let us take $f = -y/x$ and $g = s - \dfrac{x}{z} + \dfrac{y}{x}$ as generators of $K(\mathcal{X})$ with a1, a2, a0 and s. Then we have

\[
\begin{aligned}
w_1 &: (a_1, a_2, a_0; s; f, g) \longmapsto \Bigl( -a_1,\ a_2 + a_1,\ a_0 + a_1;\ s;\ f + \frac{a_1}{s - f - g},\ g - \frac{a_1}{s - f - g} \Bigr), \\
w_2 &: (a_1, a_2, a_0; s; f, g) \longmapsto \Bigl( a_1 + a_2,\ -a_2,\ a_0 + a_2;\ s;\ f,\ g + \frac{a_2}{f} \Bigr), \\
w_0 &: (a_1, a_2, a_0; s; f, g) \longmapsto \Bigl( a_1 + a_0,\ a_2 + a_0,\ -a_0;\ s;\ f - \frac{a_0}{g},\ g \Bigr), \\
-\sigma_{(15)(24)} &: (a_1, a_2, a_0; s; f, g) \longmapsto (-a_2, -a_1, -a_0;\ s;\ s - f - g,\ g), \\
-\sigma_{(10)(26)} &: (a_1, a_2, a_0; s; f, g) \longmapsto (-a_1, -a_0, -a_2;\ s;\ g,\ f).
\end{aligned}
\]
(A ); We have a translation T in W 2 T = w2 ◦ w1 ◦ σ(15)(24) ◦ σ(10)(26) : (ai ; s; f, g) → (a i ; s; f , g) a0 a2 − λ . = a1 , a2 − λ, a0 + λ; s; s − f − g + , s − g − f − g f This is well known as one of the difference second Painlevé equations d-PII and obtained as a contiguous relation of the fourth Painlevé equation. (1) (1) In order to degenerate the E6 -surface to the E7 -surface we have to make p3 (1) infinitely near p1 . We take a new coordinate of the E6 -surface as (x˜ : y˜ : z˜ ) = (−x : 2 (x + y)/5 : 5 z). In this coordinate a line y = x˜ + 5 y˜ = 0, which passes through p5 and p3 , gets joined with x = −x˜ = 0. We have x˜ z˜ p1 : (0 : 1 : 0) ← p2 : , = (0, 0) y˜ x˜ x˜ y˜ z˜ ←p3 : = (−5, 0) , y˜ x˜ 2 1 x˜ y˜ y˜ z˜ ← p6 : = 0, − , y˜ x˜ x˜ 2 a2 1 1 x˜ 1 y˜ y˜ y˜ z˜ ←p7 : = 5, + + 1 , y˜ x˜ x˜ x˜ 2 a2 5 a2 1 1 y˜ y˜ y˜ z˜ x˜ y˜ 1 ←p8 : − , + +1 y˜ x˜ x˜ x˜ x˜ 2 a2 5 a2 1 1 = 0, − 2 3 + −s 5 a2
y˜ y˜ y˜ y˜ z˜ 1 ←p9 : + x˜ x˜ x˜ x˜ 2 a2 1 1 1 1 − +1 + 2 3+ −s 5 a2 5 a2 1 1 = 0, 3 6 − 4s + s 2 + , − a0 5 a2
x˜ y˜ , y˜ x˜
p4 : (0 : 0 : 1),
p5 : (0 : a1 : 5 3 ),
−1 −1 x˜ x˜ y˜ x˜ x˜ = . = + 5 and = +5 where y˜ y˜ x˜ y˜ y˜ When we take s = 2, a0 = 1 + s˜ 5 2 + β0 5 3 , a2 = −1 − s˜ 5 2 + β2 5 3 , a1 = β1 5 3 (a˜ 0 = β0 + β2 , a˜ 1 = β1 ), (1)
we have the E7 -surface as a limit 5 → 0. 1 x˜ x˜ β 0 − β2 y˜ − 5 (→ y/ ˜ x) ˜ and g˜ = −5 (→ x/˜ ˜ z) then the If we take f˜ = x˜ 2 z˜ z˜ 4 difference equation d-PII goes to the differential equation PII as a continuous limit: f˜ = f˜2 − 2g˜ − s˜ , ˜ g˜ = β1 − 2f˜g, s˜ = β1 + β2 + β0 . At the end we have got the second Painlevé differential equation only from surface theory. Remark. This calculation is a formal one. We assume f˜, g˜ and s˜ to be functions of a d f˜ 5 2 d 2 f˜ formal parameter t. We take f˜ = T (f˜) = f˜ + 5 + · · · , and so on. + dt 2 dt 2 (1) (A(1) ) explains the reason why the The commutativity between T and W (A1 ) in W 2 Bäcklund transformations occur;
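\[ T \circ w_1 = w_1 \circ T, \qquad T \circ w_0 \circ w_2 \circ w_0 = w_0 \circ w_2 \circ w_0 \circ T, \]

where $W(A_2^{(1)}) \supset W(A_1^{(1)}) = \langle w_1,\ w_0 \circ w_2 \circ w_0 \rangle$ and $\langle w_1,\ w_0 \circ w_2 \circ w_0 \rangle$ goes to the action of $\mathrm{Cr}(X(E_7^{(1)}), D)$ on $\mathcal{X}(E_7^{(1)})$ as a limit. The fact that $(\alpha_0 - \alpha_2) \perp (\mathbb{Z}\alpha_1 + \mathbb{Z}(\alpha_2 + \alpha_0))$ causes this commutativity.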
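The two commutation relations can be tested numerically on the explicit (f, g) formulas given earlier for the E6(1)-surface. The sketch below uses exact rational arithmetic at one sample point, so it is a spot check rather than a proof. The state layout `(a1, a2, a0, s, f, g)`, the helper `compose` and the sample values are assumptions made for illustration; composition is read right to left (the rightmost map acts first).

```python
from fractions import Fraction as Fr

# A state is (a1, a2, a0, s, f, g); each map returns the transformed state.
def w1(a1, a2, a0, s, f, g):
    u = s - f - g
    return (-a1, a2 + a1, a0 + a1, s, f + a1 / u, g - a1 / u)

def w2(a1, a2, a0, s, f, g):
    return (a1 + a2, -a2, a0 + a2, s, f, g + a2 / f)

def w0(a1, a2, a0, s, f, g):
    return (a1 + a0, a2 + a0, -a0, s, f - a0 / g, g)

def T(a1, a2, a0, s, f, g):
    """Translation T in the closed form stated in the text (d-P_II)."""
    lam = a1 + a2 + a0
    f_bar = s - f - g + a0 / g
    g_bar = s - g - f_bar - (a2 - lam) / f_bar
    return (a1, a2 - lam, a0 + lam, s, f_bar, g_bar)

def compose(*maps):
    """Return the composite map; the rightmost map is applied first."""
    def run(*state):
        for m in reversed(maps):
            state = m(*state)
        return state
    return run

# Arbitrary rational sample point (hypothetical values, chosen to avoid poles).
SAMPLE = (Fr(1, 3), Fr(2, 5), Fr(3, 7), Fr(2), Fr(5, 11), Fr(7, 13))

def commutes_with_T(r, state=SAMPLE):
    """Check T o r == r o T at the given state, exactly."""
    return compose(T, r)(*state) == compose(r, T)(*state)

reflection = compose(w0, w2, w0)  # the generator w0 o w2 o w0
```

Calling `commutes_with_T(w1)` and `commutes_with_T(reflection)` compares both parameters and coordinates; since `Fraction` arithmetic is exact, equality at a generic point is strong evidence for the identities as rational maps.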
3. These settings we have seen in the case of the E7 -surface and PII are applicable to all the Painlevé equations. For some R, we list the type of surface (with the group of Cremona isometries), the degenerate diagram and the explicit form of the discrete Painlevé equation. (1) (1) (D (1) )), • A3 -surface (Cr(X(A3 )) = W 5 (1) (1) (1) (D (1) )), (D ) → W degeneration: q-PV I → PV I (A → D ; W 3
4
5
4
q-PVI = σ(20) ◦ w2 ◦ w1 ◦ w0 ◦ w2 ◦ σ(13) ◦ w3 ◦ w5 ◦ w4 ◦ w3 : qb1 qb2 b3 b4 b1 b2 b3 b4 ; f, g → ;f,g , b 5 b 6 b7 b 8 qb5 qb6 b7 b8 ff g − qb1 g − qb2 = , b7 b8 g − b3 g − b4 gg f − b5 f − b 6 = , b3 b4 f − b7 f − b8 b3 b4 b5 b6 q= , b1 b2 b7 b8 where
(1)
b1 b5
b2 b6
b3 b7
b4 z 1/a2 1/a3 1/a1 a8 y , . ; f, g = ; b8 a6 a7 − 1/a4 a1 − 1/a5 a1 x − a1 z x (1)
(1)
(A )) • A4 -surface (Cr(X(A4 )) = W 4
(1) (1) (1) (A(1) )), (A4 ) → W degeneration: q-PV → PV (A4 → D5 ; W 3 2 : q-PV = w4 ◦ w3 ◦ w2 ◦ w1 ◦ σ(12340) a1 a 2 a 3 a1 a 2 a 3 ; f, g → ;f,g , a4 a0 a4 /q qa0 1−g a1 a2 ff = , a3 (1 + a2 g)(1 − a1 a2 g)
gg =
a4 a0 (1 − a3 f )(1 − f ) , a2 f (a4 − qf )
where f = (z − x)/y and g = z(x + y − z)/x(z − x). (1) (1) ((A2 + A1 )(1) )) • A5 -surface (Cr(X(A5 )) = W
(1) (1) (A(1) ) , ((A2 + A1 )(1) ) → W degeneration: q-PIV → PIV A5 → E6 ; W 2 3 : q-PIV = w1 ◦ σ(123450) a1 a2 a0 a1 a2 a0 ; f, g → ;f,g , b1 b0 b1 /q qb0 a0 b1
f f = − a2 b0 − (1 − g), g
a1 b0 gg = −
f − a 2 b0
f + a2 q = a1 a2 a0 = b1 b0 ,
where f = y/x and g = x/z.
,
(1)
((A2 +A1 )(1) ) → W ((A1 +A1 )(1) )), degeneration : q-PIII → PIII (A5 → D6 ; W 2 q-PIII = w2 ◦ w1 ◦ σ(123450) : a1 a2 /q qa0 a1 a2 a0 ; f, g → ;f,g , b1 b0 b1 b0 g − b1 , gf f = −a1 a0 g − b 1
gf g = b1
f g + b1 a2 a0 f g + b 1
,
where f = (z − x)/y and g = yz/x(z − x). (1) (1) (D (1) )) • D4 -surface (Cr(X(D4 )) = W 4
(1)
degeneration: d-PV (contiguity of PV I ) → PV (D4 (A(1) )), W 3
(1)
→ D5
(D (1) ) → ; W 4
d-PV = σ(14) ◦ w2 ◦ w3 ◦ w4 ◦ σ(40) ◦ σ(34) ◦ σ(40) ◦ w2 ◦ w1 ◦ w0 : a1 + λ a2 − λ a3 a1 a2 a3 ; s; f, g → ; s; f , g , a4 a0 a4 a0 + λ a1 a0 f + f = a3 + + , g + 1 sg + 1 gg =
(f − λ + a2 )(f − λ + a2 + a4 )
sf (f − a3 ) λ = a1 + a2 + a3 + a4 + a0 ,
where f = y/x and g = (x − z)/z. (1) (1) (A(1) )) • D5 -surface (Cr(X(D5 )) = W 3
,
(1)
(1)
degeneration: d-PIV (contiguity of PV ) → PIV (D5 → E6 ; (A(1) )), (A(1) ) → W W 3
2
d-PIV = w3 ◦ w2 ◦ w1 ◦ σ(45) ◦ σ(14)(50) : a1 a2 a1 a2 ; s; f, g → ; s; f , g , a3 a0 a3 − λ a 0 + λ sg ff = , (g − a3 + λ)(g + a0 + λ) s a1 + a0 g+g = + − λ + a 3 − a0 , f 1−f λ = a1 + a2 + a3 + a0 , where f =
y(z − x) − szx (x − z)(y(z − x) − szx) and g = . x((y + a0 z)(z − x) − szx) z(z − x)
(A ) → W ((A1 + degeneration: d-PIII (contiguity of PV ) → PIII (D5 → D6 ; W 3 (1) A1 ) )), d-PIII = (w3 ◦ w2 ◦ w1 ◦ σ(45) ◦ σ(14)(50) ) ◦ (w2 ◦ w1 ◦ w0 ◦ σ(45) ◦ σ(14)(50) ) = w2 ◦ w3 ◦ w1 ◦ w2 ◦ (σ(45) ◦ σ(14)(50) )2 : a1 a2 + λ a1 a2 ; s; f, g → ; s; f , g , a3 a0 − λ a3 a0 a0 − λ a + 2 − λ + , f +f =1+ s−g g a 1 + a0 a 3 + a0 g+g =s− + , 1−f f λ = a1 + a2 + a3 + a0 , x((y + a0 z)(z − x) − szx) y(z − x) − szx and g = . (x − z)(y(z − x) − szx) z(z − x) (1) (1) ((A1 + A1 )(1) )) • D6 -surface (Cr(X(D6 )) = W where f =
(1) (1) degeneration: alt.d-PII (contiguity of PIII ) → PII (D6 → E7 ; W ((A1 +A1 )(1) ) → (1) (A )), W 1
alt. d-PII = w1 ◦ σ(56) ◦ σ(16)(24)(50) ◦ σ(56) : a1 a 0 a1 a0 ; s; f, g → ; s; f , g , b1 b0 b1 − λ b 0 + λ g(g − b1 ) , ff = s b1 g+g = , 1−f λ = a1 + a0 = b1 + b0 , where f = x/(x − z) and g = y/z. (1) (1) (A(1) )) • E6 -surface (Cr(X(E6 )) = W 2
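degeneration: d-$P_{\mathrm{II}}$ (contiguity of $P_{\mathrm{IV}}$) → $P_{\mathrm{II}}$ ($E_6^{(1)} \to E_7^{(1)}$; $\widetilde{W}(A_2^{(1)}) \to \widetilde{W}(A_1^{(1)})$),

\[
\begin{aligned}
\text{d-}P_{\mathrm{II}} &= w_2 \circ w_1 \circ \sigma_{(15)(24)} \circ \sigma_{(10)(26)} : (a_i; s; f, g) \longmapsto (\bar{a}_i; s; \bar{f}, \bar{g}) \\
&= \Bigl( a_1,\ a_2 - \lambda,\ a_0 + \lambda;\ s;\ s - f - g + \frac{a_0}{g},\ s - g - \bar{f} - \frac{a_2 - \lambda}{\bar{f}} \Bigr), \\
\lambda &= a_1 + a_2 + a_0,
\end{aligned}
\]

where $f = -y/x$ and $g = s - \dfrac{x}{z} + \dfrac{y}{x}$.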
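The closed form of d-P_II can be recovered from the generators by composing the (f, g) formulas given earlier for the E6(1)-surface. The sketch below does this at a sample point with exact rationals. Two assumptions are made: composition is read right to left, and the product σ(15)(24) ∘ σ(10)(26) is realised by composing the two maps −σ(15)(24) and −σ(10)(26) from the text, so that the two sign changes cancel. Function names and the sample point are illustrative only.

```python
from fractions import Fraction as Fr

# A state is (a1, a2, a0, s, f, g); each map returns the transformed state.
def w1(a1, a2, a0, s, f, g):
    u = s - f - g
    return (-a1, a2 + a1, a0 + a1, s, f + a1 / u, g - a1 / u)

def w2(a1, a2, a0, s, f, g):
    return (a1 + a2, -a2, a0 + a2, s, f, g + a2 / f)

def w0(a1, a2, a0, s, f, g):
    return (a1 + a0, a2 + a0, -a0, s, f - a0 / g, g)

def neg_sigma_15_24(a1, a2, a0, s, f, g):
    return (-a2, -a1, -a0, s, s - f - g, g)

def neg_sigma_10_26(a1, a2, a0, s, f, g):
    return (-a1, -a0, -a2, s, g, f)

def compose(*maps):
    """Return the composite map; the rightmost map is applied first."""
    def run(*state):
        for m in reversed(maps):
            state = m(*state)
        return state
    return run

def d_p2_closed_form(a1, a2, a0, s, f, g):
    """d-P_II as stated in the text."""
    lam = a1 + a2 + a0
    f_bar = s - f - g + a0 / g
    g_bar = s - g - f_bar - (a2 - lam) / f_bar
    return (a1, a2 - lam, a0 + lam, s, f_bar, g_bar)

# w2 o w1 o sigma_(15)(24) o sigma_(10)(26), with the sign convention above.
d_p2_composite = compose(w2, w1, neg_sigma_15_24, neg_sigma_10_26)

# Arbitrary rational sample point (hypothetical values, chosen to avoid poles).
SAMPLE = (Fr(1, 3), Fr(2, 5), Fr(3, 7), Fr(2), Fr(5, 11), Fr(7, 13))
```

Comparing `d_p2_composite(*SAMPLE)` with `d_p2_closed_form(*SAMPLE)` checks the stated formula; the same helpers also reproduce w2 and w0 as the conjugates σ(15)(24) ∘ w1 ∘ σ(15)(24) and σ(10)(26) ∘ w2 ∘ σ(10)(26).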
• $E_7^{(1)}$-surface ($\mathrm{Cr}(X(E_7^{(1)})) = \widetilde{W}(A_1^{(1)})$)

degeneration: alt.d-$P_{\mathrm{I}}$ (contiguity of $P_{\mathrm{II}}$) → $P_{\mathrm{I}}$ ($E_7^{(1)} \to E_8^{(1)}$; $\widetilde{W}(A_1^{(1)}) \to 1$),

\[
\begin{aligned}
\text{alt.d-}P_{\mathrm{I}} &= w_1 \circ \sigma_{(15)(24)(60)} : (a_i; s; f, g) \longmapsto (\bar{a}_i; s; \bar{f}, \bar{g}) = \Bigl( a_1 - \lambda,\ a_0 + \lambda;\ s;\ -f - \frac{a_0}{\bar{g}},\ s + f^2 - g \Bigr), \\
\lambda &= a_1 + a_0,
\end{aligned}
\]

where f = y/x and g = x/z.

In this paper we have presented a new approach for both the discrete and differential Painlevé equations. Now we can summarize how to get the differential Painlevé equations. In order to obtain the Painlevé equation, we have constructed

• X: a generalized Halphen surface with dim |−KX| = 0,
• $\mathcal{X}$: a family of surfaces X,
• Cr(X): the group of the Cremona isometries (an action on Pic(X)),
• the Cremona action: an action of Cr(X) on $\mathcal{X}$,
• discrete Painlevé equations as a Cremona action,

and the Painlevé differential equations are obtained as a limit of discrete Painlevé equations.

4. We close this section with the introduction of the most generic discrete equation. Although it has no differential equation as a limit directly, all discrete equations which appear in this article can be regarded as its degenerations. We often use terminology such as “q-difference equation” for a discrete equation whose coefficients change multiplicatively with respect to the independent variable. The next equation has coefficients which change in compliance with the addition formula for elliptic functions. If we call this kind of discrete equation an elliptic-difference equation, then the next is the elliptic-difference Painlevé equation.
(1)
(1)
• A0 -surface (Cr(X(A0 )) = W (E8 )) ell.P = w3 ◦ w2 ◦ w4 ◦ w3 ◦ w1 ◦ w2 ◦ w5 ◦ w4 ◦ w3 ◦ w6 ◦ w5 ◦ w4 ◦ w8 ◦ w3 ◦ w2 ◦ w1 ◦ w7 ◦ w6 ◦ w5 ◦ w4 ◦ w3 ◦ w2 ◦ w0 ◦ w7 ◦ w6 ◦ w5 ◦ w4 ◦ w3 ◦ w8 ◦ w3 ◦ w4 ◦ w5 ◦ w6 ◦ w7 ◦ w0 ◦ w2 ◦ w3 ◦ w4 ◦ w5 ◦ ◦w6 ◦ w7 ◦ w1 ◦ w2 ◦ w 3 ◦ w 8 ◦ w 4 ◦ w5 ◦ w 6 ◦ w3 ◦ w 4 ◦ w 5 ◦ w 2 ◦ w 1 ◦ w3 ◦ w4 ◦ w2 ◦ w3 ◦ w8 : θ1 + 23 λ θ2 + 23 λ θ3 + 23 λ θ1 θ2 θ3 θ4 θ5 θ6 ; x : y : z → θ4 − 1 λ θ5 − 1 λ θ6 − 1 λ; x : y : z , 3 3 3 θ7 θ8 θ9 θ7 − 13 λ θ8 − 13 λ θ9 − 13 λ (x : y : z) = P
θ4 − 23 λ456 + λ3 ,θ5 − 23 λ456 + λ3 ,θ6 − 23 λ456 + λ3
◦ P
θ7 + 23 λ123 +
◦ P
θ4 +
λ456 λ456 λ456 2 2 3 ,θ8 + 3 λ123 + 3 ,θ9 + 3 λ123 + 3 ,
λ123 λ123 λ123 3 ,θ5 + 3 ,θ6 + 3
◦
◦ P(θ1 ,θ2 ,θ3 ) (x : y : z),
\[ \lambda = \lambda_{123} + \lambda_{456} + \lambda_{789} = \sum_{i=1}^{9} \theta_i, \qquad \lambda_{ijk} = \theta_i + \theta_j + \theta_k, \]

where
p1 (α, β, γ ) 0
p2 (α, β, γ ) 0 0 0 p3 (α, β, γ )
( α−2β−2γ ) 1 ℘ ( α−2β−2γ ) ℘ 3 3
( −2α+β−2γ ) 1 , ℘ ( −2α+β−2γ ) ℘ 3 3
( −2α−2β+γ ) 1 ℘ ( −2α−2β+γ ) ℘ 3 3
P(α,β,γ ) (x : y : z) = l α,β,γ (x : y : z) 0 ·
0
l0,β,γ · l α+β+γ , −2α+β−2γ , −2α−2β+γ
p1 (α, β, γ ) 3 3 3 p2 (α, β, γ ) = l · l α−2β−2γ α+β+γ −2α−2β+γ 0,γ ,α , , 3 3 3 p3 (α, β, γ ) l0,α,β · l α+β+γ , α−2β−2γ , −2α+β−2γ 3
,
3
3
l α,β,γ (x : y : z) lγ ,α (x : y : z) · lα,β (x : y : z) : lα,β (x : y : z) · lβ,γ (x : y : z) : , = : lβ,γ (x : y : z) · lγ ,α (x : y : z)
x
y
z
(α) 1 , lα,β (x : y : z) = det ℘ (α) ℘ ℘ (β) ℘ (β) 1
℘ (α) ℘ (α) 1
lα,β,γ = det ℘ (β) ℘ (β) 1 . ℘ (γ ) ℘ (γ ) 1 A. Correspondence Between Standard Basis of 10 and Canonical Root Bases Here we list the correspondence between the standard basis and the canonical root bases for the Picard group Pic(X) of the generalized Halphen surface X with dim | − KX | = 0. Now we denote by (Ei )9i=0 the standard basis which defines the blowing-down structure, and denote by Di = [Di ] the class of the irreducible component of the unique anticanonical divisor. We take a root basis .(R) as for the following αi ’s. There are different correspondences, but we can take the blowing-down structure as in the list, for X and D = mi Di . See Sect. 4, 4.
Ell 1. A0 -surface: (1)
Q(A0 ) = Zδ, (1)⊥
(1)
= E8 ,
A0
(1)
.(E8 )
:
αi = Ei − Ei+1 (for i = 1, . . . , 7), α8 = E0 − E1 − E2 − E3 , α0 = E8 − E9 , δ = 2α1 + 4α2 + 6α3 + 5α4 + 4α5 + 3α6 + 2α7 + α0 + 3α8 .
(1)∗
Mul 1. A0 -surface: The same with Ell 1. (1) Mul 2. A1 -surface: (1)
.(A1 )
:
(1)⊥
D1 = E0 − E1 − E2 − E3 , δ = D1 + D 0 ,
D0 = 2E0 − E4 − E5 − E6 − E7 − E8 −E9 ,
(1)
= E7 ,
A1
(1)
.(E7 )
:
α1 = E2 − E3 , α2 = E1 − E2 , α3 = E0 − E1 − E4 − E5 , α4 = E5 − E6 , α5 = E6 − E7 , α6 = E7 − E8 , α7 = E4 − E5 , α0 = E8 − E9 , δ = α1 + 2α2 + 3α3 + 4α4 + 3α5 + 2α6 + α0 + 2α7 .
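The entries of this list can be sanity-checked with a few lines of integer arithmetic. The sketch below, which assumes the standard intersection form on Pic(X) (E0·E0 = 1, Ei·Ei = −1 for i ≥ 1), verifies for Ell 1 and Mul 2 that the stated combination δ of simple roots equals the anticanonical class 3E0 − E1 − ... − E9, that every simple root has self-intersection −2, and that in Mul 2 the roots αi are orthogonal to the components D1, D0. The helper names are illustrative and not from the paper.

```python
N = 10  # basis E0, ..., E9 of Pic(X)

def cls(**coeffs):
    """Divisor class from keyword coefficients, e.g. cls(E0=1, E1=-1) = E0 - E1."""
    v = [0] * N
    for name, c in coeffs.items():
        v[int(name[1:])] = c
    return tuple(v)

def dot(u, v):
    """Intersection number: E0.E0 = 1, Ei.Ei = -1 for i >= 1, zero otherwise."""
    return u[0] * v[0] - sum(a * b for a, b in zip(u[1:], v[1:]))

def combo(marks, roots):
    """Integer linear combination sum_i marks[i] * roots[i]."""
    return tuple(sum(marks[i] * roots[i][j] for i in marks) for j in range(N))

ANTI_K = cls(E0=3, **{f"E{i}": -1 for i in range(1, N)})

# Ell 1: root basis of type E8(1) and the coefficients of delta.
ELL1_ALPHA = {i: cls(**{f"E{i}": 1, f"E{i + 1}": -1}) for i in range(1, 8)}
ELL1_ALPHA[8] = cls(E0=1, E1=-1, E2=-1, E3=-1)
ELL1_ALPHA[0] = cls(E8=1, E9=-1)
ELL1_MARKS = {1: 2, 2: 4, 3: 6, 4: 5, 5: 4, 6: 3, 7: 2, 0: 1, 8: 3}

# Mul 2: components D1, D0 of type A1(1) and the root basis of type E7(1).
MUL2_D = {
    1: cls(E0=1, E1=-1, E2=-1, E3=-1),
    0: cls(E0=2, E4=-1, E5=-1, E6=-1, E7=-1, E8=-1, E9=-1),
}
MUL2_ALPHA = {
    1: cls(E2=1, E3=-1), 2: cls(E1=1, E2=-1),
    3: cls(E0=1, E1=-1, E4=-1, E5=-1),
    4: cls(E5=1, E6=-1), 5: cls(E6=1, E7=-1), 6: cls(E7=1, E8=-1),
    7: cls(E4=1, E5=-1), 0: cls(E8=1, E9=-1),
}
MUL2_MARKS = {1: 1, 2: 2, 3: 3, 4: 4, 5: 3, 6: 2, 0: 1, 7: 2}

def roots_ok(roots):
    """Every simple root should have self-intersection -2."""
    return all(dot(a, a) == -2 for a in roots.values())

def orthogonal(ds, roots):
    """Every component D_i should be orthogonal to every simple root."""
    return all(dot(d, a) == 0 for d in ds.values() for a in roots.values())
```

For example, `combo(ELL1_MARKS, ELL1_ALPHA) == ANTI_K` and `combo({1: 1, 0: 1}, MUL2_D) == ANTI_K` confirm the two expressions for δ, and `orthogonal(MUL2_D, MUL2_ALPHA)` confirms that the E7(1) root basis lies in the orthogonal complement of the A1(1) components. The same helpers can be reused for the other entries of the list.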
(1)
Mul 3. A2 -surface: (1)
.(A2 )
(1)⊥
:
D1 = E0 − E6 − E7 − E8 , D0 = E0 − E4 − E5 − E9 , δ = D1 + D2 + D0 ,
D2 = E0 − E1 − E2 − E3 ,
(1)
= E6 ,
A2
(1)
.(E6 )
:
α1 = E2 − E3 , α2 = E1 − E2 , α3 = E0 − E1 − E4 − E6 , α4 = E6 − E7 , α5 = E7 − E8 , α6 = E4 − E5 , α0 = E5 − E9 , δ = α1 + 2α2 + 3α3 + 2α4 + α5 + 2α6 + α0 .
(1)
Mul 4. A3 -surface: (1)
.(A3 )
(1)⊥
A3
(1)
.(D5 )
:
D1 = E0 − E6 − E7 − E8 , D2 = E0 − E1 − E2 − E3 , D3 = E0 − E4 − E5 − E8 , D0 = E8 − E9 , δ = D1 + D 2 + D 3 + D 0 , (1)
= D5 , :
α1 = E2 − E3 , α2 = E1 − E2 , α3 = E0 − E1 − E4 − E6 , α4 = E6 − E7 , α5 = E4 − E5 , α0 = E0 − E1 − E8 − E9 , δ = α1 + 2α2 + 2α3 + α4 + α5 + α0 .
(1)
Mul 5. A4 -surface: (1)
.(A4 )
:
(1)⊥
D1 = E0 − E6 − E7 − E8 , D2 = E0 − E1 − E2 − E3 , D3 = E0 − E4 − E5 − E7 , D4 = E7 − E8 , D0 = E8 − E9 , δ = D1 + D 2 + D 3 + D 4 + D 0 , (1)
= A4 ,
A4
(1)
.(A4 )
:
α1 = E2 − E3 , α2 = E1 − E2 , α3 = E0 − E1 − E4 − E6 , α4 = E4 − E5 , α0 = 2E0 − E1 − E2 − E4 − E7 − E8 − E9 , δ = α1 + α2 + α3 + α4 + α0 .
(1)
Mul 6. A5 -surface: (1)
.(A5 )
(1)⊥
A5
:
D1 = E0 − E6 − E7 − E8 , D2 = E0 − E1 − E2 − E3 , D3 = E3 − E5 , D4 = E0 − E3 − E4 − E7 , D5 = E7 − E8 , δ = D1 + D 2 + D 3 + D 4 + D 5 + D 0 ,
D0 = E8 − E9 ,
= (A2 + A1 )(1) ,
.((A2 + A1 )(1) )
:
α1 = E1 − E2 , α2 = E0 − E1 − E4 − E6 , α0 = 2E0 −E1 −E3 −E5 −E7 −E8 −E9 , α1 = E0 −E3 −E5 −E6 , α0 = 2E0 −E1 −E2 −E4 −E7 −E8 −E9 , δ = α1 + α0 = α1 + α0 .
(1)
Mul 7. A6 -surface: (1)
.(A6 )
(1)⊥
A6
.((A1 + A1,|α|2 =14 )(1) )
:
D1 = E0 − E2 − E7 − E8 , D2 = E2 − E6 , D3 = E0 − E1 − E2 − E3 , D4 = E3 − E5 , D5 = E0 − E3 − E4 − E7 , D6 = E7 − E8 , D0 = E8 − E9 , δ = D1 + D 2 + D 3 + D 4 + D 5 + D 6 + D 0 ,
= (A1 + A1,|α|2 =14 )(1) , :
α1 = E0 − E2 − E4 − E6 , α0 = 2E0 − E1 − E3 − E5 − E7 − E8 − E9 , α1 = E0 + 2E1 − E2 − 2E3 + E4 − 2E5 − E6 , α0 = 2E0 − 3E1 + E3 − 2E4 + E5 − E7 − E8 − E9 , δ = α1 + α0 = α1 + α0 .
Mul 8. A7 -surface: (1)
.(A7 )
:
(1)⊥
D1 = E0 − E2 − E7 − E8 , D2 = E2 − E6 , D3 = E0 − E1 − E2 − E3 , D4 = E3 − E4 , D5 = E4 − E5 , D6 = E0 − E3 − E4 − E7 , D7 = E7 − E8 , D0 = E8 − E9 , δ = D1 + D2 + D3 + D4 + D5 + D6 + D7 + D0 , (1)
A7
= A1,|α|2 =8 ,
(1)
.(A1,|α|2 =8 )
:
α1 = E0 − 2E1 + E2 + E6 − E7 − E8 − E9 , α0 = 2E0 + E1 − 2E2 − E3 − E4 − E5 − 2E6 , δ = α1 + α0 .
(1)
Mul 9. A7 -surface: (1)
.(A7 )
:
(1) ⊥
D1 = E0 − E2 − E7 − E8 , D2 = E2 − E6 , D3 = E0 − E1 − E2 − E3 , D4 = E3 − E5 , D5 = E1 − E3 , D6 = E0 − E1 − E4 − E7 , D7 = E7 − E8 , D0 = E8 − E9 , δ = D1 + D2 + D3 + D4 + D5 + D6 + D7 + D0 , (1)
A7
= A1 ,
(1)
.(A1 )
:
α1 = E0 − E2 − E4 − E6 , δ = α1 + α0 .
α0 = 2E0 − E1 − E3 − E5 − E7 − E8 −E9 ,
(1)
Mul 10. A8 -surface: (1)
.(A8 )
(1)⊥
A8
:
D1 = E0 − E1 − E7 − E8 , D2 = E1 − E2 , D3 = E2 − E6 , D4 = E0 − E1 − E2 − E3 , D5 = E3 − E4 , D6 = E4 − E5 , D7 = E0 − E3 − E4 − E7 , D8 = E7 − E8 , D0 = E8 − E9 , δ = D1 + D 2 + D 3 + D 4 + D 5 + D 6 + D 7 + D 8 + D 0 , (1)
= A0
(1)
Q(A0 ) = Zδ. (1)∗∗
Add 1. A0 -surface: The same with Ell 1 and Mul 1. (1)∗ Add 2. A1 -surface: The same with Mul 2. (1)∗ Add 3. A2 -surface: The same with Mul 3.
(1)
Add 4. D4 -surface: (1)
.(D4 )
:
(1)⊥
D1 = E0 − E1 − E2 − E3 , D2 = E1 − E8 , D4 = E0 − E1 − E6 − E7 , D0 = E8 − E9 , δ = D1 + 2D2 + D3 + D4 + D0 ,
D3 = E0 − E1 −E4 −E5 ,
(1)
= D4 ,
D4
(1)
.(D4 )
:
α1 = E2 − E3 , α2 = E0 − E2 − E4 − E6 , α4 = E6 − E7 , α0 = E0 − E1 − E8 − E9 , δ = α1 + 2α2 + α3 + α4 + α0 .
α3 = E4 − E5 ,
(1)
Add 5. D5 -surface: (1)
.(D5 )
:
(1)⊥
D1 = E0 −E1 −E2 −E3 , D2 = E2 −E8 , D3 = E1 −E2 , D4 = E0 −E1 −E4 −E5 , D5 = E0 −E1 −E6 −E7 , D0 = E8 −E9 , δ = D1 + 2D2 + 2D3 + D4 + D5 + D0 , (1)
D5
= A3 ,
(1)
.(A3 )
:
α1 = E4 −E5 , α2 = E0 −E3 −E4 −E6 , α3 = E6 −E7 , α0 = 2E0 −E1 −E2 −E4 −E6 −E8 −E9 , δ = α1 + α2 + α3 + α0 .
(1)
Add 6. D6 -surface: (1)
.(D6 )
(1)⊥
D6
.((2A1 )(1) )
:
D1 = E0 −E1 −E2 −E3 , D2 = E3 −E8 , D3 = E2 −E3 , D4 = E1 −E2 , D5 = E0 −E1 −E4 −E5 , D6 = E0 −E1 −E6 −E7 , D0 = E8 −E9 , δ = D1 + 2D2 + 2D3 + 2D4 + D5 + D6 + D0 ,
= (2A1 )(1) , :
α1 = E4 −E5 , α0 = 3E0 −E1 −E2 −E3 −2E4 −E6 −· · ·−E9 , α1 = E6 −E7 , α0 = 3E0 −E1 −· · ·−E5 −2E6 −E8 −E9 , δ = α1 + α0 = α1 + α0 .
(1)
Add 7. D7 -surface: (1)
.(D7 )
(1)⊥
D7
:
D1 = E0 − E1 − E2 − E3 , D2 = E3 − E8 , D3 = E2 − E3 , D4 = E1 − E2 , D5 = E0 − E1 − E4 − E5 , D6 = E4 − E6 , D7 = E5 − E7 , D0 = E8 − E9 , δ = D1 + 2D2 + 2D3 + 2D4 + 2D5 + D6 + D7 + D0 , (1)
= A1,|α|2 =4 ,
.(A1,|α|2 =4 )
:
α1 = E4 − E5 + E6 − E7 , α0 = 3E0 − E1 − E2 − E3 − 2E4 − 2E6 − E8 − E9 , δ = α1 + α0 .
(1)
Add 8. D8 -surface: (1)
.(D8 )
:
(1)⊥
D1 = E0 − E1 − E2 − E3 , D2 = E3 − E8 , D3 = E2 − E3 , D4 = E1 − E2 , D5 = E0 − E1 − E4 − E5 , D6 = E5 − E6 , D7 = E4 − E5 , D8 = E6 − E7 , D0 = E8 − E9 , δ = D1 + 2D2 + 2D3 + 2D4 + 2D5 + 2D6 + D7 + D8 + D0 , (1)
D8
= A0
(1)
Q(A0 ) = Zδ. (1)
Add 9. E6 -surface: (1)
.(E6 )
(1)⊥
E6
(1)
.(A2 )
:
D1 = E0 − E1 − E4 − E5 , D2 = E1 − E2 , D3 = E2 − E7 , D4 = E0 − E1 − E2 − E3 , D5 = E3 − E6 , D6 = E7 − E8 , D0 = E8 − E9 , δ = D1 + 2D2 + 3D3 + 2D4 + D5 + 2D6 + D0 , (1)
= A2 , :
α1 = E4 − E5 , α2 = E0 − E3 − E4 − E6 , α0 = 2E0 − E1 − E2 − E4 − E7 − E8 − E9 , δ = α1 + α2 + α0 .
(1)
Add 10. E7 -surface: (1)
.(E7 )
(1)⊥
E7
(1)
.(A1 )
:
D0 = E0 − E1 − E4 − E5 , D1 = E1 − E2 , D2 = E2 − E3 , D3 = E3 − E6 , D4 = E6 − E7 , D5 = E7 − E8 , D6 = E8 − E9 , D7 = E0 − E1 − E2 − E3 , δ = 2D1 + 3D2 + 4D3 + 3D4 + 2D5 + D6 + 2D7 + D0 , (1)
= A1 , :
α1 = E4 − E5 , δ = α1 + α0 .
α0 = 3E0 − E1 − E2 − E3 − 2E4 − E6 − · · · − E9 ,
(1)
Add 11. E8 -surface: (1)
.(E8 )
(1)⊥
E8
:
Di = Ei − Ei+1 (for i = 1, . . . , 7), D8 = E0 − E1 − E2 − E3 , D0 = E8 − E9 , δ = 2D1 + 4D2 + 6D3 + 5D4 + 4D5 + 3D6 + 2D7 + D0 + 3D8 , (1)
= A0 ,
(1)
Q(A0 ) = Zδ.
B. Realization of Generalized Halphen Surfaces with dim | − KX | = 0 We have a blowing down morphism: X → P2 which gives the basis (Ei )9i=0 of Pic(X) in the relation of Appendix A with Di ’s, for a generalized Halphen surface X with dim |−KX | = 0. We normalize P2 by PGL(3) and get the following list. Any generalized Halphen surface is obtained by blowing up P2 with the centers at pi ’s. See § 5, 3 and 4. (1) Ell 1. A0 -surface: D: a smooth elliptic curve; y 2 z = 4x 3 − g2 xz2 − g3 z3 , pi : (℘ (θi ) : ℘ (θi ) : 1) = (ai : ai : 1) (i = 1, . . . , 9), χ (αi ) = θi+1 − θi
(i = 1, . . . , 7),
χ (α8 ) = θ1 + θ2 + θ3 ,
χ (α0 ) = θ9 − θ8 .
(1)∗
Mul 1. A0 -surface: D: a rational curve with a node; y 2 z = 4(x + 13 z)2 (x − 23 z)
pi :
1 1 −2 cos θi − : 1 = (ai : ai : 1) : 3 sin2 θi sin3 θi
χ (αi ) = θi+1 − θi
(i = 1, . . . , 7),
(i = 1, . . . , 9).
χ (α8 ) = θ1 + θ2 + θ3 ,
(1)
Mul 2. A1 -surface: ✬✩ y=0 ✫✪ y 2 = xz
χ (α0 ) = θ9 − θ8 .
p1 : (a1 : 0 : 1), p4 : p7 :
(a42 (a72
p2 : (a2 : 0 : 1),
: a4 : 1),
p5 :
: a7 : 1),
p8 :
a1 a2 a3
(a52 (a82
: a5 : 1), p6 : (a62 : a6 : 1), : a8 : 1), p9 : (a92 : a9 : 1).
e e
√ −1χ(α1 )
√ 2π −1χ(α4 ) √ 2π −1χ(α7 )
= a3 /a2 ,
e2π
= a5 /a6 ,
e
= a4 /a5 ,
e
µ2 a1 µ2 a2 µ2 a3
a4 a5 a6 ; x : y : z ∼ µa4 a7 a8 a9 µa7 e2π
p3 : (a3 : 0 : 1),
√ −1χ(α2 )
√ 2π −1χ(α5 ) √ 2π −1χ(α0 )
µa6 ; µ2 x : µy : z , µa9
µa5 µa8
= a2 /a1 ,
e2π
= a6 /a7 ,
e
√ −1χ(α3 )
√ 2π −1χ(α6 )
= −a1 /a4 a5 , = a7 /a8 ,
= a8 /a9 .
(1)
Mul 3. A2 -surface: z=0 x=0 ❏ p4 r ✡❏r p6 ✡ r p5 r ❏ rp7 p9 r✡ p ✡ r r r❏ 8 ✡p1 p2 p3❏
y=0
p1 : (a1 : 0 : 1), p2 : (a2 : 0 : 1), p3 : (a3 : 0 : 1), p4 : (0 : 1 : a4 ), p5 : (0 : 1 : a5 ), p9 : (0 : 1 : a9 ), p6 : (1 : a6 : 0), p7 : (1 : a7 : 0), p8 : (1 : a8 : 0).
a1 a2 a3
a4 a5 a9 ; x : y : z ∼ a6 a 7 a 8 e2π e e
√ −1χ(α1 )
√ 2π −1χ(α4 ) √ 2π −1χ(α0 )
= a3 /a2 ,
e2π
= a7 /a6 ,
e
√ −1χ(α2 )
√ 2π −1χ(α5 )
µa1 µa2 µa3 1 ν a4 ν µ a6
1 ν a5 ν µ a7
1 ν a9 ν µ a8
= a2 /a1 ,
e2π
= a8 /a7 ,
e
; µx : νy : z ,
√ −1χ(α3 )
√ 2π −1χ(α6 )
= −a1 a4 a6 , = a5 /a4 ,
= a9 /a5 .
(1)
Mul 4. A3 -surface: x=0 z=0 r p ←p ❡ ❏✡ 8 9 p4 r✡ ❏ r p7 ❏r p5 r✡ p ✡ r r r❏ 6 y = 0 ❏ ✡p1 p2 p3
←−
p4 r p5 r
p9 r r r r p 1 p2 p3
r p7 r p6
p1 : (a1 : 0 : 1), p4 : (0 : 1 : a4 ),
p2 : (a2 : 0 : 1), p3 : (a3 : 0 : 1), p5 : (0 : 1 : a5 ), p6 : (1 : a6 : 0),
p8 : (0 : 1 : 0) ← p9 :
a8 a4
e2π e
a1 a5
1 a3 µ a8 ;x : y : z ∼ 1 a7 a4
a2 a6
√ −1χ(α1 )
√ 2π −1χ(α4 )
ν
= a3 /a2 ,
e2π
= a7 /a6 ,
e
√ −1χ(α2 )
√ 2π −1χ(α5 )
x z , y x
p7 : (1 : a7 : 0),
= (0, a8 ).
µa1
µa2
µa3
1 ν a5
ν µ a6
ν µ a7
= a2 /a1 ,
e2π
= a5 /a4 ,
e
; µx : νy : z ,
√ −1χ(α3 )
√ 2π −1χ(α0 )
= −a1 a4 a6 , = a1 a8 .
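Mul 5. $A_4^{(1)}$-surface: $x = 0$, $y = 0$, $z = 0$
[Figure: the lines $x = 0$, $y = 0$, $z = 0$, with $p_1, p_2, p_3$ on $y = 0$, $p_4, p_5$ on $x = 0$, $p_6$ on $z = 0$, and infinitely near points $p_7 \leftarrow p_8 \leftarrow p_9$.]
\[
\begin{aligned}
&p_1 : (1 : 0 : 1), \quad p_2 : (a_2 : 0 : 1), \quad p_3 : (a_1 a_2 : 0 : 1),\\
&p_4 : (0 : 1 : 1), \quad p_5 : (0 : 1 : a_4), \quad p_6 : (1 : -a_3 : 0),\\
&p_7 : (0 : 1 : 0) \leftarrow p_8 : \left( \frac{x}{y}, \frac{z}{x} \right) = (0, 0) \leftarrow p_9 : \left( \frac{x}{y}, \frac{yz}{x^2} \right) = (0, a_0/a_2).
\end{aligned}
\]
\[
e^{2\pi\sqrt{-1}\chi(\alpha_1)} = a_1, \quad
e^{2\pi\sqrt{-1}\chi(\alpha_2)} = a_2, \quad
e^{2\pi\sqrt{-1}\chi(\alpha_3)} = a_3, \quad
e^{2\pi\sqrt{-1}\chi(\alpha_4)} = a_4, \quad
e^{2\pi\sqrt{-1}\chi(\alpha_0)} = a_0.
\]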
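Mul 6. $A_5^{(1)}$-surface: $x = 0$, $y = 0$, $z = 0$
[Figure: point configuration on the lines $x = 0$, $y = 0$, $z = 0$, with infinitely near points $p_7 \leftarrow p_8 \leftarrow p_9$ and $p_3 \leftarrow p_5$.]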
p1 : (1 : 0 : 1),
p2 : (a1 : 0 : 1), p4 : (0 : 1 : 1), p6 : (1 : −a2 : 0), y x , = (0, b1 /a2 ), p3 : (0 : 0 : 1) ← p5 : z y x z x yz p7 : (0 : 1 : 0) ← p8 : = (0, b0 /a1 ). , = (0, 0) ← p9 : , y x y x2 e2π e
√ −1χ(α1 )
√ 2π −1χ(α1 )
= a1 ,
e2π
= b1 ,
e
√ −1χ(α2 )
√ 2π −1χ(α0 )
= a2 , = b0 .
e2π
√ −1χ(α0 )
= b1 b0 /a1 a2 ,
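Mul 7. $A_6^{(1)}$-surface: $x = 0$, $y = 0$, $z = 0$
[Figure: point configuration on the lines $x = 0$, $y = 0$, $z = 0$, with infinitely near points $p_7 \leftarrow p_8 \leftarrow p_9$, $p_2 \leftarrow p_6$ and $p_3 \leftarrow p_5$.]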
p1 : (1 : 0 : 1),
p4 : (0 : 1 : 1), z y , = (0, a1 ), p2 : (1 : 0 : 0) ← p6 : x z y x p3 : (0 : 0 : 1) ← p5 : , = (0, a0 /b), z y x z x yz p7 : (0 : 1 : 0) ← p8 : , = (0, 0) ← p9 : , 2 = (0, −b). y x y x 2e2π e
√ −1χ(α1 )
√ 2π −1χ(α1 )
= a1 , =
e2π
a1 a02 /b2 ,
e
√ −1χ(α0 )
√ 2π −1χ(α0 )
= a0 , = b2 /a0 .
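Mul 8. $A_7^{(1)}$-surface: $x = 0$, $y = 0$, $z = 0$
[Figure: point configuration on the lines $x = 0$, $y = 0$, $z = 0$, with infinitely near points $p_7 \leftarrow p_8 \leftarrow p_9$, $p_2 \leftarrow p_6$ and $p_3 \leftarrow p_4 \leftarrow p_5$.]
\[
\begin{aligned}
&p_1 : (1 : 0 : 1),\\
&p_2 : (1 : 0 : 0) \leftarrow p_6 : \left( \frac{z}{x}, \frac{y}{z} \right) = (0, 1),\\
&p_3 : (0 : 0 : 1) \leftarrow p_4 : \left( \frac{y}{z}, \frac{x}{y} \right) = (0, 0) \leftarrow p_5 : \left( \frac{y}{z}, \frac{zx}{y^2} \right) = (0, a_0),\\
&p_7 : (0 : 1 : 0) \leftarrow p_8 : \left( \frac{x}{y}, \frac{z}{x} \right) = (0, 0) \leftarrow p_9 : \left( \frac{x}{y}, \frac{yz}{x^2} \right) = (0, a_1).
\end{aligned}
\]
\[
e^{2\pi\sqrt{-1}\chi(\alpha_1)} = a_1, \qquad e^{2\pi\sqrt{-1}\chi(\alpha_0)} = a_0.
\]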
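Mul 9. $A_7^{(1)}$-surface: $x = 0$, $y = 0$, $z = 0$
[Figure: point configuration on the lines $x = 0$, $y = 0$, $z = 0$, with $p_4$ on $x = 0$ and infinitely near points $p_7 \leftarrow p_8 \leftarrow p_9$, $p_2 \leftarrow p_6$ and $p_1 \leftarrow p_3 \leftarrow p_5$.]
\[
\begin{aligned}
&p_1 : (0 : 0 : 1) \leftarrow p_3 : \left( \frac{x}{z}, \frac{y}{x} \right) = (0, 0) \leftarrow p_5 : \left( \frac{y}{x}, \frac{x^2}{yz} \right) = (0, b),\\
&p_2 : (1 : 0 : 0) \leftarrow p_6 : \left( \frac{z}{x}, \frac{y}{z} \right) = (0, a_1), \qquad p_4 : (0 : 1 : 1),\\
&p_7 : (0 : 1 : 0) \leftarrow p_8 : \left( \frac{x}{y}, \frac{z}{x} \right) = (0, 0) \leftarrow p_9 : \left( \frac{x}{y}, \frac{yz}{x^2} \right) = (0, c).
\end{aligned}
\]
\[
(a_1, b, c \; ; \; x : y : z) \sim \left( a_1, \mu^2 b, \frac{c}{\mu^2} \; ; \; \mu x : y : z \right),
\]
\[
e^{2\pi\sqrt{-1}\chi(\alpha_1)} = a_1, \qquad e^{2\pi\sqrt{-1}\chi(\alpha_0)} = bc.
\]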
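Mul 10. $A_8^{(1)}$-surface: $x = 0$, $y = 0$, $z = 0$
[Figure: point configuration on the lines $x = 0$, $y = 0$, $z = 0$, with infinitely near points $p_7 \leftarrow p_8 \leftarrow p_9$, $p_1 \leftarrow p_2 \leftarrow p_6$ and $p_3 \leftarrow p_4 \leftarrow p_5$.]
\[
\begin{aligned}
&p_1 : (1 : 0 : 0) \leftarrow p_2 : \left( \frac{z}{x}, \frac{y}{z} \right) = (0, 0) \leftarrow p_6 : \left( \frac{z}{x}, \frac{xy}{z^2} \right) = (0, 1),\\
&p_3 : (0 : 0 : 1) \leftarrow p_4 : \left( \frac{y}{z}, \frac{x}{y} \right) = (0, 0) \leftarrow p_5 : \left( \frac{y}{z}, \frac{zx}{y^2} \right) = (0, b),\\
&p_7 : (0 : 1 : 0) \leftarrow p_8 : \left( \frac{x}{y}, \frac{z}{x} \right) = (0, 0) \leftarrow p_9 : \left( \frac{x}{y}, \frac{yz}{x^2} \right) = (0, c).
\end{aligned}
\]
\[
(b, c \; ; \; x : y : z) \sim (\mu^3 b, \mu^{-3} c \; ; \; \mu x : \mu^{-1} y : z),
\qquad
e^{2\pi\sqrt{-1}\chi(\delta)} = bc.
\]
Add 1. $A_0^{(1)**}$-surface: $y^2 z = 4x^3$
\[
p_i : (a_i : -2 : a_i^3) \quad (i = 1, \dots, 9),
\qquad
(a_i \; ; \; x : y : z) \sim (\mu a_i \; ; \; \mu x : y : \mu^3 z),
\]
\[
\chi(\alpha_i) = (a_{i+1} - a_i)/\lambda \quad (i = 1, \dots, 7), \qquad
\chi(\alpha_8) = (a_1 + a_2 + a_3)/\lambda, \qquad
\chi(\alpha_0) = (a_9 - a_8)/\lambda,
\]
where $\lambda = \sum_{i=1}^{9} a_i$.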
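The Add 1 data can be checked mechanically: the points $(a_i : -2 : a_i^3)$ lie on the cuspidal cubic $y^2 z = 4x^3$, the coordinate change $(x : y : z) \mapsto (\mu x : y : \mu^3 z)$ sends the point with parameter $a$ to the point with parameter $\mu a$, and the values $\chi(\alpha_k)$ are unchanged. The sketch below uses hypothetical helper names and arbitrary sample values.

```python
from fractions import Fraction as F

def point(a):
    """p = (a : -2 : a^3), as in the Add 1 list."""
    return (a, F(-2), a ** 3)

def on_cubic(pt):
    """Membership in the cubic y^2 z = 4 x^3."""
    x, y, z = pt
    return y * y * z == 4 * x ** 3

def act(pt, mu):
    """Coordinate change (x : y : z) -> (mu x : y : mu^3 z)."""
    x, y, z = pt
    return (mu * x, y, mu ** 3 * z)

def chi(a):
    """chi(alpha_k) for k = 0..8; a is the list [a_1, ..., a_9]."""
    lam = sum(a)
    out = {k: (a[k] - a[k - 1]) / lam for k in range(1, 8)}  # a[k] is a_{k+1}
    out[8] = (a[0] + a[1] + a[2]) / lam
    out[0] = (a[8] - a[7]) / lam
    return out

a = [F(v) for v in (1, 2, 4, 7, 11, 16, 22, 29, 37)]  # illustrative values
mu = F(5, 3)
scaled = [mu * v for v in a]
ok_points = all(on_cubic(point(v)) and act(point(v), mu) == point(mu * v) for v in a)
ok_chi = chi(a) == chi(scaled)
print(ok_points, ok_chi)
```

The invariance of $\chi$ follows because numerators and $\lambda$ both scale by $\mu$.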
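Add 2. $A_1^{(1)*}$-surface: $y = 0$, $yz = x^2$
\[
\begin{aligned}
&p_1 : (1 : 0 : a_1), && p_2 : (1 : 0 : a_2), && p_3 : (1 : 0 : a_3),\\
&p_4 : (-a_4 : 1 : a_4^2), && p_5 : (-a_5 : 1 : a_5^2), && p_6 : (-a_6 : 1 : a_6^2),\\
&p_7 : (-a_7 : 1 : a_7^2), && p_8 : (-a_8 : 1 : a_8^2), && p_9 : (-a_9 : 1 : a_9^2).
\end{aligned}
\]
\[
\left( \begin{matrix} a_1 & a_2 & a_3 \\ a_4 & a_5 & a_6 \\ a_7 & a_8 & a_9 \end{matrix} \; ; \; x : y : z \right)
\sim
\left( \begin{matrix} \mu a_1 + 2\nu & \mu a_2 + 2\nu & \mu a_3 + 2\nu \\ \mu a_4 - \nu & \mu a_5 - \nu & \mu a_6 - \nu \\ \mu a_7 - \nu & \mu a_8 - \nu & \mu a_9 - \nu \end{matrix} \; ; \; \mu x + \nu y : y : 2\mu\nu x + \nu^2 y + \mu^2 z \right),
\]
\[
\begin{aligned}
\chi(\alpha_1) &= (a_3 - a_2)/\lambda, & \chi(\alpha_2) &= (a_2 - a_1)/\lambda, & \chi(\alpha_3) &= (a_1 + a_4 + a_5)/\lambda,\\
\chi(\alpha_4) &= (a_6 - a_5)/\lambda, & \chi(\alpha_5) &= (a_7 - a_6)/\lambda, & \chi(\alpha_6) &= (a_8 - a_7)/\lambda,\\
\chi(\alpha_7) &= (a_5 - a_4)/\lambda, & \chi(\alpha_0) &= (a_9 - a_8)/\lambda,
\end{aligned}
\]
where $\lambda = \sum_{i=1}^{9} a_i$.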
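For Add 2, the equivalence combines a scaling $\mu$ with a translation $\nu$. A short exact-arithmetic sketch, with hypothetical helper names and sample values, confirms that the map $(x : y : z) \mapsto (\mu x + \nu y : y : 2\mu\nu x + \nu^2 y + \mu^2 z)$ sends the points on the line $y = 0$ to parameters $\mu a_i + 2\nu$, the points on the conic $yz = x^2$ to parameters $\mu a_i - \nu$, and that $\lambda$ simply scales by $\mu$.

```python
from fractions import Fraction as F

def act(pt, mu, nu):
    """(x : y : z) -> (mu x + nu y : y : 2 mu nu x + nu^2 y + mu^2 z)."""
    x, y, z = pt
    return (mu * x + nu * y, y, 2 * mu * nu * x + nu * nu * y + mu * mu * z)

def line_point(a):
    """p_1, p_2, p_3: (1 : 0 : a) on the line y = 0."""
    return (F(1), F(0), a)

def conic_point(a):
    """p_4, ..., p_9: (-a : 1 : a^2) on the conic yz = x^2."""
    return (-a, F(1), a * a)

def same_point(p, q):
    """Projective equality of nonzero triples: all 2x2 minors vanish."""
    return all(p[i] * q[j] == p[j] * q[i] for i in range(3) for j in range(i + 1, 3))

mu, nu = F(3, 2), F(-4, 5)
a = [F(v) for v in (1, 2, 4, 7, 11, 16, 22, 29, 37)]  # illustrative a_1..a_9
new = [mu * v + 2 * nu for v in a[:3]] + [mu * v - nu for v in a[3:]]

ok_line = all(same_point(act(line_point(v), mu, nu), line_point(w))
              for v, w in zip(a[:3], new[:3]))
ok_conic = all(act(conic_point(v), mu, nu) == conic_point(w)
               for v, w in zip(a[3:], new[3:]))
ok_lambda = sum(new) == mu * sum(a)
print(ok_line, ok_conic, ok_lambda)
```

Since the three shifts by $2\nu$ cancel the six shifts by $-\nu$ in $\lambda$, and each listed numerator is likewise free of $\nu$, every $\chi(\alpha_k)$ is invariant.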
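Add 3. $A_2^{(1)*}$-surface: $x = 0$, $x = z$, $z = 0$
[Figure: the three concurrent lines $x = 0$, $x = z$, $z = 0$.]
\[
\begin{aligned}
&p_1 : (1 : -a_1 : 1), && p_2 : (1 : -a_2 : 1), && p_3 : (1 : -a_3 : 1),\\
&p_4 : (0 : a_4 : 1), && p_5 : (0 : a_5 : 1), && p_9 : (0 : a_9 : 1),\\
&p_6 : (1 : a_6 : 0), && p_7 : (1 : a_7 : 0), && p_8 : (1 : a_8 : 0).
\end{aligned}
\]
\[
\left( \begin{matrix} a_1 & a_2 & a_3 \\ a_4 & a_5 & a_9 \\ a_6 & a_7 & a_8 \end{matrix} \; ; \; x : y : z \right)
\sim
\left( \begin{matrix} \nu a_1 - \mu - \eta & \nu a_2 - \mu - \eta & \nu a_3 - \mu - \eta \\ \nu a_4 + \eta & \nu a_5 + \eta & \nu a_9 + \eta \\ \nu a_6 + \mu & \nu a_7 + \mu & \nu a_8 + \mu \end{matrix} \; ; \; x : \mu x + \nu y + \eta z : z \right),
\]
\[
\begin{aligned}
\chi(\alpha_1) &= (a_3 - a_2)/\lambda, & \chi(\alpha_2) &= (a_2 - a_1)/\lambda, & \chi(\alpha_3) &= (a_1 + a_4 + a_6)/\lambda,\\
\chi(\alpha_4) &= (a_7 - a_6)/\lambda, & \chi(\alpha_5) &= (a_8 - a_7)/\lambda, & \chi(\alpha_6) &= (a_5 - a_4)/\lambda,\\
\chi(\alpha_0) &= (a_9 - a_5)/\lambda,
\end{aligned}
\]
where $\lambda = \sum_{i=1}^{9} a_i$.
(1)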
Add 4. D4 -surface: x=0 z=0 ❏✡❡ p1 ← p8 ← p9 p4 r✡ r❏ r p6 ←− p4 r p 2 r p 6 r r ❏ p5 ✡ p2 ❏ p7 p9 p5 r p3 r p7 r r✡ r r p3 ❏ ✡ x=z x z p1 : (0 : 1 : 0) ← p8 : , y x s s x y(sx + (1 − s)y) = 0, = 0, ← p9 : , a0 , s−1 y x2 s−1 p2 : (1 : −a2 : 1), p5 : (0 : a3 : 1),
p3 : (1 : −a2 − a1 : 1), p4 : (0 : 0 : 1), p6 : (1 : 0 : 0), p7 : (1 : a4 : 0).
(ai ; s ; x : y : z) ∼ (µai ; s ; x : µy : z), χ (α1 ) = a1 /λ,
χ (α2 ) = a2 /λ,
χ (α3 ) = a3 /λ, 4 χ (α4 ) = a4 /λ, χ (α0 ) = a0 /λ, where λ =
i=0
(1)
ai .
Add 5. D5 -surface: ◗✑ x=0 ◗✑ z=0 ✑ ◗ ❏✡❡ p1 ← p2 ← p8 ← p9 ✑ ◗ ✑ ◗ p4 r✡ ❏ r p6 ✑ ◗ ❏ r p6 r p5 ✡ p p 7 4 r r p p ←− 3 9 ❏ r✡ r r p5 r p7 r p3 ❏ ✡ x=z x z x y(z − x) p1 : (0 : 1 : 0) ← p2 : = (0, s) , = (0, 1) ← p8 : , y x y x2 x y(y(z − x) − sx 2 ) ← p9 : = (0, s(s − a0 )), , y x3
p3 : (1 : −a2 : 1), p6 : (1 : 0 : 0),
p4 : (0 : 0 : 1), p5 : (0 : a1 : 1), p7 : (1 : a3 : 0).
(ai ; s ; x : y : z) ∼ (µai ; µs ; x : µy : z), χ (α1 ) = a1 /λ, χ (α2 ) = a2 /λ, χ (α3 ) = a3 /λ, 3 where λ = ai .
χ (α0 ) = a0 /λ,
i=0
Add 6.
(1) D6 -surface:
x=0 p4 r p 6 r ❏✡❡ p1 ← p2 ← p3 ← p8 ← p9 p4 r✡ ❏ r p6 p5 r p7 r ❏ p5 ✡ p 7 ←− ❏r r✡ ❏ ✡ x=z x z x y(z − x) = (0, 0) p1 : (0 : 1 : 0) ← p2 : , = (0, 1) ← p3 : , y x y x2 x y 2 (z − x) ← p8 : = (0, s) , y x3 x y(y 2 (z − x) − sx 3 ) ← p9 : = (0, s(b1 − a0 )), , y x4 p4 : (0 : 0 : 1), p5 : (0 : a1 : 1), p6 : (1 : 0 : 0), p7 : (1 : b1 : 0). z=0
r p9
µa1 µa0 a1 a0 ; s; x :y :z ∼ ; µ2 s ; x : µy : z , b1 b0 µb1 µb0 χ (α1 ) = a1 /λ, χ (α0 ) = a0 /λ, χ (α1 ) = b1 /λ,
χ (α0 ) = (a1 + a0 − b1 )/λ,
where λ = a1 + a0 .
(1)
Add 7. D7 -surface: x=0 z=0 ❏✡❡ p1 ← p2 ← p3 ← p8 ← p9 ❏ p6 → p4 ❡ r✡ ❏ ✡ p7 → p5 ←− ❏ r✡ ❡ ❏ ✡ p1 : (0 : 1 : 0) ← p2 :
x z , y x
p9 r
❅
❅ ❅
x yz , 2 = (0, 0) y x x y(y 2 z − sx 3 ) , = (0, s) ← p9 : = (0, −a0 s), y x4
= (0, 0) ← p3 :
x y2z , y x3 x y p4 : (0 : 0 : 1) ← p6 : , = (0, 0), z x x y−z p5 : (0 : 1 : 1) ← p7 : , = (0, a1 ). z x ← p8 :
p 6 r r p7 ❅
(a1 , a0 ; s ; x : y : z) ∼ (µa1 , µa0 ; µ3 s ; x : µy : µz), a1 a0 χ (α1 ) = , χ (α0 ) = , where λ = a1 + a0 . λ λ (1)
Add 8. D8 -surface: z=0 ❞x =0 ❏ ✡✡ p1 ← p2 ← p3 ← p8 ← p9 rp7 ❅ ❏ ✡ r ❅ p7 → p6 → p5 → p4 ❞ ❏ ←− ✡ ❅ p9 ❏ ❏ ✡ x z , = (0, 0) p1 : (0 : 1 : 0) ← p2 : y x x yz ← p3 : , 2 = (0, 0) y x x y2z x y(y 2 z − sx 3 ) ← p8 : = (0, −λs), , 3 = (0, s) ← p9 : , y x y x4 y x y zx p4 : (0 : 0 : 1) ← p5 : , = (0, 0) ← p6 : , 2 = (0, 1) z y z y 2 y z(zx − y ) ← p7 : = (0, 0). , z y3 (λ; s ; x : y : z) ∼ (µλ; µ4 s ; µ−1 x : y : µz) χ (δ) = λ/λ = 1. (1)
Add 9. E6 -surface: x=0 p9 z=0 r ❏✡❡ p1 ← p2 ← p7 ← p8 ← p9 ❏ ✡ p4 r ❏❡ r p6 p4 p5 p5 ✡ ←− ❏ p3 ← p6 r r r r✡ ❏ ✡ x z x yz , = (0, 0) ← p7 : , 2 = (0, 1) p1 : (0 : 1 : 0) ← p2 : y x y x 2 x y(yz − x ) ← p8 : = (0, s) , y x3 x y(y(yz − x 2 ) − sx 3 ) ← p9 : = (0, −a0 + s 2 ), , y x4 z y p4 : (0 : 0 : 1), p5 : (0 : a1 : 1), p3 : (1 : 0 : 0) ← p6 : , = (0, −a2 ). x z (ai ; s ; x : y : z) ∼ (µ2 ai ; µs ; x : µy : µ−1 z),
χ (α1 ) = a1 /λ,
χ (α2 ) = a2 /λ,
χ (α0 ) = a0 /λ,
where λ =
2 i=0
ai .
(1)
Add 10. E7 -surface: x=0 z=0 r p4 ❡ p1 ← p2 ← p3 ← p6 ← ❏✡ p9 r r p5 ← p ← p ← p ❏ ✡ 7 8 9 p4 r ❏ ✡ p5 ←− ❏ r✡ ❏ ✡ x z x yz , = (0, 0) ← p3 : , 2 = (0, 0) p1 : (0 : 1 : 0) ← p2 : y x y x x y2z x y(y 2 z − x 3 ) ← p6 : = (0, 0) , 3 = (0, 1) ← p7 : , y x y x4 x y 2 (y 2 z − x 3 ) ← p8 : = (0, −s) , y x5 x y(y 2 (y 2 z − x 3 ) + sx 5 ) ← p9 : = (0, −a0 ), , y x6 p4 : (0 : 0 : 1),
p5 : (0 : a1 : 1).
(a1 , a0 ; s ; x : y : z) ∼ (µ3 a1 , µ3 a0 ; µ2 s ; x : µy : µ−2 z), χ (α1 ) = a1 /λ, χ (α0 ) = a0 /λ, where λ = a1 + a0 . (1)
Add 11. E8 -surface: p9 r
p 1 ← p2 ← p3 ← . . . . . . ← p8 ← p9 ←−
❡ z=0
x yz , 2 = (0, 0) y x 2 x y z x y(y 2 z − x 3 ) = (0, 1) ← p5 : = (0, 0) , , y x3 y x4 x y 2 (y 2 z − x 3 ) = (0, 0) , y x5 x y 3 (y 2 z − x 3 ) = (0, 0) , y x6 x y 4 (y 2 z − x 3 ) = (0, s) , y x7
p1 : (0 : 1 : 0) ← p2 : ← p4 : ← p6 : ← p7 : ← p8 :
x z , y x
= (0, 0) ← p3 :
← p9 :
x y(y 4 (y 2 z − x 3 ) − sx 7 ) , y x7
(λ ; s ; x : y : z) ∼ (µ2 λ ; µs ; µ2 x : µ3 y : z),
= (0, λ).
χ (δ) = λ/λ = 1.
C. Cremona Action
It is useful to calculate generators of the Cremona action in order to construct the discrete Painlevé equations. We give some calculations here; some surfaces are omitted.
Ell 1. $A_0^{(1)}$-surface: $W(E_8^{(1)})$-symmetry
θ1 θ2 θ3
w8 : θ4 θ5 θ 6 ; τ ; x : y : z θ7 θ8 θ9 −→
−2θ1 +θ2 −2θ3 3 θ5 + θ1 +θ32 +θ3 θ8 + θ1 +θ32 +θ3
θ1 −2θ2 −2θ3 3 θ4 + θ1 +θ32 +θ3 θ7 + θ1 +θ32 +θ3
where
−2θ1 −2θ2 +θ3 3 θ6 + θ1 +θ32 +θ3 θ9 + θ1 +θ32 +θ3
p1 (θ1 , θ2 , θ3 )
(x : y : z) = l θ1 ,θ2 ,θ3 (x : y : z) 0
0
℘ ( θ1 −2θ32 −2θ3 )
;τ;x:y:z ,
0
0
0 p3 (θ1 , θ2 , θ3 )
p2 (θ1 , θ2 , θ3 ) 0
℘ ( θ1 −2θ32 −2θ3 )
1
−2θ1 +θ2 −2θ3 · ) ℘ ( −2θ1 +θ3 2 −2θ3 ) 1 ℘( , 3 −2θ1 −2θ2 +θ3 −2θ1 −2θ2 +θ3
℘( ) ℘( ) 1 3 3
p1 (α, β, γ )
· p2 (α, β, γ ) = p3 (α, β, γ )
l0,β,γ · l α+β+γ , −2α+β−2γ , −2α−2β+γ
3 3 3 l0,γ ,α · l α+β+γ , −2α−2β+γ , α−2β−2γ 3 3 3 l0,α,β · l α+β+γ , α−2β−2γ , −2α+β−2γ 3
3
,
3
l α,β,γ (x : y : z) lγ ,α (x : y : z) · lα,β (x : y : z) : lα,β (x : y : z) · lβ,γ (x : y : z) : , = : lβ,γ (x : y : z) · lγ ,α (x : y : z)
x
y
z
lα,β (x : y : z) = det ℘ (α) ℘ (α) 1 , ℘ (β) ℘ (β) 1
℘ (α) ℘ (α) 1
lα,β,γ = det ℘ (β) ℘ (β) 1 . ℘ (γ ) ℘ (γ ) 1
wi : (θi , θi+1 ; τ ; x : y : z) −→ (θi+1 , θi ; τ ; x : y : z), w0 : (θ8 , θ9 ) −→ (θ9 , θ8 ).
for i = 1, . . . , 7,
(1)
Mul 3. A2 -surface: (1) (1) (E (1) )-symmetry Aut(A2 ) W (E6 ) ! W 6
a 1 −σ (12) : a 4 a6
a2
a3
a5
a9
a7
a8
;
1/a4 1/a − → x:y:z 1 1/a6
1/a6 −σ(10) : (ai ; x : y : z) −→ 1/a4
1/a7
1/a8
1/a5
1/a9
1/a1
1/a2
1/a3
w3 : (ai ; x : y : z) −→ 1/a4 −a2 a6 · 1/a6 −a5 a1 1/a1 −a7 a4
−a9 a1 −a8 a4
;
1/a5
1/a9
1/a2
1/a3
1/a7
1/a8
;
y:x:z ,
x:z:y ,
−a3 a6 ;
x(−a6 x + y + a6 a1 z) : y(a4 a6 x − a4 y + z) : , : z(x + a1 a4 y − a1 z)
w1 : (a2 , a3 ; x : y : z) −→ (a3 , a2 ; x : y : z), w2 : (a1 , a2 ) −→ (a2 , a1 ), w4 : (a6 , a7 ) −→ (a7 , a6 ), w5 : (a7 , a8 ) −→ (a8 , a7 ), w6 : (a4 , a5 ) −→ (a5 , a4 ), w0 : (a5 , a9 ) −→ (a9 , a5 ). (1)
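The coordinate part of $w_3$ above, $(x : y : z) \mapsto (x(-a_6 x + y + a_6 a_1 z) : y(a_4 a_6 x - a_4 y + z) : z(x + a_1 a_4 y - a_1 z))$, has the shape of a quadratic Cremona transformation. A quick consistency check, sketched below with a hypothetical helper `w3_xyz` and arbitrary sample values, is that all three components vanish at $p_1 = (a_1 : 0 : 1)$, $p_4 = (0 : 1 : a_4)$ and $p_6 = (1 : a_6 : 0)$, so these three points are base points of the map.

```python
from fractions import Fraction as F

def w3_xyz(pt, a1, a4, a6):
    """Coordinate part of w3 for the Mul 3 surface, as read from the text."""
    x, y, z = pt
    return (
        x * (-a6 * x + y + a6 * a1 * z),
        y * (a4 * a6 * x - a4 * y + z),
        z * (x + a1 * a4 * y - a1 * z),
    )

# Illustrative nonzero parameter values.
a1, a4, a6 = F(2), F(7), F(13)
p1 = (a1, F(0), F(1))
p4 = (F(0), F(1), a4)
p6 = (F(1), a6, F(0))

# All three quadrics vanish at each of p1, p4, p6.
base_points_ok = all(w3_xyz(p, a1, a4, a6) == (0, 0, 0) for p in (p1, p4, p6))
# A point off the three lines is mapped to an honest projective point.
generic_image = w3_xyz((F(1), F(1), F(1)), a1, a4, a6)
print(base_points_ok, generic_image != (0, 0, 0))
```

Each linear factor vanishes on the line through two of the three points; for instance $-a_6 x + y + a_6 a_1 z$ vanishes at both $p_1$ and $p_6$.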
Mul 4. A3 -surface: (1) (1) (D (1) )-symmetry Aut(A3 ) W (D5 ) ! W 5 a8 a1 a 2 a 3 −σ (12)(30) : ;x:y:z a4 a5 a6 a7 −1/a5 a1 −a4 a1 1/a6 1/a7 −→ ; x(x − a1 z) : z(x + a1 a4 y − a1 z) : xy , 1/a4 1/a8 a1 a4 1/a2 1/a3 1/a8 1/a1 1/a2 1/a3 − σ(13) : (ai ; x : y : z) −→ ;z:y:x , 1/a6 1/a7 1/a4 1/a5 w3 : (ai ; x : y : z) a8 a1 a4 1/a4 −a2 a6 −a3 a6 x(−a6 x + y + a6 a1 z) : y(a4 a6 x − a4 y + z) : −→ ; , 1/a6 −a5 a1 1/a1 −a7 a4 : z(x + a1 a4 y − a1 z) w0 : (ai ; x : y : z) 1/a1 1/a8 a2 a3 −→ ; x(x − a1 z) : y(a8 x − z) : z(x − a1 z) , a4 a1 a5 a1 a6 a8 a7 a8 w1 : (a2 , a3 ; x : y : z) −→ (a3 , a2 ; x : y : z), w2 : (a1 , a2 ) −→ (a2 , a1 ), w4 : (a4 , a5 ) −→ (a5 , a4 ), w5 : (a6 , a7 ) −→ (a7 , a6 ).
(1)
Mul 5. A4 -surface: (1) (1) (A(1) )-symmetry Aut(A4 ) W (A4 ) ! W 4 a1 a2 a3 σ(12340) : ;x:y:z a4 a0 a3 a4 a0 2 −→ ; a4 xy(z − x) : a2 z(x + y − z)(x + a4 y − z) : x(x − z) , a1 a2 −σ(13)(40) : (ai ; x : y : z) 1/a1 1/a0 1/a4 ; −→ 1/a3 1/a2
a2 z(x − z)(x + y − z) : : y((z − x)(a0 x + a2 z) − a2 yz) : a0 x(x − z)2
,
w3 : (ai ; x : y : z) a1 a2 a3 1/a3 ; a3 x(a3 x + y − a3 z) : y(a3 x + y − z) : a3 z(x + y − z) , −→ a3 a4 a0 1/a1 a1 a2 a3 w1 : (ai ; x : y : z) −→ ;x:y:z , a4 a1 a0 a1 a2 1/a2 a2 a3 w2 : (ai ; x : y : z) −→ ; x : a2 y : a2 z , a 4 a0 a1 a2 a3 a4 w4 : (ai ; x : y : z) −→ ; x : a4 y : z , 1/a4 a4 a0 3 2 w0 = σ(12340) ◦ w1 ◦ σ(12340) . (1)
Mul 6. A5 -surface: (1) Aut(A5 ) W ((A2 + A1 )(1) )-symmetry a1 a2 a0 σ(123450) : ;x:y:z b1 b0 a2 a0 a1 2 −→ ; b1 xy(z − x) : −a2 yz(x + y − z) : b1 x(x − z) , b0 b1 −σ(12)(30)(45) : (ai , bj ; x : y : z) 1/a2 1/a1 1/a0 ; x(z − x) : z(x + y − z) : yx , −→ 1/b0 1/b1 w2 : (ai , bj ; x : y : z) a1 a2 1/a2 a2 a0 ; a2 x(a2 x +y −a2 z) : −y(a2 x +y −z) : a2 z(x +y − z) , −→ b1 b0 1/a1 a1 a2 a1 a0 w1 : (ai , bj ; x : y : z) −→ ; x : a1 y : a1 z , b1 b 0 4 2 w0 = σ(123450) ◦ w1 ◦ σ(123450) ,
w1 : (ai , bj ; x : y : z)
−→
a1 a2 a0 ; x(y + a x) : −b y(y + a x) : z(a x − b y) , 2 1 2 2 1 1/b1 b12 b0
3 3 w0 = σ(123450) ◦ w1 ◦ σ(123450) . (1)
n Mul 7. A6 -surface: (W (A(1) 1 ) {σ | n ∈ Z}) S2 -symmetry
σ (= σ(1642053) ) : (a1 , a0 , b ; x : y : z) −→ (a0 , a1 , −a1 b; z(z − x) : bx(z − x) : yz) −σ(15)(24)(60) : (a1 , a0 , b ; x : y : z) −→ 1/a0 , 1/a1 , b ; z(x − z)(x + y − z) : y((z − bx)(x − z) + yz) : bx(x − z)2 , w1 : (a1 , a0 , b ; x : y : z) −→ (1/a1 , a12 a0 , ba1 ; x(y − a1 z) : y(y − z) : a1 z(y − z)), w0 = σ(15)(24)(60) ◦ w1 ◦ σ(15)(24)(60) . (1)
Mul 8. A7 -surface: {σ n | n ∈ Z} S2 -symmetry σ (= σ(14725036) ) : (a1 , a0 ; x : y : z) → (a12 a0 , 1/a1 ; yz : a1 z(x − z) : a1 x(x − z)), −σ(14)(23)(50)(67) : (a1 , a0 ; x : y : z) → (1/a0 , 1/a1 ; zx : yz : xy). (1)
Mul 9. A7 -surface: (1) D8 W (A1 )-symmetry σ(1357)(2460) : (a1 , b, c ; x : y : z) −→ bc, a1 /b, b ; xy : byz : x 2 , −σ(13)(40)(57) : (a1 , b, c ; x : y : z) −→ (1/a1 , 1/c, 1/b ; x : z : y) , w1 : (a1 , b, c ; x : y : z) −→ (1/a1 , a1 b, a1 c ; x(y − a1 z) : y(z − y) : a1 z(z − y)), w0 = σ(13)(40)(57) ◦ w1 ◦ σ(13)(40)(57) . (1)
Mul 10. A8 -surface: D6 -symmetry σ(147)(258)(360) : (b, c ; x : y : z) → (c2 , b/c ; cy : z : cx), −σ(15)(24)(60)(78) : (b, c ; x : y : z) → (1/c, 1/b ; zx : yz : xy). (1)
Add 4. D4 -surface: (1) (1) (D (1) )-symmetry Aut(D4 ) W (D4 ) ! W 4 a 1 a2 a 3 ; s; x : y : z σ(40) : a 4 a0 s a1 a2 a0 ; ; xz : y(sx + (1 − s)z) : z(sx + (1 − s)z) , −→ a4 a3 s−1 a1 a2 a4 1 ; ; x : −y − a2 x : x − z , σ(14) : (ai ; s; x : y : z) −→ a3 a0 s a3 a2 a1 ; 1 − s; z : y : x , σ(34) : (ai ; s; x : y : z) −→ a 4 a0
w1 : (ai ; s; x : y : z) −→
−a1
a2 + a 1 a4 a0
a3
; s; x : y : z ,
w2 : (ai ; s; x : y : z) a1 +a2 −a2 a3 +a2 −→ ; s; x(y +a2 z) : (y +a2 z)(a2 x +y) : z(a2 x +y) , a4 +a2 a0 +a2 a1 a2 +a3 −a3 w3 : (ai ; s; x : y : z) −→ ; s; x : y −a3 z : z , a4 a 0 a1 a2 + a4 a3 ; s; x : y − a4 x : z , w4 : (ai ; s; x : y : z) −→ −a4 a0 w0 : (ai ; s; x : y : z) a1 a2 + a0 a3 x(sx + (1 − s)z) : y(sx + (1 − s)z) − a0 xz −→ ; s; . a4 − a 0 : z(sx + (1 − s)z) (1)
Add 5. D5 -surface: (1) (1) (A(1) )-symmetry Aut(D5 ) W (A3 ) ! W 3 σ(14)(23)(50) : (a1 , a2 , a3 , a0 ; s; x : y : z) −→ (a2 , a1 , a0 , a3 ; −s; yz(z − x) : −y(y(z − x) − sxz) : z(y(z − x) − sxz)) , σ(45) : (ai ; s; x : y : z) −→ (a3 , a2 , a1 , a0 ; −s; z : y : x) , w1 : (ai ; s; x : y : z) −→ (−a1 , a2 + a1 , a3 , a0 + a1 ; s; x : y − a1 x : z) , w3 : (ai ; s; x : y : z) −→ (a1 , a2 + a3 , −a3 , a0 + a3 ; s; x : y − a3 z : z) , w2 = σ(14)(23)(50) ◦ w1 ◦ σ(14)(23)(50) , w0 = σ(14)(23)(50) ◦ w3 ◦ σ(14)(23)(50) . (1)
Add 6. D6 -surface: (1) Aut(D6 ) W ((2A1 )(1) )-symmetry b0 := a0 + a1 − b1 , a1 a 0 σ(16)(24)(50) : ; s; x : y : z b1 b0 x(y(y −b1 x)(z−x)−szx 2 ) : −(y −b1 x)(y(y −b1 x)(z−x)−szx 2 ) : a0 a 1 ; s; , −→ b1 b 0 : xy(y −b1 x)(z−x) σ(56) : (a1 , a0 , b1 , b0 ; s; x : y : z) −→ (b1 , b0 , a1 , a0 ;−s; z : y : x) , w1 : (a1 , a0 , b1 , b0 ; s; x : y : z) −→ (−a1 , a0 +2a1 , b1 , b0 ; s; x : y −a1 z : z) , w1 : (a1 , a0 , b1 , b0 ; s; x : y : z) −→ (a1 , a0 ,−b1 , b0 +2b1 ; s; x : y −b1 x : z) , w0 = σ(16)(24)(50) ◦ w1 ◦ σ(16)(24)(50) , w0 = σ(56) ◦ σ(16)(24)(50) ◦ σ(56) ◦ w1 ◦ σ(56) ◦ σ(16)(24)(50) ◦ σ(56) . (1)
Add 7. D7 -surface: (A(1) )-symmetry W 1 σ (= σ(67) ) : (a1 , a0 ; s; x : y : z) → (−a1 , a0 + 2a1 ; −s; x : y − a1 x − z : −z), σ (= σ(16)(25)(34)(70) ) : (a1 , a0 ; s; x : y : z) → (a0 , a1 ; −s; −xyz : zy 2 : sx 3 ).
Add 8. $D_8^{(1)}$-surface: $S_2$-symmetry
\[
\sigma_{(17)(26)(35)(80)} : (\lambda ; s ; x : y : z) \mapsto \left( \lambda ; s ; zx : (y - \lambda x/2) z : s x^2 \right).
\]
(1)
Add 9. E6 -surface: (1) (1) (A(1) )-symmetry Aut(E6 ) W (A2 ) ! W 2 −σ (15)(24) : (a1 , a2 , a0 ; s; x : y : z) −→ (−a2 , −a1 , −a0 ; s; yz : −xy : −zx) , −σ (10)(26) : (ai ; s; x : y : z) −→ −a1 , −a0 , −a2 ; s; xz : −(yz − x 2 − sxz) : z2 , w1 : (ai ; s; x : y : z) −→ (−a1 , a2 + a1 , a0 + a1 ; s; x : y − a1 z : z) , w2 = σ(15)(24) ◦ w1 ◦ σ(15)(24) , w0 = σ(10)(26) ◦ w2 ◦ σ(10)(26) . (1)
Add 10. E7 -surface: (1) (1) (A(1) )-symmetry Aut(E7 ) W (A1 ) ! W 1 σ(15)(24)(60) : (a1 , a0 ; t; x : y : z) −→ a0 , a1 ; s; x(y 2 z − szx 2 − x 3 ) : −y(y 2 z − szx 2 − x 3 ) : zx 3 , w1 : (a1 , a0 ; s; x : y : z) −→ (−a1 , a0 + 2a1 ; s; x : y − a1 z : z), w0 = σ(15)(24)(60) ◦ w1 ◦ σ(15)(24)(60) . Acknowledgement. The author expresses his sincere gratitude to Professor Masatoshi Noumi, Professor Kazuo Okamoto, Professor Masahiko Saito, Professor Kyouichi Takano, Professor Hiroshi Umemura and Professor Yasuhiko Yamada for much helpful advice and constant encouragement. The ideas in this paper have matured under the stimulus of their important and interesting results. The author is most grateful to Professor Michio Jimbo and Professor Jun-ichi Matsuzawa, who spent much time to read this paper and gave many useful advices. This work is partially supported by the Research Fellowships of the Japan Society for the Promotion of Science.
References
1. Cossec, F., Dolgachev, I.: Enriques surfaces I. Basel–Boston: Birkhäuser, 1988
2. Coxeter, H.S.M.: Finite groups by reflections, and their subgroups generated by reflections. Proc. Cambridge Philos. Soc. 30, 466–482 (1934)
3. Demazure, M.: Surface de del Pezzo, II–V. Lect. Notes in Math. 777. Berlin–Heidelberg–New York: Springer, 1980, pp. 23–69
4. Dolgachev, I.: Weyl groups and Cremona transformations. Proc. Symp. Pure Math. 40, 283–294 (1983)
5. Dolgachev, I., Ortland, D.: Point sets in projective spaces and theta functions. Astérisque 165 (1988)
6. du Val, P.: On the Kantor group of a set of points in a plane. Proc. London Math. Soc. 42, 18–51 (1937)
7. Fuchs, R.: Über lineare homogene Differentialgleichungen zweiter Ordnung mit drei im Endlichen gelegene wesentlich singulären Stellen. Math. Ann. 63, 301–321 (1907)
8. Gambier, B.: Sur les équations différentielles du second ordre et du premier degré dont l’intégrale générale est à points critiques fixes. Acta. Math. 33, 1–55 (1910)
9. Grammaticos, B., Ramani, A. and Papageorgiou, V.G.: Do integrable mappings have the Painlevé property? Phys. Rev. Lett. 67, 1825–1828 (1991)
10. Grammaticos, B., Ohta, Y., Ramani, A. and Sakai, H.: Degeneration through coalescence of the q-Painlevé VI equations. J. Phys. A: Math. Gen. 31, 3545–3558 (1998)
11. Harbourne, B.: Blowings-up of P2 and their blowings-down. Duke Math. J. 52, 129–148 (1985)
12. Jimbo, M. and Miwa, T.: Monodromy preserving deformation of linear ordinary differential equations with rational coefficients. II. Physica D 2, 407–448 (1981)
13. Jimbo, M. and Sakai, H.: A q-analog of the sixth Painlevé equation. Lett. Math. Phys. 38, 145–154 (1996)
14. Kac, V.: Infinite dimensional Lie algebras. 3rd ed., Cambridge: Cambridge University Press, 1990
15. Looijenga, E.: Rational surfaces with an anti-canonical cycle. Annals of Math. 114, 267–322 (1981)
16. Manin, Y.: Cubic forms: Algebra, Geometry, Arithmetic. Amsterdam: North Holland, 1974
17. Nagata, M.: On rational surfaces II. Mem. Coll. Sci. Univ. Kyoto 38, 271–293 (1960)
18. Noumi, M. and Yamada, Y.: Affine Weyl groups, discrete dynamical systems and Painlevé equations. Comm. Math. Phys. 199, 281–295 (1998)
19. Okamoto, K.: Sur les feuilletages associés aux équations du second ordre à points critiques fixes de P. Painlevé. Japan. J. Math. 5, 1–79 (1979)
20. Okamoto, K.: Studies on the Painlevé equations I. Annali di Matematica pura ed applicata, CXLVI, 337–381 (1987); II. Jap. J. Math. 13, 47–76 (1987); III. Math. Ann. 275, 221–255 (1986); IV. Funkcial. Ekvac. Ser. Int. 30, 305–332 (1987)
21. Painlevé, P.: Sur les équations différentielles du second ordre dont l’intégrale générale est uniforme. Oeuvre t. III, pp. 187–271
22. Ramani, A., Grammaticos, B. and Hietarinta, J.: Discrete versions of the Painlevé equations. Phys. Rev. Lett. 67, 1829–1832 (1991)
23. Ramani, A. and Grammaticos, B.: The grand scheme for the discrete Painlevé equations. In: Lecture at the Toda symposium (1996)
24. Shioda, T. and Takano, K.: On Some Hamiltonian Structures of Painlevé Systems, I. Funkcial. Ekvac. 40, 271–291 (1997)
Communicated by T. Miwa
Commun. Math. Phys. 220, 231 – 261 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Donaldson Invariants of Product Ruled Surfaces and Two-Dimensional Gauge Theories
Carlos Lozano¹, Marcos Mariño²
¹ Departamento de Física de Partículas, Universidade de Santiago de Compostela, 15706 Santiago de Compostela, Spain. E-mail: [email protected]
² Department of Physics, Yale University, New Haven, CT 06520, USA. E-mail: [email protected]
Received: 22 July 1999 / Accepted: 12 June 2000
Abstract: Using the u-plane integral of Moore and Witten, we derive a simple expression for the Donaldson invariants of product ruled surfaces $\Sigma_g \times S^2$, where $\Sigma_g$ is a Riemann surface of genus $g$. This expression generalizes a theorem of Morgan and Szabó for $g = 1$ to any genus $g$. We give two applications of our results: (1) We derive Thaddeus’ formulae for the intersection pairings on the moduli space of rank two stable bundles over a Riemann surface. (2) We derive the eigenvalue spectrum of the Fukaya–Floer cohomology of $\Sigma_g \times S^1$.
1. Introduction
The Donaldson invariants of smooth four-manifolds have played a very important role in physics and mathematics in recent years. Since the reformulation of Donaldson theory by Witten in terms of twisted N = 2 Yang–Mills theory [1], the physical approach to Donaldson theory has opened unsuspected perspectives. The main breakthrough, in this respect, was the introduction of Seiberg–Witten invariants in [2] and Witten’s “magic” formula relating the Donaldson and Seiberg–Witten invariants of simply-connected four-manifolds with $b_2^+ > 1$ and of simple type. This relation was fully explained in the fundamental paper of Moore and Witten [3], which also analyzed Donaldson–Witten theory for manifolds of $b_2^+ = 1$ using the formalism of the u-plane integral. The u-plane integral of Moore and Witten has been studied during the last two years from many different points of view [4–10]. One of the most interesting outcomes of this approach has been a complete understanding of Donaldson invariants of non-simply connected manifolds. The study of this problem from the point of view of the u-plane integral was initiated in [3,4] and completed in [6], where (among other things) a general wall-crossing formula for non-simply connected manifolds was derived, generalizing the results obtained in [11] using algebro-geometric methods.
1.1. Product ruled surfaces. Among non-simply connected manifolds of $b_2^+ = 1$, product ruled surfaces play an important role in Donaldson theory. These are four-manifolds of the form $\Sigma_g \times S^2$, where $\Sigma_g$ is a Riemann surface of genus $g$. In [6], a direct application of the lattice reduction technique of [3] led to explicit expressions for the Donaldson invariants of these surfaces, in the chamber where the volume of $S^2$ is small. Using these expressions and summing up the infinite number of wall-crossing terms, one can derive in principle the Donaldson–Witten generating function of product ruled surfaces in the chamber where the volume of $\Sigma_g$ is small. This was the approach followed in [6], where explicit and compact formulae were written down using Kronecker’s double identity to sum up the wall-crossing terms. However, formulae for the Donaldson invariants based on wall-crossing tend to be ineffective when the instanton number is large. Based on physical intuition, we would expect that some of the properties of the Donaldson–Witten series will not be apparent in these kinds of expressions, which are based on calculations made in the “electric” frame. The approach based on wall-crossing formulae makes it difficult to write generating formulae, even for lower genus, and in fact one of the motivations of this paper was to reproduce the simple result of Morgan and Szabó for $g = 1$ [12] using the u-plane integral. The approach that we follow in this paper is to perform a direct calculation of the u-plane integral in the chamber where $\Sigma_g$ is very small. This requires a slight generalization of the computation in [3] to allow for a non-zero Stiefel–Whitney class. In this chamber, there is a very important contribution coming from the Seiberg–Witten invariants. In this case, this contribution can also be computed from the u-plane integral via wall-crossing, with the important difference that the number of walls is always finite.
This is the main reason for the relative simplicity of our final expression, which is expressed in terms of “magnetic” variables. Therefore, together with the results given in [6], one finds two different expressions for the Donaldson invariants of product ruled surfaces. This is somewhat similar to the case of $\mathbb{CP}^2$, whose invariants were computed in [13] by summing up an infinite number of wall-crossings, and in [3] by direct evaluation of the u-plane integral. There is, however, one important difference: in the case of $\mathbb{CP}^2$ both expressions are expressed in terms of “electric” variables, while in this case one of them is written in “electric” variables, and the other in “magnetic” variables. Depending on the problem we are interested in, we will find one expression or the other more useful.
1.2. Relation to the moduli space of stable bundles on a curve. Apart from its intrinsic interest, the importance of having explicit expressions for the Donaldson invariants of product ruled surfaces comes from their relation to other interesting moduli problems. First of all, for zero instanton number, the moduli space of instantons on $\Sigma_g \times S^2$ is nothing but the moduli space of flat connections on the Riemann surface $\Sigma_g$. This space has been extensively studied by mathematicians, and the structure of its cohomology ring has been explored using gauge-theoretic techniques, starting with the seminal paper of Atiyah and Bott [14]. Using the connection to the Verlinde formula [15], Thaddeus [16] was able to compute the intersection pairings for the generators of the cohomology ring. The moduli space of flat connections can also be described by a two-dimensional version of Donaldson–Witten theory. In [17], Witten gave a physical derivation of these intersection pairings by exploiting the relation of this two-dimensional topological field theory to physical 2d Yang–Mills theory [18]. He found in fact explicit formulae for higher rank gauge groups.
In this paper, we will give another derivation of Thaddeus’ formulae using the Donaldson invariants of product ruled surfaces. In a sense, our derivation can
be regarded as the dimensional reduction of Donaldson–Witten theory down to two dimensions. In this case, as we are considering zero instanton number, it is preferable to use the “electric” expressions given in [6].
1.3. Relation to Fukaya–Floer cohomology. Another important application of the Donaldson invariants of product ruled surfaces is to the (Fukaya)–Floer cohomology of $\Sigma_g \times S^1$. The Fukaya–Floer cohomology of $\Sigma_g \times S^1$ gives the gluing theory for Donaldson invariants along this three-manifold, and the gluing theory can be used in turn to derive important properties of Donaldson invariants of general four-manifolds [19–21]. The ring structure of the Floer cohomology of $\Sigma_g \times S^1$ has been studied from many different points of view. In [22], an explicit presentation was obtained under the assumption that the eigenvalues can be obtained from the Donaldson invariants of $\Sigma_g$ × 1 . This presentation was finally derived by V. Muñoz in [23], and a partial determination of the ring structure of the Fukaya–Floer cohomology was obtained in [19]. In a sense, the information contained in the Fukaya–Floer cohomology of $\Sigma_g \times S^1$ is equivalent to the information contained in the Donaldson invariants of product ruled surfaces. Using our “magnetic” expression for the Donaldson invariants, it is straightforward to find the eigenvalue spectrum of the Fukaya–Floer cohomology. It is interesting to notice that this spectrum is by no means obvious from the “electric” expressions; in other words, it cannot be seen in a semiclassical instanton expansion.
1.4. Relation to the quantum cohomology of the moduli space of stable bundles over curves. We have seen that the Donaldson invariants of product ruled surfaces that correspond to zero Pontriagin number give the intersection pairings on the moduli space of stable bundles over a Riemann surface.
It has been argued that the Donaldson invariants with non-zero Pontriagin numbers, when computed in the chamber where $\Sigma_g$ is small and with Stiefel–Whitney classes satisfying $(w_2(E), [\Sigma_g]) = 0$, give essentially the Gromov–Witten invariants of this moduli space. This was shown in [22] using the dimensional reduction of topological Yang–Mills theory on $\Sigma_g \times S^2$ to a type A topological sigma model whose target space was the moduli space of flat connections on the Riemann surface.¹ The relation between the Gromov–Witten invariants and the Donaldson invariants is equivalent to the Atiyah–Floer conjecture, which says that the Floer cohomology of $\Sigma_g \times S^1$ is isomorphic as a ring to the quantum cohomology of the moduli space of stable bundles over $\Sigma_g$. The isomorphism of vector spaces was proved in [24], and the ring isomorphism was proved in [25]. Using this isomorphism, one can interpret our formula (5.15) for the generating function of the Donaldson invariants as a generating function for the Gromov–Witten invariants.
¹ In [4] it was argued, by performing the dimensional reduction in the low-energy action, that the effective two-dimensional theory can be formulated in terms of a topological Landau–Ginzburg model, which would then give an equivalent description of the quantum cohomology in a way reminiscent of mirror symmetry. It would be interesting to check this in some detail.
1.5. Organization of the paper. The organization of this paper is as follows: in Sect. 2, we give a brief summary of the results of [6] for the Donaldson invariants of non-simply connected manifolds. In Sect. 3, we compute the Donaldson invariants of product ruled surfaces in the chambers of small volume for $\Sigma_g$ and for $S^2$ by direct evaluation. In Sect. 4, we derive Thaddeus’ formulae for the intersection pairing and Verlinde’s formulae. In Sect. 5, we explain the connection to Fukaya–Floer cohomology and derive the eigenvalue spectrum.
2. Donaldson–Witten Theory on Non-Simply Connected Manifolds
The Donaldson invariants of smooth, compact, oriented four-manifolds X [26] are defined by using intersection theory on the moduli space of anti-self-dual connections. The cohomology classes on this space are associated to homology classes of X through the slant product [26] or, in the context of topological field theory, by using the descent procedure [1]. In this paper, we will restrict ourselves to the Donaldson invariants associated to zero, one and two-homology classes. The inclusion of three-classes has been considered in [6]. Define
\[
A(X) = \mathrm{Sym}(H_0(X) \oplus H_2(X)) \otimes \wedge^* H_1(X). \tag{2.1}
\]
Then, the Donaldson invariants can be regarded as functionals
\[
D_X^{w_2(E)} : A(X) \to \mathbb{Q}, \tag{2.2}
\]
where $w_2(E) \in H^2(X, \mathbb{Z})$ is the second Stiefel–Whitney class of the gauge bundle. It is convenient to organize these invariants as follows. Let $\{\delta_i\}_{i=1,\dots,b_1}$ be a basis of one-cycles, $\{\beta_i\}_{i=1,\dots,b_1}$ the corresponding dual basis of harmonic one-forms, and $\{S_i\}_{i=1,\dots,b_2}$ a basis of two-cycles. We introduce the formal sums
\[
\delta = \sum_{i=1}^{b_1} \zeta_i \delta_i, \qquad S = \sum_{i=1}^{b_2} v_i S_i, \tag{2.3}
\]
where $v_i$ are complex numbers, and $\zeta_i$ are Grassmann variables. The generator of the 0-class will be denoted by $x \in H_0(X, \mathbb{Z})$. We then define the Donaldson–Witten generating function:
\[
Z_{DW}(p, \zeta_i, v_i) = D_X^{w_2(E)}(e^{px + \delta + S}) \tag{2.4}
\]
so that the Donaldson invariants can be read off from the expansion of the left-hand side in powers of $p$, $\zeta_i$ and $v_i$. The main result in [1] is that $Z_{DW}$ can be understood as the generating functional of a twisted version of the N = 2 supersymmetric gauge theory, with gauge group SU(2), in four dimensions (see [1, 3, 27] for details). In the twisted theory one can define observables $O(x)$, $I_1(\delta) = \int_\delta O_1$, $I_2(S) = \int_S O_2$ (where $O_i$ are functionals of the fields of the theory) in one-to-one correspondence with the homology classes of X, and in such a way that the generating functional $\langle e^{pO + I_1(\delta) + I_2(S)} \rangle$ is precisely $Z_{DW}(p, \zeta_i, v_i)$. Based on the low-energy effective descriptions of N = 2 gauge theories obtained in [28, 29], Witten obtained an explicit formula for (2.4) in terms of Seiberg–Witten invariants for manifolds of $b_2^+ > 1$ and simple type [2]. The general framework to give a complete evaluation of (2.4) was established in [3]. The main result of Moore and Witten is an explicit expression for the generating function $Z_{DW}$:
\[
Z_{DW} = Z_u + Z_{SW} \tag{2.5}
\]
which consists of two pieces. $Z_{SW}$ is the contribution from the moduli space $M_{SW}$ of solutions of the Seiberg–Witten monopole equations. $Z_u$ (the u-plane integral henceforth) is the integral of a certain modular form over the fundamental domain of the group $\Gamma^0(4)$, that is, over the quotient $\Gamma^0(4) \backslash \mathbb{H}$, where $\mathbb{H}$ is the upper half-plane. The explicit form of $Z_u$ was derived in [3] for simply connected four-manifolds, and extended to the non-simply connected case in [6]. $Z_u$ is non-vanishing only for manifolds with $b_2^+ = 1$, and provides a simple physical explanation of the failure of topological invariance of the Donaldson invariants on those manifolds [3].
2.1. The u-plane integral. We will start by considering the u-plane piece. We will assume for simplicity that $b_1$ is even, although the general story is very similar. We can assume that $X$ has $b_2^+ = 1$ (otherwise the u-plane integral is zero). In this case, there is a normalized self-dual two-form or period point $\omega$, with $\omega^2 = 1$, which generates $H^{2,+}(X, \mathbb{R}) \simeq \mathbb{R}$. The self-dual and anti-self-dual projections of a two-form $\lambda$ are then given by $\lambda_+ = (\lambda, \omega)\omega$, $\lambda_- = \lambda - \lambda_+$. Another important aspect of non-simply connected four-manifolds of $b_2^+ = 1$ is that the image of the map
\[ \wedge : H^1(X, \mathbb{Z}) \otimes H^1(X, \mathbb{Z}) \longrightarrow H^2(X, \mathbb{Z}) \tag{2.6} \]
is generated by a single rational cohomology class $\Lambda$ [11], so that for any two elements of the basis $\{\beta_i\}_{i=1,\dots,b_1}$ of $H^1(X, \mathbb{Z})$, $\beta_i \wedge \beta_j = a_{ij}\Lambda$, where $a_{ij}$ is an antisymmetric $b_1 \times b_1$ matrix.$^2$ As shown in [3, 6], the new ingredient in the u-plane integral for non-simply connected manifolds is an integration over the Jacobian torus of $X$, $T^{b_1} = H^1(X, \mathbb{R})/H^1(X, \mathbb{Z})$. There is a basis of one-forms on $T^{b_1}$ that we will denote by $\beta_i^\# \in H^1(T^{b_1}, \mathbb{Z})$, and which are dual to $\beta_i \in H^1(X, \mathbb{Z})$. Notice that there is an isomorphism $H_1(X, \mathbb{Z}) \simeq H^1(T^{b_1}, \mathbb{Z})$, given by $\delta_i \to \beta_i^\#$. We will then define $\delta^\# = \sum_{i=1}^{b_1} \zeta_i \beta_i^\#$ as the image of $\delta$ in (2.3) under this isomorphism. Finally, we introduce a symplectic two-form on $T^{b_1}$ as
\[ \Omega = \sum_{i<j} a_{ij}\, \beta_i^\# \wedge \beta_j^\#, \tag{2.7} \]
which does not depend on the choice of basis. This is a volume element for the torus, hence
\[ \mathrm{vol}(T^{b_1}) = \int_{T^{b_1}} \frac{\Omega^{b_1/2}}{(b_1/2)!}. \tag{2.8} \]
We can now write the u-plane integral in the non-simply connected case:
\[ Z_u = \bigl\langle e^{pO + I_1(\delta) + I_2(S)} \bigr\rangle_u^{w_2(E)} = -4\pi i \int_{\Gamma^0(4)\backslash\mathbb{H}} \frac{dx\,dy}{y^{1/2}} \int_{T^{b_1}} h_\infty\, \hat f_\infty(p, \delta, S, \tau, y)\, \Psi(\tilde S), \tag{2.9} \]

2 The class $\Lambda$ was denoted by $\Sigma$ in [6, 11].
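where $x = \mathrm{Re}(\tau)$, $y = \mathrm{Im}(\tau)$. In this expression, the function $\hat f_\infty(p, \delta, S, \tau, y)$ is an almost holomorphic modular form, as well as a differential form on $T^{b_1}$, given by:
\[ \hat f_\infty(p, \delta, S, \tau, y) = \frac{\sqrt{2}}{64\pi}\, h_\infty^{b_1-3}\, \vartheta_4^{\sigma}\, f_{2\infty}^{-1}\, e^{2pu_\infty + S^2 \hat T_\infty} \exp\Bigl[ 2 f_{1\infty}(S, \Lambda)\Omega + i h_\infty^{-1}\delta^\# \Bigr]. \tag{2.10} \]
We have denoted the intersection form in two-cohomology by $( \, , )$ and used Poincaré duality to convert cohomology classes into homology classes. $\Psi(\tilde S)$ is a Siegel–Narain theta function given by:
\[
\begin{aligned}
\Psi(\tilde S) ={}& \exp(2\pi i \lambda_0^2)\, \exp\Bigl[ -\frac{1}{8\pi y} h_\infty^{-2} \tilde S_-^2 \Bigr] \sum_{\lambda \in H^2 + \frac{1}{2} w_2(E)} \exp\Bigl[ -i\pi\bar\tau(\lambda_+)^2 - i\pi\tau(\lambda_-)^2 + \pi i(\lambda - \lambda_0, w_2(X)) \Bigr] \\
&\cdot \exp\Bigl[ -i h_\infty^{-1}(\tilde S_-, \lambda_-) \Bigr] \Bigl[ (\lambda_+, \omega) + \frac{i}{4\pi y} h_\infty^{-1} (\tilde S_+, \omega) \Bigr],
\end{aligned}
\tag{2.11}
\]
where
\[ \tilde S = S - 16 f_{2\infty} h_\infty (\Lambda \otimes \Omega), \tag{2.12} \]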
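so (2.11) is also a differential form on $T^{b_1}$, and $2\lambda_0$ is an integer lifting of $w_2(E)$.$^3$ Finally, in the above expressions $u_\infty$, $T_\infty$, $h_\infty$, $f_{1\infty}$ and $f_{2\infty}$ are modular forms defined as follows:
\[
\begin{aligned}
u_\infty &= \frac{1}{2}\, \frac{\vartheta_2^4 + \vartheta_3^4}{(\vartheta_2 \vartheta_3)^2} = \frac{1}{8 q^{1/4}} \bigl( 1 + 20 q^{1/2} - 62 q + \cdots \bigr), \\
T_\infty &= -\frac{1}{24} \Bigl( \frac{E_2}{h_\infty^2} - 8 u_\infty \Bigr) = q^{1/4} \bigl( 1 - 2 q^{1/2} + 6 q + \cdots \bigr), \\
h_\infty(\tau) &= \frac{1}{2} \vartheta_2 \vartheta_3 = q^{1/8} \bigl( 1 + 2 q^{1/2} + q + \cdots \bigr), \\
f_{1\infty}(q) &= \frac{2E_2 + \vartheta_2^4 + \vartheta_3^4}{3 \vartheta_4^8} = 1 + 24 q^{1/2} + \cdots, \\
f_{2\infty}(q) &= \frac{\vartheta_2 \vartheta_3}{2 \vartheta_4^8} = q^{1/8} + 18 q^{5/8} + \cdots.
\end{aligned}
\tag{2.13}
\]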
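The leading coefficients quoted in (2.13) can be cross-checked numerically. The sketch below is not part of the original computation; it assumes the series conventions $\vartheta_2 = \sum_n q^{(n+1/2)^2/2}$, $\vartheta_3 = \sum_n q^{n^2/2}$, $\vartheta_4 = \sum_n (-1)^n q^{n^2/2}$ and $E_2 = 1 - 24\sum_{n\ge1} n q^n/(1-q^n)$ with $q = e^{2\pi i\tau}$, which reproduce the expansions printed above. The function names are illustrative only.

```python
import math

def theta2(q, nmax=8):
    # theta_2 = sum over n of q^{(n + 1/2)^2 / 2}  (assumed convention)
    return sum(q ** ((n + 0.5) ** 2 / 2) for n in range(-nmax, nmax))

def theta3(q, nmax=8):
    # theta_3 = sum over n of q^{n^2 / 2}
    return sum(q ** (n * n / 2) for n in range(-nmax, nmax + 1))

def theta4(q, nmax=8):
    # theta_4 = sum over n of (-1)^n q^{n^2 / 2}
    return sum((-1) ** n * q ** (n * n / 2) for n in range(-nmax, nmax + 1))

def eisenstein2(q, nmax=50):
    # E_2 = 1 - 24 * sum_{n >= 1} n q^n / (1 - q^n)
    return 1 - 24 * sum(n * q ** n / (1 - q ** n) for n in range(1, nmax))

def forms_at_infinity(q):
    """Return (u, T, h, f1, f2) of (2.13) from their closed forms, for real 0 < q < 1."""
    t2, t3, t4, e2 = theta2(q), theta3(q), theta4(q), eisenstein2(q)
    h = 0.5 * t2 * t3
    u = 0.5 * (t2 ** 4 + t3 ** 4) / (t2 * t3) ** 2
    T = -(e2 / h ** 2 - 8 * u) / 24
    f1 = (2 * e2 + t2 ** 4 + t3 ** 4) / (3 * t4 ** 8)
    f2 = t2 * t3 / (2 * t4 ** 8)
    return u, T, h, f1, f2

def leading_terms(q):
    """Return the truncated q-expansions printed in (2.13), in the same order."""
    x = math.sqrt(q)
    return (
        (1 + 20 * x - 62 * q) / (8 * q ** 0.25),
        q ** 0.25 * (1 - 2 * x + 6 * q),
        q ** 0.125 * (1 + 2 * x + q),
        1 + 24 * x,
        q ** 0.125 + 18 * q ** 0.625,
    )
```

At small $q$ the two tuples should agree up to the first omitted power of $q^{1/2}$, so comparing them with a relative tolerance is enough to catch a wrong coefficient or sign.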
In these formulae, $q = e^{2\pi i \tau}$, and $\vartheta_i$, $i = 2, 3, 4$, are the Jacobi theta functions (we follow the notation in [3]). Notice that $T_\infty$ does not transform well under modular transformations, due to the presence of the second Eisenstein series. In (2.10), we have used the related form
\[ \hat T_\infty = T_\infty + \frac{1}{8\pi y} h_\infty^{-2}, \tag{2.14} \]
which is not holomorphic but transforms well under modular transformations. We also define the related holomorphic function $f_\infty(p, \delta, S, \tau)$ as in (2.10) but with $T_\infty$ instead of $\hat T_\infty$. One immediate application of the u-plane integral formalism is the derivation of the wall-crossing behavior of Donaldson invariants. As explained in [3], the integral (2.9) has a discontinuous variation at the cusps of $\Gamma^0(4)\backslash\mathbb{H}$ whenever the cohomology class $\lambda \in H^2(X; \mathbb{Z})$ is such that the period $\omega \cdot \lambda$ changes sign. We then say that $\lambda$ defines a wall. The cusps are located at $\tau = i\infty$, $\tau = 0$, and $\tau = 2$. The wall-crossing behavior associated to the cusp at infinity gives the wall-crossing properties of the Donaldson invariants, while the discontinuous variation of the integral at $\tau = 0, 2$ must cancel against the contribution to wall-crossing from the Seiberg–Witten piece $Z_{SW}$. As shown in [3], this cancellation completely fixes the structure of $Z_{SW}$. The wall-crossing of the u-plane integral at $\tau = i\infty$ can be easily derived by imitating the analysis in Sect. 4 of [3]. The conditions for wall-crossing are $\lambda^2 < 0$ and $\lambda_+ = 0$, and one finds [6]:
\[
\begin{aligned}
WC(\lambda) = -\frac{i}{2} (-1)^{(\lambda - \lambda_0, w_2(X))} e^{2\pi i \lambda_0^2} \Bigl[ & q^{-\lambda^2/2}\, h_\infty(\tau)^{b_1 - 2}\, \vartheta_4^{\sigma}\, f_{2\infty}^{-1} \exp\bigl\{ 2pu_\infty + S^2 T_\infty - i(\lambda, S)/h_\infty \bigr\} \\
&\cdot \int_{T^{b_1}} \exp\Bigl( 2 f_{1\infty}(q)(S, \Lambda)\Omega + 16 i f_{2\infty}(q)(\lambda, \Lambda)\Omega + i \frac{\delta^\#}{h_\infty} \Bigr) \Bigr]_{q^0}.
\end{aligned}
\tag{2.15}
\]
Using the $q$-expansion of the different modular forms, it is easy to check that the wall-crossing term is different from zero only if $0 > \lambda^2 \ge p_1/4$, where $p_1$ is the Pontryagin number of the gauge bundle (and $p_1 \equiv w_2(E)^2$ mod 4). The expression (2.15) generalizes the wall-crossing formula of [13] to non-simply connected manifolds. Wall-crossing terms for non-simply connected manifolds were computed in [11] using algebro-geometric methods in some particular cases.$^4$

3 We have absorbed all the $b_1$-dependent factors in the u-plane integral of [6] in the normalization of the differential forms on $T^{b_1}$, in order to get more compact expressions.
4 In comparing this to the expressions in [13, 11], one has to take into account that what they call $\xi$ or $\zeta$ is in fact our $2\lambda$.

2.2. The Seiberg–Witten contribution. The structure of the Seiberg–Witten contribution $Z_{SW}$ on non-simply connected four-manifolds has been studied in detail in [6], following the approach in [3]. We will not repeat the analysis here but we will write down several formulas which will be useful later. The SW part $Z_{SW}$ contains two pieces which correspond to the cusps at $\tau = 0, 2$, and is written in terms of universal modular forms and of the Seiberg–Witten invariants introduced in [2]. A crucial ingredient in the discussion of the Seiberg–Witten contribution for non-simply connected manifolds is that we have to consider generalized Seiberg–Witten invariants, which involve integration of differential forms on the moduli space of solutions to the monopole equations. These differential forms can be constructed, in the context of topological field theory, using the descent procedure, and they are associated to one-cycles in the four-manifold $X$. Equivalently, to every element $\beta_i$ in the basis of one-forms of $X$ introduced above, with $i = 1, \dots, b_1$, we associate a one-form $\nu_i$ on $M_\lambda$. The generalized Seiberg–Witten invariants are then introduced as follows. Let $\lambda \in H^2(X, \mathbb{Z}) + w_2(X)/2$ be a Spin$^c$ structure on $X$,$^5$ and let $M_\lambda$ be the corresponding Seiberg–Witten moduli space, with virtual dimension $d_\lambda = \lambda^2 - (2\chi + 3\sigma)/4$. We then define:
\[ SW(\lambda, \beta_1 \wedge \cdots \wedge \beta_r) = \int_{M_\lambda} \nu_1 \wedge \cdots \wedge \nu_r \wedge a_D^{\frac{d_\lambda - r}{2}}, \tag{2.16} \]
where $a_D$ is a two-form which represents the first Chern class of the universal line bundle on the moduli space. These generalized invariants (and their wall-crossing properties) have been considered in [30]. We can now write a general expression for $Z_{SW}$ in the case of a four-manifold of $b_2^+ = 1$. To do this, we first introduce the following modular forms:
\[
\begin{aligned}
u_M(q_D) &= \frac{\vartheta_3^4 + \vartheta_4^4}{2 (\vartheta_3 \vartheta_4)^2} = 1 + 32 q_D + 256 q_D^2 + \cdots, \\
h_M(q_D) &= \frac{1}{2i} \vartheta_3 \vartheta_4 = \frac{1}{2i} \bigl( 1 - 4 q_D + 4 q_D^2 + \cdots \bigr), \\
f_{1M}(q_D) &= \frac{2E_2 - \vartheta_3^4 - \vartheta_4^4}{3 \vartheta_2^8} = -\frac{1}{8} \bigl( 1 - 6 q_D + 24 q_D^2 + \cdots \bigr), \\
f_{2M}(q_D) &= \frac{\vartheta_3 \vartheta_4}{2i \vartheta_2^8} = \frac{1}{2^9 i} \Bigl( \frac{1}{q_D} - 12 + 72 q_D + \cdots \Bigr), \\
T_M(q_D) &= -\frac{1}{24} \Bigl( \frac{E_2}{h_M^2} - 8 u_M \Bigr) = \frac{1}{2} + 8 q_D + 48 q_D^2 + \cdots.
\end{aligned}
\tag{2.17}
\]
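The forms in (2.17) are the images of those in (2.13) under $\tau \to -1/\tau$, up to modular weight; for instance $\tau h_M(\tau) = h_\infty(-1/\tau)$ and $u_M(\tau) = u_\infty(-1/\tau)$. The following sketch checks these two relations numerically at a sample point of the upper half-plane. It assumes the standard theta series in the nome $e^{i\pi\tau}$, and the helper names are illustrative rather than taken from the references.

```python
import cmath

def thetas(tau, nmax=12):
    """Return (theta2, theta3, theta4) at tau, summed as series in exp(i*pi*tau)."""
    t2 = sum(cmath.exp(1j * cmath.pi * tau * (n + 0.5) ** 2) for n in range(-nmax, nmax))
    t3 = sum(cmath.exp(1j * cmath.pi * tau * n * n) for n in range(-nmax, nmax + 1))
    t4 = sum((-1) ** n * cmath.exp(1j * cmath.pi * tau * n * n)
             for n in range(-nmax, nmax + 1))
    return t2, t3, t4

def h_inf(tau):
    t2, t3, _ = thetas(tau)
    return 0.5 * t2 * t3

def u_inf(tau):
    t2, t3, _ = thetas(tau)
    return 0.5 * (t2 ** 4 + t3 ** 4) / (t2 * t3) ** 2

def h_M(tau):
    _, t3, t4 = thetas(tau)
    return t3 * t4 / 2j

def u_M(tau):
    _, t3, t4 = thetas(tau)
    return (t3 ** 4 + t4 ** 4) / (2 * (t3 * t4) ** 2)

def duality_defects(tau):
    """Return |tau*h_M(tau) - h_inf(-1/tau)| and |u_M(tau) - u_inf(-1/tau)|."""
    return abs(tau * h_M(tau) - h_inf(-1 / tau)), abs(u_M(tau) - u_inf(-1 / tau))
```

Both defects should vanish to machine precision for any $\tau$ with positive imaginary part of order one. Evaluating `u_M` at a point with small $|e^{2\pi i\tau}|$ also reproduces the expansion $1 + 32 q_D + 256 q_D^2 + \cdots$.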
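These modular forms are (up to the modular weight) the same modular forms as in (2.13) but evaluated at $\tau_D = -1/\tau$, that is, $\tau h_M(\tau) = h_\infty(-1/\tau)$, and so on. In the above expansions, we have used the dual variable $q_D = e^{2\pi i \tau_D}$. We will denote by $\delta_* = \sum_{i=1}^{b_1} \zeta_i \beta_i$ the formal combination of one-forms which is dual to $\delta$ in (2.3). $Z_{SW}$ can then be written as the sum of two terms. The first one (corresponding to the cusp at $\tau = 0$, or monopole cusp) is given by the following expression:
\[
\begin{aligned}
Z_{SW,M} ={}& \frac{i^{b_1+1}}{8} \sum_{\lambda} \sum_{b \ge 0} \sum_{n=0}^{b} \frac{(-1)^n}{n!\,(b-n)!}\, 2^{-6n-5b+b_1/2}\, e^{2i\pi(\lambda_0 \cdot \lambda + \lambda_0^2)} \\
&\cdot \Bigl[ q_D^{-\lambda^2/2}\, h_M^{b_1-3}\, \vartheta_2^{8+\sigma} \Bigl( \frac{i f_{1M}}{f_{2M}} \Bigr)^{(b+n-b_1)/2} \exp\bigl\{ 2pu_M - i(S, \lambda)/h_M + S^2 T_M(u) \bigr\} \\
&\qquad \cdot \Bigl( 2 f_{1M}(S, \Lambda) + 16 i\, \frac{1 + 8 f_{1M} f_{2M}}{8 f_{1M}}\, (\lambda, \Lambda) \Bigr)^{n} \Bigr]_{q_D^0} \\
&\cdot \sum_{i_p, j_p = 1}^{b_1} a_{i_1 j_1} \cdots a_{i_n j_n}\, SW(\lambda, \beta_{i_1} \wedge \beta_{j_1} \wedge \cdots \wedge \beta_{i_n} \wedge \beta_{j_n} \wedge \delta_*^{b-n}),
\end{aligned}
\tag{2.18}
\]

5 In the mathematical literature, Spin$^c$ structures are rather given by integral cohomology classes which reduce to $w_2(X)$ mod 2. They correspond to $2\lambda$ in our notation.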
where the first sum is over all the Spin$^c$-structures on $X$. As shown in [2], only a finite number of $\lambda$ give non-zero Seiberg–Witten invariants (for a given metric). The second piece in $Z_{SW}$ corresponds to the cusp at $\tau = 2$ (the dyon contribution) and it has exactly the same form, with the only difference that one has to use the modular forms
\[ u_{dy} = -u_M, \qquad h_{dy} = i h_M, \qquad f_{1dy} = f_{1M}, \qquad f_{2dy} = i f_{2M}, \qquad T_{dy} = -T_M \tag{2.19} \]
and include an extra factor $\exp(-2\pi i \lambda_0^2)$. It is easy to check that
\[ Z_{SW,dy}(p, \zeta_i, v_i) = e^{-2\pi i \lambda_0^2}\, i^{\,1 - \frac{b_1}{2}}\, Z_{SW,M}(-p, i\zeta_i, -i v_i), \tag{2.20} \]
in agreement with the arguments based on R-symmetry [2, 3, 27]. The structure of $Z_{SW}$ is such that the wall-crossing behavior due to the Seiberg–Witten invariants exactly cancels the wall-crossing behavior of the u-plane integral at the cusps $\tau = 0, 2$. For example, at $\tau = 0$ one can see that (2.9) has a discontinuous behavior for $\lambda \in H^2(X, \mathbb{Z}) + w_2(X)/2$ (i.e. a Spin$^c$ structure on $X$) such that $\lambda^2 < 0$ and $\lambda \cdot \omega = 0$. These are precisely the conditions for SW wall-crossing. The discontinuity is given by:
\[
\begin{aligned}
\frac{i}{8}\, e^{2i\pi(\lambda_0 \cdot \lambda + \lambda_0^2)} \Bigl[ & q_D^{-\lambda^2/2}\, h_M^{b_1-3}\, \vartheta_2^{8+\sigma} \exp\bigl\{ 2pu_M - i(S, \lambda)/h_M + S^2 T_M(u) \bigr\} \\
&\cdot \int_{T^{b_1}} \exp\Bigl( 2 f_{1M}(S, \Lambda)\Omega + 16 i f_{2M}(\lambda, \Lambda)\Omega + \frac{i}{h_M} \delta^\# \Bigr) \Bigr]_{q_D^0}.
\end{aligned}
\tag{2.21}
\]
If one compares this expression with (2.18), one actually recovers the general wall-crossing formulae of [30]$^6$:
\[ WC\bigl(SW(\lambda, \beta_1 \wedge \cdots \wedge \beta_r)\bigr) = \frac{(-1)^{\frac{b_1 - r}{2}}}{\bigl( \frac{b_1 - r}{2} \bigr)!}\, (\lambda, \Lambda)^{\frac{b_1 - r}{2}} \int_{T^{b_1}} \beta_1^\# \wedge \cdots \wedge \beta_r^\# \wedge \Omega^{\frac{b_1 - r}{2}}. \tag{2.22} \]

3. Donaldson Invariants of Product Ruled Surfaces

In this section we will derive explicit results for the Donaldson invariants on product ruled surfaces, that is, on four-manifolds of the form $X = \Sigma_g \times S^2$, where $\Sigma_g$ is a Riemann surface of genus $g$. For these surfaces, $b_1 = 2g$, $b_2 = 2$, $b_2^+ = 1$, so $\sigma = 0$ and $\chi = 4 - 2b_1$. $H^2(X, \mathbb{Z})$ is generated by the cohomology classes $[S^2]$, $[\Sigma_g]$, with intersection form
\[ II^{1,1} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{3.1} \]
These surfaces are Spin, therefore $w_2(X) = 0$. The basis of one-forms on $\Sigma_g \times S^2$ is given by the duals to the usual symplectic basis of one-cycles on $\Sigma_g$, $\delta_i$, $i = 1, \dots, 2g$, with $\delta_i \cap \delta_{i+g} = 1$, $i = 1, \dots, g$. The matrix $a_{ij}$ is then the symplectic matrix $J$, and $\Lambda = [S^2]$ (the Poincaré dual to the two-homology class of $S^2$). It follows that
\[ \Omega = \sum_{i=1}^{g} \beta_i^\# \wedge \beta_{i+g}^\#, \tag{3.2} \]

6 Up to a numerical constant that appears when one compares the normalization of the fermion fields in the twisted theory to the topological normalization of the one-forms $\nu_i$. See [6] for details.
and $\mathrm{vol}(T^{b_1}) = 1$. As will become clear in the computation, all the Donaldson polynomials involving the cohomology classes associated to one-cycles can be expressed in terms of the $Sp(2g, \mathbb{Z})$-invariant element in $\wedge^{\mathrm{even}} H_1(X, \mathbb{Z})$ given by $\iota = -2 \sum_{i=1}^{g} \delta_i \delta_{i+g}$. This element of $A(X)$ corresponds to the degree 6 differential form on the moduli space of instantons given by:
\[ \gamma = -2 \sum_{i=1}^{g} I(\delta_i) I(\delta_{i+g}). \tag{3.3} \]
If we write $S = s\Sigma_g + tS^2$, we see that the generating function that we want to compute is $Z_{DW}(p, r, s, t) = D_X^{w_2(E)}(e^{px + r\iota + s\Sigma_g + tS^2})$. In the previous section we have computed the generating functional including $\delta$. If we want to include $\iota$ in the u-plane integral, we just take into account that the 3-class $I(\delta_i)$ on the moduli space gives $(i/h_\infty)\beta_i^\#$ in the u-plane integral. Therefore, using (3.2), we find the correspondence
\[ \gamma \to \frac{2r}{h_\infty^2}\, \Omega, \tag{3.4} \]
and to obtain $Z_{DW}(p, r, s, t)$ from the above formulae we just have to replace $(i/h_\infty)\delta^\#$ by (3.4) in the u-plane integral. For the Seiberg–Witten contribution, the modification is very similar. As $b_2^+ = 1$, the generating function (or Donaldson series) for the Donaldson invariants will be given by the sum of the u-plane integral and the SW contributions. The resulting Donaldson polynomials are not topological invariants. In this case, it is interesting to compute the polynomials in limiting chambers, i.e. in chambers where one of the factors in the product is very small (and the other factor is then very big). Once the invariants are known in these chambers, we can compute the invariants in any other chamber by adding a sum of wall-crossings. But the most important reason to study the invariants in the limiting chambers is that, for special choices of the second Stiefel–Whitney classes, the Donaldson polynomials have a simple structure. Moreover, the connection to Fukaya–Floer theory involves the limiting chamber in which $\Sigma_g$ is small. The limiting chambers can be analyzed in a fairly simple way using the general expression of the period point, as in [6]:
\[ \omega(\theta) = \frac{1}{\sqrt{2}} \bigl( e^{\theta} [S^2] + e^{-\theta} [\Sigma_g] \bigr). \tag{3.5} \]
The limiting chambers are $\theta \to \pm\infty$, which correspond to the limit of small volume for $S^2$ and $\Sigma_g$, respectively. As explained in [6], in the chamber where $S^2$ is small and $\theta \to \infty$, the scalar curvature is positive and the Seiberg–Witten invariants vanish [2]. This has two important consequences: first, in this chamber the Donaldson invariants are given just by the u-plane integral. Second, the Seiberg–Witten invariants in any other chamber can be computed by wall-crossing and they will be given by the topological expression (2.22). In particular, the SW contribution to the Donaldson invariants will be given by wall-crossing of the u-plane integral at the cusps $\tau = 0, 2$, and we can use the simple expression (2.21).
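3.1. Computing the u-plane integral. We start by computing the u-plane integral. We follow closely the analysis in Sect. 8 of [3]. We first rewrite (2.9) as
\[ G(\rho) = \int_{\Gamma^0(4)\backslash\mathbb{H}} \frac{dx\,dy}{y^{3/2}} \int_{T^{b_1}} \hat f_\infty(p, S, \tau, y)\, \bar\Theta, \tag{3.6} \]
where $\hat f_\infty$ is given in (2.10) and $\bar\Theta$ is a Siegel–Narain theta function introduced in [3]:
\[ \bar\Theta = \exp\Bigl[ \frac{\pi}{2y} \bigl( \bar\xi_+^2 - \bar\xi_-^2 \bigr) \Bigr] \sum_{\lambda \in H^2 + \beta} \exp\Bigl[ -i\pi\bar\tau(\lambda_+)^2 - i\pi\tau(\lambda_-)^2 - 2\pi i(\bar\xi, \lambda) + 2\pi i(\lambda, \alpha) \Bigr], \tag{3.7} \]
with
\[ \bar\xi = \rho\, y\, h_\infty\, \omega + \frac{1}{2\pi h_\infty} \tilde S_-, \qquad \alpha = 0, \qquad \beta = \frac{1}{2} w_2(E). \tag{3.8} \]
$Z_u$ is obtained from $G$ by
\[ Z_u = \Bigl[ (\tilde S, \omega)\, G(\rho) \Bigr]_{\rho=0} + 2\, \frac{dG}{d\rho} \Bigr|_{\rho=0}. \tag{3.9} \]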
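Next we bring the integral (2.9) over $\Gamma^0(4) \backslash \mathbb{H}$ to an integral over the fundamental domain of $SL(2, \mathbb{Z})$. Recall that a fundamental domain for $\Gamma^0(4)$ can be obtained from a fundamental domain $\mathcal{F}$ for $SL(2, \mathbb{Z})$ as follows:
\[ \Gamma^0(4) \backslash \mathbb{H} \cong \mathcal{F} \cup (T \cdot \mathcal{F}) \cup (T^2 \cdot \mathcal{F}) \cup (T^3 \cdot \mathcal{F}) \cup (S \cdot \mathcal{F}) \cup (T^2 S \cdot \mathcal{F}), \tag{3.10} \]
where $T$ and $S$ are the standard generators of $SL(2, \mathbb{Z})$. The first four domains correspond to the cusp at $\tau \to i\infty$ and will be referred to as the semiclassical cusp. The domain $S \cdot \mathcal{F}$ corresponds to the cusp at $\tau = 0$ (the monopole cusp). The domain $T^2 S \cdot \mathcal{F}$ corresponds to the cusp at $\tau = 2$, or dyon cusp. Taking this into account, we can bring the integral (3.6) to the form
\[ G(\rho) = \int_{\Gamma^0(4)\backslash\mathbb{H}} \frac{dx\,dy}{y^2} \int_{T^{b_1}} \hat f\, y^{1/2}\, \bar\Theta = \int_{\Gamma^0(4)\backslash\mathbb{H}} \frac{dx\,dy}{y^2} \int_{T^{b_1}} \mathcal{I}(\tau) = \sum_{I} \int_{\mathcal{F}} \frac{dx\,dy}{y^2} \int_{T^{b_1}} \mathcal{I}_I(\tau), \tag{3.11} \]
where $\mathcal{I}_I = \hat f_I\, y^{1/2}\, \bar\Theta_I$ denotes the modular forms
\[
\begin{aligned}
\mathcal{I}_{(\infty,0)}(\tau) &= \mathcal{I}(\tau), & \mathcal{I}_{(\infty,1)}(\tau) &= \mathcal{I}(\tau + 1), & \mathcal{I}_{(\infty,2)}(\tau) &= \mathcal{I}(\tau + 2), \\
\mathcal{I}_{(\infty,3)}(\tau) &= \mathcal{I}(\tau + 3), & \mathcal{I}_{M}(\tau) &= \mathcal{I}(-1/\tau), & \mathcal{I}_{dy}(\tau) &= \mathcal{I}(2 - 1/\tau).
\end{aligned}
\tag{3.12}
\]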
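The six translates in (3.10) can be checked with elementary integer arithmetic. The sketch below assumes the standard definition of $\Gamma^0(4)$ as the matrices in $SL(2,\mathbb{Z})$ whose upper-right entry is divisible by 4. It verifies that $1, T, T^2, T^3, S, T^2S$ lie in distinct cosets, that their number equals the index of $\Gamma^0(4)$ in $PSL(2,\mathbb{Z})$ (counted as the number of points of the projective line over $\mathbb{Z}/4$), and that $S$ and $T^2S$ send $i\infty$ to the cusps $0$ and $2$. All function names are illustrative.

```python
import math
from fractions import Fraction
from itertools import combinations

def mul(a, b):
    """Product of two 2x2 integer matrices stored as ((a, b), (c, d))."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def inv(m):
    """Inverse of a determinant-one 2x2 integer matrix."""
    (a, b), (c, d) = m
    return ((d, -b), (-c, a))

def in_gamma0_4(m):
    """Membership test for Gamma^0(4): upper-right entry divisible by 4 (assumed definition)."""
    return m[0][1] % 4 == 0

I2 = ((1, 0), (0, 1))
T = ((1, 1), (0, 1))
S = ((0, -1), (1, 0))
T2 = mul(T, T)
REPS = {"1": I2, "T": T, "T^2": T2, "T^3": mul(T2, T), "S": S, "T^2 S": mul(T2, S)}

def distinct_cosets(reps):
    """True if no two representatives x, y satisfy x * y^{-1} in Gamma^0(4)."""
    return all(not in_gamma0_4(mul(x, inv(y))) for x, y in combinations(reps.values(), 2))

def index_gamma0_4():
    """Number of points of P^1(Z/4), which equals the index in PSL(2, Z)."""
    rows = {(a, b) for a in range(4) for b in range(4)
            if math.gcd(math.gcd(a, b), 4) == 1}
    return len(rows) // 2  # identify (a, b) with (-a, -b)

def cusp(m):
    """Image of i*infinity under m as a rational number, or None for infinity."""
    (a, _), (c, _) = m
    return None if c == 0 else Fraction(a, c)
```

Since the index is 6 and the six representatives are pairwise inequivalent, they exhaust the cosets, which is the content of (3.10).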
Therefore, the integral splits into 6 integrals over the modular domain of $SL(2, \mathbb{Z})$. Now we can analyze each of these following the general procedure described in Sect. 8 of [3]. From the modular properties of the theta function (3.7) (see for example Appendix B of [3]) one notes that the corresponding theta functions have
\[
\begin{array}{c|cccccc}
I & (\infty, 0) & (\infty, 1) & (\infty, 2) & (\infty, 3) & M & dy \\ \hline
\alpha^I & 0 & w_2(E)/2 & 0 & w_2(E)/2 & w_2(E)/2 & w_2(E)/2 \\
\beta^I & w_2(E)/2 & w_2(E)/2 & w_2(E)/2 & w_2(E)/2 & 0 & 0
\end{array}
\tag{3.13}
\]
where we have taken into account that $w_2(X) = 0$ for product ruled surfaces. The computation of the u-plane integral proceeds as in [3], following a general strategy due to Borcherds [31], which is a generalization of standard techniques in one-loop threshold corrections in string theory (see for example [32]). This strategy is called lattice reduction. To perform the lattice reduction, one has to choose a reduction vector $z$ in the cohomology lattice. $z$ must be a primitive vector of zero norm, and the computation using lattice reduction will be valid in the chamber with $(z, \omega)^2$ very small. We have then two possible choices of $z$ depending on the limiting chamber we choose: for small $\Sigma_g$, one has $z = [\Sigma_g]$, and for small $S^2$ one has $z = [S^2]$. One also chooses another norm zero vector $z^\#$ such that $(z, z^\#) = 1$. The value of the u-plane integral depends of course on the value of the Stiefel–Whitney class, which we can write as
\[ w_2(E) = F z + F^\# z^\#, \tag{3.14} \]
where $F, F^\# = 0, 1$. We then write
\[ \beta^I = \frac{q^I}{2} z + \frac{r^I}{2} z^\#, \tag{3.15} \]
where $q^I = F$, $r^I = F^\#$ for the cusps at infinity, and are zero otherwise. We write the elements in the lattice $H^2(X, \mathbb{Z})$ as $c z^\# + n z$, where $c, n \in \mathbb{Z}$. The contribution of the $I$-th cusp to the integral (3.6) reads, after a Poisson summation on $n$,
\[
\begin{aligned}
\frac{e^{2\pi i \lambda_0^2}}{\sqrt{2 z_+^2}} & \int_{\mathcal{F}} \frac{dx\,dy}{y^2} \int_{T^{b_1}} \hat f_I \sum_{c, d \in \mathbb{Z}} \exp\Bigl[ -\frac{\pi}{2 y z_+^2} \bigl| (c + r^I/2)\tau + d \bigr|^2 - \frac{\pi}{y z_+^2} (\bar\xi_+^I, z_+)(\bar\xi_-^I, z_-) \Bigr] \\
&\cdot \exp\Bigl[ -\frac{\pi (\bar\xi_+^I, z_+)}{y z_+^2} \bigl( (c + r^I/2)\tau + d \bigr) - \frac{\pi (\bar\xi_-^I, z_-)}{y z_+^2} \bigl( (c + r^I/2)\bar\tau + d \bigr) + \frac{\pi}{y z_+^2} (\bar\xi^I, z)(\alpha^I, z) \Bigr] \\
&\cdot \exp\Bigl[ -i\pi q^I d - \frac{\pi}{2 y z_+^2} (\alpha^I, z)^2 + \pi \frac{(\alpha_+^I, z_+)}{y z_+^2} \Bigl( \bigl( c + \tfrac{r^I}{2} \bigr) \bar\tau + d \Bigr) + \pi \frac{(\alpha_-^I, z_-)}{y z_+^2} \Bigl( \bigl( c + \tfrac{r^I}{2} \bigr) \tau + d \Bigr) \Bigr].
\end{aligned}
\tag{3.16}
\]
We will now analyze the u-plane integral in the two limiting chambers. We first consider the case of $S^2$ small. In this case, $z = [S^2]$. The first thing to notice is that, if $F^\# \neq 0$, then the cusps at infinity do not give any contribution. This is due to the non-zero $r^I$ in the exponentials in (3.16), and it can be proved using the analysis in Sect. 5 of [3]. Moreover, the cusps at $\tau = 0, 2$ do not contribute either, because the measure in the u-plane integral goes like $q_D + \cdots$. Therefore, if $(w_2(E), [S^2]) \neq 0$, the u-plane integral vanishes in the chamber where the volume of $[S^2]$ is very small. This is an example of the vanishing theorem of [3]. We then have to consider only the case of $w_2(E) = F[S^2]$. In this case, the contribution comes from the cusps at infinity, which have $q^I = F$. Notice that $\alpha^I = (F/2) z$ for $I = (\infty, 1)$ and $(\infty, 3)$. It is easy to see that the inclusion of $\alpha^I$ for these cusps is equivalent to an extra phase $-\pi i F c$. Following [3], we now apply the unfolding technique to the integral (3.16). The action of $SL(2, \mathbb{Z})$ on $c$ and $d$ has two classes of orbits: non-degenerate orbits with $c, d$ not both zero, and the degenerate orbit with $c = d = 0$. Non-degenerate orbits can be transformed by $SL(2, \mathbb{Z})$ to have $c = 0$, giving an integral over a strip $0 \le x \le 1$ in the upper half-plane, together with a sum over $d \in \mathbb{Z} \setminus \{0\}$. In this case, the contribution of the degenerate orbit can be combined with the contribution of the non-degenerate orbits. As $c = 0$ in any case, the four cusps at infinity give the same contribution and they add to
\[ -8\sqrt{2}\, e^{2\pi i \lambda_0^2} \Bigl[ \int_{T^{b_1}} f_\infty h_\infty \sum_{d=-\infty}^{\infty} \frac{e^{-\pi i F d}}{d + A_\infty(z)} \Bigr]_{q^0}, \tag{3.17} \]
where
\[ A_I(z) \equiv \frac{(\tilde S, z)}{2\pi h_I}. \tag{3.18} \]
In this case, as $z = [S^2]$ and therefore $(\Lambda, z) = 0$, one has
\[ A_\infty(z) = \frac{s}{2\pi h_\infty}. \tag{3.19} \]
Using now the identity
\[ \sum_{d=-\infty}^{\infty} \frac{e^{i\theta d}}{d + B} = 2\pi i\, \frac{e^{-iB\theta}}{1 - e^{-2\pi i B}}, \tag{3.20} \]
which is valid for $0 \le \theta < 2\pi$, and integrating over $T^{b_1}$, one finally obtains the expressions [6]
\[ Z^{w_2(E)=0}_{\Sigma_g, S^2} = -\frac{i}{4} \Bigl[ (h_\infty^2 f_{2\infty})^{-1} e^{2pu_\infty + S^2 T_\infty} \coth\Bigl( \frac{is}{2h_\infty} \Bigr) \bigl( 2 f_{1\infty} h_\infty^2 s + 2r \bigr)^g \Bigr]_{q^0}, \tag{3.21} \]
where we used (3.20) with $\theta = 0$, as in the original computation in [3], and
\[ Z^{w_2(E)=[S^2]}_{\Sigma_g, S^2} = -\frac{1}{4} \Bigl[ (h_\infty^2 f_{2\infty})^{-1} e^{2pu_\infty + S^2 T_\infty} \csc\Bigl( \frac{s}{2h_\infty} \Bigr) \bigl( 2 f_{1\infty} h_\infty^2 s + 2r \bigr)^g \Bigr]_{q^0}, \tag{3.22} \]
where we used again the identity (3.20) with $\theta = \pi$. This changes the coth into a csc. For $g = 0$, one recovers the expressions for $S^2 \times S^2$ which were obtained in [3, 33].
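The summation identity (3.20) is the Fourier series of $e^{-iB\theta}$ on $(0, 2\pi)$, and it can be tested numerically with symmetric partial sums. The sketch below is only an illustration with assumed helper names. One point worth keeping in mind: at $\theta = 0$ the series is conditionally convergent, and its symmetric partial sums converge to $\pi\cot(\pi B)$, which differs from the right-hand side of (3.20) at $\theta = 0$ by the constant $i\pi$. At $\theta = \pi$ the right-hand side reduces to $\pi\csc(\pi B)$, which is the origin of the csc in (3.22).

```python
import cmath
import math

def lattice_sum(theta, B, N=50000):
    """Symmetric partial sum of exp(i*theta*d) / (d + B) over -N <= d <= N."""
    total = 1 / B
    for d in range(1, N + 1):
        phase = cmath.exp(1j * theta * d)
        # pair the terms d and -d so that the sum is taken symmetrically
        total += phase / (d + B) + phase.conjugate() / (B - d)
    return total

def closed_form(theta, B):
    """Right-hand side of (3.20)."""
    return 2j * math.pi * cmath.exp(-1j * B * theta) / (1 - cmath.exp(-2j * math.pi * B))
```

For real $\theta$ the conjugate of the phase is the phase of the term with $-d$, which is why the loop can reuse it.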
Let us now consider the other limiting chamber, in which the volume of $\Sigma_g$ is small. We then have $z = [\Sigma_g]$. We will also restrict ourselves to the Stiefel–Whitney classes of the form $w_2(E) = [S^2] + F[\Sigma_g]$, i.e., $F^\# = 1$. The reason for this is simple: in this case, $r^I = 1$ for the cusps at infinity and the vanishing argument of [3] applies. Therefore, there is no contribution from these cusps. In other words, $(\beta^I, [\Sigma_g]) \neq 0$ and there is a non-zero magnetic flux through the vanishing fiber $\Sigma_g$. Notice that this does not imply that the whole u-plane integral is zero. As already remarked at the end of Sect. 5 in [3], the vanishing of the monopole and dyon cusps depends crucially on the behavior of the measure. Let us then focus on the contribution from the monopole cusp (the contribution from the dyon cusp is related to that of the monopole cusp by (2.19)). It is easy to see that we can absorb the $\alpha^M$ dependence in (3.16) in a shift $d \to d - \frac{1}{2}$ plus an extra $\tau$-independent phase $-i\pi F c$. We can now apply the unfolding procedure. The shift in $d$ is crucial to the final result. There is a subtlety here, since $c$ and $d$ come in the combination $c\tau + \tilde d$, where $\tilde d = d - \frac{1}{2}$. $SL(2, \mathbb{Z})$ elements $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ with $\gamma$ odd make $c$ half-integer (and hence not zero) and those with $\delta$ even result in $\tilde d \in \mathbb{Z}$. Both problems are related and can be solved as follows. If $\gamma$ is odd, $c^\# = \alpha c + \gamma(d - \frac{1}{2}) \in \mathbb{Z} - \frac{1}{2}$. Looking back to (3.16), we see that a half-integer $c$ corresponds to having a non-zero flux through the vanishing fiber, so the integral corresponding to those values of $c$ vanishes. If $\delta$ is even, then, as $\alpha\delta - \beta\gamma = 1$ (which means in particular that $\gamma$ and $\delta$ are coprime), $\gamma$ is necessarily odd, and the integral vanishes as well. So we do not get any contribution from orbits with $c \in \mathbb{Z} + \frac{1}{2}$ or $\tilde d \in \mathbb{Z}$. Therefore, the final result is the same as in Eq. (C.2) in [3] (with minor changes):
\[ -2\sqrt{2}\, e^{2\pi i \lambda_0^2} \Bigl[ \int_{T^{b_1}} f_M h_M \sum_{d=-\infty}^{\infty} \frac{1}{d - \frac{1}{2} + A_M(z)} \Bigr]_{q_D^0}, \tag{3.23} \]
where
\[ A_M(z) = \frac{t}{2\pi h_M} - \frac{8}{\pi} f_{2M}\, \Omega \tag{3.24} \]
and $f_M$ is the magnetic version of (2.10):
\[ f_M h_M = \frac{\sqrt{2}}{64\pi}\, h_M^{2(g-1)} f_{2M}^{-1}\, e^{2pu_M + S^2 T_M} \exp\Bigl[ 2 \Bigl( f_{1M} s + \frac{r}{h_M^2} \Bigr) \Omega \Bigr]. \tag{3.25} \]
Using the identity (3.20) with $B = A_M(z) - 1/2$, we finally obtain the final result for the u-plane integral in the limiting chamber where the volume of $\Sigma_g$ is small:
\[ Z^{w_2(E)}_{u, \Sigma_g} = -4\sqrt{2}\, \pi i\, e^{2\pi i \lambda_0^2} \Bigl\{ \Bigl[ \int_{T^{b_1}} \frac{f_M h_M}{1 + e^{-2\pi i A_M}} \Bigr]_{q_D^0} + e^{-2\pi i \lambda_0^2} \Bigl[ \int_{T^{b_1}} \frac{f_{dy} h_{dy}}{1 + e^{-2\pi i A_{dy}}} \Bigr]_{q_{dy}^0} \Bigr\}. \tag{3.26} \]
To obtain an explicit expression for (3.26), we have to integrate over the Jacobian $T^{b_1}$. To do this, we have to expand the exponential involving $\Omega$ in the numerators of (3.26). Notice that
\[ \frac{1}{1 + e^{t+x}} = \frac{1}{1 + e^{t}} + \sum_{m \ge 1} \mathrm{Li}_{-m}(-e^{t})\, \frac{x^m}{m!}, \tag{3.27} \]
where $\mathrm{Li}_n$ is the polylogarithm of index $n$. We have taken into account the fact that, for negative index, the polylogarithm is given by
\[ \mathrm{Li}_{-m}(e^{t}) = \Bigl( \frac{d}{dt} \Bigr)^{|m|} \frac{1}{1 - e^{t}}, \tag{3.28} \]
and using this, the identity (3.27) follows immediately. One also has:
\[ \mathrm{Li}_{-m}(-1) = \frac{1}{2} E_m = -\frac{1}{m+1} \bigl( 2^{m+1} - 1 \bigr) B_{m+1}, \tag{3.29} \]
where $E_m$ are the Euler numbers and $B_m$ are the Bernoulli numbers. To write the final expression for the u-plane integral, we also define the normalized modular forms:
\[ \tilde h_M = 2i h_M, \qquad \tilde f_{1M} = -8 f_{1M}, \qquad \tilde f_{2M} = 2^9 i f_{2M}. \tag{3.30} \]
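The polylogarithm identities (3.27) to (3.29), and the closed form for $\mathrm{Li}_{-1}$ used later in the genus-one example, can be verified with a few lines of exact and floating-point arithmetic. The sketch below computes $\mathrm{Li}_{-m}(z)$ by applying $z\,d/dz$ repeatedly to $\mathrm{Li}_0(z) = z/(1-z)$, written as a polynomial in $w = z/(1-z)$ (so that $z\,d/dz = w(1+w)\,d/dw$). The Bernoulli numbers come from the standard recursion with $B_1 = -1/2$. Function names are illustrative, and the part of (3.29) involving $E_m$ is not tested because it depends on the normalization of the Euler numbers.

```python
from fractions import Fraction
from math import comb, cosh, exp, factorial

def polylog_neg(m, z):
    """Li_{-m}(z) for integer m >= 0, as a polynomial in w = z / (1 - z)."""
    coeffs = [0, 1]  # Li_0 = w
    for _ in range(m):
        deriv = [k * c for k, c in enumerate(coeffs)][1:]  # p'(w)
        new = [0] * (len(deriv) + 2)
        for k, c in enumerate(deriv):  # multiply p'(w) by w + w^2
            new[k + 1] += c
            new[k + 2] += c
        coeffs = new
    w = z / (1 - z)
    return sum(c * w ** k for k, c in enumerate(coeffs))

def bernoulli(n):
    """List of Bernoulli numbers B_0, ..., B_n (convention B_1 = -1/2)."""
    B = [Fraction(1)]
    for k in range(1, n + 1):
        B.append(-sum(comb(k + 1, j) * B[j] for j in range(k)) / (k + 1))
    return B

def taylor_rhs(t, x, order=12):
    """Right-hand side of (3.27), truncated at the given order in x."""
    return 1 / (1 + exp(t)) + sum(
        polylog_neg(m, -exp(t)) * x ** m / factorial(m) for m in range(1, order + 1)
    )
```

Passing a `Fraction` as `z` keeps the evaluation exact, which is convenient for the comparison with the Bernoulli numbers at $z = -1$.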
The u-plane integral for the Stiefel–Whitney class $w_2(E) = [S^2] + F[\Sigma_g]$, $F = 0, 1$, is then given by:
\[
\begin{aligned}
Z^F_{u, \Sigma_g}(p, r, s, t) ={}& -2^8 (-1)^F \Bigl[ e^{2pu_M + 2st T_M} \sum_{m=1}^{g} \binom{g}{m} (-1)^m 2^{-7m} (\tilde h_M^2 \tilde f_{2M})^{m-1} \Bigl( 2r + \frac{s}{16} \tilde h_M^2 \tilde f_{1M} \Bigr)^{g-m} \mathrm{Li}_{-m}\bigl( -e^{2t \tilde h_M^{-1}} \bigr) \Bigr]_{q_D^0} \\
& - 2^8 i^{1-g} \Bigl[ e^{-2pu_M - 2st T_M} \sum_{m=1}^{g} \binom{g}{m} (-1)^m 2^{-7m} (\tilde h_M^2 \tilde f_{2M})^{m-1} \Bigl( 2ir - \frac{is}{16} \tilde h_M^2 \tilde f_{1M} \Bigr)^{g-m} \mathrm{Li}_{-m}\bigl( -e^{-2it \tilde h_M^{-1}} \bigr) \Bigr]_{q_D^0},
\end{aligned}
\tag{3.31}
\]
where we have used that $\lambda_0^2 = w_2(E)^2/4 \pmod{\mathbb{Z}} = F/2 \pmod{\mathbb{Z}}$. The first piece corresponds to the monopole cusp at $\tau = 0$, while the second one corresponds to the dyon cusp at $\tau = 2$. Notice that in the above expression we have not included the term $m = 0$ in the sum which comes from the expansion (3.27). This is due to the fact that this term has an overall $\tilde f_{2M}^{-1} = q_D + \cdots$. As the rest of the modular forms are analytic in $q_D$, there cannot be any $q_D^0$ term in the expansion, so this term does not contribute.

3.2. The Seiberg–Witten contribution. Let us now turn to the Seiberg–Witten contribution. As we have already remarked, the Seiberg–Witten invariants vanish in the chamber of small volume for $S^2$, and the Seiberg–Witten invariants in the chamber where $\mathrm{vol}(\Sigma_g) \to 0$ are given by the sum of wall-crossing terms of the form (2.22). But these terms match the wall-crossings at the cusps $\tau = 0, 2$ of the u-plane integral. Therefore, the SW contribution to the Donaldson invariants is given by the sum of wall-crossings (2.21). Which walls do we cross in going from the limiting chamber where $\mathrm{vol}(S^2) \to 0$ to the chamber where $\mathrm{vol}(\Sigma_g) \to 0$? Any $\lambda$ with $\lambda^2 < 0$ and $d_\lambda = \lambda^2 - (2\chi + 3\sigma)/4 = \lambda^2 + 2(g-1) \ge 0$ defines a wall. If we set $\lambda = -b[\Sigma_g] + a[S^2]$, $a, b \in \mathbb{Z}$, these conditions give:
\[ 0 < ab \le g - 1. \tag{3.32} \]
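The wall conditions leading to (3.32) are easy to enumerate explicitly. The sketch below works in the basis $([S^2], [\Sigma_g])$ with the intersection form (3.1), so that $\lambda = a[S^2] - b[\Sigma_g]$ has $\lambda^2 = -2ab$, and uses the period point (3.5) to locate each wall, i.e. the value of $\theta$ at which $(\lambda, \omega(\theta)) = 0$. The function names and the tuple representation of cohomology classes are assumptions made for illustration.

```python
import math

def pair(u, v):
    """Intersection pairing in the basis ([S^2], [Sigma_g]) with form II^{1,1}."""
    return u[0] * v[1] + u[1] * v[0]

def period_point(theta):
    """Period point omega(theta) of (3.5) as a coefficient pair."""
    s = 1 / math.sqrt(2)
    return (s * math.exp(theta), s * math.exp(-theta))

def sw_walls(g):
    """Pairs (a, b) for which lambda = a[S^2] - b[Sigma_g] defines a wall at genus g."""
    walls = []
    bound = g - 1
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            lam = (a, -b)
            norm = pair(lam, lam)               # equals -2ab
            if norm < 0 and norm + 2 * (g - 1) >= 0:
                walls.append((a, b))
    return walls

def wall_position(a, b):
    """Value of theta at which (lambda, omega(theta)) vanishes; requires ab > 0."""
    return 0.5 * math.log(a / b)
```

For example, `sw_walls(1)` is empty and `sw_walls(2)` contains only $(a, b) = \pm(1, 1)$, which define the same wall at $\theta = 0$. The volume of $S^2$ is `pair(period_point(theta), (1, 0))`, which becomes small as $\theta \to +\infty$.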
We then see that, for a fixed genus $g$, there is only a finite number of walls for the Seiberg–Witten contribution. This is in sharp contrast with the wall-crossing terms coming from the cusps at infinity, analyzed in [6], where there is an infinite number of walls. Notice that $\lambda$ and $-\lambda$ define the same wall, but the wall-crossing term has opposite sign. We then obtain the following expression for the Seiberg–Witten contribution:
\[
\begin{aligned}
Z^F_{SW, \Sigma_g}(p, r, s, t) ={}& 2^8 (-1)^F \sum_{ab > 0} \mathrm{sgn}(a) \Bigl[ q_D^{ab} (-1)^{aF + b} (\tilde h_M^2 \tilde f_{2M})^{-1} e^{2pu_M + 2st T_M} \\
&\qquad \cdot \Bigl( \frac{s}{16} \tilde h_M^2 \tilde f_{1M} + 2^{-7} b\, \tilde h_M^2 \tilde f_{2M} + 2r \Bigr)^g e^{\frac{2as - 2bt}{\tilde h_M}} \Bigr]_{q_D^0} \\
& + 2^8 i^{1-g} \sum_{ab > 0} \mathrm{sgn}(a) \Bigl[ q_D^{ab} (-1)^{aF + b} (\tilde h_M^2 \tilde f_{2M})^{-1} e^{-2pu_M - 2st T_M} \\
&\qquad \cdot \Bigl( -\frac{is}{16} \tilde h_M^2 \tilde f_{1M} + 2^{-7} b\, \tilde h_M^2 \tilde f_{2M} + 2ir \Bigr)^g e^{-\frac{2ias - 2ibt}{\tilde h_M}} \Bigr]_{q_D^0},
\end{aligned}
\tag{3.33}
\]
where again the first piece corresponds to the monopole cusp, and the second piece to the dyon cusp. The $\mathrm{sgn}(a)$ appears because, as we explained before, the wall-crossing terms corresponding to $\lambda$ and $-\lambda$ have opposite sign. The overall sign can be fixed by looking at the sign of the Seiberg–Witten invariants in the wall-crossing formula of [30]. Notice that the constraint $g - 1 \ge ab$ is in fact redundant, as the whole expression above vanishes if this constraint is not fulfilled. This is due to the fact that the most negative power of $q_D$ in the above expression comes from $\tilde f_{2M}^{\,g-1} = q_D^{1-g}(1 + O(q_D))$. Therefore, if $ab > g - 1$, there are no terms in $q_D^0$. The Donaldson–Witten generating function in the chamber where $\Sigma_g$ is small, and for $w_2(E) = [S^2] + F[\Sigma_g]$, is then given by the sum of (3.33) and (3.31), and we will denote it by $Z^F_{\Sigma_g}(p, r, s, t)$. Notice that $Z^F_{\Sigma_g}(p, r, s, t)$ can also be computed by adding the generating function in the chamber of small volume for $S^2$ and the infinite sum of wall-crossings. The resulting expressions were given in [6], in terms of Weierstrass $\sigma$ functions. The fact that the generating functions in [6] are equal to $Z^F_{\Sigma_g}(p, r, s, t)$ gives a remarkable identity. They encode the information about the Donaldson invariants in two different ways, that we can call “magnetic” and “electric.” We have checked that they give indeed the same invariants in many cases, but an analytic proof would be rather formidable. It is interesting to notice that the u-plane integral can be written as
\[ -2^8 (-1)^F \Bigl[ e^{2pu_M + 2st T_M} (\tilde h_M^2 \tilde f_{2M})^{-1} \Bigl( 2r + \frac{s}{16} \tilde h_M^2 \tilde f_{1M} - 2^{-8} \tilde f_{2M} \tilde h_M^3 \frac{d}{dt} \Bigr)^g \sum_{n=0}^{\infty} (-1)^n e^{\frac{2nt}{\tilde h_M}} \Bigr]_{q_D^0}, \tag{3.34} \]
where we have only written the monopole contribution. This piece has the same structure as the Seiberg–Witten contribution (3.33), but where the sum is now over an infinite number of basic classes of the form $\lambda = n[\Sigma_g]$, i.e. with $b = -n \le 0$ and $a = 0$. Similar considerations have been made in [33] for simply-connected manifolds.
3.3. Some properties and examples. An interesting corollary of our computation is that the manifold $\Sigma_g \times S^2$ is of $g$-th finite type, when one considers the chamber of small volume for $\Sigma_g$ and a Stiefel–Whitney class such that $(w_2(E), \Sigma_g) \neq 0$. This is an immediate consequence of our expressions. To see it, notice that both for the u-plane and the SW contributions, the only source of a negative power of $q_D$ is the term $\tilde f_{2M}$. The rest of the modular forms involved in our formulae are analytic in $q_D$. On the other hand, the maximum possible power of $\tilde f_{2M}$ is precisely $g - 1$. As $u_M = 1 + 32 q_D + \cdots$, an insertion of $(u_M \pm 1)^g$ in the monopole (respectively, dyon) contribution to (3.34) or (3.33) will make the generating function vanish. But an insertion of $u_M \pm 1$ is equivalent to acting with
\[ \frac{\partial}{\partial p} \pm 2 \tag{3.35} \]
on the Donaldson–Witten generating function. We then find that
\[ \Bigl( \frac{\partial^2}{\partial p^2} - 4 \Bigr)^g Z^F_{\Sigma_g}(p, r, s, t) = 0. \tag{3.36} \]
$g$ is in fact the minimum power we need to kill the generating function, since
\[ \Bigl( \frac{\partial^2}{\partial p^2} - 4 \Bigr)^{g-1} Z^F_{\Sigma_g}(p, r, s, t) = (-1)^{F+g-1}\, 2^g \Bigl[ \mathrm{Li}_{-g}(-e^{2t})\, e^{2p + st} + (-1)^F i^{1-g}\, \mathrm{Li}_{-g}(-e^{-2it})\, e^{-2p - st} \Bigr]. \tag{3.37} \]
We conclude that $\Sigma_g \times S^2$ is of $g$-th finite type for the chamber and the Stiefel–Whitney classes under consideration. This was proved in [12] for $g = 1$. We will now give explicit expressions for the Donaldson–Witten generating function at low genus. For $g = 1$, the Seiberg–Witten contribution vanishes as there are no walls (i.e., the conditions (3.32) have no solution). In this case, only the u-plane contributes. The only polylogarithm involved here is
\[ \mathrm{Li}_{-1}(-e^{t}) = -\frac{e^{t}}{(1 + e^{t})^2} = -\frac{1}{4 (\cosh(t/2))^2}. \tag{3.38} \]
It is clear that for $g = 1$ we only need the first term in the expansion of the modular forms. We then find
\[ Z^F_{\Sigma_1}(p, r, s, t) = Z^F_{u, \Sigma_1}(p, r, s, t) = -\frac{1}{2} (-1)^F \Bigl[ \frac{e^{2p + st}}{\cosh^2(t)} + (-1)^F \frac{e^{-2p - st}}{\cosh^2(-it)} \Bigr]. \tag{3.39} \]
As in this case the manifold has a simple type behavior, we can define the Donaldson series $D^F = Z^F_{DW}|_{p=0} + \frac{1}{2} \frac{\partial}{\partial p} Z^F_{DW}|_{p=0}$. If we write it as a functional on $\mathrm{Sym}(H_2(X, \mathbb{Z}))$, we find from (3.39):
\[ D^F = (-1)^{(e^2 - 2 e \cdot F)}\, \frac{e^{Q/2}}{\cosh^2 F}, \tag{3.40} \]
where $e = w_2(E)$, $F = [\Sigma_1]$, and $Q$ is the intersection form. This is in perfect agreement with Theorem 1.3 in [12].
It is clear that, as we consider larger values of $g$, the expression for the Donaldson–Witten generating function becomes more and more complicated. The main source of this complexity is the $t$-dependence in the u-plane integral. For example, for $g = 2$ the monopole contribution to the u-plane integral is given by:
\[
\begin{aligned}
Z^F_{u,M} = -(-1)^F \frac{e^{2p + st + 2t}}{16 (1 + e^{2t})^4} \Bigl( & 5 - 5 e^{4t} + 16 \bigl( -1 + e^{4t} \bigr) p + 128 \bigl( 1 + e^{2t} \bigr)^2 r \\
& + 4s + 8 e^{2t} s + 4 e^{4t} s - 2t + 8 e^{2t} t - 2 e^{4t} t - 4st + 4 e^{4t} st \Bigr),
\end{aligned}
\tag{3.41}
\]
while the Seiberg–Witten contribution is
\[ Z^F_{SW,M} = (-1)^F\, \frac{e^{2p + st}}{64} \bigl( e^{2t - 2s} - e^{2s - 2t} \bigr). \tag{3.42} \]
4. Application 1: Intersection Pairings on the Moduli Space of Stable Bundles

4.1. The moduli space of stable bundles. Let $\Sigma_g$ be a Riemann surface of genus $g$. The moduli space of flat $SO(3)$ connections on $\Sigma_g$, with Stiefel–Whitney class $w_2 \neq 0$, turns out to be a very rich and interesting space. One of the reasons for this richness is the fact that this moduli space can be understood in many different ways: using the Hitchin–Kobayashi correspondence, we can think about this space as the moduli space of rank two, odd degree stable bundles over $\Sigma_g$ with fixed determinant. On the other hand, due to the classical theorem of Narasimhan and Seshadri, we can identify this moduli space with the representations in $SU(2)$ of the fundamental group of the punctured Riemann surface $\Sigma_g \setminus D_p$, where $D_p$ is a small disk around the puncture $p$, and with holonomy $-1$ around $p$ (the fact that we require a non-trivial holonomy is due precisely to the non-zero Stiefel–Whitney class). In any case, this moduli space, that we will denote by $\mathcal{M}_g$, is a smooth projective variety of (real) dimension $6g - 6$. Similarly, we can consider the moduli space of flat $SU(2)$ connections, i.e. with $w_2 = 0$. This moduli space can be identified with the moduli space of stable rank two vector bundles of even degree and it is singular. We will denote it by $\mathcal{M}_g^+$. The cohomology ring of $\mathcal{M}_g$ can be studied by using a two-dimensional version of the $\mu$ map which arises in Donaldson theory. This map sends homology classes of $\Sigma_g$ to cohomology classes of $\mathcal{M}_g$. The generators of $H_*(\Sigma_g)$ give in fact a set of generators in $H^{4-*}(\mathcal{M}_g)$ that are usually taken as follows [16, 25]:
\[
\begin{aligned}
\alpha &= 2\mu(\Sigma_g) \in H^2(\mathcal{M}_g), \\
\psi_i &= \mu(\gamma_i) \in H^3(\mathcal{M}_g), \\
\beta &= -4\mu(x) \in H^4(\mathcal{M}_g),
\end{aligned}
\tag{4.1}
\]
where $x$ is the class of the point in $H_0(\Sigma_g)$. We also define the $Sp(2g, \mathbb{Z})$-invariant cohomology class in $H^6(\mathcal{M}_g)$,
\[ \gamma = -2 \sum_{i=1}^{g} \psi_i \psi_{i+g}. \tag{4.2} \]
One can show that the moduli space of anti-self-dual connections on $\Sigma_g \times S^2$ with instanton number zero is isomorphic to the moduli space of flat connections on $\Sigma_g$. In particular, the generators of the cohomology in (4.1) correspond precisely to the Donaldson cohomology classes, and we have that
$$
\alpha = 2I(\Sigma_g), \qquad \psi_i = I(\gamma_i), \qquad \beta = -4\mathcal{O}, \tag{4.3}
$$
while the invariant form $\gamma$ corresponds to (3.3).

4.2. The intersection pairings. To determine the ring structure of the cohomology of $M_g$, once a set of generators has been found, one only has to find a set of relations. Due to Poincaré duality, the intersection pairings of the generators in (4.1) give all the information needed to find the relations. In other words, to find the structure of the cohomology ring it is enough to evaluate the intersection pairings
$$
\langle \alpha^{m}\beta^{n}\gamma^{p}\rangle_{M_g} = \int_{M_g} \alpha^{m}\wedge\beta^{n}\wedge\gamma^{p}, \tag{4.4}
$$
as all the intersection pairings involving the $\psi_i$'s can be reduced to (4.4) by $Sp(2g, \mathbb{Z})$ symmetry. These numbers can be considered as the two-dimensional analogs of Donaldson invariants, which in fact can be formulated as correlation functions of a two-dimensional topological gauge theory [17, 34]. In [16], Thaddeus computed (4.4) in two steps: first, he obtained a recursive relation which allows one to eliminate the $\gamma$ classes. Second, he computed the pairings $\langle\alpha^{m}\beta^{n}\rangle$ using Verlinde's formula [15]. The same steps are followed by Witten in [17]. We will also prove first the recursion relation, and then compute the remaining pairings. To do this, we use the Donaldson invariants of product ruled surfaces. The relation between the pairings in (4.4) and the Donaldson invariants is as follows (see [24], Remark 13):
$$
\langle \alpha^{m}\beta^{n}\gamma^{p}\rangle_{M_g} = F([S^2])\, D^{w_2=[S^2]}_{\Sigma_g\times S^2}\big((2\Sigma_g)^{m}(-4x)^{n}\iota^{p}\big). \tag{4.5}
$$
In this equation, $F(w) = (-1)^{(K\cdot w + w^{2})/2}$, where $w$ is an integer lift of the second Stiefel–Whitney class. This sign appears for the following reason. The Donaldson invariants that appear on the right-hand side are defined using the natural orientation of the moduli space of anti-self-dual connections. For algebraic surfaces, this moduli space can be realized as the Gieseker compactification of the moduli space of rank two stable bundles $M(c_1, c_2)$, where $c_1 = w$. This complex space has a natural orientation induced by its complex structure, and the difference between these orientations in the computation of the invariants is given by $F(w)$. Now, when $c_2 = 0$ and $c_1 = [S^2]$ (i.e., there is a nonzero flux through the Riemann surface $\Sigma_g$), then $M(0, c_1) = M_g$, and the intersection pairings on the left-hand side are in fact computed with the complex orientation. In this case, $F([S^2]) = -1$. Notice that the pairing above is only different from zero when $2m + 4n + 6p = 6g - 6$.

The Donaldson invariants in (4.5) can be computed in any chamber, since for $c_2 = 0$ and $w_2(E) = [S^2]$ one has $p_1 = 0$ and therefore there are no walls. The computation turns out to be much simpler in the chamber of small volume for $S^2$. One can in principle make the computation in the other limiting chamber, using $Z_g^{F=0}$. This turns out to be very complicated analytically, although we have explicitly checked that the answers agree in many cases. Physically, the computation of Thaddeus' intersection pairings (as well as the computation of Donaldson invariants involving low
instanton numbers) is rather semiclassical and is best performed in the electric frame, while the “magnetic” expression $Z_g^{F=0}$ gives information about global aspects of the generating function which are useful, for example, to compute the eigenvalue spectrum of Fukaya–Floer cohomology. It has also been noticed in [22] that in fact, to compute the intersection pairings on $M_g$, the chamber of small volume for $S^2$ is more natural, as the topological reduction in this chamber gives the twisted $N = 2$ Yang–Mills theory in two dimensions in a direct way. We will then extract these pairings from the Donaldson invariants given by (3.22).

The first thing that we can prove is the recursive relation of Thaddeus. Using the explicit formula (3.22), one easily sees that
$$
\frac{\partial}{\partial r} Z^{w_2(E)=[S^2]}_{g,S^2} = 2g\, Z^{w_2(E)=[S^2]}_{g-1,S^2}, \tag{4.6}
$$
and this implies, using (4.5), that
$$
\langle \alpha^{m}\beta^{n}\gamma^{p}\rangle_{M_g} = 2g\,\langle \alpha^{m}\beta^{n}\gamma^{p-1}\rangle_{M_{g-1}}, \tag{4.7}
$$
which is precisely Thaddeus' recursive relation. We now compute the intersection pairings $\langle\alpha^{m}\beta^{n}\rangle$. To do this, we use the expansion:
$$
\csc z = \sum_{k=0}^{\infty} (-1)^{k+1}\,(2^{2k}-2)\,B_{2k}\,\frac{z^{2k-1}}{(2k)!}, \tag{4.8}
$$
where $B_{2k}$ are the Bernoulli numbers. We have to extract now the powers $s^{m}p^{n}$ from the generating function (3.22). Notice that a power $s^{g}$ comes already from the overall $g$-dependent factor in (3.22). We then have to extract the power $s^{m-g}$ from the series expansion in $s/2h$. Taking now into account the comparison factors from (4.5), and the dimensional constraint $2m + 4n = 6g - 6$, one finds
$$
\langle\alpha^{m}\beta^{n}\rangle = \frac{1}{4}\,2^{m}(-4)^{n}\,i^{\,m-g+1}\,2^{2g+n-m}\,m!\,\frac{2^{m-g+1}-2}{(m-g+1)!}\,B_{m-g+1}\,\Big[h_\infty^{3g-m-2}\,u_\infty^{n}\,f_{1\infty}^{\,g}\,f_{2\infty}^{-1}\Big]_{q^{0}}. \tag{4.9}
$$
Fortunately, only the leading term contributes in the $q$-expansion involved in (4.9). One finally obtains
$$
\langle\alpha^{m}\beta^{n}\rangle = (-1)^{g}\,\frac{m!}{(m-g+1)!}\,2^{2g-2}\,(2^{m-g+1}-2)\,B_{m-g+1}, \tag{4.10}
$$
which is exactly Thaddeus' formula for the intersection pairings. This expression, as well as the recursive relation (4.7), has been obtained by Witten in [17] by exploiting the relation to physical two-dimensional Yang–Mills theory. A derivation of (4.10) based on gluing techniques has been worked out in [35].

One can formally consider the intersection pairings in the case of even degree, and extract them from the Donaldson invariants for vanishing Stiefel–Whitney class. These invariants are given by (3.21). Using now the expansion
$$
\coth z = \sum_{k=0}^{\infty} 2^{2k}\,B_{2k}\,\frac{z^{2k-1}}{(2k)!}, \tag{4.11}
$$
and taking into account that $F(0) = 1$, one finds
$$
\langle\alpha^{m}\beta^{n}\rangle = (-1)^{g}\,\frac{m!}{(m-g+1)!}\,2^{m+g-1}\,B_{m-g+1}. \tag{4.12}
$$
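The closed formulas (4.10) and (4.12) are easy to evaluate exactly. The following is a minimal sketch, not part of the original derivation: the function names are hypothetical, the Bernoulli numbers are generated with the convention $B_1 = -1/2$, and the range check $g - 1 \le m \le 3g - 3$ is an assumption made so that $(m-g+1)!$ and $n = (3g-3-m)/2$ are defined.

```python
from fractions import Fraction
from math import comb, factorial


def bernoulli(n):
    """Return [B_0, ..., B_n] as exact fractions (convention B_1 = -1/2)."""
    B = [Fraction(1)] + [Fraction(0)] * n
    for m in range(1, n + 1):
        # Standard recurrence: sum_{j=0}^{m} C(m+1, j) B_j = 0 for m >= 1.
        B[m] = -sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1)
    return B


def _index(g, m):
    """Return j = m - g + 1 after checking that 2m + 4n = 6g - 6 has a solution n >= 0.

    The lower bound m >= g - 1 is an assumption of this sketch.
    """
    j = m - g + 1
    if j < 0 or m > 3 * g - 3 or (3 * g - 3 - m) % 2:
        raise ValueError("need g - 1 <= m <= 3g - 3 with 3g - 3 - m even")
    return j


def pairing_odd(g, m):
    """<alpha^m beta^n> on M_g according to (4.10), with n = (3g - 3 - m) / 2."""
    j = _index(g, m)
    return ((-1) ** g * Fraction(factorial(m), factorial(j))
            * 2 ** (2 * g - 2) * (2 ** j - 2) * bernoulli(j)[j])


def pairing_even(g, m):
    """Formal even-degree pairing according to (4.12)."""
    j = _index(g, m)
    return ((-1) ** g * Fraction(factorial(m), factorial(j))
            * 2 ** (m + g - 1) * bernoulli(j)[j])


if __name__ == "__main__":
    for g in (2, 3):
        for m in range(g - 1, 3 * g - 2, 2):
            print(g, m, pairing_odd(g, m), pairing_even(g, m))
```

For $g = 2$ the sketch gives $\langle\alpha^{3}\rangle = 4$ from (4.10), which can be compared with the values quoted in [16].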
These are also the pairings that one obtains using two-dimensional gauge theory [17, 34]. However, one has to be extremely careful about the mathematical interpretation of (4.12), since the space $M_g^{+}$ is singular for $g \ge 3$. To define the intersection pairings one then has to use intersection cohomology [36] or consider a partial desingularization of $M_g^{+}$ that has only orbifold singularities [37]. The computation of the intersection pairings for the partial desingularization was performed in [37] using the strategy of [16]. The result agrees with (4.12) for $m \ge g$, but for $m < g$ there are correction terms.⁷ It would be interesting to see if these corrections can be obtained using physical methods. In this sense, the approach in [17] seems more appropriate, as the singular character of the moduli space shows up as a non-regular term in the partition function.

4.3. Relation to Verlinde's formula. The derivation of the intersection pairings (4.10) in [16] was based on the $SU(2)$ Verlinde formula for the WZW model [15]. In fact, one can reverse the logic in [16] and give a derivation of Verlinde's formula from the intersection pairings. In this section, we will closely follow the arguments given in [16] for $M_g$, and we will also show that they can be formally extended to $M_g^{+}$.

Verlinde's formula gives an explicit expression for the number of conformal blocks in CFT. In the case of the $SU(2)$ WZW model, the space of conformal blocks at level $k$ (where $k$ is a positive integer) can be identified with the space of sections of the line bundle $L^{k/2}$, where $L$ is a fixed line bundle over $M_g$ which generates $\mathrm{Pic}(M_g) \simeq \mathbb{Z}$. The canonical bundle of $M_g$ is given by $L^{-2}$. We are interested in computing $\dim H^{0}(L^{k/2}, M_g)$. As explained in [16], this can be done using Hirzebruch–Riemann–Roch. The canonical bundle of $M_g$ is negative, and by the Kodaira vanishing theorem one has that $H^{i}(L^{k/2}, M_g) = 0$ for $i > 0$. We then have
$$
\dim H^{0}(L^{k/2}, M_g) = \chi(L^{k/2}, M_g) = \int_{M_g} \mathrm{ch}(L^{k/2})\,\mathrm{td}(M_g). \tag{4.13}
$$
Notice that the cohomology classes involved in Hirzebruch–Riemann–Roch can be expressed in terms of the generators of the cohomology ring (4.1), and therefore (4.13) can be computed in principle once the intersection pairings are known. Explicit expressions for the characteristic classes of the tangent bundle to $M_g$ have been obtained by Newstead [38] (see also [39]) and read $c_1(M_g) = 2\alpha$, $p(M_g) = (1+\beta)^{2g-2}$. One also has $c_1(L) = \alpha$, and this gives:
$$
\dim H^{0}(L^{k/2}, M_g) = \int_{M_g} \exp\Big(\frac{k+2}{2}\,\alpha\Big)\left(\frac{\sqrt{\beta}/2}{\sinh(\sqrt{\beta}/2)}\right)^{2g-2}. \tag{4.14}
$$
If we expand the characteristic classes on the right-hand side of (4.14), we get a polynomial in $k + 2$ of the form
$$
\sum_{m=0}^{3g-3} \frac{P_{3g-3-m}}{2^{3g-3}\,m!}\,(k+2)^{m}\,\langle\alpha^{m}\beta^{(3g-3-m)/2}\rangle_{M_g}, \tag{4.15}
$$
7 We are grateful to Y.-H. Kiem for explaining these issues to us.
where $P_n$ is the coefficient of $x^{n}$ in the series expansion of $(x/\sinh x)^{2g-2}$. Using now (4.10), it is easy to see that this is the coefficient of $x^{3g-3}$ in
$$
\Big(-\frac{k+2}{2}\,x\Big)^{g-1}\Big(\frac{x}{\sinh x}\Big)^{2g-2}\,\frac{(k+2)x}{\sinh (k+2)x}. \tag{4.16}
$$
An argument due to Zagier and explained in [16], Prop. (19), shows that this coefficient is given by
$$
\dim H^{0}(L^{k/2}, M_g) = \Big(\frac{k+2}{2}\Big)^{g-1}\sum_{n=1}^{k+1}\frac{(-1)^{n+1}}{\big(\sin\frac{n\pi}{k+2}\big)^{2g-2}}, \tag{4.17}
$$
which is precisely Verlinde's formula in the case of odd degree.

As we pointed out before, the moduli space of rank two stable bundles of even degree is a singular space, and in principle the intersection numbers are not well-defined. The answer (4.12) should be considered as a regularization of these pairings in the context of the $u$-plane integral. If we assume that the Riemann–Roch formula is still valid, one obtains in fact the usual Verlinde formula for $SU(2)$.⁸ If we consider the expression (4.15) with the intersection pairings given in (4.12), one finds that the dimension of $H^{0}(L^{k/2}, M_g^{+})$ is now given by the coefficient of $x^{3g-3}$ in
$$
-\Big(-\frac{k+2}{2}\,x\Big)^{g-1}\Big(\frac{x}{\sinh x}\Big)^{2g-2}\,(k+2)x\,\coth (k+2)x. \tag{4.18}
$$
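The coefficient extractions in (4.16) and (4.18) can be checked for small genus and level with exact power-series arithmetic. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, the series are truncated at order $2g-2$ (the factor $x^{g-1}$ is stripped off by hand), and the results are compared numerically with the alternating sum (4.17) and with the same sum without the alternating sign, which is the usual untwisted $SU(2)$ Verlinde sum.

```python
from fractions import Fraction
from math import factorial, pi, sin


def mul(a, b, n):
    """Product of two truncated power series (coefficient lists), kept to order n."""
    c = [Fraction(0)] * (n + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j > n:
                break
            c[i + j] += ai * bj
    return c


def inverse(a, n):
    """Reciprocal of a power series with a[0] != 0, to order n."""
    inv = [Fraction(1) / a[0]] + [Fraction(0)] * n
    for k in range(1, n + 1):
        inv[k] = -sum(a[j] * inv[k - j] for j in range(1, k + 1)) / a[0]
    return inv


def x_over_sinh(scale, n):
    """Series of y / sinh(y) with y = scale * x, to order n in x."""
    sinh_over_y = [Fraction(scale) ** k / factorial(k + 1) if k % 2 == 0 else Fraction(0)
                   for k in range(n + 1)]
    return inverse(sinh_over_y, n)


def cosh_series(scale, n):
    """Series of cosh(scale * x), to order n in x."""
    return [Fraction(scale) ** k / factorial(k) if k % 2 == 0 else Fraction(0)
            for k in range(n + 1)]


def rr_dimension(g, k, even_degree=False):
    """Coefficient of x^(3g-3) in (4.16), or in (4.18) when even_degree is True."""
    n, K = 2 * g - 2, k + 2
    series = [Fraction(1)] + [Fraction(0)] * n
    base = x_over_sinh(1, n)
    for _ in range(2 * g - 2):
        series = mul(series, base, n)
    last = x_over_sinh(K, n)
    if even_degree:
        last = mul(last, cosh_series(K, n), n)  # y coth y = cosh y * (y / sinh y)
    coeff = mul(series, last, n)[n]
    prefactor = Fraction(-K, 2) ** (g - 1)
    return -prefactor * coeff if even_degree else prefactor * coeff


def verlinde_sum(g, k, even_degree=False):
    """Trigonometric sum: (4.17) for odd degree, the untwisted sum for even degree."""
    K = k + 2
    total = 0.0
    for m in range(1, K):
        sign = 1 if even_degree or m % 2 == 1 else -1
        total += sign / sin(m * pi / K) ** (2 * g - 2)
    return (K / 2) ** (g - 1) * total


if __name__ == "__main__":
    for g, k, even in [(2, 2, False), (3, 2, False), (2, 1, True)]:
        print(g, k, even, rr_dimension(g, k, even), round(verlinde_sum(g, k, even), 9))
```

Using exact fractions for the coefficient extraction and floating point only for the trigonometric sum keeps the two sides of the comparison independent of each other.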
Going through the argument in [16], Prop. (19), one easily obtains:
$$
\dim H^{0}(L^{k/2}, M_g^{+}) = \Big(\frac{k+2}{2}\Big)^{g-1}\sum_{n=1}^{k+1}\frac{1}{\big(\sin\frac{n\pi}{k+2}\big)^{2g-2}}, \tag{4.19}
$$
which gives the right formula for the number of conformal blocks for the (untwisted) $SU(2)$ case. This computation is, however, formal, as there is no suitable Riemann–Roch formula for a singular space like $M_g^{+}$. Using the orbifold desingularization of $M_g^{+}$, one can apply the Kawasaki–Riemann–Roch formula and relate the intersection pairings to Verlinde's formula (4.19). In fact, this is how the corrections to the pairings (4.12) are obtained in [37].

8 This has also been observed in [40, 41].

5. Application 2: Fukaya–Floer Cohomology

5.1. Floer cohomology and gluing rules. The Floer (co)homology groups of three-manifolds and their relations to Donaldson invariants can be understood in a simple way using the axiomatic approach introduced by Atiyah [42], which in fact is a formalization of heuristic considerations involving path integrals. According to the axiomatic approach, a topological field theory in $3 + 1$ dimensions is essentially a functor $R$ from the category of three-dimensional manifolds to the category of complex vector spaces, $R : \mathrm{Man}(3) \to \mathrm{Vect}$, satisfying certain properties. In the case of Donaldson–Witten theory, this functor associates to any compact, oriented three-manifold $Y$ the graded vector space given by the Floer homology groups $HF_*(Y)$. These homology groups
can be defined by using Morse theory with the Chern–Simons functional on the moduli space of $SO(3)$ connections on $Y$ with second Stiefel–Whitney class $w_2 \in H^{2}(Y, \mathbb{Z})$.

We will be interested here in the gluing rules that relate the ring structure of the Floer homology to the Donaldson invariants of four-manifolds. Let us consider a four-manifold $X$ with boundary $\partial X = Y$, together with an element $z$ in $A(X)$. According to the axiomatic approach, the functor $R$ also assigns to the pair $(X, z)$ a “relative invariant” of $X$, which is an element in the Floer homology of $Y$ (i.e., $R(X, z) \in R(\partial X)$). This relative invariant can be understood in a simple way, as explained in [1], in terms of path integrals. Let $z = S_{i_1}\cdots S_{i_p}\,x^{n}\,\gamma_{j_1}\cdots\gamma_{j_q}$ be an element of $A(X)$, where $S_{i_\mu}$, $\gamma_{j_\mu}$ are two- and one-homology classes, respectively, and $x$ is the class of the point. This determines a BRST-invariant operator given by the $\mu$-map, namely
$$
A_z = I(S_{i_1})\cdots I(S_{i_p})\,\mathcal{O}^{n}\,I(\gamma_{j_1})\cdots I(\gamma_{j_q}). \tag{5.1}
$$
We then define the relative invariant through the usual correspondence operators/states in quantum field theory:
$$
R(X, z)^{w_X}(\phi_Y) = \int_{X} [D\phi]\Big|_{\phi|_Y=\phi_Y}\, e^{-S_{TYM}}\,A_z, \tag{5.2}
$$
which is a functional of the fields restricted to the boundary. In this path integral, one integrates over all the gauge fields with the Stiefel–Whitney class $w_X \in H^{2}(X, \mathbb{Z})$, where $w_X$ restricts to $w$ on $Y$. $S_{TYM}$ is the action of topological Yang–Mills theory.

There are some extra structures in $HF_*(Y)$ that will be important to our analysis. Our presentation will closely follow the excellent surveys in [20, 23]; for more details, one can see [43, 44]. First of all, there is an associative and graded commutative ring structure $HF_*(Y) \otimes HF_*(Y) \to HF_*(Y)$. Second, as in ordinary homology, one can define the dual of $HF_*(Y)$ to obtain the Floer cohomology of $Y$. Moreover, if $-Y$ denotes the manifold with opposite orientation, one has $HF^{*}(Y) \simeq HF_*(-Y)$. Finally, there is a natural, non-degenerate pairing $\langle\,,\rangle : HF^{*}(Y) \otimes HF_*(Y) \to \mathbb{C}$. When the states in the Floer cohomology are given by relative invariants, this pairing can be understood heuristically from path integral arguments. Consider two manifolds with boundary, $X_1$, $X_2$, such that $\partial X_1 = Y$ and $\partial X_2 = -Y$ (i.e., $Y$ with the opposite orientation). We can glue the manifolds together to obtain a closed four-manifold $X$. There is then a pairing between $HF_*(Y)$ and $HF_*(-Y)$ which is given by
$$
\big\langle R(X_1, z_1)^{w_{X_1}},\, R(X_2, z_2)^{w_{X_2}}\big\rangle = \int_{X} [D\phi]\, e^{-S_{TYM}}\,A_{z_1}A_{z_2}. \tag{5.3}
$$
In other words, the pairing is essentially given by Donaldson invariants of the four-manifold $X$. In order to give a precise gluing result, we have to be careful with two things: first of all, which Stiefel–Whitney class does one have to pick in order to define the Donaldson invariant on the right-hand side of (5.3)? And, in case $b_2^{+}(X) = 1$, in which chamber should we compute the Donaldson invariant? These issues are discussed in [19, 43] in detail. To answer the first question, consider a $w \in H^{2}(X, \mathbb{Z})$ such that $w|_Y = w_2$. Also consider a cohomology class $[\Sigma] \in H^{2}(X, \mathbb{Z})$, given as the Poincaré dual of a two-class which lies in the image of $i_* : H_2(Y, \mathbb{Z}) \to H_2(X, \mathbb{Z})$, and satisfying $w \cdot [\Sigma] = 1 \pmod 2$. The pair $(w, \Sigma)$ is called in [19] an allowable pair. One then defines
$$
D_X^{(w,\Sigma)} = D_X^{w} + D_X^{w+[\Sigma]}. \tag{5.4}
$$
The gluing theorem of [23, 43] is then
$$
\big\langle R(X_1, z_1)^{w_{X_1}},\, R(X_2, z_2)^{w_{X_2}}\big\rangle = D_X^{(w,\Sigma)}(z_1 z_2). \tag{5.5}
$$
If $b_2^{+}(X) = 1$, then one considers the metric giving a long neck, i.e., one takes $X = X_1 \cup (Y \times [0, R]) \cup X_2$, with $R$ very large.

We are interested in the Floer (co)homology of $Y = \Sigma_g \times S^1$, with Stiefel–Whitney class $w_2 = [S^1]$. This manifold has an orientation-reversing diffeomorphism given by conjugation on $S^1$, and therefore there is a natural isomorphism $HF^{*}(Y) \simeq HF_*(Y)$. We will work with the ring cohomology from now on. The first thing to do is to find the generators of this ring. Consider the four-manifold with boundary $X_0 = \Sigma_g \times D^2$, where $D^2$ is the two-disk. Then, $\partial X_0 = \Sigma_g \times S^1$. We can define relative invariants of $X_0$ associated to elements in $A(X_0) = A(\Sigma_g)$. Clearly, one has to take $w = [D^2] \in H^{2}(X_0, \mathbb{Z})$, which restricts to $[S^1]$ at the boundary. The generators of $HF^{*}(Y)$ are then given by [23]:
$$
\alpha = 2R^{w}(X_0, \Sigma_g) \in HF^{2}(Y), \qquad \psi_i = R^{w}(X_0, \gamma_i) \in HF^{3}(Y), \qquad \beta = -4R^{w}(X_0, x) \in HF^{4}(Y), \tag{5.6}
$$
where $\Sigma_g$, $\gamma_i$ and $x$ are the generators of $H_*(\Sigma_g)$. Notice that this basis is very similar to the basis of $M_g$ presented in (4.1). In fact, $HF^{*}(Y)$ and $H^{*}(M_g)$ are isomorphic as vector spaces [24]. The product structure in the Floer cohomology is given, for these relative invariants, by $R^{w}(X_0, z)\,R^{w}(X_0, z') = R^{w}(X_0, zz')$. We will restrict ourselves to the invariant part of $HF^{*}(Y)$ (as in the analysis of the cohomology of $M_g$), which is generated by $\alpha$, $\beta$ and
$$
\gamma = -2\sum_{i=1}^{g} R^{w}(X_0, \gamma_i\gamma_{i+g}). \tag{5.7}
$$
The last ingredient we need is the gluing rule. If we consider the pairing of two relative invariants constructed from $X_0$, we will have to glue two copies of $X_0$ along their boundaries. Clearly, this gives the closed four-manifold $X = \Sigma_g \times S^2$ (where $S^2$ comes from gluing the two disks along their boundaries $S^1$). The long neck metric is the one that makes $S^2$ very big, and then corresponds to the chamber where $\Sigma_g$ is small. Finally, we have to specify the allowable pair. In $X$, $w = [S^2]$ restricts to $w_2 = [S^1]$ on $Y$. On the other hand, the image of $H_2(Y, \mathbb{Z})$ in $H^{2}(X, \mathbb{Z})$ is generated by $\Sigma_g$. This means that the gluing rule for the relative invariants is
$$
\big\langle R^{w_0}(X_0, z_1),\, R^{w_0}(X_0, z_2)\big\rangle = D_X^{(w,\Sigma_g)}(z_1 z_2) = D_X^{w_2=[S^2]}(z_1 z_2) + D_X^{w_2=[S^2]+[\Sigma_g]}(z_1 z_2). \tag{5.8}
$$
The Fukaya–Floer cohomology $HFF^{*}(Y)$ of an oriented three-manifold $Y$ needs the extra input of a loop $\delta \simeq S^1$ in $Y$. A review of this construction can be found in [19]. Here we will take $\delta$ to be the $S^1$ factor in $Y = \Sigma_g \times S^1$. In this case, one has that $HFF^{*}(Y) = HF^{*}(Y) \times \mathbb{C}[[t]]$. A basis of generators can also be constructed using
relative invariants of the manifold $X_0$, with the insertion of the operator $\exp\big(tI(D^2)\big)$ in the path integral. In this way, we obtain the generators
$$
\hat\alpha = 2R^{w_0}(X_0, \Sigma_g\,e^{tD^2}) \in HFF^{2}(Y), \qquad \hat\psi_i = R^{w_0}(X_0, \gamma_i\,e^{tD^2}) \in HFF^{3}(Y), \qquad \hat\beta = -4R^{w_0}(X_0, x\,e^{tD^2}) \in HFF^{4}(Y). \tag{5.9}
$$
The gluing rule is now
$$
\big\langle R^{w_0}(X_0, z_1 e^{tD^2}),\, R^{w_0}(X_0, z_2 e^{tD^2})\big\rangle = D_X^{(w,\Sigma_g)}\big(z_1 z_2\,e^{tS^2}\big), \tag{5.10}
$$
and therefore the Donaldson invariants involved in the Fukaya–Floer cohomology include the cohomology class associated to $S^2$. This makes the determination of this cohomology more difficult.

5.2. Eigenvalue spectrum of the Fukaya–Floer cohomology. As we have seen, the intersection pairings in Floer and Fukaya–Floer cohomology are given by Donaldson invariants of $\Sigma_g \times S^2$ in the chamber where $\Sigma_g$ is small. These invariants completely determine, in principle, the ring structure of the (Fukaya)–Floer cohomology, but this does not mean that we are able to give an explicit presentation of the relations of the ring. Already in the comparatively simpler case of the classical cohomology of $M_g$, to obtain the explicit relations at genus $g$ starting from the intersection pairings (4.10) turns out to be a very complicated combinatorial problem (solved in [39]). In this section, we want to show that an important aspect of the ring structure, namely the eigenvalue spectrum, can be deduced in a simple way from the generating function that we found in Sect. 2. In the case of Floer cohomology, the spectrum was obtained in [22] under some extra assumptions, and finally derived in [23] from an explicit presentation of the relations. The spectrum of the Fukaya–Floer cohomology was conjectured in [19], based on the computation of the spectrum for a submodule. Our calculation confirms this conjecture. Our strategy will be in a way the reverse of that in [20]. In these papers, the information about Fukaya–Floer cohomology obtained in [19] is used to understand the structure of Donaldson invariants. Here, we will use the Donaldson invariants of $\Sigma_g \times S^2$ to deduce results about the Fukaya–Floer cohomology of $\Sigma_g \times S^1$.

The basic procedure to obtain the eigenvalue spectrum is to find elements in the ideal of relations of the Fukaya–Floer cohomology, i.e., to find vanishing polynomials in the generators $\hat\alpha$, $\hat\beta$ and $\hat\gamma$:
$$
P(\hat\alpha, \hat\beta, \hat\gamma) = 0. \tag{5.11}
$$
We can easily translate this identity in terms of the generating function for the Donaldson invariants of $\Sigma_g \times S^2$: as the pairing (5.3) is non-degenerate, to prove the identity (5.11) it is enough to prove that
$$
\big\langle P(\hat\alpha, \hat\beta, \hat\gamma),\, R(X_0, z\,e^{tD^2})\big\rangle = 0, \tag{5.12}
$$
for any $z \in A(\Sigma_g)$. Due to the gluing rule (5.10), the above pairing is nothing but $D_X^{(w,\Sigma_g)}\big(P(2\Sigma_g, -4x, \iota)\,z\,e^{tS^2}\big)$. The vanishing of (5.12) for any $z$ is then equivalent to the following differential equation,
$$
P\Big(2\frac{\partial}{\partial s},\, -4\frac{\partial}{\partial p},\, \frac{\partial}{\partial r}\Big)\,Z_g^{(w,\Sigma_g)}(p, r, s, t) = 0, \tag{5.13}
$$
where we have defined the generating functional corresponding to the invariants (5.4):
$$
Z_g^{(w,\Sigma_g)}(p, r, s, t) = Z_g^{F=0}(p, r, s, t) + Z_g^{F=1}(p, r, s, t). \tag{5.14}
$$
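What we have computed in Sect. 3 are precisely the generating functions involved in (5.14). We then have to study the differential equations satisfied by our function. First of all, using (3.31) and (3.33), one immediately finds:
$$
\begin{aligned}
Z_g^{(w,\Sigma_g)} = {} & -2^{9}\,i^{1-g}\,\Big[\,e^{-2pu_M-2stT_M}\sum_{m=1}^{g}\binom{g}{m}(-1)^{m}\,2^{-6m}\,\big(\tilde h_M^{2}\tilde f_{2M}\big)^{m-1}\Big(2ir-\frac{is}{16}\tilde h_M^{2}\tilde f_{1M}\Big)^{g-m}\,\mathrm{Li}_{-m}\big(-e^{-2it\,\tilde h_M^{-1}}\big)\Big]_{q_D^{0}} \\
& + 2^{9}\sum_{a\ \mathrm{odd}}\Big[\,\mathrm{sgn}(a)\,q_D^{ab}\,(-1)^{b}\,e^{2pu_M+2stT_M}\,\big(\tilde h_M^{2}\tilde f_{2M}\big)^{-1}\Big(\frac{s}{16}\tilde h_M^{2}\tilde f_{1M}+2^{-7}\,b\,\tilde h_M^{2}\tilde f_{2M}+2r\Big)^{g}\,e^{(2as-2bt)/\tilde h_M}\Big]_{q_D^{0}} \\
& + 2^{9}\,i^{1-g}\sum_{a\ \mathrm{even}}\Big[\,\mathrm{sgn}(a)\,q_D^{ab}\,(-1)^{b}\,e^{-2pu_M-2stT_M}\,\big(\tilde h_M^{2}\tilde f_{2M}\big)^{-1}\Big(-\frac{is}{16}\tilde h_M^{2}\tilde f_{1M}+2^{-7}\,b\,\tilde h_M^{2}\tilde f_{2M}+2ir\Big)^{g}\,e^{-(2ias-2ibt)/\tilde h_M}\Big]_{q_D^{0}}.
\end{aligned}
\tag{5.15}
$$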
This generating function can be explicitly evaluated for low genus. The results are especially simple if we put $t = 0$.⁹ We obtain, for example:
$$
\begin{aligned}
Z_1^{(w,\Sigma_1)} &= -e^{-2p}, \\
Z_2^{(w,\Sigma_2)}(p, r, s) &= -\frac{1}{8}\,e^{-2p}\,(32r - s) - \frac{1}{32}\,e^{2p}\,\big(e^{2s} - e^{-2s}\big), \\
Z_3^{(w,\Sigma_3)}(p, r, s) &= -\frac{1}{4096}\,e^{-2p}\,\big(98 + 256p + 256p^{2} + 49152r^{2} - 3072rs + 48s^{2}\big) \\
&\quad + \frac{1}{256}\,e^{2p+2s}\,(3 - 4p - 48r - 2s) + \frac{1}{256}\,e^{2p-2s}\,(3 - 4p + 48r + 2s) + \frac{1}{4096}\,e^{-2p}\,\big(e^{4is} + e^{-4is}\big).
\end{aligned}
\tag{5.16}
$$
Let us now concentrate on the eigenvalue spectrum of the Fukaya–Floer cohomology. The first eigenvalue equation we can write is the one that corresponds to the finite-type condition that we obtained in Sect. 3. It now reads
$$
(\hat\beta^{2} - 64)^{g} = 0, \tag{5.17}
$$
9 For t = 0, the generating function (5.15) can be computed in principle using the Artinian decomposition of the Floer cohomology [23]. This procedure, however, does not give a general result for any genus and has to be worked out case by case. V. Muñoz has informed us that the above expressions for g = 2, 3 coincide with the results that can be obtained from this decomposition.
therefore the eigenvalues of $\hat\beta$ must be $\pm 8$. To understand the eigenvalue spectrum of the remaining operators, it is useful first to be more precise about the structure of $Z_g^{(w,\Sigma_g)}$. If we look at (5.15), it is easy to see that it can be written as
$$
Z_g^{(w,\Sigma_g)} = \sum_{|a|\le g-1} Z_a(p, r, s, t). \tag{5.18}
$$
Notice that the $u$-plane integral contribution corresponds to $a = 0$. The structure of $Z_a(p, r, s, t)$ is immediate from (5.15):
$$
Z_a(p, r, s, t) =
\begin{cases}
f_a(p, r, s, t)\,e^{2p+st+2as}, & \text{for } a \text{ odd}, \\
f_a(p, r, s, t)\,e^{-2p-st-2ias}, & \text{for } a \text{ even},
\end{cases}
\tag{5.19}
$$
where $f_a(p, r, s, t)$ is a polynomial in $p$, $r$ and $s$ and a power series in $t$ (similar remarks about the structure of the Donaldson invariants have been made in [20]). We are interested in the degree of the polynomial in $p$, $r$, $s$. This is again easy to see if we look at the modular forms involved in (5.15). Assume $g \ge 2$ (for $g = 1$, $f_0$ only depends on $t$). We know that the maximum power we can find in $p$ is precisely $g - 1$ (a simple consequence of the finite type condition). As one can see in (3.37), this power appears in the $u$-plane integral contribution, and for $a \neq 0$ the maximum power of $p$ is in fact $g - 2$. Let us now find the maximum power of $s$ in $f_a$. If we group the powers of $s$ in the modular form that gives $f_a$, we easily see that the leading term in $q_D$ has the form
$$
q_D^{\,ab+1+n-m}\,s^{\,g+n-m}\,t^{\,n}, \tag{5.20}
$$
up to numerical constants. It is clear that the maximum possible power of $s$ which can appear in $f_a$ is $g - |a| - 1$, and occurs for $|b| = 1$. This power actually appears in $f_a$: using the above expression, it is easy to see that
$$
f_a(p, r, s, t) = -2^{6-4g-3|a|}\,\Bigg[\sum_{n=|a|+1}^{g}\binom{g}{n}\,(\mathrm{sgn}(a))^{n+1}\,\frac{(a+2t)^{n-|a|-1}}{(n-|a|-1)!}\Bigg]\,e^{-2\,\mathrm{sgn}(a)\,t}\,s^{g-|a|-1} + \dots, \tag{5.21}
$$
for $a$ odd, while for $a$ even one obtains:
$$
f_a(p, r, s, t) = (-1)^{g-|a|}\,i^{|a|}\,2^{6-4g-3|a|}\,\Bigg[\sum_{n=|a|+1}^{g}\binom{g}{n}\,(\mathrm{sgn}(a))^{n+1}\,\frac{(a-2it)^{n-|a|-1}}{(n-|a|-1)!}\Bigg]\,e^{2i\,\mathrm{sgn}(a)\,t}\,s^{g-|a|-1} + \dots. \tag{5.22}
$$
Finally, for $a = 0$ (the $u$-plane contribution), one has:
$$
f_0(p, r, s, t) = (-1)^{g}\,2^{6-4g}\,\Bigg[\sum_{n=1}^{g}\binom{g}{n}\,\frac{(-1)^{n}(-2it)^{n-1}}{(n-1)!}\,\mathrm{Li}_{-n}\big(-e^{-2it}\big)\Bigg]\,s^{g-1} + \dots. \tag{5.23}
$$
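The leading coefficients in (5.21)–(5.23) can be compared with the explicit low-genus results (5.16) at $t = 0$. The following is a minimal sketch with a hypothetical function name; it assumes $t = 0$ throughout, so that only the $n = 1$ term of (5.23) survives and the standard value $\mathrm{Li}_{-1}(-1) = -1/4$ can be used, and it uses $i^{|a|} = (-1)^{|a|/2}$ for even $a$.

```python
from fractions import Fraction
from math import comb, factorial


def leading_coeff(g, a):
    """Coefficient of s^(g-|a|-1) in f_a at t = 0, read off from (5.21)-(5.23)."""
    if abs(a) > g - 1:
        raise ValueError("need |a| <= g - 1")
    if a == 0:
        # (5.23) at t = 0: only n = 1 contributes, with Li_{-1}(-1) = -1/4.
        return (-1) ** g * Fraction(2) ** (6 - 4 * g) * comb(g, 1) * (-1) * Fraction(-1, 4)
    sign_a = 1 if a > 0 else -1
    k = abs(a)
    total = sum(
        comb(g, n) * sign_a ** (n + 1) * Fraction(a) ** (n - k - 1) / factorial(n - k - 1)
        for n in range(k + 1, g + 1)
    )
    power = Fraction(2) ** (6 - 4 * g - 3 * k)
    if k % 2 == 1:
        return -power * total                                  # (5.21), a odd
    return (-1) ** (g - k) * (-1) ** (k // 2) * power * total  # (5.22), a even


if __name__ == "__main__":
    for g in (2, 3):
        for a in range(-(g - 1), g):
            print(g, a, leading_coeff(g, a))
```

For $g = 2$ and $g = 3$ the printed values can be read against the top powers of $s$ multiplying each exponential in (5.16), for example $-1/32$ for the $e^{2p+2s}$ term at $g = 2$ and $-3/256$ for the $s^{2}e^{-2p}$ term at $g = 3$.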
Notice that, for $a \neq 0$, $f_a(p, r, s, t)$ is a polynomial in $e^{\pm t}$ or $e^{\pm it}$, while for $a = 0$ it is a rational function of $e^{\pm 2it}$. By similar arguments, one finds that the maximum power of $r$ appears for $a = 0$ and is $g - 1$. We then have the following differential equations for the $Z_a$:
$$
\begin{aligned}
\Big(-4\frac{\partial}{\partial p} + 8\Big)^{g-1} Z_a &= \Big(2\frac{\partial}{\partial s} - 4a - 2t\Big)^{g-|a|} Z_a = 0, && a \text{ odd}, \\
\Big(-4\frac{\partial}{\partial p} - 8\Big)^{g-1} Z_a &= \Big(2\frac{\partial}{\partial s} + 4ia + 2t\Big)^{g-|a|} Z_a = 0, && a \text{ even},\ a \neq 0, \\
\Big(-4\frac{\partial}{\partial p} - 8\Big)^{g} Z_0 &= \Big(2\frac{\partial}{\partial s} + 2t\Big)^{g} Z_0 = 0, && a = 0.
\end{aligned}
\tag{5.24}
$$
Notice that the powers that appear in these equations are in fact the minimum powers that are needed to kill the $Z_a$, as can easily be seen from (5.21), (5.22) and (5.23). We also have
$$
\frac{\partial^{g}}{\partial r^{g}}\,Z_g^{(w,\Sigma_g)} = 0, \tag{5.25}
$$
which is also the minimum power we need ($r^{g-1}$ appears for $a = 0$, while for $a \neq 0$ the maximum power is $r^{g-2}$). Notice, in particular, that $(\hat\beta \pm 8)^{g}$ kills all the $Z_a$ with $a$ odd (even, respectively).

We can now deduce the eigenvalue spectrum of the Fukaya–Floer cohomology. From (5.24) and (5.25) we find the following operator equations:
$$
\hat\gamma^{\,g} = 0, \qquad \prod_{a\ \mathrm{odd}}(\hat\alpha - 4a - 2t)^{g-|a|}\,\prod_{a\ \mathrm{even}}(\hat\alpha + 4ia + 2t)^{g-|a|} = 0. \tag{5.26}
$$
Therefore, the only eigenvalue of $\hat\gamma$ is 0, and for $\hat\alpha$ we find the eigenvalue spectrum $(0, \pm 4 + 2t, \pm 8i - 2t, \cdots)$. Notice that the eigenvalue $8$ of $\hat\beta$ only occurs for $\hat\alpha = -4ia - 2t$, with $a$ even. In the same way, we find that $-8$ only occurs for $\hat\alpha = 4a + 2t$, for $a$ odd. This is due to the equation
$$
(\hat\beta + 8)^{g}\prod_{a\ \mathrm{even}}(\hat\alpha + 4ia + 2t)^{g-|a|} = 0, \tag{5.27}
$$
and a similar equation involving $(\hat\beta - 8)^{g}$. Recalling now that $-(g-1) \le a \le g-1$, our main conclusion is that the eigenvalue spectrum of $(\hat\alpha, \hat\beta, \hat\gamma)$ is given by
$$
(0, 8, 0), \quad (\pm 4 + 2t, -8, 0), \quad \dots, \quad \big(\pm 4(g-1)\,i^{g} + (-1)^{g}\,2t,\ (-1)^{g-1}\,8,\ 0\big). \tag{5.28}
$$
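This generalizes Proposition 20 in [23], and confirms the conjecture in [19] (see Theorem 5.13 and Remark 5.14 in that paper). It is easy to give an explicit construction of the eigenvectors corresponding to these eigenvalues. They are given by:
$$
v_a =
\begin{cases}
\big(\hat\beta + 8\big)^{g}\,\big(\hat\alpha + 4ia + 2t\big)^{g-|a|-1}\displaystyle\prod_{\substack{a'\ \mathrm{even},\ a' \neq a \\ |a'| \le g-1}}\big(\hat\alpha + 4ia' + 2t\big)^{g-|a'|}, & a \text{ even}, \\[3ex]
\big(\hat\beta - 8\big)^{g}\,\big(\hat\alpha - 4a - 2t\big)^{g-|a|-1}\displaystyle\prod_{\substack{a'\ \mathrm{odd},\ a' \neq a \\ |a'| \le g-1}}\big(\hat\alpha - 4a' - 2t\big)^{g-|a'|}, & a \text{ odd}.
\end{cases}
\tag{5.29}
$$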
To see that these vectors are in fact not zero, one can easily prove that $\langle v_a, R(X_0, z\,e^{tS^2})\rangle \neq 0$ for any $z \in A(X)$, using the explicit results for the generating function (5.15) in (5.21), (5.22) and (5.23).

Using now arguments from [22, 23, 45], together with our results, it is easy to rederive the presentation of the Floer cohomology of $\Sigma_g \times S^1$ given in [22, 23]. We will give some brief indications in this respect. Let $J_g$ be the ideal of relations at genus $g$ ($J_g$ is then generated by all the polynomials in $\alpha$, $\beta$, $\gamma$ that vanish as elements of the invariant part of $HF^{*}$). First of all, notice that $Z_g^{(w,\Sigma_g)}$ satisfies
$$
\frac{\partial}{\partial r}\,Z_g^{(w,\Sigma_g)} = 2g\,Z_{g-1}^{(w,\Sigma_{g-1})}. \tag{5.30}
$$
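The relation (5.30) can be tested directly on the low-genus expressions (5.16). The sketch below is an illustrative numerical check only: the functions `Z1`, `Z2`, `Z3` are transcriptions of (5.16) at $t = 0$, the helper names are hypothetical, and the $r$-derivative is taken by a central difference, which is exact up to rounding because each $Z_g$ is a polynomial of degree at most two in $r$.

```python
import cmath


def Z1(p, r, s):
    """Genus-one generating function from (5.16), at t = 0."""
    return -cmath.exp(-2 * p)


def Z2(p, r, s):
    """Genus-two generating function from (5.16), at t = 0."""
    return (-cmath.exp(-2 * p) * (32 * r - s) / 8
            - cmath.exp(2 * p) * (cmath.exp(2 * s) - cmath.exp(-2 * s)) / 32)


def Z3(p, r, s):
    """Genus-three generating function from (5.16), at t = 0."""
    quad = 98 + 256 * p + 256 * p**2 + 49152 * r**2 - 3072 * r * s + 48 * s**2
    return (-cmath.exp(-2 * p) * quad / 4096
            + cmath.exp(2 * p + 2 * s) * (3 - 4 * p - 48 * r - 2 * s) / 256
            + cmath.exp(2 * p - 2 * s) * (3 - 4 * p + 48 * r + 2 * s) / 256
            + cmath.exp(-2 * p) * (cmath.exp(4j * s) + cmath.exp(-4j * s)) / 4096)


def d_dr(Z, p, r, s, h=1e-3):
    """Central-difference approximation to the r-derivative of Z."""
    return (Z(p, r + h, s) - Z(p, r - h, s)) / (2 * h)


def residual(Z_g, Z_prev, g, point):
    """|dZ_g/dr - 2g Z_{g-1}| at the given (p, r, s); (5.30) predicts zero."""
    p, r, s = point
    return abs(d_dr(Z_g, p, r, s) - 2 * g * Z_prev(p, r, s))


if __name__ == "__main__":
    point = (0.3, -0.7, 0.2)  # arbitrary sample point
    print(residual(Z2, Z1, 2, point), residual(Z3, Z2, 3, point))
```

Both residuals should vanish to rounding accuracy at any sample point if the transcription of (5.16) is faithful.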
This implies immediately the following inclusion relation:
$$
\gamma J_g \subset J_{g+1} \subset J_g. \tag{5.31}
$$
Now one can use the fact that the Floer cohomology of $Y = \Sigma_g \times S^1$ is a deformation of the cohomology of $M_g$ (this is rather elementary and does not assume the existence of a ring isomorphism between $HF^{*}(Y)$ and the quantum cohomology of $M_g$). Using the explicit recursive presentation of the ring cohomology of $M_g$ given in [39, 46], and adapting the arguments of Proposition 3.2 in [45], one obtains the following result: the ideal of relations is given by $J_g = (q_g^{1}, q_g^{2}, q_g^{3})$, where the $q_g^{i}$, $i = 1, 2, 3$, are given by the following recursive relations:
$$
\begin{aligned}
q_{g+1}^{1} &= \alpha\,q_g^{1} + g^{2}\,q_g^{2}, \\
q_{g+1}^{2} &= (\beta + c_{g+1})\,q_g^{1} + \frac{2g}{g+1}\,q_g^{3}, \\
q_{g+1}^{3} &= \gamma\,q_g^{1},
\end{aligned}
\tag{5.32}
$$
and $q_1^{1} = \alpha$, $q_1^{2} = \beta + c_1$, $q_1^{3} = \gamma$. When $c_{g+1} = 0$, one recovers the classical cohomology of $M_g$. The deformation is then encoded in the coefficient $c_{g+1}$. Notice that the key fact which is used to derive (5.32) is the inclusion of ideals (5.31), which is in turn a consequence of (5.30). This recursion relation was conjectured in [45] for the generating function of the Gromov–Witten invariants of $M_g$, and provides in that context a generalization of Thaddeus' recursion relation (4.7) from the classical to the quantum pairings.

The last ingredient is then to compute the value of the coefficient $c_{g+1}$. It was shown in [23] that this value can be easily deduced by induction using the eigenvector that corresponds to the maximum $\alpha$-eigenvalue. For $g = 1$, one immediately finds from (5.15) that $Z_1^{(w,\Sigma_1)} = -e^{-2p}$ (we are putting $t = 0$, as we are considering the Floer rather than the Fukaya–Floer cohomology). It then follows that $\beta = 8$, so $c_1 = -8$. The argument in [23], Theorem 14, then gives $c_g = (-1)^{g}\,8$.

It would be interesting to use the information contained in (5.15) to give a recursive presentation like (5.32) but for the Fukaya extension. Some steps in this direction have been taken in [19]. In principle, all the information that one needs is contained in the generating function (5.15), but it is still a non-trivial problem to extract it in the particular form of an explicit presentation of the relations.
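As a consistency check between the recursion (5.32) with $c_g = (-1)^{g}\,8$ and the eigenvalue spectrum found above, one can evaluate the relations $q_g^{i}$ numerically at the $t = 0$ eigenvalues. This is a sketch under stated assumptions: the function names are hypothetical, and the spectrum is taken at $t = 0$ as $\alpha = 4a$, $\beta = -8$ for $a$ odd and $\alpha = -4ia$, $\beta = 8$ for $a$ even, with $\gamma = 0$ and $|a| \le g - 1$.

```python
def c(g):
    """Deformation coefficient c_g = (-1)^g * 8 quoted from [23], Theorem 14."""
    return (-1) ** g * 8


def relations(g, alpha, beta, gamma):
    """Evaluate (q_g^1, q_g^2, q_g^3) at a point by iterating the recursion (5.32)."""
    q1, q2, q3 = alpha, beta + c(1), gamma
    for h in range(1, g):
        # The right-hand side uses the genus-h values; the tuple assignment
        # replaces all three at once.
        q1, q2, q3 = (alpha * q1 + h * h * q2,
                      (beta + c(h + 1)) * q1 + 2 * h / (h + 1) * q3,
                      gamma * q1)
    return q1, q2, q3


def floer_spectrum(g):
    """Eigenvalues (alpha, beta, gamma) at t = 0, labelled by |a| <= g - 1."""
    points = []
    for a in range(-(g - 1), g):
        if a % 2:
            points.append((4 * a, -8, 0))
        else:
            points.append((-4j * a, 8, 0))
    return points


def max_violation(g):
    """Largest |q_g^i| over all points of the t = 0 spectrum."""
    return max(abs(q) for point in floer_spectrum(g) for q in relations(g, *point))


if __name__ == "__main__":
    for g in range(1, 6):
        print(g, max_violation(g))
```

Every generator of $J_g$ should vanish on the joint spectrum, so the printed violations are expected to be zero for each genus tried.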
Acknowledgements. We would like to thank G. Moore for explanations of [3], for many useful discussions on the topics treated in this paper, and for a critical reading of the manuscript. We are grateful to V. Muñoz for discussions and correspondence on Floer cohomology and for communicating to us his computations of generating functions. We would also like to thank Y.-H. Kiem and B. Siebert for useful discussions and correspondence. Finally, we want to thank J.M.F. Labastida for a critical reading of the manuscript and for his encouragement. C.L. would like to thank the Physics Departments of Yale University and Brandeis University for their hospitality. M.M. would like to thank the Departamento de Física de Partículas at the Universidade de Santiago de Compostela, as well as the organizers of VBAC99, for their hospitality during the completion of this paper. The work of C.L. is supported by DGICYT under grant PB96-0960. The work of M.M. is supported by DOE grant DE-FG02-92ER40704.
References
1. Witten, E.: Topological Quantum Field Theory. Commun. Math. Phys. 117, 353 (1988)
2. Witten, E.: Monopoles and four-manifolds. hep-th/9411102; Math. Res. Letters 1, 769 (1994)
3. Moore, G. and Witten, E.: Integration over the u-plane in Donaldson theory. hep-th/9709193, Adv. Theor. Math. Phys. 1, 298 (1997)
4. Losev, A., Nekrasov, N. and Shatashvili, S.: Issues in topological gauge theory, hep-th/9711108, Nucl. Phys. B 534, 549 (1998); Testing Seiberg–Witten solution. hep-th/9801061
5. Mariño, M. and Moore, G.: Integrating over the Coulomb branch in N = 2 gauge theory. hep-th/9712062, Nucl. Phys. B (Proc. Suppl) 68, 336 (1998); The Donaldson–Witten function for gauge groups of rank larger than one. hep-th/9802185, Commun. Math. Phys. 199, 25 (1998)
6. Mariño, M. and Moore, G.: Donaldson invariants for non-simply connected manifolds. hep-th/9804114, Commun. Math. Phys. 203, 249 (1999)
7. Labastida, J.M.F. and Lozano, C.: Duality in twisted N = 4 supersymmetric gauge theories in four dimensions. hep-th/9806032, Nucl. Phys. B 537, 203 (1999); H. Kanno and S.-K. Yang: Donaldson–Witten functions of massless N = 2 supersymmetric QCD. hep-th/9806015, Nucl. Phys. B 535, 512 (1998)
8. Mariño, M. and Moore, G.: Three-manifold topology and the Donaldson–Witten partition function. hep-th/9811214, Nucl. Phys. B 547, 569 (1999)
9. Mariño, M., Moore, G. and Peradze, G.: Superconformal invariance and the geography of four-manifolds. hep-th/9812055, Commun. Math. Phys. 205, 691 (1999); Four-manifold geography and superconformal symmetry. math.DG/9812042, Math. Res. Lett. 6, 429 (1999)
10. Gorsky, A., Marshakov, A., Mironov, A. and Morozov, A.: RG equations from Whitham hierarchy. hep-th/9802007, Nucl. Phys. B 527, 690 (1998); Takasaki, K.: Integrable hierarchies and contact terms in u-plane integrals of topologically twisted supersymmetric gauge theories. hep-th/9803217, Int. J. Mod. Phys. A 14, 1001 (1999); Mariño, M.: The uses of Whitham hierarchies. hep-th/9905053, Prog. Theor. Phys. Suppl. 135, 29 (1999)
11. Muñoz, V.: Wall-crossing formulae for algebraic surfaces with positive irregularity. alg-geom/9709002
12. Morgan, J. and Szabó, Z.: Embedded tori in four-manifolds. Topology 38, 479 (1999)
13. Göttsche, L.: Modular forms and Donaldson invariants for 4-manifolds with b+ = 1. alg-geom/9506018; J. Am. Math. Soc. 9, 827 (1996)
14. Atiyah, M.F. and Bott, R.: The Yang–Mills equations over Riemann surfaces. Philos. Trans. Roy. Soc. London 308, 523 (1982)
15. Verlinde, E.: Fusion rules and modular transformations in 2d conformal field theory. Nucl. Phys. B 300, 360 (1988)
16. Thaddeus, M.: Conformal field theory and the cohomology of the moduli space of stable bundles. J. Diff. Geom. 35, 131 (1992)
17. Witten, E.: Two-dimensional gauge theories revisited. hep-th/9204083, J. Geom. Phys. 9, 303 (1992)
18. Witten, E.: On quantum gauge theories in two dimensions. Commun. Math. Phys. 141, 153 (1991)
19. Muñoz, V.: Fukaya–Floer homology of Σ × S1 and applications. math.DG/9804081
20. Muñoz, V.: Basic classes for four-manifolds not of simple type. math.DG/9811089; Higher type adjunction inequalities for Donaldson invariants. math.DG/9901046
21. Froyshov, K.A.: Equivariant aspects of Yang–Mills–Floer theory. math.DG/9903083
22. Bershadsky, M., Johansen, A., Sadov, V. and Vafa, C.: Topological Reduction of 4D SYM to 2D σ-Models. hep-th/9501096; Nucl. Phys. B 448, 166 (1995)
23. Muñoz, V.: Ring structure of the Floer cohomology of Σ × S1. dg-ga/9710029, Topology 38, 517 (1999)
24. Dostoglou, S. and Salamon, D.: Self-dual instantons and holomorphic curves. Ann. of Math. 139, 581 (1994)
25. Muñoz, V.: Quantum cohomology of the moduli space of stable bundles over a Riemann surface. alg-geom/9711030, Duke Math. J. 98, 525 (1999)
Donaldson Invariants and Two-Dimensional Gauge Theories
261
26. Donaldson, S.K. and Kronheimer, P.B.: The geometry of four-manifolds. Oxford: Oxford Univ. Press, 1990 27. Witten, E.: Supersymmetric Yang–Mills theory on a four-manifold. hep-th/9403193; J. Math. Phys. 35, 5101 (1994) 28. Seiberg, N. and Witten, E.: Electric-magnetic duality, monopole condensation, and confinement in N = 2 supersymmetric Yang–Mills theory. hep-th/9407087, Nucl. Phys. B 426, 19 (1994) 29. Seiberg, N. and Witten, E.: Monopoles, duality and chiral symmetry breaking in N = 2 supersymmetric QCD. hep-th/9408099, Nucl. Phys. B 431, 484 (1994) 30. Li, T.J. and Liu, A.: General wall-crossing formula. Math. Res. Lett. 2, 797 (1995); Okonek, C. and Teleman, A.: Seiberg–Witten invariants for manifolds with b2+ = 1, and the universal wall-crossing formula. alg-geom/9603003, Int. J. Math. 7, 811 (1996) 31. Borcherds, R.E.: Automorphic forms with singularities on Grassmannians. Invent. Math. 132, 491 (1998) 32. Dixon, L., Kaplunovsky, V. and Louis, J.: Moduli dependence of string loop corrections to gauge coupling constants. Nucl. Phys. B 355, 649 (1991); Harvey, J.A. and Moore, G.: Algebras, BPS states, and strings. hep-th/9510182, Nucl. Phys. B 463, 315 (1996) 33. Göttsche, L. and Zagier, D.: Jacobi forms and the structure of Donaldson invariants for 4-manifolds with b+ = 1. alg-geom/9612020, Sel. Math., New ser. 4, 69 (1998) 34. Blau, M. and Thompson, G.: Lectures on 2d gauge theories. hep-th/9310144, In: 1993 Trieste Summer School in High Energy Physics and Cosmology, E. Gava et al. (eds.), Singapore: World Scientific, 1994 35. Donaldson, S.K.: Gluing techniques in the cohomology of moduli spaces. In: Topological methods in modern mathematics, L.R. Goldberg and A.V. Phillips eds., Berley: Publish or Perish, 1993 36. Kiem, Y.-H.: Equivariant and intersection cohomology of moduli spaces of vector bundles; and Intersection cohomology of quotients of non-singular varieties. Yale preprints 37. 
Hatter, L.: Cohomology of compactifications of moduli spaces of stable bundles over a Riemann surface. Ph.D. Thesis, Oxford University, 1997 (unpublished) 38. Newstead, P.E.: Characteristic classes of stable bundles over an algebraic curve. Trans. Am. Math. Soc. 169, 337 (1972) 39. Zagier, D.: On the cohomology of moduli spaces of rank two vector bundles over curves. In: The moduli space of curves, R. Dijkgraaf et al. (eds.), Basel–Boston: Birkhäuser 40. Szenes, A.: The combinatorics of the Verlinde formulas. alg-geom/9402003, in: Vector bundles in algebraic geometry, N.J. Hitchin et. al. (eds.), Cambridge: Cambridge University Press, 1995 41. Mohri, K.: Residues and topological Yang–Mills theory in two dimensions. hep-th/9604022, Rev. Math. Phys. 9, 59 (1997) 42. Atiyah, M.F.: The geometry and physics of knots. Cambridge: Cambridge University Press, 1990; Dijkgraaf, R.: Fields, strings and duality. hep-th/9703136, in Symmétries quantiques (Les Houches, 1995), Amsterdam: North-Holland, 1998 43. Donaldson, S.K.: Floer homology and algebraic geometry. In: Vector bundles in algebraic geometry, N.J. Hitchin et. al. (eds.), Cambridge: Cambridge University Press, 1995 44. The Floer Memorial Volume. H. Hofer et. al. (eds.), Basel–Boston: Birkhäuser 1995 45. Siebert, B.: An update on (small) quantum cohomology. In: Mirror Symmetry III, Providence, RI: AMS, 1999 46. Siebert, B. and Tian, G.: Recursive relations for the cohomology ring of moduli spaces of stable bundles. alg-geom/9410019, Turkish J. Math. 19, 131 (1995); King, A.D. and Newstead, P.E.: On the cohomology ring of the moduli space of stable bundles on a curve. Topology 37, 407 (1998) Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 220, 263 – 292 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Singular Dimensions of the N = 2 Superconformal Algebras II: The Twisted N = 2 Algebra
Matthias Dörrzapf$^{1}$, Beatriz Gato-Rivera$^{2,3}$
$^{1}$ Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Silver Street, Cambridge, CB3 9EW, UK. E-mail: [email protected]
$^{2}$ Instituto de Matemáticas y Física Fundamental, CSIC, Serrano 123, Madrid 28006, Spain. E-mail: [email protected]
$^{3}$ NIKHEF, Kruislaan 409, 1098 SJ Amsterdam, The Netherlands
Received: 15 March 1999 / Accepted: 12 November 2000
Abstract: We introduce a suitable adapted ordering for the twisted N = 2 superconformal algebra (i.e. with mixed boundary conditions for the fermionic fields). We show that the ordering kernels for complete Verma modules have two elements and the ordering kernels for G-closed Verma modules just one. Therefore, spaces of singular vectors may be two-dimensional for complete Verma modules whilst for G-closed Verma modules they can only be one-dimensional. We give all singular vectors for the levels 1/2, 1, and 3/2 for both complete Verma modules and G-closed Verma modules. We also give explicit examples of degenerate cases with two-dimensional singular vector spaces in complete Verma modules. General expressions are conjectured for the relevant terms of all (primitive) singular vectors, i.e. for the coefficients with respect to the ordering kernel. These expressions allow us to identify all degenerate cases as well as all G-closed singular vectors. They also lead to the discovery of subsingular vectors for the twisted N = 2 superconformal algebra. Explicit examples of these subsingular vectors are given for the levels 1/2, 1, and 3/2. Finally, the multiplication rules for singular vector operators are derived using the ordering kernel coefficients. This sets the basis for the analysis of the twisted N = 2 embedding diagrams.
1. Introduction
The spectacular gain of importance of superstring theory in physics over the past fifteen years has called for solutions to challenging problems on the mathematics side. Just as for any quantum field theory, a key problem is the study of the underlying symmetry algebra in order to understand the space of states of the physical theory. Almost at the same time as Ademollo et al. [1] identified for the first time the symmetry algebras of the superstrings, Kac was already studying these objects as part of his classification of superalgebras [24].
These symmetry algebras are known as superconformal algebras because the Virasoro algebra is a subalgebra due to the conformal invariance on the string world-sheet. The study of the representations of the superconformal algebras is
one of these challenging problems for mathematics. Despite its importance, satisfactory answers are still not known for most superconformal algebras. As the number of fermionic currents N increases, the structure of the representations of the superconformal algebras becomes more and more complicated. The case of the N = 1 superconformal algebra is well understood thanks to many authors [2, 6, 5, 18, 19, 23, 25, 27, 28, 30]. However, its representation theory is very close to the representation theory of the Virasoro algebra. Already for the N = 2 superconformal algebras there are still many open questions, even though much progress has been made in recent years [8, 12, 11, 20, 21, 14]. For the N = 2 superconformal algebras many aspects arise that are new to representations of chiral algebras. One of these aspects is the existence of degenerate singular vectors [12, 21]. That is, N = 2 superconformal Verma modules can have, at the same level and with the same charge, more than one linearly independent singular vector. Thus the study of the embedding structure as well as the computation of character formulae for the representations is much more complicated than in the case where the comparison of level and charge is already enough to decide whether two singular vectors are proportional, as it is for the Virasoro algebra and for the N = 1 superconformal algebra. In physics, the different types of superconformal symmetry algebras, for the same number of fermionic fields N, arise through the different choices of periodicity conditions for the fermionic currents around a closed superstring world-sheet, as well as through different choices of the Virasoro generators (i.e. of the stress-energy tensor).
For the standard choice of Virasoro generators, the superconformal algebras resulting from antiperiodic boundary conditions are called Neveu–Schwarz algebras, those corresponding to periodic boundary conditions are called Ramond algebras and those with mixed boundary conditions are called twisted superconformal algebras. The N = 2 superconformal algebras fall under four types: the Neveu–Schwarz N = 2 algebra, the Ramond N = 2 algebra, the topological N = 2 algebra, and the twisted N = 2 algebra. The first three algebras are isomorphic; nevertheless their highest weight representations are quite different$^{1}$. The topological N = 2 algebra, which is the symmetry algebra of topological conformal field theory, also plays an important rôle in string theories [9, 22, 7]. This algebra is also known as the “twisted topological” N = 2 algebra because it can be obtained from the Neveu–Schwarz N = 2 algebra by modifying the stress-energy tensor by adding the derivative of the U(1) current, a procedure known as “topological twist”. We have studied the singular dimensions of the three isomorphic algebras in a recent publication [16], where we presented a formalism – the adapted ordering method – that allows us to compute upper limits for the dimensions of singular subspaces simply by ordering the algebra generators appropriately. If this order is chosen in a suitable way, then the computed upper limits for these singular dimensions may actually be maximal. We applied this method directly to the topological N = 2 algebra, proving its usefulness, and then we deduced the maximal singular dimensions corresponding to the Neveu–Schwarz and to the Ramond N = 2 algebras. So far, not much attention has been given to the twisted superconformal algebras, which appear for N = 2 for the first time. This is mainly due to the fact that field theories with mixed periodicity conditions have not been considered for string theories.
$^{1}$ We thank V. Kac for discussions on this point.
In the N = 2 case there is just one twisted algebra whilst for bigger N there are several possible ways of mixing the periodicity conditions of the fermionic fields and thus different twisted superconformal algebras can be found. In this paper we will focus on the twisted N = 2 superconformal algebra. We will show that the method of adapted
orderings can also be used for this algebra. For the three isomorphic N = 2 algebras as well as for the Virasoro algebra we saw [16] that the suitably chosen adapted ordering assigns a special rôle to the powers of the Virasoro generator L−1 . Surprisingly enough, in the twisted N = 2 case this rôle is taken over for the first time by another generator. Furthermore, we will see that the twisted N = 2 algebra requires its two fermionic fields to be mixed rather than kept separate as for the other N = 2 algebras. Hence the twisted N = 2 algebra behaves very differently from the other N = 2 algebras with respect to the adapted ordering method. Nevertheless, the method still works and we can prove that the computed upper limits for the singular dimensions are actually maximal. Even though the twisted N = 2 superconformal algebra contains two fermionic fields, at the algebraic level it looks as if it really contains only one. Furthermore the parameter space of the conformal weights is just one-dimensional, unlike the corresponding parameter spaces for the three isomorphic N = 2 algebras. For both reasons, one would have naively suggested that the singular dimensions have to be less than or equal to 1. Surprisingly enough we will show that the maximal singular dimension is in fact 2. Surely, these two-dimensional spaces are not spanned by any tangent space of vanishing surfaces corresponding to singular vectors – as is the case for the three isomorphic N = 2 algebras – simply due to the fact that the parameter space is only one-dimensional. The singular dimension 2 arises rather in a completely new way due to the intersection of two different one-parameter families of singular vectors. This is another reason for which the twisted N = 2 algebra is so far unique of its kind. Superconformal algebras seem to have many surprising features. 
Not only does this paper show that recent achievements follow the correct way to tackle these and to learn much, which will be helpful to analyse even more complicated conformal structures, it also shows that the twisted N = 2 algebra gives more new and different insight in superconformal representation theory. Furthermore this paper creates the basis for the study of the embedding structure of the twisted N = 2 highest weight representations since we deduce the singular dimensions as well as the multiplication rules for singular vector operators. Both are crucial for the analysis of the embedding structure. We will also give answers to the question whether the twisted N = 2 superconformal algebra contains subsingular vectors. The study of the embedding diagrams, that is the structure of descendant (secondary) singular vectors, will be considered in a forthcoming publication. This paper is organised in the following way. After a small introduction with the necessary facts about the twisted N = 2 algebra in Sect. 2, we review the concept of adapted orderings and its main implications in Sect. 3, in a version suitable for the twisted N = 2 algebra. In Sect. 4 we then introduce an adapted ordering for the Verma modules of the twisted N = 2 algebra. This allows us to compute the upper limits for the singular dimensions and the multiplication rules for singular vector operators in Sect. 5. That these upper limits are in fact maximal dimensions is demonstrated in Sect. 6 using explicit examples which are supplemented by more examples in Appendix A. Based on both theoretical results and explicit computational results we give a reliable conjecture for the coefficients of the relevant terms of all twisted N = 2 singular vectors in Sect. 7. Using this conjecture we can characterize all cases of degenerate (i.e. two-dimensional) singular vectors for all levels. 
This conjecture leads to the discovery of subsingular vectors, which are singular in the G-closed Verma modules considered in Sect. 8 and in Appendix B. It also leads to the identification of all G-closed singular vectors, which are the issue of Sect. 9. We conclude the paper with some final remarks in Sect. 10.
2. The Twisted N = 2 Superconformal Algebra
The twisted N = 2 superconformal algebra $J_2$ consists of the Virasoro algebra generators$^{2}$ $L_m$, $m \in \mathbb{Z}$, corresponding to the stress-energy tensor, a Heisenberg algebra $T_r$, with half-integral $r \in \mathbb{Z}_{1/2}$, corresponding to the U(1) current, and the fermionic generators $G_k$, $k \in \mathbb{Z}^{1/2}$, which are the modes of the two spin-3/2 fermionic fields. $J_2$ satisfies the following commutation relations$^{3}$:
\[
\begin{aligned}
[L_m, L_n] &= (m-n)\,L_{m+n} + \frac{C}{12}\,(m^3-m)\,\delta_{m+n,0}, \\
[L_m, G_k] &= \left(\tfrac{1}{2}m - k\right) G_{m+k}, \\
[L_m, T_r] &= -r\,T_{m+r}, \\
[T_r, T_s] &= \tfrac{1}{3}\,C\,r\,\delta_{r+s,0}, \\
[T_r, G_k] &= G_{r+k}, \\
\{G_k, G_l\} &= 2\,\delta^{\mathbb{Z}}_{k+l}\,(-1)^{2k} L_{k+l} - \delta^{\mathbb{Z}_{1/2}}_{k+l}\,(-1)^{2k}(k-l)\,T_{k+l} + \frac{C}{3}\,\delta^{\mathbb{Z}}_{k+l}\,(-1)^{2k}\left(k^2-\tfrac{1}{4}\right)\delta_{k+l,0}, \\
[L_m, C] &= [T_r, C] = [G_k, C] = 0,
\end{aligned}
\tag{1}
\]
with $\delta^{S}_{m} = 1$ if $m \in S$ and $\delta^{S}_{m} = 0$ otherwise, $(-1)^{2k} = +1$ for $k \in \mathbb{Z}$ and $(-1)^{2k} = -1$ for $k \in \mathbb{Z}_{1/2}$, $m, n \in \mathbb{Z}$, $r, s \in \mathbb{Z}_{1/2}$, and finally $k, l \in \mathbb{Z}^{1/2}$. For the Neveu–Schwarz, Ramond and topological N = 2 algebras one usually chooses a basis in which the odd generators are nilpotent. If we want to keep the $L_0$-grading of the basis for the algebra generators then such a choice does not exist for the twisted N = 2 algebra. Nevertheless, the squares of the fermionic operators can be expressed in terms of Virasoro operators, as one can deduce easily from the commutation relations Eqs. (1):
\[
G_k^2 = (-1)^{2k} L_{2k} - \delta_{k,0}\,\frac{C}{24}, \qquad k \in \mathbb{Z}^{1/2}.
\tag{2}
\]
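As a quick consistency check (a sketch added for illustration, not part of the original derivation), Eq. (2) can be read off from the anticommutator in Eqs. (1) by setting $l = k$ and using $G_k^2 = \frac{1}{2}\{G_k, G_k\}$. The form of the anticommutator written in the first comment line is the one assumed here; note that $2k \in \mathbb{Z}$ for every $k \in \mathbb{Z}^{1/2}$, so the $T$-term never contributes.

```latex
% Assumed input (anticommutator of Eqs. (1)):
%   \{G_k, G_l\} = 2\,\delta^{\mathbb{Z}}_{k+l}(-1)^{2k} L_{k+l}
%                - \delta^{\mathbb{Z}_{1/2}}_{k+l}(-1)^{2k}(k-l)\,T_{k+l}
%                + \tfrac{C}{3}\,\delta^{\mathbb{Z}}_{k+l}(-1)^{2k}\,(k^2-\tfrac14)\,\delta_{k+l,0}
\begin{align*}
G_k^2 = \tfrac{1}{2}\{G_k, G_k\}
  &= (-1)^{2k} L_{2k} + \frac{C}{6}\,(-1)^{2k}\left(k^2 - \tfrac{1}{4}\right)\delta_{2k,0} \\
  &= (-1)^{2k} L_{2k} - \delta_{k,0}\,\frac{C}{24} .
\end{align*}
% Second line: \delta_{2k,0} forces k = 0, where (-1)^{0} = 1 and k^2 - 1/4 = -1/4.
```

In particular $G_0^2 = L_0 - C/24$, which is the relation used whenever $G_0$ acts on a highest weight vector.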
The central term $C$ commutes with all other operators and can therefore be fixed as $c \in \mathbb{C}$. $H_{J_2} = \mathrm{span}\{L_0, C\}$ defines a commuting subalgebra of $J_2$, which can therefore be diagonalised simultaneously. Generators with positive index span the set of positive operators $J_2^{+}$ of $J_2$ and likewise generators with negative index span the set of negative operators $J_2^{-}$ of $J_2$:
\[
J_2^{+} = \mathrm{span}\{L_m, T_r, G_k : m \in \mathbb{N},\ r \in \mathbb{N}_{1/2},\ k \in \mathbb{N}^{1/2}\},
\tag{3}
\]
\[
J_2^{-} = \mathrm{span}\{L_{-m}, T_{-r}, G_{-k} : m \in \mathbb{N},\ r \in \mathbb{N}_{1/2},\ k \in \mathbb{N}^{1/2}\}.
\tag{4}
\]
$^{2}$ We use the notation $\mathbb{N} = \{1, 2, 3, \dots\}$, $\mathbb{N}_0 = \{0, 1, 2, \dots\}$, $\mathbb{N}_{1/2} = \{\frac{1}{2}, \frac{3}{2}, \frac{5}{2}, \dots\}$, $\mathbb{N}^{1/2} = \mathbb{N} \cup \mathbb{N}_{1/2}$, $\mathbb{N}_0^{1/2} = \mathbb{N}_0 \cup \mathbb{N}_{1/2}$, and also $\mathbb{Z} = \{\dots, -1, 0, 1, 2, \dots\}$, $\mathbb{Z}_{1/2} = \{\dots, -\frac{3}{2}, -\frac{1}{2}, \frac{1}{2}, \frac{3}{2}, \dots\}$, $\mathbb{Z}^{1/2} = \mathbb{Z} \cup \mathbb{Z}_{1/2}$.
$^{3}$ The earlier way of writing the twisted N = 2 commutation relations, as for instance given by Boucher, Friedan, and Kent [8], is to distinguish the modes of the two fermionic fields explicitly using superscripts 1 and 2. However, algebraically it is not necessary to distinguish them as one of the fields has modes with integral indices and the other one has modes with half-integral indices. In order to avoid the imaginary unit $i$ appearing in the commutation relations, we have, for convenience, redefined $iG_r$ as $G_r$ for $r \in \mathbb{Z}_{1/2}$ compared to the notation in Ref. [8].
The zero modes are spanned by $J_2^{0} = \mathrm{span}\{L_0, G_0, C\}$ such that the generator $G_0$ classifies the different types of Verma modules as we will analyse shortly. One usually extends the algebra by the parity operator$^{4}$ $(-1)^F$ which commutes with all operators $L_m$, $T_r$, and $C$ and anticommutes with $G_k$:
\[
[(-1)^F, L_m] = [(-1)^F, T_r] = [(-1)^F, C] = 0, \qquad \{(-1)^F, G_k\} = 0, \qquad m \in \mathbb{Z},\ r \in \mathbb{Z}_{1/2},\ k \in \mathbb{Z}^{1/2}.
\]
The operator $(-1)^F$ will serve us later to distinguish fermionic from bosonic states in the space of states. As $(-1)^F$ commutes with $H_{J_2}$ it can also be diagonalised simultaneously with $H_{J_2}$. As usual, a simultaneous eigenvector $|\Delta\rangle$ of $H_{J_2}$ and $(-1)^F$ with $L_0$-eigenvalue $\Delta$ (the conformal weight), $C$-eigenvalue$^{5}$ $c$ (the conformal anomaly), and vanishing $J_2^{+}$ action is called a highest weight vector, corresponding in fact to the state with lowest conformal weight in a given representation of the algebra. Unless otherwise stated, we set $|\Delta\rangle$ to have $(-1)^F$-eigenvalue$^{6}$ (parity) $+$. Additional zero-mode vanishing conditions are possible only with respect to the operator $G_0$ which may or may not annihilate a highest weight vector $|\Delta\rangle$ (cf. Ref. [16, Def. 3.A]). If $G_0$ annihilates the highest weight vector we shall denote it by a superscript in $|\Delta\rangle^{G}$ and call this highest weight vector G-closed. Since $G_0^2 = L_0 - \frac{c}{24}$, any G-closed vector necessarily has conformal weight $\Delta = \frac{c}{24}$. Verma modules are defined in the usual way as the left module obtained by acting with the universal enveloping algebra $U(J_2)$ on a highest weight vector:
\[
V_\Delta = U(J_2) \otimes_{H_{J_2} \oplus J_2^{+}} |\Delta\rangle,
\tag{5}
\]
\[
V^{G}_{\frac{c}{24}} = U(J_2) \otimes_{H_{J_2} \oplus J_2^{+} \oplus \mathrm{span}\{G_0\}} \left|\tfrac{c}{24}\right\rangle^{G}.
\tag{6}
\]
If the highest weight vector is G-closed, then we call the Verma module $V^{G}_{\frac{c}{24}}$ (which is therefore not complete) also G-closed.
$^{4}$ Note that due to Eq. (2) the fermion number $F$ is not well defined; however, the parity $(-1)^F$ is.
$^{5}$ For simplicity we will suppress the eigenvalue of $C$ in $|\Delta, c\rangle$ and simply write $|\Delta\rangle$.
$^{6}$ Instead of parity $\pm 1$ we shall simply say parity $\pm$.
For an eigenvector $\Psi_l^{p}$ of $H_{J_2}$ in $V_\Delta$ the conformal weight is $\Delta + l$ and the parity is $p$ with $l \in \mathbb{N}_0^{1/2}$ and $p \in \{\pm\}$. For convenience we will sometimes denote the level and the parity as $|\Psi_l^{p}|_L = l$ and $|\Psi_l^{p}|_F = p$. Singular vectors $\Psi_l^{\pm}$ of a Verma module $V_\Delta$ are eigenvectors of $H_{J_2}$ with parity $\pm$ that are not proportional to the highest weight vector but are annihilated by $J_2^{+}$. They correspond therefore to the states with lowest conformal weight in a given subrepresentation of the algebra. They may also satisfy an additional vanishing condition with respect to the operator $G_0$, in which case we add a superscript $G$ in the notation $\Psi_l^{\pm G}$ and call it a G-closed singular vector. Obviously G-closed singular vectors satisfy similar restrictions on their conformal weight as G-closed highest weight vectors: $\Delta + l = \frac{c}{24}$. Most singular vectors can be constructed by acting
with algebra generators on another singular vector at a lower (or equal) level. These are called secondary singular vectors. Otherwise, they are called primitive singular vectors. The singular vector operator $\theta_l^{\pm}$ of $\Psi_l^{\pm}$ is the unique operator in $U(J_2^{-} \oplus \{G_0\})$ such that $\Psi_l^{\pm} = \theta_l^{\pm}|\Delta\rangle$, and similarly for $\Psi_l^{\pm G}$, where $U(J_2^{-} \oplus \{G_0\})$ is replaced by $U(J_2^{-})$. Singular vectors and all their descendant vectors (i.e. all the states that are produced by acting with algebra generators on the singular vectors) have vanishing (pseudo-)norms under the standard inner product. Vectors with zero norm are usually called null vectors$^{7}$. The quotient module where all null vectors are set to 0 is hence an irreducible highest weight representation. Conversely, all irreducible highest weight representations can be constructed in this way. However, singular vectors and their descendants do not necessarily span the whole submodule of null vectors. The quotient module of the Verma module divided by the submodule spanned by all singular vectors may again contain new singular vectors, called subsingular vectors. Subsingular vectors and their descendants are also null vectors. Subsingular vectors appear neither for the Virasoro algebra nor for the N = 1 superconformal algebras. However, they have been discovered for the three isomorphic N = 2 superconformal algebras in Refs. [21, 20]. In these cases, the existence of subsingular vectors is very closely related to the existence of singular operators annihilating a singular vector, as shall be discussed in a forthcoming publication [15]. That the twisted N = 2 superconformal algebra also has subsingular vectors will be shown in Sect. 8 of this paper. Even though the set of null vectors consists of singular vectors, subsingular vectors, and all their descendants, the null vectors with lowest level in a Verma module are certainly singular vectors (i.e. annihilated by all the positive operators).
Therefore, the lowest level at which the determinant of the inner product matrix vanishes indicates the presence of (at least) one singular vector. Hence the determinant formula is one of the most important tools for analysing highest weight representations. For the twisted N = 2 superconformal algebra the determinant formula has been given by Boucher, Friedan, and Kent [8]. For level zero the determinant formulae are $\det M_0^{+} = 1$ and $\det M_0^{-} = \Delta - \frac{c}{24}$. For level $l \in \mathbb{N}^{1/2}$ and parity $\pm$ the determinant formulae are:
\[
\det M_l^{\pm} = \left(\Delta - \frac{c}{24}\right)^{\frac{P(l)}{2}} \prod_{\substack{1 \le rs \le 2l \\ s\ \mathrm{odd}}} \left[\, 2\left(\frac{c}{3} - 1\right)\left(\Delta - \frac{c}{24}\right) + \frac{1}{4}\left(\left(\frac{c}{3} - 1\right) r + s\right)^{2} \right]^{P\left(l - \frac{rs}{2}\right)},
\tag{7}
\]
where $r$ and $s$ are positive integers ($s$ odd). The partition function $P(l)$, that can be found in Ref. [8], satisfies $P(0) = 1$. The two parity sectors $\pm$ have exactly the same determinant expressions. This correspondence is in fact even closer since the operator $G_0$ interpolates between the two sectors, provided $G_0$ does not act on any G-closed vector. Let us analyse this correspondence more closely for singular vectors. For a given singular vector $\Psi_l^{+} = \theta_l^{+}|\Delta\rangle \in V_\Delta$ with singular operator $\theta_l^{+}$ we can construct two negative parity singular vectors:
\[
\Psi^{-}_{G,l} = \theta_l^{+}\, G_0\, |\Delta\rangle,
\tag{8}
\]
\[
{}_{G}\Psi^{-}_{l} = G_0\, \theta_l^{+}\, |\Delta\rangle.
\tag{9}
\]
$^{7}$ Strictly speaking, null-vectors should only be vectors in the kernel of the inner product matrix and hence decouple completely from the whole space of states. If the pseudo-norm is not positive semi-definite there may be vectors with vanishing pseudo-norm that are not null-vectors [15].
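Likewise, $G_0$ constructs two positive parity singular vectors from a singular vector $\Psi_l^{-} = \theta_l^{-}|\Delta\rangle \in V_\Delta$:
\[
\Psi^{+}_{G,l} = \theta_l^{-}\, G_0\, |\Delta\rangle,
\tag{10}
\]
\[
{}_{G}\Psi^{+}_{l} = G_0\, \theta_l^{-}\, |\Delta\rangle.
\tag{11}
\]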
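The reason why acting with $G_0$ preserves the highest weight conditions can be made explicit. The following short computation is a sketch added for illustration; it uses only the commutators of $G_0$ with the positive operators as they follow from Eqs. (1), which are restated in the comments as assumptions.

```latex
% Assumed from Eqs. (1), for m \in \mathbb{N}, r \in \mathbb{N}_{1/2}, k \in \mathbb{N}^{1/2}:
%   [L_m, G_0] = \tfrac{m}{2}\, G_m, \qquad [T_r, G_0] = G_r,
%   \{G_k, G_0\} = 2 L_k \ (k \in \mathbb{N}), \qquad \{G_k, G_0\} = k\, T_k \ (k \in \mathbb{N}_{1/2}).
% Let \Psi be any vector annihilated by J_2^{+}. Then
\begin{align*}
L_m\, G_0 \Psi &= G_0\, L_m \Psi + \tfrac{m}{2}\, G_m \Psi = 0, \\
T_r\, G_0 \Psi &= G_0\, T_r \Psi + G_r \Psi = 0, \\
G_k\, G_0 \Psi &= -\,G_0\, G_k \Psi + \{G_k, G_0\}\,\Psi = 0 ,
\end{align*}
% since every operator appearing on the right-hand sides is again a positive operator.
```

Taking $\Psi$ to be a singular vector covers the vectors of the form $G_0\,\theta|\Delta\rangle$. Taking $\Psi = |\Delta\rangle$ shows that $G_0|\Delta\rangle$ again satisfies the highest weight conditions with the same conformal weight and opposite parity, which is why $\theta\, G_0|\Delta\rangle$ is also annihilated by $J_2^{+}$. In either case the resulting vector may vanish, for instance when $G_0$ acts on a G-closed vector.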
One might therefore believe that singular vectors always come in groups of (at least) 4. However, this assumes that these 4 vectors are actually linearly independent, which is not true, as one can guess from the fact that $P(0) = 1$ in the determinant formulae, pointing towards the existence of, generically, one singular vector of each parity at the same level in the same Verma module. This issue will be considered in Sect. 7. In the case of G-closed Verma modules or G-closed singular vectors some of these vectors are actually trivial, an issue that is important for the discussion of embedding diagrams.
For elements $Y$ of $J_2$ which are eigenvectors of $H_{J_2}$ and $(-1)^F$ with respect to the adjoint representation we define the level $|Y|_L$ as $[L_0, Y] = |Y|_L\, Y$ and in addition the parity $|Y|_F$ as $[(-1)^F, Y] = |Y|_F\, Y$. In particular, elements of the form
\[
Y = L_{-m_L} \dots L_{-m_1}\, T_{-s_T} \dots T_{-s_1}\, G_{-k_G} \dots G_{-k_1}\, T_{-\frac{1}{2}}^{\,n}\, G_{-\frac{1}{2}}^{\,r_1}\, G_0^{\,r_2},
\tag{12}
\]
where $r_1, r_2 \in \{0, 1\}$, and any reorderings of $Y$ have level $|Y|_L = \sum_{j=1}^{L} m_j + \sum_{j=1}^{T} s_j + \sum_{j=1}^{G} k_j + \frac{n + r_1}{2}$ and parity $|Y|_F = (-1)^{G + r_1 + r_2}$. For these elements we shall also define their length $\|Y\| = L + T + G$. For the trivial case we set $|1|_L = \|1\| = 0$ and $|1|_F = +1$. For convenience we define the following sets of negative operators for $l \in \mathbb{N}^{1/2}$:
\[
\mathcal{L}_l = \{Y = L_{-m_L} \dots L_{-m_1} : m_L \ge \dots \ge m_1 \ge 1,\ |Y|_L = l\},
\tag{13}
\]
\[
\mathcal{T}_l = \{Y = T_{-s_T} \dots T_{-s_1} : s_T \ge \dots \ge s_1 \ge \tfrac{3}{2},\ |Y|_L = l\},
\tag{14}
\]
\[
\mathcal{G}_l = \{Y = G_{-k_G} \dots G_{-k_1} : k_G > \dots > k_1 \ge 1,\ |Y|_L = l\},
\tag{15}
\]
\[
\mathcal{L}_0 = \mathcal{T}_0 = \mathcal{G}_0 = \{1\}.
\tag{16}
\]
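The sets of Eqs. (13) to (16) are small enough to enumerate mechanically. The following Python sketch is an illustration only and not part of the original paper. It assumes the index ranges $m_j \in \mathbb{N}$, $s_j \in \mathbb{N}_{1/2}$ with $s_j \ge 3/2$, and $k_j \in \mathbb{N}^{1/2}$ with $k_j \ge 1$ strictly ordered, and it stores every index doubled so that half-integers become odd integers. All function names are hypothetical.

```python
from fractions import Fraction


def partitions(total, is_allowed, distinct=False, max_part=None):
    """Yield non-increasing tuples of allowed positive integers summing to `total`.

    With distinct=True the parts are strictly decreasing.
    """
    if total == 0:
        yield ()
        return
    top = total if max_part is None else min(total, max_part)
    for part in range(top, 0, -1):
        if is_allowed(part):
            bound = part - 1 if distinct else part
            for rest in partitions(total - part, is_allowed, distinct, bound):
                yield (part,) + rest


# Every index is stored doubled: the tuple (4, 2) stands for L_{-2} L_{-1}.
def L_set(two_l):
    """Products of L generators at level l = two_l / 2 (doubled parts even, >= 2)."""
    return list(partitions(two_l, lambda j: j % 2 == 0))


def T_set(two_l):
    """Products of T generators with s_j >= 3/2 (doubled parts odd, >= 3)."""
    return list(partitions(two_l, lambda j: j % 2 == 1 and j >= 3))


def G_set(two_l):
    """Products of G generators with k_j >= 1, strictly ordered (doubled parts >= 2)."""
    return list(partitions(two_l, lambda j: j >= 2, distinct=True))


def show(symbol, parts):
    """Render a tuple of doubled indices as a product of generators."""
    if not parts:
        return "1"
    return " ".join(f"{symbol}_{{-{Fraction(p, 2)}}}" for p in parts)


if __name__ == "__main__":
    for parts in G_set(5):          # level 5/2
        print(show("G", parts))
```

For example, `L_set(4)` lists the two elements of level 2 built from Virasoro generators, and `G_set(5)` lists the fermionic products of level 5/2. The empty tuple plays the role of the identity in Eq. (16).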
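We can now define a graded basis for the Verma modules which we use as the standard basis and on which we shall later construct the adapted ordering. A preliminary step is to define for $l \in \mathbb{N}_0^{1/2}$,
\[
S_l^{\pm} = \left\{ Y = L\,T\,G : L \in \mathcal{L}_m,\ T \in \mathcal{T}_r,\ G \in \mathcal{G}_k,\ |Y|_L = l = m + r + k,\ |Y|_F = \pm 1 = |G|_F,\ m \in \mathbb{N}_0,\ r, k \in \mathbb{N}_0^{1/2} \right\}.
\tag{17}
\]
This leads for $l \in \mathbb{N}_0^{1/2}$ to:
\[
C_l^{\pm} = \left\{ S_{k,p}\, T_{-\frac{1}{2}}^{\,2l-2k-r_1}\, G_{-\frac{1}{2}}^{\,r_1}\, G_0^{\,r_2} : S_{k,p} \in S_k^{p},\ k \in \mathbb{N}_0^{1/2},\ r_1, r_2 \in \{0, 1\},\ 2l - 2k - r_1 \ge 0,\ p\,(-1)^{r_1 + r_2} = \pm 1 \right\}.
\tag{18}
\]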
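To get a feeling for the size of the sets $C_l^{\pm}$, the following Python sketch enumerates them directly from Eqs. (17) and (18). It is an illustration added here, not part of the original paper, and all names are hypothetical. Indices are stored doubled, as integers. As a cross-check (our own observation, obtained by multiplying the generating functions of the individual factors), the number of elements of each parity at doubled level $2l$ should equal the coefficient of $x^{2l}$ in $\prod_{j \ge 1} (1 + x^j)/(1 - x^j)$.

```python
def partitions(total, is_allowed, distinct=False, max_part=None):
    """Yield non-increasing tuples of allowed positive integers summing to `total`."""
    if total == 0:
        yield ()
        return
    top = total if max_part is None else min(total, max_part)
    for part in range(top, 0, -1):
        if is_allowed(part):
            bound = part - 1 if distinct else part
            for rest in partitions(total - part, is_allowed, distinct, bound):
                yield (part,) + rest


def leading_parts(two_k):
    """Elements of S_k (both parities) as (L, T, G, parity); indices are doubled."""
    found = []
    for a in range(two_k + 1):              # doubled level carried by the L-part
        for b in range(two_k - a + 1):      # doubled level carried by the T-part
            c = two_k - a - b               # doubled level carried by the G-part
            for L in partitions(a, lambda j: j % 2 == 0):
                for T in partitions(b, lambda j: j % 2 == 1 and j >= 3):
                    for G in partitions(c, lambda j: j >= 2, distinct=True):
                        found.append((L, T, G, (-1) ** len(G)))
    return found


def basis_terms(two_l, parity):
    """Elements of C_l^{parity} as (L, T, G, n, r1, r2); n is the power of T_{-1/2}."""
    terms = []
    for two_k in range(two_l + 1):
        for r1 in (0, 1):
            n = two_l - two_k - r1          # exponent 2l - 2k - r1 of T_{-1/2}
            if n < 0:
                continue
            for r2 in (0, 1):
                for (L, T, G, p) in leading_parts(two_k):
                    if p * (-1) ** (r1 + r2) == parity:
                        terms.append((L, T, G, n, r1, r2))
    return terms


def product_formula_counts(n_max):
    """Coefficients of prod_{j>=1} (1 + x^j) / (1 - x^j) up to x^n_max."""
    coeffs = [1] + [0] * n_max
    for j in range(1, n_max + 1):
        for i in range(n_max, j - 1, -1):   # multiply by (1 + x^j)
            coeffs[i] += coeffs[i - j]
        for i in range(j, n_max + 1):       # multiply by 1 / (1 - x^j)
            coeffs[i] += coeffs[i - j]
    return coeffs


if __name__ == "__main__":
    for two_l in range(6):
        print(two_l / 2, len(basis_terms(two_l, +1)), len(basis_terms(two_l, -1)))
```

At level 0 the enumeration returns the identity for parity $+$ and $G_0$ for parity $-$, and at level 1/2 it returns two terms of each parity, in agreement with a count by hand.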
A typical element of $C_l^{\pm}$ is hence of the form
\[
Y = L_{-m_L} \dots L_{-m_1}\, T_{-s_T} \dots T_{-s_1}\, G_{-k_G} \dots G_{-k_1}\, T_{-\frac{1}{2}}^{\,n}\, G_{-\frac{1}{2}}^{\,r_1}\, G_0^{\,r_2},
\tag{19}
\]
$r_1, r_2 \in \{0, 1\}$, such that $|Y|_L = l$ and $|Y|_F = \pm 1$. $S_{k,p}$ of $Y \in C_l^{\pm}$ is called the leading part of $Y$ and is denoted by $Y^{*}$. Hence, we can define the following standard basis:
\[
B_\Delta = \left\{ X\,|\Delta\rangle : X \in C_l^{\pm},\ l \in \mathbb{N}_0^{1/2} \right\},
\tag{20}
\]
obtaining finally the Verma module $V_\Delta = \mathrm{span}\{B_\Delta\}$. The basis Eq. (20) is naturally $\mathbb{N}_0^{1/2} \otimes \{\pm 1\}$ graded with respect to its $L_0$ and $(-1)^F$ eigenvalues, where the $L_0$-eigenvalue (the level) is seen relative to the eigenvalue of the highest weight vector. The normal form of an eigenvector of the commuting subalgebra $H_{J_2}$ is defined as the basis decomposition with respect to the standard basis Eq. (20):
\[
\Psi_l^{p} = \sum_{X \in C_l^{p}} c_X\, X\, |\Delta\rangle.
\tag{21}
\]
We call the operators $X \in C_l^{p}$ simply the terms of $\Psi_l^{p}$ and the coefficients $c_X$ its coefficients. Terms $X$ with non-trivial coefficients $c_X$ are called non-trivial terms of $\Psi_l^{p}$. For each of the two types of twisted Verma modules $V_\Delta$ and $V^{G}_{\frac{c}{24}}$ we can thus think of two types of singular vectors $\Psi_l^{\pm}$ and $\Psi^{\pm G}_{\frac{c}{24} - \Delta}$ (both). However, the latter does not exist in G-closed Verma modules, as G-closed Verma modules $V^{G}_{\frac{c}{24}}$ contain at level 0 only the highest weight vector$^{8}$. To conclude this section, we should mention that in the case of a G-closed Verma module $V^{G}_{\frac{c}{24}}$ the basis Eq. (20) simplifies to
\[
B^{G}_{\frac{c}{24}} = \left\{ X \left|\tfrac{c}{24}\right\rangle^{G} : X \in C_l^{\pm G},\ l \in \mathbb{N}_0^{1/2} \right\},
\tag{22}
\]
where
\[
C_l^{\pm G} = \left\{ S_{k,p}\, T_{-\frac{1}{2}}^{\,2l-2k-r_1}\, G_{-\frac{1}{2}}^{\,r_1} : S_{k,p} \in S_k^{p},\ k \in \mathbb{N}_0^{1/2},\ r_1 \in \{0, 1\},\ 2l - 2k - r_1 \ge 0,\ p\,(-1)^{r_1} = \pm 1 \right\}, \qquad l \in \mathbb{N}_0^{1/2}.
\tag{23}
\]
All other definitions and results carry over to the G-closed case simply by considering $\Delta = \frac{c}{24}$.
$^{8}$ This resembles very much the result, for the Neveu–Schwarz and the topological N = 2 algebras, that chiral singular vectors do not exist in chiral Verma modules.
3. Adapted Orderings and Singular Dimensions
A key question for the theory of highest weight representations is the dimension of any space of singular vectors that may appear at a given level $l \in \mathbb{N}_0^{1/2}$ with fixed parity $p \in \{\pm\}$ in a Verma module $V_\Delta$. In Ref. [16] we have shown that these dimensions, called singular dimensions, are closely related to the ordering kernel of an adapted ordering on the set $C_l^{p}$. In fact, it is the number of elements of the ordering kernel that sets an upper limit on the singular dimensions of a Verma module. Before we introduce an adapted ordering $O$ on $C_l^{\pm}$ and compute its ordering kernel as defined in Ref. [16], we shall first review a few results on adapted orderings that are crucial for our reasoning. An adapted ordering is simply a one-sided bounded total ordering on $C_l^{p}$ satisfying the requirements specified in the following definition.
Definition 3.1. A total ordering $O$ on $C_l^{p}$ with global minimum is called adapted to the subset $C_l^{pA} \subset C_l^{p}$, in the Verma module $V_\Delta$ with annihilation operators $K = J_2^{+}$, if for any element $X_0 \in C_l^{pA}$ at least one annihilation operator $\Gamma \in K$ exists for which
\[
\Gamma\, X_0\, |\Delta\rangle = \sum_{X \in B_\Delta} c_X^{\Gamma X_0}\, X
\tag{24}
\]
contains a non-trivial term $\tilde{X} \in B_\Delta$ (i.e. $c_{\tilde{X}}^{\Gamma X_0} \neq 0$) such that for all $Y \in C_l^{p}$ with $X_0 <_O Y$ and $X_0 \neq Y$ the coefficient $c_{\tilde{X}}^{\Gamma Y}$ in
\[
\Gamma\, Y\, |\Delta\rangle = \sum_{X \in B_\Delta} c_X^{\Gamma Y}\, X
\tag{25}
\]
is trivial: $c_{\tilde{X}}^{\Gamma Y} = 0$. The complement of $C_l^{pA}$, $C_l^{pK} = C_l^{p} \setminus C_l^{pA}$, is the kernel with respect to the ordering $O$ in the Verma module $V_\Delta$.
If we replace in Def. 3.1 $C_l^{p}$ by $C_l^{pG}$, $C_l^{pA}$ by $C_l^{pGA}$, $V_\Delta$ by $V^{G}_{\frac{c}{24}}$, and $B_\Delta$ by $B^{G}_{\frac{c}{24}}$, then we obtain the corresponding definition for a G-closed adapted ordering $O_G$ with ordering kernel $C_l^{pGK}$. For both cases, $O$ and $O_G$, if we replace the annihilation conditions by $K = J_2^{+} \oplus \{G_0\}$ we obtain the corresponding definition for G-closed vectors at level $l$, parity $p$.
Our aim is to introduce an adapted ordering on $C_l^{p}$ such that the resulting ordering kernels are as small as possible. There are two main reasons to proceed in this way. First, for a given singular vector $\Psi_l^{p}$ of $V_\Delta$ the coefficients of the terms corresponding to the ordering kernel $C_l^{pK}$ determine the singular vector uniquely, i.e. if the normal forms of two singular vectors agree on the ordering kernel then the singular vectors are identical. This also implies that for a given singular vector $\Psi_l^{p}$ at least one element of the adapted kernel must be a non-trivial term of $\Psi_l^{p}$, otherwise $\Psi_l^{p}$ would agree on the whole adapted kernel with the trivial vector and would therefore be trivial itself. Secondly, the number of elements in the ordering kernel $C_l^{pK}$ is an upper limit for the singular dimensions of $V_\Delta$ at level $l$, parity $p$. These results are the content of two theorems which we proved in a very general setting in Ref. [16]. Due to their importance for our later considerations we repeat these theorems here adapted to the particular case of the twisted N = 2 algebra $J_2$.
Theorem 3.2. Let O denote an adapted ordering on Cl at level l, parity p, with ordering pK kernel Cl for a given Verma module V and annihilation operators K = J+ 2 . If the p,1 p,2 normal form of two vectors l and l at the same level l and parity p, both satisfying 1 = c2 for all terms X ∈ C pK , then the highest weight conditions, have cX X l p,1
l
p,2
≡ l
.
(26) pA
Theorem 3.3. Let O denote an adapted ordering on Cl at level l, parity p, with ordering pK kernel Cl for a given Verma module V and annihilation operators K = J+ 2 . If the pK ordering kernel Cl has n elements, then there are at most n linearly independent p singular vectors l in V at level l with parity p.
M. Dörrzapf, B. Gato-Rivera

Again, we can replace O on $C_l^p$ by $O^G$ on $C_l^{pG}$ in order to obtain both theorems for G-closed Verma modules $V^G_{c/24}$. If we extend the annihilation operators from $K = J_2^+$ to $K = J_2^+ \oplus \{G_0\}$, which normally leads to a smaller ordering kernel, then both theorems hold with this smaller ordering kernel for G-closed singular vectors $\Psi_l^{pG}$.

4. Adapted Ordering for the Twisted N = 2 Algebra
In this section we will define an adapted ordering on $C_l^p$, given by Eq. (18), and compute its ordering kernel which will later turn out to be the smallest possible ordering kernel. We will conclude this section by transferring these results to the G-closed case $C_l^{pG}$. For convenience, we shall first give an ordering on the sets $\mathcal{L}_m$, $\mathcal{T}_s$, and $\mathcal{G}_k$.

Definition 4.1. Let $\mathcal{Y}$ denote either $\mathcal{L}$, $\mathcal{T}$, or $\mathcal{G}$ (but the same throughout this definition) and take two elements $Y_i$, i = 1, 2, such that $Y_i = Z^i_{-k^i_{\|Y_i\|}} \cdots Z^i_{-k^i_1}$, $|Y_i|_L = k^i$ for $k^i \in \mathbb{N}_0^{1/2}$, or $Y_i = 1$, i = 1, 2, with $Z^i_{-k^i_j}$ being generators of the type $L_{-k^i_j}$, $T_{-k^i_j}$, or $G_{-k^i_j}$ depending on whether $\mathcal{Y}$ denotes $\mathcal{L}$, $\mathcal{T}$, or $\mathcal{G}$ respectively. For $Y_1 \neq Y_2$ we compute the index$^9$ $j_0 = \min\{j : k^1_j - k^2_j \neq 0,\ j = 1, \dots, \min(\|Y_1\|, \|Y_2\|)\}$. If non-trivial, $j_0$ is the index for which the levels of the generators in $Y_1$ and $Y_2$ first disagree when read from the right to the left. For $j_0 > 0$ we then define $Y_1$
(27)
Y1 Y2 .
(28)
If, however, j0 = 0, we set
In order to get more familiar with these definitions let us consider the following examples: L−2 L−2 L−1
where j0 = 2, where j0 = 0,
T− 3
where j0 = 0.
2
2
2
(29)
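We can now define an ordering on $C_l^p$ which will turn out to be adapted with a sufficiently small ordering kernel.

Definition 4.2. On the set $C_l^p$, $l \in \mathbb{N}_{1/2}$, $p \in \{\pm\}$ we introduce the total ordering O. For two elements $X_1, X_2 \in C_l^p$, $X_1 \neq X_2$ with $X_i = L^i T^i G^i T_{-1/2}^{n_i} G_{-1/2}^{r_1^i} G_0^{r_2^i}$, $L^i \in \mathcal{L}_{m_i}$, $T^i \in \mathcal{T}_{s_i}$, $G^i \in \mathcal{G}_{k_i}$ for some $m_i, n_i \in \mathbb{N}_0$, $s_i, k_i \in \mathbb{N}_{1/2}$, $r_1^i, r_2^i \in \{0, 1\}$, i = 1, 2, we define

$$X_1 <_O X_2 \quad \text{if} \quad n_1 > n_2. \qquad (30)$$

For $n_1 = n_2$ we set

$$X_1 <_O X_2 \quad \text{if} \quad r_1^1 > r_1^2. \qquad (31)$$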
9 For subsets of N we define min ∅ = 0.
Singular Dimensions of N = 2 Superconformal Algebras II
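If $r_1^1 = r_1^2$ then we set

$$X_1 <_O X_2 \quad \text{if} \quad G^1 <_{\mathcal{G}} G^2. \qquad (32)$$

In the case that $G^1 = G^2$ we then define

$$X_1 <_O X_2 \quad \text{if} \quad L^1 <_{\mathcal{L}} L^2. \qquad (33)$$

If further $L^1 = L^2$ we set

$$X_1 <_O X_2 \quad \text{if} \quad T^1 <_{\mathcal{T}} T^2, \qquad (34)$$

which finally has to give an answer. For $X_1 = X_2$ we define $X_1 <_O X_2$ and $X_2 <_O X_1$.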
Definition 4.2 is well-defined since one obtains an answer for any pair $X_1, X_2 \in C_l^p$, $X_1 \neq X_2$ after going through Eqs. (30)–(34), and hence the ordering O proves to be a total ordering on $C_l^p$. Namely, if Eqs. (30)–(34) do not give an answer on the ordering of $X_1$ and $X_2$, then obviously $X_1$ and $X_2$ are of the form $X_i = L\,T\,G\,T_{-1/2}^{n} G_{-1/2}^{r_1} G_0^{r_2^i}$, with common L, T, G, n, and also $r_1$. The fact that both $X_1$ and $X_2$ have the same parity p implies further $r_2^1 = r_2^2$ and ultimately $X_1 = X_2$.

It is easy to see that the O-smallest element of $C_l^+$ is $T_{-1/2}^{2l}$ followed by $T_{-1/2}^{2l-1} G_{-1/2} G_0$, whilst the O-smallest element of $C_l^-$ is $T_{-1/2}^{2l} G_0$ followed by $T_{-1/2}^{2l-1} G_{-1/2}$. We will now show that the ordering O is adapted and we will compute the ordering kernels. It will turn out that the elements of the ordering kernels are exactly the four elements just mentioned and therefore they identify all singular vectors according to Theorem 3.2. In Sect. 6 we shall give examples of singular vectors at low levels which show that these are indeed the smallest ordering kernels for general values of $\Delta$.
Theorem 4.3. For all central terms $c \in \mathbb{C}$, the ordering O is adapted to $C_l^p$ for all Verma modules $V_\Delta$, for all levels l with $l \in \mathbb{N}_0^{1/2}$, and for both parities $p \in \{\pm\}$. The ordering kernels are given by tables 1 and 2 for all levels$^{10}$ l.

Table 1. Ordering kernels for O, annihilation operators $J_2^+$

  $(-1)^F$    Ordering kernel
  +1          $\{T_{-1/2}^{2l},\ T_{-1/2}^{2l-1} G_{-1/2} G_0\}$
  −1          $\{T_{-1/2}^{2l} G_0,\ T_{-1/2}^{2l-1} G_{-1/2}\}$
For the proof of Theorem 4.3 we follow the lines of the proof for the topological N = 2 algebra given in Ref. [16]. There, however, our strategy for the proof was to find annihilation operators that were able to produce an additional operator $L_{-1}$ which would create a term that cannot be obtained by any O-bigger term. Obviously, in our present case $T_{-1/2}$ has taken over the rôle that was played by $L_{-1}$ in the topological N = 2 case. (As a matter of fact this is the first time that this rôle is played by a generator different from $L_{-1}$.) The following proof makes clear that in this case the adapted ordering method works for powers of $T_{-1/2}$.
10 Note that for level l = 0 some of the kernel elements obviously do not exist.
Table 2. Ordering kernels for O, annihilation operators $J_2^+$ and $G_0$

  $(-1)^F$    Ordering kernel
  +1          $\{T_{-1/2}^{2l}\}$
  −1          $\{T_{-1/2}^{2l} G_0\}$
Proof. Let us start with the most general term $X_0$ at level l, parity p, in $C_l^p$, $l \in \mathbb{N}_0^{1/2}$, $p \in \{\pm\}$,

$$X_0 = L^0 T^0 G^0 T_{-1/2}^{n} G_{-1/2}^{r_1} G_0^{r_2} \in C_l^p, \qquad (35)$$

with

$$L^0 = L_{-m_{\|L^0\|}} \cdots L_{-m_1} \in \mathcal{L}_m, \quad T^0 = T_{-s_{\|T^0\|}} \cdots T_{-s_1} \in \mathcal{T}_s, \quad G^0 = G_{-k_{\|G^0\|}} \cdots G_{-k_1} \in \mathcal{G}_k, \qquad (36)$$

$m, n \in \mathbb{N}_0$, $s, k \in \mathbb{N}_0^{1/2}$, $r_1, r_2 \in \{0, 1\}$, such that $l = m + s + k + \frac{n}{2} + \frac{r_1}{2}$ and $p = (-1)^{\|G^0\| + r_1 + r_2}$. We then construct the vector $\Psi_0 = X_0 |\Delta\rangle$.

In the case $G^0 \neq 1$ we look at $G_{-k_1}$ which necessarily has $k_1 > \frac{1}{2}$ and consider the positive operator $G_{k_1 - 1/2}$. The commutation relations of $G_{-k_1}$ and $G_{k_1 - 1/2}$ create a generator $T_{-1/2}$ (note that $2k_1 - \frac{1}{2} > 0$). Therefore, acting with the positive operator $G_{k_1 - 1/2}$ on $\Psi_0 = X_0 |\Delta\rangle$ we obtain a non-trivial term $X^G$ in $G_{k_1 - 1/2} \Psi_0$ of the form

$$X^G = L^0 T^0 \tilde{G}^0 T_{-1/2}^{n+1} G_{-1/2}^{r_1} G_0^{r_2} \in C^{-p}_{l - k_1 + 1/2}, \qquad \tilde{G}^0 = G_{-k_{\|G^0\|}} \cdots G_{-k_2}, \qquad (37)$$

or $\tilde{G}^0 = 1$ in the case $\|G^0\| = 1$. Any other term $Y \in C_l^p$ which also produces $X^G$ under the action of $G_{k_1 - 1/2}$ also creates at least one generator $T_{-1/2}$ under the action of $G_{k_1 - 1/2}$, or is already O-smaller than $X_0$ since Y would already have more generators $T_{-1/2}$ than $X_0$. The latter case is irrelevant for us in view of Definition 3.1 on adapted orderings. The action of one positive operator can however create at most one new generator. We can therefore restrict ourselves to terms Y of the form

$$Y = L^Y T^Y G^Y T_{-1/2}^{n} G_{-1/2}^{r_1^Y} G_0^{r_2^Y} \in C_l^p, \qquad (38)$$

with the same number n of generators $T_{-1/2}$ as $X_0$. If $G_{k_1 - 1/2}$ creates $X^G$ acting on Y, then the additional $T_{-1/2}$ could only be created in three ways. First, $G_{k_1 - 1/2}$ commutes with generators in $G^Y$ and creates $T_{-1/2}$. Second, $G_{k_1 - 1/2}$ commutes with generators in $L^Y T^Y$ and creates $G_k$ with $k < k_1$ which furthermore commutes with generators in $G^Y$ to create $T_{-1/2}$. Third, $G_{k_1 - 1/2}$ commutes with generators in $L^Y T^Y G^Y$ and creates $G_0$ which then commutes with $G_{-1/2}$ in the case of $r_1^Y = 1$ to create $T_{-1/2}$. In the third
case $r_1^Y = 1$ would obviously lead to Y being O-smaller than $X_0$ because then $r_1 = 0$ (as $X^G$ is the term produced from Y in this way), and due to Eq. (31). In the first and second cases we obviously need to have a generator $G_{-k}$ in $G^Y$ with $\frac{1}{2} < k < k_1$, or $G_{-k_1}$ itself is contained in $G^Y$ and $G_{k_1 - 1/2}$ commutes only with this generator. In the latter case $X_0$ would consequently equal Y and in the former case Y is again O-smaller than $X_0$ according to Eq. (32). Hence, we have shown that the ordering O is adapted on the set of terms $X_0$ of the form Eq. (35) with $G^0 \neq 1$. Let us therefore continue with terms $X_0$ of the form

$$X_0 = L^0 T^0 T_{-1/2}^{n} G_{-1/2}^{r_1} G_0^{r_2} \in C_l^p. \qquad (39)$$

If $L^0 \neq 1$ we consider the positive operator $T_{m_1 - 1/2}$. Acting with $T_{m_1 - 1/2}$ on $\Psi_0 = X_0 |\Delta\rangle$ creates a non-trivial term

$$X^T = \tilde{L}^0 T^0 T_{-1/2}^{n+1} G_{-1/2}^{r_1} G_0^{r_2} \in C^{p}_{l - m_1 + 1/2}, \qquad \tilde{L}^0 = L_{-m_{\|L^0\|}} \cdots L_{-m_2}, \qquad (40)$$

or $\tilde{L}^0 = 1$ in the case $\|L^0\| = 1$. Again, we find that any term $Y \in C_l^p$ which is not O-smaller than $X_0$ but still produces $X^T$ under the action of $T_{m_1 - 1/2}$ needs to create exactly one generator $T_{-1/2}$ and thus $n^Y = n$. As in the case before we find $r_1^Y = r_1$, and also $G^Y = 1$, otherwise $Y <_O X_0$. Hence Y is of the form

$$Y = L^Y T^Y T_{-1/2}^{n} G_{-1/2}^{r_1} G_0^{r_2} \in C_l^p, \qquad (41)$$

where also $r_2^Y = r_2$ due to parity equality with $X_0$. The only way of creating $T_{-1/2}$ from Y under the action of $T_{m_1 - 1/2}$ is via commutation with generators in $L^Y$, as the commutation with generators in $T^Y$ does not produce any new generators. Thus, $T_{m_1 - 1/2}$ can produce $T_{-1/2}$ in one step by commuting with $L_{-m_1}$ if this is contained in $L^Y$. In this case, however, Y turns out to be equal to $X_0$. Another possibility is that $T_{m_1 - 1/2}$ produces $T_{-1/2}$ in more than one step by commuting with generators in $L^Y$, which obviously requires $L^Y$ to contain generators $L_{-m}$ with $1 \le m < m_1$ and therefore $Y <_O X_0$. We can thus continue with terms $X_0$ of the form

$$X_0 = T^0 T_{-1/2}^{n} G_{-1/2}^{r_1} G_0^{r_2} \in C_l^p. \qquad (42)$$

If $T^0 \neq 1$ we take the positive operator $L_{s_1 - 1/2}$. Acting with $L_{s_1 - 1/2}$ on $\Psi_0 = X_0 |\Delta\rangle$ creates a non-trivial term

$$X^L = \tilde{T}^0 T_{-1/2}^{n+1} G_{-1/2}^{r_1} G_0^{r_2} \in C^{p}_{l - s_1 + 1/2}, \qquad \tilde{T}^0 = T_{-s_{\|T^0\|}} \cdots T_{-s_2}, \qquad (43)$$
or $\tilde{T}^0 = 1$ in the case $\|T^0\| = 1$. Similar arguments as above allow us to restrict ourselves to terms Y of the form

$$Y = T^Y T_{-1/2}^{n} G_{-1/2}^{r_1} G_0^{r_2} \in C_l^p, \qquad (44)$$

as all other terms Y that also produce $X^L$ under the action of $L_{s_1 - 1/2}$ would automatically be O-smaller than $X_0$. Similarly as before, the only way of creating $T_{-1/2}$ from Y under the action of $L_{s_1 - 1/2}$ is by commuting $L_{s_1 - 1/2}$ with generators in $T^Y$. Exactly the same arguments as for $L^Y$ together with Eq. (34) show that either $Y = X_0$ or $Y <_O X_0$. Finally, we act with $G_0$ on a term of the form

$$X_0 = T_{-1/2}^{n} G_{-1/2} G_0^{r_2} \in C_l^p, \qquad (45)$$

which creates a non-trivial term

$$X = T_{-1/2}^{n+1} G_0^{r_2} \in C_l^{-p}. \qquad (46)$$

Obviously, the only way of creating $T_{-1/2}$ by commuting $G_0$ with products of generators is by commuting it with $G_{-1/2}$. Therefore any other term Y that creates X under the action of $G_0$ and is not O-smaller than $X_0$ would also need to create one $T_{-1/2}$ and thus also have $r_1^Y = r_1 = 1$, besides $n^Y = n$. Trivially, we find in this case that $Y = X_0$. This proves the ordering kernel of Table 2 which finally completes our proof of Theorem 4.3.
To conclude this subsection we mention that we can easily transfer these results to G-closed Verma modules. It is fairly straightforward to see that the above proof also holds if we set everywhere $r_2 \equiv 0$. This however leads exactly to the G-closed case. Therefore, the same ordering O of Def. 4.2 will also serve as ordering $O^G$ on $C_l^{pG}$ in the case of a G-closed Verma module $V^G_{c/24}$. Setting $r_2 \equiv 0$ everywhere in the proof of Theorem 4.3 obviously reduces the ordering kernels. One easily finds the following theorem.

Theorem 4.4. For all central terms $c \in \mathbb{C}$, the ordering O is adapted to $C_l^{pG}$ for all Verma modules $V^G_{c/24}$, for all levels l with $l \in \mathbb{N}_0^{1/2}$, and for both parities $p \in \{\pm\}$. Ordering kernels are given by the following table for all levels l:

Table 3. Ordering kernels for O on $C_l^{pG}$, annihilation operators $J_2^+$

  $(-1)^F$    Ordering kernel
  +1          $\{T_{-1/2}^{2l}\}$
  −1          $\{T_{-1/2}^{2l-1} G_{-1/2}\}$
5. Singular Dimensions and Singular Vectors of the Twisted N = 2 Algebra

As a consequence of the results of the previous section we will now be able to give an upper limit for the dimensions of a singular vector space at a given level $l \in \mathbb{N}_0^{1/2}$ with fixed parity $p \in \{\pm\}$. In addition, we will use the ordering kernels in order to uniquely identify all singular vectors (primitive as well as secondary) of the twisted N = 2 superconformal algebra. In Sect. 6 we shall give explicit examples which prove that for particular Verma modules the upper limits are indeed the maximal singular dimensions.

Theorem 3.3 tells us that an upper limit for the dimension of a space of singular vectors at fixed level l and parity p is given by the number of elements in the corresponding ordering kernel $C_l^{pK}$. Using the tables of ordering kernels of Theorem 4.3 and Theorem 4.4 we can easily give the following maximal singular dimensions.

Theorem 5.1. For singular vectors $\Psi_l^p$ and G-closed singular vectors $\Psi_l^{pG}$ in the Verma modules $V_\Delta$ and $V^G_{c/24}$ (both), one finds the following upper limits for the number of linearly independent singular vectors at the same level $l \in \mathbb{N}_0^{1/2}$ and with the same parity $p \in \{\pm\}$:
Table 4. Maximal dimensions for singular spaces of the twisted N = 2 superconformal algebra p
l ∈ V p l ∈ V Gc 24 pG ∈ V c 24 − pG 0 ∈ V Gc
p = +1
p = −1
2
2
1
1
1
1
0
0
24
Surely, not at each level will the singular spaces have the dimensions given in the previous tables. In fact, for most levels in most Verma modules there will not be any singular vectors at all. The determinant formulae of the inner product matrix [8], given in Sect. 2, indicate in which Verma modules and at which levels singular vectors appear (for the first time). In the following section we shall give examples of these singular vectors for low levels.

From Theorem 3.2 we know that singular vectors can only exist if they contain in their normal form at least one non-trivial term of the corresponding ordering kernel of Theorem 4.3 or 4.4. Furthermore, if the normal forms of two singular vectors agree in the coefficients of the terms of the ordering kernel, then the singular vectors are identical. This allows us to identify existing singular vectors simply by their coefficients with respect to the terms of the ordering kernel. For the cases of Table 4 with singular dimension 1 this implies that two singular vectors are already proven to be proportional if we can simply show that they are on the same level with the same parity. In addition, if the coefficient of the term of the ordering kernel vanishes, then the singular vector is trivial. Thus, the coefficient of $T_{-1/2}^{2l}$ is already sufficient in order to see if a singular vector is trivial in the case of p = + (see Tables 2 and 3), whereas the coefficient of $T_{-1/2}^{2l} G_0$ or $T_{-1/2}^{2l-1} G_{-1/2}$ is sufficient in order to decide if a singular vector is trivial in the case p = −. When the ordering kernel is two-dimensional we need the coefficients of the two terms in order to uniquely define a singular vector. This justifies introducing an appropriate notation, as follows.
Definition 5.2. Let $\Psi_l^\pm = \theta_l^\pm |\Delta\rangle$ be a singular vector in $V_\Delta$ at level $l \in \mathbb{N}_0^{1/2}$ with parity ±. The normal form of $\Psi_l^+$ is completely determined by the coefficients of $T_{-1/2}^{2l}$ and $T_{-1/2}^{2l-1} G_{-1/2} G_0$, whilst the normal form of $\Psi_l^-$ is determined by the coefficients of $T_{-1/2}^{2l} G_0$ and $T_{-1/2}^{2l-1} G_{-1/2}$. We thus use the following notation:

$$(a, b)_l^+ = \theta_l^+ = a\, T_{-1/2}^{2l} + b\, T_{-1/2}^{2l-1} G_{-1/2} G_0 + \dots, \qquad (47)$$

$$(a, b)_l^- = \theta_l^- = a\, T_{-1/2}^{2l} G_0 + b\, T_{-1/2}^{2l-1} G_{-1/2} + \dots, \qquad (48)$$

for the singular vector operators $\theta_l^\pm$ given in their normal form, where $a, b \in \mathbb{C}$.

Notice that Def. 5.2 only makes sense if $\Psi_l^\pm$ is a singular vector. Certainly for most pairs $a, b \in \mathbb{C}$ singular vectors do not exist. The advantage of this notation lies in the fact that we can easily obtain multiplication rules for singular vector operators and construct descendant singular vectors. (In the context of embedding diagrams, this issue will be analysed in more detail in a forthcoming publication.) For example, the product of the first terms of the singular vector operators

$$\bigl(a_1 T_{-1/2}^{2l_1} + b_1 T_{-1/2}^{2l_1 - 1} G_{-1/2} G_0 + \dots\bigr)\bigl(a_2 T_{-1/2}^{2l_2} + b_2 T_{-1/2}^{2l_2 - 1} G_{-1/2} G_0 + \dots\bigr) = a_1 a_2\, T_{-1/2}^{2(l_1 + l_2)} + \Bigl(a_1 b_2 + a_2 b_1 - \frac{b_1 b_2}{2}\Bigr) T_{-1/2}^{2(l_1 + l_2) - 1} G_{-1/2} G_0 + \dots, \qquad (49)$$

gives us the multiplication rule for $(a_1, b_1)^+_{l_1} (a_2, b_2)^+_{l_2}$ provided the conformal weights match: $\Delta_2 + l_2 = \Delta_1$. In the same way we can easily find similar rules for products involving negative parity operators. The following theorem summarizes these results.
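Theorem 5.3. Given two singular vector operators $\theta_{l_1}^{p_1}$ and $\theta_{l_2}^{p_2}$ for the Verma modules $V_{\Delta_1}$ and $V_{\Delta_2}$ with $\Delta_2 + l_2 = \Delta_1$, then $\theta_{l_1}^{p_1} \theta_{l_2}^{p_2} |\Delta_2\rangle$ is either trivial or singular in $V_{\Delta_2}$ at level $l_1 + l_2$ with parity $p_1 p_2$. The resulting singular vector can be expressed in the following way, depending on the multiplication rules which in turn depend on the parities:

$$(a_1, b_1)^+_{l_1} (a_2, b_2)^+_{l_2} |\Delta_2\rangle = \Bigl(a_1 a_2,\ a_1 b_2 + a_2 b_1 - \frac{b_1 b_2}{2}\Bigr)^+_{l_1 + l_2} |\Delta_2\rangle, \qquad (50)$$

$$(a_1, b_1)^+_{l_1} (a_2, b_2)^-_{l_2} |\Delta_2\rangle = \Bigl(a_1 a_2,\ a_1 b_2 + a_2 b_1 \Bigl(\Delta_2 - \frac{c}{24}\Bigr) - \frac{b_1 b_2}{2}\Bigr)^-_{l_1 + l_2} |\Delta_2\rangle, \qquad (51)$$

$$(a_1, b_1)^-_{l_1} (a_2, b_2)^+_{l_2} |\Delta_2\rangle = \Bigl(a_1 a_2 - \frac{a_1 b_2}{2},\ -2 l_2 a_1 a_2 + a_2 b_1 - a_1 b_2 \Bigl(\Delta_2 - \frac{c}{24}\Bigr)\Bigr)^-_{l_1 + l_2} |\Delta_2\rangle, \qquad (52)$$

$$(a_1, b_1)^-_{l_1} (a_2, b_2)^-_{l_2} |\Delta_2\rangle = \Bigl(a_1 a_2 \Bigl(\Delta_2 - \frac{c}{24}\Bigr) - \frac{a_1 b_2}{2},\ a_2 b_1 - 2 l_2 a_1 a_2 - a_1 b_2\Bigr)^+_{l_1 + l_2} |\Delta_2\rangle. \qquad (53)$$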
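For the vectors $\Psi^\pm_{G,l}$ and ${}^G\Psi^\pm_l$, introduced in Eqs. (8)–(11), we can derive the corresponding expressions by looking at operator products of the type

$$\bigl(a\, T_{-1/2}^{2l} + b\, T_{-1/2}^{2l-1} G_{-1/2} G_0 + \dots\bigr) G_0 = a\, T_{-1/2}^{2l} G_0 + b\, T_{-1/2}^{2l-1} G_{-1/2} \Bigl(L_0 - \frac{c}{24}\Bigr) + \dots, \qquad (54)$$

and similar products for negative parity vectors. We obtain the following results for $\Psi^\pm_{G,l} = (a, b)^\pm_{G,l} |\Delta\rangle$ and ${}^G\Psi^\pm_l = {}^G(a, b)^\pm_l |\Delta\rangle$:

$${}^G(a, b)^-_l |\Delta\rangle = \Bigl(a - \frac{b}{2},\ -b \Bigl(\Delta - \frac{c}{24}\Bigr) - 2 l a\Bigr)^-_l |\Delta\rangle, \qquad (55)$$

$$(a, b)^-_{G,l} |\Delta\rangle = \Bigl(a,\ b \Bigl(\Delta - \frac{c}{24}\Bigr)\Bigr)^-_l |\Delta\rangle, \qquad (56)$$

$${}^G(a, b)^+_l |\Delta\rangle = \Bigl(a \Bigl(\Delta - \frac{c}{24}\Bigr) - \frac{b}{2},\ -2 l a - b\Bigr)^+_l |\Delta\rangle, \qquad (57)$$

$$(a, b)^+_{G,l} |\Delta\rangle = \Bigl(a \Bigl(\Delta - \frac{c}{24}\Bigr),\ b\Bigr)^+_l |\Delta\rangle. \qquad (58)$$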
6. Level 1/2, Level 1, and Level 3/2 Singular Vectors

Before we proceed in analysing the structure of the singular vectors we shall first give all singular vectors at the lower levels 1/2, 1 and 3/2. These examples already prove that the upper limits given for the singular dimensions are in fact not only upper limits but actually maximal singular dimensions for certain Verma modules.

The determinant formulae [8], given in Sect. 2, show the existence of the parity − singular vector $\Psi_0^-$ at level 0 in the Verma modules $V_{c/24}$, and also show the existence of the singular vectors $\Psi^\pm_{r,s}$ at level $\frac{rs}{2}$ with parity ±, $r, s \in \mathbb{N}$, s odd, in the Verma modules $V_{\Delta_{r,s}(t)}$ with:

$$\Delta_{r,s}(t) = -\frac{(rt + s)^2}{8t} + \frac{t}{8} + \frac{1}{8}, \qquad (59)$$

$$c = 3 + 3t. \qquad (60)$$
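Equations (59) and (60) are easy to evaluate exactly with rational arithmetic. The following is only a minimal sketch; the function names `conformal_weight`, `central_charge` and `level` are our own labels and do not appear in the paper. For (r, s) = (1, 1) the formula collapses to −(t + 1)/(8t), which is the weight appearing in the level 1/2 example.

```python
from fractions import Fraction


def central_charge(t):
    """Central term c = 3 + 3t, Eq. (60)."""
    return 3 + 3 * Fraction(t)


def conformal_weight(r, s, t):
    """Delta_{r,s}(t) = -(rt + s)^2 / (8t) + t/8 + 1/8, Eq. (59)."""
    t = Fraction(t)
    return -(r * t + s) ** 2 / (8 * t) + t / 8 + Fraction(1, 8)


def level(r, s):
    """Level rs/2 of the singular vectors labelled by (r, s), s odd."""
    return Fraction(r * s, 2)


if __name__ == "__main__":
    t = Fraction(2)
    for r, s in [(1, 1), (2, 1), (3, 1), (1, 3)]:
        print((r, s), level(r, s), conformal_weight(r, s, t), central_charge(t))
```

Using `Fraction` keeps all values exact, so equalities between conformal weights can be tested without rounding issues.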
These singular vectors $\Psi^\pm_{r,s}$ are generically primitive. However in the case of intersections, with two or more of them in the same Verma module, some of these singular vectors are actually secondary.

By explicit computer calculations [29] one finds the following singular vectors (one-parameter families of singular vectors, to be precise) for parity +. The corresponding results for parity − can easily be derived using the multiplication rules Eqs. (55)–(58). These singular vectors are given in Appendix A.

Level 1/2:

$$\Psi^+_{1,1}(t) = \bigl((t + 1)\, T_{-1/2} + 4t\, G_{-1/2} G_0\bigr) \Bigl|-\frac{t + 1}{8t}\Bigr\rangle. \qquad (61)$$
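Level 1:

$$\Psi^+_{2,1}(t) = \bigl((2t + 1)\, T_{-1/2}^2 + 8t\, T_{-1/2} G_{-1/2} G_0 + 2t (2t + 1)\, L_{-1} - 4t (t + 1)\, G_{-1} G_0\bigr) \Bigl|-\frac{3t^2 + 3t + 1}{8t}\Bigr\rangle. \qquad (62)$$

Level 3/2 has two different singular vectors:

$$\Psi^+_{3,1}(t) = \bigl((3t + 1)\, T_{-1/2}^3 + 12t\, T_{-1/2}^2 G_{-1/2} G_0 + 6t (3t + 2)\, L_{-1} T_{-1/2} - 4t (4t + 3)\, G_{-1} T_{-1/2} G_0 + 8t\, L_{-1} G_{-1/2} G_0 + 2t (3t + 1)\, G_{-1} G_{-1/2} + 2t (t - 1)(3t + 1)\, T_{-3/2} + 8t (t^2 - t - 1)\, G_{-3/2} G_0\bigr) \Bigl|-\frac{8t^2 + 5t + 1}{8t}\Bigr\rangle, \qquad (63)$$

$$\Psi^+_{1,3}(t) = \bigl((t + 3)\, T_{-1/2}^3 + 4t\, T_{-1/2}^2 G_{-1/2} G_0 - 2t (t + 3)\, L_{-1} T_{-1/2} - 4t\, G_{-1} T_{-1/2} G_0 - 8t^2\, L_{-1} G_{-1/2} G_0 + 2t (t + 3)\, G_{-1} G_{-1/2} - 4 (t + 3)\, T_{-3/2} - 8t\, G_{-3/2} G_0\bigr) \Bigl|-\frac{5t + 9}{8t}\Bigr\rangle. \qquad (64)$$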
We can use the notation introduced in Def. 5.2 of the previous section in order to identify the given examples of singular vectors in terms of their leading coefficients.

$$\Psi^+_{1,1}(t) = (t + 1,\ 4t)^+_{1/2} \Bigl|-\frac{t + 1}{8t}\Bigr\rangle, \qquad (65)$$

$$\Psi^+_{2,1}(t) = (2t + 1,\ 8t)^+_{1} \Bigl|-\frac{3t^2 + 3t + 1}{8t}\Bigr\rangle, \qquad (66)$$

$$\Psi^+_{3,1}(t) = (3t + 1,\ 12t)^+_{3/2} \Bigl|-\frac{8t^2 + 5t + 1}{8t}\Bigr\rangle, \qquad (67)$$

$$\Psi^+_{1,3}(t) = (t + 3,\ 4t)^+_{3/2} \Bigl|-\frac{5t + 9}{8t}\Bigr\rangle. \qquad (68)$$

We should note that at least at these levels there are no other singular vectors than the ones predicted by the determinant formula. In principle, the determinant formula only proves the existence of the singular vector at lowest level in a Verma module given by Eq. (59). Continuation arguments extend the existence to the cases where the curves of Eq. (59) intersect. The determinant formula does a priori not exclude the existence of so-called isolated singular vectors, which are primitive singular vectors that may appear at higher levels and are always hidden behind a singular vector of type $\Psi^\pm_{r,s}$. As far as we know, such isolated singular vectors have never been found for any chiral algebra.

As these examples are the complete set of singular vectors at levels 1/2, 1, or 3/2 with positive parity, we observe that the singular spaces at level 1/2 and level 1 are one-dimensional. However, at level 3/2 we find that the conformal weights $\Delta_{3,1}(t)$ and $\Delta_{1,3}(t)$ intersect for t = ±1. In these cases the corresponding singular vectors are $\Psi^+_{3,1}(1) = (4, 12)^+_{3/2} |-\frac{7}{4}\rangle$ and $\Psi^+_{1,3}(1) = (4, 4)^+_{3/2} |-\frac{7}{4}\rangle$ for t = 1, and $\Psi^+_{3,1}(-1) = (-2, -12)^+_{3/2} |\frac{1}{2}\rangle$ and $\Psi^+_{1,3}(-1) = (2, -4)^+_{3/2} |\frac{1}{2}\rangle$ for t = −1. Obviously, for both values of t these singular vectors are linearly independent and thus span a two-dimensional singular vector space. Therefore we also find degenerate singular vectors for the twisted N = 2 superconformal algebra as already observed for the three isomorphic N = 2 superconformal algebras [12, 20, 21].

We should stress that in the Virasoro case for a given level $l \in \mathbb{N}_0$ we also have different one-parameter families of singular vectors $\xi_{p,q}(t)$ that intersect for certain values of t. However, for Virasoro Verma modules these singular vectors at such intersection points are always linearly dependent. This follows indirectly from the proof of Feigin and Fuchs [17] and directly from the results of Kent [26] (see also our results in Ref. [16]). The same happens for the Verma modules of the N = 1 superconformal algebras. The degeneracy found for the Neveu–Schwarz N = 2 algebra [12] (and for the topological and Ramond N = 2 algebras [16, 21]) is different in the sense that one single uncharged vector $\Psi^{NS}_{r,s}$ may on its own describe a two-dimensional space under certain conditions (+1 and −1 charged singular vectors intersecting in the same Verma module and the uncharged singular vector becoming a secondary singular vector of them). At these points $t = t_0$, $\Psi^{NS}_{r,s}(t_0)$ vanishes and the two-dimensional tangent space becomes a degenerate singular space. The degeneracy found for the twisted N = 2 algebra is thus of a different type since the singular vectors that intersect are generically primitive, like the vectors $\Psi^+_{3,1}(1)$ and $\Psi^+_{1,3}(1)$, and $\Psi^+_{3,1}(-1)$ and $\Psi^+_{1,3}(-1)$ given above. It is in fact unique of its kind in the sense that it has not been observed for any other chiral algebra so far in this form. In the following section we will conjecture general expressions for the twisted N = 2 singular vectors $\Psi^\pm_{r,s}(t)$ (for the relevant coefficients of the terms corresponding to the ordering kernel) based on explicit examples and also theoretical arguments.
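The level 3/2 degeneracy described above can be checked numerically. The sketch below is illustrative only: it hard-codes the leading coefficients of Eqs. (67) and (68), evaluates Eq. (59) at t = ±1, and tests linear independence of the two coefficient pairs by a 2 × 2 determinant. The names `LEADING` and `determinant` are our own.

```python
from fractions import Fraction


def conformal_weight(r, s, t):
    """Delta_{r,s}(t) from Eq. (59)."""
    t = Fraction(t)
    return -(r * t + s) ** 2 / (8 * t) + t / 8 + Fraction(1, 8)


# Leading coefficients (a, b) with respect to the ordering kernel,
# copied from Eqs. (67) and (68).
LEADING = {
    (3, 1): lambda t: (3 * t + 1, 12 * t),
    (1, 3): lambda t: (t + 3, 4 * t),
}


def determinant(u, v):
    """2x2 determinant; non-zero means (a, b) pairs are independent."""
    return u[0] * v[1] - u[1] * v[0]


if __name__ == "__main__":
    for t in (Fraction(1), Fraction(-1)):
        w31 = conformal_weight(3, 1, t)
        w13 = conformal_weight(1, 3, t)
        u, v = LEADING[(3, 1)](t), LEADING[(1, 3)](t)
        print(t, w31, w13, u, v, determinant(u, v))
```

Since a singular vector is fixed by its coefficients on the ordering kernel, a non-vanishing determinant is enough to conclude that the two vectors span a two-dimensional space.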
From these expressions one deduces that the parametrised vectors $\Psi^\pm_{r,s}(t)$ never vanish identically, unlike the uncharged singular vectors of the Neveu–Schwarz N = 2 algebra at the degenerate points.

7. All Twisted N = 2 Singular Vectors

We have learned in Sect. 2 that every positive parity singular vector $\Psi^+_{r,s}$, corresponding to an operator $(a, b)^+_l$, implies the existence of two singular vectors with negative parity at the same level l, corresponding to the operators ${}^G(a, b)^-_l$ and $(a, b)^-_{G,l}$ (and the other way around for $\Psi^-_{r,s}$). These two negative (or positive) parity singular vectors may or may not be linearly independent. If they were always linearly independent, then whenever a singular vector exists we would find a two-dimensional singular space with the opposite parity. However, as was pointed out, the fact that P(0) = 1 in the determinant formula, given by Eq. (7), points towards the existence of, generically, one singular vector of each parity at the same level in the same Verma module $V_{\Delta_{r,s}}$. Nevertheless, as we will see, even in the case where ${}^G(a, b)^-_l$ and $(a, b)^-_{G,l}$ (or ${}^G(a, b)^+_l$ and $(a, b)^+_{G,l}$) are proportional the maximal singular dimension may still be 2. In order to analyse this case in more detail we will use the expressions given by Eqs. (55)–(58) for ${}^G(a, b)^-_l$, $(a, b)^-_{G,l}$, ${}^G(a, b)^+_l$ and $(a, b)^+_{G,l}$.

Requiring that the two negative parity singular vectors are proportional, and using the explicit expressions for $\Delta_{r,s}(t)$, given by Eqs. (59), leads to two solutions: $\frac{b}{a} = \frac{4rt}{rt + s}$ or $\frac{b}{a} = \frac{4s}{rt + s}$. Note that under the proportionality assumption, the case a = 0 also implies b = 0 and thus $(a, b)^+_l \equiv 0$, which we want to exclude. Therefore, if either of the two
conditions on the ratio $\frac{b}{a}$ is satisfied, then the two negative parity singular vectors derived from a positive parity singular vector would be proportional and thus the corresponding singular dimension may well only be 1. Surprisingly enough, all examples found so far always satisfy the first of these two conditions. These examples include the singular vectors at levels 1/2, 1 and 3/2 given in Sect. 6 as well as singular vectors at levels 2 and 5/2. We believe that there must be a deep reason behind this result, i.e. underlying the fact that all the singular vectors we have computed satisfy the same condition (out of two possible ones) that prevents a positive parity singular vector from imposing the existence of a two-dimensional singular space of negative parity. For this reason we conjecture that the positive parity singular vectors $\Psi^+_{r,s} = (a, b)^+_{rs/2} |\Delta_{r,s}\rangle$ follow the ratio $\frac{b}{a} = \frac{4rt}{rt + s}$. From this we can easily derive the corresponding ratio $\frac{b}{a}$ for the negative parity singular vectors $\Psi^-_{r,s} = (a, b)^-_{rs/2} |\Delta_{r,s}\rangle$. These results allow us to conjecture the following expressions for the twisted N = 2 singular vectors.

Conjecture 7.1. The singular vectors $\Psi^\pm_{r,s}$ in the Verma modules $V_{\Delta_{r,s}(t)}$ are given by:

$$\Psi^+_{r,s}(t) = (rt + s,\ 4rt)^+_{rs/2}\, |\Delta_{r,s}(t)\rangle, \qquad (69)$$

$$\Psi^-_{r,s}(t) = \Bigl(-\frac{2}{r},\ rt + s\Bigr)^-_{rs/2}\, |\Delta_{r,s}(t)\rangle. \qquad (70)$$
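The consistency of Conjecture 7.1 with the rules for multiplying by $G_0$ can be tested with exact arithmetic. The sketch below rests on two assumptions that are ours: Eqs. (55) and (56) are read as $(a - b/2,\ -b(\Delta - c/24) - 2la)$ and $(a,\ b(\Delta - c/24))$ respectively, and all function names are hypothetical. Under this reading both negative parity vectors built from the coefficients of Eq. (69) come out proportional to the pair in Eq. (70).

```python
from fractions import Fraction


def conformal_weight(r, s, t):
    """Delta_{r,s}(t) from Eq. (59)."""
    return -(r * t + s) ** 2 / (8 * t) + t / 8 + Fraction(1, 8)


def central_charge(t):
    """c = 3 + 3t, Eq. (60)."""
    return 3 + 3 * t


def conjectured_plus(r, s, t):
    """Coefficients (a, b) of Eq. (69)."""
    return (r * t + s, 4 * r * t)


def conjectured_minus(r, s, t):
    """Coefficients (a, b) of Eq. (70)."""
    return (Fraction(-2, r), r * t + s)


def g_from_left(a, b, l, shift):
    """Eq. (55) as read here; shift stands for Delta - c/24."""
    return (a - b / 2, -b * shift - 2 * l * a)


def g_from_right(a, b, shift):
    """Eq. (56) as read here; shift stands for Delta - c/24."""
    return (a, b * shift)


def proportional(u, v):
    """True if the pairs u and v are linearly dependent."""
    return u[0] * v[1] - u[1] * v[0] == 0


if __name__ == "__main__":
    r, s, t = 3, 1, Fraction(2)
    shift = conformal_weight(r, s, t) - central_charge(t) / 24
    a, b = conjectured_plus(r, s, t)
    print(g_from_left(a, b, Fraction(r * s, 2), shift))
    print(g_from_right(a, b, shift))
    print(conjectured_minus(r, s, t))
```

For r = 3, s = 1, t = 2 all three printed pairs have the same ratio of second to first component, namely −21/2.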
Also for the negative parity singular vectors $\Psi^-_{r,s}$, the last two equations of Eqs. (55)–(58) both lead to $\Psi^+_{r,s}$ rather than to a pair of different singular vectors. The only exceptional cases occur when $\Psi^\pm_{r,s}$ or $V_\Delta$ become G-closed and consequently some of these vectors vanish. This will be considered in the following sections.

Using the superconformal fusion rules it may be possible to derive explicit expressions for all the coefficients of the singular vectors Eqs. (69)–(70) in the spirit of Bauer, di Francesco, Itzykson, and Zuber [4, 3], as it was possible for the Neveu–Schwarz N = 2 case [10]. This could also lead to a proof of Conjecture 7.1. For this purpose it would be necessary to develop [15] first a twisted superfield formalism following the approach of Ref. [13] for the Neveu–Schwarz algebras.

Based on Conjecture 7.1 it is easy to compute the Verma modules with degenerate singular vectors, i.e. with singular spaces of dimension 2. For a given level $l \in \mathbb{N}_0^{1/2}$ there are normally multiple solutions to the number theoretical problem rs = 2l, $r, s \in \mathbb{N}$, s odd. It happens the first time for level $l = \frac{3}{2}$ that we find multiple solutions r = 1, s = 3 and r = 3, s = 1. In the general case one factorises the integer 2l in its prime factors

$$2l = 2^n \prod_{i=1}^{P} p_i^{n_i}, \qquad (71)$$

with $p_i > 2$ distinct primes and $n \in \mathbb{N}$, $n_i \in \mathbb{N}_0$. The solutions to rs = 2l, $r, s \in \mathbb{N}$, s odd are then given by

$$r_\pi = 2^n \prod_{i=1}^{P} p_i^{k_i}, \qquad s_\pi = \prod_{i=1}^{P} p_i^{n_i - k_i}, \qquad (72)$$
for each P-tuple $\pi = (k_1, k_2, \dots, k_P) \in \mathbb{N}_0^P$, with $k_i \le n_i$, $\forall i = 1, \dots, P$. For example, at level l = 630 we find the prime factorisation $2l = 2^2 \times 3^2 \times 5 \times 7$. Hence, we find 12 3-tuples $(k_1, k_2, k_3) \in \mathbb{N}_0^3$ with $k_1 \le 2$ and $k_2, k_3 \le 1$. We should note that for each level $l \in \mathbb{N}_{1/2}$ there exists at least one solution: r = 2l, s = 1. The number of solutions to rs = 2l, $r, s \in \mathbb{N}$, s odd is thus determined by the number of P-tuples $\pi = (k_1, \dots, k_P)$ with $\pi \in \mathbb{N}_0^P$ and $\pi \le (n_1, \dots, n_P)$ for the prime factorisation of 2l. Hence, there are$^{11}$ $\prod_{i=1}^{P} (n_i + 1)$ solutions to rs = 2l, $r, s \in \mathbb{N}$, s odd.

Due to our Conjecture 7.1 we have generically only one-dimensional singular spaces defined by the singular vectors $\Psi^+_{r,s}$ (or $\Psi^-_{r,s}$) at level $l = \frac{rs}{2}$ for all these solutions of r and s. That is, these singular vectors with the same parity and level are located in different Verma modules. The most interesting Verma modules are those where some of these solutions intersect and may hence lead to two-dimensional singular spaces, provided the corresponding singular vectors are not proportional. Using Conjecture 7.1 we shall now investigate such cases of degeneration. Let us note again that the singular dimension can not be bigger than 2 even though a priori we would expect that even more than 2 solutions $\pi_i$ could intersect.

Let us analyse the intersections of the conformal weights $\Delta_{r,s}(t)$ of the solutions to rs = 2l, $r, s \in \mathbb{N}$, s odd. For fixed $l \in \mathbb{N}_{1/2}$ we find the prime factorisation as given above. Let us now take two P-tuples $\pi^1$ and $\pi^2$ ($\pi^1 \neq \pi^2$) and let us assume that the corresponding conformal weights intersect: $\Delta_{r_{\pi^1}, s_{\pi^1}} = \Delta_{r_{\pi^2}, s_{\pi^2}}$. Applying Eq. (59) one gets straightforwardly that this is the case if and only if

$$\Bigl(2^n \prod_{i=1}^{P} p_i^{k_i^1}\, t + \prod_{i=1}^{P} p_i^{n_i - k_i^1}\Bigr)^2 = \Bigl(2^n \prod_{i=1}^{P} p_i^{k_i^2}\, t + \prod_{i=1}^{P} p_i^{n_i - k_i^2}\Bigr)^2. \qquad (73)$$
This easily rearranges to two sign-symmetric solutions for the parameter t of the central extension c:

$$t = \pm \frac{1}{2^n} \prod_{i=1}^{P} p_i^{n_i - k_i^1 - k_i^2}. \qquad (74)$$
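The counting of solutions and the intersection points of Eq. (74) can be reproduced by brute force. The sketch below is illustrative only and its function names are our own. It uses the observation that, since $r_{\pi^1} r_{\pi^2} = 2^{2n} \prod_i p_i^{k_i^1 + k_i^2}$, Eq. (74) can equivalently be written as $t = \pm 2l / (r_1 r_2)$, which avoids an explicit prime factorisation.

```python
from fractions import Fraction
from itertools import combinations


def conformal_weight(r, s, t):
    """Delta_{r,s}(t) from Eq. (59)."""
    t = Fraction(t)
    return -(r * t + s) ** 2 / (8 * t) + t / 8 + Fraction(1, 8)


def solutions(two_l):
    """All pairs (r, s) of positive integers with r*s == 2l and s odd."""
    return [(two_l // s, s) for s in range(1, two_l + 1, 2) if two_l % s == 0]


def intersection_points(pair1, pair2):
    """Eq. (74) rewritten as t = +-2l/(r1*r2) for two solutions of rs = 2l."""
    (r1, s1), (r2, _) = pair1, pair2
    t = Fraction(r1 * s1, r1 * r2)
    return t, -t


if __name__ == "__main__":
    two_l = 15  # level l = 15/2, chosen only as a small illustration
    pairs = solutions(two_l)
    print(len(pairs), pairs)
    for p1, p2 in combinations(pairs, 2):
        for t in intersection_points(p1, p2):
            print(p1, p2, t, conformal_weight(*p1, t) == conformal_weight(*p2, t))
```

For 2l = 15 = 3 × 5 the script lists (1 + 1)(1 + 1) = 4 solutions, in agreement with the product formula for the number of solutions.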
Again, let us consider the example of level l = 630 and consider the two 3-tuples $\pi^1 = (1, 1, 1)$ and $\pi^2 = (2, 0, 1)$ for the order (3, 5, 7) of the relevant prime factors. Equation (74) tells us that the conformal weights $\Delta_{r_{\pi^1}, s_{\pi^1}}$ and $\Delta_{r_{\pi^2}, s_{\pi^2}}$ have exactly 2 intersection points, namely $t = \pm \frac{1}{2^2 \times 3 \times 7} = \pm \frac{1}{84}$, where the two singular vectors $\Psi^+_{r_{\pi^1}, s_{\pi^1}}$ and $\Psi^+_{r_{\pi^2}, s_{\pi^2}}$ both exist. The key question will now be whether these singular vectors are actually different and hence define a two-dimensional singular space, or if they are proportional and thus lead to a one-dimensional singular space instead.

For this purpose, let us again take two different P-tuples $\pi^1$ and $\pi^2$. The corresponding conformal weights intersect for $t = \pm \frac{1}{2^n} \prod_{i=1}^{P} p_i^{n_i - k_i^1 - k_i^2}$. Using Conjecture 7.1 we obtain the corresponding singular vectors $\Psi^+_{r_{\pi^j}, s_{\pi^j}}$, j = 1, 2 as:

$$\Psi^+_{r_{\pi^j}, s_{\pi^j}} = \bigl(r_{\pi^j} t + s_{\pi^j},\ 4 r_{\pi^j} t\bigr)^+_l\, |\Delta_{r_{\pi^j}, s_{\pi^j}}\rangle, \qquad j = 1, 2. \qquad (75)$$

11 We define the empty product as $\prod_{i=1}^{0} = 1$.
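The level l = 630 example can be worked through explicitly. The following sketch, with hypothetical helper names of our own, builds $(r_\pi, s_\pi)$ from Eq. (72), the positive intersection point from Eq. (74), and the leading coefficients from Eq. (75), and then compares the two coefficient pairs through a 2 × 2 determinant.

```python
from fractions import Fraction
from math import prod

# Prime data for 2l = 1260 = 2^2 * 3^2 * 5 * 7 (level l = 630).
N = 2
PRIMES = (3, 5, 7)
EXPONENTS = (2, 1, 1)


def r_s(pi):
    """(r_pi, s_pi) from Eq. (72) for a tuple pi = (k_1, k_2, k_3)."""
    r = 2 ** N * prod(p ** k for p, k in zip(PRIMES, pi))
    s = prod(p ** (n - k) for p, n, k in zip(PRIMES, EXPONENTS, pi))
    return r, s


def intersection_t(pi1, pi2):
    """Positive solution of Eq. (74); the second solution is its negative."""
    factors = (Fraction(p) ** (n - k1 - k2)
               for p, n, k1, k2 in zip(PRIMES, EXPONENTS, pi1, pi2))
    return Fraction(1, 2 ** N) * prod(factors)


def leading_coefficients(pi, t):
    """(a, b) = (r t + s, 4 r t) as in Eq. (75), based on Conjecture 7.1."""
    r, s = r_s(pi)
    return (r * t + s, 4 * r * t)


def determinant(u, v):
    return u[0] * v[1] - u[1] * v[0]


if __name__ == "__main__":
    pi1, pi2 = (1, 1, 1), (2, 0, 1)
    t0 = intersection_t(pi1, pi2)
    for t in (t0, -t0):
        u = leading_coefficients(pi1, t)
        v = leading_coefficients(pi2, t)
        print(t, r_s(pi1), r_s(pi2), u, v, determinant(u, v))
```

The determinant is non-zero at both intersection points, so under Conjecture 7.1 the two vectors in this example are linearly independent.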
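Inserting the expressions Eqs. (72) for $r_{\pi^j}$ and $s_{\pi^j}$ and replacing t by Eq. (74) we hence obtain

$$\Psi^+_{r_{\pi^j}, s_{\pi^j}} = \Bigl(\prod_{i=1}^{P} p_i^{n_i - k_i^{j}} \pm \prod_{i=1}^{P} p_i^{n_i - k_i^{\bar{j}}},\ \pm 4 \prod_{i=1}^{P} p_i^{n_i - k_i^{\bar{j}}}\Bigr)^+_l\, |\Delta_{r_{\pi^j}, s_{\pi^j}}\rangle, \qquad j = 1, 2, \qquad (76)$$

with $\bar{1} = 2$ and $\bar{2} = 1$. The determinant of the coefficients shows that $\Psi^+_{r_{\pi^1}, s_{\pi^1}}$ and $\Psi^+_{r_{\pi^2}, s_{\pi^2}}$ are linearly dependent if and only if

$$\pm 4 \Bigl(\prod_{i=1}^{P} p_i^{n_i - k_i^1}\Bigr)^2 \mp 4 \Bigl(\prod_{i=1}^{P} p_i^{n_i - k_i^2}\Bigr)^2 = 0, \qquad (77)$$

$$\Leftrightarrow \quad \prod_{i=1}^{P} p_i^{n_i - k_i^1} = \prod_{i=1}^{P} p_i^{n_i - k_i^2}, \qquad (78)$$

$$\Leftrightarrow \quad \pi^1 = \pi^2, \qquad (79)$$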
in contradiction with our original assumption. Hence, the vectors $\Psi^+_{r_{\pi^1},s_{\pi^1}}$ and $\Psi^+_{r_{\pi^2},s_{\pi^2}}$ are linearly independent for $\pi^1 \neq \pi^2$. Similar arguments also show that $\Psi^-_{r_{\pi^1},s_{\pi^1}}$ and $\Psi^-_{r_{\pi^2},s_{\pi^2}}$ are linearly independent for $\pi^1 \neq \pi^2$ (at exactly the same values for $t$).
An important observation now is the following. On the one hand, we have proven that the maximal singular dimension is 2 and, on the other, we have shown that for different $P$-tuples $\pi_1$ and $\pi_2$ the corresponding singular vectors are always linearly independent at the intersection points (i.e. in the same Verma module). Therefore, since for high levels $l$ there are many different $P$-tuples, the question arises whether or not it is possible that the conformal weights of more than two $P$-tuples can intersect at the same value $t$, obtaining a contradiction to either of our results. That this is not the case is shown in the following consideration. The conformal weights of $\pi_1$ and $\pi_2$ ($\pi_1 \neq \pi_2$) intersect at $t = \pm\frac{1}{2^n}\prod_{i=1}^{P} p_i^{n_i-k_i^1-k_i^2}$. In the same way the conformal weights of $\pi_2$ and $\pi_3$ ($\pi_2 \neq \pi_3$) intersect at $\bar{t} = \pm\frac{1}{2^n}\prod_{i=1}^{P} p_i^{n_i-k_i^2-k_i^3}$. Assuming $t = \bar{t}$ is equivalent to
\[ \prod_{i=1}^{P} p_i^{n_i-k_i^1-k_i^2} = \prod_{i=1}^{P} p_i^{n_i-k_i^2-k_i^3}, \tag{80} \]
and thus $\prod_{i=1}^{P} p_i^{k_i^1} = \prod_{i=1}^{P} p_i^{k_i^3}$, which results in $\pi_1 = \pi_3$ due to the fact that the numbers $p_i$ are distinct primes. Therefore, there are no values of $t$ for which more than two conformal weights $\Delta_{r_\pi,s_\pi}$ intersect for different $P$-tuples $\pi$. We should stress again that these results are based on Conjecture 7.1, which seems to be a very reliable conjecture. We summarise our results in the following theorem.

Theorem 7.2. Let us fix a level $l \in \mathbb{N}_{1/2}$. Based on Conjecture 7.1 one finds that at the level $l = \frac{rs}{2}$ the singular vectors $\Psi^\pm_{r,s}$ define generically one-dimensional singular spaces. The only exceptions, where degenerate singular spaces of dimension 2 arise, occur necessarily for the cases when two conformal weights $\Delta_{r_1,s_1}(t)$ and $\Delta_{r_2,s_2}(t)$ intersect. For each two different pairs $(r_1, s_1)$ and $(r_2, s_2)$ with $l = \frac{r_1 s_1}{2} = \frac{r_2 s_2}{2}$, $r_1, r_2, s_1, s_2 \in \mathbb{N}$, $s_1, s_2$ odd, there are two values of $t$ where such an intersection happens. These values are $t = \pm\frac{1}{2^n}\prod_{i=1}^{P} p_i^{n_i-k_i^1-k_i^2}$ and they are real and rational in all cases. However, the conformal weights of three such pairs do not intersect for any common values of $t$.

8. G-Closed Verma Modules
We will now consider G-closed Verma modules $V^G_{c/24}$ and discuss the relation between the singular vectors in these (incomplete) Verma modules and the singular vectors in complete Verma modules $V_\Delta$ with $\Delta = \frac{c}{24}$. For this value the Verma module $V_{c/24}$ contains a singular vector $\Psi^-_0 = G_0 \left|\frac{c}{24}\right\rangle$ at level 0 with parity $-$. In order to obtain irreducible representations one considers the quotient space
\[ V^G_{c/24} = \frac{V_{c/24}}{U(J_2)\,\Psi^-_0}, \tag{81} \]
which is precisely the G-closed Verma module $V^G_{c/24}$. In fact, this quotient module is only one among the quotient modules. We could equally well consider quotients with respect to modules generated by singular vectors $\Psi^\pm_{r,s}$ at levels $\frac{rs}{2}$. The reason why we are especially interested in $V^G_{c/24}$ is twofold. First, for physical applications it could be relevant if a highest weight vector (or a singular vector) is G-closed, which makes $V^G_{c/24}$ special compared to other quotient spaces. Secondly, the Verma module $V_{c/24}$ is easier to construct by explicit calculations than most Verma modules with singular vectors. Nevertheless, we aim also at obtaining information on the structure of quotient modules that is valid not only for $V^G_{c/24}$ but also for quotient modules of similar type.
Via the canonical map defined for quotient spaces, singular vectors from $V_{c/24}$ are either also singular in $V^G_{c/24}$ or they are trivial (they "go away"). The latter happens if and only if the singular vector in question is a descendant vector of the singular vector $\Psi^-_0$ in $V_{c/24}$. Thus, singular vectors of $V^G_{c/24}$ can either be inherited from $V_{c/24}$ via the canonical quotient map or they appear for the first time in the quotient $V^G_{c/24}$ as vectors that were not singular in $V_{c/24}$. In the latter case these singular vectors of $V^G_{c/24}$ are called subsingular vectors in $V_{c/24}$. The Virasoro algebra does not allow any subsingular vectors, which is a consequence of the proof of Feigin and Fuchs [17], and neither does the N = 1 Neveu–Schwarz algebra [2]. Surprisingly enough, the N = 2 superconformal algebras contain subsingular vectors, which were first discovered in Refs. [20, 21] for the Neveu–Schwarz, the Ramond, and the topological N = 2 algebras. So far, nothing was known about the existence of subsingular vectors in the twisted N = 2 case. Due to the very different structure of this particular N = 2 algebra it was hard to say what one should expect a priori. However, we will show in this section that the twisted N = 2 superconformal algebra also contains subsingular vectors.
In Sect. 5 we have found that the singular dimensions for G-closed Verma modules are just 1. Therefore, there are no degenerate singular vectors in G-closed Verma modules. A singular vector in $V^G_{c/24}$ at level $l$ can therefore be identified by the coefficient of only one particular term (being zero or non-zero). The ordering kernel in Table 3 tells us that
this particular term is $T^{2l}_{-1/2}$ for the positive parity case and $T^{2l-1}_{-1/2} G_{-1/2}$ for the negative parity one. We can thus give the following theorem.

Theorem 8.1. Two singular vectors in the G-closed Verma module $V^G_{c/24}$ at the same level and with the same parity are always proportional. If a vector satisfies the highest weight conditions at level $l$, with parity $+$ or parity $-$, and its coefficient for the term $T^{2l}_{-1/2}$ or $T^{2l-1}_{-1/2} G_{-1/2}$ vanishes, respectively, then this vector is trivial.
Before looking at explicit examples of singular vectors in $V^G_{c/24}$ we should first analyse whether or not the existing singular vectors $\Psi^\pm_{r,s}$ in $V_{c/24}$ are descendants of $\Psi^-_0$. Requiring that $\Delta_{r,s}(t^G) = \frac{c}{24}$ in Eq. (59) one identifies easily the values $t^G$ of $t$ for which $\Psi^\pm_{r,s}$ exists in $V_{c/24}$ at level $\frac{rs}{2}$:
\[ t^G_{r,s} = -\frac{s}{r}, \tag{82} \]
for $r, s \in \mathbb{N}$, $s$ odd. We observe that if $\Psi^\pm_{r,s} \in V_{c/24}$, which happens only for the values $t^G_{r,s}$, then one finds also the singular vectors $\Psi^\pm_{nr,ns} \in V_{c/24}$ at levels $\frac{n^2 rs}{2}$ for any positive integer $n \in \mathbb{N}$. Conjecture 7.1 gives us the explicit form of the vectors $\Psi^\pm_{r,s}$. From $\Psi^+_{r,s} = (rt + s, 4rt)^+_{rs/2} \left|\Delta_{r,s}\right\rangle$ one obtains for the values of $t^G_{r,s}$ that $\Psi^+_{r,s}(t^G_{r,s}) = (0, -4s)^+_{rs/2} \left|\frac{r-s}{8r}\right\rangle$, and similarly $\Psi^-_{r,s} = (-\frac{2}{r}, rt + s)^-_{rs/2} \left|\Delta_{r,s}\right\rangle$ leads to $\Psi^-_{r,s}(t^G_{r,s}) = (-\frac{2}{r}, 0)^-_{rs/2} \left|\frac{r-s}{8r}\right\rangle$. As the coefficient of $T^{rs}_{-1/2}$ for the positive parity case and the coefficient of $T^{rs-1}_{-1/2} G_{-1/2}$ in the negative parity case vanish, due to Theorem 8.1 we hence find that the singular vectors $\Psi^\pm_{r,s}(t^G_{r,s})$ vanish in the quotient space $V^G_{c/24}$. Therefore the singular vectors $\Psi^\pm_{r,s}(t^G_{r,s})$ are descendant vectors of $\Psi^-_0(t^G_{r,s})$. Consequently$^{12}$ all the singular vectors that can be found in $V^G_{c/24}$ are subsingular in $V_{c/24}$.
By explicit computation one easily finds that singular vectors $\Upsilon^\pm_{r,s}$ do appear in $V^G_{\frac{r-s}{8r}}$ at exactly the levels $\frac{rs}{2}$ where the singular vectors $\Psi^\pm_{r,s}$ disappear in the quotient kernel. (The reasons for this will be shown in a future publication [15].) In addition these are the only singular vectors appearing in $V^G_{c/24}$. The vectors $\Upsilon^\pm_{r,s}$ are hence subsingular vectors in the complete Verma modules $V_{\frac{r-s}{8r}}$. This has been verified for levels $\frac{1}{2}$, 1, and $\frac{3}{2}$. The explicit results are given in Appendix B.
We conclude this section with another interesting fact about the G-closed Verma modules. As we know, the singular dimension for G-closed Verma modules is only 1. One may wonder what happens in the cases where $V_{c/24}$ has a two-dimensional singular space. First of all, we have just shown that if this happens then, due to the nature of the singular vectors $\Psi^\pm_{r,s}$, the whole two-dimensional singular space would vanish in the quotient space $V^G_{c/24}$ as it is descendant of $\Psi^-_0$. Nevertheless, one could argue that for these cases the corresponding subsingular vectors that appear in a one to one correspondence at the same levels as the singular vectors that disappear, may span a two-dimensional subsingular space. However, Verma modules $V_\Delta$ that contain degenerate singular vectors never satisfy $\Delta = \frac{c}{24}$ and therefore this question is redundant. Namely, if we use in $t = -\frac{s}{r}$ the expressions of Eqs. (72) for $r$ and $s$ and require that $t = \pm\frac{1}{2^n}\prod_{i=1}^{P} p_i^{n_i-k_i^1-k_i^2}$ in the case of degeneracy for two $P$-tuples $\pi^1$ and $\pi^2$, one easily finds
\[ \pm\frac{1}{2^n}\prod_{i=1}^{P} p_i^{n_i-k_i^1-k_i^2} = -\frac{1}{2^n}\prod_{i=1}^{P} p_i^{n_i-2k_i^1}. \tag{83} \]
12 Strictly speaking, we have to assume that there are no isolated singular vectors for the twisted N = 2 superconformal algebra. We have not been able to compute any isolated singular vectors, nor have they ever appeared for any other chiral algebra.
Taking into account that the $p_i$ are distinct primes, the only solution to Eq. (83) is, however, $\pi^1 = \pi^2$, which shows that the modules $V_\Delta$ with $\Delta = \frac{c}{24}$ never have two-dimensional singular spaces.

9. G-Closed Singular Vectors
In this last section we will analyse G-closed singular vectors in a very similar way as we have analysed G-closed Verma modules in the previous section. Due to the constraints on the conformal weights, there are no G-closed singular vectors in G-closed Verma modules, as mentioned already earlier. Therefore, we take $\Psi^\pm_{r,s}(t) \in V_{\Delta_{r,s}(t)}$ and impose the conformal weight constraint $\Delta_{r,s} + \frac{rs}{2} = \frac{c}{24}$ which is satisfied by G-closed singular vectors. The condition $\Delta_{r,s} + \frac{rs}{2} = \frac{c}{24}$ has the unique solution
\[ {}^G t_{r,s} = \frac{s}{r}. \tag{84} \]
Let us note the similarity between ${}^G t_{r,s}$ and $t^G_{r,s}$ of the previous section. According to the result that the singular dimension for G-closed singular vectors is 1, the ordering kernel being given in Table 2, a G-closed singular vector $\Psi^{\pm G}_l$ at level $l = \frac{c}{24} - \Delta$ can be identified by the coefficient of $T^{2l}_{-1/2}$ or $T^{2l}_{-1/2} G_0$ (being zero or non-zero), depending on the parity. We therefore give the following theorem.

Theorem 9.1. Two G-closed singular vectors in the same Verma module $V_\Delta$, at the same level $l = \frac{c}{24} - \Delta$ and with the same parity are always proportional. If a G-closed vector satisfies the highest weight conditions at level $l = \frac{c}{24} - \Delta$ and parity $+$, or parity $-$, and its coefficient for the term $T^{2l}_{-1/2}$ or $T^{2l}_{-1/2} G_0$ vanishes, respectively, then this vector is trivial.

Using the expressions for $G_0 (a, b)^\mp_l$ given by Eqs. (55)–(58), we obtain for the vector $G_0 \Psi^+_{r,s} = G_0 (rt + s, 4rt)^+_{rs/2} \left|\Delta_{r,s}\right\rangle$, in the case $t = {}^G t_{r,s}$, the result $G_0 \Psi^+_{r,s} = (0, 0)^-_{rs/2} \left|\Delta_{r,s}\right\rangle$. Since this vector satisfies the highest weight conditions but has vanishing coefficient for the term $T^{rs}_{-1/2} G_0$, we find that $G_0 \Psi^+_{r,s} \equiv 0$. Thus the singular vectors $\Psi^+_{r,s}$ are G-closed for $t = {}^G t_{r,s} = \frac{s}{r}$. Similarly, we obtain for $G_0 \Psi^-_{r,s} = G_0 (-\frac{2}{r}, rt + s)^-_{rs/2} \left|\Delta_{r,s}\right\rangle$, in the case $t = {}^G t_{r,s}$, $G_0 \Psi^-_{r,s} = (0, 0)^+_{rs/2} \left|\Delta_{r,s}\right\rangle$ and hence the singular vectors $\Psi^-_{r,s}$ are also G-closed for $t = {}^G t_{r,s} = \frac{s}{r}$. It is a surprising fact that for $t = {}^G t_{r,s}$ all the singular vectors become G-closed (this is not required by first principles).
Theorem 9.2. The singular vectors $\Psi^\pm_{r,s} \in V_{\Delta_{r,s}}$ become G-closed exactly for the values of the central parameter $t = {}^G t_{r,s} = \frac{s}{r}$, $r, s \in \mathbb{N}$, $s$ odd.

Like in the previous section, let us finally check that the cases where $\Psi^\pm_{r,s} \in V_{\Delta_{r,s}}$ become G-closed cannot be degenerate cases, as suggested by the results of table 4.3. We set $t = \frac{s}{r}$ and use again the expressions of Eqs. (72) for $r$ and $s$. Requiring that $t = \pm\frac{1}{2^n}\prod_{i=1}^{P} p_i^{n_i-k_i^1-k_i^2}$ in the case of degeneracy for two different $P$-tuples $\pi^1$ and $\pi^2$ we obtain
\[ \pm\frac{1}{2^n}\prod_{i=1}^{P} p_i^{n_i-k_i^1-k_i^2} = +\frac{1}{2^n}\prod_{i=1}^{P} p_i^{n_i-2k_i^1}. \tag{85} \]
Again, the only solution to Eq. (85) is $\pi^1 = \pi^2$, in contradiction with the assumption that $\pi^1$ and $\pi^2$ are distinct. This shows that the degenerate cases do not correspond to G-closed singular vectors, i.e. two-dimensional singular spaces are never G-closed.

10. Conclusions and Prospects
In this paper we have applied the method of adapted orderings to the twisted N = 2 superconformal algebra. This shows, again, that this method is a powerful tool that can easily be applied to large classes of algebras. The way it works is rather simple and it is very flexible in its use. Whereas the application of this method to the topological, Neveu–Schwarz and Ramond N = 2 algebras in Ref. [16] assigned a special rôle to the Virasoro generator $L_{-1}$, for the twisted N = 2 algebra this special rôle has been taken over by the U(1) current generator $T_{-1/2}$. Another remarkable difference compared to our earlier applications of this method is that in the twisted N = 2 case the fermionic field modes need to be mixed in the adapted ordering rather than being kept separate. As a result we have derived the singular dimensions and the ordering kernels for the twisted N = 2 algebra. This reveals that the twisted N = 2 algebra also contains degenerate singular spaces with dimension 2 in complete Verma modules, as is the case for the three isomorphic N = 2 algebras. On the other hand, G-closed Verma modules have singular dimension 1 and therefore do not allow any degenerate singular spaces.
Based on the ordering kernel coefficients we have conjectured general expressions for the relevant terms of all (primitive) singular vectors. These expressions lead to the identification of all degenerate cases with singular dimension 2 and also to the identification of all G-closed singular vectors. These expressions also lead to the discovery of subsingular vectors for the twisted N = 2 superconformal algebra.
As in many examples for the other N = 2 superconformal algebras, these subsingular vectors appear in relation to additional vanishing conditions on a singular vector. This relation can be generally proven and will be given in a forthcoming paper [15]. We have seen that ordering kernel coefficients play a crucial rôle in this matter. We have also computed multiplication rules for singular vector operators, which sets the foundation for the study of the twisted N = 2 embedding diagrams which will be investigated in a future publication. The twisted N = 2 superconformal algebra has hence proven to have a rather rich structure that contains many features that were also discovered for the three isomorphic N = 2 superconformal algebras. Nevertheless, the way these features appear in the case of the twisted N = 2 algebra is quite different from the way they appear for
the other N = 2 algebras. For example, the degenerate singular spaces are produced simply by the intersection of two primitive singular vectors. For no other N = 2 algebra primitive singular vectors at the same level and with the same charge are linearly independent. Furthermore, the discovery of degenerate singular vectors in the twisted N = 2 case is rather surprising as the underlying parameter space is just one-dimensional. The twisted N = 2 superconformal algebra hence turns out to be quite distinct from all other superconformal algebras analysed so far in the literature. This raises the need for the analysis of twisted superconformal algebras for higher N – a very interesting question already because the periodicity conditions of more than 2 fermionic fields can be mixed in different ways. The concept of adapted orderings should apply also in these cases.
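A. Examples of Negative Parity Singular Vectors
For completeness we shall also give all singular vectors with negative parity at level $\frac{1}{2}$, 1, and $\frac{3}{2}$. They were found by explicit computer calculations [29].

Level $\frac{1}{2}$:
\[ \Psi^-_{1,1}(t) = \left\{ -2T_{-1/2} G_0 + (t + 1) G_{-1/2} \right\} \left|-\frac{t+1}{8t}\right\rangle. \tag{86} \]
Level 1:
\[ \Psi^-_{2,1}(t) = \left\{ -T^2_{-1/2} G_0 + (2t + 1) T_{-1/2} G_{-1/2} - 2t L_{-1} G_0 - \frac{t}{2}(2t + 3) G_{-1} \right\} \left|-\frac{3t^2+3t+1}{8t}\right\rangle. \tag{87} \]
Level $\frac{3}{2}$ has two different singular vectors:
\[ \Psi^-_{3,1}(t) = \Big\{ -\frac{2}{3} T^3_{-1/2} G_0 + (3t + 1) T^2_{-1/2} G_{-1/2} - 4t L_{-1} T_{-1/2} G_0 - \frac{1}{3}(12t^2 + 13t + 3) G_{-1} T_{-1/2} + \frac{2t}{3}(3t + 1) L_{-1} G_{-1/2} - \frac{4t}{3} G_{-1} G_{-1/2} G_0 - \frac{4t}{3}(t - 1) T_{-3/2} G_0 + \Big(2t^3 - \frac{4}{3}t^2 - \frac{8}{3}t - \frac{2}{3}\Big) G_{-3/2} \Big\} \left|-\frac{8t^2+5t+1}{8t}\right\rangle, \tag{88} \]
\[ \Psi^-_{1,3}(t) = \Big\{ -2T^3_{-1/2} G_0 + (t + 3) T^2_{-1/2} G_{-1/2} + 4t L_{-1} T_{-1/2} G_0 - (t + 3) G_{-1} T_{-1/2} - 2t(t + 3) L_{-1} G_{-1/2} - 4t G_{-1} G_{-1/2} G_0 + 8 T_{-3/2} G_0 - 2(t + 3) G_{-3/2} \Big\} \left|-\frac{5t+9}{8t}\right\rangle. \tag{89} \]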
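In the notation of Def. 5.2 we obtain:
\[ \Psi^-_{1,1}(t) = (-2,\, t + 1)^-_{1/2} \left|-\frac{t+1}{8t}\right\rangle, \tag{90} \]
\[ \Psi^-_{2,1}(t) = (-1,\, 2t + 1)^-_{1} \left|-\frac{3t^2+3t+1}{8t}\right\rangle, \tag{91} \]
\[ \Psi^-_{3,1}(t) = \Big(-\frac{2}{3},\, 3t + 1\Big)^-_{3/2} \left|-\frac{8t^2+5t+1}{8t}\right\rangle, \tag{92} \]
\[ \Psi^-_{1,3}(t) = (-2,\, t + 3)^-_{3/2} \left|-\frac{5t+9}{8t}\right\rangle. \tag{93} \]
For $t = \pm 1$ we observe that $\Psi^-_{3,1}$ and $\Psi^-_{1,3}$ span a two-dimensional singular vector space, as it was the case for their positive parity partners.
\[ \Psi^-_{3,1}(1) = \Big(-\frac{2}{3},\, 4\Big)^-_{3/2} \left|-\frac{7}{4}\right\rangle, \tag{94} \]
\[ \Psi^-_{1,3}(1) = (-2,\, 4)^-_{3/2} \left|-\frac{7}{4}\right\rangle, \tag{95} \]
\[ \Psi^-_{3,1}(-1) = \Big(-\frac{2}{3},\, -2\Big)^-_{3/2} \left|\frac{1}{2}\right\rangle, \tag{96} \]
\[ \Psi^-_{1,3}(-1) = (-2,\, 2)^-_{3/2} \left|\frac{1}{2}\right\rangle. \tag{97} \]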
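The conformal weights labelling the kets above can be cross-checked with exact rational arithmetic. The Python sketch below is only an illustration: the closed form used in `delta` is an assumption inferred from the four kets of Eqs. (90)–(93) together with $\frac{c}{24} = \frac{r-s}{8r}$ at $t = -\frac{s}{r}$; it is not a quotation of Eq. (59), which is not restated here.

```python
from fractions import Fraction as F


def delta(r, s, t):
    """ASSUMED closed form, inferred from Eqs. (90)-(93):
    Delta_{r,s}(t) = (1 + t)/8 - (r t + s)^2 / (8 t)."""
    return (1 + t) / 8 - (r * t + s) ** 2 / (8 * t)


# Conformal weights as they appear in the kets of Eqs. (90)-(93).
printed = {
    (1, 1): lambda t: -(t + 1) / (8 * t),
    (2, 1): lambda t: -(3 * t**2 + 3 * t + 1) / (8 * t),
    (3, 1): lambda t: -(8 * t**2 + 5 * t + 1) / (8 * t),
    (1, 3): lambda t: -(5 * t + 9) / (8 * t),
}

sample_points = [F(1, 2), F(-2), F(3, 7)]

# The assumed closed form reproduces every printed weight at the sample points.
closed_form_matches = all(
    delta(r, s, t) == weight(t)
    for (r, s), weight in printed.items()
    for t in sample_points
)

# Degenerate points t = +1 and t = -1 for the pairs (3,1) and (1,3).
weights_at_plus_one = (printed[(3, 1)](F(1)), printed[(1, 3)](F(1)))
weights_at_minus_one = (printed[(3, 1)](F(-1)), printed[(1, 3)](F(-1)))

# At t = -s/r the printed weight should equal (r - s)/(8 r).
g_closed_weights = {rs: printed[rs](F(-rs[1], rs[0])) for rs in printed}

print(closed_form_matches, weights_at_plus_one, weights_at_minus_one, g_closed_weights)
```

The two weights coincide at $t = \pm 1$, as required for the vectors of Eqs. (94)–(97) to lie in the same Verma module.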
B. Examples of Singular Vectors in G-Closed Verma Modules
As discussed in Sect. 8, the singular vectors $\Psi^\pm_{r,s} \in V_{\Delta_{r,s}}$, at levels $\frac{rs}{2}$, are always descendants of the level zero singular vector $\Psi^-_0 = G_0 \left|\frac{c}{24}\right\rangle$ in the cases where $\Delta_{r,s}(t) = \frac{c}{24}$. This happens for the values $t = t^G_{r,s} = -\frac{s}{r}$, so that $\frac{c}{24} = \frac{r-s}{8r}$. Therefore, the singular vectors $\Psi^\pm_{r,s}$ are absent in the G-closed Verma modules $V^G_{c/24} = \frac{V_{c/24}}{G_0 \left|\frac{c}{24}\right\rangle}$. All singular vectors which we find in $V^G_{c/24}$ are therefore subsingular vectors in $V_{c/24}$. Surprisingly enough, the only singular vectors we have found in $V^G_{c/24}$ are exactly at the levels and for the modules where we would have expected $\Psi^\pm_{r,s}$ to appear, if they were not trivial in the quotient module. That is, they are located at levels $\frac{rs}{2}$ in the Verma modules with $t = -\frac{s}{r}$. In the following we will give all the singular vectors in $V^G_{c/24}$ for levels $\frac{1}{2}$, 1, and $\frac{3}{2}$. These vectors are therefore subsingular vectors in $V_{c/24}$.

Level $\frac{1}{2}$ has singular vectors only for $t = -1$:
\[ \Upsilon^+_{1,1}(-1) = T_{-1/2} \left|0\right\rangle^G, \tag{98} \]
\[ \Upsilon^-_{1,1}(-1) = G_{-1/2} \left|0\right\rangle^G. \tag{99} \]
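Level 1 has singular vectors only for $t = -\frac{1}{2}$:
\[ \Upsilon^+_{2,1}\Big(-\frac{1}{2}\Big) = \left( T^2_{-1/2} - L_{-1} \right) \left|\frac{1}{16}\right\rangle^G, \tag{100} \]
\[ \Upsilon^-_{2,1}\Big(-\frac{1}{2}\Big) = \left( 4T_{-1/2} G_{-1/2} - G_{-1} \right) \left|\frac{1}{16}\right\rangle^G. \tag{101} \]
Level $\frac{3}{2}$ has singular vectors only for $t = -\frac{1}{3}$ and for $t = -3$:
\[ \Upsilon^+_{3,1}\Big(-\frac{1}{3}\Big) = \left( 9T^3_{-1/2} - 18L_{-1} T_{-1/2} - 6G_{-1} G_{-1/2} + 8T_{-3/2} \right) \left|\frac{1}{12}\right\rangle^G, \tag{102} \]
\[ \Upsilon^-_{3,1}\Big(-\frac{1}{3}\Big) = \left( 27T^2_{-1/2} G_{-1/2} - 15G_{-1} T_{-1/2} - 6L_{-1} G_{-1/2} - 10G_{-3/2} \right) \left|\frac{1}{12}\right\rangle^G, \tag{103} \]
\[ \Upsilon^+_{1,3}(-3) = \left( T^3_{-1/2} + 6L_{-1} T_{-1/2} - 6G_{-1} G_{-1/2} - 4T_{-3/2} \right) \left|-\frac{1}{4}\right\rangle^G, \tag{104} \]
\[ \Upsilon^-_{1,3}(-3) = \left( T^2_{-1/2} G_{-1/2} - G_{-1} T_{-1/2} + 6L_{-1} G_{-1/2} - 2G_{-3/2} \right) \left|-\frac{1}{4}\right\rangle^G. \tag{105} \]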
If we replace $\left|\frac{r-s}{8r}\right\rangle^G$ by $\left|\frac{r-s}{8r}\right\rangle$ the resulting singular vectors are thus subsingular vectors in the Verma modules $V_{\frac{r-s}{8r}}$.
The appearance of these subsingular vectors is very closely related with an additional vanishing condition for the singular vector $G_0 \left|\frac{c}{24}\right\rangle$. That such an additional vanishing condition results in an additional null vector which is most likely subsingular will be proven in a forthcoming publication [15].
Acknowledgements. We are grateful to Adrian Kent for many discussions on the ordering method and to Victor Kac for his explanations on superconformal algebras. M.D. is indebted to the Deutsche Forschungsgemeinschaft (DFG) for financial support and to DAMTP for the hospitality.
References 1. Ademollo, M., Brink, L., d’Adda, A., d’Auria, R., Napolitano, E., Sciuto, S., del Giudice, E., di Vecchia, P., Ferrara, S. Gliozzi, F., Musto, R. and Pettorino, R.: Supersymmetric strings and colour confinement. Phys. Lett. B 62, 105 (1976) 2. Astashkevich, A.B.: On the structure of Verma modules over Virasoro and Neveu–Schwarz algebras. Commun. Math. Phys. 186, 531 (1997) 3. Bauer, M., di Francesco, P., Itzykson, C. and Zuber, J.-B.: Covariant differential equations and singular vectors in Virasoro representations. Nucl. Phys. B 362, 515 (1991) 4. Bauer, M., di Francesco, P., Itzykson, C. and Zuber, J.-B.: Singular vectors of the Virasoro algebra. Phys. Lett. B 260, 323 (1991) 5. Benoit, L. and Saint-Aubin, Y.: Fusion and the Neveu–Schwarz singular vectors. Lett. Math. Phys. 23, 117 (1991) 6. Benoit, L. and Saint-Aubin, Y.: An explicit formula for some singular vectors of the Neveu–Schwarz algebra. Int. J. Mod. Phys. A 7, 3023 (1992) 7. Bershadsky, M., Lerche, W., Nemeschansky, D. and Warner, N.P.: Extended N = 2 superconformal structure of gravity and W-gravity coupled to matter. Nucl. Phys. B 401, 304 (1993) 8. Boucher, W., Friedan, D. and Kent, A.: Determinant formulae and unitarity for the N = 2 superconformal algebras in two dimensions or exact results on string compactification. Phys. Lett. B 172, 316 (1986) 9. Dijkgraaf, R., Verlinde, E. and Verlinde, H.: Topological strings in d < 1. Nucl. Phys. B 352, 59 (1991)
10. Dörrzapf, M.: Singular vectors of the N = 2 superconformal algebra. Int. J. Mod. Phys. A 10, 2143 (1995) 11. Dörrzapf, M.: Superconformal field theories and their representations. PhD thesis, University of Cambridge, September 1995, http//www.damtp.cam.ac.uk/user/md131/research/thesis.html 12. Dörrzapf, M.: Analytic expressions for singular vectors of the N = 2 superconformal algebra. Commun. Math. Phys. 180, 195 (1996) 13. Dörrzapf, M.: The definition of Neveu–Schwarz superconformal fields and uncharged superconformal transformations. Rev.Math. Phys. 11, 137 (1999) 14. Dörrzapf, M.: The embedding structure of unitary N = 2 minimal models. Nucl. Phys. B 529, 639 (1998) 15. Dörrzapf, M. and Gato-Rivera, B.: Work in progress 16. Dörrzapf, M. and Gato-Rivera, B.: Singular dimensions of the N = 2 superconformal algebras I. Commun. Math. Phys. 206, 493 (1999) 17. Feigin, B.L. and Fuchs, D.B.: Representations of Lie groups and related topics. A.M. Vershik and A.D. Zhelobenko eds., New York: Gordon & Breach, 1990 18. Friedan, D., Qiu, Z. and Shenker, S.: Superconformal invariance in two dimensions and the tricritical Ising model. Phys. Lett. B 151, 37 (1985) 19. Fuchs, D.: Singular vectors over the Virasoro algebra and extended Verma modules. Adv. Sov. Math. 17, 65 (1993) 20. Gato-Rivera, B. and Rosado, J.I.: Chiral determinant formula and subsingular vectors for the N = 2 superconformal algebras. Nucl. Phys. B 503, 447 (1997) 21. Gato-Rivera, B. and Rosado, J.I.: Families of singular and subsingular vectors of the topological N = 2 superconformal algebra. Nucl. Phys. B 514, 477 (1998) 22. Gato-Rivera, B. and Semikhatov, A.M.: d ≤ 1 U d ≥ 25 and W constraints from BRST-invariance in the c = 3 topological algebra. Phys. Lett. B 293, 72 (1992) 23. Goddard, P., Kent, A. and Olive, D.: Unitary representations of the Virasoro and super-Virasoro algebras. Commun. Math. Phys. 103, 105 (1986) 24. Kac, V.G.: Lie superalgebras. Adv. Math. 26, 8 (1977) 25. 
Kac, V.G.: Contravariant form for infinite-dimensional (Lie) algebras and superalgebras. Lec. Notes in Phys. 94, Berlin–Heidelberg–New York: Springer, 1979, pp. 441 26. Kent, A.: Singular vectors of the Virasoro algebra. Phys. Lett. B 273, 56 (1991) 27. Kiritsis, E.B. Character formulae and the structure of the representations of the N = 1, N = 2 superconformal algebras. Int. J. Mod. Phys. A 3, 1871 (1988) 28. Meurman, A. and Rocha-Caridi, A.: Highest weight representations of the Neveu–Schwarz and Ramond algebras. Commun. Math. Phys. 107, 263 (1986) 29. Waterloo Maple Software. Programmed with Maple V, release 3, 1998 30. Watts, G.M.T.: Null vectors of the superconformal algebra: The Ramond sector. Nucl. Phys. B 407, 213 (1993) Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 220, 293 – 299 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Log Mirror Symmetry and Local Mirror Symmetry Nobuyoshi Takahashi Department of Mathematics, Hiroshima University, Higashi-Hiroshima 739-8526, Japan. E-mail: [email protected] Received: 1 October 1999 / Accepted: 22 November 2000
Abstract: We study Mirror Symmetry of log Calabi–Yau surfaces. On one hand, we consider the number of “affine lines” of each degree in P2 \B, where B is a smooth cubic. On the other hand, we consider coefficients of a certain expansion of a function obtained from the integrals of dx/x ∧ dy/y over 2-chains whose boundaries lie on Bφ , where {Bφ } is a family of smooth cubics. Then, for small degrees, they coincide. We discuss the relation between this phenomenon and local mirror symmetry for P2 in a Calabi–Yau 3-fold ([CKYZ]). 0. Introduction In the classification theory of algebraic varieties, one can study non-compact algebraic manifolds by means of Log Geometry. For a quasi-projective manifold U , there exist a projective variety X and a normal crossing divisor B on X such that X\B is isomorphic to U. A rational differential form on X is called a logarithmic form if it can be locally written as a linear combination (with regular functions as coefficients) of products of d log f , where f is a regular function that vanishes at most in B. Then the space of logarithmic forms is an invariant of U and it plays the role that the space of regular forms plays in the classification theory of projective manifolds. In particular, we may think of a pair (X, B) (or the open variety X\B) as a log Calabi–Yau manifold if KX + B is trivial. It is expected that some analogue of Mirror Symmetry holds for log Calabi–Yau manifolds. As an evidence, we studied Mirror Symmetry for the one-dimensional log Calabi–Yau manifold, i.e. (P1 , {0, ∞}) or C× , in [T2]: for b ≥ 0 and k, l > 0, we consider covers of P1 that have k and l points over 0 and ∞ and are simply branched over b prescribed points. We define Fb,k,l (z1 , . . . , zk ; w1 , . . . , wl ) to be the generating function of the number of such curves. Then Fb,k,l coincides with the sum of certain integrals associated to graphs that have k and l “initial” and “final” vertices, have b internal vertices and are trivalent
at internal vertices (here, variables z1 , . . . , zk and w1 , . . . wl are attached to initial and final vertices). The aim of this paper is to study the two-dimensional case as a further example of Log Mirror Symmetry. In Sect. 1, we describe Mirror Symmetry of P2 \B, where B is a smooth cubic. On the A-model side, we count curves of each degree in P2 \B whose normalizations are A1 and construct a power series from those numbers. On the B-model side, we construct a function from the integrals of dx/x ∧ dy/y over 2-chains whose boundaries lie on Bφ , where Bφ is a member of a family of smooth cubics parametrized by φ. Then we see that the coefficients of their expansions coincide up to order 8. Then, in Sect. 2, we discuss the relation between our log mirror symmetry and local mirror symmetry studied in [CKYZ]. We explain that the number of “affine lines” is in some sense the dual of local Gromov–Witten invariants of P2 in a Calabi–Yau 3-fold. 1. Log Mirror Symmetry 1.1. A-model. On the enumerative side, we counted curves of the lowest “log genus” in [T1]. In the following, B denotes a smooth cubic curve in P2 , the complex projective plane. Definition 1.1. A plane curve C is said to satisfy the condition (AL) if it is irreducible and reduced and the normalization of C\B is isomorphic to the affine line A1 . Remark 1.2. If C satisfies (AL), then C∩B consists of one point. This point is a (3 deg C)torsion for a group structure on B whose zero element is an inflection point of B. We treat only “primitive” cases. Definition 1.3. Let P be a point of order 3d for a group structure on B whose zero element is a point of inflection. Then we define md to be the number of curves C of degree d which satisfy (AL) and C ∩ B = {P } (or, more precisely, the degree of the 0-dimensional scheme parametrizing such curves). The above definition implicitly assumes that the number md does not depend on the choice of P . 
Although we haven't proved it yet, it holds in the cases where we know $m_d$, i.e. when $d \le 8$.
Theorem 1.4 ([T1]). We have the following table for $m_d$:

d      1    2    3    4     5      6
m_d    1    1    3    16    113    948

Furthermore, under a technical hypothesis (see [T1]), we have m7 = 8974 and m8 = 92840. Our invariants are apparently related to the relative Gromov–Witten invariants defined in [IP] and [LR] for pairs of a symplectic variety and its subvariety of real codimension 2. In algebraic language: Proposition-Definition 1.5 (1-pointed case of [G1]). Let M¯ 0,1 (P2 , d) be the moduli stack of 1-pointed genus 0 stable maps of degree d to P2 and M¯ iB (P2 , d) the closed subset consisting of points corresponding to f : (C, P ) → P2 such that
(i) $f(P) \in B$ and (ii) $f^*B - iP \in A_0(f^{-1}B)$ is effective. Then the virtual fundamental class $[\bar{M}^B_i(\mathbb{P}^2, d)]^{virt}$ is naturally defined and is of expected dimension $3d - i$ for $i \le 3d$.
Considering that the number of $3d$-torsions is $(3d)^2$ and that the relative Gromov–Witten invariants take multiple covers into account, we expect the following to hold.
Conjecture 1.6. $[\bar{M}^B_{3d}(\mathbb{P}^2, d)]^{virt} = (3d)^2 \sum_{k|d} (-1)^{d-d/k}\, m_{d/k}/k^4$.
Remark 1.7. (1) Although the factor $k^{-4}$ looks unfamiliar, this is compatible with the factor $k^{-3}$ for Gromov–Witten invariants, since $m_d$ is conjecturally equal to $(-1)^{3d-1} n_d/(3d)$, where $n_d$ is the local Gromov–Witten invariant (see Remark 2.2). (2) A. Gathmann informed the author that this is true for $d \le 8$ (i.e. where $m_d$ is calculated).$^1$
1 His computation of $[\bar{M}^B_{3d}(\mathbb{P}^2, d)]^{virt}$ was made available after the submission of this paper: [G2], Example 2.2.

1.2. B-model. The result of [T2] suggests that the mirror manifold of $\mathbb{C}^\times$ is $\mathbb{C}^\times$. In the A-model, we associate a parameter to each point over 0 or $\infty$ on a cover of $\mathbb{C}^\times$, and in the B-model this corresponds to the choice of points on $\mathbb{C}^\times$: the former can be considered as "Kähler moduli" and the latter as "complex moduli". In the B-model, we used the points as the boundary of integrals to define coordinate functions. With this in mind, we look for a function which has an expansion with coefficients $m_d$. Although the calculation below is essentially the same as in [CKYZ], here we see it from the point of view of Mirror Symmetry of log Calabi–Yau surfaces: we claim that the mirror of ($\mathbb{P}^2 \setminus$ (smooth cubic) & Kähler moduli) is ($\mathbb{C}^\times \times \mathbb{C}^\times$ & cubic), where the latter cubic has complex moduli and we take it as the boundary of integrals, just as in the case of $\mathbb{C}^\times$.
We consider homogeneous coordinates $X, Y, Z$ and inhomogeneous coordinates $x, y$ on $\mathbb{P}^2$. Let $\omega$ be the logarithmic 2-form $dx/x \wedge dy/y$ on $X_0 := \mathbb{P}^2 \setminus \{XYZ = 0\}$ and write $B_\phi$ for the plane cubic defined by $XYZ - \phi(X^3 + Y^3 + Z^3) = 0$. Then we consider the integral
\[ I := \int_{\Gamma_\phi} \omega, \]
where $\Gamma_\phi$ is a 2-chain in $X_0$ whose boundary has support on $B_\phi$.
Lemma 1.8. We have
\[ \phi \frac{dI}{d\phi} = \int_{\partial\Gamma_\phi} \frac{dx}{x - 3\phi y^2}. \]
Proof. If we fix $x$ and differentiate $xy - \phi(x^3 + y^3 + 1) = 0$, we have $dy/d\phi = xy/\phi(x - 3\phi y^2)$.
So, if we set $z := \phi^3$ and $\theta := z \frac{d}{dz}$, we have
\[ \{\theta^3 - 3z\theta(3\theta + 1)(3\theta + 2)\} I = 0. \]
The following functions form a basis of the space of solutions:
\[ I_1 = 1, \qquad I_2 = \log z + I_2^{(0)}, \qquad I_3 = I_2 \log z - \frac{(\log z)^2}{2} + I_3^{(0)}, \]
where
\[ I_2^{(0)} = 6z + 45z^2 + 560z^3 + \frac{17325}{2}z^4 + \frac{756756}{5}z^5 + 2858856z^6 + \frac{399072960}{7}z^7 + \frac{4732755885}{4}z^8 + \cdots, \]
\[ I_3^{(0)} = 9z + \frac{423}{4}z^2 + 1486z^3 + \frac{389415}{16}z^4 + \frac{21981393}{50}z^5 + \frac{16973929}{2}z^6 + \frac{8421450228}{49}z^7 + \frac{1616340007953}{448}z^8 + \cdots. \]
I3 =
k=1
− 152 .113
∞ q 5k
k2
k=1 ∞
+ 242 .92840
k=1
+ 182 .948
k=1
∞ q 6k k=1
k2
k=1 ∞
− 212 .8974
k=1
k=1
q 7k k2
q 8k + ··· . k2
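The last equality above is a finite arithmetic statement for each power of $q$: the coefficient of $q^n$ in $\sum_d (-1)^d (3d)^2 m_d \sum_k q^{dk}/k^2$ is $\sum_{d|n} (-1)^d (3d)^2 m_d/(n/d)^2$. The following Python sketch (names are illustrative) evaluates these divisor sums from the values of $m_d$ in Theorem 1.4, including $m_7$ and $m_8$, and compares them with the printed $q$-expansion.

```python
from fractions import Fraction

# Values of m_d from Theorem 1.4 (m_7, m_8 under the technical hypothesis of [T1]).
M = {1: 1, 2: 1, 3: 3, 4: 16, 5: 113, 6: 948, 7: 8974, 8: 92840}


def q_coefficient(n, m=M):
    """Coefficient of q^n in sum_d (-1)^d (3d)^2 m_d sum_k q^(dk) / k^2."""
    total = Fraction(0)
    for d in range(1, n + 1):
        if n % d == 0:
            k = n // d
            total += Fraction((-1) ** d * (3 * d) ** 2 * m[d], k**2)
    return total


# Coefficients of q^1..q^8 as printed in the q-expansion of I_3.
printed_q = [
    Fraction(-9), Fraction(135, 4), Fraction(-244), Fraction(36999, 16),
    Fraction(-635634, 25), Fraction(307095), Fraction(-193919175, 49),
    Fraction(3422490759, 64),
]

computed_q = [q_coefficient(n) for n in range(1, 9)]
print(computed_q == printed_q)
```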
Remark 1.9. K := (zdI2 /dz)3 d 2 I3 /dI22 satisfies dK/dz = (27/(1 − 27z))K, and therefore we have K = 1/(1 − 27z). Comparing the coefficients with the table in Theorem 1.4, we propose:
Conjecture 1.10.
\[ I_3 = \frac{(\log(-q))^2}{2} + \sum_{d=1}^{\infty} (-1)^d (3d)^2 m_d \sum_{k=1}^{\infty} \frac{q^{dk}}{k^2}. \]
Remark 1.11. If we assume Conjecture 1.6, the previous conjecture is equivalent to ∞
I3 =
(log(−q))2 B + (−1)d [M¯ 3d (P2 , d)]virt q d , 2 d=1
and in fact this may be a more natural equality. According to A. Gathmann, his algorithm in [G1] can be used to prove this.²

2. Log Mirror and Local Mirror

In [CKYZ], the generating function of the “numbers of rational curves in a local Calabi–Yau 3-fold” was given. Let $\mathbb{P}^2$ be embedded in a Calabi–Yau 3-fold and denote by $n_d$ the contribution of rational curves of degree $d$ in $\mathbb{P}^2$ to the number of rational curves in the Calabi–Yau 3-fold. Let $\bar M_{0,0}$ be the moduli of stable maps of genus 0 curves to $\mathbb{P}^2$ with degree $d$ images and $U$ the vector bundle over $\bar M_{0,0}$ whose fiber at the point $[f : C \to \mathbb{P}^2]$ is $H^1(C, f^*K_{\mathbb{P}^2})$. Then, the Chern number $K_d := c_{3d-1}(U)$ is equal to $\sum_{k|d} n_{d/k}/k^3$.

Theorem 2.1 ([CKYZ]).
\begin{align*}
I_3 &= \frac{(\log(-q))^2}{2} - \sum_{d=1}^{\infty} 3dK_d\, q^d\\
&= \frac{(\log(-q))^2}{2} - \sum_{d=1}^{\infty} 3dn_d \sum_{k=1}^{\infty} \frac{q^{dk}}{k^2}.
\end{align*}
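The two expressions in Theorem 2.1 agree once $K_d$ is expanded as $K_d = \sum_{k|d} n_{d/k}/k^3$. A short worked check, writing $d = ek$ (the index $e$ is introduced here only for this computation):

```latex
\begin{align*}
\sum_{d=1}^{\infty} 3d\,K_d\,q^{d}
  &= \sum_{d=1}^{\infty} 3d \sum_{k\mid d} \frac{n_{d/k}}{k^{3}}\,q^{d}
   = \sum_{e=1}^{\infty}\sum_{k=1}^{\infty} 3ek\,\frac{n_{e}}{k^{3}}\,q^{ek}
   = \sum_{e=1}^{\infty} 3e\,n_{e} \sum_{k=1}^{\infty} \frac{q^{ek}}{k^{2}} .
\end{align*}
```

The factor $k$ coming from $d = ek$ lowers the power of $k$ in the denominator from 3 to 2, which is why the second form carries $q^{dk}/k^2$.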
Remark 2.2. Thus Conjecture 1.10 is equivalent to nd = (−1)d−1 3dmd , and this equality holds for d ≤ 8 by Theorem 1.4 and the q-expansion of I3 in the previous section. We give a heuristic argument as to why this should hold, although it contains serious gaps as we explain later. By Serre duality, the dual of U is isomorphic to the vector bundle V over M¯ 0,0 whose fiber at [f : C → P2 ] is H 0 (C, KC ⊗ f ∗ OP2 (B)). Since the rank of the bundles is 3d − 1, we have c3d−1 (V ) = (−1)d−1 Kd . Let M¯ 0,1 be the moduli of stable maps of 1-pointed genus 0 curves to P2 with degree d images, M0,1 the open subset of M¯ 0,1 representing f : (P1 , P ) → P2 and π : M¯ 0,1 → M¯ 0,0 the projection. Further, let E1 be the line bundle over M¯ 0,1 whose fiber at f : (C, P ) → P2 is H 0 (C, f ∗ OP2 (B) ⊗ OP ) and L the line bundle whose fiber is H 0 (C, OP (−P )). Then, we have c3d (π ∗ V ⊕ E1 ) = 3dc3d−1 (V ), for the zero set of a section of E1 induced by a defining equation of B is the set of the points corresponding to f : (C, P ) → P2 such that f (P ) ∈ B, and there are 3d such points for any f : C → P2 . 2 [G2], Example 2.2.
We have an exact sequence
\[
0 \to K_C \to K_C(P) \xrightarrow{\ \mathrm{residue}\ } \mathcal{O}_P \to 0.
\]
If $C$ is irreducible, i.e. isomorphic to $\mathbb{P}^1$, we obtain exact sequences
\[
0 \to H^0(C, K_C(-(k+1)P) \otimes f^*\mathcal{O}_{\mathbb{P}^2}(B)) \to H^0(C, K_C(-kP) \otimes f^*\mathcal{O}_{\mathbb{P}^2}(B)) \to H^0(C, \mathcal{O}_P(-(k+1)P) \otimes f^*\mathcal{O}_{\mathbb{P}^2}(B)) \to 0
\]
for $0 \le k \le 3d-2$. We also have $H^0(C, K_C(-(3d-1)P) \otimes f^*\mathcal{O}_{\mathbb{P}^2}(B)) = 0$. Thus, on $M_{0,1}$, we have a filtration of $\pi^*V \oplus E_1$ such that the associated graded module is isomorphic to $\bigoplus_{i=0}^{3d-1} E_1 \otimes L^i$.

On the other hand, there are about $(3d)^2 m_d$ plane curves of degree $d$ satisfying (AL), since the number of $3d$-torsions on $B$ is $(3d)^2$. They are in one-to-one correspondence with points $(f : (\mathbb{P}^1, P) \to \mathbb{P}^2) \in M_{0,1}$ such that $f$ is birational onto the image and that $f^*B = 3dP$. Consider the vector bundle $E_{3d}$ of rank $3d$ over $\bar M_{0,1}$ whose fiber at $f : (C, P) \to \mathbb{P}^2$ is $H^0(C, f^*\mathcal{O}_{\mathbb{P}^2}(B) \otimes \mathcal{O}_{3dP})$. On $M_{0,1}$, the zero set of a section of $E_{3d}$ induced by a defining equation of $B$ is the set of the points $f : (C, P) \to \mathbb{P}^2$ such that $f^*B = 3dP$. Now, from the exact sequences
\[
0 \to \mathcal{O}_P(-kP) \to \mathcal{O}_{(k+1)P} \to \mathcal{O}_{kP} \to 0,
\]
we see that $E_{3d}$ has a filtration such that the associated graded module is isomorphic to $\bigoplus_{i=0}^{3d-1} E_1 \otimes L^i$. Thus we may expect
\[
(3d)^2 m_d \approx (-1)^{3d-1}\, 3dK_d \approx (-1)^{3d-1}\, 3dn_d .
\]

Rigorously, however, this argument makes little sense. First, the section of $E_{3d}$ induced by a defining equation of $B$ has undesirable zeros in $\bar M_{0,1} \setminus M_{0,1}$. For example, if $C = C_1 \cup C_2 \cup C_3$ (a chain in this order), $P \in C_2$ and $C_2$ maps to a point $Q$ in $B$, we may take any rational curves through $Q$ as the images of $C_1$ and $C_3$. Thus the number $(3d)^2 m_d$ may be much different from $c_{3d}(E_{3d})$: in fact, we have $c_{3\cdot 2}(E_{3\cdot 2}) = 828$, which is much larger than $(3 \cdot 2)^2 m_2 = 36$. Second, we have a filtration of $\pi^*V \oplus E_1$ merely on $M_{0,1}$.

Let $\bar M^B_i(\mathbb{P}^2, d)$ be as in Definition 1.5. The section of the line bundle $(E_1 \otimes L^i)|_{\bar M^B_i(\mathbb{P}^2, d)}$ induced by a defining equation of $B$ vanishes when $f^{-1}B \supseteq (i+1)P$ is satisfied. Then, [G1] describes the difference between $c_1(E_1 \otimes L^i).[\bar M^B_i(\mathbb{P}^2, d)]^{\mathrm{virt}}$ and $[\bar M^B_{i+1}(\mathbb{P}^2, d)]^{\mathrm{virt}}$. This should account for the difference between $c(\bigoplus_{i=0}^{3d-1} E_1 \otimes L^i)$ and $c(\pi^*V \oplus E_1)$.

References

[CKYZ] Chiang, T.-M., Klemm, A., Yau, S.-T., Zaslow, E.: Local mirror symmetry: calculations and interpretations. Adv. Theor. Math. Phys. 3, no. 3, 495–565 (1999)
[G1] Gathmann, A.: Absolute and relative Gromov–Witten invariants of very ample hypersurfaces. Preprint (math.AG/9908054)
[G2] Gathmann, A.: Relative Gromov–Witten invariants and the mirror formula. Preprint (math.AG/0009190)
[IP] Ionel, E., Parker, T.: Relative Gromov–Witten Invariants. Preprint (math.SG/9907155)
[LR] Li, A., Ruan, Y.: Symplectic surgery and Gromov–Witten invariants of Calabi–Yau 3-folds I. Preprint (math.AG/9803036)
[T1] Takahashi, N.: Curves in the complement of a smooth plane cubic whose normalizations are $\mathbb{A}^1$. Preprint (alg-geom/9605007)
[T2] Takahashi, N.: Mirror symmetry and $\mathbb{C}^\times$. Proc. Am. Math. Soc. 129, no. 1, 29–36 (2001)
Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 220, 301 – 331 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Strong Connections and Chern–Connes Pairing in the Hopf–Galois Theory
Ludwik Dąbrowski1, Harald Grosse2, Piotr M. Hajac3,4
1 Scuola Internazionale Superiore di Studi Avanzati, Via Beirut 2–4, 34014 Trieste, Italy.
E-mail: [email protected]
2 Institute for Theoretical Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria.
E-mail: [email protected]
3 Mathematical Institute, Polish Academy of Sciences, ul. Śniadeckich 8, Warsaw, 00-950, Poland
4 Department of Mathematical Methods in Physics, Warsaw University, ul. Hoża 74, Warsaw, 00-682, Poland.
E-mail: [email protected] Received: 8 March 2000 / Accepted: 5 January 2001
Abstract: We reformulate the concept of connection on a Hopf–Galois extension B ⊆ P in order to apply it in computing the Chern–Connes pairing between the cyclic cohomology H C 2n (B) and K0 (B). This reformulation allows us to show that a Hopf–Galois extension admitting a strong connection is projective and left faithfully flat. It also enables us to conclude that a strong connection is a Cuntz–Quillen-type bimodule connection. To exemplify the theory, we construct a strong connection (super Dirac monopole) to find out the Chern–Connes pairing for the super line bundles associated to a super Hopf fibration. Introduction A noncommutative-geometric concept of principal bundles and characteristic classes is given by the Hopf–Galois theory of algebra extensions and the pairing between cyclic cohomology and K-theory, respectively. In the spirit of the Serre-Swan theorem, the quantum vector bundles are given as finitely generated projective modules associated to an H -Galois extension via a corepresentation of Hopf algebra H . The K0 -class of such a module can then be paired with the cohomology class of a cyclic cocycle to produce an invariant playing the role of an integrated characteristic class of a vector bundle. To obtain these invariants, we provide a theory of connections on Hopf–Galois extensions which can be used in calculating projector matrices of associated quantum vector bundles. A main point of this paper is that strong connections on an H -Galois extension B ⊆ P are equivalent to left B-linear right H -colinear unital splittings of the multiplication map B ⊗ P → P . Since connections can be considered as appropriate liftings of the translation map (restricted inverse of the canonical Galois map), knowing a connection yields automatically an explicit expression for the translation map. Vice-versa, an explicit formula for the translation map might immediately indicate a formula for connection. (This is important from the practical point of view.) 
If a connection is strong, then the simple machinery presented herein helps one to extract the projective module data of
an associated quantum vector bundle. One can then plug it into the computation of the pairing. In the classical geometry, characteristic classes of associated vector bundles can be computed from connections on principal bundles. Our approach parallels to some extent this idea in the quantum-geometric setting. We work within the general framework of noncommutative geometry, quantum groups and Galois-type theories. For an introduction to Hopf–Galois extensions we refer to [M-S93, S-HJ94] and for a comprehensive description of the Chern–Connes pairing to [C-A94, L-JL97]. The point of view advocated in here was already employed to compute projector matrices [HM99] and the Chern numbers [H-PM00] of the quantum Hopf line bundles from the Dirac q-monopole connection [BM93]. Thus, although this work is antedated by [HM99] and [H-PM00], it conceptually precedes these papers, and can be viewed as a follow up of the theory of connections, strong connections and associated quantum vector bundles developed in [BM93, H-PM96] and [D-M96], respectively. (See [D-M97a, Sect. 5] and [D-M97c, D-Ma] for an alternative theory of characteristic classes on quantum principal bundles.) We begin in Sect. 1 by recalling basic facts and definitions. In Sect. 2 we first reformulate the concept of general connections so as to make transparent the characterisation of a strong connection as an appropriate splitting of the multiplication map B ⊗ P → P , where P is an H -Galois extension of B. Then we prove the equivalence of four different definitions of a strong connection, which is the main claim of this paper, and study its consequences. As a quick illustration of the theory, we apply it to a strong and non-strong connection on quantum projective space RPq2 . We obtain, as a by-product, the definition of the “tangent bundle” of the Podle´s equator quantum sphere. 
We also show that there are infinitely many canonical strong connections on the quantum Hopf fibration, and prove that they all coincide with the Dirac monopole in the classical limit. A super Dirac monopole is presented in Sect. 3. We adapt to the Hopf–Galois setting the construction of a super Hopf fibration. Then, employing the super monopole, we compute projector matrices of the super Hopf line bundles. Taking advantage of the functoriality of the Chern–Connes pairing, we conclude that the values of the pairing for the super and classical Hopf line bundles coincide. Hence we infer the non-cleftness of the super Hopf fibration. We end Sect. 3 by proving that, in analogy with the classical situation, the direct sum of spin-bundle modules (Dirac spinors) is free of rank two for both the super and the quantum Hopf fibration. In the Appendix, we complement the four descriptions of a strong connection by providing four (appropriately adapted) equivalent actions of gauge transformations on connections.

1. Preliminaries

Throughout the paper algebras are assumed to be unital and over a field $k$. The unadorned tensor product stands for the tensor product over $k$. Our approach is algebraic, so that we use finite sums. We use the Sweedler notation $\Delta h = h_{(1)} \otimes h_{(2)}$ (summation understood) and its derivatives. The letters $S$ and $\varepsilon$ signify the antipode and counit, respectively. The convolution product of two linear maps from a coalgebra to an algebra is denoted in the following way: $(f * g)(c) := f(c_{(1)})g(c_{(2)})$. We use the word “colinear” with respect to linear maps that preserve the comodule structure. (Such maps are also called “covariant.”) We work with right Hopf–Galois extensions and skip writing “right” for brevity. For an $H$-Galois extension $B \subseteq P$ we write the canonical Galois isomorphism as
\[
\chi := (m \otimes \mathrm{id}) \circ (\mathrm{id} \otimes_B \Delta_R) : P \otimes_B P \longrightarrow P \otimes H, \tag{1.1}
\]
where $\Delta_R : P \to P \otimes H$ stands for the comodule-algebra coaction ($\Delta_R\, p =: p_{(0)} \otimes p_{(1)}$; again, summation understood), and $m$ for the multiplication map $P \otimes P \to P$. We say that a Hopf–Galois extension is cleft iff there exists a (unital) convolution-invertible colinear map $\Phi : H \to P$, and call $\Phi$ a cleaving map. The concept of cleftness is close but, as explained in the last paragraph of [DHS99, Sect. 4], not tantamount to the idea of triviality of a principal bundle. (Trivial is cleft but not vice-versa.) A cleaving map is usually assumed to be unital, but since any non-unital $\Phi$ can be unitalised (e.g., see [DT86, p. 813] or [HM99, Sect. 1]), this assumption, though technically useful, is conceptually redundant. It also follows from the defining properties of $\Phi$ that it is injective (e.g., see [HM99, Sect. 1]). Next, note that the canonical map $\chi$, although it cannot be an algebra homomorphism in general, is always determined by its values on generators. The same is true for $\chi^{-1}$. The left $P$-linearity of $\chi^{-1}$ makes it practical to restrict it from $P \otimes H$ to $H$, and define the translation map $\tau : H \to P \otimes_B P$,
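\[
\tau(h) := \chi^{-1}(1 \otimes h) =: h^{[1]} \otimes_B h^{[2]} \quad \text{(summation understood)}. \tag{1.2}
\]
The following are properties of $\tau$ compiled from [S-HJ90b, B-T96]:
\begin{align}
(\mathrm{id} \otimes_B \Delta_R) \circ \tau &= (\tau \otimes \mathrm{id}) \circ \Delta, \tag{1.3}\\
((\mathrm{flip} \circ \Delta_R) \otimes_B \mathrm{id}) \circ \tau &= (S \otimes \tau) \circ \Delta, \tag{1.4}\\
\Delta_{P \otimes_B P} \circ \tau &= (\tau \otimes \mathrm{id}) \circ \mathrm{Ad}_R, \tag{1.5}\\
m \circ \tau &= \varepsilon, \tag{1.6}\\
\tau(h\tilde h) &= \tilde h^{[1]} h^{[1]} \otimes_B h^{[2]} \tilde h^{[2]}. \tag{1.7}
\end{align}
Here $\Delta_{P \otimes_B P}$ is the coaction on $P \otimes_B P$ obtained via the canonical surjection $\pi_B : P \otimes P \to P \otimes_B P$ from the diagonal coaction
\[
\Delta_{P \otimes P} : p \otimes p' \longmapsto p_{(0)} \otimes p'_{(0)} \otimes p_{(1)} p'_{(1)}, \tag{1.8}
\]
and $\mathrm{Ad}_R(h) := h_{(2)} \otimes S(h_{(1)})h_{(3)}$ is the right adjoint coaction. To fix convention and clarify some basic issues, let us recall that the universal differential calculus $\Omega^1 A$ (grade one of the universal differential algebra) can be defined by the exact sequence
\[
0 \longrightarrow \Omega^1 A \longrightarrow A \otimes A \longrightarrow A \longrightarrow 0, \tag{1.9}
\]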
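i.e., as the kernel of the multiplication map. The differential is given by $da := 1 \otimes a - a \otimes 1$. We can identify $\Omega^1 A$ with $A \otimes A/k$ as left $A$-modules via the maps
\[
\Omega^1 A \ni \sum_i a_i \otimes a'_i \longmapsto \sum_i a_i \otimes \pi_A(a'_i) \in A \otimes A/k, \qquad A \otimes A/k \ni x \otimes \pi_A(y) \longmapsto x\,dy \in \Omega^1 A, \tag{1.10}
\]
where $\pi_A : A \to A/k$ is the canonical surjection. Similarly, one can identify $\Omega^1 A$ with $A/k \otimes A$ as right $A$-modules ($\sum_i a_i \otimes a'_i \mapsto \sum_i \pi_A(a_i) \otimes a'_i$). Consequently, for any left $A$-module $N$, we have $\Omega^1 A \otimes_A N \cong A/k \otimes N$. For any splitting $\imath : A/k \to A$ of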
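the canonical surjection ($\pi_A \circ \imath = \mathrm{id}$), we have an injection $\imath \otimes \mathrm{id} : A/k \otimes N \to A \otimes N$. Thus there is an injection
\[
f_\imath : \Omega^1 A \otimes_A N \longrightarrow A \otimes N, \qquad f_\imath\Bigl(\sum_{i,j} a_{ij} \otimes a'_{ij} \otimes_A n_j\Bigr) := \sum_{i,j} (\imath \circ \pi_A)(a_{ij}) \otimes a'_{ij} n_j. \tag{1.11}
\]
On the other hand, we have a natural map coming from tensoring (1.9) on the right with $N$:
\[
f_N : \Omega^1 A \otimes_A N \longrightarrow A \otimes N, \qquad f_N\Bigl(\sum_{i,j} a_{ij} \otimes a'_{ij} \otimes_A n_j\Bigr) := \sum_{i,j} a_{ij} \otimes a'_{ij} n_j. \tag{1.12}
\]
Since $\pi_A \circ \imath = \mathrm{id}$, we have $(\pi_A \otimes \mathrm{id}) \circ f_N = (\pi_A \otimes \mathrm{id}) \circ f_\imath$, whence
\[
((\imath \circ \pi_A) \otimes \mathrm{id}) \circ f_N = ((\imath \circ \pi_A) \otimes \mathrm{id}) \circ f_\imath = f_\imath. \tag{1.13}
\]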
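It follows now from the injectivity of $f_\imath$ that $f_N$ is injective. Thus we have shown that (1.9) yields the exact sequence:
\[
0 \longrightarrow \Omega^1 A \otimes_A N \longrightarrow A \otimes N \longrightarrow N \longrightarrow 0. \tag{1.14}
\]
If $B$ is a subalgebra of $P$, then we can also write $(\Omega^1 B)P$ for the kernel of the multiplication map $B \otimes P \to P$. Indeed, $m((\Omega^1 B)P) = 0$, and if $\sum_i b_i \otimes p_i \in \mathrm{Ker}\,(B \otimes P \xrightarrow{m} P)$, then
\[
\sum_i b_i \otimes p_i = \sum_i (b_i \otimes p_i - 1 \otimes b_i p_i) = -\sum_i (db_i)p_i \in (\Omega^1 B)P. \tag{1.15}
\]
To sum up, we have (cf. [HM99, p. 251])
\[
\Omega^1 B \otimes_B P \cong \mathrm{Ker}\,(B \otimes P \xrightarrow{m} P) = (\Omega^1 B)P. \tag{1.16}
\]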
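Two elementary facts about the universal differential $da = 1 \otimes a - a \otimes 1$ are used repeatedly below: exact forms lie in the kernel of the multiplication map, and $d$ satisfies the Leibniz rule. A brief worked check, valid for arbitrary $a, b$ in a unital algebra $A$ with $A$ acting on $A \otimes A$ by multiplication on the outer factors:

```latex
\begin{align*}
m(da) &= m(1\otimes a - a\otimes 1) = a - a = 0,\\
(da)\,b + a\,(db)
  &= (1\otimes ab - a\otimes b) + (a\otimes b - ab\otimes 1)
   = 1\otimes ab - ab\otimes 1 = d(ab).
\end{align*}
```

The cancellation of the cross term $a \otimes b$ is the same mechanism that turns $b_i \otimes p_i - 1 \otimes b_i p_i$ into $-(db_i)p_i$ in (1.15).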
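The following are the universal-differential-calculus versions of general-calculus definitions in [BM93, H-PM96]:

Definition 1.1 ([BM93]). Let $B \subseteq P$ be an $H$-Galois extension. Denote by $\Omega^1 P$ the universal differential calculus on $P$ and by $\Delta_{\Omega^1 P}$ the restriction of $\Delta_{P \otimes P}$ to $\Omega^1 P$. A left $P$-module projection $\Pi$ on $\Omega^1 P$ is called a connection iff
\begin{align}
\mathrm{Ker}\,\Pi &= P(\Omega^1 B)P \quad \text{(horizontal forms)}, \tag{1.17}\\
\Delta_{\Omega^1 P} \circ \Pi &= (\Pi \otimes \mathrm{id}) \circ \Delta_{\Omega^1 P} \quad \text{(right colinearity)}. \tag{1.18}
\end{align}

Definition 1.2 ([BM93]). Let $P$, $H$, $B$ and $\Omega^1 P$ be as above. A $k$-homomorphism $\omega : H \to \Omega^1 P$ such that $\omega(1) = 0$ is called a connection form iff it satisfies the following properties:
1. $(m \otimes \mathrm{id}) \circ (\mathrm{id} \otimes \Delta_R) \circ \omega = 1 \otimes (\mathrm{id} - \varepsilon)$ (fundamental vector field condition),
2. $\Delta_{\Omega^1 P} \circ \omega = (\omega \otimes \mathrm{id}) \circ \mathrm{Ad}_R$ (right adjoint colinearity).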
For every Hopf–Galois extension there is a one-to-one correspondence between connections and connection forms (see [BM93, p. 606] or [M-S97, Prop. 2.1]). In particular, the connection $\Pi^\omega$ associated to a connection form $\omega$ is given by the formula:
\[
\Pi^\omega(dp) = p_{(0)}\,\omega(p_{(1)}). \tag{1.19}
\]
(Since &ω is a left P -module homomorphism, it suffices to know its values on exact forms.) Definition 1.3 ([H-PM96]). Let & be a connection in the sense of Definition 1.1. It is called strong iff (id − &)(dP ) ⊆ (1 B)P . We say that a connection form is strong iff its associated connection is strong. Let us now have a closer look at the concept of connection. For the sake of brevity we put χ = (m ⊗ id) ◦ (id ⊗ R ) : P ⊗ P → P ⊗ H,
(1.20)
denote by χ its restriction to 1 P , and by H + the kernel of the counit map (augmentation ideal). Since ((id ⊗ ε) ◦ χ ) (1 P ) = 0, we have χ (1 P ) = P ⊗ H + . Consider P ⊗ H , + and similarly P ⊗ H , as a right comodule via the map P ⊗H : p ⊗ h −→ p(0) ⊗ h(2) ⊗ p(1) S(h(1) )h(3) .
(1.21)
Then there is a one-to-one correspondence between connections and left P -linear right H -colinear splittings of χ [BM93, p. 606]. Since H = H + ⊕ k, we can define σ (p ⊗ h) for h ∈ H + σ (p ⊗ h) = (1.22) p ⊗ h1P for h ∈ k, where σ is a splitting of χ . On the other hand, we can consider unital left P -linear right H -colinear splittings r of the canonical surjection πB : P ⊗ P → P ⊗B P . This leads to the following commutative diagram of exact rows of left P -modules right H -comodules (see above for the comodule structures): χ 0 −−→ P (1B)P −−→ 1P
− −− −−→
←−− P ⊗ H + −−→ 0 σ χ
− −− −−→
0 −−→ P (1B)P −−→ P ⊗ P ←−− P ⊗ H −−→ 0 σ −1 χ π
(1.23)
B
− −− −−→
0 −−→ P (1B)P −−→ P ⊗ P ←−− P ⊗B P −−→ 0. r
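One can check that $\chi$ intertwines the relevant comodule structures:
\begin{align}
(\Delta_{P \otimes H} \circ \chi)(p \otimes p') &= \Delta_{P \otimes H}(p p'_{(0)} \otimes p'_{(1)}) \notag\\
&= p_{(0)} p'_{(0)} \otimes p'_{(3)} \otimes p_{(1)} p'_{(1)} S(p'_{(2)}) p'_{(4)} \notag\\
&= p_{(0)} p'_{(0)} \otimes p'_{(1)} \otimes p_{(1)} p'_{(2)} \notag\\
&= (\chi \otimes \mathrm{id})(p_{(0)} \otimes p'_{(0)} \otimes p_{(1)} p'_{(1)}) \notag\\
&= ((\chi \otimes \mathrm{id}) \circ \Delta_{P \otimes P})(p \otimes p'). \tag{1.24}
\end{align}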
Other calculations to verify that this diagram is a commutative diagram of right $H$-comodules are of the same kind. To see that $\mathrm{Ker}\,\pi_B = P(\Omega^1 B)P$ one can argue as above (1.16). Yet another description of a connection as a splitting is as follows. Denote $\pi_B(\Omega^1 P)$ by $\Omega^1_B P$ (relative differential forms as in [CQ95, Sect. 2]). The commutativity of the diagram (1.23) implies that the restriction of the canonical map $\check\chi : \Omega^1_B P \to P \otimes H^+$ is an isomorphism. Let $\omega$ be a connection form and $\tilde\omega$ its restriction to $H^+$. Similarly, let $\tilde\tau$ be the restriction to $H^+$ of the translation map. Recall that $\sigma$ is the left $P$-module extension of $\tilde\omega$ [BM93, p. 606]. Hence the commutativity of (1.23) implies also (for any $\tilde\omega$) $\pi_B \circ \tilde\omega = \tilde\tau$. Since the translation map $\tau$ is unital, knowing $\tilde\tau$ is tantamount to knowing $\tau$. Thus a connection form yields an explicit expression for the translation map. On the other hand, viewing $H^+$ as a right comodule under the right adjoint coaction $\mathrm{Ad}_R$ allows one to define equivalently a connection as a colinear lifting of the restricted translation map $\tilde\tau$. Indeed, we can complete the equality $\pi_B \circ \tilde\omega = \tilde\tau$ to the commutative diagram
\[
\begin{array}{ccc}
\Omega^1 P & \xrightarrow{\ \chi\ } & P \otimes H^+\\[2pt]
{\scriptstyle\tilde\omega}\big\uparrow & \searrow{\scriptstyle\pi_B} & \big\uparrow{\scriptstyle\check\chi}\\[2pt]
H^+ & \xrightarrow{\ \tilde\tau\ } & \Omega^1_B P
\end{array}
\tag{1.25}
\]
and directly verify this assertion. This explains the close resemblance between the formulas for the translation maps and connection forms. For example, compare (3.5) with (3.10-3.11) and the proof of [H-PM96, Prop. 2.10] with [H-PM96, 2.14]. Compare also [DHS99, Cor. 2.3] and [BM93, Prop. 5.3].

A natural next step is to consider associated quantum vector bundles. More precisely, what we need here is a replacement of the module of sections of an associated vector bundle. In the classical case such sections can be equivalently described as “functions of type $\rho$” from the total space of a principal bundle to a vector space. We follow this construction in the quantum case by considering $B$-bimodules of colinear maps $\mathrm{Hom}_\rho(V, P)$ associated with an $H$-Galois extension $B \subseteq P$ via a corepresentation $\rho : V \to V \otimes H$ (see [D-M97a, Appendix B] or [D-M96]). Proposition 2.5 gives a formula for a splitting of the multiplication map $B \otimes \mathrm{Hom}_\rho(V, P) \to \mathrm{Hom}_\rho(V, P)$, and a splitting of the multiplication map is almost the same as a projector matrix, for it is an embedding of $\mathrm{Hom}_\rho(V, P)$ in the free $B$-module $B \otimes \mathrm{Hom}_\rho(V, P)$. However, to turn a splitting into a concrete recipe for producing finite size projector matrices of finitely generated projective modules, we need the following general lemma:

Lemma 1.4 ([HM99]). Let $A$ be an algebra and $M$ a projective left $A$-module generated by linearly independent generators $g_1, \ldots, g_n$. Also, let $\{\tilde g_\mu\}_{\mu \in I}$ be a completion of $\{g_1, \ldots, g_n\}$ to a linear basis of $M$, $f_2$ be a left $A$-linear splitting of the multiplication map $A \otimes M \to M$ given by the formula $f_2(g_k) = \sum_{l=1}^{n} a_{kl} \otimes g_l + \sum_{\mu \in I} a_{k\mu} \otimes \tilde g_\mu$, and $c_{\mu l} \in A$ a choice of coefficients such that $\tilde g_\mu = \sum_{l=1}^{n} c_{\mu l}\, g_l$. Then $E_{kl} = a_{kl} + \sum_{\mu \in I} a_{k\mu} c_{\mu l}$ defines a projector matrix of $M$, i.e., $E \in M_n(A)$, $E^2 = E$ and $A^n E$ (row times matrix) and $M$ are isomorphic as left $A$-modules.

For our later purpose, we also need the following general digression (cf. [R-J94, Lemma 1.2.1]).
Let $A$ be an algebra, and let $E$, $F$ be idempotents in $M_m(A)$, $M_n(A)$, respectively. It can be verified that the projective modules $A^m E$ and $A^n F$ are isomorphic if there exist maps $L$ and $\tilde L$,
\[
\begin{array}{ccc}
A^m & \underset{\tilde L}{\overset{L}{\rightleftarrows}} & A^n\\[4pt]
A^m E & \rightleftarrows & A^n F
\end{array}
\tag{1.26}
\]
such that
\[
ELF = LF, \qquad F\tilde L E = \tilde L E, \qquad EL\tilde L = E, \qquad F\tilde L L = F. \tag{1.27}
\]
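The four conditions in (1.27) are easy to test on a toy case. The sketch below is purely illustrative and not taken from the paper: it assumes $A = \mathbb{Z}$, takes $E = \mathrm{diag}(1, 0) \in M_2(A)$ and $F = (1) \in M_1(A)$, and uses hand-picked matrices `L` and `L_tilde` acting on row vectors from the right, in line with the "row times matrix" convention. The helper names `matmul` and `satisfies_1_27` are hypothetical.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def satisfies_1_27(E, F, L, L_tilde):
    """Check ELF = LF, F L~ E = L~ E, E L L~ = E and F L~ L = F."""
    return (
        matmul(matmul(E, L), F) == matmul(L, F)
        and matmul(matmul(F, L_tilde), E) == matmul(L_tilde, E)
        and matmul(matmul(E, L), L_tilde) == E
        and matmul(matmul(F, L_tilde), L) == F
    )


# Illustrative data over A = Z (an assumption made for this example only).
E = [[1, 0], [0, 0]]      # idempotent in M_2(A)
F = [[1]]                 # idempotent in M_1(A)
L = [[1], [0]]            # 2 x 1 matrix: A^2 -> A^1, row vector times matrix
L_tilde = [[1, 0]]        # 1 x 2 matrix: A^1 -> A^2

idempotent = matmul(E, E) == E and matmul(F, F) == F
ok = satisfies_1_27(E, F, L, L_tilde)
bad = satisfies_1_27(E, F, [[0], [1]], L_tilde)
print(idempotent, ok, bad)
```

In this example $A^2 E$ is the set of row vectors $(a, 0)$, which is visibly isomorphic to $A^1 F = A$; replacing `L` by a map that lands in the complement of the image of $E$ breaks the first condition.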
2. Connections We begin this section by considering general connections on Hopf–Galois extensions as appropriate splittings. It is known that, under the assumption of faithful flatness, there always exists a connection on a Hopf–Galois extension [S-P93, Satz 6.3.5] (cf. [D-M97a, Theorem 4.1]). (For a comprehensive review of faithful flatness see [B-N72].) Chasing diagram (1.23) and playing around with appropriate modifications of its rows we obtain: Proposition 2.1. Let B ⊆ P be an H -Galois extension. Denote by C(P ) the space of connection forms on P , by R(P ) the space of unital left P -linear right H -colinear splittings r of the canonical surjection πB : P ⊗ P → P ⊗B P , and by S(P ) the space of unital left B-linear right H -colinear maps s : P → P ⊗ P satisfying (πB ◦ s)(p) = 1 ⊗B p. Then the formulas ω(p(1) ), 6(ω) (p ⊗B p ) = pp ⊗ 1 + pp(0)
˜ 6(r) (h) = (r ◦ τ )(h − ε(h)), (2.1) ˜ 6
6
define mutually inverse bijections C(P ) → R(P ) → C(P ) and, similarly, the formulas 7(r) (p) = r(1 ⊗B p),
˜ 7(s) (p ⊗B p ) = ps(p ). 7
(2.2)
˜ 7
determine mutually inverse bijections R(P ) → S(P ) → R(P ). Proof. Let us first check that 6(C(P )) ⊆ R(P ). It is clear that, for any ω ∈ C(P ), 6(ω) is unital and left P -linear. To see that 6(ω) is H -colinear, we use (1.19) and (1.18): ( P ⊗P ◦ 6(ω)) (p ⊗B p ) = P ⊗P (pp ⊗ 1 + &ω (pdp )) = (p(0) p(0) ⊗ 1 + &ω (p(0) dp(0) )) ⊗ p(1) p(1)
= 6(ω)(p(0) ⊗B p(0) ) ⊗ p(1) p(1)
= (6(ω) ⊗ id) ◦ P ⊗B P (p ⊗B p ).
(2.3)
To verify that 6(ω) is a splitting of the canonical surjection πB , recall that Ker πB = P (1B)P and note that (&ω )2 = &ω entails Ker &ω = (id − &ω )(1P ). Thus, by (1.17), we have Ker πB = (id − &ω )(1P ). Combining this with (1.19) we obtain
(id − πB ◦ 6(ω)) (p ⊗B p ) = πB p ⊗ p − pp ⊗ 1 − &ω (pdp )
(2.4) = πB ◦ (id − &ω ) (pdp ) = 0.
˜ ˜ The next step is to check that 6(R(P )) ⊆ C(P ). To see that 6(r) (H ) ⊆ 1P for any r ∈ R(P ), we take advantage of property (1.6) of the translation map τ , and compute: ˜ m ◦ 6(r) (h) = (m ◦ πB ◦ r ◦ τ ) (h − ε(h)) (2.5) = (m ◦ τ ) (h − ε(h)) = ε(h − ε(h)) = 0. (Here we abuse the notation and denote also by m the multiplication map on P ⊗B P .) ˜ It is immediate that 6(r)(1) = 0. Furthermore, using the colinearity of r and (1.6), ˜ ˜ we verify the colinearity of 6(r): 1 P ◦ 6(r)(h) = (ω ⊗ id) ◦ AdR . To check the fundamental-vector-field condition (χ ◦ r ◦ τ )(h − ε(h)) = 1 ⊗ (h − ε(h))
(2.6)
we note that it is equivalent to the equality (χ ◦ r ◦ χ −1 ) = id, which follows from the commutativity of (1.23). ˜ ◦ 6 = id and 6 ◦ 6 ˜ = id. To this end, taking advantage It remains to show that 6 of the unitality of 6(ω), (1.3) and (1.6), we compute: ˜ ◦ 6)(ω) (h) = (6(ω) ◦ τ )(h − ε(h)) (6 = 6(ω)(h[1] ⊗B h[2] ) − ε(h) ⊗ 1 = ε(h) ⊗ 1 + h[1] h[2] (0) ω(h[2] (1) ) − ε(h) ⊗ 1
(2.7)
= h(1) [1] h(1) [2] ω(h(2) ) = ε(h(1) )ω(h(2) ) = ω(h). Similarly, taking advantage of the unitality and left P -linearity of r, we compute ˜ ˜ (6 ◦ 6)(r) (p ⊗B p ) = pp ⊗ 1 + pp(0) 6(r) (p(1) ) = pp ⊗ 1 + pp(0) (r ◦ τ )(p(1) − ε(p(1) )) )⊗1 = pp ⊗ 1 + r pp(0) τ (p(1) ) − pp(0) ε(p(1) = r χ −1 (χ (p ⊗B p )) = r(p ⊗B p ).
˜ is straightforward. Finally, the proof concerning 7 and 7
(2.8)
Corollary 2.2. An H -Galois extension B ⊆ P admits a connection, if there exists a (not necessarily unital) left B-linear right H -colinear map s : P → P ⊗ P satisfying (πB ◦ s)(p) = 1 ⊗B p. Proof. Denote by S(P ) the space of all maps s defined in the corollary. To construct a “unitalising” map T : S(P ) → S(P ), we need to upgrade the constant correction term 1⊗1−s(1) to a left B-linear right H -colinear function of p whose image is in the kernel of the multiplication map. A very simple way to do so is to replace it by p(1 ⊗ 1 − s(1)). Now, we can define T(s)(p) = s(p) + p(1 ⊗ 1 − s(1)). It is straightforward to check that T (S(P )) ⊆ S(P ), as needed.
(2.9)
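The "straightforward" check can be spelled out as follows. This is a sketch under the conventions of the corollary: $P$ acts on $P \otimes P$ by multiplication on the first factor, $\pi_B$ is left $P$-linear, and $s$ is left $B$-linear with $(\pi_B \circ s)(p) = 1 \otimes_B p$.

```latex
\begin{align*}
T(s)(1) &= s(1) + 1\otimes 1 - s(1) = 1\otimes 1,\\
T(s)(bp) &= b\,s(p) + bp\,(1\otimes 1 - s(1)) = b\,T(s)(p),\\
(\pi_B\circ T(s))(p) &= 1\otimes_B p + p\,\bigl(1\otimes_B 1 - (\pi_B\circ s)(1)\bigr) = 1\otimes_B p .
\end{align*}
```

Right $H$-colinearity is preserved as well: $1$ is coinvariant and $s$ is colinear, so $1 \otimes 1 - s(1)$ is coinvariant and the correction term $p(1 \otimes 1 - s(1))$ transforms like $p$.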
When we think of a connection as an element $s \in S(P)$, then the strongness condition (see Definition 1.3) can be put as $s(P) \subseteq B \otimes P$. (Shift the second term on the right hand side to the left hand side in [M-S97, (11)].) Describing strong connections as strong elements in $S(P)$ is the main point of the theorem below. The second description is in terms of a covariant differential, and was hinted at in [H-PM96, Remark 4.3]. The third one coincides with the definition of a strong connection except that we change the inclusion condition $(\mathrm{id} - \Pi)(dP) \subseteq (\Omega^1 B)P$ to the equivalent equality condition $(\mathrm{id} - \Pi)(B\,dP) = (\Omega^1 B)P$. The last description is precisely the definition of a strong connection form. Proving the equivalence of a strong connection to an appropriate splitting of the multiplication map $B \otimes P \to P$ enables us to derive several desirable consequences. We write everything explicitly so as to provide a self-contained and coherent treatment of the strong connection.

Theorem 2.3. Let $B \subseteq P$ be an $H$-Galois extension. The following are equivalent descriptions of a strong connection:
1) A unital left $B$-linear right $H$-colinear splitting $s$ of the multiplication map $B \otimes P \underset{s}{\overset{m}{\rightleftarrows}} P$.
2) A right $H$-colinear homomorphism $D : P \to (\Omega^1 B)P$ annihilating 1 and satisfying the Leibniz rule: $D(bp) = bDp + db.p$, $\forall\, b \in B$, $p \in P$.
3) A left $P$-linear right $H$-colinear projection $\Pi : \Omega^1 P \to \Omega^1 P$ ($\Pi^2 = \Pi$) such that $(\mathrm{id} - \Pi)(B\,dP) = (\Omega^1 B)P$.
4) A homomorphism $\omega : H \to \Omega^1 P$ vanishing on 1 and satisfying:
   a) $\Delta_{\Omega^1 P} \circ \omega = (\omega \otimes \mathrm{id}) \circ \mathrm{Ad}_R$,
   b) $(m \otimes \mathrm{id}) \circ (\mathrm{id} \otimes \Delta_R) \circ \omega = 1 \otimes (\mathrm{id} - \varepsilon)$,
   c) $dp - p_{(0)}\omega(p_{(1)}) \in (\Omega^1 B)P$, $\forall\, p \in P$.
Proof. Let $V_i$, $i \in \{1, 2, 3, 4\}$, denote the corresponding spaces of homomorphisms defined in points 1)–4). We need to construct 4 mappings
\[
J_1 : V_1 \to V_2, \qquad J_2 : V_2 \to V_3, \qquad J_3 : V_3 \to V_4, \qquad J_4 : V_4 \to V_1, \tag{2.10}
\]
satisfying 4 identities:
\[
J_4 \circ J_3 \circ J_2 \circ J_1 = \mathrm{id} \quad \text{and cyclically permuted versions.} \tag{2.11}
\]
Put $J_1(s)(p) = 1 \otimes p - s(p)$. (Compare with the right-handed version [CQ95, (55)].) Evidently, $J_1(s)$ is a right $H$-colinear homomorphism from $P$ to $(\Omega^1 B)P = \mathrm{Ker}\,(m : B \otimes P \to P)$ (see (1.16)) annihilating 1. As for the Leibniz rule, we have
\[
J_1(s)(bp) = 1 \otimes bp - s(bp) = db.p + b \otimes p - bs(p) = db.p + bJ_1(s)(p). \tag{2.12}
\]
This establishes $J_1$ as a map from $V_1$ to $V_2$. Next, put $J_2(D)(p'\,dp) = p'(d - D)(p)$. Observe first that $J_2(D)$ is a well-defined (left $P$-linear right $H$-colinear) endomorphism of $\Omega^1 P$ because $D1 = 0$ (see (1.10)). Choose $b_i \in B$, $p_i \in P$, such that $Dp = \sum_i (db_i)p_i = \sum_i (d(b_i p_i) - b_i\,dp_i)$. It follows
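from the Leibniz rule that $J_2(D) \circ d$ is a left $B$-module map. Thus we have:
\begin{align}
\bigl(J_2(D)^2 - J_2(D)\bigr)(p'\,dp) &= J_2(D)(p'\,dp) - J_2(D)(p'\,Dp) - J_2(D)(p'\,dp) \notag\\
&= -\sum_i p' J_2(D)\bigl(d(b_i p_i) - b_i\,dp_i\bigr) \notag\\
&= -\sum_i p' (J_2(D) \circ d)(b_i p_i) + \sum_i p' b_i (J_2(D) \circ d)(p_i) \notag\\
&= 0. \tag{2.13}
\end{align}
Hence $J_2(D)$ is a projection. Furthermore, note that for any $b_i \in B$, $p_i \in P$, we have
\[
(\mathrm{id} - J_2(D))\Bigl(\sum_i b_i.dp_i\Bigr) = \sum_i b_i.Dp_i \in (\Omega^1 B)P, \tag{2.14}
\]
i.e., $(\mathrm{id} - J_2(D))(B\,dP) \subseteq (\Omega^1 B)P$. To see the reverse inclusion, take any $\sum_i b_i \otimes p_i \in \mathrm{Ker}\,(B \otimes P \xrightarrow{m} P) = (\Omega^1 B)P$. Then, using the above calculation and the Leibniz rule, we obtain
\begin{align}
0 = \sum_i D(b_i p_i) &= \sum_i b_i\,Dp_i + \sum_i db_i.p_i \notag\\
&= (\mathrm{id} - J_2(D))\Bigl(\sum_i b_i.dp_i\Bigr) - \sum_i b_i \otimes p_i, \tag{2.15}
\end{align}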
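i.e., $\sum_i b_i \otimes p_i \in \mathrm{Im}(\mathrm{id} - J_2(D))$, as needed. To construct $J_3$, note first that $(\Pi \circ d) : P \to \Omega^1 P$ is left $B$-linear. Indeed, since $\Pi^2 = \Pi$, the condition $(\mathrm{id} - \Pi)(B\,dP) = (\Omega^1 B)P$ entails $\Pi((\Omega^1 B)P) = 0$. Consequently
\[
\Pi d(bp) = \Pi(db.p) + \Pi(b\,dp) = b(\Pi \circ d)(p), \tag{2.16}
\]
as claimed. Therefore it makes sense to put $J_3(\Pi)(h) = h^{[1]}\Pi(dh^{[2]})$ (see (1.2)). This formula defines a homomorphism from $H$ to $\Omega^1 P$ vanishing on 1. Furthermore, by the right $H$-colinearity of $\Pi$ and property (1.5) of the translation map, we have
\begin{align}
\bigl(\Delta_{\Omega^1 P} \circ J_3(\Pi)\bigr)(h) &= h^{[1]}{}_{(0)}\,\Pi(dh^{[2]}{}_{(0)}) \otimes h^{[1]}{}_{(1)} h^{[2]}{}_{(1)} \notag\\
&= h_{(2)}{}^{[1]}\,\Pi(dh_{(2)}{}^{[2]}) \otimes S(h_{(1)})h_{(3)} \notag\\
&= ((J_3(\Pi) \otimes \mathrm{id}) \circ \mathrm{Ad}_R)(h). \tag{2.17}
\end{align}
As for the property b), note first that $(\mathrm{id} - \Pi)(B\,dP) = (\Omega^1 B)P$ implies, by the left $P$-linearity of $\Pi$, that $(\mathrm{id} - \Pi)(\Omega^1 P) = P(\Omega^1 B)P$. Secondly, recall that $\chi(P(\Omega^1 B)P) = 0$ (see (1.23)). Hence
\begin{align}
(\chi \circ J_3(\Pi))(h) &= h^{[1]}\bigl((m \otimes \mathrm{id}) \circ (\mathrm{id} \otimes \Delta_R)\bigr)\bigl(\Pi\,dh^{[2]} + (\mathrm{id} - \Pi)dh^{[2]}\bigr) \notag\\
&= h^{[1]}\bigl((m \otimes \mathrm{id}) \circ (\mathrm{id} \otimes \Delta_R)\bigr)(1 \otimes h^{[2]} - h^{[2]} \otimes 1) \notag\\
&= h^{[1]} h^{[2]}{}_{(0)} \otimes h^{[2]}{}_{(1)} - h^{[1]} h^{[2]} \otimes 1 \notag\\
&= 1 \otimes (h - \varepsilon(h)). \tag{2.18}
\end{align}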
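To verify c), we compute:
\[
dp - p_{(0)}\,p_{(1)}{}^{[1]}\,\Pi\bigl(dp_{(1)}{}^{[2]}\bigr) = (\mathrm{id} - \Pi)(dp) \in (\Omega^1 B)P. \tag{2.19}
\]
Consequently, $J_3$ is a mapping from $V_3$ to $V_4$. Finally, put
\[
J_4(\omega)(p) = p \otimes 1 + p_{(0)}\,\omega(p_{(1)}). \tag{2.20}
\]
To see that $J_4(\omega)$ takes values in $B \otimes P$, note that
\begin{align}
p \otimes 1 + p_{(0)}\omega(p_{(1)}) &= p \otimes 1 - 1 \otimes p + 1 \otimes p + p_{(0)}\omega(p_{(1)}) \notag\\
&= 1 \otimes p - \bigl(dp - p_{(0)}\omega(p_{(1)})\bigr) \in B \otimes P \tag{2.21}
\end{align}
by property c) of $\omega$. The right $H$-colinearity of $J_4(\omega)$ follows from property a) of $\omega$. The remaining needed properties of $J_4(\omega)$ are immediate. Consequently, $J_4$ is a mapping from $V_4$ to $V_1$.

To end the proof, we need to show $J_4 \circ J_3 \circ J_2 \circ J_1 = \mathrm{id}$ and its three cyclically permuted versions. We use repeatedly the fact that the translation map $\tau$ provides the inverse of the canonical map $\chi$, so that $p_{(0)}\,p_{(1)}{}^{[1]} \otimes_B p_{(1)}{}^{[2]} = 1 \otimes_B p$ and $h^{[1]} h^{[2]}{}_{(0)} \otimes h^{[2]}{}_{(1)} = 1 \otimes h$.
\begin{align}
(J_4 \circ J_3 \circ J_2 \circ J_1)(s)(p) &= p \otimes 1 + p_{(0)}\,(J_3 \circ J_2 \circ J_1)(s)(p_{(1)}) \notag\\
&= p \otimes 1 + p_{(0)}\,p_{(1)}{}^{[1]}\,(J_2 \circ J_1)(s)\bigl(dp_{(1)}{}^{[2]}\bigr) \notag\\
&= p \otimes 1 + (J_2 \circ J_1)(s)(dp) \notag\\
&= p \otimes 1 + dp - J_1(s)(p) \notag\\
&= 1 \otimes p - 1 \otimes p + s(p) = s(p). \tag{2.22}
\end{align}
\begin{align}
(J_3 \circ J_2 \circ J_1 \circ J_4)(\omega)(h) &= h^{[1]}(J_2 \circ J_1 \circ J_4)(\omega)(dh^{[2]}) \notag\\
&= h^{[1]}\bigl(d - (J_1 \circ J_4)(\omega)\bigr)(h^{[2]}) \notag\\
&= h^{[1]}\bigl(d - 1 \otimes \mathrm{id} + J_4(\omega)\bigr)(h^{[2]}) \notag\\
&= h^{[1]}\bigl(J_4(\omega) - \mathrm{id} \otimes 1\bigr)(h^{[2]}) \notag\\
&= h^{[1]} h^{[2]}{}_{(0)}\,\omega(h^{[2]}{}_{(1)}) = \omega(h). \tag{2.23}
\end{align}
\begin{align}
(J_2 \circ J_1 \circ J_4 \circ J_3)(\Pi)(dp) &= dp - (J_1 \circ J_4 \circ J_3)(\Pi)(p) \notag\\
&= 1 \otimes p - p \otimes 1 - 1 \otimes p + (J_4 \circ J_3)(\Pi)(p) \notag\\
&= -p \otimes 1 + p \otimes 1 + p_{(0)}\,(J_3(\Pi))(p_{(1)}) \notag\\
&= p_{(0)}\,p_{(1)}{}^{[1]}\,\Pi\bigl(dp_{(1)}{}^{[2]}\bigr) = \Pi(dp). \tag{2.24}
\end{align}
\begin{align}
(J_1 \circ J_4 \circ J_3 \circ J_2)(D)(p) &= 1 \otimes p - (J_4 \circ J_3 \circ J_2)(D)(p) \notag\\
&= dp - p_{(0)}\,(J_3 \circ J_2)(D)(p_{(1)}) \notag\\
&= dp - p_{(0)}\,p_{(1)}{}^{[1]}\,J_2(D)\bigl(dp_{(1)}{}^{[2]}\bigr) \notag\\
&= dp - J_2(D)(dp) = Dp. \tag{2.25}
\end{align}
This shows that the maps $J_i$ are bijective.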
Corollary 2.4. If B ⊆ P is an H-Galois extension admitting a strong connection, then
1) P is projective as a left B-module,
2) B is a direct summand of P as a left B-module,
3) P is left faithfully flat over B.

Proof. Let $s : P \to B\otimes P$ be the splitting associated to a strong connection. Due to the unitality of B the multiplication map $B\otimes P\to P$ is surjective. Thus P is a direct summand of $B\otimes P$ via s, and the projectivity of P follows from the freeness of $B\otimes P$. Let $f_B : P\to B$ be a linear map which is identity on B. Then $m\circ(\mathrm{id}\otimes f_B)\circ s$ is a left B-linear map splitting the inclusion B ⊆ P. Hence B is a direct summand of P. Finally, since P is projective it is flat. On the other hand, since P contains B as a direct summand, it is also faithfully flat.

In fact, since s embeds P in $B\otimes P$ colinearly, we can say that P is an H-equivariantly projective left B-module. Next, we translate s to the setting of associated quantum bundles so as to be able to compute their projector matrices with the help of Lemma 1.4.

Proposition 2.5. Let $s : P\to B\otimes P$ be the splitting associated to a strong connection on the H-Galois extension B ⊆ P, and let $\rho : V\to V\otimes H$ be any finite-dimensional corepresentation of H. Denote by $\ell$ the canonical isomorphism $B\otimes\mathrm{Hom}(V,P)\to\mathrm{Hom}(V,B\otimes P)$. Then the formula
\[
s_\rho(\xi)=\ell^{-1}(s\circ\xi)
\tag{2.26}
\]
gives a left B-linear splitting of the multiplication map $B\otimes\mathrm{Hom}_\rho(V,P)\to\mathrm{Hom}_\rho(V,P)$.
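Proof. Note first that, since s is right colinear, $s\circ\mathrm{Hom}_\rho(V,P)\subseteq\mathrm{Hom}_\rho(V,B\otimes P)$. We need to show that $\ell\big(B\otimes\mathrm{Hom}_\rho(V,P)\big)=\mathrm{Hom}_\rho(V,B\otimes P)$, where $\ell(b\otimes\xi)(v)=b\otimes\xi(v)$. For this purpose we can reason as in the proof of [HM99, Prop. 2.3] and construct the following commutative diagram with exact rows:
\[
\begin{array}{ccccccc}
0 & \longrightarrow & B\otimes\mathrm{Hom}_\rho(V,P) & \longrightarrow & B\otimes\mathrm{Hom}(V,P) & \xrightarrow{\ \mathrm{id}\otimes\tilde\rho\ } & B\otimes\mathrm{Hom}(V,P\otimes H)\\
 & & \downarrow{\scriptstyle\ell} & & \downarrow{\scriptstyle\ell} & & \downarrow{\scriptstyle\tilde\ell}\\
0 & \longrightarrow & \mathrm{Hom}_\rho(V,B\otimes P) & \longrightarrow & \mathrm{Hom}(V,B\otimes P) & \xrightarrow{\ \hat\rho\ } & \mathrm{Hom}(V,B\otimes P\otimes H).
\end{array}
\tag{2.27}
\]
Here $\tilde\rho$ is defined by $\tilde\rho(\xi)=(\xi\otimes\mathrm{id})\circ\rho-\Delta_R\circ\xi$, and similarly $\hat\rho$. The map $\tilde\ell$ is the appropriate canonical isomorphism. Completing the diagram to the left with zeroes and applying the Five Isomorphism Lemma shows that the restriction of $\ell$ to $B\otimes\mathrm{Hom}_\rho(V,P)$ is an isomorphism onto $\mathrm{Hom}_\rho(V,B\otimes P)$, as needed. Thus $s_\rho$ is a map from $\mathrm{Hom}_\rho(V,P)$ to $B\otimes\mathrm{Hom}_\rho(V,P)$, as claimed. Explicitly, $\ell^{-1}$ is given by
\[
\ell^{-1}(\varphi)=\sum_i\varphi(e_i)\,e^i=\sum_i\varphi(e_i)^{[-1]}\otimes\varphi(e_i)^{[0]}\,e^i,
\tag{2.28}
\]
where $\{e_i\}$ is a basis of V, $\{e^i\}$ its dual, and we put $\varphi(v)=\varphi(v)^{[-1]}\otimes\varphi(v)^{[0]}$ (summation understood). Similarly, we can write $s_\rho(\xi)=s_\rho(\xi)^{[-1]}\otimes s_\rho(\xi)^{[0]}$. The left B-linearity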
of $s_\rho$ follows from the left B-linearity of s and $\ell$. Finally, $s_\rho$ splits the multiplication map because s splits the multiplication map:
\[
(m\circ s_\rho)(\xi)(v)=s_\rho(\xi)^{[-1]}\,s_\rho(\xi)^{[0]}(v)=m\big(\ell(s_\rho(\xi))(v)\big)=m\big((s\circ\xi)(v)\big)=(m\circ s\circ\xi)(v)=\xi(v).
\tag{2.29}
\]
Applying the standard reasoning as used in the proof of Corollary 2.4, we can infer (under the assumptions of Proposition 2.5) that $\mathrm{Hom}_\rho(V,P)$ is projective as a left B-module. On the other hand, if P is left faithfully flat over B and the antipode of H is bijective, one can prove that $\mathrm{Hom}_\rho(V,P)$ is finitely generated as a left B-module [S-P]. Thus point 3) of Corollary 2.4 leads to the following conclusion (cf. [D-M97a, App. B]):

Corollary 2.6. Let H be a Hopf algebra with a bijective antipode, B ⊆ P an H-Galois extension admitting a strong connection, and $\rho : V\to V\otimes H$ a finite-dimensional corepresentation of H. Then the associated module of colinear maps $\mathrm{Hom}_\rho(V,P)$ is finitely generated projective as a left B-module.

Closely related to the B-bimodule $\mathrm{Hom}_\rho(V,P)$ is the B-bimodule
\[
P_\rho:=\sum_{\varphi\in\mathrm{Hom}_\rho(V,P)}\varphi(V)\subseteq P
\]
(cf. [D-M97a, App. B]). It turns out that such submodules of P are invariant under the splitting associated to a strong connection:

Proposition 2.7. Let s be the splitting associated to a strong connection on an H-Galois extension B ⊆ P. Let $\rho : V\to V\otimes H$ be a finite-dimensional corepresentation of H and $P_\rho:=\sum_{\varphi\in\mathrm{Hom}_\rho(V,P)}\varphi(V)$. Then $s(P_\rho)\subseteq B\otimes P_\rho$.

Proof. If $p\in P_\rho$ then there exist finitely many $\tilde\varphi_\nu\in\mathrm{Hom}_\rho(V,P)$ and $v_\nu\in V$ such that
\[
p=\sum_\nu\tilde\varphi_\nu(v_\nu)=\sum_\nu\sum_{k=1}^{\dim V}v_\nu^k\,\tilde\varphi_\nu(e_k)=\sum_{k=1}^{\dim V}\varphi_k(e_k).
\tag{2.30}
\]
Here $\{e_k\}$ is a basis of V and $\varphi_k:=\sum_\nu v_\nu^k\,\tilde\varphi_\nu$. (Since $v_\nu^k$ are simply the coefficients of $v_\nu$ with respect to $\{e_k\}$, we have $\varphi_k\in\mathrm{Hom}_\rho(V,P)$.) Next, we can always write $s(p)=\sum_\mu f_\mu\otimes(p)_\mu$, where $\{f_\mu\}$ is a linear basis of B. (We have the strongness condition $s(P)\subseteq B\otimes P$.) Since s and $\varphi_k$ are both colinear, so is their composition $s\circ\varphi_k$, and we have
\[
\Delta_{P\otimes P}\big((s\circ\varphi_k)(e_\ell)\big)=\sum_{m=1}^{\dim V}s(\varphi_k(e_m))\otimes u^\rho_{m\ell}=\sum_{m=1}^{\dim V}\sum_\mu f_\mu\otimes(\varphi_k(e_m))_\mu\otimes u^\rho_{m\ell},
\tag{2.31}
\]
where $u^\rho_{m\ell}$ are the matrix elements of the corepresentation ρ. On the other hand, remembering that $s(P)\subseteq B\otimes P$, we have
\[
\Delta_{P\otimes P}\big((s\circ\varphi_k)(e_\ell)\big)=\sum_\mu f_\mu\otimes\Delta_R\big((\varphi_k(e_\ell))_\mu\big).
\tag{2.32}
\]
Combining the above two equalities and using the linear independence of $f_\mu$, we obtain
\[
\Delta_R\big((\varphi_k(e_\ell))_\mu\big)=\sum_{m=1}^{\dim V}(\varphi_k(e_m))_\mu\otimes u^\rho_{m\ell}.
\tag{2.33}
\]
Hence we can define a bi-index family of ρ-colinear maps by the equality $\varphi_{k\mu}(e_\ell)=(\varphi_k(e_\ell))_\mu$. Consequently, due to (2.30), we have
\[
s(p)=s\Big(\sum_{k=1}^{\dim V}\varphi_k(e_k)\Big)=\sum_{k=1}^{\dim V}\sum_\mu f_\mu\otimes(\varphi_k(e_k))_\mu=\sum_\mu f_\mu\otimes\sum_{k=1}^{\dim V}\varphi_{k\mu}(e_k)\in B\otimes P_\rho,
\tag{2.34}
\]
as claimed.

Remark 2.8. Just as we define $J_1$ in the proof of Theorem 2.3, we can define the covariant derivative on $\mathrm{Hom}_\rho(V,P)$ via the formula
\[
\nabla:\mathrm{Hom}_\rho(V,P)\to\Omega^1B\otimes_B\mathrm{Hom}_\rho(V,P),\qquad\nabla\xi=1\otimes\xi-s_\rho(\xi).
\tag{2.35}
\]
Using identifications in Theorem 2.3 (isomorphisms $J_i$), one can check that (2.35) agrees with [HM99, (2.2)].

We now proceed to establishing a link between strong connections and Cuntz–Quillen connections on bimodules [CQ95, p. 283]. Let C be a coalgebra and $N_1$, $N_2$ right C-comodules. Denote by $A:=\mathrm{Hom}(C,k)$ the algebra dual to C. Then $N_1$ and $N_2$ enjoy the following natural left A-module structure (e.g., see [M-S93, Sect. 1.6]):
\[
A\otimes N_i\ni a\otimes n\longmapsto n_{(0)}\,a(n_{(1)})\in N_i,\qquad i\in\{1,2\}.
\tag{2.36}
\]
With respect to this structure, any k-homomorphism from $N_1$ to $N_2$ is right C-colinear if and only if it is right $A^{op}$-linear. Thus, for an H-Galois extension B ⊆ P, the algebra P is a $(B,(H^*)^{op})$-bimodule, where $H^*:=\mathrm{Hom}(H,k)$ is the algebra dual to H considered as a coalgebra. By Theorem 2.3 (Point 2), a strong connection can be given as a right $(H^*)^{op}$-linear map $D:P\to\Omega^1B\otimes_BP$ (see (1.16)) satisfying the left Leibniz rule and vanishing on 1. Therefore it seems natural to generalize the concept of a left bimodule connection [CQ95, p. 284] to

Definition 2.9. Let N be an $(A_1,A_2)$-bimodule. We say that $\nabla_L:N\to\Omega^1A_1\otimes_{A_1}N$ is a left bimodule connection iff it is right $A_2$-linear and satisfies the left Leibniz rule: $\nabla_L(an)=a\nabla_L(n)+da\otimes_{A_1}n$, $\forall\,a\in A_1,\ n\in N$.

We can now say that a strong connection on an H-Galois extension B ⊆ P is a left $(B,(H^*)^{op})$-bimodule connection on P vanishing on 1. In an analogous way, we can define a right bimodule connection $\nabla_R$. Then we can put them together and, in the spirit of [CQ95, p. 284], define a bimodule connection as:
Definition 2.10. Let N be an $(A_1,A_2)$-bimodule and $\nabla_L$ and $\nabla_R$ a left and right bimodule connection, respectively. We call a pair $(\nabla_L,\nabla_R)$ a bimodule connection on N.

Reasoning precisely as in [CQ95], one can show that an $(A_1,A_2)$-bimodule N admits a bimodule connection if and only if it is projective as a bimodule (i.e., as a module over $A_1\otimes A_2^{op}$). In a similar fashion, one can see that strong connections correspond to equivariant connections in the algebraic-geometry setting [R-D98, (20)].

Remark 2.11. Within the framework of the Hopf–Galois theory the right coaction $\mathrm{id}\otimes\Delta_R:B\otimes P\to B\otimes P\otimes H$ and the restriction $\Delta_{B\otimes P}$ of the diagonal coaction $\Delta_{P\otimes P}$ (1.8) coincide. Therefore one can use either of them to define the colinearity of a splitting s of the multiplication map $B\otimes P\to P$. In the general setting of C-Galois extensions [BH99, Def. 2.3], the diagonal coaction $\Delta_{P\otimes P}:P\otimes P\to P\otimes P\otimes C$ (coinciding with (1.8) for Hopf–Galois extensions) can be defined by the formula $\Delta_{P\otimes P}=(\mathrm{id}\otimes\psi)\circ(\Delta_R\otimes\mathrm{id})$ [BM98a, Prop. 2.2], where $\psi:C\otimes P\to P\otimes C$ is an entwining structure and $\Delta_R:P\to P\otimes C$, $\Delta_R(p)=p_{(0)}\otimes p_{(1)}$, a coaction (see [BM00, Sect. 3] for details). If B is the subalgebra of P of C-coinvariants, i.e., $B=\{b\in P\mid\Delta_R(bp)=b\,\Delta_R(p),\ \forall\,p\in P\}$, then
\[
\Delta_{P\otimes P}(b\otimes p)=(\mathrm{id}\otimes\psi)(\Delta_R(b1)\otimes p)=(\mathrm{id}\otimes\psi)(b1_{(0)}\otimes 1_{(1)}\otimes p)=b1_{(0)}\otimes\psi(1_{(1)}\otimes p).
\tag{2.37}
\]
On the other hand, if B ⊆ P is C-Galois and ψ is its canonical entwining structure [BH99, (2.5)], then, by [BH99, Theorem 2.7], P is a (P, C, ψ)-module [B-T99], so that we have $\Delta_R(p'p)=p'_{(0)}\,\psi(p'_{(1)}\otimes p)$. In particular, $\Delta_R(p)=1_{(0)}\,\psi(1_{(1)}\otimes p)$. Hence
\[
(\mathrm{id}\otimes\Delta_R)(b\otimes p)=b\otimes 1_{(0)}\,\psi(1_{(1)}\otimes p).
\tag{2.38}
\]
Therefore we need to distinguish between $\Delta_{B\otimes P}$ and $\mathrm{id}\otimes\Delta_R$ in the C-Galois case.¹ If we define a strong connection on a C-Galois extension B ⊆ P as a unital left B-linear right C-colinear (with respect to $\Delta_{B\otimes P}$) splitting of the multiplication map $B\otimes P\to P$, then such a strong connection yields a connection in the sense of [BM00, Def. 3.5]. Indeed, let s be such a splitting, and $\Pi^s(r\,dp):=r(s(p)-p\otimes 1)$. One can see that this formula gives a well-defined left P-linear endomorphism of $\Omega^1P$. Furthermore, by the left B-linearity of s, for any $\sum_i db_i.p_i\in(\Omega^1B)P$, we have:
\[
\Pi^s\Big(\sum_i db_i.p_i\Big)=\sum_i\Pi^s\big(d(b_ip_i)-b_i\,dp_i\big)=\sum_i\big(s(b_ip_i)-b_ip_i\otimes 1-b_i(s(p_i)-p_i\otimes 1)\big)=0.
\tag{2.39}
\]
Hence $P(\Omega^1B)P\subseteq\mathrm{Ker}\,\Pi^s$ by the left P-linearity of $\Pi^s$. On the other hand, since $m\circ s=\mathrm{id}$ and $s(P)\subseteq B\otimes P$, we have $\pi_B(s(p))=1\otimes_Bp$, where $\pi_B:P\otimes P\to P\otimes_BP$ is the canonical surjection. Consequently,
\[
\pi_B\big(\Pi^s(p'\,dp)\big)=\pi_B\big(p'(s(p)-p\otimes 1)\big)=p'\,\pi_B(s(p))-p'p\otimes_B1=p'\otimes_Bp-p'p\otimes_B1=\pi_B(p'\,dp).
\tag{2.40}
\]

¹ We are grateful to T. Brzeziński for suggesting to us this way of arguing.
Therefore, since $P(\Omega^1B)P=\mathrm{Ker}\,\pi_B$ (see below (1.23)), we obtain $\mathrm{Ker}\,\Pi^s\subseteq P(\Omega^1B)P$. Thus $\mathrm{Ker}\,\Pi^s=P(\Omega^1B)P$. Next, take any $p\in P$. It follows from $s(P)\subseteq B\otimes P$ that
\[
dp-\Pi^s(dp)=1\otimes p-p\otimes 1-s(p)+p\otimes 1=1\otimes p-s(p)\in B\otimes P.
\tag{2.41}
\]
Since also $m(1\otimes p-s(p))=0$, we have $dp-\Pi^s(dp)\in(\Omega^1B)P\subseteq\mathrm{Ker}\,\Pi^s$. By the left P-linearity of $\Pi^s$ we can conclude now that $\Pi^s\circ(\mathrm{id}-\Pi^s)=0$, i.e., $(\Pi^s)^2=\Pi^s$. It remains to show that $\Delta_{P\otimes P}\circ\Pi^s\circ d=((\Pi^s\circ d)\otimes\mathrm{id})\circ\Delta_R$. The property $\psi(c\otimes 1)=1\otimes c$ entails
\[
\Delta_{P\otimes P}(p\otimes 1)=p_{(0)}\otimes\psi(p_{(1)}\otimes 1)=p_{(0)}\otimes 1\otimes p_{(1)}.
\tag{2.42}
\]
Therefore
\[
\Delta_{P\otimes P}(\Pi^s(dp))=\Delta_{P\otimes P}(s(p))-\Delta_{P\otimes P}(p\otimes 1)=s(p_{(0)})\otimes p_{(1)}-p_{(0)}\otimes 1\otimes p_{(1)}=\big((\Pi^s\circ d)\otimes\mathrm{id}\big)(\Delta_R(p))
\tag{2.43}
\]
by the colinearity of s. Consequently $\Pi^s$ is a connection, as claimed.

To exemplify Proposition 2.1 and Theorem 2.3, let us translate the strong and non-strong connection forms on quantum projective space $\mathbb{R}P_q^2$ [H-PM96, Ex. 2.8] to the language of splittings.

Example 2.12 (Quantum projective space $\mathbb{R}P_q^2$). First let us recall how to define the coordinate algebra $A(S^2_{q,\infty})$ of the equator quantum sphere of Podleś [P-P87]. To this end, we modify the convention in [H-PM96] by replacing q by $q^{-1}$ and rewriting the generators as follows:
\[
x=x_{11},\qquad y=x_{12},\qquad z=\frac{\sqrt{2(1+q^4)}}{1+q^2}\,x_{13}.
\tag{2.44}
\]
Now we can define $A(S^2_{q,\infty})$ as $\mathbb{C}\langle x,y,z\rangle/I_{q,\infty}$, where $\mathbb{C}\langle x,y,z\rangle$ is the (unital) free algebra generated by x, y, z and $I_{q,\infty}$ is the two-sided ideal generated by
\[
\begin{gathered}
x^2+y^2+z^2-1,\qquad xy-yx-i\,\frac{q^4-1}{q^4+1}\,z^2,\\
xz-\frac{q^2+q^{-2}}{2}\,zx-i\,\frac{q^{-2}-q^2}{2}\,zy,\qquad yz-\frac{q^2+q^{-2}}{2}\,zy-i\,\frac{q^2-q^{-2}}{2}\,zx.
\end{gathered}
\tag{2.45}
\]
To make $A(S^2_{q,\infty})$ into a $\mathrm{Map}(\mathbb{Z}_2,\mathbb{C})$-comodule algebra we use the formulas (see above Sect. 6 in [P-P87] for the related quantum-sphere automorphisms)
\[
\Delta_R(x)=x\otimes\gamma,\qquad\Delta_R(y)=y\otimes\gamma,\qquad\Delta_R(z)=z\otimes\gamma,
\tag{2.46}
\]
where $\gamma(\pm1)=\pm1$. The coordinate ring of quantum projective space $\mathbb{R}P_q^2$ is then defined as the $\mathrm{Map}(\mathbb{Z}_2,\mathbb{C})$-coinvariant subalgebra of $A(S^2_{q,\infty})$. (The algebra $A(\mathbb{R}P_q^2)$ is the subalgebra of $A(S^2_{q,\infty})$ generated by the monomials of even degree.) The extension $A(\mathbb{R}P_q^2)\subseteq A(S^2_{q,\infty})$ is a $\mathrm{Map}(\mathbb{Z}_2,\mathbb{C})$-Galois extension which is not cleft. The non-cleftness can be proved by reasoning exactly as in [HM99, Appendix]. Indeed, since 1 and γ are linearly independent group-likes ($\Delta c=c\otimes c$), a cleaving map $\Phi:\mathrm{Map}(\mathbb{Z}_2,\mathbb{C})\to A(S^2_{q,\infty})$ would have to map them to linearly independent (injectivity of Φ) invertible (convolution invertibility of Φ) elements in $A(S^2_{q,\infty})$. But this is impossible as $A(S^2_{q,\infty})\subseteq A(SL_q(2))$, and the only invertible elements in $A(SL_q(2))$ are non-zero numbers [HM99, App.]. Translating the formula [H-PM96, Prop. 2.14] for a strong connection to our setting, we have
\[
\omega(\gamma)=x\,dx+y\,dy+z\,dz=x\otimes x+y\otimes y+z\otimes z-1\otimes 1.
\tag{2.47}
\]
The splitting corresponding to ω is then, due to its unitality and left $A(\mathbb{R}P_q^2)$-linearity, determined by
\[
s(x)=xt,\quad s(y)=yt,\quad s(z)=zt,\qquad\text{where } t=x\otimes x+y\otimes y+z\otimes z.
\tag{2.48}
\]
Thus one can directly see that the image of s is in $A(\mathbb{R}P_q^2)\otimes A(S^2_{q,\infty})$. Next, consider a non-strong connection $\tilde\omega(\gamma)=\omega(\gamma)-2\,dx^2$ [H-PM96, Prop. 2.15]. Again, we compute the corresponding splitting:
\[
\tilde s(x)=s(x)-2x\,dx^2,\qquad\tilde s(y)=s(y)-2y\,dx^2,\qquad\tilde s(z)=s(z)-2z\,dx^2.
\tag{2.49}
\]
As in the proof of [H-PM96, Prop. 2.15], we can invoke the representation theory contained in [P-P87] to conclude that $x\,dx^2\neq0$. Consequently,
\[
\big((\mathrm{id}\otimes\mathrm{flip})\circ(\Delta_R\otimes\mathrm{id}-\mathrm{id}\otimes1\otimes\mathrm{id})\big)(x\,dx^2)=x\,dx^2\otimes(\gamma-1)\neq0.
\tag{2.50}
\]
Hence the image of $\tilde s$ is not in $A(\mathbb{R}P_q^2)\otimes A(S^2_{q,\infty})$.
Remark 2.13. Let x, y, z be as above. Since $x^2+y^2+z^2=1$ (which was the reason for rescaling the generators) and, with respect to the star structure inherited from $SU_q(2)$, we have $x^*=x$, $y^*=y$, $z^*=z$, we can treat the generators x, y, z as the Cartesian coordinates of $S^2_{q,\infty}$. (See [HMS, Sect. 2] for the Cartesian coordinates for all Podleś spheres.) Having this in mind, we take the idempotent $F=(x,y,z)^T(x,y,z)\in M_3(A(S^2_{q,\infty}))$ (here $T$ stands for the matrix transpose) and define the projective module of the normal bundle of $S^2_{q,\infty}$ as $A(S^2_{q,\infty})^3F$. Therefore, one can define the projective module of the tangent bundle of the equator Podleś quantum sphere as $A(S^2_{q,\infty})^3(I_3-F)$, where $I_3$ is the identity matrix in $M_3(A(S^2_{q,\infty}))$.

Let us now consider strong connections on principal homogeneous Hopf–Galois extensions, i.e., P/I-Galois extensions given by a Hopf ideal I in a Hopf algebra P. Here the coaction is given by the formula $\Delta_R=(\mathrm{id}\otimes\pi_I)\circ\Delta$, where $\pi_I$ is the canonical surjection $P\to P/I$. For such extensions, it is known (e.g., see [DHS99, Theorem 2.1]) that if $B=P^{co\,P/I}$ then $I=B^+P$, where $B^+=\mathrm{Ker}\,\varepsilon\cap B$. If s is the splitting associated to a strong connection, then, due to the left B-linearity of s,
\[
s(B^+P)=B^+s(P)\subseteq B^+B\otimes P=B^+\otimes P.
\tag{2.51}
\]
Hence s descends to a splitting i of the canonical surjection $P\to P/(B^+P)$:
\[
\begin{array}{ccc}
P & \underset{m}{\overset{s}{\rightleftarrows}} & B\otimes P\\
\downarrow & & \downarrow\\
H=P/(B^+P) & \overset{i}{\rightleftarrows} & (B\otimes P)/(B^+\otimes P)=P.
\end{array}
\tag{2.52}
\]
Explicitly, we have $i(\bar p)=((\varepsilon\otimes\mathrm{id})\circ s)(p)$, where $\bar p:=\pi_I(p)$. (The map is well-defined because of (2.51).) Put $s(p)=s(p)^{[0]}\otimes s(p)^{[1]}$ (summation understood). Then, as $m\circ s=\mathrm{id}$ and, for $b\in B$, $p\in P$, $\varepsilon(b)p=bp$ mod $B^+P$, we have
\[
(\pi_I\circ i)(\bar p)=\pi_I\big(\varepsilon(s(p)^{[0]})\,s(p)^{[1]}\big)=\pi_I\big(s(p)^{[0]}s(p)^{[1]}\big)=(\pi_I\circ m\circ s)(p)=\bar p.
\tag{2.53}
\]
Furthermore, since s is unital, so is i. The right colinearity of i follows from the strongness ($s(P)\subseteq B\otimes P$) and the right colinearity of s:
\[
\begin{aligned}
(\Delta_R\circ i)(\bar p)&=\varepsilon\big(s(p)^{[0]}\big)\,s(p)^{[1]}{}_{(0)}\otimes s(p)^{[1]}{}_{(1)}=\big((\varepsilon\otimes\mathrm{id}\otimes\mathrm{id})\circ\Delta_{P\otimes P}\circ s\big)(p)\\
&=((\varepsilon\otimes\mathrm{id})\circ s)(p_{(1)})\otimes\overline{p_{(2)}}=i\big(\overline{p_{(1)}}\big)\otimes\overline{p_{(2)}}=i(\bar p_{(1)})\otimes\bar p_{(2)}.
\end{aligned}
\tag{2.54}
\]
Thus one can associate to any strong connection on a principal homogeneous Hopf–Galois extension a total integral of Doi [D-Y85] (unital right colinear map $H\to P$). Recall that total integrals always exist on faithfully flat Hopf–Galois extensions ([S-HJ90a, Theorem 1], [D-Y85, (1.6)], [S-HJ90a, Remark 3.3]). This is in agreement with Point 3 of Corollary 2.4, although we claim there only the left faithful flatness, and faithfully flat Hopf–Galois extensions B ⊆ P are defined as Hopf–Galois extensions such that P is B-faithfully-flat on both sides. Note also that we could equally well proceed as in [BM98b, Prop. 3.6] and define i via a connection form. If s is the splitting associated to a connection form ω, i.e., $s=J_4(\omega)$ (see (2.20)), then
\[
\begin{aligned}
i(\bar p)&=((\varepsilon\otimes\mathrm{id})\circ J_4(\omega))(p)=(\varepsilon\otimes\mathrm{id})\big(p\otimes1+p_{(1)}\,\omega(\bar p_{(2)})\big)\\
&=\varepsilon(p)\otimes1+\varepsilon(p_{(1)})\,\varepsilon\big(\omega(\bar p_{(2)})^{(1)}\big)\otimes\omega(\bar p_{(2)})^{(2)}\\
&=\varepsilon_H(\bar p)\otimes1+\varepsilon(p_{(1)})\,(\varepsilon\otimes\mathrm{id})\big(\omega(\bar p_{(2)})\big)\\
&=\varepsilon_H(\bar p)\otimes1+((\varepsilon\otimes\mathrm{id})\circ\omega)(\bar p),
\end{aligned}
\tag{2.55}
\]
where $\omega(h)=\omega(h)^{(1)}\otimes\omega(h)^{(2)}$, summation understood, and $\varepsilon_H$ denotes the counit on H. (See [BM98b, Prop. 3.6] for this kind of splitting in the case of non-universal calculus.) If i is also left colinear, then, by [HM99, Prop. 2.4], the formula $\omega=(S*d)\circ i$ associates to i a strong connection. (Such connections are called canonical strong connections.) It turns out that applying the above described way of associating a total integral to a strong connection in the canonical case is simply solving the equation $\omega=(S*d)\circ i$ for i. Indeed, since $\omega(h)=S\,i(h)_{(1)}\,d\,i(h)_{(2)}$ and $\varepsilon(i(\bar p))=\varepsilon(p)$ (because $i(\bar p)-p\in\mathrm{Ker}\,\pi_I\subseteq\mathrm{Ker}\,\varepsilon$), we have
\[
J_4(\omega)(p)=p\otimes1+p_{(1)}\,\omega(\bar p_{(2)})=p_{(1)}\,S\,i(\bar p_{(2)})_{(1)}\otimes i(\bar p_{(2)})_{(2)}.
\tag{2.56}
\]
Applying $\varepsilon\otimes\mathrm{id}$ yields
\[
i(\bar p)=((\varepsilon\otimes\mathrm{id})\circ J_4(\omega))(p),
\tag{2.57}
\]
as claimed.

Example 2.14 (Quantum and classical Hopf fibration). The above described formalism applies to the quantum Hopf fibration. We refer to [HM99] for the computation of projector matrices of the quantum Hopf line bundles from the Dirac q-monopole connection [BM93], and to [H-PM00] for the computation of the Chern–Connes pairing of these matrices with the cyclic cocycle (trace) of [MNW91, (4.4)]. (This pairing yields numbers called “Chern numbers” or “charges”. See Proposition 3.7 for the freeness of the direct sum of charge −1 and charge 1 quantum Hopf line bundles, cf. [DS94, (4.2)] for a local description of such bundles.) Here we only remark that this quantum principal fibration admits infinitely many canonical strong connections. Indeed, for the injective antipode (which is the case here), [HM99, Cor. 2.6] classifies the canonical strong connections by unital bicolinear splittings. On the other hand, by [MMNNU91, p. 363], all unital bicolinear splittings $i:\mathbb{C}[z,z^{-1}]\to A(SL_q(2))$ are of the form:
\[
i(z^n)=(1+\zeta\,p_n(\zeta))\,\alpha^n,\qquad i(z^{-n})=(1+\zeta\,r_n(\zeta))\,\delta^n,
\tag{2.58}
\]
where $p_n$, $r_n$ are arbitrary polynomials in $\zeta:=-q^{-1}\beta\gamma$. (Here, α, β, γ, δ are the generators of $A(SL_q(2))$ as in [HM99].) Since the equality $\omega=(S*d)\circ i$ can be solved for i (see (2.57)), different splittings yield different connections. Hence there are infinitely many connections. However, for q = 1, after passing to the de Rham forms, all the canonical strong connections coincide with the classical Dirac monopole. More precisely, let $\pi_{DR}$ be the canonical projection from the universal onto the de Rham differential calculus and $i_0$ be the splitting corresponding to the Dirac monopole (i.e., given by (2.58) with $p_n=0=r_n$ for all n). Then
\[
\pi_{DR}\circ(S*d)\circ i=\pi_{DR}\circ(S*d)\circ i_0\quad\text{for all } i.
\tag{2.59}
\]
Indeed, we have
\[
\big((S*d_{DR})\circ(i_0-i)\big)(z)=(S*d_{DR})\big(\beta\gamma\,p_1(-\beta\gamma)\,\alpha\big).
\tag{2.60}
\]
Furthermore, using the commutativity of functions with forms and functions, and the Leibniz rule, we obtain
\[
\begin{aligned}
(S*d_{DR})(hh')&=S(h_{(1)})\,S(h'_{(1)})\,d_{DR}(h_{(2)}h'_{(2)})\\
&=S(h'_{(1)})h'_{(2)}\,S(h_{(1)})\,d_{DR}h_{(2)}+S(h_{(1)})h_{(2)}\,S(h'_{(1)})\,d_{DR}h'_{(2)}\\
&=\varepsilon(h')\,S(h_{(1)})\,d_{DR}h_{(2)}+\varepsilon(h)\,S(h'_{(1)})\,d_{DR}h'_{(2)}.
\end{aligned}
\tag{2.61}
\]
Substituting $h=\beta$ and $h'=\gamma\,p_1(-\beta\gamma)\,\alpha$, and noting that $\varepsilon(\beta)=0$ and $\varepsilon(\gamma\,p_1(-\beta\gamma)\,\alpha)=0$, one can conclude that $(S*d_{DR})(i(z))=(S*d_{DR})(i_0(z))$. On the other hand, for any connection form ω we have
\[
(\pi_{DR}\circ\omega)(uu')=(\pi_{DR}\circ\omega)(u)\,\varepsilon(u')+\varepsilon(u)\,(\pi_{DR}\circ\omega)(u').
\tag{2.62}
\]
Therefore $(S*d_{DR})\circ i$ and $(S*d_{DR})\circ i_0$ coincide on any power of z, whence are equal, as claimed.
3. Chern–Connes Pairing for the Super Hopf Fibration

The super Hopf fibration leading to the super sphere has an interesting history. To the best of our knowledge, it was first introduced by Landi and Marmo [LM87]. They treated supersymmetric abelian gauge fields in general and worked out details for the super group UOSP(1, 2). Everything was formulated within the Grassmann envelope of the super algebra uosp(1, 2). The super manifold theory has been used in the work of Teofilatto [T-P88]. He defines and studies super Riemann surfaces. As the simplest example, he treated the super sphere with $S^2$ as its body. Ideas of noncommutative geometry were used in [GKP96, GKP97] to introduce an ultraviolet regularization for quantum fields defined on $S^2$. The fuzzy sphere [M-J92] was introduced through suitable embeddings of the algebra of N × N matrices. In [GKP96], similar embeddings of modules led to approximation of sections of line bundles over $S^2$. Also in [GKP96], there is a study of fermions and supersymmetric extensions of the fuzzy sphere. An extensive treatment of the approximation of super-graded functions over the super sphere, and sections of a bundle through sequences of graded modules, as well as the treatment of the graded de Rham complex, is given in [GR98]. The description of the monopole on the super sphere that we provide can be related to that given in [BBL90]. A detailed study of the super monopole using the super-geometry approach can be found in [L-G01b].

Our approach here to the super Hopf fibration is purely algebraic. First, we show that the super Hopf fibration can be considered as an H-Galois extension $A(S^2_s)\subseteq A(S^3_s)$, where $H=\mathbb{C}[z,z^{-1}]$ is the Hopf algebra generated by an invertible group-like element z. The polynomial algebras $A(S^3_s)$ and $A(S^2_s)$ are taken as nilpotent extensions (by two Grassmann variables $\lambda_\pm$) of the (complex) coordinate rings of the 3-dimensional sphere $S^3$ and 2-dimensional sphere $S^2$, respectively (see [GKP96]).
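This is summed up in the following commutative diagram with exact columns (but not rows):
\[
\begin{array}{ccccc}
A(S^2_s)\cap\langle\lambda_\pm\rangle & \longrightarrow & \langle\lambda_\pm\rangle & & \\
\downarrow & & \downarrow & & \\
A(S^2_s) & \longrightarrow & A(S^3_s) & \longrightarrow & H\\
{\scriptstyle\wp}\downarrow & & \downarrow & & \Vert\\
A(S^2) & \longrightarrow & A(S^3) & \longrightarrow & H
\end{array}
\tag{3.1}
\]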
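Thus, in a sense, the super Hopf fibration can be viewed as a Grassmann covering of the classical (complex) Hopf fibration.

Definition 3.1. Let $R=\mathbb{C}[a,b,c,d]$ be the polynomial ring in four variables. Put $D=ad-bc$. Let I be the two-sided ideal in the (unital) free algebra $R\langle\lambda_+,\lambda_-\rangle$ generated by
\[
\lambda_+^2,\qquad\lambda_-^2,\qquad\lambda_+\lambda_-+\lambda_-\lambda_+,\qquad\lambda_+\lambda_-+D-1.
\tag{3.2}
\]
We call the quotient algebra $A(S^3_s):=R\langle\lambda_+,\lambda_-\rangle/I$ the coordinate ring of the 3-dimensional super sphere $S^3_s$.

It can be easily verified that the (matrix) formula
\[
\Delta_R\begin{pmatrix}a&b\\c&d\\\lambda_+&\lambda_-\end{pmatrix}
=\begin{pmatrix}a\otimes1&b\otimes1\\c\otimes1&d\otimes1\\\lambda_+\otimes1&\lambda_-\otimes1\end{pmatrix}
\begin{pmatrix}1\otimes z&0\\0&1\otimes z^{-1}\end{pmatrix}
\tag{3.3}
\]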
defines a coaction $\Delta_R:A(S^3_s)\to A(S^3_s)\otimes H$ making $A(S^3_s)$ a right H-comodule algebra.

Lemma 3.2. Let $A(S^2_s):=\{a\in A(S^3_s)\mid\Delta_R(a)=a\otimes1\}$ be the algebra of H-coinvariants. Then $A(S^2_s)$ is the subalgebra of $A(S^3_s)$ generated by
\[
1,\ ab,\ bc,\ cd,\ \lambda_+b,\ \lambda_+d,\ \lambda_-a,\ \lambda_-c,\ \lambda_+\lambda_-.
\tag{3.4}
\]
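The coinvariance of the generators in (3.4) can be checked mechanically: under the coaction (3.3) each of a, c, λ+ picks up a factor z and each of b, d, λ− a factor z⁻¹, and since the coaction is an algebra map these exponents add along a word. The following is a minimal sketch of that bookkeeping. The function name `z_degree` and the letter encoding (`L+`, `L-` for λ±) are illustrative assumptions, not notation from the text.

```python
# Exponent of z attached to each generator by the coaction (3.3).
# "L+" and "L-" are hypothetical stand-ins for lambda_+ and lambda_-.
Z_WEIGHT = {"a": 1, "c": 1, "L+": 1, "b": -1, "d": -1, "L-": -1}


def z_degree(word):
    """Return k such that the coaction sends the monomial `word` to word (x) z^k."""
    return sum(Z_WEIGHT[letter] for letter in word)


# The non-unit generators listed in (3.4), written as tuples of letters.
GENERATORS = [
    ("a", "b"), ("b", "c"), ("c", "d"),
    ("L+", "b"), ("L+", "d"), ("L-", "a"), ("L-", "c"), ("L+", "L-"),
]

# True when every listed generator is coinvariant (degree 0).
coinvariant = all(z_degree(g) == 0 for g in GENERATORS)
```

The same function confirms that the product ad, although absent from the list (3.4), also has degree 0, whereas a word such as aac has degree 3 and so lies outside the coinvariant subalgebra.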
Proof. Evidently, the algebra generated by (3.4) is contained in $A(S^2_s)$. For the opposite inclusion, note first that every element a of $A(S^3_s)$ can be written as a linear combination of (coefficient-free) monomials $m_{k,\ell}$ such that $\Delta_R(m_{k,\ell})=m_{k,\ell}\otimes z^k$. Since the powers of z form a basis of H, if $a\in A(S^2_s)$, then a must be a linear combination of non-zero monomials $m_{0,\ell}$. On the other hand, any $0\neq m_{0,\ell}\neq1$ is a word composed of the same number of letters coming from the alphabet $\{a,c,\lambda_+\}$ and the alphabet $\{b,d,\lambda_-\}$. Furthermore, since all letters commute or anti-commute, we can always pair the letters coming from different alphabets. (The pair ad, which is not listed in (3.4), equals $1+bc-\lambda_+\lambda_-$ by (3.2).) Hence $m_{0,\ell}$ can be expressed in terms of (3.4), as needed.

Proposition 3.3. The extension of algebras $A(S^2_s)\subseteq A(S^3_s)$ is H-Galois.

Proof. Define the map $\tau:H\to A(S^3_s)\otimes_{A(S^2_s)}A(S^3_s)$ by the formulas ($n\in\mathbb{N}$):
\[
\begin{aligned}
\tau(z^n)&=(1+n\lambda_+\lambda_-)\sum_{k=0}^n\binom{n}{k}\,d^{\,n-k}(-b)^k\otimes_{A(S^2_s)}a^{n-k}c^k,\\
\tau(z^{-n})&=(1+n\lambda_+\lambda_-)\sum_{k=0}^n\binom{n}{k}\,a^{n-k}(-c)^k\otimes_{A(S^2_s)}d^{\,n-k}b^k.
\end{aligned}
\tag{3.5}
\]
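We are going to prove that $\tilde\chi:=(m\otimes\mathrm{id})\circ(\mathrm{id}\otimes\tau)$ is the inverse of the canonical map χ. (This means that τ is the translation map.) Since χ and $\tilde\chi$ are both left $A(S^3_s)$-linear maps by construction, it suffices to check $\chi\circ\tilde\chi=\mathrm{id}$ and $\tilde\chi\circ\chi=\mathrm{id}$ on elements of the form $1\otimes h$ and $1\otimes_{A(S^2_s)}p$, respectively. To verify the first identity, we recall that any $h\in H$ is a linear combination of $z^{\pm n}$, $n\in\mathbb{N}$, and compute:
\[
\begin{aligned}
(\chi\circ\tilde\chi)(1\otimes z^n)
&=(1+n\lambda_+\lambda_-)\sum_{k=0}^n\binom{n}{k}\,d^{\,n-k}(-b)^k\,\chi\big(1\otimes_{A(S^2_s)}a^{n-k}c^k\big)\\
&=(1+n\lambda_+\lambda_-)\sum_{k=0}^n\binom{n}{k}\,d^{\,n-k}(-b)^k\,a^{n-k}c^k\otimes z^n\\
&=(1+n\lambda_+\lambda_-)(ad-bc)^n\otimes z^n\\
&=(1+n\lambda_+\lambda_-)(1-\lambda_+\lambda_-)^n\otimes z^n\\
&=(1+n\lambda_+\lambda_-)(1-n\lambda_+\lambda_-)\otimes z^n=1\otimes z^n.
\end{aligned}
\tag{3.6}
\]
In the fourth equality we used the determinant relation $\lambda_+\lambda_-+ad-bc=1$. Similarly, we obtain:
\[
(\chi\circ\tilde\chi)(1\otimes z^{-n})=(1+n\lambda_+\lambda_-)\sum_{k=0}^n\binom{n}{k}\,a^{n-k}(-c)^k\,\chi\big(1\otimes_{A(S^2_s)}d^{\,n-k}b^k\big)=1\otimes z^{-n}.
\tag{3.7}
\]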
Thus $\chi\circ\tilde\chi=\mathrm{id}$. For the other identity, we note that it is sufficient to check it on the monomials $m_{\pm n}$ (as in the proof of Lemma 3.2 but with the second index suppressed). Since $\Delta_R(m_{\pm n})=m_{\pm n}\otimes z^{\pm n}$, we have $m_n\,d^{\,n-k}b^k\in A(S^2_s)$ and $m_{-n}\,a^{n-k}c^k\in A(S^2_s)$. Hence, using the centrality of $\lambda_+\lambda_-\in A(S^2_s)$, we can compute:
\[
\begin{aligned}
(\tilde\chi\circ\chi)(1\otimes_{A(S^2_s)}m_n)
&=m_n\,\tilde\chi(1\otimes z^n)\\
&=m_n(1+n\lambda_+\lambda_-)\sum_{k=0}^n\binom{n}{k}\,d^{\,n-k}(-b)^k\otimes_{A(S^2_s)}a^{n-k}c^k\\
&=(1+n\lambda_+\lambda_-)\sum_{k=0}^n\binom{n}{k}\,m_n\,d^{\,n-k}(-b)^k\otimes_{A(S^2_s)}a^{n-k}c^k\\
&=1\otimes_{A(S^2_s)}(1+n\lambda_+\lambda_-)\sum_{k=0}^n\binom{n}{k}\,m_n\,d^{\,n-k}(-b)^k\,a^{n-k}c^k\\
&=1\otimes_{A(S^2_s)}m_n(1+n\lambda_+\lambda_-)(ad-bc)^n=1\otimes_{A(S^2_s)}m_n.
\end{aligned}
\tag{3.8}
\]
Here the last step is as in the previous calculation. Similarly, we get:
\[
(\tilde\chi\circ\chi)(1\otimes_{A(S^2_s)}m_{-n})=(1+n\lambda_+\lambda_-)\sum_{k=0}^n\binom{n}{k}\,m_{-n}\,a^{n-k}(-c)^k\otimes_{A(S^2_s)}d^{\,n-k}b^k=1\otimes_{A(S^2_s)}m_{-n}.
\tag{3.9}
\]
Therefore $\tilde\chi$ is the inverse of χ, and the extension is H-Galois.
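The cancellation used in (3.6) and (3.8) only involves the even central element θ := λ+λ−, which satisfies θ² = 0. As a sanity check, the sketch below evaluates the coefficient (1 + nθ) Σ_k C(n,k) d^(n−k) (−b)^k a^(n−k) c^k in the commutative ring Z[θ]/(θ²), with a, b, c, d replaced by sample values chosen so that ad − bc = 1 − θ. The class `Dual`, the helper `galois_coefficient` and the sample values are assumptions made for illustration; this tests one specialization and does not replace the proof.

```python
from math import comb


class Dual:
    """Element re + nil*theta of Z[theta]/(theta^2)."""

    def __init__(self, re, nil=0):
        self.re, self.nil = re, nil

    @staticmethod
    def lift(x):
        """Promote a plain integer to a Dual; leave Dual values unchanged."""
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.re + other.re, self.nil + other.nil)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # theta^2 = 0, so the nil*nil term is dropped.
        return Dual(self.re * other.re, self.re * other.nil + self.nil * other.re)

    __rmul__ = __mul__

    def __pow__(self, n):
        result = Dual(1)
        for _ in range(n):
            result = result * self
        return result

    def __eq__(self, other):
        other = Dual.lift(other)
        return (self.re, self.nil) == (other.re, other.nil)


def galois_coefficient(n, a, b, c, d, theta):
    """(1 + n*theta) * sum_k C(n,k) d^(n-k) (-b)^k a^(n-k) c^k."""
    total = Dual(0)
    for k in range(n + 1):
        total = total + comb(n, k) * d ** (n - k) * (-1 * b) ** k * a ** (n - k) * c ** k
    return (1 + n * theta) * total


theta = Dual(0, 1)
# Sample values with a*d - b*c = 1 - theta (hypothetical, chosen for the check).
a, b, c, d = Dual(7, -1), Dual(2), Dual(3), Dual(1)
```

With these values `galois_coefficient(n, a, b, c, d, theta)` reduces to (1 + nθ)(1 − θ)ⁿ, which is the scalar factor appearing in front of 1 ⊗ zⁿ in (3.6).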
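We use the idea of colinear lifting (1.25) to construct a connection form. We consider this connection as the (universal-calculus) super Dirac monopole. Since it is strong, we can conclude that the extension $A(S^2_s)\subseteq A(S^3_s)$ enjoys all properties itemized in Corollary 2.4.

Proposition 3.4. Let $\omega:H\to\Omega^1A(S^3_s)$ be the linear map defined by ($n\in\mathbb{N}$)
\[
\omega(z^n)=(1+n\lambda_+\lambda_-)\sum_{k=0}^n\binom{n}{k}\,d^{\,n-k}(-b)^k\,\mathrm{d}(a^{n-k}c^k),
\tag{3.10}
\]
\[
\omega(z^{-n})=(1+n\lambda_+\lambda_-)\sum_{k=0}^n\binom{n}{k}\,a^{n-k}(-c)^k\,\mathrm{d}(d^{\,n-k}b^k).
\tag{3.11}
\]
Then ω is a strong connection form. (Here the upright $\mathrm{d}$ denotes the universal differential, as opposed to the generator d.)

Proof. Note first that $\omega(1)=0$ and $\Delta_{\Omega^1P}\,\omega(z^{\pm n})=\omega(z^{\pm n})\otimes1$. Furthermore, taking advantage of (3.5) and the determinant relation $ad-bc=1-\lambda_+\lambda_-$, we have:
\[
\big((m\otimes\mathrm{id})\circ(\mathrm{id}\otimes\Delta_R)\circ\omega\big)(z^{\pm n})=\chi\big(\tau(z^{\pm n})-1\otimes_{A(S^2_s)}1\big)=1\otimes\big(z^{\pm n}-\varepsilon(z^{\pm n})\big).
\tag{3.12}
\]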
This proves that ω is a connection form. It remains to check the strongness condition. By linearity, it suffices to do it on monomials $m_{\pm n}$ (see the proof of Proposition 3.3). Putting
\[
\begin{aligned}
p_k^+&=(1+n\lambda_+\lambda_-)\binom{n}{k}\,d^{\,n-k}(-b)^k,&\qquad q_k^+&=a^{n-k}c^k,\\
p_k^-&=(1+n\lambda_+\lambda_-)\binom{n}{k}\,a^{n-k}(-c)^k,&\qquad q_k^-&=d^{\,n-k}b^k,
\end{aligned}
\tag{3.13}
\]
and applying the Leibniz rule and using again the determinant relation $ad-bc=1-\lambda_+\lambda_-$, we obtain:
\[
\begin{aligned}
\mathrm{d}m_{\pm n}-m_{\pm n}\,\omega(z^{\pm n})
&=\mathrm{d}m_{\pm n}-\sum_{k=0}^nm_{\pm n}\,p_k^\pm\,\mathrm{d}q_k^\pm\\
&=\mathrm{d}m_{\pm n}-\mathrm{d}\Big(m_{\pm n}\sum_{k=0}^np_k^\pm q_k^\pm\Big)+\sum_{k=0}^n\mathrm{d}(m_{\pm n}p_k^\pm).q_k^\pm\\
&=\sum_{k=0}^n\mathrm{d}(m_{\pm n}p_k^\pm).q_k^\pm\in\big(\Omega^1A(S^2_s)\big)A(S^3_s).
\end{aligned}
\tag{3.14}
\]
Consequently, for any $a\in A(S^3_s)$, we have $(\mathrm{id}-\Pi^\omega)(\mathrm{d}a)\in(\Omega^1A(S^2_s))A(S^3_s)$, i.e., ω is strong.

Our next step is to consider super Hopf line bundles. Precisely as in the case of quantum Hopf line bundles [HM99, Def. 3.1], since we are dealing with one-dimensional corepresentations of H ($\rho_\mu(1)=1\otimes z^{-\mu}$), we can identify the colinear maps $\xi:\mathbb{C}\to A(S^3_s)$ with their values at 1 ($\eta(\xi):=\xi(1)$), and define them as the following bimodules over $A(S^2_s)$:
\[
A(S^3_s)_\mu:=\{a\in A(S^3_s)\mid\Delta_Ra=a\otimes z^{-\mu}\},\qquad\mu\in\mathbb{Z}.
\tag{3.15}
\]
Reasoning in a similar manner as in the proof of Lemma 3.2, one can see that (n > 0)
\[
A(S^3_s)_{-n}=\sum_{k=0}^nA(S^2_s)\,a^{n-k}c^k+\sum_{k=0}^{n-1}A(S^2_s)\,a^{n-1-k}c^k\lambda_+,
\tag{3.16}
\]
\[
A(S^3_s)_n=\sum_{k=0}^nA(S^2_s)\,d^{\,n-k}b^k+\sum_{k=0}^{n-1}A(S^2_s)\,d^{\,n-1-k}b^k\lambda_-.
\tag{3.17}
\]
Note that, since the powers of z form a basis of H, we have the direct sum decomposition $A(S^3_s)=\bigoplus_{\mu\in\mathbb{Z}}A(S^3_s)_\mu$ as $A(S^2_s)$-bimodules. Observe also that the bimodules $A(S^3_s)_\mu$ provide examples of bimodules $P_\rho$ defined in Proposition 2.7 (cf. [D-M97a, App. B]). Our goal is to compute projector matrices of these modules and their pairing with the appropriate cyclic cocycle on $A(S^2_s)$. The strategy for computing the projector matrices is to use the splitting associated to the super Dirac monopole (Proposition 3.4) and Lemma 1.4. To apply the aforementioned lemma, first we need to show that the monomials occurring in formula (3.16) are linearly independent, and that the same holds for the monomials in (3.17).
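Lemma 3.5.
\[
\sum_{k=0}^n\alpha_k\,a^{n-k}c^k+\sum_{\ell=0}^{n-1}\beta_\ell\,a^{n-1-\ell}c^\ell\lambda_+=0\ \Longrightarrow\ \alpha_k=0=\beta_\ell,\quad\forall\,k,\ell;
\tag{3.18}
\]
\[
\sum_{k=0}^n\alpha_k\,d^{\,n-k}b^k+\sum_{\ell=0}^{n-1}\beta_\ell\,d^{\,n-1-\ell}b^\ell\lambda_-=0\ \Longrightarrow\ \alpha_k=0=\beta_\ell,\quad\forall\,k,\ell.
\tag{3.19}
\]

Proof. Let $\tilde R=\mathbb{C}[\tilde a,\tilde b,\tilde c,\tilde d]/\langle\tilde a\tilde d-\tilde b\tilde c-1\rangle$ denote the coordinate ring of $SL(2,\mathbb{C})$ and $\mathbb{C}[\lambda]/\langle\lambda^2\rangle$ be the algebra of dual numbers. We have the following homomorphism of algebras:
\[
\pi:A(S^3_s)\longrightarrow\tilde R\otimes\mathbb{C}[\lambda]/\langle\lambda^2\rangle,\quad
\pi(a)=\tilde a\otimes1,\ \pi(b)=\tilde b\otimes1,\ \pi(c)=\tilde c\otimes1,\ \pi(d)=\tilde d\otimes1,\ \pi(\lambda_\pm)=1\otimes\lambda.
\tag{3.20}
\]
(It is well defined because the generators (3.2) of the ideal I are mapped to $1\otimes\lambda^2=0$, $1\otimes2\lambda^2=0$ and $1\otimes\lambda^2+(\tilde a\tilde d-\tilde b\tilde c-1)\otimes1=0$.) Applying π to the first equality in (3.18) yields
\[
\sum_{k=0}^n\alpha_k\,\tilde a^{n-k}\tilde c^k\otimes1+\sum_{\ell=0}^{n-1}\beta_\ell\,\tilde a^{n-1-\ell}\tilde c^\ell\otimes\lambda=0.
\tag{3.21}
\]
Since the monomials $\tilde a^{n-k}\tilde c^k$ are part of the PBW basis of $\tilde R$, they are linearly independent. Hence $\alpha_k=0=\beta_\ell$, $\forall\,k,\ell$, by the linear independence of 1 and λ. The second implication can be proved in the same way.

Note now that the above described identification η allows one to identify $s_{\rho_\mu}$ of (2.26) with the restriction of s to $A(S^3_s)_\mu$ (see Proposition 2.7):
\[
A(S^3_s)_\mu\ni\tilde\xi\longmapsto\big((\mathrm{id}\otimes\eta)\circ s_{\rho_\mu}\circ\eta^{-1}\big)(\tilde\xi)\in A(S^2_s)\otimes A(S^3_s)_\mu,
\tag{3.22}
\]
\[
\big((\mathrm{id}\otimes\eta)\circ s_{\rho_\mu}\circ\eta^{-1}\big)(\tilde\xi)=\ell\big(s_{\rho_\mu}(\eta^{-1}(\tilde\xi))\big)(1)=\big(s\circ\eta^{-1}(\tilde\xi)\big)(1)=s(\tilde\xi).
\tag{3.23}
\]
On the other hand, remembering the formula for the universal differential and using again the fact that $(ad-bc)^n=1-n\lambda_+\lambda_-$, we can write (3.10) in the following form:
\[
\omega(z^n)=(1+n\lambda_+\lambda_-)\sum_{\ell=0}^n\binom{n}{\ell}\,d^{\,n-\ell}(-b)^\ell\otimes a^{n-\ell}c^\ell-1\otimes1.
\tag{3.24}
\]
Substituting this to (2.20), we obtain
\[
\begin{aligned}
s(a^{n-k}c^k)&=a^{n-k}c^k\otimes1+a^{n-k}c^k\,\omega(z^n)
=a^{n-k}c^k(1+n\lambda_+\lambda_-)\sum_{\ell=0}^n\binom{n}{\ell}\,d^{\,n-\ell}(-b)^\ell\otimes a^{n-\ell}c^\ell,\\
s(a^{n-1-k}c^k\lambda_+)&=a^{n-1-k}c^k\lambda_+\otimes1+a^{n-1-k}c^k\lambda_+\,\omega(z^n)
=a^{n-1-k}c^k\lambda_+\sum_{\ell=0}^n\binom{n}{\ell}\,d^{\,n-\ell}(-b)^\ell\otimes a^{n-\ell}c^\ell.
\end{aligned}
\tag{3.25}
\]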
Similarly, substituting
\[
\omega(z^{-n})=(1+n\lambda_+\lambda_-)\sum_{\ell=0}^n\binom{n}{\ell}\,a^{n-\ell}(-c)^\ell\otimes d^{\,n-\ell}b^\ell-1\otimes1
\tag{3.26}
\]
to (2.20), we get
\[
\begin{aligned}
s(d^{\,n-k}b^k)&=d^{\,n-k}b^k(1+n\lambda_+\lambda_-)\sum_{\ell=0}^n\binom{n}{\ell}\,a^{n-\ell}(-c)^\ell\otimes d^{\,n-\ell}b^\ell,\\
s(d^{\,n-1-k}b^k\lambda_-)&=d^{\,n-1-k}b^k\lambda_-\sum_{\ell=0}^n\binom{n}{\ell}\,a^{n-\ell}(-c)^\ell\otimes d^{\,n-\ell}b^\ell.
\end{aligned}
\tag{3.27}
\]
Hence, by Lemma 3.5 and Lemma 1.4, we can conclude that $A(S^3_s)_{-n}\cong A(S^2_s)^{2n+1}E_{-n}$ as left $A(S^2_s)$-modules, where $E_{-n}=P_{-n}Q_{-n}^T$ (symbol $T$ stands for the matrix transpose) with
\[
\begin{aligned}
P_{-n}^T&:=(1+n\lambda_+\lambda_-)\times\big(a^n,\cdots,a^{n-k}c^k,\cdots,c^n,\ a^{n-1}\lambda_+,\cdots,a^{n-1-k}c^k\lambda_+,\cdots,c^{n-1}\lambda_+\big),\\
Q_{-n}^T&:=\Big(d^{\,n},\cdots,\tbinom{n}{\ell}\,d^{\,n-\ell}(-b)^\ell,\cdots,(-b)^n,\ 0,\cdots,0\Big).
\end{aligned}
\tag{3.28}
\]
In an analogous manner, we infer that $A(S^3_s)_n\cong A(S^2_s)^{2n+1}E_n$ as left $A(S^2_s)$-modules, where $E_n=P_nQ_n^T$ with
\[
\begin{aligned}
P_n^T&:=(1+n\lambda_+\lambda_-)\times\big(d^{\,n},\cdots,d^{\,n-k}b^k,\cdots,b^n,\ d^{\,n-1}\lambda_-,\cdots,d^{\,n-1-k}b^k\lambda_-,\cdots,b^{n-1}\lambda_-\big),\\
Q_n^T&:=\Big(a^n,\cdots,\tbinom{n}{\ell}\,a^{n-\ell}(-c)^\ell,\cdots,(-c)^n,\ 0,\cdots,0\Big).
\end{aligned}
\tag{3.29}
\]
To show the non-freeness of the above projective modules, we determine the Chern–Connes pairing between their classes in $K_0(A(S^2_s))$ and the cyclic cocycle on $A(S^2_s)$ obtained by the pull-back $\wp^*$ (see (3.1)) of the cyclic 2-cocycle $c_2$ on $A(S^2)$ given by the integration on $S^2$. We have:
\[
\langle\wp^*(c_2),[E_{\pm n}]\rangle=\langle c_2,[\wp_*E_{\pm n}]\rangle=\pm n.
\tag{3.30}
\]
Here the last equality follows from the fact that the matrix $(\wp_*E_{\pm n})_{i,j}:=\wp\big((E_{\pm n})_{i,j}\big)$ is a projector matrix of the classical Hopf line bundle with the Chern number equal to ±n. Furthermore, since every free module can be represented in $K_0$ by the identity matrix, the Chern number of any free $A(S^2_s)$-module always vanishes. (The Chern number of a trivial bundle is zero.) Thus the left modules $A(S^3_s)_\mu$, $\mu\neq0$, are not (stably) free. Also, they are pairwise non-isomorphic. Now, reasoning as in [HM99, Sect. 4], we obtain:

Corollary 3.6. The H-Galois extension $A(S^2_s)\subseteq A(S^3_s)$ (super Hopf fibration) is not cleft.
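Setting the Grassmann generators to zero turns E₋ₙ of (3.28) into a classical matrix whose non-zero block has entries P_k Q_l, with P_k = a^(n−k) c^k and Q_l = C(n,l) d^(n−l) (−b)^l; the rows and columns coming from the λ+ entries vanish. The sketch below evaluates this block at a single integer point with ad − bc = 1 and checks that QᵀP = 1, so that the block is idempotent of trace one. The function names and the sample point are assumptions chosen for illustration, and the check does not compute the Chern number in (3.30).

```python
from math import comb


def body_projector(n, a, b, c, d):
    """Return (E, pairing): E[k][l] = P_k * Q_l for the body of E_{-n}, pairing = Q^T P."""
    P = [a ** (n - k) * c ** k for k in range(n + 1)]
    Q = [comb(n, l) * d ** (n - l) * (-b) ** l for l in range(n + 1)]
    E = [[p * q for q in Q] for p in P]
    pairing = sum(p * q for p, q in zip(P, Q))
    return E, pairing


def matmul(X, Y):
    """Product of two square integer matrices given as lists of rows."""
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


# Hypothetical sample point with a*d - b*c = 1.
a, b, c, d = 2, 3, 1, 2
E, pairing = body_projector(3, a, b, c, d)
```

Since E = P Qᵀ, idempotency follows from E² = P (QᵀP) Qᵀ, which is why the single scalar `pairing` controls the whole check.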
Let us remark that projectors $E_{\pm n}$ are not hermitian with respect to the involution
\[
a^*=d,\qquad b^*=-c,\qquad c^*=-b,\qquad d^*=a,\qquad\lambda_\pm^*=-\lambda_\mp.
\tag{3.31}
\]
Nevertheless, one can slightly modify $E_{\pm n}$ to find hermitian projectors $F_{\pm n}=F_{\pm n}^\dagger$ such that the modules $A(S^2_s)^{2n+1}E_{\pm n}$ and $A(S^2_s)^{2n+1}F_{\pm n}$ are isomorphic. They are given by the formulas $F_{\pm n}=U_{\pm n}U_{\pm n}^\dagger$, where (n > 0)
\[
\begin{aligned}
U_{-n}^T&:=\Big(1+\tfrac{n-1}{2}\lambda_+\lambda_-\Big)\Big(a^n,\cdots,\tbinom{n}{k}^{1/2}a^{n-k}c^k,\cdots,c^n,\ a^{n-1}\lambda_+,\cdots,\tbinom{n-1}{k}^{1/2}a^{n-1-k}c^k\lambda_+,\cdots,c^{n-1}\lambda_+\Big),\\
U_n^T&:=\Big(1+\tfrac{n+1}{2}\lambda_+\lambda_-\Big)\Big(d^{\,n},\cdots,\tbinom{n}{k}^{1/2}d^{\,n-k}b^k,\cdots,b^n,\ d^{\,n-1}\lambda_-,\cdots,\tbinom{n-1}{k}^{1/2}d^{\,n-1-k}b^k\lambda_-,\cdots,b^{n-1}\lambda_-\Big).
\end{aligned}
\tag{3.32}
\]
The matrices $F_{\pm n}$ are hermitian by construction. To check that they are idempotent, we compute:
\[
\begin{aligned}
U_{-n}^\dagger U_{-n}
&=\Big(1+2\,\tfrac{n-1}{2}\lambda_+\lambda_-\Big)\sum_{k=0}^n\binom{n}{k}(ad)^{n-k}(-bc)^k-\lambda_-\lambda_+\sum_{k=0}^{n-1}\binom{n-1}{k}(ad)^{n-1-k}(-bc)^k\\
&=(1+(n-1)\lambda_+\lambda_-)(ad-bc)^n+\lambda_+\lambda_-(ad-bc)^{n-1}\\
&=(1+(n-1)\lambda_+\lambda_-)(1-n\lambda_+\lambda_-)+\lambda_+\lambda_-(1-(n-1)\lambda_+\lambda_-)=1.
\end{aligned}
\tag{3.33}
\]
In the same manner, we check $U_n^\dagger U_n=1$. It remains to verify that the projective modules $A(S^2_s)^{2n+1}E_{\pm n}$ and $A(S^2_s)^{2n+1}F_{\pm n}$ are isomorphic. For this purpose, we use (1.26) and take as $L$, $\tilde L$ the matrices $L_{\pm n}:=P_{\pm n}U_{\pm n}^\dagger$, $\tilde L_{\pm n}:=U_{\pm n}Q_{\pm n}^T\in M_{2n+1}(A(S^2_s))$, respectively. A calculation similar to (3.33) shows that $Q_{\pm n}^TP_{\pm n}=1$. This together with (3.33) and $U_n^\dagger U_n=1$ implies that $L_{\pm n}$ and $\tilde L_{\pm n}$ satisfy (1.27). (Note that $L_{\pm n}\tilde L_{\pm n}=E_{\pm n}$ and $\tilde L_{\pm n}L_{\pm n}=F_{\pm n}$.) Thus the modules $A(S^2_s)^{2n+1}E_{\pm n}$ and $A(S^2_s)^{2n+1}F_{\pm n}$ are isomorphic, as claimed.

This hermitian presentation of the projective modules $A(S^3_s)_{\pm n}$, n > 0, agrees with [L-G01a, (3.25)] for the projectors of the classical Hopf line bundles, and resembles the appropriate formulas obtained in [L-G01b, Sect. 4.2]. (The case n = 0 is trivial.) Finally, we want to show that

Proposition 3.7. $A(S^3_s)_{-1}\oplus A(S^3_s)_1\cong A(S^2_s)^2$ as left $A(S^2_s)$-modules.
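Proof. We can infer from the preceding considerations that the matrix $\mathrm{diag}(F_{-1}, F_1)$ is a projector matrix of $A(S^3_s)_{-1} \oplus A(S^3_s)_1$. First, it turns out to be technically convenient to conjugate $F_1$ by
$$M := \begin{pmatrix} 0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 1 \end{pmatrix}. \tag{3.34}$$
Then $\tilde F_1 := M F_1 M$ is evidently equivalent (i.e., giving an isomorphic projective module) to $F_1$ (just take $L = \tilde L = M$ in (1.27)), whence $\mathrm{diag}(F_{-1}, F_1)$ is equivalent to $\mathrm{diag}(F_{-1}, \tilde F_1)$. (Note that this way we have $F_{-1}\tilde F_1 = 0 = \tilde F_1 F_{-1}$.) To prove the proposition we employ (1.26–1.27), and put $F = \mathrm{diag}(F_{-1}, \tilde F_1)$ and $E = \mathrm{diag}(1, 1)$. The point is to find $L$, $\tilde L$ satisfying (1.27). Since
$$F_{-1} = (a, c, \lambda_+)^T (d, -b, -\lambda_-) \quad\text{and}\quad \tilde F_1 = (1 + 2\lambda_+\lambda_-)(b, d, \lambda_-)^T (-c, a, -\lambda_+), \tag{3.35}$$
we look for $\tilde L$ of the form
$$\tilde L = \begin{pmatrix} f_- \\ f_+ \end{pmatrix}, \qquad f_- = \begin{pmatrix} a \\ c \\ \lambda_+ \end{pmatrix} (u_+, v_+), \qquad f_+ = \begin{pmatrix} b \\ d \\ \lambda_- \end{pmatrix} (u_-, v_-), \tag{3.36}$$
and for $L$ of the form
$$L = (g_-, g_+), \qquad g_- = \begin{pmatrix} x_- \\ y_- \end{pmatrix} (d, -b, -\lambda_-), \qquad g_+ = \begin{pmatrix} x_+ \\ y_+ \end{pmatrix} (-c, a, -\lambda_+). \tag{3.37}$$
Here to ensure that $\tilde L \in M_{6\times 2}(A(S^2_s))$ and $L \in M_{2\times 6}(A(S^2_s))$ we take $u_+, v_+, x_+, y_+ \in A(S^3_s)_1$ and $u_-, v_-, x_-, y_- \in A(S^3_s)_{-1}$. Using the super determinant relation $ad - bc + \lambda_+\lambda_- = 1$, one can verify that
$$u_+ = d, \qquad v_+ = -b, \qquad x_+ = (1 + 3\lambda_+\lambda_-)b, \qquad y_+ = (1 + 3\lambda_+\lambda_-)d, \tag{3.38}$$
$$u_- = -c, \qquad v_- = a, \qquad x_- = (1 + \lambda_+\lambda_-)a, \qquad y_- = (1 + \lambda_+\lambda_-)c, \tag{3.39}$$
is a solution of (1.27), as needed.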
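As an illustration, one half of the verification can be written out explicitly. The sketch below assumes, as the note following (3.33) suggests, that (1.27) requires $L\tilde L = E$; it also assumes that $a, b, c, d$ are even and commute, that $\lambda_\pm$ are odd and anticommute, and hence that $(\lambda_+\lambda_-)^2 = 0$. It checks only $L\tilde L = \mathrm{diag}(1,1)$ for the values (3.38)–(3.39), not the remaining conditions.

```latex
\begin{align*}
(d,\,-b,\,-\lambda_-)\,(a,\,c,\,\lambda_+)^T
  &= da - bc - \lambda_-\lambda_+ = ad - bc + \lambda_+\lambda_- = 1,\\
(-c,\,a,\,-\lambda_+)\,(b,\,d,\,\lambda_-)^T
  &= ad - bc - \lambda_+\lambda_- = 1 - 2\lambda_+\lambda_-,\\
L\tilde L = g_-f_- + g_+f_+
  &= (1+\lambda_+\lambda_-)\begin{pmatrix} a\\ c\end{pmatrix}(d,\,-b)
   + (1-2\lambda_+\lambda_-)(1+3\lambda_+\lambda_-)\begin{pmatrix} b\\ d\end{pmatrix}(-c,\,a)\\
  &= (1+\lambda_+\lambda_-)\begin{pmatrix} ad-bc & 0\\ 0 & ad-bc\end{pmatrix}
   = (1+\lambda_+\lambda_-)(1-\lambda_+\lambda_-)\begin{pmatrix} 1 & 0\\ 0 & 1\end{pmatrix}
   = \begin{pmatrix} 1 & 0\\ 0 & 1\end{pmatrix}.
\end{align*}
```

The factor $(1 - 2\lambda_+\lambda_-)$ produced by the second pairing is what the coefficient $(1 + 3\lambda_+\lambda_-)$ in $x_+$, $y_+$ compensates for.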
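By analogy with the classical situation, we call $A(S^3_s)_{-1}$ and $A(S^3_s)_1$ the super-spin-bundle modules. Proposition 3.7 is a super version of the fact that the module of Dirac spinors, i.e., the direct sum of the spin-bundle modules, is free both for the classical and quantum sphere [LPS]. In fact, the freeness of the module $P_{-1} \oplus P_1$ [HM99, p. 257] of Dirac spinors on the quantum sphere can be shown by precisely the same method as in the super-sphere case. It suffices to take in the proof of Proposition 3.7
$$M = \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}, \qquad F_{-1} = (\alpha, \gamma)^T (\delta, -q\beta), \qquad F_1 = (\delta, \beta)^T (\alpha, -q^{-1}\gamma),$$
$$f_- = \begin{pmatrix} \alpha \\ \gamma \end{pmatrix} (\delta, -q\beta), \qquad f_+ = \begin{pmatrix} \beta \\ \delta \end{pmatrix} (-q^{-1}\gamma, \alpha), \tag{3.40}$$
$$g_- = \begin{pmatrix} \alpha \\ \gamma \end{pmatrix} (\delta, -q\beta), \qquad g_+ = \begin{pmatrix} \beta \\ \delta \end{pmatrix} (-q^{-1}\gamma, \alpha), \tag{3.41}$$
where $\alpha, \beta, \gamma, \delta$ are the generators of $A(SL_q(2))$ as in [HM99].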
4. Appendix: Gauge Transformations

We follow here the definition of a gauge transformation used in [H-PM96]. For an $H$-Galois extension $B \subseteq P$, it is defined as a unital convolution-invertible homomorphism $f : H \to P$ satisfying $\Delta_R \circ f = (f \otimes \mathrm{id}) \circ \mathrm{Ad}_R$. We treat this definition as the first approximation of an appropriate concept of gauge transformations on Hopf–Galois extensions (see [D-Mb] and the paragraph above Proposition 3.4 in [H-PM96], cf. [D-M97a, Sects. 6.1–6.2], [D-M97b], [B-T96, Sect. 5]). It turns out that the space of strong connections is closed under the action of gauge transformations [H-PM96, Prop. 3.7]. The following theorem describes this action.

Theorem 4.1. Let $B \subseteq P$ be an $H$-Galois extension admitting a strong connection. The following describes a left action of gauge transformations on strong connections which is compatible with the identifications of Theorem 2.3:
1) $(f \triangleright s)(p) := s\bigl(p_{(0)} f(p_{(1)})\bigr) f^{-1}(p_{(2)})$,
2) $(f \triangleright D)(p) := D\bigl(p_{(0)} f(p_{(1)})\bigr) f^{-1}(p_{(2)})$,
3) $(f \triangleright \Pi)(r\,dp) := r\,\Pi\bigl(d(p_{(0)} f(p_{(1)}))\bigr) f^{-1}(p_{(2)}) + r p_{(0)} f(p_{(1)})\,df^{-1}(p_{(2)})$,
4) $(f \triangleright \omega)(h) := f(h_{(1)})\,\omega(h_{(2)})\,f^{-1}(h_{(3)}) + f(h_{(1)})\,df^{-1}(h_{(2)})$.

Proof. We need to study the following diagrams:
$$\begin{array}{ccc}
GT(P) \times V_i & \xrightarrow{\ \alpha^i\ } & V_i\\
\downarrow {\scriptstyle \mathrm{id} \times J_{ij}} & & \downarrow {\scriptstyle J_{ij}}\\
GT(P) \times V_j & \xrightarrow{\ \alpha^j\ } & V_j
\end{array} \tag{4.1}$$
Here the $\alpha^i$'s are the corresponding left actions specified above and the $J_{ij}$'s, $i, j \in \{1, 2, 3, 4\}$, are obtained in an obvious way by composing suitable bijections $J_i$ introduced in the proof of Theorem 2.3. We know that $\alpha^4$ is a well-defined left action [H-PM96, Prop. 3.4]. It suffices to show that
$$\alpha^i = J_{4i} \circ \alpha^4 \circ (\mathrm{id} \times J_{i4}) \quad\text{for } i \in \{1, 2, 3\}. \tag{4.2}$$
For $i = 3$ it is proved in [H-PM96, Prop. 3.5]. For $i = 2$, we have
$$\begin{aligned}
\bigl(J_{42}(f \triangleright J_{24}(D))\bigr)(p) &= \bigl((J_{12} \circ J_{41})(f \triangleright (J_{34} \circ J_{23})(D))\bigr)(p)\\
&= 1 \otimes p - \bigl(J_{41}(f \triangleright (J_{34} \circ J_{23})(D))\bigr)(p)\\
&= 1 \otimes p - p \otimes 1 - p_{(0)}\bigl(f \triangleright (J_{34} \circ J_{23})(D)\bigr)(p_{(1)})\\
&= dp - p_{(0)} f(p_{(1)})\,(J_{34} \circ J_{23})(D)(p_{(2)})\,f^{-1}(p_{(3)}) - p_{(0)} f(p_{(1)})\,df^{-1}(p_{(2)})\\
&= d\bigl(p_{(0)} f(p_{(1)})\bigr) f^{-1}(p_{(2)}) - p_{(0)} f(p_{(1)})\,p_{(2)}^{[1]}\,J_{23}(D)\bigl(dp_{(2)}^{[2]}\bigr) f^{-1}(p_{(3)}).
\end{aligned} \tag{4.3}$$
Note now that, since $P$ admits a strong connection, it is projective (Corollary 2.4) and hence flat as a left $B$-module. Consequently $P \otimes H$ is left $B$-flat and
$$\mathrm{Ker}\bigl((\Delta_R - \mathrm{id} \otimes 1) \otimes_B \mathrm{id} \otimes \mathrm{id}\bigr) = B \otimes_B P \otimes H. \tag{4.4}$$
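The last equality in (4.3) rests on a short Leibniz-rule step that can be spelled out as follows. This is a sketch under two assumptions taken from the definitions used above: $f^{-1}$ is the convolution inverse of $f$, so that $f(h_{(1)})f^{-1}(h_{(2)}) = \varepsilon(h)$, and $d$ satisfies the Leibniz rule.

```latex
\begin{align*}
p &= p_{(0)}\,\varepsilon(p_{(1)}) = p_{(0)}\,f(p_{(1)})\,f^{-1}(p_{(2)}),\\
dp &= d\bigl(p_{(0)}f(p_{(1)})\bigr)\,f^{-1}(p_{(2)}) + p_{(0)}f(p_{(1)})\,df^{-1}(p_{(2)}),\\
dp - p_{(0)}f(p_{(1)})\,df^{-1}(p_{(2)}) &= d\bigl(p_{(0)}f(p_{(1)})\bigr)\,f^{-1}(p_{(2)}).
\end{align*}
```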
Using property (1.4) of the translation map and the $\mathrm{Ad}_R$-colinearity of $f$, we obtain
$$(\Delta_R - \mathrm{id} \otimes 1)\bigl(p_{(0)} f(p_{(1)})\,p_{(2)}^{[1]}\bigr) \otimes_B p_{(2)}^{[2]} \otimes p_{(3)} = 0. \tag{4.5}$$
Hence
$$p_{(0)} f(p_{(1)})\,p_{(2)}^{[1]} \otimes_B p_{(2)}^{[2]} \otimes p_{(3)} \in B \otimes_B P \otimes H, \tag{4.6}$$
and we have:
$$\begin{aligned}
\bigl(J_{42}(f \triangleright J_{24}(D))\bigr)(p) &= d\bigl(p_{(0)} f(p_{(1)})\bigr) f^{-1}(p_{(2)}) - (J_{23}(D) \circ d)\bigl(p_{(0)} f(p_{(1)})\,p_{(2)}^{[1]} p_{(2)}^{[2]}\bigr) f^{-1}(p_{(3)})\\
&= d\bigl(p_{(0)} f(p_{(1)})\bigr) f^{-1}(p_{(2)}) + (D - d)\bigl(p_{(0)} f(p_{(1)})\bigr) f^{-1}(p_{(2)})\\
&= D\bigl(p_{(0)} f(p_{(1)})\bigr) f^{-1}(p_{(2)}) = \alpha^2(f, D)(p).
\end{aligned} \tag{4.7}$$
Similarly, we compute:
$$\bigl(J_{41}(f \triangleright J_{14}(s))\bigr)(p) = p \otimes 1 + p_{(0)} f(p_{(1)})\,J_{14}(s)(p_{(2)})\,f^{-1}(p_{(3)}) + p_{(0)} f(p_{(1)})\,df^{-1}(p_{(2)}). \tag{4.8}$$
On the other hand,
$$\begin{aligned}
J_{14}(s)(h) &= (J_{34} \circ J_{23} \circ J_{12})(s)(h) = h^{[1]}\,(J_{23} \circ J_{12})(s)\bigl(dh^{[2]}\bigr) = h^{[1]}\,(d - J_{12}(s))\bigl(h^{[2]}\bigr)\\
&= h^{[1]}\,(s - \mathrm{id} \otimes 1)\bigl(h^{[2]}\bigr) = h^{[1]} s\bigl(h^{[2]}\bigr) - \varepsilon(h) \otimes 1.
\end{aligned} \tag{4.9}$$
Therefore, taking advantage of the left $B$-linearity of $s$, (4.6) and (1.6), we obtain
$$\begin{aligned}
\bigl(J_{41}(f \triangleright J_{14}(s))\bigr)(p) &= p_{(0)} f(p_{(1)}) \otimes f^{-1}(p_{(2)}) + p_{(0)} f(p_{(1)})\,p_{(2)}^{[1]} s\bigl(p_{(2)}^{[2]}\bigr) f^{-1}(p_{(3)}) - p_{(0)} f(p_{(1)}) \otimes f^{-1}(p_{(2)})\\
&= s\bigl(p_{(0)} f(p_{(1)})\bigr) f^{-1}(p_{(2)})\\
&= \alpha^1(f, s)(p),
\end{aligned} \tag{4.10}$$
as needed.

Remark 4.2. The gauge transformations on the $H$-Galois extension $B \subseteq P$ are in one-to-one correspondence with the gauge automorphisms understood as unital left $B$-linear right $H$-colinear automorphisms of $P$ [B-T96, Prop. 5.2]. If $f : H \to P$ is a gauge transformation, then $F : P \to P$, $F(p) := p_{(0)} f(p_{(1)})$ is a gauge automorphism. Analogously, for $\alpha \in \Omega^1 P$, we put $F(\alpha) := (\mathrm{id} \otimes m) \circ (\mathrm{id} \otimes \mathrm{id} \otimes f) \circ \Delta_{\Omega^1 P}(\alpha)$. (The other way round we have $f(h) = h^{[1]} F(h^{[2]})$.) Due to the right $H$-colinearity of the covariant differential $D$, we can re-write point 2) of the above theorem as $(D \triangleleft F)(p) = F^{-1}(DF(p))$. This formula coincides with the usual formula for the action of gauge transformations on projective-module connections (e.g., see [C-A94, p. 554]).
Remark 4.3. In the sense of the definition considered here, the connections in Example 2.14 are not gauge equivalent. This is because, for the quantum Hopf fibration, any gauge transformation $f$ acts trivially on the space of connections. Indeed, since $H$ is spanned by group-like elements, $f$ is convolution-invertible, and the only invertible elements in $A(SL_q(2))$ are non-zero complex numbers [HM99, Appendix], $f$ must be $\mathbb{C} \setminus \{0\}$-valued. This effect is due to working with non-completed (polynomial) algebras.

Acknowledgement. P. M. H. was partially supported by a CNR postdoctoral fellowship and GNFM funds at SISSA, IHES visiting stipend and KBN grant 2 P03A 030 14. Further financial support was provided by the “Geometric Analysis” network (HPRN-CT-1999-00118). This work is also part of H. G.’s project P11783PHY of the “Fonds zur Förderung der Wissenschaftlichen Forschung in Österreich”. H. G. thanks SISSA for an invitation to Trieste. Finally, it is a pleasure to thank T. Brzeziński, G. Landi, J.-L. Loday, P. Schauenburg and A. Sitarz for very helpful discussions and suggestions.
References

[BBL90] Bartocci, C., Bruzzo, U., Landi, G.: Chern Simons Form on Principal Superfiber Bundles. J. Math. Phys. 31, 45–53 (1990)
[B-N72] Bourbaki, N.: Commutative Algebra. Reading, MA: Addison-Wesley, 1972
[B-T96] Brzeziński, T.: Translation Map in Quantum Principal Bundles. J. Geom. Phys. 20, 349–370 (1996) (hep-th/9407145)
[B-T99] Brzeziński, T.: On Modules Associated to Coalgebra Galois Extensions. J. Algebra 215, 290–317 (1999) (q-alg/9712023)
[BH99] Brzeziński, T., Hajac, P.M.: Coalgebra Extensions and Algebra Coextensions of Galois Type. Commun. Alg. 27, 1347–1367 (1999)
[BM93] Brzeziński, T., Majid, S.: Quantum Group Gauge Theory on Quantum Spaces. Commun. Math. Phys. 157, 591–638 (1993); Erratum 167, 235 (1995) (hep-th/9208007)
[BM98a] Brzeziński, T., Majid, S.: Coalgebra Bundles. Commun. Math. Phys. 191, 467–492 (1998) (q-alg/9602022)
[BM98b] Brzeziński, T., Majid, S.: Quantum Differentials and the q-Monopole Revisited. Acta Appl. Math. 54, 185–232 (1998) (q-alg/9706021)
[BM00] Brzeziński, T., Majid, S.: Quantum Geometry of Algebra Factorisations and Coalgebra Bundles. Commun. Math. Phys. 213, 491–521 (2000) (math/9808067)
[C-A94] Connes, A.: Noncommutative Geometry. London–New York: Academic Press, 1994
[CQ95] Cuntz, J., Quillen, D.: Algebra Extensions and Nonsingularity. J. Am. Math. Soc. 8, 251–289 (1995)
[DHS99] Dąbrowski, L., Hajac, P.M., Siniscalco, P.: Explicit Hopf–Galois Description of $SL_{e^{2\pi i/3}}(2)$-Induced Frobenius Homomorphisms. In: Kastler, D., Rosso, M., Schucker, T. (eds.) Enlarged Proceedings of the ISI GUCCIA Workshop on Quantum Groups, Noncommutative Geometry and Fundamental Physical Interactions, Commack–New York: Nova Science Pub, Inc., 1999, pp. 279–298
[DS94] Dąbrowski, L., Sobczyk, J.: Left Regular Representation and Contraction of $sl_q(2)$ to $e_q(2)$. Lett. Math. Phys. 32, 249–258 (1994)
[D-Y85] Doi, Y.: Algebras with Total Integrals. Commun. Alg. 13, 2137–2159 (1985)
[DT86] Doi, Y., Takeuchi, M.: Cleft Comodule Algebras for a Bialgebra. Commun. Alg. 14, 801–817 (1986)
[D-M96] Đurđević, M.: Quantum Principal Bundles and Tannaka–Krein Duality Theory. Rep. Math. Phys. 38, 313–324 (1996) (q-alg/9507018)
[D-M97a] Đurđević, M.: Geometry of Quantum Principal Bundles II. Rev. Math. Phys. 9, 531–607 (1997) (q-alg/9412005)
[D-M97b] Đurđević, M.: Quantum Principal Bundles and Corresponding Gauge Theories. J. Phys. A30, 2027–2054 (1997) (q-alg/9507021)
[D-M97c] Đurđević, M.: Quantum Principal Bundles and Their Characteristic Classes. (q-alg/9605008); Quantum Classifying Spaces and Universal Quantum Characteristic Classes. (q-alg/9605009) In: Budzyński, R. et al. (eds.) Quantum Groups and Quantum spaces, Banach Center Publ. 40, 1997, pp. 303–313 and pp. 315–327
[D-Ma] Đurđević, M.: Characteristic Classes of Quantum Principal Bundles. (q-alg/9507017)
[D-Mb] Đurđević, M.: Quantum Gauge Transformations and Braided Structure on Quantum Principal Bundles. (q-alg/9605010)
[GKP96] Grosse, H., Klimčík, C., Prešnajder, P.: Topologically Nontrivial Field Configurations in Noncommutative Geometry. Commun. Math. Phys. 178, 507–526 (1996) (hep-th/9510083)
[GKP97] Grosse, H., Klimčík, C., Prešnajder, P.: Field Theory on a Supersymmetric Lattice. Commun. Math. Phys. 185, 155–175 (1997) (hep-th/9507074)
[GR98] Grosse, H., Reiter, G.: The Fuzzy Supersphere. J. Geom. Phys. 28, 349–383 (1998) (math-ph/9804013)
[H-PM96] Hajac, P.M.: Strong Connections on Quantum Principal Bundles. Commun. Math. Phys. 182, 579–617 (1996)
[H-PM00] Hajac, P.M.: Bundles over Quantum Sphere and Noncommutative Index Theorem. K-Theory 21, 141–150 (2000)
[HM99] Hajac, P.M., Majid, S.: Projective Module Description of the q-Monopole. Commun. Math. Phys. 206, 247–264 (1999)
[HMS] Hajac, P.M., Matthes, R., Szymański, W.: Quantum Real Projective Space, Disc and Sphere. To appear in Algebras and Representation Theory. (math/0009185)
[L-G01a] Landi, G.: Projective Modules of Finite Type and Monopoles over $S^2$. J. Geom. Phys. 37, 47–62 (2001) (math-ph/9905014)
[L-G01b] Landi, G.: Projective Modules of Finite Type over the Supersphere $S^{2,2}$. Differ. Geom. Appl. 14, 95–111 (2001) (math-ph/9907020)
[LM87] Landi, G., Marmo, G.: Extensions of Lie Superalgebras and Supersymmetric Abelian Gauge Fields. Phys. Lett. 193 B, 61–66 (1987)
[LPS] Landi, G., Paschke, M., Sitarz, A.: In preparation
[L-JL97] Loday, J.-L.: Cyclic Homology. Berlin–Heidelberg–New York: Springer, 1997
[M-J92] Madore, J.: The Fuzzy Sphere. Class. Quant. Grav. 9, 69–87 (1992)
[M-S97] Majid, S.: Some Remarks on Quantum and Braided Group Gauge Theory. In: Budzyński, R. et al. (eds.) Quantum Groups and Quantum spaces, Banach Center Publ. 40, 1997, pp. 336–349 (q-alg/9603031)
[MMNNU91] Masuda, T., Mimachi, K., Nakagami, Y., Noumi, M., Ueno, K.: Representations of the Quantum Group $SU_q(2)$ and the Little q-Jacobi Polynomials. J. Funct. Anal. 99, 357–387 (1991)
[MNW91] Masuda, T., Nakagami, Y., Watanabe, J.: Noncommutative Differential Geometry on the Quantum Two Sphere of Podleś. I: An Algebraic Viewpoint. K-Theory 5, 151–175 (1991)
[M-S93] Montgomery, S.: Hopf Algebras and Their Actions on Rings. Regional Conference Series in Mathematics no. 82. Providence, RI: AMS, 1993
[P-P87] Podleś, P.: Quantum Spheres. Lett. Math. Phys. 14, 193–202 (1987)
[R-J94] Rosenberg, J.: Algebraic K-Theory and its Applications. Berlin–Heidelberg–New York: Springer-Verlag, 1994
[R-D98] Rumynin, D.: Hopf–Galois Extensions with Central Invariants and their Geometric Properties. Algebras and Representation Theory 1, 353–381 (1998)
[S-P93] Schauenburg, P.: Zur nichtkommutativen Differentialgeometrie von Hauptfaserbuendeln – Hopf–Galois-Erweiterungen von de Rham-Komplexen. Algebra-Berichte 71. München: Verl. R. Fischer, 1993
[S-P] Schauenburg, P.: Private communication, 1999
[S-HJ90a] Schneider, H.-J.: Principal Homogeneous Spaces for Arbitrary Hopf Algebras. Isr. J. Math. 72, 167–195 (1990)
[S-HJ90b] Schneider, H.-J.: Representation Theory of Hopf–Galois Extensions. Isr. J. Math. 72, 196–231 (1990)
[S-HJ94] Schneider, H.-J.: Hopf Galois Extensions, Crossed Products, and Clifford Theory. In: Bergen, J., Montgomery, S. (eds.) Advances in Hopf Algebras, Lecture Notes in Pure and Applied Mathematics. New York: Marcel Dekker, Inc., 158, 1994, pp. 267–297
[T-P88] Teofilatto, P.: Discrete Supergroups and Super Riemann Surfaces. J. Math. Phys. 29, 2389–2396 (1988)
Communicated by A. Connes
Commun. Math. Phys. 220, 333 – 375 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Julia Sets in Parameter Spaces

X. Buff$^1$, C. Henriksen$^2$

$^1$ Université Paul Sabatier, UFR MIG, Laboratoire E. Picard, 31062 Toulouse Cedex, France. E-mail: [email protected]
$^2$ Technical University of Denmark, Department of Mathematics, 2800 Lyngby, Denmark. E-mail: [email protected]

Received: 22 September 2000 / Accepted: 16 January 2001
Abstract: Given a complex number $\lambda$ of modulus 1, we show that the bifurcation locus of the one parameter family $\{f_b(z) = \lambda z + bz^2 + z^3\}_{b \in \mathbb{C}}$ contains quasi-conformal copies of the quadratic Julia set $J(\lambda z + z^2)$. As a corollary, we show that when the Julia set $J(\lambda z + z^2)$ is not locally connected (for example when $z \mapsto \lambda z + z^2$ has a Cremer point at 0), the bifurcation locus is not locally connected. To our knowledge, this is the first example of complex analytic parameter space of dimension 1, with connected but non-locally connected bifurcation locus. We also show that the set of complex numbers $\lambda$ of modulus 1, for which at least one of the parameter rays has a non-trivial accumulation set, contains a dense $G_\delta$ subset of $S^1$.
Contents
1. Introduction
2. Conformal Representation of $\mathbb{C} \setminus M_\lambda$
3. Copies of Quadratic Julia Sets in the Dynamical Plane
4. Definition of the Wake $W_0$
5. Dynamics of $f_b$ in the Wake $W_0$
6. Holomorphic Motion of Rays
7. The Dyadic Wakes $W_\vartheta$
8. Copies of Quadratic Julia Sets in the Parameter Plane
9. Non-local Connectivity in the Parameter Plane
Research partially supported by the French Embassy in Denmark within the research co-operation between France and Denmark: “Holomorphic Dynamics”.
1. Introduction

In this article, we study the one parameter family of cubic polynomials
$$f_b(z) = \lambda z + bz^2 + z^3, \qquad b \in \mathbb{C},$$
where $\lambda = e^{2i\pi\theta}$ is a fixed complex number of modulus 1. We call $K(f_b)$ the filled-in Julia set of the polynomial $f_b$, $J(f_b)$ its Julia set, and $M_\lambda$ the connectedness locus of the family:
$$K(f_b) = \bigl\{z \in \mathbb{C} \bigm| (f_b^{\circ n}(z))_{n \in \mathbb{N}} \text{ is bounded}\bigr\}, \qquad J(f_b) = \partial K(f_b), \qquad\text{and}\qquad M_\lambda = \bigl\{b \in \mathbb{C} \bigm| J(f_b) \text{ is connected}\bigr\}.$$
The notations $K_b$ and $J_b$ are kept for other purposes.

In Sects. 2 and 3, we recall some classical results related to the study of the dynamics of cubic polynomials. Those results can be found in [BH1]. In particular, we prove that the connectedness locus $M_\lambda$ is connected and we construct dynamically a conformal representation $\Phi_\lambda : \mathbb{C} \setminus M_\lambda \to \mathbb{C} \setminus \overline{\mathbb{D}}$ (compare with [Z1]). This enables us to define the parameter rays $R_\lambda(\theta)$, $\theta \in \mathbb{R}/\mathbb{Z}$.

In Sect. 4, we prove that the parameter rays $R_\lambda(1/6)$ and $R_\lambda(1/3)$ land at a common parameter $b_0$. The techniques we use are not new. They are similar to those developed by Douady and Hubbard in [DH1] to study the landing properties of parameter rays in the quadratic family $\{z \mapsto z^2 + c\}_{c \in \mathbb{C}}$. We then define the wake $W_0$ as the connected component of $\mathbb{C} \setminus \overline{R_\lambda(1/6) \cup R_\lambda(1/3)}$ that contains the parameter ray $R_\lambda(1/4)$ (see Fig. 5).

In Sect. 5, we study the dynamical features of the polynomials $f_b$ when the parameter $b$ ranges in the wake $W_0$.

Matters get interesting in Sect. 6. Let us define $\Theta \subset \mathbb{R}/\mathbb{Z}$ (respectively $\Theta' \subset \mathbb{R}/\mathbb{Z}$) to be the Cantor set of angles that can be written in base 3 with only 0's and 1's (respectively with only 1's and 2's). We denote by $X_b$ the set of dynamical rays whose arguments belong to $\Theta$. In Sect. 6, we prove that the set $X_b$ moves holomorphically as long as the parameter $b$ remains in the wake $W_0$. As a consequence, we show that for any parameter $b \in W_0$, the filled-in Julia set $K(f_b)$ contains a quasi-conformal copy of the filled-in Julia set $K(\lambda z + z^2)$ (see Fig. 11).

Theorem A. For any parameter $b \in W_0$ and for any $\theta \in \Theta$, the dynamical ray $R_b(\theta)$ does not bifurcate. We define $X_b$ to be the set
$$X_b = \bigcup_{\theta \in \Theta} R_b(\theta).$$
We also define $J_b$ to be the set $J_b = \overline{X_b} \setminus X_b$ and $K_b$ to be the complement of the unbounded connected component of $\mathbb{C} \setminus J_b$. Then, $K_b$ is contained in the filled-in Julia set $K(f_b)$, its boundary $J_b$ is contained in the Julia set $J(f_b)$ and $K_b$ is quasi-conformally homeomorphic to the filled-in Julia set $K(\lambda z + z^2)$.

In the wake $W_0$, one can see a copy $M'$ of a Mandelbrot set (see Fig. 1). We give a precise definition of the set $M'$, but we do not prove that it is homeomorphic to the Mandelbrot set. This has been done in [EY] in the case $\lambda = 1$, and is not known in the case $\lambda \neq 1$. However, we show that the boundary of $M'$ is equal to the accumulation set of the parameter rays $R_\lambda(\theta/3)$, $\theta \in \Theta'$ (see Fig. 12):
$$\partial M' = \overline{X'} \setminus X', \qquad\text{where}\qquad X' = \bigcup_{\theta \in \Theta'} R_\lambda(\theta/3).$$

Fig. 1. Zooms in $M_\lambda$ for $\lambda = e^{i\pi(\sqrt{5}-1)}$ (panels: $M_\lambda$ and $K(\lambda z + z^2)$)

At the same time, we show that the connected components of $W_0 \setminus X'$ can be indexed by dyadic angles $\vartheta \in \mathbb{R}/\mathbb{Z}$. The connected component $W_\vartheta$ is bounded by two parameter rays $R_\lambda(\vartheta^-)$ and $R_\lambda(\vartheta^+)$ landing at a common parameter $b_\vartheta \in M'$. The angles $\vartheta^-$ and $\vartheta^+$ are two consecutive endpoints of the Cantor set $\Theta'$. We prove that given any dyadic angle $\vartheta = (2p + 1)/2^k$, we have $\vartheta^+ = \vartheta^- + 1/(2 \cdot 3^{k+1})$. We then define the sets $X_\vartheta$, $J_\vartheta$ and $K_\vartheta$ in the following way:
$$X_\vartheta = \bigcup_{\theta \in \Theta} R_\lambda\Bigl(\frac{\vartheta^-}{3} + \frac{\theta}{3^{k+1}}\Bigr),$$
$J_\vartheta = \overline{X_\vartheta} \setminus X_\vartheta$, where the closure is taken in $\mathbb{C}$, and $K_\vartheta$ is the complement of the unbounded connected component of $\mathbb{C} \setminus J_\vartheta$. Our main results are the following (see Fig. 1).
Main Theorem. Let $\lambda \in S^1$ be a complex number of modulus 1 and $\vartheta \in \mathbb{R}/\mathbb{Z}$ be a dyadic angle. The set $K_\vartheta$ is contained in $M_\lambda \cap W_\vartheta$, its boundary $J_\vartheta$ is contained in the boundary of $M_\lambda$ and the parameter $b_\vartheta$ belongs to $J_\vartheta$. Besides, there exists a quasiconformal homeomorphism defined in a neighborhood of $K_\vartheta$, sending $K_\vartheta$ to $K(\lambda z + z^2)$.

Corollary A. For each complex number $\lambda$ of modulus 1, the bifurcation locus of the one parameter family $f_b(z) = \lambda z + bz^2 + z^3$, $b \in \mathbb{C}$, contains quasi-conformal copies of the quadratic Julia set $J(\lambda z + z^2)$.

Corollary B. If the Julia set $J(\lambda z + z^2)$ is not locally connected, then the bifurcation locus $\partial M_\lambda$ is not locally connected.

We would like to mention that one has to be careful. Indeed, in the context of Newton’s method of cubic polynomials, Pascale Roesch [R] has an example of a locally connected Julia set containing a copy of a quadratic Julia set which is not locally connected. In our case, this does not occur because the set $M_\lambda$ is full.

Observe that when $t \in \mathbb{R} \setminus \mathbb{Q}$ does not satisfy the Bruno condition, the quadratic Julia set $J(e^{2i\pi t} z + z^2)$ is known to be non-locally connected. Hence, the set of values of $\lambda \in S^1$ for which $M_\lambda$ is not locally connected contains a dense $G_\delta$ subset of $S^1$.

Lavaurs [La] proved that the connectedness locus of the whole family of cubic polynomials is not locally connected. In the parameter space of real cubic polynomials, the bifurcation locus is also known to be non-locally connected (see [EY]). To our knowledge, we give the first example of complex parameter space of dimension 1 with connected but non-locally connected bifurcation locus.

Shizuo Nakane brought to our attention that we could prove the existence of parameter rays with a non-trivial accumulation set. He has already proved this result in the family of real cubic polynomials in a joint work with Y. Komori (see [NK]). To state the next corollary, we need to introduce some notations.
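Given any complex number $\lambda$ of modulus 1, we define $P_\lambda$ to be the quadratic polynomial $P_\lambda(z) = \lambda z + z^2$. For any angle $\theta \in \mathbb{R}/\mathbb{Z}$, we define $R_{P_\lambda}(\theta)$ to be the dynamical ray of the polynomial $P_\lambda$ of angle $\theta$. We also consider the Cantor map (or devil's staircase) $\chi : \mathbb{R}/\mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ which is constant on the closure of each connected component of $\mathbb{R}/\mathbb{Z} \setminus \Theta$ and is defined on $\Theta$ by:
$$\chi\Bigl(\sum_{i \geq 1} \frac{\varepsilon_i}{3^i}\Bigr) = \sum_{i \geq 1} \frac{\varepsilon_i}{2^i}, \qquad\text{where } \varepsilon_i \in \{0, 1\}.$$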
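The definition of the Cantor map can be illustrated on finite expansions. The following is a minimal sketch, not part of the original text: the function name `cantor_map` is hypothetical, and only angles with finitely many base-3 digits, all equal to 0 or 1, are handled (points of the Cantor set with infinite expansions are not covered).

```python
from fractions import Fraction


def cantor_map(digits):
    """Return (theta, chi(theta)) for a finite base-3 expansion.

    `digits` lists the ternary digits eps_1, eps_2, ... of theta,
    each of which must be 0 or 1 (hypothetical helper for illustration).
    """
    if any(d not in (0, 1) for d in digits):
        raise ValueError("digits must be 0 or 1")
    # theta = sum eps_i / 3^i, read in base 3
    theta = sum(Fraction(d, 3 ** i) for i, d in enumerate(digits, start=1))
    # chi(theta) = sum eps_i / 2^i, the same digits read in base 2
    chi = sum(Fraction(d, 2 ** i) for i, d in enumerate(digits, start=1))
    return theta, chi


theta, chi = cantor_map([1, 0, 1])
print(theta, chi)  # 10/27 5/8
```

Exact rational arithmetic is used so that the base-3 and base-2 readings of the same digit string can be compared without rounding.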
Corollary C. Given any complex number $\lambda$ of modulus 1, any dyadic angle $\vartheta = (2p + 1)/2^k$ and any angle $\theta \in \Theta$, the accumulation set of the parameter ray $R_\lambda(\vartheta^-/3 + \theta/3^{k+1})$ is reduced to a point if and only if the accumulation set of the quadratic ray $R_{P_\lambda}(\chi(\theta))$ is reduced to a point.

Using an accumulation theorem due to Douady (see [Sø]), we then prove that the set of complex numbers $\lambda$ of modulus 1, for which at least one of the parameter rays $R_\lambda(\theta) \subset \mathbb{C} \setminus M_\lambda$ has a non-trivial accumulation set, contains a dense $G_\delta$ subset of $S^1$.

We would like to make some comments about the choice of the family $f_b$. We wanted to work with a family of cubic polynomials having a persistently indifferent fixed point. We decided to put this fixed point at the origin. This condition is achieved, since the map $f_b$ has an indifferent fixed point at 0 with multiplier $\lambda$. The reason why we have chosen this parametrization is that the polynomial $f_b$ is monic and thus, has a preferred Böttcher
coordinate. This will be useful to define a conformal representation $\Phi_\lambda : \mathbb{C} \setminus M_\lambda \to \mathbb{C} \setminus \overline{\mathbb{D}}$ in a dynamical way. This is important since we want to be able to transfer results from the dynamical plane to the parameter plane. However, one should observe that the maps $f_b$ and $f_{-b}$ are always conjugate by the affine map $z \mapsto -z$. Indeed, $-f_b(-z) = -(-\lambda z + bz^2 - z^3) = f_{-b}(z)$. This explains why parameter pictures are symmetric with respect to the origin.

The central argument we use is inspired by techniques developed by Tan Lei in [TL]. There, she proves that there are similarities between the Mandelbrot set and certain Julia sets. We would also like to mention that Pia Willumsen proved the existence of copies of the quadratic Julia set $J(z^2 - 1)$ in the parameter space of a well-chosen family of cubic polynomials. Hubbard made the suggestion that the two dimensional connectedness locus of the space of cubic polynomials may contain homeomorphic copies of the set $\{(c, z) \mid K(z^2 + c) \text{ is connected and } z \in K(z^2 + c)\}$. After we presented our results in Crete$^1$, Lyubich and McMullen made the observation that pushing further our arguments, we should be able to prove this result. This would show the existence of cubic polynomials being in the same combinatorial class, but not being topologically conjugate. Such a result has been conjectured by Kiwi in his thesis [K].

$^1$ Euroconference in Mathematics on Crete; Holomorphic Dynamics; Anogia, June 26–July 2, 1999.

2. Conformal Representation of $\mathbb{C} \setminus M_\lambda$

In this section, we will use results by Branner and Hubbard [BH1] to prove that $M_\lambda$ is full, connected and has capacity $3/\sqrt[3]{4}$. We will construct, in a dynamical way, the Riemann mapping $\Phi_\lambda : \mathbb{C} \setminus M_\lambda \to \mathbb{C} \setminus \overline{\mathbb{D}}$, that is tangent to $b \mapsto b \cdot \sqrt[3]{4}/3$ at infinity. A similar study has already been done by Zakeri [Z1]. Working with the escaping critical value, he defines an analytic map from $\mathbb{C} \setminus M_\lambda$ to $\mathbb{C} \setminus \overline{\mathbb{D}}$ which turns out to be a covering map of degree 3. We will instead work with the escaping co-critical point. We will need this approach later, to transfer dynamical results to the parameter plane. In [Z2], Zakeri also gives an interesting proof of the connectivity of $M_\lambda$ based on Teichmüller theory of rational maps.

2.1. Potential functions. Recall that Fatou proved that the Julia set of any polynomial is connected if and only if the orbit of each critical point is bounded. In our case, the map $f_b$ has two critical points. However, $f_b$ has an indifferent fixed point at 0. Hence, there is always one critical point with a bounded orbit. Indeed, there are only three possible cases:
• the fixed point is parabolic ($\theta \in \mathbb{Q}$), and there is at least one critical point of $f_b$ in its basin of attraction;
• the fixed point is linearizable (it could be the case even if $\theta$ is not a Bruno number), and the boundary of the Siegel disk is accumulated by the orbit of at least one critical point of $f_b$;
• the fixed point is a Cremer point and is contained in the limit set of at least one critical point of $f_b$.

Remark. We will say that this critical point is “captured” by 0. In particular, when $J(f_b)$ is disconnected, there is exactly one critical point $\omega_1$ with bounded orbit, and one escaping critical point $\omega_2$.

Let us now recall some classical results that can be found in [DH1] and [BH1].

Definition 1 (Potential functions). For any $b \in \mathbb{C}$, define $g_b : \mathbb{C} \to [0, +\infty[$ by
$$g_b(z) = \lim_{n \to \infty} \frac{1}{3^n} \log^+ \bigl|f_b^{\circ n}(z)\bigr|,$$
where $\log^+$ is the supremum of $\log$ and 0. Also define the function $G : \mathbb{C} \to \mathbb{R}^+$ by
$$G(b) = \sup_{\{\omega \,\mid\, f_b'(\omega) = 0\}} g_b(\omega).$$

Remark. When the Julia set $J(f_b)$ is connected, $G(b) = 0$. Otherwise, $G(b) = g_b(\omega_2)$.

Proposition 1. We have the following properties:
1. $g_b$ is continuous and subharmonic on all of $\mathbb{C}$;
2. $g_b(f_b(z)) = 3g_b(z)$;
3. $g_b$ vanishes exactly on $K(f_b)$ and is harmonic on $\mathbb{C} \setminus K(f_b)$;
4. the critical points of $g_b$ in $\mathbb{C} \setminus K(f_b)$ are the preimages of the escaping critical point $\omega_2$ by an iterate $f_b^{\circ n}$, $n \geq 0$;
5. the mapping $(b, z) \mapsto g_b(z)$ is a continuous plurisubharmonic function;
6. the function $G$ is continuous and subharmonic.

Remark. We will see that $G$ vanishes exactly on the set $M_\lambda$ and is harmonic outside $M_\lambda$.

Definition 2 (Equipotentials). The level curve $g_b^{-1}\{\eta\}$ is called the dynamical equipotential of level $\eta$. The level curve $G^{-1}\{\eta\}$ is called the parameter equipotential of level $\eta$.

When the Julia set is connected, the two critical points are contained in $K(f_b)$, and the harmonic map $g_b : \mathbb{C} \setminus K(f_b) \to \mathbb{R}^+$ has no critical point. Hence, every dynamical equipotential of $f_b$ is a real-analytic simple closed curve. More generally, observe that $g_b$ has no critical point in the region $\{z \in \mathbb{C} \mid g_b(z) > G(b)\}$, and every dynamical equipotential of level $\eta > G(b)$ is a real-analytic simple closed curve. The orthogonal curves to dynamical (respectively parameter) equipotentials will be called dynamical (respectively parameter) rays. We will be more precise about the definition of rays below. Figure 2 shows a filled-in Julia set with two dynamical equipotentials of level 1/3 and 1, together with four segments of dynamical rays.
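The potential functions of Definition 1 can be approximated numerically. The sketch below is only an illustration under stated assumptions: the names `f`, `green`, `critical_points` and `G` are hypothetical, the limit is truncated once the orbit leaves a large disk (radius `big`), and a point whose orbit has not escaped after `max_iter` steps is treated as lying in $K(f_b)$, which can misclassify very slowly escaping points.

```python
import cmath
import math


def f(b, lam, z):
    """The cubic polynomial f_b(z) = lam*z + b*z^2 + z^3."""
    return lam * z + b * z * z + z * z * z


def green(b, lam, z, max_iter=200, big=1e8):
    """Approximate g_b(z) = lim 3^(-n) log+ |f_b^n(z)| (truncated limit)."""
    for n in range(max_iter):
        if abs(z) > big:
            # Far from the origin, log|f_b^n(z)| / 3^n is already close to the limit.
            return math.log(abs(z)) / 3 ** n
        z = f(b, lam, z)
    # No escape detected: treat the point as belonging to K(f_b).
    return 0.0


def critical_points(b, lam):
    """Roots of f_b'(z) = 3z^2 + 2bz + lam."""
    root = cmath.sqrt(b * b - 3 * lam)
    return ((-b + root) / 3, (-b - root) / 3)


def G(b, lam):
    """Approximate G(b): the largest potential of a critical point."""
    return max(green(b, lam, w) for w in critical_points(b, lam))


z0 = 3 + 1j
print(green(1, 1, f(1, 1, z0)), 3 * green(1, 1, z0))  # functional equation g(f(z)) = 3 g(z)
print(G(0, 1), G(5, 1))  # G vanishes for b = 0 and is positive for b = 5 (lam = 1)
```

With $\lambda = 1$, the parameter $b = 0$ gives $G(b) = 0$ in this approximation (both critical orbits stay bounded), while $b = 5$ gives $G(b) > 0$, in line with the remark that $G$ vanishes when $J(f_b)$ is connected.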
Fig. 2. A disconnected Julia set; $\Phi_\lambda(b) = \varphi_b(\omega_2') = e^{1/3 + 2i\pi/12}$ (labels in the figure: the dynamical rays $R_b(1/12)$, $R_b(1/4)$, $R_b(5/12)$ and $R_b(9/12)$, the points $\omega_2$ and $\omega_2'$, the region $U_b$, and the window corners $\pm 4 \pm 4i$)
2.2. The Böttcher coordinate at infinity. The vector field
$$\xi_b = \frac{1}{2}\,\frac{\mathrm{grad}(g_b)}{|\mathrm{grad}(g_b)|^2}$$
is a meromorphic vector field on $\mathbb{C} \setminus K(f_b)$, having poles exactly at the critical points of $g_b$ in $\mathbb{C} \setminus K(f_b)$.

Definition 3. We define $S_b$ to be the union of the critical points of $g_b$ in $\mathbb{C} \setminus K(f_b)$ and their stable manifolds for the vector field $\xi_b$. For any $b \in \mathbb{C}$, we define $V_b$ to be the open set $\mathbb{C} \setminus (K(f_b) \cup S_b)$.

We have normalized our cubic polynomials so that they are monic. Hence, there exists a unique Böttcher coordinate $\varphi_b$ defined in a neighborhood $U_b$ of infinity, and tangent to the identity at infinity. Consider the flow $(z, \tau) \mapsto F_b(z, \tau)$ of the vector field $\xi_b$, where $\tau \in \mathbb{R}$ is a real time. For any point $z \in V_b$, we can extend $\varphi_b$ at $z$ using the formula $\varphi_b(z) = e^{-\tau} \varphi_b(F_b(z, \tau))$, where $\tau \in [0, +\infty[$ is chosen large enough so that $F_b(z, \tau) \in U_b$. The following proposition is then easily derived from the analyticity of $\xi_b$ and its analytic dependence on $b$.

Proposition 2 (Böttcher coordinate). There exists a unique analytic isomorphism $\varphi_b$ defined in a neighborhood of infinity, tangent to the identity at infinity, and satisfying $\varphi_b \circ f_b \circ \varphi_b^{-1}(z) = z^3$. The mapping $\varphi_b$ extends to an analytic isomorphism $\varphi_b : V_b \to \mathbb{C}$ and satisfies $\log |\varphi_b| = g_b$ on this set. Furthermore, $\varphi_b$ depends analytically on $b$, i.e., the set
$$V = \bigcup_{b \in \mathbb{C}} \{b\} \times V_b$$
is open and the mapping $\boldsymbol{\varphi} : V \to \mathbb{C}^2$ defined by $\boldsymbol{\varphi}(b, z) = (b, \varphi_b(z))$ is an analytic isomorphism from $V$ onto its image.

Remark. An easy computation shows that near infinity, we have $\varphi_b(z) = z + b/3 + O(1/|z|)$.

When $J(f_b)$ is connected, $V_b = \mathbb{C} \setminus K(f_b)$ and the Böttcher coordinate $\varphi_b$ is a univalent mapping $\varphi_b : \mathbb{C} \setminus K(f_b) \to \mathbb{C} \setminus \overline{\mathbb{D}}$, and on $\mathbb{C} \setminus K(f_b)$, we have $g_b = \log |\varphi_b|$. In particular, the dynamical equipotential of level $\eta$ is the set $\varphi_b^{-1}\{e^{\eta + 2i\pi\theta} \mid \theta \in \mathbb{R}/\mathbb{Z}\}$, i.e., the preimage by $\varphi_b$ of the circle of radius $e^\eta$ centered at 0. When $J(f_b)$ is disconnected this property still holds for equipotentials of level $\eta > G(b)$, i.e., in the region $\{z \in \mathbb{C} \mid g_b(z) > G(b)\}$. In both cases, the push-forward $(\varphi_b)_*(\xi_b)$ is the radial vector field $w\,\partial/\partial w$. In particular, $\varphi_b$ maps every trajectory of the vector field $\xi_b$ to a segment of line with constant argument. Hence, $\varphi_b(V_b)$ is a star-shaped domain with respect to infinity, i.e., for every angle $\theta \in \mathbb{R}/\mathbb{Z}$, there exists a radius $r(b, \theta) \geq 1$ such that $w \in \varphi_b(V_b)$ and $\arg(w) = 2\pi\theta$ if and only if $|w| > r(b, \theta)$. Finally, along a trajectory $z(\tau)$ of the vector field $\xi_b$, we have $g_b(z(\tau)) = g_b(z(0)) + \tau$.

Definition 4 (Dynamical Rays). For any $b \in \mathbb{C}$, the dynamical ray $R_b(\theta)$ is defined as
$$R_b(\theta) = \varphi_b^{-1}\bigl\{re^{2i\pi\theta} \bigm| r > r(b, \theta)\bigr\}.$$

Remark. The vector field $\xi_b = \frac{1}{2}\,\mathrm{grad}(g_b)/|\mathrm{grad}(g_b)|^2$ can be extended holomorphically to $\mathbb{C} \setminus K(f_b)$. Then, it has a sink at infinity and the dynamical rays are exactly the stable manifolds of infinity for the vector field $\xi_b$.

When $r(b, \theta) = 1$, the accumulation set of a dynamical ray is contained in the Julia set $J(f_b)$. This is true for any angle $\theta \in \mathbb{R}/\mathbb{Z}$ when $J(f_b)$ is connected. If the limit
$$z_0 = \lim_{r \searrow 1} \varphi_b^{-1}(re^{2i\pi\theta})$$
exists, we will say that the dynamical ray $R_b(\theta)$ lands at $z_0$. When $J(f_b)$ is disconnected and when $r(b, \theta) > 1$, then the limit
$$z_0 = \lim_{r \searrow r(b, \theta)} \varphi_b^{-1}(re^{2i\pi\theta})$$
exists and is a critical point $\omega$ of $g_b$. In this case, we will say that the dynamical ray $R_b(\theta)$ bifurcates on $\omega$.

If $b^2 = 3\lambda$, then there is a unique critical point. This critical point cannot escape (because 0 “captures” a critical point), and $b \in M_\lambda$. On the other hand, if $b^2 \neq 3\lambda$ and $b \notin M_\lambda$, then $f_b(\omega_2)$ has a preimage $\omega_2' \neq \omega_2$. Following Branner and Hubbard, we call it the co-critical point to $\omega_2$. Let us observe that $\varphi_b$ is well defined at the co-critical point $\omega_2'$. Indeed, $\omega_2'$ cannot be a critical point of $g_b$ since it is not an inverse image of $\omega_2$. Let us consider the trajectory $z(\tau)$ defined by the initial condition $z(0) = \omega_2'$. We have $g_b(z(\tau)) = g_b(\omega_2') + \tau$. In particular, since the region $\{z \in \mathbb{C} \mid g_b(z) > g_b(\omega_2)\}$ does not contain critical points of $g_b$ we see that the trajectory $z(\tau)$ is defined on $[0, +\infty[$. Hence, $\omega_2'$ belongs to $V_b$, and $\varphi_b(\omega_2')$ is well defined.
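For a concrete feel of the co-critical point, note that if $\omega$ is a critical point of the monic cubic $f_b$, then $f_b(z) - f_b(\omega) = (z - \omega)^2 (z - \omega')$, and comparing the coefficients of $z^2$ gives $\omega' = -b - 2\omega$. The sketch below checks this numerically; the function names are hypothetical and the closed formula is a derivation added here for illustration, not a statement taken from the text.

```python
import cmath


def f(b, lam, z):
    """The cubic polynomial f_b(z) = lam*z + b*z^2 + z^3."""
    return lam * z + b * z * z + z * z * z


def critical_points(b, lam):
    """Roots of f_b'(z) = 3z^2 + 2bz + lam; they coincide iff b^2 = 3*lam."""
    root = cmath.sqrt(b * b - 3 * lam)
    return ((-b + root) / 3, (-b - root) / 3)


def cocritical(b, omega):
    """Co-critical point to the critical point omega: omega' = -b - 2*omega."""
    return -b - 2 * omega


b, lam = 5, 1
for omega in critical_points(b, lam):
    omega_prime = cocritical(b, omega)
    # The two values f_b(omega') and f_b(omega) should agree up to rounding,
    # while omega' differs from omega because b^2 != 3*lam here.
    print(omega, omega_prime, abs(f(b, lam, omega_prime) - f(b, lam, omega)))
```

In particular $\omega' = \omega$ exactly when $\omega = -b/3$, that is, when the two critical points collide ($b^2 = 3\lambda$), which matches the case distinction made above.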
Definition 5. Given $b \in \mathbb{C} \setminus M_\lambda$, the escaping critical point is called $\omega_2$ and the co-critical point to $\omega_2$ is called $\omega_2'$. We define the mapping $\Phi_\lambda : \mathbb{C} \setminus M_\lambda \to \mathbb{C}$ by $\Phi_\lambda(b) = \varphi_b(\omega_2')$.
Proposition 3 (Branner–Hubbard [BH1] and Zakeri [Z1, Z2]). The set $M_\lambda$ is full and connected. Besides, the map $\Phi_\lambda : \mathbb{C} \setminus M_\lambda \to \mathbb{C} \setminus \overline{\mathbb{D}}$ is the conformal isomorphism which is tangent to $b \mapsto \frac{\sqrt[3]{4}}{3}\, b$ at infinity.
Proof. We have seen that if $b^2 = 3\lambda$, then $b \in M_\lambda$. Now, if $b^2 \neq 3\lambda$, the two critical points are the two distinct roots of the equation $f_b'(z) = 0$, and by the implicit function theorem, we can follow them locally. Hence, we can follow holomorphically the two critical points locally outside $M_\lambda$. For the same reason, we can follow holomorphically the two distinct co-critical points locally outside $M_\lambda$.
Lemma 1. The mapping $\Phi_\lambda : \mathbb{C} \setminus M_\lambda \to \mathbb{C} \setminus \overline{\mathbb{D}}$ is analytic.
Proof. Fix a parameter $b_0 \notin M_\lambda$, let $\omega_2$ be the escaping critical point and $\omega_2'$ be the co-critical point to $\omega_2$. There exist two holomorphic maps defined in a neighborhood $U$ of $b_0$ that follow the two co-critical points. Let $\omega' : U \to \mathbb{C}$ be the one which coincides with $\omega_2'$ at $b_0$. The set $W = \{(b, z) \in \mathbb{C}^2 \mid z \in \mathbb{C} \setminus K(f_b)\}$ is the preimage of $]0, +\infty[$ by the map $g(b, z) = g_b(z)$, which is continuous by 5) of Proposition 1. Hence, $W$ is open. Thus, by restricting $U$ if necessary, we may assume that for any $b \in U$, the co-critical point $\omega'(b)$ belongs to $\mathbb{C} \setminus K(f_b)$. This shows that, for any $b \in U$, the escaping co-critical point is $\omega'(b)$. Furthermore, the mapping $(b, z) \mapsto \varphi_b(z)$ is analytic in a neighborhood of any point $(b_0, z_0)$ such that $z_0 \in V_{b_0}$. Hence, it is analytic in a neighborhood of $(b_0, \omega_2')$. It follows immediately that $\Phi_\lambda(b) = \varphi_b(\omega'(b))$ is analytic in a neighborhood of $b_0$.
The proof that $\Phi_\lambda$ is an isomorphism between $\mathbb{C} \setminus M_\lambda$ and $\mathbb{C} \setminus \overline{\mathbb{D}}$ is an application of the principle: an analytic mapping is an isomorphism if it is proper of degree 1. We shall use a similar argument for quasi-regular mappings in Sect. 8.
Lemma 2. Outside $M_\lambda$, we have $G(b) = \log|\Phi_\lambda(b)|$. Besides, the function $G$ vanishes exactly on $M_\lambda$.
Proof. If $b \notin M_\lambda$, we can write, for any $n \geq 0$:
\[ \log|\Phi_\lambda(b)| = \frac{1}{3^n} \log\left|\varphi_b\left(f_b^{\circ n}(\omega_2')\right)\right| = \frac{1}{3^n} \log\left|f_b^{\circ n}(\omega_2')\right| + O\left(\frac{1}{|f_b^{\circ n}(\omega_2')|}\right). \]
Letting $n$ tend to infinity, the right-hand side tends to $G(b)$, so that $\log|\Phi_\lambda(b)| = G(b)$.
Since $\varphi_b$ takes values outside $\overline{\mathbb{D}}$, so does $\Phi_\lambda$. Hence, $G$ is positive outside $M_\lambda$. Besides, if $b \in M_\lambda$, both critical points are in the filled-in Julia set $K(f_b)$. So, their orbits are bounded and $G(b) = 0$.
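The function $G$ can be approximated by iterating the critical points. The following is a minimal sketch, assuming the normal form $f_b(z) = \lambda z + b z^2 + z^3$ and taking $G(b)$ to be the largest escape rate among the two critical points; the escape threshold `big`, the iteration cap `max_iter`, and the convention "return 0.0 when no escape is detected" are illustrative choices, not part of the paper.

```python
import cmath
import math

def f(z, b, lam):
    """Assumed normal form f_b(z) = lam*z + b*z**2 + z**3."""
    return lam * z + b * z**2 + z**3

def green(z, b, lam, max_iter=200, big=1e8):
    """Approximate g_b(z) = lim 3**(-n) * log|f_b^n(z)|.

    Returns 0.0 when the orbit has not left the disk of radius `big`
    after `max_iter` steps (treated here as a bounded orbit).
    """
    for n in range(max_iter):
        if abs(z) > big:
            return math.log(abs(z)) / 3**n
        z = f(z, b, lam)
    return 0.0

def critical_points(b, lam):
    """Roots of f_b'(z) = lam + 2*b*z + 3*z**2."""
    root = cmath.sqrt(b * b - 3 * lam)
    return (-b + root) / 3, (-b - root) / 3

def G(b, lam):
    """Largest escape rate among the critical points of f_b."""
    return max(green(w, b, lam) for w in critical_points(b, lam))

print(G(0, 1))    # both critical orbits of z + z**3 stay bounded
print(G(50, 1))   # a large parameter, expected to lie outside M_lambda
```

Because the truncation only uses $\log|z|$ once $|z|$ is very large, the error of the estimate is small compared with the value itself; a parameter close to the boundary of $M_\lambda$ may nevertheless need a larger `max_iter`.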
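We can now see that $M_\lambda$ is full. This is an immediate consequence of the fact that sub-harmonic functions satisfy the maximum principle. Thus, level sets are full. By Picard's theorem, the mapping $\Phi_\lambda : \mathbb{C} \setminus M_\lambda \to \mathbb{C} \setminus \overline{\mathbb{D}}$ has a removable singularity at infinity. Hence, we can extend it to infinity. We necessarily have $\Phi_\lambda(\infty) = \infty$ since otherwise $G$ would be a non-constant bounded subharmonic function on $\mathbb{P}^1$. More precisely, a simple computation shows that when $|b|$ tends to infinity,
\[ \omega_2 = -\frac{2b}{3} + o(1), \quad \text{and} \quad \omega_2' = -b - 2\omega_2 = \frac{b}{3} + o(1). \]
Then,
\[ \frac{f_b(\omega_2')}{(\omega_2')^3} = 4 + O\left(\frac{1}{|b|}\right), \]
and for any integer $n \geq 1$, we have
\[ \frac{f_b^{\circ(n+1)}(\omega_2')}{\left(f_b^{\circ n}(\omega_2')\right)^3} = 1 + O\left(\frac{1}{|b|}\right). \]
Hence, we obtain
\[ \Phi_\lambda(b) = \omega_2' \prod_{n \geq 0} \left( \frac{f_b^{\circ(n+1)}(\omega_2')}{\left(f_b^{\circ n}(\omega_2')\right)^3} \right)^{1/3^{n+1}} = \frac{\sqrt[3]{4}}{3}\, b + O\left(\frac{1}{|b|}\right). \]
We will now show that $\Phi_\lambda : \mathbb{C} \setminus M_\lambda \to \mathbb{C} \setminus \overline{\mathbb{D}}$ is a proper mapping. Since $G$ is continuous, $G(b)$ tends to 0 as $b$ tends to the boundary of $M_\lambda$. Hence, $\Phi_\lambda(b)$ tends to $\partial\mathbb{D}$ when $b$ tends to $\partial M_\lambda$ from outside $M_\lambda$. Since $\Phi_\lambda$ is analytic, it is a proper mapping. We can finally see that it has degree 1 since infinity has only one preimage counted with multiplicity. Hence, it is an isomorphism between $\mathbb{C} \setminus M_\lambda$ and $\mathbb{C} \setminus \overline{\mathbb{D}}$. In particular, $M_\lambda$ is connected.
We have defined parameter equipotentials. We can now define parameter rays (see Fig. 5).
Definition 6 (Parameter Rays). The parameter ray $R_\lambda(\theta)$ is defined as
\[ R_\lambda(\theta) = \Phi_\lambda^{-1}\{ e^{\eta + 2i\pi\theta} \mid \eta > 0 \}. \]
If the limit
\[ b_0 = \lim_{r \searrow 1} \Phi_\lambda^{-1}(r e^{2i\pi\theta}) \]
exists, we will say that the parameter ray $R_\lambda(\theta)$ lands at $b_0$.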
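The constant $\sqrt[3]{4}/3$ in the tangency statement of Proposition 3 can be traced through a short computation. What follows is only a sketch, assuming the normal form $f_b(z) = \lambda z + b z^2 + z^3$ (consistent with the equations used in this paper but not restated in this section) and the branch of the square root with $\sqrt{b^2 - 3\lambda} \sim b$ at infinity.

```latex
\begin{align*}
f_b'(z) = 3z^2 + 2bz + \lambda = 0
  \;&\Longrightarrow\;
  \omega_2 = \frac{-b - \sqrt{b^2 - 3\lambda}}{3}
           = -\frac{2b}{3} + O\!\left(\frac{1}{|b|}\right), \\
\omega_2' = -b - 2\omega_2 &= \frac{b}{3} + O\!\left(\frac{1}{|b|}\right), \\
\frac{f_b(\omega_2')}{(\omega_2')^3}
  = 1 + \frac{b}{\omega_2'} + \frac{\lambda}{(\omega_2')^2}
  &= 1 + 3 + O\!\left(\frac{1}{|b|}\right)
   = 4 + O\!\left(\frac{1}{|b|}\right), \\
\omega_2' \cdot 4^{1/3} &= \frac{\sqrt[3]{4}}{3}\, b + O\!\left(\frac{1}{|b|}\right).
\end{align*}
```

The first factor of the product thus contributes $4^{1/3}$, and the remaining factors tend to 1 as $|b|$ tends to infinity, which accounts for the leading term $\frac{\sqrt[3]{4}}{3}\, b$.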
3. Copies of Quadratic Julia Sets in the Dynamical Plane
In this section, we will first recall a result which is essentially due to Branner and Hubbard [BH2] (see also [Br]): when the parameter $b$ is not in $M_\lambda$, there exists a restriction of $f_b$ which is a quadratic-like mapping. The reader will find information on polynomial-like mappings and related results in [DH2].
Definition 7 (Polynomial-like mappings). A polynomial-like mapping $f : U' \to U$ of degree $d$ is a ramified covering of degree $d$ between two topological disks $U'$ and $U$, with $U'$ relatively compact in $U$. One can define its filled-in Julia set $K(f)$ and its Julia set $J(f)$ as follows:
\[ K(f) = \{ z \in U' \mid (\forall n \in \mathbb{N})\ f^{\circ n}(z) \in U' \}, \quad \text{and} \quad J(f) = \partial K(f). \]
A polynomial-like mapping of degree 2 will be called a quadratic-like mapping. Let us recall the so-called Straightening Theorem due to Douady and Hubbard.
Proposition 4 (Straightening Theorem). If $f : U' \to U$ is a polynomial-like mapping of degree $d$, then there exists
• a polynomial $P : \mathbb{C} \to \mathbb{C}$ of degree $d$,
• a neighborhood $V$ of the filled-in Julia set $K(P)$ such that the mapping $P : P^{-1}(V) = V' \to V$ is a polynomial-like map, and
• a quasiconformal homeomorphism $\varphi : U \to V$ with $\varphi(U') = V'$, such that $\bar\partial\varphi = 0$ almost everywhere on $K(f)$ and such that on $U'$, $\varphi \circ f = P \circ \varphi$.
Moreover, if $K(f)$ is connected, then $P$ is unique up to conformal conjugacy.
Definition 8. Two polynomial-like mappings $f$ and $g$ are said to be hybrid equivalent if there is a quasi-conformal $h$ that conjugates $f$ and $g$, with $\bar\partial h = 0$ almost everywhere on the filled-in Julia set $K(f)$.
Proposition 5. For any $b \in \mathbb{C} \setminus M_\lambda$, let us denote by $U_b$ the open set $\{z \in \mathbb{C} \mid g_b(z) < 3G(b)\}$ and by $U_b'$ the connected component of $f_b^{-1}(U_b)$ that contains the non-escaping critical point $\omega_1$. Then, the restriction $f_b : U_b' \to U_b$ is a quadratic-like mapping and its hybrid class contains the polynomial $z \mapsto \lambda z + z^2$.
Figure 2 shows the domains $U_b$ and $U_b'$ for the parameter $\Phi_\lambda^{-1}(e^{1/3 + 2i\pi/12})$.
Proof. We have seen that any dynamical equipotential of level $\eta > G(b)$ is a real-analytic simple closed curve. This applies to the dynamical equipotential of level $3G(b)$. Thus, the set $U_b$ is a topological disk. Besides, it only contains one critical value of $f_b$ (the non-escaping one). The set $f_b^{-1}(U_b)$ is the set $\{z \in \mathbb{C} \mid g_b(z) < G(b)\}$, which is bounded by a lemniscate pinching at the escaping critical point $\omega_2$. Each connected component of $f_b^{-1}(U_b)$ is a topological disk compactly contained in $U_b$. Besides, the restriction of $f_b$ to the connected component $U_b'$ of $f_b^{-1}(U_b)$ containing the non-escaping critical point $\omega_1$ is a ramified covering of degree 2, ramified at $\omega_1$. This is precisely the definition of a quadratic-like mapping.
Next, to see that the hybrid class of this quadratic-like mapping contains $z \mapsto \lambda z + z^2$, we will use the following result.
Lemma 3. The multiplier of an indifferent fixed point is a quasi-conformal invariant.
Remark. Naĭshul [Naĭ] shows a much better result since he proves that the multiplier of an indifferent fixed point is a topological invariant. Pérez-Marco [PM] gave a new proof of this result which is much simpler. The case of quasi-conformal conjugacy is easier to handle. R. Douady gave an easy proof based on the compactness of the space of quasi-conformal mappings with bounded dilatation (see [Y]). We will present a new proof based on holomorphic motions and the Ahlfors-Bers theorem. Those tools are more complicated than the ones used by Douady, but the idea of the proof fits very well within this article.
Proof. Assume that two germs $f_0 : U_0 \to \mathbb{C}$ and $f_1 : U_1 \to \mathbb{C}$ are quasi-conformally conjugate. Call $\psi$ the quasi-conformal conjugacy. Then $\mu = \bar\partial\psi/\partial\psi$ is a Beltrami form invariant by $f_0$. Integrating the Beltrami form $\mu_\varepsilon = \varepsilon\mu$, $\varepsilon \in D(0, 1/\|\mu\|_\infty)$, we get a family of quasi-conformal homeomorphisms $\psi_\varepsilon$ depending analytically on $\varepsilon$, and a family of analytic germs $f_\varepsilon = \psi_\varepsilon \circ f_0 \circ \psi_\varepsilon^{-1}$. We claim that this family of germs depends analytically on $\varepsilon$ (this is not immediate since $\psi_\varepsilon^{-1}$ does not need to depend analytically on $\varepsilon$; Douady explained a geometric proof to us, and Lyubich explained an analytic proof to us which we give here). Since $f_\varepsilon \circ \psi_\varepsilon = \psi_\varepsilon \circ f_0$, for any $z \in U$ we can write
\[ \left.\frac{\partial f_\varepsilon}{\partial\bar\varepsilon}\right|_{\psi_\varepsilon(z)} + \frac{\partial f_\varepsilon}{\partial z} \cdot \left.\frac{\partial \psi_\varepsilon}{\partial\bar\varepsilon}\right|_{z} + \frac{\partial f_\varepsilon}{\partial\bar z} \cdot \left.\frac{\partial \bar\psi_\varepsilon}{\partial\bar\varepsilon}\right|_{z} = \left.\frac{\partial \psi_\varepsilon}{\partial\bar\varepsilon}\right|_{f_0(z)}. \]
Since both $\partial f_\varepsilon/\partial\bar z$ and $\partial\psi_\varepsilon/\partial\bar\varepsilon$ vanish, we see that $\partial f_\varepsilon/\partial\bar\varepsilon$ vanishes. In particular, the multiplier $\lambda(\varepsilon)$ of the fixed point depends analytically on $\varepsilon$. Since it cannot become repelling or attracting (all the germs are conjugate to $f_0$, which has an indifferent fixed point), the modulus of $\lambda(\varepsilon)$ is constant. Hence, $\lambda(\varepsilon)$ is a constant function, and $\lambda(1) = \lambda(0)$.
The hybrid class of the quadratic-like map $f_b : U_b' \to U_b$ contains a quadratic polynomial having an indifferent fixed point with multiplier $\lambda$.
Such a polynomial is always analytically conjugate to the polynomial $z \mapsto \lambda z + z^2$.
Definition 9. For any parameter $b \in \mathbb{C} \setminus M_\lambda$, the filled-in Julia set of the quadratic-like map $f_b : U_b' \to U_b$ is called $K_b$ and its Julia set is called $J_b$.
We will now give more information about the dynamics of $f_{b_1}$ for the parameter $b_1$ with potential $\eta = 1/3$ and external argument $\theta = 1/4$ (we could have picked any parameter with potential $\eta > 0$ and external argument $\theta \in\, ]1/6, 1/3[$).
Proposition 6. Let $b_1$ be the parameter $b_1 = \Phi_\lambda^{-1}(e^{1/3 + 2i\pi/4})$. If $\lambda \neq 1$, the two dynamical rays $R_{b_1}(0/1)$ and $R_{b_1}(1/2)$ both land at a common fixed point $\beta \neq 0$ which is repelling. If $\lambda = 1$, the rays $R_{b_1}(0/1)$ and $R_{b_1}(1/2)$ both land at the parabolic fixed point $\beta = 0$.
Proof. We still denote by $U_{b_1}$ the set $U_{b_1} = \{z \in \mathbb{C} \mid g_{b_1}(z) < 3G(b_1)\}$. Its preimage $f_{b_1}^{-1}(U_{b_1})$ has two connected components. Note that $U'_{b_1}$ is the one containing $\omega_2'$ in its boundary. Denote by $U''_{b_1}$ the other component (see Fig. 3). Remember that $f_{b_1} : U'_{b_1} \to U_{b_1}$ is a degree 2 proper mapping.
Fig. 3. The rays $R_{b_1}(0/1)$ and $R_{b_1}(1/2)$ both land at a common fixed point $\beta$
Similarly, $f_{b_1} : U''_{b_1} \to U_{b_1}$ is a degree 1 proper mapping, and since $U''_{b_1}$ is compactly contained in $U_{b_1}$, $f_{b_1}$ has exactly one fixed point in $U''_{b_1}$. This fixed point is repelling. We will denote it by $\alpha$. Next, observe that the rays $R_{b_1}(-1/12)$ and $R_{b_1}(7/12)$ bifurcate on $\omega_2$, and since $-1/12 < 0 < 1/2 < 7/12$, they separate $\alpha$ from the rays $R_{b_1}(0/1)$ and $R_{b_1}(1/2)$.
Since $f_{b_1} : U'_{b_1} \to U_{b_1}$ is a degree 2 proper mapping, and since $U'_{b_1}$ is compactly contained in $U_{b_1}$, Rouché's Theorem shows that $f_{b_1}$ has exactly two fixed points in $U'_{b_1}$, counted with multiplicity. If $\lambda \neq 1$, those two fixed points are distinct. One is 0, which is indifferent and has multiplier $\lambda$; the other one will be denoted by $\beta$. A theorem due to Douady–Hubbard [DH1] and to Sullivan asserts that every fixed dynamical ray that does not bifurcate lands at a fixed point which is either repelling, or parabolic with multiplier 1. Since the two fixed dynamical rays $R_{b_1}(0/1)$ and $R_{b_1}(1/2)$ cannot land at 0 (since the fixed point 0 is neither repelling nor of multiplier 1), they must both land at the fixed point $\beta$. Since $\beta$ is the landing point of a ray, either it is repelling or it is a multiple fixed point. But since there are only two fixed points in $U'_{b_1}$ counted with multiplicity, the former case occurs.
On the other hand, if $\lambda = 1$, there is only one fixed point in $U'_{b_1}$: the fixed point at 0, which is parabolic with multiplier 1. Hence the two fixed rays $R_{b_1}(0/1)$ and $R_{b_1}(1/2)$ must both land at 0.
We will now describe the set of rays that accumulate on the Julia set $J_{b_1}$ of the quadratic-like map $f_{b_1} : U'_{b_1} \to U_{b_1}$.
Definition 10. We define $\Theta \subset \mathbb{R}/\mathbb{Z}$ to be the set of angles $\theta$ such that for any $n \geq 0$, $3^n\theta \in [0, 1/2] \bmod 1$.
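For a rational angle, membership in the set of Definition 10 (written $\Theta$ here) can be decided exactly, because the orbit of a rational angle under multiplication by 3 modulo 1 is eventually periodic. The helper below is a minimal sketch; the function name `in_theta` is illustrative.

```python
from fractions import Fraction

def in_theta(theta):
    """True if 3**n * theta mod 1 lies in [0, 1/2] for every n >= 0.

    `theta` must be rational; the loop stops as soon as the orbit of
    theta under tripling revisits an angle already checked.
    """
    theta = Fraction(theta) % 1
    seen = set()
    while theta not in seen:
        if theta > Fraction(1, 2):
            return False
        seen.add(theta)
        theta = (3 * theta) % 1
    return True

for p, q in [(0, 1), (1, 2), (1, 6), (1, 3), (4, 9), (1, 18), (1, 4), (3, 4)]:
    print(Fraction(p, q), in_theta(Fraction(p, q)))
```

Exact rational arithmetic is used on purpose: with floating-point angles, repeated tripling would amplify rounding errors and make the test unreliable.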
Remark. The set $\Theta$ is the set of angles $\theta$ that can be written in base 3 with only 0's and 1's. It is a Cantor set and is forward invariant under multiplication by 3.
Figure 4 shows the dynamical rays $R_{b_1}(\theta)$ for $\theta \in \Theta$. The following proposition shows that those rays accumulate on the Julia set $J_{b_1}$ of the quadratic-like restriction of $f_{b_1}$.
Fig. 4. The dynamical rays $R_{b_1}(\theta)$, $\theta \in \Theta$, accumulate on the Julia set $J_{b_1}$ of the quadratic-like restriction of $f_{b_1}$
Proposition 7. Let $b_1$ be the parameter $b_1 = \Phi_\lambda^{-1}(e^{1/3 + 2i\pi/4})$ and $J_{b_1}$ be the Julia set of the quadratic-like mapping $f_{b_1} : U'_{b_1} \to U_{b_1}$. Then, for any $\theta \in \Theta$, the dynamical ray $R_{b_1}(\theta)$ does not bifurcate. Besides, if we define
\[ X_{b_1} = \bigcup_{\theta \in \Theta} R_{b_1}(\theta), \]
then $\overline{X_{b_1}} \setminus X_{b_1} = J_{b_1}$.
Proof. Let us first recall that the rays $R_{b_1}(0/1)$ and $R_{b_1}(1/2)$ do not bifurcate and land at the same fixed point $\beta$. Hence, the curve $\{\beta\} \cup R_{b_1}(0/1) \cup R_{b_1}(1/2)$ cuts the plane in two connected components $V_1$ and $V_2$. We call $V_2$ the one containing the escaping critical point $\omega_2$. Observe that for any $\theta \in [0, 1/2]$, the dynamical ray $R_{b_1}(\theta)$ is contained in $\mathbb{C} \setminus V_2$. Now, assume that there exists an angle $\theta \in \Theta$ such that the dynamical ray $R_{b_1}(\theta)$ bifurcates. Then, it bifurcates on a preimage of the escaping critical point $\omega_2$ and one of its forward images bifurcates on $\omega_2$. But since by definition of $\Theta$, we have $3^k\theta \in [0, 1/2] \bmod 1$ for any $k \geq 0$, the forward orbit of the ray $R_{b_1}(\theta)$ is contained in $\mathbb{C} \setminus V_2$. Hence no forward image of $R_{b_1}(\theta)$ can bifurcate on the escaping critical point $\omega_2 \in V_2$.
Since the set $\Theta$ is closed (it is an intersection of closed sets), $X_{b_1}$ is closed in $\mathbb{C} \setminus K(f_{b_1})$. Hence, $\overline{X_{b_1}} \setminus X_{b_1} \subset J(f_{b_1})$.
We will now show that for any angle $\theta \in \Theta$, the accumulation set $I$ of the ray $R_{b_1}(\theta)$ is contained in the Julia set $J_{b_1}$ of the quadratic-like mapping $f_{b_1} : U'_{b_1} \to U_{b_1}$. Indeed, the accumulation set $I$ is contained in the Julia set $J(f_{b_1})$ of $f_{b_1}$, and its forward orbit is contained in $\mathbb{C} \setminus V_2$. In particular, it cannot enter the region $U''_{b_1}$, and the forward orbit of $I$ is entirely contained in $U'_{b_1}$. This shows that $I \subset K_{b_1}$. Since $I$ is contained in the boundary of $K(f_{b_1})$, we see that $I \subset J_{b_1}$, and $\overline{X_{b_1}} \setminus X_{b_1} \subset J_{b_1}$.
To prove the reverse inclusion, we will use the fact that the backward orbit of the fixed point $\beta$ by the quadratic-like map $f_{b_1} : U'_{b_1} \to U_{b_1}$ is dense in $J_{b_1}$. Let us show by induction on $n$ that if $z \in J_{b_1}$ satisfies $f_{b_1}^{\circ n}(z) = \beta$, then there is an angle $\theta \in \Theta$ such that $R_{b_1}(\theta)$ lands at $z$. This is true for $n = 0$ since the rays $R_{b_1}(0/1)$ and $R_{b_1}(1/2)$ land at $\beta$. Now, if the induction property holds for some $n$, let us show that it is true for $n + 1$. Given a point $z \in J_{b_1}$ satisfying $f_{b_1}^{\circ(n+1)}(z) = \beta$, its image $f_{b_1}(z)$ satisfies the induction hypothesis. Thus, there is an angle $\theta \in \Theta$ such that the ray $R_{b_1}(\theta)$ lands at $f_{b_1}(z)$. Observe that, on one hand, this ray cannot contain the escaping critical value (indeed, the ray containing the escaping critical value has argument $3/4 \notin \Theta$), and its three preimages land at the three preimages of $f_{b_1}(z)$. On the other hand, there are three angles $\theta_1$, $\theta_2$ and $\theta_3$ such that $3\theta_i = \theta$, $i = 1, 2, 3$. Two of them, let us say $\theta_1$ and $\theta_2$, are in $\Theta$, and the third one, $\theta_3$, is contained in $]2/3, 5/6[ \bmod 1$. Hence, the ray $R_{b_1}(\theta_3)$ lands at the preimage of $f_{b_1}(z)$ which is contained in $U''_{b_1}$. This shows that one of the two rays $R_{b_1}(\theta_1)$ or $R_{b_1}(\theta_2)$ lands at $z$.
Remark. It is easy to see that no other dynamical ray can accumulate on $J_{b_1}$ since their forward orbits eventually enter $V_2$.
4. Definition of the Wake $W_0$
We will now restrict our study to a particular region in the parameter plane: the wake $W_0$.
Definition 11. The wake $W_0$ is defined to be the connected component of
\[ \mathbb{C} \setminus \left( \overline{R_\lambda(1/6)} \cup \overline{R_\lambda(1/3)} \cup \overline{R_\lambda(2/3)} \cup \overline{R_\lambda(5/6)} \right) \]
that contains the parameter ray $R_\lambda(1/4)$.
Remark. In fact, we will show that the parameter rays $R_\lambda(1/6)$ and $R_\lambda(1/3)$ land at a common parameter $b_0$ which satisfies the equation $b_0^2 = 4(\lambda - 1)$. The wake $W_0$ is the region contained between those two rays (see Fig. 5).
There are several ways of proving the landing property of the parameter rays $R_\lambda(1/6)$ and $R_\lambda(1/3)$. We will use an argument similar to the one used by Douady and Hubbard in [DH1]. We will need to modify it slightly in the case $\lambda = 1$.
Proposition 8. The parameter rays $R_\lambda(1/6)$ and $R_\lambda(1/3)$ land at the same parameter $b_0$ satisfying $b_0^2 = 4(\lambda - 1)$. The parameter rays $R_\lambda(2/3)$ and $R_\lambda(5/6)$ land at $-b_0$.
Remark. When $\lambda = 1$, we have $b_0 = 0$ and the four rays land at 0 (see Fig. 6).
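The equation $b_0^2 = 4(\lambda - 1)$ of Proposition 8 can be checked against the dynamics: at such a parameter the two non-zero fixed points collide into a single fixed point of multiplier 1. The sketch below assumes the normal form $f_b(z) = \lambda z + b z^2 + z^3$; the helper names and the sample value of $\lambda$ are illustrative.

```python
import cmath

def nonzero_fixed_points(b, lam):
    """Roots of z**2 + b*z + (lam - 1) = 0: the fixed points of f_b other than 0."""
    root = cmath.sqrt(b * b - 4 * (lam - 1))
    return (-b + root) / 2, (-b - root) / 2

def multiplier(z, b, lam):
    """Derivative f_b'(z) for the assumed form f_b(z) = lam*z + b*z**2 + z**3."""
    return lam + 2 * b * z + 3 * z * z

lam = cmath.exp(2j * cmath.pi * 0.3)   # arbitrary sample with |lam| = 1 and lam != 1
b0 = cmath.sqrt(4 * (lam - 1))         # one of the two solutions of b0**2 = 4*(lam - 1)
z1, z2 = nonzero_fixed_points(b0, lam)

# Up to rounding, the two roots coincide and the multiplier there equals 1.
print(abs(z1 - z2), abs(multiplier(z1, b0, lam) - 1))
```

Replacing `b0` by `-b0` gives the same conclusion, in line with the symmetry $b \mapsto -b$ of the family.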
Fig. 5. The parameter rays $R_\lambda(1/6)$ and $R_\lambda(1/3)$ land at $b_0$, whereas the rays $R_\lambda(2/3)$ and $R_\lambda(5/6)$ land at $-b_0$
Proof. In the case $\lambda \neq 1$, we will show that for any parameter $b_0$ contained in the accumulation set of the ray $R_\lambda(1/6)$, $f_{b_0}$ has a parabolic fixed point with multiplier 1. The set of such parameters is discrete (in fact, $b_0^2 = 4(\lambda - 1)$). Since the accumulation set of any ray is connected, this will prove that the ray $R_\lambda(1/6)$ lands. A similar argument shows that the rays $R_\lambda(1/3)$, $R_\lambda(2/3)$ and $R_\lambda(5/6)$ land at $b_0$ or $-b_0$. We will then have to show that the rays $R_\lambda(1/6)$ and $R_\lambda(1/3)$ land at the same parameter. In the case $\lambda = 1$, we will show that the only parameter in the accumulation set of the rays $R_\lambda(1/6)$, $R_\lambda(1/3)$, $R_\lambda(2/3)$ and $R_\lambda(5/6)$ is $b_0 = 0$. This will conclude the proof of the proposition.
Lemma 4. For any parameter $b_0$ contained in the accumulation set of the ray $R_\lambda(1/6)$, $R_\lambda(1/3)$, $R_\lambda(2/3)$ or $R_\lambda(5/6)$, the polynomial $f_{b_0}$ has a parabolic fixed point with multiplier 1.
Proof. Let us prove this lemma for the ray $R_\lambda(1/6)$. We will proceed by contradiction. Assume that $f_{b_0}$ has no parabolic fixed point with multiplier 1. Since $b_0 \in M_\lambda$, the dynamical ray $R_{b_0}(1/2)$ does not bifurcate. It is a fixed dynamical ray. Hence, it lands at a fixed point $\alpha$, which is either repelling, or parabolic with multiplier 1. By hypothesis on $b_0$, the second case is not possible. We claim that there is a neighborhood $U_1$ of $b_0$ such that for any $b \in U_1$, the ray $R_b(1/2)$ still lands on a repelling fixed point of $f_b$. The proof is classical and can be found in the Orsay Notes [DH1]. Thus, for any $b \in U_1$, the ray $R_b(1/2)$ does not bifurcate on a critical point. In particular, the dynamical ray $R_b(1/6)$ cannot contain the co-critical point. But this precisely shows that the parameter ray $R_\lambda(1/6)$ omits the neighborhood $U_1$ of $b_0$, which gives the contradiction.
Fig. 6. The parameter space for $\lambda = 1$. The four rays $R_\lambda(1/6)$, $R_\lambda(1/3)$, $R_\lambda(2/3)$ and $R_\lambda(5/6)$ land at 0
The fixed points of the polynomial $f_b$ are 0 and the roots of the equation $\lambda - 1 + bz + z^2 = 0$. If $\lambda \neq 1$, there is a multiple root (i.e., a parabolic fixed point with multiplier 1) if and only if the discriminant is zero: $b^2 - 4(\lambda - 1) = 0$. Hence, when $\lambda \neq 1$, we see that the parameter rays $R_\lambda(1/6)$, $R_\lambda(1/3)$, $R_\lambda(2/3)$ and $R_\lambda(5/6)$ can only accumulate on $b_0$ or $-b_0$, where $b_0^2 = 4(\lambda - 1)$. Since the accumulation set of a ray is connected, we have proved that those rays land at $b_0$ or $-b_0$.
When $\lambda = 1$, the origin is a persistently parabolic fixed point with multiplier 1. Hence, to be able to conclude that the parameter rays land, we must improve our lemma. The following lemma completes the proof of the proposition in the case $\lambda = 1$.
Lemma 5. When $\lambda = 1$ the parameter rays $R_\lambda(1/6)$, $R_\lambda(1/3)$, $R_\lambda(2/3)$ and $R_\lambda(5/6)$ land at $b_0 = 0$.
Proof. Let us prove this lemma for the parameter ray $R_\lambda(1/6)$. The proof is essentially the same as in Lemma 4. We proceed by contradiction, assuming that the parameter ray $R_\lambda(1/6)$ accumulates on $b_0 \neq 0$. On the one hand, the dynamical ray $R_{b_0}(1/2)$ cannot land at a repelling fixed point, since otherwise there would be a neighborhood $U_1$ of $b_0$ in which the dynamical ray $R_b(1/2)$ would not bifurcate (as in Lemma 4). On the other hand, if the dynamical ray $R_{b_0}(1/2)$ were landing at a parabolic fixed point with multiplier 1 (i.e., the fixed point 0) then we could still show that there exists a neighborhood $U_1$ in which the dynamical ray $R_b(1/2)$ would not bifurcate.
The idea of the proof is the following. Since $b_0 \neq 0$, the parabolic fixed point 0 is simple, i.e., $f_{b_0}''(0) \neq 0$. We will show that we can follow continuously a repelling petal $P_{\mathrm{rep}}(b)$ in a neighborhood $U_0$ of $b_0$. On
this repelling petal, the inverse branches $f_b^{-1} : P_{\mathrm{rep}}(b) \to P_{\mathrm{rep}}(b)$ are well defined and iterates of these inverse branches converge to 0. We will also show that the dynamical ray $R_{b_0}(1/2)$ enters the repelling petal $P_{\mathrm{rep}}(b_0)$. Consequently, there exists a neighborhood $U_1$ of $b_0$ such that for any $b \in U_1$, the dynamical ray $R_b(1/2)$ enters the repelling petal $P_{\mathrm{rep}}(b)$, and thus lands at the parabolic fixed point 0.
Let us fill in the details. Since we assume $b_0 \neq 0$, there exists a neighborhood $U_0$ of $b_0$ and a radius $\varepsilon > 0$ such that for any $b \in U_0$, $f_b$ restricts to an isomorphism between the disk $V(b)$ centered at 0 with radius $\varepsilon/|b|$ and $f_b(V(b))$. Now, observe that the change of coordinates $z \mapsto Z = -1/(bz)$ conjugates $f_b : V(b) \to f_b(V(b))$ to an isomorphism $F_b : \widetilde{V} \to F_b(\widetilde{V})$, where
\[ \widetilde{V} = \{ Z \in \mathbb{P}^1 \mid 1/\varepsilon < |Z| \} \quad \text{and} \quad F_b(Z) = Z + 1 + O\left(\frac{1}{|Z|}\right). \]
Let us choose $\varepsilon$ sufficiently small, so that $|F_b(Z) - Z - 1| < \sqrt{2}/2$ for any $b \in U_0$ and any $Z \in \widetilde{V}$. Then, denote by $\widetilde{P}_{\mathrm{att}}$ and $\widetilde{P}_{\mathrm{rep}}$ the sectors
\[ \widetilde{P}_{\mathrm{att}} = \{ Z \in \mathbb{C} \mid \sqrt{2}/\varepsilon - \mathrm{Re}(Z) < |\mathrm{Im}(Z)| \}, \]
and
\[ \widetilde{P}_{\mathrm{rep}} = \{ Z \in \mathbb{C} \mid \sqrt{2}/\varepsilon + \mathrm{Re}(Z) < |\mathrm{Im}(Z)| \}. \]
Besides, denote by $P_{\mathrm{att}}(b)$ and $P_{\mathrm{rep}}(b)$ the sets
\[ P_{\mathrm{att}}(b) = \{ z \in \mathbb{C}^* \mid -1/(bz) \in \widetilde{P}_{\mathrm{att}} \}, \]
and
\[ P_{\mathrm{rep}}(b) = \{ z \in \mathbb{C}^* \mid -1/(bz) \in \widetilde{P}_{\mathrm{rep}} \}. \]
The set $P_{\mathrm{att}}(b)$ is called an attracting petal and the set $P_{\mathrm{rep}}(b)$ is called a repelling petal (see Fig. 7). One can easily check that the assumptions on $\varepsilon$ imply that for any $b \in U_0$, we have
(1) $f_b(P_{\mathrm{att}}(b)) \subset P_{\mathrm{att}}(b)$;
(2) $f_b^{\circ n}$ converges uniformly on compact subsets of $P_{\mathrm{att}}(b)$ to 0;
(3) there exists an inverse branch $f_b^{-1} : P_{\mathrm{rep}}(b) \to P_{\mathrm{rep}}(b)$;
(4) $[f_b^{-1}]^{\circ n}$ converges uniformly on compact subsets of $P_{\mathrm{rep}}(b)$ to 0.
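The expansion $F_b(Z) = Z + 1 + O(1/|Z|)$ used to build the petals can be observed numerically. The sketch below assumes the form $f_b(z) = z + b z^2 + z^3$ (the case $\lambda = 1$ of the assumed normal form) and conjugates it by $Z = -1/(bz)$; the sample parameter `b` and the sample points are arbitrary.

```python
def F(Z, b):
    """Conjugate of f_b(z) = z + b*z**2 + z**3 by the change of coordinates Z = -1/(b*z)."""
    z = -1 / (b * Z)
    return -1 / (b * (z + b * z**2 + z**3))

b = 0.5 + 0.25j
for Z in (20, 200j, -2000):
    # |Z| * |F_b(Z) - Z - 1| should remain bounded as |Z| grows.
    print(abs(Z), abs(F(Z, b) - Z - 1) * abs(Z))
```

Expanding the conjugated map in powers of $1/Z$ suggests that the rescaled error approaches $|1 - 1/b^2|$ for large $|Z|$, which is a convenient sanity check for the chosen sample values.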
Let us express the ray $R_{b_0}(1/2)$ as a countable union of segments
\[ S_j = \varphi_{b_0}^{-1}\{ -e^{t} \mid 3^j \leq t \leq 3^{j+1} \}, \quad j \in \mathbb{Z}, \]
so that $f_{b_0}(S_j) = S_{j+1}$. Clearly, we see that $P_{\mathrm{att}}(b_0)$ is contained in the filled-in Julia set $K(f_{b_0})$. Thus, $R_{b_0}(1/2)$ does not intersect $P_{\mathrm{att}}(b_0)$. Since we assumed that the ray $R_{b_0}(1/2)$ lands at 0, there exists an integer $j_0$ such that $S_{j_0}$ is contained in $P_{\mathrm{rep}}(b_0)$. Again, by shrinking $U_0$ if necessary, we may assume that $U_0 \subset \{b \in \mathbb{C} \mid G(b) < 3^{j_0}\}$. This condition implies that for any $b \in U_0$, the ray $R_b(1/2)$ is defined for all potentials at least $3^{j_0}$, and
\[ S_j(b) = \varphi_b^{-1}\{ -e^{t} \mid 3^j \leq t \leq 3^{j+1} \}, \quad j \geq j_0, \]
is well defined. Finally, since $\{(z, b) \mid b \in U_0,\ z \in P_{\mathrm{rep}}(b)\}$ is open and since $\varphi_b^{-1}$ depends continuously (even analytically) on $b$, we see that there exists a neighborhood
Fig. 7. An attracting petal $P_{\mathrm{att}}(b)$ and a repelling petal $P_{\mathrm{rep}}(b)$. The attracting petal $P_{\mathrm{att}}(b)$ is contained in $K(f_b)$ and the ray $R_b(1/2)$ eventually enters and stays in $P_{\mathrm{rep}}(b)$
$U_1 \subset U_0$ of $b_0$, such that for any $b \in U_1$ the segment $S_{j_0}(b)$ is contained in $P_{\mathrm{rep}}(b)$. Hence, $S_{j_0 - k}(b) = [f_b^{-1}]^{\circ k}(S_{j_0}(b))$ is well defined for any $k \geq 0$, and the ray $R_b(1/2)$ lands at 0. However, this implies that the parameter ray $R_\lambda(1/6)$ does not intersect $U_1$.
We still need to prove that when $\lambda \neq 1$, the parameter rays $R_\lambda(1/6)$ and $R_\lambda(1/3)$ land at the same parameter. Remember that we defined the wake $W_0$ as the connected component of $\mathbb{C} \setminus \left( \overline{R_\lambda(1/6)} \cup \overline{R_\lambda(1/3)} \cup \overline{R_\lambda(2/3)} \cup \overline{R_\lambda(5/6)} \right)$ that contains the parameter ray $R_\lambda(1/4)$. Let us call $b_0$ the landing point of the parameter ray $R_\lambda(1/6)$. We will use the fact that the connectedness locus $M_\lambda$ is symmetric with respect to 0 (remember that $f_b$ and $f_{-b}$ are conjugate by $z \mapsto -z$). The symmetry of $M_\lambda$ shows that two of the four rays $R_\lambda(1/6)$, $R_\lambda(1/3)$, $R_\lambda(2/3)$ or $R_\lambda(5/6)$ land at $b_0$ and the other two land at $-b_0$. Moreover, the parameter rays $R_\lambda(1/6)$ and $R_\lambda(2/3)$ are symmetric, so that $R_\lambda(2/3)$ cannot land at $b_0$ ($\neq -b_0$). Hence, if the parameter ray $R_\lambda(1/3)$ were not landing at $b_0$, then the ray $R_\lambda(5/6)$ would. In that case, the wake $W_0$ would contain the parameter $b = 0$ (see Fig. 8). We will get a contradiction by proving that for any parameter $b \in W_0$, the dynamical rays $R_b(0/1)$ and $R_b(1/2)$ land at the same point, whereas this is not the case for $b = 0$.
Lemma 6. For any parameter $b \in W_0$, the two dynamical rays $R_b(0/1)$ and $R_b(1/2)$ do not bifurcate.
Fig. 8. If the parameter rays $R_\lambda(1/6)$ and $R_\lambda(1/3)$ were not landing at the same parameter, the wake $W_0$ would contain the parameter $b = 0$
Remark. This lemma and the following one are in fact true as soon as $b$ does not belong to one of the parameter rays $R_\lambda(1/6)$, $R_\lambda(1/3)$, $R_\lambda(2/3)$ or $R_\lambda(5/6)$.
Proof. If $b \notin R_\lambda(1/3) \cup R_\lambda(2/3)$, the dynamical ray $R_b(0/1)$ does not bifurcate. Indeed, if $R_b(0/1)$ were bifurcating, it would bifurcate on a preimage of the escaping critical point $\omega_2$, i.e., there would be a non-negative $n$ such that $f_b^{-n}(\omega_2)$ meets the ray $R_b(0/1)$. Since this is a fixed ray, $\omega_2$ would belong to the ray $R_b(0/1)$ and consequently $\omega_2'$ would lie on either $R_b(1/3)$ or $R_b(2/3)$, which contradicts $b \notin R_\lambda(1/3) \cup R_\lambda(2/3)$. A similar argument shows that if $b \notin R_\lambda(1/6) \cup R_\lambda(5/6)$, the dynamical ray $R_b(1/2)$ does not bifurcate and also lands at a fixed point which is either repelling or parabolic with multiplier 1.
Lemma 7. If $\lambda \neq 1$, then given any $b \in W_0$, the rays $R_b(0/1)$ and $R_b(1/2)$ both land at the same repelling fixed point $\beta(b) \neq 0$. If $\lambda = 1$, then given any $b \in W_0$, the rays $R_b(0/1)$ and $R_b(1/2)$ both land at the parabolic fixed point $\beta(b) = 0$.
Proof. To prove this lemma, we will use an idea due to Peter Haïssinsky which has been explained to us by Carsten Petersen. We have seen that in the domain $W_0$, the dynamical rays $R_b(0/1)$ and $R_b(1/2)$ do not bifurcate. It follows that the set $X = R_b(0/1) \cup R_b(1/2)$ moves holomorphically with respect to the parameter $b$. Hence, the $\lambda$-Lemma by Mañé, Sad and Sullivan [MSS] shows that the closure of $X$ in $\mathbb{P}^1$ moves holomorphically. In particular, if for some parameter $b_1 \in W_0$ the two dynamical rays $R_{b_1}(0/1)$ and $R_{b_1}(1/2)$ land at the same fixed point, they do so everywhere in $W_0$, i.e., there exists a holomorphic function $\beta(b)$ such that $\beta(b)$ is a fixed point of $f_b$ and is the landing point of the two rays $R_b(0/1)$ and $R_b(1/2)$. Besides, the multiplier at $\beta(b)$ is a univalent function that takes values in $\mathbb{C} \setminus \mathbb{D}$.
Hence, either the multiplier is constantly equal to 1 (which corresponds to a persistently parabolic landing point) or it takes values in $\mathbb{C} \setminus \overline{\mathbb{D}}$ (and the landing point remains repelling in all of $W_0$). Thus, we just need to show that there is a parameter $b \in W_0$ for which the two rays $R_b(0/1)$ and $R_b(1/2)$ land at a common fixed point, and that this point is repelling when
$\lambda \neq 1$, whereas it is parabolic with multiplier 1 when $\lambda = 1$. This is precisely given by Proposition 6 for the parameter $b_1 = \Phi_\lambda^{-1}(e^{1/3 + 2i\pi/4})$.
To conclude the proof of the proposition, it is enough to see that when $b = 0$ and $\lambda \neq 1$, the two dynamical rays $R_0(0/1)$ and $R_0(1/2)$ cannot land at the same point. The polynomial $f_0(z) = \lambda z + z^3$ is an odd polynomial. Thus, the filled-in Julia set is symmetric with respect to the origin. In particular, the dynamical rays $R_0(0/1)$ and $R_0(1/2)$ are symmetric. Thus, if they land (in fact, the two critical orbits are symmetric, the Julia set is connected, and the rays land) the landing points are symmetric with respect to the origin. However, the origin cannot be the landing point of those rays because it is indifferent with multiplier $\lambda \neq 1$. Hence, the two dynamical rays $R_0(0/1)$ and $R_0(1/2)$ land at two symmetric, distinct fixed points.
We have proved that in the wake $W_0$, the two dynamical rays $R_b(0/1)$ and $R_b(1/2)$ both land at a common fixed point $\beta(b)$ which depends holomorphically on $b$. If $\lambda = 1$, we have seen that $\beta(b) = 0$ is a double fixed point, and the cubic polynomial $f_b$ has only one other fixed point: $\alpha(b) = -b$. If $\lambda \neq 1$, the map $f_b$ has three distinct fixed points: 0, $\beta(b)$ and $\alpha(b) = -b - \beta(b)$.
Definition 12. For any $b \in W_0$, we call $\beta(b)$ the landing point of the dynamical rays $R_b(0/1)$ and $R_b(1/2)$, and we call $\alpha(b) = -b - \beta(b)$ the fixed point of $f_b$ which is neither 0 nor $\beta(b)$.
Remark. Since the function $\beta$ is holomorphic in $W_0$, the function $\alpha$ is also holomorphic in $W_0$. In fact, since $W_0$ is simply connected and does not contain the parameters $\pm b_0$, it is clear that the three fixed points of $f_b$ depend holomorphically on $b$ in $W_0$, without using the fact that $\beta(b)$ is the landing point of the rays $R_b(0/1)$ and $R_b(1/2)$.
5. Dynamics of $f_b$ in the Wake $W_0$
We will now improve our description of the dynamical behaviour of the polynomial $f_b$, when $b \in W_0$ (see Fig. 9).
Proposition 9. For any $b \in W_0$, the dynamics of the map $f_b$ is as follows:
1. the two critical points of $f_b$ are distinct and there exist two holomorphic functions $\omega_1(b)$ and $\omega_2(b)$ defined in $W_0$, such that for any $b \in W_0$, $\omega_1(b)$ and $\omega_2(b)$ are the two critical points of $f_b$, $\omega_2(b)$ being the escaping critical point whenever $b \in W_0 \setminus M_\lambda$; the co-critical points are $\omega_i'(b) = -b - 2\omega_i(b)$;
2. the dynamical rays $R_b(1/6)$ and $R_b(1/3)$ do not bifurcate and both land at a preimage $\beta_1(b) \neq \beta(b)$ of $\beta(b)$; the rays $R_b(2/3)$ and $R_b(5/6)$ do not bifurcate and land at the other preimage $\beta_2(b) \notin \{\beta(b), \beta_1(b)\}$; we define $V_i$ to be the connected component of $\mathbb{C} \setminus \left( \overline{R_b(0/1)} \cup \overline{R_b(1/2)} \right)$ that contains $\beta_i(b)$;
3. each of the four connected components of $\mathbb{C} \setminus \bigcup_{\{\theta \mid 6\theta \in \mathbb{Z}\}} \overline{R_b(\theta)}$ contains exactly one of the four points $\omega_1(b)$, $\omega_2(b)$, $\omega_1'(b)$ or $\omega_2'(b)$; we call $U_i$, $i = 1, 2$, the one containing $\omega_i(b)$ and $U_i'$, $i = 1, 2$, the one containing $\omega_i'(b)$;
4. the map $f_b : U_i' \to V_i$, $i = 1, 2$, is an isomorphism and the map $f_b : U_i \to V_i$, $i = 1, 2$, is a ramified covering of degree 2 ramified at $\omega_i(b)$.
Proof. We will first show that we can follow the two critical points holomorphically when $b \in W_0$.
Fig. 9. The dynamical picture of the polynomial $f_b$ when the parameter $b$ belongs to $W_0$
Lemma 8. For any $b \in W_0$, the two critical points of $f_b$ are distinct. Moreover, there exist two holomorphic functions $\omega_1(b)$ and $\omega_2(b)$ defined in $W_0$, such that for any $b \in W_0$, $\omega_1(b)$ and $\omega_2(b)$ are the two critical points of $f_b$.
Proof. When the two critical points of $f_b$ are distinct, i.e., $b^2 \neq 3\lambda$, we can locally follow them. Since $W_0$ is simply connected, the proof of the lemma will be completed once we have proved that for any $b \in W_0$, the two critical points of $f_b$ are distinct. We will proceed by contradiction and assume that for some parameter $b \in W_0$, the polynomial $f_b$ has a unique critical point $\omega$. The polynomial $f_b$ is then conjugate by the affine change of coordinate $z \mapsto w = z - \omega$ to a polynomial of the form $w \mapsto w^3 + c$. The Julia set of such a polynomial is invariant under the rotation $w \mapsto e^{2i\pi/3} w$. This shows that the Julia set of $f_b$ is invariant under the rotation of angle 1/3 around $\omega$. In particular, the dynamical ray $R_b(1/3)$ (respectively $R_b(2/3)$) is the image of the dynamical ray $R_b(0/1)$ by the rotation of angle 1/3 (respectively 2/3) of center $\omega$ (see Fig. 10). For the same reason, the dynamical ray $R_b(5/6)$ (respectively $R_b(1/6)$) is obtained from $R_b(1/2)$ by rotating with angle 1/3 (respectively 2/3) around $\omega$.
We will show that this is incompatible with the fact that, when $b \in W_0$, the two dynamical rays $R_b(0/1)$ and $R_b(1/2)$ land at the same point $\beta(b)$. By rotating with angle 1/3, we see that the two rays $R_b(1/3)$ and $R_b(5/6)$ land at $e^{2i\pi/3}(\beta(b) - \omega) + \omega$. Since those two rays are separated by the curve $\{\beta(b)\} \cup R_b(0/1) \cup R_b(1/2)$, they can only meet at $\beta(b)$. Hence, $\beta(b) = e^{2i\pi/3}(\beta(b) - \omega) + \omega$, that is, $\beta(b) = \omega$. But this would imply that $\omega$ is a super-attracting fixed point, and no ray could land at $\omega$. This gives the contradiction.
Then, it is not difficult to check that the co-critical points $\omega_i'(b)$ are defined by $\omega_i'(b) = -b - 2\omega_i(b)$, $i = 1, 2$. We still have a choice on which critical point will
Julia Sets in Parameter Spaces
Fig. 10. The Julia set of fb for b² = 3λ (frame corners −2 + 2i and 2 − 2i; labelled rays Rb(0/1), Rb(1/6), Rb(1/3), Rb(1/2), Rb(2/3), Rb(5/6) and the point ω). There is a unique critical point ω and the Julia set is invariant by rotation of angle 1/3 around ω
be labelled ω1 and which one will be labelled ω2. To complete the proof of (1), we need to prove that we can choose ω2 such that ω2(b) is the escaping critical point of fb for any b ∈ W0 \ Mλ. This will be done later and we will now focus on the proof of (2).
Lemma 9. For any b ∈ W0 the dynamical rays Rb(1/6) and Rb(1/3) do not bifurcate. They both land at a preimage β1(b) ∈ fb⁻¹{β(b)} \ {β(b)} of β(b). The rays Rb(2/3) and Rb(5/6) do not bifurcate and land at the other preimage β2(b) ∈ fb⁻¹{β(b)} \ {β(b), β1(b)}.
Proof. Let us assume that Rb(1/6) bifurcates for some parameter b ∈ W0. Then it bifurcates on a preimage of the escaping critical point ω2, and one of its forward images bifurcates on ω2. Since fb(Rb(1/6)) = Rb(1/2) is fixed, this means that ω2 belongs to the ray Rb(1/6) or to the ray Rb(1/2). On the one hand, the latter case is not possible since the ray Rb(1/2) does not bifurcate. On the other hand, since b ∈ W0, the escaping co-critical point ω2′ belongs to a dynamical ray Rb(θ), with θ ∈ ]1/6, 1/3[. Hence, the rays bifurcating on ω2 have angle θ − 1/3 ∈ ]−1/6, 0[ and θ + 1/3 ∈ ]1/2, 2/3[. Thus, the ray Rb(1/6) cannot bifurcate on ω2. A similar argument shows that the rays Rb(1/3), Rb(2/3) and Rb(5/6) do not bifurcate. To complete the proof of the lemma, it is enough to prove that β(b) has three distinct preimages: β(b), β1(b) and β2(b). In other words, we need to show that β(b) is not a critical value of fb. Indeed, we can then argue that since fb is a local isomorphism in a neighborhood of βi(b) and since the rays Rb(0/1) and Rb(1/2) land at β(b), two of the rays Rb(1/6), Rb(1/3), Rb(2/3) and Rb(5/6) land at β1(b) and two of them land at
X. Buff, C. Henriksen
β2(b). The only possibility is that Rb(1/6) and Rb(1/3) land at the same preimage, let us say β1(b), and the rays Rb(2/3) and Rb(5/6) land at the other preimage β2(b). To see that β(b) is not a critical value of fb, we will proceed by contradiction. Hence, we assume that for some parameter b ∈ W0, one critical point ω is mapped by fb to β(b). Then, since β(b) is either repelling or parabolic with multiplier 1, we see that ω ≠ β(b). Besides, we have seen that the two critical points of fb are distinct. Hence, in a neighborhood of ω, the map fb is a two-to-one ramified covering, and the four rays Rb(1/6), Rb(1/3), Rb(2/3) and Rb(5/6) have to land at ω. But this is not possible since the rays Rb(1/6) and Rb(2/3) are separated by Rb(0/1) and Rb(1/2). We will now prove (3) using a holomorphic motion argument.
Lemma 10. The set
$$X_b = \{\omega_1(b),\ \omega_2(b),\ \omega_1'(b),\ \omega_2'(b)\} \cup \bigcup_{\{\theta \mid 6\theta \in \mathbb{Z}\}} R_b(\theta)$$
undergoes a holomorphic motion as b moves in W0.
Proof. The functions ωi(b) and ωi′(b), i = 1, 2, are holomorphic when b ∈ W0. Besides, we have seen that the dynamical rays Rb(θ), 6θ ∈ Z, do not bifurcate when b ∈ W0, and thus move holomorphically when b ∈ W0. To prove the lemma, we need to prove the injectivity condition of holomorphic motions. Since we already know that the critical points are distinct, we only need to show that for any b ∈ W0, the critical points and co-critical points cannot belong to any of the rays Rb(θ), 6θ ∈ Z. But this is clear since otherwise, one of those rays would have to bifurcate on a critical point.
The dynamical picture for the polynomial fb1 has been studied in Section 3, and it is not difficult to check that each connected component of $\mathbb{C} \setminus \overline{\bigcup_{\{\theta \mid 6\theta \in \mathbb{Z}\}} R_{b_1}(\theta)}$ contains exactly one of the four points ω1(b1), ω2(b1), ω1′(b1) or ω2′(b1). None of the four points are contained in the set $\overline{\bigcup_{\{\theta \mid 6\theta \in \mathbb{Z}\}} R_b(\theta)}$ for any b ∈ W0. Since the four points and this set of rays move continuously when b changes and W0 is connected, statement (3) follows.
We can now complete the proof of (1). We choose the functions ω1(b) and ω2(b) so that ω2(b1) is the escaping critical point of fb1. Then, the boundary of U2′(b1) is the union of the two dynamical rays Rb1(1/6), Rb1(1/3) and their landing point β1(b1). Using the holomorphic motion, we see that the same property holds for U2′(b), i.e., the boundary of U2′(b) is the union of the two dynamical rays Rb(1/6), Rb(1/3) and their landing point β1(b). In particular, the region U2′(b) contains the dynamical rays Rb(θ), θ ∈ ]1/6, 1/3[. On the other hand, we know that when b ∈ W0 \ Mλ, the escaping co-critical point belongs to one of those rays. Hence, for any b ∈ W0 \ Mλ the escaping co-critical point belongs to the region U2′(b). Thus the escaping co-critical point is ω2′(b) and for any b ∈ W0 \ Mλ, the escaping critical point is ω2(b).
We finally prove (4). We have called V1(b) and V2(b) the two connected components of $\mathbb{C} \setminus \overline{R_b(0/1) \cup R_b(1/2)}$. Since the preimages of the rays Rb(0/1) and Rb(1/2) are the rays Rb(θ), 6θ ∈ Z, the connected components of fb⁻¹(Vi), i = 1, 2, are the connected
components of $\mathbb{C} \setminus \overline{\bigcup_{\{\theta \mid 6\theta \in \mathbb{Z}\}} R_b(\theta)}$. Let U be one of them. Since the polynomial fb : C → C is a ramified covering, the restriction of fb to U is a ramified covering onto its image. Since U is simply connected, the Riemann-Hurwitz formula shows that the degree of the restriction of fb to U is n + 1, where n is the number of critical points of fb in U, counted with multiplicity. Hence, to finish the proof of (4), we only need to show that fb(Ui) = Vi and fb(Ui′) = Vi for i = 1, 2.
Lemma 11. For any b ∈ W0, the component U2′ contains the two dynamical rays Rb(2/9) and Rb(5/18) that both land at a preimage of β2(b).
Proof. We have seen previously that for any b ∈ W0, the region U2′ contains the dynamical rays Rb(θ), θ ∈ ]1/6, 1/3[. Since 2/9 ∈ ]1/6, 1/3[ and 5/18 ∈ ]1/6, 1/3[, the first part of the lemma is proved. Next, we have seen that fb is an isomorphism between U2′ and its image. Since U2′ contains the two dynamical rays Rb(2/9) and Rb(5/18), its image contains the two dynamical rays fb(Rb(2/9)) = Rb(2/3) and fb(Rb(5/18)) = Rb(5/6) that both land at β2(b) ∈ V2 and the lemma is proved.
Since fb maps the rays Rb(2/9) and Rb(5/18), which are in U2′, to the rays Rb(2/3) and Rb(5/6), which land at β2(b) ∈ V2, we see that fb(U2′) = V2. Since ω2(b) and ω2′(b) have the same image, we immediately obtain that fb(U2) = V2. Hence, fb : U2′ → V2 is an isomorphism and fb : U2 → V2 is a ramified covering of degree 2, ramified at ω2. Since the polynomial fb has degree 3, the component V2 has no other preimage, and fb(U1) = fb(U1′) = V1. This finishes the proof of the proposition.
6. Holomorphic Motion of Rays
In the rest of this article, we will work in the wake W0. We will constantly have to deal with the critical point ω2(b), b ∈ W0. Thus, the reader must keep in mind that the function ω2 is a holomorphic function defined throughout the wake W0, and that for any parameter b ∈ W0 \ Mλ, the point ω2(b) is the escaping critical point.
Theorem A.
For any parameter b ∈ W0 and for any θ ∈ Θ, the dynamical ray Rb(θ) does not bifurcate. We define Xb to be the set
$$X_b = \bigcup_{\theta \in \Theta} R_b(\theta).$$
We also define Jb to be the set $J_b = \overline{X_b} \setminus X_b$ and Kb to be the complement of the unbounded connected component of C \ Jb. Then, Kb is contained in the filled-in Julia set K(fb), its boundary Jb is contained in the Julia set J(fb) and Kb is quasi-conformally homeomorphic to the filled-in Julia set K(λz + z²). Figure 11 shows the set Kb and the set of dynamical rays Xb for a parameter b ∈ Mλ ∩ W0.
Proof. Let us first prove that for any parameter b ∈ W0 and any θ ∈ Θ, the dynamical ray Rb(θ) does not bifurcate. We will mimic the proof of Proposition 7. For any b ∈ W0, we have defined V2(b) to be the connected component of $\mathbb{C} \setminus \overline{R_b(0/1) \cup R_b(1/2)}$ that contains β2(b). Since the two dynamical rays Rb(2/3) and Rb(5/6) land at β2(b), they are contained in V2(b), and for any θ ∈ [0, 1/2], the
Fig. 11. The set Kb and the set of dynamical rays Xb for a parameter b ∈ Mλ ∩ W0 (labelled rays: Rb(0/1), Rb(1/18), Rb(4/9), Rb(1/2))
dynamical ray Rb(θ) is contained in C \ V2(b). Since for any θ ∈ Θ, we have 3ᵏθ ∈ [0, 1/2] mod 1 for any k ≥ 0, the forward orbit of the ray Rb(θ) is contained in C \ V2(b). Next, for any b ∈ W0, we claim that the critical point ω2(b), which is the escaping critical point when b ∈ W0 \ Mλ, belongs to the region V2(b). Indeed, we have seen that β(b) cannot be a critical value of fb. Hence, the set {ω2(b), β2(b)} ∪ Rb(0/1) ∪ Rb(1/2) moves holomorphically when b ∈ W0. Hence, ω2(b) and β2(b) are always in the same connected component of $\mathbb{C} \setminus \overline{R_b(0/1) \cup R_b(1/2)}$. Now, assume that there exists an angle θ ∈ Θ such that the dynamical ray Rb(θ) bifurcates. Then, it bifurcates on a preimage of the escaping critical point ω2(b) and one of its forward images bifurcates on ω2(b). But this contradicts the fact that the forward orbit of the ray Rb(θ) is contained in C \ V2(b), which does not contain ω2(b).
Next, observe that the mapping h : W0 × Xb1 → Xb defined by $h(b, z) = \varphi_b^{-1} \circ \varphi_{b_1}(z)$ is a holomorphic motion of Xb1 parametrized by b ∈ W0. The λ-Lemma by Mañé, Sad and Sullivan [MSS] shows that h extends to a holomorphic motion of the closure $\overline{X_{b_1}}$ of Xb1 in C. Since W0 is a simply connected Riemann surface, Słodkowski's Theorem (see [Sl, D2]) shows that one can in fact extend h to a holomorphic motion of the whole complex plane C, still parametrized by b ∈ W0. We will keep the notation h for this extension. The mapping z → hb(z) = h(b, z) is a K(b)-quasi-conformal homeomorphism, where K(b) is the exponential of the hyperbolic distance between b1 and b in W0. It maps the set of dynamical rays Xb1 to the set of dynamical rays Xb, and $h_b(\overline{X_{b_1}} \setminus X_{b_1}) = \overline{X_b} \setminus X_b$. Since Θ is closed, the set Jb is contained in the Julia set J(fb). Since K(fb) is full, the set Kb is contained in the filled-in Julia set K(fb). Finally, hb provides a quasi-conformal homeomorphism between Kb and Kb1,
and since Kb1 is quasi-conformally homeomorphic to the quadratic Julia set K(λz + z²) (see Propositions 5 and 7), Theorem A is proved.
Observe that the mapping hb conjugates the polynomials fb1 and fb on the set of rays Xb1, i.e., for any z ∈ Xb1 we have hb ∘ fb1 = fb ∘ hb. By continuity of hb, this property holds on the closure $\overline{X_{b_1}}$ and in particular on Jb1. Observe also that the fixed point 0 never belongs to the set Xb, so that the set Xb ∪ {0} moves holomorphically when b moves in W0. In particular, we can choose the extension h so that h(b, 0) = 0 for any b ∈ W0. Since 0 ∈ Kb1, this shows that for any b ∈ W0, 0 belongs to Kb.
We finally would like to mention that we could choose the extension of h so that hb conjugates the polynomials fb1 and fb on the whole set Kb1, and such that the distributional derivative $\partial h_b / \partial \bar{z}$ vanishes on Kb1. But this would require extra work and we will just mention the idea of the proof. We could first prove that for any b ∈ W0, there is a restriction fb : Ub′ → Ub of fb to a neighborhood of Kb which is a quadratic-like map. We could then prove as in Proposition 5 that the hybrid class of this quadratic-like restriction contains the quadratic polynomial z → λz + z². In particular, for any b ∈ W0, the polynomial-like maps fb : Ub′ → Ub and fb1 : Ub1′ → Ub1 would be hybrid conjugate, i.e., there would exist a quasi-conformal homeomorphism hb : Ub1 → Ub such that hb ∘ fb1 = fb ∘ hb on Ub1′ and such that the distributional derivative $\partial h_b / \partial \bar{z}$ vanishes on Kb1. We would finally have to prove that the restriction of the mapping (b, z) → hb(z) to W0 × Kb1 gives a holomorphic motion of Kb1 extending h.
7. The Dyadic Wakes Wϑ
Observe that in the wake W0 we see a copy M′ of a Mandelbrot set, with root point at b0. In this section, we will explain why we see such a copy, and we will determine a Cantor set of angles such that the boundary of M′ is the accumulation set of the corresponding parameter rays.
The reason why such a copy appears is that for any b ∈ W0, the mapping fb : U2 → V2 is a ramified covering of degree 2, ramified at ω2. The sets U2 and V2 are topological disks and U2 ⊂ V2, and the family $(f_b : U_2 \to V_2)_{b \in W_0}$ is almost a Mandelbrot-like family (see [DH2]). The problem is that U2 is not relatively compact in V2. If λ ≠ 1, one can cut along equipotentials and thicken domains (see [M]) to construct quadratic-like mappings. Such an approach has already been developed by Epstein and Yampolsky [EY], who proved that there exists a homeomorphism χ : M′ \ {b0} → M \ {1/4} such that for any b ∈ M′, there exists a quadratic-like restriction fb : Vb′ → Vb which is hybrid conjugate to z → z² + χ(b). The case λ = 1 is different and less understood. Indeed, when λ = 1, the fixed point β(b) is parabolic with multiplier 1. In this case, no more thickening is possible. We would like to mention that in [Ha], Haïssinsky has made a major step in the direction of proving that in the case λ = 1, the set M′ is nevertheless homeomorphic to the Mandelbrot set. Since the thickening is not possible when λ = 1, we need to adopt an approach that is not based on surgery.
Definition 13.
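$$K_b' = \{ z \in K(f_b) \mid (\forall n \ge 0)\ f_b^{\circ n}(z) \in \overline{U_2} \}, \qquad J_b' = \partial K_b' \qquad\text{and}\qquad M' = \{b_0\} \cup \{ b \in W_0 \mid K_b' \text{ is connected} \}.$$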
Proposition 10. The sets Kb′ and M′ have the following properties:
1. for any b ∈ W0, Kb′ is a compact set, Kb′ ⊂ K(fb) and Jb′ ⊂ J(fb);
2. a parameter b ∈ W0 belongs to M′ if and only if ω2(b) belongs to Kb′;
3. M′ is a compact subset of Mλ and ∂M′ ⊂ ∂Mλ;
4. if b ∈ W0 \ M′, then any cycle of fb which entirely lies in U2 is repelling.
Proof. 1. For any b in W0, we have
$$K_b' = \bigcap_{n \ge 0} K_n, \qquad\text{where}\qquad K_0 = K(f_b) \cap \overline{U_2} \qquad\text{and}\qquad K_{n+1} = \left( f_b|_{\overline{U_2}} \right)^{-1}(K_n).$$
Each Kn is compact. Hence, Kb′ is also compact. By definition, Kb′ ⊂ K(fb). Given any point z in a connected component U of the interior of K(fb), if $f_b^{\circ n}(z) \notin \overline{U_2}$ for some integer n ≥ 0, then $f_b^{\circ n}(U)$ entirely lies in $\mathbb{C} \setminus \overline{U_2}$. Hence, ∂Kb′ ⊂ ∂K(fb), i.e., Jb′ ⊂ J(fb).
2. Let us now consider a parameter b ∈ W0. If ω2(b) ∈ Kb′, then ω2(b) ∈ K(fb), and $K_0 = K(f_b) \cap \overline{U_2}$ is connected. By induction, assume Kn is connected. Then, since ω2(b) ∈ Kb′, we see that fb(ω2(b)) ∈ Kn, and Kn+1 is also connected. Hence, Kb′ is the intersection of a nested sequence of connected closed sets. Thus, Kb′ is connected and b ∈ M′. Conversely, if ω2(b) ∉ Kb′, there exists an integer n ≥ 1 such that $f_b^{\circ n}(\omega_2(b)) \notin \overline{U_2}$. Since $K_0 \subset \overline{U_2}$, we see that Kn has at least two connected components. This shows that Kb′ is not connected and b ∉ M′.
3. If b belongs to M′, then ω2(b) ∈ Kb′ ⊂ K(fb). Hence, M′ ⊂ Mλ. We have seen that b ∈ W0 \ M′ if and only if there exists an integer n ≥ 1 such that $f_b^{\circ n}(\omega_2(b)) \notin \overline{U_2}$. Since $\overline{U_2}$ moves holomorphically, hence continuously, when b moves in W0, we see that this is an open condition. Hence, W0 \ M′ is open in W0. Since the closure of M′ is contained in the closure of Mλ, since Mλ ∩ ∂W0 = {b0}, and since by definition M′ ∩ ∂W0 = {b0}, we see that M′ is closed, hence compact. Let us now show that ∂M′ ⊂ ∂Mλ. Take a parameter b ≠ b0 in the boundary of M′. Then in any neighborhood U ⊂ W0 of b, we can find a parameter b′ ∈ U \ M′, so that there exists an integer n ≥ 1 with $f_{b'}^{\circ n}(\omega_2(b')) \notin \overline{U_2}$. Since $f_b^{\circ n}(\omega_2(b)) \in \overline{U_2}$, and since the boundary of U2 moves holomorphically when the parameter moves in U, we can find a parameter b″ ∈ U such that $f_{b''}^{\circ n}(\omega_2(b'')) \in \partial U_2$. There are two possibilities:
either $f_{b''}^{\circ n}(\omega_2(b''))$ belongs to a dynamical ray; in that case b″ ∉ Mλ;
or $f_{b''}^{\circ n}(\omega_2(b''))$ is one of the two points β(b″) or β1(b″); in that case the critical point ω2(b″) is eventually mapped to a repelling fixed point, and it is well-known that b″ ∈ ∂Mλ.
4. Assume that b ∈ W0 \ M′. Then there exists a smallest integer n ≥ 1 such that $f_b^{\circ n}(\omega_2(b)) \notin \overline{U_2}$. Define U′ to be the nth preimage of U2 by fb|U2 and define U to be the image of U′ by fb. Then, fb : U′ → U is a non-ramified covering map of degree 2. Hence, there are two well-defined inverse branches g1 : U → U′ and g2 : U → U′. By Schwarz's lemma, those two branches are contracting for the Poincaré metric of U and thus, every periodic orbit of fb contained in U is repelling (there may be periodic orbits contained in the closure of U, but we are only concerned by the ones contained inside U).
Definition 14. We define Θ′ ⊂ R/Z to be the set of angles θ such that for any n ≥ 0, 3ⁿθ ∈ [1/2, 1] mod 1. We also define X′ to be the set of parameter rays
$$X' = \bigcup_{\theta \in \Theta'} R_\lambda(\theta/3),$$
and for any b ∈ M′, we define Xb′ to be the set of dynamical rays
$$X_b' = \bigcup_{\theta \in \Theta'} R_b(\theta).$$
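For rational angles, the condition of Definition 14 can be tested in finitely many steps, because the orbit of a rational number under multiplication by 3 mod 1 is eventually periodic. The following is a minimal sketch of such a test; the function name `in_theta_prime` is a hypothetical helper introduced only for illustration, and the class of 1 mod 1 is represented by 0.

```python
from fractions import Fraction


def in_theta_prime(theta: Fraction) -> bool:
    """Return True if every iterate 3^n * theta lies in [1/2, 1] mod 1.

    Illustrative helper (not from the paper). It only handles rational
    angles, whose orbit under tripling is eventually periodic.
    """
    half = Fraction(1, 2)
    seen = set()
    t = theta % 1  # representative in [0, 1)
    while t not in seen:
        # 0 represents the class of 1, which belongs to [1/2, 1] mod 1.
        if t != 0 and t < half:
            return False
        seen.add(t)
        t = (3 * t) % 1  # tripling map on R/Z
    return True


if __name__ == "__main__":
    for angle in (Fraction(2, 3), Fraction(5, 6), Fraction(5, 9),
                  Fraction(1, 3), Fraction(3, 4)):
        print(angle, in_theta_prime(angle))
```

With this test, the angles 2/3, 5/6 and 5/9 are accepted, while 1/3 and 3/4 are rejected.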
Remark. The set Θ′ is the set of angles θ that can be written in base 3 with only 1's and 2's. It is a Cantor set, invariant under multiplication by 3. In fact, θ ∈ Θ′ if and only if θ − 1/2 ∈ Θ. Observe also that for any θ ∈ Θ′, the two angles θ/3 + 1/3 and θ/3 + 2/3 also belong to Θ′.
Definition 15. We will say that b ∈ M′ is a tip of M′ if and only if the orbit of ω2(b) is eventually mapped to β(b), i.e., if there exists an integer k ≥ 1 such that $f_b^{\circ k}(\omega_2(b)) = \beta(b)$.
Proposition 11. We have the following dynamical result:
1. for any parameter b ∈ M′, we have $J_b' = \overline{X_b'} \setminus X_b'$, where the closure is taken in C;
2. for any b ∈ M′, any z ∈ Jb′ which is eventually mapped to β(b) is the landing point of at least two rays Rb(θ⁻) and Rb(θ⁺), where θ± ∈ Θ′. Moreover, if $f_b^{\circ k}(z) = \beta(b)$ and $(f_b^{\circ k})'(z) \neq 0$, then there are exactly two dynamical rays landing at z.
The parameter counterpart of this statement is the following:
3. the boundary of M′ is the accumulation set of X′: $\partial M' = \overline{X'} \setminus X'$;
4. for any tip b ∈ M′, there are exactly two angles θ⁻ ∈ Θ′ and θ⁺ ∈ Θ′ such that ω2′(b) is the landing point of the two dynamical rays Rb(θ⁻/3) and Rb(θ⁺/3). Furthermore, the parameter rays Rλ(θ⁻/3) and Rλ(θ⁺/3) land at b ∈ M′.
Figure 12 shows the set X′ of parameter rays and the set M′.
Proof. 1. Let us fix a parameter b ∈ M′. Then, the dynamical rays Rb(θ), θ ∈ Θ′, do not bifurcate and the set Xb′ is exactly the set of rays in U2(b) whose forward orbit remains in U2(b). Take any point z0 in the accumulation set of Xb′. Since Θ′ is closed, z0 ∈ J(fb). Then, since Θ′ is forward invariant by multiplication by 3, for any integer n ≥ 0, the point $z_n = f_b^{\circ n}(z_0)$ is in the accumulation set of Xb′. Since Xb′ ⊂ U2(b), we obtain $z_n \in \overline{U_2(b)}$. But this precisely shows that z0 ∈ Jb′. Hence $\overline{X_b'} \setminus X_b' \subset J_b'$. Conversely, given any point z0 ∈ Jb′ and any connected neighborhood W0 of z0, we must show that W0 contains points of Xb′.
Since z0 ∈ Jb′, for any integer n ≥ 0, the point $z_n = f_b^{\circ n}(z_0)$ belongs to $\overline{U_2(b)}$. Since Jb′ ⊂ J(fb) (see Proposition 10), the family of iterates $f_b^{\circ n} : W_0 \to \mathbb{C}$ is not normal. Hence, there exists a first integer n ≥ 0 such that $W_n = f_b^{\circ n}(W_0)$ intersects C \ U2. Since Wn is connected and contains the point $z_n \in \overline{U_2(b)}$, we see that Wn intersects at least one of the rays Rb(0/1), Rb(1/2), Rb(2/3), or Rb(5/6). Besides, for any integer k ∈ [0, n − 1], Wk is contained in U2(b). Hence, W0 intersects a ray which is eventually mapped to one of the rays Rb(0/1), Rb(1/2),
Fig. 12. The set X′ of parameter rays and the set M′ (labels in the figure: the wakes W1/4, W1/2, W3/4 and the set X′)
Rb(2/3), or Rb(5/6) and whose forward orbit remains in U2. Such a ray necessarily belongs to the set Xb′.
2. We only need to observe that for any z ∈ Jb′, if there exists an integer k ≥ 0 such that $f_b^{\circ k}(z) = \beta(b)$, then there exists a neighborhood U of z such that $f_b^{\circ k} : U \to f_b^{\circ k}(U)$ is a covering. This covering may be ramified if z is a preimage of ω2(b). However, by restricting U if necessary, we may assume that z is the only ramification point. Since the two rays Rb(0/1) and Rb(1/2) land at β(b), there are at least two rays Rb(θ⁻) and Rb(θ⁺) that land at z, satisfying $f_b^{\circ k}(R_b(\theta^-)) = R_b(0/1)$ and $f_b^{\circ k}(R_b(\theta^+)) = R_b(1/2)$. Finally, since the forward orbit of z remains in V2, we immediately see that the forward orbit of Rb(θ±) also remains in V2. Thus, θ± ∈ Θ′. Furthermore, if $(f_b^{\circ k})'(z) \neq 0$, we have to show that there are exactly two dynamical rays landing at z. Since Rb(0/1) is landing at β(b), every dynamical ray landing at β(b) must have combinatorial rotation number 0/1. Hence, the dynamical rays landing at β(b) are exactly the rays Rb(0/1) and Rb(1/2). Since $f_b^{\circ k}$ is a local isomorphism at z, mapping z to β(b), there are exactly two dynamical rays landing at z.
3. Since Θ′ is closed, the accumulation set $\overline{X'} \setminus X'$ is contained in the boundary of Mλ. Given any parameter b′ in this accumulation set, we want to show that b′ ∈ M′. Since, by definition of M′, the parameter b0 belongs to M′, we may assume that b′ ≠ b0. In this case, b′ ∈ W0. Given any parameter b ∈ X′, and any integer n ≥ 1, the point $f_b^{\circ n}(\omega_2(b))$ belongs to a dynamical ray Rb(3ⁿθ), for some θ ∈ Θ′. Hence, the whole orbit $\{f_b^{\circ n}(\omega_2(b))\}_{n \ge 0}$ belongs to U2(b). Then, by continuity of $\overline{U_2(b)}$ at b′ ∈ W0, the whole orbit $\{f_{b'}^{\circ n}(\omega_2(b'))\}_{n \ge 0}$ belongs to $\overline{U_2(b')}$. But since b′ ∈ Mλ, we know that ω2(b′) ∈ K(fb′). This shows that b′ ∈ M′. Hence $\overline{X'} \setminus X' \subset \partial M'$.
Conversely, we want to prove that $\partial M' \subset \overline{X'} \setminus X'$. We know that b0 is the landing point of the rays Rλ(1/6) and Rλ(1/3). Hence $b_0 \in \overline{X'} \setminus X'$. Given any parameter b∗ ∈ ∂M′ \ {b0} ⊂ ∂Mλ ∩ W0, and any neighborhood U ⊂ W0 of b∗, we want to show that there exists a parameter b ∈ U such that one of the rays Rb(θ), θ ∈ Θ′, bifurcates on ω2(b). Assume this is not the case. Then, the set $X_b' = \bigcup_{\theta \in \Theta'} R_b(\theta)$ moves holomorphically when b ∈ U, and therefore $\overline{X_b'}$ remains connected for all b ∈ U and $\overline{X_b'} \setminus X_b' \subset J(f_b)$. By Proposition 10 we have ∂M′ ⊂ ∂Mλ, so there exists a parameter b′ ∈ U such that ω2(b′) ∉ K(fb′). Since the rays Rb′(θ), θ ∈ Θ′, do not bifurcate on ω2(b′) and since $\overline{X_{b'}'} \setminus X_{b'}' \subset J(f_{b'})$, we see that ω2(b′) does not belong to $\overline{X_{b'}'}$. Besides, since b′ is in the wake W0, the critical point ω2(b′) is in the region U2(b′). Hence, there exists an angle θ1 ∈ ]1/2, 2/3[ such that the dynamical rays Rb′(θ1) and Rb′(θ1 + 1/3) bifurcate on ω2(b′). Since the set Rb′(θ1) ∪ Rb′(θ1 + 1/3) ∪ {ω2(b′)} does not intersect and does not disconnect $\overline{X_{b'}'}$, and since it separates $\beta(b') \in \overline{X_{b'}'}$ and $\beta_2(b') \in \overline{X_{b'}'}$, we get a contradiction.
4. Let us now consider a tip b∗ ∈ M′. Then, fb∗(ω2(b∗)) ∈ Jb∗′, there exists a smallest integer k ≥ 1 such that $f_{b_*}^{\circ k}(\omega_2(b_*)) = \beta_2(b_*)$, and $f_{b_*}^{\circ (k-1)}$ is a local isomorphism at fb∗(ω2(b∗)). Hence, the dynamical statement shows that there are exactly two dynamical rays landing at fb∗(ω2(b∗)). Those rays are of the form Rb∗(θ⁺) and Rb∗(θ⁻), θ± ∈ Θ′. We will now show that the parameter ray Rλ(θ⁺/3) lands at the parameter b∗. A similar proof can be carried out for the parameter ray Rλ(θ⁻/3).
Observe that the two dynamical rays Rb∗(θ⁻/3) and Rb∗(θ⁺/3) land at ω2′(b∗). Besides, since the k − 1 first iterates of ω2(b∗) omit the rays Rb∗(0/1) and Rb∗(1/2), and since the rays Rb(0/1) and Rb(1/2) move holomorphically when b ∈ W0, it follows that there exists a neighborhood U ⊂ W0 of b∗ such that for any b ∈ U and any i ≤ k − 1, $f_b^{\circ i}(\omega_2(b))$ omits the two rays Rb(0/1) and Rb(1/2). In particular, for any b ∈ U, the two dynamical rays Rb(θ⁻) and Rb(θ⁺) do not bifurcate. Pulling back once more, we see that for any b ∈ U, the dynamical rays Rb(θ⁻/3) and Rb(θ⁺/3) do not bifurcate, and so move holomorphically when b moves in U. Next, for every η ∈ ]0, +∞[, define hη : U → C to be the holomorphic function
$$h_\eta(b) = \varphi_b^{-1}\left( e^{\eta + 2i\pi\theta^+/3} \right).$$
When η tends to 0, one can show that hη converges uniformly on U to a function h0 (this is in fact the way one proves that the holomorphic motion of the ray extends to its closure). For any b ∈ U, h0(b) is the landing point of the dynamical ray Rb(θ⁺/3). Moreover, the function h0 − ω2′ vanishes at b∗. Besides, it does not vanish on U \ Mλ since for any b ∈ U \ Mλ, ω2′(b) ∉ K(fb), whereas h0(b) ∈ K(fb). Let us assume that the parameter ray Rλ(θ⁺/3) does not land at b∗. Then there exist a neighborhood U′ ⊂ U of b∗ and a sequence ηk ↘ 0 such that $\Phi_\lambda^{-1}(e^{\eta_k + 2i\pi\theta^+/3}) \notin U'$, i.e., the function hηk − ω2′ does not vanish on U′. Then, Hurwitz's theorem shows that h0 − ω2′ either does not vanish on U′, or vanishes everywhere on U′. This is in contradiction to the previous observation. Hence, the parameter ray Rλ(θ⁺/3) lands at b∗.
Remark. We do not claim that the only rays accumulating on Jb′ are rays of the form Rb(θ), θ ∈ Θ′, or that the only rays accumulating on M′ are rays of the form Rλ(θ/3), θ ∈ Θ′. This would be of the same order of difficulty as proving that for a quadratic polynomial, the only dynamical ray accumulating on the β-fixed point is the ray of angle 0/1. In the case of Cremer polynomials, this is not known.
We will now consider the unbounded connected components of
$$W_0 \setminus \overline{\bigcup_{\theta \in \Theta'} R_\lambda(\theta/3)}.$$
We will show that those connected components are naturally indexed by the dyadic angles ϑ = (2p + 1)/2ᵏ, k ≥ 1 and 2p + 1 < 2ᵏ, and we will denote them by Wϑ. We will also show that the boundary of a component Wϑ is the union of two parameter rays Rλ(ϑ⁻/3) and Rλ(ϑ⁺/3), ϑ± ∈ Θ′, that land at a common parameter bϑ ∈ M′. In the next section, we will show that for every dyadic angle ϑ, Mλ ∩ Wϑ contains a quasi-conformal copy Kϑ of the filled-in Julia set K(λz + z²), such that bϑ ∈ ∂Kϑ ⊂ ∂Mλ.
Definition 16. Any dyadic angle ϑ = (2p + 1)/2ᵏ, k ≥ 1 and 0 < 2p + 1 < 2ᵏ, can be expressed in a unique way as a finite sum
$$\frac{2p+1}{2^k} = \sum_{i=1}^{k} \frac{\varepsilon_i}{2^i},$$
where each εi, i = 1, ..., k, takes the value 0 or 1. We define ϑ⁻ and ϑ⁺ by the formulae:
$$\vartheta^- = \sum_{i=1}^{k} \frac{\varepsilon_i + 1}{3^i} \qquad\text{and}\qquad \vartheta^+ = \vartheta^- + \frac{1}{2 \cdot 3^k}.$$
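The formulae of Definition 16 are easy to evaluate with exact rational arithmetic. The sketch below is only an illustration: the function name `theta_minus_plus` and its argument convention (the integers p and k of the dyadic angle (2p + 1)/2ᵏ) are assumptions made here, not notation from the paper.

```python
from fractions import Fraction


def theta_minus_plus(p: int, k: int):
    """Return (theta_minus, theta_plus) for the dyadic angle (2p+1)/2^k.

    Illustrative helper following Definition 16: the binary digits
    eps_1..eps_k of the dyadic angle are shifted by 1 and read in base 3.
    """
    numerator = 2 * p + 1
    if k < 1 or not (0 < numerator < 2 ** k):
        raise ValueError("need k >= 1 and 0 < 2p+1 < 2^k")
    # Binary digits eps_1..eps_k of (2p+1)/2^k, most significant first.
    digits = [(numerator >> (k - i)) & 1 for i in range(1, k + 1)]
    minus = sum(Fraction(eps + 1, 3 ** i)
                for i, eps in enumerate(digits, start=1))
    plus = minus + Fraction(1, 2 * 3 ** k)
    return minus, plus


if __name__ == "__main__":
    for p, k in ((0, 1), (0, 2), (1, 2)):
        minus, plus = theta_minus_plus(p, k)
        print(Fraction(2 * p + 1, 2 ** k), minus, plus)
```

For ϑ = 1/2 this gives ϑ⁻ = 2/3 and ϑ⁺ = 5/6, so that ϑ⁻/3 = 2/9 and ϑ⁺/3 = 5/18. For ϑ = 1/4 it gives 5/9 and 11/18, and for ϑ = 3/4 it gives 8/9 and 17/18. In each case 3ᵏϑ⁻ is an integer and 3ᵏϑ⁺ is a half-integer.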
Remark. There are two ways of writing a dyadic number ϑ in base 2: ϑ = 0.ε1ε2...εk−1 01111... = 0.ε1ε2...εk−1 10000.... Read those two numbers in base 3 and add 1/2. You will obtain ϑ⁻ and ϑ⁺.
Proposition 12. Given any dyadic angle ϑ = (2p + 1)/2ᵏ, k ≥ 1, 0 < 2p + 1 < 2ᵏ, the two parameter rays Rλ(ϑ⁻/3) and Rλ(ϑ⁺/3) land at a common tip bϑ ∈ M′. More precisely,
$$f_{b_\vartheta}^{\circ (k+1)}(\omega_2(b_\vartheta)) = \beta(b_\vartheta),$$
and the two dynamical rays Rbϑ(ϑ⁻/3) and Rbϑ(ϑ⁺/3) land at ω2′(bϑ).
Proof. Step 1. Let us first prove that the parameter ray Rλ(ϑ⁻/3) lands either at b0 or at a tip bϑ ∈ M′ (a similar proof works for the parameter ray Rλ(ϑ⁺/3)). The argument we use is very similar to the one written in the Orsay notes [DH1]. Let us choose any parameter bϑ in the accumulation set of the parameter ray Rλ(ϑ⁻/3) and assume bϑ ≠ b0. Then, bϑ belongs to the wake W0 and Proposition 11 shows that bϑ ∈ M′. Moreover, observe that 3ᵏϑ⁻ ≡ 0 mod 1. Hence, if b is the point of the parameter ray Rλ(ϑ⁻/3) of potential η, then $f_b^{\circ (k+1)}(\omega_2(b))$ is the point of the dynamical ray Rb(0/1) of potential $3^{k+1}\eta$. Since bϑ is in the wake W0, the dynamical ray Rb(0/1) moves holomorphically in a neighborhood of bϑ and lands at β(b). Hence, by continuity as η tends to 0, we obtain that $f_{b_\vartheta}^{\circ (k+1)}(\omega_2(b_\vartheta)) = \beta(b_\vartheta)$. This shows that bϑ is a tip of M′. Furthermore, the set of parameters b such that $f_b^{\circ (k+1)}(\omega_2(b)) = \beta(b)$ is discrete and the accumulation set
of the parameter ray Rλ(ϑ⁻/3) is connected. Hence, the parameter ray Rλ(ϑ⁻/3) lands either at b0 or at a tip bϑ ∈ M′.
Let us now show that if the parameter ray Rλ(ϑ⁻/3) lands at a tip bϑ ∈ M′, then the dynamical ray Rbϑ(ϑ⁻/3) lands at ω2′(bϑ). For this purpose we need the following lemma:
Lemma 12. Let θ be any angle such that 3ᵏθ = 0 mod 1 or 3ᵏθ = 1/2 mod 1 for some integer k, and let b∗ be any parameter in Mλ ∩ W0. Then the dynamical ray Rb∗(θ) lands at a preimage z∗ of β(b∗). Assume z∗ is not a preimage of the critical point ω2(b∗). Then, when b moves in a sufficiently small neighborhood of b∗, the ray Rb(θ) does not bifurcate, and thus moves holomorphically.
Proof. We will treat the case 3ᵏθ = 1/2 mod 1. The other case is similar. Since b∗ ∈ Mλ, the dynamical ray Rb∗(θ) does not bifurcate. Besides, since 3ᵏθ = 1/2 mod 1, we have
$$f_{b_*}^{\circ k}(R_{b_*}(\theta)) = R_{b_*}(1/2).$$
Since the ray Rb∗(1/2) lands at β(b∗), we see that the ray Rb∗(θ) lands at a preimage z∗ of β(b∗). The lemma now follows directly from [DH1], Proposition 3, exposé 8.
We can apply the above lemma to the angle ϑ⁻/3 and the parameter bϑ. It shows that the ray Rbϑ(ϑ⁻/3) lands at a preimage zϑ of β(bϑ). If zϑ is not a preimage of the critical point ω2(bϑ), then the ray moves holomorphically in a neighborhood of bϑ. We define bη to be the point of potential η on the parameter ray Rλ(ϑ⁻/3). Then ω2′(bη) is the point of potential η on the dynamical ray Rbη(ϑ⁻/3). When η tends to 0, bη tends to bϑ and ω2′(bη) converges to the landing point of the dynamical ray Rbϑ(ϑ⁻/3). By continuity of the function ω2′, it proves that the dynamical ray Rbϑ(ϑ⁻/3) lands at ω2′(bϑ).
Hence, the only remaining difficulty is proving that zϑ is not a preimage of the critical point ω2(bϑ). If this were the case, one could find an integer k1 such that $f_{b_\vartheta}^{\circ k_1}(z_\vartheta) = \omega_2(b_\vartheta)$.
Note that k1 ≥ 1 since ω2(bϑ) and the dynamical ray Rbϑ(ϑ⁻/3) are separated by Rbϑ(0/1) ∪ Rbϑ(1/2) ∪ {β(bϑ)}. Since ω2(bϑ) is strictly preperiodic (this is our assumption that bϑ ≠ b0), iterating once more, we know that $f_{b_\vartheta}^{\circ (k_1+1)}(z_\vartheta) = f_{b_\vartheta}(\omega_2(b_\vartheta))$ is not a preimage of ω2(bϑ) and is the landing point of the dynamical ray $R_{b_\vartheta}(3^{k_1}\vartheta^-)$. Hence, we can apply Lemma 12. It shows that the ray $R_b(3^{k_1}\vartheta^-)$ moves holomorphically in a neighborhood of bϑ. Then again, defining bη to be the point of potential η on the parameter ray Rλ(ϑ⁻/3), we get by continuity that $f_{b_\vartheta}^{\circ (k_1+1)}(\omega_2(b_\vartheta)) = f_{b_\vartheta}(\omega_2(b_\vartheta))$. Hence, either $f_{b_\vartheta}^{\circ k_1}(\omega_2(b_\vartheta)) = \omega_2(b_\vartheta)$ or $f_{b_\vartheta}^{\circ k_1}(\omega_2(b_\vartheta)) = \omega_2'(b_\vartheta)$. The first case is not possible since ω2(bϑ) is not periodic. The second case is also impossible since bϑ ∈ M′ and thus the orbit of ω2(bϑ) and the point ω2′(bϑ) are separated by Rbϑ(0/1) ∪ Rbϑ(1/2) ∪ {β(bϑ)}.
Step 2. Let us now show that the parameter rays Rλ(ϑ⁺/3) and Rλ(ϑ⁻/3) land at the same parameter. Either both of them land at b0, or one of them lands at a tip bϑ ≠ b0 of M′. Without loss of generality, assume that Rλ(ϑ⁻/3) lands at bϑ ≠ b0. We just proved in Step 1 that the dynamical ray Rbϑ(ϑ⁻/3) lands at ω2′(bϑ). Proposition 11 (4) shows that there are exactly two rays landing at ω2′(bϑ). It is not difficult to check that the other dynamical ray landing at ω2′(bϑ) is Rbϑ(ϑ⁺/3). Proposition 11 (4) then shows that the parameter ray Rλ(ϑ⁺/3) lands at bϑ.
Step 3. We now need to prove that the parameter rays Rλ(ϑ⁺/3) and Rλ(ϑ⁻/3) do not land at b0. The usual technique to prove this kind of result is based on a careful study
of parabolic implosion (see for example the Orsay notes [DH1]). We will use a different approach based on the Yoccoz inequality (see Hubbard [Hu] or Petersen [P]).
Let us first define Wϑ to be the connected component of $W_0 \setminus \overline{R_\lambda(\vartheta^-/3) \cup R_\lambda(\vartheta^+/3)}$ that contains the parameter rays Rλ(θ), with θ ∈ ]ϑ⁻/3, ϑ⁺/3[. We claim that the component Wϑ cannot intersect M′. Indeed, Proposition 10 shows that if Wϑ intersects M′, there is a parameter b ∈ Wϑ such that b is a tip of M′ (tips of M′ are dense in ∂M′). But Proposition 11 then shows that there are two parameter rays landing at b whose angles are of the form θ/3 with θ ∈ Θ′. However, no angle between ϑ⁻ and ϑ⁺ can be written with only 1's and 2's.
Let us now assume that the parameter rays Rλ(ϑ⁻/3) and Rλ(ϑ⁺/3) land at b0. Since Wϑ ∩ M′ = ∅, Proposition 10 shows that for any b ∈ Mλ ∩ Wϑ, the fixed point α(b) is repelling and thus has a rotation number. This rotation number is constant on any connected component L of Mλ ∩ Wϑ. Besides, since Mλ is connected, we necessarily have $b_0 \in \overline{L}$. Since at b0 the fixed point α(b) collapses with β(b) and becomes a multiple fixed point, the multiplier at α(b) tends to 1 as b tends to b0, and the Yoccoz inequality shows that the rotation number of α(b) is 0/1 for any b ∈ L. But in this case, for any b ∈ L one of the two dynamical rays Rb(0/1) or Rb(1/2) has to land at α(b), which is impossible since they both land at β(b) ≠ α(b). This gives the required contradiction.
Definition 17. For any dyadic angle ϑ, we define the wake Wϑ to be the connected component of $\mathbb{C} \setminus \overline{R_\lambda(\vartheta^-/3) \cup R_\lambda(\vartheta^+/3)}$ that contains the parameter rays Rλ(θ), with θ ∈ ]ϑ⁻/3, ϑ⁺/3[.
Proposition 13. Given any dyadic angle ϑ = (2p + 1)/2ᵏ, k ≥ 1, 0 < 2p + 1 < 2ᵏ, and any parameter b ∈ Wϑ, the dynamical rays Rb(ϑ⁻/3) and Rb(ϑ⁺/3) do not bifurcate and land at a common preimage of β(b).
Proof. Let us assume that b belongs to the parameter ray Rλ(θ) and that the dynamical ray Rb(ϑ⁻/3) bifurcates.
Then, note that the dynamical ray $R_b(\theta)$ bifurcates on $\omega_2(b)$. Hence, $R_b(3\theta)$ contains the critical value $f_b(\omega_2(b))$. Moreover, the dynamical ray $R_b(\vartheta^-/3)$ bifurcates on a preimage of $\omega_2(b)$. Hence, there exists an integer $n \ge 0$ such that $f_b^{\circ n}(R_b(\vartheta^-/3)) = R_b(3^{n-1}\vartheta^-)$ bifurcates on $\omega_2(b)$. Since $R_b(\vartheta^-/3) \subset U_2$, we necessarily have $n \ge 1$, and $R_b(3^n\vartheta^-)$ contains the critical value $f_b(\omega_2(b))$. This shows that the set of parameters $b \in W_0$ where the dynamical ray $R_b(\vartheta^-/3)$ bifurcates is precisely the union of the parameter rays $R_\lambda(\theta)$, where $\theta \in \,]1/6, 2/3[$ and $3\theta = 3^n\vartheta^- \bmod 1$ for some integer $n \ge 1$. It is not difficult to check that for any $n \ge 1$, the angle $3^n\vartheta^- \bmod 1$ does not belong to the interval $[\vartheta^-, \vartheta^+]$. Besides, the parameter ray $R_\lambda(3^n\vartheta^-)$ lands at a tip of M and this tip cannot be $b_\vartheta$ (see Proposition 12). Hence, the set of parameters $b \in W_0$ for which the dynamical ray $R_b(\vartheta^-/3)$ does not bifurcate is a neighborhood of $W_\vartheta$. A similar argument shows that the set of parameters $b \in W_0$ for which the dynamical ray $R_b(\vartheta^+/3)$ does not bifurcate is a neighborhood of $W_\vartheta$. Since at $b_\vartheta$ the two dynamical rays $R_b(\vartheta^-/3)$ and $R_b(\vartheta^+/3)$ land at the common point $b_\vartheta$, we see that this property holds for any parameter $b$ in $W_\vartheta$. Finally, since $f_b^{\circ(k+1)}(R_b(\vartheta^-/3)) = R_b(0/1)$ lands at $\beta(b)$, the landing point of the rays $R_b(\vartheta^-/3)$ and $R_b(\vartheta^+/3)$ is a preimage of $\beta(b)$.
Julia Sets in Parameter Spaces
Definition 18. Given any dyadic angle $\vartheta = (2p+1)/2^k$, $k \ge 1$, $0 < 2p+1 < 2^k$, and any parameter $b \in W_\vartheta$, we define $W_\vartheta(b)$ to be the connected component of $\mathbb{C} \setminus (R_b(\vartheta^-/3) \cup R_b(\vartheta^+/3))$ that contains the dynamical rays $R_b(\theta)$, $\theta \in \,]\vartheta^-/3, \vartheta^+/3[$.

Proposition 14. Given any dyadic angle $\vartheta = (2p+1)/2^k$, $k \ge 1$, $0 < 2p+1 < 2^k$, and any parameter $b \in W_\vartheta$, the co-critical point $\omega_2(b)$ belongs to the region $W_\vartheta(b)$ and the mapping $f_b^{\circ(k+1)} : W_\vartheta(b) \to V_1(b)$ is an isomorphism.

Proof. We have seen (Proposition 13) that the boundary of the region $W_\vartheta(b)$ moves holomorphically when $b$ moves in the wake $W_\vartheta$. Furthermore, the co-critical point $\omega_2(b)$ cannot belong to this boundary since this would mean that $b$ is in the boundary of the wake $W_\vartheta$. Hence, to see that for any parameter $b \in W_\vartheta$, the co-critical point $\omega_2(b)$ belongs to the region $W_\vartheta(b)$, it is enough to check it at one particular parameter $b \in W_\vartheta$. This is clear as soon as $b$ is outside $M_\lambda$. Indeed, in this case $b$ belongs to a parameter ray $R_\lambda(\theta)$ with $\theta \in \,]\vartheta^-/3, \vartheta^+/3[$. Thus, $\omega_2(b)$ belongs to the dynamical ray $R_b(\theta) \subset W_\vartheta(b)$.

Since $f_b^{\circ(k+1)} : \mathbb{C} \to \mathbb{C}$ is a ramified covering, we know that for any connected component $W$ of $(f_b^{\circ(k+1)})^{-1}(V_1(b))$, the restriction $f_b^{\circ(k+1)} : W \to V_1(b)$ is also a ramified covering. Those components are the connected components of $\mathbb{C}$ minus the closure of the dynamical rays $R_b(\theta)$, where $3^{k+1}\theta \bmod 1$ is equal to 0 or 1/2. It is not difficult to check that the region $W_\vartheta(b)$ contains no such ray. Thus, $f_b^{\circ(k+1)} : W_\vartheta(b) \to V_1(b)$ is a ramified covering. Since the boundary of $W_\vartheta(b)$ is mapped to the boundary of $V_1(b)$ with degree 1, $f_b^{\circ(k+1)} : W_\vartheta(b) \to V_1(b)$ is an isomorphism.

8. Copies of Quadratic Julia Sets in the Parameter Plane

In Sect. 6, we have defined the set
\[ X_b = \bigcup_{\theta \in \Theta} R_b(\theta) \]
and we have proved that the mapping $h : W_0 \times X_{b_1} \to X_b$ defined by $h(b, z) = \varphi_b^{-1} \circ \varphi_{b_1}(z)$ gives a holomorphic motion of the set $X_{b_1}$. In this section we fix once and for all a holomorphic motion $h : W_0 \times \mathbb{C} \to \mathbb{C}$ that coincides with the previous holomorphic motion on $W_0 \times X_{b_1}$. This can be done using Słodkowski’s theorem (see Słodkowski [Sl] or Douady [D2]), because $W_0$ is a simply connected Riemann surface. We will also fix once and for all a dyadic angle $\vartheta = (2p+1)/2^k$, $k \ge 1$, $0 < 2p+1 < 2^k$, and we will define $\vartheta^-$, $\vartheta^+$, $b_\vartheta$, $W_\vartheta$ and $W_\vartheta(b)$ as in the previous section.

Definition 19. We define $X_\vartheta$ to be the set of parameter rays
\[ X_\vartheta = \bigcup_{\theta \in \Theta} R_\lambda\!\left( \frac{\vartheta^-}{3} + \frac{\theta}{3^{k+1}} \right). \]
Besides, we define $J_\vartheta$ to be the set $J_\vartheta = \overline{X_\vartheta} \setminus X_\vartheta$, where the closure is taken in $\mathbb{C}$. Finally, we define $K_\vartheta$ to be the complement of the unbounded connected component of $\mathbb{C} \setminus J_\vartheta$.
Fig. 13. The holomorphic motion for λ = −1
Main Theorem. Let $\lambda \in S^1$ be a complex number of modulus 1 and $\vartheta \in \mathbb{R}/\mathbb{Z}$ be a dyadic angle. The set $K_\vartheta$ is contained in $M_\lambda \cap W_\vartheta$, its boundary $J_\vartheta$ is contained in the boundary of $M_\lambda$ and the parameter $b_\vartheta$ belongs to $J_\vartheta$. Besides, there exists a quasi-conformal homeomorphism defined in a neighborhood of $K_\vartheta$, sending $K_\vartheta$ to $K(\lambda z + z^2)$.

Figure 13 suggests the main idea of the proof in the case $\lambda = -1$, $p = 0$ and $k = 1$, i.e., for $\vartheta = 1/2$.

Proof. By definition of $X_\vartheta$ and $J_\vartheta$, we see that $X_\vartheta \subset W_\vartheta$ and $J_\vartheta \subset \partial M_\lambda$. Since $M_\lambda$ is full, we also have $K_\vartheta \subset M_\lambda$. Finally, since $b_\vartheta$ is the landing point of the parameter ray $R_\lambda(\vartheta^-/3)$, we see that $b_\vartheta \in J_\vartheta$. Hence, the only difficulty is proving that $K_\vartheta$ is quasi-conformally homeomorphic to $K(\lambda z + z^2)$.

Lemma 13. The mapping $H_\vartheta : W_0 \to \mathbb{C}$ defined by
\[ H_\vartheta(b) = h_b^{-1}\bigl( f_b^{\circ(k+1)}(\omega_2(b)) \bigr) \]
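is locally quasi-regular. Its restriction to the dyadic wake $W_\vartheta$ is a homeomorphism which is locally quasi-conformal.

Proof. The argument we use is essentially due to Douady and Hubbard [DH2] (with some modifications). Let us first show that the restriction of $H_\vartheta$ to any open subset of $W_0$ which is relatively compact in $W_0$ is a quasi-regular mapping. It is enough to prove that there exists a $\kappa \in [0, 1[$ such that the distributional derivatives of $H_\vartheta$ with respect to $b$ and $\bar b$ are locally in $L^2$ and satisfy
\[ \left\| \frac{\partial H_\vartheta/\partial \bar b}{\partial H_\vartheta/\partial b} \right\|_\infty \le \kappa < 1. \]
Let us take the derivative with respect to $\bar b$ of the equation $h_b \circ H_\vartheta = f_b^{\circ(k+1)} \circ \omega_2$. Since $\partial h_b/\partial \bar b$ and $\partial (f_b^{\circ(k+1)} \circ \omega_2)/\partial \bar b$ identically vanish, we get
\[ \left.\frac{\partial h_b}{\partial z}\right|_{H_\vartheta(b)} \left.\frac{\partial H_\vartheta}{\partial \bar b}\right|_b + \left.\frac{\partial h_b}{\partial \bar z}\right|_{H_\vartheta(b)} \left.\overline{\frac{\partial H_\vartheta}{\partial b}}\right|_b = 0. \]
Thus,
\[ \left| \frac{\partial H_\vartheta/\partial \bar b}{\partial H_\vartheta/\partial b} \right|_b = \left| \frac{\partial H_\vartheta/\partial \bar b}{\overline{\partial H_\vartheta/\partial b}} \right|_b = \left| \frac{\partial h_b/\partial \bar z}{\partial h_b/\partial z} \right|_{H_\vartheta(b)}. \]
The result follows by quasi-conformality of $h_b$.

Now, at every point $b \in W_0$, the mapping $H_\vartheta$ has a local degree which is positive. To see that the restriction of $H_\vartheta$ to the wake $W_\vartheta$ is proper, let us show that $H_\vartheta$ maps $W_\vartheta$ (respectively $\partial W_\vartheta$) to $V_1(b_1)$ (respectively $\partial V_1(b_1)$). Indeed, if $b \in W_\vartheta$, then $\omega_2(b)$ belongs to the region $W_\vartheta(b)$ which is mapped isomorphically by $f_b^{\circ(k+1)}$ to $V_1(b)$ (see Proposition 14). This shows that for any $b \in W_\vartheta$, $f_b^{\circ(k+1)}(\omega_2(b)) \in V_1(b)$. Moreover, by construction, for any $b \in W_0$, we have $h_b(V_1(b_1)) = V_1(b)$. Since $h_b$ is a homeomorphism, we see that
\[ H_\vartheta(b) = h_b^{-1}\bigl( f_b^{\circ(k+1)}(\omega_2(b)) \bigr) \in h_b^{-1}(V_1(b)) = V_1(b_1). \]
Furthermore, the map $H_\vartheta$ is continuous in the whole wake $W_0$ and in particular on the boundary of $W_\vartheta$, i.e., on $R_\lambda(\vartheta^-) \cup R_\lambda(\vartheta^+)$. Since $h_b$ maps $R_{b_1}(0/1)$ (respectively $R_{b_1}(1/2)$) to $R_b(0/1)$ (respectively $R_b(1/2)$), we see that when $b \in \partial W_\vartheta$, i.e., $\omega_2(b) \in R_b(\vartheta^-) \cup R_b(\vartheta^+)$, we have
\[ H_\vartheta(b) \in h_b^{-1}\bigl( f_b^{\circ 2}( R_b(\vartheta^-) \cup R_b(\vartheta^+) ) \bigr) = h_b^{-1}\bigl( R_b(0/1) \cup R_b(1/2) \bigr) = R_{b_1}(0/1) \cup R_{b_1}(1/2) = \partial V_1(b_1). \]
Hence, the mapping $H_\vartheta : W_\vartheta \to V_1(b_1)$ is a proper mapping.

Let us now show that the topological degree of the restriction of $H_\vartheta$ to $W_\vartheta$ is 1. Since $H_\vartheta$ is locally quasi-regular, the topological degree of $H_\vartheta$ at any point $b \in W_\vartheta$ is positive. Hence, it is enough to show that when $b$ turns once around $W_\vartheta$, $H_\vartheta(b)$ turns once around $V_1(b_1)$. But this is straightforward since the point of potential $\eta$ on the parameter ray $R_\lambda(\vartheta^-/3)$ (respectively $R_\lambda(\vartheta^+/3)$) is mapped to the point of potential $3^{k+1}\eta$ on the dynamical ray $R_{b_1}(0/1)$ (respectively $R_{b_1}(1/2)$).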
To conclude the proof of the main theorem, observe that $H_\vartheta(X_\vartheta) = X_{b_1}$. Indeed, for any $\theta \in \Theta$,
\[ H_\vartheta\!\left( R_\lambda\!\left( \frac{\vartheta^-}{3} + \frac{\theta}{3^{k+1}} \right) \right) = R_{b_1}(3^k\vartheta^- + \theta) = R_{b_1}(\theta). \]
Hence,
\[ H_\vartheta(J_\vartheta) = H_\vartheta(\overline{X_\vartheta} \setminus X_\vartheta) = \overline{X_{b_1}} \setminus X_{b_1} = J_{b_1}, \]
and $H_\vartheta(K_\vartheta) = K_{b_1}$. Since we know that $K_{b_1}$ is quasi-conformally homeomorphic to $K(\lambda z + z^2)$, the main theorem is proved.

We say that the family $f_b$ is stable at a parameter $b_0$ if and only if the Julia set $J(f_b)$ moves holomorphically in a neighborhood of $b_0$. The bifurcation locus of the family $f_b$ is defined to be the set of parameters where the family is not stable. Using the results obtained by Mañé, Sad and Sullivan in [MSS], one can prove that the bifurcation locus of the family $f_b$, $b \in \mathbb{C}$, is precisely the boundary of the connectedness locus $M_\lambda$. The following corollary is an immediate consequence of the previous theorem.

Corollary A. For each $\lambda = e^{2i\pi\theta}$, the bifurcation locus of the one-parameter family $f_b(z) = \lambda z + bz^2 + z^3$, $b \in \mathbb{C}$, contains quasi-conformal copies of the quadratic Julia set $J(\lambda z + z^2)$.

9. Non-local Connectivity in the Parameter Plane

We will now prove that when the Julia set $J(\lambda z + z^2)$ is not locally connected, $M_\lambda$ is not locally connected.

Corollary B. If the Julia set $J(\lambda z + z^2)$ is not locally connected, $M_\lambda$ is not locally connected.

Proof. The proof we will give here was explained to us by Lyubich and McMullen. Let us recall that if the continuous image of a locally connected compact set is Hausdorff, then it is locally connected. Thus, it is enough for our purposes to construct a continuous retraction from $M_\lambda$ to the set $K_\vartheta$. Let us first plough in the dynamical plane of $f_{b_1}$. Observe that the unbounded connected components of $\mathbb{C} \setminus X_{b_1}$ are preimages of $V_2(b_1)$ by iterates of $f_{b_1}$. Since $V_2(b_1)$ is bounded by the dynamical rays $R_{b_1}(0/1)$ and $R_{b_1}(1/2)$, which both land at $\beta(b_1)$, we see that each unbounded connected component of $\mathbb{C} \setminus X_{b_1}$ is bounded by two dynamical rays belonging to $X_{b_1}$ which land at a common preimage of $\beta(b_1)$. Harvesting in the parameter plane using $H_\vartheta$, we see that each unbounded connected component of $\mathbb{C} \setminus X_\vartheta$ is bounded by two parameter rays belonging to $X_\vartheta$ which land at a common parameter which belongs to $J_\vartheta$.
We can then define a retraction ψ : C \ Xϑ → Kϑ which is the identity on Kϑ and sends every unbounded connected component W of C \ Xϑ to the landing point of the two parameter rays bounding W. This retraction is continuous. Indeed, every open set in Kϑ can be written U ∩ Kϑ with U open in C \ Xϑ . Then, the preimage of this open set is the union of U and the unbounded connected components of C \ Xϑ intersecting U. This is clearly open. The restriction of ψ to Mλ ⊂ C \ Xϑ gives the required retraction. Let us finally prove that there exist values of λ for which certain parameter rays have a non-trivial impression. In order to state our third corollary, we need to introduce some notations.
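As a purely numerical illustration of the connectedness locus $M_\lambda$ of the family $f_b(z) = \lambda z + bz^2 + z^3$, the following minimal sketch tests whether both critical orbits stay bounded, which is the standard criterion for a polynomial Julia set to be connected. The function names, the iteration cap and the escape radius $|b| + |\lambda| + 2$ are assumptions made for this sketch and are not taken from the paper; a finite number of iterations can only suggest membership, never prove it.

```python
import cmath

def critical_points(lam, b):
    """Roots of f_b'(z) = lam + 2*b*z + 3*z**2."""
    root = cmath.sqrt(b * b - 3 * lam)
    return ((-b + root) / 3, (-b - root) / 3)

def orbit_stays_bounded(lam, b, z, max_iter=500):
    """Heuristic test: iterate f_b and report an escape past a safe radius.

    For |z| >= R = |b| + |lam| + 2 one has |f_b(z)| >= 2*R*|z|, so any
    orbit that leaves the disk of radius R tends to infinity.
    """
    radius = abs(b) + abs(lam) + 2.0
    for _ in range(max_iter):
        if abs(z) > radius:
            return False
        z = lam * z + b * z * z + z * z * z
    return True

def looks_connected(lam, b, max_iter=500):
    """True if neither critical orbit escaped within max_iter steps."""
    return all(orbit_stays_bounded(lam, b, c, max_iter)
               for c in critical_points(lam, b))
```

For instance, `looks_connected(1, 0)` reports no escape, while for `b = 10` one critical orbit leaves the escape disk after a single step.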
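Definition 20. Given any complex number $\lambda$ of modulus 1, we define $P_\lambda$ to be the quadratic polynomial $P_\lambda(z) = \lambda z + z^2$. We define $g_{P_\lambda} : \mathbb{C} \to [0, +\infty[$ to be its Green function and $\varphi_{P_\lambda} : \mathbb{C} \setminus K(P_\lambda) \to \mathbb{C} \setminus \overline{\mathbb{D}}$ to be its Böttcher coordinate. For any angle $\theta \in \mathbb{R}/\mathbb{Z}$, we define $R_{P_\lambda}(\theta)$ to be the dynamical ray of the polynomial $P_\lambda$ of angle $\theta$.

Definition 21. Let $\chi : \mathbb{R}/\mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ be the Cantor map (or devil's staircase) which is constant on each connected component of $(\mathbb{R}/\mathbb{Z}) \setminus \Theta$ and is defined on $\Theta$ by:
\[ \chi\!\left( \sum_{i \ge 1} \frac{\varepsilon_i}{3^i} \right) = \sum_{i \ge 1} \frac{\varepsilon_i}{2^i}, \]
where $\varepsilon_i \in \{0, 1\}$.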
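The digit rule in Definition 21 can be made concrete with exact rational arithmetic. The sketch below is only an illustration: the function names and the truncation parameter `max_digits` are assumptions of this example, and it only handles angles whose ternary digits are all 0 or 1 (it raises an error as soon as a digit 2 appears, rather than extending $\chi$ by constancy).

```python
from fractions import Fraction

def ternary_digits(theta, max_digits=64):
    """First ternary digits of theta mod 1 (stops early if the expansion ends)."""
    x = Fraction(theta) % 1
    digits = []
    for _ in range(max_digits):
        if x == 0:
            break
        x *= 3
        d = x.numerator // x.denominator  # integer part, in {0, 1, 2}
        digits.append(d)
        x -= d
    return digits

def cantor_map(theta, max_digits=64):
    """chi(sum eps_i / 3**i) = sum eps_i / 2**i for digits eps_i in {0, 1}."""
    total = Fraction(0)
    for i, d in enumerate(ternary_digits(theta, max_digits), start=1):
        if d == 2:
            raise ValueError("a ternary digit 2 occurs; angle not handled here")
        total += Fraction(d, 2 ** i)
    return total
```

With these definitions, the angle 4/27 (ternary digits 0, 1, 1) is sent to 1/4 + 1/8 = 3/8, and 1/3 is sent to 1/2.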
Corollary C. Given any complex number λ of modulus 1 and dyadic angle ϑ = (2p + 1)/2k and any angle θ ∈ , the accumulation set of the parameter ray Rλ (ϑ − /3 + θ/3k+1 ) is reduced to a point if and only if the accumulation set of the quadratic ray RPλ (χ (θ )) is reduced to a point. The following proof was explained to us by Douady. Proof. The proof of the main theorem provides a homeomorphism Hϑ : Wϑ → V1 (b1 ) which maps each parameter ray Rλ (ϑ − /3 + θ/3k+1 ), θ ∈ , to the dynamical ray Rb1 (θ ). Hence, it is enough to prove that for any θ ∈ , the accumulation set of the dynamical ray Rb1 (θ ) is reduced to a point if and only if the accumulation set of the quadratic ray RPλ (χ (θ )) is reduced to a point. Let us recall that the mapping fb1 : Ub 1 → Ub1 is a quadratic-like mapping hybrid conjugate to the quadratic polynomial Pλ (see Proposition 5 and Fig. 3). To fix the ideas, we choose a potential η0 > 0, we set UPλ = {z ∈ C | gPλ (z) < 2η0 } and UP λ = {z ∈ C | gPλ (z) < η0 }. Then, we choose a quasi-conformal homeomorphism ψ : Ub1 → UPλ that conjugates fb1 : Ub 1 → Ub1 to Pλ : UP λ → UPλ and that sends the segment of dynamical ray Ub1 ∩ Rb1 (0/1) onto UPλ ∩ RPλ (0/1). We will construct a continuous mapping ψ : Ub1 \ Kb1 → UPλ \ K(Pλ ) which semi-conjugates fb1 : Ub 1 → Ub1 to Pλ : UP λ → UPλ and which maps Rb1 (θ ) ∩ Ub1 , θ ∈ , to RPλ (χ (θ )). We will then prove that the distance, for the hyperbolic metric on C\K(Pλ ), between ψ(z) and ψ (z), is uniformly bounded independently on z ∈ UPλ \ K(Pλ ). It easily follows that the accumulation sets of ψ(Rb1 (θ )) and ψ (Rb1 (θ )) = RPλ (χ (θ )) are equal. Since ψ : Ub1 → UPλ is a homeomorphism, this will complete the proof of Corollary C. Let us now fill in the details. We will need to work with the universal coverings of Vb1 = C \ Kb1 and VPλ = C \ K(Pλ ). To write things correctly and to avoid nasty traps, we need to choose basepoints. 
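We choose $z_0$ (respectively $z_1$) to be the point of potential $G(b_1)$ (respectively $G(b_1)/3$) on the dynamical ray $R_{b_1}(0/1)$. We then define $\pi_{b_1} : \tilde V_{b_1} \to V_{b_1}$ to be the universal covering of $V_{b_1}$ with basepoint at $z_0$. We choose $\tilde R$ to be a lift of $R_{b_1}(0/1)$ and we define $\tilde z_0$ (respectively $\tilde z_1$) to be the point of $\tilde R$ which is in the fiber of $z_0$ (respectively $z_1$). Next, we define $\tilde U_{b_1} = \pi_{b_1}^{-1}(U_{b_1} \setminus K_{b_1})$ and $\tilde U'_{b_1} = \pi_{b_1}^{-1}(U'_{b_1} \setminus K_{b_1})$. Then, we call $\tilde f_{b_1} : \tilde U'_{b_1} \to \tilde U_{b_1}$ the lift of $f_{b_1} : U'_{b_1} \to U_{b_1}$ that sends $\tilde z_1$ to $\tilde z_0$:
\[
\begin{array}{ccc}
(\tilde U'_{b_1}, \tilde z_1) & \xrightarrow{\ \tilde f_{b_1}\ } & (\tilde U_{b_1}, \tilde z_0) \\
\big\downarrow {\scriptstyle \pi_{b_1}} & & \big\downarrow {\scriptstyle \pi_{b_1}} \\
(U'_{b_1}, z_1) & \xrightarrow{\ f_{b_1}\ } & (U_{b_1}, z_0).
\end{array}
\]
Finally, observe that the fundamental group of $V_{b_1}$ is a cyclic group that acts on $\tilde V_{b_1}$. We call $\gamma_{b_1} : \tilde V_{b_1} \to \tilde V_{b_1}$ the automorphism of $\tilde V_{b_1}$ that corresponds to turning once around $K_{b_1}$ counter-clockwise. Since $f_{b_1} : U'_{b_1} \to U_{b_1}$ maps a loop that turns once around $K_{b_1}$ counter-clockwise to a loop that turns twice around $K_{b_1}$ counter-clockwise, we see that
\[ \tilde f_{b_1} \circ \gamma_{b_1} = \gamma_{b_1}^{\circ 2} \circ \tilde f_{b_1}. \]
Similarly, we define $w_0$ (respectively $w_1$) to be the point of potential $\eta_0$ (respectively $\eta_0/2$) on the quadratic ray $R_{P_\lambda}(0/1)$. We define $\pi_{P_\lambda} : \tilde V_{P_\lambda} \to V_{P_\lambda}$ to be the universal covering with basepoint at $w_0$. In this case, we can give an explicit formula. We identify $\tilde V_{P_\lambda}$ with the right half-plane $\mathbb{H} = \{z \in \mathbb{C} \mid \mathrm{Re}(z) > 0\}$ and we set $\pi_{P_\lambda} = \varphi_{P_\lambda}^{-1} \circ \exp$. The real axis projects to the quadratic ray $R_{P_\lambda}(0/1)$. Thus, we define $\tilde w_0 = \eta_0$ and $\tilde w_1 = \eta_0/2$, so that $\pi_{P_\lambda}(\tilde w_0) = w_0$ and $\pi_{P_\lambda}(\tilde w_1) = w_1$. We define
\[ \tilde U_{P_\lambda} = \pi_{P_\lambda}^{-1}\bigl( U_{P_\lambda} \setminus K(P_\lambda) \bigr) = \{ z \in \mathbb{H} \mid \mathrm{Re}(z) < 2\eta_0 \} \]
and
\[ \tilde U'_{P_\lambda} = \pi_{P_\lambda}^{-1}\bigl( U'_{P_\lambda} \setminus K(P_\lambda) \bigr) = \{ z \in \mathbb{H} \mid \mathrm{Re}(z) < \eta_0 \}. \]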
The lift of Pλ : VPλ → VPλ that sends w 1 to w 0 is the map w → 2 w . Finally, the automorphism of H that corresponds to turning once around K(Pλ ) counter-clockwise is the translation z → z + 2iπ. Next, a quasi-conformal homeomorphism ψ : Ub1 → UPλ , that conjugates fb1 : Ub 1 → Ub1 to Pλ : UP λ → UPλ and that sends the segment of dynamical ray Ub1 ∩ Rb1 (0/1) onto UPλ ∩ RPλ (0/1), can be lifted to a quasi-conformal homeomorphism b1 → U Pλ that sends ∩ U b1 to R ∩ U Pλ . Hence, it also :U sends R ψ z0 to w 0 . Then ψ sends z1 to w 1 , and it is not difficult to see that it conjugates fb1 to multiplication by 2: ◦ fb1 = 2ψ . ψ We now come to the construction of the semi-conjugacy ψ . First, consider the increasing homeomorphism h : [G(b1 ), 3G(b1 )] → [η0 , 2η0 ] defined by h = gPλ ◦ ψ ◦ ϕb−1 ◦ exp . 1 (h(η) is the potential in C \ K(Pλ ) of the image by ψ of the point of potential η on the dynamical ray Rb1 (0/1)). Then, define the continuous mapping ψ : Ub1 \ Ub 1 → UPλ \ UP λ in the following way: (eη0 ), i.e., the point of potential η0 on • on Ub
1 , the map ψ is constantly equal to ϕP−1 λ the dynamical ray RPλ (0/1); (eη+2iπθ ) to the point • on Ub1 \ (Ub 1 ∪ Ub
1 ) the map ψ sends the point ϕb−1 1 −1 h(η)+2iπχ (θ) ϕPλ (e ).
Observe that on the boundary of Ub 1 , we have ψ ◦ fb1 = Pλ ◦ ψ . Now, consider the → U b1 \ U Pλ \ U that sends : U semi-conjugates fb1 lift ψ z0 to w 0 . The map ψ b1 Pλ
to multiplication by 2 on the boundary of Ub1 . Thus, we can extend it continuously to b1 using the formula: U 1 ( fb◦n ( ψ z) = n ψ z) , 1 2 . An easy induction shows that b1 \ U where n is chosen so that fb◦n ( z) belongs to U b1 1 ◦ γ1 = ψ + 2iπ. Hence, ψ projects to a continuous map ψ : Ub1 \ Kb1 → ψ UPλ \ K(Pλ ) that semi-conjugates fb1 : Ub 1 → Ub1 to Pλ : UP λ → UPλ . We claim that for any θ ∈ , ψ maps Rb1 (θ ) ∩ Ub1 homeomorphically onto RPλ (χ (θ ) ∩ UPλ ). Indeed, set A0 = Ub1 \ Ub 1 and for n ≥ 0 define recursively An+1 = fb−1 (An ). Similarly, define Bn to be the annulus 1 Bn = z ∈ C \ K(Pλ ) | η0 /2n ≤ gPλ (z) ≤ η0 /2n−1 . By construction, for every θ ∈ , we have ψ (Rb1 (θ ) ∩ A0 ) = RPλ (χ (θ )) ∩ B0 . Besides, since ψ semi-conjugates fb1 and Pλ , we see that for every n ≥ 0 and every θ ∈ , ψ (Rb1 (θ ) ∩ An ) is contained in the intersection of the annulus Bn with a ray of Pλ . Since ψ is continuous, the whole set ψ (Rb1 (θ ) ∩ Ub1 ) is contained in a single ray of Pλ , i.e., the ray RPλ (χ (θ )). The point of potential η is mapped to the point of potential h(3n η)/2n , where n is chosen so that G(b1 ) ≤ 3n η ≤ 3G(b1 ). This shows that ψ : Rb1 (θ ) ∩ Ub1 → RPλ (χ (θ ) ∩ UPλ ) is a homeomorphism. Let us finally show that the distance, for the hyperbolic metric on C\K(Pλ ), between ψ(z) and ψ (z), is uniformly bounded independently on z ∈ Ub1 \ Kb1 . It is enough b1 , the hyperbolic distance in H between ψ ( ( to prove that for any z∈U z) and ψ z) is uniformly bounded. Since ψ ◦ fb1 = 2ψ and ψ ◦ fb1 = 2ψ , and since multiplication by 2 is an isometry for the hyperbolic metric on H, it is enough to prove the statement on with a fundamental domain for γ1 . This is immediate since b1 \ U the intersection of U b1 b1 and the mappings ψ and ψ are continuous the closure of such a set is compact in V b1 . on V It now follows that we can extend ψ continuously to Kb1 by setting ψ |Kb1 = ψ|Kb1 . Given θ ∈ consider the restriction of ψ to (Rb1 (θ ) ∩ Ub1 ) ∪ Kb1 . 
Since this map is injective continuous and the domain is compact, it is necessarily a homeomorphism. Notice that the closure of Rb1 (θ ) ∩ Ub1 in C equals the closure taken in (Rb1 (θ ) ∩ Ub1 ) ∪ Kb1 . Similarly the closure of RPλ (χ (θ )) ∩ UPλ in C equals the closure taken in (RPλ (χ (θ )) ∩ UPλ ) ∪ K(Pλ ). In particular ψ |Kb1 = ψ|Kb1 provides a homeomorphism, mapping the impression of Rb1 (θ ) onto the impression of RPλ (χ (θ )). Let us now consider the function θ2 : (R \ Q)/Z → (R \ Q)/Z defined in the following way: for any irrational angle, first choose the representative t ∈]0, 1[, then define
\[ \theta_2(t) = \sum_{0 < p/q < t} \frac{1}{2^{q+1}}. \]
The sum is taken over all pairs (p, q) such that 0 < p/q < t, whether p and q are relatively prime or not. Douady proved that the set of complex numbers λ = e2iπt ,
t ∈ (R \ Q)/Z, for which the accumulation set of the quadratic ray RPλ (θ2 (t)) is not reduced to a point, is a dense Gδ subset of S 1 . The proof can be found in [Sø]. Next, observe that for each t ∈ R \ Q, there is exactly one angle θ3 (t) ∈ which is mapped to θ2 (t) by χ :
\[ \theta_3(t) = \sum_{0 < p/q < t} \frac{1}{3^{q+1}}. \]
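The sum defining $\theta_2(t)$ can be approximated numerically by truncating it at a maximal denominator. The sketch below is an illustration under stated assumptions: the function names and the cutoff `q_max` are not from the paper, `t` is taken in $]0, 1[$ and irrational (here represented by a float), and the number of admissible $p \ge 1$ for a given $q$ is counted as $\lfloor qt \rfloor$. A summation by parts, which is our own remark, rewrites the same series with binary digits $\varepsilon_i = \lfloor it \rfloor - \lfloor (i-1)t \rfloor \in \{0, 1\}$; the second function uses that form as a cross-check.

```python
import math

def theta2_pairs(t, q_max=50):
    """Truncation of sum over pairs (p, q), 0 < p/q < t, of 1 / 2**(q + 1)."""
    # For irrational t in ]0, 1[, exactly floor(q * t) integers p >= 1
    # satisfy p / q < t.
    return sum(math.floor(q * t) / 2 ** (q + 1) for q in range(1, q_max + 1))

def theta2_digits(t, q_max=50):
    """Same quantity written with binary digits eps_i in {0, 1}."""
    total = 0.0
    for i in range(1, q_max + 1):
        eps = math.floor(i * t) - math.floor((i - 1) * t)
        total += eps / 2 ** i
    return total
```

The two truncations differ only by a tail term of size at most `q_max / 2**(q_max + 1)`, so they agree to high accuracy for moderate cutoffs.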
The previous corollary shows that when the accumulation set of the quadratic ray RPλ (θ2 (t)) is not reduced to a point, then the accumulation set of the parameter ray Rλ (2/9 + θ3 (t)/9) is also not reduced to a point. This shows that the set of complex numbers λ of modulus 1 for which at least one of the parameter rays Rλ (θ ) ⊂ C \ Mλ has an accumulation set not reduced to a point, contains a dense Gδ subset of S 1 . Acknowledgements. We are very grateful to Bodil Branner, Adrien Douady, John Hubbard and Carsten Petersen for encouraging us. We would like to thank the Department of Mathematics of Cornell University for its hospitality during the academic year 1997–1998. There, we drew the first pictures showing evidence of the existence of Julia sets in parameter spaces. We would also like to thank the Departments of Mathematics of Technical University of Denmark and of Université Paul Sabatier. This research was supported by the French Embassy in Denmark: it made possible the exchanges that took place at the time this article was written.
References

[Br] Branner, B.: Puzzles and Parapuzzles of quadratic and Cubic Polynomials. Proceedings of Symposia in Applied Math. 49, 31–69 (1994)
[BH1] Branner, B. and Hubbard, J.: The iteration of cubic polynomials, Part I: The global topology of parameter space. Acta Math. 160, 143–206 (1988)
[BH2] Branner, B. and Hubbard, J.: The iteration of cubic polynomials, Part II: Patterns and parapatterns. Acta Math. 169, 229–325 (1992)
[D1] Douady, A.: Does a Julia set depend continuously on the Polynomial? In: Proceedings of Symposia in Applied Math. 49, 91–139 (1994)
[D2] Douady, A.: Prolongement de mouvements holomorphes [d’après Slodkowski et autres]. Séminaire Bourbaki, Astérisque 227, 7–20 (1993)
[DH1] Douady, A. and Hubbard, J.: Étude dynamique des polynômes complexes I & II. Publ. Math. d’Orsay (1984–1985)
[DH2] Douady, A. and Hubbard, J.: On the dynamics of polynomial-like mappings. Ann. Sci. Éc. Norm. Sup. 18, 287–343 (1985)
[EY] Epstein, A. and Yampolsky, M.: Geography of the Cubic Connectedness Locus: Intertwining Surgery. Ann. Sci. Éc. Norm. Sup. 32, 151–185 (1999)
[Ha] Haïssinsky, P.: Chirurgie parabolique. C. R. Acad. Sc. Paris 327, 195–198 (1998)
[Hu] Hubbard, J.: Local connectivity of Julia sets and bifurcation loci: Three theorems of J.C. Yoccoz. In: Topological methods in modern mathematics, Stony Brook, NY: 1991, Houston, TX: Publish or Perish, 1993, pp. 467–511
[K] Kiwi, J.: Rational Rays and Critical Portraits of Complex Polynomials. SUNY at Stony Brook IMS preprint, 1997/15
[La] Lavaurs, P.: Systèmes dynamiques holomorphes, Explosion de points périodiques paraboliques. Thèse de doctorat, Université Paris-Sud in Orsay (1989)
[MSS] Mañé, R., Sad, P. and Sullivan, D.: On the dynamics of rational maps. Ann. Sci. Éc. Norm. Sup. 16, 193–217 (1983)
[M] Milnor, J.: Local connectivity of Julia sets: Expository Lectures. In: The Mandelbrot Set, Theme and Variations, ed. Tan Lei, London Math. Soc. Lect. Note 274, Cambridge: Cambridge Univ. Press, 2000, pp. 67–116
[Naĭ] Naĭshul', V.A.: Topological Invariants of analytic and area preserving mappings and their applications to analytic differential equations in C² and CP². Trans. Moscow Math. Soc. 42, 239–250 (1983)
[NK] Nakane, S. and Komori, Y.: Non-landing of Stretching Rays for the Family of Real Cubic Polynomials. Manuscript
[PM] Pérez-Marco, R.: Fixed points and circle maps. Acta Math. 179, 243–294 (1997)
[P] Petersen, C.: On the Pommerenke-Levin-Yoccoz inequality. Ergod. Th. Dynam. Sys. 13, 785–806 (1993)
[R] Roesch, P.: Topologie locale des méthodes de Newton cubiques: Plan dynamique. C. R. Acad. Sc. 326, 1221–1226 (1998)
[Sl] Słodkowski, Z.: Extensions of holomorphic motions. Ann. Scuola Norm. Sup. Pisa Cl. Sci. 22, 185–210 (1995)
[Sø] Sørensen, D.: Infinitely renormalizable quadratic polynomials, with non-locally connected Julia set. J. Geom. Anal. 10, 169–206 (2000)
[TL] Tan Lei: Similarity between the Mandelbrot set and Julia sets. Commun. Math. Phys. 134, 587–617 (1990)
[Y] Yoccoz, J.C.: Petits diviseurs en dimension 1. Astérisque 231 (1995)
[Z1] Zakeri, S.: On Dynamics of Cubic Siegel Polynomials. SUNY at Stony Brook IMS preprint, 1998/4
[Z2] Zakeri, S.: Dynamics of Cubic Siegel Polynomials. Commun. Math. Phys. 206, 185–233 (1999)
Communicated by Ya. G. Sinai
Commun. Math. Phys. 220, 377 – 402 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Spectral Analysis of Weakly Coupled Stochastic Lattice Ginzburg–Landau Models Paulo A. Faria da Veiga1 , Michael O’Carroll2 , Emmanuel Pereira2 , Ricardo Schor2 1 Departamento de Matemática, ICMC-USP, C.P. 668, 13560-970 São Carlos SP, Brazil 2 Departamento de Física-ICEx, UFMG, C.P. 702, 30161-970 Belo Horizonte MG, Brazil.
E-mail: [email protected] Received: 27 September 2000 / Accepted: 16 January 2001
Abstract: We consider the relaxation to equilibrium of solutions ϕ(t, x), t > 0, x ∈ Zd , of stochastic dynamical Langevin equations with white noise and weakly coupled Ginzburg–Landau interactions. Using a Feynman–Kac formula, which relates stochastic expectations to correlation functions of a spatially non-local imaginary time quantum field theory, we obtain results on the joint spectrum of H , P , where H is the selfadjoint, positive, generator of the semi-group associated with the dynamics, and P j , j = 1, . . . , d are the self-adjoint generators of the group of lattice spatial translations. We show that the low-lying energy-momentum spectrum consists of an isolated one-particle dispersion curve and, for the mass spectrum (energy-momentum at zero-momentum), besides this isolated one-particle mass, we show, using a Bethe–Salpeter equation, the existence of an isolated two-particle bound state if the coefficient of the quartic term in the polynomial of the Ginzburg–Landau interaction is negative and d = 1, 2; otherwise, there is no two-particle bound state. Asymptotic values for the masses are obtained.
1. Introduction

In this paper we consider the stochastic Langevin dynamics of Ginzburg–Landau (GL) scalar field lattice models. Precisely, we take unbounded continuous spin variables (fields) $\varphi(\vec x) \in \mathbb{R}$, $\vec x$ in a toroidal lattice $\Lambda \subset \mathbb{Z}^d$ (the infinite lattice limit $\Lambda \to \mathbb{Z}^d$, also called thermodynamic limit, is considered later) and analyze the Langevin dynamics given by a system of stochastic differential equations
\[ d\varphi(t, \vec x) = -\frac{1}{2} \nabla S_\Lambda(\varphi(t, \vec x))\, dt + d\eta(t, \vec x); \qquad \varphi(0, \vec x) = \psi(\vec x), \tag{1.1} \]
Partially supported by Pronex and CNPq (Brazil).
where $\nabla S_\Lambda = \delta S_\Lambda/\delta\varphi$,
\[ S_\Lambda(\varphi(\vec x)) = \frac{1}{2} \sum_{\vec x \in \Lambda} \left[ \sum_{i=1}^{d} \bigl( \varphi(\vec x + e_i) - \varphi(\vec x) \bigr)^2 + m^2 \varphi(\vec x)^2 + \lambda \mathcal{P}(\varphi(\vec x)) \right]. \tag{1.2} \]
$e_i$ is the unit vector along the $i$th coordinate; $\mathcal{P}$ is an even polynomial of degree $2N$, bounded from below and starting with a quartic term; $m > 0$ and $\lambda \ge 0$. $\{\eta(t, \vec x)\}$, $\vec x \in \Lambda$, $t \in [0, \infty)$, is a family of Gaussian white noise processes with expectations $E(\eta(t, \vec x)) = 0$; $E(\eta(t, \vec x)\eta(t', \vec y)) = \delta(t - t')\delta_{\vec x, \vec y}$. $\psi(\vec x)$ is some initial condition.

These models, known as time-dependent GL models, frequently appear in the study of dynamical critical phenomena: they can be used, e.g., to describe the (purely relaxational) time evolution of an order parameter for statistical mechanical systems [6, 20]. Besides the purely mathematical interest a stochastic equation such as (1.1) arouses, the GL interaction itself (1.2) is a recurrent theme in physics, as well as the analysis of the noise influence on dynamical systems [23] (recall e.g. the investigations on noise-induced phase transitions [7], stochastic resonance [5], etc.). In a different scenario, considering the stochastic quantization approach [9], these systems can also be used in the investigation of Euclidean field theories. In short, the study of fundamental properties of such models is of general relevance.

The Markov process $\varphi(t) = \varphi(t, \vec x)$ defined by (1.1) has an invariant measure (see [10]) $\mu$ which coincides with the Gibbs probability distribution $d\mu = e^{-S_\Lambda(\varphi)}\, d\varphi/\text{normalization}$. The time evolution of any function $f$ of the spin configuration $\varphi(\vec x)$ defined by
\[ f_t(\psi) = E(f(\varphi(t))) \tag{1.3} \]
with $\varphi(0) = \psi$ is determined by a strongly continuous semi-group with generator $H_\Lambda$ which is Hermitian and positive in the Hilbert space $L^2(d\mu)$, and has the form, for $f = f(\{\varphi(\vec x)\})$,
\[ H_\Lambda f = -\frac{1}{2} \sum_{\vec x \in \Lambda} \left[ \frac{\partial^2 f}{\partial\varphi(\vec x)^2} - \frac{\partial S_\Lambda}{\partial\varphi(\vec x)} \frac{\partial f}{\partial\varphi(\vec x)} \right]. \tag{1.4} \]
The spectrum of $H_\Lambda$ is discrete and zero is a multiplicity one eigenvalue with the eigenfunction $f \equiv 1$. This spectral result follows since $H_\Lambda$ is unitarily equivalent to a Schrödinger operator with “potential” that goes to plus infinity in each coordinate (see Sect. 2 and [14]). To determine the spectrum of $H_\Lambda$ above zero it is possible and useful to establish a quasi-particle representation for the operator $H_\Lambda$. To understand the generated dynamics, in this context, we use space translations of the torus to define self-adjoint momentum operators $\vec P_\Lambda$, commuting with $H_\Lambda$, and call the joint spectrum of $(H_\Lambda, \vec P_\Lambda)$ the energy-momentum (e − m) spectrum. As $\vec P_\Lambda 1 = \vec 0$, the point $(0, \vec 0)$ is in the e − m spectrum, called the vacuum or ground state. We are interested in the low-lying e − m spectrum associated with the infinite lattice system.
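To make the dynamics (1.1) with the action (1.2) concrete, here is a minimal Euler–Maruyama sketch on a one-dimensional periodic lattice. Several choices are assumptions of this example and not taken from the paper: the dimension $d = 1$, the time discretization, the function names, and above all the interaction, which is taken to be the plain quartic $\mathcal{P}(\varphi) = \varphi^4$ without Wick ordering.

```python
import math
import random

def grad_S(phi, m, lam):
    """Gradient of S = (1/2) * sum_x [(phi[x+1]-phi[x])**2 + m**2*phi[x]**2
    + lam*phi[x]**4] on a periodic 1D lattice (assumed P(phi) = phi**4)."""
    n = len(phi)
    out = []
    for x in range(n):
        lap = phi[(x + 1) % n] + phi[(x - 1) % n] - 2.0 * phi[x]
        out.append(-lap + m * m * phi[x] + 2.0 * lam * phi[x] ** 3)
    return out

def langevin_step(phi, m, lam, dt, rng):
    """One Euler-Maruyama step of d(phi) = -(1/2) grad S dt + d(eta)."""
    g = grad_S(phi, m, lam)
    # White-noise increments over a step dt are Gaussian with variance dt.
    return [p - 0.5 * gi * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)
            for p, gi in zip(phi, g)]
```

For $\lambda = 0$ the drift acts diagonally on plane waves: applied to $\varphi(x) = \cos(px)$ it returns $-(1 - \cos p + m^2/2)\,\varphi(x)$, which is the relaxation rate one expects for the free model.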
The infinite lattice system is constructed in the standard way [4]. A finite lattice Feynman–Kac formula is established, which represents expectations as Gibbs ensemble averages called correlation functions (cf), and relates them to Hilbert space inner products involving the semi-group $e^{-H_\Lambda t}$ and the unitary group $e^{i\vec P_\Lambda \cdot \vec x}$, $\vec x \in \mathbb{Z}^d$. In this representation, the problem becomes a slightly non-local field theory which, for small $\lambda$, is a perturbation of a Gaussian measure. Infinite lattice translationally invariant cf’s are obtained using a cluster expansion [1] and truncated cf’s have exponential tree decay, for small $\lambda$, as $m \ne 0$. From these correlation functions, an infinite lattice imaginary-time quantum field theory is constructed, furnishing a Hilbert space $\mathcal{H}$, self-adjoint commuting e − m operators $H \ge 0$ and $\vec P$, time-zero field operators and a vacuum vector $\Omega$. A Feynman–Kac formula holds for the infinite lattice system. Our results concern the spectrum of $H$ and $\vec P$ of the infinite lattice system below some (called two-particle) threshold. We refer to the spectral points $(E, \vec p)$, $E \ge 0$, $\vec p \in T^d$, the $d$-dimensional torus, as the e − m spectrum and to $E$ at $\vec p = \vec 0$ as a mass spectral point.

For $\lambda = 0$, the e − m spectrum can be explicitly determined and consists of an isolated dispersion curve $E_0(\vec p) = \sum_{j=1}^{d} (1 - \cos p^j) + m^2/2$, with mass $m^2/2$, which we refer to as the spectrum associated with a one-particle state. Above this is a band spectrum, and eventually a continuum spectrum, starting at mass $m^2$ (called two-particle threshold), which is associated with the spectrum of two or more particles. For $\lambda > 0$, mass spectrum below the two-particle threshold may occur, which we call bound state spectrum.

Similar problems related to the analysis of the spectrum of generators of stochastic dynamics have been recently considered using a different approach. In [11], the dynamics of an infinite system of plane rotators in the high temperature regime is treated (stochastic XY models: spins on a compact manifold), and one-particle states are constructed and the associated spectrum shown to be isolated from the rest. In [22], the next two-particle subspaces are investigated, and the existence of bound states for the one-dimensional case is proved. In [12], for spins on a non-compact manifold, the stochastic dynamics of, say, an anharmonic crystal also in the high temperature regime is analyzed, and the exponential convergence to equilibrium is shown.

In the present paper, we determine the e − m spectrum, via decay properties of the field correlations, of the quantum field theory related to the original dynamical system using cluster expansions and techniques of constructive field theory. We show the existence of isolated one-particle states. The e − m spectrum above the one-particle state is found by analyzing the Bethe–Salpeter (BS) equation [8], adapting the usual methods employed in relativistic quantum field theory to treat the discreteness of space and the non-locality of the interactions.
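The $\lambda = 0$ dispersion curve described above is simple enough to evaluate directly. The following sketch (function names are assumptions of this example) computes $E_0(\vec p)$ and checks, for the free model only, whether the whole one-particle curve lies below the two-particle threshold $m^2$; since the maximum of $E_0$ is $2d + m^2/2$, this happens exactly when $m^2 > 4d$.

```python
import math

def free_dispersion(p, m):
    """E_0(p) = sum_j (1 - cos p_j) + m**2 / 2 for a momentum p on the torus."""
    return sum(1.0 - math.cos(pj) for pj in p) + 0.5 * m * m

def curve_below_threshold(d, m):
    """True if max_p E_0(p) = 2*d + m**2/2 is below the threshold m**2."""
    return free_dispersion([math.pi] * d, m) < m * m
```

The width of the free curve, $E_0(\pi, \dots, \pi) - E_0(\vec 0) = 2d$, is the $\lambda = 0$ value of the quantity that Theorem 1 below bounds as $2d + O(\lambda)$.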
We also determine the existence of two-particle bound states, proving some results established in a previous paper by some of the authors [15], where the analysis was carried out only in the ladder approximation.

Now to state our main results, we write the polynomial interaction $\mathcal{P}$ in (1.2) as
\[ \mathcal{P}(\varphi) = \sum_{n=2}^{N} \frac{a_n}{(2n)!} :\!\varphi^{2n}\!: \]
with $a_N > 0$; $:\ :$ is the Wick order with respect to the unperturbed Gaussian covariance given by
\[ :\!\varphi^{n}\!: \; = \left. \frac{d^n}{d\alpha^n}\, e^{\alpha\varphi - \frac{1}{2} C_0 \alpha^2} \right|_{\alpha = 0}, \]
where \(C_0\) is the covariance
\[ \left[ -\frac{d^2}{dt^2} + \left( \frac{-\Delta + m^2}{2} \right)^2 \right]^{-1}(x, y) \]
at coincident points [see Sect. 2]. Here, \(\Delta\) is the \(\mathbb{Z}^d\) lattice Laplacian. Our main results are given in the theorem below.

Theorem 1. Concerning the low-lying energy-momentum spectrum of the Ginzburg–Landau stochastic model, assuming m > 0 large, and for all sufficiently small λ > 0, we have:

1. For d = 1, 2, . . . , there is an isolated dispersion curve \(E_\lambda(\vec p)\), \(\vec p \in T^d\), corresponding to the single-particle spectrum with mass \(M_\lambda \equiv M(\lambda, m, d) = \frac{m^2}{2} + O(\lambda^2) > 0\), with \(E_\lambda(\vec p)\) real analytic, \(E_\lambda(\vec p) > E_\lambda(\vec p = 0) = M_\lambda\) for \(\vec p \neq 0\), and
\[ \sup_{\vec p \in T^d} \left[ E_\lambda(\vec p) - E_\lambda(\vec 0) \right] = 2d + O(\lambda). \]

2. For d < 3 and \(a_2 < 0\), there is a unique two-particle bound state with mass \(M_b \in (0, 2M_\lambda)\) given by
\[ M_b = \begin{cases} 2M_\lambda - \dfrac{9 a_2^2 \lambda^2}{4 m^4}\,(1 + O(\lambda)), & d = 1; \\[2mm] 2M_\lambda - \exp\left[ -\dfrac{4\pi m^2}{3|a_2|\lambda}\,(1 + O(\lambda)) \right], & d = 2. \end{cases} \tag{1.5} \]
Otherwise, for d ≥ 3 or \(a_2 \geq 0\) (any d) there is no two-particle bound state in \((0, 2M_\lambda)\).

Remark 1.1. Throughout the paper all results will be established, without further mention, for m and λ satisfying the hypothesis of Theorem 1.

This paper is organized as follows. In Sect. 2, we establish a finite volume Feynman–Kac formula that allows us to map the problem of computing the expectations in (1.3) into the problem of studying correlations in a quantum field model. Also, for this quantum field model in a finite volume, the corresponding correlations and the BS equation are defined in a way which is suitable to handle field theories on a lattice. In Sect. 3, the decay of the convolution inverse of the two-point function and the decay of the BS kernel are obtained. These results are the main input for the spectral analysis of bound states and their masses, left for Sect. 4. Section 4 is subdivided into five parts: the first one is devoted to the analysis of the two-point function; the second to the ladder approximation analysis; the third one deals with technical points and estimates. In the fourth and the fifth parts, respectively, we prove, for d < 3 and \(a_2 < 0\), the presence of a unique bound state and extend the proof of absence of bound states for d ≥ 3, as well as for d < 3 and \(a_2 \geq 0\), both beyond the ladder approximation. In Sect. 5, we make some concluding remarks.

2. Feynman–Kac Formula and Bethe–Salpeter Equation

Considering the Hamiltonian \(H_\Lambda\) of Eq. (1.4) on a finite hypercube \(\Lambda \subset \mathbb{Z}^d\) with periodic boundary conditions, for \(d\varphi_\Lambda = \prod_{\vec x \in \Lambda} d\varphi(\vec x)\), let
\[ d\mu_\Lambda(\varphi) = \frac{1}{Z_\Lambda}\, e^{-S_\Lambda}\, d\varphi_\Lambda \]
Spectral Analysis Stochastic Lattice Ginzburg–Landau Models
381
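with \(S_\Lambda\) given by (1.2), restricted to Λ with periodic boundary conditions, and where \(Z_\Lambda\) is a normalization for \(d\mu_\Lambda\) so that \(\int d\mu_\Lambda = 1\). The operator \(H_\Lambda\) is positive in the space \(L^2(d\mu_\Lambda)\). Next, let U be the unitary operator from \(L^2(d\mu_\Lambda)\) to \(L^2(d\varphi_\Lambda)\) given by \((Uf)(\varphi) = Z_\Lambda^{-1/2} e^{-\frac{1}{2} S_\Lambda} f(\varphi)\). After a straightforward calculation, we have
\[ L_\Lambda = U H_\Lambda U^{-1} = -\frac{1}{2} \sum_{\vec x \in \Lambda} \frac{\partial^2}{\partial\varphi(\vec x)^2} + \frac{1}{4} \sum_{\vec x \in \Lambda} \left[ \frac{1}{2} \left( \frac{\partial S_\Lambda}{\partial\varphi(\vec x)} \right)^2 - \frac{\partial^2 S_\Lambda}{\partial\varphi(\vec x)^2} \right] \]
so that \(L_\Lambda\) is a Schrödinger type operator. Performing the derivatives, we get
\[ \begin{aligned} L_\Lambda = {} & -\frac{1}{2} \sum_{\vec x \in \Lambda} \frac{\partial^2}{\partial\varphi(\vec x)^2} + \frac{1}{8} \sum_{\vec x \in \Lambda} \varphi(\vec x)\,[(-\Delta + m^2)^2 \varphi](\vec x) + \frac{\lambda}{4} \sum_{\vec x \in \Lambda} [(-\Delta + m^2)\varphi](\vec x)\, P'(\varphi(\vec x)) \\ & + \sum_{\vec x \in \Lambda} \left[ \frac{\lambda^2}{8} P'(\varphi(\vec x))^2 - \frac{\lambda}{4} P''(\varphi(\vec x)) - \frac{(2d + m^2)}{4} \right]. \end{aligned} \tag{2.1} \]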
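In the above formula, \(-\Delta\) is the lattice Laplacian with periodic boundary conditions on Λ, given by \((-\Delta\varphi)(\vec x) = 2d\,\varphi(\vec x) - \sum_{|\vec x - \vec y| = 1} \varphi(\vec y)\). The functional integral associated with (2.1) can be obtained by standard methods [4]. If \(f_1, \ldots, f_n\) are functions of the spin configuration in Λ, if \(\Omega_\Lambda(\varphi) = 1\) is the ground state of \(H_\Lambda\) and for \(t_1 \leq t_2 \leq \ldots \leq t_n \in \mathbb{R}\), then we have the Feynman–Kac formula
\[ \begin{aligned} (\Omega_\Lambda, f_1 e^{-(t_2 - t_1) H_\Lambda} f_2 \ldots e^{-(t_n - t_{n-1}) H_\Lambda} f_n \Omega_\Lambda)_{L^2(d\mu_\Lambda)} & = (U\Omega_\Lambda, f_1 e^{-(t_2 - t_1) L_\Lambda} f_2 \ldots e^{-(t_n - t_{n-1}) L_\Lambda} f_n U\Omega_\Lambda)_{L^2(d\varphi_\Lambda)} \\ & = \int f_1(\varphi(t_1)) \ldots f_n(\varphi(t_n))\, d\rho_\Lambda, \end{aligned} \]
where the path space measure \(d\rho_\Lambda\) is given by the weak T → ∞ limit of
\[ d\rho_{\Lambda,T} = \frac{e^{-W_{\Lambda,T}}\, d\nu_\Lambda}{\int e^{-W_{\Lambda,T}}\, d\nu_\Lambda} \]
with \(W_{\Lambda,T} \equiv W_{\Lambda,T}(\varphi)\),
\[ W_{\Lambda,T} = \int_{-T}^{T} dt \sum_{\vec x \in \Lambda} \left[ \frac{\lambda}{4} P'(\varphi(\vec x, t))\,(-\Delta + m^2)\varphi(\vec x, t) + \frac{\lambda^2}{8} P'(\varphi(\vec x, t))^2 - \frac{\lambda}{4} P''(\varphi(\vec x, t)) \right], \]
and \(d\nu_\Lambda\) is a Gaussian measure with mean zero and covariance given by
\[ C_\Lambda(t, \vec x; t', \vec y) = \int \varphi(t, \vec x)\, \varphi(t', \vec y)\, d\nu_\Lambda = \frac{1}{2\pi|\Lambda|} \sum_{\vec p \in \tilde\Lambda} \int_{-\infty}^{\infty} dp^0\, \frac{e^{i p^0 (t - t')}\, e^{i \vec p \cdot (\vec x - \vec y)}}{(p^0)^2 + \left[ \frac{(\vec p_\ell)^2}{2} + \frac{m^2}{2} \right]^2}, \tag{2.2} \]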
where \((\vec p_\ell)^2 \equiv 2 \sum_{i=1}^{d} (1 - \cos p^i)\). In (2.2), |Λ| is the number of points in Λ, \(\tilde\Lambda\) is the Fourier dual lattice, \(\vec p = (p^1, \ldots, p^d) \in \tilde\Lambda\) and \(\vec p \cdot (\vec x - \vec y) = \sum_{i=1}^{d} p^i (x^i - y^i)\).

In [1], using a cluster expansion, the infinite lattice cf's are shown to exist. The cf's are translationally invariant and truncated cf's have exponential tree decay. We denote the infinite lattice cf's by \(\langle \cdot \rangle\). From the cf's, an infinite lattice imaginary time quantum field theory is constructed in the standard way [4]. The construction provides us with a Hilbert space \(\mathcal{H}\) with inner product \((\cdot, \cdot)\), a strongly continuous semi-group \(e^{-tH}\), t > 0, with self-adjoint generator H ≥ 0, a unitary group of lattice translation operators with generators \(P^i\), i = 1, . . . , d (momentum operators), a time-zero field operator \(\hat\varphi(\vec x)\) such that \(\hat\varphi(\vec x) = e^{i \vec P \cdot \vec x}\, \hat\varphi\, e^{-i \vec P \cdot \vec x}\), where \(\hat\varphi \equiv \hat\varphi(\vec x = 0)\), and a vacuum vector Ω such that HΩ = 0 and \(\vec P\Omega = 0\). H and \(\vec P\) are mutually commuting. Furthermore, the following infinite lattice Feynman–Kac formula holds:
\[ S_\lambda^{(n)}(t_1, \vec x_1; \ldots; t_n, \vec x_n) \equiv \langle \varphi(t_1, \vec x_1) \ldots \varphi(t_n, \vec x_n) \rangle = \left( \Omega, \hat\varphi\, e^{-(t_2 - t_1) H + i \vec P \cdot (\vec x_2 - \vec x_1)}\, \hat\varphi \ldots e^{-(t_n - t_{n-1}) H + i \vec P \cdot (\vec x_n - \vec x_{n-1})}\, \hat\varphi\, \Omega \right). \tag{2.3} \]
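As a numerical illustration of the lattice Laplacian with periodic boundary conditions used in (2.1) and of the momentum symbol \(2\sum_i (1 - \cos p^i)\) entering the covariance (2.2), the following sketch checks that a plane wave on a finite periodic box is an eigenfunction of \(-\Delta\) with that eigenvalue. The function names, box size and momentum are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def neg_lattice_laplacian(phi):
    """Apply (-Delta phi)(x) = 2d*phi(x) - sum over nearest neighbours, periodic box."""
    d = phi.ndim
    out = 2 * d * phi.astype(complex)
    for axis in range(d):
        # np.roll implements the periodic shift by one lattice site in each direction
        out -= np.roll(phi, 1, axis=axis) + np.roll(phi, -1, axis=axis)
    return out

def lattice_momentum_squared(p):
    """Symbol 2 * sum_i (1 - cos p^i) of -Delta in momentum space."""
    p = np.asarray(p, dtype=float)
    return 2.0 * np.sum(1.0 - np.cos(p))

def plane_wave(shape, p):
    """exp(i p.x) on a box of the given shape; p must lie in the dual lattice."""
    grids = np.meshgrid(*[np.arange(n) for n in shape], indexing="ij")
    phase = sum(pi * g for pi, g in zip(p, grids))
    return np.exp(1j * phase)

# Hypothetical example: an 8 x 6 periodic box (d = 2) and an allowed momentum.
shape = (8, 6)
p = (2 * np.pi * 3 / 8, 2 * np.pi * 1 / 6)
wave = plane_wave(shape, p)
residual = np.max(np.abs(neg_lattice_laplacian(wave) - lattice_momentum_squared(p) * wave))
print("max residual:", residual)
```

The residual should vanish up to rounding, which is the statement that \((-\Delta + m^2)/2\) acts in momentum space as multiplication by \(\sum_i (1 - \cos p^i) + m^2/2\).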
Our results concern the joint spectrum of \(H, \vec P\) (called e-m spectrum). In particular, we determine the low-lying e-m spectrum (for the approximate spectrum see [15]).

To determine the isolated dispersion curve of Theorem 1, we consider the Fourier transform (denoted by ∼, and throughout this paper without factors of 2π) of the two-point function given by (hereafter dropping the superscript (2)), with \(p = (p^0, \vec p)\), \(p^0 \in \mathbb{R}\), \(\vec p \in T^d\),
\[ \tilde S_\lambda(p) = (2\pi)^d \int_0^\infty \int_{T^d} \frac{2E}{E^2 + (p^0)^2}\, \delta(\vec q - \vec p)\; d(\hat\varphi\Omega, \mathcal{E}(E, \vec q)\hat\varphi\Omega), \tag{2.4} \]
where \(\mathcal{E}(E, \vec p)\) is the spectral projection associated with the operators \((H, \vec P)\). The integral over E runs from 0 to ∞ and that over \(\vec q\) is on \(T^d\). For λ = 0, we obtain \(S_\lambda = C\), the infinite lattice limit of the covariance (2.2). We can write (2.3) in the form
\[ \tilde S_\lambda(p) = \int_0^\infty \frac{2E}{E^2 + (p^0)^2}\, d\eta_\lambda(E; \vec p), \tag{2.5} \]
where the positive measure \(d\eta_\lambda(E, \vec p)\) is supported on the spectrum of H restricted to the odd (number of particles) states with momentum \(\vec p\). The singularities of \(\tilde S_\lambda(p)\) on the positive \(p^0\) imaginary axis are points in the e-m spectrum. The isolated dispersion curve \(E_\lambda(\vec p)\) results from determining the zeros of \(\tilde\Gamma(p)\), \(\tilde\Gamma(p)\tilde S_\lambda(p) = 1\). The necessary analyticity properties of \(\tilde\Gamma(p)\) follow from the spacetime decay of Γ, which is determined in the next section.

To determine additional mass spectrum in \((0, 2M_\lambda)\), i.e. bound states, we consider the partially truncated four-point function (associated with two-particle states)
\[ D_\lambda(x_1, x_2; x_3, x_4) = S_\lambda^{(4)}(x_1, x_2, x_3, x_4) - S_\lambda^{(2)}(x_1, x_2)\, S_\lambda^{(2)}(x_3, x_4), \tag{2.6} \]
where, hereafter, \(x_i = (t_i, \vec x_i) \equiv (x_i^0, \vec x_i)\). From translation invariance, \(D_\lambda\) depends only on difference variables.
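We introduce the continuous temporal center of mass and relative coordinates and lattice relative coordinates (called mixed coordinates) by
\[ \xi = (x_2^0 - x_1^0,\; \vec x_2 - \vec x_1); \qquad \eta = (x_4^0 - x_3^0,\; \vec x_4 - \vec x_3); \qquad \tau = \left( \tfrac{1}{2}\left( x_3^0 + x_4^0 - x_1^0 - x_2^0 \right),\; \vec x_3 - \vec x_2 \right). \tag{2.7} \]
Writing \(\xi = (\xi^0, \vec\xi)\), etc., it follows from (2.6) that if \(\xi^0 = \eta^0 = 0\),
\[ D_\lambda(\xi, \eta, \tau) = \left( \theta(-\vec\xi),\; e^{-|\tau^0| H}\, e^{i \vec P \cdot \vec\tau}\, \theta(\vec\eta) \right), \]
where \(\theta(\vec\eta) = \hat\varphi(\vec 0)\hat\varphi(\vec\eta)\Omega - (\Omega, \hat\varphi(\vec 0)\hat\varphi(\vec\eta)\Omega)\Omega\). Let \(f : \mathbb{Z}^d \to \mathbb{C}\) be an arbitrary function vanishing outside a finite set and let \(\tilde f(\vec p)\) and \(\tilde D_\lambda(p, q, k)\) be respectively the Fourier transforms of \(f(\vec x)\) and \(D_\lambda(\xi, \eta, \tau)\). A simple calculation shows that
\[ (f, D_\lambda f)(k) = \int dp\, dq\; \bar{\tilde f}(\vec p)\, \tilde D_\lambda(p, q, k)\, \tilde f(\vec q) = (2\pi)^{3d+2} \int_0^\infty \int_{T^d} \delta(\vec q - \vec k)\, \frac{2E}{(k^0)^2 + E^2}\; d(\theta(f), \mathcal{E}(E, \vec q)\theta(f)), \tag{2.8} \]
where \(\theta(f) = \sum_{\vec x \in \mathbb{Z}^d} f(\vec x)\, \theta(-\vec x)\).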
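Formula (2.8) is similar to (2.4). For fixed \(\vec k\), the singularities in \(k^0\) of the lhs of (2.8) give direct information about the spectrum of H on the even subspace of states with momentum \(\vec k\). To test for the presence of bound states, it is sufficient to study the spectrum on the subspace of zero-momentum states.

\(\tilde D(p, q, k)\) is analyzed by introducing the BS equation \(D_\lambda = D_\lambda^0 + D_\lambda^0 K_\lambda D_\lambda\), where \(D_\lambda\), etc. are operators defined by the kernels \(D_\lambda(x_1, x_2; x_3, x_4)\), etc., and
\[ D_\lambda^0(x_1, x_2; x_3, x_4) = S_\lambda^{(2)}(x_1, x_3)\, S_\lambda^{(2)}(x_2, x_4) + S_\lambda^{(2)}(x_1, x_4)\, S_\lambda^{(2)}(x_2, x_3). \]
Perturbatively, the BS kernel \(K_\lambda(x_1, x_2; x_3, x_4)\) is the sum of amplitudes of all connected Feynman diagrams with four (amputated) external lines which are (channel) two-particle irreducible (see [4]). In terms of the mixed coordinates we have
\[ \begin{aligned} D_\lambda^0(\xi, \eta, \tau) = {} & S\left( \tau^0 + \frac{\eta^0}{2} - \frac{\xi^0}{2},\; \vec\tau + \vec\xi \right) S\left( \tau^0 - \frac{\eta^0}{2} + \frac{\xi^0}{2},\; \vec\tau + \vec\eta \right) \\ & + S\left( \tau^0 - \frac{\eta^0}{2} - \frac{\xi^0}{2},\; \vec\tau + \vec\xi + \vec\eta \right) S\left( \tau^0 + \frac{\eta^0}{2} + \frac{\xi^0}{2},\; \vec\tau \right) \end{aligned} \]
and similarly for \(D_\lambda\). Using the symmetries of \(D_\lambda\) and \(D_\lambda^0\) (hence also of \(K_\lambda\)), namely \(x_1 \leftrightarrow x_2\); \(x_3 \leftrightarrow x_4\); \(x_1, x_2 \leftrightarrow x_3, x_4\); \(x_1, x_2, x_3, x_4 \leftrightarrow -x_1, -x_2, -x_3, -x_4\), we have
\[ \tilde D_\lambda(k) = \tilde D_\lambda^0(k) + (2\pi)^{-2(d+1)}\, \tilde D_\lambda^0(k)\, \tilde K_\lambda(k)\, \tilde D_\lambda(k), \tag{2.9} \]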
where e.g. \(\tilde D_\lambda(k)\) is defined by the kernel \(\tilde D_\lambda(p, q, k)\) through \((\tilde D_\lambda(k) f)(p) = \int \tilde D_\lambda(p, q, k)\, f(q)\, dq\). The definitions of \(K_\lambda\) and the properties of \(\tilde K_\lambda(p, q, k)\) needed for showing the existence or absence of bound states again follow from exponential decay properties of \(K_\lambda\) and are obtained in Sect. 3. In the rest of the paper, we use an infinite time, infinite lattice notation, with the understanding that the properties of the quantities involved actually are obtained for the finite volume objects and are independent of the volume. This is ultimately justified by the convergent expansions of Sect. 3.

3. Decay Estimates and Analyticity of the Bethe–Salpeter Kernels

To prove the analyticity properties and the decay bounds on \(K_\lambda\) (and on the one-particle irreducible kernel \(k_\lambda\) below) we adapt some methods developed in [18], to which we refer for the full details. The decay of \(K_\lambda\) in the continuous time direction is controlled following the strategy used in the analysis of \(P(\varphi)_2\) quantum field models [18], employing now a cluster expansion developed in [1] for lattice stochastic systems. The decay in the discrete space direction is obtained still within the decoupling of the hyperplane method, but now as employed in the study of lattice statistical systems [16, 17]. Momentum space analyticity properties follow from these space-time exponential decay properties.

We give convenient representations for k and K (dropping now the subscript λ) which are used in conjunction with the hyperplane decoupling method to give the desired exponential decay of their kernels. For more details see [18]. We use the integration by parts formula [4]
\[ \int \varphi(x)\, A\, e^{-W}\, d\mu_C = \int C(x, y)\, \frac{\delta}{\delta\varphi(y)} \left( A\, e^{-W} \right) dy\, d\mu_C \]
and below we write W = λw.

Two-point function. From S = C − CkS (which defines k), ΓS = I, we have
\[ k(x, y) = (S^{-1} - C^{-1})(x, y) = (\Gamma - C^{-1})(x, y). \]
By integration by parts,
\[ (C^{-1} S)(x, y) = \delta(x - y) - \lambda \langle w'(\varphi(x)); \varphi(y) \rangle^c \equiv (I - \lambda A_1)(x, y), \]
\[ (A_1 C^{-1})(x, y) = \langle w''(x) \rangle\, \delta(x - y) - \lambda \langle w'(\varphi(x)); w'(\varphi(y)) \rangle^c \equiv B_1(x, y), \]
where the superscript c means connected and the semi-colon means partial truncation (see [18]). Thus, k can be written as
\[ k = \lambda (1 - \lambda A_1)^{-1} B_1. \tag{3.1} \]
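To make the role of the Neumann series in (3.1) concrete, here is a finite-dimensional sketch in which small random matrices stand in for the operators \(A_1\) and \(B_1\). These stand-ins, the matrix size and the value of λ are assumptions chosen only for illustration; the actual convergence argument in the text rests on the cluster expansion.

```python
import numpy as np

def neumann_kernel(lam, A1, B1, terms=60):
    """Partial sum lam * sum_{n < terms} (lam * A1)^n B1, approximating lam * (1 - lam*A1)^{-1} B1."""
    term = B1.copy()
    total = np.zeros_like(B1)
    for _ in range(terms):
        total += term
        term = lam * (A1 @ term)   # next power of lam * A1 applied to B1
    return lam * total

rng = np.random.default_rng(0)
n = 5                                   # hypothetical matrix size
A1 = rng.standard_normal((n, n))
A1 /= np.linalg.norm(A1, 2)             # normalise to operator norm 1
B1 = rng.standard_normal((n, n))
lam = 0.1                               # small coupling, so that ||lam * A1|| < 1

exact = lam * np.linalg.solve(np.eye(n) - lam * A1, B1)
series = neumann_kernel(lam, A1, B1)
error = np.max(np.abs(exact - series))
print("max deviation between resolvent and Neumann series:", error)
```

For \(\|\lambda A_1\| < 1\) the partial sums converge geometrically, which is the finite-dimensional analogue of the statement that each term of the series is a product of expectations of polynomials and that the series converges for small λ.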
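The Neumann series converges (see below) and each term is a product of expectations \(\langle \cdot \rangle\) of polynomials.

Four-point function. We write \(D_0 = 2 S \otimes S \equiv 2SS\), \(D_0^{-1} = \frac{1}{2} \Gamma \otimes \Gamma \equiv \frac{1}{2} \Gamma\Gamma\) so that, with \(D = D_0 + D_0 K D\),
\[ K = D_0^{-1} - D^{-1} = \frac{1}{2} \Gamma\Gamma - D^{-1}. \]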
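Note that \(D = D_0 + G\), where \(G(x_1, x_2, x_3, x_4) = \langle \varphi(x_1); \ldots; \varphi(x_4) \rangle^c\) has exponential tree decay, so that
\[ D^{-1} = (D_0 + G)^{-1} = (1 + D_0^{-1} G)^{-1} D_0^{-1}; \qquad K = (1 + D_0^{-1} G)^{-1} (D_0^{-1} G D_0^{-1}). \]
We isolate the singularities of \(D_0^{-1} G\) using \(k = \Gamma - C^{-1}\) and integration by parts, i.e., write
\[ D_0^{-1} G = \frac{1}{2} \Gamma\Gamma G = \frac{1}{2} [(k + C^{-1})(k + C^{-1})] G, \]
where
\[ ((C^{-1} G)(\cdot, x_2 x_3 x_4))(x_1 x_2 x_3 x_4) = -\lambda \langle w'(\varphi(x_1)); \varphi(x_2); \varphi(x_3); \varphi(x_4) \rangle^c, \]
\[ ((C^{-1} C^{-1} G)(\cdot, \cdot, x_3 x_4))(x_1 x_2 x_3 x_4) = \lambda^2 \langle w'(\varphi(x_1)); w'(\varphi(x_2)); \varphi(x_3); \varphi(x_4) \rangle^c - \lambda\, \delta(x_1 - x_2) \langle w''(\varphi(x_1)); \varphi(x_3); \varphi(x_4) \rangle^c. \]
Thus, \(D_0^{-1} G = \lambda A_2\) and \(D_0^{-1} G D_0^{-1} = \frac{1}{2} \lambda A_2 [(k + C^{-1})(k + C^{-1})] \equiv \lambda B_2\). We can now write \(K = \lambda (1 + \lambda A_2)^{-1} B_2\) and the Neumann series converges. The ladder approximation λL is the lowest order term in λ of K and, noting that \(G|_{\lambda=0} = 0\), we see that L is given by
\[ L = \frac{\partial K}{\partial\lambda}\Big|_{\lambda=0} = D_0^{-1}\, \frac{\partial G}{\partial\lambda}\Big|_{\lambda=0}\, D_0^{-1} = B_2|_{\lambda=0}. \]
An explicit calculation gives
\[ L(x_1 x_2; x_3 x_4) = \frac{3}{8}\, a_2\, \delta(x_2^0 - x_1^0)\, \delta(x_3^0 - x_2^0)\, \delta(x_4^0 - x_3^0) \cdot \sum_{i=1}^{4} \sum_{\vec u, \vec v \in \mathbb{Z}^d} (-\Delta + m^2)(\vec u, \vec v) \prod_{j \neq i} \delta(\vec x_j - \vec u)\, \delta(\vec x_i - \vec v), \tag{3.2} \]
which is local in time and range two in space. We write \(K = \lambda L + \lambda^2 K^{(2)}\). The final result of this section can be stated as follows.

Theorem 2. Given small \(\epsilon_1, \epsilon_2, \epsilon_3, \epsilon_4 > 0\), the kernels \(k_\lambda\) and \(K_\lambda^{(2)}\) have the following bounds:
\[ |k_\lambda(x_1, x_2)| \leq \lambda^2\, O(1) \exp\left[ -(1 - \epsilon_1) \frac{m^2}{2} |x_1^0 - x_2^0| - \ln(\epsilon_2 m^2)\, |\vec x_2 - \vec x_1| \right], \]
\[ \begin{aligned} |K_\lambda^{(2)}(x_1, x_2; x_3, x_4)| \leq {} & O(1) \exp\left[ -(1 - \epsilon_3) \frac{m^2}{2} \left( 2|(x_1^0 + x_2^0) - (x_3^0 + x_4^0)| + |x_2^0 - x_1^0| + |x_4^0 - x_3^0| \right) \right] \\ & \times \exp\left[ -\ln(\epsilon_4 m^2) \left( 2|(\vec x_1 + \vec x_2) - (\vec x_3 + \vec x_4)| + |\vec x_2 - \vec x_1| + |\vec x_4 - \vec x_3| \right) \right]. \end{aligned} \]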
Remark 3.1. The last expression O(1) above in fact becomes a constant in the momentum space only: it includes temporally singular parts of \(K^{(2)}\), i.e., terms with factors such as \(\delta(x_1^0 - x_3^0)\) and \(\delta(x_1^0 - x_3^0)\,\delta(x_2^0 - x_4^0)\).

From this theorem, in momentum space, we have

Corollary 1. Given ε > 0 small, if u, p, q, k are respectively the momentum conjugate variables to \(x_1 - x_2\), ξ, η, τ [see (2.7)], the kernels \(\tilde k_\lambda(u)\), \(\tilde K_\lambda(p, q, k)\) are analytic and bounded by a constant (dependent on m) in the region
\[ |\mathrm{Im}\, u^0| \leq (1 - \epsilon)\, 3m^2/2, \qquad |\mathrm{Im}\, u^j| \leq (1 - \epsilon) \ln(\epsilon m^2), \]
\[ |\mathrm{Im}\, k^0| \leq (1 - \epsilon)\, 4m^2/2, \qquad |\mathrm{Im}\, k^j| \leq 4(1 - \epsilon) \ln(\epsilon m^2), \]
\[ |\mathrm{Im}\, p^0|, |\mathrm{Im}\, q^0| \leq (1 - \epsilon)\, m^2/2, \qquad |\mathrm{Im}\, p^j|, |\mathrm{Im}\, q^j| \leq (1 - \epsilon) \ln(\epsilon m^2). \]

Now, we briefly describe the main points behind the proof of the above theorem. To simplify the whole analysis we first split from the initial covariance C its nonlocal spatial term (involving a lattice Laplacian) and introduce it into the (already nonlocal) potential. Hence, the questions related to nonlocality and decay in the time direction will be associated with the covariance, and those related to the space directions will be associated with the new extended potential. We define this new covariance as \(C^*\), where
\[ \int \varphi(x)\varphi(y)\, d\nu_{C^*} = \frac{1}{(2\pi)^{d+1}} \int \frac{e^{i p^0 (x^0 - y^0)}\, e^{i \vec p \cdot (\vec x - \vec y)}}{(p^0)^2 + [(1 - \epsilon) m^2/2]^2}\, dp, \]
ε small. Hence, replacing W, the new potential V includes the extra terms \(\frac{1}{8}(\varphi, \Delta^2 \varphi)\), \(\frac{m^2}{4}(\varphi, -\Delta\varphi)\) and \(\frac{\epsilon m^4}{8}(\varphi, \varphi)\). These rearrangements simplify the determination of the decay in the space directions. We remark that the computations of the expectations \(\langle \cdot \rangle\), using a Gaussian measure with covariance \(C^*\), are consistent with the previous definitions (see [4]).

We now turn to the cluster expansion of [1], developed to write as a sum of local parts integrals such as \(\int F(\varphi) \exp[-V(\varphi)]\, d\mu_{C^*}(\varphi)\) with the field variables φ involving continuous (time) and discrete (space) parts, and with nonlocality both in the perturbation V (nonlocal in space) and in the covariance \(C^*\) (here, nonlocal in time). The (slight) space nonlocality in V is due to a lattice Laplacian term and is easily treated with a Mayer expansion of exp(−V). The nonlocality in the measure \(\mu_{C^*}\) is treated by decoupling \(\mu_{C^*}\) in terms of measures which factor over certain regions of space and time. The "localized" covariances are defined as \(C^*(U, U')(x, x') = \chi_U(x)\, C^*(x, x')\, \chi_{U'}(x')\), where \(\chi_U(x)\) is the characteristic function of U, a set of lattice points. The cluster expansion is constructed, as is well known, using a decomposition in terms of convex (positivity preserving) combinations of such covariances, i.e., for \(C^*_{s_1} = s_1 C^* + (1 - s_1)[C^*(U_1) + C^*(\sim U_1)]\), \(s_1 \in [0, 1]\), \(C^*(U_1) \equiv C^*(U_1, U_1)\), we write
\[ \int F\, d\mu_{C^*} = \int F\, d\mu_{C^*_{s=1}} = \int_0^1 ds\, \frac{d}{ds} \int F\, d\mu_{C^*_s} + \int F\, d\mu_{C^*_0} = \int_0^1 ds \int \left( \frac{1}{2} \frac{dC^*_s}{ds} \circ \Delta_\varphi \right) F\, d\mu_{C^*_s} + \int F\, d\mu_{C^*_0}, \]
where \(C^* \circ \Delta_\varphi = \int dx^0\, dx'^0 \sum_{\vec x, \vec x'} C^*(x, x')\, \frac{\partial}{\partial\varphi(x)} \frac{\partial}{\partial\varphi(x')}\). With this, the initial integral can be written as an integral with \(C^*_0 = C^*(U_1) + C^*(\sim U_1)\) which decouples \(U_1\)
and \(\sim U_1\), leading to a factorization of \(\int F\, d\mu_{C^*_0}\) (for F local), plus a remainder term involving a covariance with interpolating parameters s. Iterating, i.e., decoupling \(U_1\) into \(U_2\) and \(\sim U_2\) (introducing \(C^*_{s_1, s_2} = s_2 C^*_{s_1} + (1 - s_2)[C^*_{s_1}(U_2) + C^*_{s_1}(\sim U_2)]\)), and so on, we construct the cluster expansion such as the standard ones used in the study of N particle statistical mechanical systems [17] (once we consider any U as the union of unit lines \(I \times \{\vec x\}\), with \(I = (i, i + 1) \subset [-T, T]\)). From the convergence of this cluster expansion, the existence of the infinite lattice limit and the exponential tree decay for the truncated field correlations follow [1].

To obtain the decay in the time direction we use the decoupling method as developed in quantum field theory [18]. Once more we change the covariances by introducing additional interpolating variables \(t_i\), \(i \in \mathbb{Z} \cap (-T, T)\), to test the coupling between points below and above each hyperplane \(x^0 = i\). Setting \(t_i\) to zero has the effect of decoupling the interaction along this time hyperplane. We define
\[ C^*_{t_1} = t_1 C^* + (1 - t_1)[C^*(\Delta_{T_1}) + C^*(\sim \Delta_{T_1})], \tag{3.3} \]
where \(t_1 \in [0, 1]\) and \(\Delta_{T_1}\) is the region \(-T < x^0 < -T + 1\). Setting \(t_1 = 0\) decouples \(C^*_{t_1}\) along the hyperplane \(x^0 = -T + 1\). In fact, in (3.3), we shall consider \(C^*_{t_1, s}\), i.e., \(C^*\) replaced by \(C^*_s\), as defined before. Note that \(0 \leq C^*_{t,s}(x, y) \leq C^*(x, y)\). From now on, we drop the s index. The next interpolating parameter \(t_2\) (for the hyperplane \(x^0 = -T + 2\)) is introduced such that \(C^*_{t_1, t_2} = t_2 C^*_{t_1} + (1 - t_2)[C^*_{t_1}(\Delta_{T_2}) + C^*_{t_1}(\sim \Delta_{T_2})]\), where \(t_2 \in [0, 1]\) and \(\Delta_{T_2}\) is \(-T < x^0 < -T + 2\). We write \(C^*_t\) for \(C^*_{t_1, t_2, t_3, \ldots}\).

Turning to the one-particle irreducible kernel k and to the BS kernel K, derivative calculations with Gaussian integrations [18] give us

Lemma 3.2. If the hyperplane i separates x and y (i.e., \(x^0 < i < y^0\) or vice-versa) then
\[ \frac{d^r}{dt_i^r}\, k_{\lambda,t}(x, y) \Big|_{t_i = 0} = 0, \qquad r = 0, 1, 2. \]

Lemma 3.3. If the hyperplane i separates \(x_1, x_2\) from \(x_3, x_4\) then
\[ \frac{d^r}{dt_i^r}\, K_{\lambda,t}(x_1, x_2; x_3, x_4) \Big|_{t_i = 0} = 0, \qquad 0 \leq r \leq 3. \]
And if the hyperplane i separates \(x_1\) from \(x_2, x_3, x_4\) the result holds for 0 ≤ r ≤ 2.

Hence, to obtain the decay rates of the kernels above we expand them in a Taylor series in \(t_i\) to get
\[ f(t = 1) = \int_0^1 \prod_i \frac{(1 - t_i)^{r_i}}{r_i!}\, \partial_{t_i}^{r_i + 1} f(t)\, dt \]
(f = k, K) and estimate the remainder term.

Recalling that our kernels k and K can be expressed in terms of truncated functions, we have to estimate expressions such as \(\partial_t^\beta \int Q e^{-\lambda V} d\mu_{C^*}\), where \(\partial_t^\beta = \prod_i \partial^{n_i} / \partial t_i^{n_i}\) (details below), \(C^* = C^*_{s,t}\) and Q is a polynomial in φ. Let us introduce some notation. For \(I \subset \mathbb{Z}\) let β be a function on I with values in {1, 2, . . . , n}, i.e., a set \(\{n_i\}_{i \in I}\), \(n_i \leq n\). By the Leibniz rule we have
\[ \partial_t^\beta \int Q e^{-\lambda V} d\mu_{C^*} = \sum_{\pi \in \wp(\beta)} \int \prod_{\alpha \in \pi} \left( \frac{1}{2}\, \partial_t^\alpha C^* \circ \Delta_\varphi \right) e^{-\lambda V} Q\, d\mu_{C^*}, \]
where ℘ (β) is the set of the partitions of β. We keep following the treatment given in [18] to estimate the derivatives. For each α above we introduce a complex variable h(α). Then, considering the localization index j = (j1 , j2 ) we define Cj∗ (x1 , x2 ) = χIJ1 (x1 )C ∗ (x1 , x2 )χIj2 (x2 ), where Ij1 is some unit line I × {x}. Finally, we define the key structure 1 α ∗ 1 + h(α) ∂t C ◦ $ϕ e−λV QdµC ∗ 2 α∈π j Qπ (t, h) ≡ numerator with Q ≡ 1 which is related to the t derivatives by β
∂t Q(t) =
π ∈℘ (β) α∈π
∂ Qπ (t, h)
. ∂h(α) h=0
Then, we control the derivatives in t using the analyticity in h of \(Q_\pi(t, h)\) (for h in a properly chosen domain), and we bound the derivatives in h (for h = 0) using the Cauchy formula. This method avoids problems such as the proliferation of terms generated by derivatives in t, as |β| derivatives lead to |β|! terms. The analyticity in h of \(Q_\pi(t, h)\) (hence also of k(t, h), K(t, h)) is shown by a cluster expansion (involving the s parameters introduced before) which permits us to establish suitable bounds for \(|Q_\pi(t, h)|\) and related expressions for properly chosen h. The convergence of this expansion is ensured by the cluster expansion convergence of [1] and by the fact that \(h(\alpha)\, \partial_t^\alpha C^*_j \circ \Delta_\varphi\) may be treated as small terms (for suitable h). Precisely, we have

Lemma 3.4. For \(\alpha = \{n_i\}_{i \in I} \subset I^{(n)}\), we have
\[ \left| \int \partial_t^\alpha C^*_j(x, y)\, dy \right| \leq O(1) \exp\left[ -\frac{m^2}{2}\, d(\alpha)\, (1 - \epsilon) \right], \]
where ε is small and d(α) is ∞ if any \(n_i > 1\); 0, if |α| = 1; or max{|i − j|, i, j ∈ I}, otherwise.

The bound of Lemma 3.4 comes from the exponential decay of the initial covariance \(C^*\) and from the effect of the t interpolating parameter. For this, recall that \(C^*_{t_1} = t_1 C^* + (1 - t_1)[C^*(\Delta_{T_1}) + C^*(\sim \Delta_{T_1})]\) and thus \(\partial C^*_{t_1} / \partial t_1 = C^* - [C^*(\Delta_{T_1}) + C^*(\sim \Delta_{T_1})]\), i.e., \(\partial_{t_1} C^*_{t_1}(x, y) \neq 0\) iff x is in \(\Delta_{T_1}\) and y in \(\sim \Delta_{T_1}\) (or vice-versa). We also remark that when |α| = 1 we do not gain a convergence factor from the covariance derivative, but \(\Delta_\varphi\) (in \(Q_\pi\)) eventually acting on \(e^{-\lambda V}\) (Q is a polynomial with fixed degree) gives us small λ factors. Using Lemma 3.4 and the definition of d(α), we obtain

Theorem 3. \(K_\lambda^{(2)}(t, h)\) is analytic in h for \(|h(\alpha)| \leq \exp\left[ -\frac{m^2}{2} (1 - \epsilon)[d(\alpha) + 1] \right]\) and then
\[ |K_\lambda^{(2)}| \leq O(1) \exp\left[ -\frac{m^2}{2} (1 - \epsilon')\, \bar d \right] \]
with
\[ \bar d \equiv \min_\pi \sum_{\alpha \in \pi} (d(\alpha) + 1) < 2|(x_1^0 + x_2^0) - (x_3^0 + x_4^0)| + |x_1^0 - x_2^0| + |x_3^0 - x_4^0| \]
and \(\epsilon' = \epsilon'(\lambda) \to 0\) as λ → 0.
A similar result follows for the one-particle irreducible kernel k, but with \(\bar d\) bounded by \(|x_1^0 - x_2^0|\).

We again use the decoupling of the hyperplane method to obtain the decay in the discrete space directions, but now the procedure is much simpler. We write \(V = (\sum_{\vec x} V_{\vec x} + \sum_b V_b)\), where \(\vec x\) are the points in Λ, \(V_{\vec x}\) are the local potential parts, b are bonds for nearest neighbor lattice points in Λ: \(b = (\vec x, \vec y)\), with \(|\vec x - \vec y| = 1\) or 2 (for the term involving \(\Delta^2\)), and \(V_b\) are the nonlocal potential parts. Then we introduce interpolating parameters z for each bond into the nonlocal potential terms, e.g., a term in \(V_b\) such as \(\varphi^3(x)\varphi(y)\) (with \(|\vec x - \vec y| = 1\), which appears if \(P(\varphi) = (\varphi^6 - \varphi^4)\)) is replaced by \(z_{xy}\, \varphi^3(x)\varphi(y)\). The original potential is recovered by setting all z = 1; setting \(z_{xy}\) equal to zero has the effect of decoupling the hyperplanes x and y. Once more, we expand our kernels in a Taylor series in \(z_i\) (i labeling the bonds) to get
\[ f(z = 1) = \int_0^1 \prod_i (1 - z_i)\, \partial_{z_i} f(z)\, dz. \]
The point is that now f(z) is analytic for z in a disk with radius given by some fraction of \(m^2\). For z in this region it is easy to see that the nonlocal square terms (with coefficients involving z) are controlled by the term \(\frac{\epsilon m^4}{8}(\varphi, \varphi)\), and the nonlocal parts related to P(φ) are controlled by the local ones. Then, using Cauchy estimates in the Taylor formula above (together with the cluster expansion convergence) we obtain the decay estimate of Theorem 2.

4. The Bound State Analysis

4.1. The two-point function. The convergence of the cluster expansion referred to in the previous section gives an exponential decay for the two-point function, which establishes a spectral gap between the vacuum and the one-particle state. Now, continuing our analysis of the two-point function, we show the existence of an isolated e-m dispersion curve \(E_\lambda(\vec p)\) for the one-particle state.

From the Feynman–Kac formula and the spectral theorem, we have for the Fourier transform \(\tilde S_\lambda(p)\) of the two-point function S(0; x), where \(x \equiv (x^0, \vec x)\) and \(0 = (0, \vec 0)\), the formulas (2.4) and (2.5), a lattice version of the Källén–Lehmann representation (see [4]). The dispersion relation \(E_\lambda(\vec p)\) is the solution of the equation \(\tilde\Gamma(p^0 = iE_\lambda, \vec p) = 0\). In terms of \(E_\lambda(\vec p)\), and separating the one-particle contribution from the remaining odd state contributions, for E ∈ [0, ∞) and \(\vec p \in T^d\), the measure \(d\eta_\lambda(E, \vec p)\) has the decomposition
\[ d\eta_\lambda(E, \vec p) = \frac{c_\lambda(\vec p)}{2 (2\pi)^d E_\lambda(\vec p)}\, \delta(E - E_\lambda(\vec p))\, dE\, d\vec p + d\hat\eta_\lambda(E, \vec p), \]
where \(d\hat\eta_\lambda(E, \vec p)\) has support in \([2M_\lambda - \epsilon, \infty)\), for some positive ε vanishing for λ → 0. Above, \(c_\lambda(\vec p)\) is a normalization which will be determined below. From the decomposition above, we get
\[ \tilde S_\lambda(p) = \frac{c_\lambda(\vec p)}{(p^0)^2 + E_\lambda(\vec p)^2} + \int_{2M_0}^{\infty} \frac{2E}{(p^0)^2 + E^2}\, d\hat\eta_\lambda(E, \vec p). \tag{4.1} \]
By explicit calculation, we have \(\tilde S_\lambda = \tilde S_0 + O(\lambda^2)\), where \(\tilde S_0(p) = [(p^0)^2 + E_0(\vec p)^2]^{-1}\), \(E_0(\vec p) = \sum_{j=1}^{d} (1 - \cos p^j) + m^2/2\), which implies \(\tilde\Gamma_\lambda(p) = \tilde\Gamma_0(p) + O(\lambda^2)\). As shown
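in Sect. 3, \(\tilde k_\lambda\) and hence \(\tilde\Gamma_\lambda\) are analytic for real \(\vec p\) and \(\mathrm{Im}\, p^0 < \kappa\), with \(\kappa \leq 2M_0 - \epsilon\). Also, since \(\tilde k_\lambda\) is \(O(\lambda^2)\) near the zeroes \((\pm iE_0(\vec p), \vec p)\) of \(\tilde\Gamma_0(p)\), it follows that \(\tilde\Gamma_\lambda(p)\) has zeroes nearby, which we call \((\pm iE_\lambda(\vec p), \vec p)\). That the difference \(E_\lambda(\vec p) - E_0(\vec p)\) is \(O(\lambda^2)\) is a consequence of the fact that the measure \(d\hat\eta_\lambda\) is itself \(O(\lambda^2)\), as given in Proposition 4.1 below. That these zeroes can only be located on the imaginary axis can be seen from (2.5). Also, the zero is unique since \(\tilde\Gamma_\lambda(ip^0, \vec p)\) is monotone in \(p^0 \in [0, 2M_\lambda - \epsilon)\). The representation (4.1) with \(c_\lambda(\vec p) = 1 + O(\lambda^2)\) follows immediately from the above facts since
\[ c_\lambda(\vec p)^{-1} = -\frac{\partial \tilde\Gamma_\lambda}{\partial\kappa^2}(i\kappa, \vec p) \Big|_{\kappa = E_\lambda(\vec p)}. \]
From (4.1), and the above argument, it follows that the spectral value \(E_\lambda(\vec p)\) is isolated from the rest of the H spectrum. The corresponding mass \(M_\lambda\) then satisfies the property given in the statement of Theorem 1, i.e., \(M_\lambda = m^2/2 + O(\lambda^2)\). Concerning \(d\hat\eta_\lambda(E, \vec p)\), we have

Proposition 4.1. Integrating over E with \(\vec p\) fixed gives
\[ \int_0^\infty d\hat\eta_\lambda(E, \vec p) = O(\lambda^2). \tag{4.2} \]

Proof. The spatial Fourier transform of the two-point function is
\[ \hat S_\lambda(x^0 = 0, \vec p) = \frac{c_\lambda(\vec p)}{2E_\lambda(\vec p)} + (2\pi)^d \int_0^\infty d\hat\eta_\lambda(E, \vec p). \]
On the other hand, \(\hat S_\lambda(x^0 = 0, \vec p) = \hat S_0(x^0 = 0, \vec p) + O(\lambda^2)\), with \(\hat S_0(x^0 = 0, \vec p) = [2E_0(\vec p)]^{-1}\), so that \(\hat S_\lambda(x^0 = 0, \vec p) = [2E_0(\vec p)]^{-1} + O(\lambda^2)\). We obtain \(c_\lambda(\vec p) = c_0(\vec p) + O(\lambda^2)\); \(c_0(\vec p) = 1\); \(E_\lambda(\vec p) = E_0(\vec p) + O(\lambda^2)\). Hence (4.2) follows. □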
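For future use, we state another important property of the one-particle dispersion curve.

Proposition 4.2. The dispersion relation satisfies the bound \(M_\lambda \equiv E_\lambda(\vec 0) < E_\lambda(\vec p)\), for \(\vec p \neq 0\).

Proof. For λ = 0, we have \(\frac{\partial^2 E_0(\vec p)}{\partial p_j\, \partial p_\ell} = \delta_{j\ell} \cos p_j\). For small λ ≠ 0, the λ = 0 second derivative is used to dominate \(\frac{\partial^2 [E_\lambda(\vec p) - E_0(\vec p)]}{\partial p_j\, \partial p_\ell}\), for \(|p_j| \leq \frac{\pi}{4}\). For \(|p_j| > \frac{\pi}{4}\), \(E_0(\vec p) - E_0(\vec 0) > 1 - \frac{1}{\sqrt 2}\), so that, since \(|E_\lambda(\vec p) - E_0(\vec p)| \leq \mathrm{const}\, \lambda^2\), we also have \(E_\lambda(\vec p) - E_\lambda(\vec 0) + E_\lambda(\vec 0) - E_0(\vec 0) \leq 2d + \mathrm{const}\, \lambda^2\). As \(E_\lambda(\vec p)\) is real analytic in \(\vec p\), the analytic implicit function theorem and Cauchy estimates are used to control \(\frac{\partial^2 E_0(\vec p)}{\partial p_j\, \partial p_\ell}\) and the remainder. □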
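The λ = 0 ingredients of Theorem 1 and Proposition 4.2 can be checked numerically from the explicit free dispersion curve \(E_0(\vec p) = \sum_j (1 - \cos p^j) + m^2/2\). The sketch below is only an illustration of those free-theory facts (mass \(m^2/2\) at \(\vec p = 0\), bandwidth 2d, and the gap \(1 - 1/\sqrt 2\) once some component exceeds π/4); the values of d and m and the grid resolution are arbitrary assumptions.

```python
import numpy as np

def E0(p, m):
    """Free dispersion curve E_0(p) = sum_j (1 - cos p^j) + m^2 / 2 for one or many momenta."""
    p = np.atleast_2d(np.asarray(p, dtype=float))
    return np.sum(1.0 - np.cos(p), axis=-1) + m**2 / 2.0

d, m = 2, 3.0                                   # hypothetical dimension and mass parameter
axis = np.linspace(-np.pi, np.pi, 41)           # grid on the torus, includes 0 and +-pi
grid = np.stack(np.meshgrid(*[axis] * d, indexing="ij"), axis=-1).reshape(-1, d)

values = E0(grid, m)
mass = E0(np.zeros(d), m)[0]                    # E_0(0) = m^2 / 2
bandwidth = values.max() - mass                 # sup_p [E_0(p) - E_0(0)]

# Points with at least one component larger than pi/4 in absolute value
far = np.any(np.abs(grid) > np.pi / 4, axis=1)
gap = (values[far] - mass).min()

print("mass:", mass, "bandwidth:", bandwidth, "gap away from origin:", gap)
```

For λ > 0 the text controls the \(O(\lambda^2)\) corrections to these quantities analytically; the sketch does not attempt to model them.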
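4.2. The ladder approximation. The first part of this subsection is devoted to showing the existence or absence of two-particle bound states in the ladder approximation and follows [15]. We use the mixed coordinates of Eq. (2.7) to analyze the kernels in the BS equation. The kernel of \(\tilde D_\lambda^0\) is given by
\[ \begin{aligned} \tilde D_\lambda^0(p, q, k) = {} & \delta(p^0 + q^0)\, \tilde S_\lambda^{(2)}\left( \frac{k^0}{2} - p^0, \vec p \right) \tilde S_\lambda^{(2)}\left( \frac{k^0}{2} + p^0, \vec q \right) \delta(\vec p + \vec q - \vec k) \\ & + \tilde S_\lambda^{(2)}\left( \frac{k^0}{2} + p^0, \vec p \right) \tilde S_\lambda^{(2)}\left( \frac{k^0}{2} - p^0, \vec k - \vec p \right) \delta(p - q). \end{aligned} \tag{4.3} \]
Recall that \(\tilde D_\lambda(k^0)\) means \(\tilde D_\lambda\) taken at zero spatial momentum, i.e., \(\tilde D_\lambda((k^0, \vec 0))\). The action of \(\tilde D_\lambda^0(k^0)\) on energy independent functions \(f(\vec p)\), which depend only on \(\vec p\), is
\[ (\tilde D_\lambda^0(k^0) f)(p) = (2\pi)^{d+1}\, \tilde S_\lambda^{(2)}\left( \frac{k^0}{2} + p^0, \vec p \right) \tilde S_\lambda^{(2)}\left( \frac{k^0}{2} - p^0, \vec p \right) [f(\vec p) + f(-\vec p)]. \tag{4.4} \]
In the ladder approximation, \(\tilde K_\lambda\) is replaced by its first order term \(\lambda \tilde L\) of Eq. (3.2), which is local in time and so
\[ \tilde L(p, q, k) = -\frac{3}{4}\, a_2\, [E_0(\vec p) + E_0(\vec q) + E_0(\vec p - \vec k) + E_0(\vec q - \vec k)], \]
i.e., its Fourier transform does not depend on \(p^0\), \(q^0\) and \(k^0\). Hence, at zero total spatial momentum \(\vec k\),
\[ \tilde L(p, q, (k^0, \vec 0)) = -\frac{3}{2}\, a_2\, [E_0(\vec p) + E_0(\vec q)], \]
which shows that the operator \(\tilde L(k^0, \vec 0)\) has rank two (in a scalar local field theory the rank is one). Solving the Bethe–Salpeter equation (2.9) for \(\tilde D_\lambda\), in the ladder approximation, yields
\[ \tilde D_\lambda(k^0) = \left[ 1 - (2\pi)^{-2(d+1)} \lambda\, \tilde D_\lambda^0(k^0)\, \tilde L(k^0) \right]^{-1} \tilde D_\lambda^0(k^0) = \tilde D_\lambda^0(k^0) \left[ 1 - (2\pi)^{-2(d+1)} \lambda\, \tilde L(k^0)\, \tilde D_\lambda^0(k^0) \right]^{-1} \tag{4.5} \]
with all quantities taken at zero spatial momentum as in (4.4). The action of \(\tilde L \tilde D_\lambda^0\) is given by
\[ [\tilde L(k^0) \tilde D_\lambda^0(k^0) f](p) = -3 a_2 (2\pi)^{d+1} \int [E_0(\vec p) + E_0(\vec q)]\; \tilde S_\lambda\left( \frac{k^0}{2} - q^0, \vec q \right) \tilde S_\lambda\left( \frac{k^0}{2} + q^0, \vec q \right) \times [f(-q) + f(-q^0, \vec q)]\, dq. \tag{4.6} \]
Hence, if the test function f depends only on \(\vec p\), we have
\[ (\tilde L(k^0) \tilde D_\lambda^0(k^0) f)(\vec p) = -3 a_2 (2\pi)^{d+1} [\rho_0(f) + \rho_1(f) E_0(\vec p)], \]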
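where
\[ \rho_n(f) = \frac{1}{2} \int_{T^d} G(\vec q, k^0)\, E_0(\vec q)^{\delta_{0n}}\, [f(\vec q) + f(-\vec q)]\, d\vec q; \qquad n = 0, 1, \]
\[ G(\vec q, k^0) = \int_{-\infty}^{\infty} \tilde S_\lambda^{(2)}(q)\, \tilde S_\lambda^{(2)}(k^0 - q^0, \vec q)\, dq^0. \]
It follows from (4.1) and from a simple analytic continuation argument that \(G(\vec q, k^0)\) is analytic on \(\mathrm{Im}\, k^0 < 2E_\lambda(\vec 0)\). This result depends on the fact that \(E_\lambda(\vec 0) \leq E_\lambda(\vec p)\) for any \(\vec p \in T^d\), proven in Proposition 4.2.

Recall, from (2.8), that the basic object we want to analyze is \((f, \tilde D_\lambda(k^0) f)\), which has the form
\[ (f, \tilde D_\lambda(k^0) f) = 2 (2\pi)^{d+1} \int_{T^d} f(\vec p)\, G(\vec p, k^0)\, g(\vec p, k^0)\, d\vec p, \tag{4.7} \]
where
\[ g(\cdot, k^0) = \left[ 1 - (2\pi)^{-2(d+1)} \lambda\, \tilde L(k^0)\, \tilde D_\lambda^0(k^0) \right]^{-1} f(\cdot). \]
The only singularities of (4.7) on \(\mathrm{Im}\, k^0 < 2E_\lambda(\vec 0)\) must come from those of \(g(\cdot, k^0)\). But, in turn, these come from the zeroes of \(1 - \mu_\pm(k^0)\), where \(\mu_\pm(k^0)\) are the eigenvalues of \((2\pi)^{-2(d+1)} \lambda \tilde L(k^0) \tilde D_\lambda^0(k^0)\) on the space generated by the functions 1 and \(E_0(\vec p)\). We find
\[ \mu_\pm(k^0) = -3 a_2 (2\pi)^{-(d+1)} \lambda \left[ \alpha(k^0) \pm \left( \beta(k^0) \gamma(k^0) \right)^{1/2} \right] \tag{4.8} \]
with the eigenfunction corresponding to \(\mu_+\) given by
\[ \psi_+(\vec p) = 1 + \sqrt{\frac{\beta}{\gamma}}\, E_0(\vec p), \]
where
\[ \alpha(k^0) = \int_{T^d} E_0(\vec q)\, G(\vec q, k^0)\, d\vec q, \qquad \beta(k^0) = \int_{T^d} G(\vec q, k^0)\, d\vec q, \qquad \gamma(k^0) = \int_{T^d} E_0(\vec q)^2\, G(\vec q, k^0)\, d\vec q. \tag{4.9} \]
Now, from (4.1), \(G(\vec q, k^0)\) can be written as
\[ G(\vec q, k^0) = \frac{\pi}{2}\, \frac{c_\lambda(\vec q)^2}{E_\lambda(\vec q) \left[ E_\lambda(\vec q)^2 + \frac{1}{4} (k^0)^2 \right]} + G_1(\vec q, k^0), \]
where \(G_1(\vec q, k^0)\) is analytic on \(\mathrm{Im}\, k^0 < E_\lambda(\vec 0) + 2M_0\).

From general principles, the singularities of (4.7) can only be located on the imaginary \(k^0\) axis. Writing \(k^0 = i\kappa\) with κ ≥ 0 and using (4.1), one can show that \(G(\vec q, i\kappa) > 0\)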
for \(0 \leq \kappa < 2E_\lambda(\vec 0)\). It follows then that α(iκ), β(iκ) and γ(iκ) are positive and, by the Cauchy–Schwarz inequality, \(\alpha \leq [\beta\gamma]^{1/2}\) on \(0 \leq \kappa < 2E_\lambda(\vec 0)\).

For space dimension d ≥ 3, α(iκ), β(iκ) and γ(iκ) increase to a finite limit as \(\kappa \to 2E_\lambda(\vec 0)\) because the singularity generated by \(G(\vec q, i\kappa)\) is quadratic and therefore integrable. Thus, if λ is small enough, \(1 - \mu_\pm(i\kappa)\) cannot be zero on \(0 < \kappa < 2E_\lambda(\vec 0)\) so that, in the ladder approximation, there are no bound states.

If d < 3, α, β and γ diverge as \(\kappa \to 2E_\lambda(\vec 0)\), but \(\alpha - [\beta\gamma]^{1/2}\) remains finite. This yields the nonvanishing of \(1 - \mu_-(i\kappa)\). Finally, \(1 - \mu_+(i\kappa)\) is nonzero if \(a_2 > 0\), and has a unique zero on the interval \(0 < \kappa < 2E_\lambda(\vec 0)\) if \(a_2 < 0\). This implies the existence of a single bound state for the latter case.

Let \(M_\lambda = E_\lambda(\vec 0)\) be the mass for a single quasiparticle in the interacting theory. The mass \(M_L\) of the bound state, in the ladder approximation, is the solution of (assuming \(a_2 < 0\))
\[ F(\lambda, iM_L) = -(2\pi)^{d+1} / (3 a_2 \lambda), \]
where \(F(\lambda, k^0) = \alpha(\lambda, k^0) + [\beta(\lambda, k^0)\, \gamma(\lambda, k^0)]^{1/2}\), and we have made explicit the λ dependence of α, β and γ. Let \(E = 2M_\lambda - M_L\). Performing an asymptotic analysis of the coefficients α, β, and γ we find
\[ E(\lambda) = \begin{cases} \dfrac{9}{4} \dfrac{\lambda^2}{m^4}\, a_2^2\, [1 + O(\lambda)], & \text{if } d = 1; \\[2mm] \exp\left\{ -\dfrac{4\pi}{3} \dfrac{m^2}{|a_2| \lambda}\, [1 + O(\lambda)] \right\}, & \text{if } d = 2. \end{cases} \tag{4.10} \]

To go beyond the ladder approximation, let us introduce some function spaces. We define a weighted Hardy space \(\mathcal{H}_\delta\) (see [3, 21]) as functions f analytic in the strip \(|\mathrm{Im}\, p^0| < \delta_0\); \(|\mathrm{Im}\, p^j| < \delta_1\) such that f(p) = f(−p), with norm given by, with \(\alpha = (\alpha^0, \vec\alpha)\),
\[ \|f\|_\delta^2 = \sup_{|\alpha^0| < \delta_0,\, |\vec\alpha| < \delta_1} \int |w(p + i\alpha)\, f(p + i\alpha)|^2\, dp, \]
where \(w(p) = [(p^0)^2 + (4M_0)^2]^{-\alpha}\), with \(\frac{1}{4} < \alpha < \frac{1}{2}\) and, for a small positive number ε, \(\delta_0 = \frac{1}{2} E_0(\vec 0) - \epsilon\) and \(\delta_1 = \ln(2E_0(\vec 0))\, (1 - \epsilon)\). An equivalent norm is
\[ \|f\|_\delta^2 = \sum_{\vec x \in \mathbb{Z}^d} \int_{-\infty}^{+\infty} e^{2\delta_0 |x^0| + 2\delta_1 |\vec x|}\, |(wf)^\vee(x)|^2\, dx^0, \]
where the symbol ∨ denotes the inverse Fourier transform. The space \(\mathcal{H}_\delta\) accommodates constant or bounded functions, such as the eigenfunctions that occur in the ladder approximation, and it is also convenient for treating the various temporally singular, exponentially decaying kernels, such as \(D_\lambda\), \(D_\lambda^0\) and \(K_\lambda\). Furthermore, the operators \(\tilde L(k^0) \tilde D_\lambda^0(k^0)\) and \(\tilde K_\lambda^{(2)}(k^0) \tilde D_\lambda^0(k^0)\) will be shown to be a compact analytic family.

We give criteria for an integral operator to be compact, which will be used in the sequel (see [19, 2] for more details). Let A be the operator acting in \(\mathcal{H}_\delta\) given by \([Af](p) = \int A(p, q) f(q)\, dq\). Then, denoting the Hilbert–Schmidt norm associated with \(\mathcal{H}_\delta\) by \(\|\cdot\|_{HS}\), we have
Lemma 4.3. Let
\[
B_\delta(q) = \frac{1}{(q^0)^2 + \delta_0^2} \prod_{j=1}^{d} \frac{1}{(\not q^{\,j})^2 + \delta_1^2},
\]
with, following (2.2), $(\not q^{\,j})^2 = 2(1 - \cos q^j)$. Then,
a) $\|A\|_{HS}^2 \le c \sup_{p,q} \left[ w(q)^{-1} \int |A(p + i\delta, q')|\, B_\delta(q - q')\, w(q')^{-1}\, dq' \right]^2$, where $c$ is a positive constant, and
b) $\|A\|_{HS}^2 \le c_1^2 \int w(q)^2\, dq$, for some $c_1 > 0$ such that the following bound holds:
\[
\sup_p \int w(q')^{-1}\, |A(p + i\delta, q')|\, B_\delta(q - q')\, dq' \le c_1 w(q).
\]
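Proof. Here, we use the continuum notation for lattice sums. Use the unitary map $U : H_\delta \to L^2(\mathbb{Z}^d \times \mathbb{R}, d^{d+1}x)$ given by $\tilde g(p) \mapsto e^{\delta_0 |x^0| + \delta_1 |\vec x|}\, [w\tilde g]^\vee(x)$ to write, with the subscript 2 standing for the $L^2$ inner product,
\[
\langle \tilde g, A\tilde f \rangle_\delta = \langle g_2, UAU^{-1} f_2 \rangle_2 \equiv \int \bar g_2(x)\, R(x, y)\, f_2(y)\, dx\, dy,
\]
where $g_2 = U\tilde g$ and $f_2 = U\tilde f$. The Hilbert–Schmidt norm is calculated in $L^2$, using the convolution and Plancherel theorems in the $y$ variable, noting that $B_\delta$ is the Fourier transform of $e^{-\delta_0 |y^0| - \delta_1 |\vec y|}$, and the equivalence of norms in the $x$ variable. In this way, we obtain
\[
\|R\|_{2,HS}^2 \le \int |w(p + i\delta)|^2 \left| \int |A(p + i\delta, q')|\, w(q')^{-1} B_\delta(q - q')\, dq' \right|^2 dp\, dq.
\]
Using the inequalities $|w(p + i\delta)| < c\, w(p)$ and $|w(p + i\delta)| < 1$, and multiplying and dividing by $w(q)^2$ in the integrand leads to the result, since $\int w(p)^2 w(q)^2\, dp\, dq < \infty$. □

We now obtain a representation for $[\lambda \tilde L(k^0)\tilde D^0_\lambda(k^0) - w]^{-1}$. Consider the decomposition of $f$ in terms of $1$, $E_0(\vec p)$ and $f_2(p)$, where $f_2$ is orthogonal to $1$ and $E_0(\vec p)$. Associate with $f(p)$ the column vector with entries $c_0$, $c_1$, $f_2$. Then, $\lambda \tilde L(k^0)\tilde D^0_\lambda(k^0) f$ has the form
\[
\begin{pmatrix} c_0' \\ c_1' \\ f_2' \end{pmatrix}
=
\begin{pmatrix} \rho\alpha & \rho\gamma & A_{02} \\ \rho\beta & \rho\alpha & A_{12} \\ 0 & 0 & 0 \end{pmatrix}
\begin{pmatrix} c_0 \\ c_1 \\ f_2 \end{pmatrix}
\]
with $\rho = -3a_2\lambda$, and, for $j = 0, 1$,
\[
A_{j2} f_2 = \rho \int \tilde S^{(2)}_\lambda\!\left(\tfrac{k^0}{2} - q^0, \vec q\right) \tilde S^{(2)}_\lambda\!\left(\tfrac{k^0}{2} + q^0, \vec q\right) E_0(\vec q)^{\delta_{j,0}} \left[ f_2(-q) + f_2(-q^0, \vec q) \right] dq.
\]
Letting
\[
D(w) \equiv (w - \rho\alpha)^2 - \rho^2\beta\gamma = (w - \mu_+)(w - \mu_-), \tag{4.11}
\]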
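where $\mu_\pm$ are given in (4.8), $[\lambda \tilde L(k^0)\tilde D^0_\lambda(k^0) - w]^{-1} f$ is given by
\[
\frac{-1}{w D(w)}
\begin{pmatrix}
-w(\rho\alpha - w) & w\rho\gamma & \rho\gamma A_{12} - (\rho\alpha - w) A_{02} \\
w\rho\beta & -w(\rho\alpha - w) & \rho\beta A_{02} - (\rho\alpha - w) A_{12} \\
0 & 0 & D(w)
\end{pmatrix}
\begin{pmatrix} c_0 \\ c_1 \\ f_2 \end{pmatrix}.
\tag{4.12}
\]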
For $0 < \kappa < 2M_\lambda$, we show in the next subsection that $A_{02}$ and $A_{12}$ are compact operators, so we see from (4.12) that the resolvent $[\lambda \tilde L(k^0)\tilde D^0_\lambda(k^0) - w]^{-1}$ is a bounded operator for $w \notin \{\mu_+, \mu_-, 0\}$. For $w = 1$, this resolvent equals the one in (4.5) and does not exist precisely at the ladder bound state mass $M_L$, which satisfies $\mu_+(k^0 = iM_L) = 1$. Moreover, since $|\mu_-| \le O(\lambda)$ for $0 < \kappa < 2M_\lambda$, it is not singular at $\mu_-$. From direct computation, we obtain $M_L = 2M_\lambda - E(\lambda)$ as in (4.10).

4.3. Useful estimates. This subsection is devoted to establishing the main estimates that we will need in the ensuing sections. The proofs are patterned after those in [19, 2], so we will be brief. Hereafter, instead of $k^0$, we use $\kappa = -ik^0$. Also, in many places, we suppress the $\lambda$ dependence for notational simplicity. Finally, throughout, the various constants appearing below are independent of $m$ and $\lambda$, unless stated otherwise. Typically, the functions $\alpha(\kappa)$, $\beta(\kappa)$ and $\gamma(\kappa)$ [see (4.9)], as well as the single particle contribution to the two-point function, become singular as $\kappa$ approaches the two-particle threshold due to the small denominator factor $4E_\lambda(\vec p)^2 - \kappa^2$. For estimates it is convenient to introduce the parameter
\[
\Delta(\kappa) = \left[ 4M_\lambda^2 - (\mathrm{Re}\,\kappa)^2 \right]^{-1/2}, \tag{4.13}
\]
which arises from writing the denominator factor as $4E_\lambda(\vec p)^2 - 4M_\lambda^2 + 4M_\lambda^2 - \kappa^2$, recalling that $E_\lambda(\vec 0) = M_\lambda$. The following bounds hold.

Lemma 4.4. Recall from Eq. (4.8) that $\mu_+ = \rho\,[\alpha + (\beta\gamma)^{1/2}]$. For some positive constants $c_1, \ldots, c_6$, we have the following bounds:
a) $\lambda c_1 \Delta(M_L) \le \mu_+(M_L) = 1 \le \lambda c_2 \Delta(M_L)$ and $\lambda c_3 \Delta(M_L)^3 \le \dfrac{\partial \mu_+}{\partial \kappa}(M_L) \le \lambda c_4 \Delta(M_L)^3$,
b) For sufficiently small $c$, and $|\kappa - M_L| < c\lambda^2$, $\Delta(\kappa) < c_5 \lambda^{-1}$ and $\Delta(\kappa)^3 > c_6 \lambda^{-3}$.

Proof. a) The bounds follow from $c_7 \vec p^{\,2} \le 4E_\lambda(\vec p)^2 - 4E_\lambda(\vec 0)^2 \le c_8 \vec p^{\,2}$ in the denominators of $\alpha$, $\beta$ and $\gamma$. b) For the first inequality, $\Delta(\kappa)^{-2} = 4M_\lambda^2 - (\kappa - M_L + M_L)^2 \ge \left[ c_1 - 2M_\lambda c - c^2\lambda^2 \right] \lambda^2$, using the first part. The second inequality follows similarly. □

Recalling that $K(\kappa) = \lambda L + \lambda^2 K^{(2)}(\kappa)$, we write
\[
\tilde D^0_\lambda(\kappa) \equiv R_{0\lambda}(\kappa) = R_{00}(\kappa) + \lambda^2 R_{02}(\kappa) \tag{4.14}
\]
with $R_{00}$ independent of $\lambda$. We also write
\[
K(\kappa) R_{0\lambda}(\kappa) = \lambda T_1(\kappa) + \lambda^2 T_2(\kappa), \tag{4.15}
\]
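where
\[
T_1(\kappa) = L R_{00}(\kappa), \qquad
T_2(\kappa) = \lambda L R_{02}(\kappa) + K^{(2)}(\kappa) R_{00}(\kappa) + \lambda^2 K^{(2)}(\kappa) R_{02}(\kappa). \tag{4.16}
\]
(The factor $\lambda$ in front of $L R_{02}$ is what the product of $K(\kappa) = \lambda L + \lambda^2 K^{(2)}(\kappa)$ with (4.14) requires for (4.15) to hold.)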
We have suppressed the $\lambda$ dependence of $K^{(2)}$, $R_{02}$ and $T_2$. We have

Lemma 4.5. The operators $T_1$, $K^{(2)}R_{00}$, $LR_{02}$ and $K^{(2)}R_{02}$ are compact analytic in $0 < \mathrm{Re}\,\kappa < 2M_\lambda$, $|\mathrm{Im}\,\kappa| < M_\lambda$, and bounded in norm by $O(\Delta(\kappa))$.

Proof. The proofs are similar so we give a complete one only for $K^{(2)}R_{00}$. The proof for $LR_{02}$ requires a bound on the contributions to the two-point function above the one-particle state. The needed bound is given in Proposition 4.1. To show compactness of $K^{(2)}R_{00}$, we show that $K^{(2)}R_{00}$ satisfies the bound of Lemma 4.3a). From (4.3) with $\vec k = \vec 0$, we have
\[
R_{00}(p, q, k) = 2\delta(p + q)
\left[ \left( p^0 - \frac{i\kappa}{2} \right)^2 + \left( \frac{\vec{\not p}^{\,2}}{2} + \frac{m^2}{2} \right)^2 \right]^{-1}
\left[ \left( p^0 + \frac{i\kappa}{2} \right)^2 + \left( \frac{\vec{\not p}^{\,2}}{2} + \frac{m^2}{2} \right)^2 \right]^{-1}
\equiv \delta(p + q)\, r_{00}(\kappa, p),
\tag{4.17}
\]
so that, as $|K^{(2)}(\kappa, p + i\delta, q')| \le O(1)$ from the Corollary of Theorem 2, $\|K^{(2)}R_{00}\|_{HS}$ is bounded by
\[
\sup_q \int |r_{00}(\kappa, q')|\, w(q)^{-1} B_\delta(q - q')\, w(q')^{-1}\, dq'.
\]
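Break the $q'^0$ integration region into $|q'^0| \le M_0$ and $|q'^0| > M_0$. For $|q'^0| \le M_0$, $w(q)^{-1} B_\delta(q - q')\, w(q')^{-1}$ is clearly bounded so that we have the required bound $c\Delta(\kappa)$. For $|q'^0| > M_0$, write $q^0 = (q^0 - q'^0) + q'^0$, so that
\[
w(q)^{-1} = \left[ (q^0)^2 + 16M_\lambda^2 \right]^{\alpha} \le 2^\alpha\, |q^0 - q'^0|^{2\alpha} + 2^\alpha \left[ (q'^0)^2 + 16M_\lambda^2 \right]^{\alpha},
\]
using the $\ell_p$ triangle inequality with $p = \alpha^{-1}$. As $r_{00}(\kappa, q')$ is $O\left((q'^0)^{-4}\right)$, the result follows. □

Let $H^*$ be the dual space to $H$, determined by the $L^2$ inner product. We have

Lemma 4.6. $R_{0\lambda} : H \to H^*$ is analytic in $0 < \mathrm{Re}\,\kappa < 2M_\lambda$, $|\mathrm{Im}\,\kappa| < M_\lambda$ and with norm bounded by $c\Delta(\kappa)^2$, $c > 0$.

Proof. From Eq. (4.3),
\[
|(g, R_{0\lambda} f)_2| \le \sup_p \left[ \left| \tilde S_\lambda\!\left(p^0 + \tfrac{i\kappa}{2}\right) \tilde S_\lambda\!\left(p^0 - \tfrac{i\kappa}{2}\right) \right| w(p)^{-2} \right] \int w(p)^2\, |g(p) f(p)|\, dp,
\]
and using (4.1) the result follows. □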
4.4. Complete model: Existence of bound states. For the complete model, following [2], here we show the existence of mass spectrum in the interval $\kappa \in (0, 2M_\lambda)$ when $d < 3$ and $a_2 < 0$. We will prove there is a unique bound state near the ladder bound state $M_L$. In the next subsection, absence of bound states in $(0, 2M_\lambda)$ will be proven both for $d \ge 3$ and $a_2 < 0$ and for $a_2 > 0$. Essentially, this is done by showing the existence or absence of an eigenvalue 1 of $K_\lambda(\kappa) R_{0\lambda}(\kappa)$. Multiplicity one is checked for the former case. Before we go to the technical details, we give a description of the strategy employed in both cases.

For the repulsive case $a_2 > 0$ and for the attractive case $a_2 < 0$ and $d \ge 3$, with $K_\lambda = \lambda L + \lambda^2 K^{(2)}_\lambda$, we write
\[
D_\lambda = D_L + D_\lambda \lambda^2 K^{(2)}_\lambda D_L = D_L \left[ 1 - \lambda^2 K^{(2)}_\lambda D_L \right]^{-1},
\]
where
\[
D_L = D^0_\lambda + \lambda D_L L D^0_\lambda = D^0_\lambda \left[ 1 - \lambda L D^0_\lambda \right]^{-1}.
\]
Using an explicit representation for $D_L$, we show that $D_L$ has no singularities in $(0, 2M_\lambda)$, and also that $K^{(2)}_\lambda D_L$ has norm less than one in $(0, 2M_\lambda)$. Hence, the resolvent $\left[ 1 - \lambda^2 K^{(2)}_\lambda D_L \right]^{-1}$ is well defined by its Neumann series and $D_\lambda$ does not have singularities in $(0, 2M_\lambda)$.

For the attractive case $a_2 < 0$ and $d < 3$, in order to show existence of a bound state we write
\[
D_\lambda = D^0_\lambda \left[ 1 - K_\lambda D^0_\lambda \right]^{-1}
\]
and consider the family of compact operators, $\mu \in \mathbb{C}$, defined by $T_\lambda(\mu, \kappa) = -\lambda T_1(\kappa) + \mu T_2(\kappa)$, where $T_1$ and $T_2$ are defined in (4.16). We remark that $\mu = \lambda^2$ corresponds to the value of interest (the physical one), that is [see (4.15)] $T_\lambda(\lambda^2, \kappa) = K_\lambda(\kappa) R_{0\lambda}(\kappa)$. This family is shown to be compact and jointly analytic in $\kappa$ and $\mu$, for $0 < \mathrm{Re}\,\kappa < 2M_\lambda$ and $|\mu| < 2\lambda^2$. Without further analysis, the analytic Fredholm theory implies that $\left[ 1 - K_\lambda D^0_\lambda \right]^{-1}$ exists, except for $\kappa$ in a discrete set. As $D^0_\lambda$ is not singular in the same domain, it follows that the mass spectrum is discrete in $(0, 2M_\lambda)$. However, we show more. The point $\mu = 0$ is called the ladder approximation, which was solved explicitly in Subsect. 4.2, and leads to a bound state at some $\kappa = \kappa_L \in (0, 2M_\lambda)$. This is the only mass spectral point in $(0, 2M_\lambda)$. As $\mu T_2$ is an analytic perturbation, it is shown that there is an isolated bound state of multiplicity one at $\kappa_b \in (0, 2M_\lambda)$, where $\kappa_b$ lies in the interval $|\kappa_b - \kappa_L| \le \frac{1}{2} b\lambda^2$, for $b$ sufficiently small, uniform in $\lambda$, such that $\kappa_b$ is the unique mass spectral point in the interval.

For $\kappa$ in the intervals $\left(0, \kappa_L - \frac{1}{2} b\lambda^2\right)$ or $\left(\kappa_L + \frac{1}{2} b\lambda^2, 2M_\lambda - \lambda^{5/2}\right)$, the mass spectrum is excluded by showing that $\left[ 1 - K_\lambda D^0_\lambda \right]^{-1}$ exists. Thus, as $D^0_\lambda$ is not singular, the same holds for $D = D^0_\lambda \left[ 1 - K_\lambda D^0_\lambda \right]^{-1}$. For $\kappa$ near $M_L$, the resolvent $(-\lambda T_1(\kappa) - w)^{-1}$ of $-\lambda T_1(\kappa)$ is constructed explicitly and $\mu T_2(\kappa)$ is shown to be an analytic perturbation to this ladder operator. The resolvent $(T_\lambda(\mu, \kappa) - w)^{-1}$ is defined through its Neumann series and is shown to exist for $w$ in
the region where $|w|^{-1} < 4$ and $|w - 1|^{-1} < 4$. This means that the spectrum of $T_\lambda(\mu, \kappa)$ is contained in $|w| \le 1/4$, $|w - 1| \le 1/4$. Consequently, by analytic perturbation theory, there is a unique multiplicity one eigenvalue $\alpha_\lambda(\mu, \kappa)$ of $T_\lambda(\mu, \kappa)$ which is analytic both in $\kappa$ and $\mu$, and satisfies $\alpha_\lambda(0, M_L) = 1$. However, we do not know that for real $\mu > 0$ and small the eigenvalue takes the value one. To show that indeed it does, we compute the derivative $[\partial\alpha_\lambda/\partial\kappa](0, \kappa)$ (see Lemma 4.11), which is large positive for small $\lambda$. This is shown to be the dominating contribution to $[\partial\alpha_\lambda/\partial\kappa](\mu, \kappa)$. Thus, for small real $\mu$, $\alpha_\lambda(\mu, \kappa)$ is strictly monotone increasing in $\kappa$. In this way, we show:

Lemma 4.7. Let $\mu$ and $\kappa$ be real. For $|\mu| < 2\lambda^2$ and $c$ sufficiently small, there is a unique $\kappa = \kappa_\lambda(\mu)$ in $|\kappa - M_L| \le \frac{1}{2} c\lambda^2$ such that $\alpha_\lambda(\mu, \kappa_\lambda(\mu)) = 1$.

Remark 4.8. Recall that $\mu = \lambda^2$ is the physical value of interest so that $\alpha_\lambda\left(\lambda^2, \kappa_\lambda(\lambda^2)\right) = 1$ is the eigenvalue of $T_\lambda(\lambda^2, \kappa) = K_\lambda(\kappa) R_{0\lambda}(\kappa)$, where $\kappa = \kappa_\lambda(\lambda^2) \equiv M_b$ is the bound state mass given in (1.5).

In order for the analysis of [2, Lemmas 2.7–2.11] to go through, it suffices to show the two lemmas below.

Lemma 4.9. Let $\mu_\pm$ be defined as in (4.8). Then, for some positive $c$,
\[
\left\| [w - T_\lambda(0, M_L)]^{-1} \right\| \le c \max\left\{ \frac{1}{|w|},\ \frac{1}{|w - 1|},\ \frac{1}{|w|}\,\frac{1}{|w - 1|}\,\frac{1}{|w - \mu_-(M_L)|}\, \|T_\lambda(0, M_L)\| \right\}.
\]

Remark 4.10. We recall, for $\kappa = M_L$, that $\mu_+(M_L) = 1$ and $|\mu_-(M_L)| \le c|\lambda|$. Note that the ladder bound state satisfies $\alpha_\lambda(0, M_L) = 1$.

Lemma 4.11. For $\kappa$ such that $|\kappa - M_L| \le \frac{1}{2} c\lambda^2$, with a sufficiently small $c > 0$, set $\alpha_\lambda(0, \kappa) = \rho\,[\alpha + (\beta\gamma)^{1/2}]$. Then, there exist positive constants $c_1$ and $c_2$ such that
\[
\frac{\partial\alpha_\lambda}{\partial\kappa}(0, \kappa) \ge \lambda c_1 \Delta(\kappa)^3 \ge c_2 \lambda^{-2},
\]
for $\Delta(\kappa)$ as defined in (4.13).

Proof of Lemma 4.9. Using the representation (4.12), the resolvent $[w - T_\lambda(0, M_L)]^{-1}$ is bounded using Lemma 4.5. □

Proof of Lemma 4.11. From the representations for $\alpha$, $\beta$ and $\gamma$ [see (4.9)], we see that they are all strictly positive as well as their $\kappa$ derivatives. From the bounds of Lemma 4.4, it follows that $\dfrac{\partial\alpha_\lambda}{\partial\kappa}(0, \kappa) \ge \lambda c_1 \Delta(\kappa)^3$. □

4.5. Complete model: Absence of bound states. Here, considering the complete model and using the strategy described in Sect. 4.4, we show the absence of mass spectrum in $(0, 2M_\lambda)$ in the two-particle sector for the repulsive case $a_2 > 0$ and $d < 3$, as well as for $d \ge 3$. A variant of the method is used to complete the proof that excludes spectrum between the bound state $M_b$ and the two-particle threshold $2M_\lambda$. We treat the repulsive case following the method of [19] and the attractive one following [2]. As before, the $\lambda$ dependence is omitted unless deemed necessary. To control the spectrum, we treat $D = D^0 + DKD^0$ as a perturbation about the ladder approximation. For this, we set $D_L$ for the ($\lambda$ dependent) $D$ solution of the ladder BS equation, that is,
\[
D_L = D^0 + \lambda D_L L D^0. \tag{4.18}
\]
Then, $D$ is given by
\[
D = D_L + D \lambda^2 K^{(2)} D_L = D_L \left[ 1 - \lambda^2 K^{(2)} D_L \right]^{-1}.
\]
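The second equality above is a short resolvent computation. The following is only a sketch: it treats all kernels as operators and assumes that the inverses involved exist, which is what the norm bounds of this subsection are meant to guarantee.

```latex
% Sketch: from D = D^0 + D K D^0 and (4.18) to D = D_L [1 - \lambda^2 K^{(2)} D_L]^{-1}.
% Assumption: (1 - \lambda L D^0) and (1 - \lambda^2 K^{(2)} D_L) are invertible.
\begin{align*}
D &= D^0 + D K D^0, \qquad K = \lambda L + \lambda^2 K^{(2)}
\\
\Longrightarrow\quad D\,(1 - \lambda L D^0) &= D^0 + \lambda^2 D K^{(2)} D^0 .
\intertext{Equation (4.18) reads $D_L (1 - \lambda L D^0) = D^0$, i.e.
$D^0 (1 - \lambda L D^0)^{-1} = D_L$. Multiplying on the right by
$(1 - \lambda L D^0)^{-1}$ therefore gives}
D &= D_L + \lambda^2 D K^{(2)} D_L
\\
\Longrightarrow\quad D &= D_L \bigl[ 1 - \lambda^2 K^{(2)} D_L \bigr]^{-1}
   = D_L \sum_{n \ge 0} \bigl( \lambda^2 K^{(2)} D_L \bigr)^n ,
\end{align*}
% The series converges in operator norm whenever
% \| \lambda^2 K^{(2)} D_L \| < 1.
```

In this form it is visible why a norm bound on $K^{(2)} D_L$ suffices: once the Neumann series converges, $D$ can only be singular where $D_L$ is.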
In the repulsive case $a_2 > 0$, we show that $D_L$ has no singularity in $0 < \kappa < 2M_\lambda$, and that the bound state of mass $M_b$ is isolated with isolation radius $r_b$. We now show that there is no spectrum in $(M_b + r_b, 2M_\lambda)$, again by showing that $K^{(2)} D_L$ has norm less than one in this interval. The starting point of the analysis is an explicit representation for $D_L$. Using (4.6) and (4.17) in (4.18), and suppressing the $\kappa$ dependence, gives
\[
D_L(p, q) = r_0(p)\delta(p + q) - 3\lambda a_2\, r_0(q) Y(p) - 3\lambda a_2\, r_0(q) E_0(\vec q) X(p), \tag{4.19}
\]
where
\[
X(p) = \int D_L(p, q)\, dq, \qquad Y(p) = \int D_L(p, q) E_0(\vec q)\, dq.
\]
Multiplying (4.19) by the functions $1$ and $E_0(\vec q)$, integrating over $q$ and solving for $X(p)$ and $Y(p)$, leads to
\[
D_L(p, q) = r_0(p)\delta(p + q) - \frac{3\lambda a_2}{D} \Bigl\{ E_0(\vec p) + E_0(\vec q) + 3\lambda a_2 \bigl[ (E_0(\vec p) + E_0(\vec q))\,\alpha - \gamma - \beta E_0(\vec p) E_0(\vec q) \bigr] \Bigr\}\, r_0(p) r_0(q)
\equiv r_0(p)\delta(p + q) + c(p, q)\, r_0(p) r_0(q),
\]
where $D = D(w = 1)$ [see (4.11)], that is
\[
D = (1 - \mu_+)(1 - \mu_-) = 1 + 3\lambda a_2 \alpha + (3\lambda a_2)^2 \left( \alpha^2 - \beta\gamma \right). \tag{4.20}
\]
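The passage from (4.19) to the closed form of $D_L$ is a $2 \times 2$ linear solve. The sketch below is hedged: since (4.9) is not restated here, it assumes the shorthand $\beta = \int r_0$, $\alpha = \int r_0 E_0$, $\gamma = \int r_0 E_0^2$ (normalization factors suppressed), assumes $E_0$ is even, and writes $\rho = -3\lambda a_2$. These assumptions are at least consistent with the relation $J = \beta/D$ of (4.23), as the last line checks.

```latex
% Assumed shorthand (not restated from (4.9)):
%   \beta = \int r_0(q)\,dq, \quad \alpha = \int r_0(q) E_0(\vec q)\,dq, \quad
%   \gamma = \int r_0(q) E_0(\vec q)^2\,dq, \quad \rho = -3\lambda a_2 .
% Integrating (4.19) against 1 and against E_0(\vec q):
\begin{align*}
X(p) &= r_0(p) + \rho\,\bigl[\beta\,Y(p) + \alpha\,X(p)\bigr], \\
Y(p) &= r_0(p)E_0(\vec p) + \rho\,\bigl[\alpha\,Y(p) + \gamma\,X(p)\bigr].
\intertext{This is a linear system with determinant
$D = (1 - \rho\alpha)^2 - \rho^2\beta\gamma$, i.e. $D(w=1)$ of (4.11), and solution}
X(p) &= \frac{r_0(p)}{D}\,\bigl[(1 - \rho\alpha) + \rho\beta\,E_0(\vec p)\bigr], \\
Y(p) &= \frac{r_0(p)}{D}\,\bigl[\rho\gamma + (1 - \rho\alpha)\,E_0(\vec p)\bigr].
\intertext{Inserting into
$D_L(p,q) = r_0(p)\delta(p+q) + \rho\,r_0(q)\,[\,Y(p) + E_0(\vec q)X(p)\,]$:}
D_L(p,q) &= r_0(p)\delta(p+q) + \frac{\rho}{D}\,r_0(p)r_0(q)
  \Bigl\{ E_0(\vec p) + E_0(\vec q)
  - \rho\bigl[(E_0(\vec p) + E_0(\vec q))\alpha - \gamma
  - \beta E_0(\vec p)E_0(\vec q)\bigr] \Bigr\}.
\intertext{Consistency check against (4.23):}
J &= \int X(p)\,dp = \frac{(1 - \rho\alpha)\beta + \rho\beta\alpha}{D} = \frac{\beta}{D}.
\end{align*}
```

With $\rho = -3\lambda a_2$ the curly bracket reproduces the bracket multiplying $-3\lambda a_2/D$ in the closed form of $D_L$ term by term.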
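To establish our result, it is sufficient to use the bound of Lemma 4.3 for the Hilbert–Schmidt norm of $\lambda^2 K^{(2)}_\lambda R$, with, following (4.14), $R(\kappa) \equiv \tilde D_L(\kappa)$, and the bound (uniformly in $p'$)
\[
|I| \equiv \left| \int R(p, q) f(p) g(q)\, dp\, dq \right| \le O\left(\lambda^{-1}\right) w(q'). \tag{4.21}
\]
In (4.21), suppressing the $p'$ and $q'$ dependence,
\[
f(p) = K^{(2)}_\lambda(\kappa, p' + i\delta, p); \qquad g(q) = w(q)^{-1} B_\delta(q - q'). \tag{4.22}
\]
As the $\kappa$ behavior of
\[
J \equiv \int \tilde D_L(p, q)\, dp\, dq = \beta/D \tag{4.23}
\]
is easily controlled (see Lemma 4.14 below), it is convenient to write $I$ of (4.21) as
\[
\begin{aligned}
I ={}& \int r_0(p) \left[ f(p) g(q) - f(0) g(0) \right] dp\, dq + \int \left[ f(p) - f(0) \right] g(0)\, c(p, q)\, r_0(p) r_0(q)\, dp\, dq \\
&+ \int f(p) \left[ g(p) - g(0) \right] c(p, q)\, r_0(p) r_0(q)\, dp\, dq + J f(0) g(0) \equiv X_1 + X_2 + X_3 + X_4.
\end{aligned}
\]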
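The terms $X_2$ and $X_3$ are bounded by a combination of the methods used for bounding $X_1$ and $X_4$ (see [2]). We now bound $X_1$. Following [19], we write, with $h(p) \equiv f(p) g(p)$,
\[
h(p) - h(0) = h(p) - h(0, \vec p) + h(0, \vec p) - h(0) \equiv \delta h_1(p) + \delta h_2(\vec p). \tag{4.24}
\]
The $\delta h_1(p)$ and $\delta h_2(\vec p)$ terms are bounded in the lemmas below.

Lemma 4.12. Recalling the definitions given in (4.22) and (4.24), the bound
\[
\left| \int \delta h_1(p)\, r_{00}(\kappa, p)\, dp \right| \le O\left(\lambda^{-1}\right) w(q')
\]
is satisfied.

Proof. Write [see (2.2) and (4.3)]
\[
r_{00}(\kappa, p) = (i\kappa p^0)^{-1} \left\{ \left[ (p^0)^2 - i\kappa p^0 - \frac{\kappa^2}{4} + \left( \frac{\vec{\not p}^{\,2}}{2} + \frac{m^2}{2} \right)^2 \right]^{-1} - \left[ (p^0)^2 + i\kappa p^0 - \frac{\kappa^2}{4} + \left( \frac{\vec{\not p}^{\,2}}{2} + \frac{m^2}{2} \right)^2 \right]^{-1} \right\}.
\]
The singularity in $p^0$ at zero is cancelled and for the first (second) terms we make the contour shift $p^0 \to p^0 \pm i\delta_0'$, with $\delta_0' < \delta_0$. Thus, the denominators become
\[
(p^0)^2 \pm 2i\delta_0' p^0 \mp i\kappa p^0 + \kappa\delta_0' - \delta_0'^2 - \frac{\kappa^2}{4} + \left( \frac{m^2}{2} + \frac{\vec{\not p}^{\,2}}{2} \right)^2,
\]
which is zero for $\frac{\kappa^2}{4} = \left( \frac{m^2}{2} + \frac{\vec{\not p}^{\,2}}{2} \right)^2 + \delta_0'(\kappa - \delta_0') > \left( \frac{m^2}{2} \right)^2$ for $\delta_0' < \kappa$. Thus, $\kappa > m^2$. Hence, we have no $p^0$ singularity and the rest of the bound is carried out using the $1/(p^0)^3$ falloff of the terms of $r_{00}(\kappa, p)$ as in the proof of Lemma 4.5. □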
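Lemma 4.13. The bound
\[
\left| \int \delta h_2(\vec p)\, r_{00}(\kappa, p)\, dp \right| \le O\left(\lambda^{-1}\right) w(q')
\]
holds.

Proof. Writing
\[
h(0, \vec p) - h(0) = \vec p \cdot \nabla_{\vec u}\, h(0, \vec u)\big|_{\vec u = 0} + \int_0^1 dt \int_0^t dt'\, \frac{\partial^2}{\partial t'^2} h(0, t'\vec p)
\]
and doing the $p^0$ integration, we get
\[
\int \frac{c_\lambda(\vec p)\, p^j p^k}{E_\lambda(\vec p) \left[ 4E_\lambda(\vec p)^2 - 4E_\lambda(\vec 0)^2 + 4M_\lambda^2 - \kappa^2 \right]} \int_0^1 dt \int_0^t dt'\, \frac{\partial^2 h}{\partial u_j \partial u_k}(0, \vec u = t'\vec p)\, d\vec p,
\]
where the terms linear in $\vec p$ integrate to zero by parity. The integral over $\vec p$ is finite for $0 < \kappa < 2M_\lambda$. Concerning the derivatives of $h(0, \vec p)$ with respect to $p^j$, we see that they are bounded by those of $B_\delta$ and those of $K^{(2)}_\lambda$. Using the analyticity of $K^{(2)}_\lambda$, the derivatives are uniformly bounded. Proceeding as in the proof of Lemma 4.5, the bound is completed. □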
Lemma 4.14. There exist positive constants $c_1$, $c_2$, $c_3$ and $c_4$ such that
i) For $a_2 > 0$, and uniformly for $0 < \kappa < 2M_\lambda$, $J \le c_1 \Delta(\kappa) \left[ 1 + 3\lambda a_2 c_2 \Delta(\kappa) - c_3 \lambda^2 \right]^{-1}$.
ii) For $a_2 < 0$, and uniformly for $2M_\lambda - \lambda^{5/2} \le \kappa < 2M_\lambda$, $\lambda J \le c_4$.

Proof. i) From (4.20) and (4.23), we get
\[
J = \beta \left[ (1 - \mu_+)(1 - \mu_-) \right]^{-1} = \beta \left[ 1 + 3\lambda a_2 \alpha + (3\lambda a_2)^2 \left( \alpha^2 - \beta\gamma \right) \right]^{-1}.
\]
The first bound follows from $\alpha, \beta, \gamma < c\Delta(\kappa)$ but $\alpha^2 - \beta\gamma < 0$, by the Cauchy–Schwarz inequality. However, by separating out the constant term in the numerators of $\alpha$, $\beta$ and $\gamma$, the $\vec{\not p}^{\,2}$ singularity in the denominator is cancelled and $\alpha^2 - \beta\gamma < c$ uniformly in $0 < \kappa < 2M_\lambda$. For ii) see Sect. 3 of [2]. □

5. Concluding Remarks

We have determined the low-lying e-m spectrum for dynamic stochastic lattice Landau–Ginzburg models with small polynomial interaction and such that the equilibrium state is in the single phase region. The determination of the spectrum for models with equilibrium states in the multi-phase region is of interest. Also the question of the effect of large noise on the spectrum is relevant and is currently being investigated [13].

References

1. Dimock, J.: A Cluster Expansion for Stochastic Lattice Fields. J. Stat. Phys. 58, 1181–1207 (1990)
2. Dimock, J., Eckmann, J.-P.: On the Bound State in Weakly Coupled $\lambda(\varphi^6 - \varphi^4)_2$ Models. Commun. Math. Phys. 51, 41–54 (1976)
3. Duren, P.: Theory of $H^p$ Spaces. Pure and Applied Mathematics Vol. 38, New York: Academic Press, 1970
4. Glimm, J., Jaffe, A.: Quantum Physics: A Functional Integral Point of View. New York: Springer Verlag, 1986
5. Gammaitoni, L., Hanggi, P., Jung, P., Marchesoni, F.: Stochastic Resonance. Rev. Mod. Phys. 70, 223–287 (1998)
6. Hohenberg, P. C., Halperin, B. I.: Theory of Dynamic Critical Phenomena. Rev. Mod. Phys. 49, 435–479 (1977)
7. Horsthemke, W., Lefever, R.: Noise-induced Transitions. Berlin: Springer Verlag, 1984
8. Itzykson, C., Zuber, J.-B.: Quantum Field Theory. New York: McGraw-Hill, 1980
9. Jona-Lasinio, G., Mitter, P. K.: On the Stochastic Quantization of Field Theory. Commun. Math. Phys. 101, 409–436 (1985)
10. Jona-Lasinio, G., Sénéor, R.: Study of Stochastic Differential Equations by Constructive Methods I. J. Stat. Phys. 83, 1109–1148 (1996)
11. Kondratiev, Yu. G., Minlos, R. A.: One-Particle Subspaces in the Stochastic XY Model. J. Stat. Phys. 87, 613–642 (1997)
12. Minlos, R. A., Suhov, Y. M.: On the Spectrum of the Generator of an Infinite System of Interacting Diffusions. Commun. Math. Phys. 206, 463–489 (1999)
13. Pereira, E.: Noise Induced Bound States. Phys. Lett. A 282, 169–174 (2001)
14. Reed, M., Simon, B.: Analysis of Operators. Modern Methods of Mathematical Physics Vol. IV, New York: Academic Press, 1978
15. Schor, R., Barata, J. C. A., Faria da Veiga, P. A., Pereira, E.: Spectral Properties of Weakly Coupled Landau-Ginzburg Stochastic Models. Phys. Rev. E 59, Issue 3, 2689–2694 (1999)
16. Schor, R., O’Carroll, M.: Decay of the Bethe–Salpeter Kernel and Absence of Bound States for Lattice Classical Ferromagnetic Spin Systems at High Temperature. J. Stat. Phys. 99, 1207–1223 (2000); Transfer Matrix Spectrum and Bound States for Lattice Classical Ferromagnetic Spin Systems at High Temperature. J. Stat. Phys. 99, 1265–1279 (2000)
17. Simon, B.: Statistical Mechanics of Lattice Models. Princeton, NJ: Princeton University Press, 1994
18. Spencer, T.: The Decay of the Bethe–Salpeter Kernel in P(ϕ)2 Quantum Field Models. Commun. Math. Phys. 44, 143–164 (1975)
19. Spencer, T., Zirilli, F.: Scattering States and Bound States in λP(φ)2 Models. Commun. Math. Phys. 49, 1–16 (1976)
20. Spohn, H.: Large Scale Dynamics of Interacting Particles. Berlin: Springer Verlag, 1991
21. Stein, E.M.: Harmonic Analysis. Princeton, NJ: Princeton University Press, 1993
22. Zhizhina, E.A.: Two-Particle Spectrum of the Generator for Stochastic Model of Planar Rotators at High Temperature. J. Stat. Phys. 91, 343–366 (1998)
23. Zinn-Justin, J.: Quantum Field Theory and Critical Phenomena. Oxford: Oxford University Press, 1993

Communicated by Ya. G. Sinai
Commun. Math. Phys. 220, 403 – 428 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Global Properties of Gravitational Lens Maps in a Lorentzian Manifold Setting Volker Perlick Albert Einstein Institute, 14476 Golm, Germany. E-mail: [email protected] Received: 16 October 2000 / Accepted: 18 January 2001
Abstract: In a general-relativistic spacetime (Lorentzian manifold), gravitational lensing can be characterized by a lens map, in analogy to the lens map of the quasi-Newtonian approximation formalism. The lens map is defined on the celestial sphere of the observer (or on part of it) and it takes values in a two-dimensional manifold representing a two-parameter family of worldlines. In this article we use methods from differential topology to characterize global properties of the lens map. Among other things, we use the mapping degree (also known as Brouwer degree) of the lens map as a tool for characterizing the number of images in gravitational lensing situations. Finally, we illustrate the general results with gravitational lensing (a) by a static string, (b) by a spherically symmetric body, (c) in asymptotically simple and empty spacetimes, and (d) in weakly perturbed Robertson–Walker spacetimes.
Permanent address: TU Berlin, Sekr. PN 7-1, 10623 Berlin, Germany. E-mail: [email protected]

1. Introduction

Gravitational lensing is usually studied in a quasi-Newtonian approximation formalism which is essentially based on the assumptions that the gravitational fields are weak and that the bending angles are small, see Schneider, Ehlers and Falco [1] for a comprehensive discussion. This formalism has proven to be very powerful for the calculation of special models. In addition it has also been used for proving general theorems on the qualitative features of gravitational lensing such as the possible number of images in a multiple imaging situation. As to the latter point, it is interesting to inquire whether the results can be reformulated in a Lorentzian manifold setting, i.e., to inquire to what extent the results depend on the approximations involved.

In the quasi-Newtonian approximation formalism one considers light rays in Euclidean 3-space that go from a fixed point (observer) to a point that is allowed to vary
over a 2-dimensional plane (source plane). The rays are assumed to be straight lines with the only exception that they may have a sharp bend at a 2-dimensional plane (deflector plane) that is parallel to the source plane. (There is also a variant with several deflector planes to model deflectors which are not “thin”.) For each concrete mass distribution, the deflecting angles are to be calculated with the help of Einstein’s field equation, or rather of those remnants of Einstein’s field equation that survive the approximations involved. Hence, at each point of the deflector plane the deflection angle is uniquely determined by the mass distribution. As a consequence, following light rays from the observer into the past always gives a unique “lens map” from the deflector plane to the source plane. There is “multiple imaging” whenever this lens map fails to be injective. In this article we want to inquire whether an analogous lens map can be introduced in a spacetime setting, without using quasi-Newtonian approximations. According to the rules of general relativity, a spacetime is to be modeled by a Lorentzian manifold (M, g) and the light rays are to be modeled by the lightlike geodesics in M. We shall assume that (M, g) is time-oriented, i.e., that the timelike and lightlike vectors can be distinguished into future-pointing and past-pointing in a globally consistent way. To define a general lens map, we have to fix a point p ∈ M as the event where the observation takes place and we have to look for an analogue of the deflector plane and for an analogue of the source plane. As to the deflector plane, there is an obvious candidate, namely the celestial sphere Sp at p. This can be defined as the set of all one-dimensional lightlike subspaces of the tangent space Tp M or, equivalently, as the totality of all light rays issuing from p into the past. As to the source plane, however, there is no natural candidate. 
Following Frittelli, Newman and Ehlers [2–4], one might consider any timelike 3-dimensional submanifold T of the spacetime manifold as a substitute for the source plane. The idea is to view such a submanifold as ruled by worldlines of light sources. To make this more explicit, one could restrict to the case that T is a fiber bundle over a two-dimensional manifold N , with fibers timelike and diffeomorphic to R. Each fiber is to be interpreted as the worldline of a light source, and the set N may be identified with the set of all those worldlines. In this situation we wish to define a lens map fp : Sp −→ N by extending each light ray from p into the past until it meets T and then projecting onto N . In general, this prescription does not give a well-defined map since neither existence nor uniqueness of the target value is guaranteed. As to existence, there might be some past-pointing lightlike geodesics from p that never reach T . As to uniqueness, one and the same light ray might intersect T several times. The uniqueness problem could be circumvented by considering, on each past-pointing lightlike geodesic from p, only the first intersection with T , thereby willfully excluding some light rays from the discussion. This comes up to ignoring every image that is hidden behind some other image of a light source with a worldline ξ ∈ N . For the existence problem, however, there is no general solution. Unless one restricts to special situations, the lens map will be defined only on some subset Dp of Sp (which may even be empty). Also, one would like the lens map to be differentiable or at least continuous. This is guaranteed if one further restricts the domain Dp of the lens map by considering only light rays that meet T transversely. Following this line of thought, we give a precise definition of lens maps in Sect. 2. 
We will be a little bit more general than outlined above insofar as the source surface need not be timelike; we also allow for the limiting case of a lightlike source surface. This has the advantage that we may choose the source surface “at infinity” in the case of an asymptotically simple and empty spacetime. In Sect. 3 we briefly discuss some general properties of the caustic of the lens map. In Sect. 4 we introduce the mapping degree (Brouwer degree) of the lens map as an important tool from differential topology.
This will then give us some theorems on the possible number of images in gravitational lensing situations, in particular in the case that we have a “simple lensing neighborhood”. The latter notion will be introduced and discussed in Sect. 5. We conclude with applying the general results to some examples in Sect. 6.

Our investigation will be purely geometrical in the sense that we discuss the influence of the spacetime geometry on the propagation of light rays but not the influence of the matter distribution on the spacetime geometry. In other words, we use only the geometrical background of general relativity but not Einstein’s field equation. For this reason the “deflector”, i.e., the matter distribution that is the cause of gravitational lensing, never explicitly appears in our investigation. However, information on whether the deflectors are transparent or non-transparent will implicitly enter into our considerations.

2. Definition of the Lens Map

As a preparation for precisely introducing the lens map in a spacetime setting, we first specify some terminology. By a manifold we shall always mean what is more fully called a “real, finite-dimensional, Hausdorff, second countable (and thus paracompact) C∞-manifold without boundary”. Whenever we have a C∞ vector field X on a manifold M, we may consider two points in M as equivalent if they lie on the same integral curve of X. We shall denote the resultant quotient space, which may be identified with the set of all integral curves of X, by M/X. We call X a regular vector field if M/X can be given the structure of a manifold in such a way that the natural projection πX : M −→ M/X becomes a C∞-submersion. It is easy to construct examples of non-regular vector fields. E.g., if X has no zeros and is defined on Rn \ {0}, then M/X cannot satisfy the Hausdorff property, so it cannot be a manifold according to our terminology.
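One concrete instance of a non-regular vector field can be sketched as follows. The specific choice of a constant vector field on the punctured plane is an illustrative assumption made here, not a construction taken from the text; it produces the familiar "line with two origins" as quotient space.

```latex
% Illustrative assumption: M = \mathbb{R}^2 \setminus \{0\} with the
% constant (hence zero-free) vector field X = \partial/\partial x.
\begin{itemize}
  \item For each $c \neq 0$ the integral curve through $(0,c)$ is the full
        horizontal line $\ell_c = \{ (x, c) : x \in \mathbb{R} \}$.
  \item The line $y = 0$ is cut by the removed origin into two integral
        curves, $\ell_0^{+} = \{ (x, 0) : x > 0 \}$ and
        $\ell_0^{-} = \{ (x, 0) : x < 0 \}$.
  \item Hence $M/X = \{ [\ell_c] : c \neq 0 \} \cup \{ [\ell_0^{+}], [\ell_0^{-}] \}$.
  \item Any open neighborhood of $[\ell_0^{+}]$ in the quotient topology is the
        image of an open set $U \subset M$ containing some point $(x_0, 0)$ with
        $x_0 > 0$; $U$ then contains $(x_0, c)$ for all sufficiently small
        $|c|$, so the neighborhood contains $[\ell_c]$ for all small $c \neq 0$.
        The same holds for $[\ell_0^{-}]$.
  \item Therefore $[\ell_0^{+}]$ and $[\ell_0^{-}]$ cannot be separated by
        disjoint open sets: $M/X$ is not Hausdorff and $X$ is not regular.
\end{itemize}
```

By contrast, the same field on all of $\mathbb{R}^2$ is regular, with $M/X \cong \mathbb{R}$ and $\pi_X(x, y) = y$; the obstruction here comes entirely from the removed point.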
Palais [5] has proven a useful result which, in our terminology, can be phrased in the following way. If none of X’s integral curves is closed or almost closed, and if M/X satisfies the Hausdorff property, then X is regular.

We are going to use the following terminology. A Lorentzian manifold is a manifold M together with a C∞ metric tensor field g of Lorentzian signature (+ · · · + −). A Lorentzian manifold is time-orientable if the set of all timelike vectors {Z ∈ TM | g(Z, Z) < 0} has exactly two connected components. Choosing one of those connected components as future-pointing defines a time-orientation for (M, g). A spacetime is a connected 4-dimensional time-orientable Lorentzian manifold together with a time-orientation.

We are now ready to define what we will call a “source surface” in a spacetime. This will provide us with the target space for lens maps.

Definition 1. (T, W) is called a source surface in a spacetime (M, g) if
(a) T is a 3-dimensional C∞ submanifold of M;
(b) W is a nowhere vanishing regular C∞ vector field on T which is everywhere causal, g(W, W) ≤ 0, and future-pointing;
(c) πW : T −→ N = T/W is a fiber bundle with fiber diffeomorphic to R and the quotient manifold N = T/W is connected and orientable.

We want to interpret the integral curves of W as the worldlines of light sources. Thus, one should assume that they are not only causal but even timelike, g(W, W) < 0, since a light source should move at subluminal velocity. For technical reasons, however, we
allow for the possibility that an integral curve of W is lightlike (everywhere or at some points), because such curves may appear as (C1-)limits of timelike curves. This will give us the possibility to apply the resulting formalism to asymptotically simple and empty spacetimes in a convenient way, see Subsect. 6.2 below. Actually, the causal character of W will have little influence upon the results we want to establish. What really matters is a transversality condition that enters into the definition of the lens map below.

Please note that, in the situation of Def. 1, the bundle πW : T −→ N is necessarily trivializable, i.e., T ≅ N × R. To prove this, let us assume that the flow of W is defined on all of R × T, so it makes πW : T −→ N into a principal fiber bundle. (This is no restriction of generality since it can always be achieved by multiplying W with an appropriate function. This function can be determined in the following way. Owing to a famous theorem of Whitney [6], also see Hirsch [7], p. 55, paracompactness guarantees that T can be embedded as a closed submanifold into Rn for some n. Pulling back the Euclidean metric gives a complete Riemannian metric h on T and the flow of the vector field h(W, W)−1/2 W is defined on all of R × T, cf. Abraham and Marsden [8], Prop. 2.1.21.) Then the result follows from the well known facts that any fiber bundle whose typical fiber is diffeomorphic to Rn admits a global section (see, e.g., Kobayashi and Nomizu [9], p. 58), and that a principal fiber bundle is trivializable if and only if it admits a global section (see again [9], p. 57).

Also, it is interesting to note the following. If T is any 3-dimensional submanifold of M that is foliated into timelike curves, then time orientability guarantees that these are the integral curves of a timelike vector field W.
If we assume, in addition, that T contains no closed timelike curves, then it can be shown that πW : T −→ N is necessarily a fiber bundle with fiber diffeomorphic to R, provided N satisfies the Hausdorff property, see Harris [10], Theorem 2. This shows that there is little room for relaxing the conditions of Def. 1.

Choosing a source surface in a spacetime will give us the target space N = T/W for the lens map. To specify the domain of the lens map, we consider, at any point p ∈ M, the set Sp of all lightlike directions at p, i.e., the set of all one-dimensional lightlike subspaces of Tp M. We shall refer to Sp as the celestial sphere at p. This is justified since, obviously, Sp is in natural one-to-one relation with the set of all light rays arriving at p. As it is more convenient to work with vectors rather than with directions, we shall usually represent Sp as a submanifold of Tp M. To that end we fix a future-pointing timelike vector Vp in the tangent space Tp M. The vector Vp may be interpreted as the 4-velocity of an observer at p. We now consider the set

Sp = { Yp ∈ Tp M | g(Yp, Yp) = 0 and g(Yp, Vp) = 1 }. (1)

It is an elementary fact that (1) defines an embedded submanifold of Tp M which is diffeomorphic to the standard 2-sphere S2. As indicated by our notation, the set (1) can be identified with the celestial sphere at p, just by relating each vector to the direction spanned by it. Representation (1) of the celestial sphere gives a convenient way of representing the light rays through p. We only have to assign to each Yp ∈ Sp the lightlike geodesic s −→ expp(sYp), where expp : Wp ⊆ Tp M −→ M denotes the exponential map at the point p of the Levi-Civita connection of the metric g. Please note that this geodesic is past-pointing, because Vp was chosen future-pointing, and that it passes through p at the parameter value s = 0.

The lens map is defined in the following way.
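Before the definition is spelled out, representation (1) can be checked by hand in the simplest case. The sketch below assumes Minkowski spacetime with the signature convention (+ + + −) used here, and additionally assumes that Vp is normalized to g(Vp, Vp) = −1, which the text does not require.

```latex
% Assumptions for illustration: M = \mathbb{R}^4 with
% g = dx^2 + dy^2 + dz^2 - dt^2, and V_p = \partial_t (unit, future-pointing).
\begin{align*}
Y_p &= \vec n \cdot \partial_{\vec x} + y^t\,\partial_t ,
  \qquad \vec n \in \mathbb{R}^3, \\
g(Y_p, V_p) = 1 \;&\Longleftrightarrow\; -y^t = 1
  \;\Longleftrightarrow\; y^t = -1, \\
g(Y_p, Y_p) = 0 \;&\Longleftrightarrow\; |\vec n|^2 - (y^t)^2 = 0
  \;\Longleftrightarrow\; |\vec n| = 1, \\
\text{so}\quad S_p &= \{\, (\vec n, -1) : |\vec n| = 1 \,\} \;\cong\; S^2 .
\end{align*}
% The geodesic s \mapsto \exp_p(s Y_p) = p + s\,(\vec n, -1) has
% t(s) = t(p) - s, which decreases for s > 0: it is past-pointing,
% and \vec n is the spatial direction into which the ray is traced back.
```

In this flat example every unit vector $\vec n$ labels exactly one past-pointing light ray through p, which is the one-to-one relation between (1) and the celestial sphere described above.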
After fixing a source surface (T , W ) and choosing a point p ∈ M, we denote by Dp ⊆ Sp the subset of all lightlike
[Figure 1: diagram with labels p, Y , q, W , T , πW , f (Y ), N ]
Fig. 1. Illustration of the lens map
directions at p such that the geodesic to which this direction is tangent meets T (at least once) if sufficiently extended to the past, and if at the first intersection point q with T this geodesic is transverse to T . By projecting q to N = T /W we get the lens map fp : Dp −→ N = T /W , see Fig. 1. If we use the representation (1) for Sp , the definition of the lens map can be given in more formal terms in the following way. Definition 2. Let (T , W ) be a source surface in a spacetime (M, g). Then, for each p ∈ M, the lens map fp : Dp −→ N = T /W is defined in the following way. In the notation of Eq. (1), let Dp be the set of all Yp ∈ Sp such that there is a real number wp (Yp ) > 0 with the properties (a) sYp is in the maximal domain of the exponential map for all s ∈ [ 0 , wp (Yp )]; (b) the curve s −→ expp (sYp ) intersects T at the value s = wp (Yp ) transversely; (c) expp (sYp ) ∉ T for all s ∈ [ 0 , wp (Yp )[ . This defines a map wp : Dp −→ R. The lens map at p is then, by definition, the map fp : Dp −→ N = T /W ,
fp (Yp ) = πW ( expp (wp (Yp )Yp ) ) . (2)
Here πW : T −→ N denotes the natural projection. The transversality condition in part (b) of Def. 2 guarantees that the domain Dp of the lens map is an open subset of Sp . The case Dp = ∅ is, of course, not excluded. In particular, Dp = ∅ whenever p ∈ T , owing to part (c) of Def. 2.
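Definition 2 can be made concrete in a simple flat example. The following is a minimal sketch under assumptions that are not taken from the text: Minkowski spacetime with signature (−,+,+,+), the source surface T chosen as the cylinder |x| = radius with W = ∂/∂t, and N represented by unit vectors. The function name `flat_lens_map` and its parameters are hypothetical. With Vp = ∂/∂t, the vectors in (1) are Yp = −∂/∂t + n with n a unit spatial vector, so the past-pointing light ray is x(s) = x0 + s n.

```python
import math

def flat_lens_map(x0, n, radius=1.0):
    """Toy lens map in Minkowski spacetime (illustrative assumption).

    x0: spatial position of the observer p.
    n:  unit spatial vector, so that Y = -d/dt + n is lightlike.
    The source surface is the cylinder |x| = radius with W = d/dt.
    Returns (w, xi) with xi a unit vector labelling the worldline in N,
    or None if Y does not belong to the domain D_p.
    """
    b = sum(a * c for a, c in zip(x0, n))
    c0 = sum(a * a for a in x0) - radius ** 2
    if c0 == 0.0:
        return None            # p lies on T: excluded by condition (c)
    disc = b * b - c0          # discriminant of s^2 + 2 b s + c0 = 0
    if disc <= 0.0:
        return None            # ray misses T or only touches it tangentially
    root = math.sqrt(disc)
    positive = [s for s in (-b - root, -b + root) if s > 0.0]
    if not positive:
        return None            # T is met only by the future extension
    w = min(positive)          # first intersection, cf. conditions (b), (c)
    hit = [a + w * c for a, c in zip(x0, n)]
    return w, tuple(h / radius for h in hit)
```

For an observer on the axis (x0 = 0) this toy lens map is simply n ↦ n, so every source has exactly one image; for an observer outside the cylinder some directions return None, which illustrates that Dp can be a proper subset of Sp.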
Moreover, the transversality condition in part (b) of Definition 2, in combination with the implicit function theorem, makes sure that the map wp : Dp −→ R is a C ∞ map. As the exponential map of a C ∞ metric is again C ∞ , and πW is a C ∞ submersion by assumption, this proves the following. Proposition 1. The lens map is a C ∞ map. Please note that without the transversality condition the lens map need not even be continuous. Although our Def. 2 made use of the representation (1), which refers to a timelike vector Vp , the lens map is, of course, independent of which future-pointing Vp has been chosen. We decided to index the lens map only with p although, strictly speaking, it depends on T , on W , and on p. Our philosophy is to keep a source surface (T , W ) fixed, and then to consider the lens map for all points p ∈ M. In view of gravitational lensing, the lens map admits the following interpretation. For ξ ∈ N , each point Yp ∈ Dp with fp (Yp ) = ξ corresponds to a past-pointing lightlike geodesic from p to the worldline ξ in M, i.e., it corresponds to an image at the celestial sphere of p of the light source with worldline ξ . If fp is not injective, we are in a multiple imaging situation. The converse need not be true as the lens map does not necessarily cover all images. There might be a past-pointing lightlike geodesic from p reaching ξ after having met T before, or being tangential to T on its arrival at ξ . In either case, the corresponding image is ignored by the lens map. The reader might be inclined to view this as a disadvantage. However, in Sect. 6 below we discuss some situations where the existence of such additional light rays can be excluded (e.g., asymptotically simple and empty spacetimes) and situations where it is desirable, on physical grounds, to disregard such additional light rays (e.g., weakly perturbed Robertson–Walker spacetimes with compact spatial sections). 
It was already mentioned that the domain Dp of the lens map might be empty; this is, of course, the worst case that could happen. The best case is that the domain is all of the celestial sphere, Dp = Sp . We shall see in the following sections that many interesting results are true just in this case. However, there are several cases of interest where Dp is a proper subset of Sp . If the domain of the lens map fp is the whole celestial sphere, none of the light rays issuing from p into the past is blocked or trapped before it reaches T . In view of applications to gravitational lensing, this excludes the possibility that these light rays meet a non-transparent deflector. In other words, it is a typical feature of gravitational lensing situations with non-transparent deflectors that Dp is not all of Sp . Two simple examples, viz., a non-transparent string and a non-transparent spherical body, will be considered in Subsect. 6.1 below. 3. Regular and Critical Values of the Lens Map Please recall that, for a differentiable map F : M1 −→ M2 between two manifolds, Y ∈ M1 is called a regular point of F if the differential TY F : TY M1 −→ TF (Y ) M2 has maximal rank, otherwise Y is called a critical point. Moreover, ξ ∈ M2 is called a regular value of F if all Y ∈ F −1 (ξ ) are regular points, otherwise ξ is called a critical value. Please note that, according to this definition, any ξ ∈ M2 that is not in the image of F is regular. The well-known (Morse-)Sard theorem (see, e.g., Hirsch [7], p. 69) says that the set of regular values of F is residual (i.e., it contains the intersection of countably many sets that are open and dense in M2 ) and thus dense in M2 and the critical values of F make up a set of measure zero in M2 .
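As a hedged illustration of regular and critical values (a toy map on the real line chosen here for concreteness, not an example from the paper), consider a cubic polynomial. Its critical values form a two-point set, which has measure zero, and the regular values form an open dense set, in line with the Sard theorem.

```latex
% Illustrative example; F is a toy map, not the lens map.
\[
  F:\mathbb{R}\to\mathbb{R},\qquad F(y)=y^{3}-3y,\qquad F'(y)=3\,(y^{2}-1).
\]
The critical points are $y=\pm 1$, with critical values $F(1)=-2$ and
$F(-1)=2$. Every $\xi\in\mathbb{R}\setminus\{-2,2\}$ is a regular value, and
\[
  \#F^{-1}(\xi)=
  \begin{cases}
    1, & |\xi|>2,\\
    3, & |\xi|<2.
  \end{cases}
\]
```

The number of pre-images changes by two when ξ crosses a critical value, which is the one-dimensional analogue of the jumps in the number of images discussed for the caustic of the lens map.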
For the lens map fp : Dp −→ N , we call the set
Caust(fp ) = { ξ ∈ N | ξ is a critical value of fp } (3)
the caustic of fp . The Sard theorem then implies the following result. Proposition 2. The caustic Caust(fp ) is a set of measure zero in N and its complement N \ Caust(fp ) is residual and thus dense in N . Please note that Caust(fp ) need not be closed in N . Counter-examples can be constructed easily by starting with situations where the caustic is closed and then excising points from spacetime. For lens maps defined on the whole celestial sphere, however, we have the following result. Proposition 3. If Dp = Sp , the caustic Caust(fp ) is compact in N . This is an obvious consequence of the fact that Sp is compact and that fp and its first derivative are continuous. As the domain and the target space of fp have the same dimension, Yp ∈ Dp is a regular point of fp if and only if the differential TYp fp : TYp Sp −→ Tfp (Yp ) N is an isomorphism. In this case fp maps a neighborhood of Yp diffeomorphically onto a neighborhood of fp (Yp ). The differential TYp fp may be either orientation-preserving or orientation-reversing. To make this notion precise we have to choose an orientation for Sp and an orientation for N . For the celestial sphere Sp it is natural to choose the orientation according to which the origin of the tangent space Tp M is to the inner side of Sp . The target manifold N is orientable by assumption, but in general there is no natural choice for the orientation. Clearly, choosing an orientation for N fixes an orientation for T , because the vector field W gives us an orientation for the fibers. We shall say that the orientation of N is adapted to some point Yp ∈ Dp if the geodesic with initial vector Yp meets T at the inner side. If Dp is connected, the orientation of N that is adapted to some Yp ∈ Dp is automatically adapted to all other elements of Dp . Using this terminology, we may now introduce the following definition. Definition 3. 
A regular point Yp ∈ Dp of the lens map fp is said to have even parity (or odd parity, respectively) if TYp fp is orientation-preserving (or orientation-reversing, respectively) with respect to the natural orientation on Sp and the orientation adapted to Yp on N . For a regular value ξ ∈ N of the lens map, we denote by n+ (ξ ) (or n− (ξ ), respectively) the number of elements in fp−1 (ξ ) with even parity (or odd parity, respectively). Please note that n+ (ξ ) and n− (ξ ) may be infinite, see the Schwarzschild example in Subsect. 6.1 below. A criterion for n± (ξ ) to be finite will be given in Prop. 8 below. Definition 3 is relevant for gravitational lensing in the following sense. The assumption that Yp is a regular point of fp implies that an observer at p sees a neighborhood of ξ = fp (Yp ) in N as a neighborhood of Yp at his or her celestial sphere. If we compare the case that Yp has odd parity with the case that Yp has even parity, then the appearance of the neighborhood in the first case is the mirror image of its appearance in the second case. This difference is observable for a light source that is surrounded by some irregularly shaped structure, e.g. a galaxy with curved jets or with lobes. If ξ is a regular value of fp , it is obvious that the points in fp−1 (ξ ) are isolated, i.e., any Yp in fp−1 (ξ ) has a neighborhood in Dp that contains no other point in fp−1 (ξ ). This follows immediately from the fact that fp maps a neighborhood of Yp diffeomorphically
onto its image. In the next section we shall formulate additional assumptions such that the set fp−1 (ξ ) is finite, i.e., such that the numbers n± (ξ ) introduced in Def. 3 are finite. It is the main purpose of the next section to demonstrate that then the difference n+ (ξ ) − n− (ξ ) has some topological invariance properties. As a preparation for that we notice the following result which is an immediate consequence of the fact that the lens map is a local diffeomorphism near each regular point. Proposition 4. n+ and n− are constant on each connected component of fp (Dp ) \ Caust(fp ). Hence, along any continuous curve in fp (Dp ) that does not meet the caustic of the lens map, the numbers n+ and n− remain constant, i.e., the observer at p sees the same number of images for all light sources on this curve. If a curve intersects the caustic, the number of images will jump. In the next section we shall prove that n+ and n− always jump by the same amount (under conditions making sure that these numbers are finite), i.e., the total number of images always jumps by an even number. This is well known in the quasi-Newtonian approximation formalism, see, e.g., Schneider, Ehlers and Falco [1], Sect. 6. If Caust(fp ) is empty, transversality guarantees that fp (Dp ) is open in N and, thus a manifold. Proposition 4 implies that, in this case, fp gives a C ∞ covering map from Dp onto fp (Dp ). As a C ∞ covering map onto a simply connected manifold must be a global diffeomorphism, this implies the following result. Proposition 5. Assume that Caust(fp ) is empty and that fp (Dp ) is simply connected. Then fp gives a global diffeomorphism from Dp onto fp (Dp ). In other words, the formation of a caustic is necessary for multiple imaging provided that fp (Dp ) is simply connected. In Subsect. 6.1 below we shall consider the spacetime of a non-transparent string. This will demonstrate that the conclusion of Prop. 
5 is not true without the assumption of fp (Dp ) being simply connected. In the rest of this subsection we want to relate the caustic of the lens map to the caustic of the past light cone of p. The past light cone of p can be defined as the image set in M of the map Fp : (s, Yp ) −→ expp (sYp )
(4)
considered on its maximal domain in ] 0 , ∞ [ × Sp , and its caustic can be defined as the set of critical values of Fp . In other words, q ∈ M is in the caustic of the past light cone of p if and only if there is an s0 ∈ ] 0 , ∞ [ and a Yp ∈ Sp such that the differential T(s0 ,Yp ) Fp has rank k < 3. In that case one says that the point q = expp (s0 Yp ) is conjugate to p along the geodesic s −→ expp (sYp ), and one calls the number m = 3 − k the multiplicity of this conjugate point. As Fp ( · , Yp ) is always an immersion, the multiplicity can take the values 1 and 2 only. (This formulation is equivalent to the definition of conjugate points and their multiplicities in terms of Jacobi vector fields which may be more familiar to the reader.) It is well known, but far from trivial, that along every lightlike geodesic conjugate points are isolated. Hence, in a compact parameter interval there are only finitely many points that are conjugate to a fixed point p. A proof can be found, e.g., in Beem, Ehrlich and Easley [11], Theorem 10.77. After these preparations we are now ready to establish the following proposition. We use the notation introduced in Def. 2.
Proposition 6. An element Yp ∈ Dp is a regular point of the lens map if and only if the point expp (wp (Yp )Yp ) is not conjugate to p along the geodesic s −→ expp (sYp ). A regular point Yp ∈ Dp has even parity (or odd parity, respectively) if and only if the number of points conjugate to p along the geodesic [ 0 , wp (Yp )] −→ M , s −→ expp (sYp ) is even (or odd, respectively). Here each conjugate point is to be counted with its multiplicity. Proof. In terms of the function (4), the lens map can be written in the form
fp (Yp ) = πW ( Fp (wp (Yp ), Yp ) ) . (5)
As s −→ Fp (s, Yp ) is an immersion transverse to T at s = wp (Yp ) and πW is a submersion, the differential of fp at Yp has rank 2 if and only if the differential of Fp at (wp (Yp ), Yp ) has rank 3. This proves the first claim. For proving the second claim define, for each s ∈ [0, wp (Yp )], a map
Φs : TYp Sp −→ Tfp (Yp ) N (6)
by applying to each vector in TYp Sp the differential T(s,Yp ) Fp , parallel-transporting the result along the geodesic Fp ( · , Yp ) to the point q = Fp (wp (Yp ), Yp ) and then projecting down to Tfp (Yp ) N . In the last step one uses the fact that, by transversality, any vector in Tq M can be uniquely decomposed into a vector tangent to T and a vector tangent to the geodesic Fp ( · , Yp ). For s = 1, this map Φs gives the differential of the lens map. We now choose a basis in TYp Sp and a basis in Tfp (Yp ) N , thereby representing the map Φs as a (2 × 2)-matrix. We choose the first basis right-handed with respect to the natural orientation on Sp and the second basis right-handed with respect to the orientation on N that is adapted to Yp . Then det(Φ0 ) is positive as the parallel transport gives an orientation-preserving isomorphism. The function s −→ det(Φs ) has a single zero whenever Fp (s, Yp ) is a conjugate point of multiplicity one and it has a double zero whenever Fp (s, Yp ) is a conjugate point of multiplicity two. Hence, the sign of det(Φ1 ) can be determined by counting the conjugate points. This result implies that ξ ∈ N is a regular value of the lens map fp whenever the worldline ξ does not pass through the caustic of the past light cone of p. The relation between parity and the number of conjugate points is geometrically rather evident because each conjugate point is associated with a “crossover” of infinitesimally neighboring light rays. 4. The Mapping Degree of the Lens Map The mapping degree (also known as Brouwer degree) is one of the most powerful tools in differential topology. In this section we want to investigate what kind of information could be gained from the mapping degree of the lens map, provided it can be defined. For the reader’s convenience we briefly summarize the definition and main properties of the mapping degree, following closely Choquet-Bruhat, DeWitt-Morette, and Dillard-Bleick [12], pp. 477.
For a more abstract approach, using homology theory, the reader may consult Dold [13], Spanier [14] or Bredon [15]. In this article we shall not use homology theory with the exception of the proof of Prop. 11. The definition of the mapping degree is based on the following observation.
Proposition 7. Let F : D̄ ⊆ M1 −→ M2 be a continuous map, where M1 and M2 are oriented connected manifolds of the same dimension, D is an open subset of M1 with compact closure D̄ and F |D is a C^∞ map. (Actually, C¹ would do.) Then for every ξ ∈ M2 \ F (∂D) which is a regular value of F |D , the set F⁻¹(ξ ) is finite. Proof. By contradiction, let us assume that there is a sequence (yi )i∈N with pairwise different elements in F⁻¹(ξ ). By compactness of D̄, we can choose an infinite subsequence of (yi )i∈N that converges towards some point y∞ ∈ D̄. By continuity of F , F (y∞ ) = ξ , so the hypotheses of the proposition imply that y∞ ∉ ∂D. As a consequence, y∞ is a regular point of F |D , so it must have an open neighborhood in D that does not contain any other element of F⁻¹(ξ ). This contradicts the fact that a subsequence of (yi )i∈N converges towards y∞ . If we have a map F that satisfies the hypotheses of Prop. 7, we can thus define, for every ξ ∈ M2 \ F (∂D) which is a regular value of F |D ,
deg(F, ξ ) = Σ sgn(y) , with the sum taken over all y ∈ F⁻¹(ξ ) , (7)
where sgn(y) is defined to be +1 if the differential Ty F preserves orientation and −1 if Ty F reverses orientation. If F⁻¹(ξ ) is the empty set, the right-hand side of (7) is set equal to zero. The number deg(F, ξ ) is called the mapping degree of F at ξ . Roughly speaking, deg(F, ξ ) tells how often the image of F covers the point ξ , counting each “layer” positive or negative depending on orientation. The mapping degree has the following properties (for proofs see Choquet-Bruhat, DeWitt-Morette, and Dillard-Bleick [12], pp. 477). Property A. deg(F, ξ ) = deg(F, ξ′ ) whenever ξ and ξ′ are in the same connected component of M2 \ F (∂D). Property B. deg(F, ξ ) = deg(F′ , ξ ) whenever F and F′ are homotopic, i.e., whenever there is a continuous map Φ : [0, 1] × D̄ −→ M2 , (s, y) −→ Φs (y) with Φ0 = F and Φ1 = F′ such that deg(Φs , ξ ) is defined for all s ∈ [0, 1]. Property A can be used to extend the definition of deg(F, ξ ) to the non-regular values ξ ∈ M2 \ F (∂D). Given the fact that, by the Sard theorem, the regular values are dense in M2 , this can be done just by continuous extension. Property B can be used to extend the definition of deg(F, ξ ) to continuous maps F : D̄ −→ M2 which are not necessarily differentiable on D. Given the fact that the C^∞ maps are dense in the continuous maps with respect to the C⁰-topology, this can be done again just by continuous extension. We now apply these general results to the lens map fp : Dp −→ N . In the case Dp ≠ Sp it is necessary to extend the domain of the lens map onto a compact set to define the degree of the lens map. We introduce the following definition. Definition 4. A map f̄p : D̄p ⊆ M1 −→ M2 is called an extension of the lens map fp : Dp −→ N if (a) M1 is an orientable manifold that contains Dp as an open submanifold; (b) M2 is an orientable manifold that contains N as an open submanifold; (c) the closure D̄p of Dp in M1 is compact; (d) f̄p is continuous and the restriction of f̄p to Dp is equal to fp .
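The signed count in (7) and its constancy under Property A can be illustrated numerically in one dimension. The following is a minimal sketch, assuming a toy circle map θ ↦ θ + a sin(3θ) (mod 2π) that is not taken from the paper; the names `lift` and `signed_preimage_count` are hypothetical. Pre-images of a value ξ are located as sign changes on a fine grid, and each is counted as orientation-preserving or orientation-reversing according to the direction of the crossing.

```python
import math

def lift(theta, a=1.0):
    """Lift to the real line of the toy circle map theta -> theta + a*sin(3*theta)."""
    return theta + a * math.sin(3.0 * theta)

def signed_preimage_count(xi, a=1.0, samples=20000):
    """Return (n_plus, n_minus) for a value xi in (0, 2*pi).

    Assumes |a| < 2*pi, so that only the levels xi + 2*pi*k with
    k in {-1, 0, 1} can be reached for theta in [0, 2*pi].
    Pairs of pre-images closer together than the grid spacing may be missed,
    but they always come with opposite signs.
    """
    n_plus = n_minus = 0
    for k in (-1, 0, 1):
        level = xi + 2.0 * math.pi * k
        prev = lift(0.0, a) - level
        for i in range(1, samples + 1):
            cur = lift(2.0 * math.pi * i / samples, a) - level
            if (prev > 0) != (cur > 0):
                if cur > prev:
                    n_plus += 1    # increasing crossing: orientation preserved
                else:
                    n_minus += 1   # decreasing crossing: orientation reversed
            prev = cur
    return n_plus, n_minus
```

For every generic ξ the difference n_plus − n_minus equals 1, the degree of this toy map, although the individual counts can change when ξ crosses a critical value. This mirrors, in a much simpler setting, the behaviour of n+ and n− for the lens map.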
If the lens map is defined on the whole celestial sphere, Dp = Sp , then the lens map is an extension of itself, f̄p = fp , with M1 = Sp and M2 = N . If Dp ≠ Sp , one may try to continuously extend fp onto the closure of Dp in Sp , thereby getting an extension with M1 = Sp and M2 = N . If this does not work, one may try to find some other extension. The string spacetime in Subsect. 6.1 below will provide us with an example where an extension exists although fp cannot be continuously extended from Dp onto its closure in Sp . The spacetime around a spherically symmetric body with Ro < 3m will provide us with an example where the lens map admits no extension at all, see Subsect. 6.1 below. Applying Prop. 7 to the case F = f̄p immediately gives the following result. Proposition 8. If the lens map fp : Dp −→ N admits an extension f̄p : D̄p ⊆ M1 −→ M2 , then for all regular values ξ ∈ N \ f̄p (∂Dp ) the set fp⁻¹(ξ ) is finite, so the numbers n+ (ξ ) and n− (ξ ) introduced in Def. 3 are finite. If f̄p is an extension of the lens map fp , the number deg(f̄p , ξ ) is a well defined integer for all ξ ∈ N \ f̄p (∂Dp ), provided that we have chosen an orientation on M1 and on M2 . The number deg(f̄p , ξ ) changes sign if we change the orientation on M1 or on M2 . This sign ambiguity can be removed if Dp is connected. Then we know from the preceding section that N admits an orientation that is adapted to all Yp ∈ Dp . As N is connected, this determines an orientation for M2 . Moreover, the natural orientation on Sp induces an orientation on Dp which, for Dp connected, gives an orientation for M1 . In the rest of this paper we shall only be concerned with the situation that Dp is connected, and we shall always tacitly assume that the orientations have been chosen as indicated above, thereby fixing the sign of deg(f̄p , ξ ). Now comparison of (7) with Def. 3 shows that
deg(f̄p , ξ ) = n+ (ξ ) − n− (ξ ) (8)
for all regular values in N \ f̄p (∂Dp ). Owing to Property A, this has the following consequence. Proposition 9. Assume that Dp is connected and that the lens map admits an extension f̄p : D̄p ⊆ M1 −→ M2 . Then n+ (ξ ) − n− (ξ ) = n+ (ξ′ ) − n− (ξ′ ) for any two regular values ξ and ξ′ which are in the same connected component of N \ f̄p (∂Dp ). In particular, n+ (ξ ) + n− (ξ ) is odd if and only if n+ (ξ′ ) + n− (ξ′ ) is odd. We know already from Prop. 4 that the numbers n+ and n− remain constant along each continuous curve in fp (Dp ) that does not meet the caustic of fp . Now let us consider a continuous curve α : ] − ε0 , ε0 [ −→ fp (Dp ) that meets the caustic at α(0) whereas α(ε) is a regular value of fp for all ε ≠ 0. Under the additional assumptions that Dp is connected, that fp admits an extension, and that α(0) ∉ f̄p (∂Dp ), Prop. 9 tells us that n+ (α(ε)) − n− (α(ε)) remains constant when ε passes through zero. In other words, n+ and n− are allowed to jump only by the same amount. As a consequence, the total number of images n+ + n− is allowed to jump only by an even number. We now specialize to the case that the lens map is defined on the whole celestial sphere, Dp = Sp . Then the assumption of fp admitting an extension is trivially satisfied, with f̄p = fp , and the degree deg(fp , ξ ) is a well-defined integer for all ξ ∈ N . Moreover,
deg(fp , ξ ) is a constant with respect to ξ , owing to Property A. It is then usual to write simply deg(fp ) instead of deg(fp , ξ ). Using this notation, (8) simplifies to deg(fp ) = n+ (ξ ) − n− (ξ )
(9)
for all regular values ξ of fp . Thus, the total number of images n+ (ξ ) + n− (ξ ) = deg(fp ) + 2n− (ξ )
(10)
is either even for all regular values ξ or odd for all regular values ξ , depending on whether deg(fp ) is even or odd. In some gravitational lensing situations it might be possible to show that there is one light source ξ ∈ N for which fp⁻¹(ξ ) consists of exactly one point, i.e., ξ is not multiply imaged. This situation is characterized by the following proposition. Proposition 10. Assume that Dp = Sp and that there is a regular value ξ of fp such that fp⁻¹(ξ ) is a single point. Then |deg(fp )| = 1. In particular, fp must be surjective and N must be diffeomorphic to the sphere S² . Proof. The result |deg(fp )| = 1 can be read directly from (9), choosing the regular value ξ which has exactly one pre-image point under fp . This implies that fp must be surjective since a non-surjective map has degree zero. So N , being the continuous image of the compact set Sp under the continuous map fp , must be compact. It is well known (see, e.g., Hirsch [7], p. 130, Exercise 5) that for n ≥ 2 the existence of a continuous map F : Sⁿ −→ M2 with deg(F ) = 1 onto a compact oriented n-manifold M2 implies that M2 must be simply connected. As the lens map gives us such a map onto N (after changing the orientation of N , if necessary), we have thus found that N must be simply connected. Owing to the well-known classification theorem of compact orientable two-dimensional manifolds (see, e.g., Hirsch [7], Chapter 9), this implies that N must be diffeomorphic to the sphere S² . In the situation of Prop. 10 we have n+ (ξ ) + n− (ξ ) = 2n− (ξ ) ± 1, for all ξ ∈ N \ Caust(fp ), i.e., the total number of images is odd for all light sources ξ ∈ N ≅ S² that do not lie on the caustic of fp . The idea to use the mapping degree for proving an odd number theorem in this way was published apparently for the first time in the introduction of McKenzie [16]. In Prop. 10 one would, of course, like to drop the rather restrictive assumption that fp⁻¹(ξ ) is a single point for some ξ .
In the next section we consider a special situation where the result |deg(fp )| = 1 can be derived without this assumption. 5. Simple Lensing Neighborhoods In this section we investigate a special class of spacetime regions that will be called “simple lensing neighborhoods”. Although the assumption of having a simple lensing neighborhood is certainly rather special, we shall demonstrate in Sect. 6 below that sufficiently many examples of physical interest exist. We define simple lensing neighborhoods in the following way. Definition 5. (U, T , W ) is called a simple lensing neighborhood in a spacetime (M, g) if (a) U is an open connected subset of M and T is the boundary of U in M; (b) ( T = ∂U, W ) is a source surface in the sense of Def. 1;
(c) for all p ∈ U, the lens map fp : Dp −→ N = ∂U/W is defined on the whole celestial sphere, Dp = Sp ; (d) U does not contain an almost periodic lightlike geodesic. Here the notion of being “almost periodic” is defined in the following way. Any immersed curve λ : I −→ U, defined on a real interval I , induces a curve λ̂ : I −→ P U in the projective tangent bundle P U over U which is defined by λ̂(s) = { c λ̇(s) | c ∈ R }. The curve λ is called almost periodic if there is a strictly monotonic sequence of parameter values (si )i∈N such that the sequence (λ̂(si ))i∈N has an accumulation point in P U. Please note that Condition (d) of Def. 5 is certainly true if the strong causality condition holds everywhere on U, i.e., if there are no closed or almost closed causal curves in U. Also, Condition (d) is certainly true if every future-inextendible lightlike geodesic in U has a future end-point in M. Condition (d) should be viewed as adding a fairly mild assumption on the future-behavior of lightlike geodesics to the fairly strong assumptions on their past-behavior that are contained in Condition (c). In particular, Condition (c) excludes the possibility that past-oriented lightlike geodesics are blocked or trapped inside U, i.e., it excludes the case that U contains non-transparent deflectors. Condition (c) requires, in addition, that the past-pointing lightlike geodesics are transverse to ∂U when leaving U. In the situation of a simple lensing neighborhood, we have for each p ∈ U a lens map that is defined on the whole celestial sphere, fp : Sp −→ N = ∂U/W . We have, thus, Eq. (9) at our disposal which relates the numbers n+ (ξ ) and n− (ξ ), for any regular value ξ ∈ N , to the mapping degree of fp . (Please recall that, by Prop. 8, n+ (ξ ) and n− (ξ ) are finite.) It is our main goal to prove that, in a simple lensing neighborhood, the mapping degree of the lens map equals ±1, so n(ξ ) = n+ (ξ ) + n− (ξ ) is odd for all regular values ξ .
Also, we shall prove that a simple lensing neighborhood must be contractible and that its boundary must be diffeomorphic to S² × R. The latter result reflects the fact that the notion of simple lensing neighborhoods generalizes the notion of asymptotically simple and empty spacetimes, with ∂U corresponding to past lightlike infinity J⁻ , as will be detailed in Subsect. 6.2 below. When proving the desired properties of simple lensing neighborhoods we may therefore use several techniques that have been successfully applied to asymptotically simple and empty spacetimes before. As a preparation we need the following lemma. Lemma 1. Let (U, T , W ) be a simple lensing neighborhood in a spacetime (M, g). Then there is a diffeomorphism Ψ from the sphere bundle S = { Yp ∈ Sp | p ∈ U } of lightlike directions over U onto the space T N × R² such that the following diagram commutes.
            Ψ
    S     −→    T N × R²
 ip ↑               ↓ pr          (11)
    Sp    −→    N
            fp
Here ip denotes the inclusion map and pr is defined by dropping the second factor and projecting to the foot-point. Proof. We fix a trivialization for the bundle πW : T −→ N and identify T with N × R. Then we consider the bundle B = { Xq ∈ Bq | q ∈ T } over T , where Bq ⊂ Sq is, by definition, the subspace of all lightlike directions that are tangent to past-oriented
lightlike geodesics that leave U transversely at q. Now we choose for each q ∈ T a vector Qq ∈ Tq M, smoothly depending on q, which is non-tangent to T and outward pointing. With the help of this vector field Q we may identify B and T N × R as bundles over T ≅ N × R in the following way. Fix ξ ∈ N , Xξ ∈ Tξ N and s ∈ R and view the tangent space Tξ N as a natural subspace of Tq (N × R), where q = (ξ, s). Then the desired identification is given by associating the pair (Xξ , s) with the direction spanned by Zq = Xξ + Qq − α W (q), where the number α is uniquely determined by the requirement that Zq should be lightlike and past-pointing. – Now we consider the map
π : S −→ B ≅ T N × R (12)
given by following each lightlike geodesic from a point p ∈ U into the past until it reaches T , and assigning the tangent direction at the end-point to the tangent direction at the initial point. As a matter of fact, (12) gives a principal fiber bundle with structure group R. To prove this, we first observe that the geodesic spray induces a vector field without zeros on S. By multiplying this vector field with an appropriate function we get a vector field whose flow is defined on all of R × S (see the second paragraph after Def. 1 for how to find such a function). The flow of this rescaled vector field defines an R-action on S such that (12) can be identified with the projection onto the space of orbits. Conditions (c) and (d) of Def. 5 guarantee that no orbit is closed or almost closed. Owing to a general result of Palais [5], this is sufficient to prove that this action makes (12) into a principal fiber bundle with structure group R. However, any such bundle is trivializable, see, e.g., Kobayashi and Nomizu [9], pp. 57/58. Choosing a trivialization for (12) gives us the desired diffeomorphism Ψ from S to B × R ≅ T N × R² . The commutativity of the diagram (11) follows directly from the definition of the lens map fp . With the help of this lemma we will now prove the following proposition which is at the center of this section. Proposition 11. Let (U, T , W ) be a simple lensing neighborhood in a spacetime (M, g). Then (a) N = T /W is diffeomorphic to the standard 2-sphere S² ; (b) U is contractible; (c) for all p ∈ U, the lens map fp : Sp ≅ S² −→ N ≅ S² has |deg(fp )| = 1; in particular, fp is surjective. Proof. In the proof of parts (a) and (b) we shall adapt techniques used by Newman and Clarke [17, 18] in their study of asymptotically simple and empty spacetimes. To that end it will be necessary to assume that the reader is familiar with homology theory. With the sphere bundle S, introduced in Lemma 1, we may associate the Gysin homology sequence
. . . −→ Hm (S) −→ Hm (U) −→ Hm−3 (U) −→ Hm−1 (S) −→ . . . , (13)
where Hm (X ) denotes the mth homology group of the space X with coefficients in a field F. For any choice of F, the Gysin sequence is an exact sequence of abelian groups, see, e.g., Spanier [14], p. 260 or, for the analogous sequence of cohomology groups, Bredon [15], p. 390. By Lemma 1, S and N have the same homotopy type, so Hm (S) and Hm (N ) are isomorphic. Upon inserting this into (13), we use the fact
that Hm(U) = 1 (= trivial group consisting of the unit element only) for m > 4 and Hm(N) = 1 for m > 2 because dim(U) = 4 and dim(N) = 2. Also, we know that H0(U) = F and H0(N) = F since U and N are connected. Then the exactness of the Gysin sequence implies that
\[ H_m(U) = 1 \quad \text{for } m > 0 \tag{14} \]
and
\[ H_1(N) = 1, \qquad H_2(N) = F. \tag{15} \]
From (15) we read that N is compact since otherwise H2(N) = 1. Moreover, we observe that N has the same homology groups and thus, in particular, the same Euler characteristic as the 2-sphere. It is well known that any two compact and orientable 2-manifolds are diffeomorphic if and only if they have the same Euler characteristic (or, equivalently, the same genus), see, e.g., Hirsch [7], Chapter 9. We have thus proven part (a) of the proposition.

To prove part (b) we consider the end of the exact homotopy sequence of the fiber bundle S over U, see, e.g., Frankel [19], p. 600,
\[ \cdots \longrightarrow \pi_1(S) \longrightarrow \pi_1(U) \longrightarrow 1. \tag{16} \]
As S has the same homotopy type as N ≃ S², we may replace π1(S) with π1(S²) = 1, so the exactness of (16) implies that π1(U) = 1, i.e., that U is simply connected. If, for some m > 1, the homotopy group πm(U) were different from 1, the Hurewicz isomorphism theorem (see, e.g., Spanier [14], p. 394 or Bredon [15], p. 479, Corollary 10.10) would give a contradiction to (14). Thus, πm(U) = 1 for all m ∈ N, i.e., U is contractible.

We now prove part (c). Since U is contractible, the tangent bundle TU and thus the sphere bundle S over U admits a global trivialization, S ≃ U × S². Fixing such a trivialization and choosing a contraction that collapses U onto some point p ∈ U gives a contraction ĩp : S −→ Sp. Together with the inclusion map ip : Sp −→ S this gives us a homotopy equivalence between Sp and S. (Please recall that a homotopy equivalence between two topological spaces X and Y is a pair of continuous maps ϕ : X −→ Y and ϕ̃ : Y −→ X such that ϕ ◦ ϕ̃ can be continuously deformed into the identity on Y and ϕ̃ ◦ ϕ can be continuously deformed into the identity on X.) On the other hand, the projection pr from (11), together with the zero section p̃r : N −→ TN × R², gives a homotopy equivalence between TN × R² and N. As a consequence, the diagram (11) tells us that the lens map fp = pr ◦ Ψ ◦ ip together with the map f̃p = ĩp ◦ Ψ⁻¹ ◦ p̃r gives a homotopy equivalence between Sp ≃ S² and N ≃ S², so fp ◦ f̃p is homotopic to the identity. Since the mapping degree is a homotopy invariant (please recall Property B of the mapping degree from Sect. 4), this implies that deg(fp ◦ f̃p) = 1. Now the product theorem for the mapping degree (see, e.g., Choquet-Bruhat, DeWitt-Morette, and Dillard-Bleick [12], p. 483) yields deg(fp) deg(f̃p) = 1. As the mapping degree is an integer, this can be true only if deg(fp) = deg(f̃p) = ±1. In particular, fp must be surjective since otherwise deg(fp) = 0.
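The mapping degree that enters part (c) can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it assumes the standard integral formula deg(f) = (1/4π) ∫ f · (∂θ f × ∂ϕ f) dθ dϕ for a map f from S² to S² written in spherical coordinates, and it evaluates this integral by the midpoint rule with finite differences. The function names and the three sample maps are hypothetical test cases; none of them is a lens map of an actual spacetime.

```python
import math

def sphere_point(theta, phi):
    """Unit vector on S^2 with polar angle theta and azimuth phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def degree_s2(f, n_theta=100, n_phi=200, h=1e-5):
    """Approximate (1/4 pi) * integral of f . (d_theta f x d_phi f).

    Midpoint rule on an n_theta x n_phi grid; the partial derivatives
    are approximated by central differences with step h.
    """
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            v = f(theta, phi)
            tp, tm = f(theta + h, phi), f(theta - h, phi)
            pp, pm = f(theta, phi + h), f(theta, phi - h)
            ft = [(tp[k] - tm[k]) / (2.0 * h) for k in range(3)]
            fp = [(pp[k] - pm[k]) / (2.0 * h) for k in range(3)]
            cross = (ft[1] * fp[2] - ft[2] * fp[1],
                     ft[2] * fp[0] - ft[0] * fp[2],
                     ft[0] * fp[1] - ft[1] * fp[0])
            integrand = v[0] * cross[0] + v[1] * cross[1] + v[2] * cross[2]
            total += integrand * d_theta * d_phi
    return total / (4.0 * math.pi)

# Hypothetical sample maps S^2 -> S^2 (illustration only).
identity = sphere_point

def double_cover(theta, phi):
    """Wraps twice around the polar axis."""
    return sphere_point(theta, 2.0 * phi)

def mirror(theta, phi):
    """Reflection phi -> -phi, which reverses orientation."""
    return sphere_point(theta, -phi)

if __name__ == "__main__":
    for name, f in (("identity", identity),
                    ("double_cover", double_cover),
                    ("mirror", mirror)):
        print(name, round(degree_s2(f)))
```

Rounding the numerical value to the nearest integer is justified because the degree is an integer. The orientation-reversing sample map shows how a value of −1 arises; whether that sign can occur for a lens map is a separate question.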
In all simple examples to which this proposition applies the degree of fp is, actually, equal to +1, and it is hard to see whether examples with deg(fp ) = −1 do exist. The following consideration is quite instructive. If we start with a simple lensing neighborhood in a flat spacetime (or, more generally, in a conformally flat spacetime), then
conjugate points cannot occur, so it is clear that the case deg(fp ) = −1 is impossible. If we now perturb the metric in such a way that the simple-lensing-neighborhood property is maintained during the perturbation, then, by Property B of the degree, the equation deg(fp ) = +1 is preserved. This demonstrates that the case deg(fp ) = −1 cannot occur for weak gravitational fields (or for small perturbations of conformally flat spacetimes such as Robertson–Walker spacetimes). Among other things, Proposition 11 gives a good physical motivation for studying degree-one maps from S 2 to S 2 . In particular, it is an interesting problem to characterize the caustics of such maps. Please note that, by parts (a) and (c) of Proposition 11, fp (Dp ) is simply connected for all p ∈ U. Hence, Proposition 5 applies which says that the formation of a caustic is necessary for multiple imaging. Owing to (10), part (c) of Proposition 11 implies in particular that n(ξ ) = n+ (ξ ) + n− (ξ ) is odd for all worldlines of light sources ξ ∈ N that do not pass through the caustic of the past light cone of p, i.e., if only light rays within U are taken into account the observer at p sees an odd number of images of such a worldline. It is now our goal to prove a similar “odd number theorem” for a light source with worldline inside U. As a preparation we establish the following lemma. Lemma 2. Let (U, T , W ) be a simple lensing neighborhood in a spacetime (M, g) and p ∈ U. Let J − (p, U) denote, as usual, the causal past of p in U, i.e., the set of all points in M that can be reached from p along a past-pointing causal curve in U. Let ∂U J − (p, U) denote the boundary of J − (p, U) in U. Then (a) every point q ∈ ∂U J − (p, U) can be reached from p along a past-pointing lightlike geodesic in U; (b) ∂U J − (p, U) is relatively compact in M. Proof. 
As usual, let I−(p, U) denote the chronological past of p in U, i.e., the set of all points that can be reached from p along a past-pointing timelike curve in U. To prove part (a), fix a point q ∈ ∂U J−(p, U). Choose a sequence (pi)i∈N of points in U that converge towards p in such a way that p ∈ I−(pi, U) for all i ∈ N. This implies that we can find for each i ∈ N a past-pointing timelike curve λi from pi to q. Then the λi are past-inextendible in U \ {q}. Owing to a standard lemma (see, e.g., Wald [20], Lemma 8.1.5) this implies that the λi have a causal limit curve λ through p that is past-inextendible in U \ {q}. We want to show that λ is the desired lightlike geodesic. Assume that λ is not a lightlike geodesic. Then λ enters into the open set I−(p, U) (see Hawking and Ellis [21], Prop. 4.5.10), so λi enters into I−(p, U) for i sufficiently large. This, however, is impossible since all λi have past end-point on ∂U J−(p, U), so λ must be a lightlike geodesic. It remains to show that λ has past end-point at q. Assume that this is not true. Since λ is past-inextendible in U \ {q} this assumption implies that λ is past-inextendible in U, so by condition (c) of Def. 5 λ has past end-point on ∂U and meets ∂U transversely. As a consequence, for i sufficiently large λi has to meet ∂U which gives a contradiction to the fact that all λi are within U.

To prove part (b), we have to show that any sequence (qi)i∈N in ∂U J−(p, U) has an accumulation point in M. So let us choose such a sequence. From part (a) we know that there is a past-pointing lightlike geodesic µi from p to qi in U for all i ∈ N. By compactness of Sp ≃ S², the tangent directions to these geodesics at p have an accumulation point in Sp. Let µ be the past-pointing lightlike geodesic from p which is determined by this direction. By condition (c) of Definition 5, this geodesic µ and each of the geodesics µi must have a past end-point on ∂U if maximally extended inside U.
We may choose an affine parametrization for each of those geodesics with the parameter ranging from the value 0 at p to the value 1 at ∂U.
Then our sequence (qi)i∈N in U determines a sequence (si)i∈N in the interval [0, 1] by setting qi = µi(si). By compactness of [0, 1], this sequence must have an accumulation point s ∈ [0, 1]. This demonstrates that the qi must have an accumulation point in M, namely the point µ(s).

We are now ready to prove the desired odd-number theorem for light sources with worldline in U.

Proposition 12. Let (U, T, W) be a simple lensing neighborhood in a spacetime (M, g) and assume that U does not contain a closed timelike curve. Fix a point p ∈ U and a timelike embedded C∞ curve γ in U whose image is a closed topological subset of M. (The latter condition excludes the case that γ has an end-point on ∂U.) Then the following is true.
(a) If γ does not meet the point p, then there is a past-pointing lightlike geodesic from p to γ that lies completely within U and contains no conjugate points in its interior. (The end-point may be conjugate to the initial-point.) If this geodesic meets γ at the point q, say, then all points on γ that lie to the future of q cannot be reached from p along a past-pointing lightlike geodesic in U.
(b) If γ meets neither the point p nor the caustic of the past light cone of p, then the number of past-pointing lightlike geodesics from p to γ that are completely contained in U is finite and odd.

Proof. In the first step we construct a C∞ vector field V on M that is timelike on U, has γ as an integral curve, and coincides with W on T = ∂U. To that end we first choose any future-pointing timelike C∞ vector field V1 on M. (Existence is guaranteed by our assumption of time-orientability.) Then we extend the vector field W to a C∞ vector field V2 onto some neighborhood V of T. Since W is causal and future-pointing, V2 may be chosen timelike and future-pointing on V \ T. (Here we make use of the fact that T = ∂U is a closed subset of M.)
Finally we choose a timelike and future-pointing vector field V3 on some neighborhood W of γ that is tangent to γ at all points of γ. (Here we make use of the fact that the image of γ is a closed subset of M.) We choose the neighborhoods V and W disjoint which is possible since γ is completely contained in U and closed in M. With the help of a partition of unity we may now combine the three vector fields V1, V2, V3 into a vector field V with the desired properties.

In the second step we consider the quotient space M/V. This space contains the open subset U/V whose boundary T/V = N is, by Prop. 11, a manifold diffeomorphic to S². We want to show that U/V is a manifold (which, according to our terminology, in particular requires that U/V is a Hausdorff space). To that end we consider the map jp : ∂U J−(p, U) −→ U/V which assigns to each point q ∈ ∂U J−(p, U) the integral curve of V passing through that point. (In this proof overlining always means closure in M.) Clearly, jp is continuous with respect to the topology ∂U J−(p, U) inherits as a subspace of M and the quotient topology on U/V. Moreover, ∂U J−(p, U) intersects each integral curve of V at most once, and if it intersects one integral curve then it also intersects all neighboring integral curves in U; this follows from Wald [20], Theorem 8.1.3. Hence, jp is injective and its image is open in U/V. On the other hand, part (b) of Lemma 2 implies that the image of jp is closed. Since the image of jp is non-empty and connected, it must be all of U/V. (The domain of jp and, thus, the image of jp is non-empty because U does not contain a closed timelike curve. The domain and, thus, the image of jp is connected since U is connected.) We have, thus, proven that jp
is a homeomorphism. This implies that the Hausdorff condition is satisfied on Ū/V and, in particular, on U/V. Since V is timelike and U contains no closed timelike curves, this makes sure that U/V is a manifold according to our terminology, see Harris [10], Theorem 2.

In the third step we use these results to prove part (a) of the proposition. Our result that jp is a homeomorphism implies, in particular, that γ has an intersection with ∂U J−(p, U) at some point q. Now part (a) of Lemma 2 shows that there is a past-pointing lightlike geodesic from p to q in U. This geodesic cannot contain conjugate points in its interior since otherwise a small variation would give a timelike curve from p to q, see Hawking and Ellis [21], Prop. 5.4.12, thereby contradicting q ∈ ∂U J−(p, U). The rest of part (a) is clear since all past-pointing lightlike geodesics in U that start at p are confined to J−(p, U).

In the last step we prove part (b). To that end we choose on the tangent space Tp M a Lorentz basis (Ep1, Ep2, Ep3, Ep4) with Ep4 future-pointing, and we identify each x = (x¹, x², x³) ∈ R³ with the past-pointing lightlike vector Yp = x¹ Ep1 + x² Ep2 + x³ Ep3 − |x| Ep4. With this identification, the lens map takes the form fp : S² −→ N = ∂U/V, x ⟼ πV(expp(wp(x) x)). We now define a continuous map F : B −→ M/V on the closed ball B = { x ∈ R³ | |x| ≤ 1 } by setting F(x) = πV(expp(wp(x/|x|) x)) for x ≠ 0 and F(0) = πV(p). The restriction of F to the interior of B is a C∞ map onto the manifold U/V, with the exception of the origin where F is not differentiable. The latter problem can be circumvented by approximating F in the C⁰-sense, on an arbitrarily small neighborhood of the origin, by a C∞ map. Then the mapping degree deg(F) can be calculated (see, e.g., Choquet-Bruhat, DeWitt-Morette, and Dillard-Bleick [12], pp. 477) with the help of the integral formula
\[ \int_B F^*\omega = \deg(F) \int_{U/V} \omega, \tag{17} \]
where ω is any 3-form on U/V and the star denotes the pull-back of forms. For any 2-form ψ on U/V, we may apply this formula to the form ω = dψ. With the help of the Stokes theorem we then find
\[ \int_{S^2} F^*\psi = \deg(F) \int_N \psi. \tag{18} \]
However, the restriction of F to ∂B = S 2 gives the lens map, so on the left-hand side of (18) we may replace F ∗ ψ by fp∗ ψ. Then comparison with the integral formula for the degree of fp shows that deg(F ) = deg(fp ) which, according to Prop. 11, is equal to ±1. For every ζ ∈ U/V that is a regular value of F , the result deg(F ) = ±1 implies that the number of elements in F −1 (ζ ) is finite and odd. By assumption, the worldline γ ∈ U/V meets neither the point p nor the caustic of the past light cone of p. The first condition makes sure that our perturbation of F near the origin can be done without influencing the set F −1 (γ ); the second condition implies that γ is a regular value of F , please recall our discussion at the end of Sect. 3. This completes the proof. If only light rays within U are taken into account, then Prop. 12 can be summarized by saying that, for light sources in a simple lensing neighborhood, the “youngest image” has always even parity and the total number of images is finite and odd. In the quasi-Newtonian approximation formalism it is a standard result that a transparent gravitational lens produces an odd number of images, see Schneider, Ehlers and
Falco [1], Section 5.4, for a detailed discussion. Proposition 12 may be viewed as a reformulation of this result in a Lorentzian geometry setting. It is quite likely that an alternative proof of Prop. 12 can be given by using the Morse theoretical results of Giannoni, Masiello and Piccione [22, 23]. Also, the reader should compare our results with the work of McKenzie [16] who used Morse theory for proving an odd-number theorem in certain globally hyperbolic spacetimes. Contrary to McKenzie’s theorem, our Prop. 12 requires mathematical assumptions which can be physically interpreted rather easily.

6. Examples

6.1. Two simple examples with non-transparent deflectors.

6.1.1. Non-transparent string. As a simple example, we consider gravitational lensing in the spacetime (M, g) where M = R² × (R² \ {0}) and
\[ g = -dt^2 + dz^2 + dr^2 + k^2 r^2 \, d\varphi^2 \tag{19} \]
with some constant 0 < k < 1. Here (t, z) denote Cartesian coordinates on R² and (r, ϕ) denote polar coordinates on R² \ {0}. This can be interpreted as the spacetime around a static non-transparent string, see Vilenkin [24], Hiscock [25] and Gott [26]. One should think of the string as being situated at the z-axis. Since the latter is not part of the spacetime, it is indeed justified to speak of a non-transparent string. As ∂/∂t is a Killing vector field normalized to −1, the lightlike geodesics in (M, g) correspond to the geodesics of the space part. The latter is a metrical product of a real line with coordinate z and a cone with polar coordinates (r, ϕ). So the geodesics are straight lines if we cut the cone open along some radius ϕ = const. and flatten it out in a plane. Owing to this simple form of the lightlike geodesics, the investigation of lens maps in this string spacetime is quite easy. To work this out, choose some constant R > 0 and let T denote the hypercylinder r = R in M. Let W denote the restriction of the vector field ∂/∂t to T. Then (T, W) is a source surface, with N = T/W ≃ S¹ × R. Henceforth we discuss the lens map fp for any point p ∈ M at a radius r < R. There are no past-pointing lightlike geodesics from p that intersect T more than once or touch T tangentially, so the lens map fp gives full information about all images at p of each light source ξ ∈ N. The domain Dp of the lens map is given by excising a curve segment, namely a meridian including both end-points at the “poles”, from the celestial sphere Sp, so Dp ≃ R² is connected. The boundary of Dp in Sp corresponds to light rays that are blocked by the string before reaching T. It is easy to see that the lens map cannot be continuously extended onto Sp (= closure of Dp in Sp). Nonetheless, the lens map admits an extension in the sense of Def. 4. We may choose M1 = S² and M2 = S².
Here Dp is embedded into the sphere in such a way that it covers a region (θ, ϕ) ∈ ]0, π[ × ]ε, 2π − ε[, i.e., in comparison with the embedding into Sp the curve segment excised from the sphere has been “widened” a bit. The embedding of N ≃ S¹ × R into S² is made via Mercator projection. As the string spacetime has vanishing curvature, the light cones in M have no caustics. Owing to our general results of Sect. 3, this implies that the caustic of the lens map is empty and that all images have even parity, so (8) gives deg(fp, ξ) = n+(ξ) = n(ξ) for all ξ ∈ N \ fp(∂Dp). The actual value of n(ξ) depends on the parameter k that enters into the metric (19). If i = 1/k is an integer, N \ fp(∂Dp) is connected and n(ξ) = i everywhere on this set. If
i < 1/k < i + 1 for some integer i, N \ fp(∂Dp) has two connected components, with n(ξ) = i on one of them and n(ξ) = i + 1 on the other. Thus, the string produces multiple imaging and the number of images is (finite but) arbitrarily large if k is sufficiently small. For all k ∈ ]0, 1[, the lens map is surjective, fp(Dp) = N ≃ S¹ × R. So this example shows that the assumption of fp(Dp) being simply connected was essential in Prop. 5.

6.1.2. Non-transparent spherical body. We consider the Schwarzschild metric
\[ g = \left(1 - \frac{2m}{r}\right)^{-1} dr^2 + r^2 \left( d\theta^2 + \sin^2\theta \, d\varphi^2 \right) - \left(1 - \frac{2m}{r}\right) dt^2 \tag{20} \]
on the manifold M = ]Ro, ∞[ × S² × R. In (20), r is the coordinate ranging over ]Ro, ∞[, t is the coordinate ranging over R, and θ and ϕ are spherical coordinates on S². This gives the static vacuum spacetime around a spherically symmetric body of mass m and radius Ro. Restricting the spacetime manifold to the region r > Ro is a way of treating the central body as non-transparent. In the following we keep a value Ro > 0 fixed and we allow m to vary between m = 0 (flat space) and m = Ro/2 (black hole). For discussing lens maps in this spacetime we fix a constant R > 3Ro/2. We denote by T the set of all points in M with coordinate r = R and we denote by W the restriction of ∂/∂t to T. Then (T, W) is a source surface, with N = T/W ≃ S². It is our goal to discuss the properties of the lens map fp : Dp −→ N for a point p ∈ M with a radius coordinate r < R in their dependence on the mass parameter m. To that end we make use of well-known properties of the lightlike geodesics in the Schwarzschild metric, see, e.g., Chandrasekhar [28], Sect. 20, for a comprehensive discussion. For determining the relevant features of the lens map it will be sufficient to concentrate on qualitative aspects of image positions. For quantitative aspects the reader may consult Virbhadra and Ellis [27]. We first observe that, for any m ∈ [0, Ro/2], there is no past-pointing lightlike geodesic from p that intersects T more than once or touches T tangentially. This follows from the fact that in the region r > 3m the radius coordinate has no local maximum along any light ray. So the lens map fp gives full information about all images at p of light sources ξ ∈ N. For m = 0, the light rays are straight lines. The domain Dp of the lens map is given by excising a disc, including the boundary, from the celestial sphere Sp, i.e., Dp ≃ R².
The boundary of Dp corresponds to light rays grazing the surface of the central body, so fp can be continuously extended onto the closure of Dp in Sp, thereby giving an extension of fp, in the sense of Def. 4, f̄p : D̄p ⊆ Sp −→ N. In Fig. 2, fp(∂Dp) can be represented as a “circle of equal latitude” on the sphere r = R, with the image of fp “to the north” of this circle. With increasing m, fp(∂Dp) moves “south” until, at some value m = m1, it has reached the “south pole” ξS. This is the situation depicted in Fig. 2. From now on the lens map is surjective and ξS is seen as an Einstein ring, thereby indicating that a caustic has formed. Now fp(∂Dp) moves north until, at some value m = m′1, it has reached the “north pole” ξN. From now on ξN is seen as an Einstein ring, in addition to the regular image that exists from the beginning. With further increasing m, we find an infinite sequence of values 0 < m1 < m′1 < m2 < m′2 < · · · < mi < m′i < . . . such that at m = mi a new Einstein ring of ξS and at m = m′i a new Einstein ring of ξN comes into existence. For all intermediate values of m, fp(∂Dp) divides N into two connected components. All points ξ in the southern component, with the exception of the south pole ξS, are regular values of the lens map. The preimage fp⁻¹(ξ) consists of exactly 2i points where i is the largest integer with mi < m. There are i images of even parity, n+(ξ) = i, and i images
Fig. 2. At m = m1, the extended lens map fp maps the boundary of Dp onto the south pole ξS
of odd parity, n−(ξ) = i, hence deg(fp, ξ) = n+(ξ) − n−(ξ) = 0. Similarly, all points ξ in the northern component, with the exception of the north pole ξN, are regular values of the lens map. The preimage fp⁻¹(ξ) consists of exactly 2i + 1 points, where i is the largest integer with m′i < m. There are i + 1 images of even parity, n+(ξ) = i + 1, and i images of odd parity, n−(ξ) = i, hence deg(fp, ξ) = n+(ξ) − n−(ξ) = 1. Both sequences (mi)i∈N and (m′i)i∈N converge towards m = Ro/3. For m ≥ Ro/3, the boundary of Dp corresponds to light rays that approach the sphere r = 3m asymptotically in a never-ending spiral motion, cf. Chandrasekhar [28], Fig. 9 and Fig. 10. The lens map no longer admits an extension in the sense of Def. 4, so we cannot assign a mapping degree to it. There are infinitely many concentric Einstein rings for both poles, and infinitely many isolated images for all other ξ ∈ N, with both n+(ξ) and n−(ξ) being infinite. These features remain unchanged until the black-hole case m = Ro/2 is reached. The fact that in this case the caustic of the lens map consists of just two points is rather exceptional. After a small perturbation of the spherical symmetry the caustic would show a completely different behavior. For regular ξ ∈ N, however, the statements about n±(ξ) are stable against small perturbations. Having studied Schwarzschild spacetimes around non-transparent bodies, the reader might ask about transparent bodies, i.e., about matching an interior solution to the exterior Schwarzschild solution at radius Ro, with Ro > 2m, and allowing for light rays passing through the interior region. If Ro > 3m, and if there are no light rays trapped within the interior region, the resulting spacetime will be asymptotically simple and empty. Qualitative features of lens maps in this class of spacetimes are
discussed in the following subsection. For a more explicit discussion of lens maps in the Schwarzschild spacetime of a transparent body, choosing a perfect fluid with constant density for the interior region, the reader is referred to Kling and Newman [29].
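The image counts stated for the string spacetime of Sect. 6.1.1 can be checked with a few lines of code, because the spatial geodesics are straight lines on the flattened cone. The sketch below rests on an assumption that is not spelled out in the text: an observer and a source worldline separated by the coordinate angle Δϕ are joined by one light ray for every integer n with |k(Δϕ + 2πn)| < π, i.e., for every copy of the source in the unfolded plane that subtends an angle smaller than π at the apex. The function name count_string_images and the sample values of k and Δϕ are hypothetical.

```python
import math

def count_string_images(k, dphi):
    """Count light rays from observer to source around a string.

    k    : cone parameter of the metric, 0 < k < 1
    dphi : coordinate angle between observer and source, in [0, 2*pi)

    Assumption: one ray exists for each integer n such that the
    unfolded angle k * (dphi + 2*pi*n) has absolute value below pi.
    Values of dphi for which this angle equals pi exactly correspond
    to rays hitting the string and should be avoided.
    """
    if not 0.0 < k < 1.0:
        raise ValueError("k must lie strictly between 0 and 1")
    n_max = int(1.0 / k) + 2  # no admissible n lies outside this range
    count = 0
    for n in range(-n_max, n_max + 1):
        if abs(k * (dphi + 2.0 * math.pi * n)) < math.pi:
            count += 1
    return count

if __name__ == "__main__":
    for k in (0.5, 0.4, 0.9):
        counts = [count_string_images(k, d) for d in (0.1, 1.0, 3.0)]
        print(f"k = {k}: image counts {counts}")
```

Under this assumption the count equals 1/k for every generic Δϕ when 1/k is an integer, and it takes the two values i and i + 1 when i < 1/k < i + 1, in line with the statements of Sect. 6.1.1.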
6.2. Asymptotically simple and empty spacetimes. Asymptotically simple and empty spacetimes are considered to be good models for the gravitational fields of transparent gravitating bodies that can be viewed as isolated from all other masses in the universe. The formal definition, which is essentially due to Penrose [30], cf., e.g., Hawking and Ellis [21], p. 222, reads as follows.

Definition 6. A spacetime (M, g) is called asymptotically simple if there is a strongly causal spacetime (M̃, g̃) with the following properties:
(a) M is an open submanifold of M̃ with a non-empty boundary ∂M.
(b) There is a C∞ function Ω : M̃ −→ R such that M = { p ∈ M̃ | Ω(p) > 0 }, ∂M = { p ∈ M̃ | Ω(p) = 0 }, dΩ ≠ 0 everywhere on ∂M and g̃ = Ω² g on M.
(c) Every inextendible lightlike geodesic in M has past and future end-point on ∂M.
(M, g) is called asymptotically simple and empty if, in addition,
(d) there is a neighborhood V of ∂M in M̃ such that the Ricci tensor of g vanishes on V ∩ M.

Condition (d) of Def. 6 is a way of saying that, sufficiently far away from the gravitating body under consideration, Einstein’s vacuum field equation is satisfied. This assumption is reasonable for the spacetime around an isolated body producing gravitational lensing as long as cosmological aspects can be ignored.

The assumptions (a)–(d) of Def. 6 imply that ∂M is a g̃-lightlike hypersurface in M̃ that has two connected components, usually denoted by J+ and J− (cf., e.g., Hawking and Ellis [21], p. 222). Every inextendible lightlike geodesic in M has future end-point on J+ and past end-point on J−. In the following we concentrate on J− which is the relevant quantity in view of gravitational lensing. By construction, J− is ruled by the integral curves of the g̃-gradient Z of Ω. (In coordinate notation, the vector field Z is defined by Zᵃ = g̃ᵃᵇ ∂b Ω on J−.)
It is well known that Z is regular, with J−/Z being diffeomorphic to S², and that the natural projection πZ : J− −→ J−/Z ≃ S² makes J− into a trivializable fiber bundle with typical fiber diffeomorphic to R. For a full proof we refer to Newman and Clarke [17, 18]. (The argument given in Hawking and Ellis [21], Prop. 6.9.4, which is due to Geroch [31], is incomplete.) This result can be translated into our terminology in the following way.

Proposition 13. In the case of an asymptotically simple and empty spacetime, (J−, Z) is a source surface in the spacetime (M̃, g̃), with N = J−/Z diffeomorphic to S².

Each integral curve of Z can be written as the C¹-limit of a sequence (γi)i∈N of timelike curves in M. We may interpret the γi as a sequence of worldlines of light sources approaching infinity. From the viewpoint of the physical spacetime (M, g), it is thus justified to interpret the integral curves of Z as “light sources at infinity”. With respect to the unphysical metric g̃, these worldlines are lightlike. With respect to the physical metric, however, they have no causal character at all, because the metric g is
not defined on J−. It is, thus, a misinterpretation to say that the “light sources at infinity” move at the speed of light. We shall now show that the formalism of “simple lensing neighborhoods” applies to the situation at hand. To that end, we observe that J− is the boundary of M in the manifold M̃ \ J+. This gives rise to the following result.

Proposition 14. In the case of an asymptotically simple and empty spacetime, (M, J−, Z) is a simple lensing neighborhood in the spacetime (M̃ \ J+, g̃|M̃\J+).

Proof. Condition (a) of Def. 5 is obvious from Def. 6 and Condition (b) was just established. The proof of the remaining two conditions is based on the fact that on M the g-lightlike geodesics coincide with the g̃-lightlike geodesics (up to affine parametrization). Condition (d) of Def. 5 is satisfied since every lightlike geodesic in M has past end-point on J− and future end-point on J+. Moreover, the arrival on J± must be transverse since J± is g̃-lightlike. This shows that Condition (c) of Def. 5 is satisfied as well.

We can, thus, apply our results on simple lensing neighborhoods to asymptotically simple and empty spacetimes. As a first result, Prop. 11 tells us that every asymptotically simple and empty spacetime M must be contractible. This result is not new. It is well known that every asymptotically simple and empty spacetime is globally hyperbolic and, thus, homeomorphic to a product of a Cauchy surface C with the real line, M ≃ C × R, and that C is contractible. For a full proof we refer again to Newman and Clarke [17, 18]. The stronger result that C must be homeomorphic to R³ requires the assumption that the Poincaré conjecture is true (i.e., that every simply connected and compact 3-manifold is homeomorphic to S³). In addition, Prop. 11 gives us the following result.

Proposition 15. In the case of an asymptotically simple and empty spacetime, for all p ∈ M the lens map fp : Sp −→ J−/Z ≃ S² has |deg(fp)| = 1.
The lens map fp for “light sources at infinity” in an asymptotically simple and empty spacetime was already discussed in Perlick [32, 33]. In particular, a proof of the result deg(fp) = 1 was given in Theorem 6 of [32]. An equivalent statement, using a different terminology, can be found as Lemma 1 in Kozameh, Lamberti and Reula [34], together with a short proof. However, both these earlier proofs are incomplete. The proof in [32] is based on the idea to homotopically deform fp into the identity, but it is not shown that the construction can be made in such a way that the dependence on the deformation parameter is, indeed, continuous. In [34], the authors write the future light cone (or, equivalently, the past light cone) of a point p ∈ M as the image of a map from ]0, ∞[ × S² to M, and they assign a winding number to the restriction of this map to each fixed value s ∈ ]0, ∞[. Since a winding number has to refer to a “center”, the authors in [34] apparently take for granted that there is a timelike curve through p that has no further intersection with the light cone of p. The existence of such a curve, however, is an open question. With our Prop. 11 we have filled these gaps insofar as we have established the result deg(fp) = ±1. However, we have not shown whether, with our choice of orientations, the occurrence of the minus sign can be ruled out. Proposition 15 implies that every observer in p sees an odd number of images of each light source at infinity that does not pass through the caustic of the past light cone of p. (Here one has to refer to the g̃-cone which is an extension of the g-cone.) As an immediate consequence of Prop. 12, we find that a similar statement is true for light sources inside M, see Fig. 3.
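The step from |deg(fp)| = 1 to an odd number of images can be written out in one line. The relation below assumes, as used in Sect. 6.1.2, that for a regular value ξ the degree counts the images with sign, deg(fp) = n+(ξ) − n−(ξ); the symbol n∓ is shorthand introduced only for this remark (n− if the degree is +1, n+ if it is −1).

```latex
\deg(f_p) = n_+(\xi) - n_-(\xi) = \pm 1
\quad\Longrightarrow\quad
n(\xi) = n_+(\xi) + n_-(\xi) = 2\, n_\mp(\xi) + 1 .
```

So the total number of images is odd regardless of which sign of the degree occurs.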
[Figure not reproduced; labels in the original: J+, J−, p, q, γ.]
Fig. 3. Illustration of Proposition 16
Proposition 16. Fix a point p and a timelike embedded C∞ curve γ in an asymptotically simple and empty spacetime (M, g). Assume that the image of γ is a closed subset of M̃ \ J+ and that γ meets neither the point p nor the caustic of the past light cone of p. Then the number of past-pointing lightlike geodesics from p to γ in M is finite and odd.
Let us conclude this subsection with a few remarks on spacetimes that are asymptotically simple but not empty. For any asymptotically simple spacetime it is easy to verify that ∂M has either one or two connected components, and that all lightlike geodesics in M have their past end-point in the same connected component of ∂M. Let us denote this component by J− henceforth. In order to apply our formalism of simple lensing neighborhoods the additional assumptions needed are that J− is a fiber bundle with g̃-causal fibers diffeomorphic to R over an orientable basis manifold, and that all past-inextendible lightlike geodesics in M meet J− transversely. If these assumptions are satisfied, our results on simple lensing neighborhoods apply. In particular, J− must be diffeomorphic to S² × R and M must be contractible. As an interesting special case, we might modify Condition (d) of Def. 6 by requiring the Ricci tensor of g to be equal to D g near ∂M with a positive or negative cosmological constant D. The resulting spacetimes are called asymptotically deSitter for D > 0 and asymptotically anti-deSitter for D < 0. It was verified already by Penrose [30] that then ∂M is g̃-spacelike for D > 0 and g̃-timelike for D < 0. Thus, the formalism of simple lensing neighborhoods is inappropriate for investigating asymptotically deSitter spacetimes, but it may be used for the investigation of asymptotically anti-deSitter spacetimes.
6.3. Weakly perturbed Robertson–Walker spacetimes.
It is a characteristic feature of the lens map, as defined in this paper, that it is constructed by following each past-pointing lightlike geodesic up to its first intersection with the source surface only. Further
intersections are ignored, i.e., some images are willfully excluded from the gravitational lensing discussion. In the preceding examples no such further intersections occurred. We shall now discuss an example where they do occur but where it is physically well motivated to disregard them. To that end we start out with a spacetime (M, g) with M = S³ × R and
g = R(t)² ( − dt² + dχ² + sin²χ (dθ² + sin²θ dφ²) ). (21)
Here χ ∈ [0, π], θ ∈ [0, π] and φ ∈ [0, 2π] denote standard coordinates on S³ (with the usual coordinate singularities), t denotes the projection from M = S³ × R onto R, and R : R ⟶ R is a strictly positive but otherwise arbitrary C∞ function. This is the general form of a Robertson–Walker spacetime with positive spatial curvature and natural topology which has no particle horizons. (Particle horizons are excluded by the assumption that the “conformal time” t runs over all of R.) Now fix a coordinate value χ_o ∈ ]0, π/2[ and let U denote the set of all points in M whose χ-coordinate is smaller than χ_o. Let W denote the restriction of the vector field ∂/∂t to the boundary ∂U. Then (U, ∂U, W) is a simple lensing neighborhood. This is easily verified using the fact that the lightlike geodesics in M project to the geodesics of the standard metric on S³. Our assumptions that t ranges over all of R and that χ_o < π/2 are essential to make sure that, for all p ∈ U, the lens map is defined on all of S_p. In the case at hand, the lens map f_p : S_p ⟶ ∂U/W is a global diffeomorphism for all points p ∈ U. Actually, there are infinitely many past-pointing lightlike geodesics from any fixed p ∈ U to any fixed ξ ∈ ∂U/W, but only one of them reaches ξ without having left U. All the other ones make at least a half circle around the whole universe, so they will give rise to rather faint images as a consequence of absorption in the intergalactic medium.
It is, thus, reasonable to assume that only the one image which enters into the lens map is actually visible. In this sense, disregarding all the other light rays is physically well motivated. Please note that all the infinitely many images of ξ are situated at just two points of the celestial sphere at p; the two brightest images cover all the other ones. Now this example is boring in view of gravitational lensing because the lens map is a global diffeomorphism. However, we can switch to a more interesting situation by choosing a compact subset K ⊂ S³ and modifying the metric on the set K × R. In view of Einstein’s field equation, this can be interpreted as introducing local mass concentrations that act as gravitational lens deflectors. If K × R is completely contained in U, and if the modification of the metric is sufficiently small to make sure that, even after the modification, no light rays are past- or future-trapped inside U, then U remains a simple lensing neighborhood. We have, thus, Prop. 11 at our disposal. Under the (very mild) additional assumption that, even after the perturbation, there are no closed timelike curves in U, we may also use Prop. 12. This is a line of argument to the effect that, in a Robertson–Walker spacetime of the kind considered here, any transparent gravitational lens deflector produces an odd number of visible images. The assumption that there are no particle horizons was essential since otherwise the lens map would not be defined on the whole celestial sphere for all p ∈ U. A similar argument applies, of course, to Robertson–Walker spacetimes with noncompact spatial sections. Then we don’t have to care about light rays traveling around the whole universe, so there are no additional images which are ignored by the lens map.
References
1. Schneider, P., Ehlers, J., Falco, E.: Gravitational lenses. New York: Springer, 1992
2. Frittelli, S., Newman, E.: Phys. Rev. D 59, 124001 (1999)
3. Ehlers, J., Frittelli, S., Newman, E.: In: J. Renn (ed.), Festschrift in honor of John Stachel. Kluwer Academic Publishers, to appear 2001
4. Ehlers, J.: Annalen der Physik (Leipzig) 9, 307 (2000)
5. Palais, R.: Ann. Math. 73, 295 (1961)
6. Whitney, H.: Ann. Math. 37, 645 (1936)
7. Hirsch, M.W.: Differential topology. New York: Springer, 1976
8. Abraham, R., Marsden, J.: Foundations of mechanics. Reading, MA: Benjamin-Cummings, 1978
9. Kobayashi, S., Nomizu, K.: Foundations of differential geometry. Vol. I. New York: Wiley-Interscience, 1963
10. Harris, S.: Class. Quantum Grav. 9, 1823 (1992)
11. Beem, J., Ehrlich, P., Easley, K.: Global Lorentzian geometry. New York: Dekker, 1996
12. Choquet-Bruhat, Y., Dewitt-Morette, C., Dillard-Bleick, M.: Analysis, manifolds and physics. Amsterdam: North-Holland, 1977
13. Dold, A.: Lectures on algebraic topology. Berlin: Springer, 1980
14. Spanier, E.: Algebraic topology. New York: McGraw Hill, 1966
15. Bredon, G.E.: Topology and geometry. New York: Springer, 1993
16. McKenzie, R.H.: J. Math. Phys. 26, 1592 (1985)
17. Newman, R.P.C., Clarke, C.J.S.: Class. Quantum Grav. 4, 53 (1987)
18. Newman, R.P.C.: Commun. Math. Phys. 123, 17 (1989)
19. Frankel, T.: The geometry of physics. Cambridge: Cambridge UP, 1997
20. Wald, R.: General relativity. Chicago: University of Chicago Press, 1984
21. Hawking, S., Ellis, G.: The large scale structure of space-time. Cambridge: Cambridge UP, 1973
22. Giannoni, F., Masiello, A., Piccione, P.: Commun. Math. Phys. 187, 375 (1997)
23. Giannoni, F., Masiello, A., Piccione, P.: Ann. Inst. H. Poincaré, Physique Theoretique 69, 359 (1998)
24. Vilenkin, A.: Phys. Rev. D 23, 852 (1981)
25. Hiscock, W.: Phys. Rev. D 31, 3288 (1985)
26. Gott, J.R.: Astrophys. J. 288, 422 (1985)
27. Virbhadra, K.S., Ellis, G.F.R.: Phys. Rev. D 62, 084003 (2000)
28. Chandrasekhar, S.: The mathematical theory of black holes. Oxford: Oxford UP, 1983
29. Kling, T., Newman, E.T.: Phys. Rev. D 59, 124002 (1999)
30. Penrose, R.: In: deWitt, C. M., deWitt, B. (eds.): Relativity, groups and topology. Les Houches Summer School 1963. New York: Gordon and Breach, 1964, p. 565
31. Geroch, R.: In: Sachs, R. K. (ed.): General relativity and cosmology. Enrico Fermi School, Course XLVII. New York: Academic Press, 1971, pp. 71–103
32. Perlick, V.: In: Schmidt, B. (ed.): Einstein’s field equations and their physical implications. Heidelberg: Springer, 2000
33. Perlick, V.: Ann. Physik (Leipzig) 9, SI–139 (2000)
34. Kozameh, C., Lamberti, P.W., Reula, O.: J. Math. Phys. 32, 3423 (1991)
Communicated by H. Nicolai
Commun. Math. Phys. 220, 429 – 451 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
On the Characteristic Polynomial of a Random Unitary Matrix C. P. Hughes1,2 , J. P. Keating1,2 , Neil O’Connell1 1 BRIMS, Hewlett-Packard Labs, Bristol, BS34 8QZ, UK 2 School of Mathematics, University of Bristol, University Walk, Bristol, BS8 1TW, UK
Received: 27 June 2000 / Accepted: 30 January 2001
Abstract: We present a range of fluctuation and large deviations results for the logarithm of the characteristic polynomial Z of a random N × N unitary matrix, as N → ∞. First we show that $\ln Z/\sqrt{\tfrac12 \ln N}$, evaluated at a finite set of distinct points, is asymptotically a collection of i.i.d. complex normal random variables. This leads to a refinement of a recent central limit theorem due to Keating and Snaith, and also explains the covariance structure of the eigenvalue counting function. Next we obtain a central limit theorem for ln Z in a Sobolev space of generalised functions on the unit circle. In this limiting regime, lower-order terms which reflect the global covariance structure are no longer negligible and feature in the covariance structure of the limiting Gaussian measure. Large deviations results for ln Z/A, evaluated at a finite set of distinct points, can be obtained for $\sqrt{\ln N} \ll A \ll \ln N$. For higher-order scalings we obtain large deviations results for ln Z/A evaluated at a single point. There is a phase transition at A = ln N (which only applies to negative deviations of the real part) reflecting a switch from global to local conspiracy.
1. Introduction and Summary
Let U be an N × N unitary matrix, chosen uniformly at random from the unitary group U(N), and denote its eigenvalues by exp(iθ_1), . . . , exp(iθ_N). In order to develop a heuristic understanding of the value distribution and moments of the Riemann zeta function, Keating and Snaith [21] considered the characteristic polynomial (normalised so that its logarithm has zero mean)
$$Z(\theta) = \det(I - U e^{-i\theta}) = \prod_{n=1}^{N} \left(1 - e^{i(\theta_n - \theta)}\right). \tag{1.1}$$
This is believed to be a good statistical model for the zeta function at (large but finite) height T up the critical line when the mean density of the non-trivial zeros (which equals
(1/2π) ln(T/2π)) is set equal to the mean density of eigenangles (which is N/2π). (For additional evidence of this, concerning other statistics, see [9].) Note that the law of Z(θ) is independent of θ ∈ T (the unit circle). In [21] it is shown that as N → ∞, ln Z(0)/σ converges in distribution to a standard complex normal random variable, where 2σ² = ln N. That is,
$$\frac{\ln Z(0)}{\sqrt{\tfrac12 \ln N}} \Rightarrow X + iY, \tag{1.2}$$
where X and Y are independent normal random variables with mean zero and variance one¹, and ⇒ denotes convergence in distribution. (A similar result can be found in [2], but there the real and imaginary parts of ln Z/σ are treated separately.) In order to make the imaginary part of the logarithm well-defined, the branch is chosen so that
$$\ln Z(\theta) = \sum_{n=1}^{N} \ln\left(1 - e^{i(\theta_n - \theta)}\right) \tag{1.3}$$
and
$$-\tfrac12\pi < \operatorname{Im} \ln\left(1 - e^{i(\theta_n - \theta)}\right) \le \tfrac12\pi. \tag{1.4}$$
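The branch convention in (1.3) and (1.4) can be illustrated with a small numerical sketch. It assumes a hand-picked list of eigenangles (not a Haar sample) and relies on the fact that 1 − e^{iφ} has non-negative real part, so the principal branch used by Python's `cmath.log` already lands in the range (1.4). The function names are illustrative only.

```python
import cmath
import math

def char_poly(eigenangles, theta):
    """Z(theta) as the product in (1.1)."""
    value = 1.0 + 0.0j
    for angle in eigenangles:
        value *= 1 - cmath.exp(1j * (angle - theta))
    return value

def log_char_poly_terms(eigenangles, theta):
    """Individual summands ln(1 - e^{i(theta_n - theta)}) of (1.3), principal branch."""
    return [cmath.log(1 - cmath.exp(1j * (angle - theta))) for angle in eigenangles]

# Arbitrary illustrative eigenangles (an assumption, not drawn from U(N)).
eigenangles = [0.3, 1.7, 2.9, 4.4, 5.8]
theta = 1.0

terms = log_char_poly_terms(eigenangles, theta)
log_z = sum(terms)
z = char_poly(eigenangles, theta)
```

Exponentiating `log_z` recovers `z`, and each summand has imaginary part of modulus at most π/2, whereas the sum `log_z.imag` is in general not the principal argument of `z`.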
Compare the above central limit theorem with a central limit theorem, due to Selberg, for the value distribution of the log of the Riemann zeta function along the critical line. Selberg proved (see, for example, §2.11 of [24] or §4 of [22]) that, for rectangles B ⊆ C,
$$\lim_{T\to\infty} \frac{1}{T}\left|\left\{ T \le t \le 2T : \frac{\ln \zeta(\tfrac12 + it)}{\sqrt{\tfrac12 \ln\ln T}} \in B \right\}\right| = \frac{1}{2\pi} \iint_B e^{-(x^2+y^2)/2}\, dx\, dy. \tag{1.5}$$
Equating the mean density of the Riemann zeros at height T with the mean density of eigenangles of an N × N unitary matrix, we have N = ln(T/2π) and thus we see that these two central limit theorems are consistent. In this paper we obtain more detailed fluctuation theorems for ln Z as N → ∞, and a range of large and moderate deviations results. First we show that ln Z/σ, evaluated at a finite set of distinct points, is asymptotically a collection of i.i.d. complex normal random variables. This leads to a refinement of the above central limit theorem, and also explains the mysterious covariance structure which has been observed, by Costin and Lebowitz [10] and Wieand [32, 33], in the eigenvalue counting function. We also obtain a central limit theorem for ln Z in a Sobolev space of generalised functions on the unit circle. In this limiting regime, lower-order terms which reflect the global covariance structure are no longer negligible and feature in the covariance structure of the limiting Gaussian measure. The limiting process is not in L²(T). It is, however, when integrated, Hölder continuous with parameter 1 − δ, for any δ > 0. Large deviations results for ln Z/A, evaluated at a finite set of distinct points, are obtained for $\sqrt{\ln N} \ll A \ll \ln N$. For higher-order scalings we obtain large deviations results for ln Z/A evaluated at a single point. For the imaginary part, all scalings A ≪ N lead to quadratic rate functions. At A = N, the speed is N², and the rate function is a convex function for which we give an explicit formula. For the real part, only scalings up to A = ln N lead to quadratic rate functions. At this critical scaling one observes a phase transition, and beyond it deviations to the left and right occur at different speeds. For deviations to the left, the rate function becomes linear; for deviations to the right, the rate function remains quadratic up to but not including the scaling A = N. At the scaling A = N, deviations to the left occur at speed N, while deviations to the right occur at speed N², and the rate function is again a convex function for which we give an explicit formula. The phase transition reflects a switch from global to local conspiracy. Related fluctuation theorems for random matrices can be found in [10, 13, 12, 19, 14, 27] and references therein. In particular, Diaconis and Evans [12] give an alternative proof of Theorem 2.2 below. The large deviation results at speed N² are partially consistent with (but do not follow from) a higher-level large deviation principle due to Hiai and Petz [16]. High-level large deviations results and concentration inequalities for other ensembles can be found in [5, 6, 15].
¹ Perhaps we should warn the reader at this point that some authors use the term “standard complex normal” to refer to the case where the variance of each component is 1/2.
2. Fluctuation Results
Our first main result is that the law of ln Z(0) obtained by averaging over the unitary group is asymptotically the same as the value distribution of ln Z(θ) obtained by averaging over θ for a typical realisation of U:
Theorem 2.1. Set W_N(θ) = ln Z(θ)/σ, and denote by m the uniform probability measure on T (so that m(dθ) = dθ/2π). As N → ∞, the sequence of laws $m \circ W_N^{-1}$ converges weakly in probability to a standard complex normal variable.
This will follow from Theorem 2.2 below, so we defer the proof. Theorem 2.1 hints at the possibility that the t-range in (1.5) can be significantly reduced. The characteristic polynomial can also be used to explain the mysterious “white noise” process which appears in recent work of Wieand [32, 33] on the counting function (and less explicitly in earlier work of Costin and Lebowitz [10]). A Gaussian process is defined to be a collection of real (complex) random variables {X(α), α ∈ I}, with the property that, for any α_1, . . . , α_m, the joint distribution of X(α_1), . . . , X(α_m) is multivariate (complex) normal. For −π < s < t ≤ π, let C_N(s, t) denote the number of eigenangles of U that lie in the interval (s, t). Wieand proves that the finite dimensional distributions of the process $\tilde C_N$ defined by
$$\tilde C_N(s,t) = \frac{C_N(s,t) - (t-s)N/2\pi}{\frac{1}{\pi}\sqrt{\tfrac12 \ln N}} \tag{2.1}$$
converge to those of a Gaussian process C which can be realised in the following way: let Y be a centered Gaussian process indexed by T with covariance function $\mathbb{E}\,Y(s)Y(t) = \mathbf{1}_{\{s=t\}}$ (where $\mathbf{1}$ is the indicator function) and set C(s, t) = Y(t) − Y(s). What is the origin of this process Y? The answer is as follows. First, it is not hard to show that for each N,
$$\tilde C_N(s,t) = Y_N(t) - Y_N(s), \tag{2.2}$$
where Y_N(θ) = Im ln Z(θ)/σ. This follows from the identity
$$\mathbf{1}_{\{\theta \in (s,t)\}} = \frac{t-s}{2\pi} + \frac{1}{\pi}\operatorname{Im}\ln\left(1 - e^{i(\theta - t)}\right) - \frac{1}{\pi}\operatorname{Im}\ln\left(1 - e^{i(\theta - s)}\right), \tag{2.3}$$
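The indicator identity (2.3) is easy to check numerically. The sketch below evaluates its right-hand side with the principal branch of the logarithm for one interval (s, t) and three test angles; the interval endpoints and test points are arbitrary choices, and the function names are illustrative.

```python
import cmath
import math

def im_log(phi):
    """Im ln(1 - e^{i phi}) on the principal branch, as fixed in (1.4)."""
    return cmath.log(1 - cmath.exp(1j * phi)).imag

def indicator_via_logs(theta, s, t):
    """Right-hand side of (2.3) for -pi < s < t <= pi and theta not equal to s or t."""
    return (t - s) / (2 * math.pi) + im_log(theta - t) / math.pi - im_log(theta - s) / math.pi

s, t = -1.0, 2.0                                  # arbitrary interval (assumption)
inside = indicator_via_logs(0.5, s, t)            # theta in (s, t)
right_of_t = indicator_via_logs(2.5, s, t)        # theta > t
left_of_s = indicator_via_logs(-2.0, s, t)        # theta < s
```

Up to rounding, `inside` equals 1 while the other two values equal 0. Summing (2.3) over the eigenangles θ_n is what turns the counting function into a difference of values of Im ln Z.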
where, as always, the principal branch of the logarithm is chosen as in (1.4). Moreover:
Theorem 2.2. Set W_N(θ) = ln Z(θ)/σ. If r_1, . . . , r_k ∈ T are distinct, the joint law of (W_N(r_1), . . . , W_N(r_k)) converges as N → ∞ to that of k i.i.d. standard complex normal random variables.
In particular, the finite dimensional distributions of Y_N converge to those of Y. This suggests that the analogous extension of Selberg’s theorem (1.5) might hold for the zeta function.
Proof. Let f be a real-valued function in L¹(T), and denote by
$$\hat f_k = \int_0^{2\pi} f(\theta) e^{-ik\theta}\, m(d\theta) \tag{2.4}$$
its Fourier coefficients. The Nth order Toeplitz determinant with symbol f is defined by
$$D_N[f] = \det\left(\hat f_{j-k}\right)_{1 \le j,k \le N}. \tag{2.5}$$
Heine’s identity (see, for example, [28]) states that
$$D_N[f] = \mathbb{E} \prod_{n=1}^{N} f(\theta_n). \tag{2.6}$$
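Heine's identity (2.6) can be sanity-checked for N = 2 without sampling random matrices. The sketch below makes two assumptions beyond the text: a hypothetical test symbol f(θ) = 1 + a cos θ, and the standard Weyl density |e^{iθ_1} − e^{iθ_2}|²/2! (with respect to m × m) for the eigenangles of a Haar-distributed 2 × 2 unitary matrix. Equally spaced quadrature is exact here because all integrands are low-degree trigonometric polynomials.

```python
import cmath
import math

A_COEFF = 0.6  # illustrative amplitude (assumption)

def symbol(theta):
    """Hypothetical test symbol f(theta) = 1 + a cos(theta)."""
    return 1.0 + A_COEFF * math.cos(theta)

def fourier_coeff(f, k, n=64):
    """k-th Fourier coefficient (2.4) with respect to m(dtheta) = dtheta / (2 pi)."""
    grid = (2 * math.pi * j / n for j in range(n))
    return sum(f(th) * cmath.exp(-1j * k * th) for th in grid) / n

def toeplitz_det_2(f):
    """D_2[f] = det [[f_0, f_{-1}], [f_1, f_0]], i.e. (2.5) with N = 2."""
    f0, f1, fm1 = (fourier_coeff(f, k) for k in (0, 1, -1))
    return (f0 * f0 - f1 * fm1).real

def haar_average_2(f, n=64):
    """E f(theta_1) f(theta_2) under the Weyl density for U(2)."""
    total = 0.0
    for i in range(n):
        t1 = 2 * math.pi * i / n
        for j in range(n):
            t2 = 2 * math.pi * j / n
            weight = abs(cmath.exp(1j * t1) - cmath.exp(1j * t2)) ** 2 / 2.0
            total += f(t1) * f(t2) * weight
    return total / n ** 2

det_value = toeplitz_det_2(symbol)
avg_value = haar_average_2(symbol)
```

Both quantities equal 1 − a²/4 for this symbol, which is what (2.6) predicts.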
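The following lemma is more general than we need here, but we record it for later reference.
Lemma 2.3. For any d(N) ≫ 1 as N → ∞, s, t ∈ R^k with N sufficiently large such that s_j > −d(N) for all j, and r_j distinct in T,
$$\mathbb{E} \exp\left( \sum_{j=1}^{k} \left( s_j \operatorname{Re}\ln Z(r_j)/d + t_j \operatorname{Im}\ln Z(r_j)/d \right) \right) \tag{2.7}$$
$$\sim \prod_{j=1}^{k} \mathbb{E} \exp\left( s_j \operatorname{Re}\ln Z(r_j)/d + t_j \operatorname{Im}\ln Z(r_j)/d \right) \tag{2.8}$$
$$\sim \exp\left( \frac{\ln N}{4d^2} \sum_{j=1}^{k} (s_j^2 + t_j^2) \right). \tag{2.9}$$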
Proof. This follows from Heine’s identity and a result of Basor [4] on the asymptotic behaviour of Toeplitz determinants with Fisher–Hartwig symbols. The Fisher–Hartwig symbol we require has the form
$$f(\theta) = \prod_{j=1}^{k} \left(1 - e^{i(\theta - r_j)}\right)^{\alpha_j + \beta_j} \left(1 - e^{i(r_j - \theta)}\right)^{\alpha_j - \beta_j}. \tag{2.10}$$
Taking α_j = s_j/2d and β_j = −it_j/2d, we have, by Heine’s identity,
$$\mathbb{E} \exp\left( \sum_{j=1}^{k} \left( s_j \operatorname{Re}\ln Z(r_j)/d + t_j \operatorname{Im}\ln Z(r_j)/d \right) \right) = D_N[f]. \tag{2.11}$$
Note that the α_j’s are real and the β_j’s purely imaginary. Basor [4] proves that, as N → ∞, for r_j distinct,
$$D_N[f] \sim E(\alpha_1, \beta_1, r_1, \ldots, \alpha_k, \beta_k, r_k) \prod_{j=1}^{k} N^{\alpha_j^2 - \beta_j^2}, \tag{2.12}$$
for α_j > −1/2, where
$$E(\alpha_1, \beta_1, r_1, \ldots, \alpha_k, \beta_k, r_k) = \prod_{\substack{1 \le m,n \le k \\ m \ne n}} \left(1 - e^{i(r_m - r_n)}\right)^{-(\alpha_m - \beta_m)(\alpha_n + \beta_n)} \times \prod_{j=1}^{k} \frac{G(1 + \alpha_j + \beta_j)\, G(1 + \alpha_j - \beta_j)}{G(1 + 2\alpha_j)}, \tag{2.13}$$
where G is the Barnes G-function, and $|\arg(1 - e^{i(r_m - r_n)})| \le \pi/2$. By closer inspection of the proof given in [4] it can be seen that (2.12) holds uniformly for |α_j| < 1/2 − δ, and |β_j| < γ, for any fixed δ, γ > 0². We remark that uniformity in β is worked out carefully in [32] for the case α_j = 0 for each j, and uniformity in α is discussed in [4]. The statement of the lemma follows from noting that E(0, 0, r_1, . . . , 0, 0, r_k) = 1.
Setting $d = \sqrt{\tfrac12 \ln N} = \sigma$ completes the proof of Theorem 2.2.
² This was pointed out to us by Harold Widom.
Proof of Theorem 2.1. Set X_N(θ) = Re ln Z(θ)/σ, Y_N(θ) = Im ln Z(θ)/σ and
$$\phi_N(s,t) = \int_{\mathbb{T}} \exp\left( s X_N(\theta) + t Y_N(\theta) \right) m(d\theta). \tag{2.14}$$
By the central limit theorem derived in [21] (which we note, in passing, also follows from Theorem 2.2),
$$\mathbb{E}\phi_N(s,t) = \mathbb{E} \exp\left( s X_N(0) + t Y_N(0) \right) \to e^{(s^2+t^2)/2}. \tag{2.15}$$
We also have
$$\mathbb{E}\phi_N(s,t)^2 = \int_{\mathbb{T}} \mathbb{E} \exp\left( s X_N(\theta) + t Y_N(\theta) + s X_N(0) + t Y_N(0) \right) m(d\theta). \tag{2.16}$$
By Cauchy–Schwarz, the integrand is bounded above by
$$\sup_{N \ge N_0} \mathbb{E} \exp\left( 2s X_N(0) + 2t Y_N(0) \right), \tag{2.17}$$
where N_0 is chosen such that 2s > −σ(N_0). Thus, by Theorem 2.2 and the bounded convergence theorem,
$$\mathbb{E}\phi_N(s,t)^2 \to e^{s^2+t^2}, \tag{2.18}$$
and hence
$$\mathbb{P}\left( \left| \phi_N(s,t) - e^{(s^2+t^2)/2} \right| > \epsilon \right) \le \operatorname{Var} \phi_N(s,t)/\epsilon^2 \to 0, \tag{2.19}$$
for any ε > 0, by Chebyshev’s inequality. Thus, for each s, t, the sequence φ_N(s, t) converges in probability to $e^{(s^2+t^2)/2}$. The result now follows from the fact that moment generating functions are convergence-determining.
We note that Szegö’s asymptotic formula for Toeplitz determinants does not apply in the above context. Szegö’s theorem for real-valued functions states that if
$$A(h) = \sum_{k=1}^{\infty} k |\hat h_k|^2 < \infty, \tag{2.20}$$
then
$$D_N[e^h] = \exp\left( N \hat h_0 + A(h) + o(1) \right) \tag{2.21}$$
as N → ∞. Combining this with Heine’s identity, we see that if $\hat h_0 = 0$ and A(h) < ∞, then Tr h(U) is asymptotically normal with zero mean and variance 2A(h). Now, we can write Re ln Z(θ) = Tr h(U), where $h(t) = \operatorname{Re}\ln(1 - e^{i(t-\theta)})$, but the Fourier coefficients $\hat h_k$ are of order 1/k in this case and A(h) = +∞. We can, however, apply Szegö’s theorem to obtain a functional central limit theorem for ln Z. Actually, we will use the following fact, due to Diaconis and Shahshahani [13], which can be deduced from Szegö’s theorem.
Lemma 2.4. For each l, the collection of random variables
$$\left\{ \sqrt{\frac{2}{j}}\, \operatorname{Tr} U^{-j},\ j = 1, \ldots, l \right\} \tag{2.22}$$
converges in distribution to a collection of i.i.d. standard complex normal random variables.
(In fact, it is shown in [13] that there is exact agreement of moments up to high order for each N. See also [18], where superexponential rates of convergence are established.) Denote by $H_0^a$ the space of generalised real-valued functions f on T with $\hat f_0 = 0$ and
$$\|f\|_a^2 = \sum_{k=-\infty}^{\infty} |k|^{2a} |\hat f_k|^2 = 2 \sum_{k=1}^{\infty} k^{2a} |\hat f_k|^2 < \infty. \tag{2.23}$$
This is a Hilbert space with the inner product
$$\langle f, g \rangle_a = \sum_{k=-\infty}^{\infty} |k|^{2a} \hat f_k \hat g_k^*. \tag{2.24}$$
It is also a closed subspace of the Sobolev space $H^a$, which is defined similarly but without the restriction $\hat f_0 = 0$. Sobolev spaces have the following useful property: the unit ball in $H^a$ is compact in $H^b$, whenever a > b. It follows that the unit ball in $H_0^a$ is compact in $H_0^b$ for a > b. We shall make use of this fact later. Note that, when a = 0, $\langle \cdot, \cdot \rangle_a$ is just the usual inner product on L²(T); in this case we will drop the subscript. Fix a < 0, and define a Gaussian measure µ on $H_0^a \times H_0^a$ as follows. First, let X_1, X_2, . . . be a sequence of i.i.d. standard complex normal random variables, X_0 = 0 and $X_{-k} = X_k^*$, and define a random element $F \in H_0^a$ by
$$F(\theta) = \sum_{k=-\infty}^{\infty} \frac{X_k e^{ik\theta}}{2\sqrt{2|k|}}. \tag{2.25}$$
(To see that $F \in H_0^a$, note that in fact $\mathbb{E}\|F\|_a^2 < \infty$ for a < 0.) Now define µ to be the law of (F, AF), where A is the Hilbert transform on $H_0^{-1/2}$:
$$\widehat{(Af)}_k = \begin{cases} i \hat f_k & k > 0 \\ -i \hat f_k & k < 0. \end{cases} \tag{2.26}$$
We will describe some properties of µ later. First we will prove:
Theorem 2.5. The law of (Re ln Z, Im ln Z) converges weakly to µ.
Proof. First note that Im ln Z = A(Re ln Z) and, for k ≠ 0,
$$\widehat{(\operatorname{Re}\ln Z)}_k = \frac{-\operatorname{Tr} U^{-k}}{2|k|}. \tag{2.27}$$
Convergence on cylinder sets (in the Fourier representation) therefore follows from Lemma 2.4. To prove tightness in $H_0^a \times H_0^a$ we will use the fact that the unit ball in $H_0^b$ is compact in $H_0^a$ for a < b < 0, and the uniform bound
$$\mathbb{E}\|\operatorname{Re}\ln Z\|_b^2 = 2\,\mathbb{E} \sum_{k=1}^{\infty} k^{2b} \left| \widehat{(\operatorname{Re}\ln Z)}_k \right|^2 \tag{2.28}$$
$$= \frac12 \sum_{k=1}^{\infty} k^{2b-2}\, \mathbb{E}|\operatorname{Tr} U^{-k}|^2 \tag{2.29}$$
$$= \frac12 \sum_{k=1}^{\infty} k^{2b-2} \min(k, N) \tag{2.30}$$
$$\le \frac12 \sum_{k=1}^{\infty} k^{2b-1}; \tag{2.31}$$
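a similar bound holds for Im ln Z. We have used the fact that $\mathbb{E}|\operatorname{Tr} U^k|^2 = \min(|k|, N)$ for k ≠ 0 (see, for example, [25]). Thus,
$$\sup_N \mathbb{P}\left( \max\{ \|\operatorname{Re}\ln Z\|_b, \|\operatorname{Im}\ln Z\|_b \} > q \right) \tag{2.32}$$
$$\le \sup_N \left\{ \mathbb{P}\left( \|\operatorname{Re}\ln Z\|_b > q \right) + \mathbb{P}\left( \|\operatorname{Im}\ln Z\|_b > q \right) \right\} \tag{2.33}$$
$$\le \sup_N \left( \mathbb{E}\|\operatorname{Re}\ln Z\|_b^2 + \mathbb{E}\|\operatorname{Im}\ln Z\|_b^2 \right)/q^2 \tag{2.34}$$
$$\to 0 \tag{2.35}$$
as q → ∞, so we are done.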
We will now discuss some properties of the limiting measure µ. Let (F, AF) be a realisation of µ. First note that F and AF have the same law. Recalling the construction of F, we note that for k > 0 the random variables $|\hat F_k|^2$ are independent and $|\hat F_k|^2$ is exponentially distributed with mean 1/4k. It follows that $\|F\|_a < \infty$ if, and only if, a < 0. In particular, F is almost surely not in L²(T). Nevertheless, we can characterise the law of F by stating that, for $f \in H_0^{-1/2}$,
$$2\langle f, F \rangle / \|f\|_{-1/2} \tag{2.36}$$
is a standard normal random variable. The covariance is given by
$$\mathbb{E}\langle f, F \rangle \langle g, F \rangle = \frac14 \langle f, g \rangle_{-1/2}. \tag{2.37}$$
We note that
$$\langle f, g \rangle_{-1/2} = -2 \int_{\mathbb{T}^2} \ln|e^{i\theta} - e^{i\phi}|\, f(\phi) g(\theta)\, m(d\phi)\, m(d\theta). \tag{2.38}$$
In the language of potential theory, if f is a charge distribution, then $\|f\|_{-1/2}^2$ is the logarithmic energy of f. The logarithmic energy functional also shows up as a large deviation rate function for the sequence of eigenvalue distributions: see Sect. 3.5 below. We can also write down a stochastic integral representation for the process F. If we set
$$S(\phi) = \int^{\phi} F(\theta)\, d\theta, \tag{2.39}$$
then S(φ) has the same law as
$$\frac{1}{2\pi} \int_0^{2\pi} b(\phi - \theta)\, dB(\theta), \tag{2.40}$$
where B is a standard Brownian motion and
$$b(\theta) = \frac{1}{\sqrt{8\pi}} \sum_{k=1}^{\infty} k^{-3/2} \cos(k\theta). \tag{2.41}$$
To see this, compare covariances using the identity
$$\frac14 \left\| \mathbf{1}_{[0,t]} - \frac{t}{2\pi} \right\|_{-1/2}^2 = \frac{1}{4\pi^2} \sum_{k=1}^{\infty} \frac{1 - \cos(kt)}{k^3}. \tag{2.42}$$
Finally, we observe:
Lemma 2.6. Let δ > 0. The process S has a modification which is almost surely Hölder continuous with parameter 1 − δ.
Proof. This follows from Kolmogorov’s criterion (see, for example, [26, Theorem 2.1]) and the fact that
$$\mathbb{E}|S(t) - S(0)|^2 = \frac14 \left\| \mathbf{1}_{[0,t]} - \frac{t}{2\pi} \right\|_{-1/2}^2 \tag{2.43}$$
$$= \frac{1}{4\pi^2} \sum_{k=1}^{\infty} \frac{1 - \cos(kt)}{k^3} \tag{2.44}$$
$$\sim -\frac{1}{8\pi^2}\, t^2 \ln t, \tag{2.45}$$
as t → 0⁺. To see that this asymptotic formula is valid, one can use the fact that the expression (2.44) is related to Clausen’s integral (see, for example, [1, §27.8]).
We conclude this section with two remarks on Theorem 2.5. First, Rains [25] showed that, for each θ ≠ 0,
$$\operatorname{Var} C_N(0, \theta) = \frac{1}{\pi^2} \left( \ln N + \gamma + 1 + \ln|2\sin(\theta/2)| \right) + o(1), \tag{2.46}$$
where C_N(0, θ) is the number of eigenangles lying in the interval (0, θ). Comparing this with (2.2) we see that
$$\mathbb{E}\operatorname{Im}\ln Z(\theta) \operatorname{Im}\ln Z(0) = -\frac12 \ln|2\sin(\theta/2)| + o(1). \tag{2.47}$$
This is consistent with the fact that (formally)
$$\mathbb{E}F(\theta)F(0) = -\frac12 \ln|2\sin(\theta/2)|. \tag{2.48}$$
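The formal identity (2.48) can be made concrete by summing the Fourier series. Since the coefficients of F are uncorrelated with second moment 1/4k for k > 0, the covariance is formally (1/2) Σ_{k≥1} cos(kθ)/k, and the classical series Σ cos(kθ)/k = −ln|2 sin(θ/2)| gives the closed form. The sketch below compares a truncated sum with the closed form at one angle; the truncation level and the angle are arbitrary choices, and convergence is only conditional, so the tolerance is loose.

```python
import math

def covariance_partial_sum(theta, terms=20000):
    """Truncation of (1/2) * sum_{k>=1} cos(k theta)/k, the formal series for E F(theta)F(0)."""
    return 0.5 * sum(math.cos(k * theta) / k for k in range(1, terms + 1))

def covariance_closed_form(theta):
    """Right-hand side of (2.48)."""
    return -0.5 * math.log(abs(2 * math.sin(theta / 2)))

theta = 1.0  # arbitrary non-zero angle (assumption)
series_value = covariance_partial_sum(theta)
closed_value = covariance_closed_form(theta)
```

The two values agree to within the truncation error, which is of order 1/(terms · |sin(θ/2)|); at θ = 0 the series diverges, matching the logarithmic singularity of the closed form.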
The formal identity (2.48) in fact contains all of the information needed to determine the covariance structure of the process F . The fluctuation theorem (2.5) is therefore a statement which contains information about the global covariance structure of ln Z. The covariance (2.47) is too small to feature in the scaling of Theorem 2.2. Finally, the following observation arose in discussions with Marc Yor. The process F also appears in the following context. Let B be a standard complex Brownian motion, and f : C → R defined by f (z) = h(arg z)δ(|z| = 1) for some h : T → R with hˆ 0 = 0. Then, as t → ∞, 1 √ π ln t
0
t
f (Bs )ds ⇒ h, F .
This can be deduced from a result of Kasahara and Kotani given in [20].
(2.49)
3. Large Deviations
In this section we present large and moderate deviations results for ln Z(0). We begin with a quick review of one-dimensional large deviation theory (see, for example, [8, 11]). We are concerned with the log-asymptotics of the probability distribution of R_N/A(N), where R_N is some one-dimensional real random variable and A(N) is a scaling that is much greater than the square root of the variance of R_N (so we are outside the regime of the central limit theorem). Suppose that there exists a function B(N) (which tends to infinity as N → ∞), such that
$$\Lambda(\lambda) := \lim_{N\to\infty} \frac{1}{B(N)} \ln \mathbb{E} \exp\left( \lambda \frac{B(N)}{A(N)} R_N \right) \tag{3.1}$$
exists as an extended real number, for each λ (i.e. the pointwise limit exists in the extended reals). The effective domain of Λ(·) is the set
$$D = \{ \lambda \in \mathbb{R} : \Lambda(\lambda) < \infty \} \tag{3.2}$$
and its interior is denoted by D°. The convex dual of Λ(·) is given by
$$\Lambda^*(x) = \sup_{\lambda \in \mathbb{R}} \{ \lambda x - \Lambda(\lambda) \}. \tag{3.3}$$
Theorem 3.1. For a < b, if Λ(·) is differentiable in D° and if
$$(a, b) \subseteq \{ \Lambda'(\lambda) : \lambda \in D^\circ \}, \tag{3.4}$$
then
$$\lim_{N\to\infty} \frac{1}{B(N)} \ln \mathbb{P}\left( \frac{R_N}{A(N)} \in (a, b) \right) = -\inf_{x \in (a,b)} \Lambda^*(x). \tag{3.5}$$
If (3.5) holds we say that R_N/A(N) satisfies the large deviation principle (LDP) with speed B(N) and rate function Λ*. Some partial moderate deviations results can be obtained using Lemma 2.3; however, for many of the results presented here we will need more detailed information. In particular, we will make use of the following explicit formula (see, for example, [2, 7, 21]):
$$\mathbb{E} \exp\left( s \operatorname{Re}\ln Z(\theta) + t \operatorname{Im}\ln Z(\theta) \right) = \frac{G(1 + s/2 + it/2)\, G(1 + s/2 - it/2)\, G(1 + N)\, G(1 + N + s)}{G(1 + N + s/2 + it/2)\, G(1 + N + s/2 - it/2)\, G(1 + s)}, \tag{3.6}$$
valid for Re(s ± it) > −1, where G(·) is the Barnes G-function, described in Appendix A. We will find the single moment generating functions useful, which we record here as
$$M_N(s) := \mathbb{E} \exp(s \operatorname{Re}\ln Z(0)) \tag{3.7}$$
$$= \frac{G^2\left(1 + \tfrac12 s\right) G(N + 1)\, G(N + 1 + s)}{G(1 + s)\, G^2\left(N + 1 + \tfrac12 s\right)}, \tag{3.8}$$
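and
$$L_N(t) := \mathbb{E} \exp(it \operatorname{Im}\ln Z(0)) \tag{3.9}$$
$$= \frac{G\left(1 + \tfrac12 t\right) G\left(1 - \tfrac12 t\right) G^2(N + 1)}{G\left(N + 1 + \tfrac12 t\right) G\left(N + 1 - \tfrac12 t\right)}. \tag{3.10}$$
Theorem 3.2. For any A(N) ≫ ln N, and a < b < 0,
$$\lim_{N\to\infty} \frac{1}{A} \ln \mathbb{P}\left( \frac{\operatorname{Re}\ln Z(0)}{A} \in (a, b) \right) = b. \tag{3.11}$$
Also, for any a < b < −1/2,
$$\lim_{N\to\infty} \frac{1}{\ln N} \ln \mathbb{P}\left( \frac{\operatorname{Re}\ln Z(0)}{\ln N} \in (a, b) \right) = b + 1/4. \tag{3.12}$$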
Proof. From Theorem 3.9 we have that if $\limsup_{N\to\infty} x/\ln N < -1/2$, then
$$p(x) \sim e^{x} \exp\left( 3\zeta'(-1) + \tfrac{1}{12} \ln 2 - \tfrac12 \ln \pi \right) N^{1/4}, \tag{3.13}$$
where p(x) is the probability density function of Re ln Z(0). Therefore, for a < b < −1/2,
$$\mathbb{P}\left( \frac{\operatorname{Re}\ln Z(0)}{\ln N} \in (a, b) \right) = \int_{a \ln N}^{b \ln N} p(x)\, dx \sim \exp\left( 3\zeta'(-1) + \tfrac{1}{12} \ln 2 - \tfrac12 \ln \pi \right) N^{1/4} \left( N^{b} - N^{a} \right) \tag{3.14}$$
and the result follows from taking logarithms of both sides. Similarly for A(N) ≫ ln N with a < b < 0.
3.1. Large deviations at the scaling A = N.
Since Re ln Z(0) ≤ N ln 2 and |Im ln Z(0)| ≤ Nπ/2, the scaling A = N is the maximal non-trivial scaling.
Theorem 3.3. The sequence Re ln Z(0)/N satisfies the LDP with speed N² and rate function given by the convex dual of
$$\Lambda(s) = \begin{cases} \tfrac12 (1+s)^2 \ln(1+s) - \left(1 + \tfrac12 s\right)^2 \ln\left(1 + \tfrac12 s\right) - \tfrac14 s^2 \ln 2s & \text{for } s \ge 0 \\ \infty & \text{for } s < 0. \end{cases} \tag{3.15}$$
Proof. ln E exp(sN Re ln Z(0)) = ln M_N(Ns), the asymptotics of which are given in Appendix C, and so
$$\Lambda(s) = \lim_{N\to\infty} \frac{1}{N^2} \ln M_N(Ns) \tag{3.16}$$
$$= \tfrac12 (1+s)^2 \ln(1+s) - \left(1 + \tfrac12 s\right)^2 \ln\left(1 + \tfrac12 s\right) - \tfrac14 s^2 \ln 2s \tag{3.17}$$
for s ≥ 0, and Λ(s) = ∞ for s < 0. If x > 0, then Theorem 3.1 implies that the rate function, I(x), is given by the convex dual of Λ(s). If x < 0, then Theorem 3.2 implies that I(x) = 0. Thus for x ∈ R, I(x) is given by the convex dual of Λ(s), and this completes the proof of Theorem 3.3.
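The limit (3.16) and (3.17) can be checked numerically at finite N. The sketch below assumes the product form M_N(s) = ∏_{j=1}^{N} Γ(j)Γ(j+s)/Γ(j+s/2)², which follows from (3.8) by repeated use of the functional equation G(z+1) = Γ(z)G(z); the helper names and the values of N and s are illustrative choices.

```python
import math

def log_moment(n_dim, s):
    """ln M_N(s) via the Gamma-function product obtained from (3.8) and G(z+1) = Gamma(z) G(z)."""
    return sum(
        math.lgamma(j) + math.lgamma(j + s) - 2.0 * math.lgamma(j + s / 2.0)
        for j in range(1, n_dim + 1)
    )

def limit_cgf(s):
    """Lambda(s) from (3.15) for s > 0."""
    return (
        0.5 * (1 + s) ** 2 * math.log(1 + s)
        - (1 + s / 2.0) ** 2 * math.log(1 + s / 2.0)
        - 0.25 * s * s * math.log(2 * s)
    )

n_dim, s = 400, 1.0                       # illustrative values (assumption)
finite_n_value = log_moment(n_dim, n_dim * s) / n_dim ** 2
limit_value = limit_cgf(s)
```

At N = 400 the scaled logarithm of the moment generating function is already close to Λ(1) ≈ 0.3007. As a spot check of the product form, N = 1 and s = 2 gives M_1(2) = Γ(3)/Γ(2)² = 2, which is E|1 − e^{iθ}|².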
One can also obtain an LDP for the imaginary part:
Theorem 3.4. The sequence Im ln Z(0)/N satisfies the LDP with speed N² and rate function given by the convex dual of
$$\Lambda(t) = \tfrac18 t^2 \ln\left(1 + \frac{4}{t^2}\right) - \tfrac12 \ln\left(1 + \tfrac14 t^2\right) + t \arctan\frac{t}{2}. \tag{3.18}$$
Proof. ln E exp(tN Im ln Z(0)) = ln L_N(−iNt), and the asymptotics (given in Appendix D) imply that
$$\Lambda(t) = \lim_{N\to\infty} \frac{1}{N^2} \ln L_N(-iNt) \tag{3.19}$$
$$= \tfrac18 t^2 \ln\left(1 + \frac{4}{t^2}\right) - \tfrac12 \ln\left(1 + \tfrac14 t^2\right) + t \arctan\frac{t}{2}. \tag{3.20}$$
Theorem 3.1 implies that J (y), the rate function, is given by the convex dual of ;(t), for all y ∈ R. 3.2. Moderate Deviations. At other scalings, one finds that the rate function is either quadratic or linear. √ Theorem 3.5. For scalings ln N A N , the sequence Re ln Z(0)/A satisfies the LDP with speed B = −A2 /W−1 (−A/N ) (where W−1 is Lambert’s W -function, described in Appendix B) and rate function given by √ x2 if ln N A ln N x ≥ −1/2 x2 if A = ln N I (x) = −x − 1/4 x < −1/2 (3.21) 2 x ≥ 0 x if ln N A N. x<0 0 Proof. For a given scaling sequence A(N ) we wish to find B(N ) such that lim
N→∞
1 ln MN (sB/A) B
exists as a non-trivial pointwise limit. For χ (N ) 1 as N → ∞, we have for each s, 2 1 2 N 2 ln χ N 1 s + O s 2 Bχ χ2 ln MN (sN/χ ) = 4 B ∞
(3.22)
if N s/χ > −1 if N s/χ ≤ −1
(3.23)
which follows from results summarized in Appendix C. Therefore a non-trivial limit of (3.22) occurs if B = N 2 ln χ /χ 2 , where χ = N A/B, that is, if B=
A2
A −W−1 − N
.
(3.24)
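The two conditions $B = N^2\ln\chi/\chi^2$ and $\chi = NA/B$ combine to $\chi = (N/A)\ln\chi$ and $B = A^2/\ln\chi$, which is what (3.24) expresses through $W_{-1}$. A minimal sketch, assuming $N/A > e$ so that a root exists, solves this by fixed-point iteration instead of calling a Lambert W routine; the function name and the sample values of $N$ and $A$ are illustrative only.

```python
import math


def chi_and_speed(N, A, iterations=200):
    """Return (chi, B) with chi = (N/A) ln(chi) and B = A^2 / ln(chi).

    The iteration targets the larger root of chi = (N/A) ln(chi); near that
    root the map has slope 1/ln(chi) < 1, so it contracts.
    Assumes N/A > e, otherwise no real root exists.
    """
    ratio = N / A
    if ratio <= math.e:
        raise ValueError("need N/A > e for a real solution")
    chi = float(N)  # starting value above the larger root
    for _ in range(iterations):
        chi = ratio * math.log(chi)
    return chi, A * A / math.log(chi)


if __name__ == "__main__":
    N, A = 1.0e6, 100.0
    chi, B = chi_and_speed(N, A)
    print("chi =", chi, " B =", B, " A^2/ln N =", A * A / math.log(N))
```

Comparing the printed $B$ with $A^2/\ln N$ gives a feel for how slowly the two approach each other as $A/\ln N$ decreases.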
Note that the restriction $\chi \to \infty$ implies $A \ll N$, and that the restriction that $B \to \infty$ implies $A \gg \sqrt{\ln N}$. If we set $\delta = \liminf_{N\to\infty} \chi/N$, then we have
\[
\Lambda(s) = \lim_{N\to\infty} \frac{1}{B}\ln M_N(sB/A) \tag{3.25}
\]
\[
= \begin{cases} \tfrac14 s^2 & \text{for } s > -\delta, \\ \infty & \text{for } s < -\delta. \end{cases} \tag{3.26}
\]
If $\sqrt{\ln N} \ll A \ll \ln N$ then $\delta = +\infty$ and Theorem 3.1 implies that $I(x) = x^2$ for all $x \in \mathbb{R}$. If $A = \ln N$, then $\delta = 1/2$, and Theorem 3.1 applies only for $x > -1/2$, where we have $I(x) = x^2$. However, since $B \sim \ln N$ at this scaling, Theorem 3.2 implies that, for $x < -1/2$, $I(x) = |x| - 1/4$. Finally, if $\ln N \ll A \ll N$, then $\delta = 0$, and $I(x) = x^2$ for $x > 0$ by Theorem 3.1 and $I(x) = 0$ for $x < 0$ by Theorem 3.2 (since $B \ll A$ for $A \gg \ln N$). This completes the proof of Theorem 3.5.

Remark. For all $\sqrt{\ln N} \ll A \ll N$ it turns out that $I(x)$ is the convex dual of $\Lambda(s)$.

Once again, a similar result is true for the imaginary part, but this time the rate function is always quadratic.

Theorem 3.6. For scalings $\sqrt{\ln N} \ll A \ll N$, the sequence $\mathrm{Im}\ln Z(0)/A$ satisfies the LDP with speed $B = -A^2/W_{-1}(-A/N)$ and rate function $J(y) = y^2$.

Proof. For a given scaling sequence $A(N)$ we wish to find $B(N)$ such that
\[
\lim_{N\to\infty} \frac{1}{B}\ln L_N(-itB/A) \tag{3.27}
\]
exists as a non-trivial pointwise limit. Applying results from Appendix D we have
\[
\ln L_N(-itB/A) = \tfrac14 t^2\, \frac{N^2\ln\chi}{\chi^2} + O_t\!\left(\frac{N^2}{\chi^2}\right) \tag{3.28}
\]
for all $t \in \mathbb{R}$. So, as in the proof of Theorem 3.5, we need $B$ to be as in (3.24) (which will be valid for $\sqrt{\ln N} \ll A \ll N$), and the rate function will be given by the convex dual of $\tfrac14 t^2$, i.e. $J(y) = y^2$.

3.3. Large deviations of ln Z(θ) evaluated at distinct points.

Theorem 3.7. For $\sqrt{\ln N} \ll A \ll \ln N$, and for any $r_1, \ldots, r_k$ (distinct), the sequence
\[
\left(\mathrm{Re}\ln Z(r_1)/A,\ \mathrm{Im}\ln Z(r_1)/A,\ \ldots,\ \mathrm{Re}\ln Z(r_k)/A,\ \mathrm{Im}\ln Z(r_k)/A\right) \tag{3.29}
\]
satisfies the LDP in $(\mathbb{R}^2)^k$ with speed $B = A^2/\ln N$ and rate function
\[
I(x_1, y_1, \ldots, x_k, y_k) = \sum_{j=1}^{k} \left(x_j^2 + y_j^2\right). \tag{3.30}
\]
Proof. By Theorem 2.3, if $B/A \ll 1$,
\[
\ln \mathbb{E}\exp\left(\sum_{j=1}^{k} \left[ s_j\,\mathrm{Re}\ln Z(r_j)\,B/A + t_j\,\mathrm{Im}\ln Z(r_j)\,B/A \right]\right) \sim \frac{B^2\ln N}{A^2} \sum_{j=1}^{k} \left(s_j^2 + t_j^2\right)/4, \tag{3.31}
\]
so choosing the speed $B = A^2/\ln N$, the stated result follows from a multidimensional analogue of Theorem 3.1 (see, for example, [11]).

Remark. If $B$ is given by (3.24), then for $\sqrt{\ln N} \ll A \ll \ln N$, $B \sim A^2/\ln N$. So for $A$ in this restricted range, this theorem generalizes Theorems 3.5 and 3.6.

From this we can deduce large deviations results for the counting function, using the identity (2.2). For example:

Theorem 3.8. For $\sqrt{\ln N} \ll A \ll \ln N$, and $-\pi < s < t \le \pi$, the sequence $(C_N(s,t) - (t-s)N/2\pi)/A$ satisfies the LDP in $\mathbb{R}$ with speed $B = A^2/\ln N$ and rate function $L(x) = \pi^2 x^2/2$.

3.4. Refined large deviations estimates. By Fourier inversion, the probability density of $\mathrm{Re}\ln Z(0)$ is given by
\[
p(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-iyx} M_N(iy)\,dy, \tag{3.32}
\]
where $M_N(iy) = \mathbb{E}\,e^{iy\,\mathrm{Re}\ln Z(0)}$ is given by (3.8).

Theorem 3.9. If $\limsup_{N\to\infty} x/\ln N < -1/2$, then
\[
p(x) \sim e^{x} \exp\left(3\zeta'(-1) + \tfrac{1}{12}\ln 2 - \tfrac12 \ln\pi\right) N^{1/4}. \tag{3.33}
\]

Proof. We evaluate
\[
\frac{1}{2\pi}\oint_{C} e^{-iyx} M_N(iy)\,dy, \tag{3.34}
\]
where $C$ is the rectangle with vertices $-R$, $R$, $R + i + \epsilon i$, $-R + i + \epsilon i$, for $\epsilon$ a fixed real number subject to $0 < \epsilon < 1$, and let $R \to \infty$. Note that the contour encloses only the simple pole at $y = i$. The asymptotics for $G(x)$ show that the integrals on the sides of the contour vanish as $R \to \infty$, which means
\[
p(x) = i \operatorname*{Res}_{y=i}\left\{ e^{-iyx} M_N(iy) \right\} + E, \tag{3.35}
\]
where
\[
E = \frac{e^{x+\epsilon x}}{2\pi}\int_{-\infty}^{\infty} e^{-itx} M_N(it - 1 - \epsilon)\,dt. \tag{3.36}
\]
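It is not hard to show that
\[
i \operatorname*{Res}_{y=i}\left\{ e^{-iyx} M_N(iy) \right\} \sim e^{x}\exp\left(3\zeta'(-1) + \tfrac{1}{12}\ln 2 - \tfrac12\ln\pi\right) N^{1/4}, \tag{3.37}
\]
and
\[
|E| \le \frac{e^{x+\epsilon x}}{2\pi}\int_{-\infty}^{\infty} \left|M_N(it - 1 - \epsilon)\right| dt \tag{3.38}
\]
\[
\sim \frac{e^{x+\epsilon x}}{\sqrt{\pi}}\, \frac{G^2\left(\tfrac12 - \tfrac12\epsilon\right)}{G(-\epsilon)}\, N^{1/4+\epsilon/2+\epsilon^2/4} (\ln N)^{-1/2}. \tag{3.39}
\]
Thus $|E| \ll e^{x} N^{1/4}$ when
\[
e^{x\epsilon} N^{\epsilon/2+\epsilon^2/4} (\ln N)^{-1/2} \ll 1. \tag{3.40}
\]
Thus the error term can be made subdominant if
\[
\limsup_{N\to\infty} \frac{x}{\ln N} < -\frac12 \tag{3.41}
\]
by choosing
\[
0 < \epsilon < \min\left\{ -2 - 4\limsup_{N\to\infty}\frac{x}{\ln N},\ 1 \right\}, \tag{3.42}
\]
which completes the proof of the theorem.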
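The $N^{1/4}$ growth of the residue in (3.37) can be probed numerically. The sketch below rests on two assumptions that are not stated in this section: the product formula $M_N(s) = \prod_{j=1}^{N}\Gamma(j)\Gamma(j+s)/\Gamma(j+s/2)^2$ from [21] for (3.8), and the resulting closed form $\operatorname{Res}_{s=-1} M_N(s) = G(N+1)\,G(N)\,G(1/2)^2/G(N+1/2)^2$, evaluated through the recurrence $G(z+1) = \Gamma(z)G(z)$ of Appendix A. The numerical value of $\zeta'(-1)$ is hard-coded.

```python
import math

ZETA_PRIME_MINUS_ONE = -0.1654211437004509  # numerical value of zeta'(-1)


def log_residue(N):
    """ln of G(N+1) G(N) G(1/2)^2 / G(N+1/2)^2 (assumed form of the residue).

    Uses ln G(N+1) = sum_{k=1}^{N} ln Gamma(k) and
    ln G(N+1/2) - ln G(1/2) = sum_{k=0}^{N-1} ln Gamma(k + 1/2).
    """
    log_G_N_plus_1 = sum(math.lgamma(k) for k in range(1, N + 1))
    log_G_N = sum(math.lgamma(k) for k in range(1, N))
    log_half_integer_ratio = sum(math.lgamma(k + 0.5) for k in range(N))
    return log_G_N_plus_1 + log_G_N - 2.0 * log_half_integer_ratio


def log_residue_asymptotic(N):
    """Logarithm of the constant and N^{1/4} factor in (3.37), without e^x."""
    return (
        3.0 * ZETA_PRIME_MINUS_ONE
        + math.log(2.0) / 12.0
        - 0.5 * math.log(math.pi)
        + 0.25 * math.log(N)
    )


if __name__ == "__main__":
    for N in (1, 10, 100, 1000):
        print(N, log_residue(N), log_residue_asymptotic(N))
```

At $N = 1$ the exact value is $-\ln\pi$, so the first printed row is a quick sanity check on the product formula.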
Remark. For $x < 0$, it is possible to extend the above argument to include all poles, by integrating over the rectangle with vertices $-R$, $R$, $R + iR$, $-R + iR$, and letting $R \to \infty$ in order to show that
\[
p(x) = \sum_{n=1}^{\infty} e^{(2n-1)x} \operatorname*{Res}_{s=0}\left\{ e^{-sx} M_N(s - (2n-1)) \right\}. \tag{3.43}
\]
The problem with this evaluation of $p(x)$ is that it is hard to evaluate the residues of the non-simple poles in the sum, and when one does so the sum is asymptotic (in $x$) only for $x \ll -\ln N$.

Using Appendix C on the asymptotics of $M_N(t)$, the saddle point method gives

• For $|x| \ll \ln N$,
\[
p(x) \sim \frac{1}{\sqrt{\pi(\ln N + 1 + \gamma)}} \exp\left(\frac{-x^2}{\ln N + 1 + \gamma}\right). \tag{3.44}
\]
(This result was first found in [21] for $x = O(\sqrt{\ln N})$, the central limit theorem.)

• For $\ln N \ll x \ll N^{1/3}$, writing $W$ for $W_{-1}\left(-\frac{4x}{eN}\right)$,
\[
p(x) \sim \frac{1}{\sqrt{\pi}} \exp\left( \frac{-x^2}{-W} + \frac{x^2}{2W^2} - \tfrac{5}{12}\ln(-W) - \tfrac{1}{12}\ln x + \tfrac{1}{12}\ln 2 + \zeta'(-1) \right). \tag{3.45}
\]
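The probability density of $\mathrm{Im}\ln Z(0)$ is given by
\[
q(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-iyx} L_N(y)\,dy, \tag{3.46}
\]
which we note is an even function. Applying the results from Appendix D to calculate $L_N(s)$, the saddle point method gives

• For $|x| \ll \ln N$,
\[
q(x) \sim \frac{1}{\sqrt{\pi(\ln N + 1 + \gamma)}} \exp\left(\frac{-x^2}{\ln N + 1 + \gamma}\right). \tag{3.47}
\]

• For $\ln N \ll |x| \ll \sqrt{N}$, writing $W$ for $W_{-1}\left(-\frac{x}{Ne}\right)$,
\[
q(x) \sim \frac{1}{\sqrt{\pi}} \exp\left( \frac{-x^2}{-W} + \frac{x^2}{2W^2} - \tfrac13\ln(-W) - \tfrac16\ln x + 2\zeta'(-1) \right). \tag{3.48}
\]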
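3.5. Inside the circle. The sequence of spectral measures
\[
S_N = \frac{1}{N}\sum_{n=1}^{N} \delta_{\theta_n} \tag{3.49}
\]
satisfies the LDP in $M_1(\mathbb{T})$ with speed $N^2$ and good convex rate function given by the logarithmic energy functional
\[
B(\nu) = -\int_0^{2\pi}\!\!\int_0^{2\pi} \ln\left|e^{is} - e^{it}\right| \nu(ds)\,\nu(dt). \tag{3.50}
\]
For a proof of this fact, see [16]. In this context, Varadhan's lemma (see, for example, [11]) can be stated as follows.

Theorem 3.10. For any continuous $\phi : M_1(\mathbb{T}) \to \mathbb{R}$ satisfying the condition
\[
\limsup_{N\to\infty} \frac{1}{N^2} \ln \mathbb{E}\, e^{\lambda N^2 \phi(S_N)} < \infty \tag{3.51}
\]
for some $\lambda > 1$, we have
\[
\lim_{N\to\infty} \frac{1}{N^2} \ln \mathbb{E}\, e^{N^2\phi(S_N)} = \sup_{\nu \in M_1(\mathbb{T})} \left\{ \phi(\nu) - B(\nu) \right\}. \tag{3.52}
\]

Now, we can write $\mathrm{Re}\ln Z(0)/N = F_0(S_N)$, where
\[
F_0(\nu) := \int_0^{2\pi} \mathrm{Re}\ln\left(1 - e^{i\theta}\right) \nu(d\theta); \tag{3.53}
\]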
however, $F_0$ is not weakly continuous, and Varadhan's lemma does not apply. Nevertheless, it is interesting to see if it gives the correct answer. That is, does the asymptotic cumulant generating function of Theorem 3.3 satisfy
\[
\Lambda(s) = \sup_{\nu \in M_1(\mathbb{T})} \left\{ sF_0(\nu) - B(\nu) \right\}? \tag{3.54}
\]
If so, this variational formula would contain information about how large deviations for $\mathrm{Re}\ln Z(0)/N$ actually occur. A similar variational problem can be written down for the imaginary part. Unfortunately, we are not able to even formally verify this except in very restricted and degenerate cases. Consider first, for $\epsilon > 0$, the continuous function
\[
F_\epsilon(\nu) := \int_0^{2\pi} \mathrm{Re}\ln\left(1 - e^{-\epsilon}e^{i\theta}\right) \nu(d\theta). \tag{3.55}
\]
Then $\mathrm{Re}\ln Z_\epsilon/N = F_\epsilon(S_N)$, where
\[
Z_\epsilon = \prod_{n=1}^{N} \left(1 - e^{-\epsilon}e^{i\theta_n}\right). \tag{3.56}
\]
Applying Varadhan's lemma, we obtain
\[
\Lambda_\epsilon(s) := \lim_{N\to\infty} \frac{1}{N^2}\ln \mathbb{E}\, e^{Ns\,\mathrm{Re}\ln Z_\epsilon} = \sup_{\nu \in M_1(\mathbb{T})} \left\{ sF_\epsilon(\nu) - B(\nu) \right\}.
\]
It is possible to solve this variational problem in the restricted range $-e^{\epsilon} - 1 \le s \le e^{\epsilon} - 1$, where we obtain:
\[
\Lambda_\epsilon(s) = \tfrac14 s^2 \ln\left(\frac{1}{1 - e^{-2\epsilon}}\right). \tag{3.57}
\]
Outside this range, it is much harder to solve. Note that, letting $\epsilon \to 0$, we formally obtain $\Lambda(s) = \infty$ for $-2 \le s < 0$, which agrees (in this very restricted range) with the $\Lambda(s)$ of Theorem 3.3. Similarly, for
\[
G_\epsilon(\nu) := \int_0^{2\pi} \mathrm{Im}\ln\left(1 - e^{-\epsilon}e^{i\theta}\right) \nu(d\theta), \tag{3.58}
\]
we get that for $|t| \le e^{\epsilon} - e^{-\epsilon}$,
\[
\lim_{N\to\infty} \frac{1}{N^2}\ln \mathbb{E}\, e^{Nt\,\mathrm{Im}\ln Z_\epsilon} = \tfrac14 t^2 \ln\left(\frac{1}{1 - e^{-2\epsilon}}\right), \tag{3.59}
\]
so letting $\epsilon \to 0$ all we could possibly obtain is $\Lambda(0) = 0$. In both cases, the problem (of extending $s$ and $t$ beyond the ranges given) comes from finding the maximum over the set of all non-negative functions; only within the ranges given does the infimiser lie away from the boundary of this set.
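Finally, we remark that
\[
\mathrm{Re}\ln\left(1 - e^{-\epsilon}e^{i\theta_n}\right) = \sum_{\substack{k=-\infty \\ k \ne 0}}^{\infty} \frac{-e^{-|k|\epsilon}}{2|k|}\, e^{ik\theta_n}, \tag{3.60}
\]
and
\[
\mathrm{Im}\ln\left(1 - e^{-\epsilon}e^{i\theta_n}\right) = \sum_{\substack{k=-\infty \\ k \ne 0}}^{\infty} \frac{i e^{-|k|\epsilon}}{2k}\, e^{ik\theta_n}, \tag{3.61}
\]
so Szegő's theorem implies that both $\mathrm{Re}\ln Z_\epsilon$ and $\mathrm{Im}\ln Z_\epsilon$ converge in distribution to normal random variables, with mean 0 and variance $-\tfrac12\ln\left(1 - e^{-2\epsilon}\right)$. Note the lack of $\sqrt{\ln N}$ normalization, as required in the case $\epsilon = 0$.

3.6. The phase transition. The phase transition of Theorem 3.5 can be understood in terms of how deviations to the left for the real part actually occur, given that they do occur: here we present some heuristic arguments. For $\sqrt{\ln N} \ll A \ll \ln N$ and $B = A^2/\ln N$, we have ($x > 0$)
\[
\frac{1}{B}\ln P(\mathrm{Re}\ln Z(0) < -xA) \sim -x^2. \tag{3.62}
\]
On the other hand, if $A \gg \ln N$,
\[
\frac{1}{A}\ln P(\mathrm{Re}\ln Z(0) < -xA) \sim -x. \tag{3.63}
\]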
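The linear rate in (3.63) has the same form as the tail of a single factor $\ln|1 - e^{i\theta}|$ with $\theta$ uniform on the circle. Since $|1 - e^{i\theta}| = 2|\sin(\theta/2)|$, that tail is available in closed form, $P(\ln|1 - e^{i\theta}| < -y) = (2/\pi)\arcsin(e^{-y}/2)$ for $y \ge -\ln 2$. The following sketch (function names are illustrative) evaluates it and the ratio $\ln P / y$, which tends to $-1$.

```python
import math


def single_factor_tail(y):
    """P(ln|1 - e^{i*theta}| < -y) for theta uniform on [0, 2*pi), y >= -ln 2.

    The event is |sin(theta/2)| < e^{-y}/2, which has probability
    (2/pi) * arcsin(e^{-y}/2).
    """
    c = min(1.0, 0.5 * math.exp(-y))  # clamp guards against rounding at y = -ln 2
    return (2.0 / math.pi) * math.asin(c)


def decay_rate(y):
    """ln P / y, which approaches -1 as y grows."""
    return math.log(single_factor_tail(y)) / y


if __name__ == "__main__":
    for y in (1.0, 5.0, 20.0, 50.0):
        print(y, single_factor_tail(y), decay_rate(y))
```

For large $y$ the probability behaves like $e^{-y}/\pi$, so the correction to the rate is only $-\ln\pi / y$.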
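Fix $\epsilon > 0$ and consider the lower bound
\[
P(\mathrm{Re}\ln Z(0) < -xA) \ge P\left( \ln\left|1 - e^{i\theta_1}\right| < -(x+\epsilon)A,\ \sum_{n=2}^{N} \ln\left|1 - e^{i\theta_n}\right| < \epsilon A \right). \tag{3.64}
\]
Assuming the two events on the right hand side are approximately independent, and using the facts that $\theta_1$ is uniformly distributed on $\mathbb{T}$ and
\[
P\left( \sum_{n=2}^{N} \ln\left|1 - e^{i\theta_n}\right| < \epsilon A \right) \to 1, \tag{3.65}
\]
this yields, for $A \gg \ln N$, the lower bound
\[
\liminf_{N\to\infty} \frac{1}{A}\ln P(\mathrm{Re}\ln Z(0) < -xA) \ge \liminf_{N\to\infty} \frac{1}{A}\ln P\left( \ln\left|1 - e^{i\theta_1}\right| < -(x+\epsilon)A \right) \tag{3.66}
\]
\[
= -(x+\epsilon); \tag{3.67}
\]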
since $\epsilon$ is arbitrary, we obtain
\[
\liminf_{N\to\infty} \frac{1}{A}\ln P(\mathrm{Re}\ln Z(0) < -xA) \ge -x. \tag{3.68}
\]
On the other hand, if $\sqrt{\ln N} \ll A \ll \ln N$, the same estimate leads to
\[
\liminf_{N\to\infty} \frac{1}{B}\ln P(\mathrm{Re}\ln Z(0) < -xA) \ge -\infty. \tag{3.69}
\]
The fact that this simple estimate gives the right answer when $A \gg \ln N$ suggests that if the deviation
\[
\{\mathrm{Re}\ln Z(0) < -xA\} \tag{3.70}
\]
occurs, it occurs simply because there is an eigenvalue too close to 1 (the other eigenvalues are “free to follow their average behaviour”). This is what we mean by a local conspiracy. The fact that it leads to a gross underestimate when $\sqrt{\ln N} \ll A \ll \ln N$ suggests that in this case the deviation must involve the cooperation of many eigenvalues. A similar estimate based on only a (fixed) finite number of eigenvalues deviating from their mean behaviour leads to a similarly gross underestimate. Clearly it is more efficient in this case for many eigenvalues to arrange themselves and “share the load”, so to speak, than it is for one to bear it alone.

A. Barnes’ G-Function

Barnes’ G-function is defined [3] for all $z$ by
\[
G(z+1) = (2\pi)^{z/2} \exp\left(-\tfrac12\left(z^2 + \gamma z^2 + z\right)\right) \prod_{n=1}^{\infty} \left(1 + \frac{z}{n}\right)^{n} e^{-z + z^2/2n}, \tag{A.1}
\]
where $\gamma = 0.5772\ldots$ is Euler's constant. The G-function has the following properties [3, 30]:

Recurrence relation: $G(z+1) = \Gamma(z)G(z)$.

Complex conjugation: $G^*(z) = G(z^*)$.

Asymptotic formula, valid for $|z| \to \infty$ with $|\arg(z)| < \pi$,
\[
\ln G(z+1) \sim z^2\left(\tfrac12\ln z - \tfrac34\right) + \tfrac12 z\ln 2\pi - \tfrac{1}{12}\ln z + \zeta'(-1) + O\left(\frac{1}{z}\right). \tag{A.2}
\]

Taylor expansion for $|z| < 1$,
\[
\ln G(z+1) = \tfrac12(\ln 2\pi - 1)z - \tfrac12(1+\gamma)z^2 + \sum_{n=3}^{\infty} (-1)^{n-1}\zeta(n-1)\frac{z^n}{n}. \tag{A.3}
\]

Special values: $G(1) = 1$ and $G(1/2) = e^{3\zeta'(-1)/2}\,\pi^{-1/4}\,2^{1/24}$.

$G(z+1)$ has zeros at $z = -n$ of order $n$, where $n = 1, 2, \ldots$.
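At positive integers the recurrence relation and $G(1) = 1$ give $G(m) = \prod_{k=1}^{m-1}\Gamma(k)$, which offers a simple way to test the asymptotic formula (A.2). The sketch below uses only the standard library; the function names are illustrative and the value of $\zeta'(-1)$ is hard-coded.

```python
import math

ZETA_PRIME_MINUS_ONE = -0.1654211437004509  # numerical value of zeta'(-1)


def log_barnes_g_at_integer(m):
    """ln G(m) for integer m >= 1, from G(1) = 1 and G(z+1) = Gamma(z) G(z)."""
    return sum(math.lgamma(k) for k in range(1, m))


def log_barnes_g_asymptotic(z):
    """Right-hand side of (A.2) for ln G(z+1), without the O(1/z) term."""
    return (
        z * z * (0.5 * math.log(z) - 0.75)
        + 0.5 * z * math.log(2.0 * math.pi)
        - math.log(z) / 12.0
        + ZETA_PRIME_MINUS_ONE
    )


if __name__ == "__main__":
    for z in (5, 50, 500):
        exact = log_barnes_g_at_integer(z + 1)
        print(z, exact, log_barnes_g_asymptotic(z), exact - log_barnes_g_asymptotic(z))
```

The printed differences shrink quickly with $z$, which is consistent with the error term in (A.2).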
Logarithmic differentiation can be written in terms of the polygamma functions, $\Psi^{(n)}(z)$,
\[
\frac{d^{n+1}}{dz^{n+1}} \ln G(z) = \Phi^{(n)}(z) \tag{A.4}
\]
and
\[
\Phi^{(0)}(z) = \tfrac12\ln 2\pi - z + \tfrac12 + (z-1)\Psi^{(0)}(z). \tag{A.5}
\]
See, for example, [1] for properties of the gamma and polygamma functions.

B. Lambert’s W-Function

The Lambert W-function (sometimes called the Omega function) is defined to be the solution of
\[
W(x)e^{W(x)} = x. \tag{B.1}
\]
It has a branch point at $x = 0$, and is double real-valued for $-e^{-1} < x < 0$. The unique branch that is analytic at the origin is called the principal branch. It is real in the domain $-e^{-1} < x < \infty$, with a range $-1$ to $\infty$. The second real branch is referred to as the $-1$ branch, denoted $W_{-1}$. It is real in the domain $-e^{-1} < x < 0$, with a range $-\infty$ to $-1$. The equation
\[
\ln x = v x^{\beta} \tag{B.2}
\]
has solution
\[
x = \exp\left(\frac{-W(-\beta v)}{\beta}\right). \tag{B.3}
\]
There are various asymptotic expansions of the W function:

• As $x \to \infty$,
\[
W_0(x) \sim \ln x - \ln\ln x + \frac{\ln\ln x}{\ln x}. \tag{B.4}
\]

• As $x \to 0$ on the principal branch,
\[
W_0(x) \sim x - x^2 + \tfrac32 x^3. \tag{B.5}
\]

• As $x \to 0^-$ on the $-1$ branch,
\[
W_{-1}(x) \sim \ln|x| - \ln|\ln|x|| + \frac{\ln|\ln|x||}{\ln|x|}. \tag{B.6}
\]
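For readers who want to evaluate $W_{-1}$ without a special-function library, one possible approach is Newton iteration on the logarithm of (B.1), started from the first two terms of (B.6). This is a sketch under that choice of method, not a reference implementation; `lambert_w_minus1` and `solve_log_power` are hypothetical names.

```python
import math


def lambert_w_minus1(x, tol=1e-14, max_iter=100):
    """Real branch W_{-1}(x) for -1/e < x < 0.

    On this branch w < -1, so (B.1) is equivalent to
    g(w) = ln(-w) + w - ln(-x) = 0, which is solved by Newton iteration.
    """
    if not (-1.0 / math.e < x < 0.0):
        raise ValueError("W_{-1} is real only for -1/e < x < 0")
    log_abs_x = math.log(-x)
    # Starting guess: first two terms of (B.6), kept strictly below -1
    # so that the derivative 1 + 1/w does not vanish.
    w = min(log_abs_x - math.log(-log_abs_x), -1.0 - 1e-12)
    for _ in range(max_iter):
        step = (math.log(-w) + w - log_abs_x) / (1.0 + 1.0 / w)
        w -= step
        if abs(step) <= tol * abs(w):
            break
    return w


def solve_log_power(v, beta):
    """Solution of ln x = v * x**beta given by (B.3), using the -1 branch."""
    return math.exp(-lambert_w_minus1(-beta * v) / beta)


if __name__ == "__main__":
    w = lambert_w_minus1(-1.0e-4)
    print("W_{-1}(-1e-4) =", w, " residual:", w * math.exp(w) + 1.0e-4)
    x = solve_log_power(0.01, 2.0)
    print("x =", x, " ln x - v x^beta =", math.log(x) - 0.01 * x ** 2)
```

Because $g$ is increasing and concave for $w < -1$, the iterates after the first step approach the root from the left, which keeps the logarithm well defined throughout.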
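C. Asymptotics of ln M_N(x)

From the asymptotics for the G-function, (A.2), we have for $x > -1$,
\[
\begin{aligned}
\ln M_N(x) ={}& 2\ln G\left(1+\tfrac12 x\right) - \ln G(1+x) - \tfrac38 x^2 + \tfrac12 N^2\ln N \\
&+ \tfrac12 (N+x)^2\ln(N+x) - \left(N+\tfrac12 x\right)^2 \ln\left(N+\tfrac12 x\right) \\
&+ \tfrac16 \ln\left(N+\tfrac12 x\right) - \tfrac{1}{12}\ln(N+x) - \tfrac{1}{12}\ln N + O\left(\frac{1}{N}\right),
\end{aligned} \tag{C.1}
\]
where the error term is independent of $x$. This may be simplified if we assume that $x(N)$ is restricted to various regimes:

• If $|x| \ll 1$, then
\[
\ln M_N(x) = \tfrac14 x^2(\ln N + 1 + \gamma) + O\left(x^3\right) + O\left(\frac{1}{N}\right). \tag{C.2}
\]

• If $x = O(1)$ and $x > -1$, then
\[
\ln M_N(x) = \tfrac14 x^2\ln N + 2\ln G\left(1+\tfrac12 x\right) - \ln G(1+x) + O\left(\frac{1}{N}\right). \tag{C.3}
\]

• If $1 \ll x \ll \sqrt[3]{N}$, then
\[
\ln M_N(x) = \tfrac14 x^2\left(\ln N - \ln x - \ln 2 + \tfrac32\right) + \tfrac16\ln 2 - \tfrac{1}{12}\ln x + \zeta'(-1) + O\left(\frac{x^3}{N}\right) + O\left(\frac{1}{x}\right). \tag{C.4}
\]

• If $x = \lambda N$ with $\lambda = O(1)$ and $\lambda > 0$, then
\[
\begin{aligned}
\ln M_N(x) ={}& N^2\left[ \tfrac12(1+\lambda)^2\ln(1+\lambda) - \left(1+\tfrac12\lambda\right)^2\ln\left(1+\tfrac12\lambda\right) - \tfrac14\lambda^2\ln(2\lambda) \right] \\
&- \tfrac{1}{12}\ln N - \tfrac{1}{12}\ln\lambda + \zeta'(-1) + \tfrac16\ln(2+\lambda) - \tfrac{1}{12}\ln(1+\lambda) + O\left(\frac{1}{N}\right).
\end{aligned} \tag{C.5}
\]

D. Asymptotics of ln L_N(ix)

We consider $x \in \mathbb{R}$. From the asymptotics for the G-function, (A.2), we have
\[
\begin{aligned}
\ln L_N(ix) ={}& \ln G\left(1+\tfrac12 ix\right) + \ln G\left(1-\tfrac12 ix\right) - \tfrac38 x^2 + N^2\ln N \\
&- \tfrac12\left(N+\tfrac12 ix\right)^2\ln\left(N+\tfrac12 ix\right) - \tfrac12\left(N-\tfrac12 ix\right)^2\ln\left(N-\tfrac12 ix\right) \\
&- \tfrac16\ln N + \tfrac{1}{12}\ln\left(N+\tfrac12 ix\right) + \tfrac{1}{12}\ln\left(N-\tfrac12 ix\right) + O\left(\frac{1}{N}\right).
\end{aligned} \tag{D.1}
\]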
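Constraining $x(N)$ to lie in various regimes simplifies the above considerably:

• If $|x| \ll 1$, then
\[
\ln L_N(ix) = \tfrac14 x^2(\ln N + 1 + \gamma) + O\left(x^4\right) + O\left(\frac{1}{N}\right). \tag{D.2}
\]

• If $x = O(1)$, then
\[
\ln L_N(ix) = \ln G\left(1+\tfrac12 ix\right) + \ln G\left(1-\tfrac12 ix\right) + \tfrac14 x^2\ln N + O\left(\frac{1}{N}\right). \tag{D.3}
\]

• If $1 \ll |x| \ll \sqrt{N}$, then
\[
\ln L_N(ix) = \tfrac14 x^2\left(\ln N - \ln x + \ln 2 + \tfrac32\right) - \tfrac16\ln x + \tfrac16\ln 2 + 2\zeta'(-1) + O\left(\frac{x^4}{N^2}\right) + O\left(\frac{1}{x^2}\right). \tag{D.4}
\]

• If $x = \lambda N$ with $\lambda = O(1)$, then
\[
\begin{aligned}
\ln L_N(ix) ={}& N^2\left[ \tfrac18\lambda^2\ln\left(1+4\lambda^{-2}\right) - \tfrac12\ln\left(1+\tfrac14\lambda^2\right) + \lambda\tan^{-1}\left(\tfrac12\lambda\right) \right] \\
&- \tfrac16\ln N + \tfrac{1}{12}\ln\left(1+4\lambda^{-2}\right) + 2\zeta'(-1) + O\left(\frac{1}{N}\right).
\end{aligned} \tag{D.5}
\]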
Acknowledgements. We are grateful to Persi Diaconis and Steve Evans for their suggestions and for making the preprint [12] available to us. Thanks also to Harold Widom for helpful correspondence and Marc Yor for fascinating discussions on the connection between Theorem 2.5 and planar Brownian motion.
References 1. Abramowitz, M. and Stegun, I.A.: Handbook of Mathematical Functions. Dover, 1965 2. Baker, T.H. and Forrester, P.J.: Finite-N fluctuations formulas for random matrices. J. Stat. Phys. 88, nos. 5/6, 1371–1386 (1997) 3. Barnes, E.W.: The theory of the G-function. Quart. J. Pure Appl. Math. 31, 264–314 (1899) 4. Basor, E.: Asymptotic formulas for Toeplitz determinants. Trans. Am. Math. Soc. 239, 33–65 (1978) 5. Ben Arous, G. and Guionnet, A.: Large deviations for Wigner’s law and Voiculescu’s non-commutative entropy. Probab. Theor. Rel. Fields 108, no. 4, 517–542 (1997) 6. Ben Arous, G. and Zeitouni, O.: Large deviations from the circular law. ESAIM: Probability and Statistics 2, 123–134 (1998) 7. Böttcher, A. and Silbermann, B.: Introduction to Large Truncated Toeplitz Matrices. Berlin–Heidelberg– New York: Springer-Verlag, 1999 8. Bucklew, J.: Large Deviation Techniques in Decision, Simulation, and Estimation. New York: Wiley Interscience, 1990 9. Coram, M. and Diaconis, P.: New tests of the correspondence between unitary eigenvalues and the zeros of Riemann’s zeta function. Preprint (2000) 10. Costin, O. and Lebowitz, J.L.: Gaussian fluctuation in random matrices. Phys. Rev. Lett. 75, 69–72 (1995) 11. Dembo, A. and Zeitouni, O.: Large Deviations Techniques and Applications, 2nd Ed. Berlin–Heidelberg– New York: Springer-Verlag, 1998 12. Diaconis, P. and Evans, S.N.: Linear functionals of eigenvalues of random matrices. Preprint (2000) 13. Diaconis, P. and Shahshahani, M.: On the eigenvalues of random matrices. J. Appl. Probab. A 31, 49–62 (1994) 14. Guiounnet, A.: Fluctuations for strongly interacting random variables and Wigner’s law. Preprint (1999) 15. Guiounnet, A. and Zeitouni, O.: Concentration of the spectral measure for large matrices. Preprint (2000)
16. Hiai, F. and Petz, D.: A large deviation theorem for the empirical eigenvalue distribution of random unitary matrices. Preprint (2000) 17. Hida, T.: Brownian Motion. Berlin–Heidelberg–New York: Springer, 1980 18. Johansson, K.: On random matrices from the classical compact groups. Ann. Math. 145, 519–545 (1997) 19. Johansson, K.: On fluctuations of eigenvalues of random Hermitian matrices. Duke J. Math. 91, 151–204 (1998) 20. Kasahara, Y. and Kotani, S.: On limit processes for a class of additive functionals of recurrent diffusion processes. Z. Wahrsch. verw. Gebiete. 49, 133–153 (1979) 21. Keating, J.P. and Snaith, N.C.: Random matrix theory and ζ (1/2 + it). Commun. Math. Phys. 214, 57–89 (2000) 22. Laurinˇcikas, A.: Limit Theorems for the Riemann Zeta Function. Dordrecht: Kluwer Academic Publishers, 1996 23. Mehta, M.L.: Random Matrices. New York: Academic Press, 1991 24. Odlyzko, A.M.: The 1020 -th zero of the Riemann zeta function and 175 million of its neighbors. Unpublished 25. Rains, E.: High powers of random elements of compact Lie groups. Prob. Th. Rel. Fields 107, 219–241 (1997) 26. Revuz, D. and Yor, M.: Continuous Martingales and Brownian Motion. Berlin–Heidelberg–New York: Springer-Verlag, 1990 27. Shoshnikov, A.B.: Gaussian fluctuation for the number of particles in Airy, Bessel, sine, and other determinantal random point fields. Preprint (1999) 28. Szegö, G.: Orthogonal Polynomials. AMS Colloquium Publications XXII, 1939 29. Titchmarsh, E.C.: The Theory of the Riemann Zeta-Function. Oxford: Oxford Science Publications, 1986 30. Voros, A.: Spectral functions, special functions and the Selberg zeta function. Commun. Math. Phys. 110, 439–465 (1987) 31. Weyl, H.: Classical Groups. Princeton, NJ: Princeton University Press, 1946 32. Wieand, K.: Eigenvalue Distributions of Random Matrices in the Permutation Group and Compact Lie Groups. PhD Thesis, Harvard University, 1998 33. Wieand, K.: Eigenvalue distributions of random unitary matrices. 
Preprint (2000) Communicated by P. Sarnak
Commun. Math. Phys. 220, 453 – 454 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Erratum
Differential Graded Cohomology and Lie Algebras of Holomorphic Vector Fields Friedrich Wagemann Laboratoire de Mathématiques, Faculté des Sciences et des Techniques, Université de Nantes, 2, rue de la Houssinière, 44322 Nantes Cedex 3, France. E-mail: [email protected] Received: 31 January 2001 / Accepted: 25 April 2001 Commun. Math. Phys. 208, 521–540 (1999)
The author apologizes for introducing an inadequate new object in the above stated article. The sheaf $\mathcal{H}om_{cont}(C_{*,dg}(g), \mathbb{C})$ of continuous sheaf homomorphisms between the sheaf of differential graded chains $C_{*,dg}(g)$ of a sheaf of Lie algebras $g$ and the constant sheaf $\mathbb{C}$ was intended to generalize continuous cochains of Lie algebras to a sheaf setting, but this is probably not possible for cochains with trivial coefficients. The space of global sections $Hom_{cont}(C_{*,dg}(g), \mathbb{C})$ of $\mathcal{H}om_{cont}(C_{*,dg}(g), \mathbb{C})$ is smaller than the space of continuous cochains $C^*(g(X))$ on the Lie algebra $g(X)$. It does not contain evaluations of differential expressions at a point or integrals over differential expressions, but these are important examples of continuous Lie algebra cochains; for example, the Virasoro cocycle is of this type.

Let us explain the foregoing for the sheaf of differentiable vector fields Vect on a finite dimensional compact manifold $X$. Indeed, examples of cochains (i.e. elements of $C^*(g(X))$) are evaluations $D_{x_0}(\xi_1, \ldots, \xi_r) = D(\xi_1, \ldots, \xi_r)(x_0)$ at a point $x_0 \in X$ of some differential expression $D(\xi_1, \ldots, \xi_r)$ in the coefficient functions of $\xi_1, \ldots, \xi_r \in \mathrm{Vect}(X)$. In order to check whether $D_{x_0}$ is a morphism of sheaves, we write down the diagram of restrictions to the open set $U^* = U \setminus \{x_0\}$, where $U$ is an open neighborhood of $x_0$:
\[
\begin{array}{ccc}
\Lambda^r(\mathrm{Vect}(X)) & \xrightarrow{\ D_{x_0}\ } & \mathbb{C} \\
\Big\downarrow {\scriptstyle \mathrm{res}_{\Lambda^r(\mathrm{Vect})}} & & \Big\downarrow {\scriptstyle \mathrm{res}_{\mathbb{C}}} \\
\Lambda^r(\mathrm{Vect}(U^*)) & \xrightarrow{\ \ \phi\ \ } & \mathbb{C}
\end{array}
\]
There cannot exist $\phi$ rendering the diagram commutative, since $\mathrm{res}_{\mathbb{C}} = \mathrm{id}_{\mathbb{C}}$. Thus, the cochain $D_{x_0}$ is not an element of $Hom_{cont}(C_{*,dg}(g), \mathbb{C})$.
A similar argument shows that cochains which are integrals over differential expressions are not elements of $Hom_{cont}(C_{*,dg}(g), \mathbb{C})$. These two examples constitute the most important classes of cochains with values in the trivial module $\mathbb{C}$. Clearly, sheaf approaches are possible (and carried out) in the case of cochains with values in a module which is itself a non-constant sheaf, for example the sheaf of sections of a vector bundle. In conclusion, one has to ban the object $\mathcal{H}om_{cont}(C_{*,dg}(g), \mathbb{C})$ from the setting of the article (in particular, 1.1.8). The complementary Čech approach still remains valid. The spectral sequence for continuous differential graded cohomology (Lemmas 1 to 4) and the cosimplicial version (Sect. 2.3, Theorem 4) still work well, and the main result of the article (Theorem 7) is unaffected. Communicated by M. Aizenman
Commun. Math. Phys. 220, 455 – 488 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Poisson–Lie T-Duality for Quasitriangular Lie Bialgebras E. J. Beggs1 , Shahn Majid2, 1 Department of Mathematics, University of Wales Swansea SA2 8PP, UK 2 School of Mathematical Sciences, Queen Mary, University of London, Mile End Rd, London E1 4NS, UK
Received: 22 August 1999 / Accepted: 4 February 2000
Abstract: We introduce a new 2-parameter family of sigma models exhibiting Poisson–Lie T-duality on a quasitriangular Poisson–Lie group $G$. The models contain previously known models as well as a new 1-parameter line of models having the novel feature that the Lagrangian takes the simple form $L = E(u^{-1}u_+, u^{-1}u_-)$, where the generalised metric $E$ is constant (not dependent on the field $u$ as in previous models). We characterise these models in terms of a global conserved $G$-invariance. The models on $G = SU_2$ and its dual $G^\star$ are computed explicitly. The general theory of Poisson–Lie T-duality is also extended, notably the reduction of the Hamiltonian formulation to constant loops as integrable motion on the group manifold. The approach also points in principle to the extension of T-duality in the Hamiltonian formulation to group factorisations $D = G \bowtie M$, where the subgroups need not be dual or connected to the Drinfeld double.

1. Introduction

Poisson–Lie T-duality has been introduced in [1–3] and other works as a non-Abelian version of T-duality in string theory, based on duality of Lie bialgebras. A motivation (stated in [2]) is quantum group or Hopf algebra duality; this had been introduced as a duality for quantum physics several years previously [4–7], as an “observable-state” duality for certain quantum systems based on group factorisations $D = G \bowtie M$. In one system a particle moves in $G$ under the action of $M$ and its quantum algebra of observables is the bicrossproduct Hopf algebra U (m)C(G); in the dual system the roles of $G$, $M$ are interchanged but its quantum algebra of observables C(M)U (g) has the same physical content with the roles of observables/states and position/momentum interchanged (here g, m are the Lie algebras of $G$, $M$ respectively).
[Author footnote, S. Majid: Reader and Royal Society University Research Fellow at QM and Senior Research Fellow at the Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, UK.]

Indeed, being mutually dual Hopf algebras, the two quantum systems are related to each other by quantum Fourier transform

F : U (m)C(G) → C(M)U (g), (1)
see [8] where this was recently studied in detail for the simplest example (the so-called Planck-scale Hopf algebra C[p]C[x] in [4]). Under this observable-state duality it was shown in [4] that one had inversion of coupling constants as well as connections with Planck-scale physics. At about the same time, Abelian T-duality was introduced in [9] and elsewhere as a momentum-winding mode symmetry in string theory with some similar features. The observable-state duality (1) was not, by contrast, limited in any way to the Abelian case and indeed there was a natural model for every compact simple group $G$ with $M = G^\star$, the Yang–Baxter dual. Here a Lie bialgebra is an infinitesimal version of a Hopf algebra and has a dual $g^\star$, and $G^\star$ is its associated Lie group. It is also the group of dressing transformations [10] in the theory of classical inverse scattering and the solvable group in the Iwasawa decomposition $D = G_{\mathbb{C}} = G \bowtie G^\star$ of the complexification of the compact Lie group $G$, see [7]. Moreover, $D = G \bowtie G^\star$ is the Lie group associated to the Drinfeld double $d(g)$ of $g$ as a Lie bialgebra [11]. The Lie bialgebra structure of $g$ also implies a natural Poisson bracket on $G$ [11]. Further details are in the Preliminaries; see also [12] for an introduction to these topics. These quantum systems U (g)C(G ) with observable-state duality were constructed in [5–7] as one of the two main sources of quantum groups canonically associated to a simple Lie algebra (the other is the more well-known q-deformation of $U(g)$ to quantum groups $U_q(g)$).

The subsequent theory of Poisson–Lie T-duality [1, 3] indeed has many of the same features. One system consists of a sigma model on the group $G$ with a Lagrangian of the form
\[
L = E_u(u^{-1}u_+, u^{-1}u_-), \qquad u : \mathbb{R}^{1,1} \to G,
\]
where $u$ is the field, $u_\pm$ are derivatives in light-cone coordinates and $E_u$ a bilinear form on $g$ but depending on the value of $u$ (a “generalised metric” since $E_u$ need not be symmetric). The dual theory is a sigma-model on $G^\star$ with
\[
\hat{L} = \hat{E}_t(t^{-1}t_+, t^{-1}t_-), \qquad t : \mathbb{R}^{1,1} \to G^\star.
\]
The physical content of the two theories is established to be the same due to the existence of the larger group $D = G \bowtie G^\star$ associated to the Drinfeld double $d(g)$.

In the present paper we extend Poisson–Lie T-duality in several directions, motivated in part by the above connections with quantum groups and observable-state duality. From a physical point of view the main result is as follows: the previously-known models exhibiting Poisson–Lie T-duality require a very special form of the generalised metric $E_u$ depending on $u$ in a rather complicated way (related to the Poisson bracket on $G$). This is in sharp contrast to the usual principal sigma model [13] where the metric is a constant, the Killing form $K$. As a result, Poisson–Lie T-duality would appear to be somewhat artificial and to apply to only certain highly non-linear models where the “metric” in the target group is far from constant. Even the explicit form of $E_u$ is known only in some simple cases such as $g = b_+$, the Borel subalgebra of $su_2$ [2], and the $g = su_2$ case as discussed in [18] and (not very explicitly) in [14]. Our main result is the introduction of a new 2-parameter class of models within the existing general framework for Poisson–Lie T-duality but with much nicer properties. We also provide new computational tools using the theory of Lie bialgebras to compute the models explicitly. We obtain, for
example, the explicit Lagrangians in the SU2 case and its dual in a compact form that we have not found elsewhere (in the SU2 case directly in terms of the matrix-valued fields). These new models require that g is a quasitriangular Lie bialgebra, i.e. defined by an element r ∈ g ⊗ g obeying the so-called modified classical Yang–Baxter equations, [11]. This includes all complex semisimple Lie algebras equipped, for example, with their standard Drinfeld–Sklyanin quasitriangular structure as used in the theory of classical inverse scattering. The quantisations of the associated Poisson bracket on G in these cases include coordinate algebras of the quantum groups Uq (g). This is therefore an important class of models, and we will find quite tractable formulae in this case. We use r not only in the Lie bialgebra structure (which is usual) but again in certain boundary conditions for the graph coordinates in order to cancel their natural u-dependence for the choice of certain parameters. This greater generality allows for a two-parameter family of models associated to this data. Moreover, in this extended parameter space there is a novel line of “nice” models in which Eu = Ee is a constant not dependent at all on u. This line includes at ∞ the standard principal sigma model where Ee = K the Killing form, but at other points has an antisymmetric part built from r itself. In this way one may approach the principal sigma model itself along a line of sigma models exhibiting Poisson–Lie T-duality and of a simple form without additional non-linearities due to a non-constant generalised metric. The dual models are more complicated but at ∞, for example, one obtains an Abelian model as the Poisson–Lie T-dual of the principal sigma model approached in this way (the latter lies on the boundary of the space of models exhibiting T-duality). These results are presented in Sect. 6. 
Also in the paper we consider the Hamiltonian picture of Poisson–Lie T-duality in more detail than we have found elsewhere, but see [15] where the Hamiltonian theory was introduced. This is done in Sect. 3 after the preliminary Sect. 2. Among the modest new results is a more regular expression for the Hamiltonian that covers both the model and the dual model simultaneously. Also new is a study of the symmetries of the theory induced by the left action of D on itself. These are not usually considered because they are not conserved but we show that they do respect the symplectic structure. Moreover, when Eu is constant we show that the action of G ⊂ D is conserved and we compute the conserved charges. A second development in this context, in Sect. 4, is a study of the classical mechanical system on G (say) in the limit of point-like strings (i.e. x-independent solutions). We show that this constraint commutes with the dynamics and we provide the resulting Lagrangian and Hamiltonian systems and the phase space. The left action of D descends to the classical mechanical system and we show that it has a moment map. The conserved charges are computed in the case of constant Eu . The dual model on G equivalent to these point-solutions are not point solutions but extended solutions of a certain special form. We also discuss the quantisation of this classical mechanical system both conventionally and in a manner relevant to the conserved charges. Although these systems appear to be different from the systems U (g )C[G] exhibiting observable-state duality at the Planck-scale [4], we do establish some points of comparison, such as a common phase space. Section 5 contains some further algebraic preliminaries needed for the explicit construction of Eu . We emphasise the matched pair of groups point of view and recover in particular the known formula [3] for the general case Ad∗u (Eu ) = (Ee−1 + (u))−1 ,
where $\Pi$ is the $\mathfrak{g}\otimes\mathfrak{g}$-valued function defining the Poisson-structure on $G$ and $E_e^{-1}$ need not be the standard choice obtained from the Killing form. Although the freedom to choose $E_e^{-1}$ has been known from the start [3], we provide the first general constructions for models based on a different and nonstandard choice for $E_e^{-1}$. In particular, this allows us in Sect. 6 to present our main result: the class of “nice” Poisson–Lie T-dual models based on quasitriangular Lie bialgebras.

Finally, Sect. 7 introduces new “double-Neumann” boundary conditions for the open string and proceeds for these (as well as more trivially for closed strings) to extend the Poisson–Lie T-duality in the Hamiltonian form to general group factorisations $D = G\bowtie M$, where $D$ need no longer be the Lie group of the Drinfeld double $\mathfrak{d}(\mathfrak{g})$ and indeed $\mathfrak{m}$ need not be $\mathfrak{g}^\star$ but could be some quite different Lie algebra, possibly of different dimension. This is directly motivated by the observable-state duality models which exist [5, 12] for any factorisation. It is also motivated by the Adler–Kostant–Symes theorem in classical inverse scattering which works for a general factorisation equipped with an inner product, see [12]. The dynamics are determined, similarly to the conventional bialgebra theory, by the splitting of the Lie algebra of $D$ into orthogonal subspaces but these need no longer be of the same dimension (although only in this case is there a sigma-model interpretation). We also have an action of $D$ by left multiplication on the phase space with the double-Neumann boundary conditions which is useful even for standard Poisson–Lie T-duality based on Lie bialgebras. In particular, it extends to an action of the affine Kac–Moody Lie algebra $\tilde{\mathfrak{d}}$.

Several directions remain for further work. First of all, only some first steps are taken (in Sect. 4) to relate T-duality to observable-state duality (1) in the quantum theory; our long term motivation here is to extend these ideas from particles to loops and hence to formulate T-duality for the full quantum systems as a duality operation on a more general algebraic structure (no doubt more general than Hopf algebras but in the same spirit). This in turn would give insight into the correct algebraic structure for the conjectured “M-theory” about which little is known beyond dualities visible in the Lagrangians at various classical limits. Let us mention only that Poisson–Lie T-duality is connected also with mirror symmetry [16] and indirectly with several other relevant dualities in the theory of strings and branes.

Secondly, there are some interesting examples of the generalisation of Poisson–Lie T-duality in Sect. 7 which exist in principle and should be developed further. Thus, the conformal group on $\mathbb{R}^n$ ($n > 2$) has, locally, a factorisation into the Poincaré group and an $\mathbb{R}^n$ of special conformal translations. The global structure of the factorisation is singular in a similar manner to the “black-hole event-horizon”-like features of the Planck-scale Hopf algebra $C[p]{\triangleright\!\!\!\blacktriangleleft}C[x]$ in [4]. There is also the possibility in our more general setting of a many-sided T-duality (i.e. not only two equivalent theories) associated to more than one factorisation of the same group.

Finally, the emergence of generalised metrics which have both symmetric and antisymmetric parts is a natural feature of noncommutative Riemannian geometry [17] (where symmetry is natural only in the commutative limit). This is a further direction that remains to be explored. Also to be considered is the addition of WZNW terms to render our 2-parameter class of sigma-models conformally invariant as well as the computation of 1-loop or higher quantum effects, cf. [18, 19].
Preliminaries. We recall, see e.g. [12] that a Lie bialgebra is a Lie algebra equipped with δ : g → g ⊗ g, where δ is antisymmetric and obeys the coJacobi identity (so that
Poisson–Lie T-Duality for Quasitriangular Lie Bialgebras
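$\mathfrak{g}^*$ is a Lie algebra) and $\delta[\xi,\eta] = \mathrm{ad}_\xi(\delta\eta) - \mathrm{ad}_\eta(\delta\xi)$ for all $\xi,\eta\in\mathfrak{g}$, where ad extends as a derivation. Next, associated to any Lie bialgebra $\mathfrak{g}$ there is a double Lie algebra $\mathfrak{d} = \mathfrak{g}\bowtie\mathfrak{g}^{*\mathrm{op}}$. This is a double semidirect sum with cross relations $[\phi,\xi] = \phi\triangleright\xi - \phi\triangleleft\xi$, where the actions are mutually coadjoint ones
$$\phi\triangleright\xi = \langle\xi_{[2]},\phi\rangle\,\xi_{[1]},\qquad \phi\triangleleft\xi = \langle\xi,\phi_{[2]}\rangle\,\phi_{[1]},$$
where the angle brackets are the dual pairing of $\mathfrak{g}^*$ with $\mathfrak{g}$ and $\delta(\xi) = \xi_{[1]}\otimes\xi_{[2]}$. Here $\mathfrak{d}$ is quasitriangular and factorisable (see later) and as a result there is an adjoint invariant inner product on $\mathfrak{d}$,
$$\langle\xi\oplus\phi,\ \eta\oplus\psi\rangle = \langle\phi,\eta\rangle + \langle\psi,\xi\rangle. \tag{2}$$
Here $\mathfrak{g}^\star = \mathfrak{g}^{*\mathrm{op}}$ and $\mathfrak{g}$ are maximal isotropic subspaces. We will need this description from [6], which is somewhat more explicit than the usual description in terms of the “Manin triple” in Drinfeld’s work [20].

Given a double cross sum of Lie algebras $\mathfrak{g}\bowtie\mathfrak{m}$, we may at least locally exponentiate to a double cross product of Lie groups $G\bowtie M$. This is given explicitly in [7]. We view the Lie algebra actions as cocycles, exponentiate to Lie group cocycles, view these as flat connections and take the parallel transport operation. The actions can be described by $b(u)\in\mathfrak{g}\otimes\mathfrak{m}^*$ given by $b(u)(\phi) = b_\phi(u) = (\phi\triangleright u)u^{-1}$ and $a(s)\in\mathfrak{g}^*\otimes\mathfrak{m}$ given by $a(s)(\xi) = a_\xi(s) = s^{-1}(s\triangleleft\xi)$. It can be shown that $b\in Z^1_{\mathrm{Ad}\otimes\triangleleft^*}(G,\mathfrak{g}\otimes\mathfrak{m}^*)$ is a cocycle, where the action $\triangleleft^*$ is a left action of $G$ on $\mathfrak{m}^*$ given by dualising the right action $\triangleleft : \mathfrak{m}\times G\to\mathfrak{m}$. Also $a\in Z^1_{\triangleright^*\otimes\mathrm{Ad}_R}(M,\mathfrak{g}^*\otimes\mathfrak{m})$, where $\mathrm{Ad}_R$ is the right adjoint action of $M$ on $\mathfrak{m}$ and $\triangleright^*$ is the right action of $M$ on $\mathfrak{g}^*$ given by dualising its action on $\mathfrak{g}$. These Lie-algebra-valued functions $a$, $b$ generate the vector fields for the action of $\mathfrak{g}$ on $M$ and $\mathfrak{m}$ on $G$ respectively. Thus, $\phi\triangleright u = b_\phi(u)u$, where $\xi u = \tilde\xi$ denotes the right invariant vector field on $G$ generated by $\xi\in\mathfrak{g}$. Similarly, $s\triangleleft\xi = s\,a_\xi(s)$. Once the global actions of $G$ on $M$ and vice-versa are known, the structure of $G\bowtie M$ is such that
$$su = (s\triangleright u)(s\triangleleft u),\qquad \forall u\in G,\ s\in M. \tag{3}$$
This allows every element of the double cross product group $G\bowtie M$ to be uniquely factorised either as $GM$ or as $MG$, and relates the two factorisations.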
2. T-Duality Based on Lie Bialgebras

We begin by giving a version of the standard T-duality based on the Drinfeld double of a Lie bialgebra [2, 3]. We will phrase it slightly differently in terms of double cross products with a view to later generalisation. Thus, there is a double cross product group $D = G\bowtie M$ with Lie algebra $\mathfrak{d} = \mathfrak{g} + \mathfrak{m}$, and an adjoint-invariant bilinear form on $\mathfrak{d}$ which is zero on restriction to $\mathfrak{g}$ and $\mathfrak{m}$. The Lie algebra $\mathfrak{d}$ is the direct sum of two perpendicular subspaces $E_-$ and $E_+$. This means that $\mathfrak{m} = \mathfrak{g}^{*\mathrm{op}}$, that the factorisation is a coadjoint matched pair and that $\mathfrak{d} = D(\mathfrak{g})$, the Drinfeld double of $\mathfrak{g}$, which is the setting that Klimčík et al. assume.

On $\mathbb{R}^2$ we use light cone coordinates $x_+ = t + x$ and $x_- = t - x$, where $t$ and $x$ are the standard time-space coordinates. Now let us suppose that there is a function $k : \mathbb{R}^2\to G\bowtie M$, with the properties that $k_+k^{-1}(x_+,x_-)\in E_-$ and $k_-k^{-1}(x_+,x_-)\in E_+$ for all $(x_+,x_-)\in\mathbb{R}^2$. Then we see that, if we factor $k = us$ for $u\in G$ and $s\in M$,
$$u^{-1}u_\pm + s_\pm s^{-1}\in u^{-1}E_\mp u.$$
If the projection $\pi_{\mathfrak{g}} : \mathfrak{d}\to\mathfrak{g}$ (with kernel $\mathfrak{m}$) is 1-1 and onto when restricted to $u^{-1}E_-u$ and $u^{-1}E_+u$, we can find graph coordinates $E_u : \mathfrak{g}\to\mathfrak{m}$ and $T_u : \mathfrak{g}\to\mathfrak{m}$ so that
$$\{\xi + E_u(\xi) : \xi\in\mathfrak{g}\} = u^{-1}E_+u\quad\text{and}\quad\{\xi + T_u(\xi) : \xi\in\mathfrak{g}\} = u^{-1}E_-u.$$
It follows that $s_-s^{-1} = E_u(u^{-1}u_-)$ and $s_+s^{-1} = T_u(u^{-1}u_+)$. From the identity $(s_+s^{-1})_- - (s_-s^{-1})_+ = [s_-s^{-1}, s_+s^{-1}]$ we deduce that $u(x_+,x_-)$ satisfies the equation
$$\big(T_u(u^{-1}u_+)\big)_- - \big(E_u(u^{-1}u_-)\big)_+ = \big[E_u(u^{-1}u_-),\ T_u(u^{-1}u_+)\big]. \tag{4}$$
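The identity used to reach (4) can be checked in two lines. The following is a sketch only, assuming matrix notation for $s$, equality of mixed partial derivatives, and $(s^{-1})_\pm = -s^{-1}s_\pm s^{-1}$; none of these conventions is spelled out in the text above.

```latex
% Sketch: subscripts +,- denote partial derivatives in x_+ and x_-.
\begin{align*}
(s_+ s^{-1})_- &= s_{+-}\,s^{-1} - s_+ s^{-1} s_- s^{-1},\\
(s_- s^{-1})_+ &= s_{-+}\,s^{-1} - s_- s^{-1} s_+ s^{-1},\\
(s_+ s^{-1})_- - (s_- s^{-1})_+ &= s_- s^{-1} s_+ s^{-1} - s_+ s^{-1} s_- s^{-1}
  = [\,s_- s^{-1},\ s_+ s^{-1}\,].
\end{align*}
% Substituting s_- s^{-1} = E_u(u^{-1}u_-) and s_+ s^{-1} = T_u(u^{-1}u_+) gives (4).
```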
Klimčík shows that the Lagrangian density
$$L = \langle E_u(u^{-1}u_-),\ u^{-1}u_+\rangle \tag{5}$$
gives rise to these equations of motion. The dual theory is given by the factorisation $k = tv$, where $t\in M$ and $v\in G$. If we let $\hat E_t : \mathfrak{m}\to\mathfrak{g}$ and $\hat T_t : \mathfrak{m}\to\mathfrak{g}$ be the graph coordinates of $t^{-1}E_+t$ and $t^{-1}E_-t$ respectively, then $t(x_+,x_-)$ obeys the dual equation
$$\big(\hat T_t(t^{-1}t_+)\big)_- - \big(\hat E_t(t^{-1}t_-)\big)_+ = \big[\hat E_t(t^{-1}t_-),\ \hat T_t(t^{-1}t_+)\big]. \tag{6}$$
These are the equations of motion for a sigma model with Lagrangian
$$\hat L = \langle\hat E_t(t^{-1}t_-),\ t^{-1}t_+\rangle. \tag{7}$$
These two models are different but equivalent descriptions of the model defined by $k$. The $(u, s)$ and $(t, v)$ coordinates are related by the actions of the double cross product group structure:
$$tv = (t\triangleright v)(t\triangleleft v) = us. \tag{8}$$
3. Hamiltonian Formulation of T-Duality

There are two models considered in the last section, the first order equations of motion for $k : \mathbb{R}^2\to G\bowtie M$ and the second order equations of motion for $u : \mathbb{R}^2\to G$. The equations of motion for $k$ are the natural way to introduce duality into the system, and are very nearly equivalent to the equations of motion for $u$. There is not a 1–1 correspondence between the systems, as multiplying $k$ on the right by a constant element of $M$ gives rise to exactly the same $u$. We have a Lagrangian and Hamiltonian for the $u$ equations of motion, and can work out the corresponding Hamiltonian mechanics. However the reader must remember that this will not give the Hamiltonian mechanics for $k$, but rather for $k$ quotiented on the right by constant elements of $M$.

As pointed out by Klimčík, we can take the phase space of the system to be the set of smooth functions $C^\infty(\mathbb{R}, D)$ (or more strictly $C^\infty(\mathbb{R}, D)/M$), where we regard $\mathbb{R}$ to be a constant time line in $\mathbb{R}^{1+1}$, or $C^\infty((0,\pi), D)/M$ for a finite space. We will compute the symplectic structure more explicitly than we have found elsewhere and then obtain a new and more symmetric formulation of the Hamiltonian density that covers both the model and the dual model simultaneously. We will need this in later sections when we generalise to arbitrary factorisations, as well as for the point-like limit.

3.1. The symplectic form. We begin by showing that this is the correct phase space, i.e. that such a function encodes both $u$ and $\dot u$ on a constant time line. Thus, take $k\in C^\infty(\mathbb{R}, D)$ or $C^\infty((0,\pi), D)$. As $k(x)\in D$ we can factor it as $k(x) = u(x)s(x)$, so $u(x)$ is specified on the constant time line. But we also know that
$$s_xs^{-1} = T_u(u^{-1}u_+) - E_u(u^{-1}u_-) = \tfrac12\big(T_u(u^{-1}\dot u) - E_u(u^{-1}\dot u) + T_u(u^{-1}u_x) + E_u(u^{-1}u_x)\big), \tag{9}$$
and as we know $s_xs^{-1}$ and $(T_u + E_u)(u^{-1}u_x)$, we can find $(T_u - E_u)(u^{-1}u_t)$. From this we can in principle find $u^{-1}\dot u$, as the function $\xi\mapsto T_u(\xi) - E_u(\xi)$ is 1-1 (if $\eta$ lay in the kernel of this operator then $\eta + T_u(\eta) = \eta + E_u(\eta)\in u^{-1}(E_+\cap E_-)u = \{0\}$).

If we have a system with coordinates for configuration space $q_i$, and Lagrangian $L(q_i,\dot q_i)$, then the canonical momenta are $p_i = \partial L/\partial\dot q_i$, and we define a symplectic form on the phase space by $\omega = dp_i\wedge dq_i$. With a little thought, it can be seen that this corresponds to the directional derivative formula (where we have taken a Lagrangian density $L$)
$$\omega(u,\dot u;\,a,b;\,c,d) = \int_{x=0}^{\pi}\Big(L''(u,\dot u;\,0,c;\,a,b) - L''(u,\dot u;\,0,a;\,c,d)\Big)\,dx.$$
If we write a change in $k$ as labelled by $y$ we get $k_y = u_ys + us_y$, and likewise for $k_z = u_zs + us_z$. From the last section, we can write the Lagrangian density for our system as
$$4L(u,\dot u) = \langle E_u(u^{-1}\dot u - u^{-1}u_x),\ u^{-1}\dot u + u^{-1}u_x\rangle,$$
so we can calculate a partial derivative
$$\begin{aligned}4L'(u,\dot u;\,0,c) &= \langle E_u(u^{-1}c),\ u^{-1}\dot u + u^{-1}u_x\rangle + \langle E_u(u^{-1}\dot u - u^{-1}u_x),\ u^{-1}c\rangle\\ &= \langle E_u(u^{-1}\dot u) - T_u(u^{-1}\dot u) - E_u(u^{-1}u_x) - T_u(u^{-1}u_x),\ u^{-1}c\rangle,\end{aligned}$$
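so $2L'(u,\dot u;\,0,u_y) = -\langle s_xs^{-1},\ u^{-1}u_y\rangle$, which results in
$$\begin{aligned}2L''(u,\dot u;\,0,u_y;\,u_z,\dot u_z) &= -\langle(s_xs^{-1})_z,\ u^{-1}u_y\rangle + \langle s_xs^{-1},\ u^{-1}u_zu^{-1}u_y\rangle\\ &= -\langle(s_zs^{-1})_x,\ u^{-1}u_y\rangle + \langle[s_xs^{-1}, s_zs^{-1}],\ u^{-1}u_y\rangle + \langle s_xs^{-1},\ u^{-1}u_zu^{-1}u_y\rangle.\end{aligned}$$
Now compare this with the standard 2-form on the loop group of $D$. Consider
$$\begin{aligned}\langle(k^{-1}k_y)_x,\ k^{-1}k_z\rangle &= \langle(s^{-1}s_y)_x + [s^{-1}u^{-1}u_ys,\ s^{-1}s_x] + s^{-1}(u^{-1}u_y)_xs,\ \ s^{-1}s_z + s^{-1}u^{-1}u_zs\rangle\\ &= \langle(s_ys^{-1})_x - [s_xs^{-1}, s_ys^{-1}],\ u^{-1}u_z\rangle + \langle[s_xs^{-1}, s_zs^{-1}],\ u^{-1}u_y\rangle\\ &\quad + \langle s_xs^{-1},\ [u^{-1}u_z, u^{-1}u_y]\rangle + \langle s_zs^{-1},\ (u^{-1}u_y)_x\rangle.\end{aligned}$$
On integration we find
$$\Big[\langle s_zs^{-1},\ u^{-1}u_y\rangle\Big]_{x=0}^{\pi} = \int_{x=0}^{\pi}\Big(\langle(s_zs^{-1})_x,\ u^{-1}u_y\rangle + \langle s_zs^{-1},\ (u^{-1}u_y)_x\rangle\Big)\,dx,$$
so we have the following symplectic form on the phase space:
$$2\omega(k;\,k_z,k_y) = \int_{x=0}^{\pi}\langle(k^{-1}k_y)_x,\ k^{-1}k_z\rangle\,dx - \Big[\langle s_zs^{-1},\ u^{-1}u_y\rangle\Big]_{x=0}^{\pi}. \tag{10}$$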
Now we come to the complication, the fact that this form is degenerate on $C^\infty((0,\pi), D)$. If we take a change in $k\in C^\infty((0,\pi), D)$ given by $k\phi$ for $\phi\in\mathfrak{m}$, then $\omega(k;\,k_z,k\phi) = 0$ for all $k_z$. To remedy this we could remove the null direction by declaring that the phase space would actually be $C^\infty((0,\pi), D)/M$. Equivalently we could consider the phase space to consist of those $k = us\in C^\infty((0,\pi), D)$ for which $s(0)$ is the identity in $M$.

3.2. The Hamiltonian density. The Hamiltonian density generating the time evolution can be calculated by $4H = 4L'(u,\dot u;\,0,\dot u) - 4L(u,\dot u)$, and using our previous result we can write this as
$$\begin{aligned}4H &= -\langle E_u(u^{-1}\dot u - u^{-1}u_x),\ u^{-1}u_x\rangle - \langle 2s_xs^{-1} + E_u(u^{-1}\dot u - u^{-1}u_x),\ u^{-1}\dot u\rangle\\ &= -\langle E_u(u^{-1}\dot u - u^{-1}u_x),\ u^{-1}u_x\rangle - \langle T_u(u^{-1}\dot u) + T_u(u^{-1}u_x),\ u^{-1}\dot u\rangle\\ &= \langle E_u(u^{-1}u_x),\ u^{-1}u_x\rangle - \langle T_u(u^{-1}\dot u),\ u^{-1}\dot u\rangle\\ &= \langle E_u(u^{-1}u_x),\ u^{-1}u_x\rangle + \langle E_u(u^{-1}\dot u),\ u^{-1}\dot u\rangle,\end{aligned}$$
or equivalently
$$8H = \langle(E_u - T_u)(u^{-1}u_x),\ u^{-1}u_x\rangle + \langle(E_u - T_u)(u^{-1}\dot u),\ u^{-1}\dot u\rangle. \tag{11}$$
Using the equation we derived for $s_xs^{-1}$, we can rewrite $\langle(E_u - T_u)(u^{-1}\dot u),\ u^{-1}\dot u\rangle$ as
$$\begin{aligned}&\big\langle(T_u + E_u)(u^{-1}u_x) - 2s_xs^{-1},\ (E_u - T_u)^{-1}\big((T_u + E_u)(u^{-1}u_x) - 2s_xs^{-1}\big)\big\rangle\\ &\quad= -\langle(T_u + E_u)(E_u - T_u)^{-1}(T_u + E_u)(u^{-1}u_x),\ u^{-1}u_x\rangle\\ &\qquad - 4\langle s_xs^{-1},\ (E_u - T_u)^{-1}(T_u + E_u)(u^{-1}u_x)\rangle + 4\langle s_xs^{-1},\ (E_u - T_u)^{-1}(s_xs^{-1})\rangle.\end{aligned}$$
If we observe that
$$\langle(E_u - T_u)(u^{-1}u_x),\ u^{-1}u_x\rangle = \langle(E_u - T_u)(E_u - T_u)^{-1}(E_u - T_u)(u^{-1}u_x),\ u^{-1}u_x\rangle,$$
then we can write
$$\begin{aligned}4H &= -\langle T_u(E_u - T_u)^{-1}E_u(u^{-1}u_x),\ u^{-1}u_x\rangle - \langle E_u(E_u - T_u)^{-1}T_u(u^{-1}u_x),\ u^{-1}u_x\rangle\\ &\quad - 2\langle s_xs^{-1},\ (E_u - T_u)^{-1}(T_u + E_u)(u^{-1}u_x)\rangle + 2\langle s_xs^{-1},\ (E_u - T_u)^{-1}(s_xs^{-1})\rangle.\end{aligned} \tag{12}$$
To simplify this equation we shall first look at the form of the projections to the subspaces $u^{-1}E_+u$ and $u^{-1}E_-u$ in terms of the graph coordinates. If we take $\xi\in\mathfrak{g}$ and $\phi\in\mathfrak{m}$, we can write $\xi + \phi = (w + E_u(w)) + (y + T_u(y))$, where
$$w = (E_u - T_u)^{-1}\phi - (E_u - T_u)^{-1}T_u(\xi),\qquad y = (E_u - T_u)^{-1}E_u(\xi) - (E_u - T_u)^{-1}\phi.$$
Then we can define projections $\pi_{u+}$ and $\pi_{u-}$ to $u^{-1}E_+u$ and $u^{-1}E_-u$ as
$$\pi_{u+}(\xi + \phi) = w + E_u(w)\quad\text{and}\quad\pi_{u-}(\xi + \phi) = y + T_u(y).$$
It follows that
$$(\pi_{u+} - \pi_{u-})\xi = -2E_u(E_u - T_u)^{-1}T_u\xi - (E_u - T_u)^{-1}(T_u + E_u)\xi, \tag{13}$$
$$(\pi_{u+} - \pi_{u-})\phi = 2(E_u - T_u)^{-1}\phi + (T_u + E_u)(E_u - T_u)^{-1}\phi. \tag{14}$$
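The decomposition and the projector formulas (13)–(14) are purely linear-algebraic, so they can be sanity-checked numerically. The sketch below is illustrative only: it assumes a 2-dimensional $\mathfrak{g}$ and $\mathfrak{m}$, arbitrary invertible-difference matrices `E` and `T` (hypothetical values, not taken from any model in the paper), and exact rational arithmetic.

```python
from fractions import Fraction as F

def matmul(a, b):
    # Product of two matrices given as lists of rows.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def inv2(a):
    # Inverse of a 2x2 matrix with Fraction entries.
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def decompose(E, T, xi, phi):
    # w and y such that xi + phi = (w + E w) + (y + T y).
    X = inv2(sub(E, T))
    w = sub(matmul(X, phi), matmul(X, matmul(T, xi)))
    y = sub(matmul(X, matmul(E, xi)), matmul(X, phi))
    return w, y

def projector_difference(E, T, xi, phi):
    # (pi_{u+} - pi_{u-})(xi + phi) as (g-part, m-part), following (13)-(14).
    X = inv2(sub(E, T))
    TE = add(T, E)
    g_part = sub(scale(2, matmul(X, phi)), matmul(X, matmul(TE, xi)))
    m_part = sub(matmul(TE, matmul(X, phi)),
                 scale(2, matmul(E, matmul(X, matmul(T, xi)))))
    return g_part, m_part

# Hypothetical sample data; any E, T with E - T invertible will do.
E = [[F(2), F(1)], [F(0), F(3)]]
T = [[F(-1), F(0)], [F(1), F(-2)]]
xi = [[F(1)], [F(-2)]]
phi = [[F(3)], [F(5)]]

w, y = decompose(E, T, xi, phi)
assert add(w, y) == xi                              # g-components add up
assert add(matmul(E, w), matmul(T, y)) == phi       # m-components add up
g_part, m_part = projector_difference(E, T, xi, phi)
assert g_part == sub(w, y)
assert m_part == sub(matmul(E, w), matmul(T, y))
```

The check relies only on the algebraic identity $E_u(E_u - T_u)^{-1}T_u = T_u(E_u - T_u)^{-1}E_u$, so it does not test the orthogonality of $E_\pm$.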
From this we can rewrite the last equation for the Hamiltonian as
$$4H = \langle(\pi_{u+} - \pi_{u-})(u^{-1}u_x + s_xs^{-1}),\ u^{-1}u_x + s_xs^{-1}\rangle.$$
This can be further simplified by removing the $u$ dependence from the projections. If $\pi_+$ is the projection to $E_+$ with kernel $E_-$, then $\pi_{u+} = \mathrm{Ad}_{u^{-1}}\circ\pi_+\circ\mathrm{Ad}_u$, and since the inner product is adjoint invariant we find
$$4H = \langle(\pi_+ - \pi_-)(u_xu^{-1} + us_xs^{-1}u^{-1}),\ u_xu^{-1} + us_xs^{-1}u^{-1}\rangle \tag{15}$$
or in terms of a combined variable on $D$,
$$4H = \langle(\pi_+ - \pi_-)(k_xk^{-1}),\ k_xk^{-1}\rangle. \tag{16}$$
The equations of motion can similarly be written in terms of $k$ as
$$\dot kk^{-1} = (\pi_- - \pi_+)(k_xk^{-1}). \tag{17}$$
3.3. Symmetries of the models. Returning to the equations of motion in the form $k_\pm k^{-1}\in E_\mp$, it is clear that
$$k\mapsto kd,\qquad d\in D \tag{18}$$
is a global symmetry of the model. This has been discussed in [3]. In addition to this known symmetry we now consider
$$k\mapsto dk,\qquad E_\pm\mapsto dE_\pm d^{-1},\qquad d\in D, \tag{19}$$
which alters the subspaces $E_\pm$ and hence the model. On our phase space picture, where the different subspaces appear as different Hamiltonians, this left translation in $D$ may not preserve the Hamiltonian for a particular model, but rather takes us from one model to another. To have a dynamical symmetry of a particular model we can proceed to restrict to left multiplication by those $d\in D$ such that $dE_\pm d^{-1} = E_\pm$. We distinguish two special cases: (1) The subspaces $E_\pm$ are $G$-invariant, and (2) The subspaces $E_\pm$ are $M$-invariant. In Case (1) we say that the models are $G$-invariant. Then $T_u = T_e$ and $E_u = E_e$ are independent of $u\in G$, and the models themselves are simpler to work with. The actions of $d\in G$ by left translation in terms of the variables of the model and the dual model are
$$(u, s)\mapsto(du, s),\qquad(t, v)\mapsto\big((t^{-1}\triangleleft d^{-1})^{-1},\ (t^{-1}\triangleright d^{-1})^{-1}v\big)$$
respectively. To see if the left translation has a moment map, we consider $k_z = \delta k$ for $\delta\in\mathfrak{d}$ in the equation for the symplectic form:
$$2\omega(k;\,\delta k,k_y) = \int_{x=0}^{\pi}\langle k(k^{-1}k_y)_xk^{-1},\ \delta\rangle\,dx - \Big[\langle s_zs^{-1},\ u^{-1}u_y\rangle\Big]_{x=0}^{\pi}.$$
If $\delta\in\mathfrak{g}$, then $s_z = 0$, so we have the moment map
$$I_\delta(k) = -\frac12\int\langle k_xk^{-1},\ \delta\rangle\,dx,\qquad\delta\in\mathfrak{g}.$$
In terms of the sigma-model on $G$, this is
$$-4I_\delta(u) = \int\langle 2u^{-1}u_x + (T_u - E_u)(u^{-1}\dot u) + (T_u + E_u)(u^{-1}u_x),\ u^{-1}\delta u\rangle\,dx,\qquad\delta\in\mathfrak{g},$$
which is a conserved charge in the $G$-invariant case. The left translations for $\delta\in\mathfrak{m}$ are not in general given by moment maps. There are analogous formulae for the dual model and the $M$-invariant case. We shall return to these symmetries when we have discussed boundary conditions for the models. We shall also study the particular properties of $G$-invariant models in some detail in later sections.
4. Solutions Independent of x

In this section we show that the systems above in the Hamiltonian form have “pointlike” limits where the solutions are restricted so that the field $u$, say, is independent of $x$. This then becomes a system of a classical particle moving on the group manifold of $G$. In the dual picture, i.e. in terms of the variable $t$, the model is far from point-like and instead describes some form of extended object in the manifold $M$. We obtain the Poisson brackets and the Hamiltonian and we study the symmetries, in particular the $G$-invariant case. The dual case where $t$ is pointlike and $u$ extended is identical with the roles of $G$ and $M$ interchanged and is therefore omitted except with regard to the study of this case when the model is $G$-invariant.

4.1. The point-particle Poisson structure. The solutions which have $u(x)$ independent of $x$ are parameterised by initial values of $u\in G$ and $p = s_xs^{-1}\in\mathfrak{m}$. This is because the equation $s_xs^{-1} = (T_u - E_u)(u^{-1}\dot u)/2$ shows that $p$ is also independent of $x$. Therefore the effective phase space coordinates are $(u, p)$ rather than the fields $(u(x), s(x))$ in the general case. The symplectic form per unit length is then
$$2\omega(u,p;\,u_z,p_z;\,u_y,p_y) = \langle p_y,\ u^{-1}u_z\rangle - \langle p_z,\ u^{-1}u_y\rangle + \langle p,\ [u^{-1}u_z, u^{-1}u_y]\rangle,$$
which is closed independently of the pairing used. This can also be written as
$$2\omega(u,p;\,u_z,p_z;\,u_y,p_y) = \langle(upu^{-1})_y,\ u_zu^{-1}\rangle - \langle(upu^{-1})_z,\ u_yu^{-1}\rangle - \langle p,\ [u^{-1}u_z, u^{-1}u_y]\rangle. \tag{20}$$
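We now invert the symplectic form on the phase space $\mathfrak{m}\times G$ to find the Poisson structure. Define $\omega_0 : (\mathfrak{m}\oplus\mathfrak{g})\otimes(\mathfrak{m}\oplus\mathfrak{g})\to\mathbb{R}$ by
$$2\omega_0(p_y\oplus\xi_y,\ p_z\oplus\xi_z) = \langle p_y,\xi_z\rangle - \langle p_z,\xi_y\rangle + \langle p,[\xi_z,\xi_y]\rangle,\qquad\forall p_y,p_z\in\mathfrak{m},\ \xi_y,\xi_z\in\mathfrak{g}.$$
Take a basis $e_i$ of $\mathfrak{g}$ and a dual basis $e^i$ of $\mathfrak{m} = \mathfrak{g}^*$ (for $1\le i\le n$). Then we can take a basis of $\mathfrak{m}\oplus\mathfrak{g}$ as $f_i = e^i$ for $1\le i\le n$ and $f_i = e_{i-n}$ for $n+1\le i\le 2n$. Then in this basis,
$$2\omega_0 = \begin{pmatrix}0 & \mathrm{id}\\ -\mathrm{id} & A\end{pmatrix}\quad\text{and}\quad(2\omega_0)^{-1} = \begin{pmatrix}A & -\mathrm{id}\\ \mathrm{id} & 0\end{pmatrix},$$
where $A_{ij} = \langle p,[e_i,e_j]\rangle$. The corresponding tensor is
$$\frac12\,\omega_0^{-1} = \sum_{1\le i\le n}\big(e_i\otimes e^i - e^i\otimes e_i\big) + \sum_{1\le i,j\le n}\langle p,[e_i,e_j]\rangle\,e^i\otimes e^j.$$
Now, $\omega(u,p;\,u\xi_z,p_z;\,u\xi_y,p_y) = \omega_0(p_z\oplus\xi_z,\ p_y\oplus\xi_y)$ so its inverse, the corresponding Poisson bivector, is given by left translation from $\omega_0^{-1}$,
$$\gamma(p,u) = 2\sum_i\big(\tilde e_i\otimes e^i - e^i\otimes\tilde e_i\big) + 2\delta p, \tag{21}$$
where $\tilde\xi = u\xi$ is the left-invariant vector field generated by $\xi\in\mathfrak{g}$.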
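The block form of $(2\omega_0)^{-1}$ can be verified directly. The sketch below is illustrative only: it assumes $\mathfrak{g} = so(3)$ with structure constants $\epsilon_{ijk}$ (an example chosen here, not one used in the text), so that $A_{ij} = \sum_k\epsilon_{ijk}p_k$, and checks that the two block matrices multiply to the identity.

```python
def levi_civita(i, j, k):
    # Totally antisymmetric symbol on indices 0, 1, 2.
    return (i - j) * (j - k) * (k - i) // 2

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def zeros(n):
    return [[0] * n for _ in range(n)]

def negate(a):
    return [[-x for x in row] for row in a]

def block(a, b, c, d):
    # Assemble the block matrix [[a, b], [c, d]].
    top = [ra + rb for ra, rb in zip(a, b)]
    bottom = [rc + rd for rc, rd in zip(c, d)]
    return top + bottom

def a_matrix(p):
    # A_ij = <p, [e_i, e_j]> for so(3), where [e_i, e_j] = eps_ijk e_k.
    return [[sum(levi_civita(i, j, k) * p[k] for k in range(3))
             for j in range(3)] for i in range(3)]

def two_omega0(p):
    return block(zeros(3), identity(3), negate(identity(3)), a_matrix(p))

def two_omega0_inverse(p):
    return block(a_matrix(p), negate(identity(3)), identity(3), zeros(3))

p = (2, -1, 5)  # hypothetical momentum components
assert matmul(two_omega0(p), two_omega0_inverse(p)) == identity(6)
assert matmul(two_omega0_inverse(p), two_omega0(p)) == identity(6)
```

The same block computation goes through for any antisymmetric $A$, so the choice of $so(3)$ only serves to make the example concrete.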
The Poisson bracket itself then can be described simply for functions $f, g$ on $G$ and $\xi,\eta\in\mathfrak{g} = \mathfrak{m}^*$ by
$$\{f, g\} = 0,\qquad\{\xi, f\} = -2\tilde\xi(f),\qquad\{\xi,\eta\} = 2[\xi,\eta]. \tag{22}$$
From this it is clear that we can quantise the system with the Weyl algebra $C[G]{>\!\!\!\triangleleft}U(\mathfrak{g})$ or at the $C^*$-algebra level $C(G){>\!\!\!\triangleleft}C^*(G)$, where $G$ acts on $G$ by left multiplication.

4.2. The point-particle Hamiltonian. We have shown that $p = s_xs^{-1}$ is independent of $x$, so $s$ is of the form $s = e^{px}a$, where $a\in M$ is also independent of $x$. To find the equations of motion we write $k = ue^{px}a$, where $u\in G$ depends only on time, not on $x$. Then the equation of motion $\dot kk^{-1} = (\pi_- - \pi_+)k_xk^{-1}$ gives
$$\dot uu^{-1} + u\Big(\frac{d}{dt}e^{px}\Big)e^{-px}u^{-1} + ue^{px}\dot aa^{-1}e^{-px}u^{-1} = (\pi_- - \pi_+)(upu^{-1}),$$
which yields, for the case $x = 0$, $u^{-1}\dot u + \dot aa^{-1} = (\pi_{u-} - \pi_{u+})p$, and taking the first order terms in $x$ gives $\dot p = [\dot aa^{-1}, p]$. We can now get rid of the variable $a$ and write the equations of motion in terms of $u$ and $p$ only,
$$u^{-1}\dot u = \pi_{\mathfrak{g}}(\pi_{u-} - \pi_{u+})p,\qquad\dot p = [\pi_{\mathfrak{m}}(\pi_{u-} - \pi_{u+})p,\ p].$$
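The order-by-order expansion in $x$ behind these equations can be written out as follows. This is a sketch in matrix notation, assuming $u$, $p$, $a$ depend on $t$ only; the expansions to first order in $x$ are standard but are not displayed in the text.

```latex
% Sketch: expand the left-hand side of the equation of motion to first order in x.
\begin{align*}
\Big(\tfrac{d}{dt}e^{px}\Big)e^{-px} &= x\,\dot p + O(x^2),\\
e^{px}\,\dot a a^{-1}\,e^{-px} &= \dot a a^{-1} + x\,[p,\ \dot a a^{-1}] + O(x^2).
\end{align*}
% The right-hand side (\pi_- - \pi_+)(u p u^{-1}) is independent of x, so:
\begin{align*}
O(x^0):&\quad \dot u u^{-1} + u\,\dot a a^{-1}\,u^{-1} = (\pi_- - \pi_+)(u p u^{-1})
  \ \Longrightarrow\ u^{-1}\dot u + \dot a a^{-1} = (\pi_{u-} - \pi_{u+})\,p,\\
O(x^1):&\quad \dot p + [p,\ \dot a a^{-1}] = 0
  \ \Longrightarrow\ \dot p = [\dot a a^{-1},\ p].
\end{align*}
% Since u^{-1}\dot u lies in g and \dot a a^{-1} lies in m, projecting the O(x^0)
% equation with \pi_g and \pi_m separates the two terms.
```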
In the constant case, the Hamiltonian per unit length (15) restricts to
$$4H = \langle(\pi_+ - \pi_-)(upu^{-1}),\ upu^{-1}\rangle. \tag{23}$$
We have to check that the restricted Hamiltonian and the restricted symplectic form indeed correspond to these equations of motion, i.e. that the constraint of $x$-independence commutes with the original Hamiltonian. To do this, it will be convenient to first calculate from the equations of motion
$$\frac{d}{dt}(upu^{-1}) = u[(\pi_{u-} - \pi_{u+})p,\ p]u^{-1} = [(\pi_- - \pi_+)(upu^{-1}),\ upu^{-1}],$$
and now we can write
$$\begin{aligned}2\omega(u,p;\,u_z,p_z;\,\dot u,\dot p) &= \langle[upu^{-1},\ (\pi_+ - \pi_-)upu^{-1}],\ u_zu^{-1}\rangle - \langle(upu^{-1})_z,\ \dot uu^{-1}\rangle - \langle upu^{-1},\ [u_zu^{-1}, \dot uu^{-1}]\rangle\\ &= \langle[u_zu^{-1}, upu^{-1}],\ (\pi_+ - \pi_-)upu^{-1}\rangle - \langle(upu^{-1})_z - [u_zu^{-1}, upu^{-1}],\ \dot uu^{-1}\rangle\\ &= \langle(upu^{-1})_z,\ (\pi_+ - \pi_-)upu^{-1}\rangle - \langle up_zu^{-1},\ (\pi_+ - \pi_-)upu^{-1}\rangle - \langle up_zu^{-1},\ \dot uu^{-1}\rangle\\ &= \langle(upu^{-1})_z,\ (\pi_+ - \pi_-)upu^{-1}\rangle - \langle p_z,\ u^{-1}\dot u - (\pi_{u-} - \pi_{u+})p\rangle\\ &= \langle(upu^{-1})_z,\ (\pi_+ - \pi_-)upu^{-1}\rangle + \langle p_z,\ \dot aa^{-1}\rangle = 2H_z,\end{aligned}$$
where we used at the end the equations of motion again, and then that $\langle p_z,\ \dot aa^{-1}\rangle = 0$ as $\mathfrak{m}$ is isotropic. In terms of graph coordinates, we can write the equations of motion as
$$u^{-1}\dot u = -2(E_u - T_u)^{-1}p = 2T_u^{-1}(E_u^{-1} - T_u^{-1})^{-1}E_u^{-1}p, \tag{24}$$
$$\dot p = -[(E_u + T_u)(E_u - T_u)^{-1}p,\ p] = [(E_u^{-1} - T_u^{-1})^{-1}(E_u^{-1} + T_u^{-1})p,\ p], \tag{25}$$
and the Hamiltonian as
$$4H = \langle(\pi_{u+} - \pi_{u-})p,\ p\rangle = 2\langle(E_u - T_u)^{-1}p,\ p\rangle = 2\langle(E_u^{-1} - T_u^{-1})^{-1}E_u^{-1}p,\ E_u^{-1}p\rangle. \tag{26}$$
There is also a “conjugate” description of the system which we mention briefly here. Although only $s_xs^{-1} = p$ is directly needed for solving the $x$-independent equations of motion for the $u$ variable, the rest of the degrees of freedom in $s$ are also an auxiliary part of the system from the point of view of the group $D$. It turns out that one could equally regard $(p, a)$ as phase space variables and solve the system in terms of them, with $u$ regarded as auxiliary. Then the equations of motion would be
$$\dot aa^{-1} = \pi_{\mathfrak{m}}(\pi_{u-} - \pi_{u+})p = -(E_u + T_u)(E_u - T_u)^{-1}p,\qquad\dot p = [\pi_{\mathfrak{m}}(\pi_{u-} - \pi_{u+})p,\ p]. \tag{27}$$
If we work with the phase space $\mathfrak{m}\times M = \mathfrak{g}^\star\times G^\star$, we can more easily compare the system with the classical phase space of the bicrossproduct Hopf algebra $U(\mathfrak{g}){\triangleright\!\!\!\blacktriangleleft}C[G^\star]$ associated to the same factorisation of $D$ in [4]. In fact both the Poisson structures and the natural Hamiltonians look somewhat different, but the general interpretation as a particle on $M = G^\star$ with momentum given by $p\in\mathfrak{g}^\star$ is the same.

4.3. Symmetries of the point-particle system. We now consider which of the translation symmetries of the general theory restrict to the $x$-independent solutions. First of all, the right translation symmetries are not interesting in this case: the right action by $M$ is the identity on our $(u, p)$ coordinates, while the right action by $G$ does not preserve that $u$ is $x$-independent. On the other hand, the left translation symmetries by $d\in D$ do preserve that $u$ is $x$-independent. We compute the Hamiltonian functions for these actions. First of all, for an infinitesimal transformation by $\phi\in\mathfrak{m}$ the variations of $u$, $upu^{-1}$ are
$$u_\phi = \phi\triangleright u,\qquad(upu^{-1})_\phi = [\phi,\ upu^{-1}],$$
and hence (20) yields $2\omega(u,p;\,u_z,p_z;\,u_\phi,p_\phi) = -\langle(upu^{-1})_z,\ \phi\rangle$ for any variation $u_z, p_z$. Hence the Hamiltonian function generating this flow is
$$I_\phi(u,p) = -\frac12\langle upu^{-1},\ \phi\rangle = -\frac12\langle u(p\triangleright u^{-1}),\ \phi\rangle = -\frac12\langle ub_p(u^{-1})u^{-1},\ \phi\rangle,\qquad\forall\phi\in\mathfrak{m}.$$
Similarly, for an infinitesimal left translation generated by $\xi\in\mathfrak{g}$ we have $u_\xi = \xi u$ (the right-invariant vector field generated by $\xi$) and $p_\xi = 0$. In this case we obtain more simply $2\omega(u,p;\,u_z,p_z;\,u_\xi,p_\xi) = -\langle(upu^{-1})_z,\ \xi\rangle$ or the generating function
$$I_\xi(u,p) = -\frac12\langle upu^{-1},\ \xi\rangle = -\frac12\langle p\triangleleft u^{-1},\ \xi\rangle,\qquad\forall\xi\in\mathfrak{g}.$$
The two cases can be combined into a single generating function or moment map
$$I_\delta(u,p) = -\frac12\langle upu^{-1},\ \delta\rangle,\qquad\forall\delta\in\mathfrak{d}. \tag{28}$$
In particular, we see that if the model is $G$-invariant, so that $G$ is a dynamical symmetry, then the projection of $upu^{-1}$ to $\mathfrak{m}$,
$$Q_G = p\triangleleft u^{-1}, \tag{29}$$
is a constant of motion, the conserved charge for the symmetry. Likewise, if the model is $M$-invariant then the projection of $upu^{-1}$ to $\mathfrak{g}$,
$$Q_M = ub_p(u^{-1})u^{-1}, \tag{30}$$
is a constant of motion. The Hamiltonian and the equations of motion also simplify in the $G$-invariant case, namely (24)–(26) with $E_u = E_e$ and $T_u = T_e$. Writing $U = 2(T_e - E_e)^{-1}$, $V = \frac12(E_e + T_e)$, we have
$$u^{-1}\dot u = Up,\qquad\dot p = [VUp,\ p],\qquad 4H = -\langle Up,\ p\rangle. \tag{31}$$
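For completeness, the substitution leading from (24)–(26) to (31) can be spelled out. This is only a restatement of those equations with $E_u = E_e$, $T_u = T_e$ and the definitions of $U$ and $V$ given above.

```latex
% G-invariant case: U = 2(T_e - E_e)^{-1}, V = (E_e + T_e)/2.
\begin{align*}
u^{-1}\dot u &= -2(E_e - T_e)^{-1}p = 2(T_e - E_e)^{-1}p = Up,\\
\dot p &= -\big[(E_e + T_e)(E_e - T_e)^{-1}p,\ p\big]
        = \big[\tfrac12(E_e + T_e)\cdot 2(T_e - E_e)^{-1}p,\ p\big] = [VUp,\ p],\\
4H &= 2\langle (E_e - T_e)^{-1}p,\ p\rangle = -\langle Up,\ p\rangle.
\end{align*}
```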
Thus, the equations of motion decouple in this case; $\dot p$ is a quadratic function of $p$ and $u^{-1}\dot u$ is a linear function of $p$, i.e. $u$ can then be obtained (in principle) by integrating $p(t)$.

4.4. The extended system dual to the point-particle limit. The dual model when $u$ is $x$-independent is described by variables $t, v$ both far from $x$-independent. The dual constraint is one where $t$ is fixed to be $x$-independent, in which case the model in our original $u, s$ description is far from $x$-independent. Rather, it is some form of “extended solution”. We can reverse the order of factorisation $k = ue^{px}a = tv$ to get
$$t^{-1} = (e^{px}a)^{-1}\triangleleft u^{-1}\quad\text{and}\quad v^{-1} = (e^{px}a)^{-1}\triangleright u^{-1}.$$
Here $u$, $p$ and $a$ are functions of $t$ only. It can be seen that $t$ has a modified exponential behaviour in $x$, and that $v$ is a constant acted on by an exponential as a function of $x$. In particular $t$ will not satisfy the Neumann boundary conditions. The Hamiltonian can be written as
$$4H = \langle(\pi_{t+} - \pi_{t-})(t^{-1}t_x + v_xv^{-1}),\ t^{-1}t_x + v_xv^{-1}\rangle,$$
where $\pi_{t\pm}$ are the projections to $t^{-1}E_\pm t$. The constraints on the dual system corresponding to the constant $u$ are that $t\triangleright v$ and $t_xt^{-1} + tv_xv^{-1}t^{-1}$ are independent of $x$.
5. More About Graph Coordinates

In this section we provide some preliminary results on the explicit construction of the graph coordinates of the subspaces $\mathrm{Ad}_{u^{-1}}E_\pm$ in terms of the actions of the groups on the Lie algebras. This is needed, in particular, for the explicit computations for the quasitriangular case in the next section. In fact it will be convenient to consider the inverses of the graph coordinates rather than the graph coordinates themselves, as the formulae are considerably simpler. Thus, given generic $E_\pm$, the subspace $\mathrm{Ad}_{u^{-1}}E_+$ contains elements of the form
$$\mathrm{Ad}_{u^{-1}}(E_e^{-1}(\phi)\oplus\phi) = \mathrm{Ad}_{u^{-1}}(E_e^{-1}(\phi) + b_\phi(u))\oplus\phi\triangleleft u = E_u^{-1}(\phi\triangleleft u)\oplus\phi\triangleleft u,$$
so we deduce that $E_u^{-1}(\phi) = \mathrm{Ad}_{u^{-1}}\big(E_e^{-1}(\phi\triangleleft u^{-1}) + b_{\phi\triangleleft u^{-1}}(u)\big)$. We can write this as
$$\bar E_u^{-1}\equiv\mathrm{Ad}_u\circ E_u^{-1}\circ((\ )\triangleleft u),\qquad\bar E_u^{-1} = E_e^{-1} + b(u). \tag{32}$$
Also observe that $(s\triangleleft u^{-1})\triangleright u = (s\triangleright u^{-1})^{-1}$ for any double cross product group, which implies that $\mathrm{Ad}_{u^{-1}}b_{\phi\triangleleft u^{-1}}(u) = -b_\phi(u^{-1})$ (this is part of the cocycle property for $b$). Hence we can write equivalently
$$E_u^{-1}(\phi) = \mathrm{Ad}_{u^{-1}}\big(E_e^{-1}(\phi\triangleleft u^{-1})\big) - b_\phi(u^{-1}). \tag{33}$$
The same formulae hold for $T$ replacing $E$. If we consider the dual model the subspace $\mathrm{Ad}_{t^{-1}}E_+$ contains elements of the form
$$\mathrm{Ad}_{t^{-1}}(\xi\oplus\hat E_e^{-1}(\xi)) = t^{-1}\triangleright\xi\oplus\mathrm{Ad}_{t^{-1}}(\hat E_e^{-1}(\xi) + a_\xi(t^{-1})) = t^{-1}\triangleright\xi\oplus\hat E_t^{-1}(t^{-1}\triangleright\xi),$$
from which we deduce
$$\hat{\bar E}_t^{-1}\equiv\mathrm{Ad}_t\circ\hat E_t^{-1}\circ(t^{-1}\triangleright(\ )),\qquad\hat{\bar E}_t^{-1} = \hat E_e^{-1} + a(t^{-1}) \tag{34}$$
or equivalently that
$$\hat E_t^{-1}(\xi) = \mathrm{Ad}_{t^{-1}}\big(\hat E_e^{-1}(t\triangleright\xi)\big) - a_\xi(t), \tag{35}$$
similarly for $\hat T$. Note also that $E_e^{-1}(\phi) + \phi\in E_+$ for all $\phi\in\mathfrak{m}$ and since this also characterises $\hat E_e$ (and similarly for $\hat T_e$), we conclude that
$$\hat E_e = E_e^{-1},\qquad\hat T_e = T_e^{-1}. \tag{36}$$
Finally, we specialise to the case of a coadjoint matched pair, i.e. where $\mathfrak{g}$ is a Lie bialgebra and $\mathfrak{m} = \mathfrak{g}^\star$, with $\mathfrak{d} = \mathfrak{g}\bowtie\mathfrak{g}^\star$ the Drinfeld double. This recovers the formulae of [3]. Now, associated to the Lie bialgebra structure is a Poisson–Lie group structure on $G$ defined by the bivector
$$\gamma_G(u) = \tilde\Pi(u),$$
where $\tilde{\ } = R_*$ denotes extension as a left-invariant vector field and $\Pi : G\to\mathfrak{g}\otimes\mathfrak{g}$ is the cocycle $\Pi\in Z^1_{\mathrm{Ad}}(G,\mathfrak{g}\otimes\mathfrak{g})$ extending the Lie cobracket $\delta\in Z^1_{\mathrm{ad}}(\mathfrak{g},\mathfrak{g}\otimes\mathfrak{g})$ (which is the derivative of $\Pi$ at the group identity). Since the action of $\mathfrak{g}^\star$ on $\mathfrak{g}$ in the coadjoint matched pair is just $\delta$ viewed by evaluation against the second factor of its output, the cocycle generator $b$ of its corresponding vector fields on $G$ is just $b = \Pi$ in this case. Also observe that we could equally well have defined $\gamma$ as generated by right-invariant vector fields from some $\Pi^R$, say. Here $\Pi^R(u) = \mathrm{Ad}_{u^{-1}}(\Pi(u)) = -\Pi(u^{-1})$, the last equation by the cocycle condition obeyed by $\Pi$.

To apply these observations to the above we write the operator $E_u^{-1} : \mathfrak{m}\to\mathfrak{g}$ as an evaluation against the second factor of elements $E_u^{-1}\in\mathfrak{g}\otimes\mathfrak{g}$ (we use the same symbols when the meaning is clear). Similarly for $\hat E_t^{-1}$. Then
$$E_u^{-1} = \mathrm{Ad}_{u^{-1}}(E_e^{-1}) - \Pi(u^{-1}) = \mathrm{Ad}_{u^{-1}}(E_e^{-1}) + \Pi^R(u) \tag{37}$$
as elements of $\mathfrak{g}\otimes\mathfrak{g}$. Inverting this defines the Lagrangian for the models,
$$L = \langle E_u(u^{-1}u_-),\ u^{-1}u_+\rangle = E_u(u^{-1}u_+,\ u^{-1}u_-), \tag{38}$$
where in the second expression we view $E_u : \mathfrak{g}\to\mathfrak{m}$ as an evaluation against the second factor of $E_u\in\mathfrak{m}\otimes\mathfrak{m}$. Or in terms of $\bar E_u^{-1} = \mathrm{Ad}_u(E_u^{-1})\in\mathfrak{g}\otimes\mathfrak{g}$, we have [3]
$$\bar E_u^{-1} = E_e^{-1} + \Pi(u), \tag{39}$$
and the Lagrangian is written equally as
$$L = \langle\bar E_u(u_-u^{-1}),\ u_+u^{-1}\rangle = \bar E_u(u_+u^{-1},\ u_-u^{-1}). \tag{40}$$
One or the other of these two forms is usually easier to compute.

Similarly, for the dual model we identify $a(t) : \mathfrak{g}\to\mathfrak{m}$ with evaluation against the first component of $\hat\Pi^R$, i.e. $a = -\hat\Pi^R$ when the latter is considered as an operator by evaluation against its second factor (a convention that we adopt unless stated otherwise). Then
$$\hat E_t^{-1} = \mathrm{Ad}_{t^{-1}}(E_e) + \hat\Pi^R(t),\qquad\hat{\bar E}_t^{-1} = E_e + \hat\Pi(t) \tag{41}$$
and
$$\hat L = \hat E_t(t^{-1}t_+,\ t^{-1}t_-) = \hat{\bar E}_t(t_+t^{-1},\ t_-t^{-1}) \tag{42}$$
is the Lagrangian for the dual model. These results allow us to explicitly construct the graph coordinates and the Lagrangians given a generic splitting of $\mathfrak{d}$ into subspaces $E_\pm$. The latter are equivalent to specifying $E_e^{-1}$, $T_e^{-1}$ and these allow us to obtain the general $E_u^{-1}$, etc., from (33) or from (37), etc., in the coadjoint case.
6. Models Based on g Quasitriangular

In this section we define a class of Poisson–Lie dual models based on the double of $\mathfrak{g}$ (the usual setting) but in the special case where $\mathfrak{g}$ is quasitriangular and factorisable. In this case we are able to obtain much more explicit formulae for the model and the dual model than in the general case. A Lie bialgebra is quasitriangular if there is an element $r\in\mathfrak{g}\otimes\mathfrak{g}$ such that $\delta\xi = \mathrm{ad}_\xi(r)$ and $r$ obeys the classical Yang–Baxter equations
$$[r_{12}, r_{13}] + [r_{12}, r_{23}] + [r_{13}, r_{23}] = 0 \tag{43}$$
and has $2r_+ = r + r_{21}$ ad-invariant. A factorisable quasitriangular Lie bialgebra is one where $2r_+$ viewed as a map $\mathfrak{g}^*\to\mathfrak{g}$ is invertible. We denote its inverse by $K$. In standard examples where $\mathfrak{g}$ is simple, $K$ is a multiple of the Killing form viewed as a map. In this case there is an isomorphism [21, 12]
$$\mathfrak{d} = \mathfrak{g}\bowtie\mathfrak{g}^*\cong\mathfrak{g}_L\oplus\mathfrak{g}_R,\qquad\xi\oplus\phi\mapsto(\xi + r_1(\phi),\ \xi - r_2(\phi)),$$
which also sends the bilinear form $\langle\ ,\ \rangle$ on $\mathfrak{d}$ to $K_L - K_R$ on the two copies $\mathfrak{g}_L$, $\mathfrak{g}_R$ of $\mathfrak{g}$. Here $K_L$, $K_R$ are two copies of $K$. Therefore the inverse image of $\mathfrak{g}_L$, $\mathfrak{g}_R$ defines a splitting of $\mathfrak{d}$ into mutually orthogonal subspaces. From the explicit form of the isomorphism in [12] one finds
$$E_+ = \{\xi - r_1(K(\xi)) + K(\xi)\},\qquad E_- = \{\xi - r_2(K(\xi)) - K(\xi)\}. \tag{44}$$
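As a concrete illustration of condition (43), one can check the classical Yang–Baxter equation for the standard quasitriangular structure on $sl_2$. The sketch below assumes $r = e\otimes f + \frac14 h\otimes h$ in the $2\times2$ matrix representation with $[h,e] = 2e$, $[h,f] = -2f$, $[e,f] = h$; this example and its normalisation are chosen here for illustration and are not taken from the text.

```python
from fractions import Fraction as F

def kron(a, b):
    # Kronecker product of two matrices given as lists of rows.
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def comm(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))

I2 = [[1, 0], [0, 1]]
e = [[0, 1], [0, 0]]
f = [[0, 0], [1, 0]]
h = [[1, 0], [0, -1]]

# Assumed example: r = e (x) f + (1/4) h (x) h, stored as (coefficient, a, b).
R_TERMS = [(F(1), e, f), (F(1, 4), h, h)]

def embed(i, j):
    # r_{ij}: the two legs of r placed in tensor slots i and j of a triple product.
    total = [[F(0)] * 8 for _ in range(8)]
    for c, a, b in R_TERMS:
        factors = [I2, I2, I2]
        factors[i], factors[j] = a, b
        total = add(total, scale(c, kron(kron(factors[0], factors[1]), factors[2])))
    return total

def cybe_residual():
    r12, r13, r23 = embed(0, 1), embed(0, 2), embed(1, 2)
    return add(add(comm(r12, r13), comm(r12, r23)), comm(r13, r23))

def symmetric_part():
    # 2 r_+ = r + r_21 acting on the tensor square.
    total = [[F(0)] * 4 for _ in range(4)]
    for c, a, b in R_TERMS:
        total = add(total, scale(c, add(kron(a, b), kron(b, a))))
    return total

assert all(x == 0 for row in cybe_residual() for x in row)
for x in (e, f, h):
    delta_x = add(kron(x, I2), kron(I2, x))   # x acting on both tensor factors
    assert all(v == 0 for row in comm(delta_x, symmetric_part()) for v in row)
```

The second loop checks ad-invariance of $2r_+$ in this representation, which is the other condition stated after (43).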
These subspaces are not generic, however (the graphs blow up) but they are the model for the construction which follows. In fact one has a two parameter family of models by varying the coefficients of $r_1$, $K$ in $E_+$, etc., with graph coordinates in the general case. In another degenerate limit of these parameters one has the principal sigma model as well.

6.1. Construction of the quasitriangular models on G. The subspaces $E_\pm$ defining our model will be constructed by introducing parameters into (44) in such a way as to preserve orthogonality. Equivalently, one may define suitable $E_e^{-1}$, $T_e^{-1}$. We then obtain the general graph coordinates by the method of Sect. 5. In fact we consider the second problem first as it leads to the most elegant choice of ansatz for the $E_e^{-1}$, etc. Thus, in the case of a quasitriangular Lie bialgebra one has simply
$$\Pi(u) = \mathrm{Ad}_u(r) - r \tag{45}$$
for the cocycle defining its Poisson structure. This defines the Drinfeld–Sklyanin bracket on $G$ when $\mathfrak{g}$ is the standard quasitriangular structure [20] for a simple Lie algebra $\mathfrak{g}$. These are also the Poisson brackets of which the associated quantum groups in this case are the quantisations. We refer to [12] for further discussion of these preliminaries. In view of (45) and the results of Sect. 5, it is then immediate that the graph coordinates for the model on $G$ in the quasitriangular case obey
$$E_u^{-1} = \mathrm{Ad}_{u^{-1}}(E_e^{-1} - r) + r \tag{46}$$
as an element of $\mathfrak{g}\otimes\mathfrak{g}$. This equation, together with a little linear algebra, allows the explicit computation of the graph coordinates for any model based on a quasitriangular Lie bialgebra, given suitable $E_e^{-1}$.
Motivated by (44) we now let $E_e^{-1} = (\lambda + 1)r + \mu K^{-1}$, where $\lambda$, $\mu$ are two complex parameters. For generic values we will indeed be able to invert to obtain graph coordinates $E_u$, $T_u$ and hence will obtain a model of the type studied in Sects. 2, 3. Clearly, from (47), we have
$$E_u^{-1} = \lambda\,\mathrm{Ad}_{u^{-1}}(r) + r + \mu K^{-1} \tag{47}$$
as solving Eq. (46) for all $\lambda$, $\mu$. If we denote by $r_2 : \mathfrak{g}^*\to\mathfrak{g}$ the evaluation against the second factor of $r\in\mathfrak{g}\otimes\mathfrak{g}$ and similarly by $r_1$ for evaluation against the first factor, we have equivalently, as maps $\mathfrak{m}\to\mathfrak{g}$,
$$E_e^{-1} = (\lambda + 1)r_2 + \mu K^{-1} = (\lambda + \mu + 1)r_2 + \mu r_1 \tag{48}$$
for our class of models. Similarly,
$$T_e^{-1} = -(\lambda + 1)r_1 - \mu K^{-1} = -(\lambda + \mu + 1)r_1 - \mu r_2. \tag{49}$$
These imply
$$E_e^{-1} - T_e^{-1} = (\lambda + 1 + 2\mu)K^{-1},\qquad E_e^{-1} + T_e^{-1} = (\lambda + 1)(r_2 - r_1). \tag{50}$$
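The step from (48)–(49) to (50) is a one-line computation. The sketch below uses only $K^{-1} = r_1 + r_2$, which is the map form of $2r_+ = r + r_{21}$ and is implicit in the second equality of (48).

```latex
% Uses K^{-1} = r_1 + r_2.
\begin{align*}
E_e^{-1} - T_e^{-1} &= (\lambda+1)(r_2 + r_1) + 2\mu K^{-1} = (\lambda + 1 + 2\mu)\,K^{-1},\\
E_e^{-1} + T_e^{-1} &= (\lambda+1)(r_2 - r_1) + \mu K^{-1} - \mu K^{-1} = (\lambda+1)(r_2 - r_1).
\end{align*}
```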
For further computations in the Hamiltonian formulation we need the difference of the associated projectors $\pi_\pm$. Rearranging (13)–(14), we have
$$(\pi_{u+} - \pi_{u-})\xi = 2(E_u^{-1} - T_u^{-1})^{-1}\xi + (E_u^{-1} + T_u^{-1})(E_u^{-1} - T_u^{-1})^{-1}\xi, \quad \forall \xi \in g, \tag{51}$$
$$(\pi_{u+} - \pi_{u-})\phi = -2E_u^{-1}(E_u^{-1} - T_u^{-1})^{-1}T_u^{-1}\phi - (E_u^{-1} - T_u^{-1})^{-1}(E_u^{-1} + T_u^{-1})\phi, \quad \forall \phi \in m. \tag{52}$$
Evaluating at the identity and inserting the above results for $E_e^{-1}$, etc., we obtain:
$$\langle(\pi_+ - \pi_-)\xi, \xi\rangle = \frac{2}{\lambda+1+2\mu}K(\xi,\xi), \tag{53}$$
$$\langle(\pi_+ - \pi_-)\xi, \phi\rangle = \frac{\lambda+1}{\lambda+1+2\mu}K(\xi,(r_1-r_2)\phi), \tag{54}$$
$$\langle(\pi_+ - \pi_-)\phi, \phi\rangle = \frac{2}{\lambda+1+2\mu}K(T_e^{-1}\phi, T_e^{-1}\phi),$$
$$K(T_e^{-1}\phi, T_e^{-1}\phi) = \frac{(\lambda+1)^2}{4}K((r_1-r_2)\phi,(r_1-r_2)\phi) + \frac{(\lambda+1+2\mu)^2}{4}K^{-1}(\phi,\phi). \tag{55}$$
These results provide for the computation of the Hamiltonian from (15) in Sect. 3.
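The identities (50) and the last line of (55) can be checked numerically. The following Python sketch does this for the sl2 data of Sect. 6.2, where r = X+ ⊗ X− + ¼ H ⊗ H. The matrix conventions (ordered bases (H, X+, X−) of g and (φ, ψ+, ψ−) of m, with r stored as a coefficient matrix) and all function names are illustrative assumptions, not part of the construction.

```python
import numpy as np

# Assumed conventions: r[a, b] is the coefficient of e_a (x) e_b in
# r = X+ (x) X- + (1/4) H (x) H, with e = (H, X+, X-).
r = np.zeros((3, 3))
r[0, 0] = 0.25   # (1/4) H (x) H
r[1, 2] = 1.0    # X+ (x) X-

r2 = r           # evaluation against the second factor, as a matrix m -> g
r1 = r.T         # evaluation against the first factor
K_inv = r1 + r2  # K^{-1} : m -> g
K = np.linalg.inv(K_inv)


def graph_data(lam, mu):
    """Return (Ee^{-1}, Te^{-1}) as in (48)-(49)."""
    Ee_inv = (lam + 1) * r2 + mu * K_inv
    Te_inv = -(lam + 1) * r1 - mu * K_inv
    return Ee_inv, Te_inv


def check_identities(lam, mu, phi):
    """Return the residuals of the two identities (50) and of the last line of (55)."""
    Ee_inv, Te_inv = graph_data(lam, mu)
    c = lam + 1 + 2 * mu
    res_diff = np.abs(Ee_inv - Te_inv - c * K_inv).max()
    res_sum = np.abs(Ee_inv + Te_inv - (lam + 1) * (r2 - r1)).max()
    # K(x, y) = <K x, y> on g and K^{-1}(phi, phi) = <phi, K^{-1} phi> on m.
    t = Te_inv @ phi
    d = (r1 - r2) @ phi
    lhs = t @ K @ t
    rhs = (lam + 1) ** 2 / 4 * (d @ K @ d) + c ** 2 / 4 * (phi @ K_inv @ phi)
    return res_diff, res_sum, abs(lhs - rhs)
```

The cross term in (55) drops out because r2 − r1 is antisymmetric as a matrix, which is the only structural input beyond K^{-1} = r1 + r2.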
It remains to show that the above $E_e^{-1}$, $T_e^{-1}$ indeed define an orthogonal splitting of d into subspaces $E_\pm$ and to give these explicitly. First of all the corresponding subspaces defined by our choice of $E_e^{-1}$, $T_e^{-1}$ are
$$E_+ = \{E_e^{-1}\phi \oplus \phi\} = \Big\{\xi - \frac{(\lambda+1)r_1(K(\xi)) - K(\xi)}{\lambda+1+\mu} : \xi \in g\Big\}, \tag{56}$$
$$E_- = \{T_e^{-1}\phi \oplus \phi\} = \Big\{\xi - \frac{(\lambda+1)r_2(K(\xi)) + K(\xi)}{\lambda+1+\mu} : \xi \in g\Big\}. \tag{57}$$
To show that these form an orthogonal decomposition of d, we calculate the inner products
$$\langle E_e^{-1}\phi \oplus \phi,\ T_e^{-1}\phi \oplus \phi\rangle = \langle E_e^{-1}\phi, \phi\rangle + \langle\phi, T_e^{-1}\phi\rangle = \langle(\lambda+1)(r_2-r_1)(\phi), \phi\rangle = 0,$$
$$\langle E_e^{-1}\phi \oplus \phi,\ E_e^{-1}\phi \oplus \phi\rangle = \langle E_e^{-1}\phi, \phi\rangle + \langle\phi, E_e^{-1}\phi\rangle = (\lambda+1+2\mu)K^{-1}(\phi,\phi),$$
$$\langle T_e^{-1}\phi \oplus \phi,\ T_e^{-1}\phi \oplus \phi\rangle = \langle T_e^{-1}\phi, \phi\rangle + \langle\phi, T_e^{-1}\phi\rangle = -(\lambda+1+2\mu)K^{-1}(\phi,\phi).$$
In particular, $E_\pm$ are mutually orthogonal as required (the latter two equations show further that the inner product is nondegenerate on each subspace). To show that the subspaces span d we need to show that $\xi \oplus \phi = E_e^{-1}(\psi) + \psi + T_e^{-1}(\chi) + \chi$ has a (unique) solution for ψ, χ ∈ m for all ξ ∈ g and φ ∈ m. Clearly ψ + χ = φ. Meanwhile, putting in the form of $E_e^{-1}$, $T_e^{-1}$ we have $\xi = \mu K^{-1}(\psi-\chi) + (\lambda+1)(r_2(\psi) - r_1(\chi))$ which can be rearranged as
$$\xi + (\lambda+1)\big(-r_2 + \tfrac12 K^{-1}\big)(\phi) = \tfrac12(\lambda+1+2\mu)K^{-1}(\psi-\chi).$$
Thus we have an orthogonal splitting if and only if
$$\lambda+1+2\mu \neq 0. \tag{58}$$
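The spanning condition (58) can also be seen numerically. The sketch below, again for the sl2 data of Sect. 6.2, stacks basis vectors of E+ and E− into a 6 × 6 matrix and computes its rank. The block layout (g-component on top, m-component below) and the function names are assumptions made for illustration. By the standard block-determinant formula the determinant equals det(Ee^{-1} − Te^{-1}) = (λ + 1 + 2µ)³ det K^{-1}, which is consistent with (50).

```python
import numpy as np

# Assumed sl2 data in the bases (H, X+, X-) and (phi, psi+, psi-); r[a, b] is
# the coefficient of e_a (x) e_b in r = X+ (x) X- + (1/4) H (x) H.
r = np.zeros((3, 3))
r[0, 0], r[1, 2] = 0.25, 1.0
r1, r2 = r.T, r
K_inv = r1 + r2


def splitting_matrix(lam, mu):
    """6x6 matrix whose columns span E+ (first three) and E- (last three).

    A column (Ee^{-1} psi, psi) lies in E+ and a column (Te^{-1} chi, chi) in E-;
    the top block is the g-component and the bottom block the m-component.
    """
    Ee_inv = (lam + 1) * r2 + mu * K_inv
    Te_inv = -(lam + 1) * r1 - mu * K_inv
    eye = np.eye(3)
    return np.block([[Ee_inv, Te_inv], [eye, eye]])


def spans_double(lam, mu, tol=1e-10):
    """True when E+ and E- together span the 6-dimensional double d."""
    return np.linalg.matrix_rank(splitting_matrix(lam, mu), tol=tol) == 6
```

For example, `spans_double(0.0, 1.0)` holds, while the choice λ = 1, µ = −1 violates (58) and the two subspaces then coincide.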
We assume this throughout. Moreover, the splitting has the inverse-graph coordinates $E_e^{-1}$, $T_e^{-1}$ computed above. This completes the construction of our model at least in the Hamiltonian formulation. Indeed, this can be defined entirely in terms of $E_u^{-1}$, $T_u^{-1}$ without recourse to $E_u$, $T_u$ themselves. It is clear from our construction that: (1) The model is G-invariant if and only if
$$\lambda = 0 \tag{59}$$
(or the Lie bialgebra structure on g is identically zero). (2) The standard Lagrangian for the model (which requires $E_u$) exists if and only if (47) is nondegenerate, in particular when µK dominates, i.e.
$$|\mu| \gg |\lambda+1| \tag{60}$$
and g is semisimple. We describe several special cases.
Modified principal sigma model. This is obtained by λ = −1, µ = 1. Then E ± = {ξ ± K(ξ ) : ξ ∈ g},
Ee−1 = K −1 = −Te−1 .
(61)
Here Eu is obtained by inverting Fu = K −1 − (u) and is not independent of u ∈ G. Considering K, as maps K, 2 by evaluation against the second component, we have Eu−1 − Tu−1 = 2K −1 ,
Eu−1 + Tu−1 = 2R (u)
for this model. Here R (u) defines the Poisson-bracket associated to the Lie bialgebra structure of G and is viewed as a map m → g by evaluation (as usual) against its second factor. In particular, the Lagrangian is L = (K −1 + R (u))−1 u−1 u− , u−1 u+ = (K −1 + (u))−1 (u− u−1 ), u+ u−1 .
(62)
This recovers the setting of [3], for example, as a special case of our class of models. Note that the formulae for general µ but λ = −1 are strictly similar, with Eu = (µK −1 + R (u))−1 in the Lagrangian instead. Pure-quasitriangular and principal sigma model. The G-invariant models are obtained by λ = 0, with µ arbitrary. In this case Eu−1 = Ee−1 = r2 + µK −1 , Tu−1 = Te−1 = −r1 − µK −1 .
For the equations of motion we can use the equations u−1 u− = Ee−1 (s− s −1 ) and u−1 u+ = Te−1 (s+ s −1 ) since the operators Ee−1 and Te−1 are defined as above, even though Ee and Te may not be. Then the equations of motion are most conveniently described as a sigma model for s, with equation (Te−1 (s+ s −1 ))− − (Ee−1 (s− s −1 ))+ = −[Ee−1 (s− s −1 ), Te−1 (s+ s −1 )]. We see that this case contains another sigma model on the dual group which makes sense in the G-invariant case. Indeed, in the general G-invariant case the variable s may be considered to have a complex parameter µ, which makes this look very much like inverse scattering for the sigma model. Moreover, for generic µ, the operators Ee and Te do exist, and both u and s are described by sigma models. The pure-quasitriangular model is the special case with µ = 0 as well. In this case the subspaces E ± are the ones in (44) corresponding to the Drinfeld double as gg. This new class of models has the Hamiltonian defined by 1 (π+ − π− )(ξ ⊕ φ), ξ ⊕ φ = K(ξ, ξ ) + K(ξ, (r1 − r2 )φ) + K(r1 (φ), r1 (φ)). 2 The principal sigma model is the limit with µ → ∞ and a suitable rescaling. It is on the boundary of our moduli space of quasitriangular models. Then E + = {ξ + µ−1 (ξ − r1 ◦ K(ξ )) + K(ξ ))}, E − = {ξ + µ−1 (ξ − r2 ◦ K(ξ )) − K(ξ ))}
(63)
and Ee = µ−1 (K −1 + µ−1 r2 )−1 = µ−1 K(1 − µ−1 r2 ◦ K + · · · ). Hence the Lagrangian is L(u) = (µK −1 + r2 )−1 (u−1 u− ), u−1 u+ = µ−1 K(u−1 u− , u−1 u+ ) + µ−2 K(r2 ◦ K(u−1 u− ), u−1 u+ ) + · · ·
(64)
which after an infinite renormalisation has as leading term the usual principal sigma model. The equation of motion, to lowest order in $\mu^{-1}$, is
$$K\big((u^{-1}u_+)_- + (u^{-1}u_-)_+\big) = \mu^{-1}K\big(r_1K(u^{-1}u_+)_- + r_2K(u^{-1}u_-)_+\big) - [K(u^{-1}u_-), K(u^{-1}u_+)] + \cdots.$$
These are the usual principal sigma model equations of motion to lowest order in $\mu^{-1}$, namely $(u^{-1}u_+)_- + (u^{-1}u_-)_+ = 0$. 6.2. Quasitriangular models on SU2. We now compute these models for the group G = SU2 and for its other real form G = SL2(R). Actually, only the second of these is strictly real and quasitriangular. Thus, with a basis {H, X±} for its Lie algebra (with the usual relations), we take the Drinfeld–Sklyanin quasitriangular structure
$$r = X_+\otimes X_- + \tfrac14 H\otimes H.$$
Let $sl_2(\mathbb{R})^\star$ have the dual basis {φ, ψ±}, then its Lie algebra structure is
$$[\phi,\psi_\pm] = \tfrac12\psi_\pm, \qquad [\psi_+,\psi_-] = 0$$
and the other required maps are
$$r_2\begin{pmatrix}\phi\\ \psi_+\\ \psi_-\end{pmatrix} = \begin{pmatrix}\tfrac14H\\ 0\\ X_+\end{pmatrix}, \qquad K\begin{pmatrix}H\\ X_+\\ X_-\end{pmatrix} = \begin{pmatrix}2\phi\\ \psi_-\\ \psi_+\end{pmatrix}.$$
Note that if we take a different real form
$$e_1 = -\tfrac{\imath}{2}(X_+ + X_-), \qquad e_2 = -\tfrac12(X_+ - X_-), \qquad e_3 = -\tfrac{\imath}{2}H,$$
then $[e_i,e_j] = \epsilon_{ijk}e_k$ (the real form su2) but
$$r = -\sum_i e_i\otimes e_i + \imath(e_1\otimes e_2 - e_2\otimes e_1)$$
is not real in this basis. If {f_i} is a dual basis then
$$r_2(f_j) = -e_j + \imath\,e_i\,\epsilon_{ij3}, \qquad K = -\tfrac12\,\mathrm{id}.$$
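The change of basis above is easy to get wrong by a sign, so a short numerical check may be useful. The Python sketch below uses the defining 2 × 2 representation of sl2 (an assumption made only for the check; since it is faithful, equality of the Kronecker-product matrices implies equality in g ⊗ g) to confirm that the two expressions for r agree and that the e_i satisfy the su2 relations. The function names are illustrative.

```python
import numpy as np

# 2x2 matrices for the standard sl2 basis (assumed faithful representation).
H = np.array([[1, 0], [0, -1]], dtype=complex)
Xp = np.array([[0, 1], [0, 0]], dtype=complex)
Xm = np.array([[0, 0], [1, 0]], dtype=complex)

# The su2 real form quoted in the text.
e1 = -0.5j * (Xp + Xm)
e2 = -0.5 * (Xp - Xm)
e3 = -0.5j * H
e = [e1, e2, e3]


def bracket(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a


def r_sl2():
    """r = X+ (x) X- + (1/4) H (x) H as a 4x4 matrix via Kronecker products."""
    return np.kron(Xp, Xm) + 0.25 * np.kron(H, H)


def r_su2():
    """r = -sum_i e_i (x) e_i + i (e1 (x) e2 - e2 (x) e1)."""
    sym = sum(np.kron(x, x) for x in e)
    return -sym + 1j * (np.kron(e1, e2) - np.kron(e2, e1))
```

Comparing `r_sl2()` with `r_su2()` using `np.allclose` confirms the rewriting. The symmetric part −Σ e_i ⊗ e_i is what gives K^{-1} = −2 id, and the antisymmetric part ı e1 ∧ e2 is the term that fails to be real.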
This means that although we can arrange for a completely real Lie bialgebra su2 in this basis (here the Lie coalgebra is purely imaginary but we can rescale r to make it real) it is not a quasitriangular one over R; the required r if we want to obey (43) lives in the complexification. In the above conventions the Lie algebra sl2 in the dual basis is imaginary, [fi , fj ] = ı(δik δj 3 − δj k δi3 )fk . The choice of basis
ei
= −ıfi is its real form su2 .
Modified principal sigma model on SU2 . To construct the model we will need (u) = Adu (r− ) − r− quite explicitly, where r− = ıe1 ∧ e2 is the antisymmetric part of r. For our purposes we write SU2 as elements
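$$u = \begin{pmatrix} a & b\\ -\bar b & \bar a\end{pmatrix}, \qquad |a|^2 + |b|^2 = 1.$$
Then working with the matrix representation $e_i = -\tfrac{\imath}{2}\sigma_i$ given by the Pauli matrices it is easy to find
$$\mathrm{Ad}_{u^{-1}}(e_1) = \Re(a^2-b^2)e_1 + \Im(a^2+b^2)e_2 - 2\Re(ab)e_3,$$
$$\mathrm{Ad}_{u^{-1}}(e_2) = -\Im(a^2-b^2)e_1 + \Re(a^2+b^2)e_2 + 2\Im(ab)e_3,$$
and hence R (u) = $2\imath\,e_1\wedge e_2\,|b|^2 - e_3\wedge e_1\,(a\bar b - \bar a b) - \imath\,e_2\wedge e_3\,(a\bar b + \bar a b)$ (65)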
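after a short computation, which is purely imaginary (as expected). Evaluating against the second factor and regarding as a matrix we have Eu−1 = K −1 + R (u), that is,
$$E_u^{-1} = -2\begin{pmatrix} 1 & -\imath|b|^2 & -\imath\,\Im(a\bar b)\\ \imath|b|^2 & 1 & \imath\,\Re(a\bar b)\\ \imath\,\Im(a\bar b) & -\imath\,\Re(a\bar b) & 1\end{pmatrix}.$$
Here $E_u^{-1}(f_j) = E^{-1}_{ij}e_i$, where $(E^{-1}_{ij})$ is the matrix shown. Note that we can write
$$E^{-1}_{ij} = -2(\delta_{ij} + \imath\epsilon_{ijk}\pi_k), \qquad \pi = \begin{pmatrix}\Re(a\bar b)\\ \Im(a\bar b)\\ -|b|^2\end{pmatrix},$$
and any matrix of this form has inverse
$$E_{ij} = -\frac{1}{2(1-\pi^2)}(\delta_{ij} - \imath\epsilon_{ijk}\pi_k - \pi_i\pi_j).$$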
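The inverse formula can be confirmed directly. Writing A_ij = ε_ijk π_k, one has Aπ = 0 and A² = ππᵀ − π² id, so (id + ıA)(id − ıA − ππᵀ) = (1 − π²) id. The Python sketch below checks this numerically for a sample point of SU2; the helper names and the sample values of a, b are assumptions for illustration only.

```python
import numpy as np


def levi_civita():
    """Totally antisymmetric tensor eps_ijk with eps_012 = 1."""
    eps = np.zeros((3, 3, 3))
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[i, j, k] = 1.0
        eps[j, i, k] = -1.0
    return eps


def e_inverse(pi):
    """E^{-1}_{ij} = -2 (delta_ij + i eps_ijk pi_k)."""
    eps = levi_civita()
    return -2.0 * (np.eye(3) + 1j * np.einsum("ijk,k->ij", eps, pi))


def e_closed_form(pi):
    """Claimed inverse: -(delta_ij - i eps_ijk pi_k - pi_i pi_j) / (2 (1 - pi.pi))."""
    eps = levi_civita()
    pi2 = pi @ pi
    body = np.eye(3) - 1j * np.einsum("ijk,k->ij", eps, pi) - np.outer(pi, pi)
    return -body / (2.0 * (1.0 - pi2))


def pi_from_su2(a, b):
    """pi = (Re(a conj(b)), Im(a conj(b)), -|b|^2) for |a|^2 + |b|^2 = 1."""
    z = a * np.conj(b)
    return np.array([z.real, z.imag, -abs(b) ** 2])
```

The formula degenerates when π² = 1, which for the vector π above means a = 0.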
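Here $\pi^2 = \pi\cdot\pi = |b|^2$ in our case. The corresponding operator is $E_u(e_j) = E_{ij}f_i$. To cast the resulting Lagrangian in a useful form let us note that
$$\mathrm{Tr}\,(\mathrm{id} - \not{\pi})\sigma_i\sigma_j = \mathrm{Tr}\,(\mathrm{id} - \pi\cdot\sigma)(\delta_{ij}\,\mathrm{id} + \imath\epsilon_{ijk}\sigma_k) = 2(\delta_{ij} - \imath\epsilon_{ijk}\pi_k),$$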
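where $\sigma_i$ are the Pauli matrices and $\not{\pi} = \pi\cdot\sigma$. Hence in our representation of su2 in the basis $e_i = -\tfrac{\imath}{2}\sigma_i$ we have
$$L = \frac{1}{|a|^2}\Big(\mathrm{Tr}\,[(\mathrm{id} - \not{\pi})u^{-1}u_+u^{-1}u_-] - \tfrac12\,\mathrm{Tr}\,[\not{\pi}u^{-1}u_+]\,\mathrm{Tr}\,[\not{\pi}u^{-1}u_-]\Big), \tag{66}$$
where
$$\not{\pi} = \begin{pmatrix} -|b|^2 & \bar a b\\ a\bar b & |b|^2\end{pmatrix} = \begin{pmatrix} 0 & b\\ \bar b & 0\end{pmatrix}u = u^{-1}\begin{pmatrix} 0 & b\\ \bar b & 0\end{pmatrix}.$$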
The matrix $E_{ij}$ here is complex since R in our conventions is imaginary. For a completely real version of this model on SU2 one should keep the freedom of general µ in this class of models so that Eu−1 = µK −1 + R (u) and then set µ = ı. Taking the real normalisation of su2 as a Lie bialgebra (i.e. multiplying r by −ı so that $r_- = e_1\wedge e_2$ and $K^{-1} = 2\imath\,\mathrm{id}$) gives the same $E^{-1}_{ij}$ as above but times −ı off the diagonal. One may also work of course on G = SL2(R) with real r, K for a completely real model with µ = 1. This type of model has been considered specifically for SU2 in [18] although not in the matrix/trace form above, and more recently in [14] but without explicit formulae for the Lagrangian. Pure-quasitriangular and principal sigma models on SU2. Here we take λ = 0 and can write down immediately
$$E_u^{-1} = E_e^{-1} = -\begin{pmatrix} 1+2\mu & -\imath & 0\\ \imath & 1+2\mu & 0\\ 0 & 0 & 1+2\mu\end{pmatrix}$$
which has inverse
$$E_u = E_e = \frac{-1}{4\mu}\begin{pmatrix} \frac{1+2\mu}{1+\mu} & \frac{\imath}{1+\mu} & 0\\ \frac{-\imath}{1+\mu} & \frac{1+2\mu}{1+\mu} & 0\\ 0 & 0 & \frac{4\mu}{1+2\mu}\end{pmatrix}$$
for $\mu \neq 0, -\tfrac12, -1$. The Lagrangian defined by this can be conveniently obtained by writing
$$E^{-1}_{ij} = -(1+2\mu)(\delta_{ij} - \imath\epsilon_{ijk}\pi_k), \qquad \pi = \begin{pmatrix} 0\\ 0\\ \frac{1}{1+2\mu}\end{pmatrix},$$
which implies (by similar computations to those above),
$$L = \frac{1}{\mu(1+\mu)}\Big(\mathrm{Tr}\,\Big[\begin{pmatrix} 1+\mu & 0\\ 0 & \mu\end{pmatrix}u^{-1}u_+u^{-1}u_-\Big] - \frac{1}{4(1+2\mu)}\mathrm{Tr}\,[\sigma_3u^{-1}u_+]\,\mathrm{Tr}\,[\sigma_3u^{-1}u_-]\Big).$$
This is singular for the pure quasitriangular model where µ = 0, and also does not have a good limit at µ = ∞ for the principal sigma model. Rather, we have well-defined
equations of motion conveniently described as a sigma model for s ∈ M as explained above, using $E_e^{-1}$ and a similar matrix for $T_e^{-1}$. On the other hand, by changing the normalisation of the Lie bialgebra structure (namely, dividing r by µ) we have $E_u$ with the same matrix as above but without the $\mu^{-1}$ factor in front. This rescaled Lagrangian is well defined both for µ = 0 and µ = ∞, with
$$\mu L \to \begin{cases}\tfrac12\,\mathrm{Tr}\,[(1+\sigma_3)u^{-1}u_+u^{-1}u_-] - \tfrac14\,\mathrm{Tr}\,[\sigma_3u^{-1}u_+]\,\mathrm{Tr}\,[\sigma_3u^{-1}u_-] & \text{as } \mu\to 0,\\ \mathrm{Tr}\,[u^{-1}u_+u^{-1}u_-] & \text{as } \mu\to\infty.\end{cases}$$
The first limit is the Lagrangian for the rescaled pure-quasitriangular model on SU2, while the second is the standard Lagrangian for the principal sigma model on SU2 based on the Killing form of su2. Notice that in this rescaled model the Lie cobracket of su2 is infinite at µ = 0, i.e. the Lie algebra m has infinite commutators, and zero at µ = ∞, i.e. the Lie algebra m is Abelian. The geometrical pictures behind these two models are therefore very different but interpolated by general µ. Also note that the µ = 0 limit here is again defined by a complex Lagrangian. For a real version one may look at the pure-quasitriangular model on G = SL2(R) instead. Here we have, clearly,
$$E_e^{-1}\begin{pmatrix}\phi\\ \psi_+\\ \psi_-\end{pmatrix} = \begin{pmatrix}\frac{1+2\mu}{4}H\\ \mu X_-\\ (1+\mu)X_+\end{pmatrix}, \qquad E_e\begin{pmatrix}H\\ X_+\\ X_-\end{pmatrix} = \mu^{-1}\begin{pmatrix}\frac{4\mu}{1+2\mu}\phi\\ \frac{\mu}{1+\mu}\psi_-\\ \psi_+\end{pmatrix}.$$
As before, we take out a factor µ by rescaling in order to obtain well-defined operators $E_e$ at µ = 0, ∞, this time with all coefficients being real in our choice of bases. The corresponding Lagrangian can easily be written out explicitly upon fixing a description of u ∈ SL2(R). For example, if we write $u = e^{xX_+}e^{hH}e^{yX_-}$, so that
$$u^{-1}u_\pm = x_\pm X_+e^{-2h} + (h_\pm + yx_\pm e^{-2h})H + (y_\pm - 2yh_\pm - 2y^2x_\pm e^{-2h})X_-,$$
using the relations of sl2 then the rescaled µ = 0 limit gives the Lagrangian
$$L = e^{-2h}x_+(y_- - 2yh_- - 2y^2e^{-2h}x_-)$$
as the pure-quasitriangular model on SL2(R). The µ = ∞ limit is the standard principal sigma model on SL2(R) and the general case interpolates the two. 6.3. Dual of the quasitriangular models on G .
The quasitriangular models are examples of the case where the factorisation is based on the Drinfeld double associated to a Lie bialgebra, so that Eu−1 is related to the Poisson–Lie group G. Hence the dual models are of the same form but based on the Poisson–Lie group G rather than G, i.e. with ˆ (t) ∈ m ⊗ m in place of . As explained in Sect. 5 we can then construct them from the initial data Eˆ e−1 = Ee ,
Tˆe−1 = Te
ˆ¯ −1 = E + (t) ˆ as given above for our quasitriangular models. We compute E and invert e t to obtain the Lagrangian −1 ˆ Lˆ = (Ee + (t)) t− t −1 , t+ t −1
(67)
for the dual model. For the models below, where there is no special Adt -invariance of Ee , this is easier than computing the Lagrangian via Eˆ t . We outline the results for SU2 and SL2 (R) . First of all we describe these groups explicitly. The former is generated by the basis {−ıfi }, i.e. we write φ = φi (−ıfi ) ∈ m for real φi , which we regard as a vector φ. One standard representation of the resulting group is as matrices of the form
x z , 0 x −1
x > 0,
z ∈ C.
This is the group occurring in the Iwasawa decomposition SL2 (C) = SU2 SU2 , see [12]. Another description useful for very explicit computations is as the semidirect product R2 >R [12], which can be viewed as a modified product on R3 . Elements are s ∈ R3 with s3 > −1 and the product law and inversion are st = s + (s3 + 1)t,
s −1 = −
s . s3 + 1
The exponentiation from the Lie algebra to a group is explicitly s=φ
eφ3 − 1 φ3
for s = eφ in the natural 3-dimensional coadjoint representation. See [12]. The real form SL2 (R) has a similar description as C>R, i.e. where s2 is imaginary and s1 , s3 real with s3 > −1 according to the conventions in [12]. Note that x = s3 +1 is multiplicative under the group law if one wants a more standard notation. The Lie bracket on su2 determines the Lie cobracket and Poisson structure on SU2 (and similarly on SL2 (R) ). It is given by [12] 1 ˆ (s) = −ı(Bij a sa + s 2 Bij 3 )fi ⊗ fj . 2 Explicitly, ˆ ı (s) =
1 2 (s + s22 + (s3 + 1)2 − 1)f1 ∧ f2 + s2 f3 ∧ f1 + s1 f2 ∧ f3 . 2 1
Note also that the notation s± s −1 means more precisely Rs −1 ∗ s± . Similarly for s −1 s± . In our present group coordinates, from the product law, it is easy to see that Ls∗ φ = (s3 + 1)φ,
Rs∗ φ = φ + φ3 s.
Dual of the modified principal sigma model. We set λ = −1 and µ = 1. Then 1 fi ⊗ fi . Ee = K = − 2 i
Hence −1
ˆ¯ E ij
1 = − (δij + ıBij k πˆ k ), 2
0 πˆ = 2t + 0 t2
and ˆ¯ = − E ij
1 − t 2 (t 2
2 (δij − ıBij k πˆ k − πˆ i πˆ j ). + 4(t3 + 1))
This defines the Lagrangian 2 Lˆ = ∇+ t · ∇− t − ı πˆ · (∇+ t × ∇− t) − (πˆ · ∇+ t)(πˆ · ∇− t) , 1 − t 2 (t 2 + 4(t3 + 1)) where Rt −1 ∗ t± is computed as t ∇± t = t ± − t±3 . t3 + 1 As before, the model in the form stated is complex but with a different choice µ = ı and different normalisation of r we can obtain a real model as well. Dual of the pure-quasitriangular and principal sigma models. Here we set λ = 0. Then rearranging Ee above as an element of m ⊗ m we have
1 1 + 2µ 4µ ı (f1 ⊗ f1 + f2 ⊗ f2 ) + f3 ⊗ f3 + f 1 ∧ f2 . Ee = − 4µ 1 + µ 1 + 2µ 1+µ One may then compute ˆ¯ = (E + ) ˆ −1 , E t e and hence the Lagrangian. The result does not have any particular simplifying features over the λ = −1 case above, so we omit its detailed form. Both limits of µ are singular, and require rescaling. The µ → ∞ case makes sense after a rescaling of r to r/µ. This in turn scales the Lie cobracket of g by µ−1 and hence also changes the Lie algebra structure of m to an Abelian one plus corrections of order µ−1 . The effect of this is to change the exponential map and the group law of G , making the latter Abelian. This can be expressed conveniently by working in new coordinates with t scaled by µ−1 . In this new coordinate system we have ˆ¯ = −2id + O(µ−1 ) E t ˆ is linear in t to lowest order. The Lagrangian is since Lˆ = 2t + · t − + O(µ−1 ). Thus the dual model to the principal sigma model on SU2 is an Abelian one based on the group R3 with the usual linear wave equation. The similar limit for the pure-quasitriangular case is ill-defined since the Lie bracket of m becomes singular as µ → 0. Other scaling limits of both the original model and its dual are possible in this case.
6.4. Point-particle limit of the quasitriangular models. We have seen that the pointparticle limit where u is independent of x reduces to a classical mechanical dynamical system on the group G. For our quasitriangular models we have the following special cases. Point-particle modified principal model. From the expressions for Eu−1 etc. above, the Hamiltonian is H=
1 −1 K ((K ◦ R (u) + 1)p, (K ◦ 2 (u−1 ) + 1)p) 4
(68)
and the equations of motion are u−1 u˙ = K −1 ◦ ((K ◦ R (u))2 − 1)p,
p˙ = [K ◦ R (u)p, p].
(69)
In this limit both the case entirely over R or the case where r and hence the cobracket are imaginary lead to well-defined real equations of motion. In this case is imaginary but so is the Lie bracket of m in the dual basis to the real basis of g. For example, we can either work on G = SL2 (R) or, as more usual, on G = SU2 . In the latter case (see above) we have ¯ 0 −|b|2 −)(a b) ¯ = ıBij k πk fi ⊗ ej . K ◦ R (u) = ı |b|2 0 ((a b) ¯ ¯ )(a b) −((a b) 0 Using the complexified Lie bracket on su2 we have the equations of motion for p = pi fi (with pi real) as p˙ = p(p × π )3 − p3 p × π in terms of the vector cross product. This can be written explicitly as p˙ 3 = 0,
ı ı ¯ 2 ρ˙ = − abρ + 2p32 ) + ı|b|2 ρp3 , ¯ 2 + a b(|ρ| 2 2
ρ ≡ p1 + ıp2
after a short computation. On the other hand, ((K ◦ R )2 − 1)ij = (π 2 − 1)δij − πi πj , hence the equation for u in our basis ei =
−ıσi 2
of su2 is
u−1 u˙ = ı(π 2 − 1)p/ − ıπ /π · p. In our case π 2 = |b|2 and π · p = ((ρ ab) ¯ − |b|2 p3 , hence
ı 0 b p3 ρ¯ 2 2 ¯ u˙ = − . (ρ ab ¯ + ρa ¯ b − 2|b| p3 ) − ı|a| u ρ −p3 2 b¯ 0 Explicitly, this is a˙ = −ı|a|2 (ap3 + bρ),
ı ı b˙ = ıbp3 − (1 + |a|2 )a ρ¯ − ρ ab ¯ 2. 2 2
One may verify that this preserves |a|2 + |b|2 = 1 as it must.
Point-particle pure-quasitriangular model. We set λ = 0 and $E_u = E_e$ etc. (the models are G-invariant). The Hamiltonian and equations of motion are then
$$2H = \frac{1}{1+2\mu}K\big((r_2+\mu K^{-1})p,\ (r_2+\mu K^{-1})p\big), \tag{70}$$
$$u^{-1}\dot u = -\frac{2}{1+2\mu}(r_2+\mu K^{-1})\circ K\circ(r_1+\mu K^{-1})p, \qquad \dot p = \frac{2}{1+2\mu}[K\circ r_2p, p]. \tag{71}$$
Since these models are invariant, we know that $pu^{-1}$ is conserved. This means that we can let $Q = p(0)u(0)^{-1} \in m$ be fixed and substitute p(t) = Qu(t) into the equation for $\dot u$. We then solve a first order non-linear differential equation for u(t). In particular, in the limit µ = 0 we obtain the x-independent limit of the pure-quasitriangular model. Thus
$$H = \tfrac12K(r_2p, r_2p), \qquad u^{-1}\dot u = 2(r_2\circ K - 1)r_2p, \qquad \dot p = 2[K\circ r_2p, p] \tag{72}$$
using $r_1 + r_2 = K^{-1}$ to rearrange. In this case it makes sense to consider the reduced variable $\xi = r_2p$ and write the equations of motion as
$$u^{-1}\dot u = 2(r_2\circ K - 1)\xi, \qquad \dot\xi = 2[r_2\circ K\xi, \xi], \tag{73}$$
where we use that $r_2 : g^* \to g$ is a Lie algebra homomorphism in view of the classical Yang–Baxter equation (43) [12]. We only need to solve this for ξ in the image of $r_2$ but it is interesting that the equation makes sense for any ξ as an integrable system on the group manifold. We can solve this for our strictly real form g = sl2(R). We will solve it here for the general (73); the special case of interest is similar but more elementary. Thus,
$$r_2\circ K\begin{pmatrix}H\\ X_+\\ X_-\end{pmatrix} = \begin{pmatrix}\tfrac12H\\ X_+\\ 0\end{pmatrix}$$
so, writing $\xi(t) = h(t)H + x(t)X_+ + y(t)X_-$, we need to solve $u^{-1}\dot u = -h(t)H - 2y(t)X_-$ and
$$\dot hH + \dot xX_+ + \dot yX_- = [hH + 2xX_+,\ hH + xX_+ + yX_-] = 2xyH - 2hxX_+ - 2hyX_-,$$
which is the system of equations
$$\dot h = 2xy, \qquad \dot x = -2hx, \qquad \dot y = -2hy.$$
Note first of all that
$$\frac{d}{dt}\Big(h^2 + \tfrac12\dot h\Big) = 0$$
so
$$h^2 + xy = \frac{\omega^2}{4}$$
(say) is a constant. Inserting this into the equation for h yields the Riccati equation
$$\dot h - \tfrac12\omega^2 + 2h^2 = 0$$
which has the general solution
$$h(t) = \frac{\omega}{2}\,\frac{\sinh(\omega t) + \frac{2h(0)}{\omega}\cosh(\omega t)}{\cosh(\omega t) + \frac{2h(0)}{\omega}\sinh(\omega t)}.$$
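The closed form for h(t) and the conserved quantity h² + xy can be compared against a direct numerical integration of the system. The Python sketch below uses a hand-written classical Runge–Kutta step so that it has no dependencies. The initial data, step count and function names are illustrative assumptions, and the closed form is evaluated only for real ω > 0, i.e. for h(0)² + x(0)y(0) > 0.

```python
import math


def rhs(state):
    """Right-hand side of h' = 2xy, x' = -2hx, y' = -2hy."""
    h, x, y = state
    return (2.0 * x * y, -2.0 * h * x, -2.0 * h * y)


def rk4(state, t_end, steps):
    """Integrate from t = 0 to t_end with the classical fourth-order Runge-Kutta scheme."""
    dt = t_end / steps
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
        k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
        k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)))
        state = tuple(
            s + dt * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state


def h_closed_form(t, h0, omega):
    """Closed-form h(t) quoted in the text, for real omega > 0."""
    g0 = 2.0 * h0 / omega
    num = math.sinh(omega * t) + g0 * math.cosh(omega * t)
    den = math.cosh(omega * t) + g0 * math.sinh(omega * t)
    return 0.5 * omega * num / den
```

In terms of g = 2h/ω the Riccati equation reads ġ = ω(1 − g²), so the closed form is g(t) = tanh(ωt + c) written out with the addition formula.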
We can then compute y as
$$y(t) = e^{-2\int_0^t h(\tau)\,d\tau}\,y(0)$$
and similarly for x(t). Since we only need h, y to obtain u(t) we can consider the choice of x(0) to be equivalent to the choice of ω (at least in a certain range). The initial values of h, y then determine their general values as above, and these then determine u(t) given u(0). The latter can be expressed explicitly in terms of integrals on fixing a coordinate system for SL2(R). For the point-particle limit of the pure quasitriangular models we are only interested in $\xi \in b_+$ (the image of $r_2$), i.e. we specialise to solutions of the form y(0) = 0, which clearly implies y(t) = 0 and $\dot h = 0$. In this case the solution is clearly
$$\xi(t) = \frac{\omega}{2}H + e^{-\omega t}x(0)X_+, \qquad u(t) = u(0)e^{-\frac12\omega tH}$$
for initial data ω, x(0), u(0). For the full physical momentum p(t) we go back to (72). If we write $p = 2\omega\phi + x\psi_- + \bar x\psi_+$ say, then a similar computation using the Lie algebra of sl2(R) gives ω constant, $\dot x = -\omega x$ as before, and additionally $\dot{\bar x} = \omega\bar x$. Hence the solution is
$$p(t) = 2\omega\phi + e^{-\omega t}x(0)\psi_- + e^{\omega t}\bar x(0)\psi_+, \qquad u(t) = u(0)e^{-\frac12\omega tH}$$
for constants ω, x(0), $\bar x(0)$. As a check, it is easy to verify that
$$Q_G = pu^{-1} = (pe^{\frac12\omega tH})u(0)^{-1}$$
is conserved. Here ψ± H = ∓2ψ± and φ H = 0 is the relevant coadjoint action. Point-particle principal model. In the limit µ → ∞ of (70)–(71), we obtain the x-independent limit of the principal sigma model. Here
$$4H = K^{-1}(p,p), \qquad u^{-1}\dot u = -K^{-1}\bar p, \qquad \dot{\bar p} = 0,$$
where $\bar p = \mu p$ is the renormalised momentum variable. This has the general solution
$$u(t) = u(0)e^{-tK^{-1}\bar p}, \qquad \bar p(t) = \bar p(0).$$
It is easy to see that $Q = \bar pu^{-1}$ is constant as well, using K ad-invariant.
7. Generalised T-Duality with Double Neumann Boundary Conditions
So far we have worked on providing a special class of Poisson–Lie T-dual models within the established general framework. We now return to our Hamiltonian formulation of the general framework and observe that in this form the main ideas can be extended to a much more general setting. Thus, from the symplectic form and the Hamiltonian we have just calculated, we can see how the definition of T-duality could be generalised. Begin with a Lie group D, with Lie algebra d, and suppose that d is the direct sum of two subspaces $E_-$ and $E_+$. We take $\pi_+$ to be the projection to $E_+$ with kernel $E_-$, and $\pi_-$ to be the projection to $E_-$ with kernel $E_+$. Suppose that there is a function $k : \mathbb{R}^2 \to D$, with the properties that $k_+k^{-1}(x_+,x_-) \in E_-$ and $k_-k^{-1}(x_+,x_-) \in E_+$ for all $(x_+,x_-) \in \mathbb{R}^2$. Then the relation $k_+k^{-1}(x_+,x_-) \in E_-$ can be summarised by $\pi_+(k_+k^{-1}) = 0$, and similarly we get $\pi_-(k_-k^{-1}) = 0$. This gives the equations of motion
$$\dot kk^{-1} = (\pi_- - \pi_+)(k_xk^{-1}).$$
Now we look at the symplectic form on the phase space. Suppose that d has an adjoint invariant inner product ⟨ , ⟩. If we imposed boundary conditions that k(0) and k(π) were fixed, then the symplectic form we computed earlier becomes
$$2\omega(k; k_z, k_y) = \int_{x=0}^{\pi}\langle(k^{-1}k_y)_x,\ k^{-1}k_z\rangle\,dx.$$
If we substitute $k_z = \dot k$, then we get
$$2\omega(k; \dot k, k_y) = \int_{x=0}^{\pi}\langle k(k^{-1}k_y)_xk^{-1},\ \dot kk^{-1}\rangle\,dx = \int_{x=0}^{\pi}\langle(k_xk^{-1})_y,\ (\pi_- - \pi_+)(k_xk^{-1})\rangle\,dx,$$
and so
$$4\omega(k; \dot k, k_y) = -D_{(k;k_y)}\int_{x=0}^{\pi}\langle k_xk^{-1},\ (\pi_+ - \pi_-)(k_xk^{-1})\rangle\,dx,$$
on the assumption that $\pi_+ - \pi_-$ is Hermitian. This will be true if the subspaces $E_-$ and $E_+$ are perpendicular with respect to the inner product. Then we see that $\omega(k; k_y, \dot k) = D_{(k;k_y)}H(k)$, where $4H = \langle(\pi_{u+} - \pi_{u-})(u^{-1}u_x + s_xs^{-1}),\ u^{-1}u_x + s_xs^{-1}\rangle$ gives the Hamiltonian generating the time evolution. The form of the boundary conditions we have imposed here should not come as too much of a surprise. Normally the string has boundary conditions (for k = us with u ∈ G and s ∈ M) $u_x = 0$ at x = 0 or x = π. This Neumann condition is designed to prevent momentum transfer out of the string at the edges. But if the system is to be completely dual, we also need to impose a corresponding Neumann condition on the dual theory, which leads to the boundary condition $k_x = 0$, the “double Neumann” condition. But then the equation of motion states $\dot k = 0$ on the boundary. Alternatively, if the reader prefers to work over x ∈ R, we just deal with rapidly decreasing solutions. In either of these cases, the symplectic form really is non-degenerate.
Now we have a phase space and Hamiltonian for the equations of motion just based on an invariant inner product on D and an orthogonal decomposition $E_-$ and $E_+$ of d. If we take D to be a doublecross product $D = G \bowtie M$, and assume that the subspaces $\mathrm{Ad}_{u^{-1}}E_\pm$ have graph coordinates $T_u$ and $E_u$ as before, we again recover the previous equations of motion for u ∈ G in the factorisation k = us,
$$\big(T_u(u^{-1}u_+)\big)_- - \big(E_u(u^{-1}u_-)\big)_+ = \big[E_u(u^{-1}u_-),\ T_u(u^{-1}u_+)\big].$$
Importantly, we do not need to assume that the inner product has any special properties with respect to the decomposition d = g + m (such as being zero on g). We can also give the form of the Hamiltonian for this general case:
$$4H = \langle(E_u + I)(u^{-1}u_-),\ (E_u + I)(u^{-1}u_-)\rangle - \langle(T_u + I)(u^{-1}u_+),\ (T_u + I)(u^{-1}u_+)\rangle.$$
The corresponding dual formula would produce exactly the same value. 7.1. Poisson brackets and the central extension. In this section we continue with the generalised T-duality and boundary conditions of the last section. The phase space for our system is infinite dimensional, so it is rather hard to describe the functions on it directly. We shall describe a “nice” set of functions, and hope that more general functions are expressible as a product of these nice functions. If $v \in C^\infty((0,\pi), d)$, we can look at the vector field $k_z = vk$ for $k \in C^\infty((0,\pi), D)$. To preserve the boundary conditions we consider only those $v \in C^\infty((0,\pi), d)$ which tend to zero at the end points. Consider
$$\omega(k; k_y, k_z) = -\frac12\int\langle(k^{-1}k_y)_x,\ k^{-1}k_z\rangle\,dx = -\frac12\int\langle(k_xk^{-1})_y,\ v\rangle\,dx = -\frac12D_{(k;y)}\int\langle k_xk^{-1},\ v\rangle\,dx.$$
It follows that the function which acts as a Hamiltonian generating this flow is
$$f_v(k) = -\frac12\int\langle k_xk^{-1},\ v\rangle\,dx.$$
We can calculate the Poisson brackets between these nice functions quite easily:
$$\{f_v, f_w\} = f_v'(k; wk) = f_{[v,w]} - \frac12\int\langle w_x,\ v\rangle\,dx.$$
We now see the appearance of a central extension term in the Lie algebra. The Poisson brackets can be written as $\{f_v, f_w\} = f_{[v,w]} + \vartheta(v,w)f_c$, where $f_c(k) = 1$ and the cocycle $\vartheta(v,w) = -\frac12\int\langle w_x, v\rangle\,dx$. We can also manufacture a derivation term, which corresponds to the momentum (the operation of incrementing the x coordinate). Consider
$$\omega(k; k_y, k_x) = -\frac12\int\langle(k^{-1}k_y)_x,\ k^{-1}k_x\rangle\,dx = -\frac12\int\langle(k_xk^{-1})_y,\ k_xk^{-1}\rangle\,dx = -\frac14D_{(k;y)}\int\langle k_xk^{-1},\ k_xk^{-1}\rangle\,dx.$$
Thus the momentum is given by
$$f_d(k) = -\frac14\int\langle k_xk^{-1},\ k_xk^{-1}\rangle\,dx.$$
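The central term ϑ(v, w) = −½∫⟨w_x, v⟩dx appearing in the bracket {f_v, f_w} should be antisymmetric and satisfy the 2-cocycle identity, both by integration by parts with v, w vanishing at the end points. The Python sketch below checks this numerically in a toy setting that is purely an assumption for illustration: d is replaced by su2 realised as R³ with the cross product, the inner product is the dot product, and the loops are finite sine series on [0, π].

```python
import numpy as np

# Hypothetical test setting: loops in su2 ~ (R^3, cross product), with the dot
# product as ad-invariant inner product, sampled on a grid over [0, pi].
X = np.linspace(0.0, np.pi, 20001)


def loop(coeffs):
    """v(x) = sum_n sin(n x) A_n and its x-derivative; v vanishes at 0 and pi."""
    v = sum(np.sin(n * X)[:, None] * np.asarray(A, dtype=float)[None, :] for n, A in coeffs)
    vx = sum(n * np.cos(n * X)[:, None] * np.asarray(A, dtype=float)[None, :] for n, A in coeffs)
    return v, vx


def integrate(f):
    """Composite trapezoid rule on the grid X."""
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(X)))


def theta(v, w):
    """theta(v, w) = -(1/2) * integral over [0, pi] of <w_x, v> dx."""
    return -0.5 * integrate(np.einsum("ni,ni->n", w[1], v[0]))


def bracket(v, w):
    """Pointwise Lie bracket [v, w] together with its x-derivative."""
    return np.cross(v[0], w[0]), np.cross(v[1], w[0]) + np.cross(v[0], w[1])
```

The cocycle identity here reduces to the vanishing of the boundary values of the triple product ⟨w, [u, v]⟩, which is where the condition that v tends to zero at the end points enters.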
A brief calculation shows that {fd , fv } = fv and {fd , fc } = 0. 7.2. Adjoint symmetries of the model and dual model. In this section we consider the left multiplication symmetry again, however this time we can simultaneously describe the action on the dual models. This requires some care with the boundary conditions, and we shall take the double Neumann condition on loops, i.e. k = e and kx = 0 at both boundaries. The operation of left multiplication by constants does not preserve these conditions, but we can use our freedom to introduce a right multiplication to work with the adjoint action instead. Take the action on the phase space given by Add for d ∈ D. This preserves the boundary conditions, and preserves the models in the case where Add E± = E± . The corresponding infinitesimal motions are generated by the moment map
1 Iδ (k) = − kx k −1 , δdx, δ ∈ d. 2 If the map adδ preserves the subspaces E± then this formula gives conserved charges for the system. 7.3. Automorphism symmetries of the model and dual model. Here we consider symmetries of the phase space arising from group automorphisms θ : D = G M → D. This is really a generalisation of the previous subsection, where we just considered automorphisms given by the adjoint action, i.e. inner automorphisms. We consider the same boundary conditions as in the last subsection. For convenience we also assume that the two subspaces θE± of the Lie algebra d are perpendicular for the given inner product. This is not really needed, as we can always manufacture a new Ad-invariant inner product from the old one using the automorphism in order to make this true. Given these conditions, any automorphism θ : D → D will induce a map θ˜ on the phase space given by (θ˜ k)(x) = θ (k(x)). This map will be symplectic if θ preserves the given inner product on d, and if θ E± = E± , then the map will preserve the given models. In general θ˜ k will factor to give G-models and dual M-models which are a mixture of the original G-models and dual M-models given by factoring k. However there are two special cases worthy of mention. 1) The automorphism θ : D → D is called subgroup preserving if θG ⊂ G and θ M ⊂ M. In this case a factorisation k = us for u ∈ G and s ∈ M is sent to θ (k) = θ (u)θ (s), and θ (u) is a solution of the sigma model on G. In the same manner, if t is a solution of the sigma model on M, then θ(t) is also a solution of the sigma model on M. 2) The automorphism θ : D → D is called subgroup reversing if θG ⊂ M and θ M ⊂ G. If such an automorphism exists, the double D = G M is called self-dual [23]. In this case a factorisation k = us for u ∈ G and s ∈ M is sent to θ (k) = θ (u)θ (s), and θ (u) is a solution of the sigma model on M. In the same manner, if t is a solution of the dual sigma model on M, then θ(t) is also a solution
of the sigma model on G. In this manner the solutions of the sigma model on G and the dual sigma model on M are related by a group homomorphism from G to M, and in that sense the models are self-dual. Other symmetries may be constructed. For example if we have θ E+ = E− and θE− = E+ then the map θˆ (k)(t, x) = θ(k(t, π − x)) sends a solution k of the model into another solution. The explicit computation of examples of our generalised T-duality along the above lines is a topic for further work. However, the data required for the construction do exist in abundance. For example, given any two Lie algebras g0 ⊂ d whose Dynkin diagrams differ by the deletion of some nodes, one has an inductive construction d = (n>g0 ) n∗ , where n are braided-Lie bialgebras [22]. For a concrete example, one has, locally, D = SO(1, n + 1) = (Rn >SO(n)) Rn as the decomposition of conformal transformations into Poincaré and special conformal translations. The group D has a non-degenerate bilinear form as required (although not positive-definite). The explicit construction of the required factorisation and the associated bicrossproduct quantum groups and T-dual models will be attempted elsewhere. References 1. Klimcik, C. and Severa, P.: Dual non-Abelian duality and the Drinfeld double. Phys. Lett. B, 351, 455–462 (1995) 2. Klimcik, C. Poisson–Lie T-duality. Nucl. Phys. B (Proc. Suppl.) 46, 116–121 (1996) 3. C. Klimcik and P. Severa. Poisson–Lie T-duality and loop groups of Drinfeld doubles. Phys. Lett. B, 372, 65–71, (1996) 4. Majid, S.: Hopf algebras for physics at the Planck scale. J. Classical and Quantum Gravity 5, 1587–1606 (1988) 5. Majid, S.: Non-commutative-geometric Groups by a Bicrossproduct Construction. PhD thesis, Harvard mathematical physics, 1988 6. Majid, S.: Physics for algebraists: Non-commutative and non-cocommutative Hopf algebras by a bicrossproduct construction. J. Algebra 130, 17–64 (1990) 7. 
Majid, S.: Matched pairs of Lie groups associated to solutions of the Yang–Baxter equations. Pac. J. Math. 141, 311–332 (1990) 8. Majid, S. and Oeckl, R.: Twisting of quantum differentials and the Planck-scale Hopf algebra. Commun. Math. Phys. 205, 617–655 (1999) 9. Tseytlin, A.A.: Duality symmatrical closed string theory and interacting chiral scalars. Nucl. Phys. B 350, 395–440 (1991) 10. Semenov-Tian-Shansky, M.A.: Dressing transformations and Poisson group actions. Publ. RIMS (Kyoto) 21, 1237–1260 (1985) 11. Drinfeld, V.G. Hamiltonian structures on Lie groups, Lie bialgebras and the geometric meaning of the classical Yang–Baxter equations. Sov. Math. Dokl. 27, 68 (1983) 12. Majid, S. Foundations of Quantum Group Theory. Cambridge, Cambridge Univeristy Press, 1995 13. Novikov, S., Manakov, S.V., Pitaevskii, L.P. and Zakharov, V.E.: Theory of solitons. NewYork: Consultants Bureau, 1984 14. Lledo, M.A. and Varadarajan, V.S.: su(2)-Poisson–Lie T-duality. UCLA Preprint, 1998 15. Sfetsos, K.: Canonical equivalence of non-isometric sigma-models and Poisson–Lie T-duality. Nucl. Phys. B 517, 549–566 (1998) 16. Parkhomenko, S.E.: Mirror symmetry as a Poisson–Lie T-duality. Landau Inst. Preprint, 1997 17. Majid, S.: Quantum and braided group Riemannian geometry. J. Geom. Phys. 30, 113–146 (1999) 18. Sfetsos, K.: Poisson–Lie T-duality beyond the classical level and the renormalisation group. Phys. Lett. B 432, 365–375 (1998)
19. Alekseev, A.Yu., Klimcik, C. and Tseytlin, A.A.: Quantum Poisson–Lie T-duality and the WZNW model. Nucl. Phys. B 458, 430–444 (1996)
20. Drinfeld, V.G.: Quantum groups. In: A. Gleason, editor, Proceedings of the ICM, Providence, Rhode Island: AMS, 1987, pp. 798–820
21. Semenov-Tian-Shansky, M.A.: What is a classical R-matrix. Func. Anal. Appl. 17, 17 (1983)
22. Majid, S.: Braided-Lie bialgebras. Pac. J. Math. 192, 329–356 (2000)
23. Beggs, E.J. and Majid, S.: Quasitriangular and differential structures on bicrossproduct Hopf algebras. J. Algebra 219, 682–727 (1999)

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 220, 489 – 535 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Realizing Holonomic Constraints in Classical and Quantum Mechanics
Richard Froese1, Ira Herbst2
1 Department of Mathematics, University of British Columbia, Vancouver, British Columbia, Canada
2 Department of Mathematics, University of Virginia, Charlottesville, Virginia, USA
Received: 16 November 2000 / Accepted: 2 February 2001
Abstract: We consider the problem of constraining a particle to a smooth compact submanifold $\Sigma$ of configuration space using a sequence of increasing potentials. We compare the classical and quantum versions of this procedure. This leads to new results in both cases: an unbounded energy theorem in the classical case, and a quantum averaging theorem. Our two step approach, consisting of an expansion in a dilation parameter, followed by averaging in normal directions, emphasizes the role of the normal bundle of $\Sigma$, and shows when the limiting phase space will be larger (or different) than expected.

1. Introduction and Table of Contents

Consider a system of non-relativistic particles in a Euclidean configuration space $\mathbb{R}^{n+m}$ whose motion is governed by the Hamiltonian

$$H = \tfrac{1}{2}\langle p, p\rangle + V(x). \tag{1.1}$$
We are interested in the motion of these particles when their positions are constrained to lie on some n-dimensional smooth compact submanifold $\Sigma \subset \mathbb{R}^{n+m}$. In both classical and quantum mechanics there are accepted notions about what the constrained motion should be.

In classical mechanics, the Hamiltonian for the constrained motion is assumed to have the form (1.1), but whereas p and x originally denoted variables on the phase space $T^*\mathbb{R}^{n+m} = \mathbb{R}^{n+m} \times \mathbb{R}^{n+m}$, they now are variables on the cotangent bundle $T^*\Sigma$. The inner product $\langle p, p\rangle$ is now computed using the metric that $\Sigma$ inherits from $\mathbb{R}^{n+m}$, and V now denotes the restriction of V to $\Sigma$.

In quantum mechanics, $\langle p, p\rangle$ is interpreted to mean $-\Delta$, where $\Delta$ is the Laplace operator, and V(x) is the operator of multiplication by V. For unconstrained motion $\Delta$ is the Euclidean Laplacian on $\mathbb{R}^{n+m}$, and the Hamiltonian acts in $L^2(\mathbb{R}^{n+m})$. For constrained motion, the Laplace operator for $\Sigma$ with the inherited metric is used, and the Hilbert space is $L^2(\Sigma, d\mathrm{vol})$.
In both cases the description of the constrained motion is intrinsic: it depends only on the Riemannian structure that $\Sigma$ inherits from $\mathbb{R}^{n+m}$, but not on other details of the imbedding.

Of course, a constrained system of particles is an idealization. Instead of particles moving exactly on $\Sigma$, one might imagine there is a strong force pushing the particles onto the submanifold. The motion of the particles would then be governed by the Hamiltonian

$$H_\lambda = \tfrac{1}{2}\langle p, p\rangle + V(x) + \lambda^4 W(x), \tag{1.2}$$
where W is a positive potential vanishing exactly on $\Sigma$ and $\lambda$ is large. (The fourth power is just for notational convenience later on.) Does the motion described by $H_\lambda$ converge to the intrinsic constrained motion as $\lambda$ tends to infinity? Surprisingly, the answer to this question depends on exactly how it is asked, and is often no.

A situation in classical mechanics where the answer is yes is described by Rubin and Ungar [RU]. An initial position on $\Sigma$ and an initial velocity tangent to $\Sigma$ are fixed. Then, for a sequence of $\lambda$'s tending to infinity, the subsequent motions under $H_\lambda$ are computed. As $\lambda$ becomes large, these motions converge to the intrinsic constrained motion on $\Sigma$. This result is widely known, since it appears in Arnold's book [A1] on classical mechanics. However, from the physical point of view, it is neither completely natural to require that the initial position lies exactly on $\Sigma$, nor that the initial velocity be exactly tangent.

Rubin and Ungar also consider what happens if the initial velocity has a component in the direction normal to $\Sigma$. In this case, the motion in the normal direction is highly oscillatory, and there is an extra potential term, depending on the initial condition, in the Hamiltonian for the limiting motion on $\Sigma$. In their proof, $\Sigma$ is assumed to have co-dimension one.

A more complete result is given by Takens [T]. Here the initial conditions are allowed to depend on $\lambda$ in such a way that the initial position converges to a point on $\Sigma$ and the initial energy remains bounded. (We will give precise assumptions below.) Once again, the limiting motion on $\Sigma$ is governed by a Hamiltonian with an additional potential. Takens noticed that a non-resonance condition on the eigenvalues of the Hessian of the constraining potential W along $\Sigma$ is required to prove convergence. He also gave an example showing that if the Hessian of W has an eigenvalue crossing, so that the non-resonance condition is violated, then there may not be a good notion of limiting motion on $\Sigma$. In his example, he constructs two sequences of orbits, each one converging to an orbit on $\Sigma$. These limiting orbits are identical until they hit the point on $\Sigma$ where the eigenvalues cross. After that, they are different. This means there is no differential equation on $\Sigma$ governing the limiting motion.

For other discussions of the question of realizing constraints see [A2] and [G]. A modern survey of the classical mechanical results that emphasizes the systematic use of weak convergence is given by Bornemann and Schütte [BS].

The quantum case was considered previously by Tolar [T], da Costa [dC1, dC2] and in the path integral literature (see Anderson and Driver [AD]). Related work can also be found in Helffer and Sjöstrand [HS1, HS2], who obtained WKB expansions for the ground state, and in Duclos and Exner [DE], Figotin and Kuchment [FK], Schatzman [S] and Kuchment and Zeng [KZ]. The most general formal expansions appear in the preprint of Mitchell [M]. (We thank the referee for this reference.)

There are really two aspects to the problem of realizing constraints: a large $\lambda$ expansion followed by an averaging procedure to deal with highly oscillatory normal motion. Previous work in quantum mechanics concentrated on the first aspect (although a related averaging procedure for classical paths with a vanishingly small random perturbation can be found
in [F] and [FW]). Already a formal large $\lambda$ expansion reveals the interesting feature that the limiting Hamiltonian has an extra potential term depending on the scalar and mean curvatures. Since the mean curvature is not intrinsic, this potential does depend on the imbedding of $\Sigma$ in $\mathbb{R}^{n+m}$.

It is not completely straightforward to formulate a theorem in the quantum case. We have chosen a formulation, modeled on the classical mechanical theorems, tracking a sequence of orbits with initial positions concentrating on $\Sigma$ via dilations in the normal direction. Actually we consider the equivalent problem of tracking the evolution of a fixed vector governed by the Hamiltonian $H_\lambda$ conjugated by unitary dilations. In order to obtain simple limiting asymptotics for the orbit we must assume that all the eigenvalues of the Hessian of the constraining potential W are constant on $\Sigma$. In fact we will assume that W is exactly quadratic.

Our theorems show that for large $\lambda$ the motion is approximated by the motion generated by an averaged limiting Hamiltonian $\bar H_B$, with superimposed normal oscillations generated by $\lambda^2 H_O$, where $H_O$ is the normal harmonic oscillator Hamiltonian. The Hamiltonians $\bar H_B$ and $H_O$ commute, so the motions are independent. These theorems do not require any non-resonance conditions on the eigenvalues of the Hessian of W. However, the limiting Hamiltonian $\bar H_B$ does not act in $L^2(\Sigma)$, but in $L^2(N\Sigma)$, where $N\Sigma$ is the normal bundle of $\Sigma$. It is only in certain situations that one can effectively ignore the motion in the normal directions and obtain a unitary group on $L^2(\Sigma)$ implementing the dynamics of the tangential motion. This occurs, for example, if (a) the eigenvalues of the Hessian of W are all distinct and non-resonant, (b) the normal bundle is trivial, and (c) we confine our attention to a simultaneous eigenspace of all the number operators for the normal motion.
In the general situation, the dynamics of the additional degrees of freedom in $N\Sigma$ cannot be factored out, and we must be content with analysis on $L^2(N\Sigma)$.

Our formulation of the quantum theorems invites comparison with the classical mechanical results of Rubin and Ungar [RU] and Takens [T]. It turns out that the extra potentials that appear in the two cases are quite different, and there is no obvious connection. Upon reflection, the reason for this difference is clear. If we have a sequence of initial quantum states whose position distribution is being squeezed to lie close to $\Sigma$, then by the uncertainty principle, the distribution of initial momenta will be spreading out, and thus the initial energy will be unbounded. However, the classical mechanical convergence theorems above all deal with bounded energies.

The danger in considering unbounded energies is that even if the initial energy in the tangential mode is bounded, the coupling between tangential and normal modes may result in unbounded tangential energy in finite time. Our assumptions, which allow us to obtain a classical theorem despite the unbounded energy, are motivated by quantum mechanics. Our results for classical mechanics with unbounded initial energies are quite similar to our results in quantum mechanics.

Table of Contents
1. Introduction and Table of Contents
2. Classical Mechanics: Bounded Energy
3. Classical Mechanics: Unbounded Energy
4. Quantum Mechanics
5. Co-ordinate Expressions
6. Proofs of Theorems in Classical Mechanics
7. More Co-ordinate Expressions
8. Proofs of Theorems in Quantum Mechanics
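The Rubin and Ungar setting described in the introduction can be illustrated numerically. The following is only a sketch under assumptions that are not taken from the paper: the constraint manifold is the unit circle in the plane, V = 0, the constraining potential is $W = \tfrac12 (r-1)^2$, and the integrator (velocity Verlet), the step size and the name `max_radial_deviation` are illustrative choices. The orbit starts on the circle with unit tangent velocity, so the intrinsic constrained motion has angle $\theta(t) = t$. For this model, conservation of energy and angular momentum gives $|r - 1| < 2/\lambda^4$ along the exact orbit.

```python
import math

def max_radial_deviation(lam, t_final=2.0, dt=1e-4):
    """Integrate H = |p|^2/2 + lam^4 (r - 1)^2 / 2 in the plane (V = 0, an
    assumption of this sketch) with velocity Verlet, starting on the unit
    circle with unit tangent velocity.
    Returns (max |r - 1|, |angle(t_final) - t_final|)."""
    k = lam ** 4

    def force(x, y):
        r = math.hypot(x, y)
        f = -k * (r - 1.0) / r  # minus the gradient of k (r - 1)^2 / 2
        return f * x, f * y

    x, y, px, py = 1.0, 0.0, 0.0, 1.0
    fx, fy = force(x, y)
    max_dev = 0.0
    for _ in range(int(round(t_final / dt))):
        px += 0.5 * dt * fx
        py += 0.5 * dt * fy
        x += dt * px
        y += dt * py
        fx, fy = force(x, y)
        px += 0.5 * dt * fx
        py += 0.5 * dt * fy
        max_dev = max(max_dev, abs(math.hypot(x, y) - 1.0))
    # The constrained orbit on the circle is theta(t) = t.
    angle_error = abs(math.atan2(y, x) - t_final)
    return max_dev, angle_error

results = {lam: max_radial_deviation(lam) for lam in (2.0, 4.0, 8.0)}
for lam, (dev, err) in results.items():
    print(f"lambda={lam}: max|r-1|={dev:.2e}, angle error={err:.2e}")
```

In this toy model both the distance from the circle and the error in the angle shrink as $\lambda$ grows, which is the behaviour the convergence result describes for initial data on the manifold with tangent velocity.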
Section 2 contains a statement of the theorem of Rubin, Ungar and Takens on limiting orbits when the initial energies remain bounded. In Sect. 3 we state our expansion and averaging theorems in classical mechanics when the initial energies scale as they do in quantum mechanics. We also describe when the limiting motion can be thought of as a motion on $\Sigma$. These classical results are motivated by the parallel results in quantum mechanics, which we present in Sect. 4. The proofs of the theorems in Sects. 3 and 4 are found in Sects. 6 and 8 respectively, while Sects. 5 and 7 contain background material needed in the proofs. This paper is an expanded and improved version of the announcement [FH].

2. Classical Mechanics: Bounded Energy

To give a precise statement of our results we must introduce some notation. Let $\Sigma$ be a smooth compact n-dimensional submanifold of $\mathbb{R}^{n+m}$. The normal bundle to $\Sigma$ is the submanifold of $\mathbb{R}^{n+m} \times \mathbb{R}^{n+m}$ given by

$$N\Sigma = \{(\sigma, n) : \sigma \in \Sigma,\ n \in N_\sigma\Sigma\}.$$

Here $N_\sigma\Sigma$ denotes the normal space to $\Sigma$ at $\sigma$, identified with a subspace of $\mathbb{R}^{n+m}$. There is a natural map from $N\Sigma$ into $\mathbb{R}^{n+m}$ given by

$$\iota : (\sigma, n) \mapsto \sigma + n.$$

We now fix a sufficiently small $\delta$ so that this map is a diffeomorphism of $N\Sigma_\delta = \{(\sigma, n) : \|n\| < \delta\}$ onto a tubular neighbourhood of $\Sigma$ in $\mathbb{R}^{n+m}$. Then we can pull back the Euclidean metric from $\mathbb{R}^{n+m}$ to $N\Sigma_\delta$. Since we are interested in the motion close to $\Sigma$ we may use $N\Sigma_\delta$ as the classical configuration space. This will be convenient in what follows, and is justified below.

We will want to decompose vectors in the cotangent spaces of $N\Sigma_\delta$ into horizontal and vertical vectors, so we now explain this decomposition. Let $\pi : N\Sigma \to \Sigma$ denote the projection of the normal bundle onto the base given by $\pi : (\sigma, n) \mapsto \sigma$. The vertical subspace of $T_{(\sigma,n)}N\Sigma$ is defined to be the kernel of $d\pi : T_{(\sigma,n)}N\Sigma \to T_\sigma\Sigma$. The horizontal subspace is then defined to be the orthogonal complement (in the pulled back metric) of the vertical subspace. Using the identification of $T_{(\sigma,n)}N\Sigma$ with $T^*_{(\sigma,n)}N\Sigma$ given by the metric we obtain a decomposition of cotangent vectors into horizontal and vertical components as well. We will denote by $(\xi, \eta)$ the horizontal and vertical components of a vector in $T^*_{(\sigma,n)}N\Sigma$.

The decomposition can be explained more concretely as follows. For each point $\sigma \in \Sigma$, we may decompose $T_\sigma\mathbb{R}^{n+m} = T_\sigma\Sigma \oplus N_\sigma\Sigma$ into the tangent and normal space. Using the natural identification of all tangent spaces with $\mathbb{R}^{n+m}$, we may regard this as a decomposition of $\mathbb{R}^{n+m}$. Let $P^T_\sigma$ and $P^N_\sigma$ be the corresponding orthogonal projections. Since we are thinking of $N\Sigma$ as an (n+m)-dimensional submanifold of $\mathbb{R}^{n+m} \times \mathbb{R}^{n+m}$, we can identify $T_{(\sigma,n)}N\Sigma$ with the (n+m)-dimensional subspace of $\mathbb{R}^{n+m} \times \mathbb{R}^{n+m}$ given by all vectors of the form $(X, Y) = (\dot\sigma(0), \dot n(0))$, where $(\sigma(t), n(t))$ is a curve in $N\Sigma$ passing through $(\sigma, n)$ at time t = 0. The inner product of two such tangent vectors is

$$\langle (X_1, Y_1), (X_2, Y_2)\rangle = \langle X_1 + Y_1, X_2 + Y_2\rangle, \tag{2.1}$$

where the inner product on the right is the usual Euclidean inner product. For a tangent vector (X, Y), the decomposition into horizontal and vertical vectors is given by

$$(X, Y) = (X, P^T_\sigma Y) + (0, P^N_\sigma Y).$$
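The horizontal and vertical splitting above can be checked on a concrete example. The sketch below assumes the unit sphere in three dimensions, where the normal space at a point is spanned by the point itself, so that the normal projection is $Y \mapsto \langle\sigma, Y\rangle\sigma$. The helper names and the sample numbers are illustrative and not from the paper. The tangent vector (X, Y) is built from a curve $(\sigma(t), s(t)\sigma(t))$ in the normal bundle.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def scale(c, u):
    return [c * a for a in u]

def proj_normal(sigma, y):
    """Normal projection for the unit sphere (an assumption of this sketch):
    the normal space at sigma is spanned by sigma."""
    return scale(dot(sigma, y), sigma)

def proj_tangent(sigma, y):
    return add(y, scale(-1.0, proj_normal(sigma, y)))

def metric(v1, v2):
    """Inner product (2.1): <(X1, Y1), (X2, Y2)> = <X1 + Y1, X2 + Y2>."""
    (x1, y1), (x2, y2) = v1, v2
    return dot(add(x1, y1), add(x2, y2))

# A point (sigma, n) of the normal bundle with n = s * sigma.
sigma = [0.0, 0.6, 0.8]
s = 0.1
# Velocity of a curve (sigma(t), s(t) sigma(t)): X is tangent to the sphere.
X = [1.0, 0.4, -0.3]
s_dot = 2.0
Y = add(scale(s_dot, sigma), scale(s, X))

horizontal = (X, proj_tangent(sigma, Y))
vertical = ([0.0, 0.0, 0.0], proj_normal(sigma, Y))

print("horizontal:", horizontal)
print("vertical:  ", vertical)
print("<horizontal, vertical> =", metric(horizontal, vertical))
```

The vertical part has zero first component, so it lies in the kernel of the differential of the bundle projection, and the two parts are orthogonal in the metric (2.1) because X is tangent to the sphere while the normal projection of Y is normal to it.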
In the statements of our theorems we will want to express the fact that two cotangent vectors, for example $\xi_\lambda(t)$ and $\xi(t)$ in Theorem 2.1, are close, even though they belong to two different cotangent spaces. To do this we may use the imbedding to think of the vectors as elements of $\mathbb{R}^{2(n+m)}$. Then it makes sense to use the (Euclidean) norm of their difference, $\|\xi_\lambda(t) - \xi(t)\|$, to measure how close they are. We will use the symbol $\|\cdot\|$ in this situation, while $|\xi|$ will denote the norm of $\xi$ as a cotangent vector.

We will assume that the constraining potential is a $C^\infty$ function of the form

$$W(\sigma, n) = \tfrac{1}{2}\langle n, A(\sigma) n\rangle, \tag{2.2}$$

where for each $\sigma$, $A(\sigma)$ is a positive definite linear transformation on $N_\sigma\Sigma$. The Hamiltonian (1.2) can then be written

$$H_\lambda(\sigma, n, \xi, \eta) = \tfrac{1}{2}\langle \xi, \xi\rangle + \tfrac{1}{2}\langle \eta, \eta\rangle + V(\sigma + n) + \lambda^4 W(\sigma, n). \tag{2.3}$$

Notice that on the boundary of $N\Sigma_{\delta_1}$, for $0 < \delta_1 < \delta$, $H_\lambda(\sigma, n, \xi, \eta) \ge c_1\lambda^4 - c_2$ with

$$c_1 = \inf_{(\sigma,n):\, \sigma \in \Sigma,\ \|n\| = \delta_1} W(\sigma + n) > 0, \qquad c_2 = \sup_{(\sigma,n):\, \sigma \in \Sigma,\ \|n\| = \delta_1} |V(\sigma + n)|.$$
By conservation of energy, this implies that an orbit under $H_\lambda$ that starts out in $N\Sigma_{\delta_1}$ with initial energy less than $c_1\lambda^4 - c_2$ can never cross the boundary, and therefore stays in $N\Sigma_{\delta_1}$. We will only consider such orbits in this paper, and therefore are justified in taking our phase space to be $T^*N\Sigma_\delta$, or even $T^*N\Sigma$ if we extend $H_\lambda$ in some arbitrary way.

Since we expect the motion in the normal directions to consist of rapid harmonic oscillations, it is natural to introduce action variables for this motion. There is one for each distinct eigenvalue $\omega_\alpha^2(\sigma)$ of $A(\sigma)$. Let $P_\alpha(\sigma)$ be the projection onto the eigenspace of $\omega_\alpha^2(\sigma)$. This projection is defined on $N_\sigma\Sigma$, which we may think of as the range of $P^N_\sigma$ in $\mathbb{R}^{n+m}$. Thus the projection is defined on vertical vectors in $T_{(\sigma,n)}N\Sigma$ and, via the natural identification, on vertical vectors in $T^*_{(\sigma,n)}N\Sigma$. With this notation, the corresponding action variable, multiplied by $\lambda^2$ for notational convenience, is given by

$$I_\alpha^\lambda(\sigma, n, \xi, \eta) = \frac{1}{2\omega_\alpha(\sigma)}\langle \eta, P_\alpha \eta\rangle + \frac{\lambda^4\omega_\alpha(\sigma)}{2}\langle n, P_\alpha n\rangle. \tag{2.4}$$

Notice that the total normal energy is given by $\sum_\alpha \omega_\alpha I_\alpha^\lambda$. The following is a version of the theorem of Takens and Rubin, Ungar.
Theorem 2.1. Let $\Sigma$ be a smooth compact n-dimensional submanifold of $\mathbb{R}^{n+m}$. Let the Hamiltonian $H_\lambda$ on $T^*N\Sigma$ be given by (2.3), where $V, W \in C^\infty$, W has the form (2.2) and satisfies
(i) The eigenvalues $\omega_\alpha^2(\sigma)$ of $A(\sigma)$ have constant multiplicity.
Suppose that $(\sigma_\lambda, n_\lambda, \xi_\lambda, \eta_\lambda)$ are initial conditions in $T^*N\Sigma_\delta$ satisfying
(a) $\|\sigma_\lambda - \sigma_0\| + \|\xi_\lambda - \xi_0\| \to 0$,
(b) $I_\alpha^\lambda(\sigma_\lambda, n_\lambda, \xi_\lambda, \eta_\lambda) \to I_\alpha^0 > 0$,
as $\lambda \to \infty$. Let $(\sigma_\lambda(t), n_\lambda(t), \xi_\lambda(t), \eta_\lambda(t))$ denote the subsequent orbit in $T^*N\Sigma_\delta$ under the Hamiltonian $H_\lambda$. Suppose that $(\sigma(t), \xi(t))$ is the orbit in $T^*\Sigma$ with initial conditions $(\sigma_0, \xi_0)$ governed by the Hamiltonian

$$h(\sigma, \xi) = \tfrac{1}{2}\langle \xi, \xi\rangle_\sigma + V(\sigma) + \sum_\alpha I_\alpha^0\,\omega_\alpha(\sigma).$$

Then for any $T \ge 0$,

$$\sup_{0 \le t \le T} \|\sigma_\lambda(t) - \sigma(t)\| + \|\xi_\lambda(t) - \xi(t)\| \to 0$$

as $\lambda \to \infty$.

Implicit in this statement is the fact that the approximating orbit stays in the tubular neighbourhood for $0 \le t \le T$, provided $\lambda$ is sufficiently large. This theorem is actually true in greater generality. We can consider smooth constraining potentials W, where $\tfrac{1}{2}\langle n, A(\sigma) n\rangle$ is the first term in an expansion. If we choose our tubular neighbourhood so that $W(\sigma + n) \ge c|n|^2$ and impose the non-resonance condition $\omega_\alpha(\sigma) \ne \omega_\beta(\sigma) + \omega_\gamma(\sigma)$ for every choice of $\alpha$, $\beta$ and $\gamma$ and for every $\sigma$, then the same conclusion holds. This theorem is also really a local theorem: if we impose the conditions on W and the non-resonance condition locally, and take T to be a number less than the time where $\sigma(t)$ leaves the set where condition (i) is true, then the same conclusion holds as well.

Actually, Takens [T] only treats the case where all the eigenvalues $\omega_\alpha$ are distinct and the normal bundle is trivial. On the other hand, he does not require that $I_\alpha^0 > 0$. This positivity is a technical requirement of our proof and arises because action angle co-ordinates are singular on the surface $I_\alpha^0 = 0$. Since Theorem 2.1 is a minor variation of known results, we will not give a proof here.

3. Classical Mechanics: Unbounded Energy

We now describe our theorems in classical mechanics where the initial energies are diverging as they do in the quantum case. In quantum mechanics, the ground state energy of a harmonic oscillator $-\tfrac{1}{2}(d/dx)^2 + \tfrac{1}{2}\lambda^4\omega^2 x^2$ is $\lambda^2\omega/2$. Thus we will assume that the initial values of the action variables $I_\alpha^\lambda$ scale like $\lambda^2 I_\alpha^0$, and therefore that the initial normal energy diverges like $\lambda^2$. Examining the effective Hamiltonian $h(\sigma, \xi)$ in Theorem 2.1, one would expect there to be a diverging $\lambda^2\sum_\alpha I_\alpha^0\,\omega_\alpha(\sigma)$ potential term similar to the constraining potential but with strength $\lambda^2$. If this potential is not constant, and thus has a local minimum (called a mini-well in [HS1, HS2]), no limiting orbit could be expected in general unless the initial positions were chosen to converge to such a minimum. For simplicity, we will assume that there are no mini-wells, i.e., the frequencies $\omega_\alpha$ are constant.

The first step in our analysis is a large $\lambda$ expansion. It is convenient to implement this expansion using dilations in the fibre of the normal bundle. It is also convenient to assume that our configuration space is all of $N\Sigma$. This makes no difference, since the orbits we are considering never leave $N\Sigma_\delta$.
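The ground state energy $\lambda^2\omega/2$ quoted at the start of this section, which motivates the $\lambda^2$ scaling of the actions, can be checked with a few lines of code. The sketch below applies the oscillator Hamiltonian to the Gaussian $\psi(x) = \exp(-\lambda^2\omega x^2/2)$ using a central finite difference and evaluates $(H\psi)/\psi$ at a few points. The function name, the step `h` and the sample values of $\lambda$ and $\omega$ are illustrative assumptions.

```python
import math

def oscillator_local_energy(lam, omega, x, h=1e-4):
    """Return (H psi)(x) / psi(x) for
    H = -1/2 d^2/dx^2 + 1/2 lam^4 omega^2 x^2 and
    psi(x) = exp(-lam^2 omega x^2 / 2),
    with the second derivative replaced by a central difference."""
    def psi(y):
        return math.exp(-0.5 * lam ** 2 * omega * y * y)

    second = (psi(x + h) - 2.0 * psi(x) + psi(x - h)) / (h * h)
    potential = 0.5 * lam ** 4 * omega ** 2 * x * x
    return (-0.5 * second + potential * psi(x)) / psi(x)

lam, omega = 3.0, 1.5
for x in (-0.3, 0.0, 0.2):
    print(x, oscillator_local_energy(lam, omega, x), lam ** 2 * omega / 2)
```

The ratio is independent of x up to discretization error, as expected for an eigenfunction, and agrees with $\lambda^2\omega/2$.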
The dilation $d_\lambda : N\Sigma \to N\Sigma$ is defined by

$$d_\lambda(\sigma, n) = (\sigma, \lambda n).$$

As with any diffeomorphism of the configuration space, $d_\lambda$ has a symplectic lift $D_\lambda$ to the cotangent bundle given by

$$D_\lambda = d_{\lambda^{-1}}^{\,*} = (d_\lambda^{\,*})^{-1}.$$

The expression for $D_\lambda$ in local co-ordinates is given by (5.1). Instead of the original Hamiltonian $H_\lambda$ we may now consider the equivalent pulled back Hamiltonian

$$L_\lambda = H_\lambda \circ D_\lambda^{-1}.$$

Since $D_\lambda$ is a symplectic transformation, orbits under $H_\lambda$ and orbits under $L_\lambda$ are mapped to each other by $D_\lambda$ and its inverse. Therefore, it suffices to study the dynamics of the scaled Hamiltonian $L_\lambda$. A formal large $\lambda$ expansion yields

$$L_\lambda = H_B + \lambda^2 H_O + O(\lambda^{-1}),$$

where $H_O$ is the harmonic oscillator Hamiltonian

$$H_O(\sigma, n, \xi, \eta) = \tfrac{1}{2}\langle \eta, \eta\rangle + \tfrac{1}{2}\langle n, A(\sigma) n\rangle \tag{3.1}$$

and $H_B$ is the bundle Hamiltonian given by

$$H_B(\sigma, n, \xi, \eta) = \tfrac{1}{2}\langle J\xi, J\xi\rangle_\sigma + V(\sigma). \tag{3.2}$$

The inner product $\langle\cdot,\cdot\rangle_\sigma$ is the inner product on $T^*_\sigma\Sigma$ defined by the imbedding. Here J denotes the identification of the horizontal subspace of $T^*_{(\sigma,n)}N\Sigma$ with $T^*_\sigma\Sigma$ given in terms of the bundle projection map $\pi$ by $J = (d\pi^*_{\sigma,n})^{-1}$. This map is well defined on the horizontal subspace, since $d\pi_{\sigma,n} : T_{(\sigma,n)}N\Sigma \to T_\sigma\Sigma$ is an isomorphism when restricted to the horizontal subspace of $T_{(\sigma,n)}N\Sigma$. Thus, its adjoint $d\pi^*_{\sigma,n}$ is an isomorphism of $T^*_\sigma\Sigma$ onto the horizontal subspace of $T^*_{(\sigma,n)}N\Sigma$. In local co-ordinates $x_i$, $y_i$ defined in Sect. 5 below, where $x_i$ are co-ordinates for $\Sigma$, the map J simply identifies $dx_i \in T^*_{(\sigma,n)}N\Sigma$ with $dx_i \in T^*_\sigma\Sigma$.

Additional understanding of the Hamiltonians $H_B$ and $H_O$ can be obtained if we introduce another metric on $N\Sigma$. If $(X, Y) \in T_{(\sigma,n)}N\Sigma$, let

$$\langle (X, Y), (X, Y)\rangle_\lambda = \|X\|^2 + \lambda^{-2}\|P^N_\sigma Y\|^2. \tag{3.3}$$

(In Sect. 7 we describe in what sense this is a limiting form of the pulled-back, scaled, Euclidean metric.) If $\langle\cdot,\cdot\rangle_\lambda$ denotes the corresponding metric on the cotangent space, then

$$H_B + \lambda^2 H_O = \tfrac{1}{2}\langle (\xi, \eta), (\xi, \eta)\rangle_\lambda + \frac{\lambda^2}{2}\langle n, A(\sigma) n\rangle + V(\sigma).$$

The local co-ordinate expressions for $H_B$ and $H_O$ are given in (5.9) and (5.10). We will use the notation $\phi_t^H$ to denote the Hamiltonian flow governed by the Hamiltonian H.
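The formal expansion $L_\lambda = H_B + \lambda^2 H_O + O(\lambda^{-1})$ can be made concrete in a toy model. The sketch below drops the tangential variables altogether (an assumption made here for illustration, so that $H_B$ reduces to the constant V(0)) and keeps a single normal co-ordinate n with momentum $\eta$. The cotangent lift of $n \mapsto \lambda n$ then acts as $(n, \eta) \mapsto (\lambda n, \eta/\lambda)$, so its inverse is $(n, \eta) \mapsto (n/\lambda, \lambda\eta)$. All function names and the choice of potential are hypothetical.

```python
import math

def d_inverse(lam, n, eta):
    """Inverse of the cotangent lift of n -> lam * n in one normal variable."""
    return n / lam, lam * eta

def h_lambda(lam, n, eta, omega, potential):
    """Toy version of (2.3) without tangential variables."""
    return 0.5 * eta ** 2 + potential(n) + 0.5 * lam ** 4 * omega ** 2 * n ** 2

def h_oscillator(n, eta, omega):
    """Toy version of (3.1)."""
    return 0.5 * eta ** 2 + 0.5 * omega ** 2 * n ** 2

def scaled_remainder(lam, n, eta, omega, potential):
    """L_lam - (H_B + lam^2 H_O), with H_B = V(0) in this toy model."""
    n_s, eta_s = d_inverse(lam, n, eta)
    l_lam = h_lambda(lam, n_s, eta_s, omega, potential)
    return l_lam - potential(0.0) - lam ** 2 * h_oscillator(n, eta, omega)

def V(x):
    return math.sin(x) + 0.5  # an arbitrary smooth potential for the demo

for lam in (10.0, 100.0, 1000.0):
    r = scaled_remainder(lam, 0.7, -1.2, 2.0, V)
    print(f"lambda={lam}: remainder={r:.3e}, lambda*remainder={lam * r:.4f}")
```

In this model the remainder is exactly $V(n/\lambda) - V(0)$, so $\lambda$ times the remainder approaches $V'(0)\,n$, consistent with an $O(\lambda^{-1})$ error term.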
Theorem 3.1. Let $\Sigma$ be a smooth compact n-dimensional submanifold of $\mathbb{R}^{n+m}$. Let $L_\lambda = H_\lambda \circ D_\lambda^{-1}$, where the Hamiltonian $H_\lambda$ on $T^*N\Sigma$ is given by (2.3). Assume that $V, W \in C^\infty$, W has the form (2.2), and that the eigenvalues $\omega_\alpha^2$ of $A(\sigma)$ do not depend on $\sigma$. Suppose that $\gamma_\lambda$ are initial conditions in $T^*N\Sigma$ with $\gamma_\lambda \to \gamma_0$ as $\lambda \to \infty$. Then for any $T \ge 0$,

$$\sup_{0 \le t \le T} \left\|\phi_t^{L_\lambda}(\gamma_\lambda) - \phi_t^{H_B + \lambda^2 H_O}(\gamma_0)\right\| \to 0$$

as $\lambda \to \infty$.

In this theorem the normal energy of the initial conditions, $\lambda^2 H_O(\gamma_\lambda)$, grows like $\lambda^2$, since $H_O(\gamma_\lambda)$ is converging to $H_O(\gamma_0)$. This leads to increasingly rapid normal oscillations for both orbits $\phi_t^{L_\lambda}(\gamma_\lambda)$ and $\phi_t^{H_B + \lambda^2 H_O}(\gamma_0)$. Neither orbit converges as $\lambda$ becomes large. It is only their difference that converges.

The convergence of the initial conditions is stated for the scaled variables $\gamma_\lambda$. To find out what this implies for the original variables $(\tilde\sigma_\lambda, \tilde n_\lambda, \tilde\xi_\lambda, \tilde\eta_\lambda) = D_\lambda^{-1}\gamma_\lambda$ we must determine the action of $D_\lambda$ on horizontal and vertical vectors. This results in the following conditions:
(a) $\tilde\sigma_\lambda \to \sigma_0$,
(b) $\lambda\tilde n_\lambda \to n_0$,
(c) $\tilde\xi_\lambda \to J\xi_0$, and
(d) $\lambda^{-1}\tilde\eta_\lambda \to \eta_0$,
where $(\sigma_0, n_0, \xi_0, \eta_0) = \gamma_0$. Here we are thinking of $\sigma$, n as vectors in $\mathbb{R}^{n+m}$ and $\xi$, $\eta$ as vectors in $\mathbb{R}^{2(n+m)}$. We may also compute what these conditions mean for the initial velocities $(X_\lambda, Y_\lambda) \in T_{(\tilde\sigma_\lambda, \tilde n_\lambda)}N\Sigma$, again thought of as vectors in $\mathbb{R}^{2(n+m)}$. It turns out that
(c′) $X_\lambda \to X_0$, and
(d′) $\lambda^{-1}Y_\lambda \to Y_0$.

This theorem gives a satisfactory description of the limiting motion if the Poisson bracket of $H_B$ and $H_O$ vanishes. Then the flows generated by $H_B$ and $H_O$ commute and the motion is given by the rapid oscillations generated by $\lambda^2 H_O$ superimposed on the flow generated by $H_B$. In this situation we can perform averaging by simply ignoring the oscillations.

An example where $\{H_B, H_O\}$ is zero is when $\Sigma$ has codimension one, or, more generally, if the connection form, given by (3.10) below, vanishes. Then $H_B$ only involves variables on $T^*\Sigma$, so the motion for large $\lambda$ is a motion on $\Sigma$ with independent oscillations in the normal variables. The Poisson bracket $\{H_B, H_O\}$ also vanishes if all the frequencies $\omega_\alpha$ are equal, but in this case the motion generated by $H_B$ need not only involve the variables on $T^*\Sigma$. The motion generated by $H_B$ can be thought of as a generalized minimal coupling type flow. (See [GS] for a description of the geometry of this sort of flow.) The flow has the property that the trajectories in $N\Sigma$ are parallel along their projections onto $\Sigma$. In particular, $|n|^2$ is preserved by this motion.

In general, when the frequencies are not all equal, the flows generated by $H_B$ and $\lambda^2 H_O$ interact, and $H_B + \lambda^2 H_O$ generates a more complicated flow which need not be
simply related to the flows generated by $H_B$ and $H_O$. Let $\bar H_B$ be defined by

$$\bar H_B(\gamma) = \lim_{T \to \infty} T^{-1}\int_0^T H_B \circ \phi_t^{H_O}(\gamma)\,dt. \tag{3.4}$$

The existence of this limit follows from the Fourier expansion discussed below. This averaged Hamiltonian Poisson commutes with $H_O$. It turns out that the flow for large $\lambda$ is the one generated by this Hamiltonian, with superimposed normal oscillations.

Theorem 3.2. Assume that the assumptions of Theorem 3.1 hold. Let $H_O$, $H_B$ and $\bar H_B$ be the Hamiltonians given by (3.1), (3.2) and (3.4) respectively. Let $\gamma_0 \in T^*N\Sigma$ and $T > 0$. Then

$$\sup_{0 \le t \le T} \left\|\phi_t^{H_B + \lambda^2 H_O}(\gamma_0) - \phi_t^{\lambda^2 H_O} \circ \phi_t^{\bar H_B}(\gamma_0)\right\| \to 0 \tag{3.5}$$

as $\lambda \to \infty$.

In this theorem we do not impose a non-resonance condition. However, the form of the averaged Hamiltonian $\bar H_B$ depends crucially on whether or not resonances are present. To explain this further we introduce scaled action variables. Recall that the scaled Hamiltonian was defined by $L_\lambda = H_\lambda \circ D_\lambda^{-1}$. We perform a similar scaling on the action variables and define $I_\alpha$ by $I_\alpha^\lambda \circ D_\lambda^{-1} = \lambda^2 I_\alpha$. Then

$$I_\alpha(\sigma, n, \xi, \eta) = \frac{1}{2\omega_\alpha}\langle \eta, P_\alpha \eta\rangle + \frac{\omega_\alpha}{2}\langle n, P_\alpha n\rangle.$$

Suppose that there are $m_0$ distinct eigenvalues $\omega_\alpha^2$. Then the flows $\phi_t^{I_\alpha}$ are commuting harmonic oscillations in the normal variables. They are periodic, satisfying $\phi_{t+2\pi}^{I_\alpha} = \phi_t^{I_\alpha}$. We therefore obtain a group action $\Phi$ of the $m_0$ torus $T^{m_0}$ on $T^*N\Sigma$ defined by

$$\Phi_\tau = \phi_{\tau_1}^{I_1} \circ \cdots \circ \phi_{\tau_{m_0}}^{I_{m_0}},$$

for $\tau = (\tau_1, \ldots, \tau_{m_0}) \in T^{m_0}$. Notice that $\phi_t^{H_O} = \Phi_{t\omega}$, where $\omega = (\omega_1, \ldots, \omega_{m_0})$. Now we may perform a Fourier expansion of $H_B \circ \Phi_\tau$ yielding

$$H_B \circ \Phi_\tau = \sum_{\nu \in \mathbb{Z}^{m_0}} e^{i\langle \nu, \tau\rangle} F_\nu$$

so that

$$H_B \circ \phi_t^{H_O} = \sum_{\nu \in \mathbb{Z}^{m_0}} e^{it\langle \nu, \omega\rangle} F_\nu.$$

It turns out that only finitely many $F_\nu$'s are non-zero. Thus we may exchange the integral and limit in the definition of $\bar H_B$ with the Fourier sum to obtain

$$\bar H_B = \sum_{\nu \in \mathbb{Z}^{m_0}} \left(\lim_{T \to \infty} T^{-1}\int_0^T e^{it\langle \nu, \omega\rangle}\,dt\right) F_\nu = \sum_{\nu \in \mathbb{Z}^{m_0}:\ \langle \nu, \omega\rangle = 0} F_\nu.$$
The non-resonance condition on the eigenvalues $\omega = (\omega_1, \ldots, \omega_{m_0})$ in this situation would be

$$\text{If } \nu \ne 0 \text{ and } F_\nu \ne 0 \text{ then } \langle \nu, \omega\rangle \ne 0. \tag{3.6}$$

If this condition holds, we find that $\bar H_B = F_0$.

We now examine the case $m_0 = m$, where there are m distinct frequencies $\omega_\alpha$. We wish to describe how the limiting motion generated by $\bar H_B$ can be thought of as taking place on $\Sigma$. To begin, since $\{\bar H_B, I_\alpha\} = 0$ for each $\alpha$, each $I_\alpha$ is a constant of the motion, so the motion takes place on the level sets of $I_1, \ldots, I_m$. Furthermore, we want to disregard the normal oscillations. Technically, we may do this by replacing the original phase space $T^*N\Sigma$ with its quotient by the group action $\Phi$. This amounts to ignoring the angle variables in local action angle co-ordinates. It turns out that

$$T^*N\Sigma/\Phi = T^*\Sigma \times \mathbb{R}^m, \tag{3.7}$$

where the variables in $\mathbb{R}^m$ are the action variables. Since these are constant, we may think of the motion as taking place on $T^*\Sigma$.

To describe the identification (3.7) we first make a new direct sum decomposition of each cotangent space $T^*_{(\sigma,n)}N\Sigma$. Since there are m distinct eigenvalues $\omega_1, \ldots, \omega_m$, the corresponding eigenvectors, defined globally up to sign, give an orthonormal frame for the normal bundle. In this situation the co-ordinates $y_i = \langle n, n_i(\sigma)\rangle$ are also globally defined up to sign. Thus the subspace of $T^*_{(\sigma,n)}N\Sigma$ spanned by $dy_1, \ldots, dy_m$ is globally defined. This subspace is complementary to the horizontal subspace, but is not necessarily orthogonal. Given horizontal and vertical components $(\xi, \eta)$ of a vector in $T^*_{(\sigma,n)}N\Sigma$, we may write $\xi + \eta = \xi_1 + \eta_1$, where $\xi_1$ is horizontal and $\eta_1$ is in the span of $dy_1, \ldots, dy_m$. The map from $T^*N\Sigma \to T^*\Sigma \times \mathbb{R}^m$ given by

$$(\sigma, n, \xi, \eta) \mapsto (\sigma, J\xi_1, I_1(\sigma, n, \xi, \eta), \ldots, I_m(\sigma, n, \xi, \eta))$$

is invariant under $\Phi$ and gives rise to the identification (3.7).

Now suppose that the values of $I_1, \ldots, I_m$ have been fixed by the initial condition. Then the Hamiltonian governing the motion on $T^*\Sigma$ depends on these "hidden" variables, and is given by

$$h_B(\sigma, \xi; I_1, \ldots, I_m) = \tfrac{1}{2}\langle \xi, \xi\rangle_\sigma + V(\sigma) + V_1(\sigma; I_1, \ldots, I_m), \tag{3.8}$$

provided the non-resonance condition holds. Given that the eigenvalues are distinct, the following implies (3.6):

$$\begin{aligned} &\text{If } j, k, l \text{ and } m \text{ are all distinct then } \omega_j + \omega_k \pm \omega_l - \omega_m \ne 0. \\ &\text{If } j, k \text{ and } l \text{ are all distinct then } 2\omega_j \pm \omega_k - \omega_l \ne 0. \end{aligned} \tag{3.9}$$

The extra potential $V_1$ is defined in terms of the frame for the normal bundle, $n_1(\sigma), \ldots, n_m(\sigma)$, consisting of normalized eigenvectors of $A(\sigma)$. Let $b_{k,l}$ be the associated connection one-form given by

$$b_{k,l}[\cdot] = \langle n_k, dn_l[\cdot]\rangle. \tag{3.10}$$

Then

$$V_1(\sigma; I_1, \ldots, I_m) = \sum_{k,l} \frac{I_k I_l\,\omega_l}{\omega_k}\,|b_{k,l}|^2. \tag{3.11}$$

Notice that the norm $|b_{k,l}|$ is insensitive to the choice of signs for the frame.
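The role of resonances in the averaged Hamiltonian can be seen from the elementary time average that selects the Fourier modes. The sketch below evaluates $T^{-1}\int_0^T e^{it\langle\nu,\omega\rangle}\,dt$ in closed form and then keeps the modes with $\langle\nu,\omega\rangle = 0$. The Fourier coefficients in `coeffs` are made-up numbers used only to show the bookkeeping, and the function names are hypothetical.

```python
import cmath
import math

def time_average(nu, omega, T):
    """Closed form of (1/T) * integral over [0, T] of exp(i t <nu, omega>)."""
    phase = sum(k * w for k, w in zip(nu, omega))
    if phase == 0:
        return 1.0 + 0.0j
    return (cmath.exp(1j * phase * T) - 1.0) / (1j * phase * T)

def resonant_part(fourier, omega, tol=1e-12):
    """Keep the modes nu with <nu, omega> = 0, as in the formula for the
    averaged Hamiltonian. `fourier` maps integer tuples nu to coefficients."""
    return {nu: f for nu, f in fourier.items()
            if abs(sum(k * w for k, w in zip(nu, omega))) < tol}

# Hypothetical Fourier coefficients, for illustration only.
coeffs = {(0, 0): 1.0, (2, -1): 0.3, (1, 1): 0.2, (1, -1): 0.1}

resonant = (1.0, 2.0)                # <(2, -1), omega> = 0
non_resonant = (1.0, math.sqrt(2.0))

print(time_average((2, -1), resonant, 1e6))
print(abs(time_average((2, -1), non_resonant, 1e6)))
print(sorted(resonant_part(coeffs, resonant)))
print(sorted(resonant_part(coeffs, non_resonant)))
```

With the resonant frequencies the mode $\nu = (2, -1)$ survives the average, while with the non-resonant pair only $\nu = 0$ remains, matching the statement that the averaged Hamiltonian reduces to $F_0$ when (3.6) holds.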
4. Quantum Mechanics

In quantum mechanics, we wish to understand the time evolution generated by $H_\lambda$ for large $\lambda$, where $H_\lambda$ is the Hamiltonian given by (1.2) with $\langle p, p\rangle = -\Delta$. As in the classical case, it is convenient to replace the original configuration space $\mathbb{R}^{n+m}$ with the normal bundle $N\Sigma$. We will show that if the initial conditions in $L^2(\mathbb{R}^{n+m})$ are supported near $\Sigma$ then, to a good approximation for large $\lambda$, the time evolution stays near $\Sigma$. Thus we lose nothing by inserting Dirichlet boundary conditions on the boundary of the tubular neighbourhood of $\Sigma$, and may transfer our considerations to $L^2(N\Sigma_\delta, d\mathrm{vol})$, where dvol is computed using the pulled back metric. If we extend the pulled back metric, and make a suitable definition of $H_\lambda$ in the complement of $N\Sigma_\delta$, we may remove the boundary condition. Thus we may assume that the Hamiltonian $H_\lambda$ acts in $L^2(N\Sigma, d\mathrm{vol})$.

More precisely, we let $g_{N\Sigma}$ be any complete smooth Riemannian metric on $N\Sigma$ that equals the metric induced from the imbedding in the region $\{(\sigma, n) : \|n\| < \epsilon\}$, for some $\epsilon < \delta$. For example, such a $g_{N\Sigma}$ could be obtained by smoothly joining the induced metric for small $\|n\|$ with the metric $\langle\cdot,\cdot\rangle_1$ given by (3.3) for large $\|n\|$. Let dvol denote the Riemannian density for $g_{N\Sigma}$. Let $V(\sigma, n)$ be a smooth bounded function on $N\Sigma$ such that $V(\sigma, n) = V(\sigma + n)$ when $\|n\| < \epsilon$. Our goal in this section is to analyze the time evolution generated by

$$H_\lambda = -\tfrac{1}{2}\Delta + V(\sigma, n) + \frac{\lambda^4}{2}\langle n, A(\sigma) n\rangle \tag{4.1}$$

acting in $L^2(N\Sigma, d\mathrm{vol})$. Here $\Delta$ denotes the Laplace–Beltrami operator for $g_{N\Sigma}$.

We now introduce the group of dilations in the normal directions by defining

$$(D_\lambda\psi)(\sigma, n) = \lambda^{m/2}\psi(\sigma, \lambda n).$$

This is a unitary operator from $L^2(N\Sigma, d\mathrm{vol}_\lambda)$ to $L^2(N\Sigma, d\mathrm{vol})$, where $d\mathrm{vol}_\lambda$ denotes the pulled back density $d\mathrm{vol}_\lambda(\sigma, n) = d\mathrm{vol}(\sigma, \lambda^{-1}n)$. Since the spaces $L^2(N\Sigma, d\mathrm{vol}_\lambda)$ depend on $\lambda$, and we want to deal with a fixed Hilbert space as $\lambda \to \infty$, we perform an additional unitary transformation. Let

$$d\mathrm{vol}_{N\Sigma} = \lim_{\lambda \to \infty} d\mathrm{vol}_\lambda = d\mathrm{vol}_\Sigma \otimes d\mathrm{vol}_{\mathbb{R}^m}.$$

Then the quotient of densities $d\mathrm{vol}_{N\Sigma}/d\mathrm{vol}_\lambda$ is a function on $N\Sigma$ and we may define $M_\lambda$ to be the operator of multiplication by $\sqrt{d\mathrm{vol}_{N\Sigma}/d\mathrm{vol}_\lambda}$. The operator $M_\lambda$ is unitary from $L^2(N\Sigma, d\mathrm{vol}_{N\Sigma})$ to $L^2(N\Sigma, d\mathrm{vol}_\lambda)$. Let

$$U_\lambda = D_\lambda M_\lambda. \tag{4.2}$$
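The unitarity of the dilation and the squeezing of supports can be checked numerically in the simplest situation. The sketch below assumes a single flat fibre variable (m = 1) with Lebesgue measure, so that the density correction $M_\lambda$ is the identity. The test function, the integration grid and the function names are illustrative assumptions.

```python
import math

def l2_norm_sq(f, a=-20.0, b=20.0, steps=40000):
    """Midpoint rule for the integral of |f|^2 over [a, b]."""
    h = (b - a) / steps
    return sum(abs(f(a + (i + 0.5) * h)) ** 2 for i in range(steps)) * h

def dilate(psi, lam, m=1):
    """(D_lam psi)(y) = lam^(m/2) psi(lam y) in a flat m-dimensional fibre;
    only m = 1 is exercised here."""
    return lambda y: lam ** (m / 2.0) * psi(lam * y)

def psi(y):
    return (1.0 + y) * math.exp(-0.5 * y * y)

norm_psi = l2_norm_sq(psi)
norm_dilated = l2_norm_sq(dilate(psi, 5.0))
# Fraction of the dilated state's mass within distance 0.5 of the origin.
mass_near_zero = l2_norm_sq(dilate(psi, 5.0), a=-0.5, b=0.5) / norm_dilated

print(norm_psi, norm_dilated, mass_near_zero)
```

The two norms agree, while almost all of the mass of the dilated function sits in a small neighbourhood of the zero section of this one-dimensional fibre.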
Notice that the support of a family of initial conditions of the form $U_\lambda\psi$ is being squeezed close to $\Sigma$ as $\lambda \to \infty$. We want to consider such a sequence of initial conditions. Therefore it is natural to consider the conjugated Hamiltonian

$$L_\lambda = U_\lambda^* H_\lambda U_\lambda,$$

since the evolution generated by $L_\lambda$ acting on $\psi$ is unitarily equivalent to the evolution generated by $H_\lambda$ acting on $U_\lambda\psi$. As a first step we perform a large $\lambda$ expansion. Formally, this yields

$$L_\lambda = H_B + \lambda^2 H_O + O(\lambda^{-1}),$$
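where $H_O$ is the quantum harmonic oscillator Hamiltonian in the normal variables, and $H_B$ is the quantum version of the corresponding classical Hamiltonian, except with an additional potential
\[ K = \frac{n(n-1)}{4}\,s - \frac{n^2}{8}\,\|h\|^2. \]
Here $s$ is the scalar curvature and $h$ is the mean curvature vector (see Eqs. (7.2) and (7.1)). Notice that this extra potential does depend on the imbedding of $\Sigma$ in $\mathbb{R}^{n+m}$, since the mean curvature does. The quadratic forms for $H_O$ and $H_B$ are
\[ \langle\psi, H_O\psi\rangle = \int_{N\Sigma}\Big(\frac{1}{2}\langle P^V d\psi, P^V d\psi\rangle_{\sigma,n} + \frac{1}{2}\langle n, A(\sigma)n\rangle|\psi|^2\Big)\,\mathrm{dvol}_{N\Sigma} \tag{4.3} \]
and
\[ \langle\psi, H_B\psi\rangle = \int_{N\Sigma}\Big(\frac{1}{2}\langle J P^H d\psi, J P^H d\psi\rangle_{\sigma} + (V(\sigma, 0) + K(\sigma))|\psi|^2\Big)\,\mathrm{dvol}_{N\Sigma}. \tag{4.4} \]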
Local co-ordinate expressions for these operators are given by (7.7) and (7.6) below. As in the classical case, we can gain additional understanding of these operators by introducing the metric (3.3). Then
\[ H_B + \lambda^2 H_O = -\frac{1}{2}\Delta_\lambda + \frac{\lambda^2}{2}\langle n, A(\sigma)n\rangle + V(\sigma, 0) + K(\sigma), \]
where $\Delta_\lambda$ is the Laplace–Beltrami operator on $N\Sigma$ with the metric (3.3). Note that the volume element $\mathrm{dvol}_{N\Sigma}$ is actually $\lambda^m$ times the usual volume element associated to this metric (see Sect. 7). The operator $H_O$ is explicitly given on $C^2$ functions in its domain by the formula
\[ (H_O\psi)(\sigma, n) = \Big(-\frac{1}{2}\sum_{k=1}^m\frac{\partial^2}{\partial y_k^2} + \frac{1}{2}\langle n, A(\sigma)n\rangle\Big)\psi\Big(\sigma, \sum_{k=1}^m y_k n_k(\sigma)\Big), \]
where $\{n_k(\sigma) : k = 1, \dots, m\}$ is any orthonormal basis for $N_\sigma\Sigma$ and $n = \sum_{k=1}^m y_k n_k(\sigma)$. It is easy to show that with the metric (3.3), $N\Sigma$ is complete so that any positive integer power of $H_B + \lambda^2 H_O$ is essentially self-adjoint on $C_0^\infty$ for $\lambda > 0$ [C]. Similarly, because $H_O$ is basically a harmonic oscillator Hamiltonian, it is straightforward to show that any positive integer power of $H_O$ is essentially self-adjoint on $C_0^\infty$. The operator $H_B$ is more complicated, but also can be shown to be essentially self-adjoint on $C_0^\infty$. The argument is not difficult and will be omitted.

Theorem 4.1. Let $\Sigma$ be a smooth compact $n$-dimensional submanifold of $\mathbb{R}^{n+m}$. Let $g_N$ be a complete smooth Riemannian metric on $N\Sigma$ that coincides with the induced metric when $\|n\| < \epsilon$, for some $\epsilon < \delta$, and suppose $V(\sigma, n)$ is a bounded smooth extension of $V(\sigma + n)$. Let $H_\lambda$ be the Hamiltonian given by (4.1), acting in $L^2(N\Sigma, \mathrm{dvol})$. Assume that $A(\sigma)$ varies smoothly, and that the eigenvalues $\omega_\alpha^2$ of $A(\sigma)$ do not depend on $\sigma$. Let $L_\lambda = U_\lambda^* H_\lambda U_\lambda$ acting in $L^2(N\Sigma, \mathrm{dvol}_{N\Sigma})$. Then, for every $\psi\in L^2(N\Sigma, \mathrm{dvol}_{N\Sigma})$ and every $T > 0$,
\[ \lim_{\lambda\to\infty}\sup_{0\le t\le T}\Big\|\Big(e^{-itL_\lambda} - e^{-it(H_B + \lambda^2 H_O)}\Big)\psi\Big\| = 0. \]
Realizing Holonomic Constraints in Classical and Quantum Mechanics
Just as in the classical case, this theorem provides a satisfactory description of the motion if $[H_B, H_O] = 0$, so that $\exp(-it(H_B + \lambda^2 H_O)) = \exp(-itH_B)\exp(-it\lambda^2 H_O)$. As before, this will happen, for example, if $\Sigma$ has co-dimension one, or if all the frequencies $\omega_\alpha$ are equal.

If $\Sigma$ has co-dimension one, then the normal bundle is trivial. (We are assuming that $\Sigma$ is compact.) Then we have $L^2(N\Sigma, \mathrm{dvol}_{N\Sigma}) = L^2(\Sigma, \mathrm{dvol}_\Sigma)\otimes L^2(\mathbb{R}, dy)$ and $H_B = h_B\otimes I$ for a Schrödinger operator $h_B$ acting in $L^2(\Sigma, \mathrm{dvol}_\Sigma)$. Since $H_O = I\otimes h_O$ we have that $\exp(-it(H_B + \lambda^2 H_O)) = \exp(-ith_B)\otimes\exp(-it\lambda^2 h_O)$. This can be interpreted as a motion in $L^2(\Sigma, \mathrm{dvol}_\Sigma)$ with superimposed normal oscillations.

In the case where the frequencies $\omega_\alpha$ are all equal, the normal bundle may be non-trivial, and there is not such a simple tensor product decomposition of $L^2(N\Sigma, \mathrm{dvol}_{N\Sigma})$. However, for some initial conditions $\psi$ the limiting motion may again be thought of as taking place in $L^2(\Sigma, \mathrm{dvol}_\Sigma)$ with superimposed oscillations. For example, consider the subspace of functions in $L^2(N\Sigma, \mathrm{dvol}_{N\Sigma})$ that are radially symmetric in the fibre variable $n$. This subspace does have a tensor product decomposition $L^2(\Sigma, \mathrm{dvol}_\Sigma)\otimes L^2_{\mathrm{radial}}(\mathbb{R}^m, d^m y)$. It is an invariant subspace for $H_B$. Furthermore, the restriction of $H_B$ to this subspace has the form $h_B\otimes I$. Thus, if $\psi_0$ is a radial function in $n$, then $\exp(-itL_\lambda)\psi_0 = \exp(-ith_B)\otimes\exp(-it\lambda^2 h_O)\psi_0$. As above, we interpret this as motion in $L^2(\Sigma, \mathrm{dvol}_\Sigma)$ with superimposed normal oscillations.

On the other hand, if the normal bundle is non-trivial, it may happen that the limiting motion takes place on a space of sections of a vector bundle over $\Sigma$. Instead of giving more details about the general case, we offer the following illustrative example. Instead of a normal bundle, consider the Möbius band $B$ defined by $\mathbb{R}\times\mathbb{R}/\sim$, where $(x, y)\sim(x + 1, -y)$. This is an $O(1)$ bundle over $S^1$ with fibre $\mathbb{R}$.
An $L^2$ function $\psi$ on $B$ can be thought of as a function on $\mathbb{R}\times\mathbb{R}$ satisfying $\psi(x + 1, -y) = \psi(x, y)$. If we decompose $\psi(x, y)$, for fixed $x$, into odd and even functions of $y$, $\psi(x, y) = \psi_{\mathrm{even}}(x, y) + \psi_{\mathrm{odd}}(x, y)$, then $\psi_{\mathrm{even}}(x + 1, y) = \psi_{\mathrm{even}}(x, y)$ and $\psi_{\mathrm{odd}}(x + 1, y) = -\psi_{\mathrm{odd}}(x, y)$. (Notice that these are eigenfunctions for the left regular representation of $O(1)$ on $L^2(\mathbb{R})$.) Thus $\psi_{\mathrm{even}}$ can be thought of as an $L^2(\mathbb{R}, dy)$ valued function on $S^1$, while $\psi_{\mathrm{odd}}$ can be thought of as an $L^2(\mathbb{R}, dy)$ valued section of a line bundle over $S^1$ (which happens to be $B$ itself). In this way we obtain the decomposition
\[ L^2(B) = L^2(S^1, dx)\otimes L^2_{\mathrm{even}}(\mathbb{R}, dy)\ \oplus\ \mathcal{B}(S^1, dx)\otimes L^2_{\mathrm{odd}}(\mathbb{R}, dy), \]
where $\mathcal{B}(S^1, dx)$ is the space of $L^2$ sections of $B$. In this example, the bundle is flat, so $H_B = -D_x^2 + V(x)$ and $H_O = -D_y^2 + y^2/2$ acting in $L^2(B, dx\,dy)$. Let $h_+ = -D_x^2 + V(x)$ acting in $L^2(S^1, dx)$ and $h_- = -D_x^2 + V(x)$ acting in $\mathcal{B}(S^1, dx)$. Let $h_O = -D_y^2 + y^2/2$ acting in $L^2(\mathbb{R}, dy)$, with $L^2_{\mathrm{even}}(\mathbb{R}, dy)$ and $L^2_{\mathrm{odd}}(\mathbb{R}, dy)$ as invariant subspaces. Then
\[ e^{-it(H_B + \lambda^2 H_O)} = e^{-ith_+}\otimes e^{-it\lambda^2 h_O}\ \oplus\ e^{-ith_-}\otimes e^{-it\lambda^2 h_O}. \]
So if the initial condition happens to lie in $\mathcal{B}\otimes L^2_{\mathrm{odd}}$, then we would think of the limiting motion as taking place in $\mathcal{B}$, with superimposed oscillations in $L^2_{\mathrm{odd}}$.
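The parity decomposition in the Möbius band example can be checked on a concrete function. The sketch below uses a hypothetical `psi` chosen by hand to satisfy the twisted periodicity $\psi(x + 1, -y) = \psi(x, y)$; the helper names `even_part` and `odd_part` are illustrative and not taken from the paper. It confirms that the even part is periodic in $x$ and the odd part is antiperiodic.

```python
import numpy as np

def psi(x, y):
    """Hypothetical function on R x R with psi(x + 1, -y) == psi(x, y)."""
    return (np.cos(2 * np.pi * x) * np.exp(-y ** 2)
            + np.sin(np.pi * x) * y * np.exp(-y ** 2))

def even_part(f):
    """Even part of f in the fibre variable y."""
    return lambda x, y: 0.5 * (f(x, y) + f(x, -y))

def odd_part(f):
    """Odd part of f in the fibre variable y."""
    return lambda x, y: 0.5 * (f(x, y) - f(x, -y))

rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, 1000)
y = rng.uniform(-3.0, 3.0, 1000)

psi_even, psi_odd = even_part(psi), odd_part(psi)

# Deviation from the Moebius identification psi(x + 1, -y) = psi(x, y).
twist_error = float(np.max(np.abs(psi(x + 1, -y) - psi(x, y))))
# The even part should be periodic in x, the odd part antiperiodic.
even_error = float(np.max(np.abs(psi_even(x + 1, y) - psi_even(x, y))))
odd_error = float(np.max(np.abs(psi_odd(x + 1, y) + psi_odd(x, y))))
```

The antiperiodic odd part is exactly what makes $\psi_{\mathrm{odd}}$ a section of a non-trivial line bundle over $S^1$ rather than a function on $S^1$.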
When $H_B$ and $H_O$ do not commute, we perform a quantum version of averaging. Define $\bar H_B$ on $C_0^\infty$ by
\[ \bar H_B\psi = \lim_{T\to\infty} T^{-1}\int_0^T e^{itH_O} H_B e^{-itH_O}\psi\,dt. \tag{4.5} \]
It can be shown that $\bar H_B$ is essentially self-adjoint.

Theorem 4.2. Assume that the hypotheses of Theorem 4.1 hold. Let $H_O$, $H_B$, and $\bar H_B$ be the Hamiltonians defined by (4.3), (4.4) and (4.5). Then, for every $\psi\in L^2(N\Sigma, \mathrm{dvol}_{N\Sigma})$ and every $T > 0$,
\[ \lim_{\lambda\to\infty}\sup_{0\le t\le T}\Big\|\Big(e^{-it(H_B + \lambda^2 H_O)} - e^{-it\lambda^2 H_O}e^{-it\bar H_B}\Big)\psi\Big\| = 0. \]
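The proof that this limit defining $\bar H_B$ exists parallels the discussion in classical mechanics. Suppose that there are $m_0$ distinct eigenvalues $\omega_1^2, \dots, \omega_{m_0}^2$. For each $\alpha = 1, \dots, m_0$ define the operators $I_\alpha$ via the quadratic forms
\[ \langle\psi, I_\alpha\psi\rangle = \int_{N\Sigma}\Big(\frac{1}{2\omega_\alpha}\langle P^V d\psi, P_\alpha P^V d\psi\rangle + \frac{\omega_\alpha}{2}\langle n, P_\alpha n\rangle|\psi|^2\Big)\,\mathrm{dvol}_{N\Sigma}. \]
These operators all commute and satisfy
\[ \sum_\alpha\omega_\alpha I_\alpha = H_O. \]
An expression for $I_\alpha$ in terms of local creation and annihilation operators will be given near the end of Sect. 7. In that section we will show that $e^{i\tau I_\alpha} H_B e^{-i\tau I_\alpha}$ is periodic in $\tau$ with period $2\pi$. Thus if we conjugate $H_B$ with $e^{i\sum_\alpha\tau_\alpha I_\alpha}$, the resulting operator is defined on the torus $T^{m_0}$ and has a Fourier expansion
\[ e^{i\sum_\alpha\tau_\alpha I_\alpha} H_B e^{-i\sum_\alpha\tau_\alpha I_\alpha} = \sum_{\nu\in\mathbb{Z}^{m_0}} e^{i\langle\nu,\tau\rangle}F_\nu. \]
Here $\tau = (\tau_1, \dots, \tau_{m_0})$ and the coefficients $F_\nu$ are differential operators. As in the classical case, the sum is finite. Thus
\[ e^{itH_O} H_B e^{-itH_O} = \sum_{\nu\in\mathbb{Z}^{m_0}} e^{it\langle\nu,\omega\rangle}F_\nu. \]
This shows that the limit defining $\bar H_B$ exists, and is given by
\[ \bar H_B = \sum_{\nu\in\mathbb{Z}^{m_0} : \langle\nu,\omega\rangle = 0} F_\nu. \]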
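The reason only the resonant modes with $\langle\nu,\omega\rangle = 0$ survive is that the time average of each phase $e^{it\langle\nu,\omega\rangle}$ tends to zero otherwise. A small sketch of this scalar mechanism follows, assuming a hypothetical pair of frequencies and a hypothetical finite list of modes; `time_average_phase` is an illustrative name and the code works with numbers, not with the operators $F_\nu$.

```python
import numpy as np

def time_average_phase(theta, T):
    """Closed form of (1/T) * integral_0^T exp(i * t * theta) dt."""
    if theta == 0:
        return 1.0 + 0.0j
    return (np.exp(1j * T * theta) - 1.0) / (1j * T * theta)

# Hypothetical frequencies and Fourier modes (assumptions of this sketch).
omega = np.array([1.0, np.sqrt(2.0)])
modes = [(0, 0), (2, 0), (1, -1), (0, -2), (2, 2)]

def averaged_weights(T):
    """Weight multiplying each F_nu after averaging over [0, T]."""
    return {nu: time_average_phase(float(np.dot(nu, omega)), T) for nu in modes}

weights = averaged_weights(1.0e4)
# Only nu = (0, 0) is resonant here; every other weight is bounded by
# 2 / (T * |<nu, omega>|) and so vanishes as T grows.
```

For non-resonant $\nu$ the bound $2/(T|\langle\nu,\omega\rangle|)$ follows directly from the closed form, and since the sum over $\nu$ is finite the limit can be taken term by term.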
As in the classical case, we may look for conditions under which the limiting motion can be considered to take place on $\Sigma$. Suppose that the eigenvalues $\omega_1, \dots, \omega_m$ are all distinct, and, in addition, that the eigenvectors $n_k(\sigma)$ can be chosen to be smooth functions on all of $\Sigma$. Then the normal bundle is trivial, $N\Sigma = \Sigma\times\mathbb{R}^m$ and $L^2(N\Sigma, \mathrm{dvol}_{N\Sigma}) = L^2(\Sigma, \mathrm{dvol}_\Sigma)\otimes L^2(\mathbb{R}^m, d^m y)$. If the non-resonance condition (3.9) holds, then
\[ \bar H_B = \Big(-\frac{1}{2}\Delta_\Sigma + V(\sigma) + K(\sigma)\Big)\otimes 1 + V_1. \]
The term $V_1$ is slightly different from (3.11), because terms arising in its computation do not all commute. It is given by
\[ V_1 = \sum_{k,l}\Big(\frac{\omega_l}{\omega_k}I_k I_l - \frac{1}{4}\Big)|b_{k,l}|^2. \]
The joint eigenspaces of $I_1, \dots, I_m$ are invariant subspaces for $\bar H_B$. The restriction of $\bar H_B$ to such a joint eigenspace is the Schrödinger operator $-\frac{1}{2}\Delta_\Sigma + V(\sigma) + K(\sigma) + \tilde V_1$, acting in $L^2(\Sigma, \mathrm{dvol}_\Sigma)$, where $\tilde V_1$ is obtained from $V_1$ by replacing the operators $I_k$ by their respective eigenvalues. Thus $\bar H_B$ is a direct sum of Schrödinger operators acting in $L^2(\Sigma, \mathrm{dvol}_\Sigma)$.

5. Co-ordinate Expressions

Our proofs will rely on local co-ordinate expressions for the quantities introduced above. Suppose $x(\sigma)$ is a local co-ordinate map for $\Sigma$. Its inverse $\sigma(x)$ is a local imbedding of $\mathbb{R}^n$ onto $\Sigma\subset\mathbb{R}^{n+m}$. Given a local orthonormal frame $n_1(\sigma), \dots, n_m(\sigma)$ for the normal bundle, we obtain local co-ordinates for $N\Sigma$ by setting
\[ x_i(\sigma, n) = x_i(\sigma),\quad i = 1, \dots, n, \qquad y_i(\sigma, n) = \langle n_i(\sigma), n\rangle,\quad i = 1, \dots, m. \]
We then may form the standard bases $\partial/\partial x_1, \dots, \partial/\partial x_n, \partial/\partial y_1, \dots, \partial/\partial y_m$ for the tangent spaces of $N\Sigma$ and $dx_1, \dots, dx_n, dy_1, \dots, dy_m$ for the cotangent spaces. This gives rise to local co-ordinates for $TN\Sigma$ and $T^*N\Sigma$ in the standard way. For the cotangent bundle, we will denote these by $(x, y, p, r)\in\mathbb{R}^{2(n+m)}$. Thus $(x, y, p, r)$ denotes the cotangent vector $\sum_i p_i\,dx_i + \sum_j r_j\,dy_j$ in the cotangent space over $(\sigma(x), \sum_j y_j n_j(\sigma))$. The standard symplectic form for $T^*N\Sigma$ is the two form given by
\[ \omega = \sum_{i=1}^n dp_i\wedge dx_i + \sum_{j=1}^m dr_j\wedge dy_j. \]
The dilation map $D_\lambda$ is given in local co-ordinates by
\[ D_\lambda(x, y, p, r) = (x, \lambda y, p, \lambda^{-1}r). \tag{5.1} \]
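Since (5.1) is linear in the co-ordinates $z = (x, y, p, r)$, its symplectic character reduces to a matrix identity: with $\omega(u, v) = u^T\Omega v$, the Jacobian $M$ of the dilation must satisfy $M^T\Omega M = \Omega$. The sketch below checks this numerically for hypothetical dimensions; the function names and the sign convention chosen for $\Omega$ are assumptions of the example.

```python
import numpy as np

def dilation_matrix(n, m, lam):
    """Jacobian of (x, y, p, r) -> (x, lam * y, p, r / lam)."""
    diag = np.concatenate([np.ones(n), lam * np.ones(m),
                           np.ones(n), np.ones(m) / lam])
    return np.diag(diag)

def symplectic_matrix(n, m):
    """Matrix Omega of sum dp_i ^ dx_i + sum dr_j ^ dy_j in the order (x, y, p, r)."""
    d = n + m
    omega = np.zeros((2 * d, 2 * d))
    omega[:d, d:] = -np.eye(d)   # position rows paired with momentum columns
    omega[d:, :d] = np.eye(d)    # momentum rows paired with position columns
    return omega

n, m, lam = 3, 2, 7.5            # hypothetical dimensions and scaling
M = dilation_matrix(n, m, lam)
Omega = symplectic_matrix(n, m)

# The dilation is symplectic exactly when M^T Omega M equals Omega.
defect = float(np.max(np.abs(M.T @ Omega @ M - Omega)))
```

The factors $\lambda$ and $\lambda^{-1}$ multiply only in the $dr_j\wedge dy_j$ terms, which is why they cancel.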
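Clearly this map preserves the symplectic form $\omega$. We now compute the local expression for the metric. Let $\sigma_i(x)\in\mathbb{R}^{n+m}$ denote the vector $\partial\sigma(x)/\partial x_i$. The tangent vector $\partial/\partial x_i\in T_{(\sigma,n)}N\Sigma$ corresponds to the vector in $\mathbb{R}^{2(n+m)}$ given by $(\sigma_i, \sum_j y_j\,dn_j(\sigma)[\sigma_i])$. The tangent vector $\partial/\partial y_j$ corresponds to $(0, n_j(\sigma))$. Here $\sigma = \sigma(x)$, $\sigma_i = \sigma_i(x)$ and $n = \sum_j y_j n_j(\sigma(x))$. Using (2.1) for the inner product, we find that the local expression for the metric has block form
\[ G(x, y) = \begin{pmatrix} G_\Sigma + C + BB^T & B\\ B^T & I\end{pmatrix} = \begin{pmatrix} I & B\\ 0 & I\end{pmatrix}\begin{pmatrix} G_\Sigma + C & 0\\ 0 & I\end{pmatrix}\begin{pmatrix} I & B\\ 0 & I\end{pmatrix}^T, \tag{5.2} \]
where $G_\Sigma = G_\Sigma(x)$ is the metric for $\Sigma$ with matrix entries $\langle\sigma_i(x), \sigma_j(x)\rangle$, $B = B(x, y)$ is the matrix with entries
\[ B_{i,j}(x, y) = \sum_k y_k\langle dn_k[\sigma_i], n_j\rangle, \tag{5.3} \]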
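and where $C = C(x, y)$ is the matrix with entries
\[ \begin{aligned} C_{i,j}(x, y) &= \sum_k y_k\big(\langle dn_k[\sigma_i], \sigma_j\rangle + \langle\sigma_i, dn_k[\sigma_j]\rangle\big) + \sum_{k,l} y_k y_l\langle dn_k[\sigma_i], dn_l[\sigma_j]\rangle - (BB^T)_{i,j}\\ &= \sum_k y_k\big(\langle dn_k[\sigma_i], \sigma_j\rangle + \langle\sigma_i, dn_k[\sigma_j]\rangle\big) + \sum_{k,l} y_k y_l\langle dn_k[\sigma_i], P^T_\sigma\,dn_l[\sigma_j]\rangle. \end{aligned} \tag{5.4} \]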
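The geometrical meaning of the term $G_\Sigma + C$ is given in (7.12) below. The inverse can be written
\[ G^{-1}(x, y) = \begin{pmatrix} I & -B\\ 0 & I\end{pmatrix}^T\begin{pmatrix}(G_\Sigma + C)^{-1} & 0\\ 0 & I\end{pmatrix}\begin{pmatrix} I & -B\\ 0 & I\end{pmatrix}. \tag{5.5} \]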
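The block factorisation of the metric and the resulting formula for its inverse are pure linear algebra and can be sanity-checked numerically. The sketch below uses randomly generated stand-ins for $G_\Sigma + C$ (any symmetric positive definite matrix) and $B$; these test matrices and the function names are assumptions of the example and carry no geometric meaning.

```python
import numpy as np

def block_metric(GC, B):
    """Assemble G = U diag(GC, I) U^T with U = [[I, B], [0, I]]."""
    n, m = B.shape
    U = np.block([[np.eye(n), B], [np.zeros((m, n)), np.eye(m)]])
    D = np.block([[GC, np.zeros((n, m))], [np.zeros((m, n)), np.eye(m)]])
    return U @ D @ U.T

def block_metric_inverse(GC, B):
    """Assemble G^{-1} = W^T diag(GC^{-1}, I) W with W = [[I, -B], [0, I]]."""
    n, m = B.shape
    W = np.block([[np.eye(n), -B], [np.zeros((m, n)), np.eye(m)]])
    Dinv = np.block([[np.linalg.inv(GC), np.zeros((n, m))],
                     [np.zeros((m, n)), np.eye(m)]])
    return W.T @ Dinv @ W

rng = np.random.default_rng(1)
n, m = 3, 2                              # hypothetical dimensions
A = rng.normal(size=(n, n))
GC = A @ A.T + n * np.eye(n)             # stand-in for G_Sigma + C
B = rng.normal(size=(n, m))              # stand-in for B(x, y)

G = block_metric(GC, B)
G_inv = block_metric_inverse(GC, B)
G_expected = np.block([[GC + B @ B.T, B], [B.T, np.eye(m)]])
```

The same factorisation shows $\det G = \det(G_\Sigma + C)$, since the triangular factors have unit determinant.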
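The local expressions for the projections onto the vertical and horizontal subspaces can now be computed. Let $P_V$ and $P_H$ denote the projections for the tangent space and $P^V$ and $P^H$ the projections for the cotangent spaces. Then
\[ P_V = \begin{pmatrix} 0 & 0\\ B^T & I\end{pmatrix}, \qquad P_H = \begin{pmatrix} I & 0\\ -B^T & 0\end{pmatrix}, \]
and
\[ P^V = GP_VG^{-1} = \begin{pmatrix} 0 & B\\ 0 & I\end{pmatrix}, \qquad P^H = GP_HG^{-1} = \begin{pmatrix} I & -B\\ 0 & 0\end{pmatrix}. \]
Notice that the vertical subspace of $T_{(\sigma,n)}N\Sigma$ is the span of $\partial/\partial y_1, \dots, \partial/\partial y_m$ and the horizontal subspace of $T^*_{\sigma,n}N\Sigma$ is the span of $dx_1, \dots, dx_n$. The map $d\pi_{\sigma,n} : T_{(\sigma,n)}N\Sigma\to T_\sigma\Sigma$ sends $\partial/\partial x_i\in T_{(\sigma,n)}N\Sigma$ to $\partial/\partial x_i\in T_\sigma\Sigma$ and sends $\partial/\partial y_i\in T_{(\sigma,n)}N\Sigma$ to 0. From this it follows that $J = d\pi^{*-1}_{\sigma,n}$, defined on the horizontal subspace of $T^*_{\sigma,n}N\Sigma$, sends $dx_i\in T^*_{\sigma,n}N\Sigma$ to $dx_i\in T^*_\sigma\Sigma$. If $(\sigma, n, \xi, \eta)$ has co-ordinates $(x, y, p, r)$ then $\xi$ has co-ordinates
\[ P^H(x, y)\begin{pmatrix} p\\ r\end{pmatrix} = \begin{pmatrix} p - B(x, y)r\\ 0\end{pmatrix}, \]
so that $J\xi$ has co-ordinates $p - B(x, y)r$.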
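We now compute the expressions for $H_\lambda$, $H_B$ and $H_O$ in local co-ordinates. We will abuse notation and use the same letters to denote functions on $T^*N\Sigma$ and their co-ordinate expressions. Suppose that the co-ordinates of $(\sigma, n, \xi, \eta)$ are $(x, y, p, r)$. Since
\[ G^{-1}P^V = (P^V)^T G^{-1}P^V = \begin{pmatrix} 0 & 0\\ 0 & I\end{pmatrix} \tag{5.6} \]
we have that
\[ \langle\eta, \eta\rangle = \Big\langle P^V\begin{pmatrix} p\\ r\end{pmatrix}, G^{-1}P^V\begin{pmatrix} p\\ r\end{pmatrix}\Big\rangle = \langle r, r\rangle. \tag{5.7} \]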
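The projection formulas and the identities (5.6) and (5.7) can be verified in the same numerical spirit. The sketch below again uses random stand-ins for $G_\Sigma + C$ and $B$ (assumptions of the example), builds $P_V$ as the $G$-orthogonal projection onto the span of the $\partial/\partial y_j$, and compares the cotangent projections and norms with the closed forms.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 3, 2                                  # hypothetical dimensions
A = rng.normal(size=(n, n))
GC = A @ A.T + n * np.eye(n)                 # stand-in for G_Sigma + C
B = rng.normal(size=(n, m))                  # stand-in for B(x, y)

G = np.block([[GC + B @ B.T, B], [B.T, np.eye(m)]])
G_inv = np.linalg.inv(G)

# Tangent-space projections onto the vertical and horizontal subspaces.
P_V = np.block([[np.zeros((n, n)), np.zeros((n, m))], [B.T, np.eye(m)]])
P_H = np.eye(n + m) - P_V

# Cotangent-space projections obtained by conjugating with the metric.
PV_cot = G @ P_V @ G_inv
PH_cot = G @ P_H @ G_inv

p, r = rng.normal(size=n), rng.normal(size=m)
v = np.concatenate([p, r])

# Squared lengths of the vertical and horizontal parts of the covector (p, r).
eta_sq = float((PV_cot @ v) @ G_inv @ (PV_cot @ v))
xi_sq = float((PH_cot @ v) @ G_inv @ (PH_cot @ v))
xi_sq_expected = float((p - B @ r) @ np.linalg.inv(GC) @ (p - B @ r))
```

Here `eta_sq` reduces to $\langle r, r\rangle$, while `xi_sq` reproduces the kinetic term in $p - Br$ that appears in the co-ordinate expression for $H_\lambda$.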
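Here, and in what follows, inner products involving vectors always refer to Euclidean inner products of co-ordinate vectors. For example, $\langle r, r\rangle = \sum_{i=1}^m r_i^2$. For the horizontal vectors, we have
\[ \begin{pmatrix} I & -B\\ 0 & I\end{pmatrix}P^H = P^H \]
so that
\[ \langle\xi, \xi\rangle = \Big\langle P^H\begin{pmatrix} p\\ r\end{pmatrix}, G^{-1}P^H\begin{pmatrix} p\\ r\end{pmatrix}\Big\rangle = \langle(p - Br), (G_\Sigma + C)^{-1}(p - Br)\rangle. \tag{5.8} \]
Therefore the local co-ordinate expression for $H_\lambda$ is
\[ H_\lambda(x, y, p, r) = \frac{1}{2}\langle(p - Br), (G_\Sigma + C)^{-1}(p - Br)\rangle + \frac{1}{2}\langle r, r\rangle + \frac{\lambda^4}{2}\langle y, A(x)y\rangle + V(x, y). \]
Here $C = C(x, y)$ and $B = B(x, y)$ are the matrices appearing in the expression for the metric $G$, $A(x)$ is the matrix for $A(\sigma)$ in the basis given by the orthonormal frame $n_1, \dots, n_m$ used to define the co-ordinate system and $V(x, y) = V\big(\sigma(x) + \sum_k y_k n_k(\sigma(x))\big)$. Similarly
\[ H_B(x, y, p, r) = \frac{1}{2}\langle(p - Br), G_\Sigma^{-1}(p - Br)\rangle + V(x, 0), \tag{5.9} \]
where $B = B(x, y)$ and $G_\Sigma = G_\Sigma(x)$. Finally
\[ H_O(x, y, p, r) = \frac{1}{2}\langle r, r\rangle + \frac{1}{2}\langle y, A(x)y\rangle. \tag{5.10} \]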
The expressions for $H_O$ and $I_\alpha$ simplify if we can choose the vectors in the local orthonormal frame to be eigenvectors of $A(\sigma)$. This is always possible if there are no eigenvalue crossings. When, in addition, the eigenvalues $\omega_\alpha^2(\sigma)$ do not depend on $\sigma$ there are further simplifications. In what follows we will assume that there are $m_0$ distinct constant eigenvalues $\omega_\alpha^2$ for $\alpha = 1, \dots, m_0$, where $\omega_\alpha^2$ has multiplicity $k_\alpha$. We will assume that the local orthonormal frame used to define the co-ordinate system consists of eigenvectors for $A(\sigma)$. We label them $n_{\alpha,j}$, where $\alpha = 1, \dots, m_0$ and $j = 1, \dots, k_\alpha$, where for each $\alpha$, $n_{\alpha,j}$ is an eigenvector with eigenvalue $\omega_\alpha^2$. This means that the co-ordinates $y$ and $r$ now also acquire a double labelling. First of all we have
\[ H_O(x, y, p, r) = \frac{1}{2}\langle r, r\rangle + \frac{1}{2}\sum_\alpha\omega_\alpha^2\sum_j y_{\alpha,j}^2. \]
If the co-ordinates of $(\sigma, n, \xi, \eta)$ are $(x, y, p, r)$, then
\[ \langle n, P_\alpha n\rangle = \sum_j y_{\alpha,j}^2. \]
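The vertical cotangent vector $\eta$ has co-ordinates $P^V\begin{pmatrix} p\\ r\end{pmatrix}$. The corresponding tangent vector has co-ordinates $G^{-1}P^V\begin{pmatrix} p\\ r\end{pmatrix}$ which equals $\begin{pmatrix} 0\\ r\end{pmatrix}$, by (5.6). Now the projection $P_\alpha$, acting on tangent vectors, just picks off the basis vectors $\partial/\partial y_{\alpha,j}$, i.e., $P_\alpha\,\partial/\partial y_{\beta,j} = \delta_{\beta,\alpha}\,\partial/\partial y_{\beta,j}$. Thus
\[ \langle\eta, P_\alpha\eta\rangle = \sum_j r_{\alpha,j}^2. \]
Therefore
\[ I_\alpha(x, y, p, r) = \frac{1}{2\omega_\alpha}\sum_j r_{\alpha,j}^2 + \frac{\omega_\alpha}{2}\sum_j y_{\alpha,j}^2. \]
Notice that in this situation, where the vectors in the local orthonormal frame are eigenvectors of $A(\sigma)$, neither $H_O$ nor $I_\alpha$ depend on $x$ or $p$.

Now we introduce local action-angle co-ordinates. In analogy with creation and destruction operators in quantum mechanics, we define the complex quantities
\[ a_{\alpha,j} = \frac{y_{\alpha,j}\,\omega_\alpha + i r_{\alpha,j}}{\sqrt{2\omega_\alpha}}, \]
so that
\[ y_{\alpha,j} = \frac{1}{\sqrt{2\omega_\alpha}}\,(a_{\alpha,j} + a^*_{\alpha,j}), \qquad r_{\alpha,j} = -i\sqrt{\frac{\omega_\alpha}{2}}\,(a_{\alpha,j} - a^*_{\alpha,j}). \]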
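The complex variables can be exercised on a single oscillator. The following sketch, with hypothetical values for $y$, $r$ and $\omega$ and illustrative function names, checks the inversion formulas, confirms that $|a|^2$ equals the action $r^2/(2\omega) + \omega y^2/2$, and verifies that the flow of $r^2/2 + \omega^2 y^2/2$ (with $\dot y = r$, $\dot r = -\omega^2 y$, a convention assumed for this example) acts on $a$ by the rotation $e^{-i\omega t}$.

```python
import numpy as np

def to_complex(y, r, omega):
    """a = (omega * y + i * r) / sqrt(2 * omega)."""
    return (y * omega + 1j * r) / np.sqrt(2 * omega)

def from_complex(a, omega):
    """Recover (y, r) from a and its complex conjugate."""
    y = (a + np.conj(a)) / np.sqrt(2 * omega)
    r = -1j * np.sqrt(omega / 2) * (a - np.conj(a))
    return y.real, r.real

def action(y, r, omega):
    """Single-oscillator action r^2 / (2 omega) + omega y^2 / 2."""
    return r ** 2 / (2 * omega) + omega * y ** 2 / 2

def oscillator_flow(y, r, omega, t):
    """Exact solution of y' = r, r' = -omega^2 y after time t."""
    c, s = np.cos(omega * t), np.sin(omega * t)
    return c * y + s * r / omega, -omega * s * y + c * r

y0, r0, omega, t = 0.7, -1.3, 2.5, 0.9       # hypothetical sample values
a0 = to_complex(y0, r0, omega)
y1, r1 = from_complex(a0, omega)             # should reproduce (y0, r0)
yt, rt = oscillator_flow(y0, r0, omega, t)
at = to_complex(yt, rt, omega)               # should equal exp(-i omega t) * a0
```

Because the flow only rotates $a$, every monomial in $a$ and $a^*$ picks up a pure phase, which is the mechanism behind the finite Fourier expansions used in the averaging argument.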
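The action variables $I_{\alpha,j}\in\mathbb{R}$ and angle variables $\varphi_{\alpha,j}\in S^1$ are then defined by $a_{\alpha,j} = \sqrt{I_{\alpha,j}}\,e^{i\varphi_{\alpha,j}}$. Notice that $\sum_j I_{\alpha,j} = I_\alpha$. The change of co-ordinates from $(x, y, p, r)$ to $(x, \varphi, p, I)$ is symplectic, since $dr_{\alpha,j}\wedge dy_{\alpha,j} = dI_{\alpha,j}\wedge d\varphi_{\alpha,j}$. This makes it easy to compute the flow $\phi_t^{I_\alpha}$ in these co-ordinates. Hamilton's equations for the flow are
\[ \dot x_i = 0, \qquad \dot p_i = 0, \qquad \dot I_{\beta,j} = 0, \qquad \dot\varphi_{\beta,j} = \delta_{\beta,\alpha}. \]
Thus, under the flow $\phi_t^{I_\alpha}$ each $\varphi_{\alpha,j}$ is translated by $t$ and all the other variables remain unchanged. This implies that under the group action $\Phi(\tau)$, with $\tau = (\tau_1, \dots, \tau_{m_0})$, the quantities $a_{\alpha,j}$ evolve as $e^{-i\tau_\alpha}a_{\alpha,j}$. We now compute the expression for $H_B$ in action angle co-ordinates. We find
\[ \begin{aligned} (Br)_i &= \sum_{\alpha,j} B_{i,(\alpha,j)}(x, y)\,r_{\alpha,j}\\ &= \sum_{\beta,k,\alpha,j} b^i_{(\alpha,j),(\beta,k)}(x)\,r_{\alpha,j}\,y_{\beta,k}\\ &= \sum_{\beta,k,\alpha,j}\frac{b^i_{(\alpha,j),(\beta,k)}(x)}{2i}\,(a_{\alpha,j} - a^*_{\alpha,j})(a_{\beta,k} + a^*_{\beta,k})\sqrt{\frac{\omega_\alpha}{\omega_\beta}}. \end{aligned} \]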
Here $b^i_{(\alpha,j),(\beta,k)}(x) = b_{(\alpha,j),(\beta,k)}[\sigma_i(x)]$ is the antisymmetric matrix given by (3.10). The expression for $H_B$ is now obtained by substituting this formula for $Br$ into (5.9), which we may rewrite as
\[ H_B(x, p, \varphi, I) = \frac{1}{2}\sum_{i,l} p_i g^{i,l}p_l - \sum_{i,l}(Br)_i g^{i,l}p_l + \frac{1}{2}\sum_{i,l}(Br)_i g^{i,l}(Br)_l + V(x, 0). \]
Here $g^{i,l} = g^{i,l}(x)$ are the matrix elements of $G_\Sigma^{-1}(x)$. To obtain the expression for $H_B\circ\Phi(\tau)$ we simply replace each occurrence of $a_{\alpha,j}$ in the formula above with $e^{i\tau_\alpha}a_{\alpha,j}$. Since $H_B$ contains only constant, quadratic and quartic terms in $a_{\alpha,j}, a^*_{\alpha,j}$, we see that the Fourier expansion of $H_B\circ\Phi(\tau)$ has finitely many terms, since the $\nu = (\nu_1, \dots, \nu_{m_0})$'s that appear have $\sum_\alpha|\nu_\alpha|\in\{0, 2, 4\}$.

6. Proofs of Theorems in Classical Mechanics

Proof of Theorem 3.1. We begin with some remarks about the co-ordinate charts for $T^*N\Sigma$. We will assume that the frames used to define the co-ordinates consist of eigenvectors of $A(\sigma)$. We assume that each chart has the form $\{(\sigma, n, \xi, \eta) : \sigma\in U,\ n\in N_\sigma\Sigma,\ \xi\in T^*_{\sigma,n}N\Sigma \text{ is horizontal},\ \eta\in T^*_{\sigma,n}N\Sigma \text{ is vertical}\}$, where $U$ is a co-ordinate chart for $\Sigma$. Since $\Sigma$ is compact, there is an atlas with finitely many charts, and there exists a positive number $\epsilon_1$ so that two points in $T^*N\Sigma$ both lie in a single chart if their projections onto $\Sigma$ are a distance less than $\epsilon_1$ apart. We use the notation
\[ \gamma_\lambda(t) = \phi_t^{L_\lambda}(\gamma_\lambda), \qquad \bar\gamma_\lambda(t) = \phi_t^{H_B + \lambda^2 H_O}(\gamma_0). \]
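Our first estimates are large $\lambda$ bounds on the components of $\gamma_\lambda(t) = (\sigma_\lambda(t), n_\lambda(t), \xi_\lambda(t), \eta_\lambda(t))$ that follow from the conservation of energy. These bounds are
\[ |n_\lambda(t)|,\ |\eta_\lambda(t)| \le C \tag{6.1} \]
and
\[ |\xi_\lambda(t)| \le C\lambda. \tag{6.2} \]
The analogous bounds also hold for $\bar\gamma_\lambda(t) = (\bar\sigma_\lambda(t), \bar n_\lambda(t), \bar\xi_\lambda(t), \bar\eta_\lambda(t))$. Clearly $|n_\lambda(t)| = |y_\lambda(t)|$ and, by (5.7), $|\eta_\lambda(t)| = |r_\lambda(t)|$. Thus, (6.1) implies that $|y_\lambda(t)|$ and $|r_\lambda(t)|$ remain bounded. To prove these we first consider the action of $D_\lambda^{-1}$ on $\xi_\lambda$. Let $\gamma_\lambda = (\sigma_\lambda, n_\lambda, \xi_\lambda, \eta_\lambda)$ have co-ordinates $(x_\lambda, y_\lambda, p_\lambda, r_\lambda)$. Then $\xi_\lambda\in T^*_{\sigma_\lambda,n_\lambda}N\Sigma$ has co-ordinates
\[ P^H\begin{pmatrix} p_\lambda\\ r_\lambda\end{pmatrix} = \begin{pmatrix} p_\lambda - B(x_\lambda, y_\lambda)r_\lambda\\ 0\end{pmatrix}. \]
We now wish to apply $D_\lambda^{-1}$. Since $B(x, y)$ is linear in $y$, the scaling in $y_\lambda$ and in $r_\lambda$ cancel. In other words
\[ B(x_\lambda, \lambda^{-1}y_\lambda)\,\lambda r_\lambda = B(x_\lambda, y_\lambda)\,r_\lambda. \]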
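Thus $D_\lambda^{-1}\xi_\lambda\in T^*_{\sigma_\lambda,\lambda^{-1}n_\lambda}N\Sigma$ has the same co-ordinates as $\xi_\lambda\in T^*_{\sigma_\lambda,n_\lambda}N\Sigma$. This implies that as $\lambda\to\infty$,
\[ \begin{aligned} |D_\lambda^{-1}\xi_\lambda|^2 &= \Big\langle\begin{pmatrix} p_\lambda - B(x_\lambda, y_\lambda)r_\lambda\\ 0\end{pmatrix}, G^{-1}(x_\lambda, \lambda^{-1}y_\lambda)\begin{pmatrix} p_\lambda - B(x_\lambda, y_\lambda)r_\lambda\\ 0\end{pmatrix}\Big\rangle\\ &= \Big\langle p_\lambda - B(x_\lambda, y_\lambda)r_\lambda, \big(G_\Sigma(x_\lambda) + C(x_\lambda, \lambda^{-1}y_\lambda)\big)^{-1}\big(p_\lambda - B(x_\lambda, y_\lambda)r_\lambda\big)\Big\rangle\\ &\to \Big\langle p_0 - B(x_0, y_0)r_0, G_\Sigma(x_0)^{-1}\big(p_0 - B(x_0, y_0)r_0\big)\Big\rangle = |d\pi^{*-1}\xi_0|^2. \end{aligned} \tag{6.3} \]
Thus, for large $\lambda$, the initial energy satisfies
\[ L_\lambda(\gamma_\lambda) = H_\lambda\circ D_\lambda^{-1}(\gamma_\lambda) \le \frac{1}{2}|D_\lambda^{-1}\xi_\lambda|^2 + C_V + \frac{\lambda^2}{2}\big(|\eta_\lambda|^2 + \langle n_\lambda, A(\sigma_\lambda)n_\lambda\rangle\big) \le C\lambda^2, \]
where $C_V$ is an upper bound for $V$ in a neighbourhood of $\Sigma$. Given this bound on the initial energies, we may assume that $V$ is bounded, as was explained in the introduction. We now estimate the energy for later times $t$,
\[ \begin{aligned} L_\lambda(\gamma_\lambda(t)) = H_\lambda\circ D_\lambda^{-1}(\gamma_\lambda(t)) &\ge \frac{1}{2}|D_\lambda^{-1}\xi_\lambda(t)|^2 - \|V\|_\infty + C\lambda^2\big(|\eta_\lambda(t)|^2 + |n_\lambda(t)|^2\big)\\ &\ge -\|V\|_\infty + C\lambda^2\big(|\eta_\lambda(t)|^2 + |n_\lambda(t)|^2\big). \end{aligned} \]
Since energy is conserved, i.e., $L_\lambda(\gamma_\lambda(t)) = L_\lambda(\gamma_\lambda)$, this implies (6.1). In a similar way we find that
\[ |D_\lambda^{-1}\xi_\lambda(t)|^2 \le C\lambda^2. \tag{6.4} \]
Now for $|y| < C_1$ and sufficiently large $\lambda$ there is a constant $C$ such that $G^{-1}(x, y) < C\,G^{-1}(x, \lambda^{-1}y)$ in any of the finitely many co-ordinate patches. Thus, (6.3) implies $|\xi_\lambda(t)| \le |D_\lambda^{-1}\xi_\lambda(t)|$, so that (6.4) implies (6.2). The proof of bounds (6.1) and (6.2) for $\bar\gamma_\lambda(t)$ is similar.

We now wish to improve the bound (6.2) to
\[ |\xi_\lambda(t)| \le C \tag{6.5} \]
for $0\le t\le T$. We begin by defining a function $Q$ that depends on our co-ordinate systems. Let $\chi_1(\sigma), \dots, \chi_N(\sigma)$ be a partition of unity with each $\chi_k$ supported in a single co-ordinate patch. Define $Q = \sum_k Q_k\chi_k$, where the local co-ordinate expression for $Q_k$ is
\[ Q_k(x, p) = \frac{1}{2}\langle p, G_\Sigma(x)^{-1}p\rangle + 1. \]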
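(We are abusing notation by using the same letter $Q_k$ for the function on $T^*N\Sigma$ and its local co-ordinate expression.) Given (6.1) we may find a constant $C$ such that $|\xi_\lambda(t)|^2 \le CQ(\gamma_\lambda(t))$. Thus bound (6.5) follows from an upper bound for $Q$ along an orbit. To establish such a bound we first estimate the time derivative of $Q_k(x_\lambda(t), p_\lambda(t))$. This derivative is given by the Poisson bracket,
\[ \frac{d}{dt}Q_k(x_\lambda(t), p_\lambda(t)) = \{Q_k, L_\lambda\}(x_\lambda(t), y_\lambda(t), p_\lambda(t), r_\lambda(t)). \]
Recall that the orthonormal frame $n_1(\sigma), \dots, n_m(\sigma)$ giving our local co-ordinates consists of eigenvectors of $A(\sigma)$. Thus $L_\lambda = H_B + \lambda^2 H_O + E_\lambda$ with
\[ \begin{aligned} H_B(x, y, p, r) &= Q_k(x, p) - \langle B(x, y)r, G_\Sigma(x)^{-1}p\rangle + \frac{1}{2}\langle B(x, y)r, G_\Sigma(x)^{-1}B(x, y)r\rangle + V(x, 0),\\ H_O(x, y, p, r) &= \frac{1}{2}\langle r, r\rangle + \frac{1}{2}\sum_i\omega_i^2 y_i^2 \end{aligned} \]
and
\[ E_\lambda(x, y, p, r) = \frac{1}{2}\Big\langle p - B(x, y)r, \Big(\big(G_\Sigma(x) + C(x, \lambda^{-1}y)\big)^{-1} - G_\Sigma(x)^{-1}\Big)\big(p - B(x, y)r\big)\Big\rangle + V(x, \lambda^{-1}y) - V(x, 0). \]
Since $Q_k$ only depends on $x$ and $p$ any Poisson bracket $\{Q_k, F\}$ is given in local co-ordinates by
\[ \{Q_k, F\} = \sum_i\Big(\frac{\partial Q_k}{\partial p_i}\frac{\partial F}{\partial x_i} - \frac{\partial Q_k}{\partial x_i}\frac{\partial F}{\partial p_i}\Big). \]
Thus $\{Q_k, H_O\} = \{Q_k, Q_k\} = 0$. Using these formulas, together with (6.1) and (6.2) we find
\[ \Big|\frac{d}{dt}Q_k(x_\lambda(t), p_\lambda(t))\Big| \le C\big(|p_\lambda(t)|^2 + \lambda^{-1}|p_\lambda(t)|^3\big) \le CQ_k(x_\lambda(t), p_\lambda(t)). \tag{6.6} \]
Next, writing Hamilton's equations for $x_\lambda(t)$ and using (6.1) we find
\[ |\dot x_\lambda(t)| \le \Big|\frac{\partial H_B}{\partial p}\Big| \le CQ^{1/2}(x_\lambda(t), p_\lambda(t)). \tag{6.7} \]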
Since the cutoff functions, written in local co-ordinates, only depend on $x_\lambda$ we find that
\[ |\dot\chi_k| \le C|\dot x_\lambda| \le CQ^{1/2}. \tag{6.8} \]
Now we show if we evaluate $Q_k$ and $Q_j$ at the same point $\gamma = (\sigma, n, \xi, \eta)$ with $|n|, |\eta| < C$ then
\[ |Q_k(\gamma) - Q_j(\gamma)| \le CQ_k(\gamma)^{1/2}. \tag{6.9} \]
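To see this, we first compute how our co-ordinates change. If $(\tilde x, \tilde y, \tilde p, \tilde r)$ are the co-ordinates in the $j$th chart, obtained from the co-ordinates in the $k$th chart by a change of co-ordinates on $\Sigma$ and a change of frame, then
\[ \tilde p = Mp + b, \qquad \tilde G_\Sigma^{-1} = (M^{-1})^T G_\Sigma^{-1}M^{-1}, \]
where $M$ is the $n\times n$ matrix with entries $\partial\tilde x_i/\partial x_j$ and $b$ is a vector with components $\sum_{k,l} r_k y_l\,\partial\theta_{kl}/\partial x_i$ for an orthogonal matrix valued function $\theta(x)$ given by taking inner products of the elements of the old and new frames. Thus
\[ Q_j = \frac{1}{2}\langle\tilde p, \tilde G_\Sigma^{-1}\tilde p\rangle + 1 = Q_k + \langle b, (M^{-1})^T G_\Sigma^{-1}p\rangle + \frac{1}{2}\langle b, \tilde G_\Sigma^{-1}b\rangle \le Q_k + CQ_k^{1/2}. \]
This implies (6.9).

Now we are ready to establish a bound for $Q$ along an orbit. Let $\dot Q$ denote $dQ(\gamma_\lambda(t))/dt$. Then
\[ \dot Q = \sum_j\big(\dot Q_j\chi_j + Q_j\dot\chi_j\big) = \sum_j\dot Q_j\chi_j + \sum_{k,j} Q_j\dot\chi_j\chi_k. \]
The first term is estimated using (6.6) yielding
\[ \sum_j\dot Q_j\chi_j \le C\sum_j Q_j\chi_j = CQ. \]
To estimate the second term, note that since $\sum_k\chi_k = 1$, we have $\sum_k\dot\chi_k = 0$. Thus
\[ \sum_{k,j} Q_k\dot\chi_j\chi_k = 0, \]
so that
\[ \sum_{k,j} Q_j\dot\chi_j\chi_k = \sum_{k,j}(Q_j - Q_k)\dot\chi_j\chi_k \le CQ \]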
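by (6.8) and (6.9). Thus we have the differential inequality $\dot Q \le CQ$ which implies
\[ Q(\gamma_\lambda(t)) \le Q(\gamma_\lambda(0))\,e^{Ct}. \]
This implies (6.5). Note that (6.7) implies
\[ \langle\dot\sigma_\lambda(t), \dot\sigma_\lambda(t)\rangle < C \tag{6.10} \]
for $0\le t\le T$.

We will now show that there exists $\epsilon > 0$ such that if
\[ \lim_{\lambda\to\infty}\sup_{\tau\in[0,t]}\|\gamma_\lambda(\tau) - \bar\gamma_\lambda(\tau)\| = 0 \tag{6.11} \]
holds for some $t = t_1\le T$ then (6.11) also holds for any $t\le t_1 + \epsilon$. Since (6.11) holds for $t = 0$ by the assumption on the initial conditions, this will complete the proof. So assume that (6.11) holds for $t = t_1\le T$. To compare the two orbits for nearby times, we want to ensure that they lie in the same co-ordinate patch. There exists an $\epsilon_1 > 0$ such that $\gamma_\lambda$ and $\bar\gamma_\lambda$ will lie in the same co-ordinate chart if $\|\sigma_\lambda - \bar\sigma_\lambda\| < \epsilon_1$. Choose $\lambda_0$ so that $\lambda > \lambda_0$ implies
\[ \sup_{\tau\in[0,t_1]}\|\gamma_\lambda(\tau) - \bar\gamma_\lambda(\tau)\| < \epsilon_1/3. \]
Now fix $\lambda > \lambda_0$. For $t > t_1$,
\[ \|\sigma_\lambda(t) - \bar\sigma_\lambda(t)\| \le \|\sigma_\lambda(t) - \sigma_\lambda(t_1)\| + \|\sigma_\lambda(t_1) - \bar\sigma_\lambda(t_1)\| + \|\bar\sigma_\lambda(t_1) - \bar\sigma_\lambda(t)\| \le 2|t - t_1|C + \epsilon_1/3, \]
where $C$ is the constant from (6.10). Thus if we choose $\epsilon < \epsilon_1/3C$ then $\gamma_\lambda$ and $\bar\gamma_\lambda$ will lie in the same co-ordinate chart for $t\in[t_1, t_1 + \epsilon]$. Notice that we do not rule out that the chart changes with $\lambda$.

We now write down the differential equation for $\gamma_\lambda$ and $\bar\gamma_\lambda$ in this common co-ordinate chart. Let $z\in\mathbb{R}^{2(n+m)}$ denote co-ordinates for $T^*N\Sigma$, i.e., $z = (x, y, p, r)^T$. Denote by $z_\lambda$ the co-ordinates of $\gamma_\lambda$ and by $\bar z_\lambda$ the co-ordinates of $\bar\gamma_\lambda$. For a Hamiltonian $H$, let $X_H$ denote the corresponding Hamiltonian vector field given in local co-ordinates by
\[ X_H(z) = \begin{pmatrix}\partial H/\partial p\,(z)\\ \partial H/\partial r\,(z)\\ -\partial H/\partial x\,(z)\\ -\partial H/\partial y\,(z)\end{pmatrix}. \]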
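Then
\[ \frac{d}{dt}z_\lambda(t) = X_{H_B}(z_\lambda(t)) + X_{\lambda^2 H_O}(z_\lambda(t)) + X_{E_\lambda}(z_\lambda(t)). \tag{6.12} \]
Since $H_O$ is quadratic, the vector field $X_{\lambda^2 H_O}$ is linear, given by $X_{\lambda^2 H_O}(z) = \lambda^2 Dz$ for a matrix $D$ that is similar to a real antisymmetric matrix. It follows that (6.12) can be written in integral form
\[ z_\lambda(t) = e^{\lambda^2(t - t_1)D}z_\lambda(t_1) + e^{\lambda^2 tD}\int_{t_1}^t e^{-\lambda^2\tau D}\big(X_{H_B}(z_\lambda(\tau)) + X_{E_\lambda}(z_\lambda(\tau))\big)\,d\tau. \]
We may write a similar equation for the co-ordinates of $\bar\gamma_\lambda$ and obtain
\[ z_\lambda(t) - \bar z_\lambda(t) = e^{\lambda^2(t - t_1)D}\big(z_\lambda(t_1) - \bar z_\lambda(t_1)\big) + e^{\lambda^2 tD}\int_{t_1}^t e^{-\lambda^2\tau D}\big(X_{H_B}(z_\lambda(\tau)) - X_{H_B}(\bar z_\lambda(\tau)) + X_{E_\lambda}(z_\lambda(\tau))\big)\,d\tau. \]
The harmonic oscillator evolution $e^{\lambda^2 tD}$ is similar to a rotation and therefore uniformly bounded. Moreover we have the estimates
\[ \|X_{H_B}(z_\lambda(\tau)) - X_{H_B}(\bar z_\lambda(\tau))\| \le C\|z_\lambda(\tau) - \bar z_\lambda(\tau)\| \]
and
\[ \|X_{E_\lambda}(z_\lambda(\tau))\| \le C\lambda^{-1}. \]
These follow from (6.1) and (6.5) which imply that the co-ordinates for the orbits stay in compact sets. Thus
\[ \|z_\lambda(t) - \bar z_\lambda(t)\| \le C\|z_\lambda(t_1) - \bar z_\lambda(t_1)\| + C|t - t_1|\sup_{\tau\in[t_1,t_1+\epsilon]}\|z_\lambda(\tau) - \bar z_\lambda(\tau)\| + C|t - t_1|\lambda^{-1}. \]
If we now also insist that $\epsilon < 1/(2C)$, then we find that
\[ \frac{1}{2}\sup_{\tau\in[t_1,t_1+\epsilon]}\|z_\lambda(\tau) - \bar z_\lambda(\tau)\| \le C\|z_\lambda(t_1) - \bar z_\lambda(t_1)\| + C\epsilon\lambda^{-1}. \]
Since we have only finitely many co-ordinate charts, there is a constant $C$ so that
\[ C^{-1}\|z_\lambda(\tau) - \bar z_\lambda(\tau)\| \le \|\gamma_\lambda(\tau) - \bar\gamma_\lambda(\tau)\| \le C\|z_\lambda(\tau) - \bar z_\lambda(\tau)\| \]
in any chart. Thus we conclude that
\[ \sup_{\tau\in[t_1,t_1+\epsilon]}\|\gamma_\lambda(\tau) - \bar\gamma_\lambda(\tau)\| \le C\|\gamma_\lambda(t_1) - \bar\gamma_\lambda(t_1)\| + C\epsilon\lambda^{-1}. \]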
This implies that
\[ \lim_{\lambda\to\infty}\sup_{\tau\in[t_1,t_1+\epsilon]}\|\gamma_\lambda(\tau) - \bar\gamma_\lambda(\tau)\| = 0 \]
and completes the proof. □

Proof of Theorem 3.2. We will show that there exists $\epsilon > 0$ such that if (3.5) holds for some $t = t_1\le T$, then (3.5) also holds for any $t\le t_1 + \epsilon$. So assume that (3.5) holds for some $t = t_1\le T$. Define
\[ \psi_\lambda(t) = \phi_{-t}^{\lambda^2 H_O}\circ\phi_t^{H_B + \lambda^2 H_O}(\gamma_0). \]
Choosing our co-ordinate charts as in the proof of Theorem 3.1, we find that for small enough $\epsilon$, $\psi_\lambda(t)$ will stay in a single chart for $t\in[t_1, t_1 + \epsilon]$. This follows from the estimate (6.10) for $\bar\gamma_\lambda(t) = \phi_t^{H_B + \lambda^2 H_O}(\gamma_0)$ and the fact that the harmonic oscillator motion $\phi_{-t}^{\lambda^2 H_O}$ keeps the base point $\sigma$ fixed.

Let $w_\lambda(t)$ denote the local co-ordinates of $\psi_\lambda(t)$. In local co-ordinates, the evolution $\phi_{-t}^{\lambda^2 H_O}$ is given by multiplication by $e^{-t\lambda^2 D}$, and so
\[ w_\lambda(t) = e^{-t\lambda^2 D}\bar z_\lambda(t), \]
where $D$ is the same matrix, similar to a real antisymmetric matrix, that appeared in the proof of Theorem 3.1, and $\bar z_\lambda(t)$ are the co-ordinates of $\bar\gamma_\lambda(t)$. Differentiating, we obtain
\[ \frac{dw_\lambda(t)}{dt} = e^{-t\lambda^2 D}X_{H_B}\big(e^{t\lambda^2 D}w_\lambda(t)\big), \]
so that for $t\in[t_1, t_1 + \epsilon]$,
\[ w_\lambda(t) = w_\lambda(t_1) + \int_{t_1}^t e^{-s\lambda^2 D}X_{H_B}\big(e^{s\lambda^2 D}w_\lambda(s)\big)\,ds. \tag{6.13} \]
Now consider the family of $\mathbb{R}^{2(n+m)}$ valued functions on $[t_1, t_1 + \epsilon]$ given by $W = \{w_\lambda(\cdot) : \lambda > 0\}$. We will show for any sequence $\lambda_j\to\infty$, there is a subsequence $\lambda_{1,j}$ such that $w_{\lambda_{1,j}}$ converges uniformly to the same limit $w_\infty$. This will imply that $w_\lambda\to w_\infty$ uniformly. The estimates (6.1) and (6.5) of Theorem 3.1 and the fact that the matrices $e^{-tD}$ are bounded uniformly in $t$ imply that $W$ is a bounded family. Moreover, from (6.13) and the boundedness of the orbits, it follows that $\|w_\lambda(t) - w_\lambda(t')\| \le C|t - t'|$ so that $W$ is equicontinuous.

Suppose we are given a sequence $\lambda_j\to\infty$. Then, by Ascoli's theorem, there exists a subsequence $\lambda_{1,j}$ so that $w_{\lambda_{1,j}}$ converges uniformly to $w_\infty$. We wish to show that $w_\infty$ is always the same, no matter which sequence we start with. Our assumption on $t_1$ implies that $w_{\lambda_{1,j}}(t_1)$ always converges to the same $w_0$, namely to the co-ordinates of $\phi_{t_1}^{\bar H_B}(\gamma_0)$. We will show that $w_\infty(t)$ is the orbit generated by the Hamiltonian $\bar H_B$ with initial condition $w_0$ at $t = t_1$. Using the uniform boundedness of the matrices $e^{-tD}$ in (6.13) we find that
\[ w_\infty(t) = w_0 + \int_{t_1}^t e^{-s\lambda_{1,j}^2 D}X_{H_B}\big(e^{s\lambda_{1,j}^2 D}w_\infty(s)\big)\,ds + o(1) \]
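as $j\to\infty$. Now $e^{s\lambda_{1,j}^2 D}$ is a symplectic map, being the Hamiltonian flow $\phi^{H_O}_{s\lambda_{1,j}^2}$ in local co-ordinates. It follows that
\[ e^{-s\lambda_{1,j}^2 D}X_{H_B}\big(e^{s\lambda_{1,j}^2 D}w_\infty(s)\big) = X_{H_B\circ\phi^{H_O}_{s\lambda_{1,j}^2}}(w_\infty(s)). \]
If we use the Fourier expansion
\[ H_B\circ\phi^{H_O}_{s\lambda_{1,j}^2} = \sum_{\nu\in\mathbb{Z}^{m_0}} e^{is\lambda_{1,j}^2\langle\nu,\omega\rangle}F_\nu \]
we find that
\[ X_{H_B\circ\phi^{H_O}_{s\lambda_{1,j}^2}} = \sum_{\nu\in\mathbb{Z}^{m_0}} e^{is\lambda_{1,j}^2\langle\nu,\omega\rangle}X_{F_\nu} \]
so that
\[ w_\infty(t) = w_0 + \sum_{\nu\in\mathbb{Z}^{m_0}}\int_{t_1}^t e^{is\lambda_{1,j}^2\langle\nu,\omega\rangle}X_{F_\nu}(w_\infty(s))\,ds + o(1). \]
Taking $j$ to infinity and using the Riemann–Lebesgue lemma, we find that
\[ w_\infty(t) = w_0 + \sum_{\nu\in\mathbb{Z}^{m_0} : \langle\nu,\omega\rangle = 0}\int_{t_1}^t X_{F_\nu}(w_\infty(s))\,ds = w_0 + \int_{t_1}^t X_{\bar H_B}(w_\infty(s))\,ds. \]
This identifies $w_\infty(t)$ as the orbit generated by $\bar H_B$ with initial condition $w_0$ at $t_1$, as claimed. Now we have
\[ \sup_{t\in[t_1,t_1+\epsilon]}\big\|e^{-t\lambda^2 D}\bar z_\lambda(t) - w_\infty(t)\big\| \to 0 \]
as $\lambda\to\infty$ which implies
\[ \sup_{t\in[t_1,t_1+\epsilon]}\big\|\bar z_\lambda(t) - e^{t\lambda^2 D}w_\infty(t)\big\| \to 0. \]
This implies
\[ \sup_{t\in[t_1,t_1+\epsilon]}\big\|\phi_t^{H_B + \lambda^2 H_O}(\gamma_0) - \phi_t^{\lambda^2 H_O}\circ\phi_t^{\bar H_B}(\gamma_0)\big\| \to 0 \]
and completes the proof. □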
7. More Co-ordinate Expressions

In this section we give the co-ordinate expressions that will be needed in our proofs of the quantum theorems. We begin by defining the second fundamental form, the Weingarten maps and the mean and scalar curvatures. Let $X$ and $Y$ be two vector fields tangent to $\Sigma$. Since the Lie bracket $[X, Y] = dY[X] - dX[Y]$ is tangent to $\Sigma$ we find that
\[ II(X, Y) = P^N dX[Y] = P^N dY[X] - P^N[X, Y] = P^N dY[X] \]
is symmetric in $X$ and $Y$. Here $P^N$ denotes the projection onto the normal space. By definition, $II(X, Y)$ is the second fundamental form. Given an orthonormal frame $n_1(\sigma), \dots, n_m(\sigma)$ for the normal bundle, we have
\[ II(X, Y) = \sum_k\langle X, S_k Y\rangle\,n_k \]
for a collection of symmetric linear transformations $S_k$ on the tangent space. These are called the Weingarten maps. Clearly $\langle X, S_k Y\rangle = \langle n_k, dX[Y]\rangle$. But, by differentiating $\langle n_k, X\rangle = 0$, we obtain $\langle dn_k[Y], X\rangle + \langle n_k, dX[Y]\rangle = 0$, so that the Weingarten maps can also be written as $S_k = -P^T dn_k$. Here $P^T$ denotes the orthogonal projection onto the tangent space. The mean curvature vector is given by
\[ h = \frac{1}{n}\sum_{k=1}^m\operatorname{tr}(S_k)\,n_k \tag{7.1} \]
while the scalar curvature is
\[ s = \frac{1}{n(n-1)}\sum_{k=1}^m\big((\operatorname{tr}(S_k))^2 - \operatorname{tr}(S_k^2)\big). \tag{7.2} \]
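To make the curvature quantities concrete, the sketch below evaluates (7.1) and (7.2) from a list of Weingarten matrices (assumed to be expressed in an orthonormal tangent basis) and combines them into the extra potential of Sect. 4, taken here to read $K = \frac{n(n-1)}{4}s - \frac{n^2}{8}\|h\|^2$. The function names and the two test surfaces are illustrative choices, not taken from the paper. For a surface in $\mathbb{R}^3$ with principal curvatures $k_1, k_2$ this expression reduces to $-(k_1 - k_2)^2/8$.

```python
import numpy as np

def mean_curvature_sq(S_list, n):
    """||h||^2 for h = (1/n) sum_k tr(S_k) n_k with orthonormal normals n_k."""
    return sum(np.trace(S) ** 2 for S in S_list) / n ** 2

def scalar_curvature(S_list, n):
    """s = (1 / (n (n - 1))) sum_k (tr(S_k)^2 - tr(S_k^2))."""
    return sum(np.trace(S) ** 2 - np.trace(S @ S) for S in S_list) / (n * (n - 1))

def extra_potential(S_list, n):
    """K = n (n - 1) / 4 * s - n^2 / 8 * ||h||^2 (form assumed in the lead-in)."""
    return (n * (n - 1) / 4 * scalar_curvature(S_list, n)
            - n ** 2 / 8 * mean_curvature_sq(S_list, n))

R = 3.0                                       # hypothetical radius
# Sphere of radius R: both principal curvatures equal 1/R.
K_sphere = extra_potential([np.eye(2) / R], n=2)
# Cylinder of radius R: principal curvatures 1/R and 0.
K_cylinder = extra_potential([np.diag([1.0 / R, 0.0])], n=2)
# Generic surface point with principal curvatures k1, k2.
k1, k2 = 0.8, -0.3
K_generic = extra_potential([np.diag([k1, k2])], n=2)
```

The potential vanishes on the round sphere and is attractive wherever the principal curvatures differ, which illustrates the remark that $K$ depends on the imbedding and not only on the intrinsic geometry.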
Recall that the local expression $G(x, y)$ for the pulled back metric on $N\Sigma$ has the block form (5.2). Initially, $G(x, y)$ is only defined for $|y| < \delta$. In our theorem, we wish to extend this metric to a complete Riemannian metric on all of $N\Sigma$. One way to achieve this is to join the induced metric for small $|y|$ to the metric $\langle\cdot,\cdot\rangle_1$ given by (3.3) for large $|y|$. Since the matrix for the metric $\langle\cdot,\cdot\rangle_1$ is
\[ \begin{pmatrix} I & B\\ 0 & I\end{pmatrix}\begin{pmatrix} G_\Sigma & 0\\ 0 & I\end{pmatrix}\begin{pmatrix} I & B\\ 0 & I\end{pmatrix}^T \]
the resulting metric on all of $N\Sigma$ would have the matrix
\[ G(x, y) = \begin{pmatrix} I & B\\ 0 & I\end{pmatrix}\begin{pmatrix} G_\Sigma + \chi C & 0\\ 0 & I\end{pmatrix}\begin{pmatrix} I & B\\ 0 & I\end{pmatrix}^T, \]
where $\chi = \chi(|y|)$ is a cutoff function that equals 1 for $|y| < \epsilon$ and 0 for $|y| > \delta$. With this special form of the extended metric the local co-ordinate expression below remains true on all of $N\Sigma$ if $C$ is replaced by $\chi C$. However, this special form of the extension is not required for our theorems.
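Let $g(x, y) = \det(G(x, y)) = \det(G_\Sigma + C)$. Define
\[ D_x = \begin{pmatrix} D_{x_1}\\ \vdots\\ D_{x_n}\end{pmatrix}, \qquad D_y = \begin{pmatrix} D_{y_1}\\ \vdots\\ D_{y_m}\end{pmatrix}. \]
The local co-ordinate expression for the operator $H_\lambda = -\frac{1}{2}\Delta + V(\sigma, n) + \lambda^4 W(\sigma, n)$ in the region $|y| < \delta$ is
\[ \begin{aligned} H_\lambda &= -\frac{1}{2}g^{-1/2}\begin{pmatrix} D_x - BD_y\\ D_y\end{pmatrix}^T g^{1/2}\begin{pmatrix}(G_\Sigma + C)^{-1} & 0\\ 0 & I\end{pmatrix}\begin{pmatrix} D_x - BD_y\\ D_y\end{pmatrix} + V(x, y) + \frac{\lambda^4}{2}\langle y, A(x)y\rangle\\ &= -\frac{1}{2}g^{-1/2}\Big((D_x - BD_y)^T g^{1/2}(G_\Sigma + C)^{-1}(D_x - BD_y) + D_y^T g^{1/2}D_y\Big) + V(x, y) + \frac{\lambda^4}{2}\langle y, A(x)y\rangle. \end{aligned} \]
Local expressions for the densities on $N\Sigma$ are
\[ \begin{aligned} \mathrm{dvol} &= \sqrt{g(x, y)}\,|d^n x||d^m y|,\\ \mathrm{dvol}_\lambda &= \sqrt{g(x, y/\lambda)}\,|d^n x||d^m y|,\\ \mathrm{dvol}_{N\Sigma} &= \sqrt{g(x, 0)}\,|d^n x||d^m y| = \sqrt{g_\Sigma(x)}\,|d^n x||d^m y|, \end{aligned} \]
where $g_\Sigma(x) = \det(G_\Sigma(x))$. Thus the multiplication operator $M_\lambda$ appearing in (4.2) is multiplication by $f_\lambda^{-1/4}$ where
\[ f_\lambda(x, y) = \frac{g(x, y/\lambda)}{g_\Sigma(x)}. \]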
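We may now compute the local expression for $L_\lambda$. Conjugation by $D_\lambda$ results in every multiplication by a (possibly matrix valued) function $F(x, y)$ being replaced by multiplication by $F(x, y/\lambda)$, and every $D_y$ being replaced by $\lambda D_y$. Conjugation by $M_\lambda$ simply puts a multiplication by $f_\lambda^{-1/4}$ to the right of the operator, and a multiplication by $f_\lambda^{1/4}$ to the left. In a co-ordinate system for a domain in $N\Sigma$ of the form $\{(\sigma, n) : \sigma\in U, n\in N_\sigma\Sigma\}$ let $D = \begin{pmatrix} D_x\\ D_y\end{pmatrix}$ and $G_\lambda(x, y)$ be the scaled and extended metric taking into account the scaling of $D_y$ as well as $y$. In other words
\[ G_\lambda(x, y) = \begin{pmatrix} I & 0\\ 0 & \lambda^{-1}I\end{pmatrix}G(x, y/\lambda)\begin{pmatrix} I & 0\\ 0 & \lambda^{-1}I\end{pmatrix}. \tag{7.3} \]
Then
\[ \begin{aligned} L_\lambda &= -\frac{1}{2}f_\lambda^{1/4}g(x, y/\lambda)^{-1/2}D^T g(x, y/\lambda)^{1/2}G_\lambda^{-1}Df_\lambda^{-1/4} + V(x, y/\lambda) + \frac{\lambda^2}{2}\langle y, A(x)y\rangle\\ &= -\frac{1}{2}g_\Sigma^{-1/2}f_\lambda^{-1/4}D^T f_\lambda^{1/4}g_\Sigma^{1/2}G_\lambda^{-1}f_\lambda^{1/4}Df_\lambda^{-1/4} + V(x, y/\lambda) + \frac{\lambda^2}{2}\langle y, A(x)y\rangle. \end{aligned} \tag{7.4} \]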
Thus in the region where y < δλ we may use the explicit form of the metric to obtain Lλ = −
1 −1/4 −1/2 g 2 fλ
Dx − BDy Dy
T
(G + Cλ )−1 0 0 λ2 I
1/2 1/2 · g fλ
+ V (x, y/λ) +
Dx − BDy −1/4 fλ Dy
(7.5)
λ2 y, A(x)y, 2
where Cλ (x, y) = C(x, y/λ). Note that formally putting fλ = 1 above, and replacing Cλ by 0, we obtain for the first line of (7.5), −1/2 − 21 g
Dx Dy
T
I −B 0 I
T
1/2 g
0 G−1 0 λ2 I
I −B Dx , 0 I Dy
which is the Laplace–Beltrami operator for the metric which in local co-ordinates is
I B 0 I
T G 0 I B . 0 λ−2 I 0 I
This is easily seen to be the matrix for the metric (3.3). This explains part of the origin of the $H_B + \lambda^2 H_O$. A more complete analysis (to which we now turn) is necessary to understand the origin of the term $K(\sigma)$. Before beginning this, note that the local expressions for $H_B$ and $H_O$ are given by
\[
H_B = \tfrac12 (D_x - B(x,y)D_y)^* G_\Sigma^{-1} (D_x - B(x,y)D_y) + K(x) + V(x,0) \tag{7.6}
\]
and
\[
H_O = \tfrac12 D_y^* D_y + \tfrac12 \langle y, A(x)y\rangle. \tag{7.7}
\]
Here $D_x^*$ and $D_y^*$ denote the formal adjoints with respect to $d\mathrm{vol}_{N\Sigma}$ given by $D_x^* = -g_\Sigma^{-1/2} D_x^T g_\Sigma^{1/2}$, $D_y^* = -g_\Sigma^{-1/2} D_y^T g_\Sigma^{1/2} = -D_y^T$ and $B^* = g_\Sigma^{-1/2} B^T g_\Sigma^{1/2} = B^T$.
We now wish to perform a large $\lambda$ expansion of $L_\lambda$. To state the error estimates precisely, we introduce the notation $E_k$ to denote a smooth function of $x$ and $y$ that vanishes to $k$th order at $y = 0$, evaluated at $(x, y/\lambda)$. Roughly speaking, $E_k$ behaves like $(y/\lambda)^k$ for small $y/\lambda$. The effect of differentiating such an error term is given by
\[
\frac{\partial E_k}{\partial x_i} = E_k, \qquad
\frac{\partial E_k}{\partial y_i} = \begin{cases} \lambda^{-1} E_{k-1} & \text{if } k \ge 1, \\ \lambda^{-1} E_0 & \text{if } k = 0. \end{cases}
\]
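As an illustration of this bookkeeping (a sketch only; the function $h$ and the coefficient $c(x)$ below are hypothetical examples, not taken from the text), suppose $E_k(x,y) = h(x, y/\lambda)$ with $h$ smooth and vanishing to order $k$ at $y = 0$. The chain rule then reproduces the two rules above:

```latex
% Hypothetical model error term: E_k(x,y) = h(x, y/\lambda), with h = O(|y|^k) at y = 0.
\[
  \frac{\partial}{\partial x_i}\, h(x, y/\lambda)
    = (\partial_{x_i} h)(x, y/\lambda) = E_k ,
  \qquad
  \frac{\partial}{\partial y_i}\, h(x, y/\lambda)
    = \lambda^{-1} (\partial_{y_i} h)(x, y/\lambda) = \lambda^{-1} E_{k-1}
  \quad (k \ge 1).
\]
% Concrete instance with k = 2 and a single normal variable: h(x,y) = c(x) y^2.
\[
  E_2 = c(x)\, \frac{y^2}{\lambda^2}, \qquad
  \partial_y E_2 = \lambda^{-1} \cdot 2 c(x)\, \frac{y}{\lambda} = \lambda^{-1} E_1 .
\]
```

Differentiating in $x$ does not lower the order of vanishing in $y$, while each $y$-derivative costs one order of vanishing and gains a factor $\lambda^{-1}$.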
In our theorems we will always assume that the eigenvalues $\omega_j^2$ of $A(\sigma)$ are constant. If we choose the orthonormal frame in the definition of our co-ordinates to consist of eigenvectors of $A(\sigma)$ then $\langle n, A(\sigma)n\rangle = \sum_j \omega_j^2 y_j^2$. We will make this substitution without further comment below.
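Lemma 7.1. In the region where $|y| < \delta\lambda$, the local expression for $L_\lambda$ can be written
\[
L_\lambda = H_B + \lambda^2 H_O + (D_x - BD_y)^* E_1 (D_x - BD_y) + E_1.
\]
Proof. In a co-ordinate system for a domain in $N\Sigma$ of the form $\{(\sigma,n) : \sigma\in U,\ n\in N_\sigma\Sigma\}$ let $D = \begin{bmatrix} D_x \\ D_y \end{bmatrix}$ and $G_\lambda(x,y)$ be given by (7.3). Setting $k_\lambda = (1/4)\ln f_\lambda$, we may write (7.4) as
\[
L_\lambda = \tfrac12 (D - \partial k_\lambda)^* G_\lambda^{-1} (D - \partial k_\lambda) + V(x,y/\lambda) + \frac{\lambda^2}{2}\sum_j \omega_j^2 y_j^2, \tag{7.8}
\]
where $\partial k_\lambda = \begin{bmatrix} \partial_x k_\lambda \\ \partial_y k_\lambda \end{bmatrix}$, $\partial k_\lambda^* = (\partial k_\lambda)^T$, and $D^* = -g_\Sigma^{-1/2} D^T g_\Sigma^{1/2}$. We further expand (7.8) to obtain
\[
\begin{aligned}
L_\lambda = {}& \tfrac12 D^* G_\lambda^{-1} D + \tfrac12 \partial k_\lambda^* G_\lambda^{-1} \partial k_\lambda + \tfrac12 \sum_{i,j} g_\Sigma^{-1/2}\, \partial_i \bigl( g_\Sigma^{1/2} (G_\lambda^{-1})_{i,j}\, \partial_j k_\lambda \bigr) \\
& + V(x,y/\lambda) + \frac{\lambda^2}{2}\sum_j \omega_j^2 y_j^2 .
\end{aligned} \tag{7.9}
\]
If $|y| < \lambda\delta$ then
\[
G_\lambda^{-1}(x,y) = \begin{bmatrix} I & -B(x,y) \\ 0 & I \end{bmatrix}^T \begin{bmatrix} (G_\Sigma(x) + C(x,y/\lambda))^{-1} & 0 \\ 0 & \lambda^2 I \end{bmatrix} \begin{bmatrix} I & -B(x,y) \\ 0 & I \end{bmatrix} \tag{7.10}
\]
so that in this region we obtain
\[
\begin{aligned}
L_\lambda = {}& \tfrac12 (D_x - BD_y)^* G_\Sigma(x)^{-1} (D_x - BD_y) + \frac{\lambda^2}{2} D_y^* D_y \\
& + (D_x - BD_y)^* E_1 (D_x - BD_y) + E_1 + \frac{\lambda^2}{2}\sum_i \bigl[ \partial_{y_i}^2 k_\lambda + (\partial_{y_i} k_\lambda)^2 \bigr] \\
& + V(x,y/\lambda) + \frac{\lambda^2}{2}\sum_j \omega_j^2 y_j^2 \\
= {}& H_B + \lambda^2 H_O + (D_x - BD_y)^* E_1 (D_x - BD_y) + E_1 \\
& + \frac{\lambda^2}{2}\sum_i \bigl[ \partial_{y_i}^2 k_\lambda + (\partial_{y_i} k_\lambda)^2 \bigr] - K(x).
\end{aligned}
\]
Here we used $(\partial_x - B\partial_y)E_k = E_k$ and $\partial k_\lambda = \begin{bmatrix} E_1 \\ \lambda^{-1} E_0 \end{bmatrix}$, so that $(\partial_x - B\partial_y)k_\lambda = E_1$. The lemma will follow if we can show
\[
\frac{\lambda^2}{2}\sum_i \bigl[ \partial_{y_i}^2 k_\lambda + (\partial_{y_i} k_\lambda)^2 \bigr] = K(x) + E_1. \tag{7.11}
\]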
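This requires a more careful expansion of $f_\lambda$. The first step is to uncover the geometrical meaning of the term $G_\Sigma(x) + C(x,y)$ occurring in the expression (5.2) for the metric. Note that
\[
\langle dn_k[\sigma_i], \sigma_j\rangle = -\langle S_k\sigma_i, \sigma_j\rangle = -\langle \sigma_i, S_k\sigma_j\rangle = \langle \sigma_i, dn_k[\sigma_j]\rangle
\]
and that
\[
M_k = G_\Sigma^{-1}\,[\langle \sigma_i, S_k\sigma_j\rangle]
\]
is the matrix for the Weingarten map $S_k$ in the basis $\sigma_1, \dots, \sigma_n$. Let $S$ be the symmetric operator defined by $\langle n, II(X,Y)\rangle = \langle X, SY\rangle$. Then $S = \sum_k y_k S_k$, and the matrix for $S$ in the basis $\sigma_1, \dots, \sigma_n$ is
\[
M = M(x,y) = \sum_k y_k M_k(x).
\]
A short calculation shows
\[
G_\Sigma + C = G_\Sigma (I - M)^2. \tag{7.12}
\]
Given the block form (5.2) of $G$ and (7.12), we obtain
\[
f_\lambda = g_\lambda / g_\Sigma = \det(G(x,y/\lambda)) / \det(G_\Sigma(x)) = \det\bigl(G_\Sigma(x)(I - \lambda^{-1}M(x,y))^2\bigr) / \det(G_\Sigma(x)) = \det(I - \lambda^{-1}M(x,y))^2 .
\]
Thus
\[
\begin{aligned}
k_\lambda &= \tfrac12 \ln(f_\lambda^{1/2}) = \tfrac12 \ln\det(I - \lambda^{-1}M) = \tfrac12 \operatorname{tr}\ln(I - \lambda^{-1}M) \\
&= -\tfrac12 \lambda^{-1}\operatorname{tr}(M) - \tfrac14 \lambda^{-2}\operatorname{tr}(M^2) + E_3 \\
&= -\tfrac12 \lambda^{-1}\sum_k y_k \operatorname{tr}(S_k) - \tfrac14 \lambda^{-2}\sum_{k,l} y_k y_l \operatorname{tr}(S_k S_l) + E_3 .
\end{aligned}
\]
This implies that
\[
\partial_{y_i} k_\lambda = -\tfrac12 \lambda^{-1}\operatorname{tr}(S_i) + \lambda^{-2}E_1 + \lambda^{-1}E_2
\]
and
\[
(\partial_{y_i})^2 k_\lambda = -\tfrac12 \lambda^{-2}\operatorname{tr}(S_i^2) + \lambda^{-2}E_1 .
\]
Thus
\[
\begin{aligned}
\frac{\lambda^2}{2}\sum_i \bigl[ \partial_{y_i}^2 k_\lambda + (\partial_{y_i} k_\lambda)^2 \bigr]
&= \sum_i \Bigl[ -\tfrac14 \operatorname{tr}(S_i^2) + \tfrac18 (\operatorname{tr}(S_i))^2 \Bigr] + E_1 \\
&= \tfrac14 \sum_i \bigl[ (\operatorname{tr}(S_i))^2 - \operatorname{tr}(S_i^2) \bigr] - \tfrac18 \sum_i (\operatorname{tr}(S_i))^2 + E_1 \\
&= \frac{n(n-1)}{4}\, s - \frac{n^2}{8}\, h^2 + E_1 .
\end{aligned}
\]
This proves (7.11) and completes the proof. □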
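As a sanity check on the computation of $K$ in the proof of Lemma 7.1 (a worked special case, assuming $n = m = 1$, so that $\Sigma$ is a curve in $\mathbb{R}^2$ with a single Weingarten map $S_1$ acting as multiplication by the curvature; the symbol $\kappa$ is introduced here for illustration only), the two traces coincide and the sum collapses to one term:

```latex
% Assumption: n = m = 1 and S_1 = \kappa (a 1x1 matrix),
% so that tr(S_1^2) = (tr S_1)^2 = \kappa^2.
\[
  \frac{\lambda^2}{2}\Bigl[\partial_{y}^2 k_\lambda + (\partial_{y} k_\lambda)^2\Bigr]
  = -\tfrac14 \operatorname{tr}(S_1^2) + \tfrac18 (\operatorname{tr} S_1)^2 + E_1
  = -\tfrac14 \kappa^2 + \tfrac18 \kappa^2 + E_1
  = -\tfrac18 \kappa^2 + E_1 .
\]
```

This agrees with the general expression $\frac{n(n-1)}{4}s - \frac{n^2}{8}h^2$: for $n = 1$ the first term vanishes, and the second equals $-\kappa^2/8$ provided $h$ is normalised so that $\sum_i (\operatorname{tr} S_i)^2 = n^2 h^2$, as the last equality in the proof suggests.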
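We conclude this section by discussing the expression for $\overline{H}_B$ in local co-ordinates. We may define local annihilation and creation operators, using the co-ordinates $y_{\alpha,j}$ defined in Sect. 5, as
\[
a_{\alpha,j} = \frac{1}{\sqrt{2\omega_\alpha}}\,(\omega_\alpha y_{\alpha,j} + D_{y_{\alpha,j}}), \qquad
a_{\alpha,j}^* = \frac{1}{\sqrt{2\omega_\alpha}}\,(\omega_\alpha y_{\alpha,j} - D_{y_{\alpha,j}}).
\]
Then we find
\[
I_\alpha = \sum_j \Bigl( -\frac{1}{2\omega_\alpha} D_{y_{\alpha,j}}^2 + \frac{\omega_\alpha}{2}\, y_{\alpha,j}^2 \Bigr) = \sum_j \Bigl( a_{\alpha,j}^* a_{\alpha,j} + \tfrac12 \Bigr).
\]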
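The identity for $I_\alpha$ can be checked directly. The following is a short verification, assuming as usual that $D_y$ denotes $\partial/\partial y$, so that $D_y y - y D_y = 1$; the indices $\alpha, j$ are suppressed:

```latex
% Assumption: D_y = \partial_y, hence D_y y - y D_y = 1. Indices (\alpha, j) suppressed.
\begin{align*}
  a^* a
  &= \frac{1}{2\omega}\,(\omega y - D_y)(\omega y + D_y) \\
  &= \frac{1}{2\omega}\,\bigl(\omega^2 y^2 + \omega\,(y D_y - D_y y) - D_y^2\bigr) \\
  &= \frac{1}{2\omega}\,\bigl(\omega^2 y^2 - \omega - D_y^2\bigr)
   = -\frac{1}{2\omega} D_y^2 + \frac{\omega}{2}\, y^2 - \frac12 .
\end{align*}
% Hence -\frac{1}{2\omega} D_y^2 + \frac{\omega}{2} y^2 = a^* a + \frac12.
% The same computation with the factors reversed gives a a^* = a^* a + 1, i.e. [a, a^*] = 1.
```

The commutator $[a, a^*] = 1$ obtained this way is what drives the evolution identities for $a_{\alpha,j}$ and $a_{\alpha,j}^*$ used later in the section.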
We may also write $H_B$ in terms of the annihilation and creation operators. We begin with
\[
(B(x,y)D_y)_i = \sum_{\alpha,j,\beta,k} b^i_{\alpha,j,\beta,k}\, D_{y_{\alpha,j}}\, y_{\beta,k}.
\]
Notice that the order of $D_{y_{\alpha,j}}$ and $y_{\beta,k}$ is irrelevant here, since $b^i$ is antisymmetric in $(\alpha,j)$ and $(\beta,k)$. Then we can use
\[
D_{y_{\alpha,j}} = \sqrt{\frac{\omega_\alpha}{2}}\,\bigl(a_{\alpha,j} - a_{\alpha,j}^*\bigr), \qquad
y_{\beta,k} = \frac{1}{\sqrt{2\omega_\beta}}\,\bigl(a_{\beta,k} + a_{\beta,k}^*\bigr),
\]
and substitute the resulting expression in (7.6). The resulting formula expresses $H_B$ as a finite sum of terms involving the product of 0, 2 or 4 annihilation or creation operators. The identities
\[
e^{itH_O} a_{\alpha,j}\, e^{-itH_O} = e^{-it\omega_\alpha} a_{\alpha,j}, \qquad
e^{itH_O} a_{\alpha,j}^*\, e^{-itH_O} = e^{it\omega_\alpha} a_{\alpha,j}^* \tag{7.13}
\]
lead to a finite sum
\[
e^{itH_O} H_B\, e^{-itH_O} = \sum_{\nu\in\mathbb{Z}^{m_0}} e^{it\langle\nu,\omega\rangle} F_\nu
\]
that defines the differential operators $F_\nu$.

Lemma 7.2. For $\varphi \in C_0^\infty(N\Sigma)$, $e^{-itH_O}\varphi \in D(H_B)$ and
\[
e^{itH_O} H_B\, e^{-itH_O}\varphi = \sum_{\nu\in\mathbb{Z}^{m_0}} e^{it\langle\nu,\omega\rangle} F_\nu \varphi, \tag{7.14}
\]
where the operators $F_\nu$ are defined by the sum above.
Proof. It suffices to prove this for $\varphi \in C_0^\infty$ supported in a single co-ordinate patch, since a general $\varphi \in C_0^\infty$ can be written as a sum of such functions. Introducing our usual local co-ordinates $x$ and $y$, we find that $e^{-itH_O}$ is simply a harmonic oscillator time evolution in the $y$ variables. Hence $e^{-itH_O}\varphi$ is in Schwartz space. This implies that $e^{-itH_O}\varphi \in D(H_B)$, and that the expansion of $H_B$ into a sum of terms involving products of $a_{\alpha,j}$ and $a_{\alpha,j}^*$ is valid when applied to $e^{-itH_O}\varphi$. To complete the proof, it remains to show that the identities (7.13) hold when applied to a function $\varphi$ in Schwartz space. This follows from
\[
\begin{aligned}
\frac{d}{dt}\, e^{itH_O} a_{\alpha,j}\, e^{-itH_O}\varphi
&= i e^{itH_O} [H_O, a_{\alpha,j}]\, e^{-itH_O}\varphi \\
&= i\omega_\alpha e^{itH_O} [a_{\alpha,j}^* a_{\alpha,j}, a_{\alpha,j}]\, e^{-itH_O}\varphi \\
&= i\omega_\alpha e^{itH_O} [a_{\alpha,j}^*, a_{\alpha,j}]\, a_{\alpha,j}\, e^{-itH_O}\varphi \\
&= -i\omega_\alpha e^{itH_O} a_{\alpha,j}\, e^{-itH_O}\varphi. \qquad\square
\end{aligned}
\]

8. Proofs of Theorems in Quantum Mechanics

We begin with some analysis that allows us to transfer our considerations from $\mathbb{R}^{n+m}$ to the normal bundle $N\Sigma$. Let $d(x,\Sigma) = \inf\{\|x - \sigma\| : \sigma \in \Sigma\}$ denote the distance to $\Sigma$ in $\mathbb{R}^{n+m}$ and let $U_\delta = \{x \in \mathbb{R}^{n+m} : d(x,\Sigma) < \delta\}$ be the tubular neighbourhood of $\Sigma$ that is diffeomorphic to $N\Sigma_\delta$. The first proposition shows that the time evolution in $L^2(\mathbb{R}^{n+m})$ under $H_\lambda$ is approximately the same for large $\lambda$ as the time evolution in $L^2(U_\delta)$ under the same Hamiltonian, except with Dirichlet boundary conditions.

Proposition 8.1. Suppose that $W, V \in C^\infty(\mathbb{R}^{n+m})$ with $W \ge 0$ and $V$ bounded below. Suppose $W(x) = 0$ if and only if $x \in \Sigma$ and that $W(x) \ge w_0 > 0$ for large $|x|$. Suppose $\lambda \ge 1$, $\psi \in L^2(\mathbb{R}^{n+m})$, $\|\psi\| = 1$ and $\|H_\lambda\psi\| \le C_1\lambda^2$, where $H_\lambda = -\tfrac12\Delta + V + \lambda^4 W$. Then, given $\epsilon > 0$ there exists $C_2$ such that for all $t \in \mathbb{R}$,
\[
\|F_{(d\ge\epsilon)}\, e^{-itH_\lambda}\psi\| \le C_2\lambda^{-1}. \tag{8.1}
\]
Here $F_{(\cdot)}$ denotes multiplication by the characteristic function supported on the region indicated in the parentheses. Define $H_\lambda^\delta$ to be the operator in $L^2(U_\delta)$ given by $H_\lambda$ with Dirichlet boundary conditions on $\partial U_\delta$. Then for all $t \in [0,T]$ and $0 < \epsilon < \delta$,
\[
\|F_{(d\le\epsilon)}\, e^{-itH_\lambda}\psi - e^{-itH_\lambda^\delta} F_{(d\le\epsilon)}\psi\| \le C_3\lambda^{-1/4}. \tag{8.2}
\]
Here $C_2$ depends only on $C_1$ and $\epsilon$ and $C_3$ depends only on $C_1$, $T$ and $\epsilon$.

Remark. The power $1/4$ in (8.2) is not optimal.
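Proof. By the assumption on $\psi$ and the Schwarz inequality $\langle\psi, H_\lambda\psi\rangle \le C_1\lambda^2$. Without loss we may assume that $V \ge 0$, so that
\[
\tfrac12\|\nabla\psi\|^2 \le C_1\lambda^2, \qquad \langle\psi, W\psi\rangle \le C_1\lambda^{-2}. \tag{8.3}
\]
It follows that
\[
C(\epsilon)\langle\psi, F_{(d\ge\epsilon)}\psi\rangle \le \langle\psi, F_{(d\ge\epsilon)} W\psi\rangle \le C_1\lambda^{-2}
\]
which proves (8.1), since $e^{-itH_\lambda}\psi$ satisfies the same hypotheses as $\psi$. For $0 < \epsilon_1 \le \alpha$ we will need the estimate
\[
\|F_{(\epsilon_1\le d\le\alpha)}\nabla\psi\| \le C_4\lambda^{1/2}, \tag{8.4}
\]
where $C_4$ depends only on $\alpha$, $\epsilon_1$ and $C_1$. To prove this, choose a function $\chi \in C_0^\infty(\mathbb{R}^{n+m})$, $0 \le \chi \le 1$, which is 1 in a neighbourhood of $\{x : \epsilon_1 \le d(x,\Sigma) \le \alpha\}$ and vanishes in a neighbourhood of $\Sigma$. Then $\|F_{(\epsilon_1\le d\le\alpha)}\nabla\psi\| = \|F_{(\epsilon_1\le d\le\alpha)}\nabla(\chi\psi)\| \le \|\nabla(\chi\psi)\|$. The Schwarz inequality and integration by parts gives
\[
\|\nabla(\chi\psi)\| \le \|\Delta(\chi\psi)\|^{1/2}\, \|\chi\psi\|^{1/2}
\]
so that (8.4) follows from
\[
\|\Delta(\chi\psi)\| \le C_5\lambda^2 \tag{8.5}
\]
and (8.1). To prove (8.5) let $p = -i\nabla$ and calculate, as forms on $C_0^\infty \times C_0^\infty$,
\[
H_\lambda^2 = \tfrac14 |p|^4 + (V + \lambda^4 W)^2 + \sum_j p_j (V + \lambda^4 W) p_j - \tfrac12 (\Delta V + \lambda^4 \Delta W). \tag{8.6}
\]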
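For the reader's convenience, here is a sketch of the computation behind (8.6). It assumes only $H_\lambda = \tfrac12|p|^2 + V + \lambda^4 W$ with $p_j = -i\partial_j$; the abbreviation $\Phi = V + \lambda^4 W$ is introduced here for illustration and does not appear in the text:

```latex
% Notation (introduced here): \Phi = V + \lambda^4 W, p_j = -i\partial_j.
\begin{align*}
  H_\lambda^2
  &= \tfrac14 |p|^4 + \Phi^2 + \tfrac12\bigl(|p|^2 \Phi + \Phi\, |p|^2\bigr), \\
  p_j^2 \Phi + \Phi\, p_j^2
  &= 2\, p_j \Phi\, p_j + [p_j, [p_j, \Phi]]
   = 2\, p_j \Phi\, p_j - \partial_j^2 \Phi .
\end{align*}
% Summing over j gives |p|^2\Phi + \Phi|p|^2 = 2\sum_j p_j \Phi p_j - \Delta\Phi,
% and inserting this into the first line yields (8.6).
```

The double commutator is where the term $-\tfrac12\Delta(V + \lambda^4 W)$ comes from, and it is this term that produces the $C\lambda^4$ correction in the estimate that follows.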
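It follows from (8.6) and the fact that $C_0^\infty$ is a core for $H_\lambda$ that $\chi\psi \in D(H_\lambda)$ and
\[
\bigl\|\tfrac12 p^2 \chi\psi\bigr\|^2 \le \|H_\lambda(\chi\psi)\|^2 + C\lambda^4,
\]
or
\[
\bigl\|\tfrac12 p^2 \chi\psi\bigr\| \le \sqrt{C}\,\lambda^2 + \|H_\lambda\psi\| + \bigl\|[\tfrac12 p^2, \chi]\psi\bigr\|.
\]
The last term can be bounded by (8.3), yielding (8.5).
Let $\tilde\chi$ be a smooth function which satisfies $0 \le \tilde\chi \le F_{(d<\epsilon/2)}$ and $\tilde\chi = 1$ in a neighbourhood of $\Sigma$. Because of (8.1) (which holds at $t = 0$) it is enough to show
\[
\|e^{itH_\lambda^\delta}\tilde\chi\, e^{-itH_\lambda}\psi - \tilde\chi\psi\| \le C\lambda^{-1/4}
\]
for $t \in [0,T]$. Let
\[
\phi_{t,\lambda} = e^{itH_\lambda^\delta}\tilde\chi\, e^{-itH_\lambda}\psi - \tilde\chi\psi.
\]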
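Integrating the derivative, we obtain
\[
\begin{aligned}
\phi_{t,\lambda} &= i\int_0^t e^{isH_\lambda^\delta}(H_\lambda^\delta\tilde\chi - \tilde\chi H_\lambda)\, e^{-isH_\lambda}\psi\, ds \\
&= \int_0^t e^{isH_\lambda^\delta}(\nabla\tilde\chi\cdot p - (i/2)\Delta\tilde\chi)\, e^{-isH_\lambda}\psi\, ds,
\end{aligned}
\]
and thus
\[
\|\phi_{t,\lambda}\|^2 = \int_0^t \langle e^{-isH_\lambda^\delta}\phi_{t,\lambda}, (\nabla\tilde\chi\cdot p - (i/2)\Delta\tilde\chi)\, e^{-isH_\lambda}\psi\rangle\, ds.
\]
Let $\tilde{\tilde\chi} = 1$ on the support of $\nabla\tilde\chi$ and $\tilde{\tilde\chi} = 0$ in a neighbourhood of $\Sigma$. Then from (8.4)
\[
\begin{aligned}
\|\phi_{t,\lambda}\|^2 &\le \int_0^t \|\tilde{\tilde\chi}\, e^{-isH_\lambda^\delta}\phi_{t,\lambda}\| \bigl( \|\nabla\tilde\chi\cdot p\, e^{-isH_\lambda}\psi\| + C \bigr)\, ds \\
&\le C\lambda^{1/2} \int_0^t \|\tilde{\tilde\chi}\, e^{-isH_\lambda^\delta}\phi_{t,\lambda}\|\, ds.
\end{aligned}
\]
Now
\[
\begin{aligned}
\langle\phi_{t,\lambda}, H_\lambda^\delta\phi_{t,\lambda}\rangle
&\le 2\langle\tilde\chi e^{-itH_\lambda}\psi, H_\lambda^\delta\tilde\chi e^{-itH_\lambda}\psi\rangle + 2\langle\tilde\chi\psi, H_\lambda^\delta\tilde\chi\psi\rangle \\
&= \langle e^{-itH_\lambda}\psi, (H_\lambda\tilde\chi^2 + \tilde\chi^2 H_\lambda + (\nabla\tilde\chi)^2)\, e^{-itH_\lambda}\psi\rangle + \langle\psi, (H_\lambda\tilde\chi^2 + \tilde\chi^2 H_\lambda + (\nabla\tilde\chi)^2)\psi\rangle \\
&\le C\lambda^2,
\end{aligned}
\]
by the Schwarz inequality. Thus, following the proof of (8.1),
\[
\|\tilde{\tilde\chi}\, e^{-isH_\lambda^\delta}\phi_{t,\lambda}\| \le C\lambda^{-1}
\]
so that
\[
\|\phi_{t,\lambda}\|^2 \le C\lambda^{1/2}\lambda^{-1}
\]
which gives (8.2). □

Since the subset $U_\delta \subset \mathbb{R}^{n+m}$ is diffeomorphic to $N\Sigma_\delta \subset N\Sigma$, we may think of $H_\lambda^\delta = -\tfrac12\Delta + V + \lambda^4 W$ as acting in $L^2(N\Sigma_\delta, d\mathrm{vol})$ with Dirichlet boundary conditions on $\partial N\Sigma_\delta$, where the volume form $d\mathrm{vol}$ and the Laplace operator $\Delta$ are computed using the pulled back metric, and $V$ and $W$ are now the pull backs of the corresponding functions on $U_\delta$. We may now extend the metric, and the potentials $V$ and $W$, from $N\Sigma_\delta$ to all of $N\Sigma$, as explained in Sect. 4 above. Recall that the extended metric is assumed to be complete, that the extended $V$ is bounded and that $W = \langle n, A(\sigma)n\rangle$ on all of $N\Sigma$. We thus obtain an operator $H_\lambda$ acting in $L^2(N\Sigma, d\mathrm{vol})$. Since the extended metric is complete, $H_\lambda$ is essentially self-adjoint on $C_0^\infty$. Then it makes sense to talk about $e^{-itH_\lambda}$. A proposition analogous to Proposition 8.1 holds in this situation, allowing us to approximate the evolution under $H_\lambda^\delta$ with an evolution under $H_\lambda$. For the purposes of this proposition, it does not matter how the extensions are made, as long as the conditions on the potentials hold, and the state $\psi$ that we use for the comparison satisfies $\|H_\lambda\psi\| \le C\lambda^2$. Since the statement and proof of this proposition are nearly identical to Proposition 8.1 we omit them.
Having justified the transfer of our considerations to $L^2(N\Sigma, d\mathrm{vol}_{N\Sigma})$, we now turn to the proof of Theorem 4.1. Before beginning, we need some quantum energy bounds.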
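Lemma 8.2. Let $L_\lambda$ be as in Theorem 4.1 and $L_{0,\lambda} = H_B + \lambda^2 H_O$. Let $L_{O\lambda}$ denote either of these operators and $R_{O\lambda} = (\lambda^{-2}L_{O\lambda} + 1)^{-1}$. Let $F_2 = F_{(|n|/\lambda<\epsilon)}$ be a smooth cutoff to the indicated region. When $\epsilon < \delta$, this cutoff function is supported in the region of $N\Sigma$ where the metric is explicitly defined. Let $\chi(\sigma)$ be a cutoff with support in a single co-ordinate patch. Then, for small enough $\epsilon$ and large $\lambda$,
\[
\|\langle n\rangle R_{O\lambda}^{1/2}\| + \|\chi F_2 D_y R_{O\lambda}^{1/2}\| + \lambda^{-1}\|\chi F_2 D_x R_{O\lambda}^{1/2}\| \le C. \tag{8.7}
\]
If $l$ is a non-negative integer and $\alpha$, $\beta$ are multi-indices with $l + |\alpha| + |\beta| \le 2$, then
\[
\|\chi F_2 \langle n\rangle^l (\lambda^{-1}D_x)^\alpha D_y^\beta R_{O\lambda}\| \le C. \tag{8.8}
\]
In addition, if $l$ is a positive integer and $|\alpha| + |\beta| \le 2$, then
\[
\|\chi F_2 \langle n\rangle^l (\lambda^{-1}D_x)^\alpha D_y^\beta R_{O\lambda}^{l+1}\| \le C. \tag{8.9}
\]
Here $\langle n\rangle = \sqrt{1 + |n|^2}$.

Proof. Without loss of generality we can assume that $V \ge 1$. Set $f = \chi F_2$. Then $f \in C_0^\infty$ with $0 \le f \le 1$. Using (7.8) we see that
\[
L_\lambda \ge \tfrac12 (D - \partial k_\lambda)^* f\, G_\lambda^{-1} f\, (D - \partial k_\lambda) + \frac{\lambda^2}{2}\sum_j \omega_j^2 y_j^2 .
\]
In the region where $f > 0$ we can use (7.9) to obtain
\[
f \begin{bmatrix} I & -B \\ 0 & I \end{bmatrix}^T \begin{bmatrix} I & 0 \\ 0 & \lambda^2 I \end{bmatrix} \begin{bmatrix} I & -B \\ 0 & I \end{bmatrix} f \le C f\, G_\lambda^{-1} f.
\]
Using $\|\lambda^{-2} R_\lambda^{1/2}(L_\lambda + \lambda^2) R_\lambda^{1/2}\| = 1$ we obtain
\[
\|f D_y R_\lambda^{1/2}\| \le C, \tag{8.10}
\]
\[
\lambda^{-1}\|f (D_x - BD_y - \partial_x k_\lambda + B\partial_y k_\lambda) R_\lambda^{1/2}\| \le C, \tag{8.11}
\]
\[
\|\langle n\rangle R_\lambda^{1/2}\| \le C. \tag{8.12}
\]
On the support of $f$, $\partial_x k_\lambda - B\partial_y k_\lambda$ is bounded. Thus, using (8.10) and $\|B\| \le C|n|$ we obtain $\lambda^{-1}\|f D_x R_\lambda^{1/2}\| \le C$. This proves (8.7) for $R_\lambda$. The proof for $R_{0,\lambda}$ is similar.
Define $U$ by $L_\lambda = \tfrac12 D^* G_\lambda^{-1} D + U$. Then, using (7.10) we calculate
\[
\begin{aligned}
L_\lambda f^2 L_\lambda = {}& \tfrac14 (f D^* G_\lambda^{-1} D)^* (f D^* G_\lambda^{-1} D) + D^* G_\lambda^{-1} f^2 U D + (Uf)^2 \\
& + \tfrac12 D^* G_\lambda^{-1} [D, f^2 U] + \tfrac12 [U f^2, D^*]\, G_\lambda^{-1} D.
\end{aligned}
\]
The last two terms above combine to give a multiplication operator given by a function which is easily shown to be bounded below by
\[
-\tilde\chi^2 \tilde F_2^2 (1 + \lambda^2 |y|^2),
\]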
where $\tilde\chi$ and $\tilde F_2$ are like $\chi$ and $F_2$, with slightly expanded support. It follows that
\[
\frac{\lambda^{-4}}{4}\|f D^* G_\lambda^{-1} D R_\lambda\|^2 + \lambda^{-4}\|f G_\lambda^{-1/2} |U|^{1/2} D R_\lambda\|^2 + \lambda^{-4}\|f U R_\lambda\|^2 \le 1 + \lambda^{-4}\|\tilde\chi \tilde F_2 \langle n\rangle \lambda R_\lambda\|^2 .
\]
The right side is bounded by (8.7). From $\lambda^{-2}\|f U R_\lambda\| \le C$ we obtain $\|f \langle n\rangle^2 R_\lambda\| \le C$, which proves (8.8) when $l = 2$. From
\[
\lambda^{-2}\|f G_\lambda^{-1/2} |U|^{1/2} D R_\lambda\| \le C
\]
we obtain $\lambda^{-1}\|f \langle n\rangle (D_x - BD_y) R_\lambda\| \le C$ and $\|f \langle n\rangle D_y R_\lambda\| \le C$ which then gives $\|f \langle n\rangle \lambda^{-1} D_x R_\lambda\| \le C$. This proves (8.8) when $l = 1$. Finally we consider the consequences of $\lambda^{-2}\|f D^* G_\lambda^{-1} D R_\lambda\| \le C$. This is equivalent to $\lambda^{-2}\|D^* G_\lambda^{-1} D f R_\lambda\| \le C$ since the commutator term can be bounded using (8.7). We thus must examine the operator $D^* G_\lambda^{-1} D$ acting on functions of compact support in $\mathbb{R}^{n+m}$ contained in a domain of the form $Q_\lambda = \{(x,y) : |x| < r, |y| < \epsilon\lambda\}$. When we rescale $y \to \lambda y$ and $D_y \to \lambda^{-1} D_y$, the operator $D^* G_\lambda^{-1} D$ goes over to an elliptic operator $E$ independent of $\lambda$ operating on functions of compact support in a domain $Q = \{(x,y) : |x| < r, |y| < \epsilon\}$. The smooth coefficients of the operator $E$ are bounded in $Q$. It follows that if $|\alpha| + |\beta| \le 2$
\[
\|D_x^\alpha D_y^\beta \psi\| \le C\|E\psi\|
\]
for $\psi$ with support in $Q$. When we scale back again this implies $\lambda^{-2}\|D_x^\alpha (\lambda D_y)^\beta f R_\lambda\| \le C$ or $\|(\lambda^{-1}D_x)^\alpha D_y^\beta f R_\lambda\| \le C$. Again, the commutator term which arises from moving $f$ to the left can be bounded using (8.7). This takes care of the case $l = 0$ in (8.8). We have thus proved (8.8) for $R_\lambda$. The proof for $R_{0,\lambda}$ is similar.
We now turn to (8.9). We give the proof for $R_\lambda$. The proof for $R_{0,\lambda}$ is similar. We first show that
\[
\|f \langle n\rangle^l R_\lambda^l\| \le C. \tag{8.13}
\]
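We write $f = f f_1^l$, where $f_1$ has slightly larger support than $f$ and is of the form $h_1(x)h_2(|y|/\lambda)$. Writing $f_1\langle n\rangle = g$, we have
\[
\begin{aligned}
g^l R_\lambda^l &= g R_\lambda g^{l-1} R_\lambda^{l-1} + g [g^{l-1}, R_\lambda] R_\lambda^{l-1} \\
&= g R_\lambda g^{l-1} R_\lambda^{l-1} + g R_\lambda [\lambda^{-2}L_\lambda, g^{l-1}] R_\lambda^l \\
&= g R_\lambda g^{l-1} R_\lambda^{l-1} + g R_\lambda \bigl( D_y^* J_1 \langle n\rangle^{l-1} + \lambda^{-1}(D_x - BD_y)^* J_2 \langle n\rangle^{l-1} + J_3 \langle n\rangle^{l-1} \bigr) R_\lambda^l ,
\end{aligned}
\]
where $J_1$, $J_2$ and $J_3$ are bounded functions with support contained in $\operatorname{supp} f_1$. Thus, from (8.7)
\[
\|g^l R_\lambda^l\| \le C\|g^{l-1} R_\lambda^{l-1}\| + C\|f_2 \langle n\rangle^{l-1} R_\lambda^{l-1}\|,
\]
where $f_2$ has slightly larger support than $f_1$. Thus (8.13) follows inductively.
We now let $A_{\alpha,\beta}$ denote $(\lambda^{-1}D_x)^\alpha D_y^\beta$ and take $A = A_{\alpha,\beta}$ with $|\alpha| + |\beta| \le 2$. Then
\[
\|g^l A R_\lambda^{l+1}\| \le \|[A, g^l] R_\lambda^{l+1}\| + \|A f_2 g^l R_\lambda^{l+1}\|,
\]
where $f_2$ has slightly larger support than $f_1$. We have
\[
\begin{aligned}
\|A f_2 g^l R_\lambda^{l+1}\| &\le \|A f_2 R_\lambda g^l R_\lambda^l\| + \|A f_2 [g^l, R_\lambda] R_\lambda^l\| \\
&\le \|A f_2 R_\lambda\| \cdot \|g^l R_\lambda^l\| + \|A f_2 R_\lambda\| \cdot \|[g^l, \lambda^{-2}L_\lambda] R_\lambda^{l+1}\|
\end{aligned}
\]
and
\[
[A, g^l] = \sum_{|\gamma|+|\mu|\le 1} g_{\gamma,\mu,l-1} (\lambda^{-1}D_x)^\gamma D_y^\mu
\]
so that
\[
\|[A, g^l] R_\lambda^{l+1}\| \le \sum_{|\gamma|+|\mu|\le 1} \|g_{\gamma,\mu,l-1} A_{\gamma,\mu} R_\lambda^l\|,
\]
where $|g_{\gamma,\mu,l-1}| \le C(f_3\langle n\rangle)^{l-1}$ and where $f_3$ has slightly larger support than $f_2$. Similarly
\[
[g^l, \lambda^{-2}L_\lambda] = \tilde J_1 \langle n\rangle^{l-1} D_y + \tilde J_2 \langle n\rangle^{l-1} (\lambda^{-1}D_x) + \tilde J_3 \langle n\rangle^{l-1},
\]
where $\tilde J_1$, $\tilde J_2$ and $\tilde J_3$ are bounded functions with support contained in $\operatorname{supp} f_1$. Thus
\[
\|[g^l, \lambda^{-2}L_\lambda] R_\lambda^{l+1}\| \le \sum_{|\gamma|+|\mu|\le 1} \|\tilde g_{\gamma,\mu,l-1} A_{\gamma,\mu} R_\lambda^l\|,
\]
where $|\tilde g_{\gamma,\mu,l-1}| \le C(f_3\langle n\rangle)^{l-1}$. Thus again using induction, the result (8.9) follows. □

Proof of Theorem 4.1. Since
\[
\|e^{-itL_{0,\lambda}}\psi - e^{-itL_\lambda}\psi\|^2 = 2\langle\psi,\psi\rangle - 2\operatorname{Re}\langle\psi, e^{itL_{0,\lambda}} e^{-itL_\lambda}\psi\rangle
\]
it suffices to show
\[
\lim_{\lambda\to\infty} \sup_{0\le t\le T} \bigl| \langle\psi, e^{itL_{0,\lambda}} e^{-itL_\lambda}\psi\rangle - \langle\psi,\psi\rangle \bigr| = 0 \tag{8.14}
\]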
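for a dense set of $\psi$ in $L^2(N\Sigma, d\mathrm{vol}_{N\Sigma})$. Let $\psi \in C_0^\infty$. Our goal is to show (8.14). As a first step, we insert an energy cutoff. Since $\|L_{O\lambda}\psi\| \le C\lambda^2$ we have
\[
\|F_{(L_{O\lambda}/\lambda^2\ge\mu)}\psi\| = \|F_{(L_{O\lambda}/\lambda^2\ge\mu)} L_{O\lambda}^{-1} \cdot L_{O\lambda}\psi\| \le C\mu^{-1}.
\]
Set $F_{O1} = F_{(L_{O\lambda}/\lambda^2\le\mu)}$. Then it suffices to show that for each fixed $\mu > 0$,
\[
\lim_{\lambda\to\infty} \sup_{0\le t\le T} \bigl| \langle F_{0,1}\psi, e^{itL_{0,\lambda}} e^{-itL_\lambda} F_1\psi\rangle - \langle F_{0,1}\psi, F_1\psi\rangle \bigr| = 0. \tag{8.15}
\]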
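The energy cutoff estimate used above is an instance of a Chebyshev-type bound from the spectral theorem. A sketch, assuming that $L = L_{O\lambda}$ is self-adjoint and non-negative (which holds after the normalisation $V \ge 1$ made in the proof of Lemma 8.2) and writing $\rho_\psi$ for the spectral measure of $L$ associated with $\psi$ (a symbol introduced here for illustration):

```latex
% Assumptions: L = L_{O\lambda} is self-adjoint with L \ge 0;
% d\rho_\psi denotes the spectral measure of L for the vector \psi.
\[
  \bigl\| F_{(L/\lambda^2 \ge \mu)}\, \psi \bigr\|^2
  = \int_{E \ge \mu\lambda^2} d\rho_\psi(E)
  \le \int \frac{E^2}{\mu^2\lambda^4}\, d\rho_\psi(E)
  = \frac{\|L\psi\|^2}{\mu^2\lambda^4}
  \le \frac{C^2}{\mu^2},
\]
% using \|L\psi\| \le C\lambda^2 in the last step.
```

The bound is uniform in $\lambda$ and small for large $\mu$, which is what allows the high-energy part of $\psi$ to be discarded before taking $\lambda \to \infty$.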
We now need to show the quantum analogue of the fact in classical mechanics that the orbits stay in a bounded region of phase space if we watch the system for a time $T < \infty$ which is independent of $\lambda$. Using energy considerations it follows from Lemma 8.2 that $n$ and $D_y$ are bounded but only that $D_x$ cannot grow faster than $\lambda$. We now seek a $\lambda$ independent bound, showing that up to a fixed time $T$, not too much energy can be transferred from normal to tangential modes. In the quantum setting the statement
\[
\|F_2 D_x \chi\, e^{-itL_{O\lambda}} F_{O1}\psi\| < C, \tag{8.16}
\]
where $F_2$ is as in Lemma 8.2, will suffice. We will prove this estimate when $L_{O\lambda} = L_\lambda$, since the other case when $L_{O\lambda} = L_{0,\lambda}$ is similar. Let $\{\chi_k^2(\sigma)\}$ be a partition of unity subordinate to a finite cover of co-ordinate charts. In other words, each $\chi_k^2$ is supported in a single co-ordinate chart, and $\sum_k \chi_k^2 = 1$. We may assume that each $\chi_k$ is a smooth function only of $\sigma$. Define
\[
Q = \sum_k \chi_k D_x^* G_\Sigma^{-1}(x) D_x \chi_k,
\]
where, in each term, $D_x$ and $x$ are defined in terms of the co-ordinates for the chart in which $\chi_k$ is supported. We now want to cut $Q$ off to the region where we have explicit expressions for the metric, and then add a constant to regain positivity. So let
\[
\bar Q = F_2 Q F_2 + 1.
\]
Notice that $Q$ and $\bar Q$ commute with $F_2$, since in local co-ordinates $F_2$ is a function of $y$ alone. It is not difficult to show that both $Q$ and $\bar Q$ are essentially self-adjoint on $C_0^\infty(N\Sigma)$. Define
\[
q(t) = \langle F_1\psi, e^{itL_\lambda} \bar Q\, e^{-itL_\lambda} F_1\psi\rangle.
\]
Then (8.16) follows from $\sup\{q(t) : t \in [0,T]\} \le C$. We will prove a differential inequality as in the classical case. We will need further estimates to bound the terms which arise when we compute $\dot q(t)$ and to prove an upper bound for $q(0)$.
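Lemma 8.3. Suppose $F_1$ is a smooth cutoff in the energy $\lambda^{-2}L_\lambda$. Then
\[
\|\langle n\rangle^l (\lambda^{-1}D_x)^\alpha D_y^\beta D_x^\gamma \chi_j F_2 F_1 \bar Q^{-1/2}\| \le C
\]
if $l + |\alpha| + |\beta| \le 2$ and $|\gamma| = 1$.

Proof. We use the Helffer–Sjöstrand formula (see [D])
\[
F_1 = \int g(z)(R_\lambda - z)^{-1}\, dz \wedge d\bar z,
\]
where we may take $g \in C_0^\infty(\mathbb{R}^2)$ with $|g(z)||\operatorname{Im} z|^{-N} \le C_N$ for any $N$. (We are using the fact that $F_1(\lambda^{-2}L_\lambda) = \tilde F_1(R_\lambda)$ for $\tilde F_1 \in C_0^\infty(0,2)$.) Let $A_1 = \langle n\rangle^l (\lambda^{-1}D_x)^\alpha D_y^\beta \chi_1$ with $\chi_1 \in C^\infty(\Sigma)$, supported in the $j$th co-ordinate patch, $\chi_1\chi_j = \chi_j$, and let $F_{2,1}$ be a smooth function of $|n|/\lambda$ with $F_{2,1}F_2 = F_2$. Then
\[
A_1 D_x^\gamma \chi_j F_2 F_1 \bar Q^{-1/2} = A_1 F_{2,1} F_1 D_x^\gamma \chi_j F_2 \bar Q^{-1/2} + A_1 F_{2,1} [D_x^\gamma \chi_j F_2, F_1] \bar Q^{-1/2}.
\]
Using (8.8), the first term is bounded by a constant times
\[
\|A_1 F_{2,1} R_\lambda\| \cdot \|D_x^\gamma \chi_j F_2 \bar Q^{-1/2}\| \le C
\]
and it is thus sufficient to show
\[
\|R_\lambda^{-1} [D_x^\gamma \chi_j F_2, F_1]\| \le C.
\]
We compute from the Helffer–Sjöstrand formula
\[
\|R_\lambda^{-1} [D_x^\gamma \chi_j F_2, F_1]\| \le C\|[D_x^\gamma \chi_j F_2, \lambda^{-2}L_\lambda] R_\lambda\|. \tag{8.17}
\]
For our present purposes we can write
\[
L_\lambda = (D_x - BD_y)^* E_0 (D_x - BD_y) + \frac{\lambda^2}{2}\Bigl( D_y^* D_y + \sum_j \omega_j^2 y_j^2 \Bigr) + E_0
\]
and we thus obtain
\[
\begin{aligned}
[D_x^\gamma \chi_j F_2, \lambda^{-2}L_\lambda] = {}& \lambda^{-1} D_x^\gamma \chi_j (\nabla F_2 \cdot D_y + D_y \cdot \nabla F_2) \\
& + \lambda^{-2} [D_x^\gamma \chi_j, (D_x - BD_y)^* E_0 (D_x - BD_y)] F_2 + \lambda^{-2} E_0 .
\end{aligned}
\]
The first term gives a bounded contribution to (8.17) by Lemma 8.2. The second term can be written
\[
\begin{aligned}
& \bigl( \lambda^{-1}(D_x - BD_y)^* E_0 \lambda^{-1}(D_x - BD_y) + D_y^* E_0 \lambda^{-1}(D_x - BD_y) \\
& \quad + \lambda^{-1}(D_x - BD_y)^* E_0 D_y + \lambda^{-2} E_0 (D_x - BD_y) \bigr) \chi_j F_2 \\
& + \lambda^{-1} D_x^\gamma \bigl( (\partial_x\chi_j)^T E_0 \lambda^{-1}(D_x - BD_y) + \lambda^{-1}(D_x - BD_y)^* E_0\, \partial_x\chi_j \bigr) F_2
\end{aligned}
\]
and again this gives a bounded contribution to (8.17) by Lemma 8.2. □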
We now return to the proof of Theorem 4.1 and calculate
\[
\dot q(t) = i\langle e^{-itL_\lambda}\psi, F_1 [L_\lambda, \bar Q] F_1 e^{-itL_\lambda}\psi\rangle.
\]
Let $F_{1,1}$ be a $C_0^\infty$ function of $\lambda^{-2}L_\lambda$ with slightly larger support than $F_1$, so that $F_1 F_{1,1} = F_1$. We will show that
\[
F_{1,1} [iL_\lambda, \bar Q] F_{1,1} \le C\bar Q \tag{8.18}
\]
so that $q(t) \le e^{Ct}q(0)$. First consider any term which arises when the cut-off $F_2 = F_{(|n|/\lambda<\epsilon)}$ is differentiated. The derivative $F_2'$ has support in a region of the form $\{(\sigma,n) : \lambda\epsilon_1 < |n| < \lambda\epsilon_2\}$ so that $F_2'(\lambda/|n|)^l$ is bounded for any $l$. Thus $F_2' = F_2'(\lambda/|n|)^l \lambda^{-l}|n|^l$ so that according to Lemma 8.2, (8.9), such a term is bounded (and even decays faster than any inverse power of $\lambda$. Note that such a term occurring in the commutator $[L_\lambda, \bar Q]$ appears alongside $D_x^\alpha D_y^\beta$ with $|\alpha| + |\beta| \le 3$ but because we have an $F_{1,1}$ on the left and another on the right, (8.9) even allows $|\alpha| + |\beta| \le 4$ and we still obtain faster than any inverse power of $\lambda$ decay.) Since $\bar Q$ contains the constant 1 such terms are harmless and we will ignore them. Thus we are left with showing
\[
F_{1,1} F_2 [iL_\lambda, Q] F_2 F_{1,1} \le C\bar Q. \tag{8.19}
\]
We write $h_k = D_x^* G_\Sigma^{-1}(x) D_x$ when the $x$ refers to the $k$th co-ordinate patch. Then
\[
\chi_k h_k \chi_k = \tfrac12 \bigl( \chi_k^2 h_k + h_k \chi_k^2 \bigr) + (\partial_x\chi_k)^T G_\Sigma^{-1} \partial_x\chi_k
\]
so that
\[
[L_\lambda, Q] = \sum_k \Bigl( \tfrac12 [L_\lambda, \chi_k^2] h_k + \tfrac12 h_k [L_\lambda, \chi_k^2] + [L_\lambda, m_k] + \tfrac12 \chi_k^2 [L_\lambda, h_k] + \tfrac12 [L_\lambda, h_k] \chi_k^2 \Bigr),
\]
where $m_k = (\partial_x\chi_k)^T G_\Sigma^{-1} \partial_x\chi_k$. We must make use of some cancellation which occurs above so we write
\[
\sum_k \tfrac12 [L_\lambda, \chi_k^2] h_k = \sum_{k,j} \tfrac12 [L_\lambda, \chi_k^2](h_k - h_j)\chi_j^2 + \sum_{k,j} \tfrac12 [L_\lambda, \chi_k^2] h_j \chi_j^2
\]
and note that the second term on the right vanishes because $\sum_k \chi_k^2 = 1$. Thus we obtain
\[
\begin{aligned}
[L_\lambda, Q] = {}& \sum_{k,j} \Bigl( \tfrac12 [L_\lambda, \chi_k^2](h_k - h_j)\chi_j^2 + \tfrac12 \chi_j^2 (h_k - h_j)[L_\lambda, \chi_k^2] \Bigr) \\
& + [L_\lambda, M] + \sum_k \Bigl( \tfrac12 \chi_k^2 [L_\lambda, h_k] + \tfrac12 [L_\lambda, h_k] \chi_k^2 \Bigr),
\end{aligned}
\]
where $M = \sum_k m_k$. In the term $[L_\lambda, \chi_k^2](h_k - h_j)\chi_j^2$ we refer all operators to the $j$th co-ordinate patch. Thus
\[
h_k - h_j = \tilde D_x^* \tilde G_\Sigma^{-1} \tilde D_x - D_x^* G_\Sigma^{-1} D_x,
\]
where $\sim$ refers to the $k$th co-ordinate system. We obtain (schematically) $\tilde D_x = M^T D_x + \lambda E_1 D_y$, where $M \tilde G_\Sigma^{-1} M^T = G_\Sigma^{-1}$. Hence
\[
h_k - h_j = (\lambda E_1 D_y + E_0) D_x + \lambda^2 E_2 D_y D_y + \lambda E_1 D_y + E_0.
\]
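After some calculation we find
\[
\begin{aligned}
& \sum_{k,j} \Bigl( \tfrac12 [L_\lambda, \chi_k^2](h_k - h_j)\chi_j^2 + \tfrac12 \chi_j^2 (h_k - h_j)[L_\lambda, \chi_k^2] \Bigr) \\
& \quad = \sum_j \Bigl( \chi_j D_x^* (\lambda E_1 D_y + E_0) D_x \chi_j + \chi_j D_x^* (\lambda^2 E_2 D_y D_y + \lambda E_1 D_y + E_0) \\
& \qquad\quad + \tilde\chi_j \bigl( D_y^* \lambda^3 E_3 D_y D_y + \lambda^2 E_2 D_y D_y + \lambda E_1 D_y + E_0 \bigr) \Bigr),
\end{aligned} \tag{8.20}
\]
where $\tilde\chi_j \in C^\infty(\Sigma)$ with $\operatorname{supp}\tilde\chi_j$ contained in the $j$th co-ordinate patch. Noticing the presence of $F_2$ in (8.19) and using Lemma 8.3 with $\alpha = 0$ along with (8.9) of Lemma 8.2, we see that the terms in (8.20) give a contribution to the left side of (8.19) which is bounded by $C\bar Q$.
We can re-expand $M = M(\sigma)$ writing $M = \sum_k M\chi_k^2$ and then we find
\[
[L_\lambda, M] = \sum_k \chi_k (D_x^* E_0 + \lambda E_1 D_y + E_0)
\]
which is readily handled by Lemma 8.3 and (8.9) of Lemma 8.2.
We now expand the terms involving $[L_\lambda, h_k]$. After some calculation we obtain
\[
\begin{aligned}
& \sum_k \Bigl( \tfrac12 \chi_k^2 [L_\lambda, h_k] + \tfrac12 [L_\lambda, h_k] \chi_k^2 \Bigr) \\
& \quad = \sum_k \chi_k D_x^* \bigl( E_1 D_x + \lambda E_1 D_y + \lambda E_1 + E_0 \bigr) D_x \chi_k \\
& \qquad + \sum_k \chi_k D_x^* \bigl( (\lambda^2 E_2 + \lambda E_1) D_y D_y + \lambda E_1 D_y + \lambda E_1 + E_0 \bigr) \\
& \qquad + \sum_k \chi_k \bigl( (\lambda^2 E_2 + \lambda E_1) D_y D_y + (\lambda E_1 + E_0) D_y + \lambda E_1 + E_0 + \lambda^{-1} E_0 \bigr) \\
& \qquad + \sum_k \bigl( \chi_k D_x^* E_1 D_x + \tilde\chi_k E_q D_x \bigr),
\end{aligned}
\]
where $\tilde\chi_k \in C^\infty(\Sigma)$ has support in the $k$th co-ordinate patch with $\tilde\chi_k \chi_k = \chi_k$. These terms are also easily handled with a combination of Lemma 8.2, (8.9) and Lemma 8.3. This completes the proof of (8.19) and shows $q(t) \le e^{Ct}q(0)$. Finally
\[
q(0) = \langle F_1\psi, \bar Q F_1\psi\rangle
\]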
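has $\lambda$ dependence and must be bounded uniformly in $\lambda$. But this follows from Lemma 8.3 (with $l = \alpha = \beta = 0$) and the fact that $\|\bar Q^{1/2}\psi\|^2 = \langle\psi, \bar Q\psi\rangle < \infty$, independently of $\lambda$.
We now return to (8.15). We introduce a stronger cutoff in the $n$ variable by restricting $|n|/\lambda^s < 1$, where $s \in (0,1)$. Thus let $F_3 = F_{(|n|/\lambda^s<1)}$ be a smooth cutoff in the indicated region. We note that
\[
\|(1 - F_3) F_1\| \le \lambda^{-s} \|(1 - F_3)\lambda^s/|n|\| \cdot \|\langle n\rangle F_1\| \le C\lambda^{-s}
\]
by (8.7) of Lemma 8.2. Thus it is sufficient to prove
\[
\lim_{\lambda\to\infty} \sup_{t\in[0,T]} \bigl| \langle F_{0,1}\psi, e^{itL_{0,\lambda}} F_3 e^{-itL_\lambda} F_1\psi\rangle - \langle F_{0,1}\psi, F_3 F_1\psi\rangle \bigr| = 0.
\]
By the fundamental theorem of calculus we obtain
\[
\begin{aligned}
& \langle F_{0,1}\psi, e^{itL_{0,\lambda}} F_3 e^{-itL_\lambda} F_1\psi\rangle - \langle F_{0,1}\psi, F_3 F_1\psi\rangle \\
& \quad = i\int_0^t \langle F_{0,1}\psi, e^{isL_{0,\lambda}} \bigl( [L_{0,\lambda}, F_3] + F_3 (L_{0,\lambda} - L_\lambda) \bigr) e^{-isL_\lambda} F_1\psi\rangle\, ds.
\end{aligned} \tag{8.21}
\]
The term $[L_{0,\lambda}, F_3]$ contains derivatives of $F_3$ and thus by Lemma 8.2, (8.9) its contribution to (8.21) decays faster than any inverse power of $\lambda$ uniformly for $t \in [0,T]$. According to Lemma 7.1, on the support of $F_3$ we have
\[
L_\lambda - L_{0,\lambda} = \sum_k \chi_k \bigl( (D_x - BD_y)^* E_1 (D_x - BD_y) + E_1 \bigr) \chi_k.
\]
Thus, aside from terms involving derivatives of $F_3$, which again can be handled by Lemma 8.2, (8.9) we need only show that
\[
\lim_{\lambda\to\infty} \sup_{s\in[0,T]} \bigl| \langle F_{0,1} e^{isL_{0,\lambda}}\psi, \bigl( F_2\chi_k (D_x - BD_y)^* \cdot F_3 E_1 (D_x - BD_y) F_2\chi_k + \chi_k^2 F_3 E_1 \bigr) F_1 e^{-isL_\lambda}\psi\rangle \bigr| = 0.
\]
Now
\[
\|F_3 E_1 F_1\| \le C\lambda^{-1}\|\langle n\rangle F_1\| \le C\lambda^{-1}
\]
so we need only bound the product
\[
\|(D_x - BD_y)\chi_k F_2 F_{0,1} e^{-isL_{0,\lambda}}\psi\| \cdot \|F_3 E_1\| \cdot \|(D_x - BD_y)\chi_k F_2 F_1 e^{-isL_\lambda}\psi\|.
\]
By Lemma 8.2, (8.8)
\[
\|BD_y \chi_k F_2 F_{O1}\| \le C,
\]
and by (8.16)
\[
\sup_{s\in[0,T]} \|D_x \chi_k F_2 F_{O1} e^{isL_{O\lambda}}\psi\| \le C.
\]
Finally
\[
\|F_3 E_1\| \le C\lambda^s/\lambda = C\lambda^{s-1},
\]
which proves (8.15) and thus completes the proof of the theorem. □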
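Proof of Theorem 4.2. To prove the theorem, it suffices to show that for any $\psi \in C_0^\infty(N\Sigma)$,
\[
\lim_{\lambda\to\infty} \sup_{0\le t\le T} \bigl\| \bigl( e^{-it(H_B + \lambda^2 H_O)} - e^{-it\lambda^2 H_O} e^{-it\overline{H}_B} \bigr)\psi \bigr\|^2 = 0.
\]
This can be rewritten as
\[
\lim_{\lambda\to\infty} \sup_{0\le t\le T} \bigl| \langle\psi_{t,\lambda} - e^{-it\overline{H}_B}\psi, e^{-it\overline{H}_B}\psi\rangle \bigr| = 0, \tag{8.22}
\]
where
\[
\psi_{t,\lambda} = e^{it\lambda^2 H_O} e^{-it(H_B + \lambda^2 H_O)}\psi.
\]
We will show that for any $\varphi \in L^2(N\Sigma, d\mathrm{vol}_{N\Sigma})$,
\[
\lim_{\lambda\to\infty} \sup_{0\le t\le T} \bigl| \langle\psi_{t,\lambda} - e^{-it\overline{H}_B}\psi, \varphi\rangle \bigr| = 0, \tag{8.23}
\]
which will imply (8.22). This implication follows from the general fact that if $\psi_{t,\lambda}$ converges to some $\psi_{t,\infty}$ with
\[
\sup_{0\le t\le T} \|\psi_{t,\infty}\| \le C
\]
in the sense that
\[
\lim_{\lambda\to\infty} \sup_{0\le t\le T} |\langle\psi_{t,\lambda} - \psi_{t,\infty}, \varphi\rangle| = 0
\]
then, for any continuous function $\varphi_t$ from $[0,T]$ into $L^2(N\Sigma, d\mathrm{vol}_{N\Sigma})$,
\[
\lim_{\lambda\to\infty} \sup_{0\le t\le T} |\langle\psi_{t,\lambda} - \psi_{t,\infty}, \varphi_t\rangle| = 0.
\]
To see this, pick an orthonormal basis $\{\varphi_n\}$. Then
\[
\begin{aligned}
\sup_{0\le t\le T} |\langle\psi_{t,\lambda} - \psi_{t,\infty}, \varphi_t\rangle|
&\le \sup_{0\le t\le T} \Bigl| \sum_{n=1}^N \langle\psi_{t,\lambda} - \psi_{t,\infty}, \varphi_n\rangle\langle\varphi_n, \varphi_t\rangle \Bigr| \\
&\quad + \sup_{0\le t\le T} \Bigl| \sum_{n=N+1}^\infty \langle\psi_{t,\lambda} - \psi_{t,\infty}, \varphi_n\rangle\langle\varphi_n, \varphi_t\rangle \Bigr| \\
&\le C \sup_{0\le t\le T} \sum_{n=1}^N |\langle\psi_{t,\lambda} - \psi_{t,\infty}, \varphi_n\rangle| + C \sup_{0\le t\le T} \|(1 - P_N)\varphi_t\|,
\end{aligned}
\]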
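where $P_N$ denotes the orthogonal projection onto the span of $\varphi_1, \dots, \varphi_N$. The first term on the right tends to zero as $\lambda \to \infty$, by assumption. Hence
\[
\limsup_{\lambda\to\infty} \sup_{0\le t\le T} |\langle\psi_{t,\lambda} - \psi_{t,\infty}, \varphi_t\rangle| \le C \sup_{0\le t\le T} \|(1 - P_N)\varphi_t\|.
\]
But $\{\varphi_t : t \in [0,T]\}$ is compact and $1 - P_N$ goes to zero uniformly on compact sets. Therefore the right side tends to zero as $N \to \infty$.
Thus it suffices to prove (8.23), which we will do in two steps. First, we will show that for every sequence $\lambda_j \to \infty$, there exists a subsequence $\mu_j$ and a bounded, weakly continuous $L^2(N\Sigma, d\mathrm{vol}_{N\Sigma})$ valued function $\psi_{t,\infty}$ such that
\[
\sup_{0\le t\le T} |\langle\psi_{t,\mu_j} - \psi_{t,\infty}, \varphi\rangle| \to 0 \tag{8.24}
\]
for every $\varphi \in L^2(N\Sigma, d\mathrm{vol}_{N\Sigma})$. Then, to complete the proof, we will show that $\psi_{t,\infty}$ is always the same, and equal to $e^{-it\overline{H}_B}\psi$.
To take the first step, we begin with a sequence $\lambda_j \to \infty$. Let $\{\varphi_n\}$ be an orthonormal basis of vectors in $C_0^\infty(N\Sigma)$. Define $w_{n,\lambda}(t) = \langle\psi_{t,\lambda}, \varphi_n\rangle$. Then for fixed $n$, $w_{n,\lambda}(t)$ are a family of functions of $t \in [0,T]$, uniformly bounded as $\lambda \to \infty$. Still for fixed $n$, this family is equicontinuous, since the derivative is bounded independently of $\lambda$. This follows from
\[
\begin{aligned}
\Bigl| \frac{d}{dt}\langle\psi_{t,\lambda}, \varphi_n\rangle \Bigr|
&= \bigl| \langle -i e^{it\lambda^2 H_O} H_B e^{-it(H_B + \lambda^2 H_O)}\psi, \varphi_n\rangle \bigr| \\
&= \bigl| \langle\psi_{t,\lambda}, e^{it\lambda^2 H_O} H_B e^{-it\lambda^2 H_O}\varphi_n\rangle \bigr| \\
&\le \|\psi\| \cdot \Bigl\| \sum_\nu e^{it\lambda^2\langle\nu,\omega\rangle} F_\nu\varphi_n \Bigr\| \\
&\le \|\psi\| \cdot \sum_\nu \|F_\nu\varphi_n\| = C_n.
\end{aligned}
\]
The sum over $\nu$ is finite. Here we used (7.14), and that $\varphi_n$ is in $C_0^\infty(N\Sigma)$, and therefore in the domain of $F_\nu$. Using Ascoli's theorem, we may now choose a subsequence $\lambda_j^1$ of $\lambda_j$ so that $w_{1,\lambda_j^1}(t)$ converges to some continuous function $w_{1,\infty}(t)$, uniformly in $t$ for $t \in [0,T]$. Then we may choose a subsequence $\lambda_j^2$ of $\lambda_j^1$ with $w_{2,\lambda_j^2}(t)$ converging uniformly to some continuous function $w_{2,\infty}(t)$. Continuing in this way, and then taking a diagonal subsequence, we end up with a subsequence $\mu_j$ such that
\[
\sup_{0\le t\le T} |w_{n,\mu_j}(t) - w_{n,\infty}(t)| \to 0
\]
for every $n$. Notice that
\[
\sum_{n=1}^N |w_{n,\infty}(t)|^2 = \lim_{j\to\infty} \sum_{n=1}^N |\langle\psi_{t,\mu_j}, \varphi_n\rangle|^2 \le \|\psi_{t,\mu_j}\|^2 = \|\psi\|^2.
\]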
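This implies that $\sum_{n=1}^\infty |w_{n,\infty}(t)|^2 \le \|\psi\|^2$, so that
\[
\psi_{t,\infty} = \sum_n w_{n,\infty}(t)\varphi_n
\]
is well defined with $\|\psi_{t,\infty}\| \le \|\psi\|$. Clearly, for any $n$, $\langle\psi_{t,\mu_j} - \psi_{t,\infty}, \varphi_n\rangle \to 0$ as $j \to \infty$. This implies (8.24).
Now take the second step of identifying $\psi_{t,\infty}$. Let $\varphi \in C_0^\infty(N\Sigma)$. Then
\[
\begin{aligned}
\langle\psi_{t,\mu_j}, \varphi\rangle &= \langle\psi, \varphi\rangle + i\int_0^t \langle\psi_{s,\mu_j}, e^{is\mu_j^2 H_O} H_B e^{-is\mu_j^2 H_O}\varphi\rangle\, ds \\
&= \langle\psi, \varphi\rangle + i\int_0^t \langle\psi_{s,\infty}, e^{is\mu_j^2 H_O} H_B e^{-is\mu_j^2 H_O}\varphi\rangle\, ds \\
&\quad + i\int_0^t \langle\psi_{s,\mu_j} - \psi_{s,\infty}, e^{is\mu_j^2 H_O} H_B e^{-is\mu_j^2 H_O}\varphi\rangle\, ds.
\end{aligned} \tag{8.25}
\]
Since $\varphi \in C_0^\infty(N\Sigma)$ the formula (7.14) implies that
\[
|\langle\psi_{s,\mu_j} - \psi_{s,\infty}, e^{is\mu_j^2 H_O} H_B e^{-is\mu_j^2 H_O}\varphi\rangle| \le \sum_\nu |\langle\psi_{s,\mu_j} - \psi_{s,\infty}, F_\nu\varphi\rangle|.
\]
Thus the second term of (8.25) tends to zero as $j \to \infty$. On the other hand
\[
\begin{aligned}
\lim_{j\to\infty} \int_0^t \langle\psi_{s,\infty}, e^{is\mu_j^2 H_O} H_B e^{-is\mu_j^2 H_O}\varphi\rangle\, ds
&= \lim_{j\to\infty} \sum_\nu \int_0^t e^{is\mu_j^2\langle\nu,\omega\rangle} \langle\psi_{s,\infty}, F_\nu\varphi\rangle\, ds \\
&= \int_0^t \langle\psi_{s,\infty}, \overline{H}_B\varphi\rangle\, ds
\end{aligned}
\]
by the Riemann Lebesgue lemma. Thus, taking $j \to \infty$ in (8.25) we obtain
\[
\langle\psi_{t,\infty}, \varphi\rangle = \langle\psi, \varphi\rangle + i\int_0^t \langle\psi_{s,\infty}, \overline{H}_B\varphi\rangle\, ds. \tag{8.26}
\]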
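To see the mechanism behind the limit obtained from the Riemann Lebesgue lemma in the simplest case, one can compute the oscillatory integral explicitly. This is a sketch under a simplifying assumption: the integrand $\langle\psi_{s,\infty}, F_\nu\varphi\rangle$ is replaced by a constant $c$, and $\theta$ abbreviates $\langle\nu,\omega\rangle$ (both symbols are introduced here for illustration):

```latex
% Model case: constant integrand c, frequency \theta = \langle \nu, \omega \rangle.
\[
  \int_0^t e^{i s \mu_j^2 \theta}\, c \, ds =
  \begin{cases}
    c\, t, & \theta = 0, \\[4pt]
    c\, \dfrac{e^{i t \mu_j^2 \theta} - 1}{i \mu_j^2 \theta}, & \theta \neq 0,
  \end{cases}
  \qquad
  \left| c\, \frac{e^{i t \mu_j^2 \theta} - 1}{i \mu_j^2 \theta} \right|
  \le \frac{2|c|}{\mu_j^2 |\theta|} \xrightarrow[j\to\infty]{} 0 .
\]
```

Only the terms with $\langle\nu,\omega\rangle = 0$ survive as $j \to \infty$, which is consistent with $\overline{H}_B$ collecting the resonant operators $F_\nu$; the precise definition of $\overline{H}_B$ is given earlier in the paper and is not restated here.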
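Now let $\tilde\varphi$ be in the domain of $\overline{H}_B$. Since $C_0^\infty$ is a core for $\overline{H}_B$, we may use an approximation argument to replace $\varphi$ with $e^{-is\overline{H}_B}\tilde\varphi$ and $\overline{H}_B\varphi$ with $\overline{H}_B e^{-is\overline{H}_B}\tilde\varphi$ in the equation above. We find, using (8.26),
\[
\begin{aligned}
\frac{d}{ds}\langle\psi_{s,\infty}, e^{-is\overline{H}_B}\tilde\varphi\rangle
&= \frac{d}{dt}\langle\psi_{t,\infty}, e^{-is\overline{H}_B}\tilde\varphi\rangle\Big|_{t=s} + \frac{d}{dt}\langle\psi_{s,\infty}, e^{-it\overline{H}_B}\tilde\varphi\rangle\Big|_{t=s} \\
&= i\langle\psi_{s,\infty}, \overline{H}_B e^{-is\overline{H}_B}\tilde\varphi\rangle - i\langle\psi_{s,\infty}, \overline{H}_B e^{-is\overline{H}_B}\tilde\varphi\rangle = 0.
\end{aligned}
\]
Thus $\langle\psi_{s,\infty}, e^{-is\overline{H}_B}\tilde\varphi\rangle$ is constant. But when $s = 0$, Eq. (8.26) implies $\langle\psi_{s,\infty}, e^{-is\overline{H}_B}\tilde\varphi\rangle = \langle\psi, \tilde\varphi\rangle$. Thus $\langle e^{is\overline{H}_B}\psi_{s,\infty} - \psi, \tilde\varphi\rangle = 0$ for every $\tilde\varphi$ in the domain of $\overline{H}_B$. This implies $e^{is\overline{H}_B}\psi_{s,\infty} = \psi$, or $\psi_{s,\infty} = e^{-is\overline{H}_B}\psi$, and completes the proof. □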
References
[AD] Andersson, L. and Driver, B.K.: Finite dimensional approximations to Wiener measure and path integral formulas on manifolds. J. Funct. Anal. 165, no. 2, 430–498 (1999)
[A1] Arnold, V.I.: Mathematical Methods of Classical Mechanics. Berlin–Heidelberg–New York: Springer-Verlag, 1978
[A2] Arnold, V.I., Kozlov, V.V. and Neishtadt, A.I.: Mathematical Aspects of Classical and Celestial Mechanics. Encyclopædia of Mathematical Sciences, Volume 3, Berlin–Heidelberg–New York: Springer-Verlag, 1988
[BS] Bornemann and Schütte: Homogenization of Hamiltonian systems with a strong constraining potential. Physica D 102, 57–77 (1997)
[C] Chernoff, P.R.: Essential self-adjointness of powers of generators of hyperbolic equations. J. Funct. Anal. 12, 401–414 (1973)
[dC1] da Costa, R.C.T.: Quantum mechanics of a constrained particle. Phys. Rev. A 23, no. 4, 1982–1987 (1981)
[dC2] da Costa, R.C.T.: Constraints in quantum mechanics. Phys. Rev. A 25, no. 6, 2893–2900 (1982)
[D] Davies, E.B.: The functional calculus. J. Lond. Math. Soc. (2) 52, no. 1, 166–176 (1995)
[DE] Duclos, P. and Exner, P.: Curvature-induced bound states in quantum waveguides in two and three dimensions. Rev. Math. Phys. 7, no. 1, 73–102 (1995)
[FK] Figotin, A. and Kuchment, P.: Spectral properties of classical waves in high contrast media. SIAM J. Appl. Math. 58, 683–702 (1998)
[F] Freidlin, M.: Markov Processes and Differential Equations: Asymptotic Problems. Lectures in Mathematics ETH, Basel–Boston: Birkhäuser, 1996
[FW] Freidlin, M.I. and Wentzell, A.D.: Random Perturbations of Dynamical Systems. Second edition, Berlin–Heidelberg–New York: Springer-Verlag, 1998
[FH] Froese, R. and Herbst, I.: Realizing holonomic constraints in classical and quantum mechanics. In: Proceedings of the 1999 VAB–GIT international conference on differential equations
[G] Gallavotti, G.: The Elements of Mechanics. Berlin–Heidelberg–New York: Springer-Verlag, 1983
[GS] Guillemin, V. and Sternberg, S.: Symplectic techniques in physics. Cambridge: Cambridge University Press, 1984
[HS1] Helffer, B. and Sjöstrand, J.: Puits multiples en mécanique semi-classique V, Étude des mini-puits. In: Current topics in partial differential equations, Kinokuniya Company Ltd. (Volume in honour of S. Mizohata), pp. 133–186
[HS2] Helffer, B. and Sjöstrand, J.: Puits multiples en mécanique semi-classique VI, Cas des puits sous-variétés. Annales de l’Institut Henri Poincaré, Physique Théorique 46, no. 4, 353–372 (1987)
[KZ] Kuchment, P. and Zeng, H.: Convergence of spectra of mesoscopic systems collapsing onto a graph. Preprint (mp_arc preprint no. 00-308 at http://www.ma.utexas.edu/mp_arc/)
[M] Mitchell, K.A.: Geometric phase, curvature, and extrapotentials in constrained quantum systems. Preprint at http://xxx.lanl.gov/abs/quant-ph/0001059 18 Jan. 2000
[RU] Rubin, H. and Ungar, P.: Motion under a strong constraining force. Comm. Pure Appl. Math. 10, 28–42 (1957)
[S] Schatzman, M.: On the eigenvalues of the Laplace operator on a thin set with Neumann boundary conditions. Applicable Analysis 61, 293–306 (1996)
[Ta] Takens, F.: Motion under the influence of a strong constraining force. In: Global theory of dynamical systems. Lecture Notes in Mathematics 819, Springer-Verlag 1980, pp. 425–445
[To] Tolar, J.: On a quantum mechanical d’Alembert principle. Group theoretical methods in physics, Lecture Notes in Physics 313, Berlin–Heidelberg–New York: Springer-Verlag 1988, pp. 268–274
Communicated by B. Simon
Commun. Math. Phys. 220, 537 – 560 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Quantum Affine (Super)Algebras $U_q(A_1^{(1)})$ and $U_q(C(2)^{(2)})$
S. M. Khoroshkin$^1$, J. Lukierski$^2$, V. N. Tolstoy$^3$
1 Institute of Theoretical and Experimental Physics, 117259 Moscow, Russia.
E-mail: [email protected]
2 Institute of Theoretical Physics, University of Wrocław, 50-204 Wrocław, Poland.
E-mail: [email protected]
3 Institute of Nuclear Physics, Moscow State University, 119899 Moscow, Russia.
E-mail: [email protected] Received: 17 May 2000 / Accepted: 6 February 2001 (1)
Abstract: We show that the quantum affine algebra $U_q(A_1^{(1)})$ and the quantum affine superalgebra $U_q(C(2)^{(2)})$ admit a unified description. The difference between them consists in a phase factor, which is equal to $1$ for $U_q(A_1^{(1)})$ and to $-1$ for $U_q(C(2)^{(2)})$. We present such a description for the actions of the braid group, for the construction of the Cartan–Weyl generators and their commutation relations, as well as for the extremal projector and the universal R-matrix. We also give a unified description of the “new realizations” of these algebras, together with explicit calculations of the corresponding R-matrices.

1. Introduction

Among the variety of all affine Lie (super)algebras¹ (both quantized and non-quantized), the affine (super)algebras of rank 2 play a key role. In the first place, all affine series of the type $A(n|m)^{(1)}$, $B(n|m)^{(1)}$, $C(n)^{(1)}$, $D(n|m)^{(1)}$, $A(2n|2m-1)^{(2)}$, $A(2n-1|2m-1)^{(2)}$, $C(n)^{(2)}$, $D(n|m)^{(2)}$ and $A(2n|2m)^{(4)}$ start from the affine (super)algebras of rank 2. Secondly, the contragredient Lie (super)algebras of rank 2 are the basic structural blocks of any affine (super)algebra of arbitrary rank. This fact makes it possible, for example, to reduce the proof of basic theorems for the extremal projector and the universal R-matrix to the proof of such theorems for the (super)algebras of rank 2 (see Refs. [1, 16–18, 9, 13]). Moreover, the representation theory of the affine (super)algebras (both quantized and non-quantized) contains some typical elements of the representation theory of the affine (super)algebras of rank 2. Besides, in applications it is first of all the affine (super)algebras of rank 2 that are used, by virtue of their simplicity.

¹ We use the prefix “super” in brackets in order to include both the Lie algebras and the Lie superalgebras.

In this paper we give a detailed description of the quantum untwisted affine algebra $U_q(A_1^{(1)})$ ($\equiv U_q(\widehat{sl}(2))$) and the quantum twisted affine superalgebra $U_q(C(2)^{(2)})$ ($\equiv U_q(\widehat{osp}(2|2)^{(2)})$). Moreover, our goal is to show that these quantum (super)algebras are described in a unified way. Namely, we present in a unified way their defining relations and the actions of the braid group associated with the Weyl group, the construction of the Cartan–Weyl bases, the complete list of all permutation relations of the Cartan–Weyl generators corresponding to all root vectors, and finally the unified formula for their extremal projector and universal R-matrix. We also extend the unified description to the so-called “new realizations” of the algebras. Here we present a unified description of the universal R-matrices for the corresponding Hopf structures in a multiplicative form as well as in the form of contour integrals. The difference between the two quantum (super)algebras considered here is determined only by a phase factor, which is equal to $1$ for $U_q(A_1^{(1)})$ and to $-1$ for $U_q(C(2)^{(2)})$.

This situation is similar to the finite-dimensional case. Namely, in the paper [9] it was shown that all quantum (super)algebras $U_q(g)$, where $g$ are the finite-dimensional contragredient Lie (super)algebras of rank 2, are divided into three classes. Each such class is characterized by the same Dynkin diagram and the same reduced root system, provided that we neglect the “colour” of the roots, and all (super)algebras of the same class have unified defining relations, a unified construction and properties of the Cartan–Weyl basis, and a unified formula for the universal R-matrix. The difference between the (super)algebras of the same class is determined by some phase factors which take values $\pm1$ depending on the colour of the nodes of their Dynkin diagram. Concerning the Cartan–Weyl bases for the quantum affine algebra $U_q(A_1^{(1)})$ and the quantum affine superalgebra $U_q(C(2)^{(2)})$, it should be noted that certain results presented here can be found in the literature separately for each case (e.g., see Refs. [2, 6, 10–13], and [20]).
Basic information about the (super)algebras $A_1^{(1)}$ and $C(2)^{(2)}$ is presented in Tables 1a and 1b (see Refs. [7, 8, 19]). Table 1a lists the standard and symmetric Cartan matrices $A$ and $A^{sym}$, the corresponding extended symmetric matrices $\bar{A}^{sym}$ and their inverses $(\bar{A}^{sym})^{-1}$, as well as the sets of odd simple roots (odd), the Dynkin diagrams (diagram), and the dimensions of these (super)algebras (dim). We recall some elementary definitions of the colour of the roots:
• All even roots are called white roots. A white root is pictured by the white node ❣.
• An odd root $\gamma$ is called a grey root if $2\gamma$ is not a root. This root is pictured by the grey node ⊗.
• An odd root $\gamma$ is called a dark root if $2\gamma$ is a root. This root is pictured by the dark node ✇.
We also recall the definition of the reduced system of the positive root system $\Delta_+$ for any contragredient (super)algebra of finite growth.
• The system $\underline{\Delta}_+$ is called the reduced system if it is defined in the following way: $\underline{\Delta}_+ = \Delta_+ \setminus \{2\gamma \in \Delta_+ \mid \gamma \text{ is odd}\}$. In other words, the reduced system $\underline{\Delta}_+$ is obtained from the total system $\Delta_+$ by removing all doubled roots $2\gamma$, where $\gamma$ is a dark odd root.
The total and reduced root systems of the (super)algebras $A_1^{(1)}$ and $C(2)^{(2)}$ are listed in Table 1b. It is convenient to present the total root systems $\Delta = \Delta_+ \cup (-\Delta_+)$ and the reduced root systems $\underline{\Delta} = \underline{\Delta}_+ \cup (-\underline{\Delta}_+)$ by pictures: see Figs. 1, 2a, 2b. Comparing Fig. 1 and Fig. 2b we conclude that the reduced root systems of $A_1^{(1)}$ and $C(2)^{(2)}$ coincide if we neglect the colour of the roots.
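The passage from the total to the reduced root system can be illustrated by a small sketch. It is only an illustration under stated assumptions: a root $n\delta + k\alpha$ is stored as the pair `(n, k)`, the systems are truncated at a hypothetical height `max_n`, and the odd roots of $C(2)^{(2)}$ are taken to be those with $k = \pm 1$, as read off Table 1b and Fig. 2a. The function names are not from the paper.

```python
# A root n*delta + k*alpha is stored as the integer pair (n, k).
# Assumption (read off Table 1b and Fig. 2a): in C(2)^(2) the odd roots are
# exactly the roots with k = +1 or k = -1; A_1^(1) has no odd roots.

def total_positive_roots(max_n, super_case):
    """Positive roots with delta-coefficient <= max_n (illustrative truncation)."""
    roots = {(0, 1)}
    for n in range(1, max_n + 1):
        roots |= {(n, 1), (n, -1), (n, 0)}
    if super_case:
        roots.add((0, 2))
        for n in range(1, max_n // 2 + 1):
            roots |= {(2 * n, 2), (2 * n, -2)}
    return roots

def is_odd(root, super_case):
    return super_case and abs(root[1]) == 1

def reduced_positive_roots(roots, super_case):
    """Remove every doubled root 2*gamma with gamma odd."""
    doubled = {(2 * n, 2 * k) for (n, k) in roots if is_odd((n, k), super_case)}
    return roots - doubled

N = 6
reduced_a1 = reduced_positive_roots(total_positive_roots(N, False), False)
reduced_c2 = reduced_positive_roots(total_positive_roots(N, True), True)
print(reduced_a1 == reduced_c2)
```

Within the truncation, the two reduced systems agree as sets of labels, which is the coincidence "up to colour" noted in the text.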
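Table 1a.

$g(A, \Upsilon)$: $A_1^{(1)}$ | $C(2)^{(2)}$
$A = A^{sym}$: $\begin{pmatrix} 2 & -2 \\ -2 & 2 \end{pmatrix}$ | $\begin{pmatrix} 2 & -2 \\ -2 & 2 \end{pmatrix}$
$\bar{A}^{sym}$: $\begin{pmatrix} 0 & 1 & 0 \\ 1 & 2 & -2 \\ 0 & -2 & 2 \end{pmatrix}$ | $\begin{pmatrix} 0 & 1 & 0 \\ 1 & 2 & -2 \\ 0 & -2 & 2 \end{pmatrix}$
$(\bar{A}^{sym})^{-1}$: $\begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 0 & \frac{1}{2} \end{pmatrix}$ | $\begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 0 & \frac{1}{2} \end{pmatrix}$
odd: $\emptyset$ | $\{\delta-\alpha, \alpha\}$
diagram: ❣ ($\delta-\alpha$), ❣ ($\alpha$) | ✇ ($\delta-\alpha$), ✇ ($\alpha$)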
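Table 1b.

$g(A, \Upsilon)$: $\Delta_+$ | $\underline{\Delta}_+$
$A_1^{(1)}$: $\{\alpha,\ n\delta\pm\alpha,\ n\delta \mid n \in \mathbb{N}\}$ | $\{\alpha,\ n\delta\pm\alpha,\ n\delta \mid n \in \mathbb{N}\}$
$C(2)^{(2)}$: $\{\alpha,\ 2\alpha,\ n\delta\pm\alpha,\ 2n\delta\pm2\alpha,\ n\delta \mid n \in \mathbb{N}\}$ | $\{\alpha,\ n\delta\pm\alpha,\ n\delta \mid n \in \mathbb{N}\}$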
Fig. 1. The total and reduced root system ($\Delta = \underline{\Delta}$) of $A_1^{(1)}$ ($\widehat{sl}_2$)

Fig. 2a. The total root system of $C(2)^{(2)}$

Fig. 2b. The reduced root system of $C(2)^{(2)}$
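2. Defining Relations of $U_q(A_1^{(1)})$ and $U_q(C(2)^{(2)})$

The quantum (q-deformed) affine (super)algebras $U_q(A_1^{(1)})$ and $U_q(C(2)^{(2)})$ are generated by the Chevalley elements $k_d^{\pm1} := q^{\pm h_d}$, $k_\alpha^{\pm1} := q^{\pm h_\alpha}$, $k_{\delta-\alpha}^{\pm1} := q^{\pm h_{\delta-\alpha}}$, $e_{\pm\alpha}$, $e_{\pm(\delta-\alpha)}$ with the defining relations

$k_\gamma k_\gamma^{-1} = k_\gamma^{-1} k_\gamma = 1, \qquad [k_\gamma^{\pm1}, k_{\gamma'}^{\pm1}] = 0,$  (2.1)
$k_\gamma e_{\pm\alpha} k_\gamma^{-1} = q^{\pm(\gamma,\alpha)} e_{\pm\alpha}, \qquad k_\gamma e_{\pm(\delta-\alpha)} k_\gamma^{-1} = q^{\pm(\gamma,\delta-\alpha)} e_{\pm(\delta-\alpha)},$  (2.2)
$[e_\alpha, e_{-\alpha}] = [h_\alpha]_q, \qquad [e_{\delta-\alpha}, e_{-\delta+\alpha}] = [h_{\delta-\alpha}]_q,$  (2.3)
$[e_\alpha, e_{-\delta+\alpha}] = 0, \qquad [e_{-\alpha}, e_{\delta-\alpha}] = 0,$  (2.4)
$[e_{\pm\alpha}, [e_{\pm\alpha}, [e_{\pm\alpha}, e_{\pm(\delta-\alpha)}]_q]_q]_q = 0,$  (2.5)
$[[[e_{\pm\alpha}, e_{\pm(\delta-\alpha)}]_q, e_{\pm(\delta-\alpha)}]_q, e_{\pm(\delta-\alpha)}]_q = 0,$  (2.6)

where $\gamma, \gamma' = d, \alpha, \delta-\alpha$, $(d, \alpha) = 0$, $(d, \delta) = 1$, and $[h_\beta]_q := (k_\beta - k_\beta^{-1})/(q - q^{-1})$. The brackets $[\cdot,\cdot]$ and $[\cdot,\cdot]_q$ are the super- and q-super-commutators:

$[e_\beta, e_{\beta'}] = e_\beta e_{\beta'} - (-1)^{\vartheta(\beta)\vartheta(\beta')} e_{\beta'} e_\beta,$
$[e_\beta, e_{\beta'}]_q = e_\beta e_{\beta'} - (-1)^{\vartheta(\beta)\vartheta(\beta')} q^{(\beta,\beta')} e_{\beta'} e_\beta.$  (2.7)

Here the symbol $\vartheta(\cdot)$ denotes the parity function: $\vartheta(\beta) = 0$ for any even root $\beta$, and $\vartheta(\beta) = 1$ for any odd root $\beta$.

Remark. The left sides of the relations (2.5) and (2.6) are invariant with respect to the replacement of $q$ by $q^{-1}$. Indeed, if we expand the q-brackets we see that the left sides of (2.5) and (2.6) contain only symmetric functions of $q$ and $q^{-1}$. This property allows us to write the q-commutators in (2.5) and (2.6) in the inverse order, i.e.

$[[[e_{\pm(\delta-\alpha)}, e_{\pm\alpha}]_q, e_{\pm\alpha}]_q, e_{\pm\alpha}]_q = 0,$  (2.8)
$[e_{\pm(\delta-\alpha)}, [e_{\pm(\delta-\alpha)}, [e_{\pm(\delta-\alpha)}, e_{\pm\alpha}]_q]_q]_q = 0.$  (2.9)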
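The symmetry claimed in the Remark can be checked mechanically for the left side of (2.5). The sketch below is only an illustration: it expands the nested q-super-commutators in a free algebra on two letters, `'x'` for $e_\alpha$ and `'y'` for $e_{\delta-\alpha}$. It assumes the bilinear form $(\alpha,\alpha) = (\delta-\alpha,\delta-\alpha) = 2$, $(\alpha,\delta-\alpha) = -2$ taken from the symmetric Cartan matrix of Table 1a, and treats both generators as odd when `theta = 1`. All names are hypothetical.

```python
from collections import defaultdict

# 'x' stands for e_alpha and 'y' for e_{delta-alpha}. The bilinear form is an
# assumption read off the symmetric Cartan matrix in Table 1a.
FORM = {('x', 'x'): 2, ('x', 'y'): -2, ('y', 'x'): -2, ('y', 'y'): 2}

def pairing(u, v):
    """Bilinear form between the weights of two words."""
    return sum(FORM[(s, t)] for s in u for t in v)

def multiply(A, B):
    """Product of elements stored as {word: {q-exponent: integer coefficient}}."""
    out = defaultdict(lambda: defaultdict(int))
    for wa, ca in A.items():
        for wb, cb in B.items():
            for ea, va in ca.items():
                for eb, vb in cb.items():
                    out[wa + wb][ea + eb] += va * vb
    return out

def q_bracket(A, B, theta):
    """q-super-commutator (2.7) of two weight-homogeneous elements."""
    wa, wb = next(iter(A)), next(iter(B))
    # Both generators are odd when theta = 1, so parity = theta * word length.
    sign = (-1) ** ((theta * len(wa)) * (theta * len(wb)))
    shift = pairing(wa, wb)
    out = multiply(A, B)
    for w, c in multiply(B, A).items():
        for e, v in c.items():
            out[w][e + shift] -= sign * v
    cleaned = {w: {e: v for e, v in c.items() if v} for w, c in out.items()}
    return {w: c for w, c in cleaned.items() if c}

def serre_left_side(theta):
    """Left side of (2.5) with upper signs, expanded in the free algebra."""
    x = {('x',): {0: 1}}
    y = {('y',): {0: 1}}
    inner = q_bracket(x, y, theta)
    middle = q_bracket(x, inner, theta)
    return q_bracket(x, middle, theta)

def is_symmetric_in_q(element):
    """True if every coefficient is invariant under q -> 1/q."""
    return all(c.get(e, 0) == c.get(-e, 0) for c in element.values() for e in c)

for theta in (0, 1):
    print(theta, is_symmetric_in_q(serre_left_side(theta)))
```

For `theta = 0` the coefficient of the word `xxyx` comes out as $-(q^{2} + 1 + q^{-2})$, and for `theta = 1` as $q^{2} - 1 + q^{-2}$; both are symmetric under $q \to q^{-1}$, in line with the Remark.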
We now prove a useful proposition.

Proposition 2.1. (i) In the quantum (super)algebras $U_q(A_1^{(1)})$ and $U_q(C(2)^{(2)})$ the following relations are fulfilled:

$\bigl[[e_{\pm\alpha}, [e_{\pm\alpha}, e_{\pm(\delta-\alpha)}]_q]_q,\ [[e_{\pm\alpha}, [e_{\pm\alpha}, e_{\pm(\delta-\alpha)}]_q]_q, [e_{\pm\alpha}, e_{\pm(\delta-\alpha)}]_q]_q\bigr]_q = 0,$  (2.10)
$\bigl[[e_{\pm(\delta-\alpha)}, [e_{\pm(\delta-\alpha)}, e_{\pm\alpha}]_q]_q,\ [[e_{\pm(\delta-\alpha)}, [e_{\pm(\delta-\alpha)}, e_{\pm\alpha}]_q]_q, [e_{\pm(\delta-\alpha)}, e_{\pm\alpha}]_q]_q\bigr]_q = 0.$  (2.11)

(ii) Conversely, if the relations (2.1)–(2.4) and (2.10), (2.11) are satisfied, then the relations (2.5), (2.6) are also valid.

Thus, the proposition says that under the conditions (2.1)–(2.4) the relations (2.5), (2.6) and (2.10), (2.11) are equivalent.

Proof. Let us assume that the relations (2.1)–(2.6) are fulfilled. We take the relations (2.6) and apply to them the corresponding q-commutator with the fourth power of $e_{\pm\alpha}$, i.e. $[e_{\pm\alpha}, [e_{\pm\alpha}, [e_{\pm\alpha}, [e_{\pm\alpha}, a_\pm]_q \ldots ]_q = 0$, where $a_\pm$ is the left side of (2.6). After a tedious calculation we arrive at the relations (2.10). The relations (2.11) are proved in a similar way. Namely, the relations $[e_{\pm(\delta-\alpha)}, [e_{\pm(\delta-\alpha)}, [e_{\pm(\delta-\alpha)}, [e_{\pm(\delta-\alpha)}, b_\pm]_q \ldots ]_q = 0$, where $b_\pm$ is the left side of (2.5) in the form (2.8), result in (2.11). Conversely, if the relations (2.1)–(2.4) and (2.10), (2.11) hold, then from the relations $[e_{\mp\alpha}, [e_{\mp\alpha}, [e_{\mp\alpha}, [e_{\mp\alpha}, a'_\pm] \ldots ] = 0$ and $[e_{\mp(\delta-\alpha)}, [e_{\mp(\delta-\alpha)}, [e_{\mp(\delta-\alpha)}, [e_{\mp(\delta-\alpha)}, b'_\pm] \ldots ] = 0$, where $a'_\pm$ and $b'_\pm$ are the left sides of the relations (2.10) and (2.11) correspondingly, the relations (2.6) and (2.5) follow.
The standard Hopf structure of the quantum (super)algebras $U_q(A_1^{(1)})$ and $U_q(C(2)^{(2)})$ is given by the following formulas for the comultiplication $\Delta_q$ and antipode $S_q$:

$\Delta_q(k_\gamma^{\pm1}) = k_\gamma^{\pm1} \otimes k_\gamma^{\pm1}, \qquad S_q(k_\gamma^{\pm1}) = k_\gamma^{\mp1},$
$\Delta_q(e_\beta) = e_\beta \otimes 1 + k_\beta^{-1} \otimes e_\beta, \qquad S_q(e_\beta) = -k_\beta e_\beta,$  (2.12)
$\Delta_q(e_{-\beta}) = e_{-\beta} \otimes k_\beta + 1 \otimes e_{-\beta}, \qquad S_q(e_{-\beta}) = -e_{-\beta} k_\beta^{-1},$

where $\beta = \alpha, \delta-\alpha$; $\gamma = d, \beta$. It is not hard to verify by direct calculation with the defining relations (2.1)–(2.6) that the quantum affine (super)algebras $U_q(A_1^{(1)})$ and $U_q(C(2)^{(2)})$ have the following simple involutive (anti)automorphisms.
(i) The non-graded antilinear antiinvolution or conjugation “$*$”:

$(q^{\pm1})^* = q^{\mp1}, \qquad (k_\gamma^{\pm1})^* = k_\gamma^{\mp1}, \qquad e_\beta^* = e_{-\beta}, \qquad e_{-\beta}^* = e_\beta$  (2.13)

($(xy)^* = y^* x^*$ for all $x, y \in U_q(g)$).

(ii) The graded antilinear antiinvolution or graded conjugation “$\ddagger$”:

$(q^{\pm1})^\ddagger = q^{\mp1}, \qquad (k_\gamma^{\pm1})^\ddagger = k_\gamma^{\mp1}, \qquad e_\beta^\ddagger = (-1)^{\vartheta(\beta)} e_{-\beta}, \qquad e_{-\beta}^\ddagger = e_\beta$  (2.14)

($(xy)^\ddagger = (-1)^{\deg x \deg y} y^\ddagger x^\ddagger$ for any homogeneous elements $x, y \in U_q(g)$).

(iii) The Chevalley graded involution $\omega$:

$\omega(q^{\pm1}) = q^{\mp1}, \qquad \omega(k_\gamma^{\pm1}) = k_\gamma^{\pm1}, \qquad \omega(e_\beta) = -e_{-\beta}, \qquad \omega(e_{-\beta}) = -(-1)^{\theta(\beta)} e_\beta.$  (2.15)

(iv) The Dynkin involution $\tau$, which is associated with the automorphism of the Dynkin diagrams of the (super)algebras $A_1^{(1)}$ and $C(2)^{(2)}$:

$\tau(q^{\pm1}) = q^{\pm1}, \qquad \tau(k_d^{\pm1}) = k_d^{\pm1}, \qquad \tau(k_\beta^{\pm1}) = k_{\delta-\beta}^{\pm1}, \qquad \tau(k_{-\beta}^{\pm1}) = k_{-\delta+\beta}^{\pm1},$
$\tau(e_\beta) = e_{\delta-\beta}, \qquad \tau(e_{-\beta}) = e_{-\delta+\beta}.$  (2.16)

Here in (2.13)–(2.16) $\beta = \alpha, \delta-\alpha$; $\gamma = d, \beta$. It should be noted that the graded conjugation “$\ddagger$” and the Chevalley graded involution $\omega$ are involutive (anti)automorphisms of the fourth order, i.e., for example, $\omega^4 = \mathrm{id}$. Note also that the Dynkin involution $\tau$ commutes with all three other involutions, i.e. $\tau(x^*) = (\tau(x))^*$, $\tau(x^\ddagger) = (\tau(x))^\ddagger$ and $\omega\tau(x) = \tau\omega(x)$ for any element $x \in U_q(g)$ ($g = A_1^{(1)}, C(2)^{(2)}$).

In the next section we consider a q-analog of the automorphisms connected with the Weyl group of the (super)algebras $A_1^{(1)}$ and $C(2)^{(2)}$. This q-analog defines actions of the braid group associated with the Weyl group.
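3. Braid Group Actions

We introduce the morphisms $T_\alpha$ and $T_{\delta-\alpha}$ defined by the following formulas:

$T_\alpha(q^{\pm1}) = q^{\pm1}, \qquad T_\alpha(k_\gamma^{\pm1}) = k_\gamma^{\pm1} k_\alpha^{\mp 2(\alpha,\gamma)/(\alpha,\alpha)},$
$T_\alpha(e_\alpha) = -e_{-\alpha} k_\alpha, \qquad T_\alpha(e_{-\alpha}) = -(-1)^{\theta(\alpha)} k_\alpha^{-1} e_\alpha,$
$T_\alpha(e_{\delta-\alpha}) = \frac{1}{a} [e_\alpha, [e_\alpha, e_{\delta-\alpha}]_q]_q,$  (3.1)
$T_\alpha(e_{-\delta+\alpha}) = \frac{(-1)^{\theta(\alpha)}}{a} [[e_{-\delta+\alpha}, e_{-\alpha}]_{q^{-1}}, e_{-\alpha}]_{q^{-1}},$

$T_{\delta-\alpha}(q^{\pm1}) = q^{\pm1}, \qquad T_{\delta-\alpha}(k_\gamma^{\pm1}) = k_\gamma^{\pm1} k_{\delta-\alpha}^{\mp 2(\delta-\alpha,\gamma)/(\alpha,\alpha)},$
$T_{\delta-\alpha}(e_{\delta-\alpha}) = -e_{-\delta+\alpha} k_{\delta-\alpha}, \qquad T_{\delta-\alpha}(e_{-\delta+\alpha}) = -(-1)^{\theta(\alpha)} k_{\delta-\alpha}^{-1} e_{\delta-\alpha},$
$T_{\delta-\alpha}(e_\alpha) = \frac{1}{a} [e_{\delta-\alpha}, [e_{\delta-\alpha}, e_\alpha]_q]_q,$  (3.2)
$T_{\delta-\alpha}(e_{-\alpha}) = \frac{(-1)^{\theta(\alpha)}}{a} [[e_{-\alpha}, e_{-\delta+\alpha}]_{q^{-1}}, e_{-\delta+\alpha}]_{q^{-1}},$

where $\gamma = d, \alpha, \delta-\alpha$. It is not difficult to prove by direct verification that the morphisms $T_\alpha^{-1}$ and $T_{\delta-\alpha}^{-1}$ given by

$T_\alpha^{-1}(q^{\pm1}) = q^{\pm1}, \qquad T_\alpha^{-1}(k_\gamma^{\pm1}) = k_\gamma^{\pm1} k_\alpha^{\mp 2(\alpha,\gamma)/(\alpha,\alpha)},$
$T_\alpha^{-1}(e_\alpha) = -(-1)^{\theta(\alpha)} k_\alpha^{-1} e_{-\alpha}, \qquad T_\alpha^{-1}(e_{-\alpha}) = -e_\alpha k_\alpha,$
$T_\alpha^{-1}(e_{\delta-\alpha}) = \frac{1}{a} [[e_{\delta-\alpha}, e_\alpha]_q, e_\alpha]_q,$  (3.3)
$T_\alpha^{-1}(e_{-\delta+\alpha}) = \frac{(-1)^{\theta(\alpha)}}{a} [e_{-\alpha}, [e_{-\alpha}, e_{-\delta+\alpha}]_{q^{-1}}]_{q^{-1}},$

$T_{\delta-\alpha}^{-1}(q^{\pm1}) = q^{\pm1}, \qquad T_{\delta-\alpha}^{-1}(k_\gamma^{\pm1}) = k_\gamma^{\pm1} k_{\delta-\alpha}^{\mp 2(\delta-\alpha,\gamma)/(\alpha,\alpha)},$
$T_{\delta-\alpha}^{-1}(e_{\delta-\alpha}) = -(-1)^{\theta(\alpha)} k_{\delta-\alpha}^{-1} e_{-\delta+\alpha}, \qquad T_{\delta-\alpha}^{-1}(e_{-\delta+\alpha}) = -e_{\delta-\alpha} k_{\delta-\alpha},$
$T_{\delta-\alpha}^{-1}(e_\alpha) = \frac{1}{a} [[e_\alpha, e_{\delta-\alpha}]_q, e_{\delta-\alpha}]_q,$  (3.4)
$T_{\delta-\alpha}^{-1}(e_{-\alpha}) = \frac{(-1)^{\theta(\alpha)}}{a} [e_{-\delta+\alpha}, [e_{-\delta+\alpha}, e_{-\alpha}]_{q^{-1}}]_{q^{-1}}$

are inverses to $T_\alpha$ and $T_{\delta-\alpha}$, i.e.

$T_\alpha T_\alpha^{-1} = T_\alpha^{-1} T_\alpha = 1, \qquad T_{\delta-\alpha} T_{\delta-\alpha}^{-1} = T_{\delta-\alpha}^{-1} T_{\delta-\alpha} = 1.$  (3.5)

Here in (3.1)–(3.4) and in what follows we use the notation

$a := [(\alpha,\alpha)]_q = \frac{q^{(\alpha,\alpha)} - q^{-(\alpha,\alpha)}}{q - q^{-1}}.$  (3.6)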
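Proposition 3.1. (i) The morphisms $T_\alpha$ and $T_{\delta-\alpha}$ (and also $T_\alpha^{-1}$ and $T_{\delta-\alpha}^{-1}$) commute with the graded conjugation “$\ddagger$”, i.e.

$(T_\alpha(x))^\ddagger = T_\alpha(x^\ddagger), \qquad (T_{\delta-\alpha}(x))^\ddagger = T_{\delta-\alpha}(x^\ddagger)$  (3.7)

for any element $x \in U_q(g)$.
(ii) The morphisms $T_\alpha^{\pm1}$ and $T_{\delta-\alpha}^{\pm1}$ are also compatible with the Chevalley graded involution $\omega$, in the sense that

$T_\alpha \omega = \omega T_\alpha^{-1}, \qquad T_{\delta-\alpha} \omega = \omega T_{\delta-\alpha}^{-1}.$  (3.8)

(iii) The morphisms $T_\alpha^{\pm1}$ and $T_{\delta-\alpha}^{\pm1}$ are connected with each other by the Dynkin involution $\tau$, in the sense that

$T_\alpha \tau = \tau T_{\delta-\alpha}, \qquad T_\alpha^{-1} \tau = \tau T_{\delta-\alpha}^{-1}.$  (3.9)

This proposition can be proved by direct verification for the Chevalley basis.

Proposition 3.2. The morphisms $T_\alpha$ and $T_{\delta-\alpha}$ (and also $T_\alpha^{-1}$ and $T_{\delta-\alpha}^{-1}$) are automorphisms of the quantum (super)algebras $U_q(A_1^{(1)})$ and $U_q(C(2)^{(2)})$.

Proof. The proposition is proved by direct verification that the defining relations remain valid under the actions of the given morphisms. Subsequently we apply Proposition 2.1. Note that under the action of $T_\alpha$ the relations (2.4) and (2.5) are transformed into each other, and the relation (2.6) is transformed to (2.10). Analogously, under the action of $T_{\delta-\alpha}$ the relations (2.4) and (2.6) are transformed into each other, and the relations (2.5) are transformed to (2.11). In addition, it is useful to apply the relations (3.7).

It is easy to see that the automorphisms $T_\alpha^{\pm1}$ and $T_{\delta-\alpha}^{\pm1}$ are not Hopf algebra automorphisms of $U_q(A_1^{(1)})$ and $U_q(C(2)^{(2)})$, in the sense that, e.g., $(T_\alpha \otimes T_\alpha) \circ \Delta_q \neq \Delta_q \circ T_\alpha$. In the case of $U_q(g)$, where $g$ is a finite-dimensional simple Lie algebra, the automorphisms of type $T_\alpha^{\pm1}$ and $T_{\delta-\alpha}^{\pm1}$ are called the Lusztig automorphisms [15].

Let us introduce the following root vectors:

$e_\delta := [e_\alpha, e_{\delta-\alpha}]_q, \qquad e_{-\delta} := [e_{-\delta+\alpha}, e_{-\alpha}]_{q^{-1}},$
$\tilde{e}_\delta := [e_{\delta-\alpha}, e_\alpha]_q, \qquad \tilde{e}_{-\delta} := [e_{-\alpha}, e_{-\delta+\alpha}]_{q^{-1}}.$  (3.10)

It is not difficult to verify that under the actions of the automorphisms $T_\alpha^{\pm1}$ and $T_{\delta-\alpha}^{\pm1}$ the elements $e_{\pm\delta}$ and $\tilde{e}_{\pm\delta}$ are transformed as follows:

$T_\alpha(\tilde{e}_{\pm\delta}) = (-1)^{\theta(\alpha)} e_{\pm\delta}, \qquad T_\alpha^{-1}(e_{\pm\delta}) = (-1)^{\theta(\alpha)} \tilde{e}_{\pm\delta},$
$T_{\delta-\alpha}(e_{\pm\delta}) = (-1)^{\theta(\alpha)} \tilde{e}_{\pm\delta}, \qquad T_{\delta-\alpha}^{-1}(\tilde{e}_{\pm\delta}) = (-1)^{\theta(\alpha)} e_{\pm\delta}.$  (3.11)

Therefore

$T_{2\delta}(e_{\pm\delta}) = e_{\pm\delta}, \qquad T_{2\delta}^{-1}(e_{\pm\delta}) = e_{\pm\delta}, \qquad \tilde{T}_{2\delta}(\tilde{e}_{\pm\delta}) = \tilde{e}_{\pm\delta}, \qquad \tilde{T}_{2\delta}^{-1}(\tilde{e}_{\pm\delta}) = \tilde{e}_{\pm\delta},$  (3.12)

where the elements $T_{2\delta}$ and $\tilde{T}_{2\delta}$, called the translation operators, are given by

$T_{2\delta} = T_\alpha T_{\delta-\alpha}, \qquad T_{2\delta}^{-1} = T_{\delta-\alpha}^{-1} T_\alpha^{-1}, \qquad \tilde{T}_{2\delta} = T_{\delta-\alpha} T_\alpha, \qquad \tilde{T}_{2\delta}^{-1} = T_\alpha^{-1} T_{\delta-\alpha}^{-1}.$  (3.13)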
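The name "translation operator" can be made concrete on the level of root labels. The following sketch is an illustration only: it assumes that $T_\alpha$ and $T_{\delta-\alpha}$ move weights by the Weyl reflections $s_\alpha$ and $s_{\delta-\alpha}$ (consistent with the weights appearing in (3.1), (3.2)), that $\tau$ exchanges $\alpha$ and $\delta-\alpha$, and that $(\alpha,\alpha) = 2$ with $\delta$ orthogonal to all roots. Coefficients, signs and Cartan factors are ignored, and all function names are hypothetical.

```python
# A weight n*delta + k*alpha is stored as (n, k). Assumed normalisation:
# (alpha, alpha) = 2 and (delta, root) = 0, so the form only sees k.

ALPHA = (0, 1)
DELTA_MINUS_ALPHA = (1, -1)

def form(r, s):
    return 2 * r[1] * s[1]

def reflect(root, simple):
    """Weyl reflection: root - 2 (root, simple) / (simple, simple) * simple."""
    c = 2 * form(root, simple) // form(simple, simple)
    return (root[0] - c * simple[0], root[1] - c * simple[1])

def tau(root):
    """Dynkin involution on labels: alpha <-> delta - alpha, delta fixed."""
    n, k = root
    return (n + k, -k)

def t_2delta(root):
    """Weight shift of T_{2 delta} = T_alpha T_{delta-alpha} (rightmost acts first)."""
    return reflect(reflect(root, DELTA_MINUS_ALPHA), ALPHA)

def t_delta(root):
    """Weight shift of T_alpha tau (tau acts first)."""
    return reflect(tau(root), ALPHA)

for root in [(0, 1), (3, 1), (3, -1), (2, 0)]:
    print(root, t_2delta(root), t_delta(t_delta(root)))
```

On these labels the composite $s_\alpha s_{\delta-\alpha}$ shifts $n\delta + k\alpha$ to $(n + 2k)\delta + k\alpha$, leaving imaginary roots fixed, and applying `t_delta` twice gives the same shift.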
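Proposition 3.3. The automorphisms $T_\delta := T_\alpha \tau$, $T_\delta^{-1} := \tau T_\alpha^{-1}$ and $\tilde{T}_\delta := T_{\delta-\alpha} \tau$, $\tilde{T}_\delta^{-1} := \tau T_{\delta-\alpha}^{-1}$ are the square roots of the automorphisms $T_{2\delta}^{\pm1}$ and $\tilde{T}_{2\delta}^{\pm1}$ correspondingly, i.e.

$T_\delta^2 = T_{2\delta}, \qquad T_\delta^{-2} = T_{2\delta}^{-1}, \qquad \tilde{T}_\delta^2 = \tilde{T}_{2\delta}, \qquad \tilde{T}_\delta^{-2} = \tilde{T}_{2\delta}^{-1}.$  (3.14)

Moreover

$T_\delta(e_{\delta-\alpha}) = -e_{-\alpha} k_\alpha, \qquad T_\delta(e_{-\delta+\alpha}) = -(-1)^{\theta(\alpha)} k_\alpha^{-1} e_\alpha,$
$T_\delta^{-1}(e_\alpha) = -(-1)^{\theta(\alpha)} k_{\delta-\alpha}^{-1} e_{-\delta+\alpha}, \qquad T_\delta^{-1}(e_{-\alpha}) = -e_{\delta-\alpha} k_{\delta-\alpha},$  (3.15)

$\tilde{T}_\delta(e_\alpha) = -e_{-\delta+\alpha} k_{\delta-\alpha}, \qquad \tilde{T}_\delta(e_{-\alpha}) = -(-1)^{\theta(\alpha)} k_{\delta-\alpha}^{-1} e_{\delta-\alpha},$
$\tilde{T}_\delta^{-1}(e_{\delta-\alpha}) = -(-1)^{\theta(\alpha)} k_\alpha^{-1} e_{-\alpha}, \qquad \tilde{T}_\delta^{-1}(e_{-\delta+\alpha}) = -e_\alpha k_\alpha,$  (3.16)

and also

$T_\delta(e_{\pm\delta}) = (-1)^{\theta(\alpha)} e_{\pm\delta}, \qquad T_\delta^{-1}(e_{\pm\delta}) = (-1)^{\theta(\alpha)} e_{\pm\delta},$
$\tilde{T}_\delta(\tilde{e}_{\pm\delta}) = (-1)^{\theta(\alpha)} \tilde{e}_{\pm\delta}, \qquad \tilde{T}_\delta^{-1}(\tilde{e}_{\pm\delta}) = (-1)^{\theta(\alpha)} \tilde{e}_{\pm\delta}.$  (3.17)

Proof. From (3.9) we have that $T_{\delta-\alpha} = \tau T_\alpha \tau$ and therefore $T_{2\delta} = T_\alpha T_{\delta-\alpha} = T_\alpha \tau T_\alpha \tau = T_\delta^2$. Analogously $T_\alpha = \tau T_{\delta-\alpha} \tau$ and therefore $\tilde{T}_{2\delta} = T_{\delta-\alpha} T_\alpha = T_{\delta-\alpha} \tau T_{\delta-\alpha} \tau = \tilde{T}_\delta^2$. The formulas (3.15)–(3.17) follow immediately from the definitions.

In the next section we construct the Cartan–Weyl basis and describe its properties in detail.

4. Cartan–Weyl Basis for $U_q(A_1^{(1)})$ and $U_q(C(2)^{(2)})$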
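A general scheme for the construction of a Cartan–Weyl basis for quantized Lie algebras and superalgebras was proposed in Ref. [17]. The scheme was first applied in detail to quantized finite-dimensional Lie (super)algebras [9] and then to quantized non-twisted affine algebras [18]. This procedure is based on the notion of “normal ordering” for the reduced positive root system. For affine Lie (super)algebras this notion was formulated in [16] (see also [17, 10–13]). In our case the reduced positive system has only two normal orderings:

$\alpha,\ \delta+\alpha,\ 2\delta+\alpha,\ \ldots,\ \infty\delta+\alpha,\ \delta,\ 2\delta,\ 3\delta,\ \ldots,\ \infty\delta,\ \infty\delta-\alpha,\ \ldots,\ 3\delta-\alpha,\ 2\delta-\alpha,\ \delta-\alpha,$  (4.1)
$\delta-\alpha,\ 2\delta-\alpha,\ 3\delta-\alpha,\ \ldots,\ \infty\delta-\alpha,\ \delta,\ 2\delta,\ 3\delta,\ \ldots,\ \infty\delta,\ \infty\delta+\alpha,\ \ldots,\ 2\delta+\alpha,\ \delta+\alpha,\ \alpha.$  (4.2)

The first normal ordering (4.1) corresponds to the “clockwise” ordering of the positive roots in Figs. 1, 2b if we move from the root $\alpha$ to the root $\delta-\alpha$. The inverse normal ordering (4.2) corresponds to the “anticlockwise” ordering of the positive roots when we move from $\delta-\alpha$ to $\alpha$. In accordance with the normal ordering (4.1) we set

$e'_\delta = e_\delta = [e_\alpha, e_{\delta-\alpha}]_q, \qquad e'_{-\delta} = e_{-\delta} := [e_{-\delta+\alpha}, e_{-\alpha}]_{q^{-1}},$  (4.3)
$e_{n\delta+\alpha} = \frac{1}{a} [e_{(n-1)\delta+\alpha}, e'_\delta], \qquad e_{-n\delta-\alpha} := \frac{1}{a} [e'_{-\delta}, e_{-(n-1)\delta-\alpha}],$  (4.4)
$e_{(n+1)\delta-\alpha} = \frac{1}{a} [e'_\delta, e_{n\delta-\alpha}], \qquad e_{-(n+1)\delta+\alpha} := \frac{1}{a} [e_{-n\delta+\alpha}, e'_{-\delta}],$  (4.5)
$e'_{(n+1)\delta} = [e_\alpha, e_{(n+1)\delta-\alpha}]_q, \qquad e'_{-(n+1)\delta} := [e_{-(n+1)\delta+\alpha}, e_{-\alpha}]_{q^{-1}},$  (4.6)

where $n = 1, 2, \ldots$, and $a$ is given by the formula (3.6). Analogously, for the inverse normal ordering (4.2) we set

$\tilde{e}'_\delta = \tilde{e}_\delta = [e_{\delta-\alpha}, e_\alpha]_q, \qquad \tilde{e}'_{-\delta} = \tilde{e}_{-\delta} := [e_{-\alpha}, e_{-\delta+\alpha}]_{q^{-1}},$  (4.7)
$\tilde{e}_{n\delta+\alpha} = \frac{1}{a} [\tilde{e}'_\delta, \tilde{e}_{(n-1)\delta+\alpha}], \qquad \tilde{e}_{-n\delta-\alpha} := \frac{1}{a} [\tilde{e}_{-(n-1)\delta-\alpha}, \tilde{e}'_{-\delta}],$  (4.8)
$\tilde{e}_{(n+1)\delta-\alpha} = \frac{1}{a} [\tilde{e}_{n\delta-\alpha}, \tilde{e}'_\delta], \qquad \tilde{e}_{-(n+1)\delta+\alpha} := \frac{1}{a} [\tilde{e}'_{-\delta}, \tilde{e}_{-n\delta+\alpha}],$  (4.9)
$\tilde{e}'_{(n+1)\delta} = [e_{\delta-\alpha}, \tilde{e}_{n\delta+\alpha}]_q, \qquad \tilde{e}'_{-(n+1)\delta} := [e_{-\delta+\alpha}, \tilde{e}_{-n\delta-\alpha}]_{q^{-1}},$  (4.10)

where $n = 1, 2, \ldots$. Thus, we have two systems of Cartan–Weyl generators: “direct” and “inverse”. Each such system, together with the Cartan generators $k_\alpha^{\pm1}$, $k_{\delta-\alpha}^{\pm1}$ and the Chevalley elements $e_{\pm\alpha}$, $e_{\pm(\delta-\alpha)}$, is called the q-analog of the Cartan–Weyl basis (or simply the Cartan–Weyl basis) for the quantum (super)algebras $U_q(A_1^{(1)})$ and $U_q(C(2)^{(2)})$.

Now we consider some properties of these bases. First of all, the explicit construction of the Cartan–Weyl generators (4.3)–(4.6) (or (4.7)–(4.10)) makes it easy to find their properties with respect to the (anti)involutions (2.13)–(2.15). For example, it is evident that

$e_{\pm\gamma}^* = e_{\mp\gamma}, \qquad \forall \gamma \in \underline{\Delta}_+,$  (4.11)

and also

$e_{n\delta+\alpha}^\ddagger = (-1)^{(n+1)\theta(\alpha)} e_{-n\delta-\alpha}, \qquad e_{-n\delta-\alpha}^\ddagger = (-1)^{n\theta(\alpha)} e_{n\delta+\alpha},$
$e_{n\delta-\alpha}^\ddagger = (-1)^{n\theta(\alpha)} e_{-n\delta+\alpha}, \qquad e_{-n\delta+\alpha}^\ddagger = (-1)^{(n-1)\theta(\alpha)} e_{n\delta-\alpha},$
$(e'_{n\delta})^\ddagger = (-1)^{n\theta(\alpha)} e'_{-n\delta}, \qquad (e'_{-n\delta})^\ddagger = (-1)^{n\theta(\alpha)} e'_{n\delta}.$  (4.12)

Further, it is easy to see that the “direct” and “inverse” Cartan–Weyl generators (4.3)–(4.6) and (4.7)–(4.10) are connected in a very simple way by the Dynkin involution $\tau$:

$\tau(e_{n\delta+\alpha}) = \tilde{e}_{(n+1)\delta-\alpha}, \qquad \tau(\tilde{e}_{n\delta+\alpha}) = e_{(n+1)\delta-\alpha} \qquad (n \in \mathbb{Z}),$
$\tau(e_{n\delta-\alpha}) = \tilde{e}_{(n-1)\delta+\alpha}, \qquad \tau(\tilde{e}_{n\delta-\alpha}) = e_{(n-1)\delta+\alpha} \qquad (n \in \mathbb{Z}),$  (4.13)
$\tau(e'_{n\delta}) = \tilde{e}'_{n\delta}, \qquad \tau(\tilde{e}'_{n\delta}) = e'_{n\delta} \qquad (n \neq 0).$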
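The two normal orderings (4.1), (4.2) and their exchange under $\tau$ in (4.13) can be checked on root labels. The sketch below is a finite truncation, not part of the original construction: a root $n\delta + k\alpha$ is stored as `(n, k)`, the cut-off `N` and the function names are hypothetical, and the convexity test used is the usual defining property of a normal ordering (a sum of two non-proportional roots lies between them), which is assumed here rather than quoted from the paper.

```python
def ordering_41(N):
    """Truncation of (4.1): n*delta+alpha (n < N), n*delta, then n*delta-alpha descending."""
    left = [(n, 1) for n in range(N)]
    middle = [(n, 0) for n in range(1, N + 1)]
    right = [(n, -1) for n in range(N, 0, -1)]
    return left + middle + right

def ordering_42(N):
    """Truncation of (4.2): n*delta-alpha ascending, n*delta, then n*delta+alpha descending."""
    left = [(n, -1) for n in range(1, N + 1)]
    middle = [(n, 0) for n in range(1, N + 1)]
    right = [(n, 1) for n in range(N - 1, -1, -1)]
    return left + middle + right

def tau(root):
    """Dynkin involution on labels: alpha <-> delta - alpha, delta fixed."""
    n, k = root
    return (n + k, -k)

def is_normal(order):
    """Check that a+b lies between a and b whenever it is in the list (a, b not proportional)."""
    pos = {r: i for i, r in enumerate(order)}
    for a in order:
        for b in order:
            s = (a[0] + b[0], a[1] + b[1])
            proportional = a[0] * b[1] - a[1] * b[0] == 0
            if s in pos and not proportional and pos[a] < pos[b]:
                if not (pos[a] < pos[s] < pos[b]):
                    return False
    return True

N = 5
print(is_normal(ordering_41(N)), is_normal(ordering_42(N)))
print([tau(r) for r in ordering_41(N)] == ordering_42(N))
```

With this truncation, $\tau$ maps the list (4.1) term by term onto the list (4.2), which matches the way (4.13) relates the "direct" and "inverse" generators.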
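The transformation properties with respect to the automorphisms $T_\alpha$ and $T_{\delta-\alpha}$ are not difficult to obtain with the help of (3.1), (3.2), (3.11), and they have the form

$T_\alpha(\tilde{e}_{n\delta+\alpha}) = (-1)^{(n+1)\theta(\alpha)} e_{n\delta-\alpha}, \qquad T_\alpha(\tilde{e}_{-n\delta-\alpha}) = (-1)^{n\theta(\alpha)} e_{-n\delta+\alpha},$
$T_\alpha(\tilde{e}_{n\delta-\alpha}) = (-1)^{(n-1)\theta(\alpha)} e_{n\delta+\alpha}, \qquad T_\alpha(\tilde{e}_{-n\delta+\alpha}) = (-1)^{n\theta(\alpha)} e_{-n\delta-\alpha},$  (4.14)
$T_\alpha(\tilde{e}'_{n\delta}) = (-1)^{n\theta(\alpha)} e'_{n\delta}, \qquad T_\alpha(\tilde{e}'_{-n\delta}) = (-1)^{n\theta(\alpha)} e'_{-n\delta},$

where $n > 0$, and

$T_{\delta-\alpha}(e_{k\delta+\alpha}) = (-1)^{k\theta(\alpha)} \tilde{e}_{(k+2)\delta-\alpha}, \qquad T_{\delta-\alpha}(e_{-k\delta-\alpha}) = (-1)^{(k+1)\theta(\alpha)} \tilde{e}_{-(k+2)\delta+\alpha},$
$T_{\delta-\alpha}(e_{l\delta-\alpha}) = (-1)^{l\theta(\alpha)} \tilde{e}_{(l-2)\delta+\alpha}, \qquad T_{\delta-\alpha}(e_{-l\delta+\alpha}) = (-1)^{(l-1)\theta(\alpha)} \tilde{e}_{(-l+2)\delta-\alpha},$  (4.15)
$T_{\delta-\alpha}(e'_{m\delta}) = (-1)^{m\theta(\alpha)} \tilde{e}'_{m\delta}, \qquad T_{\delta-\alpha}(e'_{-m\delta}) = (-1)^{m\theta(\alpha)} \tilde{e}'_{-m\delta}$

for $k \ge 0$, $l > 1$, $m > 0$. As a corollary of the formulas (4.13)–(4.15) we find the actions of the translation operators $T_\delta$ and $\tilde{T}_\delta$:

$T_\delta(e_{k\delta+\alpha}) = (-1)^{k\theta(\alpha)} e_{(k+1)\delta+\alpha}, \qquad T_\delta(e_{-k\delta-\alpha}) = (-1)^{(k+1)\theta(\alpha)} e_{-(k+1)\delta-\alpha},$
$T_\delta(e_{l\delta-\alpha}) = (-1)^{l\theta(\alpha)} e_{(l-1)\delta-\alpha}, \qquad T_\delta(e_{-l\delta+\alpha}) = (-1)^{(l-1)\theta(\alpha)} e_{(-l+1)\delta+\alpha},$  (4.16)
$T_\delta(e'_{m\delta}) = (-1)^{m\theta(\alpha)} e'_{m\delta}, \qquad T_\delta(e'_{-m\delta}) = (-1)^{m\theta(\alpha)} e'_{-m\delta}$

for $k \ge 0$, $l > 1$, $m > 0$, and

$\tilde{T}_\delta(\tilde{e}_{n\delta+\alpha}) = (-1)^{(n-1)\theta(\alpha)} \tilde{e}_{(n-1)\delta+\alpha}, \qquad \tilde{T}_\delta(\tilde{e}_{-n\delta-\alpha}) = (-1)^{n\theta(\alpha)} \tilde{e}_{(-n+1)\delta-\alpha},$
$\tilde{T}_\delta(\tilde{e}_{n\delta-\alpha}) = (-1)^{(n+1)\theta(\alpha)} \tilde{e}_{(n+1)\delta-\alpha}, \qquad \tilde{T}_\delta(\tilde{e}_{-n\delta+\alpha}) = (-1)^{n\theta(\alpha)} \tilde{e}_{-(n+1)\delta+\alpha},$  (4.17)
$\tilde{T}_\delta(\tilde{e}'_{n\delta}) = (-1)^{n\theta(\alpha)} \tilde{e}'_{n\delta}, \qquad \tilde{T}_\delta(\tilde{e}'_{-n\delta}) = (-1)^{n\theta(\alpha)} \tilde{e}'_{-n\delta},$

where $n > 0$ (see also (3.15) and (3.16)). Using the formulas (4.16), (4.17) we can easily find the actions of the inverse translation operators $T_\delta^{-1}$, $\tilde{T}_\delta^{-1}$ and $T_{2\delta}^{-1}$, $\tilde{T}_{2\delta}^{-1}$. These actions are not written here. From the relations (4.16), (4.17) it is clear that the operators $T_\delta^{\pm1}$ and $\tilde{T}_\delta^{\pm1}$ can be used for the construction of the Cartan–Weyl generators (4.3)–(4.6) starting from the Chevalley basis. In the case of the quantum untwisted affine algebras a similar procedure was applied in the paper [2].

Proposition 4.1. The root vectors (4.3)–(4.6) satisfy the following permutation relations:

$k_d e_{n\delta\pm\alpha} k_d^{-1} = q^{n(d,\delta)} e_{n\delta\pm\alpha}, \qquad k_d e'_{n\delta} k_d^{-1} = q^{n(d,\delta)} e'_{n\delta},$
$k_\gamma e_{n\delta\pm\alpha} k_\gamma^{-1} = q^{\pm(\gamma,\alpha)} e_{n\delta\pm\alpha}, \qquad k_\gamma e'_{n\delta} k_\gamma^{-1} = e'_{n\delta}$  (4.18)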
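for any $n \in \mathbb{Z}$ and any $\gamma \in \underline{\Delta}_+$, and also

$[e_{n\delta+\alpha}, e_{-n\delta-\alpha}] = (-1)^{n\theta(\alpha)} \frac{k_{n\delta+\alpha} - k_{n\delta+\alpha}^{-1}}{q - q^{-1}} \qquad (n \ge 0),$  (4.19)
$[e_{n\delta-\alpha}, e_{-n\delta+\alpha}] = (-1)^{(n-1)\theta(\alpha)} \frac{k_{n\delta-\alpha} - k_{n\delta-\alpha}^{-1}}{q - q^{-1}} \qquad (n > 0);$  (4.20)

$[e_{n\delta+\alpha}, e_{(n+2m-1)\delta+\alpha}]_q = (q_\alpha^2 - 1) \sum_{l=1}^{m-1} q_\alpha^{-l} e_{(n+l)\delta+\alpha} e_{(n+2m-1-l)\delta+\alpha},$  (4.21)
$[e_{n\delta+\alpha}, e_{(n+2m)\delta+\alpha}]_q = (q_\alpha - 1) q_\alpha^{-m+1} e_{(n+m)\delta+\alpha}^2 + (q_\alpha^2 - 1) \sum_{l=1}^{m-1} q_\alpha^{-l} e_{(n+l)\delta+\alpha} e_{(n+2m-l)\delta+\alpha}$  (4.22)

for any integers $n \ge 0$, $m > 0$;

$[e_{(n+2m-1)\delta-\alpha}, e_{n\delta-\alpha}]_q = -(q_\alpha^2 - 1) \sum_{l=1}^{m-1} q_\alpha^{-l} e_{(n+2m-1-l)\delta-\alpha} e_{(n+l)\delta-\alpha},$  (4.23)
$[e_{(n+2m)\delta-\alpha}, e_{n\delta-\alpha}]_q = -(q_\alpha - 1) q_\alpha^{-m+1} e_{(n+m)\delta-\alpha}^2 - (q_\alpha^2 - 1) \sum_{l=1}^{m-1} q_\alpha^{-l} e_{(n+2m-l)\delta-\alpha} e_{(n+l)\delta-\alpha}$  (4.24)

for any integers $n, m > 0$;

$[e_{-n\delta+\alpha}, e_{(n+2m-1)\delta+\alpha}] = -(-1)^{(n-1)\theta(\alpha)} (q_\alpha^2 - 1) \sum_{l=n}^{n+m-1} q_\alpha^{-l} k_{n\delta-\alpha} e_{(l-n)\delta+\alpha} e_{(n+2m-1-l)\delta+\alpha} + (q_\alpha^2 - 1) \sum_{l=1}^{n-1} (-1)^{l\theta(\alpha)} q_\alpha^{-l} k_\delta^l e_{(l-n)\delta+\alpha} e_{(n+2m-1-l)\delta+\alpha},$  (4.25)
$[e_{-n\delta+\alpha}, e_{(n+2m)\delta+\alpha}] = -(-1)^{(n-1)\theta(\alpha)} (q_\alpha^2 - 1) \sum_{l=n}^{n+m-1} q_\alpha^{-l} k_{n\delta-\alpha} e_{(l-n)\delta+\alpha} e_{(n+2m-l)\delta+\alpha} + (q_\alpha^2 - 1) \sum_{l=1}^{n-1} (-1)^{l\theta(\alpha)} q_\alpha^{-l} k_\delta^l e_{(l-n)\delta+\alpha} e_{(n+2m-l)\delta+\alpha} - (-1)^{(n-1)\theta(\alpha)} (q_\alpha - 1) q_\alpha^{-m-n+1} k_{n\delta-\alpha} e_{m\delta+\alpha}^2$  (4.26)

for any integers $n, m > 0$;

$[e_{(n+2m-1)\delta-\alpha}, e_{-n\delta-\alpha}] = (-1)^{(n+1)\theta(\alpha)} (q_\alpha^2 - 1) \sum_{l=n+1}^{n+m-1} q_\alpha^{-l} e_{(n+2m-1-l)\delta-\alpha} e_{(l-n)\delta-\alpha} k_{n\delta+\alpha}^{-1} - (q_\alpha^2 - 1) \sum_{l=1}^{n-1} (-1)^{l\theta(\alpha)} q_\alpha^{-l} e_{(n+2m-1-l)\delta-\alpha} e_{(l-n)\delta-\alpha} k_\delta^{-l},$  (4.27)
$[e_{(n+2m)\delta-\alpha}, e_{-n\delta-\alpha}] = (-1)^{(n+1)\theta(\alpha)} (q_\alpha^2 - 1) \sum_{l=n}^{n+m-1} q_\alpha^{-l} e_{(n+2m-l)\delta-\alpha} e_{(l-n)\delta-\alpha} k_{n\delta+\alpha}^{-1} - (q_\alpha^2 - 1) \sum_{l=1}^{n-1} (-1)^{l\theta(\alpha)} q_\alpha^{-l} e_{(n+2m-l)\delta-\alpha} e_{(l-n)\delta-\alpha} k_\delta^{-l} + (-1)^{(n-1)\theta(\alpha)} (q_\alpha - 1) q_\alpha^{-m-n+1} e_{m\delta-\alpha}^2 k_{n\delta+\alpha}^{-1}$  (4.28)

for any integers $n \ge 0$, $m > 0$;

$[e_{n\delta+\alpha}, e_{m\delta-\alpha}]_q = e'_{(n+m)\delta} \qquad (n \ge 0,\ m > 0),$  (4.29)
$[e_{n\delta+\alpha}, e_{-m\delta-\alpha}] = -(-1)^{(m+1)\theta(\alpha)} e'_{(n-m)\delta} k_{m\delta+\alpha}^{-1} \qquad (n > m \ge 0),$  (4.30)
$[e_{-m\delta+\alpha}, e_{n\delta-\alpha}] = -(-1)^{m\theta(\alpha)} k_{m\delta-\alpha} e'_{(n-m)\delta} \qquad (n > m > 0),$  (4.31)
$[e'_{n\delta}, e'_{m\delta}] = [e'_{-n\delta}, e'_{-m\delta}] = 0 \qquad (n > 0,\ m > 0),$  (4.32)

$[e_{n\delta+\alpha}, e'_{m\delta}] = q_\alpha^{-m+1} a\, e_{(n+m)\delta+\alpha} + (q_\alpha^2 - 1) \sum_{l=1}^{m-1} q_\alpha^{-l} e_{(n+l)\delta+\alpha} e'_{(m-l)\delta}$  (4.33)

for any integers $n \ge 0$, $m > 0$;

$[e'_{m\delta}, e_{n\delta-\alpha}] = q_\alpha^{-m+1} a\, e_{(n+m)\delta-\alpha} + (q_\alpha^2 - 1) \sum_{l=1}^{m-1} q_\alpha^{-l} e'_{(m-l)\delta} e_{(n+l)\delta-\alpha}$  (4.34)

for any integers $n, m > 0$;

$[e_{-n\delta+\alpha}, e'_{m\delta}] = -(-1)^{(n-1)\theta(\alpha)} q_\alpha^{-m+1} a\, k_{n\delta-\alpha} e_{(m-n)\delta+\alpha} - (-1)^{(n-1)\theta(\alpha)} (q_\alpha^2 - 1) k_{n\delta-\alpha} \sum_{l=n}^{m-1} q_\alpha^{-l} e_{(l-n)\delta+\alpha} e'_{(m-l)\delta} + (q_\alpha^2 - 1) \sum_{l=1}^{n-1} (-1)^{l\theta(\alpha)} q_\alpha^{-l} k_\delta^l e_{(l-n)\delta+\alpha} e'_{(m-l)\delta}$  (4.35)

for any integers $m \ge n > 0$;

$[e_{-n\delta+\alpha}, e'_{m\delta}] = (-1)^{m\theta(\alpha)} q_\alpha^{-m+1} a\, k_\delta^m e_{(m-n)\delta+\alpha} + (q_\alpha^2 - 1) \sum_{l=1}^{m-1} (-1)^{l\theta(\alpha)} q_\alpha^{-l} k_\delta^l e_{(l-n)\delta+\alpha} e'_{(m-l)\delta}$  (4.36)

for any integers $n > m > 0$;

$[e'_{m\delta}, e_{-n\delta-\alpha}] = -(-1)^{(n+1)\theta(\alpha)} q_\alpha^{-m+1} a\, e_{(m-n)\delta-\alpha} k_{n\delta+\alpha}^{-1} - (-1)^{(n+1)\theta(\alpha)} (q_\alpha^2 - 1) \sum_{l=n+1}^{m-1} q_\alpha^{-l} e'_{(m-l)\delta} e_{(l-n)\delta-\alpha} k_{n\delta+\alpha}^{-1} + (q_\alpha^2 - 1) \sum_{l=1}^{n} (-1)^{l\theta(\alpha)} q_\alpha^{-l} e'_{(m-l)\delta} e_{(l-n)\delta-\alpha} k_\delta^{-l}$  (4.37)

for any integers $m > n \ge 0$;

$[e'_{m\delta}, e_{-n\delta-\alpha}] = (-1)^{m\theta(\alpha)} q_\alpha^{-m+1} a\, e_{(m-n)\delta-\alpha} k_\delta^{-m} + (q_\alpha^2 - 1) \sum_{l=1}^{m-1} (-1)^{l\theta(\alpha)} q_\alpha^{-l} e'_{(m-l)\delta} e_{(l-n)\delta-\alpha} k_\delta^{-l}$  (4.38)

for any integers $n \ge m > 0$. Here in the relations (4.21)–(4.38) and in what follows $q_\alpha := (-1)^{\theta(\alpha)} q^{(\alpha,\alpha)}$.

Outline of proof. First of all, the formulas (4.18) are trivial. The relations (4.19) and (4.20) are obtained by application of the translation operators $T_\delta^n$ and $T_\delta^{-n}$ to the relations (2.3). Further, in terms of the generators (4.3)–(4.6) the relation (2.5) means that $[e_\alpha, e_{\delta+\alpha}]_q = 0$. Applying to it the operator $T_\delta^n$, we obtain the relation (4.21) for $m = 1$. The formulas (4.21) and (4.22) are then proved for arbitrary $m > 1$ by induction. If we apply the operator $T_\delta^{-k}$ to the relations (4.21) and (4.22) for $n = 0$, then in the case $k < m$ we obtain the relations (4.25) and (4.26), in the case $m < k < 2m$ we obtain the relations which follow from (4.27) and (4.28) by the conjugation “$*$”, and finally for $k > 2m$ we get the relations which follow from (4.23) and (4.24) by the conjugation “$*$”. Further, the relation (4.29) for $n = 0$ is trivial (see (4.6)). Applying the operators $T_\delta^n$ to (4.29) with $n = 0$, we can obtain the relation (4.29) for any $n > 0$ and $m > 0$, as well as the relation (4.30). The relation (4.31) can be obtained from (4.29) by repeated application of the operator $T_\delta^{-1}$. The relations (4.33) in the case $n = 0$ and (4.34) in the case $n = 1$ are proved by direct verification with the help of the previous results. Repeated application of the operators $T_\delta^{\pm1}$ to these relations provides the general case $n, m > 0$. The relation (4.32) is proved by direct verification with the help of the relations (4.33) and (4.34). Finally, the relations (4.35)–(4.38) can be obtained from (4.33) and (4.34) by repeated application of the operator $T_\delta^{-1}$.

The imaginary root vectors $e'_{n\delta}$ do not satisfy relations of the type (4.19), and therefore we introduce new imaginary root vectors $e_{\pm n\delta}$ by the following (Schur) relations: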
$e'_{n\delta} = \sum_{p_1 + 2p_2 + \ldots + np_n = n} \frac{\bigl((-1)^{\theta(\alpha)} (q - q^{-1})\bigr)^{\sum p_i - 1}}{p_1! \cdots p_n!}\, e_\delta^{p_1} \cdots e_{n\delta}^{p_n}.$  (4.39)

In terms of the generating functions

$E'(u) = (-1)^{\theta(\alpha)} (q - q^{-1}) \sum_{n \ge 1} e'_{n\delta} u^{-n},$  (4.40)
$E(u) = (-1)^{\theta(\alpha)} (q - q^{-1}) \sum_{n \ge 1} e_{n\delta} u^{-n}$  (4.41)

the relation (4.39) may be rewritten in the form

$E'(u) = -1 + \exp E(u)$  (4.42)

or

$E(u) = \ln(1 + E'(u)).$  (4.43)

This provides a formula inverse to (4.39),

$e_{n\delta} = \sum_{p_1 + 2p_2 + \ldots + np_n = n} \frac{\bigl((-1)^{\theta(\alpha)} (q^{-1} - q)\bigr)^{\sum p_i - 1} \bigl(\sum_{i=1}^{n} p_i - 1\bigr)!}{p_1! \cdots p_n!}\, (e'_\delta)^{p_1} \cdots (e'_{n\delta})^{p_n}.$  (4.44)

The new root vectors corresponding to negative roots are obtained by the Cartan conjugation ($*$):

$e_{-n\delta} = (e_{n\delta})^*.$  (4.45)
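That (4.39) and (4.44) are mutually inverse can be checked numerically. The sketch below is only an illustration: since the imaginary root vectors commute with each other by (4.32), they are replaced by arbitrary rational numbers, and the constant `c` plays the role of $(-1)^{\theta(\alpha)}(q - q^{-1})$. The function names and the sample values are hypothetical.

```python
from fractions import Fraction
from math import factorial

def multiplicity_vectors(n):
    """All (p_1, ..., p_n) of non-negative integers with p_1 + 2 p_2 + ... + n p_n = n."""
    def rec(i, remaining):
        if i > n:
            if remaining == 0:
                yield ()
            return
        for p in range(remaining // i + 1):
            for rest in rec(i + 1, remaining - i * p):
                yield (p,) + rest
    return rec(1, n)

def schur_forward(e, c):
    """Formula (4.39): primed values e'_n from unprimed ones (e[0] is e_1)."""
    out = []
    for n in range(1, len(e) + 1):
        total = Fraction(0)
        for p in multiplicity_vectors(n):
            term = c ** (sum(p) - 1)
            for i, p_i in enumerate(p):
                term = term * e[i] ** p_i / factorial(p_i)
            total += term
        out.append(total)
    return out

def schur_inverse(e_primed, c):
    """Formula (4.44): unprimed values e_n from primed ones."""
    out = []
    for n in range(1, len(e_primed) + 1):
        total = Fraction(0)
        for p in multiplicity_vectors(n):
            k = sum(p)
            term = (-c) ** (k - 1) * factorial(k - 1)
            for i, p_i in enumerate(p):
                term = term * e_primed[i] ** p_i / factorial(p_i)
            total += term
        out.append(total)
    return out

q = Fraction(3, 2)
theta = 1
c = (-1) ** theta * (q - 1 / q)
e = [Fraction(1), Fraction(2, 3), Fraction(-1, 5), Fraction(4)]
e_primed = schur_forward(e, c)
print(schur_inverse(e_primed, c) == e)
```

For $n = 2$ the forward formula reduces to $e'_{2\delta} = e_{2\delta} + \tfrac{c}{2} e_\delta^2$, which is a convenient hand check of the coefficient in (4.39).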
Proposition 4.2. The new root vectors $e_{\pm n\delta}$ satisfy the following commutation relations:

$[e_{n\delta+\alpha}, e_{m\delta}] = (-1)^{(m-1)\theta(\alpha)} a(m)\, e_{(n+m)\delta+\alpha} \qquad (n \ge 0,\ m > 0),$  (4.46)
$[e_{m\delta}, e_{n\delta-\alpha}] = (-1)^{(m-1)\theta(\alpha)} a(m)\, e_{(n+m)\delta-\alpha} \qquad (n, m > 0),$  (4.47)
$[e_{-n\delta+\alpha}, e_{m\delta}] = -(-1)^{(n+m)\theta(\alpha)} a(m)\, k_{n\delta-\alpha} e_{(m-n)\delta+\alpha} \qquad (m \ge n > 0),$  (4.48)
$[e_{-n\delta+\alpha}, e_{m\delta}] = (-1)^{\theta(\alpha)} a(m)\, k_\delta^m e_{(m-n)\delta+\alpha} \qquad (n > m > 0),$  (4.49)
$[e_{m\delta}, e_{-n\delta-\alpha}] = -(-1)^{(n+m)\theta(\alpha)} a(m)\, e_{(m-n)\delta-\alpha} k_{n\delta+\alpha}^{-1} \qquad (m > n \ge 0),$  (4.50)
$[e_{m\delta}, e_{-n\delta-\alpha}] = (-1)^{\theta(\alpha)} a(m)\, e_{(m-n)\delta-\alpha} k_\delta^{-m} \qquad (n \ge m > 0),$  (4.51)
$[e_{n\delta}, e_{-m\delta}] = \delta_{nm}\, a(m)\, \frac{k_\delta^m - k_\delta^{-m}}{q - q^{-1}} \qquad (n, m > 0),$  (4.52)

where

$a(m) := \frac{q^{m(\alpha,\alpha)} - q^{-m(\alpha,\alpha)}}{m(q - q^{-1})}.$  (4.53)

This can be proved by direct calculation, applying the relations of Proposition 4.1 and the actions of the translation operators $T_\delta^{\pm1}$.

All the relations of Propositions 4.1 and 4.2, together with the ones obtained from them by the conjugation, describe a complete list of the permutation relations of the Cartan–Weyl bases corresponding to the “direct” normal ordering (4.1). Applying the Dynkin involution $\tau$ to these relations, it is easy to obtain the corresponding results for the “inverse” normal ordering (4.2).

5. Extremal Projector for $U_q(A_1^{(1)})$ and $U_q(C(2)^{(2)})$
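A general formula for the extremal projector for quantized contragredient Lie (super)algebras of finite growth was presented in Refs. [17, 10, 11]. Here we specialize this result to our case $U_q(g)$, where $g = A_1^{(1)}, C(2)^{(2)}$. By definition, the extremal projector for $U_q(g)$ is a nonzero element $p := p(U_q(g))$ of the Taylor extension $T_q(g)$ of $U_q(g)$ (see Refs. [17, 10, 11]), satisfying the equations

$e_\alpha p = p\, e_{-\alpha} = 0, \qquad e_{\delta-\alpha} p = p\, e_{-\delta+\alpha} = 0, \qquad p^2 = p.$  (5.1)

The explicit expression of the extremal projector $p$ for our case $U_q(g)$ can be presented as follows:

$p = p_+ p_0 p_-,$  (5.2)

where the factors $p_+$, $p_0$ and $p_-$ have the following form:

$p_+ = \prod_{n \ge 0}^{\rightarrow} p_{n\delta+\alpha}, \qquad p_0 = \prod_{n \ge 1} p_{n\delta}, \qquad p_- = \prod_{n \ge 1}^{\leftarrow} p_{n\delta-\alpha}.$  (5.3)

The elements $p_\gamma$ are given by the formulas

$p_{n\delta+\alpha} = \sum_{m=0}^{\infty} \frac{(-1)^m}{(m)_{\bar{q}_\alpha}!}\, \varphi^+_{n,m}\, e_{-n\delta-\alpha}^m e_{n\delta+\alpha}^m,$  (5.4)
$p_{n\delta} = \sum_{m=0}^{\infty} \frac{(-1)^m}{m!}\, \varphi^0_{n,m}\, e_{-n\delta}^m e_{n\delta}^m,$  (5.5)
$p_{n\delta-\alpha} = \sum_{m=0}^{\infty} \frac{(-1)^m}{(m)_{\bar{q}_\alpha}!}\, \varphi^-_{n,m}\, e_{-n\delta+\alpha}^m e_{n\delta-\alpha}^m,$  (5.6)

where the coefficients $\varphi^+_{n,m}$, $\varphi^0_{n,m}$ and $\varphi^-_{n,m}$ are determined as follows:

$\varphi^+_{n,m} = \frac{(-1)^{mn\theta(\alpha)} (q - q^{-1})^m q^{-m(\frac{m-1}{4} + n)(\alpha,\alpha)}}{\prod_{r=1}^{m} \bigl( k_{n\delta+\alpha} q^{(n + \frac{1}{2} + \frac{r}{2})(\alpha,\alpha)} - (-1)^{(r-1)\theta(\alpha)} k_{n\delta+\alpha}^{-1} q^{-(n + \frac{1}{2} + \frac{r}{2})(\alpha,\alpha)} \bigr)},$  (5.7)
$\varphi^0_{n,m} = \frac{n^m (q - q^{-1})^{n+m} q^{-mn(\alpha,\alpha)}}{(q^{n(\alpha,\alpha)} - q^{-n(\alpha,\alpha)})^m (k_\delta^n q^{n(\alpha,\alpha)} - k_\delta^{-n} q^{-n(\alpha,\alpha)})^m},$  (5.8)
$\varphi^-_{n,m} = \frac{(-1)^{m(n-1)\theta(\alpha)} (q - q^{-1})^m q^{-m(\frac{m-5}{4} + n)(\alpha,\alpha)}}{\prod_{r=1}^{m} \bigl( k_{n\delta-\alpha} q^{(n - \frac{1}{2} + \frac{r}{2})(\alpha,\alpha)} - (-1)^{(r-1)\theta(\alpha)} k_{n\delta-\alpha}^{-1} q^{-(n - \frac{1}{2} + \frac{r}{2})(\alpha,\alpha)} \bigr)}.$  (5.9)

Here in the relations (5.4), (5.6) and in what follows we use the notation $\bar{q}_\alpha := (-1)^{\theta(\alpha)} q^{-(\alpha,\alpha)}$, and the symbol $(m)_{\bar{q}_\alpha}$ is defined by the formula (6.4). Acting by the extremal projector $p$ on any highest weight $U_q(g)$-module $M$, we obtain the space $M^0 = pM$ of highest weight vectors of $M$, provided $pM$ has no singularities. A concrete example of the application of the extremal projector in the case of the quantum algebra $U_q(gl(n, \mathbb{C}))$ can be found in Ref. [17].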
6. Universal R-Matrix for $U_q(A_1^{(1)})$ and $U_q(C(2)^{(2)})$

Any quantum (super)algebra $U_q(g)$ is a non-cocommutative Hopf (super)algebra which has an intertwining operator called the universal R-matrix. By definition [5], the universal R-matrix for the Hopf (super)algebra $U_q(g)$ is an invertible element $R$ of the Taylor extension $T_q(g) \otimes T_q(g)$ of $U_q(g) \otimes U_q(g)$ (see Refs. [11–13]), satisfying the equations

$\tilde{\Delta}_q(a) = R \Delta_q(a) R^{-1}, \qquad \forall a \in U_q(g),$  (6.1)
$(\Delta_q \otimes \mathrm{id}) R = R^{13} R^{23}, \qquad (\mathrm{id} \otimes \Delta_q) R = R^{13} R^{12},$  (6.2)

where $\tilde{\Delta}_q$ is the opposite comultiplication: $\tilde{\Delta}_q = \sigma \Delta_q$, $\sigma(a \otimes b) = (-1)^{\deg a \deg b}\, b \otimes a$ for all homogeneous elements $a, b \in U_q(g)$. In the relation (6.2) we use the standard notations $R^{12} = \sum a_i \otimes b_i \otimes \mathrm{id}$, $R^{13} = \sum a_i \otimes \mathrm{id} \otimes b_i$, $R^{23} = \sum \mathrm{id} \otimes a_i \otimes b_i$ if $R$ has the form $R = \sum a_i \otimes b_i$. We employ the following standard notation for the q-exponential:

$\exp_q(x) := 1 + x + \frac{x^2}{(2)_q!} + \ldots + \frac{x^n}{(n)_q!} + \ldots = \sum_{n \ge 0} \frac{x^n}{(n)_q!},$  (6.3)

where

$(n)_q := \frac{q^n - 1}{q - 1}.$  (6.4)
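The q-exponential (6.3), (6.4) is easy to explore numerically. The following sketch is an illustration with hypothetical function names and an arbitrary truncation length; it evaluates the series for a scalar argument and checks two standard facts: the limit $q \to 1$ gives the ordinary exponential, and $\exp_q(x) \exp_{q^{-1}}(-x) = 1$.

```python
import math

def q_number(n, q):
    """(n)_q = (q^n - 1) / (q - 1), as in (6.4); requires q != 1."""
    return (q ** n - 1) / (q - 1)

def q_factorial(n, q):
    """(n)_q! = (1)_q (2)_q ... (n)_q, with (0)_q! = 1."""
    result = 1.0
    for j in range(1, n + 1):
        result *= q_number(j, q)
    return result

def exp_q(x, q, terms=60):
    """Truncated series (6.3) for a scalar argument x."""
    return sum(x ** n / q_factorial(n, q) for n in range(terms))

x = 0.3
print(exp_q(x, 0.5) * exp_q(-x, 2.0))
print(exp_q(0.7, 1.000001), math.exp(0.7))
```

The truncation is adequate only where the series converges quickly (here $|x| < 1/(1-q)$ for $q = 0.5$); for operator arguments such as those in (6.8) the series is to be understood formally.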
A general formula for the universal R-matrix $R$ for quantized contragredient Lie (super)algebras was presented in Refs. [11–13]. Here we specialize this result to our case $U_q(g)$, where $g = A_1^{(1)}, C(2)^{(2)}$. The explicit expression of the universal R-matrix $R$ for our case $U_q(g)$ can be presented as follows:

$R = R_+ R_0 R_- K.$  (6.5)

Here the factors $K$ and $R_\pm$ have the following form:

$K = q^{\frac{1}{(\alpha,\alpha)} h_\alpha \otimes h_\alpha + h_\delta \otimes h_d + h_d \otimes h_\delta},$  (6.6)
$R_+ = \prod_{n \ge 0}^{\rightarrow} R_{n\delta+\alpha}, \qquad R_- = \prod_{n \ge 1}^{\leftarrow} R_{n\delta-\alpha}.$  (6.7)

The elements $R_\gamma$ are given by the formula

$R_\gamma = \exp_{\bar{q}_\gamma}\bigl( A(\gamma) (q - q^{-1}) (e_\gamma \otimes e_{-\gamma}) \bigr),$  (6.8)

where

$A(\gamma) = (-1)^{n\theta(\alpha)}$ if $\gamma = n\delta + \alpha$, and $A(\gamma) = (-1)^{(n-1)\theta(\alpha)}$ if $\gamma = n\delta - \alpha$.  (6.9)

Finally, the factor $R_0$ is defined as follows:

$R_0 = \exp\Bigl( (q - q^{-1}) \sum_{n > 0} d(n)\, e_{n\delta} \otimes e_{-n\delta} \Bigr),$  (6.10)

where $d(n)$ is the inverse of $a(n)$, i.e.

$d(n) = \frac{n(q - q^{-1})}{q^{n(\alpha,\alpha)} - q^{-n(\alpha,\alpha)}}.$  (6.11)
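7. The “New Realization”

Let us denote by $d$ the Cartan element $h_d$ and by $c$ the Cartan element $h_\delta$, emphasizing that $d$ defines the homogeneous gradation of the algebra and $k_\delta = q^{h_\delta}$ is the central element. It will be convenient in the following to add its square roots $q^{\pm c/2} = k_\delta^{\pm 1/2}$. Let us introduce the new notations: $e_n := e_{n\delta+\alpha}$ ($n \ge 0$), $e_{-n} := -(-1)^{(n-1)\theta(\alpha)} k_{-n\delta+\alpha} e_{-n\delta+\alpha}$ ($n > 0$), and $f_n := -e_{n\delta-\alpha} k_{n\delta-\alpha}$ ($n > 0$), $f_{-n} := (-1)^{(n+1)\theta(\alpha)} e_{-n\delta-\alpha}$ ($n \ge 0$). We also put $a_n := e_{n\delta}\, q^{nc/2}$ ($n \ge 1$), and $a_{-n} := (-1)^{n\theta(\alpha)} e_{-n\delta}\, q^{-nc/2}$ ($n \ge 1$). We collect the elements $e_n$, $f_n$ ($n \in \mathbb{Z}$) and $a_{\pm n}$ ($n \ge 1$) into the generating functions (“fields”)

$e(z) = \sum_{n \in \mathbb{Z}} e_n z^{-n}, \qquad \psi_+(z) = k_\alpha^{-1} \exp\Bigl( (-1)^{\theta(\alpha)} (q - q^{-1}) \sum_{n=1}^{\infty} a_n z^{-n} \Bigr),$
$f(z) = \sum_{n \in \mathbb{Z}} f_n z^{-n}, \qquad \psi_-(z) = k_\alpha \exp\Bigl( (-1)^{\theta(\alpha)} (q^{-1} - q) \sum_{n=1}^{\infty} a_{-n} z^{n} \Bigr),$  (7.1)

such that

$\deg e(z) = \deg f(z) = \theta(\alpha), \qquad \deg \psi_\pm(z) = 0.$  (7.2)

These fields satisfy the following conjugation conditions with respect to the graded conjugation “$\ddagger$”:

$(e(z))^\ddagger = f(z^{-1}), \qquad (f(z))^\ddagger = (-1)^{\theta(\alpha)} e(z^{-1}),$
$(\psi_+(z))^\ddagger = \psi_-(z^{-1}), \qquad (\psi_-(z))^\ddagger = \psi_+(z^{-1}),$  (7.3)

and have the following symmetry with respect to the translation operator $T_\delta$:

$T_\delta(e(z)) = (-1)^{\theta(\alpha)} z\, e((-1)^{\theta(\alpha)} z), \qquad T_\delta(f(z)) = (-1)^{\theta(\alpha)} z^{-1} f((-1)^{\theta(\alpha)} z),$
$T_\delta(\psi_+(z)) = q^{-c} \psi_+((-1)^{\theta(\alpha)} z), \qquad T_\delta(\psi_-(z)) = q^{c} \psi_-((-1)^{\theta(\alpha)} z).$  (7.4)

Proposition 7.1. In terms of the fields (7.1) the relations of Sect. 4 can be rewritten in the following compact form:

$[q^c, \text{everything}] = 0, \qquad u^d \varphi(v) u^{-d} = \varphi(uv),$  (7.5)

where $\varphi(v) = e(v), f(v), \psi_\pm(v)$, and also

$\psi_\pm(u) \psi_\pm(v) = \psi_\pm(v) \psi_\pm(u),$  (7.6)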
556
S. M. Khoroshkin, J. Lukierski, V. N. Tolstoy
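$$(u - \bar q_\alpha v)\, e(u)e(v) = (\bar q_\alpha u - v)\, e(v)e(u), \tag{7.7}$$
$$(u - q_\alpha v)\, f(u)f(v) = (q_\alpha u - v)\, f(v)f(u), \tag{7.8}$$
$$\psi_\pm(u)\, e(v)\, \bigl(\psi_\pm(u)\bigr)^{-1} = (-1)^{\theta(\alpha)}\, \frac{\bar q_\alpha q^{\mp c/2} u - v}{q^{\mp c/2} u - \bar q_\alpha v}\; e(v), \tag{7.9}$$
$$\psi_\pm(u)\, f(v)\, \bigl(\psi_\pm(u)\bigr)^{-1} = (-1)^{\theta(\alpha)}\, \frac{q_\alpha q^{\pm c/2} u - v}{q^{\pm c/2} u - q_\alpha v}\; f(v), \tag{7.10}$$
$$\bigl(\psi_+(u)\bigr)^{-1} \psi_-(v)\, \psi_+(u)\, \bigl(\psi_-(v)\bigr)^{-1} = \frac{(q^{c} u - q_\alpha v)(q^{-c} u - \bar q_\alpha v)}{(q^{c} v - \bar q_\alpha u)(q^{-c} v - q_\alpha u)}, \tag{7.11}$$
$$[e(u), f(v)] = \frac{1}{q - q^{-1}} \Bigl( \delta\bigl(\tfrac{u}{v} q^{-c}\bigr)\, \psi_-(v q^{c/2}) - \delta\bigl(\tfrac{u}{v} q^{c}\bigr)\, \psi_+(u q^{c/2}) \Bigr). \tag{7.12}$$

Here in (7.12) $\delta(z) = \sum_{n\in\mathbb{Z}} z^{n}$, and the brackets $[\cdot,\cdot]$ in the same relation denote the supercommutator:

$$[e(u), f(v)] = e(u)f(v) - (-1)^{\theta(\alpha)} f(v)e(u). \tag{7.13}$$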
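As an illustration of how the formal delta function $\delta(z) = \sum_{n\in\mathbb{Z}} z^n$ in the $[e(u), f(v)]$ relation lets one trade the argument $u$ for $v$, the following minimal sketch checks the identity $u^k\,\delta(u/v) = v^k\,\delta(u/v)$ on a truncated series. The function names and the truncation parameter `n_max` are hypothetical and chosen only for this example; the comparison is restricted to exponents away from the truncation edge.

```python
from collections import Counter


def delta_uv(n_max):
    """Truncation of delta(u/v) = sum_n u^n v^(-n) to |n| <= n_max.

    Keys are exponent pairs (power of u, power of v).
    """
    return Counter({(n, -n): 1 for n in range(-n_max, n_max + 1)})


def shift(series, du, dv):
    """Multiply a two-variable Laurent series by u^du * v^dv."""
    return Counter({(i + du, j + dv): c for (i, j), c in series.items()})


def window(series, bound):
    """Keep only the terms whose u-exponent lies in [-bound, bound]."""
    return {key: c for key, c in series.items() if abs(key[0]) <= bound}


def delta_moves_argument(k, n_max=20):
    """Check u^k * delta(u/v) == v^k * delta(u/v) away from the truncation edge."""
    d = delta_uv(n_max)
    bound = n_max - abs(k)
    return window(shift(d, k, 0), bound) == window(shift(d, 0, k), bound)
```

By linearity the same holds for any Laurent polynomial in place of $u^k$, which is the mechanism that allows a field multiplying a delta function to be evaluated at either argument.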
The given description is called the “new realization”, or the current realization of the quantum affine superalgebras $U_q(A_1^{(1)})$ and $U_q(C(2)^{(2)})$. It should be noted that the relations (7.1) and (7.6)–(7.12) differ from the corresponding relations of Refs. [3, 5, 20] by replacement of $q$ by $q^{-1}$. The current realization possesses its own graded comultiplication structure, different from (2.12):

$$\Delta_q^{(D)}(c) = c \otimes 1 + 1 \otimes c, \qquad \Delta_q^{(D)}(d) = d \otimes 1 + 1 \otimes d,$$
$$\Delta_q^{(D)}(\psi_\pm(z)) = \psi_\pm(z q^{\pm c_2/2}) \otimes \psi_\pm(z q^{\mp c_1/2}), \tag{7.14}$$
$$\Delta_q^{(D)}(e(z)) = e(z) \otimes 1 + \psi_-(z q^{c_1/2}) \otimes e(z q^{c_1}),$$
$$\Delta_q^{(D)}(f(z)) = f(z q^{c_2}) \otimes \psi_+(z q^{c_2/2}) + 1 \otimes f(z),$$

$$S_q^{(D)}(c) = -c, \qquad S_q^{(D)}(d) = -d, \qquad S_q^{(D)}(\psi_\pm(z)) = \bigl(\psi_\pm(z)\bigr)^{-1},$$
$$S_q^{(D)}(e(z)) = -\bigl(\psi_-(z q^{-c/2})\bigr)^{-1} e(z q^{-c}), \tag{7.15}$$
$$S_q^{(D)}(f(z)) = -f(z q^{-c}) \bigl(\psi_+(z q^{-c/2})\bigr)^{-1},$$

$$\varepsilon(c) = \varepsilon(d) = \varepsilon(e(z)) = \varepsilon(f(z)) = 0, \qquad \varepsilon(\psi_\pm(z)) = 1. \tag{7.16}$$
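Here $\Delta_q^{(D)}$, $S_q^{(D)}$, and $\varepsilon$ are the comultiplication, antipode and counit respectively. The two comultiplications $\Delta_q$ and $\Delta_q^{(D)}$ are related by the twist [13]:

$$\Delta_q^{(D)}(x) = F^{-1} \Delta_q(x) F, \tag{7.17}$$

where $F = R_+^{21}$, with $R_+$ given by (6.7)–(6.9), such that the universal R-matrix for the comultiplication $\Delta_q^{(D)}$ equals

$$R^{(D)} = R_0 R_- K R_+^{21} \tag{7.18}$$

with the factors from (6.5). In the generators $e_n$, $f_n$ and $a_n$ it can be rewritten as follows

$$R^{(D)} = K \bar R, \tag{7.19}$$

where

$$K = q^{\frac{h_\alpha \otimes h_\alpha}{(\alpha,\alpha)}}\; q^{\frac12 (c\otimes d + d\otimes c)} \exp\Bigl((q - q^{-1}) \sum_{n=1}^{\infty} d(n)\, a_n \otimes a_{-n}\Bigr)\, q^{\frac12 (c\otimes d + d\otimes c)}, \tag{7.20}$$

$$\bar R = \prod_{n\in\mathbb{Z}}^{\rightarrow} \exp_{\bar q_\alpha}\bigl((q^{-1} - q)\, f_{-n} \otimes e_n\bigr), \tag{7.21}$$

and

$$d(n) = \frac{n(q - q^{-1})}{q_\alpha^{n} - q_\alpha^{-n}}. \tag{7.22}$$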
¯ in the completed algebras It is possible to give another presentation of the element R (1) (2) ¯ U (g), where g is either A1 or C(2) [3, 4]. The completion is done with respect to open neighborhoods of zero U¯ r = s>r Us , where Us consists of all the elements from U (g) of degree s. The completed algebra acts on (infinite-dimensional) representations of highest weight and admits the series over monomials xi1 xi2 · · · xin , i1 ≤ i2 . . . ≤ in , with x = e, f, a and fixed ik . The matrix coefficients of the products of the currents e(z1 )e(z2 ) · · · e(zn ) and f (z1 )f (z2 ) · · · f (zn ), defined originally as formal series, converge to meromorphic in Cn functions with the poles at zi = 0 and zi = qα∓1 zj , i ≤ j. Let t (z) = (q −q −1 )f (z)⊗e(z). As before, we the product t (z1 ) · · · t (zn ) n understand as an operator-valued meromorphic function in C∗ with simple poles at zi = qα∓1 zj , i = j . Define 1 dz1 dzn ¯ = 1+ R · · · ··· t (z1 ) · · · t (zn ), (7.23) n n!(2π i) z1 zn n>0
Dn
and the integration region $D_n$ is defined as $D_n = \{|z_i| = 1,\ i = 1, \dots, n\}$ for $|q| < 1$ and, more generally, by

$$D_n = \Bigl\{\, \Bigl| z_i \prod_{j=1,\dots,n,\ j\ne i} (z_i - q_\alpha z_j) \Bigr| = 1,\ i = 1, \dots, n \Bigr\} \tag{7.24}$$

for any $q$, such that $q_\alpha^{N} \ne 1$, $N \in \mathbb{Z}\setminus\{0\}$.
Proposition 7.2. The action of the tensor $R = K\bar R$ in the tensor product of highest weight modules is well defined and coincides with the action of the universal R-matrix (7.19).

The integrals in (7.23) can be computed explicitly. Let us put by induction

$$t^{(n)}(z) = - \operatorname*{Res}_{z_1 = z \bar q_\alpha^{\,2n-2}}\; t(z_1)\, t^{(n-1)}(z)\, \frac{dz_1}{z_1}, \tag{7.25}$$

where $t^{(1)}(z) = t(z)$. In the components the fields $t^{(n)}(z)$ look as follows:

$$t^{(n)}(z) = C_n\, e^{(n)}(z) \otimes f^{(n)}(z), \tag{7.26}$$
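where

$$C_n = (-1)^{(n-1)\theta(\alpha)} (q - q^{-1})^{n}\, \tilde q_\alpha^{\,-\frac{n(n-1)}{2}}\, (\tilde q_\alpha - 1)^{n-1}\, (n-1)_{\tilde q_\alpha}!\; (n)_{\tilde q_\alpha}!,$$
$$e^{(n)}(z) = e(z)\, e(\bar q_\alpha z)\, e(\bar q_\alpha^{2} z) \cdots e(\bar q_\alpha^{\,n-2} z)\, e(\bar q_\alpha^{\,n-1} z), \tag{7.27}$$
$$f^{(n)}(z) = f(\bar q_\alpha^{\,n-1} z)\, f(\bar q_\alpha^{\,n-2} z) \cdots f(\bar q_\alpha^{2} z)\, f(\bar q_\alpha z)\, f(z),$$

such that

$$e^{(n)}(z) = \sum_{m\in\mathbb{Z}} \bigl(z \bar q_\alpha^{\,n}\bigr)^{m} \sum_{\substack{\lambda_1 \ge \dots \ge \lambda_n \\ \lambda_1 + \dots + \lambda_n = m}} \frac{q_\alpha^{\lambda_1 + 2\lambda_2 + \dots + n\lambda_n}}{\prod_{j\in\mathbb{Z}} (\lambda'_j - \lambda'_{j+1})_{\bar q_\alpha}!}\; e_{\lambda_n} e_{\lambda_{n-1}} \cdots e_{\lambda_1}, \tag{7.28}$$
$$f^{(n)}(z) = \sum_{m\in\mathbb{Z}} \bigl(z q_\alpha\bigr)^{m} \sum_{\substack{\lambda_1 \ge \dots \ge \lambda_n \\ \lambda_1 + \dots + \lambda_n = m}} \frac{q_\alpha^{\lambda_1 + 2\lambda_2 + \dots + n\lambda_n}}{\prod_{j\in\mathbb{Z}} (\lambda'_j - \lambda'_{j+1})_{q_\alpha}!}\; f_{\lambda_n} f_{\lambda_{n-1}} \cdots f_{\lambda_1}.$$

Here $\lambda'_j = \#\{k : \lambda_k \ge j\}$, and $\tilde q_\alpha := q^{(\alpha,\alpha)}$. The product in the denominator is finite, since there are only finitely many distinct $\lambda'_j$ for a given choice of $\lambda_k$. Then, repeating the calculations in [4], we get a vertex type presentation of the element $\bar R$:

$$\bar R = \exp\Bigl(\sum_{n>0} \frac{1}{n}\, I_n\Bigr), \tag{7.29}$$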
where the operators

$$I_n = \oint \frac{t^{(n)}(z)\, dz}{2\pi i\, z} \tag{7.30}$$

commute among themselves:

$$[I_n, I_m] = 0, \qquad n, m > 0.$$

The vertex operator presentation (7.29) is convenient for applications to integrable representations: it is expressed through integrals over the fields, whose number is precisely $k$ for level $k$ integrable representations.
8. Final Remarks

The aim of this paper is to describe in a unified way and in detail the q-deformed untwisted affine algebra $U_q(\widehat{sl}(2)) = U_q(A_1^{(1)})$ and the twisted superalgebra $U_q(osp(2|2)^{(2)}) = U_q(C(2)^{(2)})$. In order to describe the complete list of quantum affine (super)algebras of rank 2 one should consider the following three quantum affine (super)algebras: $U_q(sl(1|3)^{(4)}) = U_q(A(0,2)^{(4)})$, $U_q(sl(3)^{(2)}) = U_q(A_2^{(2)})$ and $U_q(\widehat{osp}(1|2)) = U_q(B(0,1)^{(1)})$. The Dynkin diagram of the superalgebra $A(0,2)^{(4)}$ has the same geometric structure as the (super)algebras $A_1^{(1)}$ and $C(2)^{(2)}$, but in this case the root $\alpha$ is even and $\delta - \alpha$ is odd, and the sector of imaginary roots has odd roots. Therefore in the case of the quantum superalgebra $A(0,2)^{(4)}$ the relations of the type (4.29)–(4.38) are more complicated and they require special consideration. The second family of two quantum affine (super)algebras, $U_q(A_2^{(2)})$ and $U_q(B(0,1)^{(1)})$, is described by the same Dynkin diagram with different colors of roots. Preliminary results in this direction are given in [14], where in particular the Cartan–Weyl basis of the basic affine superalgebra $U_q(B(0,1)^{(1)}) \equiv U_q(\widehat{osp}(1|2))$ is considered. The unified description of the two affine (super)algebras mentioned above, analogous to the one given in the present paper, is in preparation.

Acknowledgements. This work was supported (S. M. Khoroshkin, V. N. Tolstoy) by the Russian Foundation for Fundamental Research, grant No. 98-01-00303, by the program of French-Russian scientific cooperation (CNRS grant PICS-608 and grant RFBR-98-01-22033), as well as by KBN grant 5P03B05620 (J. Lukierski) and INTAS-99-1705 (S. Khoroshkin).
References

1. Asherova, R.M., Smirnov, Yu.F., and Tolstoy, V.N.: A description of some class of projection operators for semisimple complex Lie algebras. (Russian) Matem. Zametki 26, 15–25 (1979)
2. Beck, J.: Braid group actions and quantum affine algebras. Commun. Math. Phys. 165, 555–568 (1994)
3. Ding, J., and Khoroshkin, S.: Weyl group extension of quantized current algebras. Transformation Groups 5, 35–59 (2000); math.QA/9804139
4. Ding, J., Khoroshkin, S., and Pakuliak, S.: Integral representations for the universal $\bar R$-matrix. ITF preprint, ITEP-TH-67/99, Moscow, 1999; http://wwwth.itep.ru/mathphys/psfiles/99_67.ps
5. Drinfeld, V.G.: Quantum groups. Proc. ICM-86 (Berkeley USA) Vol. 1, Providence, RI: Am. Math. Soc., 1987, pp. 798–820
6. Drinfeld, V.G.: A new realization of Yangians and quantized affine algebras. Soviet Math. Dokl. 36, 212–216 (1988)
7. Kac, V.G.: Lie superalgebras. Adv. Math. 26, 8–96 (1977)
8. Kac, V.G.: Infinite dimensional Lie algebras. Cambridge: Cambridge University Press, 1990
9. Khoroshkin, S.M., and Tolstoy, V.N.: Universal R-matrix for quantized (super)algebras. Commun. Math. Phys. 141, no. 3, 599–617 (1991)
10. Khoroshkin, S.M., and Tolstoy, V.N.: Extremal projector and universal R-matrix for quantum contragredient Lie (super)algebras. In: Quantum groups and related topics (Wrocław, 1991), Math. Phys. Stud. 13, Dordrecht: Kluwer Acad. Publ., 1992, pp. 23–32
11. Khoroshkin, S.M., and Tolstoy, V.N.: The uniqueness theorem for the universal R-matrix. Lett. Math. Phys. 24, no. 3, 231–244 (1992)
12. Khoroshkin, S.M., and Tolstoy, V.N.: The Cartan–Weyl basis and the universal R-matrix for quantum Kac-Moody algebras and superalgebras. In: Quantum Symmetries (Clausthal 1991). River Edge, NJ: World Sci. Publishing, 1993, pp. 336–351
13. Khoroshkin, S.M., and Tolstoy, V.N.: Twisting of quantum (super)algebras. Connection of Drinfeld’s and Cartan–Weyl realizations for quantum affine algebras. MPIM preprint, Bonn: MPI/94-23, 29 p., 1994; hep-th/9404036
14. Lukierski, J., and Tolstoy, V.N.: Cartan–Weyl basis for quantum affine superalgebra $U_q(\widehat{osp}(1|2))$. Czech. J. Phys. 47, no. 12, 1231–1239 (1997); q-alg/9710030
15. Lusztig, G.: Canonical bases arising from quantized enveloping algebras. J. Am. Math. Soc. 3, 447–498 (1990)
16. Tolstoy, V.N.: Extremal projectors for contragredient Lie algebras and superalgebras of finite growth. (Russian) Uspekhi Math. Nauk 44, no. 1 (265), 211–212 (1989); translation in Russian Math. Surveys 44, no. 1, 257–258 (1989)
17. Tolstoy, V.N.: Extremal projectors for quantized Kac-Moody superalgebras and some of their applications. In: Quantum Groups (Clausthal, 1989), Lecture Notes in Phys. 370, Berlin: Springer, 1990, pp. 118–125
18. Tolstoy, V.N., and Khoroshkin, S.M.: The Universal R-matrix for quantum nontwisted affine Lie algebras. (Russian) Funktsional. Anal. i Prilozhen. 26, no. 1, 85–88 (1992); translation in Functional Anal. Appl. 26, no. 1, 69–71 (1992)
19. Van der Leur, J.W.: Contragredient Lie superalgebras of finite growth. Utrecht thesis, 1985
20. Yang, W.-L., and Zhang, Y.-Z.: Drinfeld basis and a nonclassical free boson representation of twisted quantum affine superalgebra $U_q[osp(2|2)]$. Preprint math.QA/9904017

Communicated by A. Connes
Commun. Math. Phys. 220, 561 – 582 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Floquet Spectrum of Weakly Coupled Map Lattices Viviane Baladi1 , Hans Henrik Rugh2 1 CNRS UMR 8628, Université de Paris-Sud, 91405 Orsay, France. E-mail: [email protected] 2 Département des Mathématiques, Université de Cergy-Pontoise, 95302 Cergy-Pontoise, France.
E-mail: [email protected] Received: 4 January 2001 / Accepted: 6 February 2001
Abstract: We consider weakly coupled analytic expanding circle maps on the lattice $\mathbb{Z}^D$ (for $D \ge 1$), with small coupling strength $\epsilon$ and summable decay of the two-sites coupling. We study the spectrum of the associated (Perron–Frobenius) transfer operators $\mathcal{L}_\epsilon$. On a suitable Banach space, perturbation theory applied to the difference of a high iterate $n_0$ of the transfer operators $\mathcal{L}_0^{n_0}$ and $\mathcal{L}_\epsilon^{n_0}$ yields localisation of the full spectrum of $\mathcal{L}_\epsilon^{n_0}$. The time-transfer operator $\mathcal{L}_\epsilon^{n_0}$ commutes with the spatial translations, and we provide a description of part of their joint eigenvalues, more precisely of the “nonresonant” spectrum of $\mathcal{L}_\epsilon^{n_0}$ restricted to the eigenspaces of the spatial translations. In particular, we exhibit smooth curves of eigenvalues and eigenspaces of $\mathcal{L}_\epsilon^{n_0}$ as functions of the eigenvalues $e^{i\alpha}$ (“crystal momenta”) of the spatial translations.

1. Introduction

Transfer operators are bounded linear operators $\mathcal{L}$, which are usually defined through composition by a discrete-time dynamical system $F : X \to X$, and multiplication by a weight function $G$ on $X$:

$$\mathcal{L}\varphi(x) = \sum_{Fy = x} G(y)\,\varphi(y),$$

where the observable $\varphi : X \to \mathbb{C}$ belongs to a suitable Banach space $B$. The weight, or potential, $G$ is often a positive function related to some reference measure $m$ by a “conformality” condition:

$$\int_X \psi \circ F\; \varphi \, dm = \int_X \psi\, \mathcal{L}\varphi \, dm.$$
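The conformality identity can be checked numerically in a simple case. The sketch below is only an illustration under stated assumptions: it takes a hypothetical degree-two expanding circle map $f(x) = 2x + \frac{A}{2\pi}\sin(2\pi x) \bmod 1$ with $A = 0.3$ (not a map used in the paper), the weight $G = 1/f'$ for Lebesgue measure, and hypothetical test functions; inverse branches are found by bisection and integrals by an equispaced rule on the circle.

```python
import numpy as np

A = 0.3  # hypothetical perturbation size; A < 1 keeps f' >= 2 - A > 1


def f_lift(y):
    """Lift of the circle map to the real line; it maps [0, 1] onto [0, 2]."""
    return 2.0 * y + A * np.sin(2 * np.pi * y) / (2 * np.pi)


def f_prime(y):
    return 2.0 + A * np.cos(2 * np.pi * y)


def preimage(target):
    """Solve f_lift(y) = target for y in [0, 1] by bisection (f_lift is increasing)."""
    lo = np.zeros_like(target)
    hi = np.ones_like(target)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        below = f_lift(mid) < target
        lo = np.where(below, mid, lo)
        hi = np.where(below, hi, mid)
    return 0.5 * (lo + hi)


def transfer(phi, x):
    """(L phi)(x): sum of phi(y) / f'(y) over the two preimages y of x."""
    total = np.zeros_like(x)
    for branch in (0, 1):
        y = preimage(x + branch)
        total = total + phi(y) / f_prime(y)
    return total


def psi(t):
    return 0.3 + np.cos(2 * np.pi * t)


def phi(t):
    return 1.0 + 0.5 * np.cos(2 * np.pi * t)


def duality_gap(n_grid=512):
    """|int psi(f) phi dm - int psi (L phi) dm| on an equispaced grid of the circle."""
    x = np.arange(n_grid) / n_grid
    lhs = np.mean(psi(f_lift(x)) * phi(x))  # psi is 1-periodic, so no reduction mod 1 is needed
    rhs = np.mean(psi(x) * transfer(phi, x))
    return float(abs(lhs - rhs))
```

Taking $\psi \equiv 1$ in the same identity shows that this operator preserves the Lebesgue integral, which is the sense in which $G$ plays the role of the (inverse) Jacobian.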
In particular, if X is a (finite-dimensional) compact manifold, m is often Lebesgue measure and G the Jacobian. Then, if F is sufficiently smooth and hyperbolic, such a transfer operator enjoys Perron–Frobenius type properties, the maximal eigenfunction
being related to an $F$-invariant probability measure with good thermodynamic and/or ergodic properties, and the spectral gap corresponding to the exponential rate of decay of correlations for test functions in $B$. Even in this case, transfer operators are usually, however, very far from being finite rank: compactness essentially only holds if $F$ is analytic, while a weaker quasicompactness property is the best one can do in finite differentiability. That part of the spectrum which consists of isolated eigenvalues of finite multiplicity sometimes has a dynamical interpretation, and can often be described by a dynamical Fredholm determinant or zeta function. For the rich array of results obtained since the early work of Ruelle in the 1960s, we refer, e.g., to the monograph [B] and references therein.

One aim of this article is to see how much of the above picture remains true in infinite-dimensional dynamics, focusing on one of the most tractable situations: weakly coupled map lattices. Coupled map lattices (CMLs) have been an object of rigorous studies since the seminal article of Bunimovich and Sinai [BS]. In this work, we shall consider the situation of weakly coupled analytic circle maps $F_\epsilon$, the lattice being just $\mathbb{Z}^D$, in the footsteps of Bricmont and Kupiainen [BK1]. We refer to the book of Kaneko [K], the survey of Bunimovich [Bu] or the introductions, e.g., of the more recent articles [BIJ] or [FR] for overview and motivation, or discussion of other settings (in particular the work of Pesin–Sinai [PS], Keller and Künzle [KK], Volevich [V], Jiang [J], Jiang–Pesin [JP], or the article of Bricmont–Kupiainen [BK2] in a nonanalytic framework).

Bricmont and Kupiainen [BK1], in a remarkable paper, constructed a Sinai–Ruelle–Bowen measure (limit of $(F_\epsilon)_*^{n}(\otimes_{\mathbb{Z}^D}\,\mathrm{Lebesgue})$) and proved the exponential decay of its correlation functions for analytic observables depending on finitely many sites.
(They were supposing exponential decay of the couplings, an assumption which has been weakened only very recently by Rugh [R], see below.) Their main technical tools were the transfer operators $\mathcal{L}_{\epsilon,\Lambda}$, associated to the truncations of the system on finite subsets $\Lambda \subset \mathbb{Z}^D$ of the lattice, acting on holomorphic and bounded functions on finite-dimensional polyannuli. They showed bounds on their spectral gaps, uniform in the size $|\Lambda|$. This approach forced them to consider only local observables, and the prefactor in the correlation decay bounds grows exponentially with the size of $\Lambda$.

It is therefore not only for aesthetic reasons that one would like to define a Banach space of observables depending on all lattice sites, and a transfer operator associated to the full (non-truncated) coupled dynamics. There are two obvious caveats in this enterprise. Firstly, in the translation-invariant case the SRB measure (which is ergodic) is absolutely continuous with respect to the product ($\otimes_{\mathbb{Z}^D}\,\mathrm{Lebesgue}$) of Lebesgue measure on the individual sites only if it coincides with this product. So a Banach space which is big enough to contain the SRB fixed point of the transfer operator cannot be a subset of $L^1(\otimes_{\mathbb{Z}^D}\,\mathrm{Lebesgue})$. Secondly, an element $\lambda \ne 1$ of the spectrum should not be expected to be an eigenvalue of finite multiplicity, even for the uncoupled transfer operator. Indeed, restricting the uncoupled operator to a finite subset $\Lambda \subset \mathbb{Z}^D$, we obtain a finite tensor product, with eigenvalues $\prod_{k=1}^{K} \lambda_{i_k}$, $K \le |\Lambda|$, where the $\lambda_i$ are eigenvalues of the single-site operator $L^{n_0}$, and the multiplicity, e.g., of a simple eigenvalue $\lambda_i$ of $L^{n_0}$ viewed as an eigenvalue (for $K = 1$) of $\otimes_\Lambda L^{n_0}$ coincides with the cardinality of $\Lambda$. (Writing $h$ for the fixed point of $L^{n_0}$ and $\varphi_i$ for the eigenfunction of $L^{n_0}$ and $\lambda_i$, just note that for each $p \in \Lambda$ the tensor product $\varphi_i(x_p) \otimes_{q\in\Lambda\setminus\{p\}} h(x_q)$ gives a different eigenfunction.)
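The multiplicity count in the tensor-product argument can be seen in a finite-dimensional toy model. In the sketch below the single-site operator is replaced by a hypothetical $3\times 3$ diagonal matrix with eigenvalues $1$, $0.5$, $0.3$ (an assumption made purely for illustration), and the uncoupled operator on `n_sites` sites by its Kronecker power.

```python
from functools import reduce

import numpy as np

# Hypothetical stand-in for the single-site operator: eigenvalue 1 plus two
# eigenvalues inside the unit disc.
SINGLE_SITE = np.diag([1.0, 0.5, 0.3])


def uncoupled_operator(single_site, n_sites):
    """Tensor (Kronecker) power of the single-site matrix over n_sites sites."""
    return reduce(np.kron, [single_site] * n_sites)


def multiplicity(matrix, value, tol=1e-9):
    """Number of eigenvalues of `matrix` within `tol` of `value`."""
    eigenvalues = np.linalg.eigvals(matrix)
    return int(np.sum(np.abs(eigenvalues - value) < tol))
```

With this choice, `multiplicity(uncoupled_operator(SINGLE_SITE, 4), 0.5)` counts one eigenvector per site carrying the eigenvalue $0.5$, so the multiplicity grows with the number of sites while the eigenvalue $1$ stays simple.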
An obvious solution to the second difficulty is to use translation invariance to restrict the transfer operator to the (invariant) eigenspaces of the spatial translations σ ∗ . The spectrum of each such restriction should
be more tractable, in particular the “single-site” or “nonresonant” eigenvalues of the uncoupled operator ($K = 1$ in the analysis above for the uncoupled operator) should have finite multiplicity, giving some hope that a perturbative analysis might lead to the construction of nontrivial eigenvalues and eigenfunctions for the coupled operator $\mathcal{L}_\epsilon$. It is this “Floquet spectrum” strategy (suggested to the first author by D. Ruelle in 1996) which we are able, finally, to carry out successfully in the present article. This is Theorem 1, stated in Sect. 2, which describes the Floquet eigenvalues generated by “nonresonant” eigenvalues of the full uncoupled operator $\mathcal{L}_0$.

A few words about our use of the Floquet spectrum terminology ([JS, Chap. 9], see [FKT] for a recent example) are perhaps appropriate here. In fact, we hesitated between “Bloch decomposition” (see [AC] for a recent implementation of this idea to Schrödinger operators) and “Floquet spectrum”. We opted for the second expression, which seems to be more frequent in the rigorous literature. We would like to point out that in our situation all complex frequencies or multifrequencies (sometimes called crystal momenta elsewhere, they are denoted $\alpha$ below) are present because of the infinite-dimensional nature of the system.

A brief history of the “full” transfer operator for CMLs. Keller and Künzle [KK] were the first to let the transfer operator associated to the full (infinite-dimensional) coupled dynamics act on a Banach space, in the setting of coupled non-Markov piecewise monotone interval maps, with bounded variation norms. Unfortunately, they were not able to prove the existence of a spectral gap for this full transfer operator (this remains to our knowledge an open question in this framework). In the easier situation of coupled analytic expanding circle maps from [BK1], assuming furthermore that $D = 1$, Baladi et al.
[BIJ] were able to construct a Banach space on which the full transfer operator has a simple eigenvalue at 1 and enjoys a spectral gap. (This yields exponential decay of correlations for a class of nonlocal observables.) Sadly, this Banach space is not spatially homogeneous, so that the spatial translation $\sigma^*$ does not define a bounded operator on it. For this reason, it was not possible to go “beyond the first gap” for the Banach space of [BIJ]. In the same paper [BIJ], a translationally invariant Fréchet space $\mathcal{F}$ was introduced on which the coupled transfer operator enjoyed a spectral gap property (for all $D \ge 1$), except, however, for the possible presence of continuous spectrum, which also obstructed further spectral analysis. This frustrating situation improved greatly when Fischer and Rugh [FR], using the same basic assumptions from [BK1], but exploiting a much simpler cluster expansion formalism (inspired by a paper of Maes and van Moffaert [MM] in a random noise framework), showed that the Banach space subset of $\mathcal{F}$ defined by a natural norm was preserved by a suitable iterate of $\mathcal{L}_\epsilon$, and that this iterate enjoyed a spectral gap. (As Fischer and Rugh were mostly concerned with ergodic properties of the dynamics, these facts are not explicitly stated in [FR], but can easily be deduced from their results.) The Banach space analysed in [FR] appears to be the “correct” invariant space: neither too big, nor too small, and translationally invariant. In the present work, we carry out the Floquet spectrum program sketched above for this Banach space, not only under the hypotheses of [FR], but in fact in a more general setting which was introduced very recently by Rugh [R]. Rugh also considered weakly coupled expanding analytic circle maps, and he studied the transfer operator acting on the Banach space from [FR].
The main novelty, for our present purposes, is that exponential decay of the coupling is not necessary: only a summability condition (see (2.1–2.2)) is required.
Sketch of the paper. The article is organised as follows. In Sect. 2, after recalling the restriction of the setting of [R] (translationally invariant coupled maps on a lattice) which is relevant here, we first state Theorem 0 (proved in the appendix), which exploits the results of [R] to describe the spectrum of the uncoupled transfer operator and show that the coupled and uncoupled operators are close in operator norm when the coupling strength is small. Next, we describe the eigenspaces $X^\alpha$ of $\sigma^*$ as well as the eigenspaces of the restriction of the uncoupled transfer operator to the various $X^\alpha$ (see in particular Lemma 1). We then state our main result, Theorem 1, which describes the nonresonant part of the spectrum of $\mathcal{L}_\epsilon^{n_0}$ restricted to each eigenspace $X^\alpha$ of $\sigma^*$ (Claim (1), which can be deduced from Theorem 0 and perturbation theory), as well as the $\alpha$-dependence of the corresponding eigenvalues (Claim (2)). Section 3 is devoted to an elementary abstract functional analytical argument inspired from [BY] which gives, in particular, a very explicit proof of Theorem 1 (1). The results from Sect. 3 are used in Sect. 4 to yield a matrix description of the restriction of $\mathcal{L}_\epsilon^{n_0}$ to an invariant space, expressed as a graph over the corresponding invariant space for $\mathcal{L}_0^{n_0}$. We exploit this description to show Theorem 1 (2). The key for this is Proposition 3, which embodies that information from the coupling decay which is relevant.
2. Setting and Formal Statement of Results

We start by recalling the settings of [FR] and [R], keeping as close as possible to the notation of [R] (the main exceptions are our use of $n$, $n_0$ for the time parameter instead of Rugh’s $T$, $T_0$, $\epsilon$ instead of $\kappa$ for the coupling strength, and $\Theta$ instead of $\vartheta$ for the Banach space parameter). In fact, we are essentially restricting the framework of [R] to the case when the infinite set $\Omega = \mathbb{Z}^D$, with a translationally invariant coupling which enjoys some spatial decay. We concentrate on pair couplings with exponential or polynomial decay for simplicity: $m$-point couplings (allowing $m$ to be unbounded) with other kinds of decay may be treated by exactly the same methods.

Tori and polyannuli. We view the circle $S^1 = \mathbb{R}/\mathbb{Z}$ as a subset of the complex cylinder $\mathcal{C} = \mathbb{C}/\mathbb{Z}$. For $\rho \ge 0$, we define the closed annulus in the cylinder, $A[\rho] = \{z \in \mathbb{C}/\mathbb{Z} : |\operatorname{Im} z| \le \rho\} \subset \mathcal{C}$. For each integer $D \ge 1$, the infinite torus $S^{\mathbb{Z}^D} = \prod_{p\in\mathbb{Z}^D} S^1$ and the infinite polyannulus $A^{\mathbb{Z}^D} = \prod_{p\in\mathbb{Z}^D} A[\rho]$ are both compact for the product topology. Let $\mathcal{S}$ denote the family of all finite subsets, including the empty set, of the lattice $\mathbb{Z}^D$. For $\Lambda \in \mathcal{S}$ we write $S^\Lambda = \prod_{p\in\Lambda} S^1$ and $A^\Lambda = \prod_{p\in\Lambda} A[\rho]$ for the $\Lambda$-torus and the $\Lambda$-annulus, respectively. $H_\Lambda = H(A^\Lambda)$ denotes the space of complex valued functions, holomorphic on the interior of $A^\Lambda$ and continuous on $A^\Lambda$. In the case $\Lambda = \emptyset$, we let $H_\emptyset = \mathbb{C}$. Each $H_\Lambda$ is a Banach space for the uniform norm denoted $|\cdot|$. Through coordinate projections we obtain natural inclusions: $j_{\Lambda,K} : H_K \hookrightarrow H_\Lambda$ whenever $K \subset \Lambda$ $(\in \mathcal{S})$ and also $j_\Lambda : H_\Lambda \hookrightarrow C(A^{\mathbb{Z}^D})$, where $C(A^{\mathbb{Z}^D})$ is the set of continuous functions on $A^{\mathbb{Z}^D}$. We denote by $H(A^{\mathbb{Z}^D})$ the closure of $\cup_\Lambda\, j_\Lambda H_\Lambda$ in $C(A^{\mathbb{Z}^D})$, for the supremum norm.
Analytic expanding circle map $f$, uncoupled map $F$.

Definition. For $\rho > 0$ and $\lambda > 1$ we say that $f : A[\rho] \to \mathcal{C}$ is a real analytic, $(\rho, \lambda)$-expanding map of the circle if
(1) $f$ is holomorphic in $\operatorname{Int} A[\rho]$ and continuous on $A[\rho]$.
(2) $f(S^1) = S^1$. (Real-analyticity).
(3) The intersection $f(\partial A[\rho]) \cap A[\lambda\rho]$ is empty. (Expansion).

A real analytic, $(\rho, \lambda)$-expanding map of the circle is real analytic expanding on the circle in the usual metric sense, and $\lambda$ gives a lower bound for the expansion constant (see Appendix A in [R]). We shall choose a single real analytic, $(\rho, \lambda)$-expanding map $f$. The uncoupled system is the direct product $F = (f)_{p\in\mathbb{Z}^D}$, which leaves the infinite (real) torus invariant.

Coupling $g$, coupling-strength $\epsilon$, and coupled map $F_\epsilon$. It is useful to introduce now the notation $\sigma_p : \mathbb{Z}^D \to \mathbb{Z}^D$ ($p \in \mathbb{Z}^D$) for the translation $\sigma_p(k) = k + p$. The theory in [R] is valid for a general class of summable couplings. For the sake of simplicity we restrict here to pair-couplings. Let us first describe an exponentially decaying setting in which $d$ will denote a translationally invariant metric on the lattice, to be specified later. For fixed $\xi \in (0, 1)$, let $H_{\xi,0} \subset H(A^{\mathbb{Z}^D})$ be those $\varphi$ which may be written as

$$\varphi = \sum_{q\in\mathbb{Z}^D} j_{\{0,q\}}\, \varphi_{\{0,q\}},$$

the sum being uniformly converging with each $\varphi_{\{0,q\}} \in H_{\{0,q\}}$, and where

$$\sum_{q\in\mathbb{Z}^D} \xi^{-d(0,q)}\, |\varphi_{\{0,q\}}| < \infty.$$

Let $|\varphi|_{\xi,0}$ denote the infimum of the above sum over all possible decompositions of $\varphi$. The parameter $\xi$ quantifies an exponential spatial decay of the terms $\varphi_{\{p,q\}}$ with respect to the distinguished point 0 and the metric $d$. In our context there are two particular choices for the metric $d$ which will be of interest. First, we may take the Euclidean metric, i.e., $d(p, q) = \|p - q\|$, for which the decay is truly exponential. In this case the above sum becomes

$$\sum_{q\in\mathbb{Z}^D} \xi^{-\|q\|}\, |\varphi_{\{0,q\}}|. \tag{2.1}$$

Our second choice, cf. also [R, Sect. 6], is to take a “renormalized” metric, $d(p, q) = \log(1 + \|p - q\|)$. Setting $P = -\log(\xi) > 0$, the identity $\xi^{d(p,q)} = (1 + \|p - q\|)^{-P}$ shows that we are in fact describing a polynomial decay with respect to the Euclidean metric. The above sum then reads

$$\sum_{q\in\mathbb{Z}^D} (1 + \|q\|)^{P}\, |\varphi_{\{0,q\}}|. \tag{2.2}$$
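We shall consider a translationally invariant system of couplings $g_{\mathbb{Z}^D} = (g_p)_{p\in\mathbb{Z}^D}$, where $g_p = g_0 \circ \sigma_p$ with $g_0$ in $H_{\xi,0}$, either for the Euclidean metric (exponential case (2.1)) or for the renormalized Euclidean metric (polynomial case (2.2)), mapping the real torus to the real line, and so that $|g_0|_{\xi,0} \le \epsilon$. We say that the coupling decays exponentially with rate $\xi$ (respectively polynomially with exponent $P = \log(1/\xi)$) and (spatial) coupling strength $\epsilon$. The coupled system $F_\epsilon = (F_{\epsilon,p})_{p\in\mathbb{Z}^D} : A^{\mathbb{Z}^D} \to \mathcal{C}^{\mathbb{Z}^D}$ is obtained by setting $F_{\epsilon,p} : A^{\mathbb{Z}^D} \to \mathcal{C}$,

$$F_{\epsilon,p} : z \mapsto f(z_p) + g_p(z), \qquad p \in \mathbb{Z}^D.$$

The infinite real torus is invariant under $F_\epsilon$, by definition.

Boundary condition $\bar x$, truncated map $F_{\epsilon,\Lambda}$. Let $\bar x \in S^{\mathbb{Z}^D}$ denote an arbitrary but fixed reference point. For $\Lambda \in \mathcal{S}$ we define a holomorphic injection, $i_\Lambda : A^\Lambda \to A^{\mathbb{Z}^D}$, $i_\Lambda(z_\Lambda) = (z_\Lambda, \bar x_{\Lambda^c})$ and a natural projection, $q_\Lambda : A^{\mathbb{Z}^D} \to A^\Lambda$, $q_\Lambda(z_{\mathbb{Z}^D}) = z_\Lambda$, all expressed in natural coordinates on the annuli. The map $q_\Lambda$ is the left-inverse of $i_\Lambda$ while $r_\Lambda = i_\Lambda \circ q_\Lambda : A^{\mathbb{Z}^D} \to A^{\mathbb{Z}^D}$ is a projection which gives boundary conditions outside $\Lambda$. We will use the same notation when considering the restriction of the above maps to the tori $S^{\mathbb{Z}^D}$ and $S^\Lambda$. Using the above notation, we define for $\Lambda \in \mathcal{S}$ the $\Lambda$-truncated coupled map, $F_{\epsilon,\Lambda} : S^\Lambda \to S^\Lambda$ (and $F_{\epsilon,\Lambda} : A^\Lambda \to \mathcal{C}^\Lambda$) by setting $F_{\epsilon,\Lambda} = q_\Lambda \circ F_\epsilon \circ i_\Lambda$. When $w_\Lambda \in A^\Lambda$ we denote by $F_{\epsilon,\Lambda}^{-1}(w_\Lambda)$ the (finite) set of inverse images in $A^\Lambda$ obtained by solving $w_\Lambda = F_{\epsilon,\Lambda}(z_\Lambda)$ for $z_\Lambda \in A^\Lambda$. The condition $\epsilon < (\lambda - 1)\rho$ on the coupling strength implies [R, App. E] that local inverses indeed exist and are real analytic in $w_\Lambda$. In particular, $F_{\epsilon,\Lambda}$ is non-singular and therefore has a well-defined orientation on $S^\Lambda$.

Truncated transfer operator $\mathcal{L}_{\epsilon,\Lambda}$, single-site gap $\eta$. The transfer operator $\mathcal{L}_{\epsilon,\Lambda}$ associated with the truncated dynamical system $(S^\Lambda, F_{\epsilon,\Lambda})$ and the Lebesgue measure $m_\Lambda$ on the $\Lambda$-torus is defined by the identification,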
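$$\int_{S^\Lambda} \psi\; \mathcal{L}_{\epsilon,\Lambda}\varphi \; dm_\Lambda \equiv \int_{S^\Lambda} \psi \circ F_{\epsilon,\Lambda}\; \varphi \; dm_\Lambda,$$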
for $\psi \in L^\infty(S^\Lambda)$ and $\varphi \in L^1(S^\Lambda)$. By standard arguments, this defines a bounded linear operator, $\mathcal{L}_{\epsilon,\Lambda}$, on $L^1(S^\Lambda)$. This operator may be extended to a compact (in fact nuclear) operator on $H_\Lambda$. We shall write $L$ for the single-site operator associated to $f$ acting on $H = H(A[\rho])$, and we denote by $\eta < 1$ the maximal modulus of its eigenvalues different from 1, which is a simple eigenvalue with a positive eigenfunction $h \in H$. It may happen (nongenerically) that $\eta = 0$, e.g. for the dynamics $z \mapsto z^2$. The present work is only interesting if $\eta > 0$; we shall make this assumption and fix $\eta < \tilde\eta < 1$. Note that $c_h := \sup_n \|L^n(1)\|$ and $c_r := \sup_n \|(L - h\int\cdot\,)^n\| / (\tilde\eta)^n$ are finite. Since $L^n(1)$ converges to $h$ in the supremum norm on $S^1$, for any $\tilde c_h > c_h$ we have $\|h\|_H \le \tilde c_h$, up to taking smaller $\rho > 0$ if necessary.
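The nongeneric case $\eta = 0$ can be made concrete. The sketch below assumes that the dynamics $z \mapsto z^2$ is read, in the additive coordinate on $\mathbb{R}/\mathbb{Z}$, as the doubling map $x \mapsto 2x \bmod 1$, whose Lebesgue transfer operator is $(L\varphi)(x) = \tfrac12[\varphi(x/2) + \varphi((x+1)/2)]$. On Fourier modes $e_k(x) = e^{2\pi i k x}$ this gives $L e_k = e_{k/2}$ for even $k$ and $L e_k = 0$ for odd $k$, so every nonconstant mode is annihilated after finitely many steps and no eigenvalue other than 1 and 0 can occur. The helper names are hypothetical.

```python
import numpy as np


def apply_transfer(phi):
    """Return L phi for the doubling map x -> 2x mod 1 (Lebesgue reference measure)."""
    return lambda x: 0.5 * (phi(x / 2.0) + phi((x + 1.0) / 2.0))


def fourier_mode(k):
    return lambda x: np.exp(2j * np.pi * k * x)


def iterate_norm(k, n_iter, n_grid=64):
    """Sup norm, sampled on a grid, of L^n_iter applied to the k-th Fourier mode."""
    x = np.arange(n_grid) / n_grid
    phi = fourier_mode(k)
    for _ in range(n_iter):
        phi = apply_transfer(phi)
    return float(np.max(np.abs(phi(x))))
```

For instance the mode $k = 12 = 2^2 \cdot 3$ survives two applications (it becomes the mode $k = 3$) and is killed by the third, up to rounding error.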
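Banach space $M_\theta$. Given a non-empty $\Lambda \in \mathcal{S}$, let $z_\Lambda = (z_p)_{p\in\Lambda}$ be natural coordinates on the $\Lambda$-torus $S^\Lambda$. For every pair $K \subset \Lambda \in \mathcal{S}$ we define a linear operator $\pi_{K,\Lambda} : H_\Lambda \to H_K$ by “integrating away” the coordinates outside $K$:

$$\pi_{K,\Lambda}\varphi_\Lambda(z_K) = \int_{S^{\Lambda\setminus K}} \varphi_\Lambda(z_\Lambda)\, dm_{\Lambda\setminus K}(z_{\Lambda\setminus K}), \qquad \varphi_\Lambda \in H_\Lambda.$$

Such operators are norm-contracting and Fubini’s theorem shows that $\pi_{L,K}\pi_{K,\Lambda} = \pi_{L,\Lambda}$ if $L \subset K \subset \Lambda$. Hence the family $(H_\Lambda, \pi_{K,\Lambda})$ is projective. We denote by $M$ its projective limit. An element $\varphi = (\varphi_\Lambda)_{\Lambda\in\mathcal{S}} \in M$ satisfies $\pi_{K,\Lambda}\varphi_\Lambda = \varphi_K$ whenever $K \subset \Lambda \in \mathcal{S}$. We write $\pi_\Lambda\varphi$ for the natural projection $\varphi \in M \mapsto \varphi_\Lambda \in H_\Lambda$. From the defining equation it is clear that if $K \subset \Lambda$, $a_K \in H_K$, and $\varphi_\Lambda \in H_\Lambda$ then $\pi_{K,\Lambda}((j_{\Lambda,K}a_K)\varphi_\Lambda) = a_K\, \pi_{K,\Lambda}\varphi_\Lambda$, simply because $a_K$ does not depend on the variables which are “integrated away”. We introduce for $\theta \in (0, 1)$,

$$\|\varphi\|_\theta = \sup_{\Lambda\in\mathcal{S}} \theta^{|\Lambda|}\, |\varphi_\Lambda| \in [0, +\infty], \qquad \varphi \in M, \tag{2.3}$$

and define $M_\theta$ to be the set of those $\varphi$ with finite norm. The norm of the natural projection $\pi_\Lambda : M_\theta \to H_\Lambda$ is given by $\|\pi_\Lambda\|_\theta = \theta^{-|\Lambda|}$.

Full transfer operator $\mathcal{L}_\epsilon$, choice of $0 < \theta < \Theta$. In the following we use the constants $\rho > 0$, $\lambda > 1$, $\eta < 1$, $c_h$, and $c_r$ introduced above and fix $\theta \in (0, 1)$ and $\epsilon_0 > 0$ (denoted $\kappa$ in [R]) so that the condition TR in [R, Def. 4.18] is verified for some $1 < \gamma < \eta^{-1}$ (the constant denoted $C_\beta$ there is our $C(\epsilon)$ defined in (2.4) below). For a suitable value of $0 < \theta < \Theta$ [R, Theorem 2.1, Lemmas 4.20 and 4.25], we may define for each $n \ge 0$ a bounded linear operator $\mathcal{L}_\epsilon^{(n)} : M_\Theta \to M_\theta$ by the requirement that for all $0 < \epsilon < \epsilon_0$ and each finite $K \subset \mathbb{Z}^D$:

$$\pi_K \mathcal{L}_\epsilon^{(n)} = \lim_{\Lambda\to\mathbb{Z}^D} \pi_{K,\Lambda} \circ \mathcal{L}_{\epsilon,\Lambda}^{n} \circ \pi_\Lambda.$$

Furthermore, there is $n_0$ finite, such that for $n \ge n_0$, $\mathcal{L}_\epsilon^{(n)}$ maps $M_\Theta$ into $M_\Theta$ itself so that we may consider the spectrum of this operator. Rugh [R] gives another characterisation of $\mathcal{L}_\epsilon^{(n)}$ as a sum over configurations (see his Lemma 4.25). This allows one to bypass the boundary condition $\bar x$ and is convenient to obtain bounds (some of which are quoted and used below), but since our current aim is to reduce technicalities as much as possible we content ourselves here with the naive definition and refer to [R] for more. If $n, N \ge n_0$ we have that $\mathcal{L}_\epsilon^{(N+n)} = \mathcal{L}_\epsilon^{(N)} \mathcal{L}_\epsilon^{(n)}$ is a bounded operator on $M_\Theta$. It is thus not a (serious) abuse of notation to write $\mathcal{L}_\epsilon^{n}$, without parentheses, for $n \ge n_0$.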
Uncoupled spectrum and perturbative bounds. Let $\operatorname{sp}(\cdot)$ denote the spectrum of an operator. We state here extensions of the results from [R] required for our present purposes (the proofs are given in the appendix):

Theorem 0 (Uncoupled spectrum and perturbation). Let $\theta$, $\Theta$, $\epsilon_0$, and $n_0$ be as above.

(1) If the coupling strength $0 \le \epsilon \le \epsilon_0$ the operators $\mathcal{L}_\epsilon^{n} : M_\Theta \to M_\theta$ and $\mathcal{L}_\epsilon^{n} : M_\Theta \to M_\Theta$ are bounded for all $n \ge 0$ and $n \ge n_0$, respectively.

(2) Let $\operatorname{sp}(L^{n_0}) = \{1, \lambda_j,\ j = 1, 2, \dots,\ |\lambda_j| < 1\}$ denote the nonzero spectrum of the compact single-site operator $L^{n_0}$ on $H$. Then the nonzero spectrum of $\mathcal{L}_0^{n_0}$ on $M_\Theta$ consists in the discrete set

$$\Sigma_0 = \Bigl\{ 1,\ \prod_{k=1}^{K} \lambda_{j_k},\ K = 1, 2, \dots,\ 1 \le j_1 \le j_2 \le \dots \le j_K \Bigr\},$$

where 1 is a simple eigenvalue, and all other points are eigenvalues of infinite multiplicity. Furthermore, the image of the spectral projector associated [Ka, III.6.4] to any domain $D \subset \mathbb{C}$, with $0 \notin D$ and $\partial D$ a union of rectifiable curves disjoint from $\Sigma_0$, consists of the generalised eigenvectors $\{\varphi \in M_\Theta \mid \exists \ell \ge 1,\ \exists z \in \Sigma_0 \cap D,\ \text{s.t. } (z - \mathcal{L}_0^{n_0})^{\ell}\varphi = 0\}$.

(3) Define for $0 \le \epsilon \le \epsilon_0$ the function

$$C(\epsilon) = \frac{e^{2\pi\epsilon}}{e^{2\pi(\lambda-1)\rho} - e^{2\pi\epsilon}} - \frac{1}{e^{2\pi(\lambda-1)\rho} - 1}. \tag{2.4}$$

Then we have the following bound on the perturbation for all $n \ge n_0$:

$$\|\mathcal{L}_0^{n} - \mathcal{L}_\epsilon^{n}\|_{M_\Theta} \le C(\epsilon)/C(\epsilon_0). \tag{2.5}$$
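To get a feeling for the size of the bound (2.5), the function in (2.4) can be evaluated directly. The sketch below assumes the reading $C(\epsilon) = e^{2\pi\epsilon}/(e^{2\pi(\lambda-1)\rho} - e^{2\pi\epsilon}) - 1/(e^{2\pi(\lambda-1)\rho} - 1)$ of the displayed formula, which vanishes at $\epsilon = 0$ and is defined for $0 \le \epsilon < (\lambda-1)\rho$; the function and parameter names are hypothetical.

```python
import math


def coupling_constant(eps, lam, rho):
    """C(eps) as read from (2.4); requires 0 <= eps < (lam - 1) * rho."""
    gap = (lam - 1.0) * rho
    if not 0.0 <= eps < gap:
        raise ValueError("need 0 <= eps < (lam - 1) * rho")
    top = math.exp(2 * math.pi * gap)
    grow = math.exp(2 * math.pi * eps)
    return grow / (top - grow) - 1.0 / (top - 1.0)


def perturbation_bound(eps, eps0, lam, rho):
    """Right-hand side C(eps) / C(eps0) of the perturbation estimate (2.5)."""
    return coupling_constant(eps, lam, rho) / coupling_constant(eps0, lam, rho)
```

Under this reading the ratio $C(\epsilon)/C(\epsilon_0)$ increases from 0 at $\epsilon = 0$ to 1 at $\epsilon = \epsilon_0$, which is what makes the estimate useful for small coupling.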
The function $C(\epsilon)$ is denoted $C_\beta$ in [R], see Lemma 3.4 there.

Corollary of Theorem 0 (Perturbation theory). Since 1 is an eigenvalue of $\mathcal{L}_\epsilon^{n_0}$, it follows from (2.5) that for small enough $0 \le \epsilon \le \epsilon_0$, the spectrum of the bounded operator $\mathcal{L}_\epsilon^{n_0} : M_\Theta \to M_\Theta$ consists in $\{1\} \cup \Sigma_\epsilon$, where $\Sigma_\epsilon$ is a subset of a disc of radius $\eta$. The maximal eigenvalue 1 is simple, with an eigenfunction invariant under spatial translations, and there are no other spectral points of $\mathcal{L}_\epsilon^{n_0}$ on the unit circle. For each $\delta > 0$ there is a function $C_\delta(\epsilon) = O(\epsilon)$ as $\epsilon \to 0$ and such that $d(z, \Sigma_0) < C_\delta(\epsilon)$ for each $z \in \Sigma_\epsilon$ with $|z| > \delta$.

In fact, the fixed point of $\mathcal{L}_\epsilon^{n_0}$ in $M_\Theta$ gives rise to an $F_\epsilon$-invariant finite positive Borel measure on the infinite torus which is the SRB measure. We refer to [BK1, BIJ, FR], and [R] for this claim and for ergodic-theoretical properties of $F_\epsilon$ related to the spectral properties stated in Theorem 0 (quantified time-mixing for observables in a space related to the coupling strength).
Floquet spectrum of $L_\varepsilon$, resonant spectrum $R$. For simplicity, consider $D = 1$ (the extension to $D \ge 2$ is straightforward and left to the reader). Write $\sigma$ for the spatial translation (to the left, say) on $\mathbb{Z}$, and also $\sigma : A^{\mathbb{Z}} \to A^{\mathbb{Z}}$ for the associated shift on the product annulus. This shift defines in a natural way an operator $\sigma^*$ on $M$. To be more precise, let $K$ be a finite subset of our lattice and denote by $\sigma_K : A^{K} \to A^{\sigma(K)}$ the action of the shift on the truncated annulus. To a projective family $\phi \in M$ we may associate, for each finite subset $K \subset \mathbb{Z}$, the pull-back $(\sigma^*\phi)_K \equiv (\pi_{\sigma(K)}\phi) \circ \sigma_K$. Applying Fubini, it is readily seen that if $K \subset \Lambda \in S$ then
$$\pi_{K,\Lambda}(\sigma^*\phi)_\Lambda = \pi_{K,\Lambda}(\phi_{\sigma(\Lambda)} \circ \sigma_\Lambda) = (\pi_{\sigma(K),\sigma(\Lambda)}\phi_{\sigma(\Lambda)}) \circ \sigma_K = \phi_{\sigma(K)} \circ \sigma_K = (\sigma^*\phi)_K$$
(i.e., the shift and the projection commute). The same is true for the inverse operator. Thus, the family $(\sigma^*\phi)_K$, $K \in S$, is projective and clearly has the same norm in $M$ as $\phi$. Hence, $\sigma^* : M \to M$ is an isometry.

It is not difficult to see that every frequency $e^{i\alpha}$, $\alpha \in [0, 2\pi)$, is an eigenvalue of $\sigma^*$ of infinite multiplicity. Letting $\pi_\emptyset : M \to \mathbb{C}$ be the projection to the empty set, we denote by
$$X_\alpha = \{\varphi \in M \mid \sigma^*\varphi = e^{i\alpha}\varphi,\ \pi_\emptyset(\varphi) = 0\}$$
the corresponding eigenspace in the kernel of Lebesgue. The set $X_\alpha$ is obviously a complete subspace of $M$. (Note, although we shall not use this fact here, that $\|(\sigma^*)^n\| = 1$ for all $n$ implies that $X_\alpha$ coincides with the generalised eigenspace $\{\varphi \in M \mid \exists \ell \ge 1,\ (\sigma^* - e^{i\alpha}\,\mathrm{Id})^{\ell}\varphi = 0,\ \pi_\emptyset(\varphi) = 0\}$. For example, if $\varphi$ is not an eigenvector but is a generalised eigenvector for $\ell = 2$ and $e^{i\alpha}$, then one easily shows by induction that $(\sigma^*)^r\varphi = r e^{i(r-1)\alpha}(\sigma^*\varphi - e^{i\alpha}\varphi) + e^{ir\alpha}\varphi$ for all $r \ge 2$, contradicting $\|(\sigma^*)^n\| = 1$ for all $n$.)

Our aim in the present work is to describe in more detail the structure of the spectrum of $L_\varepsilon^{n_0}$, using our spatially invariant setting.
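As a finite-dimensional analogue of the statement that every frequency $e^{i\alpha}$ is an eigenvalue of the shift, the following sketch replaces the lattice by a hypothetical periodic lattice with N sites and the operator by the left shift on complex sequences. The names `left_shift`, `bloch_wave` and `is_shift_eigenvector` are illustrative assumptions and do not come from the paper; on a periodic lattice only the frequencies α = 2πm/N can occur.

```python
import cmath
import math


def left_shift(a):
    """Left shift on a periodic lattice: (S a)_p = a_{(p+1) mod N}."""
    return a[1:] + a[:1]


def bloch_wave(alpha, n_sites):
    """Sequence a_p = e^{i alpha p}, i.e. the sum over k of e^{i alpha k} delta_k."""
    return [cmath.exp(1j * alpha * p) for p in range(n_sites)]


def is_shift_eigenvector(a, alpha, tol=1e-12):
    """Check S a = e^{i alpha} a entrywise, up to a small tolerance."""
    phase = cmath.exp(1j * alpha)
    return all(abs(s - phase * x) <= tol for s, x in zip(left_shift(a), a))


n_sites = 8
alpha = 2 * math.pi * 3 / n_sites  # an admissible frequency on Z/8Z
print(is_shift_eigenvector(bloch_wave(alpha, n_sites), alpha))
```

On the infinite lattice no quantisation of α is needed, which is why every α in [0, 2π) appears in the text.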
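Choosing a translationally invariant reference point $\bar{x}$ for the boundary condition, the truncated dynamical system becomes spatially invariant and the very definition of the truncated operator ensures that $(L_{\sigma(\Lambda),\varepsilon}\,\phi_{\sigma(\Lambda)}) \circ \sigma_\Lambda = L_{\Lambda,\varepsilon}(\phi_{\sigma(\Lambda)} \circ \sigma_\Lambda)$ for $\Lambda \in S$. Therefore
$$\pi_{K,\Lambda} L^{n}_{\Lambda,\varepsilon}(\phi_{\sigma(\Lambda)} \circ \sigma_\Lambda) = (\pi_{\sigma(K),\sigma(\Lambda)} L^{n}_{\sigma(\Lambda),\varepsilon}\,\phi_{\sigma(\Lambda)}) \circ \sigma_K$$
for $K \subset \Lambda \in S$ and any $n \ge 0$. Taking the limit $\Lambda \to \mathbb{Z}$ (which also implies $\sigma(\Lambda) \to \mathbb{Z}$) we obtain $\pi_K(L_\varepsilon^{n}(\sigma^*\phi)) = (\pi_{\sigma(K)}(L_\varepsilon^{n}\phi)) \circ \sigma_K = \pi_K(\sigma^*(L_\varepsilon^{n}\phi))$, i.e., that $\sigma^*$ and $L_\varepsilon^{n}$ commute as operators on $M$:
$$L_\varepsilon^{n} \circ \sigma^* = \sigma^* \circ L_\varepsilon^{n}, \quad \forall\, 0 \le \varepsilon < \varepsilon_0,\ n \ge n_0.$$
In particular, $L_\varepsilon^{n_0}$ sends $X_\alpha$ into itself. Recalling the notation $\{\lambda_j\}$ and $\Sigma_0$ from Theorem 0, we define the "resonant eigenvalues" of $L_0^{n_0}$ to be:
$$R = \Bigl\{ z \in \Sigma_0 \;\Big|\; z = \prod_{k=1}^{K} \lambda_{j_k},\ K \ge 2,\ j_1 \le \ldots \le j_K \Bigr\} \cup \{0\}.$$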
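The product structure of the uncoupled spectrum and the notion of resonance can be made concrete with a toy computation. The sketch below uses a hypothetical finite list of single-site eigenvalues (exact rationals, chosen only for illustration) and flags those that coincide with a product of at least two eigenvalues; the function names are assumptions, not notation from the paper.

```python
from fractions import Fraction
from itertools import combinations_with_replacement
from math import prod


def products_of_eigenvalues(lams, n_factors):
    """All products of exactly n_factors eigenvalues, repetitions allowed."""
    return {prod(combo) for combo in combinations_with_replacement(lams, n_factors)}


def resonant_single_site_eigenvalues(lams, max_factors):
    """Eigenvalues in lams equal to a product of K eigenvalues, 2 <= K <= max_factors."""
    higher_products = set()
    for n_factors in range(2, max_factors + 1):
        higher_products |= products_of_eigenvalues(lams, n_factors)
    return {lam for lam in lams if lam in higher_products}


# Hypothetical single-site eigenvalues (besides 1): here 1/4 = (1/2) * (1/2) is resonant.
lams = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]
print(sorted(resonant_single_site_eigenvalues(lams, 3)))
```

Since all eigenvalues in this example have modulus at most 1/2, a product of three or more factors has modulus at most 1/8, so truncating at a small number of factors loses no resonances among the listed eigenvalues.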
We also call resonant those eigenvalues of $L^{n_0}$ which belong to $R$ (this happens if an eigenvalue of $L^{n_0}$ coincides with a product of other eigenvalues). We have:

Lemma 1 (Nonresonant spectrum of $L_0^{n_0}|_{X_\alpha}$). Except for the resonant eigenvalues in $R$ and, possibly, $\{1\}$, for each $\alpha \in [0, 2\pi)$ the nonzero spectrum of $L_0^{n_0}|_{X_\alpha}$ is the same (in particular, consists of eigenvalues with the same finite algebraic multiplicities) as the nonresonant spectrum of $L^{n_0}$ on $H$. For each $\lambda \in \mathrm{sp}(L^{n_0}) \setminus \{1, 0\}$, a bijection between any basis of the generalised eigenspace of $L^{n_0}$ and a basis for that of $L_0^{n_0}|_{X_\alpha}$ is given by:
$$\varphi \mapsto \varphi^{\alpha} = \sum_{k \in \mathbb{Z}^D} e^{i\alpha k}\, \varphi_0 \circ \sigma^{k} \in M \tag{2.6}$$
where $\varphi_0(x) = \varphi(x_0) \otimes_{\ell \in \mathbb{Z}^D \setminus \{0\}} h(x_\ell)$. More precisely, the projective families defined by (2.6) should be understood as follows. For each finite $\Lambda \subset \mathbb{Z}^D$ we set $(\varphi^{\alpha})|_\Lambda = \sum_{k \in \Lambda} e^{i\alpha k} \varphi_{k,\Lambda}$, where $\varphi_{k,\Lambda}(x) = \varphi(x_k) \otimes_{\ell \in \Lambda \setminus \{k\}} h(x_\ell)$. (Note that $\int_{S^1} \varphi\, dm = 0$ and $\sup_\Lambda \sup |(\varphi^{\alpha})|_\Lambda| / |\Lambda| < \infty$.)

Proof of Lemma 1. First note that, since a spectral projector $\Pi^{\alpha}_D$ for $L_0^{n_0}|_{X_\alpha}$ is just the restriction to $X_\alpha$ of the corresponding spectral projector for $L_0^{n_0}$, the last statement of Theorem 0 (2) implies that the image of $\Pi^{\alpha}_D$ is a vector space of generalised eigenvectors in $X_\alpha$. Fix a nonresonant eigenvalue $\lambda \ne 1$ for $L^{n_0}$ and let $\varphi$ be an element of a basis of a generalised eigenspace for the compact operator $L^{n_0}$ and $\lambda$, in particular $(L^{n_0} - \lambda)^{\ell}\varphi = 0$ for some $\ell \ge 1$. Obviously, $\varphi^{\alpha} \in X_\alpha$ defined by (2.6) is a generalised eigenvector of $L_0^{n_0}$ for $\lambda$, and $\lambda \notin R$ by the choice of $\lambda$. To check that the image of a linearly independent set by the map (2.6) is a linearly independent set, use that $\sum_i \upsilon_i \varphi_i^{\alpha} = 0$ implies $\sum_i \upsilon_i \pi_{\{0\}} \varphi_i^{\alpha} = \sum_i \upsilon_i \varphi_i = 0$.

To check that the image of (2.6) spans the generalised eigenspace for $L_0^{n_0}|_{X_\alpha}$ and $\lambda$, first note that if $\psi^{\alpha}$ is a generalised eigenvector for $L_0^{n_0}|_{X_\alpha}$ and $\lambda \notin R$, then there is some nonempty finite $\Lambda$ so that $\psi^{\alpha}_\Lambda \ne 0$ is a generalised eigenfunction for $L_\Lambda^{n_0}$ and $\lambda$. The tensor product expression for $L_\Lambda^{n_0}$ and elementary algebra imply that there are $\upsilon_{t,p} \in \mathbb{C}$, for $t = 1, \ldots, m$ and $p \in \Lambda$, with
$$\psi^{\alpha}_\Lambda = \sum_{p \in \Lambda} \sum_{t=1}^{m} \upsilon_{t,p}\, \hat\psi_t(x_p) \otimes_{k \in \Lambda \setminus \{p\}} h(x_k),$$
where $\{\hat\psi_s,\ s = 1, \ldots, m\}$ is a basis of the $m$-dimensional space of generalised eigenfunctions for $(L^{n_0}, \lambda)$ (observe that $m = m_\lambda$ does not depend on $\Lambda$). Now, since $\psi^{\alpha} \in X_\alpha$, linear independence yields $\upsilon_{t,p} = \upsilon_{t,q}\, e^{i\alpha(p-q)}$ so that, writing $\upsilon_t = \upsilon_{t,p_0}$ for some arbitrarily chosen $p_0 \in \Lambda$, we find $\psi^{\alpha} = \sum_{t=1}^{m} \upsilon_t\, \hat\psi_t^{\alpha} \circ \sigma^{p_0}$. □
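Recall that $\mathrm{sp}(L_\varepsilon^{n_0}) \subset \{1\} \cup \{|z| < \eta\}$ with 1 a simple translationally invariant eigenvalue. The key result of this work is the following description of part of the spectrum of $L_\varepsilon^{n_0}|_{X_\alpha}$ (see also the remark after the statement, and note that similar properties of the spectrum of $L_\varepsilon^{n}$ on $X_\alpha$ for $n > n_0$ may easily be formulated and proved):

Theorem 1. Let $D \subset \{z \mid |z| < 1\}$ be a complex neighbourhood of a finite subset of the nonresonant spectrum $\mathrm{sp}(L_0^{n_0}) \setminus R$ of $L_0^{n_0}$ such that $D \cap R = \emptyset$. There is $C > 0$ so that for each small enough coupling strength $0 < \varepsilon < \varepsilon_0$:

(1) For every $\alpha \in [0, 2\pi)$, the spectrum $\mathrm{sp}(L_\varepsilon^{n_0}|_{X_\alpha}) \cap D$ consists in finitely many eigenvalues $\{\lambda_\ell(\alpha),\ \ell \in E(\alpha)\}$ of finite multiplicity. To each $\lambda_j \in D \cap \mathrm{sp}(L_0^{n_0})$ is associated a finite set $\{\lambda_\ell(\alpha),\ \ell \in E_j(\alpha)\}$ so that the $E_j(\alpha)$ form a partition of $E(\alpha)$, and the sum of the algebraic multiplicities of the $\lambda_\ell(\alpha)$ for $\ell \in E_j(\alpha)$ coincides with the algebraic multiplicity $M_j$ of $\lambda_j$ for $L^{n_0}$. Furthermore, $|\lambda_\ell(\alpha) - \lambda_j| \to 0$, uniformly, as $\varepsilon \to 0$ for all $\ell \in E_j(\alpha)$. Finally, if $\varphi^{\alpha}$ is associated to $\varphi$ (with $(\lambda_j - L^{n_0})^{k}\varphi = 0$) as in (2.6) then there are $\varphi^{\alpha}_{\varepsilon,\ell} \in X_\alpha$ with $(\lambda_\ell(\alpha) - L_\varepsilon^{n_0})^{k_\ell}\varphi^{\alpha}_{\varepsilon,\ell} = 0$ so that $\|\sum_{\ell \in E_j(\alpha)} \varphi^{\alpha}_{\varepsilon,\ell} - \varphi^{\alpha}\| \to 0$, uniformly, as $\varepsilon \to 0$. More precisely, we have
$$|\lambda_\ell(\alpha) - \lambda_j| \le C \cdot \Bigl(\frac{C(\varepsilon)}{C(\varepsilon_0)}\Bigr)^{1/M_j}\ \ \forall \ell, \qquad \Bigl\|\sum_{\ell \in E_j(\alpha)} \varphi^{\alpha}_{\varepsilon,\ell} - \varphi^{\alpha}\Bigr\| \le C \cdot \Bigl(\frac{C(\varepsilon)}{C(\varepsilon_0)}\Bigr)^{1/M_j}. \tag{2.7}$$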
(2) If the coupling decays exponentially, then for each $j$ the map
$$\alpha \mapsto \sum_{\ell \in E_j(\alpha)} \lambda_\ell(\alpha) \tag{2.8}$$
is analytic. If the coupling decays polynomially with exponent $P > D$, then each map (2.8) is $(P - D)$-times differentiable. Analogous statements, in the appropriate topology, hold for the (sums of) eigenfunctions and eigenfunctionals.

Theorem 1(1) is obtained through perturbation theory [Ka], and it will be proved in Sect. 3. Theorem 1(2) is proved in Sect. 4, using information from Sect. 3.

Remark (Joint eigenvalues vs. joint spectrum). Since there is no obvious spectral decomposition of $\sigma^*$ acting on the Banach space $M$ (indeed, spectral projectors associated to the eigenvalues of $\sigma^*$ are not available), Theorem 1 does not give the full joint spectrum of the commuting pair $(L_\varepsilon^{n_0}, \sigma^*)$ on $M$, even restricting to $D$. We refer, e.g., to [MPR] for a list of definitions of joint spectrum of commuting operators in infinite dimension. If there are just two operators, say $L_\varepsilon^{n_0}$ and $\sigma^*$, the easiest to state is that of the Harte spectrum, given as follows: $(\nu, \lambda) \in \mathbb{C}^2$ belongs to the joint Harte spectrum of $(L_\varepsilon^{n_0}, \sigma^*)$ on $M$ if at least one of the two equations
$$(\sigma^* - \nu)A_1 + (L_\varepsilon^{n_0} - \lambda)A_2 = \mathrm{Id}, \qquad A_1(\sigma^* - \nu) + A_2(L_\varepsilon^{n_0} - \lambda) = \mathrm{Id}$$
has no bounded operator solutions $A_1$, $A_2$. For commuting finite matrices $B_1$, $B_2$, one defines the set of joint eigenvalues to be those $(\nu, \lambda) \in \mathbb{C}^2$ so that there is a vector $x$ with $B_1 x = \nu x$ and $B_2 x = \lambda x$ (see, e.g., [BB]). Extending this definition to bounded linear operators, Theorem 1 indeed describes the intersection of $D \times \mathbb{C} \subset \mathbb{C}^2$ with the set of joint eigenvalues of the commuting pair $(L_\varepsilon^{n_0}, \sigma^*)$ on $M$.

3. Existence of Eigenvalues: Elementary Perturbation Theory

We formulate some abstract results in an axiomatic setting: Let $(X, \|\cdot\|)$ be a complex Banach space, and let $\{T_\varepsilon,\ 0 \le \varepsilon \le \varepsilon_0\}$ be a family of bounded linear operators on $X$. We make the following two assumptions about $T_0$ and $T_\varepsilon$ for $\varepsilon \ge 0$: There is $\kappa_0 > 0$ so that the spectrum of $T_0$ decomposes as $\Sigma_0 \cup \Sigma_1$, with
$$\inf_{z_0 \in \Sigma_0,\, z_1 \in \Sigma_1} |z_0 - z_1| > \kappa_0. \tag{3.1}$$
$$\lim_{\varepsilon \to 0} \|T_\varepsilon - T_0\| = 0. \tag{3.2}$$
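The main result of this section follows:

Proposition 1 (Perturbation theory). Let $\{T_\varepsilon,\ \varepsilon \ge 0\}$ satisfy (3.1–3.2). Then there is $C > 1$ such that for all sufficiently small $\varepsilon > 0$, there is a decomposition of $\mathrm{sp}(T_\varepsilon)$ into $\Sigma_0^{\varepsilon} \cup \Sigma_1^{\varepsilon}$ such that

(1) $\inf_{z_0 \in \Sigma_0^{\varepsilon},\, z_1 \in \Sigma_1^{\varepsilon}} |z_0 - z_1| > \kappa_0/3$.

(2) Let $\pi_0^{\varepsilon} : X_0^{\varepsilon} \oplus X_1^{\varepsilon} \to X_0^{\varepsilon}$ be the projection associated with the spectral decomposition of $T_\varepsilon$. Then $\|\pi_0^{0} - \pi_0^{\varepsilon}\| \le C \cdot \|T_\varepsilon - T_0\|$.

(3) Let $X_i$ be the image of $X$ by the spectral projector of $T_0$ corresponding to $\Sigma_i$. The invariant space $X_0^{\varepsilon}$ may be written as the graph of a linear map $S_\varepsilon : X_0 \to X_1$, with $\|S_\varepsilon\| \le C \cdot \|T_\varepsilon - T_0\|$.

It follows from Proposition 1(2) that, if we assume additionally that
$$\dim(X_0) < \infty \quad \text{(in particular, $X_0$ is a generalised eigenspace)}, \tag{3.3}$$
then $\dim(X_0^{\varepsilon}) = \dim(X_0)$ is finite for all small enough $\varepsilon$. Clearly, statements (1) and (2) of Proposition 1 can be obtained from standard perturbation theory, see, e.g., [Ka, Sect. IV.3.5]. The third claim allows us to define $\widehat{T}_\varepsilon : X_0 \to X_0$ by
$$\widehat{T}_\varepsilon(x) = \pi_0^{0} \circ T_\varepsilon(x + S_\varepsilon(x)). \tag{3.4}$$
The map $\widehat{T}_\varepsilon$ will be convenient to show (2.7) in Theorem 1(1) (see the following Proposition 2), and also to prove Theorem 1(2) (see Sect. 4).

Proposition 2 (Finite-dimensional perturbation). Assume that $\{T_\varepsilon,\ \varepsilon \ge 0\}$ satisfies (3.1–3.3). Let $M$ denote the maximum algebraic multiplicity of $\lambda \in \mathrm{sp}(T_0|_{X_0})$. Then there is $C > 0$ so that for all small enough $\varepsilon$,

(1) For each $z_\varepsilon \in \mathrm{sp}(T_\varepsilon|_{X_0^{\varepsilon}})$ we have $\inf_{z_0 \in \mathrm{sp}(T_0|_{X_0})} |z_0 - z_\varepsilon| \le C \cdot \|T_\varepsilon - T_0\|^{1/M}$.

(2) If $x_j \in X_0$ is an eigenvector for $T_0$ with eigenvalue $\lambda_j$ and algebraic multiplicity $M_j$, then $T_\varepsilon$ has at most $M_j$ eigenvectors $x^{\varepsilon}_{j,\ell} \in X_0^{\varepsilon}$ with eigenvalues $\lambda^{\varepsilon}_{j,\ell}$ such that
$$|\lambda^{\varepsilon}_{j,\ell} - \lambda_j| \le C \cdot \|T_\varepsilon - T_0\|^{1/M_j}\ \ \forall \ell, \qquad \Bigl\|\sum_{\ell} x^{\varepsilon}_{j,\ell} - x_j\Bigr\| \le C \cdot \|T_\varepsilon - T_0\|^{1/M_j}.$$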
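The fractional exponent 1/M in Proposition 2 reflects the usual behaviour of eigenvalues attached to nontrivial Jordan blocks. A minimal sketch, using a hypothetical 2×2 example that is not taken from the paper: the unperturbed matrix is a Jordan block with eigenvalue `lam` (so M = 2), and the perturbation places `eps` in the lower-left corner, so the perturbation has norm `eps` while the eigenvalues move by the square root of `eps`.

```python
import cmath


def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] via the characteristic polynomial."""
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace * trace - 4 * det)
    return ((trace + disc) / 2, (trace - disc) / 2)


def jordan_block_shift(lam, eps):
    """Largest distance from lam to an eigenvalue of [[lam, 1], [eps, lam]]."""
    return max(abs(z - lam) for z in eigenvalues_2x2(lam, 1.0, eps, lam))


for eps in (1e-2, 1e-4, 1e-6):
    # The shift scales like eps ** (1/2), not like eps.
    print(eps, jordan_block_shift(0.5, eps), eps ** 0.5)
```

For a diagonalisable unperturbed matrix (M = 1) the same experiment would show a shift of order `eps`, matching the exponent 1/M with M = 1.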
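Proof of Theorem 1 (1). Taking $X = X_\alpha$ for some $\alpha \in [0, 2\pi)$ and applying Theorem 0 and Lemma 1, Assumptions (3.1–3.3) are satisfied for $T_0 = L_0^{n_0}|_{X_\alpha}$ and $T_\varepsilon = L_\varepsilon^{n_0}|_{X_\alpha}$. Proposition 2 gives (2.7) while Proposition 1 implies all other claims. □

We now prove Propositions 1 and 2. We let $\pi_i : X \to X_i$ ($i = 0, 1$) denote the projections on $X_i$, i.e., $x = \pi_0(x) + \pi_1(x) \in X_0 \oplus X_1$.

Proof of Proposition 1. (1) Let $\lambda \in \mathbb{C}$ satisfy $d(\lambda, \mathrm{sp}(T_0)) > \kappa_0/4$. To show that $\lambda \notin \mathrm{sp}(T_\varepsilon)$ (if $\varepsilon$ is small enough) it suffices to prove that the resolvent $R(\lambda, T_\varepsilon)$ is a bounded operator. If the following sum converges it coincides with the resolvent (see, e.g., [BY, (2.1)]):
$$R(\lambda, T_\varepsilon) = \sum_{j=0}^{\infty} \bigl(R(\lambda, T_0) \circ (T_\varepsilon - T_0)\bigr)^{j} \cdot R(\lambda, T_0). \tag{3.5}$$
It is easily seen that $\sigma_0 := \sup_{\{\lambda \mid d(\lambda, \mathrm{sp}(T_0)) > \kappa_0/4\}} \|R(\lambda, T_0)\| < \infty$. Therefore, it suffices to take $\varepsilon$ small enough so that $\|T_\varepsilon - T_0\| < 1/\sigma_0$. Setting
$$\Sigma_0^{\varepsilon} = \{z \in \sigma(T_\varepsilon) \mid d(z, \Sigma_0) < \kappa_0/3\}, \qquad \Sigma_1^{\varepsilon} = \{z \in \sigma(T_\varepsilon) \mid d(z, \Sigma_1) < \kappa_0/3\},$$
our Assumption (3.1) allows us to conclude.

(2) Let $D \subset \mathbb{C}$ contain $\Sigma_0 \cup \Sigma_0^{\varepsilon}$, be disjoint from a $\kappa_0/3$ neighbourhood of $\{0\} \cup \Sigma_1 \cup \Sigma_1^{\varepsilon}$, and be such that $\partial D$ is a union of rectifiable curves with finite total length (such a domain $D$ exists by (1)). Then we have
$$\pi_0 = \frac{1}{2i\pi} \oint_{\partial D} R(\lambda, T_0)\, d\lambda, \qquad \pi_0^{\varepsilon} = \frac{1}{2i\pi} \oint_{\partial D} R(\lambda, T_\varepsilon)\, d\lambda.$$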
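We may thus estimate $\|\pi_0 - \pi_0^{\varepsilon}\|$ by
$$\|\pi_0 - \pi_0^{\varepsilon}\| \le \frac{1}{2\pi} \oint_{\partial D} \|R(\lambda, T_0) - R(\lambda, T_\varepsilon)\|\, |d\lambda| \le \frac{1}{2\pi}\, \mathrm{length}(\partial D) \max_{\lambda \in \partial D} \|R(\lambda, T_0) - R(\lambda, T_\varepsilon)\|. \tag{3.6}$$
Using (3.5), we find
$$\|R(\lambda, T_0) - R(\lambda, T_\varepsilon)\| \le \sum_{j=1}^{\infty} \|R(\lambda, T_0)\|^{j+1} \cdot \|T_\varepsilon - T_0\|^{j}.$$
Statement (2) immediately follows from (3.2).

(3) To prove the last claim, take $\varepsilon$ small and $x$ in $X_0^{\varepsilon}$. Since $\|x - \pi_0(x)\| \le \|\pi_0^{\varepsilon} - \pi_0\|\, \|x\|$, it follows that if $x = (x_0, x_1) \in X_0 \oplus X_1$, then $\|x_1\| \ll \|x_0\|$. This inequality implies in particular that $\pi_0$ is injective on $X_0^{\varepsilon}$ so that $S_\varepsilon$ is well defined. The estimate on $S_\varepsilon$ follows from
$$\|x_1\| = \|x - \pi_0(x)\| \le C \|T_\varepsilon - T_0\|\, \|x\| \le C\, \frac{\|T_\varepsilon - T_0\|}{1 - C\|T_\varepsilon - T_0\|}\, \|x_0\|. \qquad \Box$$

Proof of Proposition 2. Using the map $\widehat{T}_\varepsilon : X_0 \to X_0$ from (3.4), we have for $x \in X_0$ with $\|x\| = 1$,
$$\|\widehat{T}_\varepsilon(x) - T_0(x)\| \le \|\pi_0\| \cdot \bigl(\|T_\varepsilon(x) - T_0(x)\| + \|T_\varepsilon(S_\varepsilon(x))\|\bigr) \le C \cdot \|\pi_0\| \cdot \bigl(\|T_\varepsilon - T_0\| + \|T_\varepsilon\| \cdot \|T_\varepsilon - T_0\|\bigr). \tag{3.7}$$
The assertions in Proposition 2 follow immediately. (See, e.g., [W, Chap. 2] and [Ka, II.5].) □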
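The Neumann series (3.5) used in the proof of Proposition 1 can be checked numerically in finite dimension. The following sketch is an illustration under stated assumptions: the 2×2 matrices `T0` and `T_eps` are hypothetical stand-ins for the operators, and the helper names are not from the paper. It compares the truncated series with the directly inverted resolvent.

```python
def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def mat_add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]


def mat_sub(A, B):
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]


def resolvent(lam, T):
    """R(lam, T) = (lam - T)^{-1} for a 2x2 matrix T."""
    a, b = lam - T[0][0], -T[0][1]
    c, d = -T[1][0], lam - T[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def neumann_resolvent(lam, T0, T_eps, n_terms):
    """Truncation of sum_j (R(lam,T0)(T_eps - T0))^j R(lam,T0) after n_terms terms."""
    R0 = resolvent(lam, T0)
    step = mat_mul(R0, mat_sub(T_eps, T0))
    power = [[1.0, 0.0], [0.0, 1.0]]
    total = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(n_terms):
        total = mat_add(total, power)
        power = mat_mul(power, step)
    return mat_mul(total, R0)


def max_entry_difference(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))


T0 = [[0.5, 0.0], [0.0, 0.2]]
T_eps = [[0.5, 0.01], [0.01, 0.2]]  # hypothetical small perturbation of T0
lam = 1.0                           # a point away from the spectrum of T0
print(max_entry_difference(neumann_resolvent(lam, T0, T_eps, 40), resolvent(lam, T_eps)))
```

The series converges here because the norm of the perturbation times the norm of the unperturbed resolvent is well below 1, which is the smallness condition imposed on ε in the proof.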
4. Smoothness of Eigenvalue Curves

Returning now to the setting of Sect. 2, we show here how the information from Sect. 3 can be combined with further crucial quantitative bounds on $L_\varepsilon^{n_0}$ (Proposition 3) to show Theorem 1(2). Recall from Sect. 2 that the distance $d(p, q)$ in the definition of the spatial coupling strength is either the Euclidean metric (exponential decay with rate $\xi < 1$) or the renormalized Euclidean metric (polynomial decay with exponent $P = \log(1/\xi)$) on the lattice.

Proposition 3. Let $n_0$ and $\varepsilon_0$ be given by Theorem 0. Let $\Pi_0$ be a spectral projector for a subset of a disc of radius strictly less than 1 in the spectrum of $L_0^{n_0}$. Then there is $C > 0$ so that for each $\varepsilon < \varepsilon_0$ and $k \in \mathbb{Z}^D$, writing $\Pi_0^{\varepsilon}$ for the corresponding spectral projector in the spectrum of $L_\varepsilon^{n_0}$, and setting $Z_k = \{\varphi \in M_\theta \mid \pi_\Lambda \varphi = 0,\ \forall \Lambda \text{ s.t. } k \notin \Lambda\}$, we have
$$\max\bigl( \|\pi_{\{0\}} \circ (\Pi_0^{\varepsilon})|_{Z_k}\|,\ \|\pi_{\{0\}} \circ (L_\varepsilon^{n_0} \circ \Pi_0^{\varepsilon})|_{Z_k}\| \bigr) \le C\, \xi^{d(0,k)}. \tag{4.1}$$
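Proof of Proposition 3. To simplify notation, we write $L_0$ and $L_\varepsilon$ instead of $L_0^{n_0}$ and $L_\varepsilon^{n_0}$ in this proof. We shall concentrate on bounding the second expression in the max; a simpler version of our arguments gives the other estimate. We shall represent $\Pi_0^{\varepsilon}$ as a suitable contour integral $\oint_{\partial D} \cdot\, d\lambda$ of the resolvent $R(\lambda, L_\varepsilon)$ and use (3.5) again:
$$R(\lambda, L_\varepsilon) = \sum_{j=0}^{\infty} \bigl(R(\lambda, L_0) \circ (L_\varepsilon - L_0)\bigr)^{j} \circ R(\lambda, L_0).$$
The resolvent $R(\lambda, L_0)$ preserves $Z_k$, and we shall in fact show that for $j = 0, 1, \ldots$:
$$\bigl\|\pi_{\{0\}} \circ L_\varepsilon \circ \bigl(R(\lambda, L_0) \circ (L_\varepsilon - L_0)\bigr)^{j}\big|_{Z_k}\bigr\| \le \vartheta^{-1} \gamma^{-n_0}\, \xi^{d(0,k)} \bigl(\sigma_0 \gamma^{-n_0} (C(\varepsilon)/C(\varepsilon_0))\bigr)^{j}, \tag{4.2}$$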
where σ0 = supλ∈∂ D R(λ, L0 ) and γ > 1 is again the constant from TR in [R, Definition 4.18]. Let us first consider j = 1. For a finite subset of our lattice, we consider the operator π ◦ R(λ, L0 ) ◦ (L − L0 ). By Lemma A.1 from the appendix, we may write π ◦R(λ, L0 ) = R(λ, L )◦π , where R(λ, L ) is the (bounded) resolvent of L acting on H . The operator L is defined in [R, Sect. 4.7] through a sum of configurational operators π ◦ L = C ∈C [,n0 ] L [C] : M → H . We may therefore write: π ◦ R(λ, L0 ) ◦ (L − L0 ) = R(λ, L ) ◦
L [C],
C ∈C ∗ [,n0 ]
where C ∗ [, n0 ] represents the configurations which do not consist only of initial-leaves and end-leaves (see the proof of Theorem 0 (2) in the Appendix). The bounds on the configurational operators (see also the appendix) imply furthermore that C ∈C ∗ [,n0 ]
R(λ, L ) ◦ L [C] ≤
C() R(λ, L0 ) C(0 )
−||
.
The proof of this bound is perhaps slightly trickier than it looks. The reason is that the a priori bounds for R(λ, L ) are useless here. Instead one should use the same expansion as in the proof of Lemma A.1, integrate each term in the resolvent separately (i.e., each term in U<m and U≥m ), and combine with the configurational expansion of L − L0 . The terms in this expansion of the projector involve in particular projections to subsets J of . Therefore, when composing with a configurational operator L [C] such a term vanishes unless C is a configuration over J (we refer to [R] for the terminology). Summing up what remains yields the above inequality. When acting on the kernel Ker Leb = {ϕ | π∅ = 0} of the Lebesgue measure and taking the spatial decay of the couplings into account, more can be said. First, the “size” [R, Def. 4.11, Lemma 4.25] of each configuration must be at least n0 in order to get a non-vanishing contribution. Second, we may introduce an “interaction-radius”, rad(C), of a configuration, C: First, we map C into a collection of trees yp , p ∈ [R, Sect. 4.3, Def. 4.9]. A branching (in a tree) of a point q into a set K is associated with the interaction-radius, rad(q, K) = max{d(q, r) | r ∈ K ∪ {q}} and we define rad(C) to be the sum of interaction-radii over all branchings and all trees.
As in the proofs of temporal [R, Lemma 4.25] and spatial [R, Sect. 5.1] decay, we then obtain the bound, C() R(λ, L0 ) −|| . R(λ, L ) ◦ L [C]|Ker Leb ξ −rad(C ) · γ n0 ≤ C( 0) ∗ C ∈C [,n0 ]
We wish to iterate this type of argument when taking powers j ≥ 2 of the operator difference composed with the resolvent. (The case j = 0 is easier and left to the reader.) In order to do so, we note that both L − L0 and R(λ, L0 ) map the kernel of Lebesgue measure into itself ([R, Lemma 4.25] and the fact that no projector involved includes the eigenvalue 1). In the product, π ◦ L ◦ (R(λ, L0 ) ◦ (L − L0 ))j , we introduce the configurational expansion in each factor, L − L0 . We then obtain a sequence of finite subsets, 0 ≡ ⊂ 1 ⊂ · · · ⊂ j and configurations Ci ∈ C ∗ [i , n0 ], i = 0, . . . , j − 1, where each configuration expands i into i+1 . It is important here to notice that each occurrence of R(λ, L0 ) does not expand a given finite subset (since πK ◦ R(λ, L0 ) = R(λ, LK ) ◦ πK ). When acting on Zk we note that non-vanishing contributions can only occur provided k ∈ j , i.e., the very last expansion of the original set, , has to contain the distinguished point, k. This, in turn, implies that the sequence of configurations must include a sequence of “trees” which “connects” by a path (of branchings) the point k with a point in . In the product we therefore obtain factors of ξ raised to the power rad(C0 ) + · · · rad(Cj −1 ) ≥ d(k, ). This finally implies π ◦ L ◦ (R(λ, L0 ) ◦ (L − L0 ))j |Zk ≤
−|| −n0
γ
(σ0 γ −n0 (C()/C(0 )))j ξ d(k,) .
The claim (4.2) now follows if for the set we take the origin of our lattice. To end the proof of Proposition 3, just note that j (σ0 γ −n0 C()/C(0 ))j remains uniformly bounded for small . ! " Proof of Theorem 1 (2). To simplify notation, we write L0 and L instead of Ln0 0 and Ln 0 in this proof. We shall use the notation from Sect. 3, taking again T0 = L|Xα T = L |Xα , and X = Xα with the norm induced by M , writing also X0α = X0 , X1α = X1 . Note that π0 = π0α and π1 = π1α are just the restrictions to X α of the spectral projections @0 , @1 of L0 : M → M associated to the partition D ∪ (C \ D). Theorem 0 (2) guarantees that the image of π0α is a generalised eigenspace, and since D is disjoint from a neighbourhood of R, Lemma 1 says that this eigenspace is finite-dimensional and in bijection with the eigenspace associated to Ln0 and D. Using also Theorem 0 (3), we see that Conditions (3.1–3.3) are satisfied. Note also that πiα ≤ @i for i = 0, 1 and all α. We work with the finite-dimensional operator T = T (α) from (3.4). Since T (x0 ) = π0 T (x0 + S (x0 )) = λx0 if x0 + S (x0 ) is an eigenvector of T for the eigenvalue λ, the relevant eigenvalues λα of L |Xα are just the eigenvalues of any finite matrix representing T (α) on the finite-dimensional space X0α . Therefore, to prove our claim it suffices (by classical results, see, e.g., [W, Chapter 2], [DS, Theorem VII.6.9]) to check that the coefficients of such a matrix depend analytically (respectively differentiably) on α ∈ [0, 2π). By Lemma 1, these coefficients may be indexed by the following finite basis of generalised eigenvectors of L0 in X α (for eigenvalues in D): ϕqα = k∈ZD eiαk ϕq ◦ σ k , ϕq (z) = ϕˆq (z0 ) ⊗>∈ZD \{0} h(z> ), (4.3) {ϕˆq } a basis of generalised eigenvectors for L for eigenvalues in D.
Just like for the more precise definition for (2.6) given after Lemma the expression 1, iαk in (4.3) means that for each ⊂ ZD we define (ϕqα )| = e ϕq,k, with k∈ ϕq,k, (z) = ϕˆq (zk ) ⊗>∈\{k} h(z> ) (note that S 1 ϕˆq dm = 0). (Recall also that we in fact restrict to D = 1 for simplicity, the extension to the general case is straightforward, with α in a D-dimensional torus and using that 1/(xt (1 + x)P ) is integrable on RD if and only if t > D − P .) To compute these coefficients, we may also use a finite basis of generalised eigenfunctionals of L∗0 in (X α )∗ νn = νˆ n ⊗k∈ZD \{0} dLeb, {ˆνn } basis of eigenvectors for L∗ for eigenvalues in D. Note that Proposition 1 may be applied to the spectral decomposition of Ln0 0 associated to D ∪ (C \ D) on M , giving projectors @i for L . Recall from Sect. 3 that S = S (α) : X0α → X1α is such that X0α, = @0 (X0α ) = X0α + S (X0α ). Now, the operator π0 is invertible when acting on the finite-dimensional space π0 (X0α ). (In fact, π0 (π0 x) = π0 (x) + (π0 − π0 )(π0 x) so that π0 |π0 (X0α ) is a small perturbation of the identity.) Thus, we may define Q = Q (α) : X0α → π0 (X0α ) by Q (α) = Id + S (α) = (π0 |π0 (X0α ) )−1 .
(4.4)
For each fixed α, the coefficient T,nq (α) of the matrix of T in the chosen bases can then be expressed as: T,nq (α) = νn L Q (4.5) eiαk ϕq ◦ σ k . k∈ZD
Denoting by Qrq (α) the coefficients of the matrix of Q in the bases given by (4.3) and its image under π0 , we finally get (the sum over r is finite) Qrq (α)π0 eiα> (ϕr ◦ σ > ) T,nq (α) = νn L =
r
r
Qrq (α)
>∈ZD
eiα> νn (L @0 )(ϕr ◦ σ > ).
>∈ZD
Formally, we may write the derivatives of the interior sum over > as: d s iα> > e ν (L @ )(ϕ ◦ σ ) = >s eiα> νn (L @0 )(ϕr ◦ σ > ). n r 0 dα s D D >∈Z
>∈Z
By definition (see also the remarks after (4.3)), ϕr ∈ Z0 so that ϕr ◦ σ > ∈ Z> . The upper bounds given by Proposition 3, ξ |>| ϕr , for exponential coupling > |νn (L @0 )(ϕr ◦ σ )| ≤ C · −P (1 + |>|) ϕr , for polynomial coupling, are good enough for our purposes. By the Leibniz formula, it only remains to prove that each coefficient map α → Qrq (α) is analytic (respectively P times differentiable). It is equivalent to prove the
qr (α) of the inverse map Q (α)−1 = π0 |π (Xα ) . We same claim for the coefficients Q 0 0 have ds Qqr (α) = k s eiαk νq @0 (ϕr ◦ σ k ), s dα D k∈Z
so that our task reduces to finding good bounds for |νq @0 (ϕr ◦ σ k )| for all k ∈ Z. But this is again Proposition 3. ! "
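The mechanism behind Theorem 1(2) is that a Fourier series in α whose coefficients decay exponentially in the lattice distance defines an analytic function of α. As an illustration only, the sketch below uses the hypothetical model coefficients ξ^|ℓ| (these are not the actual matrix coefficients of the proof) and checks that the series sums to the Poisson kernel (1 − ξ²)/(1 − 2ξ cos α + ξ²), which is analytic in α for 0 < ξ < 1.

```python
import cmath
import math


def truncated_series(alpha, xi, cutoff):
    """Sum over |l| <= cutoff of xi^|l| e^{i alpha l}."""
    return sum((xi ** abs(l)) * cmath.exp(1j * alpha * l) for l in range(-cutoff, cutoff + 1))


def poisson_kernel(alpha, xi):
    """Closed form of the full series for 0 < xi < 1."""
    return (1 - xi ** 2) / (1 - 2 * xi * math.cos(alpha) + xi ** 2)


alpha, xi = 0.7, 0.5
print(abs(truncated_series(alpha, xi, 60) - poisson_kernel(alpha, xi)))
```

With polynomially decaying coefficients the termwise differentiated series only converges for finitely many derivatives, which corresponds to the finite differentiability in the polynomial case.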
Appendix Proof of Theorem 0. The proof of (1) may be found in [R], so that we first concentrate on (2). We recall some results from [R, App. B]: For each finite ⊂ ZD , the tensor product ⊗p∈ Hp is dense in H . Also, if J ⊂ then a bounded operator TJ on HJ has a natural norm-preserving extension to a bounded operator TJ acting on H (see Theorem B.4 in [R, App. B]). The extended operator may be defined as TJ (φJ ⊗ φ\J ) ≡ (TJ φJ ) ⊗ φ\J when acting on a direct tensor product and extended by continuity. The truncated unperturbed Perron–Frobenius operator L acting on H is itself a tensor product, L = ⊗p∈ Lp , of the operators acting at the individual sites (here, for simplicity, we do not write L p for the extension of L : H → H ). In the rest of this proof we write L instead of Ln0 (and L0 , L for Ln0 0 , Ln 0 ) in order to simplify notation, and we let Pp = hp ⊗ mp denote the principal eigenprojection (which yields the single-site SRB measure) of Lp . Thus, Lp ◦ Pp = Pp ◦ Lp = Pp . Denoting PJ = ⊗p∈J Pp and (1 − P )J = ⊗p∈J (1 − Pp ), let us introduce another extension of TJ : HJ → HJ through Ext (TJ ) = TJ ⊗ P\J = TJ P\J .
We end these preliminaries by a bound for J ⊂,|J |=k Ext ((1 − P )J ) ◦ π . For this, we start by noting that for J ⊂ , Ext ((1 − P )J ) = (1 − Pp ) ⊗ P\J = (−1)|I | PI ⊗ P\J . p∈J
I ⊂J
Next, define for k ∈ N the constants αk = αk ( , ch ) = max>≥0 k> (ch +1)k (ch )>−k , where ch > 1 is associated as above to the single-site operator L (equivalently, to Ln0 ) and 0 < < 1 is chosen as in Sect. 2 (see also [R, Theorem 2.1, Lemmas 4.20 and 4.25]). Since ch < 1 these constants are finite, though not uniformly bounded in k. Using again [R, Theorem B.4], we get, recalling also the definition of ch ,the estimate (with K = J \ I ), || Ext ((1 − P )J ) ◦ π φ J ⊂,|J |=k
≤
||
PI ⊗ P\J ◦ π φ
J ⊂,|J |=k I ⊂J
≤
||
J ⊂,|J |=k
|\J |
ch
K⊂J
|J \K|
ch
φK
≤
||
J ⊂,|J |=k
≤
|\J |
ch
(ch )
K⊂J
|\J |
J ⊂,|J |=k
≤
−|K| |J \K| ch φ
(ch )|J \K| φ
K⊂J
(ch
|J |
+ 1) (ch )|\J | φ
J ⊂,|J |=k
≤ αk φ .
(A.1)
We will use the following lemma: Lemma A.1. For λ ∈ / =0 , λ − L0 is invertible in M . We have the following expansion for the marginals of the resolvent: π ◦ R(λ, L0 ) = Ext (R(λ, LJ )(1 − P )J ). J ⊂
Proof of Lemma A.1. The identity mp Lp = mp and the fact that L is a tensor product implies for K ⊂ that πK, L = LK πK, . When λ ∈ / =0 then the definition of the resolvent R(λ, L ) = (λ − L )−1 yields πK, R(λ, L ) = R(λ, LK )πK, . The properties of Pp also ensure that L Ext ((1−P )K ) = Ext (LK (1−P )K ) and therefore also R(λ, L )Ext ((1 − P )K ) = Ext (R(λ, LK )(1 − P )K ). (A.2) Note that this is consistent with R(λ, L )P = (λ − 1)−1 P since for K = ∅ we define L∅ and (1 − P )∅ : C → C to be the identity and Ext (Id) = P . Our goal is to extend the family of resolvents R(λ, L ) to a bounded operator acting on the space M . We may clearly define R(λ, L ) ◦ π : M → H . This family is projective but a priori it is not clear that it is bounded in the M norm. Let us first note that 1 = ⊗p∈ (Pp + (1 − P )p ) = J ⊂ Ext ((1 − P )J ). Acting on each individual term with the resolvent of L we obtain the right-hand side in Lemma A.1. In general, however, this expansion is not going to be M convergent so we need to be more careful. Instead we choose m = m(|λ|) ∈ N such that |λ| > ηm and we split the decomposition of 1 just introduced, into two sums, 1 = Ext ((1 − P )J ) + Ext ((1 − P )J ). J ⊂,|J |<m
(A.3)
(A.4)
J ⊂,|J |≥m
We shall deal with these two terms separately. For the first term we note that by (A.2) we get, using again [R, Theorem B.4], the bound R(λ, L )Ext ((1 − P )J ) ◦ π ≤ R(λ, LJ ) Ext ((1 − P )J ) ◦ π . Here R(λ, LJ ) = R(λ, L⊗|J | ) depends on J only through its cardinality. The operator, U<m = R(λ, L ) Ext ((1 − P )J ) =
J ⊂,|J |<m
J ⊂,|J |<m
Ext (R(λ, LJ )(1 − P )J ),
(A.5)
is clearly bounded on H . If we combine with (A.1) we see that we also have the norm bound, m−1 || U<m ◦ π ≤ R(λ, L⊗k ) · αk ( , ch ), (A.6) k=0
uniformly in . By commutativity of the involved operators we also have (λ − L )U<m = U<m (λ − L ) = Ext ((1 − P )J ).
(A.7)
J ⊂,|J |<m
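For the second term in (A.4) we shall use the following “large deviations” bound (see Lemma 4.15 in [R] for a similar computation): if $a, b > 0$ and $\gamma > 1$ are so that $\gamma a + b \le 1$, then
$$\sum_{J \subset \Lambda,\, |J| \ge m} a^{|J|} b^{|\Lambda \setminus J|} \le \sum_{J \subset \Lambda} a^{|J|} b^{|\Lambda \setminus J|} \gamma^{|J| - m} \le (\gamma a + b)^{|\Lambda|} \gamma^{-m} \le \gamma^{-m}. \tag{A.8}$$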
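The chain of inequalities in (A.8) can be verified numerically for a finite set. In the sketch below the sum over subsets J of a set with `n_sites` elements is grouped by cardinality using binomial coefficients; the parameter values are hypothetical and chosen only so that γa + b ≤ 1 holds.

```python
from math import comb


def tail_sum(a, b, n_sites, m):
    """Sum over subsets J of an n_sites-element set with |J| >= m of a^|J| b^(n_sites - |J|)."""
    return sum(comb(n_sites, j) * a ** j * b ** (n_sites - j) for j in range(m, n_sites + 1))


def large_deviation_chain(a, b, gamma, n_sites, m):
    """Return the three quantities compared in (A.8), from left to right."""
    left = tail_sum(a, b, n_sites, m)
    middle = (gamma * a + b) ** n_sites * gamma ** (-m)
    right = gamma ** (-m)
    return left, middle, right


left, middle, right = large_deviation_chain(a=0.2, b=0.5, gamma=2.0, n_sites=10, m=3)
print(left <= middle <= right)
```

The point of the bound is that the right-hand side does not depend on the size of the finite set, which is what makes the estimates uniform in the lattice truncation.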
Taking γ > 1 such that condition TR in [R, Def. 4.18], holds, as before, our choices above for (see [R, Lemma 4.20]) and n0 (see [R, Lemma 4.21], recalling that the notation η represents in fact ηn0 ) ensure that cr γ η(1 + ch ) + ch ≤ 1. Then the “large deviations” bound gives (note the extra factor of η>−1 ), (cr η> (1 + ch ))|J | (ch )|\J | ≤ η>m (γ η)−m . (A.9) J ⊂,|J |≥m
Hence, using L> Ext ((1 − P )J ) ◦ π = Ext (L>J (1 − P )J ) ◦ π
≤ (cr η> )|J | Extλ ((1 − P )J ) ◦ π ,
we get by a calculation similar to (A.1) || L> Ext ((1 − P )J ) ◦ π ≤ η>m (γ η)−m .
(A.10)
J ⊂,|J |≥m
Therefore, since |λ| > ηm , the following sum is absolutely convergent on H (but the bound depends on the size of ): A=
∞ 1 >+1 λ >=1
J ⊂,|J |≥m
L> Ext ((1 − P )J ).
More precisely, the above calculations show that || A ◦ π : M → H is bounded in norm by ∞ 1 1 η>m (γ η)−m = m , γ |λ| |λ| − ηm |λ|>+1 >=1
uniformly in . Finally, using (A.4), we find (corresponding to the “missing term” > = 0 in A) that 1 || 1 Ext ((1 − P )J ) ◦ π = || (1 − Ext ((1 − P )J )) ◦ π , λ λ J ⊂,|J |≥m
J ⊂,|J |<m
which by (A.1) is bounded by m−1
1 (1 + αk ), |λ| k=0
again uniformly in . We may therefore define a bounded linear operator U≥m : H → H through ∞ 1 ≥m U = L> Ext ((1 − P )J ). λ>+1 J ⊂,|J |≥m
>=0
Now,
|| U ≥m
◦ π is absolutely bounded by m−1
1 1 1 ( m + 1 + αk ). |λ| γ |λ| − ηm
(A.11)
k=0
Uniform convergence and continuity allow us to interchange the sum and the action of the operator to obtain Ext ((1 − P )J ). (A.12) (λ − L )U≥m = U≥m (λ − L ) = J ⊂,|J |≥m
Combining the above, in particular (A.7) and (A.12) we see that by setting U = U<m + U≥m ,
(A.13)
we obtain a bounded linear operator on H for which U (λ − L ) = (λ − L )U = 1 , thus showing that U ≡ R(λ, L ) is indeed the resolvent of L . In particular, ||
R(λ, L ) ◦ π
is uniformly bounded in . As already mentioned, the family R(λ, L ) ◦ π : M → H is projective. Hence, if we set (U φ) ≡ R(λ, L )φ , then U is a linear operator on M bounded in norm by the sum of (A.6) and (A.11). Finally π ◦ U ◦ (λ − L0 ) = U ◦ (λ − L ) ◦ π = π and π ◦ (λ − L0 ) ◦ U = (λ − L ) ◦ U ◦ π = π , demonstrating that U is precisely the resolvent R(λ, L0 ) which is therefore a bounded linear operator on M . The finite marginals, π ◦R(λ, L0 ), may be computed from the expansion of the identity which completes the proof of Lemma A.1. ! " End of the proof of Theorem 0. To end the proof of the unperturbed case (2), we must prove the claim on the image of the spectral projection. (The fact that every λ ∈ =0 is an eigenvalue of L0 on M is obvious, but not sufficient to prove the claim.) For this, we apply Theorem VII.3.24 in [DS] and need to check that every nonzero eigenvalue λ0 of L0 on M is a pole of finite order of the operator-valued resolvent function λ → R(λ, L0 ). (See also [DS, Theorem VII.3.18] and its proof.) Clearly, it suffices to
show that there is >(λ0 ) such that for each finite subset ⊂ ZD , the point λ0 is a pole of finite order at most >(λ0 ) of the operator-valued resolvent function λ → R(λ, L ), and that the projective family of operator-valued residues Wλ0 of this pole define a bounded linear operator W λ0 on M by setting (W λ0 (φ)) = Wλ0 φ . The first claim follows from the fact that the index > of the eigenvalue λ0 = λ1 · · · λm of L is the same as the index of λ0 for L if λ0 is also an eigenvalue of L (e.g. if ⊂ ). The second can be obtained by large deviations arguments similar to those used above (the combinatorics is slightly more involved), using m ≥ 1 so that ηm+1 < |z| < ηm . The proof of (3) is short but we need to go a bit deeper into the construction in [R]. Let 0 < ≤ 0 . The generating functions unp (γ ) [R, Def. 4.16] defined for each fixed 0 ≤ ≤ 0 verify the following bound [R, Lemma 4.20] for all γ > 1, close enough to 1: unp (1) ≤ unp (γ ) ≤ A + B C() ≤
−1
.
Here A and B are positive constants, related to the single-site unperturbed operators (initial- and end-leaves) and the couplings (branchings), respectively and C() is the function defined in our Theorem 0 above. (The bound in [R, Lemma 4.20] is stated for unp (γ ) with some γ > 1, but the functions unp (γ ) are monotone increasing). The Perron–Frobenius operator when projected to H is bounded by [R, Lemma 4.22]: π ◦ L ≤
p∈
unp (1) ≤
(A + B C()) ≤
−||
.
p∈
If we now consider the perturbation alone, then in the proof of [R, Lemma 4.22] we should omit all contributions associated with a product of trees related to the unperturbed operator (trees consisting of initial-leaves and end-leaves only [R, Def. 4.9, Lemma 4.14]). This amounts to subtracting from the above expression precisely the contributions coming from such product trees, in other words π ◦ (L − L0 ) ≤
(A + B C()) −
p∈
A.
(A.14)
p∈
Expanding the product we see that at least one factor of C() occurs in each term on the right-hand side. On the other hand, we know that each factor, (A + B C(0 )), (note here the occurrence of 0 rather than ) is bounded by −1 . Hence, applying a large deviation argument similar to (A.8) (with m = 1 and γ = C(0 )/C()), we see that ||
π ◦ (L − L0 ) ≤ C()/C(0 ),
from which we obtain the desired bound.
(A.15)
" !
Acknowledgement. Both authors thank David Ruelle and the I.H.É.S. where this project was started during a visit in 1998. Support by the PRODYN programme of the European Science Foundation is also gratefully acknowledged. V.B. was partially supported by the Fonds National Suisse de la Recherche Scientifique and is grateful to IMPA for its hospitality in the final phase of this work. We are grateful to the referee for many thoughtful comments which improved the presentation.
References
[AC] Allaire, G. and Conca, C.: Bloch wave homogenization and spectral asymptotic analysis. J. Math. Pures Appl. 77, 153–208 (1998)
[B] Baladi, V.: Positive transfer operators and decay of correlations. Singapore: World Scientific, 2000
[BIJ] Baladi, V., Degli Esposti, M., Isola, S., Järvenpää, E. and Kupiainen, A.: The spectrum of weakly coupled map lattices. J. Math. Pures Appl. 77, 539–584 (1998)
[BY] Baladi, V. and Young, L.-S.: On the spectra of randomly perturbed expanding maps. Commun. Math. Phys. 156, 355–385 (1993); (see also Erratum, Commun. Math. Phys. 166, 219–220 (1994))
[BB] Bhatia, R. and Bhattacharyya, T.: A Henrici theorem for joint spectra of commuting matrices. Proc. Am. Math. Soc. 118, 5–14 (1993)
[BK1] Bricmont, J. and Kupiainen, A.: Coupled analytic maps. Nonlinearity 8, 379–396 (1995)
[BK2] Bricmont, J. and Kupiainen, A.: High temperature expansions and dynamical systems. Commun. Math. Phys. 178, 703–732 (1996)
[Bu] Bunimovich, L.A.: Coupled map lattices: One step forward and two steps back. Phys. D (Chaos, order and patterns: aspects of nonlinearity – the “gran finale”, Como 1993) 86, 248–255 (1995)
[BS] Bunimovich, L.A. and Sinai, Ya.G.: Spacetime chaos in coupled map lattices. Nonlinearity 1, 491–516 (1988)
[DS] Dunford, N. and Schwartz, J.T.: Linear Operators, Part I. New York: Wiley-Interscience (Wiley Classics Library), 1988
[FKT] Feldman, J., Knörrer, H., and Trubowitz, E.: Perturbatively unstable eigenvalues of a periodic Schrödinger operator. Comment. Math. Helv. 66, 557–579 (1991)
[FR] Fischer, T. and Rugh, H.H.: Transfer operators for coupled analytic maps. Ergodic Theory Dynam. Syst. 20, 109–144 (2000)
[J] Jiang, M.: Equilibrium states for lattice models of hyperbolic type. Nonlinearity 8, 631–659 (1995)
[JP] Jiang, M. and Pesin, Ya.B.: Equilibrium measures for coupled map lattices: Existence, uniqueness and finite-dimensional approximations. Commun. Math. Phys. 193, 675–711 (1998)
[JS] Jordan, D.W. and Smith, P.: Nonlinear ordinary differential equations. An introduction to dynamical systems. Third edition. Oxford: Oxford University Press, 1999
[K] Kaneko, K. (ed.): Theory and applications of coupled map lattices. Chichester: J. Wiley & Sons, 1993
[Ka] Kato, T.: Perturbation theory for linear operators. (Reprint of the 1980 edition) Berlin: Springer-Verlag, 1995
[KK] Keller, G. and Künzle, M.: Transfer operators for coupled map lattices. Ergodic Theory Dynam. Syst. 12, 297–318 (1992)
[MPR] McIntosh, A., Pryde, A. and Ricker, W.: Comparison of joint spectra for certain classes of commuting operators. Studia Math. LXXXVIII, 23–36 (1988)
[MM] Maes, C. and Van Moffaert, A.: Stochastic stability of weakly coupled lattice maps. Nonlinearity 10, 715–730 (1997)
[PS] Pesin, Ya.B. and Sinai, Ya.G.: Space-time chaos in chains of weakly interacting hyperbolic mappings. In: Dynamical systems and statistical mechanics. Moscow, 1991; Providence, RI: Am. Math. Soc., 1991
[R] Rugh, H.H.: Coupled maps and analytic function spaces. Preprint (2000), submitted for publication
[V] Volevich, V.L.: Construction of an analogue of the Bowen–Ruelle–Sinai measure for a multidimensional lattice of interacting hyperbolic mappings. Mat. Sb. 184, 17–36 (1993)
[W] Wilkinson, J.H.: The Algebraic Eigenvalue Problem. London: Oxford University Press, 1965
Communicated by A. Kupiainen
Commun. Math. Phys. 220, 583 – 621 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Anderson Localization for Schrödinger Operators on Z with Potentials Given by the Skew–Shift
Jean Bourgain1, Michael Goldstein1,*, Wilhelm Schlag2
1 Institute for Advanced Study, Olden Lane, Princeton, NJ 08540, USA. E-mail: [email protected]; [email protected]
2 Department of Mathematics, Princeton University, Fine Hall, Princeton, NJ 08544, USA. E-mail: [email protected]
Received: 19 September 2000 / Accepted: 15 February 2001
Dedicated to Yakov G. Sinai on the occasion of his 65th birthday
Abstract: In this paper we study one-dimensional Schrödinger operators on the lattice with a potential given by the skew shift. We show that Anderson localization takes place for most phases and frequencies and sufficiently large disorders.
1. Introduction

In this paper we study the positivity of the Lyapunov exponent, the regularity of the integrated density of states, and the nature of the spectrum for the Schrödinger operators

$$H_{\omega,(x,y)}\psi_n = -\psi_{n+1} - \psi_{n-1} + v(T_\omega^n(x,y))\,\psi_n \quad \text{on } \ell^2(\mathbb{Z}), \tag{1.1}$$

where $T_\omega(x,y) = (x+y,\, y+\omega)$ (mod 1) is the skew-shift on the two-dimensional torus $\mathbb{T}^2$. The number $\omega$ will be assumed to be Diophantine. The study of families of Schrödinger operators with potentials that are in some sense random has a long and rich history, starting with the famous work by P. Anderson [1]. It is not our intention to review this subject, as some of the history as well as many references can be found in [7]. Furthermore, the methods in this paper have little overlap with the work that has been done on the purely random case. Our approach is motivated by the recent works [3] and [7].

The main results in this paper are as follows. Fix a nonconstant real-analytic function $v_0$ on $\mathbb{T}^2$ and some small $\varepsilon > 0$. Then there exists a set $\Omega_\varepsilon \subset \mathbb{T}$ with $\mathrm{mes}[\mathbb{T} \setminus \Omega_\varepsilon] < \varepsilon$, and a large constant $\lambda_0(\varepsilon, v_0)$ so that for any $\omega \in \Omega_\varepsilon$ and $\lambda \ge \lambda_0$, the equation (1.1) with $v = \lambda v_0$ has the following properties:

* Permanent address: Dept. of Mathematics, University of Toronto, Toronto, Ontario, Canada M5S 1A1
• The Lyapunov exponents of (1.1) are positive for all energies, see Prop. 2.11.
• The integrated density of states is continuous with modulus of continuity $h(t) = \exp\big(-c|\log t|^{\frac{1}{24}-}\big)$, see Prop. 2.13.
• The operators (1.1) display Anderson localization, i.e., there exists $\tilde\Omega_\varepsilon \subset \mathbb{T}^2$ with $\mathrm{mes}[\mathbb{T}^2 \setminus \tilde\Omega_\varepsilon] < \varepsilon$ so that for all $(x,y) \in \tilde\Omega_\varepsilon$ the spectrum is pure point and the eigenfunctions decay exponentially, see Theorem 3.7.

2. A Large Deviation Theorem for the Monodromy Matrices and Positivity of the Lyapunov Exponents for Large Disorder

Consider the Schrödinger operator (1.1), where $v$ is a trigonometric polynomial, say. An important example is $v(x,y) = \cos(2\pi x)$. Any solution of (1.1) is of the form

$$\begin{pmatrix}\psi_{n+1}\\ \psi_n\end{pmatrix} = M_n(x,y;E)\begin{pmatrix}\psi_1\\ \psi_0\end{pmatrix},$$

where $M_n(x,y;E) = \prod_{j=n}^{1} A_j(x,y;E)$ with ($T = T_\omega$ for simplicity)

$$A_j(x,y;E) = \begin{pmatrix} v(T^j(x,y)) - E & -1\\ 1 & 0\end{pmatrix}. \tag{2.1}$$

The matrix $M_n(x,y;E)$ is called the fundamental, or monodromy matrix of Eq. (1.1). As usual,

$$L_n(E) = \int_{\mathbb{T}^2} \frac{1}{n}\log\|M_n(x,y;E)\|\,dx\,dy$$

and $L(E) = \lim_{n\to\infty} L_n(E) = \inf_n L_n(E)$ denotes the Lyapunov exponent. Clearly, $L(E) \ge 0$ for all $E$. Kingman's subadditive ergodic theorem asserts that

$$\frac{1}{n}\log\|M_n(x,y;E)\| \to L(E) \quad \text{for a.e. } (x,y) \in \mathbb{T}^2 \text{ as } n \to \infty.$$

A more quantitative version of this convergence statement will be of particular importance in this paper. In fact, the goal of this section is to prove an estimate of the form
$$\sup_E\, \mathrm{mes}\Big[(x,y) \in \mathbb{T}^2 \;\Big|\; \Big|\frac{1}{n}\log\|M_n(x,y;E)\| - L_n(E)\Big| > n^{-\sigma}\Big] \le C\exp\big(-n^{\sigma}\big) \tag{2.2}$$

for all positive integers $n$ and some constant $\sigma > 0$, see Prop. 2.11 below for a more precise statement. These so-called “large deviation estimates” have been of central importance in some recent papers by the authors, see [3, 7], and [4]. They are a key ingredient in the proof of localization in [3] on the one hand, and are essential for proving regularity of the density of states as well as positivity of the Lyapunov exponent in [7]. The Schrödinger equations considered in [3] and [7] were of the form (1.1) with $T$ given by the shift rather than the skew-shift, i.e., $T(x,y) = (x+\omega_1, y+\omega_2)$ (mod $\mathbb{Z}^2$) in the case of two dimensions. We want to emphasize that the methods from these papers do
not directly apply to the skew-shift and a completely new approach was required for the proof of Prop. 2.11 below. To understand the difficulty introduced by the skew-shift, let us briefly review some basic aspects of the techniques underlying the proof of the large deviation estimates in [3] for the case of the shift. Firstly, the map

$$u_n(z_1,z_2) = \frac{1}{n}\log\|M_n(z_1,z_2;E)\| \tag{2.3}$$
extends to a subharmonic function on a complex neighborhood of $\mathbb{T}^2$. Moreover, these subharmonic functions are bounded in that neighborhood uniformly in $n$. Using the standard Riesz representation for subharmonic functions one obtains the decay of the Fourier coefficients

$$|\hat u_n(\ell_1,\ell_2)| \le \frac{C}{|\ell_1| + |\ell_2| + 1} \tag{2.4}$$
with some absolute constant $C$. The second important idea is to exploit the almost invariance of $u_n$ under the transformation $T$. In fact, it follows immediately from the definition of $M_n$ as a product that

$$\sup_{(x,y)\in\mathbb{T}^2} \Big|\frac{1}{K}\sum_{k=1}^{K} u_n(T^k(x,y)) - u_n(x,y)\Big| \le C\,\frac{K}{n}. \tag{2.5}$$
Fourier expanding the sum in (2.5) leads to a series in which the main contributions are given by the resonances of the shift, i.e., those $k \in \mathbb{Z}^2 \setminus \{0\}$ for which $\|k\cdot\omega\| \ll 1$. Since $\omega = (\omega_1,\omega_2)$ is assumed to be Diophantine, such resonances only occur for a sparse set of frequencies $k$ and the decay (2.4) then controls the size of these contributions (in [3] certain technical problems arise due to the non-$\ell^2$ decay provided by (2.4), which however do not concern us here). The difficulty one faces with this method in the case of the skew-shift derives from the failure of uniform boundedness of the subharmonic function (2.3). This is due to the fact that iteration of the skew-shift is given by

$$T^k(x,y) = \big(x + ky + k(k-1)\omega/2,\; y + k\omega\big) \mod \mathbb{Z}^2. \tag{2.6}$$
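The closed form (2.6) can be checked mechanically against repeated application of the map. The following is a minimal sketch, not part of the original argument; the function names and the rational sample values are illustrative assumptions, and exact rational arithmetic is used so that the comparison is free of rounding effects.

```python
from fractions import Fraction


def skew_shift(x, y, omega):
    """One step of the skew-shift T(x, y) = (x + y, y + omega) mod 1."""
    return (x + y) % 1, (y + omega) % 1


def skew_shift_iterate(x, y, omega, k):
    """Closed form (2.6) for the k-th iterate, k >= 0."""
    # k(k-1)/2 is always an integer, so the Fraction below is exact.
    return ((x + k * y + Fraction(k * (k - 1), 2) * omega) % 1,
            (y + k * omega) % 1)


# Hypothetical sample data (any rationals would do).
x0, y0, omega = Fraction(1, 7), Fraction(2, 5), Fraction(3, 11)

point = (x0, y0)
for k in range(1, 50):
    point = skew_shift(point[0], point[1], omega)
    # Step-by-step iteration agrees with the closed form.
    assert point == skew_shift_iterate(x0, y0, omega, k)
```

The quadratic term $k(k-1)\omega/2$ in the first coordinate is what makes the complexification in $y$ costly in the discussion that follows.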
Complexifying in the variable $y$ therefore produces an imaginary part of size about $n$ in half of the factors of the product $M_n$, cf. (2.1). Therefore, most factors of $M_n$ will be of size $e^n$ rather than bounded as in the case of the shift. Instead of (2.4) one can only assert that

$$|\hat u_n(\ell_1,\ell_2)| \le \frac{Cn}{|\ell_1| + |\ell_2| + 1}. \tag{2.7}$$
However, since one typically has a resonance at the site $(0,n)$, the Fourier series argument based on the decay (2.7) does not even provide that $\|u_n - L_n\|_2 \to 0$. Of course, the argument which we outlined above is rather crude as the structure of $M_n$ only enters through the almost invariance (2.5). The tool that will allow us to exploit the structure of $M_n$ more carefully is the “avalanche principle” from [7]. We now reproduce the statement of this principle from [7], but refer the reader to that paper for the proof.
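Proposition 2.1. Let $A_1, \dots, A_n$ be a sequence of arbitrary unimodular $2\times 2$-matrices. Suppose that

$$\min_{1\le j\le n} \|A_j\| \ge \mu \ge n \tag{2.8}$$

and

$$\max_{1\le j<n} \big[\log\|A_{j+1}\| + \log\|A_j\| - \log\|A_{j+1}A_j\|\big] \le \frac{1}{2}\log\mu. \tag{2.9}$$

Then

$$\Big|\log\|A_n\cdot\ldots\cdot A_1\| + \sum_{j=2}^{n-1}\log\|A_j\| - \sum_{j=1}^{n-1}\log\|A_{j+1}A_j\|\Big| < C\,\frac{n}{\mu}. \tag{2.10}$$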
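To see the quantities in (2.8)–(2.10) at work, one can evaluate them numerically for a concrete family of matrices. The sketch below is only an illustration under stated assumptions: the matrices are rotations by small random angles composed with $\mathrm{diag}(\lambda, 1/\lambda)$ (a hypothetical test family, not the monodromy matrices of the paper), and the absolute constant $C$ of (2.10) is not known here, so the code merely reports the left-hand side of (2.10) next to $n/\mu$.

```python
import math
import random


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
            [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]]]


def log_norm(a):
    """log of the operator norm of a 2x2 matrix with determinant 1."""
    t = sum(v * v for row in a for v in row)  # s1^2 + s2^2 for singular values s1, s2
    # Since s1 * s2 = 1, the larger root is s1^2 = (t + sqrt(t^2 - 4)) / 2.
    return 0.5 * math.log((t + math.sqrt(max(t * t - 4.0, 0.0))) / 2.0)


def sample_matrices(n, lam, max_angle, seed=0):
    """Hypothetical test family: rotation(theta) * diag(lam, 1/lam), |theta| <= max_angle."""
    rng = random.Random(seed)
    mats = []
    for _ in range(n):
        th = rng.uniform(-max_angle, max_angle)
        c, s = math.cos(th), math.sin(th)
        mats.append([[c * lam, -s / lam], [s * lam, c / lam]])
    return mats


def avalanche_check(mats):
    """Return (mu, left side of (2.9), left side of (2.10)) for A_1, ..., A_n."""
    n = len(mats)
    mu = min(math.exp(log_norm(a)) for a in mats)
    pair_logs = [log_norm(matmul(mats[j + 1], mats[j])) for j in range(n - 1)]
    gap = max(log_norm(mats[j + 1]) + log_norm(mats[j]) - pair_logs[j] for j in range(n - 1))
    prod = mats[0]
    for a in mats[1:]:
        prod = matmul(a, prod)  # builds A_n ... A_1
    deviation = abs(log_norm(prod) + sum(log_norm(a) for a in mats[1:-1]) - sum(pair_logs))
    return mu, gap, deviation


mu, gap, deviation = avalanche_check(sample_matrices(20, 50.0, 0.3))
print(mu, gap, 0.5 * math.log(mu), deviation, 20 / mu)
```

For this family both hypotheses hold with room to spare, and the reported deviation is far below $n/\mu$; this is consistent with (2.10) but of course proves nothing about the general statement.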
Proposition 2.1 will allow us to prove (2.2) inductively. More precisely, assume that (2.2) holds for some integers $n$ and $2n$. Consider the monodromy matrix $M_N$ with a choice of $N$ which is basically subexponential in $n$. Let the matrices $A_j$ be the matrices $M_n \circ T^{jn}$ so that

$$M_N(x,y;E) = \prod_{j=N/n}^{0} A_j(x,y;E).$$

By (2.2) conditions (2.8) and (2.9) will hold for all $(x,y) \in \mathbb{T}^2$ up to a set of measure not exceeding

$$\exp\big(-n^{\sigma}\big). \tag{2.11}$$

The advantage of passing to the much shorter monodromy matrices $M_n$ instead of $M_N$ lies with the fact that the size of their subharmonic extensions is only $n$ rather than $N$. This allows one to prove that the averages appearing in (2.10) are close to their respective means up to a set which is subexponentially small in $N$, cf. Lemma 2.6 below. However, in order to apply the avalanche principle we had to remove a set of size given by (2.11), whereas the goal is to prove (2.2) for $N$. The key tool to circumvent this difficulty is the following BMO estimate for subharmonic functions, which have the additional property of being the sum of two functions, one of which is small in $L^\infty$ and one that is small in $L^1$. This mechanism is really the new feature compared to the methods from [3].
2.1. Subharmonic functions with small BMO-norm.

Definition 2.2. Throughout this paper $e(x) := e^{2\pi i x}$. For any $0 < \rho < 1$, $A_\rho := \{z \in \mathbb{C} \mid 1-\rho < |z| < 1+\rho\}$. For a function $u$ defined on $A_\rho$ we shall write $u(x)$ instead of $u(e(x))$. It will be clear from the context whether we mean $u(z)$ for complex $z$ or $u(x) = u(e(x))$ for real $x$. For any positive integer $d$, $\mathbb{T}^d := \mathbb{R}^d/\mathbb{Z}^d$ denotes the $d$-dimensional torus. BMO($\mathbb{T}$) is the space of functions of bounded mean oscillation on $\mathbb{T}$, see [16]. Identifying functions that differ only by an additive constant, the norm on BMO($\mathbb{T}$) is given by

$$\|f\|_{BMO(\mathbb{T})} := \sup_{I\subset\mathbb{T}} \frac{1}{|I|}\int_I |f - \langle f\rangle_I|\,dx,$$

where $\langle f\rangle_I = \frac{1}{|I|}\int_I f(x)\,dx$. The open unit disk will be denoted by $\mathbb{D}$.

Lemma 2.3. Suppose $u$ is subharmonic on $A_\rho$, with $\sup_{A_\rho}|u| \le N$. Furthermore, assume that $u = u_0 + u_1$, where

$$\|u_0 - \langle u_0\rangle\|_{L^\infty(\mathbb{T})} \le \varepsilon_0 \quad\text{and}\quad \|u_1\|_{L^1(\mathbb{T})} \le \varepsilon_1. \tag{2.12}$$

Then for some constant $C_\rho$ depending only on $\rho$,

$$\|u\|_{BMO(\mathbb{T})} \le C_\rho\Big(\varepsilon_0\log(N/\varepsilon_1) + \sqrt{N\varepsilon_1}\Big). \tag{2.13}$$
Proof. By Riesz's representation theorem, there is a positive measure $\mu$ with $\mathrm{supp}(\mu) \subset A_{\rho/2}$ and a harmonic function $h$ such that for any $z \in A_{\rho/2}$,

$$u(z) = \int \log|z-\zeta|\,d\mu(\zeta) + h(z), \tag{2.14}$$

where

$$\mu(A_{\rho/2}) + \|h\|_{L^\infty(A_{\rho/4})} \le C_\rho N. \tag{2.15}$$

We first claim that one may assume

$$\mathrm{supp}(\mu) \subset \overline{\mathbb{D}} \cap A_{\rho/2}. \tag{2.16}$$

Indeed, define $\mu^*$ by $\mu^*(E) = \mu(E\cap\mathbb{D}) + \mu(E^*)$, where $E^* = \{\bar z^{-1} : z \in E\}$ for any measurable $E \subset \mathbb{C}$. For any $|z| = 1$,

$$\int \log|z-\zeta|\,d\mu(\zeta) - \int_{\mathbb{D}} \log|z-\zeta|\,d\mu^*(\zeta) = \int_{\mathbb{C}\setminus\mathbb{D}} \log|\zeta|\,d\mu(\zeta).$$

Since the term on the right-hand side is nonnegative and no larger than $C_\rho N$, subtracting this constant from $u$ and $u_0$ changes $N$ by at most a multiplicative constant, whereas both the hypothesis and the conclusion of the lemma remain unchanged. This implies claim (2.16). In particular, since

$$\int_{\mathbb{T}} \log|e(t)-\zeta|\,dt = 0 \quad\text{for all } |\zeta| \le 1,$$

we can assume that

$$\langle u\rangle = \langle h\rangle = 0. \tag{2.17}$$
For $\zeta = r\cdot e(x)$ with $0 \le r \le 1$ let

$$P_\zeta(y) = \frac{1-r^2}{1 - 2r\cos(2\pi(x-y)) + r^2}$$

be the usual Poisson kernel. If $|\zeta| = 1$, then $P_\zeta = \delta_\zeta$. For any $f \in L^1(\mathbb{T})$ with $\langle f\rangle = 0$, the anti-derivative $D^{-1}f$ is defined as

$$(D^{-1}f)(t) = \int_{t_0}^{t} f(x)\,dx \quad\text{where } t_0 \text{ is chosen so that } \langle D^{-1}f\rangle = 0 \tag{2.18}$$

for arbitrary $t \in \mathbb{T}$. The existence of $t_0$ is guaranteed by the mean value theorem. We shall also need (2.18) in case $f = \delta_{\theta_0}$, $\theta_0 \in \mathbb{T}$. In that case let $t_0 = \theta_0 + \frac{1}{2}$ (mod 1). Observe that $D^{-1}f$ is unique whereas the choice of $t_0$ is not necessarily unique. For any $\zeta = |\zeta|e(y) \in \mathbb{D}$ one has the elementary relation

$$\frac{d}{dx}\log|e(x)-\zeta| = \frac{2\pi|\zeta|\sin(2\pi(x-y))}{1 - 2|\zeta|\cos(2\pi(x-y)) + |\zeta|^2} = Q_\zeta(x) = (\mathcal{H}P_\zeta)(x),$$

where $\mathcal{H}$ denotes the Hilbert transform and $Q_\zeta$ is the standard notation for the conjugate function of the Poisson kernel, cf. Katznelson [9]. In particular,

$$\log|e(x)-\zeta| = (D^{-1}\mathcal{H}P_\zeta)(x) = \mathcal{H}\big[D^{-1}(P_\zeta - 1)\big](x).$$

Hence (2.14), (2.16), and (2.17) imply that

$$u|_{\mathbb{T}} = \mathcal{H}\Big[\int D^{-1}(P_\zeta(\cdot) - 1)\,d\mu(\zeta) + \mathcal{H}^{-1}h\Big]. \tag{2.19}$$
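The anti-derivative appearing in (2.19) is a harmonic function on $\mathbb{D}$. In fact, if $z = r\cdot e(t) \in \mathbb{D}$, then

$$D^{-1}(P_r(\cdot) - 1)(t) = \int_{-\frac{1}{2}}^{t} (P_r(x) - 1)\,dx = \int_{-\frac{1}{2}}^{t} 2\sum_{n=1}^{\infty}\cos(2\pi n x)\,r^n\,dx = \frac{1}{\pi}\sum_{n=1}^{\infty}\frac{1}{n}\sin(2\pi n t)\,r^n = -\frac{1}{\pi}\,\mathrm{Im}\log(1-z) = -2\,\mathrm{Arg}(1-z), \tag{2.20}$$

where Arg denotes the principal branch of the argument, i.e., $\mathrm{Arg}(z) = x$ if and only if $z = |z|e(x)$ and $-\frac{1}{2} \le x < \frac{1}{2}$. In particular,

$$D^{-1}(P_1(\cdot) - 1)(t) = \begin{cases} -t - \frac{1}{2} & \text{if } -\frac{1}{2} \le t < 0,\\[2pt] \frac{1}{2} - t & \text{if } 0 < t \le \frac{1}{2}, \end{cases} \tag{2.21}$$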
Fig. 1. The right-hand side of (2.24)
and similarly for $P_\zeta$ with arbitrary $|\zeta| = 1$. For any $|\zeta| \le 1$ denote $h_\zeta = D^{-1}(P_\zeta(\cdot) - 1)$. The functions are harmonic in the sense of (2.20). Let $\chi \ge 0$ be a $C^\infty$-function on the line with $\mathrm{supp}(\chi) \subset [-1,1]$ and $\int\chi(x)\,dx = 1$. Let $R$ be a large number to be determined below and set

$$\varphi_R(x) = R\chi(Rx). \tag{2.22}$$

Clearly,

$$\sum_k |\hat\varphi_R(k)| \le CR. \tag{2.23}$$
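The identities (2.20) and (2.21) lend themselves to a quick numerical sanity check. The sketch below is an illustration only; the function names and the truncation length are assumptions made for this example. It compares a partial sum of the sine series with $-\frac{1}{\pi}\,\mathrm{Im}\log(1-z)$ for $|z| < 1$, and compares the latter with the saw-tooth function as $r \to 1$.

```python
import math


def antiderivative_series(t, r, terms=2000):
    """Partial sum of (1/pi) * sum_{n>=1} sin(2 pi n t) r^n / n, cf. (2.20)."""
    total = sum(math.sin(2 * math.pi * n * t) * r ** n / n for n in range(1, terms + 1))
    return total / math.pi


def closed_form(t, r):
    """-(1/pi) * Im log(1 - z) for z = r e(t) with 0 <= r < 1."""
    w = 1 - r * complex(math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))
    # Re(w) > 0 here, so atan2 returns the principal argument of 1 - z.
    return -math.atan2(w.imag, w.real) / math.pi


def sawtooth(t):
    """Boundary values (2.21) for -1/2 <= t <= 1/2, t != 0."""
    return -t - 0.5 if t < 0 else 0.5 - t


for t in (-0.4, -0.1, 0.05, 0.3):
    # Series and closed form agree inside the disk ...
    assert abs(antiderivative_series(t, 0.9) - closed_form(t, 0.9)) < 1e-9
    # ... and the closed form approaches the saw-tooth as r tends to 1.
    assert abs(closed_form(t, 1 - 1e-7) - sawtooth(t)) < 1e-5
```

The jump of the saw-tooth at $t = 0$ is the reason the comparison (2.24) below needs the small shifts $\pm C_0R^{-1}$.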
We claim that for any $t \in \mathbb{T}$ and any $|\zeta| \le 1$,

$$(h_\zeta * \varphi_R)(t - C_0R^{-1}) - C_1R^{-1} \le h_\zeta(t) \le (h_\zeta * \varphi_R)(t + C_0R^{-1}) + C_1R^{-1}, \tag{2.24}$$

provided $C_0$, $C_1$ are suitable absolute constants. Since all the functions appearing in (2.24) are harmonic, it suffices by the maximum principle to prove the claim for $|\zeta| = 1$. By translation invariance, we may even set $\zeta = 1$. In that case, $h_\zeta$ is given by the saw-tooth function (2.21) for which (2.24) is evident, see Fig. 1 (the rounded-off curve lying inside the saw-tooth function represents the convolution of (2.21) with $\varphi_R$, whereas the dashed line is given by raising that smoothed out function and translating it to the left until it lies above the saw-tooth). Let $h$ be the harmonic function given by (2.14). In view of (2.15) one has

$$\|(\mathcal{H}^{-1}h)'\|_{L^\infty(\mathbb{T})} \le C_\rho\|h\|_{L^\infty(A_{\rho/4})} \le C_\rho N.$$
Therefore,

$$(\mathcal{H}^{-1}h * \varphi_R)(t - C_0R^{-1}) - C_2NR^{-1} \le (\mathcal{H}^{-1}h)(t) \le (\mathcal{H}^{-1}h * \varphi_R)(t + C_0R^{-1}) + C_2NR^{-1} \tag{2.25}$$

with the same constant $C_0$ as in (2.24), but a different choice of $C_2$ also depending on $\rho$. Let $F = [\ldots]$ denote the expression in brackets on the right-hand side of (2.19). By construction, $\langle F\rangle = 0$. Integrating (2.24) over the positive measure $d\mu(\zeta)$ with mass controlled by (2.15) and adding (2.25) yields

$$(F * \varphi_R)(t - C_0R^{-1}) - C_\rho NR^{-1} \le F(t) \le (F * \varphi_R)(t + C_0R^{-1}) + C_\rho NR^{-1}, \tag{2.26}$$

for any $t \in \mathbb{T}$. Thus

$$\|F\|_\infty \le \|(F - \mathcal{H}^{-1}u_0) * \varphi_R\|_\infty + \|\mathcal{H}^{-1}u_0 * \varphi_R\|_\infty + CNR^{-1} \le \sum_{k\ne 0}\big|(F - \mathcal{H}^{-1}u_0)^{\wedge}(k)\big|\,|\hat\varphi_R(k)| + \|u_0 * \mathcal{H}\varphi_R\|_\infty + CNR^{-1}. \tag{2.27}$$
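Since $F = \mathcal{H}^{-1}u_0 + \mathcal{H}^{-1}u_1$ by (2.19), the sum in (2.27) can be estimated as follows:

$$\sum_{k\ne 0}\big|(F - \mathcal{H}^{-1}u_0)^{\wedge}(k)\big|\,|\hat\varphi_R(k)| \le \sum_{k\ne 0}\big|(\mathcal{H}^{-1}u_1)^{\wedge}(k)\big|\,|\hat\varphi_R(k)| \le \sum_{k\ne 0}\|u_1\|_1\,|\hat\varphi_R(k)| \le C\varepsilon_1R.$$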
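Next we claim that $\|\mathcal{H}\varphi_R\|_1 \le C\log R$. With $Q(y) = \pi\cot(\pi y)$ being the kernel of $\mathcal{H}$,

$$|(\mathcal{H}\varphi_R)(y) - Q(y)| = \Big|\int \big(Q(y-x) - Q(y)\big)\varphi_R(x)\,dx\Big| \le \int \frac{C|x|}{|y|^2}\,\varphi_R(x)\,dx \le \frac{C}{R|y|^2},$$

provided $R|y| \ge C$. Thus,

$$\int_{[R|y|>C]} |(\mathcal{H}\varphi_R)(y)|\,dy \le C\log R. \tag{2.28}$$

On the other hand,

$$\int_{[R|y|\le C]} |(\mathcal{H}\varphi_R)(y)|\,dy \le C\|\mathcal{H}\varphi_R\|_2\,R^{-\frac{1}{2}} \le C\|\varphi_R\|_2\,R^{-\frac{1}{2}} \le C. \tag{2.29}$$

Thus

$$\|u_0 * \mathcal{H}\varphi_R\|_\infty \le \|u_0\|_\infty\|\mathcal{H}\varphi_R\|_1 \le C\varepsilon_0\log R.$$

In view of the preceding

$$\|F\|_\infty \le C\big(\varepsilon_0\log R + \varepsilon_1R + NR^{-1}\big). \tag{2.30}$$

The lemma follows from (2.19) and (2.30) by taking $R = \sqrt{N/\varepsilon_1}$. □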
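The arithmetic behind this choice of $R$ can be spelled out as follows. This is a routine check of how (2.30) turns into the shape of (2.13), added for convenience; it assumes $N/\varepsilon_1 > 1$ so that the logarithm is positive.

```latex
R=\sqrt{N/\varepsilon_1}
\quad\Longrightarrow\quad
\varepsilon_1 R = N R^{-1} = \sqrt{N\varepsilon_1},
\qquad
\log R = \tfrac12\log(N/\varepsilon_1),
\\[4pt]
\|F\|_\infty \;\le\; C\Big(\tfrac12\,\varepsilon_0\log(N/\varepsilon_1) + 2\sqrt{N\varepsilon_1}\Big)
\;\le\; 2C\Big(\varepsilon_0\log(N/\varepsilon_1) + \sqrt{N\varepsilon_1}\Big).
```

The two terms $\varepsilon_1R$ and $NR^{-1}$ are balanced by this $R$, which is why the square root $\sqrt{N\varepsilon_1}$ appears in (2.13).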
Remark 2.4. The main application of Lemma 2.3 in this paper will be to estimates on the measure of the set $\{x \in \mathbb{T} \mid |u(x) - \langle u\rangle| > \lambda\}$. In fact, by the well-known John–Nirenberg inequality [16], the measure of this set does not exceed

$$C\exp\Big(-\frac{c\lambda}{\|u\|_{BMO}}\Big). \tag{2.31}$$

The exponential integrability of the Hilbert transform of a bounded function can be derived much more easily than by going through BMO and John–Nirenberg. Indeed, it is a classical, and rather simple fact that for any real-valued function $f$ on $\mathbb{T}$ such that $|f| \le 1$, one has the bound

$$\int_{\mathbb{T}} \exp\big(\alpha|(\mathcal{H}f)(t)|\big)\,dt \le \frac{2}{\cos(\alpha\pi/2)}$$

for any $0 \le \alpha < 1$, see Theorem 1.9 in [9] (with $\alpha < 1$ being optimal). Using this bound in the previous proof instead of the deeper fact that $\mathcal{H} : L^\infty \to BMO$ leads directly to the estimate (2.31) on the measure. Since the BMO-estimate (2.13) might be of interest in its own right, we have chosen to present Lemma 2.3 in this way.

Lemma 2.5. Let $u : \mathbb{T}^2 \to \mathbb{R}$ satisfy $\|u\|_{L^\infty(\mathbb{T}^2)} \le 1$. Assume that $u$ extends as a separately subharmonic function in each variable to a neighborhood of $\mathbb{T}^2$ such that for some $N \ge 1$ and $\rho > 0$,

$$\sup_{z_1\in A_\rho}\,\sup_{z_2\in A_\rho} |u(z_1,z_2)| \le N.$$

Furthermore, suppose that $u = u_0 + u_1$ on $\mathbb{T}^2$ where

$$\|u_0 - \langle u\rangle\|_{L^\infty(\mathbb{T}^2)} \le \varepsilon_0 \quad\text{and}\quad \|u_1\|_{L^1(\mathbb{T}^2)} \le \varepsilon_1$$

with $0 < \varepsilon_0, \varepsilon_1 < 1$. Here $\langle u\rangle := \int_{\mathbb{T}^2} u(x,y)\,dx\,dy$. Then for any $\delta > 0$,

$$\mathrm{mes}\Big[(x,y) \in \mathbb{T}^2 \;\Big|\; |u(x,y) - \langle u\rangle| > B^{\delta}\log(N/\varepsilon_1)\Big] \le CN^2\varepsilon_1^{-1}\exp\big(-cB^{-\frac{1}{2}+\delta}\big),$$

where $B = \varepsilon_0\log(N/\varepsilon_1) + N^{\frac{3}{2}}\varepsilon_1^{\frac{1}{4}}$. The constants $c$, $C$ only depend on $\rho$.

Proof. We may assume that $\langle u\rangle = 0$ without significantly changing the hypotheses. Let $M = N^2\varepsilon_1^{-\frac{1}{2}}$ and denote the Fejér-kernel on $\mathbb{T}$ with Fourier support $[-M+1, M-1]$ by $F_M$. Then

$$u *_1 F_M = u_0 *_1 F_M + u_1 *_1 F_M,$$

where $*_1$ denotes the convolution in $x$ alone. It is clear that for fixed $x \in \mathbb{T}$,

$$\|u_0 *_1 F_M(x,\cdot)\|_{L^\infty_y} \le \varepsilon_0 \quad\text{and}\quad \|u_1 *_1 F_M(x,\cdot)\|_{L^1_y} \le M\varepsilon_1 \le 2N^2\sqrt{\varepsilon_1}.$$
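Since $F_M \ge 0$, $(u *_1 F_M)(x,\cdot)$ extends to a subharmonic function in the second variable satisfying

$$\sup_{z\in A_\rho} |u *_1 F_M(x,z)| \le N.$$

Hence Lemma 2.3, in conjunction with the John–Nirenberg inequality, implies that for any $\lambda > 0$,

$$\sup_{x\in\mathbb{T}}\,\mathrm{mes}\Big[y \in \mathbb{T} \;\Big|\; \big|(u *_1 F_M)(x,y) - \langle(u *_1 F_M)(x,\cdot)\rangle\big| > \lambda\Big] \le C\exp\Big(-\frac{c\lambda}{B}\Big), \tag{2.32}$$

where

$$B := C_\rho\big(\varepsilon_0\log(N/\varepsilon_1) + N^{\frac{3}{2}}\varepsilon_1^{\frac{1}{4}}\big). \tag{2.33}$$

Observe that for any $x, x' \in \mathbb{T}$

$$\sup_{y\in\mathbb{T}} |(u *_1 F_M)(x,y) - (u *_1 F_M)(x',y)| \le M\|u\|_{L^\infty(\mathbb{T}^2)}|x-x'| \le M|x-x'|. \tag{2.34}$$

Let $\mathcal{N} \subset \mathbb{T}$ be an $M^{-1}\lambda/4$-net. In view of (2.32) and (2.34) one concludes that

$$\mathrm{mes}\Big[y \in \mathbb{T} \;\Big|\; \sup_{x\in\mathbb{T}}\big|(u *_1 F_M)(x,y) - \langle(u *_1 F_M)(x,\cdot)\rangle\big| > \frac{1}{2}\lambda\Big] \le C\,\frac{M}{\lambda}\exp\Big(-\frac{c\lambda}{B}\Big). \tag{2.35}$$

Now let $\lambda = 2\sqrt{B}$ and denote the set on the left-hand side of (2.35) with this choice of $\lambda$ by $\mathcal{B}_1$. Thus

$$\mathrm{mes}(\mathcal{B}_1) \le CN^2\varepsilon_1^{-\frac{1}{2}}B^{-\frac{1}{2}}\exp\big(-cB^{-\frac{1}{2}}\big) \le CN^2\varepsilon_1^{-1}\exp\big(-cB^{-\frac{1}{2}}\big). \tag{2.36}$$

Now fix some $y \in \mathbb{T}\setminus\mathcal{B}_1$ and consider the decomposition of $u(\cdot,y)$ as a function of the first variable given by

$$u(\cdot,y) = \big[u(\cdot,y) - (u *_1 F_M)(\cdot,y)\big] + (u *_1 F_M)(\cdot,y). \tag{2.37}$$

From the Riesz representation

$$u(z,y) = \int \log|z-\zeta|\,d\mu(\zeta) + h(z) \quad\text{with}\quad \mu(A_{\rho/2}) + \|h\|_{L^\infty(A_{\rho/4})} \le C_\rho N,$$

it is standard to deduce that the Fourier coefficients

$$\hat u(\ell,y) := \int_{\mathbb{T}} u(x,y)\,e(-\ell x)\,dx$$

decay as follows:

$$|\hat u(\ell,y)| \le \frac{C_\rho N}{|\ell|}.$$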
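In particular, by definition of $F_M$ and because of our choice of $y$, see (2.35),

$$\|u(\cdot,y) - (u *_1 F_M)(\cdot,y)\|_2 \le C_\rho NM^{-\frac{1}{2}} \quad\text{and}\quad \sup_{x\in\mathbb{T}}\big|(u *_1 F_M)(x,y) - \langle(u *_1 F_M)(x,\cdot)\rangle\big| \le \sqrt{B}. \tag{2.38}$$

The mean appearing in the second term is uniformly small. In fact, for all $x \in \mathbb{T}$,

$$\big|\langle(u *_1 F_M)(x,\cdot)\rangle\big| \le \int_{\mathbb{T}} |(u_0 *_1 F_M)(x,y)|\,dy + \int_{\mathbb{T}} |(u_1 *_1 F_M)(x,y)|\,dy \le \|u_0\|_{L^\infty(\mathbb{T}^2)} + M\|u_1\|_{L^1(\mathbb{T}^2)} \le \varepsilon_0 + 2N^2\sqrt{\varepsilon_1}. \tag{2.39}$$

Assuming as we may that $B \le 1$, one checks from (2.33) that the bound in (2.39) is no larger than $C\sqrt{B}$. Hence (2.38) implies that for any $y \in \mathbb{T}\setminus\mathcal{B}_1$ (recall that $M = N^2\varepsilon_1^{-\frac{1}{2}}$)

$$\|u(\cdot,y) - (u *_1 F_M)(\cdot,y)\|_1 \le C_\rho\,\varepsilon_1^{\frac{1}{4}} \quad\text{and}\quad \sup_{x\in\mathbb{T}} |(u *_1 F_M)(x,y)| \le C\sqrt{B}.$$

Applying Lemma 2.3 to the function $u(\cdot,y)$ with the decomposition given by (2.37) therefore yields

$$\sup_{y\in\mathbb{T}\setminus\mathcal{B}_1} \|u(\cdot,y)\|_{BMO} \le C_\rho\big(\sqrt{B}\log(N/\varepsilon_1) + N^{\frac{1}{2}}\varepsilon_1^{\frac{1}{8}}\big) \le C_\rho\sqrt{B}\log(N/\varepsilon_1). \tag{2.40}$$

It remains to be shown that

$$v(y) := \langle u(\cdot,y)\rangle = \int_{\mathbb{T}} u(x,y)\,dx$$

is close to zero for most $y$. Clearly, $v$ extends to a subharmonic function on $A_\rho$ such that

$$\sup_{z\in A_\rho} |v(z)| \le N \quad\text{and}\quad \langle v\rangle = \langle u\rangle = 0.$$

With $v_0(y) := \langle u_0(\cdot,y)\rangle$ and $v_1(y) := \langle u_1(\cdot,y)\rangle$ one has

$$\|v_0\|_{L^\infty(\mathbb{T})} \le \varepsilon_0 \quad\text{and}\quad \|v_1\|_{L^1(\mathbb{T})} \le \varepsilon_1.$$

Therefore, Lemma 2.3 implies that

$$\|v\|_{BMO} \le C\big(\varepsilon_0\log(N/\varepsilon_1) + \sqrt{N\varepsilon_1}\big) \le CB.$$

Thus

$$\mathrm{mes}\Big[y \in \mathbb{T} \;\Big|\; |v(y)| > \sqrt{B}\Big] \le C\exp\big(-cB^{-\frac{1}{2}}\big). \tag{2.41}$$

Denoting the set on the left-hand side by $\mathcal{B}_2$, let $\mathcal{B} := \mathcal{B}_1\cup\mathcal{B}_2$. One concludes from (2.36), (2.41), and (2.40) by means of the John–Nirenberg inequality that

$$\mathrm{mes}\Big[(x,y) \in \mathbb{T}^2 \;\Big|\; |u(x,y)| > B^{\delta}\log(N/\varepsilon_1)\Big] \le \mathrm{mes}(\mathcal{B}) + C\exp\big(-cB^{-\frac{1}{2}+\delta}\big),$$

and the lemma follows. □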
2.2. Averages of subharmonic functions over orbits of the skew-shift.

In what follows we assume that $\omega \in (0,1)$ is Diophantine in the sense that

$$\|n\omega\| \ge \varepsilon\,n^{-1}(1+\log n)^{-2} \quad\text{for any } n \in \mathbb{Z}^+, \tag{2.42}$$

where $\varepsilon > 0$ is some arbitrary but fixed small number. Let $\Omega_\varepsilon$ be the set of those $\omega$ that satisfy (2.42). It is clear that $\mathrm{mes}[\mathbb{T}\setminus\Omega_\varepsilon] < C\varepsilon$ with an absolute constant $C$. The choice of logarithm in (2.42) is mainly for convenience. A very small power loss is also acceptable. Throughout this section we will use $\Omega_\varepsilon$ in this sense. Let $T_\omega : \mathbb{T}^2 \to \mathbb{T}^2$, $T_\omega(x,y) = (x+y, y+\omega)$ (mod $\mathbb{Z}^2$) be the skew-shift. Observe that the iterates of $T_\omega$ are given by

$$T_\omega^k(x,y) = \big(x + ky + k(k-1)\omega/2,\; y + k\omega\big) \mod \mathbb{Z}^2 \tag{2.43}$$

for any $k \in \mathbb{Z}$.

Lemma 2.6. Let $u : \mathbb{T}^2 \to \mathbb{R}$ extend to some neighborhood of $\mathbb{T}^2$ as a separately subharmonic function in each variable so that for some $\rho > 0$,

$$\sup_{z_1\in A_\rho}\,\sup_{z_2\in A_\rho} |u(z_1,z_2)| \le 1. \tag{2.44}$$

Fix a small $\varepsilon > 0$ and let $\omega \in \Omega_\varepsilon$, see (2.42). For any $\delta > 0$ there exist constants $c$, $C$ such that

$$\mathrm{mes}\Big[(x,y) \in \mathbb{T}^2 \;\Big|\; \Big|\frac{1}{K}\sum_{k=1}^{K} u\circ T_\omega^k(x,y) - \langle u\rangle\Big| > K^{-\frac{1}{12}+2\delta}\Big] \le C\exp\big(-cK^{\delta}\big), \tag{2.45}$$

for any positive integer $K$. Here $\langle u\rangle = \int_{\mathbb{T}^2} u(x,y)\,dx\,dy$ and the constants depend only on $\rho$, $\delta$, $\varepsilon$.

Proof. Let $\hat u(\ell,y) = \int_0^1 u(x,y)e(-\ell x)\,dx$ denote the Fourier coefficient with respect to the first variable. As above one deduces by means of the Riesz representation of the subharmonic function $z \mapsto u(z,y)$ and from (2.44) that

$$\sup_{y\in\mathbb{T}} |\hat u(\ell,y)| \le C_\rho|\ell|^{-1}. \tag{2.46}$$

With some positive integer $p_1$ to be determined, let

$$u(x,y) = \sum_{|\ell_1|\le p_1} \hat u(\ell_1,y)e(\ell_1x) + \sum_{|\ell_1|>p_1} \hat u(\ell_1,y)e(\ell_1x) =: \tilde u_1(x,y) + u_1(x,y), \tag{2.47}$$

where $\tilde u_1$ and $u_1$ are the respective sums on the right-hand side of (2.47). By (2.46),

$$\sup_{y\in\mathbb{T}} \|u_1(\cdot,y)\|_{L^2_x} \le Cp_1^{-\frac{1}{2}}. \tag{2.48}$$
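With some positive integer $p_2$ to be determined below, let

$$\tilde u_1(x,y) = \sum_{|\ell_1|\le p_1}\sum_{|\ell_2|>p_2} \hat u(\ell_1,\ell_2)e(\ell_1x+\ell_2y) + \sum_{|\ell_1|\le p_1}\sum_{|\ell_2|\le p_2} \hat u(\ell_1,\ell_2)e(\ell_1x+\ell_2y) =: u_2(x,y) + u_3(x,y). \tag{2.49}$$

Using the Riesz representation in the second variable one derives from (2.44) that

$$|\hat u(\ell_1,\ell_2)| \le \int_{\mathbb{T}}\Big|\int_{\mathbb{T}} e(-\ell_2y)u(x,y)\,dy\Big|\,dx \le \frac{C_\rho}{1+|\ell_2|}. \tag{2.50}$$

Therefore,

$$\|u_2\|_{L^2(\mathbb{T}^2)} \le \sum_{|\ell_1|\le p_1}\Big\|\sum_{|\ell_2|>p_2} \hat u(\ell_1,\ell_2)e(\ell_2y)\Big\|_{L^2_y} \le C\,p_1p_2^{-\frac{1}{2}}.$$

In particular,

$$\mathrm{mes}\Big[y \in \mathbb{T} \;\Big|\; \int_{\mathbb{T}}\Big|\frac{1}{K}\sum_{k=1}^{K} u_2\circ T^k(x,y)\Big|\,dx > K^{-1}\Big] \le K\int_{\mathbb{T}^2}\Big|\frac{1}{K}\sum_{k=1}^{K} u_2\circ T^k(x,y)\Big|\,dx\,dy \le K\|u_2\|_{L^1(\mathbb{T}^2)} \le C\,Kp_1p_2^{-\frac{1}{2}}. \tag{2.51}$$

Let $\mathcal{B}$ be the set on the left-hand side of (2.51). In view of (2.43),

$$\sup_{x,y\in\mathbb{T}^2}\Big|\frac{1}{K}\sum_{k=1}^{K} u_3\circ T^k(x,y) - \langle u\rangle\Big| \le \frac{1}{K}\sum_{\substack{|\ell_1|\le p_1,\,|\ell_2|\le p_2\\ |\ell_1|+|\ell_2|\ne 0}} \frac{C}{1+|\ell_2|}\Big|\sum_{k=1}^{K} e\big(\ell_1(ky+\omega k(k-1)/2) + \ell_2k\omega\big)\Big|$$

$$\le \frac{1}{K}\sum_{\ell_2=1}^{p_2}\frac{C}{\ell_2}\Big|\sum_{k=1}^{K} e(\ell_2k\omega)\Big| + \frac{1}{K}\sum_{\ell_1=1}^{p_1}\sum_{\ell_2=0}^{p_2}\frac{C}{1+\ell_2}\Big[\sum_{m=1}^{K-1}\min\big(K, \|m\ell_1\omega\|^{-1}\big)\Big]^{\frac{1}{2}} \tag{2.52}$$

$$\le \frac{C}{K}\sum_{\ell_2=1}^{p_2}\frac{1}{\ell_2}\min\big(K, \|\ell_2\omega\|^{-1}\big) + \frac{C}{K}\sqrt{p_1}\,\log p_2\Big[\sum_{\ell_1=1}^{p_1}\sum_{m=1}^{K-1}\min\big(K, \|m\ell_1\omega\|^{-1}\big)\Big]^{\frac{1}{2}} =: S_1 + S_2. \tag{2.53}$$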
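To obtain the second term in line (2.52), one uses the well-known method of Weyl-differencing, cf. Montgomery [11, Chap. 3]. In fact,

$$\Big|\sum_{k=1}^{K} e\big(\ell_1(ky+\omega k(k-1)/2) + \ell_2k\omega\big)\Big|^2 \le K + 2\sum_{m=1}^{K-1}\min\Big(K, \frac{2}{|1-e(\ell_1\omega m)|}\Big) \le C\sum_{m=1}^{K-1}\min\big(K, \|\ell_1\omega m\|^{-1}\big),$$

which leads to (2.52). In view of (2.42) (with $a \sim b$ denoting $b \le a \le 2b$), for any positive integer $R$,

$$\sum_{\ell=1}^{R}\frac{1}{\ell}\min\big(1, K^{-1}\|\ell\omega\|^{-1}\big) \le \sum_{1\le 2^j\le K}\sum_{\ell=1}^{R}\frac{1}{\ell}\,\chi_{[\|\ell\omega\|\sim 2^{-j}]}\min\big(1, K^{-1}2^j\big) + \sum_{\ell=1}^{R}\frac{1}{\ell}\,\chi_{[\|\ell\omega\|\le K^{-1}]}$$

$$\le C\sum_{1\le 2^j\le K}\frac{2^j}{K}\,2^{-j}j^2\log R + C\,\frac{(\log K)^2}{K}\log R \le C\,\frac{(\log K)^2}{K}\log R.$$

Here the constants depend on $\varepsilon$. Thus,

$$S_1 \le C_\varepsilon\,\frac{(\log K)^2}{K}\log p_2. \tag{2.54}$$

By Dirichlet's principle there is an integer $1 \le q \le K$ and an integer $p$ so that $\gcd(p,q) = 1$ and $|\omega - \frac{p}{q}| \le \frac{1}{qK}$. In view of (2.42), one also has $q \ge c_\varepsilon\frac{K}{(\log K)^2}$. By means of the standard bound on the divisor function and the usual estimates for reciprocal sums, cf. [11, Chap. 3],

$$\sum_{\ell_1=1}^{p_1}\sum_{m=1}^{K-1}\min\big(K, \|m\ell_1\omega\|^{-1}\big) \le C_{\varepsilon_2}(p_1K)^{\varepsilon_2}\sum_{k=1}^{p_1K}\min\big(K, \|k\omega\|^{-1}\big) \le C_{\varepsilon_2}(p_1K)^{\varepsilon_2}\Big(\frac{p_1K^2}{q} + p_1K\log q + K + q\log q\Big) \le C_{\varepsilon_2}(p_1K)^{1+2\varepsilon_2}, \tag{2.55}$$

where $\varepsilon_2 > 0$ is an arbitrarily small parameter. One obtains from (2.53), (2.54), and (2.55) that

$$\sup_{x,y\in\mathbb{T}^2}\Big|\frac{1}{K}\sum_{k=1}^{K} u_3\circ T^k(x,y) - \langle u\rangle\Big| \le S_1 + S_2 \le C\,p_1^{1+\varepsilon_2}K^{-\frac{1}{2}+\varepsilon_2}\log p_2 \tag{2.56}$$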
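with a constant that depends both on $\varepsilon$ and $\varepsilon_2$. Fix some small $\delta > 0$ and choose $p_1 = K^{\frac{1}{3}}$ and $p_2 = \exp(4K^{\delta})$. The conclusion from the preceding is as follows, cf. (2.48), (2.51), and (2.56): There exists a subset $\mathcal{B} \subset \mathbb{T}$ of measure

$$\mathrm{mes}(\mathcal{B}) \le CK^{\frac{4}{3}}\exp\big(-2K^{\delta}\big) \le C\exp\big(-K^{\delta}\big), \tag{2.57}$$

such that (choosing $2\varepsilon_2 < \delta$)

$$\sup_{y\in\mathbb{T}\setminus\mathcal{B}}\Big\|\frac{1}{K}\sum_{k=1}^{K} u\circ T^k(\cdot,y) - \langle u\rangle\Big\|_{L^1_x} \le \sup_{y\in\mathbb{T}}\Big\|\frac{1}{K}\sum_{k=1}^{K} u_1\circ T^k(x,y)\Big\|_{L^1_x} + \sup_{y\in\mathbb{T}\setminus\mathcal{B}}\int_{\mathbb{T}}\Big|\frac{1}{K}\sum_{k=1}^{K} u_2\circ T^k(x,y)\Big|\,dx + \sup_{(x,y)\in\mathbb{T}^2}\Big|\frac{1}{K}\sum_{k=1}^{K} u_3\circ T^k(x,y) - \langle u\rangle\Big|$$

$$\le CK^{-\frac{1}{6}} + K^{-1} + C\,K^{\frac{1}{3}+\varepsilon_2}K^{\delta-\frac{1}{2}+\varepsilon_2} \le C\,K^{-\frac{1}{6}+2\delta} \tag{2.58}$$

with constants that depend on both $\delta$ and $\varepsilon$. To obtain (2.45), one uses Lemma 2.3 to convert the $L^1$-bound (2.58) into an $L^\infty$-bound at the cost of removing an exponentially small set. For any fixed $y \in \mathbb{T}\setminus\mathcal{B}$, consider the bounded subharmonic function

$$v_y(z) := \frac{1}{K}\sum_{k=1}^{K} u\circ T^k(z,y) \quad\text{with } z \in A_\rho.$$

It is important to notice that $y$ is real. Otherwise $T^k(z,y) \notin A_\rho\times A_\rho$ for large $k$, see (2.43). One has the decomposition

$$\frac{1}{K}\sum_{k=1}^{K} u\circ T^k(\cdot,y) = \langle u\rangle + \Big[\frac{1}{K}\sum_{k=1}^{K} u\circ T^k(\cdot,y) - \langle u\rangle\Big].$$

In view of (2.58) one obtains from Lemma 2.3 (with $N = 1$, $\varepsilon_0 = 0$, and $\varepsilon_1 = K^{-\frac{1}{6}+2\delta}$) that

$$\Big\|\frac{1}{K}\sum_{k=1}^{K} u\circ T^k(\cdot,y)\Big\|_{BMO_x} \le C_\delta K^{-\frac{1}{12}+\delta}.$$

By the John–Nirenberg inequality thus

$$\sup_{y\in\mathbb{T}\setminus\mathcal{B}}\mathrm{mes}\Big[x \in \mathbb{T} \;\Big|\; |v_y(x) - \langle v_y\rangle| > C_\delta K^{-\frac{1}{12}+2\delta}\Big] \le C\exp\big(-K^{\delta}\big). \tag{2.59}$$

Since (2.58) implies that $|\langle v_y\rangle - \langle u\rangle| \le C_\delta K^{-\frac{1}{6}+2\delta}$, the lemma follows from (2.59) and (2.57). □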
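The convergence of the averages in (2.45) can be observed numerically in the simplest case. The sketch below is an illustration under assumptions that are not taken from the paper: the observable is $u(x,y) = \cos(2\pi x)$, which has $\langle u\rangle = 0$, the frequency is the golden-mean value $\omega = (\sqrt{5}-1)/2$, and the starting point is arbitrary. Floating-point iteration is used, so the result is only approximate.

```python
import math


def birkhoff_average(u, x, y, omega, K):
    """(1/K) * sum_{k=1}^{K} u(T^k(x, y)) for the skew-shift T(x, y) = (x + y, y + omega)."""
    total = 0.0
    for _ in range(K):
        x, y = (x + y) % 1.0, (y + omega) % 1.0  # one step of the skew-shift
        total += u(x, y)
    return total / K


def u(x, y):
    # Illustrative observable with mean zero over the torus.
    return math.cos(2 * math.pi * x)


omega = (math.sqrt(5) - 1) / 2  # assumed sample frequency of bounded type

for K in (100, 1000, 10000):
    print(K, birkhoff_average(u, 0.1, 0.2, omega, K))
```

For this observable the average is a quadratic Weyl sum divided by $K$, so its decay reflects exactly the Weyl-differencing step used to obtain (2.52).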
Remark 2.7. It will be important in the proof of localization below that the previous lemma requires only finitely many conditions on $\omega$. More precisely, the arithmetic nature of $\omega$ only enters into the estimate of $S_1$ and $S_2$. Furthermore, what is required for the bound on $S_1$ is the following: If for some $K^{-1} \le \kappa < 1$ and some positive distinct integers $\ell$, $\ell'$,

$$\|\ell\omega\| < \kappa \quad\text{and}\quad \|\ell'\omega\| < \kappa,$$

then $|\ell - \ell'| \ge c_\varepsilon\kappa^{-1}(\log\kappa)^{-2}$. This clearly requires the Diophantine condition (2.42) only for $1 \le k \le K$. As far as $S_2$ is concerned, it is evident from the estimate of $S_2$ that (2.42) is used only in the range $1 \le k \le p_1K \le K^2$.

2.3. The main inductive step in the proof of the large deviation theorem.

Consider equations of the form

$$-\psi_{n+1} - \psi_{n-1} + \lambda v(T_\omega^n(x,y))\psi_n = E\psi_n, \tag{2.60}$$

where $T_\omega : \mathbb{T}^2 \to \mathbb{T}^2$, $T_\omega(x,y) = (x+y, y+\omega)$ (mod $\mathbb{Z}^2$) is the skew-shift, and $v$ is a nonconstant real-analytic function on $\mathbb{T}^2$ satisfying some further conditions that will be specified below. Let

$$A_j(x,y;\lambda,E) = \begin{pmatrix} \lambda v(T_\omega^j(x,y)) - E & -1\\ 1 & 0\end{pmatrix}.$$

The matrix $M_n(x,y;\lambda,E) = \prod_{j=n}^{1} A_j(x,y;\lambda,E)$ denotes the monodromy matrix of Eq. (2.60). As usual,

$$L_n(\lambda,E) = \frac{1}{n}\int_{\mathbb{T}^2}\log\|M_n(x,y;\lambda,E)\|\,dx\,dy$$

and $L(\lambda,E) = \lim_{n\to\infty} L_n(\lambda,E)$ denotes the Lyapunov exponent. Introduce a scaling factor

$$S(\lambda,E) = \log(C_v + |\lambda| + |E|) \ge 1, \tag{2.61}$$

where $C_v$ is a constant depending only on the potential $v$ so that for all $n$

$$\sup_{z\in A_\rho}\,\sup_{y\in\mathbb{T}}\frac{1}{n}\log\|M_n(z,y;\lambda,E)\| + \sup_{z_1\in A_\rho}\,\sup_{z_2\in A_\rho}\frac{1}{n^2}\log\|M_n(z_1,z_2;\lambda,E)\| \le S(\lambda,E). \tag{2.62}$$

Here $\rho = \rho_v$ is determined by $v$. Observe that (2.62) basically requires the function $v$ to extend in the first variable to an analytic function on $\mathbb{C}\setminus\{0\}$ such that

$$\sup_{y\in\mathbb{T}} |v(z,y)| \le C\big(|z|^d + |z|^{-d}\big)$$

with some constant $d$, see (2.43). For example, any trigonometric polynomial

$$v(x,y) = \sum_{|k|+|\ell|\le d} a_{k,\ell}\,e(kx+\ell y)$$
satisfies this requirement. Another possibility, which is slightly more technical to state, but applies to any analytic function on a neighborhood of $\mathbb{T}^2$, is as follows: For all $n$,

$$\sup_{z_1\in A_\rho}\,\sup_{z_2\in A_{\rho/n}}\frac{1}{n}\log\|M_n(z_1,z_2;\lambda,E)\| \le S(\lambda,E). \tag{2.63}$$

The difference from (2.62) here is that in the second term the $z_2$-variable only needs to be taken in an annulus of thickness $\frac{\rho}{n}$. Observe that (2.63) can be stated for any potential $v$ that extends analytically to a neighborhood of $\mathbb{T}^2$ of size $\rho$. This is essential for real-analytic $v$. The reason (2.63) is sufficient for our purposes is the following simple fact. Suppose $u$ is a subharmonic function on $A_{\rho/n}$ bounded by one. Then there is the Riesz representation

$$u(z) = \int\log|z-\zeta|\,d\mu(\zeta) + h(z), \quad\text{where}\quad \mu(A_{\rho/(2n)}) + \|h\|_{L^\infty(A_{\rho/(4n)})} \le C_\rho\,n. \tag{2.64}$$

In particular, one has the decay of the Fourier coefficients

$$|\hat u(\ell)| \le \frac{Cn}{|\ell|}. \tag{2.65}$$

The reader will easily verify that (2.64), (2.65) are all that is required in the proof of the following lemma, which provides the inductive step in the proof of the large deviation theorem. It is based on the avalanche principle and all our previous lemmas.

Lemma 2.8. Fix $\varepsilon > 0$ small and let $\omega \in \Omega_\varepsilon$, see (2.42). Suppose $n$ and $N > n$ are positive integers such that

$$\mathrm{mes}\Big[(x,y) \in \mathbb{T}^2 \;\Big|\; \Big|\frac{1}{n}\log\|M_n(x,y;\lambda,E)\| - L_n(\lambda,E)\Big| > \frac{\gamma}{10}S(\lambda,E)\Big] \le N^{-10}, \tag{2.66}$$

$$\mathrm{mes}\Big[(x,y) \in \mathbb{T}^2 \;\Big|\; \Big|\frac{1}{2n}\log\|M_{2n}(x,y;\lambda,E)\| - L_{2n}(\lambda,E)\Big| > \frac{\gamma}{10}S(\lambda,E)\Big] \le N^{-10}. \tag{2.67}$$

Assume that

$$\min\big(L_n(\lambda,E), L_{2n}(\lambda,E)\big) \ge \gamma S(\lambda,E), \tag{2.68}$$

$$L_n(\lambda,E) - L_{2n}(\lambda,E) \le \frac{\gamma}{40}S(\lambda,E), \tag{2.69}$$

$$9\gamma nS \ge 10\log(2N) \quad\text{and}\quad n^2 \le N. \tag{2.70}$$

Then there is some absolute constant $C_0$ with the property that (with $L_N = L_N(\lambda,E)$ etc.)

$$L_N \ge \gamma S(\lambda,E) - 2(L_n - L_{2n}) - C_0S(\lambda,E)\,nN^{-1} \quad\text{and}\quad L_N - L_{2N} \le C_0S(\lambda,E)\,nN^{-1}. \tag{2.71}$$
Furthermore, for any $\sigma < \frac{1}{24}$ there is $\tau = \tau(\sigma) > 0$ so that
\[ \operatorname{mes}\Bigl\{(x,y)\in\mathbb{T}^2 : \Bigl|\frac{1}{N}\log\|M_N(x,y;\lambda,E)\| - L_N(\lambda,E)\Bigr| > S(\lambda,E)N^{-\tau}\Bigr\} \le C\exp(-N^{\sigma}) \tag{2.72} \]
with some constant $C = C(\sigma,\varepsilon)$.
Proof. We shall fix $\omega$, $\lambda$, and $E$ for the purposes of this proof and suppress these variables in the notation. In particular, $S = S(\lambda,E)$. Denote the set on the left-hand side of (2.66) by $B_n$ and the set on the left-hand side of (2.67) by $B_{2n}$. For any $(x,y)\in\mathbb{T}^2\setminus B_n$,
\[ \|M_n(x,y)\| \ge \exp\Bigl(n\gamma S - \frac{\gamma}{10}Sn\Bigr) = \exp\Bigl(\frac{9\gamma}{10}Sn\Bigr) =: \mu. \]
By (2.70), $\mu \ge 2N$. Furthermore, for any $(x,y)\notin B_n\cup T^{-n}B_n\cup B_{2n}$, (2.66)–(2.69) imply
\[ \log\|M_n\circ T^n(x,y)\| + \log\|M_n(x,y)\| - \log\|M_{2n}(x,y)\| \le 2n(L_n - L_{2n}) + \frac{4\gamma}{10}Sn \le \frac{9\gamma}{20}Sn = \frac{1}{2}\log\mu. \]
Applying Prop. 2.1 $N$ times yields a set $B_1\subset\mathbb{T}^2$ with measure
\[ \operatorname{mes}(B_1) \le 4N\cdot N^{-10} = 4N^{-9} \tag{2.73} \]
so that for any $(x,y)\in\mathbb{T}^2\setminus B_1$,
\[ \Bigl|\frac{1}{N}\log\|M_N(x,y)\| + \frac{1}{N}\sum_{j=1}^{N}\frac{1}{n}\log\|M_n\circ T^j(x,y)\| - \frac{2}{N}\sum_{j=1}^{N}\frac{1}{2n}\log\|M_{2n}\circ T^j(x,y)\|\Bigr| \le C\Bigl(\frac{Sn}{N} + \frac{1}{\mu}\Bigr) \le CSnN^{-1}. \tag{2.74} \]
Integrating (2.74) over $\mathbb{T}^2$ yields
\[ |L_N + L_n - 2L_{2n}| \le CSnN^{-1} + 16SN^{-9}, \tag{2.75} \]
which implies the first inequality in (2.71). To obtain the second inequality in (2.71), observe that by virtue of (2.70) all arguments so far apply equally well to $M_{2N}$ instead of $M_N$. Subtracting (2.75) from the analogous inequality involving $L_{2N}$ yields the desired bound. Denote
\[ u_N(x,y) = \frac{1}{N}\log\|M_N(x,y)\|, \]
and similarly with $n$ and $2n$. In view of (2.63), both $u_n$ and $u_{2n}$ extend to separately subharmonic functions in both variables such that
\[ \sup_{z_1\in A_\rho}\ \sup_{z_2\in A_{\rho/n}} \bigl(|u_n(z_1,z_2)| + |u_{2n}(z_1,z_2)|\bigr) \le CS. \]
Applying Lemma 2.6 to $u_n/S$ and $u_{2n}/(2S)$ (cf. the comments following (2.63), in particular (2.64) and (2.65)) therefore implies that there is a set $B_2\subset\mathbb{T}^2$ with measure ($\delta > 0$ is a fixed small number)
\[ \operatorname{mes}(B_2) \le C\exp(-N^{\delta}), \tag{2.76} \]
such that for any $(x,y)\in\mathcal{G} := \mathbb{T}^2\setminus(B_1\cup B_2)$,
\[ |u_N(x,y) + L_n - 2L_{2n}| \le CSnN^{-1} + C_\delta SN^{-\frac{1}{12}+2\delta}, \tag{2.77} \]
see (2.74). For small $\delta$ the second term is the larger one since $N \ge n^2$. Fix such an integer $N$. Consider the following decomposition of $u := u_N$ as a function on $\mathbb{T}^2$:
\[ u = u\chi_{\mathcal{G}} + L_N\chi_{\mathcal{G}^c} + u\chi_{\mathcal{G}^c} - L_N\chi_{\mathcal{G}^c} =: u_0 + u_1. \]
Here $u_0$ is the sum of the first two terms (and $\mathcal{G}^c := \mathbb{T}^2\setminus\mathcal{G}$). In view of (2.77) and (2.75),
\[ \|u_0 - \langle u\rangle\|_\infty = \|u_0 - L_N\|_\infty = \|u - L_N\|_{L^\infty(\mathcal{G})} \le \|u_N + L_n - 2L_{2n}\|_{L^\infty(\mathcal{G})} + |L_N + L_n - 2L_{2n}| \le C_\delta SN^{-\frac{1}{12}+2\delta}. \tag{2.78} \]
On the other hand, (2.73) and (2.76) imply that
\[ \|u_1\|_1 \le 2S\operatorname{mes}(\mathcal{G}^c) \le CS\bigl(N^{-9} + \exp(-N^{\delta})\bigr) \le C_\delta SN^{-9}. \tag{2.79} \]
Applying Lemma 2.5 to the function $u/S$ with $\varepsilon_0$ and $\varepsilon_1$ given by (2.78) and (2.79), respectively, proves (2.72). Indeed, in this case the quantity $B$ from Lemma 2.5 satisfies
\[ B \le C_\delta N^{-\frac{1}{12}+2\delta}\log(N^{10}) + CN^{\frac{3}{2}}N^{-\frac{9}{4}}, \]
which gives the value of $\sigma$ stated above. □
Remark 2.9. In view of Remark 2.7 it is clear that Prop. 2.11 only requires the Diophantine condition (2.42) in the range $1 \le k \le N^2$. This will be relevant in the proof of localization below.
2.4. The initial condition via large disorder. Let $V_j = v\circ T^j(x,y)$ and define
\[ f_n(x,y;\lambda,E) = \det\begin{bmatrix} \lambda V_1 - E & -1 & 0 & \cdots & 0\\ -1 & \lambda V_2 - E & -1 & \cdots & 0\\ 0 & -1 & \lambda V_3 - E & \ddots & \vdots\\ \vdots & & \ddots & \ddots & -1\\ 0 & \cdots & 0 & -1 & \lambda V_n - E \end{bmatrix}. \tag{2.80} \]
Recall the simple property
\[ M_n(x,y;\lambda,E) = \begin{bmatrix} f_n(x,y;\lambda,E) & -f_{n-1}(T(x,y);\lambda,E)\\ f_{n-1}(x,y;\lambda,E) & -f_{n-2}(T(x,y);\lambda,E) \end{bmatrix}. \tag{2.81} \]
Finally, let
\[ D_n(x,y;\lambda,E) = \operatorname{diag}(\lambda V_1 - E, \ldots, \lambda V_n - E). \tag{2.82} \]
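The identity (2.81) is easy to check numerically. The following is only a sketch under stated assumptions: the transfer matrix is taken to be the ordered product $A_n\cdots A_1$ with $A_j = \begin{bmatrix}\lambda V_j - E & -1\\ 1 & 0\end{bmatrix}$ (the convention visible in Lemma 3.4), the potential is the example $v(x,y)=\cos(2\pi x)$ from Remark 2.12, and the helper names and parameter values are illustrative, not taken from the paper.

```python
import numpy as np

def skew_shift_orbit(x, y, omega, n):
    """Return arrays (x_j, y_j) = T^j(x, y), j = 1..n, for T(x, y) = (x + y, y + omega) mod 1."""
    xs, ys = np.empty(n), np.empty(n)
    for j in range(n):
        x, y = (x + y) % 1.0, (y + omega) % 1.0
        xs[j], ys[j] = x, y
    return xs, ys

def transfer_matrix(a):
    """M_n = A_n ... A_1 with A_j = [[a_j, -1], [1, 0]] (assumed convention)."""
    M = np.eye(2)
    for a_j in a:
        M = np.array([[a_j, -1.0], [1.0, 0.0]]) @ M
    return M

def dirichlet_det(a):
    """Determinant of the tridiagonal matrix with diagonal a and -1 next to the diagonal."""
    k = len(a)
    if k == 0:
        return 1.0  # empty determinant, i.e. f_0 = 1
    H = np.diag(a) - np.eye(k, k=1) - np.eye(k, k=-1)
    return float(np.linalg.det(H))

# Illustrative parameters only.
lam, E, omega, n = 3.0, 0.4, np.sqrt(2) - 1, 6
xs, _ = skew_shift_orbit(0.1, 0.2, omega, n)
a = lam * np.cos(2 * np.pi * xs) - E  # a_j = lam * V_j - E

M = transfer_matrix(a)
F = np.array([
    [dirichlet_det(a),      -dirichlet_det(a[1:])],    # f_n,     -f_{n-1} o T
    [dirichlet_det(a[:-1]), -dirichlet_det(a[1:-1])],  # f_{n-1}, -f_{n-2} o T
])
identity_holds = np.allclose(M, F)
```

Dropping the first diagonal entry corresponds to composing with $T$, which is why the second column uses `a[1:]` and `a[1:-1]`.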
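Lemma 2.10. There exist constants $\lambda_0$ and $B$ depending only on $v$ such that for any positive integer $n$,
\[ \sup_{E}\ \operatorname{mes}\Bigl\{(x,y)\in\mathbb{T}^2 : \Bigl|\frac{1}{n}\log\|M_n(x,y;\lambda,E)\| - L_n(\lambda,E)\Bigr| \ge \frac{1}{20}S(\lambda,E)\Bigr\} \le n^{-50}, \tag{2.83} \]
provided $\lambda \ge \lambda_0 \vee n^{B}$. Furthermore, for those $\lambda$ and all $E$,
\[ L_n(\lambda,E) \ge \frac{1}{2}S(\lambda,E) \quad\text{and}\quad L_n(\lambda,E) - L_{2n}(\lambda,E) \le \frac{1}{80}S(\lambda,E). \]
Proof. The matrix on the right-hand side of (2.80) can be written in the form $D_n + B_n$, where $D_n$ is given by (2.82). Clearly, $\|B_n\| \le 2$ and
\[ \frac{1}{n}\log|\det D_n(x,y;\lambda,E)| = \log\lambda + \frac{1}{n}\sum_{j=1}^{n}\log|v(T^j(x,y)) - E/\lambda|. \tag{2.84} \]
It is a well-known property of nonconstant real-analytic functions $v$ that there exist constants $b > 0$ and $C$ depending on $v$ such that
\[ \operatorname{mes}\bigl\{(x,y)\in\mathbb{T}^2 : |v(x,y) - h| < t\bigr\} \le Ct^{b} \tag{2.85} \]
for all $-2\|v\|_\infty \le h \le 2\|v\|_\infty$ and $t > 0$, see for example Lemma 11.4 in [7]. Therefore, for any $|E| \le 2\lambda\|v\|_\infty$,
\[ \operatorname{mes}\Bigl\{(x,y)\in\mathbb{T}^2 : \frac{1}{n}\sum_{j=1}^{n}\log|v\circ T^j(x,y) - E/\lambda| < -\rho\Bigr\} < nCe^{-b\rho}. \tag{2.86} \]
One also has the upper bound
\[ \sup_{(x,y)\in\mathbb{T}^2}\ \frac{1}{n}\sum_{j=1}^{n}\log|v(T^j(x,y)) - E/\lambda| \le \log(3\|v\|_\infty). \tag{2.87} \]
Since
\[ \|D_n(x,y;\lambda,E)^{-1}\| \le \lambda^{-1}\max_{1\le j\le n}|v\circ T^j(x,y) - E/\lambda|^{-1}, \]
(2.85) implies that
\[ \operatorname{mes}\Bigl\{(x,y)\in\mathbb{T}^2 : \|D_n(x,y;\lambda,E)^{-1}\| > \frac{1}{4}\Bigr\} \le n\operatorname{mes}\bigl\{(x,y)\in\mathbb{T}^2 : |v(x,y) - E/\lambda| < 4\lambda^{-1}\bigr\} \le Cn\lambda^{-b}. \tag{2.88} \]
Hence
\[ \operatorname{mes}\Bigl\{(x,y)\in\mathbb{T}^2 : \|D_n(x,y;\lambda,E)^{-1}B_n\| > \frac{1}{2}\Bigr\} \le Cn\lambda^{-b}. \tag{2.89} \]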
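In view of (2.80), (2.84), (2.86), (2.87), and (2.88),
\[ \Bigl|\frac{1}{n}\log|f_n(x,y;\lambda,E)| - \log\lambda\Bigr| \le \Bigl|\frac{1}{n}\sum_{j=1}^{n}\log|v(T^j(x,y)) - E/\lambda|\Bigr| + \frac{1}{n}\bigl|\log|\det(I + D_n(x,y;\lambda,E)^{-1}B_n)|\bigr| \le \rho + \log(3\|v\|_\infty) + \log 2 \tag{2.90} \]
up to a set of measure not exceeding
\[ Cne^{-b\rho} + Cn\lambda^{-b}. \tag{2.91} \]
Now let $\rho = \frac{1}{400}\log\lambda$ and assume $(6\|v\|_\infty)^{400} \le \lambda$. Then the right-hand side of (2.90) is no larger than $\frac{1}{200}\log\lambda$. Under these assumptions the measure given by (2.91) is on the order of $Cn\lambda^{-\frac{b}{400}}$. Choosing $\lambda \ge n^{B}$ for some $B$ depending only on $v$ implies
\[ \sup_{|E|\le 2\lambda\|v\|_\infty}\operatorname{mes}\Bigl\{(x,y)\in\mathbb{T}^2 : \Bigl|\frac{1}{n}\log|f_n(x,y;\lambda,E)| - \log\lambda\Bigr| \ge \frac{1}{200}\log\lambda\Bigr\} \le n^{-100}. \]
In view of (2.81) one therefore obtains
\[ \sup_{|E|\le 2\lambda\|v\|_\infty}\operatorname{mes}\Bigl\{(x,y)\in\mathbb{T}^2 : \Bigl|\frac{1}{n}\log\|M_n(x,y;\lambda,E)\| - \log\lambda\Bigr| \ge \frac{1}{199}\log\lambda\Bigr\} \le 4n^{-100}. \tag{2.92} \]
In particular,
\[ |L_n(\lambda,E) - \log\lambda| \le \frac{1}{199}\log\lambda + 4S(\lambda,E)n^{-100} \le \frac{1}{198}S(\lambda,E), \tag{2.93} \]
provided $n \ge 2$. Since
\[ \log\lambda \ge \frac{99}{100}\sup_{|E|\le 2\lambda\|v\|_\infty}S(\lambda,E) \]
for large $\lambda_0$, (2.93) implies the second statement of the lemma in this range of $E$. Replacing $\log\lambda$ with $L_n$ in (2.92) yields
\[ \sup_{|E|\le 2\lambda\|v\|_\infty}\operatorname{mes}\Bigl\{(x,y)\in\mathbb{T}^2 : \Bigl|\frac{1}{n}\log\|M_n(x,y;\lambda,E)\| - L_n(\lambda,E)\Bigr| \ge \frac{1}{90}S(\lambda,E)\Bigr\} \le 4n^{-100}. \tag{2.94} \]
If $|E| > 2\lambda\|v\|_\infty$ and $\lambda_0$ is sufficiently large, then the set in (2.83) is empty. In fact, for such $E$,
\[ \Bigl|\frac{1}{n}\log|\det D_n(x,y;\lambda,E)| - \log|E|\Bigr| \le 2, \]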
and thus
\[ \Bigl|\frac{1}{n}\log|f_n(x,y;\lambda,E)| - \log|E|\Bigr| \le 4, \]
which implies that for large $\lambda$,
\[ \Bigl|\frac{1}{n}\log\|M_n(x,y;\lambda,E)\| - \log|E|\Bigr| \le 8 \le \frac{1}{200}S(\lambda,E). \]
Hence
\[ |L_n(\lambda,E) - \log|E|| \le \frac{1}{200}S(\lambda,E), \]
and the lemma follows. □
2.5. The proof of the large deviation estimate and positivity of the Lyapunov exponent.
Proposition 2.11. Fix $\varepsilon > 0$ small and let $\omega\in\Omega_\varepsilon$, see (2.42). Assume $v$ is a nonconstant real-analytic function on $\mathbb{T}^2$. Then for all $\sigma < \frac{1}{24}$ there exist $\tau = \tau(\sigma) > 0$ and constants $\lambda_1$ and $n_0$ depending only on $\varepsilon$, $v$ and $\sigma$ such that
\[ \sup_{E}\ \operatorname{mes}\Bigl\{(x,y)\in\mathbb{T}^2 : \Bigl|\frac{1}{n}\log\|M_n(x,y;\lambda,E)\| - L_n(\lambda,E)\Bigr| > S(\lambda,E)n^{-\tau}\Bigr\} \le C\exp(-n^{\sigma}) \tag{2.95} \]
for all $\lambda \ge \lambda_1$ and $n \ge n_0$. Furthermore, for those $\omega$, $\lambda$ and all $E$,
\[ L(\lambda,E) = \inf_{n}L_n(\lambda,E) \ge \frac{1}{4}\log\lambda. \]
Proof. Fix $\sigma < \frac{1}{24}$ throughout the proof and let $\tau = \tau(\sigma) > 0$ be as in (2.72). Moreover, let $\lambda \ge \lambda_0 \vee n_0^{B} =: \lambda_1$ be as in Lemma 2.10. In this proof we shall require $n_0$ to be sufficiently large at various places, but of course $n_0$ will be assumed fixed. In view of Lemma 2.10 the hypotheses of Lemma 2.8 are satisfied with $\gamma = \gamma_0 = \frac{1}{2}$,
\[ n_0^{2} \le N \le n_0^{5}, \tag{2.96} \]
provided
\[ 9n_0 \ge 20\log(2n_0^{10}), \tag{2.97} \]
cf. (2.70) (recall that $S(\lambda,E) \ge 1$). It is clear that (2.97) holds if $n_0$ is large. Applying Lemma 2.8 one obtains (suppressing $\lambda$, $E$ for simplicity)
\[ L_N \ge \Bigl(\frac{1}{2} - \frac{1}{40}\Bigr)S - C_0SN^{-1}n_0 \ge \gamma_1 S \quad\text{and}\quad L_N - L_{2N} \le C_0SN^{-1}n_0 \le \frac{\gamma_1}{40}S \tag{2.98} \]
with $\gamma_1 = \frac{1}{3}$. Moreover, with some constant $C_1 \ge 1$ depending on $\varepsilon$,
\[ \operatorname{mes}\Bigl\{(x,y)\in\mathbb{T}^2 : \Bigl|\frac{1}{N}\log\|M_N(x,y;\lambda,E)\| - L_N(\lambda,E)\Bigr| > S(\lambda,E)N^{-\tau}\Bigr\} \le C_1\exp(-N^{\sigma}) \tag{2.99} \]
for all $N$ in the range given by (2.96). In particular, (2.99) implies that
\[ \operatorname{mes}\Bigl\{(x,y)\in\mathbb{T}^2 : \Bigl|\frac{1}{N}\log\|M_N(x,y;\lambda,E)\| - L_N(\lambda,E)\Bigr| > \frac{\gamma_1}{10}S(\lambda,E)\Bigr\} \le C_1\exp(-N^{\sigma}) \le \bar{N}^{-10}, \]
provided $n_0$ is large and
\[ N^{2} \le \bar{N} \le C_1^{-\frac{1}{10}}\exp\Bigl(\frac{1}{10}N^{\sigma}\Bigr). \]
The first inequality was added to satisfy (2.70). In view of (2.96), one thus has the range
\[ n_0^{4} \le \bar{N} \le \exp\Bigl(\frac{1}{10}n_0^{5\sigma}\Bigr) \tag{2.100} \]
of admissible $\bar{N}$. Moreover,
\[ L_{\bar{N}} \ge \gamma_1 S - 2C_0SN^{-1}n_0 - C_0S\bar{N}^{-1}N \quad\text{and}\quad L_{\bar{N}} - L_{2\bar{N}} \le C_0S\bar{N}^{-1}N. \tag{2.101} \]
At the next stage of this procedure, observe that the left end-point of the range of admissible indices starts at $n_0^{8}$, which is less than the right end-point of the range (2.100) (for $n_0$ large). Therefore, from this point on the ranges will overlap and cover all large integers. To ensure that the process does not terminate, simply note the rapid convergence of the series given by (2.101). □
Remark 2.12. Herman's method [8] for proving positivity of the Lyapunov exponent for potentials given by trigonometric polynomials also applies to the skew-shift. However, it is well-known that his bound only involves the coefficient of the highest frequency of the trigonometric polynomial. In particular, it does not generalize to analytic functions covered by Prop. 2.11. On the other hand, for the important example $v(x,y) = \cos(2\pi x)$, it gives the superior lower bound
\[ \inf_{E}L(\lambda,E) \ge \log(\lambda/2). \]
Finally, in [2] the first author has recently shown that for this choice of $v$ and all sufficiently small $\lambda > 0$ there is $\omega_0(\lambda) > 0$ and a subset $\mathcal{E}_\lambda\subset[-2,2]$ with the property that $\operatorname{mes}([-2,2]\setminus\mathcal{E}_\lambda)\to 0$ as $\lambda\to 0$ and such that
\[ \inf_{E\in\mathcal{E}_\lambda}L(\omega,E) > 0 \quad\text{provided } 0 < \omega < \omega_0. \]
Here $L(\omega,E)$ denotes the Lyapunov exponent for the skew-shift $T_\omega(x,y) = (x+y, y+\omega)$. Observe that this behavior is the exact opposite of the one displayed by the well-known almost Mathieu equation as $\lambda\to 0$. The approach in [2] is based on Kotani's theorem [10, 14], Aubry duality, and a perturbative argument for the almost Mathieu equation.
2.6. Regularity of the integrated density of states. Let $E_{\Lambda,j}(\lambda,x,y)$, $j = 1,\ldots,b-a+1 = |\Lambda|$, be the eigenvalues of the restriction of (2.60) to the interval $\Lambda = [a,b]$ with zero boundary conditions, $\psi(a-1) = \psi(b+1) = 0$. Consider
\[ N_\Lambda(\lambda,E,x,y) = \frac{1}{|\Lambda|}\sum_{j}\chi_{(-\infty,E)}(E_{\Lambda,j}). \]
It is well-known that the weak limit (in the sense of measures)
\[ \lim_{a\to-\infty,\ b\to+\infty} dN_\Lambda(\lambda,\cdot,x,y) = dN(\lambda,\cdot) \]
exists and does not depend on $(x,y)\in\mathbb{T}^2$ (up to a set of measure zero). The distribution function $N(\lambda,\cdot)$ is called the integrated density of states. It is connected with the Lyapunov exponent via the Thouless formula
\[ L(\lambda,E) = \int\log|E - E'|\,dN(\lambda,E'). \tag{2.102} \]
In this subsection we show that for large $\lambda$ both $L$ and $N$ have a modulus of continuity which is at least as good as
\[ h(t) = \exp\bigl(-c|\log t|^{\frac{1}{24}-}\bigr). \tag{2.103} \]
This improves on various well-known continuity properties of $L$ and $N$ that hold for very general classes of transformations $T$. So far nothing better was known for the skew-shift than log-continuity, which corresponds to replacing the power of $\log t$ in (2.103) with $\log\log t$, see Figotin, Pastur [5] and the references therein. For the proof of (2.103) we follow the approach from [4], which only requires a large deviation estimate and the avalanche principle. The latter does not depend on the transformation, and the former is given by Prop. 2.11. In particular, our assumption of large disorder is made necessary by that proposition. Since it is rather straightforward to apply the technique from [4] here, we shall be somewhat brief.
Proposition 2.13. Let $\omega$, $v$, and $\lambda_1$ be as in Prop. 2.11. For $\lambda > \lambda_1$ both $N(\lambda,E)$ and $L(\lambda,E)$ are continuous in $E$ with modulus of continuity given by (2.103).
Proof. We shall prove this for $L$. It is standard to deduce the statement about $N$ from that on $L$ by means of (2.102), see [7, Sect. 10]. For the sake of simplicity we shall suppress $\lambda$ in the notation. Fix any positive $\sigma < \frac{1}{24}$. Let $N$ be a large integer and set $n = \lfloor C_0(\log N)^{\frac{1}{\sigma}}\rfloor$ with some large constant $C_0$. One deduces from the avalanche principle and (2.95) that
\[ |L_N(E) - 2L_{2n}(E) + L_n(E)| \le \frac{Cn}{N}, \qquad |L_{2N}(E) - 2L_{2n}(E) + L_n(E)| \le \frac{Cn}{N}. \tag{2.104} \]
The point is that (2.95) ensures that the hypotheses (2.8) and (2.9) in Prop. 2.1 are satisfied up to a set of measure less than $CN\exp(-n^{\sigma})$. This measure can therefore
be made less than $N^{-1}$ by taking $C_0$ large enough. Taking the difference of the two inequalities in (2.104) yields
\[ |L_N(E) - L_{2N}(E)| \le \frac{Cn}{N}, \]
which after summing over dyadic $N$ gives
\[ |L_N(E) - L(E)| \le \frac{Cn}{N}. \tag{2.105} \]
Inserting (2.105) into (2.104) leads to
\[ |L(E) - 2L_{2n}(E) + L_n(E)| \le \frac{Cn}{N}. \tag{2.106} \]
It is clear that the derivatives of $L_{2n}(E)$ and $L_n(E)$ in $E$ are at most of size $e^{Cn}$. In view of this fact (2.106) implies that for any nearby $E$, $E'$,
\[ |L(E) - L(E')| \le \frac{Cn}{N} + e^{Cn}|E - E'| \le C\exp\bigl(-c\,\bigl|\log|E - E'|\bigr|^{\sigma}\bigr), \tag{2.107} \]
if one sets $|E - E'| = \exp(-2Cn)$.
3. Localization
The purpose of this section is to show that the operator (2.60) has pure point spectrum with exponentially decaying eigenfunctions for most $\omega, x, y\in\mathbb{T}$ (i.e., up to a set of small measure) provided $\lambda$ is sufficiently large, see Theorem 3.7 below. We will follow the scheme from [3]. The basic idea behind the proof is to start with a generalized eigenfunction with energy $E$, whose existence is guaranteed by the Shnol–Simon theorem, and then to show that it in fact decays exponentially. It is well-known that for this to hold one needs the Green's functions $G_I(x,y;E)$ on most intervals $I\subset\mathbb{Z}$ with
\[ \operatorname{dist}(I,0)\sim|I| \tag{3.1} \]
to possess exponential off-diagonal decay. This in turn is the case provided the monodromy matrices corresponding to those intervals $I$ have norms which are on the order of $e^{L(E)|I|}$, $L(E)$ being the Lyapunov exponent. By the large deviation estimate (2.95), the bad set of $(x,y)\in\mathbb{T}^2$, where any given one of these monodromy matrices has too small a norm, is exponentially small in $|I|$. The difficulty that arises here is of course that the sets of bad parameters depend on $E$. In principle, one would therefore need to remove the union over $E$ of all these bad sets, which might amount to the entire parameter set. The approach in [3] is to consider the set of parameters where there is some energy $E$ with the property that, on the one hand, for some interval $J\subset\mathbb{Z}$ centered at $0$ the Green's function $G_J(x,y;E)$ has very large norm and, on the other hand, the Green's function $G_I(x,y;E)$ fails to have the necessary off-diagonal decay. Here $I$ is an arbitrary interval as in (3.1), whose length and position are related to the length of $J$, see the proof of Theorem 3.7 below for details. Using the large deviation theorem it is possible to show that this set of parameters has small measure, see Lemma 3.6 below. It was observed in [3] that estimating the measure of the set of parameters that produce these “double resonances” can be accomplished provided one has some control on its complexity. This
can be made precise in terms of semi-algebraic sets, which we also use here. The main technical statement in this context is Lemma 3.3 below. That lemma is in turn based on a general fact about the number of lattice points that can fall into a semi-algebraic set of not too large degree and small measure, see Lemma 3.2 for the exact statement. However, the proof of Lemma 3.3 also heavily exploits the structure of the skew-shift. It remains to be seen to what extent this method applies to other transformations. The arguments in this section do not directly invoke the lemmas from the previous section. We do, of course, use Proposition 2.11 in an essential way.
3.1. An estimate on the number of lattice points falling into a small set of bounded complexity. We begin by introducing some notation that will be used repeatedly in this section.
Definition 3.1. For any $a, b > 0$ let $a\lesssim b$ denote $Ca\le b$ for some absolute constant $C$. The case where $C$ is very large will be written as $a\ll b$. Finally, $a\sim b$ means that both $a\lesssim b$ and $a\gtrsim b$.
The following lemma will be important in the process of elimination of the energy. It is basically contained in Sect. 13 of [3].
Lemma 3.2. Let $S\subset[0,1]\times[0,1]$ be an open set with the following three properties:
\[ \operatorname{mes}(S) < e^{-B^{\sigma}} \text{ for some } \sigma > 0, \tag{3.2} \]
\[ \partial S \text{ is contained in the union of at most } B \text{ algebraic curves } G = [P = 0] \text{ of degree } \deg P < B, \tag{3.3} \]
\[ \text{for any line } L,\ S\cap L \text{ has at most } B \text{ connected components.} \tag{3.4} \]
Suppose $M$ and $B$ are related by the inequalities
\[ \log\log M \ll \log B \ll \log M. \tag{3.5} \]
Then
\[ \#\Bigl\{(m_1,m_2)\in\mathbb{Z}^2 : |m_i| < M \text{ and } \Bigl(\frac{m_1}{M},\frac{m_2}{M}\Bigr)\in S\Bigr\} < B^{C}M. \tag{3.6} \]
Furthermore, assume that
\[ \#\Bigl\{(m_1,m_2)\in\mathbb{Z}^2 : |m_i| < M \text{ and } \Bigl(\frac{m_1}{M},\frac{m_2}{M}\Bigr)\in S\Bigr\} > M^{1-10^{-7}}. \tag{3.7} \]
Then $S$ contains a line segment $L$ of length $|L| > M^{-1+10^{-2}}$ which is parallel to some integer vector with coordinates bounded by $M^{10^{-6}}$ and which contains a point of the form $\bigl(\frac{m_1}{M},\frac{m_2}{M}\bigr)$.
Proof. See [3]. □
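3.2. On the number of times a generic orbit of the skew-shift visits a small set of bounded complexity.
Lemma 3.3. Denote by $T_\omega:\mathbb{T}^2\to\mathbb{T}^2$ the $\omega$-skew-shift on $\mathbb{T}^2$. Let $S\subset\mathbb{T}^4\times\mathbb{R}$ be a semi-algebraic set of degree at most $B$ such that
\[ \operatorname{mes}(\operatorname{Proj}_{\mathbb{T}^4}S) < e^{-B^{\sigma}} \text{ for some } \sigma > 0. \tag{3.8} \]
Under the assumption (3.5) on $M$ and $B$,
\[ \operatorname{mes}\bigl\{(y_0,\omega)\in\mathbb{T}^2 : \bigl(y_0,\omega,T_\omega^{j}(0,y_0)\bigr)\in\operatorname{Proj}_{\mathbb{T}^4}S \text{ for some } j\sim M\bigr\} < M^{-10^{-8}}. \tag{3.9} \]
Proof. Let $\omega\in(0,1)$ be fixed and choose some $y_0\in[0,1)$. Then there are $(x,y)\in[0,1)^2$ such that mod $\mathbb{Z}^2$ (with $\equiv$ denoting congruence mod $\mathbb{Z}^2$)
\[ \begin{aligned} (x,y)\equiv T_\omega^{j}(0,y_0) &\equiv \Bigl(jy_0 + \frac{j(j-1)}{2}\omega,\ y_0 + j\omega\Bigr)\\ &\equiv \Bigl(jy_0 + \frac{j-1}{2}(y - y_0 + \nu'),\ y_0 + j\omega\Bigr)\\ &\equiv \Bigl(\frac{j-1}{2}y + \frac{j+1}{2}y_0 + \frac{\nu}{2},\ y_0 + j\omega\Bigr), \end{aligned} \tag{3.10} \]
where $\nu,\nu'\in\{0,1\}$. Assume $j\sim M$. Rewriting the congruences (3.10) as equalities in $\mathbb{R}$ yields
\[ \begin{cases} x = \dfrac{\nu}{2} + \dfrac{j-1}{2}y + \dfrac{j+1}{2}y_0 + m_1,\\[1ex] y = y_0 + j\omega + m_2 \end{cases} \tag{3.11} \]
with $|m_i|\lesssim M$. Solving (3.11) for $y_0$, $\omega$ one obtains
\[ \begin{cases} y_0 = \dfrac{2}{j+1}\Bigl(x - \dfrac{\nu}{2} - \dfrac{j-1}{2}y - m_1\Bigr) = \dfrac{2x-\nu}{j+1} - \dfrac{j-1}{j+1}y - \dfrac{2}{j+1}m_1,\\[1.5ex] \omega = \dfrac{1}{j}(y - y_0 - m_2) = \dfrac{\nu-2x}{j(j+1)} + \dfrac{2}{j+1}y + \dfrac{2}{j(j+1)}m_1 - \dfrac{m_2}{j}. \end{cases} \tag{3.12} \]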
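The passage from (3.10) to (3.12) is pure bookkeeping and can be checked in exact rational arithmetic. The sketch below is an illustration only: the sample values of $y_0$, $\omega$, $j$ and the function names are hypothetical. It iterates the skew-shift, recovers the integers $\nu$, $m_1$, $m_2$ of (3.11), and confirms that (3.12) returns the original $(y_0,\omega)$; it also evaluates the Jacobian determinant of $(x,y)\mapsto(y_0,\omega)$, which is of size $\sim M^{-2}$ for $j\sim M$.

```python
from fractions import Fraction as Fr

def skew_shift_iterate(y0, omega, j):
    """Apply T(x, y) = (x + y, y + omega) mod 1 exactly j times to the point (0, y0)."""
    x, y = Fr(0), y0
    for _ in range(j):
        x, y = (x + y) % 1, (y + omega) % 1
    return x, y

def solve_for_parameters(x, y, j, nu, m1, m2):
    """Formulas (3.12): recover (y0, omega) from (x, y) and the integers nu, m1, m2."""
    y0 = (2 * x - nu) / (j + 1) - Fr(j - 1, j + 1) * y - Fr(2, j + 1) * m1
    omega = ((nu - 2 * x) / (j * (j + 1)) + Fr(2, j + 1) * y
             + Fr(2, j * (j + 1)) * m1 - Fr(m2, 1) / j)
    return y0, omega

# Hypothetical sample data.
y0, omega, j = Fr(3, 7), Fr(5, 11), 9
x, y = skew_shift_iterate(y0, omega, j)

# Closed form of the j-th iterate, first line of (3.10).
closed_form = ((j * y0 + Fr(j * (j - 1), 2) * omega) % 1, (y0 + j * omega) % 1)

# Integers appearing in (3.11).
m2 = y - y0 - j * omega
two_r = 2 * (x - Fr(j - 1, 2) * y - Fr(j + 1, 2) * y0)  # equals nu + 2 * m1
nu = int(two_r) % 2
m1 = (two_r - nu) / 2

recovered = solve_for_parameters(x, y, j, nu, m1, m2)

# Determinant of d(y0, omega)/d(x, y), read off from the coefficients in (3.12).
jacobian = Fr(2, j + 1) ** 2 - Fr(j - 1, j + 1) * Fr(2, j * (j + 1))
```

Using `Fraction` avoids any floating-point ambiguity when testing that `m1` and `m2` are integers.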
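Denoting $\pi(S) = \operatorname{Proj}_{\mathbb{T}^4}(S)$ we shall estimate
\[ \int_{\mathbb{T}^2}\sum_{j\sim M}\chi_{\pi(S)}\bigl(y_0,\omega,T_\omega^{j}(0,y_0)\bigr)\,dy_0\,d\omega. \tag{3.13} \]
Using the change of variables given by (3.12) one obtains that the integral (3.13) is no larger than
\[ \begin{aligned} &\sum_{j\sim M}\ \sum_{|m_i|\lesssim M}\int\left|\det\begin{bmatrix}\frac{2}{j+1} & -\frac{j-1}{j+1}\\[0.5ex] -\frac{2}{j(j+1)} & \frac{2}{j+1}\end{bmatrix}\right|\chi_{\pi(S)}\bigl(y_0(x,y),\omega(x,y),x,y\bigr)\,dx\,dy\\ &\quad\sim M^{-2}\int_{\mathbb{T}^2}\sum_{j\sim M}\ \sum_{|m_i|\lesssim M}\chi_{\pi(S)_{x,y}}\Bigl(\frac{2x-\nu}{j+1} - \frac{j-1}{j+1}y - \frac{2}{j+1}m_1,\ \frac{\nu-2x}{j(j+1)} + \frac{2}{j+1}y + \frac{2}{j(j+1)}m_1 - \frac{m_2}{j}\Bigr)\,dx\,dy. \end{aligned} \tag{3.14} \]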
Here $\pi(S)_{x,y}$ denotes the slice of $\pi(S)$ for fixed $(x,y)$. Restrict $(x,y)\in\mathbb{T}^2$ to the set where
\[ \operatorname{mes}(\pi(S)_{x,y}) < e^{-\frac{1}{2}B^{\sigma}}. \tag{3.15} \]
By (3.8), the complementary set contributes to the integral (3.14) an amount not exceeding
\[ e^{-\frac{1}{2}B^{\sigma}}M < e^{-\frac{1}{3}B^{\sigma}}. \tag{3.16} \]
For fixed $x, y$, the set $S_{x,y}\subset\mathbb{T}^2\times\mathbb{R}$ is still semi-algebraic of degree at most $B$. Therefore, condition (3.3) of Lemma 3.2 holds for $\pi(S_{x,y}) = \pi(S)_{x,y}$, with $B^{C}$ instead of $B$. Moreover, for any line $L$ in $[0,1]^2$
\[ \pi(S)_{x,y}\cap L = \pi\bigl(S_{x,y}\cap(L\times\mathbb{R})\bigr) \]
has at most $B^{C}$ connected components, each of which is an interval. Thus condition (3.4) holds with $B$ replaced by $B^{C}$. Fix a point $(x,y)\in[0,1]^2$ satisfying (3.15) and assume
\[ \sum_{j\sim M;\ |m_i|\lesssim M}\chi_{\pi(S)_{x,y}}\Bigl(\frac{2x-\nu}{j+1} - \frac{j-1}{j+1}y - \frac{2}{j+1}m_1,\ \frac{\nu-2x}{j(j+1)} + \frac{2}{j+1}y + \frac{2}{j(j+1)}m_1 - \frac{m_2}{j}\Bigr) > \kappa M^{2}, \tag{3.17} \]
where $\kappa = M^{-10^{-7}}$. Fix $j\sim M$ and consider the affine transformation of $\mathbb{R}^2$
\[ A(z_1,z_2) := \Bigl(\frac{2x-\nu}{j+1} - \frac{j-1}{j+1}y - \frac{2j}{j+1}z_1,\ \frac{\nu-2x}{j(j+1)} + \frac{2}{j+1}y + \frac{2}{j+1}z_1 - z_2\Bigr) \tag{3.18} \]
for which
\[ |\det(DA)| = \left|\det\begin{bmatrix}-\frac{2j}{j+1} & 0\\[0.5ex] \frac{2}{j+1} & -1\end{bmatrix}\right|\sim 1. \]
Thus the set $A^{-1}\pi(S)_{x,y}$ still satisfies conditions (3.2)–(3.4) of Lemma 3.2. Therefore, in view of (3.6),
\[ \sum_{|m_1|\lesssim M,\ |m_2|\lesssim M}\chi_{\pi(S)_{x,y}}\bigl(A(m_1/j, m_2/j)\bigr) < B^{C}M. \]
In conjunction with (3.17) this implies that there exists a subset $J\subset\{j\sim M\}$ such that
\[ \#J > B^{-C}\kappa M \tag{3.19} \]
and
\[ \sum_{|m_i|\lesssim M}\chi_{\pi(S)_{x,y}}\bigl(A(m_1/j, m_2/j)\bigr) > \frac{\kappa}{2}M = \frac{1}{2}M^{1-10^{-7}} \tag{3.20} \]
for any choice of $j\in J$. In view of (3.20), condition (3.7) of Lemma 3.2 holds for the set $A^{-1}\pi(S)_{x,y}$. Hence, for any $j\in J$ there exists a vector $v\in\mathbb{Z}^2\setminus\{0\}$ such that
\[ |v_1| + |v_2| < M^{10^{-6}} \tag{3.21} \]
and a lattice point $m\in\mathbb{Z}^2$, $|m|\lesssim M$, such that
\[ P + tv := m/j + tv\in A^{-1}\pi(S)_{x,y} \quad\text{for all } 0 < t < M^{-1+\frac{1}{200}}. \tag{3.22} \]
Applying the affine transformation $A$ given by (3.18) yields
\[ \Bigl(\frac{2x-\nu}{j+1} - \frac{j-1}{j+1}y - \frac{2j}{j+1}(tv_1 + P_1),\ \frac{\nu-2x}{j(j+1)} + \frac{2}{j+1}y + \frac{2}{j+1}(tv_1 + P_1) - (tv_2 + P_2)\Bigr)\in\pi(S)_{x,y} \tag{3.23} \]
for all $t$ as in (3.22). Here $v = (v_1,v_2)$ and $P = (P_1,P_2) = m/j$ depend on $j$. Because of (3.19) and (3.21), there is a subset $J'\subset J$, so that
\[ \#J' > M^{-2\cdot 10^{-6}}\,\#J > M^{-3\cdot 10^{-6}}M \tag{3.24} \]
and for which all choices of $j\in J'$ have the same vector $v$. We first consider the case where $v$ lies on the line
\[ v_2 = 2v_1. \tag{3.25} \]
Denoting by $L^{(j)}$ the line segment given by (3.23), assume that for some choice of $j\ne j'$ in $J'$,
\[ \operatorname{dist}(L^{(j)}, L^{(j')}) < \tau. \]
Thus there exist $t, t'$ as in (3.22) so that
\[ \Bigl|\frac{2x-\nu}{j+1} - \frac{j-1}{j+1}y - \frac{2}{j+1}m_1 - \frac{2j}{j+1}tv_1 - \Bigl(\frac{2x-\nu'}{j'+1} - \frac{j'-1}{j'+1}y - \frac{2}{j'+1}m_1' - \frac{2j'}{j'+1}t'v_1\Bigr)\Bigr| < \tau \tag{3.26} \]
and
\[ \Bigl|\frac{\nu-2x}{j(j+1)} + \frac{2}{j+1}y + \frac{2m_1}{j(j+1)} - \frac{m_2}{j} + t\Bigl(\frac{2v_1}{j+1} - v_2\Bigr) - \Bigl[\frac{\nu'-2x}{j'(j'+1)} + \frac{2}{j'+1}y + \frac{2m_1'}{j'(j'+1)} - \frac{m_2'}{j'} + t'\Bigl(\frac{2v_1}{j'+1} - v_2\Bigr)\Bigr]\Bigr| < \tau. \tag{3.27} \]
Since by (3.25)
\[ -\frac{2v_1}{j+1} = \frac{2jv_1}{j+1} - v_2, \tag{3.28} \]
subtracting (3.26) from (3.27) and multiplying the resulting expression by $j(j+1)j'(j'+1)$ yields
\[ \bigl\|2jj'(j'+1)x - (j-1)jj'(j'+1)y - 2jj'(j+1)x + jj'(j+1)(j'-1)y + 2j'(j'+1)x - 2jj'(j'+1)y - 2j(j+1)x + 2j(j+1)j'y\bigr\|\lesssim\tau M^{4}, \]
which is the same as
\[ \bigl\|2(j-j')(1+j)(1+j')x\bigr\|\lesssim M^{4}\tau. \tag{3.29} \]
Here $\|\cdot\|$ denotes the distance to the nearest integer. The points $x\in\mathbb{T}$ for which (3.29) holds for an arbitrary choice of distinct $1\le j, j'\lesssim M$ form a set of measure $\lesssim M^{6}\tau$. Taking $\tau = M^{-100}$, one concludes that the contribution of those points $x$ to the integral (3.14) is at most
\[ M^{-90}. \tag{3.30} \]
Excluding those points, one can therefore assume that for any choice of $j\ne j'$ in $J'$,
\[ \operatorname{dist}(L^{(j)}, L^{(j')}) > \tau, \tag{3.31} \]
where the line segments $L^{(j)}\subset\pi(S)_{x,y}$. We will show that this leads to a contradiction. For an arbitrary set $\Omega\subset\mathbb{R}^2$ denote by $N(\Omega,\tau)$ the number of $\tau$-balls needed to cover the set $\Omega$. $N$ is also referred to as “entropy”. In view of (3.31), (3.24), and the property that $|L^{(j)}| > M^{-1+\frac{1}{100}}$,
\[ N\Bigl(\pi(S)_{x,y},\frac{\tau}{10}\Bigr) > N\Bigl(\bigcup_{j\in J'}L^{(j)},\frac{\tau}{10}\Bigr)\gtrsim\#J'\,\tau^{-1}M^{-(1-\frac{1}{100})}\gtrsim M^{1-3\cdot 10^{-6}}\tau^{-1}M^{\frac{1}{100}-1}\gtrsim M^{\frac{1}{200}}\tau^{-1}. \tag{3.32} \]
On the other hand, $\pi(S)_{x,y}$ lies within an $e^{-\frac{1}{4}B^{\sigma}}$-neighborhood of at most $B^{C}$ many algebraic curves $G$ of degree not exceeding $B^{C}$. By our assumption (3.5), $\tau\gg e^{-\frac{1}{4}B^{\sigma}}$. Therefore,
\[ N(G,\tau)\lesssim\tau^{-1}\ell(G) < B^{C}\tau^{-1}, \qquad N(\pi(S)_{x,y},\tau)\lesssim B^{C}\tau^{-1}. \tag{3.33} \]
Because of $\log M\gg\log B$ this contradicts (3.32). It remains to consider the case where the vector $v\in\mathbb{Z}^2\setminus\{0\}$ satisfies (3.21) but $v_2\ne 2v_1$. It follows from (3.23) that the segment $L^{(j)}$ is oriented in the direction
\[ \frac{\frac{2}{j+1} - \frac{v_2}{v_1}}{-\frac{2j}{j+1}} = \frac{s(j+1)-1}{j}, \quad\text{where } s := \frac{v_2}{2v_1}\ne 1, \text{ in fact, } |s-1|\ge\frac{1}{2|v_1|}\ge\frac{1}{M}. \]
Fig. 2. The bush $L^{(j)}$
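Thus for any choice of $j\ne j'$,
\[ \Bigl|\frac{s(j+1)-1}{j} - \frac{s(j'+1)-1}{j'}\Bigr|\ge\frac{|1-s|}{M^{2}}\ge M^{-3}. \tag{3.34} \]
One now again considers the system of lines $\{L^{(j)}\mid j\in J'\}$. Let $L_\tau^{(j)}$ denote a $\tau$-neighborhood of $L^{(j)}$. Then, on the one hand,
\[ \int_{\mathbb{T}^2}\sum_{j\in J'}\chi_{L_\tau^{(j)}}\,dx\,dy\gtrsim\#J'\,M^{-1+\frac{1}{100}}\tau\gtrsim M^{\frac{1}{200}}\tau. \tag{3.35} \]
On the other hand, since each $L_\tau^{(j)}$ is contained in a $\tau$-neighborhood of $\pi(S)_{x,y}$, (3.33) implies that
\[ \int_{\mathbb{T}^2}\sum_{j\in J'}\chi_{L_\tau^{(j)}}\,dx\,dy\lesssim\Bigl\|\sum_{j\in J'}\chi_{L_\tau^{(j)}}\Bigr\|_\infty\tau^{2}N(\pi(S)_{x,y},\tau)\lesssim\tau B^{C}\Bigl\|\sum_{j\in J'}\chi_{L_\tau^{(j)}}\Bigr\|_\infty. \tag{3.36} \]
One concludes from (3.35) and (3.36) that
\[ \Bigl\|\sum_{j\in J'}\chi_{L_\tau^{(j)}}\Bigr\|_\infty\gtrsim M^{\frac{1}{200}}B^{-C}\gtrsim M^{\frac{1}{300}}. \tag{3.37} \]
Hence there is a subsystem $\{L^{(j)}\mid j\in J''\}$ of cardinality $\#J''\gtrsim M^{\frac{1}{300}}$ such that the tubes $\{L_\tau^{(j)}\mid j\in J''\}$ have a common point $P_0$. It follows from (3.34) that
\[ \angle(L^{(j)}, L^{(j')})\gtrsim M^{-3} \quad\text{for any choice of } j\ne j'. \tag{3.38} \]
Choose a line $L_0$ that crosses the majority of lines in the bush $\{L^{(j)}\mid j\in J''\}$ transversely. Recalling that $\pi(S)_{x,y}\cap L_0$ has at most $B^{C}\ll M^{\frac{1}{300}}$ many components, one obtains two distinct $j, j'\in J''$ for which the points
\[ L_0\cap L^{(j)},\quad L_0\cap L^{(j')}\in\pi(S)_{x,y} \]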
belong to the same component of $\pi(S)_{x,y}\cap L_0$. In view of (3.38) and (3.22) this implies that $\operatorname{mes}_{L_0}(\pi(S)_{x,y}\cap L_0)\gtrsim M^{-4}$. Since one can translate $L_0$ by an amount $\sim M^{-1}$, one finally obtains $\operatorname{mes}(\pi(S)_{x,y})\gtrsim M^{-5}$, which again contradicts (3.15). We have reached the conclusion that our assumption (3.17) fails. Recalling estimates (3.16), (3.30) on the exceptional $(x,y)$-sets, this implies that
\[ (3.13), (3.14)\lesssim e^{-\frac{1}{3}B^{\sigma}} + M^{-99} + \frac{1}{M^{2}}\kappa M^{2} < 2M^{-10^{-7}}, \]
which proves (3.9). □
3.3. Averaging the monodromy matrix over long orbits. For the remainder of this paper we shall assume that there is a large deviation estimate as in Prop. 2.11, without specifying $\lambda$ in our notation. More precisely, we shall write the large deviation estimate in the form
\[ \sup_{E}\ \operatorname{mes}\Bigl\{(x,y)\in\mathbb{T}^2 : \Bigl|\frac{1}{n}\log\|M_n(x,y;E)\| - L_n(E)\Bigr| > n^{-\sigma}\Bigr\}\le C\exp(-n^{\sigma}). \tag{3.39} \]
By Prop. 2.11 this holds provided $\sigma > 0$ is sufficiently small and for all $n\ge n_1(\lambda,v,\varepsilon)$, where $\omega\in\Omega_\varepsilon$. Moreover, for the sake of simplicity $v$ will be assumed to be a trigonometric polynomial. The extension to real-analytic potentials is straightforward.
Lemma 3.4. Let $T_\omega$ be the $\omega$-skew-shift, $\omega$ satisfying
\[ \|k\omega\|\ge c_\varepsilon|k|^{-1-\varepsilon} \quad\text{for all } k\in\mathbb{Z},\ 0 < |k| < N. \tag{3.40} \]
Then, denoting
\[ u_{N_0}(x,y) := \frac{1}{N_0}\log\Bigl\|\prod_{j=N_0}^{1}\begin{bmatrix}v(T_\omega^{j}(x,y)) - E & -1\\ 1 & 0\end{bmatrix}\Bigr\|, \]
there exist constants $\sigma > 0$, $C > 1$ so that for $N > N_0^{C}$ one has the uniform bound
\[ \Bigl\|\frac{1}{N}\sum_{j=1}^{N}u_{N_0}\circ T_\omega^{j} - \int_{\mathbb{T}^2}u_{N_0}(x,y)\,dx\,dy\Bigr\|_{L^\infty(\mathbb{T}^2)} < N_0^{-\sigma}. \tag{3.41} \]
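Proof. By the large deviation theorem, the set
\[ \Omega := \bigl\{(x,y)\in\mathbb{T}^2 : |u_{N_0}(x,y) - \langle u_{N_0}\rangle| > N_0^{-\sigma}\bigr\} \]
satisfies
\[ \operatorname{mes}(\Omega) < e^{-N_0^{\sigma}}. \tag{3.42} \]
Since $v$ is a trigonometric polynomial, $\Omega$ is clearly semi-algebraic, expressed by polynomials in $(x,y)$ of degree not exceeding $N_0^{C}$. Hence $\partial\Omega$ is contained in the union of no more than $N_0^{C}$ many algebraic curves $G$ of degree bounded by $N_0^{C}$. Therefore, one has the entropy bounds $N(G,\tau)\lesssim N_0^{C}\tau^{-1}$, and since, by (3.42),
\[ \sup_{(x,y)\in\Omega}\operatorname{dist}((x,y),\partial\Omega)\lesssim e^{-\frac{1}{2}N_0^{\sigma}}, \]
one also has
\[ N(\Omega,\tau)\lesssim N_0^{C}\tau^{-1}, \tag{3.43} \]
provided $\tau > e^{-\frac{1}{3}N_0^{\sigma}}$. It clearly suffices to prove (3.41) for $N < e^{\frac{1}{10}N_0^{\sigma}}$. Consider the expression
\[ \frac{1}{N^{2}}\sum_{j\ne j'=1}^{N}\bigl\|T^{j}(x,y) - T^{j'}(x,y)\bigr\|^{-2}\sim\frac{1}{N^{2}}\sum_{j\ne j'=1}^{N}\Bigl[\bigl\|(j-j')y + \bigl(j(j-1) - j'(j'-1)\bigr)\omega/2\bigr\| + \|(j-j')\omega\|\Bigr]^{-2}, \tag{3.44} \]
where $\|\cdot\|$ denotes both the natural distance on $\mathbb{T}^2$ and $\mathbb{T}$. Setting $k = j-j'$ and $\ell = j+j'-1$, (3.44) can be rewritten in the form
\[ \frac{1}{N^{2}}\sum_{0<|k|\le N}\ \sum_{|\ell|\le 2N}\Bigl[\bigl\|k\bigl(y + \tfrac{1}{2}\ell\omega\bigr)\bigr\| + \|k\omega\|\Bigr]^{-2}\le N^{-2}\sum_{0<|k|\le N}\ \sup_{z}\sum_{|\ell|\le N}\bigl[\|z + \ell k\omega\| + \|k\omega\|\bigr]^{-2}. \tag{3.45} \]
Let $\|\theta\| = \|k\omega\| = \delta > N^{-1+\varepsilon}$.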
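Then the inner sum in (3.45) is at most
\[ \delta N^{1+\varepsilon}\sum_{s=0}^{1/\delta}(\delta^{2} + s^{2}\delta^{2})^{-1}\sim\frac{N^{1+\varepsilon}}{\delta} = \frac{N^{1+\varepsilon}}{\|k\omega\|}. \tag{3.46} \]
Summing (3.46) over $0 < |k|\le N$ implies that
\[ (3.44), (3.45)\lesssim N^{\varepsilon}, \tag{3.47} \]
again invoking (3.40). Fixing $(x,y)\in\mathbb{T}^2$, we shall estimate $\#J$, where
\[ J = \{j = 1,\ldots,N\mid T^{j}(x,y)\in\Omega\}. \]
Let $\tau > \frac{1}{N}$ and choose a collection of disks $\{D(P_s,\tau)\mid s = 1,\ldots,r\}$ covering $\Omega$, where by (3.43)
\[ r\lesssim N_0^{C}\tau^{-1}. \tag{3.48} \]
Since by (3.47)
\[ N^{-2}\sum_{j\ne j'=1}^{N}\bigl\|T^{j}(x,y) - T^{j'}(x,y)\bigr\|^{-2}\lesssim N^{\varepsilon}, \tag{3.49} \]
we obtain in particular that
\[ \tau^{-2}N^{-2}\sum_{s=1}^{r}\#\bigl\{j\ne j' : T^{j}(x,y)\in D(P_s,\tau),\ T^{j'}(x,y)\in D(P_s,\tau)\bigr\}\lesssim N^{\varepsilon}. \tag{3.50} \]
Define for $s = 1,\ldots,r$,
\[ J_s = \{j = 1,\ldots,N\mid T^{j}(x,y)\in D(P_s,\tau)\} \]
so that $J\subset\bigcup_{s}J_s$. Clearly, (3.48) implies that
\[ \#J\lesssim N_0^{C}\tau^{-1} + \sum_{\#J_s>1}\#J_s. \tag{3.51} \]
Furthermore, by (3.50),
\[ \sum_{\#J_s>1}(\#J_s)^{2}\lesssim\tau^{2}N^{2+\varepsilon}. \tag{3.52} \]
It follows from (3.51) and (3.52) that
\[ \#J\lesssim N_0^{C}\tau^{-1} + \sqrt{r}\,\tau N^{1+\varepsilon}\lesssim N_0^{C}\tau^{-1} + N_0^{C}\tau^{1/2}N^{1+\varepsilon}. \]
Optimizing in $\tau$ yields
\[ \#J\lesssim N_0^{C}N^{\frac{2}{3}+\varepsilon}. \tag{3.53} \]
Since $u_{N_0}$ is bounded, (3.53) implies that
\[ \Bigl|\frac{1}{N}\sum_{j=1}^{N}u_{N_0}(T^{j}(x,y)) - \int_{\mathbb{T}^2}u_{N_0}\Bigr|\lesssim N_0^{-\sigma} + CN^{-1}\#J\lesssim N_0^{-\sigma} + N_0^{C}N^{-\frac{1}{3}+\varepsilon}. \]
Inequality (3.41) follows provided $N > N_0^{C_1}$ with some large $C_1$. □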
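The two sides of (3.41) can be compared numerically. The sketch below is illustrative only and makes several assumptions that are not in the text: the potential is $\lambda\cos(2\pi x)$ with the coupling $\lambda$ written out explicitly, the space average is replaced by a midpoint grid, and the parameter values and function names are arbitrary. It makes no claim about the rate in Lemma 3.4; it only computes an orbit average and a grid average of $u_{N_0}$.

```python
import numpy as np

def u(x, y, omega, lam, E, n0):
    """u_{n0}(x, y) = (1/n0) log ||A_{n0} ... A_1||, A_j = [[lam*cos(2 pi x_j) - E, -1], [1, 0]],
    where (x_j, y_j) = T^j(x, y) and T(x, y) = (x + y, y + omega) mod 1."""
    M = np.eye(2)
    log_norm = 0.0
    for _ in range(n0):
        x, y = (x + y) % 1.0, (y + omega) % 1.0
        a = lam * np.cos(2 * np.pi * x) - E
        M = np.array([[a, -1.0], [1.0, 0.0]]) @ M
        s = np.linalg.norm(M, 2)  # renormalise at each step to avoid overflow
        M /= s
        log_norm += np.log(s)
    return log_norm / n0

def orbit_average(x, y, omega, lam, E, n0, N):
    """(1/N) * sum_{j=1}^{N} u_{n0}(T^j(x, y))."""
    total = 0.0
    for _ in range(N):
        x, y = (x + y) % 1.0, (y + omega) % 1.0
        total += u(x, y, omega, lam, E, n0)
    return total / N

def grid_average(omega, lam, E, n0, m):
    """Midpoint-rule approximation of the integral of u_{n0} over the torus."""
    pts = (np.arange(m) + 0.5) / m
    return float(np.mean([u(x, y, omega, lam, E, n0) for x in pts for y in pts]))

# Illustrative parameters only.
lam, E, omega, n0 = 4.0, 0.3, np.sqrt(2) - 1, 8
orbit_avg = orbit_average(0.123, 0.456, omega, lam, E, n0, N=200)
space_avg = grid_average(omega, lam, E, n0, m=20)
upper = np.log(lam + abs(E) + 2.0)  # crude bound on log ||A_j||
```

Because each factor has determinant one, every value of `u` lies between $0$ and `upper`, which gives a cheap sanity check on both averages.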
The somewhat technical assumption (3.40), which requires only finitely many conditions on $\omega$ in terms of $k$, was made in order to ensure that Lemma 3.3 can be applied. This will be important in the proof of localization, see Theorem 3.7 below. The previous lemma turns out to have several applications, one of which is the following uniform upper bound on the norm of the monodromy matrices.
Corollary 3.5. Assume $\omega$ satisfies the Diophantine condition (3.40). For any $N > N_0^{C}$, there is a uniform estimate for all $E\in\mathbb{R}$,
\[ \sup_{(x,y)\in\mathbb{T}^2}\frac{1}{N}\log\|M_N(x,y;E)\| < L_{N_0}(E) + N_0^{-\sigma}. \tag{3.54} \]
3.4. Double resonances occur with small probability. Fix $\varepsilon > 0$ small and let $\omega\in\Omega_\varepsilon$, see (2.42). Since we are assuming that the disorder $\lambda$ is large, Prop. 2.11 guarantees that
\[ \inf_{E}L(E) > c_0 > 0. \tag{3.55} \]
The purpose of this subsection is to prove the following lemma, which asserts in effect that double resonances occur with small probability. An analogous statement for the shift can be found in [3]. The importance of double resonances is of course a standard fact in the theory of localization, cf. Sinai [15] and Fröhlich, Spencer, Wittwer [6]. In what follows, $H_{[-N_1,N_1]}(\omega,x,y)$ denotes the operator given by the left-hand side of (2.60) (with $T = T_\omega$) restricted to the interval $[-N_1,N_1]$ with Dirichlet boundary conditions. We shall also write $L_N(\omega,E)$ instead of $L_N(E)$ to indicate the dependence on $\omega$.
Lemma 3.6. Fix a small $\varepsilon > 0$. Let $N$ be an arbitrary positive integer and let $C_2\ge 1$ be some constant. Define $S = S_N\subset\mathbb{T}^4\times\mathbb{R}$ to be the set of those $(\omega,y_0,x,y,E)$ for which there exists some $N_1 < N^{C_2}$ so that
\[ \|k\omega\|\ge\varepsilon|k|^{-1}(1+\log k)^{-2} \quad\text{for all } 0 < k < N, \tag{3.56} \]
\[ \bigl\|\bigl(H_{[-N_1,N_1]}(\omega,0,y_0) - E\bigr)^{-1}\bigr\| > e^{C_3N}, \tag{3.57} \]
\[ \frac{1}{N}\log\|M_N(\omega,x,y,E)\| < L_N(\omega,E) - c_0/10. \tag{3.58} \]
Here $c_0$ is the constant from (3.55) and $C_3$ will be a sufficiently large constant depending on $v$. Then
\[ \operatorname{mes}(\operatorname{Proj}_{\mathbb{T}^4}S)\lesssim\exp\Bigl(-\frac{1}{2}N^{\sigma}\Bigr). \tag{3.59} \]
Moreover, $S$ is contained in a set $S'$ satisfying the measure estimate (3.59) and which is semi-algebraic of degree at most $N^{C}$ for some constant $C$ depending on $v$, $\varepsilon$, $C_2$ and $C_3$.
Proof. Fix some sufficiently large $N$. Firstly, recall that the large deviation estimate (3.39) for $n = N$ holds under the condition (3.56) on $\omega$, see Remark 2.9. Now fix some $\omega$ as in (3.56) and let $y_0\in\mathbb{T}$ be arbitrary. If $E$ satisfies (3.57), then by self-adjointness of $H$,
\[ |E - E'| < e^{-C_3N} \tag{3.60} \]
for some E ∈ Spec H[−N1 ,N1 ] (ω, 0, y0 ) . Observe that these eigenvalues E do not depend on (x, y). It follows from (3.60) with sufficiently large C3 and (3.58) that 1 N
log MN (ω, x, y, E ) < LN (ω, E ) − c0 /20.
(3.61)
This can be seen by differentiating the functions on the left-hand side of (3.61) in the energy. In view of (3.39), the measure of the set of (x, y) ∈ T2 for which (3.61) holds σ −N with fixed E does not exceed e . This proves that
1 σ σ mes ProjT4 S N12 e−N e− 2 N ,
(3.62)
as claimed. It remains to be shown that conditions (3.57) and (3.58) can be replaced by inequalities involving only polynomials of degree at most $N^{C}$ for some $C$, without increasing the measure estimate (3.59) by more than a factor of two, say. We will not provide all details, since they can be readily found in [3]. Using Hilbert–Schmidt norms in (3.57) and expressing the inverse in terms of Cramer's rule shows that condition (3.57) is semi-algebraic of degree at most $C N_1^{3}$. Using Lemma 3.4, we may express the Lyapunov exponent
\[ L_N(\omega,E) = \frac{1}{N}\int_{[0,1]^2} \log\|M_N(\omega,x,y,E)\|\,dx\,dy \]
appearing in (3.58) as a discrete average
\[ L_N(\omega,E) = R^{-1}\sum_{j=1}^{R} \frac{1}{N}\log\|M_N(\omega,T_\omega^{j}(0,0),E)\| + o(1) \]
with $R < N^{C}$. Therefore, one obtains a semi-algebraic condition in $\omega, x, y, E$ of degree at most $N^{C}$ by rewriting (3.58) in the form
\[ \|M_N(\omega,x,y;E)\|^{2R} \le e^{-NRc_0/10}\,\prod_{j=1}^{R}\|M_N(\omega,T_\omega^{j}(0,0);E)\|^{2}. \]
Finally, the measure of the set $S$ does not change by more than a factor in this process. □

3.5. The proof of localization for the skew-shift with large disorder. The following theorem is the main result of this section.

Theorem 3.7. Fix $\varepsilon > 0$ small. Let $v = v(x, y)$ be a nonconstant trigonometric polynomial on $\mathbb{T}^2$ and let $\lambda_1 = \lambda_1(v, \varepsilon)$ be as in Prop. 2.11. Let $T_\omega(x, y) = (x + y, y + \omega) \pmod{\mathbb{Z}^2}$ denote the $\omega$-skew-shift on $\mathbb{T}^2$. Then for every $\lambda > \lambda_1$ and all $(\omega, x, y) \in \mathbb{T}^3$ up to a set of measure $\varepsilon$, the operator
\[ \bigl(H_{\omega,(x,y)}\psi\bigr)_n := -\psi_{n-1} - \psi_{n+1} + \lambda\, v\bigl(T_\omega^{n}(x,y)\bigr)\psi_n \]
on $\ell^2(\mathbb{Z})$ displays Anderson localization for all energies.
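The objects in Theorem 3.7 can be explored numerically. The following is a minimal sketch, not the authors' method: it iterates the skew-shift in closed form and estimates the finite-scale quantity $\frac1N\log\|M_N\|$ from products of transfer matrices for the operator in the theorem. The potential `v` (a single cosine), the Frobenius norm, the ordering of the product, and all function names are assumptions made here for illustration.

```python
import math

def skew_shift_iterate(x, y, omega, n):
    """n-th iterate (n >= 0) of T_omega(x, y) = (x + y, y + omega) mod 1, in closed form."""
    xn = (x + n * y + 0.5 * n * (n - 1) * omega) % 1.0
    yn = (y + n * omega) % 1.0
    return xn, yn

def finite_scale_exponent(lam, E, omega, x, y, N,
                          v=lambda a, b: math.cos(2.0 * math.pi * a)):
    """Estimate (1/N) log ||M_N|| (Frobenius norm) for
    -psi_{n-1} - psi_{n+1} + lam * v(T^n(x, y)) * psi_n = E * psi_n.
    The potential v is a hypothetical choice (any trigonometric polynomial would do)."""
    a, b, c, d = 1.0, 0.0, 0.0, 1.0  # running product, renormalised at every step
    log_norm = 0.0
    for n in range(1, N + 1):
        xn, yn = skew_shift_iterate(x, y, omega, n)
        t = lam * v(xn, yn) - E
        # left-multiply by the one-step transfer matrix [[t, -1], [1, 0]]
        a, b, c, d = t * a - c, t * b - d, a, b
        s = math.sqrt(a * a + b * b + c * c + d * d)
        a, b, c, d = a / s, b / s, c / s, d / s
        log_norm += math.log(s)  # accumulate the logarithm to avoid overflow
    return log_norm / N
```

For instance, `finite_scale_exponent(10.0, 0.0, (5 ** 0.5 - 1) / 2, 0.1, 0.2, 200)` gives a rough picture of the large-disorder regime; the renormalisation at each step only rescales the product, so the accumulated logarithm is unaffected by it.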
Proof. Let $\omega \in \Omega_\varepsilon$, see (2.42). For large $N$, let $S_N$ be as in Lemma 3.6. Then Lemma 3.3 applies to $S_N$ and setting $\bar N = e^{(\log N)^2}$ it follows that
\[ \operatorname{mes}\bigl\{(y_0,\omega)\in\mathbb{T}^2 \;\big|\; (y_0,\omega,T_\omega^{j}(0,y_0)) \in \operatorname{Proj}_{\mathbb{T}^4}(S_N) \text{ for some } j \sim \bar N\bigr\} < \bar N^{-10^{-8}}. \tag{3.63} \]
Let $B_N$ denote the set on the left-hand side of (3.63) and define
\[ B^{(0)} := \limsup_{N\to\infty} B_N. \]
Thus $\operatorname{mes}(B^{(0)}) = 0$. Since $T(x, y) = x + T(0, y) \pmod 1$, this construction applied to the potential $v(x + \cdot, \cdot)$ instead of $v$ produces a set $B^{(x)}$ of measure zero. Finally, set $B := \{(\omega, x, y) \mid (y, \omega) \in B^{(x)}\}$, which is again of measure zero. It is for all $(\omega, x, y) \in \Omega_\varepsilon \times \mathbb{T}^2 \setminus B$ that we shall prove localization.
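Fix such a choice of $(\omega, x, y)$ and any $E \in \operatorname{Spec}(H_{\omega,(x,y)})$. By the Shnol–Simon theorem [12, 13] there exists a generalized eigenfunction $\xi$, i.e.,
\[ (H_{\omega,(x,y)} - E)\xi = 0 \quad\text{and}\quad |\xi_n| \lesssim 1 + |n| \ \text{ for all } n \in \mathbb{Z}. \tag{3.64} \]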
Furthermore, we normalize |ξ0 | + |ξ1 | = 1. Fix some large integer N and assume that (3.57) holds. By our choice of (ω, x, y),
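\[ \frac{1}{N'}\log\bigl\|M_{N'}\bigl(T_\omega^{j}(x,y);E\bigr)\bigr\| > L(E) - c_0/10 \]
for all $N' \sim N$ and $j \sim \bar N = e^{(\log N)^2}$, cf. (3.58). It follows from the avalanche principle that then also
\[ \frac{1}{N_2}\log\bigl\|M_{N_2}\bigl(T_\omega^{j}(x,y);E\bigr)\bigr\| > L(E) - c_0/10 \quad\text{if } \frac{\bar N}{2} < |j| < \bar N \text{ and } N < N_2 < \frac{\bar N}{10}. \tag{3.65} \]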
As usual, let
\[ G_\Lambda(\omega,x,y;E) := \bigl(H_\Lambda(\omega,x,y) - E\bigr)^{-1} \]
be the Green's function. As before, $H_\Lambda$ denotes the restriction of $H$ to the interval $\Lambda$ with Dirichlet boundary conditions. Consider intervals
\[ \Lambda = \Bigl[j,\ j + \frac{\bar N}{10}\Bigr], \quad\text{where } \frac{\bar N}{2} < |j| < \bar N. \]
By definition of $G_\Lambda$ and because of (3.64), it will suffice to prove that
\[ \max_{\ell\in\partial\Lambda} |G_\Lambda(\omega,x,y;E)(k,\ell)| \lesssim \exp(-c_1 \bar N) \quad\text{for all } k \in \Lambda \text{ with } \operatorname{dist}(k,\partial\Lambda) > \tfrac14|\Lambda|. \tag{3.66} \]
Here $c_1 > 0$ is some fixed constant. The proof of (3.66) follows from (3.65) by a standard argument. In fact, it is a simple consequence of Cramer's rule and the representation of the Hamiltonian as the matrix appearing on the right-hand side of (2.80) that for any $n$ and $1 \le k, \ell \le n$,
\[ G_{[1,n]}(x,y;E)(k,\ell) = \frac{f_{k-1}(x,y;E)\, f_{n-\ell-1}\bigl(T^{\ell}(x,y);E\bigr)}{f_n(x,y;E)}. \]
In conjunction with (2.81), Corollary 3.5, and (3.65), this implies (3.66) as desired. Recall, however, that we made the assumption that (3.57) holds. To establish this condition it suffices to show that $|\xi_{N_1+1}| + |\xi_{-N_1-1}| \lesssim e^{-2C_3 N}$ for some $N_1 \sim N^{C_2}$. In view of (3.64) this estimate holds provided both Green's functions
\[ G_{[j-4C_3N,\ j+4C_3N]}(\omega,x,y;E) = G_{[-4C_3N,\ 4C_3N]}\bigl(\omega,T^{j}(x,y);E\bigr) \quad\text{with } j = N_1,\ -N_1 \]
satisfy an exponential decay estimate as in (3.66). In view of the preceding argument involving (3.66) it remains to show that for some $j \sim N^{C_2}$ one has the property
\[ \frac{1}{4C_3N}\log\bigl\|M_{4C_3N}\bigl(T_\omega^{j}(x,y),E\bigr)\bigr\| > L(E) - c_0/10 \]
and similarly for $-j$. That, however, is an immediate consequence of Lemma 3.4. □
Acknowledgement. The authors thank Thomas Spencer and Yakov Sinai for helpful discussions. The third author wishes to thank Thomas Wolff for the suggestion that the Hilbert transform might lead to better bounds in the paper [7], and for pointing out Theorem 1.9 in [9], and Charles Fefferman for sketching to him the proof of an important BMO estimate for logarithms of polynomials. The second author was supported by a grant of the NEC Research Institute, Inc. during his stay at the Institute for Advanced Study, Princeton. The third author was supported in part by the NSF, grant number DMS-9706889.
References
1. Anderson, P.: Absence of diffusion in certain random lattices. Phys. Rev. 109, 1492–1501 (1958)
2. Bourgain, J.: Positive Lyapunov exponents for most energies. Geometric Aspects of Functional Analysis. Lecture Notes in Math. 1745. Berlin–Heidelberg–New York: Springer, 2000, pp. 37–66
3. Bourgain, J., Goldstein, M.: On nonperturbative localization with quasiperiodic potential. Annals of Math. 152 (3), 835–879 (2000)
4. Bourgain, J., Schlag, W.: Anderson localization for Schrödinger operators on Z with strongly mixing potentials. Commun. Math. Phys. 215, 143–175 (2000)
5. Figotin, A., Pastur, L.: Spectra of random and almost-periodic operators. Grundlehren der mathematischen Wissenschaften 297, Berlin–Heidelberg–New York: Springer, 1992
6. Fröhlich, J., Spencer, T., Wittwer, P.: Localization for a class of one dimensional quasi-periodic Schrödinger operators. Commun. Math. Phys. 132, 5–25 (1990)
7. Goldstein, M., Schlag, W.: Hölder continuity of the integrated density of states for quasiperiodic Schrödinger equations and averages of shifts of subharmonic functions. To appear in Annals of Math.
8. Herman, M.: Une méthode pour minorer les exposants de Lyapounov et quelques exemples montrant le caractère local d'un théorème d'Arnold et de Moser sur le tore de dimension 2. Comment. Math. Helv. 58 no. 3, 453–502 (1983)
9. Katznelson, Y.: An introduction to harmonic analysis. New York: Dover, 1976
10. Kotani, S.: Ljapunov indices determine absolutely continuous spectra of stationary random one-dimensional Schrödinger operators. In: Stochastic analysis (Katata/Kyoto, 1982), North-Holland Math. Library, 32, Amsterdam–New York: North-Holland, 1984, pp. 225–247
11. Montgomery, H.: Ten lectures on the interface between harmonic analysis and analytic number theory. CBMS Regional conference series in mathematics 84, Providence, RI: AMS, 1994
12. Shnol, I.: On the behavior of the Schroedinger equation. Mat. Sb. 273–286 (1957) (Russian)
13. Simon, B.: Spectrum and continuum eigenfunctions of Schroedinger Operators. J. Funct. Anal. 42, 66–83 (1981)
14. Simon, B.: Kotani theory for one-dimensional stochastic Jacobi matrices. Commun. Math. Phys. 89, 227–234 (1983)
15. Sinai, Y.G.: Anderson localization for one-dimensional difference Schrödinger operator with quasiperiodic potential. J. Stat. Phys. 46, 861–909 (1987)
16. Stein, E.: Harmonic analysis. Princeton Mathematical Series 43, Princeton, NJ: Princeton University Press, 1993

Communicated by P. Sarnak
Commun. Math. Phys. 220, 623 – 656 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Bryuno Function and the Standard Map
Alberto Berretti (1), Guido Gentile (2)
(1) Dipartimento di Matematica, II Università di Roma (Tor Vergata), Via della Ricerca Scientifica, 00133 Roma, Italy. E-mail: [email protected]
(2) Dipartimento di Matematica, Università di Roma 3, Largo S. Leonardo Murialdo 1, 00146 Roma, Italy. E-mail: [email protected]
Received: 8 February 2000 / Accepted: 2 March 2001
Abstract: For the standard map the homotopically non-trivial invariant curves of rotation number $\omega$ satisfying the Bryuno condition are shown to be analytic in the perturbative parameter $\varepsilon$, provided $|\varepsilon|$ is small enough. The radius of convergence $\rho(\omega)$ of the Lindstedt series (sometimes called critical function of the standard map) is studied and the relation with the Bryuno function $B(\omega)$ is derived: the quantity $|\log\rho(\omega) + 2B(\omega)|$ is proved to be bounded uniformly in $\omega$.

1. Introduction

We continue the study, started in [1], of the radius of convergence of the Lindstedt series for the standard map, for rotation numbers close to rational values. We consider real rotation numbers $\omega$ satisfying the Bryuno condition (see below), and study how the corresponding radius of convergence depends on the Bryuno function $B(\omega)$, introduced by Yoccoz in [2]. The standard map is a discrete time, one-dimensional dynamical system generated by the iteration of the area-preserving (symplectic) map of the cylinder into itself $T_\varepsilon : \mathbb{T}\times\mathbb{R} \to \mathbb{T}\times\mathbb{R}$, given by:
\[ T_\varepsilon : \begin{cases} x' = x + y + \varepsilon\sin x, \\ y' = y + \varepsilon\sin x. \end{cases} \tag{1.1} \]
Given a real rotation number $\omega \in [0, 1)$, we can look for (homotopically non-trivial) invariant curves described parametrically by:
\[ \begin{cases} x = \alpha + u(\alpha,\varepsilon;\omega), \\ y = 2\pi\omega + u(\alpha,\varepsilon;\omega) - u(\alpha - 2\pi\omega,\varepsilon;\omega), \end{cases} \tag{1.2} \]
such that the dynamics induced in the variable $\alpha$ is given by rotations by $\omega$:
\[ \alpha' = \alpha + 2\pi\omega. \tag{1.3} \]
For irrational rotation numbers $\omega$, by imposing that the average of $u$ over $\alpha$ is 0, the (formal) conjugating function $u$ is unique and odd in $\alpha$, and has a formal expansion, known as the Lindstedt series, of the form:
\[ u(\alpha,\varepsilon) = \sum_{\nu\in\mathbb{Z}} u_\nu(\varepsilon)\,e^{i\nu\alpha} = \sum_{k\ge1} u^{(k)}(\alpha)\,\varepsilon^{k} = \sum_{k\ge1}\sum_{\nu\in\mathbb{Z}} u^{(k)}_\nu\, e^{i\nu\alpha}\,\varepsilon^{k}; \tag{1.4} \]
the coefficients $u^{(k)}_\nu$ can be expressed graphically in terms of sums over trees as explained shortly (see also [1] and references quoted therein). The radius of convergence of the series (1.4), called sometimes the critical function of the standard map, is defined as:
\[ \rho(\omega) = \inf_{\alpha\in\mathbb{T}} \Bigl(\limsup_{k\to\infty}\bigl|u^{(k)}(\alpha)\bigr|^{1/k}\Bigr)^{-1}. \tag{1.5} \]
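As a worked illustration of (1.1) to (1.4), one can check numerically that the first-order truncation of the Lindstedt series already gives a curve that is invariant up to terms of order $\varepsilon^2$. The sketch below assumes the first-order coefficient $u^{(1)}(\alpha) = \sin\alpha / (2(\cos 2\pi\omega - 1))$, which is obtained here by inserting (1.2) and (1.3) into (1.1) and matching terms of order $\varepsilon$; it is a derivation made for this example, and the function names are hypothetical.

```python
import math

def standard_map(x, y, eps):
    """One step of (1.1): x' = x + y + eps*sin(x), y' = y + eps*sin(x)."""
    y_new = y + eps * math.sin(x)
    return x + y_new, y_new

def divisor(omega, nu):
    """Small divisor 2*(cos(2*pi*omega*nu) - 1)."""
    return 2.0 * (math.cos(2.0 * math.pi * omega * nu) - 1.0)

def u_first_order(alpha, eps, omega):
    """Lindstedt series truncated at order one (assumed form, see the lead-in)."""
    return eps * math.sin(alpha) / divisor(omega, 1)

def curve_point(alpha, eps, omega):
    """Point of the approximate invariant curve (1.2) built from u_first_order."""
    u0 = u_first_order(alpha, eps, omega)
    u_back = u_first_order(alpha - 2.0 * math.pi * omega, eps, omega)
    return alpha + u0, 2.0 * math.pi * omega + u0 - u_back

def invariance_defect(alpha, eps, omega):
    """Distance between T_eps(curve(alpha)) and curve(alpha + 2*pi*omega), cf. (1.3)."""
    x1, y1 = standard_map(*curve_point(alpha, eps, omega), eps)
    x2, y2 = curve_point(alpha + 2.0 * math.pi * omega, eps, omega)
    return max(abs(x1 - x2), abs(y1 - y2))
```

With this truncation the defect equals $\varepsilon\,|\sin(\alpha + u) - \sin\alpha|$, so it is bounded by $\varepsilon^2 / |2(\cos 2\pi\omega - 1)|$; halving $\varepsilon$ should reduce it by roughly a factor of four.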
Given $\omega$, let $\{p_n/q_n\}$ be the sequence of convergents defined by the standard continued fraction expansion of $\omega$, and let:
\[ B_1(\omega) = \sum_{n=0}^{\infty} \frac{\log q_{n+1}}{q_n}. \tag{1.6} \]
The irrational number $\omega \in [0, 1)$ satisfies the Bryuno condition if $B_1(\omega) < \infty$; we also say that in this case $\omega$ is a Bryuno number. After Yoccoz [2], we define on the irrational numbers the Bryuno function $B(\omega)$ by the functional equation:
\[ \begin{cases} B(\omega) = -\log\omega + \omega B(\omega^{-1}) & \text{for } \omega \in (0,\tfrac12) \text{ and irrational}, \\ B(\omega+1) = B(-\omega) = B(\omega). \end{cases} \tag{1.7} \]
It can be proved that such a functional equation has a unique solution in $L^p$, $p \ge 1$; moreover $B(\omega)$ is related to the series $B_1(\omega)$ by the inequality:
\[ |B(\omega) - B_1(\omega)| < C_1, \tag{1.8} \]
for some constant $C_1$. See [2] and [3] for the proofs of these statements. We prove the following theorem.

Theorem. Consider the standard map (1.1) and let $\omega$ be an irrational number, $\omega \in [0, 1)$, satisfying the Bryuno condition. Then the radius of convergence (1.5) satisfies the bound:
\[ |\log\rho(\omega) + 2B(\omega)| \le C_0, \tag{1.9} \]
where $C_0$ is a constant independent of $\omega$.
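The quantities in (1.6) and (1.7) are easy to approximate numerically. The following is a rough sketch under stated assumptions: it works in floating point (so only the first dozen or so continued-fraction steps are trustworthy), truncates both the series (1.6) and the recursion (1.7) at a fixed depth, and uses function names chosen here for illustration.

```python
import math

def convergent_denominators(omega, count):
    """Denominators q_0, q_1, ... (at most `count` of them) of the convergents of omega."""
    qs = []
    q_prev, q = 0, 1  # q_{-1} = 0, q_0 = 1
    x = omega - math.floor(omega)
    for _ in range(count):
        qs.append(q)
        if x == 0.0:  # omega is (numerically) rational: the expansion stops
            break
        x = 1.0 / x
        a = math.floor(x)  # next partial quotient
        x -= a
        q_prev, q = q, a * q + q_prev
    return qs

def bryuno_series(omega, count=12):
    """Truncation of B_1(omega) = sum_n log(q_{n+1}) / q_n, see (1.6)."""
    qs = convergent_denominators(omega, count + 1)
    return sum(math.log(qs[n + 1]) / qs[n] for n in range(len(qs) - 1))

def bryuno_function(omega, depth=16):
    """Truncated iteration of (1.7): B(w) = -log(w) + w * B(1/w) on (0, 1/2),
    extended by B(w + 1) = B(-w) = B(w)."""
    total, weight, x = 0.0, 1.0, omega
    for _ in range(depth):
        x = x % 1.0
        if x > 0.5:
            x = 1.0 - x  # uses B(w) = B(-w) = B(1 - w)
        if x == 0.0:
            return math.inf  # rational input: the recursion diverges
        total += weight * (-math.log(x))
        weight *= x
        x = 1.0 / x
    return total
```

For the golden mean $\gamma = (\sqrt5 - 1)/2$ the recursion closes on itself, $B(\gamma) = -2\log\gamma + \gamma^2 B(\gamma)$, so $B(\gamma) = -2\log\gamma/\gamma$, which gives a convenient sanity check for `bryuno_function`.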
An analogous result was proved by Davie [4] for the semistandard map (where the nonlinear term sin x in (1.1) is replaced by eix ); in the same paper it was also shown that the upper bound in (1.9) holds: log ρ(ω) + 2B(ω) < C2 ,
(1.10)
for some constant C2 . In [5] it was proved, by “phase space renormalization” arguments, that ∀η > 0 ∃C3 , depending on η, such that: log ρ(ω) + (2 + η)B(ω) > C3 .
(1.11)
So our theorem improves the result of [5] (using also a different, direct technique, taken from [6] and inspired by the works [7] and [8], in some sense more elementary than the one of [5]) and proves the conjecture ("Bryuno's interpolation") first stated for the standard map in [9]; see also [10] and references quoted therein. Our theorem can be related to the result and the methods of [1]. There we proved that, for $\omega \in \mathbb{C}$, if $\omega$ tends to a rational number $p/q$ through a path in the complex plane non-tangential to the real axis, then the radius of convergence satisfies:
\[ \Bigl|\log\rho(\omega) + \frac{2}{q}\log\Bigl|\omega - \frac{p}{q}\Bigr|^{-1}\Bigr| < C_4 \tag{1.12} \]
for some constant $C_4$. If instead we consider a sequence of real, irrational numbers tending to a rational value $p/q$, the situation is quite a bit more complex. In fact, the limit and its very existence may depend on the arithmetic properties of the numbers of the sequence we consider, and on their uniformity in $k$; namely:

1. The sequence $\{\omega_k\}$ can tend to $p/q$ but, though all the $\omega_k$ are irrational, some of them are not Bryuno numbers so that for those $B(\omega_k) = +\infty$ and $\rho(\omega_k) = 0$.

2. The sequence $\{\omega_k\}$ can tend to $p/q$ through Bryuno numbers, or even Diophantine numbers, but they are not uniformly such in $k$ so that $B(\omega_k)$ diverges faster than $\log|\omega_k - p/q|^{-1/q}$ (and so $\rho(\omega_k)$ tends to zero faster than $|\omega_k - p/q|^{2/q}$). An example can be the sequence of Diophantine (actually even "noble") numbers:
\[ \omega_k = \cfrac{1}{k + \cfrac{1}{2^{k^2} + \gamma}}, \tag{1.13} \]
where $\gamma$ denotes the "golden mean":
\[ \gamma = \cfrac{1}{1 + \cfrac{1}{1 + \cdots}} = \frac{\sqrt5 - 1}{2}; \tag{1.14} \]
a simple calculation using the recursion relation (1.7) shows that indeed $B(\omega_k) = O(k)$ while $\omega_k = O(1/k)$, so that, by taking into account also logarithmic corrections in $B(\omega_k)$, $\rho(\omega_k) = O(\omega_k^2 e^{-2/\omega_k})$, that is much faster than $\omega_k^2$.

3. Finally, the sequence $\{\omega_k\}$ can tend to $p/q$ through a sequence of Bryuno numbers satisfying uniform estimates in $k$, so that an estimate like (1.12) holds (note that decays slower than $|\omega_k - p/q|^{2/q}$ are not possible); an example can be given by the sequence:
\[ \omega_k = \frac{1}{k + \gamma}, \tag{1.15} \]
where again γ is the golden mean (1.14). Notice that in the numerical calculations of [11] only real sequences of type 1 were considered, and that sequences of type 1 are practically inaccessible from the numerical point of view. One might also ask whether the same interpolation property holds for the analytic critical threshold εc (ω), defined as the supremum of the set: Eω = {ε > 0 | ∀˜ε ∈ [0, ε) ∃ an analytic invariant curve with rotation number ω}; (1.16) of course ρ(ω) ≤ εc (ω). The interpolation properties of εc (ω) should be different, as, according to Davie [5], their orders of magnitude asymptotically differ as ω → 0. This, in turn, adds interest to the study of the interpolation properties for the radius of convergence ρ(ω), as a standard against which to check εc (ω), besides the obvious interest in an important analyticity property of the function u. Note that this is a much harder problem, especially considering that it is not at all clear what is the right question to ask. For example, for generic standard-like maps, the analytic critical threshold is different for positive or negative values of ε, as numerical experiments suggest (see e.g. [12]), and of course there is nothing special to positive values of ε from the physical point of view. Moreover, always for generic maps one can have the phenomenon of erratic invariant curves, that is for a given ω the invariant curve can break down at a certain value of ε, to reappear and disappear again as ε grows: again, this has been shown only numerically (see [13]) and it is unlikely the case of the standard map, but such a possibility makes the simple definition (1.16) questionable from the physical point of view. Finally, one may ask how much these results can be extended to more complicated, and realistic, symplectic maps and continuous time Hamiltonian systems. We believe that while some additional complications may arise, the really hard problem (i.e. 
how to handle resonances) is already present in the standard map and it was solved by carefully using the trees formalism and the multi-scale decomposition of the propagators (see below). More general maps and Hamiltonian systems, though, as already pointed out in [1, 14], have different, more complicated interpolation properties for the radius of convergence of their Lindstedt series: the challenge here seems to be to find the right interpolation formula, which the work of [14] shows is different from Bryuno’s interpolation; this is an area where much work still has to be done. The paper is organized as follows. In Sect. 2 we introduce the formalism and give the scheme of the proof of the theorem, elucidating the major difficulties, due to the accumulation of small divisors in the Lindstedt series, and showing that, in the absence of such a phenomenon, the proof could be carried out by a detailed analysis of the single terms of the series. In Sect. 3 and 4, we shall see how to handle the small divisors problem, by showing that there are cancellation mechanisms, operating to all perturbative orders
between different terms of the Lindstedt series, which assure its convergence. Finally Sects. 5 and 6 deal with the proof of the main technical lemmata used in the proof of the theorem.

2. Formalism: Trees, Clusters and Resonances

As in [1], we can express graphically the coefficients $u^{(k)}_\nu$ in (1.4) in terms of trees. We shall only recall the definitions used in this paper and set up the notations, leaving the full details of the tree expansion for our problem to [1] and the references quoted therein. A tree $\vartheta$ consists of a family of lines arranged to connect a partially ordered set of points (nodes), with the lower nodes to the right. All the lines have two nodes at their extremes, except the highest which has only one node, the last node $u_0$ of the tree; the other extreme $r$ will be called the root of the tree and it will not be regarded as a node. We denote by $\preceq$ the partial ordering relation between nodes defined as follows: given two nodes $u$, $v$, we say that $v \preceq u$ if $u$ is along the path of lines connecting $v$ to the root $r$ of the tree (they could coincide); we say that $v \prec u$ if they do not. So our trees are "rooted trees", following the terminology of [15]. We assign to each line $\ell$ joining two nodes $u$ and $u'$ an "arrow" pointing from the higher to the lower node according to the order relation just defined; if $u \prec u'$, we say that the line $\ell$ exits from $u$ and enters $u'$, and that $u'$ is the node immediately following $u$. We write $u_0' = r$ even if, strictly speaking, $r$ is not considered a node. For each node $u$ there is a unique exiting line, and $m_u \ge 0$ entering lines; as there is a one-to-one correspondence between lines and nodes, we can associate to each node $u$ the line $\ell_u$ exiting from it. The line $\ell_{u_0}$ exiting the last node $u_0$ will be called the root line. Note that each line $\ell_u$ can be considered the root line of the subtree consisting of the nodes $v$ satisfying $v \preceq u$, and $u'$ will be the root of such tree. The order $k$ of the tree is defined as the number of its nodes. To each node $u \in \vartheta$ we associate a mode label $\nu_u = \pm1$, and define the momentum flowing through the line $\ell_u$ as:
\[ \nu_{\ell_u} = \sum_{w \preceq u} \nu_w, \qquad \nu_w = \pm1; \tag{2.1} \]
note that no line can have zero momentum, as $u^{(k)}_0 = 0$ in (1.4).

While in [1] we could get along considering only two "scales", we need a full multiscale decomposition of the momenta associated to each line. Given a rotation number $\omega \in [0, 1)\setminus\mathbb{Q}$, let $\{p_n/q_n\}$ be the sequence of convergents coming from the standard continued fraction expansion of $\omega$. For $x \in \mathbb{R}$, let:
\[ \|x\| = \inf_{\nu\in\mathbb{Z}} |x - \nu| \tag{2.2} \]
be the distance of $x$ from the nearest integer. Let now:
\[ \gamma(\nu) = 2(\cos 2\pi\omega\nu - 1); \tag{2.3} \]
then we have the estimate:
\[ |\gamma(\nu)| = 2|\cos 2\pi\omega\nu - 1| \ge \Gamma\,\|\omega\nu\|^2, \tag{2.4} \]
for some constant $\Gamma$.
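The estimate (2.4) can be checked directly. The sketch below relies on the elementary identity $2(1 - \cos 2\pi x) = 4\sin^2(\pi x)$ together with $\sin(\pi t) \ge 2t$ for $t \in [0, 1/2]$, which shows that the constant 16 is admissible in (2.4); this value is an observation made for this example and is not claimed to be the constant used in the paper. The helper names are hypothetical.

```python
import math

def dist_to_int(x):
    """||x||: distance from x to the nearest integer, as in (2.2)."""
    return abs(x - round(x))

def gamma_divisor(omega, nu):
    """gamma(nu) = 2*(cos(2*pi*omega*nu) - 1), as in (2.3)."""
    return 2.0 * (math.cos(2.0 * math.pi * omega * nu) - 1.0)

def worst_ratio(omega, nu_max):
    """Smallest value of |gamma(nu)| / ||omega*nu||^2 over 1 <= nu <= nu_max."""
    ratios = []
    for nu in range(1, nu_max + 1):
        d = dist_to_int(omega * nu)
        if d > 0.0:
            ratios.append(abs(gamma_divisor(omega, nu)) / (d * d))
    return min(ratios)
```

For irrational `omega` the returned ratio stays between 16 (attained in the limit $\|\omega\nu\| \to 1/2$) and $4\pi^2$ (the limit $\|\omega\nu\| \to 0$).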
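We introduce a $C^\infty$ partition of unity in the following way. Let $\chi(x)$ be a $C^\infty$, nonincreasing, compact-support function defined on $\mathbb{R}_+$, such that:
\[ \chi(x) = \begin{cases} 1 & \text{for } x \le 1, \\ 0 & \text{for } x \ge 2, \end{cases} \tag{2.5} \]
and define for each $n \in \mathbb{N}$:
\[ \begin{cases} \chi_0(x) = 1 - \chi(96 q_1 x), \\ \chi_n(x) = \chi(96 q_n x) - \chi(96 q_{n+1} x), & \text{for } n \ge 1. \end{cases} \tag{2.6} \]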
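Then for each line $\ell$ set:
\[ g(\nu_\ell) \equiv \frac{1}{\gamma(\nu_\ell)} = \sum_{n=0}^{\infty} \frac{\chi_n(\|\omega\nu_\ell\|)}{\gamma(\nu_\ell)} \equiv \sum_{n=0}^{\infty} g_n(\nu_\ell), \tag{2.7} \]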
and call $g_n(\nu_\ell)$ the propagator on scale $n$. Given a tree $\vartheta$, we can associate to each line $\ell$ of $\vartheta$ a scale label $n_\ell$, using the multiscale decomposition (2.7) and singling out the summands with $n = n_\ell$. We shall call $n_\ell$ the scale label of the line $\ell$, and we shall say also that the line $\ell$ is on scale $n_\ell$.

Remark 1. Given a value $\nu_\ell$ there can be at most two possible (consecutive) values of $n$ such that the corresponding $\chi_n(\|\omega\nu_\ell\|)$ are not vanishing. This means that at most only two summands of the infinite series (2.7) really appear; nevertheless keeping all terms is more convenient, in order to have a label to characterize the "size" of the "propagators" $g(\nu_\ell)$.

Remark 2. Note that if a line $\ell$ has momentum $\nu_\ell$ and scale $n_\ell$, then:
\[ \frac{1}{96 q_{n_\ell+1}} \le \|\omega\nu_\ell\| \le \frac{1}{48 q_{n_\ell}}, \tag{2.8} \]
provided that one has $\chi_{n_\ell}(\|\omega\nu_\ell\|) \ne 0$.

A group $G$ of transformations acts on the trees, generated by the permutations of all the subtrees emerging from each node with at least one entering line: $G$ is therefore a Cartesian product of copies of the symmetric groups of various orders. Two trees that can be transformed into each other by the action of the group $G$ are considered identical. Denote by $T_{\nu,k}$ the set of trees, with nonvanishing value, of order $k$ and total momentum $\nu_{\ell_{u_0}} = \nu$, if $u_0$ is the last node of the tree. The number of elements in $T_{\nu,k}$ is bounded by $2^{k}\cdot 2^{k}\cdot 2^{2k} = 2^{4k}$: the number of semitopological trees (see [1]) of order $k$ is bounded by $2^{2k}$ (footnote 1), and there are two possible values for the mode label of each node and two possible values for the scale label of each line. Then, as in [1], to which we refer for more details and figures, one finds:
\[ u^{(k)}_\nu = \frac{1}{2^{k}} \sum_{\vartheta\in T_{\nu,k}} \operatorname{Val}(\vartheta), \qquad \operatorname{Val}(\vartheta) = -i\,\Bigl[\prod_{u\in\vartheta} \frac{\nu_u^{m_u+1}}{m_u!}\Bigr]\Bigl[\prod_{\ell\in\vartheta} g_{n_\ell}(\nu_\ell)\Bigr]; \tag{2.9} \]
the factors $g_{n_\ell}(\nu_\ell)$ above are called propagators of small divisors on scale $n_\ell$, and the quantity $\operatorname{Val}(\vartheta)$ will be called the value of the tree $\vartheta$. We define now the main combinatorial tools.

Footnote 1. The number of semitopological trees can be bounded by the number of one-dimensional random walks with $2k - 1$ steps.
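The multiscale decomposition (2.5) to (2.7) can be made concrete with a small numerical sketch. The cutoff `chi` below is one hypothetical choice of a smooth nonincreasing function satisfying (2.5); the paper does not fix a particular one. The function `scale_weights` returns the values $\chi_n(x)$ for the scales covered by a given list of denominators, and all names are assumptions made for this example.

```python
import math

def chi(x):
    """A smooth nonincreasing cutoff with chi = 1 for x <= 1 and chi = 0 for x >= 2."""
    if x <= 1.0:
        return 1.0
    if x >= 2.0:
        return 0.0
    t = x - 1.0  # 0 < t < 1 on the transition region
    f_up = math.exp(-1.0 / t)
    f_down = math.exp(-1.0 / (1.0 - t))
    return f_down / (f_down + f_up)

def scale_weights(x, qs):
    """Values chi_n(x), n = 0 .. len(qs) - 2, following (2.6); qs[n] plays the role of q_n."""
    weights = [1.0 - chi(96.0 * qs[1] * x)]
    for n in range(1, len(qs) - 1):
        weights.append(chi(96.0 * qs[n] * x) - chi(96.0 * qs[n + 1] * x))
    return weights

def active_scales(x, qs):
    """Scale labels n for which chi_n(x) does not vanish."""
    return [n for n, w in enumerate(scale_weights(x, qs)) if w != 0.0]
```

The weights telescope, so they sum to 1 as soon as $96\, q_{\max}\, x \ge 2$ for the largest denominator in the list. The lower bound in (2.8) holds for every active scale, while the upper bound is only meaningful for $n \ge 1$ in this sketch, since $\chi_0$ in (2.6) has no upper cutoff.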
Definition (Cluster). Given a tree $\vartheta$, a cluster $T$ of $\vartheta$ on scale $n$ is a maximal connected set of lines on scale $\le n$ with at least one line on scale $n$. We shall say that such lines are internal to $T$, and write $\ell \in T$ for an internal line of $T$. A node $u$ is called internal to $T$, and we write $u \in T$, if at least one of its entering lines or its exiting line is in $T$. Each cluster has an arbitrary number $m_T \ge 0$ of entering lines but only one or zero exiting lines; we shall call external to $T$ the lines entering or exiting $T$ (which are all on scale $> n$). We shall denote with $n_T$ the scale of the cluster $T$, with $n_T^{i}$ the minimum of the scales of the lines entering $T$, with $n_T^{o}$ the scale of the line exiting $T$ and with $k_T$ the number of nodes in $T$. Note that, despite the name, not all lines outside $T$ are "external" to it: only those lines outside $T$ which enter or exit $T$ are external to it. On the contrary a line inside $T$ is said to be "internal" to it. The use of such a terminology is inherited from Quantum Field Theory.

Definition (Resonance). Given a tree $\vartheta$, a cluster $V$ of $\vartheta$ will be called a resonance with resonance-scale $n = n_V^{R} \equiv \min\{n_V^{i}, n_V^{o}\}$, if:

1. the sum of the mode labels of its nodes is 0:
\[ \nu_V \equiv \sum_{u\in V} \nu_u = 0; \tag{2.10} \]
2. all the lines entering $V$ are on the same scale except at most one, which can be on a higher scale;
3. $n_V^{i} \le n_V^{o}$ if $m_V \ge 2$, and $|n_V^{i} - n_V^{o}| \le 1$ for $m_V = 1$;
4. $k_V < q_n$;
5. $m_V = 1$ if $q_{n+1} \le 4q_n$;
6. if $q_{n+1} > 4q_n$ and $m_V \ge 2$, denoting by $k_0$ the sum of the orders of the subtrees of order $< q_{n+1}/4$ entering $V$, either (a) there is only one subtree of order $k_1 \ge q_{n+1}/4$ entering $V$ and $k_0 < q_{n+1}/8$, or (b) there is no such subtree and $k_0 + k_0 < q_{n+1}/4$.

Remark 3. Note that for any resonance $V$ one has $n_V^{R} \ge n_V + 1$, if $n_V$ is the scale of the resonance $V$ as a cluster. As in [16] we use the notation with a hyphen for the resonance-scale to avoid confusion between $n_V^{R}$ and $n_V$.

Remark 4. One would be tempted to give a simpler definition of resonance (for instance, by imposing only condition 2 to the cluster $V$). This temptation should be resisted, as it would make it impossible to exploit the cancellations leading to the improvement of the bound discussed at the end of this section (in fact, no relation would continue to subsist between momenta and scale labels and factorials would arise from counting the summands generated by the renormalization procedure described in Sect. 4). On the other hand we shall see in Sect. 5 that no problems should arise if no resonances, exactly as they are defined above, could appear.

In the following we shall need to introduce trees in which it can happen that a line $\ell$ is on a scale $n_\ell$ and yet its momentum does not satisfy (2.8). The value of any such tree $\vartheta$ is vanishing as $\chi_{n_\ell}(\|\omega\nu_\ell\|) = 0$; nevertheless it will be useful to write $\operatorname{Val}(\vartheta)$ as the sum of two (possibly) nonvanishing terms: one of them will be used to cancel terms arising from other tree values, so it will disappear, while the other one is left and has to
be bounded. This means that we shall have to deal with trees in which there are lines $\ell$ with momentum $\nu_\ell$ and scale $n_\ell$ which do not satisfy (2.8). What will be shown to hold is that for such lines a bound similar to (2.8), though weaker, still holds; more precisely, a line $\ell$ with momentum $\nu_\ell$ will have only scales $n_\ell$ such that:
\[ \frac{1}{768 q_{n_\ell+1}} \le \|\omega\nu_\ell\| \le \frac{1}{8 q_{n_\ell}}, \tag{2.11} \]
and, for fixed $\nu_\ell$, the number of possible scales to associate to $\ell$ is bounded by an absolute constant. As (2.11) is implied by (2.8), even for trees with nonvanishing value we shall use that if a line is on scale $n_\ell$ then (2.11) holds. Then, if $N_n(\vartheta)$, $n \in \mathbb{N}$, denotes the number of lines on scale $n$ in $\vartheta$, we have trivially for a given tree $\vartheta$ the bound:
\[ |\operatorname{Val}(\vartheta)| \le D_1^{k} \prod_{n=0}^{\infty} \bigl(768 q_{n+1}\bigr)^{2N_n(\vartheta)}, \tag{2.12} \]
for some constant $D_1$ (actually $D_1 = 1/\Gamma$; see (2.4), (2.9) and (2.11)). Given a tree $\vartheta$, let us denote with $N_n^{R}(\vartheta)$ the number of resonances with resonance-scale $n$ and by $P_n(\vartheta)$ the number of resonances on scale $n$. Of course $N_0^{R} = 0$.

Remark 5. Note that the number $N_n^{R}(\vartheta)$ of resonances with resonance-scale $n$ can be counted by counting the number of lines exiting resonances with resonance-scale $n$; analogously $P_n(\vartheta)$ can be counted by counting the number of lines exiting resonances on scale $n$. Such counts will be performed in Sect. 5.

The following simple lemmata contain all the arithmetic we shall need, and are basically adapted from [4].

Lemma 1 (Davie's lemma). Given $\nu \in \mathbb{Z}$ such that $\|\omega\nu\| \le 1/4q_n$, then
1. either $\nu = 0$ or $|\nu| \ge q_n$,
2. either $|\nu| \ge q_{n+1}/4$ or $\nu = s q_n$ for some integer $s$.

Lemma 2. If a tree $\vartheta$ has $k < q_n$ nodes, then $N_n(\vartheta) = 0$ and $P_{n-1}(\vartheta) = 0$.

Lemma 3. For any irrational number $\omega \in [0, 1)$:
\[ \sum_{n=0}^{\infty} \frac{\log q_n}{q_n} \le D_2, \tag{2.13} \]
for a constant $D_2$; here $q_n$ are the denominators of the convergents of $\omega$.

Lemma 4. Given a momentum $\nu$ such that
\[ \frac{1}{768 q_{n+1}} \le \|\omega\nu\| \le \frac{1}{8 q_n}, \tag{2.14} \]
then one can have $\chi_{n'}(\|\omega\nu\|) \ne 0$ only for $n'$ such that $n - 8 \le n' \le n + 8$.
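Lemma 1 lends itself to a brute-force check. The sketch below scans a range of integers $\nu$ and reports those that satisfy the hypothesis $\|\omega\nu\| \le 1/4q_n$ but violate one of the two conclusions; an empty list is what the lemma predicts. The denominators are passed in explicitly (here `qs[n]` stands for $q_n$), and the function names are hypothetical.

```python
def dist_to_int(x):
    """||x||: distance from x to the nearest integer."""
    return abs(x - round(x))

def davie_violations(omega, qs, n, nu_max):
    """Integers nu with |nu| <= nu_max and ||omega*nu|| <= 1/(4*q_n)
    that violate conclusion 1 or conclusion 2 of Lemma 1."""
    bad = []
    for nu in range(-nu_max, nu_max + 1):
        if dist_to_int(omega * nu) <= 1.0 / (4 * qs[n]):
            claim1 = (nu == 0) or (abs(nu) >= qs[n])
            claim2 = (4 * abs(nu) >= qs[n + 1]) or (nu % qs[n] == 0)
            if not (claim1 and claim2):
                bad.append(nu)
    return bad
```

A rotation number with larger partial quotients, such as $(\sqrt{29} - 5)/2 = [0; 5, 5, 5, \dots]$ with denominators $1, 5, 26, 135, 701$, is a more demanding test than the golden mean, because then $q_{n+1}/4 > q_n$ and the second conclusion is not implied by the first.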
Proof of Lemma 1. If $\{q_n\}$ are the denominators of the convergents of $\omega$, then (see e.g. [17, Ch. 1, §3]):
\[ \frac{1}{2q_{n+1}} < \|\omega q_n\| < \frac{1}{q_{n+1}}, \tag{2.15} \]
and:
\[ \forall\, 0 < |\nu| < q_{n+1},\ |\nu| \ne q_n: \qquad \|\omega\nu\| > \|\omega q_n\|. \tag{2.16} \]
To prove 1 note that if $\nu = 0$ nothing has to be proved: so we assume $\nu \ne 0$. If $|\nu| < q_n$, by (2.16) and (2.15), $\|\omega\nu\| \ge \|\omega q_{n-1}\| > 1/2q_n$, so that $\|\omega\nu\| < 1/4q_n$ implies $|\nu| \ge q_n$, proving the first assertion of Lemma 1. To prove 2, again if $\nu = 0$ nothing has to be proved (and $s = 0$): so we assume $\nu \ne 0$, and proceed by reductio ad absurdum. If $0 < \nu < q_{n+1}/4$ and there does not exist any $s \in \mathbb{Z}$ such that $\nu = s q_n$, then one has $\nu = m q_n + r$, with $0 < r < q_n$ and $m < q_{n+1}/4q_n$; then, by (2.15), $\|\omega m q_n\| \le m\|\omega q_n\| < m/q_{n+1} < 1/4q_n$, and, by (2.16), $\|\omega r\| \ge \|\omega q_{n-1}\| > 1/2q_n$, as $r \ne 0$; so $\|\omega\nu\| \ge \|\omega r\| - \|\omega m q_n\| > 1/4q_n$. The case $0 > \nu > -q_{n+1}/4$ is identical as $\|\cdot\|$ is even.

Proof of Lemma 2. If $k < q_n$, then for any $\ell \in \vartheta$ one has $|\nu_\ell| \le k < q_n$, so that, by (2.15) and (2.16), $\|\omega\nu_\ell\| \ge \|\omega q_{n-1}\| > 1/2q_n$, hence $n_\ell < n$ and so $N_{n'}(\vartheta) = 0$ for all $n' \ge n$. If there are no lines on scale $\ge n$, it is impossible to form a cluster on scale $n - 1$ which is different from the whole tree, a fortiori a resonance.

Proof of Lemma 3. The denominators of the convergents $\{q_n\}$ of $\omega$ satisfy $q_0 = 1$, $q_1 \ge 1$ and $q_n \ge 2q_{n-2}$ for any $n \ge 2$. So we can write:
\[ \sum_{n=0}^{\infty} \frac{\log q_n}{q_n} = \sum_{n=0}^{\infty} \frac{\log q_{2n}}{q_{2n}} + \sum_{n=0}^{\infty} \frac{\log q_{2n+1}}{q_{2n+1}}; \tag{2.17} \]
using the fact that, for $x \ge e$, $x^{-1}\log x$ is decreasing, we obtain easily:
\[ \sum_{n=0}^{\infty} \frac{\log q_n}{q_n} \le 3\max_{x\ge1}\frac{\log x}{x} + 2\sum_{k=2}^{\infty} \frac{k\log 2}{2^{k}} = 3\bigl(e^{-1} + \log 2\bigr) \equiv D_2, \tag{2.18} \]
which also gives an explicit value for the constant $D_2$.
Proof of Lemma 4. Simply use that $q_{n+1} \ge q_n$ and $q_{n+2} \ge 2q_n$ for all $n \ge 0$, to deduce that $1/48q_{n+9} < 1/768q_{n+1}$ and $1/96q_{n-8} > 1/8q_n$.

The following "counting" lemma is the main result stated in this section, and it can be considered an adaptation and extension of Lemma 2.3 in [4]. We postpone its proof to Sect. 5.

Lemma 5. Given a tree $\vartheta$, let $M_n(\vartheta) = N_n(\vartheta) + P_n(\vartheta)$. Then:
\[ M_n(\vartheta) \le \frac{k}{q_n} + \frac{8k}{q_{n+1}} + N_n^{R}(\vartheta), \tag{2.19} \]
where $k$ is the order of $\vartheta$.
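The explicit constant $D_2 = 3(e^{-1} + \log 2)$ from (2.18) can be compared with actual values of the sum in Lemma 3. The sketch below builds the denominators from a list of partial quotients through the standard recursion $q_{n+1} = a_{n+1} q_n + q_{n-1}$ (with $q_{-1} = 0$, $q_0 = 1$) and evaluates a partial sum; the function names and the choice of test sequences are assumptions made for this example.

```python
import math

# Explicit constant from (2.18).
D2 = 3.0 * (math.exp(-1.0) + math.log(2.0))

def denominators(partial_quotients):
    """q_0, q_1, ... from partial quotients a_1, a_2, ... via q_{n+1} = a_{n+1}*q_n + q_{n-1}."""
    q_prev, q = 0, 1
    qs = [q]
    for a in partial_quotients:
        q_prev, q = q, a * q + q_prev
        qs.append(q)
    return qs

def log_q_sum(qs):
    """Partial sum of log(q_n) / q_n over the given denominators."""
    return sum(math.log(q) / q for q in qs)
```

The golden mean (all partial quotients equal to 1, so the $q_n$ are Fibonacci numbers) has the slowest-growing denominators, and even there the sum stays well below $D_2$.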
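Therefore we can rewrite the bound (2.12) on the tree value as:
\[ |\operatorname{Val}(\vartheta)| \le D_1^{k} \prod_{n=0}^{\infty} \bigl(768 q_{n+1}\bigr)^{2(M_n(\vartheta) - P_n(\vartheta))} \le D_1^{k} \prod_{n=0}^{\infty} \bigl(768 q_{n+1}\bigr)^{2(k/q_n + 8k/q_{n+1} + N_n^{R}(\vartheta) - P_n(\vartheta))}. \tag{2.20} \]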
Note that at this point it would be very easy to prove the lower bound in (1.9) for the semistandard map and, by simple modifications of the same scheme, for the Siegel problem, since in these cases no resonances appear. On the contrary, in the more difficult case of the standard map we lack, for the moment, a control on the number $N_n^{R}(\vartheta)$ of resonances in $\vartheta$ with resonance-scale $n$. In Sects. 3 and 4 we shall see how to improve the bound on the sum over the trees of fixed order and total momentum, in order to prove the theorem stated in Sect. 1. We postpone to forthcoming sections the proofs, limiting ourselves here to a heuristic discussion in order to give an idea of the structure of the proof.

We perform a suitable resummation, described in Sects. 3 and 4, whose consequence is that, for each resonance $V$, it is as if one of the external lines on scale $n_V^{R}$ contributed $(768 q_{n_V+1})^2$ instead of $(768 q_{n_V^{R}+1})^2$. To obtain such a result, we shall perform on trees transformations which will lead to the introduction of new trees: so we extend $T_{\nu,k}$ to a larger set $T^{*}_{\nu,k}$. However we shall prove that the value of each single tree in $T^{*}_{\nu,k}$ still admits the bound (2.20) (even if, unlike the values of the trees in $T_{\nu,k}$, it fails to satisfy the same bound with 768 replaced with 96) and the number of elements in $T^{*}_{\nu,k}$ is bounded by a constant to the power $k$ (i.e. no bad counting factors, like factorials, appear). Then we obtain, for the sum of the resummed trees, a bound of the form (2.20) with:
\[ \prod_{n=0}^{\infty} \bigl(768 q_{n+1}\bigr)^{2N_n^{R}(\vartheta)} \]
replaced with:
\[ D_3^{k} \prod_{n=0}^{\infty} \bigl(768 q_{n+1}\bigr)^{2P_n(\vartheta)}, \]
for some constant $D_3$. By using that the number of trees in $T^{*}_{\nu,k}$ will be shown to be bounded by a constant to the power $k$, we obtain, for some constants $D_4$, $D_5$:
\[ |u^{(k)}(\alpha)| \le \sum_{|\nu|\le k} \Bigl|\sum_{\vartheta\in T_{\nu,k}} \operatorname{Val}(\vartheta)\Bigr| \le \sum_{|\nu|\le k}\ \sum_{\vartheta\in T^{*}_{\nu,k}} |\operatorname{Val}(\vartheta)| \le D_4^{k} \prod_{n=0}^{\infty} \bigl(768 q_{n+1}\bigr)^{2k/q_n + 16k/q_{n+1}} \le D_5^{k} \exp\Bigl[2k \sum_{n=0}^{\infty} \Bigl(\frac{\log q_{n+1}}{q_n} + \frac{8\log q_{n+1}}{q_{n+1}}\Bigr)\Bigr], \tag{2.21} \]
which, by making use of Lemma 3, gives: log ρ(ω) + 2B1 (ω) ≥ −16D2 − log D5 .
(2.22)
By making rigorous the above discussion in Sects. 3 and 4, we shall complete the proof of the theorem, since the bound from above was already proved in [4].

3. Renormalization of Resonances: Set-up and the First Step

Given a tree $\vartheta$, let us consider maximal resonances, i.e. resonances not contained in any larger resonance; let us call them first generation resonances. Inside the first generation resonances let us consider the "next maximal" resonances, i.e. the resonances not contained in any larger resonance except first generation resonances, and let us call them second generation resonances. In this way we can define $j$th generation resonances, for $j \ge 2$, as resonances which are maximal within $(j-1)$th generation resonances. Let $\mathbf{V}$ be the set of all resonances of a tree $\vartheta$, and $\mathbf{V}_j$ the set of all resonances of $j$th generation, with $j = 1, \dots, G$, for some integer $G$ depending on $\vartheta$.

Given a tree $\vartheta$ and a resonance $V \in \mathbf{V}_j$ with $m_V$ entering lines, define $V_0$ as the set of nodes and lines internal to $V$ and outside any resonances contained in $V$. Let $L_V = \{\ell_1, \dots, \ell_{m_V}\}$ be the set of entering lines of $V$; we define $L^R_V$ as the subset of the lines in $L_V$ which enter some resonance of higher generation contained inside $V$, and $L^0_V = L_V \setminus L^R_V$ as the subset of lines in $L_V$ which enter nodes in $V_0$. For any line $\ell_m \in L^R_V$, let $V(\ell_m)$ be the minimal resonance containing the node which the line $\ell_m$ enters (i.e. the highest generation resonance containing such a node) and $V_0(\ell_m)$ the set of nodes and lines internal to $V(\ell_m)$ and outside resonances contained in $V(\ell_m)$. Define:
\[
\tilde{\mathbf{V}}(V) = \{ \tilde V \subset V : \tilde V = V(\ell_m) \text{ for some } \ell_m \in L^R_V \}. \tag{3.1}
\]
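As an illustration of the generation labelling, here is a minimal sketch. It assumes resonances are modelled simply as sets of node labels and that any two of them are either nested or disjoint; the function name and the example data are hypothetical and do not come from the paper. Under that assumption the generation of a resonance is one plus the number of resonances strictly containing it.

```python
def resonance_generations(resonances):
    """Map each resonance label to its generation (1 = maximal resonance).

    `resonances` maps a (hypothetical) label to the frozenset of nodes it
    contains. The family is assumed laminar: two resonances are either
    nested or disjoint, as for clusters of a tree.
    """
    generations = {}
    for name, nodes in resonances.items():
        # Number of resonances that strictly contain this one.
        containing = sum(1 for other in resonances.values() if nodes < other)
        generations[name] = containing + 1
    return generations


# Toy example: X inside W inside V, and Y disjoint from all of them.
example = {
    "V": frozenset({1, 2, 3, 4, 5, 6}),
    "W": frozenset({2, 3, 4}),
    "X": frozenset({3}),
    "Y": frozenset({8, 9}),
}
print(resonance_generations(example))
```

In this toy family $G = 3$: V and Y are first generation, W is second generation and X is third generation.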
Call $m^0_V$ the number of lines in $L^0_V$. The number of lines in $L^R_V$ entering the same resonance $\tilde V \in \tilde{\mathbf{V}}(V)$ is not arbitrary: it is always 1, as shown by the following lemma.

Lemma 6. For $j \ge 1$, given a resonance $W \in \mathbf{V}_{j+1}$ contained inside a resonance $V \in \mathbf{V}_j$, only one among the entering lines of $W$ can also enter $V$.

Proof. The case $m_W = 1$ is obvious, so we assume $m_W \ge 2$. One has $n^R_W \le n_V$, otherwise $V$ would be a cluster on scale $< n^R_W$, so that all the lines external to $W$ would be also external to $V$ and $V = W$, while we assumed that $W \subset V$. Then if a line $\ell$ enters both $V$ and $W$, one must have $n_\ell > n^R_W$. But, by items 2 and 2 in the definition of resonance, all lines entering $W$ have the same scale $n^R_W$ except at most one.

We define the resonance family $\mathcal{F}_V(\vartheta)$ of $V \in \mathbf{V}$ in $\vartheta$ as the set of trees obtained from $\vartheta$ by the action of a group of transformations $\mathcal{P}_V$ on $\vartheta$, generated by the following operations:
1. Detach the line $\ell_1$, then consider all trees obtained by reattaching it to any node internal to $V_0(\ell_1)$ if $\ell_1 \in L^R_V$ and to any node in $V_0$ if $\ell_1 \in L^0_V$; for each tree so obtained, do the same operations with the line $\ell_2$ (i.e. detach $\ell_2$ and reattach it to any node internal to $V_0(\ell_2)$ if $\ell_2 \in L^R_V$ and to any node in $V_0$ if $\ell_2 \in L^0_V$), and so forth for each line entering the resonance.
2. In a given tree, each node $u \in V$ will have $m_u$ entering lines, of which $s_u$ are inside $V$ and $r_u = m_u - s_u$ are outside $V$ (i.e. are entering lines of $V$); then we can apply to the set of lines entering $u$ a transformation in the group obtained as the quotient of the group of permutations of the $m_u$ lines entering $u$ by the groups of permutations of the $s_u$ internal entering lines and of permutations of the $r_u$ entering lines outside $V$; in this way for each node $u \in V$ a number of trees equal to
\[
\binom{m_u}{s_u} = \frac{m_u!}{s_u!\, r_u!}
\]
is obtained.
3. Change sign simultaneously to all the mode labels of the nodes internal to $V$.

We shall call renormalization transformations (of type 1, 2, 3) the operations described above.

Remark 6. Note that in all such transformations the scales are not changed (by definition) and the set of resonances $\mathbf{V}$ remains the same (by construction). On the contrary the momenta flowing through the lines can change (because of the shift of the lines entering resonances) and in particular one can have, for some lines $\ell$, $\chi_{n_\ell}(\|\omega\nu_\ell\|) = 0$, if $\nu_\ell$ is the modified momentum flowing through $\ell$.

Remark 7. The definition of resonance families is aimed at grouping together the trees between which one will look for compensations, but in doing so one has to avoid overcountings. In fact, to each tree $\vartheta$ we associate a value $\mathrm{Val}(\vartheta)$ according to (2.9); when applying the transformations of the group $\mathcal{P}_V$ to the tree $\vartheta$, the same tree can be obtained, in general, in several ways; however, it has to be counted once. This means that $\mathcal{P}_V$, as a group, defines an equivalence class, and only inequivalent elements obtained through the transformations defining $\mathcal{P}_V$ have to be retained.

Let us call $\mathcal{F}_{\mathbf{V}_1}(\vartheta)$ the family obtained by the composition of all transformations defining the resonance families $\mathcal{F}_{V_1}(\vartheta)$, $V_1 \in \mathbf{V}_1$. For any tree $\vartheta_1 \in \mathcal{F}_{\mathbf{V}_1}(\vartheta)$, let $V_2$ be a resonance in $\mathbf{V}_2$ and let us define the resonance family $\mathcal{F}_{V_2}(\vartheta_1)$ of $V_2$ in $\vartheta_1$ as the set of trees obtained from $\vartheta_1$ by the action of the group of transformations $\mathcal{P}_{V_2}$. The composition of all transformations defining the resonance families $\mathcal{F}_{V_2}(\vartheta_1)$, for all $\vartheta_1 \in \mathcal{F}_{\mathbf{V}_1}(\vartheta)$ and all $V_2 \in \mathbf{V}_2$, gives a family that we shall denote by $\mathcal{F}_{\mathbf{V}_2}(\vartheta)$. We continue by considering resonances of 3rd generation, and so on until the $G$th generation resonances are reached.
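At the end we shall have a family $\mathcal{F}(\vartheta)$ of trees obtained by the composition of all transformations of the groups $\mathcal{P}_V$, $V \in \mathbf{V}$, defined recursively through the application of the renormalization transformations corresponding to resonances $V \in \mathbf{V}_j$ to all trees belonging to the family $\mathcal{F}_{\mathbf{V}_{j-1}}(\vartheta)$.

Remark 8. Given a tree $\vartheta \in \mathcal{T}_{\nu,k}$ and a family $\mathcal{F}(\vartheta)$, when considering another tree $\vartheta' \in \mathcal{F}(\vartheta)$ with nonvanishing value $\mathrm{Val}(\vartheta')$, the same family $\mathcal{F}(\vartheta') = \mathcal{F}(\vartheta)$ is obtained (by construction). Note however that $\mathcal{F}(\vartheta)$ can contain also trees with vanishing values, as they can have lines $\ell$ such that $\chi_{n_\ell}(\|\omega\nu_\ell\|) = 0$ (see Remark 6). Define also $N_{\mathcal{F}}(\vartheta)$ as the number of trees in $\mathcal{F}(\vartheta)$ whose value is not vanishing; of course $N_{\mathcal{F}}(\vartheta) \le |\mathcal{F}(\vartheta)|$, if $|\mathcal{F}(\vartheta)|$ is the number of elements in $\mathcal{F}(\vartheta)$.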
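Write:
\[
\sum_{\vartheta\in\mathcal{T}_{\nu,k}} \mathrm{Val}(\vartheta) = \sum_{\vartheta\in\mathcal{T}_{\nu,k}} \frac{1}{N_{\mathcal{F}}(\vartheta)} \sum_{\vartheta'\in\mathcal{F}(\vartheta)} \mathrm{Val}(\vartheta') = \sum_{\vartheta\in\mathcal{T}^*_{\nu,k}} \frac{1}{|\mathcal{F}(\vartheta)|} \sum_{\vartheta'\in\mathcal{F}(\vartheta)} \mathrm{Val}(\vartheta'), \tag{3.2}
\]
where the factors $N_{\mathcal{F}}(\vartheta)$ and $|\mathcal{F}(\vartheta)|$ have been introduced in order to avoid overcountings (see Remark 8) and the last sum implicitly defines the set $\mathcal{T}^*_{\nu,k}$: so $\mathcal{T}^*_{\nu,k}$ is the set of inequivalent trees in $\cup_{\vartheta\in\mathcal{T}_{\nu,k}} \mathcal{F}(\vartheta)$.

If a tree $\vartheta \in \mathcal{T}^*_{\nu,k}$, then $\vartheta \in \mathcal{F}(\vartheta_0)$ for some tree $\vartheta_0 \in \mathcal{T}_{\nu,k}$; however one has to bear in mind that $\vartheta$, unlike $\vartheta_0$, could vanish.

Given a tree $\vartheta \in \mathcal{T}^*_{\nu,k}$, if $V$ is a first generation resonance, we define its resonance factor $\mathcal{V}_V(\vartheta)$ as its contribution to the value of the tree $\vartheta$, namely:
\[
\mathcal{V}_V(\vartheta) = \Big[ \prod_{u\in V} \frac{\nu_u^{m_u+1}}{m_u!} \Big] \Big[ \prod_{\ell\in V} g_{n_\ell}(\nu_\ell) \Big], \tag{3.3}
\]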
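which of course depends on the subset of $\vartheta$ outside the resonance $V$ only through the momenta of the entering lines of $V$. Given a node $u \in V$, let us denote with $E_u$ the set of lines entering $V$ such that they end in nodes preceding $u$. For future notational convenience, we rewrite (3.3) as:
\[
\mathcal{V}_V(\vartheta) = \mathcal{U}_V(\vartheta)\, \mathcal{L}_V(\vartheta), \qquad \mathcal{U}_V(\vartheta) = \prod_{u\in V} \frac{\nu_u^{m_u+1}}{m_u!}, \qquad \mathcal{L}_V(\vartheta) = \prod_{\ell\in V} g_{n_\ell}(\nu_\ell). \tag{3.4}
\]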
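In the following, we shall consider the quantities $\omega\nu$, $\nu \in \mathbb{Z}$, modulo 1, and shall continue to use the symbol $\omega\nu$ to denote the representative of the equivalence class within the interval $(-1/2, 1/2]$. For any node $u$ contained in a resonance $V$, we shall write:
\[
\nu_{\ell_u} = \nu^0_{\ell_u} + \sum_{\ell\in E_u} \nu_\ell, \qquad \nu^0_{\ell_u} = \sum_{w\in V,\ w \preceq u} \nu_w, \tag{3.5}
\]
where the set $E_u$ was defined after (3.3). We shall consider the resonance factor (3.3) as a function of the quantities $\mu_1 = \omega\nu_{\ell_1}, \dots, \mu_{m_V} = \omega\nu_{\ell_{m_V}}$, where $\nu_{\ell_1}, \dots, \nu_{\ell_{m_V}}$ are the momenta flowing through the lines $\ell_1, \dots, \ell_{m_V}$ entering $V$. More precisely, we let:
\[
\mathcal{V}_V(\vartheta) \equiv \mathcal{V}_V(\vartheta; \omega\nu_{\ell_1}, \dots, \omega\nu_{\ell_{m_V}}), \tag{3.6}
\]
and we write:
\[
\mathcal{V}_V(\vartheta; \omega\nu_{\ell_1}, \dots, \omega\nu_{\ell_{m_V}}) = \mathcal{L}\mathcal{V}_V(\vartheta; \omega\nu_{\ell_1}, \dots, \omega\nu_{\ell_{m_V}}) + \mathcal{R}\mathcal{V}_V(\vartheta; \omega\nu_{\ell_1}, \dots, \omega\nu_{\ell_{m_V}}), \tag{3.7}
\]
where:
\[
\mathcal{L}\mathcal{V}_V(\vartheta; \omega\nu_{\ell_1}, \dots, \omega\nu_{\ell_{m_V}}) = \mathcal{V}_V(\vartheta; 0, \dots, 0) + \sum_{m=1}^{m_V} \omega\nu_{\ell_m} \frac{\partial}{\partial\mu_m} \mathcal{V}_V(\vartheta; 0, \dots, 0) \tag{3.8}
\]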
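is the localized part of the resonance factor, or localized resonance factor, while:
\[
\mathcal{R}\mathcal{V}_V(\vartheta; \omega\nu_{\ell_1}, \dots, \omega\nu_{\ell_{m_V}}) = \sum_{m,m'=1}^{m_V} \omega\nu_{\ell_m}\, \omega\nu_{\ell_{m'}} \int_0^1 dt\, (1-t)\, \frac{\partial^2}{\partial\mu_m \partial\mu_{m'}} \mathcal{V}_V(\vartheta; t\omega\nu_{\ell_1}, \dots, t\omega\nu_{\ell_{m_V}}) \tag{3.9}
\]
is the renormalized part of the resonance factor, or renormalized resonance factor. In (3.7) $\mathcal{L}$ is called the localization operator and $\mathcal{R} = 1 - \mathcal{L}$ is called the renormalization operator. Using the notations (3.4), we can write:
\[
\mathcal{L}\mathcal{V}_V(\vartheta) = \mathcal{U}_V(\vartheta)\, \mathcal{L}\mathcal{L}_V(\vartheta), \qquad \mathcal{R}\mathcal{V}_V(\vartheta) = \mathcal{U}_V(\vartheta)\, \mathcal{R}\mathcal{L}_V(\vartheta), \tag{3.10}
\]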
as only the factors in $\mathcal{L}_V(\vartheta)$ depend on the momenta flowing through the lines entering the resonance $V$.

Remark 9. Note that in the localized part (3.8) the momentum $\nu_\ell$ flowing through any line $\ell$ internal to $V$ is changed into $\nu^0_\ell$ (see (3.5)).

Then we perform the renormalization transformations in $\mathcal{P}_V$ described above. By Remark 9, for all trees obtained by applying the group $\mathcal{P}_V$ the contribution to the localized resonance factor arising from the $\mathcal{L}_V(\vartheta)$ term in (3.4) is the same, i.e.:
\[
\mathcal{L}\mathcal{L}_V(\vartheta) = \mathcal{L}\mathcal{L}_V(\vartheta'), \qquad \forall \vartheta' \in \mathcal{F}_V(\vartheta), \tag{3.11}
\]
so that we can consider:
\[
\sum_{\vartheta'\in\mathcal{F}_V(\vartheta)} \mathcal{L}\mathcal{V}_V(\vartheta'). \tag{3.12}
\]
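For orientation, the splitting (3.7) into the parts (3.8) and (3.9) is the first-order Taylor formula with integral remainder applied along the ray $t \mapsto (t\mu_1, \dots, t\mu_{m_V})$. The following sketch spells this out under the assumption that the resonance factor is twice continuously differentiable in the $\mu_m$; the auxiliary function $f$ is introduced here only for illustration.

```latex
% Assumption: f(t) := \mathcal{V}_V(\vartheta; t\mu_1, \dots, t\mu_{m_V}) is C^2 on [0,1].
\begin{align*}
f(1) &= f(0) + f'(0) + \int_0^1 dt\, (1-t)\, f''(t), \\
f'(0) &= \sum_{m=1}^{m_V} \mu_m \frac{\partial}{\partial\mu_m}
         \mathcal{V}_V(\vartheta; 0, \dots, 0), \\
f''(t) &= \sum_{m,m'=1}^{m_V} \mu_m \mu_{m'}
          \frac{\partial^2}{\partial\mu_m \partial\mu_{m'}}
          \mathcal{V}_V(\vartheta; t\mu_1, \dots, t\mu_{m_V}).
\end{align*}
% The first line follows from integrating (1-t) f''(t) by parts:
% \int_0^1 (1-t) f''(t)\,dt = -f'(0) + f(1) - f(0).
```

With $\mu_m = \omega\nu_{\ell_m}$, the terms $f(0) + f'(0)$ reproduce (3.8) and the integral reproduces (3.9).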
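The sum over all the trees in the resonance family $\mathcal{F}_V(\vartheta)$ of the localized resonance factors produces zero, so that only the renormalized part has to be taken into account. The proof of this assertion is similar to the proof of the analogous statement in [1], and it is given in Sect. 6 as a particular case of the proof of the more general statement in Lemma 8 below.

Then only the second order terms have to be taken into account in (3.7). This leads to the following expression for the renormalized resonance factor:
\[
\mathcal{R}\mathcal{V}_V(\vartheta) = \mathcal{U}_V(\vartheta) \sum_{m,m'=1}^{m_V} \omega\nu_{\ell_m}\, \omega\nu_{\ell_{m'}} \cdot \Bigg[ \sum_{\substack{\ell^1_V, \ell^2_V \in V \\ \ell^1_V \ne \ell^2_V}} \Big( \frac{\partial}{\partial\mu_m} g_{n_{\ell^1_V}}(\nu_{\ell^1_V}) \Big) \Big( \frac{\partial}{\partial\mu_{m'}} g_{n_{\ell^2_V}}(\nu_{\ell^2_V}) \Big) \prod_{\substack{\ell\in V \\ \ell \ne \ell^1_V, \ell^2_V}} g_{n_\ell}(\nu_\ell) + \sum_{\ell_V \in V} \Big( \frac{\partial}{\partial\mu_m} \frac{\partial}{\partial\mu_{m'}} g_{n_{\ell_V}}(\nu_{\ell_V}) \Big) \prod_{\substack{\ell\in V \\ \ell \ne \ell_V}} g_{n_\ell}(\nu_\ell) \Bigg], \tag{3.13}
\]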
from the very definition of the renormalized resonance factor (3.9), by noting that the two derivatives in (3.9) act either on two distinct propagators (the sum with $\ell^1_V \ne \ell^2_V$ in (3.13)) or on the same propagator (the sum with only one line $\ell_V$ in (3.13)).

Note that it can happen that $\vartheta \in \mathcal{F}_V(\vartheta_0)$, for some tree $\vartheta_0 \in \mathcal{T}_{\nu,k}$, i.e. for some tree $\vartheta_0$ with nonvanishing value, while $\mathcal{V}_V(\vartheta) = 0$ (correspondingly there does not exist any tree in $\mathcal{T}_{\nu,k}$ of that shape associated with the given choice of mode and scale labels). The tree $\vartheta$ is obtained from $\vartheta_0$ through a transformation of $\mathcal{P}_V$, so that there is a correspondence between the lines of $\vartheta_0$ and the lines of $\vartheta$: we shall say that the lines are conjugate. The tree $\vartheta$ inherits the scale labels of the tree $\vartheta_0$, i.e. the lines in $\vartheta$ have the same scales as the conjugate lines of $\vartheta_0$. So it can happen that in $\vartheta_0$ some line internal to $V$ has a scale $n_\ell$ and a momentum $\tilde\nu_\ell$ such that $\chi_{n_\ell}(\|\omega\tilde\nu_\ell\|) \ne 0$, while the momentum $\nu_\ell$ of the line $\ell$ seen as a line of $\vartheta$ (i.e. of the line of $\vartheta$ conjugate to the line $\ell$ of $\vartheta_0$) is such that $\chi_{n_\ell}(\|\omega\nu_\ell\|) = 0$ (see Remark 8). This means that for such a line (2.8) does not hold. Nevertheless, as anticipated in Remark 6, one finds that the momentum $\nu_\ell$ cannot change "too much" with respect to $\tilde\nu_\ell$; more precisely:
\[
\frac{1}{768\, q_{n_\ell+1}} \le \|\omega\nu_\ell\| \le \frac{1}{24\, q_{n_\ell}}, \tag{3.14}
\]
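as we shall prove, using the following result.

Lemma 7. Given a tree $\vartheta_0 \in \mathcal{T}_{\nu,k}$ and a resonance $V$, let $\vartheta \in \mathcal{T}^*_{\nu,k}$ be a tree obtained by the action of the group $\mathcal{P}_V$, i.e. $\vartheta \in \mathcal{F}_V(\vartheta_0)$. If $\|\omega\nu_{\ell_m}\| \le 1/8q_{n^R_V}$ for any entering line $\ell_m$ of $V$, $m = 1, \dots, m_V$, then, for any line $\ell \in V$, one has
\[
\big| \|\omega\nu_\ell\| - \|\omega\tilde\nu_\ell\| \big| \le \frac{1}{4 q_{n^R_V}}, \qquad \|\omega\nu_\ell\| \ge \frac{1}{4 q_{n^R_V}}, \qquad \|\omega\tilde\nu_\ell\| \ge \frac{1}{4 q_{n^R_V}}, \tag{3.15}
\]
if $\nu_\ell$ and $\tilde\nu_\ell$ are the momenta flowing through $\ell$ in $\vartheta$ and $\vartheta_0$, respectively.

Proof. As $V$ is a resonance, then for each line $\ell \in V$ one has $|\nu^0_\ell| \le k_V < q_{n^R_V}$ (see item 2 in the definition of resonance), so that:
\[
\|\omega\nu^0_\ell\| \ge \|\omega q_{n^R_V - 1}\| > \frac{1}{2 q_{n^R_V}}, \tag{3.16}
\]
by (2.15) and (2.16). On the other hand:
\[
\|\omega\nu_\ell - \omega\nu^0_\ell\| \le \sum_{m=1}^{m_V} \|\omega\nu_{\ell_m}\|, \tag{3.17}
\]
if $\nu_{\ell_1}, \dots, \nu_{\ell_{m_V}}$ are the momenta flowing through the lines $\ell_1, \dots, \ell_{m_V}$ entering $V$. By hypothesis:
\[
\|\omega\nu_{\ell_m}\| \le \frac{1}{8 q_{n^R_V}}, \qquad \forall m = 1, \dots, m_V. \tag{3.18}
\]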
If $m_V \ge 2$ then one must have $q_{n^R_V+1} > 4 q_{n^R_V}$ (see item 2 in the definition of resonance). In such a case if there is an entering line (say $\ell_1$) which is the root line of a tree of order $\ge q_{n^R_V+1}/4$, then all the other lines are the root lines of subtrees of orders $k_2, \dots, k_{m_V}$
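such that $k_0 \equiv k_2 + \dots + k_{m_V} < q_{n^R_V+1}/8$ (see item 2 in the definition of resonance). Moreover, for each $m = 2, \dots, m_V$, $k_m \ge q_{n^R_V}$, otherwise the line $\ell_m$ would not be on scale $\ge n^R_V$. By Lemma 1, $\nu_{\ell_m} = s_m q_{n^R_V}$ for all $m = 2, \dots, m_V$, with $s_m \in \mathbb{Z}$, and:
\[
|s_2| + \dots + |s_{m_V}| \le \frac{k_0}{q_{n^R_V}} \le \frac{q_{n^R_V+1}}{8 q_{n^R_V}}, \tag{3.19}
\]
so that:
\[
\sum_{m=1}^{m_V} \|\omega\nu_{\ell_m}\| \le \frac{1}{8 q_{n^R_V}} + \sum_{m=2}^{m_V} |s_m|\, \|\omega q_{n^R_V}\| \le \frac{1}{8 q_{n^R_V}} + \frac{1}{8 q_{n^R_V}} = \frac{1}{4 q_{n^R_V}}, \tag{3.20}
\]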
where use was made of (2.15). Therefore, when replacing $\vartheta_0$ with $\vartheta$, (3.15) follows.

If there is no entering line of $V$ which is the root line of a tree of order $\ge q_{n^R_V+1}/4$ and the tree having as root line the exiting line of $V$ is of order $k < q_{n^R_V+1}/4$ (see item 2 in the definition of resonance), then:
\[
\sum_{m=1}^{m_V} |s_m|\, q_{n^R_V} \le k_1 + \dots + k_{m_V} \equiv k - k_V < k \le \frac{q_{n^R_V+1}}{4}, \tag{3.21}
\]
so that:
\[
\sum_{m=1}^{m_V} \|\omega\nu_{\ell_m}\| \le \sum_{m=1}^{m_V} |s_m|\, \|\omega q_{n^R_V}\| \le \frac{q_{n^R_V+1}}{4 q_{n^R_V}} \cdot \frac{1}{q_{n^R_V+1}} = \frac{1}{4 q_{n^R_V}}, \tag{3.22}
\]
which implies again (3.15). If $m_V = 1$, then (3.15) follows immediately from (3.17) and (3.18).

We come back to the proof of (3.14). As the entering lines of $V$ satisfy (2.8), hence (2.11), Lemma 7 applies. Note that inside $V$ in $\vartheta_0$ (hence also in $\vartheta$, see Remark 6) only lines on scale $n_\ell$ such that $1/48q_{n_\ell} > 1/4q_{n^R_V}$ are possible, by the second inequality in (3.15) and the definition of scale (see (2.8)). Then, given a line $\ell$ internal to $V$ on scale $n_\ell$, one has:
\[
\|\omega\nu_\ell\| \le \frac{1}{48 q_{n_\ell}} + \frac{1}{4 q_{n^R_V}} \le \frac{1}{48 q_{n_\ell}} + \frac{1}{48 q_{n_\ell}} = \frac{1}{24 q_{n_\ell}}. \tag{3.23}
\]
Likewise, if $1/96q_{n_\ell+1} > 2/q_{n^R_V}$, one has:
\[
\|\omega\nu_\ell\| \ge \frac{1}{96 q_{n_\ell+1}} - \frac{1}{4 q_{n^R_V}} \ge \frac{1}{96 q_{n_\ell+1}} - \frac{1}{768 q_{n_\ell+1}} = \frac{1}{96 q_{n_\ell+1}} \Big( 1 - \frac{1}{8} \Big), \tag{3.24}
\]
while, if $1/96q_{n_\ell+1} < 2/q_{n^R_V}$, one has:
\[
\|\omega\nu_\ell\| \ge \frac{1}{4 q_{n^R_V}} \ge \frac{1}{768 q_{n_\ell+1}}, \tag{3.25}
\]
by the third inequality in (3.15). Then (3.14) follows: so in particular the momentum $\nu_\ell$ of the line $\ell \in \vartheta$ still fulfills (2.11).
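The estimates above rely on elementary properties of the denominators $q_n$ of the convergents of $\omega$. The sketch below checks them in exact arithmetic, under the assumption that (2.15) and (2.16), which are not restated here, are the usual continued-fraction bounds $1/2q_{n+1} < \|\omega q_n\| < 1/q_{n+1}$ and $\|\omega\nu\| \ge \|\omega q_{n-1}\|$ for $0 < |\nu| < q_n$. The partial quotients are arbitrary, and a high-order convergent is used as a rational stand-in for an irrational $\omega$, so only low orders are tested.

```python
from fractions import Fraction


def convergents(partial_quotients):
    """Return the lists (p_n), (q_n), n >= 0, for omega = [0; a_1, a_2, ...]."""
    p_prev, q_prev = 1, 0  # p_{-1}, q_{-1}
    p, q = 0, 1            # p_0, q_0 (the integer part a_0 is 0)
    ps, qs = [p], [q]
    for a in partial_quotients:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        ps.append(p)
        qs.append(q)
    return ps, qs


def dist_to_nearest_integer(x):
    """The quantity ||x|| for a Fraction x."""
    frac = x - (x.numerator // x.denominator)  # fractional part in [0, 1)
    return min(frac, 1 - frac)


def check_two_sided_bound(omega, qs, last):
    """1/(2 q_{n+1}) < ||omega q_n|| < 1/q_{n+1} for n = 0, ..., last."""
    return all(
        Fraction(1, 2 * qs[n + 1])
        < dist_to_nearest_integer(omega * qs[n])
        < Fraction(1, qs[n + 1])
        for n in range(last + 1)
    )


def check_growth(qs):
    """q_{n+1} >= q_n and q_{n+2} >= 2 q_n, as used for (4.18)."""
    return all(qs[n + 1] >= qs[n] for n in range(len(qs) - 1)) and all(
        qs[n + 2] >= 2 * qs[n] for n in range(len(qs) - 2)
    )


def check_small_divisor(omega, qs, n):
    """||omega nu|| >= ||omega q_{n-1}|| for every 0 < nu < q_n."""
    floor = dist_to_nearest_integer(omega * qs[n - 1])
    return all(
        dist_to_nearest_integer(omega * nu) >= floor for nu in range(1, qs[n])
    )


# Arbitrary (hypothetical) partial quotients; omega is the last convergent.
QUOTIENTS = [2, 1, 3, 1, 1, 5, 2, 1, 1, 1, 4, 1, 2, 1, 1]
ps, qs = convergents(QUOTIENTS)
omega = Fraction(ps[-1], qs[-1])

print(qs[:8])
print(check_two_sided_bound(omega, qs, len(QUOTIENTS) - 3))
print(check_growth(qs))
print(all(check_small_divisor(omega, qs, n) for n in range(1, 11)))
```

The last check is the mechanism behind (3.16): a momentum smaller in absolute value than $q_{n^R_V}$ cannot produce a divisor smaller than $\|\omega q_{n^R_V - 1}\|$.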
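Note that (3.13) and (2.11) imply the following bound for the renormalized resonance factor of a first generation resonance:
\[
|\mathcal{R}\mathcal{V}_V(\vartheta)| \le D_6 D_7^{k_V} \sum_{m,m'=1}^{m_V} \|\omega\nu_{\ell_m}\|\, \|\omega\nu_{\ell_{m'}}\| \cdot \big( 768 q_{n_V+1} \big)^2 \prod_{\ell\in V} \big( 768 q_{n_\ell+1} \big)^2, \tag{3.26}
\]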
(for some constants $D_6$ and $D_7$), where the last product (times $\varepsilon^{-k}$) represents a bound on the resonance factor (3.3). The proof of such an assertion again is as in [1] (see the proof of the Corollary in [1, §3]), and follows immediately by noting that for any line $\ell \in V$ one has $n_\ell \ge n_V$. The only difference with respect to [1] is that now the derivatives can act also on the compact support functions: they were just missing in [1]; it is nevertheless straightforward to see that:
\[
\Big| \frac{\partial^p}{\partial\mu^p} \chi_n(\|\omega\nu_\ell\|) \Big| \le D_8 \big( 768 q_{n+1} \big)^p, \tag{3.27}
\]
with $p = 1, 2$, for some constant $D_8$, so that:
\[
\Big| \frac{\partial^p}{\partial\mu^p} g_n(\nu_\ell) \Big| \le D_9 \big( 768 q_{n+1} \big)^{p+2}, \tag{3.28}
\]
with $p = 0, 1, 2$, for some constant $D_9$. For any tree in $\mathcal{F}_V(\vartheta)$ the bound (2.11) holds, so that Lemma 5 applies (see Remark 15 in Sect. 5).

Note that the two factors $\|\omega\nu_{\ell_m}\|$, $\|\omega\nu_{\ell_{m'}}\|$ in (3.26) allow us to neglect the propagator corresponding to a line entering a resonance with resonance-scale $n^R_V$, provided such a propagator is replaced by a factor $(768 q_{n_V+1})^2$, where $n_V$ is the scale of the resonance as a cluster. Such a mechanism corresponds to the discussion leading to (2.21), as far as only the first generation resonances are considered. In general a tree will contain more resonances, and the resonances can be contained into each other. Then the above discussion has to be extended to cover the more general case, which will be done in the next section.

4. Renormalization of Resonances: The General Step

We proceed following strictly the techniques of [6] and [18].

Consider a tree $\vartheta \in \mathcal{T}^*_{\nu,k}$ in (3.2). For each resonance $V$ of any generation, let us define a pair of derived lines $\ell^1_V$, $\ell^2_V$ internal to $V$ (possibly coinciding) with the following "compatibility" condition: if $V$ is inside some other resonance $W$, the set $\{\ell^1_V, \ell^2_V\}$ must contain those lines of $\{\ell^1_W, \ell^2_W\}$ which are inside $V$. Clearly there can be 0, 1 or 2 such lines, and correspondingly we shall say that the resonance $V$ is of type 2 if none of its derived lines is a derived line for one of the resonances containing it, of type 1 if just one of its two derived lines is a derived line for one of the resonances containing it, and of type 0 if both derived lines are derived lines for some resonances $W$, $W'$ (possibly coinciding) containing $V$; we shall use a label $z_V = 0, 1, 2$ to take note of the type of the resonance $V$. One associates also to each resonance $V$ a pair of
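entering lines $\ell^V_m$, $\ell^V_{m'}$ if $z_V = 2$ and a single line $\ell^V_m$ if $z_V = 1$, with $m, m' = 1, \dots, m_V$. Moreover for each resonance we shall introduce an interpolation parameter $t_V$ and a measure $\pi_{z_V}(t_V)\, dt_V$ such that:
\[
\pi_z(t) = \begin{cases} (1-t), & z = 2, \\ 1, & z = 1, \\ \delta(t-1), & z = 0; \end{cases} \tag{4.1}
\]
we shall denote with $\mathbf{t} = \{t_V\}_{V\in\mathbf{V}}$ the set of all interpolation parameters. The momentum flowing through a line $\ell_u$ internal to any resonance $V$ will be defined recursively as:
\[
\nu_{\ell_u}(\mathbf{t}) = \nu^0_{\ell_u} + t_V \sum_{\ell\in E_u} \nu_\ell(\mathbf{t}), \qquad \nu^0_{\ell_u} = \sum_{w\in V,\ w \preceq u} \nu_w; \tag{4.2}
\]
of course $\nu_{\ell_u}(\mathbf{t})$ will depend only on the interpolation parameters corresponding to the resonances containing the line $\ell_u$ (by construction). For any resonance $V$ the resonance factor is defined as:
\[
\mathcal{V}_V(\vartheta) = \mathcal{U}_V(\vartheta) \prod_{\ell\in V} g_{n_\ell}(\nu_\ell(\mathbf{t})), \tag{4.3}
\]
when $z_V = 2$, as:
\[
\mathcal{V}_V(\vartheta) = \mathcal{U}_V(\vartheta) \Big( \frac{\partial}{\partial\mu} g_{n_{\ell^1_V}}(\nu_{\ell^1_V}(\mathbf{t})) \Big) \prod_{\ell\in V,\ \ell \ne \ell^1_V} g_{n_\ell}(\nu_\ell(\mathbf{t})), \tag{4.4}
\]
when $z_V = 1$ (and we have called $\ell^1_V$ the line in $\{\ell^1_V, \ell^2_V\}$ which belongs to the set $\{\ell^1_W, \ell^2_W\}$ for some resonance $W$ containing $V$), as:
\[
\mathcal{V}_V(\vartheta) = \mathcal{U}_V(\vartheta) \Big( \frac{\partial^2}{\partial\mu\, \partial\mu'} g_{n_{\ell^1_V}}(\nu_{\ell^1_V}(\mathbf{t})) \Big) \prod_{\ell\in V,\ \ell \ne \ell^1_V} g_{n_\ell}(\nu_\ell(\mathbf{t})), \tag{4.5}
\]
when $z_V = 0$ and $\ell^1_V = \ell^2_V$, and as:
\[
\mathcal{V}_V(\vartheta) = \mathcal{U}_V(\vartheta) \Big( \frac{\partial}{\partial\mu} g_{n_{\ell^1_V}}(\nu_{\ell^1_V}(\mathbf{t})) \Big) \Big( \frac{\partial}{\partial\mu'} g_{n_{\ell^2_V}}(\nu_{\ell^2_V}(\mathbf{t})) \Big) \cdot \prod_{\ell\in V,\ \ell \ne \ell^1_V, \ell^2_V} g_{n_\ell}(\nu_\ell(\mathbf{t})), \tag{4.6}
\]
when $z_V = 0$ and $\ell^1_V \ne \ell^2_V$. In (4.4)–(4.6) one has $\mu = \omega\nu_{\ell^W_m}$ and $\mu' = \omega\nu_{\ell^{W'}_{m'}}$, for some lines $\ell^W_m$ and $\ell^{W'}_{m'}$ (possibly coinciding) entering, respectively, some resonances $W$ and $W'$ (possibly coinciding) containing $V$.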
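We define the renormalization operator according to the type of the resonance; namely, if $z_V = 2$, then:
\[
\mathcal{R}\mathcal{V}_V(\vartheta; \omega\nu_{\ell_1}(\mathbf{t}), \dots, \omega\nu_{\ell_{m_V}}(\mathbf{t})) = \sum_{m,m'=1}^{m_V} \omega\nu_{\ell_m}(\mathbf{t})\, \omega\nu_{\ell_{m'}}(\mathbf{t}) \cdot \int_0^1 dt_V\, (1 - t_V)\, \frac{\partial^2}{\partial\mu_m \partial\mu_{m'}} \mathcal{V}_V(\vartheta; t_V\omega\nu_{\ell_1}(\mathbf{t}), \dots, t_V\omega\nu_{\ell_{m_V}}(\mathbf{t})); \tag{4.7}
\]
if $z_V = 1$, then:
\[
\mathcal{R}\mathcal{V}_V(\vartheta; \omega\nu_{\ell_1}(\mathbf{t}), \dots, \omega\nu_{\ell_{m_V}}(\mathbf{t})) = \sum_{m=1}^{m_V} \omega\nu_{\ell_m}(\mathbf{t}) \cdot \int_0^1 dt_V\, \frac{\partial}{\partial\mu_m} \mathcal{V}_V(\vartheta; t_V\omega\nu_{\ell_1}(\mathbf{t}), \dots, t_V\omega\nu_{\ell_{m_V}}(\mathbf{t})); \tag{4.8}
\]
finally if $z_V = 0$, then:
\[
\mathcal{R}\mathcal{V}_V(\vartheta; \omega\nu_{\ell_1}(\mathbf{t}), \dots, \omega\nu_{\ell_{m_V}}(\mathbf{t})) = \mathcal{V}_V(\vartheta; \omega\nu_{\ell_1}(\mathbf{t}), \dots, \omega\nu_{\ell_{m_V}}(\mathbf{t})). \tag{4.9}
\]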
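The three cases (4.7)–(4.9) can be read as Taylor remainders of order $z_V$ computed with the weights $\pi_z(t)$ of (4.1). The following sketch checks this numerically for a toy one-variable function standing in for a resonance factor along the ray $t \mapsto t\mu$; the function `g`, the value `C` and the quadrature routine are illustrative assumptions, not objects taken from the paper.

```python
def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


C = 0.3  # hypothetical size of the shift produced by the entering momenta


def g(t):
    """Toy propagator-like function of the interpolation parameter t."""
    return (1 + C * t) ** -2


def dg(t):
    """First derivative of g."""
    return -2 * C * (1 + C * t) ** -3


def d2g(t):
    """Second derivative of g."""
    return 6 * C**2 * (1 + C * t) ** -4


def renormalized(z):
    """Integral of pi_z(t) times the z-th derivative of g over [0, 1]."""
    if z == 2:
        return simpson(lambda t: (1 - t) * d2g(t), 0.0, 1.0)
    if z == 1:
        return simpson(dg, 0.0, 1.0)
    if z == 0:
        return g(1.0)  # the weight delta(t - 1) evaluates at t = 1
    raise ValueError("z must be 0, 1 or 2")


# Taylor remainders of g at t = 1, expanded around t = 0.
expected = {
    2: g(1.0) - g(0.0) - dg(0.0),  # value minus first-order Taylor polynomial
    1: g(1.0) - g(0.0),            # value minus zeroth-order Taylor polynomial
    0: g(1.0),                     # nothing subtracted
}

for z in (0, 1, 2):
    print(z, renormalized(z), expected[z])
```

In each case the two printed numbers agree up to quadrature error, which is the content of setting $\mathcal{L} = 1 - \mathcal{R}$: the localized part is the Taylor polynomial of order $z_V - 1$.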
In all cases set $\mathcal{L} = 1 - \mathcal{R}$.

Remark 10. Note that $z_V$ equals the order of the renormalization performed on the resonance $V$.

Remark 11. If a resonance $V$ has a resonance-scale $n^R_V$, then there is a line $\ell^0_V$ on scale $n^R_V$ entering $V$ such that $\|\omega\nu_\ell\| \le \|\omega\nu_{\ell^0_V}\|$ for each $\ell$ entering $V$. If there is ambiguity, $\ell^0_V$ can be chosen arbitrarily. For any resonance $V$ one has a factor bounded by $\|\omega\nu_{\ell^0_V}\|^{z_V}$, from (4.7), (4.8) and (4.9) and by the definition of $\ell^0_V$.

To each line $\ell$ derived once one can associate the line $\ell_m(\ell)$ corresponding to the quantity $\mu_m = \omega\nu_{\ell_m(\ell)}$ with respect to which the propagator $g_{n_\ell}(\nu_\ell(\mathbf{t}))$ is derived. If the line $\ell$ is derived twice one associates to it the two lines $\ell_m(\ell)$ and $\ell_{m'}(\ell)$ such that $\mu_m = \omega\nu_{\ell_m(\ell)}$ and $\mu_{m'} = \omega\nu_{\ell_{m'}(\ell)}$ are the quantities with respect to which the propagator $g_{n_\ell}(\nu_\ell(\mathbf{t}))$ is derived.

Given a derived line $\ell$, let $V$ be the minimal resonance containing it. If the line $\ell$ is derived once, then let $W$ be the resonance for which $\ell_m(\ell)$ is an entering line; if instead $\ell$ is derived twice, let $W$, $W'$, with $W \subseteq W'$, be the resonances for which the lines $\ell_m(\ell)$, $\ell_{m'}(\ell)$ respectively are entering lines. In the first case, let $W_i$, $i = 0, \dots, p$, be the resonances contained by $W$ and containing $V$, ordered naturally by inclusion:
\[
V = W_0 \subset W_1 \subset \dots \subset W_p = W. \tag{4.10}
\]
We shall call the set $\mathcal{W}(\ell) = \{W_0, \dots, W_p\}$ the simple cloud of $\ell$. In the second case, let $W_i$, $i = 0, \dots, p'$, be the resonances contained by $W'$ and containing $V$, ordered naturally by inclusion:
\[
V = W_0 \subset W_1 \subset \dots \subset W_p = W \subset \dots \subset W_{p'} = W', \tag{4.11}
\]
with $p \le p'$. We shall say that $\mathcal{W}_-(\ell) = \{W_0, \dots, W_p\}$ is the minor cloud of $\ell$ while $\mathcal{W}_+(\ell) = \{W_0, \dots, W_{p'}\}$ is the major cloud of $\ell$.

When the renormalization of a resonance $V \in \mathbf{V}_{j+1}$ is performed, a tree $\vartheta_0^V \in \mathcal{F}_{V'}(\vartheta')$, with $V' \in \mathbf{V}_j$, $\vartheta' \in \mathcal{T}_{\nu,k}$, is replaced by the action of the group $\mathcal{P}_V$ with a new tree $\vartheta^V$. As this replacement is performed iteratively, one has the constraint that if $V_1$ and $V_2$ are two resonances such that $V_1$ is the minimal resonance containing $V_2$, then $\vartheta^{V_1} = \vartheta_0^{V_2}$. At the end, the original tree $\vartheta_0 \in \mathcal{T}_{\nu,k}$ is replaced with a tree $\vartheta \in \mathcal{T}^*_{\nu,k}$. On each resonance $V \in \mathbf{V}$ of $\vartheta$ the renormalization operator $\mathcal{R}$ acts: a tree whose resonance factors have been all renormalized will be called a renormalized (or resummed) tree.

As the replacement corresponding to each resonance settles a conjugation between lines of $\vartheta_0^V$ and those of $\vartheta^V$, in the end for each line of $\vartheta$ there will be a conjugate line of $\vartheta_0$. Note that, as the transformations of the groups $\mathcal{P}_V$, $V \in \mathbf{V}$, do not modify the scales of $\vartheta_0$ (see Remark 6), the scales of the lines of $\vartheta$ are the same as those of the conjugate lines of the tree $\vartheta_0$, so that, in order to apply Lemma 5, we have only to check that (2.11) holds for the lines in $\vartheta$: this will be done below (after Remark 12).

Now, we shall show that:
• the localized resonance factors can be neglected (in a sense that will appear clear shortly, see Lemma 8 below),
• for any (renormalized) resonance we obtain a factor:
\[
\big( 768 q_{n_V+1} \big)^2\, \|\omega\nu_{\ell^0_V}\|^2, \tag{4.12}
\]
and
• the number of terms generated by the renormalization procedure is bounded by a constant to the power $k$, so that the bound (2.20) can be replaced by a bound which leads to (2.21), as anticipated in Sect. 2.

Note firstly that the localized part of the resonance factors can be dealt with as in Sect. 3, when only first generation resonances were considered. More formally, we have the following result, which is proved in Sect. 6.

Lemma 8. Given a tree $\vartheta$ and a resonance $V \in \vartheta$, the localized resonance factor $\mathcal{L}\mathcal{V}_V(\vartheta)$ gives zero when the values of the trees belonging to the same resonance family $\mathcal{F}_V(\vartheta)$ are summed together.

Define the map $\Phi$:
\[
\Phi : V \to \Phi_V = \big( z_V, \ell^1_V, \ell^2_V, \{\ell^V_m, \ell^V_{m'}\}^* \big), \qquad V \in \mathbf{V}, \tag{4.13}
\]
which associates to each resonance $V \in \mathbf{V}$ the derived lines $\ell^1_V$, $\ell^2_V$ and the lines in the set $\{\ell^V_m, \ell^V_{m'}\}^*$ defined as:
\[
\{\ell^V_m, \ell^V_{m'}\}^* = \begin{cases} \{\ell^V_m, \ell^V_{m'}\}, & \text{if } z_V = 2, \\ \ell^V_m, & \text{if } z_V = 1, \\ \emptyset, & \text{if } z_V = 0, \end{cases} \tag{4.14}
\]
where $m, m' = 1, \dots, m_V$ and $\ell^V_1, \dots, \ell^V_{m_V}$ are the lines entering $V$.
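Note that the map $\Phi$ gives a natural decomposition of the set $L$ of all lines of $\vartheta$ into $L = L_0 \cup L_1 \cup L_2$, where $L_j$ is the set of lines derived $j$ times. Then, by using Lemma 8, one has:
\[
\mathrm{Val}(\vartheta) = \sum_{\Phi_V} \Big[ \prod_{V\in\mathbf{V}} \int_0^1 \pi_{z_V}(t_V)\, dt_V \Big] \Big[ \prod_{u\in\vartheta} \frac{\nu_u^{m_u+1}}{m_u!} \Big] \Big[ \prod_{\ell\in L_0} g_{n_\ell}(\nu_\ell(\mathbf{t})) \Big] \cdot \Big[ \prod_{\ell\in L_1} \omega\nu_{\ell_m(\ell)} \frac{\partial}{\partial\mu_m} g_{n_\ell}(\nu_\ell(\mathbf{t})) \Big] \cdot \Big[ \prod_{\ell\in L_2} \omega\nu_{\ell_m(\ell)}\, \omega\nu_{\ell_{m'}(\ell)} \frac{\partial^2}{\partial\mu_m \partial\mu_{m'}} g_{n_\ell}(\nu_\ell(\mathbf{t})) \Big]. \tag{4.15}
\]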
Remark 12. Note that no propagator is derived more than twice: this fact is essential for our proof since we have no control on the growth rate of the derivatives of the compact support functions (2.6).

After the renormalization procedure has been applied for all resonances, one checks that the momenta of the lines in $\vartheta$ have changed, with respect to the original tree $\vartheta_0$ with nonvanishing value, in such a way that the bound (2.11) still holds.

Lemma 9. Consider a renormalized tree $\vartheta \in \mathcal{T}^*_{\nu,k}$, obtained from $\vartheta_0 \in \mathcal{T}_{\nu,k}$ by the iterative replacements, described above, that take place each time a resonance appears. Then the lines of $\vartheta$ inherit the scales of the conjugate lines of $\vartheta_0$ and Lemma 5 applies to $\vartheta$.

Proof. The first assertion follows by construction. The second one can be seen by induction on the generation of the resonances, by taking into account that for the first generation resonances the result has been already proved in Sect. 3. So let us suppose that (2.14) holds for resonances of any generation $j'$, with $j' < j$. Consider a line $\ell$ contained inside a resonance $V \in \mathbf{V}_j$ and outside all resonances in $\mathbf{V}_{j+1}$ contained inside $V$: then there will be $j$ resonances $V \equiv W_1 \subset \dots \subset W_j$ containing $\ell$. Each renormalization produces a change on the momentum flowing through the line $\ell$, such that, if $\tilde\nu_\ell$ is the momentum flowing through the line $\ell$ in $\vartheta_0$ and $\nu_\ell$ is the momentum flowing through the conjugate line $\ell$ in $\vartheta$, then:
\[
\frac{1}{96 q_{n_\ell+1}} - \sum_{i=1}^{j} \frac{1}{4 q_{n^R_{W_i}}} \le \|\omega\tilde\nu_\ell\| \le \frac{1}{48 q_{n_\ell}} + \sum_{i=1}^{j} \frac{1}{4 q_{n^R_{W_i}}}. \tag{4.16}
\]
Call $\vartheta_0^V \in \mathcal{F}_{\mathbf{V}_j}(\vartheta_0)$ the tree containing $V$ (which is not, in general, the original tree $\vartheta_0$) and $\vartheta^V$ the tree in $\mathcal{F}_V(\vartheta_0^V)$ obtained by the action of the group $\mathcal{P}_V$. As (2.11) is supposed to hold before renormalizing $V$, for all lines $\ell_m$, $m = 1, \dots, m_V$, entering $V$ one has $\|\omega\nu_{\ell_m}\| < 1/8q_{n_{\ell_m}}$, so that, by reasoning as in Sect. 3 to prove Lemma 7, we can conclude that:
\[
\big| \|\omega\nu_\ell\| - \|\omega\tilde\nu_\ell\| \big| \le \frac{1}{4 q_{n^R_V}}, \qquad \|\omega\nu_\ell\| \ge \frac{1}{4 q_{n^R_V}}, \qquad \|\omega\tilde\nu_\ell\| \ge \frac{1}{4 q_{n^R_V}}, \tag{4.17}
\]
where $\nu_\ell$ is the momentum flowing through the line $\ell$ in $\vartheta^V$.
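In order that $\ell$ be contained inside $V = W_1$, one must have $1/48q_{n_\ell} \ge 1/4q_{n^R_V}$; moreover if $j_1 = \lfloor (j-1)/2 \rfloor$ and $j_2 = \lfloor j/2 \rfloor$ (here $\lfloor \cdot \rfloor$ denotes the integer part), one has:
\[
q_{n^R_{W_1}} \le \frac{q_{n^R_{W_3}}}{2} \le \dots \le \frac{q_{n^R_{W_{2j_1+1}}}}{2^{j_1}}, \qquad q_{n^R_{W_2}} \le \frac{q_{n^R_{W_4}}}{2} \le \dots \le \frac{q_{n^R_{W_{2j_2}}}}{2^{j_2-1}}, \tag{4.18}
\]
(simply use that $q_{n+1} \ge q_n$ and $q_{n+2} \ge 2q_n$ for any $n \ge 0$). Then one can write:
\[
\|\omega\nu_\ell\| \le \frac{1}{48 q_{n_\ell}} + \frac{1}{4 q_{n^R_V}} \Big( \sum_{i=0}^{j_1} \frac{1}{2^i} + \sum_{i=0}^{j_2} \frac{1}{2^i} \Big) \le \frac{1}{48 q_{n_\ell}} + \frac{1}{q_{n^R_V}}; \tag{4.19}
\]
this is bounded from above by $5/48q_{n_\ell}$. Likewise one finds:
\[
\|\omega\nu_\ell\| \ge \frac{1}{96 q_{n_\ell+1}} - \frac{1}{4 q_{n^R_V}} \Big( \sum_{i=0}^{j_1} \frac{1}{2^i} + \sum_{i=0}^{j_2} \frac{1}{2^i} \Big) \ge \frac{1}{96 q_{n_\ell+1}} - \frac{1}{q_{n^R_V}}; \tag{4.20}
\]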
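this is bounded from below by $1/192q_{n_\ell+1}$ if $1/96q_{n_\ell+1} > 2/q_{n^R_V}$ and by $1/768q_{n_\ell+1}$ if $1/96q_{n_\ell+1} \le 2/q_{n^R_V}$.

Then (2.14) holds also for any line $\ell$ contained inside $V_0$, if $V$ is a resonance in $\mathbf{V}_j$. As any next renormalization is on resonances $V' \in \mathbf{V}_{j'}$, with $j' > j$, so that it does not shift the line $\ell$, the momentum $\nu_\ell$ changes no more, so that the inductive proof is complete.

Then in (4.15) we can bound, for $\ell \in L_1$:
\[
\Big| \omega\nu_{\ell_m(\ell)} \frac{\partial}{\partial\mu_m} g_{n_\ell}(\nu_\ell(\mathbf{t})) \Big| \le D_9\, \|\omega\nu_{\ell_m(\ell)}\| \big( 768 q_{n_\ell+1} \big)^3 \le D_9\, \|\omega\nu_{\ell_m(\ell)}\| \big( 768 q_{n_\ell+1} \big)^3 \prod_{i=0}^{p-1} \frac{\|\omega\nu_{\ell^0_{W_i}}\|}{\|\omega\nu_{\ell^0_{W_i}}\|} \le D_9 \big( 768 q_{n_\ell+1} \big)^2 \Big[ \prod_{i=0}^{p} \|\omega\nu_{\ell^0_{W_i}}\| \Big] \Big[ \prod_{i=0}^{p} 768 q_{n_{W_i}+1} \Big], \tag{4.21}
\]
where $\mathcal{W}(\ell) = \{W_0, \dots, W_p\}$ is the simple cloud of $\ell$, and, for $\ell \in L_2$:
\[
\Big| \omega\nu_{\ell_m(\ell)}\, \omega\nu_{\ell_{m'}(\ell)} \frac{\partial^2}{\partial\mu_m \partial\mu_{m'}} g_{n_\ell}(\nu_\ell(\mathbf{t})) \Big| \le D_9\, \|\omega\nu_{\ell_m(\ell)}\|\, \|\omega\nu_{\ell_{m'}(\ell)}\| \big( 768 q_{n_\ell+1} \big)^4 \le D_9\, \|\omega\nu_{\ell_m(\ell)}\|\, \|\omega\nu_{\ell_{m'}(\ell)}\| \big( 768 q_{n_\ell+1} \big)^4 \prod_{i=0}^{p-1} \frac{\|\omega\nu_{\ell^0_{W_i}}\|}{\|\omega\nu_{\ell^0_{W_i}}\|} \prod_{i'=0}^{p'-1} \frac{\|\omega\nu_{\ell^0_{W_{i'}}}\|}{\|\omega\nu_{\ell^0_{W_{i'}}}\|} \le D_9 \big( 768 q_{n_\ell+1} \big)^2 \Big[ \prod_{i=0}^{p} \|\omega\nu_{\ell^0_{W_i}}\| \Big] \Big[ \prod_{i'=0}^{p'} \|\omega\nu_{\ell^0_{W_{i'}}}\| \Big] \Big[ \prod_{i=0}^{p} 768 q_{n_{W_i}+1} \Big] \Big[ \prod_{i'=0}^{p'} 768 q_{n_{W_{i'}}+1} \Big], \tag{4.22}
\]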
where $\mathcal{W}_-(\ell) = \{W_0, \dots, W_p\}$ is the minor cloud and $\mathcal{W}_+(\ell) = \{W_0, \dots, W_{p'}\}$ is the major cloud of $\ell$. Note that (4.21) and (4.22) give a factor:
\[
\|\omega\nu_{\ell^0_{W_i}}\|\, 768 q_{n_{W_i}+1} \tag{4.23}
\]
for each resonance $W_i$ belonging to the (simple or minor or major) cloud of $\ell$. As each resonance belongs to the cloud of some line internal to it and each resonance contains two derived lines or one line derived twice (by definition of the renormalization procedure), then one concludes that a factor equal to the square of (4.23) is obtained for each resonance.

If we note that each underived propagator can be bounded again using (3.28) with $p = 0$, then we can summarize the bounds (4.21)–(4.22) stating that, for each resummed tree $\vartheta$, we have:
• for each resonance $V$, a factor $\|\omega\nu_{\ell^0_V}\|^2$ times a factor $(768 q_{n_V+1})^2$;
• for each line $\ell$, a factor $D_9 (768 q_{n_\ell+1})^2$ (as the factors $(768 q_{n_\ell+1})^p$, $p = 1, 2$, appearing when the corresponding propagator is derived, are taken into account by the factors associated to the resonances, see the item above).

Then the statement concerning (4.12) is proved.

Once the single summand in (4.15) has been bounded, one is left with the problem of bounding the number of terms on which the sum is performed. For each first generation resonance $V$ at most $m_V^2$ times $k_V^2$ summands are generated by the renormalization procedure (see (3.13)). In general, for each (renormalized) resonance, we have to sum over the entering lines $\{\ell^V_m, \ell^V_{m'}\}^*$ (corresponding to the quantities $\mu_m$, $m = 1, \dots, m_V$, in terms of which the renormalized resonance factor is considered a function) and over the internal lines $\{\ell^1_V, \ell^2_V\}$ (corresponding to the factors on which the derivatives act).

An estimate on the number of summands generated by the renormalization procedure can be obtained by using the counting Lemma 6. If $V \in \mathbf{V}_j$, $j \ge 1$, let $N_V$ be the number of $(j+1)$th generation resonances contained inside $V$. Recall that $V_0$ is the set of lines internal to $V$ which are outside any resonance contained in $V$, and denote by $k^0_V$ the number of elements in $V_0$. The renormalization procedure, for each renormalized resonance, generates a single or double sum over the entering lines whose momenta appear in the quantities $\omega\nu_{\ell_1}(\mathbf{t}), \dots, \omega\nu_{\ell_{m_V}}(\mathbf{t})$, in terms of which the resonance factor is expanded: the sum is single if the localization is to first order and double if the localization is to second order (see (4.7) and (4.8)). Then we find, using Lemma 6, that in the renormalization procedure each sum over the entering lines of a first generation resonance $V$ is on $m_V$ terms, each sum over the entering lines of all second generation resonances $V' \subset V$ is on $k^0_V + N_V$ terms, each sum over the entering lines of all third generation resonances $V'' \subset V' \subset V$ is on
$k^0_{V'} + N_{V'}$ terms, and so on; in general, each sum over the entering lines of all the resonances $V' \in \mathbf{V}_{j+1}$ contained inside a resonance $V \in \mathbf{V}_j$ is bounded by $k^0_V + N_V$. Once all generations of resonances have been considered, the overall number of summands generated by the renormalization procedure (by taking also into account the sum over the derived lines and using Remark 12) is bounded by:
\[
\Big[ \prod_{V\in\mathbf{V}_1} k_V^2 \Big] \Big[ \prod_{V\in\mathbf{V}_1} m_V^2 \Big] \Big[ \prod_{V\in\mathbf{V}} \big( k^0_V + N_V \big)^2 \Big] \le e^{6k}, \tag{4.24}
\]
where $k$ is the order of the tree $\vartheta$. In fact, just use $x \le e^x$ and the obvious inequalities:
\[
\sum_{V\in\mathbf{V}_1} k_V \le k, \qquad \sum_{V\in\mathbf{V}_1} m_V + \sum_{V\in\mathbf{V}} k^0_V \le k, \qquad \sum_{V\in\mathbf{V}} N_V \le k. \tag{4.25}
\]
Then the statement after (4.12) is proved and the constant $D_3$ is $e^6$.

Finally one has to count the number of trees. The bound given in Sect. 2 is no more valid, as a line $\ell \in \vartheta$ can have more than two scale labels. However Lemma 4 proves that to each line at most $D_{10} = 17$ scale labels can be associated, so that the number of trees in $\mathcal{T}^*_{\nu,k}$ is bounded by $2^{3k} D_{10}^k$. Then the bound (2.21) follows, with $D_4 = 2^3 D_3 D_9 D_{10}$: this concludes the proof of the theorem.

5. Proof of Lemma 5

We shall prove inductively on the order $k$ the following bounds:
\[
M_n(\vartheta) = 0, \qquad \text{if } k < q_n, \tag{5.1a}
\]
\[
M_n(\vartheta) \le \frac{2k}{q_n} - 1 + N^R_n(\vartheta), \qquad \text{if } k \ge q_n, \tag{5.1b}
\]
for any $n \ge 0$, and:
\[
M_n(\vartheta) = 0, \qquad \text{if } k < q_n, \tag{5.2a}
\]
\[
M_n(\vartheta) \le \frac{k}{q_n} + N^R_n(\vartheta), \qquad \text{if } q_n \le k < \frac{q_{n+1}}{4}, \tag{5.2b}
\]
\[
M_n(\vartheta) \le \frac{k}{q_n} + \frac{8k}{q_{n+1}} - 1 + N^R_n(\vartheta), \qquad \text{if } k \ge \frac{q_{n+1}}{4}, \tag{5.2c}
\]
for $q_{n+1} > 4q_n$, where $k$ is the order of the tree $\vartheta$. Note that (5.1a) and (5.2a) are simply a consequence of Lemma 2 of Sect. 2, so we have to prove only (5.1b), (5.2b) and (5.2c).

Remark 13. If we were only interested in proving the analyticity of the invariant curves for rotation numbers satisfying the Bryuno condition, then Eqs. (5.1) would be sufficient, as it would be easy to check by proceeding along the lines of Sects. 3 and 4. However, in order to find the optimal dependence of the radius of convergence $\rho(\omega)$ on $\omega$, which is the main focus of this paper, the more refined bounds (5.2) are necessary.
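To make the case distinction in (5.1) and (5.2) concrete, the following sketch evaluates the right-hand sides in exact rational arithmetic. The function names and the sample values of $k$, $q_n$, $q_{n+1}$ are hypothetical; the code only tabulates the stated bounds and does not compute $M_n(\vartheta)$ from an actual tree.

```python
from fractions import Fraction


def basic_bound(k, q_n, n_res):
    """Right-hand side of (5.1): bound on M_n for a tree of order k.

    n_res stands for N_n^R, the number of resonances with resonance-scale n.
    """
    if k < q_n:
        return Fraction(0)  # (5.1a): no line on scale n at all
    return Fraction(2 * k, q_n) - 1 + n_res  # (5.1b)


def refined_bound(k, q_n, q_next, n_res):
    """Right-hand side of (5.2), stated only when q_{n+1} > 4 q_n."""
    if q_next <= 4 * q_n:
        raise ValueError("(5.2) is stated only for q_{n+1} > 4 q_n")
    if k < q_n:
        return Fraction(0)  # (5.2a)
    if 4 * k < q_next:
        return Fraction(k, q_n) + n_res  # (5.2b)
    return Fraction(k, q_n) + Fraction(8 * k, q_next) - 1 + n_res  # (5.2c)


# Hypothetical denominators with q_{n+1} > 4 q_n.
q_n, q_next = 3, 20
for k in (2, 4, 5, 12):
    print(k, basic_bound(k, q_n, 0), refined_bound(k, q_n, q_next, 0))
```

For $k \ge q_{n+1}/4$ the extra term $8k/q_{n+1} - 1$ in (5.2c) is at least 1, which is the elementary fact used repeatedly in the inductive proof below.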
Remark 14. The proof of (5.1) is easier, as is to be expected since it is a weaker result. After dealing with (5.2), the proof of (5.1) could be left as an exercise: we shall prove it explicitly for completeness, and as it could be read as an introduction to the more involved proof of (5.2).

We shall prove first (5.2) (case $q_{n+1} > 4q_n$) in cases [1]–[3] below, then (5.1) in items [4]–[6] below. We proceed by induction, and assuming that (5.1), (5.2) hold for any $k' < k$ we shall show that they hold for $k$ also; their validity for $k = 1$ being trivial, Lemma 5 is proved. Recall also Remark 5 in Sect. 2 about the way of counting the resonances on scale $n$ and the resonances with resonance-scale $n$.

• So consider first $q_{n+1} > 4q_n$.

[1] If the root line $\ell$ of $\vartheta$ has scale $\ne n$ and it is not the exiting line of a resonance on scale $n$, let us denote with $\ell_1, \dots, \ell_m$ the lines entering the last node $u_0$ of $\vartheta$ and $\vartheta_1, \dots, \vartheta_m$ the subtrees of $\vartheta$ whose root lines are those lines. By construction $M_n(\vartheta) = M_n(\vartheta_1) + \dots + M_n(\vartheta_m)$ and $N^R_n(\vartheta) = N^R_n(\vartheta_1) + \dots + N^R_n(\vartheta_m)$: the bounds (5.2) follow inductively by noting that for $k \ge q_{n+1}/4$ one has $8k/q_{n+1} - 1 \ge 1$.

[2] If the root line $\ell$ of $\vartheta$ has scale $n$, then we can reason as follows. Let us denote with $\ell_1, \dots, \ell_m$ the lines on scale $\ge n$ which are the nearest to the root line of $\vartheta$ (see footnote 2), and let $\vartheta_1, \dots, \vartheta_m$ be the subtrees with root lines $\ell_1, \dots, \ell_m$. If $m = 0$ then (5.2) follow immediately from Lemma 2 of Sect. 2; so let us suppose that $m \ge 1$. Then the lines $\ell_1, \dots, \ell_m$ are the entering lines of a cluster $T$ (which can degenerate to a single point) having the root line of $\vartheta$ as the exiting line. As $\ell$ cannot be the exiting line of a resonance on scale $n$, one has:
\[
M_n(\vartheta) = 1 + M_n(\vartheta_1) + \dots + M_n(\vartheta_m). \tag{5.3}
\]
˜ ≤ m, In general m ˜ subtrees among the m considered have orders ≥ qn+1 /4, with 0 ≤ m while the remaining m0 = m − m ˜ have orders < qn+1 /4. Let us numerate the subtrees so that the first m ˜ have orders ≥ qn+1 /4. Let us distinguish the cases k < qn+1 /4 and k ≥ qn+1 /4. ˜ = 0 and each line entering T , by Lemma 1 of Sect. 2, has a [2.1] If k < qn+1 /4, then m momentum which is a multiple of qn and, by a Lemma 2, has a scale label n. Therefore the momentum flowing through the root line is ν = νT + s0 qn , for some s0 ∈ Z, with: νu . (5.4) νT ≡ u∈T
Moreover the root line of $\vartheta$ also has scale $n$, by assumption, and momentum $\nu = s q_n$ for some $s \in \mathbb{Z}$, by Lemma 1, so that $\nu_T = (s - s_0) q_n = s' q_n$, for some integer $s'$.

[2.1.1] If $s' \neq 0$, then $k_T \ge |\nu_T| \ge q_n$, giving:

\[ M_n(\vartheta) \le 1 + \frac{k_1 + \cdots + k_m}{q_n} + N_n^R(\vartheta_1) + \cdots + N_n^R(\vartheta_m) \le 1 + \frac{k - k_T}{q_n} + N_n^R(\vartheta) \le \frac{k}{q_n} + N_n^R(\vartheta), \tag{5.5} \]
² That is, such that no other line along the paths connecting the lines $\ell_1, \ldots, \ell_m$ to the root line is on scale $\ge n$.
as $N_n^R(\vartheta) = N_n^R(\vartheta_1) + \cdots + N_n^R(\vartheta_m)$, and (5.2b) follows.

[2.1.2] If $s' = 0$ and $k_T \ge q_n$, one can reason as in case [2.1.1].

[2.1.3] If $s' = 0$ and $k_T < q_n$, then $T$ is a resonance with resonance-scale $n$, and:

\[ M_n(\vartheta) \le 1 + \frac{k_1 + \cdots + k_m}{q_n} + N_n^R(\vartheta_1) + \cdots + N_n^R(\vartheta_m) \le 1 + \frac{k}{q_n} + N_n^R(\vartheta_1) + \cdots + N_n^R(\vartheta_m) \le \frac{k}{q_n} + N_n^R(\vartheta), \tag{5.6} \]
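as $N_n^R(\vartheta) = 1 + N_n^R(\vartheta_1) + \cdots + N_n^R(\vartheta_m)$, and again (5.2b) follows.

[2.2] If $k \ge q_{n+1}/4$, assume again inductively the bounds (5.2). From (5.3) we have:

\[ M_n(\vartheta) \le 1 + \sum_{j=1}^{\tilde m} \left( \frac{k_j}{q_n} + \frac{8k_j}{q_{n+1}} - 1 \right) + \sum_{j=\tilde m+1}^{m} \frac{k_j}{q_n} + \sum_{j=1}^{m} N_n^R(\vartheta_j), \tag{5.7} \]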
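where $k_j$ is the order of the subtree $\vartheta_j$, $j = 1, \ldots, m$.

[2.2.1] If $\tilde m \ge 2$, then (5.2c) follows immediately.

[2.2.2] If $\tilde m = 0$, then (5.7) gives:

\[ M_n(\vartheta) \le 1 + \frac{k_1 + \cdots + k_m}{q_n} + \sum_{j=1}^{m} N_n^R(\vartheta_j) \le 1 + \frac{k}{q_n} + \sum_{j=1}^{m} N_n^R(\vartheta_j) \le \frac{8k}{q_{n+1}} - 1 + \frac{k}{q_n} + N_n^R(\vartheta), \tag{5.8} \]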
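as we are considering $k$ such that $1 \le 8k/q_{n+1} - 1$ and $N_n^R(\vartheta_1) + \cdots + N_n^R(\vartheta_m) = N_n^R(\vartheta)$.

[2.2.3] If $\tilde m = 1$, then (5.7) gives:

\[ M_n(\vartheta) \le 1 + \left( \frac{k_1}{q_n} + \frac{8k_1}{q_{n+1}} - 1 \right) + \sum_{j=2}^{m} \frac{k_j}{q_n} + \sum_{j=1}^{m} N_n^R(\vartheta_j) = \frac{k_1}{q_n} + \frac{8k_1}{q_{n+1}} + \frac{k_0}{q_n} + \sum_{j=1}^{m} N_n^R(\vartheta_j), \tag{5.9} \]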
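where $k_0 = k_2 + \cdots + k_m$.

[2.2.3.1] If in such a case $k_0 \ge q_{n+1}/8$, then we can bound in (5.9):

\[ \frac{k_1}{q_n} + \frac{8k_1}{q_{n+1}} + \frac{k_0}{q_n} \le \frac{k_1 + k_0}{q_n} + \frac{8(k_1 + k_0)}{q_{n+1}} - \frac{8k_0}{q_{n+1}} \le \frac{k}{q_n} + \frac{8k}{q_{n+1}} - 1, \tag{5.10} \]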
and $N_n^R(\vartheta_1) + \cdots + N_n^R(\vartheta_m) = N_n^R(\vartheta)$, so that (5.2c) follows.

[2.2.3.2] If $k_0 < q_{n+1}/8$, then, denoting with $\nu$ and $\nu_1$ the momenta flowing through the root line $\ell$ of $\vartheta$ and the root line $\ell_1$ of $\vartheta_1$ respectively, one has:

\[ \|\omega(\nu - \nu_1)\| \le \|\omega\nu\| + \|\omega\nu_1\| \le \frac{1}{4q_n}, \tag{5.11} \]
as both $\ell$ and $\ell_1$ are on scale $\ge n$ (see Remark 2 in Sect. 2 and use (2.14)). Then either $|\nu - \nu_1| \ge q_{n+1}/4$ or $\nu - \nu_1 = \tilde s q_n$, $\tilde s \in \mathbb{Z}$, by Lemma 1 of Sect. 2.

[2.2.3.2.1] If $|\nu - \nu_1| \ge q_{n+1}/4$, noting that $\nu = \nu_1 + \nu_T + \nu_0$, where $\nu_0 = s_0 q_n$ (with $s_0 \in \mathbb{Z}$ and $|\nu_0| \le k_0 < q_{n+1}/8$) is the sum of the momenta flowing through the root lines of the $m_0$ subtrees entering $T$ with orders $< q_{n+1}/4$ and $\nu_T$ is defined by (5.4), one has:

\[ k_T \ge |\nu_T| \ge |\nu - \nu_1| - |\nu_0| \ge \frac{q_{n+1}}{8}, \tag{5.12} \]

so that in (5.9) one can bound:

\[ \frac{k_1}{q_n} + \frac{8k_1}{q_{n+1}} + \frac{k_0}{q_n} \le \frac{k - k_T}{q_n} + \frac{8(k - k_0 - k_T)}{q_{n+1}} \le \frac{k}{q_n} + \frac{8(k - k_T)}{q_{n+1}} \le \frac{k}{q_n} + \frac{8k}{q_{n+1}} - 1, \tag{5.13} \]
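and $N_n^R(\vartheta_1) + \cdots + N_n^R(\vartheta_m) = N_n^R(\vartheta)$, so that (5.2c) follows again.

[2.2.3.2.2] If $\nu - \nu_1 = \tilde s q_n$, $\tilde s \in \mathbb{Z}$, then:

\[ \nu_T = \nu - \nu_1 - \nu_0 = (\tilde s - s_0) q_n \equiv s q_n, \tag{5.14} \]

where $s \in \mathbb{Z}$.

[2.2.3.2.2.1] If $s \neq 0$, then $k_T \ge |\nu_T| \ge q_n$, so that in (5.9) one has:

\[ \frac{k_1}{q_n} + \frac{8k_1}{q_{n+1}} + \frac{k_0}{q_n} \le \frac{k - k_T}{q_n} + \frac{8k}{q_{n+1}} \le \frac{k}{q_n} - 1 + \frac{8k}{q_{n+1}}, \tag{5.15} \]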
and $N_n^R(\vartheta_1) + \cdots + N_n^R(\vartheta_m) = N_n^R(\vartheta)$, so implying (5.2c).

[2.2.3.2.2.2] If $s = 0$ (i.e. $\nu_T = 0$) and $k_T \ge q_n$, one can proceed as in case [2.2.3.2.2.1].

[2.2.3.2.2.3] If $s = 0$ and $k_T < q_n$, then $T$ is a resonance with resonance-scale $n$,³ so that $N_n^R(\vartheta) = 1 + N_n^R(\vartheta_1) + \cdots + N_n^R(\vartheta_m)$, hence (5.9) gives:

\[ M_n(\vartheta) \le \frac{k}{q_n} + \frac{8k}{q_{n+1}} - 1 + 1 + \sum_{j=1}^{m} N_n^R(\vartheta_j) \le \frac{k}{q_n} + \frac{8k}{q_{n+1}} - 1 + N_n^R(\vartheta), \tag{5.16} \]

and (5.2c) follows.

[3] If the root line $\ell$ of $\vartheta$ is on scale $> n$ and it is the exiting line of a resonance $V_n$ on scale $n$, let us denote with $\ell_1, \ldots, \ell_m$ the lines on scale $\ge n$ which are the nearest to the root line of $\vartheta$, and let $\vartheta_1, \ldots, \vartheta_m$ be the subtrees with root lines $\ell_1, \ldots, \ell_m$; some of these lines (at least one) are lines on scale $n$ inside $V_n$.⁴ Let $T$ be the cluster which the lines $\ell_1, \ldots, \ell_m$ enter; of course $T \subset V_n$ and $T$ can degenerate into a single point. As in case [2], let $\tilde m$ be the number of subtrees among the $m$ considered which have orders $\ge q_{n+1}/4$, and again let us number the subtrees in such a way that the ones with orders $\ge q_{n+1}/4$ are the first $\tilde m$.

³ If $m_0 = 0$, then $\nu \equiv \nu_\ell = \nu_{\ell_1}$, so that $n_\ell \le n_{\ell_1} \le n_\ell + 1$, by construction and by item 2 in the definition of resonance.

⁴ Otherwise $V_n$ would not contain any line on scale $n$, so that it would not be a resonance on scale $n$ as we are supposing.
Note that $k \ge q_{n+1}$ (otherwise $\ell$ could not be on scale $> n$) and:

\[ M_n(\vartheta) = 1 + M_n(\vartheta_1) + \cdots + M_n(\vartheta_m), \tag{5.17} \]

as the root line $\ell$ contributes one unit to $P_n(\vartheta)$ and does not contribute to $N_n(\vartheta)$. Note also that if $T$ is a resonance then its resonance-scale is $n$.

[3.1] If $T$ is not a resonance, then:

\[ N_n^R(\vartheta) = N_n^R(\vartheta_1) + \cdots + N_n^R(\vartheta_m). \tag{5.18} \]
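By induction, (5.2) and (5.17) imply:

\[ M_n(\vartheta) \le 1 + \sum_{j=1}^{\tilde m} \left( \frac{k_j}{q_n} + \frac{8k_j}{q_{n+1}} - 1 \right) + \sum_{j=\tilde m+1}^{m} \frac{k_j}{q_n} + \sum_{j=1}^{m} N_n^R(\vartheta_j), \tag{5.19} \]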
where $k_j$ are the orders of the subtrees $\vartheta_j$, $j = 1, \ldots, m$.

[3.1.1] If $\tilde m \ge 2$, then (5.2c) follows immediately.

[3.1.2] The case $\tilde m = 0$ is impossible because $T$ is contained inside a resonance $V_n$ on scale $n$, so that at least one of the subtrees entering $T$ must have order $\ge q_{n+1}/4$; otherwise no line on scale $> n$ could enter $V_n$, see Lemma 2.

[3.1.3] If $\tilde m = 1$ let $k_0 = k_2 + \cdots + k_m$; then the case $k_0 \ge q_{n+1}/8$ can be dealt with as in case [2.2.3.1]. If $k_0 < q_{n+1}/8$, we deduce from Lemma 1 that either $|\nu - \nu_1| \ge q_{n+1}/4$ or $\nu - \nu_1 = \tilde s q_n$, using the same notations as in case [2.2.3.2]. The first case can be discussed as in case [2.2.3.2.1], while in the second case we find, as in case [2.2.3.2.2], that $\nu_T = \nu - \nu_1 - \nu_0 = s q_n$, with either $s \neq 0$, or $s = 0$ and $k_T \ge q_n$ (otherwise $T$ would be a resonance), so that the conclusions of cases [2.2.3.2.2.1] and [2.2.3.2.2.2] carry over to the present case and (5.2c) follows again.

[3.2] If $T$ is a resonance, then its resonance-scale is $n$, so that:

\[ N_n^R(\vartheta) = 1 + N_n^R(\vartheta_1) + \cdots + N_n^R(\vartheta_m). \tag{5.20} \]
The discussion goes on as in case [3.1] above, with the only difference that now, when $\tilde m = 1$ (and $k_T < q_n$, $k_0 < q_{n+1}/8$), the case $\nu_T = 0$ (i.e. $\nu_T = s q_n$, with $s = 0$) is the only possible one, since $T$ is a resonance. In such a case:

\[ M_n(\vartheta) \le 1 + \frac{k_1}{q_n} + \frac{8k_1}{q_{n+1}} - 1 + \frac{k_0}{q_n} + \sum_{j=1}^{m} N_n^R(\vartheta_j) \le \frac{k}{q_n} + \frac{8k}{q_{n+1}} - 1 + N_n^R(\vartheta), \tag{5.21} \]

and (5.2c) follows once more.

• Now we prove (5.1).

[4] If the root line $\ell$ of $\vartheta$ has scale $\neq n$ and it is not the exiting line of a resonance on scale $n$, let us denote with $\ell_1, \ldots, \ell_m$ the lines entering the last node $u_0$ of $\vartheta$ and with $\vartheta_1, \ldots, \vartheta_m$ the subtrees having these lines as root lines. By construction $M_n(\vartheta) = M_n(\vartheta_1) + \cdots + M_n(\vartheta_m)$ and $N_n^R(\vartheta) = N_n^R(\vartheta_1) + \cdots + N_n^R(\vartheta_m)$, so that the bound (5.1) follows immediately by induction.

[5] If the root line $\ell$ of $\vartheta$ has scale $n$, using the same notations as in case [2], denote with $\ell_1, \ldots, \ell_m$ the lines on scale $\ge n$ which are nearest to the root line of $\vartheta$, and let $\vartheta_1, \ldots, \vartheta_m$ be the subtrees with these lines as root lines. Then such lines are the entering lines of a cluster $T$ (which can degenerate into a single point) having the root line of $\vartheta$ as the exiting line. We have:

\[ M_n(\vartheta) = 1 + M_n(\vartheta_1) + \cdots + M_n(\vartheta_m). \tag{5.22} \]

Assuming again inductively the bounds (5.1), from (5.22) we have:

\[ M_n(\vartheta) \le 1 + \sum_{j=1}^{m} \left( \frac{2k_j}{q_n} - 1 \right) + \sum_{j=1}^{m} N_n^R(\vartheta_j), \tag{5.23} \]
where $k_j$ is the order of the subtree $\vartheta_j$, $j = 1, \ldots, m$.

[5.1] If $m \ge 2$, then (5.1b) follows immediately.

[5.2] If $m = 0$, then $M_n(\vartheta) = 1$. As $\ell$ is on scale $n$, the order $k$ of $\vartheta$ has to be $k \ge q_n$, so that:

\[ M_n(\vartheta) = 1 \le \frac{2k}{q_n} - 1, \qquad N_n^R(\vartheta) = 0, \tag{5.24} \]

and (5.1b) follows again.

[5.3] If $m = 1$, then (5.23) gives:

\[ M_n(\vartheta) \le 1 + \left( \frac{2k_1}{q_n} - 1 \right) + N_n^R(\vartheta_1) = \frac{2k_1}{q_n} + N_n^R(\vartheta_1). \tag{5.25} \]
Denoting with $\nu$ and $\nu_1$ the momenta flowing, respectively, through the root line $\ell$ of $\vartheta$ and through the root line $\ell_1$ of $\vartheta_1$, we have:

\[ \|\omega(\nu - \nu_1)\| \le \|\omega\nu\| + \|\omega\nu_1\| \le \frac{1}{4q_n}, \tag{5.26} \]

as both $\ell$ and $\ell_1$ are on scale $\ge n$ (see Remark 2 and use (2.14)). Then, as $\nu_T = \nu - \nu_1$, either $|\nu_T| \ge q_n$ or $\nu_T = 0$.

[5.3.1] If $|\nu_T| \ge q_n$, then $k_T \ge |\nu_T| \ge q_n$ and $N_n^R(\vartheta_1) = N_n^R(\vartheta)$ (since $T$ is not a resonance), so that (5.25) gives:

\[ M_n(\vartheta) \le \frac{2k}{q_n} - \frac{2k_T}{q_n} + N_n^R(\vartheta_1) \le \frac{2k}{q_n} - 1 + N_n^R(\vartheta_1) = \frac{2k}{q_n} - 1 + N_n^R(\vartheta), \tag{5.27} \]
and (5.1b) follows.

[5.3.2] If $\nu_T = 0$ and $k_T \ge q_n$, one can reason as in case [5.3.1].

[5.3.3] If $\nu_T = 0$ and $k_T < q_n$, then $\nu_1 = \nu$ and either $n_{\ell_1} = n$ or $n_{\ell_1} = n + 1$ (see item 2 in the definition of resonance): then $T$ is a resonance with resonance-scale $n$, so that $1 + N_n^R(\vartheta_1) = N_n^R(\vartheta)$, hence (5.25) gives:

\[ M_n(\vartheta) \le \frac{2k}{q_n} - 1 + 1 + N_n^R(\vartheta_1) \le \frac{2k}{q_n} - 1 + N_n^R(\vartheta), \tag{5.28} \]

and (5.1) follows again.

[6] If the root line $\ell$ of $\vartheta$ is on scale $> n$ and it is the exiting line of a resonance $V_n$, as in case [3] above, denote with $\ell_1, \ldots, \ell_m$ the lines on scale $\ge n$ which are nearest to the root line of $\vartheta$, and let $\vartheta_1, \ldots, \vartheta_m$ be the subtrees of $\vartheta$ of which these lines are root lines. Some of these lines (at least one) are lines on scale $n$ inside $V_n$. Let $T$ be the cluster which the lines $\ell_1, \ldots, \ell_m$ enter; of course $T \subset V_n$, and $T$ can degenerate into a single point. Note that, as in case [3]:

\[ M_n(\vartheta) = 1 + M_n(\vartheta_1) + \cdots + M_n(\vartheta_m), \tag{5.29} \]
as the root line $\ell$ contributes one unit to $P_n(\vartheta)$ and does not contribute to $N_n(\vartheta)$, and that if $T$ is a resonance then its resonance-scale is $n$.

[6.1] If $T$ is not a resonance, then:

\[ N_n^R(\vartheta) = N_n^R(\vartheta_1) + \cdots + N_n^R(\vartheta_m). \tag{5.30} \]

By induction, (5.1) and (5.29) imply:

\[ M_n(\vartheta) \le 1 + \sum_{j=1}^{m} \left( \frac{2k_j}{q_n} - 1 \right) + \sum_{j=1}^{m} N_n^R(\vartheta_j), \tag{5.31} \]
where $k_j$ are the orders of the subtrees $\vartheta_j$, $j = 1, \ldots, m$.

[6.1.1] If $m \ge 2$, then (5.1b) follows immediately.

[6.1.2] The case $m = 0$ is impossible (see case [3.1.2]).

[6.1.3] If $m = 1$ in (5.31), we have $\nu_T = \nu - \nu_1$, so that $|\nu_T| \ge q_n$ (as $\nu_T \neq 0$, otherwise $T$ would be a resonance). Then we can go on along the lines of case [5.3.1] in order to obtain (5.1b).

[6.2] If $T$ is a resonance, then its resonance-scale is $n$, so that:

\[ N_n^R(\vartheta) = 1 + N_n^R(\vartheta_1), \tag{5.32} \]
and the discussion goes on as in case [6.1], with the only difference that now, for $m = 1$, the case $\nu_T = 0$ is the only possible one, as $T$ is supposed to be a resonance. In such a case:

\[ M_n(\vartheta) \le 1 + \frac{2k}{q_n} - 1 + N_n^R(\vartheta_1) \le \frac{2k}{q_n} - 1 + N_n^R(\vartheta), \tag{5.33} \]

implying again (5.1b).

• Finally, to deduce (2.19) from (5.1) and (5.2), simply note that, for $q_{n+1} \le 4q_n$, we have $2k/q_n \le 8k/q_{n+1}$; then Lemma 5 follows.

Remark 15. Note that the correspondence between momenta and scale labels has been used only through the inequality (2.11). As we have seen in Sect. 4, the renormalization procedure can shift the "original" momenta flowing through the lines by a bounded quantity which does not alter such an inequality. This allows us to apply Lemma 4 also to the renormalized trees, as was repeatedly claimed in the previous sections.
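The arithmetic dichotomy invoked in case [2.2.3.2] (if $\|\omega\nu\| \le 1/(4q_n)$ then either $|\nu| \ge q_{n+1}/4$ or $\nu$ is a multiple of $q_n$) can be checked numerically. The following is only an illustrative sketch under stated assumptions, not part of the proof: the function names and the sample partial quotients are hypothetical, $\omega$ is replaced by a rational number with a long continued fraction so that exact arithmetic can be used, and the indexing $q_0 = 1$, $q_1 = a_1$ of the convergents is assumed and may differ from the convention of Sect. 2.

```python
from fractions import Fraction
from math import floor


def convergent_denominators(partial_quotients):
    """Denominators q_0, q_1, ... of the convergents of [a0; a1, a2, ...].

    Uses the standard recurrence q_k = a_k * q_{k-1} + q_{k-2},
    with q_{-1} = 0 and q_0 = 1 (assumed indexing convention).
    """
    q_prev, q = 0, 1
    qs = [q]
    for a in partial_quotients[1:]:
        q_prev, q = q, a * q + q_prev
        qs.append(q)
    return qs


def continued_fraction_value(partial_quotients):
    """Exact rational value of the finite continued fraction."""
    x = Fraction(partial_quotients[-1])
    for a in reversed(partial_quotients[:-1]):
        x = a + 1 / x
    return x


def dist_to_nearest_integer(x):
    """The quantity ||x||: distance from x to the nearest integer."""
    frac = x - floor(x)
    return min(frac, 1 - frac)


def small_divisor_momenta(partial_quotients, n):
    """Return q_n and all nu with 0 < nu < q_{n+1}/4 and ||omega nu|| <= 1/(4 q_n).

    According to the dichotomy, every nu returned should be a multiple of q_n.
    """
    qs = convergent_denominators(partial_quotients)
    omega = continued_fraction_value(partial_quotients)
    q_n, q_next = qs[n], qs[n + 1]
    threshold = Fraction(1, 4 * q_n)
    hits = []
    nu = 1
    while 4 * nu < q_next:
        if dist_to_nearest_integer(omega * nu) <= threshold:
            hits.append(nu)
        nu += 1
    return q_n, hits


# Hypothetical sample: large partial quotients a_3 = 9 and a_5 = 12
# give q_3 > 4 q_2 and q_5 > 4 q_4, the regime of items [1]-[3].
SAMPLE = [0, 1, 2, 9, 1, 12, 1, 1, 2, 1, 3, 1, 1, 1, 2]

if __name__ == "__main__":
    for n in (2, 4):
        q_n, hits = small_divisor_momenta(SAMPLE, n)
        print(n, q_n, hits, all(nu % q_n == 0 for nu in hits))
```

Only the scales $n$ well below the length of the truncated continued fraction are meaningful here, since for those the rational stand-in behaves like an irrational $\omega$ with the same first partial quotients.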
6. Proof of Lemma 8

As far as only the localized resonance factor is involved, the momenta flowing through the lines entering any resonance are set to zero, so that it does not matter whether such momenta are interpolated or not (i.e. whether they are of the form $\nu$ or $\nu(t)$). In particular, the case of first generation resonances (discussed in Sect. 3) is included in Lemma 8.

A basic property of the trees belonging to the resonance family $\mathcal{F}_V(\vartheta)$ is that the difference between their values is only in the resonance factor: for any tree $\vartheta' \in \mathcal{F}_V(\vartheta)$, we can write:

\[ \mathrm{Val}(\vartheta') = A(\vartheta)\, \mathcal{V}_V(\vartheta'), \tag{6.1} \]
for some factor $A(\vartheta)$ which is the same for all $\vartheta' \in \mathcal{F}_V(\vartheta)$. This simply follows from the fact that the transformations in $\mathcal{P}_V$ do not touch the part of the tree $\vartheta$ which is outside the resonance $V$. Therefore a cancellation between localized resonance factors yields a cancellation between tree values (in which the resonance factor has been localized, of course).

By item 2 in the definition of resonance and by definition of $V_0$, one has:

\[ \sum_{u \in V_0} \nu_u = 0; \tag{6.2} \]

moreover, given an entering line $\ell_m$ of $V$, if $\ell_m \in L_V^R$ and $\tilde V_0 = V_0(\ell_m)$, then:

\[ \sum_{u \in \tilde V_0} \nu_u \equiv \sum_{u \in V_0(\ell_m)} \nu_u = 0. \tag{6.3} \]
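In general we can write, for any tree $\vartheta' \in \mathcal{F}_V(\vartheta)$,

\[ \mathcal{L}\mathcal{V}_V(\vartheta') = B(\vartheta')\, \mathcal{L}\mathcal{V}_{V_0}(\vartheta') \prod_{\ell \in L_V^R} \mathcal{L}\mathcal{V}_{V(\ell)}(\vartheta'), \tag{6.4} \]

where $\mathcal{V}_{V_0}(\vartheta')$ and $\mathcal{V}_{V(\ell)}(\vartheta')$ are defined as the resonance factor $\mathcal{V}_V(\vartheta')$, but with the product ranging only over nodes and lines internal to $V_0$ and $V(\ell)$, respectively, while $\mathcal{L}\mathcal{V}_{V_0}(\vartheta')$ and $\mathcal{L}\mathcal{V}_{V(\ell)}(\vartheta')$ are obtained from $\mathcal{V}_{V_0}(\vartheta')$ and $\mathcal{V}_{V(\ell)}(\vartheta')$, respectively, by replacing $\nu_\ell$ with $\nu_\ell^0$ in $V$, for all lines $\ell \in V$. In (6.4) $B(\vartheta')$ takes into account all other factors (if there are any), always evaluated with $\nu_\ell$ replaced with $\nu_\ell^0$, $\ell \in V$. Note that, like $A(\vartheta)$ in (6.1), also $B(\vartheta')$ is the same for all $\vartheta' \in \mathcal{F}_V(\vartheta)$, so that one can set $B(\vartheta') = B(\vartheta)$ and write:

\[ \mathrm{Val}(\vartheta') = A(\vartheta)\, \mathcal{V}_V(\vartheta'), \qquad \mathcal{L}\mathcal{V}_V(\vartheta') = B(\vartheta)\, \mathcal{L}\mathcal{V}_{V_0}(\vartheta') \prod_{\ell \in L_V^R} \mathcal{L}\mathcal{V}_{V(\ell)}(\vartheta'). \tag{6.5} \]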
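The cancellations proved in items [1] and [2] of this section rest on two elementary identities: the node-factor identity (6.6), $\frac{1}{m!}\binom{m}{s} = \frac{1}{s!\,r!}$ with $m = s + r$, and the multinomial identity $\sum_{r_1 + \cdots + r_p = m} \prod_i \nu_i^{r_i}/r_i! = (\nu_1 + \cdots + \nu_p)^m / m!$, which is the step that turns the constrained sums of (6.9) into powers of $\sum_u \nu_u$. The sketch below checks both with exact rational arithmetic; it is an illustration only, with hypothetical function names and made-up integer labels standing in for the mode labels $\nu_u$.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial


def multinomial_sum(nus, m):
    """Sum over all (r_1, ..., r_p), r_i >= 0, r_1 + ... + r_p = m,
    of prod_i nus[i]**r_i / r_i!  (brute-force left-hand side)."""
    total = Fraction(0)
    for rs in product(range(m + 1), repeat=len(nus)):
        if sum(rs) != m:
            continue
        term = Fraction(1)
        for nu, r in zip(nus, rs):
            term *= Fraction(nu) ** r / factorial(r)
        total += term
    return total


def closed_form(nus, m):
    """Right-hand side of the multinomial identity: (sum of nus)**m / m!."""
    return Fraction(sum(nus)) ** m / factorial(m)


def node_factor_lhs(m, s):
    """Left-hand side of (6.6): binomial(m, s) / m!."""
    return Fraction(comb(m, s), factorial(m))


def node_factor_rhs(m, s):
    """Right-hand side of (6.6): 1 / (s! r!) with r = m - s."""
    return Fraction(1, factorial(s) * factorial(m - s))


if __name__ == "__main__":
    # Labels summing to zero, as in (6.2): the sum vanishes for every m >= 1.
    zero_sum_labels = [3, -1, -2]
    for m in (1, 2, 3):
        print(m, multinomial_sum(zero_sum_labels, m))
    # Generic labels: brute force and closed form agree.
    print(multinomial_sum([1, 2], 3), closed_form([1, 2], 3))
```

With labels summing to zero the closed form vanishes for every $m \ge 1$, which is the mechanism by which (6.2) and (6.3) force the sums over the resonance family to cancel.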
[1] If $z_V = 1$ the localized resonance factor is given by the resonance factor computed for $\mu_1 = \cdots = \mu_m = 0$. Summing the localized resonance factors corresponding to the trees belonging to $\mathcal{F}_V(\vartheta)$, we can group them into subfamilies of inequivalent trees whose contributions are different, as for each node $u \in V$ there is a factor:

\[ \frac{1}{m_u!} \binom{m_u}{s_u} = \frac{1}{s_u!} \frac{1}{r_u!}, \tag{6.6} \]

as all terms which are obtained by permutations are summed together (this gives the binomial coefficient in the left hand side of the above equation), times a factor:

\[ \nu_u^{m_u + 1} = \nu_u^{(s_u + 1) + r_u}, \tag{6.7} \]
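times a propagator $g_{n_{\ell_u}}(\nu^0_{\ell_u})$ (the last factor is missing if corresponding to the line exiting $V$; see definitions (4.3)–(4.6)). Then for $\mu_1 = \cdots = \mu_m = 0$ we can write:

\[
\begin{aligned}
\sum_{\vartheta' \in \mathcal{F}_V(\vartheta)} \mathcal{L}\mathcal{V}_V(\vartheta')
&= \sum_{\vartheta' \in \mathcal{F}_V(\vartheta)} \left[ \prod_{u \in V} \frac{\nu_u^{s_u + 1}}{s_u!} \right] \left[ \prod_{\ell \in V} g_{n_\ell}(\nu_\ell^0) \right] \cdot \left( \prod_{u \in V_0} \frac{\nu_u^{r_u}}{r_u!} \right) \left( \prod_{\ell \in L_V^R} \prod_{u \in V_0(\ell)} \frac{\nu_u^{r_u}}{r_u!} \right) \\
&= \left[ \prod_{u \in V} \frac{\nu_u^{s_u + 1}}{s_u!} \right] \left[ \prod_{\ell \in V} g_{n_\ell}(\nu_\ell^0) \right] \cdot \sum_{\vartheta' \in \mathcal{F}_V(\vartheta)} \left( \prod_{u \in V_0} \frac{\nu_u^{r_u}}{r_u!} \right) \left( \prod_{\ell \in L_V^R} \prod_{u \in V_0(\ell)} \frac{\nu_u^{r_u}}{r_u!} \right),
\end{aligned}
\tag{6.8}
\]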
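where we have used the fact that for $\mu_1 = \cdots = \mu_m = 0$ the factors in square brackets have the same value for all $\vartheta' \in \mathcal{F}_V(\vartheta)$ (see (3.11) and take into account what was observed at the beginning of this section). The last sum in (6.8) can be rewritten as:

\[
\begin{aligned}
\sum_{\vartheta' \in \mathcal{F}_V(\vartheta)} \left( \prod_{u \in V_0} \frac{\nu_u^{r_u}}{r_u!} \right) \left( \prod_{\ell \in L_V^R} \prod_{u \in V_0(\ell)} \frac{\nu_u^{r_u}}{r_u!} \right)
&= \left( \sum_{\substack{\{r_u \ge 0\}_{u \in V_0} \\ \sum_{u \in V_0} r_u = m_{V_0}}} \prod_{u \in V_0} \frac{\nu_u^{r_u}}{r_u!} \right) \prod_{\tilde V \in \tilde{\mathcal{V}}(V)} \left( \sum_{\substack{\{r_u \ge 0\}_{u \in \tilde V_0} \\ \sum_{u \in \tilde V_0} r_u = 1}} \prod_{u \in \tilde V_0} \frac{\nu_u^{r_u}}{r_u!} \right) \\
&= \frac{1}{m_{V_0}!} \left( \sum_{u \in V_0} \nu_u \right)^{m_{V_0}} \prod_{\tilde V \in \tilde{\mathcal{V}}(V)} \left( \sum_{u \in \tilde V_0} \nu_u \right),
\end{aligned}
\tag{6.9}
\]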
which is zero by definition of resonance (see (6.2) and (6.3) above).

[2] If $z_V = 2$ the localized resonance factor, with respect to the previous case, contains also the first order terms (again computed at $\mu_1 = \cdots = \mu_m = 0$). The zeroth order contribution can be discussed as for the case $z_V = 1$, and the same result holds. Also the first order contribution vanishes, after summing over the trees $\vartheta' \in \mathcal{F}_V(\vartheta)$. To prove this we shall consider separately the cases $m_V \ge 2$ and $m_V = 1$.

In the first case, when the derivative $(\partial/\partial\mu_m)\mathcal{V}_V(\vartheta; 0, \ldots, 0)$ is considered, let us compare all the trees $\vartheta'$ in the subfamily of $\mathcal{F}_V(\vartheta)$ in which the line $\ell_m$ is kept fixed (call $\bar u$ the node which such a line enters), while all other lines are shifted (i.e. detached and reattached to all nodes inside the resonance). The difference with respect to the previous case, discussed above, is that the line with momentum $\nu_{\ell_m}$ can be chosen in $r_{\bar u}$ ways among the $r_{\bar u}$ lines entering the node $\bar u \in V$ and outside $V$. This means that we can write:

\[ \frac{\nu_u^{m_u + 1}}{m_u!} \binom{m_u}{s_u} = \frac{\nu_u^{(s_u + 1) + r_u}}{s_u!\, r_u!} \tag{6.10} \]

for all nodes $u \neq \bar u$, and:

\[ \frac{\nu_{\bar u}^{m_{\bar u}}}{m_{\bar u}!} \binom{m_{\bar u}}{s_{\bar u}} r_{\bar u} = \frac{\nu_{\bar u}^{(s_{\bar u} + 1) + (r_{\bar u} - 1)}}{s_{\bar u}!\, (r_{\bar u} - 1)!} \tag{6.11} \]
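for $\bar u$. Then we have an expression analogous to (6.8), with the only difference that the labels $\{r_u\}$ have to be replaced with labels $\{r'_u\}$, defined as:

\[ r'_u = r_u - \delta_{u \bar u}, \qquad \forall u \text{ either in } V_0 \text{ or in } \bigcup_{\tilde V \in \tilde{\mathcal{V}}(V)} \tilde V_0, \tag{6.12} \]

such that:

\[ \sum_{u \in V_0} r'_u + \sum_{\tilde V \in \tilde{\mathcal{V}}(V)} \sum_{u \in \tilde V_0} r'_u = m_V - 1; \tag{6.13} \]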
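so the last sum in the second line of (6.8) has to be replaced by:

\[
\begin{aligned}
\nu_{\bar u} \sum_{\vartheta' \in \mathcal{F}_V(\vartheta)} \left( \prod_{u \in V_0} \frac{\nu_u^{r'_u}}{r'_u!} \right) \left( \prod_{\ell \in L_V^R} \prod_{u \in V_0(\ell)} \frac{\nu_u^{r'_u}}{r'_u!} \right)
&= \nu_{\bar u} \left( \sum_{\substack{\{r'_u \ge 0\}_{u \in V_0} \\ \sum_{u \in V_0} r'_u = m^*_{V_0}}} \prod_{u \in V_0} \frac{\nu_u^{r'_u}}{r'_u!} \right) \prod_{\tilde V \in \tilde{\mathcal{V}}(V)} \left( \sum_{\substack{\{r'_u \ge 0\}_{u \in \tilde V_0} \\ \sum_{u \in \tilde V_0} r'_u = \zeta^*(\tilde V)}} \prod_{u \in \tilde V_0} \frac{\nu_u^{r'_u}}{r'_u!} \right) \\
&= \nu_{\bar u}\, \frac{1}{m^*_{V_0}!} \left( \sum_{u \in V_0} \nu_u \right)^{m^*_{V_0}} \prod_{\tilde V \in \tilde{\mathcal{V}}(V)} \left( \sum_{u \in \tilde V_0} \nu_u \right)^{\zeta^*(\tilde V)},
\end{aligned}
\tag{6.14}
\]

where:

\[
m^*_{V_0} = \begin{cases} m_{V_0}, & \text{if } \bar u \notin V_0, \\ m_{V_0} - 1, & \text{if } \bar u \in V_0, \end{cases}
\qquad
\zeta^*(\tilde V) = \begin{cases} 1, & \text{if } \bar u \notin \tilde V_0, \\ 0, & \text{if } \bar u \in \tilde V_0, \end{cases}
\tag{6.15}
\]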
so that we have again vanishing contributions (as $m_V \ge 2$).

On the contrary, if $m_V = 1$, the above reasoning does not apply, as there is only one entering line. However, the function $(\partial/\partial\mu_1)\mathcal{V}_V(\vartheta; 0)$ is an odd function, as all the propagators are even in their arguments, so that the differentiated one⁵ becomes odd, and the numerator contains an even number of $\nu_u$'s. Then by reversing the signs of the labels $\nu_u$, $u \in V$, the numerator will not change, while the overall sign of the denominator will change, so that the sum over the first order contributions of the localized resonance factors of the two tree values being considered vanishes.⁶

[3] Finally, if $z_V = 0$ the localization operator $\mathcal{L}$ gives zero when acting on the resonance factors, so that nothing has to be proved.

⁵ If $z_V = 2$, then there is only one differentiated propagator, arising from the renormalization of the resonance $V$ itself.

⁶ Note that the renormalization transformations of type 3 are explicitly used in order to implement the cancellation mechanism only in the case of a resonance $V$ with $z_V = 2$ and $m_V = 1$. In general not all the transformations are used for all resonances: in particular, when $z_V = 0$, we consider separately all terms generated by the action of the group $\mathcal{P}_V$, as there is no need of additional renormalizations.

References

1. Berretti, A. and Gentile, G.: Scaling Properties for the Radius of Convergence of a Lindstedt Series: The Standard Map. J. Math. Pures Appl. (9) 78, no. 2, 159–176 (1999)
2. Yoccoz, J. C.: Théorème de Siegel, nombres de Brjuno et polynômes quadratiques. Astérisque 231, 3–88 (1995)
3. Marmi, S., Moussa, P., and Yoccoz, J. C.: The Brjuno Functions and their Regularity Properties. Commun. Math. Phys. 186, 265–293 (1997)
4. Davie, A. M.: The Critical Function for the Semistandard Map. Nonlinearity 7, 219–229 (1994)
5. Davie, A. M.: Renormalization for Analytic Area-Preserving Maps. Unpublished
6. Gentile, G. and Mastropietro, V.: Methods for the Analysis of the Lindstedt Series for KAM tori and Renormalizability in Classical Mechanics. A Review with Some Applications. Rev. Math. Phys. 8, no. 3, 393–444 (1996)
7. Eliasson, L. H.: Absolutely Convergent Series Expansions for Quasi-periodic Motions. University of Stockholm preprint (1988), and Math. Phys. Elect. J. 2, no. 4 (1996)
8. Gallavotti, G.: Twistless KAM tori. Commun. Math. Phys. 164, 145–156 (1994)
9. Marmi, S. and Stark, J.: On the Standard Map Critical Function. Nonlinearity 5, 743–761 (1992)
10. Rüssmann, H.: Invariant Tori in the Perturbation Theory of Weakly Non-degenerate Integrable Hamiltonian Systems. Preprint-Reihe des Fachbereichs Mathematik der Johannes Gutenberg-Universität Mainz, Nr. 14, 27.07.98
11. Berretti, A. and Marmi, S.: Scaling near Resonances and Complex Rotation Numbers for the Standard Map. Nonlinearity 7, 603–621 (1994)
12. Berretti, A., Celletti, A., Chierchia, L. and Falcolini, C.: Natural Boundaries for Area-Preserving Twist Maps. J. Stat. Phys. 66, no. 5–6, 1613–1630 (1992)
13. Wilbrink, J.: Erratic Behavior of Invariant Circles in Standard-like Mappings. Phys. D 26, 358–368 (1987)
14. Berretti, A. and Gentile, G.: Scaling Properties for the Radius of Convergence of Lindstedt Series: Generalized Standard Maps. J. Math. Pures Appl. 79, no. 7, 691–713 (2000)
15. Harary, F. and Palmer, E.: Graphical Enumeration. New York: Academic Press, 1973
16. Gallavotti, G. and Gentile, G.: Majorant series convergence for twistless KAM tori. Ergodic Theory and Dynamical Systems 15, 857–869 (1995)
17. Schmidt, W. M.: Diophantine Approximation. Lecture Notes in Mathematics 785, Berlin: Springer-Verlag, 1980
18. Bonetto, F., Gallavotti, G., Gentile, G., and Mastropietro, V.: Lindstedt Series, Ultraviolet Divergences and Moser's Theorem. Annali della Scuola Normale Superiore di Pisa Cl. Sci. (4) 26, no. 3, 545–593 (1998)

Communicated by Ya. G. Sinai